piecewise regression

[This article was first published on Eran Raviv » R, and kindly contributed to R-bloggers.]

The beta of a stock describes its relation to the market: how many percent we should expect the stock to move when the market moves by one percent.
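
As a minimal sketch of that definition, the snippet below estimates beta as the slope of a regression through the origin. The returns are simulated, and the "true" beta of 1.2 and the noise levels are assumptions made up for the illustration, not values from the data used in this post.

```r
# Simulated daily returns; the "true" beta of 1.2 is an arbitrary illustrative value.
set.seed(1)
n_days <- 1000
market <- rnorm(n_days, mean = 0, sd = 0.01)
stock  <- 1.2 * market + rnorm(n_days, mean = 0, sd = 0.005)

# Beta as the slope of a regression through the origin (no intercept) ...
beta_lm <- unname(coef(lm(stock ~ 0 + market)))

# ... which has the closed form sum(x * y) / sum(x^2).
beta_closed <- sum(market * stock) / sum(market^2)

print(c(lm = beta_lm, closed_form = beta_closed))
```

The two numbers agree because the no-intercept least-squares slope is exactly that ratio.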

The market, being a somewhat vague notion, is approximated here, as usual, by the S&P 500. The aforementioned relation (henceforth, beta) is fundamental to many aspects of trading and risk management. It is already well established that volatility has different dynamics in rising markets and in declining markets. Recently, I read a few papers that suggest the same holds true for beta, specifically that the beta is not the same in rising markets and in declining markets. We use regression to estimate beta anyway, so piecewise linear regression fits right in for an investor or speculator who wishes to account for this asymmetry.

The idea is very simple: we divide the dataset into two (or more) parts and estimate each part separately, piece by piece, or piecewise. This simple idea can be dressed in complex notation and code, which becomes necessary as you move up in complexity. The interested reader can google regression splines or check the references below.
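
A rough sketch of "divide and estimate each part separately", on simulated returns rather than real prices. The two true slopes (1.5 on down days, 0.9 on up days) and the noise level are assumptions chosen only so that the asymmetry is visible.

```r
# Simulated returns with a different beta on down days and on up days (assumed values).
set.seed(2)
n_days <- 2000
market <- rnorm(n_days, sd = 0.01)
true_beta <- ifelse(market < 0, 1.5, 0.9)
stock <- true_beta * market + rnorm(n_days, sd = 0.004)

# Split the sample at the break point and fit each piece on its own.
knot <- 0
down <- market < knot
fit_down <- lm(stock[down] ~ 0 + market[down])
fit_up   <- lm(stock[!down] ~ 0 + market[!down])

beta_down <- unname(coef(fit_down))
beta_up   <- unname(coef(fit_up))
print(c(down = beta_down, up = beta_up))
```

Each piece is an ordinary regression through the origin; nothing ties the two fits together except the choice of break point.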

For illustration, I use Microsoft (MSFT) returns. I estimate different betas for green days and for red days. Positive days are above zero and negative days are below zero, so zero is our breaking point. (The breaking point is called a “knot” in academic jargon. Why “knot”? Maybe because it ties the two parts together, but probably so that people on the outside will not be able to get it.) The following plot shows the result:

[Figure: piecewise beta]

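
The way a knot ties the two parts together can also be sketched as a single regression with a hinge term, which is the simplest regression spline. This is an alternative formulation, not the code used in this post, and the simulated data and the names `hinge`, `beta_below` and `beta_above` are assumptions for the illustration.

```r
# Simulated returns with a kink at zero (assumed slopes 1.5 below, 0.9 above).
set.seed(3)
n_days <- 2000
market <- rnorm(n_days, sd = 0.01)
stock <- ifelse(market < 0, 1.5, 0.9) * market + rnorm(n_days, sd = 0.004)

# One model, two slopes: the hinge is zero below the knot and linear above it.
knot <- 0
hinge <- pmax(market - knot, 0)
fit <- lm(stock ~ 0 + market + hinge)

# Slope below the knot, and slope above it (base slope plus the change in slope).
beta_below <- unname(coef(fit)["market"])
beta_above <- unname(coef(fit)["market"] + coef(fit)["hinge"])
print(c(below = beta_below, above = beta_above))
```

With the knot at zero and no intercept, this gives the same slopes as fitting the two halves separately. For a knot away from zero, the hinge version forces the two lines to meet at the knot, while separate fits through the origin do not.
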
You know what? Maybe zero is not the right break point. Maybe the beta is the same all the way down to the extreme negative returns, and the relation changes only when the market moves sharply down. Let us ask the data. This falls under the category of structural change. I consider a grid of points along the axis and build a model with a break at each point: one slope before the break, another slope after it. I look for the minimum of the Sum of Squared Errors over the whole sample, so I sum up the squared errors from the two models. The following figure shows the result:

[Figure: Grid Search over optimal model]
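
The grid search described above can be sketched on simulated data where the answer is known. The true break at -0.01, the two slopes and the helper name `sse_at` are assumptions for the illustration; the candidate break points are taken as quantiles of the market return.

```r
# Simulated returns whose beta changes only on sharply negative days (assumed setup).
set.seed(4)
n_days <- 3000
market <- rnorm(n_days, sd = 0.012)
true_break <- -0.01
stock <- ifelse(market < true_break, 2.0, 1.0) * market + rnorm(n_days, sd = 0.004)

# Total sum of squared errors of the two through-origin fits for a break at k.
sse_at <- function(k) {
  lo <- market < k
  sum(resid(lm(stock[lo] ~ 0 + market[lo]))^2) +
    sum(resid(lm(stock[!lo] ~ 0 + market[!lo]))^2)
}

# Candidate break points: every 1% quantile between the 5% and the 95% quantile.
candidates <- quantile(market, probs = seq(0.05, 0.95, by = 0.01))
sse <- vapply(candidates, sse_at, numeric(1))

best_break <- unname(candidates[which.min(sse)])
print(best_break)
```

Trimming the grid to the 5% to 95% range keeps a reasonable number of observations on each side of every candidate break.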

Well, the data says the breaking point is not zero, but it is almost zero. Tough luck; all this work, and for what, I ask you? OK then, in order to use the correct beta, all you have to do now is decide whether it is a bear market or a bull market. Should be a breeze. Thanks for reading. As always, code is given below.

References:

Statistical Learning from a Regression Perspective (Springer Series in Statistics)

R Cookbook (O’Reilly Cookbooks)

Code:
Note that you can also use the packages “segmented” and “strucchange”. Both are extensive and suitable, but maybe it is better to code it yourself.
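
For comparison, here is a hedged sketch of what the “segmented” route might look like on simulated data. It assumes the package is installed (the code is skipped otherwise), and the data frame `sim`, the slopes and the starting value for the break are made up for the illustration. Unlike the hand-rolled code below, `segmented()` fits a continuous broken line with an intercept and estimates the break point itself.

```r
if (requireNamespace("segmented", quietly = TRUE)) {
  # Simulated returns with a kink at zero (assumed slopes 2.0 below, 0.5 above).
  set.seed(5)
  n_days <- 2000
  sim <- data.frame(market = rnorm(n_days, sd = 0.012))
  sim$stock <- ifelse(sim$market < 0, 2.0, 0.5) * sim$market +
    rnorm(n_days, sd = 0.003)

  # Start from an ordinary linear fit, then let segmented() search for one break.
  base_fit <- lm(stock ~ market, data = sim)
  seg_fit <- segmented::segmented(base_fit, seg.Z = ~market, psi = 0.002)

  # Estimated break point and the slope on each side of it.
  est_knot <- seg_fit$psi[1, "Est."]
  print(est_knot)
  print(segmented::slope(seg_fit))
}
```

Here `psi` is only a starting guess for the break point; the estimate is read from the `psi` component of the fitted object.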

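library(quantmod)
# Download daily prices for the stock and for the market proxy (SPY tracks the S&P 500).
end <- format(Sys.Date(), "%Y-%m-%d")
start <- format(as.Date("2000-09-18"), "%Y-%m-%d") # just an arbitrary date from the past.
tckr = c('MSFT', 'SPY')
dat0 = getSymbols(tckr[1], src = "yahoo", from = start, to = end, auto.assign = FALSE)
n = NROW(dat0)
l = length(tckr)
# dat holds the six price columns (open, high, low, close, volume, adjusted) per ticker.
dat = array(dim = c(n, 6, l)) ; ret = matrix(nrow = n, ncol = l)
for (i in 1:l) {
  dat0 = getSymbols(tckr[i], src = "yahoo", from = start, to = end, auto.assign = FALSE)
  dat[1:NROW(dat0), , i] = as.matrix(dat0)
  # Open-to-close return of the day: close over open, minus one.
  ret[, i] = as.numeric(dat[, 4, i] / dat[, 1, i] - 1)
}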
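# Single beta over the whole sample: regression through the origin.
lmall = lm(ret[,1] ~ 0 + ret[,2])
plot(ret[,1] ~ ret[,2]) ; abline(lmall, col = 2, lwd = 2)
k = 0 # the break point (knot)
# Split on the market return (column 2), so that up and down days refer to the market.
up = ret[,2] > k ; down = ret[,2] < k
lmpos = lm(ret[up,1] ~ 0 + ret[up,2]) ; lmpos
lmneg = lm(ret[down,1] ~ 0 + ret[down,2]) ; lmneg
plot(ret[,1] ~ ret[,2], main = expression(paste("Piecewise ", beta)), ylab = "MSFT", xlab = "S&P")
# One segment per regime, both passing through the origin.
segments(min(ret[,2]), min(ret[,2])*lmneg$coef, x1 = 0, y1 = 0, lwd = 5, col = 2)
segments(0, 0, x1 = max(ret[,2]), y1 = max(ret[,2])*lmpos$coef, lwd = 5, col = 3)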
# Candidate break points: roughly every 1% quantile of the sorted market returns.
grid1 <- round(seq(10, (n-1), n/100))
grid2 <- sort(ret[,2])[grid1]
## Note here (ret[,2]<grid2[i]), is the Indicator function
d = numeric(length(grid2))
for (i in seq_along(grid2)) {
  below = ret[,2] < grid2[i] ; above = ret[,2] > grid2[i]
  regneg <- lm(ret[below,1] ~ 0 + ret[below,2])
  regpos <- lm(ret[above,1] ~ 0 + ret[above,2])
  # Sum of squared errors of the two pieces.
  # (summary(reg)[[6]] is the residual standard error, not the SSE.)
  d[i] <- sum(resid(regneg)^2) + sum(resid(regpos)^2)
}
plot(d, type = "b", main = "Grid Search for Best Break Point", ylab = "SSE")
text(which.min(d), y = min(d), paste('MIN at: ', signif(grid2[which.min(d)], 2)), col = 4, pos = 3)
points(which.min(d), y = min(d), pch = 21, col = 4, cex = 2.2)
