# Piecewise regression

[This article was first published on **Eran Raviv » R**, and kindly contributed to R-bloggers.]

A stock's *beta* generally describes its relation to the market: how many percent we should expect the stock to move when the market moves one percent.

The market, being a somewhat vague notion, is approximated here, as usual, by the S&P 500. This relation (henceforth, *beta*) is instrumental to many aspects of trading and risk management. It is already well established that volatility has different dynamics in rising markets and in declining markets. Recently, I read a few papers suggesting the same holds true for *beta*, specifically that the *beta* is not the same for rising markets and for declining markets. We use regression to estimate *beta* anyway, so piecewise linear regression fits right in for an investor or speculator who wishes to account for this asymmetry.
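As a minimal sketch of estimating *beta* by regression, the snippet below uses simulated returns rather than real prices. The series, the seed, and the true *beta* of 1.2 are assumptions made up for illustration. With an intercept in the model, the slope from `lm` should agree with the ratio of covariance to variance.

```r
# Simulated daily returns; the true beta of 1.2 is an assumption for illustration.
set.seed(1)
market <- rnorm(1000, mean = 0, sd = 0.01)
stock  <- 1.2 * market + rnorm(1000, mean = 0, sd = 0.01)

# Beta as the slope of a regression of stock returns on market returns.
fit <- lm(stock ~ market)
beta_hat <- unname(coef(fit)["market"])

# The same number from the covariance/variance ratio.
beta_cov <- cov(stock, market) / var(market)

print(c(regression = beta_hat, cov_ratio = beta_cov))
```

The code further down the post drops the intercept (`~ 0 + ...`), which forces the fitted line through the origin, so its slope will differ slightly from the covariance ratio.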

The idea is very simple: we divide the dataset into two (or more) parts and estimate each part separately, piece by piece, or *piecewise*. This simple idea can be dressed up in complex notation and code, which becomes necessary as you move up in complexity. The interested reader can google *regression splines* or check the references below.
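The two-part estimation can be sketched on simulated data where the answer is known in advance. Everything here is an assumption for illustration: the break at zero, the true slopes of 1.3 for down days and 0.9 for up days, and the noise level. Each part gets its own regression through the origin.

```r
# Simulated market and stock returns with a different slope on each side of zero.
set.seed(42)
n <- 2000
market <- rnorm(n, sd = 0.01)
beta_down <- 1.3  # assumed true beta when the market falls
beta_up   <- 0.9  # assumed true beta when the market rises
stock <- ifelse(market < 0, beta_down, beta_up) * market + rnorm(n, sd = 0.005)

# Split the sample at the break point k and fit each piece separately.
k <- 0
down <- market < k
up   <- market > k

fit_down <- lm(stock[down] ~ 0 + market[down])
fit_up   <- lm(stock[up] ~ 0 + market[up])

betas <- c(down = unname(coef(fit_down)), up = unname(coef(fit_up)))
print(round(betas, 3))
```

The two estimates should land close to the assumed slopes, which is a quick sanity check before trying the same split on real returns.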

For illustration, I use Microsoft (MSFT) returns. I estimate different *betas* for green days and for red days. Positive days are above zero and negative days are below zero, so zero is our breaking point. (The breaking point is called a “knot” in academic jargon. Why “knot”? Maybe because it ties the two parts together, but probably so people on the outside will not be able to get it.) The following plot shows the result:

[Figure: Piecewise β, MSFT returns against S&P returns with a separate fitted line on each side of zero]

You know what? Maybe zero is not the break point we should take. Maybe the *beta* is the same all the way down to the extreme negative days, and only when the market is moving sharply down does the relation change. Let us ask the data. This falls under the category of structural change. I consider a grid of points along the axis and build a model with a break at each point: one slope before the break, another slope after it. I look for the minimum of the sum of squared errors over the whole sample, so I sum up the squared errors from the two models. The following figure shows the result:

[Figure: Grid Search for Best Break Point, with the minimum marked]
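The grid search just described can be sketched on simulated data, where the true break is known and the search can be checked against it. The break at -0.01, the two slopes, the noise level, and the helper name `sse_at` are all assumptions for illustration, and the grid here is built from percentiles of the simulated market return.

```r
# Simulated returns with a break in beta at an assumed knot of -0.01.
set.seed(7)
n <- 3000
market <- rnorm(n, sd = 0.01)
true_knot <- -0.01
stock <- ifelse(market < true_knot, 1.8, 0.9) * market + rnorm(n, sd = 0.004)

# Hypothetical helper: total sum of squared errors when the sample is split at k
# and each side is fitted by its own regression through the origin.
sse_at <- function(k, x, y) {
  lo <- x < k
  hi <- !lo
  sum(resid(lm(y[lo] ~ 0 + x[lo]))^2) + sum(resid(lm(y[hi] ~ 0 + x[hi]))^2)
}

# Candidate break points: the 5th to 95th percentiles of the market return.
grid <- quantile(market, probs = seq(0.05, 0.95, by = 0.01))
sse  <- vapply(grid, function(k) sse_at(k, market, stock), numeric(1))

best_knot <- unname(grid[which.min(sse)])
print(signif(best_knot, 2))
```

Keeping the grid away from the extreme tails means both sides always have enough observations to fit a slope.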

Well, the data says the breaking point is not zero, but it is almost zero. Tough luck; all this work, and for what, I ask you? OK then, in order to use the correct *beta*, all you have to do now is decide whether it is a bear market or a bull market. Should be a breeze. Thanks for reading; as always, code is given below.

**References**:

Statistical Learning from a Regression Perspective (Springer Series in Statistics)

R Cookbook (O’Reilly Cookbooks)

**Code:**

Note that you can also use the packages “segmented” and “strucchange”. Both are extensive and suitable, but maybe it is better to code it yourself.

    library(quantmod)

    end   <- format(Sys.Date(), "%Y-%m-%d")
    start <- format(as.Date("2000-09-18"), "%Y-%m-%d")  # just an arbitrary date from the past
    tckr  <- c("MSFT", "SPY")

    # Download the first ticker once to get the number of rows
    dat0 <- getSymbols(tckr[1], src = "yahoo", from = start, to = end, auto.assign = FALSE)
    n <- NROW(dat0)
    l <- length(tckr)
    dat <- array(dim = c(n, 6, l))
    ret <- matrix(nrow = n, ncol = l)
    for (i in 1:l) {
      dat0 <- getSymbols(tckr[i], src = "yahoo", from = start, to = end, auto.assign = FALSE)
      dat[1:length(dat0[, 2]), , i] <- dat0
      ret[, i] <- as.numeric(dat[, 4, i] / dat[, 1, i] - 1)  # close over open, minus one
    }

    # Single beta over the whole sample (regression through the origin)
    lmall <- lm(ret[, 1] ~ 0 + ret[, 2])
    plot(ret[, 1] ~ ret[, 2])
    abline(lmall, col = 2, lwd = 2)

    # Separate betas on each side of the break point k,
    # splitting on the market return (column 2)
    k <- 0
    pos <- ret[, 2] > k
    neg <- ret[, 2] < k
    lmpos <- lm(ret[pos, 1] ~ 0 + ret[pos, 2]) ; lmpos
    lmneg <- lm(ret[neg, 1] ~ 0 + ret[neg, 2]) ; lmneg

    plot(ret[, 1] ~ ret[, 2], main = expression(paste(" Piecewise ", beta)),
         ylab = "MSFT", xlab = "S&P")
    segments(min(ret[, 2]), min(ret[, 2]) * lmneg$coef, x1 = 0, y1 = 0, lwd = 5, col = 2)
    segments(0, 0, x1 = max(ret[, 2]), y1 = max(ret[, 2]) * lmpos$coef, lwd = 5, col = 3)

    # Grid search: candidate break points at roughly every percentile of the market return
    grid1 <- floor(seq(10, n - 1, by = n / 100))
    grid2 <- sort(ret[, 2])[grid1]
    ## Note here (ret[, 2] < grid2[i]) is the indicator function
    d <- numeric(length(grid2))
    for (i in seq_along(grid2)) {
      below <- ret[, 2] < grid2[i]
      above <- ret[, 2] > grid2[i]
      regneg <- lm(ret[below, 1] ~ 0 + ret[below, 2])
      regpos <- lm(ret[above, 1] ~ 0 + ret[above, 2])
      # Sum of squared errors over the whole sample, from the two models
      d[i] <- sum(resid(regneg)^2) + sum(resid(regpos)^2)
    }
    plot(d, type = "b", main = "Grid Search for Best Break Point", ylab = "")
    text(which.min(d), y = min(d) + .001,
         paste("MIN at: ", signif(grid2[which.min(d)], 2)), col = 4)
    points(which.min(d), y = min(d), pch = 21, col = 4, cex = 2.2)
