Estimating continuous piecewise linear regression

April 2, 2013

(This article was first published on R snippets, and kindly contributed to R-bloggers)

When talking about smoothing splines a simple point to start with is a continuous piecewise linear regression with fixed knots. I did not find any simple example showing how to estimate the it in GNU R so I have created a little snippet that does the job.

Assume you are given continuous predictor x and continuous predicted variable y. We want to estimate continuous piecewise linear regression with fixed knots stored in variable knots using standard lm procedure.

The key to a solution is proper definition of regression formula. In order to introduce possibility of change of slope in knot k we have to add a so called hinge term to the model max(0, x-k).

In the code given below function piece.formula automatically generates a proper right hand side of the regression formula given variable name and list of required knots. It is next tested on a simple function.

N <- 40 # number of sampled points
K <- 5 # number of knots

piece.formula <- function(, knots) {
formula.sign <- rep(" - ", length(knots))
formula.sign[knots < 0] <- " + "
paste(, "+",
paste("I(pmax(",, formula.sign, abs(knots), ", 0))",
collapse = " + ", sep=""))

f <- function(x) {
2 * sin(6 * x)

x <- seq(-1, 1, len = N)
y <- f(x) + rnorm(length(x))

knots <- seq(min(x), max(x), len = K + 2)[-c(1, K + 2)]
model <- lm(formula(paste("y ~", piece.formula("x", knots))))

par(mar = c(4, 4, 1, 1))
plot(x, y)
lines(x, f(x))
new.x <- seq(min(x), max(x) ,len = 10000)
points(new.x, predict(model, newdata = data.frame(x = new.x)),
col = "red", pch = ".")
points(knots, predict(model, newdata = data.frame(x = knots)),
col = "red", pch = 18)

Below we can see the graph of estimation result. Red line is the desired continuous piecewise linear regression with fixed knots given by red diamonds. Notice that the plot uses points procedure to plot the red line to highlight that the generated predictions have the required properties.

An additional value of the presented solution that we do not do any preprocessing of predictor variable if we want to make a prediction – all the calculations are made within the formula.

Of course this simple example can be easily extended to obtain a simple smoother. For example we can set K to be large and use some regularized regression like ridge or lasso.

To leave a comment for the author, please follow the link and comment on their blog: R snippets. offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.


Mango solutions

RStudio homepage

Zero Inflated Models and Generalized Linear Mixed Models with R

Quantide: statistical consulting and training


CRC R books series

Contact us if you wish to help support R-bloggers, and place your banner here.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)