Regression – covariate adjustment

[This article was first published on Gregor Gorjanc (gg), and kindly contributed to R-bloggers.]

Linear regression is one of the key concepts in statistics [wikipedia1, wikipedia2]. However, people often confuse the meaning of its parameters. The intercept tells us the average value of y at x=0, while the slope tells us how much y changes on average when x increases by one unit. This is exactly the same as in a linear function, except that we speak of averages because of noise.
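To make these two interpretations concrete, here is a minimal sketch on simulated data. The seed, the true values (intercept 2, slope 0.5) and the variable names are assumptions chosen only for this illustration.

```r
set.seed(1)                               # arbitrary seed, for reproducibility only
x <- runif(n=200, min=0, max=10)
y <- 2 + 0.5 * x + rnorm(n=200, sd=1)     # assumed true intercept 2 and slope 0.5

fit <- lm(y ~ x)
b <- coef(fit)

# Intercept: the fitted average of y at x = 0
p0 <- unname(predict(fit, newdata=data.frame(x=0)))

# Slope: the change in the fitted average of y when x increases by one unit
p <- predict(fit, newdata=data.frame(x=c(3, 4)))
slopeFromPred <- unname(p[2] - p[1])

all.equal(p0, unname(b["(Intercept)"]))
all.equal(slopeFromPred, unname(b["x"]))
```

Both comparisons should return TRUE, because the prediction at x=0 is the intercept and the difference between predictions one unit apart is the slope.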

Today a colleague got confused about what adjusting (shifting) a covariate (the x variable) does to the parameter estimates. By shifting the x scale, we also shift the point at which the intercept is estimated, while the slopes stay the same. I made the following graph to demonstrate this for a nested regression of y on x within a group factor with two levels. The R code that produces the plots is shown at the bottom.
[Figure: Regression – covariate adjustment]
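The shift is easiest to see with a single regression line. If y = a + b*x, then y = (a + b*k) + b*(x - k), so regressing on x - k keeps the slope b and moves the intercept to a + b*k, the fitted value at x = k. A minimal sketch on simulated data follows; the seed, the true coefficients and the names fitRaw and fitShift are assumptions made only for this illustration.

```r
set.seed(2)                               # arbitrary seed, for reproducibility only
x <- rnorm(n=100, mean=60, sd=5)
y <- 4 + 0.3 * x + rnorm(n=100, sd=1)     # assumed true intercept 4 and slope 0.3

k <- 60                                   # shift constant
xK <- x - k                               # shifted covariate: xK = 0 means x = k

fitRaw   <- lm(y ~ x)
fitShift <- lm(y ~ xK)

a  <- unname(coef(fitRaw)["(Intercept)"])
b  <- unname(coef(fitRaw)["x"])
aK <- unname(coef(fitShift)["(Intercept)"])
bK <- unname(coef(fitShift)["xK"])

all.equal(bK, b)            # slope is unchanged by the shift
all.equal(aK, a + b * k)    # new intercept is the fitted value at x = k
```

Both comparisons should return TRUE up to numerical tolerance.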
library(package="MASS")
 
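# Covariate values for the two groups; with empirical=TRUE the sample mean
# and variance match mu and Sigma exactly
x1 <- mvrnorm(n=100, mu=50, Sigma=20, empirical=TRUE)
x2 <- mvrnorm(n=100, mu=70, Sigma=20, empirical=TRUE)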
 
mu1 <- mu2 <- 4
b1 <- 0.300
b2 <- 0.250
 
y1 <- mu1 + b1 * x1 + rnorm(n=100, sd=1) 
y2 <- mu2 + b2 * x2 + rnorm(n=100, sd=1) 
 
x <- c(x1, x2)
xK <- x - 60  # shifted covariate: xK = 0 corresponds to x = 60
y <- c(y1, y2)
g <- factor(rep(c(1, 2), each=100))
 
par(mfrow=c(2, 1), pty="m", bty="l")
 
# Original x scale: the intercepts are the group means of y at x = 0
(fit1n <- lm(y ~ g + x + x:g))
## (Intercept)           g2            x         g2:x  
##     3.06785      2.32448      0.31967     -0.09077  
beta <- coef(fit1n)
 
plot(y ~ x, col=c("blue", "red")[g], ylim=c(0, max(y)), xlim=c(0, max(x)), pch=19, cex=0.25)
points(x=mean(x1), y=mean(y1), pch=19)
points(x=mean(x2), y=mean(y2), pch=19)
abline(v=c(mean(x1), mean(x2)), lty=2, col="gray")
abline(h=c(mean(y1), mean(y2)), lty=2, col="gray")
 
points(x=0, y=beta["(Intercept)"],              pch=19, col="blue")
points(x=0, y=beta["(Intercept)"] + beta["g2"], pch=19, col="red")
 
z <- 0:max(x)
lines(y= beta["(Intercept)"]               +  beta["x"] * z                , x=z, col="blue")
lines(y=(beta["(Intercept)"] + beta["g2"]) + (beta["x"] + beta["g2:x"]) * z, x=z, col="red")
 
# Shifted x scale: the intercepts are now the group means of y at x = 60
(fit2n <- lm(y ~ g + xK + xK:g))
## (Intercept)           g2           xK        g2:xK  
##    22.24824     -3.12153      0.31967     -0.09077
beta <- coef(fit2n)
 
plot(y ~ x, col=c("blue", "red")[g], ylim=c(0, max(y)), xlim=c(0, max(x)), pch=19, cex=0.25)
points(x=mean(x1), y=mean(y1), pch=19)
points(x=mean(x2), y=mean(y2), pch=19)
abline(v=c(mean(x1), mean(x2)), lty=2, col="gray")
abline(h=c(mean(y1), mean(y2)), lty=2, col="gray")
 
abline(v=60, lty=2, col="gray")
 
points(x=60, y=beta["(Intercept)"],              pch=19, col="blue")
points(x=60, y=beta["(Intercept)"] + beta["g2"], pch=19, col="red")
 
z <- (0:max(x)) - 60  # grid on the shifted scale; z + 60 maps it back to x
lines(y= beta["(Intercept)"]               +  beta["xK"] * z                 , x=z + 60, col="blue")
lines(y=(beta["(Intercept)"] + beta["g2"]) + (beta["xK"] + beta["g2:xK"]) * z, x=z + 60, col="red")
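The two fits describe the same pair of lines, so their coefficients are linked by simple algebra: the intercept on the shifted scale equals the original intercept plus 60 times the slope, and the group difference at x=60 equals the original g2 plus 60 times the g2:x interaction. The printed estimates above agree with this up to rounding (for example, 3.06785 + 60 * 0.31967 is about 22.248). The sketch below checks the relations on freshly simulated data. It is self-contained and, as an assumption of this sketch, uses rnorm() with an arbitrary seed instead of mvrnorm(), so its numbers will differ from those shown above.

```r
set.seed(3)                               # arbitrary seed, for reproducibility only
n <- 100
x <- c(rnorm(n, mean=50, sd=sqrt(20)),    # group 1 covariate
       rnorm(n, mean=70, sd=sqrt(20)))    # group 2 covariate
g <- factor(rep(c(1, 2), each=n))
y <- 4 + ifelse(g == 1, 0.30, 0.25) * x + rnorm(2 * n, sd=1)

k <- 60
xK <- x - k

b1 <- coef(lm(y ~ g + x + x:g))           # original scale
b2 <- coef(lm(y ~ g + xK + xK:g))         # shifted scale

# Slopes (common slope and group-specific deviation) are unchanged
all.equal(unname(b2[c("xK", "g2:xK")]), unname(b1[c("x", "g2:x")]))

# Intercept moves to the fitted value of group 1 at x = k
all.equal(unname(b2["(Intercept)"]), unname(b1["(Intercept)"] + k * b1["x"]))

# Group effect becomes the difference between the two lines at x = k
all.equal(unname(b2["g2"]), unname(b1["g2"] + k * b1["g2:x"]))
```

All three comparisons should return TRUE up to numerical tolerance. This also explains why the group effect can change sign after shifting: when the slopes differ, the distance between the two lines depends on where along x it is measured.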
