# Applied Statistical Theory: Quantile Regression

November 13, 2015
By

[This article was first published on Mathew Analytics » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

This is part two of the ‘applied statistical theory’ series that will cover the bare essentials of various statistical techniques. As analysts, we need to know enough about what we’re doing to be dangerous and explain approaches to others. It’s not enough to say “I used X because the misclassification rate was low.”

Standard linear regression summarizes the average relationship between a set of predictors and the response variable. $\beta_1$ represents the change in the mean value of $Y$ given a one unit change in $X_1$. A single slope is used to describe the relationship. Therefore, linear regression only provides a partial view of the link between the response variable and predictors. This is often inadaquete when there is heterogenous variance between $X$ and $Y$. In such cases, we need to examine how the relationship between $X$ and $Y$ changes depending on the value of $Y$. For example, the impact of education on income may be more pronounced for those at highr income levels than those at lower income levels. Likewise, the the affect of parental care on mean infant birth weight can be compared to it’s effect on other quantiles of infant birth weight. Quantile regression solves for these problems by looking at changes in the different quantiles of the response. The parameter estimates for this technique represent the change in a specified quantile of the response variable produced by a one unit change in the predictor variable. One major benefit of quantile regression is that it makes no assumptions about the error distribution.

library(quantreg)

frmla <- mpg ~ .
u=seq(.02,.98,by=.02)

mm = rq(frmla, data=mtcars, tau=u) # for a series of quantiles
mm = rq(frmla, data=mtcars, tau=0.50) # for the median

summ <- summary(mm, se = "boot")
summ

plot(summ)

To leave a comment for the author, please follow the link and comment on their blog: Mathew Analytics » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...