June 26, 2009

(This article was first published on Learning R, and kindly contributed to R-bloggers)

Before we delve into slightly more advanced plotting commands, I want to talk a little about linear models, specifically linear regression. In R this is very simple. For instance, in our ‘states’ data frame, we might want to look at median household income as a predictor of state education expenditures. The lm command fits this model for us; in its formula, the response goes to the left of the ~ and the predictor to the right. We’ll call our first model ‘model1’:

model1 <- lm(publicedexp ~ hincome)
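
The command above assumes that publicedexp and hincome can be found by name, for example because the ‘states’ data frame has been attached. A minimal sketch of an alternative is to hand the data frame to lm through its data argument. The names states_demo and model_demo below are hypothetical stand-ins, and the numbers are simulated rather than taken from the real ‘states’ data, so the estimates will not match the results in this post.

```r
# Simulated stand-in for the 'states' data frame (NOT the real data).
set.seed(1)
states_demo <- data.frame(hincome = runif(51, min = 30000, max = 60000))
states_demo$publicedexp <- 250 + 0.023 * states_demo$hincome +
  rnorm(51, mean = 0, sd = 200)

# Same formula as model1, but the columns are looked up in states_demo
# instead of relying on an attached data frame.
model_demo <- lm(publicedexp ~ hincome, data = states_demo)
```

Passing data explicitly keeps the model tied to one data frame, which helps once several data frames with similar column names are in play.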

OK, great, but where are our results? One of the things about R is that you can assign names to all sorts of things, even models. That way, you can keep referring to them when doing other things (as we’ll see a bit later). Because the fitted model was stored in model1 rather than printed, nothing appears on screen yet. The way to look at our results is with the summary command:

summary(model1)

Call:
lm(formula = publicedexp ~ hincome)

Residuals:
    Min      1Q  Median      3Q     Max
-397.50 -127.43   -8.69  120.96  431.85

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.516e+02  1.735e+02   1.450    0.153
hincome     2.346e-02  3.869e-03   6.063 1.87e-07 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 198.8 on 49 degrees of freedom
Multiple R-squared: 0.4287, Adjusted R-squared: 0.417
F-statistic: 36.76 on 1 and 49 DF, p-value: 1.87e-07

The summary gives us a lot more information than if we’d just run the lm command without assigning a name to the model, which prints only the call and the coefficient estimates. Later we’ll look at how we can integrate our linear model with our plots.
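
A few notes on reading the summary output. The hincome estimate of 2.346e-02 means that each additional 1,000 units of hincome goes with roughly 1000 × 0.02346 ≈ 23.5 more units of publicedexp. The t value is the estimate divided by its standard error (0.02346 / 0.003869 ≈ 6.06), and the multiple R-squared of 0.4287 says that income accounts for about 43% of the variation in education expenditure. Because the summary is itself an object, these numbers can also be pulled out by name. The sketch below shows one way to do that; states_demo and model_demo are hypothetical stand-ins built from simulated data, not the real ‘states’ data, so the values it produces will differ from those above.

```r
# Simulated stand-in for the 'states' data frame (NOT the real data).
set.seed(1)
states_demo <- data.frame(hincome = runif(51, min = 30000, max = 60000))
states_demo$publicedexp <- 250 + 0.023 * states_demo$hincome +
  rnorm(51, mean = 0, sd = 200)

model_demo <- lm(publicedexp ~ hincome, data = states_demo)
model_summary <- summary(model_demo)

# Coefficient table as a matrix with columns
# "Estimate", "Std. Error", "t value" and "Pr(>|t|)".
coef_table <- coef(model_summary)
slope    <- coef_table["hincome", "Estimate"]
slope_se <- coef_table["hincome", "Std. Error"]

# The reported t value is the estimate divided by its standard error.
t_by_hand <- slope / slope_se

# Other pieces of the printed summary, available by name.
r_squared     <- model_summary$r.squared      # Multiple R-squared
adj_r_squared <- model_summary$adj.r.squared  # Adjusted R-squared
resid_se      <- model_summary$sigma          # Residual standard error

# 95% confidence intervals for the intercept and slope.
conf_intervals <- confint(model_demo, level = 0.95)
```

Storing these values in variables is what makes it possible to reuse them later, for example when annotating a plot.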
