(This article was first published on **SAS and R**, and kindly contributed to R-bloggers)

Standardized (or beta) coefficients from a linear regression model are the parameter estimates obtained when the predictors and outcomes have been standardized to have variance = 1. Alternatively, the regression model can be fit and then standardized post-hoc based on the appropriate standard deviations. The parameters are thus interpreted as change in the outcome, in standard deviations, per standard deviation change in the predictors. However they're calculated, standardized coefficients facilitate an assessment of which variables have the greatest association with the outcome (or response) variable, though such an assessment ignores the confidence limits associated with each pairwise association.

It's straightforward to calculate these quantities in SAS and R. We'll demonstrate with data from the HELP study, modeling PCS as a function of MCS and homelessness among female subjects.
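The post-hoc rescaling described above can be written compactly. The notation here is ours rather than the original post's: $b_j$ is the unstandardized slope for predictor $x_j$, and $s_{x_j}$ and $s_y$ are the sample standard deviations of that predictor and of the outcome.

$$
\beta_j^{*} = b_j \, \frac{s_{x_j}}{s_y}
$$

Fitting the model to variables that have already been centered and scaled to unit variance yields the same slopes, with an intercept of 0.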

**SAS**

In SAS, standardized coefficients are available as the `stb` option for the `model` statement in `proc reg`.

    proc reg data="c:\book\help";
      where female eq 1;
      model pcs = mcs homeless / stb;
    run;

                        The REG Procedure
                          Model: MODEL1
                     Dependent Variable: PCS

                       Parameter Estimates

                         Parameter     Standard
    Variable     DF       Estimate        Error    t Value    Pr > |t|
    Intercept     1       39.62619      2.49830      15.86      <.0001
    MCS           1        0.21945      0.07644       2.87      0.0050
    HOMELESS      1       -2.56907      1.95079      -1.32      0.1908

                       Parameter Estimates

                      Standardized
    Variable     DF       Estimate
    Intercept     1              0
    MCS           1        0.26919
    HOMELESS      1       -0.12348

**R**

In R we demonstrate the use of the `lm.beta()` function in the `QuantPsyc` package (due to Thomas D. Fletcher of State Farm). The function is short and sweet, and takes a linear model object as argument:

    > lm.beta
    function (MOD)
    {
        b <- summary(MOD)$coef[-1, 1]
        sx <- sd(MOD$model[-1])
        sy <- sd(MOD$model[1])
        beta <- b * sx/sy
        return(beta)
    }
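The listing above relies on `sd()` accepting a data frame, which current versions of R no longer allow, so it may not run as printed outside an up-to-date `QuantPsyc`. As a minimal sketch of the same calculation, the hypothetical helper `std_beta()` below (our name, not part of any package) takes per-column standard deviations with `sapply()` and checks the result against a refit on z-scored variables. The data are simulated purely for illustration and are not from the HELP study, and the helper assumes numeric predictors with no factors or interaction terms.

```r
# Hypothetical helper (not part of QuantPsyc): standardized slopes from an lm fit.
# Assumes every predictor is numeric and appears as a single column in the model frame.
std_beta <- function(fit) {
  b  <- coef(fit)[-1]              # unstandardized slopes, intercept dropped
  mf <- model.frame(fit)           # first column is the outcome
  sx <- sapply(mf[-1], sd)         # SD of each predictor
  sy <- sd(mf[[1]])                # SD of the outcome
  b * sx / sy
}

# Simulated data for illustration only (one continuous, one 0/1 predictor).
set.seed(42)
n <- 200
sim <- data.frame(x1 = rnorm(n, 50, 10), x2 = rbinom(n, 1, 0.4))
sim$y <- 40 + 0.2 * sim$x1 - 2.5 * sim$x2 + rnorm(n, sd = 10)

fit_raw <- lm(y ~ x1 + x2, data = sim)                        # original scale
fit_std <- lm(y ~ x1 + x2, data = as.data.frame(scale(sim)))  # z-scored variables

post_hoc <- std_beta(fit_raw)     # rescale after fitting
refit    <- coef(fit_std)[-1]     # fit after rescaling
print(rbind(post_hoc, refit))     # the two rows should agree
```

Both routes described at the top of the post (standardize first, or fit first and rescale) should give the same slopes, which is what the final comparison illustrates.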

Here we apply the function to data from the HELP study.

    ds = read.csv("http://www.math.smith.edu/r/data/help.csv")
    female = subset(ds, female==1)
    lm1 = lm(pcs ~ mcs + homeless, data=female)

The results, in terms of unstandardized regression parameters, are the same as in SAS:

    > summary(lm1)

    Call:
    lm(formula = pcs ~ mcs + homeless, data = female)

    Residuals:
        Min      1Q  Median      3Q     Max
    -28.163  -5.821  -1.017   6.775  29.979

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) 39.62619    2.49830  15.861  < 2e-16 ***
    mcs          0.21945    0.07644   2.871  0.00496 **
    homeless    -2.56907    1.95079  -1.317  0.19075
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 9.761 on 104 degrees of freedom
    Multiple R-squared: 0.0862,     Adjusted R-squared: 0.06862
    F-statistic: 4.905 on 2 and 104 DF,  p-value: 0.009212

To generate the standardized parameter estimates, we use the `lm.beta()` function.

    library(QuantPsyc)
    lm.beta(lm1)

This generates the following output:

           mcs   homeless
     0.2691888 -0.1234776

A 1 standard deviation change in MCS is associated with more than twice as large a change in PCS as a 1 standard deviation change in the HOMELESS variable. This example points up another potential weakness of standardized regression coefficients, however, in that the homeless variable can take on values of 0 or 1, and a 1 standard deviation change is hard to interpret.
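A short sketch makes both points concrete. The first part only reuses the two standardized estimates printed above. The second part uses a made-up 0/1 indicator (not HELP data) to show that the standard deviation of a binary variable is close to sqrt(p(1 - p)), which in samples of any reasonable size is at most about 0.5, so a "1 standard deviation change" is smaller than the only change the variable can actually make (0 to 1).

```r
# Standardized estimates copied from the lm.beta() output above.
beta <- c(mcs = 0.2691888, homeless = -0.1234776)

# Relative magnitude of the two standardized associations (about 2.18).
ratio <- abs(beta[["mcs"]]) / abs(beta[["homeless"]])
print(round(ratio, 2))

# Made-up 0/1 indicator for illustration only: 60 zeros and 40 ones.
x <- rep(c(0, 1), times = c(60, 40))
p <- mean(x)                       # proportion of ones

# Sample SD versus the large-sample approximation sqrt(p * (1 - p)).
sd_x      <- sd(x)
sd_approx <- sqrt(p * (1 - p))
print(c(sd = sd_x, approx = sd_approx))
```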
