# Introducing olsrr

[This article was first published on **Rsquared Academy Blog**, and kindly contributed to R-bloggers.]

I am pleased to announce the **olsrr** package, a set of tools for improved
output from linear regression models, designed with beginner and
intermediate R users in mind. The package includes:

- comprehensive regression output
- variable selection procedures
- heteroskedasticity and collinearity diagnostics, and measures of influence
- various plots and underlying data

If you know how to build models using `lm()`, you will find **olsrr** very
useful. Most of the functions take an object of class `lm` as input, so you
just need to build a model using `lm()` and then pass it on to the functions in
**olsrr**. Once you have picked up enough knowledge of R, you can move on to
the more intuitive approaches offered by tidymodels and similar frameworks,
which offer more flexibility than **olsrr**.

### Installation

    # Install release version from CRAN
    install.packages("olsrr")

    # Install development version from GitHub
    # install.packages("devtools")
    devtools::install_github("rsquaredacademy/olsrr")

### Shiny App

**olsrr** includes a Shiny app, which can be launched using

    ols_launch_app()

or try the live version here.

Read on to learn more about the features of **olsrr**, or see the
olsrr website for
detailed documentation on using the package.

### Regression Output

    model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
    ols_regress(model)
    ##                         Model Summary
    ## --------------------------------------------------------------
    ## R                       0.914       RMSE                2.622
    ## R-Squared               0.835       Coef. Var          13.051
    ## Adj. R-Squared          0.811       MSE                 6.875
    ## Pred R-Squared          0.771       MAE                 1.858
    ## --------------------------------------------------------------
    ##  RMSE: Root Mean Square Error
    ##  MSE: Mean Square Error
    ##  MAE: Mean Absolute Error
    ##
    ##                                ANOVA
    ## --------------------------------------------------------------------
    ##                 Sum of
    ##                Squares        DF    Mean Square      F         Sig.
    ## --------------------------------------------------------------------
    ## Regression     940.412         4        235.103    34.195    0.0000
    ## Residual       185.635        27          6.875
    ## Total         1126.047        31
    ## --------------------------------------------------------------------
    ##
    ##                                   Parameter Estimates
    ## ----------------------------------------------------------------------------------------
    ##       model      Beta    Std. Error    Std. Beta      t        Sig      lower     upper
    ## ----------------------------------------------------------------------------------------
    ## (Intercept)    27.330         8.639                  3.164    0.004     9.604    45.055
    ##        disp     0.003         0.011        0.055     0.248    0.806    -0.019     0.025
    ##          hp    -0.019         0.016       -0.212    -1.196    0.242    -0.051     0.013
    ##          wt    -4.609         1.266       -0.748    -3.641    0.001    -7.206    -2.012
    ##        qsec     0.544         0.466        0.161     1.166    0.254    -0.413     1.501
    ## ----------------------------------------------------------------------------------------

In the presence of interaction terms in the model, the predictors are scaled
and centered before computing the standardized betas. `ols_regress()` will
detect interaction terms automatically, but if you have created a new
variable instead of specifying the interaction inline in the model formula,
you can indicate the presence of interaction terms by setting `iterm` to
`TRUE`.
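
To make the "Std. Beta" column more concrete, here is a minimal base R sketch for the model above, which has no interaction terms. It is not olsrr's implementation, only the textbook definition: a standardized beta is the raw coefficient rescaled by the ratio of the predictor's standard deviation to the response's, which is the same as refitting the model on centered and scaled variables. The names `predictors`, `std_beta` and `model_scaled` are made up for this illustration.

```r
# Fit the same model used above (mtcars ships with base R).
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)

# Standardized beta = raw coefficient * sd(predictor) / sd(response).
predictors <- c("disp", "hp", "wt", "qsec")
raw_beta   <- coef(model)[predictors]
std_beta   <- raw_beta * sapply(mtcars[predictors], sd) / sd(mtcars$mpg)

# Equivalent route: center and scale every variable, then refit.
scaled_data  <- as.data.frame(scale(mtcars[c("mpg", predictors)]))
model_scaled <- lm(mpg ~ disp + hp + wt + qsec, data = scaled_data)

round(std_beta, 3)
round(coef(model_scaled)[predictors], 3)
```

Both routes give the same numbers, and they can be compared with the Std. Beta column printed by `ols_regress()`.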

### Residual Diagnostics

**olsrr** offers tools for detecting violation of standard regression assumptions:

- Residual QQ plot
- Residual normality test
- Residual vs Fitted plot
- Residual histogram

    ols_plot_resid_qq(model)

See Residual Diagnostics for more details.
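
For readers who want to see what these checks look like without any extra package, the sketch below builds rough base R counterparts of the four diagnostics listed above. It is only an approximation for intuition; the olsrr functions produce their own, more polished output, and the choice of the Shapiro-Wilk test here is an assumption rather than a statement about which normality tests olsrr runs.

```r
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
res   <- residuals(model)

# Normal QQ plot of the residuals with a reference line.
qqnorm(res, main = "Normal Q-Q plot of residuals")
qqline(res)

# Residuals vs fitted values: a visible pattern hints at non-linearity
# or non-constant variance.
plot(fitted(model), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Histogram of the residuals.
hist(res, main = "Residual histogram", xlab = "Residuals")

# Shapiro-Wilk test; the null hypothesis is that the residuals are normal.
sw <- shapiro.test(res)
sw
```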

### Heteroskedasticity

**olsrr** provides the following four tests for detecting heteroskedasticity:

- Bartlett Test
- Breusch Pagan Test
- Score Test
- F Test

    ols_test_breusch_pagan(model)
    ##
    ##  Breusch Pagan Test for Heteroskedasticity
    ##  -----------------------------------------
    ##  Ho: the variance is constant
    ##  Ha: the variance is not constant
    ##
    ##              Data
    ##  -------------------------------
    ##  Response : mpg
    ##  Variables: fitted values of mpg
    ##
    ##        Test Summary
    ##  ----------------------------
    ##  DF            =    1
    ##  Chi2          =    0.5884673
    ##  Prob > Chi2   =    0.4430124

See Heteroskedasticity for more details.
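
The idea behind the test can be sketched in a few lines of base R. The version below is the classic Breusch-Pagan score statistic with the fitted values as the only explanatory variable for the variance, which is what the "Variables: fitted values of mpg" line in the output suggests. It is an illustration of the technique, not olsrr's code, and olsrr's internals may differ in detail.

```r
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)

# Squared residuals scaled by the maximum-likelihood variance estimate (RSS / n).
e2        <- residuals(model)^2
scaled_e2 <- e2 / mean(e2)
fit_vals  <- fitted(model)

# Auxiliary regression of the scaled squared residuals on the fitted values.
aux <- lm(scaled_e2 ~ fit_vals)

# Score statistic: half the explained sum of squares of the auxiliary regression.
ess     <- sum((fitted(aux) - mean(scaled_e2))^2)
chi2    <- ess / 2
p_value <- pchisq(chi2, df = 1, lower.tail = FALSE)

c(chi2 = chi2, p_value = p_value)
```

A large p-value means there is no evidence against the null hypothesis of constant variance.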

### Collinearity Diagnostics

**olsrr** provides VIF, tolerance and condition indices to detect collinearity, and plots for assessing model fit and the contributions of variables.

    ols_coll_diag(model)
    ## Tolerance and Variance Inflation Factor
    ## ---------------------------------------
    ## # A tibble: 4 x 3
    ##   Variables Tolerance   VIF
    ## 1 disp          0.125  7.99
    ## 2 hp            0.194  5.17
    ## 3 wt            0.145  6.92
    ## 4 qsec          0.319  3.13
    ##
    ##
    ## Eigenvalue and Condition Index
    ## ------------------------------
    ##    Eigenvalue Condition Index   intercept        disp          hp
    ## 1 4.721487187        1.000000 0.000123237 0.001132468 0.001413094
    ## 2 0.216562203        4.669260 0.002617424 0.036811051 0.027751289
    ## 3 0.050416837        9.677242 0.001656551 0.120881424 0.392366164
    ## 4 0.010104757       21.616057 0.025805998 0.777260487 0.059594623
    ## 5 0.001429017       57.480524 0.969796790 0.063914571 0.518874831
    ##             wt         qsec
    ## 1 0.0005253393 0.0001277169
    ## 2 0.0002096014 0.0046789491
    ## 3 0.0377028008 0.0001952599
    ## 4 0.7017528428 0.0024577686
    ## 5 0.2598094157 0.9925403056

See Collinearity Diagnostics for more details.
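
The tolerance and VIF columns follow directly from their definitions, which the base R sketch below spells out: each predictor is regressed on the remaining predictors, tolerance is one minus the resulting R-squared, and VIF is its reciprocal. This shows the standard formula rather than olsrr's own code; `X`, `vif` and `tolerance` are names chosen for the example.

```r
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
X <- model.matrix(model)[, -1]   # predictor columns, intercept dropped

# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
# predictor j on all the other predictors.
vif <- sapply(colnames(X), function(j) {
  r2 <- summary(lm(X[, j] ~ X[, colnames(X) != j]))$r.squared
  1 / (1 - r2)
})
tolerance <- 1 / vif

round(data.frame(Tolerance = tolerance, VIF = vif), 3)
```

The values can be checked against the first table printed by `ols_coll_diag()`.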

### Measures of Influence

**olsrr** offers the following tools to detect influential observations:

- Cook’s D Bar Plot
- Cook’s D Chart
- DFBETAs Panel
- DFFITs Plot
- Studentized Residual Plot
- Standardized Residual Chart
- Studentized Residuals vs Leverage Plot
- Deleted Studentized Residual vs Fitted Values Plot
- Hadi Plot
- Potential Residual Plot

    ols_plot_resid_lev(model)

See Measures of Influence for more details.
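
The quantities behind several of these plots are available in base R, so they can be inspected as numbers as well. The sketch below collects a few of them in one data frame. The `4 / n` cutoff for Cook's distance is a common rule of thumb used here as an assumption; it is not necessarily the threshold olsrr draws on its charts.

```r
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
n <- nrow(mtcars)

# One row per observation, one column per influence measure.
influence_tbl <- data.frame(
  cooks_d  = cooks.distance(model),
  leverage = hatvalues(model),
  rstudent = rstudent(model),   # deleted studentized residuals
  dffits   = dffits(model)
)

# Rule-of-thumb cutoff for Cook's distance (an assumption; others are in use).
cutoff <- 4 / n
influence_tbl[influence_tbl$cooks_d > cutoff, ]
```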

### Variable Selection

**olsrr** implements several variable selection procedures: all possible regression, best subset regression, stepwise regression, stepwise forward regression and stepwise backward regression.

    model <- lm(y ~ ., data = stepdata)
    ols_step_both_aic(model)
    ## Stepwise Selection Method
    ## -------------------------
    ##
    ## Candidate Terms:
    ##
    ## 1 . x1
    ## 2 . x2
    ## 3 . x3
    ## 4 . x4
    ## 5 . x5
    ## 6 . x6
    ##
    ##
    ## Variables Entered/Removed:
    ##
    ## - x6 added
    ## - x1 added
    ## - x3 added
    ## - x2 added
    ## - x6 removed
    ## - x4 added
    ##
    ## No more variables to be added or removed.
    ##
    ##
    ##                                  Stepwise Summary
    ## ----------------------------------------------------------------------------------
    ## Variable     Method        AIC          RSS         Sum Sq       R-Sq      Adj. R-Sq
    ## ----------------------------------------------------------------------------------
    ## x6          addition    33473.297    6241.497    13986.736    0.69145      0.69143
    ## x1          addition    32931.758    6074.156    14154.076    0.69972      0.69969
    ## x3          addition    31912.722    5771.842    14456.391    0.71466      0.71462
    ## x2          addition    29304.296    5065.587    15162.646    0.74958      0.74953
    ## x6          removal     29302.317    5065.592    15162.641    0.74958      0.74954
    ## x4          addition    29300.814    5064.705    15163.528    0.74962      0.74957
    ## ----------------------------------------------------------------------------------

See Variable Selection for more details.
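
For comparison, AIC-based stepwise selection in both directions can also be sketched with base R's `step()`. Because the `stepdata` data set used above is not reproduced in this post, the example falls back on `mtcars`, so its result is not comparable with the output shown. The AIC that `step()` uses internally also differs from `AIC()` by a constant, so only the ordering of models carries over.

```r
# Smallest and largest models that the search is allowed to visit.
null_model <- lm(mpg ~ 1, data = mtcars)
full_model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)

# Start from the intercept-only model and add or drop one term at a time,
# keeping a change only when it lowers the AIC.
selected <- step(
  null_model,
  scope     = list(lower = formula(null_model), upper = formula(full_model)),
  direction = "both",
  trace     = 0   # suppress the step-by-step log
)

formula(selected)
AIC(selected)
```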

### Learning More

The olsrr website includes comprehensive documentation on using the package, including the following articles that cover various aspects of using olsrr:

- Variable Selection: variable selection procedures such as all possible regression, best subset regression, stepwise regression, stepwise forward regression and stepwise backward regression.
- Residual Diagnostics: plots for examining residuals to validate OLS assumptions.
- Heteroskedasticity: tests for heteroskedasticity, including the Bartlett test, Breusch Pagan test, score test and F test.
- Collinearity Diagnostics: VIF, tolerance and condition indices to detect collinearity, and plots for assessing model fit and the contributions of variables.
- Measures of Influence: 10 different plots to detect and identify influential observations.

### Feedback

**olsrr** has been on CRAN for more than a year while we were fixing bugs and
making the API stable. All feedback is welcome. Issues (bugs and feature
requests) can be posted to the GitHub issue tracker.
For help with code or other related questions, feel free to reach me at [email protected].
