Compute R2s and other performance indices for all your models!

[This article was first published on R on easystats, and kindly contributed to R-bloggers.]

Indices of model performance (model quality, goodness of fit, predictive accuracy, etc.) matter both for comparing models and for describing them. However, computing or extracting them across a wide variety of model classes can be complex.

To address this, let us introduce the performance package!

performance

We have recently decided to collaborate on the new easystats project, a set of packages designed to make your life easier (currently a work in progress). The project encompasses several packages, devoted for instance to model access, Bayesian analysis, and indices of model performance.

The goal of performance is to provide lightweight tools to assess and check the quality of your model. It includes functions such as R2 for many models (including logistic, mixed and Bayesian models), the ICC, and helpers to check convergence, overdispersion or zero-inflation. See the package documentation for the full list of functions.

performance can be installed as follows:

install.packages("performance")  # Install the package
library(performance)  # Load it

Examples

Mixed Models

First, we calculate the r-squared value and the intra-class correlation coefficient (ICC) for a mixed model, using r2() and icc(). r2() internally calls the appropriate function for the given model; for mixed models, this is r2_nakagawa(). r2_nakagawa() computes the marginal and conditional r-squared values, while icc() calculates an adjusted and a conditional ICC. Both are based on the proposals from Nakagawa et al. 2017. For more details on how the variances are computed, see get_variance().

# Load the lme4 package
library(lme4)

# Fit a mixed model
model <- lmer(Sepal.Width ~ Petal.Length + (1|Species), data = iris)

# compute R2, based on Nakagawa et al. 2017
r2(model)
> # R2 for mixed models
> 
>   Conditional R2: 0.913
>      Marginal R2: 0.216

# compute intra-class correlation coefficient (ICC)
icc(model)
> # Intraclass Correlation Coefficient
> 
>      Adjusted ICC: 0.889
>   Conditional ICC: 0.697
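
To make the link between these four numbers more concrete, here is a rough sketch that recomputes them by hand from the variance components of the same model. It assumes a Gaussian model with a single random intercept, where the decomposition reduces to three variances (fixed effects, random intercept, residual). It only uses lme4 accessors, the variable names are ours, and it is not meant to describe how r2_nakagawa() or icc() are implemented internally.

```r
library(lme4)

# Same model as above
model <- lmer(Sepal.Width ~ Petal.Length + (1 | Species), data = iris)

# Variance attributable to the fixed effects: variance of the predictions
# made from the fixed effects only (re.form = NA ignores the random effects)
var_fixed <- var(predict(model, re.form = NA))

# Variance of the random intercept for Species
vc <- as.data.frame(VarCorr(model))
var_random <- vc$vcov[vc$grp == "Species"]

# Residual variance
var_residual <- sigma(model)^2

var_total <- var_fixed + var_random + var_residual

# Marginal R2: fixed effects only; conditional R2: fixed + random effects
r2_marginal    <- var_fixed / var_total
r2_conditional <- (var_fixed + var_random) / var_total

# Adjusted ICC ignores the fixed-effects variance; conditional ICC includes it
icc_adjusted    <- var_random / (var_random + var_residual)
icc_conditional <- var_random / var_total

round(c(R2_marginal = r2_marginal, R2_conditional = r2_conditional,
        ICC_adjusted = icc_adjusted, ICC_conditional = icc_conditional), 3)
```

Under these assumptions the results should be close to the values printed above. Note that, in this simple case, the conditional ICC is just the difference between the conditional and the marginal R2.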

Now let’s compute all available indices of performance appropriate for a given model. This can be done via model_performance().

# Compute all performance indices
model_performance(model)
>   AIC BIC R2_conditional R2_marginal  ICC RMSE
> 1 107 119           0.91        0.22 0.89 0.31
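
Some of these columns can be cross-checked with standard accessors. The sketch below assumes that RMSE is the square root of the mean squared residual, and that AIC and BIC are taken from the fitted (REML) model as-is. It is a sanity check under those assumptions, not a description of what model_performance() does internally.

```r
library(lme4)

# Same model as above
model <- lmer(Sepal.Width ~ Petal.Length + (1 | Species), data = iris)

# Information criteria, as returned by the generic accessors
aic <- AIC(model)
bic <- BIC(model)

# Root mean squared error, computed from the model residuals
# (assumption: residuals conditional on the random effects)
rmse <- sqrt(mean(residuals(model)^2))

round(c(AIC = aic, BIC = bic, RMSE = rmse), 2)
```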

Bayesian Mixed Models

For Bayesian mixed models, we have the same features available (r-squared, ICC, …). In this example, we focus on the output from model_performance() only.

# Load the rstanarm package
library(rstanarm)

# Fit a Bayesian mixed model
model <- stan_glmer(Sepal.Width ~ Petal.Length + (1|Species), data = iris)

# Compute performance indices
model_performance(model)
>   ELPD ELPD_SE LOOIC LOOIC_SE WAIC   R2 R2_SE R2_marginal R2_marginal_SE
> 1  -43      10    87       20   87 0.47 0.045        0.26          0.048
>   R2_LOO_adjusted RMSE
> 1            0.45 0.31

More details about performance’s features are coming soon, stay tuned 😉

Get Involved

There is definitely room for improvement, and some new exciting features are already planned. Feel free to let us know how we could further improve this package!

To conclude, note that easystats is a new project in active development. Thus, do not hesitate to contact us if you want to get involved 🙂

  • Check out our other blog posts here!
