Marginal Effects for Regression Models in R #rstats #dataviz


Results of regression models are typically presented as tables of coefficients that are easy to read. Sometimes, however, the estimates are difficult to interpret. This is especially true for interaction terms or transformed terms (quadratic or cubic terms, polynomials, splines), in particular in more complex models. In such cases, the coefficients are no longer directly interpretable, and marginal effects are far easier to understand. In particular, visualizing marginal effects makes it possible to grasp intuitively how predictors and outcome are associated, even for complex models.

The ggeffects-package (Lüdecke 2018) aims at easily calculating marginal effects for a broad range of regression models, from classical models fitted with lm() or glm() to complex mixed models fitted with lme4 and glmmTMB, and even Bayesian models from brms and rstanarm. The goal of the ggeffects-package is to provide a simple, user-friendly interface for calculating marginal effects, which is mainly achieved by one function: ggpredict(). Independent of the type of regression model, the output is always the same: a data frame with a consistent structure.

The idea behind this function is to compute (and visualize) the relationship between a model predictor (independent variable) and the model response (dependent variable). The predictor of interest needs to be specified in the terms-argument.

library(ggeffects)
data(mtcars)
m <- lm(mpg ~ hp + wt + cyl + am, data = mtcars)
ggpredict(m, "cyl")
#> # A tibble: 3 x 5
#>       x predicted conf.low conf.high group
#> 1     4      21.7     19.1      24.4 1    
#> 2     6      20.2     19.3      21.1 1    
#> 3     8      18.7     16.5      21.0 1
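
The same call works unchanged for other model classes. As a quick illustration (this model is not part of the original example, just a sketch), a logistic regression on the same data returns the identical data-frame structure, only with predicted probabilities:

# Illustrative sketch: a logistic regression; ggpredict() returns the same
# columns (x, predicted, conf.low, conf.high, group), now on the probability scale.
m2 <- glm(am ~ hp + wt, data = mtcars, family = binomial)
ggpredict(m2, "hp")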

The relationship can be differentiated depending on further predictors, which is useful e.g. for interaction terms. Up to two further predictors that indicate the "grouping" structure can be used to calculate marginal effects. The names of these predictors need to be passed as a character vector to ggpredict().

m <- lm(mpg ~ wt * cyl + am, data = mtcars)
p <- ggpredict(m, c("wt", "cyl"))
p
#> # A tibble: 27 x 5
#>        x predicted conf.low conf.high group
#>  1   1.5      31.7     28.5      34.9 4    
#>  2   1.5      26.3     23.5      29.2 6    
#>  3   1.5      21.0     16.8      25.1 8    
#>  4   2        28.7     26.6      30.8 4    
#>  5   2        24.2     22.2      26.3 6    
#>  6   2        19.8     16.4      23.1 8    
#>  7   2.5      25.7     24.3      27.2 4    
#>  8   2.5      22.1     20.7      23.5 6    
#>  9   2.5      18.5     15.9      21.2 8    
#> 10   3        22.8     20.8      24.7 4    
#> # ... with 17 more rows

There’s a plot()-method, based on ggplot2:

plot(p)
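
Because the return value of ggpredict() is a plain data frame, you can also build the plot yourself. A minimal sketch, assuming the column names shown above (x, predicted, conf.low, conf.high, group):

# Illustrative sketch: reproduce the basic plot manually with ggplot2.
library(ggplot2)
ggplot(p, aes(x = x, y = predicted, colour = group)) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high, fill = group),
              alpha = 0.15, colour = NA) +
  geom_line()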

This simple approach of ggpredict() can be used for all supported regression models. Thus, when calculating marginal effects with ggpredict(), it makes no difference whether the model is a simple linear model, a negative binomial multilevel model, a cumulative link model, etc. In the case of cumulative link models, ggpredict() automatically takes care of proper grouping, here for the different levels of the response variable:

library(MASS)
library(ordinal)
data(housing)
m <- clm(Sat ~ Type * Cont + Infl, weights = Freq, data = housing)
p <- ggpredict(m, c("Cont", "Type"))
plot(p)
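
To see how this grouping is represented, simply inspect the returned data frame; in recent ggeffects versions the levels of the ordinal response (here the satisfaction levels Low, Medium and High) show up as an additional response.level column:

# Inspect the predictions; for ordinal models they are split by response level.
head(p)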

ggeffects also makes it easy to calculate marginal effects at specific levels of other predictors. This is particularly useful for interaction effects with continuous variables. In the following example, both variables of the interaction term have a large range of values, which obscures the moderating effect:

m <- lm(mpg ~ wt * hp + am, data = mtcars)
p <- ggpredict(m, c("hp", "wt"))
plot(p)

However, you can directly specify the values at which marginal effects should be calculated, or use "shortcuts" that compute convenient values, such as mean +/- 1 SD:

p <- ggpredict(m, c("hp", "wt [meansd]"))
plot(p)
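
Instead of a shortcut, specific values can also be written directly into the brackets of the terms-argument; the values of wt below are picked only for illustration:

# Marginal effects of hp at three hand-picked values of wt (illustrative values).
p <- ggpredict(m, c("hp", "wt [2, 3, 4]"))
plot(p)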

The latest update of ggeffects on CRAN introduced some new features. There is a dedicated website that describes all the details of this package, including some vignettes with lots of examples.
