Data visualization in social sciences – what’s new in the sjPlot-package? #rstats

[This article was first published on R – Strenge Jacke!, and kindly contributed to R-bloggers.]

My sjPlot package just reached version 2.0 and got many updates over the last couple of months. The focus was less on adding new functions; rather, I improved existing functions by adding smaller and bigger features to make working with the package easier and more reliable. In this blog post, I will report on some of the new features.

Consistent name style of arguments

Most notably, I tried to give all package functions a consistent naming style or pattern for arguments. In previous versions, the mix of different naming styles was sometimes very confusing. For example, some functions used showNA, others na.rm or show.na. Likewise, some functions used hideLegend, some showLegend and still others show.legend.

Now, all argument names are 1) lower case, 2) dot-separated for longer words and 3) grouped according to their function (e.g., if you open the docs for ?sjt.lm, you’ll find all show. arguments, then all string. and finally all digits. arguments). I know that this means that you most likely have to completely re-write your code that uses sjPlot-function calls, but I think that, in the long run, this makes working with the sjPlot package easier.

Support for different model families and link functions

In previous package versions, functions related to generalized linear models (like sjp.glm or sjp.glmer) were hard-coded for binomial model families for most plot types. Some effect or prediction plots only worked for logistic regression, because predictions were based on plogis. Also, the automatic titling of plots always included "probability", even for count models.

With the past package updates, and especially the last major update, prediction and effect plots are now based on the link-inverse function of the model, so all common model families and link functions should work with sjPlot now.
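To illustrate why the link-inverse function generalizes where plogis does not, here is a minimal base-R sketch. It does not use sjPlot itself; the data are simulated and all variable names are hypothetical.

```r
# Simulated data; variable names are made up for illustration.
set.seed(123)
n <- 200
x <- runif(n, 0, 2)
y_bin <- rbinom(n, size = 1, prob = plogis(-1 + 1.2 * x))  # binary outcome
y_cnt <- rpois(n, lambda = exp(0.3 + 0.5 * x))             # count outcome

fit_logit <- glm(y_bin ~ x, family = binomial(link = "logit"))
fit_pois  <- glm(y_cnt ~ x, family = poisson(link = "log"))

# Linear predictor (link scale) for each observation
eta_logit <- predict(fit_logit, type = "link")
eta_pois  <- predict(fit_pois, type = "link")

# family(fit)$linkinv maps the linear predictor back to the response scale,
# whatever the family is: plogis() for the logit link, exp() for the log link.
mu_logit <- family(fit_logit)$linkinv(eta_logit)
mu_pois  <- family(fit_pois)$linkinv(eta_pois)

same_logit <- isTRUE(all.equal(unname(mu_logit), unname(plogis(eta_logit))))
same_pois  <- isTRUE(all.equal(unname(mu_pois), unname(exp(eta_pois))))
```

Applying plogis to the Poisson model's linear predictor would squeeze the predicted counts into the range between 0 and 1, which is why hard-coding it only works for logistic regression.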

Predictions and effect plots

In some cases, it is easier to interpret the predicted probabilities, incidence rates or marginal effects instead of the related estimates (odds ratios, incidence rate ratios, beta). For linear models (sjp.lm), linear mixed models (sjp.lmer), generalized linear models (sjp.glm) and generalized linear mixed models (sjp.glmer), there are three different plot types to plot predicted values or marginal effects:

  1. type = "slope" (or type = "fe.slope" and type = "ri.slope" for mixed models) to plot unadjusted predicted values, i.e. the relation between model terms and response.
  2. type = "eff" to plot marginal effects, adjusted for all predictors.
  3. type = "pred" (and type = "pred.fe" for mixed models) to plot predicted values against response, for particular model terms.

The following examples are taken from the vignette of the sjp.glm-function.

1. Predicted values, unadjusted

The predicted values from this plot type are based on the intercept’s estimate and each specific term’s estimate. All other co-variates are set to zero (i.e. ignored), which corresponds to family(fit)$linkinv(eta = b0 + bi * xi) (where b0 is the intercept’s estimate, bi the term’s estimate and xi the values of that term).

[Figure: Predicted values, unadjusted]

A probability curve is plotted for each predictor, which indicates the probability of the event (indicated by the response) occurring for each value of the predictor (not adjusted for the remaining co-variates). In the above example, the first panel in the plot would be interpreted as: with increasing Barthel-Index (which means better functional / physical status), the probability that caring for a dependent person is negatively perceived decreases (in short: the less dependent a person I care for is, the less negative is the impact of care).
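The unadjusted computation can be reproduced by hand in base R. The following is only a sketch of the formula family(fit)$linkinv(b0 + bi * xi), not the package's internal code; the data are simulated, and the variable names (barthel, care_hours, neg_impact) are hypothetical stand-ins for the variables in the example.

```r
# Simulated data; names and effect sizes are assumptions for illustration.
set.seed(42)
n <- 300
barthel    <- round(runif(n, 0, 100))   # functional status score
care_hours <- round(runif(n, 5, 100))   # hours of care per week
neg_impact <- rbinom(n, 1, plogis(0.5 - 0.03 * barthel + 0.01 * care_hours))
dat <- data.frame(neg_impact, barthel, care_hours)

fit <- glm(neg_impact ~ barthel + care_hours, data = dat, family = binomial)

b0 <- coef(fit)[["(Intercept)"]]   # intercept's estimate
bi <- coef(fit)[["barthel"]]       # estimate of the term of interest
xi <- seq(min(dat$barthel), max(dat$barthel), length.out = 50)

# Unadjusted prediction: only the intercept and the term of interest enter
# the linear predictor, so care_hours is implicitly set to zero.
pred_unadj <- family(fit)$linkinv(b0 + bi * xi)

# The same values via predict(), with the other co-variate fixed at zero
pred_check <- predict(
  fit,
  newdata = data.frame(barthel = xi, care_hours = 0),
  type = "response"
)
```

Plotting pred_unadj against xi, for instance with plot(xi, pred_unadj, type = "l"), gives a curve of the kind shown in one panel of the figure.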

2. Effect plots

For marginal effects (predicted marginal probabilities or predicted marginal incidence rates), all remaining co-variates are set to their mean, so this plot type adjusts for co-variates. The results are obtained from the effects-package.

[Figure: Marginal effects, adjusted]

The effect plots can now also be non-faceted, and for selected model terms only (using the facet.grid and vars arguments).
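The idea of holding the remaining co-variates at their mean can be sketched in base R as follows. This is a simplified manual version for a model with numeric predictors only; the effects-package handles more cases (factors, for example) and is not used here. The data are simulated and the variable names are hypothetical.

```r
# Simulated data; names and effect sizes are assumptions for illustration.
set.seed(42)
n <- 300
barthel    <- round(runif(n, 0, 100))
care_hours <- round(runif(n, 5, 100))
neg_impact <- rbinom(n, 1, plogis(0.5 - 0.03 * barthel + 0.01 * care_hours))
dat <- data.frame(neg_impact, barthel, care_hours)

fit <- glm(neg_impact ~ barthel + care_hours, data = dat, family = binomial)

# Vary the term of interest, hold the other co-variate at its mean
xi <- seq(min(dat$barthel), max(dat$barthel), length.out = 50)
newdat <- data.frame(barthel = xi, care_hours = mean(dat$care_hours))
pred_adj <- predict(fit, newdata = newdat, type = "response")

# Equivalent manual computation via the link-inverse function
b <- coef(fit)
eta <- b[["(Intercept)"]] + b[["barthel"]] * xi +
  b[["care_hours"]] * mean(dat$care_hours)
pred_manual <- family(fit)$linkinv(eta)
```

Compared with the unadjusted version, the only difference is the extra term for care_hours in the linear predictor, which shifts the whole curve.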

3. Predicting values

The plot type for predicting values did not produce any useful results in former package versions, because it just called the predict function without relation to any predictor or meaningful data. This plot type has now been completely revised. With type = "pred" (formerly "y.pc"), you can plot predicted values for the response, related to specific model predictors. The predicted values of the response are computed with predict(fit, type = "response"). This plot type requires the vars argument to select the specific terms that should be used for the x-axis and, optionally, as grouping factor. Hence, vars must be a character vector with the names of one or two model predictors.

[Figures: Predicting values]
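A rough base-R sketch of this idea: compute predict(fit, type = "response") and relate the result to one or two selected predictors. How sjPlot prepares the values for plotting internally is not shown here; the aggregation step below is my own assumption for illustration, and the data and variable names are simulated and hypothetical.

```r
# Simulated data; names and effect sizes are assumptions for illustration.
set.seed(1)
n <- 400
dat <- data.frame(
  barthel = round(runif(n, 0, 100)),
  sex     = factor(sample(c("female", "male"), n, replace = TRUE))
)
dat$neg_impact <- rbinom(
  n, 1, plogis(1 - 0.03 * dat$barthel + 0.5 * (dat$sex == "male"))
)

fit <- glm(neg_impact ~ barthel + sex, data = dat, family = binomial)

# First name: term for the x-axis; second name: optional grouping term
vars <- c("barthel", "sex")

# Predicted values of the response for each observation
dat$predicted <- unname(predict(fit, type = "response"))

# One row per observed combination of the selected terms
pred_by_vars <- aggregate(
  reformulate(vars, response = "predicted"),
  data = dat,
  FUN = mean
)
head(pred_by_vars[order(pred_by_vars$sex, pred_by_vars$barthel), ])
```

The resulting data frame has one predicted value per x-axis value and group, which could then be drawn as one line per group.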

Table functions for mixed models

The table functions were also revised, especially for mixed models. You now have more details in the random parts section of the table, which now also shows the variance components of the random parts, or (pseudo-)r2-values.

The tables are created as an HTML page and displayed in your IDE’s viewer or your web browser. You can see many examples at the package vignettes-page. For the following example, I have taken a screenshot, because otherwise the blog’s style sheet would break the table layout. Anyway, this is an example of a quickly produced table:

[Screenshot: example table]

Closing remarks

A lot of improvements have been made to the sjPlot package during the past updates. Above you see examples of the most obvious user-visible changes, but there were also lots of other smaller and bigger improvements. For example, plotting functions with different plot types, like sjp.glm, have many arguments; most of them only applied to specific plot types, while they were ignored by other plot types. Now, all plot types support most or all arguments, and the documentation should be clearer about what the functions and their arguments do.

I hope you’ll enjoy the sjPlot-package. Feel free to submit issues or suggestions to the dedicated GitHub-page.


Tagged: data visualization, ggplot, R, rstats, sjPlot
