My set of packages for (daily) data analysis #rstats

June 19, 2017

(This article was first published on R – Strenge Jacke!, and kindly contributed to R-bloggers)

I started writing my first package as collection of various functions that I needed for (almost) daily work. Meanwhile, packages were growing and bit by bit I sourced out functions to put them into new packages. Although this means more work for CRAN members when they have more packages to manage on their network, from a user-perspective it is much better if packages have a clear focus and a well defined set of functions. That’s why I now released a new package on CRAN, sjlabelled, which contains all functions that deal with labelled data. These functions use to live in the sjmisc-package, where they now are deprecated and will be removed in a future update.

My aim is not only to provide packages with a clear focus, but also with a consistent design and philosophy, making it easier and more intuitive to use (see also here) – I prefer to follow the so called „tidyverse“-approach here. It is still work in progress, but so far I think I’m on a good way…

So, what are the packages I use for (almost) daily work?

  • sjlabelled – reading, writing and working with labelled data (especially since I collaborate a lot with people who use Stata or SPSS)
  • sjmisc – data and variable transformation utilities (the complement to dplyr and tidyr, when it comes down from data frames to variables within the data wrangling process)
  • sjstats – out-of-the-box statistics that is not already provided by base R or packages
  • sjPlot – to quickly generate tables and plot
  • ggeffects – to visualize regression models

The next step is creating cheat sheets for my packages. I think if you can map the scope and idea of your package (functions) on a cheat sheet, its focus is probably well defined.

Btw, if you also use some of the above packages more or less regularly, you can install the „strengejacke“-package to load them all in one step. This package is not on CRAN, because its only purpose is to load other packages.

Disclaimer: Of course I use other packages everyday as well – this posting is just focussing on my packages that I created because I frequently needed these kind of functions.

Tagged: data visualization, ggplot, R, rstats, sjPlot, Statistik

To leave a comment for the author, please follow the link and comment on their blog: R – Strenge Jacke!. offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.


Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)