Welcome to the Tidyverse

September 21, 2016

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

Hadley Wickham, co-author (with Garrett Grolemund) of R for Data Science and RStudio's Chief Scientist, has focused much of his R package development on the un-sexy but critically important part of the data science process: data management. In the Tidy Tools Manifesto, he proposes four basic principles for any computer interface for handling data:

  1. Reuse existing data structures.

  2. Compose simple functions with the pipe.

  3. Embrace functional programming.

  4. Design for humans.
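As a rough illustration of the first three principles, here is a minimal sketch in R. The scores data and its column names are made up for this example and do not come from the manifesto; it assumes the tidyverse package is already installed.

```r
library(tidyverse)

# Principle 1: an ordinary data frame (here a tibble) goes in and comes out.
# The values below are hypothetical.
scores <- tibble(
  student = c("a", "b", "c", "d"),
  group   = c("x", "x", "y", "y"),
  score   = c(10, 20, 30, 50)
)

# Principle 2: small single-purpose functions composed with the pipe.
group_means <- scores %>%
  filter(score > 10) %>%
  group_by(group) %>%
  summarise(mean_score = mean(score))

# Principle 3: functional programming, mapping a function over a vector
# instead of writing an explicit loop.
squares <- map_dbl(1:3, function(x) x^2)
```

Each step in the pipeline takes a data frame and returns a data frame, which is what lets the verbs chain together and read top to bottom.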

Those principles are realized in a new collection of his R packages: the tidyverse. Now, after installing the package from CRAN, a single call to library(tidyverse) loads into your R session a suite of tools that make managing data easier:

  • readr, for importing data from files
  • tibble, a modern iteration on data frames
  • tidyr, functions to rearrange data for analysis
  • dplyr, functions to filter, arrange, subset, modify and aggregate data frames

The tidyverse also loads purrr, for functional programming with data, and ggplot2, for data visualization using the grammar of graphics.
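To show how these packages fit together, the sketch below passes one small table through tibble, readr, tidyr, dplyr and ggplot2 in turn. The city temperatures are invented for illustration, the file is written to a temporary path, and gather() is used for the reshaping step because it was the tidyr verb available when this post was written.

```r
library(tidyverse)

# tibble: build a small data frame (made-up values, one column per month).
wide <- tibble(
  city = c("Oslo", "Lima"),
  jan  = c(-4, 23),
  jul  = c(17, 16)
)

# readr: round-trip the data through a CSV file in a temporary location.
path <- tempfile(fileext = ".csv")
write_csv(wide, path)
temps <- read_csv(path)

# tidyr: rearrange to one row per city and month.
long <- gather(temps, key = "month", value = "temp_c", jan, jul)

# dplyr: aggregate to one mean temperature per city.
by_city <- long %>%
  group_by(city) %>%
  summarise(mean_temp = mean(temp_c)) %>%
  arrange(city)

# ggplot2: describe a plot with the grammar of graphics.
# Calling print(p) would draw it.
p <- ggplot(long, aes(x = month, y = temp_c, colour = city)) +
  geom_point()
```

All five packages become available from the single library(tidyverse) call at the top.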

Installing the tidyverse package also installs (but doesn't automatically load) a raft of other packages to help you work with dates and times, strings, factors (with the new forcats package), and statistical models. It also installs various packages for connecting to remote data sources and reading other data file formats.
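In practice, packages that are installed but not attached need their own library() call. The sketch below assumes lubridate, stringr and forcats were installed along with the tidyverse. Which packages library(tidyverse) attaches has changed between releases, so the explicit calls are harmless even where they are redundant. The example values are made up.

```r
library(tidyverse)   # attaches the core packages
library(lubridate)   # dates and times
library(stringr)     # strings
library(forcats)     # factors

# lubridate: parse a date and pull out its components.
d <- ymd("2016-09-21")
d_year  <- year(d)
d_month <- month(d)

# stringr: consistent string helpers.
title <- str_to_upper("welcome to the tidyverse")

# forcats: put factor levels in a meaningful order.
sizes <- factor(c("small", "large", "medium"))
sizes <- fct_relevel(sizes, "small", "medium", "large")
```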

Simply put, tidyverse puts a complete suite of modern data-handling tools into your R session, and provides an essential toolbox for any data scientist using R. (Also, it's a lot easier to add library(tidyverse) to the top of your script than the dozen or so library(…) calls previously required!) Hadley regularly updates these packages, and you can easily update them in your R installation using the provided tidyverse_update() function.
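A minimal sketch of that workflow follows. The install step is left as a comment because it only needs to run once, and the update check is wrapped in interactive() as a precaution chosen for this example, since it needs network access to CRAN. The exact behavior of tidyverse_update() may differ between versions of the package.

```r
# One-time install from CRAN:
# install.packages("tidyverse")

library(tidyverse)

# List the packages that belong to the tidyverse.
pkgs <- tidyverse_packages()

# Check for newer versions on CRAN, in interactive sessions only.
if (interactive()) {
  tidyverse_update()
}
```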

For more on tidyverse, check out Hadley's post on the RStudio blog, linked below.

RStudio Blog: tidyverse 1.0.0

