Learning guide: Introduction to the Tidyverse, one-day workshop

This article was first published on George J. Mount and kindly contributed to R-bloggers.

To an outsider, some R packages sound too cheeky to be very valuable. Take, for example, the tidyverse. What on earth does that groaner of a portmanteau do?

By the end of this workshop, you’ll know that the tidyverse is so called because it’s a collection of packages used together to clean, model and depict data using “tidy” principles.

More than a package

That means the tidyverse is more than a package. It’s even more than a series of packages. It’s a whole “mental model” of how data should work.

Hadley Wickham and Garrett Grolemund, in R for Data Science, depict the data workflow like this:

The “tidy” workflow from R for Data Science

That is, it’s an iterative process that starts with preparing the data by importing it and then “tidying” it. Those first steps are the focus of this learning guide, along with the basics of data visualization.

There are different ways to prepare a dataset for analysis, but the nice thing about using the “tidy” framework is that there are no surprises in how to do it. This framework will help you think explicitly through what needs to happen to a dataset before it is useful in your analysis.

After that, the tidyverse provides a suite of tools for continuing your data journey: from the re-shaping, to the manipulating, to the visualizing.

This one-day workshop focuses on the elements of the tidyverse most commonly used in basic data cleaning, exploratory data analysis and visualization.

Get your copy of the guide below. You are welcome to use this learning guide at your school or workplace, whether to run workshops or in any other way you find useful.

This learning guide is part of my resource library. For exclusive free access, subscribe to my newsletter below.

If you’re an individual user looking to get acquainted with the tidyverse, I suggest my book Advancing into Analytics: From Excel to Python and R.

Introduction to the tidyverse workshop

Lesson 1: The tidyverse and tidy data

Objective: Student can compare and contrast the tidyverse to the general R environment

Description:

  • What is tidy data?
  • A tour of the tidy galaxies
  • The tidy workflow

Time: 40 minutes

Assets needed: none
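
To make “tidy data” concrete, here is a minimal sketch of the same made-up records in an untidy and a tidy layout. The team names and win counts are invented for illustration and are not taken from the workshop assets.

```r
library(tibble)

# Hypothetical data, invented for illustration only.
# Untidy: the "year" variable is hidden in the column names.
untidy <- tibble(
  team   = c("A", "B"),
  `2019` = c(90, 75),
  `2020` = c(35, 28)
)

# Tidy: each variable (team, year, wins) has its own column,
# and each row is one team-season observation.
tidy <- tibble(
  team = c("A", "A", "B", "B"),
  year = c(2019, 2020, 2019, 2020),
  wins = c(90, 35, 75, 28)
)

print(untidy)
print(tidy)
```

Both tables hold the same information; the tidy one is simply the shape that most tidyverse functions expect as input.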

Lesson 2: Importing data

Objective: Student can read tabular files into R

Description:

  • Introduction to the tibble
  • Importing text files
  • Importing Excel workbooks

Time: 40 minutes

Assets needed: Baseball records
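
As a rough sketch of what importing a text file looks like with readr, the example below writes a tiny made-up CSV to a temporary file and reads it back as a tibble. The column names and values are assumptions for illustration; the actual baseball records file is not reproduced here.

```r
library(readr)

# Write a small hypothetical CSV to a temporary file so the
# example is self-contained (not the workshop's real dataset).
path <- tempfile(fileext = ".csv")
write_lines(c("player,team,hits", "Smith,A,150", "Jones,B,172"), path)

# read_csv() returns a tibble and guesses each column's type.
records <- read_csv(path, show_col_types = FALSE)
print(records)

# An Excel workbook could be read in a similar way with the readxl
# package. The file name below is hypothetical:
# readxl::read_excel("records.xlsx", sheet = 1)
```

Printing a tibble shows the column types under the column names, which is a quick way to check that the import guessed them correctly.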

Lesson 3: Re-shaping data

Objective: Student can transform a dataset to fit tidy principles

Description:

  • Pivoting and un-pivoting datasets
  • Delimiting columns

Time: 60 minutes

Assets needed: Baseball records
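
The reshaping steps in this lesson might look something like the following with tidyr. The data is made up, and `separate()` is only one of several ways to split a delimited column.

```r
library(tibble)
library(tidyr)

# Hypothetical wide data: one column per year, names packed into one column.
wide <- tibble(
  player = c("Smith, Al", "Jones, Bo"),
  `2019` = c(150, 172),
  `2020` = c(60, 55)
)

# Un-pivot: turn the year columns into year/hits pairs.
long <- pivot_longer(
  wide,
  cols      = c(`2019`, `2020`),
  names_to  = "year",
  values_to = "hits"
)

# Delimit: split "Last, First" into two columns.
long <- separate(long, player, into = c("last", "first"), sep = ", ")

# Pivot: spread the years back out into one column each.
wide_again <- pivot_wider(long, names_from = year, values_from = hits)

print(long)
print(wide_again)
```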

Lesson 4: Manipulating data

Objective: Student can create a data manipulation pipeline

Description:

  • Manipulating rows & columns
  • Aggregating & summarizing data
  • Piping functions

Time: 75 minutes

Assets needed: Baseball records
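
Here is a minimal sketch of a dplyr pipeline that touches each topic in this lesson: filtering rows, adding a column, aggregating by group, and chaining the steps with the pipe. The batting figures are invented for illustration.

```r
library(tibble)
library(dplyr)

# Hypothetical batting data, invented for illustration only.
batting <- tibble(
  player  = c("Smith", "Jones", "Lee", "Diaz"),
  team    = c("A", "A", "B", "B"),
  at_bats = c(500, 450, 520, 100),
  hits    = c(150, 120, 170, 20)
)

team_summary <- batting %>%
  filter(at_bats >= 200) %>%           # rows: drop small samples
  mutate(avg = hits / at_bats) %>%     # columns: add batting average
  group_by(team) %>%                   # aggregate per team
  summarise(players = n(), mean_avg = mean(avg)) %>%
  arrange(desc(mean_avg))              # highest average first

print(team_summary)
```

Each step takes a data frame and returns a data frame, which is what lets the pipe pass one step's result straight into the next.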

Lesson 5: Joining and appending data

Objective: Student can combine data from multiple tables by appending and joining

Description:

  • Appending two or more tables
  • Joining two tables: left, right, inner, outer

Time: 75 minutes

Assets needed: Baseball records
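
Appending and joining could be sketched as follows with dplyr. The tables are made up, and note that dplyr names the outer join `full_join()`.

```r
library(tibble)
library(dplyr)

# Hypothetical season tables, invented for illustration only.
season_2019 <- tibble(player = c("Smith", "Jones"), hits = c(150, 120))
season_2020 <- tibble(player = c("Smith", "Lee"),   hits = c(60, 58))

# Append: stack the tables and record which one each row came from.
all_seasons <- bind_rows(`2019` = season_2019, `2020` = season_2020, .id = "year")

# Hypothetical lookup table of team assignments.
teams <- tibble(player = c("Smith", "Jones", "Diaz"), team = c("A", "A", "B"))

# Left join: keep every row of all_seasons; Lee has no team, so team is NA.
left <- left_join(all_seasons, teams, by = "player")

# Inner join: keep only players found in both tables.
inner <- inner_join(all_seasons, teams, by = "player")

# Full (outer) join: keep everything from both tables, including Diaz.
full <- full_join(all_seasons, teams, by = "player")

print(left)
print(inner)
print(full)
```

A right join is the mirror image of a left join: `right_join(all_seasons, teams, by = "player")` keeps every row of `teams` instead.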

Lesson 6: Miscellaneous tidying

Objective: Student can manipulate strings, factors and dates

Description:

  • Formatting, replacing and splitting strings
  • Ordering and modifying factors
  • Generating, calculating and resampling dates

Time: 60 minutes

Assets needed: Flight records
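
The three topics in this lesson map roughly onto the stringr, forcats and lubridate packages. Below is a small sketch using made-up values; the carrier names, size labels and date are assumptions, not rows from the flight records.

```r
library(stringr)
library(forcats)
library(lubridate)

# Strings: replace, trim and reformat messy labels (hypothetical values).
carrier <- c(" delta air lines", "United-Airlines ", "jetblue airways")
clean <- str_to_title(str_squish(str_replace_all(carrier, "-", " ")))
first_word <- str_split_fixed(clean, " ", n = 2)[, 1]  # split off the first word

# Factors: put levels in a meaningful order, then rename one.
size <- factor(c("small", "large", "medium", "small"))
size <- fct_relevel(size, "small", "medium", "large")
size <- fct_recode(size, big = "large")

# Dates: parse, do arithmetic, and round down to the month,
# which is one way to resample daily records to monthly ones.
dep <- ymd("2013-01-31")
later <- dep + days(30)
month_start <- floor_date(dep, "month")

print(clean)
print(levels(size))
print(later)
print(month_start)
```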

Lesson 7: Visualizing data

Objective: Student can create graphical depictions of variable relationships

Description:

  • The grammar of graphics
  • Plotting univariate relationships
  • Plotting bivariate relationships
  • Customizing scales, legends & themes

Time: 90 minutes

Assets needed: Baseball records
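
In ggplot2, the grammar of graphics means building a plot in layers: data, aesthetic mappings, geoms, then scales and themes. The sketch below uses invented batting numbers to show one univariate and one bivariate plot.

```r
library(ggplot2)

# Hypothetical batting data, invented for illustration only.
batting <- data.frame(
  team    = c("A", "A", "B", "B", "B"),
  at_bats = c(500, 450, 520, 480, 300),
  hits    = c(150, 120, 170, 130, 70)
)

# Univariate: distribution of a single variable.
p_hist <- ggplot(batting, aes(x = hits)) +
  geom_histogram(bins = 5)

# Bivariate: relationship between two variables, colored by a third,
# with a customized scale, legend title and theme.
p_scatter <- ggplot(batting, aes(x = at_bats, y = hits, color = team)) +
  geom_point() +
  scale_x_continuous(name = "At bats") +
  labs(y = "Hits", color = "Team", title = "Hits vs. at bats (made-up data)") +
  theme_minimal()
```

Calling `print(p_hist)` or `print(p_scatter)` renders the plot. Swapping `geom_point()` for another geom changes the chart type without touching the rest of the specification.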
