Learning guide: Introduction to the Tidyverse, one-day workshop
To an outsider, some R packages sound too cheeky to be very valuable. Take, for example, the tidyverse. What on earth does that groaner of a portmanteau do?
By the end of this workshop, you’ll know that the tidyverse is so called because it’s a collection of packages used together to clean, model and depict data using “tidy” principles.
More than a package
That means the tidyverse is more than a package. It’s even more than a series of packages. It’s a whole “mental model” of how data should work.
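Hadley Wickham and Garrett Grolemund, in R for Data Science, depict the data workflow as a sequence: import, tidy, then a repeating loop of transform, visualize and model, and finally communicate.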
That is, it’s an iterative process that starts with preparing the data by importing it and then “tidying” it. Those first steps are the focus of this learning guide, along with the basics of data visualization.
There are different ways to prepare a dataset for analysis, but the nice thing about using the “tidy” framework is there are no surprises in how to do it. This framework will help you explicitly think through what needs to happen to a dataset for it to be of much use in your analysis.
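As a rough illustration of what “tidying” can look like, the sketch below reshapes a small, made-up table so that each variable gets its own column and each observation its own row. The team names, seasons and win counts are hypothetical and are not taken from the workshop datasets; pivot_longer() comes from the tidyr package.

```r
library(tibble)
library(tidyr)

# Hypothetical "untidy" table: the season variable is hidden in the column names
untidy <- tibble(
  team   = c("Team A", "Team B"),
  `2019` = c(80, 90),
  `2020` = c(30, 40)
)

# Tidy version: one row per team-season, one column per variable
tidy <- pivot_longer(
  untidy,
  cols      = c(`2019`, `2020`),
  names_to  = "season",
  values_to = "wins"
)

print(tidy)
```

Once the table is in this shape, the same grouping, filtering and plotting functions work on it without special handling for each season column.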
After that, the tidyverse provides a suite of tools for continuing your data journey: from the re-shaping, to the manipulating, to the visualizing.
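To give a flavor of the manipulating step, here is a minimal sketch of a dplyr pipeline, assuming a small hypothetical table of team records. The column names and numbers are invented for illustration; the functions used (mutate(), group_by(), summarise(), arrange()) are standard dplyr verbs.

```r
library(tibble)
library(dplyr)

# Hypothetical tidy table: one row per team-season
records <- tibble(
  team   = c("Team A", "Team A", "Team B", "Team B"),
  season = c(2019, 2020, 2019, 2020),
  wins   = c(80, 30, 90, 40),
  games  = c(160, 60, 160, 60)
)

# Chain the steps with the pipe: derive a column, aggregate by team, then sort
summary_tbl <- records %>%
  mutate(win_pct = wins / games) %>%
  group_by(team) %>%
  summarise(mean_win_pct = mean(win_pct), .groups = "drop") %>%
  arrange(desc(mean_win_pct))

print(summary_tbl)
```

Each verb takes a table and returns a table, which is what lets the steps be chained in reading order.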
This one-day workshop focuses on the elements of the tidyverse most commonly used in basic data cleaning, exploratory data analysis and visualization.
Get your copy of the guide below. You are welcome to use this learning guide at your school or workplace, whether to run workshops or in any other way that benefits you.
This learning guide is part of my resource library. For exclusive free access, subscribe to my newsletter below.
If you’re an individual user looking to get acquainted with the tidyverse, I suggest my book Advancing into Analytics: From Excel to Python and R.
Lesson 1: The tidyverse and tidy data
Objective: Student can compare and contrast the tidyverse to the general R environment
Description:
- What is tidy data?
- A tour of the tidy galaxies
- The tidy workflow
Time: 40 minutes
Assets needed: none
Lesson 2: Importing data
Objective: Student can read tabular files into R
Description:
- Introduction to the tibble
- Importing text files
- Importing Excel workbooks
Time: 40 minutes
Assets needed: Baseball records
Lesson 3: Re-shaping data
Objective: Student can transform a dataset to fit tidy principles
Description:
- Pivoting and un-pivoting datasets
- Delimiting columns
Time: 60 minutes
Assets needed: Baseball records
Lesson 4: Manipulating data
Objective: Student can create a data manipulation pipeline
Description:
- Manipulating rows & columns
- Aggregating & summarizing data
- Piping functions
Time: 75 minutes
Assets needed: Baseball records
Lesson 5: Joining and appending data
Objective: Student can combine multiple tables by appending and joining
Description:
- Appending two or more tables
- Joining two tables: left, right, inner, outer
Time: 75 minutes
Assets needed: Baseball records
Lesson 6: Miscellaneous tidying
Objective: Student can manipulate strings, factors and dates
Description:
- Formatting, replacing and splitting strings
- Ordering and modifying factors
- Generating, calculating and resampling dates
Time: 60 minutes
Assets needed: Flight records
Lesson 7: Visualizing data
Objective: Student can create graphical depictions of variable relationships
Description:
- The grammar of graphics
- Plotting univariate distributions
- Plotting bivariate relationships
- Customizing scales, legends & themes
Time: 90 minutes
Assets needed: Baseball records