Going deeper with dplyr: New features in 0.3 and 0.4 (video tutorial)

March 8, 2015

(This article was first published on R - Data School, and kindly contributed to R-bloggers)

In August 2014, I created a 40-minute video tutorial introducing the key functionality of the dplyr package in R. dplyr continues to be my “go-to” package for data exploration and manipulation because of its intuitive syntax, blazing fast performance, and excellent documentation.

I recorded that tutorial using the latest version at the time (0.2), but there have since been two significant updates to dplyr (versions 0.3 and 0.4). Because those updates introduced a ton of new functionality, I thought it was time to create another tutorial!

This new tutorial covers the most useful new features in 0.3 and 0.4, as well as some advanced functionality from previous versions that I didn’t cover last time. (If you have not watched the previous tutorial, I recommend you do so first since it covers some dplyr basics that are not covered in this tutorial.)

Table of contents

This new tutorial runs 37 minutes, but if you only want to watch a particular section, use the timestamps below to jump to that point in the video:

  1. Introduction (starts at 0:00)
  2. Loading dplyr and the nycflights13 dataset (starts at 1:12)
  3. Choosing columns: select, rename (starts at 2:28)
  4. Choosing rows: filter, between, slice, sample_n, top_n, distinct (starts at 5:40)
  5. Adding new variables: mutate, transmute, add_rownames (starts at 12:38)
  6. Grouping and counting: summarise, tally, count, group_size, n_groups, ungroup (starts at 15:20)
  7. Creating data frames: data_frame (starts at 23:01)
  8. Joining (merging) tables: left_join, right_join, inner_join, full_join, semi_join, anti_join (starts at 25:28)
  9. Viewing more output: print, View (starts at 31:29)
  10. Resources (starts at 34:41)
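
For a quick taste of a few of the functions listed above, here is a minimal sketch. It is not taken from the video: the tutorial works with the nycflights13 dataset, whereas the two small data frames below (`flights_toy` and `airlines_toy`) are invented here purely for illustration, and the kilometre conversion factor is an approximation.

```r
library(dplyr)

# Toy data invented for illustration (hypothetical; the tutorial uses nycflights13)
flights_toy <- data.frame(
  carrier   = c("AA", "AA", "UA", "UA", "DL"),
  dep_delay = c(5, 40, -3, 12, 90),
  distance  = c(500, 1200, 800, 300, 2000)
)
airlines_toy <- data.frame(
  carrier = c("AA", "UA"),
  name    = c("American", "United")
)

# between(): keep rows whose dep_delay lies in [0, 60], bounds included
moderate <- flights_toy %>% filter(between(dep_delay, 0, 60))

# slice(): pick rows by position
first_two <- flights_toy %>% slice(1:2)

# transmute(): like mutate(), but keeps only the newly created variables
dist_km <- flights_toy %>% transmute(distance_km = distance * 1.609)

# count(): shorthand for group_by() followed by tally()
per_carrier <- flights_toy %>% count(carrier)

# n_groups(): number of groups in a grouped data frame
n_carriers <- flights_toy %>% group_by(carrier) %>% n_groups()

# left_join() keeps every row of the left table;
# anti_join() keeps left-table rows that have no match in the right table
joined    <- left_join(flights_toy, airlines_toy, by = "carrier")
unmatched <- anti_join(flights_toy, airlines_toy, by = "carrier")
```

Printing any of the intermediate objects (for example `per_carrier` or `unmatched`) shows what each verb returns. The video covers these functions, and the others in the list, in much more detail.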

You can watch the video on YouTube.

You can view the R Markdown document used in the video on RPubs, or you can download the source document from GitHub.

Here are the resources I mention in the video:

My previous tutorial is also available on YouTube.

If you have any questions about dplyr, I’d love to hear them in the comments!

If you’d like to be notified when I release new videos, please subscribe to my YouTube channel. I also blog about a wide array of data science topics, including R, Python, Git, and machine learning, and have an email newsletter if you’d like to hear about that content!

