New Course: Joining Data in R with dplyr

December 1, 2016

(This article was first published on DataCamp Blog, and kindly contributed to R-bloggers)

We just launched Joining Data in R with dplyr, taught by Garrett Grolemund, author of Hands-On Programming with R and co-author of R for Data Science from O’Reilly Media. This course builds on what you learned in Data Manipulation in R with dplyr by showing you how to combine data sets with dplyr’s two-table verbs. In the real world, data comes split across many data sets, but dplyr’s core functions are designed to work with single tables of data. In this course, you’ll learn the best ways to combine data sets into single tables. You’ll learn how to augment one data set with columns from another using mutating joins, how to filter one data set against another using filtering joins, and how to sift through data sets with set operations. Along the way, you’ll discover best practices for building data sets and troubleshooting joins with dplyr. Afterward, you’ll be well on your way to data manipulation mastery!

Joining Data in R with dplyr features 84 interactive exercises that combine high-quality video, in-browser coding, and gamification for an engaging learning experience that will help you become a data manipulation master!

What you’ll learn

The first chapter of this course covers mutating joins and explains the various ways you can join datasets together and what happens when you do. Next, you will learn all about filtering joins and set operations, which combine information from datasets without adding new variables. Filtering joins filter the observations of one dataset based on whether or not they occur in a second dataset. Set operations use combinations of observations from both datasets to create a new dataset. The third chapter will show you how to build datasets from basic elements: vectors, lists, and individual datasets that do not require a join. dplyr contains a set of functions for assembling data that work more intuitively than base R’s functions. The chapter will also look at when dplyr does and does not use data type coercion.
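To make the distinction between these families of verbs concrete, here is a minimal sketch. It is not taken from the course: the `artists`, `bands`, and `extra` tables are hypothetical toy data invented for illustration, and the calls shown (`left_join()`, `semi_join()`, `anti_join()`, `union()`, `bind_rows()`) are only a sample of what dplyr offers.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
artists <- tibble(
  name = c("Ann", "Bo", "Cy"),
  instrument = c("guitar", "bass", "drums")
)
bands <- tibble(
  name = c("Ann", "Bo", "Dee"),
  band = c("Red", "Red", "Blue")
)

# Mutating join: keeps every row of artists and adds the band column
with_band <- left_join(artists, bands, by = "name")

# Filtering joins: keep or drop rows of artists without adding columns
in_a_band <- semi_join(artists, bands, by = "name")
no_band   <- anti_join(artists, bands, by = "name")

# Set operation: compares whole rows and returns each distinct row once
all_names <- union(select(artists, name), select(bands, name))

# Assembling data without a join: stack tables that share columns
extra <- tibble(name = "Eve", instrument = "keys")
more_artists <- bind_rows(artists, extra)
```

In this sketch `with_band` gains a column, while `in_a_band` and `no_band` keep exactly the columns of `artists` and differ only in which rows survive.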

Once you’ve mastered the basics, the fourth chapter dives deeper into the mechanics of joins. This chapter will show you how to spot common join problems, how to join based on multiple or mismatched keys, how to join multiple tables, and how to recreate dplyr’s joins with SQL and base R. The fifth and final chapter concludes the course with a case study that applies what you’ve learned to a real-world application.
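As a rough illustration of joining on mismatched or multiple keys and of combining several tables, consider the sketch below. The tables and column names are hypothetical, and folding a list of tables with base R's `Reduce()` is just one possible approach, not necessarily the one the course teaches.

```r
library(dplyr)

# Hypothetical toy tables, invented for illustration only
players <- tibble(
  first = c("Ann", "Bo"),
  last = c("Lee", "Ray"),
  team_id = c(1, 2)
)
teams <- tibble(id = c(1, 2), team = c("Red", "Blue"))
scores <- tibble(
  first = c("Ann", "Bo"),
  last = c("Lee", "Ray"),
  points = c(10, 7)
)
ages <- tibble(
  first = c("Ann", "Bo"),
  last = c("Lee", "Ray"),
  age = c(31, 28)
)

# Mismatched key names: a named vector maps players$team_id to teams$id
with_team <- left_join(players, teams, by = c("team_id" = "id"))

# Multiple keys: a row matches only when both first and last agree
with_points <- left_join(with_team, scores, by = c("first", "last"))

# Several tables sharing the same keys, folded into one table pairwise
combined <- Reduce(
  function(x, y) left_join(x, y, by = c("first", "last")),
  list(players, scores, ages)
)
```

Spelling out `by` explicitly, rather than relying on dplyr's default of matching every shared column name, makes it easier to see which keys a join actually uses when something goes wrong.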

