DataCamp just launched its latest interactive course: dplyr. This new course was developed in close collaboration with Garrett Grolemund, RStudio’s master instructor. The course challenges you one step at a time to master the essentials of transforming data sets quickly and intuitively with the dplyr package. Start the course here.
The dplyr package is an exciting new chapter in the mission to bring painless data manipulation to the crowd. It is an R package that gives you a fast and intuitive way to transform data sets.
dplyr is the successor to plyr and is mainly authored by Hadley Wickham and Romain Francois. It is designed to be intuitive and easy to learn, thereby making “doing things” in R more user-friendly.
It introduces five key functions to straightforwardly manipulate data: select, filter, arrange, mutate, and summarize. Thanks to optimization in C++, these functions allow you to work extremely fast with larger data sets. These ‘dplyr verbs’ can be understood as the atoms that combine into powerful molecular operations, which can handle around 90% of data manipulation tasks. As such, dplyr lets you, as a data scientist, accomplish more things, with more data, in less time. However, dplyr isn’t limited to these five functions: it also enables automated groupwise operations in R, provides a standard syntax for accessing and manipulating database data with R, and much more. All of this is covered and explained in this DataCamp course (check out the contents of the course).
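To give a feel for how these verbs fit together, here is a minimal sketch using select, filter, arrange, mutate, and summarize in turn. It is not taken from the course: the built-in mtcars data set, the weight threshold, and the intermediate variable names are all assumptions chosen for illustration.

```r
library(dplyr)

# mtcars ships with base R; it is only a stand-in data set here.
cars <- mtcars

# select: keep three columns
small <- select(cars, mpg, cyl, wt)

# filter: keep rows where weight exceeds 3 (in units of 1000 lbs)
heavy <- filter(small, wt > 3)

# arrange: sort by fuel economy, highest first
ordered <- arrange(heavy, desc(mpg))

# mutate: add a derived column
ratio <- mutate(ordered, mpg_per_wt = mpg / wt)

# summarize: collapse everything to a single row of statistics
result <- summarize(ratio, mean_mpg = mean(mpg), n = n())

print(result)
```

Each verb takes a data frame as its first argument and returns a new data frame, which is what makes them easy to combine.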
To help you fully grasp the power and ease of use of dplyr, DataCamp has developed a brand new interactive course together with Garrett Grolemund. Garrett is a Data Scientist and Master Instructor at RStudio, holds a Ph.D. in Statistics, and specializes in teaching. He is the author of Hands-On Programming with R, as well as Data Science with R, an upcoming book from O’Reilly Media. He has taught people how to use R at over 50 government agencies, small businesses, and multi-billion dollar global companies.
In the course, you will learn how to use dplyr to perform basic data manipulation tasks using the five dplyr verbs, and how to combine them to solve challenging problems. You’ll also learn about groupwise operations using group_by(), about the pipe operator to chain your operations, and about the tbl structure, which provides a cleaner layout so you can better understand your data. Finally, you will learn how to use the dplyr syntax to access data stored in a database outside R.
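As a rough illustration of how group_by(), the pipe operator, and the tbl structure work together, consider the sketch below. It is an assumption-laden example rather than course material: it uses the built-in mtcars data set, and it creates the tbl with as_tibble(), which current dplyr versions re-export (older versions used tbl_df() for the same purpose).

```r
library(dplyr)

# Convert the data frame to a tbl; printing it shows only the rows and
# columns that fit on screen, along with each column's type.
cars_tbl <- as_tibble(mtcars)

# Chain the operations with the pipe: group by number of cylinders,
# then compute one summary row per group.
by_cyl <- cars_tbl %>%
  group_by(cyl) %>%
  summarize(mean_mpg = mean(mpg), n = n()) %>%
  arrange(cyl)

print(by_cyl)
```

The pipe passes the result on its left as the first argument of the function on its right, so the chain reads top to bottom in the order the steps are applied.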
The course is set up in DataCamp’s interactive learning platform, which aims to enhance your learning experience by allowing you to learn by doing. The course consists of 10 sections distributed over five chapters, and each section has an instructional video by Garrett, followed by a large set of interactive exercises. As such, the concepts introduced during the video lecture are directly tested through challenging assignments with tailored feedback to consolidate your knowledge step by step. You will effectively learn hands-on instead of losing time with suboptimal solutions like a four-hour screencast or webinar.
This is the first course of the RStudio track on DataCamp, which will cover some of the company’s flagship products: dplyr, ggvis, rmarkdown, and the RStudio IDE. The other courses are scheduled to launch later this year.
So, if you want to learn how to use the powerful dplyr package to solve challenging data analysis problems, head over to DataCamp and start right away!