R Workshops Updated to Include the Latest Packages

August 27, 2014

(This article was first published on r4stats.com » R, and kindly contributed to R-bloggers)

Two new R packages are quickly becoming standards in the R community: Hadley Wickham’s dplyr and tidyr. The dplyr package almost completely replaces his popular plyr package for data manipulation. Most importantly for general R use, it makes it much easier to select variables.

R workshop series presented at a major pharmaceutical company. Photography by Stephen Bernard.

For example, if your data included variables for race, gender, pretest, posttest, and four survey items q1 through q4, you could select various sets of variables using:

library("dplyr")
select(mydata, race, gender)        # Just those two variables.
select(mydata, gender:posttest)     # From gender through posttest.
select(mydata, contains("test"))    # Gets pretest & posttest.
select(mydata, starts_with("q"))    # Gets all vars starting with "q".
select(mydata, ends_with("test"))   # All vars ending with "test".
select(mydata, num_range("q", 1:4)) # q1 through q4 regardless of location.
select(mydata, matches("^q"))       # Vars whose names match a regular expression.

As I show in my books, these were all possible in R before, but they required much more programming.
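
To give a rough sense of that extra programming, here is a minimal base R sketch of how a few of the same selections might be written without dplyr. The toy data frame and its values are invented for illustration; only the variable names come from the example above. Note one difference: these base R patterns are case-sensitive, while dplyr's helpers ignore case by default.

```r
# Hypothetical toy data; the values are made up, only the names match the text.
mydata <- data.frame(
  race     = c("A", "B", "A"),
  gender   = c("f", "m", "f"),
  pretest  = c(70, 65, 80),
  posttest = c(75, 72, 85),
  q1 = c(1, 2, 3),
  q2 = c(4, 5, 1),
  q3 = c(2, 2, 3),
  q4 = c(5, 4, 4)
)

vars <- names(mydata)

# Just those two variables.
two_vars <- mydata[c("race", "gender")]

# From gender through posttest: look up both positions, then build the range.
thru_post <- mydata[which(vars == "gender"):which(vars == "posttest")]

# Names containing "test".
has_test <- mydata[grep("test", vars)]

# Names starting with "q".
starts_q <- mydata[grep("^q", vars)]

# Names ending with "test".
ends_test <- mydata[grep("test$", vars)]

# q1 through q4 regardless of location: build the names explicitly.
q1_to_q4 <- mydata[paste0("q", 1:4)]
```

Each line works, but the intent is buried in indexing and pattern syntax, which is the gap the dplyr helper functions fill.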

The tidyr package replaces Hadley’s popular reshape and reshape2 packages with a data reshaping approach that is simpler and more focused just on the reshaping process, especially converting from “wide” to “long” form and back.
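
As a small sketch of that wide-to-long round trip, the example below uses tidyr's gather() and spread() functions on a made-up data frame. The data, the id column, and the names item and score are assumptions chosen for illustration. Newer tidyr releases also offer pivot_longer() and pivot_wider() for the same job.

```r
library("tidyr")

# Hypothetical wide data: one row per subject, one column per survey item.
wide <- data.frame(
  id = 1:3,
  q1 = c(1, 2, 3),
  q2 = c(4, 5, 1)
)

# Wide to long: stack q1 and q2 into item/score pairs, one row per answer.
long <- gather(wide, item, score, q1, q2)

# Long back to wide: one column per survey item again.
wide_again <- spread(long, item, score)
```

Here long has one row per subject and item, and wide_again returns to one row per subject.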

I’ve integrated dplyr into my workshop R for SAS, SPSS and Stata Users, and both tidyr and dplyr now play extensive roles in my Managing Data with R workshop. I’m presenting the next Virtual Instructor-led Classroom (webinar) version of those workshops in partnership with Revolution Analytics during the week of October 6, 2014. I’m also available to teach them at your organization’s site in partnership with RStudio.com (contact me at [email protected] to schedule a visit). These workshops will also soon be available 24/7 at Datacamp.com. “You’ll be able to take Bob’s popular workshops using an interactive combination of video and live exercises in the comfort of your own browser,” said Jonathan Cornelissen, CEO of Datacamp.com.

