The split-apply-combine paradigm in R

February 25, 2011

(This article was first published on Stat Bandit » R, and kindly contributed to R-bloggers)

Last night at the DC R Users meetup, our largest to date, I gave an introductory presentation on data munging and spent some time on the split-apply-combine paradigm, which I use almost daily in my work. I talked mainly about the plyr and doBy packages, both of which I now use a lot. David Smith posted a link on the Revolution blog to an article by Steve Miller on the virtues of the data.table package for “by-group processing”. It got me thinking about changing my workflow yet again to bring this package in. I also noticed that Hadley Wickham tweeted that he wants to make plyr faster in the near future, which will of course be a very welcome development.
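
For readers new to the idea, split-apply-combine means breaking a data set into groups, computing something on each group, and stitching the results back together. What follows is a minimal sketch in base R, not code from the presentation; the built-in mtcars data, the grouping by cyl, and the summary columns n and mean_mpg are illustrative assumptions.

```r
# Split-apply-combine using base R only, on the built-in mtcars data.

# Split: one data frame per distinct number of cylinders.
pieces <- split(mtcars, mtcars$cyl)

# Apply: reduce each piece to a one-row summary.
summaries <- lapply(pieces, function(d) {
  data.frame(cyl = d$cyl[1], n = nrow(d), mean_mpg = mean(d$mpg))
})

# Combine: stack the per-group summaries into a single data frame.
result <- do.call(rbind, summaries)
print(result)
```

Packages such as plyr wrap these three steps in a single call; assuming plyr is installed, something like ddply(mtcars, "cyl", summarise, n = length(mpg), mean_mpg = mean(mpg)) should produce an equivalent table.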
