More Snowdoop Coming

December 16, 2014

(This article was first published on Mad (Data) Scientist, and kindly contributed to R-bloggers)

In spite of the banter between Yihui and me, I’m glad to hear that he may be interested in Snowdoop, as are some others.  I’m quite busy this week (finishing writing my Parallel Computation for Data Science book, and still have a lot of Fall Quarter grading to do 🙂 ), but you’ll definitely be hearing more from me on Snowdoop and partools, including the following:

  • Vignettes for Snowdoop and the debugging tool.
  • Code snippets for splitting and coalescing files, including dealing with header records.
  • Code snippet implementing a distributed version of subset().

And yes, I’ll likely break down and put it on Github. 🙂  [I’m not old-fashioned, just nuisance-averse. 🙂 ] Watch this space for news, next installment maybe 3-4 days from now.

To leave a comment for the author, please follow the link and comment on their blog: Mad (Data) Scientist. offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)