New Release of partools Package

July 17, 2016

(This article was first published on Mad (Data) Scientist, and kindly contributed to R-bloggers)

My new release of partools is now on CRAN.

The package is aimed at doing parallel data science in what I call an “un-MapReduce” manner. It takes the point of view that MapReduce-based frameworks such as Hadoop and Spark are fine for the types of applications their designers had in mind, namely rather simple SQL actions. However, they have fundamental handicaps that prevent them from performing well on many, if not most, of the types of computation that typical users need for large data sets and/or highly compute-bound applications. partools retains the distributed file/object nature of those MapReduce systems but avoids the confining MapReduce computational paradigm.

The package now contains about 30 functions, ranging from infrastructure support to summary and aggregation to statistical/machine learning applications. See the vignette for a fairly detailed introduction. Two new capabilities that I wish to highlight are:

  • Aggregation and related operations on objects of class “data.table”.
  • Parallel computation for some modern statistical/machine learning algorithms (they are statistics to me, but you may call them machine learning if you prefer).
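
To give a feel for the first item, here is a minimal sketch of aggregation over data that stays split into chunks. It is written in Python rather than R, and it is not partools code: the function names (`partial_aggregate`, `combine`, `distributed_group_means`) are hypothetical, and the thread pool merely stands in for a cluster of worker processes. The assumption is that each worker reduces its own chunk to per-group sums and counts, and only those small summaries are combined.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partial_aggregate(chunk):
    """Reduce one chunk of (group, value) rows to per-group [sum, count]."""
    acc = defaultdict(lambda: [0.0, 0])
    for group, value in chunk:
        acc[group][0] += value
        acc[group][1] += 1
    return dict(acc)


def combine(partials):
    """Merge the per-chunk summaries and return the mean of each group."""
    total = defaultdict(lambda: [0.0, 0])
    for part in partials:
        for group, (s, n) in part.items():
            total[group][0] += s
            total[group][1] += n
    return {group: s / n for group, (s, n) in total.items()}


def distributed_group_means(chunks, workers=2):
    """Aggregate each chunk independently, then combine the summaries.

    Hypothetical helper: the thread pool is only a stand-in for real
    worker processes, each of which would hold its own chunk.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_aggregate, chunks))
    return combine(partials)
```

For example, `distributed_group_means([[("a", 1.0), ("b", 2.0)], [("a", 3.0), ("b", 4.0), ("b", 6.0)]])` gives a mean of 2.0 for group `a` and 4.0 for group `b`. Carrying sums and counts, rather than per-chunk means, keeps the combined result exact even when chunks hold different numbers of rows per group.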

The core of that second highlighted set of functions makes use of what I call Software Alchemy, which I have explained in previous blog posts. See for instance the example on random forests in the vignette.
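
For readers new to the idea, Software Alchemy amounts to chunk averaging: split the data into chunks, fit the same estimator on each chunk independently, and average the resulting estimates. The sketch below illustrates this with a simple least-squares line. It is in Python rather than R, uses hypothetical names (`fit_line`, `chunk_average_fit`) that do not come from partools, and uses a thread pool only as a stand-in for separate worker processes.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import fmean


def fit_line(chunk):
    """Least-squares (intercept, slope) for one chunk of (x, y) rows."""
    xs = [x for x, _ in chunk]
    ys = [y for _, y in chunk]
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in chunk)
    slope = sxy / sxx
    return (y_bar - slope * x_bar, slope)


def chunk_average_fit(rows, n_chunks=4):
    """Fit each chunk independently, then average the coefficients.

    Assumes every chunk ends up with at least two distinct x values;
    the interleaved split below is one simple way to form the chunks.
    """
    chunks = [rows[i::n_chunks] for i in range(n_chunks)]
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        fits = list(pool.map(fit_line, chunks))
    # zip(*fits) groups intercepts together and slopes together.
    return tuple(fmean(coefs) for coefs in zip(*fits))
```

Calling `chunk_average_fit(rows)` on a list of `(x, y)` pairs returns an averaged `(intercept, slope)`. Each chunk's fit needs no communication with the others, which is what makes the approach easy to parallelize.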

Happy Paralleling. :-)
