MRAN’s Packages Spotlight

July 30, 2015
By

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

by Joseph Rickert

New R packages just keep coming. The following plot, constructed with information from the monthly files on Dirk Eddelbuettel's CRANberries site, shows a plot of the number of new packages released to CRAN between January 1, 2013 and July 27, 2015 by month (not quite 31 months).

New_R_Packages

This is amazing growth! The mean rate is about 125 new packages a month. How can anyone keep up? The direct approach, of course, would be to become an avid, frequent reader of CRANberries. Every day the CRAN:New link presents the relentless roll call of new arrivals. However, dealing with this extreme level of tediousness is not for everyone.

At MRAN we are attempting to provide some help with the problem of keeping up with what's new through the old fashioned (pre-machine learning) practice of making some idiosyncratic, but not completely capricious, human generated recommendations. With every new release of RRO we publish on the Package Spotlight page brief descriptions of packages in three categories: New Packages, Updated Packages and GitHub packages. None of these lists are intended to be either comprehensive or complete in any sense.

The New Packages list includes new packages that have been released to CRAN since the previous release of RRO. My general rules for selecting packages for this list are: (1) that they should either be tools or infrastructure packages that may prove to be useful to a wide audience or (2) they should involve a new algorithm or statistical technique that I think will be of interest to statisticians and data scientists working in many different areas. The following two packages respectively illustrate these two selection rules:

metricsgraphics V0.8.5: provides an htmlwidgets widgets interface to the MetricsGraphics.js D3 JavaScript library for plotting time series data. The vignette shows what it can do

rotationForest V0.1: provides an implementation of the new Rotation Forest binary ensemble classifier described in the paper by Rodriguez et. al

I also tend to favor packages that are backed by a vignette, paper or url that provides additional explanatory material.

Of course, any scheme like this is limited by the knowledge and biases of the curator. I am particularly worried about missing packages targeted towards biotech applications that may indeed have broader appeal. The way to mitigate the shortcomings of this approach is to involve more people. So if you come across a new package that you think may have broad appeal send us a note and let us know why ([email protected]).

The Updated Package list is constructed with the single criterion that the fact that the package was updated should convey news of some sort. Most of the very popular and useful packages are updated frequently, some approaching monthly updates. So, even though they are important packages the fact that they have been updated is generally no news at all. It is also the case that package authors generally do not put much effort in to describing the updates. In my experience poking around CRAN I have found that the NEWS directories for packages go mostly unused. (An exemplary exception is the NEWS for ggplot2.)

Finally, the GitHub list is mostly built from repositories that are trending on GitHub with a few serendipitous finds included.

We would be very interested in learning how you keep up with new R packages. Please leave us a comment.

Post Script:

The code for generating the plot may be found here: Download New_packages

Also, we have written quite a few posts over the last year or so about the difficulties of searching for relevant packages on CRAN. Here are links to three recent posts:

How many packages are there really on CRAN?
Fishing for packages in CRAN
Working with R Studio CRAN Logs

 

To leave a comment for the author, please follow the link and comment on their blog: Revolutions.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)