Scaling the R ecosystem: Possible Directions for Improving Dependency Versioning

July 2, 2013

(This article was first published on OpenCPU » R-bloggers, and kindly contributed to R-bloggers)

A paper published today in The R Journal discusses a fundamental limitation affecting the reliability and reproducibility of R code. It explains how the lack of dependency versioning causes R-based applications to break down, Sweave documents to stop working, and CRAN to hit scaling problems. The paper suggests several solutions, inspired by other open-source communities, that could make R more reliable and help sustain further growth of the ecosystem. We hope the paper will prompt a constructive discussion in the community about how to manage the distributed development process in a sustainable way.

Abstract: One of the most powerful features of R is its infrastructure for contributed code. The built-in package manager and complementary repositories provide a great system for development and exchange of code, and have played an important role in the growth of the platform towards the de-facto standard in statistical computing that it is today. However, the number of packages on CRAN and other repositories has increased beyond what might have been foreseen, and is revealing some limitations of the current design. One such problem is the general lack of dependency versioning in the infrastructure. This paper explores this problem in greater detail, and suggests approaches taken by other open source communities that might work for R as well. Three use cases are defined that exemplify the issue, and illustrate how improving this aspect of package management could increase reliability while supporting further growth of the R community.

Read the entire article here: http://journal.r-project.org/archive/2013-1/ooms.pdf
