by Joseph Rickert
Saturday morning I was drinking my coffee wondering how much effort goes into R worldwide. (It’s my job.) I noticed that there were 4,469 packages on CRAN, and it occurred to me that tabulating the packages by publication date would give some indication of how much effort is being expended to improve packages and keep them up to date. With very little work at all I was able to read the table on the Available CRAN packages by date of publication page and produce this plot.
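For readers who want to try the same kind of tabulation, here is a minimal sketch in Python. It is not the script used for the plot. It assumes the table has already been read into (publication date, package name) rows with ISO-formatted dates, and the sample rows, package names, and function names are all hypothetical placeholders rather than real CRAN data.

```python
from collections import Counter
from datetime import date

# Hypothetical sample rows in the shape (publication date, package name).
# These are placeholders for illustration, not real CRAN data.
SAMPLE_ROWS = [
    ("2013-03-02", "pkgA"),
    ("2013-02-27", "pkgB"),
    ("2012-11-15", "pkgC"),
    ("2010-06-01", "pkgD"),
    ("2009-01-20", "pkgE"),
]


def packages_per_year(rows):
    """Count packages by the year of their listed publication date."""
    counts = Counter(date.fromisoformat(d).year for d, _name in rows)
    # Return the counts ordered by year, oldest first.
    return dict(sorted(counts.items()))


def untouched_since(rows, year):
    """Count packages whose publication date falls in `year` or earlier."""
    return sum(1 for d, _name in rows if date.fromisoformat(d).year <= year)


if __name__ == "__main__":
    print(packages_per_year(SAMPLE_ROWS))
    print(untouched_since(SAMPLE_ROWS, 2010))
```

Counting by year is only one possible granularity. Grouping by month would work the same way with a (year, month) key.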
Maybe I should not have been, but I was surprised to see that most CRAN packages were either created or updated in the last year or so. Apparently, only 264 packages were last touched in 2010 or earlier. (If you are ever worried about whether an older package is going to work for you, go to the CRAN checks page for the package and look at the notes. For example, the CRAN checks for vioplot, the current longevity record holder, look just fine to me.)
This is astounding! Like most people, I suppose, I tend to use only a small number of packages on a regular basis. I’m clueless about what most of the other packages do, and don’t think much about them. But they are all meaningful to somebody, probably thousands of somebodies, and a tremendous number of hours are being spent to keep them current and improve them. So the next time a colleague refers to R itself as a “statistical package”, find a way to make this point: “Package” sounds so small, “wrapped up”, and done, but R is never done. It is a language that is constantly changing, improving and increasing its expressive power through a global, large-scale engineering effort. The success of R is in no small part due to the mechanism that the R core group implemented to enhance the language through the parallel, asynchronous efforts of the package developers.
A little munging (download packages-post.r) on the CRAN Package Check Results page shows that there are at least 2,596 package maintainers active now. Since many packages have multiple authors, this number is probably far smaller than the number of package developers. Nevertheless, it gives some idea, a lower bound, of the number of people who are actively involved in creating and maintaining R packages. Also, I might be wrong, but I am guessing that people who volunteer to maintain a package are also involved in improving it. So the following plot, which lists the developers who are maintaining at least 10 packages, must be the trace of untold hours of “after hours”, solitary work. I think we all owe a tremendous debt of gratitude to these R superstars and all the other package developers. After all, it’s not their job.
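The maintainer counts can be sketched in the same spirit. This is a Python illustration, not the contents of packages-post.r. It assumes the check results have already been reduced to (package, maintainer) pairs, and the names, addresses, and function names below are hypothetical placeholders.

```python
from collections import Counter

# Hypothetical (package, maintainer) pairs; placeholders, not real CRAN data.
SAMPLE_PAIRS = [
    ("pkgA", "Maintainer One <one@example.org>"),
    ("pkgB", "Maintainer One <one@example.org>"),
    ("pkgC", "Maintainer Two <two@example.org>"),
    ("pkgD", "Maintainer One <one@example.org>"),
    ("pkgE", "Maintainer Three <three@example.org>"),
]


def packages_per_maintainer(pairs):
    """Map each distinct maintainer string to the number of packages listed."""
    return Counter(maintainer.strip() for _pkg, maintainer in pairs)


def prolific_maintainers(pairs, min_packages=10):
    """Return (maintainer, count) pairs with at least `min_packages` packages,
    ordered from most to fewest packages."""
    counts = packages_per_maintainer(pairs)
    return [(m, n) for m, n in counts.most_common() if n >= min_packages]


if __name__ == "__main__":
    # Number of distinct maintainers in the sample.
    print(len(packages_per_maintainer(SAMPLE_PAIRS)))
    # A lower threshold than 10, only so the tiny sample returns something.
    print(prolific_maintainers(SAMPLE_PAIRS, min_packages=2))
```

Matching on the raw maintainer string is a simplification. The same person listed under two spellings or two email addresses would be counted twice, so a real count would likely need some normalization first.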