The most prolific package maintainers on CRAN
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
During a discussion with some other members of the R Consortium, the question came up: who maintains the most packages on CRAN? DataCamp maintains a list of the most active maintainers by downloads, but in this case we were interested in the total number of packages by maintainer. Fortunately, this is pretty easy to figure out thanks to the CRAN repository tools now included in R, and a little dplyr (see the code below) gives the answer quickly[*].
And the answer? The most prolific maintainer is Scott Chamberlain from rOpenSci, who is currently the maintainer of 77 packages. Here's a list of the top 20:
   Maint                  n
1  Scott Chamberlain     77
2  Dirk Eddelbuettel     53
3  Jeroen Ooms           40
4  Hadley Wickham        39
5  Gábor Csárdi          37
6  ORPHANED              37
7  Thomas J. Leeper      29
8  Bob Rudis             28
9  Henrik Bengtsson      28
10 Kurt Hornik           28
11 Oliver Keyes          28
12 Martin Maechler       27
13 Richard Cotton        27
14 Robin K. S. Hankin    24
15 Simon Urbanek         24
16 Kirill Müller         23
17 Torsten Hothorn       23
18 Achim Zeileis         22
19 Paul Gilbert          21
20 Yihui Xie             21
(That list of orphaned packages with no current maintainer includes XML, d3heatmap, and flexclust, to name just 3 of the 37.) Here's the R code used to calculate the top 20:
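The original listing is not reproduced in this text, so what follows is only a minimal sketch of one way to do the calculation, not necessarily the exact code behind the table. It assumes dplyr is installed and that your version of R includes tools::CRAN_package_db(). The helper names clean_maintainer() and top_maintainers() are hypothetical, introduced here for illustration, and the regular expressions are one plausible way to handle the quoted-name problem rather than the one the post refers to.

```r
library(tools)   # provides CRAN_package_db()
library(dplyr)

# Reduce a Maintainer field such as '"Jane Doe" <jane@example.org>'
# to a bare name (hypothetical helper, not from the original post).
clean_maintainer <- function(x) {
  x <- sub("\\s*<[^>]*>\\s*$", "", x)    # drop the trailing <email> part
  x <- trimws(x)
  x <- gsub("^[\"']+|[\"']+$", "", x)    # drop surrounding quotes only
  trimws(x)
}

# Count packages per cleaned maintainer name and keep the k largest.
top_maintainers <- function(pdb, k = 20) {
  # Keep just the two columns needed; this also sidesteps any
  # duplicated column names in the full CRAN metadata table.
  pdb <- pdb[, c("Package", "Maintainer")]
  pdb %>%
    distinct(Package, .keep_all = TRUE) %>%   # one row per package
    mutate(Maint = clean_maintainer(Maintainer)) %>%
    count(Maint, sort = TRUE) %>%
    head(k)
}

# Fetching the live metadata needs network access, so it only runs
# in an interactive session here.
if (interactive()) {
  pdb <- CRAN_package_db()
  print(top_maintainers(pdb))
}
```

Because CRAN changes daily, running this now will not reproduce the counts in the table above. Packages without a maintainer should show up under the name ORPHANED, since that value has no email part or quotes for the cleaning step to strip.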
[*] Well, it would have been quick, until I noticed that some maintainers had two forms of their name in the database, one with surrounding quotes and one without. It seemed like it was going to be trivial to fix with a regular expression, but it took me longer than I hoped to come up with the final regexp on line 4 above, which is now barely distinguishable from line noise. As usual, there's an xkcd for this situation: