How R Grows – not so fast
I have had some work on CRAN stats on the back burner, but the recent article How R Grows tempted me to push it up the list.
In the interim, I have a couple of comments on Joseph Rickert's article. Although the body of the article refers to packages either created or updated in a time period, the actual graph, headlined “Packages submitted by Year”, shows a dramatic increase in numbers during 2012, with 2013 apparently poised for a further 50% increase. There are a couple of problems here. Firstly, he uses the date of the latest revision as the date of submission. A package might have been revised 12 times in 2011 and only once in 2012, but it would show up only in the latter year's data. From my analysis, which thankfully looks very similar for this graph, I can reproduce the image but with a different title.
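The 2012 figure will fall as the year progresses, because any package last revised in 2012 moves into the 2013 count as soon as it is updated again.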
Although I came across a couple of glitches, the archive files for each package give a date of first publication. Here are the data on initial releases by year.
This shows a smoother upward trend. With a third of the year gone, it looks as though the number of new R packages will be broadly similar to last year's.
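To make the difference between the two counting methods concrete, here is a minimal Python sketch. The package names and release dates are invented for illustration and are not real CRAN data; `count_by_year` is a hypothetical helper, not anything from the original analysis.

```python
from collections import Counter
from datetime import date

# Hypothetical release histories: package name -> dates of every release.
# Names and dates are invented for illustration, not taken from CRAN.
releases = {
    "pkgA": [date(2010, 3, 1), date(2011, 6, 1), date(2012, 2, 1)],
    "pkgB": [date(2011, 1, 15), date(2011, 9, 1)],
    "pkgC": [date(2012, 5, 20)],
    "pkgD": [date(2009, 7, 4), date(2013, 1, 10)],
}


def count_by_year(releases, pick):
    """Count packages per year, using pick (min or max) to choose one date per package."""
    counts = Counter(pick(dates).year for dates in releases.values())
    return dict(sorted(counts.items()))


# Counting by latest revision: each package lands in its most recent year.
by_latest = count_by_year(releases, max)

# Counting by first publication: each package lands in the year it first appeared.
by_first = count_by_year(releases, min)

# If pkgA is revised again in 2013, it leaves the 2012 bucket entirely.
updated = dict(releases, pkgA=releases["pkgA"] + [date(2013, 4, 1)])
by_latest_later = count_by_year(updated, max)

print("by latest revision:", by_latest)
print("by first release:  ", by_first)
print("after a new update:", by_latest_later)
```

In this toy data the latest-revision count piles packages into recent years, while the first-release count spreads them over the years in which they actually appeared. The last line shows why a latest-revision total for any year can only shrink as time passes.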
Finally, here is a boxplot of the number of revisions a package has undergone by year.
Unsurprisingly, the distribution is not normal and has lots of outliers. The Matrix package leads the way with 166 revisions since its introduction in 2000, and there is a general tendency for the earlier packages to average more updates.
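The kind of summary a boxplot draws can be sketched in a few lines of Python. The records below are invented toy values, not the real CRAN figures, and `summarise` is a hypothetical helper. The quartiles use the default method of Python's `statistics.quantiles`, which is an assumption here and may not match the hinges that R's boxplot uses.

```python
from collections import defaultdict
from statistics import median, quantiles

# Invented toy records: (package, year of first release, number of revisions).
records = [
    ("a", 2000, 2), ("b", 2000, 3), ("c", 2000, 4), ("d", 2000, 4),
    ("e", 2000, 5), ("f", 2000, 6), ("g", 2000, 7), ("h", 2000, 40),
    ("i", 2010, 1), ("j", 2010, 1), ("k", 2010, 2), ("l", 2010, 2),
    ("m", 2010, 3), ("n", 2010, 3), ("o", 2010, 4), ("p", 2010, 5),
]


def summarise(records):
    """Group revision counts by first-release year and flag boxplot-style outliers."""
    by_year = defaultdict(list)
    for _name, year, revisions in records:
        by_year[year].append(revisions)

    summary = {}
    for year, counts in sorted(by_year.items()):
        q1, _q2, q3 = quantiles(counts, n=4)  # quartiles, default method
        iqr = q3 - q1
        low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # Tukey's fences
        summary[year] = {
            "median": median(counts),
            "outliers": sorted(c for c in counts if c < low or c > high),
        }
    return summary


summary = summarise(records)
for year, stats in summary.items():
    print(year, stats)
```

A heavily revised package such as Matrix would appear in the `outliers` list for its year of introduction, which is the pattern the boxplot shows.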