In the last two posts we saw how to download posts from R-bloggers, extract the title, author and date of each post, and write that information to a CSV file. Since we now have a nice data set from R-bloggers, we can start to examine how the site has developed over its lifetime. In this post I will look at the following patterns in the data:
- The rate of monthly posts submitted to r-bloggers
- The distribution of posts and contributors
- The top contributors in total and tabulated by year
The graph below shows the monthly count of posts submitted to r-bloggers.com:
As you can see, R-bloggers.com has experienced tremendous growth in posts. The first years, from 2005 to the end of 2008, were fairly consistent, with an average posting rate of 6 posts per month. In 2009 we see the beginning of a dramatic rise in submitted posts, which peaks in March 2011 with 266 posts that month. To see whether this is a function of a few very active bloggers, or whether we also see a similar increase in contributors, the graph below plots the number of unique contributors for every month:
Here we see that the monthly number of contributors closely follows the monthly number of posts; therefore the rise in posts is not exclusively the result of a few extremely active bloggers. However, as the figure below shows, most authors contribute a fairly small number of posts:
The distribution is extremely skewed, with a median of 6 posts per author and a few authors contributing 200 or more posts.
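For readers who want to reproduce these monthly series and the per-author distribution, here is a minimal base-R sketch. It is not the script behind the figures above: the toy `posts` data frame stands in for the scraped CSV, and its column names (`title`, `author`, `date`) are assumptions based on the fields described in the earlier posts.

```r
# Toy stand-in for the scraped CSV; the column names are assumptions.
posts <- data.frame(
  title  = paste("Post", 1:8),
  author = c("A", "A", "B", "A", "C", "B", "A", "C"),
  date   = as.Date(c("2010-01-05", "2010-01-20", "2010-01-25",
                     "2010-02-03", "2010-02-10", "2010-02-15",
                     "2010-03-01", "2010-03-09")),
  stringsAsFactors = FALSE
)

# Bucket each post into its calendar month (e.g. "2010-01").
posts$month <- format(posts$date, "%Y-%m")

# Number of posts per month.
monthly_posts <- table(posts$month)

# Number of unique contributors per month.
monthly_authors <- tapply(posts$author, posts$month,
                          function(a) length(unique(a)))

# Posts per author, and the median of that distribution.
posts_per_author <- table(posts$author)
median_posts <- median(as.vector(posts_per_author))
```

With real data, `plot(as.vector(monthly_posts), type = "l")` and `hist(as.vector(posts_per_author))` would give rough versions of the graphs discussed here.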
The overall top ten contributors to R-bloggers.com are:
|Thinking inside the box||217|
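A ranking like this can be obtained by counting posts per author and sorting. The following is only a sketch on made-up data, again assuming an `author` column; the author labels are placeholders, not real contributors.

```r
# Hypothetical author column; in practice this would come from the CSV.
posts <- data.frame(
  author = c("A", "B", "A", "C", "A", "B"),
  stringsAsFactors = FALSE
)

# Count posts per author and sort from most to least prolific.
author_counts <- sort(table(posts$author), decreasing = TRUE)

# Keep at most the ten most active authors.
top_ten <- head(author_counts, 10)
```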
Breaking this down by year, we can see that from 2009 there is a rise of some very active R bloggers:
|Vincent Zoonekynd's Blog||3|
|Rob J Hyndman||4|
|Rob J Hyndman||8|
|Thinking inside the box||63|
|John Myles White||28|
|Thinking inside the box||85|
|Thinking inside the box||66|
|BMS Add-ons » BMS Blog||58|
From 2009 a number of authors appear among the top contributors in every year, and of course in 2010 David Smith and Xi’an appear, both with a massive output.
I see R-bloggers as one of the great services in the R community, and the presence of very knowledgeable and prolific contributors is a public good that we can all enjoy. So let's hope the current trend continues into the new year!
As always, the full R script to reproduce the above analysis is here:
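The per-year tabulation, and the check for authors who recur among the top contributors each year, could be sketched as follows. The data and author labels are invented for illustration, the `author` and `date` columns are assumed as before, and the cut-off of five authors per year is an arbitrary choice.

```r
# Hypothetical posts spread over three years.
posts <- data.frame(
  author = c("A", "A", "B", "B", "B", "B", "A", "C", "C", "A"),
  date   = as.Date(c("2009-03-01", "2009-06-01", "2009-09-01",
                     "2010-01-01", "2010-02-01", "2010-03-01",
                     "2010-04-01", "2011-01-01", "2011-02-01",
                     "2011-03-01")),
  stringsAsFactors = FALSE
)

# Extract the year of each post.
posts$year <- format(posts$date, "%Y")

# For each year, count posts per author and keep the most active ones.
top_by_year <- lapply(
  split(posts$author, posts$year),
  function(a) head(sort(table(a), decreasing = TRUE), 5)
)

# Authors who show up in the top list of every year.
recurring <- Reduce(intersect, lapply(top_by_year, names))
```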