Analyzing R-bloggers
In the last two posts we saw how to download posts from R-bloggers, extract the title, author and date of each post, and write that information to a csv file. Now that we have a nice data set from R-bloggers, we can start to examine how the site has developed over its lifetime. In this post I will look at the following patterns in the data:
- The rate of monthly posts submitted to R-bloggers
- The distribution of posts and contributors
- The top contributors in total and tabulated by year
The graph below shows the monthly count of posts submitted to R-bloggers.com:
As you can see, R-bloggers.com has experienced tremendous growth in posts. The first years, from 2005 to the end of 2008, were fairly consistent, with an average posting rate of 6 posts per month. In 2009 we see the beginning of a dramatic rise in submitted posts, which peaks in March 2011 with 266 posts that month. To see whether this is driven by a few very active bloggers, or whether the number of contributors increased in a similar way, the graph below plots the number of unique contributors for every month:
Here we see that the monthly number of contributors closely follows the monthly number of posts; therefore the rise in posts is not exclusively the result of a few extremely active bloggers. However, as the figure below shows, most authors contribute a fairly small number of posts:
The distribution is extremely skewed with a median of 6 posts, and a few authors contributing 200 or more posts.
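The monthly counts and the per-author distribution discussed above can be sketched in a few lines. The R script behind this post is not reproduced here, so what follows is only a minimal Python sketch under stated assumptions: the column names `title`, `author` and `date`, the ISO date format, and the sample rows are all hypothetical stand-ins for the csv file produced in the earlier posts.

```python
import csv
import io
from collections import Counter, defaultdict
from datetime import datetime
from statistics import median

# Hypothetical sample in the assumed csv layout (title, author, date).
SAMPLE_CSV = """title,author,date
Post A,alice,2010-01-05
Post B,bob,2010-01-20
Post C,alice,2010-02-03
Post D,alice,2010-02-14
Post E,carol,2010-02-27
"""


def load_posts(handle):
    """Read rows and parse the date column (assumed format: YYYY-MM-DD)."""
    posts = []
    for row in csv.DictReader(handle):
        row["date"] = datetime.strptime(row["date"], "%Y-%m-%d")
        posts.append(row)
    return posts


def monthly_summary(posts):
    """Return {(year, month): (post_count, unique_author_count)}."""
    counts = Counter()
    authors = defaultdict(set)
    for post in posts:
        key = (post["date"].year, post["date"].month)
        counts[key] += 1
        authors[key].add(post["author"])
    return {key: (counts[key], len(authors[key])) for key in sorted(counts)}


def posts_per_author(posts):
    """Count how many posts each author contributed in total."""
    return Counter(post["author"] for post in posts)


posts = load_posts(io.StringIO(SAMPLE_CSV))
summary = monthly_summary(posts)
per_author = posts_per_author(posts)
# With a skewed distribution the median is more informative than the mean.
median_posts = median(per_author.values())
```

To run this on a real file, replace `io.StringIO(SAMPLE_CSV)` with an open file handle, and adjust the column names and date format to match the data.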
The overall top ten contributors to R-bloggers.com are:
author | count |
---|---|
David Smith | 647 |
xi’an | 293 |
Thinking inside the box | 217 |
Tal Galili | 124 |
klr | 104 |
Stephen Turner | 102 |
dirk.eddelbuettel | 94 |
Ralph | 82 |
romain francois | 79 |
C | 77 |
Breaking this down by year, we can see that some very active R bloggers emerge from 2009 onward:

2005

author | count |
---|---|
Hadley Wickham | 3 |
fernandohrosa | 2 |
2006

author | count |
---|---|
seth | 6 |
Hadley Wickham | 5 |
dataninja | 5 |
Di Cook | 3 |
Vincent Zoonekynd's Blog | 3 |
fernandohrosa | 2 |
Andrew Gelman | 1 |
2007

author | count |
---|---|
Mario Pineda-Krch | 20 |
Forester | 14 |
Egon Willighagen | 5 |
Andrew Gelman | 4 |
Rob J Hyndman | 4 |
dataninja | 4 |
Hadley Wickham | 3 |
John Johnson | 2 |
dan | 2 |
seth | 2 |
2008

author | count |
---|---|
Yu-Sung Su | 28 |
Michal | 9 |
Rob J Hyndman | 8 |
Gregor Gorjanc | 6 |
Forester | 5 |
Di Cook | 4 |
John Johnson | 4 |
Mario Pineda-Krch | 4 |
Radford Neal | 4 |
abiao | 4 |
2009

author | count |
---|---|
Thinking inside the box | 63 |
dirk.eddelbuettel | 36 |
Shige | 30 |
John Myles White | 28 |
Paolo | 26 |
David Smith | 25 |
Todos Logos | 25 |
Jeromy Anglim | 24 |
Stephen Turner | 23 |
romain francois | 23 |
2010

author | count |
---|---|
David Smith | 352 |
xi’an | 152 |
Thinking inside the box | 85 |
C | 75 |
Tal Galili | 74 |
dirk.eddelbuettel | 58 |
Ralph | 53 |
romain francois | 41 |
Stephen Turner | 34 |
Kelly | 33 |
2011

author | count |
---|---|
David Smith | 268 |
xi’an | 137 |
klr | 104 |
Thinking inside the box | 66 |
BMS Add-ons » BMS Blog | 58 |
Pat | 52 |
Scott Chamberlain | 48 |
Stephen Turner | 44 |
Kay Cichini | 43 |
Tal Galili | 37 |
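Tables like the ones above come from counting posts per author, once over the whole data set and once within each year. The following is a minimal Python sketch of that tabulation, not the R script used for this post; the `(author, year)` pairs and the function names `top_overall` and `top_by_year` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical (author, year) pairs; real data would come from the csv file.
POSTS = [
    ("alice", 2009), ("alice", 2009), ("bob", 2009),
    ("alice", 2010), ("carol", 2010), ("carol", 2010),
]


def top_overall(posts, n=10):
    """Return the n most prolific authors as (author, count) pairs."""
    return Counter(author for author, _ in posts).most_common(n)


def top_by_year(posts, n=10):
    """Return {year: [(author, count), ...]} with the top n authors per year."""
    per_year = defaultdict(Counter)
    for author, year in posts:
        per_year[year][author] += 1
    return {year: per_year[year].most_common(n) for year in sorted(per_year)}


overall = top_overall(POSTS, n=2)
by_year = top_by_year(POSTS, n=2)

# Print one small table per year in the same layout as the tables above.
for year, rows in by_year.items():
    print(year)
    print("author | count |")
    for author, count in rows:
        print(f"{author} | {count} |")
```

Counting on the raw author string means that one person posting under two different names is tabulated as two contributors.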
From 2009 onward, a number of authors appear among the top contributors every year, and in 2010 David Smith and xi’an move to the top of the list, both with a massive output.
I see R-bloggers as one of the great services in the R community, and the presence of very knowledgeable and prolific contributors is a public good that we can all enjoy. So let's hope the current trend will continue into the new year!
As always, the full R script to reproduce the above analysis is here: