In the last two posts we saw how to download posts from R-bloggers, extract the title, author and date of each post, and write that information to a CSV file. Since we now have a nice data set from R-bloggers, we can start to examine the development of the site during its time span. In this post I will look at the following patterns in the data:
- The rate of monthly posts submitted to R-bloggers
- The distribution of posts and contributors
- The top contributors in total and tabulated by year
The graph below shows the monthly count of posts submitted to R-bloggers.com:
As you can see, R-bloggers.com has experienced tremendous growth in posts. The first years, from 2005 to the end of 2008, were fairly consistent, with an average posting rate of 6 posts per month. In 2009 we see the beginning of a dramatic rise in submitted posts, which peaks in March 2011 with 266 posts that month. To see whether this is a function of a few very active bloggers, or whether we also see a similar increase in contributors, the graph below plots the number of unique contributors for every month:
Here we see that the monthly number of contributors closely follows the monthly number of posts; therefore the rise in posts is not exclusively the result of a few extremely active bloggers. However, as the figure below shows, most authors contribute a fairly small number of posts:
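The two monthly series discussed here (posts per month and unique contributors per month) could be computed roughly as follows. This is a minimal sketch in Python, not the R script used for the original analysis. The column names `title`, `author` and `date`, the `YYYY-MM-DD` date format, and the sample rows are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Toy stand-in for the scraped CSV file. Column names and the date
# format are assumptions; the rows are invented for illustration.
SAMPLE_CSV = """title,author,date
Post A,alice,2010-01-05
Post B,bob,2010-01-20
Post C,alice,2010-01-28
Post D,alice,2010-02-03
Post E,carol,2010-02-14
"""


def monthly_summary(csv_file):
    """Return {(year, month): (post_count, unique_author_count)}."""
    posts = defaultdict(int)
    authors = defaultdict(set)
    for row in csv.DictReader(csv_file):
        day = datetime.strptime(row["date"], "%Y-%m-%d")
        key = (day.year, day.month)
        posts[key] += 1                   # one more post in this month
        authors[key].add(row["author"])   # sets keep each author once
    return {key: (posts[key], len(authors[key])) for key in sorted(posts)}


summary = monthly_summary(io.StringIO(SAMPLE_CSV))
for (year, month), (n_posts, n_authors) in summary.items():
    print(f"{year}-{month:02d}: {n_posts} posts, {n_authors} contributors")
```

Plotting both series against the month gives the kind of comparison described in the text: if the contributor count rises along with the post count, the growth is not driven by a handful of authors alone.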
The distribution is extremely skewed with a median of 6 posts, and a few authors contributing 200 or more posts.
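A skewed distribution like this can be summarised by counting posts per author and comparing the median with the largest counts. The sketch below is in Python rather than R and uses an invented author list, so the numbers it produces have nothing to do with the real data set.

```python
from collections import Counter
from statistics import median

# Hypothetical author column, one entry per post (toy data).
authors = ["alice"] * 9 + ["bob"] * 3 + ["carol"] * 2 + ["dave"]

# Number of posts written by each author.
posts_per_author = Counter(authors)

# Sorted counts make the skew easy to see: many small values, few large ones.
counts = sorted(posts_per_author.values())
median_posts = median(counts)

# Share of all posts written by the single most active author.
top_share = max(counts) / sum(counts)

print("posts per author:", counts)
print("median:", median_posts)
print("share of most active author:", round(top_share, 2))
```

When the median sits far below the maximum, as in the real data, a histogram of `counts` shows a long right tail.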
The overall top ten contributors to R-bloggers.com are:
| author | count |
|---|---|
| David Smith | 647 |
| xi'an | 293 |
| Thinking inside the box | 217 |
| Tal Galili | 124 |
| klr | 104 |
| Stephen Turner | 102 |
| dirk.eddelbuettel | 94 |
| Ralph | 82 |
| romain francois | 79 |
| C | 77 |
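A ranking like the one in this table can be produced by counting posts per author and sorting by count. The following Python sketch assumes one author name per post; the helper name `top_contributors` and the sample names are hypothetical, and ties are broken alphabetically, which may differ from how the original R script ordered them.

```python
from collections import Counter


def top_contributors(authors, n=10):
    """Return the n most frequent authors as (author, count) pairs.

    Sorted by descending count, then alphabetically so ties are stable.
    """
    counts = Counter(authors)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Toy data: one author name per post.
authors = ["alice", "bob", "alice", "carol", "bob", "alice", "dave"]
top = top_contributors(authors, n=3)

# Print in the same table layout used in this post.
print("| author | count |")
print("|---|---|")
for author, count in top:
    print(f"| {author} | {count} |")
```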
Breaking this down by year, we can see that from 2009 onward some very active R bloggers emerge:
2005

| author | count |
|---|---|
| Hadley Wickham | 3 |
| fernandohrosa | 2 |

2006

| author | count |
|---|---|
| seth | 6 |
| Hadley Wickham | 5 |
| dataninja | 5 |
| Di Cook | 3 |
| Vincent Zoonekynd's Blog | 3 |
| fernandohrosa | 2 |
| Andrew Gelman | 1 |

2007

| author | count |
|---|---|
| Mario Pineda-Krch | 20 |
| Forester | 14 |
| Egon Willighagen | 5 |
| Andrew Gelman | 4 |
| Rob J Hyndman | 4 |
| dataninja | 4 |
| Hadley Wickham | 3 |
| John Johnson | 2 |
| dan | 2 |
| seth | 2 |

2008

| author | count |
|---|---|
| Yu-Sung Su | 28 |
| Michal | 9 |
| Rob J Hyndman | 8 |
| Gregor Gorjanc | 6 |
| Forester | 5 |
| Di Cook | 4 |
| John Johnson | 4 |
| Mario Pineda-Krch | 4 |
| Radford Neal | 4 |
| abiao | 4 |

2009

| author | count |
|---|---|
| Thinking inside the box | 63 |
| dirk.eddelbuettel | 36 |
| Shige | 30 |
| John Myles White | 28 |
| Paolo | 26 |
| David Smith | 25 |
| Todos Logos | 25 |
| Jeromy Anglim | 24 |
| Stephen Turner | 23 |
| romain francois | 23 |

2010

| author | count |
|---|---|
| David Smith | 352 |
| xi'an | 152 |
| Thinking inside the box | 85 |
| C | 75 |
| Tal Galili | 74 |
| dirk.eddelbuettel | 58 |
| Ralph | 53 |
| romain francois | 41 |
| Stephen Turner | 34 |
| Kelly | 33 |

2011

| author | count |
|---|---|
| David Smith | 268 |
| xi'an | 137 |
| klr | 104 |
| Thinking inside the box | 66 |
| BMS Add-ons » BMS Blog | 58 |
| Pat | 52 |
| Scott Chamberlain | 48 |
| Stephen Turner | 44 |
| Kay Cichini | 43 |
| Tal Galili | 37 |
From 2009 a number of authors appear in every year as some of the top contributors, and of course in 2010 David Smith and Xi’an appear, both with a massive output.
I see R-bloggers as one of the great services in the R community, and the presence of very knowledgeable and prolific contributors is a public good that we can all enjoy. So let's hope the current trend will continue into the new year!
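The per-year breakdown amounts to grouping posts by year and ranking authors within each group. Here is a minimal Python sketch of that idea, not the original R code; the `(author, year)` row layout, the helper name `top_by_year` and the sample rows are assumptions for illustration.

```python
from collections import Counter, defaultdict


def top_by_year(rows, n=10):
    """Map each year to its n most active authors as (author, count) pairs."""
    per_year = defaultdict(Counter)
    for author, year in rows:
        per_year[year][author] += 1
    return {
        # Descending count, then name, so ties come out in a stable order.
        year: sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]
        for year, counts in sorted(per_year.items())
    }


# Toy rows: (author, year of post).
rows = [
    ("alice", 2009), ("bob", 2009), ("alice", 2009),
    ("carol", 2010), ("alice", 2010), ("carol", 2010),
    ("bob", 2010), ("carol", 2010),
]

yearly = top_by_year(rows, n=2)
for year, ranking in yearly.items():
    print(year, ranking)
```

Comparing the rankings across years shows which authors stay near the top and in which year a new prolific author first appears.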
As always, the full R script to reproduce the above analysis is here:
