(This article was first published on **The PolStat R Feed**, and kindly contributed to R-bloggers)

In the last two posts we saw how to download posts from R-bloggers, extract the title, author and date of each post, and write that information to a CSV file. Now that we have a nice data set from R-bloggers, we can start to examine how the site has developed over its lifetime. In this post I will look at the following patterns in the data:

- The monthly rate of posts submitted to R-bloggers
- The distribution of posts and contributors
- The top contributors in total and tabulated by year

The graph below shows the monthly count of posts submitted to R-bloggers.com:
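
The counts behind such a graph can be computed with a few lines of base R. This is only a minimal sketch, not the script used for this post: the data frame `posts`, its column names `author` and `date`, and all values in it are made up for illustration.

```r
# Toy data standing in for the CSV from the previous posts (hypothetical values)
posts <- data.frame(
  author = c("ann", "bob", "ann", "cat", "bob"),
  date   = as.Date(c("2010-01-03", "2010-01-20", "2010-02-11",
                     "2010-02-15", "2010-02-28")),
  stringsAsFactors = FALSE
)

# Collapse each date to its year-month, then count the posts in each month
posts$month <- format(posts$date, "%Y-%m")
monthly_posts <- table(posts$month)
print(monthly_posts)
```

The resulting table can be handed to `plot()` or `barplot()` to draw the time series.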

As you can see, R-bloggers.com has experienced tremendous growth in posts. The first years, from 2005 to the end of 2008, were fairly consistent, with an average posting rate of 6 posts per month. In 2009 we see the beginning of a dramatic rise in submitted posts, which peaks in March 2011 with 266 posts. To see whether this is driven by a few very active bloggers, or whether there is a similar increase in contributors, the graph below plots the number of unique contributors for every month:
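
Counting unique contributors per month amounts to counting distinct author names within each month. A hedged sketch in base R, where the data frame `posts`, its columns `author` and `date`, and the values are hypothetical stand-ins for the real data set:

```r
# Toy data (hypothetical values); `author` and `date` are assumed column names
posts <- data.frame(
  author = c("ann", "ann", "bob", "ann", "bob", "cat", "cat"),
  date   = as.Date(c("2010-01-03", "2010-01-09", "2010-01-20", "2010-02-02",
                     "2010-02-11", "2010-02-15", "2010-02-28")),
  stringsAsFactors = FALSE
)
posts$month <- format(posts$date, "%Y-%m")

# Number of distinct authors within each month
monthly_authors <- tapply(posts$author, posts$month,
                          function(a) length(unique(a)))

# Number of posts within each month, for comparison
monthly_posts <- table(posts$month)

# Both results are ordered by month, so they can be put side by side
activity <- data.frame(
  month        = names(monthly_authors),
  posts        = as.vector(monthly_posts),
  contributors = as.vector(monthly_authors),
  stringsAsFactors = FALSE
)
print(activity)
```

If the two columns rise together, as they do in the real data, the growth is not down to a handful of authors alone.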

Here we see that the monthly number of contributors closely follows the monthly number of posts, so the rise in posts is not exclusively the result of a few extremely active bloggers. However, as the figure below shows, most authors contribute a fairly small number of posts:

The distribution is extremely skewed, with a median of 6 posts per author and a few authors contributing 200 or more posts.
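
To illustrate what such a skewed distribution looks like in numbers, here is a small sketch with invented author names and counts (not the R-bloggers data). One prolific author pulls the mean well above the median:

```r
# Hypothetical author column: one very active author and several occasional ones
authors <- c(rep("ann", 12), rep("bob", 3), rep("cat", 2), "dan", "eve")

# Posts per author as a plain integer vector
posts_per_author <- as.vector(table(authors))

median_posts <- median(posts_per_author)  # the typical author
mean_posts   <- mean(posts_per_author)    # pulled upward by the most active author
max_posts    <- max(posts_per_author)

print(c(median = median_posts, mean = mean_posts, max = max_posts))
```

A mean well above the median is a quick sign of a right-skewed distribution; `hist(posts_per_author)` would show the same thing graphically.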

The overall top ten contributors to R-bloggers.com are:

| author | count |
|---|---|
| David Smith | 647 |
| xi’an | 293 |
| Thinking inside the box | 217 |
| Tal Galili | 124 |
| klr | 104 |
| Stephen Turner | 102 |
| dirk.eddelbuettel | 94 |
| Ralph | 82 |
| romain francois | 79 |
| C | 77 |
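
A ranking like this can be produced by tabulating the author column and sorting the counts. The following is a sketch with made-up names and counts, not the code behind the table above:

```r
# Hypothetical author column, one entry per post
authors <- c(rep("ann", 12), rep("bob", 3), rep("cat", 2), "dan", "eve")

# Count posts per author and sort from most to least active
author_counts <- sort(table(authors), decreasing = TRUE)

# Keep the top three (the post uses ten) and turn them into a data frame
top <- head(author_counts, 3)
top_authors <- data.frame(
  author = names(top),
  count  = as.vector(top),
  stringsAsFactors = FALSE
)
print(top_authors)
```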

Breaking this down by year, we can see that from 2009 onward some very active R bloggers emerge:

2005

| author | count |
|---|---|
| Hadley Wickham | 3 |
| fernandohrosa | 2 |

2006

| author | count |
|---|---|
| seth | 6 |
| Hadley Wickham | 5 |
| dataninja | 5 |
| Di Cook | 3 |
| Vincent Zoonekynd's Blog | 3 |
| fernandohrosa | 2 |
| Andrew Gelman | 1 |

2007

| author | count |
|---|---|
| Mario Pineda-Krch | 20 |
| Forester | 14 |
| Egon Willighagen | 5 |
| Andrew Gelman | 4 |
| Rob J Hyndman | 4 |
| dataninja | 4 |
| Hadley Wickham | 3 |
| John Johnson | 2 |
| dan | 2 |
| seth | 2 |

2008

| author | count |
|---|---|
| Yu-Sung Su | 28 |
| Michal | 9 |
| Rob J Hyndman | 8 |
| Gregor Gorjanc | 6 |
| Forester | 5 |
| Di Cook | 4 |
| John Johnson | 4 |
| Mario Pineda-Krch | 4 |
| Radford Neal | 4 |
| abiao | 4 |

2009

| author | count |
|---|---|
| Thinking inside the box | 63 |
| dirk.eddelbuettel | 36 |
| Shige | 30 |
| John Myles White | 28 |
| Paolo | 26 |
| David Smith | 25 |
| Todos Logos | 25 |
| Jeromy Anglim | 24 |
| Stephen Turner | 23 |
| romain francois | 23 |

2010

| author | count |
|---|---|
| David Smith | 352 |
| xi’an | 152 |
| Thinking inside the box | 85 |
| C | 75 |
| Tal Galili | 74 |
| dirk.eddelbuettel | 58 |
| Ralph | 53 |
| romain francois | 41 |
| Stephen Turner | 34 |
| Kelly | 33 |

2011

| author | count |
|---|---|
| David Smith | 268 |
| xi’an | 137 |
| klr | 104 |
| Thinking inside the box | 66 |
| BMS Add-ons » BMS Blog | 58 |
| Pat | 52 |
| Scott Chamberlain | 48 |
| Stephen Turner | 44 |
| Kay Cichini | 43 |
| Tal Galili | 37 |
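
One way to get such per-year rankings is to split the author column by year and rank within each group. This is a rough sketch under assumptions: the data frame `posts`, its columns `author` and `date`, and the toy values are all hypothetical.

```r
# Toy data (hypothetical values); `author` and `date` are assumed column names
posts <- data.frame(
  author = c("ann", "ann", "bob",
             "bob", "bob", "bob", "cat", "cat", "ann"),
  date   = as.Date(c("2009-03-01", "2009-06-12", "2009-11-30",
                     "2010-01-05", "2010-02-17", "2010-04-09",
                     "2010-05-21", "2010-08-02", "2010-12-24")),
  stringsAsFactors = FALSE
)
posts$year <- format(posts$date, "%Y")

# For every year: count posts per author, sort, and keep the top two
top_by_year <- lapply(split(posts$author, posts$year), function(a) {
  top <- head(sort(table(a), decreasing = TRUE), 2)
  data.frame(author = names(top), count = as.vector(top),
             stringsAsFactors = FALSE)
})
print(top_by_year)
```

The result is a named list with one small data frame per year, so `top_by_year[["2010"]]` returns the ranking for 2010.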

From 2009 onward, a number of authors appear among the top contributors every year, and of course in 2010 David Smith and xi’an appear, both with a massive output.

I see R-bloggers as one of the great services in the R community, and the presence of very knowledgeable and prolific contributors is a public good that we can all enjoy. So let's hope the current trend continues into the new year!

As always, the full R script to reproduce the above analysis is here:
