Since it's the end of the year, and since this is a statistics blog, I thought I'd pull some data from the blog server and run some numbers on the blog itself.
Overall, the blog has doubled the average number of daily visitors and pageviews compared to 2009. The number of pageviews varies quite a lot, as you can see from the kernel density estimate of daily values below. The median daily pageview count was just shy of 1400. I couldn't find an easy way to count the number of posts published in 2010, but there's been at least one every weekday (and often more than one).
By the way, here are some hints on exporting data from Google Analytics for analysis in R. It's best to export in TSV (tab-separated values) format for ease of import. You'll need to delete or skip the first 9 rows of the export file to remove the comments Google Analytics prepends to the data. And finally, you'll need to read in the pageview data as text and then strip the commas from the numbers (which are exported as “1,234” instead of “1234”) before you convert them to numeric in R. I used the following commands:
df <- read.table("pageviews.tsv", header=TRUE, skip=9, sep="\t", as.is=TRUE)
df$Pageviews <- as.numeric(gsub(",","",df$Pageviews))
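If you'd like to try those import steps without an export file to hand, here's a minimal self-contained sketch. The rows are made up for illustration (they are not the blog's actual numbers), and the column names Day and Pageviews and the preamble text are assumptions about what an export might look like. It runs the same read.table and gsub steps on in-memory text, then computes a median and a kernel density estimate like the ones mentioned earlier:

```r
# Made-up stand-in for a Google Analytics TSV export (not real blog data):
# nine lines of preamble, a header row, then one row per day.
export_lines <- c(
  paste("Preamble line", 1:9),
  "Day\tPageviews",
  "12/27/2010\t1,234",
  "12/28/2010\t987",
  "12/29/2010\t1,402",
  "12/30/2010\t2,015",
  "12/31/2010\t1,398"
)

# Same import steps as above, reading from the text instead of a file.
df <- read.table(text = export_lines, header = TRUE, skip = 9,
                 sep = "\t", as.is = TRUE)

# Pageviews arrives as text because of the commas; strip them, then convert.
df$Pageviews <- as.numeric(gsub(",", "", df$Pageviews))

# Median daily pageviews and a kernel density estimate of the daily values.
median_views <- median(df$Pageviews)
views_density <- density(df$Pageviews)  # Gaussian kernel, default bandwidth
plot(views_density, main = "Daily pageviews (made-up data)")
```

With a real export, you'd swap the text argument for the file name, as in the commands above.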
The remaining stats I pulled directly from Google Analytics.
Top 10 Posts of 2010
The top 10 posts, as measured by direct traffic to each page, are listed below:
- Why Learn R? It's the language of Statistics (by Joseph Rickert)
- Facebook's Social Network Graph
- Video: Hadley Wickham gives a short course on graphics with R
- Charting the World Cup
- How to make a heat map in R
- A free book on probability and statistics with R
- Because it's Friday: The dating equation
- O'Reilly at OSBC: The future's in the data
- An analysis of the Wikileaks data with R
- How to animate Google Earth with R
This doesn't include views of posts syndicated at other sites (such as r-bloggers.com), or views of the blog homepage (which is how many people read the blog, rather than post by post). Interestingly, the #3 post from 2009, How to choose a random number in R, would have been the #1 post in 2010 -- thanks to its Google Search traffic -- if I hadn't limited the contenders to posts published this year.
Top 5 Browsers of 2010
It's interesting to look at the top Web browsers used to visit this blog, as measured by the number of pages served to each browser:
- Firefox (44.45%)
- Internet Explorer (22.04%)
- Chrome (20.08%)
- Safari (11.41%)
- Opera (2.02%)
Chrome is the big change here: it represented just 8.4% of traffic in 2009. On the Web as a whole, Internet Explorer has a nearly 50% share of traffic, so clearly the readers of this blog prefer Firefox disproportionately. I use both Chrome and Firefox myself.
Top 5 Operating Systems of 2010
Google Analytics also reports the operating system used to visit the blog, and the top five are:
- Windows (64.99%)
- Macintosh (20.69%)
- Linux (11.96%)
- iPhone (1.32%)
- iPad (1.05%)
This makes Firefox's dominance in the browsers even more remarkable, given IE's built-in advantage on Windows. Regarding the operating systems themselves, the order is unchanged from 2009, and the ratios are about the same too (Windows was 59% in 2009, Mac at 28%). The only change is that the iPad replaced the iPod for the #5 slot.
Well, that's about it for 2010 - it's been a great year, and a lot of fun writing the blog and getting the word out about all the awesome things the community is doing with R. Thanks to everyone who sent in suggestions for articles, made comments, or just read the blog in 2010. We'll be back again next year with more news about R, statistics and the world of open source. Happy New Year to all of our readers, and we'll see you in 2011!