Inspired by a post by a
As an appetizer for the Paris triathlon, Jérôme and I ran an adventure race as a team last weekend in the Champagne region (it mainly consists of running, cycling, and canoeing, with a flavor of orienteering, and the Champagne is kept for the end). It was organized by Ecole Polytechnique students who, for the first time, divided Saturday’s legs
In a previous post I introduced the Smith for Congress data set. The data comprise 49k contributions made by individuals to a congressional campaign over the 2006-2010 electoral cycles. Smith for Congress is not the name of the actual campaign. Individual contributions are not required to be disclosed by a campaign unless the individual donates
I did the Two Castles Run today; it’s a 10km race between Warwick and Kenilworth castles. The organizers were very quick to put the results online and even went the extra mile of offering them as a CSV file. It …
Recently, I’ve been playing with patterns of drought in paleo records of streamflow. One of the earliest and most helpful tools I’ve developed identifies and characterizes droughts in extremely long time series using R. I’m still hacking my way through it, but this is what has been cooking thus far… For this example I’ll be
I’ve been a big fan of ggplot2 for a long time but plyr has been in my toolkit for less than a year and it is now one of my most-used R packages. It is how aggregate/*apply would have been if they were awesome. In five lines this code computes the cumulative distribution functions of
An interesting sampling method that was covered briefly in my Bayesian statistics course was rejection sampling. Since I have nothing better to do, I thought it would be fun to make an acceptance-rejection algorithm using R. FUN! The Rejection Sampling method is usually used to simulate data from an unknown distribution. To do this one samples...
"The R-Files" is an occasional series from Revolution Analytics, where we profile prominent members of the R Community. Name: Jeroen Ooms Background: Ph.D. Candidate, Statistics, UCLA Nationality: Netherlands Years Using R: 3 1/2 Known for: Developing web applications for popular R packages including ggplot2, lme4, stockplot and irttool Jeroen Ooms is a statistical consultant and R enthusiast currently pursuing...
Thanks to this great post http://www.imachordata.com/?p=730 we can now put multiple plots on a display with ggplot2. This provides somewhat similar functionality to ‘par(mfrow=c(x,y))’, which allows multiple plots with the base plot function. gridExtra doesn’t have quite the same level of options as ‘par’, but the syntax is simple: grid.arrange(graph1, graph2, ncol=2). Simple. ‘grid.table’
In this post Joseph Rickert demonstrates how to build a classification model on a large data set with the RevoScaleR package. A script file for use with Revolution R Enterprise to recreate the analysis below is at the end of the post, and can also be downloaded here -- ed. The k-means (Lloyd) algorithm, an intuitive way to explore...