How Kaggle competitors use R

April 19, 2011

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

The data prediction competitions hosted by Kaggle require data scientists to bring their A game: the contests are intense, and competitors know in real time from the daily leaderboards how their predictions compare in accuracy to those of their rivals. So it's no surprise that open-source R, the most powerful statistics language, is a common tool of choice amongst competitors.

In a presentation to the Bay Area R User Group last week, Kaggle CEO Anthony Goldbloom showed this chart of the software preferences of Kaggle competitors:

[Chart: Kaggle software usage]
As you can see, a third of all Kaggle competitors report using R. Moreover, Kaggle reports that fully 50% of competition winners used R to build the most accurate predictive models and beat out their rivals. Today, Revolution Analytics has released a new white paper that interviews some Kaggle competitors who used R to win their competitions, and that explores what makes R uniquely suited to building the most predictive models.

Revolution Analytics has also announced a partnership with Kaggle to make the big-data capabilities of Revolution R Enterprise available for use in Kaggle competitions, free of charge. Kaggle competitors can now download Revolution R Enterprise and extend their use of R with the R Productivity Environment for coding and debugging predictive models. They can also apply out-of-memory statistical models to the large data sets that appear in many Kaggle-hosted competitions, such as the $3M Heritage Health Prize or the forthcoming NASA and Wikipedia competitions.


This quote from Jeff Erhardt (Revolution Analytics COO) sums up why we're excited to make Revolution R Enterprise available for Kaggle competitions: “We’ve entered an era of information where data science can be applied to solve nearly any real world problem,” says Jeff. “Technological and scientific advances brought us the R language, and by innovating on top of R, Revolution Analytics is providing data scientists with an opportunity to access broader sets of data faster to tackle today’s toughest data problems. We’re pleased to work with Kaggle to offer Revolution R Enterprise to its ambitious participants.”

Revolution White Papers: R Competition Brings Out the Best in Data Analytics
