Posts Tagged ‘predictive analytics’

R still the preferred tool of predictive modelers competing at Kaggle

November 29, 2011

As reported on the Kaggle blog No Free Hunch, R remains the preferred tool for data scientists seeking to win the prizes in the predictive modeling competitions: More than 30% of Kaggle competitors report using R for their analysis, up from 22% a year ago. R's flexibility and the breadth of packages for machine learning and predictive modeling make...

Read more »

ACM Data Mining Camp 2011: Report

October 18, 2011

(By Joseph Rickert.) In San Jose, topics like big data, MapReduce, predictive models, mobile analytics and crowdsourcing draw a crowd even on a Saturday. So it turned out that the ACM Data Mining Camp and "un-conference" was a very "happening" way to spend a Saturday. Over 500 people attended the event at the eBay "Town Hall" on North...

Read more »

Big Analytics: Closing the "clue gap" with Big Data

August 31, 2011

There's been a growing discussion over the past couple of years on the topic of Big Data: how to deal with the situation when you have more data than can be conveniently managed and analyzed by traditional software tools. But Big Data has little intrinsic value in its own right: its value is only realized when you can deploy...

Read more »

How Google uses R to make online advertising more effective

August 3, 2011

At JSM 2011 today, three Google employees (amongst the more than 20 Google delegates there) gave a little insight into how statistical analysis with R yields better results for companies using Google's various advertising products. Bill Heavlin from Google kicked off the session with a talk about conditional regression models, a statistical technique used at Google to evaluate the...

Read more »

Fast logistic regression on Big Data with commodity hardware? No problem.

July 18, 2011

You might think that doing advanced statistical analysis on Big Data is out of reach for those of us without access to expensive hardware and software. For example, back in April SAS was proud to demonstrate being able to run logistic regression on a billion records (and "just a few" variables) in less than 80 seconds. But that feat...

Read more »

Sentiment Analysis for Airlines via Twitter

July 5, 2011

Last weekend here in the States was the 4th of July long weekend, one of the busier air travel periods of the year. As anyone who flies in the States knows, with air travel often comes frustration, and in this social media age many express their frustration on Twitter: The image above comes from a tutorial on text mining...

Read more »

K-Means Clustering on Big Data

June 7, 2011

In this post Joseph Rickert demonstrates how to build a clustering model on a large data set with the RevoScaleR package. A script file for use with Revolution R Enterprise to recreate the analysis below is at the end of the post, and can also be downloaded here -- ed. The k-means (Lloyd) algorithm, an intuitive way to explore...

Read more »

Participate in the 2011 Rexer Data Mining Survey

May 23, 2011

Last year's Rexer Data Mining Survey reported that R is used by more data miners than any other tool. If you're using R for data mining or data analysis generally, be counted at the 2011 Data Miner Survey (use access code: RL3X2), which closes in early June. Here's some background on the survey: The survey is conducted annually by...

Read more »

Quantifying gravitational lensing by dark matter

May 23, 2011

The latest prediction competition at Kaggle is literally "out of this world": the goal is to quantify the shape of 2-D images of galaxies from a simulated telescope, to test models for how invisible dark matter in the Universe distorts the images through gravitational lensing (as shown in the image below; see the FAQ for more details). If you're...

Read more »

How Kaggle competitors use R

April 19, 2011

The data prediction competitions hosted by Kaggle require data scientists to bring their A game: the competition is intense, and competitors know in real time from the daily leaderboards how their predictions compare in accuracy to those of their rivals. So it's no surprise that open-source R, the most powerful statistics language, is a common tool of choice...

Read more »
