Posts Tagged ‘ Big Data ’

Webinar: Leveraging R in Hadoop Environments

September 6, 2011
By

On Wednesday, September 21, Revolution Analytics' CTO David Champagne will give a live webinar introducing three new open-source packages for R and Hadoop, which make it possible to work with Hadoop data in R, and bring in-database R analytics to Hadoop. Here are the details: Date: Wednesday, September 21st; Time: 10:00AM - 10:30AM Pacific Time; Presenter: David Champagne, Chief...

Read more »

Big Analytics: Closing the "clue gap" with Big Data

August 31, 2011
By

There's been a growing discussion over the past couple of years on the topic of Big Data: how to deal with the situation when you have more data than can be conveniently managed and analyzed by traditional software tools. But Big Data has little intrinsic value in its own right: its value is only realized when you can deploy...

Read more »

GigaOm article on R, Big Data and Data Science

July 18, 2011
By

I'm really pleased that an article I wrote, "5 real-world uses of big data", has been published in the widely read technology blog GigaOm. In the article, I review five examples of using data science techniques and R to make sense of some large real-world data sets: Drew Conway's analysis of the Afghanistan attacks data released by Wikileaks; Benetech's use...

Read more »

Fast logistic regression on Big Data with commodity hardware? No problem.

July 18, 2011
By

You might think that doing advanced statistical analysis on Big Data is out of reach for those of us without access to expensive hardware and software. For example, back in April SAS was proud to demonstrate being able to run logistic regression on a billion records (and "just a few" variables) in less than 80 seconds. But that feat...

Read more »

Big-Data PCA: 50 years of stock data

June 17, 2011
By

In this post, Revolution engineer Sherry LaMonica shows us how to use the RevoScaleR big-data package in Revolution R Enterprise to do principal components analysis on 50 years of stock market data -- ed. Principal components analysis, or PCA, seeks to find a set of orthogonal axes such that the first axis, or first principal component, accounts for as...

Read more »

The Big Analytics Revolution starts with R

June 15, 2011
By

Thanks to everyone who attended our webinar The 'Big Analytics' Revolution Starts with R yesterday. If you missed the live session, you can download the presentation slides (PDF) and the 30-minute replay video (WMV) from the Revolution Analytics website. The presentation focuses on the issue of Big Data, and how businesses can use advanced analytics methods implemented in the...

Read more »

K-Means Clustering on Big Data

June 7, 2011
By

In this post Joseph Rickert demonstrates how to build a classification model on a large data set with the RevoScaleR package. A script file for use with Revolution R Enterprise to recreate the analysis below is at the end of the post, and can also be downloaded here -- ed. The k-means (Lloyd) algorithm, an intuitive way to explore...

Read more »

The Netflix Prize, Big Data, SVD and R

May 31, 2011
By

One of the key data analysis tools that the BellKor team used to win the Netflix Prize was the Singular Value Decomposition (SVD) algorithm. As a file on disk, the Netflix Prize data (a matrix of about 480,000 members' ratings for about 18,000 movies) was about 65 GB in size -- too large to be read into the standard in-memory...

Read more »

A simple Big Data analysis using the RevoScaleR package in Revolution R

May 24, 2011
By

This post from Stephen Weller is part of a series from members of the Revolution Analytics Engineering team. Learn more about the RevoScaleR package, available free to academics as part of Revolution R Enterprise — ed. The RevoScaleR package, installed with Revolution R Enterprise, offers parallel external memory algorithms that help R break through memory and performance limitations. RevoScaleR...

Read more »
