202 search results for "hadoop"

RHIPE: An Interface Between Hadoop and R for Large and Complex Data Analysis

February 16, 2011

RHIPE: An Interface Between Hadoop and R, presented by Saptarshi Guha. About the video: I filmed the event using LectureMaker’s live event recording technique. One special feature I add to my R video recordings is the addition of my own R source code …

Read more »

In case you missed it: December Roundup

January 17, 2011

In case you missed them, here are some articles from December of particular interest to R users. A Facebook employee created a beautiful visualization of social connections around the world, which made a lot of news on the Web. The creator, Paul Butler, explained how he did it using R. With sponsorship from Revolution Analytics, the R/Finance conference in...

Read more »

Run R in parallel on a Hadoop cluster with AWS in 15 minutes

January 10, 2011

If you're looking to apply massively parallel resources to an R problem, one of the most time-consuming aspects of the problem might not be the computations themselves, but the task of setting up the cluster in the first place. You can use Amazon Web Services to set up the cluster in the cloud, but even that takes some time,...

Read more »

Abusing Amazon’s Elastic MapReduce Hadoop service… easily, from R

January 10, 2011

JD Long's experimental segue package makes it easy to use Amazon's Elastic MapReduce service to fire up a Hadoop cluster and use it for non-Big Data, computationally-intensive tasks. The package provides a cluster-aware version of lapply() which "just works".

Read more »

How Orbitz uses Hadoop and R to optimize hotel search

December 21, 2010

Positional bias — the tendency for users to preferentially select results in the first few positions of a search — is a big issue for all kinds of search engines. But for online travel site Orbitz the stakes are higher than for a traditional Web search engine: if a customer chooses the first-listed hotel in a search for accommodations,...

Read more »

In case you missed it: November Roundup

December 17, 2010

In case you missed them, here are some articles from November of particular interest to R users. Dirk Eddelbuettel and Romain Francois went to Google to talk about integrating R (using Rcpp, for example), and we gave a review of the video presentation. R co-creator Ross Ihaka wins a Lifetime Achievement Award in Open Source. Revolution has job openings...

Read more »

Facebook’s Social Network Graph

December 14, 2010

Paul Butler, an intern on Facebook’s data infrastructure engineering team, was interested in visualizing the "locality of friendship". Luckily, he has some great data to work with: Facebook's social network of the friendships between its 500 million members. But visualizing that much data can be a challenge in its own right -- it takes skill to draw meaning from...

Read more »

What’s Next for Revolution R and Hadoop?

November 30, 2010

It's been a busy fall season for the team at Revolution Analytics. Over the past few months, we've announced major product enhancements for Revolution R -- RevoScaleR, for tackling big data sets, and RevoDeployR, for embedding Revolution R into wider applications. We've continued to add to our growing customer base at an aggressive rate and we've been busy crisscrossing...

Read more »

Learn Logistic Regression (and beyond)

November 23, 2010

One of the best tools currently in the machine learning toolbox is the 1930s statistical technique called logistic regression. We explain how to add professional-quality logistic regression to your analytic repertoire and describe a bit beyond that. A statistical analyst working on data tends to deliberately start simple and move cautiously to more complicated methods.

Read more »

In case you missed it: October Roundup

November 16, 2010

In case you missed them, here are some articles from October of particular interest to R users. Reviews of the winners and finalists of the 2010 ggplot2 case study competition. We have published a new article "R is Hot", with interviews from a dozen R users in industry and academia. A new code highlighting tool for displaying R code...

Read more »