I'm a bit late catching up on this, but Mario Inchosa (Revolution Analytics US Chief Scientist) gave a standing-room-only talk on high-performance predictive analytics in R and Hadoop at last month's Hadoop Summit. In the talk, he described some of the progress we've made integrating the ScaleR parallel external-memory algorithms into the Hadoop platform. He explained some of the design considerations that make the ScaleR algorithms so fast with big data, and how that architecture is being integrated into Hadoop. He also walked through the interface from the perspective of an R programmer working at a desktop but using the power of a remote Hadoop cluster for the analytics. I've embedded his slides below:

These new in-Hadoop predictive analytics capabilities will be available by the end of the year as part of the next update to Revolution R Enterprise. If this is something you'd like to try out, please get in touch.

SlideShare (Revolution Analytics): High Performance Predictive Analytics in R and Hadoop
