I'm a bit late catching up on this, but Mario Inchiosa (Revolution Analytics US Chief Scientist) gave a standing-room-only talk on high-performance predictive analytics in R and Hadoop at last month's Hadoop Summit. In the talk, he described some of the progress we've made integrating the ScaleR parallel external-memory algorithms into the Hadoop platform. He covered some of the design considerations that make the ScaleR algorithms so fast with big data, and how that architecture is being integrated into Hadoop. He also described the interface from the perspective of an R programmer working at a desktop but using the power of a remote Hadoop cluster for the analytics. I've embedded his slides below:
These new in-Hadoop predictive analytics capabilities will be available by the end of the year as part of the next update to Revolution R Enterprise. If this is something you'd like to try out, please get in touch.
SlideShare (Revolution Analytics): High Performance Predictive Analytics in R and Hadoop