304 search results for "hadoop"

In-Hadoop R-based Analytics coming to Cloudera

August 27, 2013

Revolution Analytics has teamed up with Cloudera to bring the scalable data manipulation and statistical modeling algorithms of Revolution R Enterprise to the massively parallel computing environments of CDH3 and CDH4 Hadoop clusters. As ZDNet reports: Specifically, the upcoming version 7.0 of the Revolution R Enterprise distribution and its ScaleR algorithms will run inside CDH3 and CDH4, eliminating the need...

New Webinar: High Performance Predictive Analytics in R and Hadoop

August 23, 2013

This coming Tuesday, August 27, our US Chief Scientist Mario Inchiosa will reveal some details of the forthcoming in-Hadoop predictive analytics capabilities of Revolution R Enterprise 7, due for release later this year. Here's the abstract of his webinar, High Performance Predictive Analytics in R and Hadoop: Hadoop is rapidly being adopted as a major platform for storing and...

Step by step to build my first R Hadoop System

August 20, 2013

by Yanchang Zhao, RDataMining.com. After reading documents and tutorials on MapReduce and Hadoop and playing with RHadoop for about 2 weeks, I have finally built my first R Hadoop system and successfully run some R examples on it. My experience …

An excellent introduction to MapReduce and Hadoop

July 19, 2013

by Yanchang Zhao, RDataMining.com. The week 3 lectures of the free online course Introduction to Data Science give an excellent introduction to MapReduce and Hadoop, and demonstrate with examples how to use MapReduce to do various tasks, such as …

Oracle R Connector for Hadoop 2.2.0 released

July 19, 2013

Oracle R Connector for Hadoop 2.2.0 is now available for download. The Oracle R Connector for Hadoop 2.x series has introduced numerous enhancements, which are highlighted in this article and summarized as follows: ORCH 2.0.0  ORCH 2.1.0  ORCH...

High Performing Predictive Analytics with R and Hadoop

July 16, 2013

I'm a bit late catching up on this, but Mario Inchiosa (Revolution Analytics US Chief Scientist) gave a standing-room-only talk on high-performance predictive analytics in R and Hadoop at last month's Hadoop Summit. In the talk, he described some of the progress we've made integrating the ScaleR parallel external-memory algorithms into the Hadoop platform. He described some of the...

A possibility for use R and Hadoop together

July 8, 2013

(This article was first published on Milano R net, and kindly contributed to R-bloggers) As mentioned in the previous article, one way to deal with some Big Data problems is to integrate R within the Hadoop ecosystem. This requires a bridge between the two environments, which means that R should be capable of handling data the...

Oracle R Connector for Hadoop 2.1.0 released

June 17, 2013

(This article was first published on Oracle R Enterprise, and kindly contributed to R-bloggers) Oracle R Connector for Hadoop (ORCH), a collection of R packages that enables Big Data analytics using HDFS, Hive, and Oracle Database from a local R environment, continues to make advancements. ORCH 2.1.0 is now available, providing a flexible framework while remarkably improving performance and...

Resampling data in Hadoop with RHadoop

February 27, 2013

On Revolution Analytics partner Cloudera's blog, Uri Laserson has posted an excellent guide to resampling from a large data set in Hadoop. Resampling is an important step in fitting ensemble models (including random forests and other bagging techniques), and Uri provides a step-by-step guide to implementing resampling methods using RHadoop. He provides the complete map-reduce code in the R...

New ways to Hadoop with R

February 26, 2013

Today, there are two main ways to use Hadoop with R and big data:
1. Use the open-source rmr package to write map-reduce tasks in R (running within the Hadoop cluster - great for data distillation!)
2. Import data from Hadoop to a server running Revolution R Enterprise, via HBase, ODBC (for high-performance Hadoop/SQL interfaces), or streaming data direct...
