356 search results for "hadoop"

High Performing Predictive Analytics with R and Hadoop

July 16, 2013
By

I'm a bit late catching up on this, but Mario Inchosa (Revolution Analytics US Chief Scientist) gave a standing-room-only talk on high-performance predictive analytics in R and Hadoop at last month's Hadoop Summit. In the talk, he described some of the progress we've made integrating the ScaleR parallel external-memory algorithms into the Hadoop platform. He described some of the...

Read more »

A possibility for use R and Hadoop together

July 8, 2013
By

(This article was first published on Milano R net, and kindly contributed to R-bloggers) As mentioned in the previous article, a possibility for dealing with some Big Data problems is to integrate R within the Hadoop ecosystem. Therefore, it's necessary to have a bridge between the two environments. It means that R should be capable of handling data the...

Read more »

Oracle R Connector for Hadoop 2.1.0 released

June 17, 2013
By

(This article was first published on Oracle R Enterprise, and kindly contributed to R-bloggers) Oracle R Connector for Hadoop (ORCH), a collection of R packages that enables Big Data analytics using HDFS, Hive, and Oracle Database from a local R environment, continues to make advancements. ORCH 2.1.0 is now available, providing a flexible framework while remarkably improving performance and...

Read more »

Resampling data in Hadoop with RHadoop

February 27, 2013
By

On Revolution Analytics partner Cloudera's blog, Uri Laserson has posted an excellent guide to resampling from a large data set in Hadoop. Resampling is an important step in fitting ensemble models (including random forests and other bagging techniques), and Uri provides a step-by-step guide to implementing resampling methods using RHadoop. He provides the complete map-reduce code in the R...

Read more »

New ways to Hadoop with R

February 26, 2013
By

Today, there are two main ways to use Hadoop with R and big data: 1. Use the open-source rmr package to write map-reduce tasks in R (running within the Hadoop cluster - great for data distillation!) 2. Import data from Hadoop to a server running Revolution R Enterprise, via Hbase, ODBC (for high-performance Hadoop/SQL interfaces), or streaming data direct...

Read more »

Slides and replay of my “Using R with Hadoop” webinar now available #rstats #hadoop

January 25, 2013
By
Slides and replay of my “Using R with Hadoop” webinar now available #rstats #hadoop

I owe a big “thank you” to all of you who attended my webinar yesterday “Using R with Hadoop”. Revolution Analytics partnered with us at Think Big Analytics to produce the webinar, and I owe them thanks as well. For those of you who missed it, the slides and replay are now available from Revolution

Read more »

Video: Using R with Hadoop

January 25, 2013
By

If you weren't one of the almost 2000 people who signed up for yesterday's webinar "Using R with Hadoop", the replay and slides are now available. During the webinar, Jeffrey Breen (Principal at Think Big Academy) talked about extracting analytics from data in Hadoop and covered: How to use R and Hadoop Hadoop streaming Various R packages and RHadoop...

Read more »

Webinar Jan 24: Using R with Hadoop

January 10, 2013
By

In two weeks (on January 24), Think Big Analytics' Jeffrey Breen will present a new webinar on using R with Hadoop. Here's the webinar description: R and Hadoop are changing the way organizations manage and utilize big data. Think Big Analytics and Revolution Analytics are helping clients plan, build, test and implement innovative solutions based on the two technologies...

Read more »

Integration of R, RStudio and Hadoop in a VirtualBox Cloudera Demo VM on Mac OS X

December 29, 2012
By
Integration of R, RStudio and Hadoop in a VirtualBox Cloudera Demo VM on Mac OS X

MotivationI was inspired by Revolution's blog and step-by-step tutorial from Jeffrey Breen on the set up of a local virtual instance of Hadoop with R. However, this tutorial describes the implementation using VMware's application. One downside to using VMware is that it's not free. I know most of the people including me like to hear the words open-source and free,...

Read more »

Big Data Trees with Hadoop HDFS

December 4, 2012
By

Last month's release of Revolution R Enterprise 6.1 added the capability to fit decision and regresson trees on large data sets (using a new parallel external memory algorithm included in the RevoScaleR package). It also introduced the possibility of applying this and the other big-data statistical methods of RevoScaleR to data files distributed in in Hadoop's HDFS file system*,...

Read more »