379 search results for "hadoop"

Webinar Jan 24: Using R with Hadoop

January 10, 2013
By

In two weeks (on January 24), Think Big Analytics' Jeffrey Breen will present a new webinar on using R with Hadoop. Here's the webinar description: R and Hadoop are changing the way organizations manage and utilize big data. Think Big Analytics and Revolution Analytics are helping clients plan, build, test and implement innovative solutions based on the two technologies...

Read more »

Integration of R, RStudio and Hadoop in a VirtualBox Cloudera Demo VM on Mac OS X

December 29, 2012
By
Integration of R, RStudio and Hadoop in a VirtualBox Cloudera Demo VM on Mac OS X

MotivationI was inspired by Revolution's blog and step-by-step tutorial from Jeffrey Breen on the set up of a local virtual instance of Hadoop with R. However, this tutorial describes the implementation using VMware's application. One downside to using VMware is that it's not free. I know most of the people including me like to hear the words open-source and free,...

Read more »

Big Data Trees with Hadoop HDFS

December 4, 2012
By

Last month's release of Revolution R Enterprise 6.1 added the capability to fit decision and regresson trees on large data sets (using a new parallel external memory algorithm included in the RevoScaleR package). It also introduced the possibility of applying this and the other big-data statistical methods of RevoScaleR to data files distributed in in Hadoop's HDFS file system*,...

Read more »

Webinar Tomorrow: Big Data Trees and Hadoop Connection in Revolution R Enterprise 6.1

November 14, 2012
By

Tomorrow at 9AM Pacific, Revolution Analytics VP of Product Development Sue Ranney will introduce two key Big Data features of the new Revolution R Enterprise 6.1. Now you can train classification and regression trees on data sets of unlimited size, quickly and using the resources of multiple processors and clusters. (This white paper describes our implementation of tree models...

Read more »

Allstate compares SAS, Hadoop and R for Big-Data Insurance Models

October 25, 2012
By
Allstate compares SAS, Hadoop and R for Big-Data Insurance Models

At the Strata conference in New York today, Steve Yun (Principal Predictive Modeler at Allstate's Research and Planning Center) described the various ways he tackled the problem of fitting a generalized linear model to 150M records of insurance data. He evaluated several approaches: Proc GENMOD in SAS Installing a Hadoop cluster Using open-source R (both on the full data...

Read more »

Improving the integration between R and Hadoop: rmr 2.0 released

October 4, 2012
By

The RHadoop project, the open-source project supported by Revolution Analytics to integrate R and Hadoop, continues to evolve. Now available is version 2 of the rmr package, which makes it possible for R programmers to write map-reduce tasks in the R language, and have them run within the Hadoop cluster. This update is the "simplest and fastest rmr yet",...

Read more »

Getting Started with R and Hadoop

August 20, 2012
By
Getting Started with R and Hadoop

Last week's meeting of the Chicago area Hadoop User Group (a joint meeting the Chicago R User Group, and sponsored by Revolution Analytics) focused on crunching Big Data with R and Hadoop. Jeffrey Breen, president of Atmosphere Research Group, frequently deals with large data sets in his airline consulting work, and R is his "go-to tool for anything data-related"....

Read more »

Faster R in Hadoop: rmr 1.3 now available

July 23, 2012
By

The RHadoop project continues the Big Data integration of R and Hadoop, with a new update to its rmr package. Version 1.3 of rmr improves the performance of map-reduce jobs for Hadoop written in R. New features include: An optional vectorized API for efficient R programming when dealing with small records. Fast C implementations for serialization and deserialization from...

Read more »

Data distillation with Hadoop and R

June 11, 2012
By
Data distillation with Hadoop and R

We're definitely in the age of Big Data: today, there are many more sources of data readily available to us to analyze than there were even a couple of years ago. But what about extracting useful information from novel data streams that are often noisy and minutely transactional ... aye, there's the rub. One of the great things about...

Read more »

Facebook-class social network analysis with R and Hadoop

May 25, 2012
By
Facebook-class social network analysis with R and Hadoop

In computing, social networks are traditionally represented as graphs: a connection of nodes (people), pairs of which may be connected by edges (friend relationships). Visually, the social networks can then be represented like this: Social network analysis often amounts to calculating the statistics on a graph like this: the number of edges (friends) connected to a particular node (person),...

Read more »