356 search results for "hadoop"

Handling Large Datasets in R

Handling large datasets in R, especially CSV data, was briefly discussed before in Excellent free CSV splitter and Handling Large CSV Files in R. My file at that time was around 2GB, with 30 million rows and 8 columns. Recently I started to collect and analyze US corporate bonds tick data from year...

RHIPE in the SD Times

October 12, 2010

Saptarshi Guha, whom we profiled yesterday, is at the Hadoop World conference in New York City today. At 4PM, Saptarshi will give a presentation on RHIPE, his link between R and Hadoop. Saptarshi was interviewed yesterday by Alex Handy of the SD Times, where he talked about his background and his motivation to create RHIPE. Saptarshi was sponsored by...

In case you missed it: September Roundup

October 12, 2010

In case you missed them, here are some articles from September of particular interest to R users. We presented a profile of Hadley Wickham, author of many popular R packages including ggplot2 and reshape. We riffed the design of the new Twitter website into a discussion on calculating the Golden Mean with R. Several readers contributed 1-liners based on...

The R-Files: Saptarshi Guha

October 11, 2010

"The R-Files" is an occasional series from Revolution Analytics, where we profile prominent members of the R Community. Name: Saptarshi Guha. Background: Ph.D. in Statistics, Purdue University. Nationality: India. Years Using R: 6. Known for: Developing RHIPE package for R + Hadoop integration. At just 31 years old, Saptarshi Guha has emerged as a cutting-edge contributor to the R...

Making sense of MapReduce

September 24, 2010

From guest blogger Joseph Rickert. Last night I went to hear Ken Krugler of Bixolabs talk about Hadoop at the monthly meeting of the Software Developers Forum. Maybe because Ken is an unusually lucid speaker, or maybe because I just reached some sort of cumulative tipping point through the prep work of all those patient people who have tried...

Taking R to the Limit: Parallelism and Big Data

August 23, 2010

In a two-part series at the Los Angeles R User Group, Ryan Rosario took a look at the many ways you can take the R language to the limits of high-performance computing. In Part I (see video at this link; slides and code also available), Ryan focuses on the various methods of parallel computing in R. There's some great...

Taking R to the Limit, Part II – Large Datasets in R

August 20, 2010

For Part I, Parallelism in R, click here. Tuesday night I again had the opportunity to present on high performance computing in R, at the Los Angeles R Users’ Group. This was the second part of a two-part series called “Taking R to the Limit: High Performance Computing in R.” Part II discussed ways to work with large datasets...

Announcing Big Data for Revolution R

August 3, 2010

I've hinted this was coming a few times before, but with today's press release the announcement is official: the next release of Revolution R Enterprise will include "Big Data" capabilities thanks to the new RevoScaleR package. We're pretty excited at how it's turned out: it's kinda amazing to be able to use R's formula syntax like this: arrDelayLm2 <-...

Taking R to the Limit, Part I – Parallelization in R

July 28, 2010

Tuesday night I had the opportunity to present on high performance computing in R at the Los Angeles R Users’ Group. There was so much to talk about that I had to split my talk into two parts. The first part was parallelization and the second ...

An experiment in A/B Testing my Résumé

July 1, 2010

Objective. I’ll admit it: my résumé doesn’t stand out. I’ve had some great internships, but also a tendency to work for companies that aren’t (yet!) household names. And though I’m doing fine academically, it’s not well enough to stand out …
