331 search results for "hadoop"

In case you missed it: November Roundup

December 17, 2010

In case you missed them, here are some articles from November of particular interest to R users. Dirk Eddelbuettel and Romain Francois went to Google to talk about integrating R (using Rcpp, for example), and we gave a review of the video presentation. R co-creator Ross Ihaka wins a Lifetime Achievement Award in Open Source. Revolution has job openings...


Facebook’s Social Network Graph

December 14, 2010

Paul Butler, an intern on Facebook’s data infrastructure engineering team, was interested in visualizing the "locality of friendship". Luckily, he has some great data to work with: Facebook's social network of the friendships between its 500 million members. But visualizing that much data can be a challenge in its own right -- it takes skill to draw meaning from...


Learn Logistic Regression (and beyond)

November 23, 2010

One of the current best tools in the machine learning toolbox is the 1930s statistical technique called logistic regression. We explain how to add professional-quality logistic regression to your analytic repertoire and describe a bit beyond that. A statistical analyst working on data tends to deliberately start simple and move cautiously to more complicated methods.


In case you missed it: October Roundup

November 16, 2010

In case you missed them, here are some articles from October of particular interest to R users. Reviews of the winners and finalists of the 2010 ggplot2 case study competition. We have published a new article "R is Hot", with interviews from a dozen R users in industry and academia. A new code highlighting tool for displaying R code...


My Day at ACM Data Mining Camp III

November 13, 2010

My first time at ACM Data Mining Camp was so awesome that I was thrilled to make the trip up to San Jose for the November 2010 version. In July, I gave a talk at the Emerging Technologies for Online Learning Symposium conference with a faculty member in the Department of Statistics, at the Fairmont. The place was amazing,...


Handling Large Datasets in R

Handling large datasets in R, especially CSV data, was briefly discussed before in "Excellent free CSV splitter" and "Handling Large CSV Files in R". My file at that time was around 2GB, with 30 million rows and 8 columns. Recently I started to collect and analyze US corporate bonds tick data from year...


RHIPE in the SD Times

October 12, 2010

Saptarshi Guha, who we profiled yesterday, is at the Hadoop World conference in New York City today. At 4PM, Saptarshi will give a presentation on RHIPE, his link between R and Hadoop. Saptarshi was interviewed yesterday by Alex Handy of the SD Times, where he talked about his background and his motivation to create RHIPE. Saptarshi was sponsored by...


In case you missed it: September Roundup

October 12, 2010

In case you missed them, here are some articles from September of particular interest to R users. We presented a profile of Hadley Wickham, author of many popular R packages including ggplot2 and reshape. We riffed on the design of the new Twitter website in a discussion on calculating the Golden Mean with R. Several readers contributed 1-liners based on...


The R-Files: Saptarshi Guha

October 11, 2010

"The R-Files" is an occasional series from Revolution Analytics, where we profile prominent members of the R Community.

Name: Saptarshi Guha
Background: Ph.D. in Statistics, Purdue University
Nationality: India
Years Using R: 6
Known for: Developing RHIPE package for R + Hadoop integration

At just 31 years old, Saptarshi Guha has emerged as a cutting-edge contributor to the R...


Making sense of MapReduce

September 24, 2010

From guest blogger Joseph Rickert. Last night I went to hear Ken Krugler of Bixolabs talk about Hadoop at the monthly meeting of the Software Developers Forum. Maybe because Ken is an unusually lucid speaker, or maybe because I just reached some sort of cumulative tipping point through the prep work of all those patient people who have tried...
