Monthly Archives: November 2010

Using R and Hadoop to analyze VOIP data

November 8, 2010

Last month, the newest member of Revolution's engineering team, Saptarshi Guha, gave a presentation at Hadoop World 2010 on using R and Hadoop to analyze 1.3 billion voice-over-IP packets to identify calls and measure call quality. Saptarshi, of course, is the author of RHIPE, which lets R programmers write map-reduce algorithms in the Hadoop framework without needing to learn...

Read more »

The Dataists answer your questions

November 8, 2010

The fine bloggers (and R experts) at the Dataists have volunteered to answer questions about data analysis on Reddit: A few months ago, a group of like-minded folks in New York and the San Francisco Bay area decided it was time to start a blog about data, and we came up with the Dataists. Since then we thought about...

Read more »

Example 8.13: Bike ride plot, part 2

November 8, 2010

Before explaining how to make and interpret the plot above, Nick and I want to make a plea for questions: it's hard to come up with useful questions to explore each week! As shown in Example 8.12, data from the Cyclemeter app can be used to make interes...

Read more »

The NYC Marathon

November 8, 2010

New York’s annual marathon took place yesterday. Watching a bit of it on television with my friends, I was struck by the much earlier starting time for women than for men. Specifically, professional women started running yesterday at 9:10 AM, while professional men started running at 9:40 AM. (This information comes from the runner’s handbook.) I

Read more »

R Beginner’s Guide Book Update: Statistical Analysis with R Released

November 8, 2010

In the final days of October, my beginner's guide to R was released. The book's official title is Statistical Analysis with R and it can be found on the Packt Publishing website. The primary focus of Statistical Analysis with R is helping new users bec...

Read more »

A R wrapper for Google Prediction API

November 8, 2010

Since I got the chance to access both Google Storage for Developers and Google Prediction API (more details here and here), I decided to create a simple wrapper (just 4 basic functions so far) to be able to play with the Google Prediction API ...

Read more »

Le Monde puzzle [43]

November 7, 2010

Here is the puzzle in Le Monde I missed last week: Given a country with 6 airports and a local company with three destinations from each of the six airports, is it possible to find a circular trip with three intermediate stops from one of the airports? From all of the airports? One more airport

Read more »

Wetbulb Temperature

November 7, 2010

This Google Maps display shows just one of the 230 GHCN stations that are located in the water. After finding instances of this phenomenon over and over, it seemed an easy thing to find and analyze all such cases in GHCN. The issue matters for two reasons: In my temperature analysis program I use a

Read more »