Using R and Hadoop to analyze VOIP data

November 8, 2010

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

Last month, Saptarshi Guha, the newest member of Revolution's engineering team, gave a presentation at Hadoop World 2010 on using R and Hadoop to analyze 1.3 billion voice-over-IP packets, identifying individual calls and measuring call quality. Saptarshi is the author of RHIPE, which lets R programmers write MapReduce algorithms for the Hadoop framework without needing to learn Java. With R running on each Hadoop node, he used R's data analysis functions (such as robust regression) to process almost 100 GB of data in just a few minutes.

The slides for Saptarshi's talk are now available to view at the Hadoop World website (linked below), or you can download a PDF version (7.3 MB).

Hadoop World 2010: Voice over IP: Studying Traffic Characteristics for Quality of Service using R and Hadoop



