Blog Archives

Using R and Hadoop to analyze VOIP data

November 8, 2010

Last month, the newest member of Revolution's engineering team, Saptarshi Guha, gave a presentation at Hadoop World 2010 on using R and Hadoop to analyze 1.3 billion voice-over-IP packets to identify calls and measure call quality. Saptarshi, of course, is the author of RHIPE, which lets R programmers write map-reduce algorithms in the Hadoop framework without needing to learn...

Read more »

The Dataists answer your questions

November 8, 2010

The fine bloggers (and R experts) at the Dataists have volunteered to answer questions about data analysis on Reddit: A few months ago, a group of like-minded folks in New York and the San Francisco Bay area decided it was time to start a blog about data, and we came up with the Dataists. Since then we thought about...

Read more »

Because it’s Friday: Epidemiology in 1632

November 5, 2010

I first got interested in epidemiology when I saw the famous John Snow chart (in a Tufte book, I think?) which pinpointed the pump which caused the 1854 cholera outbreak in London. For some reason I'd gotten the impression that this was essentially the birth of epidemiology as a discipline, but it's actually been around a lot longer than...

Read more »

ACM Data Mining Camp 3

November 5, 2010

The San Francisco Bay Area chapter of the ACM will hold its third data mining camp next Saturday (November 13) at the eBay campus in San José. Like the previous camps, this will be a one-day "unconference"-style event, with an agenda developed ad hoc on the day according to the interests of the attendees. With data scientists from the...

Read more »

Dress your R code for the Web with Pretty R

November 4, 2010

If you have some R code to include in a document, especially a Web-based document like a blog post, the new "Pretty R" feature on inside-R.org can help you make it look its best. Given some raw R code, it will create an HTML version of the code, adding syntax highlighting elements and links. Functions, strings, comments and literals...

Read more »

R is Hot: Part 5

November 4, 2010

This is the final installment of a five-part article series. You can download the complete article from the Revolution Analytics website. Building a Business: The value of R to business is borne out by the experiences of John Lucker and his team of advanced analytics professionals at Deloitte Consulting LLP. John is a Deloitte Consulting Principal and leads the firm’s...

Read more »

Keeping up with election results, with R

November 3, 2010

Yesterday's US election is pretty much over now: most of the results are in, the pundits have offered their political analysis, and there's even been a bit of mathematical analysis of the results, too. But last night as the results were flowing in, R user Brock Tibert just wanted to track the results of the Massachusetts governor's race. The...

Read more »

Another lottery coincidence

November 2, 2010

Last week, the Freakonomics blog in the NYT reported that the Israeli lottery had drawn the same six numbers twice in a month. The seventh "bonus ball" was different, but still: quite a coincidence, right? Cue the quote from an expert to explain just how remarkable this is: Yitzhak Melechson, a statistics professor at the University of Tel Aviv,...

Read more »

SAS vs Open Source, ctd

November 2, 2010

Following up on the story from last week, where SAS CEO Jim Goodnight said he "hadn't noticed" competition from open-source alternatives, open-source BI vendor Pentaho's "Chief Geek" James Dixon responds: What this means is that SAS has moved from the Ignorance phase to the Ridicule phase of battling open source, they only have Fighting and Losing to go. There...

Read more »

Google TechTalk on integrating R

November 1, 2010

As noted on the Google Open Source Blog last week, R package authors Dirk Eddelbuettel and Romain Francois recently gave a presentation on R at the Googleplex, on various topics related to "bridging" R into other systems. Their 90-minute talk is available for replay on YouTube (as part of the Google TechTalks series), and you can download the slides...

Read more »