Monthly Archives: April 2014

Why are R users so damn Stingy?!

April 26, 2014

YES R! Looking at rapporter's recent blog post on "R activity Around the World", I am shocked by how few users actually support the R Foundation financially. Looking at the US, for instance, there are only 27 donors representing a little more than 0....

AISTATS 2014 / MLSS tutorial

April 25, 2014

Here are the slides of the tutorial on ABC methods I gave yesterday at both AISTATS 2014 and MLSS. (I actually gave a tutorial at another MLSS a few years ago, on the pretty island of Berder in Brittany, next to Vannes.) They are definitely similar to previous talks and tutorials I delivered on this

Making Better DNS TXT Record Lookups With Rcpp

April 25, 2014

Technically this is Part 2 of Firewall-busting ASN-lookups. However, I said (in Part 1) that Part 2 would be about making a vectorized version and this is absolutely not about that. Rather than fib, I merely misdirect. Moving on… As you can see in Part 1, we have to resort to a system() call to do the TXT record lookup...

Stats in bed, part 1: Ubuntu Touch

April 25, 2014

Round at the RSS Statistical Computing committee, we were having a chuckle at the prospect of a meeting about Stats In Bed. By which I mean analysis on mobile devices, phones and tablets (henceforth phablets), not some sort of raunchy …

NYT uses R to forecast Senate elections

April 25, 2014

Nate Silver's departure to relaunch FiveThirtyEight.com left a bit of a hole at the New York Times, which The Upshot — the new data journalism practice at the Times — seeks to fill. And they've gotten off to a great start with the new Senate forecasting model, called Leo. Leo was created by Amanda Cox (longtime graphics editor at...

Example 2014.5: Simple mean imputation

April 25, 2014

We're both users of multiple imputation for missing data. We believe it is the most practical principled method for incorporating the most information into data analysis. In fact, one of our more successful collaborations is a review of software for ...

MapReduce with R on Hadoop and Amazon EMR

April 25, 2014

You all know why MapReduce is fancy – so let’s just jump right in. I like researching data and I like to see results fast – does that mean I enjoy the process of setting up a Hadoop cluster? No, …

Shout out to "R Handles Big Data"

April 24, 2014

Searching for Significance: I just wanted to give a shout-out to Bob Muenchen's exceptional post from a little over a year ago, entitled "R Handles Big Data", on his excellent blog r4stats.com.

Bandit Formulations for A/B Tests: Some Intuition

April 24, 2014

“Controlled experiments embody the best scientific design for establishing a causal relationship between changes and their influence on user-observable behavior.” – Kohavi, Henne, Sommerfeld, “Practical Guide to Controlled Experiments on the Web” (2007)

A/B tests are one of the simplest ways of running controlled experiments to evaluate the efficacy of a proposed improvement (a new

R Helps With Employee Churn

April 24, 2014

by Joseph Rickert

Pasha Roberts, Chief Scientist at Talent Analytics, is writing a series of articles on Employee Churn for the Predictive Analytics Times that comprise a really instructive and valuable example of using R to do some basic predictive modeling. So far, Pasha has published Employee Churn 201 in which he makes a case for the importance of...
