Blog Archives

In case you missed it: September 2014 Roundup

October 8, 2014

In case you missed them, here are some articles from September of particular interest to R users. Norm Matloff argues that t-tests shouldn't be part of the statistics curriculum and questions the "star system" for p-values in R. A nice video introduction to the dplyr package and the %>% operator, presented by Kevin Markham. An animation of police militarization...

R as a general-purpose language for creating DSLs

October 6, 2014

As a computer scientist, RStudio's Joe Cheng has some great insights into the R language and how it compares with other programming languages. In the interview with DataScience.LA below, he notes that while R is often thought of as a domain-specific language (or DSL), the combination of a functional language with deferred evaluation of function arguments actually makes it...

New York Times approachably describes Bayesian Statistics

October 1, 2014

The New York Times published an article of interest to statisticians the other day: "The Odds, Continually Updated". Surprisingly for a general-audience newspaper, this article goes into the distinctions between Bayesian and frequentist statistics, and does so in a very approachable way. Here's an excerpt: The essence of the frequentist technique is to apply probability to data. If...

Video introduction to data manipulation with dplyr

September 29, 2014

Hadley Wickham's dplyr package is a great toolkit for getting data ready for analysis in R. If you haven't yet taken the plunge and started using dplyr, Kevin Markham has put together a great hands-on video tutorial for his Data School blog, which you can see below. The video covers the five main data-manipulation "verbs" that dplyr provides: filter, select,...

Police militarization in the US, over time

September 26, 2014

The militarization of local police departments here in the US has been much in the news lately, and in June the New York Times published an in-depth article on how materiel from wars has ended up in the hands of US counties. Besides the traditional reporting, it's a fantastic piece of data journalism: the Times submitted a freedom-of-information request...

Become an effective data hacker with the R-Hadoop stack

September 24, 2014

In discussion with several data scientists, Will Stanton (a data scientist with Return Path) learned that a common concern is: what software should I be using? There are many options out there, but what is the best platform to be an effective "data hacker"? Will recommends using a technology stack with R and Hadoop, which allows data scientists "to...

Around the world in 80k miles

September 22, 2014

You're probably familiar with the classic Travelling Salesman problem: given (say) 20 cities, what is the shortest route you can take that passes through all 20 cities and returns to the starting point? It's a difficult problem to solve, because finding the minimum by brute force means trying all possible routes, and there are a LOT of possibilities. For a...
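
To make the brute-force idea concrete, here is a minimal Python sketch (not code from the original post) that tries every closed tour through a handful of cities and keeps the shortest. The city names, their flat-plane coordinates, and the helper names `tour_length`, `brute_force_tsp` and `count_tours` are all hypothetical, chosen only for illustration.

```python
import itertools
import math

# Hypothetical cities on a flat plane (illustrative coordinates only).
CITIES = {
    "A": (0, 0),
    "B": (0, 3),
    "C": (4, 3),
    "D": (4, 0),
}


def tour_length(order, coords):
    """Length of the closed tour that visits the cities in `order` and returns to the start."""
    n = len(order)
    return sum(
        math.dist(coords[order[i]], coords[order[(i + 1) % n]])
        for i in range(n)
    )


def brute_force_tsp(coords):
    """Try every tour that starts at the first city; return (best_order, best_length)."""
    start, *rest = list(coords)
    best_order, best_length = None, math.inf
    for perm in itertools.permutations(rest):
        order = (start, *perm)
        length = tour_length(order, coords)
        if length < best_length:
            best_order, best_length = order, length
    return best_order, best_length


def count_tours(n):
    """Number of distinct closed tours through n cities, ignoring direction of travel."""
    return math.factorial(n - 1) // 2


if __name__ == "__main__":
    print(brute_force_tsp(CITIES))
    print(count_tours(20))
```

Fixing the starting city and ignoring the direction of travel leaves (n - 1)!/2 distinct tours. For 20 cities that is 19!/2, roughly 6 × 10^16, which is why exhaustive search stops being practical well before 20 cities.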

Webinar September 25: Data Science with R

September 19, 2014

A quick heads up that if you'd like to get a great introduction to doing data science with the R language, Joe Rickert will be giving a free webinar next Thursday, September 25: Data Science with R. Regular readers of the blog will be familiar with Joe's posts on this topic. A few recent examples include posts on comparing...

Applications of R presentations at DataWeek

September 17, 2014

I'm speaking at the DataWeek conference in San Francisco today. My talk follows Skylar Lyon from Accenture — I'm really looking forward to hearing how he uses Revolution R Enterprise with Teradata Database to run R in-database with 400 million rows of data. Update: Here are Skylar's slides. The slides for my talk on other companies' applications of R...

New members for R-core and R Foundation

September 16, 2014

The R Foundation for Statistical Computing, the Vienna-based non-profit organization that oversees the R Project, has just added several new "ordinary members". (Ordinary members participate in R Foundation meetings and provide guidance to the project.) The new members are: Dirk Eddelbuettel, Torsten Hothorn, Marc Schwartz, Hadley Wickham, Achim Zeileis, Martin Morgan, and Michael Lawrence. The R Core group,...
