Blog Archives

Create Fashion Fingerprints with R

October 27, 2014

How do you summarize fashion? For New York Fashion Week, the New York Times used the idea of "Fashion Fingerprints", distilling a designer's collections into small fragments highlighting the palette. Here's what Marc Jacobs' current collection looks like: Click through for an interactive version where you can explore each design, and scroll down to the bottom where you can...


Rocker: Docker containers for R

October 24, 2014

If you haven't heard the buzz about Docker but you often need to spin up Linux-based VMs for testing, simulations, etc., then you should check it out. In short, Docker rocks: we use it for testing our Linux-based distros of Revolution R Open. If you want to use R and Docker together, Dirk Eddelbuettel and Carl Boettiger have made...


Explore R package connections at MRAN

October 20, 2014

Many R scripts depend on CRAN packages, and most CRAN packages in turn depend on other CRAN packages. If you install an R package, you'll also be installing its dependencies to make it work, and possibly other packages as well to enable its full functionality. My colleague Andrie posted some R code to map package dependencies a couple of...


Statistics doesn’t have to be so hard: simulate!

October 17, 2014

My second-favourite keynote from yesterday's Strata Hadoop World conference was this one, from Pinterest's John Rauser. To many people (especially in the Big Data world), Statistics is a series of complex equations, but just a little intuition goes a long way to really understanding data. John illustrates this wonderfully using an example of data collected to determine whether...


Introducing Revolution R Open and Revolution R Plus

October 15, 2014

For the past 7 years, Revolution Analytics has been the leading provider of R-based software and services to companies around the globe. Today, we're excited to announce a new, enhanced R distribution for everyone: Revolution R Open. Revolution R Open is a downstream distribution of R from the R Foundation for Statistical Computing. It's built on the R 3.1.1...


14 Reasons Why R is better than Excel

October 10, 2014

The Fantasy Football Analytics blog shares these 14 reasons why R is better than Excel for data analysis: More powerful data manipulation capabilities; Easier automation; Faster computation; It reads any type of data; Easier project organization; It supports larger data sets; Reproducibility (important for detecting errors); Easier to find and fix errors; It's free; It's open source; Advanced Statistics...


In case you missed it: September 2014 Roundup

October 8, 2014

In case you missed them, here are some articles from September of particular interest to R users. Norm Matloff argues that T-tests shouldn't be part of the Statistics curriculum and questions the "star system" for p-values in R. A nice video introduction to the dplyr package and the %>% operator, presented by Kevin Markham. An animation of police militarization...


R as a general-purpose language for creating DSLs

October 6, 2014

As a computer scientist, RStudio's Joe Cheng has some great insights into the R language and how it compares with other programming languages. In the interview with DataScience.LA below, he notes that while R is often thought of as a domain-specific language (or DSL), the combination of a functional language with deferred evaluation of functional arguments actually makes it...


New York Times approachably describes Bayesian Statistics

October 1, 2014

The New York Times published an article of interest to statisticians the other day: "The Odds, Continually Updated". Surprisingly for a general-audience newspaper, this article goes into the distinctions between Bayesian and frequentist statistics, and does so in a very approachable way. Here's an excerpt: The essence of the frequentist technique is to apply probability to data. If...


Video introduction to data manipulation with dplyr

September 29, 2014

Hadley Wickham's dplyr package is a great toolkit for getting data ready for analysis in R. If you haven't yet taken the plunge to using dplyr, Kevin Markham has put together a great hands-on video tutorial for his Data School blog, which you can see below. The video covers the five main data-manipulation "verbs" that dplyr provides: filter, select,...
