Blog Archives

Simpson’s Paradox in a nutshell

April 22, 2014

Norm Matloff points us to a pithy example that sums up Simpson's Paradox perfectly, captured in the title of a medical paper: "Good for Women, Good for Men, Bad for People". He explains how Simpson's Paradox isn't a paradox at all, but just the consequence of including a minor variable in a model ahead of a more significant variable,...

Read more »

Webinar: Big-Data Trees for R

April 21, 2014

If you missed last week's webinar presented by Revolution Analytics' US Chief Scientist Mario Inchiosa, "Decision Trees built in Hadoop plus more Big Data Analytics with Revolution R Enterprise", the slides and webinar replay are now available for download. The webinar includes a demo of building decision trees and regression trees in Revolution R Enterprise, and using the Tree...

Read more »

R and the weather in the local news

April 18, 2014

The Mountain View Voice is a weekly newspaper serving the Silicon Valley area, and is a familiar sight to anyone wandering the streets of Palo Alto or Menlo Park. Angela Hey writes for 'Hey Tech!', an online blog of the Voice, and has just published a feature on R and the local Bay Area User Group (BARUG). It includes...

Read more »

DM Radio on Data Science

April 18, 2014

A couple of weeks ago, I participated in a panel discussion for DM Radio: "Still Sexy? How's that Data Scientist Gig Working Out?". The title was provocative, but the discussion mostly revolved around the rise of data science and how advanced analytics (often implemented with R) is changing the way many companies do business today. Also on the panel...

Read more »

Why writing vectorized code in R is a good idea

April 16, 2014

As a language for statistical computing, R has always had a bias towards linear algebra, and is optimized for operations on whole vectors and matrices. This can be surprising to programmers coming to R from lower-level languages, where iterative programming (looping over the elements of a vector or matrix) is more natural and often more efficient. That's not...

Read more »

Interfacing R with Web technologies

April 14, 2014

A new Task View on CRAN will be of interest to anyone who needs to connect R with Web-based applications. The Web Technologies and Services Task View lists R functions and packages for reading data from websites (via public APIs or by scraping data from HTML pages); for interfacing with Cloud-based platforms (including AWS); for authenticating and accessing data from social...

Read more »

Create an impressionist self-portrait from your Twitter followers

April 11, 2014

Here's something fun you can do with R and its interface to Twitter, the twitteR package. An R script by CMU student Mark Patterson downloads your Twitter profile picture, counts the number of Twitter followers you have, and then creates a pointillist version of your profile picture with as many dots as you have followers. Here's mine: Note that...

Read more »

R 3.1.0 "Spring Dance" is released

April 10, 2014

As announced this morning on the mailing list, R 3.1.0 (codenamed "Spring Dance") has been released. The source code is available now; as of this writing, binary versions haven't yet appeared on CRAN or propagated to the mirrors, but I expect they'll be available in a day or two. You can check out the full list of changes in the...

Read more »

Animated Choropleths in R

April 9, 2014

Ari Lamstein has updated his choroplethr package with a new capability for creating animated data maps. I can't embed the animated version here, but click the image below to see an animation of US counties by average household income, from the richest to the poorest by percentile. (The code behind the animation is available on GitHub.) The choroplethr package...

Read more »

In case you missed it: March 2014 roundup

April 7, 2014

In case you missed them, here are some articles from March of particular interest to R users: Francis Smart offers five excellent reasons to use R, and notes that R is the top Google Search for statistical software. Revolution Analytics is offering R training for SAS users in Singapore and online. The number of R user groups worldwide continues...

Read more »