Monthly Archives: July 2013

user2013: The Rcpp tutorial

July 9, 2013

I’m at useR! 2013, and this morning I attended Hadley Wickham and Romain Francois’s tutorial on the Rcpp package for calling C++ code from R. I’ve spent the last eight years avoiding C++ after having nightmares about obscure pointer bugs, so I went into the room slightly skeptical about this package. I think the most

Read more »

X+1 uses Revolution R Enterprise for Marketing Optimization

July 9, 2013

In a recent article at Statistics View, Lillian Pierson describes how the X+1 Origin Digital Marketing Hub helps companies like JP Morgan Chase and Verizon optimize their marketing efforts. Back in 2011, X+1 saw the need to update their analytics platform to deal with increasing data sizes and to serve the increasingly sophisticated needs of their marketing clients: What...

Read more »

A Rough Guide to Data Science

July 9, 2013

If Big Data was last year's buzzword, Data Science may reach the same level of hype this year. There's no shortage of discussion about the high demand for data scientists, the term's usefulness as a designation, and even declarations of its "sexiness" as a career. And as with many terms that reach a critical mass on social media, data...

Read more »

For faster R use OpenBLAS instead: better than ATLAS, trivial to switch to on Ubuntu

July 9, 2013

R speeds up when the Basic Linear Algebra Subprograms (BLAS) library it uses is well tuned. The reference BLAS that comes with R and Ubuntu isn’t very fast. On my machine, it takes 9 minutes to run a well-known R …

Read more »

Exploratory Data Analysis: Conceptual Foundations of Histograms – Illustrated with New York’s Ozone Pollution Data

Introduction: Continuing my recent series on exploratory data analysis (EDA), today’s post focuses on histograms, which are very useful plots for visualizing the distribution of a data set. I will discuss how histograms are constructed and use histograms to assess the distribution of the “Ozone” data from the built-in “airquality” data set in R. In

Read more »

Unicode Tips in Python 2 and R

July 9, 2013

Most of the time, I don’t need to deal with different encodings at all. When possible, I use ASCII characters. And when there is a little processing of Chinese characters or other Unicode characters, I use .NET languages or JVM languages, in which every string is Unicode and, of course, when the characters are displayed they are displayed as characters...

Read more »

googleVis tutorial at useR!2013

July 9, 2013

Today Diego and I will give our googleVis tutorial at useR!2013 in Albacete, Spain. We will cover: introduction and motivation, Google Chart Tools, the R package googleVis, concepts of googleVis, case studies, and googleVis on shiny.

Read more »

A possibility for use R and Hadoop together

July 8, 2013

(This article was first published on Milano R net, and kindly contributed to R-bloggers) As mentioned in the previous article, a possibility for dealing with some Big Data problems is to integrate R within the Hadoop ecosystem. Therefore, it's necessary to have a bridge between the two environments. It means that R should be capable of handling data the...

Read more »

Modeling Residential Electricity Usage with R – Part 2

July 8, 2013

(This article was first published on Commodity Stat Arb, and kindly contributed to R-bloggers) I can’t believe it has been nearly 6 months since I last posted. Given the sustained heat it seemed like a good idea to finish off this subject. As hinted at in my last post, temperature is the missing variable to make sense of Residential electrical...

Read more »

Another view of ordinary regression

July 8, 2013

This is something I’ve been meaning to write for ages. My formal training for most things is limited. Like a lot of folks, I’m an autodidact. This is good in that I’m always learning and always studying those things that I enjoy. At the same time, it means that I take in information in a

Read more »