Articles by David Ruau

Large correlation in parallel

February 24, 2013 | David Ruau

A small improvement to the bigcor function proposed on Rmazing for computing huge correlation matrices in R: I made the function work in parallel, using all the CPU cores available on the machine. The code is here. Here is a benchmark of the 2 func...
[Read more...]

Air quality analysis from Beijing Twitter feed.

January 14, 2013 | David Ruau

As air pollution in Beijing reaches a new high [NYT article], I re-ran the analysis I put online a few months ago. "Crazy bad" is a good description when it reaches those levels. But I am sure there are other places, like Mexico City, LA, etc., that also look as dramatic ... [Read more...]

Computing an empirical pFDR in R

December 21, 2012 | David Ruau

The positive false discovery rate (pFDR) has become a classical procedure to test for false positives. It is one of my favourites because it relies on a re-sampling approach. I base my implementation on John Storey's PNAS paper and the technical report he published with Rob Tibshirani while at Stanford [1... [Read more...]

Religious restrictions index: how do countries compare?

September 21, 2012 | David Ruau

Yesterday the Guardian DataBlog published an interesting article that graphically explores religious intolerance across the world. The data come from a report published by the Pew Research Center's Forum on Religion and Public Life. I like the DataBlog philosophy a lot: providing the raw data for everyone to look at. ... [Read more...]

Twitter analysis of air pollution in Beijing

July 31, 2012 | David Ruau

One of the air pollution detection machines in Beijing (at the American Embassy) is connected to Twitter and tweets about the air quality in real time. By default the machine in Beijing outputs the 24hr summary PM2.5 air pollution information. PM2.5 is defined here. Next will be to ... [Read more...]

Rcpp vs. R implementation of cosine similarity

June 9, 2012 | David Ruau

While speeding up some code the other day for a project with a colleague, I ended up trying Rcpp for the first time. I re-implemented the cosine distance function using RcppArmadillo relatively easily, using bits and pieces of code I found scattered around the web. But the speed increase ... [Read more...]

Using R to graph a subject trend in PubMed

May 15, 2012 | David Ruau

The traditional way to show an audience that your topic is worth studying is to present the state of the field based on a literature review. This is especially true if your subject is obscure to all but a handful of scientists in the world. I was confronted ... [Read more...]
