Monthly Archives: April 2010

“The next big thing”, R, and Statistics in the cloud

April 14, 2010

A friend just e-mailed me about a blog post by Dr. AnnMaria De Mars titled “The Next Big Thing”. In it Dr. De Mars wrote (I allowed myself to emphasize some parts of the text): Contrary to what some people seem to think, R is definitely not the next big thing, either. I am always surprised when people ask me why I think...

Read more »

R: parallel processing using multicore package

April 14, 2010

I have been meaning to look at adding some parallel processing to R, as I have some scripts that are painfully slow and embarrassingly parallel. There seem to be a lot of packages around for doing parallel computing, listed here. I decided to look at mul...
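As a rough illustration of what "embarrassingly parallel" means in practice, here is a minimal sketch. It assumes the `parallel` package that now ships with R (its `mclapply` descends from the multicore package) rather than multicore itself, and `slow_square` is a hypothetical stand-in for a slow, independent task; it is not taken from the post.

```r
# Sketch only: parallel::mclapply is assumed here in place of the
# multicore package. slow_square is a hypothetical example task.
library(parallel)

slow_square <- function(x) {
  Sys.sleep(0.01)  # simulate an expensive computation that depends only on x
  x^2
}

inputs <- 1:20

# mclapply relies on forking, which is unavailable on Windows,
# so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

serial_result   <- lapply(inputs, slow_square)
parallel_result <- mclapply(inputs, slow_square, mc.cores = n_cores)

# Both calls return a list of results in the same order as the inputs.
same_results <- identical(serial_result, parallel_result)
print(same_results)
```

Because each call to `slow_square` is independent of the others, swapping `lapply` for `mclapply` is usually the only change such a script needs.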

Read more »

Plotting “time of day” data using ggplot2

April 14, 2010

William asks: How can I make a graph that looks like this, “tweet density” style, showing time intervals? He then helpfully describes his input data: a CSV file with headers “time started, time finished, date”. Here’s a simple CSV file, tasks.csv:

    task,date,start,end
    task1,2010-03-05,09:00:00,13:00:00
    task2,2010-03-06,10:00:00,15:00:00
    task3,2010-03-06,11:00:00,18:00:00
    task4,2010-03-07,08:00:00,11:00:00
    task5,2010-03-08,14:00:00,17:00:00
    task6,2010-03-09,12:00:00,16:00:00
    task7,2010-03-10,14:00:00,19:00:00
    task8,2010-03-11,09:30:00,13:30:00

Read into R, calculate the
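The excerpt stops before the code, so the following is only one possible sketch of reading this data and drawing each task as a vertical interval per day. The helper `to_hours` and the choice of `geom_segment` are assumptions for illustration, not the post's own method; the plotting step runs only if ggplot2 is installed.

```r
# The rows below are copied from the tasks.csv shown in the excerpt.
csv_text <- "task,date,start,end
task1,2010-03-05,09:00:00,13:00:00
task2,2010-03-06,10:00:00,15:00:00
task3,2010-03-06,11:00:00,18:00:00
task4,2010-03-07,08:00:00,11:00:00
task5,2010-03-08,14:00:00,17:00:00
task6,2010-03-09,12:00:00,16:00:00
task7,2010-03-10,14:00:00,19:00:00
task8,2010-03-11,09:30:00,13:30:00
"
tasks <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Hypothetical helper: convert "HH:MM:SS" strings to decimal hours.
to_hours <- function(hms) {
  parts <- strsplit(hms, ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) / c(1, 60, 3600)), numeric(1))
}

tasks$date       <- as.Date(tasks$date)
tasks$start_h    <- to_hours(tasks$start)
tasks$end_h      <- to_hours(tasks$end)
tasks$duration_h <- tasks$end_h - tasks$start_h

# Plot one translucent segment per task; overlapping tasks appear darker.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(tasks, aes(x = date, xend = date, y = start_h, yend = end_h)) +
    geom_segment(alpha = 0.5) +
    scale_y_continuous(breaks = seq(0, 24, by = 3), limits = c(0, 24)) +
    labs(x = "Date", y = "Hour of day")
  print(p)
}
```

Storing the times as decimal hours keeps the y axis numeric, which avoids date-time parsing for what is purely a time-of-day scale.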

Read more »

In case you missed it: March Roundup

April 13, 2010

In case you missed them, here are some articles from last month of particular interest to R users. We reviewed a special report in The Economist on the "Data Deluge" and the growing importance of statistical analysis in business. One section mentioned R specifically. We announced that Zack Urlocker, formerly responsible for engineering and marketing for the open-source database...

Read more »

formatR: farewell to ugly R code

April 13, 2010

It is not uncommon to see messy R code that is barely human-readable, like this:

    # rotation of the word "Animation"
    # in a loop; change the angle and color
    # step by step
    for (i in 1:360) {
    # redraw the plot again and again
    plot(1,ann=FALSE,type="n",axes=FALSE)
    # rotate; use rainbow() colors
    text(1,1,"Animation",srt=i,col=rainbow(360),cex=7*i/360)
    #

Read more »

Efficient Mixed-Model Association in GWAS using R

April 13, 2010

I recently did an analysis for the eMERGE network where I had lots of individuals from a small town in central Wisconsin where many of the subjects were related to one another. The subjects could not be treated as independent, but I could not use a fam...

Read more »

Repeated measures ANOVA with R (tutorials)

April 13, 2010

Repeated measures ANOVA is a common task for the data analyst. There are (at least) two ways of performing “repeated measures ANOVA” using R, but neither is really trivial, and each has its own complications and pitfalls (explanations of and solutions to which I was usually able to find by searching the R-help mailing list). So for future reference, I am starting this page...
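For orientation, here is a minimal sketch of one commonly used approach: `aov()` with an `Error()` term for the within-subject factor. It is not necessarily one of the ways the tutorials cover, and the data, variable names and effect sizes are simulated purely for illustration.

```r
# Simulated data (made up for illustration): 10 subjects, each measured
# once under three conditions.
set.seed(1)
n_subjects <- 10
conditions <- c("A", "B", "C")

d <- expand.grid(
  subject   = factor(seq_len(n_subjects)),
  condition = factor(conditions)
)

subject_effect   <- rnorm(n_subjects, sd = 2)  # between-subject variation
condition_effect <- c(A = 0, B = 1, C = 2)     # assumed condition shifts

d$score <- 10 +
  subject_effect[as.integer(d$subject)] +
  condition_effect[as.character(d$condition)] +
  rnorm(nrow(d))

# Error(subject / condition) tells aov() that condition varies within
# subject, so its effect is tested in the within-subject stratum.
fit <- aov(score ~ condition + Error(subject / condition), data = d)
print(summary(fit))
```

One pitfall with this form is that `subject` must be a factor; if it is left as a number, `aov()` treats it as a covariate and the error strata come out wrong.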

Read more »

Cherry Picking to Generalize ~ NASA Global Temperature Trends ~ enhanced w/ ggplot2

April 12, 2010

In a prior article, I tried to visualize the linear global temperature trends for a grid of start and end years. The visual I created was confusing because the specification of the color scale was interdependent with the data values. I wanted a blue -> white -> red scale of the temperatures indicating cool ->

Read more »

Using MKL-Linked R in Eclipse

April 12, 2010

Setting up Eclipse to use MKL-Linked R

In my previous post, I showed how to compile R 2.10.1 using Intel's Math Kernel Library for the BLAS/LAPACK interface. Even though it takes a bit of time to set up, I think the noticeably improved calculation speed justifies the effort. Although I'm happy to use R from the command line for basic stuff,...

Read more »