Blog Archives

Use plyr instead of _apply() in R

December 30, 2009

I've covered plyr once before, showing you how to get means and variances for two quantitative traits across multilocus genotypes. JD Long over at Cerebral Mastication recently posted a nice screencast illustrating how plyr "just works" as an alternative to R's family of apply commands.  There's a set of R functions (apply, sapply, lapply, tapply, eapply, and rapply) that...

Read more »

Capture system commands as R objects with system(…, intern=T)

December 28, 2009

Just discovered this very handy R command to capture the output from a system command as an R object.  I wanted to use R to read in the output from another program (PLINK) and do some processing on each output file. Of course if the files are named sequentially (plink1.out, plink2.out, plink3.out, etc.) this would be simple with a...

Read more »
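
As a minimal sketch of the idea in the teaser above: `system()` with `intern = TRUE` returns the command's standard output as a character vector, one element per line. The `echo` command and the file names below are stand-ins chosen for illustration (they assume a Unix-like shell) and are not taken from the original post.

```r
# Capture the output of a shell command as an R character vector.
# Assumption: a Unix-like shell where `echo` is available; the file
# names are hypothetical placeholders.
out <- system("echo plink1.out plink2.out plink3.out", intern = TRUE)

# `out` has one element per line of output; split that line on spaces
# to get one file name per element.
files <- strsplit(out, " ")[[1]]
print(files)
```

Spelling out `intern = TRUE` rather than `intern = T` is slightly safer, because `T` is an ordinary variable that can be reassigned.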

Browse R Graphics with the R Graph Gallery and the R Graphical Manual

December 15, 2009

One of R's biggest strengths is its unparalleled graphing capabilities.  Just see any of our previous posts on ggplot2, visualization, or other posts tagged with R. R has several fundamentally different systems for plotting, including base graphics, lattice, and ggplot2.  Furthermore, many add-on packages come with their own functions for producing problem-domain specific graphics. For example,

Read more »

Get Started with Machine Learning in R

December 1, 2009

A Beautiful WWW put together a great set of resources for getting started with machine learning in R.  First, they recommend the previously mentioned free book, The Elements of Statistical Learning.  Then there's a link to a list of dozens of machine learning and statistical learning packages for R.  Next, you'll need data.  Hundreds of free real datasets are...

Read more »

NYT: SAS threatened by R

November 23, 2009

The New York Times had an interesting piece yesterday about how SAS is facing several business threats from companies like the recently IBM-acquired SPSS, and from burgeoning interest in open-source software like R.  The NYT ran an entire article about R earlier this year, and this article discusses how SAS has been revamping their technology to work seamlessly with...

Read more »

Seminar: Reproducible Research with R, LaTeX, & Sweave

November 16, 2009

Theresa Scott, instructor of the previously mentioned R workshop and weekly R clinic, is giving a lecture entitled "Reproducible Research with R, LaTeX, & Sweave" in MRB III, room 1220, this Wednesday 11/18 at 1:30.  You can see more details about the lecture here. Looks like her slides as well as much more introductory material on R, LaTeX, and Sweave...

Read more »

QQ plots of p-values in R using ggplot2

November 9, 2009

Way back, Will wrote on this topic.  See his previous post for Stata code for doing this.  Unfortunately the R package that was used to create QQ-plots here has been removed from CRAN, so I wrote my own using ggplot2 and some code I received from Daniel Shriner at NHGRI. Of course you can use R's built-in qqplot() function, but...

Read more »
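
For readers who just want the gist of a p-value QQ plot, here is a rough sketch in base R. It is not the ggplot2 code from the post: it uses simulated p-values (an assumption for illustration only) and plots the sorted observed -log10(p) against the quantiles expected under a uniform null.

```r
# Simulated p-values under the null; illustration only, not real data.
set.seed(1)
p <- runif(1000)

n <- length(p)
observed <- -log10(sort(p))      # smallest p-value gives the largest value
expected <- -log10(ppoints(n))   # expected quantiles of Uniform(0, 1)

# Points should fall near the diagonal when the p-values are uniform.
plot(expected, observed,
     xlab = "Expected -log10(p)",
     ylab = "Observed -log10(p)")
abline(0, 1)
```

Systematic departure above the diagonal across the whole range usually suggests inflation of the test statistics, while departure only in the upper tail is what a handful of true associations would look like.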

Split, apply, and combine in R using PLYR

November 4, 2009

While playing around with the previously mentioned ggplot2, I came across an incredibly useful set of functions in the plyr package, made by Hadley Wickham, the same guy behind ggplot2.  If you've ever used MySQL before, think of "GROUP BY", but here you can arbitrarily apply any R function to splits of the data, or write one yourself. Imagine you have...

Read more »
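
To make the "GROUP BY" analogy concrete, here is a small split-apply-combine sketch in base R, so that it runs without plyr installed. The data frame, its column names (`genotype`, `trait`), and the values are made up for illustration; the commented-out `ddply()` call shows what the plyr version would look like, assuming the package is available.

```r
# Hypothetical toy data: one quantitative trait measured across genotypes.
df <- data.frame(
  genotype = c("AA", "AA", "Aa", "Aa", "aa", "aa"),
  trait    = c(1, 3, 2, 4, 5, 7)
)

# Split: one data frame per genotype.
pieces <- split(df, df$genotype)

# Apply: summarise each piece with any R function you like.
summaries <- lapply(pieces, function(d) {
  data.frame(
    genotype   = d$genotype[1],
    mean_trait = mean(d$trait),
    var_trait  = var(d$trait)
  )
})

# Combine: stack the per-group summaries back into one data frame.
result <- do.call(rbind, summaries)
print(result)

# With plyr installed, the same three steps collapse into one call:
# plyr::ddply(df, "genotype", plyr::summarise,
#             mean_trait = mean(trait), var_trait = var(trait))
```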

Visualizing sample relatedness in a GWAS using PLINK and R

October 9, 2009

Strict quality control procedures are extremely important for any genome-wide association study.  One of the first steps you should take when running QC on your GWAS is to look for related samples in your dataset.  This does two things for you.  First, you can get an idea of how many related samples you have in your dataset, and second,...

Read more »

R Commander: A Basic Statistics GUI for R

October 6, 2009

R is a great tool with lots of resources for genetics, genome-wide association studies, and many other biological applications.  We've covered several places to find help in R in the past, but if you're still apprehensive about diving into R's command-line interface, fear not.  R Commander is a graphical user interface (GUI) for R that works under Windows,...

Read more »