Blog Archives

Efficient Mixed-Model Association eXpedited (EMMAX) to Simultaneously Account for Relatedness and Stratification in Genome-Wide Association Studies

June 9, 2010

A few months ago I covered an algorithm called EMMA (Efficient Mixed-Model Association), implemented in R, that simultaneously corrects for both population stratification and relatedness in an association study. This method/software is very useful because ...

Use SQL queries to manipulate data frames in R with sqldf package

May 25, 2010

I've covered a few topics in the past including the plyr package, which is kind of like "GROUP BY" for R, and the merge function for merging datasets. I only recently found the sqldf package for R, and it's already one of the most useful packages I've ever installed. The main function in the package is sqldf(), which takes...

Tutorial: Principal Components Analysis (PCA) in R

May 20, 2010

I found this tutorial by Emily Mankin on how to do principal components analysis (PCA) using R. It has a nice example with R code and several good references. The example starts by doing the PCA manually, then uses R's built-in prcomp() function to do the s...

Using R, LaTeX, and Sweave for Reproducible Research: Handouts, Templates, & Other Resources

May 13, 2010

Several readers emailed me or left a comment on my previous announcement of Frank Harrell's workshop on using Sweave for reproducible research asking if we could record the seminar. Unfortunately we couldn't record audio or video, but take a look a...

Sweave for Reproducible Research and Beautiful Statistical Reports

May 11, 2010

Frank Harrell, chair of the Biostatistics department here at Vanderbilt, is giving a seminar entitled "Sweave for Reproducible Research and Beautiful Statistical Reports" tomorrow, Wednesday, May 12, 1:30-2:30pm, in the MRBIII Conference Room 1220. This tutorial covers the basics of Sweave and shows how to enhance the default output in various ways by using: latex methods for converting R...

R Package ‘rms’ for Regression Modeling

May 11, 2010

If you attended Frank Harrell's Regression Modeling Strategies course a few weeks ago, you got a chance to see the rms package for R in action. Frank's rms package does regression modeling, testing, estimation, validation, graphics, prediction, and ty...

Mixed linear model approach adapted for genome-wide association studies

May 6, 2010

A few weeks ago I covered an R package for efficient mixed model regression that is capable of simultaneously accounting for both population stratification and relatedness to compute unbiased estimates of standard errors and p-values for genetic associ...

Top 10 Algorithms in Data Mining

April 23, 2010

The authors here invited ACM KDD Innovation Award and IEEE ICDM Research Contributions Award winners to each nominate up to 10 best-known algorithms in data mining, including the algorithm name, justification for nomination, and a representative public...

Efficient Mixed-Model Association in GWAS using R

April 13, 2010

I recently did an analysis for the eMERGE network involving lots of individuals from a small town in central Wisconsin, many of whom were related to one another. The subjects could not be treated as independent, but I could not use a fam...

ProbABEL – R package for GWAS data imputation

April 6, 2010

I've been using GenABEL for some time now for GWAS analysis using related individuals. It has an excellent set of functions for estimating a kinship matrix from a dense marker panel and then using this in a linear mixed effects model to allow for relat...
