Blog Archives

Fast(ish) extraction of exon locations from a BED12 file using data.table

March 20, 2011

Here is a fast R function to extract exon locations from a BED12 file. Note that "fast" is a relative term: the function below is fast enough for me, but it may not be fast enough for others :) Anyway, a BED12 file typically has locations of genomic features (t...

Read more »

Applying functions on groups: sqldf, plyr, doBy, aggregate or data.table ?

March 17, 2011

Which of the sqldf, plyr, doBy and aggregate functions/packages is fastest for applying functions on groups of rows? I was wondering about this earlier in this post. It seems sqldf would be the fastest, according to a post in manipulatr m...

Read more »

Tips on installing R extension for Rapidminer on Mac OS X

March 9, 2011

Rapidminer is a cool toy to play with machine-learning/data-mining algorithms and it can interface with R. However, it was a bit problematic for me to get the R extension working properly on Mac OS X Leopard for R 2.11. Here is what works for me at the...

Read more »

Calling BEDtools from R

February 22, 2011

The BEDtools suite provides command-line functionality for genomic-coordinate-based operations, such as overlapping BED files or getting the coverage of a BED file over a genome (similar, but not exactly the same, functionality in R is provided by IRange...

Read more »

Access all UCSC wiggle tracks from R and your terminal

February 21, 2011

The rtracklayer package allows you to access most of the UCSC wiggle tracks from R. However, there is another way, which might be more practical in situations where you need to summarize the wig track scores over a given set of genomic coordinates. Although yo...

Read more »

Utilizing multiple cores in R

February 8, 2011

There are a couple of options in R if you want to utilize multiple cores on your machine. These days my favorite is the doMC package, which depends on the foreach and multicore packages. In the section below, the square root of each number is calculated in parallel...

Read more »