Blog Archives

Multiple cores in R, revisited

August 10, 2011

The bigmemory package, in combination with doMC, provides at least a partial solution for sharing a large data set across multiple cores in R. With this solution, several worker processes can operate on the same matrix (doMC forks processes rather than starting threads). It is also a very scalable solution. ...
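A minimal sketch of the idea, assuming bigmemory, foreach and doMC are installed and the platform supports forking (doMC does not work on Windows). The matrix size, the core count and the names `m`, `x`, `desc` and `col_sums` are illustrative and not taken from the original post:

```r
library(bigmemory)
library(foreach)
library(doMC)

# Assumption: two cores; adjust to the machine at hand.
registerDoMC(cores = 2)

# Illustrative data: a 1000 x 4 numeric matrix.
set.seed(1)
m <- matrix(rnorm(4000), nrow = 1000, ncol = 4)

# Copy it into a big.matrix, which by default lives in shared memory.
x <- as.big.matrix(m, type = "double")

# The descriptor is a small object that tells a worker where the data lives.
desc <- describe(x)

# Each worker attaches to the same matrix and sums one column.
col_sums <- foreach(j = seq_len(ncol(x)), .combine = c) %dopar% {
  shared <- attach.big.matrix(desc)
  sum(shared[, j])
}
```

Only the descriptor is handed to the workers, so the data itself is not duplicated per process; each worker reads its column straight from the shared matrix.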

Read more »

Retrieving transcriptome sequences for RNASeq analysis

November 22, 2010

One approach for analyzing RNASeq data from an organism with a well-annotated genome is to align the reads to mRNA (cDNA) sequences instead of the genome. To do that, you need to extract the transcript sequences from a database. This is how to extract Ensembl transcript sequences from UCSC from within R:
_________________________________________________
library(GenomicFeatures)
library(BSgenome.Hsapiens.UCSC.hg18)
tr
tr_seq
write.XStringSet(tr_seq, file="hg18.ensgene.transcripts.fasta", 'fasta', width=80, append=F)
_________________________________________________
Next steps...
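The assignments to `tr` and `tr_seq` did not survive in the excerpt above, so the following is only a sketch of how the same workflow might look with current Bioconductor function names, not a reconstruction of the original code. It assumes that GenomicFeatures, Biostrings and BSgenome.Hsapiens.UCSC.hg18 are installed, that UCSC still serves the `ensGene` table for hg18, and that a network connection is available. The helper name `extract_transcripts` is hypothetical. In recent Bioconductor releases `makeTxDbFromUCSC` is provided by the txdbmaker package, so the first call may need to be adjusted:

```r
# Hypothetical helper, not from the original post.
extract_transcripts <- function(out_file = "hg18.ensgene.transcripts.fasta") {
  # Build a transcript database from the UCSC Ensembl gene track (needs network).
  # Assumption: in newer releases this may be txdbmaker::makeTxDbFromUCSC.
  txdb <- GenomicFeatures::makeTxDbFromUCSC(genome = "hg18",
                                            tablename = "ensGene")

  # Splice the exons of each transcript out of the genome sequence.
  tx_seqs <- GenomicFeatures::extractTranscriptSeqs(
    BSgenome.Hsapiens.UCSC.hg18::Hsapiens, txdb, use.names = TRUE)

  # Write the transcript sequences as FASTA, 80 characters per line.
  Biostrings::writeXStringSet(tx_seqs, filepath = out_file,
                              format = "fasta", width = 80)
  invisible(tx_seqs)
}
```

Calling `extract_transcripts()` would download the annotation and write the FASTA file to the working directory; nothing is fetched until the function is called.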

Read more »
