One approach to analyzing RNA-Seq data from an organism with a well-annotated genome is to align the reads to mRNA (cDNA) sequences instead of to the genome. To do that, you first need to extract the transcript sequences from a database. This is how to extract Ensembl transcript sequences from UCSC from within R:
_________________________________________________
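library(GenomicFeatures)
library(BSgenome.Hsapiens.UCSC.hg18)
# Build a transcript database from the UCSC "ensGene" (Ensembl genes) table.
# Assumes the older GenomicFeatures API that matches write.XStringSet() below.
tr <- makeTranscriptDbFromUCSC(genome="hg18", tablename="ensGene")
# Splice each transcript's exons out of the hg18 genome sequence.
tr_seq <- extractTranscriptsFromGenome(Hsapiens, tr)
# Write the transcript sequences to a FASTA file, 80 bases per line.
write.XStringSet(tr_seq, file="hg18.ensgene.transcripts.fasta", format="fasta", width=80, append=FALSE)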
_________________________________________________
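Newer Bioconductor releases renamed the functions used for this task, so the same steps may need to be written differently on a current installation. The following is a minimal sketch under that assumption, not a tested recipe: it uses `makeTxDbFromUCSC()`, `extractTranscriptSeqs()` and `writeXStringSet()`, and assumes that UCSC still serves the `ensGene` table for hg18. The variable names `txdb` and `tx_seq` are illustrative. In the most recent releases `makeTxDbFromUCSC()` may be provided by the txdbmaker package rather than GenomicFeatures.

```r
library(GenomicFeatures)
library(BSgenome.Hsapiens.UCSC.hg18)
library(Biostrings)

# Build a TxDb object from the UCSC "ensGene" (Ensembl genes) table.
# Requires network access; assumes the table is still available for hg18.
txdb <- makeTxDbFromUCSC(genome = "hg18", tablename = "ensGene")

# Extract the spliced transcript sequences from the genome.
# use.names = TRUE labels each sequence with its transcript name.
tx_seq <- extractTranscriptSeqs(Hsapiens, txdb, use.names = TRUE)

# Write the sequences to a FASTA file, 80 bases per line.
writeXStringSet(tx_seq,
                filepath = "hg18.ensgene.transcripts.fasta",
                format = "fasta",
                width = 80)
```

The resulting FASTA file can then be indexed by a read aligner in the same way as the file produced above.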