RNA-Seq Methods & March Twitter Roundup

[This article was first published on Getting Genetics Done, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

There were lots of interesting developments this month that didn’t work their way into a full blog post. Here is an incomplete list of what I’ve been tweeting about over the last few weeks. But first I want to draw your attention to the latest manuscript for a new bioconductor package for doing RNA-seq in R.

DEXSeq vs Cuffdiff. See the pre-publication manuscript from Simon Anders, Alejandro Reyes, and Wolfgang Huber: “Detecting differential usage of exons from RNA-Seq data.” DEXSeq is an R package by the same guys who developed the DESeq R package and the HTSeq python scripts. (Incidentally, both DESeq and DEXSeq are rare examples of bioconductor vignettes which are well developed and are a pleasure to read). I often use cufflinks/cuffdiff in the bioinformatics core was because many other tools and methods only allow you to interrogate differential expression at the gene level. Using cufflinks for transcriptome assembly enables you to interrogate transcript/isoform expression, differential splicing, differential coding output, differential promoter usage, etc. DEXSeq uses similar methodology as DESeq, but can give you exon-level differential expression, without going through all the assembly business that cufflinks does. In one of the supplementary tables in their pre-pub manuscript, they compare several versions of cuffdiff to DEXSeq on two datasets. Both of these datasets had biological replicates for treatment and control conditions. They compared treatment to controls, and found DEXseq gave you more significant hits than cuffdiff. Then they compared controls to other controls (ideally should have zero hits) and found cufflinks had way more hits. See p13, p23, tables S1 and S2.

Proper comparison treatment vs control, # significant hits:
DEXSeq: 159
Cuffdiff 1.1: 145
Cuffdiff 1.2: 69
Cuffdiff 1.3: 50

Mock comparison controls vs controls, # significant hits:
DEXSeq: 8
Cuffdiff 1.1: 314
Cuffdiff 1.2: 650
Cuffdiff 1.3: 639

In the UVA Bioinformatics core we strive for reproducibility, scalability, and transparency using the most robust tools and methodology available. It gives me pause to see such alarmingly different results with each new version and each new protocol of a particular tool. What are your thoughts and experiences with using Cufflinks/Cuffdiff, DESeq/DEXSeq, or the many, many other tools for RNA-Seq (MISO, ExpressionPlot, EdgeR, RSEM, easyRNASeq, etc.)? Please share in the comments.

Everything else:

Webinar from @goldenhelixinc: Learning From Our GWAS Mistakes: From experimental design to scientific method https://t.co/KkxAn18p

Sequencing technology does not eliminate biological variability http://t.co/NI3acZgn

[bump] Questions on cutoff setting of FPKM value & know genes filtering in Cuffmerge result http://t.co/iKMZ7Dsd #bioinformatics

Very cool: DNAse-Seq+RNA-seq used to show DNaseI sensitivity eQTLs are a major determinant of gene expression variation http://t.co/nPo3xHVa

Beware using UCSC GTFs in HTSeq/CovergeBed for counting RNA-seq reads. “transcript_id” is repeated as “gene_id”! https://t.co/ADg1Pi6U

Google Scholar Metrics: top 20 journals in bioinformatics http://t.co/QYTf5pyT

A systematic eQTL study of cis-trans #epistasis in 210 HapMap individuals http://t.co/0YHlFQak

Identification of allele-specific alternative mRNA processing via RNA-seq http://t.co/fig9cLlH #bioinformatics @myen

prepub on arXiv + analysis tutorial/walkthrough + AWS EC2 AMI + git repo + ipython notebook = reproducible research done right http://t.co/GPNmpdJD

Altmetrics in the Wild: Using Social Media to Explore Scholarly Impact http://t.co/8xZuIHw9

NSF-NIH Interagency Initiative: Core Techniques and Technologies for Advancing Big Data Science and Engineering http://t.co/W3LUdCsG

New approach from @MarylynRitchie lab to collapsing/combining: using biological pathways rather than positional info http://t.co/ywWj0MNn

The Transcription Factor Encyclopedia http://t.co/DjJUwR10 Paper: http://t.co/b0J8PXO6

NF-kB: where did it come from and why? http://t.co/NLV1mBd0

Cloud BioLinux: pre-configured and on-demand #bioinformatics computing for the genomics community. @myen http://t.co/3kCE0ktH

SCOTUS remands AMP v Myriad (BRCA) patent case to CAFC to consider in light of prometheus decision http://t.co/7CkTa4l0

57 year experiment, Drosophila kept in dark for 1400 generations, many evolutionary changes (record longest postdoc!) http://t.co/wukq8fAf

IsoLasso: a LASSO regression approach to RNA-Seq based transcriptome assembly http://t.co/FN1sYM8f #bioinformatics

Complex disease genetics is complex. Imagine that. Hirschhorn, Visscher, & the usual consortium suspects: http://t.co/Bwopxlx6

MAnorm: a robust model for quantitative comparison of ChIP-Seq data sets http://t.co/pffuHIlO #bioinformatics

Paper about SEQanswers forum published in #Bioinformatics http://t.co/SUlQ6O8c

num-utils – like awk, grep, sort, cut, etc for numbers http://t.co/bgkB5FMV

Nat Protocols: Differential gene & transcript expression analysis of RNA-seq w/ TopHat & Cufflinks http://t.co/U1ZpSE7V #bioinformatics

To leave a comment for the author, please follow the link and comment on their blog: Getting Genetics Done.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)