Note for DEseq2 time course analysis

[This article was first published on One Tip Per Day, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In many cases, we need to perform differential expression across the time course data, e.g. finding genes that react in a condition-specific manner over time, compared to a set of baseline samples. DEseq2 has such an implementation for time-course experiments

There are a number of ways to analyze time-series experiments, depending on the biological question of interest. In order to test for any differences over multiple time points, once can use a design including the time factor, and then test using the likelihood ratio test as described in the following section, where the time factor is removed in the reduced formula. For a control and treatment time series, one can use a design formula containing the condition factor, the time factor, and the interaction of the two. In this case, using the likelihood ratio test with a reduced model which does not contain the interaction terms will test whether the condition induces a change in gene expression at any time point after the reference level time point (time 0). An example of the later analysis is provided in our RNA-seq workflow.

Below is the example in DEseq workflow using LRT to test the interaction term (e.g. any condition-specific changes at any timepoints after time 0):

http://master.bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#time-course-experiments

library(“fission”) data(“fission”) ddsTC <- DESeqDataSet(fission, ~ strain + minute + strain:minute)

ddsTC <- DESeq(ddsTC, test=“LRT”, reduced = ~ strain + minute) resTC <- results(ddsTC)

Several notes from the DEseq2 forum:

  1. By default, the result() function will return the LRT test p-value and MLE log2FC for the difference between Mut vs WT at the last timepoints, controlling for baseline
  2. To get the log2FC for the difference between Mut vs WT at a different timepoint, you have to manually specify it, e.g. “strainmut.minute15” is the difference between Mut vs WT at minute 15, controlling for baseline. 
  3. To generate the tables of log2 fold change of 60 minutes vs 0 minutes for the WT strain would be results(dds, name=”minute_60_vs_0″); 
  4. To generate the tables of log2 fold change of 60 minutes vs 0 minutes for the mut strain would be the sum of the WT term above and the interaction term which is an additional effect beyond the effect for the reference level (WT): results(dds, contrast=list(c(“minute_60_vs_0″,”strainmut.minute60”))
  5. “strainmut.minute15” is the difference between Mut vs WT at minute 15, controlling for baseline. If you add “strain_mut_vs_wt” to this, you get the LFC for Mutant vs WT at minute 15, not controlling for baseline. So the second one is the observed difference at minute 15 between the two groups (because you added in the change that was present at time=0).

Two kinds of hypothesis tests in DEseq2:

Wald test: to test if the estimated standard error of a log2 fold change is equal to zero

LRT (likelihood ratio test) between a full model and a reduced model: to test if the increased likelihood of the data using the extra terms in the full model is more than expected if those extra terms are truly zero.


To leave a comment for the author, please follow the link and comment on their blog: One Tip Per Day.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)