Making DE Gene Lists with freeCount
Overview
Differential expression (DE) analysis is used to identify genes driving the patterns of variation associated with groups of samples.
Curated lists of DE genes can be made with the freeCount differential expression analysis (DA) app.
Learning Goals
- Be able to perform comparisons to identify DE genes with edgeR
- Become comfortable with setting false discovery rate (FDR) and log fold change (LFC) thresholds to filter DE results
- Learn how to prepare DE results for downstream analysis (functional, network, set operations)
edgeR
The R package edgeR normalizes count data with the trimmed mean of M-values (TMM) method and then tests for DE using negative binomial models.
TMM normalization accounts for differences in library size (sequencing depth) and RNA composition among samples.
The normalized data are then used to estimate per-gene fold changes and to perform the DE analysis.
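The edgeR steps described above can be sketched directly in R. This is a minimal sketch on simulated counts, not the code the freeCount DA app runs: the sample names, the two-group layout, and the choice of `exactTest()` as the testing function are assumptions made for illustration. It requires the edgeR package from Bioconductor.

```r
library(edgeR)

set.seed(1)
# Simulated counts: 200 genes x 6 samples (hypothetical data, not the Tribolium set)
counts <- matrix(
  rnbinom(200 * 6, mu = 100, size = 10),
  nrow = 200,
  dimnames = list(
    paste0("gene", 1:200),
    c("cntrl1", "cntrl2", "cntrl3", "treat1", "treat2", "treat3")
  )
)
group <- factor(c("cntrl", "cntrl", "cntrl", "treat", "treat", "treat"))

y <- DGEList(counts = counts, group = group)

# Drop genes with too few reads to be tested reliably
keep <- filterByExpr(y)
y <- y[keep, , keep.lib.sizes = FALSE]

# TMM normalization factors, then negative binomial dispersion estimates
y <- calcNormFactors(y, method = "TMM")
y <- estimateDisp(y)

# Compare treat against cntrl and collect the full results table
et <- exactTest(y, pair = c("cntrl", "treat"))
res <- topTags(et, n = Inf)$table

# Normalized counts per million, using the TMM-adjusted library sizes
norm_counts <- cpm(y)

head(res)
```

The results table has one row per gene with `logFC`, `logCPM`, `PValue`, and `FDR` columns, which are the values that the FDR and LFC thresholds act on later in the tutorial.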
Before Starting
The exercise in this tutorial uses the freeCount apps in RStudio on a personal computer. Make sure that R and RStudio are downloaded, installed, and up to date on your personal computer.
For Windows users, additionally install RTools.
It is also possible to run the freeCount apps online through Posit Cloud. To see how, check out the freeCount Bioinformatics Analysis Apps on Posit Cloud tutorial.
Input Data
- Download the Tribolium counts file
- Download the Tribolium design file
Tip! Right click and select Save As… to download the above files in csv format.
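Before uploading, it can help to confirm that the two files agree with each other. The sketch below assumes a counts table with genes in rows and samples in columns, and a design table with one row per sample. The column names `sample` and `group` and the miniature inline tables are hypothetical stand-ins; the downloaded files may be laid out differently.

```r
# Hypothetical miniature tables standing in for the downloaded CSV files
counts_csv <- "gene,cntrl1_4h,cntrl2_4h,treat1_4h,treat2_4h
geneA,10,12,30,28
geneB,0,1,0,2
geneC,250,240,260,255"

design_csv <- "sample,group
cntrl1_4h,cntrl.4h
cntrl2_4h,cntrl.4h
treat1_4h,treat.4h
treat2_4h,treat.4h"

# For real files, replace text = ... with the path to each CSV file
counts <- read.csv(text = counts_csv, row.names = 1, check.names = FALSE)
design <- read.csv(text = design_csv, stringsAsFactors = FALSE)

# Every column of the counts table should match a sample in the design, in order
samples_match <- identical(colnames(counts), design$sample)

# Raw counts should be whole numbers, not normalized values
all_integer <- all(as.matrix(counts) %% 1 == 0)

samples_match
all_integer
```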
Example Data
In this lesson we will be using data from a study of the effects of ultraviolet radiation (UVR) on the larvae of the red flour beetle titled “Digital gene expression profiling in larvae of Tribolium castaneum at different periods post UV-B exposure”.

UVR is common to many environments and varies widely in its intensity and composition, such as differing ratios of UV-A and UV-B radiation. The different forms of UVR have distinct and frequently harmful effects on organisms and biological systems.
Study Design
There are two factors for each sample, and within each of these factors are two levels:
- The condition factor has the levels of cntrl and treat
- The time factor has the levels of 4h and 24h
We can group the samples using the levels of each factor, then compare the expression levels of genes between those groups to identify DE genes.
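One way to picture this grouping is to combine the two factors into a single group label per sample. The sketch below uses base R's `interaction()`. The sample names are hypothetical and only four samples are shown, whereas the actual design file will list every replicate.

```r
# Hypothetical design table with one sample per condition and time combination
design <- data.frame(
  sample = c("cntrl1_4h", "treat1_4h", "cntrl1_24h", "treat1_24h"),
  condition = factor(c("cntrl", "treat", "cntrl", "treat"),
                     levels = c("cntrl", "treat")),
  time = factor(c("4h", "4h", "24h", "24h"), levels = c("4h", "24h"))
)

# Combine the two factors into one group label such as treat.4h
design$group <- interaction(design$condition, design$time)

levels(design$group)
# "cntrl.4h" "treat.4h" "cntrl.24h" "treat.24h"
```

Any two of these four groups can then be selected as a comparison, for example treat.4h against cntrl.4h.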

Start the Analysis App
The following steps show you how to get and start running the freeCount differential expression analysis (DA) app.
- Download the freeCount R Shiny applications
  - Go to https://github.com/ElizabethBrooks/freeCount
  - Click the green < > Code button
  - Click Download ZIP
  - Extract the freeCount-main directory
- Navigate to the apps directory
- Open the DA.R file in RStudio
- Click Install on the yellow banner to install the necessary R packages (or run the code on lines 10 to 19)
- Click the Run App button in the upper right corner of the source pane
Analysis Process
Perform the following steps to make a list of DE genes that can then be used in a downstream analysis (e.g., functional).
1. Upload the data and click Run Analysis
2. Review the initial settings on the Analysis tab
3. Select the sample groups to compare and click Analyze
4. Explore the filtered and normalized data on the Data Normalization tab
5. Inspect the patterns of variation among samples shown in the clustering plots on the Data Exploration tab
6. Inspect the DE Analysis Results and numbers of DE genes on the Results tab
7. Compare the groupings of samples in the heatmap of DE genes to the clustering plots
8. Adjust FDR and LFC settings to filter the DE gene results
9. Create a curated list of DE genes by repeating steps 6 through 8
10. Download the curated list of DE genes
Upload the Data
Upload the data and click Run Analysis.

Review Initial Settings
Review the initial settings on the Analysis tab.

Select a Comparison
Select the sample groups to compare and click Analyze.

Explore Filtered and Normalized Data
Explore the filtered and normalized data on the Data Normalization tab.

Downstream Network Analysis
For downstream network analysis, click the Download Table button to download the Normalized Gene Counts Table. This table can be input to the freeCount NA app along with a study design file.
Inspect Patterns of Variation
Inspect the patterns of variation among samples shown in the clustering plots on the Data Exploration tab.

Notice in the PCA plot that a few samples do not cluster with the other samples from their own group. For example, one sample from the treat.4h group (treat2_4h) is not clustered with the other samples in that group.
The patterns of variation among samples that we observe here show us what to expect when analyzing the resulting set of DE genes. These patterns will help guide us while setting the FDR and LFC thresholds to filter our results.
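To see where a clustering plot like this comes from, the sketch below runs a principal component analysis on log-transformed counts per million using base R. It is only an illustration on simulated data: the app's plots may be computed differently, and the simple library size scaling and the `+ 1` offset here are simplifying assumptions rather than TMM normalization.

```r
set.seed(42)
# Simulated counts: 500 genes x 6 samples (hypothetical data)
counts <- matrix(
  rpois(500 * 6, lambda = 50),
  nrow = 500,
  dimnames = list(paste0("gene", 1:500), paste0("sample", 1:6))
)

# Scale each sample by its library size, then log-transform
lib_size <- colSums(counts)
log_cpm <- log2(t(t(counts) / lib_size) * 1e6 + 1)

# PCA expects samples in rows, so transpose the gene-by-sample matrix
pca <- prcomp(t(log_cpm))

# Proportion of the total variance captured by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Coordinates of each sample on the first two components
pca$x[, 1:2]
```

Samples that sit close together on the first two components have similar expression profiles, which is why a sample plotted away from its own group stands out.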
Inspect the DE Analysis Results
Inspect the DE analysis results and numbers of DE genes on the Results tab.

Compare the Sample Groupings
Compare the groupings of samples in the heatmap of DE genes (Results tab) to the clustering plots (Data Exploration tab).

Adjust FDR and LFC Settings
Adjust FDR and LFC settings to filter the DE gene results.

Filtering DE Analysis Results
Adjust thresholds by…
- Increasing the LFC threshold when the data are noisy, to keep only genes with larger and more reliable differences
- Decreasing the FDR threshold to focus on the genes most likely to be truly DE
Narrow down the results to the genes that you think are driving the patterns of variation observed in the clustering plots.
Verify the FDR and LFC thresholds by visualizing the patterns with just those genes.
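The effect of the two thresholds can be sketched as a simple filter on a results table. The function name `filter_de`, the cutoffs of 0.05 and 1, and the example values below are all hypothetical. The `logFC` and `FDR` column names follow edgeR's results tables, but the app's own defaults and comparison rules may differ.

```r
# Keep genes that pass both the FDR cutoff and the absolute LFC cutoff
filter_de <- function(res, fdr_cut = 0.05, lfc_cut = 1) {
  res[res$FDR < fdr_cut & abs(res$logFC) > lfc_cut, , drop = FALSE]
}

# Hypothetical DE results for five genes
res <- data.frame(
  logFC = c(2.5, -0.4, -1.8, 1.2, 0.1),
  FDR = c(0.001, 0.02, 0.04, 0.20, 0.90),
  row.names = paste0("gene", 1:5)
)

# Illustrative cutoffs keep gene1 and gene3
default_list <- filter_de(res)

# Stricter cutoffs keep only gene1
strict_list <- filter_de(res, fdr_cut = 0.01, lfc_cut = 2)

rownames(default_list)
rownames(strict_list)
```

Note that gene2 passes the FDR cutoff but is removed because its fold change is small, while gene4 has a larger fold change but is removed because its FDR is too high.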
Verify Analysis Settings
Verify that the analysis settings have updated by looking at the Current Analysis Settings on the left side of the app.

Create Curated List of DE Genes
Create a curated list of DE genes by repeating steps 6 through 8.
It may be necessary to repeatedly adjust the settings and inspect the DE gene results to create a well-curated and manageable list of DE genes.
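The same trial-and-error process can be previewed by counting how many genes survive each combination of thresholds. This is a sketch on a hypothetical results table. The cutoff values in the grid are illustrative and are not recommendations from the app.

```r
# Hypothetical DE results for six genes
res <- data.frame(
  logFC = c(3.1, -1.9, 1.6, -1.1, 0.7, 2.4),
  FDR = c(0.0005, 0.008, 0.03, 0.04, 0.01, 0.06)
)

# Every combination of two FDR cutoffs and two LFC cutoffs
grid <- expand.grid(fdr_cut = c(0.05, 0.01), lfc_cut = c(1, 2))

# Number of genes passing both cutoffs for each combination
grid$n_genes <- mapply(
  function(fdr_cut, lfc_cut) {
    sum(res$FDR < fdr_cut & abs(res$logFC) > lfc_cut)
  },
  grid$fdr_cut, grid$lfc_cut
)

grid
```

In this toy table the list shrinks from four genes to one as the cutoffs tighten, which is the kind of trade-off between list size and confidence that each round of adjustment explores.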
Download DE Gene List
Finally, download the curated list of DE genes.

The DE Analysis Results Table can be used in downstream functional analysis. This table can be input to the freeCount FA app along with an annotations file.
The Significant DE Analysis Results Table can be used with set operations to identify sets of shared or unique genes. This table can be input to the freeCount SO app along with other lists of DE genes from an experiment.
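The set operations mentioned above correspond to base R's `intersect()`, `setdiff()`, and `union()`. The gene identifiers and the two list names below are hypothetical. In practice each vector would hold the gene IDs from one downloaded table of significant DE genes.

```r
# Hypothetical lists of significant DE genes from two comparisons
de_4h <- c("gene1", "gene2", "gene3", "gene5")
de_24h <- c("gene2", "gene3", "gene4")

# Genes that are DE in both comparisons
shared <- intersect(de_4h, de_24h)

# Genes that are DE in only one of the comparisons
only_4h <- setdiff(de_4h, de_24h)
only_24h <- setdiff(de_24h, de_4h)

# Genes that are DE in at least one comparison
all_de <- union(de_4h, de_24h)

shared
only_4h
only_24h
```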