Network Analysis with freeCount

This article was first published on R – Myscape and contributed to R-bloggers.

Overview

Weighted gene co-expression network analysis (WGCNA) is used to investigate the function of genes at the systems level. In a network analysis, genes with similar patterns of expression are grouped together into modules. The genes in a module may be co-expressed because they share biological functions, pathways, tissues, traits, etc.

Which genes share patterns of expression across samples?

The freeCount NA app will help you perform network analysis of normalized gene counts, which can be produced from differential expression analysis tools like freeCount DA.

This tutorial is the third in a series and uses the TMM normalized data made in the Making DE Gene Lists with freeCount tutorial.

WGCNA

Constructing co-expression networks with the WGCNA R package can be tricky in practice, but it is conceptually straightforward (DOI: 10.2202/1544-6115.1128). In a co-expression network, the nodes represent genes. Nodes are connected if the corresponding genes are significantly co-expressed across appropriately chosen samples.
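To make the idea of co-expression concrete, here is a minimal sketch in Python (the freeCount apps themselves are written in R). It scores each pair of genes by the Pearson correlation of their expression across samples. The gene names and values are made up for illustration, and the `pearson` helper is not part of freeCount or WGCNA.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy normalized expression values (made up): one list per gene,
# one value per sample.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],  # rises together with geneA
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],  # unrelated pattern
}

# Each gene is a node; each pair of genes gets a correlation score.
correlations = {
    (g1, g2): pearson(expression[g1], expression[g2])
    for g1, g2 in combinations(expression, 2)
}

for pair, r in correlations.items():
    print(pair, round(r, 3))
```

In this toy example geneA and geneB are strongly correlated and would be candidates for the same module, while geneC is only weakly related to either.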

Given the assumptions of WGCNA, it is important to design your study appropriately for network analysis so that you can draw reasonable conclusions from the results. First, WGCNA assumes that the count data have been pre-processed and normalized (DOI: 10.1186/1471-2105-9-559). It is also important to consider whether you have enough samples to construct an informative network, one in which the signal of co-expression is not biased by a particular sample.


Before Starting

The exercise in this tutorial uses the freeCount apps in RStudio on a personal computer. Make sure that the following tools are downloaded, installed, and up to date on your personal computer:

  1. R software environment
  2. RStudio desktop application

For Windows users, additionally install RTools.

It is not possible to run the freeCount NA app online through the free plan of Posit Cloud, since it requires too much memory.

Input Data

  1. Download the tribolium normalized counts file
  2. Download the tribolium experimental design file

Tip! Right click and select Save As… to download the above files in the necessary formats.


The Analysis App

The following steps show you how to get and start running the freeCount network analysis (NA) app.

  1. Download the freeCount R Shiny applications
    1. Go to https://github.com/ElizabethBrooks/freeCount
    2. Click the green < > Code button
    3. Click Download ZIP
  2. Extract the freeCount-main directory 
  3. Navigate to the apps directory
  4. Open the NA.R file in RStudio
  5. Click Install on the yellow banner to install the necessary R packages (or run the code on lines 10 to 20)
  6. Click the Run App button in the upper right corner of the source pane

Analysis Process

Perform the following steps to make lists of co-expressed genes contained in network modules.

  1. Upload the data and click Upload
  2. Click the Run Analysis button that appears on the left side of the screen
  3. Review the data settings on the Data Cleaning tab
  4. Adjust the network settings on the Network Construction tab
  5. Create an informative network and curated list of co-expressed genes by repeating steps 3 and 4
  6. Download lists of genes or module eigengenes from the Results tab

1. Upload Data

Upload the data and click Upload.

Input Data

  1. The first file that you need to upload is the table of gene counts that has the normalized gene counts for your experiment. In this tutorial we are using the tribolium normalized counts file.
  2. The second file that you need to upload is the table with the experimental design that describes the samples in your study. In this tutorial we are using the tribolium experimental design file.
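Before uploading, it can help to confirm that the two tables describe the same samples. The Python sketch below is only an illustration: the file layout (genes as rows and samples as columns in the counts table, one row per sample in the design table) and the column names `gene`, `sample`, and `group` are assumptions, not the app's documented format, and the values are made up.

```python
import csv
import io

# Hypothetical file contents; the layout and column names are assumptions.
counts_csv = """gene,S1,S2,S3,S4
g1,10.5,11.2,30.1,29.8
g2,5.0,4.8,5.1,5.3
"""
design_csv = """sample,group
S1,control
S2,control
S3,treated
S4,treated
"""


def is_number(text):
    """Return True if the text parses as a floating point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


counts_rows = list(csv.reader(io.StringIO(counts_csv)))
design_rows = list(csv.DictReader(io.StringIO(design_csv)))

count_samples = counts_rows[0][1:]  # header without the gene column
design_samples = [row["sample"] for row in design_rows]

# Samples present in one table but not in the other.
missing_from_design = sorted(set(count_samples) - set(design_samples))
missing_from_counts = sorted(set(design_samples) - set(count_samples))

# Count values that do not parse as numbers, as (gene, value) pairs.
non_numeric = [
    (row[0], value)
    for row in counts_rows[1:]
    for value in row[1:]
    if not is_number(value)
]

print("missing from design:", missing_from_design)
print("missing from counts:", missing_from_counts)
print("non-numeric values:", non_numeric)
```

To check real files, replace the `io.StringIO(...)` objects with opened file handles and adjust the column names to match your tables.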

2. Run Analysis

Click the Run Analysis button that appears on the left side of the screen.

3. Review Data Settings

Review the data settings on the Data Cleaning tab.

The Minimum Branch Cluster Size and Branch Cut Heights settings can be adjusted to help identify and remove outliers from the input data. After changing these settings, look at the Sample clustering to detect outliers plot that follows to see which samples are not clustering well with their groups and may need to be removed. The red line marks the cut height that will be used to remove outliers.
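The logic behind these two settings can be sketched as follows, assuming they behave like a static cut of an average-linkage sample tree: samples are merged into clusters until the closest remaining clusters are farther apart than the cut height, and clusters smaller than the minimum size are flagged as outliers. This is a simplified Python illustration with made-up values, not the app's code, and the function names are hypothetical.

```python
from math import dist


def average_distance(points, cluster_a, cluster_b):
    """Average Euclidean distance between the samples of two clusters."""
    total = sum(dist(points[a], points[b]) for a in cluster_a for b in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def cut_sample_tree(points, cut_height):
    """Merge the closest clusters until they are farther apart than cut_height."""
    clusters = [[name] for name in points]
    while len(clusters) > 1:
        d, i, j = min(
            (average_distance(points, clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > cut_height:
            break  # remaining clusters sit above the cut line
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters


# Toy expression profiles (made up): one tuple of gene values per sample.
samples = {
    "S1": (1.0, 2.0, 3.0),
    "S2": (1.2, 2.1, 2.9),
    "S3": (0.9, 1.8, 3.2),
    "S4": (1.1, 2.2, 3.1),
    "S5": (6.0, 7.0, 9.0),  # far from every other sample
}
cut_height = 2.0        # hypothetical setting
min_cluster_size = 2    # hypothetical setting

clusters = cut_sample_tree(samples, cut_height)
outliers = sorted(
    name
    for cluster in clusters
    if len(cluster) < min_cluster_size
    for name in cluster
)
print("clusters:", clusters)
print("outliers:", outliers)
```

Raising the cut height keeps more samples; lowering it flags more of them as outliers.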

4. Adjust Network Settings

Adjust the network settings on the Network Construction tab.

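Change the Soft Thresholding Power to shift the range of suggested soft thresholding powers (red numbers) in the following plots. Soft thresholding assigns a connection weight to each gene pair by raising the strength of their correlation to the chosen power, instead of applying a hard cutoff.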
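As a small worked example, the Python sketch below shows the unsigned form of soft thresholding, where the weight is the absolute correlation raised to the power. WGCNA also offers signed variants, and which one the app uses is not stated here, so treat the formula and the power of 6 as assumptions for illustration.

```python
def soft_threshold(correlation, power):
    """Unsigned soft-thresholded connection weight: |correlation| ** power."""
    return abs(correlation) ** power


power = 6  # hypothetical soft thresholding power

# Strong correlations keep a sizeable weight; weak ones shrink toward zero.
for r in (0.9, -0.9, 0.5, 0.3):
    print(r, round(soft_threshold(r, power), 6))
```

With this power a correlation of 0.9 keeps a weight of about 0.53, while a correlation of 0.3 drops below 0.001, so weak connections fade without being cut off entirely.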

Next, set the Soft Thresholding Power by looking at the Scale independence plot above to see where the recommended scale-free topology model fit falls (red line on the y-axis). Note which red number is closest to the red line. Then, look at the Mean connectivity plot to see what mean connectivity (y-axis) that number corresponds to.
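The trade-off shown in the Mean connectivity plot can be reproduced on a toy scale. This Python sketch assumes unsigned weights (absolute correlation raised to the power) and defines a gene's connectivity as the sum of its weights to all other genes. The correlation matrix is made up, and the scale-free topology fit is left out because it only makes sense with many genes.

```python
# Toy correlation matrix for four genes (made-up values).
cor = [
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.7, 0.2],
    [0.8, 0.7, 1.0, 0.1],
    [0.1, 0.2, 0.1, 1.0],
]


def mean_connectivity(cor, power):
    """Average over genes of the summed connection weights to other genes."""
    n = len(cor)
    connectivity = [
        sum(abs(cor[i][j]) ** power for j in range(n) if j != i)
        for i in range(n)
    ]
    return sum(connectivity) / n


powers = [1, 2, 4, 6, 8, 10]  # hypothetical candidate powers
mean_k = {p: mean_connectivity(cor, p) for p in powers}

for p in powers:
    print(p, round(mean_k[p], 4))
```

Mean connectivity falls as the power rises, which is why the chosen power is a balance between a good scale-free fit and a network that still has enough connections to be informative.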

Then, you can set the Module Eigengene Cut Height in the Network Construction section. This allows you to adjust the size of your modules by merging modules according to co-expression similarity.
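One way to picture the cut height is as a limit on eigengene dissimilarity, taken here as one minus the correlation between two module eigengenes. The Python sketch below checks module pairs against that limit. It is a simplification: WGCNA clusters the eigengenes on this dissimilarity rather than testing pairs one by one, and the module names, eigengene values, and cut height of 0.25 are made up for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy module eigengenes (made up): one value per sample.
eigengenes = {
    "blue": [0.5, 0.4, -0.1, -0.3, -0.5],
    "green": [0.45, 0.45, -0.05, -0.35, -0.5],  # nearly the same as blue
    "yellow": [-0.2, 0.5, -0.5, 0.4, -0.2],     # a different pattern
}
cut_height = 0.25  # hypothetical setting

# Dissimilarity of each module pair: 1 - correlation of their eigengenes.
dissimilarity = {
    (a, b): 1 - pearson(eigengenes[a], eigengenes[b])
    for a, b in combinations(eigengenes, 2)
}

# Pairs below the cut height are similar enough to be merged.
to_merge = [pair for pair, d in dissimilarity.items() if d < cut_height]

for pair, d in dissimilarity.items():
    print(pair, round(d, 3))
print("merge:", to_merge)
```

A higher cut height merges more modules and gives fewer, larger modules; a lower one keeps more of them separate.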

5. Create Curated Results

Create an informative network and curated list of co-expressed genes by repeating steps 3 and 4.

It may be necessary to repeatedly adjust the settings and inspect the network to create a well-curated list of co-expressed genes grouped into a manageable set of modules.

6. Download Results

Download lists of genes or module eigengenes from the Results tab.

The Gene Module Data table has the list of co-expressed genes associated with the different network modules. This file can be input to the freeCount FA functional analysis app to explore the potential functions of the gene sets contained in each module.

The Eigengene Expression Data table has the eigengene expression data from the network modules. This file can be used in various downstream analyses, such as a differential eigengene expression analysis.
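In WGCNA, a module eigengene is the first principal component of the module's standardized expression, giving one summary value per sample. The Python sketch below approximates it with power iteration using only the standard library. The expression values are made up, the function names are hypothetical, and the app's own output comes from the WGCNA R package rather than from code like this.

```python
from math import sqrt


def standardize(values):
    """Center a gene's values to mean 0 and scale to standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def module_eigengene(genes, iterations=200):
    """Approximate the first principal component across samples."""
    scaled = [standardize(g) for g in genes]  # genes x samples
    n_samples = len(scaled[0])
    # Start from the average scaled profile so the sign of the result
    # follows the module's average expression.
    u = [sum(g[s] for g in scaled) / len(scaled) for s in range(n_samples)]
    for _ in range(iterations):
        # Project every gene onto the current sample vector ...
        loadings = [sum(g[s] * u[s] for s in range(n_samples)) for g in scaled]
        # ... then rebuild the sample vector from those loadings.
        u = [
            sum(w * g[s] for w, g in zip(loadings, scaled))
            for s in range(n_samples)
        ]
        norm = sqrt(sum(v * v for v in u))
        u = [v / norm for v in u]
    return u


# Toy module (made up): three genes that rise together across five samples.
module_genes = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.1, 3.9, 6.2, 8.0, 9.8],
    [0.5, 1.1, 1.4, 2.1, 2.4],
]

eigengene = module_eigengene(module_genes)
print([round(v, 3) for v in eigengene])
```

Because every gene in this toy module rises from the first sample to the last, the eigengene does too. In a downstream analysis, each module's eigengene can be compared between sample groups in place of thousands of individual genes.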
