exprAnalysis package

[This article was first published on Shirin's playgRound, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

I created the R package exprAnalysis designed to streamline my RNA-seq data analysis pipeline. Below you find the vignette for installation and usage of the package.


This package combines functions from various packages used to analyze and visualize expression data from NGS or expression chips.

  • It supports normalized input as e.g. from Cufflinks or expression chip arrays and raw count data from bam file input.
  • It supports mRNA, miRNA, protein or other expression data.

So far, it is implemented for human and mouse data only!

It uses functions from the following packages:

  • AnnotationDbi for annotating gene information
  • beadarray for importing Illumina expression chip files from GenomeStudio
  • clusterProfiler and DOSE for functional enrichment analysis
  • DESeq2 for differential expression analysis of raw count data
  • GenomicAlignments, GenomicFeatures, Rsamtools for reading bam files
  • pcaGoPromoter for principle component analysis
  • limma for differential expression analysis of normalised expression data
  • pathview for mapping KEGG pathways
  • gplots for heatmaps
  • sva for batch correction
  • WGCNA for coregulatory network determination

Prepare session

<span class="n">library</span><span class="p">(</span><span class="n">exprAnalysis</span><span class="p">)</span><span class="w">
</span>
<span class="c1"># update all packages
</span><span class="n">source.bioc</span><span class="o">=</span><span class="s2">"https://bioconductor.org/biocLite.R"</span><span class="w">
</span><span class="n">source</span><span class="p">(</span><span class="n">source.bioc</span><span class="p">)</span><span class="w">
</span><span class="n">biocLite</span><span class="p">(</span><span class="s2">"BiocUpgrade"</span><span class="p">)</span><span class="w">
</span><span class="n">biocLite</span><span class="p">(</span><span class="n">ask</span><span class="o">=</span><span class="kc">FALSE</span><span class="p">)</span><span class="w"> 

</span><span class="n">update.packages</span><span class="p">()</span><span class="w">

</span><span class="c1"># make sure the workspace is in pristine condition
</span><span class="n">rm</span><span class="p">(</span><span class="n">list</span><span class="o">=</span><span class="n">ls</span><span class="p">(</span><span class="n">all</span><span class="o">=</span><span class="kc">TRUE</span><span class="p">))</span><span class="w">

</span><span class="n">orig_par</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">par</span><span class="p">(</span><span class="n">no.readonly</span><span class="o">=</span><span class="nb">T</span><span class="p">)</span><span class="w">

</span><span class="n">options</span><span class="p">(</span><span class="n">stringsAsFactors</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">FALSE</span><span class="p">)</span><span class="w">
</span>

The package can be installed via Github:

Beware that the vignette is rather large and thus takes a minute to compile. You can also just use this page (it is identical to the vignette).

<span class="c1"># install package from github
</span><span class="n">install.packages</span><span class="p">(</span><span class="s2">"devtools"</span><span class="p">)</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">devtools</span><span class="p">)</span><span class="w">

</span><span class="c1"># either the latest stable release that passed TRAVIS CI check
</span><span class="n">devtools</span><span class="o">::</span><span class="n">install_github</span><span class="p">(</span><span class="s2">"ShirinG/exprAnalysis"</span><span class="p">,</span><span class="w"> </span><span class="n">build_vignettes</span><span class="o">=</span><span class="kc">TRUE</span><span class="p">,</span><span class="w"> </span><span class="n">ref</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"stable.version0.1.0"</span><span class="p">)</span><span class="w">

</span><span class="c1"># or the development version
</span><span class="n">devtools</span><span class="o">::</span><span class="n">install_github</span><span class="p">(</span><span class="s2">"ShirinG/exprAnalysis"</span><span class="p">,</span><span class="w"> </span><span class="n">build_vignettes</span><span class="o">=</span><span class="kc">TRUE</span><span class="p">,</span><span class="w"> </span><span class="n">ref</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"master"</span><span class="p">)</span><span class="w">
</span><span class="n">There</span><span class="w"> </span><span class="n">might</span><span class="w"> </span><span class="n">be</span><span class="w"> </span><span class="n">problems</span><span class="w"> </span><span class="n">with</span><span class="w"> </span><span class="n">installation</span><span class="w"> </span><span class="n">of</span><span class="w"> </span><span class="n">some</span><span class="w"> </span><span class="n">dependency</span><span class="w"> </span><span class="n">packages</span><span class="w"> </span><span class="p">(</span><span class="n">especially</span><span class="w"> </span><span class="n">Bioconductor</span><span class="w"> </span><span class="n">packages</span><span class="w"> </span><span class="n">and</span><span class="w"> </span><span class="n">WGCNA</span><span class="w"> </span><span class="n">and</span><span class="w"> </span><span class="n">its</span><span class="w"> </span><span class="n">dependencies</span><span class="w"> </span><span class="n">from</span><span class="w"> </span><span class="n">CRAN</span><span class="p">)</span><span class="n">.</span><span class="w"> </span><span class="n">In</span><span class="w"> </span><span class="n">order</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">install</span><span class="w"> </span><span class="n">them</span><span class="w"> </span><span class="n">manually</span><span class="o">:</span><span class="w">
</span>

There might be problems with installation of some dependency packages (especially Bioconductor packages and WGCNA and its dependencies from CRAN). In order to install them manually:

<span class="n">list.of.packages_bioconductor</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="s2">"arrayQualityMetrics"</span><span class="p">,</span><span class="w"> </span><span class="s2">"beadarray"</span><span class="p">,</span><span class="w"> </span><span class="s2">"pcaGoPromoter"</span><span class="p">,</span><span class="w"> </span><span class="s2">"limma"</span><span class="p">,</span><span class="w"> </span><span class="s2">"pathview"</span><span class="p">,</span><span class="w"> </span><span class="s2">"sva"</span><span class="p">,</span><span class="w"> </span><span class="s2">"GO.db"</span><span class="p">,</span><span class="w"> </span><span class="s2">"impute"</span><span class="p">)</span><span class="w">
</span><span class="n">list.of.packages_cran</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="s2">"WGCNA"</span><span class="p">,</span><span class="w"> </span><span class="s2">"roxygen2"</span><span class="p">,</span><span class="w"> </span><span class="s2">"testthat"</span><span class="p">,</span><span class="w"> </span><span class="s2">"gplots"</span><span class="p">)</span><span class="w">

</span><span class="n">new.packages_bioconductor</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">list.of.packages_bioconductor</span><span class="p">[</span><span class="o">!</span><span class="p">(</span><span class="n">list.of.packages_bioconductor</span><span class="w"> </span><span class="o">%in%</span><span class="w"> </span><span class="n">installed.packages</span><span class="p">()[,</span><span class="s2">"Package"</span><span class="p">])]</span><span class="w">
</span><span class="n">new.packages_cran</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">list.of.packages_cran</span><span class="p">[</span><span class="o">!</span><span class="p">(</span><span class="n">list.of.packages_cran</span><span class="w"> </span><span class="o">%in%</span><span class="w"> </span><span class="n">installed.packages</span><span class="p">()[,</span><span class="s2">"Package"</span><span class="p">])]</span><span class="w">

</span><span class="c1"># CRAN
</span><span class="k">if</span><span class="p">(</span><span class="nf">length</span><span class="p">(</span><span class="n">new.packages_cran</span><span class="p">)</span><span class="o">></span><span class="m">0</span><span class="p">)</span><span class="w"> </span><span class="n">install.packages</span><span class="p">(</span><span class="n">new.packages_cran</span><span class="p">)</span><span class="w">

</span><span class="c1"># Bioconductor
</span><span class="k">if</span><span class="p">(</span><span class="nf">length</span><span class="p">(</span><span class="n">new.packages_bioconductor</span><span class="p">)</span><span class="o">></span><span class="m">0</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
  </span><span class="n">source</span><span class="p">(</span><span class="s2">"https://bioconductor.org/biocLite.R"</span><span class="p">)</span><span class="w">
  </span><span class="n">biocLite</span><span class="p">(</span><span class="n">new.packages_bioconductor</span><span class="p">)</span><span class="w">
</span><span class="p">}</span><span class="w">
</span>

Load the library:

<span class="n">library</span><span class="p">(</span><span class="s2">"exprAnalysis"</span><span class="p">)</span><span class="w">

</span><span class="c1"># To view the vignette if you built it with the package:
</span><span class="w">
</span><span class="n">vignette</span><span class="p">(</span><span class="s2">"exprAnalysis"</span><span class="p">,</span><span class="w"> </span><span class="n">package</span><span class="o">=</span><span class="s2">"exprAnalysis"</span><span class="p">)</span><span class="w">
</span><span class="n">vignette</span><span class="p">(</span><span class="s2">"CummeRbund"</span><span class="p">,</span><span class="w"> </span><span class="n">package</span><span class="o">=</span><span class="s2">"exprAnalysis"</span><span class="p">)</span><span class="w">

</span><span class="n">browseVignettes</span><span class="p">(</span><span class="s2">"exprAnalysis"</span><span class="p">)</span><span class="w">

</span><span class="n">enableWGCNAThreads</span><span class="p">()</span><span class="w">
</span>

Functions

See package help (?functionname, ?data) for detailed descriptions of functions and example datasets.


Input data

Takes an expression data matrix containing numeric values as expression measures (e.g. read count data, FPKM values, Illumina expression data).

  • Rownames should be gene or isoform identifiers (e.g. gene names)
  • Colnames should be sample IDs (sample names)

This package contains two example matrices of randomly generated expression data as raw read counts (called “countmatrix”) and as normalized read counts (called “expmatrix”).

<span class="n">data</span><span class="p">(</span><span class="s2">"countmatrix"</span><span class="p">)</span><span class="w">
</span><span class="n">data</span><span class="p">(</span><span class="s2">"expmatrix"</span><span class="p">)</span><span class="w">
</span>

Starting from FPKM data

If you have FPKM data (e.g. quantile normalized) from Cufflinks, treat the data such as the expmatrix example.

Cuffdiff

See CummeRbund vignette included among the vignettes of this package CummeRbundVignette.html.


Starting from count data

If you want to do count data analysis, you can either produce a count matrix (e.g. with HTSeq) and proceed to DESeq2 analysis or you can produce the count table directly from the bam files and then proceed to DESeq2.

read_bam_to_countmatrix() returns a DESeq data set that can directly be used with the DEseq2 pipeline. Or you can manually load the count matrix that was saved by read_bam_to_countmatrix() and then go to DESeqDataFrameFromMatrix().

In read_bam_to_countmatrix() fragments will be counted only once to each gene, even if they overlap multiple exons of a gene which may themselves be overlapping.

read_bam_to_countmatrix() uses the packages Rsamtools, GenomicAlignments, GenomicFeatures, Biobase and SummarizedExperiment.

<span class="c1"># Locate alignment files
</span><span class="n">dir</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">getwd</span><span class="p">()</span><span class="w">
</span><span class="n">filenames</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">fileslist</span><span class="p">[</span><span class="n">grep</span><span class="p">(</span><span class="s2">".*sorted.bam$"</span><span class="p">,</span><span class="w"> </span><span class="n">list.files</span><span class="p">(</span><span class="n">dir</span><span class="p">))]</span><span class="w">

</span><span class="c1"># Create a sample table
</span><span class="w">
</span><span class="n">samplename</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">sub</span><span class="p">(</span><span class="s2">"_accepted_hits.sorted.bam"</span><span class="p">,</span><span class="w"> </span><span class="s2">""</span><span class="p">,</span><span class="w"> </span><span class="n">filenames</span><span class="p">)</span><span class="w">
</span><span class="n">design</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="s2">"Treatment"</span><span class="p">,</span><span class="w"> </span><span class="s2">"Control"</span><span class="p">)</span><span class="w">
</span><span class="n">sampleid</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">sub</span><span class="p">(</span><span class="s2">"Sample"</span><span class="p">,</span><span class="w"> </span><span class="s2">""</span><span class="p">,</span><span class="w"> </span><span class="n">samplename</span><span class="p">)</span><span class="w">

</span><span class="n">sampleTable</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">data.frame</span><span class="p">(</span><span class="n">row.names</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">samplename</span><span class="p">,</span><span class="w"> </span><span class="n">sampleid</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sampleid</span><span class="p">,</span><span class="w"> </span><span class="n">filenames</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">filenames</span><span class="p">,</span><span class="w"> </span><span class="n">colData</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">design</span><span class="p">,</span><span class="w"> </span><span class="n">stringsAsFactors</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">F</span><span class="p">)</span><span class="w">

</span><span class="c1">#         sampleid                         filenames    colData
#Sample1         1  Sample1_accepted_hits.sorted.bam  Treatment
#Sample2         2  Sample2_accepted_hits.sorted.bam    Control
</span><span class="w">
</span><span class="n">data</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">read_bam_to_countmatrix</span><span class="p">(</span><span class="n">sampleTable</span><span class="p">,</span><span class="w"> </span><span class="n">gtffile</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Homo_sapiens.GRCh38.83.gtf"</span><span class="p">,</span><span class="w"> </span><span class="n">projectfolder</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">getwd</span><span class="p">(),</span><span class="w"> </span><span class="n">outPrefix</span><span class="o">=</span><span class="s2">"Test"</span><span class="p">)</span><span class="w">

</span><span class="n">countmatrix</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">SummarizedExperiment</span><span class="o">::</span><span class="n">assay</span><span class="p">(</span><span class="n">data</span><span class="p">)</span><span class="w">
</span>

Continue count data analysis with DESeq2

See DESeq2 Vignette for additional details.

If you do not directly proceed from read_bam_to_countmatrix():

  • read in saved count matrix
  • define experimental design
  • convert to DESeq data set
<span class="n">countmatrix</span><span class="w"> </span><span class="o"><-</span><span class="w">   </span><span class="n">read.table</span><span class="p">(</span><span class="n">file.path</span><span class="p">(</span><span class="n">projectfolder</span><span class="p">,</span><span class="w"> </span><span class="s2">"Countdata"</span><span class="p">,</span><span class="w"> </span><span class="n">outPrefix</span><span class="p">,</span><span class="w"> </span><span class="n">paste0</span><span class="p">(</span><span class="n">outPrefix</span><span class="p">,</span><span class="w"> </span><span class="s2">"_Summarize_overlaps_count_matrix.txt"</span><span class="p">)),</span><span class="w"> </span><span class="n">header</span><span class="o">=</span><span class="nb">T</span><span class="p">,</span><span class="w"> </span><span class="n">sep</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"\t"</span><span class="p">)</span><span class="w">
</span>
<span class="n">design</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">gsub</span><span class="p">(</span><span class="s2">"(.*)(_[0-9])"</span><span class="p">,</span><span class="w"> </span><span class="s2">"\\1"</span><span class="p">,</span><span class="w"> </span><span class="n">colnames</span><span class="p">(</span><span class="n">countmatrix</span><span class="p">))</span><span class="w">
</span><span class="n">ExpDesign</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">data.frame</span><span class="p">(</span><span class="n">row.names</span><span class="o">=</span><span class="n">colnames</span><span class="p">(</span><span class="n">countmatrix</span><span class="p">),</span><span class="w"> </span><span class="n">treatment</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">design</span><span class="p">)</span><span class="w">

</span><span class="n">data</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">DESeq2</span><span class="o">::</span><span class="n">DESeqDataSetFromMatrix</span><span class="p">(</span><span class="n">countData</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">countmatrix</span><span class="p">,</span><span class="w"> </span><span class="n">colData</span><span class="o">=</span><span class="n">ExpDesign</span><span class="p">,</span><span class="w"> </span><span class="n">design</span><span class="o">=~</span><span class="n">treatment</span><span class="p">)</span><span class="w">
</span>

DESeq2

  • optional, but recommended: remove genes with zero counts over all samples
  • run DESeq
  • Extracting transformed values

Note: the rlog transformation is provided for applications other than differential testing. For differential testing we recommend the DESeq function applied to raw counts, as described later in this workflow, which also takes into account the dependence of the variance of counts on the mean value during the dispersion estimation step.

<span class="n">data</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">data</span><span class="p">[</span><span class="n">rowSums</span><span class="p">(</span><span class="n">DESeq2</span><span class="o">::</span><span class="n">counts</span><span class="p">(</span><span class="n">data</span><span class="p">))</span><span class="w"> </span><span class="o">></span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="p">]</span><span class="w">

</span><span class="n">data_DESeq</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">DESeq2</span><span class="o">::</span><span class="n">DESeq</span><span class="p">(</span><span class="n">data</span><span class="p">)</span><span class="w">

</span><span class="n">expmatrix_DESeq</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">DESeq2</span><span class="o">::</span><span class="n">rlog</span><span class="p">(</span><span class="n">data_DESeq</span><span class="p">,</span><span class="w"> </span><span class="n">fitType</span><span class="o">=</span><span class="s2">"local"</span><span class="p">)</span><span class="w">
</span><span class="n">expmatrix</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">SummarizedExperiment</span><span class="o">::</span><span class="n">assay</span><span class="p">(</span><span class="n">expmatrix_DESeq</span><span class="p">)</span><span class="w">
</span>

Dispersion plot

<span class="n">DESeq2</span><span class="o">::</span><span class="n">plotDispEsts</span><span class="p">(</span><span class="n">data_DESeq</span><span class="p">,</span><span class="w"> </span><span class="n">main</span><span class="o">=</span><span class="s2">"Dispersion Estimates"</span><span class="p">)</span><span class="w">
</span>

Starting from Affymetrix expression chips

If you have CEL files, start with the following code to produce the expression matrix and then treat it like the expmatrix example:

  • Read in the data and create an expression, using RMA for example

Currently the rma function implements RMA in the following manner 1. Probe specific correction of the PM probes using a model based on observed intensity being the sum of signal and noise 2. Normalization of corrected PM probes using quantile normalization (Bolstad et al., 2003) 3. Calculation of Expression measure using median polish.

<span class="n">library</span><span class="p">(</span><span class="n">affy</span><span class="p">)</span><span class="w">

</span><span class="c1"># Choose directory containing CEL files
</span><span class="n">dir</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">getwd</span><span class="p">()</span><span class="w">

</span><span class="c1"># check, that you have the correct directory
</span><span class="n">celfiles</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">file.path</span><span class="p">(</span><span class="n">dir</span><span class="p">,</span><span class="w"> </span><span class="n">list.files</span><span class="p">(</span><span class="n">dir</span><span class="p">,</span><span class="w"> </span><span class="n">recursive</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">TRUE</span><span class="p">)[</span><span class="n">grep</span><span class="p">(</span><span class="s2">".*CEL$"</span><span class="p">,</span><span class="w"> </span><span class="n">list.files</span><span class="p">(</span><span class="n">dir</span><span class="p">,</span><span class="w"> </span><span class="n">recursive</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">TRUE</span><span class="p">),</span><span class="w"> </span><span class="n">ignore.case</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">TRUE</span><span class="p">)])</span><span class="w">

</span><span class="n">affydata</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">ReadAffy</span><span class="p">(</span><span class="n">filenames</span><span class="o">=</span><span class="n">celfiles</span><span class="p">)</span><span class="w">
</span><span class="n">affy</span><span class="o">::</span><span class="n">MAplot</span><span class="p">(</span><span class="n">affydata</span><span class="p">,</span><span class="n">plot.method</span><span class="o">=</span><span class="s2">"smoothScatter"</span><span class="p">)</span><span class="w">
</span><span class="n">eset</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">affy</span><span class="o">::</span><span class="n">rma</span><span class="p">(</span><span class="n">affydata</span><span class="p">)</span><span class="w">

</span><span class="c1"># If affy doesn' recognize the chip type, try the oligo package:
</span><span class="n">library</span><span class="p">(</span><span class="n">oligo</span><span class="p">)</span><span class="w">
</span><span class="n">rawData</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">read.celfiles</span><span class="p">(</span><span class="n">celfiles</span><span class="p">)</span><span class="w">
</span><span class="n">oligo</span><span class="o">::</span><span class="n">MAplot</span><span class="p">(</span><span class="n">rawData</span><span class="p">,</span><span class="w"> </span><span class="n">plotFun</span><span class="o">=</span><span class="n">smoothScatter</span><span class="p">)</span><span class="w">
</span><span class="n">eset</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">backgroundCorrect</span><span class="p">(</span><span class="n">rawData</span><span class="p">,</span><span class="w"> </span><span class="n">method</span><span class="o">=</span><span class="s2">"rma"</span><span class="p">)</span><span class="w">

</span><span class="n">quality_control_plots</span><span class="p">(</span><span class="n">eset</span><span class="p">,</span><span class="w"> </span><span class="n">groupColumn</span><span class="o">=</span><span class="nf">c</span><span class="p">())</span><span class="w">

</span><span class="c1"># Optional: Remove bad quality probes/ genes and/ or samples
</span><span class="w">
</span><span class="n">write.exprs</span><span class="p">(</span><span class="n">eset</span><span class="p">,</span><span class="w"> </span><span class="n">file</span><span class="o">=</span><span class="s2">"eset.txt"</span><span class="p">)</span><span class="w">

</span><span class="n">library</span><span class="p">(</span><span class="n">made4</span><span class="p">)</span><span class="w">
</span><span class="n">overview</span><span class="p">(</span><span class="n">eset</span><span class="p">)</span><span class="w">

</span><span class="c1"># RNA degradation plots
</span><span class="n">deg</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">AffyRNAdeg</span><span class="p">(</span><span class="n">affydata</span><span class="p">)</span><span class="w">
</span><span class="n">plotAffyRNAdeg</span><span class="p">(</span><span class="n">deg</span><span class="p">)</span><span class="w">


</span><span class="n">expmatrix</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">exprs</span><span class="p">(</span><span class="n">eset</span><span class="p">)</span><span class="w">
</span>

Starting from Illumina expression chips

If you have Illumina intensity data, load it into GenomeStudio first:

  • Open New Project -> Gene Expression
  • Choose Assay Type -> Next
  • Choose Project Repo and name the project -> Next
  • On the left side, mark all folders containing data, select “All” and import them into the right window -> Next
  • Name Group Set, on the left side, mark all folders containing data, select “All” and import them into the right window (optionally, you can directly create groups here) -> Next
  • For now, don’t use normalisation and don’ substract background
  • Choose appropriate Content Descriptor (e.g. HumanHT-12_V4_0_R1_15002873_B.bgx) -> Finish
  • Export SampleProbeProfile.txt: Choose columns AVG_Signal, Detection Pval, BEAD_STDERR and Avg_NBEADS
  • Export ControlProbeProfile.txt: Choose columns AVG_Signal and Detection Pval
  • You also need the Samplesheet.csv

read_Illumina() and normalise_eset() uses the R package beadarray.

<span class="n">dataFile</span><span class="w">      </span><span class="o"><-</span><span class="w"> </span><span class="n">file.path</span><span class="p">(</span><span class="s2">"SampleProbeProfile.txt"</span><span class="p">)</span><span class="w">
</span><span class="n">qcFile</span><span class="w">        </span><span class="o"><-</span><span class="w"> </span><span class="n">file.path</span><span class="p">(</span><span class="s2">"ControlProbeProfile.txt"</span><span class="p">)</span><span class="w">
</span><span class="n">sampleSheet</span><span class="w">   </span><span class="o"><-</span><span class="w"> </span><span class="n">file.path</span><span class="p">(</span><span class="s2">"samplesheet.csv"</span><span class="p">)</span><span class="w">

</span><span class="c1"># define Illumina Expression Array: "HumanHT-12 v?", "MouseWG-6 v?", "MouseRef-8 v?"
</span><span class="n">expressionchipType</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="s2">"HumanHT-12 v4"</span><span class="w">

</span><span class="n">eset</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">read_Illumina</span><span class="p">(</span><span class="n">dataFile</span><span class="p">,</span><span class="w"> </span><span class="n">qcFile</span><span class="p">,</span><span class="w"> </span><span class="n">sampleSheet</span><span class="p">,</span><span class="w"> </span><span class="n">expressionchipType</span><span class="p">,</span><span class="w"> </span><span class="n">ProbeID</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"PROBE_ID"</span><span class="p">,</span><span class="w"> </span><span class="n">skip</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="n">controlID</span><span class="o">=</span><span class="s2">"ProbeID"</span><span class="p">,</span><span class="w"> </span><span class="n">qc.skip</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="n">method_norm</span><span class="w"> </span><span class="o">=</span><span class="s2">"none"</span><span class="p">,</span><span class="w"> </span><span class="n">transform</span><span class="o">=</span><span class="s2">"log2"</span><span class="p">)</span><span class="w">

</span><span class="n">quality_control_plots</span><span class="p">(</span><span class="n">eset</span><span class="p">)</span><span class="w">

</span><span class="c1"># Optional: Remove bad quality probes/ genes and/ or samples
</span><span class="w">
</span><span class="n">eset</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">normalise_eset</span><span class="p">(</span><span class="n">eset</span><span class="p">)</span><span class="w">

</span><span class="n">expmatrix</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">exprs</span><span class="p">(</span><span class="n">eset</span><span class="p">)</span><span class="w">
</span>

Batch correction

batch_removal() uses the R package sva.

<span class="n">pheno</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">data.frame</span><span class="p">(</span><span class="n">sample</span><span class="o">=</span><span class="nf">c</span><span class="p">(</span><span class="m">1</span><span class="o">:</span><span class="m">16</span><span class="p">),</span><span class="w"> </span><span class="n">treatment</span><span class="o">=</span><span class="n">sub</span><span class="p">(</span><span class="s2">"(.*)(_[0-9])"</span><span class="p">,</span><span class="w"> </span><span class="s2">"\\1"</span><span class="p">,</span><span class="w"> </span><span class="n">colnames</span><span class="p">(</span><span class="n">expmatrix</span><span class="p">)),</span><span class="w"> </span><span class="n">batch</span><span class="o">=</span><span class="n">ifelse</span><span class="p">(</span><span class="n">grepl</span><span class="p">(</span><span class="s2">"Ctrl"</span><span class="p">,</span><span class="w"> </span><span class="n">colnames</span><span class="p">(</span><span class="n">expmatrix</span><span class="p">))</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="kc">TRUE</span><span class="p">,</span><span class="w"> </span><span class="s2">"1"</span><span class="p">,</span><span class="w"> </span><span class="n">ifelse</span><span class="p">(</span><span class="n">grepl</span><span class="p">(</span><span class="s2">"ActLPS"</span><span class="p">,</span><span class="w"> </span><span class="n">colnames</span><span class="p">(</span><span class="n">expmatrix</span><span class="p">))</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="kc">TRUE</span><span class="p">,</span><span class="w"> </span><span class="s2">"1"</span><span class="p">,</span><span class="w"> </span><span class="s2">"2"</span><span class="p">)),</span><span class="w"> </span><span class="n">row.names</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">colnames</span><span class="p">(</span><span class="n">expmatrix</span><span class="p">))</span><span class="w">

</span><span class="n">expmatrix</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">batch_removal</span><span class="p">(</span><span class="n">expmatrix</span><span class="p">,</span><span class="w"> </span><span class="n">pheno</span><span class="p">)</span><span class="w">
</span>

Exploratory analysis of all genes

Variance vs mean gene expression across samples

Plots variance against mean gene expression across samples and calculates the correlation of a linear regression model.

var_vs_mean() uses the R package matrixStats.

<span class="n">var_vs_mean</span><span class="p">(</span><span class="n">countmatrix</span><span class="p">)</span><span class="w">
</span>

## 
##  Pearson's product-moment correlation
## 
## data:  log2(dispersion[, 1] + 1) and log2(dispersion[, 2] + 1)
## t = 281.2, df = 9998, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.9399669 0.9443684
## sample estimates:
##       cor 
## 0.9422083
<span class="c1"># var_vs_mean(expmatrix)
</span>

Intersample variances

<span class="n">library</span><span class="p">(</span><span class="n">corrgram</span><span class="p">)</span><span class="w">

</span><span class="n">Ctrl_cor</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">expmatrix</span><span class="p">[,</span><span class="n">grep</span><span class="p">(</span><span class="s2">"Ctrl"</span><span class="p">,</span><span class="w"> </span><span class="n">colnames</span><span class="p">(</span><span class="n">expmatrix</span><span class="p">))]</span><span class="w">

</span><span class="n">corrgram</span><span class="o">::</span><span class="n">corrgram</span><span class="p">(</span><span class="n">Ctrl_cor</span><span class="p">,</span><span class="w"> </span><span class="n">order</span><span class="o">=</span><span class="kc">TRUE</span><span class="p">,</span><span class="w"> </span><span class="n">lower.panel</span><span class="o">=</span><span class="n">corrgram</span><span class="o">::</span><span class="n">panel.pie</span><span class="p">,</span><span class="w">
         </span><span class="n">upper.panel</span><span class="o">=</span><span class="n">corrgram</span><span class="o">::</span><span class="n">panel.pts</span><span class="p">,</span><span class="w"> </span><span class="n">text.panel</span><span class="o">=</span><span class="n">corrgram</span><span class="o">::</span><span class="n">panel.txt</span><span class="p">,</span><span class="w">
         </span><span class="n">main</span><span class="o">=</span><span class="s2">"Correlogram of controls"</span><span class="p">)</span><span class="w">
</span>

Principle Component Analysis

Uses functions from the R package pcaGoPromoter.

You can only plot the principle components using:

<span class="n">groups</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">as.factor</span><span class="p">(</span><span class="nf">c</span><span class="p">(</span><span class="nf">rep</span><span class="p">(</span><span class="s2">"Ctrl"</span><span class="p">,</span><span class="m">4</span><span class="p">),</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="s2">"TolLPS"</span><span class="p">,</span><span class="m">4</span><span class="p">),</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="s2">"TolS100A8"</span><span class="p">,</span><span class="m">4</span><span class="p">),</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="s2">"ActLPS"</span><span class="p">,</span><span class="m">4</span><span class="p">)))</span><span class="w">

</span><span class="n">pca_plot</span><span class="p">(</span><span class="n">expmatrix</span><span class="p">,</span><span class="w"> </span><span class="n">groups</span><span class="p">)</span><span class="w">
</span>

Or you can plot the principle components and calculate TF and GO term enrichments of genes (defaults to top 2.5%) with highest and lowest loadings. With this function, the ouput files are directly saved to .pdf and .txt (by default to working directory).

<span class="n">pca_plot_enrich</span><span class="p">(</span><span class="n">expmatrix</span><span class="p">,</span><span class="w"> </span><span class="n">groups</span><span class="p">)</span><span class="w">
</span>

Heatmaps

heatmaps() uses the R package gplots.

Here, of the 30 most highly expressed genes.

<span class="n">select</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">order</span><span class="p">(</span><span class="n">rowMeans</span><span class="p">(</span><span class="n">expmatrix</span><span class="p">),</span><span class="n">decreasing</span><span class="o">=</span><span class="kc">TRUE</span><span class="p">)[</span><span class="m">1</span><span class="o">:</span><span class="m">30</span><span class="p">]</span><span class="w">

</span><span class="n">heatmaps</span><span class="p">(</span><span class="n">expmatrix</span><span class="p">[</span><span class="n">select</span><span class="p">,],</span><span class="w"> </span><span class="n">samplecols</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="nf">c</span><span class="p">(</span><span class="s2">"#E41A1C"</span><span class="p">,</span><span class="w"> </span><span class="s2">"#377EB8"</span><span class="p">,</span><span class="w"> </span><span class="s2">"#4DAF4A"</span><span class="p">,</span><span class="w"> </span><span class="s2">"#984EA3"</span><span class="p">),</span><span class="w"> </span><span class="n">each</span><span class="o">=</span><span class="m">4</span><span class="p">))</span><span class="w">
</span>

<span class="c1"># Heatmap function from DESeq2, using pheatmap:
</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">pheatmap</span><span class="p">)</span><span class="w">

</span><span class="n">sampleDists</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">dist</span><span class="p">(</span><span class="n">t</span><span class="p">(</span><span class="n">expmatrix</span><span class="p">))</span><span class="w">
</span><span class="n">sampleDistMatrix</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">as.matrix</span><span class="p">(</span><span class="n">sampleDists</span><span class="p">)</span><span class="w">
</span><span class="n">rownames</span><span class="p">(</span><span class="n">sampleDistMatrix</span><span class="p">)</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">paste</span><span class="p">(</span><span class="n">expmatrix_DESeq</span><span class="o">$</span><span class="n">treatment</span><span class="p">)</span><span class="w">
</span><span class="n">colnames</span><span class="p">(</span><span class="n">sampleDistMatrix</span><span class="p">)</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="kc">NULL</span><span class="w">
</span><span class="n">colors</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">grDevices</span><span class="o">::</span><span class="n">colorRampPalette</span><span class="p">(</span><span class="w"> </span><span class="n">rev</span><span class="p">(</span><span class="n">RColorBrewer</span><span class="o">::</span><span class="n">brewer.pal</span><span class="p">(</span><span class="m">9</span><span class="p">,</span><span class="w"> </span><span class="s2">"Blues"</span><span class="p">))</span><span class="w"> </span><span class="p">)(</span><span class="m">255</span><span class="p">)</span><span class="w">
</span><span class="n">pheatmap</span><span class="o">::</span><span class="n">pheatmap</span><span class="p">(</span><span class="n">sampleDistMatrix</span><span class="p">,</span><span class="w">
         </span><span class="n">clustering_distance_rows</span><span class="o">=</span><span class="n">sampleDists</span><span class="p">,</span><span class="w">
         </span><span class="n">clustering_distance_cols</span><span class="o">=</span><span class="n">sampleDists</span><span class="p">,</span><span class="w">
         </span><span class="n">col</span><span class="o">=</span><span class="n">colors</span><span class="p">)</span><span class="w">

</span><span class="n">df</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">data.frame</span><span class="p">(</span><span class="n">treatment</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">SummarizedExperiment</span><span class="o">::</span><span class="n">colData</span><span class="p">(</span><span class="n">data_DESeq</span><span class="p">)[,</span><span class="nf">c</span><span class="p">(</span><span class="s2">"treatment"</span><span class="p">)],</span><span class="w"> </span><span class="n">row.names</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">rownames</span><span class="p">(</span><span class="n">SummarizedExperiment</span><span class="o">::</span><span class="n">colData</span><span class="p">(</span><span class="n">data_DESeq</span><span class="p">)))</span><span class="w">
</span><span class="n">pheatmap</span><span class="o">::</span><span class="n">pheatmap</span><span class="p">(</span><span class="n">expmatrix</span><span class="p">[</span><span class="n">select</span><span class="p">,],</span><span class="w"> </span><span class="n">cluster_rows</span><span class="o">=</span><span class="kc">TRUE</span><span class="p">,</span><span class="w"> </span><span class="n">show_rownames</span><span class="o">=</span><span class="kc">TRUE</span><span class="p">,</span><span class="w"> </span><span class="n">cluster_cols</span><span class="o">=</span><span class="kc">TRUE</span><span class="p">,</span><span class="w"> </span><span class="n">annotation_col</span><span class="o">=</span><span class="n">df</span><span class="p">)</span><span class="w">
</span>
<span class="c1"># Interactive heatmap
</span><span class="w">
</span><span class="c1">#devtools::install_github('talgalili/heatmaply')
</span><span class="n">library</span><span class="p">(</span><span class="n">heatmaply</span><span class="p">)</span><span class="w">

</span><span class="c1"># sample correlation
</span><span class="n">hm</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">heatmapr</span><span class="p">(</span><span class="n">cor</span><span class="p">(</span><span class="n">expmatrix</span><span class="p">[</span><span class="n">select</span><span class="p">,]),</span><span class="w"> </span><span class="n">k_row</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">4</span><span class="p">,</span><span class="w"> </span><span class="n">k_col</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">4</span><span class="p">)</span><span class="w">

</span><span class="c1"># gene correlation
</span><span class="n">hm</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">heatmapr</span><span class="p">(</span><span class="n">cor</span><span class="p">(</span><span class="n">t</span><span class="p">(</span><span class="n">expmatrix</span><span class="p">[</span><span class="n">select</span><span class="p">,])),</span><span class="w"> </span><span class="n">k_row</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">NULL</span><span class="p">,</span><span class="w"> </span><span class="n">k_col</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">NULL</span><span class="p">)</span><span class="w">

</span><span class="n">heatmaply</span><span class="p">(</span><span class="n">hm</span><span class="p">)</span><span class="w">

</span><span class="c1">#Hover over heatmap to see individual values and sample/ gene IDs
</span>

WGCNA

Check WGCNA page for detailed description of the WGCNA package.

Hierarchical Clustering and outlier detection

Uses adjacency matrix function from the R package WGCNA and hierarchical clustering from the R package flashClust.

<span class="n">datTraits</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">data.frame</span><span class="p">(</span><span class="n">Ctrl</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="nf">rep</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="m">4</span><span class="p">),</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="m">12</span><span class="p">)),</span><span class="w"> </span><span class="n">TolPS</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="nf">rep</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">4</span><span class="p">),</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="m">4</span><span class="p">),</span><span class="nf">rep</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">8</span><span class="p">)),</span><span class="w"> </span><span class="n">TolS100A8</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="nf">rep</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">8</span><span class="p">),</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="m">4</span><span class="p">),</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">4</span><span class="p">)),</span><span class="w"> </span><span class="n">ActLPS</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="nf">rep</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">12</span><span class="p">),</span><span class="nf">rep</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="m">4</span><span class="p">)),</span><span class="w"> </span><span class="n">Tol</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="nf">rep</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">4</span><span class="p">),</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="m">8</span><span class="p">),</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">4</span><span class="p">)),</span><span class="w"> </span><span class="n">ExPhenotype</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="n">stats</span><span class="o">::</span><span class="n">rnorm</span><span class="p">(</span><span class="m">4</span><span class="p">,</span><span class="w"> </span><span class="m">10</span><span class="p">,</span><span class="w"> </span><span class="m">1</span><span class="p">),</span><span class="n">stats</span><span class="o">::</span><span class="n">rnorm</span><span class="p">(</span><span class="m">8</span><span class="p">,</span><span class="w"> </span><span class="m">25</span><span class="p">,</span><span class="w"> </span><span class="m">1</span><span class="p">),</span><span class="n">stats</span><span class="o">::</span><span class="n">rnorm</span><span class="p">(</span><span class="m">4</span><span class="p">,</span><span class="w"> </span><span class="m">50</span><span class="p">,</span><span class="w"> </span><span class="m">1</span><span class="p">)),</span><span class="w"> </span><span class="n">row.names</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">colnames</span><span class="p">(</span><span class="n">expmatrix</span><span class="p">))</span><span class="w">

</span><span class="n">datExpr</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">wgcna_sample_dendrogram</span><span class="p">(</span><span class="n">expmatrix</span><span class="p">,</span><span class="w"> </span><span class="n">datTraits</span><span class="p">)</span><span class="w">
</span>
##

##  Flagging genes and samples with too many missing values...
##   ..step 1
## All genes are okay!
## All samples are okay!
<span class="c1"># Optional: Remove outlier samples and repeats: All genes flagged for removal are saved to the object "remove_genes"
#head(remove_genes)
</span>

Choosing a Soft Threshold

Ideally, pick a SFT with R^2 > 0.9. In this example, this threshold was not reached, so I pick the highest SFT: 30.

<span class="c1"># Choose a set of soft thresholding powers
</span><span class="n">powers</span><span class="o">=</span><span class="nf">c</span><span class="p">(</span><span class="m">1</span><span class="o">:</span><span class="m">30</span><span class="p">)</span><span class="w">

</span><span class="c1"># choose power based on SFT criterion
</span><span class="n">sft</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">pickSoftThreshold</span><span class="p">(</span><span class="n">datExpr</span><span class="p">,</span><span class="w"> </span><span class="n">powerVector</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">powers</span><span class="p">,</span><span class="w"> </span><span class="n">verbose</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">5</span><span class="p">,</span><span class="w"> </span><span class="n">networkType</span><span class="w"> </span><span class="o">=</span><span class="s2">"signed"</span><span class="p">,</span><span class="w"> </span><span class="n">corFnc</span><span class="o">=</span><span class="w"> </span><span class="s2">"bicor"</span><span class="p">,</span><span class="w"> </span><span class="n">corOptions</span><span class="o">=</span><span class="nf">list</span><span class="p">(</span><span class="n">maxPOutliers</span><span class="o">=</span><span class="m">0.1</span><span class="p">))</span><span class="w">

</span><span class="c1"># Plot the results:
</span><span class="n">tiff</span><span class="p">(</span><span class="n">filename</span><span class="o">=</span><span class="w"> </span><span class="s2">"SFT.tiff"</span><span class="p">,</span><span class="w"> </span><span class="n">width</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">8000</span><span class="w"> </span><span class="p">,</span><span class="w"> </span><span class="n">height</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">4000</span><span class="p">,</span><span class="w"> </span><span class="n">res</span><span class="o">=</span><span class="m">600</span><span class="p">,</span><span class="w"> </span><span class="n">compression</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"lzw"</span><span class="p">)</span><span class="w">
</span><span class="n">par</span><span class="p">(</span><span class="n">mfrow</span><span class="o">=</span><span class="nf">c</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="m">2</span><span class="p">))</span><span class="w">
</span><span class="c1"># SFT index as a function of different powers
</span><span class="n">plot</span><span class="p">(</span><span class="n">sft</span><span class="o">$</span><span class="n">fitIndices</span><span class="p">[,</span><span class="m">1</span><span class="p">],</span><span class="o">-</span><span class="nf">sign</span><span class="p">(</span><span class="n">sft</span><span class="o">$</span><span class="n">fitIndices</span><span class="p">[,</span><span class="m">3</span><span class="p">])</span><span class="o">*</span><span class="n">sft</span><span class="o">$</span><span class="n">fitIndices</span><span class="p">[,</span><span class="m">2</span><span class="p">],</span><span class="w">
     </span><span class="n">xlab</span><span class="o">=</span><span class="s2">"Soft Threshold (power)"</span><span class="p">,</span><span class="n">ylab</span><span class="o">=</span><span class="s2">"SFT, signed R^2"</span><span class="p">,</span><span class="n">type</span><span class="o">=</span><span class="s2">"n"</span><span class="p">,</span><span class="n">main</span><span class="o">=</span><span class="n">paste</span><span class="p">(</span><span class="s2">"Scale independence"</span><span class="p">))</span><span class="w">
</span><span class="n">text</span><span class="p">(</span><span class="n">sft</span><span class="o">$</span><span class="n">fitIndices</span><span class="p">[,</span><span class="m">1</span><span class="p">],</span><span class="o">-</span><span class="nf">sign</span><span class="p">(</span><span class="n">sft</span><span class="o">$</span><span class="n">fitIndices</span><span class="p">[,</span><span class="m">3</span><span class="p">])</span><span class="o">*</span><span class="n">sft</span><span class="o">$</span><span class="n">fitIndices</span><span class="p">[,</span><span class="m">2</span><span class="p">],</span><span class="w">
     </span><span class="n">labels</span><span class="o">=</span><span class="n">powers</span><span class="p">,</span><span class="n">col</span><span class="o">=</span><span class="s2">"red"</span><span class="p">)</span><span class="w">
</span><span class="c1"># this line corresponds to using an R^2 cut-off of h
</span><span class="n">abline</span><span class="p">(</span><span class="n">h</span><span class="o">=</span><span class="m">0.90</span><span class="p">,</span><span class="n">col</span><span class="o">=</span><span class="s2">"red"</span><span class="p">)</span><span class="w">
</span><span class="c1"># Mean connectivity as a function of different powers
</span><span class="n">plot</span><span class="p">(</span><span class="n">sft</span><span class="o">$</span><span class=&qu...

To leave a comment for the author, please follow the link and comment on their blog: Shirin's playgRound.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)