I’ve been asked a few times how to make a so-called volcano plot from gene expression results. A volcano plot shows some measure of effect on the x-axis (typically the fold change, usually on a log2 scale) and the statistical significance on the y-axis (typically the -log10 of the p-value). Genes that are strongly dysregulated sit farther toward the left and right sides, while highly significant changes appear higher on the plot.
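To make the two axes concrete, here is a minimal sketch of the transformations involved. The fold changes and p-values are made-up illustrative numbers, not results from any dataset.

```r
# Hypothetical fold changes and p-values, for illustration only.
fold_change <- c(0.5, 1, 4)
p <- c(0.05, 0.001, 1e-6)

# x-axis: log2 fold change, so halving and doubling are symmetric around 0.
log2_fc <- log2(fold_change)

# y-axis: -log10 of the p-value, so smaller p-values land higher on the plot.
neg_log10_p <- -log10(p)

print(log2_fc)               # -1 0 2
print(round(neg_log10_p, 2)) # 1.3 3.0 6.0
```

A p-value of 0.05 therefore sits at about 1.3 on the y-axis, and each further unit up the axis is another factor of ten in significance.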

I’ve analyzed some data from GEO (GSE52202), an RNA-seq study of gene expression in motor neurons differentiated from induced pluripotent stem cells (iPSCs) derived from ALS patients carrying the C9ORF72 repeat expansion. I aligned the reads, counted them with featureCounts, and analyzed the counts with DESeq2. I uploaded the results to this GitHub Gist.

Here’s how you can use R to create a simple volcano plot. First, you’ll need to install the devtools package so that you can install my Tmisc package directly from GitHub (I haven’t submitted it to CRAN). There’s a function in Tmisc called **read.gist()**, which reads data directly from GitHub Gists when you give it the Gist ID (*be careful with this…*).

After reading in the data from GitHub, the next section creates a basic volcano plot. A few more lines color the points based on their fold change and statistical significance. Finally, if you have the **calibrate** package installed, the last line labels a few genes of interest.
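The steps described above can be sketched roughly as follows. This is a sketch under stated assumptions rather than the exact code from the post: instead of reading the Gist with read.gist(), it simulates a small results table, and the column names (`Gene`, `log2FoldChange`, `pvalue`, `padj`) are assumed to follow the usual DESeq2 output. The cutoffs (adjusted p-value below 0.05, absolute log2 fold change above 1) and the colors are illustrative choices.

```r
# Simulated stand-in for a DESeq2 results table (hypothetical data, NOT GSE52202).
set.seed(42)
n <- 1000
res <- data.frame(
  Gene = paste0("gene", seq_len(n)),
  log2FoldChange = rnorm(n, mean = 0, sd = 0.8),
  stringsAsFactors = FALSE
)
# Shrink p-values for larger effects so the simulated plot has a volcano shape.
res$pvalue <- runif(n) * exp(-2 * res$log2FoldChange^2)
res$padj <- p.adjust(res$pvalue, method = "BH")

# Classify each gene by significance and effect size (assumed cutoffs).
sig <- res$padj < 0.05
big <- abs(res$log2FoldChange) > 1
res$status <- ifelse(sig & big, "both",
              ifelse(sig, "significant",
              ifelse(big, "large change", "neither")))
cols <- c(neither = "black", significant = "red",
          "large change" = "orange", both = "green")

# Basic volcano plot, colored by class.
with(res, plot(log2FoldChange, -log10(pvalue), pch = 20,
               col = cols[status], main = "Volcano plot"))

# Label genes passing both cutoffs; use calibrate::textxy() if it is installed,
# otherwise fall back to base text().
top <- subset(res, status == "both")
if (nrow(top) > 0) {
  if (requireNamespace("calibrate", quietly = TRUE)) {
    with(top, calibrate::textxy(log2FoldChange, -log10(pvalue),
                                labs = Gene, cex = 0.8))
  } else {
    with(top, text(log2FoldChange, -log10(pvalue),
                   labels = Gene, cex = 0.6, pos = 3))
  }
}
```

To use your own results, replace the simulated `res` with a data frame read from your DESeq2 output, keeping (or renaming to) the same column names.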

To **leave a comment** for the author, please follow the link and comment on his blog: **Getting Genetics Done**.
