Using genomation to analyze methylation profiles from Roadmap epigenomics and ENCODE

March 31, 2015
By

(This article was first published on Recipes, scripts and genomics, and kindly contributed to R-bloggers)

The genomation package is a toolkit for annotation and visualization of various genomic data. The package is currently in developmental version of BioC. It allows to analyze high-throughput data, including bisulfite sequencing data. Here, we will visualize the distribution of CpG methylation around promoters and their locations within gene structures on human chromosome 3.

Heatmap and plot of meta-profiles of CpG methylation around promoters

In this example we use data from Reduced Representation Bisulfite Sequencing (RRBS) and
Whole-genome Bisulfite Sequencing (WGBS) techniques and H1 and IMR90 cell types
derived from the ENCODE and the Roadmap Epigenomics Project databases.

We download the datasets and convert them to GRanges objects. Using rtracklayer and genomation functions. We also use a refseq bed file for annotation and extraction of promoter regions using readTranscriptFeatures function.



Since we have read the files now we can build base-pair resolution matrices of scores(methylation values) for each experiment. The returned list of matrices can be used to draw heatmaps or meta profiles of methylation ratio around promoters.







Distribution of covered CpGs across gene regions

genomation facilitates visualization of given locations of features aggregated by  exons, introns, promoters and TSSs. To find the distribution of covered CpGs within these gene structures, we will use transcript features we previously obtained. Here is the breakdown of the code

  1. Count overlap statistics between our CpGs from WGBS and RRBS H1 cell type and gene structures
  2. Calculate percentage of CpGs overlapping with annotation
  3. plot them in a form of pie charts



To leave a comment for the author, please follow the link and comment on their blog: Recipes, scripts and genomics.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)