Access all UCSC wiggle tracks from R and your terminal

[This article was first published on Recipes, scripts and genomics, and kindly contributed to R-bloggers].

The rtracklayer package allows you to access most of the UCSC wiggle tracks from R. However, there is another way, which might be more practical when you need to summarize the wig track scores over a given set of genomic coordinates. Although you can get similar information from rtracklayer, you will need to compute the summary statistics for each genomic coordinate yourself by looping over a RangedList object, which may not be practical if you have a large set of genomic coordinates.

The other way to do this is to use the hgWiggle program from the UCSC source code. You can download the program from here, and you will need to prepare a three-line “.hg.conf” file in your home directory.
.hg.conf should contain exactly the following lines:

db.host=genome-mysql.cse.ucsc.edu
db.user=genomep
db.password=password


Then run “chmod 600 .hg.conf” so that only you can read and write the file. You should also download the relevant .wib file from UCSC. For the human hg18 assembly, they are located at ftp://hgdownload.cse.ucsc.edu/gbdb/hg18/ . Now we are ready to use hgWiggle. See this wiki page for all the other details.
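The setup step can be sketched as a small shell function. This is only a sketch: the function name write_hg_conf and its optional directory argument are my own invention for illustration, and it assumes a POSIX shell. It refuses to overwrite an existing .hg.conf.

```shell
# Sketch: write the three-line .hg.conf and restrict its permissions.
# write_hg_conf is a hypothetical helper; $1 is the target directory
# (defaults to the home directory).
write_hg_conf() {
    conf_file="${1:-$HOME}/.hg.conf"

    # Do not clobber a configuration that is already there.
    if [ -e "$conf_file" ]; then
        echo "$conf_file already exists; leaving its contents untouched" >&2
    else
        printf '%s\n' \
            'db.host=genome-mysql.cse.ucsc.edu' \
            'db.user=genomep' \
            'db.password=password' > "$conf_file"
    fi

    # Restrict the file to owner read/write only.
    chmod 600 "$conf_file"
}
```

Calling write_hg_conf with no argument writes to your home directory; passing a directory is handy if you want to try it out somewhere else first.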

As an example, you can get the statistics for chr16 for phastCons scores as shown below.

hgWiggle -db=hg18 -chr=chr16 -doStats phastCons44wayPlacental

You can also supply a BED file of genomic coordinates using the -bedFile option; you will then get a phastCons summary for the regions in the BED file.
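To make the BED file variant concrete, here is a rough sketch. The helper names (make_example_bed, wiggle_stats_for_bed), the file name regions.bed and the two chr16 intervals are all made up for illustration. The hgWiggle call combines the options from the chr16 example with -bedFile, and the function simply reports an error if hgWiggle is not on your PATH.

```shell
# Sketch: create a small BED file and summarize a wiggle track over it.
# Both helpers and the coordinates are hypothetical illustration values.
make_example_bed() {
    # $1 = output file. BED columns are tab-separated:
    # chromosome, 0-based start, end.
    printf 'chr16\t100000\t101000\n' >  "$1"
    printf 'chr16\t250000\t250500\n' >> "$1"
}

wiggle_stats_for_bed() {
    # $1 = BED file, $2 = database (e.g. hg18), $3 = track name
    if command -v hgWiggle >/dev/null 2>&1; then
        hgWiggle -db="$2" -bedFile="$1" -doStats "$3"
    else
        echo "hgWiggle not found on PATH" >&2
        return 127
    fi
}
```

A typical call would be: make_example_bed regions.bed, followed by wiggle_stats_for_bed regions.bed hg18 phastCons44wayPlacental.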

It is also pretty straightforward to call hgWiggle from R using the system() function, so you can include this functionality in your R code. Although I intended to do that here when I started the post, calling a command-line tool with system() is trivial, so I won’t be doing that for now. I might add it later on when I have more time.
