Calling BEDtools from R

February 22, 2011
By

(This article was first published on Recipes, scripts and genomics, and kindly contributed to R-bloggers)

BEDtools suite provides command-line functionality when dealing with genomic coordinate based operations, such as overlapping bed files or getting coverage of a bed file over a genome (similar, not exactly same, functionality in R is provided by IRanges package in bioconductor). If you have the BEDtools suite installed and it is in your path, you can easily call BEDtools executables from R using the "system()" command. See BEDtools website to learn about BEDtools installation.

The function pasted below calls BEDtools executables on two data-frames. Both data-frames are structured as bed files. The function first writes out the data-frames as temporary files and calls the BEDtools programs on those temporary files and writes the output to another temporary file. In the end the function reads in the resulting file as a data frame and deletes the temporary files. I guess you can do this more elegantly with pipe() command in R, but that is left as an exercise to you: )


bedTools.2in<-function(functionstring="bedIntersect",bed1,bed2,opt.string="")
{
#create temp files
a.file=tempfile()
b.file=tempfile()
out =tempfile()
options(scipen =99) # not to use scientific notation when writing out
 
#write bed formatted dataframes to tempfile
write.table(bed1,file=a.file,quote=F,sep="\t",col.names=F,row.names=F)
write.table(bed2,file=b.file,quote=F,sep="\t",col.names=F,row.names=F)
 
# create the command string and call the command using system()
command=paste(functionstring,"-a",a.file,"-b",b.file,opt.string,">",out,sep=" ")
cat(command,"\n")
try(system(command))
 
res=read.table(out,header=F)
unlink(a.file);unlink(b.file);unlink(out)
return(res)
}
 
>bed1
V1 V2 V3 V4 V5 V6
1 chr1 67051161 67052451 ENST00000371026 1 -
2 chr1 67060631 67060788 ENST00000371026 2 -
3 chr1 67065090 67065317 ENST00000371026 3 -
4 chr1 67066082 67066181 ENST00000371026 4 -
5 chr1 67071855 67071977 ENST00000371026 5 -
6 chr1 67072261 67072419 ENST00000371026 6 -
 
>bedTools.2in("bedIntersect",bed1,bed2) # equivalent to "bedIntersect -a bed1 -b bed2" on terminal

To leave a comment for the author, please follow the link and comment on his blog: Recipes, scripts and genomics.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.