(This article was first published on **YGC » R**, and kindly contributed to R-bloggers)

I just figured out how to query UTR sequences from Ensembl using the BioMart tool, via the R package biomaRt.

It is much simpler than using BioPerl to parse GenBank (gbk) files and extract the UTR sequences.


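```r
library(biomaRt)
library(org.Hs.eg.db)

# Connect to the Ensembl human gene dataset
ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")

# Entrez gene IDs that have GO annotation
eg <- mappedkeys(org.Hs.egGO)

# Fetch 3' UTR sequences; column 1 is the sequence, column 2 the gene ID.
# Note: newer Ensembl releases may name this ID type "entrezgene_id".
utr <- getSequence(id = eg, type = "entrezgene", seqType = "3utr", mart = ensembl)

# Write the sequences in FASTA format
outfile <- file("human-3utr.fa", "w")
for (i in seq_len(nrow(utr))) {
  h <- paste0(">", utr[i, 2])
  writeLines(h, outfile)
  writeLines(utr[i, 1], outfile)
}
close(outfile)
```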
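Not every gene has an annotated 3' UTR, so the result may contain placeholder entries. The sketch below is one possible way to drop them and write the FASTA file without a loop. It uses a toy data frame in place of a real `getSequence()` result so it runs offline. The placeholder string `"Sequence unavailable"`, the toy IDs and sequences, and the temporary output path are all assumptions for illustration; check what your own query actually returns.

```r
# Toy stand-in for the data frame returned by getSequence():
# column 1 holds the sequence, column 2 the gene ID
# (the same column layout the loop above relies on).
utr <- data.frame(
  seq = c("ACGTACGT", "Sequence unavailable", "TTGACC"),
  id  = c("1", "2", "9"),
  stringsAsFactors = FALSE
)

# Assumption: missing UTRs are marked with this placeholder string.
keep   <- utr[[1]] != "Sequence unavailable"
utr_ok <- utr[keep, , drop = FALSE]

# Interleave header and sequence lines: >id, sequence, >id, sequence, ...
# rbind() builds a 2 x N matrix; as.vector() reads it column by column.
fasta_lines <- as.vector(rbind(paste0(">", utr_ok[[2]]), utr_ok[[1]]))

# Hypothetical output path; replace with e.g. "human-3utr.fa".
out_path <- tempfile(fileext = ".fa")
writeLines(fasta_lines, out_path)
```

Building the lines as one vector and calling `writeLines()` once avoids opening and closing a connection by hand, and the filter keeps empty records out of downstream tools.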
