Transitions and transversions in R
[This article was first published on The Praise of Insects, and kindly contributed to R-bloggers.]
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
A couple of months ago I wrote the following R function to calculate the number of transitions and transversions between DNA sequences in an alignment. The function is fairly slow thanks to the double for loop (an alignment of ~100 sequences, 800 bp in length, takes around 30 seconds to run). However, in this case I shall plead Uwe’s Maxim: “Computers are cheap and thinking hurts”.
In other R news, there’s a cool site, R-bloggers, that is a portal to a number of other blogs that deal with R. It’s great to see what other people manage to do in R and a good way to learn about its capabilities.
Happy New Year!
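library(ape)

# Input: dat, an object of class 'DNAbin'
# Output: square matrix with transitions in the lower triangle
#         and transversions in the upper triangle
titv <- function(dat) {
  mat <- as.matrix(dat)
  n <- nrow(mat)
  # rownames(mat) holds the sequence labels; names(dat) is NULL for a matrix
  res <- matrix(NA, ncol = n, nrow = n,
                dimnames = list(x = rownames(mat), y = rownames(mat)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # Site-by-site sum of the raw byte codes of sequences i and j
      vec <- as.numeric(mat[i, ]) + as.numeric(mat[j, ]) - 8
      # Every pair is visited twice; the later visit (i > j) sets the final values
      res[i, j] <- sum(vec %in% c(200, 56))            # Transitions
      res[j, i] <- sum(vec %in% c(152, 168, 88, 104))  # Transversions
    }
  }
  res
}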

# Example
data(woodmouse)
ti <- titv(woodmouse)
tv <- t(ti)
tv[lower.tri(tv)] # Number of transversions
ti[lower.tri(ti)] # Number of transitions

# Saturation plot
dist <- dist.dna(woodmouse)
plot(ti[lower.tri(ti)] ~ dist)
points(tv[lower.tri(tv)] ~ dist, pch = 20, col = "red")
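
The numbers that titv() looks for (200, 56, 152, 168, 88 and 104) may seem arbitrary. They follow from the way ape stores nucleotides as single bytes: adding the bytes of two bases, and subtracting 8 as the function does, gives a value that identifies the unordered pair. The sketch below is only an illustration and not part of the original function. It assumes ape's bit-level codes for the four unambiguous bases (A = 136, G = 72, C = 40, T = 24) and hard-codes them in a hypothetical `codes` vector so that it runs in base R without ape.

```r
# Assumed byte values for the unambiguous bases in ape's 'DNAbin' coding
# (hard-coded for illustration; `codes` is not an ape object)
codes <- c(a = 136, g = 72, c = 40, t = 24)

# Sum of the codes for every pair of bases, minus 8 as in titv()
pair_sum <- outer(codes, codes, "+") - 8

# Purines are a and g; pyrimidines are c and t
is_purine <- names(codes) %in% c("a", "g")
same_class <- outer(is_purine, is_purine, "==")
same_base <- outer(names(codes), names(codes), "==")

# Transitions: different bases within the same class
transition_sums <- sort(unique(pair_sum[same_class & !same_base]))
# Transversions: one purine and one pyrimidine
transversion_sums <- sort(unique(pair_sum[!same_class]))

transition_sums    # 56 200
transversion_sums  # 88 104 152 168
```

Under these assumptions the two transition values and four transversion values are exactly the ones used in titv(). Ambiguity codes and gaps have other byte values and are not covered by this sketch.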