(This article was first published on The Praise of Insects, and kindly contributed to R-bloggers)
A couple of months ago I wrote the following R function to calculate the number of transitions and transversions between DNA sequences in an alignment. The function is fairly slow (an alignment of ~100 sequences, 800 bp in length takes around 30 seconds to run) thanks to the double for loop, however in this case I shall plead Uwe's Maxim: "Computers are cheap and thinking hurts".In other R news, there's a cool site, R-bloggers, that is a portal to a number of other blogs that deal with R. It's great to see what other people manage to do in R and a good way to learn about its capabilities.
Happy New Year!
library(ape)
#Input: dat---an object of class 'DNAbin'
titv<-function(dat){
mat<-as.matrix(dat)
res<-matrix(NA, ncol=dim(mat)[1], nrow=dim(mat)[1], dimnames=list(x=names(dat), y=names(dat)))
for(i in 1:dim(mat)[1]){
for(j in 1:dim(mat)[1]){
vec<-as.numeric(mat[i,])+as.numeric(mat[j,])-8
res[i,j]<-length(grep("200|56",vec)) #Transitions
res[j,i]<-length(grep("152|168|88|104",vec)) #Transversions
}
}
res
}
#Example
data(woodmouse)
ti<-titv(woodmouse)
tv<-t(ti)
tv[lower.tri(tv)] #Number of transversions
ti[lower.tri(ti)] #Number of transitions
#Saturation plot
dist<-dist.dna(woodmouse)
plot(ti[lower.tri(ti)]~dist)
points(tv[lower.tri(tv)]~dist, pch=20, col="red")
To leave a comment for the author, please follow the link and comment on his blog: The Praise of Insects.
R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series,ecdf, trading) and more...

Zero Inflated Models and Generalized Linear Mixed Models with R.
Zuur, Saveliev, Ieno (2012).