A couple of months ago I wrote the following R function to calculate the number of transitions and transversions between DNA sequences in an alignment. The function is fairly slow (an alignment of ~100 sequences, 800 bp in length takes around 30 seconds to run) thanks to the double for loop, however in this case I shall plead Uwe’s Maxim: “Computers are cheap and thinking hurts”.

In other R news, there’s a cool site, R-bloggers, that is a portal to a number of other blogs that deal with R. It’s great to see what other people manage to do in R and a good way to learn about its capabilities.

Happy New Year!

library(ape)
#Input: dat—an object of class ‘DNAbin’

titv<-function(dat){

mat<-as.matrix(dat)

res<-matrix(NA, ncol=dim(mat)[1], nrow=dim(mat)[1], dimnames=list(x=names(dat), y=names(dat)))

for(i in 1:dim(mat)[1]){

for(j in 1:dim(mat)[1]){

vec<-as.numeric(mat[i,])+as.numeric(mat[j,])-8

res[i,j]<-length(grep("200|56",vec)) #Transitions

res[j,i]<-length(grep("152|168|88|104",vec)) #Transversions

}

}

res

}

#Example

data(woodmouse)

ti<-titv(woodmouse)

tv<-t(ti)
tv[lower.tri(tv)] #Number of transversions

ti[lower.tri(ti)] #Number of transitions

#Saturation plot

dist<-dist.dna(woodmouse)
plot(ti[lower.tri(ti)]~dist)

points(tv[lower.tri(tv)]~dist, pch=20, col=”red”)

*Related*

To

**leave a comment** for the author, please follow the link and comment on his blog:

** The Praise of Insects**.

R-bloggers.com offers

**daily e-mail updates** about

R news and

tutorials on topics such as: visualization (

ggplot2,

Boxplots,

maps,

animation), programming (

RStudio,

Sweave,

LaTeX,

SQL,

Eclipse,

git,

hadoop,

Web Scraping) statistics (

regression,

PCA,

time series,

trading) and more...

If you got this far, why not

__subscribe for updates__ from the site? Choose your flavor:

e-mail,

twitter,

RSS, or

facebook...

**Tags:** computer, geeky, R, systematics