Changing phylogeny tip labels in R

January 14, 2011

(This article was first published on The Praise of Insects, and kindly contributed to R-bloggers)

During the process of molecular systematic research, specimens are given code names and numbers to keep track of data through the pipeline. These can contain a lot of information of relevance to the researcher, but unfortunately are meaningless to others who aren't as involved with the data. On publication, it is necessary to change the names from the code to a label that is more widely understood. This process can be tedious and fiddly, particularly when it needs to be done multiple times.

The following is a simple R-based solution for changing the tip labels of phylogenetic trees. First, we need to create a tree and a data frame containing both the specimen codes and the final labels.
library(ape)
tr <- rtree(5)  # random tree with five tips, labelled t1 to t5
d1 <- c("t1","t2","t3","t4","t5")
d2 <- c("paste(italic('Aus bus'), ' top')",
        "paste(italic('Aus bus'), ' bottom')",
        "paste(italic('Aus cus'), ' middle')",
        "paste(italic('Aus cus'), ' north')",
        "paste(italic('Dus gus'), ' south')")
# keep both columns as character strings rather than factors
d <- data.frame(label=d1, nlabel=d2, stringsAsFactors=FALSE)

Each entry in the nlabel column is a string of R code defining a plottable expression, which allows scientific names to be set in italics. In my work, I save this table as a separate tab-delimited file, which I read in with read.table("file.txt", header=TRUE, sep="\t", stringsAsFactors=FALSE, quote=""). The quote argument is important because it carries the nested quotes through into the data frame intact.
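As a rough sketch of that file-based workflow, the lines below write a small lookup table to a temporary tab-delimited file and read it back with the same read.table() arguments. The names lookup, path and d_file are made up for illustration, and the temporary file only stands in for "file.txt".

```r
# Hypothetical two-row lookup table; the temporary file stands in for "file.txt"
lookup <- data.frame(
  label  = c("t1", "t2"),
  nlabel = c("paste(italic('Aus bus'), ' top')",
             "paste(italic('Aus bus'), ' bottom')"),
  stringsAsFactors = FALSE
)
path <- tempfile(fileext = ".txt")

# quote = FALSE writes the strings without wrapping them in double quotes
write.table(lookup, path, sep = "\t", quote = FALSE, row.names = FALSE)

# quote = "" switches off quote handling, so the single quotes inside
# each label are read back as ordinary characters
d_file <- read.table(path, header = TRUE, sep = "\t",
                     stringsAsFactors = FALSE, quote = "")
```

After this, d_file$nlabel should hold the same strings as lookup$nlabel, nested quotes included.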

The business of actually changing the tip labels is done with the following lines:
tr$tip.label<-d[[2]][match(tr$tip.label, d[[1]])]
tr$tip.label<-sapply(tr$tip.label, function(x) parse(text=x))

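The first line uses match() to find each tip's code in the first column of the data frame and substitutes the corresponding new label, keeping the labels in the tree's tip order. The second line parses each character string into an expression that can be rendered when the tree is plotted.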
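To see what those two steps do without a tree, here is a minimal base-R sketch. The vectors codes, new_labels and tips are invented for illustration, and the NA check is an addition of mine rather than part of the original recipe; it guards against tips that have no entry in the table, for which match() returns NA.

```r
# Hypothetical lookup vectors and a tip order that differs from the table order
codes      <- c("t1", "t2", "t3")
new_labels <- c("paste(italic('Aus bus'), ' top')",
                "paste(italic('Aus bus'), ' bottom')",
                "paste(italic('Aus cus'), ' middle')")
tips <- c("t3", "t1", "t2")

# Position of each tip code within the table
idx <- match(tips, codes)

# Stop early if any tip is missing from the table
stopifnot(!anyNA(idx))

# New labels, reordered to follow the tips
relabelled <- new_labels[idx]

# parse() accepts a whole character vector and returns one expression per element
exprs <- parse(text = relabelled)
```

Here idx is 3, 1, 2, so relabelled starts with the 'Aus cus' label, and exprs is an expression vector of length three.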

Plot the tree and voilà!
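Putting the pieces together, a complete run might look like the sketch below. It assumes the ape package is installed; the set.seed() call is my addition, there only to make the random tree repeatable.

```r
library(ape)

set.seed(1)     # assumption: fixed seed so the random tree is repeatable
tr <- rtree(5)  # tips are labelled t1 to t5

# Lookup table: specimen codes and the plotmath strings that replace them
d <- data.frame(
  label  = c("t1", "t2", "t3", "t4", "t5"),
  nlabel = c("paste(italic('Aus bus'), ' top')",
             "paste(italic('Aus bus'), ' bottom')",
             "paste(italic('Aus cus'), ' middle')",
             "paste(italic('Aus cus'), ' north')",
             "paste(italic('Dus gus'), ' south')"),
  stringsAsFactors = FALSE
)

# Swap the codes for the new labels, then turn the strings into expressions
tr$tip.label <- d[[2]][match(tr$tip.label, d[[1]])]
tr$tip.label <- sapply(tr$tip.label, function(x) parse(text = x))

plot(tr)  # dispatches to ape's plot method for "phylo" objects
```

The tips should now show the species names in italics, followed by the locality in upright type.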
