(This article was first published on

I wrote earlier about some code I wrote for running Phylometa (software to do phylogenetic meta-analysis) from R.**Recology**, and kindly contributed to R-bloggers)I have been concerned about what exactly is the right penalty for including phylogeny in a meta-analysis. E.g.: AIC is calculated from Q in Phylometa, and Q increases with tree size.

So, I wrote some code to shuffle the tips of your tree N number of times, run Phylometa, and extract just the "Phylogenetic MA" part of the output. So, we compare the observed output (without tip shuffling) to the distribution of the tip shuffled output, and we can calculate a P-value from that. The code I wrote simply extracts the pooled effect size for fixed and also random-effects models. But you could change the code to extract whatever you like for the randomization.

I think the point of this code is not to estimate your pooled effects, etc., but may be an alternative way to compare traditional to phylogenetic MA where hopefully simply incorporating a tree is not penalizing the meta-analysis so much that you will always accept the traditional MA as better.

Get the code here, and also below. Get the example tree file and data file, named "phylogeny.txt" and "metadata_2g.txt", respectively below (or use your own data!). You need the file "phylometa_fxn.r" from my website, get here, but just call it using source as seen below.

#####################################################################

# Created by Scott Chamberlain

# Ecology and Evolutionary Biology Dept., Rice University

# Houston, TX 77005, USA

# my[email protected]

# http://schamber.wordpress.com/

#####################################################################

##### Directions

#Put your phylogeny, in format required by phylometa (see Phylometa manual), in your working directory

#Put your meta-analysis dataset, in format required by phylometa (see Phylometa manual), in your working directory

#Set working directory

#Use below functions

#Beware: only use a moderator variable with up to 6 groups

# Install packages

install.packages(c("plyr","ggplot2","ape","stringr")) # if not already installed

# Load packages

require(ggplot2); require(ape); require(stringr)

# Call and run functions (used below) in the working directory [NOTE:CHANGE TO YOUR WORKING DIRECTORY]

source("Z:/Mac/R_stuff/Blog_etc/Phylometa_randomization/phylometa_fxn.r")

####### Fxn to do randomization equivalent of Phylometa analysis

# phy = name of phylogeny file

# dat = name of data file in Phylometa format

# reps = number of randomized trees per real tree

# groups = number of levels in grouping/moderator variable

# Note that trees are written to directory so that Phylometa can call them

# Output is a jpeg of the histograms of randomized results, and vertical line for observed value

# Also, p-values for fixed and random-effects are output in the console

PhylogMetaRand <- function(phy, dat, reps, groups) {

# Observed data

obsout <- phylometa.process(system(paste("\"new_phyloMeta_1.2b.exe\"", phy, dat), intern=T), groups)

subobsout <- subset(obsout[4][[1]], Group == "All studies")

obsfixedeffect <- subobsout[1,3]

obsrandeffect <- subobsout[2,3]

# Randomization of data by shuffling tips on tree

phylo <- read.tree(phy)

rtree_list <- list()

for(i in 1:reps) {

temp <- phylo

randtips <- sample(temp$tip.label, length(temp$tip.label))

temp$tip.label <- randtips

rtree_list[[i]] <- temp

}

treenames <- list()

for(i in 1:reps){

treenames[[i]] <- paste("tree",i,".txt","",sep="")

}

WriteTrees2(rtree_list)

fixedrandeffects <- data.frame(matrix(NA, reps, 2))

for(i in 1:length(treenames)) {

cat(i, "\n")

out <- system(paste("\"new_phyloMeta_1.2b.exe\"", treenames[[i]], dat), intern=T)

fixrandout <- subset(phylometa.process(out, groups)[4][[1]], Group == "All studies")

fixedrandeffects[i,] <- c(fixrandout[1,3], fixrandout[2,3])

}

names(fixedrandeffects) <- c("fixed", "random")

# Plot results

fixed_ <- ggplot(fixedrandeffects, aes(x = fixed)) +

geom_histogram() +

geom_vline(xintercept = obsfixedeffect, colour="red", size=1)

random_ <- ggplot(fixedrandeffects, aes(x = random)) +

geom_histogram() +

geom_vline(xintercept = obsrandeffect, colour="red", size=1)

jpeg(paste(dat, ".jpeg", sep=""))

arrange_(fixed_, random_)

dev.off()

# Calculate randomization P-value

num_greater_than_fixed <- length(which(fixedrandeffects$fixed >= obsfixedeffect))

pval_fixed <- num_greater_than_fixed / reps

num_greater_than_random <- length(which(fixedrandeffects$random >= obsrandeffect))

pval_random <- num_greater_than_random / reps

pvals <- c(pval_fixed, pval_random)

names(pvals) <- c("pval_fixed","pval_random")

pvals

}

An example:

setwd("Z:/Mac/R_stuff/Blog_etc/Phylometa_randomization/")

PhylogMetaRand(phy = "phylogeny.txt", dat = "metadata_2g.txt", reps = 100, groups = 2)

As you can see, the observed values fall well within the distribution of values obtained from shuffling tips. P-values were 0.64 and 0.68 for fixed- and random-effects MA's, respectively.

To

**leave a comment**for the author, please follow the link and comment on his blog:**Recology**.R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...