Perform pairwise Wilcoxon test, classify groups by significance and plot results

March 25, 2017
By

(This article was first published on R – Fabio Marroni's Blog, and kindly contributed to R-bloggers)

This post is the result of work performed in collaboration with my colleague Eleonora Paparelli (who actually did most of the work!). We wanted to compare several distributions using Wilcoxon test and summarize results (i.e. indicate the comparisons showing significant differences).
R base includes pairwise.wilcox.test to perform Wilcoxon rank sum test between all pairs of samples in a study.
A common way to represent significance in pairwise comparisons is the use of letters. Samples sharing a letter are not different from each other. Samples not sharing letters are different.
Library multcompView in R can take a square matrix of p-values and return letters for all samples.
Sadly enough, pairwise.wilcox.test returns a triangular matrix, so I had to write a small function – named tri.to.squ – to take the output of pairwise.wilcox.test and convert it in a suitable input for the multcompLetters function of multcompView.
Now we can easily plot the distributions as box plot and add the letters as text.

library(multcompView)
 
tri.to.squ<-function(x)
{
rn<-row.names(x)
cn<-colnames(x)
an<-unique(c(cn,rn))
myval<-x[!is.na(x)]
mymat<-matrix(1,nrow=length(an),ncol=length(an),dimnames=list(an,an))
for(ext in 1:length(cn))
{
 for(int in 1:length(rn))
 {
 if(is.na(x[row.names(x)==rn[int],colnames(x)==cn[ext]])) next
 mymat[row.names(mymat)==rn[int],colnames(mymat)==cn[ext]]<-x[row.names(x)==rn[int],colnames(x)==cn[ext]]
 mymat[row.names(mymat)==cn[ext],colnames(mymat)==rn[int]]<-x[row.names(x)==rn[int],colnames(x)==cn[ext]]
 }
  
}
return(mymat)
}
first.set<-cbind(rnorm(100,mean=1.8),1)
second.set<-cbind(rnorm(100,mean=0.9),2)
third.set<-cbind(rnorm(100,mean=1),3)
 
full<-rbind(first.set,second.set,third.set)
 
pp<-pairwise.wilcox.test(full[,1], full[,2], p.adjust.method = "none", paired = FALSE)
mymat<-tri.to.squ(pp$p.value)
myletters<-multcompLetters(mymat,compare="<=",threshold=0.05,Letters=letters)
 
boxplot(full[,1] ~ full[,2],ylab="Something nice",names=c("Set 1","Set 2","Set 3"),ylim=c(min(full[,1]),0.3+max(full[,1])))
text(c(1,2,3),0.2+max(full[,1]),c(myletters$Letters[1],myletters$Letters[2],myletters$Letters[3]))


To leave a comment for the author, please follow the link and comment on their blog: R – Fabio Marroni's Blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)