Site icon R-bloggers

Multiple Tests, an Introduction

[This article was first published on Freakonometrics » R-english, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Last week, a student asked me about multiple tests. More precisely, she ran an experience over – say – 20 weeks, with the same cohort of – say – 100 patients. An we observe some

size=100
nb=20
set.seed(1)
X=matrix(rnorm(size*nb),size,nb)

(here, I just generate some fake data). I can visualize some trajectories, over the 20 weeks,

library(RColorBrewer)
cl1=brewer.pal(12,"Set3")[12]
cl2=brewer.pal(8,"Set2")[2:7]
cl=c(cl1,cl2)
boxplot(X)
for(i in 1:length(cl)){
lines(1:20,X[i,],type="b",col=cl[i])  }

 

She wanted to compare the averages, over the experiments.

boxplot(X)
for(i in 1:length(cl)){
lines(1:20,X[i,],type="b",col=cl[i])  }

The question is what do you want to test ? Is it

versus

Because that one might be not very informative (if we reject , we don’t really know why). Here, I did consider completely independent observations, over individuals, and over time. Nevertheless, if we perform 190 tests on pairs, we should not be suprised – even if is valid – that we reject it many times !

> P=NULL
> for(i in 1:(nb-1)){ for(j in (i+1):nb){
+ P=c(P,t.test(X[,i],X[,j])$p.value) }}
> mean(P>.05)
[1] 0.9368421

i.e. for 12 pairs, we reject the assumption of identical mean. Even if that assumption is true, here. That’s the problem with multiple tests.

Actually, observe that over those 190 tests, the -values are uniformly distributed over the unit interval. Thus, for (almost) 95% of the tests, the -value exceeds 5%.

Now, what if we relax the assumption of independence ? Here, we need to be more specific. A first idea might be to assume that

for some trend . We can actually assume that the trend depends on the individuals, and write, for instance,

Standard regression techniques can be considered, under those models. We’ll probably discuss that point before the end of this year.

My point was the following : if the assume that

consist of independent observations, where all ‘s are assumed to be independent, if we use pairwise comparison tests for the means, in 95% of the tests, the -value exceeds 5%, and in 90% of the tests, the -value exceeds 10%. That the interpreation of the property that -values are uniformly distributed on the unit interval. Now, what if we relax the independence assumption. What if obsevations were – somehow – correlated.

A first idea could be to consider a linear trend,

X=matrix(rnorm(size*nb),size,nb)
T=t(matrix((1:nb)/nb*r,nb,size))
X=X+T*a

Here, we still compute pairwise comparison tests

> P=NULL
> for(i in 1:(nb-1)){ for(j in (i+1):nb){
+ P=c(P,t.test(X[,i],X[,j])$p.value) }}

and we counts the number of pairs where the -values exceeds some given values

PV[1]=mean(P>.025)
PV[2]=mean(P>.05)
PV[3]=mean(P>.075)
PV[4]=mean(P>.1)
PV[5]=mean(P>.125)
PV[6]=mean(P>.15)

If we look at the proportions, as a function of the slope, we have the following graph

In the middle, we have the independent case, e.g. 97.5% of the the -values exceeds 2.5% – the upper curve. But, as the slope increase (or decrease), the proportion decreases. With a 0.2 slope, 90% of the the -values exceeds 2.5%.

One can also assume that

is some time series. An autoregressive one, for instance,

Here we use to following code to generate that vector

autoreg=function(n,r){
M=matrix(1,n,n)
for(i in 1:(n-2)){
diag(M[(i+1):n,1:(n-i)])=r^i
diag(M[1:(n-i),(i+1):n])=r^i
}
M[1,n]=M[n,1]=r^(i+1)
return(M)
}
S=autoreg(nb,r)
library(mnormt)
X=rmnorm(size,varcov=S)

The graph below is the evolution of the proportion of -values exceeding some threshold, as a function of the autoregressive coefficient.

The value on the left is the independent case (and -values are uniformly distributed). The stronger the autocorrelation, the larger the proportion of -values exceeding any value.

Finally, consider the case where components of vector

are exchangeable.

S=matrix(r,nb,nb)
diag(S)=1
library(mnormt)
X=rmnorm(size,M,varcov=S)

In that case, we have the following graph, with the proportion of -values exceeding some value, as the function of the correlation of (any) pair .

This is the idea that is used in any correction-type method, with multiple tests, for instance Bonferroni correction. When a run tests, instead of comparing the -value with , we use a correction on , . But as we can see here, it will depend on the kind of dependence we assume there might be.

Next time, I will discuss my student’s data, see what could be said.

To leave a comment for the author, please follow the link and comment on their blog: Freakonometrics » R-english.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.