Multiple Tests, an Introduction

[This article was first published on Freakonometrics » R-english, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Last week, a student asked me about multiple tests. More precisely, she ran an experience over – say – 20 weeks, with the same cohort of – say – 100 patients. An we observe some http://latex.codecogs.com/gif.latex?X_{i,t}

size=100
nb=20
set.seed(1)
X=matrix(rnorm(size*nb),size,nb)

(here, I just generate some fake data). I can visualize some trajectories, over the 20 weeks,

library(RColorBrewer)
cl1=brewer.pal(12,"Set3")[12]
cl2=brewer.pal(8,"Set2")[2:7]
cl=c(cl1,cl2)
boxplot(X)
for(i in 1:length(cl)){
lines(1:20,X[i,],type="b",col=cl[i])  }

https://i0.wp.com/f-origin.hypotheses.org/wp-content/blogs.dir/253/files/2014/09/Selection_585.png?w=578

 

She wanted to compare the averages, over the experiments.

boxplot(X)
for(i in 1:length(cl)){
lines(1:20,X[i,],type="b",col=cl[i])  }

The question is what do you want to test ? Is it

http://latex.codecogs.com/gif.latex?H_0:forall%20i,jin{1,cdots,%20T},%20mu_i=mu_j

versus

http://latex.codecogs.com/gif.latex?H_1:exists%20i,jin{1,cdots,%20T},%20mu_ineq%20mu_j

Because that one might be not very informative (if we reject http://latex.codecogs.com/gif.latex?H_0, we don’t really know why). Here, I did consider completely independent observations, over individuals, and over time. Nevertheless, if we perform 190 tests on pairs, we should not be suprised – even if http://latex.codecogs.com/gif.latex?H_0 is valid – that we reject it many times !

> P=NULL
> for(i in 1:(nb-1)){ for(j in (i+1):nb){
+ P=c(P,t.test(X[,i],X[,j])$p.value) }}
> mean(P>.05)
[1] 0.9368421

i.e. for 12 pairs, we reject the assumption of identical mean. Even if that assumption is true, here. That’s the problem with multiple tests.

Actually, observe that over those 190 tests, the http://latex.codecogs.com/gif.latex?p-values are uniformly distributed over the unit interval. Thus, for (almost) 95% of the tests, the http://latex.codecogs.com/gif.latex?p-value exceeds 5%.

Now, what if we relax the assumption of independence ? Here, we need to be more specific. A first idea might be to assume that

http://latex.codecogs.com/gif.latex?X_{i,t}=mu_t%20+%20varepsilon_{i,t}

for some trend http://latex.codecogs.com/gif.latex?mu_t. We can actually assume that the trend depends on the individuals, and write, for instance,

http://latex.codecogs.com/gif.latex?X_{i,t}=alpha_i%20+%20beta_i%20t%20+%20varepsilon_{i,t}

Standard regression techniques can be considered, under those models. We’ll probably discuss that point before the end of this year.

My point was the following : if the assume that

http://latex.codecogs.com/gif.latex?boldsymbol{X}_i=(X_{i,1},cdots,X_{i,T})

consist of independent observations, where all http://latex.codecogs.com/gif.latex?X_{i,t}‘s are assumed to be independent, if we use pairwise comparison tests for the means, in 95% of the tests, the http://latex.codecogs.com/gif.latex?p-value exceeds 5%, and in 90% of the tests, the http://latex.codecogs.com/gif.latex?p-value exceeds 10%. That the interpreation of the property that http://latex.codecogs.com/gif.latex?p-values are uniformly distributed on the unit interval. Now, what if we relax the independence assumption. What if obsevations were – somehow – correlated.

A first idea could be to consider a linear trend,

X=matrix(rnorm(size*nb),size,nb)
T=t(matrix((1:nb)/nb*r,nb,size))
X=X+T*a

Here, we still compute pairwise comparison tests

> P=NULL
> for(i in 1:(nb-1)){ for(j in (i+1):nb){
+ P=c(P,t.test(X[,i],X[,j])$p.value) }}

and we counts the number of pairs where the http://latex.codecogs.com/gif.latex?p-values exceeds some given values

PV[1]=mean(P>.025)
PV[2]=mean(P>.05)
PV[3]=mean(P>.075)
PV[4]=mean(P>.1)
PV[5]=mean(P>.125)
PV[6]=mean(P>.15)

If we look at the proportions, as a function of the slope, we have the following graph

In the middle, we have the independent case, e.g. 97.5% of the the http://latex.codecogs.com/gif.latex?p-values exceeds 2.5% – the upper curve. But, as the slope increase (or decrease), the proportion decreases. With a 0.2 slope, 90% of the the http://latex.codecogs.com/gif.latex?p-values exceeds 2.5%.

One can also assume that

http://latex.codecogs.com/gif.latex?boldsymbol{X}_i=(X_{i,1},cdots,X_{i,T})

is some time series. An autoregressive one, for instance,

http://latex.codecogs.com/gif.latex?X_{i,t}=varphi%20X_{i,t-1}+varepsilon_{i,t}

Here we use to following code to generate that vector

autoreg=function(n,r){
M=matrix(1,n,n)
for(i in 1:(n-2)){
diag(M[(i+1):n,1:(n-i)])=r^i
diag(M[1:(n-i),(i+1):n])=r^i
}
M[1,n]=M[n,1]=r^(i+1)
return(M)
}
S=autoreg(nb,r)
library(mnormt)
X=rmnorm(size,varcov=S)

The graph below is the evolution of the proportion of http://latex.codecogs.com/gif.latex?p-values exceeding some threshold, as a function of the autoregressive coefficient.

The value on the left is the independent case (and http://latex.codecogs.com/gif.latex?p-values are uniformly distributed). The stronger the autocorrelation, the larger the proportion of http://latex.codecogs.com/gif.latex?p-values exceeding any value.

Finally, consider the case where components of vector

http://latex.codecogs.com/gif.latex?boldsymbol{X}_i=(X_{i,1},cdots,X_{i,T})

are exchangeable.

S=matrix(r,nb,nb)
diag(S)=1
library(mnormt)
X=rmnorm(size,M,varcov=S)

In that case, we have the following graph, with the proportion of http://latex.codecogs.com/gif.latex?p-values exceeding some value, as the function of the correlation of (any) pair http://latex.codecogs.com/gif.latex?(X_{i,s},X_{i,t}).

This is the idea that is used in any correction-type method, with multiple tests, for instance Bonferroni correction. When a run http://latex.codecogs.com/gif.latex?m tests, instead of comparing the http://latex.codecogs.com/gif.latex?p-value with http://latex.codecogs.com/gif.latex?alpha, we use a correction on http://latex.codecogs.com/gif.latex?alpha, http://latex.codecogs.com/gif.latex?alpha(m). But as we can see here, it will depend on the kind of dependence we assume there might be.

Next time, I will discuss my student’s data, see what could be said.

To leave a comment for the author, please follow the link and comment on their blog: Freakonometrics » R-english.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)