Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Last week, a student asked me about multiple tests. More precisely, she ran an experience over – say – 20 weeks, with the same cohort of – say – 100 patients. An we observe some $X_{i,t}$

size=100
nb=20
set.seed(1)
X=matrix(rnorm(size*nb),size,nb)

(here, I just generate some fake data). I can visualize some trajectories, over the 20 weeks,

library(RColorBrewer)
cl1=brewer.pal(12,"Set3")[12]
cl2=brewer.pal(8,"Set2")[2:7]
cl=c(cl1,cl2)
boxplot(X)
for(i in 1:length(cl)){
lines(1:20,X[i,],type="b",col=cl[i])  }

She wanted to compare the averages, over the experiments.

boxplot(X)
for(i in 1:length(cl)){
lines(1:20,X[i,],type="b",col=cl[i])  }

The question is what do you want to test ? Is it

$H_0:\forall i,j\in\{1,\cdots, T\}, \mu_i=\mu_j$

versus

$H_1:\exists i,j\in\{1,\cdots, T\}, \mu_i\neq \mu_j$

Because that one might be not very informative (if we reject $H_0$, we don’t really know why). Here, I did consider completely independent observations, over individuals, and over time. Nevertheless, if we perform 190 tests on pairs, we should not be suprised – even if $H_0$ is valid – that we reject it many times !

> P=NULL
> for(i in 1:(nb-1)){ for(j in (i+1):nb){
+ P=c(P,t.test(X[,i],X[,j])$p.value) }} > mean(P>.05) [1] 0.9368421 i.e. for 12 pairs, we reject the assumption of identical mean. Even if that assumption is true, here. That’s the problem with multiple tests. Actually, observe that over those 190 tests, the $p$-values are uniformly distributed over the unit interval. Thus, for (almost) 95% of the tests, the $p$-value exceeds 5%. Now, what if we relax the assumption of independence ? Here, we need to be more specific. A first idea might be to assume that $X_{i,t}=\mu_t + \varepsilon_{i,t}$ for some trend $\mu_t$. We can actually assume that the trend depends on the individuals, and write, for instance, $X_{i,t}=\alpha_i + \beta_i t + \varepsilon_{i,t}$ Standard regression techniques can be considered, under those models. We’ll probably discuss that point before the end of this year. My point was the following : if the assume that $\boldsymbol{X}_i=(X_{i,1},\cdots,X_{i,T})$ consist of independent observations, where all $X_{i,t}$‘s are assumed to be independent, if we use pairwise comparison tests for the means, in 95% of the tests, the $p$-value exceeds 5%, and in 90% of the tests, the $p$-value exceeds 10%. That the interpreation of the property that $p$-values are uniformly distributed on the unit interval. Now, what if we relax the independence assumption. What if obsevations were – somehow – correlated. A first idea could be to consider a linear trend, X=matrix(rnorm(size*nb),size,nb) T=t(matrix((1:nb)/nb*r,nb,size)) X=X+T*a Here, we still compute pairwise comparison tests > P=NULL > for(i in 1:(nb-1)){ for(j in (i+1):nb){ + P=c(P,t.test(X[,i],X[,j])$p.value) }}

and we counts the number of pairs where the $p$-values exceeds some given values

PV[1]=mean(P>.025)
PV[2]=mean(P>.05)
PV[3]=mean(P>.075)
PV[4]=mean(P>.1)
PV[5]=mean(P>.125)
PV[6]=mean(P>.15)

If we look at the proportions, as a function of the slope, we have the following graph

In the middle, we have the independent case, e.g. 97.5% of the the $p$-values exceeds 2.5% – the upper curve. But, as the slope increase (or decrease), the proportion decreases. With a 0.2 slope, 90% of the the $p$-values exceeds 2.5%.

One can also assume that

$\boldsymbol{X}_i=(X_{i,1},\cdots,X_{i,T})$

is some time series. An autoregressive one, for instance,

$X_{i,t}=\varphi X_{i,t-1}+\varepsilon_{i,t}$

Here we use to following code to generate that vector

autoreg=function(n,r){
M=matrix(1,n,n)
for(i in 1:(n-2)){
diag(M[(i+1):n,1:(n-i)])=r^i
diag(M[1:(n-i),(i+1):n])=r^i
}
M[1,n]=M[n,1]=r^(i+1)
return(M)
}
S=autoreg(nb,r)
library(mnormt)
X=rmnorm(size,varcov=S)

The graph below is the evolution of the proportion of $p$-values exceeding some threshold, as a function of the autoregressive coefficient.

The value on the left is the independent case (and $p$-values are uniformly distributed). The stronger the autocorrelation, the larger the proportion of $p$-values exceeding any value.

Finally, consider the case where components of vector

$\boldsymbol{X}_i=(X_{i,1},\cdots,X_{i,T})$

are exchangeable.

S=matrix(r,nb,nb)
diag(S)=1
library(mnormt)
X=rmnorm(size,M,varcov=S)

In that case, we have the following graph, with the proportion of $p$-values exceeding some value, as the function of the correlation of (any) pair $(X_{i,s},X_{i,t})$.

This is the idea that is used in any correction-type method, with multiple tests, for instance Bonferroni correction. When a run $m$ tests, instead of comparing the $p$-value with $\alpha$, we use a correction on $\alpha$, $\alpha(m)$. But as we can see here, it will depend on the kind of dependence we assume there might be.

Next time, I will discuss my student’s data, see what could be said.