# Show me the mean(ing)…

November 5, 2009

Testing a bunch of samples for the largest population mean isn’t that common, yet a simple test is at hand. Under the self-explanatory title “The rank sum maximum test for the largest K population means”, the test relies on the sum of the ranks of each sample within the combined sample of size $nk$, where $n$ is the common size of the $k$ samples.
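To fix notation (the symbols $r_{ij}$, $R_j$ and $R_{\max}$ are our own shorthand, not taken from the reference): write $r_{ij}$ for the rank of the $i$-th observation of sample $j$ within the combined sample. The rank sum of each sample, and the quantity that is compared with a critical value further down, can then be sketched as

$$R_j = \sum_{i=1}^{n} r_{ij}, \quad j = 1, \dots, k, \qquad R_{\max} = \max_{1 \le j \le k} R_j .$$

Since the ranks of the combined sample always add up to $nk(nk+1)/2$, the rank sums must satisfy $\sum_{j=1}^{k} R_j = nk(nk+1)/2$, which makes a handy sanity check.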

For illustration purposes the following data are used. They consist of 6 samples of 5 observations.

```
> data
 [1]  4.17143986  1.31264787  0.12109036  0.63031601  1.56705511  0.58817076
 [7]  1.98011001  1.63226118 -0.03869368  1.80964611  4.80878278  0.67015153
[13]  2.07602321  1.52952749  1.68483297  2.00147364  9.30173048  0.58331012
[19]  2.49537140  1.31229842  1.40193543  0.11906268  4.76253012  1.26550467
[25]  0.69497074 -0.27612056  5.05751484  1.96589383  2.58427547 -0.36979229
```

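Next we arrange the data in a convenient data frame, one row per observation, and rank the observations within the combined sample

```
# 5 observations (x) for each of the 6 samples; x varies fastest,
# so the first 5 rows belong to sample 1, the next 5 to sample 2, and so on
data.mat=expand.grid(x=rep(NA,5),sample=c("1","2","3","4","5","6"))
data.mat$x=data
# rank of every observation within the combined sample of 30
data.mat$Rank=rank(data.mat$x)
```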

and we compute the rank sum of each sample

```
# R[i] is the sum of the ranks of the observations in sample i
R=rep(NA,6)
for (i in 1:6)
{
R[i]=sum(subset(data.mat,data.mat$sample==i)$Rank)
}
```

```
> rank(R)
[1] 3 2 5 6 1 4
```
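The same rank sums can be obtained without an explicit loop. This is only a sketch, assuming the 30 values are stored sample after sample as in the post; the names `x`, `g` and `rank_sums` are ours and do not appear in the original code.

```r
# The 30 observations from the post: 6 samples of 5, one sample per line.
x <- c( 4.17143986, 1.31264787, 0.12109036, 0.63031601,  1.56705511,
        0.58817076, 1.98011001, 1.63226118, -0.03869368, 1.80964611,
        4.80878278, 0.67015153, 2.07602321, 1.52952749,  1.68483297,
        2.00147364, 9.30173048, 0.58331012, 2.49537140,  1.31229842,
        1.40193543, 0.11906268, 4.76253012, 1.26550467,  0.69497074,
       -0.27612056, 5.05751484, 1.96589383, 2.58427547, -0.36979229)

k <- 6  # number of samples
n <- 5  # common sample size

# Sample label of each observation: 1,1,1,1,1,2,2,...
g <- factor(rep(seq_len(k), each = n))

# Rank within the combined sample, then sum the ranks per sample.
rank_sums <- tapply(rank(x), g, sum)

# Sanity check: the rank sums must add up to nk(nk+1)/2.
stopifnot(sum(rank_sums) == n * k * (n * k + 1) / 2)

rank_sums             # one rank sum per sample
which.max(rank_sums)  # the sample with the largest rank sum
```

Using `tapply` keeps the sample labels attached to the rank sums, so the candidate sample can be read off directly.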

The 4th sample has the largest rank sum, so it is the candidate for the largest population mean. To test this we first need critical values.

`##Critical values 115/119/127/134 for 10%, 5%, 1% and 0.1%`
`> R[rank(R)==length(R)]>119`
`[1] FALSE`

The largest rank sum does not exceed the 5% critical value of 119, so we cannot conclude that the 4th sample has the largest population mean.
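The whole procedure fits in a small function. What follows is a sketch under our own assumptions, not code from the reference: `rank_sum_max_test` is a hypothetical helper name, and the critical values have to be supplied by the user from published tables (here, the four values quoted above for 6 samples of 5 observations).

```r
# Hypothetical helper (not from the reference): rank sum maximum test
# for k samples of equal size. `critical` must come from published tables.
rank_sum_max_test <- function(x, group, critical) {
  group <- factor(group)
  # The test assumes one label per observation and a common sample size.
  stopifnot(length(x) == length(group),
            length(unique(as.vector(table(group)))) == 1)
  rank_sums <- tapply(rank(x), group, sum)  # ranks within the combined sample
  statistic <- max(rank_sums)
  list(rank_sums   = rank_sums,
       candidate   = names(rank_sums)[which.max(rank_sums)],
       statistic   = statistic,
       significant = statistic > critical)  # one TRUE/FALSE per level
}

# The 30 observations from the post: 6 samples of 5, one sample per line.
x <- c( 4.17143986, 1.31264787, 0.12109036, 0.63031601,  1.56705511,
        0.58817076, 1.98011001, 1.63226118, -0.03869368, 1.80964611,
        4.80878278, 0.67015153, 2.07602321, 1.52952749,  1.68483297,
        2.00147364, 9.30173048, 0.58331012, 2.49537140,  1.31229842,
        1.40193543, 0.11906268, 4.76253012, 1.26550467,  0.69497074,
       -0.27612056, 5.05751484, 1.96589383, 2.58427547, -0.36979229)

# Critical values quoted in the post for this design.
crit <- c("10%" = 115, "5%" = 119, "1%" = 127, "0.1%" = 134)

res <- rank_sum_max_test(x, rep(1:6, each = 5), crit)
res$candidate    # label of the sample with the largest rank sum
res$statistic    # the largest rank sum
res$significant  # does the statistic exceed each critical value?
```

Passing all four critical values at once shows at a glance at which levels, if any, the largest rank sum is significant.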

Look it up… Gopal K. Kanji, 100 Statistical Tests, Sage Publications.
