# A robust Hotelling test…

July 12, 2010
Recently I needed to test a mean vector. I wrote a few lines of R code and it worked perfectly. The Hotelling test is one of the least interesting tests to me; I never really figured out why…
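For reference, the classical one-sample test fits in a few lines of base R. This is only a minimal sketch and not the code mentioned above: `hotelling_one_sample` is a hypothetical helper name and the data are simulated.

```r
# Classical one-sample Hotelling T^2 test in base R (illustrative sketch).
# `hotelling_one_sample` is a hypothetical helper, not part of any package.
hotelling_one_sample <- function(x, mu0 = rep(0, ncol(x))) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)
  d <- colMeans(x) - mu0                    # sample mean minus hypothesised mean
  S <- cov(x)                               # sample covariance matrix
  T2 <- n * drop(t(d) %*% solve(S, d))      # Hotelling's T^2
  Fstat <- (n - p) / (p * (n - 1)) * T2     # scaled statistic, F(p, n - p) under H0
  pval <- pf(Fstat, p, n - p, lower.tail = FALSE)
  list(T2 = T2, F = Fstat, df1 = p, df2 = n - p, p.value = pval)
}

set.seed(1)
x <- matrix(rnorm(50), ncol = 2)  # simulated data: 25 observations, 2 variables
res <- hotelling_one_sample(x)    # tests H0: mean vector equals (0, 0)
```

The F scaling assumes multivariate normal data, which is exactly the assumption a robust version tries to relax.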

At the time I had a chance to look into it further. One of the first things I look for with any test is a robust version of it (at least that’s what I search for!). A little digging, down to the third page of Google results, leads to the following:

### One-sample and two-sample robust Hotelling tests with fast and robust bootstrap

The classical Hotelling test for testing if the mean equals a certain value or if two means are equal is modified into a robust one through substitution of the empirical estimates by the MM-estimates of location and scatter. The MM-estimator, using Tukey’s biweight function, is tuned by default to have a breakdown point of 50% and 95% location efficiency. This could be changed through the control argument if desired.
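In symbols, and as far as I understand the construction, the substitution described above looks like this for the one-sample case. The notation is my own: $\bar{x}$ and $S$ are the sample mean and covariance, $\hat{\mu}_n$ and $\hat{\Sigma}_n$ the MM-estimates of location and scatter, and $\mu_0$ the hypothesised mean.

$$
T^2 = n\,(\bar{x} - \mu_0)^\top S^{-1} (\bar{x} - \mu_0)
\quad\longrightarrow\quad
T_R^2 = n\,(\hat{\mu}_n - \mu_0)^\top \hat{\Sigma}_n^{-1} (\hat{\mu}_n - \mu_0)
$$

The robust statistic no longer follows the usual scaled F distribution, which is where the fast and robust bootstrap comes in: it is used to approximate the null distribution and obtain a p-value.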

### Robust Hotelling T2 test

Performs one and two sample Hotelling T2 tests as well as robust one-sample Hotelling T2 test.

The first (`FRBhotellingMM` in the FRB package) uses MM- and S-estimators, while the latter (`T2.test` in rrcov) uses a Minimum Covariance Determinant (MCD) estimator. You can find more on those in the references at the end of the post. What might be crucial to you is that the MM/S estimators are more time-consuming than MCD. A little demonstration follows.

```r
library(rrcov)
data(delivery)
delivery.x <- delivery[, 1:2]

# Classical one-sample Hotelling test
T2.test(delivery.x)
#
#     One-sample Hotelling test
#
# data:  delivery.x
# T^2 = 21.0494, df1 = 2, df2 = 23, p-value = 6.365e-06
# alternative hypothesis: true mean vector is not equal to (0, 0)'
#
# sample estimates:
#               n.prod distance
# mean x-vector   8.76   409.28

# Robust test based on the reweighted MCD estimator, timed
t0 <- Sys.time()
T2.test(delivery.x, method = "mcd")
#
#     One-sample Hotelling test (Reweighted MCD Location)
#
# data:  delivery.x
# T^2 = 37.701, df1 = 2.000, df2 = 9.146, p-value = 3.829e-05
# alternative hypothesis: true mean vector is not equal to (0, 0)'
#
# sample estimates:
#                n.prod distance
# MCD x-vector 6.190476 309.7143
Sys.time() - t0
# Time difference of 0.04200006 secs

# Robust test based on MM-estimates with fast and robust bootstrap, timed
library(FRB)
t0 <- Sys.time()
FRBhotellingMM(delivery.x)
# One sample Hotelling test based on multivariate MM-estimates
# (bdp = 0.5, eff = 0.95)
# data:  delivery.x
# T^2_R =  84.59
# p-value =  0.0022
# Alternative hypothesis : true mean vector is not equal to ( 0 0 )
Sys.time() - t0
# Time difference of 4.859 secs
```
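The classical and MCD location estimates above differ noticeably (8.76 versus 6.19 for `n.prod`). A small simulation hints at why that can happen. This is only a sketch on made-up data, and it uses `cov.mcd()` from MASS as a stand-in, which is an assumption on my part: it is not necessarily the same MCD implementation that rrcov uses.

```r
library(MASS)  # recommended package shipped with R; provides cov.mcd()

set.seed(42)
clean    <- matrix(rnorm(80), ncol = 2)             # 40 points centred at (0, 0)
outliers <- matrix(rnorm(20, mean = 10), ncol = 2)  # 10 points centred at (10, 10)
x <- rbind(clean, outliers)                         # 20% contaminated sample

classical_center <- colMeans(x)       # sample mean, dragged toward the outliers
mcd_center <- cov.mcd(x)$center       # MCD location, based on the most concentrated subset

print(rbind(classical = classical_center, mcd = mcd_center))
```

The sample mean moves toward the contaminating cluster while the MCD centre stays close to the bulk of the data, so any test built on the mean inherits that shift.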

Time-consuming as it may be, I would stick with the bootstrap method. What would you do?

Roelant, E., Van Aelst, S., and Willems, G. (2008), “Fast Bootstrap for Robust Hotelling Tests,” COMPSTAT 2008: Proceedings in Computational Statistics (P. Brito, Ed.), Heidelberg: Physica-Verlag, to appear.

Willems, G., Pison, G., Rousseeuw, P., and Van Aelst, S. (2002), “A Robust Hotelling Test,” Metrika, 55, 125–138.