# Kruskal-Wallis one-way analysis of variance

July 31, 2009
(This article was first published on Statistic on aiR, and kindly contributed to R-bloggers)

If you need to compare several groups but cannot run a one-way ANOVA because the groups do not follow a normal distribution, you can use the Kruskal-Wallis test, a rank-based test that does not assume the groups follow a Gaussian distribution.
This test extends the Wilcoxon rank-sum test for 2 samples to three or more groups.

Suppose you want to check whether the following 4 groups of values come from the same distribution:
Group A: 1, 5, 8, 17, 16
Group B: 2, 16, 5, 7, 4
Group C: 1, 1, 3, 7, 9
Group D: 2, 15, 2, 9, 7

To run the Kruskal-Wallis test, enter the data and organize them into a list:

    a = c(1, 5, 8, 17, 16)
    b = c(2, 16, 5, 7, 4)
    c = c(1, 1, 3, 7, 9)
    d = c(2, 15, 2, 9, 7)
    dati = list(g1=a, g2=b, g3=c, g4=d)

Now we can apply the kruskal.test() function:

    kruskal.test(dati)

            Kruskal-Wallis rank sum test

    data:  dati
    Kruskal-Wallis chi-squared = 1.9217, df = 3, p-value = 0.5888
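The same test can also be run on data stored in "long" format, with one column of values and one grouping factor. The following is a minimal sketch; the names `values`, `group`, and `dat_long` are chosen here for illustration and do not appear in the original example.

```r
# All 20 observations in one vector, in the order A, B, C, D
values <- c(1, 5, 8, 17, 16,
            2, 16, 5, 7, 4,
            1, 1, 3, 7, 9,
            2, 15, 2, 9, 7)

# Grouping factor: 5 observations per group
group <- factor(rep(c("A", "B", "C", "D"), each = 5))

# Hypothetical data frame name, used only for this illustration
dat_long <- data.frame(values = values, group = group)

# Formula interface of kruskal.test()
res_formula <- kruskal.test(values ~ group, data = dat_long)

# List interface, as used in the article
res_list <- kruskal.test(list(g1 = values[1:5],  g2 = values[6:10],
                              g3 = values[11:15], g4 = values[16:20]))

print(res_formula)
```

Both calls should report the same statistic and p-value; the formula form is often more convenient when the data already live in a data frame.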

The value of the test statistic is 1.9217. This value already includes the correction for ties (repeated values). The p-value is greater than 0.05; equivalently, the test statistic is lower than the critical value of the chi-square distribution with 3 degrees of freedom at the 5% level:
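To see where the tie correction enters, the statistic can be recomputed by hand from the ranks. This is a sketch of the standard textbook formula, not necessarily the exact internal code of `kruskal.test()`; the variable names (`x`, `g`, `H`, `C`, `H_corrected`) are illustrative.

```r
a <- c(1, 5, 8, 17, 16)
b <- c(2, 16, 5, 7, 4)
c <- c(1, 1, 3, 7, 9)
d <- c(2, 15, 2, 9, 7)

x <- c(a, b, c, d)                                  # pooled observations
g <- factor(rep(c("A", "B", "C", "D"), each = 5))   # group labels
N <- length(x)

r <- rank(x)               # tied values receive the average of their ranks
R <- tapply(r, g, sum)     # rank sum per group
n <- tapply(r, g, length)  # group sizes

# Uncorrected statistic: 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
H <- 12 / (N * (N + 1)) * sum(R^2 / n) - 3 * (N + 1)

# Tie correction: 1 - sum(t^3 - t) / (N^3 - N), with t the size of each tie group
ties <- table(x)
C <- 1 - sum(ties^3 - ties) / (N^3 - N)

H_corrected <- H / C

print(c(uncorrected = H, corrected = H_corrected))
```

With these data the uncorrected value is 1.9, and dividing by the correction factor gives the 1.9217 reported by `kruskal.test()`.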

    qchisq(0.950, 3)
    [1] 7.814728
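The comparison can also be done programmatically by extracting the fields of the object returned by `kruskal.test()`. A small sketch, assuming a significance level of 0.05; the names `res`, `alpha`, `critical`, and the two `reject_*` flags are introduced here for illustration.

```r
dati <- list(g1 = c(1, 5, 8, 17, 16),
             g2 = c(2, 16, 5, 7, 4),
             g3 = c(1, 1, 3, 7, 9),
             g4 = c(2, 15, 2, 9, 7))

res <- kruskal.test(dati)

H     <- unname(res$statistic)   # Kruskal-Wallis chi-squared
df_kw <- unname(res$parameter)   # degrees of freedom (number of groups - 1)
alpha <- 0.05                    # assumed significance level

# Critical value of the chi-square distribution at level alpha
critical <- qchisq(1 - alpha, df_kw)

# The two decision rules are equivalent
reject_by_statistic <- H > critical
reject_by_pvalue    <- res$p.value < alpha

print(c(H = H, critical = critical, p.value = res$p.value))
```

Both flags are `FALSE` for these data, so the two ways of reading the result agree.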

The conclusion is therefore that we cannot reject the null hypothesis H0: the data give no evidence that the 4 groups differ.
