Kruskal-Wallis one-way analysis of variance

July 31, 2009

(This article was first published on Statistic on aiR, and kindly contributed to R-bloggers)

If you need to compare more than two groups but cannot use a one-way ANOVA because the groups do not follow a normal distribution, you can use the Kruskal-Wallis test. It is a rank-based, non-parametric test, so it does not require the assumption that the groups follow a Gaussian distribution.
This test extends the Wilcoxon rank-sum test for 2 samples to three or more independent groups.

Suppose you want to test whether the following 4 groups of values come from the same distribution:
Group A: 1, 5, 8, 17, 16
Group B: 2, 16, 5, 7, 4
Group C: 1, 1, 3, 7, 9
Group D: 2, 15, 2, 9, 7


To run the Kruskal-Wallis test, simply enter the data and then organize them into a list:


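# Enter each group as a numeric vector
# (these names avoid masking the base function c())
group_a <- c(1, 5, 8, 17, 16)
group_b <- c(2, 16, 5, 7, 4)
group_c <- c(1, 1, 3, 7, 9)
group_d <- c(2, 15, 2, 9, 7)

# Collect the groups in a named list; kruskal.test() accepts a list of vectors
dati <- list(g1 = group_a, g2 = group_b, g3 = group_c, g4 = group_d)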
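Before running the test it can help to look at the group sizes and medians, since the Kruskal-Wallis test works on ranks rather than on means. The following is only a minimal descriptive sketch; the names `sizes` and `med` are introduced here for illustration.

```r
# Same data as above, rebuilt so the snippet runs on its own
dati <- list(
  g1 = c(1, 5, 8, 17, 16),
  g2 = c(2, 16, 5, 7, 4),
  g3 = c(1, 1, 3, 7, 9),
  g4 = c(2, 15, 2, 9, 7)
)

# Number of observations and median of each group
sizes <- sapply(dati, length)
med   <- sapply(dati, median)

print(sizes)
print(med)
```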


Now we can apply the kruskal.test() function:


kruskal.test(dati)

Kruskal-Wallis rank sum test

data: dati
Kruskal-Wallis chi-squared = 1.9217, df = 3, p-value = 0.5888
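The same test can also be run on data in long format, with one column of values and one grouping factor. The sketch below uses the formula interface of kruskal.test(); the names `values`, `group`, `long` and `res` are chosen here for illustration and are not part of the original example.

```r
# All 20 observations in one vector, with a factor marking the group
values <- c(1, 5, 8, 17, 16,
            2, 16, 5, 7, 4,
            1, 1, 3, 7, 9,
            2, 15, 2, 9, 7)
group <- factor(rep(c("A", "B", "C", "D"), each = 5))
long  <- data.frame(values = values, group = group)

# Formula interface: response ~ grouping factor
res <- kruskal.test(values ~ group, data = long)

# The result is a list of class "htest"; its parts can be read directly
res$statistic   # Kruskal-Wallis chi-squared
res$parameter   # degrees of freedom
res$p.value     # p-value
```

Storing the result in an object makes it easy to reuse the statistic and the p-value in later code instead of copying them from the printed output.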


The value of the test statistic is 1.9217. This value already includes the correction for ties (repeated values). The p-value is greater than 0.05; equivalently, the test statistic is lower than the tabulated chi-square critical value at the 5% level with 3 degrees of freedom:


qchisq(0.950, 3)
[1] 7.814728
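To see where the value 1.9217 and its tie correction come from, the statistic can be rebuilt by hand from the pooled ranks. The sketch below follows the usual textbook formula for H and the standard tie correction; it is meant as an illustration with made-up variable names, not as a description of how kruskal.test() is implemented internally.

```r
# Pool the four groups and rank all 20 values together (ties get mid-ranks)
x <- c(1, 5, 8, 17, 16,
       2, 16, 5, 7, 4,
       1, 1, 3, 7, 9,
       2, 15, 2, 9, 7)
g <- factor(rep(c("A", "B", "C", "D"), each = 5))
r <- rank(x)
N <- length(x)

# Rank sum and size of each group
R <- tapply(r, g, sum)
n <- tapply(r, g, length)

# Uncorrected H statistic
H <- 12 / (N * (N + 1)) * sum(R^2 / n) - 3 * (N + 1)

# Tie correction: 1 - sum(t^3 - t) / (N^3 - N),
# where t is the number of observations sharing each distinct value
t_sizes  <- as.vector(table(x))
tie_corr <- 1 - sum(t_sizes^3 - t_sizes) / (N^3 - N)
H_corrected <- H / tie_corr

# p-value from the chi-square approximation with k - 1 degrees of freedom
df      <- nlevels(g) - 1
p_value <- pchisq(H_corrected, df, lower.tail = FALSE)

print(c(H = H, H_corrected = H_corrected, p_value = p_value))
```

Without the correction the statistic is 1.9; dividing by the tie factor raises it slightly to the 1.9217 reported by kruskal.test().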


The conclusion is therefore that we do not reject the null hypothesis H0: there is no evidence that the 4 groups come from different distributions.
