# ANOVA vs Multiple Comparisons

[This article was first published on **R – Predictive Hacks**, and kindly contributed to R-bloggers.]


When we run an ANOVA, we analyze the differences among group means in a sample. In its simplest form, ANOVA provides a statistical test of whether two or more population means are equal, and therefore generalizes the t-test beyond two means.
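To make the link to the t-test concrete, here is a minimal sketch on simulated data (the variable names `x`, `y` and `dat` are made up for illustration). With exactly two groups, the F statistic of a one-way ANOVA should equal the square of the pooled-variance t statistic, and the two p-values should coincide:

```r
# Two simulated groups (illustrative parameters, not from the example below)
set.seed(1)
x <- rnorm(30, mean = 10, sd = 5)
y <- rnorm(30, mean = 11, sd = 5)
dat <- data.frame(
  value = c(x, y),
  group = factor(rep(c("x", "y"), each = 30))
)

# Classic two-sample t-test with pooled variance (not the Welch default)
t_res <- t.test(x, y, var.equal = TRUE)

# One-way ANOVA on the same data
f_res <- anova(lm(value ~ group, data = dat))

t_stat <- unname(t_res$statistic)
f_stat <- f_res[["F value"]][1]

# Both comparisons are expected to return TRUE
all.equal(t_stat^2, f_stat)
all.equal(t_res$p.value, f_res[["Pr(>F)"]][1])
```

Note that `var.equal = TRUE` matters here: the default Welch test does not pool the variances, so the identity holds only approximately in that case.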

## ANOVA Null and Alternative Hypotheses

The **null hypothesis** in **ANOVA** is that there is no difference between the means, and the **alternative** is that the means are not all equal.

\(H_0: \mu_1 = \mu_2 = \dots = \mu_K\)

\(H_1: \text{the } \mu_k \text{ are not all equal}\)

This means that, when we are dealing with many groups, ANOVA alone does not tell us which groups differ. It only answers whether the group means can all be considered equal or not.

## Tukey’s HSD

What if we want to compare all the groups pairwise? In this case, we can apply Tukey’s HSD (honestly significant difference) test, which is a single-step multiple comparison procedure. It can be used to find the means that are significantly different from each other.
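As a rough sketch of what such a procedure returns, base R ships `TukeyHSD()`, which works on a model fitted with `aov()`. The data below are simulated purely for illustration (`toy`, `g` and `y` are hypothetical names), and this is an alternative to the `multcomp` route, not a replacement for it:

```r
# Three simulated groups; the third has a shifted mean (illustrative values)
set.seed(123)
toy <- data.frame(
  g = factor(rep(c("g1", "g2", "g3"), each = 50)),
  y = c(rnorm(50, 10, 5), rnorm(50, 10, 5), rnorm(50, 13, 5))
)

# Fit the one-way ANOVA and run Tukey's HSD on the grouping factor
fit <- aov(y ~ g, data = toy)
tukey <- TukeyHSD(fit, "g", conf.level = 0.95)

# One row per pair of groups: difference, confidence bounds, adjusted p-value
tukey$g
```

Each row corresponds to one pairwise difference, so three groups give three rows and four groups give six.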

## Example of ANOVA vs Tukey’s HSD

Let’s assume that we are dealing with the following 4 groups:

- Group “a”: 100 observations from the Normal Distribution with **mean 10** and **standard deviation 5**
- Group “b”: 100 observations from the Normal Distribution with **mean 10** and **standard deviation 5**
- Group “c”: 100 observations from the Normal Distribution with **mean 11** and **standard deviation 6**
- Group “d”: 100 observations from the Normal Distribution with **mean 11** and **standard deviation 6**

We expect the ANOVA to reject the null hypothesis, but we would also like to find that **Group a and Group b** are **not statistically different**, and likewise for **Group c and Group d**.

Let’s work in R:

    library(multcomp)
    library(tidyverse)

    # Create the four groups
    set.seed(10)
    df1 <- data.frame(Var = "a", Value = rnorm(100, 10, 5))
    df2 <- data.frame(Var = "b", Value = rnorm(100, 10, 5))
    df3 <- data.frame(Var = "c", Value = rnorm(100, 11, 6))
    df4 <- data.frame(Var = "d", Value = rnorm(100, 11, 6))

    # Merge them into one data frame
    df <- rbind(df1, df2, df3, df4)

    # Convert Var to a factor
    df$Var <- as.factor(df$Var)

    # Density plot of the four groups
    df %>% ggplot(aes(x = Value, fill = Var)) + geom_density(alpha = 0.5)

**ANOVA**

    # ANOVA
    model1 <- lm(Value ~ Var, data = df)
    anova(model1)

Output:

    Analysis of Variance Table

    Response: Value
               Df  Sum Sq Mean Sq F value    Pr(>F)
    Var         3   565.7 188.565   6.351 0.0003257 ***
    Residuals 396 11757.5  29.691
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Clearly, we reject the null hypothesis, since the p-value is **0.0003257**, well below the usual 0.05 threshold.
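The F value in the ANOVA table can be checked by hand from the printed figures. Each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the two. The subscripts "between" and "within" are notation introduced here, and small discrepancies come from rounding in the printed output:

\(MS_{between} = \frac{565.7}{3} \approx 188.57\)

\(MS_{within} = \frac{11757.5}{396} \approx 29.69\)

\(F = \frac{MS_{between}}{MS_{within}} = \frac{188.565}{29.691} \approx 6.351\)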

**Tukey’s HSD**

Let’s apply the Tukey HSD test to compare all pairs of means.

    # Tukey multiple comparisons
    summary(glht(model1, mcp(Var = "Tukey")))

Output:

    Simultaneous Tests for General Linear Hypotheses

    Multiple Comparisons of Means: Tukey Contrasts

    Fit: lm(formula = Value ~ Var, data = df)

    Linear Hypotheses:
               Estimate Std. Error t value Pr(>|t|)
    b - a == 0   0.2079     0.7706   0.270  0.99312
    c - a == 0   1.8553     0.7706   2.408  0.07727 .
    d - a == 0   2.8758     0.7706   3.732  0.00129 **
    c - b == 0   1.6473     0.7706   2.138  0.14298
    d - b == 0   2.6678     0.7706   3.462  0.00329 **
    d - c == 0   1.0205     0.7706   1.324  0.54795
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    (Adjusted p values reported -- single-step method)

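As we can see from the output above, the differences **c vs a** and **c vs b** are not statistically significant at the 5% level (adjusted p-values 0.07727 and 0.14298), although the groups come from different distributions. The reason is the “issue” with multiple comparisons: the Tukey procedure adjusts the p-values to control the family-wise error rate across all six comparisons, which makes each individual comparison more conservative. Let’s compare these pairs by applying an unadjusted `t-test`.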

**t-test a vs c**

    t.test(df %>% filter(Var == "a") %>% pull(),
           df %>% filter(Var == "c") %>% pull())

Output:

    Welch Two Sample t-test

    data:  df %>% filter(Var == "a") %>% pull() and df %>% filter(Var == "c") %>% pull()
    t = -2.4743, df = 189.47, p-value = 0.01423
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -3.3343125 -0.3761991
    sample estimates:
    mean of x mean of y
     9.317255 11.172511

**t-test b vs c**

    t.test(df %>% filter(Var == "b") %>% pull(),
           df %>% filter(Var == "c") %>% pull())

Output:

    Welch Two Sample t-test

    data:  df %>% filter(Var == "b") %>% pull() and df %>% filter(Var == "c") %>% pull()
    t = -2.1711, df = 191.53, p-value = 0.03115
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -3.1439117 -0.1507362
    sample estimates:
    mean of x mean of y
     9.525187 11.172511

As we can see from above, the difference in means is statistically significant at the 5% level in both cases (p-values 0.01423 and 0.03115) if we ignore the multiple comparisons.
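To see why ignoring the multiple comparisons is risky, here is a back-of-the-envelope sketch of the family-wise error rate, that is, the chance of at least one false positive across all pairwise tests when every null hypothesis is true. It assumes the tests are independent, which is only an approximation here because the pairs share groups:

```r
alpha <- 0.05          # per-test significance level
m <- choose(4, 2)      # number of pairwise comparisons among 4 groups: 6

# Probability of at least one false positive, assuming independent tests
fwer <- 1 - (1 - alpha)^m
fwer                   # about 0.26

# Per-test level a Bonferroni correction would use to keep the overall rate at 5%
bonferroni_alpha <- alpha / m
bonferroni_alpha       # about 0.0083
```

Under that assumption, six unadjusted tests at the 5% level carry roughly a one-in-four chance of at least one spurious finding, which is what procedures such as Tukey’s HSD guard against.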

## Discussion

When we are dealing with many groups and want to compare them pairwise, Tukey’s HSD is a good option. Another approach is to adjust the p-values of the individual tests, for example with the Bonferroni or Holm correction.
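As a minimal sketch of the p-value adjustment route, base R offers `p.adjust()` and `pairwise.t.test()`. The first part applies a Bonferroni correction to the two unadjusted p-values reported above, counting all six pairwise comparisons. The second part regenerates four groups with the same seed and parameters as the example (`df_demo` is a name introduced here) and runs Holm-adjusted Welch tests on every pair:

```r
# Bonferroni: multiply each raw p-value by the number of comparisons (6)
raw_p <- c(a_vs_c = 0.01423, b_vs_c = 0.03115)
bonf <- p.adjust(raw_p, method = "bonferroni", n = 6)
bonf   # 0.08538 and 0.18690, neither below 0.05 after adjustment

# Four simulated groups, same seed and parameters as in the example
set.seed(10)
df_demo <- data.frame(
  Var = factor(rep(c("a", "b", "c", "d"), each = 100)),
  Value = c(rnorm(100, 10, 5), rnorm(100, 10, 5),
            rnorm(100, 11, 6), rnorm(100, 11, 6))
)

# All pairwise Welch t-tests (pool.sd = FALSE) with Holm-adjusted p-values
holm <- pairwise.t.test(df_demo$Value, df_demo$Var,
                        p.adjust.method = "holm", pool.sd = FALSE)
holm$p.value   # lower-triangular matrix of adjusted p-values
```

Holm’s method is never less powerful than Bonferroni while still controlling the family-wise error rate, which is why it is often preferred when a general-purpose adjustment is needed.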

You can also have a look at how to account for multiple comparisons in A/B/n Testing.
