When we have a statistically significant effect in ANOVA and an independent variable of more than two levels, we typically want to make follow-up comparisons. There are numerous methods for making pairwise comparisons and this tutorial will demonstrate how to execute several different techniques in R.
Before we begin, you may want to download the sample data (.csv) used in this tutorial. Be sure to right-click and save the file to your R working directory. This dataset contains a hypothetical sample of 30 participants who are divided into three stress reduction treatment groups (mental, physical, and medical). The values are represented on a scale that ranges from 1 to 5. This dataset can be conceptualized as a comparison between three stress treatment programs, one using mental methods, one using physical training, and one using medication. The values represent how effective the treatment programs were at reducing participant’s stress levels, with higher numbers indicating higher effectiveness.
To begin, we need to read our dataset into R and store its contents in a variable.
- > #read the dataset into an R variable using the read.csv(file) function
- > dataPairwiseComparisons <- read.csv(“dataset_ANOVA_OneWayComparisons.csv”)
- > #display the data
- > dataPairwiseComparisons
For the purposes of this tutorial, we will assume that the omnibus ANOVA has already been conducted and that the main effect for treatment was statistically significant. For details on this process, see the One-Way ANOVA with Pairwise Comparisons tutorial, which uses the same dataset.
Let’s also look at the means of our treatment groups. Here, we will use the tapply() function, along with the following arguments, to generate a table of means.
- X: the data
- INDEX: a list() of factor variables
- FUN: the function to be applied
- > #use tapply(X, INDEX, FUN) to generate a table displaying each treatment group mean
- > tapply(X = dataPairwiseComparisons$StressReduction, INDEX = list(dataPairwiseComparisons$Treatment), FUN = mean)
We will cover five major techniques for controlling Type I error when making pairwise comparisons. These methods are no adjustment, Bonferroni’s adjustment, Holm’s adjustment, Fisher’s LSD, and Tukey’s HSD. All of these techniques will be demonstrated on our sample dataset, although the decision as to which to use in a given situation is left up to the reader.
Our first three methods will make use of the pairwise.t.test() function, which has the following major arguments.
- x: the dependent variable
- g: the independent variable
- p.adj: the p-value adjustment method used to control for the family-wise Type I error rate across the comparisons; one of “none”, “bonferroni”, “holm”, “hochberg”, “hommel”, “BH”, or “BY”
Using p.adj = “none” in the pairwise.t.test() function makes no correction for the Type I error rate across the pairwise tests. This technique can be useful for employing methods that are not already built into R functions, such as the Shaffer/Modified Shaffer, which use different alpha level divisors based on the number of levels composing the independent variable. The console results will contain no adjustment, but the researcher can manually consider the statistical significance of the p-values under his or her desired alpha level.
- > #use pairwise.t.test(x, g, p.adj) to test the pairwise comparisons between the treatment group means
- > #no adjustment
- > pairwise.t.test(dataPairwiseComparisons$StressReduction, dataPairwiseComparisons$Treatment, p.adj = “none”)
With no adjustment, the mental-medical and physical-medical comparisons are statistically significant, whereas the mental-physical comparison is not. This suggests that both the mental and physical treatments are superior to the medical treatment, but that there is insufficient statistical support to distinguish between the mental and physical treatments.
The Bonferroni adjustment simply divides the Type I error rate (.05) by the number of tests (in this case, three). Hence, this method is often considered overly conservative. The Bonferroni adjustment can be made using p.adj = “bonferroni” in the pairwise.t.test() function.
- > #Bonferroni adjustment
- > pairwise.t.test(dataPairwiseComparisons$StressReduction, dataPairwiseComparisons$Treatment, p.adj = “bonferroni”)
Using the Bonferroni adjustment, only the mental-medical comparison is statistically significant. This suggests that the mental treatment is superior to the medical treatment, but that there is insufficient statistical support to distinguish between the mental and physical treatments and the physical and medical treatments. Notice that these results are more conservative than with no adjustment.
The Holm adjustment sequentially compares the lowest p-value with a Type I error rate that is reduced for each consecutive test. In our case, this means that our first p-value is tested at the .05/3 level (.017), second at the .05/2 level (.025), and third at the .05/1 level (.05). This method is generally considered superior to the Bonferroni adjustment and can be employed using p.adj = “holm” in the pairwise.t.test() function.
- > #Holm adjustment
- > pairwise.t.test(dataPairwiseComparisons$StressReduction, dataPairwiseComparisons$Treatment, p.adj = “holm”)
Using the Holm procedure, our results are practically (but not mathematically) identical to using no adjustment.
The Fisher Least Significant Difference (LSD) method essentially does not correct for the Type I error rate for multiple comparisons and is generally not recommended relative to other options. However, should the need arise to employ this method, one should seek out the LSD.test() function in the agricolae package, which has the following major arguments.
- y: the dependent variable
- trt: the independent variable
- DFerror: the degrees of freedom error
- MSerror: the mean squared error
Note that the DFerror and MSerror can be found in the omnibus ANOVA table.
- > #load the agricolae package (install first, if necessary)
- > library(agricolae)
- #LSD method
- #use LSD.test(y, trt, DFerror, MSerror) to test the pairwise comparisons between the treatment group means
- > LSD.test(dataPairwiseComparisons$StressReduction, dataPairwiseComparisons$Treatment, 30.5, 1.13)
Using the LSD method, our results are practically (but not mathematically) identical to using no adjustment or the Holm procedure.
The Tukey Honest Significant Difference (HSD) method controls for the Type I error rate across multiple comparisons and is generally considered an acceptable technique. This method can be executed using the TukeyHSD(x) function, where x is a linear model object created using the aov(formula, data) function. Note that in this application, the aov(formula, data) function is identical to the lm(formula, data) that we are already familiar with from linear regression.
- > #HSD method
- > #use TukeyHSD(x), in tandem with aov(formula, data), to test the pairwise comparisons between the treatment group means
- TukeyHSD(aov(StressReduction ~ Treatment, dataPairwiseComparisons))
Using the HSD method, our results are practically (but not mathematically) identical to using the Bonferroni, Holm, or LSD methods.
Complete Pairwise Comparisons Example
To see a complete example of how various pairwise comparison techniques can be applied in R, please download the ANOVA pairwise comparisons example (.txt) file.