As demonstrated in the preceding ANOVA tutorials, data organization is central to conducting ANOVA in R. In standard ANOVA, we used the tapply() function to generate a table for a single summary function. In repeated measures ANOVA, we used separate datasets for our omnibus ANOVA and follow-up comparisons. This tutorial will demonstrate how the reshape package can be used to simplify the ANOVA data organization process in R.
Tutorial Files
Before we begin, you may want to download the between groups and repeated measures datasets (.csv) used in this tutorial. Be sure to save the files to your R working directory. The between groups dataset contains a hypothetical sample of 30 cases separated into three groups (a, b, and c). The repeated measures dataset contains a hypothetical sample of 10 cases across three measurements (a, b, and c). In both datasets, the values are on a scale that ranges from 1 to 5.
Beginning Steps
To begin, we need to read our datasets into R and store their contents in variables.
- > #read the datasets into R variables using the read.csv(file) function
- > dataBetween <- read.csv("dataset_ANOVA_reshape_1.csv")
- > dataRepeated <- read.csv("dataset_ANOVA_reshape_2.csv")
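If the CSV files are not available, stand-in data frames with the same layout can be built directly in R. The sketch below is only an illustration: the column names group, case, valueA, valueB, and valueC follow the cast() and melt() calls used in this tutorial, while the column name value and every generated number are assumptions, not the contents of the actual files.

```r
# Hypothetical stand-ins for the tutorial datasets (all values are made up)
set.seed(1)  # make the made-up values reproducible

# Between groups layout: 30 cases in three groups of 10
dataBetween <- data.frame(
  group = rep(c("a", "b", "c"), each = 10),
  value = sample(1:5, 30, replace = TRUE)  # assumed column name
)

# Repeated measures layout: 10 cases, three measurements per case
dataRepeated <- data.frame(
  case   = 1:10,
  valueA = sample(1:5, 10, replace = TRUE),
  valueB = sample(1:5, 10, replace = TRUE),
  valueC = sample(1:5, 10, replace = TRUE)
)

# Inspect the structure of both data frames
str(dataBetween)
str(dataRepeated)
```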
Reshape Package
Next, we need to install and load the reshape package. In this tutorial, we will make use of the package’s cast() and melt() functions.
- > #install the package
- > install.packages("reshape")
- > #load the package
- > library(reshape)
Using cast() to Derive ANOVA Descriptives
The cast() function can be used to derive summary statistics for a between groups ANOVA dataset. The cast() function receives the following primary arguments.
- data: the dataset
- formula: in our case, a one-sided formula indicating the grouping variable
- fun.aggregate: a function or vector of functions for deriving summary statistics, such as mean, var, or sd
- > #display the raw between groups data
- > dataBetween
The raw between groups data
- > #cast the between groups data using cast(data, formula, fun.aggregate) to get the group means
- > cast(dataBetween, formula = ~group, fun.aggregate = mean)
The cast data with means
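As a cross-check on the cast() output, the same group means can be computed with base R's tapply(), the function used in the earlier ANOVA tutorials. This minimal sketch uses a small made-up data frame; its values and the column name value are assumptions.

```r
# Hypothetical between groups data (values and the `value` column name are assumptions)
dataBetween <- data.frame(
  group = rep(c("a", "b", "c"), each = 4),
  value = c(1, 2, 3, 2,   3, 4, 3, 2,   5, 4, 5, 4)
)

# Apply mean() to the values within each group
groupMeans <- tapply(dataBetween$value, dataBetween$group, mean)

# Group means: a = 2, b = 3, c = 4.5
print(groupMeans)
```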
Note that the fun.aggregate argument can also receive a vector of summary statistics functions. This will yield all of the requested descriptives via a single cast() function.
- > #cast the between groups data using cast(data, formula, fun.aggregate) to get the group means, variances, and standard deviations
- > cast(dataBetween, formula = ~group, fun.aggregate = c(mean, var, sd))
The cast data with descriptives
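For comparison, a similar table of several descriptives can be assembled in base R by applying each summary function to the groups in turn. This is not how cast() works internally, just a sketch for checking results; the data values and the column name value are assumptions.

```r
# Hypothetical between groups data (values and the `value` column name are assumptions)
dataBetween <- data.frame(
  group = rep(c("a", "b", "c"), each = 4),
  value = c(1, 2, 3, 2,   3, 4, 3, 2,   5, 4, 5, 4)
)

# Named list of the summary functions we want per group
summaryFunctions <- list(mean = mean, var = var, sd = sd)

# One column per summary function, one row per group
descriptives <- sapply(
  summaryFunctions,
  function(f) tapply(dataBetween$value, dataBetween$group, f)
)

print(descriptives)
```

Each row of the resulting matrix is a group and each column is one of the requested statistics, so the standard deviation column should equal the square root of the variance column.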
Using melt() to Prepare Repeated Measures Data for Pairwise Comparisons
The melt() function can be used to convert a repeated measures ANOVA dataset from wide format (one column per measurement) to long format (one row per case and measurement) prior to conducting pairwise comparisons. The melt() function receives the following primary arguments.
- data: the dataset
- id.vars: the id variable, or a vector of variables that together identify each case
- measure.vars: a vector containing the variables to be melted
- variable_name: the name to give the column that identifies which melted variable each value came from
- > #display the repeated measures data
- > dataRepeated
The raw repeated measures data
- > #melt the repeated measures data using melt(data, id.vars, measure.vars, variable_name) to organize it for pairwise comparisons
- > melt(dataRepeated, id.vars = "case", measure.vars = c("valueA", "valueB", "valueC"), variable_name = "abcValues")
The melted repeated measures data
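To make the melted layout concrete, the following base R sketch builds the same kind of long table by hand from a tiny made-up dataset. The values are assumptions; the column names follow the melt() call above. Each case appears once per measurement, with the measurement name in abcValues and the score in value.

```r
# Hypothetical repeated measures data in wide format (values are made up)
dataRepeated <- data.frame(
  case   = 1:3,
  valueA = c(1, 2, 3),
  valueB = c(2, 3, 4),
  valueC = c(3, 5, 5)
)

measureVars <- c("valueA", "valueB", "valueC")

# Stack the measurement columns on top of one another, repeating the case ids
dataLong <- data.frame(
  case      = rep(dataRepeated$case, times = length(measureVars)),
  abcValues = factor(rep(measureVars, each = nrow(dataRepeated)), levels = measureVars),
  value     = unlist(dataRepeated[measureVars], use.names = FALSE)
)

# 3 cases x 3 measurements = 9 rows
print(dataLong)
```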
Note that the data are now prepared to be used in the pairwise.t.test() function. See the One-Way ANOVA with Pairwise Comparisons tutorial for details on using the pairwise.t.test() function.
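As a minimal sketch of that next step, the example below runs pairwise.t.test() on long-format data. It uses base R's stack() in place of melt() so that it runs without the reshape package; the data values, paired = TRUE, and the Bonferroni adjustment are illustrative assumptions rather than settings taken from the referenced tutorial.

```r
# Hypothetical repeated measures data in wide format (values are made up)
dataRepeated <- data.frame(
  case   = 1:6,
  valueA = c(1, 2, 2, 3, 1, 2),
  valueB = c(2, 3, 4, 3, 3, 3),
  valueC = c(4, 5, 4, 5, 3, 5)
)

# Long format via base R: `values` holds the scores, `ind` the measurement name.
# Rows stay in case order within each measurement, which paired tests rely on.
dataLong <- stack(dataRepeated[c("valueA", "valueB", "valueC")])

# Pairwise paired t tests between the three measurements
result <- pairwise.t.test(
  dataLong$values,
  dataLong$ind,
  p.adjust.method = "bonferroni",  # assumed adjustment method
  paired = TRUE                    # assumed: same cases measured three times
)

print(result)
```

With data melted by melt() as shown earlier, the first two arguments would instead be the value column and the abcValues column of the melted data frame.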