# R Tutorial Series: Applying the Reshape Package to Organize ANOVA Data

March 14, 2011

(This article was first published on R Tutorial Series, and kindly contributed to R-bloggers)

As demonstrated in the preceding ANOVA tutorials, data organization is central to conducting ANOVA in R. In standard ANOVA, we used the tapply() function to generate a table for a single summary function. In repeated measures ANOVA, we used separate datasets for our omnibus ANOVA and follow-up comparisons. This tutorial will demonstrate how the reshape package can be used to simplify the ANOVA data organization process in R.

### Tutorial Files

Before we begin, you may want to download the between groups and repeated measures datasets (.csv) used in this tutorial. Be sure to right-click and save the files to your R working directory. The between groups dataset contains a hypothetical sample of 30 cases separated into three groups (a, b, and c). The repeated measures dataset contains a hypothetical sample of 10 cases across three measurements (a, b, and c). In both datasets, the values are on a scale that ranges from 1 to 5.

### Beginning Steps

To begin, we need to read our datasets into R and store their contents in variables.

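    > #read the datasets into R variables using the read.csv(file) function

The remaining steps assume that the between groups data are stored in dataBetween and the repeated measures data are stored in dataRepeated.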
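As a minimal sketch of this step: the file name, column names, and values below are made up for illustration, and the file is first written to a temporary directory so that the example runs on its own. Only the dataBetween variable name comes from this tutorial.

```r
# Hypothetical stand-in for the downloaded between groups file:
# the file name, column names, and values here are made up for illustration.
betweenPath <- file.path(tempdir(), "between_example.csv")
writeLines(c("group,value", "a,2", "a,1", "b,3", "b,4", "c,5", "c,4"), betweenPath)

# read.csv(file) reads a comma-separated file with a header row into a data frame
dataBetween <- read.csv(betweenPath)

# inspect the column names and types before reshaping
str(dataBetween)
```

With the real files saved in the working directory, only the path passed to read.csv() should need to change, and the repeated measures file can be read the same way into dataRepeated.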

### Reshape Package

Next, we need to install and load the reshape package. In this tutorial, we will make use of the package’s cast() and melt() functions.

    > #install the package
    > install.packages("reshape")
    > #load the package
    > library(reshape)

### Using cast() to Derive ANOVA Descriptives

The cast() function can be used to easily derive summary statistics for a between groups ANOVA dataset. The cast() function receives the following primary arguments.

• data: the dataset
• formula: in our case, a one-sided formula indicating the grouping variable
• fun.aggregate: a function or vector of functions for deriving summary statistics, such as mean, var, or sd

    > #display the raw between groups data
    > dataBetween

[Figure: the raw between groups data]

    > #cast the between groups data using cast(data, formula, fun.aggregate) to get the group means
    > cast(dataBetween, formula = ~group, fun.aggregate = mean)

[Figure: the cast data with means]

Note that the fun.aggregate argument can also receive a vector of summary statistics functions. This yields all of the requested descriptives from a single cast() call.

    > #cast the between groups data using cast(data, formula, fun.aggregate) to get the group means, variances, and standard deviations
    > cast(dataBetween, formula = ~group, fun.aggregate = c(mean, var, sd))

[Figure: the cast data with descriptives]
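To cross-check what cast() is summarizing, the same descriptives can be derived in base R with the tapply() function used in the earlier tutorials. This is only a sketch: the data are made up, and the value column name and the summaryFunctions and descriptives objects are assumptions for illustration, not part of the tutorial's dataset.

```r
# Made-up between groups data: 30 cases, three groups, values on a 1 to 5 scale.
# The column name "value" is an assumption for this sketch.
dataBetween <- data.frame(
  group = rep(c("a", "b", "c"), each = 10),
  value = c(1, 2, 2, 3, 1, 2, 3, 2, 1, 2,
            3, 3, 2, 4, 3, 2, 4, 3, 3, 4,
            5, 4, 4, 5, 3, 5, 5, 4, 5, 4)
)

# apply each summary function to the values within each group
summaryFunctions <- list(mean = mean, var = var, sd = sd)
descriptives <- sapply(summaryFunctions, function(f) {
  tapply(dataBetween$value, dataBetween$group, f)
})

# rows are the groups (a, b, c); columns are mean, var, and sd
descriptives
```

Each tapply() call handles one summary function at a time, which is why a single cast() call with a vector of functions is the more compact route.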

### Using melt() to Prepare Repeated Measures Data for Pairwise Comparisons

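The melt() function can be used to restructure a repeated measures ANOVA dataset from wide format (one column per measurement) to long format (one row per case per measurement) prior to conducting pairwise comparisons. The melt() function receives the following primary arguments.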

• data: the dataset
• id.vars: the id variable, or a vector of variables, that identifies each case
• measure.vars: a vector containing the variables to be melted
• variable_name: the name of the column containing the melted variables

    > #display the repeated measures data
    > dataRepeated

[Figure: the raw repeated measures data]

    > #melt the repeated measures data using melt(data, id.vars, measure.vars, variable_name) to organize it for pairwise comparisons
    > melt(dataRepeated, id.vars = "case", measure.vars = c("valueA", "valueB", "valueC"), variable_name = "abcValues")

[Figure: the melted repeated measures data]

Note that the data are now prepared to be used in the pairwise.t.test() function. See the One-Way ANOVA with Pairwise Comparisons tutorial for details on using the pairwise.t.test() function.
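A sketch of that hand-off, assuming the reshape package is installed. The values are made up, dataMelted and comparisons are hypothetical names, and the paired = TRUE and p.adjust.method = "bonferroni" settings are illustrative choices for repeated measures data rather than something this tutorial prescribes.

```r
library(reshape)

# Made-up repeated measures data: 10 cases, three measurements, values from 1 to 5
dataRepeated <- data.frame(
  case   = 1:10,
  valueA = c(1, 2, 2, 3, 1, 2, 3, 2, 1, 2),
  valueB = c(3, 3, 2, 4, 3, 2, 4, 3, 3, 4),
  valueC = c(5, 4, 4, 5, 3, 5, 5, 4, 5, 4)
)

# melt to long format: one row per case per measurement
dataMelted <- melt(dataRepeated, id.vars = "case",
                   measure.vars = c("valueA", "valueB", "valueC"),
                   variable_name = "abcValues")

# the scores are in the "value" column and the measurement labels in "abcValues";
# melt() keeps the cases in the same order within each measurement, which the
# paired comparisons rely on (an assumption worth checking on real data)
comparisons <- pairwise.t.test(dataMelted$value, dataMelted$abcValues,
                               p.adjust.method = "bonferroni", paired = TRUE)
comparisons
```

Storing the melted data in a variable, rather than only printing it, is what makes it available to the follow-up comparisons.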

### Complete ANOVA Reshape Example

To see a complete example of how ANOVA data can be organized using the reshape package in R, please download the ANOVA reshape example (.txt) file.
