(This article was first published on **R Tutorial Series**, and kindly contributed to R-bloggers)

As demonstrated in the preceding ANOVA tutorials, data organization is central to conducting ANOVA in R. In standard ANOVA, we used the tapply() function to generate a table for a single summary function. In repeated measures ANOVA, we used separate datasets for our omnibus ANOVA and follow-up comparisons. This tutorial will demonstrate how the *reshape* package can be used to simplify the ANOVA data organization process in R.

### Tutorial Files

Before we begin, you may want to download the between groups and repeated measures datasets (.csv) used in this tutorial. Be sure to right-click and save the files to your R working directory. The between groups dataset contains a hypothetical sample of 30 cases separated into three groups (a, b, and c). The repeated measures dataset contains a hypothetical sample of 10 cases, each measured three times (a, b, and c). In both datasets, the values are on a scale that ranges from 1 to 5.

### Beginning Steps

To begin, we need to read our datasets into R and store their contents in variables.

- > #read the datasets into R variables using the read.csv(file) function
- > dataBetween <- read.csv("dataset_ANOVA_reshape_1.csv")
- > dataRepeated <- read.csv("dataset_ANOVA_reshape_2.csv")
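After reading the files, it can help to confirm that the data have the layout described above. The following is a minimal sketch that builds hypothetical stand-in data frames in place of the two .csv files, so it runs without them. The scores are invented for illustration, and the column name values is an assumption; group, case, valueA, valueB, and valueC are the names used in the calls later in this tutorial.

```r
# Hypothetical stand-ins for the two .csv files; the real scores will differ.
# The column name "values" is an assumption about the between groups file.
dataBetween <- data.frame(
  group  = rep(c("a", "b", "c"), each = 10),
  values = c(1, 2, 2, 3, 1, 2, 3, 2, 1, 2,
             2, 3, 3, 4, 2, 3, 3, 3, 2, 4,
             4, 5, 4, 5, 3, 5, 4, 5, 4, 5)
)
dataRepeated <- data.frame(
  case   = 1:10,
  valueA = c(1, 2, 2, 3, 1, 2, 3, 2, 1, 2),
  valueB = c(2, 3, 3, 4, 2, 3, 3, 3, 2, 4),
  valueC = c(4, 5, 4, 5, 3, 5, 4, 5, 4, 5)
)

# Inspect the column types and the first few rows
str(dataBetween)
head(dataRepeated)

# Between groups data: 30 cases split across 3 groups
stopifnot(nrow(dataBetween) == 30,
          length(unique(dataBetween$group)) == 3)

# Repeated measures data: 10 cases, one id column plus 3 measurements
stopifnot(nrow(dataRepeated) == 10,
          ncol(dataRepeated) == 4)

# Every score should fall on the 1-to-5 scale
stopifnot(all(dataBetween$values >= 1 & dataBetween$values <= 5))
```

When working with the downloaded files, replace the two data.frame() calls with the read.csv() calls shown above and adjust the column names to match what str() reports.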

### Reshape Package

Next, we need to install and load the *reshape* package. In this tutorial, we will make use of the package’s cast() and melt() functions.

- > #install the package
- > install.packages("reshape")
- > #load the package
- > library(reshape)

### Using cast() to Derive ANOVA Descriptives

The cast() function can be used to easily derive summary statistics for a between groups ANOVA dataset. The cast() function receives the following primary arguments.

- data: the dataset
- formula: in our case, a one-sided formula indicating the grouping variable
- fun.aggregate: a function or vector of functions for deriving summary statistics, such as mean, var, or sd

- > #display the raw between groups data
- > dataBetween

- > #cast the between groups data using cast(data, formula, fun.aggregate) to get the group means
- > cast(dataBetween, formula = ~group, fun.aggregate = mean)

Note that the fun.aggregate argument can also receive a vector of summary statistics functions. This will yield all of the requested descriptives via a single cast() function.

- > #cast the between groups data using cast(data, formula, fun.aggregate) to get the group means, variances, and standard deviations
- > cast(dataBetween, formula = ~group, fun.aggregate = c(mean, var, sd))
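As a cross-check that does not depend on the *reshape* package, the same three descriptives can be sketched in base R with tapply(), the function used in the earlier ANOVA tutorials. The data frame below is a hypothetical stand-in for dataBetween, and the column name values is an assumption about the .csv file.

```r
# Hypothetical stand-in for dataBetween; the real scores will differ.
# The column name "values" is an assumption about the .csv file.
dataBetween <- data.frame(
  group  = rep(c("a", "b", "c"), each = 10),
  values = c(1, 2, 2, 3, 1, 2, 3, 2, 1, 2,
             2, 3, 3, 4, 2, 3, 3, 3, 2, 4,
             4, 5, 4, 5, 3, 5, 4, 5, 4, 5)
)

# Named list of the summary functions to apply to each group
funs <- list(mean = mean, var = var, sd = sd)

# Apply each function per group with tapply(); sapply() binds the
# per-group results into a matrix with one row per group and one
# column per summary function
descriptives <- sapply(funs, function(f) {
  tapply(dataBetween$values, dataBetween$group, f)
})

descriptives
```

This takes more code than a single cast() call, which is the convenience the *reshape* package offers here, but it gives an independent set of numbers to compare against.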

### Using melt() to Prepare Repeated Measures Data for Pairwise Comparisons

The melt() function can be used to morph a repeated measures ANOVA dataset prior to conducting pairwise comparisons. The melt() function receives the following primary arguments.

- data: the dataset
- id.vars: the name of the id variable, or a vector of variable names, that identifies each case
- measure.vars: a vector containing the names of the variables to be melted
- variable_name: the name to give the new column that records which original variable each value came from

- > #display the repeated measures data
- > dataRepeated

- > #melt the repeated measures data using melt(data, id.vars, measure.vars, variable_name) to organize it for pairwise comparisons
- > melt(dataRepeated, id.vars = "case", measure.vars = c("valueA", "valueB", "valueC"), variable_name = "abcValues")

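Note that the data are now in long format, with one row per case and measurement: the scores are stored in the value column and the measurement labels in the abcValues column. This is the layout that the pairwise.t.test() function expects. See the One-Way ANOVA with Pairwise Comparisons tutorial for details on using the pairwise.t.test() function.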
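To make the hand-off concrete, here is a minimal sketch that melts a hypothetical stand-in for dataRepeated and passes the result to pairwise.t.test(). The scores are invented for illustration, and the choices of paired = TRUE and a Bonferroni adjustment are assumptions made for this example, not settings taken from the original tutorial.

```r
library(reshape)

# Hypothetical stand-in for dataRepeated; the real scores will differ
dataRepeated <- data.frame(
  case   = 1:10,
  valueA = c(1, 2, 2, 3, 1, 2, 3, 2, 1, 2),
  valueB = c(2, 3, 3, 4, 2, 3, 3, 3, 2, 4),
  valueC = c(4, 5, 4, 5, 3, 5, 4, 5, 4, 5)
)

# Melt to long format: one row per case and measurement
dataMelted <- melt(dataRepeated,
                   id.vars = "case",
                   measure.vars = c("valueA", "valueB", "valueC"),
                   variable_name = "abcValues")

# Pairwise comparisons between the three measurements.
# paired = TRUE treats the measurements as repeated on the same cases,
# which relies on the rows keeping the same case order within each
# measurement, as they do straight after melt().
# The Bonferroni adjustment is an assumption made for this sketch.
comparisons <- pairwise.t.test(dataMelted$value,
                               dataMelted$abcValues,
                               p.adjust.method = "bonferroni",
                               paired = TRUE)

comparisons
```

Storing the melted data in a variable, here the hypothetical name dataMelted, keeps it available for the follow-up comparisons instead of only printing it to the console.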

### Complete ANOVA Reshape Example

To see a complete example of how ANOVA data can be organized using the *reshape* package in R, please download the ANOVA reshape example (.txt) file.
