R Tutorial Series: Applying the Reshape Package to Organize ANOVA Data
[This article was first published on R Tutorial Series, and kindly contributed to R-bloggers].
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
As demonstrated in the preceding ANOVA tutorials, data organization is central to conducting ANOVA in R. In standard ANOVA, we used the tapply() function to generate a table for a single summary function. In repeated measures ANOVA, we used separate datasets for our omnibus ANOVA and follow-up comparisons. This tutorial will demonstrate how the reshape package can be used to simplify the ANOVA data organization process in R.
Tutorial Files
Before we begin, you may want to download the between groups and repeated measures datasets (.csv) used in this tutorial. Be sure to right-click and save the files to your R working directory. The between groups dataset contains a hypothetical sample of 30 cases separated into three groups (a, b, and c). The repeated measures dataset contains a hypothetical sample of 10 cases across three measurements (a, b, and c). In both cases, the values are represented on a scale that ranges from 1 to 5.
Beginning Steps
To begin, we need to read our datasets into R and store their contents in variables.
- > #read the datasets into R variables using the read.csv(file) function
- > dataBetween <- read.csv("dataset_ANOVA_reshape_1.csv")
- > dataRepeated <- read.csv("dataset_ANOVA_reshape_2.csv")
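If the .csv files are not at hand, data frames with the same assumed layout can be built directly in R. The sketch below is only a stand-in: the column names group, case, valueA, valueB, and valueC are taken from the calls used later in this tutorial, while the column name value, the equal split of 10 cases per group, and all of the scores are made-up placeholders rather than the contents of the downloadable files.

```r
# Stand-in for the between groups file: 30 cases in three groups (a, b, c).
# The "value" column name, the 10-per-group split, and the scores are assumptions.
set.seed(1)
dataBetween <- data.frame(
  group = rep(c("a", "b", "c"), each = 10),
  value = sample(1:5, 30, replace = TRUE)
)

# Stand-in for the repeated measures file: 10 cases measured three times.
dataRepeated <- data.frame(
  case   = 1:10,
  valueA = sample(1:5, 10, replace = TRUE),
  valueB = sample(1:5, 10, replace = TRUE),
  valueC = sample(1:5, 10, replace = TRUE)
)

# Inspect the structure of both data frames
str(dataBetween)
str(dataRepeated)
```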
Reshape Package
Next, we need to install and load the reshape package. In this tutorial, we will make use of the package’s cast() and melt() functions.
- > #install the package
- > install.packages("reshape")
- > #load the package
- > library(reshape)
Using cast() to Derive ANOVA Descriptives
The cast() function can be used to easily derive summary statistics for a between groups ANOVA dataset. The cast() function receives the following primary arguments.
- data: the dataset
- formula: in our case, a one-sided formula indicating the grouping variable
- fun.aggregate: a function or vector of functions for deriving summary statistics, such as mean, var, or sd
- > #display the raw between groups data
- > dataBetween
The raw between groups data
- > #cast the between groups data using cast(data, formula, fun.aggregate) to get the group means
- > cast(dataBetween, formula = ~group, fun.aggregate = mean)
The cast data with means
Note that the fun.aggregate argument can also receive a vector of summary statistics functions. This will yield all of the requested descriptives via a single cast() function.
- > #cast the between groups data using cast(data, formula, fun.aggregate) to get the group means, variances, and standard deviations
- > cast(dataBetween, formula = ~group, fun.aggregate = c(mean, var, sd))
The cast data with descriptives
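As a cross-check on the cast() output, the same descriptives can be computed in base R with tapply(), the function used in the earlier ANOVA tutorials. The following is a sketch on made-up data; the value column name and the scores are assumptions, not the tutorial dataset.

```r
# Illustrative data with the assumed layout: a grouping column and a score column
dataBetween <- data.frame(
  group = rep(c("a", "b", "c"), each = 4),
  value = c(1, 2, 3, 4,  2, 3, 4, 5,  1, 1, 5, 5)
)

# One tapply() call per summary function, bound into a single table
# (rows are groups, columns are the summary statistics)
descriptives <- sapply(
  list(mean = mean, var = var, sd = sd),
  function(f) tapply(dataBetween$value, dataBetween$group, f)
)
descriptives
```

This takes one tapply() pass per statistic, which is the repetition that a single cast() call with a vector of functions avoids.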
Using melt() to Prepare Repeated Measures Data for Pairwise Comparisons
The melt() function can be used to morph a repeated measures ANOVA dataset prior to conducting pairwise comparisons. The melt() function receives the following primary arguments.
- data: the dataset
- id.vars: the id variable or a vector of values that can be used as ids
- measure.vars: a vector containing the variables to be melted
- variable_name: the name of the column containing the melted variables
- > #display the repeated measures data
- > dataRepeated
The raw repeated measures data
- > #melt the repeated measures data using melt(data, id.vars, measure.vars, variable_name) to organize it for pairwise comparisons
- > melt(dataRepeated, id.vars = "case", measure.vars = c("valueA", "valueB", "valueC"), variable_name = "abcValues")
The melted repeated measures data
Note that the data are now prepared to be used in the pairwise.t.test() function. See the One-Way ANOVA with Pairwise Comparisons tutorial for details on using the pairwise.t.test() function.
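The hand-off from melt() to pairwise.t.test() might look like the following. This is a minimal sketch on made-up scores, not the tutorial dataset, and it assumes the reshape package is installed. The choice of paired = TRUE and the Bonferroni adjustment are assumptions about a typical repeated measures follow-up, not settings stated in this tutorial.

```r
library(reshape)

# Illustrative repeated measures data (made-up scores on the 1 to 5 scale)
dataRepeated <- data.frame(
  case   = 1:6,
  valueA = c(1, 2, 1, 3, 2, 2),
  valueB = c(3, 3, 2, 4, 4, 3),
  valueC = c(5, 4, 4, 5, 5, 3)
)

# Melt to long format: one row per case per measurement
dataLong <- melt(dataRepeated, id.vars = "case",
                 measure.vars = c("valueA", "valueB", "valueC"),
                 variable_name = "abcValues")

# Paired comparisons between the three measurements.
# The scores sit in the "value" column and the labels in "abcValues".
pairwise.t.test(dataLong$value, dataLong$abcValues,
                paired = TRUE, p.adjust.method = "bonferroni")
```

Because melt() stacks the measurements in case order, the rows for each measurement line up case by case, which is what a paired comparison relies on.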
Complete ANOVA Reshape Example
To see a complete example of how ANOVA data can be organized using the reshape package in R, please download the ANOVA reshape example (.txt) file.