(This article was first published on **Psychological Statistics**, and kindly contributed to R-bloggers)

Psychologists are gradually coming round to the view that it is a good idea to present interval estimates alongside point estimates of statistics. The most common statistic reported in psychology research is almost certainly the mean (strictly the arithmetic mean). Presenting an interval estimate for the mean of a single sample is usually quite simple. This is usually done as a 95% confidence interval (CI) about the mean, and most researchers in psychology are able to calculate this by hand or get their statistical software to calculate and graph it for them.
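As a concrete illustration of the single-sample case, here is a minimal sketch in Python (standard library only; it is not the R code that accompanies the paper). The function name `mean_ci`, the data and the choice to pass the critical value of *t* in by hand are all assumptions made for the example.

```python
import math
import statistics


def mean_ci(sample, t_crit):
    """Return (mean, lower, upper) for a t-based CI about a single mean.

    t_crit is the two-sided critical value of t on len(sample) - 1 degrees
    of freedom. It is supplied by the caller so that this sketch needs
    nothing beyond the standard library.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
    half_width = t_crit * se
    return mean, mean - half_width, mean + half_width


# Hypothetical scores from five participants; 2.776 is t(.975) on 4 df.
scores = [10, 12, 14, 16, 18]
mean, lower, upper = mean_ci(scores, t_crit=2.776)
print(f"mean = {mean:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

In practice the critical value would come from a *t* table or from statistical software rather than being typed in.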

Extending this to more than one mean introduces an additional layer of complexity. This is because the difference between two means is a different quantity, and its CI (although related to those of the individual means) is different in width from the CIs of the individual means. This creates a problem when plotting the CI because a researcher might be interested in the CI for an individual mean, the CI for their difference (or both).
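To see why the interval for a difference is not the same width as the intervals for the individual means, consider the simplest case of two independent samples. The sketch below (Python, standard library only, with made-up data and hypothetical function names) computes the standard error of each mean and of their difference, assuming independence and without pooling variances.

```python
import math
import statistics


def standard_error(sample):
    """Standard error of a single sample mean."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def se_of_difference(sample_a, sample_b):
    """Standard error of the difference between two independent means.

    Variances of independent estimates add, so the squared standard
    errors are summed before taking the square root.
    """
    return math.sqrt(standard_error(sample_a) ** 2 + standard_error(sample_b) ** 2)


# Hypothetical independent groups with identical spread.
group_a = [10, 12, 14, 16, 18]
group_b = [13, 15, 17, 19, 21]

se_a = standard_error(group_a)
se_b = standard_error(group_b)
se_diff = se_of_difference(group_a, group_b)

print(f"SE of each mean: {se_a:.3f}, {se_b:.3f}")
print(f"SE of the difference: {se_diff:.3f}")
print(f"Ratio to SE of one mean: {se_diff / se_a:.3f}")
```

When the two standard errors are equal, the standard error of the difference is larger than either by a factor of the square root of 2, so an error bar drawn for one quantity cannot simply be read as an error bar for the other.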

The complexity increases further if the aim is to plot a set of means (e.g., from an ANOVA design). In this case, plotting all the possible differences (as is commonly done) obscures patterns in the individual means (e.g., linear or quadratic trends). Last, but not least, if the means are not from independent samples, there are further difficulties. This happens in within-subjects or repeated measures designs.

In these designs the variation around each mean is correlated with the variation around the other means. This correlation arises from individual differences. Statistical procedures such as ANOVA can capitalize on these individual differences to produce more sensitive statistical inferences (i.e., to increase statistical power or obtain narrower CIs). This is done by estimating the variation due to individual differences, and removing it from the error variance (the estimate of statistical noise in the data set).

This is a problem for graphical presentation of means because the precision of individual means is influenced by individual differences, whereas the precision of differences between means is not (because the estimate of individual differences is common to repeated samples from the same people and thus can be removed). Further complications arise when the sphericity assumption of repeated measures ANOVA is violated.
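The contrast between the precision of individual means and the precision of their difference can be made concrete with a small sketch (Python, standard library only). The data and function names are invented for illustration: five hypothetical participants who differ a lot from one another but show a consistent effect across two conditions.

```python
import math
import statistics


def se_treating_as_independent(cond_a, cond_b):
    """SE of the mean difference if the two conditions are (wrongly)
    treated as independent samples: individual differences stay in."""
    n = len(cond_a)
    return math.sqrt(statistics.variance(cond_a) / n + statistics.variance(cond_b) / n)


def se_paired(cond_a, cond_b):
    """SE of the mean difference computed from within-person difference
    scores: each person's overall level cancels out of the subtraction."""
    diffs = [b - a for a, b in zip(cond_a, cond_b)]
    return statistics.stdev(diffs) / math.sqrt(len(diffs))


# Hypothetical repeated measures data: same five people in both conditions.
condition_1 = [10, 20, 30, 40, 50]
condition_2 = [12, 23, 31, 43, 51]

se_ind = se_treating_as_independent(condition_1, condition_2)
se_rep = se_paired(condition_1, condition_2)

print(f"SE ignoring the pairing: {se_ind:.3f}")
print(f"SE using difference scores: {se_rep:.3f}")
```

The large between-person variation dominates the first standard error but drops out of the second, which is why a conventional error bar around each mean says little about how precisely the difference between the means has been estimated.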

Several solutions to these problems have been proposed in the literature. The best known of these in psychology is that of Loftus and Masson (1994). Another well-known solution is that of Goldstein and Healy (1995), extended to correlated samples by Afshartous and Preston (2010).

Despite a large literature on the problems of graphing a set of correlated means, many people avoid the problems altogether by not reporting (or graphing) CIs, or by reporting CIs that are misleading in some way. Researchers are often unaware of the problems or find the solutions hard to understand and implement.

I recently reviewed the main approaches in the literature, described how to obtain suitable intervals for individual means and for differences between means, and provided R code to calculate and plot the intervals.

The main highlights are that:

i) for inferences about individual means the standard approach works fairly well for between-subject (independent measures) designs, but there is a case to use CIs from a multilevel model for within-subject (repeated measures) designs

ii) an approach proposed by Cousineau (2005) with a correction by Morey (2008) offers advantages over the Loftus and Masson (1994) approach for within-subject ANOVA designs. It simplifies the calculations and does not assume sphericity. The Loftus-Masson approach will however usually be superior when *n* is small.

iii) if you are interested in differences between means then you should probably plot a version of the Cousineau-Morey (or Loftus-Masson) interval that is adjusted so that non-overlap of the CIs around two individual means corresponds to a CI for their difference that excludes zero. This can be done by applying a multiplier to the width of the individual CIs. This multiplier is equal to (2^0.5)/2, or approximately 0.71.

iv) if you are interested in both precision of individual means and their differences you can use a two-tiered error bar to display both quantities (Cleveland, 1985).

v) the intervals (and graphical presentation of means) are useful for informal inference about a set of means. For formal inference it is better to set up precise hypotheses and test these via an a priori contrast. This could be a traditional null hypothesis significance test, but other approaches are available. These include confidence intervals, Bayes factors, likelihood ratios and so forth (Baguley, in press; Dienes, 2008).
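Points (ii) and (iii) can be sketched in a few lines. What follows is an illustrative Python version (standard library only) of the Cousineau-Morey interval as I understand it from the cited papers, not a port of the R code linked below. The function name, argument names and data are hypothetical, and the critical value of *t* on *n* - 1 degrees of freedom is passed in by hand.

```python
import math
import statistics


def cousineau_morey_intervals(data, t_crit, difference_adjusted=False):
    """Return a list of (mean, lower, upper), one per condition.

    data: one row per participant, one column per within-subject condition.
    t_crit: two-sided critical value of t on (participants - 1) df.
    difference_adjusted: if True, scale the half-width by sqrt(2)/2.
    """
    n = len(data)
    j = len(data[0])
    grand_mean = statistics.fmean([x for row in data for x in row])

    # Cousineau (2005): subtract each participant's own mean and add back
    # the grand mean, which removes individual differences in overall level.
    normalized = [
        [x - statistics.fmean(row) + grand_mean for x in row] for row in data
    ]

    # Morey (2008): normalizing shrinks the variance, so rescale the
    # interval width by sqrt(J / (J - 1)) for J conditions.
    correction = math.sqrt(j / (j - 1))
    multiplier = math.sqrt(2) / 2 if difference_adjusted else 1.0

    intervals = []
    for column in zip(*normalized):
        mean = statistics.fmean(column)
        se = statistics.stdev(column) / math.sqrt(n)
        half_width = t_crit * correction * multiplier * se
        intervals.append((mean, mean - half_width, mean + half_width))
    return intervals


# Hypothetical data: five participants, three conditions; t(.975) on 4 df.
scores = [
    [10, 12, 14],
    [20, 23, 24],
    [30, 31, 35],
    [40, 43, 44],
    [50, 51, 55],
]
for mean, lower, upper in cousineau_morey_intervals(scores, 2.776, True):
    print(f"mean = {mean:.2f}, difference-adjusted CI [{lower:.2f}, {upper:.2f}]")
```

With only two conditions the difference-adjusted half-width works out to exactly half the half-width of the paired-samples CI for the difference, so the two bars fail to overlap precisely when that CI excludes zero.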

The paper is available here, the R code here and the data sets here.

Update: R functions now available for the simpler between-subjects (independent measures) ANOVA case (at the Serious stats blog).

*References*

Afshartous, D., & Preston, R. A. (2010). Confidence intervals for dependent data: equating nonoverlap with statistical significance. *Computational Statistics and Data Analysis, 54*, 2296-2305.

Baguley, T. (2011, in press). Calculating and graphing within-subject confidence intervals for ANOVA. *Behavior Research Methods*. DOI: 10.3758/s13428-011-0123-7

Baguley, T. (2012, in press). *Serious Stats: A guide to advanced statistics for the behavioral sciences*. Basingstoke: Palgrave.

Cousineau, D. (2005). Confidence intervals in within-subject designs: A simpler solution to Loftus and Masson’s method. *Tutorials in Quantitative Methods for Psychology, 1*, 42-45.

Dienes, Z. (2008). *Understanding Psychology as a Science: An Introduction to Scientific and Statistical Inference*. Basingstoke: Palgrave Macmillan.

Goldstein, H., & Healy, M. J. R. (1995). The graphical presentation of a collection of means. *Journal of the Royal Statistical Society. Series A (Statistics in Society), 158*, 175-177.

Loftus, G. R., & Masson, M. E. J. (1994). Using confidence intervals in within-subject designs. *Psychonomic Bulletin & Review, 1*, 476-490.

Morey, R. D. (2008). Confidence intervals from normalized data: A correction to Cousineau (2005). *Tutorials in Quantitative Methods for Psychology, 4*, 61-64.