Looking back at the coverage of the Chi-Square test of independence in the book, there are a couple of things I wish I’d gone into in greater depth. First, resolving the debate on the appropriate way to handle small expected values in the test of independence. Second, expanding on residual analysis.
First, the issue of small expected values. It is well known that the Pearson test of independence performs pretty well with moderate to large N and most expected values over 5 (the usual advice is that at least 80% of cells should have expected values greater than 5). However, simulations by Campbell (2007) support a simple correction first proposed by Egon Pearson (son of Karl Pearson, with whom the Chi-square test is associated). This correction simply multiplies the test statistic by (N-1)/N, leaving the degrees of freedom unchanged. The procedure performs well even for relatively small values of N, provided expected values are greater than 1. This covers most useful applications of Chi-square (and underlines that with large N the uncorrected test is generally going to be OK, because (N-1)/N is then very close to 1). If you have expected values lower than 1 then some form of exact test might be appropriate. However, the real problem in this case is that there is very sparse information for some cells and hence low statistical power. My present intuition is that this might lend itself to a Bayesian approach using informative or weakly informative priors.
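To make the correction concrete, here is a minimal Python sketch of the N-1 test for an r x c table of counts. It is an illustration only, not the R function mentioned in this post. The function names (chi2_sf, n_minus_1_chi_square) and the example table are made up for the purpose, and the p value relies on a closed-form chi-square tail probability that is valid only for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{i=1}^{(df-1)/2} (x/2)^(i-1/2) / Gamma(i+1/2)
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        if i > 1:
            term *= half / (i - 0.5)
        total += math.exp(-half) * term
    return total


def n_minus_1_chi_square(table):
    """Pearson chi-square and its (N-1)/N corrected version for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    pearson = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            pearson += (observed - expected) ** 2 / expected
    corrected = pearson * (n - 1) / n  # the correction: same df, smaller statistic
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return {
        "pearson": pearson,
        "corrected": corrected,
        "df": df,
        "p_value": chi2_sf(corrected, df),
        "min_expected": min_expected,  # check this against the "greater than 1" guideline
    }


# Hypothetical 2 x 2 table of counts, for illustration only
result = n_minus_1_chi_square([[10, 3], [2, 15]])
print(result)
```

The smallest expected value is returned alongside the statistic so that it can be checked against the guideline above before trusting the p value.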
Second, I covered standardized residuals in the book, but I recently discovered that the classical Pearson residual does not have great distributional properties and is somewhat conservative when used for testing. Furthermore, at least two quantities are referred to in the literature as standardized residuals. One of these, which I prefer to term the adjusted standardized residual (ASR), is generally recommended if you are following up a contingency table analysis (see Agresti, 2007). For large tables, using the ASR could run into multiple testing issues, so I’d recommend a correction such as the Holm or Hommel procedure. If you have specific hypotheses to test about patterns within the contingency table, I’d recommend a different approach such as a log-linear model or a count model (such as Poisson or negative binomial regression).
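As a rough sketch of this kind of follow-up, the Python below computes the Pearson residual, (O - E)/sqrt(E), and the ASR, which divides the Pearson residual by sqrt((1 - row total/N)(1 - column total/N)), then applies a Holm correction to the two-sided normal p values for the ASRs. Treat it as an illustration under those textbook definitions rather than a reproduction of the R function mentioned in this post. The function names and the example table are hypothetical.

```python
import math


def residual_analysis(table):
    """Return Pearson residuals, ASRs and unadjusted two-sided p values for the ASRs."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    pearson, adjusted, p_values = [], [], []
    for i, row in enumerate(table):
        pearson_row, adjusted_row = [], []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            pr = (observed - expected) / math.sqrt(expected)
            # The ASR rescales the Pearson residual so that it is approximately N(0, 1)
            asr = pr / math.sqrt((1 - row_totals[i] / n) * (1 - col_totals[j] / n))
            pearson_row.append(pr)
            adjusted_row.append(asr)
            # Two-sided standard normal tail probability for |ASR|
            p_values.append(math.erfc(abs(asr) / math.sqrt(2)))
        pearson.append(pearson_row)
        adjusted.append(adjusted_row)
    return pearson, adjusted, p_values


def holm_adjust(p_values):
    """Holm step-down adjusted p values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, k in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[k])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[k] = running_max
    return adjusted


# Hypothetical 2 x 3 table of counts, for illustration only
pearson_res, asr, raw_p = residual_analysis([[12, 5, 3], [4, 9, 11]])
holm_p = holm_adjust(raw_p)
for row in asr:
    print([round(value, 3) for value in row])
print([round(p, 4) for p in holm_p])
```

The adjusted p values are listed cell by cell in row-major order. The Holm correction is used here only because it is short to write out; the Hommel procedure is more involved and is not shown.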
Although the Egon Pearson N-1 Chi-square test is easy to calculate, getting exact p values is fiddly so I have implemented this in R (see Egon Pearson Chi-Square test with residual analyses). This R function also outputs standardized residuals and ASRs (the latter with p values adjusted for multiple testing by default).
Ian Campbell also provides a very easy to use calculator here. He also notes that Bruce Weaver and colleagues have discovered that the Egon Pearson corrected test is equivalent, for 2 × 2 tables, to the linear-by-linear association test provided in SPSS (and possibly other software).
Agresti, A. (2007), An Introduction to Categorical Data Analysis, 2nd Ed, New York: John Wiley & Sons.
Campbell, I. (2007), Chi-squared and Fisher-Irwin tests of two-by-two tables with small sample recommendations, Statistics in Medicine, 26, 3661–3675.