# Contingency Tables – Fisher’s Exact Test

March 6, 2010
(This article was first published on Software for Exploratory Data Analysis and Statistical Modelling, and kindly contributed to R-bloggers)

A contingency table is used in statistics to provide a tabular summary of categorical data. Each cell in the table holds the number of occasions on which a particular combination of variable levels occurs together in a set of data. The relationship between the variables in a contingency table is often investigated using Chi-squared tests.

The simplest contingency table has two variables, each with two levels. Consider a trial comparing the performance of two challengers. Each challenger undertook the trial eight times and the number of successful trials was recorded. The hypothesis under investigation in this experiment is that the performance of the two challengers is similar. If the first challenger was successful on only one trial and the second challenger was successful on four of the eight trials, can we discriminate between their performance?
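As an illustration of how such a table arises from raw data, the sketch below rebuilds the counts for this trial from per-trial records using `table`. The record vectors and the names `challenger`, `outcome` and `trial.table` are hypothetical and are not part of the original analysis.

```r
# Hypothetical per-trial records: 8 trials for each challenger
challenger <- rep(c("A", "B"), each = 8)
outcome <- c(rep(c("success", "failure"), times = c(1, 7)),  # challenger A
             rep(c("success", "failure"), times = c(4, 4)))  # challenger B

# Fix the level order so that "success" becomes the first column
outcome <- factor(outcome, levels = c("success", "failure"))

# Cross-tabulate the two categorical variables into a 2 x 2 table
trial.table <- table(challenger, outcome)
trial.table
```

Each row of the resulting table corresponds to a challenger and each column to an outcome, which is the layout assumed in the rest of the example.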

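The function fisher.test performs Fisher’s exact test, which is used when the sample size is small to avoid relying on an approximation that is known to be unreliable for small samples. The data are set up in a matrix with one row per challenger, the first column holding the number of successes and the second the number of failures: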

    challenge.df <- matrix(c(1,4,7,4), nrow = 2)

The function is then called using this data to produce the test summary information:

    > fisher.test(challenge.df)

        Fisher's Exact Test for Count Data

    data:  challenge.df
    p-value = 0.2821
    alternative hypothesis: true odds ratio is not equal to 1
    95 percent confidence interval:
     0.002553456 2.416009239
    sample estimates:
    odds ratio
     0.1624254
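The reported p-value can be reproduced by hand, which shows what "exact" means here. With the row and column totals held fixed, the number of successes for the first challenger follows a hypergeometric distribution under the null hypothesis. For a 2 x 2 table, the two-sided p-value sums the probabilities of all outcomes that are no more likely than the observed one. The following is a minimal sketch; the variable names are illustrative.

```r
challenge.df <- matrix(c(1, 4, 7, 4), nrow = 2)

# Margins of the observed table
successes <- sum(challenge.df[, 1])  # total successes across both challengers
failures  <- sum(challenge.df[, 2])  # total failures across both challengers
trials.a  <- sum(challenge.df[1, ])  # number of trials for the first challenger

# Probability of each possible success count for the first challenger,
# given the fixed margins (hypergeometric distribution)
counts <- 0:min(successes, trials.a)
prob <- dhyper(counts, m = successes, n = failures, k = trials.a)

# Two-sided p-value: total probability of the outcomes that are no more
# likely than the observed count (small tolerance for floating point ties)
observed <- dhyper(challenge.df[1, 1], m = successes, n = failures, k = trials.a)
p.manual <- sum(prob[prob <= observed * (1 + 1e-7)])
p.manual
```

The value of `p.manual` should agree with `fisher.test(challenge.df)$p.value`, about 0.2821.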

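The p-value of 0.28 is well above conventional significance levels such as 0.05, so the test does not provide evidence against the assumption of independence between challenger and outcome. The 95 percent confidence interval for the odds ratio also includes 1. In this example this means that we cannot confidently claim any difference in performance between the two challengers.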
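To see why the Chi-squared approximation is doubtful for this table, one can inspect the expected cell counts under independence. A common rule of thumb, added here as an assumption rather than taken from the original text, is that the approximation needs an expected count of roughly five or more in every cell. The sketch below computes the expected counts directly and compares them with those returned by `chisq.test`.

```r
challenge.df <- matrix(c(1, 4, 7, 4), nrow = 2)

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(challenge.df), colSums(challenge.df)) / sum(challenge.df)
expected

# chisq.test() computes the same expected counts; it warns that the
# approximation may be incorrect for this table, so the warning is suppressed
chi <- suppressWarnings(chisq.test(challenge.df))
chi$expected
```

The expected number of successes for each challenger is 2.5, below the rule-of-thumb threshold, which is the situation in which the exact test is preferred.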
