Barnard’s exact test – a powerful alternative for Fisher’s exact test (implemented in R)

February 7, 2010
By

(This article was first published on R-statistics blog » R, and kindly contributed to R-bloggers)

(The R code for Barnard’s exact test is at the end of the article, and you could also just download it from here)

About Barnard’s exact test

About half a year ago, I was studying various statistical methods to employ on contingency tables. I came across a promising method for 2×2 contingency tables called “Barnard’s exact test“. Barnard’s test is a non-parametric alternative to Fisher’s exact test which can be more powerful (for 2×2 tables) but is also more time-consuming to compute (References can be found in the Wikipedia article on the subject).

The test was first published by George Alfred Barnard (1945). Mehta and Senchaudhuri (2003) explain why Barnard’s test can be more powerful than Fisher’s under certain conditions:

When comparing Fisher’s and Barnard’s exact tests, the loss of power due to the greater discreteness of the Fisher statistic is somewhat offset by the requirement that Barnard’s exact test must maximize over all possible p-values, by choice of the nuisance parameter, π. For 2 × 2 tables the loss of power due to the discreteness dominates over the loss of power due to the maximization, resulting in greater power for Barnard’s exact test. But as the number of rows and columns of the observed table increase, the maximizing factor will tend to dominate, and Fisher’s exact test will achieve greater power than Barnard’s.

About the R implementation of Barnard’s exact test

After finding about Barnard’s test I was sad to discover that (at the time) there had been no R implementation of it. But last week, I received a surprising e-mail with good news. The sender, Peter Calhoun, currently a graduate student at the University of Florida, had implemented the algorithm in R. Peter had  found my posting on the R mailing list (from almost half a year ago) and was so kind as to share with me (and the rest of the R community) his R code for computing Barnard’s exact test. Here is some of what Peter wrote to me about his code:

On a side note, I believe there are more efficient codes than this one.  For example, I’ve seen codes in Matlab that run faster and display nicer-looking graphs.  However, this code will still provide accurate results and a plot that gives the p-value based on the nuisance parameter.  I did not come up with the idea of this code, I simply translated Matlab code into R, occasionally using different methods to get the same result.  The code was translated from:

Trujillo-Ortiz, A., R. Hernandez-Walls, A. Castro-Perez, L. Rodriguez-Cardozo. Probability Test.  A MATLAB file. URL

http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=6198

My goal was to make this test accessible to everyone.  Although there are many ways to run this test through Matlab, I hadn’t seen any code to implement this test in R.  I hope it is useful for you, and if you have any questions or ways to improve this code, please contact me at [email protected]

p.s: I added some minor cosmetics to the code, like allowing the input to be a table/matrix and the output to be a list.

The R function for Barnard’s exact test

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
# published on: 
# http://www.r-statistics.com/2010/02/barnards-exact-test-a-powerful-alternative-for-fishers-exact-test-implemented-in-r/
 
Barnardextest<-function(Ta,Tb =NULL,Tc =NULL,Td =NULL, to.print = F, to.plot = T)
{
# The first argument (Ta) can be either a table or a matrix of 2X2.
# Or instead, the values of the table can be entered one by one to the function
 
#Barnard's test calculates the probabilities for contingency tables.  It has been shown that for 2x2 tables, Barnard's test
#has a higher power than Fisher's Exact test.  Barnard's test is a non-parametric test that relies upon a computer to generate
#the distribution of the Wald statistic.  Using a computer program, one could find the nuisance parameter that maximizes the 
#probability of the observations displayed from a table.
#Despite giving lower p-values for 2x2 tables, Barnard's test hasn't been used as often as Fisher's test because of its
#computational difficulty.  This code gives the Wald statistic, the nuisance parameter, and the p-value for any 2x2 table.
#The table can be written as:
#			Var.1
#		 ---------------
#		   a		b	 r1=a+b
#	Var.2
#		   c		d	 r2=c+d
#		 ---------------
#		 c1=a+c   c2=b+d	 n=c1+c2
 
#One example would be 
#				Physics
#			 Pass	     Fail
#			 ---------------
#		Crane	   8		14
#  Collage	
#		Egret	   1		3
#			 ---------------
#
#After implementing the function, simply call it by the command:
#Barnardextest(8,14,1,3)
#This will display the results:
 
#"The contingency table is:"
#      [,1] [,2]
#[1,]    8   14
#[2,]    1    3
#"Wald Statistic:"
#0.43944
#"Nuisance parameter:"
#0.9001
#"The 1-tailed p-value:"
#0.4159073
 
#On a side note, I believe there are more efficient codes than this one.  For example, I've seen codes in Matlab that run
#faster and display nicer-looking graphs.  However, this code will still provide accurate results and a plot that gives the
#p-value based on the nuisance parameter.  I did not come up with the idea of this code, I simply translated Matlab code 
#into R, occasionally using different methods to get the same result.  The code was translated from:
#
#Trujillo-Ortiz, A., R. Hernandez-Walls, A. Castro-Perez, L. Rodriguez-Cardozo. Probability Test.  A MATLAB file. URL
#http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=6198
#
#My goal was to make this test accessible to everyone.  Although there are many ways to run this test through Matlab, I hadn't
#seen any code to implement this test in R.  I hope it is useful for you, and if you have any questions or ways to improve
#this code, please contact me at [email protected]
 
 
	# Tal edit: choosing if to work with a 2X2 table or with 4 numbers:
	if(is.null(Tb) | is.null(Tc) | is.null(Td) )
	{
		# If one of them is null, then Ta should have an entire table, and we can take it's values
		if(is.table(Ta) | is.matrix(Ta))
		{
			if(sum(dim(Ta) == c(2,2))==2)
			{
				Tb <- Ta[1,2]
				Tc <- Ta[2,1]
				Td <- Ta[2,2]
				Ta <- Ta[1,1]		
			} else {stop("The table is not 2X2, please check it again...") }		
		} else {stop("We are missing value in the table") }		
	}
 
 
	c1<-Ta+Tc
	c2<-Tb+Td
	n<-c1+c2
	pao<-Ta/c1
	pbo<-Tb/c2
	pxo<-(Ta+Tb)/n
	TXO<-abs(pao-pbo)/sqrt(pxo*(1-pxo)*(1/c1+1/c2))
	n1<-prod(1:c1)
	n2<-prod(1:c2)
 
	P<-{}
	for( p in (0:99+.01)/100)
	{
		TX<-{}
		S<-{}
		for( i in c(0:c1))
		{
			for( j in c(0:c2))
			{
				if(prod(1:i)==0){fac1<-1} else {fac1<-prod(1:i)}
				if(prod(1:j)==0){fac2<-1} else {fac2<-prod(1:j)}
				if(prod(1:(c1-i))==0){fac3<-1} else {fac3<-prod(1:(c1-i))}
				if(prod(1:(c2-j))==0){fac4<-1} else {fac4<-prod(1:(c2-j))}
 
				small.s<-(n1*n2*(p^(i+j))*(1-p)^(n-(i+j))/(fac1*fac2*fac3*fac4))
				S<-cbind(S,small.s)
				pa<- i/c1
				pb<-j/c2
				px <- (i+j)/n
				if(is.nan((pa-pb)/sqrt(px*(1-px)*((1/c1)+(1/c2)))))
					{
						tx<-0
					} else {
						tx <- (pa-pb)/sqrt(px*(1-px)*((1/c1)+(1/c2)))
					}
				TX<-cbind(TX,tx)
			}
		}
 
		P<-cbind(P,sum(S[which(TX>=TXO)]))
	}
 
	np<-which(P==max(P))
	p <- (0:99+.01)/100
	nuisance<-p[np]
	pv<-P[np]
 
 
	if(to.print)
	{ 
		print("The contingency table is:")
		print(matrix(c(Ta,Tc,Tb,Td),2,2))
		print("Wald Statistic:")
		print(TXO)
		print("Nuisance parameter:")
		print(nuisance)
		print("The 1-tailed p-value:")
		print(pv)
	}
	if(to.plot)
	{
		plot(p,P,type="l",main="Barnard's exact P-value", xlab="Nuisance parameter", ylab="P-value")
		points(nuisance,pv,col=2)
	}
 
	return(	
			list(
					contingency.table = as.table(matrix(c(Ta,Tc,Tb,Td),2,2)),
					Wald.Statistic = TXO,
					Nuisance.parameter = nuisance,
					p.value.one.tailed = pv					
				)
			)
}
 
 
Barnardextest(matrix(c(8,1,14,3),2,2))
fisher.test(matrix(c(8,1,14,3),2,2))
 
 
Convictions <-
matrix(c(2, 10, 15, 3),
		   nrow = 2,
		   dimnames =
		   list(c("Dizygotic", "Monozygotic"),
				c("Convicted", "Not convicted")))
Convictions
fisher.test(Convictions, alternative = "less")
Barnardextest(Convictions)

Fisher’s Exact Test for Count Data

data: Convictions
p-value = 0.0004652
alternative hypothesis: true odds ratio is less than 1
95 percent confidence interval:
0.0000000 0.2849601
sample estimates:
odds ratio
0.04693661

$contingency.table
A B
A 2 15
B 10 3

$Wald.Statistic
[1] 3.609941

$Nuisance.parameter
[1] 0.4401

$p.value.one.tailed
[1] 0.0001528846

Final note: I would like to thank Peter Calhoun again for sharing his code with the rest of us – Thanks Peter!

To leave a comment for the author, please follow the link and comment on his blog: R-statistics blog » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: , , , , , , , , , , ,

Comments are closed.