Estimating a proportion at first looks elementary. Hail to aymptotics, right? Well, initially it might seem efficient to iuse the fact that . In other words the classical confidence interval relies on the inversion of Wald’s test.
A function to ease the computation is the following (not really needed!).
waldci<- function(x,n,level){
phat<-sum(x)/n
results<-phat + c(-1,1)*qnorm(1-level/2)*sqrt(phat*(1-phat)/n)
print(results)
}
An exact confidence interval is the so-called Blake’s confidence interval
blakerci <- function(x,n,level,tolerance=1e-05){
lower = 0
upper = 1
if (x!=0){lower = qbeta((1-level)/2, x, n-x+1)
while (acceptbin(x, n, lower + tolerance) < (1 - level))
lower = lower+tolerance
}
if (x!=n){upper = qbeta(1 - (1-level)/2, x+1, n-x)
while (acceptbin(x, n, upper - tolerance) < (1 - level))
upper = upper-tolerance
}
c(lower,upper)
}
# Computes the Blaker exact ci (Canadian J. Stat 2000)
# for a binomial success probability
# for x successes out of n trials with
# confidence coefficient = level; uses acceptbin function
acceptbin = function(x, n, p){
#computes the Blaker acceptability of p when x is observed
# and X is bin(n, p)
p1 = 1 - pbinom(x - 1, n, p)
p2 = pbinom(x, n, p)
a1 = p1 + pbinom(qbinom(p1, n, p) - 1, n, p)
a2 = p2 + 1 - pbinom(qbinom(1 - p2, n, p), n, p)
return(min(a1,a2))
}
A comparison is easily made along the following lines of code
list.counts.bl=NA
list.counts.wl=NA
prob=c(0.05,0.1,.9,.97)
for (j in 1:4){
p=prob[j]
n=9
size=1
level=0.05
N=30000
counts.bl=0
counts.wl=0
for (i in 1:N)
{
tmp.sample=rbinom(n,size,p)
x=sum(tmp.sample)
if (blakerci(x,n,level,tolerance=1e-05)[1]<p && blakerci(x,n,level,tolerance=1e-05)[2]>p) counts.bl=counts.bl+1
if (waldci(x,n,level)[1]<p && waldci(x,n,level)[2]>p) counts.wl=counts.wl+1
}
list.counts.bl[j]=counts.bl
list.counts.wl[j]=counts.wl
}
list.counts.bl/N
list.counts.wl/N
You can see the difference at extremes!
tt=matrix(data=c(list.counts.bl,list.counts.wl),nrow=2,ncol=4,byrow=TRUE)/N
colnames(tt)=prob
rownames(tt)=list("Blaker","Wald")
> round(tt,3)
0.05 0.1 0.9 0.97
Blaker 0.632 0.391 0.384 0.762
Wald 0.368 0.605 0.605 0.238
The blacerci function is adapted form A. Agresti’s site [link].
Look it up…
Lawrence D. Brown, T. Tony Cai and Anirban DasGupta, “Interval Estimation for a Binomial Proportion”, Statistical Science 2001, Vol. 16, No. 2, pp. 101–133 [pdf]
Laura A. Thompson, An R (and S-PLUS) Manual to Accompany Agresti’s Categorical Data Analysis, 2009 [pdf | R code]
R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series,ecdf, trading) and more...

Zero Inflated Models and Generalized Linear Mixed Models with R.
Zuur, Saveliev, Ieno (2012).