Calculating confidence intervals for proportions

April 9, 2014
[This article was first published on Insights of a PhD student » R, and kindly contributed to R-bloggers].

Here are a couple of functions for calculating confidence intervals for proportions.

Firstly, I give you the Simple Asymptotic Method:

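simpasym <- function(n, p, z=1.96, cc=TRUE){
  # n: sample size, p: observed proportion, z: z value, cc: continuity correction
  out <- list()
  if(cc){
    # widen the interval by 1/(2n) on each side
    out$lb <- p - z*sqrt((p*(1-p))/n) - 0.5/n
    out$ub <- p + z*sqrt((p*(1-p))/n) + 0.5/n
  } else {
    out$lb <- p - z*sqrt((p*(1-p))/n)
    out$ub <- p + z*sqrt((p*(1-p))/n)
  }
  out
}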

which can be used like this:

simpasym(n=30, p=0.3, z=1.96, cc=TRUE)
$lb
[1] 0.119348

$ub
[1] 0.480652

 

Here n is the sample size, p is the observed proportion, z is the z value for the desired confidence level (e.g. 1.96 gives the 95% CI) and cc controls whether a continuity correction is applied. The function returns a list holding the lower bound ($lb) and the upper bound ($ub).
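If you start from raw counts or want a level other than 95%, p and z have to be worked out first. The snippet below is a minimal sketch of that step: the counts (9 out of 30) and the 99% level are made-up values for illustration, and the interval is written out inline (without the continuity correction) so the snippet runs on its own.

```r
# Hypothetical data: 9 successes observed in 30 trials
x <- 9
n <- 30
p <- x / n                              # observed proportion

# z value for a two-sided interval at an arbitrary confidence level
conf_level <- 0.99
z <- qnorm(1 - (1 - conf_level) / 2)    # about 2.576 for 99%

# Simple asymptotic interval without continuity correction,
# i.e. the same arithmetic as simpasym(..., cc=FALSE)
half_width <- z * sqrt(p * (1 - p) / n)
ci <- c(lb = p - half_width, ub = p + half_width)
print(round(ci, 4))
```

The same p and z can be passed straight to the functions in this post, e.g. simpasym(n=n, p=p, z=z).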

The second method is the Score method and is defined as follows:

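scoreint <- function(n, p, z=1.96, cc=TRUE){
  # n: sample size, p: observed proportion, z: z value, cc: continuity correction
  out <- list()
  q <- 1-p
  zsq <- z^2
  denom <- (2*(n+zsq))  # denominator shared by both limits
  if(cc){
    # score limits with continuity correction
    numl <- (2*n*p)+zsq-1-(z*sqrt(zsq-2-(1/n)+4*p*((n*q)+1)))
    numu <- (2*n*p)+zsq+1+(z*sqrt(zsq+2-(1/n)+4*p*((n*q)-1)))
    out$lb <- numl/denom
    out$ub <- numu/denom
    # at the extremes the corresponding limit is fixed at the boundary
    if(p==1) out$ub <- 1
    if(p==0) out$lb <- 0
  } else {
    # score limits without continuity correction
    out$lb <- ((2*n*p)+zsq-(z*sqrt(zsq+(4*n*p*q))))/denom
    out$ub <- ((2*n*p)+zsq+(z*sqrt(zsq+(4*n*p*q))))/denom
  }
  out
}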

It is used in the same manner as simpasym:

scoreint(n=30, p=0.3, z=1.96, cc=TRUE)
$lb
[1] 0.1541262

$ub
[1] 0.4955824
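As a rough sanity check, the continuity-corrected score limits can be compared with base R's prop.test(), which (as far as I can tell from its help page) reports a score-type interval for a single proportion. The sketch below assumes the same made-up data as above (9 successes in 30 trials) and repeats the formula inline so it runs on its own. It uses qnorm(0.975) rather than the rounded 1.96, so the last digits differ slightly from the output above.

```r
# Hypothetical data matching the example above: 9 successes in 30 trials
x <- 9
n <- 30
p <- x / n
q <- 1 - p
z <- qnorm(0.975)   # unrounded 95% z value (about 1.96)

# Continuity-corrected score limits, same formula as in scoreint()
denom <- 2 * (n + z^2)
lb <- (2*n*p + z^2 - 1 - z * sqrt(z^2 - 2 - 1/n + 4*p*(n*q + 1))) / denom
ub <- (2*n*p + z^2 + 1 + z * sqrt(z^2 + 2 - 1/n + 4*p*(n*q - 1))) / denom

# Built-in interval; correct = TRUE asks for the continuity correction
builtin <- as.numeric(prop.test(x, n, conf.level = 0.95, correct = TRUE)$conf.int)

print(c(lb = lb, ub = ub))
print(builtin)

# Do the two agree up to floating point error?
print(isTRUE(all.equal(c(lb, ub), builtin, tolerance = 1e-6)))
```

Note that prop.test() takes the count of successes x rather than the proportion p, and works out the z value itself from conf.level.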

These formulae (and a couple of others) are discussed in Newcombe (1998), who suggests that the score method should be more frequently available in statistical software packages.

Hope that helps someone!

Reference:
Newcombe, R. G. (1998) Two-sided confidence intervals for the single proportion: comparison of seven methods. Statist. Med., 17: 857-872. doi: 10.1002/(SICI)1097-0258(19980430)17:8<857::AID-SIM777>3.0.C

 
