Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

SAMPLE UNTIL THE FIRST SUCCESS, PUT A CONFIDENCE INTERVAL ON P(SUCCESS)

We wrote last week of Charles Darwin’s love of the “rule of three” which, according to Stigler “is simply the mathematical proposition that if a/b = c/d, then any three of a, b, c, and d suffice to determine the fourth.”

We were surprised to learn this is called the rule of three, as we had only heard of the rule of three in comedy. We were even more surprised when a reader wrote in telling us about a rule of three in statistics. According to Wikipedia: the rule of three states that if a certain event did not occur in a sample with n subjects … the interval from 0 to 3/n is a 95% confidence interval for the rate of occurrences in the population.

It’s a heuristic. We love heuristics!

We decided to give it a test run in a little simulation. You can imagine that we’re testing for defects in products coming off of a production line. Here’s how the simulation works:

• We test everything that comes off the line, one by one, until we come across a defect (the n+1st test).
• We then make a confidence interval bounded by 0 and 3/n. and make note of it. 95% of such intervals should contain the true underlying probability of defect.
• Because it’s a simulation and we know the true underlying probability of defect, we make note of whether the interval contains the true probability of defect.
• We repeat this 10,000 times at each of the following underlying probabilities: .001, .002, and .003.

The simulation thus far assumes the testers have the patience to keep testing until they find a defect. In reality, however, they might get bored and stop testing before the first defect is found. To address this, we also simulated another condition in which the testing is stopped at n/2, halfway before the first defect is found. Of course, in reality, one has no way of knowing when one is half the way to the first defective test, but our simulation can at least let us know what kind of confidence interval one will get if one does indeed stop halfway.

Here’s the result on bracketing, that is, the probability that the confidence intervals contain the correct value:

Across all three levels of true underlying probability, when stopping just before the first defect, we get 95% confidence intervals. However, when we stop half way to the first defect, we get closer to 100% intervals (99.73%, 99.80%, and 99.86%, respectively).

So we know that the upper bounds of these confidence intervals fall above the true probability 95% to about 100% of the time, but where to they fall?

In the uppermost figure of this post, we see the locations of the upper bounds of the simulated confidence intervals when we stop immediately before the first defect. For convenience, we draw blue lines at the true underlying probabilities of .001 (top), .002 (middle), and .003 (bottom). When it’s a 95% confidence interval, about 95% of the upper bounds should fall to the right of the blue line, and 5% to the left. We cut the axis at .05 so you can actually see something, but realize it extends all the way to 1.0, with the heights of the bars trailing off.

For comparison, let’s look at the case in which we stop halfway to the first defect. As suggested by the bracketing probabilities, here we see almost all of the upper bounds exceed the true underlying probability. As our applied statistician reader wrote us about the rule of three, “the weakness is that in some situations it’s a very broad confidence interval.”

REFERENCE
A Look at the Rule of Three
B. D. Jovanovic and P. S. Levy
The American Statistician
Vol. 51, No. 2 (May, 1997), pp. 137-139
DOI: 10.2307/2685405
Stable URL: http://www.jstor.org/stable/2685405

R CODE FOR THOSE WHO WANT TO PLAY ALONG AT HOME
 require(scales) library(ggplot2) library(dplyr) levels = c(.001,.002,.003) ITER = 10000 res_list = vector('list', ITER*length(levels)) index=1 for(true_p in levels) { for (i in 1:ITER) { onesam = sample( x = c(1, 0), size = 10*1/true_p, prob = c(true_p, 1 - true_p), replace = TRUE ) cut = which.max(onesam) - 1 upper_bound_halfway = min(3 / (cut/2),1) upper_bound_lastpossible = min(3/cut,1) res_list[[index]] = data.frame( true_p = true_p, cut = cut, upper_bound_halfway = upper_bound_halfway, bracketed_halfway = true_p < upper_bound_halfway, upper_bound_lastpossible = upper_bound_lastpossible, bracketed_lastpossible = true_p < upper_bound_lastpossible ) index=index+1 } } df = do.call('rbind',res_list) rm(res_list) plot_data = rbind( df %>% group_by(true_p) %>% summarise(bracketing_probability = mean(bracketed_halfway),type="halfway"), df %>% group_by(true_p) %>% summarise(bracketing_probability = mean(bracketed_lastpossible),type="last possible") ) plot_data p=ggplot(plot_data,aes(x=true_p,y=bracketing_probability,group=type,fill=type)) + geom_bar(stat="identity",position="dodge") + coord_cartesian(ylim=c(.5,1)) + theme_bw() + theme(legend.position = "bottom", panel.grid.minor.x = element_blank()) + labs(x="True Probability",y="Bracketing Probability") p ggsave(plot=p,file="bracketing.png",width=4,height=4) plot_data2 = df %>% dplyr::select(-bracketed_halfway,-bracketed_lastpossible) %>% tidyr::gather(bound_type,upper_bound,c(upper_bound_halfway,upper_bound_lastpossible)) %>% arrange(bound_type,upper_bound) %>% mutate(bin = floor(upper_bound/.001)*.001) %>% group_by(bound_type,true_p,bin) %>% summarise(count = n()) %>% ungroup() p=ggplot(subset(plot_data2,bound_type=="upper_bound_lastpossible"),aes(x=bin+.0005,y=count)) + geom_bar(stat="identity",width = .0005) + geom_vline(aes(xintercept = true_p),color="blue") + coord_cartesian(xlim = c(0,.05)) + labs(x="Upper Bound",y="Count") + facet_grid(true_p ~ .) + theme_bw() + theme(legend.position = "none") p ggsave(plot=p,file="upper_bound_lastpossible.png",width=4,height=4) #repeat for upper_bound_halfway p=ggplot(subset(plot_data2,bound_type=="upper_bound_halfway"),aes(x=bin+.0005,y=count)) + geom_bar(stat="identity",width = .0005) + geom_vline(aes(xintercept = true_p),color="blue") + coord_cartesian(xlim = c(0,.05),ylim=c(0,1750)) + labs(x="Upper Bound",y="Count") + facet_grid(true_p ~ .) + theme_bw() + theme(legend.position = "none") p ggsave(plot=p,file="upper_bound_halfway.png",width=4,height=4) 

The post Another rule of three, this one in statistics appeared first on Decision Science News.