Get at least 12 observations before making a confidence interval?

Posted on April 14, 2010 by dan in R bloggers | 0 Comments

[This article was first published on Decision Science News » R, and kindly contributed to R-bloggers].


GET CONFIDENT ABOUT YOUR INTERVALS

Decision Science News is happy with its purchase of Statistical Rules of Thumb by Gerald van Belle many years ago. It’s full of examples in which math can surprise.

The first example in the book is titled “use at least 12 observations in constructing a confidence interval”. When people first hear this they think: nonsense, there’s nothing magic about the number twelve. Then they recall that the width of a confidence interval shrinks with the square root of the sample size, but that alone doesn’t single out twelve either. Thinking harder, one realizes that the half-width of a two-sided confidence interval for the mean of a sample of size n, measured in units of the sample standard deviation, is t(n-1, 1-alpha/2)/sqrt(n), where t(n-1, 1-alpha/2) is the 1-alpha/2 quantile of the t distribution with n-1 degrees of freedom.

Plotting this for 90% and 95% CIs, one sees that the first intuition was right: there is nothing magic about 12, but the plot sure does seem to stop dropping quickly somewhere around there. Maybe 15 is a safer number. To make it easier to see, here are the points on the graph from n = 15 upward.
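For readers who prefer numbers to curves, here is a minimal sketch that tabulates the same half-width formula at a handful of sample sizes. The helper name half_width and the particular values of n are our own choices for illustration, not something from the book. The table also splits the half-width into its two factors, the t quantile and 1/sqrt(n), to show that the early steep drop comes largely from the t quantile.

```r
# Half-width of a two-sided t interval for the mean,
# in units of the sample standard deviation.
# `half_width` is a helper introduced for this sketch (an assumption).
half_width <- function(n, conf = 0.95) {
  alpha <- 1 - conf
  qt(1 - alpha / 2, df = n - 1) / sqrt(n)
}

n <- c(3, 6, 12, 15, 30)  # illustrative sample sizes

tab <- data.frame(
  n          = n,
  t_quantile = qt(0.975, df = n - 1),  # falls fast at first, then approaches qnorm(0.975)
  inv_sqrt_n = 1 / sqrt(n),            # falls steadily
  hw95       = half_width(n, 0.95),
  hw90       = half_width(n, 0.90)
)

print(round(tab, 3))
```

Multiplying a row's hw95 by a sample standard deviation gives the plus-or-minus margin around the sample mean for that n.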

We love heuristics for statistics, but do not promote following rules of thumb without reflection. We do promote playing with such rules of thumb as a way to become aware of the tradeoffs one makes in designing experiments. To encourage such play, we post the R code behind the above graphs here.

R CODE
(Don’t know R yet? Learn by watching: R Video Tutorial 1, R Video Tutorial 2)

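# Sample sizes from 3 to 30 in steps of 0.1 (non-integer n only smooths the curve)
n <- seq(3, 30, 0.1)

# Half-width of the CI, in units of the sample standard deviation
alpha <- 0.1
y90 <- qt(1 - alpha/2, n - 1) / sqrt(n)
alpha <- 0.05
y95 <- qt(1 - alpha/2, n - 1) / sqrt(n)

# Index of the grid point nearest n = 15 (avoids an exact floating-point comparison)
i15 <- which.min(abs(n - 15))

# First plot: full range of n
plot(n, y90, type="l", xlim=c(0,30), ylim=c(0,3), ylab="Half-Width Confidence Interval Size", xlab="Sample Size")
lines(n, y95, type="l")
text(15, y95[i15] + .15, labels="95%")
text(15, y90[i15] - .15, labels="90%")

# Second plot: only the points with n >= 15
a <- min(which(n >= 15))
b <- max(which(n >= 15))
plot(n[a:b], y90[a:b], type="l", xlim=c(0,30), ylim=c(0,3), ylab="Half-Width Confidence Interval Size", xlab="Sample Size")
lines(n[a:b], y95[a:b], type="l")
text(15, y95[i15] + .15, labels="95%")
text(15, y90[i15] - .15, labels="90%")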
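
One way to play with the rule, sketched here under our own assumptions, is to ask how much the half-width shrinks when a single observation is added, and where that gain falls below some cutoff. The helper half_width and the 5% cutoff are hypothetical choices made for this example. The cutoff is arbitrary, and the answer moves when it changes, which is rather the point.

```r
# Same half-width formula as in the plots; `half_width` is a helper
# introduced for this sketch (an assumption, not from the book).
half_width <- function(n, conf = 0.95) {
  alpha <- 1 - conf
  qt(1 - alpha / 2, df = n - 1) / sqrt(n)
}

n  <- 3:30
hw <- half_width(n)

# Relative shrinkage in half-width when going from n to n + 1 observations.
# gain[i] belongs to the step n[i] -> n[i] + 1.
gain <- -diff(hw) / hw[-length(hw)]

# Hypothetical cutoff: one more observation buys less than a 5% narrower interval.
threshold <- 0.05
n_enough  <- n[which(gain < threshold)[1]]

print(data.frame(n = n[-length(n)], gain = round(gain, 3)))
print(n_enough)
```

Rerunning with threshold set to 0.03 or 0.10, or with conf = 0.90, shows how sensitive any such "magic number" is to the question being asked.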

Update: After Arjan’s comment, I tried to figure out if van Belle is Dutch. I didn’t figure that out, but I did learn that he keeps a lot of these tips on his site. There’s even one on the 12-observation rule, with some information added by others, including a figure.
