
Since I just didn’t get enough this morning, I spent some more time fooling around with estimating pi. Because the method boils down to counting the number of random x,y pairs that land inside a quarter circle and computing a sample average over more and more iterations, I wondered how sensitive my results were to the sample size used within each iteration. My original code plotted a thousand points per iteration and ran 5000 iterations, probably more than necessary. What if I only plotted 10 points each time? Or 100? How much faster or slower would my estimate converge on the answer for larger sample sizes? So I took a look.
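Roughly, each iteration amounts to something like the sketch below (this is just an illustration of the idea, not my actual code; the function name and seed are made up):

```r
# Sketch of a single iteration: draw n uniform points in the unit square and
# use the fraction that lands inside the quarter circle to estimate pi.
estimate_pi_once <- function(n) {
  x <- runif(n)
  y <- runif(n)
  4 * mean(x^2 + y^2 <= 1)  # P(inside quarter circle) = pi/4, so scale by 4
}

set.seed(314)           # any seed, just for reproducibility
estimate_pi_once(1000)  # one batch of 1000 points
```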

[Figure: pi estimates by iteration for sample sizes of 10, 100, and 1000 (from "pi day")]

For fewer than 500 iterations, the lower sample sizes (10 and 100) jump around quite a bit before settling roughly on the answer. A sample size of 1000 gets within 0.001 very rapidly. I also looked at the progression of each estimate: I subtracted pi from each iteration’s estimate and looked at the difference between period $t$ and $t-1$.

[Figure: period-to-period change in each estimate (from "pi day")]

Just as in the picture of the overall estimates, the estimate built with a sample size of 10 is much less stable even after hundreds of iterations (where each successive sample has a smaller and smaller impact on the sample average). Code is below. You don’t actually need `ggplot2` to compute any of this, just to graph things (you could pretty easily graph it in base R as well).
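Here is a rough sketch along those lines (a reconstruction of the approach rather than the exact listing; the data frame columns, seed, and plot details are illustrative):

```r
# Sketch of the full experiment: for each per-iteration sample size, run 5000
# iterations, track the running sample average of the pi estimates, and record
# the change in the error (estimate - pi) from one iteration to the next.
library(ggplot2)

# Same single-batch estimator as in the sketch above
estimate_pi_once <- function(n) {
  x <- runif(n)
  y <- runif(n)
  4 * mean(x^2 + y^2 <= 1)
}

run_experiment <- function(n, iterations = 5000) {
  estimates <- replicate(iterations, estimate_pi_once(n))
  running   <- cumsum(estimates) / seq_along(estimates)  # running average after each iteration
  data.frame(iteration   = seq_len(iterations),
             sample_size = factor(n),
             estimate    = running,
             change      = c(NA, diff(running - pi)))    # period t minus period t-1
}

set.seed(314)
results <- do.call(rbind, lapply(c(10, 100, 1000), run_experiment))

# Running estimate by per-iteration sample size
ggplot(results, aes(iteration, estimate, colour = sample_size)) +
  geom_line() +
  geom_hline(yintercept = pi, linetype = "dashed")

# Period-to-period change in the error
ggplot(results, aes(iteration, change, colour = sample_size)) +
  geom_line()
```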