Since I just didn’t get enough this morning, I spent some more time fooling around with estimating pi. I was basically counting the number of random x,y pairs that land inside a quarter circle and computing a sample average over more and more iterations, so I wondered how sensitive my results were to the sample size within each iteration. My original code plotted 1000 points per iteration and ran 5000 iterations, probably more than necessary. What if I only plotted 10 points each time? Or 100? How much faster would my estimate converge on the answer with larger sample sizes? So I took a look:
|From pi day|
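For anyone who wants to poke at the comparison themselves, here is a minimal sketch of the setup described above. It is not the original code: the helper name `estimate_pi`, the seed, and the list names are all made up for illustration, and it assumes each iteration's estimate is 4 times the share of uniform random points that fall inside the quarter circle.

```r
# Hypothetical helper (not the original code): running pi estimate
# built from `iterations` batches of `n` random points each.
estimate_pi <- function(n, iterations) {
  # Each batch: 4 * fraction of n uniform (x, y) points inside the quarter circle
  batch <- replicate(iterations, 4 * mean(runif(n)^2 + runif(n)^2 <= 1))
  # Sample average of the batch estimates after 1, 2, ..., `iterations` batches
  cumsum(batch) / seq_along(batch)
}

set.seed(314)  # arbitrary seed, only so the run is repeatable
runs <- lapply(c(10, 100, 1000), estimate_pi, iterations = 5000)
names(runs) <- c("n10", "n100", "n1000")

# Final running estimate for each sample size
sapply(runs, function(r) r[length(r)])
```

Each element of `runs` is a vector of 5000 running estimates, so plotting one against `seq_along()` of itself gives a convergence curve for that sample size.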
With fewer than 500 iterations, the smaller sample sizes (10 and 100) jump around quite a bit before settling roughly on the answer. A sample size of 1000 gets within 0.001 very rapidly. I also looked at the progression of each estimate: I subtracted pi from each iteration’s estimate and looked at the difference between one period and the next.
|From pi day|
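The quantities behind a plot like this can be computed roughly as follows. This is a sketch under assumptions, not the original code: it uses a single run with 10 points per iteration, the variable names `err` and `step` are invented here, and it takes "difference" to mean the change in the running estimate from one iteration to the next.

```r
# Illustrative only: one running estimate with 10 points per iteration
set.seed(314)  # arbitrary seed
batch <- replicate(1000, 4 * mean(runif(10)^2 + runif(10)^2 <= 1))
running <- cumsum(batch) / seq_along(batch)

err  <- running - pi          # distance of each running estimate from pi
step <- c(NA, diff(running))  # change from the previous iteration (assumed meaning)

# The k-th batch moves the average by (batch[k] - running[k - 1]) / k,
# so the size of each step is bounded by 4 / k (small tolerance for rounding)
k <- 2:length(running)
all(abs(step[k]) <= 4 / k + 1e-12)
```

The bound in the last line is just the running-average update written out: both the new batch estimate and the previous average lie between 0 and 4, and their gap gets divided by the iteration count.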
Just like in the picture of the overall estimates, the estimate built with a sample size of 10 is much less stable, even after hundreds of iterations (where each successive sample has a smaller and smaller impact on the sample average). Code is below. You don’t actually need ggplot2 to compute any of this, just to graph things (you could pretty easily graph this in base R as well).