Since I just didn’t get enough this morning, I spent some more time fooling around with estimating pi. I was basically counting the number of random x,y pairs that land inside a quarter circle and computing a sample average over more and more iterations, so I wondered how sensitive my results were to the sample size within each iteration. My original code plotted a thousand points each iteration and ran 5000 iterations, probably more than necessary. What if I only plotted 10 points each time? 100? How much faster or slower would my estimate converge on the answer for larger sample sizes? So I took a look:
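Here is a minimal sketch of that experiment in R. It is not my original code: the function name `running_pi_estimate`, the seed, and the exact way the running average is built are assumptions made for illustration. Each iteration draws a batch of uniform points in the unit square, takes four times the share that lands inside the quarter circle, and the running estimate is the cumulative mean of those per-iteration values.

```r
# Running Monte Carlo estimate of pi (hypothetical helper, for illustration).
# Each iteration draws `n_points` uniform (x, y) pairs in the unit square and
# records 4 * (share of points inside the quarter circle of radius 1).
# The returned vector is the cumulative mean of those per-iteration values.
running_pi_estimate <- function(n_points, n_iter = 5000) {
  per_iter <- replicate(n_iter, {
    x <- runif(n_points)
    y <- runif(n_points)
    4 * mean(x^2 + y^2 <= 1)
  })
  cumsum(per_iter) / seq_len(n_iter)
}

set.seed(42)  # arbitrary seed, only so the run is reproducible
sample_sizes <- c(10, 100, 1000)

# One column of running estimates per sample size (5000 rows each)
estimates <- sapply(sample_sizes, running_pi_estimate)
colnames(estimates) <- paste0("n", sample_sizes)

# Estimate after the final iteration, one value per sample size
estimates[nrow(estimates), ]
```

Keeping the per-iteration sample size as a function argument makes it easy to compare 10, 100 and 1000 points side by side.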

For fewer than 500 iterations, the smaller sample sizes (10 and 100) jump around quite a bit before settling roughly on the answer. A sample size of 1000 gets within 0.001 very rapidly. I also looked at the progression of each estimate: I subtracted pi from each iteration’s estimate and looked at the difference between one period and the next.
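The error and its period-to-period change can be sketched in a few lines. The variable names (`err`, `step`), the seed, and the choice of 1000 iterations are assumptions for illustration, and the simulation is a compact stand-in rather than the code behind the plots.

```r
set.seed(1)      # arbitrary seed
n_points <- 10   # points drawn per iteration
n_iter <- 1000   # number of iterations (assumed for this sketch)

# Per-iteration estimates of pi and their running average
per_iter <- replicate(n_iter, 4 * mean(runif(n_points)^2 + runif(n_points)^2 <= 1))
running <- cumsum(per_iter) / seq_len(n_iter)

err <- running - pi   # signed error of the running estimate
step <- diff(err)     # change from one iteration to the next (length n_iter - 1)

# The k-th new sample moves a running mean by (new value - old mean) / k,
# and each value lies in [0, 4], so the k-th step is at most 4 / k in size.
max(abs(step[1:50]))                        # largest early step
max(abs(step[(n_iter - 50):(n_iter - 1)]))  # largest late step
```

Because each step is bounded by 4 divided by the iteration number, late steps are necessarily tiny no matter how noisy the individual samples are.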

Just like the picture of overall estimation, the estimate built with a sample size of 10 is much less stable even after hundreds of iterations (where each successive sample has a smaller and smaller impact on the sample average). Code is below. You don’t actually need `ggplot2` to compute any of this, just to graph things (you could pretty easily graph this in base R as well).
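For the base R route, something like the following would do. Treat it as a sketch under assumptions: `running_pi` and `plot_running_estimates` are hypothetical helper names, and the 500 iterations and the seed are arbitrary. It uses only `matplot`, `abline` and `legend` from the base graphics package.

```r
# Compact stand-in for the simulation: running estimate of pi for one sample size
running_pi <- function(n_points, n_iter) {
  hits <- replicate(n_iter, mean(runif(n_points)^2 + runif(n_points)^2 <= 1))
  4 * cumsum(hits) / seq_len(n_iter)
}

# Draw one line per column of `est`, with a dashed reference line at pi
plot_running_estimates <- function(est) {
  matplot(est, type = "l", lty = 1, col = 1:ncol(est),
          xlab = "Iteration", ylab = "Running estimate of pi")
  abline(h = pi, lty = 2)
  legend("topright", legend = colnames(est), col = 1:ncol(est), lty = 1)
}

set.seed(7)  # arbitrary seed
sizes <- c(10, 100, 1000)
est <- sapply(sizes, running_pi, n_iter = 500)
colnames(est) <- paste("sample size", sizes)

# Only open a plot window when run interactively
if (interactive()) plot_running_estimates(est)
```

`matplot` plots each column of a matrix as its own series, which saves reshaping the data into the long format that `ggplot2` expects.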


**Tags:** fun with simple math, ggplot2, Math is Hard, Monte Carlo, pi, R, R Stuff