Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

by Joseph Rickert

What will you be doing at 26 minutes and 53 seconds past 9 this coming Saturday morning? I will probably be running simulations. I have become obsessed with an astounding result from number theory and have been trying to devise Monte Carlo simulations to get at it. The result, well known to number theorists says: choose two integers at random; the probability that they will be coprime is 6/π2 ! Here, π materializes out of thin air. Who could have possibly guessed this? – well, Leonhard Euler, apparently, and this sort of magic seems to be quite common in number theory.

More formally, the theorem Euler proved goes something like this: Let PN be the probability that two randomly chosen integers in {1, 2, … N} are coprime. Then, as N goes to infinity PN goes to 6/π2 . Well, this seems to be a little different. You don't actually have to sample from an infinite set. So, I asked myself, would a person who is not familiar with this result, but who is allowed to do some Monte Carlo simulations, have a reasonable chance of guessing the answer? I know that going to infinity would be quite a trip, but I imagined that I could take a few steps and see something interesting. How about this: choose some boundary, N, and draw lots of pairs of random numbers. Count the number of coprime pairs M. Then sqrt( 6 / (M/N)) will give an estimate of π. As N gets bigger and bigger you should see the digits of the average of the estimates marching closer and closer to π – 3.1, 3.14, 3.141 etc.

Before you go off and try this, let me warn you: there is no parade of digits. For a modest 100,000 draws with an N around 10,000,000 you should get an estimate of π close to 3.14. Then, even with letting N get up around 1e13 with a million draws, you won't do much better. The following code (compliments of a colleague who introduced me to mapply) performs 100 simulations, each with 100,000 draws as the range varies from +/- 1,000,000 to +/- 1e13. (The code runs pretty quickly on my laptop.)

# Monte Carlo estimate of pi
library( numbers )
library(ggplot2)

set.seed(123)

bigRange <- seq(1e6,1e13,by=1e11)
M <- length(bigRange)

draws <- 1e5

for(i in 1:M){
maxRange <- bigRange[i]
print(bigRange[i])
min <- -maxRange
max <- maxRange

r1 <- round( runif( n = draws, min = min, max = max ) )
r2 <- round( runif( n = draws, min = min, max = max ) )

system.time( coprimeTests <- mapply( coprime, r1, r2 ) )

prob[i] <- sum( coprimeTests ) / draws
print(prob[i])
piEst[i] <- sqrt( 6 / prob[i] )
print(piEst[i])
}

piRes2 <- data.frame(bigRange,prob,piEst)
p2 <- ggplot(piRes2,aes(bigRange,piEst))
p2 + geom_line() +
geom_point() +
xlab("Half Range for Random Draws")+
ylab("Estimate of Pi") +
ggtitle("Expanding Range Simulation") +
geom_smooth(method = "lm", se=FALSE, color="black")

Here is the "time series" plot of the results.

The mean is 3.142959. The slight upward trend is most likely a random number induced illusion.

lm(formula = piRes$piEst ~ piRes$index)

Coefficients:
(Intercept)  piRes\$index
3.142e+00    1.225e-05  

The problem as, I have framed it, is probably beyond the reach of a naive Monte Carlo approach. Nevertheless, on Saturday, when I have some simulation, time I will try running 100,000,000 draws. This should get me another digit of π since the accuracy of the mean increases as sqrt(N)

The situation, however, is not as dismal as I have been making it out. Don't imagine that just because the problem is opaque to a brute force Monte Carlo effort that it is without the possibility of computational illumination. Euler's proof, mentioned above, turns on recognizing that the expression for the probability of two randomly chosen integers being coprime may be expressed as the mathematical function z(2). The poof yields π2 = 6z(2), The wizards of R have reduced this calculation to the trivial. The Rmpfr package, which allows the use of arbitrarily precise numbers instead of R's double precision numbers, includes the function zeta(x)! So, here, splendidly arrayed, is π.

library(Rmpfr)
sqrt(6*zeta(2))

> sqrt(6*zeta(2))
1 'mpfr' number of precision  128   bits
[1] 3.141592653589793238462643383279502884195

Happy Pi Day