CLT Standard Normal Generator

Posted on April 2, 2010 by Lee in R bloggers | 0 Comments

[This article was first published on stotastic » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

I’ve found this standard normal random number generator in a number of places, one of which being from one of Paul Wilmott’s books. The idea is that we can use the Central Limit Theorem (CLT) to easily generate values distributed according to a standard normal distribution by using the sum of 12 uniform random variables and subtracting 6. In Excel, the implementation looks like this:
=RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()-6
By doing a simple cut-and-paste, we can stick this formula in an Excel cell and go on with our merry way assuming we have generated values from a standard normal distribution. But what is really going on here, and how good does this generator work?

The Idea

The idea behind this standard normal generator is simple and is based on the Central Limit Theorem. In a nut shell, if we define the following random variables.

$U_i : iid : sim U(0,1) \ S=displaystylesum_{i=1}^{m}{U_i}$

Then we can approximate the distribution of $S$ using the CLT.

$S : text{approximately} : sim N left(m cdot mathbb{E}[U_i], m cdot Var[U_i] right)$

Since we know that the mean and variance of $U_i$ is $frac{1}{2}$ and $12$ respectivly.

$S : text{approximately} : sim N left(frac{m}{2}, frac{m}{12}right)$

Finally, if we ‘standardize’ $S$ by subtracting the mean and dividing by its standard deviation we get a standard normal random variable.

$Z = frac{S-frac{m}{2}}{sqrt{frac{m}{12}}} \ text{where} : Z : text{approximately} : sim N left(0,1 right)$

So we have essentially taken the sum of uniform random variables and used them to approximate a standard normal random variable by applying the CLT. The important thing to keep in mind is that the more uniforms we use to do this, the better the approximation. You may be asking yourself why this looks nothing like the simple Excel formula I showed earlier. Well, something special happens when we use 12 uniforms; things start to simplify!

$Z = frac{S-frac{12}{2}}{sqrt{frac{12}{12}}} = S-6$

Voila! We have an easy to implement standard normal random number generator. We should still be a little concerned about the CLT approximation and we should probably ask ourselves if using only 12 uniform random variables is ‘good enough’.

Testing

Now to the fun part! I’ve written the following function which implements the above method in R.

## function that uses the CLT to generate standard normals from uniform
## n is the number of standard normal random numbers to generate
## m is the number of uniforms to generate for using the CLT
 
CLT_normal <-
function(n, m){
  z <- rep(0,n)
  for(i in 1:n){
    u <- runif(m,0,1)
    s <- sum(u)
    z[i] <- (s-m/2) / (m/12)
  }
  return(z)
}

Using the generated values, we can perform a visual inspection using QQ normal plots for various values of m. I also generated results using m as 30 since 30 is often used as a rule-of-thumb for applying the CLT.

## test the normal generator using various values of m
par(mfrow=c(2,2)) 
m <- 1
x <- CLT_normal(100000, m)
qqnorm(x, main=paste("QQ normal m=", m))
qqline(x, col="red") 
 
m <- 6
x <- CLT_normal(100000, m)
qqnorm(x, main=paste("QQ normal m=", m))
qqline(x, col="red") 
 
m <- 12
x <- CLT_normal(100000, m)
qqnorm(x, main=paste("QQ normal m=", m))
qqline(x, col="red") 
 
m <- 30
x <- CLT_normal(100000, m)
qqnorm(x, main=paste("QQ normal m=", m))
qqline(x, col="red")

Based on this output, the generated values have lighter tails than a normal distribution, but using 12 uniforms seems to be ok if one was performing a ‘quick and dirty’ analysis in Excel. 30 uniforms obviously performs better, but things start to slow down considerably and it would probably be better to write a function using the Box-Muller method if better accuracy in the tails was needed.

To leave a comment for the author, please follow the link and comment on their blog: stotastic » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

R-bloggers

R news and tutorials contributed by hundreds of R bloggers

CLT Standard Normal Generator

The Idea

Testing

Related

The Idea

Testing

Related

Never miss an update! Subscribe to R-bloggers to receive e-mails with the latest R posts. (You will not see this message again.)

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)