(This article was first published on **Econometrics by Simulation**, and kindly contributed to R-bloggers)

# The Poisson distribution models counts of events that occur

# independently and are equally likely. It takes a single

# parameter, mu, which equals both the mean (the expected number

# of events) and the variance.

# Like any distribution, the Poisson is interesting because

# it approximates a real-world phenomenon.

# Imagine you are trying to model the mail delivered on Wednesdays.

# On average you receive 9 pieces of mail. If mail delivery is

# well modeled by a Poisson distribution, then the standard

# deviation of the number of pieces should be sqrt(9) = 3,

# meaning that on most Wednesdays you should receive between 3 and

# 15 pieces of mail (the mean plus or minus two standard deviations).
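As a rough check on these figures, the sketch below uses base R's `ppois` to compute the standard deviation and the probability that a Poisson count with mean 9 falls between 3 and 15 inclusive. It assumes the mail count really is Poisson; the names `mu`, `sd_mail`, and `p_3_to_15` are illustrative and not part of the original post.

```r
mu <- 9                      # assumed average number of letters per Wednesday

sd_mail <- sqrt(mu)          # Poisson variance equals the mean, so sd = sqrt(mu)

# P(3 <= X <= 15) = P(X <= 15) - P(X <= 2) for X ~ Poisson(mu)
p_3_to_15 <- ppois(15, lambda = mu) - ppois(2, lambda = mu)

sd_mail
p_3_to_15
```

If the second value is close to 1, the "between 3 and 15" range does cover most Wednesdays under the Poisson assumption.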

# What underlying physical phenomenon must exist for this to be

# possible?

# To aid this discussion, we will think of the Poisson

# distribution as the limiting distribution of the sum of

# outcomes from a number of independent binary draws:

DrawsApprox <- function(mu, N) sum(rbinom(N,1,mu/N))

# The idea is that if we specify an expected number of events mu

# and a number of draws N (N >= mu, so that mu/N is a valid

# probability), we can approximate a single Poisson draw by

# summing the binary outcomes.

DrawsApprox(9,9)

# In this case each draw succeeds with probability 9/9 = 1, so

# the sum is always 9 and the variance is 0.

# Here there are 9 letters which are always

# sent out every Wednesday.

# More interestingly:

DrawsApprox(9,18)

# In this case there are 18 letters that may be sent out.

# Each one is sent with probability 9/18 = 50%.

# We want to know the mean and variance of the number sent.

# Let us write a simple function to estimate them.

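# Call `fun` (a function or its name) `draw` times with the arguments

# in ... and return the outcomes with their sample mean and variance.

evar <- function(fun, draw=100, outc=NULL, ...) {

  f <- match.fun(fun)

  outc <- c(outc, vapply(seq_len(draw), function(i) f(...), numeric(1)))

  list(outc=outc, mean=mean(outc), var=var(outc))

}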

evar("DrawsApprox", draw=10000, N=18, mu=9)

# I get a mean very close to 9, as we should hope,

# but interestingly the variance is less than 5.

# This is well below the Poisson variance of 9.
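The simulated variance can be checked against theory. The sum of N independent binary draws with success probability mu/N is Binomial(N, mu/N), whose variance is N * p * (1 - p) = mu * (1 - mu/N). The helper `binom_var` below is a hypothetical name introduced for illustration; it is not from the original post.

```r
# Theoretical variance of the sum of N independent Bernoulli(mu/N) draws.
# `binom_var` is an illustrative helper, not part of the original code.
binom_var <- function(mu, N) {
  p <- mu / N
  N * p * (1 - p)            # algebraically equal to mu * (1 - mu / N)
}

binom_var(9, 18)             # 18 * 0.5 * 0.5 = 4.5
```

Because the factor (1 - mu/N) is below 1 for any finite N, the variance of the binary-draw sum always sits below the Poisson value of mu.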

# Let's see what happens if we double the number of

# potential letters going out, which halves the

# probability of any particular letter being sent.

evar("DrawsApprox", draw=10000, N=36, mu=9)

# Now the variance is about 6.7

evar("DrawsApprox", draw=10000, N=72, mu=9)

# Now the variance is about 7.7

evar("DrawsApprox", draw=10000, N=144, mu=9)

# Now about 8.6

evar("DrawsApprox", draw=10000, N=288, mu=9)

# And now about 8.65

# We can see that as the number of letters gets very large,

# the mean and variance of the number of letters sent approach

# the same number, 9. I will never be able to choose a

# number of letters large enough that the variance exactly

# equals the mean.
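One way to see the convergence without simulation noise is to compare the exact binomial probabilities with the Poisson probabilities directly. The sketch below, using base R's `dbinom` and `dpois`, reports the largest absolute gap between the two probability mass functions for several values of N. The name `max_pmf_gap` and the particular N values are assumptions made for illustration.

```r
# Largest absolute difference between the Binomial(N, mu/N) and Poisson(mu)
# probability mass functions, evaluated over the counts 0..N.
# `max_pmf_gap` is an illustrative helper, not part of the original code.
max_pmf_gap <- function(mu, N) {
  k <- 0:N
  max(abs(dbinom(k, size = N, prob = mu / N) - dpois(k, lambda = mu)))
}

Ns <- c(18, 36, 72, 144, 288, 1000)
gaps <- sapply(Ns, function(N) max_pmf_gap(9, N))

data.frame(N = Ns, max_gap = gaps)
```

The gap shrinks as N grows but stays positive for every finite N, which matches the observation that the variance only approaches the mean.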

# However, the didactic point about how the distribution is

# structured and when it may be appropriate to use should be

# clear. The Poisson is a good fit when each individual event

# is equally likely and unlikely on its own, yet the number of

# possible events is large (in principle I could receive 100 pieces

# of mail in a single day, though it would be very unlikely).

bigdraw <- evar("DrawsApprox", draw=10000, N=1000, mu=9)

summary(bigdraw$outc)
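Since the sum of N independent binary draws is itself a single binomial draw, the same experiment can be run in one vectorized call to `rbinom` and set next to draws from `rpois`. This is a sketch of an equivalent shortcut, not the method used in the post; the seed and the names `binom_draws` and `pois_draws` are arbitrary choices for illustration.

```r
set.seed(1)                  # arbitrary seed, only for reproducibility

mu <- 9
N <- 1000
draws <- 10000

# One Binomial(N, mu/N) draw has the same distribution as
# sum(rbinom(N, 1, mu/N)), so this replaces the loop over draws.
binom_draws <- rbinom(draws, size = N, prob = mu / N)

# Direct Poisson draws with the same mean, for comparison.
pois_draws <- rpois(draws, lambda = mu)

rbind(
  binomial = c(mean = mean(binom_draws), var = var(binom_draws)),
  poisson  = c(mean = mean(pois_draws),  var = var(pois_draws))
)
```

With N this large, the two rows should be hard to tell apart beyond sampling noise.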
