(This article was first published on **Kevin Davenport » R**, and kindly contributed to R-bloggers)

Data can take the form of counts:

- Compliments or complaints received
- Items returned
- Number of E. coli cases

Data can also be expressed in rates:

- Percentage of web traffic from a given user permission type
- Percentage of businesses in a region passing a safety audit

A random variable X has the Poisson distribution with parameter lambda (λ > 0) if

$$P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \dots$$

This is the distribution of the number of events that occur during a fixed time interval if we expect lambda occurrences on average and events occur independently at a constant rate.
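As a quick check on the definition, the sketch below evaluates the Poisson probability mass function by hand and compares it with R's built-in `dpois`. The helper name `poisson_pmf` and the example values are illustrative assumptions, not part of the original post.

```r
# Poisson pmf written out from the standard formula:
#   P(X = k) = lambda^k * exp(-lambda) / k!
# (poisson_pmf is a hypothetical helper name used only for illustration)
poisson_pmf <- function(k, lambda) {
  lambda^k * exp(-lambda) / factorial(k)
}

k <- 0:10
lambda <- 3  # illustrative average rate, an assumption

# The hand-written formula should agree with the built-in function
all.equal(poisson_pmf(k, lambda), dpois(k, lambda))

# The mean of a Poisson variable equals lambda;
# summing far into the tail approximates it closely
sum(0:100 * dpois(0:100, lambda))
```

In practice `dpois` is the safer choice, since the hand-written version can overflow for large `k`.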

### Example

Let’s say a business receives 22 complaints a month on average. What is the probability that 30 or more complaints are received in a given month?

In R, the probability of receiving 29 or fewer complaints in a particular month is computed with the *ppois* function:

    > ppois(29, lambda=22)
    [1] 0.9397826

The probability of receiving 30 or more complaints in a month is the complement of this value: it sits in the upper tail of the distribution, P(X ≥ 30) = 1 − P(X ≤ 29).

    > ppois(29, lambda=22, lower.tail=FALSE)
    [1] 0.06021738

In plain English, if complaints follow a Poisson distribution with a mean of 22 per month, the probability of receiving 30 or more complaints in a given month is about 6%.
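The two tail probabilities are complements of each other. A minimal sketch confirming that, and recovering the same upper tail by summing individual probabilities with `dpois`. The cutoff of 200 is an arbitrary assumption, chosen so that the truncated remainder is negligible.

```r
lambda <- 22

p_lower <- ppois(29, lambda = lambda)                      # P(X <= 29)
p_upper <- ppois(29, lambda = lambda, lower.tail = FALSE)  # P(X >= 30)

# The two tails partition all outcomes, so they sum to 1
p_lower + p_upper

# Same upper tail, obtained by summing the pmf from 30 up to a large cutoff
sum(dpois(30:200, lambda = lambda))
```

Note that with `lower.tail = FALSE`, `ppois(q, ...)` returns P(X > q), which is why the argument is 29 rather than 30.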

Understanding these probability distributions allows you to explain some of the variance in your observations. In other words, it can help you judge whether an observation is plausible given the distribution or whether a genuine exogenous variable is at play.
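One way to put this into practice is to compute the upper-tail probability for each observed count and compare it with a cutoff. The sketch below assumes a hypothetical helper `tail_prob`, made-up observed counts, and an arbitrary 5% cutoff; none of these come from the original post.

```r
# P(X >= observed) for a Poisson variable with mean lambda.
# (tail_prob is a hypothetical helper; with lower.tail = FALSE,
#  ppois(q, ...) returns P(X > q), hence the "observed - 1")
tail_prob <- function(observed, lambda) {
  ppois(observed - 1, lambda = lambda, lower.tail = FALSE)
}

lambda   <- 22             # long-run monthly average from the example
observed <- c(25, 30, 40)  # made-up monthly counts for illustration
cutoff   <- 0.05           # arbitrary threshold, an assumption

p <- tail_prob(observed, lambda)

# Flag counts that would be rare if the rate were still 22 per month
data.frame(observed, p, unusual = p < cutoff)
```

A small tail probability does not prove that something external changed; it only says the count would be rare under the assumed rate, which is a prompt to investigate.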
