# Generate Quasi-Poisson Distribution Variable

**Category: R | Huidong Tian**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Most of regression methods assume that the response variables follow some exponential distribution families, e.g. *Guassian*,
*Poisson*, *Gamma*, etc. However, this assumption was frequently violated in real world data by, for example, zero-inflated overdispersion
problem. A number of methods were developed to deal with such problem, and among them, Quasi-Poisson and Negative Binomial are the most
popular methods perhaps due to that major statistical softwares contain such functions.

Unlike Negative Binomial distribution, there is no function for generating Quasi-Poisson distributed random variable in R. In this blog, I will show you how to generate Quasi-Poisson distributed variable using Negative Binomial distribution.

Let variable follows Quasi-Poisson distribution, then the variance of should have a linear relationship with the mean of :

where, $\theta$ is called the disperision parameter, and for overdispersion variables , $\theta$ should greater than 1.

If variable follows Negative Binomial distribution, the variance of $Y_{nb}$ should have quadratic relationship with the mean of .

Random Negative Binomial variable can be generated in R using function **rnbinom**:

1 2 3 4 5 | > x <- rnbinom(n = 10000, size = 8, mu = 5) > mean(x) [1] 4.9674 > var(x) [1] 7.874925 |

If we can find the relationship between $\theta$ and $size$, then we can use the Negative Binomial distribution to generate Quasi-Poisson distributed random variable. The proof is listed as the following:

So, we can define such a function in R:

1 2 3 | rqpois <- function(n, mu, theta) { rnbinom(n = n, mu = mu, size = mu/(theta-1)) } |

Take an example to diagnose the performance of the above function: $\mu = 3$ and $\theta = 5$. According to the relationship $Var(Y_{qp}) = \theta \times \mu$, the generated variable should have variance arround 15.

1 2 3 4 5 6 | > set.seed(0) > x <- rqpois(n = 10000, mu = 3, theta = 5); > mean(x) [1] 2.9718 > var(x) [1] 14.66027 |

So, it works!

**leave a comment**for the author, please follow the link and comment on their blog:

**Category: R | Huidong Tian**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.