Generating a quasi Poisson distribution, version 2

[This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Here and there, I mentioned two codes to generated quasiPoisson random variables. And in both of them, the negative binomial approximation seems to be wrong. Recall that the negative binomial distribution is

http://perso.univ-rennes1.fr/arthur.charpentier/latex/bineg01.png
where
http://perso.univ-rennes1.fr/arthur.charpentier/latex/bineg02.png
and in R, a negative binomial distribution can be parametrized using two parameters, out of the following ones
  • the sizehttp://perso.univ-rennes1.fr/arthur.charpentier/latex/bineg03.png
  • the probabilityhttp://perso.univ-rennes1.fr/arthur.charpentier/latex/bineg04.png
  • the mean
http://perso.univ-rennes1.fr/arthur.charpentier/latex/bineg05.png
Recall also that the variance for a negative binomial distribution should be
http://perso.univ-rennes1.fr/arthur.charpentier/latex/bineg06.png
Here, we consider a distribution such that http://perso.univ-rennes1.fr/arthur.charpentier/latex/bineg07.png and http://perso.univ-rennes1.fr/arthur.charpentier/latex/bineg08.png. In the previous posts, I used
http://perso.univ-rennes1.fr/arthur.charpentier/latex/bineg09.png
http://perso.univ-rennes1.fr/arthur.charpentier/latex/bineg10.png
i.e.
rqpois = function(n, lambda, phi) {
mu = lambda
k = mu/(phi * mu - 1)
r1 = rnbinom(n, mu = mu, size = k)
r2 = rnbinom(n, size=phi*mu/(phi-1),prob=1/phi)
k = mu/phi/(1-1/phi)
r3 = rnbinom(n, mu = mu, size = k)
r4 = rnbinom(n, size=mu/phi/(1-1/phi),prob=1/phi)
r = cbind(r1,r2,r3,r4)
return(r)
}
but as we can see below, none of those two functions work,
> N=rqpois(1000000,2,4)
> mean(N[,1])
[1] 2.001992
> mean(N[,2])
[1] 8.000033
> var(N[,1])/ mean(N[,1])
[1] 7.97444
> var(N[,2])/ mean(N[,2])
[1] 4.002022
with the first one, the expected value is correct, while it is the overdispersion parameter for the second. Now, if we do the maths correctly, it comes
http://perso.univ-rennes1.fr/arthur.charpentier/latex/bineg11.png

http://perso.univ-rennes1.fr/arthur.charpentier/latex/bineg12.png
which should now work,
> mean(N[,3])
[1] 2.001667
> mean(N[,4])
[1] 2.002776
> var(N[,3])/ mean(N[,3])
[1] 3.999318
> var(N[,4])/ mean(N[,4])
[1] 4.009647
So, finally it is better when we do the maths well.

To leave a comment for the author, please follow the link and comment on their blog: Freakonometrics - Tag - R-english.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)