Will I ever be a bayesian statistician ? (part 1)

January 20, 2011
By

(This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers)

Last week, during the workshop on Statistical Methods for Meteorology and Climate Change (here), I discovered how powerful bayesian techniques could be, and that there were more and more bayesian statisticians. So, if I was to fully understand applied statisticians in conferences and workshops, I really have to understand basics of bayesian statistics. I have published some time ago some posts on bayesian statistics applied to actuarial problems (here or there), but so far, I always thought that bayesian was a synonym for magician.
To be honest, I am a Muggle, and I have not been trained as a bayesian. But I can be an opportunist...
So I decided to publish some posts on bayesian techniques, in order to prove that it is actually not that difficult to implement.

As far as I understand it, in bayesian statistics, the parameter is considered as a random variable (which is also the case, in classical mathematical statistics). But here, here assume that this parameter does have a parametric distribution....
Consider a classical statistical problem: assume we have a sample http://freakonometrics.free.fr/blog/bayy1.png i.i.d. with distribution http://freakonometrics.free.fr/blog/bayy2.png. Here we note

http://freakonometrics.free.fr/blog/bayy3.pnghttp://freakonometrics.free.fr/blog/bayy3.png
since parameter http://freakonometrics.free.fr/blog/bayyyyy001.pngthe is a random variable. The idea is to assume that http://freakonometrics.free.fr/blog/bayyyyy001.png has a (so called priori) distribution, e.g.
http://freakonometrics.free.fr/blog/bayy4.png
So far it was simple. The idea is then to consider the posterior distribution of http://freakonometrics.free.fr/blog/bayyyyy001.png, given the observations http://freakonometrics.free.fr/blog/bayyyyyy02.png. Thus, we need to compute the distribution of http://freakonometrics.free.fr/blog/bayyyyyy03.png which is here extremely simple (due to properties of the Gaussian distribution), i.e.
http://freakonometrics.free.fr/blog/bayyyyyy04.png
where
http://freakonometrics.free.fr/blog/bayyyyyy05.png
And them, it becomes extremely natural to consider http://freakonometrics.free.fr/blog/bayy20.png as an estimator of given our sample data (and thus, we also have a confidence interval since we know the distribution of http://freakonometrics.free.fr/blog/bayyyyy001.png given the observations http://freakonometrics.free.fr/blog/bayyyyyy02.png).
In order to be sure that we understood, consider now a heads and tails problem, i.e. http://freakonometrics.free.fr/blog/bayy5.png. Note, first, that theta has support http://freakonometrics.free.fr/blog/bayy6.png. So we need a distribution on that support. Why not a beta distribution ? E.g.
http://freakonometrics.free.fr/blog/bayy7.png
Thus,
http://freakonometrics.free.fr/blog/bayy8.png
and
http://freakonometrics.free.fr/blog/bayy9.png
From Bayes formula,
http://freakonometrics.free.fr/blog/bayy10.png
and we get easily
http://freakonometrics.free.fr/blog/bayy11.png
which is the density of a Beta distribution, i.e.
http://freakonometrics.free.fr/blog/bayy12.png
prior=dbeta(u,a,b)
posterior=dbeta(u,a+y,n-y+b)
The estimator proposed is then the expected value of that conditional distribution,
http://freakonometrics.blog.free.fr/public/perso/bayyyyyyyyyyy.png
Note that
http://freakonometrics.free.fr/blog/bayy13.png
Further, it is possible to derive confidence intervals using quantiles of the posterior distribution.
On the graphs below, we consider the following heads/tails sample

A first idea is to consider a uniform prior distribution.
http://freakonometrics.free.fr/blog/bayes-cv-1.gif
A second idea is to consider an asymmetric beta distribution. First, with an asymmetry on the left,
http://freakonometrics.free.fr/blog/bayes-cv-3.gif
or on the right
http://freakonometrics.free.fr/blog/bayes-cv-2.gif
Finally a third idea is simply to get back to the standard Gaussian approximation,
http://freakonometrics.free.fr/blog/bayes-cv-gauss.gif
If we compare the four models, we obtain (the plain black line is the Gaussian approximated distribution for the empirical mean), and red lines are obtained from prior beta distributions
http://freakonometrics.free.fr/blog/bayes-cv-all.gif
The code to generate those graphs is the following
a1=1; b1=1
D1[1,]=dbeta(u,a,b)
a2=4; b2=2
D2[1,]=dbeta(u,a,b)
a3=2; b3=4
D3[1,]=dbeta(u,a,b)
setseed(1)
S=sample(0:1,size=100,replace=TRUE)
COULEUR=rev(rainbow(120))
D1=D2=D3=D4=matrix(NA,101,length(u))
for(s in 1:100){
y=sum(S[1:s])
D1[s+1,]=dbeta(u,a1+y,s-y+b1)
D2[s+1,]=dbeta(u,a2+y,s-y+b2)
D3[s+1,]=dbeta(u,a3+y,s-y+b3)
D4[s+1,]=dnorm(u,y/s,sqrt(y/s*(1-y/s)/s))
plot(u,D1[1,],col="black",type="l",ylim=c(0,8),
xlab="",ylab="")
for(i in 1:s){lines(u,D1[1+i,],col=COULEUR[i])}
points(y/s,0,pch=3,cex=2)
plot(u,D2[1,],col="black",type="l",ylim=c(0,8),
xlab="",ylab="")
for(i in 1:s){lines(u,D2[1+i,],col=COULEUR[i])}
points(y/s,0,pch=3,cex=2)
plot(u,D3[1,],col="black",type="l",ylim=c(0,8),
xlab="",ylab="")
for(i in 1:s){lines(u,D3[1+i,],col=COULEUR[i])}
points(y/s,0,pch=3,cex=2)
plot(u,D4[1,],col="white",type="l",ylim=c(0,8),
xlab="",ylab="")
for(i in 1:s){lines(u,D4[1+i,],col=COULEUR[i])}
points(y/s,0,pch=3,cex=2)
plot(u,D4[s+1,],col="black",lwd=2,type="l",
ylim=c(0,8),xlab="",ylab="")
lines(u,D1[1+i,],col="blue")
lines(u,D2[1+i,],col="red")
lines(u,D3[1+i,],col="purple")
points(y/s,0,pch=3,cex=2)
}
Here, we can see that computations are simple if the prior distribution has a distribution which is the conjugate of the observations' distribution (see here for the list of prior and posterior standard distributions).
So far, I have two questions that naturally show up
  • is it possible to start with a neutral prior distribution, non informative ?
  • what if we are no longer working with conjugate distributions ?
Well, I guess I have to work a bit more to answer those questions.... to be continued...


To leave a comment for the author, please follow the link and comment on his blog: Freakonometrics - Tag - R-english.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: , , , , , , , , ,

Comments are closed.