(This article was first published on

**Econometrics Beat: Dave Giles' Blog**, and kindly contributed to R-bloggers)This is the second in a set of posts about Monte Carlo Markov Chain (MCMC, or MC

^{2}) methods in Bayesian econometrics. The background was provided in**this first post**, where the Gibbs sampler was introduced.The main objective of the present post is to convince you that this MCMC stuff actually works!

To achieve this, what we’re going to do is work through a simple example – one for which we actually know the answer in advance. That way, we’ll be able to check our results from applying the Gibbs sampler with the facts. Hopefully, we’ll then be able to see that this technique works – at least for this example!

I’ll be using some R script that I’ve written to take students through this, and it’s available on the

**code page**for this blog. I should mention in advance that this code is not especially elegant. It’s been written, quite deliberately, in a step-by-step manner to make it relatively transparent to non-users of R. Hopefully, the comments that are embedded in the code will also help.It’s also important to note that this first illustration of the Gibbs sampler in action

*does not*involve the posterior distribution for the parameters in a Bayesian analysis of some model. Instead, we’re going to look at the problem of obtaining the marginals of a bivariate normal distribution, when we know the form of the conditional distributions.In other words – let’s proceed one step at a time. The subsequent posts on this topic will be dealing with Bayesian posterior analysis.

Let’s take a look at the set-up, and the analysis that we’re going to undertake.

Suppose that we have two random variables, Y

_{1}and Y_{2}, which follow a bivariate normal distribution with a mean vector, (μ_{1}, μ_{2})’, and a correlation of ρ. The variances of Y_{1}and Y_{2}will be σ_{1}^{2}and σ_{2}^{2}respectively, and so the covariance between Y_{1}and Y_{2}will be (ρσ_{1}σ_{2}).Now, it’s well known that the

*conditional*distribution of Y_{1}given Y_{2}is p(y_{1} | y_{2}) ~ N[μ_{1} + ρσ_{1}(y_{2} – μ_{2}) / σ_{2} , σ_{1}^{2}(1 – ρ^{2})] ,

and the conditional distribution of Y

_{2}given Y_{1}is p(y

We’ll take this information as given, and we’ll use it (

_{2}| y_{1}) ~ N[μ_{2}+ ρσ_{2}(y_{1}– μ_{1}) / σ_{1}, σ_{2}^{2}(1 – ρ^{2})] .We’ll take this information as given, and we’ll use it (

*via*the Gibbs sampler) to obtain the*marginal*distributions of Y_{1}and Y_{2}.Now, in fact these marginal distributions are known to be

p(y

_{1}) ~ N[μ_{1}, σ_{1}^{2}]and

p(y

_{2}) ~ N[μ_{2}, σ_{2}^{2}] .However, we’re going to pretend that we don’t know this. It means, though, that we can check our MCMC results for accuracy at the end of the exercise.

Without any loss of generality, I’m going to set μ

_{1}= μ_{2}= 0; and σ_{1}= σ_{2}= 1. Regardless of the value of ρ, this means that the*marginal*distributions for Y_{1}and Y_{2}should each be*standard normal*.Now, let’s recall the steps involved in using the Gibbs sampler in this context. What we’ll do is:

- Choose an initial value for Y
_{1}. Let’s call this value y_{1}^{(0)}. - Next, generate a random Y
_{2}value (say, y_{2}^{(1)}) from p(y_{2}| y_{1}^{(0)}). - Then, generate a new random Y
_{1}value (say, y_{1}^{(1)}) from p(y_{1}| y_{2}^{(1)}). - Repeat steps 2 and 3 many, many times, saving the strings of Y
_{1}and Y_{2}values. - Throw away the first thousand or so values from these strings of Y
_{1}and Y_{2}values, as they are draws from the*conditional*distributions. - After these “burn in” values have been discarded, the subsequent strings should contain thousands of values from the marginal distributions of Y
_{1}and Y_{2}. These are the values that we’re interested in.

Of course, we have to allow for a sufficiently long “burn in period” for this to work effectively. This is something that we can return to in a later post.

So, let’s take a look at my R code. It starts off as follows, by initializing some quantities that we’ll need:

After assigning an initial value for Y

_{1}(*i.e*., step 1 above), we run through the Gibbs sampler loop, and store all of the values drawn from the*conditional*distributions for Y_{1}and Y_{2}:Next, we throw away the “burn in” observations, and just focus on the next 100,000 values of each of Y

_{1}and Y

_{2}:

Let’s plot these values for Y_{1}:

This is what we get:

Notice that the values are centered around a level of zero, and very few values fall above 3, or below -3. This is what we’d expect if the distribution is standard normal, which is what the

*marginal*distribution for Y_{1}is supposed to be.Now repeat this for Y

_{2}:_{1}and Y

_{2}values are essentially zero, and the variances are essentially unity. That’s what they should be if these draws are from the

*marginal*distributions, and not from the

*conditional*distributions.

_{1}and Y

_{2}data:

_{1}and those for Y

_{2}, come from standard normal distributions.

The Q-Q plots look great, and the p-values associated with the Jarque-Bera tests very strongly suggest that we can’t reject normality.

I’m convinced, and I hope that you are too! Our Gibbs sampler seems to have successfully generated 100,000 draws from the

*marginal*distributions for Y_{1}and Y_{2}.Once again, the full set of R commands is on the

**code page**for this blog. Some subsequent posts will provide examples of applying the Gibbs sampler to determine the marginal posterior distributions for the parameters in some Bayesian problems.To

**leave a comment**for the author, please follow the link and comment on their blog:**Econometrics Beat: Dave Giles' Blog**.R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...