[This article was first published on Markov Chain Monte Carlo in Item Response Models, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In the chapter, we argue that a useful way to develop a Metropolis-Hastings (MH) within Gibbs sampler is to split the code into two levels. The top level is the “shell” of the MH within Gibbs algorithm, which sets up the data structures, implements burn-in and thinning of the chain, and calls a lower-level function to do the actual sampling from the complete conditionals. In this way, we can test the shell of the algorithm separately from testing the complete conditional samplers.

This post quotes the code presented in the chapter, discusses a data structure choice, gives an example of how to run the “empty” shell, and gives an example of how to visualize output from the shell.

# MH within Gibbs 2PL shell

The code that we present in the chapter for the shell is:

```## Define a complete run of the MH within Gibbs sampler
run.chain.2pl <- function(M.burnin, M.keep, M.thin,
U.data, hyperpars,
th.init, a.init, b.init, s2.init,
MH.th, MH.a, MH.b, verbose=FALSE) {

## Define and initialize the list of things to keep track of in
## the "current state" of the chain
cur     <- list(th=th.init, a=a.init, b=b.init, s2=s2.init,
hyperpars=hyperpars,
MH = list(th=MH.th,    a=MH.a,    b=MH.b),
ACC= list(th=0,th.n=0, a=0,a.n=0,  b=0,b.n=0 ))

## Define matrix to store MCMC results...
chain.keep <- matrix( NA, ncol = M.keep,
nrow = length(th.init)
+ length(a.init)
+ length(b.init)
+ length(s2.init))
rownames(chain.keep) <- c(
paste('theta.abl', 1:length(th.init)),
paste('a.disc',  1:length(a.init)),
paste('b.diff',  1:length(b.init)),
'sig2.theta')

# Burn-in phase: do not keep these results
if( M.burnin > 0 ) {
for(ii in 1:M.burnin ) {
cur <- blocked.mcmc.update(U.data, cur)
}}

# Converged phase: Keep these results after thinning
for (m in 1:M.keep) {
# Skip the "thinned" pieces of the chain
if( M.thin > 1 ) {
for( ii in 1:(M.thin-1) ) {
cur <- blocked.mcmc.update(U.data, cur)
}}
# Generate a "kept" update and save its results
cur <- blocked.mcmc.update(U.data, cur)
chain.keep[,m] <- c( cur\$th, cur\$a, cur\$b, cur\$s2 )

if ( m %% 100 == 0) {
print(m)
# Adaptive tuning would go here.
}

}

if (verbose) {
cat(paste("Average acceptance rates:",
"n theta.abl:", round(cur\$ACC\$th / cur\$ACC\$th.n,3),
"n a.disc:   ", round(cur\$ACC\$a  / cur\$ACC\$a.n,3),
"n b.diff:   ", round(cur\$ACC\$b  / cur\$ACC\$b.n,3),
"n"))
}

return( chain.keep )
}
```

Where the `blocked.mcmc.update` is defined with respect to the complete conditional samplers

```## Define a single iteration of the MH within Gibbs sampler
blocked.mcmc.update  <- function( U.data, cur){
cur <- sample.th(U.data, cur)
cur <-  sample.a(U.data, cur)
cur <-  sample.b(U.data, cur)
cur <- sample.s2(U.data, cur)
return(cur)
}
```

which are, for now, defined as dummy samplers

```## Define the dummy complete conditional sampler
dummy.cc.sampler  <- function(U.data, cur){ return(cur) }

## Set all complete conditional samplers to the dummy sampler
sample.th         <- dummy.cc.sampler
sample.a          <- dummy.cc.sampler
sample.b          <- dummy.cc.sampler
sample.s2         <- dummy.cc.sampler
```

Please see the chapter for details on the setup.

### A technical aside about `cur`

Users of programming languages like C or C++ may be confused as to why we chose to pass around the omnibus object `cur`, instead of passing the output matrix by reference to the complete conditional samplers and having them update it directly.

For technical and historical reasons, R implements a hybrid between pass by reference and pass by value. In R, an object is passed by reference if the function does not change the object, and passed value if it does. Therefore we use a small, omnibus object (`cur`) to be copied about as the chain moves through a single iteration of `blocked.mcmc.update`. This is much faster than passing the whole output matrix.

We could improve speed by mimicking pass by reference behavior with a package like R.oo, but at a large sacrifice to code readability. Therefore we define the `cur` object to contain only the current state of the chain and the minimal additional information to calculate updates and monitor their success.

# Running the shell

In Post 1 we discussed specific choices for the hyper priors. We can encode them in R as follows:

```## Define the hyper parameters to values from Post 1
hyperpars <- list( mu.a         = 1.185,
s2.a         = 1.185,
s2.b         = 100,
alpha.th     = 1,
beta.th      = 1 )```

In Post 2 we generated fake data. We can run the sampler on the fake data as follows:

```## Load the necessary code from this and previous posts
source("http://mcmcinirt.stat.cmu.edu/setup/post-3.R")

## Set the seed to keep results reproducible
set.seed(314159)

## Run the sampler at the true, simulated values
## for 1000 iterations
run.A <- run.chain.2pl(
## No burnin, keep 1000 iterations, do not thin
M.burnin = 0, M.keep  = 1000, M.thin = 1,
## Use the fake data simulated in Post #2
U.data  = U,
## Pass in the hyperpriors from Post #1
hyperpars = hyperpars,
## Use the true values from Post #2 for the initial values
th.init = theta.abl,
a.init  = a.disc,
b.init  = b.diff,
s2.init = sig2.theta,
## These parameters are not yet defined. Pass in NULL.
MH.th=NULL, MH.a=NULL, MH.b=NULL)
```

# Exploring the chain

Since `run.A` is a large matrix, printing it to the screen is not helpful. We can explore it as follows:

```## run.A is a large matrix
dim(run.A)
##  2061 1000

## We can select a single parameter, by selecting a
## row of the matrix. For example:
run.A['theta.abl 1',]
## selects the row for the first person ability parameter

## However, since there are 1000 columns, even one
## row is too much to print to the screen. Here
## we select columns 1 through 5
run.A['theta.abl 1', 1:5]
##  -1.682354 -1.682354 -1.682354 -1.682354 -1.682354
##
## Which is more manageable. Note, that as expected, they
## are all equal.

## We can also define a vector of names
first.names <- c('theta.abl 1',
'a.disc 1',
'b.diff 1',
'sig2.theta')
## and use it to look at the output
run.A[first.names, 1:5]
## theta.abl 1 -1.6823542 -1.6823542 -1.6823542 -1.6823542 -1.6823542
## a.disc 1     0.7120615  0.7120615  0.7120615  0.7120615  0.7120615
## b.diff 1    -3.0000000 -3.0000000 -3.0000000 -3.0000000 -3.0000000
## sig2.theta   1.5625000  1.5625000  1.5625000  1.5625000  1.5625000
##
## Note, that as expected each parameter stays constant.
```

Specifying a different vector for each set of outputs can be tedious. We define a function `get.2pl.params` to build the selection vector:

```## get.2pl.params takes ranges of parameters and appends the
## appropriate variable names.
get.2pl.params <- function( theta.range, a.range,
b.range, sig2.range) {
## Append the variable names to the ranges
if( !is.null(theta.range) ) {
theta.range <- paste('theta.abl', theta.range)
}
if( !is.null(a.range    ) ) {
a.range     <- paste('a.disc'   , a.range    )
}
if( !is.null(b.range    ) ) {
b.range     <- paste('b.diff'   , b.range    )
}
if( !is.null(sig2.range ) ) {
sig2.range  <-       'sig2.theta'
}

return( c(theta.range, a.range, b.range, sig2.range ))
}
```

We test that the `get.2pl.params` function is working by comparing its output to the `first.names` variable we constructed earlier.

```## N.B. get.2pl.params can give the same values as first.names
all.equal( get.2pl.params( 1, 1, 1, 1 ), first.names)
##  TRUE

## So now we can look at the first five samples of the first
## three item discrimination parameters
run.A[ get.2pl.params(NULL, 1:3, NULL, NULL), 1:5]
##               [,1]      [,2]      [,3]      [,4]      [,5]
## a.disc 1 0.7120615 0.7120615 0.7120615 0.7120615 0.7120615
## a.disc 2 1.1440699 1.1440699 1.1440699 1.1440699 1.1440699
## a.disc 3 0.7326655 0.7326655 0.7326655 0.7326655 0.7326655
```

# Visualizing sampler output

We can use coda to analyze and visualize the output of the sampler. Because we chose a matrix for the output, converting `run.A` to `coda`'s format is trivial:

```require(coda)
## If the above command fails, you must install the
## coda library. To do so, un-comment the install.packages
## line below, run it, and follow the directions it gives.
##
## install.packages('coda')

## N.B. t(...) takes the transpose of a matrix
run.A.mcmc <- mcmc( t(run.A) )
```

A `coda mcmc` object behaves like a matrix. We can use the `get.2pl.params` function to extract out the variables we want. Note that for the `mcmc` object, the variables are in the columns.

```run.A.mcmc[1:5, get.2pl.params(1,1,1,1)]
##      theta.abl 1  a.disc 1 b.diff 1 sig2.theta
## [1,]   -1.682354 0.7120615       -3     1.5625
## [2,]   -1.682354 0.7120615       -3     1.5625
## [3,]   -1.682354 0.7120615       -3     1.5625
## [4,]   -1.682354 0.7120615       -3     1.5625
## [5,]   -1.682354 0.7120615       -3     1.5625
```

A `coda mcmc` object has a default plotting method. Therefore it is trivial to produce this graph with one line of R code

```plot( run.A.mcmc[, get.2pl.params(1,1,1,1)], density=FALSE )
```

Run

```help(plot.mcmc)
```

for the help page, which gives details on how to customize the graph.

Note that the graph is not particularly impressive because the sampler is not doing anything yet.