Post 3: Setting up the sampler and visualizing its output

[This article was first published on Markov Chain Monte Carlo in Item Response Models, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In the chapter, we argue that a useful way to develop a Metropolis-Hastings (MH) within Gibbs sampler is to split the code into two levels. The top level is the “shell” of the MH within Gibbs algorithm, which sets up the data structures, implements burn-in and thinning of the chain, and calls a lower-level function to do the actual sampling from the complete conditionals. In this way, we can test the shell of the algorithm separately from testing the complete conditional samplers.

This post quotes the code presented in the chapter, discusses a data structure choice, gives an example of how to run the “empty” shell, and gives an example of how to visualize output from the shell.

MH within Gibbs 2PL shell

The code that we present in the chapter for the shell is:

## Define a complete run of the MH within Gibbs sampler
run.chain.2pl <- function(M.burnin, M.keep, M.thin,
                , hyperpars, 
                          th.init, a.init, b.init, s2.init, 
                , MH.a, MH.b, verbose=FALSE) {
  ## Define and initialize the list of things to keep track of in 
  ## the "current state" of the chain 
  cur     <- list(th=th.init, a=a.init, b=b.init, s2=s2.init,
                  MH = list(,    a=MH.a,    b=MH.b),
                  ACC= list(th=0,th.n=0, a=0,a.n=0,  b=0,b.n=0 ))
  ## Define matrix to store MCMC results...
  chain.keep <- matrix( NA, ncol = M.keep,
                        nrow = length(th.init) 
                               + length(a.init)
                               + length(b.init) 
                               + length(s2.init))
  rownames(chain.keep) <- c( 
        paste('theta.abl', 1:length(th.init)),
        paste('a.disc',  1:length(a.init)),
        paste('b.diff',  1:length(b.init)),
  # Burn-in phase: do not keep these results
  if( M.burnin > 0 ) {
    for(ii in 1:M.burnin ) {
      cur <- blocked.mcmc.update(, cur) 
  # Converged phase: Keep these results after thinning
  for (m in 1:M.keep) {
    # Skip the "thinned" pieces of the chain       
    if( M.thin > 1 ) {
      for( ii in 1:(M.thin-1) ) {
        cur <- blocked.mcmc.update(, cur) 
    # Generate a "kept" update and save its results
    cur <- blocked.mcmc.update(, cur) 
    chain.keep[,m] <- c( cur$th, cur$a, cur$b, cur$s2 )
    if ( m %% 100 == 0) {
      # Adaptive tuning would go here. 
  if (verbose) {
    cat(paste("Average acceptance rates:",
              "n theta.abl:", round(cur$ACC$th / cur$ACC$th.n,3),
              "n a.disc:   ", round(cur$ACC$a  / cur$ACC$a.n,3),
              "n b.diff:   ", round(cur$ACC$b  / cur$ACC$b.n,3),
  return( chain.keep ) 

Where the blocked.mcmc.update is defined with respect to the complete conditional samplers

## Define a single iteration of the MH within Gibbs sampler
blocked.mcmc.update  <- function(, cur){
  cur <-, cur)  
  cur <-  sample.a(, cur)  
  cur <-  sample.b(, cur)  
  cur <- sample.s2(, cur)  

which are, for now, defined as dummy samplers
## Define the dummy complete conditional sampler  <- function(, cur){ return(cur) }

## Set all complete conditional samplers to the dummy sampler         <-
sample.a          <-
sample.b          <-
sample.s2         <-

Please see the chapter for details on the setup.

A technical aside about cur

Users of programming languages like C or C++ may be confused as to why we chose to pass around the omnibus object cur, instead of passing the output matrix by reference to the complete conditional samplers and having them update it directly.

For technical and historical reasons, R implements a hybrid between pass by reference and pass by value. In R, an object is passed by reference if the function does not change the object, and passed value if it does. Therefore we use a small, omnibus object (cur) to be copied about as the chain moves through a single iteration of blocked.mcmc.update. This is much faster than passing the whole output matrix.

We could improve speed by mimicking pass by reference behavior with a package like R.oo, but at a large sacrifice to code readability. Therefore we define the cur object to contain only the current state of the chain and the minimal additional information to calculate updates and monitor their success.

Running the shell

In Post 1 we discussed specific choices for the hyper priors. We can encode them in R as follows:

## Define the hyper parameters to values from Post 1
hyperpars <- list( mu.a         = 1.185,
                   s2.a         = 1.185,
                   s2.b         = 100,
              = 1,
               = 1 )

In Post 2 we generated fake data. We can run the sampler on the fake data as follows:
## Load the necessary code from this and previous posts

## Set the seed to keep results reproducible 
## Run the sampler at the true, simulated values
## for 1000 iterations
run.A <- run.chain.2pl(
    ## No burnin, keep 1000 iterations, do not thin
    M.burnin = 0, M.keep  = 1000, M.thin = 1,
    ## Use the fake data simulated in Post #2  = U,
    ## Pass in the hyperpriors from Post #1
    hyperpars = hyperpars,
    ## Use the true values from Post #2 for the initial values
    th.init = theta.abl,
    a.init  = a.disc,
    b.init  = b.diff,
    s2.init = sig2.theta,
    ## These parameters are not yet defined. Pass in NULL., MH.a=NULL, MH.b=NULL)

Exploring the chain

Since run.A is a large matrix, printing it to the screen is not helpful. We can explore it as follows:

## run.A is a large matrix 
## [1] 2061 1000

## We can select a single parameter, by selecting a
## row of the matrix. For example:
run.A['theta.abl 1',]
## selects the row for the first person ability parameter

## However, since there are 1000 columns, even one
## row is too much to print to the screen. Here
## we select columns 1 through 5
run.A['theta.abl 1', 1:5]
## [1] -1.682354 -1.682354 -1.682354 -1.682354 -1.682354 
## Which is more manageable. Note, that as expected, they
## are all equal. 

## We can also define a vector of names
first.names <- c('theta.abl 1',
                 'a.disc 1',
                 'b.diff 1',
## and use it to look at the output
run.A[first.names, 1:5]
## theta.abl 1 -1.6823542 -1.6823542 -1.6823542 -1.6823542 -1.6823542
## a.disc 1     0.7120615  0.7120615  0.7120615  0.7120615  0.7120615
## b.diff 1    -3.0000000 -3.0000000 -3.0000000 -3.0000000 -3.0000000
## sig2.theta   1.5625000  1.5625000  1.5625000  1.5625000  1.5625000
## Note, that as expected each parameter stays constant. 

Specifying a different vector for each set of outputs can be tedious. We define a function get.2pl.params to build the selection vector:

## get.2pl.params takes ranges of parameters and appends the 
## appropriate variable names. 
get.2pl.params <- function( theta.range, a.range, 
                            b.range, sig2.range) {
    ## Append the variable names to the ranges
    if( !is.null(theta.range) ) { 
      theta.range <- paste('theta.abl', theta.range) 
    if( !is.null(a.range    ) ) { 
      a.range     <- paste('a.disc'   , a.range    ) 
    if( !is.null(b.range    ) ) { 
      b.range     <- paste('b.diff'   , b.range    ) 
    if( !is.null(sig2.range ) ) { 
      sig2.range  <-       'sig2.theta'              
    return( c(theta.range, a.range, b.range, sig2.range ))

We test that the get.2pl.params function is working by comparing its output to the first.names variable we constructed earlier.
## N.B. get.2pl.params can give the same values as first.names
all.equal( get.2pl.params( 1, 1, 1, 1 ), first.names)
## [1] TRUE

## So now we can look at the first five samples of the first
## three item discrimination parameters
run.A[ get.2pl.params(NULL, 1:3, NULL, NULL), 1:5]
##               [,1]      [,2]      [,3]      [,4]      [,5]
## a.disc 1 0.7120615 0.7120615 0.7120615 0.7120615 0.7120615
## a.disc 2 1.1440699 1.1440699 1.1440699 1.1440699 1.1440699
## a.disc 3 0.7326655 0.7326655 0.7326655 0.7326655 0.7326655

Visualizing sampler output

We can use coda to analyze and visualize the output of the sampler. Because we chose a matrix for the output, converting run.A to coda's format is trivial:

## If the above command fails, you must install the 
## coda library. To do so, un-comment the install.packages 
## line below, run it, and follow the directions it gives.
## install.packages('coda')

## N.B. t(...) takes the transpose of a matrix
run.A.mcmc <- mcmc( t(run.A) )

A coda mcmc object behaves like a matrix. We can use the get.2pl.params function to extract out the variables we want. Note that for the mcmc object, the variables are in the columns.

run.A.mcmc[1:5, get.2pl.params(1,1,1,1)]
##      theta.abl 1  a.disc 1 b.diff 1 sig2.theta
## [1,]   -1.682354 0.7120615       -3     1.5625
## [2,]   -1.682354 0.7120615       -3     1.5625
## [3,]   -1.682354 0.7120615       -3     1.5625
## [4,]   -1.682354 0.7120615       -3     1.5625
## [5,]   -1.682354 0.7120615       -3     1.5625

A coda mcmc object has a default plotting method. Therefore it is trivial to produce this graph
with one line of R code

plot( run.A.mcmc[, get.2pl.params(1,1,1,1)], density=FALSE )



for the help page, which gives details on how to customize the graph.

Note that the graph is not particularly impressive because the sampler is not doing anything yet.

To leave a comment for the author, please follow the link and comment on their blog: Markov Chain Monte Carlo in Item Response Models. offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.