# Post 5: Refactoring Part I: a generic Metropolis-Hastings sampler

**Markov Chain Monte Carlo in Item Response Models**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

A best practice in software engineering is to re-use code instead of duplicating it. One reason for this is that it makes finding and fixing bugs easier. You only have to find the bug once instead of finding it every place you duplicated the code.

Since the person ability, item discrimination, and item difficulty samplers will all use the same Metropolis-Hastings (MH) algorithm, it would be useful to define each of those samplers in terms of a generic MH sampler.

In this post, we refactor the person ability sampler from Post 4 into: (1) a generic MH sampler, (2) a function to draw proposals for the person ability parameters, and (3) a function to calculate the complete conditional density qfor the person ability parameters. We then implement a new version of the person ability sampler and compare its output to the sampler from Post 4.

# General refactoring strategy

The Metropolis-Hastings algorithm is roughly as follows:

- Propose a new value.
- Calculate the probability of accepting the new value.
- Flip a coin to determine if we accept the proposal or not.
- Update the state of the chain.

The `sample.th`

code from the previous post implements these steps for the specific context of the person ability parameter. Below, we examine the code for each step of the algorithm, identify pieces that are specific to the person ability context, and then refactor the code into a general part and a specific part.

## Step 1: Propose a new value

There are two places in the `sample.th`

code that are specific to proposing new values in the person ability context. The first is in drawing the proposal values:

th.old <- old$th MH.th <- old$MH$th P.persons <- length(th.old) th.star <- rnorm(P.persons,th.old,MH.th)

The second is in calculating the last two terms of the acceptance probability:

log.prop.star <- log(dnorm(th.star,th.old,MH.th)) log.prop.old <- log(dnorm(th.old,th.star,MH.th))

One way to combine these two tasks is to write a specific proposal function for the person ability parameters which will do two things (1) make the proposal draws and (2) return the log-density of the proposal distribution. For example:

## Proposal function for the person ability parameters prop.th.abl <- function( state ) { ## Extract parameters from the state ## ... Previously: ## th.old <- old$th ## MH.th <- old$MH$th ## ... Now: th.old <- state$th MH.th <- state$MH$th ## Draw the proposal to update the state ## ... Previously: ## P.persons <- length(th.old) ## th.star <- rnorm(P.persons,th.old,MH.th) ## ... Now: state$th <- rnorm( length( th.old ), mean=th.old, sd=MH.th ) ## Build a function to return the density of the proposal ## ... Previously: ## log.prop.star <- log(dnorm(th.star,th.old,MH.th)) ## log.prop.old <- log(dnorm(th.old,th.star,MH.th)) ## ... Now, we define the function log.density <- function( state.to, state.from ) { dnorm( x = state.to$th, mean = state.from$th, sd = MH.th, log = TRUE ) } ## Return the proposal object ## ... Note that we'll need the name of the parameter later, ## so return it too return( list( param.name='th', state=state, log.density=log.density )) }

Then the generic sampler would look something like this:

## A snippet of the generic sampler mh.sample <- function( old.state, proposal.function ## ... snip ... ) { ## Step 1: Call the proposal function prop <- proposal.function( old.state ) ## which returns: ## ... prop$param.name, the name of the parameter ## ... prop$state, the new state object ## ... prop$log.density, the log density of the proposal ## Step 2: Calculate the (log) acceptance probability log.alpha.star <- ( ## ## ... snip ... ## ## Proposal density calculated to the proposal + prop$log.density( prop$state, old.state ) ## ## Proposal density calculated to the "old" value - prop$log.density( old.state, prop$state ) ) ## ... snip ... }

where `... snip ...`

stands in for parts of the code which are not yet fleshed out.

## Step 2: Calculating the acceptance probability

Now having drawn the proposals we can calculate the probability of accepting them. In the `theta.abl`

sampler, this occurs in one line:

log.alpha.star <- pmin(log.cc.star + log.prop.old - log.cc.old - log.prop.star,0)

where

`log.cc.star`

,`log.cc.old`

, `log.prop.star`

, and `log.prop.old`

represent the value of complete conditional density at the proposal, the value of the complete conditional density at the old state of the chain, the value of the proposal density at the proposal, and the value of the "reversed" proposal density respectively.
Since the proposal function we defined above returns the proposal density, the only other information needed is the complete conditional density. We can implement this for the person ability parameter as:

## Complete conditional for the person ability parameters th.abl.cc <- function( U.data, state ) { return( ## Calculate the log-likelihood term apply( log.prob(U.data, state$th, state$a, state$b),1,sum) ## And add the prior log-density term + dnorm(state$th, 0, sqrt(state$s2), log=TRUE) ) }

Then we can use it as follows in the generic MH sampling function:

## A snippet of the generic sampler mh.sample <- function( U.data, old.state, cc.log.density, proposal.function ) { ## Step 1: Call the proposal function prop <- proposal.function( old.state ) ## which returns: ## ... prop$param.name, the name of the parameter ## ... prop$state, the new state object ## ... prop$log.density, the log density of the proposal ## Step 2: Calculate the (log) acceptance probability log.alpha.star <- ( ## Complete conditional at the proposal cc.log.density( U.data, prop$state ) ## ## Complete conditional at the "old" value - cc.log.density( U.data, old.state ) ## ## Proposal density calculated to the proposal + prop$log.density( prop$state, old.state ) ## ## Proposal density calculated to the "old" value - prop$log.density( old.state, prop$state ) ) ## ... snip ... }

Note that I have removed the call to

`pmin`

that was in the `sample.th`

code. Details are in the next section.
## Step 3: Flipping the coin

Conceptually, choosing which proposals are accepted is independent of the specific context. In the `sample.th`

sampler, we implemented it so that it depended on `P.persons`

:

## Accept or reject the proposals by flipping coins with the ## calculated acceptance probabilities acc.new <- ifelse(log(runif(P.persons))<=log.alpha.star, 1, 0)

To make it general, we replace `P.persons`

with the length of the `log.alpha.star`

value:

## Step 3: Accept proposal draws ## ... Flip coins with the calculated probabilities acc.new <- ifelse( log(runif( length(log.alpha.star) )) <= log.alpha.star, 1, 0 )

which can be included in the generic `mh.sample`

function immediately before the final `## ... snip ...`

line.

### An aside about `pmin`

Recall that I removed the call to `pmin`

that was in the `sample.th`

code. In the chapter, the call to `pmin`

paralleled the explanation of the MH algorithm that was given in the text. Specifically, the acceptance probability must be less than or equal to one. Since the fraction `alpha.star`

can be greater than 1, we use the `pmin`

function to set it equal to 1 in those cases. Or, on the log scale, the fraction `log.alpha.star`

may be greater than zero, so we use `pmin`

to set it equal to 0 in those cases.

Because we implement the coin flipping with the inverse CDF trick, it does not matter if `log.alpha.star`

has been forced to be zero for all instances where its value is greater than zero. The inequality holds either way. Therefore we remove the call to `pmin`

because it is unnecessary.

## Step 4: Updating the chain

In the `sample.th`

sampler, the updates were handled by referring to the `th`

object within the `cur`

object via the `cur$th`

syntax:

## Update the chain cur <- old cur$th <- ifelse(acc.new==1, th.star, th.old) ## Calculate acceptance probabilities (for tuning) cur$ACC$th <- old$ACC$th + mean( acc.new ) cur$ACC$th.n <- cur$ACC$th.n + 1

In R, an additional way to access the element of list is with the `[[...]]`

syntax, where the `...`

is a variable that stores the name of the list element. For example, if `param.name = 'th'`

, then `cur$th`

and `cur[[param.name]]`

both reference the same object.

The `[[...]]`

syntax can be useful when element names within a list are related to each other. For example, in the `cur$ACC`

object there are two elements `th`

and `th.n`

. If `param.name = 'th'`

, then we can construct the `th.n`

name with the `paste`

function:

name.n <- paste( param.name, ".n", sep="")

where the value of

`name.n`

will be `th.n`

. Therefore, `cur$ACC[[name.n]]`

will refer to `cur$ACC$th.n`

and `cur$ACC[[param.name]]`

will refer to `cur$ACC$th`

.
See the next section for how these ideas are implemented in the generic MH sampler.

# The final refactored code

The final generic MH sampler is then:

## A generic MH sampler mh.sample <- function( U.data, old.state, cc.log.density, proposal.function ) { ## Step 1: Call the proposal function prop <- proposal.function( old.state ) ## which returns: ## ... prop$param.name, the name of the parameter ## ... prop$state, the new state object ## ... prop$log.density, the log density of the proposal ## Step 2: Calculate the (log) acceptance probability log.alpha.star <- ( ## Complete conditional at the proposal cc.log.density( U.data, prop$state ) ## ## Complete conditional at the "old" value - cc.log.density( U.data, old.state ) ## ## Proposal density calculated to the proposal + prop$log.density( prop$state, old.state ) ## ## Proposal density calculated to the "old" value - prop$log.density( old.state, prop$state ) ) ## Step 3: Accept proposal draws ## ... Flip coins with the calculated probabilities acc.new <- ifelse( log(runif( length(log.alpha.star) ))

The person ability proposal function is:

## Proposal function for the person ability parameters prop.th.abl <- function( state ) { ## Extract parameters from the state ## ... Previously: ## th.old <- old$th ## MH.th <- old$MH$th ## ... Now: th.old <- state$th MH.th <- state$MH$th ## Draw the proposal to update the state ## ... Previously: ## P.persons <- length(th.old) ## th.star <- rnorm(P.persons,th.old,MH.th) ## ... Now: state$th <- rnorm( length( th.old ), mean=th.old, sd=MH.th ) ## Build a function to return the density of the proposal ## ... Previously: ## log.prop.star <- log(dnorm(th.star,th.old,MH.th)) ## log.prop.old <- log(dnorm(th.old,th.star,MH.th)) ## ... Now, we define the function log.density <- function( state.to, state.from ) { dnorm( x = state.to$th, mean = state.from$th, sd = MH.th, log = TRUE ) } ## Return the proposal object ## ... Note that we'll need the name of the parameter later, ## so return it too return( list( param.name='th', state=state, log.density=log.density )) }

The person ability complete conditional density is:

## Complete conditional for the person ability parameters th.abl.cc <- function( U.data, state ) { return( ## Calculate the log-likelihood term apply( log.prob(U.data, state$th, state$a, state$b),1,sum) ## And add the prior log-density term + dnorm(state$th, 0, sqrt(state$s2), log=TRUE) ) }

The refactored person ability sampler is:

## Define the person ability sampler sample.th.refactor <- function( U.data, old ) { mh.sample( U.data, old, cc.log.density = th.abl.cc, proposal.function = prop.th.abl ) }

# Testing the refactored code

In the previous post, we demonstrated that `sample.th`

was implemented correctly. Therefore, to check that the refactored sampler is working correctly, it is sufficient to show that it returns identical output. To do so, we run a short chain with `sample.th`

, run an otherwise identical chain with `sample.th.refactor`

, and then compare their output.

## Load the necessary code from this and previous posts source('http://mcmcinirt.stat.cmu.edu/setup/post-5.R') ## Check that we are using the Post 4 ## version of sample.th head(sample.th) ## ## 1 function (U.data, old) ## 2 { ## 3 th.old <- old$th ## 4 MH.th <- old$MH$th ## 5 P.persons <- length(th.old) ## 6 th.star <- rnorm(P.persons, th.old, MH.th) ## Set the seed and run the sampler for ten iterations set.seed(314159) run.post4 <- run.chain.2pl( M.burnin = 0, M.keep = 10, M.thin = 1, U.data = U, hyperpars = hyperpars, th.init = runif( length(theta.abl), min=-5, max=5 ), a.init = a.disc, b.init = b.diff, s2.init = sig2.theta, MH.th=1, MH.a=NULL, MH.b=NULL) ## Swap in the refactored sampler sample.th.post4 <- sample.th sample.th <- sample.th.refactor ## Set the seed and run the sampler for ten iterations set.seed(314159) run.refactor <- run.chain.2pl( M.burnin = 0, M.keep = 10, M.thin = 1, U.data = U, hyperpars = hyperpars, th.init = runif( length(theta.abl), min=-5, max=5 ), a.init = a.disc, b.init = b.diff, s2.init = sig2.theta, MH.th=1, MH.a=NULL, MH.b=NULL) ## Check that they return the same values all.equal( run.post4, run.refactor ) ## [1] TRUE

That the output is identical suggests that the code was refactored correctly.

**leave a comment**for the author, please follow the link and comment on their blog:

**Markov Chain Monte Carlo in Item Response Models**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.