# Post 7: Sampling the item parameters with generic functions

**Markov Chain Monte Carlo in Item Response Models**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In this post, we will build the samplers for the item parameters using the generic functions developed in Post 5 and Post 6. We will check that the samplers work by running them on the fake data from Post 2, visualizing the results, and checking that the true values are recovered.

# Implementing the item samplers

Recall that the refactored person ability sampler from Post 5 needs only a proposal function (`prop.th.abl`

) and a complete conditional density (`th.abl.cc`

):

## Define the person ability sampler sample.th.refactor <- function( U.data, old ) { mh.sample( U.data, old, cc.log.density = th.abl.cc, proposal.function = prop.th.abl ) }

The generic normal proposal function from Post 6 makes implementing the proposal functions for all of the parameters trivial:

prop.th.abl <- generic.normal.proposal('th') prop.a.disc <- generic.normal.proposal('a') prop.b.diff <- generic.normal.proposal('b')

The complete conditional densities require a bit more work. We discuss them below.

## Implementing the the complete conditional densities

Recall from Post 1 that the relevant complete conditional densities are:

begin{aligned}

f(theta_p|text{rest}) & propto

prod_{i=1}^I

pi_{pi}^{u_{pi}} (1 - pi_{pi})^{1-u_{pi}}

f_text{Normal}(theta_p| 0,sigma_theta^2) ~, \

% f(sigma^2_theta|text{rest}) & propto

% f_text{Inverse-Gamma}left(sigma^2_theta left|

% alpha_theta + frac{P}{2},

% beta_theta + frac{1}{2} sum_{p=1}^P theta^2_p right.

%right) \

f(a_i|text{rest}) & propto

prod_{p=1}^P

pi_{pi}^{u_{pi}} (1 - pi_{pi})^{1-u_{pi}}

f_text{Log-Normal}(a_i| mu_a,sigma_a^2) ~, \

f(b_i|text{rest}) & propto

prod_{p=1}^P

pi_{pi}^{u_{pi}} (1 - pi_{pi})^{1-u_{pi}}

f_text{Normal}(b_i| 0,sigma_b^2) ,

end{aligned}

where stands in for the conditioning variables, represents the density of the random variable named (which has a distribution), and is defined as:

begin{equation}

ln{frac{pi_{pi}}{1+pi_{pi}}} = a_i ( theta_p - b_i) label{eq:pipi} quad.

end{equation}

Each of these complete conditionals contain a likelihood term which is either multiplied across the items in the case of the person ability parameter or multiplied across the persons by the item parameters. As discussed in Post 4 and Post 5 we implement this by calculating a matrix of the log-likelihood terms and then collapsing that matrix by column for the person ability parameters. For the item parameters, we do much the same, except by row.

The full implementation is as follows.

### Person ability complete conditional density

From Post 5, with explanation of what the code does in in Post 4:

## Complete conditional for the person ability parameters th.abl.cc <- function( U.data, state ) { return( ## Calculate the log-likelihood term apply( log.prob(U.data, state$th, state$a, state$b),1,sum) ## And add the prior log-density term + dnorm(state$th, 0, sqrt(state$s2), log=TRUE) ) }

### Item discrimination complete conditional density

The item discrimination complete conditional is very similar. It sums over the persons instead of the item for the log-likelihood term. Since its prior is a log normal, it uses the `dlnorm`

function for its prior. See Post 1 for details of why the log normal was chosen.

## Complete conditional for the item discrimination parameters a.disc.cc <- function( U.data, state ) { return( ## Calculate the log-likelihood term apply(log.prob(U.data, state$th, state$a, state$b),2,sum) ## And add the prior log-density term + dlnorm(state$a, state$hyperpar$mu.a, sqrt(state$hyperpar$s2.a), log=TRUE) ) }

### Item difficulty complete conditional density

The item difficulty parameter is the same as the item discrimination parameter, except it has a normal, instead of log-normal prior:

## Complete conditional for the item discrimination parameters b.diff.cc <- function( U.data, state ) { return( ## Calculate the log-likelihood term apply(log.prob(U.data, state$th, state$a, state$b),2,sum) ## And add the prior log-density term + dnorm(state$b, 0, sqrt(state$hyperpar$s2.b), log=TRUE) ) }

## Implementing the samplers

Now that the proposal functions and complete conditional densities are implemented, we can use the generic Metropolis-Hastings sampler to define the complete conditional samplers:

## Define the person ability sampler sample.th <- function( U.data, old ) { mh.sample( U.data, old, cc.log.density = th.abl.cc, proposal.function = prop.th.abl ) } ## Define the item discrimination sampler sample.a <- function( U.data, old ) { mh.sample( U.data, old, cc.log.density = a.disc.cc, proposal.function = prop.a.disc ) } ## Define the item difficulty sampler sample.b <- function( U.data, old ) { mh.sample( U.data, old, cc.log.density = b.diff.cc, proposal.function = prop.b.diff ) }

# Testing that the samplers work

To test that the item samplers work, we run a chain, visualize it, and check that the item and person parameters are recovered properly.

## Running the chain

Running the chain follows the same pattern as before, where the call to `source('.../setup/post-7.R')`

loads the necessary functions to use the refactored samplers.

## Load the necessary code from this and previous posts source("http://mcmcinirt.stat.cmu.edu/setup/post-7.R") ## Set the seed to keep results reproducible set.seed(314159) ## Run the sampler with overdispersed theta.abl, ## a.disc, and b.diff values. Keep sig2.theta ## at its true value. run.C <- run.chain.2pl( ## No burnin, keep 1000 iterations, do not thin M.burnin = 0, M.keep = 1000, M.thin = 1, ## Use the fake data simulated in Post #2 U.data = U, ## Pass in the hyperpriors from Post #1 hyperpars = hyperpars, ## ## Generate starting values uniformly from -5 to 5 th.init = runif( length(theta.abl), min=-5, max=5 ), ## ## Generate starting values uniformly from 1/3 to 3 a.init = runif( length(a.disc), min=1/3, max=3 ), ## ## Generate starting values uniformly from -5 to 5 b.init = runif( length(b.diff), min=-5, max=5 ), ## ## Set the starting value to the truth from Post #2 s2.init = sig2.theta, ## ## Set proposal variances. ## ( I will explain how to do this in Post 9. ## For now, just use these values. ) MH.th=.85, MH.a=.15, MH.b=.15, verbose=TRUE) ## ## Average acceptance rates: ## theta.abl: 0.525 ## a.disc: 0.341 ## b.diff: 0.456

Note that the acceptance rates reported by the sampler are between 30% and 55%. This is in the "good enough" range between 20% and 60%. I achieved this by changing the values for the proposal variances until they were just right. The tuning runs are not shown. Tuning will be covered in Post 9.

## Visualizing the chain

First we convert to a `coda`

object:

require(coda) run.C.mcmc <- mcmc( t(run.C) )

and then look at one of each of the parameters:

plot( run.C.mcmc[, get.2pl.params(1,1,1,1)], density=FALSE, smooth=TRUE ) |

From the non-parametric smoother, it looks like the chain is done burning it at around 200 iterations or so. We can examine a few more trace plots (not shown) to verify this is the case:

## 9 person ability parameters plot( run.C.mcmc[, get.2pl.params(1:9,NULL,NULL,NULL)], density=FALSE, smooth=TRUE, ylim=c(-5,5) ) ## 9 item discrimination parameters plot( run.C.mcmc[, get.2pl.params(NULL,1:9,NULL,NULL)], density=FALSE, smooth=TRUE, ylim=c(-2,2) ) ## 9 item difficulty parameters plot( run.C.mcmc[, get.2pl.params(NULL,NULL,1:9,NULL)], density=FALSE, smooth=TRUE, ylim=c(-4,-1) )

## Recovering parameters

To estimate parameters we use only the converged part of the chain. In this example, we calculate EAP estimates using only iterations after 200:

all.eap <- apply( run.C.mcmc[200:1000,], MARGIN=2, mean )

We then visually compare the EAP estimates with the true parameter values:

## Person Ability check.sampler.graph( theta.abl, all.eap[ get.2pl.params(1:P.persons,NULL,NULL,NULL)], desc="Person Ability", ylab="EAP Estimates", col="blue" ) |

## Item discrimination check.sampler.graph( a.disc, all.eap[ get.2pl.params(NULL,1:I.items,NULL,NULL)], desc="Item discrimination", ylab="EAP Estimates", col="blue" ) |

## Item difficulty check.sampler.graph( b.diff, all.eap[ get.2pl.params(NULL,NULL,1:I.items,NULL)], desc="Item difficulty", ylab="EAP Estimates", col="blue" ) |

## Conclusion

By refactoring the person ability parameter code from Post 4 in Post 5 and Post 6, we were able to quickly put together a Metropolis-Hastings within Gibbs sampler. Additionally, the sampler should be free of bugs given the checks that we have implemented along the way.

In the next post, we complete the MH within Gibbs sampler by implementing a Gibbs step for the variance of the person ability parameters.

**leave a comment**for the author, please follow the link and comment on their blog:

**Markov Chain Monte Carlo in Item Response Models**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.