This post demonstrates how to tune the sampler for optimal acceptance probabilities and demonstrates that the whole sampler works.
Tuning the complete sampler for acceptance rate
Tuning the sampler’s acceptance rates consists of running the sampler several times while tweaking the values of the proposal variances (
MH.b) between each run. If the acceptance rate is too high, the chain is taking steps which are too small (but usually accepting them), so the posterior will be explored slowly. If the acceptance rate is too low, the chain is attempting to take steps which are too big (and usually rejecting them), so the posterior will also be explored slowly. A rule of thumb is that the average acceptance rate should be between 20% and 60%.
We get setup by loading the necessary functions, declaring the hyper-parameter values, and setting initial values for the chain:
## Load the necessary code from this and previous posts source("http://mcmcinirt.stat.cmu.edu/setup/post-9.R") ## Define the hyper parameters to values from Post 1 hyperpars
In the code below, we show two tuning runs. We do this by running our first guess, copying the acceptance output into a comment, and then commenting out the last line of the function. Now, we replace the last line of the function with updated guesses for the tuning parameters and try again. This gives us a convenient shorthand to make several tuning runs without copying and pasting the whole code several times.
## Make the tuning run set.seed( 314159 ) run.tune
Having tuned the sampler, we make a longer run to check that it is working properly:
set.seed( 314159 ) run.E
We can check burnin by visualizing the chain with
plot( run.E.mcmc[,get.2pl.params(1,1,1,1)], density=FALSE )
It looks like the sampler burns in almost immediately. We discard the first 100 iterations to be conservative.
Parameter recovery plots
We begin by calculating the EAP estimates:
## make EAP and check values all.eap
## Person Ability check.sampler.graph( theta.abl, eap.th, desc="Person Ability", ylab="EAP Estimates", col="blue" )
Item discrimination (Equated)
## Equated Item Discrimination check.sampler.graph( a.disc, equate.2pl.a.disc(eap.a, eap.b, a.disc, b.diff), desc="Item disc. (Equated)", ylab="EAP Estimates", col="blue" )
Item difficulty (Equated)
## Equation Item Difficulty check.sampler.graph( b.diff, equate.2pl.b.diff(eap.a, eap.b, a.disc, b.diff), desc="Item difficulty (Equated)", ylab="EAP Estimates", col="blue" )
Variance of person ability
check.sigma( run.E.mcmc.conv, c(1,3) )
The parameter recovery plots give mixed signals.
On one hand, the parameter recovery plots for the item and person parameters suggest that the sampler is working well. Note that the we needed to equate the item parameters as discussed in Post 2, but after equating, the estimates from the MCMC chain agree well with the true values.
On the other hand, the parameter recovery plot for the variance of person ability suggests that something is wrong. The true value of , which is plotted in blue, is far in the tail of the MCMC posterior estimate. This is not good.
The reason for these mixed signals is the that chain has not run for long enough. Unfortunately detecting when “long enough” has happened can be somewhat difficult. For example, if we did not know the true values of the parameters, we may have been fooled in this example into believing that was, in fact, that tightly distributed.
If we run the chain for 100 times longer (by only keeping every 100th iteration of the chain, see Post 3 for details), we can see that the chain can recover the true value of :
Running the thinned chain:
######################## ## WARNING ## ## This takes awhile. ## ## ################## ## set.seed( 314159 ) run.100.thinned
Generating the graph of :
check.sigma( mcmc.thinned, xylim=c(0,4))
Note that because of the thinning the scale of the trace plot is in units of 100. The first 10 points of the traceplot represent the 1,000 iterations graphed from Run E. This gives a much different perspective on the behavior of the chain; it explores the posterior of quite slowly. If we wish to do inference on , we must be quite patient.
In the next post, we will give a way to check that the sampler has both converged and is properly exploring the whole posterior — even when we do not know the true values.
An aside: Why did the other parameters “work”?
The estimation of the person and item parameters is robust to a “bad” value of because the amount of data in this example (30 items, 2000 persons) is enough to overcome almost any (hyper-) prior. Therefore, even though the chain for the values is not near the true value, the item and person parameters are still well estimated.