Post 4: Sampling the person ability parameters
The previous post outlined the general strategy of writing a MH within Gibbs sampler by breaking the code into two levels: a high-level shell and a series of lower-level samplers that do the actual work.
This post discusses the first of the lower-level samplers, the sampler for the person ability parameters. The chapter gives a complete treatment of how to write the sampler; please read it for the details. Here, we summarize the development of the sampler, check that the sampler code is correct, and give two examples of how bugs might be detected. The final bug will motivate the next post, where we will refactor this specific person ability sampler into a general MH sampler.
The person ability sampler
The person ability sampler will use the Metropolis-Hastings (MH) algorithm to sample from the complete conditional. See the chapter for details on how (and why!) to write the code presented in this section.
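As a reminder (this summary is ours; the chapter derives it carefully), one MH update with a symmetric random-walk proposal draws a candidate $\theta_p^*$ and accepts it with probability
\begin{equation*}
\alpha = \min\left\{ 1, \frac{f(\theta_p^*|\text{rest})}{f(\theta_p|\text{rest})} \right\},
\end{equation*}
so only a ratio of complete conditional values is ever needed, and the unknown normalizing constant of the complete conditional cancels.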
In a previous post we derived the complete conditional as:
\begin{equation}
f(\theta_p|\text{rest}) \propto
\prod_{i=1}^I
\pi_{pi}^{u_{pi}} (1 - \pi_{pi})^{1-u_{pi}}
\, f_\text{Normal}(\theta_p| 0, \sigma_\theta^2) \label{eq:thetacc}
\end{equation}
where $\pi_{pi}$ is the probability of person $p$ answering item $i$ correctly. $\pi_{pi}$ depends on both the person and item parameters, and is defined by
\begin{equation*}
\ln\frac{\pi_{pi}}{1-\pi_{pi}} = a_i ( \theta_p - b_i ) .
\end{equation*}
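In R, the probability $\pi_{pi}$ can be computed from the right-hand side with the built-in inverse logit function plogis, which avoids writing exp(...)/(1 + exp(...)) by hand. A small illustration with made-up parameter values:

## Probability that a person with ability theta answers an
## item with discrimination a and difficulty b correctly
a <- 1.2; b <- 0.5; theta <- -0.3   ## illustrative values
plogis( a * (theta - b) )           ## approximately 0.277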
The likelihood term in Equation \ref{eq:thetacc} must be calculated twice for each iteration of the MH algorithm: first for the proposed values of $\theta_p$ and second for the "old" values of $\theta_p$. To minimize code duplication, we define the log.prob function to calculate it on the log scale:
## log.prob
##
## Output : A matrix such that the
##   column sums are item log-likelihood values, and
##   row sums are person log-likelihood values.
##
## (The body below is a sketch consistent with the 2PL model
##  above; see the chapter for the full version.)
log.prob <- function(U.data, th, a, b) {
    ## a_i * (theta_p - b_i) for every person x item pair
    logit.pi <- t( a * t( outer(th, b, '-') ) )
    ## log(pi) and log(1 - pi), computed stably on the log scale
    log.pi    <- plogis(  logit.pi, log.p = TRUE )
    log.1m.pi <- plogis( -logit.pi, log.p = TRUE )
    ## Bernoulli log-likelihood of each observed response
    U.data * log.pi + (1 - U.data) * log.1m.pi
}
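As a quick illustration of the output convention, the row and column sums of the returned matrix give the person and item log-likelihoods. (The names U, a.true, and b.true are placeholders for the simulated response matrix and true item parameters from the earlier posts; theta.abl is the vector of simulated abilities used throughout.)

## One log-likelihood value per person ...
person.ll <- rowSums( log.prob(U, theta.abl, a.true, b.true) )
## ... and one per item
item.ll   <- colSums( log.prob(U, theta.abl, a.true, b.true) )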
The chapter gives details on how to implement the MH sampler. A more highly commented version of the code is given here:
## sample.th
##
## Draws a new value of each theta_p with one Metropolis-Hastings
## step against its complete conditional.
## (Sketch version; see the chapter for the fully commented code,
##  which also tracks the acceptance rate that the sampler reports.
##  th, a, b are the current parameter values, s2 the prior
##  variance of theta, and MH.th the proposal standard deviation.)
sample.th <- function(U.data, th, a, b, s2, MH.th) {
    P <- length(th)
    ## Propose new values with a random walk centered at the
    ## current values (a symmetric proposal)
    th.star <- rnorm(P, mean = th, sd = MH.th)
    ## Log of the complete conditional of each theta_p:
    ## person log-likelihood plus log prior
    log.cc <- function(th.val) {
        rowSums( log.prob(U.data, th.val, a, b) ) +
            dnorm( th.val, mean = 0, sd = sqrt(s2), log = TRUE )
    }
    ## Accept each proposal with probability min(1, alpha),
    ## working on the log scale
    log.alpha <- log.cc(th.star) - log.cc(th)
    acc <- log( runif(P) ) < log.alpha
    ## Keep accepted proposals; retain old values otherwise
    ifelse(acc, th.star, th)
}
Please see the chapter for additional details.
Running the sampler
We can run the sampler as follows:
## Load the necessary code from this and previous posts
source("http://mcmcinirt.stat.cmu.edu/setup/post-4.R")
## Set the seed to keep results reproducible
set.seed(314159)
## Run the sampler with overdispersed theta.abl
## values and the true, simulated values for the other
## parameters.
## (Illustrative call: the shell function and its argument
##  names, and the tuning value, stand in for the versions
##  defined in the previous posts.)
run.B <- run.chain.2pl(
    M.T = 1000,                 ## number of iterations to keep
    U.data = U,                 ## simulated response matrix
    th.init = theta.abl.init,   ## overdispersed starting values
    a.init = a.true, b.init = b.true, s2.init = sig2.true,
    MH.th = 1 )                 ## proposal SD for theta
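The diagnostics below expect the draws as a coda mcmc object named run.B.mcmc. If the shell returns a plain matrix of draws (one row per iteration, one column per parameter), the conversion is a one-liner; this may instead already be handled by the sourced setup code, which loads the coda library:

## Convert the raw draws into a coda object for diagnostics
run.B.mcmc <- mcmc( run.B )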
Note that the acceptance rate reported by the sampler is about 48%, which is near the optimal rate for one-dimensional MH updates (roughly 44%). I achieved this by changing the value of MH.th until it was "just right". The tuning runs are not shown (see Post 9 for how to tune the sampler).
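If you want to double-check the reported rate after the fact, one crude but handy trick (not from the chapter) is to count how often a chain actually moves. A rejected MH proposal repeats the previous value exactly, so any move signals an acceptance:

## Empirical acceptance rate of the theta_1 chain
th1.draws <- as.numeric( run.B.mcmc[, 'theta.abl 1'] )
mean( diff(th1.draws) != 0 )   ## approximately 0.48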
Testing the sampler
Using the coda library and the get.2pl.params function from Post 3 (both of which were loaded by the source("http://mcmcinirt.stat.cmu.edu/setup/post-4.R") line in the code above), we can quickly visualize the trace plots for $\theta_1$ through $\theta_9$ using the default plot function:
## N.B. ylim specifies the scale on the y-axis.
plot( run.B.mcmc[, get.2pl.params(1:9, NULL, NULL, NULL)],
      density = FALSE, ylim = c(-5, 5), smooth = TRUE )
From the trace plots, it appears that each chain quickly converges to the mode from its starting value. We can test that the chains are getting the right answers by calculating the average values of the chains (Expected A Posteriori estimates; EAP estimates),
## EAP estimate of each person ability: the average of the
## corresponding chain. (Sketch: P.persons is a placeholder
## for the number of persons in the simulated data.)
theta.eap.B <- apply(
    run.B.mcmc[, get.2pl.params(1:P.persons, NULL, NULL, NULL)],
    MARGIN = 2, FUN = mean )
and graphically comparing them with the simulated true values:
To generate the graph, we define a special visualization function:
## We'll be doing these checks a lot, so let's make a function.
## (Sketch version: a parameter recovery scatter plot with a
##  y = x reference line, next to a comparison of empirical CDFs.)
check.sampler.graph <- function(true.values, estimated.values,
                                desc, est.desc, col) {
    par( mfrow = c(1, 2) )
    ## Parameter recovery: true values vs. estimates
    plot( true.values, estimated.values, col = col,
          xlab = paste('True', desc), ylab = est.desc,
          main = 'Parameter recovery' )
    abline(0, 1, col = 'grey')   ## perfect recovery line
    ## Empirical CDFs of the true and estimated values
    plot( ecdf(true.values), main = 'Empirical CDFs' )
    plot( ecdf(estimated.values), col = col, add = TRUE )
    par( mfrow = c(1, 1) )
}
and call it for this specific run of the sampler:
## Run the function for this example:
check.sampler.graph( theta.abl, theta.eap.B,
                     'Person Ability', 'EAP estimates', 'blue' )
## Parameter recovery looks a little biased
## Empirical CDF looks very close
Both the parameter recovery scatter plot and the empirical CDF plot suggest that the sampler is properly recovering the simulated values. This is good news; it suggests that we implemented the sampler correctly.
What if there is a bug in the code?
Bug 1: Good sampler properties, but wrong answers
In my experience, a common bug in writing a sampler is not calculating the log-likelihood correctly. Here we introduce a bug in the calculation of the log-likelihood to see what happens with the sampler.
## Introduce a bug into the calculation of the log-likelihood.
## (Illustrative bug: the discrimination parameter a_i is
##  accidentally dropped, so the wrong model is fit.)
log.prob.buggy <- function(U.data, th, a, b) {
    logit.pi <- outer(th, b, '-')   ## BUG: should be a_i * (theta_p - b_i)
    log.pi    <- plogis(  logit.pi, log.p = TRUE )
    log.1m.pi <- plogis( -logit.pi, log.p = TRUE )
    U.data * log.pi + (1 - U.data) * log.1m.pi
}
Now we swap in the buggy log.prob function, make a buggy run of the sampler, and then swap the buggy log.prob function back out for the correct one:
## Set the seed to keep results reproducible
set.seed(314159)
## Swap in the buggy log likelihood
log.prob.correct <- log.prob
log.prob <- log.prob.buggy
## Make a buggy run (same illustrative call as run.B)
run.B.buggy <- run.chain.2pl(
    M.T = 1000, U.data = U,
    th.init = theta.abl.init,
    a.init = a.true, b.init = b.true, s2.init = sig2.true,
    MH.th = 1 )
run.B.buggy.mcmc <- mcmc( run.B.buggy )
## Swap the correct log likelihood back in
log.prob <- log.prob.correct
Now, the first nine person ability parameters still converge quickly to their modes:
plot( run.B.buggy.mcmc[, get.2pl.params(1:9, NULL, NULL, NULL)],
      density = FALSE, ylim = c(-5, 5), smooth = TRUE )
However, those modes do not correspond to the true values:
## Calculate buggy EAP estimates
## (same recipe as theta.eap.B, applied to the buggy chain)
theta.eap.B.buggy <- apply(
    run.B.buggy.mcmc[, get.2pl.params(1:P.persons, NULL, NULL, NULL)],
    MARGIN = 2, FUN = mean )
## Compare with the simulated true values
check.sampler.graph( theta.abl, theta.eap.B.buggy,
                     'Person Ability', 'EAP estimates', 'purple' )
Lesson learned from bug #1
Sometimes the sampler will converge well to a global mode, but that mode will be the wrong answer because there is an error in the way that the likelihood (or prior) is being calculated. Only a check with simulated data — where the true values are known — will be able to spot this bug.
Bug 2: Terrible sampler properties, but correct answers
In my experience, a common bug in writing a sampler is making a small error in how the Metropolis-Hastings algorithm is implemented. Here we introduce a bug in the calculation of the acceptance probability:
## Introduce a bug into the Metropolis-Hastings acceptance step.
## (Illustrative bug: the acceptance probability is accidentally
##  cut in half. As discussed below, the stationary distribution
##  is untouched, but proposals are wasted.)
sample.th.buggy <- function(U.data, th, a, b, s2, MH.th) {
    P <- length(th)
    th.star <- rnorm(P, mean = th, sd = MH.th)
    log.cc <- function(th.val) {
        rowSums( log.prob(U.data, th.val, a, b) ) +
            dnorm( th.val, mean = 0, sd = sqrt(s2), log = TRUE )
    }
    log.alpha <- log.cc(th.star) - log.cc(th)
    ## BUG: the '0.5 *' should not be here; it halves the
    ## probability of accepting every proposal
    acc <- runif(P) < 0.5 * pmin( 1, exp(log.alpha) )
    ifelse(acc, th.star, th)
}
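Why does this buggy sampler still get the correct answers? With the illustrative bug above, the usual Metropolis acceptance probability $\min\{1, \alpha\}$ is scaled by the same constant $c = 1/2$ in both directions of every possible move, so the constant cancels from the two sides of the detailed balance condition. The chain therefore still has the complete conditional as its stationary distribution; it just rejects proposals that it should have accepted, which is where the inefficiency documented below comes from.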
Since the bug involves the acceptance rate, we must adjust the variance of the Metropolis-Hastings proposal to get reasonable acceptance rates:
## Set the seed to keep results reproducible
set.seed(314159)
## Swap in the buggy sampler
sample.th.correct <- sample.th
sample.th <- sample.th.buggy
## Make a buggy run, with MH.th re-tuned so that the acceptance
## rate is again near 0.44 (illustrative call, as before)
run.B.buggy.th <- run.chain.2pl(
    M.T = 1000, U.data = U,
    th.init = theta.abl.init,
    a.init = a.true, b.init = b.true, s2.init = sig2.true,
    MH.th = 0.5 )
run.B.buggy.th.mcmc <- mcmc( run.B.buggy.th )
## Swap the correct sampler back in
sample.th <- sample.th.correct
Again, the first nine person ability parameters still converge quickly to their modes:
plot( run.B.buggy.th.mcmc[, get.2pl.params(1:9, NULL, NULL, NULL)],
      density = FALSE, ylim = c(-5, 5), smooth = TRUE )
And the chains even recover the true values:
## Calculate buggy EAP estimates
## (same recipe as before, applied to the buggy chain)
theta.eap.B.buggy.th <- apply(
    run.B.buggy.th.mcmc[, get.2pl.params(1:P.persons, NULL, NULL, NULL)],
    MARGIN = 2, FUN = mean )
## Compare with the simulated true values
check.sampler.graph( theta.abl, theta.eap.B.buggy.th,
                     'Person Ability', 'EAP estimates', 'purple' )
So, if the chains converge quickly to their modes and recover their true values, in what sense is this a bug?
This bug is about efficiency. Note that the buggy chain has much higher auto-correlation than the correct chain, even though both acceptance rates are around 0.44.
par( mfrow = c(1, 2) )
acf( run.B.mcmc[, 'theta.abl 1'],
     main = "ACF: correct sampler", col = 'blue' )
acf( run.B.buggy.th.mcmc[, 'theta.abl 1'],
     main = "ACF: buggy sampler", col = 'purple' )
par( mfrow = c(1, 1) )
Since the autocorrelation is so high in the buggy sampler, there is less information about the posterior distribution in the chain. One metric that is both simple to calculate and to interpret is the effective sample size. This is (roughly) the number of draws from an independent sampler that would have the same amount of information as the (in this case, 1,000) draws from the correlated sampler. In R, we can calculate an effective sample size for each $\theta_p$ and examine boxplots as follows:
## Effective sample size of the correct sampler
ess.run.B <- effectiveSize(
    run.B.mcmc[, get.2pl.params(1:P.persons, NULL, NULL, NULL)] )
## Effective sample size of the buggy sampler
ess.run.B.buggy.th <- effectiveSize(
    run.B.buggy.th.mcmc[, get.2pl.params(1:P.persons, NULL, NULL, NULL)] )
## Compare the two with side-by-side boxplots
boxplot( list( correct = ess.run.B, buggy = ess.run.B.buggy.th ),
         ylab = 'Effective sample size', col = c('blue', 'purple') )
Note that the 1,000 draws from the correct sampler are worth about 200 draws from a hypothetical independent sampler, while the 1,000 draws from the buggy sampler are worth only about 50. Therefore, this bug does affect our ability to draw inferences — mainly because our standard error of estimation will be quite a bit worse.
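For reference (a standard result, not specific to the chapter), the effective sample size of $M$ correlated draws is
\begin{equation*}
\text{ESS} = \frac{M}{1 + 2 \sum_{k=1}^{\infty} \rho_k} ,
\end{equation*}
where $\rho_k$ is the lag-$k$ autocorrelation of the chain. The much larger autocorrelations of the buggy chain, visible in the ACF plots above, are exactly what drives its effective sample size down to roughly 50, versus roughly 200 for the correct sampler.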
Lesson learned from bug #2
Sometimes a sampler will appear to recover the true values, but the sampler will be very inefficient. This could be due to a mistake in the algorithm. Unfortunately, detecting these bugs is quite difficult.
In the next post, we give a partial solution to avoiding such bugs.