Ordinal data with JAGS
Last week I had a look at the standard R routines for estimating models for ordinal data. This week I want to have a look at JAGS for examining the same data. To be honest, most of it is taking an example (inhaler) and removing code. To my surprise this was almost as difficult as adding complexity to models. I did add some output from JAGS. The data, as before, is the cheese data.
Data preparation and model
The data is the same as last week, so I will pick up at the already created dataframe cheese2. One new item is added: a dataframe difp, which is used to index the differences between cheese means. The model is a simplified version of inhaler. Note that a lot of the pre-processing of data has been removed from the JAGS example and placed into R. On the other hand, to minimize the amount of output, some post-processing has been added to the model rather than done in R. I guess that's a matter of taste. Other than that it is a fairly straightforward JAGS run with the usual preparation steps.
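For completeness, a minimal sketch of how such a cheese2 dataframe could be constructed is shown below. It assumes a count table cheese with columns Cheese, Response and N (one row per cheese/response combination); those names are an assumption on my part, the actual construction is in last week's post.
# sketch only: expand a count table to one row per respondent
cheese2 <- cheese[rep(seq_len(nrow(cheese)), cheese$N),
    c('Cheese', 'Response')]
cheese2$Cheese <- factor(cheese2$Cheese)
# ordered factor with the nine response categories
cheese2$FResponse <- ordered(cheese2$Response, levels = 1:9)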
ordmodel <- function() {
  for (i in 1:N) {
    for (j in 1:Ncut) {
      # cumulative probability of a response above category j
      logit(QQ[i,j]) <- -(a[j] - mu[cheese[i]])
    }
    p[i,1] <- 1 - QQ[i,1]
    for (j in 2:Ncut) {
      p[i,j] <- QQ[i,j-1] - QQ[i,j]
    }
    p[i,(Ncut+1)] <- QQ[i,Ncut]
    response[i] ~ dcat(p[i,1:(Ncut+1)])
  }
  # fixed effects: logistic mean per cheese
  for (g in 2:G) {
    mu[g] ~ dnorm(0, 1.0E-6)
  }
  mu[1] <- 0
  # ordered cut points for the underlying continuous latent variable
  a <- sort(a0)
  for (i in 1:Ncut) {
    a0[i] ~ dnorm(0, 1.0E-6)
  }
  # top two box: probability of a response in the two highest categories
  for (g in 1:G) {
    logit(ttb[g]) <- -(a[Ncut-1] - mu[g])
  }
  # differences between cheese means
  for (i in 1:ndifp) {
    dif[i] <- mu[difpa[i]] - mu[difpb[i]]
  }
}
# all pairwise differences between cheese means (a > b gives six pairs)
difp <- expand.grid(a=1:4, b=1:4)
difp <- difp[difp$a > difp$b,]
datain <- list(response=cheese2$Response, N=nrow(cheese2),
    cheese=c(1:4)[cheese2$Cheese], G=nlevels(cheese2$Cheese),
    Ncut=nlevels(cheese2$FResponse)-1,
    difpa=difp$a, difpb=difp$b, ndifp=nrow(difp))
params <- c('mu','ttb','dif')
inits <- function() {
  # mu[1] is fixed to 0 in the model, so it gets no initial value
  list(mu = c(NA, rnorm(3, 0, 1)),
      a0 = rnorm(8, 0, 1))
}
library(R2jags)
jagsfit <- jags(datain, model=ordmodel, inits=inits, parameters=params,
    progress.bar="gui", n.iter=10000)
Results
The results, happily, are similar to those obtained via polr and clm. All differences seem well within the variation expected from sampling rather than a deterministic calculation. MCMCoprobit had completely different results; in hindsight I think those were originally incorrectly estimated, and I have corrected the post-processing in last week's post. Note that MCMCoprobit uses a probit rather than a logit link; the practical difference is minimal, much less than the variation in the parameters.
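For reference, the frequentist fits from last week can be reproduced along these lines (just a sketch; the exact calls and post-processing are in last week's post):
library(MASS)
library(ordinal)
# proportional odds fits; the Cheese coefficients are on the same
# logit scale as mu[2:4] in the JAGS model
polr(FResponse ~ Cheese, data = cheese2)
clm(FResponse ~ Cheese, data = cheese2)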
Legend for output:
dif: differences between cheese parameters.
mu: mean cheese values as used within the model.
ttb: top two box, the probability of a response in the two highest categories.
jagsfit
Inference for Bugs model at "C:/Users/…/Local/Temp/RtmpwPgnSx/model4d41cdc653a.txt", fit using jags,
3 chains, each with 10000 iterations (first 5000 discarded), n.thin = 5
n.sims = 3000 iterations saved
mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff
dif[1] -3.408 0.420 -4.254 -3.691 -3.404 -3.111 -2.606 1.001 2500
dif[2] -1.729 0.368 -2.444 -1.978 -1.733 -1.478 -1.003 1.001 2300
dif[3] 1.669 0.387 0.943 1.411 1.658 1.920 2.453 1.002 1300
dif[4] 1.679 0.381 0.962 1.420 1.663 1.952 2.420 1.001 3000
dif[5] 5.077 0.481 4.164 4.740 5.063 5.405 6.008 1.001 3000
dif[6] 3.398 0.423 2.561 3.115 3.401 3.669 4.235 1.001 3000
mu[1] 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 1
mu[2] -3.408 0.420 -4.254 -3.691 -3.404 -3.111 -2.606 1.001 2500
mu[3] -1.729 0.368 -2.444 -1.978 -1.733 -1.478 -1.003 1.001 2300
mu[4] 1.669 0.387 0.943 1.411 1.658 1.920 2.453 1.002 1300
ttb[1] 0.173 0.043 0.100 0.143 0.170 0.199 0.271 1.003 800
ttb[2] 0.007 0.003 0.003 0.005 0.007 0.009 0.015 1.002 1700
ttb[3] 0.037 0.013 0.017 0.028 0.035 0.044 0.066 1.002 1600
ttb[4] 0.519 0.068 0.386 0.472 0.521 0.565 0.650 1.002 1400
deviance 722.375 4.703 715.176 718.981 721.657 725.071 733.254 1.001 3000
For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
DIC info (using the rule, pD = var(deviance)/2)
pD = 11.1 and DIC = 733.4
DIC is an estimate of expected predictive error (lower deviance is better).
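The printed summary is all I needed here, but the individual posterior draws are available in the fit object for further summaries. A small sketch (sims.list is the R2jags container for the saved draws; the column index follows the dif ordering above):
# posterior draws of the pairwise differences, one column per dif[]
difsims <- jagsfit$BUGSoutput$sims.list$dif
# e.g. posterior probability that cheese 4 has a higher mean than cheese 1
mean(difsims[, 3] > 0)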