Ordinal data with JAGS

March 25, 2013

(This article was first published on Wiekvoet, and kindly contributed to R-bloggers)

Last week I had a look at the standard R routines for estimating models for ordinal data. This week, I want to have a look at JAGS for examining the same data. To be honest, most of the work was taking an existing example (inhaler) and removing code; to my surprise, this was almost as difficult as adding complexity to a model. I did add some output computed within JAGS. The data, as before, is the cheese data.

Data preparation and model

The data is the same as last week, so I will pick up at the prepared dataframe cheese2. One new item is a dataframe difp, which is used to index the pairwise differences between cheese means. The model is a simplified version of the inhaler example. Note that a lot of the pre-processing in the JAGS example is removed from the model and done in R instead. On the other hand, to minimize the amount of output, some post-processing is added in the model rather than done in R. I guess that's a matter of taste.
Other than that it is a fairly straightforward JAGS run with the usual preparation steps.
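Since the post picks up at cheese2, the dataframe itself is not constructed here. For completeness, a minimal sketch of how such a dataframe could be built from the classic cheese-tasting counts (McCullagh, 1980); the counts and construction below are my reconstruction, not code from the original post:
# one row per individual response: 4 cheeses, 9 response categories
counts <- matrix(c(
    0, 0,  1,  7,  8, 8, 19,  8,  1,   # cheese A
    6, 9, 12, 11,  7, 6,  1,  0,  0,   # cheese B
    1, 1,  6,  8, 23, 7,  5,  1,  0,   # cheese C
    0, 0,  0,  1,  3, 7, 14, 16, 11),  # cheese D
  nrow=4, byrow=TRUE)
cheese2 <- data.frame(
  Cheese   = factor(rep(LETTERS[1:4], rowSums(counts))),
  Response = unlist(lapply(1:4, function(i) rep(1:9, counts[i,]))))
cheese2$FResponse <- ordered(cheese2$Response)  # 9-level ordered factor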
ordmodel <- function() {
  for (i in 1:N) {
    for (j in 1:Ncut) {
# Cumulative probability of a response above category j
      logit(QQ[i,j]) <- -(a[j] - mu[cheese[i]] )
    }
    p[i,1] <- 1 - QQ[i,1]
    for (j in 2:Ncut) {
      p[i,j] <- QQ[i,j-1] - QQ[i,j]
    }
    p[i,(Ncut+1)] <- QQ[i,Ncut]
    response[i] ~ dcat(p[i,1:(Ncut+1)])
  }
# Fixed effects
  for (g in 2:G) {
    # logistic mean for cheese g
    mu[g] ~ dnorm(0, 1.0E-06)
  }
  mu[1] <- 0  # the first cheese is the reference
# ordered cut points for underlying continuous latent variable  
  a <- sort(a0)
  for(i in 1:(Ncut)) {
    a0[i] ~ dnorm(0, 1.0E-6)
  } 
# top two box: probability of a response in the two highest categories
  for (g in 1:G) {
    logit(ttb[g]) <- -(a[Ncut-1] - mu[g] )
  }
# Pairwise differences between cheese means
  for (i in 1:ndifp) {
    dif[i] <- mu[difpa[i]]-mu[difpb[i]]
  }
}
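Outside the original post, a small plain-R sketch (with made-up values) may help to see what the QQ and p lines compute: the cumulative logits give P(response > j), and successive differences give category probabilities that sum to one. The sort() on a0 is what keeps the cut points ordered.
# plain-R illustration of the model's probability construction (made-up values)
a  <- sort(rnorm(5))                           # ordered cut points, as sort(a0)
mu <- 0.5                                      # one group mean
QQ <- plogis(mu - a)                           # logit(QQ[j]) <- -(a[j] - mu)
p  <- c(1 - QQ[1], -diff(QQ), QQ[length(QQ)])  # category probabilities
sum(p)                                         # exactly 1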

# all ordered pairs of cheeses, to index the pairwise differences
difp <- expand.grid(a=1:4, b=1:4)
difp <- difp[difp$a > difp$b, ]

library(R2jags)

datain <- list(response=cheese2$Response, N=nrow(cheese2),
    cheese=c(1:4)[cheese2$Cheese],      # factor to integer index
    G=nlevels(cheese2$Cheese),
    Ncut=nlevels(cheese2$FResponse)-1,  # 9 categories, 8 cut points
    difpa=difp$a, difpb=difp$b, ndifp=nrow(difp))
params <- c('mu','ttb','dif')
inits <- function() {
  list(mu = c(NA, rnorm(3,0,1)),        # mu[1] is fixed at 0 in the model
      a0 = rnorm(8,0,1))
}

jagsfit <- jags(datain, model=ordmodel, inits=inits, parameters=params,
    progress.bar="gui", n.iter=10000)
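Not in the original post, but for a quick look at convergence one could convert the fit to a coda object (R2jags supplies the conversion) and use the standard diagnostics:
library(coda)
jagsmc <- as.mcmc(jagsfit)  # mcmc.list, one element per chain
summary(jagsmc)             # posterior summaries per saved parameter
traceplot(jagsfit)          # visual check that the three chains mix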

Results

The results, happily, are similar to those obtained via polr and clm. All differences seem well within the variation expected from MCMC sampling rather than a deterministic calculation. MCMCoprobit had completely different results; in hindsight I think those were incorrectly post-processed originally, and I have corrected the post-processing in last week's post. Note that MCMCoprobit uses a probit rather than a logit link, but the practical difference is minimal, much smaller than the variation in the parameters.
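For reference, the polr fit mentioned above could be reproduced along these lines (a sketch, assuming cheese2 as reconstructed earlier); with polr's parameterization the Cheese coefficients are directly comparable to mu[2] through mu[4]:
library(MASS)
polrfit <- polr(FResponse ~ Cheese, data=cheese2)
coef(polrfit)  # CheeseB, CheeseC, CheeseD: compare with mu[2], mu[3], mu[4]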

Legend for the output:
dif: pairwise differences between cheese means
mu: cheese means as used within the model
ttb: top-two-box probability per cheese
jagsfit
Inference for Bugs model at "C:/Users/.../Local/Temp/RtmpwPgnSx/model4d41cdc653a.txt", fit using jags,
 3 chains, each with 10000 iterations (first 5000 discarded), n.thin = 5
 n.sims = 3000 iterations saved
         mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
dif[1]    -3.408   0.420  -4.254  -3.691  -3.404  -3.111  -2.606 1.001  2500
dif[2]    -1.729   0.368  -2.444  -1.978  -1.733  -1.478  -1.003 1.001  2300
dif[3]     1.669   0.387   0.943   1.411   1.658   1.920   2.453 1.002  1300
dif[4]     1.679   0.381   0.962   1.420   1.663   1.952   2.420 1.001  3000
dif[5]     5.077   0.481   4.164   4.740   5.063   5.405   6.008 1.001  3000
dif[6]     3.398   0.423   2.561   3.115   3.401   3.669   4.235 1.001  3000
mu[1]      0.000   0.000   0.000   0.000   0.000   0.000   0.000 1.000     1
mu[2]     -3.408   0.420  -4.254  -3.691  -3.404  -3.111  -2.606 1.001  2500
mu[3]     -1.729   0.368  -2.444  -1.978  -1.733  -1.478  -1.003 1.001  2300
mu[4]      1.669   0.387   0.943   1.411   1.658   1.920   2.453 1.002  1300
ttb[1]     0.173   0.043   0.100   0.143   0.170   0.199   0.271 1.003   800
ttb[2]     0.007   0.003   0.003   0.005   0.007   0.009   0.015 1.002  1700
ttb[3]     0.037   0.013   0.017   0.028   0.035   0.044   0.066 1.002  1600
ttb[4]     0.519   0.068   0.386   0.472   0.521   0.565   0.650 1.002  1400
deviance 722.375   4.703 715.176 718.981 721.657 725.071 733.254 1.001  3000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 11.1 and DIC = 733.4
DIC is an estimate of expected predictive error (lower deviance is better).
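As an extra check (my addition, using cheese2 as reconstructed above), the ttb estimates can be set against the observed proportions of responses in the two highest categories:
# observed proportion of responses in categories 8 and 9, per cheese
tapply(cheese2$Response >= 8, cheese2$Cheese, mean)
# roughly 0.17, 0.00, 0.02 and 0.52 with the counts as entered,
# in line with the ttb estimates above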
