(This article was first published on **Wiekvoet**, and kindly contributed to R-bloggers)

I recently made three posts on the analysis of ordinal data: a post looking at all the methods I could find in R, a post with an additional method, and a post using JAGS. Common to all three was the cheese data, a data set in which only product information is available; there are no observers. The real world is different: there are observers, and they differ quite a bit from each other. So in this post I add observers and look at two models again: plain ANOVA and, on the complex side, an ordinal model with random observer effects.

### Data

To spare myself the search for data, and keeping in mind that I may want to run some simulations, I created a data generator. Into this generator I put some of the properties I can imagine observers having. First of all, they have different sensitivities: what is very salty to one is not at all salty to another. This is expressed as an additive and a multiplicative effect. Second, they have observation error; the same product does not always give the same response. Third, observers use the categories slightly differently. I am well aware that these are far more properties than I could ever estimate: a typical consumer scores only two or three products in a test, while the generator has many more parameters per observer. Finally, the outer limits (categories 1 and 9) are placed a bit further outward, because there is an aversion to using these extreme categories.

```r
num2scale <- function(x, scale) {
  cut(x, c(-Inf, scale, Inf), labels = 1:(length(scale) + 1))
}

pop.limits2ind.limits <- function(scale, sd = .5) {
  newscale <- scale + rnorm(length(scale), sd = sd)
  # re-sort so the individual cut-offs stay in increasing order
  newscale[order(newscale)]
}
```
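As a quick standalone illustration (not part of the original post) of how `num2scale` maps latent intensities onto the ordinal categories:

```r
# num2scale assigns each intensity to the (right-closed) interval between
# successive cut-offs; values below the lowest cut-off get category 1,
# values above the highest get the top category.
num2scale <- function(x, scale) {
  cut(x, c(-Inf, scale, Inf), labels = 1:(length(scale) + 1))
}
pop.limits <- c(1, 3:8, 10)
as.character(num2scale(c(0.5, 5.2, 9.9, 11), pop.limits))
# "1" "5" "8" "9"
```

Note how the outward placement of the outer cut-offs (1 and 10) makes the extreme categories harder to reach.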

```r
obs.score <- function(obs, pop.score, pop.limits, scale.sd = 0.5,
    sensitivity.sd = .1, precision.sd = 1,
    additive.sd = 2, center.scale = 5,
    labels = LETTERS[1:length(pop.score)]) {
  # individual sensitivity (multiplicative)
  obs.errorfreeintensity <- center.scale +
      (pop.score - center.scale) * rlnorm(1, sd = sensitivity.sd)
  # individual sensitivity (additive)
  obs.errorfreeintensity <- obs.errorfreeintensity +
      rnorm(1, mean = 0, sd = additive.sd)
  # individual observation error
  obs.intensity <- obs.errorfreeintensity +
      rnorm(length(pop.score), sd = precision.sd)
  # individual cut-offs between categories; scale.sd is passed on here
  # (the original call silently fell back to the default)
  obs.limits <- pop.limits2ind.limits(pop.limits, sd = scale.sd)
  obs.score <- num2scale(obs.intensity, obs.limits)
  data.frame(obs = obs,
      score = obs.score,
      product = labels)
}
```

```r
panel.score <- function(n = 100, pop.score, pop.limits, scale.sd = 0.5,
    sensitivity.sd = .1, precision.sd = 1,
    additive.sd = 2, center.scale = 5,
    labels = LETTERS[1:length(pop.score)]) {
  la <- lapply(1:n, function(x) {
    # forward additive.sd and center.scale as well
    # (the original dropped them, so their defaults were always used)
    obs.score(x, pop.score, pop.limits, scale.sd, sensitivity.sd,
        precision.sd, additive.sd, center.scale, labels = labels)
  })
  dc <- do.call(rbind, la)
  dc$obs <- factor(dc$obs)
  dc
}
```

```r
pop.limits <- c(1, 3:8, 10)
prodmeans <- c(7, 7, 8)
scores <- panel.score(40, prodmeans, pop.limits)
# convert the score factor back to numbers for the ANOVA
scores$numresponse <- as.numeric(levels(scores$score))[scores$score]
```
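A side note on that last line: for a factor whose labels are numbers, `as.numeric(levels(f))[f]` recovers the numeric values, whereas `as.numeric(f)` alone would return the internal level codes. A small standalone illustration (not the scores data):

```r
# Illustrative factor: the labels are numbers, but the internal
# codes are just 1, 2, 3, ... in level order.
f <- factor(c("3", "10"), levels = c("1", "3", "8", "10"))
as.numeric(levels(f))[f]   # 3 10  -- the actual values
as.numeric(f)              # 2  4  -- the level codes
```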

### Analysis – ANOVA

The first analysis method is ANOVA. Plain ANOVA, not because I dislike variance components or such, but because this is what is done most commonly. On top of that, I cannot imagine anybody taking these data and pulling them through a mixed model while still forgetting that they do not come from a continuous scale.

```r
library(multcomp)
Res.aov <- aov(numresponse ~ obs + product, data = scores)
anova(Res.aov)
```

```
Analysis of Variance Table

Response: numresponse
          Df  Sum Sq Mean Sq F value    Pr(>F)    
obs       39 239.300  6.1359  6.3686 2.321e-12 ***
product    2   9.517  4.7583  4.9388   0.00956 ** 
Residuals 78  75.150  0.9635                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

```r
# mt was not defined in the original post; model.tables() on the aov fit
# is the assumed source of the product means used below
mt <- model.tables(Res.aov, type = "means")
data.frame(dimnames(mt$tables$product)[1],
    Response = as.numeric(mt$tables$product),
    LSD = cld(glht(Res.aov, linfct = mcp(product = "Tukey"))
        )$mcletters$monospacedLetters
    )
```

```
  product Response LSD
A       A    6.725   a
B       B    6.850   a
C       C    7.375   b
```

### Analysis – cumulative link mixed model (clmm)

```r
library(ordinal)
Res.clmm <- clmm(score ~ product + (1 | obs), data = scores)
summary(Res.clmm)
```

```
Cumulative Link Mixed Model fitted with the Laplace approximation

formula: score ~ product + (1 | obs)
data:    scores

 link  threshold nobs logLik  AIC    niter    max.grad cond.H
 logit flexible  120  -179.57 379.14 37(5046) 8.77e-06 Inf   

Random effects:
      Var Std.Dev
obs 8.425   2.903
Number of groups:  obs 40 

Coefficients:
         Estimate Std. Error z value Pr(>|z|)   
productB   0.1309     0.4225   0.310  0.75680   
productC   1.4640     0.4718   3.103  0.00192 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Threshold coefficients:
    Estimate Std. Error z value
2|3  -6.0078         NA      NA
3|4  -5.6340         NA      NA
4|5  -4.1059     0.4528  -9.069
5|6  -3.0521     0.4769  -6.400
6|7  -1.0248     0.4970  -2.062
7|8   0.5499     0.5303   1.037
8|9   4.0583     0.7429   5.463
```

```r
# Res.clmm2, the intercept-only model, was not shown in the original post;
# its formula follows from the anova() output below
Res.clmm2 <- clmm(score ~ 1 + (1 | obs), data = scores)
anova(Res.clmm2, Res.clmm)
```

```
Likelihood ratio tests of cumulative link models:

          formula:                  link: threshold:
Res.clmm2 score ~ 1 + (1 | obs)     logit flexible  
Res.clmm  score ~ product + (1 | obs) logit flexible

          no.par    AIC  logLik LR.stat df Pr(>Chisq)   
Res.clmm2      8 387.41 -185.70                         
Res.clmm      10 379.14 -179.57  12.266  2    0.00217 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

```r
# co, the vector holding the 7|8 threshold and the product effects, was not
# defined in the original post; this reconstruction is assumed from the
# output below (which it reproduces), with vcov subset to match
keep <- c("7|8", "productB", "productC")
co <- Res.clmm$coefficients[keep]
vc <- vcov(Res.clmm)[keep, keep]
names(co) <- levels(scores$product)
sd <- sqrt(c(vc[1, 1], diag(vc)[-1] + vc[1, 1] - 2 * vc[1, -1]))
data.frame(
    `top 2 box` = arm::invlogit(c(-co[1], -co[1] + co[-1])),
    `2.5% limit` = arm::invlogit(c(-co[1], -co[1] + co[-1]) + qnorm(.025) * sd),
    `97.5% limit` = arm::invlogit(c(-co[1], -co[1] + co[-1]) + qnorm(.975) * sd),
    check.names = FALSE
)
```

```
  top 2 box 2.5% limit 97.5% limit
A 0.3658831  0.1694873   0.6199713
B 0.3967401  0.1753347   0.6704337
C 0.7138311  0.4498042   0.8838691
```
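To make the top-2-box arithmetic concrete, here is a standalone recomputation (my sketch, using the rounded coefficients printed in the summary above; `invlogit` is written out rather than taken from the arm package):

```r
# With a logit link, the top-2-box probability P(score >= 8) is
# invlogit(beta - theta_{7|8}); values copied from the clmm summary
invlogit <- function(x) 1 / (1 + exp(-x))  # plogis() would do the same
theta78 <- 0.5499                          # threshold 7|8
beta    <- c(A = 0, B = 0.1309, C = 1.4640)
round(invlogit(beta - theta78), 4)
```

This reproduces the first column of the table above, up to the rounding of the printed coefficients.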
