(This article was first published on

R index is developed in interpreting signal detection data for human perception. In sensory research it is used to interpret ranking data. The value one gets out of an R-index calculation is interpreted as a confusion between samples tested. It has been popularized among others by Bi and O'Mahony.**Wiekvoet**, and kindly contributed to R-bloggers)The value of R-index ranges between 0% and 100%, with 50% representing equality.

The computation of the R index is not difficult and can be done using a few short functions.

Simulated data of ranking for 3 persons and 4 products is:

person prod rank

1 1 1 1

2 1 2 3

3 1 3 2

4 1 4 4

5 2 1 1

6 2 2 4

7 2 3 3

8 2 4 2

9 3 1 1

10 3 2 4

11 3 3 3

12 3 4 2

These data has as R indices:

prod1 prod2 Rindex

1 1 2 100.00000

2 1 3 100.00000

3 1 4 100.00000

4 2 3 11.11111

5 2 4 22.22222

6 3 4 44.44444

It is not obvious at first sight, but the R index is in all actuality not a continuous function. There are only so many permutations of rankings. To demonstrate this, we repeat a simulation 1000 times. With 25 simulated persons with each 4 products the frequency of R-indices is shown in the figure.

It is clear that R index exactly 50% is the most probable outcome. This represents that the products are all equal. It is also clear that some R-indices which are really close to 50 are much less probable. Finally, values under 20 and over 80 hardly occur. This is well know. Critical values of R-indices are given by the red and blue lines (Bi and O'Mahony 1995 respectively 2007, Journal of Sensory Studies)

R code for these results

# A simple Ranking

makeRanksNodiff <- function(nprod,nrep) {

inList <- lapply(1:nrep,function(x) sample(1:nprod,nprod) )

data.frame(person=factor(rep(1:nrep,each=nprod)),

prod=factor(rep(1:nprod,times=nrep)),

rank=unlist(inList))

}

# R index for two products out of a cross table

tab2Rindex <- function(t1,t2) {

Rindex <- crossprod(rev(t1)[-1],cumsum(rev(t2[-1]))) + 0.5*crossprod(t1,t2)

100*Rindex/(sum(t1)*sum(t2))

}

# R index for all products out of an experiment

allRindex <- function(rankExperiment) {

crst <- xtabs(~ prod + rank,data=rankExperiment)

nprod <- nlevels(rankExperiment$prod)

Rindices <- lapply(1:(nprod-1),function(p1) {

lap <- lapply((p1+1):nprod,function(p2) {

index <- tab2Rindex(crst[p1,],crst[p2,])

prod1 <- levels(rankExperiment$prod)[p1]

prod2 <- levels(rankExperiment$prod)[p2]

data.frame(prod1=prod1,prod2=prod2,Rindex=index)

})

return(do.call(rbind,lap))

})

do.call(rbind,Rindices)

}

# small data set

set.seed(12)

rankExperiment <- makeRanksNodiff(nprod=4,nrep=3)

rankExperiment

allRindex(rankExperiment)

# larger simulation

set.seed(12)

last <- lapply(1:1000,function(x) {

re <- makeRanksNodiff(nprod=4,nrep=25)

ar <- allRindex(re)

ar$Rindex

} )

rankmatrix <- do.call(rbind,last)

RindicesTable <- as.data.frame(table(rankmatrix))

RindicesTable$Rindex <- as.numeric(as.character(RindicesTable$rankmatrix))

library(ggplot2)

g1 <- ggplot(RindicesTable, aes( Rindex, Freq)) + geom_point(alpha=.5)

g1 <- g1 + geom_vline(xintercept=50 + 18.57*c(-1,1),colour='red')

g1 <- g1 + geom_vline(xintercept=50 + 15.21*c(-1,1),colour='blue')

g1

To

**leave a comment**for the author, please follow the link and comment on his blog:**Wiekvoet**.R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...