Bayesian analysis of sensory profiling data III

February 16, 2014
By

(This article was first published on Wiekvoet, and kindly contributed to R-bloggers)


Last week I extended my Bayesian model. So this week I wanted to test it with different data. There is one other data set with profiling data in R, french fries in the reshape package. 'This data was collected from a sensory experiment conducted at Iowa State University in 2004. The investigators were interested in the effect of using three different fryer oils had on the taste of the fries'. In this post I try t analyze the data, however the documentation which I found is so limited, it is needed to second guess the objectives of the experiment.

Data

The data includes a treatment and a time variable. The time is in weeks and crossed with the other variables, suggesting that this is systematically varied and hence in itself a factor of interest. In the end I decided to cross treatments and time to generate a product factor with 30 levels. In addition I crossed time with a repetition factor so this effect could function as a random variable. As bonus, there are some missing data, which Stan does not like and are removed before analysis.
data(french_fries,package='reshape')
head(french_fries)
vars <- names(french_fries)
fries <- reshape(french_fries,
        idvar=vars[1:4],
        varying=list(vars[5:9]),
        v.names='Score',
        direction='long',
        timevar='Descriptor',
        times=vars[5:9])
head(fries)
fries$Product <- interaction(fries$treatment,fries$time)
fries$Session <- interaction(fries$time,fries$rep)
fries$Descriptor <- factor(fries$Descriptor)
fries <- fries[complete.cases(fries),]

Model

The model is a variation on last week, because I there is no rounds information but there are sessions. In addition, I made the range for the shift variable a bit larger, because it appeared some panelists have extreme behavior. 
model1 <-
        '
    data {
        int<lower=0> npanelist;
        int<lower=0> nobs;
        int<lower=0> nsession;
        int<lower=0> ndescriptor;
        int<lower=0> nproduct;
        vector[nobs] y;
        int<lower=1,upper=npanelist> panelist[nobs];
        int<lower=1,upper=nproduct> product[nobs];
        int<lower=1,upper=ndescriptor> descriptor[nobs];
        int<lower=1,upper=nsession> session[nobs];
        real maxy;
    }
    parameters {
        matrix<lower=0,upper=maxy> [nproduct,ndescriptor] profile;
        vector<lower=-maxy/2,upper=maxy/2>[npanelist] shift[ndescriptor];
        vector<lower=-maxy/2,upper=maxy/2>[npanelist] sitting[nsession];  
        vector<lower=-1,upper=1> [npanelist] logsensitivity;
        real<lower=0,upper=maxy/5> varr;
        vector [npanelist] lpanelistvar;
        vector [ndescriptor] ldescriptorvar;
        real<lower=0,upper=maxy/5> sigmashift;
        real<lower=0,upper=maxy/5> sigmasitting;
    }
    transformed parameters {
        vector [nobs] expect;
        vector[npanelist] sensitivity;
        real mlogsens;
        real mlpanelistvar;
        real mldescriptorvar;
        real meanshift[ndescriptor];
        real meansit[nsession];
        vector [nobs] sigma;
        mlogsens <- mean(logsensitivity);
        for (i in 1:ndescriptor) {
            meanshift[i] <- mean(shift[i]);
        }
        mlpanelistvar <- mean(lpanelistvar);
        mldescriptorvar <- mean(ldescriptorvar);   
        for (i in 1:nsession) {
            meansit[i] <- mean(sitting[i]);
        }
        for (i in 1:npanelist) {
            sensitivity[i] <- pow(10,logsensitivity[i]-mlogsens);
        }
        for (i in 1:nobs) {
            expect[i] <- profile[product[i],descriptor[i]]
                *sensitivity[panelist[i]]
                + shift[descriptor[i],panelist[i]]-meanshift[descriptor[i]]
                + sitting[session[i],panelist[i]]-meansit[session[i]];
            sigma[i] <- sqrt(varr
                *exp(lpanelistvar[panelist[i]]-mlpanelistvar)
                *exp(ldescriptorvar[descriptor[i]]-mldescriptorvar));
            }
    }
    model {
        logsensitivity ~ normal(0,0.1);
        for (i in 1: ndescriptor) {
            shift[i] ~ normal(0,sigmashift);
            ldescriptorvar[i] ~ normal(0,1);
        }
        for (i in 1:nsession)
            sitting[i] ~ normal(0,sigmasitting);
        for (i in 1:npanelist)
            lpanelistvar[i] ~ normal(0,1);
        y ~ normal(expect,sigma);
    } 
    generated quantities    {
        vector [npanelist] panelistsd;
        vector [ndescriptor] descriptorsd;
        for (i in 1:npanelist) {
            panelistsd[i] <- sqrt(exp(lpanelistvar[i]));
        }
        for (i in 1:ndescriptor) {
            descriptorsd[i] <- sqrt(exp(ldescriptorvar[i]));
        }
    }
        '

Additional code

This is just the stuff needed to get Stan running
datainfries <- with(fries,list(
                panelist=(1:nlevels(subject))[subject],
                npanelist=nlevels(subject),
                session=(1:nlevels(Session))[Session],
                nsession=nlevels(Session),
                product=(1:nlevels(Product))[Product],
                nproduct=nlevels(Product),
                descriptor=(1:nlevels(Descriptor))[Descriptor],
                ndescriptor=nlevels(Descriptor),
                y=Score,
                nobs=length(Score),
                maxy=15))

pars <- c('profile','sensitivity','shift','sitting',
        'panelistsd','descriptorsd')
library(rstan)
fitfries <- stan(model_code = model1,
        data = datainfries,
        pars=pars,
        iter = 1100,
        warmup=100,
        chains = 4)

Results

Traceplot is not shown, but the general plot provides some overview.
Plot of the profile plot has been adapted given the design. This plot should convey the message the data contains. As time progresses the flavor becomes more rancid and pointy, less potato. This process starts at circa 4 weeks.Treatment 3 seems least affected, but the difference is minute.

samples <- extract(fitfries,'profile')$profile
nsamp <- dim(samples)[1]
profile <- expand.grid(Product=levels(fries$Product),
        Descriptor=levels(fries$Descriptor))
profile$des <- as.numeric(profile$Descriptor)
profile$prod <- as.numeric(profile$Product)
profs <- as.data.frame(t(sapply(1:nrow(profile),function(i){
                        subsamples <-c(samples[1:nsamp,

                                         profile$prod[i],profile$des[i]])
                        c(mean=mean(subsamples),quantile(subsamples,c(.1,.9)))
                        })))
names(profs) <- c('score','lmin','lmax')
profile <- cbind(profile,profs)
profile <- merge(profile,
           unique(fries[,c('Product','treatment','time')])
           )

library(ggplot2)

profile$timen <- c(1:12)[profile$time]
profile <- profile[order(profile$timen),]

p1 <- ggplot(profile, aes(y=score, x=timen,color=treatment))

p1 + geom_path() +
        scale_x_continuous(
                breaks=1:nlevels(profile$time),
                labels=levels(profile$time)) +
        xlab('') +
         geom_ribbon(aes(ymin=lmin, ymax=lmax,fill=treatment),
                 alpha=.15,linetype='blank') +
        facet_grid(  Descriptor ~ . )

To leave a comment for the author, please follow the link and comment on his blog: Wiekvoet.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Sponsors

Mango solutions

Zero Inflated Models and Generalized Linear Mixed Models with R

Quantide: statistical consulting and training

RStudio homepage

REvolution analytics

http://www.eoda.de

Plotly: collaborative, publication-quality graphing.







rapporter.net: An R based reporting and data analysis platform in the cloud