# Linking apple liking to sensory

March 24, 2012
By

(This article was first published on Wiekvoet, and kindly contributed to R-bloggers)

Previously it was seen that apple liking was related to consumers scores for juiciness and sweetness. It would be most nice if these scores can be linked to sensory scores. Thus a three block model would result:

• A block with sensory data describing how the apples taste
• A block with consumer data describing how the apples are perceived by the consumers
• A block with consumer data describing how the apples are liked

It would be most efficient to use randomForest to get the next connection. However, I like to try new things in this blog and it would be boring for the reader. Just to try, I decided to use Bayesian Model Averaging.

### Juiciness

library(BMA)
library(ggplot2)
# read and prepare the data
#remove storage conditions

datain <- datain[-grep(‘bag|net’,datain\$Products,ignore.case=TRUE),]
datain\$week <-  sapply(strsplit(as.character(datain\$Product),’_’),
function(x) x[[2]])
#convert into numerical
dataval <- datain
vars <- names(dataval)[-1]
for (descriptor in vars) {
dataval[,descriptor] <- as.numeric(gsub(‘[[:alpha:]]’,”,
dataval[,descriptor]))
}

indepV <- grep(‘^S’,vars,value=TRUE)
xblock <- as.matrix(dataval[,indepV])
bcJ <- bicreg(y=dataval\$CJuiciness,x=xblock)

Warning message:
In if ((1 – r2/100) <= 0) stop(“a model is perfectly correlated with the response”) :
the condition has length > 1 and only the first element will be used

summary(bcJ)

Call:
bicreg(x = xblock, y = dataval\$CJuiciness)

p!=0    EV         SD        model 1     model 2     model 3     model 4     model 5
Intercept   100.0   2.9100926  0.403545    3.055344    2.619845    2.913889    2.924000    3.170406
SCrispness   21.9   0.0013282  0.004954       .           .           .           .           .
SFirmness    20.2  -0.0011815  0.004592       .           .           .           .           .
SJuiciness  100.0   0.0245677  0.005499    0.026477    0.022773    0.021240    0.025762    0.027249
SMealiness   13.5   0.0002949  0.002505       .           .           .           .           .
SSweetness   58.4  -0.0050294  0.005644   -0.008973       .           .       -0.006871   -0.009673
SSourness    37.7   0.0014320  0.002701       .        0.004351       .        0.001453       .
SFlavor      17.0  -0.0004604  0.003566       .           .           .           .       -0.002023

nVar                                         2           2           1           3           3
r2                                         0.728       0.710       0.636       0.731       0.729
BIC                                      -17.631267  -16.497892  -15.312821  -14.954309  -14.852938
post prob                                  0.198       0.113       0.062       0.052       0.049
plot(bcJ,mfrow=c(3,3))

The models show a link with SJuiciness, which is what is expected. This link is clearly a positive correlation. A second link is to SSweetness, which is probably negative. Alternatively, but less probable, this link may be a positive link to SSourness. The model does not account for any curvature by quadratic or interaction effects. This is a bit of a disadvantage, but for this model it is not required. From sensory point of view, it can go either way, depending on how the scales are created for data acquisition. (the warning R dropped was ignored)

### Sweetness

bcS <- bicreg(y=dataval\$CSweetness,x=xblock)

Warning message:
In if ((1 – r2/100) <= 0) stop(“a model is perfectly correlated with the response”) :
the condition has length > 1 and only the first element will be used

summary(bcS)
Call:
bicreg(x = xblock, y = dataval\$CSweetness)

27  models were selected
Best  5  models (cumulative posterior probability =  0.5121 ):

p!=0    EV         SD        model 1     model 2     model 3     model 4     model 5
Intercept   100.0   3.6442040  0.518335    3.433920    4.091197    3.165198    3.424928    3.815567
SCrispness   85.7  -0.0077878  0.004981   -0.006778   -0.011925   -0.007570       .       -0.012148
SFirmness    29.7  -0.0012657  0.003688       .           .           .       -0.006709       .
SJuiciness   16.2  -0.0001717  0.001631       .           .           .           .           .
SMealiness   41.6  -0.0037273  0.006153       .       -0.009068       .           .       -0.008314
SSweetness  100.0   0.0098838  0.002804    0.010048    0.009443    0.010997    0.009795    0.010274
SSourness    18.2  -0.0001813  0.001224       .           .           .           .           .
SFlavor      28.1   0.0013050  0.003288       .           .        0.004341       .        0.003569

nVar                                         2           3           3           2           4
r2                                         0.702       0.742       0.723       0.664       0.755
BIC                                      -16.023858  -15.699168  -14.426420  -13.864248  -13.788552
post prob                                  0.173       0.147       0.078       0.059       0.056
CSweetness can be explained via SSweetness, as was hoped and expected and one or two other variables; SCrispness and SMealiness, both of these probably negative
plot(bcS,mfrow=c(3,3))

### BMA

My final verdict on BMA is mixed. It is nice to get an insight about important independent variables, effects of other variables. It is odd that to get an warning, but since the result seems to make sense ignoring is not too bad an approach. The main dislike is complete inability to handle interactions and quadratic effects. Even while it is not difficult to extend xblock to contain all these, the calculation time becomes prohibitive and BMA does not understand that if a model has a quadratic term or interaction terms it also requites the main effects. For the current model step, this is not the largest problem.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...