**Wiekvoet**, and kindly contributed to R-bloggers)

I have read about people doing a Bayesian PCA at some points and always wondered how that would work. Then, at some point I thought of a way to do so. As ideas evolved my interest became not PCA as such, but rather in a prefmap. As a first step in that this post contains the mapping from a sensory space to a two dimensional space. For prefmap this step is commonly done via a PCA.

### Data

Data are the coctail data from sensominer package.

### Algorithm

The calculation is mostly inspired by the many PLS algorithms to which I was exposed when I was doing chemometrics. Scores and loadings may be obtained from each other by multiplying with the data matrix. In this case it means I just take a set of product scores and obtain the associated descriptors via a simple matrix multiplication. The resulting product and descriptor vectors can be used to reconstruct the original matrix; the best solution minimizes difference between the constructed and original data. For dimension two subtract reconstructed data from original data and repeat on residuals.

#### Scaling

#### Unique solution

PCA is known not to have one unique solution; each solution is equivalent to its mirror image. It seemed most elegant to do this completely at the end, after inspection of the data it seemed the location of product 12 was suitable for making the solution unique, since it was extreme on both dimensions. The final step (generated quantities) forces the location to be top right quadrant for data reported.

#### Code

library(rstan)

nprod <- 16

ndescr <- 13

sprofile <- as.matrix(scale(senso.cocktail,scale=FALSE))

datain <- list(

nproduct=nprod,

ndescriptor=ndescr,

profile=sprofile

)

model1 <- “

data {

int

int

matrix[nproduct,ndescriptor] profile;

}

parameters {

row_vector[nproduct] prodsp1;

row_vector[nproduct] prodsp2;

real

real

}

transformed parameters {

vector [ndescriptor] descrsp1;

vector [ndescriptor] descrsp2;

matrix[nproduct,ndescriptor] expected1;

matrix[nproduct,ndescriptor] expected2;

matrix[nproduct,ndescriptor] residual1;

descrsp1 <- profile’*prodsp1′;

expected1 <- (descrsp1*prodsp1)’;

residual1 <- profile-expected1;

descrsp2 <- profile’*prodsp2′;

expected2 <- (descrsp2*prodsp2)’;

}

model {

for (r in 1:nproduct) {

prodsp1[r] ~ normal(0,1);

prodsp2[r] ~ normal(0,1);

for (c in 1:ndescriptor) {

profile[r,c] ~ normal(expected1[r,c],sigma1);

residual1[r,c] ~ normal(expected2[r,c],sigma2);

}

}

}

generated quantities {

vector [ndescriptor] descrspace1;

vector [ndescriptor] descrspace2;

row_vector [nproduct] prodspace1;

row_vector [nproduct] prodspace2;

prodspace1 <-(

((prodsp1[12]>0)*prodsp1)-

((prodsp1[12]<0)*prodsp1)

);

prodspace2 <-(

((prodsp2[12]>0)*prodsp2)-

((prodsp2[12]<0)*prodsp2)

);

descrspace1 <-(

((prodsp1[12]>0)*descrsp1)-

((prodsp1[12]<0)*descrsp1)

);

descrspace2 <-(

((prodsp2[12]>0)*descrsp2)-

((prodsp2[12]<0)*descrsp2)

);

}

“

pars <- c(‘prodspace1′,’prodspace2′,’descrspace1′,’descrspace2’)

fit1 <- stan(model_code = model1,

data = datain,

pars=pars)

### Results

For comparison, first a standard biplot.

#### Product space

It is not difficult to extract the samples and plot them. See end of post. One notable property of the plot is that the products are in ellipses with the minor axis towards the center. Apparently part of variation between MCMC samples is rotational freedom between dimensions. Other than that the solution is actually pretty close to the PCA

#### Descriptor space

The rotational freedom is even more clear here.

### Additional code

#### data

library(SensoMineR)

data(cocktail)

#### biplot

pr <- prcomp(senso.cocktail)

plot(pr)

biplot(pr)

#### product plot

fit1samps <- as.data.frame(fit1)

prod <- reshape(fit1samps,

drop=names(fit1samps)[33:59],

direction=’long’,

varying=list(names(fit1samps)[1:16],

names(fit1samps)[17:32]),

timevar=’sample’,

times=1:16,

v.names=c(‘PDim1′,’PDim2’)

)

prod <- prod[order(prod$PDim1),]

plot(prod$PDim1,prod$PDim2,

col=c(2,17,3,4,6,5,7:10,13,12,11,14:16)[prod$sample],

pch=46,

cex=2,

xlim=c(-1,1)*.75,

ylim=c(-1,1)*.75)

sa <- sapply(1:16,function(x)

c(sample=x,

Dim1=mean(prod$PDim1[prod$sample==x]),

Dim2=mean(prod$PDim2[prod$sample==x])))

sa <- as.data.frame(t(sa))

text(x=sa$Dim1,y=sa$Dim2,labels=sa$sample,cex=1.5)

#### descriptor plot

descr <- reshape(fit1samps,

drop=names(fit1samps),

direction=’long’,

varying=list(names(fit1samps)[33:45],

names(fit1samps)[46:58]),

timevar=’sample’,

times=1:13,

v.names=c(‘DDim1′,’DDim2’)

)

descr <- descr[order(descr$DDim1),]

plot(descr$DDim1,descr$DDim2,

col=c(2,1,3:13)[descr$sample],

pch=46,

cex=2,

xlim=c(-1,1)*9,

ylim=c(-1,1)*9)

sa <- sapply(1:13,function(x)

c(sample=x,

Dim1=mean(descr$DDim1[descr$sample==x]),

Dim2=mean(descr$DDim2[descr$sample==x])))

sa <- as.data.frame(t(sa))

text(x=sa$Dim1,y=sa$Dim2,labels=names(senso.cocktail))

**leave a comment**for the author, please follow the link and comment on their blog:

**Wiekvoet**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...