There are different algorithms to calculate the Principal Components (PCs). Kurt Varmuza and Peter Filzmoser explain them in their book "Introduction to Multivariate Statistical Analysis in Chemometrics". I'm going to apply one of them, to...
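The excerpt does not say which algorithm is applied, so the following is only a minimal sketch of one common route, PCA through the singular value decomposition of the mean-centered matrix. The matrix `X` is simulated, hypothetical data, and the choice of SVD is an assumption, not necessarily the author's method.

```r
# PCA via SVD on simulated data (hypothetical: 20 samples x 6 variables)
set.seed(42)
X <- matrix(rnorm(20 * 6), nrow = 20)

# Center each column; PCA works on the mean-centered matrix
Xc <- scale(X, center = TRUE, scale = FALSE)

# Xc = U D V'; scores are U D, loadings are the columns of V
dec <- svd(Xc)
scores <- dec$u %*% diag(dec$d)
loadings <- dec$v

# Fraction of the total variance carried by each PC
explained <- dec$d^2 / sum(dec$d^2)
print(round(explained, 3))
```

Up to the sign of each component, `scores` should agree with `prcomp(X)$x`, which is an easy way to check the sketch.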

(This article was first published on NIR-Quimiometría, and kindly contributed to R-bloggers) This is another pretreatment used quite often in Near Infrared to remove scatter. It is applied to every spectrum individually. The mean and standard deviation of all the data points of that spectrum are calculated. The mean is subtracted from every data point of the spectrum and...
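The treatment described here matches what is usually called Standard Normal Variate (SNV). Assuming that is what is meant, a minimal base R sketch could look like this; the function name `snv_sketch` and the toy matrix are hypothetical, not taken from the original post.

```r
# SNV sketch: every row (spectrum) is centered with its own mean
# and scaled with its own standard deviation.
snv_sketch <- function(spectra) {
  spectra <- as.matrix(spectra)
  row_mean <- rowMeans(spectra)
  row_sd <- apply(spectra, 1, sd)
  centered <- sweep(spectra, 1, row_mean, "-")
  sweep(centered, 1, row_sd, "/")
}

# Hypothetical data: 5 spectra with 50 data points each
set.seed(1)
toy <- matrix(rnorm(5 * 50, mean = 2), nrow = 5)
toy_snv <- snv_sketch(toy)
```

After the treatment every spectrum has mean 0 and standard deviation 1, which can be confirmed with `rowMeans(toy_snv)` and `apply(toy_snv, 1, sd)`.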

Mean spectrum calculation is important: to center a matrix of spectra, we subtract the mean spectrum from every spectrum in the matrix. The mean spectrum has many other uses as well, such as averaging subsamples. Let's calculate and plot the mean spectrum for the Yarn NIR data:...
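As a minimal sketch of the centering step, the following uses a small simulated matrix instead of the Yarn data, so that it runs in base R alone. The object names and the simulated values are hypothetical.

```r
# Hypothetical spectra: 4 samples (rows) x 30 wavelengths (columns)
set.seed(7)
spectra <- matrix(runif(4 * 30), nrow = 4)

# The mean spectrum is the column-wise mean over all samples
mean_spectrum <- colMeans(spectra)

# Centering: subtract the mean spectrum from every row
centered <- sweep(spectra, 2, mean_spectrum, "-")

# Plot the individual spectra in grey and the mean spectrum in red
matplot(t(spectra), type = "l", lty = 1, col = "grey",
        xlab = "Data point", ylab = "Absorbance")
lines(mean_spectrum, col = "red", lwd = 2)
```

The same centering can be obtained with `scale(spectra, center = TRUE, scale = FALSE)`.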

It is always good to look at the spectra from different points of view before developing a regression. This helps us to understand our samples better, to detect outliers, to check where the variability is, and to see whether that variability correlates with the constituent of interest (directly...

MSC (Multiplicative Scatter Correction) is a math treatment to correct the scatter in the spectra. Scatter is produced by different physical circumstances, such as particle size and packing. Normally, scatter worsens the correlation of the spectra with the constituent of interest. Almost all available chemometric software includes this math treatment, and of course R has it as well in the...
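To show the idea behind MSC, here is a hand-written base R sketch, not the packaged function the excerpt alludes to. It assumes the usual formulation: each spectrum is regressed on a reference spectrum (here the mean spectrum), then the fitted offset is subtracted and the result is divided by the fitted slope. The name `msc_sketch` and the toy data are hypothetical.

```r
# MSC sketch: fit x_i = a_i + b_i * reference, return (x_i - a_i) / b_i
msc_sketch <- function(spectra, reference = colMeans(spectra)) {
  spectra <- as.matrix(spectra)
  corrected <- t(apply(spectra, 1, function(x) {
    coefs <- unname(lm.fit(cbind(1, reference), x)$coefficients)
    (x - coefs[1]) / coefs[2]   # remove offset, then rescale by slope
  }))
  dimnames(corrected) <- dimnames(spectra)
  corrected
}

# Hypothetical data: one base curve with a different offset and
# multiplicative factor for each of 4 samples (pure scatter effects)
base_curve <- sin(seq(0, pi, length.out = 40))
offsets <- c(0.1, 0.3, 0.2, 0.5)
slopes <- c(0.8, 1.0, 1.2, 1.5)
toy <- t(sapply(1:4, function(i) offsets[i] + slopes[i] * base_curve))

toy_msc <- msc_sketch(toy)
```

Because the toy spectra differ only by offset and slope, all corrected spectra collapse onto the mean spectrum; with real data, the chemical differences between samples remain.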

In the previous post we plotted the Cross Validation predictions with:
> plot(gas1, ncomp = 3, asp = 1, line = TRUE)
We can plot the fitted values instead with:
> plot(gas1, ncomp = 3, asp = 1, line = TRUE, which = "train")
The plots are different. Of course, using "train" we get overoptimistic statistics and we should look...
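To put a number on how optimistic the "train" statistics are, the two error estimates can be compared directly. This is a sketch only: it assumes the `pls` package is installed, it refits the model under the hypothetical name `gas_fit`, and the setting `validation = "CV"` is an assumption, since the validation setting used in the original model is not shown here.

```r
library(pls)    # assumes the pls package is installed
data(gasoline)  # 60 gasoline samples with octane number and NIR spectra

set.seed(1)     # the cross-validation segments are drawn at random
gas_fit <- plsr(octane ~ NIR, ncomp = 10, data = gasoline,
                validation = "CV")  # validation setting is an assumption

# RMSEP for the fitted ("train") and the cross-validated ("CV") predictions
err <- RMSEP(gas_fit, estimate = c("train", "CV"))

# The third dimension starts with the intercept-only model,
# so position 4 corresponds to 3 components
rmse_train <- err$val["train", 1, 4]
rmse_cv <- err$val["CV", 1, 4]
print(c(train = rmse_train, CV = rmse_cv))
```

The "train" error should come out lower than the cross-validated one, which is why the cross-validated statistics are the ones to look at when judging the model.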

The gasoline data set has the spectra of 60 samples acquired by diffuse reflectance from 900 to 1700 nm. We saw how to plot the spectra in the previous post. Now, following the tutorial of Bjørn-Helge Mevik published in "R News Volume 6/3, August 2006", we will do the PLS regression:
gas1 <- plsr(octane ~ NIR, ncomp = 10, data = gasoline, validation...