"R": PLS Regression (Gasoline) – 003

February 3, 2012

(This article was first published on NIR-Quimiometría, and kindly contributed to R-bloggers)

The gasoline data set has the spectra of 60 samples acquired by diffuse reflectance from 900 to 1700 nm. We saw how to plot the spectra in the previous post.
Now, following the tutorial by Bjorn-Helge Mevik published in “R-News Volume 6/3, August 2006”, we will do the PLS regression:

gas1 <- plsr(octane ~ NIR, ncomp = 10, data = gasoline, validation = "LOO")

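This fits a model with 10 components.
We use leave-one-out cross-validation (LOO): each of the 60 samples is left out in turn, the model is fitted on the remaining 59, and the left-out sample is predicted.
The constituent (the response variable) is the octane number.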
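
To make clear what validation = "LOO" does, here is a minimal sketch that repeats the leave-one-out loop by hand for three components and compares the result with the value reported by the pls package. The names n, k, loo_pred, rmsep_manual and rmsep_pls are only illustrative, and the choice of three components is an assumption made for the example.

```r
library(pls)
data(gasoline)

n <- nrow(gasoline)   # 60 samples
k <- 3                # number of components to evaluate (illustrative choice)
loo_pred <- numeric(n)

for (i in seq_len(n)) {
  # Fit on all samples except the i-th one
  fit_i <- plsr(octane ~ NIR, ncomp = k, data = gasoline[-i, ])
  # Predict the left-out sample with k components
  loo_pred[i] <- drop(predict(fit_i, newdata = gasoline[i, ], ncomp = k))
}

# Root mean squared error of the leave-one-out predictions
rmsep_manual <- sqrt(mean((gasoline$octane - loo_pred)^2))

# The same quantity as computed by the package
gas1 <- plsr(octane ~ NIR, ncomp = 10, data = gasoline, validation = "LOO")
rmsep_pls <- RMSEP(gas1, estimate = "CV")$val["CV", 1, "3 comps"]

c(manual = rmsep_manual, pls = unname(rmsep_pls))
```

The two numbers should agree up to rounding, since the package runs the same loop internally.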

> summary(gas1)

Data:   X dimension: 60 401
        Y dimension: 60 1
Fit method: kernelpls
Number of components considered: 10
Cross-validated using 60 leave-one-out segments.
       (Intercept)  1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
CV           1.543    1.328   0.3813   0.2579   0.2412   0.2412   0.2294
adjCV        1.543    1.328   0.3793   0.2577   0.2410   0.2405   0.2288
       7 comps  8 comps  9 comps  10 comps
CV      0.2191   0.2280   0.2422    0.2441
adjCV   0.2183   0.2273   0.2411    0.2433
TRAINING: % variance explained
        1 comps  2 comps  3 comps  4 comps  5 comps  6 comps  7 comps  8 comps
X         70.97    78.56    86.15    95.40    96.12    96.97    97.32    98.10
octane    31.90    94.66    97.71    98.01    98.68    98.93    99.06    99.11
        9 comps  10 comps
X         98.32     98.71
octane    99.20     99.24
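
The "% variance explained" rows of the summary can also be pulled out as numbers, which is handy for tables or custom plots. This is a small sketch using explvar() and R2() from the pls package; the names x_cum and y_cum are only illustrative.

```r
library(pls)
data(gasoline)
gas1 <- plsr(octane ~ NIR, ncomp = 10, data = gasoline, validation = "LOO")

# explvar() gives the X variance explained by each component (in %);
# the summary shows the cumulative values
x_cum <- cumsum(explvar(gas1))

# Training R2 of the response for 1 to 10 components, as a percentage
y_cum <- 100 * drop(R2(gas1, estimate = "train", intercept = FALSE)$val)

round(rbind(X = x_cum, octane = y_cum), 2)
```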
A good way to decide how many components to use is to plot the RMSEP against the number of components:

> plot(RMSEP(gas1), legendpos = "topright")
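adjCV is the bias-corrected cross-validation estimate of the RMSEP; with LOO it is almost identical to the uncorrected CV estimate.
The plot suggests three components, giving an RMSEP of 0.258. The lowest cross-validated value in the table is at seven components (0.219), but the curve is nearly flat after three, so the extra components add little.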
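
The values behind the plot can be inspected directly, which helps when the curve is flat and the choice is not obvious by eye. The sketch below extracts the CV row of the RMSEP and computes the relative gain from each extra component; the names cv and rel_gain are only illustrative, and the relative gain is a simple aid to reading the curve, not a rule taken from the tutorial.

```r
library(pls)
data(gasoline)
gas1 <- plsr(octane ~ NIR, ncomp = 10, data = gasoline, validation = "LOO")

# Cross-validated RMSEP for 0 (intercept only) to 10 components
cv <- RMSEP(gas1, estimate = "CV")$val["CV", 1, ]
round(cv, 4)

# Number of components with the lowest cross-validated RMSEP
names(which.min(cv))

# Relative reduction in RMSEP obtained by adding each component (in %)
rel_gain <- -diff(cv) / head(cv, -1)
round(100 * rel_gain, 1)
```

Large gains for the first few components followed by gains of a few percent (or losses) are the numeric counterpart of the "elbow" seen in the plot.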
Now we can look at the different plots, starting with the prediction plot (predicted versus measured octane number):

> plot(gas1, ncomp = 3, asp = 1, line = TRUE)
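
For a model fitted with validation, the points in this plot are the cross-validated predictions stored in the fitted object. As a rough sketch, they can be extracted and summarised by hand; the names cv_pred, measured, rmsep_cv and r2_cv are only illustrative.

```r
library(pls)
data(gasoline)
gas1 <- plsr(octane ~ NIR, ncomp = 10, data = gasoline, validation = "LOO")

# Cross-validated predictions: samples x responses x components
cv_pred  <- gas1$validation$pred[, 1, 3]   # first response, 3 components
measured <- gasoline$octane

# RMSEP and R2 of the cross-validated predictions
rmsep_cv <- sqrt(mean((measured - cv_pred)^2))
r2_cv    <- 1 - sum((measured - cv_pred)^2) /
                sum((measured - mean(measured))^2)

# First few samples: measured value, prediction and residual
head(data.frame(measured, cv_pred, residual = measured - cv_pred))
```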
We will continue with more plots in the next post.

Tutorials by:
Bjorn-Helge Mevik (Norwegian University of Life Sciences)
Ron Wehrens (Radboud University Nijmegen)
