NIT: Fatty acids study in R – Part 004

[This article was first published on NIR-Quimiometría, and kindly contributed to R-bloggers].

It is clear that MSC does not remove all of the scatter in the raw spectra, so some of the information remains hidden by it. Improving the sample presentation would help to reduce the scatter.
We know that the first loading is closely related to the main source of variance (in this case the scatter). In the next figure, I overplot the standard deviation spectrum (multiplied by 10, so that the two curves are easy to compare) with the first loading.
> sd10<-sdfattyac_msc_c*10   # assumed: the SD spectrum scaled by 10, as described in the text
> l1sd10<-cbind(loading1,sd10)
> plot(t(l1sd10))
> matplot(wavelengths,l1sd10,lty=1,pch=21)
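The comparison between the first loading and the standard deviation spectrum can be tried on simulated data. The following is only a sketch: the spectra, the names `X`, `base_spec` and `shape`, and the data-driven scale factor `k` (used here instead of the fixed factor of 10) are all assumptions made for illustration, not the data or objects from this study.

```r
set.seed(1)
wavelengths <- seq(850, 1048, by = 2)                    # 100 data points
n <- 60                                                  # hypothetical number of samples

# Simulated spectra: a common mean spectrum plus a sample-dependent scatter term
base_spec <- 2 + 10 * dnorm(wavelengths, mean = 932, sd = 15)
shape     <- seq(1, 1.5, length.out = length(wavelengths))  # wavelength profile of the scatter
scatter   <- rnorm(n, sd = 0.05)                             # scatter amount per sample
X <- matrix(base_spec, n, length(wavelengths), byrow = TRUE) +
     outer(scatter, shape) +
     matrix(rnorm(n * length(wavelengths), sd = 0.001), n)

# First PCA loading and standard deviation spectrum
pca      <- prcomp(X, center = TRUE, scale. = FALSE)
loading1 <- pca$rotation[, 1]
if (sum(loading1) < 0) loading1 <- -loading1             # the sign of a loading is arbitrary
sd_spec  <- apply(X, 2, sd)

# Bring both curves to a similar height before overplotting
k    <- max(abs(loading1)) / max(sd_spec)
both <- cbind(loading1 = loading1, sd_scaled = sd_spec * k)
matplot(wavelengths, both, type = "l", lty = 1, col = c("black", "red"),
        xlab = "Wavelength (nm)", ylab = "Loading / scaled SD")
legend("topleft", legend = colnames(both), lty = 1, col = c("black", "red"))

# How similar are the two shapes?
shape_cor <- cor(loading1, sd_spec)
```

When one source of variance dominates, as the scatter does in this simulation, the two curves have almost the same shape and `shape_cor` is close to 1.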
The second loading will give us more details about the band positions.
I'm going to use the function “findPeaks”, from the package “quantmod”.
> findPeaks(loading2)
 X878 X932 X972
  15   42   62
The band at 932 nm (data point 42) is probably due to a C-H third overtone vibration of fat. The band at 972 nm is related to the CH2 vibration and to water. The band at 878 nm also seems to be related to fat.
If possible, we can also interpret the other loadings.
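The peak search itself can also be sketched in base R, without the package function. This is an illustration under stated assumptions: `find_local_maxima` is a hypothetical helper, not the package implementation, and `loading_demo` is a synthetic curve with three bands placed at the positions reported above, not the real second loading.

```r
# Indices of local maxima: points where the slope changes from positive to negative
find_local_maxima <- function(x) {
  which(diff(sign(diff(x))) < 0) + 1
}

wavelengths <- seq(850, 1048, by = 2)

# Synthetic stand-in for a loading: three Gaussian bands (hypothetical shapes)
loading_demo <- dnorm(wavelengths, mean = 878, sd = 6) +
                1.5 * dnorm(wavelengths, mean = 932, sd = 8) +
                dnorm(wavelengths, mean = 972, sd = 7)

peaks <- find_local_maxima(loading_demo)
data.frame(data_point = peaks, wavelength_nm = wavelengths[peaks])
```

Converting the data point index back to a wavelength with `wavelengths[peaks]` makes the band assignment easier to read than the raw index.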

We saw that one of the samples (66) has an MD of 11.6. Let's look at the values of the six constituents for this sample:
> fattyac_msc[66,1:6]
   C16_0  C16_1  C18_0  C18_1  C18_2  C18_3
66  15.8     2     6    62.3   10.2    0.6
Let's compare them with the summary:
> summary(fattyac_msc)
     C16_0           C16_1
 Min.   : 0.00   Min.   :1.500
 1st Qu.:20.10   1st Qu.:2.000
 Median :21.00   Median :2.200
 Mean   :21.34   Mean   :2.267
 3rd Qu.:22.90   3rd Qu.:2.500
 Max.   :26.00   Max.   :3.500

     C18_0            C18_1
 Min.   : 5.800   Min.   :43.80
 1st Qu.: 8.600   1st Qu.:51.95
 Median : 9.400   Median :54.50
 Mean   : 9.711   Mean   :53.93
 3rd Qu.:10.500   3rd Qu.:56.15
 Max.   :14.000   Max.   :62.30

     C18_2            C18_3
 Min.   : 5.500   Min.   :0.3000
 1st Qu.: 7.600   1st Qu.:0.5000
 Median : 8.500   Median :0.6000
 Mean   : 8.503   Mean   :0.6032
 3rd Qu.: 9.100   3rd Qu.:0.7000
 Max.   :14.700   Max.   :1.3000
Sample 66 has the highest value for C18:1 (oleic acid), 62.3, but it is not isolated in the histogram. For some reason this sample differs from the others, especially from 100 to 1050 nm. We will wait before making a decision about this sample.
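Reading one sample against the summary can be made more systematic by computing where it falls in each distribution. The sketch below uses a simulated table; `fa_demo`, its values and the forced maximum in row 66 are assumptions for illustration only, not the data from this study.

```r
set.seed(42)
# Hypothetical constituent table: 80 samples, two fatty acids
fa_demo <- data.frame(C18_1 = rnorm(80, mean = 54, sd = 3),
                      C18_2 = rnorm(80, mean = 8.5, sd = 1.5))
# Make row 66 the largest C18_1 value on purpose, to mimic the case discussed above
fa_demo[66, "C18_1"] <- max(fa_demo$C18_1) + 1

sample_id <- 66

# Fraction of samples with a value less than or equal to this sample's value
# (1 means the sample holds the maximum of the column)
pct_rank <- sapply(fa_demo, function(col) mean(col <= col[sample_id]))

# Robust z-score: distance from the median in units of the MAD
robust_z <- sapply(fa_demo, function(col) (col[sample_id] - median(col)) / mad(col))

round(rbind(value = unlist(fa_demo[sample_id, ]), pct_rank, robust_z), 2)
```

A sample can hold the maximum of a column (a `pct_rank` of 1) and still have a moderate robust z-score, which is the situation of a value that is the highest but not isolated in the histogram.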
Until now we have been working with the X matrix.
Now we start to study the Y matrix. The first thing to do is to look at the summary and, of course, at the histograms.
If you want to follow this tutorial, please send me an e-mail. I'll send you the “txt” file attached.


> hist(C16_0,col="red")
> hist(C16_1,col="blue")
> hist(C18_0,col="green")
> hist(C18_1,col="brown")
> hist(C18_2,col="violet")
> hist(C18_3,col="orange")
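The six histograms can also be drawn together in one figure, which makes them easier to compare. This is a minimal sketch on simulated values: `Y_demo` is a hypothetical stand-in for the Y matrix, and only the column names and colours are taken from the commands above.

```r
set.seed(7)
fa_names <- c("C16_0", "C16_1", "C18_0", "C18_1", "C18_2", "C18_3")
cols     <- c("red", "blue", "green", "brown", "violet", "orange")

# Hypothetical constituent values: 50 samples for each of the six fatty acids
Y_demo <- as.data.frame(matrix(rnorm(6 * 50, mean = 10, sd = 2), ncol = 6,
                               dimnames = list(NULL, fa_names)))

old_par <- par(mfrow = c(2, 3))     # 2 x 3 grid of panels
hist_list <- Map(function(v, nm, cl) hist(v, col = cl, main = nm, xlab = nm),
                 Y_demo, fa_names, cols)
par(old_par)                        # restore the previous layout
```

With the real data, `Y_demo` would be replaced by the six constituent columns of the data frame.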

