(This article was first published on

**NIR-Quimiometría**, and kindly contributed to R-bloggers)We started some posts based on the tutorials of:

“Multivariate Statistical Analysis using the R package

The first post was:

“Multivariate Statistical Analysis using the R package

**chemometrics**“The first post was:

Now we continue with a second part.

The graphics help us to decide the number of PCs, but for the tutorial we decided 5 PCs.

So, let´s calculate the PC space:

The graphics help us to decide the number of PCs, but for the tutorial we decided 5 PCs.

So, let´s calculate the PC space:

**> X.pca<-princomp(X)**

The function

**pcaDiagplot**gave to us an interesting plot which help to us to detect outliers, let´s apply this function in R:**> res<-pcaDiagplot(X,X.pca,a=5)**

We get this plot:

:

In this plots we see two distances: orthogonal and score distances.

In this plots we see two distances: orthogonal and score distances.

**Orthogonal distance:**Distance between the object (in the original space) and its orthogonal projection on the PCA subspace.

**Score distance:**Distance of and object projected on the PCA space to the center.Some chemometric software has the NIPALS Algorithm included in order to reduce all our original X matrix to a few principal components.

NIPALS Algorithm

*(nonlinear iterative partial least square algorithm)*was developed by H. Wold (1966).The idea is to substract the reconstruction matrix by the first PC1 to the original matrix (X) getting a residual matrix (E1). From this “E” matrix, we calculate the second principal component PC2.

We have a new reconstruction matrix (the sum of PC1 and PC2) and we substract it again from X, getting a smaller residual matrix (E2), and we continue again with more PCs untill the desired number of PCs or untill “E” becomes very small.

A graphics plot of the variance explained versus number of PCs will help us to decide the cuttoff.

In Principal Components Analysis with “R” (Part: 001), we decided looking at this plot 5 PCs. So lets apply in R this number of components to the NIPALS Algorithm.

*We get some warnings like:*

*WARNING! Iteration stop in h= 2 without convergence!*

*In the NIPALS Algorithm, there is one more argument called iteractions (stepwise calculation), that in the case of “*

**nipals(X,a=5)”**

*lives the default value*

**(it=10),**

*and the tolerance limit too*

**(tol=0,0001)**.

*Let´s try with:*

**> X_nipals<-nipals(X,a=5,it=160)**

*No warnings in this case.*

**> X_nipals<-list(scores=X_nipals$T,loadings=X_nipals$P,sdev=apply(X_nipals$T,2,sd))**

**> res<-pcaDiagplot(X,X.pca=X_nipals,a=5)**

*We get this plot:*

Graphics, for:**res<-pcaDiagplot(X,X.pca=X_nipals,a=5)**

and

**> res<-pcaDiagplot(X,X.pca,a=5)**

are almost the same and I not notice any diference.

We will continue soon.

To

**leave a comment**for the author, please follow the link and comment on their blog:**NIR-Quimiometría**.R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...