270 search results for "PCA"

PCA / EOF for data with missing values – a comparison of accuracy

September 15, 2014

Not all Principal Component Analysis (PCA) approaches (also called Empirical Orthogonal Function analysis, EOF) are equal when it comes to dealing with a data field that contains missing values (i.e. is "gappy"). The following post compares several methods by assessing the accuracy of the derived PCs to reconstruct the "true" data set, as was similarly...

PCA and K-means Clustering of Delta Aircraft

June 22, 2014

Introduction

I work in consulting. If you're a consultant at a certain type of company, agency, organization, consultancy, whatever, this can sometimes mean travelling a lot.

Many business travellers 'in the know' have heard the old joke that if you want to stay at any type of hotel anywhere in the world and get a great rate, all you have to...

Computing and visualizing PCA in R

November 28, 2013

Following my introduction to PCA, I will demonstrate how to apply and visualize PCA in R. Many packages and functions can perform PCA in R. In this post I will use the function prcomp from the stats package. I will also show how to visualize PCA using base R graphics.

PCA or SPCA or NSPCA?

November 15, 2013

Principal component analysis (PCA) is one of the classical methods in multivariate statistics. It is now also widely used for data processing and dimension reduction. Beyond statistics, PCA has numerous applications in engineering, biology, and so on. PCA has two main optimality properties: it guarantees minimal information loss, and it yields uncorrelated principal components. That's …

Introduction to Feature selection for bioinformaticians using R, correlation matrix filters, PCA & backward selection

October 17, 2013

Bioinformatics is increasingly becoming a data mining field. Every passing day, genomics and proteomics yield bucketloads of multivariate data (genes, proteins, DNA, identified peptides, structures), and every one of these biological data units is described by a number of features: length, physicochemical properties, scores, etc. Careful consideration of which features to select when trying...

PCA to PLS modeling analysis strategy for WIDE DATA

March 2, 2013

Working with wide data is already hard enough; add row outliers and things can get murky fast. Here is an example of an analysis of a wide data set (24 rows x 84 columns), using imDEV, written in R, to calculate and visualize a principal components analysis (PCA) on this data set. We find that

Finding a pin in a haystack – PCA image filtering

December 4, 2012

I found the following post regarding the anomalous metal object observed in a Curiosity Rover photo to be fascinating - specifically, the clever ways that some programmers filtered the image to isolate the object. The following answer on mathematica.stackexchange.com was especially illuminating for its use of a multivariate distribution to...

Looking to the PCA scores with GGobi

October 21, 2012

In this post I continue with the unsupervised exploration of oil spectra, which we saw in the previous post (PCA with "ChemoSpec" - 001). In the manual "ChemoSpec: An R Package for Chemometric Analysis of Spectroscopic Data" (page 23), there is a brie...

PCA with "ChemoSpec" – 001

October 20, 2012

In my last post about the "ChemoSpec" package (Hierarchical Cluster Analysis (ChemoSpec) - 02), we saw the two cluster groups (one for olive oil, the other for sunflower oil), as well as further sub-clusters for the sunflower oil. Continue reading the manual "Che...

PCA or Polluting your Clever Analysis

August 31, 2012

When I learned about principal component analysis (PCA), I thought it would be really useful in big data analysis, but that's not true if you want to do prediction. I tried PCA in my first competition on Kaggle, but it delivered bad results. This post illustrates how PCA can pollute good predictors. When I started examining this problem,...
