
The basic idea is to measure a range of features of each time series (such as strength of seasonality, an index of spikiness, first-order autocorrelation, etc.). Then a principal component decomposition of the feature matrix is calculated, and outliers are identified in the two-dimensional space of the first two principal component scores.
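As a rough illustration of this pipeline (not the package's own code, and written in Python rather than R), the sketch below computes three simple features per series, standardises them, and projects them onto the first two principal components. The feature set is an assumption made for the example: lag-1 autocorrelation, standard deviation, and the largest absolute z-score as a crude stand-in for spikiness. The function names and the simulated data are hypothetical, and the eigenvectors come from plain power iteration to keep the example dependency-free.

```python
import math
import random
from statistics import fmean, pstdev


def lag1_autocorr(x):
    """First-order autocorrelation of the series x."""
    m = fmean(x)
    den = sum((v - m) ** 2 for v in x)
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(len(x) - 1))
    return num / den if den > 0 else 0.0


def features(x):
    """Illustrative features (an assumption, not the package's feature set)."""
    m, s = fmean(x), pstdev(x)
    peak = max(abs(v - m) for v in x) / s if s > 0 else 0.0
    return [lag1_autocorr(x), s, peak]


def standardise(rows):
    """Scale each feature column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [fmean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]


def leading_eigen(mat, iters=2000):
    """Leading eigenvector and eigenvalue of a symmetric PSD matrix (power iteration)."""
    p = len(mat)
    v = [float(i + 1) for i in range(p)]
    norm = math.sqrt(sum(a * a for a in v))
    v = [a / norm for a in v]
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm == 0.0:
            break
        v = [a / norm for a in w]
    lam = sum(v[i] * mat[i][j] * v[j] for i in range(p) for j in range(p))
    return v, lam


def pc_scores(rows):
    """Scores of each row on the first two principal components."""
    z = standardise(rows)
    n, p = len(z), len(z[0])
    cov = [[sum(r[i] * r[j] for r in z) / n for j in range(p)] for i in range(p)]
    v1, lam1 = leading_eigen(cov)
    # Deflate the covariance matrix so the next power iteration finds the second component.
    rest = [[cov[i][j] - lam1 * v1[i] * v1[j] for j in range(p)] for i in range(p)]
    v2, _ = leading_eigen(rest)
    return [[sum(a * b for a, b in zip(r, v)) for v in (v1, v2)] for r in z]


# Simulated data: 30 white-noise series plus one random walk (index 30).
random.seed(1)
series = [[random.gauss(0, 1) for _ in range(100)] for _ in range(30)]
walk, level = [], 0.0
for _ in range(100):
    level += random.gauss(0, 1)
    walk.append(level)
series.append(walk)

scores = pc_scores([features(s) for s in series])
dist = [math.hypot(a, b) for a, b in scores]
print("series furthest from the origin in PC space:",
      max(range(len(dist)), key=dist.__getitem__))
```

Distance from the origin is used here only as a quick check that the random walk stands apart. It is not one of the two ranking methods described in this post.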

We use two methods to identify outliers.

A bivariate kernel density estimate of the first two PC scores is computed, and the points are ordered based on the value of the density at each observation. This gives us a ranking of most outlying (least density) to least outlying (highest density).

A series of α-convex hulls are computed on the first two PC scores with decreasing α, and points are classified as outliers when they become singletons separated from the main hull. This gives us an alternative ranking with the most outlying having separated at the highest value of α, and the remaining outliers with decreasing values of α.
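The first of these two methods, the density ranking, can be sketched in a few lines. This is a minimal illustration in Python under stated assumptions, not the package's implementation: it uses a Gaussian product kernel with normal-reference bandwidths (standard deviation times n^(-1/6)), and the function name `density_ranking` and the toy points are hypothetical. It also assumes both coordinates vary, so that neither bandwidth is zero.

```python
import math
from statistics import pstdev


def density_ranking(points):
    """Return (indices ordered from lowest to highest density, densities)."""
    n = len(points)
    # Normal-reference bandwidths for two dimensions (an assumption for this sketch).
    hx = pstdev([p[0] for p in points]) * n ** (-1 / 6)
    hy = pstdev([p[1] for p in points]) * n ** (-1 / 6)

    def density(p):
        # Bivariate Gaussian product-kernel estimate evaluated at p.
        total = 0.0
        for q in points:
            u = (p[0] - q[0]) / hx
            v = (p[1] - q[1]) / hy
            total += math.exp(-0.5 * (u * u + v * v))
        return total / (n * 2 * math.pi * hx * hy)

    dens = [density(p) for p in points]
    return sorted(range(n), key=dens.__getitem__), dens


# Toy PC scores: a 3 x 3 grid around the origin plus one isolated point (index 9).
points = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)] + [(10, 10)]
ranking, dens = density_ranking(points)
print("most outlying index:", ranking[0])
print("least outlying index:", ranking[-1])
```

The α-hull ranking is not sketched here, since it needs a computational-geometry routine of the kind the alphahull package provides.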

The density-ranking of PC scores was also used in my work on detecting outliers in functional data. See my 2010 JCGS paper and the associated rainbow package for R.

There are two versions of the package: one under an ACM licence, and a limited version under a GPL licence. Eventually we hope to make the GPL version contain everything, but we are currently dependent on the alphahull package which has an ACM licence.


To leave a comment for the author, please follow the link and comment on their blog: Hyndsight » R.