# Mahalanobis distance with "R" (Exercise)

May 29, 2012
(This article was first published on NIR-Quimiometría, and kindly contributed to R-bloggers)

I worked through this exercise with Excel in another post; this time I am going to do the same calculations with “R”.

1    28    31   130.0    68.12
2    24    28   143.0   127.89
3    28    20   136.0    89.03
4    32    34   130.5    78.28
5    22    15   125.0   134.08
6    26    37   147.5   135.31
7    24    19   135.0   130.48
8    28    22   125.0    86.48
9    24    26   127.0   129.47
10   30    21   139.0    82.43
11   22    20   121.5   127.41
12   30    38   150.5    71.21
13   24    17   120.0   132.06
14   26    20   125.0    90.85

We import the data into R.
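
As a minimal sketch of this step, the table above can be typed in directly as a numeric matrix. The object name x matches the one used in the rest of the post. The column names long., peso and mg.kg are taken from the covariance output further down, while V1 is only a placeholder (an assumption), since the name of the first variable is not shown in this post.

```r
# The 14 observations from the table above, one row per sample.
x <- matrix(c(
  28, 31, 130.0,  68.12,
  24, 28, 143.0, 127.89,
  28, 20, 136.0,  89.03,
  32, 34, 130.5,  78.28,
  22, 15, 125.0, 134.08,
  26, 37, 147.5, 135.31,
  24, 19, 135.0, 130.48,
  28, 22, 125.0,  86.48,
  24, 26, 127.0, 129.47,
  30, 21, 139.0,  82.43,
  22, 20, 121.5, 127.41,
  30, 38, 150.5,  71.21,
  24, 17, 120.0, 132.06,
  26, 20, 125.0,  90.85
), ncol = 4, byrow = TRUE,
# "V1" is a placeholder name (assumption); the other names come from the post.
dimnames = list(NULL, c("V1", "long.", "peso", "mg.kg")))

dim(x)       # number of rows (samples) and columns (variables)
colMeans(x)  # quick check that the data were entered correctly
```

If the data live in a text or CSV file instead, read.table() or read.csv() followed by as.matrix() would be the usual route.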
We are going to apply the Mahalanobis Distance formula:

D^2 = (x – μ)’ Σ^-1 (x – μ)
We calculate μ (the mean vector) with:
mean<-colMeans(x)
26.28571  24.85714 132.50000 105.93571
We calculate Σ (the covariance matrix, Sx) with:
Sx<-cov(x)
> Sx
long.  12.813187  56.90110  49.11538 -70.62066
peso   12.076923  49.11538  92.80769 -46.06962
mg.kg -72.154066 -70.62066 -46.06962 714.00118
The mahalanobis function defaults to inverted=FALSE, which means it expects the covariance matrix Sx and computes the inverse itself. If we have already computed the inverse separately, we must remember to pass it with inverted=TRUE.
See the R help page for details (type ?mahalanobis at the R prompt).
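
The two ways of calling the function can be compared with a short sketch. The names center, Sx_inv, D2_default and D2_inverted are illustrative choices of mine, not from the original post, and the data are re-entered column by column so the block runs on its own.

```r
# Same 14 x 4 data set as in the table, entered one column at a time.
x <- cbind(
  c(28, 24, 28, 32, 22, 26, 24, 28, 24, 30, 22, 30, 24, 26),
  c(31, 28, 20, 34, 15, 37, 19, 22, 26, 21, 20, 38, 17, 20),
  c(130, 143, 136, 130.5, 125, 147.5, 135, 125, 127, 139, 121.5, 150.5, 120, 125),
  c(68.12, 127.89, 89.03, 78.28, 134.08, 135.31, 130.48,
    86.48, 129.47, 82.43, 127.41, 71.21, 132.06, 90.85)
)

center <- colMeans(x)   # mean vector
Sx     <- cov(x)        # covariance matrix
Sx_inv <- solve(Sx)     # inverse computed separately

# Default: pass the covariance matrix and let mahalanobis() invert it.
D2_default  <- mahalanobis(x, center, Sx)

# Alternative: pass the inverse and say so with inverted = TRUE.
D2_inverted <- mahalanobis(x, center, Sx_inv, inverted = TRUE)

all.equal(D2_default, D2_inverted)
```

Passing the inverse without inverted = TRUE would not raise an error; it would silently give wrong distances, which is why the argument is worth remembering.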
O.K. Let's go:
> D2<-mahalanobis(x,mean,Sx)
> D2
[1] 5.571677 2.863499 2.686127 7.766153 2.379621 6.366793 2.135347 1.538248
[9] 2.018812 5.143830 3.082734 5.470313 3.158651 1.818195

These are the values on the diagonal of the matrix we obtained with the calculations in Excel.
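
To make the link with the diagonal explicit, here is a hedged sketch that applies the formula to all rows at once with plain matrix algebra and then compares the diagonal with the result of mahalanobis(). It assumes the Excel calculation formed the product (X – μ) Σ^-1 (X – μ)’ for the whole data set; the names center, Xc, M and D2_manual are my own.

```r
# Same 14 x 4 data set as in the table, entered one column at a time.
x <- cbind(
  c(28, 24, 28, 32, 22, 26, 24, 28, 24, 30, 22, 30, 24, 26),
  c(31, 28, 20, 34, 15, 37, 19, 22, 26, 21, 20, 38, 17, 20),
  c(130, 143, 136, 130.5, 125, 147.5, 135, 125, 127, 139, 121.5, 150.5, 120, 125),
  c(68.12, 127.89, 89.03, 78.28, 134.08, 135.31, 130.48,
    86.48, 129.47, 82.43, 127.41, 71.21, 132.06, 90.85)
)

center <- colMeans(x)
Sx     <- cov(x)

# Centre the data: subtract each column mean from its column.
Xc <- sweep(x, 2, center)

# 14 x 14 matrix; entry [i, i] is the squared distance of sample i,
# the off-diagonal entries are cross products between samples.
M <- Xc %*% solve(Sx) %*% t(Xc)

D2_manual <- diag(M)
D2        <- mahalanobis(x, center, Sx)

all.equal(D2_manual, D2, check.attributes = FALSE)
```

Building the full 14 x 14 matrix is fine for a small exercise like this one; for large data sets mahalanobis() is preferable because it never forms the off-diagonal entries.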
