Mahalanobis distance with "R" (Exercise)

May 29, 2012
(This article was first published on NIR-Quimiometría, and kindly contributed to R-bloggers)

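I worked through this exercise with Excel in another post. This time I am going to carry out the same calculations with "R", using the following data (edad and peso are Spanish for age and weight; long. is presumably an abbreviation of longitud, length):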

    edad  long.  peso    mg.kg
1    28    31   130.0    68.12
2    24    28   143.0   127.89
3    28    20   136.0    89.03
4    32    34   130.5    78.28
5    22    15   125.0   134.08
6    26    37   147.5   135.31
7    24    19   135.0   130.48
8    28    22   125.0    86.48
9    24    26   127.0   129.47
10   30    21   139.0    82.43
11   22    20   121.5   127.41
12   30    38   150.5    71.21
13   24    17   120.0   132.06
14   26    20   125.0    90.85

We import the data into R.
 x<-read.table("C:\\lead_fish.txt",header=TRUE)
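If the file C:\lead_fish.txt is not at hand, the same data frame can be built directly from the table above. This is only a sketch: the variable name txt is an illustrative choice, and it assumes the table is pasted exactly as printed, so that read.table() takes the first field of each row as the row name because the header has one field fewer than the data rows.

```r
# Table copied from the post: 4 column names in the header, 5 fields per row,
# so read.table() treats the first field of each row as the row name.
txt <- "edad long. peso mg.kg
1 28 31 130.0 68.12
2 24 28 143.0 127.89
3 28 20 136.0 89.03
4 32 34 130.5 78.28
5 22 15 125.0 134.08
6 26 37 147.5 135.31
7 24 19 135.0 130.48
8 28 22 125.0 86.48
9 24 26 127.0 129.47
10 30 21 139.0 82.43
11 22 20 121.5 127.41
12 30 38 150.5 71.21
13 24 17 120.0 132.06
14 26 20 125.0 90.85"

x <- read.table(text = txt, header = TRUE)
str(x)        # 14 observations of 4 variables
colMeans(x)   # column means, to compare with the values printed in the post
```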
We are going to apply the Mahalanobis Distance formula:
D^2 = (x - μ)' Σ^-1 (x - μ)
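To make the formula concrete, here is a minimal sketch that evaluates it by hand for the first fish and compares the result with mahalanobis(). The data frame is typed in from the table above so that the snippet runs without the text file; the names X, mu, d and D2_1 are illustrative, not the author's.

```r
# Data retyped from the table in the post.
x <- data.frame(
  edad  = c(28, 24, 28, 32, 22, 26, 24, 28, 24, 30, 22, 30, 24, 26),
  long. = c(31, 28, 20, 34, 15, 37, 19, 22, 26, 21, 20, 38, 17, 20),
  peso  = c(130, 143, 136, 130.5, 125, 147.5, 135, 125, 127, 139, 121.5, 150.5, 120, 125),
  mg.kg = c(68.12, 127.89, 89.03, 78.28, 134.08, 135.31, 130.48,
            86.48, 129.47, 82.43, 127.41, 71.21, 132.06, 90.85)
)

X  <- as.matrix(x)
mu <- colMeans(X)      # mean vector (mu in the formula)
Sx <- cov(X)           # sample covariance matrix (estimate of Sigma)
d  <- X[1, ] - mu      # deviation of the first observation from the mean

# D^2 = (x - mu)' Sigma^-1 (x - mu), written out with matrix products
D2_1 <- drop(t(d) %*% solve(Sx) %*% d)
D2_1
mahalanobis(X[1, ], mu, Sx)   # should give the same number
```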
We calculate μ (mean) with:
mean<-colMeans(x)
> mean
   edad     long.      peso     mg.kg
 26.28571  24.85714 132.50000 105.93571
We calculate Σ (covariance matrix (Sx)) with:
Sx<-cov(x)
> Sx
            edad     long.      peso     mg.kg
edad    9.758242  12.81319  12.07692 -72.15407
long.  12.813187  56.90110  49.11538 -70.62066
peso   12.076923  49.11538  92.80769 -46.06962
mg.kg -72.154066 -70.62066 -46.06962 714.00118
The default value for the mahalanobis function is inverted=FALSE, so the function will calculate the inverse of Sx itself. If we have already calculated the inverse separately, we must remember to change it to inverted=TRUE.
See the R help page for mahalanobis (type ?mahalanobis at the R console).
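As a quick check of the inverted argument, the following sketch computes the distances both ways and confirms that they agree. The data are retyped from the table so that it runs on its own; mu, Sx_inv, D2_a and D2_b are illustrative names.

```r
# Data retyped from the table in the post.
x <- data.frame(
  edad  = c(28, 24, 28, 32, 22, 26, 24, 28, 24, 30, 22, 30, 24, 26),
  long. = c(31, 28, 20, 34, 15, 37, 19, 22, 26, 21, 20, 38, 17, 20),
  peso  = c(130, 143, 136, 130.5, 125, 147.5, 135, 125, 127, 139, 121.5, 150.5, 120, 125),
  mg.kg = c(68.12, 127.89, 89.03, 78.28, 134.08, 135.31, 130.48,
            86.48, 129.47, 82.43, 127.41, 71.21, 132.06, 90.85)
)

mu     <- colMeans(x)
Sx     <- cov(x)
Sx_inv <- solve(Sx)    # inverse covariance matrix, computed separately

D2_a <- mahalanobis(x, mu, Sx)                        # function inverts Sx itself
D2_b <- mahalanobis(x, mu, Sx_inv, inverted = TRUE)   # inverse supplied by us

all.equal(D2_a, D2_b)  # TRUE when both routes agree
```

Passing Sx_inv without inverted = TRUE would not raise an error; the function would simply invert it again and return distances that are wrong for this problem.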

O.K. Let's go:
> D2<-mahalanobis(x,mean,Sx)
> D2
 [1] 5.571677 2.863499 2.686127 7.766153 2.379621 6.366793 2.135347 1.538248
 [9] 2.018812 5.143830 3.082734 5.470313 3.158651 1.818195

These are the values on the diagonal of the matrix we saw in the calculations with Excel.
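The link between D2 and a matrix diagonal can be sketched in R as well. This assumes the matrix in the Excel post was the 14 x 14 product of the centred data, the inverse covariance matrix and the transposed centred data; the spreadsheet itself is not reproduced here, and Xc, M and D2_diag are illustrative names.

```r
# Data retyped from the table in the post.
x <- data.frame(
  edad  = c(28, 24, 28, 32, 22, 26, 24, 28, 24, 30, 22, 30, 24, 26),
  long. = c(31, 28, 20, 34, 15, 37, 19, 22, 26, 21, 20, 38, 17, 20),
  peso  = c(130, 143, 136, 130.5, 125, 147.5, 135, 125, 127, 139, 121.5, 150.5, 120, 125),
  mg.kg = c(68.12, 127.89, 89.03, 78.28, 134.08, 135.31, 130.48,
            86.48, 129.47, 82.43, 127.41, 71.21, 132.06, 90.85)
)

X  <- as.matrix(x)
Xc <- sweep(X, 2, colMeans(X))          # centre each column on its mean
M  <- Xc %*% solve(cov(X)) %*% t(Xc)    # 14 x 14 matrix
D2_diag <- diag(M)                      # diagonal = squared Mahalanobis distances

all.equal(unname(D2_diag), unname(mahalanobis(X, colMeans(X), cov(X))))
sum(D2_diag)   # equals (n - 1) * p = 13 * 4 = 52, up to floating-point rounding
```

The off-diagonal entries of M are cross-products between different fish and are not needed for the distances, which is why mahalanobis() returns only the 14 diagonal values.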
 



 
