Mahalanobis distance with "R" (Exercise)

May 29, 2012

(This article was first published on NIR-Quimiometría, and kindly contributed to R-bloggers)

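I developed this exercise with Excel in another post; this time I am going to carry out the same calculations with "R". The column names are in Spanish: edad = age, long. = length, peso = weight; mg.kg is a concentration in mg/kg (presumably of lead, judging by the name of the data file).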

    edad  long.  peso    mg.kg
1    28    31   130.0    68.12
2    24    28   143.0   127.89
3    28    20   136.0    89.03
4    32    34   130.5    78.28
5    22    15   125.0   134.08
6    26    37   147.5   135.31
7    24    19   135.0   130.48
8    28    22   125.0    86.48
9    24    26   127.0   129.47
10   30    21   139.0    82.43
11   22    20   121.5   127.41
12   30    38   150.5    71.21
13   24    17   120.0   132.06
14   26    20   125.0    90.85

We import the data into R.
x <- read.table("C:\\lead_fish.txt", header = TRUE)
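If the text file is not at hand, the same table can be typed in directly. This is only a sketch of an alternative way to get the data into R, not the way the original exercise was done; the values are copied from the table above.

```r
# The 14 observations from the table above, entered by hand
# (an alternative to read.table() when lead_fish.txt is not available)
x <- data.frame(
  edad  = c(28, 24, 28, 32, 22, 26, 24, 28, 24, 30, 22, 30, 24, 26),
  long. = c(31, 28, 20, 34, 15, 37, 19, 22, 26, 21, 20, 38, 17, 20),
  peso  = c(130.0, 143.0, 136.0, 130.5, 125.0, 147.5, 135.0,
            125.0, 127.0, 139.0, 121.5, 150.5, 120.0, 125.0),
  mg.kg = c(68.12, 127.89, 89.03, 78.28, 134.08, 135.31, 130.48,
            86.48, 129.47, 82.43, 127.41, 71.21, 132.06, 90.85)
)

# Quick check of the structure: 14 rows, 4 numeric columns
str(x)
```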
We are going to apply the Mahalanobis Distance formula:

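D^2 = (x - μ)' Σ^-1 (x - μ)
where x is one observation (a row of the table), μ is the vector of column means and Σ is the covariance matrix.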
We calculate μ (mean) with:
mean <- colMeans(x)
> mean
   edad     long.      peso     mg.kg
 26.28571  24.85714 132.50000 105.93571
We calculate Σ (the covariance matrix, Sx) with:
Sx <- cov(x)
> Sx
            edad     long.      peso     mg.kg
edad    9.758242  12.81319  12.07692 -72.15407
long.  12.813187  56.90110  49.11538 -70.62066
peso   12.076923  49.11538  92.80769 -46.06962
mg.kg -72.154066 -70.62066 -46.06962 714.00118
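To see what cov() is doing, the covariance matrix can also be built by hand: centre each column on its mean, multiply the centred matrix by its transpose and divide by n - 1. The sketch below does this and compares the result with cov(x). The names xc and Sx_manual are my own, and the data are typed in again so that the snippet runs on its own.

```r
# Same data as the table above
x <- data.frame(
  edad  = c(28, 24, 28, 32, 22, 26, 24, 28, 24, 30, 22, 30, 24, 26),
  long. = c(31, 28, 20, 34, 15, 37, 19, 22, 26, 21, 20, 38, 17, 20),
  peso  = c(130.0, 143.0, 136.0, 130.5, 125.0, 147.5, 135.0,
            125.0, 127.0, 139.0, 121.5, 150.5, 120.0, 125.0),
  mg.kg = c(68.12, 127.89, 89.03, 78.28, 134.08, 135.31, 130.48,
            86.48, 129.47, 82.43, 127.41, 71.21, 132.06, 90.85)
)

n  <- nrow(x)

# Subtract the column means (no scaling to unit variance)
xc <- scale(x, center = TRUE, scale = FALSE)

# crossprod(xc) is t(xc) %*% xc; dividing by (n - 1) gives the sample covariance
Sx_manual <- crossprod(xc) / (n - 1)

# Compare with the built-in function (attributes such as dimnames are ignored)
all.equal(Sx_manual, cov(x), check.attributes = FALSE)
```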
By default, the mahalanobis() function uses inverted = FALSE, so it calculates the inverse of Sx itself. If we have already calculated the inverse separately, we must pass the inverse instead of Sx and remember to set inverted = TRUE.
See the R help (type ?mahalanobis in the R console).
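As a small illustration of the inverted argument, the sketch below calculates the distances both ways: once letting mahalanobis() invert Sx, and once passing an inverse obtained beforehand with solve(). Both calls should give the same distances. I use the name mu for the mean vector here, and Sx_inv, D2_default and D2_inverted are also names of my own; the data are repeated so the snippet is self-contained.

```r
# Same data as the table above
x <- data.frame(
  edad  = c(28, 24, 28, 32, 22, 26, 24, 28, 24, 30, 22, 30, 24, 26),
  long. = c(31, 28, 20, 34, 15, 37, 19, 22, 26, 21, 20, 38, 17, 20),
  peso  = c(130.0, 143.0, 136.0, 130.5, 125.0, 147.5, 135.0,
            125.0, 127.0, 139.0, 121.5, 150.5, 120.0, 125.0),
  mg.kg = c(68.12, 127.89, 89.03, 78.28, 134.08, 135.31, 130.48,
            86.48, 129.47, 82.43, 127.41, 71.21, 132.06, 90.85)
)

mu <- colMeans(x)   # mean vector
Sx <- cov(x)        # covariance matrix

# Option 1: pass Sx and let the function invert it (inverted = FALSE is the default)
D2_default <- mahalanobis(x, mu, Sx)

# Option 2: invert Sx ourselves and tell the function that we did so
Sx_inv      <- solve(Sx)
D2_inverted <- mahalanobis(x, mu, Sx_inv, inverted = TRUE)

# Both routes give the same squared distances
all.equal(unname(D2_default), unname(D2_inverted))
```

Forgetting inverted = TRUE in the second call would make the function invert Sx_inv again, which gives wrong distances without any warning.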
OK, let's go:
> D2 <- mahalanobis(x, mean, Sx)
> D2
 [1] 5.571677 2.863499 2.686127 7.766153 2.379621 6.366793 2.135347 1.538248
 [9] 2.018812 5.143830 3.082734 5.470313 3.158651 1.818195

These are the values on the diagonal of the matrix we obtained with the calculations in Excel.
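The matrix in question can also be reproduced in R by applying the formula to all the rows at once. The sketch below is my own reconstruction of that step, not code from the Excel post: it centres the data, forms the 14 x 14 matrix (X - μ) Σ^-1 (X - μ)' and takes its diagonal. The names mu, xc, M and D2_manual are illustrative.

```r
# Same data as the table above
x <- data.frame(
  edad  = c(28, 24, 28, 32, 22, 26, 24, 28, 24, 30, 22, 30, 24, 26),
  long. = c(31, 28, 20, 34, 15, 37, 19, 22, 26, 21, 20, 38, 17, 20),
  peso  = c(130.0, 143.0, 136.0, 130.5, 125.0, 147.5, 135.0,
            125.0, 127.0, 139.0, 121.5, 150.5, 120.0, 125.0),
  mg.kg = c(68.12, 127.89, 89.03, 78.28, 134.08, 135.31, 130.48,
            86.48, 129.47, 82.43, 127.41, 71.21, 132.06, 90.85)
)

mu <- colMeans(x)
Sx <- cov(x)

# Subtract the mean of each column from every row (sweep() subtracts by default)
xc <- sweep(as.matrix(x), 2, mu)

# Full 14 x 14 matrix; only its diagonal holds the squared distances
M <- xc %*% solve(Sx) %*% t(xc)
D2_manual <- diag(M)

# The diagonal agrees with the built-in function
all.equal(unname(D2_manual), unname(mahalanobis(x, mu, Sx)))

# Sanity check: with the sample mean and sample covariance, the squared
# distances always add up to (n - 1) * p, here 13 * 4 = 52
sum(D2_manual)
```

Building the whole matrix is wasteful for large data sets, since only the diagonal is needed; for anything beyond a small exercise, mahalanobis() is the better choice.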

 
 
