Association and concordance measures

September 12, 2012

(This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers)

Following the course, in order to define association measures (from Kruskal (1958)) or concordance measures (from Scarsini (1984)), we first define a concordance function as follows: let http://i0.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-28.gif?w=456 be a random pair with copula http://i2.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-27.gif?w=456, and http://i2.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-29.gif?w=456 a random pair with copula http://i2.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-26.gif?w=456. Then define

http://i1.wp.com/freakonometrics.blog.free.fr/public/perso6/cibc-25.gif?w=456

the so-called concordance function. Thus

http://i2.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-23.gif?w=456

As proved last week in class,

http://i0.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-24.gif?w=456
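For readers who cannot load the formula images, here is how the concordance function is usually written in the copula literature (for instance in Nelsen's An Introduction to Copulas). The symbols below are an assumption and may differ from those in the images: the two pairs are taken to be independent, with continuous and common margins, and with copulas C_1 and C_2.

```latex
\begin{align*}
Q(C_1, C_2)
  &= \mathbb{P}\big((X_1 - X_2)(Y_1 - Y_2) > 0\big)
   - \mathbb{P}\big((X_1 - X_2)(Y_1 - Y_2) < 0\big) \\
  &= 2\,\mathbb{P}\big((X_1 - X_2)(Y_1 - Y_2) > 0\big) - 1 \\
  &= 4 \iint_{[0,1]^2} C_2(u, v)\, \mathrm{d}C_1(u, v) - 1
\end{align*}
```

The second line uses continuity: ties have probability zero, so the two probabilities sum to one.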

Based on that function, several concordance measures can be derived. A popular measure is Kendall's tau, from Kendall (1938), defined as http://i2.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-22.gif?w=456 i.e.

http://i2.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-21.gif?w=456

which is simply http://i0.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-20.gif?w=456. Here, computation can be tricky. Consider the following sample,

> set.seed(1)
> n=40
> library(mnormt)
> X=rmnorm(n,c(0,0),
+ matrix(c(1,.4,.4,1),2,2))
> U=cbind(rank(X[,1]),rank(X[,2]))/(n+1)
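If the mnormt package is not installed, a similar sample can be drawn with base R only. This is a minimal sketch, not the code used in the post: the names Sigma, Z0, X_alt and U_alt are hypothetical, and the draws will not reproduce the numbers shown below because the random number stream is consumed differently.

```r
set.seed(1)
n <- 40
Sigma <- matrix(c(1, 0.4, 0.4, 1), 2, 2)   # target covariance matrix

# chol() returns an upper triangular R with t(R) %*% R == Sigma,
# so rows of Z0 %*% R are centered Gaussian with covariance Sigma
Z0 <- matrix(rnorm(2 * n), n, 2)
X_alt <- Z0 %*% chol(Sigma)

# pseudo-observations: ranks rescaled to lie strictly inside (0, 1)
U_alt <- cbind(rank(X_alt[, 1]), rank(X_alt[, 2])) / (n + 1)
```

Dividing the ranks by n+1 rather than n keeps the pseudo-observations away from the boundary of the unit square.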

Then, using R's built-in cor function, we can obtain Kendall's tau easily,

> cor(X,method="kendall")[1,2]
[1] 0.3794872

To write our own code (and to understand a bit better how that coefficient is obtained), we can enumerate all pairs of observations and count the concordant and discordant ones,

> i=rep(1:(n-1),(n-1):1)
> j=2:n
> for(k in 3:n){j=c(j,k:n)}
> M=cbind(X[i,],X[j,])
> concordant=sum((M[,1]-M[,3])*(M[,2]-M[,4])>0)
> discordant=sum((M[,1]-M[,3])*(M[,2]-M[,4])<0)
> total=n*(n-1)/2
> (K=(concordant-discordant)/total)
[1] 0.3794872

or the following (we'll use the random variable http://i1.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-30.gif?w=456 quite frequently),

> i=rep(1:n,each=n)
> j=rep(1:n,n)
> Z=((X[i,1]>X[j,1])&(X[i,2]>X[j,2]))
> (K=4*mean(Z)*n/(n-1)-1)
[1] 0.3794872
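Why do the two computations agree? Without ties, every one of the N = n(n-1)/2 pairs is either concordant or discordant, so with c concordant pairs, (c - d)/N = 2c/N - 1 = 4c/(n(n-1)) - 1. In the second snippet, each concordant pair is counted exactly once among the n^2 ordered pairs, so mean(Z) = c/n^2, which gives the same value. The sketch below checks this on a small simulated sample; kendall_tau is a hypothetical helper, not a function from the post or from any package.

```r
# Hypothetical helper: sample Kendall's tau computed two ways (assumes no ties)
kendall_tau <- function(x, y) {
  n <- length(x)
  # sign of (x_i - x_j) * (y_i - y_j) for every ordered pair (i, j)
  S <- sign(outer(x, x, "-")) * sign(outer(y, y, "-"))
  c_pairs <- sum(S > 0) / 2   # each unordered pair appears twice in S
  d_pairs <- sum(S < 0) / 2
  N <- n * (n - 1) / 2
  c(difference = (c_pairs - d_pairs) / N,
    concordant_only = 4 * c_pairs / (n * (n - 1)) - 1)
}

set.seed(123)
x <- rnorm(30)
y <- 0.5 * x + rnorm(30)
res <- kendall_tau(x, y)
ref <- cor(x, y, method = "kendall")   # reference value from base R
```

Both entries of res should match ref up to floating point error. With ties, cor reports tau-b, and the identity c + d = N no longer holds.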

Another measure is Spearman's rank correlation, from Spearman (1904),

http://i2.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-05.gif?w=456

where http://i0.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-19.gif?w=456 has distribution http://i2.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-17.gif?w=456.

Here, http://i0.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-07.gif?w=456 which leads to the following expressions

http://i2.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-06.gif?w=456

Numerically, we have the following

> cor(X,method="spearman")[1,2]
[1] 0.5388368
> cor(rank(X[,1]),rank(X[,2]))
[1] 0.5388368

Note that it is also possible to write

http://i2.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-04.gif?w=456
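Two classical closed forms for the sample version of Spearman's rho can be checked numerically. Whether either one is the expression in the image above is an assumption; both hold when there are no ties. The second one is written with squared sums and squared differences of ranks, which makes the role of the L2 norm explicit.

```r
set.seed(123)
x <- rnorm(30)
y <- 0.5 * x + rnorm(30)
n <- length(x)
R <- rank(x)
S <- rank(y)

# classical form based on squared rank differences
rho_d2 <- 1 - 6 * sum((R - S)^2) / (n * (n^2 - 1))

# equivalent form contrasting rank sums and rank differences
rho_l2 <- 3 / (n^3 - n) * (sum((R + S - n - 1)^2) - sum((R - S)^2))

rho_ref <- cor(x, y, method = "spearman")   # reference value from base R
```

Both rho_d2 and rho_l2 should agree with rho_ref up to floating point error.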

Another measure is the cograduation index, from Gini (1914), obtained by substituting an L1 norm for the L2 norm in the previous expression,

http://i1.wp.com/freakonometrics.blog.free.fr/public/perso6/concord-01.gif?w=456

Note that this index can also be defined as http://i0.wp.com/freakonometrics.blog.free.fr/public/perso6/concor-02.gif?w=456. Here,

> Rx=rank(X[,1]);Ry=rank(X[,2]);
> (G=2/(n^2) *(sum(abs(Rx+Ry-n-1))-
+ sum(abs(Rx-Ry))))
[1] 0.41
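The constant 2/n^2 used above is the right normalization when n is even, as it is here with n=40. A commonly stated sample version divides by floor(n^2/2) instead, which also gives exactly 1 and -1 for perfectly concordant and perfectly discordant ranks when n is odd. A minimal sketch, with gini_gamma as a hypothetical helper name and assuming no ties:

```r
# Hypothetical helper: sample Gini cograduation index (assumes no ties)
gini_gamma <- function(x, y) {
  n <- length(x)
  Rx <- rank(x)
  Ry <- rank(y)
  # floor(n^2 / 2) equals n^2 / 2 when n is even
  (sum(abs(Rx + Ry - n - 1)) - sum(abs(Rx - Ry))) / floor(n^2 / 2)
}

gini_gamma(1:5, 1:5)   # identical ranks, gives 1
gini_gamma(1:5, 5:1)   # reversed ranks, gives -1
```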

Finally, another measure is the one from Blomqvist (1950). Let http://i0.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-10.gif?w=456 denote the median of http://i1.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-12.gif?w=456, i.e.

http://i1.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-15.gif?w=456

Then define

http://i2.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-09.gif?w=456

or equivalently

http://i1.wp.com/freakonometrics.blog.free.fr/public/perso6/conc-08.gif?w=456

> Mx=median(X[,1]);My=median(X[,2])
> (B=4*sum((X[,1]<=Mx)*((X[,2]<=My)))/n-1)
[1] 0.4
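Blomqvist's measure can be read in two ways: as a rescaled proportion of points in the lower left quadrant cut out by the medians, or as the average sign of the product of deviations from the medians. The sketch below, with blomqvist_beta as a hypothetical helper name, computes both. They coincide when n is even and there are no ties, since no observation then sits exactly on a median; for odd n they can differ slightly.

```r
# Hypothetical helper: two sample versions of Blomqvist's beta
blomqvist_beta <- function(x, y) {
  mx <- median(x)
  my <- median(y)
  # proportion of points in the lower left quadrant, rescaled to [-1, 1]
  quadrant <- 4 * mean((x <= mx) & (y <= my)) - 1
  # average sign of the product of deviations from the medians
  sign_form <- mean(sign((x - mx) * (y - my)))
  c(quadrant = quadrant, sign_form = sign_form)
}

set.seed(123)
x <- rnorm(40)
y <- 0.5 * x + rnorm(40)
b <- blomqvist_beta(x, y)
```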
