Association and concordance measures

[This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers.]

Following the course, in order to define association measures (from Kruskal (1958)) or concordance measures (from Scarsini (1984)), we first define a concordance function as follows: let $(X_1,Y_1)$ be a random pair with copula $C_1$, and $(X_2,Y_2)$ an independent pair with copula $C_2$, both pairs having the same continuous margins. Then define

$$Q(C_1,C_2)=P\big((X_1-X_2)(Y_1-Y_2)>0\big)-P\big((X_1-X_2)(Y_1-Y_2)<0\big)$$

the so-called concordance function. Thus

$$Q(C_1,C_2)=2P\big((X_1-X_2)(Y_1-Y_2)>0\big)-1$$

As proved last week in class,

$$Q(C_1,C_2)=4\iint_{[0,1]^2}C_2(u,v)\,dC_1(u,v)-1$$
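
To make the definition concrete, here is a rough Monte Carlo sketch of the concordance function. It assumes two independent samples with the same (standard normal) margins; the helper names `sim_pair` and `concordance_q` are made up for this illustration, and the sample is built with base R only, so it is not the sample used later in the post.

```r
# Hypothetical helper (not in the original post): n draws from a bivariate
# normal with correlation r and standard normal margins, base R only.
sim_pair <- function(n, r) {
  x <- rnorm(n)
  cbind(x, r * x + sqrt(1 - r^2) * rnorm(n))
}

# Monte Carlo estimate of Q from two independent samples of equal size,
# using Q = 2 * P((X1 - X2) * (Y1 - Y2) > 0) - 1 (continuous margins, no ties).
concordance_q <- function(A, B) {
  2 * mean((A[, 1] - B[, 1]) * (A[, 2] - B[, 2]) > 0) - 1
}

set.seed(1)
A <- sim_pair(20000, 0.4)        # pair with copula C1
B <- sim_pair(20000, 0.4)        # independent pair with copula C2 = C1
q_same <- concordance_q(A, B)

B_indep <- sim_pair(20000, 0)    # second pair with independent components
q_indep <- concordance_q(A, B_indep)

c(same_copula = q_same, vs_independence = q_indep)
```

Taking the same copula for both pairs should give something close to Kendall's tau of that copula, while comparing against the independence copula gives a smaller value.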

Based on that function, several concordance measures can be derived. A popular measure is Kendall’s tau, from Kendall (1938), defined as $\tau=Q(C,C)$, i.e.

$$\tau=4\iint_{[0,1]^2}C(u,v)\,dC(u,v)-1$$

which is simply $4E[C(U,V)]-1$, where $(U,V)$ has distribution $C$. Here, computation can be tricky. Consider the following sample,

> set.seed(1)
> n=40
> library(mnormt)
> X=rmnorm(n,c(0,0),
+ matrix(c(1,.4,.4,1),2,2))
> U=cbind(rank(X[,1]),rank(X[,2]))/(n+1)

Then, using R's cor() function, we can obtain Kendall’s tau easily,

> cor(X,method="kendall")[1,2]
[1] 0.3794872
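
As a side check, assuming the standard closed form for the Gaussian copula, $\tau=\frac{2}{\pi}\arcsin(r)$, we can compare sample values with the theoretical one. The sketch below simulates with base R instead of mnormt, so its numbers will not match the output above; `sample_tau` is a hypothetical helper.

```r
set.seed(1)
r <- 0.4
tau_theory <- 2 / pi * asin(r)   # closed form for a Gaussian copula

# Hypothetical helper: Kendall's tau on a fresh Gaussian sample of size n
sample_tau <- function(n) {
  x <- rnorm(n)
  y <- r * x + sqrt(1 - r^2) * rnorm(n)
  cor(x, y, method = "kendall")
}

tau_small <- sample_tau(40)      # same size as the sample in the post
tau_large <- sample_tau(5000)    # larger sample, closer to the theoretical value
c(theory = tau_theory, n40 = tau_small, n5000 = tau_large)
```

With correlation 0.4 the theoretical value is about 0.262, so the 0.379 obtained above on 40 points mostly reflects sampling variability.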

To get our own code (and to understand a bit more how to get that coefficient), we can use

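> # index vectors enumerating all n(n-1)/2 pairs (i,j) with i<j
> i=rep(1:(n-1),(n-1):1)
> j=2:n
> for(k in 3:n){j=c(j,k:n)}
> # one row per pair: (X_i,Y_i,X_j,Y_j)
> M=cbind(X[i,],X[j,])
> # a pair is concordant when both differences have the same sign
> concordant=sum((M[,1]-M[,3])*(M[,2]-M[,4])>0)
> discordant=sum((M[,1]-M[,3])*(M[,2]-M[,4])<0)
> total=n*(n-1)/2
> (K=(concordant-discordant)/total)
[1] 0.3794872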
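
The same pair counting can be sketched without building the index vectors by hand. This is only one possible variant, assuming no ties in the data; `kendall_tau` is a name chosen here, and the sample is simulated with base R rather than mnormt.

```r
# Sketch of Kendall's tau via all pairwise sign products (assumes no ties)
kendall_tau <- function(x, y) {
  n <- length(x)
  # entry (i, j) is +1 for a concordant pair, -1 for a discordant one
  s <- sign(outer(x, x, "-")) * sign(outer(y, y, "-"))
  # each unordered pair appears exactly once in the upper triangle
  sum(s[upper.tri(s)]) / choose(n, 2)
}

set.seed(1)
x <- rnorm(40)
y <- 0.4 * x + sqrt(1 - 0.4^2) * rnorm(40)
tau_own <- kendall_tau(x, y)
tau_ref <- cor(x, y, method = "kendall")
c(own = tau_own, cor = tau_ref)
```

The n x n matrices make this quadratic in memory, which is fine for 40 points but not for very large samples.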

or the following (we’ll use the random variable $Z=C(U,V)$ quite frequently),

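> # all n^2 ordered pairs (i,j)
> i=rep(1:n,each=n)
> j=rep(1:n,n)
> # indicator that observation i dominates observation j in both coordinates
> Z=((X[i,1]>X[j,1])&(X[i,2]>X[j,2]))
> # rescale by n/(n-1) since the n pairs with i=j never count
> (K=4*mean(Z)*n/(n-1)-1)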
[1] 0.3794872
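
To see why this works, one can compute, for each observation, the share of the other points lying below it in both coordinates, an empirical counterpart of $C(U,V)$, and then apply $4E[C(U,V)]-1$. The sketch below assumes no ties and uses a base R sample, not the one from the post.

```r
set.seed(1)
n <- 40
x <- rnorm(n)
y <- 0.4 * x + sqrt(1 - 0.4^2) * rnorm(n)

# Z[i]: proportion of the other n - 1 points strictly below point i in both
# coordinates (point i itself never satisfies the strict inequalities)
Z <- sapply(seq_len(n), function(i) sum(x < x[i] & y < y[i]) / (n - 1))

tau_from_Z <- 4 * mean(Z) - 1
c(from_Z = tau_from_Z, cor = cor(x, y, method = "kendall"))
```

Dividing by n - 1 inside the loop plays the same role as the n/(n-1) factor in the code above.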

Another measure is Spearman’s rank correlation, from Spearman (1904),

$$\rho=\mathrm{cor}(U,V)=\frac{E[UV]-E[U]E[V]}{\sqrt{\mathrm{Var}(U)\,\mathrm{Var}(V)}}$$

where $(U,V)$ has distribution $C$.

Here, $E[U]=E[V]=1/2$ and $\mathrm{Var}(U)=\mathrm{Var}(V)=1/12$, which leads to the following expressions

$$\rho=12\,E[UV]-3=12\iint_{[0,1]^2}uv\,dC(u,v)-3=12\iint_{[0,1]^2}C(u,v)\,du\,dv-3$$
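
For completeness, the step from the correlation of $(U,V)$ to the closed form can be written out, assuming only that $U$ and $V$ are uniform on $[0,1]$:

```latex
\begin{aligned}
E[U] &= \int_0^1 u\,du = \frac{1}{2}, \qquad
\operatorname{Var}(U) = \int_0^1 u^2\,du - \frac{1}{4}
                      = \frac{1}{3} - \frac{1}{4} = \frac{1}{12},\\
\rho &= \frac{E[UV] - E[U]\,E[V]}{\sqrt{\operatorname{Var}(U)\operatorname{Var}(V)}}
      = \frac{E[UV] - \frac{1}{4}}{\frac{1}{12}}
      = 12\,E[UV] - 3 .
\end{aligned}
```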

Numerically, we have the following

> cor(X,method="spearman")[1,2]
[1] 0.5388368
> cor(rank(X[,1]),rank(X[,2]))
[1] 0.5388368
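
Two other ways of getting the same number can be sketched, assuming no ties: the classical formula based on squared rank differences, and a plug-in version of $12E[UV]-3$ using rescaled ranks like the `U` matrix defined earlier. The sample below is simulated with base R, so the values differ from those above.

```r
set.seed(1)
n <- 40
x <- rnorm(n)
y <- 0.4 * x + sqrt(1 - 0.4^2) * rnorm(n)
Rx <- rank(x); Ry <- rank(y)

rho_ref <- cor(x, y, method = "spearman")

# classical formula with squared rank differences (valid without ties)
rho_d <- 1 - 6 * sum((Rx - Ry)^2) / (n * (n^2 - 1))

# plug-in version of 12 E[UV] - 3 with pseudo-observations rank / (n + 1);
# it is shrunk by the factor (n - 1) / (n + 1) relative to rho_ref
U <- Rx / (n + 1); V <- Ry / (n + 1)
rho_plug <- 12 * mean(U * V) - 3

c(cor = rho_ref, rank_diff = rho_d, plug_in = rho_plug,
  plug_in_rescaled = rho_plug * (n + 1) / (n - 1))
```

The plug-in value is slightly smaller in absolute value than the exact rank correlation, and the gap vanishes as n grows.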

Note that it is also possible to write

$$\rho=3\iint_{[0,1]^2}\big([u+v-1]^2-[u-v]^2\big)\,dC(u,v)$$

Another measure is the cograduation index, from Gini (1914), obtained by substituting an L1 norm for the L2 norm in the previous expression,

$$\gamma=2\iint_{[0,1]^2}\big(|u+v-1|-|u-v|\big)\,dC(u,v)$$

Note that this index can also be defined as $\gamma=Q(C,M)+Q(C,W)$, where $M(u,v)=\min(u,v)$ and $W(u,v)=\max(u+v-1,0)$ are the upper and lower Fréchet-Hoeffding bounds. Here,

> Rx=rank(X[,1]);Ry=rank(X[,2]);
> (G=2/(n^2) *(sum(abs(Rx+Ry-n-1))-
+ sum(abs(Rx-Ry))))
[1] 0.41
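
The factor 2/n^2 used above coincides with the usual normalisation 1/floor(n^2/2) when n is even, as it is here. A small sketch with the floor-based version, under the assumption of no ties (`gini_gamma` is a name chosen for this example), shows that it reaches exactly 1 and -1 for perfectly concordant and discordant ranks, for odd n as well:

```r
# Sketch of Gini's cograduation index with the floor(n^2 / 2) normalisation
gini_gamma <- function(x, y) {
  n <- length(x)
  Rx <- rank(x); Ry <- rank(y)
  (sum(abs(Rx + Ry - n - 1)) - sum(abs(Rx - Ry))) / floor(n^2 / 2)
}

gini_gamma(1:7, 1:7)   # identical rankings, odd n
gini_gamma(1:7, 7:1)   # reversed rankings, odd n

set.seed(1)
x <- rnorm(40)
y <- 0.4 * x + sqrt(1 - 0.4^2) * rnorm(40)
g_floor <- gini_gamma(x, y)
# version from the post, identical for even n
g_post <- 2 / 40^2 * (sum(abs(rank(x) + rank(y) - 41)) - sum(abs(rank(x) - rank(y))))
```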

Finally, another measure is the one from Blomqvist (1950). Let $M_X$ denote the median of $X$, i.e.

$$P(X\le M_X)=P(X\ge M_X)=\frac{1}{2}$$

Then define

$$\beta=P\big((X-M_X)(Y-M_Y)>0\big)-P\big((X-M_X)(Y-M_Y)<0\big)$$

or equivalently

$$\beta=4\,C\left(\frac{1}{2},\frac{1}{2}\right)-1$$

> Mx=median(X[,1]);My=median(X[,2])
> (B=4*sum((X[,1]<=Mx)*((X[,2]<=My)))/n-1)
[1] 0.4
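
The two definitions can be compared numerically. The sketch below assumes an even sample size and no ties, so that no observation sits exactly on a median; `blomqvist_beta` is a hypothetical helper, and the sample is simulated with base R rather than mnormt.

```r
# Sketch: Blomqvist's beta as concordance with respect to the medians
blomqvist_beta <- function(x, y) {
  mean(sign((x - median(x)) * (y - median(y))))
}

set.seed(1)
n <- 40
x <- rnorm(n)
y <- 0.4 * x + sqrt(1 - 0.4^2) * rnorm(n)

beta_sign <- blomqvist_beta(x, y)
# quadrant form, the empirical counterpart of 4 * C(1/2, 1/2) - 1
beta_quad <- 4 * mean(x <= median(x) & y <= median(y)) - 1
c(sign_form = beta_sign, quadrant_form = beta_quad)
```

For odd n one observation lies exactly on each median, and the two versions can then differ slightly.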
