Conditional dependence measures

December 17, 2013

(This article was first published on Freakonometrics » R-english, and kindly contributed to R-bloggers)

This week, I spent some time at the Workshop on Nonparametric Curve Smoothing at Concordia. Yesterday afternoon, Noël Veraverbeke showed an interesting graph to illustrate conditional copulas (and the derivation of conditional dependence measures, such as Kendall’s tau or Spearman’s rho). A long time ago, in my PhD thesis (mainly on conditional copulas), I tried to derive conditional dependence measures (in a dedicated chapter). There, I was interested in describing the dependence of a pair $(Y_1,Y_2)$ given $(Y_1,Y_2)\in\mathcal{V}$, where $\mathcal{V}$ is a region of interest, such as the tails. So I wanted to study the behavior of $(Y_1,Y_2)$ given $\{Y_1>t,Y_2>t\}$. This has an interpretation when studying large risks, but also in joint-life mortality.
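To make the tail conditioning concrete, here is a minimal sketch on simulated data. The Gaussian dependence, the correlation of 0.7, the threshold t = 1 and the helper name tail_tau are all assumptions chosen for illustration; none of them comes from the thesis or the talk.

```r
set.seed(42)
n <- 2000

# Simulated pair with Gaussian dependence (correlation 0.7 is arbitrary)
y1 <- rnorm(n)
y2 <- 0.7 * y1 + sqrt(1 - 0.7^2) * rnorm(n)

# Kendall's tau of (Y1, Y2) restricted to the region {Y1 > t, Y2 > t}
# (tail_tau is a hypothetical helper name)
tail_tau <- function(y1, y2, t) {
  keep <- (y1 > t) & (y2 > t)
  cor(y1[keep], y2[keep], method = "kendall")
}

tau_all  <- cor(y1, y2, method = "kendall")  # unconditional tau
tau_tail <- tail_tau(y1, y2, t = 1)          # tau given both exceed t = 1
c(all = tau_all, tail = tau_tail)
```

Comparing the two values shows how the dependence measured on the upper tail region can differ from the dependence measured on the whole sample.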

In the paper Noël mentioned, they want to describe the dependence of a pair $(Y_1,Y_2)$ given a covariate $X$. And he came up with this very nice example: consider the life expectancies of men and women in various countries. You can get zipped files with the data for males and females, and we can use the GDP per capita as our covariate. Here is the code to visualize life expectancies,

plot(b$LEM,b$LEF,xlab="Life Expectancy (male vs. female)")

With this graph, we cannot visualize the link with the covariate, so let us color each point according to its GDP class,

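library(RColorBrewer)
CL=brewer.pal(6, "RdBu")
# b$cgpd is assumed to be a 6-level factor of GDP classes, from poorest to richest
plot(b$LEM,b$LEF,xlab="Life Expectancy (male vs. female)",pch=19,col=CL[as.numeric(b$cgpd)])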

Here, poor countries are in red, and rich countries in blue,
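The column b$cgpd is used for the colors but its construction is not shown. One plausible way to build such a six-class factor is to cut the GDP per capita at its sextiles; this is only a sketch under that assumption, run on made-up values rather than the real data.

```r
set.seed(1)

# Made-up GDP per capita values standing in for b$GPD (not the real data)
b <- data.frame(GPD = exp(runif(30, log(500), log(50000))))

# Six classes with (roughly) equal counts, from poorest (1) to richest (6);
# cutting at sextiles is an assumption, the post does not say how cgpd was built
breaks <- quantile(b$GPD, probs = seq(0, 1, length.out = 7))
b$cgpd <- cut(b$GPD, breaks = breaks, include.lowest = TRUE)

table(as.numeric(b$cgpd))
```

With the "RdBu" palette, class 1 then maps to the darkest red and class 6 to the darkest blue when indexing with CL[as.numeric(b$cgpd)].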

Clearly, life expectancy is connected to the wealth of the country,

plot(b$GPD,b$LEF,xlab="(Female) Life Expectancy vs. GDP (log scale)",pch=19,col=CL[as.numeric(b$cgpd)],log="x")
plot(b$GPD,b$LEM,xlab="(Male) Life Expectancy vs. GDP (log scale)",pch=19,col=CL[as.numeric(b$cgpd)],log="x")

The idea here is to consider the conditional dependence structure, given the wealth. If we want something smooth (this is actually the goal of the workshop, but I’d like to do this quickly), we can consider a weighted version of Kendall’s tau, based on an idea mentioned in another post.

The idea is to use concordance and discordance counts, computed on replications of the data, where the number of replications of each observation is based on the weights,

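# P: number of concordant pairs in the contingency table t
P = function(t) {
  r_ndx = row(t)
  c_ndx = col(t)
  sum(t * mapply(function(r, c){sum(t[(r_ndx > r) & (c_ndx > c)])},
    r = r_ndx, c = c_ndx))}
# Q: number of discordant pairs in the contingency table t
Q = function(t) {
  r_ndx = row(t)
  c_ndx = col(t)
  sum(t * mapply( function(r, c){
      sum(t[(r_ndx > r) & (c_ndx < c)])},
    r = r_ndx, c = c_ndx) )}
# Kendall's tau-c computed from a contingency table
kendall_tau_c = function(t){
    t = as.matrix(t)
    m = min(dim(t))
    n = sum(t)
    (m*2*(P(t)-Q(t)))/((n*n)*(m-1))}
# bandwidth of the Gaussian kernel density estimate of log GDP per capita
bw = density(log(b$GPD))$bw
# conditional tau at GDP level x; the wrapper name k_tau is an assumption,
# since the original function header is missing from this listing
k_tau = function(x){
  df=data.frame(Y1=b$LEF, Y2=b$LEM, freq=trunc(dnorm(log(b$GPD)-log(x),sd=bw)*100))
  dfrep=data.frame( lapply(df, function(x){rep(x, df$freq)}))
  t=xtabs(~ Y1+Y2, dfrep)
  kendall_tau_c(t)}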
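In formula form, and reading the functions above as Stuart's tau-c applied to the weighted contingency table, the computation can be summarized as follows. The notation is introduced here for illustration only: $n_{ij}$ is the count in cell $(i,j)$ after replication, $n$ the total count, and $m$ the smaller of the number of rows and columns.

```latex
\begin{aligned}
P &= \sum_{i,j} n_{ij} \sum_{k>i,\; l>j} n_{kl}, \qquad
Q = \sum_{i,j} n_{ij} \sum_{k>i,\; l<j} n_{kl}, \\
\hat\tau_c &= \frac{2m\,(P-Q)}{n^{2}\,(m-1)}, \qquad
n = \sum_{i,j} n_{ij}.
\end{aligned}
```

Here $P$ counts concordant pairs and $Q$ discordant pairs, each pair being counted once.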

Here, the weights come from a Gaussian kernel on the logarithm of the GDP per capita (the standard deviation of the Gaussian weight is set equal to the bandwidth of the Gaussian kernel estimate of the density of the log of the GDP per capita). Then, we can compute various conditional Kendall’s tau,


and plot them,

plot(T,K,type="l",xlab="Conditional Kendall's tau vs. GDP (log scale)")
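The vectors T and K are not defined in the listing. Here is a minimal sketch of how such a curve could be produced, under several assumptions: the data frame b is replaced by made-up values, the grid T runs between the 10% and 90% quantiles of the GDP, and, to keep the example short, Kendall's tau-b from cor() is used on the replicated sample instead of the tau-c function of the post. The helper name k_tau_b is hypothetical.

```r
set.seed(123)
n <- 80

# Made-up stand-in for the post's data frame b (not the real data)
GPD <- exp(runif(n, log(500), log(50000)))
LEM <- 40 + 8 * log10(GPD) + rnorm(n, sd = 3)
LEF <- LEM + 4 + rnorm(n, sd = 1.5)
b <- data.frame(GPD, LEM, LEF)

# Bandwidth of the Gaussian kernel density estimate of log GDP
bw <- density(log(b$GPD))$bw

# Kernel-weighted Kendall's tau at GDP level x: each country is replicated
# in proportion to its Gaussian weight, then tau-b is computed with cor()
k_tau_b <- function(x) {
  freq <- trunc(dnorm(log(b$GPD) - log(x), sd = bw) * 100)
  idx <- rep(seq_len(nrow(b)), freq)
  cor(b$LEF[idx], b$LEM[idx], method = "kendall")
}

# Grid of GDP levels, equally spaced on the log scale
# (the name T follows the post; note that it masks the shorthand for TRUE)
rng <- as.numeric(quantile(b$GPD, c(0.1, 0.9)))
T <- exp(seq(log(rng[1]), log(rng[2]), length.out = 20))
K <- sapply(T, k_tau_b)

plot(T, K, type = "l", log = "x",
     xlab = "GDP per capita (log scale)", ylab = "Conditional Kendall's tau")
```

Replicating row indices rather than whole columns gives the same weighted sample while avoiding a copy of the frequency column.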

There is more “correlation” between the lifetimes of men and women in poor countries than in rich countries (which is also what Noël observed). Now, we can also play with time, because we have those statistics for several years.
