Copulas and tail dependence, part 1

September 17, 2012

(This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers)

As mentioned in the course last week Venter (2003) suggested nice functions to illustrate tail dependence (see also some slides used in Berlin a few years ago).

  • Joe (1990)’s lambda

Joe (1990) suggested a (strong) tail dependence index. For lower tails, for instance, consider

  • Upper and lower strong tail (empirical) dependence functions

The idea is to plot the function above, in order to visualize limiting behavior. Define

for the lower tail, and
for the upper tail, where is the survival copula associated with, in the sense that


Now, one can easily derive empirical conterparts of those function, i.e.


Thus, for upper tail, on the right, we have the following graph

and for the lower tail, on the left, we have

For the code, consider some real data, like the loss-ALAE dataset.

> library(evd)
> X=lossalae

The idea is to plot, on the left, the lower tail concentration function, and on the right, the upper tail function.

> U=rank(X[,1])/(nrow(X)+1)
> V=rank(X[,2])/(nrow(X)+1)
> Lemp=function(z) sum((U<=z)&(V<=z))/sum(U<=z)
> Remp=function(z) sum((U>=1-z)&(V>=1-z))/sum(U>=1-z)
> u=seq(.001,.5,by=.001)
> L=Vectorize(Lemp)(u)
> R=Vectorize(Remp)(rev(u))
> plot(c(u,u+.5-u[1]),c(L,R),type="l",ylim=0:1,
+ xlab="LOWER TAIL          UPPER TAIL")
> abline(v=.5,col="grey")

Now, we can compare this graph, with what should be obtained for some parametric copulas that have the same Kendall’s tau (e.g.). For instance, if we consider a Gaussian copula,

> tau=cor(lossalae,method="kendall")[1,2]
> library(copula)
> paramgauss=sin(tau*pi/2)
> copgauss=normalCopula(paramgauss)
> Lgaussian=function(z) pCopula(c(z,z),copgauss)/z
> Rgaussian=function(z) (1-2*z+pCopula(c(z,z),copgauss))/(1-z)
> u=seq(.001,.5,by=.001)
> Lgs=Vectorize(Lgaussian)(u)
> Rgs=Vectorize(Rgaussian)(1-rev(u))
> lines(c(u,u+.5-u[1]),c(Lgs,Rgs),col="red")

or Gumbel’s copula,

> paramgumbel=1/(1-tau)
> copgumbel=gumbelCopula(paramgumbel, dim = 2)
> Lgumbel=function(z) pCopula(c(z,z),copgumbel)/z
> Rgumbel=function(z) (1-2*z+pCopula(c(z,z),copgumbel))/(1-z)
> u=seq(.001,.5,by=.001)
> Lgl=Vectorize(Lgumbel)(u)
> Rgl=Vectorize(Rgumbel)(1-rev(u))
> lines(c(u,u+.5-u[1]),c(Lgl,Rgl),col="blue")

That’s nice (isn’t it?), but since we do not have any confidence interval, it is still hard to conclude (even if it looks like Gumbel copula has a much better fit than the Gaussian one). A strategy can be to generate samples from those copulas, and to visualize what we had. With a Gaussian copula, the graph looks like

> u=seq(.0025,.5,by=.0025); nu=length(u)
> nsimul=500
> MGS=matrix(NA,nsimul,2*nu)
> for(s in 1:nsimul){
+ Xs=rCopula(nrow(X),copgauss)
+ Us=rank(Xs[,1])/(nrow(Xs)+1)
+ Vs=rank(Xs[,2])/(nrow(Xs)+1)
+ Lemp=function(z) sum((Us<=z)&(Vs<=z))/sum(Us<=z)
+ Remp=function(z) sum((Us>=1-z)&(Vs>=1-z))/sum(Us>=1-z)
+ MGS[s,1:nu]=Vectorize(Lemp)(u)
+ MGS[s,(nu+1):(2*nu)]=Vectorize(Remp)(rev(u))
+ lines(c(u,u+.5-u[1]),MGS[s,],col="red")
+ }

(including – pointwise – 90% confidence bands)

> Q95=function(x) quantile(x,.95)
> V95=apply(MGS,2,Q95)
> lines(c(u,u+.5-u[1]),V95,col="red",lwd=2)
> Q05=function(x) quantile(x,.05)
> V05=apply(MGS,2,Q05)
> lines(c(u,u+.5-u[1]),V05,col="red",lwd=2)

while it is

with Gumbel copula. Isn’t it a nice (graphical) tool ?

But as mentioned in the course, the statistical convergence can be slow. Extremely slow. So assessing if the underlying copula has tail dependence, or not, it now that simple. Especially if the copula exhibits tail independence. Like the Gaussian copula. Consider a sample of size 1,000. This is what we obtain if we generate random scenarios,

or we look at the left tail (with a log-scale)

Now, consider a 10,000 sample,

or with a log-scale

We can even consider a 100,000 sample,

or with a log-scale

On those graphs, it is rather difficult to conclude if the limit is 0, or some strictly positive value (again, it is a classical statistical problem when the value of interest is at the border of the support of the parameter). So, a natural idea is to consider a weaker tail dependence index. Unless you have something like 100,000 observations…

To leave a comment for the author, please follow the link and comment on their blog: Freakonometrics - Tag - R-english. offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: , , , , , , , , , , ,

Comments are closed.


Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)