Copulas and tail dependence, part 1

September 17, 2012
By

(This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers)

As mentioned in the course last week Venter (2003) suggested nice functions to illustrate tail dependence (see also some slides used in Berlin a few years ago).
  • Joe (1990)'s lambda
Joe (1990) suggested a (strong) tail dependence index. For lower tails, for instance, consider
http://i1.wp.com/blogperso.univ-rennes1.fr/arthur.charpentier/public/perso3/toc3latex2png.2.php.png?w=456
i.e

http://i0.wp.com/blogperso.univ-rennes1.fr/arthur.charpentier/public/perso3/toc3latex2png.3.php.png?w=456
  • Upper and lower strong tail (empirical) dependence functions

The idea is to plot the function above, in order to visualize limiting behavior. Define

http://i2.wp.com/blogperso.univ-rennes1.fr/arthur.charpentier/public/perso3/Llatex2png.2.php.png?w=456
for the lower tail, and
http://i2.wp.com/blogperso.univ-rennes1.fr/arthur.charpentier/public/perso3/Clatex2png.2.php.png?w=456
for the upper tail, where http://i1.wp.com/perso.univ-rennes1.fr/arthur.charpentier/latex/toclatex2png-12.2.php.png?w=456 is the survival copula associated with http://i2.wp.com/perso.univ-rennes1.fr/arthur.charpentier/latex/toclatex2png-13.2.php.png?w=456, in the sense that
http://i2.wp.com/perso.univ-rennes1.fr/arthur.charpentier/latex/toclatex2png-14.2.php.png?w=456
while
http://i0.wp.com/perso.univ-rennes1.fr/arthur.charpentier/latex/toclatex2png-15.2.php.png?w=456

Now, one can easily derive empirical conterparts of those function, i.e.

http://i2.wp.com/perso.univ-rennes1.fr/arthur.charpentier/latex/toclatex2png-18.2.php.png?w=456
and
http://i0.wp.com/perso.univ-rennes1.fr/arthur.charpentier/latex/toclatex2png-19.2.php.png?w=456
Thus, for upper tail, on the right, we have the following graph

http://i1.wp.com/freakonometrics.blog.free.fr/public/perso6/upper-lambda.gif?resize=456%2C228

and for the lower tail, on the left, we have

http://i2.wp.com/freakonometrics.blog.free.fr/public/perso6/lower-lambda.gif?resize=456%2C228

For the code, consider some real data, like the loss-ALAE dataset.

> library(evd)
> X=lossalae

The idea is to plot, on the left, the lower tail concentration function, and on the right, the upper tail function.

> U=rank(X[,1])/(nrow(X)+1)
> V=rank(X[,2])/(nrow(X)+1)
> Lemp=function(z) sum((U<=z)&(V<=z))/sum(U<=z)
> Remp=function(z) sum((U>=1-z)&(V>=1-z))/sum(U>=1-z)
> u=seq(.001,.5,by=.001)
> L=Vectorize(Lemp)(u)
> R=Vectorize(Remp)(rev(u))
> plot(c(u,u+.5-u[1]),c(L,R),type="l",ylim=0:1,
+ xlab="LOWER TAIL          UPPER TAIL")
> abline(v=.5,col="grey")

Now, we can compare this graph, with what should be obtained for some parametric copulas that have the same Kendall's tau (e.g.). For instance, if we consider a Gaussian copula,

> tau=cor(lossalae,method="kendall")[1,2]
> library(copula)
> paramgauss=sin(tau*pi/2)
> copgauss=normalCopula(paramgauss)
> Lgaussian=function(z) pCopula(c(z,z),copgauss)/z
> Rgaussian=function(z) (1-2*z+pCopula(c(z,z),copgauss))/(1-z)
> u=seq(.001,.5,by=.001)
> Lgs=Vectorize(Lgaussian)(u)
> Rgs=Vectorize(Rgaussian)(1-rev(u))
> lines(c(u,u+.5-u[1]),c(Lgs,Rgs),col="red")

or Gumbel's copula,

> paramgumbel=1/(1-tau)
> copgumbel=gumbelCopula(paramgumbel, dim = 2)
> Lgumbel=function(z) pCopula(c(z,z),copgumbel)/z
> Rgumbel=function(z) (1-2*z+pCopula(c(z,z),copgumbel))/(1-z)
> u=seq(.001,.5,by=.001)
> Lgl=Vectorize(Lgumbel)(u)
> Rgl=Vectorize(Rgumbel)(1-rev(u))
> lines(c(u,u+.5-u[1]),c(Lgl,Rgl),col="blue")

That's nice (isn't it?), but since we do not have any confidence interval, it is still hard to conclude (even if it looks like Gumbel copula has a much better fit than the Gaussian one). A strategy can be to generate samples from those copulas, and to visualize what we had. With a Gaussian copula, the graph looks like
> u=seq(.0025,.5,by=.0025); nu=length(u)
> nsimul=500
> MGS=matrix(NA,nsimul,2*nu)
> for(s in 1:nsimul){
+ Xs=rCopula(nrow(X),copgauss)
+ Us=rank(Xs[,1])/(nrow(Xs)+1)
+ Vs=rank(Xs[,2])/(nrow(Xs)+1)
+ Lemp=function(z) sum((Us<=z)&(Vs<=z))/sum(Us<=z)
+ Remp=function(z) sum((Us>=1-z)&(Vs>=1-z))/sum(Us>=1-z)
+ MGS[s,1:nu]=Vectorize(Lemp)(u)
+ MGS[s,(nu+1):(2*nu)]=Vectorize(Remp)(rev(u))
+ lines(c(u,u+.5-u[1]),MGS[s,],col="red")
+ }

(including - pointwise - 90% confidence bands)

> Q95=function(x) quantile(x,.95)
> V95=apply(MGS,2,Q95)
> lines(c(u,u+.5-u[1]),V95,col="red",lwd=2)
> Q05=function(x) quantile(x,.05)
> V05=apply(MGS,2,Q05)
> lines(c(u,u+.5-u[1]),V05,col="red",lwd=2)

while it is

with Gumbel copula. Isn't it a nice (graphical) tool ?

But as mentioned in the course, the statistical convergence can be slow. Extremely slow. So assessing if the underlying copula has tail dependence, or not, it now that simple. Especially if the copula exhibits tail independence. Like the Gaussian copula. Consider a sample of size 1,000. This is what we obtain if we generate random scenarios,

or we look at the left tail (with a log-scale)

Now, consider a 10,000 sample,

or with a log-scale

We can even consider a 100,000 sample,

or with a log-scale

On those graphs, it is rather difficult to conclude if the limit is 0, or some strictly positive value (again, it is a classical statistical problem when the value of interest is at the border of the support of the parameter). So, a natural idea is to consider a weaker tail dependence index. Unless you have something like 100,000 observations...

To leave a comment for the author, please follow the link and comment on his blog: Freakonometrics - Tag - R-english.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: , , , , , , , , , , ,

Comments are closed.