Copulas and Financial Time Series

[This article was first published on Freakonometrics » R-english, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

I was recently asked to write a survey on copulas for financial time series. The paper is, so far, unfortunately, in French, and is available on https://hal.archives-ouvertes.fr/. There is a description of various models, including some graphs and statistical outputs, obtained from read data.

To illustrate, I’ve been using weekly log-returns of (crude) oil prices, Brent, Dubaï and Maya.

The dataset is available from an excel file, oil.xls (I thought it was possible to load it direclty from the internet, but it did not work… so I suggest to download the file first, and then load it)

> library(xlsx)
> temp <- tempfile()
> download.file(
+ "http://freakonometrics.free.fr/oil.xls",temp)
trying URL 'http://freakonometrics.free.fr/oil.xls'
Content type 'application/vnd.ms-excel' length 99328 bytes (97 KB)
downloaded 97 KB
> oil=read.xlsx(temp,sheetName="DATA",dec=",")
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl,  : 
  java.io.IOException: block[ 0 ] already removed - does your POIFS have circular or duplicate block references?
> oil=read.xlsx("D:\home\acharpen\mes documents\oil.xls",sheetName="DATA")

Then we can plot those three time series

> head(oil)
        Date      WTI    brent   Dubai     Maya
1 1997-01-10  2.73672  2.25465  3.3673   1.5400
2 1997-01-17 -3.40326 -6.01433 -3.8249  -4.1076
3 1997-01-24 -4.09531 -1.43076 -6.6375  -4.6166
4 1997-01-31 -0.65789  0.34873  0.7326  -1.5122
5 1997-02-07 -3.14293 -1.97765 -0.7326  -1.8798
6 1997-02-14 -5.60321 -7.84534 -7.6372 -11.0549

> Time=as.Date(oil$Date,"%Y-%m-%d")
> plot(Time,oil[,3],type="l",ylab="Brent, weekly log returns",ylim=range(oil[,3:5]))

The idea is to use some multivariate ARMA-GARCH processes here. The heuristics here is that the first part is used to model the dynamics of the average value of the time series, and the second part is used to model the dynamics of the variance of the time series. Two kinds of models are considered in the paper

  • a mutivariate GARCH process (or a model on the dynamics of the variance matrix) on the residuals from the ARMA models
  • a multivariate model (based on copulas) on the residuals of the ARMA-GARCH process

So different series will be considered here, obtained as residuals of different models. And we can also normalize those residuals. Either from the ARMA models

> fit1=arima(x=dat[,1],order=c(2,0,1))
> fit2=arima(x=dat[,2],order=c(1,0,1))
> fit3=arima(x=dat[,3],order=c(1,0,1))
> dat_arma <- cbind(residuals(fit1),residuals(fit2),residuals(fit3))
> m <- apply(dat_arma, 2, mean)
> v <- apply(dat_arma, 2, var)
> dat_arma_std <- t((t(dat_arma)-m)/sqrt(v))

or from the ARMA-GARCH models

> fit1=garchFit(formula = ~ arma(2,1)+garch(1, 1),data=dat[,1],cond.dist ="std")
> fit2=garchFit(formula = ~ arma(1,1)+garch(1, 1),data=dat[,2],cond.dist ="std")
> fit3=garchFit(formula = ~ arma(1,1)+garch(1, 1),data=dat[,3],cond.dist ="std")
> m_res <- apply(dat_res, 2, mean)
> v_res <- apply(dat_res, 2, var)
> dat_res_std =cbind((dat_res[,1]-m_res[1])/sqrt(v_res[1]),(dat_res[,2]-m_res[2])/sqrt(v_res[2]),(dat_res[,3]-m_res[3])/sqrt(v_res[3]))
  • Multivariate GARCH Models

A first model that can be consider is a multivariate EWMA for the covariance matrix,

> ewma=EWMAvol(dat_res_std, lambda = 0.96)

To visualize the volatility, use

> emwa_series_vol=function(i=1){
+ plot(Time,ewma$Sigma.t[,1],type="l",
+ col="white",ylim=c(0,80))
+ lines(Time,dat_arma[,i]+40,col="grey")
+ j=1
+ if(i==2) j=5
+ if(i==3) j=9
+ lines(Time,ewma$Sigma.t[,j],lwd=1.5)
+ }

The implied correlation is here

> emwa_series_cor=function(i=1,j=2){
+   if((min(i,j)==1)&(max(i,j)==2)){
+     a=1;     b=5;     ab=2}
+   if((min(i,j)==1)&(max(i,j)==3)){
+     a=1;     b=9;     ab=3}
+   if((min(i,j)==2)&(max(i,j)==3)){
+     a=5;     b=9;     ab=6}
+ r=ewma$Sigma.t[,ab]/sqrt(ewma$Sigma.t[,a]*
+ ewma$Sigma.t[,b])
+ plot(Time,r,type="l",ylim=c(0,1))
+ }

It is also possible possible some multivariate GARCH, namely a BEKK(1,1) model, e.g. using

> library(MTS)
> bekk=BEKK11(dat_arma)
> bekk_series_vol=function(i=1){
+     plot(Time,bk$Sigma.t[,1],type="l",
+          ylab=names(dat)[i],col="white",ylim=c(0,80))
+     lines(Time,dat_arma[,i]+40,col="grey")
+     j=1
+     if(i==2) j=5
+     if(i==3) j=9
+     lines(Time,bk$Sigma.t[,j],lwd=1.5)
+ }
> bekk_series_cor=function(i=1,j=2){
+   if((min(i,j)==1)&(max(i,j)==2)){
+     a=1;     b=5;     ab=2}
+   if((min(i,j)==1)&(max(i,j)==3)){
+     a=1;     b=9;     ab=3}
+   if((min(i,j)==2)&(max(i,j)==3)){
+     a=5;     b=9;     ab=6}
+   r=bk$Sigma.t[,ab]/sqrt(bk$Sigma.t[,a]*
+ bk$Sigma.t[,b])
+   plot(Time,r,type="l",
+        ylim=c(0,1))
+ }

The implied corelation is here

  • Modeling Residuals from Univariate GARCH Models

The first step might be to consider some static (joint) distribution for the residuals. The univariate marginal distributions are

while the contour of the joint density (obtained using a bivariate kernel estimator) is

It is also possible to visualise the copula density (with some nonparametric estimator on top, and parametric copula below)

> copula_NP=function(i=1,j=2){
+     uv=dat_garch_std[,c(i,j)]
+     n=nrow(uv)
+     uv=cbind(rank(uv[,1]),rank(uv[,2]))/(n+1)
+     xy=qnorm(uv)
+     s=0.3
+     dc=Vectorize(function(x,y) mean(dnorm(rep(qnorm(x),n),xy[,1],s)*dnorm(rep(qnorm(y),n),xy[,2],s))/
+                      dnorm(qnorm(x))/dnorm(qnorm(y)))
+     vx=seq(1/30,29/30,by=1/30)
+     vz=outer(vx,vx,dc)
+     zl=c(0,8)
+     persp(vx,vx,vz,theta=20,phi=10,col="yellow",shade=TRUE,xlab=names(dat)[j],
+           ylab=names(dat)[j],zlab="",ticktype="detailed",zlim=zl)
+     
+     library(copula)
+     norm.cop <- normalCopula(0.5)
+     norm.cop <- normalCopula(fitCopula(norm.cop,uv)@estimate)
+     dc=function(x,y) dCopula(cbind(x,y), norm.cop)
+     vz=outer(vx,vx,dc)
+     persp(vx,vx,vz,theta=20,phi=10,col="yellow",shade=TRUE,xlab=names(dat)[j],
+           ylab=names(dat)[j],zlab="copule Gaussienne",ticktype="detailed",zlim=zl)
+     
+     t.cop <- tCopula(0.5,df=3)
+     t.fit <- fitCopula(t.cop,uv)@estimate
+     t.cop <- tCopula(t.fit[1],df=t.fit[2])
+     dc=function(x,y) dCopula(cbind(x,y), t.cop)
+     vz=outer(vx,vx,dc)
+     persp(vx,vx,vz,theta=20,phi=10,col="yellow",shade=TRUE,xlab=names(dat)[j],
+           ylab=names(dat)[j],zlab="copule de Student",ticktype="detailed",zlim=zl)
+ }

One can consider the http://latex.codecogs.com/gif.latex?lambda(cdot) function,

http://latex.codecogs.com/gif.latex?lambda(u)=left{begin{array}{l}Pr[X_1%20leq%20F_1^{-1}(u)%20vert%20X_2%20leq%20F_2^{-1}(u)]text{%20if%20}uin(0,1/2)%20\Pr[X_1%20%3E%20F_1^{-1}(u)%20vert%20X_2%20%3E%20F_2^{-1}(u)]text{%20if%20}uin(1/2,1)%20end{array}right.

compute its empirical version on the three pairs of series, and compare it with some parametric version,

> 
> lambda=function(C){
+     l=function(u) pcopula(C,cbind(u,u))/u
+     u=seq(.001,.5,by=.001)
+     v=Vectorize(l)(u)
+     return(c(v,rev(v)))
+ }
> 
> graph_lambda=function(i,j){
+     X=dat_res
+     U=rank(X[,i])/(nrow(X)+1)
+     V=rank(X[,j])/(nrow(X)+1)
+     library(copula)
+     normal.cop <- normalCopula(.5,dim=2)
+     t.cop <- tCopula(.5,dim=2,df=3)
+     gumbel.cop <- gumbelCopula(1.5,dim=2)
+     fit1=fitCopula(normal.cop, cbind(U,V), method="ml")
+     fit2=fitCopula(t.cop, cbind(U,V), method="ml")
+     C1=normalCopula(fit1@copula@parameters,dim=2)
+     C2=tCopula(fit2@copula@parameters[1],dim=2,df=trunc(fit2@copula@parameters[2]))
+     
+     Lemp=function(z) sum((U<=z)&(V<=z))/sum(U<=z)
+     Remp=function(z) sum((U>=1-z)&(V>=1-z))/sum(U>=1-z)
+     u=seq(.001,.5,by=.001)
+     L=Vectorize(Lemp)(u)
+     R=Vectorize(Remp)(rev(u))
+     nom=paste("Indice de dépendance de queue ",names(dat)[i]," vs. ",names(dat)[j],sep="")
+     plot(c(u,u+.5-u[1]),c(L,R),type="l",
+          ylim=0:1,lwd=2,
+          xlab="",ylab=nom)
+     lines(c(u,u+.5-u[1]),lambda(C1),lty=2)
+     lines(c(u,u+.5-u[1]),lambda(C2),col=gray.colors(1,.4))
+ }

But one could wonder if the correlation is stable with time.

> time_varying_correl_2=function(i=1,j=2,
+ nom_arg="Pearson"){
+     uv=dat_arma[,c(i,j)]
+     n=nrow(uv)
+     correlation=function(t){
+         sub_uv=uv[t+(-26):26,]
+         cor(sub_uv,method = tolower(nom_arg))[1,2]
+     }
+     ic_correlation=function(t){
+         b=1000
+         sub_uv=uv[t+(-26):26,]
+         cr=rep(NA,b)
+         for(sim in 1:b){
+             sub_uv_b=sub_uv[sample(1:53,size=53,replace=TRUE),]+matrix(rnorm(106),53,2)/10000
+             cr[sim]=cor(sub_uv_b,method = tolower(nom_arg))[1,2]}
+         quantile(cr,c(.05,.95))
+     }
+     nom=paste("Corr?lation de ",nom_arg, " ",names(dat)[i]," vs. ",names(dat)[j],sep="")
+     C=sapply(26:(n-26),correlation)
+     IC_C=sapply(27:(n-27),ic_correlation)
+     plot(Time[26:(n-26)],C,type="l",xlab="Time",ylab=nom,ylim=c(.3,1),col="white")
+     polygon(c(Time[27:(n-27)],rev(Time[27:(n-27)])),c(IC_C[1,],rev(IC_C[2,])),col="light blue",border=NA)  
+     lines(Time[26:(n-26)],C)  
+     abline(h=cor(uv,method = tolower(nom_arg))[1,2],lty=2,col="red")
+ }
> 
> time_varying_correl_2(1,2)
> time_varying_correl_2(1,2, "spearman")
> time_varying_correl_2(1,2,"kendall")

Spearman rank correlation with time

or Kendall’s tau, with time

To model that correlation, consider the DCC model(s)

> library(MTS)
> m2=dccFit(dat_res_std)
> m3=dccFit(dat_res_std,type="Engle")
> R2=m2$rho.t
> R3=m3$rho.t

To get some forecasts, use for instance

> library(rmgarch)
> garch11.spec = ugarchspec(mean.model = list(armaOrder = c(2,1)),variance.model = list(garchOrder = c(1,1), model = "GARCH"), distribution.model = "norm")
> dcc.garch11.spec = dccspec(uspec = multispec( replicate(3, garch11.spec) ), dccOrder = c(1,1),
 distribution = "mvnorm")
> dcc.fit= dccfit(dcc.garch11.spec,data = dat)
> fcst=dccforecast(dcc.fit,n.ahead=200)
> plot(fcst)

To leave a comment for the author, please follow the link and comment on their blog: Freakonometrics » R-english.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)