Bounding sums of random variables, part 1

September 27, 2012
By Arthur Charpentier

(This article was first published on Freakonometrics » R-english, and kindly contributed to R-bloggers)

For the last course MAT8886 of this (long) winter session, on copulas (and extremes), we will discuss risk aggregation. The course will be mainly on the problem of bounding the distribution (or some risk measure, say the Value-at-Risk) of the sum of two random variables with given marginal distributions. For instance, we have two Gaussian risks. What could be the worst-case scenario for the 99% quantile of the sum? Note that I mention implications in terms of risk management, but of course, those questions are extremely important in terms of statistical inference, see e.g. Fan & Park (2006).

This problem is sometimes related to a question asked by Kolmogorov almost one hundred years ago, as mentioned in Makarov (1981). One year later, Rüschendorf (1982) also suggested a proof of the bounds calculation. Here, we focus on dimension 2. As usual, it is the simple case. But as mentioned recently, in Kreinovich & Ferson (2005), in dimension 3 (or higher), “computing the best-possible bounds for arbitrary n is an NP-hard (computationally intractable) problem”. So let us focus on the case where we sum (only) two random variables (for those interested in higher dimension, Puccetti & Rüschendorf (2012) provided interesting results for a dual version of those optimal bounds).

Let $\Delta$ denote the set of univariate continuous distribution functions, left-continuous, on $\mathbb{R}$. And $\Delta^+$ the set of distributions on $\mathbb{R}^+$. Thus, $F\in\Delta^+$ if $F\in\Delta$ and $F(0)=0$. Consider now two distributions $F,G\in\Delta^+$. In a very general setting, it is possible to consider operators on $\Delta^+\times\Delta^+$. Thus, let $T:[0,1]\times[0,1]\rightarrow[0,1]$ denote an operator, increasing in each component, such that $T(1,1)=1$. And consider some function $L:\mathbb{R}^+\times\mathbb{R}^+\rightarrow\mathbb{R}^+$ assumed to be also increasing in each component (and continuous). For such functions $T$ and $L$, define the following (general) operator, $\tau_{T,L}(F,G)$ as

$$\tau_{T,L}(F,G)(x)=\sup_{L(u,v)=x}\{T(F(u),G(v))\}$$

One interesting case can be obtained when $T$ is a copula, $C$. In that case,

$$\tau_{C,L}:\Delta^+\times\Delta^+\rightarrow\Delta^+$$

and further, it is possible to write

$$\tau_{C,L}(F,G)(x)=\sup_{(u,v)\in L^{-1}(x)}\{C(F(u),G(v))\}$$
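To make the definition concrete, here is a minimal sketch in R for the case $L(u,v)=u+v$, where the supremum over the level set is approximated on a grid. The helper name `tau_sum`, the grid size, the two example copulas and the lognormal margins are assumptions made for illustration only.

```r
# Sketch: tau_{C,L}(F,G)(x) for L(u,v) = u + v, approximated on a grid.
# Since F and G are distributions on R+, it is enough to let u range over [0, x].
tau_sum <- function(x, F, G, C, n_grid = 1000) {
  u <- seq(0, x, length.out = n_grid + 1)  # points (u, v) with u + v = x
  v <- x - u
  max(C(F(u), G(v)))
}

# Two example copulas (assumed here for illustration)
C_indep <- function(a, b) a * b               # independence copula
C_lower <- function(a, b) pmax(a + b - 1, 0)  # lower Frechet-Hoeffding bound

# Example margins: two lognormal distributions
F <- function(x) plnorm(x, 0, 1)
G <- function(x) plnorm(x, 0, 1)

c(lower = tau_sum(3, F, G, C_lower), indep = tau_sum(3, F, G, C_indep))
```

Because the lower Fréchet-Hoeffding bound lies below any other copula, the first value can never exceed the second one.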

It is also possible to consider other (general) operators, e.g. based on the sum

$$\sigma_{C,L}(F,G)(x)=\int_{\{(u,v):L(u,v)\leq x\}}dC(F(u),G(v))$$

or on the minimum,

$$\rho_{C,L}(F,G)(x)=\inf_{(u,v)\in L^{-1}(x)}\{C^\star(F(u),G(v))\}$$

where $C^\star$ is the dual of the copula $C$ (which is not itself a copula), i.e. $C^\star(u,v)=u+v-C(u,v)$. Note that those operators can be used to define distribution functions, i.e.

$$\sigma_{C,L}:\Delta^+\times\Delta^+\rightarrow\Delta^+$$

and similarly

$$\rho_{C,L}:\Delta^+\times\Delta^+\rightarrow\Delta^+$$

All that seems too theoretical? An application can be the case of the sum, i.e. $L(x,y)=x+y$. In that case, $\sigma_{C,+}(F,G)$ is the distribution of the sum of two random variables with marginal distributions $F$ and $G$, and copula $C$. Thus, with the independence copula $C^\perp(u,v)=uv$, $\sigma_{C^\perp,+}(F,G)$ is simply the convolution of two distributions,

$$\sigma_{C^\perp,+}(F,G)(x)=\int_{u+v\leq x}dC^\perp(F(u),G(v))$$
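For independent components on $\mathbb{R}^+$, this convolution reduces to $\int_0^x G(x-u)\,dF(u)$, which can be evaluated numerically. A minimal sketch, assuming two lognormal margins with parameters 0 and 1; the function name `conv_cdf`, the seed and the sample size are arbitrary choices for this illustration.

```r
# Sketch: cdf of X + Y for independent X ~ F and Y ~ G on R+,
# H(x) = int_0^x G(x - u) f(u) du, where f is the density of F.
conv_cdf <- function(x) {
  integrand <- function(u) plnorm(x - u, 0, 1) * dlnorm(u, 0, 1)
  integrate(integrand, lower = 0, upper = x)$value
}

# Monte Carlo sanity check of the same probability
set.seed(1)
S <- rlnorm(1e5, 0, 1) + rlnorm(1e5, 0, 1)
c(integral = conv_cdf(3), simulation = mean(S <= 3))
```

The two numbers should agree up to simulation noise.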

The important result (that can be found in Chapter 7, in Schweizer and Sklar (1983)) is that given a function $L$, then, for any copula $C$, one can find a lower bound for $\sigma_{C,L}(F,G)$

$$\tau_{C^-,L}(F,G)\leq \tau_{C,L}(F,G)\leq\sigma_{C,L}(F,G)$$

as well as an upper bound

$$\sigma_{C,L}(F,G)\leq \rho_{C,L}(F,G)\leq\rho_{C^-,L}(F,G)$$

Those inequalities come from the fact that for every copula $C$, $C\geq C^-$, where $C^-(u,v)=\max\{u+v-1,0\}$ is itself a copula. Since this function is not a copula in higher dimension, one can easily imagine that getting those bounds in higher dimension will be much more complicated…
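Both claims can be checked numerically. The sketch below (all variable names are illustrative assumptions) first verifies that $\max\{u+v-1,0\}$ lies below the independence copula $uv$ on a grid, and then computes the mass that the three-dimensional analogue $\max\{u_1+u_2+u_3-2,0\}$ would assign to the cube $[1/2,1]^3$. A negative mass means that the function cannot be a distribution function, hence not a copula.

```r
# Dimension 2: max(u + v - 1, 0) lies below the independence copula u * v
g <- expand.grid(u = seq(0, 1, by = 0.05), v = seq(0, 1, by = 0.05))
C_lower <- pmax(g$u + g$v - 1, 0)
C_indep <- g$u * g$v
all(C_lower <= C_indep + 1e-12)

# Dimension 3: same construction, with u1 + u2 + u3 - 2
W3 <- function(u1, u2, u3) pmax(u1 + u2 + u3 - 2, 0)
corners <- expand.grid(u1 = c(0.5, 1), u2 = c(0.5, 1), u3 = c(0.5, 1))
# inclusion-exclusion sign: (-1)^(number of lower endpoints in the corner)
n_low <- (corners$u1 == 0.5) + (corners$u2 == 0.5) + (corners$u3 == 0.5)
mass <- sum((-1)^n_low * W3(corners$u1, corners$u2, corners$u3))
mass
```

The mass comes out as -0.5: only the corner $(1,1,1)$ and the three corners with a single coordinate at $1/2$ contribute, giving $1-3\times 1/2$.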

In the case of the sum of two random variables, with marginal distributions $F$ and $G$, bounds for the distribution of the sum $H(x)=\mathbb{P}(X+Y\leq x)$, where $X\sim F$ and $Y\sim G$, can be written

$$H^-(x)=\tau_{C^-,+}(F,G)(x)=\sup_{u+v=x}\{\max\{F(u)+G(v)-1,0\}\}$$

for the lower bound, and

$$H^+(x)=\rho_{C^-,+}(F,G)(x)=\inf_{u+v=x}\{\min\{F(u)+G(v),1\}\}$$

for the upper bound. And those bounds are sharp, in the sense that, for any $x$ such that $t=\tau_{C^-,+}(F,G)(x)\in(0,1)$, there is a copula $C_t$ such that

$$\sigma_{C_t,+}(F,G)(x)=\tau_{C^-,+}(F,G)(x)=t$$

and, for any $x$ such that $t=\rho_{C^-,+}(F,G)(x)\in(0,1)$, there is (another) copula $C_t$ such that

$$\sigma_{C_t,+}(F,G)(x)=\rho_{C^-,+}(F,G)(x)=t$$
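For completeness, the closed forms of $H^-$ and $H^+$ given above can be recovered from the definitions, using $C^-(u,v)=\max\{u+v-1,0\}$ and $C^\star(u,v)=u+v-C(u,v)$:

```latex
\begin{aligned}
C^-(F(u),G(v)) &= \max\{F(u)+G(v)-1,\,0\},\\
(C^-)^\star(F(u),G(v)) &= F(u)+G(v)-\max\{F(u)+G(v)-1,\,0\}\\
&= \min\{F(u)+G(v),\,1\}.
\end{aligned}
```

Taking the supremum of the first expression, and the infimum of the last one, over $u+v=x$ gives $H^-(x)$ and $H^+(x)$ respectively.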

Thus, using those results, it is possible to bound the cumulative distribution function. But actually, all that can also be done on quantiles (see Frank, Nelsen & Schweizer (1987)). For all $F\in\Delta^+$ let $F^{-1}$ denote its generalized inverse, left-continuous, and let $\nabla^+$ denote the set of those quantile functions. Define then the dual versions of our operators,

$$\tau^{-1}_{T,L}(F^{-1},G^{-1})(x)=\inf_{(u,v)\in T^{-1}(x)}\{L(F^{-1}(u),G^{-1}(v))\}$$

and

$$\rho^{-1}_{T,L}(F^{-1},G^{-1})(x)=\sup_{(u,v)\in (T^\star)^{-1}(x)}\{L(F^{-1}(u),G^{-1}(v))\}$$

Those definitions are really dual versions of the previous ones, in the sense that $\tau^{-1}_{T,L}(F^{-1},G^{-1})=[\tau_{T,L}(F,G)]^{-1}$ and $\rho^{-1}_{T,L}(F^{-1},G^{-1})=[\rho_{T,L}(F,G)]^{-1}$.
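This duality can be checked numerically on one example. The sketch below assumes $T(u,v)=\max\{u+v-1,0\}$, $L$ the sum, two lognormal margins with parameters 0 and 1, and a single level `p`; the grid size `m` and all variable names are arbitrary choices for this illustration. It computes $q=\tau^{-1}(p)$ from the infimum formula, then evaluates $\tau$ at $q$ from the supremum formula, which should give back $p$.

```r
# Sketch: tau evaluated at tau^{-1}(p) should return p.
F <- function(x) plnorm(x, 0, 1)
Finv <- function(u) qlnorm(u, 0, 1)
p <- 0.9
m <- 2000

# tau^{-1}(p): inf of Finv(u) + Finv(v) over max(u + v - 1, 0) = p,
# i.e. over u + v = 1 + p, with u on an interior grid of (p, 1)
u <- p + (1 - p) * (1:(m - 1)) / m
q <- min(Finv(u) + Finv(1 + p - u))

# tau(q): sup of max{F(s) + F(t) - 1, 0} over s + t = q
s <- q * (0:m) / m
H_at_q <- max(pmax(F(s) + F(q - s) - 1, 0))

c(q = q, H_at_q = H_at_q)   # H_at_q should be close to p
```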

Note that if we focus on the sum of two random variables, the quantile function associated with the lower bound $H^-$ of the distribution (which is an upper bound for the quantile of the sum) is

$$\tau^{-1}_{C^{-},+}(F^{-1},G^{-1})(x)=\inf_{\max\{u+v-1,0\}=x}\{F^{-1}(u)+G^{-1}(v)\}$$

while the one associated with the upper bound $H^+$ (a lower bound for the quantile of the sum) is

$$\rho^{-1}_{C^{-},+}(F^{-1},G^{-1})(x)=\sup_{\min\{u+v,1\}=x}\{F^{-1}(u)+G^{-1}(v)\}$$
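As an illustration, the question raised in the introduction (two Gaussian risks, 99% quantile of the sum) can be sketched as follows. This assumes two standard Gaussian margins; those are not in $\Delta^+$, so applying the two formulas on the whole real line is an assumption of this sketch. The grid size `m` and the variable names are arbitrary.

```r
p <- 0.99
m <- 2000

# tau^{-1}(p): inf of the sum of quantiles over u + v = 1 + p, u in (p, 1)
u <- p + (1 - p) * (1:(m - 1)) / m
q_tau <- min(qnorm(u) + qnorm(1 + p - u))

# rho^{-1}(p): sup of the sum of quantiles over u + v = p, u in (0, p)
u <- p * (1:(m - 1)) / m
q_rho <- max(qnorm(u) + qnorm(p - u))

# comonotonic case (X = Y), for comparison
q_como <- 2 * qnorm(p)

c(rho_inv = q_rho, comonotonic = q_como, tau_inv = q_tau)
```

Under these assumptions $\tau^{-1}$ equals $2\Phi^{-1}(0.995)\approx 5.15$, which is larger than the comonotonic value $2\Phi^{-1}(0.99)\approx 4.65$: perfect positive dependence is not the worst case for the quantile of the sum.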

A great thing is that it should not be too difficult to compute those quantities numerically. Perhaps a little bit more for cumulative distribution functions, since they are not defined on a bounded support. But still, it is possible if the goal is to plot those bounds on $[0,10]$, for instance. The code is the following, for the sum of two lognormal distributions $LN(0,1)$.

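> # marginal cumulative distribution functions (lognormal, parameters 0 and 1)
> F=function(x) plnorm(x,0,1)
> G=function(x) plnorm(x,0,1)
> n=100
> X=seq(0,10,by=.05)
> Hinf=Hsup=rep(NA,length(X))
> for(i in 1:length(X)){
+ x=X[i]
+ # grid of points (u,v) such that u+v=x, with step 1/n
+ U=seq(0,x,by=1/n); V=x-U
+ # lower bound: sup of max{F(u)+G(v)-1,0}
+ Hinf[i]=max(pmax(F(U)+G(V)-1,0))
+ # upper bound: inf of min{F(u)+G(v),1}
+ Hsup[i]=min(pmin(F(U)+G(V),1))}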

If we plot those bounds, we obtain

> plot(X,Hinf,ylim=c(0,1),type="s",col="red")
> lines(X,Hsup,type="s",col="red")
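As a sanity check on those two curves, the distribution of the sum under any actual dependence structure should lie between them. A minimal sketch, assuming the comonotonic case $X=Y$, for which $H(x)=F(x/2)$ when $F=G$; the function name `cdf_bounds` is a hypothetical wrapper around the loop above.

```r
F <- function(x) plnorm(x, 0, 1)
G <- function(x) plnorm(x, 0, 1)

# Same computation as the loop above, wrapped in a function
cdf_bounds <- function(X, F, G, n = 100) {
  Hinf <- Hsup <- rep(NA_real_, length(X))
  for (i in seq_along(X)) {
    U <- seq(0, X[i], by = 1 / n)   # grid of u, with v = x - u
    V <- X[i] - U
    Hinf[i] <- max(pmax(F(U) + G(V) - 1, 0))
    Hsup[i] <- min(pmin(F(U) + G(V), 1))
  }
  data.frame(x = X, Hinf = Hinf, Hsup = Hsup)
}

b <- cdf_bounds(seq(0, 10, by = 0.05), F, G)
H_como <- F(b$x / 2)                # cdf of X + Y when X = Y
all(b$Hinf <= H_como & H_como <= b$Hsup)
```

The grid only makes the bounds slightly wider than the exact ones (a maximum over fewer points, a minimum over fewer points), so the check should hold for any step.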

But somehow, it is even simpler to work with quantiles since they are defined on a bounded support, the unit interval. Quantiles are here

> Finv=function(u) qlnorm(u,0,1)
> Ginv=function(u) qlnorm(u,0,1)

The idea will be to consider a discretized version of the unit interval as discussed in Williamson (1989), in a much more general setting. Again the goal is to compute, for instance

$$\sup_{u\in[0,x]}\{F^{-1}(u)+G^{-1}(x-u)\}$$

We consider $x=i/n$ and $u=j/n$, and the bound for the quantile function at point $i/n$ is then

$$\sup_{j\in\{0,1,\cdots,i\}}\left\{F^{-1}\left(\frac{j}{n}\right)+G^{-1}\left(\frac{i-j}{n}\right)\right\}$$

The code to compute those bounds, for a given $n$, is here

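> n=1000
> Qinf=Qsup=rep(NA,n-1)
> for(i in 1:(n-1)){
+ # lower bound at level i/n: sup over u+v=i/n, with u=j/n
+ J=0:i
+ Qinf[i]=max(Finv(J/n)+Ginv((i-J)/n))
+ # upper bound at level i/n: inf over u+v=1+i/n, with u=(j+1)/n
+ J=(i-1):(n-1)
+ Qsup[i]=min(Finv((J+1)/n)+Ginv((i-1-J+n)/n))
+ }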

Here we have the bounds obtained for several values of $n$ (so that we can visualize the convergence of that numerical algorithm).
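The effect of $n$ can also be read from the numbers themselves. A minimal sketch, where `quantile_bounds` is a hypothetical wrapper around the loop above, and the level 0.5 and the three values of $n$ are arbitrary choices:

```r
Finv <- function(u) qlnorm(u, 0, 1)
Ginv <- function(u) qlnorm(u, 0, 1)

# Same loop as above, wrapped in a function of n
quantile_bounds <- function(n, Finv, Ginv) {
  Qinf <- Qsup <- rep(NA_real_, n - 1)
  for (i in 1:(n - 1)) {
    J <- 0:i
    Qinf[i] <- max(Finv(J / n) + Ginv((i - J) / n))
    J <- (i - 1):(n - 1)
    Qsup[i] <- min(Finv((J + 1) / n) + Ginv((i - 1 - J + n) / n))
  }
  data.frame(p = (1:(n - 1)) / n, Qinf = Qinf, Qsup = Qsup)
}

# Bounds at level p = 0.5 for increasing n
for (n in c(10, 100, 1000)) {
  b <- quantile_bounds(n, Finv, Ginv)
  print(c(n = n, unlist(b[n / 2, ])))
}
```

Note that for levels close to 1 the upper bound is infinite when the grid is too coarse, since all the grid points then involve the quantile of level 1.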

Here, we have simple code to visualize bounds for quantiles of the sum of two risks. But it is possible to go further…

Arthur Charpentier

Arthur Charpentier, professor at UQaM in Actuarial Science. Former professor-assistant at ENSAE Paristech, associate professor at Ecole Polytechnique and assistant professor in Economics at Université de Rennes 1. Graduated from ENSAE, Master in Mathematical Economics (Paris Dauphine), PhD in Mathematics (KU Leuven), and Fellow of the French Institute of Actuaries.
