# Pricing Reinsurance Contracts

October 24, 2013
By

(This article was first published on Freakonometrics » R-english, and kindly contributed to R-bloggers)

In order to illustrate the next section of the non-life insurance course, consider the following example1, inspired from http://sciencepolicy.colorado.edu/…. This is the so-called “Normalized Hurricane Damages in the United States” dataset, for the period 1900-2005, from Pielke et al. (2008). The dataset is available in xls format, so we have to spend some time to import it,

> library(gdata)
+ sheet=1)

Content type 'application/vnd.ms-excel' length 119296 bytes (116 Kb)
opened URL
==================================================

perl: warning: Setting locale failed.
LANGUAGE = "fr_CA:fr",
LC_ALL = (unset),
LANG = "fr_CA.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").

The problem with excel spreadsheets is that some columns might have pre-specified format (here, losses are with a format 000,000,000 for instance)

> tail(db)
Year Hurricane.Description State Category Base.Economic.Damage
202 2005                 Cindy    LA        1          320,000,000
203 2005                Dennis    FL        3        2,230,000,000
204 2005               Katrina LA,MS        3       81,000,000,000
205 2005               Ophelia    NC        1        1,600,000,000
206 2005                  Rita    TX        3       10,000,000,000
207 2005                 Wilma    FL        3       20,600,000,000
Normalized.PL05 Normalized.CL05  X X.1
202     320,000,000     320,000,000 NA  NA
203   2,230,000,000   2,230,000,000 NA  NA
204  81,000,000,000  81,000,000,000 NA  NA
205   1,600,000,000   1,600,000,000 NA  NA
206  10,000,000,000  10,000,000,000 NA  NA
207  20,600,000,000  20,600,000,000 NA  NA

To get data in a format we can play with, consider the following function,

> stupidcomma = function(x){
+ x=as.character(x)
+ for(i in 1:10){x=sub(",","",as.character(x))}
+ return(as.numeric(x))}

and let’s convert those values into numbers,

> base=db[,1:4]
> base$Base.Economic.Damage=Vectorize(stupidcomma)(db$Base.Economic.Damage)
> base$Normalized.PL05=Vectorize(stupidcomma)(db$Normalized.PL05)
> base$Normalized.CL05=Vectorize(stupidcomma)(db$Normalized.CL05)

Here is the dataset we will use, from now on,

> tail(base)
Year Hurricane.Description State Category Base.Economic.Damage
202 2005                 Cindy    LA        1             3.20e+08
203 2005                Dennis    FL        3             2.23e+09
204 2005               Katrina LA,MS        3             8.10e+10
205 2005               Ophelia    NC        1             1.60e+09
206 2005                  Rita    TX        3             1.00e+10
207 2005                 Wilma    FL        3             2.06e+10
Normalized.PL05 Normalized.CL05
202        3.20e+08        3.20e+08
203        2.23e+09        2.23e+09
204        8.10e+10        8.10e+10
205        1.60e+09        1.60e+09
206        1.00e+10        1.00e+10
207        2.06e+10        2.06e+10

We can visualize the normalized costs of hurricanes, from 1900 till 2005, with the 207 hurricanes (here the x-axis is not time, it is simply the index of the loss)

> plot(base$Normalized.PL05/1e9,type="h",ylim=c(0,155)) As usual, there are two components when computing the pure premium of an insurance contract. The number of claims (or here hurricanes) and the individual losses of each claim. We’ve seen – above – individual losses, let us focus now on the annual frequency. > TB <- table(base$Year)
> years <- as.numeric(names(TB))
> counts <- as.numeric(TB)
> years0=(1900:2005)[which(!(1900:2005)%in%years)]
> db <- data.frame(years=c(years,years0),
+ counts=c(counts,rep(0,length(years0))))
> db[88:93,]
years counts
88  2003      3
89  2004      6
90  2005      6
91  1902      0
92  1905      0
93  1907      0

On average, we experience about 2 (major) hurricanes per year,

> mean(db$counts) [1] 1.95283 In predictive modeling (here, we wish to price a reinsurance contract for, say, 2014), we need probably to take into account some possible trend in the hurricane occurrence frequency. We can consider either a linear trend, > reg0 <- glm(counts~years,data=db,family=poisson(link="identity"), + start=lm(counts~years,data=db)$coefficients)

or an exponential one,

We can plot those three predictions, and get a prediction for the number of (major) hurricanes in 2014,

> plot(years,counts,type='h',ylim=c(0,6),xlim=c(1900,2020))
> cpred1=predict(reg1,newdata=data.frame(years=1890:2030),type="response")
> lines(1890:2030,cpred1,col="blue")
> cpred0=predict(reg0,newdata=data.frame(years=1890:2030),type="response")
> lines(1890:2030,cpred0,col="red")
> abline(h=mean(db$counts),col="black") > (predictions=cbind(constant=mean(db$counts),linear=
+ cpred0[126],exponential=cpred1[126]))
constant   linear exponential
126  1.95283 3.573999    4.379822
> points(rep((1890:2030)[126],3),prediction,col=c("black","red","blue"),pch=19)

Observe that changing the model will change the pure premium: with a flat prediction, we expect less than 2 (major) hurricanes, but with the exponential trend, we expect more than 4…

This is for the expected frequency. Now, we should find a suitable model to compute the pure premium of a reinsurance treaty, with a (high) deductible, and a limited (but large) cover. As we will seen in class next week, the appropriate model is a Pareto distribution (see Hagstrœm (1925), Huyghues-Beaufond (1991) or a survey – in French – published a few years ago).

$F_{(\xi,\mu,\sigma)}(x) = \begin{cases}1 - \left(\ds{1+ \frac{\xi(x-\mu)}{\sigma}\right)^{-1/\xi}} & \text{for }\xi \neq 0, \\1 - \exp \left(-\frac{x-\mu}{\sigma}\right) & \text{for }\xi = 0.\end{cases}$

We can use Hill’s plot to estimate the tail index,

> library(evir)
> hill(base$Normalized.PL05) Clearly, costs of major hurricanes are heavy tailed. Now, consider an insurance company, in the U.S., with 5% market share (just to illustrate). We will consider there \tilde Y_i= Y_i/20. The losses are given below. Consider a reinsurance treaty, with a deductible of 2 (billion) and a limited cover of 4 (billion), For our Pareto model, consider only losses above 500 millions, > threshold=.5 > (gpd.PL <- gpd(base$Normalized.PL05/1e9/20,threshold)$par.ests) xi beta 0.4424669 0.6705315 Keep in mind the 1 hurricane out of 8 reaches that level > mean(base$Normalized.CL05/1e9/20>.5)
[1] 0.1256039

Given that the loss exceeds 500 millions, we can now compute the expected value of the reinsurance contact,

$\int_{d}^{d+c}(x-d)\ dF_{(\xi,\mu,\sigma)}(x)$

To compute it we can use

> E <- function(yinf,ysup,xi,beta){
+   as.numeric(integrate(function(x) (x-yinf)*dgpd(x,xi,mu=threshold,beta),
+   lower=yinf,upper=ysup)$value+ + (1-pgpd(ysup,xi,mu=threshold,beta))*(ysup-yinf)) + } Now, it is probably time to bring all the pieces together. We might expect a bit less than 2 (major) hurricanes per year, > predictions[1] [1] 1.95283 and each hurricane has 12.5% chances to cost more than 500 million for our insurance company, > mean(base$Normalized.PL05/1e9/20>.5)
[1] 0.1256039

and given that an hurricane exceeds 500 million loss, then the expected repayment by the reinsurance company is (in millions)

> E(2,6,gpd.PL[1],gpd.PL[2])*1e3
[1] 330.9865

So the pure premium of the reinsurance contract is simply

> predictions[1]*mean(base\$Normalized.PL05/1e9/20>.5)*
+ E(2,6,gpd.PL[1],gpd.PL[2])*1e3
[1] 81.18538

for a cover of 4 billion, in excess of 2.

1.This example will be found in the Reinsurance and Extremal Events chapter in the forthcoming Computational Actuarial Science with R, by Eric Gilleland and Mathieu Ribatet.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...