Statistical Methods for the Chain Ladder Technique Demo (CloudStat)

[This article was first published on CloudStat, and kindly contributed to R-bloggers].

This demo uses a case study: how a general insurance company can use CloudStat to estimate outstanding claims.

Background

Forecasting outstanding claims and setting up suitable reserves to meet these claims is an important part of the business of a general insurance company. Indeed, the published profits of these companies depend not only on the actual claims paid, but also on the forecasts of the claims which will have to be paid. It is essential, therefore, that a reliable estimate is available of the reserve to be set aside to cover claims, in order to ensure the financial stability of the company and its profit and loss account. A number of methods have proved useful in practice, one of which is extensively used and is known as the chain ladder technique.

The chain ladder method is a statistical method of estimating outstanding claims, whereby the weighted average of past claim development is projected into the future. The projection is based on the ratios of cumulative past claims, usually paid or incurred, for successive years of development. It requires the earliest year of origin to be fully run off, or at least that the final outcome for that year can be estimated with confidence. If appropriate, the method can be applied to past claims data that have been explicitly adjusted for past inflation.
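As a rough illustration of the projection described above, the following base R sketch computes volume-weighted development factors from the RAA triangle (the values printed in the Output section) and applies them to fill in the unobserved cells. The variable names (f, full, latest, ultimate, ibnr) are chosen here for illustration and are not part of the ChainLadder package.

```r
# RAA cumulative claims: rows are origin years 1981-1990, columns are development years 1-10
RAA <- rbind(
  c(5012,  8269, 10907, 11805, 13539, 16181, 18009, 18608, 18662, 18834),
  c( 106,  4285,  5396, 10666, 13782, 15599, 15496, 16169, 16704,    NA),
  c(3410,  8992, 13873, 16141, 18735, 22214, 22863, 23466,    NA,    NA),
  c(5655, 11555, 15766, 21266, 23425, 26083, 27067,    NA,    NA,    NA),
  c(1092,  9565, 15836, 22169, 25955, 26180,    NA,    NA,    NA,    NA),
  c(1513,  6445, 11702, 12935, 15852,    NA,    NA,    NA,    NA,    NA),
  c( 557,  4020, 10946, 12314,    NA,    NA,    NA,    NA,    NA,    NA),
  c(1351,  6947, 13112,    NA,    NA,    NA,    NA,    NA,    NA,    NA),
  c(3133,  5395,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA),
  c(2063,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA)
)
dimnames(RAA) <- list(origin = 1981:1990, dev = 1:10)
n <- ncol(RAA)

# Volume-weighted development factor for each pair of successive development years:
# total claims at k + 1 divided by total claims at k, over origin years observed at k + 1
f <- sapply(1:(n - 1), function(k) {
  seen <- !is.na(RAA[, k + 1])
  sum(RAA[seen, k + 1]) / sum(RAA[seen, k])
})

# Fill the unobserved cells column by column using the factors
full <- RAA
for (k in 1:(n - 1)) {
  todo <- is.na(full[, k + 1])
  full[todo, k + 1] <- full[todo, k] * f[k]
}

latest   <- apply(RAA, 1, function(r) tail(r[!is.na(r)], 1))  # last observed value per origin year
ultimate <- full[, n]                                         # projected final cumulative claims
ibnr     <- ultimate - latest                                 # outstanding amount per origin year
print(round(cbind(latest, ultimate, ibnr)))
```

Under these assumptions the projected ultimates agree with the MackChainLadder figures in the Output section, for example about 16,858 for origin year 1982.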

In recent years, a statistical framework for analyzing this data has been built up, which encompasses the actuarial method, extending and consolidating it. We hope to bring together these results and to illustrate how the chain ladder technique can be improved and extended, without altering the basic foundations upon which it has been built.

These improvements are designed to overcome two problems with the chain ladder technique. Firstly, not enough connection is made between the accident years, resulting in an over-parametrized model and unstable forecasts. Secondly, the development pattern is assumed to be the same for all accident years. No allowance is made by the chain ladder technique for any change in the speed with which claims are settled, or for any other factors which may change the shape of the run-off pattern.

Dataset

General insurance companies sell insurance policies and receive claims every day. These claims are indexed by their business year (the year of origin) and the delay (the development period).

We will use the following dataset, RAA, for illustration.

RAA is a run-off triangle of Automatic Facultative business in General Liability, stored as a matrix with 10 accident years and 10 development years, taken from Historical Loss Development, Reinsurance Association of America (RAA), 1991, p. 96.
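To make the indexing concrete, the sketch below enters the RAA triangle as a plain base R matrix (values as printed in the Output section) and derives the incremental claims and the latest diagonal from it. The names inc and latest are illustrative only; the ChainLadder package ships RAA directly, so this is not needed to run the demo.

```r
# Cumulative claims by origin year (rows) and development year (columns); NA = not yet observed
RAA <- rbind(
  c(5012,  8269, 10907, 11805, 13539, 16181, 18009, 18608, 18662, 18834),
  c( 106,  4285,  5396, 10666, 13782, 15599, 15496, 16169, 16704,    NA),
  c(3410,  8992, 13873, 16141, 18735, 22214, 22863, 23466,    NA,    NA),
  c(5655, 11555, 15766, 21266, 23425, 26083, 27067,    NA,    NA,    NA),
  c(1092,  9565, 15836, 22169, 25955, 26180,    NA,    NA,    NA,    NA),
  c(1513,  6445, 11702, 12935, 15852,    NA,    NA,    NA,    NA,    NA),
  c( 557,  4020, 10946, 12314,    NA,    NA,    NA,    NA,    NA,    NA),
  c(1351,  6947, 13112,    NA,    NA,    NA,    NA,    NA,    NA,    NA),
  c(3133,  5395,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA),
  c(2063,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA)
)
dimnames(RAA) <- list(origin = 1981:1990, dev = 1:10)

# Incremental claims: the amount added in each development year
inc <- RAA
inc[, -1] <- RAA[, -1] - RAA[, -ncol(RAA)]

# Latest diagonal: the most recent cumulative value for each origin year
latest <- RAA[cbind(1:10, 10:1)]
names(latest) <- rownames(RAA)

print(inc)
print(latest)
print(sum(latest))   # total cumulative claims observed to date
```

Note that an incremental value can be negative, as for origin year 1982 in development year 7, where the cumulative figure falls from 15599 to 15496.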

Functions

Mack-Chain-Ladder Model
The Mack-chain-ladder model forecasts future claims developments based on a historical cumulative claims development triangle and estimates the standard error around those forecasts.

The Mack-chain-ladder model can be regarded as a weighted linear regression through the origin for each development period: lm(y ~ x + 0, weights=w/x^(2-alpha)), where y is the vector of claims at development period k + 1 and x is the vector of claims at development period k.
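The regression view can be checked with base R alone. The sketch below fits the regression through the origin for the first development step of RAA (development year 1 to 2, origin years 1981 to 1989) for three values of alpha, taking w = 1. The helper name dev_factor is hypothetical and introduced only for this illustration.

```r
# Cumulative claims at development years 1 (x) and 2 (y) for origin years 1981-1989
x <- c(5012,  106, 3410,  5655, 1092, 1513,  557, 1351, 3133)
y <- c(8269, 4285, 8992, 11555, 9565, 6445, 4020, 6947, 5395)

# Weighted regression through the origin with weights 1 / x^(2 - alpha), i.e. w = 1
dev_factor <- function(alpha) {
  w <- 1 / x^(2 - alpha)
  fit <- lm(y ~ x + 0, weights = w)
  unname(coef(fit))
}

factors <- sapply(c(0, 1, 2), dev_factor)
names(factors) <- c("alpha=0", "alpha=1", "alpha=2")
print(factors)

# Closed forms the three fits should reproduce
print(c(mean(y / x),              # alpha = 0: simple average of individual ratios
        sum(y) / sum(x),          # alpha = 1: volume-weighted chain-ladder factor
        sum(x * y) / sum(x^2)))   # alpha = 2: ordinary regression through the origin
```

With alpha = 1 the slope equals sum(y) / sum(x), the usual chain-ladder factor. The alpha = 2 slope times 2063 gives about 4574.17, which appears consistent with the 1990 projection at development year 2 in the predict(RAA2) output.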

Bootstrap-Chain-Ladder Model
The BootChainLadder procedure provides a predictive distribution of reserves or IBNRs for a cumulative claims development triangle.

The BootChainLadder function uses a two-stage bootstrapping/simulation approach. In the first stage an ordinary chain-ladder method is applied to the cumulative claims triangle. From this we calculate the scaled Pearson residuals which we bootstrap R times to forecast future incremental claims payments via the standard chain-ladder method. In the second stage we simulate the process error with the bootstrap value as the mean and using the process distribution assumed. The set of reserves obtained in this way forms the predictive distribution, from which summary statistics such as mean, prediction error or quantiles can be derived.
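The residual step of the first stage can be sketched in base R as follows. This is a simplified illustration and not the BootChainLadder source: it back-fits the triangle from the latest diagonal, computes Pearson residuals of the incremental claims, applies a degrees-of-freedom adjustment (assumed here to use 55 observations and 2n - 1 = 19 parameters), and draws a single pseudo-triangle. The forecasting and process-error simulation of the second stage are omitted, and all variable names are illustrative.

```r
RAA <- rbind(
  c(5012,  8269, 10907, 11805, 13539, 16181, 18009, 18608, 18662, 18834),
  c( 106,  4285,  5396, 10666, 13782, 15599, 15496, 16169, 16704,    NA),
  c(3410,  8992, 13873, 16141, 18735, 22214, 22863, 23466,    NA,    NA),
  c(5655, 11555, 15766, 21266, 23425, 26083, 27067,    NA,    NA,    NA),
  c(1092,  9565, 15836, 22169, 25955, 26180,    NA,    NA,    NA,    NA),
  c(1513,  6445, 11702, 12935, 15852,    NA,    NA,    NA,    NA,    NA),
  c( 557,  4020, 10946, 12314,    NA,    NA,    NA,    NA,    NA,    NA),
  c(1351,  6947, 13112,    NA,    NA,    NA,    NA,    NA,    NA,    NA),
  c(3133,  5395,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA),
  c(2063,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA)
)
n   <- ncol(RAA)
obs <- !is.na(RAA)

# Ordinary (volume-weighted) chain-ladder factors
f <- sapply(1:(n - 1), function(k) {
  seen <- !is.na(RAA[, k + 1])
  sum(RAA[seen, k + 1]) / sum(RAA[seen, k])
})

# Back-fit the observed part of the triangle from the latest diagonal
fitted_cum <- matrix(NA_real_, n, n)
for (i in 1:n) {
  last <- n - i + 1                      # latest observed development year for origin i
  fitted_cum[i, last] <- RAA[i, last]
  if (last > 1) {
    for (k in (last - 1):1) fitted_cum[i, k] <- fitted_cum[i, k + 1] / f[k]
  }
}

# Cumulative -> incremental claims
to_incremental <- function(cum) cbind(cum[, 1], cum[, -1] - cum[, -ncol(cum)])
actual_inc <- to_incremental(RAA)
fitted_inc <- to_incremental(fitted_cum)

# Pearson residuals, scale parameter and adjusted residuals
pearson   <- (actual_inc - fitted_inc) / sqrt(fitted_inc)
n_obs     <- sum(obs)                    # 55 observed cells
n_par     <- 2 * n - 1                   # assumed number of fitted parameters
phi       <- sum(pearson^2, na.rm = TRUE) / (n_obs - n_par)
adj_resid <- pearson * sqrt(n_obs / (n_obs - n_par))

# One bootstrap draw: resample residuals and rebuild pseudo incremental claims
set.seed(1)
resampled <- matrix(NA_real_, n, n)
resampled[obs] <- sample(adj_resid[obs], n_obs, replace = TRUE)
pseudo_inc <- fitted_inc + resampled * sqrt(fitted_inc)

print(round(pearson, 2))
print(phi)
```

In a full bootstrap, the resampling step would be repeated R times, with each pseudo-triangle re-cumulated and projected by the chain-ladder method before the process error is simulated.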

Code

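library(ChainLadder)

# Inspect and plot the RAA run-off triangle
RAA
plot(RAA)
plot(RAA, lattice=TRUE)

# Mack chain-ladder: point estimates and standard errors
M = MackChainLadder(Triangle = RAA, est.sigma = "Mack")
M
plot(M)
plot(M, lattice=TRUE)

# Bootstrap chain-ladder with an over-dispersed Poisson process distribution
set.seed(1)
B = BootChainLadder(Triangle = RAA, R = 999, process.distr = "od.pois")
B
plot(B)

# Chain-ladder regressions with weights = RAA and delta = 1, then the projected triangle
RAA2 = chainladder(RAA, weights=RAA, delta=1)
predict(RAA2)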

Output

> library(ChainLadder)
> RAA
      dev
origin    1     2     3     4     5     6     7     8     9    10
  1981 5012  8269 10907 11805 13539 16181 18009 18608 18662 18834
  1982  106  4285  5396 10666 13782 15599 15496 16169 16704    NA
  1983 3410  8992 13873 16141 18735 22214 22863 23466    NA    NA
  1984 5655 11555 15766 21266 23425 26083 27067    NA    NA    NA
  1985 1092  9565 15836 22169 25955 26180    NA    NA    NA    NA
  1986 1513  6445 11702 12935 15852    NA    NA    NA    NA    NA
  1987  557  4020 10946 12314    NA    NA    NA    NA    NA    NA
  1988 1351  6947 13112    NA    NA    NA    NA    NA    NA    NA
  1989 3133  5395    NA    NA    NA    NA    NA    NA    NA    NA
  1990 2063    NA    NA    NA    NA    NA    NA    NA    NA    NA
> plot(RAA)
[Figure: output of plot(RAA); image not reproduced]
> plot(RAA, lattice=TRUE)
[Figure: output of plot(RAA, lattice=TRUE); image not reproduced]
> M = MackChainLadder(Triangle = RAA, est.sigma = "Mack")
> M
MackChainLadder(Triangle = RAA, est.sigma = "Mack")

     Latest Dev.To.Date Ultimate   IBNR Mack.S.E CV(IBNR)
1981 18,834       1.000   18,834      0        0      NaN
1982 16,704       0.991   16,858    154      206    1.339
1983 23,466       0.974   24,083    617      623    1.010
1984 27,067       0.943   28,703  1,636      747    0.457
1985 26,180       0.905   28,927  2,747    1,469    0.535
1986 15,852       0.813   19,501  3,649    2,002    0.549
1987 12,314       0.694   17,749  5,435    2,209    0.406
1988 13,112       0.546   24,019 10,907    5,358    0.491
1989  5,395       0.336   16,045 10,650    6,333    0.595
1990  2,063       0.112   18,402 16,339   24,566    1.503

Totals
Dev: 0.76 
Ultimate: 213,122.23 
IBNR: 52,135.23 
Mack S.E.: 26,909.01 
CV(IBNR): 0.516138742518263 
> plot(M)
[Figure: output of plot(M); image not reproduced]
> plot(M, lattice=TRUE)
[Figure: output of plot(M, lattice=TRUE); image not reproduced]
> set.seed(1)
> B = BootChainLadder(Triangle = RAA, R = 999, process.distr = "od.pois")
> B
BootChainLadder(Triangle = RAA, R = 999, process.distr = "od.pois")

     Latest Mean Ultimate Mean IBNR SD IBNR IBNR 75% IBNR 95%
1981 18,834        18,834         0       0        0        0
1982 16,704        16,921       217     710      253    1,597
1983 23,466        24,108       642   1,340    1,074    3,205
1984 27,067        28,739     1,672   1,949    2,679    4,980
1985 26,180        29,077     2,897   2,467    4,149    7,298
1986 15,852        19,611     3,759   2,447    4,976    8,645
1987 12,314        17,724     5,410   3,157    7,214   11,232
1988 13,112        24,219    11,107   5,072   14,140   20,651
1989  5,395        16,119    10,724   6,052   14,094   21,817
1990  2,063        18,714    16,651  13,426   24,459   42,339

Totals
Latest: 160,987 
Mean Ultimate: 214,066 
Mean IBNR: 53,079 
SD IBNR: 18,884 
Total IBNR 75%: 64,788 
Total IBNR 95%: 88,037 
> plot(B)
[Figure: output of plot(B); image not reproduced]
> RAA2 = chainladder(RAA, weights=RAA, delta=1)
> predict(RAA2)
      dev
origin    1         2         3         4        5        6        7        8
  1981 5012  8269.000 10907.000 11805.000 13539.00 16181.00 18009.00 18608.00
  1982  106  4285.000  5396.000 10666.000 13782.00 15599.00 15496.00 16169.00
  1983 3410  8992.000 13873.000 16141.000 18735.00 22214.00 22863.00 23466.00
  1984 5655 11555.000 15766.000 21266.000 23425.00 26083.00 27067.00 27938.45
  1985 1092  9565.000 15836.000 22169.000 25955.00 26180.00 27241.19 28118.25
  1986 1513  6445.000 11702.000 12935.000 15852.00 17432.56 18139.18 18723.19
  1987  557  4020.000 10946.000 12314.000 14308.52 15735.19 16373.00 16900.15
  1988 1351  6947.000 13112.000 16532.776 19210.62 21126.06 21982.39 22690.14
  1989 3133  5395.000  8464.494 10672.786 12401.48 13638.00 14190.80 14647.69
  1990 2063  4574.169  7176.649  9048.957 10514.63 11563.02 12031.72 12419.09
      dev
origin        9       10
  1981 18662.00 18834.00
  1982 16704.00 16857.95
  1983 23838.84 24058.55
  1984 28382.35 28643.94
  1985 28565.00 28828.28
  1986 19020.67 19195.98
  1987 17168.66 17326.90
  1988 23050.65 23263.10
  1989 14880.42 15017.57
  1990 12616.41 12732.69
>
