# Statistical Methods for the Chain Ladder Technique Demo (CloudStat)

This article was first published on **CloudStat** and kindly contributed to R-bloggers.

This demo uses a case study: how a general insurance company can use CloudStat to estimate outstanding claims.

**Background**

Forecasting outstanding claims and setting up suitable reserves to meet these claims is an important part of the business of a general insurance company. Indeed, the published profits of these companies depend not only on the actual claims paid, but also on the forecasts of the claims which will have to be paid. It is essential, therefore, that a reliable estimate is available of the reserve to be set aside to cover claims, in order to ensure the financial stability of the company and its profit and loss account. A number of methods have proved useful in practice; one of the most extensively used is the chain ladder technique.

The chain ladder method is a statistical method for estimating outstanding claims, in which the weighted average of past claim development is projected into the future. The projection is based on the ratios of cumulative past claims, usually paid or incurred, for successive years of development. It requires the earliest year of origin to be fully run off, or at least that the final outcome for that year can be estimated with confidence. If appropriate, the method can be applied to past claims data that have been explicitly adjusted for past inflation.
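The projection described above can be sketched in base R. This is a minimal illustration, assuming the volume-weighted form of the average (the sum of claims at one development year divided by the sum at the previous one). The triangle values are copied from the RAA output printed later in this post; the names `raa`, `f`, `full` and `ibnr` are illustrative and are not part of the ChainLadder package.

```r
# RAA cumulative claims triangle, values copied from the output in this post
# (rows: origin years 1981-1990; columns: development years 1-10).
raa <- matrix(c(
  5012,  8269, 10907, 11805, 13539, 16181, 18009, 18608, 18662, 18834,
   106,  4285,  5396, 10666, 13782, 15599, 15496, 16169, 16704,    NA,
  3410,  8992, 13873, 16141, 18735, 22214, 22863, 23466,    NA,    NA,
  5655, 11555, 15766, 21266, 23425, 26083, 27067,    NA,    NA,    NA,
  1092,  9565, 15836, 22169, 25955, 26180,    NA,    NA,    NA,    NA,
  1513,  6445, 11702, 12935, 15852,    NA,    NA,    NA,    NA,    NA,
   557,  4020, 10946, 12314,    NA,    NA,    NA,    NA,    NA,    NA,
  1351,  6947, 13112,    NA,    NA,    NA,    NA,    NA,    NA,    NA,
  3133,  5395,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,
  2063,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA),
  nrow = 10, byrow = TRUE, dimnames = list(origin = 1981:1990, dev = 1:10))

n <- ncol(raa)

# Volume-weighted age-to-age factor for each step k -> k + 1: sum of claims
# at k + 1 divided by sum of claims at k, over the origin years for which
# both development years are observed.
f <- sapply(1:(n - 1), function(k) {
  obs <- !is.na(raa[, k + 1])
  sum(raa[obs, k + 1]) / sum(raa[obs, k])
})

# Fill the unobserved lower-right part by applying the factors in turn.
full <- raa
for (k in 1:(n - 1)) {
  to_fill <- is.na(full[, k + 1])
  full[to_fill, k + 1] <- full[to_fill, k] * f[k]
}

latest   <- sapply(1:n, function(i) raa[i, n - i + 1])  # latest diagonal
ultimate <- full[, n]                                   # projected final claims
ibnr     <- ultimate - latest                           # outstanding claims

round(f, 3)
round(sum(ibnr), 2)
```

If the triangle has been entered correctly, `sum(ibnr)` should agree with the total IBNR that `MackChainLadder` reports in the output further down (52,135.23).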

In recent years, a statistical framework for analyzing this data has been built up, which encompasses the actuarial method, extending and consolidating it. We hope to bring together these results and to illustrate how the chain ladder technique can be improved and extended, without altering the basic foundations upon which it has been built.

These improvements are designed to overcome two problems with the chain ladder technique. Firstly, not enough connection is made between the accident years, resulting in an over-parametrized model and unstable forecasts. Secondly, the development pattern is assumed to be the same for all accident years. No allowance is made by the chain ladder technique for any change in the speed with which claims are settled, or for any other factors which may change the shape of the run-off pattern.

**Dataset**

General insurance companies sell insurance policies and receive claims every day. These claims are indexed by their business year (the year of origin) and by the delay (the development year).

We will use the following dataset, RAA, for illustrative purposes.

RAA is a run-off triangle of Automatic Facultative business in General Liability, stored as a matrix with 10 accident years and 10 development years, taken from Historical Loss Development, Reinsurance Association of America (RAA), 1991, p. 96.

**Functions**

__Mack-Chain-Ladder Model__

The Mack-chain-ladder model forecasts future claims development based on a historical cumulative claims development triangle and estimates the standard error around those forecasts.

The Mack-chain-ladder model can be regarded as a weighted linear regression through the origin for each development period: `lm(y ~ x + 0, weights=w/x^(2-alpha))`, where `y` is the vector of claims at development period k + 1 and `x` is the vector of claims at development period k.

__Bootstrap-Chain-Ladder Model__

The BootChainLadder procedure provides a predictive distribution of reserves (IBNR) for a cumulative claims development triangle.

The BootChainLadder function uses a two-stage bootstrapping/simulation approach. In the first stage, the ordinary chain-ladder method is applied to the cumulative claims triangle. From this we calculate the scaled Pearson residuals, which we bootstrap R times to forecast future incremental claims payments via the standard chain-ladder method. In the second stage, we simulate the process error, using the bootstrap value as the mean and the assumed process distribution. The set of reserves obtained in this way forms the predictive distribution, from which summary statistics such as the mean, prediction error or quantiles can be derived.
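The two stages can be sketched in base R as follows. This is a simplified illustration, not the package's implementation: it assumes volume-weighted factors, a degrees-of-freedom adjustment of the residuals, and a gamma distribution with mean m and variance phi * m as a stand-in for the over-dispersed Poisson process error. All function and variable names are hypothetical, and with only 200 replications the figures will not match the `BootChainLadder` output exactly.

```r
# RAA cumulative triangle (values copied from the output in this post).
raa <- matrix(c(
  5012,  8269, 10907, 11805, 13539, 16181, 18009, 18608, 18662, 18834,
   106,  4285,  5396, 10666, 13782, 15599, 15496, 16169, 16704,    NA,
  3410,  8992, 13873, 16141, 18735, 22214, 22863, 23466,    NA,    NA,
  5655, 11555, 15766, 21266, 23425, 26083, 27067,    NA,    NA,    NA,
  1092,  9565, 15836, 22169, 25955, 26180,    NA,    NA,    NA,    NA,
  1513,  6445, 11702, 12935, 15852,    NA,    NA,    NA,    NA,    NA,
   557,  4020, 10946, 12314,    NA,    NA,    NA,    NA,    NA,    NA,
  1351,  6947, 13112,    NA,    NA,    NA,    NA,    NA,    NA,    NA,
  3133,  5395,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,
  2063,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA),
  nrow = 10, byrow = TRUE)
n <- ncol(raa)
observed <- !is.na(raa)
N <- sum(observed)   # number of observed cells
p <- 2 * n - 1       # parameters of the chain-ladder model (assumption)

# Volume-weighted chain-ladder factors of a cumulative triangle.
cl_factors <- function(tri) sapply(1:(n - 1), function(k) {
  rows <- 1:(n - k)  # origin years with both k and k + 1 observed
  sum(tri[rows, k + 1]) / sum(tri[rows, k])
})
# Cumulative rows -> incremental rows.
to_incremental <- function(tri) cbind(tri[, 1], t(apply(tri, 1, diff)))

# Stage 1a: fitted cumulative claims, back-cast from the latest diagonal.
f <- cl_factors(raa)
fitted_cum <- matrix(NA_real_, n, n)
for (i in 1:n) {
  last <- n - i + 1
  fitted_cum[i, last] <- raa[i, last]
  if (last > 1) for (k in (last - 1):1) fitted_cum[i, k] <- fitted_cum[i, k + 1] / f[k]
}
m   <- to_incremental(fitted_cum)            # fitted incremental claims
res <- (to_incremental(raa) - m) / sqrt(abs(m))  # Pearson residuals
phi <- sum(res[observed]^2) / (N - p)        # scale parameter
adj_res <- res[observed] * sqrt(N / (N - p)) # adjusted residuals

simulate_reserve <- function() {
  # Stage 1b: resample residuals, rebuild a pseudo triangle, refit, project.
  r_star <- matrix(NA_real_, n, n)
  r_star[observed] <- sample(adj_res, N, replace = TRUE)
  pseudo_cum <- t(apply(m + r_star * sqrt(abs(m)), 1, cumsum))
  f_star <- cl_factors(pseudo_cum)
  proj <- pseudo_cum
  for (k in 1:(n - 1)) {
    rows <- (n - k + 1):n                    # origin years not yet at k + 1
    proj[rows, k + 1] <- proj[rows, k] * f_star[k]
  }
  future_mean <- to_incremental(proj)[!observed]
  # Stage 2: process error around each forecast incremental payment.
  sim <- sign(future_mean) *
    rgamma(length(future_mean), shape = abs(future_mean) / phi, scale = phi)
  sum(sim)
}

set.seed(1)
reserves <- replicate(200, simulate_reserve())
c(mean = mean(reserves), sd = sd(reserves), quantile(reserves, c(0.75, 0.95)))
```

The vector `reserves` plays the role of the predictive distribution: its mean, standard deviation and upper quantiles correspond to the kind of summary that `BootChainLadder` prints.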

**Code**

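    library(ChainLadder)

    # Inspect and plot the RAA run-off triangle
    RAA
    plot(RAA)
    plot(RAA, lattice=TRUE)

    # Mack chain ladder: point forecasts with standard errors
    M = MackChainLadder(Triangle = RAA, est.sigma = "Mack")
    M
    plot(M)
    plot(M, lattice=TRUE)

    # Bootstrap chain ladder: predictive distribution of the reserves
    set.seed(1)
    B = BootChainLadder(Triangle = RAA, R = 999, process.distr = "od.pois")
    B
    plot(B)

    # Chain ladder as a weighted regression, then complete the triangle
    RAA2 = chainladder(RAA, weights=RAA, delta=1)
    predict(RAA2)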

    > library(ChainLadder)
    > RAA
          dev
    origin    1     2     3     4     5     6     7     8     9    10
      1981 5012  8269 10907 11805 13539 16181 18009 18608 18662 18834
      1982  106  4285  5396 10666 13782 15599 15496 16169 16704    NA
      1983 3410  8992 13873 16141 18735 22214 22863 23466    NA    NA
      1984 5655 11555 15766 21266 23425 26083 27067    NA    NA    NA
      1985 1092  9565 15836 22169 25955 26180    NA    NA    NA    NA
      1986 1513  6445 11702 12935 15852    NA    NA    NA    NA    NA
      1987  557  4020 10946 12314    NA    NA    NA    NA    NA    NA
      1988 1351  6947 13112    NA    NA    NA    NA    NA    NA    NA
      1989 3133  5395    NA    NA    NA    NA    NA    NA    NA    NA
      1990 2063    NA    NA    NA    NA    NA    NA    NA    NA    NA

    > M = MackChainLadder(Triangle = RAA, est.sigma = "Mack")
    > M
    MackChainLadder(Triangle = RAA, est.sigma = "Mack")

         Latest Dev.To.Date Ultimate   IBNR Mack.S.E CV(IBNR)
    1981 18,834       1.000   18,834      0        0      NaN
    1982 16,704       0.991   16,858    154      206    1.339
    1983 23,466       0.974   24,083    617      623    1.010
    1984 27,067       0.943   28,703  1,636      747    0.457
    1985 26,180       0.905   28,927  2,747    1,469    0.535
    1986 15,852       0.813   19,501  3,649    2,002    0.549
    1987 12,314       0.694   17,749  5,435    2,209    0.406
    1988 13,112       0.546   24,019 10,907    5,358    0.491
    1989  5,395       0.336   16,045 10,650    6,333    0.595
    1990  2,063       0.112   18,402 16,339   24,566    1.503

    Totals
    Dev:       0.76
    Ultimate:  213,122.23
    IBNR:      52,135.23
    Mack S.E.: 26,909.01
    CV(IBNR):  0.516138742518263
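The totals in the Mack output are related by simple arithmetic, which the short check below reproduces. It only uses figures printed in this post; the variable names are illustrative, and the small differences from the printed values come from working with rounded inputs.

```r
# Totals copied from the output in this post.
latest_total   <- 160987      # sum of the "Latest" column
ultimate_total <- 213122.23   # total projected ultimate claims
mack_se_total  <- 26909.01    # Mack standard error of the total reserve

ibnr_total  <- ultimate_total - latest_total  # IBNR = Ultimate - Latest
dev_to_date <- latest_total / ultimate_total  # share of the ultimate reached so far
cv_ibnr     <- mack_se_total / ibnr_total     # CV(IBNR) = Mack S.E. / IBNR

c(IBNR = ibnr_total, Dev = round(dev_to_date, 2), CV = cv_ibnr)
```

The same relations hold row by row; for 1982, for example, 16,858 - 16,704 = 154 and 16,704 / 16,858 is about 0.991.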

    > set.seed(1)
    > B = BootChainLadder(Triangle = RAA, R = 999, process.distr = "od.pois")
    > B
    BootChainLadder(Triangle = RAA, R = 999, process.distr = "od.pois")

         Latest Mean Ultimate Mean IBNR SD IBNR IBNR 75% IBNR 95%
    1981 18,834        18,834         0       0        0        0
    1982 16,704        16,921       217     710      253    1,597
    1983 23,466        24,108       642   1,340    1,074    3,205
    1984 27,067        28,739     1,672   1,949    2,679    4,980
    1985 26,180        29,077     2,897   2,467    4,149    7,298
    1986 15,852        19,611     3,759   2,447    4,976    8,645
    1987 12,314        17,724     5,410   3,157    7,214   11,232
    1988 13,112        24,219    11,107   5,072   14,140   20,651
    1989  5,395        16,119    10,724   6,052   14,094   21,817
    1990  2,063        18,714    16,651  13,426   24,459   42,339

    Totals
    Latest:          160,987
    Mean Ultimate:   214,066
    Mean IBNR:        53,079
    SD IBNR:          18,884
    Total IBNR 75%:   64,788
    Total IBNR 95%:   88,037

    > RAA2 = chainladder(RAA, weights=RAA, delta=1)
    > predict(RAA2)
          dev
    origin    1         2         3         4        5        6        7        8
      1981 5012  8269.000 10907.000 11805.000 13539.00 16181.00 18009.00 18608.00
      1982  106  4285.000  5396.000 10666.000 13782.00 15599.00 15496.00 16169.00
      1983 3410  8992.000 13873.000 16141.000 18735.00 22214.00 22863.00 23466.00
      1984 5655 11555.000 15766.000 21266.000 23425.00 26083.00 27067.00 27938.45
      1985 1092  9565.000 15836.000 22169.000 25955.00 26180.00 27241.19 28118.25
      1986 1513  6445.000 11702.000 12935.000 15852.00 17432.56 18139.18 18723.19
      1987  557  4020.000 10946.000 12314.000 14308.52 15735.19 16373.00 16900.15
      1988 1351  6947.000 13112.000 16532.776 19210.62 21126.06 21982.39 22690.14
      1989 3133  5395.000  8464.494 10672.786 12401.48 13638.00 14190.80 14647.69
      1990 2063  4574.169  7176.649  9048.957 10514.63 11563.02 12031.72 12419.09
          dev
    origin        9       10
      1981 18662.00 18834.00
      1982 16704.00 16857.95
      1983 23838.84 24058.55
      1984 28382.35 28643.94
      1985 28565.00 28828.28
      1986 19020.67 19195.98
      1987 17168.66 17326.90
      1988 23050.65 23263.10
      1989 14880.42 15017.57
      1990 12616.41 12732.69
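The ultimates in this last table differ from those in the Mack table (for 1990, 12,732.69 against 18,402). A plausible reading, sketched below in base R, follows from the regression formula given under the Mack-chain-ladder model: assuming `delta` plays the role of 2 - alpha, passing `weights = RAA` with `delta = 1` makes the regression weights `RAA / RAA^1 = 1`, that is, an ordinary least-squares fit through the origin instead of the volume-weighted factor. The helper `wls_factor` is hypothetical and only mirrors that formula; the data are the first two development columns printed above.

```r
# Development years 1 and 2 for the origin years where both are observed.
x <- c(5012, 106, 3410, 5655, 1092, 1513, 557, 1351, 3133)
y <- c(8269, 4285, 8992, 11555, 9565, 6445, 4020, 6947, 5395)

# Slope of a weighted regression through the origin with weights w / x^delta.
wls_factor <- function(x, y, w, delta) {
  unname(coef(lm(y ~ x + 0, weights = w / x^delta)))
}

# weights = claims, delta = 1: the weights cancel to 1 (ordinary least squares).
f_ols <- wls_factor(x, y, w = x, delta = 1)

# unit weights, delta = 1: the volume-weighted chain-ladder factor.
f_volume <- wls_factor(x, y, w = rep(1, length(x)), delta = 1)

2063 * f_ols     # compare with 4574.169 for origin 1990, development year 2
2063 * f_volume  # first projected value under the volume-weighted factor
```

Under this reading, the first factor equals `sum(x * y) / sum(x^2)` and the second equals `sum(y) / sum(x)`, which is why the two sets of projections diverge most for the youngest origin years.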
