**Freakonometrics - Tag - R-english**, and kindly contributed to R-bloggers)

The final course (since courses end this week in Montréal) can be watched here and there. The drawings from the course can be downloaded here (including last week’s). First, to come back on last week’s course , we considered

Lee-carter model, i.e.

Note that it is possible to go a bit further, and to consider something that can be seen as a second order development

or even more generally

This idea was implicitly in the initial paper published in the 80’s. Hence, I mentioned in the course that (weighted) least square techniques can be considered to estimate parameters,

and the the first order condition is simply

(which was the interpretation that it is simply the average (over years) of mortality rate). Given that estimator, it is possible to write

and then to use the first components of the singular value decomposition to derive estimators.About the *singular value decomposition* (SVD, with a nice graph stolen from wikipedia, here, on the right – actually the French article on SVD is a bit more punchy, from Снайперская винтовка

Драгунова, there). Note that SVD is closely related to PCA…

Thus, instead of focusing only on the first principal components, it is possible to go up to order 2.

But it is also possible to add, instead of a time component, a cohort-based component, i.e.

The estimation is rather simple, as shown below

D=as.vector(BASEB)

E=as.vector(BASEC)

A=rep(AGE,each=length(ANNEE))

Y=rep(ANNEE,length(AGE))

C=Y-A

base=data.frame(D,E,A,Y,C,a=as.factor(A),

y=as.factor(Y),c=as.factor(C))

LC4=gnm(D~a+Mult(a,y)+Mult(a,c),offset=log(E),

family=poisson,data=base)

Here, parameters have the following shape.

plot(AGE[-1],LC2$coefficients[2:90])

plot(AGE,LC2$coefficients[91:180])

plot(ANNEE,LC2$coefficients[181:279])

plot(AGE,LC2$coefficients[280:369])

plot(1808:1997,LC2$coefficients[370:559])

The first component is rather similar to what we had before

On the other hand, the Lee-Carter (age-year) component is

and the cohort effect (age-cohort) is

Note that the year-component captures wars and chocks that can affect anyone some given years, and that gains on life expectancy is more a cohort effect.

Finally, we discussed actuarial applications. But first, recall that in actuarial literature (without any dynamics), it is standard to defined

and

The mortality rate is then

i.e.

Thus,

Complete expectation of life is then

or its discrete version

Hence, it is possible to write

which can be extended in the dynamic framework as follows

A natural estimator of that quantity is

i.e., with Lee-Carter model

All those quantities can be computed quite simply.

But first, I have to work a little bit on the dataset, at least to be able to predict mortality up to 100 years old (with the demography package, we have to stop at 90). One idea can be to mix two estimation techniques: the nonlinear Poisson regression to get and up to 99 years old, and then to use Rob Hyndman’s package to estimate and predict the component. But first, we have to check that ‘s and ‘s

with the two techniques are not too different. The code for the first model is

BASEB=BASEB[,1:99]

BASEC=BASEC[,1:99]

AGE=AGE[1:99]

library(gnm)

D=as.vector(BASEB)

E=as.vector(BASEC)

A=rep(AGE,each=length(ANNEE))

Y=rep(ANNEE,length(AGE))

base=data.frame(D,E,A,Y,a=as.factor(A),

y=as.factor(Y))

LC2=gnm(D~a+Mult(a,y)+offset=log(E),

family=poisson,data=base)

while it is

BASEB=BASEB[,1:90]

BASEC=BASEC[,1:90]

AGE=AGE[1:90]

library(demography)

MU=as.matrix(BASEB/BASEC)

base=demogdata(data=t(MU),pop=t(BASEC),

ages=AGE,years=ANNEE,type="mortality",

label="France",name="Total",lambda=0)

LC3=lca(base,interpolate=TRUE)

LC3F=forecast(LC3,150)

for the second one. Then, we can compare predictions for ‘s e.g. (with output from econometric regression in blue, and Rob’s package in red)

plot(AGE,LC3$ax,lwd=2,col="red",xlab="",ylab="")

lines(1:98,LC2$coefficients[1]+LC2$coefficients[2:99],

lwd=2,col="blue")

The second problem is that we have to use a linear transformation, since in the econometric package, we do not use the same constraints as in Rob’s package. Here, it was

Then we can compute predicted for all ages and years,

A=LC2$coefficients[1]+LC2$coefficients[2:99]

B=LC2$coefficients[100:190]

K1=LC3$kt

K2=LC3$kt[99]+LC3F$kt.f$mean

K=-c(K1,K2)/50

MU=matrix(NA,length(A),length(K))

for(i in 1:length(A)){

for(j in 1:length(K)){

MU[i,j]=exp(A[i]+B[i]*K[j])

}}

Once we have that matrix, we simply have to work on diagonal (i.e. cohorts) to calculate anything. For instance, to get the dynamic version of , the code is (here for someone of age 40 in 2000)

It is also possible to compute residual life expectancy , including dynamics

x=40

E=rep(NA,150)

for(t in 1900:2040){

s=seq(0,90-x-1)

MUd=MU[x+1+s,t+s-1898]

Pxt=cumprod(exp(-diag(MUd)))

ext=sum(Pxt)

E[t-1899]=ext}

plot(1900:2049,E)

Note that this has been computed given the maximum lifetime we had in the dataset (here 99 years old). Hence, we assume here that no one can live more than 100 year (which will impact life expectancy and is a rather strong assumption, especially in 2050 – which might explain the concave shape of the curve on the right part).

It is also possible to compute annuities. For instance, consider a 40 years old insured, in 2000, willing to get 1 every year after age 70, until he dies, i.e.

The annuity for such a contract is

x=40

t=2000

r=.035

m=70

h=seq(0,21)

V=1/(1+r)^(m-x+h)*Pxt[m-x+h]

sum(V)

VV=rep(NA,150)

for(t in 1900:2040){

s=seq(0,90-x-1)

MUd=MU[x+1+s,t+s-1898]

Pxt=cumprod(exp(-diag(MUd)))

h=seq(0,30)

V=1/(1+r)^(m-x+h)*Pxt[m-x+h]

VV[t-1899]=sum(V,na.rm=TRUE)}

plot(1900:2049,VV)

(in order to compare the value of such a contract, from 1900 up to 2050).

**leave a comment**for the author, please follow the link and comment on their blog:

**Freakonometrics - Tag - R-english**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...