# Model life tables by @ellis2013nz

This article was first published on **free range statistics - R**, and kindly contributed to R-bloggers.


I am working to improve my knowledge of demography. This is something I’ve only had a relatively superficial engagement with but it’s an important part of the responsibilities of the team where I work.

A key tool in demography is a “life table”, which is basically a table where rows are ages or age groups and columns are different calculations relating to the chance of the cohort in that age group dying in a given year. The cumulative probability of survival lets you directly calculate life expectancy from any given age.

In a country with a very effective statistical system, these life tables can be estimated pretty directly. You take the number of people dying at each age from the death registrations, and the denominator of people per age cohort from the latest population projection.
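To make the mechanics concrete, here is a minimal sketch in R of that direct calculation. It is not taken from the original post. The death counts and populations are invented from an arbitrary exponential curve and the variable names are hypothetical. The conversion from death rates to probabilities assumes deaths fall evenly through each year of age, which is a simplification (especially for infants).

```r
# Invented inputs for illustration only: NOT real registration or population data.
age    <- 0:100
pop    <- rep(100000, length(age))               # people at each age
deaths <- round(pop * 0.0001 * exp(0.09 * age))  # deaths at each age in one year

mx <- deaths / pop          # age-specific death rate
qx <- mx / (1 + 0.5 * mx)   # probability of dying between age x and x + 1
qx[length(qx)] <- 1         # close the table: nobody survives the final age group

lx <- c(1, cumprod(1 - qx)[-length(qx)])  # cumulative probability of surviving to exact age x
dx <- lx * qx                             # share of the starting cohort dying at age x
Lx <- lx - 0.5 * dx                       # person-years lived at age x
Lx[length(Lx)] <- lx[length(lx)] / mx[length(mx)]  # open-ended final age group
Tx <- rev(cumsum(rev(Lx)))                # person-years remaining from age x onwards
ex <- Tx / lx                             # life expectancy at each age

life_table <- data.frame(age, mx, qx, lx, ex)
head(life_table)
```

The `ex` column gives remaining life expectancy at every age, so `ex[1]` is life expectancy at birth and `ex[61]` is life expectancy at age 60.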

But if your death registry data is incomplete or your population projection is unreliable, it is a much tougher job. And death registration is incomplete in many countries, enough so that indicator 17.19.2 under Sustainable Development Goal 17, “Partnerships for the Goals”, includes a target for 80% of deaths to be registered. Many countries are going to struggle to meet that target by 2030.

To meet this use case (very common in developing countries), the United Nations provides a set of model life tables:

“The United Nations and the demographic research community at large commonly use two sets of standard model life table families to derive a variety of mortality indicators and underlying mortality patterns for estimation and projection (Coale-Demeny, 1966 and 1989; United Nations, 1982). These two sets of model life tables, designed primarily for use in developing countries or for estimating historic populations, are limited to mortality patterns for a life span from age 20 to 75. A revised set of model life tables, extending the initials sets from life expectancy at birth (e(0)) from age 75.0 up to 92.5, uses both a limited life table as an asymptotic pattern and the classic Lee-Carter approach to derive intermediate age patterns (Buettner, 2002).”

The basic procedure to use these, as I understand it, is that you:

- get one or more statistics that you *can* measure – like infant or under-five mortality (which can be estimated with care from survey or census data) and the probability of surviving to age 60 if you are 15,
- use your judgement and demographer community wisdom to decide the “family” of life tables that is most appropriate for your country, and
- then from that family you pick the particular model life table that most closely matches the statistic or statistics that you do have.
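The last of those steps can be sketched roughly as below. This is only an illustration under loud assumptions: `toy_family` is a hypothetical stand-in for one family of model life tables, its numbers are invented placeholders rather than values from the UN tables, and the squared relative distance is just one plausible way of defining "most closely matches".

```r
# Hypothetical stand-in for one family: one row per life expectancy at birth.
# All values are invented placeholders, NOT taken from any published model life table.
toy_family <- data.frame(
  e0     = c(40, 50, 60, 70, 80),
  q0     = c(0.200, 0.140, 0.085, 0.040, 0.012),  # infant mortality
  p15_60 = c(0.55, 0.65, 0.75, 0.84, 0.92)        # probability of surviving from 15 to 60
)

# Statistics measured for the country of interest (also invented)
observed <- c(q0 = 0.090, p15_60 = 0.76)

# Squared relative distance between each candidate table and the observed statistics
rel_dist <- with(
  toy_family,
  ((q0 - observed[["q0"]]) / observed[["q0"]])^2 +
    ((p15_60 - observed[["p15_60"]]) / observed[["p15_60"]])^2
)

# The candidate table with the smallest distance is the chosen one
best <- toy_family[which.min(rel_dist), ]
best
```

With real tables you would run the same comparison within the family chosen in the previous step, and then read the mortality rates at all the other ages from the selected table.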

Alternatively it is possible to model the curves (of death rate ~ age) directly with a logistic function of some sort – using the curves of the model tables as a starting point and modifying the parameters according to the statistics available. That takes me beyond today’s scope I think.

In essence, either way, you are relying on a “typical” shape for mortality at the ages where you can’t measure it directly.

These are the families available, grouped under the two types of “Coale-Demeny” or “United Nations”:

| type | family | n |
|---|---|---|
| CD East | East | 21222 |
| CD North | North | 21222 |
| CD South | South | 21222 |
| CD West | West | 21222 |
| UN Chilean | Chilean | 21222 |
| UN Far_East_Asian | Far_East_Asian | 21222 |
| UN General | General | 21222 |
| UN Latin | Latin | 21222 |
| UN South_Asian | South_Asian | 21222 |

Those values of ‘n’ are the number of rows of data associated with each family. Each family has 81 complete life tables for each of male and female – a life table for each value of “life expectancy at birth” from 20 to 100. The life tables contain mortalities for 131 ages, from 0 to 130. And 81 * 131 * 2 = 21222:

    > range(mlt_raw$age)
    [1] 0 130
    > range(mlt_raw$e0)
    [1] 20 100
    > 81*131*2
    [1] 21222
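The same arithmetic can be checked by building an empty skeleton with the structure described above. This is a sketch rather than the real data: it contains no mortality values, and the sex labels and column names are assumptions.

```r
# Family names as listed in the table above
families <- c("East", "North", "South", "West", "Chilean",
              "Far_East_Asian", "General", "Latin", "South_Asian")

# One row per combination of age, life expectancy at birth, sex and family.
# The sex labels and column names are assumed, not read from the UN file.
skeleton <- expand.grid(
  age    = 0:130,
  e0     = 20:100,
  sex    = c("Female", "Male"),
  family = families,
  stringsAsFactors = FALSE
)

rows_per_family <- table(skeleton$family)
rows_per_family
nrow(skeleton)
```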

So this lets us do some interesting comparisons. For example, you can look at the relationship between infant mortality and life expectancy for each family of the model life tables:

Here’s the code for everything so far – downloading the model life tables from the UN, reading them into R, counting the families and drawing that plot of life expectancies:

Alternatively we can look at a more summarised version of the data by comparing a couple of particular demographic statistics, at a given life expectancy, for the different model life tables. Here’s my attempt at repeating (but adding in a sex dimension) Figure 1 from this UNFPA instructional site.

I like this plot. It lets you see at a glance how the different families of model life table vary in at least one or two important ways. For example, we can see that the “South” (Coale-Demeny) and “South Asian” (UN) tables have relatively high child mortality (and then of course compensating low adult mortality) for a given life expectancy.

That was done with this code. Note my struggles in the comments with exactly what is meant by age 5, age 60 etc., and hence which column to use from the life table. Significant expertise with life tables involves understanding the exact ways adjustments are made for things like the difference between age x and the average age when people are age x, the fact that infants who die in their first year tend to die early in it rather than late, and so on:
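As a separate illustration of those age conventions (this is not the code that drew the plot), here is a minimal sketch of how child and adult mortality can be read off a survivorship column. It assumes a vector `lx` holding the proportion surviving to each *exact* age, starting at age 0. The survivorship curve is invented, and the helper `prob_dying_between()` is hypothetical.

```r
# Probability of dying between exact ages `from` and `to`, given survival to `from`.
# lx[x + 1] is the proportion of the cohort alive at exact age x (ages start at 0).
prob_dying_between <- function(lx, from, to) {
  1 - lx[to + 1] / lx[from + 1]
}

# Invented single-year death probabilities for ages 0 to 130, for illustration only
age <- 0:130
qx  <- pmin(1, 0.0005 + 0.00005 * exp(0.1 * age))

# Survivors to each exact age, starting from a cohort of size 1
lx <- c(1, cumprod(1 - qx))[seq_along(age)]

child_mortality <- prob_dying_between(lx, 0, 5)    # dying before the fifth birthday
adult_mortality <- prob_dying_between(lx, 15, 60)  # dying between ages 15 and 60, if alive at 15

c(child = child_mortality, adult = adult_mortality)
```

Working from exact ages this way, "under five mortality" uses the survivors at exact age 5, so it covers deaths at ages 0 to 4 in completed years.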

OK so that’s a nice representation of the overall life expectancy, but what about the gritty detail of the mortality rates at each individual age? One way to look at this is via an animation:

I quite like this for giving you an overview of the mortality rates of the different families and different life expectancies, but it’s not great for comparing, say, two different families with the same life expectancy.

The animation was produced with this code:

To better make the comparisons that I felt the animation wasn't good at (particularly family to family of model life table), I made a shiny app. See below, or:

If I had a bit more oomph I would add some tooltips and stuff, but I think I've done enough to feel I'm getting the hang of this thing.

As mentioned earlier, this isn't an area of deep expertise for me. I'm quite likely to have got some details of the terminology wrong, for example, so use the above with caution, and please add comments for anything you spot that I can correct.
