Age-Structure Adjusted All-Cause Mortality

[This article was first published on Theory meets practice..., and kindly contributed to R-bloggers].


This page is an updated version of the 2020-12-28 blog post Age Stratified All-Cause and COVID-19 Associated Mortality. The post considers the age stratified all-cause and COVID-19 associated mortality in Germany during 2020 based on numbers provided by the Federal Statistical Office and the Robert Koch Institute. Important extensions compared to the original post are:

  • an improved population adjusted expected mortality calculation
  • an update of the analysis containing the 2020 numbers as of 2021-03-13

Note: The present analyses were previously kept in a separate R-Markdown file, which was updated until 2021-01-29. The present text is a conversion of this document into a blog post, including a treatment of week 53, but does not contain any updated interpretations compared to the 2021-01-29 version.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. The markdown+Rknitr source code of this blog is available under a GNU General Public License (GPL v3) from github.


All-cause mortality is one of the indicators used to measure the impact of the COVID-19 pandemic, because it is less biased by the testing strategy needed to identify people who died while having a recent COVID-19 diagnosis. Several resources (OurWorldInData, Financial Times, Aburto et al. (2021)) have used this indicator to compare the COVID-19 response between countries. Since both death and COVID-19 have a strong age component, it appears crucial to take an age-stratified view of both all-cause mortality and deaths associated with COVID-19.

Age Stratified Mortality

Real-time mortality monitoring is not common in Germany, as can be seen from the coverage of the EuroMoMo monitoring for Germany, where only the two federal states Hesse and Berlin participate. However, as part of the COVID-19 response, the Federal Statistical Office (Destatis) now provides weekly updated preliminary statistics of all-cause mortality in 2020. The methodology behind the numbers as well as additional age-stratified analyses are described in an accompanying publication (zur Nieden, Sommer, and Lüken 2020). The age-stratified analyses are unfortunately not continuously updated; however, up-to-date data are made publicly available.

The age-stratified reported COVID-19 associated deaths (by week of death) are obtained from an export of the RKI. In order to compensate for reporting delays of deaths, the Destatis analysis only goes until 4 weeks before the time point of analysis, but in this post we shall only be interested in the 2020 results. As an additional data source we use the age-stratified weekly time series by time of death reported by the RKI.

Preliminary Destatis Mortality Data

All-cause deaths (by week of death) are available from Destatis until 2021-W07. However, we will only use the data until the end of 2020 in this post. Note that Destatis stresses the preliminary character of the data: the numbers might change as further deaths are reported, but as of 2021-03-13 changes are expected to be small. Stratified by age, the time series for 2020 compared to the years 2016-2019 looks as follows (please beware of the different y-axes for the age groups in the plot):

Note: Because the years 2016-2019 do not contain a week 53, we obtain a hypothetical week 53 value for year \(Y, Y=2016,\ldots, 2019\), for comparison in the plot by averaging the observed counts in \(Y\)-W52 and \((Y+1)\)-W01.

Since the age groups contain different population sizes and, as pointed out by Ragnitz (2021), the population sizes of the age groups changed considerably during 2016-2019, comparing age groups by incidence rate (i.e. deaths per 100,000 population in the age group) is better than comparing absolute numbers. For this, the yearly Destatis age-specific population data available for the cut-off date Dec 31st for 2015-2019 are linearly interpolated for the weeks. An estimate of the 2020 population is obtained from the 2020 value of the Destatis population projection (Variant G2-L2-W2).
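As a rough illustration of the interpolation step, a minimal Python sketch could look as follows. The function name `interpolate_population`, the choice of the reference day within a week and all numbers are assumptions made for this example; they are not taken from the original R code or from Destatis data.

```python
from datetime import date

def interpolate_population(day, cutoffs):
    """Linearly interpolate a population size between the two enclosing cut-off dates.

    `cutoffs` maps a cut-off date (e.g. Dec 31st of a year) to a population size.
    """
    dates = sorted(cutoffs)
    if not dates[0] <= day <= dates[-1]:
        raise ValueError("day lies outside the range of the available cut-off dates")
    for left, right in zip(dates, dates[1:]):
        if left <= day <= right:
            # Fraction of the way from the left to the right cut-off date
            frac = (day - left).days / (right - left).days
            return cutoffs[left] + frac * (cutoffs[right] - cutoffs[left])
    return cutoffs[dates[0]]  # only reached if a single cut-off date was given

# Hypothetical population of one age group at two consecutive cut-off dates
example_cutoffs = {date(2018, 12, 31): 1000.0, date(2019, 12, 31): 1365.0}
pop_example = interpolate_population(date(2019, 1, 10), example_cutoffs)
```

In practice one would evaluate the function once per age group and week, using for example the Monday or the mid-point of each week as the reference day.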

We notice the strong increase in size of the [80,90) year old age group (increase by 26%) and the noticeable decline in the groups of [40,50) (-12%) and [70,80) (-9%) in just 5 years. Although not large in absolute size compared to the other age groups, the [90,Inf) group increased by 18%. These changes, and in particular those in the higher age groups, will be relevant for the analysis of excess mortality.

Once we have the weekly age-specific population estimates available we can compute the incidence per week and age-group. Below the weekly mortality incidence rate (per 100 000 population) is shown for 2020 compared to the minimum, mean and maximum of the corresponding week for the years 2016-2019.
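The rate computation and the min/mean/max reference band can be sketched in a few lines of Python. This is only an illustration for a single age group and a single calendar week; the helper name `rate_per_100k` and all counts are hypothetical and not taken from the Destatis data.

```python
from statistics import mean

def rate_per_100k(deaths, population):
    """Mortality incidence rate: deaths per 100,000 population."""
    return deaths / population * 100_000

# Hypothetical deaths and interpolated population for ONE age group
# and ONE calendar week in each reference year (illustrative values only).
deaths = {2016: 4000, 2017: 4410, 2018: 4400, 2019: 4830}
population = {2016: 5_000_000, 2017: 5_250_000, 2018: 5_500_000, 2019: 5_750_000}

rates = {year: rate_per_100k(deaths[year], population[year]) for year in deaths}

# Reference band against which the 2020 rate of the same week is compared
reference = {
    "min": min(rates.values()),
    "mean": mean(rates.values()),
    "max": max(rates.values()),
}
```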

Compared to the graphic with the absolute number of deaths, one notices that the population adjustment leads to a smaller excess in the [80,90) group (because the population in this group became larger). On the other hand, the Nov-Dec 2020 curve in the [70,80) group is now more clearly in excess of the expected. For further insights see also Kauermann et al. (2021) and Ragnitz (2021).

To underline the age-gradient of all-cause mortality: 56% of the deaths in 2016-2019 occurred in the age group of 80+ years (90% in the age group of 60+). It becomes clear that the 2020 mortality in the 80+ age groups was rather low during the first 10-12 weeks and then had a spike in connection with the first COVID-19 wave (March-April). Subsequently, a summer peak (possibly associated with heat) is followed by an upward trend that is still ongoing in December. One challenge for a statistical analysis of these numbers is to figure out how much of the upward trend is “catch-up mortality” due to the lower mortality in the beginning of the year and how much is excess related to COVID-19. However, both Kauermann et al. (2021) and Ragnitz (2021) show that the excess can be well explained by COVID-19 associated deaths.

An initial analysis of this question consists of summing the all-cause mortalities from W1 until W53 for 2020 (observed) and comparing this to the sum of the weekly means of 2016-2019 for the corresponding time period (expected). Since both 2020-W01 and 2020-W53 contain days outside the year 2020, we weight these two weeks in the summation according to their number of days in 2020 (i.e. 5/7 and 4/7). As a consequence, our number of deaths deviates slightly from a more exact calculation based on daily deaths (for which, however, age strata are not publicly available). Note: This calculation method ignores the population changes in the years 2016-2020. The resulting numbers are:

Age group   Observed 2020  Expected 2016-2019  Percent change  Min 2016-2019  Max 2016-2019
[00,30)              7285                7781             -6%           7508           8116
[30,40)              6811                6445              6%           6371           6512
[40,50)             15658               16917             -7%          15505          18557
[50,60)             57485               58060             -1%          56823          58943
[60,70)            118313              111742              6%         107755         114922
[70,80)            201411              211212             -5%         202435         216055
[80,90)            378011              340670             11%         325392         349797
[90,Inf)           199646              178355             12%         165384         185617
Total (a)          984621              931182              6%         906309         984621
(a) Min and max for the row ‘Total’ are obtained by first summing each of the years 2016-2019 and then taking the min and max.

Note: The exact mortality count for 2020 by Destatis using daily data is 985145, hence, the weighted weekly summation does produce a small discrepancy from this value. The total proportion of 2020-W1 to 2020-W53 mortalities in the 80+ age group is currently 59%, i.e. a slightly higher proportion than in the previous years.
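The day-based weights for the two boundary weeks follow directly from the ISO week calendar. The following Python sketch shows one way to derive them and to apply them in the yearly summation; the function names and the toy weekly counts are assumptions for illustration and not part of the original analysis code.

```python
from datetime import date, timedelta

def fraction_of_iso_week_in_year(iso_year, iso_week, calendar_year):
    """Fraction of the 7 days of an ISO week that fall into `calendar_year`."""
    monday = date.fromisocalendar(iso_year, iso_week, 1)
    days = [monday + timedelta(days=i) for i in range(7)]
    return sum(d.year == calendar_year for d in days) / 7

def weighted_year_total(weekly_deaths, year):
    """Sum weekly deaths (ISO week number -> count), down-weighting boundary weeks."""
    return sum(
        fraction_of_iso_week_in_year(year, week, year) * count
        for week, count in weekly_deaths.items()
    )

w01 = fraction_of_iso_week_in_year(2020, 1, 2020)   # 2019-12-30 to 2020-01-05
w53 = fraction_of_iso_week_in_year(2020, 53, 2020)  # 2020-12-28 to 2021-01-03

# Toy example with made-up counts for three weeks only
toy_total = weighted_year_total({1: 700, 2: 100, 53: 700}, 2020)
```

Here `w01` and `w53` reproduce the weights 5/7 and 4/7 used above, and all other weeks of 2020 receive weight 1.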

A population age-structure adjusted estimate can be obtained using an indirect standardization approach (Keiding and Clayton 2014): For each age group we compute the weekly incidence rates 2016-2019, then for each week we take the min, mean and max of the 2016-2019 incidences. To obtain the expected number of deaths for a given age group and week based on the 2016-2019 data, we multiply the 2016-2019 mean incidence by the corresponding 2020 population of that age group and week. Because the estimated incidence rates for 2016-2019 are based on slightly different population sizes, one could acknowledge this in the mean calculation by instead computing an inverse variance weighted mean (or by using logistic regression). However, the numerical differences are negligible, so we proceed with the equal weighting of the mean. The computations lead to the following table for the age-population adjusted 2020 mortality:

Age group   Observed 2020  Expected 2016-2019  Percent change
[00,30)              7285                7770             -6%
[30,40)              6811                6732              1%
[40,50)             15658               15990             -2%
[50,60)             57485               58709             -2%
[60,70)            118313              118596              0%
[70,80)            201411              202851             -1%
[80,90)            378011              387314             -2%
[90,Inf)           199646              194453              3%
Total              984621              992416             -1%

What looked like a small excess in the raw calculations becomes a small negative excess after population adjustment by the proposed method. This shows the importance of computing the expected number of cases in a population-adjusted way, which was the point of Ragnitz (2021).
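The core of the indirect standardization is a single multiplication per age group: the reference rate times the 2020 population. A minimal Python sketch is given below. The function names, the yearly aggregation and the rates and populations are hypothetical; the analysis in this post applies the same idea per age group and week and then sums over the weeks. Only the totals in the last two lines are taken from the two tables above.

```python
def expected_deaths(mean_rate_per_100k, population_2020):
    """Expected deaths per age group: reference rate times the 2020 population."""
    return {
        group: rate * population_2020[group] / 100_000
        for group, rate in mean_rate_per_100k.items()
    }

def percent_change(observed, expected):
    """Relative deviation of the observed from the expected count, in percent."""
    return 100 * (observed / expected - 1)

# Hypothetical 2016-2019 mean rates (deaths per 100,000) and 2020 populations
mean_rate = {"[60,70)": 1100.0, "[70,80)": 2700.0}
pop_2020 = {"[60,70)": 10_000_000, "[70,80)": 7_500_000}
expected = expected_deaths(mean_rate, pop_2020)

# Totals as reported in the raw and in the population adjusted table
raw_change = percent_change(984621, 931182)
adjusted_change = percent_change(984621, 992416)
```

Rounded to whole percent, `raw_change` and `adjusted_change` give the 6% and -1% shown in the total rows of the two tables.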

Altogether, the mild mortality in the older age groups during the first weeks (e.g. due to a mild influenza season) so far balances the continuing excess in the higher age groups since Mar-Apr, which coincides with the start of the COVID-19 pandemic. If one is interested in COVID-19 associated deaths, an alternative might be to focus on the period since Mar 2020, but then one would ignore the low influenza season in the beginning of the year, which is IMO relevant for an all-cause mortality excess analysis. For comparison, excess mortality computations for influenza in the northern hemisphere are often done not by calendar year but by season, i.e. for the period from July in Year \(X\) to June in Year \(X+1\) (Nielsen et al. 2011). In this case one would associate the first COVID-19 peak with the analysis of the season 2019/2020 and the ongoing 2nd wave with an analysis of the ongoing season 2020/2021. The present analysis, however, focused on the calendar year 2020 following the Destatis presentation of the data.

When interpreting the above results it is important to realize that the 2020 population numbers are currently projections. In a recent press release Destatis announced that the population in 2020 might not have increased as projected, because of a smaller migration surplus, higher mortality and lower birth numbers. As a consequence, Destatis now estimates that the 2020 population consisted of 83.20 million individuals (no population increase compared to 2019), whereas the G2-L2-W2 projection variant projected 83.38 million (a population increase of 0.3% compared to 2019). Final figures, including more detailed age-stratified data, will be available mid 2021 and it will be interesting to see how these impact the excess mortality calculations.

Furthermore, it is important to realize that the currently observed 2020 mortality numbers contain the consequences of all types of effects from the pandemic management, which includes changes in the population behavior due to interventions. Disentangling the complex effects of all-cause mortality and the COVID-19 pandemic is a delicate matter, which takes experts in several disciplines (demographers, statisticians, epidemiologists) to solve. However, should you, based on the above numbers, happen to think that COVID-19 is not a serious problem, it is insightful to think about the prevention paradox, to take a look at the all-cause mortality statistics from other countries and to consider a more regional analysis as done in the next section.

Analysis for the 16 Federal States

The preliminary all-cause mortality data are also available for each of the 16 federal states, however, with a coarser age discretization. We show for each of the two age-groups the weekly population adjusted mortality relative to the same week in 2016-2019. Note that since the age-groups are also coarser, i.e. the data contains only the two groups [00,65) and [65,Inf), the effect of the changing populations on the estimates is less pronounced.

The plot can also be compared to the recently introduced Destatis visualization of absolute case numbers on state level. We note the strong regional differences in the plot. As an example, the highest mortality in the 65+ age group occurs in 2020-W52 in the federal state of Sachsen, where the population adjusted mortality is 122% above the mean of 2016-2019. Note that for Sachsen, even the <65 age group has a visible excess. The 3 federal states with the highest excess mortality week in the 65+ age group are:
Federal state  Week      Excess mortality
Sachsen        2020-W52              122%
Brandenburg    2020-W53               72%
Thüringen      2020-W52               68%

All-Cause Mortality and COVID-19 Associated Deaths

To see how much of the all-cause mortality is directly contributed by deaths in association with COVID-19, we match the age-stratified all-cause mortality data with the age-stratified COVID-19 deaths reported by the RKI. As of 2021-01-14 the RKI provides weekly time series of date of death in age groups of 10 years. Hence, the previous rough extrapolation from day of report used in the original blog post is not needed anymore.

We note that a large part of the increase in mortality can be explained by COVID-19, i.e. after subtracting the COVID-19 associated deaths the remainder of the mortality seems comparable with previous years. This is good news, because deviations would be early signs of possible secondary adverse health effects due to the COVID-19 pandemic, e.g. patients trying to avoid the hospital, or emergency care that cannot be given due to a lack of resources.

We note that the COVID-19 associated deaths in the most recent weeks in the [80,90) age group make up approximately 25% of all deaths reported in the week.
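The two quantities discussed in this section, the remainder after subtracting COVID-19 associated deaths and the COVID-19 share of all deaths, are simple per-week and per-age-group computations. A small Python sketch with made-up counts (the function names and numbers are hypothetical, not RKI or Destatis values) could be:

```python
def non_covid_deaths(all_cause, covid):
    """All-cause deaths minus the reported COVID-19 associated deaths."""
    if covid > all_cause:
        raise ValueError("COVID-19 deaths cannot exceed all-cause deaths")
    return all_cause - covid

def covid_share(all_cause, covid):
    """Proportion of all deaths that are COVID-19 associated."""
    if covid > all_cause:
        raise ValueError("COVID-19 deaths cannot exceed all-cause deaths")
    return covid / all_cause

# Made-up counts for one age group in one week
remainder = non_covid_deaths(all_cause=8000, covid=2000)
share = covid_share(all_cause=8000, covid=2000)
```

The remainder is what would be compared to the 2016-2019 reference band, while the share corresponds to the kind of percentage quoted above.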


Considering all-cause mortality and COVID-19 associated mortality as a measure for the impact of a pandemic is a rather simplistic view of the pandemic. COVID-19 infections can be very mild, but complicated progressions can occur without leading to death (see, e.g., long COVID). Looking at mortality also ignores the complex interplay between age groups, where it can be beneficial to reduce infections in a not-so-affected-by-the-disease age group in order to protect the higher-risk groups. The motivation of this post was primarily to put COVID-19 associated mortality in relation to all-cause mortality in order to get a better understanding of the daily number of COVID-19 deaths. An age-stratified view is necessary for this. Furthermore, as pointed out by Ragnitz (2021), excess mortality calculations should take the changing population structure into account.

For a more model-based analysis of the German COVID-19 associated mortality data based on reported infections see also the work by Linden et al. (2020) (updated analysis). More information on real-time mortality monitoring can be obtained from the EuroMoMo methodology page or Höhle and Mazick (2010). Comments and feedback on the analysis in this blog post are much appreciated.


Aburto, José Manuel, Jonas Schöley, Luyin Zhang, Ilya Kashnitsky, Charles Rahal, Trifon I. Missov, Melinda C. Mills, Jennifer B. Dowd, and Ridhi Kashyap. 2021. “Recent Gains in Life Expectancy Reversed by the Covid-19 Pandemic.” medRxiv.

Höhle, M., and A. Mazick. 2010. “Aberration Detection in R Illustrated by Danish Mortality Monitoring.” In Biosurveillance: A Health Protection Priority, edited by T. Kass-Hout and X. Zhang, 215–38. CRC Press.

Kauermann, G., M. Schneble, G. De Nicola, and U. Berger. 2021. “Übersterblichkeit in Deutschland – Große Unterschiede Zwischen Den Bundesländern- in Sachsen Sehr Starke übersterblichkeit Mit Und Ohne Corona-Todesfälle.” Department of Statistics, University of Munich.

Keiding, N., and D. Clayton. 2014. “Standardization and Control for Confounding in Observational Studies: A Historical Perspective.” Statistical Science 29 (4): 529–58.

Linden, M., J. Dehning, S. B. Mohr, J. Mohring, M. Meyer-Hermann, I. Pigeot, A. Schöbel, and V. Priesemann. 2020. “Case Numbers Beyond Contact Tracing Capacity Are Endangering the Containment of Covid-19.” Dtsch Arztebl Int 117: 790–91.

Nielsen, J., A. Mazick, S. Glismann, and K. Mølbak. 2011. “Excess Mortality Related to Seasonal Influenza and Extreme Temperatures in Denmark, 1994-2010.” BMC Infectious Diseases 11 (1): 350.

Ragnitz, J. 2021. “Hat Die Corona-Pandemie Zu Einer übersterblichkeit in Deutschland Geführt? – Aktualisierung 15.1.2020.” ifo Institute.

zur Nieden, F., B. Sommer, and S. Lüken. 2020. “Sonderauswertung Der Sterbefallzahlen 2020.” WISTA – Wirtschaft Und Statistik – Amtliche Statistik in Zeiten von Corona, no. 4 (August): 38–50.
