Do excess deaths exceed leading causes of death?

In most countries they do (if we can trust the estimates)

This post examines how the latest available estimates of cumulative excess deaths during this pandemic compare to mortality patterns prior to the pandemic. Anchored to 2019, we compare the severity of the pandemic’s excess death toll to 123 causes of death documented in the Global Burden of Disease database (level-3 aggregation). While caveats apply relating to standard errors and the possibility of bias, this approach provides a new perspective on the understated severity of the pandemic, particularly in the developing world.

Comparing excess mortality estimates with pre-pandemic patterns

Before we start, let us first motivate the approach here and provide a few remarks on (1) excess deaths and how we measure them and (2) why we compare excess mortality estimates with pre-pandemic patterns of mortality.  A fuller discussion can be found in Annexes 1 and 2 to this post.

Why excess mortality estimates?

The concept of excess mortality provides the best method to assess the true and total mortality impact of the pandemic. By capturing mortality patterns beyond what would be expected under normal circumstances, we get an amalgamated insight into the effects of factors such as the misattribution of underlying cause of death or the indirect channels through which the pandemic raises mortality. 

Excess death measures, however, are far from perfect. First, even in countries with solid CRVS systems that provide timely data, excess deaths remain estimates since they require an assessment of the counterfactual of what would have happened under normal circumstances. Second, the estimates require all-cause mortality data, which are not available everywhere on a regular basis. As a result, excess deaths are available only for some 80 countries. 

That has recently changed.  New estimates have become available that directly estimate excess deaths in data-poor environments. The excess death estimates by The Economist are currently the best-in-class. They’re not without limitations themselves – see details in the methods section below – but they give us a solid starting point and are notable for their transparent disclosure of assumptions, methods and standard errors around the results.

This post uses the mid-point estimates of The Economist’s excess death model, i.e. the mid-points of the 95% confidence interval around them. For more information about alternative representations, this companion post contrasts the results based on mid-point excess death estimates with those based on officially reported COVID-19 stats as well as those based on the lower bounds and upper bounds of the excess death estimates.

Why compare with pre-pandemic patterns?

We will be comparing the mid-point estimates of cumulative excess mortality with patterns of reported mortality prior to the pandemic. One comparison will be between the level of excess deaths during the pandemic and the level of deaths reported in 2019 where the cause of death belongs to one of the top 3 for each country in that year.

Why is this useful? 

  • Comparisons with top causes of death before the pandemic help dimension the severity of the pandemic’s mortality toll by providing an intuitive reference point. Statements such as “the pandemic is claiming more lives than heart attack or stroke before the pandemic” may foster a better appreciation of the severity of the pandemic than “the excess death ratio is X per 100,000”.  
  • The country specificity of the comparisons also helps dimension severity. The comparisons are country specific both in terms of the identification of the top causes of death and the level of mortality associated with them. For example, the excess mortality rate in country X may exceed that of country Y, but it may exceed the Top 2 cause of death (e.g. heart attack) in country X but only the Top 3 cause of death (e.g. stroke) in country Y. See here for a discussion of why that is helpful.
  • Due to data limitations we cannot provide a contemporaneous assessment of the leading causes of death during the pandemic. While such data are available in some countries, they are not in most, and our focus in this post is on providing a global assessment with special emphasis on the developing world. We therefore anchor the discussion to a comparison to what transpired before the pandemic and we take 2019 as the base year. Such data are available for almost all countries.

Interpretation of results

The chart below shows the top cause of death in 2019 that is exceeded by the estimated excess death toll during the pandemic. We focus on the leading causes of death, which we take to be the Top 3. A country is colored red when excess deaths exceed cause of death #1. The different shades of orange apply this logic to causes of death #2 and #3. Green refers to countries where excess deaths do not exceed the top 3 causes. Excess deaths may be negative for some countries, in which case they do not exceed the death toll of any of the 123 disease families considered.
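The coloring rule can be sketched in a few lines of Python. This is only an illustration of the logic described above, not the code behind the map: the function name, the color labels and the example numbers are all made up for the purpose, and the inputs are assumed to be yearly figures (annualized excess deaths and 2019 deaths for the three leading causes).

```python
def map_color(annual_excess_deaths, top3_deaths_2019):
    """Return the map color for one country.

    annual_excess_deaths: excess deaths expressed per year (may be negative).
    top3_deaths_2019: 2019 deaths for causes #1, #2 and #3, in that order.
    The labels below are hypothetical names for the map's color classes.
    """
    labels = ["red", "orange (cause #2)", "orange (cause #3)"]
    for label, cause_deaths in zip(labels, top3_deaths_2019):
        # The first (highest-ranked) cause that is exceeded decides the color.
        if annual_excess_deaths > cause_deaths:
            return label
    # None of the top 3 is exceeded, which includes negative excess deaths.
    return "green"


# Hypothetical country: causes #1-#3 claimed 100, 80 and 60 lives in 2019.
print(map_color(90, [100, 80, 60]))
```

With these made-up numbers, 90 excess deaths fall short of cause #1 but exceed cause #2, so the country would land in the first orange class.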

How to interpret these results?

  • The mid-points suggest that the pandemic has claimed more lives than the leading causes of death in most countries. This underscores once again the point that this is and has been a severe pandemic in terms of its mortality impact. So much for the argument that COVID is just a flu. 
  • Excess deaths have been particularly high in Latin America. But the pattern appears to extend to all other regions in the world. The green areas are the notable exceptions where we should differentiate between countries such as Australia where excess deaths are estimated to be below 0 and others where the estimates are positive but less severe than the three leading causes of death.
  • The pandemic has been severe in countries where one would have predicted a priori a milder impact. Many countries in Africa, for example, are very young. Their demographic structure should provide some protection against this age-discriminating infectious disease pandemic. The excess death estimates suggest that these advantages may have been offset by other factors, which must be a combination of higher infection prevalence, higher age-adjusted infection fatality rates, greater measurement challenges and/or a greater contribution from indirect deaths.

As an aside, note that the full details of the methodology are available in Annex 2. Note also that this is a clickable map. The tooltips provide a step-by-step guide for each country on how the results were derived. For a non-interactive version of the map, click on “download image” within the map or here.

Next, let us characterize the distribution of countries by the ranking observed in the previous map.  

The chart above shows the number of countries that have suffered high severity of excess mortality with reference to the leading causes of death. For example, the first bar shows the number of countries on the Y axis (and the share of countries in the global total of 189 as a label on the bar), for which the Top #1 cause of death is exceeded by the measure of excess mortality. The last bar shows the number of countries where none of the leading causes of death (the top 3) are exceeded. The results suggest that the leading causes are exceeded for the vast majority of countries.
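For readers who want to reproduce the bar chart from their own country-level classification, the tally is a simple count plus a share of the total. The sketch below assumes a dictionary mapping each country to the category it ended up in; the function name, category labels and the four placeholder countries are hypothetical and carry no actual results.

```python
from collections import Counter


def tally_categories(category_by_country):
    """Count countries per category and express each count as a share of the total."""
    counts = Counter(category_by_country.values())
    total = len(category_by_country)
    return {category: (n, n / total) for category, n in counts.items()}


# Placeholder data for four unnamed countries (not real results).
example = {"A": "cause #1", "B": "cause #1", "C": "cause #2", "D": "none of top 3"}
for category, (n, share) in tally_categories(example).items():
    print(f"{category}: {n} countries ({share:.0%})")
```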


Annex 1. Caveats about excess death estimates

A number of caveats apply to the excess death estimates used. 

First, the standard errors of the estimates. The two visualizations immediately above illustrate the sometimes large confidence intervals around the mid-point estimates of excess deaths. The first map shows the country distribution by leading cause of death if we were to take the upper-bound estimates of the 90% confidence interval around the mid-points. The second map shows the same for the lower-bound estimates of that confidence interval.

  • The confidence intervals are especially large in Sub-Saharan Africa, which would be colored virtually entirely red if we were to apply the upper bound of the 90% confidence interval instead of the mid-point and virtually entirely green if we used the lower bound (check out the companion post to see these differences). 
  • The large intervals do not necessarily undermine the validity of the model, which is well-calibrated across different substrata for which we have data (more on that later). But they do highlight the more limited predictive value of the model in environments of data scarcity. It’s entirely natural that prediction errors are larger when less information is available.

Second, the possibility of bias. Not only are the excess death measures estimated with a considerable degree of imprecision, especially in the poorer countries, there are also reasons why the estimates may be biased (i.e. misrepresentative of reality). The Economist acknowledges two main reasons why, and these turn out to be important qualifiers rather than afterthoughts:

  • “The [excess-death tallies] rely on the assumption that officially published excess-mortality numbers are accurate. Given the disruption that covid-19 has caused, it is plausible that some governments may have changed how they compile data on total deaths during the pandemic. This might lead us to publish incorrect figures for the countries in question. It could also introduce errors into the estimates that our model produces for all other countries.”
  • “Because most countries that report excess deaths are rich or middle-income, the bulk of the data used to train our model comes from such places. The patterns that the model detects in these areas could thus be an inaccurate guide to the dynamics of the pandemic in poor countries. A similar caveat applies to our estimates for countries that have suffered lots of excess deaths for reasons other than the pandemic, such as war or natural disasters.”

The second bullet point on bias is especially important.

  • The machine learning algorithm is trained on countries for which we have data available. However, as we go down the country income ladder, we are also narrowing the set of comparator countries against which the algorithm can be trained within income strata. This complicates the calibration of the estimates. 
  • The estimates may be well-calibrated against the available observations but we cannot tell with much confidence how well they hold up for those countries where information is sparse or not available. Should additional information be published in the future, the estimates can be checked and improved. But until then we have to accept that not only are the margins of uncertainty large, but there may also be an element of bias in the results.

The direction of bias is likely upward since reports on the ground, especially in lower-income countries, do not suggest a very considerable excess death toll. If we were to simplistically divide the world into groups of lower- and higher-income countries, which are respectively characterized by data environments that are poor and rich, then structural differences between these groups could easily lead to an upward bias. Let’s focus on four structural differences:

  • Demography: the age gradient in infection fatality risk is very steep. Lower-income countries count a vast population share that is young or very young. We’d expect a strongly nonlinear effect on mortality.
  • Ventilation: life in many poorer countries is lived largely outdoors. Only the richer echelons of society live, work and play in air-conditioned spaces. Outdoor means better ventilation, which in turn may reduce disease severity and transmission potential by reducing the dose of virus in the initial inoculum.
  • Connectivity: poorer countries tend to be characterized by less external (with other countries) and internal (e.g. urban-rural) connectivity, limiting the spread.
  • Density: urbanization rates (the share of people living in urban areas) are high in Latin America & the Caribbean but low in Sub-Saharan Africa and South Asia. What sets the latter two apart, however, is that rural density in South Asia is a lot higher than in Sub-Saharan Africa. This provides yet another reason why lower-income Africa is structurally different.

For all of these reasons, we should approach the estimates as tentative and treat the results shown here as preliminary. We should also use complementary information at the country level to assess the true severity of the pandemic. Having said that, the estimates and results represent the best possible effort to convey a globally consistent picture of the impact of the pandemic and remain a good starting point.

Annex 2. Details on methodology

In what follows, we expand on the extensive footnote of the map shown in this post.  

The excess death estimates. As noted, we take the cumulative excess death estimates by The Economist, which start for each and every country on Jan 1, 2020. Please note: 

  • These estimates are not point estimates, but interval estimates with a sometimes large confidence interval around the mid-points.
  • For visualization purposes we show the mid-points, but we report the entire 90% confidence interval for each country in the tooltips.
  • We will compare the excess death estimates with pre-pandemic mortality data on a yearly basis. For this reason, we need to adjust the excess death estimates into yearly averages by scaling them down by 365 / the number of days passed since Jan 1, 2020.
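The yearly adjustment in the last bullet amounts to multiplying the cumulative figure by 365 divided by the number of days elapsed. A minimal sketch, assuming days are counted as the plain calendar difference from Jan 1, 2020 (the function name and example figures are illustrative, not taken from the data):

```python
from datetime import date

PANDEMIC_START = date(2020, 1, 1)  # start date of the cumulative estimates


def annualize(cumulative_excess_deaths, as_of):
    """Scale cumulative excess deaths to a yearly average.

    as_of: date of the latest cumulative estimate.
    Assumption: elapsed days are the calendar difference from Jan 1, 2020.
    """
    days_elapsed = (as_of - PANDEMIC_START).days
    if days_elapsed <= 0:
        raise ValueError("as_of must fall after Jan 1, 2020")
    return cumulative_excess_deaths * 365 / days_elapsed


# Hypothetical example: 7,300 cumulative excess deaths by Dec 31, 2021
# (730 days after the start date) correspond to 3,650 per year.
print(annualize(7300, date(2021, 12, 31)))  # 3650.0
```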

The “excess severity” ratio. We express cumulative excess deaths as a ratio to pre-pandemic mortality (and not a proportion since it is merely a comparison and not a share, i.e. the numerator is not part of the denominator). A few points on this:

  • The comparisons of excess deaths with top causes of death before the pandemic are based on the concept of relative severity (which has been introduced earlier here and here and forms the basis of a tracker here). 
  • When translating the relative severity concept to excess deaths, we define the “excess severity” ratio as adjusted cumulative excess deaths since the start of the pandemic divided by 2019 all-cause mortality.
  • Why compare to 2019? We do not have contemporaneous information available on the distribution of causes of death. Hence we anchor it to the year prior to the pandemic. This has the added benefit of not confounding mortality patterns with the direct and indirect effects of COVID.
  • Are excess deaths overestimated because they are compared to 2019? One common mistake when estimating excess deaths is to ignore underlying trends in baseline mortality. We’re not committing such an error here.
    • We compare with 2019 mortality levels and patterns, but the excess death estimates for 2020 onwards are already devoid of any trends in baseline mortality during these years. The Economist has taken a two-pronged approach to control for underlying trends:
    • Where mortality data for 2020 up till now are available on a timely basis, excess deaths are derived as the difference between observed mortality and what we would have expected under normal circumstances, which itself is estimated with a linear regression trend based on previous years.
    • Where such data are not available, a machine learning model estimates excess deaths directly on the basis of contemporaneous information that is available. 
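Put as code, the “excess severity” ratio defined above is a single division. The sketch below is illustrative only: the function name and the figures are hypothetical, and the numerator is assumed to be already adjusted to a yearly average.

```python
def excess_severity_ratio(annualized_excess_deaths, all_cause_deaths_2019):
    """Annualized excess deaths relative to 2019 all-cause deaths.

    This is a ratio rather than a share: excess deaths are not part of the
    2019 total, so the result may exceed 1 or be negative.
    """
    if all_cause_deaths_2019 <= 0:
        raise ValueError("2019 all-cause deaths must be positive")
    return annualized_excess_deaths / all_cause_deaths_2019


# Hypothetical country: 15,000 annualized excess deaths against
# 100,000 deaths from all causes in 2019.
print(f"{excess_severity_ratio(15_000, 100_000):.0%}")  # 15%
```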

Comparing “excess severity” with top causes of death. The next step in the analysis is to compare the ratio of cumulative excess deaths (adjusted and mid-points) to the proportionate mortality rates of the top causes of deaths. Let’s break this down:

  • For each and every country, we obtain mortality patterns based on the 2019 Global Burden of Disease study. We use level 3 of the aggregation, thus obtaining data for 123 disease families. We then calculate the proportionate mortality rates (which are the shares of cause-specific deaths in all-cause mortality). 
  • We then compare the excess severity ratio with the proportionate mortality rates of these 123 disease families and identify the top cause of death that has been exceeded by the excess severity ratio. For example, if the excess severity ratio is 15% and the top cause of death in 2019 was ischemic heart disease, which represented 14% of all deaths in 2019, then we have a basis to claim that mid-point excess deaths during 2020-21 have been more severe than the top cause of death in 2019.
  • Why compare excess deaths (which is an all-cause mortality concept) with specific causes of death? The intent is to convey a feel about the severity of the pandemic in terms of the total direct and indirect mortality impact it has caused by making comparisons with top causes of death. These are easily understood comparisons that are meant to dimension the severity of the current episode we’re going through.
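The comparison step can be sketched as follows. This is a simplified illustration under stated assumptions rather than the code used for the map: the function names are hypothetical, only three causes are listed instead of the 123 disease families, and apart from the 15% and 14% figures from the example above the numbers are made up.

```python
def proportionate_mortality(deaths_by_cause, all_cause_deaths):
    """Share of each cause of death in all-cause mortality."""
    return {cause: deaths / all_cause_deaths for cause, deaths in deaths_by_cause.items()}


def top_cause_exceeded(excess_ratio, cause_shares):
    """Return (rank, cause) for the highest-ranked cause whose share is
    below the excess severity ratio, or None if no cause is exceeded."""
    ranked = sorted(cause_shares.items(), key=lambda item: item[1], reverse=True)
    for rank, (cause, share) in enumerate(ranked, start=1):
        if excess_ratio > share:
            return rank, cause
    return None


# Hypothetical country with 100,000 deaths from all causes in 2019.
shares = proportionate_mortality(
    {"Ischemic heart disease": 14_000, "Stroke": 10_000, "Lower respiratory infections": 5_000},
    all_cause_deaths=100_000,
)
print(top_cause_exceeded(0.15, shares))  # (1, 'Ischemic heart disease')
print(top_cause_exceeded(0.12, shares))  # (2, 'Stroke')
```

A result of None covers both countries with negative excess deaths and countries where the ratio stays below every cause considered.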
