Data insight

Do excess deaths exceed leading causes of death?

In most countries they do (if we can trust the estimates)

In the chart below we explore how the latest available estimates of cumulative excess deaths during this pandemic compare to mortality patterns prior to the pandemic. This comparison is anchored to 2019 and we compare the severity of the pandemic’s excess death toll to 123 causes of death documented in the Global Burden of Disease database (level-3 aggregation). 

Why? Comparisons with top causes of death help put the severity of the pandemic’s mortality toll into perspective by providing an intuitive reference point. Statements such as “the pandemic is claiming more lives than heart attack or stroke did before the pandemic” may foster a better appreciation of the severity of the pandemic than “the excess death ratio is X per 100,000”.

The chart shows the highest-ranked cause of death in 2019 that is exceeded by the estimated excess death toll during the pandemic (see below for more details on the adjustments that need to be applied to make this comparison valid). We focus on the leading causes of death, which we take to be the Top 3. A country is colored red when excess deaths exceed cause of death #1. The different shades of orange apply this logic to causes of death #2 and #3. Green refers to countries where excess deaths do not exceed any of the top 3 causes. Excess deaths may be negative for some countries, in which case they do not exceed the death toll of any of the 123 disease families considered.
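
The coloring rule described above can be sketched in a few lines of Python. This is only an illustration: the function name, the `exceeded_rank` input and the color labels are assumptions made for this example, and the exact shades used on the map may differ.

```python
from typing import Optional

# Hypothetical color labels; the shades on the actual map may differ.
COLORS = {1: "red", 2: "dark orange", 3: "light orange"}


def map_color(exceeded_rank: Optional[int]) -> str:
    """Return the map color for a country.

    exceeded_rank is the rank (1 = leading cause) of the highest-ranked 2019
    cause of death that excess deaths exceed, or None if excess deaths exceed
    no cause at all (for example when they are negative).
    """
    # Anything outside the Top 3, including None, falls back to green.
    return COLORS.get(exceeded_rank, "green")
```

For instance, a country whose excess deaths exceed only the second-ranked cause would get `map_color(2)`, while a country with negative excess deaths would get `map_color(None)` and be shown in green.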

This analysis is subject to important caveats about standard errors and bias.  See below for a detailed discussion of results and methods. Note also that this is a clickable map. The tooltips provide a step-by-step guide for each country on how the results were derived. For a non-interactive version of the map, click on “download image” within the map or here

New excess death estimates

A myriad of factors complicate our assessment of the true mortality impact of the pandemic. Inadequacies in CRVS systems (civil registration and vital statistics) are one source of uncertainty. Another is the presence of indirect impacts on mortality. These can inflate the mortality toll in the case of un- or under-managed diseases due to strained hospital resources for example. Or they could reduce the mortality toll – think of the virtual absence of the flu season as more people wear masks and keep distance from others. 

The concept of excess deaths helps alleviate these challenges. By capturing mortality patterns beyond what would be expected under normal circumstances, we get an amalgamated insight into the effects of factors such as the misattribution of underlying cause of death or the indirect channels through which the pandemic raises mortality. 

Excess death measures, however, are far from perfect. First, even in countries with solid CRVS systems that provide timely data, excess deaths remain estimates since they require an assessment of the counterfactual of what would have happened under normal circumstances. Second, and more important, excess death estimates require all-cause mortality data, which creates a serious limitation to getting a globally comprehensive sample. As a result, excess deaths are available only for some 80 countries. 

That has recently changed. New estimates have become available that directly estimate excess deaths in data-poor environments. The excess death estimates by The Economist are currently best-in-class. They’re not without limitations themselves – see details in the methods section below – but they give us a solid starting point and are notable for their transparent disclosure of assumptions, methods and standard errors around the results.

They represent a top-down approach that shouldn’t be seen as substituting for more detailed bottom-up country-level or even subnational analysis; rather the estimates complement such efforts and the true assessment of the pandemic’s impact will in any case need to take into account information based on more than just one method.

Interpretation of results

The chart simply shows what the cumulative excess death estimates of The Economist imply for the severity of the pandemic. We take the estimates as given and focus in this map on the mid-points of the confidence intervals that surround them. 

A companion post contrasts the results based on mid-point excess death estimates with those based on officially reported COVID-19 stats as well as those based on the lower bounds and upper bounds of the excess death estimates.

Taking the mid-point estimates at face value for now, what does this map tell us?

  • The mid-points suggest that the pandemic has claimed more lives than the leading causes of death in most countries. This underscores once again the point that this is and has been a severe pandemic in terms of its mortality impact. So much for the argument that COVID is just a flu. 
  • Excess deaths have been particularly high in Latin America. But the pattern actually appears to extend to all other regions in the world. The green areas are the notable exceptions where we should differentiate between countries such as Australia where excess deaths are estimated to be below 0 and others where the estimates are positive but less severe than the three leading causes of death.
  • The results also suggest that the pandemic has been severe in countries where one would have predicted a priori a milder impact. Many countries in Africa, for example, are very young. Their demographic structure should provide some protection against this age-discriminating infectious disease pandemic. The excess death estimates suggest that these advantages may have been offset by other factors, which must be a combination of higher infection prevalence, higher age-adjusted infection fatality rates, greater measurement challenges and/or a greater contribution from indirect deaths.

Caveats

First, the standard errors of the estimates. The two visualizations immediately above illustrate the sometimes large confidence intervals around the mid-point estimates of excess deaths. The first map shows the country distribution by leading cause of death if we were to take the upper-bound estimates of the 90% confidence interval around the mid-points. The second map shows the same for the lower-bound estimates of that confidence interval.

  • The confidence intervals are especially large in Sub-Saharan Africa, which would be colored virtually entirely red if we were to apply the upper bound of the 90% confidence interval instead of the mid-point and virtually entirely green if we used the lower bound (check out the companion post to see these differences). 
  • The large intervals do not necessarily undermine the validity of the model, which is well-calibrated across different substrata for which we have data (more on that later). But they do highlight the more limited predictive value of the model in environments of data scarcity. It’s entirely natural that prediction errors are larger when less information is available.
 

Second, the possibility of bias. Not only are the excess death measures estimated with a considerable degree of imprecision, especially in the poorer countries, there are also reasons why the estimates may be biased (i.e. systematically misrepresentative of reality). The Economist acknowledges two main reasons why, and it turns out these are not just an afterthought but important qualifiers:

  • “The [excess-death tallies] rely on the assumption that officially published excess-mortality numbers are accurate. Given the disruption that covid-19 has caused, it is plausible that some governments may have changed how they compile data on total deaths during the pandemic. This might lead us to publish incorrect figures for the countries in question. It could also introduce errors into the estimates that our model produces for all other countries.”
  • “Because most countries that report excess deaths are rich or middle-income, the bulk of the data used to train our model comes from such places. The patterns that the model detects in these areas could thus be an inaccurate guide to the dynamics of the pandemic in poor countries. A similar caveat applies to our estimates for countries that have suffered lots of excess deaths for reasons other than the pandemic, such as war or natural disasters.”

The second bullet point on bias is especially important.

  • The machine learning algorithm is trained on countries for which we have data available. However, as we go down the country income ladder, we are also narrowing the set of comparator countries against which the algorithm can be trained within income strata. This complicates the calibration of the estimates. 
  • The estimates may be well-calibrated against the available observations but we cannot tell with much confidence how well they hold up for those countries where information is sparse or not available. Should additional information be published in the future, the estimates can be checked and improved. But until then we have to accept that not only are the margins of uncertainty large, but there may also be an element of bias in the results.

The direction of bias is likely upward since reports on the ground, especially in lower-income countries, do not suggest a very considerable excess death toll. If we were to simplistically divide the world into groups of lower- and higher-income countries, characterized respectively by poor and rich data environments, then structural differences between these groups could easily lead to an upward bias. Let’s focus on four structural differences:

  • Demography: the age gradient in infection fatality risk is very steep. Lower-income countries count a vast population share that is young or very young. We’d expect a strongly nonlinear effect on mortality.
  • Ventilation: life in many poorer countries is outdoor. Only the richer echelons of society live, work and play in air-conditioned spaces. Outdoor means better ventilation, which in turn may reduce disease severity and transmission potential by reducing the dose of virus in the initial inoculum.
  • Connectivity: poorer countries tend to be characterized by less external (with other countries) and internal (e.g. urban-rural) connectivity, which limits the spread of the virus.
  • Density: urbanization rates (the share of people living in urban areas) are high in Latin America & the Caribbean but low in Sub-Saharan Africa and South Asia. What sets the latter two apart however is that rural density in South Asia is a lot higher than in Sub-Saharan Africa. This provides yet another reason why the lower-income Africa is structurally different. 
 

For all of the above reasons, we should approach the estimates as tentative and treat the results shown here as preliminary. We should also use complementary information at the country level to assess the true severity of the pandemic. Having said that, the estimates and results represent the best possible effort to convey a globally consistent picture of the impact of the pandemic and remain a good starting point.

Details on methodology

In what follows, we expand on the extensive footnote to this map, which explains the methodology and articulates the various caveats that apply to this analysis. 

The excess death estimates. As noted, we take the cumulative excess death estimates by The Economist, which start for each and every country on Jan 1, 2020. Please note: 

  • These estimates are not point estimates, but interval estimates with a sometimes large confidence interval around the mid-points.
  • For visualization purposes we show the mid-points, but we report the entire 90% confidence interval for each country in the tooltips.
  • We will compare the excess death estimates with pre-pandemic mortality data on a yearly basis. For this reason, we convert the cumulative excess death estimates into yearly averages by multiplying them by 365 divided by the number of days elapsed since Jan 1, 2020.
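
The yearly adjustment in the last bullet can be sketched as follows. This is a minimal illustration, not the code behind the map: the function name and the example figures are made up, and the same scaling would be applied to the lower and upper bounds of the confidence interval.

```python
from datetime import date

PANDEMIC_START = date(2020, 1, 1)  # start date of the cumulative estimates


def annualize(cumulative_excess: float, as_of: date) -> float:
    """Convert a cumulative excess-death count into a yearly average."""
    days_elapsed = (as_of - PANDEMIC_START).days
    if days_elapsed <= 0:
        raise ValueError("as_of must be later than Jan 1, 2020")
    return cumulative_excess * 365 / days_elapsed


# Illustrative numbers only: 150,000 cumulative excess deaths by Dec 31, 2021,
# which is 730 days after Jan 1, 2020 (2020 was a leap year).
yearly = annualize(150_000, date(2021, 12, 31))  # 75,000 per year
```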

The “excess severity” ratio. We express cumulative excess deaths as a ratio to pre-pandemic mortality (and not as a proportion, since it is merely a comparison and not a share, i.e. the numerator is not part of the denominator). A few points on this:

  • The comparisons of excess deaths with top causes of death before the pandemic are based on the concept of relative severity (which has been introduced earlier here and here and forms the basis of a tracker here). 
  • When translating the relative severity concept to excess deaths, we define the “excess severity” ratio as adjusted cumulative excess deaths since the start of the pandemic divided by 2019 all-cause mortality.
  • Why compare to 2019? We do not have contemporaneous information available on the distribution of causes of death. Hence we anchor it to the year prior to the pandemic. This has the added benefit of not confounding mortality patterns with the direct and indirect effects of COVID.
  • Are excess deaths overestimated because they are compared to 2019? One common mistake when estimating excess deaths is to ignore underlying trends in baseline mortality. We’re not committing such an error here.
    • We compare with 2019 mortality levels and patterns, but the excess death estimates for 2020-2021 are already devoid of any trends in baseline mortality during these years. The Economist has taken a two-pronged approach to control for underlying trends:
    • Where mortality data for 2020 and 2021 are available on a timely basis, excess deaths are derived as the difference between observed mortality and what we would have expected under normal circumstances, which itself is estimated with a linear regression trend based on previous years.
    • Where such data are not available, a machine learning model estimates excess deaths directly on the basis of contemporaneous information that is available. 
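
To see why the trend adjustment matters, here is a simplified sketch of the first prong: a linear trend fitted on previous years and projected forward. It uses annual totals and plain least squares as an assumption for illustration; the actual model by The Economist may differ in its details (for example in data frequency or seasonality), and all numbers are made up.

```python
def linear_trend_baseline(years, deaths, target_year):
    """Fit deaths = a + b * year by ordinary least squares and project it."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(deaths) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, deaths))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * target_year


# Made-up annual death counts for a hypothetical country
years = [2015, 2016, 2017, 2018, 2019]
deaths = [100_000, 101_000, 102_000, 103_000, 104_000]

expected_2020 = linear_trend_baseline(years, deaths, 2020)  # 105,000
observed_2020 = 118_000
excess_2020 = observed_2020 - expected_2020  # 13,000
```

In this made-up case, simply subtracting 2019 deaths from 2020 deaths would give 14,000 and overstate excess deaths by 1,000, because it ignores the upward trend in baseline mortality.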
 

Comparing “excess severity” with top causes of death. The next step in the analysis is to compare the ratio of cumulative excess deaths (adjusted and mid-points) to the proportionate mortality rates of the top causes of death. Let’s break this down:

  • For each and every country, we obtain mortality patterns based on the 2019 Global Burden of Disease study. We use level 3 of the aggregation, thus obtaining data for 123 disease families. We then calculate the proportionate mortality rates (which are the shares of cause-specific deaths in all-cause mortality). 
  • We then compare the excess severity ratio with the proportionate mortality rates of these 123 disease families and identify the highest-ranked cause of death that has been exceeded by the excess severity ratio. For example, if the excess severity ratio is 15% and the top cause of death in 2019 was ischemic heart disease, which represented 14% of all deaths in 2019, then we have a basis to claim that mid-point excess deaths during 2020-21 have been more severe than the top cause of death in 2019.
  • Why compare excess deaths (which is an all-cause mortality concept) with specific causes of death? The intent is to convey a feel for the severity of the pandemic, in terms of the total direct and indirect mortality impact it has caused, by making comparisons with top causes of death. These are easily understood comparisons that are meant to put the severity of the current episode into perspective.
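
The comparison described in this section can be sketched as follows. The function names and the three proportionate mortality rates are hypothetical and chosen to mirror the 15% versus 14% example above; a real run would use all 123 disease families for the country in question.

```python
def excess_severity(annualized_excess, all_cause_deaths_2019):
    """Ratio of yearly-adjusted excess deaths to 2019 all-cause mortality."""
    return annualized_excess / all_cause_deaths_2019


def top_cause_exceeded(severity, proportionate_mortality):
    """Return (rank, cause) of the highest-ranked cause that is exceeded.

    proportionate_mortality maps each cause to its share of all 2019 deaths.
    Returns None when the ratio exceeds no cause (e.g. negative excess deaths).
    """
    ranked = sorted(proportionate_mortality.items(),
                    key=lambda item: item[1], reverse=True)
    for rank, (cause, share) in enumerate(ranked, start=1):
        if severity > share:
            return rank, cause
    return None


# Hypothetical shares of all deaths in 2019 (illustration only)
shares_2019 = {
    "Ischemic heart disease": 0.14,
    "Stroke": 0.10,
    "Lower respiratory infections": 0.06,
}

ratio = excess_severity(15_000, 100_000)        # 0.15
result = top_cause_exceeded(ratio, shares_2019)  # (1, "Ischemic heart disease")
```

A ratio of 0.12 would instead return rank 2 (stroke in this made-up example), and a negative ratio would return `None`, which corresponds to the green category on the map.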