How population outliers distort COVID severity rankings

Population outliers such as China, Ethiopia, India and the US upend global rankings of COVID severity if we remove them from their peer groups

Countries with large populations can significantly distort how we see the severity of the pandemic around the world. Considering the pandemic performance of the US, China, India and Ethiopia, we assess how these population outliers affect average mortality rates by World Bank income group. As it turns out, the rankings of pandemic severity are upended. The outcomes for upper-middle and lower-middle income countries change completely, whereas those of high- and low-income countries converge.


Views about pandemic severity

Even into this third year of the pandemic, it remains necessary to dispel the persistent perception that the pandemic has left the developing world largely unscathed. Nothing could be further from the truth. The pandemic has dealt the developing world a very serious blow and it has been the purpose of this data analytics resource to call attention to that fact.  

Demographic structure and poor data make it hard to eradicate the mistaken perception of a mild developing country pandemic. Developing countries tend to have proportionately younger populations, which after all should reduce, all else equal, population-wide mortality risk. Officially reported COVID-19 mortality rates aid in reinforcing this perception since they tend to be a lot lower in the developing world than in high-income countries. 

Unfortunately, as we have argued in this companion post, not all else is equal. For one, in many poorer countries, the demographic advantage of having a young population is reduced, if not offset, by other factors, such as limited availability of and access to quality health care. Furthermore, weaknesses in civil registration and vital statistics systems mean that we cannot rely on officially reported COVID-19 mortality statistics. We also need to cast our net more broadly to account for the various indirect effects on mortality to be able to account for the total impact of the pandemic, which means that a narrow focus on COVID-19 deaths is insufficient.

In this post, we examine one further factor that contributes to distorting the perception of how severely the developing world has been affected: the role of population outliers. We examine how countries with very large populations affect population-weighted mortality rates and how the rankings of pandemic severity across World Bank income groups are altered if we were to exclude these large countries. The effects turn out to be very considerable.

Population outliers: with or without you?

The World Bank’s income classification splits the world into four groups of countries according to average per capita income levels: high-income countries (HICs), upper-middle income countries (UMICs), lower-middle income countries (LMICs) and low-income countries (LICs). This classification allows us to make statements about pandemic severity in rich and poor countries, where UMICs, LMICs and LICs traditionally refer to the heterogeneous group of the developing world.

We select one population outlier for each income group, i.e. the country with the largest population size relative to its income peers. That gives us the following four countries: the United States within the group of HICs, China among UMICs, India as part of the LMICs and Ethiopia among LICs. 
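The selection rule above (the most populous country in each income group) can be sketched in a few lines. This is only an illustration: the country labels and population figures are hypothetical placeholders, and the function name `largest_by_group` is an assumption, not something taken from the analysis itself.

```python
# Hypothetical toy records: (country, income_group, population in millions).
# These are placeholder values for illustration, not real data.
records = [
    ("A", "HIC", 300.0),
    ("B", "HIC", 80.0),
    ("C", "UMIC", 1400.0),
    ("D", "UMIC", 200.0),
]


def largest_by_group(rows):
    """Return {income_group: country} for the most populous country per group."""
    best = {}
    for country, group, population in rows:
        # Keep the country with the largest population seen so far in its group.
        if group not in best or population > best[group][1]:
            best[group] = (country, population)
    return {group: country for group, (country, _) in best.items()}


outliers = largest_by_group(records)
print(outliers)
```

Applied to the actual World Bank groups, a rule of this kind yields the four countries named above.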

Should we assess pandemic outcomes with or without these population outliers? At some level, we should include them by all means. The selected countries represent after all a large, if not humongous, share in their group totals. The US represents about 27% of the HIC population, China 56% of the UMIC group, India 41% of the LMICs and Ethiopia 17% of LICs. Collectively, they represent 42% of the world population. If we are interested in assessing pandemic performance across income buckets, we should include them in the population-weighted average. A life lost is a life lost, regardless of borders, so let us consider the full income classification when assessing the severity of the pandemic.

At the same time, we are also interested in assessing how pandemic severity manifests itself unequally within each income group. As it turns out, China, India and the US are not only the three largest countries in the world population-wise, but they are also notable for the vastly different mortality impacts relative to their respective peers (the exception here is Ethiopia, whose performance was more in line with its low-income country peers). Given that pandemic performance is to a significant extent the result of sovereign actions, it is useful to see how the four income groups have performed without the outliers.  This provides us with an insight into how robust statements about pandemic severity based on population-weighted group averages have been. 
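To make the mechanics concrete, here is a minimal sketch of a population-weighted mortality rate computed with and without one large country. All numbers are hypothetical: the country "Big" is given a 56% population weight and a very low rate purely to mimic the kind of situation described for China, and the helper name `weighted_rate` is an assumption.

```python
def weighted_rate(rows, exclude=None):
    """Population-weighted mortality rate per 100,000 people.

    rows: list of (country, population, rate_per_100k).
    exclude: optional country name to drop before averaging.
    """
    kept = [(pop, rate) for name, pop, rate in rows if name != exclude]
    total_pop = sum(pop for pop, _ in kept)
    # Weighting each rate by population is the same as dividing
    # total deaths by total population for the group.
    return sum(pop * rate for pop, rate in kept) / total_pop


# Hypothetical group: populations in millions, rates per 100,000.
group = [
    ("Big", 560.0, 10.0),    # 56% of the group's population, very low rate
    ("Mid", 240.0, 300.0),
    ("Small", 200.0, 400.0),
]

with_outlier = weighted_rate(group)
without_outlier = weighted_rate(group, exclude="Big")
print(round(with_outlier, 1), round(without_outlier, 1))
```

In this toy case the group rate more than doubles once the large, low-mortality country is dropped, which is the type of shift the adjusted income groups are meant to reveal.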

Adjusting the rankings of severity

The chart below shows the main results. It depicts the cumulative COVID-19 mortality rate and the cumulative estimated excess mortality rate, where both measures count back to the start of the pandemic and are expressed per 100,000 people. The excess mortality estimate represents the mid-point estimate of the excess death model by The Economist, where it should be mentioned that this estimate is subject to a margin of error that is larger for the poorer countries. 

The chart has three panels:

  • The left panel depicts the rankings of pandemic severity across the original income groups of the World Bank income classification (as discussed also in the companion post);
  • The middle panel shows the rankings for adjusted income groups that exclude the most populous country for each group;
  • The right panel shows the rankings of pandemic severity for the outlier countries that were previously excluded: the US, China, India and Ethiopia.

Let’s look first at the officially reported COVID-19 mortality rates with and without population outliers (the left axis in each of the panels). We obtain the following results: 

  • The officially reported data on COVID-19 mortality confirm the traditional pattern: HIC mortality rates are far above those of UMICs, which in turn exceed those of LMICs and LICs. Note that HIC mortality rates are almost double those of UMICs. LMIC ones are half of those of UMICs and LIC mortality rates are almost indistinguishable from 0. 
  • Once we exclude the population outliers, HICs and UMICs swap places, placing UMICs (ex China) at the top, followed closely by HICs (ex US) and subsequently LMICs and LICs (ex India and Ethiopia). This result is driven by US mortality rates exceeding those of HIC peers and China’s rates being a lot lower than those of its UMIC peers. The effect of removing China on the UMIC ex China aggregate is huge given China’s 56% population weight in the UMIC group and its radically different pandemic performance relative to most other UMICs.

Consider next the rankings based on estimated excess mortality rates. The results suggest that:

  • Excess mortality rates are a lot higher than official COVID-19 mortality rates across all income groups. Notice the very steep upward slopes for especially LMICs, but also LICs and to a lesser extent UMICs as we progress from COVID-19 to excess mortality. The changes for HICs are much more modest.
  • The ranking based on excess mortality rates places LMICs at the top followed by HICs, UMICs and LICs at roughly similar levels. The similarity across HICs, UMICs and LICs is surprising given the large differences in age structure between these groups. As such, excess death rates in the group of poorest countries with the youngest populations are estimated to be roughly the same as those for the group of the richest countries with the oldest populations. 
  • Once we exclude the population outliers, the rankings completely flip, with UMICs (ex China) now at the top, far above LMICs, HICs and LICs (respectively ex India, the US and Ethiopia). This result, shown also in the chart below, is largely driven by the exclusion of China and India, which considerably raises the UMIC aggregate and lowers the LMIC one. The value for LICs is not much affected by the exclusion of Ethiopia, and once we drop the US from the HIC aggregate, the values for HICs and LICs converge. Notice also how close the adjusted rates are for LMICs, LICs and HICs, whereas those for UMICs are about two times higher.

Same results, alternate presentation

The chart below repeats the above analysis with a somewhat different presentation. The panels now represent the two mortality concepts (before, they represented country aggregates): reported COVID-19 mortality rates on the left and estimated excess mortality rates on the right. Each panel now has three axes: on the left we have the individual country outliers, in the middle the income aggregate inclusive of population outliers and on the right the income aggregate without the outliers.

The following patterns are clear:

  • As for reported COVID-19 mortality rates, the effect of population outliers is very large for HICs and UMICs but negligible for LMICs and LICs. This confirms that the US and China are not only population outliers relative to their peers but have also suffered rather different mortality rates. That the effects for LMICs and LICs are small suggests that India and Ethiopia are much like their peers in the officially reported stats.
  • When we look at estimated excess mortality rates, the effect of population outliers becomes much larger for UMICs and LMICs, whereas for HICs and LICs it stays about the same. For HICs, this means that excluding the US produces a considerable drop in mortality rates that is of similar size under either mortality concept. For LICs, it means that under both concepts there is virtually no effect. The story is radically different for UMICs, where excluding China raises excess mortality far more than it raises COVID mortality. The same goes for LMICs, in the opposite direction, when excluding India.

The differential effect of population outliers on COVID and excess mortality rates can be quite clearly seen in the chart above. It shows a beeswarm of the gap between excess and COVID mortality rates on the Y axis with World Bank income groups on the X axis. The bubbles are country observations sized by the cumulative excess death tally (in absolute terms), whereas the short horizontal lines are the average gaps for the income group. The observations of the four population outliers are filled in color and their value is marked by a black dot.

As we can see, the gaps for the US and Ethiopia (labeled as USA and ETH) are quite close to the income group average. The gap for China (CHN) on the other hand is well below the UMIC average, whereas the one for India (IND) is well above the LMIC average. Note also how the distribution of the gap is much more dispersed for UMICs and LMICs than for HICs and LICs. The likely reason why HICs show less dispersion is that their statistical systems are uniformly more adept at capturing COVID mortality correctly, whereas the opposite is likely to be uniformly true for the LICs.
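The gap discussed here is simply the excess mortality rate minus the reported COVID-19 mortality rate for each country. A rough sketch of that calculation follows, together with a group average. The records are hypothetical, and the use of a simple unweighted mean for the group line is an assumption, since the text does not specify how the average gaps are computed.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records:
# (country, income_group, covid_rate_per_100k, excess_rate_per_100k).
rows = [
    ("P", "LMIC", 50.0, 400.0),
    ("Q", "LMIC", 30.0, 130.0),
    ("R", "HIC", 200.0, 230.0),
]


def average_gap_by_group(rows):
    """Average (excess - COVID) mortality gap per income group.

    Assumes a simple unweighted mean across countries in each group.
    """
    gaps = defaultdict(list)
    for _, group, covid_rate, excess_rate in rows:
        gaps[group].append(excess_rate - covid_rate)
    return {group: mean(values) for group, values in gaps.items()}


average_gaps = average_gap_by_group(rows)
print(average_gaps)
```

A country whose own gap sits far from its group's average, as described for China and India, is one whose removal shifts the excess mortality aggregate much more than the COVID-19 aggregate.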


Population outliers carry a large weight in their group averages and because of their differences in pandemic performance they tend to have a large influence on the rankings of pandemic severity across World Bank income groups. With outliers included, LMICs appear to have been the worst affected – a result which is driven completely by India. Without outliers, UMICs ex China have suffered the most by far.

Estimates for excess mortality rates for HICs and LICs are surprisingly similar when we look at the unadjusted income groups. They become even more similar once we exclude the population outliers. 

This goes to show, once again, that the impact of the pandemic on the developing world should not be minimized. The perception that the high-income countries have been dealt the most severe blow is not correct. Indeed, the results above suggest a very different picture. 

Note: Thanks to Vincent Rajkumar for insightful exchanges on the topic and inspiring me to write this post.