One would think that data on vaccination by age should be a global priority given the age-discriminating impact of COVID-19. Yet, well into the second year of the global vaccination campaign and the third year of the pandemic, our ability to track global vaccination progress by age remains severely hindered by data constraints.
Focusing on the cohorts of children (under 12), adolescents (12-18) and the elderly (60+), this post reviews the WHO database on Vaccine Uptake by Age, which is presently the only source of globally comprehensive data. We evaluate this database along the data quality dimensions of completeness, consistency and timeliness and find that the submissions by member countries leave large gaps in all three. The result is that we are very much in the dark when it comes to the state of global vaccination progress by age.
Data quality can be evaluated according to six different dimensions: completeness, uniqueness, timeliness, validity, accuracy and consistency. This post will focus on three of them: completeness, consistency and timeliness.
The completeness of the data we have on vaccination by age is evaluated by looking at (1) the number of countries that report age-differentiated information for our three demographic groups and (2) the population share these reporting countries represent for the respective demographic groups.
The chart below identifies the share of countries that are reporting any age-differentiated data for children, adolescents and the elderly. We tally a country as reporting if the WHO database includes for that country any submissions of cohorts that overlap with the groups of children (under 12), adolescents (12-18) or elderly (60+). As we can see, many more countries report on elderly vaccination than on the vaccination of adolescents and especially children.
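The tallying rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: ages are treated as inclusive whole years, "under 12" is encoded as 0 to 11, and the function names and sample submissions are hypothetical rather than taken from the WHO database.

```python
# Target groups as inclusive (lower, upper) age bounds in years.
# None marks an open-ended upper bound. These encodings are assumptions.
GROUPS = {
    "children": (0, 11),
    "adolescents": (12, 18),
    "elderly": (60, None),
}


def overlaps(cohort, group):
    """Return True if two inclusive age ranges share at least one year of age."""
    c_lo, c_hi = cohort
    g_lo, g_hi = group
    c_hi = float("inf") if c_hi is None else c_hi
    g_hi = float("inf") if g_hi is None else g_hi
    return c_lo <= g_hi and g_lo <= c_hi


def reporting_share(submissions, all_countries):
    """Share of countries with at least one submitted cohort overlapping each group."""
    shares = {}
    for name, bounds in GROUPS.items():
        reporting = {
            country
            for country in all_countries
            if any(overlaps(c, bounds) for c in submissions.get(country, []))
        }
        shares[name] = len(reporting) / len(all_countries)
    return shares


# Made-up submissions for illustration only; country "D" reports nothing.
submissions = {
    "A": [(5, 11), (12, 17), (65, None)],
    "B": [(18, 59), (60, None)],
    "C": [(16, 49)],
}
shares = reporting_share(submissions, ["A", "B", "C", "D"])
print(shares)
```

Note that under this rule a broad cohort such as 18-59 counts as adolescent reporting because it touches age 18, which is one reason the tally is a generous measure of data availability.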
These averages, however, mask considerable heterogeneity. To see this, look at the chart below, which also cuts the data by World Bank income classification. This divides the world into groups of high, upper-middle, lower-middle and low income countries (HICs, UMICs, LMICs and LICs) and allows us to differentiate between rich and poor countries.
The richest countries (HICs) are more likely to report data to the WHO across the three demographic groups. Interestingly, the majority of the poorest countries (LICs) do not report on the vaccination of children and adolescents, but the vast majority of them do report such data for the elderly. We also note the large gaps in UMICs and LMICs across the three demographic groups.
The finding that many countries do not report any vaccination data by age to the WHO may not sound surprising when it comes to the younger age categories. Countries may simply not have started vaccinating yet, reflecting a lack of regulatory approval, appetite or vaccines. The reality, however, is that many more countries have started vaccinating than the above chart suggests (as shown here, albeit only for the age cohorts under 16, not 12). And regardless of the reporting frequencies for the younger age groups, we continue to see large gaps for the elderly group, especially among upper- and lower-middle income countries.
Let us now turn to a more granular perspective and have a look at the country level. The maps below show the data availability by country. The tooltips to the maps provide country-level details on the (possibly multiple) age categories that are reported to the WHO.
One might argue that data not being reported to the WHO does not imply that the data do not exist. That is indeed the case. Consider for example child vaccination in China, which shows up red in the first map. Yet news reports dating back to November 2021 stated that China had by then already vaccinated over 80 million children ages 3 to 11. We will return to this issue later. For now, let’s continue to review what the WHO database tells us.
If the vast majority of non-reporting countries were tiny, we could still end up with a reasonably globally representative sample of vaccination by age. Unfortunately, however, that is not the case.
The charts below calculate, for each of our three demographic groups, the share of the global population of that group that lives in countries reporting data. Immediately below we see the most aggregate result at the global level. Coverage is far from complete, with the largest gaps observed for children and the elderly.
Breaking this down by World Bank income classification, we can see that high income countries (HICs) pull up the global average for all three demographics. Interestingly, coverage of elderly groups among low income countries (LICs) is high. But coverage among upper- and lower-middle income countries (UMICs and LMICs) is low, except for adolescents in LMICs. These results are primarily driven by China and India, which represent a large part of the population in UMICs and LMICs.
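The difference between counting countries and weighting them by population can be made concrete with a small Python sketch. The function name and all figures below are made up for illustration; they are not estimates from the WHO database or any population source.

```python
def population_coverage(population, reporting):
    """Share of a group's total population that lives in reporting countries."""
    total = sum(population.values())
    covered = sum(pop for country, pop in population.items() if country in reporting)
    return covered / total


# Hypothetical elderly populations in millions; "B" stands in for one very
# large non-reporting country.
elderly_population = {"A": 50, "B": 250, "C": 100, "D": 100}
reporting_elderly = {"A", "C", "D"}

country_share = len(reporting_elderly) / len(elderly_population)
pop_share = population_coverage(elderly_population, reporting_elderly)

print(f"Countries reporting: {country_share:.0%}")
print(f"Population covered:  {pop_share:.0%}")
```

In this toy example three out of four countries report, yet only half of the elderly population is covered, which mirrors how a few populous non-reporting countries can pull down coverage for an entire income group.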
The cartograms below visualize the absolute dimension of the data discrepancies. A cartogram distorts the geometry of countries to reflect an alternate variable, which we take to be the population size of the demographic group we are interested in. Starting off from an equal-area projection, the cartogram distorts land mass so it becomes proportionate to the number of children, adolescents or elderly. This inflates the land mass of Africa and South Asia for the younger age cohorts and China and Europe for the older cohorts.
The colors show whether we have vaccination data by age in the WHO dataset, with green indicating that at least some data are available and red indicating that data are missing. We see a lot of red everywhere, especially for the younger groups.
Consistency is another data quality dimension that needs to be taken into account. Here we spotted an issue with respect to the thresholds of age cohorts. Country practices vary widely in terms of how the information is split across different age buckets. The age buckets themselves may be consistent between each other within countries, but when they are added together into the global database it becomes virtually impossible to compare them and distill a consistent set of estimates for our three demographic groups.
The table below shows the various cohort thresholds that member countries report to the WHO as per the database. While one could in principle assume that children under the lower thresholds reported here are not being vaccinated, the variety in upper thresholds creates a problem. Similar inconsistencies apply to the adolescent cohorts and to the lower thresholds for the elderly cohorts.
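To illustrate why varying thresholds make cohorts hard to compare, the following Python sketch classifies a reported cohort relative to a target group. It is a hypothetical helper, assuming inclusive whole-year age bounds, and the example cohorts are invented rather than drawn from the WHO table.

```python
INF = float("inf")


def classify(cohort, group):
    """Classify a reported cohort relative to a target group.

    Returns 'inside' if the cohort lies wholly within the group,
    'straddles' if it only partially overlaps, and 'outside' otherwise.
    Bounds are inclusive; None marks an open-ended upper bound.
    """
    c_lo, c_hi = cohort
    g_lo, g_hi = group
    c_hi = INF if c_hi is None else c_hi
    g_hi = INF if g_hi is None else g_hi
    if c_hi < g_lo or c_lo > g_hi:
        return "outside"
    if c_lo >= g_lo and c_hi <= g_hi:
        return "inside"
    return "straddles"


children = (0, 11)
elderly = (60, None)

child_labels = [classify(c, children) for c in [(5, 11), (5, 17), (12, 17)]]
elderly_labels = [classify(c, elderly) for c in [(65, None), (50, None)]]
print(child_labels, elderly_labels)
```

Only cohorts labelled "inside" can be attributed to a group without further assumptions. A "straddles" cohort such as 5-17 mixes children and adolescents, so its doses cannot be split between the two groups from the reported figure alone.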
One final dimension of data quality is timeliness. Here we will focus on how timely the reports are among the reporting countries. Timeliness matters especially where the speed of vaccination is fast. And of course it also matters tremendously to identify the remaining vaccination gaps in contexts where the virus is spreading rapidly and extensively.
The chart shows that the majority of data points provided fall within the last month (which we mark as 0 delay). However, the share of observations that are delayed by a month or more is still very large. Moreover, the distribution of delays shows a long tail, indicating that for several countries the information is likely to be considerably out of date.
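One possible way to compute such delays is sketched below in Python. The post does not spell out how a month is measured, so the 30-day convention, the reference date and the report dates here are all assumptions made for illustration.

```python
from collections import Counter
from datetime import date


def months_delay(report_date, reference_date):
    """Delay in completed 30-day periods (an assumed convention).

    A report dated fewer than 30 days before the reference date gets delay 0.
    """
    return (reference_date - report_date).days // 30


# Hypothetical reference date and latest report dates per country.
reference = date(2022, 6, 30)
latest_reports = [date(2022, 6, 10), date(2022, 5, 15), date(2021, 12, 15)]

delays = [months_delay(d, reference) for d in latest_reports]
distribution = Counter(delays)
share_delayed = sum(1 for d in delays if d >= 1) / len(delays)

print(delays, dict(distribution), round(share_delayed, 2))
```

Using calendar-month differences instead of 30-day periods would shift some observations by one bucket, but it would not change the overall shape of a long-tailed distribution.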
Does the extent of delay correlate with country income level? It does. High income countries consistently outperform on timeliness for all three demographics and the longest delays appear to be concentrated among low income countries.
We offer three reasons why these data gaps matter: (1) they affect country-level efforts to scale and target the vaccination campaign, (2) they undermine the pursuit of global vaccine equity, and (3) they present an opportunity for the WHO to incentivize member countries to provide a more complete, more consistent and more timely picture of vaccination by age, which will be of great benefit during the COVID-19 pandemic as well as future pandemics.
The availability of quality information matters in the first place at the level of a country. It is simply important to collect adequate information to be able to track the progress of the vaccination campaign and identify any remaining gaps. Data needs to be granular both geographically and across socio-economic groups so that efforts to address gaps can be properly scaled and targeted.
This is particularly important for data on elderly vaccination. COVID-19 is an age-discriminating disease that disproportionately affects the elderly, so we would want to make sure that adequate information is available on how well the vulnerable elderly groups (along with other members of the priority group) are covered. After all, what doesn’t get measured, doesn’t get managed.
Because data on, for example, elderly vaccination are limited, we need to resort to alternate measures to assess the ability of countries to protect their elderly. These include ex ante measures of vaccine sufficiency, which take total doses administered to the entire population as a proxy of supply and compare that to the size of the priority group population. That provides a measure of potential coverage, assuming that governments want to prioritize members of the priority group and can effectively reach them. But it would be far better to have actual data that provide the ex post picture.
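The ex ante sufficiency measure described above can be sketched as a simple ratio. The Python below is a minimal illustration, not the exact formula used elsewhere: the function name, the two-doses-per-person default and the numbers are assumptions.

```python
def potential_coverage(total_doses, priority_population, doses_per_person=2):
    """Potential share of the priority group that could be fully vaccinated.

    Assumes every administered dose could have gone to the priority group
    first and that a full course takes `doses_per_person` doses (assumed 2).
    The result is capped at 1.0.
    """
    if priority_population <= 0 or doses_per_person <= 0:
        raise ValueError("priority_population and doses_per_person must be positive")
    return min(1.0, total_doses / (doses_per_person * priority_population))


# Hypothetical country: 3 million doses administered, 2 million priority members.
partial = potential_coverage(3_000_000, 2_000_000)
# With 10 million doses, supply exceeds what the priority group needs.
full = potential_coverage(10_000_000, 2_000_000)

print(partial, full)
```

A value of 1.0 only says that supply would have sufficed on paper. Whether the priority group was actually reached is exactly what age-differentiated ex post data would reveal.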
But we also need good data at the global level. If we are to advocate for global vaccine equity, how can we effectively do so without details on the vaccination status of the most vulnerable at risk? But even more broadly, how can we practice vaccine equity if we treat any age group fundamentally differently across borders?
We observe, for example, a growing trend of childhood vaccination against COVID-19, with some countries having approved vaccination from the age of just 6 months. As countries shift the minimum thresholds for vaccination, knowing how well the younger age cohorts of adolescents and children are vaccinated matters for determining, in a consistent way, how large the remaining vaccination challenge is. And we know all too well that taking into account the youngest age cohorts will present a tremendous global challenge, given that these cohorts are so large in less-resourced countries.
Several non-governmental organizations have stepped up during the COVID-19 pandemic to collect and aggregate dispersed information and make it publicly available. The Center for Systems Science and Engineering at Johns Hopkins University and Our World in Data come to mind as prominent providers of data on confirmed cases, fatalities and vaccinations.
Unfortunately, collection of data on vaccination by age cannot be easily outsourced because the data appear to be largely unavailable. The few data panels that are available at Our World in Data (see here, here and here) on vaccination progress by age illustrate the point.
In its capacity as global health authority and provider of credible data, the WHO could play a useful public goods role. As this post has shown, there is great scope to provide more complete, more consistent and more timely data on age-differentiated vaccination progress. Improving these data quality dimensions would help us track this age-discriminating pandemic better. It would also make us better prepared for future pandemics that may, like COVID-19, affect different strata of the population unequally.