1 Introduction

The WHO recommends that 95% of children born in a country should be vaccinated against measles to ensure herd immunity [25]. However, during recent years, measles outbreaks have frequently occurred in regions with a high overall vaccination coverage [7, 8]. To control and eventually eliminate measles, we need to understand the mechanisms that enable such outbreaks to happen. The threshold proposed by the WHO is based on the assumption that susceptibility to measles is always homogeneously distributed in a population. This, however, might not be a realistic assumption.

When measles outbreaks occur in a highly vaccinated population, it is often observed that pockets of un-vaccinated individuals are important drivers of these outbreaks [10, 20]. Susceptible individuals can be clustered in a variety of ways: geographically, in schools, or in households. Some attention has already been devoted to modeling the effect of the geographical clustering of measles susceptibility [21]. Since households are an important place for disease transmission [17], the presence of multiple susceptible individuals within the same household may have an important impact on the risk for measles outbreaks. This is the reason why we investigated the effect of household-based susceptibility clustering on the risk and persistence of measles outbreaks in a previous study [14]. Using data on the current population of Flanders (Belgium), we found that a higher level of household-based susceptibility clustering leads to an increased risk for measles outbreaks and increases the size of those outbreaks.

However, to adequately plan for the control and elimination of measles, we need to verify whether these results will still apply in the future. As time progresses, the age distribution of a population changes, as does the way in which households are constituted [6]. Furthermore, different age groups will be at risk for measles infection in the future [11, 15]. To estimate the effects of household-based susceptibility clustering in the future, we need to take these changes in the population into account.

Individual-based models, in which each individual is treated as a unique entity, are very well suited to model these different levels of heterogeneity in a population [24]. They allow us to take into account age- and context-dependent social mixing behavior, as well as age-specific immunity levels and heterogeneous vaccination behavior.

In this paper we use Stride [13], an individual-based model for the transmission of infectious diseases, to examine how the effects of household-based susceptibility clustering on measles outbreaks are expected to change over the next 20 years. To do this, we simulate different scenarios regarding within-household clustering of susceptibility, using projections of the Flemish population and age-specific immunity levels for 2020, 2030 and 2040. To estimate how the effects of household-based clustering evolve over the next two decades, we compare simulation results regarding the risk and persistence of measles outbreaks between the different scenarios and calendar years.

2 Methods

Stride was previously developed by our research group [13]. By supplying different input files to the simulator, it can be used to model a wide variety of populations and (air-borne) infectious diseases. It is an open-source project: the source code can be found in a public GitHub repository [3]. We will briefly describe the input we supplied to Stride to conduct our study and the way in which certain relevant aspects of the simulator were implemented.

2.1 Population

We used projections for the age distribution of the population of Flanders and the constitution of households for 2020, 2030, and 2040. These projections are based on currently unpublished work, and were provided to us after personal communication with the authors [16]. For each calendar year we examined, we used a population of about 300,000 individuals, 5% of the total population size of Flanders. Besides their households, there are three other social contexts in which individuals can contact each other in our model: schools, workplaces, and more general communities.

The distribution of school group sizes by student age is based on registration data collected by the Flemish government in 2019 [23]. We assumed that all children aged 3 to 18 years attend school on weekdays. The distribution of workplace sizes was based on data extracted from Eurostat [5]. We assumed an employment rate of 70% for individuals aged 18 to 65 years, based on data for Belgium in 2018 obtained from the Eurostat database [5]. Finally, each individual was also assigned to a separate week- and weekend community, to represent more general contacts made during, respectively, week- and weekend days. These communities consist of about 1,000 individuals each, in line with earlier work by Chao et al. [4].

Age-specific contact rates in each of these contexts - households, schools, workplaces, and communities - are based on a social contact study conducted in Flanders in 2010 and 2011 [12].

2.2 Age-Specific Immunity Levels

To inform our model, we used projections for age-specific immunity levels for Belgium in 2020, 2030 and 2040. These projections are based on a recent study by Hens et al. [11], and were obtained in the same way as described in a recent paper [15]. In Fig. 1, the projected percentage of immune individuals by age can be observed for 2020 (solid blue line), 2030 (dashed orange line) and 2040 (dotted green line).

Fig. 1. Projected percentages of immune individuals by age for 2020 (solid blue line), 2030 (dashed orange line) and 2040 (dotted green line). (Color figure online)

In Belgium, uptake of a measles-containing vaccine took off on a large scale from 1985 on - although a vaccine had been available on the Belgian market since 1975 [2]. Therefore, we will assume that most individuals born in Belgium before 1985 have acquired natural immunity after surviving a measles infection.

For the generation born between 1985 and 1995, the situation is different. As measles no longer circulated, fewer persons were infected and, consequently, natural immunity became less common. However, due to the introduction period of the vaccine, many individuals born during this period were not vaccinated, or were incompletely vaccinated - receiving only one dose instead of the recommended two doses of a measles-containing vaccine. This is reflected in the immunity level of this age group, as can be seen in Fig. 1: for 2020, a dip in the immunity level for individuals aged 25 to 35 years can be observed. As this generation ages, their immunity level is expected to decrease further due to the waning of vaccine-induced immunity: the dip in immunity level that can be observed for 2020 becomes deeper for 2030 and 2040. As a consequence of this, the overall immunity level of the population is also projected to decline in the future (from about 92% in 2020 to a little over 86% in 2040).

Since 1995, vaccination coverage in Flanders has been fairly stable. In the projections we used, it is assumed that this will also remain the case in the future. The recommended age for infants to receive the first dose of a measles-containing vaccine in Flanders is at 12 months [1]. As such, children younger than 12 months of age constitute another group at risk for measles infection. Furthermore, in Flanders, the recommended age to receive the second dose of a measles-containing vaccine is at 10 years of age [1]. Until they receive this second dose, children might not be fully protected against measles.

2.3 Distribution of Immunity: Implementation

To reflect both the age-specific immunity levels discussed above, and the target level of household-based susceptibility clustering that is supplied to the simulator as an input parameter, the following procedure was used to distribute immunity in the population before the start of each simulation.

We assume that individuals born before 1985 have acquired natural immunity, and that the household-based clustering of susceptibility is the result of decisions made about vaccination. As such, we do not take into account household-based susceptibility clustering for individuals born before 1985. To distribute immunity among individuals in this age group, we follow the algorithm described below.

First, we calculate for each age the target number of immune individuals, based on the age-specific immunity level and the total number of individuals of this age present in the simulated population. Next, we draw a random individual from the population. If this individual is susceptible and there are not yet enough immune individuals in the age category that this individual belongs to, we make this person immune. We repeat this process until all age-dependent immunity quotas for individuals born before 1985 have been fulfilled.
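The quota-filling procedure for individuals born before 1985 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Stride implementation: the function name `immunize_by_age_quota`, the flat list of ages, the rounding of targets to whole individuals, and the toy population are all hypothetical.

```python
import random
from collections import Counter


def immunize_by_age_quota(ages, immunity_by_age, rng):
    """Return one boolean per individual (True = immune)."""
    age_counts = Counter(ages)
    # Target number of immune individuals for each age (assumed: rounded).
    target = {age: round(immunity_by_age[age] * n) for age, n in age_counts.items()}
    immune = [False] * len(ages)
    immune_count = Counter()
    remaining = sum(target.values())
    while remaining > 0:
        i = rng.randrange(len(ages))  # draw a random individual
        age = ages[i]
        # Immunize only if still susceptible and the age quota is not yet full.
        if not immune[i] and immune_count[age] < target[age]:
            immune[i] = True
            immune_count[age] += 1
            remaining -= 1
    return immune


# Hypothetical toy population: 100 people aged 50 and 50 people aged 60.
ages = [50] * 100 + [60] * 50
immune = immunize_by_age_quota(ages, {50: 0.90, 60: 0.98}, random.Random(1))
```

Repeated random draws become wasteful once most quotas are full, but the sketch keeps them because they mirror the procedure as described.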

For individuals born since 1985, we do take into account the target clustering level. The target clustering level is an input parameter between 0 and 1. It represents the probability that if an individual (born since 1985) is immune, all other individuals (also born since 1985) that belong to the same household are also immune. To immunize this part of the population, we use the following procedure.

As before, we calculate the target number of immune individuals for each age category. Next, we select a random individual from a random household. If the selected individual is still susceptible, and if there are not yet enough immune individuals in their age category, we make this person immune. Then, we compare a random draw to the target clustering level. If the random draw is lower than the target clustering level, we immunize all other individuals born since 1985 that belong to the same household. We do this within the limits imposed by the age-specific immunity quotas: we do not allow the actual immunity level for each age to exceed the target immunity level by more than 5%. We repeat this procedure until the total target number of immune individuals born since 1985 has been reached.
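The clustered procedure for individuals born since 1985 can be illustrated with the sketch below. It is not the Stride code and rests on several assumptions that the text leaves open: households are lists containing only the ages of members born since 1985, the clustering draw is made only after the selected individual has just been immunized, and the 5% tolerance is read as a relative cap on the number of immune individuals per age. The function name `immunize_with_clustering` and the toy households are hypothetical.

```python
import math
import random
from collections import Counter


def immunize_with_clustering(households, immunity_by_age, clustering_level,
                             rng, tolerance=0.05):
    """households: list of lists of ages. Returns matching lists of booleans."""
    age_counts = Counter(age for hh in households for age in hh)
    target = {a: round(immunity_by_age[a] * n) for a, n in age_counts.items()}
    # Assumed reading of the 5% rule: a relative cap on the count per age.
    cap = {a: min(age_counts[a], math.floor(target[a] * (1 + tolerance)))
           for a in target}
    immune = [[False] * len(hh) for hh in households]
    count = Counter()
    total, total_target = 0, sum(target.values())
    while total < total_target:
        h = rng.randrange(len(households))      # random household
        if not households[h]:
            continue
        m = rng.randrange(len(households[h]))   # random member of it
        age = households[h][m]
        if immune[h][m] or count[age] >= target[age]:
            continue
        immune[h][m] = True
        count[age] += 1
        total += 1
        # With probability clustering_level, immunize the rest of the household.
        if rng.random() < clustering_level:
            for k, other_age in enumerate(households[h]):
                if not immune[h][k] and count[other_age] < cap[other_age]:
                    immune[h][k] = True
                    count[other_age] += 1
                    total += 1
    return immune


# Hypothetical toy population: 200 identical households with ages 5, 8 and 30.
toy_households = [[5, 8, 30] for _ in range(200)]
toy_levels = {5: 0.9, 8: 0.9, 30: 0.8}
unclustered = immunize_with_clustering(toy_households, toy_levels, 0.0, random.Random(0))
clustered = immunize_with_clustering(toy_households, toy_levels, 1.0, random.Random(0))
```

In this toy example, the number of fully immune households is larger with a clustering level of 1 than with a level of 0, even though the overall number of immune individuals is the same.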

2.4 Contact and Transmission Events

Once the population has been initialized, we simulate the actual contact and transmission events that occur in the population. This procedure is explained in more detail in a previous publication [13], but we will briefly repeat relevant elements here. The simulator moves forward in discrete time-steps of one day. During each time-step, we update the presence of individuals in their respective social contact pools (households, schools, workplaces and communities), based on the day of the week and each individual’s health status (symptomatic individuals only make contacts within their own household).

Next, we simulate contact and transmission events for each social contact pool. Based on the age- and context-specific contact rates discussed above, we check whether contact occurs between infectious and susceptible members of each contact pool. If an infectious and a susceptible individual contact each other, we compare a random draw to the transmission probability. This transmission probability - \(\text {P}_\text {transmission}\) - is supplied as an input parameter to the simulator and represents the probability that if an infectious and a susceptible individual have contact, a transmission of the disease occurs. If the random draw is lower than the transmission probability, we set the health status of the susceptible individual to exposed.
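One daily update of a single contact pool could look roughly like the sketch below. It is a simplified illustration rather than the Stride implementation: the dictionary-based representation of individuals, the status labels, and the `contact_probability` callback (standing in for the age- and context-specific contact rates) are assumptions made for this example.

```python
import random


def simulate_pool_day(members, contact_probability, p_transmission, rng):
    """Update one contact pool for one day; return the number of new exposures.

    members: list of dicts with keys 'age' and 'status' (mutated in place).
    contact_probability: hypothetical function (age_a, age_b) -> probability.
    """
    infectious = [m for m in members if m["status"] in ("infectious", "symptomatic")]
    susceptible = [m for m in members if m["status"] == "susceptible"]
    newly_exposed = 0
    for s in susceptible:
        for i in infectious:
            # First draw: does a contact occur between these two individuals?
            if rng.random() < contact_probability(i["age"], s["age"]):
                # Second draw: does the contact lead to transmission?
                if rng.random() < p_transmission:
                    s["status"] = "exposed"
                    newly_exposed += 1
                    break  # already exposed, no further draws needed
    return newly_exposed


# Hypothetical pool: one infectious adult and three susceptible children.
pool = [{"age": 35, "status": "infectious"}] + [
    {"age": a, "status": "susceptible"} for a in (4, 7, 9)
]
n_new = simulate_pool_day(pool, lambda a, b: 1.0, 1.0, random.Random(0))
```

With both probabilities set to 1, as in the usage lines, every susceptible member of the pool becomes exposed. With a transmission probability of 0, none do.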

Individuals that are not marked as immune from the beginning of the simulation can be in one of 5 health states: susceptible, exposed, infectious, infectious and symptomatic, and recovered. We modeled the natural history of measles as described in previous work [14].

2.5 Scenarios

A common measure used in epidemiology to estimate the transmission potential of a disease is R0, the basic reproduction number. R0 represents the average number of new cases one infected individual would cause in a completely susceptible population. However, as R0 depends not only on the pathogen itself, but also on the structure of the population and on social contact behavior, we used \(\text {P}_\text {transmission}\) as an input parameter to represent the transmission potential of the disease. Despite this, we want to be able to express our results in terms of R0, and so we established a relationship between \(\text {P}_\text {transmission}\) and R0 for the different populations that we used in our experiments.

We estimated this relationship for the populations projected for 2020, 2030 and 2040. For each calendar year, we ran 1000 simulations each for 21 values of \(\text {P}_\text {transmission}\) between 0 and 1. At the beginning of each simulation, we introduced one infected individual into the population. We kept track of the number of secondary cases caused by this index case in a completely susceptible population, and used this to estimate R0. We ran each simulation for 30 days: by this time the index case has recovered and can no longer infect any new cases.

We used the optimize.curve_fit function in the scipy Python package [22] to fit a function through the 21,000 data points we collected for each calendar year, allowing us to estimate a corresponding R0 value for each \(\text {P}_\text {transmission}\) that we used as an input parameter. The optimize.curve_fit function uses a non-linear least squares method to fit the data obtained from our simulation runs to a function of the form shown in Eq. (1).

$$\begin{aligned} a + b \times \log (1 + P_{\text {transmission}}) \end{aligned}$$
(1)
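Because Eq. (1) is linear in a and b once \(\log (1 + P_{\text {transmission}})\) is taken as the predictor, a fit of this form can also be obtained with ordinary least squares. The sketch below uses that closed-form approach with the Python standard library only; it is a swapped-in technique for illustration, not the optimize.curve_fit call used in the study. The four data points are the rounded 2020 values quoted in the text of this paper, not the raw simulation output, so the resulting coefficients are only a rough back-calculation. The function names are hypothetical.

```python
import math


def fit_log_model(p_values, r0_values):
    """Least-squares fit of r0 = a + b * log(1 + p); returns (a, b)."""
    xs = [math.log(1 + p) for p in p_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(r0_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, r0_values))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_r0(a, b, p):
    """Evaluate the fitted function of the form of Eq. (1)."""
    return a + b * math.log(1 + p)


def p_for_r0(a, b, r0):
    """Invert Eq. (1): the transmission probability that yields a given R0."""
    return math.exp((r0 - a) / b) - 1


# Rounded (P_transmission, estimated R0) pairs for 2020 quoted in the text.
p_2020 = [0.40, 0.44, 0.72, 0.80]
r0_2020 = [11.16, 12.12, 18.17, 19.71]
a_2020, b_2020 = fit_log_model(p_2020, r0_2020)
```

The inverse function `p_for_r0` shows how a target R0 can be translated back into a value of \(\text {P}_\text {transmission}\).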

After we established a relationship between \(\text {P}_\text {transmission}\) and R0, we investigated different scenarios regarding household-based clustering of susceptibility for different calendar years. We tested 9 values for \(\text {P}_\text {transmission}\) between 0.40 and 0.80 - corresponding to a basic reproduction number of about 11.16 to 19.71 for 2020, 10.95 to 19.36 for 2030 and 10.80 to 19.12 for 2040. We also tested 5 values for the target clustering level between 0 and 1. We compared these 45 scenarios between 3 different calendar years: 2020, 2030 and 2040. For each of these 135 scenarios, we ran 200 stochastic simulations.
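The size of this experimental design can be checked by enumerating the scenario grid. In the sketch below, the evenly spaced values for \(\text {P}_\text {transmission}\) and the clustering level are an assumption: the text gives only the number of values and their ranges.

```python
from itertools import product

# Assumed evenly spaced grids (only the counts and ranges are given in the text).
p_transmission_values = [round(0.40 + 0.05 * i, 2) for i in range(9)]  # 0.40 .. 0.80
clustering_levels = [0.0, 0.25, 0.5, 0.75, 1.0]
calendar_years = [2020, 2030, 2040]
runs_per_scenario = 200

# Each scenario is a (year, P_transmission, clustering level) combination.
scenarios = list(product(calendar_years, p_transmission_values, clustering_levels))
total_runs = len(scenarios) * runs_per_scenario
```

This gives 135 scenarios and 27,000 stochastic simulation runs in total.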

At the beginning of each simulation, we introduced one infectious individual into the population. Next, we ran every simulation for 730 days. We assumed that after this period the outbreak had run its full course - as no more new infections were recorded after day 730 in previous, exploratory simulations.

3 Results

3.1 Relationship \({\text {P}_\text {transmission}} \sim \mathrm{{R}_{0}}\)

As discussed above, we established a relationship between the input parameter \(\text {P}_\text {transmission}\) and R0, the basic reproduction number. As R0 depends on both the transmission potential of a disease as well as on the structure and social mixing behavior of a population, we estimated this relationship separately for each different population projection we used (2020, 2030, and 2040).

The functions of the form shown in Eq. (1) that we fit for 2020, 2030, and 2040 can be seen in Table 1. Even though the populations used for 2020, 2030 and 2040 differ from each other in terms of age distribution and household constitution, the relationship between \(\text {P}_\text {transmission}\) and R0 changes very little.

Table 1. Coefficients for fitted functions of the form shown in Eq. (1) to estimate the relationship between \(\text {P}_\text {transmission}\) and R0.

An overview of the simulation results we used to fit these functions can be seen in Fig. 2. For all three fits, we observe that a value for \(\text {P}_\text {transmission}\) of 0 corresponds to a value of 0 for R0. Furthermore, as \(\text {P}_\text {transmission}\) increases, we see that both the mean number of secondary cases caused by our index case (solid blue line) and the median number of secondary cases (dashed pink line) increase. These values both roughly follow the shape of a logarithmic function, which is the reason why we chose to fit them to a function of the form shown in Eq. (1).

We also added the fitted function which estimates the relationship between \(\text {P}_\text {transmission}\) and R0 for each calendar year to the plot seen in Fig. 2 (dotted brown line). For all calendar years we tested, the fitted function neatly follows both the mean and median number of secondary cases we observed for each value of \(\text {P}_\text {transmission}\).

The basic reproduction number of measles is commonly estimated to be between 12 and 18. In our model, this would thus correspond to a value of \(\text {P}_\text {transmission}\) of about 0.44 (\(\hat{R}_{0} = 12.12\)) to 0.72 (\(\hat{R}_{0} = 18.17\)) for 2020, 0.45 (\(\hat{R}_{0} = 12.12\)) to 0.73 (\(\hat{R}_{0} = 18.03\)) for 2030, and 0.46 (\(\hat{R}_{0} = 12.19\)) to 0.75 (\(\hat{R}_{0} = 18.19\)) for 2040. As we ran simulations for values of \(\text {P}_\text {transmission}\) from 0.40 to 0.80, the tested range covers the transmission probabilities that are relevant for measles in all three calendar years.

Fig. 2. Estimated relationship between \(\text {P}_\text {transmission}\) and R0 in our simulated populations for 2020 (a) and 2040 (b) (results for 2030 not shown here). Besides the mean number of secondary cases for each value of \(\text {P}_\text {transmission}\) (solid blue line), the median number of secondary cases (dashed pink line), and the 95% percentile interval of secondary cases observed (gray shape), the fitted function (dotted brown line) is also shown for each tested calendar year. (Color figure online)

3.2 Household Assortativity Coefficient

Our goal in this study was to investigate how the effect of the clustering of measles susceptibility within households evolves as the population ages. We used a target clustering level as an input parameter to inform how immunity is to be distributed in the simulated population. To check to what extent changes in this input parameter actually led to more clustering of susceptibility in the simulated population, and to obtain a quantity that could also be used to measure clustering in real populations, we constructed a measure to estimate the actual level of household-based susceptibility clustering in a population: the household assortativity coefficient [14, 18].

To calculate this household assortativity coefficient, we first build a network. The nodes in this network correspond to individuals in the population. Each node has a single attribute: an individual is either susceptible to measles at the beginning of the simulation or they are not. An edge connects two nodes if the two individuals represented by the nodes belong to the same household.

Once this network has been constructed, we can calculate the attribute assortativity coefficient, based on the susceptibility attribute. This coefficient describes the extent to which similar nodes - here with respect to their immunity status - in the network are connected to each other. To construct the networks and calculate the attribute assortativity coefficient, we used the NetworkX Python package [9].
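To make the definition concrete, the sketch below computes the attribute assortativity coefficient for a household network directly from its definition, r = (Σ e_ii − Σ a_i b_i) / (1 − Σ a_i b_i), where e_ij is the fraction of edge ends that connect a node of type i to a node of type j. It is a standard-library stand-in for illustration, not the NetworkX-based code used in the study; in NetworkX the corresponding call is `attribute_assortativity_coefficient`. The input format (one list of susceptibility flags per household) and the function name are assumptions.

```python
import math
from collections import Counter
from itertools import combinations


def household_assortativity(households):
    """households: list of lists of booleans (True = susceptible).

    Every pair of members of the same household is joined by an edge.
    """
    mixing = Counter()
    for members in households:
        for u, v in combinations(members, 2):
            # Undirected edge: count it once in each direction.
            mixing[(u, v)] += 1
            mixing[(v, u)] += 1
    total = sum(mixing.values())
    if total == 0:
        raise ValueError("no within-household edges")
    types = (True, False)
    trace = sum(mixing[(t, t)] for t in types) / total
    # The mixing matrix is symmetric, so row sums equal column sums (a_i == b_i).
    a = {t: sum(mixing[(t, u)] for u in types) / total for t in types}
    sum_ab = sum(a[t] ** 2 for t in types)
    if math.isclose(sum_ab, 1.0):
        raise ValueError("coefficient undefined: all connected nodes share one status")
    return (trace - sum_ab) / (1 - sum_ab)


# Hypothetical extremes: perfectly clustered vs. perfectly mixed households.
fully_clustered = household_assortativity([[True, True], [False, False]])
fully_mixed = household_assortativity([[True, False], [True, False]])
```

The two toy inputs give the extreme values of the coefficient: 1 when susceptible individuals only share households with other susceptible individuals, and -1 when every household pairs a susceptible with an immune individual.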

In Fig. 3, the distribution of household assortativity coefficients by target clustering level can be seen for simulations for 2020 (red), 2030 (yellow), and 2040 (green). For all calendar years the same trend can be observed: as the target clustering level is increased, the household assortativity coefficient also increases. Furthermore, there seems to be a consistent relationship between the target clustering level and the household assortativity coefficient for each calendar year.

Fig. 3. Distribution of household assortativity coefficients by input clustering level for simulations for 2020 (red), 2030 (yellow) and 2040 (green). (Color figure online)

When we compare the different calendar years, we observe that, in later years, the household assortativity coefficient increases more sharply as the clustering level is increased. This can be expected when we consider that we only took the target clustering level into account for individuals born since 1985. In 2020, this age group constitutes a smaller part of the population than it does in 2030 and in 2040. As such, clustering is applied to a larger part of the population in later calendar years, which is reflected in the corresponding household assortativity coefficients.

3.3 Risk and Persistence of Measles Outbreaks

Effective R. To estimate the impact of household-based susceptibility clustering on the risk for measles outbreaks, we calculated the Effective R for each scenario that we tested. We defined the Effective R as the average number of secondary cases an infected individual causes in a partially immune population. The method we used to calculate the Effective R is similar to how we calculated R0. For each scenario we tested, we calculated the average number of secondary cases caused by the index case over the 200 stochastic simulations.

In Fig. 4, the Effective R by \(\text {P}_\text {transmission}\) and clustering level is shown for 2020 (a), 2030 (b), and 2040 (c). For all calendar years, the same trend can be observed. As expected, increasing the transmission probability leads to an increase in the Effective R. However, when the clustering level is increased while \(\text {P}_\text {transmission}\) remains the same, the Effective R also increases.

When we compare the results for the different calendar years to each other, we observe that overall, the Effective R is higher in later calendar years - even for the lowest values of \(\text {P}_\text {transmission}\) and a clustering level of 0. This can be explained by the fact that the overall immunity level of the population is also decreasing as time progresses.

Fig. 4. Heat-maps of the Effective R by \(\text {P}_\text {transmission}\) and clustering level for 2020 (a), 2030 (b), and 2040 (c).

Escape Probability. To estimate the risk for measles outbreaks, it is important to know whether it is likely that an outbreak will be contained to a few secondary cases, or has the potential to spread to a large part of the susceptible population. For this reason, we calculated, for each simulation run, the escape probability. We defined the escape probability as the chance that an individual who is susceptible at the beginning of the simulation will remain uninfected over the entire course of the simulation (730 days). We estimated this probability as shown in Eq. (2), with \(\text {N}_\text {susceptible}\) the number of susceptible individuals in the population at the beginning of the simulation and \(\text {N}_\text {cases}\) the total number of cases infected over the course of the entire simulation (730 days).

$$\begin{aligned} \hat{P}_{\text {escape}} = \frac{N_{\text {susceptible}} - N_{\text {cases}}}{N_{\text {susceptible}}} \end{aligned}$$
(2)
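Eq. (2) translates directly into code. The sketch below evaluates it for a single run and averages it over several runs; the function names and the numbers in the usage lines are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean


def escape_probability(n_susceptible, n_cases):
    """Eq. (2): fraction of initially susceptible individuals never infected."""
    if n_susceptible <= 0:
        raise ValueError("n_susceptible must be positive")
    return (n_susceptible - n_cases) / n_susceptible


def mean_escape_probability(runs):
    """Average Eq. (2) over runs given as (n_susceptible, n_cases) pairs."""
    return mean(escape_probability(s, c) for s, c in runs)


# Hypothetical run: 24,000 initially susceptible individuals, 5,040 cases.
p_escape = escape_probability(24_000, 5_040)
```

In this made-up run, 18,960 of the 24,000 initially susceptible individuals remain uninfected, which gives an escape probability of 0.79.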

In Fig. 5, the average escape probability over 200 stochastic runs for each scenario is shown for 2020 (a), 2030 (b), and 2040 (c). Again, a clear relationship can be observed between \(\text {P}_\text {transmission}\) and the escape probability: as the transmission probability is increased, the escape probability decreases. Increasing the clustering level has the same effect: a higher clustering level corresponds to a lower escape probability.

In 2020 (see Fig. 5 (a)), when \(\text {P}_\text {transmission}\) is 0.50 (corresponding to an R0 of about 13.5) and the clustering level is set at 0, the average escape probability is 0.79. However, when \(\text {P}_\text {transmission}\) remains the same, but the clustering level is increased to 1, the average escape probability decreases to 0.68. The same trend can be observed for 2030 (b) and 2040 (c).

Fig. 5. Average escape probability by \(\text {P}_\text {transmission}\) and clustering level for 2020 (a), 2030 (b), and 2040 (c).

Outbreak Size. Finally, we looked at the average size of persistent outbreaks. We defined a persistent outbreak as an outbreak that spreads throughout the population, instead of being contained to only a few secondary cases. To determine what we would consider a persistent outbreak, we looked at the frequency of outbreak sizes over all scenarios. We observed that an outbreak either dies out after only a few infections, or spreads through a large part of the susceptible population, with very few cases in between. As such, we chose a threshold of 5,000 infected cases (about 1.5% of the population), in between those two extremes. We thus regard any outbreak that leads to more than 5,000 infected cases as a persistent outbreak.
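The threshold rule can be written as a simple filter over the final outbreak sizes of a scenario. The sketch below is illustrative only: the function name is hypothetical and the outbreak sizes in the usage lines are invented to show the bimodal pattern described above.

```python
from statistics import mean


def mean_persistent_outbreak_size(outbreak_sizes, threshold=5000):
    """Average size of outbreaks with more than `threshold` cases.

    Returns None when no outbreak in the list is persistent.
    """
    persistent = [size for size in outbreak_sizes if size > threshold]
    return mean(persistent) if persistent else None


# Invented final sizes: most outbreaks die out early, two become persistent.
sizes = [1, 3, 12_000, 2, 14_000, 5_000]
average_size = mean_persistent_outbreak_size(sizes)
```

An outbreak of exactly 5,000 cases is not counted here, since the definition requires more than 5,000 cases.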

In Fig. 6, the average outbreak size of persistent outbreaks by \(\text {P}_\text {transmission}\) and by clustering level is shown for 2020 (a), 2030 (b), and 2040 (c). Again, an increase in \(\text {P}_\text {transmission}\) leads to an increase in the average size of persistent outbreaks. Furthermore, as the clustering level is increased, the average size of persistent outbreaks also increases.

However, we observe that the increase in average outbreak size as the clustering level is raised from 0 to 1 is smaller for later calendar years, and likewise is smaller for larger values of \(\text {P}_\text {transmission}\). Indeed, when we consider that the overall population immunity decreases from 2020 to 2040, these are two sides of the same coin: an increase in the transmission potential of a disease can be expected to have the same effects as a decrease in the overall population immunity to a disease.

Fig. 6. Average sizes of persistent outbreaks (threshold = 5,000 cases) by \(\text {P}_\text {transmission}\) and clustering level for 2020 (a), 2030 (b), and 2040 (c).

4 Discussion

We used an individual-based model to investigate how the effect of household-based susceptibility clustering on the risk for measles outbreaks is expected to change over the next 20 years. We compared different scenarios regarding the clustering of susceptibility over three calendar years: 2020, 2030 and 2040. For each of these calendar years we used projections of age distribution, household constitution, and age-specific immunity levels.

We compared the results of these simulations regarding the risk, persistence and size of measles outbreaks. We found that for all tested calendar years, an increase in the level of household-based susceptibility clustering leads to an increase in the risk for measles outbreaks, their potential to spread throughout the population, and the eventual size of these outbreaks. However, for later calendar years, and for higher values of \(\text {P}_\text {transmission}\), the effect of susceptibility clustering on the size of persistent outbreaks is less pronounced.

As such, we need to take household-based susceptibility clustering into account when modeling the spread of measles both in current and in future populations. Not doing so could lead to an under-estimation of the herd immunity threshold and the efforts needed to control measles. Furthermore, the clustering of susceptibility within households becomes especially important when taking into account the projections for age-specific immunity levels that we used. The generation born between 1985 and 1995, who have a lower immunity level, are now becoming parents, meaning they will share a roof with susceptible infants, who are still too young to be vaccinated. Increasing the immunity level of this generation - for example by organizing a catch-up campaign - should thus be a priority.

There are some limitations that should be taken into account when interpreting the results of this study. First, the populations we used are closed populations: no individuals were born or died over the course of each simulation. As such, there were no new susceptible individuals entering the population. In reality, newborn infants are an important group of susceptible individuals, which may thus have been underrepresented in our study [1].

Furthermore, the projected age-specific immunity levels that we used for 2020, 2030 and 2040 were based on a serological survey [11]. A serological survey measures antibody titres in surveyed individuals, which provide an estimate of humoral immune response. However, cellular immune response mechanisms may offer protection against a disease even when antibody levels are low [19]. As such, the projections we used may underestimate the level of immunity against measles in Flanders.

Finally, this study is a case-study for the population of Flanders, Belgium. It should be verified whether the effect of household-based susceptibility clustering evolves in the same manner for other populations. We also recommend that, in the future, data on the level of susceptibility clustering in different contexts and for different populations should be collected and used to update estimations of the herd immunity threshold for measles.