1 Introduction

Recent outbreaks of infectious diseases such as Ebola virus disease [1,2,3], Zika virus [4,5,6], a particularly deadly strain of influenza [7, 8], and COVID-19 [9] have brought to light the need to develop effective and integrated measures to both monitor and control the spread of epidemics. Only through such an integrated approach will one be able to prevent a regional outbreak from becoming a global pandemic and to mitigate the damage done to the affected areas.

One such approach involves the use of early warning signals (EWS) [10] to detect if a critical transition is approaching. These critical transitions can be found in ecosystems [10], financial systems [11], and the Earth’s climate system [12], and may be caused by the system crossing a bifurcation or through stochastic-induced switching [13]. In particular, for problems of disease emergence and ecological tipping points, it has been suggested that a noise-induced switch from the endemic state/carrying capacity to the extinct state can be anticipated if data is collected with sufficient frequency [14, 15].

However, this has proven difficult in practice since, for instance, disease emergence is characterized by low prevalence and is often complicated by amplification of transients and oscillatory dynamics [15]. Also, it is widely held that noise-induced transitions cannot be detected since there is no change in the shape of the potential function, and no change in the eigenvalue of the mean field model [14, 16]. Furthermore, insights from large deviation theory [17, 18] cast doubt on the effectiveness, accuracy, and feasibility of anticipating stochastic transitions using EWS [19,20,21].

Nevertheless, extensive research shows that generic early warning signals exist in a wide variety of systems, and can in fact be the harbingers of critical transitions [10, 22]. For instance, leading indicators such as autocorrelation and variance have been found to increase prior to climatic transitions as well as before regime changes in lake food webs [16, 23,24,25]. And although, as previously stated, a stochastic transition is difficult to detect early by relying solely on the generic indicators developed in the theory of EWS, the changes detected by EWS indicators do tell us that some type of transition is taking place, be it critical or otherwise [26]. This observation will serve as the foundation of this article, as we seek to test and develop methods that introduce control measures at critical stages in the dynamics of the epidemic model we consider.

We consider a susceptible–infectious–susceptible (SIS) epidemic model and employ the metric-based indicators developed in the theory of EWS to monitor the state of the system. By doing this, we are able to ascertain the system’s overall resilience to control measures at different points in time. We seek to drive the infectious disease to extinction in an amount of time that is shorter than the noise-induced extinction that would occur in the absence of control. This will be accomplished by perturbing the system at specific points corresponding to states of low resilience, i.e., a weakened system state.

In short, whereas previous attempts focused on the prediction of critical transitions in stochastic systems, we attempt to induce them. We hypothesize that perturbing the system while in a weakened state will precipitate critical transitions, while the same approach applied during periods of increased resilience will be met with resistance, failing to yield similar results. Our hypothesis stems from results that show that many stochastic systems do in fact behave like systems undergoing critical transition due to critical slowing down (CSD) in the mean field model [14].

We test our hypothesis by simulating stochastic data from the SIS epidemic model using a type of Monte Carlo method [27]. This time series data is subjected to EWS analysis [23, 24] using metric-based indicators, which quantify changes in statistical properties of the time series. In particular, one can use the autocorrelation, return rate, skewness, and variance to yield actionable information which is then used to introduce quarantine/immunization measures to effect the desired result [23].

Consistent with our proposition, we show that removing a portion of the infected population in the SIS model at a low resilience state through quarantine or vaccination is an effective measure for inducing early extinction events in the epidemic. Mean times to extinction (MTE) are computed for the various cases and demonstrate agreement with the hypothesis. Results also warn against the indiscriminate application of controls when the system is in a high resilience state.

2 Early warning signals theory

Early warning signals (EWS) can be thought of as the direct product of the need to understand the phenomenon of critical transitions. In many instances, the critical transition corresponds to a catastrophic bifurcation in which the qualitative behavior of the system changes abruptly in response to changes in one or more external parameters. It commonly arises in deterministic systems with alternative equilibrium states (Fig. 1). However, as demonstrated in this article, the transition can be noise-induced wherein stochasticity causes a transition from a deterministically stable equilibrium to a deterministically unstable equilibrium (e.g., a transition from the endemic state to the extinct state found in epidemic models for a reproduction number, \(R_0 >1\)) [1, 18, 28, 29].

Fig. 1

A sample dynamical landscape for a system with alternative equilibrium states in which the critical transition is brought about by a deterministic bifurcation. Although the SIS epidemic model discussed in this article does not possess alternative equilibrium states, we show that the EWS metric indicators can still be used. (Top left) The dynamical landscape far from the transition. A deep basin of attraction makes the switch to the alternative state due to small perturbations difficult. (Top right) Far from the transition, the return rate is fast, and the variance is relatively small. (Bottom left) Flattening of the dynamical landscape due to critical slowing-down can cause even small perturbations to induce a switch to an alternative state. (Bottom right) Close to the transition, the return rate slows down, while the variance increases

In addition, critical transitions often exhibit “threshold behavior,” which is accompanied by changes in the system properties that become more extreme as it moves closer to the tipping point. These changes include but are not limited to: increasing autocorrelation, decreasing return rate, increasing variance, increasing skewness, and flickering [30]. The gradual change in these collective properties is generally termed critical slowing down (CSD). In essence, it is CSD that allows early warning signals to detect critical transitions in progress. Figure 1 illustrates the role of CSD with respect to the dynamical landscape.

It is worth noting again that there are different types of transitions, some catastrophic, some not, some showing CSD and some not, and some induced by noise. Since we are considering stochastic data, it is important to realize that the stochastic perturbations will influence the EWS statistics, and it is precisely the careful monitoring of these metric-based indicators that will serve as the basis for our analysis. In addition, while CSD is typically associated with undesirable outcomes, such as desertification [31], lake eutrophication [32], and coral reef collapse [33], here we take advantage of the resulting dynamical consequence of CSD to accelerate critical transitions in systems where such outcomes are desirable, such as the extermination of pest populations, and the local extinction of epidemics.

Fig. 2

SIS compartmental diagram. Individuals are born susceptible with rate \(\mu \), susceptible individuals become infectious with contact rate \(\beta \), infectious individuals recover and become re-susceptible with rate \(\gamma \), and both susceptible and infectious individuals die with rate \(\mu \)

3 SIS epidemic model

The SIS epidemic model is constructed by dividing the population into two classes of individuals, namely susceptible S and infectious I. Individuals are born into the susceptible population with birth rate \(\mu \). A susceptible individual can perish with death rate \(\mu \) and can become infected with the disease, with contact rate \(\beta \). An infectious individual recovers and becomes re-susceptible with rate \(\gamma \), and can perish with death rate \(\mu \). The units of all rates are per year. Figure 2 illustrates the SIS compartmental diagram, and the deterministic mean-field equations are given as

$$\begin{aligned} \frac{dS}{dt}&= \frac{-\beta SI}{N} + \mu (N-S) + \gamma I, \end{aligned}$$
(1)
$$\begin{aligned} \frac{dI}{dt}&= \frac{\beta SI}{N} - \mu I - \gamma I, \end{aligned}$$
(2)

where N is the constant population size.

Equations (1) and (2) have two steady states. The disease-free or extinct state has no infectious individuals and is given as

$$\begin{aligned} (S_0,I_0)=(N, 0). \end{aligned}$$
(3)

The infection is maintained at the endemic disease state, which is given as

$$\begin{aligned} (S_e,I_e)=\left( \frac{N}{R_0}, N\left( 1 - \frac{1}{R_0} \right) \right) , \end{aligned}$$
(4)

where \(R_0 = \frac{\beta }{\gamma + \mu }\) is the basic reproductive number [34]. The stability of these two fixed points is determined by the value of the reproductive number \(R_0\), which can be thought of as the average number of new infectious individuals that one infectious individual generates over the course of the infectious period in an entirely susceptible population. For \(R_0 > 1\), the extinct state is unstable while the endemic state is stable. Note that since the model is deterministic, a system at the attracting endemic state can never reach the extinct state. However, once demographic stochasticity is included, a sufficiently large fluctuation will eventually bring the system into the extinct state.
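As a quick numerical check of Eqs. (1)–(4), the following minimal Python sketch evaluates \(R_0\) and the endemic state and confirms that the right-hand sides of Eqs. (1) and (2) vanish there. The function names and the parameter values are hypothetical and chosen for illustration only; they are not the values used in this article.

```python
def sis_rhs(S, I, N, beta, gamma, mu):
    """Right-hand sides of the mean-field Eqs. (1) and (2)."""
    dS = -beta * S * I / N + mu * (N - S) + gamma * I
    dI = beta * S * I / N - mu * I - gamma * I
    return dS, dI


def reproductive_number(beta, gamma, mu):
    """Basic reproductive number R0 = beta / (gamma + mu)."""
    return beta / (gamma + mu)


def endemic_state(N, beta, gamma, mu):
    """Endemic steady state (S_e, I_e) of Eq. (4); meaningful for R0 > 1."""
    R0 = reproductive_number(beta, gamma, mu)
    return N / R0, N * (1.0 - 1.0 / R0)


# Hypothetical parameter values (rates per year), for illustration only.
N, beta, gamma, mu = 100.0, 105.0, 100.0, 0.2

R0 = reproductive_number(beta, gamma, mu)
S_e, I_e = endemic_state(N, beta, gamma, mu)
dS, dI = sis_rhs(S_e, I_e, N, beta, gamma, mu)
print("R0 =", R0)
print("endemic state:", (S_e, I_e))
print("residuals at the endemic state:", (dS, dI))
```

Both residuals are zero up to floating-point round-off, and the same holds at the extinct state \((N, 0)\).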

The corresponding stochastic population model is represented by the transition processes of birth, death, infection, and recovery [18, 35, 36]. The associated rates are given in Table 1.

Table 1 SIS Transitions and Rates

4 Methods

In this section, we describe the methodology used to generate the data as well as the statistical methods used to understand the application of the control at different resiliencies. Specifically, Sect. 4.1 contains details about the Monte Carlo method that is used to generate 1000 stochastic realizations (time series) of the SIS epidemic model, while Sect. 4.2 describes the metric-based indicators that are used to capture changes in the properties of the time series. Based on the resulting autocorrelation data for each time series, we determine the low and high thresholds that correspond, respectively, to regions of high and low resilience (Sect. 4.3). Lastly, Sect. 4.4 describes the control mechanism and the determination of mean extinction times.

4.1 Simulation of the stochastic SIS model

To generate a solution of this stochastic model, where the demographic noise is internal to the system, we use the Doob–Gillespie algorithm (also known as the Gillespie algorithm or Gillespie’s stochastic simulation algorithm (SSA)) [27, 37]. The algorithm is a type of Monte Carlo method that was originally proposed by Kendall [38] for simulating birth–death processes and was popularized by Gillespie [27] as a useful method for simulating chemical reactions based on molecular collisions. The result of a Gillespie simulation is a stochastic trajectory that represents an exact sample from the probability function that solves the master equation. The method can therefore be used to simulate population dynamics, where molecular collisions are replaced by individual events and interactions including birth, death, and infection [18].

Fig. 3

(Top) Stochastic time series for the SIS model generated using the Gillespie algorithm as described in the text. (Middle) Autocorrelation, and (Bottom) return rate for the stochastic time series shown in the top panel computed using the Early Warning Signals Toolbox [39]. The green band indicates the ideal time to apply the control (low resilience), while the red band indicates where the control could lead to undesired results (high resilience). These, and other, statistical metric-based indicators provide an accurate picture of the state of the SIS system

Let \(\mathbf{x}=(x_1,\ldots ,x_n)^T\) denote the state variables of a system, where \(x_j\) is the number of individuals in compartment j at time t. The first step of the algorithm is to initialize the number of individuals in the population compartments \(\mathbf{x}_0\). For a given state \(\mathbf{x}\) of the system, one calculates the transition rates (birth rate, death rate, contact rate, etc.) denoted as \(a_i(\mathbf{x})\) for \(i=1,\ldots ,l\), where l is the number of transitions. Thus, the sum of all transition rates is given by \(a_0=\sum \limits _{i=1}^{l}a_i(\mathbf{x})\).

Random numbers are generated to determine both the next event to occur as well as the time at which the next event will occur. One simulates the time \(\tau \) until the next transition by drawing from an exponential distribution with mean \(1/a_0\). This is equivalent to drawing a random number \(r_1\) uniformly on (0, 1) and computing \(\tau =(1/a_0)\ln {(1/r_1)}\). During each random time step, exactly one event occurs. The probability of any particular event taking place is equal to its own transition rate divided by the sum of all transition rates, i.e., \(a_i(\mathbf{x})/a_0\). A second random number \(r_2\) is drawn uniformly on (0, 1), and it is used to determine the transition event that occurs. If \(0<r_2 < a_1(\mathbf{x})/a_0\), then the first transition occurs; if \(a_1(\mathbf{x})/a_0<r_2 < (a_1(\mathbf{x})+a_2(\mathbf{x}))/a_0\), then the second transition occurs, and so on.
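The two random draws described above can be sketched in Python as follows. This is a minimal illustration rather than the code used in this article; the function name `gillespie_step` and the example rates are hypothetical.

```python
import math
import random


def gillespie_step(rates, rng):
    """One Gillespie step: return (tau, i) for transition rates a_i(x)."""
    a0 = sum(rates)
    if a0 <= 0.0:
        raise ValueError("no transition can occur (a0 = 0)")
    # 1 - rng.random() lies in (0, 1], which avoids evaluating ln(1/0).
    r1 = 1.0 - rng.random()
    tau = (1.0 / a0) * math.log(1.0 / r1)
    # Event i is chosen when r2 * a0 falls in the i-th cumulative-rate interval.
    target = rng.random() * a0
    cumulative = 0.0
    for i, a in enumerate(rates):
        cumulative += a
        if target < cumulative:
            return tau, i
    # Guard against floating-point round-off: last event with a positive rate.
    return tau, max(i for i, a in enumerate(rates) if a > 0.0)


# Hypothetical rates: the second transition has rate zero and is never chosen.
rng = random.Random(1)
tau, event = gillespie_step([2.0, 0.0, 6.0], rng)
print("waiting time:", tau, "event index:", event)
```

With these example rates, the third transition is selected with probability \(6/8\), and the waiting times have mean \(1/a_0 = 1/8\).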

Lastly, both the time step and the number of individuals in each compartment are updated, and the process is iterated until the disease goes extinct or until the simulation time has been exceeded [18]. A sample realization of the stochastic SIS model using the transitions in Table 1 and the parameter values shown in Table 2 is shown in the top panel of Fig. 3. The Table 2 parameter values are used to generate all the simulated data used throughout this article, and the code used to generate this realization as well as all the simulated data can be found in the GitHub repository https://github.com/Walter-Ullon/Stochastic-SIS.
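For illustration, the complete loop might look like the following Python sketch. It is not the code in the repository. The five transitions and their rates are our assumption, read off the description of the model in Sect. 3 (birth, death of a susceptible, infection, recovery, death of an infectious individual), with \(N = S + I\) taken as the current population size. The parameter values are hypothetical and are not those of Table 2.

```python
import math
import random

# (dS, dI) for each assumed transition, in the same order as the rates below.
CHANGES = [(+1, 0), (-1, 0), (-1, +1), (+1, -1), (0, -1)]


def simulate_sis(S0, I0, beta, gamma, mu, t_max, seed=None):
    """Simulate the stochastic SIS model until extinction or until t >= t_max."""
    rng = random.Random(seed)
    t, S, I = 0.0, S0, I0
    times, infected = [t], [I]
    while I > 0 and t < t_max:
        N = S + I  # assumption: rates use the current population size
        rates = [
            mu * N,            # birth of a susceptible
            mu * S,            # death of a susceptible
            beta * S * I / N,  # infection
            gamma * I,         # recovery (re-susceptibility)
            mu * I,            # death of an infectious individual
        ]
        a0 = sum(rates)
        # Exponential waiting time with mean 1/a0.
        t += math.log(1.0 / (1.0 - rng.random())) / a0
        # Pick the event whose cumulative-rate interval contains r2 * a0.
        target, cumulative = rng.random() * a0, 0.0
        for event, a in enumerate(rates):
            cumulative += a
            if target < cumulative:
                break
        dS, dI = CHANGES[event]
        S, I = S + dS, I + dI
        times.append(t)
        infected.append(I)
    return times, infected


# Hypothetical parameter values (rates per year), for illustration only.
times, infected = simulate_sis(95, 5, beta=105.0, gamma=100.0, mu=0.2,
                               t_max=2.0, seed=42)
print("events simulated:", len(times) - 1)
print("final time:", times[-1], "final number of infectious:", infected[-1])
```

Fixing the seed makes a realization reproducible, which is what allows the same realization to be re-run with and without control.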

Table 2 Parameter values used in the time series simulations of the SIS model

4.2 Statistical analysis: metric-based indicators

Statistical analysis of the simulated stochastic time series data is performed using the Early Warning Signals Toolbox for R [39]. As stated by its developers, the toolbox provides methods for estimating statistical changes in time series data that can be used for identifying nearby critical transitions [23].

Metric-based indicators capture changes in the properties of an observed time series of a system and quantify changes in the statistical properties of the time series without attempting to fit the data with a specific model structure [23]. These methods include a variety of statistical measures including autocorrelation and spectral properties, variance, skewness and kurtosis, detrended fluctuation analysis, and conditional heteroskedasticity.

While the Early Warning Signals Toolbox also contains robust model-based methods to analyze time series data, in our work we seek to rely only on metric-based methods as this provides us with more flexibility in the application of the theory across various dynamical systems. Of the aforementioned indicators, autocorrelation and return rate are the simplest measures of critical slowing down. An increase in autocorrelation indicates that the state of the system has become increasingly similar between consecutive observations, while a decrease in the return rate signals the system’s inability to return to its previous state. Both measures provide evidence of critical slowing down [23].

The Early Warning Signals Toolbox measures lag-1 autocorrelation in three separate ways, namely the autocorrelation function, an autoregressive model of order 1, and return rate [23]. In this work, we have relied on the autocorrelation estimated with the autoregressive model over a rolling window of 25% of the time series, based on the idea that indicators should be estimated as data becomes available. The autocorrelation found using the autocorrelation function is very similar for the time series generated in this work.
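To make the rolling-window estimate concrete, the following Python sketch computes the AR(1) coefficient by ordinary least squares in each window. It is a simplified stand-in and not the toolbox's R implementation: it assumes an evenly sampled series and omits any detrending or filtering. The function names are hypothetical.

```python
def ar1_coefficient(window):
    """Least-squares slope of x[t+1] regressed on x[t], with intercept."""
    x, y = window[:-1], window[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0.0:
        return float("nan")  # constant window: the slope is undefined
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def rolling_ar1(series, window_fraction=0.25):
    """AR(1) coefficient in each rolling window covering a fraction of the series."""
    w = max(3, int(round(window_fraction * len(series))))
    return [ar1_coefficient(series[i - w:i]) for i in range(w, len(series) + 1)]


# Noise-free example obeying x[t+1] = 0.5 * x[t] + 1, so every window has slope 0.5.
example = [10.0]
for _ in range(19):
    example.append(0.5 * example[-1] + 1.0)
print(rolling_ar1(example))
```

Each estimate uses only the data inside its window, so the indicator can be updated as new observations arrive.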

Figure 3 shows two statistical indicators, autocorrelation and return rate, for the associated stochastic SIS time series. As we can observe in Fig. 3, it is possible for these indicators to signal a critical transition in progress, while the time series of SIS infectives still does not switch from the endemic state to the alternative extinct state, i.e., a false positive. However, by controlling the system (i.e., removing a portion of the infected individuals) at the right time, one can precipitate the transition and achieve an early extinction in which extinction occurs earlier than it would have with no control. This loss of resilience is a trademark of CSD. As we shall see, the results obtained by relying on metric-based EWS analysis are both encouraging and sobering.

4.3 Statistical analysis: resilience thresholds

Since our methods call for the introduction of control measures when the system is displaying threshold values in autocorrelation (high/low), it is essential that we define specifically what these thresholds are and the manner in which they were obtained. We note that at this point, we take autocorrelation as our leading metric-based indicator in the analysis and simulations that follow.

To ensure robust statistical estimates, one thousand stochastic realizations of the SIS model were computed using the Gillespie algorithm outlined in Sect. 4.1. The autocorrelation data was computed for each of the thousand realizations in the manner described in Sect. 4.2, and the autocorrelation time series was averaged to obtain a mean autocorrelation value specific to the time series being considered. The end result of this process is one thousand autocorrelation means, which were in turn averaged to obtain a clearer idea of the distribution and location of the true mean autocorrelation for the SIS model.

This process was carried out in the same manner for the maximum and minimum autocorrelation values of each time series, which yielded one thousand maximum/minimum data points, which were subsequently averaged. Ultimately, these measures provided us with a statistical approximation for the location of the high and low autocorrelation thresholds.

Fig. 4

Autocorrelation analysis for one thousand stochastic realizations of the SIS model. Note that the “high threshold” value in the legend represents the maximum autocorrelation value minus one standard deviation, while the “low threshold” represents the minimum autocorrelation value plus one standard deviation

To account for the inherent variation present in these simulations, we also obtained the mean standard deviation for the model, which was used to create “soft thresholds.” Thus, as we will see in the next section, the control was applied at the high autocorrelation point only when the autocorrelation rose above the mean maximum minus one standard deviation, and at the low autocorrelation point only when it fell below the mean minimum plus one standard deviation. We employ the soft thresholds in order to capture more time series that exhibit threshold behavior, as some realizations never reach or surpass the maximum/minimum thresholds.
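The thresholding procedure of this section can be sketched as follows. This is an illustration under stated assumptions: we take the “mean standard deviation” to be the sample standard deviation of each autocorrelation series averaged over the realizations, and the function name and toy data are hypothetical.

```python
import statistics


def soft_thresholds(ac_series_list):
    """Average per-realization statistics and build the soft thresholds."""
    mean_ac = statistics.fmean([statistics.fmean(s) for s in ac_series_list])
    mean_max = statistics.fmean([max(s) for s in ac_series_list])
    mean_min = statistics.fmean([min(s) for s in ac_series_list])
    # Assumption: per-series sample standard deviation, then averaged.
    mean_std = statistics.fmean([statistics.stdev(s) for s in ac_series_list])
    return {
        "mean": mean_ac,
        "high": mean_max - mean_std,  # high autocorrelation, low resilience
        "low": mean_min + mean_std,   # low autocorrelation, high resilience
    }


# Two toy autocorrelation series standing in for the one thousand realizations.
toy = [
    [0.1, 0.5, 0.5, 0.5, 0.9],
    [0.3, 0.5, 0.5, 0.5, 0.7],
]
print(soft_thresholds(toy))
```

Subtracting (or adding) one standard deviation moves each threshold toward the mean, so more realizations cross it.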

The thresholding results are summarized in Table 3 and Fig. 4. These threshold values provide a metric by which to determine how far we are willing to let the system evolve before the application of control measures. In short, we introduce vaccination/quarantine measures in the SIS model whenever the autocorrelation reaches or surpasses the aforementioned threshold values for high/low thresholds.

Table 3 Threshold values for autocorrelation for the SIS model

In general, for a new epidemic outbreak, one does not have vast troves of data available to determine the thresholds. However, it is possible to calculate the threshold value from the beginning of a single stochastic realization corresponding to a specific disease outbreak. One could also find the threshold values using data from previous or similar disease outbreaks.

4.4 Control mechanism: pulsing the population

From our previous discussion of EWS theory, we know that the time series statistics when far from the transition are very different from the statistics when close to the transition. This is reflected by the rise in autocorrelation, variance, and skewness, and inversely by a drop in the return rate. It is precisely at this point where we introduce control measures such as quarantine or vaccines in the SIS model.

The control can be thought of as either an indefinite quarantine or a mass vaccination event. One could also envision a scenario in which quarantined individuals recover and are given a vaccine after quarantine to prevent re-susceptibility. Either way, the “removed” individuals do not interact with the rest of the population (susceptibles or infectious) and do not contribute to the SI dynamics. The control causes a decrease in both the number of infectious individuals and the total population N since \(S+I=N\).
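A single application of the control can be sketched in Python as below. The function names are hypothetical, the threshold values in the usage example are placeholders, and truncating the number of removed individuals to an integer is our assumption.

```python
def control_due(autocorrelation, threshold, mode):
    """Check whether the monitored autocorrelation has reached the threshold.

    mode "high" targets the low-resilience state (high autocorrelation);
    mode "low" targets the high-resilience state (low autocorrelation).
    """
    if mode == "high":
        return autocorrelation >= threshold
    if mode == "low":
        return autocorrelation <= threshold
    raise ValueError("mode must be 'high' or 'low'")


def apply_pulse(S, I, fraction):
    """Remove a fraction of the infectious individuals from the population.

    Removed individuals leave the S-I dynamics entirely, so N = S + I
    decreases by the number removed. Returns (S, I_after, removed).
    """
    removed = min(I, int(fraction * I))  # assumption: truncate to an integer
    return S, I - removed, removed


# Placeholder state and threshold, for illustration only.
S, I = 60, 45
if control_due(autocorrelation=0.82, threshold=0.80, mode="high"):
    S, I, removed = apply_pulse(S, I, fraction=0.4)
    print("removed", removed, "infectious; new N =", S + I)
```

The susceptible count is left unchanged by the pulse, which is why the total population shrinks by exactly the number of individuals removed.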

Figure 5 demonstrates the effect of control when it is applied at the low and high resilience states. The top panel plot shows the result of a typical run of the stochastic SIS model unaffected by controls or outside measures. Even at time \(t=1200\), the infected population is still fluctuating around the endemic steady state.

Fig. 5

Time series for the SIS model. (Top) an unperturbed realization of the model. After \(10^{7}\) iterations, the epidemic continues to fluctuate about the endemic state (red). (Middle) the control is applied at the low-resilience state (29 of 46 infected individuals are removed), and extinction follows shortly after. (Bottom) the control is applied at the high-resilience state (29 of 45 infected individuals are removed); the system is able to resist the control, recover, and continue to fluctuate about the endemic state

Fig. 6

Time series for the SIS model. (Top) An unperturbed realization of the model where the disease undergoes extinction on its own. (Middle) The control is applied at the low-resilience state (28 of 38 infected individuals are removed), and extinction follows shortly after. (Bottom) The control is applied at the high-resilience state (28 of 37 infected individuals are removed). As in the previous case, the system is able to resist the control and recover. However, it eventually goes extinct at a much later time than if no control measure was applied (see Top panel)

Continuing our analysis of the metric-based indicators for critical transitions, for the middle plot we perform a second run of the top plot using the same random seed and parameters, where we monitor for high autocorrelation values corresponding to low resilience. Once this threshold is reached (as per the values outlined in Fig. 4), we remove a portion of the infected population, i.e., the control is applied. The infected population struggles to return to the deterministically stable endemic state and eventually goes extinct. Thus, applying the control at the low-resilience state, which corresponds to high autocorrelation, variance, and skewness and a decreased return rate, suffices to induce a critical transition to the extinct state.

For the bottom plot, we again repeat the same realization of the stochastic SIS model, but this time we monitor the metric indicators for signals corresponding to high resilience, namely low autocorrelation values. Here, the infected population behavior is much different; it is able to sustain the removal of individuals, effectively returning to fluctuate about the endemic state as it did before the control measure. Hence, consistent with our hypothesis, quarantining individuals at the high-resilience state (low autocorrelation, high return rate) fails to precipitate an early extinction event.

In addition to the individual realizations presented in Fig. 5, we performed the same analysis for one thousand realizations of the model utilizing the thresholds outlined in Sect. 4.3. By comparing the mean time to extinction (MTE) of the SIS realizations with and without control measures as well as with a blind test where the control was applied randomly, it is possible to compute a quantitative measure of the effectiveness of the EWS analysis.

Because not all of the individual SIS realizations reached extinction in the prescribed computational bounds (defined by us to be \(10^7\) random time steps), we separately compare “effective” versus “absolute” MTE. The effective MTE is calculated by averaging the individual extinction times only for the realizations that went extinct after the control was applied and before the maximum number of time steps was reached, whereas the absolute MTE is calculated by incorporating into the average those realizations that had not yet gone extinct despite the application of control (using the maximum time as an “extinction” time). These two types of MTE were computed for four possible scenarios: (i) no control is applied; (ii) the control is applied at the high resilience point; (iii) the control is applied at the low resilience point; and (iv) the control is applied at random. The results are provided in the following section (Sect. 5).
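The distinction between the two averages can be made explicit with a short Python sketch. The function name and the toy extinction times are hypothetical; realizations that had not gone extinct by the maximum time are marked with `None`.

```python
def mean_extinction_times(extinction_times, t_max):
    """Return (effective MTE, absolute MTE).

    extinction_times holds one entry per realization: the extinction time,
    or None if the realization was still endemic when the run ended.
    """
    extinct = [t for t in extinction_times if t is not None]
    n_surviving = len(extinction_times) - len(extinct)
    # Effective MTE: average over the realizations that went extinct.
    effective = sum(extinct) / len(extinct) if extinct else float("nan")
    # Absolute MTE: surviving realizations are counted with t_max.
    absolute = (sum(extinct) + t_max * n_surviving) / len(extinction_times)
    return effective, absolute


# Toy data: two extinctions and two realizations still endemic at t_max = 100.
effective, absolute = mean_extinction_times([10.0, 20.0, None, None], t_max=100.0)
print("effective MTE:", effective, "absolute MTE:", absolute)
```

Because surviving realizations enter the absolute MTE at the maximum time, the absolute MTE is a lower bound on the true mean and is never smaller than the effective MTE.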

For the purpose of our simulation, we opted to vaccinate/quarantine 40% of the infected population once the thresholds were met. While several removal factors were implemented (30% up to 75%), we note that in our simulations, 40% provided the best results. A lower fraction proves ineffective in driving the critical transition, while higher percentages move the system so far away from the endemic state that the system loses the capability to recover and a very rapid extinction occurs; this scenario does little to provide us with a clear understanding of the underlying dynamics and resiliency of the system. In essence, if the control is too small, it cannot override the governing dynamics of the current state, while if the control is too large, it overwhelms the governing dynamics without yielding any useful information.

Table 4 Effective mean extinction times for 1,000 realizations (years)

5 Results and discussion

As we have seen in Fig. 5, the results of our experiment are encouraging since we have shown that monitoring the system and attacking it with conventional control measures can lead to a significant reduction in the active cycle of the epidemic. Nevertheless, this approach needs to be employed with care.

For instance, the results presented in Fig. 6 paint a perplexing picture. Consistent with the case examined previously, the quarantining/immunization of infected individuals during weakened system states drives the epidemic to early extinction. However, application of the control at the high-resilience state exacerbates adverse conditions, ultimately causing an extension of the life cycle of the epidemic. This is at the very least counterintuitive, since reason dictates that when faced with the outbreak of some epidemic the best alternative is to act rather than sit idly by. But as this result shows, doing the right thing at the wrong time can lead to dire consequences.

5.1 SIS mean extinction time

After applying the controls to the indicated fraction of the population, we compared the results of applying the control at the high threshold, applying the control at the low threshold, and applying the control randomly (blind test) to the case of no control. Of the one thousand realizations, 894 met the high threshold (low resilience); early extinction was achieved in 235 of them, for an MTE of 1228.6 years.

On the other hand, the low threshold (high resilience) was met 770 times, reaching extinction only 134 times for an MTE of 1265.8 years. The results confirm our hypothesis: control at the high threshold led to extinction in 26.3% of the realizations in which it was applied (235 of 894), compared with 17.4% (134 of 770) at the low threshold, a difference of 8.9 percentage points, and it shortened the MTE by approximately 37 years.

As for the blind test, we applied the control at random. This resulted in 153 extinctions for an effective MTE of 1237 years. As expected, the random control falls between the results of applying the control at the low and high thresholds. We expected the control applied at the low resilience point to perform better than the random control, and indeed its MTE is shorter by approximately eight years. We also expected the control applied at the high resilience point to perform worse than the random control, and indeed its MTE is longer by approximately 28 years.

Note that regardless of the difference in MTE between the different scenarios, more extinctions were ultimately achieved by applying the control at the low resilience state versus applying the control at either the high resilience state or at random.

The numbers are far more dramatic when comparing the results of the high threshold control versus no control measures. To obtain the “no control” MTE, we computed the simulations for \({10^{11}}\) time steps. The results are outlined in Table 4.

5.2 Mean extinction time: effective versus absolute

We note that the values outlined in Table 4 represent the effective MTE, which were obtained by aggregating and averaging the final extinction times for only those trials that resulted in extinction post-control. Because those simulations were only computed for \({10^{7}}\) time steps, some realizations were still fluctuating around the endemic state at the time the simulation ended. Taking those into consideration and using the “no control” figures as a guide, we can easily calculate the absolute MTE. The results are outlined in Table 5, and are consistent with our hypothesis and the theory of EWS.

Table 5 Absolute mean extinction times for 1,000 realizations (years)

The results for the SIS model show that it is in our best interest to extract information from the system before we attempt to introduce controls, as we could ultimately be extending the same conditions that we are trying to eliminate. Furthermore, extensive simulations show that there is an advantage in introducing controls at the appropriate moment when trying to induce early extinctions. Lastly, the difference in MTE between the control and no control cases shows us that EWS analysis can be an effective tool in the fight against disease epidemics.

6 Conclusions and future work

Our results show that metric-based EWS analysis is able to capture critical changes in the system state under different conditions, calling to our attention the effectiveness of early warning signals theory to monitor the resilience of dynamic systems across a variety of scenarios.

In addition, the control methods explored, aided by the theory, provide us with the ability to explore novel ways to solve problems of great importance, where the focus is to move away from the prediction of transitions toward the control of outcomes. As we have seen, the results obtained can be of great value and hopefully they will serve as the launching platform for further studies in this area.

In the future, we seek to extend our work to include the study of additional dynamical models, thus gaining a deeper understanding of the theory of early warning signals and its applicability to real world problems. Also, further scaling of the models presented here to include larger populations, as well as large-scale automated testing, would enhance the robustness of the results achieved thus far. This is important because the simulations are computationally expensive, and at the moment long runs with large populations are prohibitive. These improvements would enable the application of this work to novel epidemic outbreaks.