2.1 The SIR Model

Kermack and McKendrick (1927) published a set of general equations (Breda et al. 2012) to better understand the dynamics of an infectious disease spreading through a susceptible population. Their motivation was:

“One of the most striking features in the study of epidemics is the difficulty of finding a causal factor which appears to be adequate to account for the magnitude of the frequent epidemics of disease which visit almost every population […] The problem may be summarized as follows: One (or more) infected person is introduced into a community of individuals, more or less susceptible to the disease in question. The disease spreads from the affected to the unaffected by contact infection. Each infected person runs through the course of his sickness, and finally is removed from the number of those who are sick, by recovery or by death. The chances of recovery or death vary from day to day during the course of his illness. The chances that the affected may convey infection to the unaffected are likewise dependent upon the stage of the sickness. As the epidemic spreads, the number of unaffected members of the community becomes reduced […] In the course of time the epidemic may come to an end. One of the most important problems in epidemiology is to ascertain whether this termination occurs only when no susceptible individuals are left, or whether the interplay of the various factors of infectivity, recovery and mortality, may result in termination, whilst many susceptible individuals are still present in the unaffected population.”

Following a general mathematical exposé, they suggested a set of pragmatic assumptions which lead to the standard SIR model of ordinary differential equations for the flow of hosts between Susceptible, Infectious, and Recovered compartments. In modern notation, their simplest set of equations is (Fig. 2.1):

$$\frac{dS}{dt} = \mu (N - S) - \beta I \frac{S}{N} \tag{2.1}$$
$$\frac{dI}{dt} = \beta I \frac{S}{N} - (\mu + \gamma) I \tag{2.2}$$
$$\frac{dR}{dt} = \gamma I - \mu R. \tag{2.3}$$
Fig. 2.1 The SIR flow diagram. Flows represent per capita flows from the donor compartments

The assumptions of Eqs. (2.1)–(2.3) are:

  • The infection circulates in a population of size N, with a per capita “background” death rate, μ, which is balanced by a birth rate μN. From the sum of Eqs. (2.1)–(2.3), dN∕dt = 0 and N = S + I + R is thus constant.

  • The infection causes acute morbidity (not mortality); that is, in this version of the SIR model we assume we can ignore disease-induced mortality. This is reasonable for certain infections like chickenpox, but certainly not for others like rabies, SARS, or Ebola.

  • Individuals are recruited directly into the susceptible class at birth (so we ignore perinatal maternal immunity).

  • Transmission of infection from infectious to susceptible individuals is controlled by a bilinear contact term \(\beta I \frac{S} {N}\). This stems from the assumption that the I infectious individuals are independently and randomly mixing with all other individuals, so the fraction S∕N of the encounters is with susceptible individuals; β is the contact rate times the probability of transmission given a contact between a susceptible and an infectious individual.

  • The chances of recovery or death are assumed not to change during the course of infection.

  • Infectiousness is assumed not to change during the course of infection.

  • Infected individuals move directly into the infectious class (as opposed to the SEIR model; see Sect. 3.7) and remain there for an average infectious period of 1∕γ (assuming μ ≪ γ; see Footnote 1).

  • The model assumes that recovered individuals are immune from reinfection for life.

The basic reproductive ratio (R0), defined as the expected number of secondary infections from a single index case in a completely susceptible population, is a very important quantity in epidemiology. Chapter 3 is entirely devoted to this quantity. For this simple SIR model \(R_{0} = \frac{\beta } {\gamma +\mu }\).
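As a small illustration of the formula, the following sketch evaluates R0 for the closed and open versions of the model. The helper name `basic_r0` is hypothetical, and the parameter values (weekly rates) are the ones used in the worked examples of this chapter.

```r
# Basic reproductive ratio for the simple SIR model: R0 = beta / (gamma + mu).
# `basic_r0` is a hypothetical helper name, not a function from any package.
basic_r0 <- function(beta, gamma, mu = 0) {
  beta / (gamma + mu)
}

beta  <- 2              # transmission rate (per week)
gamma <- 1 / 2          # recovery rate: a 2-week infectious period
mu    <- 1 / (50 * 52)  # background death rate: 50-year life expectancy

R0_closed <- basic_r0(beta, gamma)      # closed epidemic, mu = 0
R0_open   <- basic_r0(beta, gamma, mu)  # open epidemic with host turnover
print(c(closed = R0_closed, open = R0_open))
```

Because μ is tiny relative to γ, host turnover barely changes R0 in this example.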

2.2 Numerical Integration of the SIR Model

If there are no (or negligible) births and deaths during the duration of an epidemic (μ ≃ 0), it is commonly referred to as a closed epidemic. While it is occasionally possible to derive analytical solutions to systems of ODEs like Eqs. (2.1)–(2.3), we generally have to resort to numerical integration to predict the numbers over time. We will numerically integrate a variety of different models using the ode-function in the deSolve R-package. While the models differ, the basic recipe is generally the same: (1) define an R-function for the general system of equations, (2) specify the time points at which we want the integrator to save the state of the system, (3) provide values for the parameters, (4) give initial values for all state variables, and finally (5) invoke the R-function that does the integration.

Step 1: We define the function (often called the gradient function) for the equation system. The deSolve-package requires the function to take the following arguments: time (see Footnote 2), t; a vector with the values for the state variables (S, I, R), y; and parameter values (β, μ, γ, and N), parms:
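A gradient function consistent with Eqs. (2.1)–(2.3) and the argument order described above might look like the following sketch. The name `sirmod` is the one used later in this chapter; the body is written from the equations rather than copied from the original listing, and the state used for the quick check is an assumption.

```r
# Gradient function for the SIR model, Eqs. (2.1)-(2.3).
# Argument order follows the description above: time, state vector, parameters.
sirmod <- function(t, y, parms) {
  # State variables, stored in the order S, I, R
  S <- y[1]
  I <- y[2]
  R <- y[3]
  # Parameters, looked up by name
  beta  <- parms["beta"]
  mu    <- parms["mu"]
  gamma <- parms["gamma"]
  N     <- parms["N"]
  # Right-hand sides of the three equations
  dS <- mu * (N - S) - beta * S * I / N
  dI <- beta * S * I / N - (mu + gamma) * I
  dR <- gamma * I - mu * R
  # Return a list whose first element holds the derivatives
  list(unname(c(dS, dI, dR)))
}

# Quick check at an assumed state: the derivatives sum to zero,
# so N = S + I + R stays constant.
grad <- sirmod(0, c(S = 0.999, I = 0.001, R = 0),
               c(mu = 0, N = 1, beta = 2, gamma = 1/2))[[1]]
print(grad)
```

The time argument `t` is unused in the body because the model is autonomous, but the integrator still expects it in the signature.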

The ode-function solves differential equations numerically.

Steps 2–4: Specify the time points at which we want ode to record the states of the system (here we use 26 weeks with 10 time-increments per week as specified in the vector times), the parameter values (in this case as specified in the vector parms), and starting conditions (specified in start). In this case we model the fraction of individuals in each class, so we set N = 1, and consider a disease with an infectious period of 2 weeks (γ = 1∕2), no births or deaths (μ = 0) and a transmission rate of 2 (β = 2). For our starting conditions we assume that 0.1% of the initial population is infected and the remaining fraction is susceptible.

Step 5: Feed start values, times, the gradient function, and the parameter vector to the ode-function as suggested by args(ode) (see Footnote 3). For convenience we convert the output to a data frame (ode returns a matrix of class deSolve). The head-function shows the first rows of out (six by default), and round(, 3) rounds the numbers to three decimals.
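Steps 2 to 5 can be sketched as follows, assuming the deSolve package is installed. The gradient function is restated in compact form so that the block runs on its own; the parameter and start values are those given in the text.

```r
library(deSolve)  # provides ode()

# Compact gradient function for Eqs. (2.1)-(2.3)
sirmod <- function(t, y, parms) {
  with(as.list(c(y, parms)), {
    dS <- mu * (N - S) - beta * S * I / N
    dI <- beta * S * I / N - (mu + gamma) * I
    dR <- gamma * I - mu * R
    list(c(dS, dI, dR))
  })
}

# Steps 2-4: time points, parameters, and starting conditions
times <- seq(0, 26, by = 1/10)                    # 26 weeks, 10 increments per week
parms <- c(mu = 0, N = 1, beta = 2, gamma = 1/2)  # closed epidemic, fractions (N = 1)
start <- c(S = 0.999, I = 0.001, R = 0)           # 0.1% initially infected

# Step 5: integrate and convert the deSolve matrix to a data frame
out <- ode(y = start, times = times, func = sirmod, parms = parms)
out <- as.data.frame(out)
print(head(round(out, 3)))
```

The resulting data frame has one column for time and one for each state variable, named after the entries of `start`.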

We can plot the result (Fig. 2.2) to see that the model predicts an initial exponential growth of the epidemic that decelerates as susceptibles are depleted, and finally fades out as susceptible numbers become too low to sustain the chain of transmission.

R allows for a lot of customization of graphics; Rseek.org is a useful resource for finding solutions to all things R. Fig. 2.2 has some added features, such as a right-hand axis for the effective reproductive ratio (RE), the expected number of new cases per infected individual in a not completely susceptible population, and a legend, so that we can confirm that the turnover of the epidemic happens exactly when RE = R0s = 1, where s is the fraction of remaining susceptibles. The threshold R0s = 1 ⇒ s = 1∕R0 results in the powerful rule of thumb for vaccine-induced eradication and herd immunity: if we can, through vaccination, keep the susceptible fraction below 1∕R0 (that is, immunize more than a critical fraction pc = 1 − 1∕R0 of the population), then pathogen spread will dissipate and the pathogen will not be able to reinvade the host population (e.g., Anderson and May 1982; Roberts and Heesterbeek 1993; Ferguson et al. 2003). This rule of thumb appeared to work well for smallpox, the only vaccine-eradicated human disease; its R0 was commonly around 5, and most countries saw elimination once vaccine cover exceeded 80% (Anderson and May 1982). The actual code used to produce Fig. 2.2 is:
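The original listing is not reproduced here. The block below is only a sketch of how a comparable figure could be drawn with base graphics, assuming deSolve is installed; the colours, line types, and layout are illustrative choices, not those of the published figure.

```r
library(deSolve)  # provides ode()

sirmod <- function(t, y, parms) {
  with(as.list(c(y, parms)), {
    dS <- mu * (N - S) - beta * S * I / N
    dI <- beta * S * I / N - (mu + gamma) * I
    dR <- gamma * I - mu * R
    list(c(dS, dI, dR))
  })
}

times <- seq(0, 26, by = 1/10)
parms <- c(mu = 0, N = 1, beta = 2, gamma = 1/2)
start <- c(S = 0.999, I = 0.001, R = 0)
out <- as.data.frame(ode(y = start, times = times, func = sirmod, parms = parms))

# Effective reproductive ratio RE = R0 * s, and the time of peak prevalence
R0   <- unname(parms["beta"] / (parms["gamma"] + parms["mu"]))
RE   <- R0 * out$S
peak <- which.max(out$I)

op <- par(mar = c(5, 4, 2, 5))        # leave room for the right-hand axis
plot(out$time, out$S, type = "l", ylim = c(0, 1),
     xlab = "Time (weeks)", ylab = "Fraction")
lines(out$time, out$I, col = "red")
lines(out$time, out$R, col = "green")
par(new = TRUE)                       # overlay RE on its own scale
plot(out$time, RE, type = "l", lty = 2, axes = FALSE,
     xlab = "", ylab = "", ylim = c(0, R0))
axis(4)
mtext("RE", side = 4, line = 3)
abline(v = out$time[peak], lty = 3)   # epidemic turnover
abline(h = 1, lty = 3)                # threshold RE = 1
legend("right", legend = c("S", "I", "R", "RE"),
       lty = c(1, 1, 1, 2), col = c("black", "red", "green", "black"))
par(op)

# Critical vaccination coverage implied by the threshold
pc <- 1 - 1 / R0
print(c(RE_at_peak = RE[peak], pc = pc))
```

On the discrete time grid, RE at the prevalence peak is close to, but not exactly, 1; a finer grid narrows the gap.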

2.3 Final Epidemic Size

The closed epidemic model has two equilibria: {S = 1, I = 0, R = 0}, which is unstable when R0 > 1, and the {S*, I*, R*}-equilibrium, which reflects the final epidemic size. For the latter, I* = 0 because the epidemic eventually self-extinguishes in the absence of susceptible recruitment; S* is the fraction of susceptibles that escapes infection altogether; and R* is the final epidemic size, the fraction of susceptibles that will be infected before the epidemic self-extinguishes. For the closed epidemic, there is an exact mathematical solution to the final epidemic size (below). It is nevertheless useful to consider computational ways of finding equilibria in the absence of exact solutions.

The rootSolve-package will attempt to find equilibria of systems of differential equations through numerical integration. The function runsteady is really just a wrapper around the ode-function that integrates until the system settles on some steady state (if it exists). It takes essentially the same arguments as ode. By varying initial conditions rootSolve should find multiple stable equilibria if there is more than one stable solution (see Footnote 4).
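A minimal sketch of this approach, assuming rootSolve is installed: the time range is left at its default so that integration runs until a steady state is detected, and the near-disease-free start values are an assumption chosen for illustration.

```r
library(rootSolve)  # provides runsteady()

# Gradient function for Eqs. (2.1)-(2.3), using positional state indexing
sirmod <- function(t, y, parms) {
  S <- y[1]; I <- y[2]; R <- y[3]
  beta <- parms["beta"]; mu <- parms["mu"]
  gamma <- parms["gamma"]; N <- parms["N"]
  dS <- mu * (N - S) - beta * S * I / N
  dI <- beta * S * I / N - (mu + gamma) * I
  dR <- gamma * I - mu * R
  list(unname(c(dS, dI, dR)))
}

parms <- c(mu = 0, N = 1, beta = 2, gamma = 1/2)
start <- c(S = 1 - 1e-5, I = 1e-5, R = 0)   # assumed: a tiny initial infection

# Integrate until the derivatives are numerically zero
equil <- runsteady(y = start, func = sirmod, parms = parms)
final_state <- equil$y
print(round(final_state, 3))
```

The element `y` of the returned list holds the state at the last iteration, which here approximates the post-epidemic equilibrium.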

Fig. 2.2 The closed SIR epidemic with left and right axes and effective reproductive ratio, RE. The epidemic turns over at RE = 1

So for these parameters, 2% of susceptibles are expected to escape infection altogether and 98%—the final epidemic size—are expected to be infected during the course of the epidemic.

Let us explore numerically how the final epidemic size depends on R0. Recall that for the specific SIR variant we are working with R0 = β∕(γ + μ), and since we are studying the closed epidemic μ = 0. In the above example we assume an infectious period of 2 weeks (i.e., γ = 1∕2), so we may vary β so that R0 goes from 0.1 to 5. For moderate to large R0 this fraction has been shown to be approximately 1 − exp(−R0) (e.g., Anderson and May 1982). We can check how well this approximation holds (Fig. 2.3; see Footnote 5).

Fig. 2.3 The final epidemic size as a function of R0. The black line is the solution based on numerically integrating the closed epidemic, and the red line is the approximation f ≃ 1 − exp(−R0)

We see that the approximation is good for R0 > 2.5 but overestimates the final epidemic size for smaller R0 (and is terrible for R0 < 1).

For the closed epidemic SIR model, there is an exact mathematical solution to the fraction of susceptibles that escapes infection, x = 1 − f, where f is the final epidemic size. It is given by the implicit equation x = exp(−R0(1 − x)) or equivalently exp(−R0(1 − x)) − x = 0 (Swinton 1998). So we can also find the final size by applying the uniroot-function to the equation. The uniroot-function finds numerical solutions to equations with one unknown variable (which it passes as the first argument of the supplied function).
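The root-finding step can be sketched with base R as follows. The helper names `final_size_eq` and `final_size` are hypothetical, and the grid of R0 values is an illustrative assumption; the sketch only applies for R0 > 1, where a root below 1 exists.

```r
# Final-size relation for the closed epidemic: x is the fraction of
# susceptibles that escapes infection, so the final epidemic size is 1 - x.
final_size_eq <- function(x, R0) {
  exp(-R0 * (1 - x)) - x
}

# Hypothetical wrapper around uniroot(); the upper bound stays just below 1
# to exclude the trivial root x = 1 (no epidemic). Valid for R0 > 1 only.
final_size <- function(R0) {
  x <- uniroot(final_size_eq, lower = 0, upper = 1 - 1e-9,
               tol = 1e-9, R0 = R0)$root
  1 - x
}

R0_values   <- c(1.5, 2, 2.5, 3, 4, 5)
exact_size  <- sapply(R0_values, final_size)
approx_size <- 1 - exp(-R0_values)          # large-R0 approximation

print(round(data.frame(R0 = R0_values, exact = exact_size,
                       approx = approx_size,
                       error = approx_size - exact_size), 3))
```

The error column shrinks as R0 grows, in line with the comparison in Fig. 2.3.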

So for R0 = 2 the final epidemic size is 79.7% and the approximation is off by around 6.8 percentage points.

2.4 Open Epidemic

The open epidemic has recruitment of new susceptibles (i.e., μ > 0). As long as R0 > 1, the open epidemic has an “endemic equilibrium” where the pathogen and host coexist. If we use the SIR equations to model fractions (i.e., set N = 1), Eq. (2.2) of the SIR model implies that S* = (γ + μ)∕β = 1∕R0 is the endemic S-equilibrium, which when substituted into Eq. (2.1) gives I* = μ(R0 − 1)∕β, and finally R* = N − I* − S*, as the I and R endemic equilibria. We can study the predicted dynamics of the open epidemic using the sirmod-function. Let us assume a life expectancy of 50 years, a stable population size, and thus a weekly birth rate of μ = 1∕(50 ∗ 52). Let us further assume that 19% of the initial population is susceptible and 1% is infected, and numerically integrate the model for 50 years (Fig. 2.4).
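A sketch of this simulation, assuming deSolve is installed. The remaining 80% of the initial population is assumed to be recovered, and β and γ are carried over from the closed-epidemic example; both are assumptions of this sketch, as is the plot layout.

```r
library(deSolve)  # provides ode()

sirmod <- function(t, y, parms) {
  with(as.list(c(y, parms)), {
    dS <- mu * (N - S) - beta * S * I / N
    dI <- beta * S * I / N - (mu + gamma) * I
    dR <- gamma * I - mu * R
    list(c(dS, dI, dR))
  })
}

times <- seq(0, 52 * 50, by = 0.1)                          # 50 years, in weeks
parms <- c(mu = 1 / (50 * 52), N = 1, beta = 2, gamma = 1/2)
start <- c(S = 0.19, I = 0.01, R = 0.8)                     # R = 0.8 is assumed

out <- as.data.frame(ode(y = start, times = times, func = sirmod, parms = parms))

# (a) prevalence over time and (b) the trajectory in the S-I phase plane
op <- par(mfrow = c(1, 2))
plot(out$time, out$I, type = "l", xlab = "Time (weeks)", ylab = "Fraction infected")
plot(out$S, out$I, type = "l", xlab = "Fraction susceptible", ylab = "Fraction infected")
par(op)
```

Time is in weeks throughout, so the 50-year horizon corresponds to 2600 time units.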

Fig. 2.4 The open SIR epidemic. (a) The fraction infected over time. (b) The joint time series of infecteds and susceptibles in the S-I phase plane. The trajectory forms a counter-clockwise inwards spiral in the S-I plane (note that the 50-year simulation is not long enough for the system to reach the steady-state endemic equilibrium at the center of the spiral)

2.5 Phase Analyses

When working with dynamical systems we are often interested in studying the dynamics in the phase plane and deriving the isoclines that divide this plane into regions of increase and decrease of the various state variables. The phaseR package is a wrapper around ode that makes it easy to analyze 1D and 2D ODEs (see Footnote 6). The R-state in the SIR model does not influence the dynamics, so we can rewrite the SIR model as a 2D system.

The isoclines (sometimes called the nullclines) in this system are given by the solutions to the equations dS∕dt = 0 and dI∕dt = 0, and they partition the phase plane into regions where S and I are increasing and decreasing. For N = 1, the I-isocline is S = (γ + μ)∕β = 1∕R0 and the S-isocline is I = μ(1∕S − 1)∕β. We can draw these in the phase plane and add a simulated trajectory to the plot (Fig. 2.5). The trajectory cycles in a counter-clockwise dampened fashion towards the endemic equilibrium (Fig. 2.5). To visualize the expected change to the system at arbitrary points in the phase plane, we can further use the function flowField in the phaseR-package to superimpose predicted arrows of change (vectors).
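The isoclines can be drawn with base graphics alone. The sketch below uses the open-epidemic parameter values and a hand-rolled direction field drawn with arrows(); this is a stand-in for illustration and does not use phaseR or reproduce its flowField interface. Grid ranges and arrow lengths are arbitrary choices.

```r
# Parameters as in the open-epidemic example (weekly rates, N = 1)
mu <- 1 / (50 * 52); beta <- 2; gamma <- 1/2
R0 <- beta / (gamma + mu)

# Isoclines: dI/dt = 0 is the vertical line S = 1/R0,
# dS/dt = 0 is the curve I = mu * (1/S - 1) / beta
S_grid     <- seq(0.1, 0.5, length.out = 200)
S_isocline <- mu * (1 / S_grid - 1) / beta

# The isoclines cross at the endemic equilibrium
S_star <- 1 / R0
I_star <- mu * (1 / S_star - 1) / beta

x_range <- 0.4; y_range <- 0.003       # plot extents, used to scale arrows
plot(S_grid, S_isocline, type = "l", xlab = "S", ylab = "I",
     ylim = c(0, y_range))
abline(v = S_star, lty = 2)
points(S_star, I_star, pch = 19)

# Direction field on a coarse grid, with arrows normalized to equal length
grid <- expand.grid(S = seq(0.15, 0.45, by = 0.05),
                    I = seq(0.0005, 0.0025, by = 0.0005))
dS <- mu * (1 - grid$S) - beta * grid$S * grid$I
dI <- beta * grid$S * grid$I - (mu + gamma) * grid$I
len <- sqrt((dS / x_range)^2 + (dI / y_range)^2)
arrows(grid$S, grid$I,
       grid$S + 0.03 * x_range * (dS / x_range) / len,
       grid$I + 0.03 * y_range * (dI / y_range) / len,
       length = 0.04)
```

Normalizing the arrows shows direction only; the speed of change, which differs greatly between S and I, is deliberately discarded.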

Fig. 2.5 The S-I phase plane with isoclines and the predicted anti-clockwise trajectory towards the equilibrium

2.6 Stability and Periodicity

If we work with continuous-time ODE models like the SIR, equilibria are locally stable if (and only if) the real parts of all eigenvalues of the Jacobian matrix, when evaluated at the equilibrium, are smaller than zero. We will discuss stability and resonant periodicity in detail in Chap. 9, so this section is just a teaser. An equilibrium is (1) a node (i.e., all trajectories move monotonically towards/away from the equilibrium) if the largest eigenvalue is real, or (2) a focus (trajectories spiral towards or away from the equilibrium) if the largest eigenvalues are a conjugate pair of complex numbers (a ± bi) (see Footnote 7). For a focus the imaginary part determines the dampening period of the cycle according to 2π∕b. We can thus use the Jacobian matrix to study the SIR model’s equilibria. If we let F = dS∕dt = μ(N − S) − βSI∕N and G = dI∕dt = βSI∕N − (μ + γ)I, the Jacobian of the SIR system is

$$J = \left(\begin{array}{cc} \frac{\partial F}{\partial S} & \frac{\partial F}{\partial I} \\ \frac{\partial G}{\partial S} & \frac{\partial G}{\partial I} \end{array}\right), \tag{2.4}$$

and the two equilibria are the disease-free equilibrium and the endemic equilibrium as defined above.

R can help with all of this. We first calculate the equilibria:
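A sketch of the equilibrium calculation for N = 1, using the open-epidemic parameter values. The helper name `sir_equilibria` is hypothetical.

```r
parms <- c(mu = 1 / (50 * 52), N = 1, beta = 2, gamma = 1/2)

# Hypothetical helper returning both equilibria for the N = 1 case
sir_equilibria <- function(parms) {
  with(as.list(parms), {
    R0     <- beta / (gamma + mu)
    S_star <- (gamma + mu) / beta      # = 1 / R0
    I_star <- mu * (R0 - 1) / beta
    list(disease_free = c(S = 1, I = 0, R = 0),
         endemic      = c(S = S_star, I = I_star, R = 1 - S_star - I_star))
  })
}

eq <- sir_equilibria(parms)
print(lapply(eq, round, 5))

# Sanity check: both gradients vanish at the endemic equilibrium
S <- eq$endemic["S"]; I <- eq$endemic["I"]
dS_at_eq <- unname(parms["mu"] * (1 - S) - parms["beta"] * S * I)
dI_at_eq <- unname(parms["beta"] * S * I - (parms["mu"] + parms["gamma"]) * I)
```

With these values about a quarter of the population is susceptible at the endemic equilibrium, while fewer than one in a thousand is infected.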

We then calculate the elements of the Jacobian using R’s D-function:

We pass the values for S and I in the eq1-list to the Jacobian (see Footnote 8), and use the eigen-function to calculate the eigenvalues:

For the endemic equilibrium, the eigenvalues are a pair of complex conjugates whose real parts are negative, so it is a stable focus. The period of the inwards spiral is:
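The steps from Jacobian to period can be sketched in base R as follows, for the endemic equilibrium with N = 1. The object names (`dS_expr`, `vals`, and so on) are illustrative and differ from the eq1-list mentioned above.

```r
parms <- list(mu = 1 / (50 * 52), N = 1, beta = 2, gamma = 1/2)

# Right-hand sides of dS/dt and dI/dt as unevaluated expressions
dS_expr <- expression(mu * (N - S) - beta * S * I / N)
dI_expr <- expression(beta * S * I / N - (mu + gamma) * I)

# Symbolic partial derivatives: the four elements of the Jacobian
j11 <- D(dS_expr, "S"); j12 <- D(dS_expr, "I")
j21 <- D(dI_expr, "S"); j22 <- D(dI_expr, "I")

# Endemic equilibrium for N = 1
S_star <- with(parms, (gamma + mu) / beta)
I_star <- with(parms, mu * (beta / (gamma + mu) - 1) / beta)
vals   <- c(parms, list(S = S_star, I = I_star))

# Evaluate the Jacobian at the equilibrium and take its eigenvalues
J <- matrix(c(eval(j11, vals), eval(j12, vals),
              eval(j21, vals), eval(j22, vals)),
            nrow = 2, byrow = TRUE)
ev <- eigen(J)$values

# Period of the damped oscillation: 2 * pi / b for eigenvalues a +/- bi
period <- 2 * pi / abs(Im(ev[1]))
print(ev)
print(period)
```

The period is in weeks because all rates are weekly.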

So with these parameters the dampening period is predicted to be 261 weeks (just over 5 years). Thus, during disease invasion we expect this system to exhibit initial outbreaks every 5 years. A further significance of this number is that if the system is stochastically perturbed by, say, environmental variability affecting transmission, we expect the system to exhibit low amplitude “phase-forgetting” cycles (Nisbet and Gurney 1982) with approximately this period in the long run (see Chap. 9). We can make more accurate calculations of the stochastic system using transfer functions (Nisbet and Gurney 1982; Priestley 1981). We will revisit this slightly more advanced topic in Sect. 9.7.

The same protocol can be used on the disease-free equilibrium {S = 1, I = 0}.

The eigenvalues are strictly real and the largest value is greater than zero, so it is an unstable node (a “saddle”); the epidemic trajectory is predicted to move monotonically away from this disease-free equilibrium if infection is introduced into the system. This makes sense because with the parameter values used, R0 ≈ 4.0, which is greater than the invasion-threshold value of 1.
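At the disease-free equilibrium the Jacobian is simple enough to write down by hand, which gives a quick cross-check. This is a sketch for N = 1 with the open-epidemic parameter values; the matrix entries follow from differentiating F and G as defined above and setting S = 1, I = 0.

```r
mu <- 1 / (50 * 52); beta <- 2; gamma <- 1/2
R0 <- beta / (gamma + mu)

# Jacobian of (dS/dt, dI/dt) at S = 1, I = 0 with N = 1:
#   dF/dS = -mu            dF/dI = -beta
#   dG/dS = 0              dG/dI = beta - (mu + gamma)
J_dfe <- matrix(c(-mu, -beta,
                    0, beta - (mu + gamma)),
                nrow = 2, byrow = TRUE)
ev_dfe <- eigen(J_dfe)$values
print(ev_dfe)

# The matrix is triangular, so the eigenvalues are its diagonal entries;
# the positive one equals (gamma + mu) * (R0 - 1).
invasion_rate <- (gamma + mu) * (R0 - 1)
```

The positive eigenvalue is the initial exponential growth rate of the infection, and it is positive exactly when R0 > 1.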

2.7 Advanced: More Realistic Infectious Periods

The S(E)IR-type differential equation models assume that the rates of exit from the infectious classes are constant. The implicit assumption is thus that the infectious period is exponentially distributed among infected individuals; the average infectious period is 1∕(γ + μ), but an exponential fraction is infectious much shorter/longer than this. The chain-binomial model (see Sect. 3.4), in contrast, assumes that everybody is infectious for a fixed period and then all instantaneously recover (or die). These assumptions are mathematically convenient, but in reality neither is particularly realistic. Hope-Simpson (1952) traced the chains of transmission of measles in multi-sibling households. The timing of secondary and tertiary cases was analyzed in detail by Bailey (1956) and Bailey and Alff-Steinberger (1970). The average latent and infectious periods were calculated to be 8.58 and 6.57 days, respectively. While the distributions around each of these averages were not estimated separately (the latent period was assumed to be distributed and the infectious period assumed fixed), the variance around the roughly fortnight period of infection was estimated to be 3.13. The mean duration of infection is thus 15.15 days with a standard deviation of 1.77 (Fig. 2.6). So neither a fixed nor an exponential distribution is very accurate (Keeling and Grenfell 1997; Lloyd 2001).

Fig. 2.6 Gamma distributed infectious periods: (a) The predicted infectious period distribution based on a Gamma distribution with shape u = 1, 5, 25, 100, and 100,000; u = 1 corresponds to the exponential distribution implicit in the standard SIR model; the bold line (u = 73) is the one corresponding to the variance observed in Hope-Simpson’s (1952) study of measles. The dotted line (virtually indistinguishable from the u = 100) is a Gaussian distribution intended to show that when u is large the Gamma distribution converges on the Gaussian. (b) The probability of still being infectious as a function of time for the different distributions; as u becomes large, the distribution converges on a fixed infectious period. Note that the empirical distribution (bold) is quite different from the exponential
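The shape value u = 73 follows from matching the mean and variance quoted above, since a Gamma distribution with shape u has a squared coefficient of variation of 1∕u. The sketch below does this moment matching in base R and draws two of the curves; the time grid and plot styling are illustrative assumptions.

```r
mean_period <- 15.15   # days, latent plus infectious period
var_period  <- 3.13    # variance, in days squared

# Moment matching: CV^2 = variance / mean^2 = 1 / shape
shape_exact <- mean_period^2 / var_period
shape <- round(shape_exact)            # integer shape
rate  <- shape / mean_period           # keeps the mean at 15.15 days

days <- seq(0, 40, by = 0.1)
dens_measles <- dgamma(days, shape = shape, rate = rate)
dens_exp     <- dgamma(days, shape = 1, rate = 1 / mean_period)

# Probability of still being infectious after a given number of days
surv_measles <- pgamma(days, shape = shape, rate = rate, lower.tail = FALSE)
surv_exp     <- pgamma(days, shape = 1, rate = 1 / mean_period, lower.tail = FALSE)

op <- par(mfrow = c(1, 2))
plot(days, dens_measles, type = "l", lwd = 2, xlab = "Day", ylab = "Density")
lines(days, dens_exp, lty = 2)
plot(days, surv_measles, type = "l", lwd = 2, xlab = "Day",
     ylab = "P(still infectious)", ylim = c(0, 1))
lines(days, surv_exp, lty = 2)
par(op)

print(c(shape_exact = shape_exact, shape = shape, sd = sqrt(var_period)))
```

Under the exponential assumption more than a third of individuals would still be infectious at the mean duration, whereas the fitted Gamma distribution concentrates recovery within a few days of the mean.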

Kermack and McKendrick’s (1927) original model allows for arbitrary infectious-period distributions. We can write Kermack and McKendrick’s original equations as “renewal equations” (Breda et al. 2012), introducing the additional notation of k(t) being the (instantaneous) incidence at time t (i.e., flux into the I-class at time t).

$$\frac{dS}{dt} = \mu (N - S) - k(t) \tag{2.5}$$
$$k(t) = \beta I \frac{S}{N} \tag{2.6}$$
$$\frac{dI}{dt} = k(t) - \mu I - \int_{0}^{\infty} \frac{h(\tau)}{1 - H(\tau)} k(t - \tau)\, d\tau \tag{2.7}$$
$$\frac{dR}{dt} = \int_{0}^{\infty} \frac{h(\tau)}{1 - H(\tau)} k(t - \tau)\, d\tau - \mu R, \tag{2.8}$$

where k(t − τ) is the number of individuals that were infected τ time units ago, h(τ) is the probability of recovering on infection-day τ, and H(τ) is the cumulative probability of having recovered by infection-day τ; k(t − τ)∕(1 − H(τ)) is thus the fraction of individuals infected at time t − τ that still remains in the infected class on day t, and the integral is over all previous infections so as to quantify the total flux into the removed class at time t. Though intuitive, these general integro-differential equations (Eqs. (2.5)–(2.8)) are not easy to work with in general. For a restricted set of distributions for the h()-function, however, namely the Erlang distribution (the Gamma distribution with an integer shape parameter), the model can be numerically integrated using a “Gamma-chain” model (referred to as “linear chain trickery” by Metz and Diekmann 1991) of coupled ordinary differential equations (e.g., Blythe et al. 1984; de Valpine et al. 2014; Bjørnstad et al. 2016). The trick is to separate any distributed-delay compartment into u sub-compartments through which individuals pass at a rate overallrate ∗ u. The resultant infectious period will have a mean of 1∕overallrate and a coefficient-of-variation of \(1/\sqrt{u}\).

We can write a chain-SIR model to simulate SIR flows with more realistic infectious period distributions (see Footnote 9):
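One way to write such a gradient function is sketched below. The name `chain_sir`, the state layout (S, then u infectious stages, then R), and the passing of u through the parameter vector are assumptions of this sketch rather than the original listing.

```r
# Chain-SIR gradient: the infectious class is split into u stages, each
# left at rate u * gamma, so the mean infectious period stays 1 / gamma.
# State vector layout (assumed): y = c(S, I_1, ..., I_u, R).
chain_sir <- function(t, y, parms) {
  with(as.list(parms), {
    S <- y[1]
    I <- y[2:(u + 1)]
    R <- y[u + 2]
    incidence  <- beta * S * sum(I) / N      # all stages are infectious
    stage_rate <- u * gamma
    # Inflow to stage 1 is new infections; later stages are fed by the
    # stage before them
    inflow <- c(incidence, stage_rate * I[-u])
    dS <- mu * (N - S) - incidence
    dI <- inflow - (mu + stage_rate) * I
    dR <- stage_rate * I[u] - mu * R
    list(c(dS, dI, dR))
  })
}

base_parms <- c(mu = 1 / (50 * 52), N = 1, beta = 2, gamma = 1/2)

# Four infectious stages, with the 1% infected spread evenly across them
y4    <- c(0.19, rep(0.01 / 4, 4), 0.8)
grad4 <- chain_sir(0, y4, c(base_parms, u = 4))[[1]]

# With u = 1 the model reduces to the standard SIR gradient
grad1 <- chain_sir(0, c(0.19, 0.01, 0.8), c(base_parms, u = 1))[[1]]
print(grad4)
print(grad1)
```

To simulate, a function like this can be passed to an ODE integrator such as deSolve's ode together with a start vector of length u + 2.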

We can compare the predicted dynamics of the simple SIR model with the u = 2 chain model, the u = 500 chain model (which is effectively the fixed-period delayed-differential model) and the “measles-realistic” u = 73 model.

The narrower the infectious-period distribution, the more punctuated the predicted epidemics. However, infectious-period narrowing alone cannot sustain recurrent epidemics; in the absence of stochastic or seasonal forcing, epidemics will dampen to the endemic equilibrium (though the damping period is slightly accelerated and the convergence on the equilibrium is slightly slower with narrowing infectious-period distributions) (Fig. 2.7).

Fig. 2.7 Chain-SIR models with different infectious period distributions

In the above we considered non-exponential infectious-period distributions. However, the general ODE chain method can be used for any compartment. Lavine et al. (2011), for example, used it to model non-exponential waning of natural and vaccine-induced immunity to whooping cough.

2.8 ShinyApp

The following code will launch a local shinyApp of the SIR model in your local browser. This App can also be launched by calling SIR.app in the epimdr-package. Several of the subsequent chapters also have associated shinyApps. Those will only be accessible from the package (because the code is long and a bit tedious). We quote an annotated version of the SIR.app in full.