This chapter uses the following R-packages: deSolve, rootSolve, phaseR, and shiny.
A conceptual understanding of reproductive ratios and the closed epidemic is useful prior to this discussion. Five-minute introductions from the Epidemics MOOC can be watched on YouTube:
Reproductive number https://www.youtube.com/watch?v=ju26rvzfFg4.
Closed epidemic https://www.youtube.com/watch?v=sSLfrSSmJZM.
2.1 The SIR Model
Kermack and McKendrick (1927) published a set of general equations (Breda et al. 2012) to better understand the dynamics of an infectious disease spreading through a susceptible population. Their motivation was:
“One of the most striking features in the study of epidemics is the difficulty of finding a causal factor which appears to be adequate to account for the magnitude of the frequent epidemics of disease which visit almost every population […] The problem may be summarized as follows: One (or more) infected person is introduced into a community of individuals, more or less susceptible to the disease in question. The disease spreads from the affected to the unaffected by contact infection. Each infected person runs through the course of his sickness, and finally is removed from the number of those who are sick, by recovery or by death. The chances of recovery or death vary from day to day during the course of his illness. The chances that the affected may convey infection to the unaffected are likewise dependent upon the stage of the sickness. As the epidemic spreads, the number of unaffected members of the community becomes reduced […] In the course of time the epidemic may come to an end. One of the most important problems in epidemiology is to ascertain whether this termination occurs only when no susceptible individuals are left, or whether the interplay of the various factors of infectivity, recovery and mortality, may result in termination, whilst many susceptible individuals are still present in the unaffected population.”
Following a general mathematical exposé, they suggested a set of pragmatic assumptions that lead to the standard SIR model of ordinary differential equations for the flow of hosts between Susceptible, Infectious, and Recovered compartments. In modern notation, their simplest set of equations is (Fig. 2.1):
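Written out so as to agree with the definitions of F = dS∕dt and G = dI∕dt used in Sect. 2.6, and with the constant population size stated in the assumptions, the three equations take the following form. The recovered-class equation is inferred from the requirement that the three rates sum to zero.

\[
\begin{aligned}
\frac{dS}{dt} &= \mu (N - S) - \beta I \frac{S}{N}, &\quad (2.1)\\
\frac{dI}{dt} &= \beta I \frac{S}{N} - (\mu + \gamma) I, &\quad (2.2)\\
\frac{dR}{dt} &= \gamma I - \mu R. &\quad (2.3)
\end{aligned}
\]

Summing the right-hand sides gives \(\mu (N - S - I - R) = 0\), so the total population is conserved.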
The assumptions of Eqs. (2.1)–(2.3) are:
- The infection circulates in a population of size N, with a per capita “background” death rate, μ, which is balanced by a birth rate μN. From the sum of Eqs. (2.1)–(2.3), dN∕dt = 0 and N = S + I + R is thus constant.
- The infection causes acute morbidity (not mortality); that is, in this version of the SIR model we assume we can ignore disease-induced mortality. This is reasonable for certain infections like chickenpox, but certainly not for others like rabies, SARS, or Ebola.
- Individuals are recruited directly into the susceptible class at birth (so we ignore perinatal maternal immunity).
- Transmission of infection from infectious to susceptible individuals is controlled by a bilinear contact term \(\beta I \frac{S} {N}\). This stems from the assumption that the I infectious individuals are independently and randomly mixing with all other individuals, so the fraction S∕N of the encounters is with susceptible individuals; β is the contact rate times the probability of transmission given a contact between a susceptible and an infectious individual.
- The chance of recovery or death is assumed not to change during the course of infection.
- Infectiousness is assumed not to change during the course of infection.
- Infected individuals move directly into the infectious class (as opposed to the SEIR model; see Sect. 3.7) and remain there for an average infectious period of 1∕γ (assuming μ ≪ γ; see Note 1).
- The model assumes that recovered individuals are immune from reinfection for life.
The basic reproductive ratio (R0), defined as the expected number of secondary infections from a single index case in a completely susceptible population, is a very important quantity in epidemiology. Chapter 3 is entirely devoted to this quantity. For this simple SIR model \(R_{0} = \frac{\beta } {\gamma +\mu }\).
2.2 Numerical Integration of the SIR Model
If there are no (or negligible) births and deaths during the course of an epidemic (μ ≃ 0), it is commonly referred to as a closed epidemic. While it is occasionally possible to derive analytical solutions to systems of ODEs like Eqs. (2.1)–(2.3), we generally have to resort to numerical integration to predict the numbers over time. We will numerically integrate a variety of different models using the ode-function in the deSolve R-package. While the models differ, the basic recipe is generally the same: (1) define an R-function for the general system of equations, (2) specify the time points at which we want the integrator to save the state of the system, (3) provide values for the parameters, (4) give initial values for all state variables, and finally (5) invoke the R-function that does the integration.
Step 1: We define the function (often called the gradient function) for the equation system. The deSolve-package requires the function to take the following arguments: time, t (see Note 2); a vector with the values for the state variables (S, I, R), y; and parameter values (β, μ, γ, and N), parms:
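A minimal sketch of such a gradient function, assuming the state vector is ordered (S, I, R) and the parameter vector carries the names mu, N, beta, and gamma. The parameters are pulled out explicitly by name to keep each line of the equations readable; this is one possible layout, not necessarily the original listing.

```r
# Gradient function for the SIR model.
# The argument order (t, y, parms) is the one deSolve's ode() expects.
sirmod <- function(t, y, parms) {
  # State variables, assumed to be ordered S, I, R
  S <- y[[1]]; I <- y[[2]]; R <- y[[3]]
  # Parameters, looked up by name
  beta <- parms[["beta"]]; mu <- parms[["mu"]]
  gamma <- parms[["gamma"]]; N <- parms[["N"]]
  # Rates of change
  dS <- mu * (N - S) - beta * S * I / N
  dI <- beta * S * I / N - (mu + gamma) * I
  dR <- gamma * I - mu * R
  # ode() expects a list whose first element is the vector of derivatives
  list(c(dS, dI, dR))
}

# Evaluate the gradient once at an illustrative state
grad0 <- sirmod(t = 0, y = c(S = 0.999, I = 0.001, R = 0),
                parms = c(mu = 0, N = 1, beta = 2, gamma = 1/2))[[1]]
grad0
```

The time argument t is unused here because none of the parameters depend on time, but it must still be part of the signature.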
The ode-function solves differential equations numerically.
Steps 2–4: Specify the time points at which we want ode to record the states of the system (here we use 26 weeks with 10 time-increments per week, as specified in the vector times), the parameter values (in this case as specified in the vector parms), and the starting conditions (specified in start). In this case we model the fraction of individuals in each class, so we set N = 1, and consider a disease with an infectious period of 2 weeks (γ = 1∕2), no births or deaths (μ = 0), and a transmission rate of 2 (β = 2). For our starting conditions we assume that 0.1% of the initial population is infected and the remaining fraction is susceptible.
Step 5: Feed start values, times, the gradient function, and the parameter vector to the ode-function as suggested by args(ode) (see Note 3). For convenience we convert the output to a data frame (ode returns a matrix of class deSolve). The head-function shows the first rows of out (six by default), and round(,3) rounds the numbers to three decimals.
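The five steps can be put together as in the following sketch. It uses the parameter values and starting conditions stated in the text; the layout of the gradient function and the object names are assumptions of this example.

```r
library(deSolve)

# Step 1: gradient function for the SIR model
sirmod <- function(t, y, parms) {
  S <- y[[1]]; I <- y[[2]]; R <- y[[3]]
  beta <- parms[["beta"]]; mu <- parms[["mu"]]
  gamma <- parms[["gamma"]]; N <- parms[["N"]]
  dS <- mu * (N - S) - beta * S * I / N
  dI <- beta * S * I / N - (mu + gamma) * I
  dR <- gamma * I - mu * R
  list(c(dS, dI, dR))
}

# Step 2: 26 weeks with 10 time-increments per week
times <- seq(0, 26, by = 1/10)
# Step 3: closed epidemic, fractions (N = 1), 2-week infectious period
parms <- c(mu = 0, N = 1, beta = 2, gamma = 1/2)
# Step 4: 0.1% initially infected, the rest susceptible
start <- c(S = 0.999, I = 0.001, R = 0)

# Step 5: integrate and convert the deSolve matrix to a data frame
out <- ode(y = start, times = times, func = sirmod, parms = parms)
out <- as.data.frame(out)
head(round(out, 3))
```

The resulting data frame has one row per requested time point and the columns time, S, I, and R.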
We can plot the result (Fig. 2.2) to see that the model predicts an initial exponential growth of the epidemic that decelerates as susceptibles are depleted, and finally fades out as susceptible numbers become too low to sustain the chain of transmission.
R allows for a lot of customization of graphics; Rseek.org is a useful resource for finding solutions to all things R. Fig. 2.2 has some added features, such as a right-hand axis for the effective reproductive ratio (RE), the expected number of new cases per infected individual in a population that is not completely susceptible, and a legend, so that we can confirm that the turnover of the epidemic happens exactly when RE = R0s = 1, where s is the fraction of remaining susceptibles. The threshold R0s = 1 ⇒ s∗ = 1∕R0 results in the powerful rule of thumb for vaccine-induced eradication and herd immunity: if we can, through vaccination, keep the susceptible fraction below s∗ = 1∕R0 (that is, keep the immunized fraction above the critical value pc = 1 − 1∕R0), then pathogen spread will dissipate and the pathogen will not be able to reinvade the host population (e.g., Anderson and May 1982; Roberts and Heesterbeek 1993; Ferguson et al. 2003). This rule of thumb appeared to work well for smallpox, the only vaccine-eradicated human disease; its R0 was commonly around 5, and most countries saw elimination once vaccine cover exceeded 80% (Anderson and May 1982). The actual code used to produce Fig. 2.2 is:
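A sketch of a figure of this kind, which may differ in detail from the code behind Fig. 2.2. It assumes the closed-epidemic parameters used above and draws RE = R0·S on a second, right-hand axis with base graphics; colours, margins, and axis limits are choices made for this example.

```r
library(deSolve)

sirmod <- function(t, y, parms) {
  S <- y[[1]]; I <- y[[2]]; R <- y[[3]]
  beta <- parms[["beta"]]; mu <- parms[["mu"]]
  gamma <- parms[["gamma"]]; N <- parms[["N"]]
  list(c(mu * (N - S) - beta * S * I / N,
         beta * S * I / N - (mu + gamma) * I,
         gamma * I - mu * R))
}
parms <- c(mu = 0, N = 1, beta = 2, gamma = 1/2)
out <- as.data.frame(ode(y = c(S = 0.999, I = 0.001, R = 0),
                         times = seq(0, 26, by = 1/10),
                         func = sirmod, parms = parms))

R0 <- parms[["beta"]] / (parms[["gamma"]] + parms[["mu"]])
peak <- which.max(out$I)               # index of the epidemic peak

par(mar = c(5, 5, 2, 5))               # room for a right-hand axis
plot(out$time, out$S, type = "l", ylim = c(0, 1),
     xlab = "Time (weeks)", ylab = "Fraction")
lines(out$time, out$I, col = "red")
lines(out$time, out$R, col = "green")
abline(v = out$time[peak], lty = 3)    # time of peak prevalence

par(new = TRUE)                        # overlay RE on its own scale
plot(out$time, R0 * out$S, type = "l", lty = 2, lwd = 2,
     axes = FALSE, xlab = NA, ylab = NA, ylim = c(0, R0))
abline(h = 1, lty = 3)                 # threshold RE = 1
axis(side = 4)
mtext(side = 4, line = 3, expression(R[E]))
legend("right", legend = c("S", "I", "R", expression(R[E])),
       lty = c(1, 1, 1, 2), col = c("black", "red", "green", "black"))

# RE at the time of peak prevalence should be close to 1
RE_at_peak <- R0 * out$S[peak]
RE_at_peak
```

The vertical and horizontal dotted lines cross where the I curve turns over, which is the visual check that the epidemic peaks when RE = 1.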
2.3 Final Epidemic Size
The closed epidemic model has two equilibria: the disease-free equilibrium {S = 1, I = 0, R = 0}, which is unstable when R0 > 1, and the {S∗, I∗, R∗}-equilibrium, which reflects the final epidemic size. Here I∗ = 0, because the epidemic eventually self-extinguishes in the absence of susceptible recruitment; S∗ is the fraction of susceptibles that escape infection altogether; and R∗ is the final epidemic size, the fraction of susceptibles that will be infected before the epidemic self-extinguishes. For the closed epidemic, there is an exact mathematical solution to the final epidemic size (below). It is nevertheless useful to consider computational ways of finding equilibria in the absence of exact solutions.
The rootSolve-package will attempt to find equilibria of systems of differential equations through numerical integration. Its function runsteady is really just a wrapper around the ode-function that integrates until the system settles on some steady state (if it exists). It takes essentially the same arguments as ode (y, func, and parms, with a two-valued time argument giving the initial and final time in place of times). By varying initial conditions, rootSolve should find multiple stable equilibria if there is more than one stable solution (see Note 4).
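A sketch of how runsteady might be called for the closed epidemic with β = 2 and γ = 1∕2. The initial infected fraction of 1e-5 and the upper time limit of 1e5 weeks are assumptions of this example.

```r
library(rootSolve)

sirmod <- function(t, y, parms) {
  S <- y[[1]]; I <- y[[2]]; R <- y[[3]]
  beta <- parms[["beta"]]; mu <- parms[["mu"]]
  gamma <- parms[["gamma"]]; N <- parms[["N"]]
  list(c(mu * (N - S) - beta * S * I / N,
         beta * S * I / N - (mu + gamma) * I,
         gamma * I - mu * R))
}
parms <- c(mu = 0, N = 1, beta = 2, gamma = 1/2)

# Integrate from a nearly disease-free state until the system stops changing
equil <- runsteady(y = c(S = 1 - 1e-5, I = 1e-5, R = 0),
                   time = c(0, 1e5), func = sirmod, parms = parms)
# Steady-state values of S, I, and R
final_state <- round(equil$y, 3)
final_state
```

The third element of the returned state vector is the final epidemic size.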
So for these parameters, 2% of susceptibles are expected to escape infection altogether and 98%—the final epidemic size—are expected to be infected during the course of the epidemic.
Let us explore numerically how the final epidemic size depends on R0. Recall that for the specific SIR variant we are working with, R0 = β∕(γ + μ), and since we are studying the closed epidemic, μ = 0. In the above example we assume an infectious period of 2 weeks (i.e., γ = 1∕2), so we may vary β such that R0 goes from 0.1 to 5 (see Note 5). For moderate to large R0 the final epidemic size has been shown to be approximately 1 − exp(−R0) (e.g., Anderson and May 1982). We can check how well this approximation holds (Fig. 2.3).
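The comparison can be sketched with a for-loop over 50 values of R0, as follows. The grid of R0 values, the initial conditions, and the plotting choices are assumptions of this example.

```r
library(rootSolve)

sirmod <- function(t, y, parms) {
  S <- y[[1]]; I <- y[[2]]; R <- y[[3]]
  beta <- parms[["beta"]]; mu <- parms[["mu"]]
  gamma <- parms[["gamma"]]; N <- parms[["N"]]
  list(c(mu * (N - S) - beta * S * I / N,
         beta * S * I / N - (mu + gamma) * I,
         gamma * I - mu * R))
}

R0 <- seq(0.1, 5, length.out = 50)   # candidate values of R0
betas <- R0 * 1/2                    # beta = R0 * gamma, with gamma = 1/2 and mu = 0
f <- rep(NA, 50)                     # vector to hold the final epidemic sizes

for (i in seq_along(betas)) {
  equil <- runsteady(y = c(S = 1 - 1e-5, I = 1e-5, R = 0),
                     time = c(0, 1e5), func = sirmod,
                     parms = c(mu = 0, N = 1, beta = betas[i], gamma = 1/2))
  f[i] <- equil$y[3]                 # R at steady state is the final size
}

plot(R0, f, type = "l", xlab = expression(R[0]),
     ylab = "Final epidemic size")
curve(1 - exp(-x), from = 1, to = 5, add = TRUE, col = "red")
legend("bottomright", legend = c("Numerical", "1 - exp(-R0)"),
       lty = 1, col = c("black", "red"))
```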
We see that the approximation is good for R0 > 2.5 but overestimates the final epidemic size for smaller R0 (and is terrible for R0 < 1).
For the closed epidemic SIR model, there is an exact mathematical solution to the fraction of susceptibles that escapes infection, f, given by the implicit equation f = exp(−R0(1 − f)) or equivalently exp(−R0(1 − f)) − f = 0 (Swinton 1998); the final epidemic size is then 1 − f. So we can also find the final size by applying the uniroot-function to the equation. The uniroot-function finds numerical solutions to equations with one unknown variable (which has to be the first argument of the function passed to it).
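A sketch of the uniroot calculation for R0 = 2, using base R only. The search interval stops just short of 1 so that the trivial root at f = 1 is excluded; the interval bounds and tolerance are assumptions of this example.

```r
# x is the fraction of susceptibles that escapes infection
fn <- function(x, R0) {
  exp(-R0 * (1 - x)) - x
}

# Root on [0, 1): the escape fraction for R0 = 2
escape <- uniroot(fn, lower = 0, upper = 1 - 1e-9, tol = 1e-9, R0 = 2)$root
final_size <- 1 - escape            # exact final epidemic size
approx_size <- 1 - exp(-2)          # approximation 1 - exp(-R0)

c(final = final_size, approx = approx_size,
  error = approx_size - final_size)
```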
So for R0 = 2 the final epidemic size is 79.7% and the approximation is off by about 6.8 percentage points.
2.4 Open Epidemic
The open epidemic has recruitment of new susceptibles (i.e., μ > 0). As long as R0 > 1, the open epidemic has an “endemic equilibrium” where the pathogen and host coexist. If we use the SIR equations to model fractions (i.e., set N = 1), Eq. (2.2) of the SIR model implies that S∗ = (γ + μ)∕β = 1∕R0 is the endemic S-equilibrium, which when substituted into Eq. (2.1) gives I∗ = μ(R0 − 1)∕β, and finally, R∗ = N − I∗ − S∗ as the I and R endemic equilibria. We can study the predicted dynamics of the open epidemic using the sirmod-function. Let us assume a life expectancy of 50 years, a stable population size, and thus a weekly birth rate of μ = 1∕(50 ∗ 52). Let us assume that 19% of the initial population is susceptible and 1% is infected, and numerically integrate the model for 50 years (Fig. 2.4).
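A sketch of the open-epidemic integration with the values given above. The remaining 80% of the initial population is assumed to be recovered, β = 2 and γ = 1∕2 are carried over from the closed epidemic, and the two-panel layout is a choice made for this example.

```r
library(deSolve)

sirmod <- function(t, y, parms) {
  S <- y[[1]]; I <- y[[2]]; R <- y[[3]]
  beta <- parms[["beta"]]; mu <- parms[["mu"]]
  gamma <- parms[["gamma"]]; N <- parms[["N"]]
  list(c(mu * (N - S) - beta * S * I / N,
         beta * S * I / N - (mu + gamma) * I,
         gamma * I - mu * R))
}

times <- seq(0, 52 * 50, by = 0.1)                    # 50 years, in weeks
parms <- c(mu = 1 / (50 * 52), N = 1, beta = 2, gamma = 1/2)
start <- c(S = 0.19, I = 0.01, R = 0.8)
out <- as.data.frame(ode(y = start, times = times,
                         func = sirmod, parms = parms))

# Endemic equilibrium from the expressions in the text (N = 1)
R0 <- parms[["beta"]] / (parms[["gamma"]] + parms[["mu"]])
Sstar <- 1 / R0
Istar <- parms[["mu"]] * (R0 - 1) / parms[["beta"]]

par(mfrow = c(1, 2))
plot(out$time, out$I, type = "l", xlab = "Time (weeks)",
     ylab = "Fraction infected")
plot(out$S, out$I, type = "l", xlab = "Fraction susceptible",
     ylab = "Fraction infected")
points(Sstar, Istar, pch = 19, col = "red")           # endemic equilibrium
```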
2.5 Phase Analyses
When working with dynamical systems we are often interested in studying the dynamics in the phase plane and deriving the isoclines that divide this plane into regions of increase and decrease of the various state variables. The phaseR package is a wrapper around ode that makes it easy to analyze 1D and 2D ODEs (see Note 6). The R-state in the SIR model does not influence the dynamics, so we can rewrite the SIR model as a 2D system.
The isoclines (sometimes called the nullclines) in this system are given by the solutions to the equations dS∕dt = 0 and dI∕dt = 0, and partition the phase plane into regions where S and I are increasing and decreasing. For N = 1, the I-isocline is S = (γ + μ)∕β = 1∕R0 and the S-isocline is I = μ(1∕S − 1)∕β. We can draw these in the phase plane and add a simulated trajectory to the plot (Fig. 2.5). The trajectory cycles in a counter-clockwise dampened fashion towards the endemic equilibrium. To visualize the expected change to the system at arbitrary points in the phase plane, we can further use the function flowField in the phaseR-package to superimpose predicted arrows of change (vectors).
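A sketch of the phase-plane plot using deSolve and base graphics only. The flowField overlay is left out of this example; the 2D gradient function keeps the (t, y, parameters) signature described in Note 6 so that it could be handed to phaseR as well. Axis limits and colours are assumptions.

```r
library(deSolve)

# 2D version of the SIR model (R dropped, N = 1)
simod <- function(t, y, parameters) {
  S <- y[[1]]; I <- y[[2]]
  beta <- parameters[["beta"]]; mu <- parameters[["mu"]]
  gamma <- parameters[["gamma"]]
  dS <- mu * (1 - S) - beta * S * I
  dI <- beta * S * I - (mu + gamma) * I
  list(c(dS, dI))
}

parms <- c(mu = 1 / (50 * 52), beta = 2, gamma = 1/2)
out <- as.data.frame(ode(y = c(S = 0.19, I = 0.01),
                         times = seq(0, 52 * 50, by = 0.1),
                         func = simod, parms = parms))

R0 <- parms[["beta"]] / (parms[["gamma"]] + parms[["mu"]])
Sstar <- 1 / R0
Istar <- parms[["mu"]] * (R0 - 1) / parms[["beta"]]

# Simulated trajectory in the S-I plane
plot(out$S, out$I, type = "l", xlab = "S", ylab = "I",
     xlim = c(0.15, 0.35), ylim = c(0, 0.01))
# I-isocline: S = 1/R0
abline(v = Sstar, lty = 2, col = "red")
# S-isocline: I = mu * (1/S - 1) / beta
Sgrid <- seq(0.15, 0.35, length.out = 100)
lines(Sgrid, parms[["mu"]] * (1 / Sgrid - 1) / parms[["beta"]],
      lty = 2, col = "blue")
legend("topright", legend = c("Trajectory", "I-isocline", "S-isocline"),
       lty = c(1, 2, 2), col = c("black", "red", "blue"))

# Both rates of change vanish where the isoclines cross
grad_at_eq <- simod(0, c(Sstar, Istar), parms)[[1]]
grad_at_eq
```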
2.6 Stability and Periodicity
If we work with continuous-time ODE models like the SIR, equilibria are locally stable if (and only if) the real parts of all eigenvalues of the Jacobian matrix, when evaluated at the equilibrium, are smaller than zero. We will discuss stability and resonant periodicity in detail in Chap. 9, so this section is just a teaser. An equilibrium is (1) a node (i.e., all trajectories move monotonically towards/away from the equilibrium) if the largest eigenvalue has only a real part, or (2) a focus (trajectories spiral towards or away from the equilibrium) if the largest eigenvalues are a conjugate pair of complex numbers (a ± bi) (see Note 7). For a focus, the imaginary part determines the dampening period of the cycle according to 2π∕b. We can thus use the Jacobian matrix to study the SIR model’s equilibria. If we let F = dS∕dt = μ(N − S) − βSI∕N and G = dI∕dt = βSI∕N − (μ + γ)I, the Jacobian of the SIR system is the 2 × 2 matrix of partial derivatives of F and G with respect to S and I, and the two equilibria are the disease-free equilibrium and the endemic equilibrium as defined above.
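Differentiating the expressions for F and G given above term by term, the Jacobian can be written out as:

\[
J = \begin{pmatrix}
\dfrac{\partial F}{\partial S} & \dfrac{\partial F}{\partial I}\\[2ex]
\dfrac{\partial G}{\partial S} & \dfrac{\partial G}{\partial I}
\end{pmatrix}
= \begin{pmatrix}
-\mu - \beta \dfrac{I}{N} & -\beta \dfrac{S}{N}\\[2ex]
\beta \dfrac{I}{N} & \beta \dfrac{S}{N} - (\mu + \gamma)
\end{pmatrix}.
\]

At the disease-free equilibrium (S = N, I = 0) the matrix is upper triangular, so its eigenvalues can be read off the diagonal as \(-\mu\) and \(\beta - (\mu + \gamma)\).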
R can help with all of this. We first calculate the equilibria:
We then calculate the elements of the Jacobian using R’s D-function:
We pass the values for S∗ and I∗ in the eq1-list to the Jacobian (see Note 8), and use the eigen-function to calculate the eigenvalues:
For the endemic equilibrium, the eigenvalues are a pair of complex conjugates whose real parts are negative, so it is a stable focus. The period of the inwards spiral is:
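The steps described in this section (equilibria, symbolic derivatives with D, eigenvalues with eigen, and the period 2π∕b) can be sketched in base R as follows. The open-epidemic parameter values from Sect. 2.4 are assumed, and the expressions are evaluated with eval on a list of values rather than through with; object names are choices made for this example.

```r
parms <- c(mu = 1 / (50 * 52), N = 1, beta = 2, gamma = 1/2)

# Endemic equilibrium for N = 1 (expressions from Sect. 2.4)
R0 <- parms[["beta"]] / (parms[["gamma"]] + parms[["mu"]])
eq1 <- list(S = 1 / R0,
            I = parms[["mu"]] * (R0 - 1) / parms[["beta"]])

# Right-hand sides of dS/dt and dI/dt as R expressions
Fexp <- expression(mu * (N - S) - beta * S * I / N)
Gexp <- expression(beta * S * I / N - (mu + gamma) * I)

# Elements of the Jacobian by symbolic differentiation
j11 <- D(Fexp, "S"); j12 <- D(Fexp, "I")
j21 <- D(Gexp, "S"); j22 <- D(Gexp, "I")

# Evaluate the elements at the equilibrium
vals <- c(as.list(parms), eq1)
J <- matrix(c(eval(j11, vals), eval(j12, vals),
              eval(j21, vals), eval(j22, vals)),
            nrow = 2, byrow = TRUE)

ev <- eigen(J)$values
ev

# Period of the inward spiral, 2 * pi / b, in weeks
period <- 2 * pi / abs(Im(ev[1]))
period
```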
So with these parameters the dampening period is predicted to be about 262 weeks (just over 5 years). Thus, during disease invasion we expect this system to exhibit initial outbreaks every 5 years. A further significance of this number is that if the system is stochastically perturbed by, say, environmental variability affecting transmission, we expect the system to exhibit low-amplitude “phase-forgetting” cycles (Nisbet and Gurney 1982) with approximately this period in the long run (see Chap. 9). We can make more accurate calculations of the stochastic system using transfer functions (Nisbet and Gurney 1982; Priestley 1981). We will revisit this slightly more advanced topic in Sect. 9.7.
The same protocol can be used on the disease-free equilibrium {S ∗ = 1, I ∗ = 0}.
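A base-R sketch of the same calculation at the disease-free equilibrium, again assuming the open-epidemic parameter values from Sect. 2.4 and evaluating the symbolic derivatives with eval on a list.

```r
parms <- c(mu = 1 / (50 * 52), N = 1, beta = 2, gamma = 1/2)

Fexp <- expression(mu * (N - S) - beta * S * I / N)
Gexp <- expression(beta * S * I / N - (mu + gamma) * I)

# Disease-free equilibrium
eq2 <- list(S = 1, I = 0)
vals <- c(as.list(parms), eq2)

J0 <- matrix(c(eval(D(Fexp, "S"), vals), eval(D(Fexp, "I"), vals),
               eval(D(Gexp, "S"), vals), eval(D(Gexp, "I"), vals)),
             nrow = 2, byrow = TRUE)

ev0 <- eigen(J0)$values
ev0
```

Because the matrix is upper triangular at this equilibrium, the two eigenvalues should equal −μ and β − (μ + γ).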
The eigenvalues are strictly real and the largest value is greater than zero, so it is an unstable node (a “saddle”); the epidemic trajectory is predicted to move monotonically away from this disease-free equilibrium if infection is introduced into the system. This makes sense because with the parameter values used, R0 = 3.997, which is greater than the invasion-threshold value of 1.
2.7 Advanced: More Realistic Infectious Periods
The S(E)IR-type differential equation models assume that the rates of exit from the infectious classes are constant; the implicit assumption is thus that the infectious period is exponentially distributed among infected individuals. The average infectious period is 1∕(γ + μ), but an exponential fraction is infectious for much shorter or longer than this. The chain-binomial model (see Sect. 3.4), in contrast, assumes that everybody is infectious for a fixed period and then all instantaneously recover (or die). These assumptions are mathematically convenient, but in reality neither is particularly realistic. Hope-Simpson (1952) traced the chains of transmission of measles in multi-sibling households. The timing of secondary and tertiary cases was analyzed in detail by Bailey (1956) and Bailey and Alff-Steinberger (1970). The average latent and infectious periods were calculated to be 8.58 and 6.57 days, respectively. While the distributions around each of these averages were not estimated separately (the latent period was assumed to be distributed and the infectious period assumed fixed), the variance around the roughly fortnight period of infection was estimated to be 3.13. The mean duration of infection is thus 15.15 days with a standard deviation of 1.77 (Fig. 2.6). So neither a fixed nor an exponential distribution is very accurate (Keeling and Grenfell 1997; Lloyd 2001).
Kermack and McKendrick’s (1927) original model allows for arbitrary infectious-period distributions. We can write Kermack and McKendrick’s original equations as “renewal equations” (Breda et al. 2012), introducing the additional notation of k(t) being the (instantaneous) incidence at time t (i.e., flux into the I-class at time t).
where k(t − τ) is the number of individuals that were infected τ time units ago, h(τ) is the probability of recovering on infection-day τ, and H(τ) is the cumulative probability of having recovered by infection-day τ; k(t − τ)∕(1 − H(τ)) is thus the fraction of individuals infected at time t − τ that still remains in the infected class on day t, and the integral is over all previous infections so as to quantify the total flux into the removed class at time t. Though intuitive, these general integro-differential equations (Eqs. (2.5)–(2.8)) are not easy to work with in general. For a restricted set of distributions for the h()-function, however, namely the Erlang distribution (the Gamma distribution with an integer shape parameter), the model can be numerically integrated using a “Gamma-chain” model (referred to as “linear chain trickery” by Metz and Diekmann 1991) of coupled ordinary differential equations (e.g., Blythe et al. 1984; de Valpine et al. 2014; Bjørnstad et al. 2016). The trick is to separate any distributed-delay compartment into u sub-compartments through which individuals pass at a rate overallrate ∗ u. The resultant infectious period will have a mean of 1∕overallrate and a coefficient-of-variation of \(1/\sqrt{u}\).
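As a worked check of the mean and coefficient-of-variation statement, the following base-R sketch picks the number of sub-compartments u that matches the measles estimates quoted earlier in this section (mean 15.15 days, variance 3.13), and then simulates the passage time through a chain of u exponential stages. Matching u to 1∕CV² and the simulation settings (seed, number of draws) are assumptions of this example.

```r
# Measles estimates quoted in the text
mean_dur <- 8.58 + 6.57          # mean duration of infection, days
var_dur <- 3.13                  # variance of the duration

# CV = 1 / sqrt(u) implies u = 1 / CV^2
cv <- sqrt(var_dur) / mean_dur
u <- round(1 / cv^2)
u

# Monte Carlo check: time to pass through u exponential stages,
# each left at rate overallrate * u
set.seed(1)
overallrate <- 1 / mean_dur
n <- 10000
stage_times <- matrix(rexp(n * u, rate = overallrate * u), nrow = n)
total_time <- rowSums(stage_times)

sim_mean <- mean(total_time)
sim_cv <- sd(total_time) / mean(total_time)
c(mean = sim_mean, cv = sim_cv, expected_cv = 1 / sqrt(u))
```

The sum of u independent exponential stages follows an Erlang distribution, which is why the simulated mean and CV track 1∕overallrate and 1∕√u.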
We can write a chain-SIR model to simulate S → I → R flows with more realistic infectious period distributions (see Note 9):
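A minimal sketch of a chain-SIR gradient function, assuming the infectious class is split into u sub-compartments that are each left at rate uγ. For simplicity it is written on the natural scale, without the log-transformation described in Note 9, so it is best suited to small u; the helper run_chain, the starting values, and the 10-year horizon are hypothetical choices for this example.

```r
library(deSolve)

chainSIR <- function(t, x, params) {
  u <- params[["u"]]; mu <- params[["mu"]]; N <- params[["N"]]
  beta <- params[["beta"]]; gamma <- params[["gamma"]]
  S <- x[[1]]
  I <- unname(x[2:(u + 1)])          # the u infectious sub-compartments
  R <- x[[u + 2]]
  newinf <- beta * S * sum(I) / N    # all sub-compartments transmit
  dS <- mu * (N - S) - newinf
  # New infections enter the first sub-compartment; each later one
  # receives the outflow of the one before it
  inflow <- c(newinf, u * gamma * I[-u])
  dI <- inflow - (mu + u * gamma) * I
  dR <- u * gamma * I[u] - mu * R
  list(c(dS, dI, dR))
}

# Hypothetical helper: integrate for a given u and sum the I sub-compartments
run_chain <- function(u, times = seq(0, 10 * 52, by = 0.1)) {
  params <- c(mu = 1 / (50 * 52), N = 1, beta = 2, gamma = 1/2, u = u)
  start <- c(S = 0.19, I = c(0.01, rep(0, u - 1)), R = 0.8)
  out <- as.data.frame(ode(y = start, times = times,
                           func = chainSIR, parms = params))
  data.frame(time = out$time, S = out$S,
             I = rowSums(out[, 2 + seq_len(u), drop = FALSE]),
             R = out$R)
}

sir1 <- run_chain(1)   # u = 1 is the standard SIR model
sir2 <- run_chain(2)   # u = 2 gives a narrower infectious period

plot(sir1$time, sir1$I, type = "l", xlab = "Time (weeks)",
     ylab = "Fraction infected")
lines(sir2$time, sir2$I, col = "red")
legend("topright", legend = c("u = 1", "u = 2"),
       lty = 1, col = c("black", "red"))
```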
We can compare the predicted dynamics of the simple SIR model with the u = 2 chain model, the u = 500 chain model (which is effectively the fixed-period delayed-differential model) and the “measles-realistic” u = 73 model.
The narrower the infectious-period distribution, the more punctuated the predicted epidemics. However, infectious-period narrowing alone cannot sustain recurrent epidemics; in the absence of stochastic or seasonal forcing, epidemics will dampen to the endemic equilibrium (though the damping period is slightly accelerated and the convergence on the equilibrium is slightly slower with narrowing infectious-period distributions) (Fig. 2.7).
In the above we considered non-exponential infectious-period distributions. However, the general ODE chain method can be used for any compartment. Lavine et al. (2011), for example, used it to model non-exponential waning of natural and vaccine-induced immunity to whooping cough.
2.8 ShinyApp
The following code will launch a shinyApp of the SIR model in your local browser. This app can also be launched by calling SIR.app in the epimdr-package. Several of the subsequent chapters also have associated shinyApps. Those will only be accessible from the package (because the code is long and a bit tedious). We quote an annotated version of the SIR.app in full.
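The following is a much-reduced sketch of what such an app might look like, not the SIR.app from the epimdr-package. The input names, slider ranges, and the single time-series plot are all assumptions of this example; it only illustrates how the closed-epidemic integration can be wired to shiny inputs.

```r
library(shiny)
library(deSolve)

sirmod <- function(t, y, parms) {
  S <- y[[1]]; I <- y[[2]]; R <- y[[3]]
  beta <- parms[["beta"]]; mu <- parms[["mu"]]
  gamma <- parms[["gamma"]]; N <- parms[["N"]]
  list(c(mu * (N - S) - beta * S * I / N,
         beta * S * I / N - (mu + gamma) * I,
         gamma * I - mu * R))
}

# User interface: sliders for the parameters, one plot for the output
ui <- fluidPage(
  titlePanel("SIR model (closed epidemic)"),
  sidebarLayout(
    sidebarPanel(
      sliderInput("beta", "Transmission rate (beta):",
                  min = 0, max = 5, value = 2, step = 0.1),
      sliderInput("infper", "Infectious period (weeks):",
                  min = 0.5, max = 10, value = 2, step = 0.5),
      sliderInput("weeks", "Weeks to simulate:",
                  min = 10, max = 104, value = 26)
    ),
    mainPanel(plotOutput("plot1"))
  )
)

# Server: re-integrate and re-plot whenever a slider changes
server <- function(input, output) {
  output$plot1 <- renderPlot({
    parms <- c(mu = 0, N = 1, beta = input$beta, gamma = 1 / input$infper)
    out <- as.data.frame(ode(y = c(S = 0.999, I = 0.001, R = 0),
                             times = seq(0, input$weeks, by = 0.1),
                             func = sirmod, parms = parms))
    matplot(out$time, out[, c("S", "I", "R")], type = "l", lty = 1,
            col = c("black", "red", "green"),
            xlab = "Time (weeks)", ylab = "Fraction")
    legend("right", legend = c("S", "I", "R"), lty = 1,
           col = c("black", "red", "green"))
  })
}

app <- shinyApp(ui = ui, server = server)
# Launch only in an interactive session
if (interactive()) runApp(app)
```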
Notes
1. The implicit assumption that stems from the use of deterministic ordinary differential equations (ODEs) is that the infectious periods (and residence times in all compartments) are exponentially distributed. This is a tractable approximation for exploring overall dynamics, but the observed duration of infection is often much less variable (the Eimeria gut parasite, a relative of the Plasmodium that causes malaria, undergoes exactly 8 replication cycles before leaving a host) or much more variable (see the superspreader MOOC video: https://www.youtube.com/watch?v=3H1tG4uz9uk). Section 2.7 discusses a practical approach to modeling dynamics when the exponential assumption is deemed too simplistic.
2. In the case of the simple SIR model, however, there is no time-dependence in any of the parameters, so this argument is not used within the gradient function; this will change when we consider seasonality (Chap. 5).
3. For further details on usage, do ?function on the R command-line, i.e., ?ode in this instance.
4. It will not find unstable equilibria; for these we will need to use other strategies. We will consider finding all equilibria in more depth in Sect. 9.3.
5. We use a for-loop here to calculate the final epidemic size for a range of values of R0. A loop works by repeating calculations (in this case 50 times); after each repeat the value of the looping variable (in this case i) is changed to the next value in the looping vector. So in this example i will be 1 first, then 2, and so on until the loop ends after i = 50.
6. The phaseR package requires the gradient function to take the arguments t, y, and parameters.
7. And a center, like the Lotka-Volterra predator-prey model, if the conjugate pair only has imaginary parts.
8. In previous coding, like for the sirmod-function, we “pulled” parameter values from the input arguments inside the function to make the code as transparent as possible; while it makes the code easy to read, it makes for extra coding, and can clutter up the workspace with variables that are defined in multiple locations. The with-function allows the evaluation of an expression using variables defined in a data list.
9. With a high number of compartments this system of equations can become “stiff”, with the computer potentially making rounding errors leading to erroneous negative numbers. We use a “log-trick” (Ellner and Guckenheimer 2011) available for systems where all state variables are strictly positive: we solve the system in log-coordinates to smooth abrupt changes and force all numbers to be greater than zero. To employ this technique we log-transform all initial values in start, change the first line in the function to x = exp(logx), and change the last line to return dS/S, etc. in place of dS, which comes from the chain rule of differentiation and the fact that d(log x)∕dx = 1∕x.
References
Anderson, R. M., & May, R. M. (1982). Directly transmitted infectious diseases: Control by vaccination. Science, 215, 1053–1060.
Bailey, N. T. J. (1956). On estimating the latent and infectious periods of measles: I. Families with two susceptibles only. Biometrika, 43(1/2), 15–22.
Bailey, N. T. J., & Alff-Steinberger, C. (1970). Improvements in the estimation of the latent and infectious periods of a contagious disease. Biometrika, 57(1), 141–153.
Bjørnstad, O. N., Nelson, W. A., & Tobin, P. C. (2016). Developmental synchrony in multivoltine insects: Generation separation versus smearing. Population Ecology, 58(4), 479–491.
Blythe, S., Nisbet, R., & Gurney, W. (1984). The dynamics of population models with distributed maturation periods. Theoretical Population Biology, 25(3), 289–311.
Breda, D., Diekmann, O., De Graaf, W., Pugliese, A., & Vermiglio, R. (2012). On the formulation of epidemic models (an appraisal of Kermack and Mckendrick). Journal of Biological Dynamics, 6(Suppl. 2), 103–117.
de Valpine, P., Scranton, K., Knape, J., Ram, K., & Mills, N. J. (2014). The importance of individual developmental variation in stage-structured population models. Ecology Letters, 17(8), 1026–1038.
Ellner, S. P., & Guckenheimer, J. (2011). Dynamic models in biology. Princeton: Princeton University Press.
Ferguson, N. M., Keeling, M. J., Edmunds, W. J., Gani, R., Grenfell, B. T., Anderson, R. M., & Leach, S. (2003). Planning for smallpox outbreaks. Nature, 425(6959), 681–685.
Hope-Simpson, R. (1952). Infectiousness of communicable diseases in the household. Lancet, 2, 549–554.
Keeling, M. J., & Grenfell, B. (1997). Disease extinction and community size: Modeling the persistence of measles. Science, 275(5296), 65–67.
Kermack, W. O., & McKendrick, A. G. (1927). A contribution to the mathematical theory of epidemics. Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 115(772), 700–721.
Lavine, J. S., King, A. A., & Bjørnstad, O. N. (2011). Natural immune boosting in pertussis dynamics and the potential for long-term vaccine failure. Proceedings of the National Academy of Sciences, 108(17), 7259–7264.
Lloyd, A. L. (2001). Destabilization of epidemic models with the inclusion of realistic distributions of infectious periods. Proceedings of the Royal Society of London B: Biological Sciences, 268(1470), 985–993.
Metz, J., & Diekmann, O. (1991). Exact finite dimensional representations of models for physiologically structured populations. I: The abstract foundations of linear chain trickery. Differential equations with applications in biology, physics and engineering: Vol. 133. Lecture notes in pure and applied mathematics (pp. 269–289). New York: Marcel Dekker.
Nisbet, R. M., & Gurney, W. (1982). Modelling fluctuating populations. Chichester: John Wiley and Sons Limited.
Priestley, M. B. (1981). Spectral analysis and time series. Cambridge, MA: Academic.
Roberts, M., & Heesterbeek, H. (1993). Bluff your way in epidemic models. Trends in Microbiology, 1(9), 343–348.
Swinton, J. (1998). Extinction times and phase transitions for spatially structured closed epidemics. Bulletin of Mathematical Biology, 60(2), 215–230.
© 2018 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Bjørnstad, O.N. (2018). SIR. In: Epidemics. Use R!. Springer, Cham. https://doi.org/10.1007/978-3-319-97487-3_2
DOI: https://doi.org/10.1007/978-3-319-97487-3_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97486-6
Online ISBN: 978-3-319-97487-3
eBook Packages: Mathematics and Statistics; Mathematics and Statistics (R0)