Abstract
Analytical approximations have generated many insights into the dynamics of epidemics, but there is only one well-known approximation which describes the dynamics of the whole epidemic. In addition, most of the well-known approximations for different aspects of the dynamics are for the classic susceptible–infected–recovered model, in which the infectious period is exponentially distributed. Whilst this assumption is useful, it is somewhat unrealistic. Equally reasonable assumptions are that the infectious period is finite and fixed or that there is a distribution of infectious periods centred round a nonzero mean. We investigate the effect of these different assumptions on the dynamics of the epidemic by deriving approximations to the whole epidemic curve. We show how the well-known sech-squared approximation for the infective population in ‘weak’ epidemics (where the basic reproduction rate \(R_0\approx 1\)) can be extended to the case of an arbitrary distribution of infectious periods having finite second moment, including as examples fixed and gamma-distributed infectious periods. Further, we show how to approximate the time course of a ‘strong’ epidemic, where \(R_0\gg 1\), demonstrating the importance of estimating the infectious period distribution early in an epidemic.
1 Introduction
The dynamics of the classic susceptible–infected–recovered (SIR) infectious disease transmission model framework, as first outlined by Kermack and McKendrick (1927), underlies much of our understanding of infectious disease epidemiology (Anderson and May 1991; Diekmann and Heesterbeek 2000; Keeling and Rohani 2007). Important insights from this model framework include the threshold properties of the basic reproductive number, \(R_0\) (Kermack and McKendrick 1927), the critical vaccination proportion (Smith 1964) and the relationship between the epidemic growth rate, \(r_g\), the generation time, \(T_g\), and \(R_0\). Within their classic paper Kermack and McKendrick (1927) not only derived \(R_0\) but also derived an approximation to the epidemic curve for \(R_0\) close to 1 (a ‘weak’ epidemic). Understanding of these various quantities, although they apply only to relatively simple homogeneous models, has proved very useful in developing our understanding of the characteristics of an epidemic.
The classic SIR model was derived as a special case of a more general formulation with infectiousness varying over the course of the time since infection, or across the population (Kermack and McKendrick 1927; Diekmann and Heesterbeek 2000). This form of epidemic model in slightly different forms is also known as the ‘Lotka–Euler’ formulation (Wallinga and Lipsitch 2007) or ‘renewal equation’ (Fraser 2007). Within this framework, the classic SIR model emerges under the assumption that infectious periods are exponentially distributed across the population. This is, of course, unlikely to be the case in reality. Arguably, the most parsimonious representation of a more realistic infectious period is to assume that the infectious period is limited and is the same across all individuals [the Soper model, Soper (1929)]. The assumption of different distributions of infectious periods is known to affect the relationship between the exponential growth rate and the generation time distribution, and estimates of the reproductive number from the epidemic growth rate (Fraser 2007; Wallinga and Lipsitch 2007; Wearing et al. 2005; Lloyd 2001a; Hethcote and Tudor 1980), a crucial estimate in the early stages of a new outbreak.
Despite the impact of different infectious period distributions on the dynamics of the early stages of an epidemic, the ‘final epidemic size’ or the total number of infectives over the course of an epidemic has been shown to be invariant under different assumptions on the distribution of infectious periods and disease course within individuals (Kermack and McKendrick 1927; Bailey 1975; Anderson and Watson 1980; Anderson and May 1991; Andersson and Britton 2000; Diekmann and Heesterbeek 2000), provided there is homogeneous mixing (Ma and Earn 2006; Anderson and May 1991; Diekmann and Heesterbeek 2000; Andreasen 2011).
In the declining stages of an epidemic, or during the decline of a seasonally forced epidemic, the distribution of infectious periods has been shown to destabilise the dynamics (Lloyd 2001a) and to change dependence of persistence on the population size (Lloyd 2001b), results which were first derived for a model in which the infectious period was exponentially distributed (Keeling and Grenfell 1997). Keeling and Grenfell and Lloyd came to opposite conclusions concerning persistence, but their results were reconciled by Conlan et al. (2010).
Given these insights, it is surprising that there has not been more investigation of the impact of the infectious period distribution on the peak and decline of the SIR-type models. Here we formulate a general approximation to the epidemic curve for any infectious period distribution within a unified framework. We derive approximations to the time course for \(R_0\) close to 1 (‘weak’ epidemics) and, innovatively, for larger \(R_0\) (‘strong’ epidemics). Using these novel approximations, we are able to characterise the impact of infectious period distribution on the time course of the epidemic, including the time to and magnitude of peak prevalence. Despite the simplicity of obtaining numerical solutions of these models, analytic approximations such as those highlighted above are a useful way of characterising the impact of different assumptions on epidemic dynamics, as we demonstrate below.
2 The Generalised Infectious Period Model
We first formulate the general transmission model in which individuals are either susceptible (S), infectious (I) or recovered (R), for a general infectious period distribution; the epidemic is assumed to occur on a fast timescale, so that births and deaths are not modelled. This type of model formulation has been described and analysed several times, most notably by Kermack and McKendrick (1927).
To proceed, we denote by i(a, t) the number density of the infected cohort having had the disease for a period a. Then
\[I(t)=\int _0^\infty i(a,t)\,\mathrm{d}a \tag{1}\]
is the total number of infectives, assuming that recovery or removal is inevitable (i.e., \(i\rightarrow 0\) as \(a\rightarrow \infty \)). As with age-dependent population models, or time since infection models, i satisfies the partial differential equation
\[\frac{\partial i}{\partial t}+\frac{\partial i}{\partial a}=-r(a)\,i, \tag{2}\]
where r(a) is the recovery rate, and is taken to be a function of the time since infection. Suitable initial conditions are
where the ‘recruitment’ or incidence rate is
just as in the classic SIR model (Kermack and McKendrick 1927). Integration of (2) leads to
We solve (2) using the method of characteristics. In \(t<a\), we have \(i=0\), whilst for \(t>a\), we find
Putting \(i_0=-\dot{S}\) in this, we find, after integrating by parts, that I is given by
where we can define \(S_0\) to be the total (pre-infection) population of susceptibles. We use the following notation with respect to the recovery, or infectious, period distribution:
Note that K(a) is the infection time probability density, and that [from (8)]
Using (4), we thus have the generalised Soper model, following the early work by Soper (1929), and its exposition by Wilson and Burke (1942) and Wilson and Worcester (1944),
The pre-infection state \(S=S_0\), \(I=0\) for \(t<0\) is also described by (10), providing we take \(K(a)=0\) and thus \(F(a)=1\) for \(a<0\). The onset of the epidemic is enabled by initial conditions
and typically we suppose \(I_0\ll S_0\).
2.1 Infectious Period Distributions
Different assumptions regarding the infectious period distributions can be represented by different functional forms of K(a). In these formulations we set the functions to have the same mean infectious period,
We additionally define the second moment,
for future use.
In this formulation the classic SIR model, with its exponential decay in infectiousness, corresponds to a recovery rate \(r=1/T\) which is independent of age, and a consequent delay kernel
\[K(a)=\frac{1}{T}\,e^{-a/T}, \tag{14}\]
with mean T and second moment \(2T^2\).
Another plausible assumption is that the infectious period is a fixed constant T. Since F(a) is the fraction of an initial inoculate who still have the disease after period a, we can take \(F=1-H(a-T)\), where H is the Heaviside step function, and thus K is a delta function,
\[K(a)=\delta (a-T), \tag{15}\]
with mean T and second moment \(T^2\).
More general kernels can be analysed in the same way, including, for example, the gamma distribution
\[K(a)=\frac{(\gamma /T)^{\gamma }\,a^{\gamma -1}\,e^{-\gamma a/T}}{\Gamma (\gamma )}, \tag{16}\]
which has mean T and second moment \(T^2(\gamma +1)/\gamma \) and which takes the limits (14) and (15) when \(\gamma =1\) and \(\gamma \rightarrow \infty \), respectively.
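The stated moments of the gamma kernel can be checked numerically. The following is a minimal sketch, assuming the standard gamma density with shape \(\gamma \) and mean T; the function names `gamma_kernel` and `kernel_moments`, the truncation of the integral and the midpoint rule are illustrative choices, not part of the analysis.

```python
import math


def gamma_kernel(a, shape, mean):
    """Gamma density with the given shape and mean (assumed form of the kernel).

    Evaluated in log space so that large shape values do not overflow.
    """
    if a <= 0.0:
        return 0.0
    rate = shape / mean
    log_k = (shape * math.log(rate) + (shape - 1.0) * math.log(a)
             - rate * a - math.lgamma(shape))
    return math.exp(log_k)


def kernel_moments(shape, mean, upper_factor=60.0, n=100_000):
    """Zeroth, first and second moments of the kernel by the midpoint rule.

    The integral is truncated at upper_factor * mean, which is adequate for
    shape >= 1 (an assumption of this sketch).
    """
    h = upper_factor * mean / n
    m0 = m1 = m2 = 0.0
    for j in range(n):
        a = (j + 0.5) * h
        k = gamma_kernel(a, shape, mean)
        m0 += k
        m1 += a * k
        m2 += a * a * k
    return m0 * h, m1 * h, m2 * h


if __name__ == "__main__":
    T = 2.0
    for shape in (1.0, 4.0, 25.0):
        m0, m1, m2 = kernel_moments(shape, T)
        expected_m2 = T * T * (shape + 1.0) / shape
        print(shape, m0, m1, m2, expected_m2)
```

With shape 1 the kernel is the exponential case, and as the shape grows the second moment approaches \(T^2\), the fixed-period value.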
2.2 Nondimensionalisation
We analyse (6), (7) and (10) by first rescaling the variables, thus
Then we have the dimensionless integrals
and the dimensionless equations for u, v and w can thus be written in the form [bearing in mind (18)]
where
The initial values for u and v are, from (11),
Note also that the initial value of v is assumed small and nonzero. The dimensionless kernels for exponentially distributed, gamma-distributed and fixed infectious periods are
\[K(a)=e^{-a},\qquad K(a)=\frac{\gamma ^{\gamma }\,a^{\gamma -1}\,e^{-\gamma a}}{\Gamma (\gamma )},\qquad K(a)=\delta (a-1), \tag{22}\]
respectively. Note that the mean of each dimensionless kernel is one and the dimensionless second moments are \(\kappa _2= 2\), \((\gamma +1)/\gamma \) and 1, respectively.
2.3 Initial Growth
We can find the initial growth rate of the epidemic for general infectious period distributions. We first put \(u=R_0-v_0 e^{\lambda t}\), and expanding (19) for small \(v_0\) and large t, we find
which has a unique positive root if \(R_0>1\); we thus identify \(R_0\) as the basic reproduction rate of the epidemic for the general infectious period distribution.
For the gamma-distributed infectious period kernel in (22), the dimensionless epidemic growth rate satisfies
and for the particular cases \(\gamma =1\) (SIR model) and \(\gamma =\infty \) (Soper model), we find
Whilst these approximations to the early epidemic growth rate are useful, they do not tell us about the dynamics of the whole epidemic. We now present approximations to the whole epidemic curve firstly for epidemics for which \(R_0\) is close to one (“weak” epidemics), and then for large \(R_0\) (“strong” epidemics).
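The initial growth rate for a given \(R_0\) and shape parameter \(\gamma \) is easily found numerically. The sketch below assumes that the characteristic equation takes the renewal-type form \(\lambda =R_0\left[ 1-\int _0^\infty K(a)e^{-\lambda a}\,\mathrm{d}a\right] \) with a unit-mean gamma kernel, which is what the exponential ansatz for u yields under the dimensionless scaling in which \(u+v=R_0\) initially; this form, the bisection solver and the names `one_minus_transform` and `growth_rate` are assumptions made for illustration.

```python
import math


def one_minus_transform(lam, gamma):
    """1 - Laplace transform of a unit-mean gamma kernel, evaluated at lam.

    gamma = math.inf is treated as the fixed-period (delta function) kernel.
    expm1 and log1p keep the result accurate when lam is small.
    """
    if math.isinf(gamma):
        return -math.expm1(-lam)
    return -math.expm1(-gamma * math.log1p(lam / gamma))


def growth_rate(r0, gamma, tol=1e-12):
    """Positive root of lam = r0 * (1 - transform(lam)) by bisection.

    Assumed characteristic equation; a positive root requires r0 > 1.
    """
    if r0 <= 1.0:
        raise ValueError("a positive growth rate requires R0 > 1")

    def f(lam):
        return r0 * one_minus_transform(lam, gamma) - lam

    lo, hi = 1e-12, r0  # f(lo) > 0 and f(hi) < 0 when r0 > 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for gamma in (1.0, 2.0, 10.0, math.inf):
        print(gamma, growth_rate(2.0, gamma))
```

Under this assumed form, \(\gamma =1\) gives \(\lambda =R_0-1\) exactly, and the growth rate increases with \(\gamma \) at fixed \(R_0\), consistent with the faster growth of the fixed-period model discussed below.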
3 Weak Epidemics
For the case where \(R_0\approx 1\), Kermack and McKendrick derived a classic approximation to the epidemic curve for the model with exponentially distributed infectious periods. Soper derived a similar expression for the model with fixed infectious periods. We rederive these expressions by showing that this approximation can be generalised for any infectious period distribution with finite second moment.
We first define
\[\varepsilon =R_0-1, \tag{26}\]
and take \(\varepsilon \ll 1\). We then rescale the variables by writing
Substituting these changes into (19), using (18), we obtain
where the overdot denotes differentiation with respect to \(\tau \), and we have replaced the upper limit on the integral by \(\infty \) on the basis that we have
in (19) (the epidemic is initiated at \(t=0\)). The initial conditions are, from (21),
and we assume that \(v_0\ll \varepsilon ^2\).
We now expand \(U(\tau -\varepsilon a)\) in the integral in a Taylor series, and, using (18), this leads to
substituting this into (28)\(_1\), using the boundary conditions
[the latter from (31)] then leads to the leading order equation
providing the second moment \(\kappa _2\) exists, essentially equivalent to requiring that \(K(a)\ll \dfrac{1}{a^3}\) for large a. For a heavy-tailed distribution with unbounded second moment, a more elaborate procedure would be necessary. We do not pursue this here, but note that the breakdown of the method is associated with the nonuniform convergence of the Taylor expansion of \(U(\tau -\varepsilon a)\) for large a, because of (29). The correct procedure can be obtained by replacing the upper limit in the integral in (28)\(_2\) by \(\tau /\varepsilon \).
The solution to the Eq. (33) is
Therefore
and
This shows that the approximation to the epidemic curve for low \(R_0\) by Kermack and McKendrick (1927) for the exponential distribution and Wilson and Worcester (1944) for fixed infectious periods is generalisable to any infectious period distribution with a finite second moment.
If we first compare the result of this approximation for the SIR model and the constant infectious period (Soper) model, the approximations are
Of note here is the factor of two, which means that the epidemic with a constant infectious period will grow (and therefore decay) more rapidly than that with an exponentially distributed infectious period. It also means that the approximate maximum prevalence for the constant infectious period model, \(\left( R_0-1\right) ^2\), is twice as big as for the SIR model. This effect of a constant infectious period on shortening the ‘generation time’ (time from infection to onward transmission) has been previously noted by, amongst others, Diekmann and Heesterbeek (2000) and Wallinga and Lipsitch (2007), but its effect on peak prevalence has not been previously approximated.
For gamma-distributed infectious periods, the epidemic curve is approximated by
and the bigger the shape parameter, \(\gamma \) (resulting in smaller variance in infectious periods) the higher the peak prevalence and the shorter is the duration of the outbreak.
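The dependence of the weak epidemic curve on the second moment can be illustrated with a short sketch. It assumes the sech-squared profile takes the form \(v\approx (\varepsilon ^2/\kappa _2)\,\hbox {sech}^2(\varepsilon t/\kappa _2-\phi )\) with \(\varepsilon =R_0-1\), which reproduces the peak values quoted above (\((R_0-1)^2\) for a fixed period, half that for the SIR model) and the classical Kermack and McKendrick form when \(\kappa _2=2\). The phase \(\phi \), which depends on the initial condition, is left as a free parameter, and the function names are hypothetical.

```python
import math


def kappa2(gamma):
    """Dimensionless second moment of the unit-mean gamma kernel.

    gamma = math.inf is treated as the fixed-period kernel.
    """
    if math.isinf(gamma):
        return 1.0
    return (gamma + 1.0) / gamma


def weak_profile(t, r0, gamma, phase=0.0):
    """Assumed sech-squared approximation for v at dimensionless time t."""
    eps = r0 - 1.0
    k2 = kappa2(gamma)
    return (eps * eps / k2) / math.cosh(eps * t / k2 - phase) ** 2


if __name__ == "__main__":
    r0 = 1.2
    for gamma in (1.0, 3.0, math.inf):
        # Peak height (at zero argument) and the value one time unit later
        print(gamma, weak_profile(0.0, r0, gamma), weak_profile(1.0, r0, gamma))
```

Larger \(\gamma \) gives a smaller \(\kappa _2\), and hence a taller and narrower curve.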
3.1 Peak Prevalence and Time to Peak
From (36), (17), (20) and (26), the peak prevalence P, defined as the ratio of the maximum infected number \(I_{\max }\) to the total population \(S_0\) is, for a weak epidemic,
If the initial infected population consists of \(I_0\) individuals, then the initial value of v is given by (21), and since by assumption this is very small, we can suppose v reaches its maximum when \(\tau \) is large, in which case we can use the approximation \(\hbox {sech}\,(-\theta )\approx 2e^{-\theta }\), and the dimensionless time to peak prevalence (scaled with T), is then found from (36) to be
4 Strong Epidemics
Now we consider the case \(R_0 \gg 1\), for which we devise an asymptotic method similar to that used by Fowler (1982). First we rescale the variables as follows:
so that
where
There is an initial phase where \(U\approx 1\), and we have
where \(\lambda \) is given by (23), using also the fact that \(u+v=R_0\) at \(t=0\), and \(v_0\) is given by (21). Since \(R_0\gg 1\), the application of Laplace’s method to (23) shows that
Note that (44) can thus be written in the form
where
The approximation becomes invalid when \(t\approx t_0\), and the appropriate rescaling of (42) is done by choosing
The equations (42) then become
where the prime denotes differentiation with respect to \(\tau \).
For small \(\delta \), (46) implies that \(U(t)\approx 1\) for \(t<t_0\), and this implies that the integral in (49) is small, so that V can be approximated by
and therefore
Note that (46) implies
for \(\tau <0\), and the solution of (51) which matches to this as \(\tau \rightarrow -\infty \) is
The solution in (53) is a monotonic solution in which the number of infectives rapidly increases to a peak at \(V\approx 1\), i.e., \(v\approx R_0\), whilst U decreases towards zero: everybody gets infected! However, the approximation (50) and therefore (51) clearly break down when \(\tau \sim 1/\delta \), and a further rescaling is then necessary.
As \(\tau \) becomes large, we rescale back to the original time scale \(t=t_0+\delta \tau \). Note that then \(U\sim e^{-\tau }=e^{-(t-t_0)/\delta }\), and this suggests we write
with
then (42) becomes
together with the matching condition (55).
In the integral, we may take \(U(t-a)\approx 1\) for \(t-a<t_0\), whilst \(U(t-a)=\exp \left[ -\dfrac{\phi (t-a)}{\delta }\right] \) for \(t-a>t_0\). The exponential terms are small and can be neglected, and therefore
with \(\phi \sim t-t_0\) as \(t\rightarrow t_0\), and thus, interchanging the order of integration in the quadrature for \(\phi \),
and \(\phi \rightarrow 1\) as \(t\rightarrow \infty \). Thus U reaches equilibrium and V declines to zero; no further approximations are necessary.
Because the approximation has two distinct phases, it is less easy to extract such quantities as peak prevalence and time to peak. To do this, we can write a uniformly asymptotic approximation. We write the small and large time approximations in terms of t, thus
A uniform approximation is essentially obtained by adding the two approximations and subtracting the common part; for details see Van Dyke (1975). In the present case we can write a uniform approximation by inspection. This is
providing we extend the definition of K so that \(K(a)=0\) for \(a<0\); it is clear that these expressions reduce to both approximations in (59) in the appropriate time range.
4.1 Peak Prevalence and Time to Peak
The peak time is approximately \(t_0\) given by (47), but the peak value is not well constrained. To find this, we use the uniform approximation for v to find the time where it is maximum; this is the peak time \(t_p\). It is given implicitly by
thus
Evidently \(t'\) is small, so that \(F\approx 1\), but the precise expression for \(t'\) depends critically on the behaviour of the distribution kernel K(a) near \(a=0\). For the gamma distribution (22), we have
and in that case
so that
From (60), the maximum of v is approximately \(R_0F-K\), so that the peak infected population is to leading order the whole population. More accurately, the peak prevalence
this last expression being for the gamma distribution. For the SIR problem for which \(\gamma =1\), and \(F(a)=K(a)=e^{-a}\), we have more directly from (62) that
and using this directly in (60) yields the peak prevalence as
The limit in which \(\gamma \rightarrow \infty \) corresponds to the Soper problem where \(K(a)=\delta (a-1)\), and (61) is irrelevant. Direct inspection of (60) shows that in this case v rapidly rises, reaches a maximum \(\approx R_0(1-e^{-R_0})\) at \(t=t_0+1\), and is then instantly extinguished. This last result (from (60)) is not quite right, as it ignores the corrective terms in (56). More precisely, we have from (42), with \(K(a)=\delta (a-1)\) and taking \(t>1\),
and we can use (53) throughout, since although it is inaccurate for \(t>t_0\), U is in any case very small then. Thus the uniform approximate solution for the Soper case is
and the term in square brackets provides the correction to the step function in (60). From this we find the time to peak is
and the peak prevalence is given by
4.2 Accuracy of the Approximations
Figure 1 compares numerical simulations of the model with the weak approximation we have given for the case \(R_0=1.5\). The shapes of the curves and the time to peak are surprisingly well represented, despite the fact that \(\varepsilon =0.5\) is not that small, though the peak values are overestimated.
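The overestimate of the peak can be reproduced for the SIR case with a few lines of code. The sketch below assumes the dimensionless SIR equations \(\dot{u}=-uv\), \(\dot{v}=uv-v\) with \(u+v=R_0\) initially (time in units of T, with u and v the susceptible and infective fractions multiplied by \(R_0\)), integrates them with a fixed-step fourth-order Runge–Kutta scheme, and compares the computed peak of v with the leading-order weak estimate \((R_0-1)^2/2\). The step size, the initial value of v and the function names are illustrative assumptions.

```python
import math


def sir_rhs(u, v):
    """Right-hand side of the dimensionless SIR equations (assumed scaling)."""
    return -u * v, u * v - v


def sir_peak(r0, v0=1e-6, dt=0.01, t_max=200.0):
    """Integrate with classical RK4 and return (time of peak, peak value of v)."""
    u, v, t = r0 - v0, v0, 0.0
    t_peak, v_peak = t, v
    for _ in range(int(round(t_max / dt))):
        k1u, k1v = sir_rhs(u, v)
        k2u, k2v = sir_rhs(u + 0.5 * dt * k1u, v + 0.5 * dt * k1v)
        k3u, k3v = sir_rhs(u + 0.5 * dt * k2u, v + 0.5 * dt * k2v)
        k4u, k4v = sir_rhs(u + dt * k3u, v + dt * k3v)
        u += dt * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += dt
        if v > v_peak:
            t_peak, v_peak = t, v
    return t_peak, v_peak


def weak_peak_sir(r0):
    """Leading-order weak-epidemic estimate of the peak of v for the SIR model."""
    return 0.5 * (r0 - 1.0) ** 2


if __name__ == "__main__":
    r0 = 1.5
    t_peak, v_peak = sir_peak(r0)
    print("numerical peak:", v_peak, "at t =", t_peak)
    print("weak estimate :", weak_peak_sir(r0))
```

For the SIR equations the quantity \(u+v-\ln u\) is conserved, so the numerical peak can also be checked against \(R_0-1-\ln u_0\), where \(u_0=R_0-v_0\) is the initial value of u; expanding this for small \(\varepsilon \) gives \(\varepsilon ^2/2-\varepsilon ^3/3+\cdots \), which shows why the weak estimate lies above the true peak.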
For large \(R_0\gg 1\) the difference between the dynamics for the two extremes of the infectious period distribution is striking (Fig. 2). The assumption of a constant infectious period results in a much faster decline of the epidemic following peak prevalence than for an exponentially distributed infectious period. The approximation to the epidemic curve is almost exact for large \(R_0\) and constant infectious period. In particular, it captures both peak prevalence and the time at which it occurs very well once \(R_0\) is larger than about 3. This is shown in Fig. 3, where we plot the time to peak and peak prevalence for both numerical and asymptotic results as a function of \(R_0\).
These figures provide a gloss on the examples shown in Figs. 1 and 2. It can be seen in Fig. 3 that the weak approximation gives a uniformly excellent approximation to \(t_p\) for the SIR model. In fact it deviates at smaller \(R_0\) (!), due to the fact that for fixed \(I_0\), the initial value of V in (30) increases as \(\varepsilon \) is reduced. This is more clearly visible in Fig. 4 for the Soper model. The strong approximation (for the SIR model), on the other hand, is only reasonably accurate for \(R_0\gtrsim 5\). The peak prevalence is not well approximated in either limit: the weak approximation is useful for \(R_0\lesssim 0.4\), and the strong approximation only becomes useful for \(R_0\gtrsim 10\), as also illustrated by Fig. 2.
For the Soper model, and more generally when there is a peaked infection time distribution, the weak and strong approximations cope better between them. The peak time \(t_p\) is well served by one or other approximation either side of \(R_0=2\); the peak prevalence is less well captured, but still fares much better than the exponential distribution of the SIR model.
5 Conclusions
We have derived analytical approximations to the epidemic curve for small and large \(R_0\) for general infectious period distributions. The weak epidemic limit is well known, but not in its application to such distributions, and particularly, its inadmissibility for heavy-tailed distributions provides a new insight. The extreme version of such a distribution corresponds to an immune carrier which can thus act as a reservoir for the infection. The strong epidemic limit has not, to our knowledge, been studied before.
Approximations to the epidemic curve are not only useful for developing our understanding of the dynamics, but may also be used in situations where large numbers of simulations are required, e.g., for parameter estimation or identifying optimal control strategies for a given set of parameters. The approximations presented here capture the general shape of the epidemic, as characterised by the peak prevalence, the time at which the peak occurs, and the rate of increase and decline of the epidemic. In the case of the constant infectious period and large \(R_0\), the approximation is almost exact.
These analytical expressions extend previous observations on the effect of assumptions regarding the infectious period on the dynamics of the epidemic. For an infection with a given \(R_0\) and mean infectious period, the distribution of this infectious period has an impact on all aspects of the epidemic curve. The differences in the shape of the epidemic curves due to different infectious period distributions are most notable for large \(R_0\), where the epidemic curve for the SIR model is asymmetric. For \(R_0\approx 1\), the epidemic shape is symmetric due to the slow epidemic growth smoothing out the effect of different infectious period distributions. However, different distributions do affect the height and width of this symmetric curve.
The assumption of an exponentially distributed infectious period results in a broad distribution in times from one infection to another (the generation time) meaning that some individuals infect susceptibles much later than others. This serves to smooth out the profile, resulting in a less intense epidemic than one in which the infectious period is fixed. For low \(R_0\), the shapes are qualitatively similar, but both peak prevalence and the time to peak are approximately half that of an epidemic with a constant infectious period. An exponential distribution of infectious periods results not only in a slower epidemic with a lower peak prevalence but it lengthens the tail of the epidemic, giving a more gradual decline in prevalence following saturation than a fixed infectious period. This effect of different infectious periods on the generation time and exponential growth rate has been discussed before (Fraser 2007; Wallinga and Lipsitch 2007; Wearing et al. 2005; Lloyd 2001a; Hethcote and Tudor 1980), but the impact on the second half of the epidemic has not. These extreme differences in the tail of an epidemic demonstrate the importance of quantifying this distribution during a novel outbreak in order to estimate peak prevalence and the timescale of the decline of the epidemic.
There are, of course, limitations to the application of an SIR-type approach, whether with constant or exponentially distributed periods, since more detailed biological effects, such as a period prior to infectivity represented in SEIR models, will affect the dynamics (Wearing et al. 2005). Further, homogeneous models of epidemic spread necessarily make a number of unrealistic assumptions about contact patterns, which have previously been shown to affect the final size of the epidemic (Ma and Earn 2006; Anderson and May 1991; Diekmann and Heesterbeek 2000; Andreasen 2011), and will therefore affect the dynamics during the epidemic. However, the insights about the timing and magnitudes of peak prevalence and the rate of decline of the epidemic are useful for understanding and validating the dynamics of more complex models.
References
Anderson D, Watson R (1980) On the spread of a disease with gamma distributed latent and infectious periods. Biometrika 67(1):191–198
Anderson RM, May RM (1991) Infectious diseases of humans. Oxford University Press, Oxford
Andersson H, Britton T (2000) Stochastic epidemic models and their statistical analysis. Lecture notes in statistics, vol 151. Springer, New York
Andreasen V (2011) The final size of an epidemic and its relation to the basic reproduction number. Bull Math Biol 73:2305–2321
Bailey NTJ (1975) The mathematical theory of infectious diseases and its application, 2nd edn. Griffin, London
Conlan AJK, Rohani P, Lloyd AL, Keeling M, Grenfell BT (2010) Resolving the impact of waiting time distributions on the persistence of measles. J R Soc Interface 7:623–640
Diekmann O, Heesterbeek JAP (2000) Mathematical epidemiology of infectious diseases: model building, analysis and interpretation. Wiley, Chichester
Fowler AC (1982) An asymptotic analysis of the logistic delay equation when the delay is large. IMA J Appl Math 28:41–49
Fraser C (2007) Estimating individual and household reproduction numbers in an emerging epidemic. PLoS ONE 2(8):e758
Hethcote HW, Tudor DW (1980) Integral equation models for endemic infectious diseases. Theor Popul Biol 18:204–243
Keeling MJ, Grenfell BT (1997) Disease extinction and community size: modeling the persistence of measles. Science 275:65–67
Keeling MJ, Rohani P (2007) Modeling infectious diseases in humans and animals. Princeton University Press, Princeton
Kermack WO, McKendrick AG (1927) Contributions to the mathematical theory of epidemics. Proc R Soc A 115:700–721
Lloyd AL (2001a) Destabilization of epidemic models with the inclusion of realistic distributions of infectious periods. Proc R Soc B 268:985–993
Lloyd AL (2001b) Realistic distributions of infectious periods in epidemic models: changing patterns of persistence and dynamics. Theor Popul Biol 60:59–71
Ma J, Earn DJD (2006) Generality of the final size formula for an epidemic of a newly invading infectious disease. Bull Math Biol 68:679–702
Smith CEG (1964) Factors in the transmission of virus infections from animals to man. Sci Basis Med Ann Rev 1964:125–150
Soper HE (1929) The interpretation of periodicity in disease prevalence. J R Stat Soc 92:34–73
Van Dyke MD (1975) Perturbation methods in fluid mechanics. Parabolic Press, Stanford
Wallinga J, Lipsitch M (2007) How generation intervals shape the relationship between growth rates and reproductive numbers. Proc R Soc B 274:599–604
Wearing HJ, Rohani P, Keeling MJ (2005) Appropriate models for the management of infectious diseases. PLoS Med 2(7):e174
Wilson EB, Burke MH (1942) The epidemic curve. Proc Nat Acad Sci 28:361–367
Wilson EB, Worcester J (1944) A second approximation to Soper’s epidemic curve. Proc Nat Acad Sci 30:37–44
Acknowledgments
A. C. F. acknowledges the support of the Mathematics Applications Consortium for Science and Industry (www.macsi.ul.ie) funded by the Science Foundation Ireland grant 12/1A/1683. T. D. H. thanks Imperial College for provision of an Imperial College Junior Research Fellowship.
Cite this article
Fowler, A.C., Déirdre Hollingsworth, T. Simple Approximations for Epidemics with Exponential and Fixed Infectious Periods. Bull Math Biol 77, 1539–1555 (2015). https://doi.org/10.1007/s11538-015-0095-3
DOI: https://doi.org/10.1007/s11538-015-0095-3