1 Introduction

The dynamics of the classic susceptible–infected–recovered (SIR) infectious disease transmission model framework, as first outlined by Kermack and McKendrick (1927), underlies much of our understanding of infectious disease epidemiology (Anderson and May 1991; Diekmann and Heesterbeek 2000; Keeling and Rohani 2007). Important insights from this model framework include the threshold properties of the basic reproductive number, \(R_0\) (Kermack and McKendrick 1927), the critical vaccination proportion (Smith 1964) and the relationship between the epidemic growth rate, \(r_g\), the generation time, \(T_g\), and \(R_0\). Within their classic paper Kermack and McKendrick (1927) not only derived \(R_0\) but also derived an approximation to the epidemic curve for \(R_0\) close to 1 (a ‘weak’ epidemic). Understanding of these various quantities, although they apply only to relatively simple homogeneous models, has proved very useful in developing our understanding of the characteristics of an epidemic.

The classic SIR model was derived as a special case of a more general formulation with infectiousness varying over the course of the time since infection, or across the population (Kermack and McKendrick 1927; Diekmann and Heesterbeek 2000). This form of epidemic model in slightly different forms is also known as the ‘Lotka–Euler’ formulation (Wallinga and Lipsitch 2007) or ‘renewal equation’ (Fraser 2007). Within this framework, the classic SIR model emerges under the assumption that infectious periods are exponentially distributed across the population. This is, of course, unlikely to be the case in reality. Arguably, the most parsimonious representation of a more realistic infectious period is to assume that the infectious period is limited and is the same across all individuals [the Soper model, Soper (1929)]. The assumption of different distributions of infectious periods is known to affect the relationship between the exponential growth rate and the generation time distribution, and estimates of the reproductive number from the epidemic growth rate (Fraser 2007; Wallinga and Lipsitch 2007; Wearing et al. 2005; Lloyd 2001a; Hethcote and Tudor 1980), a crucial estimate in the early stages of a new outbreak.

Despite the impact of different infectious period distributions on the dynamics of the early stages of an epidemic, the ‘final epidemic size’ or the total number of infectives over the course of an epidemic has been shown to be invariant under different assumptions on the distribution of infectious periods and disease course within individuals (Kermack and McKendrick 1927; Bailey 1975; Anderson and Watson 1980; Anderson and May 1991; Andersson and Britton 2000; Diekmann and Heesterbeek 2000), provided there is homogeneous mixing (Ma and Earn 2006; Anderson and May 1991; Diekmann and Heesterbeek 2000; Andreasen 2011).

In the declining stages of an epidemic, or during the decline of a seasonally forced epidemic, the distribution of infectious periods has been shown to destabilise the dynamics (Lloyd 2001a) and to change the dependence of persistence on the population size (Lloyd 2001b), results which were first derived for a model in which the infectious period was exponentially distributed (Keeling and Grenfell 1997). Keeling and Grenfell and Lloyd came to opposite conclusions concerning persistence, but their results were reconciled by Conlan et al. (2010).

Given these insights, it is surprising that there has not been more investigation of the impact of the infectious period distribution on the peak and decline of the SIR-type models. Here we formulate a general approximation to the epidemic curve for any infectious period distribution within a unified framework. We derive approximations to the time course for \(R_0\) close to 1 (‘weak’ epidemics) and, innovatively, for larger \(R_0\) (‘strong’ epidemics). Using these novel approximations, we are able to characterise the impact of infectious period distribution on the time course of the epidemic, including the time to and magnitude of peak prevalence. Despite the simplicity of obtaining numerical solutions of these models, analytic approximations such as those highlighted above are a useful way of characterising the impact of different assumptions on epidemic dynamics, as we demonstrate below.

2 The Generalised Infectious Period Model

We first formulate the general transmission model in which individuals are either susceptible (S), infectious (I) or recovered (R), for a general infectious period distribution; the epidemic is assumed to occur on a fast timescale, so that births and deaths are not modelled. This type of model formulation has been described and analysed several times, most notably by Kermack and McKendrick (1927).

To proceed, we denote by i(a, t) the number density of the infected cohort having had the disease for a period a. Then

$$\begin{aligned} I=\int _0^\infty i\,\hbox {d}a \end{aligned}$$
(1)

is the total number of infectives, assuming that recovery or removal is inevitable (i.e., \(i\rightarrow 0\) as \(a\rightarrow \infty \)). As with age-dependent population models, or time since infection models, i satisfies the partial differential equation

$$\begin{aligned} \frac{\partial {i}}{\partial {t}}+\frac{\partial {i}}{\partial {a}}=-r(a)i, \end{aligned}$$
(2)

where r(a) is the recovery rate, and is taken to be a function of the time since infection. Suitable initial conditions are

$$\begin{aligned} i= & {} 0 \quad \mathrm{at} \quad t=0,\nonumber \\i= & {} i_0(t) \quad \mathrm{at} \quad a=0, \end{aligned}$$
(3)

where the ‘recruitment’ or incidence rate is

$$\begin{aligned} i_0(t)=-\dot{S}=kSI, \end{aligned}$$
(4)

just as in the classic SIR model (Kermack and McKendrick 1927). Integration of (2) leads to

$$\begin{aligned} \dot{I}=i_0(t)-\int _0^\infty r(a)i\,\hbox {d}a. \end{aligned}$$
(5)

We solve (2) using the method of characteristics. In \(t<a\), we have \(i=0\), whilst for \(t>a\), we find

$$\begin{aligned} i=i_0(t-a)\exp \left[ -\int _0^ar(a')\,\hbox {d}a'\right] . \end{aligned}$$
(6)

Putting \(i_0=-\dot{S}\) in this, we find, after integrating by parts, that I is given by

$$\begin{aligned} I=S_0F(t)-S+\int _0^tK(a)S(t-a)\,\hbox {d}a, \end{aligned}$$
(7)

where we can define \(S_0\) to be the total (pre-infection) population of susceptibles. We use the following notation with respect to the recovery, or infectious, period distribution:

$$\begin{aligned} F(a)=\exp \left[ -\int _0^ar(a')\,\hbox {d}a'\right] ,\quad K(a)=-F'(a)=r(a)\exp \left[ -\int _0^ar(a')\,\hbox {d}a'\right] . \end{aligned}$$
(8)

Note that K(a) is the infection time probability density, and that [from (8)]

$$\begin{aligned} \int _0^\infty K(a)\,\hbox {d}a=1. \end{aligned}$$
(9)
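
The integration by parts behind (7) can be sketched as follows, taking the value of S in the boundary term at \(a=t\) to be the pre-infection value \(S_0\), and using \(F(0)=1\) and \(F'=-K\) from (8). Since \(i=0\) for \(a>t\), (1) and (6) with \(i_0=-\dot{S}\) give

$$\begin{aligned} I= & {} -\int _0^t\dot{S}(t-a)F(a)\,\hbox {d}a=\int _0^tF(a)\frac{\partial }{\partial a}\left[ S(t-a)\right] \,\hbox {d}a\nonumber \\= & {} \Big [F(a)S(t-a)\Big ]_{a=0}^{a=t}-\int _0^tF'(a)S(t-a)\,\hbox {d}a\nonumber \\= & {} S_0F(t)-S(t)+\int _0^tK(a)S(t-a)\,\hbox {d}a. \end{aligned}$$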

Using (4), we thus have the generalised Soper model, following the early work by Soper (1929), and its exposition by Wilson and Burke (1942) and Wilson and Worcester (1944),

$$\begin{aligned} \dot{S}=-kSI,\quad I=S_0F(t)-S+\int _0^tK(a)S(t-a)\,\hbox {d}a. \end{aligned}$$
(10)

The pre-infection state \(S=S_0\), \(I=0\) for \(t<0\) is also described by (10), providing we take \(K(a)=0\) and thus \(F(a)=1\) for \(a<0\). The onset of the epidemic is enabled by initial conditions

$$\begin{aligned} I=I_0,\quad S=S_0-I_0\quad \mathrm{at}\quad t=0+, \end{aligned}$$
(11)

and typically we suppose \(I_0\ll S_0\).

2.1 Infectious Period Distributions

Different assumptions regarding the infectious period distributions can be represented by different functional forms of K(a). In these formulations we set the functions to have the same mean infectious period,

$$\begin{aligned} T=\int _0^\infty aK(a)\,\hbox {d}a. \end{aligned}$$
(12)

We additionally define the second moment,

$$\begin{aligned} K_2=\int _0^\infty a^2K(a)\,\hbox {d}a, \end{aligned}$$
(13)

for future use.

In this formulation the classic SIR model, with its exponential decay in infectiousness, corresponds to a recovery rate \(r=1/T\) which is independent of age, and a consequent delay kernel

$$\begin{aligned} K=\frac{1}{T}\exp \left( -\frac{a}{T}\right) \end{aligned}$$
(14)

with mean T and second moment \(2T^2\).

Another plausible assumption is that the infectious period is a fixed constant T. Since F(a) is the fraction of an initial inoculate who still have the disease after period a, we can take \(F=1-H(a-T)\), where H is the Heaviside step function, and thus K is a delta function,

$$\begin{aligned} K=\delta (a-T), \end{aligned}$$
(15)

with mean T and second moment \(T^2\).

More general kernels can be analysed in the same way, including, for example, the gamma distribution

$$\begin{aligned} K=\frac{1}{T}\frac{\gamma ^\gamma }{\Gamma (\gamma )}\left( \frac{a}{T}\right) ^{\gamma -1}\exp \left( -\frac{\gamma a}{T}\right) , \end{aligned}$$
(16)

which has mean T and second moment \(T^2(\gamma +1)/\gamma \) and which takes the limits (14) and (15) when \(\gamma =1\) and \(\gamma \rightarrow \infty \), respectively.

2.2 Nondimensionalisation

We analyse (6), (7) and (10) by first rescaling the variables, thus

$$\begin{aligned}&i=\frac{w}{kT^2},\quad I=\frac{v}{kT},\quad S=\frac{u}{kT},\quad K_2=T^2\kappa _2,\nonumber \\&t\sim T,\quad a\sim T,\quad K\sim \frac{1}{T},\quad r\sim \frac{1}{T}. \end{aligned}$$
(17)

Then we have the dimensionless integrals

$$\begin{aligned} \int _0^\infty K(a)\,\hbox {d}a=1,\quad \int _0^\infty aK(a)\,\hbox {d}a=1,\quad \int _0^\infty a^2K(a)\,\hbox {d}a=\kappa _2, \end{aligned}$$
(18)

and the dimensionless equations for u, v and w can thus be written in the form [bearing in mind (18)]

$$\begin{aligned} \dot{u}= & {} -uv,\nonumber \\v= & {} R_0-u-\int _0^tK(a)[R_0-u(t-a)]\,\hbox {d}a,\nonumber \\w= & {} -\dot{u}(t-a)F(a),\quad v=\int _0^{\infty }w(t,a)\,\hbox {d}a, \end{aligned}$$
(19)

where

$$\begin{aligned} R_0=kTS_0. \end{aligned}$$
(20)

The initial values for u and v are, from (11),

$$\begin{aligned} u=R_0-v_0,\quad v=v_0=kTI_0=\frac{R_0I_0}{S_0}. \end{aligned}$$
(21)

Note also that the initial value of v is assumed small and nonzero. The dimensionless kernels for exponentially distributed, gamma-distributed and fixed infectious periods are

$$\begin{aligned} K(a) = e^{-a}, \qquad K(a) = \frac{\gamma ^\gamma }{\Gamma (\gamma )}a^{\gamma -1}e^{-\gamma a}, \qquad K(a) = \delta (a-1), \end{aligned}$$
(22)

respectively. Note that the mean of each dimensionless kernel is one and the dimensionless second moments are \(\kappa _2= 2\), \((\gamma +1)/\gamma \) and 1, respectively.
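
For the exponential kernel in (22) the dimensionless recovery rate is \(r\equiv 1\), so that (5) reduces to \(\dot{v}=uv-v\) and the model becomes the pair of ordinary differential equations of the classic SIR model. This pair can be integrated directly to provide a numerical reference. The following Python sketch uses a classical fourth-order Runge-Kutta step; the function names, step size and end time are illustrative assumptions and are not taken from the original analysis. The helper `exact_sir_peak` relies on the first integral \(v+u-\ln u\) of the two equations, which gives the maximum of v (attained at \(u=1\)) in closed form and serves as a check on the integration.

```python
import math


def simulate_sir(R0, v0, dt=0.01, t_end=60.0):
    """Integrate du/dt = -u v, dv/dt = u v - v with classical RK4.

    Returns (t_peak, v_peak): the time and value of the largest v found
    on the time grid, in dimensionless units (time scaled with T).
    dt and t_end are illustrative defaults, not values from the paper.
    """
    def rhs(u, v):
        return -u * v, u * v - v

    u, v, t = R0 - v0, v0, 0.0          # initial values from (21)
    t_peak, v_peak = t, v
    for _ in range(int(round(t_end / dt))):
        k1u, k1v = rhs(u, v)
        k2u, k2v = rhs(u + 0.5 * dt * k1u, v + 0.5 * dt * k1v)
        k3u, k3v = rhs(u + 0.5 * dt * k2u, v + 0.5 * dt * k2v)
        k4u, k4v = rhs(u + dt * k3u, v + dt * k3v)
        u += dt * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += dt
        if v > v_peak:
            t_peak, v_peak = t, v
    return t_peak, v_peak


def exact_sir_peak(R0, v0):
    """Maximum of v from the conserved quantity v + u - ln(u).

    Valid when the initial u = R0 - v0 exceeds 1, so that v first grows.
    """
    return R0 - 1.0 - math.log(R0 - v0)


if __name__ == "__main__":
    R0, frac0 = 1.5, 1e-3               # illustrative values only
    t_peak, v_peak = simulate_sir(R0, R0 * frac0)
    print("time to peak (units of T):", t_peak)
    print("peak prevalence I_max/S_0:", v_peak / R0)
    print("closed-form check        :", exact_sir_peak(R0, R0 * frac0) / R0)
```

The fixed-period (Soper) kernel leads to a delay equation and would need a different integrator; it is not covered by this sketch.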

2.3 Initial Growth

We can find the initial growth rate of the epidemic for general infectious period distributions. We first put \(u=R_0-v_0 e^{\lambda t}\), and expanding (19) for small \(v_0\) and large t, we find

$$\begin{aligned} \lambda =R_0\left[ 1-\int _0^\infty K(a)e^{-\lambda a}\,\hbox {d}a\right] , \end{aligned}$$
(23)

which has a unique positive root if \(R_0>1\); we thus identify \(R_0\) as the basic reproductive number of the epidemic for the general infectious period distribution.

For the gamma-distributed infectious period kernel in (22), the dimensionless epidemic growth rate satisfies

$$\begin{aligned} \lambda =R_0\left[ 1-\left( \frac{\gamma }{\gamma +\lambda }\right) ^\gamma \right] , \end{aligned}$$
(24)

and for the particular cases \(\gamma =1\) (SIR model) and \(\gamma =\infty \) (Soper model), we find

$$\begin{aligned}&\lambda _\mathrm{SIR} = R_0 -1, \,\,\gamma =1, \nonumber \\&\lambda _\mathrm{Soper}=R_0\left( 1-e^{-\lambda }\right) , \,\,\gamma = \infty . \end{aligned}$$
(25)
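
Apart from the case \(\gamma =1\), the relations (24) and (25) define \(\lambda \) only implicitly. As an illustration, the positive root can be found numerically; the sketch below uses simple bisection on the interval \((0,R_0]\), where the residual of (24) is negative just above zero (for \(R_0>1\)) and positive at \(\lambda =R_0\). The function names and the choice of bisection are ours, and the lower bracket assumes that \(R_0\) is not extremely close to one.

```python
import math


def kernel_transform(lam, gamma):
    """Integral of K(a) exp(-lam a) for the dimensionless kernels in (22).

    gamma = 1 is the exponential (SIR) kernel; gamma = math.inf is the
    delta-function (Soper) kernel.
    """
    if math.isinf(gamma):
        return math.exp(-lam)
    return (gamma / (gamma + lam)) ** gamma


def growth_rate(R0, gamma=1.0, iterations=200):
    """Positive root of lam = R0 * (1 - transform(lam)), cf. (23)-(25)."""
    if R0 <= 1.0:
        raise ValueError("a positive root requires R0 > 1")

    def residual(lam):
        return lam - R0 * (1.0 - kernel_transform(lam, gamma))

    # Assumption: R0 - 1 is large enough that residual(1e-9) < 0.
    lo, hi = 1e-9, R0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for gamma in (1.0, 4.0, math.inf):
        print("gamma =", gamma, " lambda =", growth_rate(2.0, gamma))
```

For \(\gamma =1\) the routine should reproduce \(\lambda =R_0-1\), which provides a simple consistency check.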

Whilst these approximations to the early epidemic growth rate are useful, they do not tell us about the dynamics of the whole epidemic. We now present approximations to the whole epidemic curve firstly for epidemics for which \(R_0\) is close to one (“weak” epidemics), and then for large \(R_0\) (“strong” epidemics).

3 Weak Epidemics

For the case where \(R_0\approx 1\), Kermack and McKendrick derived a classic approximation to the epidemic curve for the model with exponentially distributed infectious periods. Soper derived a similar expression for the model with fixed infectious periods. We rederive these expressions by showing that this approximation can be generalised for any infectious period distribution with finite second moment.

We first define

$$\begin{aligned} R_0=1+\varepsilon , \end{aligned}$$
(26)

and take \(\varepsilon \ll 1\). We then rescale the variables by writing

$$\begin{aligned} t=\frac{\tau }{\varepsilon },\quad u=1-\varepsilon U(\tau ),\quad v=\varepsilon ^2V. \end{aligned}$$
(27)

Substituting these changes into (19), using (18), we obtain

$$\begin{aligned} \dot{U}= & {} (1-\varepsilon U)V,\nonumber \\V= & {} \frac{1}{\varepsilon }\left[ U-\int _0^\infty K(a)U(\tau -\varepsilon a)\,\hbox {d}a\right] , \end{aligned}$$
(28)

where the overdot denotes differentiation with respect to \(\tau \), and we have replaced the upper limit on the integral by \(\infty \) on the basis that we have

$$\begin{aligned} u(t)\equiv R_0,\quad v(t)\equiv 0\quad \mathrm{for}\quad t<0 \end{aligned}$$
(29)

in (19) (the epidemic is initiated at \(t=0\)). The initial conditions are, from (21),

$$\begin{aligned} U\approx -1,\quad V=\frac{v_0}{\varepsilon ^2}\quad \mathrm{at}\quad t=0, \end{aligned}$$
(30)

and we assume that \(v_0\ll \varepsilon ^2\).

We now expand \(U(\tau -\varepsilon a)\) in the integral in a Taylor series, and, using (18), this leads to

$$\begin{aligned} V=\dot{U}-\tfrac{1}{2}\varepsilon \kappa _2\ddot{U}+\cdots ; \end{aligned}$$
(31)

substituting this into (28)\(_1\), using the boundary conditions

$$\begin{aligned} U\approx -1,\quad \dot{U}\approx 0\quad \mathrm{at}\quad \tau =0 \end{aligned}$$
(32)

[the latter from (31)] then leads to the leading order equation

$$\begin{aligned} \dot{U}\approx \frac{(1-U^2)}{\kappa _2}, \end{aligned}$$
(33)

providing the second moment \(\kappa _2\) exists, essentially equivalent to requiring that \(K(a)\ll \dfrac{1}{a^3}\) for large a. For a heavy-tailed distribution with unbounded second moment, a more elaborate procedure would be necessary. We do not pursue this here, but note that the breakdown of the method is associated with the nonuniform convergence of the Taylor expansion of \(U(\tau -\varepsilon a)\) for large a, because of (29). The correct procedure can be obtained by replacing the upper limit in the integral in (28)\(_2\) by \(\tau /\varepsilon \).
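
The step from (28)\(_1\) and (31) to (33) can be made explicit; the following is a sketch which retains terms up to \(O(\varepsilon )\). Substituting (31) into (28)\(_1\) and cancelling the leading terms gives

$$\begin{aligned} \dot{U}=(1-\varepsilon U)\left( \dot{U}-\tfrac{1}{2}\varepsilon \kappa _2\ddot{U}+\cdots \right) \quad \Longrightarrow \quad \tfrac{1}{2}\kappa _2\ddot{U}\approx -U\dot{U}, \end{aligned}$$

and one integration, with the constant fixed by (32), yields

$$\begin{aligned} \tfrac{1}{2}\kappa _2\dot{U}\approx \tfrac{1}{2}\left( 1-U^2\right) , \end{aligned}$$

which is (33).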

The solution of Eq. (33) is

$$\begin{aligned} U = \tanh \left( \frac{\tau -\tau _p}{\kappa _2}\right) . \end{aligned}$$
(34)

Therefore

$$\begin{aligned} u = 1-\varepsilon \tanh \left( \frac{\tau -\tau _p}{\kappa _2}\right) \end{aligned}$$
(35)

and

$$\begin{aligned} v = \frac{\varepsilon ^2}{\kappa _2}\, \hbox {sech}\,^2 \left( \frac{\tau -\tau _p}{\kappa _2}\right) . \end{aligned}$$
(36)

This shows that the approximation to the epidemic curve for low \(R_0\) by Kermack and McKendrick (1927) for the exponential distribution and Wilson and Worcester (1944) for fixed infectious periods is generalisable to any infectious period distribution with a finite second moment.

If we first compare the result of this approximation for the SIR model and the constant infectious period (Soper) model, the approximations are

$$\begin{aligned} v_\mathrm{SIR}\approx & {} \frac{\left( R_0-1\right) ^2}{2}\, \hbox {sech}\,^2\left\{ \tfrac{1}{2}\left( R_0-1\right) \left( t-t_p\right) \right\} , \nonumber \\v_\mathrm{Soper}\approx & {} \left( R_0-1\right) ^2\, \hbox {sech}\,^2\left\{ \left( R_0-1\right) \left( t-t_p\right) \right\} . \end{aligned}$$
(37)

Of note here is the factor of two, which means that the epidemic with a constant infectious period will grow (and therefore decay) more rapidly than that with an exponentially distributed infectious period. It also means that the approximate maximum prevalence for the constant infectious period model, \(\left( R_0-1\right) ^2\), is twice as big as for the SIR model. This effect of a constant infectious period on shortening the ‘generation time’ (time from infection to onward transmission) has been previously noted by, amongst others, Diekmann and Heesterbeek (2000) and Wallinga and Lipsitch (2007), but its effect on peak prevalence has not been previously approximated.

For gamma-distributed infectious periods, the epidemic curve is approximated by

$$\begin{aligned} v \approx \frac{\gamma \left( R_0-1\right) ^2}{\gamma +1} \,\hbox {sech}\,^2\left\{ \frac{\gamma \left( R_0-1\right) (t-t_p)}{\left( \gamma +1\right) }\right\} , \end{aligned}$$
(38)

and the bigger the shape parameter, \(\gamma \) (resulting in smaller variance in infectious periods) the higher the peak prevalence and the shorter is the duration of the outbreak.

3.1 Peak Prevalence and Time to Peak

From (36), (17), (20) and (26), the peak prevalence P, defined as the ratio of the maximum infected number \(I_{\max }\) to the total population \(S_0\) is, for a weak epidemic,

$$\begin{aligned} P=\frac{I_{\max }}{S_0}=\frac{(R_0-1)^2}{R_0\kappa _2}. \end{aligned}$$
(39)

If the initial infected population consists of \(I_0\) individuals, then the initial value of v is given by (21), and since by assumption this is very small, we can suppose v reaches its maximum when \(\tau \) is large, in which case we can use the approximation \(\hbox {sech}\,(-\theta )\approx 2e^{-\theta }\), and the dimensionless time to peak prevalence (scaled with T), is then found from (36) to be

$$\begin{aligned} t_p=\frac{\kappa _2}{2(R_0-1)}\ln \left[ \frac{4(R_0-1)^2S_0}{\kappa _2R_0I_0}\right] . \end{aligned}$$
(40)
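
The weak-epidemic formulas (36), (39) and (40) are simple to evaluate. The following Python sketch does so for a given \(R_0\), dimensionless second moment \(\kappa _2\) and initial fraction \(I_0/S_0\); the function and argument names are illustrative assumptions, and the parameter values in the example are those quoted in the caption of Fig. 1.

```python
import math


def weak_peak_prevalence(R0, kappa2):
    """Peak prevalence I_max/S_0 from (39)."""
    return (R0 - 1.0) ** 2 / (R0 * kappa2)


def weak_time_to_peak(R0, kappa2, frac0):
    """Dimensionless time to peak from (40); frac0 stands for I_0/S_0."""
    eps = R0 - 1.0
    arg = 4.0 * eps ** 2 / (kappa2 * R0 * frac0)
    return kappa2 / (2.0 * eps) * math.log(arg)


def weak_prevalence(t, R0, kappa2, frac0):
    """Prevalence I/S_0 = v/R0 at dimensionless time t, from (36)."""
    eps = R0 - 1.0
    t_p = weak_time_to_peak(R0, kappa2, frac0)
    x = eps * (t - t_p) / kappa2
    return weak_peak_prevalence(R0, kappa2) / math.cosh(x) ** 2


if __name__ == "__main__":
    R0, frac0, T_days = 1.5, 1e-3, 2.0   # values from the Fig. 1 caption
    for name, kappa2 in (("SIR", 2.0), ("Soper", 1.0)):
        t_p = weak_time_to_peak(R0, kappa2, frac0)
        print(name,
              "peak prevalence:", weak_peak_prevalence(R0, kappa2),
              "time to peak (days):", T_days * t_p)
```

Multiplying the dimensionless time by the mean infectious period T converts it to physical units, as in the last line of the example.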

4 Strong Epidemics

Now we consider the case \(R_0 \gg 1\), for which we devise an asymptotic method similar to that used by Fowler (1982). First we rescale the variables as follows:

$$\begin{aligned} u=R_0U, \quad v=R_0V, \end{aligned}$$
(41)

so that

$$\begin{aligned} \delta \dot{U}= & {} -UV,\nonumber \\V= & {} 1-U -\int _0^{t}K(a)\left[ 1-U\left( t-a\right) \right] \,\hbox {d}a, \end{aligned}$$
(42)

where

$$\begin{aligned} \delta =\frac{1}{R_0}\ll 1. \end{aligned}$$
(43)

There is an initial phase where \(U\approx 1\), and we have

$$\begin{aligned} U\approx 1-\frac{I_0}{S_0}e^{\lambda t}, \end{aligned}$$
(44)

where \(\lambda \) is given by (23), using also the fact that \(u+v=R_0\) at \(t=0\), and \(v_0\) is given by (21). Since \(R_0\gg 1\), the application of Laplace’s method to (23) shows that

$$\begin{aligned} \lambda \approx R_0. \end{aligned}$$
(45)

Note that (44) can thus be written in the form

$$\begin{aligned} U=1-\exp \left( \frac{t-t_0}{\delta }\right) , \end{aligned}$$
(46)

where

$$\begin{aligned} t_0=\frac{1}{R_0}\ln \left( \frac{S_0}{I_0}\right) . \end{aligned}$$
(47)

The approximation becomes invalid when \(t\approx t_0\), and the appropriate rescaling of (42) is done by choosing

$$\begin{aligned} t=t_0+\delta \tau . \end{aligned}$$
(48)

Equations (42) become

$$\begin{aligned} U'= & {} -UV,\nonumber \\V= & {} 1-U -\int _0^{t_0+\delta \tau }K(a)\left[ 1-U\left( t_0+\delta \tau -a\right) \right] \,\hbox {d}a, \end{aligned}$$
(49)

where the prime denotes differentiation with respect to \(\tau \).

For small \(\delta \), (46) implies that \(U(t)\approx 1\) for \(t<t_0\), and this implies that the integral in (49) is small, so that V can be approximated by

$$\begin{aligned} V\approx 1- U, \end{aligned}$$
(50)

and therefore

$$\begin{aligned} U'\approx -U\left( 1-U\right) . \end{aligned}$$
(51)

Note that (46) implies

$$\begin{aligned} U=1-e^\tau \end{aligned}$$
(52)

for \(\tau <0\), and the solution of (51) which matches to this as \(\tau \rightarrow -\infty \) is

$$\begin{aligned} U\approx \frac{1}{1+e^{\tau }}, \quad V\approx \frac{e^{\tau }}{1+e^{\tau }}. \end{aligned}$$
(53)

The solution in (53) is a monotonic solution in which the number of infectives rapidly increases to a peak at \(V\approx 1\), i.e., \(v\approx R_0\), whilst U decreases towards zero: everybody gets infected! However, the approximation (50) and therefore (51) clearly break down when \(\tau \sim 1/\delta \), and a further rescaling is then necessary.

As \(\tau \) becomes large, we rescale back to the original time scale \(t=t_0+\delta \tau \). Note that then \(U\sim e^{-\tau }=e^{-(t-t_0)/\delta }\), and this suggests we write

$$\begin{aligned} U=\exp \left( -\frac{\phi }{\delta }\right) ,\quad t>t_0, \end{aligned}$$
(54)

with

$$\begin{aligned} \phi \sim t-t_0\quad \mathrm{as}\quad t\rightarrow t_0; \end{aligned}$$
(55)

then (42) becomes

$$\begin{aligned} \dot{\phi }= & {} V,\nonumber \\V= & {} 1-\exp \left[ -\frac{\phi }{\delta }\right] -\int _0^{t}K(a)\left[ 1- U(t-a)\right] \,\hbox {d}a, \end{aligned}$$
(56)

together with the matching condition (55).

In the integral, we may take \(U(t-a)\approx 1\) for \(t-a<t_0\), whilst \(U(t-a)=\exp \left[ -\dfrac{\phi (t-a)}{\delta }\right] \) for \(t-a>t_0\). The exponential terms are small and can be neglected, and therefore

$$\begin{aligned} \dot{\phi }=V\approx 1-\int _{0}^{t-t_0}K(a)\,\hbox {d}a=\int _{t-t_0}^{\infty }K(a)\,\hbox {d}a, \end{aligned}$$
(57)

with \(\phi \sim t-t_0\) as \(t\rightarrow t_0\), and thus, interchanging the order of integration in the quadrature for \(\phi \),

$$\begin{aligned} \phi =\int _0^\infty \min (a,t-t_0)K(a)\,\hbox {d}a,\quad t>t_0, \end{aligned}$$
(58)

and \(\phi \rightarrow 1\) as \(t\rightarrow \infty \). Thus U reaches equilibrium and V declines to zero; no further approximations are necessary.

Because the approximation has two distinct phases, it is less easy to extract such quantities as peak prevalence and time to peak. To do this, we can write a uniformly asymptotic approximation. We write the small and large time approximations in terms of t, thus

$$\begin{aligned}&u=\frac{R_0}{1+e^{R_0(t-t_0)}},\quad v=\frac{R_0}{1+e^{-R_0(t-t_0)}},\quad t\lesssim t_0,\nonumber \\&u = R_0 \exp \left[ -R_0 \int _0^\infty \min (a,t-t_0)K(a)\,\hbox {d}a \right] ,\quad v=R_0\int _{t-t_0}^\infty K(a)\,\hbox {d}a,\quad t>t_0.\nonumber \\ \end{aligned}$$
(59)

A uniform approximation is essentially obtained by adding the two approximations and subtracting the common part; for details see Dyke (1975). In the present case we can write a uniform approximation by inspection. This is

$$\begin{aligned} u\approx \frac{R_0}{1+\exp \left[ R_0\int _0^\infty \min (a,t-t_0)K(a)\,\hbox {d}a\right] },\quad v=\frac{R_0}{1+e^{-R_0(t-t_0)}}\int _{t-t_0}^\infty K(a)\,\hbox {d}a, \end{aligned}$$
(60)

providing we extend the definition of K so that \(K(a)=0\) for \(a<0\); it is clear that these expressions reduce to both approximations in (59) in the appropriate time range.
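
To illustrate how (60) is evaluated in practice, the following Python sketch treats the exponential kernel \(K(a)=e^{-a}\), for which \(\int _{s}^\infty K\,\hbox {d}a=e^{-s}\) and \(\int _0^\infty \min (a,s)K\,\hbox {d}a=1-e^{-s}\) when \(s=t-t_0>0\); for \(s\le 0\) the extension \(K(a)=0\) for \(a<0\) gives the values 1 and s, respectively. The function names are illustrative assumptions, and the logistic factor is evaluated in a form that avoids floating-point overflow.

```python
import math


def logistic(x):
    """1 / (1 + exp(-x)), evaluated without overflow for large |x|."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    ex = math.exp(x)
    return ex / (1.0 + ex)


def t0_strong(R0, frac0):
    """Onset time (47); frac0 stands for I_0/S_0."""
    return math.log(1.0 / frac0) / R0


def uniform_sir(t, R0, frac0):
    """Uniform approximation (60) for K(a) = exp(-a); returns (u, v)."""
    s = t - t0_strong(R0, frac0)
    if s > 0.0:
        survival = math.exp(-s)       # integral of K from s to infinity
        phi = 1.0 - math.exp(-s)      # integral of min(a, s) K(a)
    else:
        survival = 1.0                # K(a) = 0 for a < 0
        phi = s
    u = R0 * logistic(-R0 * phi)      # R0 / (1 + exp(R0 * phi))
    v = R0 * logistic(R0 * s) * survival
    return u, v


if __name__ == "__main__":
    R0, frac0 = 10.0, 1e-3            # illustrative values only
    for t in (0.0, 0.5, 1.0, 2.0, 4.0):
        u, v = uniform_sir(t, R0, frac0)
        print("t =", t, " S/S_0 =", u / R0, " I/S_0 =", v / R0)
```

Other kernels can be handled in the same way once the two integrals are available, either in closed form or by quadrature.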

4.1 Peak Prevalence and Time to Peak

The peak time is approximately \(t_0\) given by (47), but the peak value is not well constrained. To find this, we use the uniform approximation for v to find the time where it is maximum; this is the peak time \(t_p\). It is given implicitly by

$$\begin{aligned} t_p=t_0+t',\quad t'=\frac{1}{R_0}\ln \left[ \frac{R_0F(t')-K(t')}{K(t')}\right] , \end{aligned}$$
(61)

thus

$$\begin{aligned} t_p\approx \frac{1}{R_0}\ln \left[ \frac{S_0(R_0F-K)}{I_0K}\right] . \end{aligned}$$
(62)
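
The condition (61) follows from setting the derivative of v in (60) to zero. As a sketch, write \(s=t-t_0\) and \(F(s)=\int _s^\infty K(a)\,\hbox {d}a\), so that \(F'=-K\) and \(v=R_0F(s)/(1+e^{-R_0s})\); then

$$\begin{aligned} \frac{\hbox {d}v}{\hbox {d}s}=\frac{R_0}{\left( 1+e^{-R_0s}\right) ^2}\left[ R_0F(s)e^{-R_0s}-K(s)\left( 1+e^{-R_0s}\right) \right] =0\quad \Longrightarrow \quad e^{R_0s}=\frac{R_0F(s)-K(s)}{K(s)}, \end{aligned}$$

which is (61) with \(s=t'\). At this point \(1+e^{-R_0t'}=R_0F/(R_0F-K)\), so that the corresponding value of v is \(R_0F(t')-K(t')\).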

Evidently \(t'\) is small, so that \(F\approx 1\), but the precise expression for \(t'\) depends critically on the behaviour of the distribution kernel K(a) near \(a=0\). For the gamma distribution (22), we have

$$\begin{aligned} K(a)\approx \dfrac{\gamma ^\gamma c^{\gamma -1}}{\Gamma (\gamma )R_0^{\gamma -1}}\quad \mathrm{for}\quad a= \dfrac{c}{R_0}\ll 1, \end{aligned}$$
(63)

and in that case

$$\begin{aligned} t'\approx \frac{c}{R_0},\quad c=\ln \left[ \frac{R_0^\gamma \Gamma (\gamma )}{\gamma ^\gamma c^{\gamma -1}}\right] , \end{aligned}$$
(64)

so that

$$\begin{aligned} t'\approx \frac{\gamma \ln R_0}{R_0},\quad t_p\approx \frac{1}{R_0}\ln \left[ \frac{S_0R_0^\gamma }{I_0}\right] . \end{aligned}$$
(65)
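
The relation for c in (64) is implicit when \(\gamma \ne 1\). One simple way to evaluate it, offered here only as an illustration, is fixed-point iteration starting from the leading-order value \(c=\gamma \ln R_0\). The iteration is our choice and is not part of the analysis above; it is not guaranteed to converge for all \(\gamma \) and \(R_0\), and the result is only meaningful when \(c/R_0\) is small, as assumed in (63). The function names are hypothetical.

```python
import math


def solve_c(R0, gamma, iterations=200):
    """Fixed-point iteration for c in (64).

    Iterates c <- gamma ln R0 + ln Gamma(gamma) - gamma ln gamma
                  - (gamma - 1) ln c,
    starting from the leading-order guess c = gamma ln R0.
    """
    const = gamma * math.log(R0) + math.lgamma(gamma) - gamma * math.log(gamma)
    c = gamma * math.log(R0)
    for _ in range(iterations):
        if c <= 0.0:
            raise ValueError("iteration left the region c > 0")
        c = const - (gamma - 1.0) * math.log(c)
    if c <= 0.0:
        raise ValueError("iteration left the region c > 0")
    return c


def peak_offset(R0, gamma):
    """Offset t' = c / R0 of the peak time from t_0, cf. (64)."""
    return solve_c(R0, gamma) / R0


if __name__ == "__main__":
    R0 = 10.0                          # illustrative value only
    for gamma in (1.0, 2.0, 3.0):
        print("gamma =", gamma,
              " t' from (64):", peak_offset(R0, gamma),
              " leading order:", gamma * math.log(R0) / R0)
```

For \(\gamma =1\) the logarithm of c drops out and the iteration returns \(c=\ln R_0\) immediately.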

From (60), the maximum of v is approximately \(R_0F-K\), so that the peak infected population is to leading order the whole population. More accurately, the peak prevalence

$$\begin{aligned} P=\frac{I_{\max }}{S_0}=\frac{v_{\max }}{R_0}=F-\frac{K}{R_0}\approx 1-\frac{(\gamma c)^{\gamma -1}(c+\gamma )}{\Gamma (\gamma )R_0^\gamma }, \end{aligned}$$
(66)

this last expression being for the gamma distribution. For the SIR problem for which \(\gamma =1\), and \(F(a)=K(a)=e^{-a}\), we have more directly from (62) that

$$\begin{aligned} t_p\approx \frac{1}{R_0}\ln \left[ \frac{(R_0-1)S_0}{I_0}\right] , \end{aligned}$$
(67)

and using this directly in (60) yields the peak prevalence as

$$\begin{aligned} P\approx \frac{(R_0-1)^{(R_0-1)/R_0}}{R_0}. \end{aligned}$$
(68)

The limit in which \(\gamma \rightarrow \infty \) corresponds to the Soper problem where \(K(a)=\delta (a-1)\), and (61) is irrelevant. Direct inspection of (60) shows that in this case v rapidly rises, reaches a maximum \(\approx R_0(1-e^{-R_0})\) at \(t=t_0+1\), and is then instantly extinguished. This last result (from (60)) is not quite right, as it ignores the corrective terms in (56). More precisely, we have from (42), with \(K(a)=\delta (a-1)\) and taking \(t>1\),

$$\begin{aligned} V=U(t-1)-U(t), \end{aligned}$$
(69)

and we can use (53) throughout, since although it is inaccurate for \(t>t_0\), U is in any case very small then. Thus the uniform approximate solution for the Soper case is

$$\begin{aligned} v=\frac{R_0}{1+e^{-R_0(t-t_0)}}\left[ \frac{1-e^{-R_0}}{1+e^{R_0(t-t_0-1)}}\right] , \end{aligned}$$
(70)

and the term in square brackets provides the correction to the step function in (60). From this we find the time to peak is

$$\begin{aligned} t_p=t_0+\tfrac{1}{2}, \end{aligned}$$
(71)

and the peak prevalence is given by

$$\begin{aligned} P=\frac{v_{\max }}{R_0}=\frac{1-e^{-R_0}}{\left( 1+e^{-\frac{1}{2}R_0}\right) ^2}. \end{aligned}$$
(72)
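
The Soper results (70)–(72) can be checked against each other numerically. The following Python sketch evaluates the prevalence \(v/R_0\) from (70) and compares its value at \(t=t_0+\tfrac{1}{2}\) with (72); the function names are illustrative assumptions, and the logistic factors are written in a form that avoids overflow at large times.

```python
import math


def logistic(x):
    """1 / (1 + exp(-x)), evaluated without overflow for large |x|."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    ex = math.exp(x)
    return ex / (1.0 + ex)


def soper_strong_prevalence(t, R0, frac0):
    """Prevalence v/R0 from the uniform Soper approximation (70)."""
    t0 = math.log(1.0 / frac0) / R0                  # onset time (47)
    x = R0 * (t - t0)
    rise = logistic(x)                               # 1/(1 + exp(-x))
    fall = (1.0 - math.exp(-R0)) * logistic(R0 - x)  # bracket in (70)
    return rise * fall


def soper_strong_peak(R0, frac0):
    """Time to peak (71) and peak prevalence (72)."""
    t_peak = math.log(1.0 / frac0) / R0 + 0.5
    p_peak = (1.0 - math.exp(-R0)) / (1.0 + math.exp(-0.5 * R0)) ** 2
    return t_peak, p_peak


if __name__ == "__main__":
    R0, frac0 = 5.0, 1e-3              # values from the Fig. 2 caption
    t_peak, p_peak = soper_strong_peak(R0, frac0)
    print("time to peak (units of T):", t_peak)
    print("peak prevalence from (72):", p_peak)
    print("value of (70) at the peak:",
          soper_strong_prevalence(t_peak, R0, frac0))
```

The two printed prevalence values should agree to rounding error, since (72) is (70) evaluated at the time given by (71).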

4.2 Accuracy of the Approximations

Fig. 1
Comparison between numerical simulations of the model (solid (red) lines) and approximations to the epidemic curve for \(P=I/S_0\) for the weak approximation (dashed (blue) lines), using a value of \(R_0=1.5\) and a mean recovery time of 2 days. The left hand figure is for the SIR model, and the right hand one is for the Soper model. The initial fraction of infectives \(I_0/S_0=10^{-3}\). Notice the different scales on the axes (Color figure online)

Fig. 2
Comparison between numerical simulations of the model (solid (red) lines) and approximations to the epidemic curve for \(P=I/S_0\) for the strong approximation (dashed (blue or green) lines), using a value of \(R_0=10\) for the SIR model (left) and \(R_0=5\) for the Soper model (right); the mean recovery time is 2 days. The initial fraction of infectives \(I_0/S_0=10^{-3}\). Note that the Soper approximation is almost exact (Color figure online)

Figure 1 compares numerical simulations of the model with the weak approximation we have given for the case \(R_0=1.5\). The shapes of the curves and the time to peak are surprisingly well represented, despite the fact that \(\varepsilon =0.5\) is not that small, though the peak values are overestimated.

For \(R_0\gg 1\) the difference between the dynamics for the two extremes of the infectious period distribution is striking (Fig. 2). The assumption of a constant infectious period results in a much faster decline of the epidemic following peak prevalence than for an exponentially distributed infectious period. The approximation to the epidemic curve is almost exact for large \(R_0\) and constant infectious period. In particular, it captures both peak prevalence and the time at which it occurs very well once \(R_0\) is larger than about 3. This is shown in Figs. 3 and 4, where we plot the time to peak and peak prevalence for both numerical and asymptotic results as a function of \(R_0\).

These figures provide a gloss on the examples shown in Figs. 1 and 2. It can be seen in Fig. 3 that the weak approximation gives a uniformly excellent approximation to \(t_p\) for the SIR model. In fact it deviates at smaller \(R_0\) (!), due to the fact that for fixed \(I_0\), the initial value of V in (30) increases as \(\varepsilon \) is reduced. This is more clearly visible in Fig. 4 for the Soper model. The strong approximation (for the SIR model), on the other hand, is only reasonably accurate for \(R_0\gtrsim 5\). The peak prevalence is not well approximated in either limit: the weak approximation is useful for \(R_0-1\lesssim 0.4\), and the strong approximation only becomes useful for \(R_0\gtrsim 10\), as also illustrated by Fig. 2.

Fig. 3
Comparison of time to peak \(t_p\) approximations given by (40) (with \(\kappa _2=2\)) and (67) with direct numerical simulations for the SIR model (left figure), and the equivalent approximations to the peak prevalence \(P=v_{\max }/R_0\) given by (39) (with \(\kappa _2=2\)) and (68) together with the direct numerical simulation. The solid (red) curves are the numerical solution, dashed (blue) curves are the weak approximations, and the dotted (mauve) curves are the strong approximation. Note that the times to peak here are dimensionless (Color figure online)

Fig. 4
Comparison of time to peak \(t_p\) approximations given by (40) (with \(\kappa _2=1\)) and (71) with direct numerical simulations for the Soper model (left figure), and the equivalent approximations to the peak prevalence \(P=v_{\max }/R_0\) given by (39) (with \(\kappa _2=1\)) and (72) together with the direct numerical simulation. The solid (red) curves are the numerical solution, dashed (blue) curves are the weak approximations, and the dotted (mauve) curves are the strong approximation. Note that the times to peak here are dimensionless (Color figure online)

For the Soper model, and more generally when there is a peaked infection time distribution, the weak and strong approximations cope better between them. The peak time \(t_p\) is well served by one or other approximation either side of \(R_0=2\); the peak prevalence is less well captured, but still fares much better than for the exponential distribution of the SIR model.

5 Conclusions

We have derived analytical approximations to the epidemic curve for small and large \(R_0\) for general infectious period distributions. The weak epidemic limit is well known, but not in its application to such distributions, and particularly, its inadmissibility for heavy-tailed distributions provides a new insight. The extreme version of such a distribution corresponds to an immune carrier which can thus act as a reservoir for the infection. The strong epidemic limit has not, to our knowledge, been studied before.

Approximations to the epidemic curve are not only useful for developing our understanding of the dynamics, but may also be used in situations where large numbers of simulations are required, e.g., for parameter estimation or identifying optimal control strategies for a given set of parameters. The approximations presented here capture the general shape of the epidemic, as characterised by the peak prevalence, the time at which the peak occurs, and the rate of increase and decline of the epidemic. In the case of the constant infectious period and large \(R_0\), the approximation is almost exact.

These analytical expressions extend previous observations on the effect of assumptions regarding the infectious period on the dynamics of the epidemic. For an infection with a given \(R_0\) and mean infectious period, the distribution of this infectious period has an impact on all aspects of the epidemic curve. The differences in the shape of the epidemic curves due to different infectious period distributions are most notable for large \(R_0\), where the epidemic curve for the SIR model is asymmetric. For \(R_0\approx 1\), the epidemic shape is symmetric due to the slow epidemic growth smoothing out the effect of different infectious period distributions. However, different distributions do affect the height and width of this symmetric curve.

The assumption of an exponentially distributed infectious period results in a broad distribution in times from one infection to another (the generation time) meaning that some individuals infect susceptibles much later than others. This serves to smooth out the profile, resulting in a less intense epidemic than one in which the infectious period is fixed. For low \(R_0\), the shapes are qualitatively similar, but both the peak prevalence and the rate of growth and decline are approximately half those of an epidemic with a constant infectious period [see (37)]. An exponential distribution of infectious periods results not only in a slower epidemic with a lower peak prevalence but it lengthens the tail of the epidemic, giving a more gradual decline in prevalence following saturation than a fixed infectious period. This effect of different infectious periods on the generation time and exponential growth rate has been discussed before (Fraser 2007; Wallinga and Lipsitch 2007; Wearing et al. 2005; Lloyd 2001a; Hethcote and Tudor 1980), but the impact on the second half of the epidemic has not. These extreme differences in the tail of an epidemic demonstrate the importance of quantifying this distribution during a novel outbreak in order to estimate peak prevalence and the timescale of the decline of the epidemic.

There are, of course, limitations to the application of an SIR-type approach, whether with constant or exponentially distributed periods, since more detailed biological effects, such as a period prior to infectivity represented in SEIR models, will affect the dynamics (Wearing et al. 2005). Further, homogeneous models of epidemic spread necessarily make a number of unrealistic assumptions about contact patterns, which have previously been shown to affect the final size of the epidemic (Ma and Earn 2006; Anderson and May 1991; Diekmann and Heesterbeek 2000; Andreasen 2011), and will therefore affect the dynamics during the epidemic. However, the insights about the timing and magnitudes of peak prevalence and the rate of decline of the epidemic are useful for understanding and validating the dynamics of more complex models.