1 Introduction

The evolution of pathogen virulence has been an important area of study for the past two decades (Bull 1994; Ebert and Herre 1996; Frank 1996; Levin 1996; Brown et al. 2006; Alizon et al. 2009). One of the hypotheses put forth by theoretical studies about virulence evolution is the trade-off between virulence and other life history parameters of the pathogen (Bull 1994; Frank 1996; Levin 1996). Many theoretical studies have explored the evolutionary consequences of trade-offs in compartmental susceptible-infectious-recovered (SIR)-type host–pathogen models where the pathogen is horizontally transmitted, only a single strain of the pathogen can infect any given host, and selection is frequency independent (Anderson and May 1982; Bremermann and Thieme 1989; Thieme 2007; Metz et al. 2008; Gyllenberg et al. 2011). A classic result from the earliest of those studies is that under the above conditions evolution maximizes the basic reproductive number (Anderson and May 1982; Bremermann and Thieme 1989).

The basic reproductive number, \(\mathcal R _0\), is the number of secondary cases that arise from a single infected individual in a completely susceptible host population. When \(\mathcal R _0>1\), the pathogen is able to invade a susceptible population, resulting in an epidemic. When \(\mathcal R _0<1\), the pathogen is unable to invade and dies out. Larger values of \(\mathcal R _0\) imply that the pathogen is better able to invade a completely susceptible population. Thus, in systems where evolution maximizes \(\mathcal R _0\), the pathogen strain that is best able to invade a completely susceptible population is also the strain that is unable to be invaded by other strains at low density.

For well-mixed deterministic SIR-type systems where a host can only be infected by a single pathogen strain, \(\mathcal R _0\) maximization has been shown to occur in particular models (Dieckmann et al. 2002; Boots and Sasaki 2003; Thieme 2007; Medlock et al. 2009). However, \(\mathcal R _0\) maximization does not occur universally in that class of systems (Nowak 1991; Dieckmann et al. 2002; Thieme 2007; Metz et al. 2008; Gyllenberg et al. 2011). To help identify when pathogen evolution maximizes \(\mathcal R _0\), previous studies have derived sufficient conditions for the evolutionary maximization of \(\mathcal R _0\) in terms of \(\mathcal R _0\) itself (Mylius and Diekmann 1995; Metz et al. 2008; Gyllenberg et al. 2011). An important conclusion from these studies is that \(\mathcal R _0\) maximization, and optimization in general, requires the environmental feedbacks of the system to be effectively one dimensional (Mylius and Diekmann 1995; Metz et al. 2008).

Despite this existing body of general mathematical theory and the collection of specific studies demonstrating when \(\mathcal R _0\) maximization does and does not occur, it remains unclear under what general biological conditions evolution maximizes \(\mathcal R _0\) in SIR-type systems. Furthermore, it is unclear what constraints the \(\mathcal R _0\) maximization theory in Mylius and Diekmann (1995) and Metz et al. (2008) imposes on the functional forms used to model pathogen-related processes like transmission, progression of the infection, recovery, and mortality. Understanding these constraints on the functional forms is important for three reasons. First, such constraints allow one to identify when evolution maximizes \(\mathcal R _0\) simply from the structure of the equations of the dynamical system. Depending on how complex a model is, this can be simpler than determining if \(\mathcal R _0\) satisfies monotonicity conditions or computing invasibility plots. Second, there is a clear mechanistic link between the functional forms used in theoretical models and particular characteristics of biological processes in natural systems. Thus, conditions on functional forms allow one to link particular biological phenomena to general theory. Finally, linking the functional form constraints to biological processes can identify what characteristics of natural systems inhibit and promote \(\mathcal R _0\) maximization. For example, constraints on functional forms can be used to identify what functional forms (and hence what biological processes) increase the dimension of environmental feedbacks. This in turn helps identify what kinds of systems are likely to exhibit \(\mathcal R _0\) maximization and evaluate how applicable results based on \(\mathcal R _0\) maximization theory are to natural systems.

This study synthesizes results from the literature into a set of sufficient mathematical conditions under which evolution maximizes \(\mathcal R _0\) in SIR-type systems. We focus on well-mixed SIR-type systems where a host can only be infected by a single pathogen strain and consider both frequency independent and frequency dependent selection. The analysis is based on the next generation technique for computing \(\mathcal R _0\) in van den Driessche and Watmough (2002) and the sufficient condition for \(\mathcal R _0\) maximization in Mylius and Diekmann (1995). This approach yields conditions in terms of the functional forms used to model epidemiological processes and we directly relate these conditions to the characteristics of natural systems. We note that other optimization principles can arise in theoretical models (Metz et al. 2008; Gyllenberg et al. 2011), but in this study we will only focus on \(\mathcal R _0\) maximization.

The biological interpretation of our conditions yields three general biological conditions under which evolution maximizes \(\mathcal R _0\) when selection is frequency independent. First, there are no genotype-by-density or genotype-by-environment interactions. That is, the effects of the host class densities and the pathogen trait on system processes are independent. Second, the pathogen utilizes a single route of transmission (e.g. horizontal, vertical, or vector transmission). Third, if mortality is density dependent then: (i) there is a single infectious class that individuals cannot recover from, (ii) mortality in the infectious class is entirely density dependent, and (iii) the rates of recovery, infection progression, and mortality in the exposed classes are independent of the pathogen trait. The additional condition that arises when selection is frequency dependent is that there can be no genotype-by-genotype interactions between pathogen strains.

In the following we first review the theory underlying our results, focusing on the case where selection is frequency independent. We then apply the theory to epidemiological systems with a single infectious class and systems with one exposed and one infectious class. Next we present a summary of the results for more complicated systems involving multiple exposed and multiple infectious classes and vector transmission. Within each of the above cases we analyze specific examples. We then extend our conditions to include systems where selection is frequency dependent. We conclude with a discussion of how our conditions for \(\mathcal R _0\) maximization relate to the dimension of the environmental feedbacks in the systems.

2 Methods

2.1 Model

Here we introduce a general model for a direct transmission host–pathogen system with a single host species. A review of the following theory for more general systems (e.g., vector-borne pathogens) is included in Appendix A.

We divide the host population into susceptible \((S)\), infected (\(C_j\) for \(1 \le j \le n\)), and recovered \((R)\) classes. We assume that there is a single susceptible class and a single recovered class, but see Appendix B for two examples with multiple susceptible classes. Note that an infected class, \(C_j\), could be either an infectious class \((I)\) or an exposed class \((E)\), where exposed individuals are infected but not infectious. We assume all newly infected individuals enter class \(C_1\) and that infected individuals pass through each infected class sequentially as the infection progresses. The pathogen is assumed to be characterized by a one-dimensional trait parameter \(\theta \). We assume the dynamics of the system tend to either an endemic equilibrium or the disease free equilibrium for any fixed trait value. The disease free equilibrium is comprised of only susceptible individuals at density \(N\).

For a monomorphic pathogen population, the dynamics of the host–pathogen system are

$$\begin{aligned} \frac{dS}{dt}&= \mathcal G _S(S,\mathcal C ,R,\theta )- \mathcal U _S(S,\mathcal C ,R,\theta )\nonumber \\ \frac{dC_1}{dt}&= \sum _{j=1}^{n} C_{j} \mathcal F ^{(j)}_{C_1}(S,\mathcal C ,R,\theta )- C_1 \mathcal V ^{-}_{C_1}(S,\mathcal C ,R,\theta )\nonumber \\ \\ \frac{dC_j}{dt}&= C_{j-1} \mathcal V ^{+}_{C_j}(S,\mathcal C ,R,\theta )- C_{j}\mathcal V ^{-}_{C_j}(S,\mathcal C ,R,\theta ), \quad 2 \le j \le n \nonumber \\ \frac{dR}{dt}&= \mathcal G _R(S,\mathcal C ,R,\theta )- \mathcal U _R(S,\mathcal C ,R,\theta ).\nonumber \end{aligned}$$
(1)

where \(\mathcal C = (C_1,\ldots ,C_n)\). The functions \(\mathcal G _S\) and \(\mathcal U _S\) (\(\mathcal G _R\) and \(\mathcal U _R\)) are the rates at which individuals enter and leave the susceptible (recovered) class, respectively. The terms \(C_j\mathcal F ^{(j)}_{C_1}\) are the rates at which newly infected individuals enter class \(C_1\) due to transmission of the pathogen from individuals in class \(C_j\). The terms \(C_{j-1}\mathcal V ^{+}_{C_j}\) and \(C_{j}\mathcal V ^{-}_{C_j}\) are the rates at which already infected individuals enter and leave class \(C_j\), respectively. \(\mathcal V ^{-}_{C_j}\) is the sum of the per capita mortality rate, \(\mathcal D _{C_j}\), and the per capita rate at which individuals transfer out of class \(C_j\) due to recovery or progression of the infection, \(\mathcal T _{C_j}\). Throughout we assume \(\mathcal F ^{(j)}_{C_1}\), \(\mathcal V ^{+}_{C_j}\), and \(\mathcal V ^{-}_{C_j}\) are finite and positive when evaluated at points where \(C_j= 0\) for all \(j\).

2.2 The basic reproductive number

We now compute the basic reproductive number for the pathogen in system (1). The reproductive number of a pathogen with trait \(\theta ,\,\mathcal R (S,\mathcal C ,R,\theta )\), is the number of secondary infections that arise from a single infected individual in a population with host class densities \(S,\,\mathcal C \), and \(R\). The basic reproductive number of a pathogen with trait \(\theta ,\,\mathcal R _0(\theta )\), is the number of secondary infections that arise from a single infected individual in a completely susceptible population at the disease free equilibrium. The basic reproductive number can be computed as \(\mathcal R _0(\theta )=\mathcal R (N,0,0,\theta )\) where \(\mathcal C =0\) implies that \(C_j = 0\) for all \(j\).

We compute the reproductive number using the next generation technique in van den Driessche and Watmough (2002). Define \(\mathcal V _{C_j}(S,\mathcal C ,R,\theta ,\bar{C}_{j-1},\bar{C}_{j}) =-\bar{C}_{j-1}\mathcal V ^{+}_{C_j}(S,\mathcal C ,R,\theta ) + \bar{C}_j\mathcal V ^{-}_{C_j}(S,\mathcal C ,R,\theta )\). Let \(\mathbf{M}_F\) and \(\mathbf{M}_V\) be the matrices

$$\begin{aligned} \mathbf{M}_F (S,\mathcal C ,R,\theta ) = \left[\mathcal F ^{(k)}_{C_j}\right]_{j,k}, \quad \quad \quad \mathbf{M}_V(S,\mathcal C ,R,\theta ) = \left[ \frac{\partial \mathcal V _{C_j}}{\partial \bar{C}_k}\right]_{j,k} \end{aligned}$$
(2)

evaluated at \(\bar{C}_j =0\) for all \(j\). The reproductive number is \(\mathcal R (S,\mathcal C ,R,\theta ) = \rho (\mathbf{M}_F \mathbf{M}_V^{-1})\), where \(\rho (\mathbf{M})\) is the spectral radius of a matrix \(\mathbf{M}\) (Diekmann et al. 1990; van den Driessche and Watmough 2002). When \(\mathcal R (S,\mathcal C ,R,\theta )>1\), a pathogen with trait \(\theta \) can invade a population with class densities \(S,\,\mathcal C \), and \(R\).
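As an illustration of this computation, the following Python sketch evaluates \(\rho (\mathbf{M}_F \mathbf{M}_V^{-1})\) for the sequential chain in system (1) once the rates have been evaluated at a given point \((S,\mathcal C ,R,\theta )\). It relies on an observation that is not spelled out above: because all new infections enter \(C_1\), only the first row of \(\mathbf{M}_F\) is nonzero, so \(\mathbf{M}_F \mathbf{M}_V^{-1}\) has rank one and its spectral radius equals its \((1,1)\) entry. The function name and argument layout are hypothetical.

```python
def reproductive_number(F, V_plus, V_minus):
    """Spectral radius of M_F M_V^{-1} for a chain C_1 -> C_2 -> ... -> C_n.

    F[j]       : per capita rate at which class C_{j+1} creates new C_1 infections
    V_plus[j]  : per capita rate at which C_{j+1} feeds C_{j+2} (length n - 1)
    V_minus[j] : per capita exit rate from C_{j+1} (length n)
    All rates are numbers already evaluated at fixed (S, C, R, theta).
    """
    n = len(V_minus)
    if len(F) != n or len(V_plus) != n - 1:
        raise ValueError("expected len(F) == n and len(V_plus) == n - 1")

    total = 0.0
    reach = 1.0  # expected number of entries into the current class per new infection
    for j in range(n):
        if j > 0:
            # M_V is lower bidiagonal, so the first column of its inverse
            # accumulates the ratios V_plus / V_minus along the chain.
            reach *= V_plus[j - 1] / V_minus[j - 1]
        # Time spent in class j (1 / V_minus[j]) times its transmission rate.
        total += reach * F[j] / V_minus[j]
    return total


# One infectious class: R = F / V_minus.
print(reproductive_number([3.0], [], [1.5]))
# Exposed class that does not transmit, followed by an infectious class.
print(reproductive_number([0.0, 4.0], [0.5], [1.0, 2.0]))
```

For a model with vector transmission or several entry classes, \(\mathbf{M}_F\) is no longer rank one and a general eigenvalue routine would be needed instead.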

2.3 Evolution and \(\mathcal R _0\) maximization

We assume evolution occurs at a much slower rate than the epidemiological dynamics in system (1). In order to apply our theory, we require that any successful invading pathogen replaces the resident pathogen in system (1). System (1) has this property when the trait values of the invading pathogen strains are sufficiently close to those of the resident pathogen strains (Geritz et al. 2002; Dercole and Rinaldi 2008). In particular, if the invading strain can invade the endemic equilibrium of the resident strain and the resident strain cannot invade the endemic equilibrium of the invader, then invasion implies replacement (Geritz 2005). The author is not aware of any conditions on system (1) that ensure invasion implies replacement for large differences between the trait values of the invading and resident strains. Finally, we assume only a single strain of the pathogen can infect any given host (i.e. no coinfection or superinfection) and recovered individuals are immune to all strains of the pathogen (i.e. total cross immunity).

We are interested in the case where nonresident strains of the pathogen can invade the endemic equilibrium of a resident strain. In the following we will focus on the evolutionary dynamics in system (1) when selection is frequency independent. When selection is frequency independent, the fitness of the invading pathogen strain is independent of the trait value of the resident and determined solely by the densities of the host classes (Hartl and Clark 2007). In this case the reproductive number of the invading strain only depends on the densities of the host classes and the invader trait value. Note that while the host class densities at the endemic equilibrium are determined by the trait value of the resident strain, the trait value of the resident strain only indirectly affects the fitness of the invading strain via the host class densities. We address the case where selection is frequency dependent in Appendix F. When selection is frequency dependent, the fitness of the invading strain depends explicitly on the trait value of the resident strain (Hartl and Clark 2007). In this case the reproductive number of the invading strain depends on the host class densities and both the resident and invader trait values.

Let \(S^{*},\,\mathcal C ^{*}\), and \(R^{*}\) be the susceptible, infected, and recovered host densities at the endemic equilibrium for a resident strain of the pathogen with trait \(\theta _r\). Let \(\bar{C}_j\) denote the densities of hosts infected with an invading pathogen strain with trait \(\theta _i\). When selection is frequency independent and the invader is rare, the epidemiological dynamics of the invader are

$$\begin{aligned} \frac{d\bar{C}_1}{dt}&= \sum _{j=1}^{n} \bar{C}_{j} \mathcal F ^{(j)}_{C_1}(S^*,\mathcal C ^*,R^*,\theta _i)- \bar{C}_1 \mathcal V ^{-}_{C_1}(S^*,\mathcal C ^*,R^*,\theta _i)\nonumber \\ \frac{d\bar{C}_j}{dt}&= \bar{C}_{j-1} \mathcal V ^{+}_{C_j}(S^*,\mathcal C ^*,R^*,\theta _i)- \bar{C}_{j}\mathcal V ^{-}_{C_j}(S^*,\mathcal C ^*,R^*,\theta _i), \quad 2 \le j \le n. \end{aligned}$$
(3)

The invader can invade the endemic equilibrium of the resident strain if \(\mathcal R (S^{*},\mathcal C ^*,R^{*},\theta _i)>1\). Because we assume invasion implies replacement, in this case the invading strain becomes the new resident strain. Note that the reproductive number for a resident pathogen invading its own endemic equilibrium is \(\mathcal R (S^{*},\mathcal C ^*,R^{*},\theta _r)=1\). If there exists a resident strain such that \(\mathcal R (S^{*},\mathcal C ^*,R^{*},\theta _i)<1\) for all other strains, then that resident strain cannot be invaded by any other strain at low densities. Such strains are called evolutionarily stable strategies (ESSs, Smith and Price 1973).

We are interested in sufficient conditions on the functional forms in system (1) such that an ESS also has the largest basic reproductive number. To determine these conditions, we use a sufficient condition for \(\mathcal R _0\) maximization derived in Mylius and Diekmann (1995). In particular, if the reproductive number for a pathogen with trait \(\theta \) can be written as

$$\begin{aligned} \mathcal R (S,\mathcal C ,R,\theta ) = g(S,\mathcal C ,R) \mathcal R _0(\theta ), \end{aligned}$$
(4)

where \(g(S,\mathcal C ,R)\) is a positive function, then evolution always selects for the pathogen strain that maximizes the basic reproductive number; see Appendix A. Note that typically \(0\le g(S,\mathcal C ,R)\le 1\) for biological models. In the following we present the conditions on the functional forms of \(\mathcal F ^{(j)}_{C_1},\,\mathcal V ^{+}_{C_j}\), and \(\mathcal V ^{-}_{C_j}\) such that \(\mathcal R (S,\mathcal C ,R,\theta )\) factors as in Eq. (4). Note that because \(\mathbf{M}_F\) and \(\mathbf{M}_V\) depend only on the functions \(\mathcal F ^{(j)}_{C_1},\,\mathcal V ^{+}_{C_j}\), and \(\mathcal V ^{-}_{C_j}\), our conditions for \(\mathcal R _0\) maximization hold for any choice of dynamics for the susceptible and recovered classes.
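One way to see why the factorization in Eq. (4) is sufficient is the following short calculation, given here as a sketch rather than as a substitute for the argument in Appendix A. A resident with trait \(\theta _r\) has reproductive number one at its own endemic equilibrium, so Eq. (4) fixes the value of \(g\) there and the invader's reproductive number reduces to a ratio of basic reproductive numbers:

$$\begin{aligned} 1&= \mathcal R (S^{*},\mathcal C ^{*},R^{*},\theta _r) = g(S^{*},\mathcal C ^{*},R^{*})\, \mathcal R _0(\theta _r) \quad \Longrightarrow \quad g(S^{*},\mathcal C ^{*},R^{*}) = \frac{1}{\mathcal R _0(\theta _r)}, \\ \mathcal R (S^{*},\mathcal C ^{*},R^{*},\theta _i)&= g(S^{*},\mathcal C ^{*},R^{*})\, \mathcal R _0(\theta _i) = \frac{\mathcal R _0(\theta _i)}{\mathcal R _0(\theta _r)}. \end{aligned}$$

Hence an invader succeeds exactly when \(\mathcal R _0(\theta _i) > \mathcal R _0(\theta _r)\), and a resident strain is uninvadable exactly when no other strain has a larger basic reproductive number.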

3 Results

3.1 \(\mathcal R _0\) maximization in models with a single infectious class

We first consider systems with a single infectious class. The level of generality of our model includes systems where transmission is density dependent or frequency dependent and systems where immunity can be lost (SIS and SIRS systems) and recovery is not possible (SI systems). The model is

$$\begin{aligned} \frac{dS}{dt}&= \mathcal G _S(S,I,R,\theta ) - \mathcal U _S(S,I,R,\theta ) \nonumber \\ \frac{dI}{dt}&= I\mathcal F _I(S,I,R,\theta ) - I\mathcal V ^{-}_I(S,I,R,\theta ) \\ \frac{dR}{dt}&= \mathcal G _R(S,I,R,\theta ) - \mathcal U _R(S,I,R,\theta ).\nonumber \end{aligned}$$
(5)

Here \(\mathcal F _I\) is the per capita recruitment rate of infectious individuals due to horizontal or vertical transmission and \(\mathcal V ^{-}_I\) is the per capita rate at which infected individuals exit the infectious class. The per capita exit rate \(\mathcal V ^{-}_I=\mathcal T _{I}+\mathcal D _{I}\) is the sum of the per capita rate at which infectious individuals transfer out of the infectious class into other classes \((\mathcal T _{I})\) and the per capita death rate \((\mathcal D _{I})\). Since there is only a single infected class in system (5), the transfer rate is equal to the recovery rate.

Using Eq. (2), the reproductive number for a pathogen with trait \(\theta \) in system (5) is

$$\begin{aligned} \mathcal{R }(S,I,R,\theta )&= \frac{\mathcal{F }_{I}(S,I,R,\theta )}{\mathcal{V }^{-}_{I}(S,I,R,\theta )} = \frac{\mathcal{F }_{I}(S,I,R,\theta )}{\mathcal{T }_{I}(S,I,R,\theta )+\mathcal{D }_{I}(S,I,R,\theta )}; \end{aligned}$$
(6)

see Appendix B for details. The basic reproductive number for a pathogen strain is \(\mathcal R _{0}(\theta ) = \mathcal R (N,0,0,\theta )\). The reproductive number can be written in the form given by Eq. (4) when \(\mathcal F _{I}\) and \(\mathcal V ^{-}_{I}\) can be written as

  1. (A1)

    \(\mathcal F _{I}(S,I,R,\theta ) = f_{I}(S,I,R)\eta (\theta )\xi _f(S,I,R,\theta )\)

  2. (A2)

    \(\mathcal V ^{-}_{I}(S,I,R,\theta ) = v^{-}_{I}(S,I,R)\nu (\theta )\xi _v(S,I,R,\theta )\)

  3. (A3)

    \(\xi _f = \xi _v\)

Under these conditions the reproductive number can be written as

$$\begin{aligned} \mathcal R (S,I,R,\theta )&= \frac{f_{I}(S,I,R)\eta (\theta )\xi _f(S,I,R,\theta )}{v^{-}_{I}(S,I,R)\nu (\theta )\xi _v(S,I,R,\theta )} \\&= \frac{f_{I}(N,0,0)}{v^{-}_{I}(N,0,0)}\frac{\eta (\theta )}{\nu (\theta )}\frac{\xi _f(N,0,0,\theta )}{\xi _v(N,0,0,\theta )}\frac{v_{I}^{-}(N,0,0)}{f_{I}(N,0,0)} \frac{f_{I}(S,I,R)}{v^{-}_{I}(S,I,R)}\\&= \mathcal R _0(\theta ) g(S,I,R). \end{aligned}$$

In conditions (A1) and (A2), \(f_{I}\) and \(v^{-}_{I}\) represent the effects the densities of the host classes have on the recruitment and exit rates of infectious individuals. We will refer to \(f_{I}\) and \(v^{-}_{I}\) as the effects of the environment. The functions \(\eta \) and \(\nu \) represent the effects the pathogen trait has on the recruitment and exit rates of infected individuals. Finally, the terms \(\xi _f\) and \(\xi _v\) represent the effects of the interactions between the pathogen trait and the densities of the host classes. These terms have multiple interpretations, e.g. genotype-by-density, genotype-by-environment, or phenotype-by-environment interactions, but throughout the text we will refer to them as genotype-by-environment interaction effects.

The biological interpretations and consequences of conditions (A1) through (A3) are the following. First consider condition (A3). Biologically, \(\mathcal F _{I}\) and \(\mathcal V ^{-}_{I}\) are only likely to have the same genotype-by-environment interaction terms when there are no genotype-by-environment interactions, i.e. \(\xi _f = \xi _v =1\). In this case, the effects of the environment and the trait on recruitment, recovery, and death are independent. Next consider condition (A1). When there are multiple routes of transmission (e.g., horizontal and vertical transmission), condition (A1) implies that either the environmental effects or the trait effects must be the same across all transmission routes; see Example 3 and Appendix B for details. In natural systems, we do not expect horizontal and vertical transmission pathways to have the same dependence on the densities of the host classes, nor do we expect the pathogen trait to affect all transmission routes in the same way. Thus, we expect evolution to maximize \(\mathcal R _0\) only in systems with a single transmission pathway.

To interpret condition (A2) we decompose \(\mathcal V ^{-}_I\) into death \((\mathcal D _I)\) and transfer \((\mathcal T _I)\) rates. Condition (A2) is satisfied when one of the following holds:

  1. (A2.1)

    \(\mathcal T _{I} = t_I(S,I,R)\tau (\theta )\xi _v(S,I,R,\theta )\) and \(\mathcal D _{I} =0\)

  2. (A2.2)

    \(\mathcal T _{I}=0\) and \(\mathcal D _{I} = d_I(S,I,R)\delta (\theta )\xi _v(S,I,R,\theta )\)

  3. (A2.3)

    \(\mathcal T _{I}=t_I(S,I,R)\tau (\theta )\xi _v(S,I,R,\theta ),\,\mathcal D _{I}=d_I(S,I,R)\delta (\theta )\xi _v(S,I,R,\theta )\), and \(\tau = \delta \)

  4. (A2.4)

    \(\mathcal T _{I}=t_{I}(S,I,R)\tau (\theta )\xi _v(S,I,R,\theta ),\,\mathcal D _{I}=d_{I}(S,I,R)\delta (\theta )\xi _v(S,I,R,\theta )\), and \(t_{I}=d_{I}\)

Condition (A2.1) implies that there is no mortality of infectious individuals. Condition (A2.2) implies that recovery is not possible and that infection ultimately leads to death. When death and recovery are both possible, condition (A2.3) requires the death and recovery rates to have the same trait dependence. Condition (A2.3) is unlikely to arise in nature as it implies that an increase in the death rate due to pathogen evolution is accompanied by an increase in the recovery rate.

Condition (A2.4) requires the death and recovery rates to have the same density dependence. To see when this arises, decompose the mortality rate into pathogen induced mortality that is independent of the host population density, \(\delta _1(\theta )\), and density dependent mortality, \(\delta _2(\theta )d(S,I,R)\). Thus the total per capita death rate is \(\mathcal D _I= \delta _1(\theta )+ \delta _2(\theta )d(S,I,R)\). An example from the literature is \(\mathcal D _I= \delta _1(\theta )+ \delta _2(S+I+R)\) (Thieme 2007; Gyllenberg et al. 2011). For \(\mathcal D _I\) to satisfy condition (A2.2), the density dependent mortality rate would either have to be independent of the host population density [\(d(S,I,R) =0\)] or the pathogen induced mortality would have to be negligible [\(\delta _1(\theta ) =0\)]. Note that we expect the per capita recovery rate to be independent of the host population density. Thus, for \(\mathcal D _I+\mathcal T _I\) to satisfy condition (A2.4), either mortality would have to be density independent or there could be no recovery [\(\mathcal T _I(S,I,R,\theta )=0\)]. In total, condition (A2.4) will be satisfied in natural systems either when mortality is density independent or when mortality is entirely density dependent and there is no recovery.
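The following worked step, added as an illustrative sketch, shows how conditions (A2.3) and (A2.4) each recover the product form required by condition (A2). Writing \(\mathcal V ^{-}_I = \mathcal T _I + \mathcal D _I\) with the transfer and death rates as in those conditions gives

$$\begin{aligned} \text{(A2.3), } \tau = \delta :\quad \mathcal V ^{-}_{I}&= \left[t_I(S,I,R) + d_I(S,I,R)\right] \tau (\theta )\, \xi _v(S,I,R,\theta ), \quad \text{so } v^{-}_{I} = t_I + d_I,\ \nu = \tau , \\ \text{(A2.4), } t_I = d_I :\quad \mathcal V ^{-}_{I}&= t_I(S,I,R) \left[\tau (\theta ) + \delta (\theta )\right] \xi _v(S,I,R,\theta ), \quad \text{so } v^{-}_{I} = t_I,\ \nu = \tau + \delta . \end{aligned}$$

When neither equality holds, the sum \(t_I\tau + d_I\delta \) is in general not the product of a function of the host class densities and a function of the trait, which is why one of the two equalities is needed when death and recovery are both present.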

In the following we consider three examples. The first has a widely used and simple form that results in the maximization of \(\mathcal R _0\). In the second system the transmission function does not satisfy condition (A1) and evolution does not maximize \(\mathcal R _0\). Specific numerical examples of these two systems are included in Fig. 1. The third example illustrates why we do not expect \(\mathcal R _0\) maximization to arise in systems where the pathogen is spread both vertically and horizontally.

Fig. 1

Numerical examples where pathogen evolution maximizes \(\mathcal R _0(\theta )\) and where pathogen evolution does not maximize \(\mathcal R _0(\theta )\). a, b Basic reproductive number of a pathogen as a function of its trait, \(\theta \). c, d Pairwise invasibility plots for the pathogen. Gray (white) regions denote strains of the pathogen that can (cannot) invade a given resident strain. The evolutionarily optimal trait value occurs at the intersection of the black curves. Arrows in (c) and (d) denote the trait values that maximize \(\mathcal R _0(\theta )\) in panels (a) and (b), respectively. Numerical examples are derived from Example 1 (a, c) and Example 2 (b, d) from the main text. Parameters are \(\beta (\theta ) = 4 + 2\theta ,\,\kappa (\theta ) = 1+\theta ,\,\omega =0.5,\,\mu (\theta ) = 2+\theta ^2\), and \(N=10\). The dynamical equation for the susceptible class is \(dS/dt = -\mathcal F (S,I,R)+\mu (\theta )I +\mu _R R+\rho R\) where \(\mu _R\) is the death rate of recovered individuals, \(\rho \) is the loss of immunity rate, and the total population size is assumed to be constant, \(S+I+R=N\).

Example 1:

Systems with functions of the form \(\mathcal F _{I} = f_1(S,I,R) \eta (\theta )\) and \(\mathcal V ^{-}_{I} = v_1(S,I,R) \nu (\theta )\) are widely used in the literature. In such systems, the reproductive number can easily be shown to factor as in Eq. (4),

$$\begin{aligned} \mathcal R (S,I,R,\theta )&= \left(\frac{f_1(S,I,R)}{f_1(N,0,0)}\frac{v_1(N,0,0)}{v_1(S,I,R)}\right) \frac{f_1(N,0,0) \eta (\theta )}{v_1(N,0,0) \nu (\theta )}\\&= \left(\frac{f_1(S,I,R)}{f_1(N,0,0)}\frac{v_1(N,0,0)}{v_1(S,I,R)}\right) \mathcal R _{0}(\theta ). \end{aligned}$$

Systems with these functional forms always satisfy conditions (A1) through (A3), with \(\xi _f=\xi _v=1\), and hence result in the evolutionary maximization of \(\mathcal R _0\).

For example, the \(\mathcal R _0\) maximization hypothesis in Anderson and May (1982) and Bremermann and Thieme (1989) arose from models where the infectious class dynamics were given by

$$\begin{aligned} \frac{dI}{dt} = \beta (\theta )SI -\mu (\theta )I - \omega (\theta )I \end{aligned}$$
(7)

where \(\beta (\theta )\) is the mass action transmission coefficient, \(\mu (\theta )\) is the per capita mortality rate, and \(\omega (\theta )\) is the per capita recovery rate. Translating equation (7) into our notation yields \(f_I(S,I,R)=S,\,\eta (\theta )=\beta (\theta ),\,v^{-}_I(S,I,R)=1,\,\nu (\theta )=\mu (\theta )+\omega (\theta )\), and \(\xi _f=\xi _v=1\). Decomposing the exit rates into transfer and death rates yields \(t_I(S,I,R)=1,\,\tau (\theta )=\omega (\theta ),\,d_I(S,I,R)=1\), and \(\delta (\theta )=\mu (\theta )\). Note that the per capita mortality and recovery rates are independent of the host class sizes and thus satisfy condition (A2.4).

In this example,

$$\begin{aligned} \mathcal R (S,I,R,\theta ) = \frac{\beta (\theta )S}{\mu (\theta )+\omega (\theta )} \end{aligned}$$
(8)

which can be put in the form of Eq. (4) by setting

$$\begin{aligned} g(S,I,R) = \left(\frac{f_1(S,I,R)}{f_1(N,0,0)}\frac{v_1(N,0,0)}{v_1(S,I,R)}\right) = \frac{S}{N}. \end{aligned}$$
(9)

Hence, as was observed in Anderson and May (1982) and Bremermann and Thieme (1989), evolution maximizes \(\mathcal R _0\). In the numerical example in Fig. 1c, the evolutionarily optimal strategy (i.e. the ESS denoted by the intersection of the two black curves) coincides with the trait value that maximizes the basic reproductive number (denoted by the black arrow).

Example 2:

Now consider a system where the infectious dynamics are given by

$$\begin{aligned} \frac{dI}{dt} = \frac{\beta (\theta ) SI}{\kappa (\theta ) + S} -\mu (\theta )I - \omega (\theta )I. \end{aligned}$$
(10)

Here, the transmission function \(\beta (\theta ) S/[\kappa (\theta )+ S]\) represents how the transmission rate saturates as the density of susceptible individuals increases. The parameter \(\kappa (\theta )\) is the density of susceptible individuals at which the transmission rate is half of the maximum rate. In this system \(\mathcal F _I(S,I,R,\theta )=\beta (\theta )S/(\kappa (\theta )+S)\) and \(\mathcal V ^{-}_I(S,I,R,\theta )=\mu (\theta )+\omega (\theta )\). The reproductive number for the pathogen is

$$\begin{aligned} \mathcal R (S,I,R,\theta ) = \frac{\beta (\theta ) S}{\kappa (\theta ) +S}\frac{1}{\mu (\theta ) + \omega (\theta )}. \end{aligned}$$
(11)

When \(\kappa \) does not depend on the trait \(\theta \), then system (10) satisfies conditions (A1) through (A3). In particular, \(f_I(S,I,R)=S/(\kappa +S),\,\eta (\theta )=\beta (\theta ),\,v^{-}_I(S,I,R)=1,\,\nu (\theta )=\mu (\theta )+\omega (\theta )\), and \(\xi _f=\xi _v=1\). In this case \(\mathcal R (S,I,R,\theta )\) can be written as in Eq. (4) with

$$\begin{aligned} g(S,I,R) = \frac{S}{\kappa +S}\frac{\kappa +N}{N}. \end{aligned}$$
(12)

Thus, evolution maximizes \(\mathcal R _0\).

When \(\kappa (\theta )\) does depend on the trait, \(\beta (\theta )S/(\kappa (\theta )+S)\) cannot be factored as in condition (A1). Thus, \(\mathcal R (S,I,R,\theta )\) cannot be written as in Eq. (4) because the function \(g\) would depend on both \(S\) and \(\theta \). In this case, the value of the optimal strain will be determined by how the endemic equilibrium density of the susceptible population depends on the strain of the resident pathogen (Mylius and Diekmann 1995; Geritz et al. 1998). As seen in Fig. 1d, the evolutionarily optimal strategy (denoted by the intersection of the black curves, \(\theta \approx 0.135\)) does not coincide with the trait value that maximizes the basic reproductive number (black arrow, \(\theta \approx 0.283\)).
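The contrast between the two cases can be explored numerically. The Python sketch below searches a grid of trait values for the uninvadable resident using Eq. (11) and compares it with the grid point that maximizes \(\mathcal R _0\). It borrows \(\beta (\theta )\), \(\mu (\theta )\), \(\omega \), \(\kappa (\theta )\), and \(N\) from the Fig. 1 caption, but it is not the code behind Fig. 1: the trait range \([0,1]\), the grid resolution, the constant value \(\kappa =1\) used for the trait-independent case, and all function names are assumptions, so its output should not be expected to reproduce the figure exactly.

```python
def beta(theta):
    """Transmission coefficient, beta(theta) = 4 + 2*theta (Fig. 1 caption)."""
    return 4.0 + 2.0 * theta


def exit_rate(theta, omega=0.5):
    """mu(theta) + omega with mu(theta) = 2 + theta**2 (Fig. 1 caption)."""
    return 2.0 + theta ** 2 + omega


def reproductive_number(S, theta, kappa):
    """Eq. (11) evaluated at susceptible density S; kappa is a callable."""
    return beta(theta) * S / (kappa(theta) + S) / exit_rate(theta)


def resident_susceptibles(theta, kappa):
    """Susceptible density S* at which a resident with trait theta has R = 1."""
    return kappa(theta) * exit_rate(theta) / (beta(theta) - exit_rate(theta))


def ess_and_r0_maximizer(kappa, N=10.0, steps=200):
    """Grid search over theta in [0, 1] (an assumed trait range)."""
    grid = [k / steps for k in range(steps + 1)]

    def best_invader(theta_r):
        # Largest reproductive number any grid strain attains against theta_r.
        S_star = resident_susceptibles(theta_r, kappa)
        return max(reproductive_number(S_star, th, kappa) for th in grid)

    # A resident can always "invade" itself with R = 1, so the uninvadable
    # resident is the one whose best invader does no better than that.
    ess = min(grid, key=best_invader)
    r0_max = max(grid, key=lambda th: reproductive_number(N, th, kappa))
    return ess, r0_max


# kappa = 1 is a hypothetical constant; kappa(theta) = 1 + theta is from the caption.
print("trait-independent kappa:", ess_and_r0_maximizer(lambda th: 1.0))
print("trait-dependent kappa:  ", ess_and_r0_maximizer(lambda th: 1.0 + th))
```

Under these assumptions the two returned trait values agree when \(\kappa \) is constant, while for \(\kappa (\theta )=1+\theta \) the uninvadable trait falls close to the \(\theta \approx 0.135\) quoted above and the \(\mathcal R _0\) maximizer sits at a clearly larger trait value.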

Example 3:

Assume the pathogen can be transmitted vertically and horizontally. To simplify the dynamics, we assume that the host population dynamics follow logistic growth with carrying capacity \(K\). The equation for the infectious class is

$$\begin{aligned} \frac{dI}{dt} = I\left(r(\theta ) - \frac{S+I+R}{K}\right) + \beta (\theta ) SI- \mu _I(\theta )I - \omega (\theta )I \end{aligned}$$
(13)

where \(r(\theta )\) is the per capita birth rate of infectious hosts and all other parameters are defined as in Example 1. In Eq. (13), \(\mathcal F _I(S,I,R,\theta )=r(\theta ) - (S+I+R)/K+\beta (\theta )S\) and \(\mathcal V ^{-}_I(S,I,R,\theta )=\mu (\theta )+\omega (\theta )\).

The reproductive number for the system with infectious class dynamics given by Eq. (13) is

$$\begin{aligned} \mathcal R (S,I,R,\theta ) = \frac{r(\theta )-(S+I+R)/K + \beta (\theta ) S}{\mu (\theta ) + \omega (\theta )}. \end{aligned}$$
(14)

If either \(r(\theta )\) or \(\beta (\theta )\) depends on the trait, then \(\mathcal F _I(S,I,R,\theta )\) does not factor as in condition (A1). In this case it is not possible to write \(\mathcal R (S,I,R,\theta )\) as in Eq. (4) because \(g\) would depend on \(S,\,I,\,R\), and \(\theta \). If \(r\) and \(\beta \) do not depend on the pathogen trait, then \(\mathcal F _I(S,I,R,\theta )\) can be factored as \(f_I(S,I,R)=r-(S+I+R)/K+\beta S\) and \(\eta (\theta )=1\). However, this case only arises in the biologically unlikely scenario where the pathogen trait has no effect on pathogen transmission or host recruitment. This example illustrates why we do not expect evolution to maximize \(\mathcal R _0\) when vertical and horizontal transmission pathways are both present.
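To make the obstruction concrete, the only candidate for \(g\) is the ratio of Eq. (14) to its value at the disease free equilibrium \((N,0,0)\). The following worked step is a sketch added for illustration:

$$\begin{aligned} \frac{\mathcal R (S,I,R,\theta )}{\mathcal R _0(\theta )} = \frac{r(\theta ) - (S+I+R)/K + \beta (\theta ) S}{r(\theta ) - N/K + \beta (\theta ) N}. \end{aligned}$$

The exit rate \(\mu (\theta )+\omega (\theta )\) cancels, but the vertical term \(r(\theta )-(S+I+R)/K\) and the horizontal term \(\beta (\theta )S\) depend on the host class densities in different ways, so \(\theta \) does not cancel from this ratio in general and no trait-independent \(g(S,I,R)\) exists.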

3.2 \(\mathcal R _0\) maximization in models with an exposed class

To demonstrate how the conditions for \(\mathcal R _0\) maximization in model (5) generalize to more complex models, we consider a system with an exposed class, \(E\), and an infectious class, \(I\). The generality of the model includes systems where immunity can be lost and recovery is not possible. For notational ease in this section we set \(X = (S,E,I,R)\). The model is

$$\begin{aligned} \frac{dS}{dt}&= \mathcal G _S(X,\theta )- \mathcal U _S(X,\theta )\nonumber \\ \frac{dE}{dt}&= I\mathcal F _{E}(X,\theta )- E\mathcal V ^{-}_{E}(X,\theta )\nonumber \\ \frac{dI}{dt}&= E\mathcal V ^{+}_{I}(X,\theta )- I\mathcal V ^{-}_{I}(X,\theta )\nonumber \\ \frac{dR}{dt}&= \mathcal G _R(X,\theta )- \mathcal U _R(X,\theta ). \end{aligned}$$
(15)

Note that infectious individuals cannot return to the exposed class (i.e. there is no \(I\mathcal V ^{+}_E\) term) and that all newly infected individuals enter the exposed class (i.e. there is no \(I\mathcal F _I\) term).

The reproductive number for a pathogen with strain \(\theta \) in system (15) is

$$\begin{aligned} \mathcal{R }(X,\theta ) = \frac{\mathcal{F }_{E}(X,\theta )}{\mathcal{V }^{-}_{E}(X,\theta )} \frac{\mathcal{V }^{+}_{I}(X,\theta )}{\mathcal{V }^{-}_{I}(X,\theta )}. \end{aligned}$$
(16)

The basic reproductive number is \(\mathcal R _0(\theta ) = \mathcal R (N,0,0,0,\theta )\). \(\mathcal R (X,\theta )\) factors as in Eq. (4) when

  1. (B1)

    \(\mathcal F _{E} = f_{E}(X)\eta _{E}(\theta )\xi ^{+}_{E}(X,\theta )\)

  2. (B2)

    \(\mathcal V ^{-}_{E}=v^{-}_{E}(X)\nu ^{-}_{E}(\theta )\xi ^{-}_{E}(X,\theta )\)

  3. (B3)

    \(\mathcal V ^{+}_{I}=v^{+}_{I}(X)\nu ^{+}_{I}(\theta )\xi ^{+}_{I}(X,\theta )\)

  4. (B4)

    \(\mathcal V ^{-}_{I}=v^{-}_{I}(X)\nu ^{-}_{I}(\theta )\xi ^{-}_{I}(X,\theta )\)

  5. (B5)

    \(\xi ^{+}_{E} = \xi ^{-}_{E}\) and \(\xi ^{+}_{I} = \xi ^{-}_{I}\)

Under these conditions, \(\mathcal R (X,\theta )\) can be written as in Eq. (4),

$$\begin{aligned} \mathcal R (X,\theta ) = \frac{f_E(X) v^{-}_E(N,0,0,0) v^{+}_I(X) v^{-}_I(N,0,0,0)}{v^{-}_E(X) f_E(N,0,0,0) v^{-}_I(X)v^{+}_I(N,0,0,0) }\mathcal R _0(\theta ) = g(X)\mathcal R _0(\theta )\quad \end{aligned}$$
(17)
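The cancellation behind Eq. (17) can be made explicit. As a sketch that uses only conditions (B1) through (B5), substituting the factored forms into Eq. (16) and cancelling the interaction terms gives

$$\begin{aligned} \mathcal R (X,\theta )&= \frac{f_E(X)\eta _E(\theta )\xi ^{+}_E(X,\theta )}{v^{-}_E(X)\nu ^{-}_E(\theta )\xi ^{-}_E(X,\theta )}\, \frac{v^{+}_I(X)\nu ^{+}_I(\theta )\xi ^{+}_I(X,\theta )}{v^{-}_I(X)\nu ^{-}_I(\theta )\xi ^{-}_I(X,\theta )} \\&= \frac{f_E(X)v^{+}_I(X)}{v^{-}_E(X)v^{-}_I(X)}\, \frac{\eta _E(\theta )\nu ^{+}_I(\theta )}{\nu ^{-}_E(\theta )\nu ^{-}_I(\theta )}. \end{aligned}$$

Evaluating the same expression at \(X=(N,0,0,0)\) gives \(\mathcal R _0(\theta )\), which carries the identical trait-dependent factor. Dividing the two expressions removes that factor and leaves the ratio of environmental terms that defines \(g(X)\).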

Conditions (B1) through (B5) have a particular structure. First, each functional response must factor into terms representing environmental effects (\(f_{E},\,v^{-}_{E},\,v^{+}_{I}\), and \(v^{-}_{I}\)), pathogen trait effects (\(\eta _{E},\,\nu ^{-}_{E},\,\nu ^{+}_{I}\), and \(\nu ^{-}_{I}\)), and genotype-by-environment interaction effects (\(\xi ^{\pm }_E\) and \(\xi ^{\pm }_I\)). In addition, for each infected class the genotype-by-environment interaction terms of all the functional forms must be the same (\(\xi ^{+}_E=\xi ^{-}_E\) and \(\xi ^{+}_I=\xi ^{-}_I\)).

The biological interpretation of conditions (B1) through (B5) is the following. First consider condition (B5). Since the rate of disease progression out of the exposed class is equal to the entry rate into the infectious class, the equivalence of the genotype-by-environment interactions within classes implies that the genotype-by-environment interactions are the same across all classes. We do not expect the genotype-by-environment interaction effects to be the same for the transmission, entry, and exit rates. Thus, condition (B5) implies that there are no genotype-by-environment interactions \((\xi ^{\pm }_E = \xi ^{\pm }_I =1)\). In this case the effects of the environment and the trait are independent.

Condition (B1) implies that the pathogen utilizes a single route of transmission. Thus, transmission cannot occur both vertically and horizontally. Conditions (B2) through (B4) can be interpreted by decomposing \(\mathcal V ^{-}_{E}\) and \(\mathcal V ^{-}_{I}\) into transfer and death rates. This decomposition yields conditions analogous to conditions (A2.1) through (A2.4) for each infected class. We expect conditions (B2) through (B4) to be satisfied in two different cases. In the first case, the per capita mortality, disease progression, and recovery rates of all classes are density independent. In the second case, mortality is density dependent, but additional constraints on the infected classes must hold. First, the per capita mortality, transfer, and recovery rates of the exposed class must be independent of the pathogen trait. Second, recovery is not possible from the infectious class. Third, the mortality rate of the infectious class is entirely density dependent and factors as in condition (A2.2).

In the following we consider two examples that illustrate the two cases above. In the first, mortality is density independent and evolution maximizes \(\mathcal R _0\). The second illustrates the conditions under which \(\mathcal R _0\) maximization occurs when mortality is density dependent.

Example 4:

When mortality is density independent, the dynamics of the infected classes are

$$\begin{aligned} \frac{dE}{dt}&= \beta SI- (\nu _E + \rho _E+ \mu _E )E \nonumber \\ \frac{dI}{dt}&= \nu _E E- (\rho _I+ \mu _I )I \end{aligned}$$
(18)

where \(\beta \) is the transmission coefficient, \(1/\nu _E\) is the average time between infection and the onset of infectiousness, \(\rho _E\) and \(\rho _I\) are the per capita recovery rates, and \(\mu _E\) and \(\mu _I\) are per capita death rates. All parameters are potentially functions of the pathogen trait. Here, \(f_E(X)=S,\,\eta _E(\theta )=\beta (\theta ),\,v^{-}_E(X)=v^{+}_I(X)=v^{-}_I(X)=1,\,\nu ^{-}_E(\theta )=\nu _E(\theta )+\rho _E(\theta )+\mu _E(\theta ),\,\nu ^{+}_I(\theta )=\nu _E(\theta ),\,\nu ^{-}_I(\theta )=\rho _I(\theta )+\mu _I(\theta )\), and \(\xi ^{\pm }_E=\xi ^{\pm }_I=1\). Note that decomposing the exit rates into transfer and mortality rates yields \(\tau _E(\theta )=\nu _E(\theta )+\rho _E(\theta ),\,\delta _E(\theta )=\mu _E(\theta ),\,\tau _I(\theta )=\rho _I(\theta )\), and \(\delta _I(\theta )=\mu _I(\theta )\). Since \(d_E(X) = d_I(X) = t_E(X) = t_I(X)=1\), system (18) satisfies the condition for SEIR systems analogous to condition (A2.4).

Since the functional forms in system (18) satisfy conditions (B1) through (B5), the reproductive number factors as

$$\begin{aligned} \mathcal R (X,\theta ) = \frac{\beta \nu _E S}{(\nu _E+ \rho _E+\mu _E)(\rho _I+\mu _I)} = \frac{S}{N} \frac{\beta \nu _E N}{(\nu _E+ \rho _E+\mu _E)(\rho _I+\mu _I)} = \frac{S}{N}\mathcal R _0(\theta ) \end{aligned}$$
(19)

where \(g(X) = S/N\) as in Eq. (4). Thus, evolution always maximizes \(\mathcal R _0\).
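The link between the factorization in Eq. (19) and \(\mathcal R _0\) maximization can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than part of the original analysis: the trait dependencies, the value of \(N\), and the function names are hypothetical, and the resident is assumed to settle at an endemic equilibrium. At that equilibrium the resident's reproductive number equals one, so \(S^*=N/\mathcal R _0(\theta _{res})\), and an invader's reproductive number reduces to the ratio of the two basic reproductive numbers.

```python
# Sketch of Example 4: invasion at the resident's endemic equilibrium.
# Parameter values and trait dependencies are illustrative assumptions.
N = 100.0  # susceptible density at the disease-free state (assumed value)


def rates(theta):
    """Hypothetical trade-off: transmission and infectious-class mortality rise with theta."""
    return {
        "beta": 0.02 * theta,
        "nu_E": 0.5,
        "rho_E": 0.1,
        "mu_E": 0.05,
        "rho_I": 0.2,
        "mu_I": 0.05 + theta ** 2,
    }


def repro_number(S, theta):
    """Reproductive number of Eq. (19) at susceptible density S."""
    p = rates(theta)
    exit_E = p["nu_E"] + p["rho_E"] + p["mu_E"]
    exit_I = p["rho_I"] + p["mu_I"]
    return p["beta"] * p["nu_E"] * S / (exit_E * exit_I)


def basic_repro_number(theta):
    """R_0(theta) = R evaluated at the disease-free state."""
    return repro_number(N, theta)


def invader_repro_number(theta_res, theta_inv):
    """Invader's reproductive number when the resident sits at its endemic equilibrium."""
    # The resident satisfies R(X*, theta_res) = 1, hence S* = N / R_0(theta_res).
    s_star = N / basic_repro_number(theta_res)
    return repro_number(s_star, theta_inv)


trait_grid = [0.2 + 0.05 * k for k in range(17)]  # theta from 0.2 to 1.0

if __name__ == "__main__":
    best = max(trait_grid, key=basic_repro_number)
    print("trait maximizing R_0 on the grid:", best)
    print("largest invader reproductive number against it:",
          max(invader_repro_number(best, t) for t in trait_grid))
```

Under these assumptions no strain on the grid has a reproductive number above one when the resident is the strain that maximizes \(\mathcal R _0\), which is the sense in which evolution maximizes \(\mathcal R _0\) here.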

Example 5:

Now consider a system where mortality is density dependent,

$$\begin{aligned} \frac{dE}{dt}&= \beta SI- [\nu _E + \rho _E + \mu _E +m_E(M,\theta )]E \nonumber \\ \frac{dI}{dt}&= \nu _E E- [\rho _I + \mu _I + m_I(M,\theta )]I. \end{aligned}$$
(20)

where \(M=S(t)+E(t)+I(t)+R(t)\) is the host population size at time \(t\). Here, \(m_E(M,\theta )\) and \(m_I(M,\theta )\) are the per capita density dependent mortality rates. All other parameters are interpreted as in system (18). The reproductive number for system (20) is

$$\begin{aligned} \mathcal R (X,\theta ) = \frac{\beta S \nu _E }{[\nu _E + \rho _E + \mu _E +m_E(M,\theta )][\rho _I + \mu _I + m_I(M,\theta )]}. \end{aligned}$$
(21)

The functions \(\mathcal F _E=\beta (\theta )S\) and \(\mathcal V ^{+}_I=\nu _E(\theta )\) factor as \(f_E(X)=S,\,\eta _E(\theta )=\beta (\theta ),\,v^{+}_I(X)=1,\,\nu ^{+}_I(\theta )=\nu _E(\theta )\), and \(\xi ^{+}_E=\xi ^{+}_I=1\). However, in general \(\mathcal V ^{-}_E=\nu _E + \rho _E + \mu _E +m_E(M,\theta )\) and \(\mathcal V ^{-}_I=\rho _I + \mu _I + m_I(M,\theta )\) do not satisfy conditions (B2), (B4), and (B5). Hence, in general Eq. (21) cannot be written as in Eq. (4).

System (20) satisfies conditions (B1) through (B5) when one of the following holds:

  1. (C1)

    \(m_E=0\) and \(m_I=0\)

  2. (C2)

    \(m_E=0,\,\rho _I= \mu _I=0\), and \(m_I(M,\theta )=m_1(M)m_2(\theta )\)

  3. (C3)

    \(\nu _E,\,\rho _E,\,\mu _E\), and \(m_E\) do not depend on \(\theta ,\,\rho _I= \mu _I=0\), and \(m_I(M,\theta )=m_1(M)m_2(\theta )\)

Other possibilities exist if \(\rho _I(\theta )= \mu _I(\theta )=\delta _I(\theta )\) or if \(\rho _E(\theta )= \mu _E(\theta )=\delta _E(\theta )\). However, we do not expect these cases to arise in natural systems because they imply that the pathogen trait has the same effect on recovery and mortality rates. Note that under conditions (C2) and (C3) mortality in the infectious class is entirely due to density dependent processes.

When condition (C1) is satisfied, systems (18) and (20) are equivalent. Hence evolution maximizes \(\mathcal R _0\). When condition (C2) is satisfied, the per capita mortality rate of the exposed class is independent of the host population size, there is no recovery from the infectious class, and mortality in the infectious class is entirely due to density dependent processes. In this case \(v^{-}_E(X)=v^{+}_I(X)=1,\,\nu ^{-}_E(\theta )=\nu _E(\theta )+\rho _E(\theta )+\mu _E(\theta ),\,\nu ^{+}_I(\theta )=\nu _E(\theta ),\,v^{-}_I(X)=m_1(M),\,\nu ^{-}_I(\theta )=m_2(\theta )\), and \(\xi ^{\pm }_E=\xi ^{\pm }_I=1\). Following Eq. (17), the reproductive number can then be written as

$$\begin{aligned} \mathcal R (X,\theta ) = \frac{S m_1(N)}{N m_1(M)}\mathcal R _0(\theta ) \end{aligned}$$
(22)

where \(g(S,E,I,R)= S m_1(N)/[N m_1(M)]\) as in Eq. (4).

When condition (C3) is satisfied, all recovery and mortality rates of the exposed class are independent of the pathogen trait, there is no recovery from the infectious class, and mortality in the infectious class is entirely due to density dependent processes. In this case, \(\mathcal V ^{-}_E\) and \(\mathcal V ^{-}_I\) satisfy conditions (B2) through (B5) where \(v^{-}_E(X)=\nu _E+\rho _E+\mu _E+m_E(M),\,\nu ^{-}_E(\theta )=1,\,v^{-}_I(X)=m_1(M),\,\nu ^{-}_I(\theta )=m_2(\theta )\), and \(\xi ^{-}_E=\xi ^{-}_I=1\). The reproductive number for system (20) can then be written as

$$\begin{aligned} \mathcal R (X,\theta ) = \frac{S m_1(N) [\nu _E+\rho _E+\mu _E+m_E(N)]}{N m_1(M) [\nu _E+\rho _E+\mu _E+m_E(M)]}\mathcal R _0(\theta ) = g(X)\mathcal R _0(\theta ). \end{aligned}$$
(23)
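The contrast between condition (C2) and the general case of system (20) can be checked numerically. The following is a minimal sketch under stated assumptions: the parameter values, the separable form \(m_I(M,\theta )=0.01M(1+\theta )\), and the function names are hypothetical and serve only as an illustration. It computes \(\mathcal R (X,\theta )/\mathcal R _0(\theta )\) from Eq. (21), using \(M=N\) at the disease-free state.

```python
# Sketch of Example 5: density dependent mortality with and without condition (C2).
# All numerical values are illustrative assumptions.
N = 100.0  # susceptible density at the disease-free state (assumed value)


def make_model(rho_I, mu_I):
    """Build R(X, theta) of Eq. (21) for given density independent rates in class I."""
    def repro_number(S, M, theta):
        beta = theta                       # hypothetical trait-dependent transmission
        nu_E, rho_E, mu_E = 0.5, 0.1, 0.05
        m_E = 0.0                          # no density dependent mortality in class E
        m_I = 0.01 * M * (1.0 + theta)     # separable form m_1(M) * m_2(theta)
        return beta * S * nu_E / ((nu_E + rho_E + mu_E + m_E) * (rho_I + mu_I + m_I))
    return repro_number


def g_ratio(repro_number, S, M, theta):
    """R(X, theta) / R_0(theta); the disease-free state has S = M = N."""
    return repro_number(S, M, theta) / repro_number(N, N, theta)


c2_model = make_model(rho_I=0.0, mu_I=0.0)        # satisfies condition (C2)
general_model = make_model(rho_I=0.2, mu_I=0.1)   # violates (C1) through (C3)

if __name__ == "__main__":
    for theta in (0.2, 0.8):
        print(theta,
              g_ratio(c2_model, 30.0, 60.0, theta),
              g_ratio(general_model, 30.0, 60.0, theta))
```

In the (C2) case the ratio equals \(S m_1(N)/[N m_1(M)]\) for every trait value, as in Eq. (22). Once density independent recovery or mortality is added to the infectious class, the ratio varies with \(\theta \).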

3.3 Maximization in models with vectors and multiple infected classes

We now summarize the main conclusions for systems with multiple exposed and multiple infectious classes and systems with vector-borne pathogens. All analytical results are contained in Appendices C, D, and E.

There are three general mathematical conditions under which evolution maximizes \(\mathcal R _0\). First, each functional response must factor into three components representing environmental, trait, and genotype-by-environment interaction effects, e.g. \(\mathcal F (X,\theta ) = f(X)\eta (\theta )\xi _f(X,\theta )\). Second, for each species the pathogen infects, the genotype-by-environment interaction terms must be the same across all classes of that species. For example, in vector-borne systems the genotype-by-environment terms for all host classes must be the same and the genotype-by-environment terms for all vector classes must be the same. Third, for systems with multiple infectious classes, the per capita rates at which infectious individuals enter \((\mathcal V ^{+}_{C_j})\) and exit \((\mathcal V ^{-}_{C_j})\) a particular infectious class must have either the same dependence on the pathogen trait or the same dependence on the densities of the host classes. This last condition is similar to, but more restrictive than, conditions (A2.1) through (A2.4) for system (5).

The biological consequences of the conditions are similar to those for the previous models. The first condition implies that the pathogen can only utilize one transmission pathway. Thus the pathogen can only be spread via horizontal, vertical, or vector-borne transmission; see Example 6. The constraints imposed by the second condition suggest that \(\mathcal R _0\) maximization will arise only in systems where there are no genotype-by-environment interactions. In this case, the trait and the host class densities affect pathogen related processes independently.

The biological consequences of the third condition depend on the structure of the host population. When each species has a single infectious class, the consequences are the same as those in system (15). That is, either mortality is density independent or mortality is density dependent and the conditions illustrated in Example 5 must hold. When multiple infectious classes are present in a given species, the entry and exit rates for all infectious classes of that species must have either the same dependence on the trait or the same dependence on the host classes. We do not expect the pathogen trait to have the same effect on the entry and exit rates for all infected classes. In principle the entry and exit rates of all infectious classes can have the same density dependence, but biologically we expect this only in the case where the per capita recovery, mortality, and infection progression rates are density independent. Thus, for any species with multiple infectious classes, \(\mathcal R _0\) maximization is only expected if mortality is density independent.

The following two examples consider vector-borne pathogens. Let \(\hat{S},\,\hat{E},\,\hat{I}\), and \(\hat{R}\) denote the densities of the susceptible, exposed, infectious and recovered vector classes, respectively. Let \(\hat{N}\) be the density of susceptible vectors at the disease free equilibrium. For notational ease, let \(X=(S,E,I,R,\hat{S},\hat{E},\hat{I},\hat{R})\). The first example is from a study by Medlock et al. (2009) where evolution maximizes \(\mathcal R _0\). The second example includes both direct and vector transmission and does not maximize \(\mathcal R _0\).

Example 6:

The equations for the infected classes in Medlock et al. (2009) are

$$\begin{aligned} \frac{dE}{dt}&= \beta S\hat{I}- (\nu _E + \mu _E)E \nonumber \\ \frac{dI}{dt}&= \nu _E E- (\rho _I + \mu _I)I \nonumber \\ \\ \frac{d\hat{E}}{dt}&= \hat{\beta } \hat{S}I- (\nu _{\hat{E}} + \mu _{\hat{E}})\hat{E} \nonumber \\ \frac{d\hat{I}}{dt}&= \nu _{\hat{E}} \hat{E}- (\rho _{\hat{I}} + \mu _{\hat{I}})\hat{I}\nonumber \end{aligned}$$
(24)

where \(\beta \) and \(\hat{\beta }\) are the transmission coefficients, \(1/\nu _E\) and \(1/\nu _{\hat{E}}\) are the average times between exposure to the pathogen and the onset of infectiousness, \(\rho _I\) and \(\rho _{\hat{I}}\) are the recovery rates, and \(\mu _E,\,\mu _I,\,\mu _{\hat{E}}\) and \(\mu _{\hat{I}}\) are the mortality rates. All parameters are potentially functions of \(\theta \). Note that mortality is density independent and that the pathogen only utilizes one route of transmission.

In this system, the functional forms in the host class equations factor into components that depend on the host class densities [\(f_E(X) = S\) and \(v^{-}_E(X) = v^\pm _I(X) = 1\)] and components that depend on the pathogen trait [\(\eta _E(\theta ) = \beta (\theta ),\,\nu ^{-}_E(\theta ) =\nu _E+\mu _E,\,\nu ^{+}_I(\theta ) =\nu _E,\,\nu ^{-}_I(\theta ) = \rho _I+\mu _I\)]. Furthermore, there are no genotype-by-environment interaction terms, i.e. \(\xi _E^\pm = \xi _I^\pm =1\). The functional forms in the vector class equations factor in an analogous way. Consequently, the reproductive number of the pathogen can be written as in Eq. (4),

$$\begin{aligned} \mathcal R _1(X,\theta ) = \frac{\beta S \nu _E}{(\nu _E+\mu _E)(\rho _I+\mu _I)}\frac{\hat{\beta } \hat{S} \nu _{\hat{E}}}{(\nu _{\hat{E}}+\mu _{\hat{E}})(\rho _{\hat{I}}+\mu _{\hat{I}})} = \frac{S\hat{S}}{N\hat{N}}\mathcal R _{0}(\theta ) \end{aligned}$$
(25)

where \(g(X) = S\hat{S}/(N\hat{N})\). As was found in Medlock et al. (2009), evolution maximizes \(\mathcal R _0\).

Example 7:

Now assume the pathogen utilizes both horizontal and vector-borne transmission routes. The dynamics of the exposed class are

$$\begin{aligned} \frac{dE}{dt} = \beta _1 S I + \beta _2 S\hat{I}- (\nu _E + \mu _E)E. \end{aligned}$$
(26)

where \(\beta _1\) is the direct transmission coefficient and \(\beta _2\) is the vector transmission coefficient. The equations for the \(I,\,\hat{E}\), and \(\hat{I}\) classes are as in system (24). The reproductive number of the pathogen is

$$\begin{aligned} \mathcal R _{2}(X,\theta ) \!=\! \frac{1}{2} \frac{\beta _1 S \nu _E}{(\nu _E\!+\!\mu _E)(\rho _I\!+\!\mu _I)} \!+\! \frac{1}{2}\sqrt{\left( \frac{\beta _1 S \nu _E}{(\nu _E\!+\!\mu _E)(\rho _I\!+\!\mu _I)} \right)^2 \!+\!\mathcal R _{1}(X,\theta )}\quad \quad \end{aligned}$$
(27)

where \(\mathcal R _{1}\) is defined as in Eq. (25) with \(\beta =\beta _2\).

As shown in Example 6, all of the transfer functions factor into components that depend only on the host and vector classes and components that only depend on the pathogen trait. The contributions to newly infected individuals from direct transmission \((\beta _1 SI)\) and vector transmission \((\beta _2 S\hat{I})\) also factor. However, because the pathogen utilizes two modes of transmission, the reproductive number \(\mathcal R _{2}(X,\theta )\) factors as in Eq. (4) only if the parameters depend on \(\theta \) in such a way that \(\hat{\beta }\beta _2\nu _{\hat{E}} =\beta _1^2\nu _E\) and \((\nu _E+\mu _E)(\rho _I+\mu _I) = (\nu _{\hat{E}}+\mu _{\hat{E}})(\rho _{\hat{I}}+\mu _{\hat{I}})\). We do not expect these conditions to be satisfied in natural systems. Thus, we do not expect evolution to maximize \(\mathcal R _{0}\) when there are multiple routes of transmission.
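The difference between Examples 6 and 7 can be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the original analysis: all parameter values, trait dependencies, and function names are hypothetical, and \(\mathcal R _2\) is coded exactly as printed in Eq. (27). It compares the ratio of each reproductive number to its value at the disease-free state for two trait values.

```python
# Sketch of Examples 6 and 7: vector transmission alone versus vector plus direct transmission.
# All numerical values and trait dependencies are illustrative assumptions.
import math

N, N_HAT = 100.0, 200.0  # disease-free susceptible host and vector densities (assumed)


def host_factor(theta):
    """nu_E / [(nu_E + mu_E)(rho_I + mu_I)] with a hypothetical trait-dependent mu_I."""
    nu_E, mu_E, rho_I = 0.5, 0.05, 0.2
    mu_I = 0.05 + theta ** 2
    return nu_E / ((nu_E + mu_E) * (rho_I + mu_I))


def vector_factor():
    """Vector-side factor of Eq. (25) per unit S_hat, taken as trait independent here."""
    beta_hat, nu, mu_E, rho_I, mu_I = 0.03, 0.3, 0.1, 0.0, 0.1
    return beta_hat * nu / ((nu + mu_E) * (rho_I + mu_I))


def r1(S, S_hat, theta):
    """Reproductive number of Eq. (25) with beta = beta_2."""
    beta_2 = 0.02 * theta
    return beta_2 * S * host_factor(theta) * S_hat * vector_factor()


def r2(S, S_hat, theta):
    """Reproductive number as printed in Eq. (27)."""
    beta_1 = 0.1 * theta
    direct = beta_1 * S * host_factor(theta)
    return 0.5 * direct + 0.5 * math.sqrt(direct ** 2 + r1(S, S_hat, theta))


def ratio_to_disease_free(repro_number, S, S_hat, theta):
    """R(X, theta) divided by its value at the disease-free state."""
    return repro_number(S, S_hat, theta) / repro_number(N, N_HAT, theta)


if __name__ == "__main__":
    for theta in (0.2, 0.8):
        print(theta,
              ratio_to_disease_free(r1, 40.0, 150.0, theta),
              ratio_to_disease_free(r2, 40.0, 150.0, theta))
```

With these assumed values the ratio for \(\mathcal R _1\) equals \(S\hat{S}/(N\hat{N})\) for both trait values, while the ratio for \(\mathcal R _2\) changes with \(\theta \), so \(\mathcal R _2\) does not take the form of Eq. (4).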

3.4 Frequency dependent selection

The above conditions can be extended to include the case where selection is frequency dependent. Here we present the results for a system with a single infectious class and discuss how the results can be extended for more complex systems. Additional details and the analysis of a particular example are included in Appendix F.

Consider a system with a single infectious class. Let \(\theta \) denote the resident’s trait and let \((S^*,I^*,R^*)\) denote the endemic equilibrium of the resident. When selection is frequency dependent, the dynamics of an invading pathogen with strain \(\theta _i\) at the endemic equilibrium of the resident are

$$\begin{aligned} \frac{dI_i}{dt} = I_i F_I(S^*,I^*,R^*,\theta ,\theta _i) - I_iV_I^{-}(S^*,I^*,R^*,\theta ,\theta _i). \end{aligned}$$
(28)

Here, \(I_i\) is the density of individuals infected with the invading strain, \(F_I\) is the per capita transmission rate of the invading strain, and \(V_I^{-}\) is the per capita exit rate. The reproductive number of the invading strain is

$$\begin{aligned} \mathcal R (S^*,I^*,R^*,\theta ,\theta _i) = \frac{F_I(S^*,I^*,R^*,\theta ,\theta _i)}{V_I^{-}(S^*,I^*,R^*,\theta ,\theta _i)}. \end{aligned}$$
(29)

The basic reproductive number of the invading strain is \(\mathcal R _0(\theta _i)= \mathcal R (N,0,0,\theta _i,\theta _i)\). The reproductive number factors as in Eq. (4) under the following conditions:

  1. (D1)

    \(F_{I}(S,I,R,\theta ,\theta _i) = f_{I}(S,I,R,\theta )\eta (\theta _i)\xi _f(S,I,R,\theta ,\theta _i)\)

  2. (D2)

    \(V^{-}_{I}(S,I,R,\theta ,\theta _i) = v^{-}_{I}(S,I,R,\theta )\nu (\theta _i)\xi _v(S,I,R,\theta ,\theta _i)\)

  3. (D3)

    \(\xi _f = \xi _v\)

  4. (D4)

    \(f_{I}(S,0,0,\theta _1) = f_{I}(S,0,0,\theta _2)\) and \(v^{-}_{I}(S,0,0,\theta _1) = v^{-}_{I}(S,0,0,\theta _2)\) for all \(\theta _1\) and \(\theta _2\)

These conditions are similar to conditions (A1) and (A3) but two key differences arise. First, the terms that define the environmental effects (\(f_I\) and \(v^{-}_I\)) now depend on the trait value of the resident. Note that from the perspective of the invader, the resident is part of the environment. Second, condition (D4) imposes an additional constraint on the effects of the host density and the invading pathogen strain in completely susceptible populations.

The biological interpretations of the conditions are the following. Condition (D3) implies that the genotype-by-environment interactions are the same for transmission, mortality, and recovery. Condition (D3) also implies that the genotype-by-genotype interactions are the same. Here, genotype-by-genotype interactions refer to interactions between the traits of the resident and invading strain. We do not expect genotype-by-environment and genotype-by-genotype interactions to be the same for all processes in natural systems. Thus, condition (D3) is only likely to be satisfied in systems where the effects of the invader’s trait are independent of the host class densities and the resident’s trait, i.e. \(\xi _f = \xi _v=1\). Condition (D1) implies that the pathogen can only utilize one transmission pathway. Condition (D2) implies that either mortality is density independent or mortality is density dependent and recovery is not possible. Condition (D4) implies the effects of host density and the pathogen trait are independent in completely susceptible populations.

The conditions for \(\mathcal R _0\) maximization in any system with frequency independent selection can be extended to the frequency dependent selection case in an analogous way. First, all terms representing environmental effects need to depend on both the host class densities and the trait value of the resident. Second, terms representing trait effects can only depend on the invader’s trait. Third, a condition like condition (D4) must hold for all functional responses. The biological interpretation of these conditions remains essentially unchanged.

4 Discussion

In this study we presented a set of sufficient conditions for \(\mathcal R _0\) maximization in terms of the functional forms used to model epidemiological processes. We also discussed how those mathematical conditions relate to the biological characteristics of natural systems. Our analysis yields three mathematical conditions under which evolution maximizes \(\mathcal R _0\) in SIR-type systems. First, each functional response must factor into terms representing environmental, trait, and genotype-by-environment interaction effects [conditions (A1) and (A2) for system (5)]. Second, for each species the pathogen infects, the effects of genotype-by-environment (or genotype-by-density) interactions must be the same for all epidemiological processes [condition (A3)]. Third, the per capita mortality, recovery, and infection progression rates must have the same density dependence or the same trait dependence [conditions (A2.1) through (A2.4)].

These mathematical conditions yield three general biological constraints on when evolution is expected to maximize \(\mathcal R _0\) in natural systems. First, the pathogen can only utilize one transmission pathway, e.g. horizontal, vertical, or vector transmission. Second, there are no genotype-by-environment interaction effects. That is, the host class densities and the pathogen trait have independent effects on all epidemiological processes. Third, either the per capita mortality, recovery and disease progression rates are density independent or mortality is density dependent and (i) there is a single infectious class that individuals cannot recover from, (ii) mortality in the infectious class is entirely density dependent, and (iii) the rates of recovery, infection progression, and mortality in the exposed classes are independent of the pathogen trait.

It is important to note that the conditions presented in this study are only sufficient conditions for \(\mathcal R _0\) maximization. Other studies have considered particular systems that do not satisfy our conditions but result in \(\mathcal R _0\) maximization (Thieme 2007; Metz et al. 2008; Gyllenberg et al. 2011). However, while our conditions do not capture all of the cases in which \(\mathcal R _0\) maximization occurs, our conditions do identify what characteristics of natural systems inhibit or promote \(\mathcal R _0\) maximization. We now interpret our results in the context of one-dimensional environmental feedbacks. In particular, Metz et al. (2008) found that \(\mathcal R _0\) maximization required a one-dimensional environmental feedback. We focus on the biological mechanisms that yield higher dimensional environmental feedbacks and inhibit \(\mathcal R _0\) maximization.

When multiple transmission pathways are present, each pathway yields an additional environmental feedback. Previous studies have shown that evolution does not maximize \(\mathcal R _0\) when a pathogen can be transmitted both horizontally and vertically (Nowak 1991; Lipsitch et al. 1996). For this case, the susceptible class is the feedback variable for horizontal transmission and the infectious class, via the density regulated birth rate, is the feedback variable for vertical transmission. Our results show that this conclusion extends to indirectly transmitted pathogens as well. In particular, when vector transmission is also possible, the susceptible vector population becomes a feedback variable. This suggests we should not expect evolution to maximize \(\mathcal R _0\) when a pathogen utilizes multiple pathways.

Previous work has also shown that host heterogeneity can inhibit \(\mathcal R _0\) maximization (Dwyer et al. 1997; Gandon et al. 2001). In our models, host heterogeneity is represented via multiple susceptible classes; see Appendix B. When the effects of the pathogen trait on transmission differ across susceptible classes, then each susceptible class acts as an independent feedback variable and evolution will not maximize \(\mathcal R _0\). However, if the pathogen trait affects transmission uniformly (e.g. increasing transmission by a constant factor across all susceptible classes) then host heterogeneity will not prevent \(\mathcal R _0\) maximization. Thus, the underlying structure dictating the heterogeneity and its interaction with the pathogen trait will determine if \(\mathcal R _0\) maximization is possible.

Density dependent mortality has been shown to introduce additional feedback variables (Thieme 2007; Metz et al. 2008; Gyllenberg et al. 2011). However, this is true only when the pathogen trait affects pathogen-induced mortality, recovery, density dependent mortality, or all of the above in a given class. For example, consider system (5) where there is a single infectious class. The feedback variable for density independent pathogen-induced mortality and recovery is the susceptible class. The feedback variable for density dependent mortality is the total population size. If density dependent mortality is present with density independent mortality or recovery, then the environmental feedback will have dimension two. But, if recovery is not possible and density independent mortality is negligible, then the environmental feedback will be one-dimensional. Furthermore, as shown in system (15) and Example 5, density dependent and density independent sources of mortality can arise in exposed classes so long as those processes are not affected by the pathogen trait. Thus, our results show that the dimension of the environmental feedback depends on the characteristics of the system when mortality is density dependent.

Our approach also shows that the structure of the host population can influence the dimension of the environmental feedback. Differences in the progression of infection can inhibit \(\mathcal R _0\) maximization. When newly infected hosts can enter different infected classes, multiple life cycle pathways are accessible to the pathogen. Each new pathway can potentially result in the pathogen trait affecting the environmental feedback. When this occurs, the environmental feedback has dimension two. Differences in the infectiousness of infectious individuals can also inhibit \(\mathcal R _0\) maximization. In particular, if transmission processes differ mechanistically across the infectious classes of a given species, then each transmission mechanism will create a new feedback; see Appendix C. In this case the environmental feedback will depend on the pathogen trait and have a dimension greater than one. This is particularly important for pathogens of organisms in stage or age structured populations. If the pathogen can be spread during different life stages, then the transmission mechanisms could differ, preventing \(\mathcal R _0\) maximization.

The conditions for \(\mathcal R _0\) maximization for every model we considered included a requirement for the genotype-by-environment interaction effects to be the same for all functional responses. Genotype-by-environment interactions arise when the effect of the environment is conditional on the pathogen trait. In most cases, this implies that the feedback must be at least two dimensional. Many previous studies have implicitly assumed there are no genotype-by-environment interactions via density dependent, frequency dependent, or mass action transmission functions and density independent mortality and recovery rates (Anderson and May 1982; Bremermann and Thieme 1989; Lenski and May 1994; Dieckmann et al. 2002; Boots and Sasaki 2003; Basu and Galvani 2009; Medlock et al. 2009). Each of these cases found that evolution maximized \(\mathcal R _0\). However, those transmission functions do not accurately capture the dynamics of all host–pathogen systems (McCallum et al. 2001; Smith et al. 2009). Other functional forms like the negative binomial (Knell et al. 1996; Barlow 2000; Briggs and Godfray 1995) and asymptotic transmission functions (Barlow 2000; Diekmann and Kretzschmar 1991; Roberts 1996; Heesterbeek and Metz 1993) have been used to model host–pathogen systems. Depending on how the pathogen trait is incorporated, these transmission functions may contain terms that represent genotype-by-environment interactions. In such cases, we do not expect evolution to maximize \(\mathcal R _0\).

Finally, we return to our initial assumptions about complete cross immunity and single strain infections. Previous studies have shown that superinfection and coinfection of multiple strains (May and Nowak 1995; Nowak and May 1994; Mosquera and Adler 1998) and assumptions about cross immunity (Gog and Grenfell 2002) can also affect evolutionary outcomes. In these cases the infectious classes of other strains can become environmental feedback variables. Furthermore, interactions between pathogen strains can result in evolution selecting for strains that do not maximize \(\mathcal R _0\). These genotype-by-genotype interactions are related to additional conditions for \(\mathcal R _0\) maximization that arise when selection is frequency dependent. In particular, as with genotype-by-environment interactions, genotype-by-genotype interactions tend to create higher dimensional environmental feedbacks. Understanding how prevalent genotype-by-genotype interactions are in host–pathogen systems is an important area of future research that can determine how applicable \(\mathcal R _0\) maximization theory is to natural systems.

The conditions in this study hold for systems where any pathogen strain can invade the resident as long as invasion implies replacement. Other studies have focused on gradient dynamic evolutionary models where only pathogen strains close to the resident can attempt to invade the resident (Abrams et al. 1993; Dieckmann and Law 1996). In these models, the pathogen evolves in the direction of increasing fitness by climbing the fitness gradient. As shown in Appendix G, the conditions under which evolution maximizes \(\mathcal R _0\) in a system with a single infectious class are very similar to those of system (5). In particular, each functional response must factor into terms representing environmental and trait effects and there can be no genotype-by-environment interaction terms. This suggests that conclusions about \(\mathcal R _0\) maximization in those host–pathogen evolutionary models may also be useful in identifying other mechanisms through which higher dimensional environmental feedbacks arise.

In total, the conditions for \(\mathcal R _0\) maximization presented in this study have shown that there are many biological mechanisms that inhibit \(\mathcal R _0\) maximization in natural systems. This suggests that additional theoretical studies involving more realistic functional forms and fewer simplifying biological assumptions are necessary to understand if \(\mathcal R _0\) maximization occurs in natural systems. The approach taken in this study may also be useful in identifying biological conditions under which optimization principles beyond \(\mathcal R _0\) maximization arise. An important area of research is understanding if optimization principles of any kind are expected to arise in natural systems.