1 Introduction

Since the late 1950s, public health officials have focused on the control and elimination of the organisms that cause infectious diseases. The introduction of antibiotics, sanitation and vaccination raised hopes of disease eradication Hethcote (2000). However, factors such as microbial resistance to medicines, demographic change, accelerated urbanization and increased travel have led to new infectious diseases and the reemergence of existing ones Hethcote (2000). Newly identified diseases include Lyme disease (1975), Legionnaires' disease (1976), toxic shock syndrome (1978), hepatitis C (1989), hepatitis E (1990), and hantavirus (1993) Hethcote (2000). Human Immunodeficiency Virus (HIV), which emerged in 1981, rapidly became an important sexually transmitted disease throughout the world Hethcote (2000). Antibiotic-resistant strains of tuberculosis, pneumonia and gonorrhea have evolved, and these diseases are reemerging. Malaria, dengue, and yellow fever have also reemerged and are spreading into new regions because of climate change. Diseases such as plague and cholera continue to erupt occasionally. Most recently, the reemergence of Ebola virus disease (EVD) in 2013 has perplexed the world. Reemerging infectious diseases remain a serious medical burden worldwide, with an estimated 15 million deaths per year directly related to them Hethcote (2000).

Mathematical modelling continues to play a significant role in epidemiology by providing deeper insight into the underlying mechanisms for the spread of emerging and reemerging infectious diseases and by suggesting effective control strategies Hethcote (2000). The successful eradication of these emerging diseases depends not only on the availability of medical infrastructure but also on the ability to understand the transmission dynamics of a particular disease, the application of optimal control strategies and the implementation of logistic policies Hethcote (2000). Mathematical models have been used in comparing, planning, implementing, evaluating, and optimizing various detection, prevention, therapy, and control programs. Epidemiological modelling has contributed to the design and analysis of epidemiological surveys, suggested crucial data that should be collected, identified trends, made general forecasts, and estimated the uncertainty in forecasts Hethcote (2000). Mathematical models have been used to answer the following questions:

  • How many people will be infected?

  • How many infected people will require hospitalization?

  • What is the expected maximum number of people infected at any given time?

  • What is the estimated duration of the epidemic?

These questions, which are of interest to public health officials, are generally explored by identifying the mechanisms responsible for the epidemic, without adequately taking into consideration the economic constraints in analyzing the control strategies. Since economic resources are limited, epidemiological models have started to account for the constraints imposed by limited resources when analyzing control strategies. Optimal control theory has been applied to mathematical models of HIV Zarei et al. (2010), Kwon et al. (2012), Karrakchou et al. (2006), Kwon (2007), Roshanfekr et al. (2014), Okosun et al. (2013), Zhou et al. (2014), Adams et al. (2005), Costanza et al. (2013), Orellana (2011); malaria Okosun et al. (2013), Okosun et al. (2011), Okosun and Makinde (2014), Makinde and Okosun (2011), Kim (2012), Prosper et al. (2014); tuberculosis Moualeu et al. (2015), Silva and Torres (2013), Agusto and Adekunle (2014), Bowong and Aziz Alaoui (2013), Whang et al. (2011); vector-borne diseases Lashari (2012), Graesboll et al. (2014), Sung Lee and Ali Lashari (2014); and other diseases Yan and Zou (2008), Agusto (2013), Brown and Jane White (2011), Zaman et al. (2008), Okosun and Makinde (2014), Su and Sun (2015), Buonomo et al. (2014), Lowden et al. (2014), Roshanfekr et al. (2014), Apreutesei et al. (2014), Imran et al. (2014).

Epidemiological models often split the total population into different classes called compartments with labels such as \(S\), \(V\), \(E\), \(I\), \(R\), and \(T\) to represent, respectively, the susceptible, vaccinated, exposed, infectious, recovered and treated individuals. The choice of compartments to be included in a mathematical model often depends on the following:

  • the control mechanism;

  • the type and properties of the disease being modelled;

  • the purpose of the mathematical model.

An ordinary differential equation (ODE) or a partial differential equation (PDE) with respect to time is usually formulated for each subclass. For the purpose of this survey, mathematical models will be classified based on the control mechanism and theoretical results will be given.

The paper is organized as follows. A brief description of the necessary and sufficient conditions for the existence of multi-objective optimal control is provided in Sect. 2. A detailed description and analysis of the application of multi-objective optimal control theory to continuous-time mathematical models is presented in Sect. 3. Similarly, Sect. 4 gives a detailed description and analysis of the application of the theory to discrete-time mathematical models. Finally, Sect. 5 summarizes the conclusions based on the review.

2 Multi-objective optimal control

Suppose \(x(t)\in X \subset {\mathbb {R}}^n\) represents the state variables of a system and \(u(t)\in {\mathfrak {U}} \subset {\mathbb {R}}^m\) represents the control variables at time \(t\), with \(t_0\le t\le t_f\). An optimal control problem consists of finding a piecewise continuous control \(u(t)\) and the associated state \(x(t)\) that optimize a cost functional \(J[x(t),u(t)]\). The majority of mathematical models that use optimal control theory rely on Pontryagin’s Maximum Principle, which is a first-order condition for finding the optimal solution. This is reproduced below for convenience.

Theorem 2.1

(Pontryagin’s Maximum Principle Lenhart and Workman (2007)) If \(u^*(t)\) and \(x^*(t)\) are optimal for the problem

$$\begin{aligned} \begin{aligned}&\displaystyle \max _{u} J[x(t),u(t)],\;\; \mathrm{where}\, J [x(t),u(t)] = \displaystyle \int _{t_0}^{t_f} {f(t,x(t),u(t))dt},\\ \mathrm{subject\,\,to}\,\,&{\left\{ \begin{array}{ll} \displaystyle \frac{dx}{dt}=g(t,x(t),u(t))\\ x(t_0)=x_0, \end{array}\right. } \end{aligned} \end{aligned}$$
(1)

then there exists a piecewise differentiable adjoint variable \(\lambda (t)\) such that

$$\begin{aligned} H(t,x^*(t),u(t),\lambda (t))\le H(t,x^*(t),u^*(t),\lambda (t)) \end{aligned}$$

for all controls \(u\) at each time \(t\), where the Hamiltonian \(H\) is given by

$$\begin{aligned} H(t,x(t),u(t),\lambda (t))=f(t,x(t),u(t))+\lambda (t)g(t,x(t),u(t)) \end{aligned}$$
(2)

and

$$\begin{aligned} {\left\{ \begin{array}{ll} \lambda ^{'}(t) &{}= -\displaystyle \frac{\partial H(t,x^*(t),u^*(t),\lambda (t))}{\partial x},\\ \lambda (t_f) &{}= 0. \end{array}\right. } \end{aligned}$$

While the Pontryagin’s Maximum Principle gives the necessary conditions for the existence of an optimal solution, the following theorem provides the sufficient conditions.

Theorem 2.2

(Arrow Sufficiency Theorem  Chiang (1992)) For the minimization counterpart of the optimal control problem (1), the conditions of the maximum principle are sufficient for the global minimization of \(J[x(t),u(t)]\), if the minimized Hamiltonian function \(H\), defined in (2), is convex in the variable \(x\) for all \(t\) in the time interval \([t_0,t_f]\), for a given \(\lambda \).

One of the major side-effects of vaccination/treatment is the creation of drug resistant virus/bacteria which eventually leads to drug failure (due to ineffectiveness of the vaccine/treatment). Optimal control has been used to curb the creation of drug resistant virus/bacteria or drug failure (at the same time reducing the cost of treatment or vaccination) by imposing a condition that monitors the global effect of the vaccination/treatment program. Let \(x(t)\) represent the group of individuals to be vaccinated/treated and let \(u(t)\in {\mathcal {U}}\) represent the control on vaccination/treatment, where the control set \({\mathcal {U}}\) is given by

$$\begin{aligned} {\mathcal {U}}=\{u(t)|v_0\le u(t)\le v_1,\,\,\mathrm{Lebesgue\,measurable}\}. \end{aligned}$$

Then, the following objective functions are to be minimized simultaneously:

$$\begin{aligned} I_1(u)=\int _{t_0}^{t_f}x(t)dt,\,\,\,\mathrm{and}\,\,\,I_2(u)=\int _{t_0}^{t_f}u^m(t)dt,\,\,\mathrm{for}\,\,m>0, \end{aligned}$$

and the optimal solution can be represented as

$$\begin{aligned} \displaystyle \min _{u\in {\mathcal {U}}}\{I_1(u),I_2(u)\}. \end{aligned}$$
(3)

In general, there does not exist a feasible solution that minimizes both objective functions simultaneously. Hence, the concept of Pareto optimality is used to find an optimal control \(u^*\) that offers the best trade-off between the two objective functions.

Definition 1

A solution \(u^*\in {\mathcal {U}}\) is called a Pareto optimal solution of the problem (3) if and only if there exists no other solution \(u\in {\mathcal {U}}\) such that \(I_i(u)\le I_i(u^*)\) for all \(i=1,2\), and \(I_i(u)<I_i(u^*)\) for some \(i=1,2\).
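
To make the dominance relation behind Definition 1 concrete, the following minimal Python sketch filters a finite set of candidate controls, each summarized by its pair of objective values \((I_1,I_2)\), down to the non-dominated ones. Both objectives are minimized, as in problem (3). The candidate values and the function names `dominates` and `pareto_front` are illustrative assumptions and are not taken from the cited works.

```python
def dominates(a, b):
    """True if objective vector a is at least as good as b in every
    objective and strictly better in at least one (minimization)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical (I_1, I_2) values for five candidate controls.
candidates = [(4.0, 1.0), (3.0, 2.0), (3.5, 2.5), (1.0, 5.0), (2.0, 5.0)]
front = pareto_front(candidates)
```

In this toy set, the pair (3.5, 2.5) is removed because (3.0, 2.0) is better in both objectives, and (2.0, 5.0) is removed because (1.0, 5.0) is better in the first objective and equal in the second.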

The goal programming model and the scalarization method are two of the various methods usually used for obtaining a Pareto optimal solution of a multi-objective problem.

2.1 Goal programming model

The goal programming model is a well-known aggregating methodology for solving multi-objective programming problems by taking into account simultaneously several conflicting objectives. Thus the solution obtained through the goal programming model represents the best compromise that can be achieved by the decision maker. The Goal Programming model is a distance function where the unwanted positive and negative deviations, between the achievement and aspiration levels, are to be minimized. Goal Programming model has been widely applied in several fields such as accounting, marketing, quality control, human resources, production, economics and operations management [for example, in stochastic and deterministic optimal control models Paolo et al. (2014), Forster et al. (2014), Anita et al. (2013), La Torre and Marsiglio (2010) and in stochastic and deterministic scenario-based multi-criteria decision making models Aouni et al. (2014), Belad et al. (2013), Marco and La Torre (2012)]. In epidemiology, the scalarization method is widely used and this is discussed next.

2.2 Scalarization method

A multi-objective problem is often solved by combining all multiple objectives into one single-objective scalar function, known as the weighted-sum or scalarization method. Hence, for the problem (3), a single-objective functional \(I(u)\) can be formed by summing the weighted objectives as follows

$$\begin{aligned} \displaystyle \min _{u\in {\mathcal {U}}} I(u) =\, \min _{u\in {\mathcal {U}}} \sum _{j=1}^2{A_jI_j(u)}. \end{aligned}$$
(4)

The following Theorem guarantees that the solution of the weighted sum is Pareto optimal.

Theorem 2.3

The solution of the weighted sum problem (4) is Pareto optimal if the weighting coefficients are positive, that is, \(A_i>0\) for all \(i=1,2\) and \(\displaystyle \sum _{i=1}^2{A_i}=1\).
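
As a small illustration of the weighted-sum idea in (4), the Python sketch below picks, from a finite set of candidates, the one with the smallest weighted sum and checks that it is not dominated by any other candidate. The objective values, the weights and the helper names are hypothetical and serve only to illustrate the statement of the theorem on a discrete example.

```python
def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def weighted_sum_minimizer(points, weights):
    """Return the point minimizing sum_j A_j * I_j over a finite candidate set."""
    return min(points, key=lambda p: sum(w * v for w, v in zip(weights, p)))


# Hypothetical (I_1, I_2) values for four candidate controls.
candidates = [(4.0, 1.0), (2.5, 2.0), (3.5, 2.5), (1.0, 5.0)]

# Positive weights that sum to one, as required by the theorem.
choices = {w: weighted_sum_minimizer(candidates, w)
           for w in [(0.5, 0.5), (0.9, 0.1), (0.1, 0.9)]}

# Each selected candidate should be non-dominated within the set.
all_nondominated = all(
    not any(dominates(q, p) for q in candidates if q != p)
    for p in choices.values()
)
```

Changing the weights moves the selected point along the set of non-dominated candidates. A known limitation of the weighted-sum approach is that it can miss Pareto optimal points lying on non-convex parts of the trade-off curve.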

The present survey focuses on the use of multi-criteria optimal control in deterministic mathematical models. Deterministic mathematical models can be either continuous-time or discrete-time. We first consider the continuous-time mathematical models.

3 Continuous-time mathematical model

Continuous-time mathematical models have been used to study the dynamics of infectious diseases within a human host and in the population. Optimal control has been used, in the past, to find an optimal schedule for vaccine, treatment and chemotherapy for an infected individual. It has also been used to optimally manage the resources associated with quarantine and isolation programs Yan and Zou (2008).

Consider the following SIR model without control strategies, initially designed and studied by Kermack and McKendrick (1927). This model categorises individuals in a population as Susceptible \((S)\), Infected \((I)\) and Recovered \((R)\). It simulates the transmission dynamics of diseases where individuals acquire permanent immunity. Examples include mumps, typhoid fever and smallpox:

$$\begin{aligned} \begin{aligned} \frac{dS}{dt}&=\Pi -\beta SI - \mu S,\\ \frac{dI}{dt}&=\beta SI - \gamma I - \mu I,\\ \frac{dR}{dt}&=\gamma I - \mu R, \end{aligned} \end{aligned}$$
(5)

The model assumes a constant recruitment rate (by birth) \(\Pi \) into the susceptible class. Susceptible individuals acquire infection and become infected, following effective contact with infected individuals, at a rate \(\beta SI\), where \(\beta \) is the effective contact rate. Infected individuals recover and move to the recovered class \(R\) at a rate \(\gamma \). Natural death occurs in all classes at a rate \(\mu \). All discussions will be based on incorporating different controls in the SIR model. This is done next.
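
The uncontrolled model (5) can be explored numerically before any control is added. The following Python sketch integrates (5) with a classical fourth-order Runge-Kutta scheme using only the standard library. All parameter values, the initial state and the function names are hypothetical choices made for illustration and do not come from the cited works.

```python
def sir_rhs(y, Pi, beta, gamma, mu):
    """Right-hand side of the SIR model (5)."""
    S, I, R = y
    return [Pi - beta * S * I - mu * S,
            beta * S * I - gamma * I - mu * I,
            gamma * I - mu * R]


def rk4_step(f, y, dt):
    """One classical Runge-Kutta step for an autonomous system y' = f(y)."""
    k1 = f(y)
    k2 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)])
    k3 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)])
    k4 = f([yi + dt * ki for yi, ki in zip(y, k3)])
    return [yi + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def simulate_sir(y0, t_end, dt, **params):
    """Return the list of states [S, I, R] on a uniform time grid."""
    steps = int(round(t_end / dt))
    trajectory = [list(y0)]
    for _ in range(steps):
        trajectory.append(
            rk4_step(lambda y: sir_rhs(y, **params), trajectory[-1], dt))
    return trajectory


# Hypothetical parameter values, chosen only for illustration.
params = dict(Pi=20.0, beta=0.0005, gamma=0.1, mu=0.02)
trajectory = simulate_sir([990.0, 10.0, 0.0], t_end=100.0, dt=0.1, **params)
peak_infected = max(state[1] for state in trajectory)
final_total = sum(trajectory[-1])
```

Adding the three equations in (5) gives \(dN/dt=\Pi -\mu N\) for the total population \(N=S+I+R\), so a run started at \(N(0)=\Pi /\mu \) should keep the total constant. This provides a simple consistency check on the integrator.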

3.1 Optimal control in SIR model with vaccination

Consider the extension of the SIR model (5) through incorporation of a vaccination class \((V)\):

$$\begin{aligned} \frac{dS}{dt}= & {} \Pi -\beta SI - \alpha u(t)S - \mu S,\nonumber \\ \frac{dV}{dt}= & {} \alpha u(t)S-\beta \epsilon VI - \mu V,\nonumber \\ \frac{dI}{dt}= & {} \beta (S+\epsilon V)I - \gamma I - \mu I,\nonumber \\ \frac{dR}{dt}= & {} \gamma I - \mu R. \end{aligned}$$
(6)

with the initial conditions \(S(0)>0, I(0)\ge 0\), \(R(0)\ge 0\) and \(V(0)\ge 0\). The new class represents the group of individuals who get vaccinated when susceptible, at a continuous rate \(\alpha u(t)\). The model assumes that the vaccination is not 100 % effective and, as such, individuals in this class can be infected via contact with individuals in the infected/infectious class \(I\), but at a lower rate \(\beta \epsilon \,(0\le \epsilon <1)\). The control function \(u(t)\), with \(0\le u(t)\le 1\), represents the fraction of susceptible individuals that requires vaccination. When \(u(t)\) is close to 1, vaccination failure is very low but the implementation cost is high.

For the model (6), the single-objective cost functional to be minimized is given by

$$\begin{aligned} J(u)=\int _{t_0}^{t_f}{\bigg [a_0I(t)+\frac{a_1}{2}u^2(t)\bigg ]dt}, \end{aligned}$$
(7)

with \(a_0>0\) and \(a_1>0\), where we want to minimize the infected/infectious group \(I\) while also keeping the cost of vaccination \(u(t)\) low. It is generally assumed that the cost of control is nonlinear, with the quadratic form given in Eq. (7), which is a convex function. This quadratic form penalizes “giving too much of vaccine to an individual”, which often leads to waste. The term \(a_0I(t)\) represents the cost of infection, while the term \(\displaystyle \frac{a_1}{2}u^2(t)\) represents the cost of the vaccination program at time \(t\). The goal is to find an optimal control, \(u^*\), such that

$$\begin{aligned} J(u^*)=\min _{\Omega _1}{J(u)} \end{aligned}$$
(8)

where

$$\begin{aligned} \Omega _1 = \{u|0\le u \le 1,\,\,\mathrm{Lebesgue\,measurable}\}. \end{aligned}$$
(9)

Applying Pontryagin’s Maximum Principle, we have the following result.

Theorem 3.1

There exists an optimal control \(u^*\) and the corresponding solution \((S^*, V^*, I^*, R^*)\) of the system (6), that minimizes \(J(u)\) over \(\Omega _1\). Furthermore, there exist adjoint functions, \(\lambda _1(t), \ldots , \lambda _4(t)\), such that

$$\begin{aligned} \frac{d\lambda _1}{dt}= & {} \lambda _1(\beta I^*+\mu +\alpha u^*)-\lambda _2\alpha u^*-\lambda _3\beta I^*,\nonumber \\ \frac{d\lambda _2}{dt}= & {} \lambda _2(\beta \epsilon I^*+\mu )-\lambda _3\beta \epsilon I^*,\nonumber \\ \frac{d\lambda _3}{dt}= & {} -a_0+\lambda _1\beta S^*+\lambda _2\beta \epsilon V^*-\lambda _3[\beta (S^*+\epsilon V^*)-\mu -\gamma ]-\lambda _4\gamma ,\nonumber \\ \frac{d\lambda _4}{dt}= & {} \lambda _4\mu , \end{aligned}$$
(10)

with the transversality conditions

$$\begin{aligned} \lambda _i(t_f)=0,\,\,\,\,i=1,\ldots ,4. \end{aligned}$$
(11)

and the control \(u^*\) satisfies the optimality condition

$$\begin{aligned} u^*(t)=\min \bigg (\max \bigg (0,\displaystyle \frac{\alpha S^*(\lambda _1-\lambda _2)}{a_1}\bigg ),1\bigg ) \end{aligned}$$
(12)

Proof

The existence of an optimal control is guaranteed by Corollary 4.1 of Fleming and Rishel (1975) due to the following:

  • the convexity of the integrand of \(J\) with respect to \(u\);

  • a priori boundedness of the state solutions;

  • Lipschitz property of the state system with respect to the state variables.

Thus, applying Pontryagin’s Maximum Principle, we convert (6), (7) and (8) into a problem of minimizing a Hamiltonian, \(H\), pointwise with respect to \(u\):

$$\begin{aligned} H=a_0I(t)+\frac{a_1}{2}u^2(t)+\displaystyle \sum _{i=1}^4\lambda _if_i \end{aligned}$$
(13)

where \(f_i\), \(i=1,\ldots ,4\) are the right-hand sides of the system (6) and we have the adjoint equations

$$\begin{aligned} \frac{d\lambda _1}{dt}= & {} -\frac{\partial H}{\partial S},\,\,\lambda _1(t_f)=0,\nonumber \\ \frac{d\lambda _2}{dt}= & {} -\frac{\partial H}{\partial V},\,\,\lambda _2(t_f)=0,\nonumber \\ \frac{d\lambda _3}{dt}= & {} -\frac{\partial H}{\partial I},\,\,\lambda _3(t_f)=0,\nonumber \\ \frac{d\lambda _4}{dt}= & {} -\frac{\partial H}{\partial R},\,\,\lambda _4(t_f)=0. \end{aligned}$$
(14)

Evaluating the equations in (14) at the optimal control and the corresponding states will give the adjoint system (10) and (11). On the interior of the set \(\Omega _1\), where \(0< u<1\), we have

$$\begin{aligned} \frac{\partial H}{\partial u}=0. \end{aligned}$$
(15)

Solving Equation (15) for \(u^*\) gives the characterization (12). \(\square \)

It can be shown that the state \([S(t), V(t), I(t), R(t)]\) and the adjoint functions \(\lambda _1(t), \ldots , \lambda _4(t)\) are all bounded. Furthermore, based on the Lipschitz structure of the ODEs, a unique optimal control \(u^*\) is obtained for small \(t_f\). The uniqueness of the optimal control follows from the uniqueness of the optimality system, which consists of (6), (10) and (11) with the characterizations (12).
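
The optimality system (6), (10), (11) with (12) is a two-point boundary value problem: the states are fixed at the initial time and the adjoints at the final time. One common numerical approach for such systems is a forward-backward sweep, which is not described in this survey, so the Python sketch below is an illustration under stated assumptions rather than the method of the cited works. It assumes explicit Euler steps, hypothetical parameter values, a relaxation factor of one half in the control update, and the control weight \(a_1\) from (7) in the denominator of the update.

```python
def clamp(v, lo=0.0, hi=1.0):
    return max(lo, min(hi, v))


def state_rhs(x, u, p):
    """Right-hand side of the SVIR model (6)."""
    S, V, I, R = x
    return [p["Pi"] - p["beta"] * S * I - p["alpha"] * u * S - p["mu"] * S,
            p["alpha"] * u * S - p["beta"] * p["eps"] * V * I - p["mu"] * V,
            p["beta"] * (S + p["eps"] * V) * I - (p["gamma"] + p["mu"]) * I,
            p["gamma"] * I - p["mu"] * R]


def adjoint_rhs(lam, x, u, p):
    """Right-hand side of the adjoint system (10)."""
    S, V, I, R = x
    l1, l2, l3, l4 = lam
    b, e, mu, al, g = p["beta"], p["eps"], p["mu"], p["alpha"], p["gamma"]
    return [l1 * (b * I + mu + al * u) - l2 * al * u - l3 * b * I,
            l2 * (b * e * I + mu) - l3 * b * e * I,
            -p["a0"] + l1 * b * S + l2 * b * e * V
            - l3 * (b * (S + e * V) - mu - g) - l4 * g,
            l4 * mu]


def forward_backward_sweep(x0, p, t_f, n, max_iter=100, tol=1e-4):
    dt = t_f / n
    u = [0.0] * (n + 1)                      # initial guess: no vaccination
    for _ in range(max_iter):
        # Forward pass: states from t_0 with the current control.
        xs = [list(x0)]
        for k in range(n):
            dx = state_rhs(xs[k], u[k], p)
            xs.append([xi + dt * di for xi, di in zip(xs[k], dx)])
        # Backward pass: adjoints from the transversality condition (11).
        lams = [None] * (n + 1)
        lams[n] = [0.0, 0.0, 0.0, 0.0]
        for k in range(n, 0, -1):
            dl = adjoint_rhs(lams[k], xs[k], u[k], p)
            lams[k - 1] = [li - dt * di for li, di in zip(lams[k], dl)]
        # Control update from the characterization (12), then relaxation.
        u_new = [clamp(p["alpha"] * xs[k][0] * (lams[k][0] - lams[k][1]) / p["a1"])
                 for k in range(n + 1)]
        u_next = [0.5 * (old + new) for old, new in zip(u, u_new)]
        change = max(abs(a - b) for a, b in zip(u, u_next))
        u = u_next
        if change < tol:
            break
    return u, xs, lams   # xs and lams belong to the control of the last pass


# Hypothetical parameter values, chosen only for illustration.
p = dict(Pi=20.0, mu=0.02, beta=0.0005, gamma=0.1,
         alpha=0.5, eps=0.2, a0=1.0, a1=100.0)
u_opt, states, adjoints = forward_backward_sweep(
    [990.0, 0.0, 10.0, 0.0], p, t_f=30.0, n=600)
```

Because the adjoints vanish at \(t_f\), the computed control is zero at the final time, which is a quick sanity check on any implementation. Convergence of the sweep is not guaranteed in general and may require a smaller relaxation factor or a shorter horizon.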

3.2 Optimal control in SIR model with treatment

Consider the SIR model with treatment, given by:

$$\begin{aligned} \frac{dS}{dt}= & {} \Pi -\beta SI - \mu S,\nonumber \\ \frac{dI}{dt}= & {} \beta SI - \gamma I - \mu I,\nonumber \\ \frac{dT}{dt}= & {} u(t)\kappa \gamma I - \tau T - \mu T,\nonumber \\ \frac{dR}{dt}= & {} (1-u(t)\kappa )\gamma I + \tau T - \mu R, \end{aligned}$$
(16)

A new treatment class \((T)\) is added to the SIR model (5) which represents the group of individuals who are receiving treatment to cure an infection. The model assumes that people leave the infected class \(I\) at a rate \(\gamma \). A fraction \((1-u(t)\kappa )\) of those leaving the infected class will recover from the infection without receiving treatment while the remaining fraction \(u(t)\kappa \) will receive treatment and move to the treatment class \((T)\). Treated individuals recover faster, at a rate \(\tau \), as compared to those who do not receive treatment, so that \(\tau >(1-u(t)\kappa )\gamma \). The control function \(u(t)\), with \(0\le u(t)\le 1\), represents the fraction of the infected individuals who are identified and will be treated (to reduce the number of individuals that may be infectious). When \(u(t)\) is close to 1, the treatment failure is low but the implementation cost is high.

For the model (16), the single-objective cost functional to be minimized is given by

$$\begin{aligned} J(u)=\int _{t_0}^{t_f}{\bigg [a_0I(t)+\frac{a_1}{2}u^2(t)\bigg ]dt}, \end{aligned}$$
(17)

with \(a_0>0\) and \(a_1>0\), where we want to minimize the infectious group \(I\) while also keeping the cost of treatment \(u(t)\) low. The term \(a_0I(t)\) represents the cost of infection, while the term \(\displaystyle \frac{a_1}{2}u^2(t)\) represents the cost of treatment. The goal is to find an optimal control, \(u^*\), such that

$$\begin{aligned} J(u^*)=\min _{\Omega _2}{J(u)} \end{aligned}$$
(18)

where

$$\begin{aligned} \Omega _2 = \{u|0\le u \le 1,\,\,\mathrm{Lebesgue\,measurable}\}. \end{aligned}$$
(19)

Applying Pontryagin’s Maximum Principle, we have the following result.

Theorem 3.2

There exists an optimal control \(u^*\) and the corresponding solution \((S^*, I^*, T^*, R^*)\) of the system (16), that minimizes \(J(u)\) over \(\Omega _2\). Furthermore, there exist adjoint functions, \(\lambda _1(t),\ldots ,\lambda _4(t)\), such that

$$\begin{aligned} \frac{d\lambda _1}{dt}= & {} \lambda _1(\beta I^*+\mu )-\lambda _2\beta I^*,\nonumber \\ \frac{d\lambda _2}{dt}= & {} -a_0+\lambda _1\beta S^*-\lambda _2[\beta S^*-\mu -\gamma ]-\lambda _3u^*\kappa \gamma -\lambda _4(1-u^*\kappa )\gamma ,\nonumber \\ \frac{d\lambda _3}{dt}= & {} \lambda _3(\tau +\mu )-\lambda _4\tau ,\nonumber \\ \frac{d\lambda _4}{dt}= & {} \lambda _4\mu . \end{aligned}$$
(20)

with transversality conditions

$$\begin{aligned} \lambda _i(t_f)=0,\,\,\,\,i=1,\ldots ,4. \end{aligned}$$
(21)

The control \(u^*\) satisfies the optimality condition

$$\begin{aligned} u^*(t)=\min \bigg (\max \bigg (0,\frac{\kappa \gamma I^*(\lambda _4-\lambda _3)}{a_1}\bigg ),1\bigg ) \end{aligned}$$
(22)

Proof

The proof is similar to that of Theorem 3.1 with the Hamiltonian, \(H\), given by

$$\begin{aligned} H=a_0I(t)+\frac{a_1}{2}u^2(t)+\displaystyle \sum _{i=1}^4\lambda _if_i, \end{aligned}$$
(23)

where \(f_i\), \(i=1,\ldots ,4\) are the right-hand sides of the differential Eq. (16). \(\square \)

Similarly, due to the uniqueness of the optimality system (16), (20), (21) with the characterizations (22), a unique optimal control \(u^*\) exists for small \(t_f\).
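
For completeness, the step from the Hamiltonian (23) to the characterization (22) can be sketched as follows. Only the terms of \(H\) that contain \(u\) matter, namely the quadratic cost and the terms \(\lambda _3u\kappa \gamma I\) and \(\lambda _4(1-u\kappa )\gamma I\) coming from the third and fourth equations of (16). On the interior of \(\Omega _2\), the stationarity condition gives

$$\begin{aligned} \frac{\partial H}{\partial u}=a_1u+\lambda _3\kappa \gamma I-\lambda _4\kappa \gamma I=0\quad \Longrightarrow \quad u=\frac{\kappa \gamma I(\lambda _4-\lambda _3)}{a_1}. \end{aligned}$$

Imposing the bounds \(0\le u\le 1\) and evaluating at the optimal state then yields (22).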

3.3 Optimal control in SIR model with quarantine and isolation

Consider the SIR model with quarantine and isolation, given by:

$$\begin{aligned} \frac{dS}{dt}= & {} \Pi + \rho Q-\beta S(I+\epsilon _QQ+\epsilon _JJ) - \mu S,\nonumber \\ \frac{dE}{dt}= & {} \beta S(I+\epsilon _QQ+\epsilon _JJ) - \gamma E - \mu E,\nonumber \\ \frac{dQ}{dt}= & {} u_1(t)\kappa \gamma E - \eta Q - \rho Q - \mu Q,\nonumber \\ \frac{dI}{dt}= & {} (1-u_1(t)\kappa )\gamma E -\alpha I - \mu I,\nonumber \\ \frac{dJ}{dt}= & {} u_2(t)\nu \alpha I + \eta Q - \sigma J - \mu J,\nonumber \\ \frac{dR}{dt}= & {} (1-u_2(t)\nu )\alpha I + \sigma J - \mu R, \end{aligned}$$
(24)

Here, three additional classes, the exposed \((E)\), quarantined \((Q)\) and isolated \((J)\) classes, have been added. The new model (24) assumes that susceptible individuals acquire new infection via contact with individuals in the infectious, quarantined or isolated classes at a rate \(\beta S(I+\epsilon _QQ+\epsilon _JJ)\), with \(\epsilon _Q\ge 0\) representing varying levels of hygiene precautions that may or may not limit the quarantined individuals from making an effective contact with the susceptible individuals. The parameter \(\epsilon _J\ge 0\) represents the level of hygiene precautions during isolation. Upon exposure to infection, exposed individuals who can be identified are quarantined at a rate \(u_1(t)\kappa \gamma \) while those who cannot be identified will become infectious at a rate \((1-u_1(t)\kappa )\gamma \), without being quarantined. Some of the individuals in the quarantined class will develop symptoms at a rate \(\eta \) and will be isolated, while those who do not develop symptoms and clear infection may become susceptible again at a rate \(\rho \). Infectious individuals who have been identified are isolated at a rate \(u_2(t)\nu \alpha \) while others recover from the infection at a rate \((1-u_2(t)\nu )\alpha \). Isolated individuals eventually recover from the infection at a rate \(\sigma \) and move to the \(R\) class.

The control function \(u_1(t)\), with \(0\le u_1(t)\le 1\), represents the fraction of the exposed individuals (people who have been in contact with an infected individual) who are identified and will be quarantined. The control function \(u_2(t)\), with \(0\le u_2(t)\le 1\), similarly represents the fraction of the infectious (symptomatic) individuals who are identified and will be isolated. When \(u_1(t)\) or \(u_2(t)\) is close to 1, the quarantine or isolation failure is low but the implementation cost is high. For the model (24), the single-objective cost functional to be minimized is given by

$$\begin{aligned} J(u_1,u_2)=\int _0^{t_f}{\bigg [a_0Q(t)+a_1I(t)+a_2J(t)+\frac{a_3}{2}u_1^2(t)+\frac{a_4}{2}u_2^2(t)\bigg ]dt}, \end{aligned}$$
(25)

with \(a_i>0\), \(i=0,\ldots ,4\), where we want to minimize the quarantined, infectious and isolated groups \(Q\), \(I\) and \(J\) while also keeping the costs of the controls \(u_1(t)\) and \(u_2(t)\) low. The terms \(a_0Q(t)\), \(a_1I(t)\) and \(a_2J(t)\) represent the cost of infection, while the terms \(\displaystyle \frac{a_3}{2}u_1^2(t)\) and \(\displaystyle \frac{a_4}{2}u_2^2(t)\) represent the cost of quarantine and isolation, respectively. The goal is to find an optimal control pair, \(u_1^*\) and \(u_2^*\), such that

$$\begin{aligned} J\left( u_1^*,u_2^*\right) =\min _{\Omega _3}{J(u_1,u_2)} \end{aligned}$$
(26)

where

$$\begin{aligned} \Omega _3 = \{(u_1,u_2)|0\le u_i \le 1,\,\,\mathrm{Lebesgue\,measurable}\,i=1,2\}. \end{aligned}$$
(27)

Applying the Pontryagin’s Maximum Principle, we have the following result

Theorem 3.3

There exists an optimal control pair \(u_1^*\) and \(u_2^*\) and the corresponding solution \((S^*, E^*, Q^*, I^*, J^*, R^*)\) of the system (24), that minimizes \(J(u_1,u_2)\) over \(\Omega _3\). Furthermore, there exist adjoint functions, \(\lambda _1(t),\ldots ,\lambda _6(t)\), such that

$$\begin{aligned} \frac{d\lambda _1}{dt}= & {} \lambda _1[\beta (I^*+\epsilon _QQ^*+\epsilon _JJ^*)+\mu ]-\lambda _2\beta (I^*+\epsilon _QQ^*+\epsilon _JJ^*),\nonumber \\ \frac{d\lambda _2}{dt}= & {} \lambda _2(\gamma +\mu )-\lambda _3u_1^*\kappa \gamma -\lambda _4(1-u_1^*\kappa )\gamma ,\nonumber \\ \frac{d\lambda _3}{dt}= & {} -a_0+\lambda _1[\beta \epsilon _QS^*-\rho ]-\lambda _2\beta \epsilon _QS^*+\lambda _3(\eta +\rho +\mu )-\lambda _5\eta ,\nonumber \\ \frac{d\lambda _4}{dt}= & {} -a_1+\lambda _1\beta S^*-\lambda _2\beta S^*+\lambda _4(\alpha +\mu )-\lambda _5u_2^*\nu \alpha -\lambda _6(1-u_2^*\nu )\alpha ,\nonumber \\ \frac{d\lambda _5}{dt}= & {} -a_2+\lambda _1\beta \epsilon _JS^*-\lambda _2\beta \epsilon _JS^*+\lambda _5(\sigma +\mu )-\lambda _6\sigma ,\nonumber \\ \frac{d\lambda _6}{dt}= & {} \lambda _6\mu , \end{aligned}$$
(28)

with transversality conditions

$$\begin{aligned} \lambda _i(t_f)=0,\,\,\,\,i=1,\ldots ,6. \end{aligned}$$
(29)

The control pair \(u_1^*\) and \(u_2^*\) satisfies the optimality condition

$$\begin{aligned} \begin{aligned} u_1^*(t)&=\min \bigg (\max \bigg (0,\frac{\kappa \gamma E^*(\lambda _4-\lambda _3)}{a_3}\bigg ),1\bigg )\\ \mathrm{and}&\\ u_2^*(t)&=\min \bigg (\max \bigg (0,\frac{\nu \alpha I^*(\lambda _6-\lambda _5)}{a_4}\bigg ),1\bigg ). \end{aligned} \end{aligned}$$
(30)

Proof

The proof follows with the Hamiltonian, \(H\), given by

$$\begin{aligned} H=a_0Q(t)+a_1I(t)+a_2J(t)+\frac{a_3}{2}u_1^2(t)+\frac{a_4}{2}u_2^2(t)+\displaystyle \sum _{i=1}^6\lambda _if_i, \end{aligned}$$
(31)

where \(f_i\), \(i=1,\ldots ,6\) are the right-hand sides of the differential Equation (24). \(\square \)

Here, again, due to the uniqueness of the optimality system (24), (28), (29) with the characterizations (30), a unique optimal control pair \((u_1^*,u_2^*)\) exists for small \(t_f\).
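
The characterization (30) can be checked numerically at a single time instant: with quadratic control costs, the control-dependent part of the Hamiltonian is a separable convex quadratic in \((u_1,u_2)\), so the clamped stationary point should not be beaten by any other admissible pair. The Python sketch below performs this check on a grid. The state values, adjoint values and parameters are hypothetical numbers chosen for illustration, and the function names are not from the cited works.

```python
def clamp(v, lo=0.0, hi=1.0):
    return max(lo, min(hi, v))


def control_pair(E, I, lam, kappa, gamma, nu, alpha, a3, a4):
    """Characterization (30); lam holds (lambda_1, ..., lambda_6)."""
    u1 = clamp(kappa * gamma * E * (lam[3] - lam[2]) / a3)
    u2 = clamp(nu * alpha * I * (lam[5] - lam[4]) / a4)
    return u1, u2


def hamiltonian_control_part(u1, u2, E, I, lam, kappa, gamma, nu, alpha, a3, a4):
    """Terms of the Hamiltonian that depend on u1 or u2 (quadratic costs)."""
    return (0.5 * a3 * u1 ** 2 + 0.5 * a4 * u2 ** 2
            + (lam[2] - lam[3]) * u1 * kappa * gamma * E
            + (lam[4] - lam[5]) * u2 * nu * alpha * I)


# Hypothetical values at one time instant, chosen only for illustration.
args = dict(E=100.0, I=50.0, lam=(0.0, 0.0, 1.0, 3.0, 2.0, 7.0),
            kappa=0.8, gamma=0.2, nu=0.9, alpha=0.1, a3=50.0, a4=10.0)

u1_star, u2_star = control_pair(**args)
h_star = hamiltonian_control_part(u1_star, u2_star, **args)

# Brute-force comparison over a 101 x 101 grid of admissible controls.
h_grid_min = min(hamiltonian_control_part(i / 100.0, j / 100.0, **args)
                 for i in range(101) for j in range(101))
```

In this example the first control lies in the interior of \([0,1]\), while the unconstrained stationary value of the second control exceeds 1 and is therefore clamped to the upper bound.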

3.4 Optimal control in age-structured model

Consider the SVIR model with age-structure given by:

$$\begin{aligned} \frac{\partial S}{\partial a}+\frac{\partial S}{\partial t}= & {} -\lambda (a,t)S - \alpha u(a,t)S - \mu (a)S,\nonumber \\ \frac{\partial V}{\partial a}+\frac{\partial V}{\partial t}= & {} \alpha u(a,t)S-\epsilon \lambda (a,t)V - \mu (a) V,\nonumber \\ \frac{\partial I}{\partial a}+\frac{\partial I}{\partial t}= & {} \lambda (a,t)(S+\epsilon V) -\gamma I - \mu (a) I,\nonumber \\ \frac{\partial R}{\partial a}+\frac{\partial R}{\partial t}= & {} \gamma I - \mu (a) R, \end{aligned}$$
(32)

with

$$\begin{aligned}&S+V+I+R=U(a,t),\,\,\lambda (a,t)=\int _0^{a_m}{\beta (a,\check{a})I(\check{a},t)d\check{a}},\\&S(0,t)=\Pi ,\,\,V(0,t)=I(0,t)=R(0,t)=0,\,\,S(a,0)=S_0(a),\\&V(a,0)=V_0(a),\,\,I(a,0)=I_0(a),\,\,R(a,0)=R_0(a), \end{aligned}$$

for \(0\le t\le t_f\) and \(0\le a\le a_m\). The rates \(\Pi \), \(\alpha , \epsilon \) and \(\gamma \) (assumed independent of age) are the same as in the SVIR model (6) without age structure. In the model (32), it is assumed that the contact rate between people of age \(a\) and \(\check{a}\) is separable in the form \(\beta (a,\check{a})=\kappa (a)\delta (\check{a})\), while \(\mu (a)\) is the age-specific per-capita death rate. The functions \(\kappa (a)\), \(\delta (a)\) and \(\mu (a)\) are assumed continuous and will take the value zero beyond some maximum age \((a_m)\). For the model (32), the single-objective cost functional to be minimized is given by

$$\begin{aligned} J(u)=\int _0^{t_f}\int _0^{a_m}{\bigg [AI(a,t)+\frac{B}{2}u^2(a,t)\bigg ]dadt}, \end{aligned}$$
(33)

with \(A>0\) and \(B>0\), where the goal is to minimize the infectious group \(I\) while also keeping the cost of vaccination \(u(a,t)\) low. The term \(AI(a,t)\) represents the cost of infection for all individuals in age group \(a\) at time \(t\), while the term \(\displaystyle \frac{B}{2}u^2(a,t)\) represents the cost of vaccination for all individuals in age group \(a\) at time \(t\). The goal is to find an optimal control, \(u^*\), such that

$$\begin{aligned} J(u^*)=\min _{\Omega _4}{J(u)}, \end{aligned}$$
(34)

where

$$\begin{aligned} \Omega _4 = \{u|0\le u \le 1,\,\,\mathrm{Lebesgue\,measurable}\}. \end{aligned}$$
(35)

The sensitivity equation (for variation \(l\)) of the model (32) is given by

$$\begin{aligned}&\frac{\partial \psi _1(a,t)}{\partial a}+\frac{\partial \psi _1(a,t)}{\partial t}+\psi _1(a,t)\int _0^{a_m}{\beta (a,\check{a})I(\check{a},t)d\check{a}}\nonumber \\&\quad +\,S(a,t)\int _0^{a_m}{\beta (a,\check{a})\psi _3(\check{a},t)d\check{a}}+\alpha u(a,t)\psi _1(a,t)+\mu (a)\psi _1=-\alpha Sl,\nonumber \\&\frac{\partial \psi _2(a,t)}{\partial a}+\frac{\partial \psi _2(a,t)}{\partial t}-\alpha u(a,t)\psi _1(a,t) +\epsilon \psi _2(a,t)\int _0^{a_m}{\beta (a,\check{a})I(\check{a},t)d\check{a}}\nonumber \\&\quad +\,\epsilon V(a,t)\int _0^{a_m}{\beta (a,\check{a})\psi _3(\check{a},t)d\check{a}} +\mu (a)\psi _2(a,t)=\alpha Sl,\nonumber \\&\frac{\partial \psi _3(a,t)}{\partial a}+\frac{\partial \psi _3(a,t)}{\partial t} -[\psi _1(a,t)+\epsilon \psi _2(a,t)]\int _0^{a_m}{\beta (a,\check{a})I(\check{a},t)d\check{a}}\nonumber \\&\quad -\,[S(a,t)+\epsilon V(a,t)]\int _0^{a_m}{\beta (a,\check{a})\psi _3(\check{a},t)d\check{a}}+(\gamma +\mu (a))\psi _3(a,t)=0,\nonumber \\&\frac{\partial \psi _4(a,t)}{\partial a}+\frac{\partial \psi _4(a,t)}{\partial t}-\gamma (\psi _3(a,t))+\mu (a)\psi _4(a,t)=0, \end{aligned}$$
(36)

which can be written in the form

$$\begin{aligned} {\mathcal {L}}\begin{pmatrix} \psi _1 \\ \psi _2 \\ \psi _3 \\ \psi _4 \\ \end{pmatrix}=\begin{pmatrix} -\alpha Sl \\ \alpha Sl \\ 0 \\ 0 \\ \end{pmatrix} \end{aligned}$$
(37)

with,

$$\begin{aligned} {\mathcal {L}}\begin{pmatrix} \psi _1 \\ \psi _2 \\ \psi _3 \\ \psi _4 \\ \end{pmatrix}=\bigg (\frac{\partial }{\partial a}+\frac{\partial }{\partial t}\bigg ) \begin{pmatrix} \psi _1 \\ \psi _2 \\ \psi _3 \\ \psi _4 \\ \end{pmatrix}+M\begin{pmatrix} \psi _1 \\ \psi _2 \\ \psi _3 \\ \psi _4 \\ \end{pmatrix}+{\mathcal {G}} \begin{pmatrix} \psi _1 \\ \psi _2 \\ \psi _3 \\ \psi _4 \\ \end{pmatrix} \end{aligned}$$
(38)

where, \(M\) is the matrix

$$\begin{aligned} M=\begin{pmatrix} \mu (a)+\alpha u(a,t)+ \Theta &{} 0 &{} 0 &{} 0 \\ -\alpha u(a,t) &{} \mu (a)+\epsilon \Theta &{} 0 &{} 0 \\ -\Theta &{} -\epsilon \Theta &{} \gamma +\mu (a) &{} 0 \\ 0 &{} 0 &{} -\gamma &{} \mu (a) \\ \end{pmatrix} \end{aligned}$$
(39)

and

$$\begin{aligned} \Theta =\displaystyle \int _0^{a_m}{\beta (a,\check{a})I(\check{a},t)d\check{a}}. \end{aligned}$$

The final term \({\mathcal {G}}\) is given by

$$\begin{aligned} {\mathcal {G}}\begin{pmatrix} \psi _1 \\ \psi _2 \\ \psi _3 \\ \psi _4 \\ \end{pmatrix}=\begin{pmatrix} S(a,t)\displaystyle \int _0^{a_m}{\beta (a,\check{a})\psi _3(\check{a},t)d\check{a}} \\ \epsilon V(a,t)\displaystyle \int _0^{a_m}{\beta (a,\check{a})\psi _3(\check{a},t)d\check{a}} \\ -[S(a,t)+\epsilon V(a,t)]\displaystyle \int _0^{a_m}{\beta (a,\check{a})\psi _3(\check{a},t)d\check{a}} \\ 0 \\ \end{pmatrix}. \end{aligned}$$
(40)

Using the relation

$$\begin{aligned} (\lambda _1,\lambda _2,\lambda _3,\lambda _4)\cdot {\mathcal {L}} \begin{pmatrix} \psi _1 \\ \psi _2 \\ \psi _3 \\ \psi _4 \\ \end{pmatrix}=(\psi _1,\psi _2,\psi _3,\psi _4)\cdot {\mathcal {L}}^* \begin{pmatrix} \lambda _1 \\ \lambda _2 \\ \lambda _3 \\ \lambda _4 \\ \end{pmatrix}, \end{aligned}$$
(41)

where \({\mathcal {L}}^*\) is the adjoint operator, we can find the equations satisfied by the adjoints \(\lambda _1,\ldots ,\lambda _4\). The adjoint PDE system is

$$\begin{aligned} {\mathcal {L}}^*\begin{pmatrix} \lambda _1 \\ \lambda _2 \\ \lambda _3 \\ \lambda _4 \\ \end{pmatrix}=\begin{pmatrix} 0 \\ 0 \\ A \\ 0 \\ \end{pmatrix}, \end{aligned}$$
(42)

where \(A\) is a constant from the cost functional. The adjoint operator is given by

$$\begin{aligned} {\mathcal {L}}^*\begin{pmatrix} \lambda _1 \\ \lambda _2 \\ \lambda _3 \\ \lambda _4 \\ \end{pmatrix}=-\bigg (\frac{\partial }{\partial a}+\frac{\partial }{\partial t}\bigg ) \begin{pmatrix} \lambda _1 \\ \lambda _2 \\ \lambda _3 \\ \lambda _4 \\ \end{pmatrix}+M^T\begin{pmatrix} \lambda _1 \\ \lambda _2 \\ \lambda _3 \\ \lambda _4 \\ \end{pmatrix}+{\mathcal {G}}^*\begin{pmatrix} \lambda _1 \\ \lambda _2 \\ \lambda _3 \\ \lambda _4 \\ \end{pmatrix}, \end{aligned}$$
(43)

with

$$\begin{aligned} {\mathcal {G}}^* \begin{pmatrix} \lambda _1 \\ \lambda _2 \\ \lambda _3 \\ \lambda _4 \\ \end{pmatrix}= \begin{pmatrix} \displaystyle \int _0^{a_m}{\beta (a,\check{a})S(\check{a},t)d\check{a}} \\ \epsilon \displaystyle \int _0^{a_m}{\beta (a,\check{a})V(\check{a},t)d\check{a}} \\ -\displaystyle \int _0^{a_m}{\beta (a,\check{a})[S(\check{a},t)+\epsilon V(\check{a},t)]d\check{a}} \\ 0 \\ \end{pmatrix}. \end{aligned}$$
(44)

For the adjoint system, we have zero Neumann conditions and zero final-time conditions. The adjoint system is evaluated at the optimal control \(u^*\) and the corresponding states \(S^*, V^*, I^*\) and \(R^*\). The transversality conditions are

$$\begin{aligned} \lambda _i(a,t_f)=0\,\,\,\mathrm{for}\,\,i=1,\ldots ,4\,\,\mathrm{and}\,\,0\le a \le a_m. \end{aligned}$$
(45)

The characterization of the optimal control is obtained by computing the directional derivative of the functional \(J(u)\) with respect to \(u\) in the direction \(l\) at \(u^*\). Since \(J(u^*)\) is the minimum value, we have

$$\begin{aligned} 0\le & {} \displaystyle \lim _{\epsilon \rightarrow 0^+}\frac{J(u^*+\epsilon l)-J(u^*)}{\epsilon },\\= & {} \lim _{\epsilon \rightarrow 0^+}\int _0^{t_f}\int _0^{a_m}{\bigg [A\bigg (\frac{I^{\epsilon }-I}{\epsilon }\bigg )+\frac{B}{2\epsilon }[(u^*+\epsilon l)^2-(u^*)^2]\bigg ]dadt},\\= & {} \int _0^{t_f}\int _0^{a_m}{(A\psi _3+Bu^*l)dadt},\\= & {} \int _0^{t_f}\int _0^{a_m}{\bigg [(\psi _1,\psi _2,\psi _3,\psi _4)\cdot \begin{pmatrix} 0 \\ 0 \\ A \\ 0 \\ \end{pmatrix}+Bu^*l\bigg ]dadt},\\= & {} \int _0^{t_f}\int _0^{a_m}{\bigg [(\lambda _1,\lambda _2,\lambda _3,\lambda _4)\cdot \begin{pmatrix} -\alpha Sl \\ \alpha Sl \\ 0 \\ 0 \\ \end{pmatrix}+Bu^*l\bigg ]dadt},\\= & {} \int _0^{t_f}\int _0^{a_m}{l(-\lambda _1\alpha S^*(a,t) + \lambda _2\alpha S^*(a,t) + Bu^*(a,t))dadt}. \end{aligned}$$

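Since this inequality must hold for every admissible direction \(l\), the coefficient of \(l\) in the integrand vanishes wherever \(0<u^*<1\), so in the interior of the control set the optimal control is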

$$\begin{aligned} u^*(a,t)=\frac{\alpha S^*(a,t)(\lambda _1-\lambda _2)}{B} \end{aligned}$$

Thus, we have the following result.

Theorem 3.4

There exists an optimal control \(u^*\) and a corresponding solution \((S^*(a,t),V^*(a,t),I^*(a,t),R^*(a,t))\) of the system (32) that minimizes \(J(u)\) over \(\Omega _4\). Furthermore, there exist adjoint equations (PDEs), given by (42), with transversality conditions (45), and the control \(u^*\) satisfies the optimality condition

$$\begin{aligned} u^*(a,t)=\min \bigg (\max \bigg (0,\frac{\alpha S^*(a,t)(\lambda _1-\lambda _2)}{B}\bigg ),1\bigg ) \end{aligned}$$
(46)

4 Optimal control in discrete-time mathematical model

Consider the discrete-time equivalent of the SIR model with vaccination given by:

$$\begin{aligned} S_{k+1}= & {} \Pi -(\mu -1)S_k-\beta S_kI_k - \alpha u_kS_k,\nonumber \\ V_{k+1}= & {} \alpha u_kS_k-(\mu -1)V_k-\beta \epsilon V_kI_k,\nonumber \\ I_{k+1}= & {} \beta (S_k+\epsilon V_k)I_k - (\mu -1)I_k - \gamma I_k,\nonumber \\ R_{k+1}= & {} \gamma I_k - (\mu -1)R_k. \end{aligned}$$
(47)

For the model (47), the single-objective cost functional to be minimized is given by

$$\begin{aligned} J(u)=A_{t_f}I_{t_f}+\sum _{k=0}^{t_f-1}\bigg (A_kI_k + \frac{B_k}{2}u_k^2\bigg ). \end{aligned}$$
(48)

Here \(A_k>0\) for \(k=0,\ldots ,t_f\) and \(B_k>0\) for \(k=0,\ldots ,t_f-1\) are the cost balancing factors. The problem minimizes the number of infected individuals over the time steps \(k=0\) to \(k=t_f\) while simultaneously minimizing the cost of the control. The goal is to find an optimal control \(u^*\) such that

$$\begin{aligned} J(u^*)=\min _{\Omega _5}{J(u)} \end{aligned}$$

where

$$\begin{aligned} \Omega _5 = \{u=(u_0,\ldots ,u_{t_f-1})\,|\,0\le u_k \le 1,\,\,k=0,\ldots ,t_f-1\}. \end{aligned}$$
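To make the problem concrete, the state recursion (47) and the cost (48) can be evaluated numerically for any admissible control sequence. The following is a minimal sketch in Python; the function names `simulate` and `cost`, the argument order, and the convention that `A` has length \(t_f+1\) while `B` and `u` have length \(t_f\) are assumptions made here for illustration and do not come from the model description.

```python
def simulate(u, x0, Pi, mu, beta, alpha, eps, gamma):
    """Iterate system (47) forward in time.

    u  : control values u_0, ..., u_{t_f - 1}, each assumed to lie in [0, 1]
    x0 : initial state (S_0, V_0, I_0, R_0)
    Returns the list of states for k = 0, ..., t_f.
    """
    S, V, I, R = x0
    traj = [(S, V, I, R)]
    for uk in u:
        # Right-hand sides of (47), all evaluated at the current step k.
        S_next = Pi - (mu - 1.0) * S - beta * S * I - alpha * uk * S
        V_next = alpha * uk * S - (mu - 1.0) * V - beta * eps * V * I
        I_next = beta * (S + eps * V) * I - (mu - 1.0) * I - gamma * I
        R_next = gamma * I - (mu - 1.0) * R
        S, V, I, R = S_next, V_next, I_next, R_next
        traj.append((S, V, I, R))
    return traj


def cost(u, traj, A, B):
    """Evaluate the cost functional (48) along a trajectory."""
    tf = len(u)
    running = sum(A[k] * traj[k][2] + 0.5 * B[k] * u[k] ** 2 for k in range(tf))
    return A[tf] * traj[tf][2] + running


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the text.
    u = [0.5] * 20
    traj = simulate(u, (0.9, 0.0, 0.1, 0.0), Pi=0.01, mu=0.01,
                    beta=0.5, alpha=0.5, eps=0.2, gamma=0.2)
    print(cost(u, traj, A=[1.0] * 21, B=[0.1] * 20))
```

Comparing `cost` for a few hand-picked sequences in \(\Omega _5\) gives a quick sanity check before any optimality conditions are used.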

Applying Pontryagin’s Maximum Principle in discrete time, we have the following result.

Theorem 4.1

There exists an optimal control \(u^*\) and the corresponding solution, \((S_k^*, V_k^*, I_k^*, R_k^*)\), that minimizes \(J(u)\) over \(\Omega _5\). Furthermore, there exist adjoint functions, \(\lambda _{1,k},\ldots ,\lambda _{4,k}\), such that

$$\begin{aligned} \lambda _{1,k}= & {} \lambda _{1,k+1}\left[ -(\mu -1)-\beta I_k^* - \alpha u_k^*\right] +\lambda _{2,k+1}\alpha u_k^*+\lambda _{3,k+1}\beta I_k^*,\nonumber \\ \lambda _{2,k}= & {} \lambda _{2,k+1}\left[ -(\mu -1)-\beta \epsilon I_k^*\right] +\lambda _{3,k+1}\beta \epsilon I_k^*,\nonumber \\ \lambda _{3,k}= & {} A_k-\lambda _{1,k+1}\beta S_k^*-\lambda _{2,k+1}\beta \epsilon V_k^*+\lambda _{3,k+1}\left[ \beta \left( S_k^*+\epsilon V_k^*\right) \right. \nonumber \\&\left. -(\mu -1)-\gamma \right] +\lambda _{4,k+1}\gamma ,\nonumber \\ \lambda _{4,k}= & {} -\lambda _{4,k+1}(\mu -1), \end{aligned}$$
(49)

with transversality conditions

$$\begin{aligned} \lambda _{i,t_f}= & {} 0,\,\,\,\,i=1,2,4,\nonumber \\ \lambda _{3,t_f}= & {} A_{t_f} \end{aligned}$$
(50)

and the control \(u^*\) satisfies the optimality condition

$$\begin{aligned} u_k^*&=\min \bigg (\max \bigg (0,\frac{\alpha S_k^*(\lambda _{1,k+1}-\lambda _{2,k+1})}{B_k}\bigg ),1\bigg ) \end{aligned}$$
(51)

Proof

The Hamiltonian \(H_k\) at time step \(k\) is given by

$$\begin{aligned} H_k=A_kI_k + \frac{B_k}{2}u_k^2+\displaystyle \sum _{i=1}^4\lambda _{i,k+1}f_i, \end{aligned}$$
(52)

where \(f_i\), \(i=1,\ldots ,4\), are the right-hand sides of the system (47). For \(k=0,\ldots ,t_f-1\), the adjoint equations and the transversality conditions are obtained from Pontryagin’s Maximum Principle in discrete time, such that

$$\begin{aligned} \lambda _{1,k}= & {} \frac{\partial H_k}{\partial S_k},\,\,\,\lambda _{1,t_f}=0,\nonumber \\ \lambda _{2,k}= & {} \frac{\partial H_k}{\partial V_k},\,\,\,\lambda _{2,t_f}=0,\nonumber \\ \lambda _{3,k}= & {} \frac{\partial H_k}{\partial I_k},\,\,\,\lambda _{3,t_f}=A_{t_f},\nonumber \\ \lambda _{4,k}= & {} \frac{\partial H_k}{\partial R_k},\,\,\,\lambda _{4,t_f}=0, \end{aligned}$$
(53)

and the optimality condition (51) is obtained by solving for \(u_k^*\) in the interior of \(\Omega _5\) when

$$\begin{aligned} \frac{\partial H_k}{\partial u_k}=0. \end{aligned}$$
(54)
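Written out, only the \(S\) and \(V\) equations of (47) depend on \(u_k\), so substituting them into (52) and differentiating gives

$$\begin{aligned} \frac{\partial H_k}{\partial u_k}=B_ku_k-\lambda _{1,k+1}\alpha S_k+\lambda _{2,k+1}\alpha S_k=0 \quad \Longrightarrow \quad u_k=\frac{\alpha S_k(\lambda _{1,k+1}-\lambda _{2,k+1})}{B_k}. \end{aligned}$$

Restricting this value to the bounds \(0\le u_k\le 1\) yields (51).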

\(\square \)
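Theorem 4.1 characterizes the optimal control but does not by itself say how to compute it, because the states run forward from \(k=0\) while the adjoints run backward from \(k=t_f\). One common numerical approach, which is not described in this survey and is used here only as an illustration, is a forward-backward sweep: guess a control, iterate (47) forward, iterate (49) backward from (50), update the control with (51), and repeat. The sketch below assumes constant parameters stored in a dictionary `p`; the function names, the relaxation factor `relax`, the stopping tolerance, and all parameter values are hypothetical choices, and convergence of the iteration is not guaranteed in general.

```python
def forward(u, x0, p):
    """Iterate the state system (47) from x0 under the control sequence u."""
    S, V, I, R = x0
    traj = [(S, V, I, R)]
    keep = 1.0 - p["mu"]  # equals -(mu - 1) in (47)
    for uk in u:
        force = p["beta"] * I
        # The right-hand side is evaluated with step-k values before assignment.
        S, V, I, R = (
            p["Pi"] + keep * S - force * S - p["alpha"] * uk * S,
            p["alpha"] * uk * S + keep * V - force * p["eps"] * V,
            force * (S + p["eps"] * V) + keep * I - p["gamma"] * I,
            p["gamma"] * I + keep * R,
        )
        traj.append((S, V, I, R))
    return traj


def backward(u, traj, A, p):
    """Iterate the adjoint system (49) backward from the terminal values (50)."""
    tf = len(u)
    keep = 1.0 - p["mu"]
    beta, eps, alpha, gamma = p["beta"], p["eps"], p["alpha"], p["gamma"]
    lam = [None] * (tf + 1)
    lam[tf] = (0.0, 0.0, A[tf], 0.0)
    for k in range(tf - 1, -1, -1):
        S, V, I, _ = traj[k]
        l1, l2, l3, l4 = lam[k + 1]
        lam[k] = (
            l1 * (keep - beta * I - alpha * u[k]) + l2 * alpha * u[k] + l3 * beta * I,
            l2 * (keep - beta * eps * I) + l3 * beta * eps * I,
            A[k] - l1 * beta * S - l2 * beta * eps * V
            + l3 * (beta * (S + eps * V) + keep - gamma) + l4 * gamma,
            l4 * keep,
        )
    return lam


def sweep(x0, A, B, p, iters=200, relax=0.5, tol=1e-8):
    """Forward-backward sweep: alternate (47), (49)-(50) and the update (51)."""
    tf = len(B)
    u = [0.0] * tf  # initial guess: no vaccination
    for _ in range(iters):
        traj = forward(u, x0, p)
        lam = backward(u, traj, A, p)
        u_new = []
        for k in range(tf):
            raw = p["alpha"] * traj[k][0] * (lam[k + 1][0] - lam[k + 1][1]) / B[k]
            u_new.append(min(max(0.0, raw), 1.0))  # projection in (51)
        # Relaxed update (an assumption made here to damp oscillations).
        u_next = [relax * a + (1.0 - relax) * b for a, b in zip(u_new, u)]
        change = max(abs(a - b) for a, b in zip(u_next, u))
        u = u_next
        if change < tol:
            break
    return u, forward(u, x0, p)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the text.
    p = {"Pi": 0.01, "mu": 0.01, "beta": 0.5, "alpha": 0.5, "eps": 0.2, "gamma": 0.2}
    u, traj = sweep((0.9, 0.0, 0.1, 0.0), A=[1.0] * 11, B=[0.1] * 10, p=p)
    print([round(uk, 4) for uk in u])
```

Because \(\lambda _{1,t_f}=\lambda _{2,t_f}=0\) in (50), the update (51) always returns \(u_{t_f-1}^*=0\): vaccinating at the last step cannot lower the cost (48), which is a useful check on any implementation.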

5 Conclusions

In this survey, we have shown how multi-objective optimal control has been implemented in epidemiological models. Mathematical models have been useful in comparing, planning, implementing and evaluating various intervention strategies for the prevention and control of infectious diseases. Although the original goal of optimal control is to enforce the economic constraints imposed by limited resources when analyzing control strategies, it has also been helpful in devising control strategies aimed at curbing the emergence of drug- or vaccine-resistant viruses and bacteria. This is achieved by limiting the amount of drugs administered to an infected individual or the amount of vaccine administered to a susceptible individual, while reducing the cost of implementation at the same time. With optimal control theory, the goal of eradicating infectious diseases in a community with limited resources is a step closer to being achieved.