1 Introduction

Mathematical models are among the main tools for describing infectious disease dynamics, since they help capture the process, laws, and trends of disease spread. To date, mathematical research on the dynamic behavior of infectious diseases has made significant progress in both theory and applications (e.g., Refs. [1,2,3,4,5]).

A major problem in modeling is determining the incidence rate of a disease, defined as the average number of new cases within a specified period of time, which reflects the transmission status of the disease. The bilinear incidence rate \(\beta SI\) is the most common type, but it is regarded as an extreme form that reflects disease transmission poorly [6, 7]. In 1978, Capasso et al. [8] proposed the saturated incidence rate in studying the cholera epidemic, which may be more suitable in many cases because it can better capture the psychological effects of disease transmission: the population may tend to reduce interpersonal contacts when there is an enormous number of infected individuals. In addition, timely and adequate treatment is essential in controlling the spread of disease. In dynamic modeling of infectious diseases, the treatment rate of infected individuals has received attention from numerous researchers. In most published models, there are typically two types of treatment rate: one constant and the other proportional to the number of infected individuals. However, the capacity of medical resources in a community is limited, so these assumptions apply only when the number of infected individuals is small. As a result, Zhang et al. [9] proposed the following treatment rate function to characterize the saturation of limited medical resources:

$$\begin{aligned} T\left( I \right) = \frac{{rI}}{{1 + kI}},I \ge 0,r > 0,k \ge 0. \end{aligned}$$

It can be seen that if the number of infected individuals I is small, then \(T(I) \approx rI\); if I is large and \(k>0\), then \(T(I) \approx r/k\). Thus, the saturated treatment rate function is closer to the actual situation. It has been applied extensively to various epidemic models by many researchers (e.g., Refs. [10,11,12,13,14]). In addition, many authors have further examined different types of treatment rate functions in [7].
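The two limiting regimes of \(T(I)\) can be illustrated numerically. The following is a minimal sketch only: the function name `saturated_treatment` and the values \(r=0.2\), \(k=0.5\) are illustrative assumptions, not values taken from the model.

```python
def saturated_treatment(I, r, k):
    """Saturated treatment rate T(I) = r*I / (1 + k*I), for I >= 0, r > 0, k >= 0."""
    return r * I / (1.0 + k * I)


r, k = 0.2, 0.5  # illustrative values only (assumed)

# Small I: T(I) is close to the linear rate r*I.
small = 1e-4
print(saturated_treatment(small, r, k), r * small)

# Large I: T(I) approaches the saturation level r/k.
large = 1e6
print(saturated_treatment(large, r, k), r / k)
```

With \(k=0\) the function reduces to the proportional treatment rate \(rI\), so the same helper covers the unsaturated case.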

Another significant factor that cannot be ignored in modeling disease transmission is the time delay, which may affect the eventual steady state and may give rise to limit-cycle oscillations and chaos [15]. It is well known that, for most epidemics, there is a period between infection and the first appearance of symptoms, termed the incubation period. Introducing a time delay into the epidemic model therefore brings it closer to real-world behavior.

As an extension of ordinary calculus, fractional calculus has significant advantages over integer-order calculus in describing real phenomena. Fractional calculus has distinctive properties such as memory and heredity [16, 17], which are present in most biological systems. In fact, other investigators have demonstrated that the cell membranes of biological organisms have fractional-order electrical conductance, and thus biological systems are classified as non-integer-order models [18]. In addition, the stability domain of a fractional system is larger than that of the corresponding integer-order system, which helps reduce the error caused by improper parameters [19]. Notably, classical integer-order calculus can be regarded as a special case of fractional calculus. In other words, the response of the fractional system must converge to the response of the corresponding integer-order model [20], which can be used to verify the accuracy of simulation results. Thus, fractional differential equations can describe epidemic models better than integer-order differential equations [21]. At present, the use of fractional calculus to construct epidemic models is receiving increasing attention from researchers (e.g., Refs. [21,22,23,24]).

To the best of our knowledge, susceptible–infective–recovered (SIR) epidemic models with a saturated incidence rate and delay have been studied extensively (e.g., Refs. [25,26,27]). However, few works have considered the effects of the fractional order and the saturated treatment rate function on the dynamic behavior of such models. Based on the discussion above, the following fractional delayed SIR epidemic model is proposed:

$$\begin{aligned} \left\{ \begin{array}{l} {}_0^{C}{\mathscr {D}}_t^q S(t)=\varLambda -\mu S(t)-\frac{\beta S(t) I(t)}{1+\alpha I(t)}, \\ {}_0^{C}{\mathscr {D}}_t^q I(t)=e^{-\mu \tau } \frac{\beta S(t-\tau ) I(t-\tau )}{1+\alpha I(t-\tau )}-(\mu +\epsilon +\gamma ) I(t)-\frac{r I(t)}{1+k I(t)},\\ {}_0^{C}{\mathscr {D}}_t^q R(t)=\gamma I(t)-\mu R(t)+\frac{r I(t)}{1+k I(t)}, \end{array}\right. \end{aligned}$$
(1)

with initial conditions

$$\begin{aligned} S\left( \varsigma \right) = {\vartheta _1}\left( \varsigma \right) \ge 0,I\left( \varsigma \right) = {\vartheta _2}\left( \varsigma \right) \ge 0,R\left( \varsigma \right) = {\vartheta _3}\left( \varsigma \right) \ge 0,\varsigma \in \left[ { - \tau ,0} \right] , \end{aligned}$$
(2)

where \(\vartheta =(\vartheta _1,\vartheta _2,\vartheta _3) \in {\mathcal {C}}([-\tau ,0],{\mathbb {R}}^3_+)\), and \({\mathcal {C}}([-\tau ,0],{\mathbb {R}}^3_+)\) is the space of continuous functions mapping \([-\tau ,0]\) into \({\mathbb {R}}^3_+\). Here, \({}_0^{C}{\mathscr {D}}_t^q\), \(q\in (0,1]\), denotes the Caputo fractional derivative given by [28]

$$\begin{aligned} {}_{0}^C{\mathscr {D}}_t^q f(t) = {}_{0}I_t^{1-q}\frac{{\hbox {d}}f(t)}{\hbox {d}t}=\frac{1}{{\varGamma (1 - q)}}\int _{0}^t {\frac{{f'\left( z \right) }}{{{{\left( {t - z } \right) }^q }}}} {\hbox {d}}z, \end{aligned}$$
(3)

in which \({}_{0}I_t^{q}f(t)=\frac{1}{{\varGamma ( q)}}\int _{0}^t \left( {t - z } \right) ^{q-1} f(z){\hbox {d}}z\) represents the fractional integral of order q, and \(\varGamma (z)=\int _{0}^\infty t^{z-1}e^{-t}{\hbox {d}}t\) is the Gamma function. The system parameters are explained as follows:

  1. (1)

    \(\varLambda \) denotes the recruitment rate of susceptible individuals, \(\mu \) denotes the natural mortality rate, \(\epsilon \) denotes the disease-induced death rate, \(\gamma \) denotes the natural recovery rate of infectious individuals, \(\tau \) denotes the incubation period, and \(e^{-\mu \tau }\) indicates the probability that an individual survives the incubation period \([t-\tau ,t]\).

  2. (2)

    The incidence term \(\beta S(t-\tau ) I(t-\tau )/\left( 1+\alpha I(t-\tau )\right) \) describes susceptible individuals who leave the susceptible class at time \(t-\tau \) and enter the infectious class at the present time t, where \(\beta \) is the disease contact rate and \(\alpha \) is the saturation constant that measures the inhibitory effect.

  3. (3)

    The saturated treatment function is \(r I/(1+k I)\), where \(r>0\) and \(k\ge 0\). For \(k>0\), r/k denotes the maximal medical resource supply, and \(1/(1+kI)\) describes the reverse effect of treatment delay for infected individuals.
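As a numerical aside on the Caputo derivative in Eq. (3), the following minimal sketch uses the standard L1 discretization, which replaces \(f'\) by a difference quotient on each grid cell and integrates the kernel \((t-z)^{-q}\) exactly. It is restricted to \(0<q<1\); the function name `caputo_l1`, the grid, and the test function \(f(t)=t\) are assumptions made here for illustration.

```python
import math


def caputo_l1(f_vals, h, q):
    """Approximate the Caputo derivative of order q in (0, 1) at the last grid point.

    f_vals holds samples f(0), f(h), ..., f(n*h); the L1 scheme replaces f'
    by its piecewise-constant difference quotient on each subinterval.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("this sketch assumes 0 < q < 1")
    n = len(f_vals) - 1
    total = 0.0
    for j in range(n):
        # Weight of the subinterval [t_{n-j-1}, t_{n-j}].
        weight = (j + 1) ** (1.0 - q) - j ** (1.0 - q)
        total += weight * (f_vals[n - j] - f_vals[n - j - 1])
    return total * h ** (-q) / math.gamma(2.0 - q)


# Compare with the closed form for f(t) = t: D^q t = t^(1-q) / Gamma(2 - q).
q, h, n = 0.8, 0.01, 200
samples = [i * h for i in range(n + 1)]
t_end = n * h
approx = caputo_l1(samples, h, q)
exact = t_end ** (1.0 - q) / math.gamma(2.0 - q)
print(approx, exact)
```

Because the L1 scheme is exact for piecewise-linear functions, the two printed values agree up to rounding for this particular test function; for general smooth \(f\) the approximation carries a discretization error.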

The main objectives of this paper are to analyze the dynamic behavior of the model and to give the optimal control strategy at the least cost. The rest of the paper is arranged as follows. In Sect. 2, we study some properties of the solutions, the existence conditions of the equilibria, and the stability of the equilibria. In Sect. 3, the fractional-order delayed optimal control problem is proposed and investigated. Several numerical simulations are given in Sect. 4. The major conclusions are summarized in the final section.

2 Analysis of the model

2.1 Positivity and boundedness

Since the model describes the evolution of a population, negative solutions are meaningless. Thus, we first show that the state variables are nonnegative and bounded.

Theorem 1

All solutions of the system (1) satisfying the initial condition (2) are nonnegative and uniformly bounded for all \(t\ge 0\).

Proof

(Positivity) First, we employ the generalized mean value theorem [29] to prove that solutions of system (1) starting in \({\mathbb {R}}_+^3\) remain nonnegative. We proceed by contradiction: assume that there exists a \(t_1\) such that

$$\begin{aligned} t_1=\min \left\{ t>0:S(t)I(t)R(t)=0\right\} . \end{aligned}$$
  1. (i)

    Assume that \(S(t_1)=0\) and \(S(t)<0\) for \(t>t_1\), then it follows that \(I(t), R(t)\ge 0\). In terms of the first equation of system (1), we have

    $$\begin{aligned} {}_{t_1}^{C}{\mathscr {D}}_t^q S(t)|_{t>t_1}=\varLambda -\mu S(t)-\frac{\beta S(t) I(t)}{1+\alpha I(t)}>0. \end{aligned}$$

    Applying the generalized mean value theorem, for \(\xi \in [t_1,t]\), we have

    $$\begin{aligned} S(t)= \underbrace{S(t_1)}_{=0}+\underbrace{\frac{1}{\varGamma (q)}{_{t_1}^{C}{\mathscr {D}}_t^q S(\xi )(t-\xi )^q}}_{>0}>0, \end{aligned}$$

    which contradicts the assumption with \(S(t)<0\) for \(t>t_1\).

  2. (ii)

    Assume that \(I(t_1)=0\) and \(I(t)<0\) for \(t>t_1\), then it follows that \(S(t), R(t)\ge 0\).

    • If \({}_{t_1}^{C}{\mathscr {D}}_t^qI(t)>0\) for \(t>t_1\), the proof method is the same as in (i).

    • If \({}_{t_1}^{C}{\mathscr {D}}_t^q I(t)\le 0\) and \(I(t-\tau )\ge 0\), utilizing the second equality of system (1), we have

      $$\begin{aligned} {}_{t_1}^{C}{\mathscr {D}}_t^q I(t)=\underbrace{e^{-\mu \tau } \frac{\beta S(t-\tau ) I(t-\tau )}{1+\alpha I(t-\tau )}}_{>0}\underbrace{-(\mu +\epsilon +\gamma ) I(t)}_{>0}\underbrace{-\frac{r I(t)}{1+k I(t)}}_{>0}>0, \end{aligned}$$

      where \(1+kI(t)>0\) for any \(t>t_1\). It is contradictory to the assumption \({}_{t_1}^{C}{\mathscr {D}}_t^qI(t)\le 0\).

    • If \({}_{t_1}^{C}{\mathscr {D}}_t^qI(t)\le 0\) and \(I(t-\tau )<0\), according to the literature [29], then I(t) is a non-increasing function for \(t>t_1\) such that \(I(t)<I(t-\tau )<0\). At this time point, if \(1+\alpha I(t-\tau )<0\) then \({}_{t_1}^{C}{\mathscr {D}}_t^qI(t)> 0\), which contradicts the assumption. Moreover, if \(1+\alpha I(t-\tau )>0\), it follows that

      $$\begin{aligned} {}_{t_1}^{C}{\mathscr {D}}_t^q I(t)&=\underbrace{e^{-\mu \tau } \frac{\beta S(t-\tau ) I(t-\tau )}{1+\alpha I(t-\tau )}}_{<0}\underbrace{-(\mu +\epsilon +\gamma ) I(t)}_{>0}\underbrace{-\frac{r I(t)}{1+k I(t)}}_{>0}\\&> e^{-\mu \tau } \frac{\beta S(t-\tau ) }{1+\alpha I(t-\tau )} I(t-\tau )\\&>M I(t), \end{aligned}$$

      where \(M=\max \{e^{-\mu \tau }\beta S(t-\tau )/(1+\alpha I(t-\tau ))\}>0\), hence there exists a nonnegative function g(t) satisfying

      $$\begin{aligned} {}_{t_1}^{C}{\mathscr {D}}_t^q I(t)=MI(t)+g(t). \end{aligned}$$

      Then, we may further derive that

      $$\begin{aligned} I(t)= I(t_1)E_q\left[ M(t-t_1)^q\right] +{\int _{{t_1}}^t {(t - s)} ^{q - 1}}{E_q}\left[ {M{{\left( {t - s} \right) }^q}} \right] g\left( s \right) \text {d}s\ge 0, \end{aligned}$$

      where \({E_q}\left( z \right) = \sum \limits _{k = 0}^\infty {\frac{{{z^k}}}{{\varGamma \left( {qk + 1} \right) }}}\) is the Mittag-Leffler function, which contradicts the assumption with \(I(t) < 0\) for \(t>t_1\).

  3. (iii)

    Using the same proof method as in (i), it follows that \(R(t)\ge 0\) for all \(t\ge 0\).

    (Boundedness) Let \(F\left( t \right) = {e^{ - \mu \tau }}S\left( t \right) + I\left( {t + \tau } \right) + R\left( {t + \tau } \right) \), then

    $$\begin{aligned} {}_{0}^{C}{\mathscr {D}}_t^q F\left( t \right) = {e^{ - \mu \tau }}\varLambda - \mu F\left( t \right) - \epsilon I\left( {t + \tau } \right) \le {e^{ - \mu \tau }}\varLambda - \mu F\left( t \right) \le \varLambda - \mu F\left( t \right) . \end{aligned}$$

    Applying the Laplace transform to this inequality yields

    $$\begin{aligned}F\left( t \right) \le \left( { - \frac{\varLambda }{\mu } + F\left( 0 \right) } \right) {E_q}\left( { - \mu {t^q}} \right) + \frac{\varLambda }{\mu },\end{aligned}$$

    so that \(\mathop {\lim }\limits _{t \rightarrow \infty } \sup F\left( t \right) \le \varLambda /\mu \). Moreover, the first equation of system (1) gives \({}_{0}^{C}{\mathscr {D}}_t^q S(t)\le \varLambda -\mu S(t)\), and hence \(\mathop {\lim }\limits _{t \rightarrow \infty } \sup S\left( t \right) \le \varLambda /\mu \). The biologically feasible region of system (1) is therefore

    $$\begin{aligned} \varXi =\left\{ {\left( {S,I,R} \right) \in {\mathbb {R}}_ + ^3:0<S(t)\le \frac{\varLambda }{\mu },0 < {e^{ - \mu \tau }}S + I + R \le \frac{\varLambda }{\mu }} \right\} . \end{aligned}$$

The proof is completed. \(\square \)
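The conclusion of Theorem 1 can be explored numerically. The sketch below integrates system (1) with an explicit fractional Euler (product rectangle) rule, which is one common low-order scheme and not necessarily the method used for the simulations in Sect. 4. The function name `simulate_sir`, the constant history on \([-\tau ,0]\), and all parameter values are illustrative assumptions.

```python
import math
import numpy as np


def simulate_sir(q, tau, h, n_steps, p, history):
    """Explicit fractional Euler (rectangle-rule) sketch for system (1).

    p is a dict of model parameters; history = (S, I, R) is held constant on [-tau, 0].
    """
    m = int(round(tau / h))                    # delay expressed in grid steps
    y = np.zeros((n_steps + 1, 3))
    y[0] = history
    rhs = np.zeros((n_steps, 3))               # stored right-hand sides f(t_j)
    rho = p["mu"] + p["eps"] + p["gamma"]
    coef = h ** q / math.gamma(q + 1.0)
    for n in range(n_steps):
        S, I, R = y[n]
        Sd, Id, _ = y[n - m] if n >= m else history   # delayed state
        inc = p["beta"] * S * I / (1.0 + p["alpha"] * I)
        inc_d = math.exp(-p["mu"] * tau) * p["beta"] * Sd * Id / (1.0 + p["alpha"] * Id)
        treat = p["r"] * I / (1.0 + p["k"] * I)
        rhs[n] = (p["Lambda"] - p["mu"] * S - inc,
                  inc_d - rho * I - treat,
                  p["gamma"] * I - p["mu"] * R + treat)
        j = np.arange(n + 1)
        w = (n + 1 - j) ** q - (n - j) ** q    # rectangle-rule memory weights
        y[n + 1] = y[0] + coef * (w[:, None] * rhs[: n + 1]).sum(axis=0)
    return y


params = dict(Lambda=1.0, mu=0.1, beta=0.01, alpha=0.1,
              eps=0.05, gamma=0.1, r=0.2, k=0.5)   # illustrative values (assumed)
traj = simulate_sir(q=0.9, tau=1.0, h=0.05, n_steps=1000,
                    p=params, history=(5.0, 1.0, 0.0))
print("min over all states:", traj.min())
print("max of S:", traj[:, 0].max(), "bound Lambda/mu:", params["Lambda"] / params["mu"])
```

Every step sums over the full history, so the cost grows quadratically with the number of steps; this reflects the memory property of the Caputo derivative. For \(q=1\) the weights are all equal to one and the scheme reduces to the classical forward Euler method.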

2.2 Equilibria and local dynamics

Note that the variable R does not appear in the first two equations of system (1). In fact, the solutions of the first two equations can be viewed as the input to the equation for R. This allows us to consider the following subsystem

$$\begin{aligned} \left\{ \begin{array}{l} {}_0^{C}{\mathscr {D}}_t^q S(t)=\varLambda -\mu S(t)-\frac{\beta S(t) I(t)}{1+\alpha I(t)}, \\ {}_0^{C}{\mathscr {D}}_t^q I(t)=e^{-\mu \tau } \frac{\beta S(t-\tau ) I(t-\tau )}{1+\alpha I(t-\tau )}-\rho I(t)-\frac{r I(t)}{1+k I(t)}, \end{array}\right. \end{aligned}$$
(4)

where \(\rho =\mu +\epsilon +\gamma \).

Then, the equilibria are solved by setting the right-hand side of system (4) to zero. Clearly, system (4) always has a disease-free equilibrium \(E_0=(S_0,I_0)=(\varLambda /\mu ,0)\). Utilizing the next-generation matrix approach [30], the basic reproductive number is given by

$$\begin{aligned} {R_0} = \frac{{\beta \varLambda {e^{ - \mu \tau }}}}{{\mu \left( {\rho + r} \right) }}, \end{aligned}$$
(5)

which represents the expected number of secondary cases generated by a single infectious case in an otherwise susceptible population.

In addition, the system (4) has at most two endemic equilibria after calculation, denoted by \(E_1=(S_1,I_1)\) and \(E_2=(S_2,I_2)\), where \(S_{1,2}=\varLambda (1+\alpha I_{1,2})/\left( \mu +(\alpha \mu +\beta )I_{1,2}\right) \), \(I_2>I_1\), and \(I_1\) and \(I_2\) are the solutions of the equation

$$\begin{aligned} a{I^2} + bI + c = 0, \end{aligned}$$
(6)

with

$$\begin{aligned} a&= k\rho \left( {\alpha \mu + \beta } \right) ,\\ b&= k\mu (\rho +r)(1-R_0)+(\alpha \mu +\beta )(\rho +r)-\mu kr,\\ c&= \mu \left( {\rho + r} \right) - \beta \varLambda {e^{ - \mu \tau }} = \mu \left( {\rho + r} \right) \left( {1 - {R_0}} \right) . \end{aligned}$$

Let \(\varDelta \) denote the discriminant of Eq. (6). Then

$$\begin{aligned} \varDelta>0\Longleftrightarrow R_0>R_c,~~ \varDelta =0\Longleftrightarrow R_0=R_c,~~ \varDelta<0\Longleftrightarrow R_0<R_c, \end{aligned}$$

where \(R_c= 4a\beta \varLambda e^{ - \mu \tau }/(b^2 + 4a\beta \varLambda e^{ - \mu \tau })\), thus, \(0\le R_c\le 1\) is clear.

After further analysis, the following results are obtained.

Theorem 2

The possible endemic equilibria of system (4) are as follows.

  1. (i)

    For \(k=0\), Eq. (6) reduces to a linear equation, with a unique positive solution \(I=-c/b\) if and only if \(R_0>1\); in other words, system (4) has a unique endemic equilibrium if \(R_0>1\).

  2. (ii)

    For \(k>0\), \(b\ge 0\), system (4) has a unique endemic equilibrium \(E_2\) if \(R_0>1\).

  3. (iii)

    For \(k>0\), \(b<0\), system (4) has two endemic equilibria \(E_1\) and \(E_2\) if \(R_c\le R_0<1\), and a unique endemic equilibrium \(E_2\) if \(R_0\ge 1\).
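The case analysis of Theorem 2 can be reproduced numerically by solving Eq. (6) and keeping its positive roots. The following is a minimal sketch; the function name `endemic_equilibria` and the parameter values in the usage lines are illustrative assumptions.

```python
import math


def endemic_equilibria(Lambda, mu, beta, alpha, rho, r, k, tau):
    """Return (R0, list of endemic equilibria (S, I)) of subsystem (4) from Eq. (6)."""
    bLe = beta * Lambda * math.exp(-mu * tau)
    R0 = bLe / (mu * (rho + r))
    a = k * rho * (alpha * mu + beta)
    b = k * mu * (rho + r) * (1.0 - R0) + (alpha * mu + beta) * (rho + r) - mu * k * r
    c = mu * (rho + r) - bLe
    if a == 0.0:                      # k = 0: Eq. (6) is linear
        roots = [-c / b]
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            roots = []
        else:
            sq = math.sqrt(disc)
            # A set removes the duplicate when the discriminant vanishes.
            roots = sorted({(-b - sq) / (2.0 * a), (-b + sq) / (2.0 * a)})
    equilibria = []
    for I in roots:
        if I > 0.0:                   # keep only biologically meaningful roots
            S = Lambda * (1.0 + alpha * I) / (mu + (alpha * mu + beta) * I)
            equilibria.append((S, I))
    return R0, equilibria


# Illustrative values (assumed): R0 < 1 with large k gives two endemic equilibria.
R0, eqs = endemic_equilibria(1.0, 0.1, 0.04275, 0.1, 0.25, 0.2, 5.0, 0.0)
print(R0, eqs)
# Illustrative values (assumed): R0 > 1 gives a unique endemic equilibrium.
R0b, eqs_b = endemic_equilibria(1.0, 0.1, 0.1, 0.1, 0.25, 0.2, 5.0, 1.0)
print(R0b, eqs_b)
```

Substituting a returned pair \((S,I)\) back into the right-hand side of system (4), with the delayed and undelayed states set equal, gives a quick consistency check on the coefficients a, b, and c.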

Clearly, if the conditions stated in (iii) of Theorem 2 hold, there is a backward bifurcation at \(R_0=1\), which leads to the existence of two endemic equilibria. Thus, we present the following corollary.

Corollary 1

System (4) exhibits a backward bifurcation at \(R_0=1\) if \(k>0\) and \(b<0\).

Proof

This corollary is an immediate consequence of statement (iii) in Theorem 2. \(\square \)

It follows from Theorem 2 that there is no backward bifurcation when \(k=0\). As k increases and meets the conditions stated in Corollary 1, the system exhibits a backward bifurcation. This means that delayed treatment of infected individuals is a source of the backward bifurcation. Thus, we may further derive an explicit condition for the existence of a backward bifurcation with k as the bifurcation parameter.

Corollary 2

When \(k>k_0\), then the system (4) exhibits backward bifurcation at \(R_0=1\).

Proof

It is clear that \(R_0=1\Rightarrow b = (\alpha \mu +\beta )(\rho +r)-\mu kr<0 \Rightarrow k > \frac{{\left( {\alpha \mu + \beta } \right) \left( {\rho + r} \right) }}{{r\mu }} \buildrel \varDelta \over = {k_0}.\) \(\square \)
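The threshold \(k_0\) can be evaluated directly. The sketch below computes \(k_0\) and the sign of b at \(R_0=1\) for values of k on either side of it; the helper names `k_zero` and `b_at_R0_one` and the parameter values are illustrative assumptions, with \(\beta \) chosen so that \(R_0=1\).

```python
import math


def k_zero(mu, beta, alpha, rho, r):
    """Critical saturation constant k0 from Corollary 2."""
    return (alpha * mu + beta) * (rho + r) / (r * mu)


def b_at_R0_one(k, mu, beta, alpha, rho, r):
    """Coefficient b of Eq. (6) evaluated at R0 = 1."""
    return (alpha * mu + beta) * (rho + r) - mu * k * r


# Illustrative values (assumed); beta is chosen so that R0 = 1.
Lambda, mu, alpha, rho, r, tau = 1.0, 0.1, 0.1, 0.25, 0.2, 1.0
beta = mu * (rho + r) * math.exp(mu * tau) / Lambda
k0 = k_zero(mu, beta, alpha, rho, r)
for k in (0.5 * k0, k0, 2.0 * k0):
    print(k, b_at_R0_one(k, mu, beta, alpha, rho, r))
```

The coefficient b changes sign exactly at \(k=k_0\), which is the condition used in the proof of Corollary 2.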

Remark 1

Generally, the disease can be effectively controlled if the basic reproduction number \(R_0\) is less than unity. However, the existence of the backward bifurcation of the system (4) makes \(R_0\) no longer the threshold for disease control. In other words, the backward bifurcation property of the system (4) makes disease control difficult.

In the following, the local stability of all the possible equilibria is investigated. According to [31], the local stability of the system depends on the roots of the characteristic polynomial. The characteristic matrix associated with system (4) is given by

$$\begin{aligned} \varDelta \left( s \right) = \left[ {\begin{array}{*{20}{c}} {{s^q} + \mu + \frac{{\beta {I}}}{{1 + \alpha {I}}}}&{}{\frac{{\beta {S}}}{{{{\left( {1 + \alpha {I}} \right) }^2}}}}\\ { - {e^{ - \mu \tau }}{e^{ - s\tau }}\frac{{\beta {I}}}{{1 + \alpha {I}}}}&{}{{s^q} - \frac{{\beta {S}{e^{ - \mu \tau }}{e^{ - s\tau }}}}{{{{\left( {1 + \alpha {I}} \right) }^2}}} + \rho + \frac{r}{{{{\left( {1 + k{I}} \right) }^2}}}} \end{array}} \right] . \end{aligned}$$
(7)

Here, \(\det \varDelta \left( s \right) \) is referred to as the characteristic polynomial of \(\varDelta \left( s \right) \).

Theorem 3

The disease-free equilibrium \(E_0\) is locally asymptotically stable if \(R_0 < 1\) and unstable whenever \(R_0 > 1\).

Proof

The corresponding characteristic polynomial at \(E_0\) is simplified as

$$\begin{aligned} {\left( {{s^q} + \mu } \right) }[ {{s^q} + \rho + r - \frac{{\beta \varLambda }}{\mu }{e^{ - \mu \tau }}{e^{ - s\tau }}} ] = 0. \end{aligned}$$
(8)

\(\bullet \)  When \(\tau =0\), we can rewrite the above equation as

$$\begin{aligned} {\left( {{s^q} + \mu } \right) }[ {{s^q} - \left( {\rho + r} \right) \left( {{R_0} - 1} \right) } ] = 0. \end{aligned}$$

Clearly, the two values of \(s^q\) solving the above equation, namely \(-\mu \) and \((\rho +r)(R_0-1)\), are negative real numbers if \(R_0<1\), so that \(|\arg (s^q)|=\pi >q\pi /2\). Hence, \(E_0\) is locally asymptotically stable if \(R_0 < 1\).

\(\bullet \)  When \(\tau \ne 0\), it suffices to consider the equation

$$\begin{aligned} {{s^q} + \left( {\rho + r} \right) - \frac{{\beta \varLambda }}{\mu }{e^{ - \mu \tau }}{e^{ - s\tau }}}=0. \end{aligned}$$
(9)

Suppose the above equation has a purely imaginary root \(s= iw, w>0\), then w satisfies

$$\begin{aligned} {w^{2q}} + 2\left( {\rho + r} \right) \cos \left( {\frac{{q\pi }}{2}} \right) {w^q} + {\left( {\rho + r} \right) ^2}\left( {1 - R_0^2} \right) = 0. \end{aligned}$$
(10)

It is evident that Eq. (10) has no positive real root if \(R_0<1\); however, Eq. (10) has a positive root for \(R_0>1\). Hence Eq. (9) has no purely imaginary root for \(R_0<1\). According to Theorem 3.2 in the literature [31], the disease-free equilibrium \(E_0\) is locally asymptotically stable for any \(\tau \ge 0\) when \(R_0<1\), and it is unstable when \(R_0>1\). \(\square \)
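The root count used in this proof can be checked by treating Eq. (10) as a quadratic in \(x=w^q\). The following is a minimal sketch; the function name `positive_roots_eq10` and the parameter values are illustrative assumptions.

```python
import math


def positive_roots_eq10(q, rho, r, R0):
    """Positive roots w of Eq. (10), viewed as a quadratic in x = w**q."""
    p1 = 2.0 * (rho + r) * math.cos(q * math.pi / 2.0)
    p0 = (rho + r) ** 2 * (1.0 - R0 ** 2)
    disc = p1 * p1 - 4.0 * p0
    if disc < 0.0:
        return []
    xs = [(-p1 - math.sqrt(disc)) / 2.0, (-p1 + math.sqrt(disc)) / 2.0]
    # Only positive x correspond to a real frequency w > 0.
    return [x ** (1.0 / q) for x in xs if x > 0.0]


rho, r, q = 0.25, 0.2, 0.9  # illustrative values (assumed)
print(positive_roots_eq10(q, rho, r, R0=0.8))   # R0 < 1
print(positive_roots_eq10(q, rho, r, R0=1.5))   # R0 > 1
```

For \(R_0<1\) and \(q\in (0,1)\) both coefficients of the quadratic are positive, so no positive x exists; for \(R_0>1\) the constant term is negative and exactly one positive x exists.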

For convenience in the following discussion, define

$$\begin{aligned} \begin{aligned} f(I)&= {{{\left( {1 + k{I}} \right) }^2}\big ( ( \mu +\rho )\alpha I + \beta I + \mu \big ) + r\alpha I - rkI},\\ g(I)&=( 1 + kI )^2(\alpha \mu + \beta )\rho +r\alpha \mu -rk\mu +r\beta . \end{aligned} \end{aligned}$$
(11)

Then, the local stability of the unique endemic equilibrium \(E_2\) is described below.

Theorem 4

If \(R_0>1\) and \(0 \le k \le \min \left\{ {{k_1},{k_2}} \right\} \), the unique endemic equilibrium \(E_2\) is locally asymptotically stable for \(\tau \ge 0\).

Proof

\(\bullet \) When \(\tau =0\), the endemic equilibrium \(E_2\) satisfies

$$\begin{aligned} \frac{\beta S_2}{1+\alpha I_2}-\rho -\frac{r }{1+k I_2}=0, \end{aligned}$$

so that

$$\begin{aligned} \frac{{\beta {S_2}}}{{{{\left( {1 + \alpha {I_2}} \right) }^2}}} = \frac{\rho }{{\left( {1 + \alpha {I_2}} \right) }} + \frac{r}{{\left( {1 + k{I_2}} \right) \left( {1 + \alpha {I_2}} \right) }}. \end{aligned}$$

Substituting it into the characteristic matrix (7), the following characteristic equation can be derived

$$\begin{aligned} {{s^{2q}} + {a_1}{s^q} + {a_0}} = 0, \end{aligned}$$
(12)

where \(a_1 = \frac{1}{{\left( {1 + \alpha {I_2}} \right) {{\left( {1 + k{I_2}} \right) }^2}}}f(I_2), a_0 = \frac{{{I_2}}}{{\left( {1 + \alpha {I_2}} \right) {{\left( {1 + k{I_2}} \right) }^2}}}g(I_2).\) Then, \(a_1\) is positive if and only if

$$\begin{aligned} f(I_2)>0\Rightarrow {\left( {1 + k{I_2}} \right) ^2}[ {\left( {\mu + \rho } \right) \alpha {I_2} + \beta {I_2} + \mu } ] + r\alpha {I_2} > rk{I_2}. \end{aligned}$$

In fact,

$$\begin{aligned}{\left( {1 + k{I_2}} \right) ^2}[ {\left( {\mu + \rho } \right) \alpha {I_2} + \beta {I_2} + \mu } ] + r\alpha {I_2} > \left( {\mu + \rho } \right) \alpha {I_2} + \beta {I_2} + r\alpha {I_2}.\end{aligned}$$

Hence, the sufficient condition for \(a_1>0\) yields

$$\begin{aligned} \left( {\mu + \rho } \right) \alpha {I_2} + \beta {I_2} + r\alpha {I_2} > rk{I_2}, \end{aligned}$$

so that

$$\begin{aligned}k < \frac{{\left( {\mu + \rho +r } \right) \alpha + \beta }}{r} \buildrel \varDelta \over = {k_1}. \end{aligned}$$

Similarly, \(a_0>0\) holds if \(k<k_0\), where \(k_0\) is defined in Corollary 2.

\(\bullet \) When \(\tau \ne 0\), the characteristic equation yields

$$\begin{aligned} {{s^{2q}} + {b_1}{s^q} + {b_0} - \left( {{c_1}{s^q} + {c_0}} \right) {e^{ - s\tau }}} = 0, \end{aligned}$$
(13)

where

$$\begin{aligned} \begin{array}{l} {b_1} = {\mu + \rho } + \frac{{\beta {I_2}}}{{1 + \alpha {I_2}}} + \frac{r}{{{{\left( {1 + k{I_2}} \right) }^2}}},~~~~~~~~{c_1} = \frac{1}{{\left( {1 + \alpha {I_2}} \right) \left( {1 + k{I_2}} \right) }}\left[ {\rho \left( {1 + k{I_2}} \right) + r} \right] ,\\ {b_0} = \left( {\mu + \frac{{\beta {I_2}}}{{1 + \alpha {I_2}}}} \right) \left[ {\rho + \frac{r}{{{{\left( {1 + k{I_2}} \right) }^2}}}} \right] ,~~~~~~{c_0} = \frac{\mu }{{\left( {1 + \alpha {I_2}} \right) \left( {1 + k{I_2}} \right) }}\left[ {\rho \left( {1 + k{I_2}} \right) + r} \right] . \end{array} \end{aligned}$$

Setting \(s=iw=we^{\frac{\pi i}{2}}, w>0\) and substituting it into Eq. (13), it follows

$$\begin{aligned} \begin{aligned}&{w^{2q }}(\cos q\pi + i\sin q \pi ) + {b_1}{w^q }\left( \cos \frac{{q \pi }}{2} + i\sin \frac{{q \pi }}{2}\right) +{b_0}\\&\qquad \qquad - {c_1}\left[ {{w^q }(\cos \frac{{q \pi }}{2} + i\sin \frac{{q \pi }}{2}) + {\mu }} \right] (\cos w\tau - i\sin w\tau ) = 0. \end{aligned} \end{aligned}$$

Separating the real and imaginary parts, one has

$$\begin{aligned} \begin{aligned} {w^{2q }}\cos q \pi + {b_1}{w^q }\cos \frac{{q \pi }}{2} +b_0&= {c_1}{w^q }\sin \frac{{q \pi }}{2}\sin w\tau +{c_1} \left( {{w^q }\cos \frac{{q \pi }}{2} + {\mu } } \right) \cos w\tau , \\ {w^{2q }}\sin q \pi + {b_1}{w^q }\sin \frac{{q \pi }}{2}&= {c_1}{w^q }\sin \frac{{q \pi }}{2}\cos w\tau - {c_1} \left( {{w^q }\cos \frac{{q \pi }}{2} + {\mu } } \right) \sin w\tau . \end{aligned} \end{aligned}$$

Applying the fact \({\cos ^2}w\tau + {\sin ^2}w\tau = 1\), it is calculated that

$$\begin{aligned} {w^{4q}} + 2{b_1}\cos \frac{{q\pi }}{2}{w^{3q}} + \left( b_1^2 - c_1^2 + 2{b_0}\cos q\pi \right) {w^{2q}} + 2\cos \frac{{q\pi }}{2}\left( b_1b_0-c_1c_0 \right) {w^q} + b_0^2 - c_0^2= 0. \end{aligned}$$
(14)
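The step from the real and imaginary parts to Eq. (14) can be spelled out as follows. The shorthand \(x=w^q\) and \(\theta =q\pi /2\) is introduced here only for this sketch, and the relation \(c_0=\mu c_1\) from the definitions above is used. Squaring the two equations and adding them gives

$$\begin{aligned} \left( x^2\cos 2\theta + b_1x\cos \theta + b_0\right) ^2 + \left( x^2\sin 2\theta + b_1x\sin \theta \right) ^2&= c_1^2\left[ \left( x\cos \theta + \mu \right) ^2 + x^2\sin ^2\theta \right] ,\\ x^4 + 2b_1\cos \theta \,x^3 + \left( b_1^2 + 2b_0\cos 2\theta \right) x^2 + 2b_1b_0\cos \theta \,x + b_0^2&= c_1^2x^2 + 2c_1c_0\cos \theta \,x + c_0^2, \end{aligned}$$

where the cubic term on the left uses \(\cos 2\theta \cos \theta +\sin 2\theta \sin \theta =\cos \theta \). Moving the right-hand side to the left yields Eq. (14).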

After straightforward but lengthy calculations, if \(k<\min \{k_1,k_2\}\) holds, where \(k_2=\alpha \left( {\rho + r} \right) /r\), then the following inequalities hold

  1. (1)

    \(b_1^2 - c_1^2 + 2{b_0}\cos q\pi \ge b_1^2-c_1^2-2b_0={e^2} + \frac{{{I_2\left( {d + {c_1}} \right) }}}{{\left( {1 + \alpha {I_2}} \right) {{\left( {1 + k{I_2}} \right) }^2}}}\left[ {\alpha \rho {{\left( {1 + k{I_2}} \right) }^2} + r\left( {\alpha - k} \right) } \right] >0\),

  2. (2)

    \(b_1b_0-c_1c_0=a_1(a_0+c_0)+a_0c_1>0\),

  3. (3)

    \(b_0-c_0=a_0>0,\)

where \(e=\mu + \beta I_2/\left( 1 + \alpha I_2 \right) \) and \(d= \rho + r/\left( 1 + k{I_2} \right) ^2\). It can be seen that the above inequalities hold when \(k<\min \{k_0,k_1,k_2\}\). Comparing the expressions for \(k_0\) and \(k_2\) gives \(k_0=k_2+\beta (\rho +r)/(r\mu )>k_2\). Hence, the coefficients of Eq. (14) are all positive, which implies that Eq. (14) has no positive root.

Overall, the unique endemic equilibrium \(E_2\) is locally asymptotically stable for \(\tau \ge 0\) if \(0 \le k \le \min \left\{ {{k_1},{k_2}} \right\} \). This completes the proof. \(\square \)

Remark 2

In light of (iii) in Theorem 2, system (4) also has a unique endemic equilibrium when \(R_0=1\). For convenience, this case is discussed further below.

Remark 3

The characteristic equations at the two endemic equilibria \(E_1\) and \(E_2\) have almost the same form as that at the unique endemic equilibrium \(E_2\).

Next, we discuss the situation that the system (4) possesses two endemic equilibria \(E_1\) and \(E_2\).

Theorem 5

The endemic equilibrium \(E_1\) is unstable whenever it exists.

Proof

When \(\tau =0\), \(a_0\) becomes

$$\begin{aligned} {a_0} = \frac{{{I_1}}}{{\left( {1 + \alpha {I_1}} \right) {{\left( {1 + k{I_1}} \right) }^2}}}g(I_1). \end{aligned}$$

If \(a_0<0\), the characteristic equation has one positive real root, implying that \(E_1\) is unstable. As such, we will show that \(a_0<0\) in the following.

According to Theorem 2, \(E_1\) exists only if \(b<0\) and \(R_0<1\). It follows that

$$\begin{aligned} g(0)=(\alpha \mu + \beta )\rho +r\alpha \mu -rk\mu +r\beta =b+k\mu (\rho +r)(R_0-1)<0, \end{aligned}$$

and

$$\begin{aligned} g'(I)=2k\rho (1+kI)(\alpha \mu + \beta )>0 ~~ \text {for}~~I>0. \end{aligned}$$

From this, then there exists a unique \(I_*>0\) satisfying

$$\begin{aligned} \left\{ \begin{array}{ll} g(I)<0, &{} \hbox {if }I<I_*; \\ g(I)=0, &{} \hbox {if }I=I_*; \\ g(I)>0, &{} \hbox {if }I>I_*. \end{array} \right. \end{aligned}$$

where \(I_*=\frac{1}{k}\sqrt{\frac{rk\mu -r\alpha \mu -r\beta }{(\alpha \mu +\beta )\rho }}-\frac{1}{k}\). Hence,

$$\begin{aligned} I_1-I_*= & {} \frac{-b-\sqrt{\varDelta }}{2a}+\frac{1}{k}-\frac{1}{k}\sqrt{\frac{rk\mu -r\alpha \mu -r\beta }{(\alpha \mu +\beta )\rho }}\\= & {} \frac{U-\sqrt{\varDelta }}{2k\rho (\alpha \mu +\beta )}, \end{aligned}$$

where

$$\begin{aligned} \begin{aligned} U&= \beta (\varLambda k + \rho - r) - \mu [\rho (k - \alpha )+ \alpha r]-2\sqrt{r\rho (\alpha \mu +\beta )[(k-\alpha )\mu -\beta ]},\\ \varDelta&= [\mu \rho (k+\alpha )+\mu \alpha r+\beta (\rho +r)-\varLambda \beta k]^2 -4k\rho (\alpha \mu +\beta )[\mu (\rho +r)-\beta \varLambda ]\\&= U^2+4\sqrt{r\rho (\alpha \mu +\beta )[(k-\alpha )\mu -\beta ]}[(\alpha \mu +\beta )\rho +(\varLambda \beta k-\mu \alpha r-\beta r -k\mu \rho )]\\&\quad -8r\rho (\alpha \mu +\beta )[(k-\alpha )\mu -\beta ]. \end{aligned} \end{aligned}$$

Obviously, \(I_1<I_*\) holds if and only if

$$\begin{aligned}(\alpha \mu +\beta )\rho +(\varLambda \beta k-\mu \alpha r-\beta r -k\mu \rho )>2\sqrt{r\rho (\alpha \mu +\beta )[(k-\alpha )\mu -\beta ]},\end{aligned}$$

that is,

$$\begin{aligned} \left[ (\alpha \mu +\beta )\rho +(\varLambda \beta k-\mu \alpha r-\beta r -k\mu \rho )\right] ^2>4r\rho (\alpha \mu +\beta )\left[ (k-\alpha )\mu -\beta \right] . \end{aligned}$$
(15)

Next, we will prove that the above inequality always holds. In view of \(b^2>4ac\), we have

$$\begin{aligned}{}[(\alpha \mu +\beta )\rho -(\varLambda \beta k-\mu \alpha r-\beta r -k\mu \rho )]^2>4k\rho \mu (\alpha \mu +\beta )(\rho +r)-4k\rho \varLambda \beta (\alpha \mu +\beta ). \end{aligned}$$

Thus, it can be obtained that

$$\begin{aligned} \begin{aligned}&[(\alpha \mu +\beta )\rho +(\varLambda \beta k-\mu \alpha r-\beta r -k\mu \rho )]^2\\&\quad =[(\alpha \mu +\beta )\rho -(\varLambda \beta k-\mu \alpha r-\beta r -k\mu \rho )]^2+4(\alpha \mu +\beta )\rho (\varLambda \beta k-\mu \alpha r-\beta r -k\mu \rho )\\&\quad >4k\rho \mu (\alpha \mu +\beta )(\rho +r)-4k\rho \varLambda \beta (\alpha \mu +\beta )+4(\alpha \mu +\beta )\rho (\varLambda \beta k-\mu \alpha r-\beta r -k\mu \rho )\\&\quad =4r\rho (\alpha \mu +\beta )[(k-\alpha )\mu -\beta ]. \end{aligned} \end{aligned}$$

Thus inequality (15) holds, which implies that \(I_1<I_*\), which in turn implies that \(g(I_1)<0\), i.e., \(a_0<0\). Hence, the endemic equilibrium \(E_1\) is a saddle whenever it exists. The proof is completed. \(\square \)
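The ordering \(I_1<I_*<I_2\) and the resulting sign of \(a_0\) can be illustrated for one parameter set with \(\tau =0\), \(R_0<1\), and \(b<0\). The sketch below is a numerical spot check, not a substitute for the proof; all parameter values are illustrative assumptions.

```python
import math


def g(I, mu, beta, alpha, rho, r, k):
    """g(I) from Eq. (11); at an endemic equilibrium with tau = 0, a0 has the sign of g."""
    return ((1.0 + k * I) ** 2 * (alpha * mu + beta) * rho
            + r * alpha * mu - r * k * mu + r * beta)


# Illustrative values (assumed), with tau = 0.
Lambda, mu, beta, alpha, rho, r, k = 1.0, 0.1, 0.04275, 0.1, 0.25, 0.2, 5.0
R0 = beta * Lambda / (mu * (rho + r))
a = k * rho * (alpha * mu + beta)
b = k * mu * (rho + r) * (1.0 - R0) + (alpha * mu + beta) * (rho + r) - mu * k * r
c = mu * (rho + r) * (1.0 - R0)
sq = math.sqrt(b * b - 4.0 * a * c)
I1, I2 = (-b - sq) / (2.0 * a), (-b + sq) / (2.0 * a)   # roots of Eq. (6)
# Unique zero of g, as given in the proof.
I_star = (math.sqrt((r * k * mu - r * alpha * mu - r * beta)
                    / ((alpha * mu + beta) * rho)) - 1.0) / k
print("R0 =", R0, "b =", b)
print("I1, I_star, I2 =", I1, I_star, I2)
print("g(I1), g(I2) =", g(I1, mu, beta, alpha, rho, r, k), g(I2, mu, beta, alpha, rho, r, k))
```

For these values \(g(I_1)\) is negative and \(g(I_2)\) is positive, in line with Theorem 5 and with the claim \(a_0>0\) at \(E_2\) used in the proof of Theorem 6.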

Theorem 6

If \(R_0\le 1\) and \(\zeta >0\), the endemic equilibrium \(E_2\) is locally asymptotically stable for \(\tau \ge 0\), where \(\zeta \) is given by Eq. (16).

Proof

\(\bullet \) When \(\tau =0\), \(a_0>0\) can be proved by a similar approach to that of Theorem 5. Then

$$\begin{aligned} f(I_2)= & {} {\mathcal {A}}I_2^3+{\mathcal {B}}I_2^2+{\mathcal {C}}I_2+\mu \\= & {} (a{I_2^2} + bI_2 + c)(n_1I_2+n_2)+\frac{m_1I_2+m_2}{a^2}, \end{aligned}$$

where \({\mathcal {A}}=[(\rho +\mu )\alpha +\beta ]k^2\), \({\mathcal {B}}=[k\mu +2(\rho +\mu )\alpha +2\beta ]k\), \({\mathcal {C}}=(2\mu -r)k+(\rho +r+\mu )\alpha +\beta \), \(n_1={\mathcal {A}}/{a}\), \(n_2=({\mathcal {B}}-bn_1)/{a}\) and \(m_1=b^2{\mathcal {A}}+a^2{\mathcal {C}}-ab{\mathcal {B}}-ac{\mathcal {A}}\), \(m_2=bc{\mathcal {A}}-ac{\mathcal {B}}+\mu a^2\).

Then, recall from Eq. (6) that \(a{I_2^2} + bI_2 + c=0\) yields

$$\begin{aligned} \begin{aligned} {\mathop {\mathrm{sgn}}} \left( {{a_1}} \right)&= {\mathop {\mathrm{sgn}}} \left( {f\left( {{I_2}} \right) } \right) = {\mathop {\mathrm{sgn}}} \left( {{m_1}{I_2} + {m_2}} \right) ={\mathop {\mathrm{sgn}}}\left( m_1\frac{-b+\sqrt{b^2-4ac}}{2a}+m_2\right) \\&={\mathop {\mathrm{sgn}}}\left( 2am_2+m_1\left( \sqrt{b^2-4ac}-b\right) \right) \buildrel \varDelta \over = {\mathop {\mathrm{sgn}}}(\zeta ). \end{aligned} \end{aligned}$$
(16)

Therefore, \(E_2\) is locally asymptotically stable if and only if \(\zeta >0\) for \(\tau =0\).

\(\bullet \) When \(\tau \ne 0\), the proof is the same as that of Theorem 4 and is omitted here. Overall, \(E_2\) is locally asymptotically stable if \(\zeta >0\) when \(\tau \ge 0\). The proof is completed. \(\square \)

2.3 Global analysis

In the following, we focus on the global stability of \(E_0\) and \(E_2\) using the Lyapunov method and the fractional LaSalle’s invariance principle.

Theorem 7

If \(R_0\le \min \{1,R_a\}\), the disease-free equilibrium \(E_0\) is globally asymptotically stable.

Proof

Define the following Lyapunov function

$$\begin{aligned} V\left( {S,I} \right) = {V_1} + {V_2}, \end{aligned}$$
(17)

where \({V_1} = S - {S_0} - {S_0}\ln \frac{S}{{{S_0}}} + {e^{\mu \tau }}I\) and \({V_2} = \beta {{{}_0I_\tau ^{q}}}\frac{{S\left( {t - \iota } \right) I\left( {t - \iota } \right) }}{{1 + \alpha I\left( {t - \iota } \right) }}, \iota \in [0,\tau ]\); here \(V_2\) is a fractional integral with respect to \(\iota \). According to Lemma 3.1 in [32] and Theorem 4.5 in [33], we have

$$\begin{aligned} {{}_0^{C}{\mathscr {D}}_t^q}{V}\le & {} \frac{{S - {S_0}}}{S}{{}_0^{C}{\mathscr {D}}_t^q}S + {e^{\mu \tau }}{{}_0^{C}{\mathscr {D}}_t^q}I+\beta {{}_0^{C}{\mathscr {D}}_t^q} \left[ {{{}_0I_\tau ^{q}}}\frac{{S\left( {t - \iota } \right) I\left( {t - \iota } \right) }}{{1 + \alpha I\left( {t - \iota } \right) }}\right] \nonumber \\= & {} \frac{{S - {S_0}}}{S}\left( {\varLambda - \mu S - \frac{{\beta SI}}{{1 + \alpha I}}} \right) + \frac{{\beta {S_\tau }{I_\tau }}}{{1 + \alpha {I_\tau }}} - \rho {e^{\mu \tau }}I - \frac{{r{e^{\mu \tau }}I}}{{1 + kI}}+\frac{{\beta SI}}{{1 + \alpha I}} - \frac{{\beta {S_\tau }{I_\tau }}}{{1 + \alpha {I_\tau }}}\nonumber \\= & {} - \mu {S_0}\left( {\frac{{{S_0}}}{S} + \frac{S}{{{S_0}}} - 2} \right) - \left( {\rho {e^{\mu \tau }} + \frac{{r{e^{\mu \tau }}}}{{1 + kI}} - \frac{{\beta {S_0}}}{{1 + \alpha I}}} \right) I\nonumber \\= & {} - \mu {S_0}\left( {\frac{{{S_0}}}{S} + \frac{S}{{{S_0}}} - 2} \right) - \frac{{{e^{\mu \tau }}I}}{{\left( {1 + kI} \right) \left( {1 + \alpha I} \right) }}\left( {{a'}{I^2} + {b'}I + {c'}} \right) , \end{aligned}$$
(18)

where \( {a'} = \rho \alpha k > 0, {b'} = \rho \left( {\alpha + k} \right) + r\alpha - \beta {S_0}k{e^{ - \mu \tau }}, {c'} = \left( {\rho + r} \right) - \beta {S_0}{e^{ - \mu \tau }} = \left( {\rho + r} \right) \left( {1 - {R_0}} \right) \). Clearly, if \(k=0\), it follows that \({{}_0^{C}{\mathscr {D}}_t^q}{V}\le 0\) when \(R_0\le 1\). For the case \(k>0\), it is clear from the definition that \(c'>0\) is equivalent to \(R_0<1\), and \(b'\ge 0\) if and only if

$$\begin{aligned} {R_0} \le 1 + \frac{\alpha }{k} - \frac{r}{{\rho + r}} \buildrel \varDelta \over = {R_a}. \end{aligned}$$

By the inequality of arithmetic and geometric means, \(\frac{{{S_0}}}{S} + \frac{S}{{{S_0}}} - 2 \ge 0\). As a consequence, \({{}_0^{C}{\mathscr {D}}_t^q}{V}\le 0\) if \(R_0\le \min \{1,R_a\}\), and \({{}_0^{C}{\mathscr {D}}_t^q}{V}=0\) if and only if \(S=S_0,I=0\). In view of the LaSalle invariance principle [34], it follows that \(E_0\) is globally asymptotically stable if \(R_0\le \min \{1,R_a\}\). The proof is complete. \(\square \)
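As a quick numerical illustration of the condition in Theorem 7, the following Python sketch evaluates \(R_0\) and \(R_a\) from the expressions appearing in the proof, namely \(R_0=\beta S_0e^{-\mu \tau }/(\rho +r)\) (read off from \(c'\)) and \(S_0=\varLambda /\mu \) (the disease-free level of S used in Eq. (18)). The function names `thresholds` and `theorem7_condition` are hypothetical, and the value \(\rho =0.21\) is an assumption made only for illustration, since \(\rho \) is not specified in this part of the text.

```python
import math

def thresholds(Lambda, mu, beta, alpha, rho, r, k, tau):
    """Return (R0, Ra) from the expressions used in the proof of Theorem 7."""
    S0 = Lambda / mu                                   # disease-free level of S
    R0 = beta * S0 * math.exp(-mu * tau) / (rho + r)   # from c' = (rho + r)(1 - R0)
    # For k = 0 the proof only needs R0 <= 1, so Ra is treated as +infinity.
    Ra = math.inf if k == 0 else 1.0 + alpha / k - r / (rho + r)
    return R0, Ra

def theorem7_condition(R0, Ra):
    """Sufficient condition R0 <= min{1, Ra} of Theorem 7."""
    return R0 <= min(1.0, Ra)

# Parameter values of Example 1 with tau = 0; rho = 0.21 is an ASSUMED value.
R0, Ra = thresholds(Lambda=15.0, mu=0.1, beta=0.01, alpha=0.1,
                    rho=0.21, r=3.0, k=0.1, tau=0.0)
print(R0, Ra, theorem7_condition(R0, Ra))
```

With these assumed values, \(R_0\approx 0.467\) and \(R_a\approx 1.065\), so the sufficient condition of Theorem 7 is met.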

Remark 4

If \(k\le \mu \alpha (\rho +r)/(\mu r)<k_0\), then \(R_a\ge 1\) and, according to Corollary 2, no backward bifurcation occurs. In this case, \(R_0\le 1\) remains the threshold for eradicating the disease. In other words, a delay in treatment may have no negative impact in this case. In contrast, if \(k> \mu \alpha (\rho +r)/(\mu r)\), then \(R_a<1\) and the system may undergo a backward bifurcation. The threshold for eradicating the disease is then \(R_0\le R_a\) instead of \(R_0\le 1\). The delayed treatment of infected individuals thus makes the dynamics of the model more complicated.

Next, we analyze the global asymptotic stability of the unique endemic equilibrium \(E_2\). To this end, we make the following assumption.

(H)  \(k\left( \rho -r\right) I_2+\left( \rho +r\right) \alpha -kr\ge 0,~~ k\alpha \left( \rho +\rho kI_2-r\alpha I_2\right) \ge 0.\)

Theorem 8

If (H) holds, then the endemic equilibrium \(E_2\) is globally asymptotically stable.

Proof

Define the Lyapunov functional as

$$\begin{aligned} V = {V_S} + {V_I} + {V_*}, \end{aligned}$$
(19)

where

$$\begin{aligned} \begin{aligned} V_S&=S-S_2-S_2\ln \frac{S}{S_2},~~~ V_I=e^{\mu \tau }\left( I-I_2-I_2\ln \frac{I}{I_2}\right) ,\\ V_*&=\frac{{\beta {S_2}{I_2}}}{{1 + \alpha {I_2}}}{{{}_0I_\tau ^{q}}}\left[ {\frac{{1 + \alpha {I_2}}}{{{S_2}{I_2}}}\frac{{S\left( {t - \iota } \right) I\left( {t - \iota } \right) }}{{1 + \alpha I\left( {t - \iota } \right) }} - 1 - \ln \frac{{1 + \alpha {I_2}}}{{{S_2}{I_2}}}\frac{{S\left( {t - \iota } \right) I\left( {t - \iota } \right) }}{{1 + \alpha I\left( {t -\iota } \right) }}} \right] ,\iota \in [0,\tau ]. \end{aligned} \end{aligned}$$

Then, the relationships \(\varLambda = \mu {S_2} + \frac{{\beta {S_2}{I_2}}}{{1 + \alpha {I_2}}}\) and \(\rho {e^{\mu \tau }}I_2 = \frac{{\beta {S_2}{I_2}}}{{1 + \alpha {I_2}}} - \frac{{r{e^{\mu \tau }}I_2}}{{1 + kI_2}}\) will be used as substitutions in the following calculation. Thus,

$$\begin{aligned} {{}_0^{C}{\mathscr {D}}_t^q}{V_S}\le & {} \left( 1-\frac{S_2}{S}\right) {{}_0^{C}{\mathscr {D}}_t^q}{S} =\left( 1-\frac{S_2}{S}\right) \left( \varLambda -\mu S-\frac{\beta S I}{1+\alpha I}\right) \nonumber \\= & {} -\frac{\mu \left( S-S_2\right) ^2}{S}+\frac{{\beta {S_2}{I_2}}}{{1 + \alpha {I_2}}}\left( 1-\frac{S_2}{S}+\frac{I}{I_2}\frac{1+\alpha I_2}{1+\alpha I}\right) -\frac{\beta S I}{1+\alpha I}, \end{aligned}$$
(20)

and

$$\begin{aligned} {{}_0^{C}{\mathscr {D}}_t^q}{V_I}\le & {} e^{\mu \tau }\left( 1-\frac{I_2}{I}\right) {{}_0^{C}{\mathscr {D}}_t^q}{I} =\left( 1-\frac{I_2}{I}\right) \left( \frac{\beta S_\tau I_\tau }{1+\alpha I_\tau }-\rho e^{\mu \tau }I-\frac{{r{e^{\mu \tau }}I}}{{1 + kI}}\right) \nonumber \\= & {} \left( 1-\frac{I_2}{I}\right) \left( \frac{\beta S_\tau I_\tau }{1+\alpha I_\tau }-\frac{{\beta {S_2}{I_2}}}{{1 + \alpha {I_2}}}\frac{I}{I_2}\right) +\left( 1-\frac{I_2}{I}\right) \left( \frac{{r{e^{\mu \tau }}I_2}}{{1 + kI_2}}\frac{I}{I_2}-\frac{{r{e^{\mu \tau }}I}}{{1 + kI}}\right) \nonumber \\= & {} \frac{\beta S_\tau I_\tau }{1+\alpha I_\tau }+\frac{{\beta {S_2}{I_2}}}{{1 + \alpha {I_2}}}\left( 1-\frac{I}{I_2}-\frac{I_2}{I} \frac{S_\tau I_\tau }{1+\alpha I_\tau }\frac{1+\alpha I_2}{S_2I_2}\right) \nonumber \\&~~~+ \frac{{r{e^{\mu \tau }}I_2}}{{1 + kI_2}}\left( 1-\frac{I_2}{I}\right) \left( \frac{I}{I_2}-\frac{I}{1+kI}\frac{1+kI_2}{I_2}\right) . \end{aligned}$$
(21)

Then, the fractional derivative of \(V_*\) yields

$$\begin{aligned} {{}_0^{C}{\mathscr {D}}_t^q}{V_*}\le \frac{\beta SI}{1+\alpha I}-\frac{\beta S_\tau I_\tau }{1+\alpha I_\tau }+\frac{{\beta {S_2}{I_2}}}{{1 + \alpha {I_2}}}\ln \left( \frac{S_\tau I_\tau }{1+\alpha I_\tau }\frac{1+\alpha I}{SI}\right) . \end{aligned}$$
(22)

By means of Eqs. (20)–(22), we have

$$\begin{aligned} \begin{aligned} {{}_0^{C}{\mathscr {D}}_t^q}{V}&\le \frac{{\beta {S_2}{I_2}}}{{1 + \alpha {I_2}}}\left( 1-\frac{S_2}{S}+\ln \frac{S_2}{S}\right) +\frac{{\beta {S_2}{I_2}}}{{1 + \alpha {I_2}}}\left( 1-\frac{I_2}{I} \frac{S_\tau I_\tau }{1+\alpha I_\tau }\frac{1+\alpha I_2}{S_2I_2}+\ln \frac{I_2}{I} \frac{S_\tau I_\tau }{1+\alpha I_\tau }\frac{1+\alpha I_2}{S_2I_2}\right) \\&~~~+\frac{{\beta {S_2}{I_2}}}{{1 + \alpha {I_2}}}\left( 1-\frac{1+\alpha I}{1+\alpha I_2}+\ln \frac{1+\alpha I}{1+\alpha I_2}\right) -\frac{\alpha \beta S_2\left( I-I_2\right) ^2}{\left( 1+\alpha I_2\right) ^2\left( 1+\alpha I\right) }+\frac{re^{\mu \tau }k\left( I-I_2\right) ^2}{\left( 1+kI\right) \left( 1+kI_2\right) }. \end{aligned} \end{aligned}$$

It is well known that \(1-x+\ln x\le 0\) for all \(x>0\), with equality only if \(x=1\). In the simplest case \(k=0\), the last term of the above inequality vanishes and it follows that \({{}_0^{C}{\mathscr {D}}_t^q}{V}\le 0\). For the case \(k>0\), using \(e^{\mu \tau }\rho \left( 1+\alpha I_2\right) +e^{\mu \tau }r(1+\alpha I_2)/\left( 1+kI_2\right) \) to replace \(\beta S_2\), the last two terms of the above inequality are simplified as

$$\begin{aligned} -\frac{e^{\mu \tau }\left( I-I_2\right) ^2G(I)}{(1+\alpha I_2)(1+k I_2)(1+\alpha I)(1+kI)}, \end{aligned}$$
(23)

where \(G(I)=\alpha k\left[ k\alpha \left( \rho +\rho kI_2-r\alpha I_2\right) \right] I+k\left( \rho -r\right) I_2+\left( \rho +r\right) \alpha -kr\). If condition (H) is satisfied, then \(G(I)\ge 0\) for all \(I\ge 0\), and hence \({{}_0^{C}{\mathscr {D}}_t^q}{V}\le 0\) for all \(S\ge 0,I\ge 0\). Thus, \(\{E_2\}\) is the largest invariant subset of \(\left\{ (S,I)\in {\mathbb {R}}_ + ^2:{{}_0^{C}{\mathscr {D}}_t^q}{V}= 0\right\} \). By the LaSalle invariance principle [34], \(E_2\) is globally asymptotically stable. The proof is complete. \(\square \)

3 Fractional optimal control problem

3.1 Formulation of optimal control problem

In this section, we formulate a control problem for system (1) by introducing vaccination and a strengthened supply of medical resources as control interventions. Vaccination and treatment are chosen as control policies because each has its own advantages and is easy to implement. These control interventions are described in more detail below.

  1. (i)

    Vaccines have always been among the most effective means of controlling infectious diseases [35]. Although mass vaccination can effectively control those diseases, it imposes a large socioeconomic burden. A key question is how intensively and when to vaccinate the population so that the number of susceptible individuals declines at the lowest possible cost. The control variable \(u_1(t)\), which represents vaccination, is introduced into system (1) with the aim of maximizing the number of vaccinated individuals at the lowest cost over a finite time period.

  2. (ii)

    Treatment is vital to cure the infection and prevent further spread of the disease. As mentioned in the introduction, the capacity of medical resources in the community is limited. However, when a large-scale outbreak occurs, governments and management agencies may invest more medical resources in fighting the disease. Thus, another control variable \(u_2(t)\) is introduced into system (1) to represent the enhancement of treatment intensity for infected individuals, that is, an increase in r.

Introducing the above control measures into system (1) yields

$$\begin{aligned} \left\{ \begin{array}{l} {{}_0^{C}{\mathscr {D}}_t^q} S(t)=\varLambda -\mu S(t)-\frac{\beta S(t) I(t)}{1+\alpha I(t)}-u_1(t)S(t), \\ {{}_0^{C}{\mathscr {D}}_t^q} I(t)=e^{-\mu \tau } \frac{\beta S(t-\tau ) I(t-\tau )}{1+\alpha I(t-\tau )}-\rho I(t)-\frac{(r+u_2(t)) I(t)}{1+k I(t)}, \\ {{}_0^{C}{\mathscr {D}}_t^q} R(t)=\gamma I(t)-\mu R(t)+u_1(t)S(t)+\frac{(r+u_2(t)) I(t)}{1+k I(t)}, \end{array}\right. \end{aligned}$$
(24)

subject to the nonnegative initial conditions (2). Here, \(u_1\) and \(u_2\) belong to the following admissible control set

$$\begin{aligned} Q = \left\{ {\left( {u_1}(t),{u_2}(t)\right) \in {L^\infty }(0,{T_f}):0 \le {u_i}(t) \le 1,i=1,2} \right\} , \end{aligned}$$

where \(T_f\) denotes the final time of the control strategies. Our principal goal is to minimize the number of infected individuals and to reduce the corresponding total cost during the spread of the disease.
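To make the structure of the controlled system (24) concrete, the following Python sketch evaluates its right-hand sides at a single time instant and checks membership in the admissible set Q pointwise. It is only an illustration: the function names `controlled_rhs` and `is_admissible` and the parameter dictionary `par` are hypothetical, and the value \(\rho =0.21\) is an assumption, since \(\rho \) is not specified in this section.

```python
import math

def controlled_rhs(S, I, R, S_tau, I_tau, u1, u2, par):
    """Right-hand sides of system (24) at one time instant.

    S_tau and I_tau stand for S(t - tau) and I(t - tau); `par` is a
    hypothetical dict holding the model parameters.
    """
    incidence = par["beta"] * S * I / (1.0 + par["alpha"] * I)
    delayed = (math.exp(-par["mu"] * par["tau"]) * par["beta"]
               * S_tau * I_tau / (1.0 + par["alpha"] * I_tau))
    treatment = (par["r"] + u2) * I / (1.0 + par["k"] * I)
    dS = par["Lambda"] - par["mu"] * S - incidence - u1 * S
    dI = delayed - par["rho"] * I - treatment
    dR = par["gamma"] * I - par["mu"] * R + u1 * S + treatment
    return dS, dI, dR

def is_admissible(u1, u2):
    """Pointwise bounds 0 <= u_i <= 1 of the admissible set Q."""
    return 0.0 <= u1 <= 1.0 and 0.0 <= u2 <= 1.0

# Illustrative values; rho = 0.21 is an ASSUMED value, not taken from the paper.
par = {"Lambda": 15.0, "mu": 0.1, "beta": 0.01, "alpha": 0.1,
       "gamma": 0.0001, "rho": 0.21, "r": 0.05, "k": 0.1, "tau": 1.0}
```

At the disease-free state \(S=\varLambda /\mu \), \(I=R=0\) with both controls switched off, all three right-hand sides vanish, which gives a simple sanity check of the implementation.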

Vaccination is a collective strategy that must cover a large proportion of the population to produce protective efficacy. Hence, we use \(u_1^4(t)\) to represent the high cost of the vaccination process [36]. The cost of treatment is represented by \(u_2^2(t)\), a choice whose suitability has been established in the literature [37, 38]. Thus, the objective (cost) function subject to the fractional-order system (24) is proposed as follows

$$\begin{aligned} {\mathcal {J}}({u_1},{u_2}) = {{{}_0I_{T_f}^{q}}} {\left[ { pI(t) + {{\kappa _1}u_1^4(t) + {\kappa _2}u_2^2(t)} } \right] }, \end{aligned}$$
(25)

where p represents the positive weight constant of the infected individuals, while \(\kappa _1\) and \(\kappa _2\) are positive weight constants for the vaccination and treatment enhancement strategies, respectively. The term pI(t) is the total cost of all infected individuals due to opportunity loss, that is, reduced productivity due to sickness, manpower, patient care, social burden, etc. The terms \({\kappa _1}u_1^4(t)\) and \( {\kappa _2}u_2^2(t)\) describe the costs associated with the delivery of the two interventions. Thus, the objective (cost) function is a weighted combination of the costs of opportunity loss due to infected individuals and the costs incurred in providing vaccination and treatment enhancement. Our objective is to find the optimal control pair \(({u_1^*},{u_2^*})\) such that the objective (cost) function is minimized.
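The objective function (25) can be approximated numerically once the trajectories are sampled on a uniform grid. The Python sketch below assumes that \({{}_0I_{T_f}^{q}}\) denotes the Riemann-Liouville fractional integral of order q evaluated at \(T_f\), and it uses a simple left-rectangle product rule in which the kernel \((T_f-s)^{q-1}\) is integrated exactly on each subinterval. The function names `fractional_integral` and `cost` are hypothetical, and this quadrature is a sketch rather than the scheme used by the authors.

```python
import math

def fractional_integral(values, h, q):
    """Approximate (1/Gamma(q)) * int_0^T (T - s)^(q-1) f(s) ds with T = n*h.

    `values` holds f(t_j) at the left endpoints t_j = j*h, j = 0..n-1.
    The kernel is integrated exactly on each subinterval.
    """
    n = len(values)
    total = 0.0
    for j, f_j in enumerate(values):
        # Exact integral of q*(T - s)^(q-1) over [t_j, t_{j+1}]
        weight = ((n - j) * h) ** q - ((n - j - 1) * h) ** q
        total += f_j * weight
    return total / math.gamma(q + 1.0)

def cost(I, u1, u2, h, q, p=1.0, kappa1=15.0, kappa2=10.0):
    """Approximate J(u1, u2) of Eq. (25) from sampled I, u1 and u2."""
    integrand = [p * i + kappa1 * a ** 4 + kappa2 * b ** 2
                 for i, a, b in zip(I, u1, u2)]
    return fractional_integral(integrand, h, q)
```

For a constant integrand equal to 1 the weights telescope, so the rule returns \(T_f^q/\varGamma (q+1)\) up to rounding, and for \(q=1\) it reduces to the ordinary left Riemann sum.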

3.2 Existence of optimal control pair

In this section, we take advantage of the methodology given by Fleming and Rishel [37] to prove the existence of optimal control pair.

Theorem 9

There exists an optimal control pair \((u_1^*,u_2^*)\in Q\) such that \({\mathcal {J}}(u_1^*,u_2^*) = \min {\mathcal {J}}({u_1},{u_2})\) subject to the control system (24).

Proof

First, the solution of the state system (24) is bounded for each bounded control pair in Q; the proof is identical to that of Theorem 1. Second, the right-hand side of system (24) is Lipschitz continuous, as proved in detail in “Appendix A”. By the Picard-Lindelöf theorem [39], the set of controls and corresponding state variables is non-empty. Third, the admissible set Q is, by definition, closed and bounded. Further, because the control variables enter through biquadratic and quadratic terms, the integrand of Eq. (25) is convex in \((u_1,u_2)\) on Q. In addition, the integrand of Eq. (25) satisfies \({ {p}I + {{\kappa _1}u_1^4 + {\kappa _2}u_2^2} } \ge \kappa \left( u_1^4+u_2^2\right) :=f(u_1,u_2)\) with \(\kappa =\min \left\{ \kappa _1,\kappa _2\right\} \), where f is continuous and satisfies \(f(u_1,u_2)/|(u_1,u_2)|\rightarrow \infty \) as \(|(u_1,u_2)|\rightarrow \infty \). According to Theorem 4.1 in [37], there exists an optimal \((u_1^*,u_2^*)\in Q\) such that \({\mathcal {J}}(u_1^*,u_2^*) = \min {\mathcal {J}}({u_1},{u_2})\). The proof is complete. \(\square \)

3.3 Characterization of optimal control solution

In order to characterize the optimal solution, the Lagrangian function for the fractional control system (24)–(25) is defined as follows

$$\begin{aligned} {\mathcal {L}}(I,{u_1},{u_2}) = { {{p}I(t) + {{\kappa _1}u_1^4(t) + {\kappa _2}u_2^2(t)} }}, \end{aligned}$$
(26)

and the Hamiltonian function is given by

$$\begin{aligned} {{{\mathcal {H}}}}(S,I,R,{u_1},{u_2},{\lambda })= & {} {{{\mathcal {L}}}}(I,{u_1},{u_2}) + {\lambda _1}{{{}_0^{C}{\mathscr {D}}_t^q} }S\left( t \right) + {\lambda _2}{{{}_0^{C}{\mathscr {D}}_t^q} }I\left( t \right) + {\lambda _3}{{{}_0^{C}{\mathscr {D}}_t^q} }R\left( t \right) \nonumber \\= & {} {p}I + {{\kappa _1}u_1^4 + {\kappa _2}u_2^2}+\lambda _1\left( \varLambda -\mu S-\frac{\beta S I}{1+\alpha I}-u_1S\right) \nonumber \\&~~+\lambda _2\left( e^{-\mu \tau } \frac{\beta S_\tau I_\tau }{1+\alpha I_\tau }-\rho I-\frac{(r+u_2) I}{1+k I}\right) \nonumber \\&~~+\lambda _3\left( \gamma I-\mu R+u_1S+\frac{(r+u_2) I}{1+k I}\right) , \end{aligned}$$
(27)

where \(\lambda =\left[ \lambda _1,\lambda _2 ,\lambda _3\right] ^T\in {\mathbb {R}}^3_+\) is the adjoint variable.

Theorem 10

Let \((S^*,I^*,R^*)\) be an optimal state solution associated with the optimal control pair \((u_1^*,u_2^*)\) that minimizes the objective function (25). Then there exists an adjoint variable \(\lambda \) such that

$$\begin{aligned} \left\{ \begin{array}{l} \begin{aligned} {{{}_0^{C}{\mathscr {D}}_t^q} }{\lambda _1}(t)&{}= {\lambda _1}(t)\left( {\mu + \frac{{\beta {I^*(t)}}}{{1 + \alpha {I^*(t)}}} + u_1^*(t)} \right) - {\chi _{\left[ {0,{T_f} - \tau } \right] }}{\lambda _2}(t + \tau )\frac{{\beta {I^*(t)}{e^{ - \mu \tau }}}}{{1 + \alpha {I^*(t)}}}- {\lambda _3}(t)u_1^*(t) ,\\ {{{}_0^{C}{\mathscr {D}}_t^q} }{\lambda _2}(t)&{}=- {p} + {\lambda _1}(t)\frac{{\beta {S^*(t)}}}{{{{\left( {1 + \alpha {I^*(t)}} \right) }^2}}} + {\lambda _2}(t)\left( {\rho + \frac{{r + u_2^*(t)}}{{{{\left( {1 + k{I^*(t)}} \right) }^2}}}} \right) - {\chi _{\left[ {0,{T_f} - \tau } \right] }}{\lambda _2}(t + \tau )\frac{{\beta {S^*(t)}{e^{ - \mu \tau }}}}{{{{\left( {1 + \alpha {I^*(t)}} \right) }^2}}}\\ &{}~~~- {\lambda _3}(t)\left( {\gamma + \frac{{r + u_2^*(t)}}{{{{\left( {1 + k{I^*(t)}} \right) }^2}}}} \right) ,\\ {{{}_0^{C}{\mathscr {D}}_t^q} }{\lambda _3}(t)&{}= {\lambda _3}(t)\mu , \end{aligned} \end{array}\right. \end{aligned}$$
(28)

with the transversality conditions

$$\begin{aligned} {\lambda _i}({T_f}) = 0,~~i = 1,2,3. \end{aligned}$$
(29)

Furthermore, the optimal control pair is given by

$$\begin{aligned} \left\{ {\begin{array}{*{20}{l}} {u_1^*(t) = \max \bigg \{ {0,\min \left\{ {1,{\root 3 \of {{\frac{{({\lambda _1}(t) - {\lambda _3}(t)){S^*(t)}}}{{4{\kappa _1}}}}}}} \right\} } \bigg \},}\\ {u_2^*(t) = \max \bigg \{ {0,\min \left\{ {1,~\frac{{({\lambda _2}(t) - {\lambda _3}(t)){I^*(t)}}}{{2{\kappa _2}\left( {1 + k{I^*(t)}} \right) }}} \right\} } \bigg \}.} \end{array}} \right. \end{aligned}$$
(30)

Proof

According to the analysis in the literature [40] for the fractional-order optimal control problem with delay in the state, the adjoint equations should satisfy

$$\begin{aligned} \left\{ \begin{array}{l} {{{}_0^{C}{\mathscr {D}}_t^q} }{\lambda _1}(t)=- {\frac{{\partial {{{\mathcal {H}}}}}}{{\partial S}} - {\chi _{[0,{T_f} - \tau ]}}\left( t \right) {{\left. {\frac{{\partial {{{\mathcal {H}}}}}}{{\partial {S_\tau }}}} \right| }_{t = t + \tau }}},\\ {{{}_0^{C}{\mathscr {D}}_t^q} }{\lambda _2}(t)=- {\frac{{\partial {{{\mathcal {H}}}}}}{{\partial I}} - {\chi _{[0,{T_f} - \tau ]}}\left( t \right) {{\left. {\frac{{\partial {{{\mathcal {H}}}}}}{{\partial {I_\tau }}}} \right| }_{t = t + \tau }}},\\ {{{}_0^{C}{\mathscr {D}}_t^q} }{\lambda _3}(t)=-\frac{{\partial \mathcal{{{{\mathcal {H}}}}}}}{{\partial R}}, \end{array} \right. \end{aligned}$$

where \({\chi _{[0,{T_f} - \tau ]}}\) is the characteristic function such that

$$\begin{aligned} \chi _{\left[ {0,{T_f} - \tau } \right] }= {\left\{ \begin{array}{ll} 1,&{} \text {if} ~t\in [0,T_f-\tau ],\\ 0,&{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

Hence, the adjoint equations can be further simplified and expressed as Eq. (28). In addition, this optimal control problem has no terminal cost and the final state is also free, and thus, the transversality conditions are \({\lambda _1}({T_f})=\lambda _2(T_f)=\lambda _3(T_f)=0 \). Then, from the optimality conditions, it follows that

$$\begin{aligned}\left\{ {\begin{array}{*{20}{l}} {{{\left. {\frac{{\partial {{\mathcal {H}}}}}{{\partial {u_1}}}} \right| }_{{u_1}\left( t \right) = u_1^*(t)}} = 4{\kappa _1}{{\left( {u_1^*(t)} \right) }^3} - {\lambda _1}\left( t \right) {S^*} + {\lambda _3}\left( t \right) {S^*} = 0,}\\ {{{\left. {\frac{{\partial {{\mathcal {H}}}}}{{\partial {u_2}}}} \right| }_{{u_2}\left( t \right) = u_2^*(t)}} = 2{\kappa _2}u_2^*(t) - {\lambda _2}\left( t \right) \frac{{{I^*}}}{{1 + k{I^*}}} + {\lambda _3}\left( t \right) \frac{{{I^*}}}{{1 + k{I^*}}} = 0,} \end{array}} \right. \end{aligned}$$

which, combined with the bounds \(0\le u_i\le 1\) of the admissible set Q, shows that the associated optimal control pair \((u_1^*,u_2^*)\) satisfies Eq. (30). This completes the proof of Theorem 10. \(\square \)
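The characterization in Eq. (30) amounts to solving the stationarity conditions for \(u_1\) and \(u_2\) and then projecting the result onto the interval [0, 1]. The following Python sketch illustrates this pointwise update for given values of the adjoint and state variables; the function names `clip01` and `optimal_controls` are hypothetical, and the sketch is not a substitute for the algorithm summarized in “Appendix B”.

```python
import math

def clip01(x):
    """Project x onto [0, 1], i.e. max{0, min{1, x}} as in Eq. (30)."""
    return max(0.0, min(1.0, x))

def optimal_controls(lam1, lam2, lam3, S, I, kappa1, kappa2, k):
    """Pointwise evaluation of Eq. (30) for given adjoint and state values."""
    z = (lam1 - lam3) * S / (4.0 * kappa1)
    # Real cube root that also handles negative arguments
    cube_root = math.copysign(abs(z) ** (1.0 / 3.0), z)
    u1 = clip01(cube_root)
    u2 = clip01((lam2 - lam3) * I / (2.0 * kappa2 * (1.0 + k * I)))
    return u1, u2
```

For example, with \(\lambda _1-\lambda _3=1\), \(S^*=7.5\) and \(\kappa _1=15\), the argument of the cube root is 0.125 and the unconstrained value of \(u_1^*\) is 0.5, which already lies in [0, 1]; a negative argument is projected to the lower bound 0.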

4 Numerical simulations

In this section, we verify the theoretical results numerically using MATLAB/Simulink. Following the data in [9], we set the parameter values as:

$$\begin{aligned} \varLambda =15,\mu =0.1,\beta =0.01,\alpha =0.1,\epsilon =0.01,\gamma =0.1. \end{aligned}$$

The other parameter values are shown in specific examples.

4.1 Numerical simulations without control strategies

In this section, we first address the effect of \(\tau \) and r on \(R_0\), as well as the effect of r and k on I. Figure 1 shows that \(R_0\) is negatively correlated with both \(\tau \) and r, from which we infer that a longer time delay and more abundant medical resources may suppress the outbreak of the disease. As is clear from Fig. 2, I decreases with increasing r and with decreasing k; that is, a more efficient supply of medical resources (larger r/k) could reduce the number of infected individuals. Besides, as can be seen from Fig. 2a, b, a backward bifurcation occurs once k increases beyond a certain value, which results in the existence of multiple endemic equilibria.

Fig. 1 Numerical simulations of the effect of \(\tau \) and r on \(R_0\)

Fig. 2 Numerical simulations of I changing with different k and r when \(\tau =0\)

Example 1

Fix \(r=3,k=0.1,q=0.95,\tau =0,10\). It is easy to show that \(R_0<\min \{1,R_a\}\) and \(b>0\). Thus, the only equilibrium is the disease-free equilibrium \(E_0\), which is globally asymptotically stable according to Theorem 7, as illustrated in Fig. 3. In addition, Fig. 4 shows the solution of system (4) for different values of the order q. The order clearly affects both the convergence speed and the steady-state value. As q decreases, the memory effect of the system strengthens, resulting in slower convergence and a corresponding change in the steady-state value, which means that the disease may take longer to be eradicated.

Fig. 3 \(S-I\) plane of phase diagram with different initial values when \(R_0<\min \{1,R_a\}\)

Fig. 4 Numerical simulations of the state variables with different orders q when \(\tau =0\). The arrow represents the direction of increasing q

Example 2

Fix \(r=4,k=2,q=0.95,\tau =0,2\). In both cases, \(R_c<R_0<1\) and system (4) has three equilibria \(E_0\), \(E_1\) and \(E_2\). As can be seen from Fig. 5, the equilibria \(E_0\) and \(E_2\) are stable, and \(E_1\) is a saddle.

Fig. 5 \(S-I\) plane of phase diagram with different initial values when \(R_c<R_0<1\)

Example 3

Fix \(r=0.1,k=0.01,\tau =0,2\). In both cases, \(R_0>1\) and condition (H) holds; thus system (4) has a unique positive equilibrium \(E_2\), which is globally asymptotically stable by Theorem 8, as shown in Fig. 6. In addition, Fig. 7 shows that the closer q is to 1, the more closely the solution of the fractional-order system follows that of the integer-order system. Of note, the steady-state value increases as q decreases, which implies that eradicating the disease requires noticeably longer or more intensive interventions.

Fig. 6 \(S-I\) plane of phase diagram with different initial values when \(q=0.95\) and \(R_0>1\)

Fig. 7 Numerical simulations of the infected individuals I with different orders q. The arrow represents the direction of increasing q

4.2 Numerical simulations with control strategies

In this section, the generalized Euler method (GEM) is adopted to solve the fractional-order optimal control problem; for more details, see [41]. The detailed algorithm is summarized in “Appendix B”. For the optimal control problem proposed in Sect. 3, the parameter values are set to \(\gamma =0.0001,r=0.05,k=0.1,\tau =1,q=0.98, p=1,\kappa _1=15,\kappa _2=10,T_f=50\) with the initial condition \(\vartheta _1=50,\vartheta _2=5,\vartheta _3=1\), and the other parameter values are kept as described above. In order to evaluate the efficiency of single and combined control strategies, we simulate the following three cases.

Strategy A:   Single control strategy of vaccination only (i.e., \(u_1\ne 0,u_2=0\)).

Strategy B:   Single control strategy of treatment enhancement only (i.e., \(u_1=0,u_2\ne 0\)).

Strategy C:   Combination of both the control strategies (i.e., \(u_1\ne 0,u_2\ne 0\)).
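For orientation, the basic update of the generalized Euler method can be sketched in a few lines of Python. The sketch below uses the commonly stated form \(y_{j+1}=y_j+\frac{h^q}{\varGamma (q+1)}f(t_j,y_j)\) for a scalar Caputo problem of order q without delay; the function name `generalized_euler` is hypothetical, and the handling of the delay, of the three state equations and of the backward adjoint sweep described in “Appendix B” is deliberately left out.

```python
import math

def generalized_euler(f, y0, q, h, n_steps):
    """Generalized Euler method for D^q y = f(t, y), y(0) = y0, 0 < q <= 1.

    Applies y_{j+1} = y_j + h^q / Gamma(q + 1) * f(t_j, y_j) on a uniform grid
    and returns the list [y_0, y_1, ..., y_n].
    """
    coef = h ** q / math.gamma(q + 1.0)
    ys = [y0]
    y = y0
    for j in range(n_steps):
        y = y + coef * f(j * h, y)
        ys.append(y)
    return ys

# Scalar test problem D^q y = -y with y(0) = 1 (illustrative only)
trajectory = generalized_euler(lambda t, y: -y, 1.0, 0.98, 0.1, 100)
```

For \(q=1\) the coefficient reduces to h, so the scheme coincides with the classical forward Euler method, which provides a simple consistency check.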

In Fig. 8, the dynamic behavior of the system under Strategies A, B and C is compared with that of the system without control. The observed results can be summarized as follows.

  1. (1)

    As can be seen from Fig. 8a, the number of susceptible individuals under Strategies A and C decreases sharply within the first week and thereafter remains at a low level, with Strategy A performing slightly better than Strategy C in lowering the number of susceptible individuals, whereas the number of susceptible individuals under Strategy B increases significantly.

  2. (2)

    From Fig. 8b, all three strategies exert an inhibitory effect on the spread of the disease, among which Strategies A and C exhibit the most obvious effect, while Strategy B has a slightly weaker effect. If only a single control strategy is adopted, the vaccination strategy is much more effective than the enhancement of treatment intensity. However, this conclusion may change if vaccine efficacy is low. In addition, we note that there is a slight rise at the end of the curves, which is caused by the rapid decay of the control functions near the final time.

  3. (3)

    From Fig. 8c, the number of recovered individuals under Strategies A and C increases rapidly at the initial stage, with Strategy C staying at a higher level than Strategy A, whereas the number of recovered individuals under Strategy B also increases slightly.

  4. (4)

    The control paths under the different control strategies are shown in Fig. 8d–f. We note that the control measures of the three strategies remain active during almost the whole disease control period. In addition, from Fig. 8f, the control variable \(u_2\) stays at its maximum during the first 43 days and then gradually decreases to the lower bound. Meanwhile, the control variable \(u_1\) decreases slowly towards a minimum. This indicates that the enhancement of treatment intensity has a greater effect than vaccination when the two control strategies are combined.

In addition, cost is among the most significant factors in determining which strategy is applicable. We therefore perform a comparative analysis of the costs of the different strategies, as shown in Fig. 9. Clearly, in the absence of control measures, the cost in the initial stage of the disease is relatively low, and then the rapid increase in the number of infected individuals due to the increased infectivity leads to the highest cost in the later period. Meanwhile, we also note that the vaccination-only strategy is less costly than the treatment-enhancement-only strategy. In addition, the total cost of Strategy C is the lowest among all strategies.

Based on the above observations and analyses, it is clear that among the three control strategies, Strategy C (i.e., the combined control strategy) is the most effective in controlling and reducing the transmission of the disease and has the lowest cost.

Finally, the changes of the control variables \(u_1\) and \(u_2\) in Strategy C under different values of the order q are simulated in Fig. 10. It can be clearly observed that the maximum levels of the control variables increase as q decreases, which is in accordance with the analysis of Fig. 7 in Example 3. Thus, the maximum level of the control variables can be reduced by reducing the memory characteristics.

Fig. 8 Numerical simulations of the state variables and control variables under the different control strategies

Fig. 9 Cost curves under the different control strategies

Fig. 10 Control variables in strategy C for different values of q. The arrow represents the direction of increasing q

5 Conclusion

In this paper, we investigated an SIR epidemic model with saturated treatment function and saturated incidence rate that generalizes the model studied in [9] by introducing disease latency and a fractional-order operator. This generalization not only enriches the dynamics of the model and increases its complexity, but also makes the model more realistic. This study consists of two parts.

In the first part, some essential properties of the model were studied, such as the positivity and boundedness of solutions and the existence and stability of equilibria. The analysis indicated that the capacity of the medical resources has important effects on the dynamics of the transmission of infectious diseases. If there are sufficient medical resources, the system has a unique endemic equilibrium, and the basic reproductive number \(R_0\) serves as a threshold parameter that determines the disease status. If the medical resources are limited, then the system can exhibit richer dynamical behavior such as backward bifurcation, which results in the existence of multiple endemic equilibria. In this case, reducing \(R_0\) below 1 is insufficient to eradicate the disease, and \(R_c\) can be regarded as the threshold for eradicating the disease. From an epidemiological point of view, this feature is especially important for developing effective disease control and elimination programs. In addition, we found that the fractional order q has a distinct effect on the steady-state value and the convergence speed. That is, the disease may require longer or more intensive interventions to be eradicated as the fractional order q decreases.

In the second part, based on the proposed model, we considered an optimal control problem by introducing vaccination and increased treatment intensity as control interventions. The existence of an optimal control pair was demonstrated, and the optimal control solution was characterized by using Pontryagin’s maximum principle. The results were then analyzed and discussed by means of numerical simulations. They indicated that Strategy A (vaccination only) is both more effective and less costly than Strategy B (treatment enhancement only) when only a single control strategy is adopted. Among all the strategies, however, Strategy C (a combination of the two control strategies) is the most effective and the least expensive. In other words, the combination of vaccination and treatment enhancement not only minimizes the cost, but also efficiently controls the spread of infectious diseases. Besides, we found that the fractional order q affects the maximum level of the control variables, that is, the maximum level of the control variables can be reduced by reducing the memory characteristics. As a result, the proper choice of the fractional order q is particularly important for disease prediction and control with the proposed model.