1 Introduction

Since the pioneering work of Kermack and McKendrick (1927, 1932, 1933), many mathematical models have been put forward to study different infectious diseases. Age heterogeneity is an important factor that cannot be ignored in the mathematical modeling of infectious diseases. Age structure usually falls into two categories: physiological age and infection age. Introducing age structure allows disease progression to be described by partial differential equations (PDEs), which gives rise to pure PDE models or to models that couple ordinary differential equations (ODEs) with PDEs. In 1974, Hoppensteadt developed and studied for the first time an age-dependent epidemic model with physiological and infection age (Hoppensteadt 1974). Because of the complexity of these issues, many scholars have modeled and studied physiological age-structured models (Ai and Wang 2024; Dai and Zhang 2023; Huang et al. 2022; Kumar and Abbas 2022; Tian and Wang 2020; Yu et al. 2023) and infection age-structured models (Duan et al. 2014; Guo et al. 2021, 2023; Wang et al. 2017; Li et al. 2017; Liu and Liu 2017; Sun et al. 2021; Wang et al. 2016; Rodrigue Yves et al. 2018) separately.

In particular, Duan et al. (2014) regarded the vaccination period and the latent period as two important factors affecting disease dynamics and, for this purpose, established an SVEIR epidemic model with ages of vaccination and latency. They derived the basic reproduction number and proved the global stability of the disease-free equilibrium and the endemic equilibrium. Building on Duan et al. (2014), Wang et al. (2017) considered the age at relapse, the possibility that vaccinated individuals can be infected, and incomplete vaccination of newborns. Li et al. (2017) also built on Duan et al. (2014) by considering the age at relapse, the disease-induced death rate of infected individuals and a nonlinear incidence rate that generalizes the linear and saturation incidence rates already available. Liu and Liu (2017) studied the global stability of an SVEIR epidemic model with age-dependent waning immunity, latency and relapse, and gave sharp threshold properties for the global asymptotic stability of equilibria. Wang et al. (2016) investigated the global dynamics of an age-structured SVEIR epidemic model with media influence, incomplete vaccination of newborns and nonlinear incidence; their analysis showed that introducing media-induced nonlinear incidence and age-dependent vaccination and latency had no fundamental effect on the qualitative behavior of the model. Rodrigue Yves et al. (2018) proposed an SVEIR model with age-dependent vaccination, infection and latency, and considered a more significant incidence rate in which transmission occurs both through infective individuals of the same age and through infective individuals of any age. Sun et al. (2021) supported the results of Li et al. (2017) by taking into account the age of infection. Recently, Guo et al. (2021, 2023) studied two age-structured tuberculosis (TB) models and proved the global asymptotic stability of the TB-free equilibrium and the endemic equilibrium, respectively.

In addition, optimal control theory seeks control strategies that minimize or maximize an objective function, and it opens up new ideas for the study of infectious diseases and the formulation of control strategies. The study of optimal control is an important part of infectious disease modeling: it guides public health work and disease prevention and control, and it can help control the spread of disease effectively while reducing control costs. Revelle et al. (1967) applied optimal control theory to disease control for the first time, seeking an optimal control strategy that minimizes both the prevalence of the infectious disease and the cost of control. Since then, many scholars have studied optimal control models for various infectious diseases using control strategies such as vaccination, treatment and isolation; see (Guo et al. 2022; Li et al. 2020; Mohammed Awel et al. 2021; Wang and Nie 2023; Yang et al. 2019, 2023).

Recently, Khan and Zaman (2018) studied an optimal control strategy for an SEIR epidemic model with continuous age structure in the exposed and infected classes. Using vaccination as the control variable, they applied optimal control theory to minimize the numbers of susceptible, exposed and infected individuals and thereby reduce the rate of infection. They established the objective function and derived the form of the optimal control using the adjoint and sensitivity systems. For the numerical simulation of the optimality system, they adopted the forward-backward sweep method to solve the state and adjoint systems. Duan et al. (2020) discussed an optimal control problem with enhanced vaccination and improved treatment as control strategies; the results showed that both the age of vaccination and the recovery age appear in the expressions of the optimal control variables and affect the efficiency of the optimal control. Guo et al. (2024) developed an age-structured TB model with vaccination, treatment and relapse, and investigated the stability of the model equilibria and the optimal control problem. Their findings indicate how to use limited resources effectively to reduce the spread of TB.

The above literature has explored the dynamic behavior and optimal control of epidemic models. However, each study overlooks, to varying degrees, the impact of some factors on disease transmission, and few papers consider all of these factors together in a single epidemic model. To this end, we develop a new age-structured SVEIR model coupling PDEs and ODEs, focusing on the following four aspects: (1) the effects of the nonlinear incidence rate, waning immunity, incomplete treatment of disease, and relapse on the global stability of the model; (2) the global stability of the disease-free equilibrium and the endemic equilibrium. It is worth noting that the model contains multiple return terms, so the derivative of the Volterra-type Lyapunov function with respect to time t still contains various Volterra-type functions, and some construction and computational skill is required in the proof; (3) parameter fitting using real data; and (4) the existence of an optimal control. Proving the properties of the model and constructing the Lyapunov function are the main challenges.

The rest of this paper is organized as follows. The next section formulates a new age-structured SVEIR epidemic model with the nonlinear incidence rate, waning immunity, incomplete treatment and relapse. In Sect. 3, we give the properties of the model and the basic reproduction number \({\mathcal {R}}_0\). Section 4 contains our main results on the asymptotic smoothness, the uniform persistence, the existence of a global attractor and the global stability of equilibria. In Sect. 5, the parameters are fitted using actual data to obtain the initial values of the model parameters, and a sensitivity analysis is conducted. In Sect. 6, we formulate an optimal control problem involving two types of control measures, enhancing treatment and controlling relapse, and apply Gâteaux derivatives to obtain the necessary conditions for optimal control. A brief discussion is given in Sect. 7.

2 Model formulation

We divide the total population N(t) at time t into five classes: susceptible S(t), vaccinated V(t), latent E(t), infectious I(t) and recovered R(t). Let S(t) and I(t) denote the densities of susceptible and infectious individuals at time t, let \(v(\theta ,t)\) denote the density of vaccinated individuals at time t with vaccination age \(\theta \), let e(a,t) denote the density of latent individuals at time t with latency age a, and let r(w,t) denote the density of recovered individuals at time t with relapse age w.

We assume that the positive constants \(\Lambda \), \(\alpha \) and \(\mu \) are the recruitment rate, the vaccination rate of susceptible individuals, and the natural death rate of the population, respectively. \(\delta _1\) and \(\delta _2\) represent the disease-related mortality rates of latent and infectious individuals, respectively, and \(\sigma _1\) and \(\sigma _2\) denote the treatment rate and the incomplete treatment rate of the disease. Denote the infectious force function by \(\beta S(t)f(I(t))\), where \(\beta \) is the probability of infection. The vaccine-induced immunity wanes at the rate \(h(\theta )\) and the total number of vaccinated individuals at time t is \(V(t)=\int _0^\infty v(\theta ,t)\textrm{d}\theta \), thus the total number of individuals whose immunity wanes and who return alive to the susceptible class reads \(\int _0^\infty h(\theta )v(\theta ,t)\textrm{d}\theta \).

Similarly, the vaccine efficacy is given by the function \(\sigma (\theta )\in [0,1]\), thus the total number of vaccinated individuals who progress into the latent class alive reads \(\int _0^\infty \beta f(I(t))\sigma (\theta ) v(\theta ,t) \textrm{d}\theta \). The removal rate from the latent class is given by the function k(a), thus the total number of latent individuals who progress into the infectious class alive reads \(\int _0^\infty k(a)e(a,t)\textrm{d}a\). The relapse rate in the recovered class is given by the function \(\gamma (w)\), thus the total number of recovered individuals who progress into the infectious class alive reads \(\int _0^\infty \gamma (w)r(w,t)\textrm{d}w\). Our model is described by the diagram in Fig. 1:

Fig. 1 Diagram of model transmission mechanism, where [hV], \([\beta \sigma f(I(t))V]\), [kE] and \([\gamma R]\) represent \(\int _0^\infty h(\theta )v(\theta ,t)\textrm{d}\theta \), \(\int _0^\infty \beta \sigma (\theta )f(I(t))v(\theta ,t)\textrm{d}\theta \), \(\int _0^\infty k(a)e(a,t)\textrm{d}a\) and \(\int _0^\infty \gamma (w)r(w,t)\textrm{d}w\), respectively

From Fig. 1, we can establish the following coupled system of ODEs and PDEs:

$$\begin{aligned} \left\{ \begin{array}{l} \frac{{\text {d}S(t)}}{{\text {d}t}} = \Lambda - (\mu +\alpha ) S(t) - \beta S(t)f(I(t))+ \int _0^\infty {h (\theta )v(\theta ,t)\text {d}\theta },\\ \frac{{\partial v(\theta ,t)}}{{\partial \theta }} + \frac{{\partial v(\theta ,t)}}{{\partial t}} = -\Big (h(\theta )+\mu +\beta \sigma (\theta )f(I(t))\Big )v(\theta ,t),\\ \frac{{\partial e(a,t)}}{{\partial a}} + \frac{{\partial e(a,t)}}{{\partial t}} = - \big (k(a)+\mu +\delta _1\big )e(a,t), \\ \frac{{\text {d}I(t)}}{{\text {d}t}} = \int _0^\infty {k(a)e(a,t)\text {d}a} + \int _0^\infty {\gamma (w)r(w,t)\text {d}w}-(\mu +\delta _2+\sigma _1+\sigma _2)I(t), \\ \frac{{\partial r(w,t)}}{{\partial w}} + \frac{{\partial r(w,t)}}{{\partial t}} = - \big (\mu + \gamma (w)\big )r(w,t), \end{array} \right. \end{aligned}$$
(1)

with the initial conditions

$$\begin{aligned} \begin{array}{l} S(0)=S_0,~ v\left( \theta ,0 \right) =v_0(\theta ),~ e\left( a,0 \right) =e_0(a),~ I(0)=I_0,~ r\left( w,0 \right) =r_0(w), \end{array}\nonumber \\ \end{aligned}$$
(2)

and the boundary conditions

$$\begin{aligned} \begin{array}{l} v\left( 0,t\right) =\alpha S(t),\\ e\left( 0,t \right) =\beta S(t)f(I(t))+\int _0^\infty {\beta \sigma (\theta )f(I(t))v(\theta ,t)\text {d}\theta }+\sigma _2I(t),\\ r\left( 0,t \right) =\sigma _1I(t), \end{array} \end{aligned}$$
(3)

Here \(S_0\), \(I_0 \in {\mathbb {R}}_ +\), and \(v_0(\theta )\), \(e_0(a)\), \( r_0(w) \in L_+^1\), where \(L_+^1\) denotes the set of nonnegative Lebesgue integrable functions from \((0,\infty )\) to \({\mathbb {R}}_+\). All the parameters are positive. We make the following assumptions about the parameters of the system (1):

(A1) \(\theta \in [0,\Theta ]\), \(a \in [0,A]\) and \(w \in [0,W]\), where \(\Theta \), A and W are the maximum ages of vaccination, latency and relapse, respectively. If \(\Theta =\infty \), \(A=\infty \) and \(W=\infty \), suppose that \(v(\theta ,t)=0\), \(e(a,t)=0\) and \(r(w,t)=0\) for all sufficiently large \(\theta \), a and w, respectively.

(A2) \(h(\theta )\), \(\sigma (\theta )\), k(a), \(\gamma (w) \in L_+^1\) are bounded with the essential upper bounds \({\bar{h}}\), \( {\bar{\sigma }}\), \({\bar{k}}\), \({\bar{\gamma }}\) for \(\theta \ge 0\), \(a\ge 0\), \(w\ge 0\), and Lipschitz continuous with Lipschitz coefficients \(M_h\), \(M_\sigma \), \(M_k\), \(M_\gamma \), respectively.

(A3) There exists a positive constant \(\mu _0\in (0,\mu ]\), such that \(h(\theta )\), \(\sigma (\theta )\), k(a), \(\gamma (w)\ge \mu _0\) for \(\theta \ge 0\), \(a\ge 0\), \(w\ge 0\), respectively.

(A4) Function f(I) is nonnegative and twice differentiable for all \(I\ge 0\), \(f(I)\ge 0\) with equality if and only if \(I=0\), and \(f'(I)\ge 0\), \(f''(I)\le 0\).

Remark 1

If f(I) satisfies assumption (A4), then \(f'(I)I \le f(I)\le f'(0)I\) for \(I\ge 0\). This follows from \(f(0)=0\) and the concavity of f.

Remark 2

Examples of functions f(I) satisfying assumption (A4) include \(f(I)=I\), \(f(I)=\frac{I}{1+\alpha I}\), where \(\alpha >0\), and \(f(I)=\Big (\beta -\frac{\beta _1I}{m+I}\Big )I\), where \(\beta >\beta _1\), etc.
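The inequality of Remark 1 can be checked numerically for the three incidence functions of Remark 2. The following is only a minimal sketch: the names `a`, `b`, `b1` and `m` stand for \(\alpha \), \(\beta \), \(\beta _1\) and m in Remark 2, their values are illustrative assumptions, and the derivative is approximated by a finite difference rather than computed exactly.

```python
# Numerical check of f'(I) I <= f(I) <= f'(0) I (Remark 1) for the
# incidence functions listed in Remark 2.
# The values of a, b, b1 and m are illustrative assumptions.
a, b, b1, m = 0.5, 1.0, 0.4, 2.0


def f_linear(I):
    return I


def f_saturated(I):
    return I / (1.0 + a * I)


def f_media(I):
    return (b - b1 * I / (m + I)) * I


def derivative(f, I, step=1e-6):
    """Central finite-difference approximation of f'(I)."""
    return (f(I + step) - f(I - step)) / (2.0 * step)


def satisfies_remark1(f, grid, tol=1e-5):
    """True if f'(I) I <= f(I) <= f'(0) I holds (up to tol) on the grid."""
    slope0 = derivative(f, 0.0)
    return all(
        derivative(f, I) * I <= f(I) + tol and f(I) <= slope0 * I + tol
        for I in grid
    )


grid = [0.1 * j for j in range(201)]  # sample points I in [0, 20]
results = {
    f.__name__: satisfies_remark1(f, grid)
    for f in (f_linear, f_saturated, f_media)
}
```

A convex function such as \(f(I)=I^2\) violates (A4), and `satisfies_remark1` rejects it on the same grid.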

3 Preliminaries

In this section, some preliminary results are presented to establish our main conclusions.

3.1 Notations

In order to simplify the later derivation, we use the following notation:

$$\begin{aligned} \begin{array}{ll} n(r,t)=h(r)+\mu +\beta \sigma (r)f(I(t-\theta +r)),&{}\eta (\theta ,t)={e^{-\int _0^\theta {n(r,t)\text {d}r}}},\\ n(\theta )=h(\theta )+\mu ,&{}\eta (\theta )={e^{-\int _0^\theta {n(r)\text {d}r}}},\\ \varepsilon (a)=k(a)+\mu +\delta _1,&{}\Omega _1(a)={e^{-\int _0^a {\varepsilon (r)\text {d}r}}},\\ u(w)=\mu +\gamma (w),&{}\Omega _2(w)={e^{-\int _0^w {u(r)\text {d}r}}},\\ \zeta _1(a)=\int _a^\infty k(s)e^{-\int _a^s \varepsilon (r)\text {d}r}\text {d}s,&{}\theta _1=\int _0^{\infty }\gamma (w)\Omega _2(w)\text {d}w,\\ \zeta _2(w)=\int _w^\infty \gamma (s){e^{-\int _w^s {u(r)\text {d}r}}}\text {d}s,&{}\theta _2=\int _0^{\infty }k(a)\Omega _1(a)\text {d}a. \end{array} \end{aligned}$$
(4)

3.2 Volterra formulation

Following the Volterra formulation (Webb 1985), we solve for \(v(\theta ,t)\), e(a,t) and r(w,t) in system (1) along the characteristic lines \(t-\theta \) = constant, \(t-a\) = constant and \(t-w\) = constant, respectively, and obtain

$$\begin{aligned} \begin{array}{ll} v\left( \theta ,t \right) ={\left\{ \begin{array}{ll} v(0,t-\theta )\textrm{exp}\left( -\int _0^\theta n(r,t)\textrm{d}r\right) =v(0,t-\theta )\eta (\theta ,t),\quad ~~&{} t\ge \theta \ge 0,\\ v_0(\theta -t)\textrm{exp}\left( -\int _{\theta -t}^\theta n(r,t)\textrm{d}r\right) =v_0(\theta -t)\frac{\eta (\theta ,t)}{\eta (\theta -t,t)}, \quad &{}\theta>t\ge 0.\\ \end{array}\right. }\\ e\left( a,t \right) ={\left\{ \begin{array}{ll} e(0,t-a)\textrm{exp}\Big (-\int _0^a \varepsilon (r)\textrm{d}r\Big )=e(0,t-a)\Omega _1(a),\quad ~~~&{} t\ge a \ge 0,\\ e_0(a-t)\textrm{exp}\left( -\int _{a-t}^a \varepsilon (r)\textrm{d}r\right) =e_0(a -t)\frac{\Omega _1(a)}{\Omega _1(a-t)}, \quad &{}a> t\ge 0.\\ \end{array}\right. }\\ r\left( w,t \right) ={\left\{ \begin{array}{ll} r(0,t-w)\textrm{exp}\Big (-\int _0^w u(r)\textrm{d}r\Big )=r(0,t-w)\Omega _2(w),\quad ~&{}t\ge w \ge 0,\\ r_0(w-t)\textrm{exp}\left( -\int _{w-t}^w u(r)\textrm{d}r\right) =r_0(w -t)\frac{\Omega _2(w)}{\Omega _2(w-t)}, \quad &{}w > t\ge 0. \end{array}\right. } \end{array} \end{aligned}$$
(5)
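
For instance, the formula for e(a,t) in (5) can be checked as follows. This is a sketch of the standard characteristics argument; the auxiliary symbols \(a_0\), \(t_0\), s and \({\hat{e}}\) are introduced here only for illustration. Fix \((a_0,t_0)\) and set \({\hat{e}}(s)=e(a_0+s,t_0+s)\). By the third equation of system (1),

$$\begin{aligned} \frac{\textrm{d}{\hat{e}}(s)}{\textrm{d}s}=\frac{\partial e}{\partial a}+\frac{\partial e}{\partial t}=-\varepsilon (a_0+s){\hat{e}}(s),\quad \text {so}\quad {\hat{e}}(s)={\hat{e}}(0)\exp \left( -\int _{a_0}^{a_0+s}\varepsilon (r)\textrm{d}r\right) . \end{aligned}$$

Taking \((a_0,t_0,s)=(0,t-a,a)\) for \(t\ge a\) gives \(e(a,t)=e(0,t-a)\Omega _1(a)\), and taking \((a_0,t_0,s)=(a-t,0,t)\) for \(a>t\) gives \(e(a,t)=e_0(a-t)\frac{\Omega _1(a)}{\Omega _1(a-t)}\). The expressions for \(v(\theta ,t)\) and r(w,t) follow in the same way.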

3.3 State space and well-posedness

The state space X for system (1) is defined by

$$\begin{aligned} {\textbf {X}}:= {{\mathbb {R}}_ + } \times L_ + ^1\times L_ + ^1 \times {{\mathbb {R}}_ +} \times L_ + ^1, \end{aligned}$$

equipped with the norm

$$\begin{aligned} \Vert (x_1, x_2, x_3, x_4, x_5)\Vert _{\textbf {X}}&=|x_1|+\int _0^\infty |x_2(\theta )|\text {d}\theta +\int _0^\infty |x_3(a)|\text {d}a+|x_4|\\&\quad +\int _0^\infty |x_5(w)|\text {d}w. \end{aligned}$$

The initial conditions (2) of the system (1) can be rewritten as \(x_0=\big (S_0,v_0(\cdot ),e_0(\cdot ),I_0,r_0(\cdot )\big )\in {\textbf {X}}\). Then, by the standard theory of functional differential equations (Hale 1971), system (1) with initial condition \(x_0\) has a unique nonnegative solution for all \(t\ge 0\). Thus, we obtain a continuous solution semi-flow \(\Phi :{\mathbb {R}}_+ \times {\textbf {X}}\rightarrow {\textbf {X}}\) associated with the system (1),

$$\begin{aligned} \Phi (t,{x_0}) = \big (S(t),v(\cdot ,t),e(\cdot ,t),I(t), r(\cdot ,t)\big ), ~\text {for} ~t\ge 0, x_0\in {\textbf {X}}, \end{aligned}$$

with the norm

$$\begin{aligned} \big \Vert \Phi (t,{x_0})\big \Vert _{\textbf {X }}&= \big \Vert \big (S(t), v(\cdot ,t), e(\cdot ,t ), I(t), r(\cdot ,t)\big )\big \Vert _{\textbf {X}} \nonumber \\&=|S(t)|+\int _0^\infty |v(\theta ,t)|\text {d}\theta +\int _0^\infty |e(a,t)|\text {d}a+|I(t)|\nonumber \\&\quad +\int _0^\infty |r(w,t)|\text {d}w. \end{aligned}$$
(6)

Set

$$\begin{aligned} \Omega&:=\Bigg \{\big (S(t), v(\cdot ,t), e(\cdot ,t), I(t), r(\cdot ,t)\big )\in {\textbf{X}}:S(t) + \int _0^\infty v(\theta ,t)\text {d}\theta \\&\qquad +\int _0^\infty e(a,t)\text {d}a+I(t)+\int _0^\infty r(w,t)\text {d}w \le \frac{\Lambda }{\mu }\,\Bigg \}. \end{aligned}$$

We have the following lemma.

Lemma 1

For system (1), we have

(i) \(\Omega \) is positively invariant for \(\Phi \), i.e., \(\Phi (t,x_0) \in \Omega \), \(\forall t\ge 0\), \(x_0 \in \Omega \);

(ii) \(\Phi \) is point dissipative and \(\Omega \) attracts each point in \({\textbf{X}}\);

(iii) There exist constants \(\varsigma _1\), \(\varsigma _2 > 0\), such that \(\lim \limits _{t \rightarrow \infty } \mathop {\inf } S(t)\ge \varsigma _1\), \(\lim \limits _{t \rightarrow \infty }\mathop {\inf } \int _0^\infty v(\theta ,t)\textrm{d}\theta \ge \varsigma _2.\)

Proof

From (6), we have

$$\begin{aligned} \frac{\textrm{d}}{{\textrm{d}t}}{\big \Vert {{\Phi }({t,x_0})} \big \Vert _{\textbf{X}}}= & {} \frac{{\textrm{d}S(t)}}{{\textrm{d}t}} + \frac{\textrm{d}}{{\textrm{d}t}}\int _0^\infty {v(\theta ,t)\textrm{d}\theta } + \frac{\textrm{d}}{{\textrm{d}t}}\int _0^\infty {e(a,t)\textrm{d}a} + \frac{{\textrm{d}I(t)}}{{\textrm{d}t}} \\{} & {} + \frac{\textrm{d}}{\textrm{d}t}\int _0^\infty {r(w,t)\textrm{d}w}. \end{aligned}$$

From the second equation in the system (1), it follows that

$$\begin{aligned} \frac{{\partial v(\theta ,t)}}{{\partial t}} =-\frac{{\partial v(\theta ,t)}}{{\partial \theta }} -\Big (h(\theta )+\mu +\beta \sigma (\theta )f(I(t))\Big )v(\theta ,t). \end{aligned}$$

So

$$\begin{aligned} \frac{\text {d} }{\text {d}t}\int _0^\infty {v(\theta , t)\text {d}\theta }&=\int _0^\infty \bigg [-\frac{\partial v(\theta ,t)}{\partial \theta }-\Big (h(\theta )+\mu +\beta \sigma (\theta )f(I(t))\Big )v(\theta ,t)\bigg ]\textrm{d}\theta \nonumber \\&=-v(\theta ,t)\big |_0^\infty -\int _0^\infty n(\theta ,t)v(\theta ,t)\textrm{d}\theta \nonumber \\&=\alpha S(t)-\int _0^\infty n(\theta ,t)v(\theta ,t)\textrm{d}\theta . \end{aligned}$$
(7)

Similarly

$$\begin{aligned} \frac{\text {d}}{\text {d}t}\int _0^\infty e(a,t)\text {d}a&= \beta S(t)f(I(t))+\int _0^\infty \beta \sigma (\theta )f(I(t))v(\theta ,t)\text {d}\theta +\sigma _2I(t)\nonumber \\&\quad - \int _0^\infty {\varepsilon (a)e(a, t)\text {d}a}, \end{aligned}$$
(8)
$$\begin{aligned} \frac{\text {d}}{\text {d}t}\int _0^\infty r(w,t)\text {d}w&= \sigma _1I(t) - \int _0^\infty {u(w)r(w, t)\text {d}w}. \end{aligned}$$
(9)

Combining (7)-(9) with the first and fourth equations of the system (1) yields

$$\begin{aligned}{} & {} \frac{\text {d}}{{\text {d}t}}\left( S(t)+\int _0^\infty v(\theta ,t)\text {d}\theta +\int _0^\infty e(a,t)\text {d}a+I(t)+\int _0^\infty r(w,t)\text {d}w\right) \\{} & {} \quad \quad \le \Lambda - \mu \left( S(t) + \int _0^\infty {v(\theta ,t)\text {d}\theta } + \int _0^\infty {e(a,t)\text {d}a} + I(t) + \int _0^\infty {r(w,t)\text {d}w} \right) . \end{aligned}$$

Further, we get

$$\begin{aligned} ||\Phi (t,x_0)||_{\textbf {X}}\le \frac{\Lambda }{\mu }- e^{ - \mu t}\left( \frac{\Lambda }{\mu } - ||x_0||_{\textbf {X}}\right) ,~ \forall t\ge 0. \end{aligned}$$

Thus, for any \(x_0 \in \Omega \), the solution of the system (1) satisfies \(\Phi (t,x_0) \in \Omega \) for all \(t\ge 0\), so \(\Omega \) is positively invariant for \(\Phi \). The same estimate gives \(\mathop {\lim }\limits _{t\rightarrow \infty }\sup \left\| {{\Phi }({t,x_0})} \right\| _{\textbf {X}} \le \frac{\Lambda }{\mu }\) for any \(x_0 \in {\textbf {X}}\). Therefore, \(\Phi \) is point dissipative and \(\Omega \) attracts all points in X, i.e., Lemma 1 (i) and (ii) hold.

We now prove (iii) of Lemma 1. Since \(I(t)\le \frac{\Lambda }{\mu }\), \(f(I)\le f'(0)I\) and the integral term is nonnegative, it follows from the first equation of the system (1) that

$$\begin{aligned} \frac{\text {d}}{{\text {d}t}}S(t)\ge \Lambda - (\mu +\alpha ) S(t)-\beta S(t)f'(0) \frac{\Lambda }{\mu }. \end{aligned}$$

Then we have that

$$\begin{aligned} \lim _{t\rightarrow \infty } \inf S(t) \ge \frac{\Lambda }{\mu +\alpha +\beta f'(0)\frac{\Lambda }{\mu }}:=\varsigma _1. \end{aligned}$$

According to assumption (A2), using the second equation of the system (1) yields

$$\begin{aligned} \frac{\text {d}}{{\text {d}t}}\int _0^\infty {v(\theta ,t)\text {d}\theta }&=-\int _0^\infty \left[ \frac{\partial v(\theta ,t)}{\partial \theta }+\big (h(\theta )+\mu +\beta \sigma (\theta )f(I(t))\big )v(\theta ,t)\right] \text {d}\theta \nonumber \\&\ge -v(\theta ,t)\big |_0^\infty -\left( {\overline{h}}+\mu +\beta {\overline{\sigma }} f'(0) \frac{\Lambda }{\mu }\right) \int _0^\infty v(\theta , t)\text {d}\theta \nonumber \\&=\alpha S(t) - \left( {\overline{h}}+\mu +\beta {\overline{\sigma }} f'(0)\frac{\Lambda }{\mu }\right) \int _0^\infty {v(\theta , t)\text {d}\theta }.\, \end{aligned}$$

Here we have used \(v(\theta ,t)|_{\theta =\infty }=0\), which follows from assumption (A1). Together with the lower bound \(\varsigma _1\) for S(t), this gives

$$\begin{aligned} \lim _{t\rightarrow \infty } \inf \int _0^\infty {v(\theta ,t)\textrm{d}\theta } \ge \frac{\alpha \varsigma _1}{{\overline{h}}+\mu +\beta {\overline{\sigma }} f'(0)\frac{\Lambda }{\mu }}:=\varsigma _2. \end{aligned}$$

This completes the proof. \(\square \)
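
As an informal numerical illustration of Lemma 1 (not part of the proof), one can take all age-dependent rates to be constant. This is an assumption made only for this sketch: integrating the PDEs of system (1) over age, as in (7)-(9), then gives a closed ODE system for S, \(V=\int v\), \(E=\int e\), I and \(R=\int r\). The parameter values, the choice \(f(I)=I/(1+aI)\) and the Runge-Kutta integrator below are likewise illustrative and are not the fitted values or the numerical scheme of this paper.

```python
import math

# Illustrative constant rates (assumptions, not fitted values).
P = dict(Lambda=10.0, mu=0.1, alpha=0.2, beta=0.01, a=0.1, h=0.05, sigma=0.3,
         k=0.2, gamma=0.05, delta1=0.01, delta2=0.02, sigma1=0.3, sigma2=0.1)


def rhs(y, p):
    """Right-hand side of the age-integrated system with constant rates."""
    S, V, E, I, R = y
    fI = I / (1.0 + p["a"] * I)              # saturated incidence f(I)
    inf_S = p["beta"] * S * fI               # new infections from S
    inf_V = p["beta"] * p["sigma"] * fI * V  # new infections from V
    dS = p["Lambda"] - (p["mu"] + p["alpha"]) * S - inf_S + p["h"] * V
    dV = p["alpha"] * S - (p["h"] + p["mu"]) * V - inf_V
    dE = inf_S + inf_V + p["sigma2"] * I - (p["k"] + p["mu"] + p["delta1"]) * E
    dI = (p["k"] * E + p["gamma"] * R
          - (p["mu"] + p["delta2"] + p["sigma1"] + p["sigma2"]) * I)
    dR = p["sigma1"] * I - (p["mu"] + p["gamma"]) * R
    return [dS, dV, dE, dI, dR]


def rk4_step(y, dt, p):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(y, p)
    k2 = rhs([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)], p)
    k3 = rhs([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)], p)
    k4 = rhs([yi + dt * ki for yi, ki in zip(y, k3)], p)
    return [yi + dt * (c1 + 2.0 * c2 + 2.0 * c3 + c4) / 6.0
            for yi, c1, c2, c3, c4 in zip(y, k1, k2, k3, k4)]


def simulate(y0, p, t_end=200.0, dt=0.01):
    """Return a list of (t, state) pairs on a uniform time grid."""
    y = list(y0)
    trajectory = [(0.0, y)]
    for n in range(1, int(round(t_end / dt)) + 1):
        y = rk4_step(y, dt, p)
        trajectory.append((n * dt, y))
    return trajectory


def lemma1_bound(t, n0, p):
    """Upper bound Lambda/mu - exp(-mu t) (Lambda/mu - ||x_0||) from the proof."""
    cap = p["Lambda"] / p["mu"]
    return cap - math.exp(-p["mu"] * t) * (cap - n0)


y0 = [60.0, 10.0, 5.0, 5.0, 0.0]  # S, V, E, I, R with total 80 <= Lambda/mu
trajectory = simulate(y0, P)
bound_ok = all(sum(y) <= lemma1_bound(t, sum(y0), P) + 1e-6
               for t, y in trajectory)
```

Here `bound_ok` records whether the simulated total population stays below the exponential bound derived in the proof of Lemma 1, up to a small numerical tolerance.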

The following proposition directly follows from Lemma 1.

Proposition 1

If \( x_0 \in {\textbf{X}}\) and \(\Vert x_0\Vert _{\textbf{X}} \le M\) for some constant \(M \ge \frac{\Lambda }{\mu }\), then the following statements hold true for \(t \ge 0\):

(i) \( 0\le S(t)\), \(\int _0^\infty v(\theta ,t)\textrm{d}\theta \), \( \int _0^\infty e(a,t)\textrm{d}a\), I(t), \(\int _0^\infty r(w,t)\textrm{d}w\le M;\)

(ii) \(v(0,t)\le \alpha M\), \(e(0,t)\le \beta f'(0)M^2+\int _0^\infty \beta f'(0) M^2 \sigma (\theta )\textrm{d}\theta +\sigma _2M\), \(r(0,t)\le \sigma _1 M.\)

3.4 Equilibria and the basic reproduction number

In this subsection, we consider the equilibria and the explicit expression of the basic reproduction number of the system (1).

Theorem 1

System (1) always has a disease-free equilibrium \(E^0=\big (S^0,v^0(\theta ),e^0(a),I^0,r^0(w)\big )\), and it has a unique endemic equilibrium \(E^*=\big (S^*,v^*(\theta ),e^*(a),I^*,r^*(w)\big )\) if and only if \({\mathcal {R}}_0>1\).

Proof

Obviously, the system (1) always has a disease-free equilibrium \(E^0=\big (S^0,v^0(\theta ),e^0(a),I^0,r^0(w)\big )\), which satisfies

$$\begin{aligned} S^0=\frac{\Lambda }{\mu +\alpha -\alpha \int _0^\infty h(\theta )\eta (\theta )\textrm{d}\theta },~ v^0(\theta )=\alpha S^0\eta (\theta ),~ e^0(a)=0,~ I^0=0,~ r^0(w)=0. \end{aligned}$$

To find the endemic equilibrium of the system (1), we first define the basic reproduction number (Diekmann et al. 1990) as

$$\begin{aligned} {\mathcal {R}}_0=\frac{\beta f'(0)\theta _2\big (S^0+\int _0^\infty {\sigma (\theta )v^0(\theta )\textrm{d}\theta }\big )}{\varpi -\sigma _1\theta _1-\sigma _2\theta _2}, \end{aligned}$$
(10)

where \(\varpi =\mu +\delta _2+\sigma _1+\sigma _2\). Here \(\frac{1}{\varpi }\) is the average time that an infectious individual spends in the infectious class on the first pass, \(\theta _2\) is the probability that a latent individual survives the latent class and becomes infectious, and \(\theta _1\) is the probability that a recovered individual survives the recovered class and relapses into the infectious class. Thus the total average time in the infectious class is calculated by

$$\begin{aligned} \frac{1}{\varpi }\left[ 1+\frac{\sigma _1 \theta _1+\sigma _2 \theta _2}{\varpi }+\left( \frac{\sigma _1 \theta _1+\sigma _2 \theta _2}{\varpi }\right) ^2+\cdots \right] =\frac{1}{\varpi -\sigma _1 \theta _1-\sigma _2 \theta _2}. \end{aligned}$$

Hence, \(\frac{\beta f'(0) \theta _2 S^0}{\varpi -\sigma _1 \theta _1-\sigma _2 \theta _2}\) and \(\frac{\beta f'(0)\theta _2 \int _0^\infty {\sigma (\theta )v^0(\theta )\textrm{d}\theta }}{\varpi -\sigma _1 \theta _1-\sigma _2 \theta _2}\) represent the numbers of susceptible and vaccinated individuals that can be infected by an infectious individual, respectively. Thus \({\mathcal {R}}_0\) represents the average number of new infections produced by an infectious individual over the entire infectious period, and it is a crucial threshold parameter for the global stability of equilibria.
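
To make formula (10) concrete, the following sketch evaluates \({\mathcal {R}}_0\) by numerical quadrature for user-supplied rate functions. It is an illustration under stated assumptions: the function names, the truncation of the infinite age integrals at `a_max`, the trapezoid rule and the constant rates in the usage lines are all chosen here for illustration and are not the fitted values of this paper.

```python
import math


def survival(rate, ages):
    """exp(-int_0^a rate(r) dr) on the grid `ages`, via the trapezoid rule."""
    values, acc = [1.0], 0.0
    for a0, a1 in zip(ages, ages[1:]):
        acc += 0.5 * (rate(a0) + rate(a1)) * (a1 - a0)
        values.append(math.exp(-acc))
    return values


def trapz(values, ages):
    """Trapezoid-rule integral of sampled values over the grid `ages`."""
    return sum(0.5 * (v0 + v1) * (a1 - a0)
               for v0, v1, a0, a1 in zip(values, values[1:], ages, ages[1:]))


def basic_reproduction_number(p, h, sigma, k, gamma, f_prime0,
                              a_max=400.0, n=40000):
    """Evaluate formula (10); integrals over [0, inf) are truncated at a_max."""
    ages = [a_max * j / n for j in range(n + 1)]
    eta = survival(lambda r: h(r) + p["mu"], ages)                   # eta(theta)
    omega1 = survival(lambda r: k(r) + p["mu"] + p["delta1"], ages)  # Omega_1(a)
    omega2 = survival(lambda r: gamma(r) + p["mu"], ages)            # Omega_2(w)
    theta1 = trapz([gamma(w) * o for w, o in zip(ages, omega2)], ages)
    theta2 = trapz([k(a) * o for a, o in zip(ages, omega1)], ages)
    waning = trapz([h(th) * e for th, e in zip(ages, eta)], ages)
    S0 = p["Lambda"] / (p["mu"] + p["alpha"] - p["alpha"] * waning)
    sigma_v0 = p["alpha"] * S0 * trapz(
        [sigma(th) * e for th, e in zip(ages, eta)], ages)
    varpi = p["mu"] + p["delta2"] + p["sigma1"] + p["sigma2"]
    return (p["beta"] * f_prime0 * theta2 * (S0 + sigma_v0)
            / (varpi - p["sigma1"] * theta1 - p["sigma2"] * theta2))


# Illustrative constant parameters (assumptions, not fitted values).
params = dict(Lambda=10.0, mu=0.1, alpha=0.2, beta=0.01,
              delta1=0.01, delta2=0.02, sigma1=0.3, sigma2=0.1)
R0 = basic_reproduction_number(
    params,
    h=lambda th: 0.05, sigma=lambda th: 0.3,
    k=lambda a: 0.2, gamma=lambda w: 0.05,
    f_prime0=1.0,  # f'(0) for f(I) = I / (1 + a I)
)
```

For constant rates the integrals have closed forms, for example \(\theta _2=\frac{k}{k+\mu +\delta _1}\) and \(\theta _1=\frac{\gamma }{\mu +\gamma }\), which gives a direct check on the quadrature.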

Note that any endemic equilibrium \(\big (S^*,v^*(\theta ),e^*(a),I^*,r^*(w)\big )\) of system (1) satisfies

$$\begin{aligned} \left\{ \begin{array}{l} 0 = \Lambda - (\mu +\alpha ) {S^*} - \beta S^*f(I^*) +\int _0^\infty {h(\theta )v^*(\theta )\textrm{d}\theta },\\ \frac{{\textrm{d}{v^*}(\theta )}}{\mathrm{{d}\theta }} = -n^*(\theta )v^*(\theta ),\\ \frac{{\textrm{d}{e^*}(a)}}{{\textrm{d}a}} = -\varepsilon (a)e^*(a),\\ 0 = \int _0^\infty {k(a)e^*(a)\textrm{d}a} + \int _0^\infty {\gamma (w)r^*(w)\textrm{d}w}-\varpi I^*,\\ \frac{{\textrm{d}{r^*}(w)}}{{\textrm{d}w}} = - u(w)r^*(w),\\ v^*(0)=\alpha S^*,~ e^*(0)=\beta S^*f(I^*)+\int _0^\infty \beta \sigma (\theta )f(I^*)v^*(\theta )\textrm{d}\theta +\sigma _2I^*,~ r^*(0)=\sigma _1I^*.\\ \end{array} \right. \end{aligned}$$
(11)

where \(n^*(\theta )=h(\theta )+\mu +\beta \sigma (\theta )f(I^*)\). From the second, third and fifth equations of the system (11), we obtain

$$\begin{aligned} \begin{array}{l} v^*(\theta )=\alpha S^* \eta ^*(\theta ),\\ e^*(a)=\Omega _1(a)\big (\beta S^*f(I^*)+\int _0^\infty \beta \sigma (\theta )f(I^*)v^*(\theta )\textrm{d}\theta +\sigma _2I^*\big ),\\ r^*(w)=\sigma _1I^*\Omega _2(w), \end{array} \end{aligned}$$

where \(\eta ^*(\theta )={e^{-\int _0^\theta {n^*(r)\text {d}r}}}\). Substituting \(v^*(\theta )\) into the first equation of the system (11), yields

$$\begin{aligned} S^*=\frac{\Lambda }{\mu +\alpha +\beta f(I^*)-\alpha \int _0^\infty h(\theta )\eta ^*(\theta )\textrm{d}\theta }. \end{aligned}$$

Substituting \(v^*(\theta )\) and \(S^*\) into the fourth equation of the system (11) gives

$$\begin{aligned} 0&=\int _0^\infty k(a)\Omega _1(a)\left( \beta S^*f(I^*)+\int _0^\infty \beta \sigma (\theta )f(I^*)v^*(\theta )\textrm{d}\theta +\sigma _2I^*\right) \textrm{d}a-\varpi I^*\nonumber \\&\quad +\int _0^\infty \gamma (w)\sigma _1I^*\Omega _2(w)\textrm{d}w\nonumber \\&=\int _0^\infty \beta k(a)\Omega _1(a)f(I^*)\left( 1+\alpha \int _0^\infty \sigma (\theta )\eta ^*(\theta )\textrm{d}\theta \right) \nonumber \\&\quad \times \frac{\Lambda }{\mu +\alpha +\beta f(I^*)-\alpha \int _0^\infty h(\theta )\eta ^*(\theta )\textrm{d}\theta }\textrm{d}a -(\varpi -\sigma _1\theta _1-\sigma _2\theta _2)I^*\nonumber \\&=\frac{\beta \theta _2\big (1+\alpha \int _0^\infty {\sigma (\theta )\eta ^*(\theta )\textrm{d}\theta }\big )}{\mu +\alpha +\beta f(I^*)-\alpha \int _0^\infty {h(\theta )\eta ^*(\theta )\textrm{d}\theta }}-\frac{(\varpi -\sigma _1\theta _1-\sigma _2\theta _2)I^*}{\Lambda f(I^*)}. \end{aligned}$$
(12)

Denote the right-hand side of (12) by \(\kappa (I)|_{I=I^*}\). The existence of the endemic equilibrium \(E^*\) then reduces to the existence of a positive root \(I^*\) of \(\kappa (I)=0\), where

$$\begin{aligned} \kappa (I)=\frac{\beta \theta _2\big (1+\alpha \int _0^\infty {\sigma (\theta )\eta (\theta )e^{-\int _0^\theta \beta \sigma (r)f(I)\textrm{d}r}\textrm{d}\theta }\big )}{\mu +\alpha +\beta f(I)-\alpha \int _0^\infty {h(\theta )\eta (\theta )e^{-\int _0^\theta \beta \sigma (r)f(I)\textrm{d}r}\textrm{d}\theta }}-\frac{(\varpi {-}\sigma _1\theta _1{-}\sigma _2\theta _2)I}{\Lambda f(I)}. \end{aligned}$$

Differentiating \(\kappa (I)\) with respect to I gives

$$\begin{aligned} \kappa '(I)=&-\frac{\beta ^2f'(I)\theta _2\big (1+\alpha \int _0^\infty {\sigma (\theta )\eta (\theta )e^{-\int _0^\theta \beta \sigma (r)f(I)\textrm{d}r}\textrm{d}\theta }\big )\big (\mu +\alpha +\beta f(I)+1\big )}{\big (\mu +\alpha +\beta f(I)-\alpha \int _0^\infty {h(\theta )\eta (\theta )e^{-\int _0^\theta \beta \sigma (r)f(I)\textrm{d}r}\textrm{d}\theta }\big )^2}\\&-\frac{(\varpi -\sigma _1\theta _1-\sigma _2\theta _2)}{\Lambda }\cdot \frac{\big (f(I)-f'(I)I\big )}{f^2(I)}. \end{aligned}$$

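By assumption (A4) and Remark 1, \(f'(I)\ge 0\) and \(f(I)-f'(I)I\ge 0\), so \(\kappa '(I)\le 0\), which means that \(\kappa (I)\) monotonically decreases in \(\left( 0,\frac{\Lambda }{\mu }\right) \). Equivalently, as I increases, the numerator of the first fraction in \(\kappa (I)\) does not increase, its denominator does not decrease, and \(\frac{I}{f(I)}\) does not decrease. Moreover,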

$$\begin{aligned} \kappa (0)=\lim _{I\rightarrow 0}\kappa (I)&=\frac{\beta \theta _2\big (1+\alpha \int _0^\infty {\sigma (\theta )\eta (\theta )\textrm{d}\theta }\big )}{\mu +\alpha -\alpha \int _0^\infty {h(\theta )\eta (\theta )\textrm{d}\theta }}-\frac{(\varpi -\sigma _1\theta _1-\sigma _2\theta _2)}{\Lambda f'(0)}\\&=\frac{(\varpi -\sigma _1\theta _1-\sigma _2\theta _2)}{\Lambda f'(0)}({\mathcal {R}}_0-1). \end{aligned}$$

If \({\mathcal {R}}_0 \le 1\), then \(\kappa (0) \le 0\), so \(\kappa (I)=0\) has no positive real root. If \({\mathcal {R}}_0 >1\), then \(\kappa (0) >0\), so \(\kappa (I)=0\) has a unique positive real root \(I^*\). Thus, the system (1) has a unique endemic equilibrium \(E^*=\big (S^*, v^*(\theta ), e^*(a), I^*, r^*(w)\big )\) if and only if \({\mathcal {R}}_0>1\). This completes the proof. \(\square \)
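
The root-finding step in this proof can be illustrated numerically. The sketch below assumes constant rates h, \(\sigma \), k, \(\gamma \) and the saturated incidence \(f(I)=I/(1+aI)\), so that the integrals in \(\kappa (I)\) have closed forms; the parameter values and the function names `kappa` and `endemic_equilibrium` are hypothetical and chosen only for illustration. Bisection is used because \(\kappa \) is decreasing.

```python
# Illustrative constant rates (assumptions, not fitted values).
P = dict(Lambda=10.0, mu=0.1, alpha=0.2, beta=0.01, a=0.1, h=0.05, sigma=0.3,
         k=0.2, gamma=0.05, delta1=0.01, delta2=0.02, sigma1=0.3, sigma2=0.1)


def kappa(I, p):
    """kappa(I) for constant rates and f(I) = I / (1 + a I)."""
    fI = I / (1.0 + p["a"] * I)
    ratio = 1.0 + p["a"] * I                  # equals I / f(I), defined at I = 0
    n_star = p["h"] + p["mu"] + p["beta"] * p["sigma"] * fI
    theta1 = p["gamma"] / (p["mu"] + p["gamma"])
    theta2 = p["k"] / (p["k"] + p["mu"] + p["delta1"])
    varpi = p["mu"] + p["delta2"] + p["sigma1"] + p["sigma2"]
    # int_0^inf sigma * eta^* = sigma / n_star, int_0^inf h * eta^* = h / n_star
    num = p["beta"] * theta2 * (1.0 + p["alpha"] * p["sigma"] / n_star)
    den = (p["mu"] + p["alpha"] + p["beta"] * fI
           - p["alpha"] * p["h"] / n_star)
    loss = varpi - p["sigma1"] * theta1 - p["sigma2"] * theta2
    return num / den - loss * ratio / p["Lambda"]


def endemic_equilibrium(p, tol=1e-12):
    """Return (S*, I*) by bisection on (0, Lambda/mu], or None if kappa(0) <= 0."""
    lo, hi = 0.0, p["Lambda"] / p["mu"]
    if kappa(lo, p) <= 0.0:       # corresponds to R_0 <= 1: no positive root
        return None
    if kappa(hi, p) >= 0.0:
        raise ValueError("no sign change of kappa on (0, Lambda/mu]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kappa(mid, p) > 0.0:
            lo = mid
        else:
            hi = mid
    I_star = 0.5 * (lo + hi)
    fI = I_star / (1.0 + p["a"] * I_star)
    n_star = p["h"] + p["mu"] + p["beta"] * p["sigma"] * fI
    S_star = p["Lambda"] / (p["mu"] + p["alpha"] + p["beta"] * fI
                            - p["alpha"] * p["h"] / n_star)
    return S_star, I_star


equilibrium = endemic_equilibrium(P)
```

With these illustrative values \(\kappa (0)>0\), so a positive root is returned; lowering `beta` far enough makes \(\kappa (0)\le 0\) and the function returns `None`, matching the threshold role of \({\mathcal {R}}_0\).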

4 Main results

In this part, we give some results on the asymptotic smoothness, the uniform persistence and the existence of a global attractor for the semi-flow generated by the system (1). Moreover, the stability of equilibria is captured.

4.1 Asymptotic smoothness of the semi-flow

Asymptotic smoothness of the semi-flow is vital for applying the invariance principle. The following lemma is useful for proving asymptotic smoothness.

Lemma 2

(Hale and Waltman 1989) If the following two conditions hold, then the semi-flow \(\Phi (t,x_0)=\phi (t,x_0)+\varphi (t,x_0):{\mathbb {R}}_+\times {\textbf{X}}\rightarrow {\textbf{X}}\) is asymptotically smooth in \({\textbf{X}}\).

(i) There exists a continuous function \(u: {\mathbb {R}}_+\times {\mathbb {R}}_+\rightarrow {\mathbb {R}}_+\) such that \(\mathop {\lim }\limits _{t\rightarrow \infty }u(t,h)= 0\) and \({\left\| {\phi (t,{x_0})} \right\| _{\textbf{X}}} \le u(t,h)\) if \({\left\| {{x_0}} \right\| _{\textbf{X}}} \le h\);

(ii) For \(t\ge 0,\) \(\varphi (t,x_0)\) is completely continuous.

We can now prove one of the main results.

Theorem 2

The semi-flow \(\Phi (t,x_0)\) is asymptotically smooth.

Following Lemma 2, we verify Theorem 2 in two steps.

Step 1: We need to decompose the semi-flow as \(\Phi (t,x_0)=\phi (t,x_0)+\varphi (t,x_0)\), where \(\phi (t,x_0)=\big (0, v(\cdot ,t), e(\cdot ,t), 0, r(\cdot ,t)\big )\) and \(\varphi (t,x_0)=\big (S(t), {\tilde{v}}(\cdot ,t), {\tilde{e}}(\cdot ,t), I(t), {\tilde{r}}(\cdot ,t)\big ),\) here

$$\begin{aligned} v(\cdot ,t)= & {} {\left\{ \begin{array}{ll} 0,&{} t\ge \theta \ge 0,\\ v(\theta ,t),&{} \theta> t\ge 0. \end{array}\right. } \quad \quad {\tilde{v}}(\cdot ,t)={\left\{ \begin{array}{ll} v(\theta ,t),&{} t\ge \theta \ge 0,\\ 0,&{} \theta > t\ge 0. \end{array}\right. } \end{aligned}$$
(13)
$$\begin{aligned} e(\cdot ,t)= & {} {\left\{ \begin{array}{ll} 0,&{} t\ge a \ge 0,\\ e(a,t),&{} a> t\ge 0. \end{array}\right. } \quad \quad {\tilde{e}}(\cdot ,t)={\left\{ \begin{array}{ll} e(a,t),&{} t\ge a \ge 0,\\ 0,&{} a > t\ge 0. \end{array}\right. } \end{aligned}$$
(14)
$$\begin{aligned} ~~r(\cdot ,t)= & {} {\left\{ \begin{array}{ll} 0,&{} t\ge w \ge 0,\\ r(w,t),&{} w> t\ge 0. \end{array}\right. } \quad \quad {\tilde{r}}(\cdot ,t)={\left\{ \begin{array}{ll} r(w,t),&{} t\ge w \ge 0,\\ 0,&{} w > t\ge 0. \end{array}\right. } \end{aligned}$$
(15)

We need to prove the following proposition to verify that condition (i) of Lemma 2 holds.

Proposition 2

For \(h>0\), let \(u(t,h) = h{e^{ - ({\mu _0}+\mu )t}}\). Then \(\mathop {\lim }\limits _{t \rightarrow \infty } u(t,h) = 0\) and \( {\left\| {\phi (t,{x_0})} \right\| _{\textbf{X}}} \le u(t,h) \) if \( {\left\| {{x_0}} \right\| _{\textbf{X}}} \le h.\)

Proof

Let \(u(t,h)=he^{-(\mu _0+\mu )t}\). Clearly, for any \(h>0\), \(\mathop {\lim }\limits _{t\rightarrow \infty }u(t,h)=0\). For \(x_0 \in \Omega \) and \(\Vert x_0\Vert _{\textbf {X}} \le h\), it follows from (5) and (6) that

$$\begin{aligned} {\left\| {\phi (t,{x_0})} \right\| _{\textbf {X}}}&= |0| + \int _0^\infty |v(\theta ,t)|\textrm{d}\theta +\int _0^\infty {\left| e(a,t)\right| \textrm{d}a}+|0|+\int _0^\infty \left| r(w,t)\right| \textrm{d}w \nonumber \\&=\int _t^\infty {\left| {{v_0}(\theta - t)\frac{{{\eta }(\theta ,t)}}{{{\eta }(\theta - t,t)}}} \right| \textrm{d}\theta } + \int _t^\infty {\left| {{e_0}(a - t)\frac{{{\Omega _1}(a)}}{{{\Omega _1}(a - t)}}} \right| \textrm{d}a}\nonumber \\&\quad +\int _t^\infty {\left| {{r_0}(w - t)\frac{{{\Omega _2}(w)}}{{{\Omega _2}(w - t)}}} \right| \textrm{d}w}\nonumber \\&=\int _0^\infty {\left| {{v_0}(\tau )\frac{{{\eta }(t + \tau ,t)}}{{{\eta }(\tau ,t )}}} \right| \textrm{d}\tau } + \int _0^\infty {\left| {{e_0}(\tau )\frac{{{\Omega _1}(t + \tau )}}{{{\Omega _1}(\tau )}}} \right| \textrm{d}\tau }\nonumber \\&\quad +\int _0^\infty {\left| {{r_0}(\tau )\frac{{{\Omega _2}(t + \tau )}}{{{\Omega _2}(\tau )}}} \right| \textrm{d}\tau }\nonumber \\&=\int _0^\infty {\left| {{v_0}(\tau )\exp \left( { - \int _\tau ^{t + \tau } {n(r,t)\textrm{d}r} } \right) } \right| \textrm{d}\tau }\nonumber \\&\quad + \int _0^\infty {\left| {{e_0}(\tau )\exp \left( { - \int _\tau ^{t + \tau } {\varepsilon (r)\textrm{d}r} } \right) } \right| \textrm{d}\tau }\nonumber \\&\quad + \int _0^\infty {\left| {{r_0}(\tau )\exp \left( { - \int _\tau ^{t + \tau } {u(r)\textrm{d}r} } \right) } \right| \textrm{d}\tau }. \end{aligned}$$

Since \(n(\theta ,t)\ge \mu _0+\mu \), \(\varepsilon (a)\ge \mu _0+\mu \) and \(u(w)\ge \mu _0+\mu \) hold for all \(\theta \), a, \(w\ge 0\), we have

$$\begin{aligned} {\left\| {\phi (t,{x_0})} \right\| _{\textbf{X}}}&\le e^{- (\mu _0+\mu ) t }\left( {|0| + \int _0^\infty \! {|{v_0}(\tau )|\textrm{d}\tau } + \int _0^\infty \! {|{e_0}(\tau )|\textrm{d}\tau } + |0| + \int _0^\infty \! {|{r_0}(\tau )|\textrm{d}\tau } } \right) \nonumber \\&=e^{- (\mu _0+\mu ) t } {\left\| {{x_0}} \right\| _{\textbf{X}}} \nonumber \\&\le he^{- (\mu _0+\mu ) t } = u(t,h),\quad \forall t\ge 0. \end{aligned}$$

This completes the proof. \(\square \)
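The estimate in Proposition 2 can be checked numerically. The following is a minimal sketch, not part of the original analysis: the removal rate `rate`, the initial profile `v0` and the value `DECAY` (standing in for \(\mu _0+\mu \)) are hypothetical choices, and only one age-structured component is treated. It approximates the transported norm by the midpoint rule and compares it with \(he^{-(\mu _0+\mu )t}\).

```python
import math

DECAY = 0.3  # hypothetical value standing in for mu_0 + mu


def rate(r):
    """Hypothetical removal rate with rate(r) >= DECAY for all r."""
    return DECAY + 0.2 * math.sin(r) ** 2


def v0(tau):
    """Hypothetical integrable initial age distribution."""
    return math.exp(-tau)


def transported_norm(t, tau_max=40.0, n_tau=2000, n_r=100):
    """Midpoint-rule approximation of
    int_0^tau_max |v0(tau)| * exp(-int_tau^(t+tau) rate(r) dr) dtau."""
    d_tau = tau_max / n_tau
    d_r = t / n_r
    total = 0.0
    for i in range(n_tau):
        tau = (i + 0.5) * d_tau
        # inner integral of the removal rate over [tau, t + tau]
        inner = sum(rate(tau + (j + 0.5) * d_r) for j in range(n_r)) * d_r
        total += abs(v0(tau)) * math.exp(-inner) * d_tau
    return total


h = transported_norm(0.0)  # approximates the L^1 norm of v0
for t in (0.5, 1.0, 5.0):
    bound = h * math.exp(-DECAY * t)
    print(t, transported_norm(t), "<=", bound)
```

Because every sampled value of `rate` is at least `DECAY`, the discrete inequality holds term by term, mirroring the argument in the proof.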

Step 2: To verify condition (ii) of Lemma 2, we establish the complete continuity of \(\varphi (t,{x_0})\). Proposition 1 states that S(t) and I(t) remain in the compact set [0, M], where \(M \ge \frac{\Lambda }{\mu }\). We need a notion of compactness in \(L_ + ^1\), since it is a component of the infinite-dimensional Banach space \({\textbf{X}}\). Specifically, we need to confirm that \({\tilde{v}}(\theta ,t)\), \({\tilde{e}}(a,t)\) and \({\tilde{r}}(w,t)\) belong to a precompact subset of \(L_+^1\) that is independent of \(x_0\in \Omega \). For this, we need the following result.

Lemma 3

(Adams and Fournier 2003) Let \(K \subset {L^p}(0,\infty )\) be a closed and bounded set where \(p\ge 1\). Then K is compact if the following conditions hold:

  1. (i)

    \(\mathop {\lim }\limits _{h \rightarrow \infty } \int _h^\infty {{{\left| {f(z)} \right| }^p}\textrm{d}z} = 0\) uniformly for \(f\in K\);

  2. (ii)

    \(\mathop {\lim }\limits _{h \rightarrow 0 } \int _0^\infty {{{\left| {f(z + h) - f(z)} \right| }^p}\textrm{d}z} = 0\) uniformly for \(f\in K \).

Proposition 3

For \(t\ge 0\), \( \varphi (t,{x_0}) \) is completely continuous.

Proof

By Lemma 3, it suffices to show that \(\varphi (t,B)\) is precompact for any closed and bounded set \(B\subset {\textbf{X}}\). According to Proposition 1, S(t) and I(t) remain in the compact set [0, M], where \(M\ge \frac{\Lambda }{\mu }\) is a bound for B.

We have from (5) and (13) that

$$\begin{aligned} 0\le {\tilde{v}}(\theta ,t)={\left\{ \begin{array}{ll} v(0,t-\theta )\eta (\theta ,t),&{} t\ge \theta \ge 0,\\ 0,&{} \theta > t\ge 0. \end{array}\right. } \end{aligned}$$
(16)

Note that \(0\le \eta (\theta ,t)\le e^{-(\mu _0+\mu ) \theta }\). Using (ii) of Proposition 1, we obtain

$$\begin{aligned} {\tilde{v}}(\theta ,t)\le \alpha M e^{-(\mu _0+\mu ) \theta }. \end{aligned}$$

Thus (i) in Lemma 3 is satisfied.

For sufficiently small \(h \in (0,t)\), we have

$$\begin{aligned}&\int _0^\infty {\left| {{{\tilde{v}}}(\theta +h,t) - {{\tilde{v}}}(\theta ,t)} \right| \textrm{d}\theta } \\&\quad =\int _0^t {\big | {{{\tilde{v}}}(\theta + h,t) - {{\tilde{v}}}(\theta ,t)} \big |\textrm{d}\theta }\\&\quad = \int _0^{t - h} {\big | {v(0,t - \theta - h){\eta }(\theta + h,t) - v(0,t - \theta ){\eta }(\theta ,t)} \big |\textrm{d}\theta }\\&\qquad + \int _{t - h}^t {\big | {0 - v(0,t - \theta ){\eta }(\theta ,t)} \big |\textrm{d}\theta } \\&\quad \le \int _0^{t - h} v(0,t - \theta - h)\big |\eta (\theta + h,t) - \eta (\theta ,t) \big |\textrm{d}\theta \\&\qquad + \int _0^{t - h} {\big | {v(0,t - \theta - h) - v(0,t -\theta )} \big |{\eta }(\theta ,t)\textrm{d}\theta }\\&\qquad + \int _{t - h}^t {\big | {v(0,t - \theta ){\eta }(\theta ,t)} \big |\textrm{d}\theta }\\&\quad =\int _0^{t - h} {v(0,t - \theta - h)\big | {{\eta }(\theta + h,t) - {\eta }(\theta ,t)} \big |\textrm{d}\theta } + \int _{t - h}^t {\big | {v(0,t - \theta ){\eta }(\theta ,t)} \big |\textrm{d}\theta } + \Delta , \end{aligned}$$

where

$$\begin{aligned} \Delta =\int _0^{t - h} {\big | {v(0,t -\theta - h) - v(0,t - \theta )} \big |{\eta }(\theta ,t)\textrm{d}\theta }. \end{aligned}$$

Recall that \(0 \le \eta (\theta ,t) \le 1\) and \(\eta (\theta ,t)\) is a non-increasing function with respect to \(\theta \). Then it follows that

$$\begin{aligned} \int _0^{t - h} {\big | {{\eta }(\theta + h,t) - {\eta }(\theta ,t)}\big |} \textrm{d}\theta&=\int _0^{t - h} \eta (\theta ,t)\textrm{d}\theta - \int _0^{t-h} \eta (\theta +h,t) \textrm{d}\theta \\&=\int _0^{t - h} {{\eta }(\theta ,t)} \textrm{d}\theta - \int _h^{t - h} {{\eta }(\theta ,t)\textrm{d}\theta } - \int _{t - h}^t {{\eta }(\theta ,t)\textrm{d}\theta }\\&=\int _0^h {{\eta }(\theta ,t)} \textrm{d}\theta - \int _{t - h}^t {{\eta }(\theta ,t)\textrm{d}\theta } \le h. \end{aligned}$$
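The shift estimate above uses only that \(\eta \) takes values in [0, 1] and is non-increasing in \(\theta \). A minimal numerical sketch, assuming a hypothetical profile `eta` that depends on \(\theta \) alone (the dependence on t is suppressed for illustration):

```python
import math


def eta(theta, c=0.7):
    """Hypothetical survival-type function: non-increasing, values in (0, 1]."""
    return math.exp(-c * theta)


def shift_integral(t, h, n=20000):
    """Midpoint-rule approximation of
    int_0^(t-h) |eta(theta + h) - eta(theta)| dtheta."""
    d = (t - h) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d
        total += abs(eta(theta + h) - eta(theta)) * d
    return total


for h in (0.5, 0.1, 0.01):
    print(h, shift_integral(5.0, h))  # each value should not exceed h
```

For this exponential profile the integral equals \(\frac{1}{c}(1-e^{-ch})(1-e^{-c(t-h)})\), which is bounded by h, in agreement with the estimate.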

From Proposition 1, we have

$$\begin{aligned}{} & {} \int _0^{t - h} {v(0,t -\theta - h)\big | {{\eta }(\theta +h,t) - {\eta }(\theta ,t)} \big |} \textrm{d}\theta \\{} & {} \quad \le \alpha Mh,~ \int _{t-h}^t \big |v(0,t-\theta )\eta (\theta ,t)\big |\textrm{d}\theta \le \alpha Mh. \end{aligned}$$

Combining the system (1), assumption (A2), Remark 1 and Proposition 1, it follows that

$$\begin{aligned} \left| {\frac{{\textrm{d}S(t)}}{{\textrm{d}t}}} \right| \le \Lambda + (\mu +\alpha ) M +\beta f'(0) M^2 + {\overline{h}}M:=M_S, \\ \left| {\frac{{\textrm{d}I(t)}}{{\textrm{d}t}}} \right| \le {\overline{k}} M +{\overline{\gamma }}M+\varpi M:=M_I. \end{aligned}$$

This implies that \(\left| \frac{\textrm{d}S(t)}{\textrm{d}t} \right| \) is bounded and S(t) is Lipschitz continuous on \( [0,\infty )\) with Lipschitz coefficient \( {M_S}\); similarly, \(\left| \frac{\textrm{d}I(t)}{\textrm{d}t} \right| \) is bounded and I(t) is Lipschitz continuous on \( [0,\infty )\) with Lipschitz coefficient \( {M_I}\). We deduce

$$\begin{aligned} \Delta \le \int _0^{t - h} {\alpha M_Sh{e^{-\mu \theta }}\textrm{d}\theta } \le \frac{\alpha M_Sh}{\mu }. \end{aligned}$$

Combining the above processes leads to

$$\begin{aligned} \int _0^\infty {\big | {{{\tilde{v}}}(\theta + h,t) - {{\tilde{v}}}(\theta ,t)} \big |\textrm{d}\theta } \le \left( 2\alpha M + \frac{\alpha M_S}{\mu }\right) h. \end{aligned}$$

So \(\mathop {\lim }\limits _{h\rightarrow 0}\int _0^\infty {\left| {{\tilde{v}}(\theta + h,t) - {\tilde{v}}(\theta ,t)} \right| \textrm{d}\theta }=0\) uniformly for any \(x_0\in B\), thus, \({\tilde{v}}(\theta ,t)\) remains in a precompact subset \({B_{{{\tilde{v}}}}}\) of \( L_+^1\). Similarly, \({\tilde{e}}(a,t)\) and \({\tilde{r}}(w,t)\) also remain in precompact subsets \(B_{{\tilde{e}}}\) and \(B_{{\tilde{r}}}\) of \(L_+^1\), respectively. Thus \(\varphi (t,B)\subseteq [0,M]\times B_{{\tilde{v}}}\times {B_{{{\tilde{e}}}}}\times [0,M]\times {B_{{{\tilde{r}}}}}\), which is compact in \({\textbf {X}}\). Thus, \(\varphi (t,{x_0})\) is completely continuous. This finishes the verification of (ii) of Lemma 2.

Theorem 2 now follows by combining Lemma 2 with Propositions 2 and 3. This completes the proof. \(\square \)

4.2 Uniform persistence

In this subsection, we prove the uniform persistence of the system (1) when \({\mathcal {R}}_0 > 1\), which means that infected individuals persist whenever infection is initially present.

For convenience, we introduce the Volterra-type function

$$\begin{aligned} g(x)=x-1-\textrm{ln}x,~ x>0. \end{aligned}$$

We can see that g(x) is positive definite and reaches a global minimum at \(x=1\) with \(g(1)=0\).
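The properties of g, together with the elementary identity \(2-x-\frac{1}{x}=-g(x)-g\left( \frac{1}{x}\right) \) that underlies the Lyapunov computations in this section, can be checked with a short sketch (the sample points are arbitrary):

```python
import math


def g(x):
    """Volterra-type function g(x) = x - 1 - ln x for x > 0."""
    if x <= 0:
        raise ValueError("g is defined only for x > 0")
    return x - 1.0 - math.log(x)


samples = [0.01, 0.5, 2.0, 10.0]
print(g(1.0))                     # value at the global minimum
print([g(x) > 0 for x in samples])
# identity: 2 - x - 1/x = -g(x) - g(1/x)
print([math.isclose(2 - x - 1 / x, -g(x) - g(1 / x)) for x in samples])
```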

We now introduce some notation. Set

$$\begin{aligned} {{\bar{a}}} = \inf \left\{ {a:\int _a^\infty {k(\tau )\textrm{d}\tau } = 0} \right\} ,~ {{\bar{w}}} = \inf \left\{ {w:\int _w^\infty {\gamma (\tau )\textrm{d}\tau } = 0} \right\} . \end{aligned}$$

Clearly, \({{{\bar{a}}}}\), \({{{\bar{w}}}}>0\) due to k(a), \(\gamma (w)\in L_ + ^1\). We further define

$$\begin{aligned} \tilde{{\textbf{X}}}&= L_ + ^1\times {\mathbb {R}}_+\times L_ + ^1,\\ \tilde{{\textbf{Y}}}&=\left\{ \big (e(\cdot ,t),I(t),r( \cdot ,t )\big )^T \in \tilde{{\textbf{X}}}: \int _0^{{{\bar{a}}}} {e(a,t)\textrm{d}a}> 0\ \text {or}\ I(t)> 0\ \text {or}\right. \\&\quad \left. \int _0^{{{\bar{w}}}} {r(w,t)\textrm{d}w }> 0\right\} , \end{aligned}$$

and

$$\begin{aligned} {\textbf{Y}}&= {{\mathbb {R}}_+ }\times L_ + ^1 \times \tilde{{\textbf{Y}}},~ \partial {\textbf{Y}}= {\textbf{X}}\backslash {\textbf{Y}},~ \partial \tilde{{\textbf{Y}}}= \tilde{{\textbf{X}}}\backslash \tilde{{\textbf{Y}}}. \end{aligned}$$

Let us first establish the following theorem.

Theorem 3

For the semi-flow \({\left\{ {\Phi (t,x_0)} \right\} _{t \ge 0}}\) generated by system (1), the set \(\partial {\textbf{Y}}\) is positively invariant, that is, \(\Phi (t,\partial {\textbf{Y}}) \subset \partial {\textbf{Y}}\) for any \(t\ge 0\). Furthermore, the disease-free equilibrium \(E^0\) is globally asymptotically stable for the semi-flow \({\left\{ {\Phi (t,x_0)} \right\} _{t \ge 0}}\) restricted to \(\partial {\textbf{Y}}\).

Proof

Suppose \(\big (S_0, v_0(\cdot ), e_0(\cdot ), I_0, r_0(\cdot )\big )\in \partial {\textbf{Y}}\). Then \(\big (e_0(\cdot ), I_0, r_0(\cdot )\big )\in \partial \tilde{{\textbf{Y}}}\), which shows that \(e_0(\cdot )=0\), \(I_0=0\) and \(r_0(\cdot )=0\). We consider the following system:

$$\begin{aligned} \left\{ \begin{array}{l} \frac{{\partial e(a,t)}}{{\partial a}} + \frac{{\partial e(a,t)}}{{\partial t}} = - \varepsilon (a)e(a,t), \\ \frac{{\textrm{d}I(t)}}{{\textrm{d}t}} = \int _0^\infty {k(a)e(a,t)\textrm{d}a} + \int _0^\infty {\gamma (w)r(w,t)\textrm{d}w}-\varpi I(t), \\ \frac{{\partial r(w,t)}}{{\partial w}} + \frac{{\partial r(w,t)}}{{\partial t}} = -u (w)r(w,t),\\ e\left( a,0 \right) =0,~ I(0)=0,~ r\left( w,0 \right) =0,\\ e\left( 0,t \right) =\beta S(t)f(I(t)){+}\int _0^\infty \beta \sigma (\theta )f(I(t))v(\theta ,t)\textrm{d}\theta {+}\sigma _2I(t), ~r\left( 0,t \right) =\sigma _1I(t). \end{array}\right. \end{aligned}$$
(17)

Since \(0\le \sigma (\theta ) \le 1\), we have \(S(t)+\int _0^\infty \sigma (\theta )v(\theta ,t)\textrm{d}\theta \le S(t)+\int _0^\infty v(\theta ,t)\textrm{d}\theta \le \frac{\Lambda }{\mu }\) as \(t \rightarrow \infty \). From Remark 1, \(f(I)\le f'(0)I\) for any \(I\ge 0\). By the comparison theorem for differential equations, for any \(t\ge 0\),

$$\begin{aligned} e(a,t) \le {{\hat{e}}}(a,t),~ I(t) \le {{\hat{I}}}(t),~ r(w,t) \le {{\hat{r}}}(w,t), \end{aligned}$$

where \(\big ({{\hat{e}}}(a,t), {{\hat{I}}}(t), {{\hat{r}}}(w,t)\big )\) is the solution of the following system:

$$\begin{aligned} \left\{ \begin{array}{l} \frac{{\partial {\hat{e}}(a,t)}}{{\partial a}} + \frac{{\partial {\hat{e}}(a,t)}}{{\partial t}} = - \varepsilon (a){{\hat{e}}}(a,t), \\ \frac{{\textrm{d}{{\hat{I}}}(t)}}{{\textrm{d}t}} = \int _0^\infty {k(a){{\hat{e}}}(a,t)\textrm{d}a} + \int _0^\infty {\gamma (w){{\hat{r}}}(w,t)\textrm{d}w}-\varpi {{\hat{I}}}(t), \\ \frac{{\partial {\hat{r}}(w,t)}}{{\partial w}} + \frac{{\partial {{\hat{r}}}(w,t)}}{{\partial t}} = - u(w){{\hat{r}}}(w,t),\\ {{\hat{e}}}\left( a,0 \right) =0,\ {{\hat{I}}}(0)=0,\ {{\hat{r}}}\left( w,0 \right) =0,\\ {\hat{e}}\left( 0,t \right) =\left( \beta f'(0)\frac{\Lambda }{\mu }+\sigma _2\right) {{\hat{I}}}(t),\ {{\hat{r}}}\left( 0,t \right) =\sigma _1{{\hat{I}}}(t). \end{array}\right. \end{aligned}$$
(18)

Similarly to (5), solving \({{\hat{e}}}(a,t)\) and \({{\hat{r}}}(w,t)\) of the system (18), we can obtain

$$\begin{aligned} \begin{array}{l} {{\hat{e}}}\left( a,t \right) ={\left\{ \begin{array}{ll} \left( \beta f'(0)\frac{\Lambda }{\mu }+\sigma _2\right) {{\hat{I}}}(t-a)\Omega _1(a), &{} t\ge a \ge 0,\\ 0, &{}a>t\ge 0.\\ \end{array}\right. }\\ {{\hat{r}}}\left( w,t \right) ={\left\{ \begin{array}{ll} \sigma _1{{\hat{I}}}(t-w)\Omega _2(w), \quad \quad \quad \quad \quad ~ &{}t\ge w \ge 0,\\ 0, \quad &{}w>t\ge 0. \end{array}\right. } \end{array} \end{aligned}$$
(19)

Substituting (19) into the second equation of system (18) yields

$$\begin{aligned} \begin{array}{l} \frac{\textrm{d}{\hat{I}}(t)}{\textrm{d}t}=\int _0^t k(a)\left( \beta f'(0)\frac{\Lambda }{\mu }+\sigma _2\right) {{\hat{I}}}(t-a)\Omega _1(a)\textrm{d}a\\ \qquad \quad +\int _0^t \sigma _1 \gamma (w)\hat{I}(t-w)\Omega _2(w)\textrm{d}w-\varpi {\hat{I}}(t). \end{array} \end{aligned}$$

The initial condition \({{\hat{I}}}(0)=0\) implies that the above equation has the unique solution \({{\hat{I}}}(t)=0\). By (19), we also get \({{\hat{e}}}(a,t)=0\) and \({{\hat{r}}}(w,t)=0\) for \(0\le a, w\le t\). Consequently, \(e(a,t)=0\), \(I(t)=0\) and \(r(w,t)=0\) for all \(t\ge 0\), which shows that the set \(\partial {\textbf{Y}}\) is positively invariant.

The system (1)-(3) in \(\partial {\textbf{Y}}\) becomes

$$\begin{aligned} \left\{ \begin{array}{l} \frac{{\textrm{d}S(t)}}{{\textrm{d}t}} = \Lambda - (\mu +\alpha ) S(t) + \int _0^\infty {h (\theta )v(\theta ,t)\textrm{d}\theta },\\ \frac{{\partial v(\theta ,t)}}{{\partial \theta }} + \frac{{\partial v(\theta ,t)}}{{\partial t}} = -n(\theta )v(\theta ,t),\\ S(0)=S_0,~ v(\theta ,0)=v_0(\theta ),\\ v(0,t)=\alpha S(t). \end{array} \right. \end{aligned}$$
(20)

Construct the following Lyapunov function:

$$\begin{aligned} P(t)=P_S(t)+P_v(t)=S^0g\left( \frac{S(t)}{S^0}\right) +\int _0^\infty v^0(\theta )g\left( \frac{v(\theta ,t)}{v^0(\theta )}\right) \textrm{d}\theta . \end{aligned}$$

Note that \(\mu +\alpha =\frac{1}{S^0}\left( \Lambda +\int _0^\infty h(\theta )v^0(\theta )\textrm{d}\theta \right) \). Calculating the derivative of \(P_S(t)\) along the solutions of system (20) gives

$$\begin{aligned} \frac{\textrm{d}P_S(t)}{\textrm{d}t}&=\left( 1 - \frac{S^0}{S(t)} \right) \left( \Lambda - (\mu + \alpha )S(t) +\int _0^\infty h(\theta )v(\theta ,t)\textrm{d}\theta \right) \nonumber \\&=\left( 1 - \frac{S^0}{S(t)} \right) \left[ \Lambda -\frac{S(t)}{S^0}\left( \Lambda +\int _0^\infty h(\theta )v^0(\theta )\textrm{d}\theta \right) +\int _0^\infty h(\theta )v(\theta ,t)\textrm{d}\theta \right] \nonumber \\&=\Lambda \left( 2 - \frac{S(t)}{S^0} - \frac{S^0}{S(t)} \right) +\int _0^\infty h(\theta )v^0(\theta )\nonumber \\&\quad \times \left( \frac{v(\theta ,t)}{v^0(\theta )}- \frac{S(t)}{S^0}-\frac{S^0 v(\theta ,t)}{S(t)v^0(\theta )}+1\right) \textrm{d}\theta . \end{aligned}$$
(21)

The derivative of \(P_v(t)\) satisfies

$$\begin{aligned} \frac{{\textrm{d}{P_v(t)}}}{{\textrm{d}t}}&=\int _0^\infty v^0(\theta ) \frac{\partial }{\partial t}g\left( \frac{v(\theta ,t)}{v^0(\theta )}\right) \textrm{d}\theta \nonumber \\&=-\int _0^\infty {v^0(\theta )\left( \frac{1}{v^0(\theta )}-\frac{1}{v(\theta ,t)}\right) \left( n(\theta )v(\theta ,t)+\frac{\partial }{{\partial \theta }}v(\theta ,t)\right) \textrm{d}\theta }\nonumber \\&=-\int _0^\infty {v^0(\theta )\left( \frac{v(\theta ,t)}{v^0(\theta )}-1\right) \left( n(\theta )+\frac{v_\theta (\theta ,t)}{v(\theta ,t)}\right) \textrm{d}\theta }, \end{aligned}$$
(22)

where \(v_\theta (\theta ,t)=\frac{\partial v(\theta ,t)}{\partial \theta }\). Recall that \(\frac{\textrm{d} v^0(\theta )}{\textrm{d}\theta }=-n(\theta )v^0(\theta )\), it follows that

$$\begin{aligned} \frac{\partial }{{\partial \theta }}g\left( \frac{v(\theta ,t)}{v^0(\theta )}\right) =\left( \frac{v(\theta ,t)}{v^0(\theta )}-1\right) \left( n(\theta )+\frac{v_\theta (\theta ,t)}{v(\theta ,t)}\right) . \end{aligned}$$
(23)

Since \(v^0(0)=\alpha S^0\) and \(v(0,t)=\alpha S(t)\), we have

$$\begin{aligned} g\left( \frac{v(0,t)}{v^0(0)}\right) =g\left( \frac{S(t)}{S^0}\right) . \end{aligned}$$
(24)

Substituting (23)-(24) into (22) and integrating by parts gives

$$\begin{aligned} \frac{{\textrm{d}{P_v(t)}}}{{\textrm{d}t}}&= -\int _0^\infty {v^0(\theta )\frac{\partial }{{\partial \theta }}g\left( \frac{v(\theta ,t)}{v^0(\theta )}\right) \textrm{d}\theta }\nonumber \\&= -v^0(\theta )g\left( \frac{v(\theta ,t)}{v^0(\theta )}\right) \Bigg |_{\theta =\infty }- \int _0^\infty {v^0(\theta )n(\theta )g\left( \frac{v(\theta ,t)}{v^0(\theta )}\right) \textrm{d}\theta }\nonumber \\&\quad +\alpha S^0g\left( \frac{S(t)}{S^0}\right) . \end{aligned}$$
(25)

Noting that \(n(\theta )=h(\theta )+\mu \) and \(\alpha =\frac{\Lambda }{S^0}-\mu +\frac{1}{S^0}\int _0^\infty h(\theta )v^0(\theta )\textrm{d}\theta \), it follows from (21) and (25) that

$$\begin{aligned} \frac{\textrm{d}P(t)}{\textrm{d}t}&=\Lambda \left( 2 - \frac{S(t)}{S^0} - \frac{S^0}{S(t)} \right) +\int _0^\infty {h(\theta )v^0(\theta )\left( \frac{v(\theta ,t)}{v^0(\theta )}- \frac{S(t)}{S^0}-\frac{S^0 v(\theta ,t)}{S(t)v^0(\theta )}+1\right) \textrm{d}\theta }\nonumber \\&\quad -v^0(\theta )g\left( \frac{v(\theta ,t)}{v^0(\theta )}\right) \Bigg |_{\theta =\infty }- \int _0^\infty {\mu v^0(\theta )g\left( \frac{v(\theta ,t)}{v^0(\theta )}\right) \textrm{d}\theta }\nonumber \\&\quad -\int _0^\infty {h(\theta )v^0(\theta )g\left( \frac{v(\theta ,t)}{v^0(\theta )}\right) \textrm{d}\theta }+ \left( \Lambda -\mu S^0+\int _0^\infty h(\theta )v^0(\theta )\textrm{d}\theta \right) g\left( \frac{S(t)}{S^0}\right) \nonumber \\&=-\Lambda g\left( \frac{S^0}{S(t)}\right) -\mu S^0g\left( \frac{S(t)}{S^0}\right) -v^0(\theta )g\left( \frac{v(\theta ,t)}{v^0(\theta )}\right) \Bigg |_{\theta =\infty }\nonumber \\&\quad -\int _0^\infty {\mu v^0(\theta )g\left( \frac{v(\theta ,t)}{v^0(\theta )}\right) \textrm{d}\theta }-\int _0^\infty {h(\theta ) v^0(\theta )g\left( \frac{S^0v(\theta ,t)}{S(t)v^0(\theta )}\right) \textrm{d}\theta }. \end{aligned}$$
(26)

By (26), \(\frac{\textrm{d}P(t)}{\textrm{d}t}\le 0\), and equality holds if and only if \(S(t)=S^0\) and \(v(\theta ,t)=v^0(\theta )\). Hence, by the LaSalle invariance principle (LaSalle 1976), the disease-free equilibrium \(E^0\) is globally asymptotically stable in \(\partial {\textbf{Y}}\). This completes the proof. \(\square \)
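As an illustration of the convergence asserted in Theorem 3, the boundary system (20) can be simulated in a simplified setting. The sketch below assumes constant rates (a hypothetical waning rate `h` and \(n=h+\mu \)), so that \(V(t)=\int _0^\infty v(\theta ,t)\textrm{d}\theta \) satisfies an ordinary differential equation; all parameter values are illustrative and the explicit Euler scheme is chosen only for brevity.

```python
def disease_free_S(Lambda, mu, alpha, h):
    """S^0 for constant rates: solves
    (mu + alpha) * S0 = Lambda + h * alpha * S0 / (h + mu)."""
    return Lambda * (h + mu) / (mu * (mu + alpha + h))


def simulate_boundary_system(Lambda, mu, alpha, h, S_init, V_init,
                             T=200.0, dt=0.01):
    """Explicit Euler for the ODE reduction of system (20):
    S' = Lambda - (mu + alpha) S + h V,  V' = alpha S - (mu + h) V."""
    S, V = S_init, V_init
    for _ in range(int(round(T / dt))):
        dS = Lambda - (mu + alpha) * S + h * V
        dV = alpha * S - (mu + h) * V
        S, V = S + dt * dS, V + dt * dV
    return S, V


# hypothetical parameter values
params = dict(Lambda=1.0, mu=0.1, alpha=0.3, h=0.2)
S_end, V_end = simulate_boundary_system(S_init=2.0, V_init=0.0, **params)
print(S_end, disease_free_S(**params))
```

With these values the trajectory approaches \(S^0\) from an initial state away from equilibrium, consistent with the global stability of \(E^0\) on \(\partial {\textbf{Y}}\).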

Next, we apply methods from the literature (Hale and Waltman 1989; Magal and Zhao 2005) to prove the uniform persistence of the system (1). The following theorem is introduced.

Theorem 4

Suppose that \({\mathcal {R}}_0>1\). The semi-flow \({\left\{ {\Phi (t,x_0)} \right\} _{t \ge 0}}\) is uniformly persistent in \(({\textbf{Y}},\partial {\textbf{Y}})\), that is, there exists a constant \(\iota >0\) which is independent of the initial values such that \(\mathop {\lim }\limits _{t \rightarrow \infty }\inf {\left\| {\Phi (t,x_0)} \right\| _{\textbf{X}}} \ge \iota \) for any \(x_0\in {\textbf{Y}}\). Furthermore, the semi-flow \({\left\{ {\Phi (t,x_0)} \right\} _{t \ge 0}}\) has a compact global attractor \( {\mathcal {A}}_0\) in \({\textbf{Y}}\).

Proof

Theorem 3 shows that \(E^0\) is globally asymptotically stable in \(\partial {\textbf{Y}}\). By Theorem 4.2 in Hale and Waltman (1989), we only need to prove that there exist \(T>0\) and \(\iota >0\), such that \(\mathop {\lim }\limits _{t \rightarrow \infty } \inf {\left\| {\Phi (t,{x_0})} \right\| _{\textbf{X}}} \ge \iota \) for any \(t>T\), \(x_0\in {\textbf{Y}}\). To this end, it suffices to show that \({W^s}({E^0}) \cap {\textbf{Y}} = \emptyset \), where \({W^s}({E^0}) = \{ {x_0} \in {\textbf{Y}}:\mathop {\lim }\limits _{t \rightarrow \infty } \Phi (t,{x_0}) = {E^0}\}\).

Suppose, by way of contradiction, that there exists \(y\in {\textbf{Y}}\) such that \(\mathop {\lim }\limits _{t \rightarrow \infty } \Phi (t,y) = E^0\). Then we can take a sequence \(\{y_n\} \subset {\textbf{Y}}\), where \({y_n} = \big ({S_n}(0), {v_n}(\cdot ,0), {e_n}(\cdot ,0 ), {I_n}(0), {r_n}(\cdot ,0)\big )\), such that \({\left\| {\Phi (t,y_n)-E^0} \right\| _{\textbf {X}}} < \frac{1}{n}\), for \(t \ge 0\). Here, \(\Phi (t,{y_n}) = \big ({S_n}(t), {v_n}(\cdot ,t), {e_n}( \cdot ,t ), {I_n}(t), {r_n}( \cdot ,t )\big )\).

Now we choose \(n>0\) sufficiently large that \(S^0-\frac{1}{n}>0\) and \(v^0(\theta )-\frac{1}{n}>0\). For this n, there exists \(T>0\) such that, for \(t>T\),

$$\begin{aligned} {S^0} - \frac{1}{n}< {S_n}(t)< {S^0} + \frac{1}{n},\ {v^0(\theta )} - \frac{1}{n}< {v_n}(\theta ,t)< {v^0(\theta )} + \frac{1}{n},\ 0<I_n(t)<\frac{1}{n}.\nonumber \\ \end{aligned}$$
(27)

From (5), we obtain that

$$\begin{aligned} e(a,t)&=e(0,t-a)\Omega _1(a)+e_0(a-t)\frac{\Omega _1(a)}{\Omega _1(a-t)}\nonumber \\&\ge \left( \beta S(t-a)f(I(t-a))+\int _0^\infty \beta \sigma (\theta )f(I(t-a))v(\theta ,t-a)\textrm{d}\theta \right. \nonumber \\&\quad \left. +\sigma _2I(t-a)\right) \Omega _1(a),\nonumber \\ r(w,t)&=r(0,t-w)\Omega _2(w)+r_0(w-t)\frac{\Omega _2(w)}{\Omega _2(w-t)}\ge \sigma _1I(t-w)\Omega _2(w). \end{aligned}$$
(28)

Combining Remark 1 and (27-28) yields

$$\begin{aligned} e_n(a,t)&\ge \left( \beta S_n(t-a)f(I_n(t-a))+\int _0^\infty \beta \sigma (\theta )f(I_n(t-a))v_n(\theta ,t-a)\textrm{d}\theta \right. \nonumber \\&\quad \left. +\sigma _2I_n(t-a)\right) \Omega _1(a)\nonumber \\&\ge \left[ \beta \left( \left( S^0-\frac{1}{n}\right) +\int _0^\infty \sigma (\theta )\left( v^0(\theta )-\frac{1}{n}\right) \textrm{d}\theta \right) f'\left( \frac{1}{n}\right) +\sigma _2\right] \nonumber \\&\quad \times I_n(t-a)\Omega _1(a),\nonumber \\ r_n(w,t)&\ge \sigma _1I_n(t-w)\Omega _2(w). \end{aligned}$$
(29)

Substituting (29) into the fourth equation in the system (1) and applying the comparison principle for differential equations, we obtain \(I_n(t) \ge \varrho _n(t)\), for all \(t\ge 0\), where \(\varrho _n(t)\) is the solution of the following Volterra-type integro-differential equation:

$$\begin{aligned} \left\{ \begin{array}{l} \frac{\textrm{d}\varrho _n(t)}{\textrm{d}t} = \int _0^\infty \beta k(\tau )\Omega _1(\tau ) f'\left( \frac{1}{n}\right) \big [\left( S^0-\frac{1}{n}\right) +\int _0^\infty \sigma (\theta )\left( v^0(\theta )-\frac{1}{n}\right) \textrm{d}\theta \big ]\varrho _n(t-\tau )\textrm{d}\tau \nonumber \\ \quad \quad \quad ~+\int _0^\infty \big (\sigma _2k(\tau )\Omega _1(\tau )+ \sigma _1\gamma (\tau )\Omega _2(\tau )\big )\varrho _n(t-\tau )\textrm{d}\tau -\varpi \varrho _n(t),\nonumber \\ {\varrho _n}(0) = {I_n}(0) > 0. \end{array} \right. \end{aligned}$$
(30)

Next, we prove that \(\varrho _n(t)\) is unbounded. Since \(\varrho _n(0)>0\) and \({\mathcal {R}}_0>1\), we can choose n sufficiently large that the following inequality holds:

$$\begin{aligned} \beta \theta _2f'\left( \frac{1}{n}\right) \left[ \left( S^0-\frac{1}{n}\right) +\int _0^\infty \sigma (\theta )\left( v^0(\theta )-\frac{1}{n}\right) \textrm{d}\theta \right] >\varpi -\sigma _1\theta _1-\sigma _2\theta _2. \end{aligned}$$

Namely

$$\begin{aligned}&\int _0^\infty \beta k(\tau )\Omega _1(\tau )f'\left( \frac{1}{n}\right) \left[ \left( S^0-\frac{1}{n}\right) +\int _0^\infty \sigma (\theta )\left( v^0(\theta )-\frac{1}{n}\right) \textrm{d}\theta \right] \textrm{d}\tau \\&\quad +\sigma _1\theta _1+\sigma _2\theta _2>\varpi . \end{aligned}$$

It follows that \(\varrho _n(t)\) is unbounded. Since \(I_n(t)\ge \varrho _n(t)\), \(I_n(t)\) is also unbounded, which contradicts the boundedness of \(I_n(t)\) in (27). Hence the assumption is false and \({W^s}({E^0}) \cap {\textbf{Y}} = \emptyset \) holds. Uniform persistence and the existence of a global attractor \({\mathcal {A}}_0 \) for \({\{ \Phi (t,x_0)\} _{t \ge 0}}\) then follow from Theorem 3.7 in Magal and Zhao (2005). This completes the proof. \(\square \)
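The growth mechanism behind the unboundedness of \(\varrho _n(t)\) can be illustrated on a toy version of equation (30). The sketch below is an assumption-laden simplification, not the equation of the paper: it lumps the two kernels into a single hypothetical kernel \(K(\tau )=ce^{-\tau }\) (so that \(\int _0^\infty K(\tau )\textrm{d}\tau =c\)) and takes zero history before \(t=0\). Under these assumptions \(y(t)=\int _0^t K(\tau )\varrho (t-\tau )\textrm{d}\tau \) satisfies \(y'=c\varrho -y\), and the equation reduces to a planar linear system.

```python
def simulate_linear_renewal(c, varpi, rho0=1.0, T=20.0, dt=0.001):
    """Explicit Euler for rho' = y - varpi * rho, y' = c * rho - y,
    where y(t) = int_0^t c * exp(-tau) * rho(t - tau) dtau (zero history)."""
    rho, y = rho0, 0.0
    for _ in range(int(round(T / dt))):
        d_rho = y - varpi * rho
        d_y = c * rho - y
        rho, y = rho + dt * d_rho, y + dt * d_y
    return rho


# kernel mass above the removal rate: the solution grows
print(simulate_linear_renewal(c=1.5, varpi=1.0))
# kernel mass below the removal rate: the solution decays
print(simulate_linear_renewal(c=0.5, varpi=1.0))
```

In this reduced system a positive eigenvalue exists exactly when \(c>\varpi \), which plays the role of the inequality preceding the conclusion of the proof.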

4.3 Stability of the disease-free equilibrium \(E^0\)

In this subsection, we give the local and global asymptotic stability of the disease-free equilibrium \(E^0\), respectively.

Theorem 5

The disease-free equilibrium \(E^0\) is locally asymptotically stable if \({\mathcal {R}}_0< 1\), and unstable if \({\mathcal {R}}_0>1\).

Proof

We first take the change of variables as follows:

$$\begin{aligned}{} & {} {x_1}(t) = S(t) - {S^0},\ {x_2}(\theta ,t) = v(\theta ,t) - {v^0(\theta )},\ {x_3}(a,t) = e(a,t),\\{} & {} {x_4}(t) = I(t),\ {x_5}(w,t) = r(w,t). \end{aligned}$$

Linearizing the system (1) at the disease-free equilibrium \(E^0\) leads to the following system:

$$\begin{aligned} \left\{ \begin{array}{l} \frac{{\textrm{d}{x_1}(t)}}{{\textrm{d}t}} = - (\mu + \alpha ){x_1}(t) -\beta f'(0)S^0{x_4}(t)+\int _0^\infty {h(\theta )x_2(\theta ,t)\textrm{d}\theta },\\ \frac{{\partial {x_2}(\theta ,t)}}{{\partial \theta }} + \frac{{\partial {x_2}(\theta ,t)}}{{\partial t}}= - n(\theta ){x_2}(\theta ,t) - \beta f'(0) \sigma (\theta ) {v^0(\theta )}{x_4}(t),\\ \frac{{\partial {x_3}(a,t)}}{{\partial a}} + \frac{{\partial {x_3}(a,t)}}{{\partial t}} = - \varepsilon (a){x_3}(a,t),\\ \frac{{\textrm{d}{x_4}(t)}}{{\textrm{d}t}} = \int _0^\infty {k (a){x_3}(a,t)\textrm{d}a} + \int _0^\infty {\gamma (w){x_5}(w,t)\textrm{d}w} - \varpi {x_4}(t), \\ \frac{{\partial {x_5}(w,t)}}{{\partial w}} + \frac{{\partial {x_5}(w,t)}}{{\partial t}} = - u(w) {x_5}(w,t),\\ {x_2}(0,t)=\alpha x_1(t),\\ {x_3}(0,t) = \beta f'(0) S^0x_4(t) + \int _0^\infty \beta f'(0)\sigma (\theta ) {v^0(\theta )}{x_4}(t)\textrm{d}\theta +\sigma _2 x_4(t),\\ {x_5}(0,t) = \sigma _1{x_4}(t). \end{array} \right. \end{aligned}$$
(31)

Let \( {x_1}(t) = x_1^0{e^{\lambda t}}\), \({x_2}(\theta ,t) = x_2^0(\theta ){e^{\lambda t}}\), \({x_3}(a,t) = x_3^0(a){e^{\lambda t}}\), \( {x_4}(t) = x_4^0{e^{\lambda t}}\) and \({x_5}(w,t) = x_5^0(w){e^{\lambda t}}\) be a solution of the system (31), where \(x_1^0\), \(x_2^0(\theta )\), \(x_3^0(a)\), \(x_4^0\), \(x_5^0(w)\) and \(\lambda \) will be determined later. Observe that it suffices to use the third, fourth and fifth equations of the system (31) to determine the characteristic equation of the system (1) at the disease-free equilibrium \(E^0\). Then

$$\begin{aligned}{} & {} \left\{ \begin{array}{l} \lambda x_3^0(a) + \frac{{\textrm{d}x_3^0(a)}}{{\textrm{d}a}} = - \varepsilon (a)x_3^0(a),\\ x_3^0(0)= \beta f'(0) S^0x_4^0 + \int _0^\infty \beta f'(0)\sigma (\theta ) v^0(\theta )x_4^0\textrm{d}\theta +\sigma _2 x_4^0,\\ \end{array}\right. \end{aligned}$$
(32)
$$\begin{aligned}{} & {} \begin{array}{l} \lambda x_4^0 = \int _0^\infty {k (a){x_3^0}(a)\textrm{d}a} + \int _0^\infty {\gamma (w){x_5^0}(w)\textrm{d}w} - \varpi {x_4^0} \end{array}, \end{aligned}$$
(33)
$$\begin{aligned}{} & {} \left\{ \begin{array}{l} \lambda x_5^0(w) + \frac{{\textrm{d}x_5^0(w)}}{{\textrm{d}w}} = - u(w) x_5^0(w),\\ x_5^0(0)=\sigma _1 x_4^0. \end{array}\right. \end{aligned}$$
(34)

Integrating the first equation of (32) from 0 to a yields

$$\begin{aligned} \begin{array}{l} x_3^0(a) = \Big (\beta f'(0) S^0x_4^0+ \int _0^\infty \beta f'(0)\sigma (\theta ) v^0(\theta )x_4^0\textrm{d}\theta +\sigma _2 x_4^0\Big ) e^{- \int _0^a (\lambda +\varepsilon (r))\textrm{d}r}. \end{array}\nonumber \\ \end{aligned}$$
(35)

Similarly, solving (34), we get

$$\begin{aligned} \begin{array}{l} x_5^0(w) = \sigma _1 x_4^0e^{ - \int _0^w \left( \lambda +u(r)\right) \textrm{d}r}. \end{array} \end{aligned}$$
(36)

Substituting (35)-(36) into (33) gives the characteristic equation of system (1) at the disease-free equilibrium \(E^0\):

$$\begin{aligned} 0&=\int _0^\infty k (a)\left( \beta f'(0) S^0 + \int _0^\infty \beta f'(0)\sigma (\theta ) v^0(\theta )\textrm{d}\theta +\sigma _2\right) e^{ - \int _0^a (\lambda +\varepsilon (r))\textrm{d}r}\textrm{d}a \nonumber \\&\quad +\int _0^\infty \sigma _1\gamma (w) e^{ - \int _0^w (\lambda +u (r))\textrm{d}r}\textrm{d}w - (\varpi +\lambda ). \end{aligned}$$
(37)

Let \(H(\lambda )\) denote the right-hand side of (37). It is clear that

$$\begin{aligned}{} & {} \mathop {\lim }\limits _{\lambda \rightarrow + \infty } H(\lambda ) = - \infty ,~\mathop {\lim }\limits _{\lambda \rightarrow - \infty } H(\lambda ) = + \infty ,~H'(\lambda ) < 0,\\{} & {} H(0)=(\varpi -\sigma _1\theta _1-\sigma _2\theta _2)({\mathcal {R}}_0-1). \end{aligned}$$

If \({\mathcal {R}}_0>1\), then \(H(0)>0\). Since \(H(\lambda )\) is continuous and \(H(\lambda )\rightarrow -\infty \) as \(\lambda \rightarrow +\infty \), equation (37) has at least one positive real root, so the disease-free equilibrium \(E^0\) is unstable. If \({\mathcal {R}}_0<1\), we rewrite (37) as

$$\begin{aligned} \lambda +\varpi&=\int _0^\infty k (a)\left( \beta f'(0)S^0 + \int _0^\infty \beta f'(0)\sigma (\theta ) v^0(\theta )\textrm{d}\theta +\sigma _2\right) e^{ - \int _0^a (\lambda +\varepsilon (r))\textrm{d}r}\textrm{d}a \nonumber \\&\quad +\int _0^\infty \sigma _1\gamma (w) e^{ - \int _0^w (\lambda +u (r))\textrm{d}r}\textrm{d}w. \end{aligned}$$
(38)

Suppose that (38) has a root \(\lambda \) with Re\(\lambda \ge 0\). Then \(|\lambda +\varpi |\ge \varpi \). Denoting the right-hand side of (38) by \(\Xi \), we have

$$\begin{aligned} |\Xi | \le \beta \theta _2 f'(0)\left( S^0+\int _0^\infty \sigma (\theta )v^0(\theta )\textrm{d}\theta \right) +\sigma _1\theta _1+\sigma _2\theta _2<\varpi . \end{aligned}$$

This contradicts equation (38). Thus, if \({\mathcal {R}}_0<1\), then every root satisfies Re\(\lambda < 0\). Namely, the disease-free equilibrium \(E^0\) is locally asymptotically stable. This completes the proof. \(\square \)
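The threshold behaviour in Theorem 5 can be illustrated for constant rates. The sketch below rests on several assumptions that are not taken from the paper: \(k\), \(\varepsilon \), \(\gamma \) and \(u\) are constants, so the integrals in the characteristic equation reduce to \(k/(\lambda +\varepsilon )\) and \(\gamma /(\lambda +u)\); the quantity `A` is a hypothetical stand-in for \(\beta f'(0)\big (S^0+\int _0^\infty \sigma (\theta )v^0(\theta )\textrm{d}\theta \big )\); and \(\theta _1=\gamma /u\), \(\theta _2=k/\varepsilon \) are inferred from the expression for \(H(0)\).

```python
def H(lam, A, k, eps, gamma, u, sigma1, sigma2, varpi):
    """Characteristic function for constant rates (valid for lam > -min(eps, u))."""
    return ((A + sigma2) * k / (lam + eps)
            + sigma1 * gamma / (lam + u)
            - (varpi + lam))


def reproduction_number(A, k, eps, gamma, u, sigma1, sigma2, varpi):
    """R0 = A*theta2 / (varpi - sigma1*theta1 - sigma2*theta2), constant rates."""
    theta1, theta2 = gamma / u, k / eps
    return A * theta2 / (varpi - sigma1 * theta1 - sigma2 * theta2)


def real_root(**p):
    """Bisection for the real root of H, which is decreasing on its domain."""
    lo = -min(p["eps"], p["u"]) + 1e-9   # H is large and positive here
    hi = 1.0
    while H(hi, **p) > 0:                # expand until H changes sign
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if H(mid, **p) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# hypothetical parameter values
base = dict(k=0.5, eps=1.0, gamma=0.2, u=1.0, sigma1=0.1, sigma2=0.1, varpi=1.0)
for A in (3.0, 1.0):
    p = dict(A=A, **base)
    print(reproduction_number(**p), real_root(**p))
```

In this toy setting the sign of the real root agrees with the sign of \({\mathcal {R}}_0-1\); the sketch does not address complex roots, which the proof handles separately.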

Next, we give the global stability of the disease-free equilibrium \(E^0\).

Theorem 6

The disease-free equilibrium \(E^0\) is globally asymptotically stable if \({\mathcal {R}}_0<1\).

Proof

Define the Lyapunov function as follows:

$$\begin{aligned} L(t)=L_1(t)+L_2(t)+I(t)+L_3(t)+L_4(t), \end{aligned}$$

where

$$\begin{aligned} {L_1(t)}&= {\theta _2}P_S(t),\quad \quad \quad \quad \quad \quad \quad \quad \,\,~~{L_2(t)} = {\theta _2}P_v(t) ,\\ {L_3(t)}&= \int _0^\infty {{\zeta _1}(a)e(a,t)\textrm{d}a},\quad \quad \quad {L_4(t)} = \int _0^\infty {{\zeta _2}(w)r(w,t)\textrm{d}w}. \end{aligned}$$

By \(\mu =\frac{\Lambda }{S^0}-\alpha +\frac{1}{S^0}\int _0^\infty h(\theta )v^0(\theta )\textrm{d}\theta \), similar to the process of solving (21), the derivative of \(L_1(t)\) along the system (1) is

$$\begin{aligned} \frac{\textrm{d}L_1(t)}{\textrm{d}t}&=\theta _2\Lambda \left( 2 - \frac{S(t)}{S^0} - \frac{S^0}{S(t)} \right) - \beta \theta _2 f(I(t))S(t) +\beta \theta _2f(I(t))S^0\nonumber \\&\quad +\theta _2\int _0^\infty {h(\theta )v^0(\theta )\left( \frac{v(\theta ,t)}{v^0(\theta )}- \frac{S(t)}{S^0}-\frac{S^0 v(\theta ,t)}{S(t)v^0(\theta )}+1\right) \textrm{d}\theta }. \end{aligned}$$
(39)

Similar to (25), we have

$$\begin{aligned} \frac{{\textrm{d}{L_2(t)}}}{{\textrm{d}t}}&= -\theta _2v^0(\theta )g\left( \frac{v(\theta ,t)}{v^0(\theta )}\right) \Bigg |_{\theta =\infty }-\theta _2\int _0^\infty {v^0(\theta )n(\theta ) g\left( \frac{v(\theta ,t)}{v^0(\theta )}\right) \textrm{d}\theta }\nonumber \\&\quad -\beta \theta _2 \int _0^\infty {v(\theta ,t)\sigma (\theta )f(I(t))\textrm{d}\theta }+\beta \theta _2 \int _0^\infty {v^0(\theta )\sigma (\theta )f(I(t))\textrm{d}\theta }\nonumber \\&\quad +\theta _2\left( \Lambda -\mu S^0+\int _0^\infty h(\theta )v^0(\theta )\textrm{d}\theta \right) g\left( \frac{S(t)}{S^0}\right) . \end{aligned}$$
(40)

The derivative of \(L_3(t)\) satisfies

$$\begin{aligned} \frac{{\textrm{d}{L_3(t)}}}{{\textrm{d}t}}&= \int _0^\infty {\zeta _1(a)\frac{\partial e(a,t)}{{\partial t}}\textrm{d}a}\nonumber \\&=-\int _0^\infty {\zeta _1(a)\varepsilon (a)e(a,t)\textrm{d}a}+\theta _2e(0,t)-\zeta _1(a)e(a,t)\Big |_{a=\infty }\nonumber \\&\quad +\int _0^\infty {e(a,t)\big (\zeta _1(a)\varepsilon (a)-k(a)\big )\textrm{d}a}\nonumber \\&=\theta _2e(0,t)-\zeta _1(a)e(a,t)\Big |_{a=\infty }-\int _0^\infty {k(a)e(a,t)\textrm{d}a}. \end{aligned}$$
(41)

Obviously

$$\begin{aligned} \frac{{\textrm{d}{I(t)}}}{{\textrm{d}t}} =\int _0^\infty {k(a)e(a,t)\textrm{d}a}+\int _0^\infty {\gamma (w)r(w,t)\textrm{d}w}-\varpi I(t). \end{aligned}$$
(42)

Similarly to (41), we get

$$\begin{aligned} \frac{\textrm{d}L_4(t)}{\textrm{d}t} =\theta _1r(0,t)-\zeta _2(w)r(w,t)\Big |_{w=\infty }-\int _0^\infty {\gamma (w)r(w,t)\textrm{d}w}. \end{aligned}$$
(43)

Combining (39)-(43) and rearranging gives

$$\begin{aligned} \frac{\textrm{d}L(t)}{\textrm{d}t}&= -\theta _2\Lambda g\left( \frac{S^0}{S(t)}\right) -\theta _2\mu S^0g\left( \frac{S(t)}{S^0}\right) -\theta _2v^0(\theta )g\left( \frac{v(\theta ,t)}{v^0(\theta )}\right) \Bigg |_{\theta =\infty }-\zeta _1(a)e(a,t)\Big |_{a=\infty }\nonumber \\&\quad -\zeta _2(w)r(w,t)\Big |_{w=\infty }-\theta _2\mu \int _0^\infty {v^0(\theta ) g\left( \frac{v(\theta ,t)}{v^0(\theta )}\right) \textrm{d}\theta }+Q_1+Q_2, \end{aligned}$$
(44)

where

$$\begin{aligned} Q_1&=\theta _2\int _0^\infty {h(\theta )v^0(\theta )\left( \frac{v(\theta ,t)}{v^0(\theta )}- \frac{S(t)}{S^0}-\frac{S^0 v(\theta ,t)}{S(t)v^0(\theta )}+1-\frac{v(\theta ,t)}{v^0(\theta )}+1+\ln \frac{v(\theta ,t)}{v^0(\theta )}\right) \textrm{d}\theta }\nonumber \\&\quad +\theta _2\int _0^\infty {h(\theta )v^0(\theta ) g\left( \frac{S(t)}{S^0}\right) \textrm{d}\theta }\nonumber \\&=-\theta _2\int _0^\infty {h(\theta )v^0(\theta )g\left( \frac{S^0v(\theta ,t)}{S(t)v^0(\theta )}\right) \textrm{d}\theta },\\ Q_2&=\beta \theta _2\left( S^0+\int _0^\infty {\sigma (\theta )v^0(\theta )\textrm{d}\theta }\right) f(I(t))-\varpi I(t)+\sigma _1\theta _1I(t)+\sigma _2\theta _2I(t)\nonumber \\&=(\varpi -\sigma _1\theta _1-\sigma _2\theta _2)I(t)\left[ \frac{\beta f(I(t))\theta _2\big (S^0+\int _0^\infty {v^0(\theta )\sigma (\theta )\textrm{d}\theta }\big )}{(\varpi -\sigma _1\theta _1-\sigma _2\theta _2)I(t)}-1\right] . \end{aligned}$$
(45)

According to Remark 1, \(f(I)\le f'(0)I\) holds, which means that

$$\begin{aligned} Q_2\le (\varpi -\sigma _1\theta _1-\sigma _2\theta _2)I(t)({\mathcal {R}}_0-1). \end{aligned}$$
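To make this step explicit, here is a short sketch under two assumptions that are not restated in this section: that \(\varpi -\sigma _1\theta _1-\sigma _2\theta _2>0\), and that the basic reproduction number has the form \({\mathcal {R}}_0=\beta f'(0)\theta _2\big (S^0+\int _0^\infty \sigma (\theta )v^0(\theta )\textrm{d}\theta \big )/(\varpi -\sigma _1\theta _1-\sigma _2\theta _2)\), which is the expression suggested by the bracket in (45). For \(I(t)>0\), the inequality \(f(I)\le f'(0)I\) then gives

$$\begin{aligned} \frac{\beta f(I(t))\theta _2\big (S^0+\int _0^\infty {v^0(\theta )\sigma (\theta )\textrm{d}\theta }\big )}{(\varpi -\sigma _1\theta _1-\sigma _2\theta _2)I(t)}-1&\le \frac{\beta f'(0)\theta _2\big (S^0+\int _0^\infty {v^0(\theta )\sigma (\theta )\textrm{d}\theta }\big )}{\varpi -\sigma _1\theta _1-\sigma _2\theta _2}-1 ={\mathcal {R}}_0-1, \end{aligned}$$

and multiplying by the positive factor \((\varpi -\sigma _1\theta _1-\sigma _2\theta _2)I(t)\) yields the stated bound on \(Q_2\).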

Summarizing, we get

$$\begin{aligned} \frac{\textrm{d}L(t)}{\textrm{d}t}&\le -\theta _2\Lambda g\left( \frac{S^0}{S(t)}\right) -\theta _2\mu S^0g\left( \frac{S(t)}{S^0}\right) -\theta _2v^0(\theta )g\left( \frac{v(\theta ,t)}{v^0(\theta )}\right) \Bigg |_{\theta =\infty }-\zeta _1(a)e(a,t)\Big |_{a=\infty }\nonumber \\&\quad -\zeta _2(w)r(w,t)\Big |_{w=\infty }-\theta _2\mu \int _0^\infty {v^0(\theta ) g\left( \frac{v(\theta ,t)}{v^0(\theta )}\right) \textrm{d}\theta }\nonumber \\&\quad -\theta _2\int _0^\infty {h(\theta )v^0(\theta )g\left( \frac{S^0v(\theta ,t)}{S(t)v^0(\theta )}\right) \textrm{d}\theta }+(\varpi -\sigma _1\theta _1-\sigma _2\theta _2)I(t)({\mathcal {R}}_0-1). \end{aligned}$$
(46)

Thus, it follows from (46) that \(\frac{\textrm{d}L(t)}{\textrm{d}t}\le 0\) if \({\mathcal {R}}_0<1\). Moreover, \(\frac{\textrm{d}L(t)}{\textrm{d}t}=0\) when \(S(t)=S^0\), \(v(\theta ,t)=v^0(\theta )\), \(e(a,t)=0\), \(I(t)=0\) and \(r(w,t)=0\) hold simultaneously. Therefore, the singleton \(\{E^0\}\subset \Omega \) is the largest invariant subset of the set where \(\frac{\textrm{d}L(t)}{\textrm{d}t}=0\). By the LaSalle invariance principle, the disease-free equilibrium \(E^0\) is globally asymptotically stable when \({\mathcal {R}}_0<1\). This proves Theorem 6. \(\square \)

4.4 Stability of the endemic equilibrium \(E^*\)

In this subsection, we prove the global stability of the endemic equilibrium \(E^*\) by constructing a suitable Lyapunov function.

We have the following lemma based on Remark 1 and Proposition A.1 in Sigdel and McCluskey (2014).

Lemma 4

Define

$$\begin{aligned} G(I)=g\left( \frac{f(I)}{f(I^*)}\right) -g\left( \frac{I}{I^*}\right) . \end{aligned}$$

Then the assumption (A4) ensures that \(G(I)\le 0\) holds for all \(I >0\).
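As an informal numerical illustration of Lemma 4 (not a proof), the sketch below evaluates \(G(I)\) on a grid with \(g(x)=x-1-\ln x\). The saturated incidence \(f(I)=I/(1+cI)\), the value of `c` and the endemic level `I_star` are hypothetical choices made only for this example; the incidence is chosen because it is increasing and concave with \(f(0)=0\).

```python
import numpy as np

def g(x):
    """Volterra-type function g(x) = x - 1 - ln x, nonnegative for x > 0."""
    return x - 1.0 - np.log(x)

def f(I, c=0.5):
    """Hypothetical saturated incidence: increasing, concave, f(0) = 0."""
    return I / (1.0 + c * I)

def G(I, I_star):
    """G(I) = g(f(I)/f(I*)) - g(I/I*), as in Lemma 4."""
    return g(f(I) / f(I_star)) - g(I / I_star)

I_star = 2.0                              # illustrative endemic level (assumption)
I_grid = np.linspace(0.01, 50.0, 5000)    # grid of positive values of I
G_values = G(I_grid, I_star)
print("largest value of G on the grid:", G_values.max())
```

On this grid the largest value should not exceed zero beyond rounding error, which is consistent with the lemma for this particular choice of \(f\).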

Now, we give the following theorem.

Theorem 7

The endemic equilibrium \(E^*\) is globally asymptotically stable in \({\textbf{Y}}\) if \({\mathcal {R}}_0>1\).

Proof

Define the following Lyapunov function:

$$\begin{aligned} H(t)=H_1(t)+H_2(t)+H_3(t)+H_4(t)+H_5(t), \end{aligned}$$

where

$$\begin{aligned} {H_1(t)}&={\theta _2}{S^*}g\left( {\frac{S(t)}{{{S^*}}}} \right) ,\quad ~~ H_2(t)=\theta _2\int _0^\infty v^*(\theta ) g\left( \frac{v(\theta ,t)}{v^*(\theta )} \right) \textrm{d}\theta ,\\ H_3(t)&=\int _0^\infty {\zeta _1(a)e^*(a) g\left( \frac{e(a,t)}{e^*(a)} \right) \textrm{d}a},~\quad ~ H_4(t)={I^*}g\left( {\frac{I(t)}{{{I^*}}}} \right) ,\\ H_5(t)&=\int _0^\infty {\zeta _2(w)r^*(w) g\left( \frac{r(w,t)}{r^*(w)} \right) \textrm{d}w}. \end{aligned}$$

By \(\mu =\frac{\Lambda }{S^*}-\alpha +\frac{1}{S^*} \int _0^\infty {h(\theta )v^*(\theta )\textrm{d}\theta }-\beta f(I^*)\), similar to the process of solving (21), the derivative of \(H_1(t)\) along the system (1) is given by

$$\begin{aligned} \frac{\textrm{d}H_1(t)}{\textrm{d}t}&={\theta _2}\Lambda \left( 2 - \frac{S(t)}{S^*}-\frac{S^*}{S(t)} \right) +\beta \theta _2S^*f(I^*) \left( \frac{S(t)}{S^*}-\frac{S(t)f(I(t))}{S^*f(I^*)}-1+\frac{f(I(t))}{f(I^*)}\right) \nonumber \\&\quad +\theta _2\int _0^\infty {h(\theta )v^*(\theta )\left( \frac{v(\theta ,t)}{v^*(\theta )}-\frac{S(t)}{S^*}-\frac{S^*v(\theta ,t)}{S(t)v^*(\theta )}+1\right) \textrm{d}\theta }. \end{aligned}$$
(47)

Similar to (25), we have

$$\begin{aligned} \frac{\textrm{d}H_2(t)}{\textrm{d}t}&=\theta _2 v^*(0)g\left( \frac{v(0,t)}{v^*(0)}\right) -\theta _2 v^*(\theta )g\left( \frac{v(\theta ,t)}{v^*(\theta )}\right) \Bigg |_{\theta =\infty }\nonumber \\&\quad -\beta \theta _2 \int _0^\infty v^*(\theta )\sigma (\theta )f(I^*)\left( \frac{v(\theta ,t)f(I(t))}{v^*(\theta )f(I^*)}-\frac{f(I(t))}{f(I^*)}-\ln \frac{v(\theta ,t)}{v^*(\theta )}\right) \textrm{d}\theta \nonumber \\&\quad - \theta _2 \int _0^\infty {v^*(\theta )\big (h(\theta )+\mu \big )g\left( \frac{v(\theta ,t)}{v^*(\theta )} \right) \textrm{d}\theta }. \end{aligned}$$
(48)

The same reasoning leads to

$$\begin{aligned} \frac{\textrm{d}H_3(t)}{\textrm{d}t}&=\zeta _1(0)e^*(0)g\left( \frac{e(0,t)}{e^*(0)}\right) - \zeta _1(a)e^*(a)g\left( \frac{e(a,t)}{e^*(a)}\right) \Bigg |_{a=\infty }\nonumber \\&\quad -\int _0^\infty {k(a)e^*(a)g\left( \frac{e(a,t)}{e^*(a)}\right) \textrm{d}a}. \end{aligned}$$
(49)

By \(\varpi =\frac{1}{I^*}\big (\int _0^\infty {k(a)e^*(a)\textrm{d}a}+\int _0^\infty {\gamma (w)r^*(w)\textrm{d}w}\big )\), analogously to (21), we get

$$\begin{aligned} \frac{\textrm{d}H_4(t)}{\textrm{d}t}&=\int _0^\infty {k(a)e^*(a)\left( \frac{e(a,t)}{e^*(a)}-\frac{I(t)}{I^*}-\frac{I^*e(a,t)}{I(t)e^*(a)}+1\right) \textrm{d}a}\nonumber \\&\quad +\int _0^\infty \gamma (w)r^*(w)\left( \frac{r(w,t)}{r^*(w)}-\frac{I(t)}{I^*}-\frac{I^*r(w,t)}{I(t)r^*(w)}+1\right) \textrm{d}w. \end{aligned}$$
(50)

The same reasoning as (25) gives

$$\begin{aligned} \frac{\textrm{d}H_5(t)}{\textrm{d}t}&=\zeta _2(0)r^*(0)g\left( \frac{r(0,t)}{r^*(0)}\right) - \zeta _2(w)r^*(w)g\left( \frac{r(w,t)}{r^*(w)}\right) \Bigg |_{w=\infty }\nonumber \\&\quad -\int _0^\infty {\gamma (w)r^*(w)g\left( \frac{r(w,t)}{r^*(w)}\right) \textrm{d}w}. \end{aligned}$$
(51)

Using \(g\left( \frac{r^*(0)I(t)}{r(0,t)I^*}\right) =g(1)=0\) and \(g\left( \frac{v^*(0)S(t)}{v(0,t)S^*}\right) =g(1)=0\), the simplification of (47)-(51) leads to

$$\begin{aligned} \frac{\textrm{d}H(t)}{\textrm{d}t}&= -\theta _2 v^*(\theta )g\left( \frac{v(\theta ,t)}{v^*(\theta )}\right) \Bigg |_{\theta =\infty }-\zeta _1(a)e^*(a)g\left( \frac{e(a,t)}{e^*(a)}\right) \Bigg |_{a=\infty }\nonumber \\&\quad -\zeta _2(w)r^*(w)g\left( \frac{r(w,t)}{r^*(w)}\right) \Bigg |_{w=\infty }-\mu \theta _2 \int _0^\infty {v^*(\theta ) g\left( \frac{v(\theta ,t)}{v^*(\theta )} \right) \textrm{d}\theta }\nonumber \\&\quad +\theta _2\left( g\left( \frac{f(I(t))}{f(I^*)}\right) - g\left( \frac{I(t)}{I^*}\right) \right) \left( \beta S^* f(I^*)+\int _0^\infty \beta v^*(\theta )f(I^*)\sigma (\theta )\textrm{d}\theta \right) \nonumber \\&\quad -\beta \theta _2S^*f(I^*)g\left( \frac{S(t)f(I(t))e^*(0)}{S^*f(I^*)e(0,t)}\right) -\theta _2\int _0^\infty {h(\theta )v^*(\theta )g\left( \frac{S^*v(\theta ,t)}{S(t)v^*(\theta )}\right) \textrm{d}\theta } \nonumber \\&\quad -\beta \theta _2 \int _0^\infty v^*(\theta )\sigma (\theta )f(I^*)g\left( \frac{v(\theta ,t)f(I(t))e^*(0)}{v^*(\theta )f(I^*)e(0,t)}\right) \textrm{d}\theta \nonumber \\&\quad - \theta _2\mu S^*g \left( \frac{S(t)}{S^*}\right) -\theta _2\Lambda g\left( \frac{S^*}{S(t)}\right) -\sigma _2\theta _2I^*g\left( \frac{e^*(0)I(t)}{e(0,t)I^*}\right) . \end{aligned}$$
(52)

By Lemma 4, the fifth term of (52) is nonpositive. In summary, \(\frac{\textrm{d}H(t)}{\textrm{d}t}\le 0\), and \(\frac{\textrm{d}H(t)}{\textrm{d}t}=0\) if and only if \(S(t)=S^*\), \(v(\theta ,t)=v^*(\theta )\), \(e(a,t)=e^*(a)\), \(I(t)=I^*\) and \(r(w,t)=r^*(w)\) hold simultaneously. Hence \(\{E^*\} \subset \Omega \) is the largest invariant subset of the set where \(\frac{\textrm{d}H(t)}{\textrm{d}t}=0\). Thus, by the LaSalle invariance principle, the endemic equilibrium \(E^*\) is globally asymptotically stable in \({\textbf{Y}}\) if \({\mathcal {R}}_0>1\). This completes the proof. \(\square \)

5 Parameters estimation and sensitivity analysis

In this section, we estimate the parameters of the system (1) using the number of new TB cases per year in China from 2007 to 2020. In addition, we conduct a sensitivity analysis and predict whether China will be able to achieve the World Health Organization’s (WHO) goal of reducing TB incidence by \(80\%\) in 2030 compared to 2015.

5.1 Parameters estimation

According to data from the National Bureau of Statistics of China (NBSC) (National Bureau of Statistics of China 2023), the average annual number of newborns and the average life expectancy over the 2007–2020 period are 16,289,670 persons \(\mathrm { year^{-1}}\) and 76.34 years, respectively, and the initial susceptible population is 1,314,480,000 persons. Thus, we take \(\Lambda = 16,289,670\), \(\mu = 1/76.34\) and \(S(0)=1,314,480,000\). The number of new TB cases from 2007 to 2020, taken from the Chinese Center for Disease Control and Prevention (Chinese Center for Disease Control and Prevention 2023), is shown in Table 1.

Table 1 The number of new cases of TB in China from 2007 to 2020

In the numerical simulations, we take one year as the time unit, neglecting durations of only a few weeks, and choose \(f(I) = I\). Next, we estimate the unknown parameters and initial values of the system (1). Set

$$\begin{aligned} {\mathcal {Q}}=\big (\beta ,\alpha ,\delta _1,\delta _2,\sigma _1,\sigma _2,d_1,d_2, c_1,c_2,k_1,k_2,\gamma _1,\gamma _2,v(0),e(0),I(0),r(0)\big ), \end{aligned}$$

where we assume \(h(a)=d_1e^{-d_2a},~ \sigma (a)=c_1e^{-c_2a},~ k(a)=k_1e^{-k_2a},~ \gamma (a)=\gamma _1e^{-\gamma _2a},~ v_0(a)=v(0)c_2e^{-c_2a},~ e_0(a)=e(0)k_2e^{-k_2a},~ r_0(a)=r(0)\gamma _2e^{-\gamma _2a}\).

Let \(T({\mathcal {Q}},t)\) denote the number of new TB cases of the system (1) in the \(t^{th}\) year. Then \(T({\mathcal {Q}},t)\) can be written as \( T({\mathcal {Q}},t)=X(t)-X(t-1), \) where X(t) is the cumulative number of TB cases up to the \(t^{th}\) year. The function X(t) satisfies

$$\begin{aligned} \frac{\textrm{d}X(t)}{\textrm{d}t}=\int _0^\infty k(a)e(a,t)\textrm{d}a+ \int _0^\infty \gamma (w)r(w,t)\textrm{d}w. \end{aligned}$$

We use \(T({\mathcal {Q}},t)\) to simulate the annual number of new TB cases. The unknown parameters and initial values of the system (1) are estimated with the Grey Wolf Optimizer (GWO) algorithm (Mirjalili et al. 2014), and the results are shown in Table 2.
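For orientation, the basic GWO iteration of Mirjalili et al. (2014) can be sketched as follows. This is a simplified illustration and not the implementation behind Table 2: the function name `grey_wolf_optimizer`, the population size, the iteration count and the toy objective are all assumptions, and the three leading wolves are simply re-selected from the current population at every iteration. In the fitting problem, the objective would be a misfit such as the sum of squared differences between \(T({\mathcal {Q}},t)\) and the reported annual cases.

```python
import numpy as np

def grey_wolf_optimizer(objective, lower, upper, n_wolves=20, n_iter=200, seed=0):
    """Minimise `objective` over the box [lower, upper] with a basic GWO loop."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    wolves = rng.uniform(lower, upper, size=(n_wolves, dim))
    best_x, best_f = None, np.inf
    for it in range(n_iter):
        fitness = np.array([objective(wolf) for wolf in wolves])
        order = np.argsort(fitness)
        leaders = wolves[order[:3]].copy()        # alpha, beta, delta wolves
        if fitness[order[0]] < best_f:            # remember the best point seen
            best_f = float(fitness[order[0]])
            best_x = leaders[0].copy()
        a = 2.0 * (1.0 - it / n_iter)             # decreases linearly from 2 to 0
        new_positions = np.zeros_like(wolves)
        for leader in leaders:
            r1 = rng.random((n_wolves, dim))
            r2 = rng.random((n_wolves, dim))
            A = 2.0 * a * r1 - a                  # exploration/exploitation factor
            C = 2.0 * r2
            D = np.abs(C * leader - wolves)       # distance to this leader
            new_positions += leader - A * D
        # Each wolf moves to the average of the three leader-guided positions.
        wolves = np.clip(new_positions / 3.0, lower, upper)
    return best_x, best_f

def sphere(x):
    """Toy objective standing in for the least-squares misfit (assumption)."""
    return float(np.sum(x ** 2))

best_x, best_f = grey_wolf_optimizer(sphere, lower=[-5.0] * 3, upper=[5.0] * 3)
print("best objective value found:", best_f)
```

The box constraints play the role of the admissible ranges for the entries of \({\mathcal {Q}}\); the ranges actually used for the fit are not given here.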

Table 2 The parameter values of the system (1)

The simulation results are shown in Fig. 2.

Fig. 2 Fitting result of annual new TB cases in China for system (1)

Fig. 2 shows that the simulation results fit the number of new TB cases in China from 2007 to 2020 well. Based on the parameters in Table 2, we get the basic reproduction number \({\mathcal {R}}_0=1.1938>1\). By Theorem 7, when \({\mathcal {R}}_0>1\), the endemic equilibrium \(E^*\) is globally asymptotically stable in \({\textbf{Y}}\), so under these parameter values TB is expected to persist in China rather than die out.

5.2 Sensitivity analysis

To assess the reliability of the estimates of \(\beta \), \(\alpha \), \(\delta _2\), \(\sigma _1\), \(\sigma _2\) obtained with the GWO algorithm, we employ the Markov Chain Monte Carlo method with the Delayed Rejection and Adaptive Metropolis algorithm (Haario et al. 2006). The convergence of the chain is confirmed by using Geweke’s Z-scores. The mean, standard deviation, \(95\%\) confidence intervals and Z-scores of the estimated parameter values are shown in Table 3, and the fitting result can be seen in Fig. 3.
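The idea behind a Geweke-type check can be sketched as follows: the mean of an early segment of the chain is compared with the mean of a late segment, and a Z-score far from zero suggests that the chain has not yet settled. The sketch is deliberately simplified and is an assumption-laden stand-in for the diagnostic used above: it uses plain sample variances (treating draws as uncorrelated) rather than spectral variance estimates, and the 10%/50% split, the function name `geweke_z` and the synthetic chains are illustrative choices.

```python
import numpy as np

def geweke_z(chain, first=0.1, last=0.5):
    """Simplified Geweke-type Z-score comparing early and late chain means."""
    chain = np.asarray(chain, dtype=float)
    n = chain.size
    head = chain[: int(first * n)]            # first 10% of the draws
    tail = chain[n - int(last * n):]          # last 50% of the draws
    # Simplification: sample variances, ignoring autocorrelation in the chain.
    var_head = head.var(ddof=1) / head.size
    var_tail = tail.var(ddof=1) / tail.size
    return (head.mean() - tail.mean()) / np.sqrt(var_head + var_tail)

rng = np.random.default_rng(1)
stationary = rng.normal(0.0, 1.0, size=5000)                              # settled chain
drifting = np.linspace(0.0, 5.0, 5000) + rng.normal(0.0, 1.0, size=5000)  # still moving
z_stationary = geweke_z(stationary)
z_drifting = geweke_z(drifting)
print("stationary chain:", z_stationary, "drifting chain:", z_drifting)
```

For real MCMC output the draws are autocorrelated, so the simplified variances above would tend to overstate the magnitude of the Z-score.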

Table 3 The parameter values of the system (1)
Fig. 3 The fitting results of the number of new TB cases from 2007 to 2020. The solid black line represents the fitted data, and the red dots represent the actual data. The darkest to lightest areas represent \(50\%\), \(90\%\), \(95\%\) and \(99\%\) confidence intervals, respectively

We perform a sensitivity analysis using the Partial Rank Correlation Coefficient (PRCC) method. The parameters \(\delta _1\), \(d_1\), \(d_2\), \(c_1\), \(c_2\), \(k_1\), \(k_2\), \(\gamma _1\), \(\gamma _2\), v(0), e(0), I(0), r(0) are held fixed, and the other parameters are shown in Table 3. We calculate the PRCC between each of these parameters and the basic reproduction number \({\mathcal {R}}_0\), as shown in Fig. 4.
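A minimal sketch of how PRCC values can be computed is given below: inputs and output are rank-transformed, the linear effect of the other inputs is removed from both, and the residuals are correlated. The function names, the sample size and the toy output function are assumptions for illustration; the toy output is not the expression for \({\mathcal {R}}_0\), and no sampling design for the parameters is implied.

```python
import numpy as np

def rank(values):
    """Ranks 1..n (ties are not averaged; inputs are assumed continuous)."""
    order = np.argsort(values)
    ranks = np.empty(values.size, dtype=float)
    ranks[order] = np.arange(1, values.size + 1)
    return ranks

def prcc(samples, output):
    """Partial rank correlation of each column of `samples` with `output`."""
    n, p = samples.shape
    R = np.column_stack([rank(samples[:, j]) for j in range(p)])
    y = rank(output)
    coeffs = np.empty(p)
    for j in range(p):
        # Regress out the ranks of all other inputs (plus an intercept).
        others = np.column_stack([np.ones(n), np.delete(R, j, axis=1)])
        res_x = R[:, j] - others @ np.linalg.lstsq(others, R[:, j], rcond=None)[0]
        res_y = y - others @ np.linalg.lstsq(others, y, rcond=None)[0]
        coeffs[j] = np.corrcoef(res_x, res_y)[0, 1]
    return coeffs

rng = np.random.default_rng(2)
samples = rng.uniform(0.0, 1.0, size=(500, 3))     # three hypothetical parameters
# Toy output: increases with input 0, decreases with input 1, ignores input 2.
output = 3.0 * samples[:, 0] - 2.0 * samples[:, 1] + 0.01 * rng.normal(size=500)
coeffs = prcc(samples, output)
print("PRCC values:", coeffs)
```

In this toy case the first coefficient should be strongly positive, the second strongly negative and the third close to zero, which is the kind of pattern read off from a PRCC bar chart.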

Fig. 4 The PRCC values

As can be seen from Fig. 4, \(\beta \), \(\sigma _1\) and \(\alpha \) have a large effect on the basic reproduction number \({\mathcal {R}}_0\). Thus, we examine whether the WHO goal can be met by 2030 by changing the values of \(\beta \), \(\sigma _1\) and \(\alpha \). We use the parameter values in Table 2 as the baseline for comparing the control effects, while also considering the effect of implementing a mixture of control measures. First, we consider decreasing \(\beta \) by \(50\%\) while increasing \(\sigma _1\) by \(50\%\), and decreasing \(\beta \) by \(95\%\) while increasing \(\sigma _1\) by \(95\%\); see Fig. 5a. Similarly, we consider changing the values of \(\beta \) and \(\alpha \); see Fig. 5b. Finally, we consider changing the values of \(\sigma _1\) and \(\alpha \); see Fig. 5c.

Fig. 5 a Predicted number of new TB cases for different values of \(\beta \) and \(\sigma _1\). b Predicted number of new TB cases for different values of \(\beta \) and \(\alpha \). c Predicted number of new TB cases for different values of \(\sigma _1\) and \(\alpha \)

Based on the simulation results in Fig. 5, it is difficult for China to reach the goal set by the WHO. The reason for this is that people are already more aware of self-protection, such as wearing masks and keeping social distance, and learn about preventive measures through various media reports. The vaccinated population consists mainly of newborns, and the vaccination rate is high. Therefore, the current treatment measures are already relatively good.

As a result, prevention and control measures should be further optimized in light of the actual situation. For instance, the capacity of medical institutions should be strengthened to improve early diagnosis and treatment of TB. This involves promoting advanced diagnostic techniques and drug susceptibility testing, and providing high-quality and sustained anti-tuberculosis drug treatment to ensure that patients can complete the full course of treatment. The government should also increase investment in public education and medical resources in low-income areas. In summary, to eliminate TB, the entire society must work together.

6 Optimal control problem

In this section, we discuss the optimal control problem. Optimal control theory is a method for finding control strategies that minimize or maximize an objective function, which opens up new ideas in the study of infectious diseases and the development of control strategies.

First, the sensitivity analysis showed that changing the parameters to which \({\mathcal {R}}_0\) is sensitive did not have a significant effect. Therefore, we use the treatment rate \(\sigma _1\) of the TB-infected population and the relapse rate \(\gamma (w)\) of the recovered individuals as control parameters. Two control variables \(l_1(t)\) and \(l_2(w,t)\) are introduced, where \(l_1(t)\) represents the strategy of reinforced treatment and \(l_2(w,t)\) represents the strategy of controlling relapse in recovered individuals. Since f(I) is monotonically increasing and concave, we may choose \(f(I)=I\). Substituting this into the system (1)-(3), we obtain the corresponding optimal control problem:

$$\begin{aligned} \left\{ \begin{array}{l} \frac{{\textrm{d}S(t)}}{{\textrm{d}t}} = \Lambda - (\mu +\alpha ) S(t) - \beta S(t)I(t)+ \int _0^\Theta {h (\theta )v(\theta ,t)\textrm{d}\theta },\\ \frac{{\partial v(\theta ,t)}}{{\partial \theta }} + \frac{{\partial v(\theta ,t)}}{{\partial t}} = -\big (n(\theta )+\beta I(t)\sigma (\theta )\big )v(\theta ,t),\\ \frac{{\partial e(a,t)}}{{\partial a}} + \frac{{\partial e(a,t)}}{{\partial t}} = - \varepsilon (a)e(a,t), \\ \frac{{\textrm{d}I(t)}}{{\textrm{d}t}} = \int _0^A {k(a)e(a,t)\textrm{d}a} + \int _0^W \gamma (w)\big (1-l_2(w,t)\big )r(w,t)\textrm{d}w\\ \qquad \qquad -\big [\mu +\delta _2+\sigma _1\big (1+l_1(t)\big )+\sigma _2\big ]I(t), \\ \frac{{\partial r(w,t)}}{{\partial w}} + \frac{{\partial r(w,t)}}{{\partial t}} =-[\mu +\gamma (w)\big (1-l_2(w,t)\big )]r(w,t),\\ S(0)=S_0,~ v\left( \theta ,0 \right) =v_0(\theta ),~ e\left( a,0 \right) =e_0(a),~ I(0)=I_0,~ r\left( w,0 \right) =r_0(w),\\ v\left( 0,t\right) =\alpha S(t),~ e\left( 0,t \right) =\beta S(t)I(t)+\int _0^\Theta {\beta I(t)\sigma (\theta )v(\theta ,t)\textrm{d}\theta }+\sigma _2I(t),\\ r\left( 0,t \right) =\sigma _1\big (1+l_1(t)\big )I(t). \end{array} \right. \end{aligned}$$
(53)
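The transport equations in (53) all have the form \(\partial _w r+\partial _t r=-m(w,t)r\) with an inflow condition at age zero. One common way to integrate such equations numerically, sketched below, is to step along the characteristics with equal age and time steps. This is an illustrative scheme under stated assumptions, not necessarily the one used for the simulations in this paper: the function name, the step size, the maximal age \(W=10\), the constant inflow and the constant removal rate \(m=\mu \) are all hypothetical choices (in (53) the removal rate of \(r\) is \(\mu +\gamma (w)(1-l_2(w,t))\)).

```python
import numpy as np

def step_along_characteristics(r_old, removal_rate, boundary_value, dt):
    """Advance r_w + r_t = -m(w) r by one step dt, using age step dw = dt."""
    r_new = np.empty_like(r_old)
    r_new[0] = boundary_value                                  # inflow r(0, t)
    # Individuals of age w_j move to age w_{j+1} and are removed at rate m(w_j).
    r_new[1:] = r_old[:-1] * np.exp(-removal_rate[:-1] * dt)
    return r_new

dt = 0.05
w = np.arange(0.0, 10.0 + dt / 2, dt)    # age grid on [0, W] with W = 10 (assumption)
mu = 1.0 / 76.34                         # natural death rate from Section 5.1
m = np.full_like(w, mu)                  # constant removal rate for the demo
r = np.zeros_like(w)                     # empty initial age distribution
inflow = 100.0                           # constant boundary value (assumption)
for _ in range(w.size):                  # run long enough to fill the age grid
    r = step_along_characteristics(r, m, inflow, dt)
print("r at maximal age:", r[-1])
```

With constant inflow and removal rate, the profile settles to \(r(w)=r(0)e^{-\mu w}\), which gives a simple check of the scheme before age-dependent rates and controls are introduced.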

The control set is defined by

$$\begin{aligned} {\mathcal {L}}=\Big \{{\mathfrak {L}}=(l_1,l_2) \in \big (L^\infty (0,T),L^\infty ((0,W)\times (0,T))\big )\big |0\le l_1\le l_{1\textrm{max}},~ 0\le l_2\le l_{2\textrm{max}}\Big \}, \end{aligned}$$

where \(l_{1\textrm{max}}\) and \(l_{2\textrm{max}}\) denote the maximum values of the control variables. To account for the cost of disease control, we define the following objective function:

$$\begin{aligned} {\mathcal {J}}({\mathfrak {L}})=\int _0^T\left( \int _0^A C_1e(a,t)\textrm{d}a+C_2I(t)+\frac{1}{2}B_1l_1^2(t)+\frac{1}{2}B_2\int _0^Wl_2^2(w,t)\textrm{d}w\right) \textrm{d}t, \end{aligned}$$
(54)

where \(C_1\) and \(C_2\) are positive weights, and \(B_1\) and \(B_2\) are the costs of the controls. The cost of each measure is assumed to be a quadratic function of the control. We seek an optimal control \({\mathfrak {L}}^*\) that minimizes \({\mathcal {J}}\), i.e., that minimizes the number of latent and infectious individuals together with the cost of the control inputs.
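Once the state and control variables are available on grids, the objective (54) can be approximated by numerical quadrature. The sketch below uses the trapezoidal rule; the function names, array layouts (age index first, time index second), grid sizes and the constant test data are assumptions chosen only to make the example self-contained.

```python
import numpy as np

def trapezoid(values, grid, axis=-1):
    """Trapezoidal rule along `axis` on a possibly non-uniform grid."""
    values = np.moveaxis(np.asarray(values, dtype=float), axis, -1)
    widths = np.diff(grid)
    return np.sum(0.5 * (values[..., 1:] + values[..., :-1]) * widths, axis=-1)

def objective(e, I, l1, l2, a, w, t, C1, C2, B1, B2):
    """Approximate J in (54); e is (len(a), len(t)) and l2 is (len(w), len(t))."""
    latent_cost = C1 * trapezoid(e, a, axis=0)                 # integral over age a
    relapse_control_cost = 0.5 * B2 * trapezoid(l2 ** 2, w, axis=0)
    integrand = latent_cost + C2 * I + 0.5 * B1 * l1 ** 2 + relapse_control_cost
    return trapezoid(integrand, t)                             # integral over time

# Constant test data (hypothetical), for which the integral is known exactly.
a = np.linspace(0.0, 2.0, 21)
w = np.linspace(0.0, 4.0, 41)
t = np.linspace(0.0, 10.0, 101)
e = np.ones((a.size, t.size))
I = 3.0 * np.ones(t.size)
l1 = 0.5 * np.ones(t.size)
l2 = 0.5 * np.ones((w.size, t.size))
J_demo = objective(e, I, l1, l2, a, w, t, C1=1.0, C2=2.0, B1=4.0, B2=8.0)
print("J for the constant test data:", J_demo)
```

For these constant inputs the integrand equals \(2+6+0.5+4=12.5\) at every time, so the value over \(T=10\) is \(125\), which the quadrature reproduces up to rounding.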

To obtain the necessary conditions for the control problem, we use the Gâteaux derivative rule (Kang 2009) and introduce the perturbed controls \(l_1^\epsilon (t)=l_1(t)+\epsilon j_1(t),\ l_2^\epsilon (w,t)=l_2(w,t)+\epsilon j_2(w,t)\), where \(j_1(t)\), \(j_2(w,t)\) are called variation functions and \(0<\epsilon <1\). We use \({\overline{S}},~ {\overline{v}},~ {\overline{e}},~ {\overline{I}},~ {\overline{r}}\) to denote the Gâteaux derivatives of S, v, e, I, r with respect to \(\epsilon \), also known as sensitivities. Therefore, we have

$$\begin{aligned} \left\{ \begin{array}{l} \frac{{\textrm{d}{{\overline{S}}}(t)}}{{\textrm{d}t}} = - (\mu +\alpha ){\overline{S}}(t) - \beta {{\overline{S}}}(t)I(t)-\beta S(t){{\overline{I}}}(t)+ \int _0^\Theta h (\theta ){{\overline{v}}}(\theta ,t)\textrm{d}\theta ,\\ \frac{{\partial {{\overline{v}}}(\theta ,t)}}{{\partial \theta }} + \frac{{\partial {{\overline{v}}}(\theta ,t)}}{{\partial t}} = -\big (n(\theta )+\beta I(t)\sigma (\theta )\big ){{\overline{v}}}(\theta ,t)-\beta {{\overline{I}}}(t)\sigma (\theta )v(\theta ,t),\\ \frac{{\partial {{\overline{e}}}(a,t)}}{{\partial a}} + \frac{{\partial {{\overline{e}}}(a,t)}}{{\partial t}} = - \varepsilon (a){{\overline{e}}}(a,t), \\ \frac{{\textrm{d}{{\overline{I}}}(t)}}{{\textrm{d}t}} = \int _0^A {k(a){{\overline{e}}}(a,t)\textrm{d}a} + \int _0^W \gamma (w)\big (1-l_2(w,t)\big ){{\overline{r}}}(w,t)\textrm{d}w\\ \quad \quad \qquad -\int _0^W \gamma (w)j_2(w,t)r(w,t)\textrm{d}w -\varpi {{\overline{I}}}(t)-\sigma _1l_1(t){{\overline{I}}}(t)-\sigma _1j_1(t)I(t), \\ \frac{{\partial {{\overline{r}}}(w,t)}}{{\partial w}} + \frac{{\partial {{\overline{r}}}(w,t)}}{{\partial t}} =\big (-u(w)+l_2(w,t) \gamma (w)\big ){{\overline{r}}}(w,t)+j_2(w,t)\gamma (w)r(w,t),\\ {{\overline{S}}}(0)={{\overline{v}}}(\theta ,0)={{\overline{e}}}(a,0)={{\overline{I}}}(0)={{\overline{r}}}(w,0)=0,\\ {{\overline{v}}}\left( 0,t\right) =\alpha {{\overline{S}}}(t),\\ {{\overline{e}}}\left( 0,t \right) =\beta {{\overline{S}}}(t)I(t)+\beta S(t){{\overline{I}}}(t)+\int _0^\Theta \beta {{\overline{I}}}(t)\sigma (\theta )v(\theta ,t)\textrm{d}\theta \\ \quad \qquad \qquad +\int _0^\Theta \beta I(t)\sigma (\theta ){{\overline{v}}}(\theta ,t)\textrm{d}\theta +\sigma _2{{\overline{I}}}(t),\\ {{\overline{r}}}\left( 0,t \right) =\sigma _1\big (1+l_1(t)\big ){{\overline{I}}}(t)+\sigma _1j_1(t)I(t). \end{array} \right. \end{aligned}$$
(55)

To obtain the adjoint system, we introduce the adjoint variables \(\lambda _1(t),~ \lambda _2(\theta ,t), \lambda _3(a,t),~ \lambda _4(t)\) and \(\lambda _5(w,t)\). By using the initial condition \(\overline{S}(0)=0\) and the transversality condition \(\lambda _1(T)=0\), with integration by parts, the first equation of the system (55) can be expressed as

$$\begin{aligned} 0&=\Bigg \langle \frac{{\textrm{d}{{\overline{S}}}(t)}}{{\textrm{d}t}} + (\mu +\alpha ) {{\overline{S}}}(t) + \beta {{\overline{S}}}(t)I(t)+\beta S(t){{\overline{I}}}(t)-\int _0^\Theta h (\theta ){{\overline{v}}}(\theta ,t)\textrm{d}\theta , \lambda _1(t) \Bigg \rangle \nonumber \\&=\Bigg \langle {{\overline{S}}}(t), -\frac{\textrm{d}\lambda _1(t)}{\textrm{d}t} + (\mu +\alpha ) \lambda _1(t) + \beta I(t)\lambda _1(t)\Bigg \rangle +\int _0^T \beta {{\overline{I}}}(t)S(t)\lambda _1(t)\textrm{d}t\nonumber \\&\quad -\int _0^T \int _0^\Theta \lambda _1(t)h(\theta ){{\overline{v}}}(\theta ,t)\textrm{d}\theta \textrm{d}t, \end{aligned}$$
(56)

where \(\big \langle f,g \big \rangle =\int _0^T fg \textrm{d}t\). Similarly, using the initial condition \({{\overline{v}}}(\theta ,0)=0\), the boundary condition \({{\overline{v}}}(0,t)=\alpha {{\overline{S}}}(t)\) and the conditions \(\lambda _2(\Theta ,t)=0\), \(\lambda _2(\theta ,T)=0\), the second equation of the system (55) gives

$$\begin{aligned} 0&=\Bigg \langle \Bigg \langle \frac{{\partial {{\overline{v}}}(\theta ,t)}}{{\partial \theta }} + \frac{{\partial {{\overline{v}}}(\theta ,t)}}{{\partial t}} +\big (n(\theta )+\beta I(t)\sigma (\theta )\big ){{\overline{v}}}(\theta ,t)+\beta {{\overline{I}}}(t)\sigma (\theta )v(\theta ,t), \lambda _2(\theta ,t)\Bigg \rangle \Bigg \rangle \nonumber \\&=\Bigg \langle \Bigg \langle {{\overline{v}}}(\theta ,t), -\frac{{\partial \lambda _2(\theta ,t)}}{{\partial \theta }} -\frac{{\partial \lambda _2(\theta ,t)}}{{\partial t}} +\big (n(\theta )+\beta I(t)\sigma (\theta )\big ) \lambda _2(\theta ,t)\Bigg \rangle \Bigg \rangle \nonumber \\&\quad +\int _0^T\int _0^\Theta \lambda _2(\theta ,t)\beta {{\overline{I}}}(t)\sigma (\theta )v(\theta ,t)\textrm{d}\theta \textrm{d}t-\int _0^T\alpha {{\overline{S}}}(t)\lambda _2(0,t)\textrm{d}t, \end{aligned}$$
(57)

where \(\big \langle \big \langle f,g \big \rangle \big \rangle =\int _0^T \int _0^A fg \textrm{d}a\textrm{d}t\), with the inner integral taken over the age variable of the equation at hand (\(\theta \in (0,\Theta )\) in (57), and likewise \(w\in (0,W)\) below). Using the initial condition \({{\overline{e}}}(a,0)=0\), the boundary condition \({{\overline{e}}}(0,t)=\beta {{\overline{S}}}(t)I(t)+\beta S(t){{\overline{I}}}(t)+\int _0^\Theta \beta \overline{I}(t)\sigma (\theta )v(\theta ,t)\textrm{d}\theta +\int _0^\Theta \beta I(t)\sigma (\theta )\overline{v}(\theta ,t)\textrm{d}\theta +\sigma _2{{\overline{I}}}(t)\) and the conditions \(\lambda _3(A,t)=0\), \(\lambda _3(a,T)=0\), the third equation of the system (55) yields

$$\begin{aligned} 0&=\Bigg \langle \Bigg \langle \frac{{\partial {{\overline{e}}}(a,t)}}{{\partial a}} + \frac{{\partial {{\overline{e}}}(a,t)}}{{\partial t}}+ \varepsilon (a){{\overline{e}}}(a,t), \lambda _3(a,t)\Bigg \rangle \Bigg \rangle \nonumber \\&=\Bigg \langle \Bigg \langle {{\overline{e}}}(a,t), -\frac{{\partial \lambda _3(a,t)}}{{\partial a}} -\frac{{\partial \lambda _3(a,t)}}{{\partial t}}+ \varepsilon (a)\lambda _3(a,t)\Bigg \rangle \Bigg \rangle \nonumber \\&\quad -\int _0^T\big (\beta {{\overline{S}}}(t)I(t)+\beta S(t){{\overline{I}}}(t)+\sigma _2{{\overline{I}}}(t)\big )\lambda _3(0,t)\textrm{d}t\nonumber \\&\quad -\int _0^T \int _0^\Theta \big (\beta {{\overline{I}}}(t)v(\theta ,t)+\beta I(t){{\overline{v}}}(\theta ,t)\big )\sigma (\theta )\lambda _3(0,t)\textrm{d}\theta \textrm{d}t. \end{aligned}$$
(58)

Using the initial condition \({{\overline{I}}}(0)=0\) and the transversality condition \(\lambda _4(T)=0\), the fourth equation of the system (55) can be reformulated as

$$\begin{aligned} 0&=\Bigg \langle \frac{{\textrm{d}{{\overline{I}}}(t)}}{{\textrm{d}t}}- \int _0^A k(a){{\overline{e}}}(a,t)\textrm{d}a - \int _0^W \gamma (w)\big (1-l_2(w,t)\big ){{\overline{r}}}(w,t)\textrm{d}w\nonumber \\&\quad +\int _0^W \gamma (w)j_2(w,t)r(w,t)\textrm{d}w\nonumber \\&\quad +\varpi {{\overline{I}}}(t)+\sigma _1l_1(t){{\overline{I}}}(t)+\sigma _1j_1(t)I(t), \lambda _4(t) \Bigg \rangle \nonumber \\&=\Bigg \langle {{\overline{I}}}(t), -\frac{\textrm{d}\lambda _4(t)}{\textrm{d}t}+\varpi \lambda _4(t)+\sigma _1l_1(t) \lambda _4(t)\Bigg \rangle -\int _0^T \int _0^A \lambda _4(t)k(a) {{\overline{e}}}(a,t)\textrm{d}a\textrm{d}t\nonumber \\&\quad -\int _0^T \int _0^W \lambda _4(t)\gamma (w)\big (1-l_2(w,t)\big ){{\overline{r}}}(w,t)\textrm{d}w \textrm{d}t\nonumber \\&\quad +\int _0^T \int _0^W \lambda _4(t)\gamma (w)j_2(w,t)r(w,t)\textrm{d}w \textrm{d}t\nonumber \\&\quad +\int _0^T \sigma _1 \lambda _4(t)j_1(t)I(t)\textrm{d}t. \end{aligned}$$
(59)

Under the initial condition \({{\overline{r}}}(w,0)=0\), the boundary condition \({{\overline{r}}}(0,t)=\sigma _1\big (1+l_1(t)\big )\overline{I}(t)+\sigma _1j_1(t)I(t)\) and the conditions \(\lambda _5(W,t)=0\), \(\lambda _5(w,T)=0\), the fifth equation of the system (55) can be expressed as

$$\begin{aligned} 0&=\Bigg \langle \Bigg \langle \frac{{\partial {{\overline{r}}}(w,t)}}{{\partial w}} + \frac{{\partial {{\overline{r}}}(w,t)}}{{\partial t}} +\big (u(w)-l_2(w,t) \gamma (w)\big ){{\overline{r}}}(w,t)\nonumber \\&\quad -j_2(w,t)\gamma (w)r(w,t), \lambda _5(w,t) \Bigg \rangle \Bigg \rangle \nonumber \\&=\Bigg \langle \Bigg \langle {{\overline{r}}}(w,t), -\frac{\partial \lambda _5(w,t)}{\partial w} -\frac{\partial \lambda _5(w,t)}{\partial t}+\big (u(w)-l_2(w,t) \gamma (w)\big )\lambda _5(w,t) \Bigg \rangle \Bigg \rangle \nonumber \\&\quad -\int _0^T \int _0^W j_2(w,t)\gamma (w)\lambda _5(w,t)r(w,t)\textrm{d}w\textrm{d}t\nonumber \\&\quad -\int _0^T \sigma _1\lambda _5(0,t)\left[ \big (1+l_1(t)\big ){{\overline{I}}}(t)+j_1(t)I(t)\right] \textrm{d}t. \end{aligned}$$
(60)

Combining (56)-(60), the adjoint system can be written as

$$\begin{aligned} \left\{ \begin{array}{l} \frac{{\textrm{d}\lambda _1(t)}}{{\textrm{d}t}} = (\mu +\alpha ) \lambda _1(t) + \beta I (t)\lambda _1(t)-\alpha \lambda _2(0,t)-\beta I(t)\lambda _3(0,t),\\ \frac{{\partial \lambda _2(\theta ,t)}}{{\partial \theta }} + \frac{{\partial \lambda _2(\theta ,t)}}{{\partial t}} =\big (n(\theta )+\beta I(t)\sigma (\theta )\big )\lambda _2(\theta ,t)-\lambda _1(t) h(\theta )-\lambda _3(0,t)\beta I(t)\sigma (\theta ),\\ \frac{{\partial \lambda _3(a,t)}}{{\partial a}} + \frac{{\partial \lambda _3(a,t)}}{{\partial t}} = \varepsilon (a)\lambda _3(a,t)-\lambda _4(t) k(a)-C_1,\\ \frac{{\textrm{d}\lambda _4(t)}}{{\textrm{d}t}} = \big (\varpi +\sigma _1l_1(t)\big )\lambda _4(t)+\beta S(t)\lambda _1(t)+ \int _0^\Theta \beta \sigma (\theta )\lambda _2(\theta ,t)v(\theta ,t)\textrm{d}\theta \\ \qquad \qquad \quad -\beta S(t)\lambda _3(0,t)-\int _0^\Theta \beta v(\theta ,t)\sigma (\theta )\lambda _3(0,t)\textrm{d}\theta -\sigma _1\lambda _5(0,t) \big (1+l_1(t)\big )\\ \qquad \qquad \quad -\sigma _2\lambda _3(0,t)-C_2,\\ \frac{{\partial \lambda _5(w,t)}}{{\partial w}} + \frac{{\partial \lambda _5(w,t)}}{{\partial t}}=\big (u(w) -l_2(w,t) \gamma (w)\big )\lambda _5(w,t)-\lambda _4(t)\gamma (w)\big (1-l_2(w,t)\big ), \end{array} \right. \end{aligned}$$
(61)

with the transversality conditions

$$\begin{aligned} \lambda _1(T)=0,~ \lambda _2(\theta ,T)=0,~ \lambda _3(a,T)=0,~ \lambda _4(T)=0,~ \lambda _5(w,T)=0, \end{aligned}$$

and the boundary conditions

$$\begin{aligned} \lambda _2(\Theta ,t)=0,~\lambda _3(A,t)=0,~\lambda _5(W,t)=0. \end{aligned}$$

The Gâteaux derivative of the objective function \({\mathcal {J}}({\mathfrak {L}})\) with respect to the control variable \({\mathfrak {L}}\) is

$$\begin{aligned} \frac{\textrm{d}{\mathcal {J}}}{\textrm{d}\epsilon }\bigg |_{\epsilon =0^+}&=\lim _{\epsilon \rightarrow 0^+}\frac{{\mathcal {J}}(l_1+\epsilon j_1, l_2+\epsilon j_2)-{\mathcal {J}}(l_1, l_2)}{\epsilon }\nonumber \\&=\int _0^T \left( \int _0^A C_1{{\overline{e}}}(a,t)\textrm{d}a+C_2{{\overline{I}}}(t)+B_1l_1(t)j_1(t)\right. \nonumber \\&\quad \left. +\int _0^W B_2 l_2(w,t)j_2(w,t)\textrm{d}w\right) \textrm{d}t. \end{aligned}$$
(62)

Theorem 8

If there exists \({\mathfrak {L}}^*\in {\mathcal {L}}\) that minimizes \({\mathcal {J}}({\mathfrak {L}})\), then

$$\begin{aligned} l_1^*(t)&=\textrm{min} \left\{ \textrm{max} \left\{ 0,\ \frac{\big (\lambda _4(t)-\lambda _5(0,t)\big )\sigma _1 I(t)}{B_1}\right\} ,\ l_{1\textrm{max}}\right\} ,\nonumber \\ l_2^*(w,t)&=\textrm{min}\left\{ \textrm{max}\left\{ 0,\ \frac{\big (\lambda _4(t)-\lambda _5(w,t)\big )\gamma (w)r(w,t)}{B_2}\right\} ,\ l_{2\textrm{max}}\right\} . \end{aligned}$$

Proof

Substituting the system (61) into the equation (62) gives

$$\begin{aligned} \frac{\textrm{d}{\mathcal {J}}}{\textrm{d}\epsilon }\Bigg |_{\epsilon =0^+}&=\int _0^T \int _0^A {{\overline{e}}}(a,t) \left( -\frac{{\partial \lambda _3(a,t)}}{{\partial a}} - \frac{{\partial \lambda _3(a,t)}}{{\partial t}} + \varepsilon (a)\lambda _3(a,t)-\lambda _4(t) k(a)\right) \textrm{d}a\textrm{d}t\nonumber \\&\quad +\int _0^T {{\overline{I}}}(t)\left[ -\frac{{\textrm{d} \lambda _4(t)}}{{\textrm{d}t}} + \varpi \lambda _4(t)+\sigma _1l_1(t)\lambda _4(t)+\beta S(t)\lambda _1(t)\right. \nonumber \\&\quad + \int _0^\Theta \beta \sigma (\theta )\lambda _2(\theta ,t)v(\theta ,t)\textrm{d}\theta -\beta S(t)\lambda _3(0,t)\nonumber \\&\quad \left. -\int _0^\Theta \beta v(\theta ,t)\sigma (\theta )\lambda _3(0,t)\textrm{d}\theta -\sigma _1\lambda _5(0,t) \big (1+l_1(t)\big )-\sigma _2\lambda _3(0,t)\right] \textrm{d}t\nonumber \\&\quad + \int _0^T{{\overline{S}}}(t)\left( -\frac{{\textrm{d}\lambda _1(t)}}{{\textrm{d}t}}+ (\mu +\alpha ) \lambda _1(t) + \beta I (t)\lambda _1(t)-\alpha \lambda _2(0,t)\right. \nonumber \\&\quad \left. -\beta I(t)\lambda _3(0,t)\right) \textrm{d}t+\int _0^T \int _0^\Theta {{\overline{v}}}(\theta ,t)\left[ -\frac{{\partial \lambda _2(\theta ,t)}}{{\partial \theta }} - \frac{{\partial \lambda _2(\theta ,t)}}{{\partial t}}\right. \nonumber \\&\quad +\big (n(\theta )+\beta I(t)\sigma (\theta )\big )\lambda _2(\theta ,t)-\lambda _1(t) h(\theta )-\lambda _3(0,t)\beta I(t)\sigma (\theta )\bigg ]\textrm{d}\theta \textrm{d}t\nonumber \\&\quad +\int _0^T \int _0^W {{\overline{r}}}(w,t)\bigg [-\frac{{\partial \lambda _5(w,t)}}{{\partial w}} - \frac{{\partial \lambda _5(w,t)}}{{\partial t}}+\big (u(w) -l_2(w,t) \gamma (w)\big )\lambda _5(w,t)\nonumber \\&\quad -\lambda _4(t)\gamma (w)\big (1-l_2(w,t)\big )\bigg ]\textrm{d}w\textrm{d}t+\int _0^T B_1l_1(t)j_1(t)\textrm{d}t\nonumber \\&\quad +\int _0^T\int _0^W B_2 l_2(w,t)j_2(w,t)\textrm{d}w\textrm{d}t. \end{aligned}$$
(63)

Substituting (56)-(60) into equation (63), for all \({\mathfrak {L}}\in {\mathcal {L}}\), we obtain

$$\begin{aligned} 0\le \frac{\textrm{d}{\mathcal {J}}}{\textrm{d}\epsilon }\Bigg |_{\epsilon =0^+}&=\int _0^T \big [\big (-\lambda _4(t)+\lambda _5(0,t)\big )\sigma _1I(t)+B_1l_1(t)\big ]j_1(t)\textrm{d}t\nonumber \\&\quad +\int _0^T \int _0^W \big [\big (-\lambda _4(t)+\lambda _5(w,t)\big )\gamma (w)r(w,t)\\&\quad +B_2l_2(w,t)\big ]j_2(w,t)\textrm{d}w\textrm{d}t. \end{aligned}$$

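Since \({\mathcal {J}}\) attains its minimum at the optimal control, this inequality holds for every admissible variation. Where the controls lie strictly between their bounds, the variations \(j_1(t)\) and \(j_2(w,t)\) may take either sign, so the bracketed coefficients must vanish, which gives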

$$\begin{aligned} {\overline{l}}_1(t)=\frac{\big (\lambda _4(t)-\lambda _5(0,t)\big )\sigma _1 I(t)}{B_1},\ {\overline{l}}_2(w,t)=\frac{\big (\lambda _4(t)-\lambda _5(w,t)\big )\gamma (w)r(w,t)}{B_2}. \end{aligned}$$

Taking the upper and lower bounds of the control variables into account leads to the optimal control \({\mathfrak {L}}^*\) of the following form:

$$\begin{aligned} l_1^*(t)=\textrm{min}\big \{\textrm{max}\{0,~ {\overline{l}}_1(t)\},~ l_{1\textrm{max}}\big \},~ l_2^*(w,t)=\textrm{min}\big \{\textrm{max}\{0,~ {\overline{l}}_2(w,t)\},~ l_{2\textrm{max}}\big \}. \end{aligned}$$

The proof is completed. \(\square \)
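As an illustration of the characterization just proved, the pointwise projection of the unconstrained controls onto their admissible intervals can be sketched in Python. This is only a minimal sketch: the function names (`projected_control`, `optimal_l1`, `optimal_l2`) and all numerical values are hypothetical, and the adjoint values \(\lambda _4\), \(\lambda _5\) are assumed to be already available at a given \(t\) (and \(w\)), for instance from a numerical solution of the adjoint system.

```python
def projected_control(switching_term, weight, upper_bound):
    """Clamp the unconstrained stationary control onto [0, upper_bound].

    switching_term / weight is the interior stationary value; the result is
    min{max{0, switching_term / weight}, upper_bound}.
    """
    if weight <= 0:
        raise ValueError("the weight constant must be positive")
    unconstrained = switching_term / weight
    return min(max(0.0, unconstrained), upper_bound)


def optimal_l1(lambda4, lambda5_at_0, sigma1, infected, B1, l1_max):
    """l_1^*(t) from (lambda_4(t) - lambda_5(0,t)) * sigma_1 * I(t) / B_1."""
    switching = (lambda4 - lambda5_at_0) * sigma1 * infected
    return projected_control(switching, B1, l1_max)


def optimal_l2(lambda4, lambda5_w, gamma_w, r_w, B2, l2_max):
    """l_2^*(w,t) from (lambda_4(t) - lambda_5(w,t)) * gamma(w) * r(w,t) / B_2."""
    switching = (lambda4 - lambda5_w) * gamma_w * r_w
    return projected_control(switching, B2, l2_max)


if __name__ == "__main__":
    # Purely illustrative numbers, not taken from the fitted TB model.
    print(optimal_l1(2.0, 1.0, 0.5, 4.0, 10.0, 0.8))   # interior value
    print(optimal_l1(1.0, 2.0, 0.5, 4.0, 10.0, 0.8))   # clamped at the lower bound 0
    print(optimal_l2(10.0, 0.0, 0.5, 4.0, 1.0, 0.8))   # clamped at the upper bound
```

In such a sketch the dependence of \(l_2^*\) on the relapse age enters only through the arguments evaluated at \(w\), namely \(\lambda _5(w,t)\), \(\gamma (w)\) and \(r(w,t)\).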

7 Discussion

The efficacy of vaccination, incomplete treatment and disease relapse are critical challenges that must be addressed to prevent and control the spread of infectious diseases, and age heterogeneity is also a crucial factor. However, few papers have considered all these factors together in a single epidemic model. We therefore formulate and analyze a new age-structured SVEIR epidemic model with a nonlinear incidence rate, waning immunity, incomplete treatment and relapse. Compared with the models in the references, if we ignore the infection of vaccinated individuals, incomplete treatment and the disease-related mortality rate of latent individuals, namely \(\sigma (\theta )=0,~ \sigma _2=0,~ \delta _1=0\), the system (1) reduces to the model in Li et al. (2017). In addition, if the incidence is bilinear and we ignore the infection of vaccinated individuals and incomplete treatment, i.e., \(f(I)=I,~ \sigma (\theta )=0,~ \sigma _2=0\), then the system (1) coincides with the model in Liu and Liu (2017). When these parameters are not equal to zero, the model contains more return terms, which makes it challenging to prove the properties of the system (1) and to construct the Lyapunov function.

In this paper, we define the basic reproduction number \({\mathcal {R}}_0\) and show the asymptotic smoothness, the uniform persistence and the existence of a global attractor for the semi-flow generated by the system (1). In addition, our results show that the basic reproduction number \({\mathcal {R}}_0\) completely determines the global dynamics of the system (1): if \({\mathcal {R}}_0<1\), the disease goes extinct; if \({\mathcal {R}}_0>1\), the disease persists. As can be seen from the expression for \({\mathcal {R}}_0\), the age-dependent parameters \(h(\theta ),~\theta _1\) and \(\theta _2\) affect the transmission of the disease. The danger of disease outbreaks may therefore be overestimated or underestimated if these parameters are simply taken as constants.

It is worth noting that we use annual TB data for China from 2007 to 2020 to estimate the unknown parameter values in the system (1). The main theoretical results are then applied to analyze and predict the trend of TB prevalence in China. We conclude that China may not be able to reach the WHO goal under the current TB control measures, and it is difficult to claim that TB will be eradicated in China. Therefore, China needs to optimize its prevention and control methods according to the actual situation. We believe that eliminating TB is possible, but it will require integrated and sustained action.

Finally, taking into account the sensitive parameters and the actual situation, we select strengthening treatment and controlling relapse as the control variables. We obtain the necessary conditions for optimal control, which show that the relapse age \(w\) appears in the expression for the optimal control variables and affects the efficiency of optimal control.