1 Introduction

More than 585,700 cancer deaths among 1,665,540 cancer patients are reported every year in the US [1]. Modeling and treatment of cancer are therefore a main focus of many researchers worldwide, including clinicians, biologists, mathematicians, and control engineers. Considerable evidence shows that the immune system can diminish tumor cells in the absence of external treatment [2]. For this reason, immunotherapy has been used for cancer treatment. The two important parts of the body's defensive mechanisms are the innate and the adaptive immune mechanisms. The adaptive immune system has memory: some of its cells stay in the body after encountering a disease-causing agent [2]. Hence, the following issues are essential in cancer modeling and in finding an appropriate treatment method:

  • existence of memory and memory cells;

  • complete elimination of the tumors after a finite-duration treatment.

Modeling approaches for studying disease dynamics include, but are not limited to, optimization, compartmental, and dynamical system approaches [3]. In this work, a dynamical system approach is used, which describes the interaction among cells and drugs. Mathematical models used for cancer therapy are reviewed in [4, 5].

A system can be controlled with either an open-loop or a closed-loop approach. A feedback control strategy is robust to parameter variations of the system, which gives better performance than open-loop controllers [6–8].

Chemotherapy is an efficacious method of cancer treatment that is based on the use of drugs. To avoid the adverse side effects of such drugs while preserving the dosage level, the drugs should be administered according to a regular schedule. Different control methods have been used to solve this problem. These methods optimize the amount of drug used while still diminishing the tumor cells effectively. Many mathematical models for simulating the behavior of the drug and its effects on the body have been presented [9]. Swan and Vincent formulated the chemotherapy treatment schedule as an optimal control problem [10]. In 1990, Swan studied the use of optimal control theory in cancer chemotherapy and described the great variation among these models [11]. Then in 2000, Claire et al. [12] introduced several models for the application of chemotherapy in the treatment of breast cancer. In 2001, Parker and Doyle performed a thorough review of articles that build mathematical models of drug delivery and devoted a small part of it to optimal cancer chemotherapy [13]. In 2005 Harold, and in 2009 Harold and Parker, identified deficiencies and weaknesses in chemotherapy treatment related to clinical programs [13, 14]. In 2007, Nanda et al. [15] applied an optimal control model of two-drug chemotherapy for leukemia. In 2011, Shi et al. [9] presented a summary of the optimization models for chemotherapy treatment programs. In 2013, Moradi et al. [16] designed an optimal robust control for cancer chemotherapy. However, the existing studies assume that the dynamics of the cancer during treatment are time invariant. In other words, they consider the effect of therapeutic inputs only on the states of the system. The dynamics of cancer, however, alter during its progression [17]. Disruptive inputs such as external stresses can disable the DNA repair genes [18]. These inputs are able to change the functions of growth-inhibiting signals (TGF-b), regulatory growth signals (TGF-a), and apoptosis (TP53) [17]. Therefore, an effective treatment method should correct these destructive changes in the dynamic behavior during treatment, which is what this paper considers.

In this study, we consider a system of ordinary differential equations (ODEs) that describes the interaction among immune cells and tumor cells. This model is based on the model developed in [19]. In the proposed treatment method, we not only reduce the tumor cells but also modify the dynamics of the system to correct the aforementioned destructive changes. An important shortcoming of many studies is that the cancer relapses after the therapy is withdrawn. For instance, in [19], the authors suggest a combined open-loop control for one set of parameters (patient 9). But once the inputs (a combination of chemotherapy and immunotherapy) are removed, the tumor re-grows and the cancer relapses because the tumor-free equilibrium point is unstable. We want to propose a finite-duration treatment such that the tumor is eradicated by the end of treatment. To this end, we analyze and extend this model by adding vaccine and chemotherapy treatment terms. The vaccine affects some parameters of the system, while chemotherapy affects the cell populations. We use the SDRE method because of its optimal performance, flexibility in design, and robustness [20].

In the next section, the no-treatment model is analyzed. In Sect. 3, we extend this model by adding vaccine therapy and chemotherapy treatment terms. We show that any treatment method that does not change the dynamics of the system around the tumor-free equilibrium point is not an appropriate treatment method. Then, we suggest SDRE-based optimal control for the nonlinear tumor growth model in Sect. 4. The aim of the proposed mixed vaccine and chemotherapy treatment is an optimal finite-duration treatment after which the cancer is not able to relapse. In the last section, simulation results are discussed. The main highlights of the present study can be summarized as follows:

  • consideration of the change in the dynamics of the cancer during treatment as a main factor in proposing treatment methods;

  • applying the SDRE optimal control to nonlinear cancer dynamics;

  • robustness of the proposed method with respect to parametric uncertainty;

  • straightforward implementation of the SDRE method.

2 The no treatment model

We use the model presented in [19] in the absence of treatment. This model extends the model presented in [21] by adding new cell interaction terms, and it does not concentrate on a specific type of cancer. In the absence of treatment, the model is a fourth-order system of ordinary differential equations. The states of the system are: the total tumor cell population (\(T(t)\)), the concentration of NK cells (cells/L) (\(N(t)\)), the concentration of CD8+T cells (cells/L) (\(L(t)\)), and the concentration of lymphocytes, not including NK cells and active CD4+T cells (cells/L) (\(C(t)\)). The no-treatment system is given by:

$$\frac{{{\text{d}}T}}{{{\text{d}}t}} = aT\left( {1 - bT} \right) - cNT - DT, \,\, D = d\frac{{L^{l} }}{{sT^{l} + L^{l} }},$$
(1)
$$\frac{{{\text{d}}N}}{{{\text{d}}t}} = eC - fN + g\left( {\frac{{T^{2} }}{{h + T^{2} }}} \right)N - pNT,$$
(2)
$$\frac{{{\text{d}}L}}{{{\text{d}}t}} = - mL + j\frac{{D^{2} T^{2} }}{{k + D^{2} T^{2} }}L - qLT + r_{1} NT + r_{2} CT - uNL^{2} ,$$
(3)
$$\frac{{{\text{d}}C}}{{{\text{d}}t}} = \alpha - \beta C.$$
(4)

The tumor cells grow logistically with rate \(a\) up to \(b^{ - 1}\), which is the tumor carrying capacity. The NK cells kill the tumor cells through the term \({-}cNT\). The tumor lysis term due to \(CD8^{ + } T\) cells has the form \(- DT\), which reflects the bounded ability of the effector cells to lyse the tumor cells. In this model, it is assumed that the growth rate of NK cells is tied to the overall immune health level (\(eC - fN\)). The NK cell recruitment term is \(g\left( {\frac{{T^{2} }}{{h + T^{2} }}} \right)N\). The death of NK cells through interaction with tumor cells is described by \({-}pNT\). The death rate of \(CD8^{ + } T\) cells is proportional to their population (\(- mL\)). The recruitment terms for \(CD8^{ + } T\) cells are \(j\frac{{D^{2} T^{2} }}{{k + D^{2} T^{2} }}L\), \(r_{1} NT\) and \(r_{2} CT\). The term \(- uNL^{2}\) describes the inactivation of these cells. It is assumed that the circulating lymphocytes are produced at a constant rate and have a natural lifespan [19].
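As an illustration, the right-hand side of Eqs. (1)–(4) can be sketched in plain Python. The parameter values below are placeholders chosen only so that the sketch runs; they are not the human data of Appendix 1. The names `PARAMS`, `lysis_rate`, and `no_treatment_rhs` are hypothetical, and setting \(D = 0\) when \(L = 0\) is an assumption made to avoid the undefined ratio at \(T = L = 0\).

```python
# Placeholder parameter values so the sketch runs; NOT the Appendix 1 data.
PARAMS = dict(a=0.4, b=1e-9, c=1e-12, d=2.0, e=1e-3, f=0.04, g=0.01,
              h=1e6, j=0.1, k=1e7, l=1.5, m=0.2, p=1e-12, q=1e-12,
              r1=1e-14, r2=1e-15, s=1e-2, u=1e-14, alpha=1e8, beta=0.01)


def lysis_rate(T, L, par):
    """D = d L^l / (s T^l + L^l); assumed to be 0 when no CD8+ T cells exist."""
    if L <= 0.0:
        return 0.0
    L_pow = L ** par["l"]
    return par["d"] * L_pow / (par["s"] * T ** par["l"] + L_pow)


def no_treatment_rhs(state, par=PARAMS):
    """Time derivatives (dT, dN, dL, dC) of Eqs. (1)-(4)."""
    T, N, L, C = state
    D = lysis_rate(T, L, par)
    DT2 = (D * T) ** 2
    dT = par["a"] * T * (1 - par["b"] * T) - par["c"] * N * T - D * T
    dN = (par["e"] * C - par["f"] * N
          + par["g"] * T ** 2 / (par["h"] + T ** 2) * N - par["p"] * N * T)
    dL = (-par["m"] * L + par["j"] * DT2 / (par["k"] + DT2) * L
          - par["q"] * L * T + par["r1"] * N * T + par["r2"] * C * T
          - par["u"] * N * L ** 2)
    dC = par["alpha"] - par["beta"] * C
    return [dT, dN, dL, dC]
```

Any standard ODE integrator can be applied to `no_treatment_rhs`; the function itself only encodes the model terms described above.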

2.1 Equilibria

To derive the equilibria of the system, we simultaneously set the right-hand sides of Eqs. (1)–(4) equal to zero. Equation (4) is decoupled from the others, so we have \(C_{E} = \frac{\alpha }{\beta }\). Setting (1) equal to zero gives two types of equilibrium points. One of them is the tumor-free equilibrium point, i.e., \(T_{E} = 0\). The second type corresponds to a non-zero tumor cell population. The tumor-free equilibrium point for all four state variables is given by:

$$E_{0} = \left( {0,\frac{e\alpha }{\beta f},0,\frac{\alpha }{\beta }} \right).$$

The other equilibrium points for the non-zero tumor population, i.e., \(T_{E} \ne 0\), must be obtained numerically. By setting (2) equal to zero and solving for \(N_{E}\), we have:

$$N_{E} = \frac{{eC_{E} (h + T^{2} )}}{{fh + \left( {f - g} \right)T^{2} + phT + pT^{3} }}.$$
(5)

Similarly, setting (1) equal to zero gives

$$D_{E} = a - abT - cN_{E} .$$
(6)

Using the expression for \(D\) we have:

$$L_{E} = \left( {\frac{{sD_{E} T^{l} }}{{d - D_{E} }}} \right)^{1/l} .$$
(7)

Finally, setting (3) equal to zero gives

$$uN_{E} L^{2} + \left( {m - \frac{{jD_{E}^{2} T^{2} }}{{k + D_{E}^{2} T^{2} }} + qT} \right)L - \left( {r_{1} N_{E} + r_{2} C_{E} } \right)T = 0.$$
(8)

Equilibrium points of the system defined by (1)–(4) are found by intersecting Eqs. (7) and (8). Numerical simulation shows that there are two solutions to (5)–(8), only one of which is positive (Fig. 1) and therefore biologically plausible. So, the system has only two equilibrium points for the parameters stated in Appendix 1.
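The elimination in Eqs. (5)–(8) can be sketched as a scalar root-finding problem in \(T\). The following is a minimal sketch under stated assumptions: the parameter values are placeholders rather than the Appendix 1 data, and the function names are hypothetical. For a trial tumor size it evaluates Eqs. (5)–(7) and returns the left-hand side of Eq. (8), which vanishes at a non-zero tumor equilibrium.

```python
# Placeholder parameter values so the sketch runs; NOT the Appendix 1 data.
PARAMS = dict(a=0.4, b=1e-9, c=1e-12, d=2.0, e=1e-3, f=0.04, g=0.01,
              h=1e6, j=0.1, k=1e7, l=1.5, m=0.2, p=1e-12, q=1e-12,
              r1=1e-14, r2=1e-15, s=1e-2, u=1e-14, alpha=1e8, beta=0.01)


def equilibrium_candidates(T, par):
    """Evaluate Eqs. (5)-(7) at a trial tumor size T > 0.

    Returns (N_E, D_E, L_E); L_E is None when Eq. (7) has no real positive
    value, i.e. when D_E lies outside the open interval (0, d).
    """
    C_E = par["alpha"] / par["beta"]
    N_E = par["e"] * C_E * (par["h"] + T ** 2) / (
        par["f"] * par["h"] + (par["f"] - par["g"]) * T ** 2
        + par["p"] * par["h"] * T + par["p"] * T ** 3)
    D_E = par["a"] - par["a"] * par["b"] * T - par["c"] * N_E
    if not 0.0 < D_E < par["d"]:
        return N_E, D_E, None
    L_E = (par["s"] * D_E * T ** par["l"] / (par["d"] - D_E)) ** (1.0 / par["l"])
    return N_E, D_E, L_E


def eq8_residual(T, par):
    """Left-hand side of Eq. (8) with L taken from Eq. (7)."""
    C_E = par["alpha"] / par["beta"]
    N_E, D_E, L_E = equilibrium_candidates(T, par)
    if L_E is None:
        return None
    DT2 = (D_E * T) ** 2
    return (par["u"] * N_E * L_E ** 2
            + (par["m"] - par["j"] * DT2 / (par["k"] + DT2) + par["q"] * T) * L_E
            - (par["r1"] * N_E + par["r2"] * C_E) * T)
```

Scanning `eq8_residual` over a grid of \(T\) values and refining each sign change with bisection is one simple way to locate the intersections; other root finders would serve equally well.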

Fig. 1 The non-negative equilibrium points for human data presented in Appendix 1 by intersecting Eqs. (7) and (8)

2.2 Local stability

In this section, we examine the local stability of the equilibrium points by linearizing the system about them. The tumor-free equilibrium point is very important from a physiological viewpoint, because a treatment should eventually drive the system to this point. The Jacobian matrix of the system around an arbitrary equilibrium point is:

$$J = \left[ {\begin{array}{*{20}c} {J_{11} } & {J_{12} } & {J_{13} } & 0 \\ {J_{21} } & {J_{22} } & 0 & {J_{24} } \\ {J_{31} } & {J_{32} } & {J_{33} } & {J_{34} } \\ 0 & 0 & 0 & {J_{44} } \\ \end{array} } \right],$$

where:

$$\begin{aligned} J_{11} & = a - 2abT - cN - d\frac{{sT^{l} L^{l} \left( {1 - l} \right) + L^{2l} }}{{\left( {sT^{l} + L^{l} } \right)^{2} }}, \\ J_{12} & = - cT, \\ J_{13} & = - \frac{{sdlT^{l + 1} L^{l - 1} }}{{\left( {sT^{l} + L^{l} } \right)^{2} }}, \\ J_{21} & = \frac{2gNhT}{{\left( {h + T^{2} } \right)^{2} }} - pN, \\ J_{22} & = - f + g\frac{{T^{2} }}{{h + T^{2} }} - pT, \\ J_{24} & = e, \\ J_{31} & = \frac{{2jkd^{2} TL^{2l + 1} \left( {sT^{l} + L^{l} } \right)\left( {sT^{l} \left( {1 - l} \right) + L^{l} } \right)}}{{\left( {k\left( {sT^{l} + L^{l} } \right)^{2} + d^{2} T^{2} L^{2l} } \right)^{2} }} - qL + r_{1} N + r_{2} C, \\ J_{32} & = r_{1} T - uL^{2} , \\ J_{33} & = - m + j\frac{{(2l + 1)kd^{2} sT^{l + 2} L^{2l} \left( {sT^{l} + L^{l} } \right) + d^{2} L^{4l} T^{2} (d^{2} T^{2} + k)}}{{\left( {k\left( {sT^{l} + L^{l} } \right)^{2} + d^{2} L^{2l} T^{2} } \right)^{2} }} - qT - 2uNL, \\ J_{34} & = r_{2} T, \\ J_{44} & = - \beta , \\ \end{aligned}$$

Proposition 1

The tumor-free equilibrium point \(E_{0}\) is asymptotically stable if and only if: \(c > \frac{a\beta f}{e\alpha }\).

Proof

The Jacobian matrix of the system around the tumor-free equilibrium point \(E_{0}\) is simplified as:

$$J = \left[ {\begin{array}{*{20}c} {a - \frac{ce\alpha }{\beta f}} & 0 & 0 & 0 \\ { - \frac{pe\alpha }{\beta f}} & { - f} & 0 & e \\ {\frac{{r_{1} e\alpha }}{\beta f} + \frac{{r_{2} \alpha }}{\beta }} & 0 & { - m} & 0 \\ 0 & 0 & 0 & { - \beta } \\ \end{array} } \right].$$

Therefore, the eigenvalues of the system around this equilibrium point are:

$$\lambda_{1} = a - \frac{ce\alpha }{\beta f}, \,\lambda_{2} = - f,\,\lambda_{3} = - m,\,\lambda_{4} = - \beta .$$

Since \(f,\, m\) and \(\beta\) are positive, the eigenvalues \(\lambda_{2} ,\, \lambda_{3}\) and \(\lambda_{4}\) are always negative. The first eigenvalue is negative if:

$$a - \frac{ce\alpha }{\beta f} < 0 \to c > \frac{a\beta f}{e\alpha }.$$

Hence, the tumor-free equilibrium point \(E_{0}\) is asymptotically stable if \(c > \frac{a\beta f}{e\alpha }\), and vice versa. □
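The condition of Proposition 1 is easy to check numerically. The sketch below lists the four eigenvalues from the proof and the threshold on \(c\); the parameter values are placeholders (not the Appendix 1 data) and the function names are hypothetical.

```python
# Placeholder parameter values so the sketch runs; NOT the Appendix 1 data.
PARAMS = dict(a=0.4, c=1e-12, e=1e-3, f=0.04, m=0.2, alpha=1e8, beta=0.01)


def tumor_free_eigenvalues(par):
    """Eigenvalues of the Jacobian at E0, as listed in the proof."""
    N0 = par["e"] * par["alpha"] / (par["beta"] * par["f"])
    return [par["a"] - par["c"] * N0, -par["f"], -par["m"], -par["beta"]]


def critical_kill_rate(par):
    """Threshold a*beta*f/(e*alpha) that c must exceed for E0 to be stable."""
    return par["a"] * par["beta"] * par["f"] / (par["e"] * par["alpha"])


def tumor_free_is_stable(par):
    """True when every eigenvalue at E0 is negative."""
    return all(lam < 0 for lam in tumor_free_eigenvalues(par))
```

With these placeholder values, \(c\) lies below the threshold, so `tumor_free_is_stable(PARAMS)` is false; raising \(c\) above `critical_kill_rate(PARAMS)` while keeping the other parameters fixed makes it true.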

The stability of the high-tumor equilibrium point is also investigated. Evaluating the Jacobian matrix of the system at this equilibrium point shows that all of its eigenvalues are negative. Therefore, this equilibrium point is stable. This means that if the treatment is stopped and the dynamics of the system have not been changed during treatment, the system will return to its high-tumor state. So, if the tumor-free equilibrium point is unstable, an effective cure must not only reduce the tumor volume but also change the dynamics of the system around the tumor-free equilibrium point.

3 The mixed vaccine and chemotherapy treatment model

The aim of this paper is the total recovery of the patient after a finite-duration treatment such that the cancer is not able to relapse. That is, the cancer cell population must go to zero after the treatment has been withdrawn. Because the tumor-free equilibrium point is unstable, the treatment would have to push the system exactly onto this point, which takes infinite time in a continuous system. In other words, the treatment would have to be applied during the entire life of the patient. Otherwise, after the input is removed, the system returns to its non-zero tumor equilibrium point (Fig. 2). In Fig. 2, the tumor cell population is pushed toward zero by a constant five-dose chemotherapy over a period of 5 days, but after the input is removed, the instability of the tumor-free point drives the system to the only stable equilibrium point in the positive region of solutions.

Fig. 2 Return of the system to the non-zero tumor equilibrium point after elimination of the chemotherapy, due to the instability of the tumor-free equilibrium point

Therefore, total recovery after a finite-duration treatment is not possible unless the tumor-free equilibrium point becomes stable. An appropriate treatment method must thus change the dynamics of the system around the tumor-free equilibrium point. Then, once the system has been pushed into the domain of attraction of this point, it converges to the point even after the treatment inputs are removed.

Since the vaccine therapy affects some parameters of the system, we use a mixed vaccine and chemotherapy treatment. The role of the vaccine therapy is to change the dynamics of the system, and the role of the chemotherapy is to push the system toward the domain of attraction of the tumor-free equilibrium point. The vaccine is assumed to act on the parameters \(c,g,j,s\) and \(d\) [19]. The effect of vaccine therapy is included in the mathematical model through the input \(v_{v} (t) \ge 0\). The rate of change of these parameters is assumed to be proportional to the input magnitude \(v_{v} \left( t \right)\), in accordance with [19, 22]. The values of \(\mu_{c} , \mu_{g} , \mu_{j} , \mu_{s}\) and \(\mu_{d}\) depend on the dynamics of \(c,g,j,s\) and \(d\), respectively. The biotransformation coefficients saturate at the finite limits \(k_{c} , k_{g} , k_{j} , k_{s}\) and \(k_{d}\), which are related to the biological limits of body organs and the accumulation of external effect [22]. The effect of chemotherapy is included through the term \(M\left( t \right)\), where \(v_{M} (t) \ge 0\) is the amount of chemotherapy agent injected per day per liter of blood. Some chemotherapeutic drugs, such as doxorubicin, are only effective during certain phases of the cell cycle, and pharmacokinetics also indicate that the effectiveness of chemotherapy is bounded [19]. Therefore, a saturation term \(\frac{1.2M}{0.8 + M}\) is used to represent the chemotherapy fractional cell kill. Note that the kill rate is almost linear at low drug concentrations, while it plateaus at higher drug concentrations. The modified equations of the system with treatment are as follows:

$$\frac{{{\text{d}}T}}{{{\text{d}}t}} = aT\left( {1 - bT} \right) - cNT - DT - K_{T} \left( {\frac{1.2M}{0.8 + M}} \right)T, \,\, D = d\frac{{L^{l} }}{{sT^{l} + L^{l} }},$$
(9)
$$\frac{{{\text{d}}N}}{{{\text{d}}t}} = eC - fN + g\left( {\frac{{T^{2} }}{{h + T^{2} }}} \right)N - pNT - K_{N} \left( {\frac{1.2M}{0.8 + M}} \right)N,$$
(10)
$$\frac{{{\text{d}}L}}{{{\text{d}}t}} = - mL + j\frac{{D^{2} T^{2} }}{{k + D^{2} T^{2} }}L - qLT + r_{1} NT + r_{2} CT - uNL^{2} - K_{L} \left( {\frac{1.2M}{0.8 + M}} \right)L,$$
(11)
$$\frac{{{\text{d}}C}}{{{\text{d}}t}} = \alpha - \beta C - K_{C} \left( {\frac{1.2M}{0.8 + M}} \right)C,$$
(12)
$$\frac{{{\text{d}}c}}{{{\text{d}}t}} = \mu_{c} v_{v} \left( t \right)\left( {1 - \frac{c}{{k_{c} }}} \right),$$
(13)
$$\frac{{{\text{d}}g}}{{{\text{d}}t}} = \mu_{g} v_{v} \left( t \right)\left( {1 - \frac{g}{{k_{g} }}} \right),$$
(14)
$$\frac{{{\text{d}}j}}{{{\text{d}}t}} = \mu_{j} v_{v} \left( t \right)\left( {1 - \frac{j}{{k_{j} }}} \right),$$
(15)
$$\frac{{{\text{d}}s}}{{{\text{d}}t}} = \mu_{s} v_{v} \left( t \right)\left( {1 - \frac{s}{{k_{s} }}} \right),$$
(16)
$$\frac{{{\text{d}}d}}{{{\text{d}}t}} = \mu_{d} v_{v} \left( t \right)\left( {1 - \frac{d}{{k_{d} }}} \right),$$
(17)
$$\frac{{{\text{d}}M}}{{{\text{d}}t}} = - \mu M + v_{M} \left( t \right).$$
(18)

Note also that the system, which is an autonomous system with differentiable functions, satisfies existence and uniqueness of initial value problems [23].

A concise illustration of each term can be found in Appendix 2. Also, the parameters of the system and their values, from experimental data, are described in Appendix 1. All constants are positive.
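The two treatment mechanisms can be illustrated separately. The sketch below, with hypothetical function names, evaluates the saturating cell-kill factor shared by Eqs. (9)–(12) and the parameter dynamics shared by Eqs. (13)–(17). The closed-form expression assumes a constant vaccine input \(v_{v}\), which is a simplification made here for illustration only.

```python
import math


def fractional_cell_kill(M):
    """Saturating factor 1.2*M/(0.8 + M); it approaches 1.2 for large M."""
    return 1.2 * M / (0.8 + M)


def parameter_rate(value, mu, k_sat, v_v):
    """Right-hand side shared by Eqs. (13)-(17): mu*v_v*(1 - value/k_sat)."""
    return mu * v_v * (1.0 - value / k_sat)


def parameter_after_constant_vaccine(value0, mu, k_sat, v_v, t):
    """Solution of Eqs. (13)-(17) when v_v is held constant (an assumption).

    The gap k_sat - value decays exponentially with rate mu*v_v/k_sat, so the
    parameter approaches its saturation limit k_sat without exceeding it.
    """
    return k_sat - (k_sat - value0) * math.exp(-mu * v_v * t / k_sat)
```

In this form it is apparent that the vaccine moves a parameter such as \(c\) monotonically toward \(k_{c}\), and that the parameter stays at its new value once \(v_{v}\) returns to zero, since the right-hand side of Eqs. (13)–(17) then vanishes.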

4 Optimal control for mixed drug administration

State-Dependent Riccati Equation (SDRE)-based optimal control is a recently developed technique for nonlinear systems. Although its theoretical background is not yet complete, it has been applied successfully to nonlinear systems both in theory and in experimental practice. There are some applications of SDRE optimal control to biological systems, but control of drug administration in cancer dynamics using this method has not been studied yet. Because of its computational simplicity and its satisfactory simulation and experimental results, the SDRE technique has become an attractive control approach for a class of nonlinear systems, and many research and application results have been reported [8].

In this paper, we apply SDRE-based optimal control to the nonlinear cancer dynamics. The amount of chemotherapy drug administered is considered as a control input to the system, which describes the interactions between normal, tumor, and immune cells. The input should be optimal in the sense of minimizing the drug amount and the duration of the treatment. The aim of the control is to eliminate the tumor cells while minimizing the amount and duration of chemotherapy. The cost function for the optimal control is selected as a biologically relevant quadratic function of the states and the control input. One of the main contributions of this paper is the application of SDRE optimal control to cancer dynamics. Although many optimal control algorithms have been proposed in this field, almost all of them require special nonlinear optimization software. In other optimal control approaches in the literature, the Hamilton–Jacobi–Bellman (HJB) equations must be solved using numerical shooting methods or special nonlinear optimization algorithms; in contrast, the method proposed here gives a suboptimal solution by solving the well-known linear quadratic regulator (LQR) problem. The method also gives extra freedom to choose different state-dependent coefficient (SDC) matrices and weighting matrices of the states and controls, which may lead to better results for chosen variables such as a state or a control input [8].

4.1 SDRE optimal control theory

Consider the deterministic, infinite horizon nonlinear optimal regulation (stabilization) problem, such that it is full state observable, time invariant and affine in the input, represented in the following form:

$$\dot{x} = f\left( x \right) + B\left( x \right)u\left( t \right), \,\,x\left( 0 \right) = x_{0} ,$$
(19)

where \(x \in R^{n}\) is the state vector, \(u \in R^{m}\) is the input vector, and \(t \in [0,\infty )\) with \(C^{1} (R^{n} )\) functions \(f:R^{n} \to R^{n}\) and \(B:R^{n} \to R^{n \times m}\), and \(B\left( x \right) \ne 0 \forall x\). Without loss of generality, the origin \(x = 0\) is assumed to be an equilibrium point. The minimization of the infinite time performance index:

$$J\left( {x_{0} , u\left( . \right)} \right) = \frac{1}{2}\int_{0}^{\infty } {\left\{ {x^{T} \left( t \right)Q\left( x \right)x\left( t \right) + u^{T} \left( t \right)R\left( x \right)u(t)} \right\}dt,}$$

is considered, which is non-quadratic in \(x\) but quadratic in \(u\). The state and input weighting matrices are assumed to be state dependent such that \(Q:R^{n} \to R^{n \times n}\) and \(R:R^{n} \to R^{m \times m}\). It is assumed that \(Q\) and \(R\) are symmetric, \(Q\) is positive semi-definite, and \(R\) is positive definite:

$$Q\left( x \right) \ge 0, \,\, R\left( x \right) > 0.$$

Since \(f\left( 0 \right) = 0\) and \(f(.) \in C^{1} (R^{n} )\), the system (19) can be written in the pseudo-linear form:

$$\dot{x} = A\left( x \right)x + B\left( x \right)u,$$
(20)

where \(f\left( x \right) = A\left( x \right)x\). In Eq. (20), \(A(x) \in R^{n \times n}\) and \(B(x) \in R^{n \times m}\) are state-dependent coefficient (SDC) matrices which bring the nonlinear system described by (19) into a linear-like representation. These matrices are not unique. However, it is advisable to select them such that the pair \(\left( {A\left( x \right),B(x)} \right)\) is controllable. The state-dependent controllability matrix is as follows:

$$M\left( x \right) = \left[ {\begin{array}{*{20}c} {B\left( x \right)} & {A\left( x \right)B\left( x \right)} & \ldots & {A^{n - 2} \left( x \right)B\left( x \right)} & {A^{n - 1} \left( x \right)B\left( x \right)} \\ \end{array} } \right].$$

To control the nonlinear system, the above matrix must have full rank in the domain for which the nonlinear system is controlled.
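The rank test can be sketched for a single-input pair, which is the case when \(B\) has one column. The helper names below are hypothetical, and the rank is computed by plain Gaussian elimination with an absolute tolerance, which is an assumption that may need rescaling for badly scaled matrices.

```python
def mat_vec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]


def controllability_matrix(A, b):
    """Matrix with columns b, A b, ..., A^(n-1) b for a single input."""
    cols = [list(b)]
    for _ in range(len(A) - 1):
        cols.append(mat_vec(A, cols[-1]))
    # Transpose so that each stored vector becomes a column.
    return [list(row) for row in zip(*cols)]


def matrix_rank(M, tol=1e-12):
    """Rank by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    n_rows, n_cols = len(M), len(M[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda i: abs(M[i][col]))
        if abs(M[pivot][col]) <= tol:
            continue  # no usable pivot in this column
        M[rank], M[pivot] = M[pivot], M[rank]
        for i in range(rank + 1, n_rows):
            factor = M[i][col] / M[rank][col]
            for jj in range(col, n_cols):
                M[i][jj] -= factor * M[rank][jj]
        rank += 1
    return rank


def is_controllable(A, b):
    """Full-rank test of the controllability matrix at one frozen state."""
    return matrix_rank(controllability_matrix(A, b)) == len(A)
```

In an SDRE setting, `A` and `b` would be the SDC matrices evaluated at the current state, so the test is repeated pointwise along the trajectory.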

Some optimal control problems require constraints on the state variables or the control input. The choice of the weighting matrices \(Q (x)\) and \(R (x)\) plays an important role in satisfying these constraints.

The Hamiltonian for the optimal control problem is as follows:

$$H\left( {x,u,\lambda } \right) = \frac{1}{2}\left( {x^{T} Q\left( x \right)x + u^{T} R\left( x \right)u} \right) + \lambda^{T} \left( {A\left( x \right)x + B\left( x \right)u} \right) - \overline{w}^{T} \left( {u - u_{\hbox{min} } } \right) - \hat{w}^{T} \left( {u_{\hbox{max} } - u} \right),$$

where \(\overline{w}\) and \(\hat{w}\) are \(m\)-dimensional non-negative vectors introduced to apply constraints to the control input, and they must satisfy the following conditions:

$$\overline{w}^{T} \left( {u - u_{\hbox{min} } } \right) = \hat{w}^{T} \left( {u_{\hbox{max} } - u} \right) = 0.$$

From the Hamiltonian, the necessary conditions for optimality are:

$$\left\{ {\begin{aligned} \dot{x} & = \frac{\partial H}{\partial \lambda } = A\left( x \right)x + B\left( x \right)u, \\ \dot{\lambda } & = - \frac{\partial H}{\partial x} = - Q\left( x \right)x - \left[ {\frac{{\partial \left( {A\left( x \right)x} \right)}}{\partial x}} \right]^{T} \lambda - \left[ {\frac{{\partial \left( {B\left( x \right)u} \right)}}{\partial x}} \right]^{T} \lambda , \\ 0 & = \frac{\partial H}{\partial u} = R\left( x \right)u + B^{T} \left( x \right)\lambda - \overline{w} + \hat{w}. \\ \end{aligned} } \right.$$
(21)

The last equation of (21) gives the optimal control of the form:

$$u\left( x \right) = - R^{ - 1} \left( x \right)\left( {B^{T} \left( x \right)\lambda - \bar{w} + \hat{w}} \right),$$

By applying the theory of LQR, the adjoint state vector has the form given by:

$$\lambda = P\left( x \right)x.$$

If we denote by \(A_{i}\) the \(i\)th row of \(A(x)\) and by \(B_{i}\) the \(i\)th row of \(B(x)\), then:

$$\frac{\partial (A(x)x)}{\partial x} = A\left( x \right) + \frac{\partial (A(x))}{\partial x}x = A\left( x \right) + \left[ {\begin{array}{*{20}c} {\frac{{\partial A_{1} }}{{\partial x_{1} }}x} & \ldots & {\frac{{\partial A_{1} }}{{\partial x_{n} }}x} \\ \vdots & {} & \vdots \\ {\frac{{\partial A_{n} }}{{\partial x_{1} }}x} & \ldots & {\frac{{\partial A_{n} }}{{\partial x_{n} }}x} \\ \end{array} } \right]$$

and

$$\frac{\partial (B(x)u)}{\partial x} = \left[ {\begin{array}{*{20}c} {\frac{{\partial B_{1} }}{{\partial x_{1} }}u} & \ldots & {\frac{{\partial B_{1} }}{{\partial x_{n} }}u} \\ \vdots & {} & \vdots \\ {\frac{{\partial B_{n} }}{{\partial x_{1} }}u} & \ldots & {\frac{{\partial B_{n} }}{{\partial x_{n} }}u} \\ \end{array} } \right]$$

Differentiating \(\lambda = P(x)x\) with respect to time along a trajectory, to find the matrix-valued function \(P(x)\), yields:

$$\dot{\lambda } = \dot{P}\left( x \right)x + P\left( x \right)\dot{x} = \dot{P}\left( x \right)x + P\left( x \right)A\left( x \right)x - P\left( x \right)B\left( x \right)R^{ - 1} \left( x \right)\left( {B^{T} \left( x \right)P\left( x \right)x - \overline{w} + \hat{w}} \right)$$

The following notation is used:

$$\dot{P}\left( x \right) = \mathop \sum \limits_{i = 1}^{n} P_{{x_{i} }} \left( x \right)\dot{x}_{i} (t)$$

By comparing with (21):

$$\begin{gathered} \dot{P}\left( x \right)x + P\left( x \right)A\left( x \right)x - P\left( x \right)B\left( x \right)R^{ - 1} \left( {B^{T} \left( x \right)P\left( x \right)x - \overline{w} + \hat{w}} \right) \hfill \\ = - Q\left( x \right)x - \left[ {A\left( x \right) + \frac{{\partial \left( {A\left( x \right)} \right)}}{\partial x}x} \right]^{T} P\left( x \right)x - \left[ {\frac{{\partial \left( {B\left( x \right)u} \right)}}{\partial x}} \right]^{T} P\left( x \right)x \hfill \\ \end{gathered}$$

Rearranging the terms gives:

$$\begin{gathered} \left[ {\left( {\dot{P}\left( x \right) + \left[ {\frac{{\partial \left( {A\left( x \right)} \right)}}{\partial x}} \right]^{T} P\left( x \right) + \left[ {\frac{{\partial \left( {B\left( x \right)u} \right)}}{\partial x}} \right]^{T} P\left( x \right)} \right)} \right. \hfill \\ \left. { + \left( {P\left( x \right)A\left( x \right) + A^{T} \left( x \right)P\left( x \right) - P\left( x \right)B\left( x \right)R^{ - 1} B^{T} \left( x \right)P\left( x \right) + Q\left( x \right)} \right)} \right]x \hfill \\ - P\left( x \right)B\left( x \right)R^{ - 1} \left( { - \overline{w} + \hat{w}} \right) = 0 \hfill \\ \end{gathered}$$

By assuming that \(\partial (A(x) )/\partial x\), \(\partial (B(x)u )/\partial x\) are small and \(P\left( x \right)\) is stationary, the suboptimal solution is:

$$\tilde{u}\left( x \right) = { \hbox{min} }\left( {\hbox{max} \left( {u,u_{ \hbox{min} } } \right),u_{ \hbox{max} } } \right),$$

where \(u_{ \hbox{min} }\) and \(u_{ \hbox{max} }\) are the minimum and maximum bounds on the control, respectively and:

$$u\left( x \right) = - R^{ - 1} \left( x \right)B^{T} \left( x \right)P\left( x \right)x.$$
(24)

It has been shown that the feedback control law works reasonably well even when these conditions are not satisfied [24]. If \(P\left( x \right)\) solves the SDRE, which is given by:

$$P\left( x \right)A\left( x \right) + A^{T} \left( x \right)P\left( x \right) - P\left( x \right)B\left( x \right)R^{ - 1} B^{T} \left( x \right)P\left( x \right) + Q\left( x \right) = 0$$

then the following condition must be satisfied for optimality:

$$\dot{P}\left( x \right) + \left[ {\frac{{\partial \left( {A\left( x \right)} \right)}}{\partial x}} \right]^{T} P\left( x \right) + \left[ {\frac{{\partial \left( {B\left( x \right)u} \right)}}{\partial x}} \right]^{T} P\left( x \right) = 0$$

The above equation is the optimality criterion [25].

When the input bounds are inactive, the dynamics of the closed-loop system are given by:

$$\dot{x} = \left( {A\left( x \right) - B\left( x \right)R^{ - 1} \left( x \right)B^{T} \left( x \right)P\left( x \right)} \right)x.$$
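The feedback law (24) with the input bounds can be illustrated on a scalar toy problem, for which the SDRE reduces to a quadratic equation in \(P\) with a closed-form positive root. This sketch is not the cancer model: the scalar system \(\dot{x} = a(x)x + bu\), the weights, and the function names are assumptions chosen only to show the mechanics. For a multi-state model, a numerical algebraic Riccati solver (for example `scipy.linalg.solve_continuous_are`) would take the place of the closed-form root.

```python
import math


def sdre_gain_scalar(a, b, q, r):
    """Positive root P of the scalar SDRE 2*a*P - (b*P)**2/r + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


def sdre_control_scalar(x, a_of_x, b, q, r, u_min, u_max):
    """Bounded SDRE feedback at state x.

    The unconstrained law is u = -(1/r) * b * P(x) * x, where P(x) solves
    the SDRE for the frozen coefficient a(x); the result is then clipped
    to the interval [u_min, u_max].
    """
    P = sdre_gain_scalar(a_of_x(x), b, q, r)
    u = -(b * P / r) * x
    return min(max(u, u_min), u_max)


def toy_sdc_coefficient(x):
    """SDC factor of the toy dynamics f(x) = x**3, i.e. a(x) = x**2."""
    return x * x
```

For instance, `sdre_control_scalar(1.0, toy_sdc_coefficient, 1.0, 1.0, 1.0, -2.0, 2.0)` evaluates the unconstrained law to \(-(1 + \sqrt{2})\) and clips it to the lower bound \(-2\). The frozen closed-loop coefficient \(a - b^{2}P/r\) equals \(-\sqrt{a^{2} + b^{2}q/r}\), which is negative for every \(x\) in this toy case.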

4.2 SDRE optimal control design

To design SDRE-based optimal control, we must rewrite the system in the form (20) by shifting the tumor-free equilibrium point to the origin. New state variables are as follows:

$$\begin{aligned} T & = x_{1} , \\ N & = x_{2} + \frac{e\alpha }{\beta f}, \\ L & = x_{3} , \\ C & = x_{4} + \frac{\alpha }{\beta }, \\ M & = x_{5} . \\ \end{aligned}$$

In this case, the system of equations is as follows:

$$\frac{{{\text{d}}x_{1} }}{{{\text{d}}t}} = ax_{1} \left( {1 - bx_{1} } \right) - c\left( {x_{2} + \frac{e\alpha }{\beta f}} \right)x_{1} - Dx_{1} - \frac{{1.2K_{T} x_{1} x_{5} }}{{0.8 + x_{5} }},$$
$$\frac{{{\text{d}}x_{2} }}{{{\text{d}}t}} = e\left( {x_{4} + \frac{\alpha }{\beta }} \right) - f\left( {x_{2} + \frac{e\alpha }{\beta f}} \right) + g\frac{{x_{1}^{2} }}{{h + x_{1}^{2} }}\left( {x_{2} + \frac{e\alpha }{\beta f}} \right) - p\left( {x_{2} + \frac{e\alpha }{\beta f}} \right)x_{1} - \frac{{1.2K_{N} x_{5} }}{{0.8 + x_{5} }}\left( {x_{2} + \frac{e\alpha }{\beta f}} \right),$$
$$\frac{{{\text{d}}x_{3} }}{{{\text{d}}t}} = - mx_{3} + j\frac{{D^{2} x_{1}^{2} }}{{k + D^{2} x_{1}^{2} }}x_{3} - qx_{1} x_{3} + \left( {r_{1} \left( {x_{2} + \frac{e\alpha }{\beta f}} \right) + r_{2} \left( {x_{4} + \frac{\alpha }{\beta }} \right)} \right)x_{1} - u\left( {x_{2} + \frac{e\alpha }{\beta f}} \right)x_{3}^{2} - \frac{{1.2K_{L} x_{3} x_{5} }}{{0.8 + x_{5} }},$$
$$\begin{aligned} \frac{{{\text{d}}x_{4} }}{{{\text{d}}t}} & = \alpha - \beta \left( {x_{4} + \frac{\alpha }{\beta }} \right) - \frac{{1.2K_{C} x_{5} }}{{0.8 + x_{5} }}\left( {x_{4} + \frac{\alpha }{\beta }} \right), \\ \frac{{{\text{d}}x_{5} }}{{{\text{d}}t}} & = - \gamma x_{5} + \upsilon_{M} \left( t \right), \\ D & = d\frac{{\left( {x_{3} /x_{1} } \right)^{l} }}{{s + \left( {x_{3} /x_{1} } \right)^{l} }}. \\ \end{aligned}$$

To use the SDRE method, the above equations must be written in the pseudo-linear form given by (20). The matrices \(A(x)\) and \(B(x)\) are:

$$A\left( x \right) = \left[ {\begin{array}{*{20}c} {A_{11} } & { - cx_{1} } & 0 & 0 & {A_{15} } \\ {A_{21} } & {A_{22} } & 0 & e & {A_{25} } \\ {A_{31} } & {r_{1} x_{1} } & {A_{33} } & {r_{2} x_{1} } & {A_{35} } \\ 0 & 0 & 0 & { - \beta } & {A_{45} } \\ 0 & 0 & 0 & 0 & { - \gamma } \\ \end{array} } \right],$$
$$\begin{aligned} A_{11} & = a\left( {1 - bx_{1} } \right) - D - \frac{ce\alpha }{\beta f}, \\ A_{15} & = \frac{{ - 1.2K_{T} x_{1} }}{{0.8 + x_{5} }}, \\ A_{21} & = \frac{{gx_{1} }}{{h + x_{1}^{2} }}\frac{e\alpha }{\beta f} - p\left( {x_{2} + \frac{e\alpha }{\beta f}} \right), \\ A_{22} & = \frac{{gx_{1}^{2} }}{{h + x_{1}^{2} }} - f, \\ A_{25} & = \frac{{ - 1.2K_{N} }}{{0.8 + x_{5} }}\left( {x_{2} + \frac{e\alpha }{\beta f}} \right), \\ A_{31} & = - qx_{3} + r_{1} \frac{e\alpha }{\beta f} + r_{2} \frac{\alpha }{\beta }, \\ A_{33} & = - m + \frac{{jD^{2} x_{1}^{2} }}{{k + D^{2} x_{1}^{2} }} - u\left( {x_{2} + \frac{e\alpha }{\beta f}} \right)x_{3} , \\ A_{35} & = \frac{{ - 1.2K_{L} x_{3} }}{{0.8 + x_{5} }}, \\ A_{45} & = \frac{{ - 1.2K_{C} }}{{0.8 + x_{5} }}\left( {x_{4} + \frac{\alpha }{\beta }} \right), \\ B\left( x \right) & = \left[ {\begin{array}{*{20}c} 0 & 0 & 0 & 0 & 1 \\ \end{array} } \right]^{T} . \\ \end{aligned}$$
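A factorization of this kind can be checked numerically by confirming that \(A(x)x + B(x)\upsilon_{M}\) reproduces the right-hand sides of the model. The following Python sketch does this for rows 2 to 5 only, since those are the equations written out directly above. The parameter values, the test state, and all function names are hypothetical and chosen purely for illustration; they are not the human parameter set of Appendix 1.

```python
# Hypothetical parameter values for illustration only
# (NOT the human parameter set of Appendix 1).
PARAMS = dict(e=0.02, f=0.04, g=0.1, h=2.0, p=0.01, m=0.2, j=0.3, k=1.5,
              q=0.005, r1=0.01, r2=0.02, u=0.03, alpha=0.5, beta=0.1,
              gamma=0.9, d=2.0, l=1.5, s=0.8, KN=0.6, KL=0.5, KC=0.4)


def shared_terms(x, P):
    """Quantities that appear in both the model equations and A(x)."""
    x1, _, x3, _, _ = x
    n0 = P["e"] * P["alpha"] / (P["beta"] * P["f"])  # e*alpha/(beta*f)
    c0 = P["alpha"] / P["beta"]                      # alpha/beta
    ratio = (x3 / x1) ** P["l"]
    D = P["d"] * ratio / (P["s"] + ratio)
    lys = P["j"] * D**2 * x1**2 / (P["k"] + D**2 * x1**2)
    return n0, c0, lys


def model_rows(x, v, P):
    """dx2/dt ... dx5/dt written as in the shifted model equations."""
    x1, x2, x3, x4, x5 = x
    n0, c0, lys = shared_terms(x, P)
    sat = 1.2 * x5 / (0.8 + x5)          # saturating drug effect
    hill = x1**2 / (P["h"] + x1**2)
    dx2 = (P["e"] * (x4 + c0) - P["f"] * (x2 + n0) + P["g"] * hill * (x2 + n0)
           - P["p"] * (x2 + n0) * x1 - P["KN"] * sat * (x2 + n0))
    dx3 = (-P["m"] * x3 + lys * x3 - P["q"] * x1 * x3
           + (P["r1"] * (x2 + n0) + P["r2"] * (x4 + c0)) * x1
           - P["u"] * (x2 + n0) * x3**2 - P["KL"] * sat * x3)
    dx4 = P["alpha"] - P["beta"] * (x4 + c0) - P["KC"] * sat * (x4 + c0)
    dx5 = -P["gamma"] * x5 + v
    return [dx2, dx3, dx4, dx5]


def sdc_rows(x, v, P):
    """Rows 2-5 of A(x) x + B v, built from the listed entries A_21 ... A_45."""
    x1, x2, x3, x4, x5 = x
    n0, c0, lys = shared_terms(x, P)
    A21 = P["g"] * x1 / (P["h"] + x1**2) * n0 - P["p"] * (x2 + n0)
    A22 = P["g"] * x1**2 / (P["h"] + x1**2) - P["f"]
    A25 = -1.2 * P["KN"] / (0.8 + x5) * (x2 + n0)
    A31 = -P["q"] * x3 + P["r1"] * n0 + P["r2"] * c0
    A33 = -P["m"] + lys - P["u"] * (x2 + n0) * x3
    A35 = -1.2 * P["KL"] * x3 / (0.8 + x5)
    A45 = -1.2 * P["KC"] / (0.8 + x5) * (x4 + c0)
    A = [
        [A21, A22, 0.0, P["e"], A25],
        [A31, P["r1"] * x1, A33, P["r2"] * x1, A35],
        [0.0, 0.0, 0.0, -P["beta"], A45],
        [0.0, 0.0, 0.0, 0.0, -P["gamma"]],
    ]
    B = [0.0, 0.0, 0.0, 1.0]             # rows 2-5 of B(x)
    return [sum(a * xi for a, xi in zip(row, x)) + b * v
            for row, b in zip(A, B)]


def max_mismatch(x, v, P=PARAMS):
    """Largest absolute difference between the two formulations."""
    return max(abs(f - a)
               for f, a in zip(model_rows(x, v, P), sdc_rows(x, v, P)))


print(max_mismatch([2.0, 1.5, 0.7, 0.3, 0.4], v=0.25))
```

A mismatch at the level of floating-point round-off indicates that the listed entries are consistent with the model equations. The constant terms \(e\alpha/\beta\) in the second row and \(\alpha\) in the fourth row cancel, which is why the shifted states admit a factorization with no affine remainder.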

5 Numerical simulations

The goal of the treatment is to eliminate the tumor cells within a finite duration while minimizing the amount of drug administered, which also reduces the toxicity caused by excessive use of chemotherapy drugs. In the proposed model, the tumor-free equilibrium point is unstable; vaccine therapy is therefore necessary to change its stability. Hence, the vaccine input is applied at the start of treatment to stabilize the tumor-free equilibrium point, and it changes the parameters of the system on day 10. Chemotherapy then pushes the system into the domain of attraction of this point in an optimal manner.

In this section, we simulate the behavior of the model under the combination treatment, using the experimental human data presented in Appendix 1. A tumor of size \(10^{7}\) cells is too large for the immune system to control naturally, so we use a tumor of this size to test the proposed combination therapy. We also simulate three different cases of bounded optimal control applied to the cancer model to show the effects of different weighting matrices.

5.1 Case 1

In this case, we use the following constant matrices for the cost function:

$$Q\left( x \right) = \left[ {\begin{array}{*{20}c} {100} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & {0.1} \\ \end{array} } \right] ,\, R = 4 \times 10^{13} .$$

The results show that the combination of vaccine therapy and chemotherapy is effective for finite duration treatment. They also show that changing the dynamics of the cancer around the tumor-free equilibrium point is essential for finite duration treatment (Fig. 3).

Fig. 3 Optimal input (left) and the system response (right) with human parameters in case 1

The chemotherapy stops the tumor growth at the beginning of the treatment and pushes the system toward the tumor-free equilibrium point. Then, once vaccination changes the parameters of the system on the tenth day and stabilizes the tumor-free equilibrium point, the system converges to this point without any further chemotherapy.

5.2 Case 2

In this case, we use the matrix \(Q\left( x \right)\) as before and a state-dependent input weight, \(R\left( x \right) = R + \gamma x_{2}\), where \(\gamma\) is a positive constant. Clearly, \(R\left( x \right)\) is greater than \(R\) at the beginning of treatment and decreases over time, tending to \(R\) as the tumor cells diminish. So, at the beginning of treatment, the control input is lower than in case 1.

In this case, the behavior of the system is similar to that in case 1. However, the amplitude of the input is smaller than in case 1 (Fig. 4).

Fig. 4 Optimal input (left) and the system response (right) with human parameters in case 2, \(\gamma = 10^{7}\)

5.3 Case 3

In this case, we use the matrix \(Q\left( x \right)\) as before and a state-dependent input weight, \(R\left( x \right) = R - \gamma x_{2}\), where \(\gamma\) is a positive constant. We choose the value of \(\gamma\) such that \(R > \gamma x_{2}\) always holds, so that \(R\left( x \right)\) remains positive. Clearly, \(R\left( x \right)\) is smaller than \(R\) at the beginning of treatment and increases over time, tending to \(R\). So, at the beginning of treatment, the control input is higher than in cases 1 and 2 (Fig. 5).

Fig. 5 Optimal input (left) and the system response (right) with human parameters in case 3, \(\gamma = 3 \times 10^{6}\)
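The qualitative effect of the three input weights can be illustrated on a scalar analogue. The sketch below is an assumption made for illustration, not the five-state model: it takes \(\dot{x} = ax + bu\) with cost integrand \(qx^{2} + ru^{2}\), for which the Riccati equation \(2ap - b^{2}p^{2}/r + q = 0\) has the positive root \(p = r\left( a + \sqrt{a^{2} + b^{2}q/r} \right)/b^{2}\) and the feedback is \(u = -\left( bp/r \right)x\). All numerical values are hypothetical and rescaled, and the name `gamma_w` is introduced here only to keep the weighting constant apart from the drug decay rate.

```python
import math


def scalar_sdre_gain(a, b, q, r):
    """Gain K in u = -K*x for dx/dt = a*x + b*u with cost q*x**2 + r*u**2.

    p is the positive root of the scalar Riccati equation
    2*a*p - (b*p)**2 / r + q = 0, and K = b*p/r.
    """
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    return b * p / r


def input_weight(case, R, gamma_w, x2):
    """Input weight R(x) for the three cases: R, R + gamma*x2, R - gamma*x2."""
    if case == 1:
        return R
    if case == 2:
        return R + gamma_w * x2
    if case == 3:
        return R - gamma_w * x2
    raise ValueError("case must be 1, 2 or 3")


# Hypothetical, rescaled toy values (not the parameters used in the paper).
a, b, q, R, x2 = 0.5, 1.0, 100.0, 4.0, 1.0
gamma_w = {1: 0.0, 2: 1.0, 3: 0.3}

gains = {c: scalar_sdre_gain(a, b, q, input_weight(c, R, gamma_w[c], x2))
         for c in (1, 2, 3)}
for c in (1, 2, 3):
    r_x = input_weight(c, R, gamma_w[c], x2)
    print(f"case {c}: R(x) = {r_x:.2f}, gain K = {gains[c]:.4f}")
```

In this toy setting the gain is largest for the weight \(R - \gamma x_{2}\) and smallest for \(R + \gamma x_{2}\), and all three coincide as \(x_{2}\) tends to zero, which matches the ordering of the control inputs described for cases 1 to 3.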

In the above three cases, the behavior of the system is similar but the control inputs are different. The natural killer cell and tumor cell populations during the treatment are shown in Figs. 6 and 7 for all three cases. As these figures show, the results for case 3 are slightly better than those for the other two cases. However, the control input in case 2 is significantly smaller than in the other two. This is because the chemotherapy has no considerable effect on removing tumor cells; its main purpose is to prevent tumor growth until the vaccination changes the dynamics of the cancer. After day 10, when the vaccination has changed the stability of the tumor-free equilibrium point, the chemotherapy is stopped and the tumor cell population converges to this point exponentially. So, the proposed finite duration treatment method is effective for cancer treatment.

Fig. 6 The natural killer cell populations in the three cases during treatment

Fig. 7 The tumor cell populations in the three cases during treatment

In [26], the authors proposed on–off regimens for minimizing the tumor cell population while keeping the healthy cells at an allowable level. However, these regimens cannot completely eliminate the tumor cells. Moreover, the method is inflexible, because it cannot take the particular conditions of individual patients into account, and it is sensitive to the initial estimates of the state variables and control. Quadratic and linear cost functions are evaluated in [6, 21]. The optimal chemotherapy regimens computed from these cost functions are able to eradicate the tumor, but they are also sensitive to the initial estimates of the state variables and control. In [19], de Pillis et al. proposed a mixed immunotherapy and chemotherapy protocol for cancer treatment. The main shortcoming of these protocols is that, once treatment ends, the cancer relapses because the tumor-free equilibrium point is not stable (Fig. 2). In this paper, we use vaccine therapy to change the dynamics of the system around the tumor-free equilibrium point. In addition, the methods cited above are open-loop and therefore have deficiencies such as a lack of robustness to parameter variation. In [2, 7, 8], the authors presented the SDRE method based on the model of de Pillis et al. published in 2003, in which the tumor-free equilibrium point is stable. The model of de Pillis et al. published in 2006, however, adds new interaction terms among cells, and this extended model has an unstable tumor-free equilibrium point. Chemotherapy alone is therefore not adequate for finite duration cancer treatment, and we use mixed vaccine–chemotherapy with the SDRE approach. Moreover, for better performance of the control system, we use a state-dependent matrix for \(R\left( x \right)\). In addition, the chemotherapy terms enter the model in a saturating manner, which is in accordance with physical observations [19].

6 Conclusion

In this paper, we have extended and analyzed previous mathematical models of cancer by including mixed vaccine therapy and chemotherapy. The model describes the effect of tumor cells on the immune response, taking into account the effects of vaccine therapy and chemotherapy. First, the system of equations was analyzed in the absence of treatment; then, the equilibrium points of the system and the criteria for their stability were determined. For the human parameter set, we found two equilibria: a tumor-free equilibrium point, which was unstable, and a high-tumor equilibrium point, which was stable. The instability of the tumor-free equilibrium implies that any successful finite duration treatment must change the system dynamics so that this desirable equilibrium point becomes stable. Vaccine therapy is used for this purpose: at the start of treatment, it stabilizes the tumor-free equilibrium point. Chemotherapy then pushes the system into the domain of attraction of this point in an optimal manner. For this purpose, we have developed an SDRE-based optimal control and applied it to the model. Once the system is inside the domain of attraction of this equilibrium point, the tumor cell population converges to zero even after the therapies are withdrawn. We have shown that the SDRE optimal control method provides a fast and easy derivation of a suboptimal control for the chemotherapy administration problem. We applied different types of input weighting matrix to show the effectiveness of this feature of the SDRE method in cancer treatment. We have also shown that, although the tumor cell population is not zero at the end of treatment, the system converges to the tumor-free equilibrium point even after the inputs are removed, because the stability of this point has been changed and the system has been pushed into its domain of attraction.
So, the development of combination vaccine–chemotherapy protocols for treating certain forms of cancer is an appropriate strategy in cancer treatment research. The present study also suggests that a proper treatment method should not only change the dynamics of the cancer but also reduce the population of tumor cells, a combination that has not yet been considered and whose absence is a shortcoming of many treatment methods.