1 Introduction: Chemotherapy for Heterogeneous Tumor Populations

The prevailing paradigm in cancer chemotherapy is to give as much of the drug as possible (MTD, maximum tolerated dose) immediately. The reason is that cancer is a largely symptomless disease which, once finally detected, often is in an advanced stage where immediate action is required. Then the aim simply is to be as toxic as possible to the cancerous cells. If the tumor consists of a homogeneous agglomeration of chemotherapeutically sensitive cells, simple mathematical models confirm such a strategy as optimal (e.g., see [5, 16, 18, 27–29]). However, malignant cancer cell populations are genetically unstable and, coupled with fast proliferation rates, this leads to a great variety in the structure of the cells within one tumor: the number of genetic errors present within one cancer cell can lie in the thousands [19]. Consequently, tumors often consist of heterogeneous agglomerations of subpopulations that show widely varying sensitivities towards the actions of a particular chemotherapeutic agent [7, 8]. In medicine, the Norton–Simon hypothesis postulates that tumors consist of faster growing cells that are sensitive to chemotherapy and slower growing populations of cells that, with time, become resistant to the chemotherapeutic agent (acquired drug resistance). There may even exist small subpopulations of cells for which the specific activation mechanism of a chemotherapeutic agent does not work and which thus are not sensitive to the treatment from the beginning (ab initio, intrinsic resistance). Given such a scenario, over time, as the drugs kill sensitive tumor cells, resistant subpopulations of cancer cells may emerge that will make an MTD-style therapy less and less effective [17, 18, 30].
Even if the fraction of intrinsically resistant tumor cells is tiny and undetectable, after the sensitive cells have been killed by the treatment, it may grow in time to become a fully developed tumor of chemotherapeutically resistant cells leading to the failure of therapy, possibly only after many years of seeing remission of the cancer.

The question of how chemotherapeutic agents should be scheduled in the long run to optimize their effects is a difficult one when the true system (patient) is considered, and many systemic aspects need to be taken into account to give a satisfactory answer. In fact, the entire tumor microenvironment (consisting of the tumor vasculature that provides nutrients, tumor-immune system interactions and many other aspects such as fibroblast cells, extracellular matrix, etc.) will need to be considered, all residing in healthy tissue and contributing to the multifaceted nature of the disease [6]. Modern treatments therefore are increasingly multi-targeted therapies that not only aim to kill cancer cells, but also include antiangiogenic therapy, immunotherapy and other options. But even before these other components are addressed, it is important to understand the influence that tumor heterogeneity has on the structure of optimal protocols. Several mathematical models for developing drug resistance have been put forward (e.g., see the monograph by Martin and Teo [20] and the work by Swierniak and Smieja [30]) and analyzed mathematically. For simple 2-compartment models in which only sensitive and resistant cell populations are distinguished, it can be shown that as the resistant subpopulation becomes too large, a standard MTD approach will no longer be optimal since the damage caused by high dose chemotherapy to healthy cells outweighs the benefits of killing the cancer cells [17]. In a recent paper by Lavi, Greene, Gottesman and Levy [9, 15], a mathematical model for multi-drug resistance in cancer has been proposed that leads to the emergence of specific traits (or resistance levels) as a response to cell density and mutations. It thus is interesting to consider models which include various levels of chemotherapeutic sensitivities or drug resistance. In this paper, we consider such a model for a single chemotherapeutic agent distinguishing three distinct levels.
For the sake of terminology, we call the subpopulations ‘sensitive’, ‘partially sensitive’ and ‘resistant’. We analyze the dynamic properties of the system under a continuous, low-dose and constant drug administration. Recently, there have been a number of medical trials that explore such low dose administration of chemotherapy and the beneficial effects that it has under the name of metronomic dosing [1, 2, 12–14, 23]. More generally, we also consider the optimal control problem of minimizing the tumor burden over a prescribed therapy interval. Interestingly, as more levels of sensitivity are taken into account in the model, lower time-varying dose rates, such as those given by singular controls, become a treatment option. In simpler 2-compartment models, this is the case only once a significant residuum of resistant cells has been created.

2 A 3-Compartment Mathematical Model for Tumor Heterogeneity

In this paper, we consider a mathematical model for heterogeneous tumor populations similar to the one considered in [11] that distinguishes between three distinct subpopulations. For simplicity of terminology, we label them “sensitive”, S, “partially sensitive”, P, and “resistant”, R, but the terminology is only meant to indicate that these populations have different sensitivities towards a chemotherapeutic agent with S the highest and R the lowest. We assume that these subpopulations grow at rates α 1, α 2 and α 3, respectively. Generally, we do not make assumptions on the order of the growth rates, but an ordering α 1>α 2>α 3 would be consistent with the “Norton–Simon hypothesis” according to which a tumor consists of faster-growing populations of chemotherapeutically sensitive cells and slower-growing populations of increasingly more resistant cells [21, 22]. We allow for transitions between the compartments, i.e., we include the typical effects that sensitive cells become more resistant through mutations, but we also allow for resensitizations which make cells less resistant to the chemotherapeutic agent. This phenomenon is well-documented in the medical literature, e.g., see [10, 26]. We denote the transition rates between the compartments by using a Greek letter to denote the originating compartment and a Roman letter to denote the receiving compartment. For example, σ P denotes the transition rate from sensitive to partially sensitive cells while π S denotes the reverse transition rates from partially sensitive to sensitive cells. In this paper, these rates are assumed to be constant and positive. This corresponds to an ergodic structure in which all compartments are repeatedly visited by cells. 
Cell kill by a chemotherapeutic agent is expressed by the standard linear log-kill hypothesis: if we denote the concentration of the drug in the bloodstream by u, then the rate of cells eliminated is given by φ i u, i=1,2,3, with the coefficients φ 1, φ 2 and φ 3 representing the effectiveness of the drug on the sensitive, partially sensitive and resistant subpopulations, respectively. Thus φ 1>φ 2>φ 3≥0. The case φ 3=0 corresponds to the situation of a fully resistant subpopulation R. As a matter of simplification of the model, we do not include the standard pharmacokinetic model on the agent here and treat u as the control of the system with maximum concentration given by u max. The controlled dynamics is then determined by the inflows and outflows from the various compartments and is given by the following 3-dimensional linear system of equations:

$$\begin{aligned} \dot{S} & = ( \alpha_{1}-\sigma_{P}-\sigma_{R}- \varphi_{1}u ) S + \pi_{S}P + \rho_{S}R, \end{aligned}$$
(1)
$$\begin{aligned} \dot{P} & = \sigma_{P}S + ( \alpha_{2}-\pi_{S}- \pi_{R}-\varphi _{2}u ) P + \rho_{P}R, \end{aligned}$$
(2)
$$\begin{aligned} \dot{R} & = \sigma_{R}S + \pi_{R}P + ( \alpha_{3}-\rho_{S}-\rho _{P}- \varphi_{3}u ) R. \end{aligned}$$
(3)

Admissible controls are Lebesgue measurable functions with values in a compact interval \([0,u_{\max}]\), \(u:[0,T]\rightarrow[0,u_{\max}]\), \(t\mapsto u(t)\).

Lemma 1

For any admissible control u, the solution to Eqs. (1)–(3) exists on the full interval [0,T] and all components are positive on (0,T].

Proof

The system (1)–(3) is a homogeneous linear system with coefficient matrix having bounded Lebesgue measurable entries. Thus solutions exist over the full interval [0,T]. Without loss of generality, we assume that the initial population size is positive so that the solution is nontrivial. Even if one of the components vanishes at the initial time, it will immediately become positive. For example, if R(0)=0, then \(\dot{R}(0)=\sigma_{R}S(0)+\pi_{R}P(0)>0\) since at least one of P(0) or S(0) is positive. Thus there exists an interval (0,ε) with ε>0 on which all components are positive. If one of the variables becomes zero at a later time, let τ denote the minimum of all times for which one of the components S, P or R is zero. Then τ>0 and at least one of the other states is positive at time τ. But, as before, if, for example, P(τ)=0, then again \(\dot{P}(\tau)=\sigma_{P}S(\tau)+\rho_{P}R(\tau)>0\). Contradiction. Hence all states remain positive. □
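The positivity statement of Lemma 1 can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it integrates (1)–(3) with a classical Runge–Kutta scheme, using the transition and growth rates from the numerical illustrations later in this section. The effectiveness coefficients `phi`, the constant dose `u` and the step size are hypothetical choices made only for this example.

```python
import numpy as np

# Transition and growth rates as in the numerical illustrations of this section.
sigma_P, sigma_R, pi_S, pi_R, rho_S, rho_P = 4.0, 2.0, 1.0, 2.0, 0.5, 0.25
alpha = (10.0, 5.0, 2.0)

A = np.array([
    [alpha[0] - sigma_P - sigma_R, pi_S, rho_S],
    [sigma_P, alpha[1] - pi_S - pi_R, rho_P],
    [sigma_R, pi_R, alpha[2] - rho_S - rho_P],
])

# Hypothetical drug effectiveness and dose (assumptions, not from the text):
# phi_i * u lowers the net growth rates from (10, 5, 2) to (-3, -1, 2).
phi = np.array([13.0, 6.0, 0.0])
B = -np.diag(phi)
u = lambda t: 1.0  # constant admissible control, assuming u_max >= 1


def rk4(rhs, N0, T, steps):
    """Classical fourth-order Runge-Kutta scheme for N' = rhs(t, N)."""
    h = T / steps
    N = np.empty((steps + 1, len(N0)))
    N[0] = N0
    for k in range(steps):
        t = k * h
        k1 = rhs(t, N[k])
        k2 = rhs(t + h / 2, N[k] + h / 2 * k1)
        k3 = rhs(t + h / 2, N[k] + h / 2 * k2)
        k4 = rhs(t + h, N[k] + h * k3)
        N[k + 1] = N[k] + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return N


# Start with sensitive cells only, so that P(0) = R(0) = 0.
N = rk4(lambda t, y: (A + u(t) * B) @ y, np.array([1.0, 0.0, 0.0]), T=3.0, steps=3000)

all_positive = bool((N[1:] > 0).all())          # positivity on (0, T]
resistant_fraction = N[-1, 2] / N[-1].sum()     # share of R at the final time
print(all_positive, resistant_fraction)
```

Starting from S(0)=1 and P(0)=R(0)=0, all three components should be positive after the first step, in line with the lemma, and for these assumed values the resistant fraction at the final time should be close to the value 0.8765 quoted below for diagram (b).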

2.1 Steady-State Behavior of the Relative Proportions

A discrete-time analogue of the model formulated above is a homogeneous Markov chain with states S, P and R and positive transition probabilities between each pair of states. Such a chain is ergodic and has a well-defined limiting stationary distribution for which all probabilities to be in a particular state are positive. In this section, we show that the dynamical systems version has the same steady-state behavior: the proportions of cells in the respective compartments converge to a positive limit.

Let C denote the total number of cancer cells, C=S+P+R, and consider a continuous administration of some chemotherapeutic agent at possibly a constant low dose u≡const. The growth of the total population is then given by

$$\dot{C} = ( \alpha_{1}-\varphi_{1}u ) S + ( \alpha_{2} -\varphi_{2}u ) P+ ( \alpha_{3}-\varphi _{3}u ) R. $$

Mathematically, the analysis of the dynamics can always be reduced to the uncontrolled system by setting \(\hat{\alpha}_{i}=\alpha_{i}-\varphi _{i}u\) and we thus consider the case u≡0. Note, however, once this is done the growth relations between the compartments may change. For example, if the drug is effective on the sensitive cells, this will generate a negative growth rate while a truly resistant compartment will not be affected. Thus the rates \(\hat{\alpha}_{i}\) can be negative and there may be no order relation between these coefficients. Hence our analysis will be carried out for arbitrary reals α i . Let x, y and z denote the proportions of the respective populations, i.e.,

$$x = \frac{S}{C}, \qquad y = \frac{P}{C}, \quad\mbox{and} \quad z = \frac{R}{C}. $$

Since S, P and R satisfy linear differential equations, the quotients x, y and z obey Riccati equations. Direct computations verify that

$$\begin{aligned} \dot{x} & = ( \alpha_{1}-\sigma_{P}-\sigma_{R} ) x + \pi_{S}y + \rho_{S}z - x(\alpha_{1}x+ \alpha_{2}y+\alpha_{3}z), \end{aligned}$$
(4)
$$\begin{aligned} \dot{y} & = \sigma_{P}x + ( \alpha_{2}-\pi_{S}- \pi_{R} ) y + \rho_{P}z - y(\alpha_{1}x+ \alpha_{2}y+\alpha_{3}z), \end{aligned}$$
(5)
$$\begin{aligned} \dot{z} & = \sigma_{R}x + \pi_{R}y + ( \alpha_{3}-\rho_{S}-\rho _{P} ) z - z( \alpha_{1}x+\alpha_{2}y+\alpha_{3}z). \end{aligned}$$
(6)
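For the reader's convenience, the direct computation behind (4) can be sketched as follows, using the quotient rule and \(\dot{C}/C=\alpha_{1}x+\alpha_{2}y+\alpha_{3}z\) for u≡0; Eqs. (5) and (6) follow in the same way:

$$\begin{aligned} \dot{x} & = \frac{\dot{S}}{C} - \frac{S}{C}\,\frac{\dot{C}}{C} \\ & = \frac{ ( \alpha_{1}-\sigma_{P}-\sigma_{R} ) S + \pi_{S}P + \rho_{S}R}{C} - x\,\frac{\alpha_{1}S+\alpha_{2}P+\alpha_{3}R}{C} \\ & = ( \alpha_{1}-\sigma_{P}-\sigma_{R} ) x + \pi_{S}y + \rho_{S}z - x ( \alpha_{1}x+\alpha_{2}y+\alpha_{3}z ). \end{aligned}$$

Adding the three equations, the transition terms cancel and one obtains

$$\frac{d}{dt} ( x+y+z ) = ( \alpha_{1}x+\alpha_{2}y+\alpha_{3}z ) \bigl( 1 - (x+y+z) \bigr), $$

so that the relation x+y+z≡1 is preserved along solutions.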

Our first aim is to establish that there exists a unique steady-state \((x_{\ast},y_{\ast},z_{\ast})\) for the corresponding system. Let Σ denote the unit simplex in \(\mathbb{R}^{3}\), i.e.,

$$\varSigma= \bigl\{ (x,y,z): x\geq0, y\geq0, z\geq0, x+y+z=1 \bigr\} . $$

Theorem 1

The unit simplex Σ is positively invariant under the dynamics (4)–(6) and there exists a unique, globally asymptotically stable equilibrium point \((x_{\ast},y_{\ast},z_{\ast})\) in Σ, i.e., for any initial condition \((x_{0},y_{0},z_{0})\in\varSigma\), the corresponding trajectory converges to \((x_{\ast},y_{\ast},z_{\ast})\) as t→∞.

Corollary 1

If a chemotherapeutic agent is administered at a constant low dose u, then the total tumor population asymptotically grows exponentially at rate

$$( \alpha_{1}-\varphi_{1}u ) x_{\ast} + ( \alpha _{2}-\varphi_{2}u ) y_{\ast} + ( \alpha_{3}- \varphi_{3}u ) z_{\ast}. $$

Proof

The total cancer population C satisfies

$$\begin{aligned} \dot{C} & = ( \alpha_{1}-\varphi_{1}u ) S + ( \alpha _{2}-\varphi_{2}u ) P + ( \alpha_{3}- \varphi_{3}u ) R \\ & = \bigl[ ( \alpha_{1}-\varphi_{1}u ) x + ( \alpha _{2}-\varphi_{2}u ) y + ( \alpha_{3}- \varphi_{3}u ) z \bigr] C \\ & \simeq \bigl[ ( \alpha_{1}-\varphi_{1}u ) x_{\ast} + ( \alpha_{2}-\varphi_{2}u ) y_{\ast} + ( \alpha _{3}-\varphi_{3}u ) z_{\ast} \bigr] C. \end{aligned}$$

Note that the limit \((x_{\ast},y_{\ast},z_{\ast})\) also is a function of the dose rate u. □

We divide the proof of the theorem into several lemmas.

Lemma 2

The unit simplex Σ is positively invariant under the dynamics (4)–(6).

Proof

By definition we have that x+y+z≡1 and, setting z=1−x−y, we consider the unit simplex as a subset of (x,y)-space in \(\mathbb{R}^{2}\). It suffices to show that all trajectories starting at a point \((x_{0},y_{0})\) in the boundary of Σ, ∂Σ, enter the interior of Σ. For x=0 we have that \(\dot {x}_{|x=0}=\pi _{S}y+\rho_{S}z\) and since at least one of y or z must be positive, it follows that \(\dot{x}_{|x=0}>0\). Analogously we have that

$$\dot{y}_{|y=0} = \sigma_{P}x + \rho_{P}z > 0 \quad\mbox{and} \quad \dot{z}_{|z=0} = \sigma_{R}x + \pi_{R}y > 0. $$

Hence, whenever (x 0,y 0)∈∂Σ, the vector field defining the dynamics points inside Σ. □

It thus follows (for example, from Poincaré–Bendixson theory) that there exists at least one equilibrium point inside of Σ.

Lemma 3

The unit simplex Σ contains exactly one equilibrium point for the dynamical system (4)–(6).

Proof

Without loss of generality, we consider Eqs. (4) and (5) coupled with the relation x+y+z≡1. In order to simplify the notation a bit, we denote the relative growth rates of the subpopulations by \(\lambda_{S}=\alpha_{1}-\sigma_{P}-\sigma_{R}\), \(\lambda_{P}=\alpha_{2}-\pi_{S}-\pi_{R}\) and \(\lambda_{R}=\alpha_{3}-\rho_{S}-\rho_{P}\). Using a blow-up in the variables of the form y=wx and z=vx with positive coefficients v and w, it follows that \(x=\frac{1}{1+w+v}>0\) and (x,y,z)∈Σ for positive v and w. It therefore suffices to show that there exists a unique positive solution (v,w) to the equations

$$\begin{aligned} 0 & = \lambda_{S} + \pi_{S}w + \rho_{S}v - ( \alpha_{1}+\alpha _{2}w + \alpha_{3}v ) x, \end{aligned}$$
(7)
$$\begin{aligned} 0 & = \sigma_{P} + \lambda_{P}w + \rho_{P}v - ( \alpha_{1}+\alpha_{2}w + \alpha_{3}v)wx. \end{aligned}$$
(8)

Equating these two relations gives

$$\lambda_{S}+\pi_{S}w+\rho_{S}v = ( \alpha_{1}+\alpha_{2}w+\alpha _{3}v ) x = \frac{1}{w} ( \sigma_{P}+\lambda_{P}w+ \rho_{P}v ) $$

which yields

$$ \pi_{S}w^{2}+ ( \lambda_{S}- \lambda_{P} ) w -\sigma_{P} = ( \rho_{P}- \rho_{S}w ) v. $$
(9)

This relation defines v as a rational function of w of the form

$$v = v(w) = \frac{\pi_{S}w^{2} + ( \lambda_{S}-\lambda_{P} ) w - \sigma_{P}}{ ( \rho_{P}-\rho_{S}w )} = \frac{Q(w)}{L(w)} $$

with L and Q the respective linear and quadratic polynomials. Since Q(0)<0, Q has a unique positive root which we denote by \(w_{q}\), and it matters how it is located relative to the root \(w_{\ell}=\frac{\rho _{P}}{\rho_{S}}>0\) of L. If \(w_{q}\neq w_{\ell}\), then v(w) has a pole at \(w_{\ell}\) and is positive and strictly increasing from 0 to +∞ over the interval \([w_{q},w_{\ell})\) if \(w_{q}<w_{\ell}\), while v(w) is positive and strictly decreasing from +∞ to 0 over the interval \((w_{\ell},w_{q}]\) if \(w_{q}>w_{\ell}\) (see Fig. 1). If the zeros cancel, formally, the solution is given by \(v(w_{\ell})=+\infty\), but in this case Eq. (9) is satisfied trivially.

Fig. 1 Existence of a unique equilibrium point

Equation (7) is equivalent to

$$0 = ( \lambda_{S}+\pi_{S}w+\rho_{S}v ) (1+w+v) - ( \alpha _{1}+\alpha_{2}w+\alpha_{3}v ) $$

which gives

$$\begin{aligned} 0 = & \pi_{S}w^{2} + ( \pi_{S}+ \rho_{S} ) wv + \rho _{S}v^{2} \\ & {}+ \bigl[ \lambda_{S}- ( \alpha_{2}- \pi_{S} ) \bigr] w + \bigl[ \lambda_{S}- ( \alpha_{3}-\rho_{S} ) \bigr] v - (\sigma _{P}+ \sigma_{R}). \end{aligned}$$

Using the relation (9), this reduces to

$$ \rho_{S}v^{2} + ( \lambda_{S}- \lambda_{R} ) v-\sigma_{R} = ( \pi_{R}- \pi_{S}v ) w. $$
(10)

This equation is the analogue to (9) and symmetrically defines w as a rational function of v in the form

$$w = w(v) = \frac{\rho_{S}v^{2}+ ( \lambda_{S}-\lambda_{R} ) v - \sigma_{R}}{ ( \pi_{R}-\pi_{S}v )} = \frac{\tilde{Q}(v)}{\tilde{L}(v)}. $$

As above, the quadratic polynomial has a unique positive root which we denote by \(v_{q}\), and it matters how it is located relative to the root \(v_{\ell}=\frac{\pi_{R}}{\pi_{S}}>0\) of \(\tilde{L}\). If \(v_{q}\neq v_{\ell}\), then w(v) has a pole at \(v_{\ell}\) and w(v) is positive and strictly increasing from 0 to +∞ over the interval \([v_{q},v_{\ell})\) if \(v_{q}<v_{\ell}\), while w(v) is positive and strictly decreasing from +∞ to 0 over the interval \((v_{\ell},v_{q}]\) if \(v_{q}>v_{\ell}\) (see Fig. 1). If the zeros cancel, Eq. (10) again is trivially satisfied in the form 0=0.

The relative location of the zeros is determined by the two quantities

$$A = \pi_{S}\rho_{P}^{2} + ( \lambda_{S}-\lambda_{P} ) \rho _{P} \rho_{S} - \sigma_{P}\rho_{S}^{2} $$

and

$$B = \rho_{S}\pi_{R}^{2} + ( \lambda_{S}-\lambda_{R} ) \pi _{S} \pi_{R}-\sigma_{R}\pi_{S}^{2} $$

and is summarized in Table 1.

Table 1 Summary of the location of the poles and roots of the rational functions defined by Eqs. (9) and (10)

In each case, there exists a unique positive solution \((\bar{v},\bar {w})\) to Eqs. (9) and (10). For example, if A and B are both positive, then the function v(w) is only positive over the interval \((w_{q},w_{\ell})\) and it is strictly increasing from 0 to +∞. At the same time, the function w(v) is only positive over the interval \((v_{q},v_{\ell})\) and is strictly increasing from 0 to +∞. The inverse branch therefore is defined on all of (0,∞) and the values increase from \(v_{q}\) to \(v_{\ell}\). Thus there exists a unique intersection of these two curves and we have that \(\bar{v}\in(v_{q},v_{\ell})\) and \(\bar {w}\in(w_{q},w_{\ell})\). Analogous reasoning gives the existence of a unique positive solution \((\bar{v},\bar{w})\) whenever A and B are nonzero. If one of them is zero, the argument is slightly different. For the sake of specificity, suppose A=0. In this case, \(\bar{w}=w_{\ell}=\frac{\rho_{P}}{\rho _{S}}\) is a solution to Eq. (9) and the solution \(\bar{v}\) to Eq. (10) is the unique intersection of this line with the graph of the inverse branch of the function w(v) (whose range is the interval (0,∞)) if B≠0 or \(\bar{v}=\frac{\pi_{R}}{\pi_{S}}\) if B=0 as well. The underlying geometry is illustrated qualitatively in Fig. 1. For the case (A,B)=(0,0) the equilibrium point is explicitly given by \((\bar{v},\bar{w})=(\frac{\pi_{R}}{\pi_{S}},\frac{\rho_{P}}{\rho_{S}})\) and generally the equilibrium point will be quite close to this value. □
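Relations (9) and (10) can be checked numerically for a concrete parameter set. The sketch below is only an illustration under stated assumptions: it uses the rates of diagram (a) in Fig. 2 and obtains the equilibrium proportions as the normalized eigenvector of the dominant eigenvalue of the system matrix, a shortcut that is not used in the proof above. It relies on the observation that the column sums of this matrix are \(\alpha_{1},\alpha_{2},\alpha_{3}\), so that an equilibrium of (4)–(6) in Σ is such an eigenvector.

```python
import numpy as np

# Transition and growth rates of diagram (a) in Fig. 2 (taken from the text).
sigma_P, sigma_R = 4.0, 2.0
pi_S, pi_R = 1.0, 2.0
rho_S, rho_P = 0.5, 0.25
a1, a2, a3 = 10.0, 5.0, 2.0

lam_S = a1 - sigma_P - sigma_R
lam_P = a2 - pi_S - pi_R
lam_R = a3 - rho_S - rho_P

M = np.array([[lam_S, pi_S, rho_S],
              [sigma_P, lam_P, rho_P],
              [sigma_R, pi_R, lam_R]])

# Equilibrium proportions: eigenvector of the dominant eigenvalue, scaled to sum 1.
eigvals, eigvecs = np.linalg.eig(M)
k = np.argmax(eigvals.real)
vec = eigvecs[:, k].real
x, y, z = vec / vec.sum()

w, v = y / x, z / x  # blow-up variables: y = w x, z = v x

# Residuals of relations (9) and (10); both should vanish up to rounding.
res9 = pi_S * w**2 + (lam_S - lam_P) * w - sigma_P - (rho_P - rho_S * w) * v
res10 = rho_S * v**2 + (lam_S - lam_R) * v - sigma_R - (pi_R - pi_S * v) * w

# Quantities A and B that fix the position of the roots relative to the poles.
A_quant = pi_S * rho_P**2 + (lam_S - lam_P) * rho_P * rho_S - sigma_P * rho_S**2
B_quant = rho_S * pi_R**2 + (lam_S - lam_R) * pi_S * pi_R - sigma_R * pi_S**2

print(w, v, res9, res10, A_quant, B_quant)
```

For this parameter set A is negative and B is positive, so one expects \(\bar{w}>w_{\ell}\) and \(\bar{v}<v_{\ell}\), which the computed values can be compared against.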

Theorem 1 then follows from the fact that the system (4)–(6) does not have periodic orbits. This, however, is a more difficult and lengthy technical argument and we only indicate the reasoning. Unfortunately, Bendixson’s criterion does not always work here. The divergence of the vector field F that defines the \(\dot{x}\) and \(\dot{y}\) dynamics is given by

$$\operatorname{div}F = \alpha_{1} + \alpha_{2} + \alpha_{3} - 3 ( \alpha_{1}x+\alpha_{2}y+ \alpha_{3}z ) -(\sigma_{P}+\sigma_{R}+ \pi_{S}+\pi_{R}+\rho_{S}+\rho_{P}) $$

and will always be negative for at least one vertex. But the maximum, also attained at a vertex, may be positive and thus a different argument is required. It follows from index theory that any periodic orbit γ must contain the equilibrium point \((x_{\ast},y_{\ast})\) in the region encircled by γ. Hence, if there exists a trajectory \((x(t;x_{0},y_{0}),y(t;x_{0},y_{0}))\) that starts at a point \((x_{0},y_{0})\in\partial\varSigma\) and in the limit t→∞ converges to \((x_{\ast},y_{\ast})\), then no periodic orbits can exist. (The periodic orbit would need to intersect this trajectory, contradicting uniqueness of solutions.) This is indeed the case (cf. Figs. 2 and 3) and can be verified through a more elaborate analysis of the geometric shape of the curves \(\dot{x}=0\) and \(\dot{y}=0\). Essentially, either these curves bound at least one sub-region of Σ with the property that all trajectories flow out of this region while trajectories cannot enter it, or the Poincaré return map to these curves has no fixed points.

Overall, for any initial condition \((x_{0},y_{0})\in\varSigma\), the corresponding ω-limit set \(\varOmega(x_{0},y_{0})\), the set of all accumulation points of the trajectory as t→∞, is nonempty and, by Poincaré’s theorem, since there are no periodic orbits, it must contain at least one equilibrium point. The fact that there is a unique equilibrium point also precludes the existence of homoclinic orbits and thus \(\varOmega(x_{0},y_{0})=\{(x_{\ast},y_{\ast})\}\). Hence every trajectory converges to the unique equilibrium point.

In Figs. 2 and 3 we show some samples of trajectories of the system (4)–(5) starting at the vertices of Σ. Note that all trajectories converge to the equilibrium, thus precluding periodic orbits. We also graph the line where \(\operatorname{div}F=0\) in green, and it can be seen that the divergence may change sign on Σ.

Fig. 2 In each figure, the equilibrium point \((x_{\ast},y_{\ast},z_{\ast})\) is marked by a red dot and the blue curves are the trajectories of the system (4)–(5) starting at the vertices of the unit simplex Σ. Also shown is the line where the divergence of the vector field vanishes (Color figure online)

Fig. 3 In each figure, the equilibrium point \((x_{\ast},y_{\ast},z_{\ast})\) is marked by a red dot and the blue curves are the trajectories of the system (4)–(5) starting at the vertices of the unit simplex Σ. Also shown is the line where the divergence of the vector field vanishes (Color figure online)

In all four diagrams in Fig. 2 we have chosen the transition rates equal to \(\sigma_{P}=4\), \(\sigma_{R}=2\), \(\pi_{S}=1\), \(\pi_{R}=2\), \(\rho_{S}=0.5\) and \(\rho_{P}=0.25\). In diagram (a) (top, left), the growth rates for the respective compartments are \(\alpha_{1}=10\), \(\alpha_{2}=5\) and \(\alpha_{3}=2\) while these rates are \(\alpha_{1}=-3\), \(\alpha_{2}=-1\) and \(\alpha_{3}=2\) in diagram (b) (top, right). The numbers are just for illustration, but scenario (a) could be considered an uncontrolled system with the sensitive cells the most strongly proliferating ones and the resistant population the slowest growing subpopulation. Diagram (b) then would be typical of a system with constant rate chemotherapy that kills sensitive and partially sensitive cells, and in effect generates a negative growth rate for these subpopulations, while it is assumed that the resistant subpopulation R is fully resistant. Note how the equilibrium point shifts towards the origin, which implies a strong dominance of the resistant subpopulation R. The approximate growth rates \(\hat{\alpha}_{1}x_{\ast}+\hat{\alpha}_{2}y_{\ast}+\hat{\alpha }_{3}z_{\ast}\) for the two cases are given by 5.6525 for scenario (a) and by 1.5320 for scenario (b). Thus, while the chemotherapy is able to reduce the growth, it cannot eliminate it. The reason is that in case (b) we have \(z_{\ast}=0.8765\) and coupled with \(\hat{\alpha }_{3}=2\), this positive growth rate cannot be overcome by the decline in the other populations. It is only when one assumes that the agent can also reduce the growth rate of the resistant population that one sees lower overall growth rates. But since \(z_{\ast}\rightarrow1\) as the effectiveness of the drug on the sensitive and partially sensitive populations becomes very high (\(x_{\ast}\rightarrow0\) and \(y_{\ast}\rightarrow0\)), it is clear that the net growth rate \(\hat{\alpha}_{3}\) of the resistant subpopulation becomes the determining factor. It is only when this rate becomes so small that it can be overcome by the decrease in the sensitive and partially sensitive populations that the overall growth rate can be made negative.
For example, this happens for \(\alpha_{1}=-10\), \(\alpha_{2}=-3\) and \(\alpha_{3}=0.5\), in which case \((x_{\ast},y_{\ast},z_{\ast})=(0.0322,0.0598,0.9080)\) and the overall growth rate is −0.0472. The corresponding diagram is shown in scenario (d) (bottom, right). Scenario (c) (bottom, left) shows an intermediate case for \(\alpha_{1}=-5.5\), \(\alpha_{2}=-3\) and \(\alpha_{3}=0.5\) when the overall growth rate of the total population is zero, i.e., the status quo is maintained.
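The equilibrium proportions and growth rates quoted for the scenarios of Fig. 2 can be cross-checked with a few lines of linear algebra. The following sketch is not the computation used by the authors; it rests on the observation that an equilibrium of (4)–(6) in Σ is an eigenvector of the system matrix (whose column sums are the growth rates \(\alpha_{i}\)), so that the asymptotic growth rate is the dominant eigenvalue. The function name `equilibrium_and_rate` is a hypothetical helper introduced for this illustration.

```python
import numpy as np


def equilibrium_and_rate(alpha, sigma_P, sigma_R, pi_S, pi_R, rho_S, rho_P):
    """Return the equilibrium proportions (x, y, z) and the growth rate
    alpha_1 x + alpha_2 y + alpha_3 z for net growth rates alpha."""
    a1, a2, a3 = alpha
    M = np.array([
        [a1 - sigma_P - sigma_R, pi_S, rho_S],
        [sigma_P, a2 - pi_S - pi_R, rho_P],
        [sigma_R, pi_R, a3 - rho_S - rho_P],
    ])
    eigvals, eigvecs = np.linalg.eig(M)
    k = np.argmax(eigvals.real)          # dominant (real) eigenvalue
    vec = eigvecs[:, k].real
    props = vec / vec.sum()              # normalize so that x + y + z = 1
    rate = float(np.dot(alpha, props))
    return props, rate


# Transition rates used in all four diagrams of Fig. 2.
rates = dict(sigma_P=4.0, sigma_R=2.0, pi_S=1.0, pi_R=2.0, rho_S=0.5, rho_P=0.25)

scenarios = {
    "a": (10.0, 5.0, 2.0),
    "b": (-3.0, -1.0, 2.0),
    "c": (-5.5, -3.0, 0.5),
    "d": (-10.0, -3.0, 0.5),
}

results = {name: equilibrium_and_rate(alpha, **rates) for name, alpha in scenarios.items()}
for name, (props, rate) in results.items():
    print(name, np.round(props, 4), round(rate, 4))
```

The printed values can be compared with the figures reported in the text; small differences in the last digits are to be expected from rounding.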

Figure 3 shows two further cases in which the transition rates are much smaller, given by \(\sigma_{P}=0.2\), \(\sigma_{R}=0.01\), \(\pi_{S}=0.02\), \(\pi_{R}=0.05\), \(\rho_{S}=0.01\) and \(\rho_{P}=0.03\). Figure 3(a) on the left illustrates a typical initial scenario with a large portion of sensitive and partially sensitive cells that get eliminated with treatment, with the balance shifting towards a dominance of resistant cells shown in Fig. 3(b). The overall growth rate in this case again remains positive.

3 Chemotherapy as an Optimal Control Problem

In the previous section, we only considered constant administration of chemotherapy at low doses, for example as it would be given in metronomic dosing. Here we now more generally consider the optimal control problem to minimize the tumor burden over a fixed therapy interval [0,T] through administration of chemotherapy. Once the dose rates are no longer small, it becomes imperative to limit the toxicity of treatment. Under the standard log-kill hypothesis, the damage done to cells is proportional to the concentration of drugs given and thus the integral \(\int_{0}^{T}u(t)dt\), which can be interpreted as the total dose of drugs given over the interval [0,T], becomes a measure for the toxic side effects of treatment. In the approach taken in this paper we treat the toxicity of treatment as a soft constraint by including this integral in the objective as a term to be minimized, and consider the following optimal control problem:

[OC]:

For a fixed therapy horizon [0,T], minimize the objective

$$\begin{aligned} J(u) = & rN(T) + \int_{0}^{T} qN(t)+u(t)dt \\ = & r_{1}S(T) + r_{2}P(T) + r_{3}R(T) \\ & {}+ \int_{0}^{T} q_{1}S(t) + q_{2}P(t) + q_{3}R(t) +u(t) dt \rightarrow\min \end{aligned}$$
(11)

over all Lebesgue-measurable functions u:[0,T]→[0,u max] subject to the dynamics (1)–(3).

In the objective, we denote the state of the system by N, \(N=(S,P,R)^{T}\), written as a column vector, and the coefficients \(r=(r_{1},r_{2},r_{3})\) and \(q=(q_{1},q_{2},q_{3})\) are positive weights which we write as row vectors. Thus the inner product rN(T) is a weighted average of all tumor cells at the end of the therapy horizon and the integral of qN(t), t∈[0,T], takes a weighted average over the therapy interval. This term is included in order to prevent the tumor populations from rising to unacceptably high levels during the therapy interval. Including the integral over the concentration, \(\int_{0}^{T}u(t)dt\), in the minimization forces a compromise between the objectives of minimizing the tumor burden and limiting toxicity of treatment. Without loss of generality we normalize the weight of this integral to be 1. Generally, the parameters r and q are variables of choice and may be calibrated to obtain a desired response of the system.

We write the dynamics (1)–(3) more compactly in matrix form as

$$ \dot{N}= ( A+uB ) N $$
(12)

with the matrices A and B given by

$$A=\left ( \begin{array}{c@{\quad}c@{\quad}c} \alpha_{1}-\sigma_{P}-\sigma_{R} & \pi_{S} & \rho_{S}\\ \sigma_{P} & \alpha_{2}-\pi_{S}-\pi_{R} & \rho_{P}\\ \sigma_{R} & \pi_{R} & \alpha_{3}-\rho_{S}-\rho_{P} \end{array} \right ) $$

and

$$B=\left ( \begin{array}{c@{\quad}c@{\quad}c} -\varphi_{1} & 0 & 0\\ 0 & -\varphi_{2} & 0\\ 0 & 0 & -\varphi_{3} \end{array} \right ) . $$

3.1 Necessary Conditions for Optimality

First order necessary conditions for optimality for the optimal control problem [OC] are given by the Pontryagin maximum principle [24] (for some recent references about optimal control, see [3, 4, 25]). If \(u_{\ast}:[0,T]\rightarrow[0,u_{\max}]\) is an optimal control with corresponding trajectory \(N_{\ast}\), then there exist a constant \(\lambda_{0}\geq0\) and multipliers \(\lambda=(\lambda_{1},\lambda_{2},\lambda _{3}):[0,T]\rightarrow ( \mathbb{R}^{3} ) ^{\ast}\) (written as a row vector), the so-called adjoint variable, such that the following conditions are satisfied:

  (a) (nontriviality of the multipliers) \((\lambda_{0},\lambda(t))\neq0\) for all t∈[0,T];

  (b) (adjoint equations and transversality conditions) defining the Hamiltonian function H as

    $$ H = H(\lambda_{0},\lambda,N,u) = \lambda_{0} ( qN+u ) + \lambda ( A+uB ) N, $$
    (13)

    the multiplier λ satisfies the following linear differential equation:

    $$ \dot{\lambda} = -\frac{\partial H}{\partial N} = -\lambda_{0}q-\lambda ( A+u_{\ast}B ) , \qquad\lambda(T) = \lambda_{0}r, $$
    (14)

  (c) (minimum condition) for almost every time t∈[0,T], the optimal control \(u_{\ast}(t)\) minimizes the Hamiltonian H pointwise over the control set \([0,u_{\max}]\) along \((\lambda_{0},\lambda(t),N_{\ast}(t))\), i.e.,

    $$ H\bigl(\lambda_{0},\lambda(t),N_{\ast}(t),u_{\ast}(t) \bigr) = \min_{0\leq u\leq u_{\max}}H\bigl(\lambda_{0}, \lambda(t),N_{\ast}(t),u\bigr), $$
    (15)

    and this minimum value is constant over the interval [0,T],

    $$ H\bigl(\lambda_{0},\lambda(t),N_{\ast}(t),u_{\ast}(t) \bigr) = \mathrm{const}. $$
    (16)

A controlled trajectory (N,u) for which there exist multipliers \(\lambda_{0}\) and λ such that these conditions are satisfied is called an extremal and the triple \((N,u,(\lambda_{0},\lambda))\) is an extremal lift. If the multiplier \(\lambda_{0}\) vanishes, the extremal is called abnormal while it is called normal if \(\lambda_{0}\) is positive. In the latter case, by dividing the other multipliers by \(\lambda_{0}\), it is always possible to normalize \(\lambda_{0}=1\). It is easy to see that all extremals for problem [OC] are normal: if \(\lambda_{0}=0\), then λ(t) satisfies a homogeneous time-varying linear differential equation with vanishing terminal condition λ(T)=0. Hence it vanishes identically, contradicting the nontriviality of the multipliers. We henceforth normalize the multiplier \(\lambda_{0}\) to be \(\lambda_{0}=1\) and drop it from our notation.

Lemma 4

The multipliers λ i , i=1,2,3 are positive over the interval [0,T].

Proof

In coordinates, the multipliers λ i satisfy the equations

$$\begin{aligned} \dot{\lambda}_{1} & = -\frac{\partial H}{\partial S} = -q_{1} - \lambda_{1} ( \alpha_{1}-\sigma_{P}- \sigma_{R}-\varphi _{1}u ) - \lambda_{2} \sigma_{P} - \lambda_{3}\sigma_{R}, \quad \lambda_{1}(T) =r_{1}, \\ \dot{\lambda}_{2} & = -\frac{\partial H}{\partial P} = -q_{2} - \lambda_{1}\pi_{S} -\lambda_{2} ( \alpha_{2}-\pi_{S}-\pi _{R}-\varphi_{2}u ) -\lambda_{3}\pi_{R}, \quad \lambda_{2}(T) =r_{2}, \\ \dot{\lambda}_{3} & = -\frac{\partial H}{\partial R} = -q_{3} - \lambda_{1}\rho_{S} - \lambda_{2} \rho_{P} - \lambda_{3} ( \alpha_{3}- \rho_{S}-\rho_{P}-\varphi_{3}u ) , \quad \lambda_{3}(T) =r_{3}. \end{aligned}$$

At the terminal time T all values are positive. Suppose there exists a time when at least one of the multipliers is negative and let

$$\tau=\max \bigl\{ t\leq T:\lambda_{1}(t)\lambda_{2}(t) \lambda _{3}(t)=0 \bigr\} < T. $$

This simply is the “first” time (counting backward) when one of the multipliers becomes zero. If j denotes an index such that λ j (τ)=0, it then follows from the differential equations that \(\dot{\lambda}_{j}(\tau)\leq-q_{j}<0\). But then λ j (t) must be negative for t>τ close to τ. Contradiction. Hence all multipliers remain positive. □
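Lemma 4 can also be observed numerically by integrating the adjoint equation (14) backward from its terminal condition. The sketch below is an illustration only: the matrix A uses the rates of diagram (a) in Fig. 2, while the weights `q` and `r`, the effectiveness coefficients in `B`, the constant control and the horizon `T` are hypothetical values chosen for this example.

```python
import numpy as np

# Matrix A for alpha = (10, 5, 2) and the transition rates of Fig. 2.
A = np.array([[4.0, 1.0, 0.5],
              [4.0, 2.0, 0.25],
              [2.0, 2.0, 1.25]])
B = -np.diag([13.0, 6.0, 0.0])      # hypothetical phi = (13, 6, 0)
q = np.array([1.0, 1.0, 1.0])       # hypothetical running weights
r = np.array([1.0, 1.0, 1.0])       # hypothetical terminal weights
u = lambda t: 0.5                   # hypothetical constant control
T, steps = 1.0, 2000


def adjoint_backward(A, B, u, q, r, T, steps):
    """Integrate lambda' = -q - lambda (A + u(t) B), lambda(T) = r,
    backward in time with RK4; rows of the result are lambda(t_k)."""
    h = T / steps
    t = np.linspace(0.0, T, steps + 1)
    lam = np.empty((steps + 1, 3))
    lam[-1] = r
    rhs = lambda s, l: -q - l @ (A + u(s) * B)
    for k in range(steps, 0, -1):
        # One RK4 step of size -h from t[k] to t[k-1].
        k1 = rhs(t[k], lam[k])
        k2 = rhs(t[k] - h / 2, lam[k] - h / 2 * k1)
        k3 = rhs(t[k] - h / 2, lam[k] - h / 2 * k2)
        k4 = rhs(t[k] - h, lam[k] - h * k3)
        lam[k - 1] = lam[k] - h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return t, lam


t, lam = adjoint_backward(A, B, u, q, r, T, steps)
multipliers_positive = bool((lam > 0).all())
print(multipliers_positive, lam[0])
```

Because the off-diagonal entries of A+uB are positive and q and r are positive, the computed multipliers should stay positive on the whole interval, as the lemma asserts.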

The minimum of the Hamiltonian H over the control set [0,u max] is attained at one of the boundary points u=0 or u=u max whenever the function

$$ \varPhi(t) = 1+\lambda(t)BN_{\ast}(t), $$
(17)

the so-called switching function, does not vanish and optimal controls satisfy

$$ u_{\ast}(t)=\left \{ \begin{array}{l@{\quad}l} 0 & \mbox{if }\varPhi(t)>0,\\ u_{\max} & \mbox{if }\varPhi(t)<0. \end{array} \right . $$
(18)
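Since the Hamiltonian is affine in u with slope \(\varPhi=1+\lambda BN\), the rule (18) is straightforward to express in code. The following is a minimal sketch; the function names `switching_function` and `bang_bang_control`, the tolerance and the sample numbers are hypothetical and serve only to illustrate the sign rule.

```python
import numpy as np


def switching_function(lam, N, phi):
    """Phi = 1 + lambda B N with B = -diag(phi),
    i.e. Phi = 1 - sum_i phi_i * lambda_i * N_i."""
    lam, N, phi = (np.asarray(a, dtype=float) for a in (lam, N, phi))
    return 1.0 - float(np.sum(phi * lam * N))


def bang_bang_control(Phi, u_max, tol=1e-12):
    """Control value prescribed by (18); returns None where Phi vanishes,
    since the minimum condition alone does not determine u there."""
    if Phi > tol:
        return 0.0
    if Phi < -tol:
        return u_max
    return None


# Hypothetical multiplier, state and effectiveness values.
Phi_low = switching_function(lam=[1.0, 1.0, 1.0], N=[1.0, 1.0, 1.0], phi=[0.2, 0.1, 0.0])
Phi_high = switching_function(lam=[2.0, 1.0, 1.0], N=[3.0, 1.0, 1.0], phi=[0.5, 0.1, 0.0])

print(Phi_low, bang_bang_control(Phi_low, u_max=1.0))
print(Phi_high, bang_bang_control(Phi_high, u_max=1.0))
```

In the first sample the weighted product of multipliers and cell numbers is small, so no drug is given; in the second it exceeds 1 and the full dose is prescribed.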

But optimal controls can also take values in the interior of the control set if the switching function vanishes identically over some open interval. Such controls are called singular while controls that take values in the extreme points of the interval are called bang–bang controls. Note that whenever Φ(τ)=0 and \(\dot{\varPhi}(\tau)\neq0\), then the optimal control switches between u max and 0 depending on the sign of \(\dot{\varPhi }(\tau)\). Hence the name bang–bang controls. If the control is singular over an open interval I, then (modulo some degenerate nongeneric situations) the control u explicitly occurs for the first time only in an even numbered derivative of the switching function. If this is the 2k-th derivative, then it is a necessary condition for optimality, the so-called generalized Legendre–Clebsch condition [3, 25], that

$$ (-1)^{k} \frac{\partial}{\partial u} \frac{d^{2k}}{dt^{2k}} \frac{\partial H}{\partial u}\bigl(\lambda(t),N_{\ast}(t),u_{\ast}(t)\bigr) = (-1)^{k}\frac{\partial}{\partial u}\varPhi^{(2k)}(t)\geq0. $$
(19)

If strict inequality holds in (19), we say the singular control is of intrinsic order k and the strengthened Legendre–Clebsch condition is satisfied. This often is an indication of local optimality properties of the singular control. Typically, optimal controls consist of concatenations of bang and singular structures that need to be determined through an analysis of the properties of the switching functions.
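The switching function (17) and the bang–bang law (18) can be sketched in a few lines. This is only an illustration: it assumes that B is the diagonal matrix with entries \(-\varphi_{i}\), which is what (23) and the adjoint equations suggest, and the function names and the tolerance `tol` are hypothetical choices that do not come from the text.

```python
def switching_function(lam, N, phi):
    """Phi = 1 + lam B N, assuming B = -diag(phi_1, phi_2, phi_3)."""
    return 1.0 - sum(l * p * n for l, p, n in zip(lam, phi, N))


def bang_bang_control(Phi, u_max, tol=1e-12):
    """Control prescribed by (18).

    Returns None when Phi is numerically zero, because (18) does not
    determine the control there (a switch or a singular arc is possible).
    """
    if Phi > tol:
        return 0.0
    if Phi < -tol:
        return u_max
    return None
```

Returning `None` on the zero set, rather than picking a value, keeps the singular case visibly separate from the bang–bang case.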

3.2 Singular Controls

If an optimal control is singular over an open interval I, then \(\lambda(t)BN_{\ast}(t)\equiv-1\) on I. The following simple formula, which follows from a direct calculation, allows us to organize the derivatives of the switching function in a structured manner.

Proposition 1

Suppose M is a constant matrix and let Ψ(t)=λ(t)MN(t) where N is a solution to the dynamics (12) for the control u and λ is a solution of the corresponding adjoint equation (14). Then

$$ \dot{\varPsi}(t)=\lambda(t) [ A+uB,M ] N(t)-qMN(t), $$
(20)

with \([X,Y]=YX-XY\) denoting the commutator of the matrices X and Y.

We have chosen the sign of the commutators of the matrices to be consistent with the definition of the Lie bracket [X,Y] of the linear vector fields X(N)=XN and Y(N)=YN.
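The direct calculation behind (20) can be written out as follows, assuming that the dynamics (12) have the matrix form \(\dot{N}=(A+uB)N\) and that the adjoint equation (14) reads \(\dot{\lambda}=-q-\lambda(A+uB)\), as the componentwise adjoint equations above suggest:

$$\begin{aligned} \dot{\varPsi}(t) & = \dot{\lambda}(t)MN(t) + \lambda(t)M\dot{N}(t) \\ & = \bigl( -q-\lambda(t) (A+uB) \bigr)MN(t) + \lambda(t)M(A+uB)N(t) \\ & = \lambda(t) \bigl( M(A+uB)-(A+uB)M \bigr)N(t) - qMN(t) \\ & = \lambda(t) [ A+uB,M ] N(t) - qMN(t). \end{aligned}$$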

If the switching function Φ vanishes identically over an interval I, we thus have that

$$\begin{aligned} \dot{\varPhi}(t) = & \lambda(t) \bigl[A+u_{\ast}(t)B,B \bigr]N_{\ast}(t) - qBN_{\ast }(t) \\ = & \bigl\{ \lambda(t) [A,B] - qB \bigr\} N_{\ast}(t) = 0, \end{aligned}$$
(21)
$$\begin{aligned} \ddot{\varPhi}(t) = & \lambda(t) \bigl[A+u_{\ast}(t)B,[A,B] \bigr]N_{\ast}(t) - q[A,B]N_{\ast}(t) - qB\bigl(A+u_{\ast}(t)B \bigr)N_{\ast}(t) \\ = & \bigl\{ \bigl( \lambda(t)\bigl[A,[A,B]\bigr] - q[A,B] - qBA \bigr) \\ & {} + u_{\ast}(t) \bigl( \lambda(t) \bigl[B,[A,B]\bigr] - qB^{2} \bigr) \bigr\} N_{\ast}(t) = 0, \end{aligned}$$
(22)

with the iterated brackets denoting successive Lie brackets (or commutators) of the matrices A and B. For the model under consideration we have that

$$\begin{aligned} BA &= -\left ( \begin{array}{c@{\quad}c@{\quad}c} \varphi_{1}\nu_{S} & \varphi_{1}\pi_{S} & \varphi_{1}\rho_{S}\\ \varphi_{2}\sigma_{P} & \varphi_{2}\nu_{P} & \varphi_{2}\rho_{P}\\ \varphi_{3}\sigma_{R} & \varphi_{3}\pi_{R} & \varphi_{3}\nu_{R} \end{array} \right ) ,\qquad B^{2}=\left ( \begin{array}{c@{\quad}c@{\quad}c} \varphi_{1}^{2} & 0 & 0\\ 0 & \varphi_{2}^{2} & 0\\ 0 & 0 & \varphi_{3}^{2} \end{array} \right ) , \end{aligned}$$
(23)
$$\begin{aligned} {}[ A,B] & =\left ( \begin{array}{c@{\quad}c@{\quad}c} 0 & (\varphi_{2}-\varphi_{1})\pi_{S} & (\varphi_{3}-\varphi _{1})\rho_{S}\\ (\varphi_{1}-\varphi_{2})\sigma_{P} & 0 & (\varphi_{3}-\varphi_{2})\rho _{P}\\ (\varphi_{1}-\varphi_{3})\sigma_{R} & (\varphi_{2}-\varphi_{3})\pi_{R} & 0 \end{array} \right ) , \end{aligned}$$
(24)
$$\begin{aligned} \bigl[ B,[A,B ] \bigr] & = -\left ( \begin{array}{c@{\quad}c@{\quad}c} 0 & (\varphi_{2}-\varphi_{1})^{2}\pi_{S} & (\varphi_{3}-\varphi _{1})^{2}\rho_{S}\\ (\varphi_{1}-\varphi_{2})^{2}\sigma_{P} & 0 & (\varphi_{3}-\varphi _{2})^{2}\rho_{P}\\ (\varphi_{1}-\varphi_{3})^{2}\sigma_{R} & (\varphi_{2}-\varphi _{3})^{2}\pi_{R} & 0 \end{array} \right ) , \end{aligned}$$
(25)

and [A,[A,B]] is a 3×3-matrix with full and lengthy entries. The diagonal terms of [A,B] and [B,[A,B]] vanish since B is a diagonal matrix which commutes with the diagonal part of whatever matrix the bracket is taken with. The coefficient multiplying the control u in the second derivative of the switching function is given by

$$\frac{\partial}{\partial u}\frac{d^{2}}{dt^{2}}\frac{\partial H}{\partial u} \bigl( \lambda(t),N_{\ast}(t),u_{\ast}(t) \bigr) = \bigl\{ \lambda(t) \bigl[ B, [ A,B ] \bigr] - qB^{2} \bigr\} N_{\ast}(t). $$

The matrix [B,[A,B]] has all nonpositive entries and it follows from Lemmas 1 and 4 that all entries of the state N and the multiplier λ are positive. Hence

$$\lambda(t)\bigl[B,[A,B]\bigr]N_{\ast}(t)\leq0. $$

Furthermore,

$$qB^{2}N_{\ast}(t) = q_{1}\varphi_{1}^{2}S_{\ast}(t) + q_{2}\varphi _{2}^{2}P_{\ast}(t) + q_{3}\varphi_{3}^{2}R_{\ast}(t) > 0 $$

so that

$$-\frac{\partial}{\partial u}\frac{d^{2}}{dt^{2}}\frac{\partial H}{\partial u} \bigl( \lambda(t),N_{\ast}(t),u_{\ast}(t) \bigr) > 0. $$

Thus we have the following result:

Proposition 2

Singular controls are of order 1 and the strengthened Legendre–Clebsch condition for minimality is satisfied.
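The closed forms (24) and (25), and the sign pattern used for Proposition 2, can be checked numerically. The sketch below is not part of the original computations: it assumes the matrix form \(\dot{N}=(A+uB)N\) with A and B inferred from (23), uses the parameter values of the Figure 4 example, and the helper names `matmul` and `bracket` are hypothetical.

```python
def matmul(X, Y):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def bracket(X, Y):
    """Commutator in the sign convention of Proposition 1: [X, Y] = YX - XY."""
    YX, XY = matmul(Y, X), matmul(X, Y)
    return [[YX[i][j] - XY[i][j] for j in range(3)] for i in range(3)]


# Parameter values of the Figure 4 example.
alpha = (1.0, 0.5, 0.1)
sigma_P, sigma_R = 0.05, 0.01
pi_S, pi_R = 0.03, 0.01
rho_S, rho_P = 0.01, 0.03
phi = (1.5, 1.0, 0.1)

# Assumed matrices of the dynamics N' = (A + u B) N, inferred from (23).
A = [[alpha[0] - sigma_P - sigma_R, pi_S, rho_S],
     [sigma_P, alpha[1] - pi_S - pi_R, rho_P],
     [sigma_R, pi_R, alpha[2] - rho_S - rho_P]]
B = [[-phi[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

AB = bracket(A, B)    # numerical [A, B]
BAB = bracket(B, AB)  # numerical [B, [A, B]]

# Closed forms (24) and (25), entry (i, j).
AB_closed = [[(phi[j] - phi[i]) * A[i][j] for j in range(3)]
             for i in range(3)]
BAB_closed = [[-(phi[j] - phi[i]) ** 2 * A[i][j] for j in range(3)]
              for i in range(3)]

# Largest entrywise deviation between the numerical and closed forms.
err_24 = max(abs(AB[i][j] - AB_closed[i][j])
             for i in range(3) for j in range(3))
err_25 = max(abs(BAB[i][j] - BAB_closed[i][j])
             for i in range(3) for j in range(3))
```

Both deviations should vanish up to rounding, and every entry of `BAB` should be nonpositive, which is the property used in the argument above.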

Solving Eq. (22) for the control gives the following formula for the singular control

$$ u_{\mathrm{sing}}(t) = \frac{ \{ \lambda(t)[A,[A,B]]-q[A,B]-qBA \} N_{\ast}(t)}{ \{ -\lambda(t)[B,[A,B]]+qB^{2} \} N_{\ast}(t)}. $$
(26)

Note that, dividing both numerator and denominator by C(t), the singular control formally depends only on the proportions x, y and z, not on the actual states S, P and R. However, dependence on the states comes in indirectly through the multipliers. In order to be admissible, the values need to lie in the control set \([0,u_{\max}]\). It follows from the strengthened Legendre–Clebsch condition that the denominator is positive. In the numerator, all terms in the vector −qBA are positive. There exist coefficients in the matrix [A,[A,B]] and the vector −q[A,B] that are negative, but just a few. Thus generally, and this is what we have seen consistently in numerical computations, the values of the expression (26) are positive and thus admissible for suitable upper bounds \(u_{\max}\).
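Formula (26) can be evaluated directly once the brackets are available. The following is a minimal sketch under the same assumptions as before (matrix form \(\dot{N}=(A+uB)N\), with A and B inferred from (23)); the function names are hypothetical, and the multiplier and state passed in by a caller are illustrative inputs, not values from the paper.

```python
def matmul(X, Y):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def bracket(X, Y):
    """Commutator in the sign convention of Proposition 1: [X, Y] = YX - XY."""
    YX, XY = matmul(Y, X), matmul(X, Y)
    return [[YX[i][j] - XY[i][j] for j in range(3)] for i in range(3)]


def row_times(v, M):
    """Row vector v times the 3x3 matrix M."""
    return [sum(v[i] * M[i][j] for i in range(3)) for j in range(3)]


def dot(v, w):
    """Euclidean inner product of two 3-vectors."""
    return sum(a * b for a, b in zip(v, w))


def singular_control(lam, N, A, B, q):
    """Evaluate (26); returns (u_sing, numerator, denominator)."""
    AB = bracket(A, B)
    lam_AAB = row_times(lam, bracket(A, AB))   # lam [A,[A,B]]
    lam_BAB = row_times(lam, bracket(B, AB))   # lam [B,[A,B]]
    q_AB = row_times(q, AB)                    # q [A,B]
    q_BA = row_times(q, matmul(B, A))          # q B A
    q_BB = row_times(q, matmul(B, B))          # q B^2
    num = dot([lam_AAB[j] - q_AB[j] - q_BA[j] for j in range(3)], N)
    den = dot([-lam_BAB[j] + q_BB[j] for j in range(3)], N)
    return num / den, num, den
```

Because numerator and denominator are both linear in N, rescaling the state leaves the returned control unchanged, which mirrors the remark that (26) formally depends only on the proportions.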

Analyzing optimal concatenations between bang and singular controls is difficult and this analysis has not yet been carried out. However, it is not difficult to give some numerical samples of singular controls and corresponding trajectories. Along a singular arc, the multiplier λ satisfies \(\varPhi(t)=1+\lambda(t)BN_{\ast}(t)\equiv0\) and \(\dot{\varPhi}(t)= \{ \lambda(t)[A,B]-qB \} N_{\ast}(t)\equiv0\). It is determined by these conditions up to one degree of freedom. In principle, singular controls are possible everywhere in the state space and we give an example of an extremal controlled trajectory for which the control is given by the maximum dose rate for an initial interval \([0,\tau_{b}]\) and then is singular over the remaining period \([\tau_{b},T]\). Ignoring the terminal value, we simply determine a value for \(\lambda(\tau_{b})\) so that \(\varPhi(\tau_{b})=\dot{\varPhi}(\tau_{b})=0\) and then integrate the combined flow of the system dynamics and adjoint equation corresponding to the singular control forward in time until time T. As long as the multipliers \(\lambda_{i}(t)\), i=1,2,3, remain positive for t∈[0,T], we end up with singular extremals for the optimal control problem [OC] with penalty terms \(r_{i}=\lambda_{i}(T)\). More generally, if these values are specified a priori, the corresponding two-point boundary value problem needs to be solved. Here we only seek to give an illustration of the singular control and its flow and thus follow the simpler initial value problem approach.
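The two conditions \(\varPhi=\dot{\varPhi}=0\) are linear in the three components of the multiplier, which is why one degree of freedom remains. A possible way to set this up is sketched below. Treating \(\lambda_{3}\) as the free parameter is an assumption made here for illustration only (the text does not say how the free parameter was chosen), the matrices A and B are assumed to be those inferred from (23), and the function names are hypothetical.

```python
def matvec(M, v):
    """3x3 matrix M times column vector v."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]


def bracket(X, Y):
    """Commutator in the sign convention of Proposition 1: [X, Y] = YX - XY."""
    return [[sum(Y[i][k] * X[k][j] - X[i][k] * Y[k][j] for k in range(3))
             for j in range(3)] for i in range(3)]


def multiplier_on_singular_arc(N, A, B, q, lam3):
    """Solve Phi = 0 and dPhi/dt = 0 for (lam1, lam2), given lam3.

    Phi     = 1 + lam . (B N)
    dPhi/dt = lam . ([A, B] N) - q . (B N)
    """
    a = matvec(B, N)               # B N
    b = matvec(bracket(A, B), N)   # [A, B] N
    c = sum(qi * ai for qi, ai in zip(q, a))  # q B N
    rhs1 = -1.0 - lam3 * a[2]
    rhs2 = c - lam3 * b[2]
    det = a[0] * b[1] - a[1] * b[0]
    if abs(det) < 1e-14:
        raise ValueError("conditions are degenerate for this state")
    # Cramer's rule for the remaining 2x2 linear system.
    lam1 = (rhs1 * b[1] - a[1] * rhs2) / det
    lam2 = (a[0] * rhs2 - rhs1 * b[0]) / det
    return (lam1, lam2, lam3)
```

The result is only a candidate: the sign of each component still has to be checked, since only multipliers with all components positive are consistent with the lemma on positivity of the multipliers.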

Figure 4 shows an example of such a structure for the growth rates \(\alpha_{1}=1\), \(\alpha_{2}=0.5\) and \(\alpha_{3}=0.1\), transition rates \(\sigma_{P}=0.05\), \(\sigma_{R}=0.01\), \(\pi_{S}=0.03\), \(\pi_{R}=0.01\), \(\rho_{S}=0.01\) and \(\rho_{P}=0.03\), and pharmacodynamic coefficients \(\varphi_{1}=1.5\), \(\varphi_{2}=1\) and \(\varphi_{3}=0.1\); the maximum dose rate was normalized to \(u_{\max}=1\) and all the weights \(q_{i}\) in the objective were chosen equal to 0.01. The initial interval with maximum dose has length \(\tau_{b}=5\) and the therapy horizon is T=28. We have normalized the total cancer volume at the initial time to be C(0)=1 and then have taken as initial condition the corresponding steady-state of the proportions, i.e., \(S_{0}=x_{\ast}=0.8954\), \(P_{0}=y_{\ast}=0.0933\) and \(R_{0}=z_{\ast}=0.0112\). Figure 4(a) (top) shows the graph of the corresponding control and Fig. 4(b) (middle) shows the graphs of the corresponding states. The value of the singular control \(u_{\mathrm{sing}}(t)\) increases slightly over the interval [5,28]. The multipliers over the singular interval are shown in part (c) (bottom) and remain positive. It then follows from Lemma 4 that they are positive on the initial interval as well.

Fig. 4 Control, states and multipliers for a bang-singular controlled extremal
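The initial proportions quoted for the Figure 4 example can be reproduced approximately. The sketch below assumes that the steady state in question is that of the uncontrolled dynamics (u=0), in which case the proportions converge to the normalized dominant eigenvector of A; the matrix A is the form inferred from (23), and the power iteration and the function name are illustrative choices, not the authors' method.

```python
def steady_state_proportions(A, iterations=500):
    """Normalized dominant eigenvector of A, computed by power iteration.

    Each step applies A and rescales so that the components sum to one,
    which matches the definition of proportions of the total volume.
    """
    v = [1.0 / 3.0] * 3
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]
        total = sum(w)
        v = [wi / total for wi in w]
    return v


# Parameter values of the Figure 4 example.
alpha = (1.0, 0.5, 0.1)
sigma_P, sigma_R = 0.05, 0.01
pi_S, pi_R = 0.03, 0.01
rho_S, rho_P = 0.01, 0.03

# Assumed system matrix of the uncontrolled dynamics N' = A N.
A = [[alpha[0] - sigma_P - sigma_R, pi_S, rho_S],
     [sigma_P, alpha[1] - pi_S - pi_R, rho_P],
     [sigma_R, pi_R, alpha[2] - rho_S - rho_P]]

x_star, y_star, z_star = steady_state_proportions(A)
```

With these parameters the iteration converges quickly, since all entries of A are positive and its dominant eigenvalue is well separated from the others; the resulting proportions agree with the values quoted above to about three decimal places.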

4 Conclusion

We considered a compartmental model for chemotherapy of heterogeneous tumor populations that distinguishes three levels of sensitivity and allows interchanges between the compartments (developing drug resistance and resensitization). For the problem of minimizing an average of the total tumor burden over the therapy horizon we have shown that singular controls, which correspond to continuous-time chemotherapy at lower than maximum doses, are a viable option in the sense that the strengthened Legendre–Clebsch condition for optimality is always satisfied. With the emergence of various chemotherapeutic sensitivities, the standard maximum tolerated dose (MTD) approach to chemotherapy may thus not necessarily be the best possible option to pursue. While this is to be expected, and has been confirmed earlier in mathematical models, when a sizable resistant subpopulation exists, our results seem to point to the same feature arising simply from the varying sensitivities of the subpopulations, all of which could, in fact, still be chemotherapeutically sensitive.