1 Introduction

In any cancer treatment, the question arises of how therapeutic agents (various drugs, radiation dosages, antiangiogenic biological agents, cancer vaccines, etc.) should be given in order to be reasonably safe and effective at the same time. Mathematically, the scheduling of therapeutic agents over time in order to minimize some objective related to tumor burden (e.g., tumor volume) and quality of life of the patient (e.g., some measure of the toxic side effects of treatment) while the underlying system follows some dynamics (in this case determined by the processes of tumor development and treatment interactions) is an optimal control problem. In this paper, we review some results about the structure of treatment protocols that can be inferred from mathematical models with the methods and tools of optimal control.

Our emphasis will be on models for cancer chemotherapy. For most tumors, it is a standard medical practice to give chemotherapy at maximum tolerated doses (MTD) with rest periods in between. The underlying rationale simply is that when the disease has progressed into an advanced stage, it is imperative to kill as many of the cancer cells as possible and this has to be done right now. Since drugs are rarely selective in their activation mechanisms, chemotherapy also severely damages other proliferating cells that are essential for survival like bone marrow. This necessitates the introduction of rest periods for the patient to recover from the strong toxic attack. We are interested in questions of the following type: Under what kind of conditions is an MTD approach the optimal treatment strategy? When should different protocols be favored? If resistance to chemotherapeutic agents is present, is a metronomic scheduling of chemotherapeutic agents (essentially, a continuous-type treatment at low doses) which avoids the high toxicities associated with MTD doses equally effective? Naturally, answers to such questions depend on the type of tumor. Simple dividing characteristics that are important in the scheduling of treatment are given by the tumor doubling time, the growth fractions of tumor cells, and much more.

A tumor consists not just of cancerous cells but of a full array of other structures that in various ways aid and abet the tumor, but also fight it. The most important structure that sustains the tumor is its vasculature which provides the tumor with the oxygen and nutrients needed for further growth; an example of an endogenous structure that fights the tumor is the body’s immune system. The tumor microenvironment consists of these components and much more (e.g., macrophages and fibroblast cells that form the extracellular matrix), all still residing in healthy tissue. In modern oncology, the view of the tumor as a system of interacting components has thus become the more common one, and modern treatments are multi-targeted therapies that not only aim to kill cancer cells but often include antiangiogenic therapy, immunotherapy, and other options. Yet, the complex interactions between these and other treatment modalities still are not fully understood and are the topic of active current medical research (e.g., see [1]).

In clinical trials, the scheduling of therapeutic agents is pursued in medically guided, exhaustive trial-and-error approaches of simple strategies. Nonstandard protocols are hardly ever pursued in this research since complex protocols are difficult, if not impossible, to test in a laboratory setting, or can be tested only at great cost. The analysis of mathematical models can be of benefit here by giving some theoretical suggestions for treatment protocols through an alternative noninvasive tool or by establishing benchmarks for medically realizable protocols. As of today, the question of how chemotherapeutic agents should best be administered if a more holistic approach to treatment is taken that takes the structures of the tumor microenvironment into account still has not been answered (e.g., see [21]).

This paper is organized as follows: In Sect. 2 we give a brief introduction to the main tools and results from optimal control theory that are needed in the analysis of mathematical models for cancer treatment. In particular, the distinction between bang-bang controls (which correspond to maximum dose treatment periods interlaced with rest periods) and singular controls (which correspond to time-varying administration schedules at lower dose rates) will be emphasized. Bang-bang controls directly relate to the MTD strategies of medical practice while singular controls are of special interest in the search for the biologically optimal dose (BOD), an effective dose which has minimal or at least low side effects [26]. In Sect. 3, we start with a discussion of optimal treatment protocols for compartmental models of cancer chemotherapy. It is easily seen that optimal controls indeed support the traditional MTD paradigm if it is assumed that the tumor consists of a homogeneous population of chemotherapeutically sensitive cells. However, as compartments of varying sensitivities or even full resistance are introduced into the model, this no longer is valid and singular controls along with the associated lower dose rates become candidates for optimality. Optimal administration of antiangiogenic agents also is done by means of singular controls and will be discussed in Sect. 4, both as a stand-alone approach and in combination with chemotherapy. Once tumor-immune system interactions are taken into account, optimal administration of cytotoxic agents no longer follows an MTD approach, but a so-called “chemo-switch” regimen: after an initial interval of maximum dose treatment, in optimal solutions dose rates are reduced and given by singular controls. These results are given in Sect. 5.

Overall, an optimal control analysis of mathematical models for cancer chemotherapy as it is presented here leads to results that provide information about the qualitative structure of treatment protocols that can be of use in the design of practical treatment protocols.

2 Optimal Control–A Brief Introduction

We briefly review the main results of optimal control theory. However, rather than considering the general case, we restrict the mathematical structure to a model of the form that most examples in biomedical applications have: a multi-input control-affine system. This simply reflects the fact that “controls” represent structures imposed on an existing dynamical system from the outside to influence its behavior and that these are naturally set up in a way so that these effects are most easily analyzed. This generally leads to linear terms in the controls. For such systems, the so-called bang-bang and singular controls become the prime candidates for optimality. We describe the principal tools for analyzing singular controls which include Lie brackets for computing derivatives of the switching function and the Legendre-Clebsch condition as the main necessary condition for optimality.

2.1 Control Affine Systems as Mathematical Models for Biomedical Models

We say a control system is control-affine with drift vector field \(f\) and control vector fields \(g_{i}\), \(i = 1,\ldots,m\), if the dynamics takes the following form:

$$\displaystyle{ \dot{x} = f(x) +\sum _{ i=1}^{m}g_{ i}(x)u_{i},\qquad x \in M,\qquad u \in U. }$$
(1)

The vector x is the state of the system and takes values in an open and connected subset M of \(\mathbb{R}^{n}\); the vector u represents the controls and takes values in a control set \(U \subset \mathbb{R}^{m}\). In the biomedical models we shall be considering, the controls represent dose rates or concentrations of some therapeutic agents and all take nonnegative values that lie in prescribed ranges. We therefore take the control set U as an m-dimensional interval of the form

$$\displaystyle{ U = [0,u_{1}^{\max }] \times \cdots \times [0,u_{ m}^{\max }]. }$$
(2)

The class \(\mathcal{U}\) of admissible controls is given by Lebesgue-measurable functions u defined on some interval I with values in the control set (almost everywhere), \(u: I \rightarrow U\), \(t\mapsto u(t)\). The differential equation (1) represents the dynamics which connects the controls with the state of the system. Given an admissible control \(u \in \mathcal{U}\), it follows from classical results about solutions to ordinary differential equations that for any initial condition \(x_{0}\), there exists a unique solution x to (1) with initial condition \(x(0) = x_{0}\). We call this solution x the trajectory corresponding to the control u and call the pair (x, u) an admissible controlled trajectory.

An optimal control problem then consists in finding, among all admissible controlled trajectories, one that minimizes an objective, possibly subject to additional constraints. Here we only consider constraints of a fixed terminal time T or on the final state x(T) of the system. The former correspond to therapy over an a priori specified horizon (Sect. 3) and the latter arise if therapy with an a priori given amount of therapeutic agents is considered (Sect. 4). We assume that such constraints have a regular geometric structure and are given in the form \(N =\{ x \in M:\psi (x) = 0\}\) with \(\psi: M \rightarrow \mathbb{R}^{n-k}\) a continuously differentiable mapping and the matrix of the partial derivatives of ψ with respect to x of full rank everywhere on N. We choose the functional form of the objective to be consistent with the control-affine structure of the dynamics, i.e., we take the functional to be minimized in the form

$$\displaystyle{ \mathcal{J} (u) =\int _{ 0}^{T}\left (L(x(s)) +\sum _{ i=1}^{m}\theta _{ i}u_{i}(s)\right )ds +\varphi (x(T)) }$$
(3)

with \(L: M \rightarrow \mathbb{R}\), \(x\mapsto L(x)\) the Lagrangian and \(\varphi: N \rightarrow \mathbb{R}\), \(x\mapsto \varphi (x)\) a penalty term on the final state. Both \(L\) and \(\varphi\) are continuously differentiable functions. The terminal time T can be fixed or free. We choose the functional dependence of the objective on the controls to be linear since the integrals \(\int _{0}^{T}u_{i}(t)dt\) have an immediate interpretation in terms of the total dose of agents given and thus are biomedically meaningful. It would be mathematically simpler to choose quadratic terms for the controls in the objective, but such terms are imposed arbitrarily. We thus consider the following optimal control problem:

[OC] :

minimize the objective \(\mathcal{J} (u)\) over all admissible controlled trajectories (x, u) subject to the terminal constraint x(T) ∈ N.
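To make the ingredients of [OC] concrete, the following minimal Python sketch integrates a single-input control-affine system under a bang-bang control and evaluates an objective of the form (3). The scalar model (\(f(x) = ax\), \(g(x) = -x\)), the choices \(L(x) = x\) and \(\varphi (x) = x\), all numerical values, and the function names `simulate` and `objective` are illustrative assumptions and are not taken from the models reviewed in this paper.

```python
import math

def simulate(f, g, u, x0, T, steps=2000):
    """Integrate the scalar system x' = f(x) + g(x) u(t) with classical RK4."""
    h = T / steps
    rhs = lambda t, x: f(x) + g(x) * u(t)
    xs = [x0]
    for i in range(steps):
        t, x = i * h, xs[-1]
        k1 = rhs(t, x)
        k2 = rhs(t + h / 2, x + h / 2 * k1)
        k3 = rhs(t + h / 2, x + h / 2 * k2)
        k4 = rhs(t + h, x + h * k3)
        xs.append(x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4))
    return xs

def objective(xs, u, T, L, theta, phi):
    """Trapezoidal approximation of (3): integral of L(x) + theta*u, plus phi(x(T))."""
    h = T / (len(xs) - 1)
    vals = [L(x) + theta * u(i * h) for i, x in enumerate(xs)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1])) + phi(xs[-1])

# Hypothetical data: exponential growth at rate a, proportional cell kill by the drug.
a, u_max, T = 0.1, 0.5, 10.0
f = lambda x: a * x      # drift vector field
g = lambda x: -x         # control vector field
L = lambda x: x          # running cost on tumor size
phi = lambda x: x        # penalty on final tumor size
theta = 1.0              # weight on the total dose

no_treatment = lambda t: 0.0
bang_bang = lambda t: u_max if t < 5.0 else 0.0   # full dose first, then rest

x_none = simulate(f, g, no_treatment, 1.0, T)
x_bb = simulate(f, g, bang_bang, 1.0, T)
J_none = objective(x_none, no_treatment, T, L, theta, phi)
J_bb = objective(x_bb, bang_bang, T, L, theta, phi)
print(f"x(T): {x_none[-1]:.4f} vs {x_bb[-1]:.4f};  J: {J_none:.4f} vs {J_bb:.4f}")
```

The sketch only compares two admissible controls; it does not search for an optimal one. Because the control enters the integrand linearly, the term \(\theta \int _{0}^{T}u(t)dt\) is simply the weighted total dose.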

2.2 Necessary Conditions for Optimality: The Pontryagin Maximum Principle

The fundamental necessary conditions for a controlled trajectory (x, u) to be optimal are given by the Pontryagin maximum principle [59]. (We refer the reader to [3, 4, 61] for some modern treatments of the subject.) We consistently write tangent vectors as column vectors and multipliers as row vectors denoting the space of row vectors by \(\left (\mathbb{R}^{n}\right )^{{\ast}}\). The Hamiltonian function H of the optimal control problem [OC] is defined as

$$\displaystyle{ H =\lambda _{0}\left (L(x) +\sum _{ i=1}^{m}\theta _{ i}u_{i}\right ) + \left \langle \lambda,\ f(x) +\sum _{ i=1}^{m}g_{ i}(x)u_{i}\right \rangle }$$
(4)

Theorem 2.1 (Pontryagin Maximum Principle [59]).

Let \((x_{{\ast}},u_{{\ast}})\) be an optimal controlled trajectory for the problem [OC] defined over the interval [0,T]. Then there exist a constant \(\lambda _{0} \geq 0\), a multiplier \(\nu \in (\mathbb{R}^{n-k})^{{\ast}}\), and a co-vector \(\lambda: [0,T] \rightarrow (\mathbb{R}^{n})^{{\ast}}\), the so-called adjoint variable, such that the following conditions are satisfied:

  1. Nontriviality of the multipliers: \((\lambda _{0},\lambda (t))\neq 0\) for all \(t \in [0,T]\).

  2. Adjoint equation: the adjoint variable \(\lambda\) is a solution to the time-varying linear differential equation

    $$\displaystyle{ \dot{\lambda }(t) = -\lambda _{0}\nabla L(x_{{\ast}}(t)) -\lambda (t)\left (Df(x_{{\ast}}(t)) +\sum _{ i=1}^{m}u_{ i}^{{\ast}}(t)Dg_{ i}(x_{{\ast}}(t))\right ) }$$
    (5)

    with terminal condition

    $$\displaystyle{ \lambda (T) =\lambda _{0} \frac{\partial \varphi } {\partial x}\left (x_{{\ast}}(T)\right ) +\nu \frac{\partial \psi } {\partial x}\left (x_{{\ast}}(T)\right ). }$$
    (6)
  3. Minimum condition: almost everywhere in [0,T] we have that

    $$\displaystyle{ H(\lambda _{0},\lambda (t),x_{{\ast}}(t),u_{{\ast}}(t)) =\min _{v\in U}H(\lambda _{0},\lambda (t),x_{{\ast}}(t),v) }$$
    (7)

    and the Hamiltonian is constant along λ and \((x_{{\ast}},u_{{\ast}})\) . If the terminal time T is free, the value of this constant is 0.

Controlled trajectories (x, u) for which there exist multipliers \(\lambda _{0}\) and \(\lambda\) such that the conditions of the maximum principle are satisfied are called extremals and the triples \((x,u,(\lambda _{0},\lambda ))\) including the multipliers are called extremal lifts. The constant multiplier \(\lambda _{0}\) can be zero and in this case the extremal is called abnormal while it is called normal if \(\lambda _{0} > 0\). In the normal case, since the conditions are linear in the multipliers, it is always possible to normalize \(\lambda _{0} = 1\).

In the original formulation of the theorem by Pontryagin et al. [59], the minimum condition (7) was formulated as a maximum condition and gave the result its name. In fact, depending on the choice of the signs associated with the multipliers \(\lambda _{0}\) and \(\lambda\), the maximum principle can be stated in four equivalent versions. Since the problems we will be considering are all cast as minimization problems, we prefer this formulation, but retain the classical name. The minimum condition contains the essence of the result and states that in order to solve the minimization problem on the function space of admissible controls, the control \(u_{{\ast}}\) needs to be chosen so that for some extremal lift it minimizes the Hamiltonian H pointwise over the control set U, i.e., for almost every \(t \in [0,T]\) the control value \(u_{{\ast}}(t)\) is a minimizer of the function \(v\mapsto H(\lambda _{0},\lambda (t),x_{{\ast}}(t),v)\) over the control set U.

2.3 Bang-Bang and Singular Controls

In our case, since U is an m-dimensional interval, the minimum condition splits into m scalar minimization problems that are easily solved. Defining the functions

$$\displaystyle{ \varPhi _{i}(t) =\lambda _{0}\theta _{i} + \left \langle \lambda (t),g_{i}(x_{{\ast}}(t))\right \rangle, }$$
(8)

it follows that the optimal controls satisfy

$$\displaystyle{ u_{i}^{{\ast}}(t) = \left \{\begin{array}{cc} 0 &\text{if}\ \varPhi _{i}(t)> 0, \\ u_{i}^{\max }&\text{if}\ \varPhi _{i}(t) <0. \end{array} \right. }$$
(9)

A priori, the control \(u_{i}\) is not determined by the minimum condition at times \(\tau\) when \(\varPhi _{i}(\tau ) = 0\). In such a case, all control values trivially satisfy the minimum condition and, in principle, are candidates for optimality. Naturally, if the derivative \(\dot{\varPhi }_{i}(\tau )\) exists and does not vanish, then the control switches between \(u_{i} = 0\) and \(u_{i} = u_{i}^{\max }\) with the order depending on the sign of \(\dot{\varPhi }_{i}(\tau )\). Such a time \(\tau\) is called a bang-bang switch. On the other hand, if \(\varPhi _{i}(t)\) were to vanish identically on an open interval I, then, although the minimization property by itself gives no information about the control, all the derivatives of \(\varPhi _{i}(t)\) must also vanish on I and this, except for some degenerate situations, generally does determine the control. Controls of this kind are called singular while the constant controls \(u_{i} = 0\) and \(u_{i} = u_{i}^{\max }\) are called bang controls and controls that only switch between 0 and the maximum control values are bang-bang controls. Strictly speaking, to be singular is not a property of the control, but of the extremal lift since it also depends on the multiplier \(\lambda\) defining the function \(\varPhi _{i}\). This function is called the switching function for the control \(u_{i}\).
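The pointwise rule (9) and the notion of a bang-bang switch can be sketched in a few lines of Python. The function names `bang_control` and `switch_indices`, the tolerance `tol`, and the sample values are assumptions made for this illustration; in particular, a sampled switching function can only suggest where a singular arc might occur, it cannot establish one.

```python
def bang_control(phi, u_max, tol=1e-12):
    """Minimizer of v -> phi * v over [0, u_max], following (9).

    Returns None when |phi| <= tol: the minimum condition alone does not
    determine the control there (bang-bang switch or possible singular arc).
    """
    if phi > tol:
        return 0.0
    if phi < -tol:
        return u_max
    return None

def switch_indices(phis):
    """Indices k at which sampled values of a switching function change sign."""
    return [k for k in range(1, len(phis)) if phis[k - 1] * phis[k] < 0]

# Hypothetical samples of a switching function on a time grid.
samples = [1.0, 0.5, -0.2, -0.1, 0.3]
controls = [bang_control(p, u_max=2.0) for p in samples]
print(controls)                 # full dose exactly where the samples are negative
print(switch_indices(samples))  # grid indices of the sign changes
```

For a multi-input system the same rule is applied to each pair \((\varPhi _{i},u_{i}^{\max })\) separately, which is the splitting into m scalar minimization problems described above.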

The terminology “singular” has its historical origin in the fact that the switching functions can be expressed as

$$\displaystyle{ \varPhi _{i}(t) = \frac{\partial H} {\partial u_{i}}(\lambda _{0},\lambda (t),x_{{\ast}}(t),u_{{\ast}}(t)) }$$
(10)

and thus the condition Φ(t) = 0 formally is the first-order necessary condition for the Hamiltonian to have a minimum in the interior of the control set. For singular controls, the Hessian matrix \(\frac{\partial ^{2}H} {\partial u^{2}}\) corresponding to second order necessary conditions for optimality is singular. In fact, for a control-affine system this matrix is identically zero.

If the control corresponds to the application of some therapeutic agent, then bang-bang controls represent treatment strategies that switch between maximum dose therapy sessions and rest periods, the typical MTD-type applications of chemotherapy. Singular controls on the other hand represent time-varying administrations of the agent at intermediate and often significantly lower doses. Although administration of such time-varying schedules may be difficult in practice, there is growing interest in such structures in the medical community because of mounting evidence that “more is not necessarily better” [18, 55] and that a biologically optimal dose (BOD) with the best overall response should be sought. In this direction, the concept of metronomic chemotherapy as well as other approaches like chemo-switch protocols [57] or adaptive therapy [12] have been introduced. We shall say more about these medical connections later on. But the question whether optimal controls are bang-bang or singular has an immediate interpretation and relevance for the structure of optimal treatment protocols. While the terminology is somewhat misleading, these singular structures indeed are the more natural candidates for optimality.

2.4 The Legendre-Clebsch Condition for Optimality of Singular Controls

In the solution of any optimal control problem, it becomes necessary to determine singular controls and then synthesize optimal controls from the primary candidates, namely bang and singular controls. In order to do so, we need to analyze the derivatives of the switching functions. In these formulas, the notion of the Lie bracket of vector fields arises naturally: given two differentiable vector fields f and g defined on some open set \(M \subset \mathbb{R}^{n}\), \(f,g: M \rightarrow \mathbb{R}^{n}\), their Lie bracket [f, g] is another vector field defined on M by

$$\displaystyle{ [f,g](x) = Dg(x)f(x) - Df(x)g(x). }$$
(11)

Its importance in optimal control is because of the following simple formula that is verified by a direct computation:

Proposition 2.1.

Let x(⋅) be a solution of the dynamics (1) for the controls \(u_{i}^{{\ast}}\) and let \(\lambda\) be a solution of the corresponding adjoint equation (5). For a continuously differentiable vector field h, the derivative of the function

$$\displaystyle{\varPsi (t) = \left \langle \lambda (t),h(x(t))\right \rangle =\lambda (t)h(x(t))}$$

is given by

$$\displaystyle{\dot{\varPsi }(t) = \left \langle \lambda (t),\left [f +\sum _{ i=1}^{m}u_{ i}^{{\ast}}(t)g_{ i},h\right ](x(t))\right \rangle -\lambda _{0}\nabla L(x(t))h(x(t)).}$$
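The Lie bracket (11) can also be evaluated numerically, which is a convenient cross-check on hand computations. The following Python sketch uses central finite differences for the Jacobians; the helper names, the step size `eps`, and the two linear vector fields are assumptions chosen for illustration. For linear fields \(f(x) = Ax\) and \(g(x) = Bx\), definition (11) gives \([f,g](x) = (BA - AB)x\), which the example reproduces.

```python
def jacobian(F, x, eps=1e-6):
    """Central-difference approximation of the Jacobian DF(x), as a list of rows."""
    n = len(x)
    cols = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        Fp, Fm = F(xp), F(xm)
        cols.append([(p - m) / (2 * eps) for p, m in zip(Fp, Fm)])
    # cols[j][i] approximates dF_i/dx_j; transpose so that rows index components of F
    return [[cols[j][i] for j in range(n)] for i in range(len(cols[0]))]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def lie_bracket(f, g, x):
    """[f, g](x) = Dg(x) f(x) - Df(x) g(x), as in (11)."""
    Dg_f = matvec(jacobian(g, x), f(x))
    Df_g = matvec(jacobian(f, x), g(x))
    return [a - b for a, b in zip(Dg_f, Df_g)]

# Hypothetical linear vector fields f(x) = A x and g(x) = B x.
A = [[-1.0, 2.0], [1.0, -1.0]]
B = [[0.0, -2.0], [0.0, 0.0]]
f = lambda x: matvec(A, x)
g = lambda x: matvec(B, x)

bracket = lie_bracket(f, g, [1.0, 1.0])   # equals (BA - AB) x up to rounding
self_bracket = lie_bracket(g, g, [1.0, 1.0])   # [g, g] vanishes identically
print(bracket, self_bracket)
```

Finite differences are used here only to keep the sketch free of dependencies; a symbolic or automatic-differentiation Jacobian would serve equally well.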

Singular controls are computed by differentiating the switching functions until the controls explicitly appear and then solving the resulting equations for the controls. We demonstrate the procedure for the simpler case of a single-input control system of the form \(\dot{x} = f(x) + g(x)u\). Since \([g,g] \equiv 0\), the derivative of the switching function Φ is given by

$$\displaystyle{ \dot{\varPhi }(t) = \left \langle \lambda (t),[f,g](x(t))\right \rangle -\lambda _{0}\nabla L(x(t))g(x(t)), }$$
(12)

does not depend on the control, and thus is once more differentiable. In the second derivative \(\ddot{\varPhi }(t)\), the control appears linearly and, expressing the switching function as \(\varPhi = \frac{\partial H} {\partial u}\), the term multiplying the control is given by \(\frac{\partial } {\partial u} \frac{d^{2}} {dt^{2}} \frac{\partial H} {\partial u} (\lambda _{0},\lambda (t),x_{{\ast}}(t),u_{{\ast}}(t))\). A singular control (more precisely, the singular lift) is said to be of order 1 over an open interval I if this expression does not vanish on I and in this case, we can solve the equation \(\ddot{\varPhi }(t) = 0\) for the control as a function of the state and multiplier. Essentially, the sign of this expression distinguishes between locally minimizing and maximizing controls. This is the interpretation of the Legendre-Clebsch condition, the fundamental necessary condition for optimality of singular controls which states that for minimizing controls we must have that

$$\displaystyle{ \frac{\partial } {\partial u} \frac{d^{2}} {dt^{2}} \frac{\partial H} {\partial u} (\lambda _{0},\lambda (t),x_{{\ast}}(t),u_{{\ast}}(t)) \leq 0\quad \mathrm{for\ all\quad }t \in I. }$$
(13)

If this expression vanishes over an interval, then it becomes necessary to differentiate the switching function further. This leads to the concept of singular controls of higher order and the generalized Legendre-Clebsch condition. In some special, but common circumstances, it follows from Lie algebraic identities that the control can only appear for the first time in an even order derivative. Then the singular control is said to be of intrinsic order k if this is the 2kth derivative and one then has the following necessary condition for optimality for singular controls of finite order:

Theorem 2.2 (Generalized Legendre-Clebsch Condition).

Suppose the controlled trajectory \((x_{{\ast}},u_{{\ast}})\) defined over the interval [0,T] is optimal for the optimal control problem [OC] and the control \(u_{{\ast}}\) is singular of intrinsic order k over an open interval \(I \subset [0,T]\). Then there exists an extremal lift \(\varGamma = ((x_{{\ast}},u_{{\ast}}),\lambda )\) with the property that

$$\displaystyle{ (-1)^{k} \frac{\partial } {\partial u} \frac{d^{2k}} {dt^{2k}} \frac{\partial H} {\partial u} (\lambda _{0},\lambda (t),x_{{\ast}}(t),u_{{\ast}}(t)) \geq 0\quad \mathrm{for\ all\quad }t \in I. }$$
(14)
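
As a worked illustration of the order 1 case, consider a single-input system \(\dot{x} = f(x) + g(x)u\) and assume, purely to keep this example short, that \(L \equiv 0\), so that all terms involving \(\lambda _{0}\nabla L\) drop out. Then (12) reduces to \(\dot{\varPhi }(t) = \left \langle \lambda (t),[f,g](x(t))\right \rangle\), and applying Proposition 2.1 once more with \(h = [f,g]\) gives

$$\displaystyle{ \ddot{\varPhi }(t) = \left \langle \lambda (t),[f,[f,g]](x(t))\right \rangle + u(t)\left \langle \lambda (t),[g,[f,g]](x(t))\right \rangle. }$$

Under this assumption, the coefficient of the control is \(\frac{\partial } {\partial u} \frac{d^{2}} {dt^{2}} \frac{\partial H} {\partial u} = \left \langle \lambda (t),[g,[f,g]](x(t))\right \rangle\). If it does not vanish on I, solving \(\ddot{\varPhi }(t) = 0\) yields the singular control

$$\displaystyle{ u_{\mathrm{sing}}(t) = -\frac{\left \langle \lambda (t),[f,[f,g]](x(t))\right \rangle } {\left \langle \lambda (t),[g,[f,g]](x(t))\right \rangle }, }$$

and the Legendre-Clebsch condition (13) requires \(\left \langle \lambda (t),[g,[f,g]](x(t))\right \rangle \leq 0\) along the singular arc. The singular control is admissible only where this value lies in the control set \([0,u^{\max }]\). When \(L\) is not identically zero, additional terms coming from \(\lambda _{0}\nabla L\) appear in both the numerator and the coefficient of the control.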

2.5 Sufficient Conditions for Optimality

The optimality conditions discussed so far are all necessary and do not guarantee that a controlled trajectory that satisfies them is optimal. The theory of sufficient conditions for optimality is more intricate. Essentially, to guarantee local optimality properties, it becomes necessary to embed a reference extremal (i.e., controlled trajectory and associated multiplier) into a family of extremals in such a way that the controlled trajectories cover a neighborhood of the reference controlled trajectory. If this can be done globally in the form of what is called a regular synthesis, then the associated controls all are globally optimal. These concepts are related to classical ideas from the calculus of variations about fields of extremals or, in a more modern language, to dynamic programming and solutions of the Hamilton-Jacobi-Bellman equations. However, the details are too involved to even be outlined here and we refer the interested reader to the literature on the subject, such as, for example, our text [61].

3 Compartmental Models for Cancer Chemotherapy

In this section we formulate a general bilinear version of the optimal control problem [OC] that serves as the mathematical framework for compartmental models for cancer chemotherapy. Applications of optimal control to mathematical models for cancer chemotherapy have a long history (e.g., [6, 43, 64, 66]), but generally early models were noncompartmental. The models we consider here were formulated and first analyzed in the work of Swierniak and coworkers (e.g., [24, 67, 71]) and then reconsidered in our work [33, 34, 69]. Compartments may be comprised of various phases of the cell cycle (Sect. 3.2) or may correspond to different subpopulations of cancer cells of varying chemotherapeutic sensitivities (Sect. 3.3). While optimal controls are bang-bang with upfront dosing for homogeneous cell populations of chemotherapeutically sensitive cells and thus agree with the medical MTD paradigm of scheduling chemotherapy, as resistance effects come into play, this is no longer the case and singular controls with associated lower dose rates become candidates for optimality.

3.1 A General Bilinear Model

We consider a mathematical model with a finite number n of compartments and use the first orthant \(M = \mathbb{P}\) in \(\mathbb{R}^{n}\) as state space; \(N = (N_{1},\ldots,N_{n})^{T}\) denotes the state with \(N_{i}\) the average number of cancer cells in the ith compartment, \(i = 1,\ldots,n\). The control is a vector \(u = (u_{1},\ldots,u_{m})^{T}\) with \(u_{i}\) denoting the concentrations of the various drugs in the blood stream. For simplicity, in our language we identify the drug dose rates with their concentrations. Indeed, standard linear pharmacokinetic equations are easily incorporated within the general structure below at the expense of increasing the dimension of the state space, but they do not alter the results we obtain [36] and thus we use this simplified approach here. As before, the control set U is the m-dimensional interval \(U = [0,u_{1}^{\max }] \times \cdots \times [0,u_{m}^{\max }]\) and admissible controls are Lebesgue-measurable functions u that take values in the control set. The dynamics consists of balance equations that describe the inflows and outflows between the various compartments and is assumed to be of the form

$$\displaystyle{ \dot{N}(t) = \left (A +\sum _{ j=1}^{m}u_{ j}B_{j}\right )N(t),\qquad N(0) = N_{0}, }$$
(15)

where A and \(B_{j}\), \(j = 1,\ldots,m\), are constant n × n matrices, \(A,B_{j} \in \mathbb{R}^{n\times n}\). The matrix A describes the transitions between the various compartments in the absence of treatment and the matrices \(B_{j}\) represent the effects of the jth drug on the system. An equation of the form (15) is called a bilinear control system since it is linear both in the state N and the control u. Note, however, that there exist quadratic terms since the controls \(u_{i}\) are multiplied with the states \(N_{j}\) and thus overall this equation is not linear in all the variables (N, u).

The dynamics represents in- and outflows of the various compartments, and for this reason, no matter what the control is, all diagonal entries of the matrix \(A +\sum _{ j=1}^{m}u_{j}B_{j}\) are negative (there always is a positive outflow from each compartment) and all the off-diagonal entries (which model the inflows) are nonnegative. Zero values may occur when there are no connections between some of the compartments, but every row will have at least one positive entry. In mathematics, matrices with these properties are called M-matrices (named so in honor of Minkowski) and their structure implies the positive invariance properties for the state space \(\mathbb{P}\) required for the model to be consistent [69].

(M):

For all u ∈ U the matrices \(A +\sum _{ j=1}^{m}u_{j}B_{j}\) have negative diagonal entries and nonnegative off-diagonal entries, \(A +\sum _{ j=1}^{m}u_{j}B_{j} \in \mathcal{M}\).

Let \(r = (r_{1},\ldots,r_{n})\) and \(q = (q_{1},\ldots,q_{n})\) be \(n\)-dimensional row vectors of positive numbers and let \(s = (s_{1},\ldots,s_{m})\) be a nonzero m-dimensional row vector of nonnegative numbers. These coefficients represent subjective weights which define the objective as

$$\displaystyle{ J = rN(T) +\int _{ 0}^{T}\left (qN(t) + su(t)\right )dt \rightarrow \min }$$
(16)

The term \(su =\sum _{ i=1}^{m}s_{i}u_{i}\) in the integral is a weighted average of the amounts of the various drugs given and the coefficients \(s_{i}\) represent the degrees of toxicity of the drugs. Side effects generally depend on the specific cytotoxic agent used and may be more severe than those of a cytostatic or recruiting agent. This would be reflected in the choice of these weights. Similarly, the integral term \(qN =\sum _{ i=1}^{n}q_{i}N_{i}\) represents a weighted average of the number of cancer cells in the respective compartments during treatment and the penalty term \(rN(T) =\sum _{ i=1}^{n}r_{i}N_{i}(T)\) represents a weighted average of the number of cancer cells in the respective compartments at the end of treatment. The inclusion of the term qN in the Lagrangian is important since otherwise optimization will lead to protocols that put all the emphasis on the end of the therapy interval, ignoring the behavior in between. While relevant biological information should be taken into account when selecting the parameters, it generally is also useful to modulate these parameters within specified ranges to obtain otherwise desired features of the optimal solutions. We then consider the following optimal control problem:

[CC]:

for a fixed therapy horizon [0, T], minimize the objective (16) over all Lebesgue-measurable functions u: [0, T] → U subject to the dynamics (15).

There are no constraints on the terminal state N(T) in this formulation and by the nontriviality of the multipliers this implies that all extremals are normal. We thus normalize \(\lambda _{0} = 1\) and drop it in the notation. The adjoint equation and terminal condition then take the form

$$\displaystyle{ \dot{\lambda }= -q -\lambda \left (A +\sum _{ j=1}^{m}u_{ j}B_{j}\right ),\qquad \lambda (T) = r. }$$
(17)

Under assumption (M), the positive orthant \(\mathbb{P}^{{\ast}}\) in the dual space \(\left (\mathbb{R}^{n}\right )^{{\ast}}\) also is negatively invariant for the adjoint equation (17), i.e., if \(\lambda (t_{0}) \in \mathbb{P}^{{\ast}}\), then all components of \(\lambda\) are positive for all times \(t < t_{0}\). We thus have the following fact:

Proposition 3.1 ([69]).

Under assumption (M), all states \(N_{i}\) and multipliers \(\lambda _{i}\) are positive over [0,T].

This is useful information in evaluating the signs of various expressions that arise in an analysis of optimal controls. We recall that the switching functions are given by \(\varPhi _{j}(t) = s_{j} +\lambda (t)B_{j}N(t)\), \(j = 1,\ldots,m\), with singular controls possible if one of them vanishes over an open interval. Proposition 2.1 simplifies to the following statement:

Proposition 3.2.

Suppose M is a constant matrix and let Ψ(t) = λ(t)MN(t), where N is a solution to the system equation  (15) corresponding to the control u and λ is a solution to the corresponding adjoint equation  (17) . Then

$$\displaystyle{ \dot{\varPsi }(t) =\lambda (t)\left [A +\sum _{ j=1}^{m}u_{ j}B_{j},M\right ]N(t) - qMN(t), }$$
(18)

with \([X,Y ] = Y X - XY\) the commutator of the matrices X and Y.
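
Formula (18) and its sign convention for the commutator can be checked at a single point in pure Python. In the sketch below, the 2 × 2 matrices A and B (a two-compartment model in which the drug reduces the inflow into the first compartment), the weights q, the control value u, and the sample values of N and \(\lambda\) are all hypothetical; the check compares the product-rule derivative of \(\lambda MN\), computed from (15) and (17), with the right-hand side of (18).

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def lincomb(X, Y, c):
    """Entrywise X + c * Y."""
    return [[x + c * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def row_times_matrix(lam, X):
    return [sum(lam[i] * X[i][j] for i in range(len(lam))) for j in range(len(X[0]))]

def matrix_times_col(X, v):
    return [sum(a * b for a, b in zip(row, v)) for row in X]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def commutator(X, Y):
    """[X, Y] = YX - XY, the convention stated in Proposition 3.2."""
    return lincomb(matmul(Y, X), matmul(X, Y), -1.0)

# Hypothetical data; A + u B has negative diagonal and nonnegative off-diagonal entries.
a1, a2, u = 0.2, 0.5, 0.3
A = [[-a1, 2 * a2], [a1, -a2]]
B = [[0.0, -2 * a2], [0.0, 0.0]]
q = [1.0, 1.0]
N = [3.0, 1.0]        # sample state
lam = [0.7, 1.4]      # sample multiplier (row vector)
M = B                 # the choice M = B_j arises when differentiating a switching function

F = lincomb(A, B, u)                                   # A + u B
N_dot = matrix_times_col(F, N)                         # dynamics (15)
lam_dot = [-qi - li for qi, li in zip(q, row_times_matrix(lam, F))]   # adjoint (17)

# Product rule: d/dt (lam M N) = lam_dot M N + lam M N_dot
lhs = dot(lam_dot, matrix_times_col(M, N)) + dot(lam, matrix_times_col(M, N_dot))
# Right-hand side of (18)
rhs = dot(lam, matrix_times_col(commutator(F, M), N)) - dot(q, matrix_times_col(M, N))
print(lhs, rhs)
```

Since (18) is an algebraic identity once (15) and (17) are substituted, the two numbers agree up to rounding for any choice of data; swapping the order in `commutator` would flip the sign of the first term and break the agreement.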

Whether or not optimal controls can be singular depends on the properties of the matrices A and \(B_{j}\) and needs to be evaluated on a case-by-case basis. Here we briefly describe two models, one for homogeneous and the other for heterogeneous tumor populations, and point out the differences in the structures of optimal controls that result from these assumptions.

3.2 Cell-Cycle-Specific Models for Homogeneous Tumor Populations

We consider the problem of administering a single cytotoxic agent that is active in the \(G_{2}/M\) phase of the cell cycle such as, for example, paclitaxel. This model was originally considered by Swierniak in [67] and has been analyzed further by us in [33]. Taking into account the phase sensitivity of the drug, the cell cycle is broken up into two compartments with one combining the second growth phase \(G_{2}\) and mitosis M and the other compartment simply made up of the remaining phases of the cell cycle. The state N of the system can then be described by a 2-dimensional vector with \(N_{1}(t)\) denoting the average number of cancer cells in the first compartment at time t (comprised of the phases \(G_{0}\), \(G_{1}\) and S) and \(N_{2}(t)\) the average number of cancer cells in the second compartment at time t (comprised of \(G_{2}\) and M).

Cell division is a stochastic process with the individual cells determining the sample paths and the transit times following some empirical distribution. Various probabilistic models such as \(\chi ^{2}\)- or Weibull distributions can be used to describe these transit times. In the approach by Swierniak, an exponential distribution (a special case of the Weibull distribution) is used. This leads to balance equations for the compartments that are linear in the states: the outflow of the first compartment equals the inflow into the second compartment and thus we have that

$$\displaystyle{\dot{N}_{2}(t) = -a_{2}N_{2}(t) + a_{1}N_{1}(t)}$$

with \(a_{i}\) the inverse mean transit time through the ith compartment. In the second compartment cell division occurs and thus, while the outflow from the second compartment is still given by \(a_{2}N_{2}(t)\), the inflow into the first compartment doubles to \(2a_{2}N_{2}(t)\) giving

$$\displaystyle{\dot{N}_{1}(t) = -a_{1}N_{1}(t) + 2a_{2}N_{2}(t).}$$

Since the differential equations are linear, quotients of the variables obey Riccati differential equations and it follows that in steady state, i.e., in the “long” run, fixed proportions of the cells will lie in the respective compartments: if

$$\displaystyle{x = \frac{N_{1}} {N_{1} + N_{2}}\qquad \mathrm{and}\qquad y = \frac{N_{2}} {N_{1} + N_{2}}}$$

denote the average proportions of cells in the two compartments, x, y > 0, \(x + y = 1\), then y satisfies the Riccati equation

$$\displaystyle{ \dot{y} = a_{1} - (a_{1} + a_{2})y - a_{2}y^{2} }$$
(19)

and has a well-defined steady state (i.e., a unique, globally asymptotically stable equilibrium point) \(y_{{\ast}}\) in the open interval (0, 1) given by

$$\displaystyle{ y_{{\ast}} = \frac{1} {2}\left (\sqrt{\left (1 + \frac{a_{1 } } {a_{2}}\right )^{2} + 4\frac{a_{1}} {a_{2}}} -\left (1 + \frac{a_{1}} {a_{2}}\right )\right ). }$$
(20)
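For completeness, equations (19) and (20) can be recovered from the two linear balance equations in a few lines. The following is only a sketch of that computation; it uses the shorthand \(C = N_{1} + N_{2}\) for the total number of cells and no quantities beyond those defined above.

$$\displaystyle\begin{array}{rcl} \dot{C}& =& \dot{N}_{1} +\dot{ N}_{2} = a_{2}N_{2} = a_{2}yC, \\ \dot{y}& =& \frac{\dot{N}_{2}} {C} - y\frac{\dot{C}} {C} = \left (a_{1}x - a_{2}y\right ) - a_{2}y^{2} = a_{1} - (a_{1} + a_{2})y - a_{2}y^{2}, \end{array}$$

where \(x = 1 - y\) was used in the last step. Setting \(\dot{y} = 0\) gives the quadratic equation \(a_{2}y^{2} + (a_{1} + a_{2})y - a_{1} = 0\), whose only positive root is

$$\displaystyle{y_{{\ast}} = \frac{-(a_{1} + a_{2}) + \sqrt{(a_{1 } + a_{2 } )^{2 } + 4a_{1 } a_{2}}} {2a_{2}},}$$

and dividing numerator and denominator by \(a_{2}\) yields (20). Since the right-hand side of (19) equals \(a_{1}> 0\) at y = 0 and \(-2a_{2} <0\) at y = 1, this root lies in the open interval (0, 1).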

All solutions approach this value as \(t \rightarrow \infty\). We only remark that these proportions x and y can be measured using cell cycle flow cytometry. If we write \(C(t) = N_{1}(t) + N_{2}(t)\) for the average total number of cancer cells, then the differential equations imply that

$$\displaystyle{\dot{C}(t) = a_{2}N_{2}(t) = a_{2}y(t)C(t) \approx a_{2}y_{{\ast}}C(t)}$$

and thus, in steady state, the total tumor population grows exponentially at about rate \(a_{2}y_{{\ast}}\). This allows us to relate the coefficients \(a_{i}\) to the tumor doubling time and these steady states as follows:

Proposition 3.3.

With T denoting the tumor doubling time and \(x_{{\ast}}\) and \(y_{{\ast}}\) the steady-state proportions of cells in the \(G_{0}/G_{1} + S\) and \(G_{2}/M\) phases of the cell cycle, respectively, we have that

$$\displaystyle{a_{1} = \left (1 + y_{{\ast}}\right ) \frac{\ln 2} {Tx_{{\ast}}}\qquad \mathrm{and}\qquad a_{2} = \frac{\ln 2} {Ty_{{\ast}}}.}$$
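The relations in formula (20) and Proposition 3.3 are easy to check numerically. The following minimal sketch evaluates the steady-state fraction for the values \(a_{1} = 0.197\) and \(a_{2} = 0.356\) used in the numerical examples of this section and then recovers the two rates from the resulting doubling time; the function names are illustrative and not part of the original model description.

```python
import math

def steady_state_fraction(a1, a2):
    """Positive root y_* of a1 - (a1 + a2) y - a2 y^2 = 0, i.e., formula (20)."""
    r = a1 / a2
    return 0.5 * (math.sqrt((1.0 + r) ** 2 + 4.0 * r) - (1.0 + r))

def doubling_time(a1, a2):
    """Doubling time ln 2 / (a2 y_*) of the untreated population in steady state."""
    return math.log(2.0) / (a2 * steady_state_fraction(a1, a2))

def rates_from_measurements(T, y_star):
    """Proposition 3.3: recover (a1, a2) from the doubling time T and the G2/M fraction y_*."""
    x_star = 1.0 - y_star
    a1 = (1.0 + y_star) * math.log(2.0) / (T * x_star)
    a2 = math.log(2.0) / (T * y_star)
    return a1, a2

a1, a2 = 0.197, 0.356                 # values quoted in the text
y_star = steady_state_fraction(a1, a2)
T = doubling_time(a1, a2)
print("x_*, y_*:", round(1.0 - y_star, 4), round(y_star, 4))
print("doubling time [days]:", T)
print("recovered rates:", rates_from_measurements(T, y_star))
```

In practice one would go in the opposite direction: measure T and the proportions (for example by flow cytometry) and call `rates_from_measurements` to calibrate the model.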

Drug treatment influences the cell cycle in many ways and the model considered here includes only the most fundamental aspect, cell killing by a cytotoxic agent in the \(G_{2}/M\) phase. It is implicitly assumed that all cancer cells are drug sensitive. Recall that the control variable u represents the drug concentration in the bloodstream and, in accordance with the log-kill hypothesis, we assume that the drug concentration u(t) kills a fraction of the outflow \(a_{2}N_{2}(t)\) of cells from the \(G_{2}/M\) compartment. Thus the number of cells killed is given by \(\varphi u(t)a_{2}N_{2}(t)\) with \(\varphi\) a constant chemotherapeutic killing parameter. The control set is a compact interval \([0,u_{\max }]\) with \(u_{\max }\) denoting the maximum dose rate/concentration. In the model, the control u always appears in conjunction with the constant \(\varphi\) and thus, in order to keep the number of free parameters to a minimum, we combine it with the maximum dose rate into one quantity that we still denote by \(u_{\max }\) under the assumption that \(u_{\max } \leq 1\). If the concentration is high enough, then indeed \(u_{\max } = 1\) is realistic: almost all the cancer cells in that compartment can be killed. Cells which are killed in \(G_{2}/M\) leave this compartment, i.e., are counted as outflows from the second compartment, but they no longer enter the first compartment. Only the remaining fraction \((1 - u)a_{2}N_{2}\) undergoes cell division. Thus the controlled mathematical model becomes

$$\displaystyle\begin{array}{rcl} \dot{N}_{1}& =& -a_{1}N_{1} + 2(1 - u)a_{2}N_{2}, {}\\ \dot{N}_{2}& =& a_{1}N_{1} - a_{2}N_{2}, {}\\ \end{array}$$

or, in matrix form, \(\dot{N}(t) = (A + uB)N(t)\), with A and B given by

$$\displaystyle{ A = \left (\begin{array}{cc} - a_{1} & 2a_{2} \\ a_{1} & - a_{2} \end{array} \right )\qquad \mathrm{and}\qquad B = \left (\begin{array}{cc} 0& - 2a_{2} \\ 0& 0 \end{array} \right ). }$$
(21)

For this model, singular controls are not optimal. Denoting the coefficient at the control u in the objective by s, the switching function is given by \(\varPhi (t) = s +\lambda (t)BN_{{\ast}}(t)\). If the control is singular on an open interval I, then, using Proposition 3.2 it follows that

$$\displaystyle{ \dot{\varPhi }(t) = \left \{\lambda (t)[A,B] - qB\right \}N_{{\ast}}(t) \equiv 0 }$$
(22)

and

$$\displaystyle\begin{array}{rcl} \ddot{\varPhi }(t)& =& \left \{\lambda (t)[A,[A,B]] - q[A,B] - qBA\right \}N_{{\ast}}(t) \\ & & +u(t)\left \{\lambda (t)[B,[A,B]] - qB^{2}\right \}N_{ {\ast}}(t).{}\end{array}$$
(23)

Hence the Legendre-Clebsch condition is determined by the expression

$$\displaystyle{ \frac{\partial } {\partial u} \frac{d^{2}} {dt^{2}} \frac{\partial H} {\partial u} (\lambda (t),N_{{\ast}}(t),u_{{\ast}}(t)) = \left \{\lambda (t)[B,[A,B]] - qB^{2}\right \}N_{ {\ast}}(t). }$$
(24)

It is clear that \(B^{2} \equiv 0\) and a direct computation verifies that

$$\displaystyle{ [B,[A,B]] = 8a_{1}a_{2}^{2}\left (\begin{array}{cc} 0&1 \\ 0&0 \end{array} \right ) = -4a_{1}a_{2}B. }$$
(25)

Furthermore, \(\varPhi (t) \equiv 0\) implies that \(\lambda (t)BN_{{\ast}}(t) \equiv -s\) and thus

$$\displaystyle{ \frac{\partial } {\partial u} \frac{d^{2}} {dt^{2}} \frac{\partial H} {\partial u} (\lambda (t),N_{{\ast}}(t),u_{{\ast}}(t)) = 4a_{1}a_{2}s> 0 }$$
(26)

violating the Legendre-Clebsch condition for optimality of a singular control.
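The matrix identities behind this argument can be confirmed with a few lines of code. The sketch below uses plain nested lists and the sign convention \([X,Y ] = Y X - XY\) of the text to check that \(B^{2} = 0\) and that (25) holds for the matrices (21); the numerical values of \(a_{1}\) and \(a_{2}\) are those of the examples in this section, and the helper names `matmul` and `bracket` are illustrative only.

```python
def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(X, Y):
    """Commutator in the sign convention of the text: [X, Y] = YX - XY."""
    YX, XY = matmul(Y, X), matmul(X, Y)
    n = len(X)
    return [[YX[i][j] - XY[i][j] for j in range(n)] for i in range(n)]

a1, a2 = 0.197, 0.356
A = [[-a1, 2.0 * a2], [a1, -a2]]       # matrices (21)
B = [[0.0, -2.0 * a2], [0.0, 0.0]]

B_squared = matmul(B, B)
lhs = bracket(B, bracket(A, B))                              # [B, [A, B]]
rhs = [[-4.0 * a1 * a2 * entry for entry in row] for row in B]  # -4 a1 a2 B

print("B^2          =", B_squared)
print("[B,[A,B]]    =", lhs)
print("-4 a1 a2 B   =", rhs)
```

Both sides agree up to rounding, with the single nonzero entry equal to \(8a_{1}a_{2}^{2}\), so the sign of (26) is determined by the sign of s alone.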

Theorem 3.1.

If \((N_{{\ast}},u_{{\ast}})\) is an optimal controlled trajectory for the optimal control problem [CC] with matrices (21), then there does not exist an interval on which the control \(u_{{\ast}}\) is singular.

We give a couple of examples of locally optimal bang-bang controls. For the cell cycle parameters we have chosen the values \(a_{1} = 0.197\) and \(a_{2} = 0.356\) used in [67] and in all computations the initial condition is taken as the steady-state proportions defined by Eq. (20), normalizing the total number of initial cancer cells to 1 (times \(10^{10}\)), i.e., \(N_{1}(0) = 0.7012\) and \(N_{2}(0) = 0.2988\). This would be representative of conditions where the cancer has been growing exponentially for some time without treatment; even if chemotherapy has been given earlier, in the rest periods the cells redistribute over the compartments and their proportions are given by these values. The control limit is taken as \(u_{\max } = 0.90\); this value is meant for illustrative purposes only. Figure 1 shows two examples of controls and corresponding trajectories when the coefficients in the objective have been chosen as r = (3, 3), q = (0.1, 0.1) and \(s = \frac{1} {2}\). The examples shown are for time horizons of T = 21 and T = 60 [days]. In all cases, extremals are bang-bang trajectories with exactly one switching from \(u = u_{\max }\) to u = 0. The total numbers of cancer cells at the end of the therapy horizon are reduced to \(N_{1}(T) + N_{2}(T) = 0.5297\) and 0.4799, respectively.

Fig. 1 Examples of locally optimal controls (left) and their corresponding trajectories (right) for T = 21 (top) and T = 60 (bottom) from the steady-state solution
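To illustrate what a bang-bang protocol with one switching does in this model, the controlled equations can be integrated directly. The sketch below uses a classical Runge-Kutta scheme with the parameter values quoted above; the switching time of 10 days is a hypothetical choice made only for illustration (the optimal switching times are not stated in the text), so the output is not meant to reproduce the values reported for Fig. 1.

```python
import math

A1, A2 = 0.197, 0.356      # inverse mean transit times quoted in the text
U_MAX = 0.90               # control limit quoted in the text

def rhs(n1, n2, u):
    """Right-hand side of the controlled 2-compartment model."""
    return (-A1 * n1 + 2.0 * (1.0 - u) * A2 * n2, A1 * n1 - A2 * n2)

def simulate(n1, n2, horizon, switch_time, steps_per_day=100):
    """RK4 integration with u = U_MAX before switch_time and u = 0 afterwards."""
    h = 1.0 / steps_per_day
    for k in range(int(round(horizon * steps_per_day))):
        u = U_MAX if k * h < switch_time else 0.0
        k1 = rhs(n1, n2, u)
        k2 = rhs(n1 + 0.5 * h * k1[0], n2 + 0.5 * h * k1[1], u)
        k3 = rhs(n1 + 0.5 * h * k2[0], n2 + 0.5 * h * k2[1], u)
        k4 = rhs(n1 + h * k3[0], n2 + h * k3[1], u)
        n1 += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        n2 += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return n1, n2

# Steady-state growth rate a2 * y_* of the untreated population, from formula (20).
r = A1 / A2
Y_STAR = 0.5 * (math.sqrt((1.0 + r) ** 2 + 4.0 * r) - (1.0 + r))
GROWTH_RATE = A2 * Y_STAR

N0 = (0.7012, 0.2988)                                      # initial condition from the text
treated = simulate(*N0, horizon=21, switch_time=10.0)      # hypothetical switching time
untreated = simulate(*N0, horizon=21, switch_time=0.0)

print("treated total   :", sum(treated))
print("untreated total :", sum(untreated))
print("exp(a2 y_* T)   :", math.exp(GROWTH_RATE * 21))
```

The untreated run serves as a sanity check: starting from the steady-state proportions, the total grows essentially like \(e^{a_{2}y_{{\ast}}t}\).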

All extremals shown here are strong local minima; that is, there exists a neighborhood W of the graph of the corresponding trajectory in \([0,T] \times \mathbb{P}\) such that the controls are optimal with respect to any other control u for which the graph of its corresponding trajectory N lies in W [33]. In fact, for this 2-compartment model we have consistently seen that extremal bang-bang trajectories that have more than one switching are not optimal and the examples shown are expected to be globally optimal. This simply means that we can take the neighborhood W as the full space \([0,T] \times \mathbb{P}\), but we have not verified this.

Analogous results have been obtained for multidrug 3-compartment models when the actions of a \(G_{2}/M\)-specific cytotoxic agent were combined either with a cytostatic agent that was slowing down the progression of cells during the synthesis phase [34, 62] or with a recruiting agent that was applied to entice dormant cells to reenter the active cell cycle from the compartment \(G_{0}\) [35]. In each case, singular controls can be excluded from optimality using the Legendre-Clebsch condition and optimal controls are bang-bang with one switching for the cytotoxic agent giving the dose upfront. In the model formulation, however, it is implicitly assumed that the tumor population is homogeneous and consists of chemotherapeutically sensitive cells. Also, the problem considered here corresponds to one particular chemotherapy session only. The steady-state proportions of the uncontrolled system reestablish very quickly during the rest periods and thus multiple chemotherapy sessions reduce to repetitions of the structure obtained above. Overall, for tumor populations that are homogeneous and consist of chemotherapeutically sensitive cells, these mathematical models therefore confirm the prevailing paradigm that chemotherapy should be given in an MTD scheme upfront. However, this no longer is so clear cut once tumor heterogeneity is taken into account.

3.3 Compartmental Models for Heterogeneous Tumor Populations

Malignant cancer cell populations are genetically unstable and, coupled with fast proliferation rates, this leads to a great variety in the structure of the cells within one tumor: the number of genetic errors present within one cancer cell can lie in the thousands [42]. Consequently, tumors often consist of heterogeneous mixtures of various subpopulations that show widely varying sensitivities towards the actions of a particular chemotherapeutic agent [13, 14]. In medicine, the Norton-Simon hypothesis [44] postulates that tumors consist of faster growing cells that are sensitive to chemotherapy and slower growing populations of cells that exhibit lower sensitivities or, with time, become resistant to the chemotherapeutic agent (acquired drug resistance). There may even exist small subpopulations of cells for which the specific activation mechanism of a chemotherapeutic agent does not work at all and which thus are not sensitive to the treatment from the beginning (ab initio, intrinsic resistance). Given such a scenario, over time, as the drugs kill sensitive tumor cells, resistant subpopulations of cancer cells may emerge that will make an MTD-style therapy less and less effective [37, 41, 71]. Even if the fraction of intrinsically resistant tumor cells is tiny (undetectable), after the sensitive cells have been killed by the treatment it may then grow in time to become a fully developed tumor of chemotherapeutically resistant cells, leading to the failure of therapy, possibly only after many years of seeing remission of the cancer.

Compartmental models of the type (15) can also be used to investigate the structure of optimal controls if the tumor population is heterogeneous. In [17], Hahnfeldt, Folkman and Hlatky compare the effects of MTD and metronomic chemotherapy (when given by bolus-type injections) on sensitive and resistant tumor populations. Optimizing the maximum asymptotic factor reduction in tumor size between periods in an infinite cycle of periodic therapy periods, the authors come to the conclusion that a metronomic, regular scheduling of the drugs has better long-term effects. We here consider the same underlying dynamics in a continuous-time formulation and explore the structure of optimal protocols that minimize the tumor burden as measured by the average over one, but possibly very large, therapy interval. Since we want to explore the effects that heterogeneity has, we distinguish three subpopulations which, for simplicity of terminology, are labeled “sensitive,” S, “partially sensitive,” P, and “resistant,” R. The terminology is only meant to indicate that these populations have different sensitivities towards a chemotherapeutic agent with S the highest and R the lowest. We assume that these subpopulations grow at growth rates \(\alpha _{1}\), \(\alpha _{2}\), and \(\alpha _{3}\), respectively. Generally we do not make assumptions on the order of the growth rates, but an ordering \(\alpha _{1}>\alpha _{2}>\alpha _{3}\) would be consistent with the Norton-Simon hypothesis. We allow for transitions between the compartments, i.e., we include the typical effects that sensitive cells can become more resistant, but we also allow for resensitizations which make cells less resistant to the chemotherapeutic agent [15]. We denote the transition rates from the sensitive to the partially sensitive and resistant compartments by \(\sigma _{P}\) and \(\sigma _{R}\), respectively, and use analogous notations for the other transition rates. Thus, for example, \(\rho _{P}\) denotes the transition rate from resistant to partially sensitive cells. These rates are assumed to be constant and positive. This corresponds to an ergodic structure in which all compartments are repeatedly visited by cells. Cell kill by a chemotherapeutic agent is expressed by the standard linear log-kill hypothesis: if we denote the concentration of the drug in the bloodstream by u, then the rate of cells eliminated is given by \(\varphi _{i}u\), i = 1, 2, 3, with the coefficients \(\varphi _{1}\), \(\varphi _{2}\), and \(\varphi _{3}\) representing the effectiveness of the drug on the sensitive, partially sensitive and resistant subpopulations, respectively. Thus \(\varphi _{1}>\varphi _{2}>\varphi _{3} \geq 0\). The case \(\varphi _{3} = 0\) corresponds to the situation of a fully resistant subpopulation R. We again do not include the standard pharmacokinetic model for the agent here and treat u as the control of the system with maximum concentration given by \(u_{\max }\). The controlled dynamics is then simply determined by the inflows and outflows from the various compartments and is given by the following 3-dimensional linear system of equations:

$$\displaystyle\begin{array}{rcl} \dot{S} = \left (\alpha _{1} -\sigma _{P} -\sigma _{R} -\varphi _{1}u\right )S +\pi _{S}P +\rho _{S}R,& &{}\end{array}$$
(27)
$$\displaystyle\begin{array}{rcl} \dot{P} =\sigma _{P}S + \left (\alpha _{2} -\pi _{S} -\pi _{R} -\varphi _{2}u\right )P +\rho _{P}R,& &{}\end{array}$$
(28)
$$\displaystyle\begin{array}{rcl} \dot{R} =\sigma _{R}S +\pi _{R}P + \left (\alpha _{3} -\rho _{S} -\rho _{P} -\varphi _{3}u\right )R.& &{}\end{array}$$
(29)

Even if initially no partially sensitive or resistant cells are present, they will immediately appear because of the ergodic nature of the underlying Markov chain and resulting transitions between the compartments. Without loss of generality, we thus assume that all initial conditions \(S_{0}\), \(P_{0}\), and \(R_{0}\) are positive. Admissible controls are Lebesgue-measurable functions with values in the compact interval \([0,u_{\max }]\), \(u: [0,T] \rightarrow [0,u_{\max }]\), \(t\mapsto u(t)\).

We denote the proportions of the respective populations by

$$\displaystyle{x = \frac{S} {S + P + R},\qquad y = \frac{P} {S + P + R}\qquad \mathrm{and}\qquad z = \frac{R} {S + P + R};}$$

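it then follows that x, y, and z obey Riccati equations and direct computations verify that, for the uncontrolled system (u ≡ 0) and with the abbreviations \(\nu _{S} =\alpha _{1} -\sigma _{P} -\sigma _{R}\), \(\nu _{P} =\alpha _{2} -\pi _{S} -\pi _{R}\), and \(\nu _{R} =\alpha _{3} -\rho _{S} -\rho _{P}\) for the diagonal coefficients in (27)–(29),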

$$\displaystyle\begin{array}{rcl} \dot{x} =\nu _{S}x +\pi _{S}y +\rho _{S}z - x(\alpha _{1}x +\alpha _{2}y +\alpha _{3}z),& &{}\end{array}$$
(30)
$$\displaystyle\begin{array}{rcl} \dot{y} =\sigma _{P}x +\nu _{P}y +\rho _{P}z - y(\alpha _{1}x +\alpha _{2}y +\alpha _{3}z),& &{}\end{array}$$
(31)
$$\displaystyle\begin{array}{rcl} \dot{z} =\sigma _{R}x +\pi _{R}y +\nu _{R}z - z(\alpha _{1}x +\alpha _{2}y +\alpha _{3}z).& &{}\end{array}$$
(32)

with the system evolving on the unit simplex

$$\displaystyle{\varSigma = \left \{(x,y,z): x \geq 0,\ y \geq 0,\ z \geq 0,\ x + y + z = 1\right \}.}$$

Proposition 3.4 ([28]).

The dynamics (30)–(32) has exactly one equilibrium point \((x_{{\ast}},y_{{\ast}},z_{{\ast}}) \in \varSigma\) which is globally asymptotically stable in Σ and defines the steady-state proportions.
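The steady-state proportions of Proposition 3.4 can be approximated by simply integrating the uncontrolled proportion dynamics until they settle. The sketch below does this with a forward-Euler scheme for the parameter values of the numerical illustration given later in this section; the step size, the time horizon, and the starting point are arbitrary choices made for illustration, and the diagonal coefficients are taken as growth rate minus total outflow rate as in (27)–(29) with u = 0.

```python
# Parameter values of the numerical illustration in this section.
ALPHA = (1.0, 0.5, 0.1)                       # growth rates alpha_1, alpha_2, alpha_3
SIGMA_P, SIGMA_R = 0.05, 0.01                 # transitions S -> P and S -> R
PI_S, PI_R = 0.03, 0.01                       # transitions P -> S and P -> R
RHO_S, RHO_P = 0.01, 0.03                     # transitions R -> S and R -> P

# System matrix of (27)-(29) with u = 0.
A = [
    [ALPHA[0] - SIGMA_P - SIGMA_R, PI_S, RHO_S],
    [SIGMA_P, ALPHA[1] - PI_S - PI_R, RHO_P],
    [SIGMA_R, PI_R, ALPHA[2] - RHO_S - RHO_P],
]

def proportion_rhs(w):
    """Right-hand side of (30)-(32): (A w)_i - w_i * (alpha . w)."""
    growth = sum(a * wi for a, wi in zip(ALPHA, w))
    return [sum(A[i][j] * w[j] for j in range(3)) - w[i] * growth
            for i in range(3)]

def steady_state(w=(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0), h=0.01, t_end=200.0):
    """Forward-Euler integration of (30)-(32) until the proportions settle."""
    w = list(w)
    for _ in range(int(t_end / h)):
        dw = proportion_rhs(w)
        w = [wi + h * dwi for wi, dwi in zip(w, dw)]
    return w

x_star, y_star, z_star = steady_state()
print(round(x_star, 4), round(y_star, 4), round(z_star, 4))
```

Up to rounding in the last digit, the result agrees with the initial proportions used for the simulation in this section. An eigenvector computation for the matrix A would give the same point more directly; the integration is used here only because it mirrors the statement about global asymptotic stability.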

Thus, given an estimate \(C_{0}\) of the tumor size, there once more exists a well-defined initial condition \(S_{0} = x_{{\ast}}C_{0}\), \(P_{0} = y_{{\ast}}C_{0}\) and \(R_{0} = z_{{\ast}}C_{0}\) for the optimal control problem [OC]. Setting \(N = (S,P,R)^{T}\), we have a 3-dimensional single-input control system of the form \(\dot{N} = \left (A + uB\right )N\) with the matrices determined by equations (27)–(29); the objective is the same as defined in (16) before. The necessary conditions for optimality thus take the same form as for the 2-compartment model considered above. It is easily seen that also for this system, although the dynamics is not described by an \(\mathcal{M}\)-matrix, all states and multipliers \(\lambda _{i}\), i = 1, 2, 3, are positive over the interval [0, T] and we have the same formulas (22) and (23) for the derivatives of the switching function \(\varPhi (t) = s +\lambda (t)BN_{{\ast}}(t)\) with the Legendre-Clebsch condition again given by (24). Here

$$\displaystyle{ \left [B,[A,B]\right ] = -\left (\begin{array}{ccc} 0 & (\varphi _{2} -\varphi _{1})^{2}\pi _{S} & (\varphi _{3} -\varphi _{1})^{2}\rho _{S} \\ (\varphi _{1} -\varphi _{2})^{2}\sigma _{P}& 0 &(\varphi _{3} -\varphi _{2})^{2}\rho _{P} \\ (\varphi _{1} -\varphi _{3})^{2}\sigma _{R}&(\varphi _{2} -\varphi _{3})^{2}\pi _{R}& 0 \end{array} \right ) }$$
(33)

so that \(\lambda (t)[B,[A,B]]N_{{\ast}}(t) \leq 0\) while

$$\displaystyle{qB^{2}N_{ {\ast}}(t) = q_{1}\varphi _{1}^{2}S_{ {\ast}}(t) + q_{2}\varphi _{2}^{2}P_{ {\ast}}(t) + q_{3}\varphi _{3}^{2}R_{ {\ast}}(t)> 0.}$$

Hence \(\left \{\lambda (t)[B,[A,B]] - qB^{2}\right \}N_{{\ast}}(t) <0\) and the strengthened Legendre-Clebsch condition is always satisfied. Essentially, this is just a consequence of having different sensitivities.
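Formula (33) follows from the fact that B is diagonal, \(B = -\mathrm{diag}(\varphi _{1},\varphi _{2},\varphi _{3})\), so that each bracket with B multiplies the (i, j) entry of A by a difference of sensitivities. The following sketch checks this entry by entry for the illustrative parameter values used in this section; as before, the helper names are hypothetical and the commutator convention is \([X,Y ] = Y X - XY\).

```python
def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(X, Y):
    """Commutator in the sign convention of the text: [X, Y] = YX - XY."""
    YX, XY = matmul(Y, X), matmul(X, Y)
    n = len(X)
    return [[YX[i][j] - XY[i][j] for j in range(n)] for i in range(n)]

PHI = (1.5, 1.0, 0.1)          # sensitivities phi_1, phi_2, phi_3 from the text
A = [                          # drift matrix of (27)-(29) for the illustrative rates
    [1.0 - 0.05 - 0.01, 0.03, 0.01],
    [0.05, 0.5 - 0.03 - 0.01, 0.03],
    [0.01, 0.01, 0.1 - 0.01 - 0.03],
]
B = [[-PHI[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

double_bracket = bracket(B, bracket(A, B))
# Entry-wise prediction of (33): -(phi_i - phi_j)^2 * A_ij.
predicted = [[-(PHI[i] - PHI[j]) ** 2 * A[i][j] for j in range(3)]
             for i in range(3)]
B_squared = matmul(B, B)

for row in double_bracket:
    print([round(v, 6) for v in row])
```

All off-diagonal entries are negative and the diagonal vanishes, while \(B^{2}\) is diagonal with entries \(\varphi _{i}^{2}\); together with positive states and multipliers this gives the sign claimed above.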

Proposition 3.5.

For the compartmental model defined by equations (27)–(29), singular controls are of order 1 and the strengthened Legendre-Clebsch condition for minimality is satisfied.

Thus, in this case it is expected that singular controls are locally optimal. Setting \(\ddot{\varPhi }(t) \equiv 0\) in equation (23) and solving for u gives the following formula for the singular control:

$$\displaystyle{ u_{\mathrm{sing}}(t) = \frac{\left \{\lambda (t)\left [A,[A,B]\right ] - q\left [A,B\right ] - qBA\right \}N_{{\ast}}(t)} {\left \{-\lambda (t)\left [B,[A,B]\right ] + qB^{2}\right \}N_{{\ast}}(t)}. }$$
(34)

This singular control actually does not depend on the values S, P, and R of the state, but only on the values of the proportions x, y, and z. In order to be admissible, the control values need to lie in the control set \([0,u_{\max }]\). It follows from the strengthened Legendre-Clebsch condition that the denominator is positive. In the numerator, all terms in the vector − qBA are positive, while only a few of the coefficients in the matrix [A, [A, B]] and in the vector \(-q\left [A,B\right ]\) are negative. Thus generally, and this is what we have seen consistently in numerical computations, the values of the expression (34) are positive and thus admissible for suitable upper bounds \(u_{\max }\).

Analyzing optimal concatenations between bang and singular controls is difficult and this analysis has not been carried out yet. However, it is not difficult to give some numerical samples of singular controls and corresponding trajectories. Along a singular arc, the multiplier λ satisfies \(\varPhi (t) = s +\lambda (t)BN_{{\ast}}(t) \equiv 0\) and \(\dot{\varPhi }(t) = \left \{\lambda (t)[A,B] - qB\right \}N_{{\ast}}(t) \equiv 0\) and is determined by these conditions up to a positive scalar multiple. In principle, here singular controls are possible everywhere in the state space and in Fig. 2 we give an example of an extremal controlled trajectory for which the control is given by the maximum dose rate for an initial interval \([0,\tau _{b}]\) and then is singular over the remaining period \([\tau _{b},T]\). In this simulation the parameter values defining the dynamics are \(\alpha _{1} = 1\), \(\alpha _{2} = 0.5\), and \(\alpha _{3} = 0.1\) with transition rates \(\sigma _{P} = 0.05\), \(\sigma _{R} = 0.01\), \(\pi _{S} = 0.03\), \(\pi _{R} = 0.01\), \(\rho _{S} = 0.01\), and \(\rho _{P} = 0.03\). Normalizing the initial cancer burden to C(0) = 1, the corresponding steady-state proportions are given by \(S_{0} = x_{{\ast}} = 0.8954\), \(P_{0} = y_{{\ast}} = 0.0933\), and \(R_{0} = z_{{\ast}} = 0.0112\) and we used these as initial condition. The maximum dose rate is normalized to \(u_{\max } = 1\) and the pharmacodynamic coefficients are \(\varphi _{1} = 1.5\), \(\varphi _{2} = 1\), and \(\varphi _{3} = 0.1\). All these values are for illustrative purposes only. In the objective we chose all weights \(q_{i}\) equal to 0.01 and we used \(\tau _{b} = 5\) and T = 25, so that a full dose is given for 20% of the time. Over this time horizon the lower dose rates of the singular controls are able to maintain a lower cancer burden, but eventually the resistant population will become dominant. However, this will happen regardless of the specific administration protocol of the drug.

Fig. 2 Example of an extremal control and associated states for a bang-singular controlled trajectory

In the medical literature protocols like these are referred to as “chemo-switch” protocols and our computations show that, as differing chemotherapeutic sensitivities and even drug resistance come into play, lower dose rates become a valid alternative to MTD protocols.

4 Mathematical Models for Antiangiogenic Treatments

The most important structure of a tumor’s microenvironment is its vasculature. In order to grow beyond a small size, a tumor needs to develop its own network of blood vessels and capillaries that will provide it with nutrients and oxygen. This process is called angiogenesis and was already pointed out as a therapeutic target by J. Folkman in the early 1970s [8, 9]. Antiangiogenic treatments aim at depriving the tumor of this needed vasculature by either disrupting the signaling process that the tumor uses to recruit surrounding, mature, host blood vessels or by directly inhibiting the growth of endothelial cells that form the lining of the newly developing blood vessels and capillaries. Ideally, without an adequate support network, the tumor’s further development is halted and it even shrinks. Rather than fighting the fast duplicating, genetically unstable, and continuously mutating tumor cells, this indirect treatment approach targets the genetically stable endothelial cells. As a consequence, no clonal resistance to angiogenic inhibitors has been observed in experimental cancer [2] and for this reason, after the discovery of antiangiogenic mechanisms that the tumor uses to control its vasculature in the 1990s [5, 10, 25], antiangiogenic treatments were a new hope in the war on cancer. Unfortunately, these high hopes have not been realized, mostly due to the maintenance only character of the treatment [22]. However, antiangiogenic approaches have become a valuable component in the treatment of many cancer types in connection with other traditional approaches like chemo- or radiotherapy that directly attack tumor cells.

A widely influential population-based mathematical model for tumor development under angiogenic signaling was developed and biologically validated in 1999 by Hahnfeldt, Panigrahy, Folkman, and Hlatky [16]. This model has become an object of strong interest also in the mathematical literature and to this date is still undergoing vigorous development. It has been analyzed from a dynamical systems perspective (e.g., by d’Onofrio and Gandolfi [47, 48], Forys et al. [11]) as well as from an optimal control point of view (by the authors and coworkers [38, 39, 52] and by Swierniak [68, 70]), and numerous generalizations and variations of the underlying model have been proposed (e.g., [7, 46, 49, 51, 58, 60]). In Sect. 4.1, for the original mathematical model, we describe a complete solution of how to administer an a priori given amount of antiangiogenic agents in order to achieve the best possible tumor reduction. In this solution, an optimal singular arc and its associated singular control determine the structure of optimal controls which are largely defined by a singular segment. These feedback controls, however, are difficult to implement. Yet, the solution is fully robust and excellent simple suboptimal controls that come within 1% of the optimal value exist and will be discussed in Sect. 4.2. Since antiangiogenic therapy only targets cancer cells indirectly, in order to be effective, it needs to be combined with therapies that also kill the cancer cells. In Sect. 4.3 we show how the solution for the antiangiogenic monotherapy presented in Sect. 4.1 provides the basis for the solutions for such combination therapy problems.

4.1 Synthesis of Optimal Controlled Trajectories for the Monotherapy Problem

In the model by Hahnfeldt et al. [16], the spatial aspects of angiogenesis are incorporated into a nonspatial 2-compartment model with the primary tumor volume, p, and the carrying capacity of the vasculature, q, as its principal variables. Intuitively, the latter can be thought of as the ideal tumor volume sustainable by the vascular network and is closely related to the volume of endothelial cells that form the lining of the existing and newly forming capillaries. The dynamics consists of two ODEs that describe the evolution of the tumor volume and its carrying capacity, which, with u denoting the action of an antiangiogenic agent, is given by the following equations:

$$\displaystyle\begin{array}{rcl} \dot{p} = -\xi p\ln \left (\frac{p} {q}\right ),\qquad \quad p(0) = p_{0},& &{}\end{array}$$
(35)
$$\displaystyle\begin{array}{rcl} \dot{q} = bp -\left (dp^{\frac{2} {3} }+\mu \right )q -\gamma uq,\qquad \quad q(0) = q_{0},& &{}\end{array}$$
(36)

In equation (35) a Gompertzian model with ξ a constant parameter is chosen to model tumor growth (other choices are equally possible). Note that the carrying capacity and tumor volume are balanced for p = q and thus \(\dot{p} = 0\) in this case while the tumor volume shrinks for inadequate endothelial support (p > q) and increases if this support is plentiful (p < q). Different from conventional approaches, in this model the carrying capacity is not a constant, but itself becomes a state variable whose evolution is governed by a balance of stimulatory and inhibitory effects given in equation (36). Based on an asymptotic analysis of the underlying consumption-diffusion process and the facts that angiogenic inhibitors have a more systemic effect while stimulators, on the other hand, act locally, the functional forms S(p, q) = bp and \(I(p,q) = dp^{\frac{2} {3} }q\) for stimulators and inhibitors are proposed in [16]. The term \(\mu q\), μ ≥ 0, which has been separated out, describes the loss of endothelial cells through natural causes (death etc.) and \(\gamma uq\) models the loss of vasculature due to outside administration of antiangiogenic agents using a standard log-kill term. The control u represents the concentration in the plasma of such an agent with \(u_{\max }\) denoting an a priori set maximum dose rate/concentration.
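The qualitative behavior just described can be explored with a small simulation. Setting p = q in (36) for a constant dose u shows that a balanced state requires \(b -\mu -\gamma u = dp^{\frac{2} {3} }\), so a positive equilibrium exists only if \(\gamma u <b-\mu\). The sketch below integrates (35) and (36) with a Runge-Kutta scheme; all numerical parameter values, the initial data, and the dose u = 40 are assumptions chosen purely for illustration and are not taken from this text.

```python
import math

# Illustrative parameter values (assumptions, not taken from this text).
XI, B_STIM, D_INH, MU, GAMMA = 0.084, 5.85, 0.00873, 0.02, 0.15

def rhs(p, q, u):
    """Right-hand side of (35)-(36) for tumor volume p and carrying capacity q."""
    dp = -XI * p * math.log(p / q)
    dq = B_STIM * p - (D_INH * p ** (2.0 / 3.0) + MU) * q - GAMMA * u * q
    return dp, dq

def equilibrium(u):
    """Balanced state p = q under a constant dose u, or None if none exists."""
    excess = B_STIM - MU - GAMMA * u
    return (excess / D_INH) ** 1.5 if excess > 0 else None

def simulate(p, q, u, t_end, h=0.01):
    """RK4 integration of (35)-(36) with a constant control u."""
    for _ in range(int(round(t_end / h))):
        k1 = rhs(p, q, u)
        k2 = rhs(p + 0.5 * h * k1[0], q + 0.5 * h * k1[1], u)
        k3 = rhs(p + 0.5 * h * k2[0], q + 0.5 * h * k2[1], u)
        k4 = rhs(p + h * k3[0], q + h * k3[1], u)
        p += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        q += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return p, q

p_bar = equilibrium(0.0)
p_free, q_free = simulate(12000.0, 15000.0, u=0.0, t_end=200.0)   # no treatment
p_dose, q_dose = simulate(12000.0, 15000.0, u=40.0, t_end=30.0)   # hypothetical constant dose

print("untreated equilibrium volume:", p_bar)
print("untreated state after 200 days:", p_free, q_free)
print("state after 30 days at u = 40:", p_dose, q_dose)
```

Without treatment the trajectory approaches the balanced state p = q, while for a dose with \(\gamma u> b-\mu\) no positive equilibrium exists and the tumor volume decreases.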

Different from the previous model formulations, we here assume that a fixed amount A of angiogenic inhibitors is given. Mathematically this represents an isoperimetric constraint and is modeled as

$$\displaystyle{ \dot{y} = u,\qquad \qquad y(0) = 0,\qquad y(T) \leq A. }$$
(37)

The question then becomes how to use the given amount of agents in the best possible way. Here we choose to minimize the tumor volume. In this formulation, there is no fixed therapy horizon [0, T], but rather the terminal time T is free and it merely represents the time when the minimum tumor volume is being realized. Such models are of practical interest and give an important alternative to the formulations considered earlier. We thus consider the following optimal control problem:

[A] :

for a free terminal time T, minimize the terminal value p(T) of the tumor volume subject to the dynamics (35)–(37) over all Lebesgue-measurable functions \(u: [0,T] \rightarrow [0,u_{\max }]\) for which the corresponding trajectory (p, q, y) satisfies the terminal constraint \(y(T) =\int _{ 0}^{T}u(t)dt \leq A\).

We denote the 3-dimensional state by \(z = (p,q,y)^{T}\) and write the dynamics in the form

$$\displaystyle{ \dot{z} = f(z) + ug(z) }$$
(38)

with

$$\displaystyle{f(z) = \left (\begin{array}{c} -\xi p\ln \left (\frac{p} {q}\right ) \\ bp -\left (dp^{\frac{2} {3} }+\mu \right )q \\ 0 \end{array} \right )\quad \mathrm{and}\quad g(z) = \left (\begin{array}{c} 0\\ -\gamma q \\ 1 \end{array} \right ).}$$

All coefficients are positive parameters and we also assume that \(\gamma u_{\max }> b-\mu> 0\). The first inequality implies that a constant dose rate \(u_{\max }\) eradicates the tumor [47], but is only made in order not to have to distinguish cases; the second inequality is always satisfied for the underlying medical problem. Under these assumptions, in [38] we gave a complete global solution to this optimal control problem in the form of a regular synthesis for all initial data \((p_{0},q_{0},A)\) that are well posed. This simply means that there are enough antiangiogenic agents available to realize a terminal value \(p(T) <p_{0}\) since otherwise the optimal terminal time T is given by T = 0.

Necessary conditions for optimality of a control u are again given by the Pontryagin maximum principle. It is not difficult to see that all available inhibitors will be exhausted along an optimal trajectory \((p_{{\ast}},q_{{\ast}},y_{{\ast}})\), i.e., \(y_{{\ast}}(T) = A\), and that \(p_{{\ast}}(T) = q_{{\ast}}(T)\) holds at the final time. For this problem, the switching function Φ is given by

$$\displaystyle{ \varPhi (t) = \left \langle \lambda (t),g(z(t))\right \rangle =\lambda _{3} -\lambda _{2}(t)\gamma q_{{\ast}}(t) }$$
(39)

and, compared with the models considered in Sect. 3, here the computation of singular controls simplifies since the Lagrangian L is identically zero. It follows from Proposition 2.1 that the derivative of a function of the form \(\varPsi (t) = \left \langle \lambda (t),h(z(t))\right \rangle\) is given by \(\dot{\varPsi }(t) = \left \langle \lambda (t),[f + ug,h](z(t))\right \rangle\) and for the switching function Φ(t) we thus obtain that \(\dot{\varPhi }(t) = \left \langle \lambda (t),[f,g](z(t))\right \rangle\) and

$$\displaystyle{ \ddot{\varPhi }(t) = \left \langle \lambda (t),[f + ug,[f,g]](z(t))\right \rangle, }$$
(40)

with the control u once more only appearing in the second derivative. If u is singular on some open interval I, then these derivatives all vanish on I and if \(\left \langle \lambda (t),[g,[f,g]](z(t))\right \rangle \neq 0\), then (40) can formally be solved for u as

$$\displaystyle{ u_{\mathrm{sing}}(t) = -\frac{\left \langle \lambda (t),[f,[f,g]](z(t))\right \rangle } {\left \langle \lambda (t),[g,[f,g]](z(t))\right \rangle }. }$$
(41)

The strengthened Legendre-Clebsch condition for optimality of the singular control here takes the form

$$\displaystyle{ \left \langle \lambda (t),[g,[f,g]](z(t))\right \rangle <0\qquad \mathrm{for\ all\ }t \in I. }$$
(42)

The determination of singular controls and the analysis of their local optimality properties thus reduces to the computation of the Lie brackets [f, [f, g]] and [g, [f, g]] and their inner products with the multiplier λ. For the model [A], the control vector field g and the Lie brackets [f, g] and [g, [f, g]] are linearly independent and thus the Lie bracket [f, [f, g]] can be written as a linear combination of this basis with coefficients that are smooth functions of the state z, say

$$\displaystyle{[f,[f,g]](z) =\rho (z)g(z) +\varphi (z)[f,g](z) +\psi (z)[g,[f,g]](z).}$$

Along a singular extremal (z, u, λ), the inner products \(\left \langle \lambda (t),g(z(t))\right \rangle\) and \(\left \langle \lambda (t),[f,g](z(t))\right \rangle\) vanish identically and thus

$$\displaystyle{\left \langle \lambda (t),[f,[f,g]](z(t))\right \rangle =\psi (z(t))\left \langle \lambda (t),[g,[f,g]](z(t))\right \rangle.}$$

If the singular control is of order 1, we therefore simply have that

$$\displaystyle{ u_{\mathrm{sing}}(t) = -\psi (z(t)) }$$
(43)

and the singular control is given in feedback form, i.e., as a function of the state z alone that does not depend on the multiplier. Naturally, whether this feedback is admissible still needs to be determined separately.

However, this feedback does not define a singular control everywhere, but only on a thin subset. The reason for this lies in the fact that along extremals also the Hamiltonian H needs to vanish identically and thus, along a singular arc, we also have that \(\left \langle \lambda (t),f(z(t))\right \rangle \equiv 0\) for all t ∈ I. Consequently the multiplier λ(t) vanishes against the vector fields f, g and [f, g] along a singular trajectory. Since λ(t) ≠ 0, it follows that these vector fields must be linearly dependent along the singular arc. Thus (43) only defines a singular control on the surface

$$\displaystyle{\mathcal{S} =\{ z \in \mathbb{R}^{3}:\det \left (f(z),g(z),[f,g](z)\right ) = 0\}}$$

where \(\det \left (f(z),g(z),[f,g](z)\right )\) denotes the determinant of the matrix whose columns are formed by the ordered vectors f(z), g(z), and [f, g](z). Evaluating this formula gives that

$$\displaystyle{\det \left (f(z),g(z),[f,g](z)\right ) =\xi \gamma p\left [bp\left (1 -\ln \left (\frac{p} {q}\right )\right ) -\left (dp^{\frac{2} {3} }+\mu \right )q\right ].}$$
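
The determinant formula can be checked numerically. The following is a minimal sketch, not the authors' computation. It assumes that the drift and control vector fields of problem [A] are those obtained from (45) and (46) with v = 0, together with \(\dot{y} = u\) for the amount of inhibitors used, and it assumes the bracket convention [f, g] = Dg·f − Df·g. The value of γ and the test point are hypothetical; the remaining values are those quoted for Fig. 3.

```python
import math

# Parameter values quoted for Fig. 3; GAMMA is a hypothetical illustrative value.
XI, B_COEF, D_COEF, MU, GAMMA = 0.2, 5.0, 0.01, 0.0, 0.15


def f(z):
    """Assumed drift of [A]: dynamics (45)-(46) with u = v = 0, plus y' = 0."""
    p, q, _y = z
    return [-XI * p * math.log(p / q),
            B_COEF * p - (MU + D_COEF * p ** (2.0 / 3.0)) * q,
            0.0]


def g(z):
    """Assumed control vector field of [A]: coefficient of u, with y' = u."""
    _p, q, _y = z
    return [0.0, -GAMMA * q, 1.0]


def jacobian(field, z, h=1e-5):
    """Central-difference Jacobian; entry [i][j] is d field_i / d z_j."""
    n = len(z)
    cols = []
    for j in range(n):
        step = h * max(1.0, abs(z[j]))
        zp, zm = list(z), list(z)
        zp[j] += step
        zm[j] -= step
        fp, fm = field(zp), field(zm)
        cols.append([(fp[i] - fm[i]) / (2.0 * step) for i in range(n)])
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def lie_bracket(a, b, z):
    """[a, b](z) = Db(z) a(z) - Da(z) b(z) (assumed sign convention)."""
    Da, Db = jacobian(a, z), jacobian(b, z)
    az, bz = a(z), b(z)
    n = len(z)
    return [sum(Db[i][j] * az[j] - Da[i][j] * bz[j] for j in range(n))
            for i in range(n)]


def det3(c1, c2, c3):
    """Determinant of the 3x3 matrix with columns c1, c2, c3."""
    return (c1[0] * (c2[1] * c3[2] - c3[1] * c2[2])
            - c2[0] * (c1[1] * c3[2] - c3[1] * c1[2])
            + c3[0] * (c1[1] * c2[2] - c2[1] * c1[2]))


def det_closed_form(z):
    """Closed-form expression for det(f, g, [f, g]) from the text."""
    p, q, _y = z
    return XI * GAMMA * p * (B_COEF * p * (1.0 - math.log(p / q))
                             - (D_COEF * p ** (2.0 / 3.0) + MU) * q)


z0 = [8000.0, 6000.0, 10.0]          # hypothetical test point
numeric = det3(f(z0), g(z0), lie_bracket(f, g, z0))
print(numeric, det_closed_form(z0))
```

Under these assumptions the bracket reduces to [f, g] = (ξγp, −γbp, 0), which is why the determinant, and hence \(\mathcal{S}\), does not depend on y.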

In particular, \(\mathcal{S}\) is a vertical surface independent of y over a base curve \(\mathcal{S}_{0}\) in (p, q)-space given by \(\mu +dp^{\frac{2} {3} } = bx(1 -\ln x)\) with \(x = \frac{p} {q}\). For the singular control, we have the following explicit formulas:

Proposition 4.1.

If the control u is singular on an open interval (α,β) with corresponding trajectory (p,q), then the singular control is determined in feedback form by

$$\displaystyle{ \gamma u_{\mathrm{sing}}(t) =\varPsi (p(t),q(t)) =\xi \ln \left (\frac{p(t)} {q(t)}\right ) + b\frac{p(t)} {q(t)} + \frac{2} {3}\xi \frac{d} {b} \frac{q(t)} {p^{\frac{1} {3} }(t)} -\left (\mu +dp^{\frac{2} {3} }(t)\right ) }$$
(44)

There exists exactly one connected arc on the singular base curve \(\mathcal{S}_{0}\) along which the control is admissible, i.e., satisfies the bounds \(0 \leq u_{\mathrm{sing}} \leq u_{\max }\) .
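
As an illustration of Proposition 4.1, the sketch below evaluates the feedback (44) on the base curve \(\mathcal{S}_{0}\), parameterized by the quotient x = p/q. Substituting the curve equation \(\mu +dp^{\frac{2} {3} } = bx(1 -\ln x)\) into (44) gives the reduced expression \(\left (\frac{1} {3}\xi +bx\right )\ln x + \frac{2} {3}\xi \left (1 - \frac{\mu } {bx}\right )\), which the code compares against the full formula. The function names are hypothetical, and the values of γ and u max passed to the admissibility check are assumptions of the caller, since γ is not fixed in this section.

```python
import math

# Parameter values quoted for Fig. 3.
XI, B_COEF, D_COEF, MU = 0.2, 5.0, 0.01, 0.0


def psi(p, q):
    """Right-hand side of (44), i.e. gamma * u_sing as a function of (p, q)."""
    return (XI * math.log(p / q)
            + B_COEF * p / q
            + (2.0 / 3.0) * XI * (D_COEF / B_COEF) * q / p ** (1.0 / 3.0)
            - (MU + D_COEF * p ** (2.0 / 3.0)))


def base_curve_point(x):
    """Point (p, q) on S_0 with p / q = x, from mu + d p^(2/3) = b x (1 - ln x)."""
    rhs = B_COEF * x * (1.0 - math.log(x)) - MU
    if rhs <= 0.0:
        raise ValueError("no point of the base curve for this quotient x")
    p = (rhs / D_COEF) ** 1.5
    return p, p / x


def psi_on_curve(x):
    """Reduced form of (44) after substituting the base-curve equation."""
    return ((XI / 3.0 + B_COEF * x) * math.log(x)
            + (2.0 / 3.0) * XI * (1.0 - MU / (B_COEF * x)))


def is_admissible(x, gamma, u_max):
    """Check 0 <= u_sing <= u_max on the base curve (gamma, u_max supplied by caller)."""
    u = psi_on_curve(x) / gamma
    return 0.0 <= u <= u_max


for x in (0.9, 1.0, 1.5, 2.0, 2.6):
    p, q = base_curve_point(x)
    print(x, psi(p, q), psi_on_curve(x), is_admissible(x, 0.15, 75.0))
```

The reduced form depends on the state only through x, which is consistent with Fig. 3 plotting the singular control against the quotient p/q.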

Figure 3 illustrates the petal-like singular curve \(\mathcal{S}_{0}\) for \(u_{\max } = 75\) with the admissible portion marked as a solid curve for the parameter values ξ = 0.2, b = 5, d = 0.01, and μ = 0. The qualitative structure shown in this figure is generally valid, but the admissible portion shrinks with smaller values of \(u_{\max }\).

Fig. 3

The singular control \(u_{\mathrm{sing}}\) plotted as a feedback function of the quotient \(x = \frac{p} {q}\) (left) and the singular base curve \(\mathcal{S}_{0}\) plotted in (p, q)-space (right) with the admissible part marked by the solid portion of the curve. Away from this solid segment the singular control is either negative or exceeds the limit \(u_{\max }\).

The structure of optimal controls and trajectories is summarized in the following theorem:

Theorem 4.1 ([38]).

Given well-posed initial data \((p_{0},q_{0},A)\), optimal controls are at most concatenations of 4 pieces in the form \(\mathbf{bsu}_{\max }\mathbf{0}\), with \(\mathbf{0}\) denoting an arc along the constant control u = 0, \(\mathbf{u}_{\max }\) denoting an arc along the constant control \(u = u_{\max }\), \(\mathbf{b}\) standing for either \(\mathbf{u}_{\max }\) or \(\mathbf{0}\), and \(\mathbf{s}\) denoting an arc in the singular surface \(\mathcal{S}\).

This result provides an upper bound on the number of segments for optimal controls and it significantly limits the structure of possible concatenations. For the medically most relevant case of initial conditions \((p_{0},q_{0})\) that represent a growing tumor with high carrying capacity, \(p_{0} <q_{0}\), and ample supply A of inhibitors, optimal controls typically have the following structure: initially they are given by a segment of full dose therapy, \(u \equiv u_{\max }\), until the corresponding trajectory meets the singular surface \(\mathcal{S}\). At this point, the optimal control changes to the singular control and antiangiogenic agents are administered at these singular dose rates until all angiogenic inhibitors have been exhausted. During that phase, the corresponding trajectory evolves on the singular surface \(\mathcal{S}\). Since the singular surface lies in the region p > q, after termination of therapy, the tumor volume will still be decreasing (due to after effects) even if no more agents are administered as long as the trajectory remains in the region p > q. The minimum tumor volume will then be realized as the trajectory reaches the diagonal, p = q. Thus, for these cases optimal controls follow the shorter concatenation sequence \(\mathbf{u}_{\max }\mathbf{s0}\). This is the typical structure of optimal controlled trajectories for medically relevant initial conditions, but it depends on two facts: (i) the overall amount of inhibitors is large enough to reach the singular arc in its admissible range, but (ii) it is not so large that the singular control would saturate along the singular arc, i.e., would reach the limit \(u_{\max }\). If (i) is violated and trajectories either do not reach \(\mathcal{S}\) at all or reach \(\mathcal{S}\) in its inadmissible part, then the singular control never becomes an option and in this case optimal controlled trajectories will simply be given by up-front administration of all antiangiogenic agents at full dose rates. In such a case, optimal controls are bang-bang with exactly one switching from \(u = u_{\max }\) to u = 0, i.e., of the type \(\mathbf{u}_{\max }\mathbf{0}\). If condition (ii) is violated, then optimal concatenation sequences of the forms \(\mathbf{0su}_{\max }\mathbf{0}\) and \(\mathbf{u}_{\max }\mathbf{su}_{\max }\mathbf{0}\) arise.

The synthesis of optimal trajectories then consists of a unique covering of the full state space by controlled trajectories with the optimal control \(u_{\mathrm{opt}} = u_{\mathrm{opt}}(p,q,y)\) identifying the optimal dose rates as a function of an arbitrary point (p, q; y) of the state. Intuitively, a synthesis acts like a “GPS system” showing for every possible state of the system how optimal protocols are administered, both qualitatively and quantitatively. The variable y merely accounts for the amount of inhibitors that has already been used and it is more convenient, and more illustrative, to show the projections of trajectories into the (p, q)-plane. With only a slight abuse of terminology, we do not distinguish in our language between the trajectories in (p, q; y)-space and their projections onto the (p, q)-coordinates. Figure 4 shows this projection and also identifies a typical optimal control of the form \(\mathbf{u}_{\max }\mathbf{s0}\) described above.

Fig. 4

Synthesis of optimal controlled trajectories for the problem [A]

In Fig. 5, as an example, we show the optimal controlled trajectory (on the left) and its corresponding control (on the right) for the initial condition \((p_{0},q_{0}) = (12000\ [mm^{3}],15000\ [mm^{3}])\) and the values ξ = 0.084, b = 5.85, d = 0.00873 taken from [16] and μ = 0.2. The optimal control is of the type \(\mathbf{u}_{\max }\mathbf{s0}\): it takes the maximal value \(u = u_{\max }\) for a short interval from 0 to \(t_{1} = 0.0905\) [days] when the trajectory reaches the singular arc. At this point, the control switches to the time-varying singular control defined by the singular feedback (44) until all inhibitors are exhausted at time \(t_{2} = 6.5579\) [days]. Then, due to after effects, the minimum value of the tumor volume is realized a short period later at the final time T = 6.7221 [days] when the trajectory for u = 0 reaches the diagonal. Note the extremely fast q-dynamics away from the singular arc. Partly this is caused by the numerical values that were used, which are based on Lewis lung carcinoma in mice, a fast growing cancer. Although the almost horizontal trajectory segments along the controls u = 0 and \(u = u_{\max }\) are sizable, the time spent along these pieces is small. Most of the time the control is singular and the trajectory follows the associated singular arc (whose projection in the (p, q)-space is a subset of the base curve \(\mathcal{S}_{0}\)), but this dynamics is much slower. The optimal final value is given by \(p(T) = 8533.4\ [mm^{3}]\). The optimal trajectory is shown as a solid curve in Fig. 5 and the singular curve \(\mathcal{S}\) and the diagonal \(\mathcal{D}_{0}\) are shown as dotted curves.

Fig. 5

Example of an optimal \(\mathbf{u}_{\max }\mathbf{s0}\) controlled trajectory (left) and associated control as function of time (right) for initial data \((p_{0},q_{0},A) = (12000,15000,300)\)

4.2 Robustness Properties and Realizable Suboptimal Protocols

Singular controls play an essential role in determining the overall structure of optimal controlled trajectories for this problem. While Lie algebraic computations provide an elegant framework in which the singular controls and corresponding arcs can be determined analytically, these formulas are given as feedback controls that administer time-varying partial doses that are determined by the current state of the system, that is, the tumor volume p and its carrying capacity q. Even at the initial time, while a reasonably reliable estimate for the tumor volume \(p_{0}\) may be available, the carrying capacity of the vasculature, \(q_{0}\), is a highly idealized quantity and there exist no methods to measure it. The value of the theoretical optimal solution that was derived, apart from giving interesting qualitative insights into the underlying system, does not primarily lie in providing a feasible strategy, but in clarifying what in principle is possible. In fact, for many practical problems, this precisely is the contribution that optimal control methodologies provide. Then, based on the benchmarks that the theoretically optimal solutions provide, it becomes important to formulate simple, easily implementable, but also robust strategies that could be employed even in the face of great uncertainty in the parameters and the state of the system [31, 39]. The solution described above indeed exhibits strong robustness properties with respect to parameter values and this in particular is valid with respect to the initial values \(q_{0}\) of the carrying capacity. Excellent approximations to the theoretically optimal solution are obtained by simply taking a constant control whose dose rate is given by the averaged optimal dose rate protocol, i.e.,

$$\displaystyle{\bar{u} \equiv \frac{1} {T_{\mathrm{opt}}}\int _{0}^{T_{\mathrm{opt}} }u_{\mathrm{opt}}(t)dt = \frac{A} {T_{\mathrm{opt}}},}$$

where \(u_{\mathrm{opt}}\) denotes the optimal control as a function of time, \(T_{\mathrm{opt}}\) is the time when all antiangiogenic agents have been used up and, as before, A denotes the a priori specified overall amount of agents to be given. Since all antiangiogenic agents are used along the optimal control, the integral is simply given by this total amount A. The final interval when the tumor volume still decreases because of after effects is not included in this computation. Figure 6, on the left, shows a comparison of the graphs of the minimum tumor volumes realized as a function of the initial carrying capacity \(q_{0}\) by the optimal control (solid red curve), a full dose rate protocol where antiangiogenic agents are given at maximum dose rate \(u_{\max }\) (dashed blue curve), the half-dose rate protocol (dash-dotted blue curve), the averaged optimal control protocol (dashed black curve), and the best constant dose rate protocol (dash-dotted black curve) for the fixed initial tumor volume \(p_{0} = 12{,}000\ [mm^{3}]\). The curves for the averaged optimal control protocols and the best constant dose protocol are very close and basically lie on top of each other in the figure with only minute differences for very low and very high tumor volumes. On the right of the same figure we show a graph of the averaged optimal control as a function of \(q_{0}\). These constant dose rates only vary between 45.1 and 45.7, showing the strong robustness of the solutions with respect to \(q_{0}\).

Fig. 6

The minimum tumor volumes and controls for \(p_{0} = 12{,}000\ [mm^{3}]\)

For the initial condition \((p_{0},q_{0}) = (12{,}000\ [mm^{3}],15{,}000\ [mm^{3}])\), all antiangiogenic agents are used up at time 6.558 [days] along the optimal solution. It is not difficult to compute the best protocol that would give the same total amount in 6 constant daily doses and these dose rates are given by

$$\displaystyle{u_{1} = 46.61,\ u_{2} = 45.31,\ u_{3} = 48.15,\ u_{4} = 50.71,\ u_{5} = 53.20,\ \mathrm{and}\ u_{6} = 56.02.}$$

The values closely mimic the structure of the theoretically optimal control shown in Fig. 5. There is a small dip in the dosage from the first to the second day which is caused by the fact that the piece along which the optimal dose rate is \(u_{\max }\) is small and thus the first daily value is significantly lower than \(u_{\max } = 75\), but still higher than the second daily dose. Then the dosages gradually increase over the remaining days. This reflects the dose intensification along the optimal singular arc. Yet, specifying the time structure by restricting to daily doses reduces the quality of the approximation somewhat.
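
The arithmetic behind the two realizable protocols can be verified in a few lines. This is only a bookkeeping sketch using the numbers quoted above (A = 300 from the example of Fig. 5, the exhaustion time of 6.5579 days, and the six daily dose rates); the variable names are hypothetical, and the sketch does not re-derive the doses.

```python
# Total amount A of inhibitors and exhaustion time T_opt from the Fig. 5 example.
A_TOTAL = 300.0
T_OPT = 6.5579  # [days]

# Averaged optimal dose rate: u_bar = A / T_opt.
u_bar = A_TOTAL / T_OPT

# Six constant daily dose rates quoted in the text, each applied for one day.
daily_doses = [46.61, 45.31, 48.15, 50.71, 53.20, 56.02]
total_given = sum(dose * 1.0 for dose in daily_doses)  # dose rate * 1 day

print(f"averaged dose rate u_bar = {u_bar:.3f}")
print(f"total amount in daily protocol = {total_given:.2f}")
```

The daily doses add up to the same total amount A as the optimal protocol, so the comparison between the protocols is made at equal drug consumption.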

4.3 Combination of Antiangiogenic and Chemotherapy

Antiangiogenic therapy only attacks tumor growth indirectly through the vasculature and it is natural to combine it with a second therapy that directly attacks the tumor cells such as radio- or chemotherapy. We still consider a model that adds the action of a cytotoxic agent v, but again, rather than including the drug dosage as a penalty term in the objective, limits the overall amount of drugs given.

[AC] :

For a free terminal time T, minimize the tumor volume p(T) subject to the dynamics

$$\displaystyle\begin{array}{rcl} \dot{p} =\xi p\ln \left (\frac{q} {p}\right ) -\varphi pv,\qquad \quad p(0) = p_{0},& & {}\end{array}$$
(45)
$$\displaystyle\begin{array}{rcl} \dot{q} = bp -\left (dp^{\frac{2} {3} }+\mu \right )q -\gamma qu -\eta qv,\qquad \quad q(0) = q_{0},& & {}\end{array}$$
(46)

over all Lebesgue-measurable functions \(u: [0,T] \rightarrow [0,u_{\max }]\) and \(v: [0,T] \rightarrow [0,v_{\max }]\) for which the corresponding trajectory satisfies the terminal constraints

$$\displaystyle{ \int _{0}^{T}u(t)dt \leq A\qquad \mathrm{and}\qquad \int _{ 0}^{T}v(t)dt \leq B. }$$
(47)

An important feature of the optimal solution—and one that is not at all obvious—is that it builds in a modular way on the solution of the antiangiogenic monotherapy problem [A] already given [52]. Indeed, for a typical initial condition with p 0 < q 0, optimal controls for the combination therapy problem [AC] have the following structure: optimal controls for the antiangiogenic agent follow the optimal solution for the monotherapy problem and then, at a specific time, chemotherapy becomes active and is given in one full dose session. Both controls cannot be singular simultaneously and the formulas given above for the singular control and singular arc need to be adjusted to the presence of chemotherapy, but this is readily done and we have the following result:

Proposition 4.2 ([52]).

If the optimal antiangiogenic dose rate u follows the singular control \(u_{\mathrm{sing}}\) on an open interval I, then the chemotherapeutic agent v is bang-bang on I with at most one switching from v = 0 to \(v = v_{\max }\), and the following relation holds between the controls u and v:

$$\displaystyle{ \gamma u_{\mathrm{sing}}(t) + \left (\eta -\varphi \right )v(t) =\varPsi (p(t),q(t)) }$$
(48)

with Ψ defined by equation (44). Given v, this determines the anti-angiogenic dose rate with a jump discontinuity when chemotherapy becomes active.
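
The size of this jump can be read off from (48). As a short worked step, assuming that the switch of v from 0 to \(v_{\max }\) occurs at a time τ in the interior of I: since p and q, and hence Ψ(p(t), q(t)), are continuous in t, subtracting the left and right limits of (48) at τ gives

$$\displaystyle{ \gamma \left (u_{\mathrm{sing}}(\tau +) - u_{\mathrm{sing}}(\tau -)\right ) = -\left (\eta -\varphi \right )v_{\max }. }$$

The antiangiogenic dose rate therefore drops at τ if η > φ and rises if η < φ.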

This structure allows one to set up a simple minimization problem over a 1-dimensional parameter τ that denotes the time when chemotherapy becomes active. We illustrate this for an initial condition \((p_{0},q_{0})\) with \(p_{0} <q_{0}\) where the antiangiogenic agent is immediately applied with full dose. In principle, this time τ when chemotherapy is activated can lie anywhere in [0, T]. For example, if the total amount B of chemotherapeutic agents in (47) is high, then it is possible that chemotherapy already becomes active along the interval when the antiangiogenic dose rate is at maximum. On the other hand, if this amount is very low, this activation may only occur after all antiangiogenic agents have been exhausted. The typical case, however, is that this time τ lies somewhere in the interval where the control u follows the singular monotherapy structure. Figure 7 shows an example of numerically computed optimal controls for the combination therapy problem [30].

Fig. 7

An optimal solution for the combination therapy problem [AC]

This structure of optimal controls for the combination therapy has an interesting medical interpretation: optimization leads to the conclusion that it is best to follow specific “paths” along which maximum tumor reductions are achieved. This holds for both the monotherapy problem [A] and the combination therapy problem [AC] and these paths, as expressed in the formula (48), are closely linked with the optimal singular arc from the monotherapy problem. Note that the singular curve \(\mathcal{S}_{0}\) lies in the region where the tumor volume p is higher than its carrying capacity q, but there exists a specific relation between these variables. Clearly q is not pushed to zero too fast, but a definite balance between these two variables is maintained along the optimal solution. Since the vascular network of the tumor is needed to deliver the chemotherapeutic agents, this makes perfect sense. In the medical literature, similar features have been observed and are known as “pruning” [19, 20]. It has been argued by Jain in [19] that the preliminary delivery of antiangiogenic agents may regularize a tumor’s vascular network with beneficial consequences for the successive delivery of cytotoxic chemotherapeutic agents. Although no “pruning” aspects have been taken into account in the model (e.g., see [50] for such a model), it is interesting to note that an optimization approach for a rather small and minimally parameterized high-level mathematical model leads to very much the same conclusion: give antiangiogenic agents until an optimal relation between tumor volume and carrying capacity has been established and then apply full dose chemotherapy while still maintaining the optimal relation between p and q through the administration of antiangiogenic agents. Even when antiangiogenic treatment is combined with radiotherapy, this feature seems to persist with the optimal monotherapy solution once more playing a major role in the structure of optimal controls for the combination [40].

5 Tumor-Immune System Interactions

A second major component of a tumor’s microenvironment is the immune system. The immune system’s first response to its environment is on the basis of a discrimination between “own” and “foreign” objects and some tumor cells will simply be classified as “own” and thus tolerated [54]. However, tumor cells also exhibit a large number of abnormalities (such as mutated proteins, under- or over-expressed normal proteins and many more) that lead to the appearance of specific antigens, some of which will be classified as “foreign” and thus do trigger reactions by both the innate and adaptive immune system [23, 65]. In fact, the empirical hypothesis of immune surveillance, i.e., that the immune system may act to eliminate or control tumors, is well established in the medical community. The competitive interaction between tumor cells and the immune system is extremely complex and strongly nonlinear. The possible outcome of this interplay is not only constituted by tumor suppression or tumor outbreak but by a multitude of dynamic properties that include the persistence of both benign and malignant scenarios (e.g., see [45, 53]). Here we still consider a classical mathematical model by Stepanova [63] that captures these features of tumor-immune interactions in a low-dimensional, minimally parameterized model. In Sect. 5.1 we describe the model and consider the uncontrolled multi-stable dynamics which has both a benign and malignant region [32, 53]. We then in Sect. 5.2 set up an optimal control problem that induces the system to move from the malignant into the benign region under chemotherapy. After a brief administration of maximum dose chemotherapy, optimal protocols switch to singular controls and significantly lower dose rates [32]. In the medical literature such protocols are sometimes referred to as “chemo-switch” protocols [57].

5.1 Multi-stability and Regions of Attraction

We briefly recall Stepanova’s model. Let x denote the tumor volume with a fixed carrying capacity \(x_{\infty } <\infty\) and let y be a non-dimensional order of magnitude variable related to the activities of various types of T-cells activated during the immune reaction. We shall refer to y as the immunocompetent cell density. While Stepanova uses an exponential model for the growth of the tumor, here, as in [72], we consider a Gompertzian tumor growth model. The dynamical equations of the model are given by

$$\displaystyle\begin{array}{rcl} \dot{x} = -\mu _{C}x\ln \left ( \frac{x} {x_{\infty }}\right ) -\gamma xy,& &{}\end{array}$$
(49)
$$\displaystyle\begin{array}{rcl} \dot{y} =\mu _{I}\left (x -\beta x^{2}\right )y -\delta y+\alpha,& &{}\end{array}$$
(50)

with all Greek letters denoting constant coefficients. The second equation summarizes the main features of the immune system’s reaction to cancer. Several organs contribute to the development of immune cells in the body and the parameter α models a combined rate of influx of T-cells generated through these primary organs; δ is simply the rate of natural death of the T-cells. The first term in this equation models the proliferation of lymphocytes. For small tumors, it is stimulated by the tumor antigen which can be assumed to be proportional to the tumor volume x. It is argued in [63] that large tumors suppress the activity of the immune system. The reasons lie in an inadequate stimulation of the immune forces as well as a general suppression of immune lymphocytes by the tumor (see [63] and the references therein). This feature is expressed in the model through the inclusion of the term \(-\beta x^{2}\). Thus 1∕β corresponds to a threshold beyond which the immunological system becomes depressed by the growing tumor. The coefficients \(\mu _{I}\) and β are used to calibrate these interactions and in the product with y collectively describe a state-dependent influence of the cancer cells on the stimulation of the immune system. The first equation models tumor growth. The coefficient γ denotes the rate at which cancer cells are eliminated through the activity of T-cells and the term γxy thus models the beneficial effect of the immune reaction on the cancer volume. Lastly, \(\mu _{C}\) simply is a tumor growth coefficient.

For our numerical computations we use the following parameter values that are based on the paper [27] by Kuznetsov, Makalkin, Taylor, and Perelson who estimate these parameters based on in vivo experimental data for B-lymphoma \(\mathrm{BCL}_{1}\) in the spleen of mice: α = 0.1181, β = 0.00264, γ = 1, δ = 0.37451, \(\mu _{C} = 0.5618\) and \(\mu _{I} = 0.00484\). In that paper, a classical logistic growth term is used for cancer growth and we therefore adjusted the growth rates to account for Gompertzian growth using linear data fitting. Also, the functional form \(\left (x -\beta x^{2}\right )y\) used in Stepanova’s model in equation (50) is a quadratic expansion of the term used in [27]. Following [27], x is given in multiples of \(10^{6}\) cells and y is a dimensionless quantity that describes the immunocompetent cell density on an order of magnitude basis relative to base value 1. The time scale is taken relative to the tumor cell cycle and is in terms of 0.11 days [27]. As always, we simply use these particular values to illustrate our analytical results.

There always exists a disease-free equilibrium point at \((x_{f},y_{f}) = (0, \frac{\alpha }{\delta })\) which is unstable. For the parameter values given above, there exist three equilibria with positive tumor volumes and Fig. 8 shows the phase portrait of the system. There is an asymptotically stable focus at \((x_{b},y_{b}) = (72.961,1.327)\) (marked by a green star), a saddle point at \((x_{s},y_{s}) = (356.174,0.439)\) (marked by a black star), and an asymptotically stable node at \((x_{m},y_{m}) = (737.278,0.032)\) (marked by a red star). In the diagram we have also marked the stable manifold of the saddle as a dashed red curve. The regions of attraction of the stable equilibria are the open regions that are separated by this stable manifold of the saddle.
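
The equilibrium structure can be reproduced approximately with a few lines of code. The sketch below is hedged in one important respect: the carrying capacity \(x_{\infty }\) is not stated in this section, and the value 780 used here is an assumption, chosen because it is roughly consistent with the quoted equilibria. The function names are hypothetical. Positive equilibria are found by eliminating y through the nullcline of (49) and bisecting the resulting scalar equation from (50); the type of each equilibrium is read off from the trace and determinant of the Jacobian.

```python
import math

# Parameter values quoted in the text; X_INF = 780 is an ASSUMPTION (not given here).
ALPHA, BETA, GAMMA, DELTA = 0.1181, 0.00264, 1.0, 0.37451
MU_C, MU_I = 0.5618, 0.00484
X_INF = 780.0


def y_nullcline(x):
    """y on the nullcline of (49) for x > 0: mu_C ln(x / x_inf) + gamma y = 0."""
    return -(MU_C / GAMMA) * math.log(x / X_INF)


def residual(x):
    """Right-hand side of (50) evaluated on the nullcline of (49)."""
    y = y_nullcline(x)
    return MU_I * (x - BETA * x * x) * y - DELTA * y + ALPHA


def bisect(func, lo, hi, tol=1e-10):
    """Plain bisection on an interval where func changes sign."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def positive_equilibria(n_grid=2000):
    """Scan (1, x_inf) for sign changes of the residual and refine each root."""
    xs = [1.0 + k * (X_INF - 1.0) / n_grid for k in range(n_grid + 1)]
    found = []
    for a, b in zip(xs, xs[1:]):
        if residual(a) * residual(b) < 0:
            x = bisect(residual, a, b)
            found.append((x, y_nullcline(x)))
    return found


def classify(x, y):
    """Classify an equilibrium of (49)-(50) from the Jacobian's trace and determinant."""
    j11 = -MU_C * math.log(x / X_INF) - MU_C - GAMMA * y
    j12 = -GAMMA * x
    j21 = MU_I * (1.0 - 2.0 * BETA * x) * y
    j22 = MU_I * (x - BETA * x * x) - DELTA
    tr, det = j11 + j22, j11 * j22 - j12 * j21
    if det < 0:
        return "saddle"
    kind = "focus" if tr * tr - 4.0 * det < 0 else "node"
    return ("stable " if tr < 0 else "unstable ") + kind


for x_eq, y_eq in positive_equilibria():
    print(f"x = {x_eq:8.3f}, y = {y_eq:6.3f}: {classify(x_eq, y_eq)}")
```

Because \(x_{\infty }\) is assumed and the parameters are rounded, the computed roots should land close to, but not exactly on, the equilibria quoted above.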

Fig. 8

Phase portrait of the uncontrolled system (49) and (50)

We call a locally asymptotically stable equilibrium point \((x_{{\ast}},y_{{\ast}})\) of the equations (49) and (50) malignant if the corresponding tumor volume x is close to the carrying capacity of the system, benign if it is by an order of magnitude smaller. The corresponding regions of attraction are the malignant and benign regions, respectively. In case of a microscopic benign equilibrium, this region can be interpreted as the set of all states of the system where the immune system is able to control the cancer and this is one possible way of describing what medically has been called immune surveillance. The region of attraction of the macroscopic equilibrium point, on the other hand, corresponds to conditions when the system has escaped from this immune surveillance and the disease will be lethal. Obviously, an interesting structure is the boundary between these two behaviors that is formed by the stable manifold of the saddle point. The natural therapeutic question then becomes how to move the state back into the benign region if it has been displaced into the malignant region.

5.2 Optimal Control for Tumor-Immune Interactions with Strongly Targeted Drugs

We now consider equations (49) and (50) with a cytotoxic agent u and a rudimentary immune boost v. As a simpler scenario, we assume that the chemotherapeutic agent is strongly targeted towards the tumor cells and therefore neglect its effects on the immunocompetent cell densities. Once more employing the standard log-kill assumption, this leads to the following equations:

$$\displaystyle\begin{array}{rcl} \dot{x} = -\mu _{C}x\ln \left ( \frac{x} {x_{\infty }}\right ) -\gamma xy -\kappa xu,& x(0)& = x_{0},{}\end{array}$$
(51)
$$\displaystyle\begin{array}{rcl} \dot{y} =\mu _{I}\left (x -\beta x^{2}\right )y -\delta y +\alpha +\rho yv,& y(0)& = y_{ 0}.{}\end{array}$$
(52)

Given the multi-stable scenario, the practical aim of therapy is to move an initial state \((x_{0},y_{0})\) that lies in the malignant region into the region of attraction of the benign equilibrium point while keeping side effects tolerable. For this, we consider the following optimal control problem:

[CI] :

for a free terminal time T, minimize the objective

$$\displaystyle{ J = Ax(T) - By(T) +\int _{ 0}^{T}\left (Cu(t) + Dv(t) + S\right )dt, }$$
(53)

over all Lebesgue-measurable functions \(u: [0,T] \rightarrow [0,1]\) and \(v: [0,T] \rightarrow [0,1]\) subject to the dynamics (51) and (52).

The choice of the weights aims at striking a balance between the benefit at the terminal time T, Ax(T) − By(T), and the overall side effects measured by the total amounts of drugs given, while at the same time it guarantees the existence of an optimal solution by also penalizing the free terminal time T. The most important piece is the penalty term Ax(T) − By(T) at the final time that is designed to induce the state of the system to move from the malignant into the benign region. In order to accomplish this, it may no longer be adequate to simply minimize the tumor volume since, as can be seen in Fig. 8, small tumor volumes are possible that lie in the malignant region if the immune system is depressed. Rather, the geometric shape of the separatrix matters. While it is generally not possible to give an analytic description of the separatrix, its tangent space at the saddle is easily computed and its normal vector can serve as a reasonable direction in which we want the system to move. This is what we have done here, giving the numerical values A = 0.00192 and B = 1 for the data used earlier.
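
The weights A and B can be recovered, at least approximately, from the linearization at the saddle. The following is a sketch of one way to do this, not necessarily the authors' procedure. It uses the saddle coordinates and parameter values quoted in Sect. 5.1 and the fact that, at an equilibrium of (49) and (50) with x > 0, the diagonal Jacobian entries simplify to \(-\mu _{C}\) and −α∕y. The stable eigenvector spans the tangent line to the separatrix, and (A, −B) is taken normal to it with B normalized to 1. Function names are hypothetical.

```python
import math

# Parameter values and saddle coordinates quoted in Sect. 5.1.
ALPHA, BETA, GAMMA = 0.1181, 0.00264, 1.0
MU_C, MU_I = 0.5618, 0.00484
X_S, Y_S = 356.174, 0.439


def equilibrium_jacobian(x, y):
    """Jacobian of (49)-(50) at an equilibrium with x > 0.

    At such a point d(x')/dx = -mu_C and d(y')/dy = -alpha / y.
    """
    return ((-MU_C, -GAMMA * x),
            (MU_I * (1.0 - 2.0 * BETA * x) * y, -ALPHA / y))


def stable_eigenpair(jac):
    """Negative eigenvalue of a 2x2 saddle matrix and a matching eigenvector."""
    (a, b), (c, d) = jac
    tr, det = a + d, a * d - b * c
    lam = 0.5 * (tr - math.sqrt(tr * tr - 4.0 * det))
    # First row of (J - lam I) v = 0 reads (a - lam) v1 + b v2 = 0.
    return lam, (b, lam - a)


def terminal_weights(x, y):
    """Weights (A, B) with (A, -B) normal to the stable direction, B = 1."""
    _lam, (v1, v2) = stable_eigenpair(equilibrium_jacobian(x, y))
    return v2 / v1, 1.0


lam_s, v_s = stable_eigenpair(equilibrium_jacobian(X_S, Y_S))
A_w, B_w = terminal_weights(X_S, Y_S)
print(f"stable eigenvalue = {lam_s:.4f}, A = {A_w:.5f}, B = {B_w}")
```

With this choice the level sets of Ax − By are parallel to the tangent line of the separatrix at the saddle, so decreasing the terminal penalty pushes the state across the separatrix rather than merely toward small x.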

Once more, optimal controls for the cytotoxic agent consist of concatenations of bang and singular pieces. It can be shown that optimal administration of the immune boost v is bang-bang [29] and analytical formulas for a singular control u and arc can be derived, albeit with slightly different reasoning than above [29, 32]. The typical optimal control u is a concatenation of four pieces of the type \(\mathbf{1s01}\): therapy starts with a short maximum dose therapy session followed by a segment where the control is singular. Along this segment, the system moves along the singular arc from the malignant into the benign region. It is this transfer that matters and the tumor volume may actually increase along this segment. Once the state is safely inside the benign region, at one point therapy stops, i.e., the optimal control switches to u = 0. This portion of the trajectory closely follows the unstable manifold of the saddle for the uncontrolled system and leads to a “free pass,” a trajectory along which no cost is incurred if S = 0. (The existence of such structures leads to issues about the existence of optimal controls and for this reason, we generally impose a small penalty S on the terminal time.) Along this portion of the controlled trajectory, the actions of the immune system take over. Quite frequently, after a prolonged rest period, optimal controls still give a short maximum dose chemotherapy and immune boost towards the end.

Figure 9 shows the optimal controlled trajectory for C = 0.01, D = 0.025, and S = 0.001 [29]. In the figure of the controlled trajectory, switching points for the cytotoxic agent are indicated by a red asterisk and those for the immune boost by a green asterisk. Initially chemotherapy is given at full dose without immune boost. Already after a brief time interval, as the state of the system nears the separatrix, chemotherapy is reduced drastically and is only administered at lower dose rates according to the singular control \(u_{\mathrm{sing}}\), and we clearly see the “chemo-switch”-type behavior of administration of a chemotherapeutic agent as optimal. In these solutions, the tumor microenvironment plays a major role: the initial chemotherapy is only designed to bring the state of the system into a region where the immune system is potent enough to control (not necessarily eliminate or eradicate) the cancer. If possible, this aim is achieved with low doses of chemotherapy. In fact, although such a structure is not included in the model, higher doses may be harmful in that they might adversely affect the immune system which otherwise would have come to the assistance in combating the tumor.

Fig. 9

Optimal control (left) and corresponding controlled trajectory (right) for C = 0.01, D = 0.025, and S = 0.001. (Reproduced with permission from [29], ©2013, AIMS)

6 Conclusion

We have outlined the qualitative type of results that can be obtained about cancer treatment protocols from an optimal control analysis of high-level mathematical models. Initially, the focus was on the cancerous cells progressing from mathematical models for homogeneous tumor populations of chemotherapeutically sensitive cells to heterogeneous structures of cell populations with varying sensitivities and resistance. From an optimal control point of view, optimal treatment schedules change from bang-bang solutions with upfront dosing (the classical MTD approaches in medicine) to administrations that favor singular controls (time-varying dosing schedules at less than maximum rates) as heterogeneity of the tumor population becomes more prevalent. Once the main components of the tumor microenvironment, its vasculature, and the immune system, are taken into account, in optimal solutions, more is not necessarily better. In this context, and in view of the fact that a properly calibrated dose (which does not waste agents nor have excessive side effects) can deliver the best outcomes, in medical research the search for a “biologically optimal dose” (BOD) is being pursued. In the model for antiangiogenic treatments it becomes clear that full dose therapies do waste agents that can be used more effectively when spread out at lower doses over prolonged time periods. The mathematical solution supports the idea of a normalization of the vasculature prior to the administration of chemotherapy, but then cytotoxic agents are given at the appropriate time in an MTD fashion. In a certain sense, an ideal tumor size-vasculature pattern is sought first which leads to an optimal tumor kill potential that then is exploited by maximum dose chemotherapy. However, as also the immune system is taken into account, chemo-switch protocols become optimal. 
The reason simply is that when the system is in a state where the actions of the immune system are able to control cancer growth, it is overall preferable (in view of the toxic side effects of chemotherapy) to administer lower doses.

Clearly, the models considered here are simplified, and this is natural at the high level of agglomeration that underlies their construction. While biological and medical research prefers to be as detailed as possible in their models, this also makes them amenable to the pitfall of Borges’s “exactitude in science.” The question simply is to what extent a model needs to be accurate to make significant and realistic predictions. In our view, the smaller the model is to give the relevant conclusion, the better it is. The conclusions that we obtain from these minimally parameterized models would suggest that these models lead to realistic statements about the structure of optimal treatment protocols that should be of interest in medical practice. In fact, the question how to schedule chemotherapeutic drugs in order to optimize their antitumor, antivasculature, and proimmune effects is far from being answered and there are concerted efforts in medical research to explore the benefits of metronomic scheduling in this respect [1, 56]. Qualitative mathematical results about optimal protocols that take into account a tumor’s microenvironment can be of assistance in these efforts.