
The problems to which the maximum principle derived in the previous chapter was applicable had constraints involving only the control variables. We will see that in many applied models it is necessary to impose constraints involving both control and state variables. Inequality constraints involving control and possibly state variables are called mixed inequality constraints.

In the solution spaces of problems with mixed constraints, there may be regions in which one or more of the constraints is tight. When this happens, the system must be controlled in such a way that the tight constraints are not violated. As a result, the maximum principle of Chap. 2 must be revised so that the Hamiltonian is maximized subject to the constraints. This is done by adjoining the mixed constraints, multiplied by the associated Lagrange multipliers, to the Hamiltonian to form a Lagrangian, and then setting the derivatives of the resulting Lagrangian with respect to the control variables to zero.

In Sect. 3.1, a Lagrangian form of the maximum principle is discussed for models in which there are some constraints that involve only control variables, and others that involve both state and control variables simultaneously. Problems having pure state variable inequality constraints, i.e., those involving state variables but no control variables, are more difficult and will be dealt with in Chap. 4.

In Sect. 3.2, we state conditions under which the Lagrangian maximum principle is also sufficient for optimality.

Economists frequently analyze optimal control problems involving a discount rate. By combining the discount factor with the adjoint variables and the Lagrange multipliers and making suitable changes in the definitions of the Hamiltonian and Lagrangian functions, it is possible to derive the current-value formulation of the maximum principle as described in Sect. 3.3.

It is often the case in finite horizon problems that some restrictions are imposed on the state variables at the end of the horizon. In Sect. 3.4, we discuss the transversality conditions to be satisfied by the adjoint variable in special cases of interest. Section 3.5 is devoted to the study of free terminal time problems where the terminal time itself is a decision variable to be determined. Models with infinite horizons and their stationary equilibrium solutions are covered in Sect. 3.6.

Section 3.7 presents a classification of a number of the most important and commonly used kinds of optimal control models, together with a brief description of the forms of their optimal solutions. The reader may wish to refer to this section from time to time while working through later chapters in the book.

3.1 A Maximum Principle for Problems with Mixed Inequality Constraints

We will state the maximum principle for optimal control problems with mixed inequality constraints without proof. For further details see Pontryagin et al. (1962), Hestenes (1966), Arrow and Kurz (1970), Hadley and Kemp (1971), Bensoussan et al. (1974), Feichtinger and Hartl (1986), Seierstad and Sydsæter (1987), and Grass et al. (2008).

Let the system under consideration be described by the following vector differential equation

$$\displaystyle{ \dot{x} = f(x,u,t),\;x(0) = x_{0} }$$
(3.1)

given the initial condition x 0 and a control trajectory u(t), t ∈ [0, T],  T > 0, where T can be the terminal time to be optimally determined or given as a fixed positive number. Note that in the above equation, x(t) ∈ E n and u(t) ∈ E m, and the function f: E n × E m × E 1 → E n is assumed to be continuously differentiable.

Let us consider the following objective:

$$\displaystyle{ \max \left \{J =\int _{ 0}^{T}F(x,u,t)dt + S[x(T),T]\right \}, }$$
(3.2)

where F: E n × E m × E 1 → E 1 and S: E n × E 1 → E 1 are continuously differentiable functions and where T denotes the terminal time. Depending on the situation being modeled, the terminal time T may be given, or it may need to be determined. In the case when T is given, the function S(x(T), T) should be viewed as merely a function of the terminal state, and can be written simply as S(x(T)).

Next we impose constraints on state and control variables. Specifically, for each t ∈ [0, T], x(t) and u(t) must satisfy

$$\displaystyle{ g(x,u,t) \geq 0,\;t \in [0,T], }$$
(3.3)

where g: E n × E m × E 1 → E q is continuously differentiable in all its arguments and must contain terms in u. An important special case is that of controls having an upper bound that depends on the current state, i.e., u(t) ≤ M(x(t)),  t ∈ [0, T], which can be written as M(x) − u ≥ 0. Inequality constraints without terms in u will be introduced later in Chap. 4.

It is important to note that the mixed constraints (3.3) allow for inequality constraints of the type g(u, t) ≥ 0 as special cases. Thus, the control constraints of the form u(t) ∈ Ω(t) treated in Chap. 2 can be subsumed in (3.3), provided that they can be expressed in terms of a finite number of inequality constraints of the form g(u, t) ≥ 0. In most problems that are of interest to us, this will indeed be the case. Thus, from here on, we will formulate control constraints either directly as inequality constraints and include them as parts of (3.3), or as u(t) ∈ Ω(t), which can be easily converted into a set of inequality constraints to be included as parts of (3.3).

Finally, the terminal state is constrained by the following inequality and equality constraints:

$$\displaystyle{ a(x(T),T) \geq 0, }$$
(3.4)
$$\displaystyle{ b(x(T),T) = 0, }$$
(3.5)

where \(a:\; E^{n} \times E^{1} \rightarrow E^{l_{a}}\) and \(b:\; E^{n} \times E^{1} \rightarrow E^{l_{b}}\) are continuously differentiable in all their arguments. Clearly, a and b are not functions of T if T is a given fixed number. In the specific case when T is given, the terminal state constraints will be written as a(x(T)) ≥ 0 and b(x(T)) = 0. An important special case of (3.4) is x(T) ≥ k.

We can now define a control u(t),  t ∈ [0, T], or simply u, to be admissible if it is piecewise continuous and if, together with its corresponding state trajectory x(t),  t ∈ [0, T], it satisfies the constraints (3.3), (3.4), and (3.5).

At times we may find terminal inequality constraints given as

$$\displaystyle{ x(T) \in Y (T) \subset X(T), }$$
(3.6)

where Y (T) is a convex set and X(T) is the set of all feasible terminal states, also called the reachable set from the initial state x 0, i.e.,

$$\displaystyle{X(T) =\{ x(T)\mid x(T)\mbox{ obtained by an admissible control $u$ and (3.1)}\}.}$$

Remark 3.1

The feasible set defined by (3.4) and (3.5) need not be convex. Thus, if the convex set Y (T) can be expressed by a finite number of inequalities a(x(T), T) ≥ 0 and equalities b(x(T), T) = 0, then (3.6) becomes a special case of (3.4) and (3.5). In general, (3.6) is not a special case of (3.4) and (3.5), since it may not be possible to define a given Y (T) by a finite number of inequalities and equalities.

In this book, we will only deal with problems in which the following full-rank conditions hold. That is,

$$\displaystyle{\mbox{ rank}[\partial g/\partial u,\mbox{ diag}(g)] = q}$$

holds for all arguments x(t),  u(t),  t, that could arise along an optimal solution, and

$$\displaystyle{\mbox{ rank}\left [\begin{array}{ccc} &\partial a/\partial x&\mbox{ diag}(a)\\ & \partial b/\partial x & 0 \end{array} \right ] = l_{a}+l_{b}}$$

holds for all possible values of x(T) and T. The first of these conditions means that the gradients with respect to u of all active constraints in (3.3) must be linearly independent. Similarly, the second condition means that the gradients with respect to x of the equality constraints (3.5) and of the active inequality constraints in (3.4) must be linearly independent. These conditions are also referred to as the constraint qualifications. In cases when these do not hold, see Seierstad and Sydsæter (1987) for details on weaker constraint qualifications.
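As a quick illustration of the first full-rank condition, consider the hypothetical control bounds u ≥ 0 and M − u ≥ 0 (so q = 2 and m = 1). The following sketch (in Python, not from the text; the function name full_rank_holds is ours) checks that rank[∂g∕∂u, diag(g)] = q at a few points:

```python
import numpy as np

def full_rank_holds(g_vals, dg_du):
    """Return True if rank [dg/du, diag(g)] equals q at the given point."""
    q = len(g_vals)
    return np.linalg.matrix_rank(np.hstack([dg_du, np.diag(g_vals)])) == q

# Hypothetical mixed constraints g1 = u >= 0 and g2 = M - u >= 0 with M = 2.
M = 2.0
dg_du = np.array([[1.0], [-1.0]])   # each row is the gradient of g_i with respect to u
for u in (0.0, 1.0, 2.0):           # lower bound active, neither active, upper bound active
    print(u, full_rank_holds(np.array([u, M - u]), dg_du))   # True in each case
```

If both bounds could be active simultaneously (which here requires M = 0), the rank would drop to 1 and the qualification would fail; this is the kind of degeneracy the condition rules out.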

Before proceeding further, let us recapitulate the optimal control problem under consideration in this chapter:

$$\displaystyle\begin{array}{rcl} \left \{\begin{array}{l} \max \left \{J =\int _{ 0}^{T}F(x,u,t)dt + S[x(T),T]\right \}, \\ \mbox{ subject to} \\ \dot{x} = f(x,u,t),\;x(0) = x_{0}, \\ g(x,u,t) \geq 0, \\ a(x(T),T) \geq 0, \\ b(x(T),T) = 0.\end{array} \right.& &{}\end{array}$$
(3.7)

To state the maximum principle we define the Hamiltonian function H:   E n × E m × E n × E 1 → E 1 as

$$\displaystyle{ H(x,u,\lambda,t):= F(x,u,t) +\lambda f(x,u,t), }$$
(3.8)

where λ ∈ E n (a row vector). We also define the Lagrangian function L:   E n × E m × E n × E q × E 1 → E 1 as

$$\displaystyle{ L(x,u,\lambda,\mu,t):= H(x,u,\lambda,t) +\mu g(x,u,t), }$$
(3.9)

where μ ∈ E q is a row vector whose components are called Lagrange multipliers. These Lagrange multipliers satisfy the complementary slackness conditions

$$\displaystyle{\mu \geq 0,\;\mu g(x,u,t) = 0,}$$

which, in view of (3.3), can be expressed equivalently as

$$\displaystyle{\mu _{i} \geq 0,\;\mu _{i}g_{i}(x,u,t) = 0,\quad i = 1,2,\ldots,q.}$$

The adjoint vector satisfies the differential equation

$$\displaystyle{ \dot{\lambda }= -L_{x}(x,u,\lambda,\mu,t) }$$
(3.10)

with the terminal condition

$$\displaystyle{ \begin{array}{c} \left \{\begin{array}{ll} &\lambda (T) = S_{x}(x(T),T) +\alpha a_{x}(x(T),T) +\beta b_{x}(x(T),T), \\ &\alpha \geq 0,\;\;\alpha a(x(T),T) = 0,\end{array} \right.\end{array} }$$
(3.11)

where \(\alpha \in E^{l_{a}}\) and \(\beta \in E^{l_{b}}\) are constant vectors.

The maximum principle states that the necessary conditions for u^*, with the corresponding state trajectory x^*, to be an optimal control are that there exist continuous and piecewise continuously differentiable functions λ, piecewise continuous functions μ, and constants α and β such that (3.12) holds, i.e.,

$$\displaystyle{ \left \{\begin{array}{l} \dot{x}^{{\ast}} = f(x^{{\ast}},u^{{\ast}},t),\;x^{{\ast}}(0) = x_{0}, \\ a(x^{{\ast}}(T),T) \geq 0,\;b(x^{{\ast}}(T),T) = 0, \\ \dot{\lambda }= -L_{x}(x^{{\ast}},u^{{\ast}},\lambda,\mu,t), \\ \lambda (T) = S_{x}(x^{{\ast}}(T),T) +\alpha a_{x}(x^{{\ast}}(T),T) +\beta b_{x}(x^{{\ast}}(T),T), \\ \alpha \geq 0,\;\alpha a(x^{{\ast}}(T),T) = 0, \\ H(x^{{\ast}}(t),u^{{\ast}}(t),\lambda (t),t) \geq H(x^{{\ast}}(t),u,\lambda (t),t)\mbox{ for all }u\mbox{ satisfying }g(x^{{\ast}}(t),u,t) \geq 0, \\ \frac{\partial L}{\partial u}(x^{{\ast}},u^{{\ast}},\lambda,\mu,t) = 0,\;\mu \geq 0,\;\mu g(x^{{\ast}},u^{{\ast}},t) = 0.\end{array} \right. }$$
(3.12)

In the case of the terminal constraint (3.6), note that the terminal conditions on the state and the adjoint variables in (3.12) will be replaced, respectively, by

$$\displaystyle{ x^{{\ast}}(T) \in Y (T) \subset X(T) }$$
(3.13)

and

$$\displaystyle{ [\lambda (T) - S_{x}(x^{{\ast}}(T),T)][y - x^{{\ast}}(T)] \geq 0,\hspace{14.45377pt} \forall y \in Y (T). }$$
(3.14)

In Exercise 3.5, you are asked to derive (3.14) from (3.12) in the one-dimensional case when \(Y (T) = Y = [\underline{x},\bar{x}]\) for each T > 0, where \(\underline{x}\) and \(\bar{x}\) are two constants such that \(\bar{x}>\underline{ x}.\)

In the case when the terminal time T ≥ 0 in the problem (3.7) is also a decision variable, there is an additional necessary transversality condition for T to be optimal, namely,

$$\displaystyle\begin{array}{rcl} & & H[x^{{\ast}}(T^{{\ast}}),u^{{\ast}}(T^{{\ast}}),\lambda (T^{{\ast}}),T^{{\ast}}] + S_{ T}[x^{{\ast}}(T^{{\ast}}),T^{{\ast}}] \\ & & +\alpha a_{T}[x^{{\ast}}(T^{{\ast}}),T^{{\ast}}] +\beta b_{T}[x^{{\ast}}(T^{{\ast}}),T^{{\ast}}] = 0, {}\end{array}$$
(3.15)

provided T^* is an interior solution, i.e., T^* ∈ (0, ∞). In other words, the optimal T^* and x^*(t),  u^*(t),  t ∈ [0, T^*], must satisfy (3.12) with T replaced by T^*, along with (3.15). This condition will be further discussed and illustrated with examples in Sect. 3.5. The discussion will also include the case when T is restricted to lie in the interval [T 1, T 2], T 2 > T 1 ≥ 0. 

We will now illustrate the use of the maximum principle (3.12) by solving a simple example.

Example 3.1

Consider the problem:

$$\displaystyle{\max \left \{J =\int _{ 0}^{1}udt\right \}}$$

subject to

$$\displaystyle\begin{array}{rcl} \dot{x}& =& u,\;x(0) = 1,{}\end{array}$$
(3.16)
$$\displaystyle\begin{array}{rcl} u& \geq & 0,\;x - u \geq 0.{}\end{array}$$
(3.17)

Note that constraints (3.17) are of the mixed type (3.3). They can also be rewritten as 0 ≤ u ≤ x. 

Solution The Hamiltonian is

$$\displaystyle{H = u +\lambda u = (1+\lambda )u,}$$

so that the optimal control has the form

$$\displaystyle{ u^{{\ast}}(x,\lambda ) = \mbox{ bang}[0,x;1+\lambda ]. }$$
(3.18)

To get the adjoint equation and the multipliers associated with constraints (3.17), we form the Lagrangian:

$$\displaystyle{L = H +\mu _{1}u +\mu _{2}(x - u) =\mu _{2}x + (1 +\lambda +\mu _{1} -\mu _{2})u.}$$

From this we get the adjoint equation

$$\displaystyle{ \dot{\lambda }= -\frac{\partial L} {\partial x} = -\mu _{2},\;\lambda (1) = 0. }$$
(3.19)

Also note that the optimal control must satisfy

$$\displaystyle{ \frac{\partial L} {\partial u} = 1 +\lambda +\mu _{1} -\mu _{2} = 0, }$$
(3.20)

and μ 1 and μ 2 must satisfy the complementary slackness conditions

$$\displaystyle\begin{array}{rcl} \mu _{1}& \geq & 0,\;\mu _{1}u = 0,{}\end{array}$$
(3.21)
$$\displaystyle\begin{array}{rcl} \mu _{2}& \geq & 0,\;\mu _{2}(x - u) = 0.{}\end{array}$$
(3.22)

It is reasonable in this simple problem to guess that u^*(t) = x(t) is an optimal control for all t ∈ [0, 1]. We now show that this control satisfies all the conditions of the Lagrangian form of the maximum principle.

Since x(0) = 1, the control u = x gives x = e^t as the solution of (3.16). Because x = e^t > 0, it follows that u = x > 0. Thus, μ 1 = 0 from (3.21).

From (3.20) we then have

$$\displaystyle{\mu _{2} = 1 +\lambda.}$$

Substituting this into (3.19) and solving gives

$$\displaystyle{ 1 +\lambda (t) = e^{1-t}. }$$
(3.23)

Since the right-hand side of (3.23) is always positive, u = x satisfies (3.18). Notice that μ 2 = e^{1−t} ≥ 0 and x − u = 0, so (3.22) holds.

Using u = x in (3.16), we can obtain the optimal state trajectory x^*(t) = e^t. Thus, the optimal value of the objective function is

$$\displaystyle{J^{{\ast}} =\int _{ 0}^{1}e^{t}dt = (e - 1).}$$

Let us now examine the consequence of changing the constraint x − u ≥ 0 on the control u to x − u ≥ −ε, which gives u ≤ x + ε for a small ε > 0. In this case, it is clear that the optimal control is u^* = x + ε, which we can use in (3.16) to obtain x^*(t) = e^t(1 + ε) − ε. The optimal value of the objective function changes to

$$\displaystyle{\int _{0}^{1}u(t)dt =\int _{ 0}^{1}e^{t}(1+\varepsilon )dt = (e - 1)(1+\varepsilon ).}$$

This means that J^* increases by (e − 1)ε, which in this case equals ε∫_0^1 μ_2(t)dt = ε∫_0^1 e^{1−t}dt, as stipulated in Remark 3.8.
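The computations above can be confirmed numerically. The following sketch (not part of the text; the step size h and the perturbation 0.01 are arbitrary choices) simulates the state equation (3.16) under u = x + ε by forward Euler and recovers both J^* ≈ e − 1 and the marginal value e − 1 of Remark 3.8:

```python
# Numerical check of Example 3.1: a forward-Euler sketch with an arbitrary step size h.
h = 1e-4
n = int(1.0 / h)

def objective(eps):
    # Simulate x' = u with the candidate control u = x + eps, x(0) = 1,
    # and accumulate J = integral of u dt by a left Riemann sum.
    x, J = 1.0, 0.0
    for _ in range(n):
        u = x + eps
        J += u * h
        x += u * h
    return J

J0 = objective(0.0)
print(J0)                               # close to e - 1 = 1.7183
print((objective(0.01) - J0) / 0.01)    # close to e - 1, as in Remark 3.8
```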

We conclude Sect. 3.1 with the following remarks.

Remark 3.2

Strictly speaking, we should have H = λ 0F + λf in (3.8) with (λ 0, λ(t)) ≠ (0, 0) for all t ∈ [0, T]. However, when λ 0 = 0, the conditions in the maximum principle do not change if we replace F by any other function. Therefore, the problems where the maximum principle holds only with λ 0 = 0 are termed abnormal. Such problems may arise when there are terminal state constraints such as (3.4) and (3.5) or pure state constraints treated in Chap. 4. In this book, as is standard in the economics literature dealing with optimal control theory, we will set λ 0 = 1. This is because the problems that are of interest to us will be normal. For examples of abnormal problems and further discussion on this issue, see Seierstad and Sydsæter (1987).

Remark 3.3

The function defined in (3.9) is not a Lagrangian function in the sense of the continuous-time counterpart of the Lagrangian function defined in (8.45) in Chap. 8. However, it can be viewed, roughly speaking, as a Lagrangian function associated with the problem of maximizing the Hamiltonian (3.8) subject to the constraints (3.3) along the optimal path. As in this book, some people refer to (3.9) as a Lagrangian function, while others call it an extended Pontryagin function.

Remark 3.4

It should be pointed out that if the set Y in (3.6) consists of a single point Y = {k}, making the problem a fixed-end-point problem, then the transversality condition reduces simply to the requirement that λ(T) equal a constant to be determined, since x^*(T) = k. In this case the salvage function S becomes a constant, and can therefore be disregarded. When Y = X, the terminal condition in (3.12) reduces to (2.30). Further discussion of the terminal conditions can be found in Sect. 3.4 along with a summary in Table 3.1.

Table 3.1 Summary of the transversality conditions

Remark 3.5

As in Chap. 2, it can be shown that λ i(t),  i = 1, 2, . . . , n, is interpreted as the marginal value of an increment in the state variable x i at time t. Specifically, the relation (2.17) holds so long as the value function V (x, t), defined in (2.10), is continuously differentiable in x i; see Seierstad and Sydsæter (1987) .

Remark 3.6

The Lagrange multiplier α_i,  i = 1, 2, …, l_a, represents the shadow price associated with the terminal state constraint a_i(x(T), T) ≥ 0. Thus, if we change this constraint to a_i(x(T), T) ≥ ε for a small ε, then the change in the objective function will be −εα_i + o(ε). A similar interpretation holds for the multiplier β; see Sect. 3.4 for further discussion. This will be illustrated in Example 3.4 and Exercise 3.17.

Remark 3.7

In the case when the terminal constraint (3.4) or (3.5) is binding, the transversality condition λ(T) in (3.12) should be viewed as the left-hand limit, lim_{t↑T} λ(t), sometimes written as λ(T^−), and then we would express λ(T^+) = S_x(x^*(T), T). However, the standard practice for problems treated in Chaps. 2 and 3 is to use the notation that we have used. Nevertheless, care should be exercised in distinguishing the marginal value of the state at time T given by S_x(x^*(T), T) and the shadow prices for the terminal constraints (3.4) and (3.5) given by α and β, respectively. See Sect. 3.4 and Example 3.4 for further elaboration.

Remark 3.8

It is also possible to provide marginal value interpretations to the Lagrange multipliers μ_i,  i = 1, 2, …, q. If we change the constraint g_i(x, u, t) ≥ 0 to g_i(x, u, t) ≥ ε for a small ε, then we expect the change in the optimal value of the objective function to be −ε∫_0^T μ_i(t)dt + o(ε); see Peterson (1973, 1974) or Malanowski (1984). If ε < 0, then the constraint is being relaxed, and ∫_0^T μ_i(t)dt ≥ 0 provides the marginal value of relaxing the constraint. We will illustrate this concept with the help of Example 3.1.

Remark 3.9

In the case when the problem (3.7) is changed by interchanging the roles of x(T) and x(0), so that the initial condition x(0) = x_0 is replaced by the terminal condition x(T) = x_T, and S(x(T), T), a(x(T), T) and b(x(T), T) are replaced by S(x(0)), a(x(0)) and b(x(0)), respectively, then in the maximum principle (3.12), we need to replace the initial condition x^*(0) = x_0 by x^*(T) = x_T and the terminal condition on the adjoint variable λ by the initial condition λ(0) = −[S_x(x^*(0)) + αa_x(x^*(0)) + βb_x(x^*(0))] with α ≥ 0 and αa(x^*(0)) = 0. 

3.2 Sufficiency Conditions

In this section we will state, without proof, a number of sufficiency results. These results require the concepts of concave and quasiconcave functions .

Recall from Sect. 1.4 that with D ⊂ E n a convex set, a function ψ: D → E 1 is concave if, for all y, z ∈ D and for all p ∈ [0, 1], 

$$\displaystyle{ \psi (py + (1 - p)z) \geq p\psi (y) + (1 - p)\psi (z). }$$
(3.24)

The function ψ is quasiconcave if (3.24) is relaxed to

$$\displaystyle{ \psi (py + (1 - p)z) \geq \min \{\psi (y),\psi (z)\}, }$$
(3.25)

and ψ is strictly concave if, for all y ≠ z and p ∈ (0, 1), (3.24) holds with a strict inequality. Furthermore, ψ is convex, quasiconvex, or strictly convex if −ψ is concave, quasiconcave, or strictly concave, respectively. Note that linearity implies both concavity and convexity, and concavity implies quasiconcavity. For further details on the properties of such functions, see Mangasarian (1969).

We can now state a sufficiency result concerning the problem with mixed constraints stated in (3.7). For this purpose, let us define the maximized Hamiltonian

$$\displaystyle{ H^{0}(x,\lambda,t) =\max _{\{ u\vert g(x,u,t)\geq 0\}}H(x,u,\lambda,t). }$$
(3.26)

Theorem 3.1

Let (x^*, u^*, λ, μ, α, β) satisfy the necessary conditions in (3.12). If H^0(x, λ(t), t) is concave in x at each t ∈ [0, T], S in (3.2) is concave in x, g in (3.3) is quasiconcave in (x, u), a in (3.4) is quasiconcave in x, and b in (3.5) is linear in x, then (x^*, u^*) is optimal.

The result is a straightforward extension of Theorem 2.1. See, e.g., Seierstad and Sydsæter (1977, 1987) and Feichtinger and Hartl (1986).

In Exercise 3.7 you are asked to check these sufficiency conditions for Example 3.1.

3.3 Current-Value Formulation

In most management science and economics problems, the objective function is usually formulated in terms of money or utility. These quantities have time value, and therefore the future streams of money or utility are discounted. The discounted objective function can be written as a special case of (3.2) by assuming that the time dependence of the relevant functions comes only through the discount factor. Thus,

$$\displaystyle{ F(x,u,t) =\phi (x,u)e^{-\rho t}\mbox{ and }S(x,T) =\psi (x)e^{-\rho T}, }$$
(3.27)

where we assume the discount rate ρ > 0. We should also mention that if F(x, u, t) = ϕ(x, u, t)e^{−ρt} and S(x, T) = ψ(x, T)e^{−ρT}, i.e., if ϕ and ψ depend explicitly on t and T, then there is no advantage in developing a current-value version of the maximum principle, and it is recommended that the present-value formulation be used in this case.

Now, the objective in problem (3.7) can be written as:

$$\displaystyle{ \max \left \{J =\int _{ 0}^{T}\phi (x,u)e^{-\rho t}dt +\psi [x(T)]e^{-\rho T}\right \}. }$$
(3.28)

For this problem, the Hamiltonian , which we shall now refer to as the present-value Hamiltonian, H pv, is

$$\displaystyle{ H^{pv}:= e^{-\rho t}\phi (x,u) +\lambda ^{pv}f(x,u,t) }$$
(3.29)

and the present-value Lagrangian is

$$\displaystyle{ L^{pv}:= H^{pv} +\mu ^{pv}g(x,u,t) }$$
(3.30)

with the present-value adjoint variables λ pv and present-value multipliers α pv and β pv satisfying

$$\displaystyle{ \dot{\lambda }^{pv} = -L_{ x}^{pv},\; }$$
(3.31)
$$\displaystyle\begin{array}{rcl} \lambda ^{pv}(T)& =& S_{ x}[x(T),T] +\alpha ^{pv}a_{ x}(x(T),T) +\beta ^{pv}b_{ x}(x(T),T)\;\;\;\;\; \\ & =& e^{-\rho T}\psi _{ x}[x(T)] +\alpha ^{pv}a_{ x}(x(T),T) +\beta ^{pv}b_{ x}(x(T),T),\;\;\;\;\;{}\end{array}$$
(3.32)
$$\displaystyle{ \alpha ^{pv} \geq 0,\;\;\alpha ^{pv}a(x(T),T) = 0, }$$
(3.33)

and μ pv satisfying

$$\displaystyle{ \mu ^{pv} \geq 0,\;\;\mu ^{pv}g = 0. }$$
(3.34)

We use the superscript pv in this section to distinguish these from the current-value functions defined as follows. Elsewhere, we do not need to make the distinction explicitly since we will either be using the present-value definitions or the current-value definitions of these functions. The reader will always be able to tell what is meant from the context.

We now define the current-value Hamiltonian

$$\displaystyle{ H[x,u,\lambda,t]:=\phi (x,u) +\lambda f(x,u,t) }$$
(3.35)

and the current-value Lagrangian

$$\displaystyle{ L[x,u,\lambda,\mu,t]:= H +\mu g(x,u,t). }$$
(3.36)

To see why we can do this, we note that if we define

$$\displaystyle{ \lambda:= e^{\rho t}\lambda ^{pv}\mbox{ and }\mu \;:= e^{\rho t}\mu ^{pv}, }$$
(3.37)

we can rewrite (3.29) and (3.30) as

$$\displaystyle{ H = e^{\rho t}H^{pv}\mbox{ and }L = e^{\rho t}L^{pv}. }$$
(3.38)

Since e ρt > 0, maximizing H pv with respect to u at time t is equivalent to maximizing the current-value Hamiltonian H with respect to u at time t. Furthermore, from (3.37),

$$\displaystyle{ \dot{\lambda }=\rho e^{\rho t}\lambda ^{pv} + e^{\rho t}\dot{\lambda }^{pv}. }$$
(3.39)

The first term on the right-hand side of (3.39) is simply ρλ using the definition in (3.37). To simplify the second term we use the differential equation (3.31) for λ^pv and the fact that L_x = e^{ρt}L_x^{pv} from (3.38). Thus,

$$\displaystyle{\dot{\lambda }=\rho \lambda -L_{x},}$$
$$\displaystyle{ \lambda (T) =\psi _{x}[x(T)] +\alpha a_{x}(x(T),T) +\beta b_{x}(x(T),T), }$$
(3.40)

where the terminal condition for λ(T) follows immediately from the terminal condition for λ^pv(T) in (3.32), the definition (3.37), and

$$\displaystyle{ \alpha = e^{\rho T}\alpha ^{pv}\hspace{14.45377pt} \mbox{ and}\hspace{14.45377pt} \beta = e^{\rho T}\beta ^{pv}. }$$
(3.41)

The complementary slackness conditions satisfied by the current-value Lagrange multipliers μ and α are

$$\displaystyle{\mu \geq 0,\;\mu g = 0,\;\ \alpha \geq 0,\mbox{ and }\alpha a = 0}$$

on account of (3.33), (3.34), (3.37), and (3.41).

We will now state the maximum principle in terms of the current-value functions. It states that the necessary conditions for u^*, with the corresponding state trajectory x^*, to be an optimal control are that there exist λ, μ, and constants α and β such that the conditions (3.42) hold, i.e.,

$$\displaystyle{ \left \{\begin{array}{l} \dot{x}^{{\ast}} = f(x^{{\ast}},u^{{\ast}},t),\;x^{{\ast}}(0) = x_{0}, \\ a(x^{{\ast}}(T),T) \geq 0,\;b(x^{{\ast}}(T),T) = 0, \\ \dot{\lambda }=\rho \lambda - L_{x}(x^{{\ast}},u^{{\ast}},\lambda,\mu,t), \\ \lambda (T) =\psi _{x}(x^{{\ast}}(T)) +\alpha a_{x}(x^{{\ast}}(T),T) +\beta b_{x}(x^{{\ast}}(T),T), \\ \alpha \geq 0,\;\alpha a(x^{{\ast}}(T),T) = 0, \\ H(x^{{\ast}}(t),u^{{\ast}}(t),\lambda (t),t) \geq H(x^{{\ast}}(t),u,\lambda (t),t)\mbox{ for all }u\mbox{ satisfying }g(x^{{\ast}}(t),u,t) \geq 0, \\ \frac{\partial L}{\partial u}(x^{{\ast}},u^{{\ast}},\lambda,\mu,t) = 0,\;\mu \geq 0,\;\mu g(x^{{\ast}},u^{{\ast}},t) = 0,\end{array} \right. }$$
(3.42)

where H and L are the current-value functions defined in (3.35) and (3.36).

As in Sect. 3.1, when the terminal constraint is given by (3.6) instead of (3.4) and (3.5), we need to replace the terminal condition on the state and the adjoint variables, respectively, by (3.13) and

$$\displaystyle{ [\lambda (T) -\psi _{x}(x^{{\ast}}(T))][y - x^{{\ast}}(T)] \geq 0,\;\forall y \in Y (T). }$$
(3.43)

See also Remark 3.4, which applies here as well.

If T ≥ 0 is also a decision variable and if T^* is the optimal terminal time, then the optimal solution x^*, u^*, and T^* must satisfy (3.42) with T replaced by T^*, along with

$$\displaystyle\begin{array}{rcl} & & H[x^{{\ast}}(T^{{\ast}}),u^{{\ast}}(T^{{\ast}}),\lambda (T^{{\ast}}),T^{{\ast}}] -\rho \psi [x^{{\ast}}(T^{{\ast}})] \\ & & +\alpha a_{T}[x^{{\ast}}(T^{{\ast}}),T^{{\ast}}] +\beta b_{T}[x^{{\ast}}(T^{{\ast}}),T^{{\ast}}] = 0.{}\end{array}$$
(3.44)

You are asked in Exercise 3.8 to show that (3.44) is the current-value version of (3.15) under the relation (3.27). Furthermore, you are asked to show how (3.44) should be modified if S(x, T) = ψ(x, T)e^{−ρT} in place of (3.27).

As for the sufficiency conditions for the current-value formulation, one can simply use Theorem 3.1 as if it were stated for the current-value formulation.

Example 3.2

We illustrate an application of the current-value maximum principle by solving the consumption problem of Example 1.3 with U(C) = lnC and W(T) = 0. Thus, we solve

$$\displaystyle{\max _{C(t)\geq 0}\left \{J =\int _{ 0}^{T}e^{-\rho t}\ln C(t)dt + B(0)e^{-\rho T}\right \}}$$

subject to the wealth dynamics

$$\displaystyle{\dot{W} = rW - C,\;W(0) = W_{0},\;W(T) = 0,}$$

where W 0 > 0. As hinted in Exercise 2.29(a), we do not need to impose the pure state constraint W(t) ≥ 0,  t ∈ [0, T], in view of C(t) ≥ 0,  t ∈ [0, T], and W(T) = 0. Also, the salvage function reduces to B(0), which is a constant; see Remark 3.4.

Solution In Exercise 2.29(a) we used the standard Hamiltonian formulation to solve the problem. We now demonstrate the use of the current-value Hamiltonian formulation:

$$\displaystyle{ H =\ln C +\lambda (rW - C), }$$
(3.45)

with the adjoint equation

$$\displaystyle{ \dot{\lambda }=\rho \lambda -\frac{\partial H} {\partial W} = (\rho -r)\lambda,\,\lambda (T) =\beta, }$$
(3.46)

where β is some constant to be determined. The solution of (3.46) is

$$\displaystyle{ \lambda (t) =\beta e^{(\rho -r)(t-T)}. }$$
(3.47)

To find the optimal control, we maximize H by differentiating (3.45) with respect to C and setting the result to zero:

$$\displaystyle{\frac{\partial H} {\partial C} = \frac{1} {C}-\lambda = 0,}$$

which implies

$$\displaystyle{ C^{{\ast}}(t) = \frac{1} {\lambda (t)} = \frac{1} {\beta } e^{(\rho -r)(T-t)}. }$$
(3.48)

Using this consumption level in the wealth dynamics gives

$$\displaystyle{\dot{W}(t) = rW(t) -\frac{1} {\beta } e^{(\rho -r)(T-t)},\;W(0) = W_{ 0},}$$

which can be solved as

$$\displaystyle{ W^{{\ast}}(t) = e^{rt}\left [W_{ 0} -\frac{e^{(\rho -r)T}(1 - e^{-\rho t})} {\rho \beta } \right ]. }$$
(3.49)

Setting W^*(T) = 0 gives β = e^{(ρ−r)T}(1 − e^{−ρT})∕ρW_0. Therefore, the optimal consumption rate and wealth at time t are

$$\displaystyle{ C^{{\ast}}(t) = \frac{\rho W_{0}e^{(r-\rho )t}} {1 - e^{-\rho T}},\;W^{{\ast}}(t) = e^{rt}W_{ 0}\left [\frac{e^{-\rho t} - e^{-\rho T}} {1 - e^{-\rho T}} \right ]. }$$
(3.50)

The optimal value of the objective function is

$$\displaystyle{ J^{{\ast}} = \frac{1 - e^{-\rho T}} {\rho } \left [\ln \frac{\rho W_{0}} {1 - e^{-\rho T}}\right ] + \frac{r-\rho } {\rho } \left [\frac{1} {\rho } - e^{-\rho T}\left (T + \frac{1} {\rho } \right )\right ] + B(0)e^{-\rho T}. }$$
(3.51)

The interpretation of the current-value functions is that they reflect the values at time t in terms of current (or, time-t) dollars. The standard functions, on the other hand, reflect the values at time t in terms of time-zero dollars. For example, the standard adjoint variable λ^pv(t) can be interpreted as the marginal value per unit increase in the state at time t, in the same units as that of the objective function (3.28), i.e., in terms of time-zero dollars; see Sect. 2.2.4. On the other hand, λ(t) = e^{ρt}λ^pv(t) is obviously the same value expressed in terms of current (or, time-t) dollars.

For the consumption problem of Example 3.2, note that the current-value adjoint function

$$\displaystyle{ \lambda (t) = e^{(\rho -r)t}(1 - e^{-\rho T})/\rho W_{ 0}. }$$
(3.52)

This gives the marginal value per unit increase in wealth at time t in time-t dollars. In Exercise 2.29(a), the standard adjoint variable was λ^pv(t) = e^{−rt}(1 − e^{−ρT})∕ρW_0, which can be written as λ^pv(t) = e^{−ρt}λ(t). Thus, it is clear that λ^pv(t) expresses the same marginal value in time-zero dollars. In particular,

$$\displaystyle{dJ^{{\ast}}/dW_{ 0} = (1 - e^{-\rho T})/\rho W_{ 0} =\lambda (0) =\lambda ^{pv}(0)}$$

gives the marginal value per unit increase in the initial wealth W 0. 
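This marginal value relation can be checked numerically. The sketch below (with illustrative parameter values ρ, r, T, W_0, and B(0) that are our assumptions, not from the text) evaluates J for the optimal consumption path (3.50) and compares a finite-difference estimate of dJ^*∕dW_0 with λ(0):

```python
import numpy as np
from scipy.integrate import quad

# Assumed illustrative parameters for Example 3.2.
rho, r, T, B0 = 0.10, 0.05, 10.0, 0.0

def J(W0):
    # Discounted utility of the optimal consumption path C* from (3.50),
    # plus the constant salvage term B(0) e^{-rho T}.
    C = lambda t: rho * W0 * np.exp((r - rho) * t) / (1.0 - np.exp(-rho * T))
    value, _ = quad(lambda t: np.exp(-rho * t) * np.log(C(t)), 0.0, T)
    return value + B0 * np.exp(-rho * T)

W0, dW = 100.0, 1e-3
print((J(W0 + dW) - J(W0 - dW)) / (2 * dW))   # approximately 0.0632
print((1.0 - np.exp(-rho * T)) / (rho * W0))  # lambda(0) = lambda^pv(0) = 0.0632
```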

In Exercise 3.11, you are asked to formulate and solve a consumption problem of an economy. The problem is a linear version of the famous Ramsey model; see Ramsey (1928) and Feichtinger and Hartl (1986, p. 201) .

Before concluding this section on the current-value formulation, let us also provide the current-value version of the HJB equation (2.15) or (2.19) along with the terminal condition (2.16). As in (2.9), we now define the value function for the problem (3.7), with its objective function replaced by (3.28), as follows:

$$\displaystyle{ \begin{array}{cl} V (x,t) =&\max _{\{u\vert g(x,u,t)\geq 0\}}\left [\int _{t}^{T}e^{-\rho (s-t)}\phi (x(s),u(s))ds + e^{-\rho (T-t)}\psi (x(T))\right ] \\ &\mbox{ if }x(T)\mbox{ satisfies }a(x(T),T) \geq 0\mbox{ and }b(x(T),T) = 0, \\ &\mbox{ and }V (x,t) = -\infty,\mbox{ otherwise.}\end{array} }$$
(3.53)

Then proceeding as in Sect. 2.1.1, we have

$$\displaystyle{ V (x,t)=\max _{\stackrel{\{u(\tau )\vert g(x(\tau ),u(\tau ),\tau )\geq 0\}}{\tau \in [t,\;t+\delta t]}}\left \{\int _{t}^{t+\delta t}\phi [x(\tau ),u(\tau )]d\tau + e^{-\rho \delta t}V [x(t +\delta t),t +\delta t]\right \}. }$$
(3.54)

Noting that e^{−ρδt} = 1 − ρδt + o(δt) and continuing on as in Sect. 2.1.1, we can obtain the current-value version of (2.15) and (2.19) as

$$\displaystyle{ \begin{array}{ll} \rho V (x,t)& = \max _{\{u\vert g(x,u,t)\geq 0\}}\left \{\phi (x,u) + V _{x}(x,t)f(x,u,t) + V _{t}(x,t)\right \} \\ & = \max _{\{u\vert g(x,u,t)\geq 0\}}\left \{H(x,u,V _{x},t) + V _{t}\right \}, \end{array} }$$
(3.55)

where H is defined as in (3.35).

Finally, we can write the terminal condition as

$$\displaystyle{ V (x,T) = \left \{\begin{array}{ll} &\psi (x),\mbox{ if }a(x,T) \geq 0\mbox{ and }b(x,T) = 0,\\ & - \infty, \mbox{ otherwise.} \end{array} \right. }$$
(3.56)

3.4 Transversality Conditions: Special Cases

Terminal conditions on the adjoint variables, also known as transversality conditions , are extremely important in optimal control theory. Because the salvage value function ψ(x) is known, we know the marginal value per unit change in the state at terminal time T. Since λ(T) must be equal to this marginal value, it provides us with the boundary conditions for the differential equations for the adjoint variables. We will now derive the terminal or transversality conditions for the current-value adjoint variables for some important special cases of the general problem treated in Sect. 3.3. We also summarize these conditions in Table 3.1.

Case 1: Free-end point . In this case, we do not put any constraints on the terminal state x(T). Thus,

$$\displaystyle{x(T) \in X(T).}$$

From the terminal conditions in (3.42), it is obvious that for the free-end-point problem , i.e., when Y (T) = X(T), 

$$\displaystyle{ \lambda (T) =\psi _{x}[x^{{\ast}}(T)]. }$$
(3.57)

This includes the condition λ(T) = 0 in the special case of ψ(x) ≡ 0; see Example 3.1, specifically (3.19). These conditions are repeated in Table 3.1, Row 1.

The economic interpretation of λ(T) is that it equals the marginal value of a unit increment in the terminal state evaluated at its optimal value x (T). 

Case 2: Fixed-end point . In this case, which is the other extreme from the free-end-point case, the terminal constraint is

$$\displaystyle{b(x(T),T) = x(T) - k = 0,}$$

and the terminal conditions in (3.42) do not provide any information for λ(T). However, as mentioned in Remark 3.4 and recalled subsequently in connection with (3.42), λ(T) will be some constant β, which will be determined by solving the boundary value problem, where the system of differential equations consists of the state equations with both initial and terminal conditions and the adjoint equations with no boundary conditions . This condition is repeated in Table 3.1, Row 2. Example 3.2 solved in the previous section illustrates this case.

The economic interpretation of λ(T) = β is as follows. The constant β times ɛ, i.e., βɛ, provides the value that could be lost if the fixed-end point were specified to be k + ɛ instead of k; see Exercise 3.12.

Case 3: Lower bound. Here we restrict the ending value of the state variable to be bounded from below, namely,

$$\displaystyle{a(x(T),T) = x(T) - k \geq 0,}$$

where k ∈ X. In this case, the terminal conditions in (3.42) reduce to

$$\displaystyle{ \lambda (T) \geq \psi _{x}[x^{{\ast}}(T)] }$$
(3.58)

and

$$\displaystyle{ \{\lambda (T) -\psi _{x}[x^{{\ast}}(T)]\}\{x^{{\ast}}(T) - k\} = 0, }$$
(3.59)

with the recognition that the shadow price of the inequality constraint (3.4) is

$$\displaystyle{ \alpha =\lambda (T) -\psi _{x}[x^{{\ast}}(T)] \geq 0. }$$
(3.60)

For ψ(x) ≡ 0, these terminal conditions can be written as

$$\displaystyle{ \lambda (T) \geq 0\mbox{ and }\lambda (T)[x^{{\ast}}(T) - k] = 0. }$$
(3.61)

These conditions are repeated in Table 3.1, Row 3.

Case 4: Upper bound. Similarly, when the ending value of the state variable is bounded from above, i.e., when the terminal constraint is

$$\displaystyle{k - x(T) \geq 0,}$$

the conditions for this opposite case are

$$\displaystyle{ \lambda (T) \leq \psi _{x}[x^{{\ast}}(T)] }$$
(3.62)

and (3.59). These are repeated in Table 3.1, Row 4. Furthermore, (3.62) can be related to the condition on λ(T) in (3.42) by setting

$$\displaystyle{ \alpha =\psi _{x}[x^{{\ast}}(T)] -\lambda (T) \geq 0. }$$
(3.63)

Case 5: A general case. A general ending condition is

$$\displaystyle{x(T) \in Y (T) \subset X(T),}$$

which is already stated in (3.6). The transversality conditions are specified in (3.43) and repeated in Table 3.1, Row 5.

An important situation which gives rise to a one-sided constraint occurs when there is an isoperimetric or budget constraint of the form

$$\displaystyle{ \int _{0}^{T}l(x,u,t)dt \leq K, }$$
(3.64)

where l: E n × E m × E 1 → E 1 is assumed to be nonnegative, bounded, and continuously differentiable, and K is a positive constant representing the amount of a budgeted resource. To see how this constraint can be converted into a lower bound constraint, we define an additional state variable x_{n+1} by the state equation

$$\displaystyle{ \dot{x}_{n+1} = -l(x,u,t),\;x_{n+1}(0) = K,\;x_{n+1}(T) \geq 0. }$$
(3.65)

We employ the index n + 1 simply because we already have n state variables x = (x_1, x_2, …, x_n). Also, Eq. (3.65) is simply an additional state equation appended to the original system.
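In computational work, the conversion (3.65) amounts to appending one component to the state vector. A minimal sketch (the function name augmented_rhs and the example choices of f and l are ours, purely for illustration):

```python
import numpy as np

def augmented_rhs(t, state, u, f, l):
    """Right-hand side of the augmented system: the original dynamics (3.1)
    together with x_{n+1}' = -l(x, u, t) from (3.65), so that the last
    component of the state is the budget remaining at time t."""
    x = state[:-1]
    return np.append(f(x, u, t), -l(x, u, t))

# Hypothetical one-state example with expenditure rate l = u^2 and budget K = 5.
f = lambda x, u, t: np.array([u])        # x' = u
l = lambda x, u, t: u ** 2
state0 = np.array([1.0, 5.0])            # [x(0), x_{n+1}(0) = K]
print(augmented_rhs(0.0, state0, 0.5, f, l))   # [ 0.5  -0.25]
```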

In Exercise 3.13 you will be asked to rework the leaky reservoir problem of Exercise 2.18 with an additional isoperimetric constraint on the total amount of water available. Later in Chap. 7, you’ll be asked to solve Exercises 7.10–7.12 involving budgets for advertising expenditures.

In Table 3.1, we have summarized all the terminal or transversality conditions discussed previously. In Sect. 3.7 we discuss model types. We will see that, given the initial state x_0, we can completely specify a control model by selecting a model type and a transversality condition. In what follows, we solve two examples with lower bounds on the terminal state, illustrating the use of the transversality conditions (3.61) stated in Table 3.1, Row 3. Example 3.3 is a variation of the consumption problem in Example 3.2.

Example 3.3

Let us modify the objective function of the consumption problem (Example 3.2) to take into account the salvage (bequest) value of terminal wealth. This is the utility to the individual of leaving an estate to his heirs upon death. Let us now assume that T denotes the time of the individual’s death and BW(T), where B is a positive constant, denotes his utility of leaving wealth W(T) to his heirs upon death. Then, the problem is:

$$\displaystyle{ \max _{C(t)\geq 0}\left \{J =\int _{ 0}^{T}e^{-\rho t}\ln C(t)dt + e^{-\rho T}BW(T)\right \} }$$
(3.66)

subject to the wealth equation

$$\displaystyle{ \dot{W} = rW - C,\;W(0) = W_{0},\;W(T) \geq 0. }$$
(3.67)

Solution The Hamiltonian for the problem is given in (3.45), and the adjoint equation is given in (3.46) except that the transversality conditions are from Table 3.1, Row 3:

$$\displaystyle{ \lambda (T) \geq B,\;[\lambda (T) - B]W^{{\ast}}(T) = 0. }$$
(3.68)

In Example 3.2, the value of β, the terminal value of the adjoint variable, was

$$\displaystyle{\beta = \frac{e^{(\rho -r)T}(1 - e^{-\rho T})} {\rho W_{0}}.}$$

We now have two cases: (i) β ≥ B and (ii) β < B. 

In case (i), the solution of the problem is the same as that of Example 3.2, because by setting λ(T) = β and recalling that W^*(T) = 0 in that example, it follows that (3.68) holds.

In case (ii), we set λ(T) = B. Then, by using B in place of β in (3.47)–(3.49), we get λ(t) = Be^{(ρ−r)(t−T)}, C^*(t) = (1∕B)e^{(ρ−r)(T−t)}, and

$$\displaystyle{ W^{{\ast}}(t) = e^{rt}\left [W_{ 0} -\frac{e^{(\rho -r)T}(1 - e^{-\rho t})} {\rho B} \right ]. }$$
(3.69)

Since β < B, we can see from (3.49) and (3.69) that the wealth level in case (ii) is larger than that in case (i) for t ∈ (0, T]. Furthermore, the amount of the bequest is

$$\displaystyle{W^{{\ast}}(T) = W_{ 0}e^{rT} -\frac{e^{\rho T} - 1} {\rho B}> 0.}$$

Note that (3.68) holds for case (ii). Also, if we had used (3.42) instead of Table 3.1, Row 3, we would have λ(T) = B + α,  α ≥ 0,  αW^*(T) = 0, equivalently, in place of (3.68). It is easy to see that α = β − B in case (i) and α = 0 in case (ii).
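Which case applies depends on the data. The sketch below (with illustrative values of ρ, r, T, W_0, and B that are our assumptions, not from the text) computes β, determines the case, and, in case (ii), the resulting bequest:

```python
import numpy as np

# Assumed illustrative parameters for Example 3.3.
rho, r, T, W0, B = 0.10, 0.05, 10.0, 100.0, 0.5

beta = np.exp((rho - r) * T) * (1.0 - np.exp(-rho * T)) / (rho * W0)
if beta >= B:
    print("case (i): beta >= B, no bequest; the solution of Example 3.2 applies")
else:
    # Case (ii): lambda(T) = B and the bequest is W*(T) = W0 e^{rT} - (e^{rho T} - 1)/(rho B).
    bequest = W0 * np.exp(r * T) - (np.exp(rho * T) - 1.0) / (rho * B)
    print("case (ii): beta < B, bequest W*(T) =", round(bequest, 2))
```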

Example 3.4

Consider the problem:

$$\displaystyle{\max \left \{J =\int _{ 0}^{2} - xdt\right \}}$$

subject to

$$\displaystyle{ \dot{x} = u,\;x(0) = 1,\;x(2) \geq 0, }$$
(3.70)
$$\displaystyle{ -1 \leq u \leq 1. }$$
(3.71)

Solution The Hamiltonian is

$$\displaystyle{H = -x +\lambda u.}$$

Here, we do not need to introduce the Lagrange multipliers for the control constraints (3.71), since we can easily deduce that the Hamiltonian maximizing control has the form

$$\displaystyle{ u^{{\ast}} = \mbox{ bang}[-1,1;\lambda ]. }$$
(3.72)

The adjoint equation is

$$\displaystyle{ \dot{\lambda }= 1 }$$
(3.73)

with the transversality conditions

$$\displaystyle{ \lambda (2) \geq 0\mbox{ and }\lambda (2)x(2) = 0, }$$
(3.74)

obtained from (3.61) or from Table 3.1, Row 3. Since λ(t) is monotonically increasing, the control (3.72) can switch at most once, and it can only switch from u = −1 to u = 1. Let the switching time be t^* ≤ 2. Then the optimal control is

$$\displaystyle{ u^{{\ast}}(t) = \left \{\begin{array}{lll} - 1&\mbox{ for }0 \leq t \leq t^{{\ast}}, \\ + 1&\mbox{ for }t^{{\ast}} <t \leq 2. \end{array} \right. }$$
(3.75)

Since the control switches at t^*, λ(t^*) must be 0. Solving (3.73) gives

$$\displaystyle{\lambda (t) = t - t^{{\ast}}.}$$

There are two cases: (i) t^* < 2 and (ii) t^* = 2. We analyze case (i) first. Here λ(2) = 2 − t^* > 0; therefore from (3.74), x(2) = 0. Solving for x(t) with u^*(t) given in (3.75), we obtain

$$\displaystyle{x(t) = \left \{\begin{array}{cl} 1 - t &\mbox{ for }0 \leq t \leq t^{{\ast}}, \\ (t - t^{{\ast}}) + x(t^{{\ast}}) = t + 1 - 2t^{{\ast}}&\mbox{ for }t^{{\ast}} <t \leq 2. \end{array} \right.}$$

Therefore, setting x(2) = 0 gives

$$\displaystyle{x(2) = 3 - 2t^{{\ast}} = 0,}$$

which makes t^* = 3∕2. Since this satisfies t^* < 2, we do not have to deal with case (ii), and we have

$$\displaystyle{x^{{\ast}}(t) = \left \{\begin{array}{cl} 1 - t&\mbox{ for }0 \leq t \leq 3/2, \\ t - 2&\mbox{ for }3/2 <t \leq 2 \end{array} \right.\;\;\mbox{ and }\lambda (t) = t-\frac{3} {2}.}$$

Figure 3.1 shows the optimal state and adjoint trajectories. Using the optimal state trajectory in the objective function, we can obtain its optimal value J^* = −1∕4.

Figure 3.1

State and adjoint trajectories in Example 3.4
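The solution just obtained is easy to confirm numerically. A sketch (forward Euler with an arbitrary step size h, not part of the text) that simulates (3.70) under the control (3.75) with t^* = 3∕2:

```python
# Numerical check of Example 3.4: simulate x' = u under the bang-bang control
# that switches from -1 to +1 at t* = 3/2 and accumulate J = integral of (-x) dt.
h = 1e-4
x, J, t = 1.0, 0.0, 0.0
while t < 2.0:
    u = -1.0 if t <= 1.5 else 1.0
    J += -x * h
    x += u * h
    t += h
print(x)   # close to 0: the terminal constraint x(2) >= 0 is met with x(2) = 0
print(J)   # close to J* = -1/4
```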

In Exercise 3.15, you are asked to consider case (ii) by setting t^* = 2, and show that the maximum principle will not be satisfied in this case.

Finally, we can verify the marginal value interpretation of the adjoint variable as indicated in Remark 3.5. For this, we first note that the feasible region for the problem is given by x ≥ t − 2,  t ∈ [0, 2]. To obtain the value function V (x, t), we can easily obtain the optimal solution in the interval [t, 2] for the problem beginning with x(t) = x. We use the notation introduced in Example 2.5 to specify the optimal solution as

$$\displaystyle{u_{(x,t)}^{{\ast}}(s) = \left \{\begin{array}{cl} - 1,&s \in [t,\; \frac{1} {2}(x + t) + 1), \\ 1, &s \in [\frac{1} {2}(x + t) + 1,2], \end{array} \right.}$$

and

$$\displaystyle{x_{(x,t)}^{{\ast}}(s) = \left \{\begin{array}{cl} x + t - s,&s \in [t,\; \frac{1} {2}(x + t) + 1), \\ s - 2, &s \in [\frac{1} {2}(x + t) + 1,2]. \end{array} \right.}$$

Then for x ≥ t − 2, 

$$\displaystyle{ \begin{array}{lll} V (x,t)& =&\int _{t}^{2} - x_{(x,t)}^{{\ast}}(s)ds \\ & =& -\int _{t}^{(1/2)(x+t)+1}(x + t - s)ds -\int _{(1/2)(x+t)+1}^{2}(s - 2)ds \\ & =&(1/4)t^{2} - (1/4)x^{2} + (1/2)t(x - 2) - (x - 1).\end{array} }$$
(3.76)

For x < t − 2, there is no feasible solution, and we therefore set V (x, t) = −∞. 

We can now verify that for 0 ≤ t ≤ 3∕2, the value function V (x, t) is continuously differentiable at x = x^*(t) = 1 − t, and

$$\displaystyle\begin{array}{rcl} V _{x}(x^{{\ast}}(t),t)& =& -(1/2)x^{{\ast}}(t) + (1/2)t - 1 {}\\ & =& -(1/2)(1 - t) + (1/2)t - 1 {}\\ & =& t - 3/2 {}\\ & =& \lambda (t). {}\\ \end{array}$$
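A quick numerical check of (3.76) and of the relation V_x(x^*(t), t) = λ(t) (a sketch; the chosen point t = 1∕2 and the step sizes are arbitrary):

```python
# Compare a simulated value with formula (3.76) and check V_x = lambda by finite differences.
h = 1e-5

def V_sim(x, t):
    # Integrate -x along the stated optimal policy: u = -1 until the switching
    # instant s = (x + t)/2 + 1, and u = +1 afterwards.
    s_switch = 0.5 * (x + t) + 1.0
    s, val = t, 0.0
    while s < 2.0:
        u = -1.0 if s < s_switch else 1.0
        val += -x * h
        x += u * h
        s += h
    return val

V = lambda x, t: 0.25 * t**2 - 0.25 * x**2 + 0.5 * t * (x - 2.0) - (x - 1.0)

t = 0.5
x = 1.0 - t                                      # a point on the optimal path x*(t) = 1 - t
print(V_sim(x, t), V(x, t))                      # both approximately 0.125
dx = 1e-4
print((V(x + dx, t) - V(x - dx, t)) / (2 * dx))  # approximately t - 3/2 = lambda(t) = -1.0
```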

What happens when t ∈ (3∕2, 2]? Clearly, for x ≥ x^*(t) = t − 2, we may still use (3.76) to obtain the right-hand derivative V_x^+(x^*(t), t) = −(1∕2)x^*(t) + (1∕2)t − 1 = −(1∕2)(t − 2) + (1∕2)t − 1 = 0. However, for x < x^*(t), we have x < t − 2, for which there is no feasible solution, and we set the left-hand derivative V_x^−(x^*(t), t) = −∞. Thus, the value function V (x, t) is not differentiable at x^*(t), and since V_x(x^*(t), t) does not exist for t ∈ (3∕2, 2], (2.17) has no meaning; see Remark 2.2.

It is possible, however, to provide an economic meaning for λ(2). In Exercise 3.17, you are asked to rework Example 3.4 with the terminal condition x(2) ≥ 0 replaced by x(2) ≥ ɛ, where ɛ is small. Furthermore, the solution will illustrate that α = λ(2) − 0 = 1∕2, obtained by using (3.60), represents the shadow price of the constraint as indicated in Remark 3.7.

3.5 Free Terminal Time Problems

In some cases, the terminal time is not given, but needs to be determined as an additional decision. Necessary conditions for a terminal time to be optimal in the present-value and current-value formulations were given in (3.15) and (3.44), respectively. In this section, we elaborate further on these conditions as well as solve two free terminal time examples: Examples 3.5 and 3.6.

Let us begin with a special case of the condition (3.15) for the simple problem (2.4) when T ≥ 0 is a decision variable. When compared with the problem (3.7), the simple problem is without the mixed constraints and constraints at the terminal time T. Thus the transversality condition (3.15) reduces to

$$\displaystyle{ H[x^{{\ast}}(T^{{\ast}}),u^{{\ast}}(T^{{\ast}}),\lambda (T^{{\ast}}),T^{{\ast}}] + S_{ T}[x^{{\ast}}(T^{{\ast}}),T^{{\ast}}] = 0. }$$
(3.77)

This condition, along with the maximum principle (2.31) with T replaced by T^*, gives us the necessary conditions for the optimality of T^* and u^*(t),  t ∈ [0, T^*], for the simple problem (2.4) when T ≥ 0 is also a decision variable.

An intuitively appealing way to check that the optimal T^* ∈ (0, ∞) must satisfy (3.77) is to solve the problem (2.4) with the terminal time T^*, with u^*(t), t ∈ [0, T^*], as the optimal control trajectory, and then show that the first-order condition for T^* to maximize the objective function in a neighborhood (T^* − δ, T^* + δ) of T^*, with δ > 0, leads to (3.77). For this, let us set u^*(t) = u^*(T^*),  t ∈ [T^*, T^* + δ), so that we have a control u^*(t) that is feasible for (2.4) for any T ∈ (T^* − δ, T^* + δ), as well as continuous at T^*. Let x^*(t),  t ∈ [0, T^* + δ], be the corresponding state trajectory. With these we can obtain the corresponding objective function value

$$\displaystyle{ J(T) =\int _{ 0}^{T}F(x^{{\ast}}(t),u^{{\ast}}(t),t)dt + S(x^{{\ast}}(T),T),\;T \in (T^{{\ast}}-\delta,T^{{\ast}}+\delta ), }$$
(3.78)

which, in particular, represents the optimal value of the objective function for the problem (2.4) when T = T^*. Furthermore, since u^*(t) is continuous at T^*, x^*(t) is continuously differentiable there, and so is J(T). In this case, since T^* is optimal, it must satisfy

$$\displaystyle{ J'(T^{{\ast}}):= \frac{dJ(T)} {dT} \vert _{T=T^{{\ast}}} = 0. }$$
(3.79)

Otherwise, we would have either J′(T^*) > 0 or J′(T^*) < 0. The former situation would allow us to find a T ∈ (T^*, T^* + δ) for which J(T) > J(T^*), and T^* could not be optimal, since the choice of an optimal control for (2.4) defined on the interval [0, T] would only improve the value of the objective function. Likewise, the latter situation would allow us to find a T ∈ (T^* − δ, T^*) for which J(T) > J(T^*). By taking the derivative of (3.78), we can write (3.79) as

$$\displaystyle{ F(x^{{\ast}}(T^{{\ast}}),u^{{\ast}}(T^{{\ast}}),T^{{\ast}}) + S_{ x}[x^{{\ast}}(T^{{\ast}}),T^{{\ast}}]\dot{x}^{{\ast}}(T^{{\ast}}) + S_{ T}[x^{{\ast}}(T^{{\ast}}),T^{{\ast}}] = 0. }$$
(3.80)

Furthermore, using the definition of the Hamiltonian in (2.18) and the state equation and the transversality condition in (2.31), we can easily see that (3.80) can be written as (3.77).

Remark 3.10

An intuitive way to obtain the optimal T^* is to first solve the problem (2.4) with a given terminal time T and obtain the optimal value of the objective function J^*(T), and then maximize J^*(T) over T. Hartl and Sethi (1983) show that the first-order condition for maximizing J^*(T), namely, dJ^*(T)∕dT = 0, can also be used to derive the transversality condition (3.77).

If T is restricted to lie in the interval [T_1, T_2], where T_2 > T_1 ≥ 0, then (3.77) is still valid provided T^* ∈ (T_1, T_2). As is standard, if T^* = T_1, then the = sign in (3.77) is replaced by ≤, and if T^* = T_2, then the = sign in (3.77) is replaced by ≥. In other words, if we must have T ∈ [T_1, T_2], then we can replace (3.77) by

$$\displaystyle{ H[x^{{\ast}}(T^{{\ast}}),u^{{\ast}}(T^{{\ast}}),\lambda (T^{{\ast}}),T^{{\ast}}]+S_{ T}[x^{{\ast}}(T^{{\ast}}),T^{{\ast}}]\left \{\begin{array}{ll} \leq 0&\mbox{ if }T^{{\ast}} = T_{ 1}, \\ = 0&\mbox{ if }T^{{\ast}}\in (T_{1},T_{2}), \\ \geq 0&\mbox{ if }T^{{\ast}} = T_{2}. \end{array} \right. }$$
(3.81)

Similarly, we can also obtain the corresponding versions of (3.15) and (3.44) for the problem (3.7) and its current value version (specified in Sect. 3.3), respectively.

We shall now illustrate (3.77) and (3.81) by solving Examples 3.5 and 3.6. To illustrate the idea in Remark 3.10, you are asked in Exercise 3.6 to solve Example 3.5 by using dJ^*(T)∕dT = 0 to obtain the optimal T^*. 

Example 3.5

Consider the problem:

$$\displaystyle{ \max _{u,T}\left \{J =\int _{ 0}^{T}(x - u)dt + x(T)\right \} }$$
(3.82)

subject to

$$\displaystyle{ \dot{x} = -2 + 0.5u,\;x(0) = 17.5, }$$
(3.83)
$$\displaystyle{u \in [0,1],\;T \geq 0.}$$

Solution The Hamiltonian is

$$\displaystyle{H = x - u +\lambda (-2 + 0.5u),}$$

where \(\dot{\lambda }= -1,\;\lambda (T) = 1,\) which gives

$$\displaystyle{\lambda (t) = 1 + (T - t).}$$

Then, the optimal control is given by

$$\displaystyle{ u^{{\ast}}(t) = \mbox{ bang}[0,1;0.5(T - 1 - t)]. }$$
(3.84)

In other words, u^*(t) = 1 for 0 ≤ t ≤ T − 1 and u^*(t) = 0 for T − 1 < t ≤ T. 

Since we must also determine the optimal terminal time T^*, it must satisfy (3.77), which, in view of the fact that u^*(T^*) = 0 from (3.84), reduces to

$$\displaystyle{ x^{{\ast}}(T^{{\ast}}) - 2 = 0. }$$
(3.85)

By substituting u^*(t) in (3.83) and integrating, we obtain

$$\displaystyle{ x^{{\ast}}(t) = \left \{\begin{array}{ll} 17.5 - 1.5t, &0 \leq t \leq T - 1, \\ 17 + 0.5T - 2t,&T - 1 <t \leq T.\\ \end{array} \right. }$$
(3.86)

We can now apply (3.85) to obtain

$$\displaystyle{x^{{\ast}}(T^{{\ast}}) - 2 = 17 - 1.5T^{{\ast}}- 2 = 0,}$$

which gives T^* = 10. Thus, the optimal solution of the problem is given by T^* = 10 and

$$\displaystyle{u^{{\ast}}(t) = \mbox{ bang}[0,1;0.5(9 - t)].}$$

Note that if we had restricted T to be in the interval [T_1, T_2] = [2, 8], we would have T^* = 8,  u^*(t) = bang[0, 1; 0.5(7 − t)], and x^*(8) − 2 = 5 − 2 = 3 ≥ 0, which would satisfy (3.81) at T^* = T_2 = 8. On the other hand, if T were restricted to the interval [T_1, T_2] = [11, 15], then T^* = 11,  u^*(t) = bang[0, 1; 0.5(10 − t)], and x^*(11) − 2 = 0.5 − 2 = −1.5 ≤ 0 would satisfy (3.81) at T^* = T_1 = 11. 
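The optimality of T^* = 10 can also be seen numerically. The sketch below (not part of the text; the grid and step size are arbitrary) evaluates J(T) for controls of the form (3.84) over a range of terminal times and locates the maximizer:

```python
import numpy as np

# For each candidate terminal time T, apply u = 1 on [0, T-1] and u = 0 afterwards,
# evaluate J(T) = integral of (x - u) dt + x(T), and find the maximizing T.
h = 1e-3

def J(T):
    x, val, t = 17.5, 0.0, 0.0
    while t < T:
        u = 1.0 if t <= T - 1.0 else 0.0
        val += (x - u) * h
        x += (-2.0 + 0.5 * u) * h
        t += h
    return val + x

Ts = np.arange(5.0, 15.0, 0.1)
best = Ts[np.argmax([J(T) for T in Ts])]
print(best)    # approximately 10, the optimal terminal time T*
```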

Next, we will apply the maximum principle to solve a well-known time-optimal control problem. It is one of the problems used by Pontryagin et al. (1962) to illustrate applications of the maximum principle. The problem also elucidates a specific instance of the synthesis of optimal controls.

By the synthesis of optimal controls , we mean the procedure of “patching” together various forms of the optimal controls obtained from the Hamiltonian maximizing condition . A simple example of the synthesis occurs in Example 2.5, where u = 1 when λ > 0, u = −1 when λ < 0, and the control is singular when λ = 0. An optimal trajectory starting at the given initial state variables is synthesized from these. In Example 2.5, this synthesized solution is u = −1 for 0 ≤ t < 1 and u = 0 for 1 ≤ t ≤ 2. Our next example requires a synthesis procedure which is more complex. In Chap. 5, both the cash management and equity financing models require such synthesis procedures.

Example 3.6

A Time-Optimal Control Problem . Consider a subway train of mass m moving horizontally along a smooth linear track with negligible friction. Let x(t) denote the position of the train, measured in miles from the origin called the main station, along the track at time t, measured in minutes. Then the equation of the train’s motion is governed by Newton’s Second Law of Motion, which states that force equals mass times acceleration. In mathematical terms, the equation of the motion is the second-order differential equation

$$\displaystyle{m\frac{d^{2}x(t)} {dt^{2}} = m\ddot{x}(t) = u(t),}$$

where u(t) denotes the external force applied to the train at time t and \(\ddot{x}(t)\) represents the acceleration in miles per minute per minute, or miles/minute2. This equation, along with

$$\displaystyle{x(0) = x_{0}\mbox{ and }\dot{x}(0) = y_{0},}$$

respectively, as the initial position of the train and its initial velocity in miles per minute, characterizes its motion completely.

For convenience in further exposition, we may assume m = 1 so that the equation of motion can be written as

$$\displaystyle{ \ddot{x} = u. }$$
(3.87)

Then, the force u can be expressed simply as acceleration or deceleration (i.e., negative acceleration) depending on whether u is positive or negative, respectively.

In order to develop the time-optimal control problem under consideration, we transform (3.87) into a system of two first-order differential equations (see Appendix A)

$$\displaystyle{ \left \{\begin{array}{ll} \dot{x} = y,&x(0) = x_{0}, \\ \dot{y} = u,&y(0) = y_{0},\end{array} \right. }$$
(3.88)

where y(t) denotes the velocity of the train in miles/minute at time t. 

Assume further that, for the comfort of the passengers, the maximum acceleration and deceleration are required to be at most 1 mile/minute2. Thus, the control variable constraint is

$$\displaystyle{ u \in \varOmega = [-1,1]. }$$
(3.89)

The problem is to find a control satisfying (3.89) such that the train stops at the main station located at x = 0 in a minimum possible time T. Of course, for the train to come to rest at x = 0 at time T, we must have x(T) = 0 and y(T) = 0. We have thus defined the following fixed-end-point optimal control problem:

$$\displaystyle{ \left \{\begin{array}{ll} \max \left \{J =\int _{ 0}^{T} - 1dt\right \} \\ \text{subject to} \\ \dot{x} = y,\;x(0) = x_{0},\;x(T) = 0, \\ \dot{y} = u,\;y(0) = y_{0},\;y(T) = 0, \\ \text{and the control constraint} \\ u \in \varOmega = [-1,1].\end{array} \right. }$$
(3.90)

Note that (3.90) is a fixed-end-point problem with unspecified terminal time. For this problem to be nontrivial, we must not have x_0 = y_0 = 0, i.e., we must have x_0 ≠ 0 or y_0 ≠ 0 (or both).

Solution Here we have only control constraints of the type treated in Chap. 2, and so we can use the maximum principle (2.31). The standard Hamiltonian function is

$$\displaystyle{H = -1 +\lambda _{1}y +\lambda _{2}u,}$$

where the adjoint variables λ 1 and λ 2 satisfy

$$\displaystyle{\dot{\lambda _{1}} = 0,\;\lambda _{1}(T) =\beta _{1}\mbox{ and }\dot{\lambda _{2}} = -\lambda _{1},\;\lambda _{2}(T) =\beta _{2},}$$

and β 1 and β 2 are constants to be determined in the case of a fixed-end-point problem ; see Table 3.1, Row 2. We can integrate these equations and write the solution in the form

$$\displaystyle{\lambda _{1} =\beta _{1}\mbox{ and }\lambda _{2} =\beta _{2} +\beta _{1}(T - t),}$$

where β 1 and β 2 are constants to be determined from the maximum principle (2.31), condition (3.15), and the specified initial and terminal values of the state variables. The Hamiltonian maximizing condition yields the form of the optimal control to be

$$\displaystyle{ u^{{\ast}}(t) = \mbox{ bang}\{ - 1,1;\;\beta _{ 2} +\beta _{1}(T - t)\}. }$$
(3.91)

As for the minimum time T^*, it is clearly zero if the train is initially at rest at the main station, i.e., (x_0, y_0) = (0, 0). In this case, the problem is trivial, u^*(0) = 0, and there is nothing further to solve. Otherwise, at least one of x_0 or y_0 is not zero, in which case the minimum time T^* > 0 and the transversality condition (3.15) applies. Since y(T) = 0 and S ≡ 0, we have

$$\displaystyle{H + S_{T}\mid _{T=T^{{\ast}}} =\lambda _{2}(T^{{\ast}})u^{{\ast}}(T^{{\ast}}) - 1 =\beta _{ 2}u^{{\ast}}(T^{{\ast}}) - 1 = 0,}$$

which together with the bang-bang control policy (3.91) implies either

$$\displaystyle{\lambda _{2}(T^{{\ast}}) =\beta _{ 2} = -1\mbox{ and }u^{{\ast}}(T^{{\ast}}) = -1,}$$

or

$$\displaystyle{\lambda _{2}(T^{{\ast}}) =\beta _{ 2} = +1\mbox{ and }u^{{\ast}}(T^{{\ast}}) = +1.}$$

Since the switching function β_2 + β_1(T − t) is a linear function of the time remaining, it can change sign at most once. Therefore, we have two cases: (i) u^*(τ) = −1 in the interval t_* ≤ τ ≤ T for some t_* ≥ 0; (ii) u^*(τ) = +1 in the interval t_* ≤ τ ≤ T for some t_* ≥ 0. We can integrate (3.88) in each of these cases as shown in Table 3.2. Also in the table we have the curves Γ^− and Γ^+, which are obtained by eliminating t from the expressions for x and y in each case. The parabolic curves Γ^− and Γ^+ are called switching curves and are shown in Fig. 3.2.

Table 3.2 State trajectories and switching curves
Figure 3.2

Minimum time optimal response for Example 3.6

It should be noted parenthetically that Fig. 3.2 is different from the figures we have seen thus far, where the abscissa represented the time dimension. In Fig. 3.2, the abscissa represents the train’s location and the ordinate represents the train’s velocity. Thus, the point (x_0, y_0) represents the vector of the train’s initial position and initial velocity. A trajectory of the train over time can be represented by a curve in this figure. For example, the bold-faced trajectory beginning at (x_0, y_0) represents a train that is moving in the positive direction and is slowing down. It passes through the main station located at the origin and comes to a momentary rest at the point that is x_0 + y_0^2∕2 miles to the right of the main station. At this location, the train reverses its direction and speeds up until it reaches the location x_* and attains the velocity y_*. At this point, it slows down gradually until it comes to rest at the main station. In the ensuing discussion we will show that this trajectory is in fact the minimal time trajectory beginning at the location x_0 at a velocity of y_0. We will furthermore obtain the control representing the optimal acceleration and deceleration along the way. Finally, we will obtain the various instants of interest, which are implicit in the depiction of the trajectory in Fig. 3.2.

We can put Γ + and Γ into a single switching curve Γ as

$$\displaystyle{ y =\varGamma (x) = \left \{\begin{array}{ll} \varGamma ^{+}(x) = -\sqrt{2x}, &x \geq 0, \\ \varGamma ^{-}(x) = +\sqrt{-2x},&x <0. \end{array} \right. }$$
(3.92)

If the initial state (x 0, y 0) ≠ (0, 0) lies on the switching curve, then we have u = +1 (resp., u = −1) if x 0 > 0 (resp., x 0 < 0); i.e., if (x 0, y 0) lies on \(\varGamma ^{+}\) (resp., \(\varGamma ^{-}\)). In the common parlance, this means that we apply the brakes to bring the train to a full stop at the main station. If the initial state (x 0, y 0) is not on the switching curve, then we choose, between u = 1 and u = −1, whichever moves the system toward the switching curve. By inspection, it is obvious that above the switching curve we must choose u = −1 and below we must choose u = +1.
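
In feedback form, this synthesis can be summarized as follows (a restatement of the rule just described, with Γ as in (3.92)):

$$\displaystyle{u^{{\ast}}(x,y) = \left \{\begin{array}{ll} -1,&y >\varGamma (x),\mbox{ or }y =\varGamma (x)\mbox{ with }x <0,\\ +1,&y <\varGamma (x),\mbox{ or }y =\varGamma (x)\mbox{ with }x> 0. \end{array} \right. }$$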

The other curves in Fig. 3.2 are solutions of the differential equations starting from initial points (x 0, y 0). If (x 0, y 0) lies above the switching curve Γ as shown in Fig. 3.2, we use u = −1 to compute the curve as follows:

$$\displaystyle{\dot{x} = y,\;x(0) = x_{0},}$$
$$\displaystyle{\dot{y} = -1,\;y(0) = y_{0}.}$$

Integrating these equations gives

$$\displaystyle{y = -t + y_{0},}$$
$$\displaystyle{x = -\frac{t^{2}} {2} + y_{0}t + x_{0}.}$$

Elimination of t between these two gives

$$\displaystyle{ x = \frac{y_{0}^{2} - y^{2}} {2} + x_{0}. }$$
(3.93)

This is the equation of the parabola in Fig. 3.2 through (x 0, y 0). The point of intersection of the curve (3.93) with the switching curve \(\varGamma ^{+}\) is obtained by solving (3.93) and the equation for \(\varGamma ^{+}\), namely \(2x = y^{2}\), simultaneously, which gives

$$\displaystyle{ x_{{\ast}} = \frac{y_{0}^{2} + 2x_{0}} {4},\;y_{{\ast}} = -\sqrt{(y_{0 }^{2 } + 2x_{0 } )/2}, }$$
(3.94)

where the minus sign in the expression for \(y_{{\ast}}\) in (3.94) was chosen since the intersection occurs when y is negative. The time \(t_{{\ast}}\) that it takes to reach the switching curve, called the switching time, given that we start above it, is

$$\displaystyle{ t_{{\ast}} = y_{0} - y_{{\ast}} = y_{0} + \sqrt{(y_{0 }^{2 } + 2x_{0 } )/2}. }$$
(3.95)

To find the minimum total time to go from the starting point (x 0, y 0) to the origin (0,0), we substitute \(t_{{\ast}}\) into the equation for \(\varGamma ^{+}\) in Column (ii) of Table 3.2; this gives

$$\displaystyle{ T^{{\ast}} = t_{ {\ast}}- y_{{\ast}} = y_{0} + \sqrt{2(y_{0 }^{2 } + 2x_{0 } )}. }$$
(3.96)

Here \(t_{{\ast}}\) is the time to get to the switching curve and \(-y_{{\ast}}\) is the time spent along the switching curve.

Note that the parabola (3.93) intersects the y-axis at the point \((0,+\sqrt{2x_{0 } + y_{0 }^{2}})\) and the x-axis at the point \((x_{0} + y_{0}^{2}/2,\;0)\). This means that for the initial position (x 0, y 0) depicted in Fig. 3.2, the train first passes the main station at the velocity of \(+\sqrt{2x_{0 } + y_{0 }^{2}}\) and comes to a momentary stop at the distance of \(x_{0} + y_{0}^{2}/2\) to the right of the main station. There it reverses its direction, comes to within the distance of \(x_{{\ast}}\) from the main station, and then switches to u = +1, which slows it to a complete stop at the main station at time \(T^{{\ast}}\) given by (3.96).

As a numerical example, start at the point (x 0, y 0) = (1,1). Then, the equation of the parabola (3.93) is

$$\displaystyle{2x = 3 - y^{2}.}$$

The switching point given by (3.94) is (\(3/4,-\sqrt{3/2}\)). Finally from (3.95), the switching time is \(t_{{\ast}} = 1 + \sqrt{3/2}\) min. Substituting into (3.96), we find the minimum time to stop is \(T^{{\ast}} = 1 + \sqrt{6}\) min.

To complete the solution of this example let us evaluate β 1 and β 2, which are needed to obtain λ 1 and λ 2. Since (1,1) is above the switching curve, the approach to the main station is on the curve \(\varGamma ^{+}\), and therefore, \(u^{{\ast}}(T^{{\ast}}) = 1\) and β 2 = 1. To compute β 1, we observe that \(\lambda _{2}(t_{{\ast}}) =\beta _{2} +\beta _{1}(T^{{\ast}}- t_{{\ast}}) = 0\) so that \(\beta _{1} = -\beta _{2}/(T^{{\ast}}- t_{{\ast}}) = -1/\sqrt{3/2} = -\sqrt{2/3}.\) Finally, we obtain \(x_{{\ast}} = 3/4\) and \(y_{{\ast}} = -\sqrt{3/2}\) from (3.94).

Let us now describe the optimal solution from (1, 1) in the common parlance. The position (1, 1) means the train is 1 mile to the right of the main station, moving away from it at the speed of 1 mile per minute. The control u = −1 means that the brakes are applied to slow the train down. This action brings the train to a momentary stop at a distance of 3/2 miles to the right of the main station. Moreover, the continuation of the control u = −1 means the train reverses its direction at that point and starts speeding toward the station. When it comes to within 3∕4 miles of the main station at time \(t_{{\ast}} = 1 + \sqrt{3/2},\) its velocity of \(-\sqrt{3/2}\), i.e., a speed of \(\sqrt{3/2}\) miles per minute toward the station, is too fast for it to come to rest at the main station without application of the brakes. So the control is switched to u = +1 at time \(t_{{\ast}}\), which means the brakes are applied at that time. This action brings the train to a complete stop at the main station at the time of \(T^{{\ast}} = 1 + \sqrt{6}\) min after the train left its initial position (1, 1).

In Exercises 3.19–3.22, you are asked to work other examples with different starting points above, below, and on the switching curve. Note that \(t_{{\ast}} = 0\) by definition if the starting point is on the switching curve.

3.6 Infinite Horizon and Stationarity

Thus far, we have studied problems whose horizon is finite or whose horizon length is a decision variable to be determined. In this section, we briefly discuss the case of T = ∞ in the problem (3.7), called the infinite horizon case. This case is especially important in many economics and management science problems. Our treatment of this case is largely heuristic, since a general theory of the necessary optimality conditions is not available. Nevertheless, we can rely upon an infinite-horizon extension of the sufficiency optimality conditions stated in Theorem 3.1.

When we put T = ∞ in (3.7) along with ρ > 0, we will generally get a nonstationary infinite horizon problem in the sense that the various functions involved depend explicitly on the time variable t. Such problems are extremely hard to solve. So, in this section we will devote our attention only to stationary infinite horizon problems, which do not depend explicitly on time t. Furthermore, it is reasonable in most cases to assume the salvage value S(x) ≡ 0 in infinite horizon problems. Moreover, in most economics and management science problems, the terminal constraints, if any, require the state variables to be nonnegative. Thus, to begin with, we consider the problem:

$$\displaystyle\begin{array}{rcl} \left \{\begin{array}{l} \max \left \{J =\int _{ 0}^{\infty }\phi (x,u)e^{-\rho t}dt\right \}, \\ \mbox{ subject to} \\ \dot{x} = f(x,u),\;x(0) = x_{0}, \\ g(x,u) \geq 0.\end{array} \right.& &{}\end{array}$$
(3.97)

This stationarity assumption means that the state equations, the current-value adjoint equations, and the current-value Hamiltonian in (3.35) are all explicitly independent of time t. 
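
To make the time-invariance explicit, write H(x, u, λ) = φ(x, u) + λf(x, u) for the current-value Hamiltonian and L = H + μg for the current-value Lagrangian, as in Sect. 3.3. Under the stationarity assumption, the conditions of the current-value maximum principle then take a form in which t does not appear explicitly (a sketch only; see (3.42) for the precise statement):

$$\displaystyle{\dot{x}^{{\ast}} = f(x^{{\ast}},u^{{\ast}}),\quad \dot{\lambda }=\rho \lambda -L_{x}(x^{{\ast}},u^{{\ast}},\lambda,\mu ),\quad L_{u}(x^{{\ast}},u^{{\ast}},\lambda,\mu ) = 0,\quad \mu \geq 0,\;\mu g(x^{{\ast}},u^{{\ast}}) = 0.}$$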

Remark 3.11

The concept of stationarity introduced here is different from the concept of autonomous systems introduced in Exercise 2.9. This is because, in the presence of discounting in (3.28), the stationarity assumption (3.97) does not give us an autonomous system as defined there. See Exercise 3.42 for further comparison between the two concepts.

When it comes to the transversality conditions in the infinite horizon case, the situation is somewhat more complicated. Even the economic argument for the finite horizon case fails to extend here, because we do not have a meaningful analogue of the salvage value function. Moreover, in the free-end-point case with no salvage value, the standard maximum principle (2.31) gives \(\lambda ^{pv}(T) = 0\), which is no longer necessary in general when T = ∞, as confirmed by a simple counter-example in Exercise 3.37. As a matter of fact, we have no general results giving conditions under which the limiting forms of the finite horizon transversality conditions are necessary. What is true is that the maximum principle (3.42) holds except for the transversality condition on λ(T).

When it comes to the sufficiency of the limiting transversality conditions obtained by letting T → ∞ in Theorem 3.1, the situation is much better. As a matter of fact, we can see from the inequality (2.73) with S(x) ≡ 0 that all we need is

$$\displaystyle{ \lim _{T\rightarrow \infty }\lambda ^{pv}(T)[x(T) - x^{{\ast}}(T)] =\lim _{ T\rightarrow \infty }e^{-\rho T}\lambda (T)[x(T) - x^{{\ast}}(T)] \geq 0 }$$
(3.98)

for Theorem 2.1, and therefore Theorem 3.1, to hold. See Seierstad and Sydsæter (1987) and Feichtinger and Hartl (1986) for further details.

In the important free-end-point case (3.97), since x(T) is arbitrary, (3.98) will imply

$$\displaystyle{ \lim _{T\rightarrow \infty }\lambda ^{pv}(T) =\lim _{ T\rightarrow \infty }e^{-\rho T}\lambda (T) = 0. }$$
(3.99)

While not a necessary condition as indicated earlier, it is interesting to note that (3.99) is the limiting version of the condition in Table 3.1, Row 1.

Another important case is that of nonnegativity constraints

$$\displaystyle{ \lim _{T\rightarrow \infty }x(T) \geq 0. }$$
(3.100)

Then, it is clear that the transversality conditions

$$\displaystyle{ \lim _{T\rightarrow \infty }e^{-\rho T}\lambda (T) \geq 0\mbox{ and }\lim _{ T\rightarrow \infty }e^{-\rho T}\lambda (T)x^{{\ast}}(T) = 0, }$$
(3.101)

imply (3.98). Note that these are also analogous to Table 3.1, Row 3.
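
To spell out why these, together with (3.100), imply (3.98): when the indicated limits exist,

$$\displaystyle{\lim _{T\rightarrow \infty }e^{-\rho T}\lambda (T)[x(T) - x^{{\ast}}(T)] =\lim _{T\rightarrow \infty }e^{-\rho T}\lambda (T)x(T) -\lim _{T\rightarrow \infty }e^{-\rho T}\lambda (T)x^{{\ast}}(T) \geq 0,}$$

since the first term on the right is a product of factors with nonnegative limits by (3.100) and the first part of (3.101), while the second term vanishes by the second part of (3.101).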

We leave it as Exercise 3.38 for you to show that the limiting versions of the conditions in the rightmost column of Rows 2, 3, and 4 in Table 3.1 imply (3.98). This would mean that Theorem 3.1 provides sufficient optimality conditions for the problem (3.97), except in the free-end-point case, i.e., when the terminal constraints a(x(T)) ≥ 0 and b(x(T)) = 0 are not present. Moreover, in the free-end-point case, we can use (3.98), or even (3.99) with some qualifications, as discussed earlier.

Example 3.7

Let us return to Example 3.3 and now assume that we have a perpetual charitable trust with initial fund W 0, which wants to maximize its total discounted utility of charities C(t) over time, subject to the terminal condition

$$\displaystyle{ \lim _{T\rightarrow \infty }W(T) \geq 0. }$$
(3.102)

For convenience we restate the problem:

$$\displaystyle{\max _{C(t)\geq 0}\left \{J =\int _{ 0}^{\infty }e^{-\rho t}\ln C(t)dt\right \}}$$

subject to

$$\displaystyle{ \dot{W} = rW - C,\;W(0) = W_{0}> 0, }$$
(3.103)

and (3.102).

Solution We already know from Example 3.3 with B = 0 that we are in case (i), and the optimal solution is given by (3.50) in Example 3.2. It seems reasonable to explore whether or not we can obtain an optimal solution for our infinite horizon problem by letting T → ∞ in (3.50). Furthermore, since the limiting version of the maximum principle (3.42) is sufficient for optimality in this case, all we need to do is to check if the limiting solution satisfies the condition

$$\displaystyle{ \lim _{T\rightarrow \infty }e^{-\rho T}\lambda (T) \geq 0\mbox{ and }\lim _{ T\rightarrow \infty }e^{-\rho T}\lambda (T)W^{{\ast}}(T) = 0. }$$
(3.104)

With T → ∞ in (3.50) and (3.52), we have

$$\displaystyle{ W^{{\ast}}(t) = e^{(r-\rho )t}W_{ 0},\;C^{{\ast}}(t) =\rho W^{{\ast}}(t),\;\lambda (t) = 1/\rho W^{{\ast}}(t). }$$
(3.105)

Since λ(t) ≥ 0 and \(\lambda (t)W^{{\ast}}(t) = 1/\rho\), it is clear that (3.104) holds. Thus, (3.105) gives the optimal solution. Using this solution in the objective function, we obtain

$$\displaystyle{ J^{{\ast}} = \frac{1} {\rho } \ln \rho W_{0} + \frac{r-\rho } {\rho ^{2}}, }$$
(3.106)

which we can verify to be the same as (3.51) as T → ∞.
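
For completeness, the limits in (3.104) can be read off directly from (3.105):

$$\displaystyle{e^{-\rho T}\lambda (T) = \frac{e^{-\rho T}} {\rho W^{{\ast}}(T)} = \frac{e^{-rT}} {\rho W_{0}} \rightarrow 0\quad \mbox{ and }\quad e^{-\rho T}\lambda (T)W^{{\ast}}(T) = \frac{e^{-\rho T}} {\rho } \rightarrow 0\;\;\mbox{ as }T \rightarrow \infty,}$$

since r > 0 and ρ > 0, so both parts of (3.104) hold.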

It is interesting to observe from (3.105) that the optimal consumption is increasing, constant, or decreasing if r is greater than, equal to, or less than ρ, respectively. Moreover, if ρ = r, then \(W^{{\ast}}(t) = W_{0}\), \(C^{{\ast}}(t) = rW_{0}\), and \(\lambda (t) = 1/rW_{0}\), which means that it is optimal to consume just the interest earned on the invested wealth—no more, no less—and, therefore, none of the initial wealth is ever consumed!

In the case of stationary systems, considerable attention is focused on equilibrium where all motion ceases, i.e., the values of x and λ for which \(\dot{x} = 0\) and \(\dot{\lambda }= 0.\) The notion is that of optimal long-run stationary equilibrium ; see Arrow and Kurz (1970, Chapter 2) and Carlson and Haurie (1987a, 1996). If an equilibrium exists, then it is defined by the quadruple \(\{\bar{x},\bar{u},\bar{\lambda },\bar{\mu }\}\) satisfying

$$\displaystyle{ f(\bar{x},\bar{u}) = 0,\;\rho \bar{\lambda }= L_{x}(\bar{x},\bar{u},\bar{\lambda },\bar{\mu }),\;L_{u}(\bar{x},\bar{u},\bar{\lambda },\bar{\mu }) = 0,\;g(\bar{x},\bar{u}) \geq 0,\;\bar{\mu }\geq 0,\;\bar{\mu }g(\bar{x},\bar{u}) = 0. }$$
(3.107)

Clearly, if the initial condition \(x_{0} =\bar{ x},\) the optimal control is \(u^{{\ast}}(t) =\bar{ u}\) for all t. If \(x_{0}\neq \bar{x},\) the optimal solution will have a transient phase. Moreover, depending on the problem, the equilibrium may be attained in a finite time or an approach to it may be asymptotic.

If the nonnegativity constraint (3.100) is added to problem (3.97), then we may include the requirement \(\bar{\lambda }\geq 0\) and \(\bar{\lambda }\bar{x} = 0\) in (3.107).

If the constraint involving g is not imposed in (3.97), \(\bar{\mu }\) may be dropped from the quadruple. In this case, the long-run stationary equilibrium is defined by the triple \(\{\bar{x},\;\bar{u},\;\bar{\lambda }\}\) satisfying

$$\displaystyle{ f(\bar{x},\;\bar{u}) = 0,\;\rho \bar{\lambda }= H_{x}(\bar{x},\;\bar{u},\;\bar{\lambda }),\mbox{ and }H_{u}(\bar{x},\;\bar{u},\;\bar{\lambda }) = 0. }$$
(3.108)

Also known in this case is that the optimal value of the objective function can be expressed as

$$\displaystyle{ J^{{\ast}} = H(x_{ 0},u^{{\ast}}(0),\lambda (0))/\rho. }$$
(3.109)

You are asked to prove this relation in Exercise 3.40. That it holds in Example 3.7 is quite clear when we use (3.105) in (3.109) and see that we get (3.106).
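
Indeed, the current-value Hamiltonian of Example 3.7 is H = ln C + λ(rW − C) (spelled out here as a quick check), so (3.105) gives

$$\displaystyle{H(W_{0},C^{{\ast}}(0),\lambda (0)) =\ln (\rho W_{0}) + \frac{1} {\rho W_{0}}\,(rW_{0} -\rho W_{0}) =\ln (\rho W_{0}) + \frac{r-\rho } {\rho },}$$

and dividing by ρ as in (3.109) reproduces (3.106).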

Also, we see from Example 3.7 that when we let t → ∞ in (3.105), we formally obtain

$$\displaystyle{ \begin{array}{c} (\bar{W},\bar{C},\bar{\lambda }) = \left \{\begin{array}{lcl} (0,0,\infty ) &\mbox{ if}&\rho> r, \\ (W_{0},\rho W_{0},1/\rho W_{0})&\mbox{ if}&\rho = r, \\ (\infty,\infty,0) &\mbox{ if}&\rho <r. \end{array} \right. \end{array} }$$
(3.110)

This is precisely the long-run stationary equilibrium that we will obtain if we apply (3.108) along with \(\bar{\lambda }\geq 0\) and \(\bar{\lambda }\bar{W} = 0\) directly to the optimal control problem in Example 3.7. This verification is left as Exercise 3.41.

Example 3.8

For another application of (3.108), let us return to Example 3.7 and now assume that the wealth W is invested in a productive activity resulting in an output rate ln W, and that the horizon T = ∞. Since ln W is defined only for W > 0, we do not need to impose the terminal constraint (3.102) here.

Thus, the problem is

$$\displaystyle{\max _{C(t)\geq 0}\left \{J =\int _{ 0}^{\infty }e^{-\rho t}\ln C(t)dt\right \}}$$

subject to

$$\displaystyle{ \dot{W} =\ln W - C,\;W(0) = W_{0}> 0, }$$
(3.111)

and our task is to find the long-run stationary equilibrium for it. Note that since the horizon is infinite, it is usual to assume no salvage value and no terminal conditions on the state.

Solution By (3.108) we set

$$\displaystyle{\ln \bar{W} -\bar{ C} = 0,\;\rho = 1/\bar{W},\;1/\bar{C}-\bar{\lambda } = 0,}$$

which gives the equilibrium \(\{\bar{W},\bar{C},\bar{\lambda }\}=\{ 1/\rho,-\ln \rho,-1/\ln \rho \}.\) Since 0 < ρ < 1, we have \(\bar{C}> 0,\) which satisfies the requirement that the consumption be nonnegative. Also, the equilibrium wealth \(\bar{W}> 0.\)
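
For instance, with a purely illustrative discount rate ρ = 0.05, the equilibrium is

$$\displaystyle{\bar{W} = 1/0.05 = 20,\quad \bar{C} = -\ln 0.05 =\ln 20 \approx 3.0,\quad \bar{\lambda }= -1/\ln 0.05 \approx 0.33,}$$

i.e., in the long run the trust holds a wealth of 20 and consumes exactly the output ln 20 produced by that wealth.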

It is important to note that the optimal long-run stationary equilibrium (which is also called the turnpike ) is not the same as the optimal steady-state among the set of all possible steady-states. The latter concept is termed the Golden Rule or Golden Path in economics, and a procedure to obtain it is described below. However, the two concepts are identical if the discount rate ρ = 0; see Exercise 3.43.

The Golden Path is obtained by setting \(\dot{x} = f(x,u) = 0,\) which provides the feedback control u(x) that would keep x(t) = x over time. Then, substitute u(x) in the integrand ϕ(x, u) of (3.28) to obtain ϕ(x, u(x)). The value of x that maximizes ϕ(x, u(x)) yields the Golden Path . Of course, all of the constraints imposed on the problem have to be respected when obtaining the Golden Path .
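
As an illustration of the difference (a standard one-sector growth example, not one of this book's applications), consider \(\dot{k} = k^{\alpha }-\delta k - c\) with 0 < α < 1, δ > 0, and integrand φ = U(c), where U′ > 0. Setting \(\dot{k} = 0\) gives the feedback \(c(k) = k^{\alpha }-\delta k\), and maximizing U(c(k)) over k determines the Golden Path capital stock, while (3.108) determines the turnpike:

$$\displaystyle{\mbox{ Golden Path: }\;\alpha k^{\alpha -1} =\delta,\qquad \mbox{ turnpike: }\;\alpha \bar{k}^{\alpha -1} =\delta +\rho,}$$

and the two coincide precisely when ρ = 0.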

In some cases, there may be more than one equilibrium defined by (3.107). If so, the equilibrium that is attained may depend on the initial starting point. Moreover, from some special starting points, the system may have the option of going to two or more different equilibria. Such points are called Sethi-Skiba points ; see Appendix D.8.

For multidimensional systems consisting of two or more states, optimal trajectories may exhibit more complex behaviors. Of particular importance is the concept of limit cycles. If the optimal trajectory of a dynamical system tends to spiral in toward a closed loop in the state space, then that closed loop is called a limit cycle. For more on this topic, refer to Vidyasagar (2002) and Grass et al. (2008) .

3.7 Model Types

Optimal control theory has been used to solve problems occurring in engineering, economics, management science, and other fields. In each field of application, certain general kinds of models which we will call model types are likely to occur, and each such model requires a specialized form of the maximum principle . In Chap. 2 we derived, in considerable detail, a simple form of the continuous-time maximum principle. However, to continue to provide such details for each different version of the maximum principle needed in later chapters of this book would be both repetitive and lengthy.

The purpose of this section is to avoid such repetition by listing most of the different management science model types that we will use in later chapters. For each model type, we will give a brief description of the corresponding objective function, state equations, control and state inequality constraints, terminal conditions, adjoint equations, and the form of the optimal control policy. We will also indicate where each of these model types is applied in later chapters.

The reader may wish to skim this section on first reading to get an idea of what it contains, work a few of the exercises, and go on to the various functional areas discussed in later chapters. Then, when specific model types are encountered, the reader may return to read the relevant parts of this section in more detail.

We are now able to state the general forms of all the models (with one or two exceptions) that we will use to analyze the applications discussed in the rest of the book. Some other model types will be explained in later chapters.

In Table 3.3 we have listed six different combinations of ϕ and f functions. If we specify the initial value x 0 of the state variable x and the constraints on the control and state variables, we can get a completely specified optimal control model by selecting one of the model types in Table 3.3 together with one of the terminal conditions given in Table 3.1.

Table 3.3 Objective, state, and adjoint equations for various model types

The reader will see numerous examples of the uses of Tables 3.1 and 3.3 when we construct optimal control models of various applied situations in later chapters. To help in understanding these, we will give a brief mathematical discussion of the six model types in Table 3.3, with an indication of where each model type will be used later in the book.

In Model Type (a) of Table 3.3 we see that both ϕ and f are linear functions of their arguments. Hence it is called the linear-linear case. The Hamiltonian is

$$\displaystyle\begin{array}{rcl} H& =& Cx + Du +\lambda (Ax + Bu + d) \\ & =& Cx +\lambda Ax +\lambda d + (D +\lambda B)u.{}\end{array}$$
(3.112)

From (3.112) it is obvious that the optimal policy is bang-bang with the switching function (D + λB). Since the adjoint equation is independent of both control and state variables, it can be solved completely without resorting to two-point boundary value methods. Examples of (a) occur in the cash balance problem of Sect. 5.1.1 and the maintenance and replacement model of Sect. 9.1.1.
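
For instance, if the control constraint happens to be a simple box \(\underline{u}\leq u\leq \bar{u}\) (an assumption made here only for illustration; the actual constraint set Ω is model-specific), then the Hamiltonian-maximizing control implied by (3.112) is, componentwise,

$$\displaystyle{u^{{\ast}}(t) = \mbox{ bang}\{\underline{u},\bar{u};\;D +\lambda (t)B\},}$$

in the notation of (3.91).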

Model Type (b) of Table 3.3 is the same as Model Type (a) except that the function C(x) is nonlinear. Thus, the term \(C_{x}\) appears in the adjoint equation, and two-point boundary value methods are needed to solve the problem. Here, there is a possibility of singular control , and a specific example is the Nerlove-Arrow model in Sect. 7.1.1.

Model Type (c) of Table 3.3 has linear functions in the state equation and quadratic functions in the objective function. Therefore, it is sometimes called the linear-quadratic case . In this case, the optimal control can be expressed in a form in which the state variables enter linearly. Such a form is known as the linear decision rule; see (D.36) in Appendix D. A specific example of this case occurs in the production-inventory example of Sect. 6.1.1.

Model Type (d) is a more general version of Model Type (b) in which the state equation is nonlinear in x. Here again, there is a possibility of singular control. The wheat trading model of Sect. 6.2.1 illustrates this model type. The solution of a special case of the model in Sect. 6.2.3 exhibits the occurrence of a singular control.

In Model Types (e) and (f), the functions are scalar functions, and there is only one state equation, so λ is also a scalar function. In these cases, the Hamiltonian function is nonlinear in u. If it is concave in u, then the optimal control is usually obtained by setting \(H_{u} = 0\). If it is convex, then the optimal control is the same as in Model Type (b).

Several examples of Model Type (e) occur in this book: the optimal financing model in Sect. 5.2.1, the Vidale-Wolfe advertising model in Sect. 7.2.1, the nonlinear extension of the maintenance and replacement model in Sect. 9.1.4, the forestry model in Sect. 10.2.1, the exhaustible resource model in Sect. 10.3.1, and all of the models in Chap. 11. Model Type (f) examples are: the Kamien-Schwartz model in Sect. 9.2.1 and the sole-owner fishery resource model in Sect. 10.1.

Although the general forms of the model are specified in Tables 3.1 and 3.3, there are a number of additional modeling tricks that are useful, which will be employed later. We collect these as a series of remarks below.

Remark 3.12

We sometimes need to use the absolute value function | u | of a control variable u in forming the functions ϕ or f. For example, in the simple cash balance model of Sect. 5.1, u < 0 represents buying and u > 0 represents selling; in either case there is a transaction cost which can be represented as c | u |. In order to handle this, we define new control variables u 1 and u 2 satisfying the following relations:

$$\displaystyle{ u:= u_{1} - u_{2},\;u_{1} \geq 0,\;u_{2} \geq 0, }$$
(3.113)
$$\displaystyle{ u_{1}u_{2} = 0. }$$
(3.114)

Thus, we represent u as the difference of two nonnegative variables, u 1 and u 2, together with the quadratic constraint (3.114). We can then write

$$\displaystyle{ \vert u\vert = u_{1} + u_{2}, }$$
(3.115)

which expresses the nonlinear function | u | as a linear function with the constraint (3.114).

We now observe that we need not impose (3.114) explicitly, provided there are costs associated with the controls u 1 and u 2, since in the presence of these costs no optimal policy would ever choose to make both of them simultaneously positive. This is indeed the case in the cash balance problem of Sect. 5.1, where the associated transaction costs prevent us from simultaneously buying and selling the same security.

Thus, by doubling the number of variables and adding inequality constraints, we are able to represent | u | as a linear function in the model.
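
Schematically, a transaction-cost term −c|u| in the integrand of a problem whose state equation is linear in u becomes

$$\displaystyle{\phi = \cdots - c(u_{1} + u_{2}),\qquad \dot{x} = \cdots + u_{1} - u_{2},\qquad u_{1} \geq 0,\;u_{2} \geq 0,}$$

which is linear in (u 1, u 2); as observed above, (3.114) may be dropped because the cost term makes it suboptimal to have u 1 and u 2 positive simultaneously.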

Remark 3.13

Tables 3.1 and 3.3 are constructed for continuous-time models. Exactly the same kinds of models can be developed in the discrete-time case; see Chap. 8

Remark 3.14

Consider Model Types (a) and (b) when the control variable constraints are defined by linear inequalities of the form

$$\displaystyle{ g(u,t) = g(t)u \geq 0. }$$
(3.116)

Then, the problem of maximizing the Hamiltonian function becomes:

$$\displaystyle{ \left \{\begin{array}{l} \max (D +\lambda B)u \\ \mbox{ subject to } \\ g(t)u \geq 0.\end{array} \right. }$$
(3.117)

This is clearly a linear programming problem for each given instant of time t, since the Hamiltonian function is linear in u. 

Further in Model Type (a), the adjoint equation does not contain terms in x and u, so we can solve it for λ(t), and hence the objective function of (3.117) varies parametrically with λ(t). In this case we can use parametric linear programming techniques to solve the problem over time. Since the optimal solution to the linear program always occurs at an extreme point of the convex set defined by g(t)u ≥ 0, it follows that as λ(t) changes, the optimal solution to (3.117) will “bang” from one extreme point of the feasible set to another. This is called a generalized bang-bang optimal policy . Such a policy occurs, e.g., in the optimal financing model treated in Sect. 5.2; see Table 5.1, Row 5.
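
The following sketch illustrates the idea numerically; the switching-function coefficients and the feasible polytope are hypothetical (made up for illustration), and scipy's general-purpose LP solver merely stands in for the parametric technique. As the coefficient vector D + λ(t)B changes with t, the maximizer jumps from one extreme point of the feasible set to another, i.e., a generalized bang-bang policy.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical switching-function coefficients D + lambda(t)B at a few instants;
# in an actual Model Type (a) problem these come from the solved adjoint lambda(t).
def coeffs(t):
    return np.array([1.0 - t, t - 0.4])

# An assumed polytope standing in for g(t)u >= 0:  u1, u2 >= 0,  u1 + u2 <= 1.
A_ub, b_ub = np.array([[1.0, 1.0]]), np.array([1.0])

for t in (0.0, 0.5, 1.0):
    c = coeffs(t)
    # linprog minimizes, so negate c to maximize (D + lambda B) u over the polytope.
    res = linprog(-c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
    print(f"t = {t:.1f}: Hamiltonian-maximizing vertex u* = {res.x}")
```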

In Model Type (b), the adjoint equation contains terms in x, so we cannot solve for the trajectory of λ(t) without knowing the trajectory of x(t). It is still true that (3.117) is a linear program for any given t, but the parametric linear programming techniques will not usually work. Instead, some type of iterative procedure is needed in general; see Bryson and Ho (1975).

Remark 3.15

The salvage value part S[x(T), T] of the objective function is relevant in the optimization context in the following two cases:

Case (i) T is free and part of the problem is to determine the optimal terminal time ; see, e.g., Sect. 9.1.

Case (ii) T is fixed and the problem is that of maximizing the objective function involving the salvage value of the ending state x(T), which in this case can be written simply as S[x(T)]. 

For the fixed-end-point problem and for the infinite horizon problem, it does not usually make much sense to define a salvage value function.

Remark 3.16

One important model type that we did not include in Table 3.3 is the impulse control model of Bensoussan and Lions (1975) . In this model, an infinite control is instantaneously exerted on a state variable in order to cause a finite jump in its value. This model is particularly appropriate for the instantaneous reordering of inventory as required in lot-size models; see Bensoussan et al. (1974) . Further discussion of impulse control is given in Sect. D.9.

Exercises for Chapter  3

E 3.1

Consider the constraint set

$$\displaystyle{\varOmega =\{ (u_{1},u_{2})\vert 0 \leq u_{1} \leq x,\;-1 \leq u_{2} \leq u_{1}\}.}$$

Write these in the form shown in (3.3).

E 3.2

Find the reachable set X, defined in Sect. 3.1, if x and u satisfy

$$\displaystyle{\dot{x} = u - 1,\;x_{0} = 5,\;-1 \leq u \leq 1,}$$

and T = 3. 

E 3.3

Assume the constraint (3.3) to be of the form g(u, t) ≥ 0, i.e., g does not contain x explicitly, and assume x(T) is free. Apply the Lagrangian form of the maximum principle and derive the Hamiltonian form (2.31) with

$$\displaystyle{\varOmega (t) =\{ u\vert g(u,t) \geq 0\}.}$$

Assume g(u, t) to be of the form αuβ. 

E 3.4

Use the Lagrangian form of the maximum principle to obtain the optimal control for the following problem:

$$\displaystyle{\max \{J = x_{1}(2)\}}$$

subject to

$$\displaystyle{\left.\begin{array}{ll} \dot{x}_{1}(t) = u_{1} - u_{2},&x_{1}(0) = 2, \\ \dot{x}_{2}(t) = u_{2}, &x_{2}(0) = 1, \end{array} \right.}$$

and the constraints

$$\displaystyle{u_{1}(t) \geq u_{2}(t),\;0 \leq u_{1}(t) \leq x_{2}(t),\;0 \leq u_{2}(t) \leq 2,\;0 \leq t \leq 2.}$$

An interpretation of this problem is that x 1(t) is the stock of steel at time t and x 2(t) is the total capacity of the steel mill at time t. Production of steel at rate u 1, which is bounded by the current steel mill capacity, can be split into u 2 and u 1 − u 2, where u 2 goes into increasing the steel mill capacity and u 1 − u 2 adds to the stock of steel. The objective is to build as large a stockpile of steel as possible by time T = 2. With this interpretation, we clearly need to have x 1(t) ≥ 0 and x 2(t) ≥ 0. However, it is easily seen that these constraints are automatically satisfied for every feasible solution of the problem. You may find it interesting to show why this is true. (It is possible to make the problem more interesting by assuming an exogenous demand d for steel so that \(\dot{x}_{1} = u_{1} - u_{2} - d.\))

E 3.5

Specialize the terminal condition (3.13) in the one-dimensional case (i.e., n = 1) with \(Y (T) = Y = [\underline{x},\bar{x}]\) for each T > 0, where \(\underline{x}\) and \(\bar{x}\) are two constants satisfying \(\bar{x}>\underline{ x}.\) Use (3.12) to derive (3.14).

E 3.6

Obtain the optimal value \(J^{{\ast}}(T)\) of the objective function for Example 3.5 for a given terminal time T, and then maximize it with respect to T by using the condition \(dJ^{{\ast}}(T)/dT = 0\). Show that you get the same optimal \(T^{{\ast}}\) as the one obtained for Example 3.5 by using (3.77).

E 3.7

Check that the solution of Example 3.1 satisfies the sufficiency conditions in Theorem 3.1.

E 3.8

Starting from (3.15), obtain the current-value version (3.44) for the problem defined by (3.27) and (3.28). Show further that if we were to require the function ψ to also depend on T, i.e., if \(S(x,T) =\psi (x,T)e^{-\rho T}\), then the left-hand side of condition (3.44) would be modified to \(H[x^{{\ast}}(T^{{\ast}}),u^{{\ast}}(T^{{\ast}}),\lambda (T^{{\ast}}),T^{{\ast}}] +\psi _{T}[x^{{\ast}}(T^{{\ast}}),T^{{\ast}}] -\rho \psi [x^{{\ast}}(T^{{\ast}}),T^{{\ast}}]\).

E 3.9

Develop the current-value formulation of Sect. 3.3 for a time-varying nonnegative discount rate ρ(t), by replacing the factors \(e^{-\rho t}\) and \(e^{-\rho T}\) in (3.28), respectively, by

$$\displaystyle{\alpha (t) = e^{-\int _{0}^{t}\rho (s)ds }\mbox{ and }\alpha (T) = e^{-\int _{0}^{T}\rho (s)ds }.}$$

E 3.10

Begin with (3.54) and perform the steps leading to (3.55).

E 3.11

Optimal Consumption of An Initial Investment Over a Finite Horizon. Begin with an initial investment of x0. Assets x(t) at time t earn at the rate of r per dollar per unit time. A portion of the earnings is consumed at a rate of c(t) per unit time at time t, while the remainder is invested. Neither a negative consumption rate nor a consumption rate exceeding the earnings is allowed. Assets depreciate at the constant rate δ. Assume r > δ + ρ, where ρ is the discount rate applied on consumption. Find the optimal consumption rate over a finite horizon T such that the present value of the consumption stream over the finite horizon is maximized. Assume that T is sufficiently large. Let us note that the optimal capital accumulation model treated in Sect. 11.1.1 represents a generalization of this problem.

E 3.12

Show that if we require W(T) = ɛ > 0, ɛ small, instead of W(T) = 0 in Example 3.2, then the optimal value of the objective function will decrease by an amount \(\beta \varepsilon =\varepsilon (1 - e^{-rT})/rW_{0} + o(\varepsilon )\).

E 3.13

Recall Exercise 2.18 of the leaky reservoir in Chap. 2. In this problem there was no explicit constraint on the total amount of water available. Suppose we impose the following isoperimetric constraint on that problem:

$$\displaystyle{\int _{0}^{100}udt = K,}$$

where K > 0 is the total amount of water which must be used. Assume also that the reservoir has infinite capacity. Re-solve this problem for various values of K and the objective functions in parts (a) and (b) of Exercise 2.18.

E 3.14

From the transversality conditions for the general terminal constraints in Row 5 of Table 3.1, derive the transversality conditions in Row 1 for the free-end-point case, in Row 2 for the fixed-end-point case, and in Rows 3 and 4 for the one-sided constraint cases. Assume ψ(x) = 0, i.e., there is no salvage value and X = E 1 for simplicity.

E 3.15

For solving Example 3.3, consider case (ii) by starting with t = 2, and show that the maximum principle will not be satisfied in this case.

E 3.16

Rework Example 3.4 with T = 4 and the following different terminal conditions:

  1. (a)

    x(4) unconstrained,

  2. (b)

    x(4) = 1, 

  3. (c)

    x(4) ≤ 1, 

  4. (d)

    x(4) ≥ 1. 

E 3.17

Rework Example 3.4 with the terminal condition (3.70) replaced by x(2) ≥ ɛ, where ɛ is small. Verify that the change in the optimal value of the objective function is −ɛ∕2 ≈ −αɛ + o(ɛ), as stipulated in Remark 3.6.

E 3.18

Introduce a terminal value in Example 3.4 as follows:

$$\displaystyle{\max \left \{J =\int _{ 0}^{2}(-x)dt + Bx(2)\right \}}$$

subject to

$$\displaystyle{\dot{x} = u,\;x(0) = 1,}$$
$$\displaystyle{x(2) \geq 0,\mbox{ i.e., }Y = [0,\infty )\mbox{ in Table 3.1, Row 3,}}$$
$$\displaystyle{-1 \leq u \leq 1.}$$

Note that for B = 0, the problem is the same as Example 3.4. Solve this problem for B = 1∕2, 1, 3/2, 2, 3. Conclude that for B ≥ 2, the solution for the state variable does not change.

E 3.19

In Example 3.6, determine the optimal control and the corresponding state trajectory starting at the point (-4,6), which lies above the switching curve.

E 3.20

Carry out the synthesis of the optimal control for Example 3.6 when the starting point (x 0, y 0) lies below the switching curve.

E 3.21

Use the results of Exercise 3.20 to find the optimal control and the corresponding trajectory starting at the point \((-1,-1).\)

E 3.22

Find the optimal control, the minimum time, and the corresponding trajectory for Example 3.6 starting at the point \((-2,2),\) which lies on the switching curve.

E 3.23

What is the shortest time in which a passenger can be transported in a ballistic missile from Los Angeles to New York? Assume that a missile with the ultimate mechanical and thermodynamical properties is available, but that the passenger imposes the restraint that the maximum acceleration or deceleration is 100 ft/s2. The missile starts from rest in Los Angeles and stops in New York. Assume that the path is a straight line of length 2400 miles and ignore the rotation and curvature of the earth.

E 3.24

In the time-optimal control problem (3.90), replace the state equations by

$$\displaystyle{\dot{x} = ay,\;x(0) = x_{0} \geq 0,\;x(T) =\bar{ x}> x_{0},}$$
$$\displaystyle{\dot{y} = u,\;\;y(0) = y_{0} \geq 0,\;y(T) = 0,}$$

and the control constraint by

$$\displaystyle{u \in \varOmega = [U_{\min },U_{\max }].}$$

Assume a > 0 and U max > 0 > U min. Observe here that x(t) could be interpreted as the cumulative value of gold mined by a gold-producing country and y(t) could be interpreted as the total value of gold-mining machinery employed by the country at time t ≥ 0. The required machinery is to be imported. Because of some inertia in the world market for the machinery, the country cannot control y(t) directly, but is able to control its rate of change \(\dot{y}(t).\) Thus u(t) represents at time t, the import rate of the machinery when positive and the export rate when negative. The terminal value \(\bar{x}\) represents the required amount of gold to be produced in a minimum possible time. Obtain the optimal solution.

E 3.25

Solve the following minimum weighted energy and time problem:

$$\displaystyle{\max _{u,T}\left \{J =\int _{ 0}^{T} - (\frac{1} {2})(u^{2} + 1)dt\right \}}$$

subject to

$$\displaystyle{\dot{x} = u,\;x(0) = 5,\;x(T) = 0,}$$

and the control constraint

$$\displaystyle{\vert u\vert \leq 2.}$$

Hint. Use (3.77) to determine T , the optimal value of T. 

E 3.26

Rework Exercise 3.25 with the new integrand F = −(1∕2)(u 2 + 16) in the objective function.

Hint: Note that use of (3.77) gives an infeasible u. This means that we should look for a boundary solution for u. To obtain this, calculate J (T) as defined in Exercise 3.6, and then choose T to maximize it. In doing so, take care to see that x(T) = 0, and the control constraint is satisfied.

E 3.27

Exercise 3.26 becomes a minimum energy problem if we set F = −u 2∕2. Show that the Hamiltonian maximizing condition of the maximum principle implies u = k, where k is a constant. Note that the application of (3.77) implies that k = 0, which gives x(t) = 5 for all t ≥ 0 so that the terminal condition x(T) = 0 cannot be satisfied.

To see that there exists no optimal control in this situation, let k < 0 and compute J. It is now possible to see that \(\lim _{k\rightarrow 0}J = 0\). This means that we can make the objective function value as close to zero as we wish, but not equal to zero. Note that in this case there are no feasible solutions satisfying the necessary conditions so we cannot check the sufficiency conditions; see the last paragraph of Sect. 2.1.4.

E 3.28

Show that every feasible control of the problem

$$\displaystyle{\max _{T,u}\left \{J =\int _{ 0}^{T} - udt\right \}}$$

subject to

$$\displaystyle{\dot{x} = u,\;x(0) = x_{0},\;x(T) = 0,}$$
$$\displaystyle{\vert u\vert \leq q,\mbox{ where }q> 0,}$$

is an optimal control.

E 3.29

Let x 0 > 0 be the initial velocity of a rocket. Let u be the amount of acceleration (or deceleration) caused by applying a force which consumes fuel at the rate | u |. We want to bring the rocket to rest using minimum total amount of fuel. Hence, we have the following optimal control problem:

$$\displaystyle{\max _{T,u}\left \{J =\int _{ 0}^{T} -\vert u\vert dt\right \}}$$

subject to

$$\displaystyle{\dot{x} = u,\;x(0) = x_{0},\;x(T) = 0,}$$
$$\displaystyle{-1 \leq u \leq +1.}$$

Hint: Use (3.113)–(3.115) to deal with | u |. Show that for x 0 > 0, say x 0 = 5, every feasible control is optimal.

E 3.30

Analyze Exercise 3.29 with the state equation

$$\displaystyle{\dot{x} = -ax + u,}$$

where a > 0. Show that no optimal control exists for the problem.

E 3.31

By using the maximum principle , show that the problem

$$\displaystyle{\left \{\begin{array}{ll} \max \int _{0}^{1}xdt \\ \mbox{ subject to } \\ \dot{x} = x + u,\;x(0) = 0, \\ 1 - u \geq 0,\;1 + u \geq 0,\;2 - x - u \geq 0,\end{array} \right.}$$

has the optimal control

$$\displaystyle{u^{{\ast}}(t) = \left \{\begin{array}{ll} 1, &t \in [0,\ln 2], \\ 1 + 2\mbox{ ln}2 - 2t,&t \in (\ln 2,1]. \end{array} \right.}$$

Also, provide the values of the state variable, the adjoint variable, and the Lagrange multipliers along the optimal path.

E 3.32

If, in Exercise 3.31, we perturb the constraint 2 − x − u ≥ 0 to 2 − x − u ≥ ɛ, where ɛ is small, then show that the change in value of the objective function equals

$$\displaystyle{\varepsilon \int _{0}^{1}\mu _{ 3}dt + o(\varepsilon ),}$$

where μ 3 is the Lagrange multiplier associated with the constraint 2 − x − u ≥ 0 in Exercise 3.31. Moreover, if ɛ < 0, implying that we are relaxing the constraint, then verify that the change in the objective function is positive.

E 3.33

Obtain the value function V (x, t) explicitly in Exercise 3.31 for every xE 1 and t ∈ [0, 1]. Furthermore, verify that λ(t) = V x(x (t), t),   t ∈ [0, 1], where λ(t) is the adjoint variable obtained in the solution of Exercise 3.31.

E 3.34

Solve the problem:

$$\displaystyle{\max _{u,T}\left \{J =\int _{ 0}^{T}[-2 + (1 - u(t))x(t)]dt\right \}}$$

subject to

$$\displaystyle{\dot{x} = u,\;x(0) = 0,\;x(T) \geq 1,}$$
$$\displaystyle{u \in [0,1],}$$
$$\displaystyle{T \in [1,8].}$$

Hint: First, show that \(u^{{\ast}} = \mbox{bang}[0,1;\;\lambda -x]\) and that the control can switch at most once from 1 to 0. Then, let \(t_{{\ast}}(T)\) denote that switching time, if any, for a given T ∈ [1, 8]. Consider three cases: (i) T = 1, (ii) 1 < T < 8, and (iii) T = 8. Note that \(\lambda (t_{{\ast}}(T)) - x(t_{{\ast}}(T)) = 0\). Use (3.15) in case (ii). Find the optimal solution in each of the three cases. The best of these solutions will be the solution of the problem.

E 3.35

Consider the problem:

$$\displaystyle{\max _{u,T}\left \{J =\int _{ 0}^{T}[-3 - u(t) + x(t)]dt\right \}}$$

subject to

$$\displaystyle{\dot{x} = u,\;x(0) = 0,\;x(T) \geq 1,}$$
$$\displaystyle{u \in [0,1],}$$
$$\displaystyle{T \in [1,4 + 2\sqrt{2}].}$$

The problem has two different optimal solutions with different values for optimal T . Find both of these solutions.

E 3.36

Perform the following:

  1. (a)

    Find the optimal consumption rate C (t),  t ∈ [0, T], in the problem:

    $$\displaystyle{\max \left \{J =\int _{ 0}^{T}e^{-\rho t}\ln C(t)dt\right \}}$$

    subject to

    $$\displaystyle{\dot{W}(t) = -C(t),W(0) = W_{0},}$$

    where T is given and ρ > 0. 

  2. (b)

    Assume that T is not given in (a), and is to be chosen optimally. Show for this free terminal time version that the optimal T decreases as the discount rate ρ increases.

    Hint: It is possible to obtain \(dT^{{\ast}}/d\rho\) by implicit differentiation.

E 3.37

An example, which illustrates that

$$\displaystyle{\lim _{t\rightarrow \infty }\lambda (t) = 0}$$

is not a necessary transversality condition in general, is:

$$\displaystyle{\max \left \{J =\int _{ 0}^{\infty }(1 - x)udt\right \}}$$

such that

$$\displaystyle{\dot{x} = (1 - x)u,\,x(0) = 0,}$$
$$\displaystyle{0 \leq u \leq 1.}$$

Show this by finding an optimal control.

E 3.38

Show that the limiting conditions in the rightmost column of Rows 2, 3, and 4 in Table 3.1 imply (3.98) when T → ∞.

E 3.39

Consider the regulator problem defined by the scalar equation

$$\displaystyle{\dot{x} = u,\,x(0) = x_{0},}$$

with the objective function

$$\displaystyle{J = -\int _{0}^{\infty }\left (\frac{x^{4}} {4} + \frac{u^{2}} {2} \right )dt.}$$
  1. (a)

    Show that the long-term stationary equilibrium \((\bar{x},\bar{u},\bar{\lambda }) = (0,0,0),\) and conclude that in feedback form \(u^{{\ast}}(x) =\bar{ u} = 0\) when \(x =\bar{ x} = 0.\)

  2. (b)

    By using the maximum principle and the relation \(\dot{u}^{{\ast}} = \frac{du^{{\ast}}(x)} {dx} \dot{x},\) derive a differential equation for the optimal feedback control u (x) and solve it with the boundary condition u (0) = 0 to obtain

    $$\displaystyle{u^{{\ast}}(x) = \left \{\begin{array}{cc} - x^{2}/\sqrt{2},&x> 0, \\ 0, &x = 0, \\ + x^{2}/\sqrt{2},&x <0.\end{array} \right.}$$
  3. (c)

    Solve for \(x^{{\ast}}(t)\) and λ(t) and show that \(\lim _{t\rightarrow \infty }x^{{\ast}}(t) = 0\) and that the limiting condition (3.99), i.e., \(\lim _{t\rightarrow \infty }\lambda (t) = 0\), holds for this problem.

E 3.40

Show that for the problem (3.97) without the constraint g(x, u) ≥ 0, the optimal value of the objective function

$$\displaystyle{J^{{\ast}} = H(x_{ 0},u^{{\ast}}(0),\lambda (0))/\rho.}$$

See Grass et al. (2008) .

E 3.41

Apply (3.108), along with the requirement \(\bar{\lambda }\geq 0\) and \(\bar{\lambda }\bar{W} = 0\) in view of the constraint (3.102), to Example 3.7 to verify that the long-run stationary equilibrium is as shown in (3.110).

E 3.42

For a stationary system as defined in Sect. 3.6, show that

$$\displaystyle{\frac{dH} {dt} =\rho \lambda f(x^{{\ast}}(t),u^{{\ast}}(t))}$$

and

$$\displaystyle{\frac{dH^{pv}} {dt} = -\rho e^{-\rho t}\phi (x^{{\ast}}(t),u^{{\ast}}(t))}$$

along the optimal path. Also, contrast these results with that of Exercise 2.9.

E 3.43

Consider the inventory problem:

$$\displaystyle{\max \left \{J =\int _{ 0}^{\infty }- e^{-\rho t}[(I - I_{ 1})^{2} + (P - P_{ 1})^{2}]dt\right \}}$$

subject to

$$\displaystyle{\dot{I} = P - S,\;I(0) = I_{0},}$$

where I denotes inventory level, P denotes production rate, and S denotes a given constant demand rate.

  1. (a)

    Find the optimal long-run stationary equilibrium, i.e., the turnpike defined in (3.107).

  2. (b)

    Find the Golden Rule by setting \(\dot{I} = 0\) in the state equation, solve for P, and substitute it into the integrand of the objective function. Then, maximize the integrand with respect to I. 

  3. (c)

    Verify that the Golden Rule inventory level obtained in (b) is the same as the turnpike inventory level found in (a) when ρ = 0.