The Maximum Principle: Continuous Time

Sethi, Suresh P.

doi:10.1007/978-3-319-98237-3_2



Abstract

The main purpose of this chapter is to introduce the maximum principle as a necessary condition that must be satisfied by any optimal control for the basic problem specified in Sect. 2.1. Although vector notation is used, the reader can consider the problem as one with only a single state variable and a single control variable on the first reading. In Sect. 2.2, the method of dynamic programming is used to derive the maximum principle. We use this method because of the simplicity and familiarity of the dynamic programming concept. The derivation also yields significant economic interpretations. In Appendix C, the maximum principle is also derived by using a more general method similar to that of Pontryagin et al. (1962), but with certain simplifications. In Sect. 2.3, we apply the maximum principle to solve a number of simple, but illustrative, examples. In Sect. 2.4, the maximum principle is shown to be sufficient for optimal control under an appropriate concavity condition, which holds in many management science applications. Finally, Sect. 2.5 illustrates the use of Excel spreadsheet software to solve an optimal control problem.



2.1 Statement of the Problem

Optimal control theory deals with the problem of optimizing dynamic systems. The problem must be well posed before any solution can be attempted. This requires a clear mathematical description of the system to be optimized, the constraints imposed on the system, and the objective function to be maximized (or minimized).

2.1.1 The Mathematical Model

An important part of any control problem is the process of modeling the dynamic system under consideration, be it physical, business, or otherwise. The aim is to arrive at a mathematical description which is simple enough to deal with, and realistic enough to be able to predict the response of the system to any given input. Our model is restricted to systems that can be characterized by a set of ordinary differential equations (or ordinary difference equations in the discrete-time case treated in Chap. 8). Thus, given the initial state x₀ of the system and control history u(t), t ∈ [0, T], of the process, the evolution of the system may be described by the first-order differential equation, known also as the state equation,

$$\displaystyle{ \dot{x}(t) = f(x(t),u(t),t),\quad x(0) = x_{0}, }$$

(2.1)

where x(t) ∈ Eⁿ is the vector of state variables, u(t) ∈ E^m is the vector of control variables, and f: Eⁿ × E^m × E¹ → Eⁿ. Furthermore, the function f is assumed to be continuously differentiable. Here we assume x to be a column vector and f to be a column vector of functions. The path x(t), t ∈ [0, T], is called a state trajectory and u(t), t ∈ [0, T], is called a control trajectory or simply, a control. The terms vector of state variables, state vector, and state will be used interchangeably; similarly for the terms vector of control variables, control vector, and control. As mentioned earlier, when no confusion arises, we will usually suppress the time notation (t); thus, e.g., x(t) will be written simply as x. Furthermore, it should be inferred from the context whether x denotes the state at time t or the entire state trajectory. A similar statement holds for u.
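As a rough illustration of how a state trajectory follows from an initial state and a control history, the sketch below integrates the state equation (2.1) for a scalar state with the forward Euler method. The function name `simulate_state`, the step count, and the example dynamics ẋ = −0.5x + u with u ≡ 1 are assumptions chosen only for illustration; they do not come from the text.

```python
def simulate_state(f, u, x0, T, n_steps=1000):
    """Integrate x' = f(x, u(t), t), x(0) = x0, with the forward Euler method."""
    dt = T / n_steps
    ts, xs = [0.0], [x0]
    for k in range(n_steps):
        t, x = ts[-1], xs[-1]
        # One Euler step: x(t + dt) is approximated by x(t) + dt * f(x, u, t).
        xs.append(x + dt * f(x, u(t), t))
        ts.append((k + 1) * dt)
    return ts, xs


# Hypothetical scalar system (an assumption): x' = -0.5 x + u with u(t) = 1.
ts, xs = simulate_state(lambda x, u, t: -0.5 * x + u, lambda t: 1.0, x0=0.0, T=4.0)
print(xs[-1])
```

For this linear example the exact solution is x(t) = 2(1 − e^{−0.5t}), so the printed terminal state can be compared against 2(1 − e^{−2}); the Euler result agrees only up to a discretization error that shrinks as `n_steps` grows.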

2.1.2 Constraints

In this chapter, we are concerned with problems of types (1.4) and (1.5) that do not have state constraints. Such constraints are considered in Chaps. 3 and 4, as indicated in Sect. 1.1. We do impose constraints of type (1.3) on the control variables. We define an admissible control to be a control trajectory u(t), t ∈ [0, T], which is piecewise continuous and satisfies, in addition,

$$\displaystyle{ u(t) \in \varOmega (t) \subset E^{m},\quad t \in [0,T]. }$$

(2.2)

Usually the set Ω(t) is determined by physical or economic constraints on the values of the control variables at time t.

2.1.3 The Objective Function

An objective function is a quantitative measure of the performance of the system over time. An optimal control is defined to be an admissible control which maximizes the objective function. In business or economic problems, a typical objective function gives some appropriate measure of quantities such as profit or sales. If the aim is to minimize cost, then the objective function to be maximized is the negative of cost. Mathematically, we let

$$\displaystyle{ J =\int _{ 0}^{T}F(x(t),u(t),t)dt + S(x(T),T) }$$

(2.3)

denote the objective function, where the functions F: E ⁿ × E ^m × E ¹ → E ¹ and S: E ⁿ × E ¹ → E ¹ are assumed for our purposes to be continuously differentiable. In a typical business application, F(x, u, t) could be the instantaneous profit rate and S(x, T) could be the salvage value of having x as the system state at the terminal time T.

2.1.4 The Optimal Control Problem

Given the preceding definitions we can state the optimal control problem, which we will be concerned with in this chapter. The problem is to find an admissible control u ^∗, which maximizes the objective function (2.3) subject to the state equation (2.1) and the control constraints (2.2). We now restate the optimal control problem as:

$$\displaystyle{ \left \{\begin{array}{l} \max _{u(t)\in \varOmega (t)}\left \{J =\int _{ 0}^{T}F(x,u,t)dt + S(x(T),T)\right \} \\ \mbox{ subject to} \\ \dot{x} = f(x,u,t),\;x(0) = x_{0}.\end{array} \right. }$$

(2.4)

The control u^∗ is called an optimal control and x^∗, determined by means of the state equation with u = u^∗, is called the optimal trajectory or an optimal path. The optimal value J(u^∗) of the objective function will be denoted as J^∗, and occasionally as $J_{(x_{0})}^{{\ast}}$ when we need to emphasize its dependence on the initial state x₀.

The optimal control problem (2.4) is said to be in Bolza form because of the form of the objective function in (2.3). It is said to be in Lagrange form when S ≡ 0. We say the problem is in Mayer form when F ≡ 0. Furthermore, it is in linear Mayer form when F ≡ 0 and S is linear, i.e.,

$$\displaystyle{ \left \{\begin{array}{l} \max _{u(t)\in \varOmega (t)}\{J = cx(T)\} \\ \mbox{ subject to } \\ \dot{x} = f(x,u,t),\;x(0) = x_{0}, \end{array} \right. }$$

(2.5)

where c = (c₁, c₂, ⋯, c_n) is an n-dimensional row vector of given constants. In the next paragraph and in Exercise 2.5, it will be demonstrated that all of these forms can be converted into the linear Mayer form.

To show that the Bolza form can be reduced to the linear Mayer form, we define a new state vector $y = (y_{1}, y_{2}, \ldots, y_{n+1})$, having n + 1 components defined as follows: $y_{i} = x_{i}$ for i = 1, …, n and $y_{n+1}$ defined by the solution of the equation

$$\displaystyle\begin{array}{rcl} \dot{y}_{n+1} = F(x,u,t) + \frac{\partial S(x,t)} {\partial x} f(x,u,t) + \frac{\partial S(x,t)} {\partial t},& &{}\end{array}$$

(2.6)

with $y_{n+1}(0) = S(x_{0}, 0)$. By writing f(x, u, t) as f(y, u, t), with a slight abuse of notation, and by denoting the right-hand side of (2.6) as $f_{n+1}(y, u, t)$, we can write the new state equation in the vector form as

$$\displaystyle{ \dot{y} = \left (\begin{array}{c} \dot{x}\\ \dot{y}_{n+1}\\ \end{array} \right ) = \left (\begin{array}{c} f(y,u,t)\\ f_{ n+1}(y,u,t)\\ \end{array} \right ),\;y(0) = \left (\begin{array}{c} x_{0} \\ S(x_{0},0)\\ \end{array} \right ). }$$

(2.7)

We also put c = (0, ⋯, 0, 1), where c has n + 1 components with the first n terms all 0. If we integrate (2.6) from 0 to T, we see that

$$\displaystyle{y_{n+1}(T) - y_{n+1}(0) =\int _{ 0}^{T}F(x,u,t)dt + S(x(T),T) - S(x_{ 0},0).}$$

In view of setting the initial condition as $y_{n+1}(0) = S(x_{0}, 0)$, the problem in (2.4) can be expressed as that of maximizing

$$\displaystyle{ J =\int _{ 0}^{T}F(x,u,t)dt + S(x(T),T) = y_{ n+1}(T) = cy(T) }$$

(2.8)

over u(t) ∈ Ω(t), subject to (2.7). Of course, the price paid for going from Bolza to linear Mayer form is an additional state variable and its associated differential equation (2.6). Also, for the function $f_{n+1}$ to be continuously differentiable, in keeping with the assumptions made in Sect. 2.1.1, we need to assume that the salvage value function S(x, t) is twice continuously differentiable.

Exercise 2.5 presents the task of showing in a similar way that the Lagrange and Mayer forms can also be reduced to the linear Mayer form.

Example 2.1

Convert the following single-state problem in Bolza form to its linear Mayer form:

$$\displaystyle{\max \left \{J =\int _{ 0}^{T}\left (x -\frac{u^{2}} {2} \right )dt + \frac{1} {4}\left [x(T)\right ]^{2}\right \}}$$

subject to

$$\displaystyle{\dot{x} = u,\;\;x(0) = x_{0}.}$$

Solution. We use (2.6) to introduce the additional state variable y ₂ as follows:

$$\displaystyle{\dot{y}_{2} = x -\frac{u^{2}} {2} + \frac{1} {2}xu,\;\;y_{2}(0) = \frac{1} {4}x_{0}^{2}.}$$

Then,

$$\displaystyle\begin{array}{rcl} y_{2}(T)& =& y_{2}(0) +\int _{ 0}^{T}\left (x -\frac{u^{2}} {2} + \frac{1} {2}xu\right )dt {}\\ & =& \int _{0}^{T}\left (x -\frac{u^{2}} {2} \right )dt +\int _{ 0}^{T}\left (\frac{1} {2}x\dot{x}\right )dt + y_{2}(0) {}\\ & =& \int _{0}^{T}\left (x -\frac{u^{2}} {2} \right )dt +\int _{ 0}^{T}d\left (\frac{1} {4}x^{2}\right ) + y_{2}(0) {}\\ & =& \int _{0}^{T}\left (x -\frac{u^{2}} {2} \right )dt + \frac{1} {4}\left [x(T)\right ]^{2} -\frac{1} {4}x_{0}^{2} + y_{ 2}(0) {}\\ & =& \int _{0}^{T}\left (x -\frac{u^{2}} {2} \right )dt + \frac{1} {4}\left [x(T)\right ]^{2} {}\\ & =& J. {}\\ \end{array}$$

Thus, the linear Mayer form version with the two-dimensional state y = (x, y ₂) can be stated as

$$\displaystyle{\max \left \{J = y_{2}(T)\right \}}$$

subject to

$$\displaystyle\begin{array}{rcl} \dot{x}& =& u,\;\;x(0) = x_{0}, {}\\ \dot{y}_{2}& =& x -\frac{u^{2}} {2} + \frac{1} {2}xu,\;\;y_{2}(0) = \frac{1} {4}x_{0}^{2}. {}\\ \end{array}$$
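The equivalence in Example 2.1 can also be checked numerically. The following sketch integrates x, the added state y₂, and the running integral of F side by side with a classical Runge-Kutta scheme, then compares the Bolza objective with y₂(T). The control u(t) = cos t, the values x₀ = 1 and T = 2, and all function names are assumptions made for the test only; any admissible control should give the same agreement.

```python
import math


def rk4_step(g, y, t, h):
    """One classical Runge-Kutta step for y' = g(y, t), with y a list of floats."""
    k1 = g(y, t)
    k2 = g([yi + 0.5 * h * ki for yi, ki in zip(y, k1)], t + 0.5 * h)
    k3 = g([yi + 0.5 * h * ki for yi, ki in zip(y, k2)], t + 0.5 * h)
    k4 = g([yi + h * ki for yi, ki in zip(y, k3)], t + h)
    return [yi + h / 6.0 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def compare_bolza_and_mayer(u, x0, T, n_steps=2000):
    """Return (Bolza objective, y2(T)) for Example 2.1 under the control u(t)."""
    def rhs(y, t):
        x, y2, running = y
        ut = u(t)
        F = x - 0.5 * ut ** 2
        # Components: x' = u, y2' = F + S_x * f = F + (x/2) u, running' = F.
        return [ut, F + 0.5 * x * ut, F]

    y = [x0, 0.25 * x0 ** 2, 0.0]  # y2(0) = S(x0) = x0^2 / 4
    h = T / n_steps
    for k in range(n_steps):
        y = rk4_step(rhs, y, k * h, h)
    x_T, y2_T, integral = y
    return integral + 0.25 * x_T ** 2, y2_T


J_bolza, J_mayer = compare_bolza_and_mayer(math.cos, x0=1.0, T=2.0)
print(J_bolza, J_mayer)
```

The two printed numbers should agree up to integration error, which reflects the identity y₂(T) = J derived above rather than any property of the particular control.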

In Sect. 2.2, we derive necessary conditions for optimal control in the form of the maximum principle, and in Sect. 2.4 we derive sufficient conditions. In these derivations, we shall assume the existence of an optimal control, while providing references where needed, as the topic of existence is beyond the scope of this book. In any particular application, however, the existence of a solution will be demonstrated by actually finding a solution that satisfies both the necessary and the sufficient conditions for optimality. We thus avoid the necessity of having to prove general existence theorems, which require advanced and difficult mathematics. Nevertheless, interested readers can consult Hartl et al. (1995) and Seierstad and Sydsæter (1987) for brief discussions of existence results and references therein, including Cesari (1983).

2.2 Dynamic Programming and the Maximum Principle

We will now derive the maximum principle by using a dynamic programming approach. The proof is intuitive in nature and is not intended to be mathematically rigorous. For more rigorous derivations, we refer the reader to Appendix C, Berkovitz (1961), Pontryagin et al. (1962), Halkin (1967), Boltyanskii (1971), Hartberger (1973), Bryant and Mayne (1974), Leitmann (1981), and Seierstad and Sydsæter (1987). Additional references can be found in the survey by Hartl et al. (1995). For discussions of maximum principles for more general optimal control problems, including those with nondifferentiable functions, see Clarke (1983, 1989).

2.2.1 The Hamilton-Jacobi-Bellman Equation

Suppose V (x, t): E ⁿ × E ¹ → E ¹ is a function whose value is the maximum value of the objective function of the control problem for the system, given that we start at time t in state x. That is,

$$\displaystyle{ V (x,t) =\max _{u(s)\in \varOmega (s)}\left [\int _{t}^{T}F(x(s),u(s),s)ds + S(x(T),T)\right ], }$$

(2.9)

where for s ≥ t,

$$\displaystyle{\frac{dx(s)} {ds} = f(x(s),u(s),s),\;x(t) = x.}$$

We initially assume that the value function V (x, t) exists for all x and t in the relevant ranges. Later we will make additional assumptions about the function V (x, t).

Bellman (1957) in his book on dynamic programming states the principle of optimality as follows:

An optimal policy has the property that, whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the outcome resulting from the initial decision.

Intuitively this principle is obvious, for if we were to start in state x at time t and did not follow an optimal path from then on, there would then exist (by assumption) a better path from t to T; hence, we could improve the proposed solution by following this better path. We will use the principle of optimality to derive conditions on the value function V(x, t).

Figure 2.1 is a schematic picture of the optimal path x ^∗(t) in the state-time space, and two nearby points (x, t) and (x + δx, t + δt), where δt is a small increment of time and x + δx = x(t + δt). The value function changes from V (x, t) to V (x + δx, t + δt) between these two points. By the principle of optimality, the change in the objective function is made up of two parts: first, the incremental change in J from t to t + δt, which is given by the integral of F(x, u, t) from t to t + δt; second, the value function V (x + δx, t + δt) at time t + δt. The control actions u(τ) should be chosen to lie in Ω(τ), τ ∈ [t, t + δt], and to maximize the sum of these two terms. In equation form this is

$$\displaystyle{ V (x,t) =\max _{\stackrel{u(\tau )\in \varOmega (\tau )}{\tau \in [t,t+\delta t]}}\left \{\int _{t}^{t+\delta t}F[x(\tau ),u(\tau ),\tau ]d\tau + V [x(t +\delta t),t +\delta t]\right \}, }$$

(2.10)

where δt represents a small increment in t. It is instructive to compare this equation to definition (2.9).

Since F is a continuous function, the integral in (2.10) is approximately F(x, u, t)δt so we can rewrite (2.10) as

$$\displaystyle{ V (x,t) =\max _{u\in \varOmega (t)}\left \{F(x,u,t)\delta t + V [x(t +\delta t),t +\delta t]\right \} + o(\delta t), }$$

(2.11)

where o(δt) denotes a collection of higher-order terms in δt. (By the definition given in Sect. 1.4.4, o(δt) is a function such that $\lim _{\delta t\rightarrow 0}\frac{o(\delta t)} {\delta t} = 0$.)

We now make an assumption that we will return to later. We assume that the value function V is a continuously differentiable function of its arguments. This allows us to use the Taylor series expansion of V with respect to δt and obtain

$$\displaystyle{ V [x(t +\delta t),t +\delta t] = V (x,t) + [V _{x}(x,t)\dot{x} + V _{t}(x,t)]\delta t + o(\delta t), }$$

(2.12)

where V _x and V _t are partial derivatives of V (x, t) with respect to x and t, respectively.

Substituting for $\dot{x}$ from (2.1) in the above equation and then using it in (2.11), we obtain

$$\displaystyle\begin{array}{rcl} V (x,t)& =& \max _{u\in \varOmega (t)}\left \{F(x,u,t)\delta t + V (x,t) + V _{x}(x,t)f(x,u,t)\delta t\right. \\ & & +\left.V _{t}(x,t)\delta t\right \} + o(\delta t). {}\end{array}$$

(2.13)

Canceling V (x, t) on both sides and then dividing by δt we get

$$\displaystyle{ 0 =\max _{u\in \varOmega (t)}\left \{F(x,u,t) + V _{x}(x,t)f(x,u,t) + V _{t}(x,t)\right \} + \frac{o(\delta t)} {\delta t}. }$$

(2.14)

Now we let δt → 0 and obtain the following equation

$$\displaystyle{ 0 =\max _{u\in \varOmega (t)}\left \{F(x,u,t) + V _{x}(x,t)f(x,u,t) + V _{t}(x,t)\right \}, }$$

(2.15)

for which the boundary condition is

$$\displaystyle{ V (x,T) = S(x,T). }$$

(2.16)

This boundary condition follows from the fact that the value function at t = T is simply the salvage value function.
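To make (2.15) and (2.16) concrete, the sketch below checks them numerically on a toy problem that is not taken from the text: maximize ∫₀ᵀ (x − u²/2) dt subject to ẋ = u with S ≡ 0 and an unconstrained control. For this problem a direct calculation gives the candidate value function V(x, t) = x(T − t) + (T − t)³/6, which is treated here as an assumption to be tested. The partial derivatives are approximated by central differences, and the maximization over u is done on a grid, so the residual is only approximately zero.

```python
T = 1.0


def V(x, t):
    """Candidate value function for the toy problem (an assumption of this sketch)."""
    return x * (T - t) + (T - t) ** 3 / 6.0


def hjb_residual(x, t, h=1e-5, n_grid=4001, u_max=2.0):
    """Approximate max over u of F + V_x f + V_t, which (2.15) says should be 0."""
    # Central finite differences for the partial derivatives V_x and V_t.
    V_x = (V(x + h, t) - V(x - h, t)) / (2 * h)
    V_t = (V(x, t + h) - V(x, t - h)) / (2 * h)
    best = -float("inf")
    for i in range(n_grid):
        # Grid of candidate controls on [-u_max, u_max]; the bound is numerical only.
        u = -u_max + 2 * u_max * i / (n_grid - 1)
        F = x - 0.5 * u * u
        f = u
        best = max(best, F + V_x * f + V_t)
    return best


residuals = [hjb_residual(x, t) for x in (-1.0, 0.5, 2.0) for t in (0.0, 0.3, 0.9)]
terminal_gap = abs(V(2.0, T) - 0.0)  # boundary condition (2.16) with S = 0
print(max(abs(r) for r in residuals), terminal_gap)
```

Both printed quantities should be close to zero. The check does not prove that V is the value function; it only confirms that this candidate is consistent with (2.15) and (2.16) at the sampled points.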

The components of the vector V _x(x, t) can be interpreted as the marginal contributions of the state variables x to the value function or the maximized objective function (2.9). We denote the marginal return vector (along the optimal path x ^∗(t)) by the adjoint (row) vector λ(t) ∈ E ⁿ, i.e.,

$$\displaystyle{ \lambda (t) = V _{x}(x^{{\ast}}(t),t):= V _{ x}(x,t)\mid _{x=x^{{\ast}}(t)}. }$$

(2.17)

From the preceding remark, we can interpret λ(t) as the per unit change in the objective function value for a small change in x ^∗(t) at time t. In other words, λ(t) is the highest hypothetical unit price which a rational decision maker would be willing to pay for an infinitesimal addition to x ^∗(t). See Sect. 2.2.4 for further discussion.

Next we introduce a function H: E ⁿ × E ^m × E ⁿ × E ¹ → E ¹ called the Hamiltonian

$$\displaystyle{ H(x,u,\lambda,t) = F(x,u,t) +\lambda f(x,u,t). }$$

(2.18)

We can then rewrite Eq. (2.15) as the equation

$$\displaystyle{ \max _{u\in \varOmega (t)}[H(x,u,V _{x},t) + V _{t}] = 0, }$$

(2.19)

called the Hamilton-Jacobi-Bellman equation or, simply, the HJB equation to be satisfied along an optimal path. Note that it is possible to take V _t out of the maximizing operation since it does not depend on u.

The Hamiltonian maximizing condition of the maximum principle can be obtained from (2.19) and (2.17) by observing that, if x ^∗(t) and u ^∗(t) are optimal values of the state and control variables and λ(t) is the corresponding value of the adjoint variable at time t, then the optimal control u ^∗(t) must satisfy (2.19), i.e., for all u ∈ Ω(t),

$$\displaystyle\begin{array}{rcl} H[x^{{\ast}}(t),u^{{\ast}}(t),\lambda (t),t] + V _{ t}(x^{{\ast}}(t),t)& \geq & H[x^{{\ast}}(t),u,\lambda (t),t] \\ & & +V _{t}(x^{{\ast}}(t),t).{}\end{array}$$

(2.20)

Canceling the term V _t on both sides, we obtain the Hamiltonian maximizing condition

$$\displaystyle{ H[x^{{\ast}}(t),u^{{\ast}}(t),\lambda (t),t] \geq H[x^{{\ast}}(t),u,\lambda (t),t] }$$

(2.21)

for all u ∈ Ω(t).
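The Hamiltonian maximizing condition (2.21) is a pointwise, static maximization over the control set. As a small sketch under assumed data, take F = x − u²/2 and f = u with Ω = [−1, 1], so that H = x − u²/2 + λu. Because this H is concave in u, its maximizer over Ω is λ clipped to [−1, 1]. The code compares a brute-force grid search with that closed form; the function names and the grid resolution are illustrative choices.

```python
def hamiltonian(x, u, lam, t):
    """H = F + lam * f for the illustrative choice F = x - u^2/2, f = u."""
    return (x - 0.5 * u * u) + lam * u


def argmax_hamiltonian(x, lam, t, u_lo=-1.0, u_hi=1.0, n_grid=2001):
    """Grid search for the control in [u_lo, u_hi] that maximizes H."""
    grid = [u_lo + (u_hi - u_lo) * i / (n_grid - 1) for i in range(n_grid)]
    return max(grid, key=lambda u: hamiltonian(x, u, lam, t))


def clipped(lam, u_lo=-1.0, u_hi=1.0):
    """Closed-form maximizer for this concave H: project lam onto [u_lo, u_hi]."""
    return min(max(lam, u_lo), u_hi)


results = [(lam, argmax_hamiltonian(0.0, lam, 0.0), clipped(lam))
           for lam in (-3.0, -0.4, 0.25, 2.0)]
for lam, u_grid, u_closed in results:
    print(lam, u_grid, u_closed)
```

The state x does not affect the maximizer here because it enters H additively; in general the maximizing control depends on x, λ, and t together.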

In order to complete the statement of the maximum principle, we must still obtain the adjoint equation.

Remark 2.1

We use u^∗ and x^∗ for optimal control and state to distinguish them from an admissible control u and the corresponding state x, respectively. However, since the adjoint variable λ is defined only along the optimal path, there is no need for such a distinction, and therefore we do not use the superscript ∗ on λ.

2.2.2 Derivation of the Adjoint Equation

The derivation of the adjoint equation proceeds from the HJB equation (2.19), and is similar to those in Fel’dbaum (1965) and Kirk (1970). Note that, given the optimal path x^∗, the optimal control u^∗ maximizes the left-hand side of (2.19), and its maximum value is zero. We now consider small perturbations of the values of the state variables in a neighborhood of the optimal path x^∗. Thus, let

$$\displaystyle{ x(t) = x^{{\ast}}(t) +\delta x(t), }$$

(2.22)

where ∥δx(t) ∥ < ɛ for a small positive ɛ.

We now consider a ‘fixed’ time instant t. We can then write (2.19) as

$$\displaystyle\begin{array}{rcl} 0& =& H[x^{{\ast}}(t),u^{{\ast}}(t),V _{ x}(x^{{\ast}}(t),t),t] + V _{ t}(x^{{\ast}}(t),t) \\ & \geq & H[x(t),u^{{\ast}}(t),V _{ x}(x(t),t),t] + V _{t}(x(t),t).{}\end{array}$$

(2.23)

To explain, we note from (2.19) that the left-hand side of ≥ in (2.23) equals zero. The right-hand side can attain the value zero only if u ^∗(t) is also an optimal control for x(t). In general, for x(t) ≠ x ^∗(t), this will not be so. From this observation, it follows that the expression on the right-hand side of (2.23) attains its maximum (of zero) at x(t) = x ^∗(t). Furthermore, x(t) is not explicitly constrained. In other words, x ^∗(t) is an unconstrained local maximum of the right-hand side of (2.23), so that the derivative of this expression with respect to x must vanish at x ^∗(t), i.e.,

$$\displaystyle{ H_{x}[x^{{\ast}}(t),u^{{\ast}}(t),V _{ x}(x^{{\ast}}(t),t),t] + V _{ tx}(x^{{\ast}}(t),t) = 0, }$$

(2.24)

provided the derivative exists; for this, we must further assume that V is a twice continuously differentiable function of its arguments. With H = F + V_x f from (2.17) and (2.18), we obtain

$$\displaystyle{H_{x} = F_{x} + V _{x}f_{x} + f^{T}V _{ xx} = F_{x} + V _{x}f_{x} + (V _{xx}f)^{T}}$$

by using g = V _x in the identity (1.15). Substituting this in (2.24) and recognizing the fact that V _xx = (V _xx)^T, we obtain

$$\displaystyle{ F_{x} + V _{x}f_{x} + f^{T}V _{ xx} + V _{tx} = F_{x} + V _{x}f_{x} + (V _{xx}f)^{T} + V _{ tx} = 0, }$$

(2.25)

where the superscript T denotes the transpose operation. See (1.16) or Exercise 1.10 for further explanation.

The derivation of the necessary condition (2.25) is the crux of the reasoning in the derivation of the adjoint equation. It is easy to obtain the so-called adjoint equation from it. We begin by taking the time derivative of V_x(x, t). Thus,

$$\displaystyle{ \begin{array}{cl} \frac{dV _{x}} {dt} & = \left (\frac{dV _{x_{1}}} {dt},\; \frac{dV _{x_{2}}} {dt},\cdots \,, \frac{dV _{x_{n}}} {dt} \right ) \\ & = \left (V _{x_{1}x}\dot{x} + V _{x_{1}t},V _{x_{2}x}\dot{x} + V _{x_{2}t},\cdots \,,V _{x_{n}x}\dot{x} + V _{x_{n}t}\right ) \\ & = \left (\sum _{i=1}^{n}V _{x_{1}x_{i}}\dot{x_{i}},\;\sum _{i=1}^{n}V _{x_{2}x_{i}}\dot{x_{i}},\cdots \,,\sum _{i=1}^{n}V _{x_{n}x_{i}}\dot{x_{i}}\right )\; +\; (V _{x})_{t} \\ & = (V _{xx}\dot{x})^{T} + V _{xt} \\ & = (V _{xx}f)^{T} + V _{tx}.\end{array} }$$

(2.26)

Note in the above that

$$\displaystyle{V _{x_{i}x} = (V _{x_{i}x_{1}},V _{x_{i}x_{2}},\cdots \,,V _{x_{i}x_{n}})}$$

and

$$\displaystyle{ V _{xx}\dot{x} = \left (\begin{array}{ccccc} V _{x_{1}x_{1}} & V _{x_{1}x_{2}} & \cdots & V _{x_{1}x_{n}} \\ V _{x_{2}x_{1}} & V _{x_{2}x_{2}} & \cdots & V _{x_{2}x_{n}} \\ \vdots & \vdots & \cdots & \vdots\\ V _{ x_{n}x_{1}} & V _{x_{n}x_{2}} & \cdots &V _{x_{n}x_{n}}\\ \end{array} \right )\left (\begin{array}{c} \dot{x}_{1} \\ \dot{x}_{2}\\ \vdots \\ \dot{x}_{n}\\ \end{array} \right ). }$$

(2.27)

Since the terms on the right-hand side of (2.26) are the same as the last two terms in (2.25), we see that (2.26) becomes

$$\displaystyle{ \frac{dV _{x}} {dt} = -F_{x} - V _{x}f_{x}. }$$

(2.28)

Because λ was defined in (2.17) to be V _x, we can rewrite (2.28) as

$$\displaystyle{\dot{\lambda }= -F_{x} -\lambda f_{x}.}$$

To see that the right-hand side of this equation can be written simply as − H _x, we need to go back to the definition of H in (2.18) and recognize that when taking the partial derivative of H with respect to x, the adjoint variables λ are considered to be independent of x. We note further that along the optimal path, λ is a function of t only. Thus,

$$\displaystyle{ \dot{\lambda }= -H_{x}. }$$

(2.29)

Also, from the definition of λ in (2.17) and the boundary condition (2.16), we have the terminal boundary condition, which is also called the transversality condition:

$$\displaystyle{ \lambda (T) = \frac{\partial S(x,T)} {\partial x} \mid _{x=x^{{\ast}}(T)} = S_{x}(x^{{\ast}}(T),T). }$$

(2.30)

The adjoint equation (2.29), together with its boundary condition (2.30), determines the adjoint variables.

This completes our derivation of the maximum principle using dynamic programming. We can now summarize the main results in the following section.

2.2.3 The Maximum Principle

The necessary conditions for u ^∗(t), t ∈ [0, T], to be an optimal control are:

$$\displaystyle\begin{array}{rcl} \left \{\begin{array}{l} \dot{x}^{{\ast}} = f(x^{{\ast}},u^{{\ast}},t),x^{{\ast}}(0) = x_{0}, \\ \dot{\lambda } = -H_{x}[x^{{\ast}},u^{{\ast}},\lambda,t],\;\;\lambda (T) = S_{x}(x^{{\ast}}(T),T), \\ H[x^{{\ast}},u^{{\ast}},\lambda,t] \geq H[x^{{\ast}},u,\lambda,t],\forall u \in \varOmega (t),t \in [0,T].\end{array} \right.& &{}\end{array}$$

(2.31)

It should be emphasized that the state and the adjoint arguments of the Hamiltonian are x ^∗(t) and λ(t) on both sides of the Hamiltonian maximizing condition in (2.31), respectively. Furthermore, u ^∗(t) must provide a global maximum of the Hamiltonian H[x ^∗(t), u, λ(t), t] over u ∈ Ω(t). For this reason the necessary conditions in (2.31) are called the maximum principle.

Note that in order to apply the maximum principle, we must simultaneously solve two sets of differential equations with u^∗ obtained from the Hamiltonian maximizing condition in (2.31). With the control variable u^∗ so obtained, the state equation for x^∗ is given with the initial value x₀, and the adjoint equation for λ is specified with a condition on the terminal value λ(T). Such a system of equations, where initial values of some variables and final values of other variables are specified, is called a two-point boundary value problem (TPBVP). The general solution of such problems can be very difficult; see Bryson and Ho (1975), Roberts and Shipman (1972), and Feichtinger and Hartl (1986). However, there are certain special cases which are easy. One such is the case in which the adjoint equation is independent of the state and the control variables; here we can solve the adjoint equation first, then get the optimal control u^∗, and then solve for x^∗.
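The easy special case just described can be sketched in three steps. The toy problem used here is an assumption for illustration: maximize ∫₀ᵀ (x − u²/2) dt subject to ẋ = u, x(0) = x₀, with S ≡ 0 and no control constraint. Then H = x − u²/2 + λu, so the adjoint equation is λ̇ = −H_x = −1 with λ(T) = 0, which involves neither x nor u. The function name and the Euler discretization are likewise illustrative choices.

```python
def solve_decoupled(x0, T, n_steps=1000):
    """Solve the toy TPBVP in the order: adjoint, then control, then state."""
    h = T / n_steps
    # Step 1: integrate lam' = -H_x = -1 backward from lam(T) = S_x = 0.
    lam = [0.0] * (n_steps + 1)
    for k in range(n_steps, 0, -1):
        lam[k - 1] = lam[k] + h * 1.0  # lam(t - h) = lam(t) - h * lam'(t)
    # Step 2: maximize H = x - u^2/2 + lam * u over unconstrained u, giving u = lam.
    u = lam[:]
    # Step 3: integrate x' = u forward from x(0) = x0 with forward Euler.
    x = [x0] * (n_steps + 1)
    for k in range(n_steps):
        x[k + 1] = x[k] + h * u[k]
    return lam, u, x


lam, u, x = solve_decoupled(x0=1.0, T=1.0)
print(lam[0], x[-1])
```

For this toy problem the exact answers are λ(t) = T − t and x(T) = x₀ + T²/2, so the printed values can be compared with 1 and 1.5; the state carries a small Euler discretization error.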

Note also that if we can solve the Hamiltonian maximizing condition for an optimal control function in closed form u ^∗(x, λ, t) so that

$$\displaystyle{u^{{\ast}}(t) = u^{{\ast}}[x^{{\ast}}(t),\lambda (t),t],}$$

then we can substitute this into the state and adjoint equations to get the TPBVP just in terms of a set of differential equations, i.e.,

$$\displaystyle{ \begin{array}{c} \left \{\begin{array}{ll} \dot{x}^{{\ast}} = f(x^{{\ast}},u^{{\ast}}(x^{{\ast}},\lambda,t),t),\;\;x^{{\ast}}(0) = x_{ 0}, \\ \dot{\lambda } = -H_{x}(x^{{\ast}},u^{{\ast}}(x^{{\ast}},\lambda,t),\lambda,t),\;\;\lambda (T) = S_{x}(x^{{\ast}}(T),T).\end{array} \right.\end{array} }$$

(2.32)

We should note that we are making a slight abuse of notation here by using u ^∗(x, λ, t) to denote the optimal control function and u ^∗(t) as the optimal control at time t. Thus, depending on the context, when we use u ^∗ without any argument, it may mean the optimal control function u ^∗(x, λ, t), or the optimal control at time t, or the entire optimal control trajectory {u ^∗(t), t ∈ [0, T]}.

In Sect. 2.5, we derive the TPBVP for a specific example, and solve its discrete version by using Excel. In subsequent chapters we will solve many TPBVPs of varying degrees of difficulty.
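When the state and adjoint equations are coupled, one common numerical approach is shooting: guess λ(0), integrate both equations forward, and adjust the guess until the terminal condition on λ(T) is met. The sketch below applies this to an assumed toy problem, not one from the text: maximize −½∫₀ᵀ (x² + u²) dt subject to ẋ = u, with S ≡ 0. Here H = −(x² + u²)/2 + λu, the maximizing control is u = λ, and the TPBVP of the form (2.32) becomes ẋ = λ, λ̇ = x, x(0) = x₀, λ(T) = 0. Bisection is used because λ(T) is increasing in λ(0) for this linear system; the bracket [−10, 10] is an arbitrary assumption.

```python
import math


def integrate(lam0, x0, T, n_steps=2000):
    """RK4 for x' = lam, lam' = x from (x0, lam0); returns (x(T), lam(T))."""
    h = T / n_steps
    x, lam = x0, lam0
    for _ in range(n_steps):
        k1x, k1l = lam, x
        k2x, k2l = lam + 0.5 * h * k1l, x + 0.5 * h * k1x
        k3x, k3l = lam + 0.5 * h * k2l, x + 0.5 * h * k2x
        k4x, k4l = lam + h * k3l, x + h * k3x
        x += h / 6.0 * (k1x + 2 * k2x + 2 * k3x + k4x)
        lam += h / 6.0 * (k1l + 2 * k2l + 2 * k3l + k4l)
    return x, lam


def shoot(x0, T, lo=-10.0, hi=10.0, tol=1e-12):
    """Bisection on lam(0) so that the terminal condition lam(T) = 0 holds."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if integrate(mid, x0, T)[1] > 0.0:
            hi = mid  # lam(T) too large: lower the initial guess
        else:
            lo = mid
    return 0.5 * (lo + hi)


lam0 = shoot(x0=1.0, T=1.0)
print(lam0, -math.tanh(1.0))
```

For this linear system the exact initial adjoint value is λ(0) = −x₀ tanh T, which the second printed number reproduces for comparison. Shooting can be ill-conditioned on longer horizons or for nonlinear problems, so this is a sketch of the idea rather than a general-purpose solver.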

One final remark should be made. Because an integral is unaffected by values of the integrand at a finite set of points, some of the arguments made in this chapter may not hold at a finite set of points. This does not affect the validity of the results.

In the next section, we give economic interpretations of the maximum principle, and in Sect. 2.3, we solve five simple examples by using the maximum principle.

2.2.4 Economic Interpretations of the Maximum Principle

Recall from Sect. 2.1.3 that the objective function (2.3) is

$$\displaystyle{J =\int _{ 0}^{T}F(x,u,t)dt + S(x(T),T),}$$

where F is considered to be the instantaneous profit rate measured in dollars per unit of time, and S(x, T) is the salvage value, in dollars, of the system at time T when the terminal state is x. For purposes of discussion it will be convenient to consider the system as a firm and the state x(t) as the stock of capital at time t.

In (2.17), we interpreted λ(t) to be the per unit change in the value function V (x, t) for small changes in capital stock x. In other words, λ(t) is the marginal value per unit of capital at time t, and it is also referred to as the price or shadow price of a unit of capital at time t. In particular, the value of λ(0) is the marginal rate of change of the maximum value of J (the objective function) with respect to the change in the initial capital stock, x ₀.
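The statement that λ(0) is the marginal value of initial capital can be checked numerically on a simple case. The sketch assumes the toy problem of maximizing ∫₀ᵀ (x − u²/2) dt subject to ẋ = u with S ≡ 0 and no control constraint, for which the conditions in (2.31) give λ(t) = T − t and u^∗(t) = λ(t). The optimal control then does not depend on x₀, so J^∗ can be evaluated for two nearby initial states and differenced. The function name, the Euler discretization, and the step ε are assumptions.

```python
def optimal_value(x0, T=1.0, n_steps=2000):
    """Approximate J* for the toy problem using u*(t) = T - t (an assumption)."""
    h = T / n_steps
    x, J = x0, 0.0
    for k in range(n_steps):
        u = T - k * h              # optimal control at the left endpoint
        J += h * (x - 0.5 * u * u)  # left Riemann sum of F = x - u^2/2
        x += h * u                 # forward Euler step for x' = u
    return J


eps = 1e-4
# Central difference approximation of dJ*/dx0 at x0 = 1.
shadow_price = (optimal_value(1.0 + eps) - optimal_value(1.0 - eps)) / (2 * eps)
print(shadow_price)
```

With T = 1 the adjoint value at time zero is λ(0) = T = 1, and the finite difference should reproduce it: one extra unit of initial capital earns one extra dollar per unit time over the whole horizon.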

Remark 2.2

As mentioned in Appendix C, where we prove a maximum principle without any smoothness assumption on the value function, there arise cases in which the value function may not be differentiable with respect to the state variables. In such cases, when V _x(x ^∗(t), t) does not exist, then (2.17) has no meaning. See Bettiol and Vinter (2010), Yong and Zhou (1999), and Cernea and Frankowska (2005) for interpretations of the adjoint variables or extensions of (2.17) in such cases.

Next we interpret the Hamiltonian function in (2.18). Multiplying (2.18) formally by dt and using the state equation (2.1) gives

$$\displaystyle{Hdt = Fdt +\lambda fdt = Fdt +\lambda \dot{ x}dt = Fdt +\lambda dx.}$$

The first term F(x, u, t)dt represents the direct contribution to J in dollars from time t to t + dt, if the firm is in state x (i.e., it has a capital stock of x), and we apply control u in the interval [t, t + dt]. The differential dx = f(x, u, t)dt represents the change in capital stock from time t to t + dt, when the firm is in state x and control u is applied. Therefore, the second term λdx represents the value in dollars of the incremental capital stock dx, and hence can be considered as the indirect contribution to J in dollars. Thus, Hdt can be interpreted as the total contribution to J from time t to t + dt when x(t) = x and u(t) = u in the interval [t, t + dt].

With this interpretation, it is easy to see why the Hamiltonian must be maximized at each instant of time t. If we were just to maximize F at each instant t, we would not be maximizing J, because we would ignore the effect of the control in changing the capital stock, which gives rise to indirect contributions to J. The maximum principle derives the adjoint variable λ(t), the price of capital at time t, in such a way that λ(t)dx is the correct valuation of the indirect contribution to J from time t to t + dt. As a consequence, the Hamiltonian maximizing problem can be treated as a static problem at each instant t. In other words, the maximum principle decouples the dynamic maximization problem (2.4) in the interval [0, T] into a set of static maximization problems associated with instants t in [0, T]. Thus, the Hamiltonian can be interpreted as a surrogate profit rate to be maximized at each instant of time t.

The value of λ to be used in the maximum principle is given by (2.29) and (2.30), i.e.,

$$\displaystyle{\dot{\lambda }= -\frac{\partial H} {\partial x} = -\frac{\partial F} {\partial x} -\lambda \frac{\partial f} {\partial x},\;\lambda (T) = S_{x}(x(T),T).}$$

Rewriting the first equation as

$$\displaystyle{-d\lambda = H_{x}dt = F_{x}dt +\lambda f_{x}dt,}$$

we can observe that along the optimal path, −dλ, the negative of the increase or, in other words, the decrease in the price of capital from t to t + dt, which can be considered as the marginal cost of holding that capital, equals the marginal revenue H_x dt of investing the capital. In turn, the marginal revenue H_x dt consists of the sum of the direct marginal contribution F_x dt and the indirect marginal contribution λf_x dt. Thus, the adjoint equation becomes the equilibrium relation that marginal cost equals marginal revenue, a familiar concept in the economics literature; see, e.g., Cohen and Cyert (1965, p. 189) or Takayama (1974, p. 712).

Further insight can be obtained by integrating the above adjoint equation from t to T as follows:

$$\displaystyle{\begin{array}{cl} \lambda (t)& =\lambda (T) +\int _{ t}^{T}H_{x}(x(\tau ),u(\tau ),\lambda (\tau ),\tau )d\tau \\ & = S_{x}(x(T),T) +\int _{ t}^{T}H_{x}d\tau.\end{array} }$$

Note that the price λ(T) of a unit of capital at time T is its marginal salvage value S _x(x(T), T). In the special case when S ≡ 0, we have λ(T) = 0, as clearly no value can be derived or lost from an infinitesimal increase in x(T). The price λ(t) of a unit of capital at time t is the sum of its terminal price λ(T) plus the integral of the marginal surrogate profit rate H _x from t to T.

The above interpretations show that the adjoint variables behave in much the same way as the dual variables in linear (and nonlinear) programming, with the differences being that here the adjoint variables are time dependent and satisfy derived differential equations. These connections will become clearer in Chap. 8, which addresses the discrete maximum principle.

2.3 Simple Examples

In order to absorb the maximum principle, the reader should study very carefully the examples in this section, all of which are problems having only one state and one control variable. Some or all of the exercises at the end of the chapter should also be worked.

In the following examples and others in this book, we will at times omit the superscript ∗ on the optimal values of the state variables as long as no confusion arises from doing so.

Example 2.2

Consider the problem:

$$\displaystyle{ \max \left \{J =\int _{ 0}^{1} - xdt\right \} }$$

(2.33)

subject to the state equation

$$\displaystyle{ \dot{x} = u,\;x(0) = 1 }$$

(2.34)

and the control constraint

$$\displaystyle{ u \in \varOmega = [-1,1]. }$$

(2.35)

Note that T = 1, F = −x, S = 0, and f = u. Because F = −x, we can interpret the problem as one of minimizing the (signed) area under the curve x(t) for 0 ≤ t ≤ 1.

Solution First, we form the Hamiltonian

$$\displaystyle{ H = -x +\lambda u }$$

(2.36)

and note that, because the Hamiltonian is linear in u, the form of the optimal control, i.e., the one that would maximize the Hamiltonian, is

$$\displaystyle\begin{array}{rcl} u^{{\ast}}(t) = \left \{\begin{array}{ccc} 1 &\mbox{ if}& \lambda (t) > 0, \\ \mbox{ arbitrary}&\mbox{ if}& \lambda (t) = 0, \\ - 1 &\mbox{ if}&\lambda (t) < 0, \end{array} \right.& &{}\end{array}$$

(2.37)

or referring to the notation in Sect. 1.4,

$$\displaystyle{ u^{{\ast}}(t) = \mbox{ bang}[-1,1;\lambda (t)]. }$$

(2.38)

To find λ, we write the adjoint equation

$$\displaystyle\begin{array}{rcl} \dot{\lambda }= -H_{x} = 1,\;\lambda (1) = S_{x}(x(T),T) = 0.& &{}\end{array}$$

(2.39)

Because this equation does not involve x and u, we can easily solve it as

$$\displaystyle{ \lambda (t) = t - 1. }$$

(2.40)

It follows that λ(t) = t − 1 < 0 for t ∈ [0, 1) and so u^∗(t) = −1, t ∈ [0, 1). Since λ(1) = 0, for simplicity we can also set u^∗(1) = −1 at the single point t = 1. We can then specify the optimal control to be

$$\displaystyle{u^{{\ast}}(t) = -1\mbox{ for all }t \in [0,1].}$$

Substituting this into the state equation (2.34) we have

$$\displaystyle{ \dot{x} = -1,\;x(0) = 1, }$$

(2.41)

whose solution is

$$\displaystyle{ x^{{\ast}}(t) = 1 - t\mbox{ for }t \in [0,1]. }$$

(2.42)

The graphs of the optimal state and adjoint trajectories appear in Fig. 2.2. Note that the optimal value of the objective function is J ^∗ = −1∕2.

In Sect. 2.2.4, we stated that the adjoint variable λ(t) gives the marginal value per unit increment in the state variable x(t) at time t. Let us illustrate this claim at time t = 0 with the help of Example 2.2. Note from (2.40) that λ(0) = −1. Thus, if we increase the initial value x(0) from 1, by a small amount ɛ, to a new value 1 + ɛ, where ɛ may be positive or negative, then we expect the optimal value of the objective function to change from J ^∗ = −1∕2 to

$$\displaystyle{J_{(1+\varepsilon )}^{{\ast}} = -1/2 +\lambda (0)\varepsilon + o(\varepsilon ) = -1/2 -\varepsilon +o(\varepsilon ),}$$

where we use the subscript (1 + ɛ) to distinguish the new value from J ^∗ as well as to emphasize its dependence on the new initial condition x(0) = 1 + ɛ. To verify this, we first observe that u ^∗(t) = −1, t ∈ [0, 1], remains optimal in this example for the new initial condition. Then from (2.41) with x(0) = 1 + ɛ, we can obtain the new optimal state trajectory, shown by the dotted line in Fig. 2.2 as

$$\displaystyle{x_{(1+\varepsilon )}^{{\ast}}(t) = 1 +\varepsilon -t,\;t \in [0,1],}$$

where the notation x^∗_{(y)}(t) indicates the dependence of the optimal trajectory on the initial value x(0) = y. Substituting this for x in (2.33) and integrating, we get the new objective function value to be −1∕2 − ɛ. Since the remainder term here is exactly 0, which is of the order o(ɛ), our claim has been illustrated.

We should note that in general it may be necessary to perform separate calculations for positive and negative ɛ. It is easy to see, however, that this is not the case in this example.
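The marginal-value claim can also be checked numerically. The following Python sketch is not part of the original example; it assumes, as argued above, that u^∗(t) = −1 remains optimal for the perturbed initial condition, and the function name objective and the step count n are illustrative choices. It evaluates J along x(t) = x(0) − t by the midpoint rule and compares the change in J with λ(0)ɛ.

```python
def objective(x0, T=1.0, n=1000):
    """Midpoint-rule value of J = integral of -x dt along x(t) = x0 - t (u = -1)."""
    dt = T / n
    return sum(-(x0 - (k + 0.5) * dt) * dt for k in range(n))


lam0 = -1.0   # lambda(0) from (2.40)
eps = 1e-3    # small perturbation of x(0); negative values work as well

J_base = objective(1.0)
J_pert = objective(1.0 + eps)

print("J*            =", J_base)            # close to -1/2
print("J*_(1+eps)    =", J_pert)
print("change in J   =", J_pert - J_base)   # close to lambda(0) * eps
print("lambda(0)*eps =", lam0 * eps)
```

Because the integrand is linear in t, the midpoint rule is exact here up to rounding, so the two printed changes should agree to many digits.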

Example 2.3

Let us solve the same problem as in Example 2.2 over the interval [0, 2] so that the objective is:

$$\displaystyle{ \max \left \{J =\int _{ 0}^{2} - xdt\right \}. }$$

(2.43)

The dynamics and constraints are (2.34) and (2.35), respectively, as before. Here we want to minimize the signed area between the horizontal axis and the trajectory of x(t) for 0 ≤ t ≤ 2.

Solution As before, the Hamiltonian is defined by (2.36) and the optimal control is as in (2.38). The adjoint equation

$$\displaystyle{ \dot{\lambda }= 1,\;\lambda (2) = 0 }$$

(2.44)

is the same as (2.39) except that now T = 2 instead of T = 1. The solution of (2.44) is easily found to be

$$\displaystyle{ \lambda (t) = t - 2,\quad t \in [0,2]. }$$

(2.45)

The graph of λ(t) is shown in Fig. 2.3.

With λ(t) as in (2.45), we can determine u ^∗(t) = −1 throughout. Thus, the state equation is the same as (2.41). Its solution is given by (2.42) for t ∈ [0, 2]. The optimal value of the objective function is J ^∗ = 0. The graph of x ^∗(t) is also sketched in Fig. 2.3.
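To see why the control stays at −1 even after x reaches zero, it may help to compare it with a control that stops at x = 0. The sketch below is an illustration we add here, not part of the original solution; the helper name objective and the comparison control stop_at_zero are hypothetical. Because F = −x is linear, negative values of x contribute positively to J, so continuing to decrease x pays off.

```python
def objective(control, T=2.0, x0=1.0, n=2000):
    """Value of J = integral of -x dt for x' = u, x(0) = x0.
    Uses Euler steps for x and the trapezoidal rule for the integral."""
    dt = T / n
    x, J = x0, 0.0
    for k in range(n):
        u = control(k * dt)
        x_next = x + u * dt
        J += -0.5 * (x + x_next) * dt
        x = x_next
    return J


def always_down(t):
    return -1.0                       # u*(t) = -1 on [0, 2], from (2.45)


def stop_at_zero(t):
    return -1.0 if t < 1.0 else 0.0   # hypothetical alternative: hold x at 0 after t = 1


print(objective(always_down))    # approximately 0
print(objective(stop_at_zero))   # approximately -1/2
```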

Example 2.4

The next example is:

$$\displaystyle{ \max \left \{J =\int _{ 0}^{1} -\frac{1} {2}x^{2}dt\right \} }$$

(2.46)

subject to the same constraints as in Example 2.2, namely,

$$\displaystyle{ \dot{x} = u,\;x(0) = 1,\;u \in \varOmega = [-1,1]. }$$

(2.47)

Here F = −(1∕2)x ² so that the interpretation of the objective function (2.46) is that we are trying to find the trajectory x(t) in order that the area under the curve (1∕2)x ² is minimized.

Solution The Hamiltonian is

$$\displaystyle{ H = -\frac{1} {2}x^{2} +\lambda u. }$$

(2.48)

The control function u ^∗(x, λ) that maximizes the Hamiltonian in this case depends only on λ, and it has the form

$$\displaystyle{ u^{{\ast}}(x,\lambda ) = \mbox{ bang}[-1,1;\lambda ]. }$$

(2.49)

Then, the optimal control at time t can be expressed as u^∗(t) = bang[−1, 1; λ(t)].

The adjoint equation is

$$\displaystyle{ \dot{\lambda }= -H_{x} = x,\;\lambda (1) = 0. }$$

(2.50)

Here the adjoint equation involves x, so we cannot solve it directly. Because the state equation (2.47) involves u, which depends on λ, we also cannot integrate it independently without knowing λ.

A way out of this dilemma is to use some intuition. Since we want to minimize the area under (1∕2)x ² and since x(0) = 1, it is clear that we want x to decrease as quickly as possible. Let us therefore temporarily assume that λ is nonpositive in the interval [0, 1] so that from (2.49) we have u = −1 throughout the interval. (In Exercise 2.8, you will be asked to show that this assumption is correct.) With this assumption, we can solve (2.47) as

$$\displaystyle{ x(t) = 1 - t. }$$

(2.51)

Substituting this into (2.50) gives

$$\displaystyle{\dot{\lambda }= 1 - t.}$$

Integrating both sides of this equation from t to 1 gives

$$\displaystyle{\int _{t}^{1}\dot{\lambda }(\tau )d\tau =\int _{ t}^{1}(1-\tau )d\tau,}$$

or

$$\displaystyle{\lambda (1) -\lambda (t) = (\tau -\frac{1} {2}\tau ^{2})\left.\right \vert _{ t}^{1},}$$

which, using λ(1) = 0, yields

$$\displaystyle{ \lambda (t) = -\frac{1} {2}t^{2} + t -\frac{1} {2}. }$$

(2.52)

The reader may now verify that λ(t) is nonpositive in the interval [0, 1], verifying our original assumption. Hence, (2.51) and (2.52) satisfy the necessary conditions. In Exercise 2.26, you will be asked to show that they satisfy sufficient conditions derived in Sect. 2.4 as well, so that they are indeed optimal. Thus, x ^∗(t) = 1 − t, and using this in (2.46), we can get J ^∗ = −1∕6. Figure 2.4 shows the graphs of the optimal state and adjoint trajectories.
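The verification suggested above can be sketched numerically. The following Python fragment is an illustrative check, not a proof and not part of the original text; the grid size n and the finite-difference step h are arbitrary choices. It confirms on a grid that λ(t) in (2.52) is nonpositive, that it satisfies the adjoint equation (2.50) with x(t) = 1 − t, and that the objective value is close to −1∕6.

```python
def x_star(t):
    return 1.0 - t                      # candidate state trajectory (2.51)


def lam(t):
    return -0.5 * t * t + t - 0.5       # candidate adjoint trajectory (2.52)


n = 1000
dt = 1.0 / n
grid = [k * dt for k in range(n + 1)]

# Largest value of lambda on the grid; it should not exceed 0.
max_lam = max(lam(t) for t in grid)

# Residual of the adjoint equation d(lambda)/dt = x, by central differences.
h = 1e-6
adjoint_residual = max(
    abs((lam(t + h) - lam(t - h)) / (2.0 * h) - x_star(t)) for t in grid[1:-1]
)

# Objective (2.46) along x_star, by the midpoint rule.
J = sum(-0.5 * x_star((k + 0.5) * dt) ** 2 * dt for k in range(n))

print(max_lam, adjoint_residual, J)
```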

Example 2.5

Let us rework Example 2.4 with T = 2, i.e., with the objective function:

$$\displaystyle{ \max \left \{J =\int _{ 0}^{2} -\frac{1} {2}x^{2}dt\right \} }$$

(2.53)

subject to the constraints (2.47).

Solution The Hamiltonian is still as in (2.48) and the form of the optimal policy remains as in (2.49). The adjoint equation is

$$\displaystyle{\dot{\lambda }= x,\;\lambda (2) = 0,}$$

which is the same as (2.50) except T = 2 instead of T = 1. Let us try to extend the solution of the previous example from T = 1 to T = 2. Thus, we keep λ(t) as in (2.52) for t ∈ [0, 1] with λ(1) = 0. Recall from the definition of the bang function that bang[−1, 1; 0] is not defined, which allows us to choose u in (2.49) arbitrarily when λ = 0. This is an instance of singular control, so let us see if we can maintain the singular control by choosing u appropriately. To do this we choose u = 0 when λ = 0. Since λ(1) = 0, we set u(1) = 0 so that from (2.47), we have $\dot{x}(1) = 0.$ Now note that if we set u(t) = 0 for t > 1, then by integrating equations (2.47) and (2.50) forward from t = 1 to t = 2, we see that x(t) = 0 and λ(t) = 0 for 1 < t ≤ 2; in other words, u(t) = 0 maintains singular control in the interval. Intuitively, this is the correct answer since once we get x = 0, we should keep it at 0 in order to maximize the objective function J in (2.53). We will later give further discussion of singular control and state an additional necessary condition in Sect. D.6 for such cases; see also Bell and Jacobson (1975). In Fig. 2.4, we can get the singular solution by extending the graphs shown to the right (as shown by the thick dotted line), making x^∗(t) = 0, λ(t) = 0, and u^∗(t) = 0 for 1 < t ≤ 2.

With the trajectory x ^∗(t), 0 ≤ t ≤ 2, thus obtained, we can use (2.53) to compute the optimal value of the objective function as

$$\displaystyle{J^{{\ast}} =\int _{ 0}^{1} - (1/2)(1 - t)^{2}dt +\int _{ 1}^{2} - (1/2)(0)dt = -1/6.}$$

Now suppose that the initial x(0) is perturbed by a small amount ɛ to x(0) = 1 + ɛ, where ɛ may be positive or negative. According to the marginal value interpretation of λ(0), whose value is − 1∕2 in this example, we can estimate the change in the objective function to be λ(0)ɛ + o(ɛ) = −ɛ∕2 + o(ɛ).

Next we calculate directly the impact of the perturbation in the initial value. For this we must obtain new control and state trajectories. These are clearly

$$\displaystyle\begin{array}{rcl} u_{(1+\varepsilon )}^{{\ast}}(t) = \left \{\begin{array}{cc} - 1,& t \in [0,1+\varepsilon ], \\ 0, &t \in (1+\varepsilon,2], \end{array} \right.& & \\ \end{array}$$

and

$$\displaystyle\begin{array}{rcl} x_{(1+\varepsilon )}^{{\ast}}(t) = \left \{\begin{array}{cc} 1 +\varepsilon -t,& t \in [0,1+\varepsilon ], \\ 0, &t \in (1+\varepsilon,2], \end{array} \right.& & \\ \end{array}$$

where we have used the subscript (1 + ɛ) to distinguish these from the original trajectories as well as to indicate their dependence on the initial value x(0) = 1 + ɛ. We can then obtain the corresponding optimal value of the objective function as

$$\displaystyle\begin{array}{rcl} J_{(1+\varepsilon )}^{{\ast}}& =& \int _{ 0}^{1+\varepsilon } - (1/2)(1 +\varepsilon -t)^{2}dt = -1/6 -\varepsilon /2 -\varepsilon ^{2}/2 -\varepsilon ^{3}/6 \\ & =& -1/6 +\lambda (0)\varepsilon + o(\varepsilon ), \\ \end{array}$$

where o(ɛ) = −ɛ²∕2 − ɛ³∕6.
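The same comparison can be carried out numerically. The sketch below assumes the perturbed trajectory given above, written compactly as x(t) = max(1 + ɛ − t, 0), and integrates (2.53) by the midpoint rule; the function name and the step count are our illustrative choices. The ratio of the remainder to ɛ shrinks with ɛ, as an o(ɛ) term should.

```python
def objective(eps, T=2.0, n=20000):
    """Midpoint-rule value of (2.53) along x(t) = max(1 + eps - t, 0)."""
    dt = T / n
    total = 0.0
    for k in range(n):
        x = max(1.0 + eps - (k + 0.5) * dt, 0.0)
        total += -0.5 * x * x * dt
    return total


lam0 = -0.5  # lambda(0) from (2.52)
for eps in (0.1, 0.01, -0.01):
    # Remainder after removing the first-order estimate -1/6 + lambda(0)*eps.
    remainder = objective(eps) - (-1.0 / 6.0 + lam0 * eps)
    print(eps, remainder, remainder / eps)
```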

In this example and Example 2.2, we have, by direct calculation, demonstrated the significance of λ(0) as the marginal value of the change in the initial state. This could have also been accomplished by obtaining the value function V (x, t) for x(t) = x, t ∈ [0, 2], and then showing that λ(0) = V _x(1, 0). This, of course, is the relationship (2.17) at x(0) = x = 1 and t = 0.

Keep in mind, however, that deriving V(x, t) is more than just finding the solution of the problem, which we have already found by using the maximum principle. V(x, t) also yields additional insights into the problem. In order to completely specify V(x, t) for all x ∈ E¹ and all t ∈ [0, 2], we need to deal with a number of cases. Here, we will carry out the details only in the case of any t ∈ [0, 2] and 0 ≤ x ≤ 2 − t, and leave the listing of the other cases and the required calculations as Exercise 2.13.

We know from (2.9) that we need to solve the optimal control problem for any given t ∈ [0, 2] with 0 ≤ x ≤ 2 − t. However, from our earlier analysis of this example, it is clear that the optimal control

$$\displaystyle\begin{array}{rcl} u_{(x,t)}^{{\ast}}(s) = \left \{\begin{array}{cc} - 1,& s \in [t,t + x], \\ 0, &s \in (t + x,2], \end{array} \right.& & \\ \end{array}$$

and the corresponding

$$\displaystyle\begin{array}{rcl} x_{(x,t)}^{{\ast}}(s) = \left \{\begin{array}{cc} x - (s - t),& s \in [t,t + x], \\ 0, &s \in (t + x,2], \end{array} \right.& & \\ \end{array}$$

where we use the subscript to show the dependence of the control and state trajectories on a problem beginning at time t with the state x(t) = x. Thus,

$$\displaystyle{V (x,t) =\int _{ t}^{t+x} -\frac{1} {2}[x_{(x,t)}^{{\ast}}(s)]^{2}ds = -\frac{1} {2}\int _{t}^{t+x}(x - s + t)^{2}ds.}$$

While this expression can be easily integrated to obtain an explicit solution for V (x, t), we do not need to do this for our immediate purpose at hand, which is to obtain V _x(x, t). Differentiating the right-hand side with respect to x, we obtain

$$\displaystyle{V _{x}(x,t) = -\frac{1} {2}\int _{t}^{x+t}2(x - s + t)ds.}$$

Furthermore, since

$$\displaystyle\begin{array}{rcl} x^{{\ast}}(t) = \left \{\begin{array}{cc} 1 - t,& t \in [0,1], \\ 0, &t \in (1,2], \end{array} \right.& & \\ \end{array}$$

we obtain

$$\displaystyle\begin{array}{rcl} V _{x}(x^{{\ast}}(t),t) = \left \{\begin{array}{cc} -\frac{1} {2}\int _{t}^{1}2(1 - s)ds = -\frac{1} {2}t^{2} + t -\frac{1} {2},& t \in [0,1], \\ 0, &t \in (1,2], \end{array} \right.& & \\ \end{array}$$

which equals λ(t) obtained as the adjoint variable in Example 2.5. Note that for t ∈ [0, 1], λ(t) in Example 2.5 is the same as that in Example 2.4 obtained in (2.52).
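As a numerical cross-check of this relation, the sketch below evaluates V(x, t) from the integral above by the midpoint rule and differentiates it by a central difference. It covers only the case 0 ≤ x ≤ 2 − t treated here; the function names, the step count, and the difference step h are illustrative assumptions.

```python
def V(x, t, n=2000):
    """Midpoint-rule value of V(x, t) = -(1/2) * integral over [t, t + x] of (x - s + t)^2 ds.
    Valid only for 0 <= x <= 2 - t; t enters only through this restriction."""
    ds = x / n
    return sum(-0.5 * (x - (k + 0.5) * ds) ** 2 * ds for k in range(n))


def V_x(x, t, h=1e-4):
    """Central-difference approximation of the partial derivative of V in x."""
    return (V(x + h, t) - V(x - h, t)) / (2.0 * h)


def lam(t):
    """Adjoint variable of Example 2.5: (2.52) on [0, 1], zero on (1, 2]."""
    return -0.5 * t * t + t - 0.5 if t <= 1.0 else 0.0


for t in (0.0, 0.25, 0.5, 0.75):
    x_opt = 1.0 - t  # x*(t) on [0, 1]
    print(t, V_x(x_opt, t), lam(t))
```

The two printed columns should agree up to discretization error at each sampled t.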

Example 2.6

This example is slightly more complicated and the optimal control is not bang-bang. The problem is:

$$\displaystyle{ \max \left \{J =\int _{ 0}^{2}(2x - 3u - u^{2})dt\right \} }$$

(2.54)

subject to

$$\displaystyle{ \dot{x} = x + u,\;x(0) = 5 }$$

(2.55)

and the control constraint

$$\displaystyle{ u \in \varOmega = [0,2]. }$$

(2.56)

Solution Here T = 2, F = 2x − 3u − u ², S = 0, and f = x + u. The Hamiltonian is

$$\displaystyle\begin{array}{rcl} H& =& (2x - 3u - u^{2}) +\lambda (x + u) \\ & =& (2+\lambda )x - (u^{2} + 3u -\lambda u).{}\end{array}$$

(2.57)

Let us find the optimal control policy by differentiating (2.57) with respect to u. Thus,

$$\displaystyle{\frac{\partial H} {\partial u} = -2u - 3+\lambda = 0,}$$

so that the form of the optimal control is

$$\displaystyle{ u^{{\ast}}(t) = \frac{\lambda (t) - 3} {2}, }$$

(2.58)

provided this expression stays within the interval Ω = [0, 2]. Note that the second derivative of H with respect to u is ∂ ² H∕∂u ² = −2 < 0, so that (2.58) satisfies the second-order condition for the maximum of a function.

We next derive the adjoint equation as

$$\displaystyle{ \dot{\lambda }= -\frac{\partial H} {\partial x} = -2-\lambda,\;\lambda (2) = 0. }$$

(2.59)

Referring to Appendix A.1, we can use the integrating factor e ^t to obtain

$$\displaystyle{e^{t}(d\lambda +\lambda dt) = d(e^{t}\lambda ) = -2e^{t}dt.}$$

We then integrate it on both sides from t to 2 and use the terminal condition λ(2) = 0 to obtain the solution of the adjoint equation (2.59) as

$$\displaystyle{\lambda (t) = 2(e^{2-t} - 1).}$$

If we substitute this into (2.58) and impose the control constraint (2.56), we see that the optimal control is

$$\displaystyle{ \begin{array}{c} u^{{\ast}}(t) = \left \{\begin{array}{ccl} 2 &\mbox{ if}&e^{2-t} - 2.5 > 2, \\ e^{2-t} - 2.5&\mbox{ if}&0 \leq e^{2-t} - 2.5 \leq 2, \\ 0 &\mbox{ if}&e^{2-t} - 2.5 < 0, \end{array} \right. \end{array} }$$

(2.60)

or referring to the notation defined in (1.22),

$$\displaystyle{u^{{\ast}}(t) = \mbox{ sat}[0,2;e^{2-t} - 2.5].}$$

The graph of u^∗(t) appears in Fig. 2.5. In the figure, t₁ satisfies $e^{2-t_{1}} - 2.5 = 2,$ i.e., t₁ = 2 − ln 4.5 ≈ 0.496, while t₂ satisfies $e^{2-t_{2}} - 2.5 = 0,$ which gives t₂ = 2 − ln 2.5 ≈ 1.08.
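The control (2.60) and the switching times are easy to evaluate directly. The following sketch assumes that the sat notation of (1.22) means clipping its last argument to the interval given by the first two; the Python names are ours.

```python
import math


def sat(lower, upper, w):
    """Clip w to the interval [lower, upper] (our reading of the sat notation)."""
    return max(lower, min(upper, w))


def lam(t):
    """Solution of the adjoint equation (2.59)."""
    return 2.0 * (math.exp(2.0 - t) - 1.0)


def u_star(t):
    """Optimal control (2.60): the unconstrained maximizer (2.58), clipped to [0, 2]."""
    return sat(0.0, 2.0, (lam(t) - 3.0) / 2.0)


t1 = 2.0 - math.log(4.5)   # control leaves the upper bound 2
t2 = 2.0 - math.log(2.5)   # control reaches the lower bound 0

print("t1 =", t1, " t2 =", t2)
for t in (0.0, 0.8, 1.5, 2.0):
    print(t, u_star(t))
```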

In Exercise 2.2 you will be asked to compute the optimal state trajectory x ^∗(t) corresponding to u ^∗(t) shown in Fig. 2.5 by piecing together the solutions of three separate differential equations obtained from (2.55) and (2.60).

2.4 Sufficiency Conditions

So far, we have shown the necessity of the maximum principle conditions for optimality. Next we prove a theorem that gives qualifications under which the maximum principle conditions are also sufficient for optimality. This theorem is important from our point of view since the models derived from many management science applications will satisfy conditions required for the sufficiency result. As remarked earlier, our technique for proving existence will be to display, for any given model, a solution that satisfies both necessary and sufficient conditions. A good reference for sufficiency conditions is Seierstad and Sydsæter (1987).

We first define a function H⁰: Eⁿ × Eⁿ × E¹ → E¹ called the derived Hamiltonian as follows:

$$\displaystyle{ H^{0}(x,\lambda,t) =\max _{ u\in \varOmega (t)}H(x,u,\lambda,t). }$$

(2.61)

We assume that by this equation a function u ^∗(x, λ, t) is implicitly and uniquely defined. Given these assumptions we have by definition,

$$\displaystyle{ H^{0}(x,\lambda,t) = H(x,u^{{\ast}},\lambda,t). }$$

(2.62)

For our proof of the sufficiency of the maximum principle, we also need the derivative H _x ⁰(x, λ, t), which by use of the Envelope Theorem can be given as

$$\displaystyle{ H_{x}^{0}(x,\lambda,t) = H_{ x}(x,u^{{\ast}},\lambda,t):= H_{ x}(x,u,\lambda,t)\vert _{u=u^{{\ast}}}. }$$

(2.63)

To see this in the case when u ^∗(x, λ, t) is differentiable in x, let us differentiate (2.62) with respect to x:

$$\displaystyle{ H_{x}^{0}(x,\lambda,t) = H_{ x}(x,u^{{\ast}},\lambda,t) + H_{ u}(x,u^{{\ast}},\lambda,t)\frac{\partial u^{{\ast}}} {\partial x}. }$$

(2.64)

To obtain (2.63) from (2.64), we need to show that the second term on the right-hand side of (2.64) vanishes, i.e.,

$$\displaystyle{ H_{u}(x,u^{{\ast}},\lambda,t)\frac{\partial u^{{\ast}}} {\partial x} = 0 }$$

(2.65)

for each x. There are two cases to consider. If u^∗ is in the interior of Ω(t), then it satisfies the first-order condition H_u(x, u^∗, λ, t) = 0, thereby implying (2.65). Otherwise, u^∗ is on the boundary of Ω(t). Then, for each i, j, either $H_{u_{i}} = 0$ or ∂u_i^∗∕∂x_j = 0 or both. Once again, (2.65) holds. Exercise 2.25 gives a specific instance of this case.

Remark 2.3

We have shown the result in (2.63) in cases when u^∗ is a differentiable function of x. The result holds more generally, provided that Ω(t) is appropriately qualified; see Derzko et al. (1984). Such results are known as Envelope Theorems, and are used often in economics.
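A small numerical illustration of (2.63) may help. The Hamiltonian used below, H = −u² + λxu − x² with Ω = [0, 1], is a hypothetical example chosen by us (it does not come from this chapter) so that the maximizer u^∗ depends on x. Since this H is concave in u, the maximizer is the stationary point λx∕2 clipped to Ω. The sketch compares a central-difference estimate of the x-derivative of H⁰ with H_x evaluated at u^∗, once at an interior maximizer and once at a boundary one.

```python
def H(x, u, lam):
    """Hypothetical Hamiltonian, used only for illustration."""
    return -u * u + lam * x * u - x * x


def u_star(x, lam):
    """Maximizer of H over u in [0, 1]: the stationary point lam*x/2, clipped."""
    return max(0.0, min(1.0, lam * x / 2.0))


def H0(x, lam):
    """Derived Hamiltonian (2.61) for the hypothetical H."""
    return H(x, u_star(x, lam), lam)


def H_x_at_u_star(x, lam):
    """Partial derivative of H in x with u held fixed at u*, as in (2.63)."""
    return lam * u_star(x, lam) - 2.0 * x


def H0_x(x, lam, h=1e-5):
    """Central-difference estimate of the derivative of H0 in x."""
    return (H0(x + h, lam) - H0(x - h, lam)) / (2.0 * h)


for x, lam in ((1.0, 1.0), (3.0, 1.0)):   # interior u* = 0.5, boundary u* = 1
    print(x, u_star(x, lam), H0_x(x, lam), H_x_at_u_star(x, lam))
```

In both cases the last two printed numbers should coincide up to rounding, which is what (2.63) asserts.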

Theorem 2.1 (Sufficiency Conditions)

Let u^∗(t), and the corresponding x^∗(t) and λ(t), satisfy the maximum principle necessary condition (2.31) for all t ∈ [0, T]. Then, u^∗ is an optimal control if H⁰(x, λ(t), t) is concave in x for each t and S(x, T) is concave in x.

Proof. The proof is a minor extension of the arguments in Arrow and Kurz (1970) . By definition

$$\displaystyle\begin{array}{rcl} H[x(t),u(t),\lambda (t),t] \leq H^{0}[x(t),\lambda (t),t].& &{}\end{array}$$

(2.66)

Since H ⁰ is differentiable and concave, we can use the applicable definition of concavity given in Sect. 1.4 to obtain

$$\displaystyle{ H^{0}[x(t),\lambda (t),t] \leq H^{0}[x^{{\ast}}(t),\lambda (t),t] + H_{ x}^{0}[x^{{\ast}}(t),\lambda (t),t][x(t) - x^{{\ast}}(t)]. }$$

(2.67)

Using (2.66), (2.62), and (2.63) in (2.67), we obtain

$$\displaystyle\begin{array}{lllll} && H[x(t),u(t),\lambda (t),t]\; \leq \; H[x^{{\ast}}(t),u^{{\ast}}(t),\lambda (t),t]&& \\ & &\qquad +H_{x}[x^{{\ast}}(t),u^{{\ast}}(t),\lambda (t),t][x(t) - x^{{\ast}}(t)].{}\end{array}$$

(2.68)

By definition of H in (2.18) and the adjoint equation of (2.31)

$$\displaystyle\begin{array}{lllll} &&F[x(t),u(t),t] +\lambda (t)f[x(t),u(t),t] \leq F[x^{{\ast}}(t),u^{{\ast}}(t),t]&& \\ & &\qquad +\lambda (t)f[x^{{\ast}}(t),u^{{\ast}}(t),t] -\dot{\lambda } (t)[x(t) - x^{{\ast}}(t)].{}\end{array}$$

(2.69)

Using the state equation in (2.31), transposing, and regrouping,

$$\displaystyle\begin{array}{rcl} F[x^{{\ast}}(t),u^{{\ast}}(t),t] - F[x(t),u(t),t]& \geq & \dot{\lambda }(t)[x(t) - x^{{\ast}}(t)] \\ & & +\lambda (t)[\dot{x}(t) -\dot{ x}^{{\ast}}(t)].{}\end{array}$$

(2.70)

Furthermore, since S(x, T) is a differentiable and concave function in its first argument, we have

$$\displaystyle{ S(x(T),T) \leq S(x^{{\ast}}(T),T) + S_{ x}(x^{{\ast}}(T),T)[x(T) - x^{{\ast}}(T)] }$$

(2.71)

or,

$$\displaystyle{ S(x^{{\ast}}(T),T) - S(x(T),T) \geq -S_{ x}(x^{{\ast}}(T),T)[x(T) - x^{{\ast}}(T)]. }$$

(2.72)

Integrating both sides of (2.70) from 0 to T and adding (2.72), we have

$$\displaystyle\begin{array}{lllll} &&\left [\int _{0}^{T}F(x^{{\ast}}(t),u^{{\ast}}(t),t)dt + S(x^{{\ast}}(T),T)\right ]&& {}\\ & &\qquad -\left [\int _{0}^{T}F(x(t),u(t),t)dt + S(x(T),T)\right ] {}\\ & &\qquad \geq [\lambda (T) - S_{x}(x^{{\ast}}(T),T)][x(T) - x^{{\ast}}(T)] -\lambda (0)[x(0) - x^{{\ast}}(0)] {}\\ \end{array}$$

or,

$$\displaystyle\begin{array}{lllll} &&J(u^{{\ast}}) - J(u)&& {} \\ & &\qquad \geq [\lambda (T) - S_{x}(x^{{\ast}}(T),T)][x(T) - x^{{\ast}}(T)] -\lambda (0)[x(0) - x^{{\ast}}(0)], \\ \end{array}$$

(2.73)

where J(u) is the value of the objective function associated with a control u. Since x ^∗(0) = x(0) = x ₀, the initial condition, and since λ(T) = S _x(x ^∗(T), T) from the terminal adjoint condition in (2.31), we have

$$\displaystyle{ J(u^{{\ast}}) \geq J(u). }$$

(2.74)

Thus, u ^∗ is an optimal control. This completes the proof. □

Because λ(t) is not known a priori, it is usual to test H ⁰ for a stronger assumption, i.e., to check for the concavity of the function H ⁰(x, λ, t) in x for any λ and t. Sometimes the stronger condition given in Exercise 2.27 can be used.

Mangasarian (1966) gives a sufficient condition in which the concavity of H ⁰(x, λ(t), t) in Theorem 2.1 is replaced by a stronger condition requiring the Hamiltonian H(x, u, λ(t), t) to be jointly concave in (x, u).

Example 2.7

Let us show that the problems in Examples 2.2 and 2.3 satisfy the sufficient conditions. We have from (2.36) and (2.61),

$$\displaystyle{H^{0} = -x +\lambda u^{{\ast}},}$$

where u ^∗ is given by (2.37). Since u ^∗ is a function of λ only, H ⁰(x, λ, t) is certainly concave in x for any t and λ (and in particular for λ(t) supplied by the maximum principle). Since S(x, T) = 0, the sufficient conditions hold.
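The concavity requirement can also be probed numerically. The sketch below builds H⁰ for Examples 2.2 and 2.3 from (2.36) and (2.37) and tests the defining inequality of concavity on randomly drawn pairs of points. Such a test can only fail to find a counterexample; it does not replace the argument above. The sampling ranges, the tolerance, and the function names are our assumptions.

```python
import random


def H0(x, lam):
    """Derived Hamiltonian for Examples 2.2 and 2.3:
    H = -x + lam*u maximized over u in [-1, 1]."""
    u_star = 1.0 if lam > 0 else -1.0   # bang[-1, 1; lam]; any value works at lam = 0
    return -x + lam * u_star


def looks_concave_in_x(lam, trials=1000, tol=1e-9, seed=0):
    """Check H0(theta*a + (1-theta)*b) >= theta*H0(a) + (1-theta)*H0(b) on random samples."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = rng.uniform(-10.0, 10.0), rng.uniform(-10.0, 10.0)
        theta = rng.random()
        value_at_mix = H0(theta * a + (1.0 - theta) * b, lam)
        chord = theta * H0(a, lam) + (1.0 - theta) * H0(b, lam)
        if value_at_mix < chord - tol:
            return False
    return True


print(all(looks_concave_in_x(lam) for lam in (-2.0, -0.5, 0.0, 0.5, 2.0)))
```

Here H⁰ is linear in x, so the inequality holds with equality up to rounding; the tolerance absorbs that rounding.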

Finally, it is important to mention that thus far in this chapter, we have considered problems in which the terminal values of the state variables are not constrained. Such problems are called free-end-point problems. The problems at the other extreme, where the terminal values of the state variables are completely specified, are termed fixed-end-point problems. Then, there are problems in between these two extremes. While a detailed discussion of terminal conditions on state variables appears in Sect. 3.4 of the next chapter, it is instructive here to briefly indicate how the maximum principle needs to be modified in the case of fixed-end-point problems. Suppose x(T) is completely specified, i.e., x(T) = k ∈ E ⁿ, where k is a vector of constants. Observe then that the first term on the right-hand side of inequality (2.73) vanishes regardless of the value of λ(T), since x(T) − x ^∗(T) = k − k = 0 in this case. This means that the sufficiency result would go through for any value of λ(T). Not surprisingly, therefore, the transversality condition (2.30) in the fixed-end-point case changes to

$$\displaystyle{ \lambda (T) =\beta, }$$

(2.75)

where β ∈ E ⁿ is a vector of constants to be determined.

Indeed, one can show that (2.75) is also the necessary transversality condition for fixed-end-point problems. With this observation, the maximum principle for fixed-end-point problems can be obtained by modifying (2.31) as follows: adding x(T) = k and removing λ(T) = S_x(x^∗(T), T). Likewise, the resulting TPBVP (2.32) can be modified correspondingly; it will have initial and final values on the state variables, whereas both initial and terminal values for the adjoint variables are unspecified, i.e., λ(0) and λ(T) are constants to be determined.

In Exercises 2.28 and 2.19, you are asked to solve the fixed-end-point problems given there.

2.5 Solving a TPBVP by Using Excel

A number of examples and exercises found throughout this book involve finding a numerical solution to a two-point boundary value problem (TPBVP). In this section we will show how the GOAL SEEK function in Excel can be used for this purpose. We will solve the following example.

Example 2.8

Consider the problem:

$$\displaystyle{\max \left \{J =\int _{ 0}^{1} -\frac{1} {2}(x^{2} + u^{2})dt\right \}}$$

subject to

$$\displaystyle{ \dot{x} = -x^{3} + u,\;x(0) = 5. }$$

(2.76)

Solution We form the Hamiltonian

$$\displaystyle{H = -\frac{1} {2}(x^{2} + u^{2}) +\lambda (-x^{3} + u),}$$

where the adjoint variable λ satisfies the equation

$$\displaystyle{ \dot{\lambda }= x + 3x^{2}\lambda,\;\lambda (1) = 0. }$$

(2.77)

Since u is unconstrained, we set H _u = 0 to obtain u ^∗ = λ. With this, the state equation (2.76) becomes

$$\displaystyle{ \dot{x} = -x^{3}+\lambda,\;x(0) = 5. }$$

(2.78)

Thus, the TPBVP is given by the system of equations (2.77) and (2.78).

A simple method to solve the TPBVP uses what is known as the shooting method, explained in the flowchart in Fig. 2.6.

We will use Excel functions to implement the shooting method. For this we discretize (2.77) and (2.78) by replacing dx∕dt and dλ∕dt by

$$\displaystyle{\frac{\bigtriangleup x} {\bigtriangleup t} = \frac{x(t + \bigtriangleup t) - x(t)} {\bigtriangleup t} \mbox{ and } \frac{\bigtriangleup \lambda } {\bigtriangleup t} = \frac{\lambda (t + \bigtriangleup t) -\lambda (t)} {\bigtriangleup t},}$$

respectively. Substitution of △x∕△t for $\dot{x}$ in (2.78) and △λ∕△t for $\dot{\lambda }$ in (2.77) gives the discrete version of the TPBVP:

$$\displaystyle{ x(t + \bigtriangleup t) = x(t) + [-x(t)^{3} +\lambda (t)] \bigtriangleup t,\;x(0) = 5, }$$

(2.79)

$$\displaystyle{ \lambda (t + \bigtriangleup t) =\lambda (t) + [x(t) + 3x(t)^{2}\lambda (t)] \bigtriangleup t,\;\lambda (1) = 0. }$$

(2.80)

In order to solve these equations, open an empty spreadsheet, choose the unit of time to be △t = 0.01, make a guess for the initial value λ(0) to be, say, −0.2, and make the entries in the cells of the spreadsheet as specified below:

Enter -0.2 in cell A1.

Enter 5 in cell B1.

Enter =A1+(B1+3*(B1^2)*A1)*0.01 in cell A2.

Enter =B1+(-B1^3+A1)*0.01 in cell B2.

Here we have entered the right-hand side of the difference equation (2.80) for t = 0 in cell A2 and the right-hand side of the difference equation (2.79) for t = 0 in cell B2. Note that λ(0) = −0.2, shown as the entry −0.2 in cell A1, is merely a guess. The correct value will be determined by the use of the GOAL SEEK function.

Next highlight cells A2 and B2 and drag the combination down to row 101 of the spreadsheet. Using EDIT in the menu bar, select FILL DOWN. Thus, Excel will solve Eqs. (2.80) and (2.79) from t = 0 to t = 1 in steps of △t = 0.01, and that solution will appear as entries in columns A and B of the spreadsheet, respectively. In other words, the guessed solution for λ(t) will appear in cells A1 to A101 and the corresponding solution for x(t) will appear in cells B1 to B101. To find the correct value for λ(0), use the GOAL SEEK function under TOOLS in the menu bar and make the following entries:

Set cell: A101.

To value: 0.

By changing cell: A1.

GOAL SEEK finds the correct initial value for the adjoint variable as λ(0) = −0.10437, which should appear in cell A1, and the correct ending value of the state variable as x(1) = 0.62395, which should appear in cell B101. You will notice that the entry in cell A101 may not be exactly zero as instructed, although it will be very close to it. In our example, it is −0.0007. By using the CHART function, the graphs of x^∗(t) and λ(t) can be printed out by Excel as shown in Fig. 2.7.
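The same shooting procedure can be sketched outside a spreadsheet. The Python fragment below repeats the Euler recursion (2.79) and (2.80) with △t = 0.01 and adjusts the guess for λ(0) until the computed λ(1) is close to zero. We do not know which root-finding algorithm GOAL SEEK uses internally, so plain bisection is substituted here; the bracket [−0.2, 0], the cutoff bound, and the function names are our assumptions. The cutoff simply stops a pass early when a poor guess makes λ grow very large.

```python
def shoot(lam0, x0=5.0, dt=0.01, steps=100, bound=1e3):
    """One forward pass of (2.79)-(2.80) from t = 0; returns (lambda, x) at the end.
    The pass stops early if |lambda| exceeds `bound`, which signals a poor guess."""
    lam, x = lam0, x0
    for _ in range(steps):
        # Both right-hand sides use the values at time t, like cells A2 and B2.
        lam, x = lam + (x + 3.0 * x**2 * lam) * dt, x + (-(x**3) + lam) * dt
        if abs(lam) > bound:
            break
    return lam, x


def solve_lambda0(lo=-0.2, hi=0.0, iterations=60):
    """Bisection on the initial adjoint value so that the terminal lambda is near 0."""
    if not (shoot(lo)[0] < 0.0 < shoot(hi)[0]):
        raise ValueError("terminal lambda does not change sign on [lo, hi]")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if shoot(mid)[0] < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


lam0 = solve_lambda0()
lam_T, x_T = shoot(lam0)
print("lambda(0) =", lam0)
print("lambda(1) =", lam_T)   # residual of the terminal condition
print("x(1)      =", x_T)
```

Since the recursion is the same as the one entered in the spreadsheet, the printed λ(0) and x(1) can be compared with the values reported above; bisection drives the residual much closer to zero than the spreadsheet run did, so small differences in the last digits are to be expected.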

As we discuss more complex problems involving control and state inequality constraints in Chaps. 3 and 4, we will realize that the shooting method is no longer adequate to solve such problems. However, there is a large amount of literature devoted to computational methods for solving optimal control problems. While a detailed treatment of this topic is beyond the scope of this book, we suggest some references as well as software in Chap. 4, Sect. 4.3.

Exercises for Chapter 2

E 2.1

Perform the following:

(a)
In Example 2.2, show J ^∗ = −1∕2.
(b)
In Example 2.3, show J ^∗ = 0.
(c)
In Example 2.4, show J ^∗ = −1∕6.
(d)
In Example 2.5, show J ^∗ = −1∕6.

E 2.2

Complete Example 2.6 by writing the optimal x^∗(t) in the form of integrals over the three intervals (0, t₁), (t₁, t₂), and (t₂, 2) shown in Fig. 2.5. Hint: It is not necessary to actually carry out the numerical evaluation of these integrals unless you are ambitious.

E 2.3

Find the optimal solution for Example 2.1 with x ₀ = 0 and T = 1.

E 2.4

Rework Example 2.6 with F = 2x − 3u.

E 2.5

Show that both the Lagrange and Mayer forms of the optimal control problem can be reduced to the linear Mayer form (2.5).

E 2.6

Show that the optimal control obtained from the application of the maximum principle satisfies the principle of optimality. That is, if u^∗(t) is an optimal control and x^∗(t) is the corresponding optimal path for 0 ≤ t ≤ T with x(0) = x₀, show that u^∗(t) for τ ≤ t ≤ T satisfies the maximum principle for the problem beginning at time τ with the initial condition x(τ) = x^∗(τ).

E 2.7

Provide an alternative derivation of the adjoint equation in Sect. 2.2.2 by starting with a restatement of the Eq. (2.19) as − V _t = H ⁰ and differentiating it with respect to x.

E 2.8

In Example 2.4, show that in view of (2.47) any λ(t), t ∈ [0, 1], that satisfies (2.50) must be nonpositive.

E 2.9

The system defined in (2.4) is termed autonomous if F, f, S and Ω are not explicit functions of time t. In this case, show that the Hamiltonian is constant along the optimal path, i.e., show that dH∕dt = 0. Furthermore, verify this result in Example 2.4 by a direct substitution for x and λ from (2.51) and (2.52), respectively, into H given in (2.48).

E 2.10

In Example 2.4, verify by direct calculation that with a new initial value x(0) = 1 + ɛ with ɛ small, the new optimal objective function value will be

$$\displaystyle{J_{(1+\varepsilon )}^{{\ast}} = -1/6 +\lambda (0)\varepsilon + o(\varepsilon ) = -1/6 -\varepsilon /2 -\varepsilon ^{2}/2.}$$

E 2.11

In Example 2.6, verify by direct calculation that with a new initial x(0) = 5 + ɛ with ɛ small, the objective function value will change by

$$\displaystyle{\lambda (0)\varepsilon + o(\varepsilon ) = 2(e^{2} - 1)\varepsilon + o(\varepsilon ).}$$

E 2.12

Obtain the value function $V(x, t)$ explicitly in Example 2.4 and verify the relation $V_x(x^{*}(t), t) = \lambda(t)$ for the example by showing that $V_x(1 - t, t) = -(1/2)t^{2} + t - 1/2$.

E 2.13

Obtain the value function $V(x, t)$ explicitly in Example 2.5 for every $x \in E^{1}$ and $t \in [0, 2]$. Hint: You need to deal with the following cases for $t \in [0, 2]$:

(i) $0 \leq x \leq 2 - t$,
(ii) $x > 2 - t$,
(iii) $t - 2 \leq x < 0$, and
(iv) $x < t - 2$.

E 2.14

Obtain $V(x, t)$ in Example 2.6 for small positive and negative $x$ for $t \in [t_2, 2]$. Then, show that $V_x(x, t) = 2(e^{2-t} - 1)$, $t \in [t_2, 2]$, is the same as $\lambda(t)$, $t \in [t_2, 2]$, obtained in Example 2.6.

E 2.15

Solve the problem:

$$\displaystyle{\max \left \{J =\int _{ 0}^{T}(x -\frac{u^{2}} {2} )dt\right \}}$$

subject to

$$\displaystyle{\dot{x} = u,\;x(0) = x_{0},}$$

$$\displaystyle{u \in [0,1],}$$

for optimal control and optimal state trajectory. Verify that your solution is optimal by using the maximum principle sufficiency condition.

E 2.16

Solve completely the problem:

$$\displaystyle{\max \left \{\int _{0}^{1}(x + u)dt\right \}}$$

subject to

$$\displaystyle{\dot{x} = 1 - u^{2},\;\;x(0) = 1;}$$

that is, find $x^{*}(t)$, $u^{*}(t)$, and $\lambda(t)$, $0 \leq t \leq 1$.

E 2.17

Use the maximum principle to solve the following problem given in the Mayer form:

$$\displaystyle{\max [8x_{1}(18) + 4x_{2}(18)]}$$

subject to

$$\displaystyle{\dot{x}_{1} = x_{1} + x_{2} + u,\;x_{1}(0) = 15,}$$

$$\displaystyle{\dot{x}_{2} = 2x_{1} - u,\;x_{2}(0) = 20,}$$

and the control constraint

$$\displaystyle{0 \leq u \leq 1.}$$

Hint: Use the method in Appendix A to solve the simultaneous differential equations.

E 2.18

In Fig. 2.8, a water reservoir being used for the purpose of fire-fighting is leaking, and its water height x(t) is governed by

$$\displaystyle{\dot{x} = -0.1x + u,\;x(0) = 10,}$$

where u(t) denotes the net inflow at time t and 0 ≤ u ≤ 3.

Note that x(t) also represents the water pressure in appropriate units. Since high water pressure is useful for fire-fighting, the objective function in (a) below involves keeping the average pressure high, while that in (b) involves building up a high pressure at T = 100. Furthermore, we do not need to impose the state constraints 0 ≤ x(t) ≤ 50, as these will always be satisfied for every feasible control u(t), 0 ≤ t ≤ 100.

(a) Find the optimal control which maximizes
$$\displaystyle{J =\int _{ 0}^{100}x\,dt.}$$
Find the maximum level reached.
(b) Replace the objective function in (a) by
$$\displaystyle{J = 5x(100),}$$
and re-solve the problem.
(c) Redo the problem with $J = \int_{0}^{100}(x - 5u)\,dt$.
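
The remark that the state constraints $0 \leq x(t) \leq 50$ never bind can be illustrated numerically. Since $\dot{x}$ is increasing in $u$, every feasible trajectory lies between the ones generated by the extreme constant controls $u \equiv 0$ and $u \equiv 3$. The following minimal sketch integrates both with a simple Euler scheme; the helper name `simulate` and the step count are illustrative choices, not part of the exercise.

```python
import math


def simulate(u, x0=10.0, T=100.0, n=100_000):
    """Euler integration of x' = -0.1*x + u for a constant inflow u; returns the path."""
    h = T / n
    x = x0
    path = [x]
    for _ in range(n):
        x += h * (-0.1 * x + u)
        path.append(x)
    return path


low = simulate(0.0)   # no inflow: lowest feasible trajectory
high = simulate(3.0)  # maximal inflow: highest feasible trajectory

# Extremes of the two bounding trajectories over [0, 100].
print(min(low), max(high))
# Closed-form terminal values for comparison: 10*exp(-0.1*t) and 30 - 20*exp(-0.1*t) at t = 100.
print(10 * math.exp(-10), 30 - 20 * math.exp(-10))
```

Both bounding paths stay strictly inside (0, 30), which is why the interval [0, 50] is never reached.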

E 2.19

Consider the following fixed-end-point problem:

$$\displaystyle{\max _{u}\left \{J = -\int _{0}^{T}(g(x) + cu^{2})dt\right \}}$$

subject to

$$\displaystyle{\dot{x} = f(x) + b(x)u,\;x(0) = x_{0},\;x(T) = 0,}$$

where functions g ≥ 0, f, and b are assumed to be continuously differentiable. Derive the two-point boundary value problem (TPBVP) satisfied by the optimal state and control trajectories.

E 2.20

A Machine Maintenance Problem. Consider the machine state dynamics

$$\displaystyle{\dot{x} = -\delta x + u,\;x(0) = x_{0} > 0,}$$

where δ > 0 is the rate of deterioration of the machine state and u is the rate of machine maintenance. Find the optimal maintenance rate:

$$\displaystyle{\max \left \{J =\int _{ 0}^{T}e^{-\rho t}(\pi x -\frac{u^{2}} {2} )dt + e^{-\rho T}Sx(T)\right \},}$$

where π > 0 with πx representing the profit rate when the machine state is x, u ²∕2 is the cost of maintaining the machine at rate u, ρ > 0 is the discount rate, T is the time horizon, and S > 0 is the salvage value of the machine for each unit of the machine state at time T. Furthermore, show that the optimal maintenance rate decreases, increases, or remains constant over time depending on whether the difference S −π∕(ρ + δ) is negative, positive, or zero, respectively.

E 2.21

Transform the machine maintenance problem of Exercise 2.20 into the Mayer form. Then solve it to obtain the optimal maintenance rate.

E 2.22

Regional Allocation of Investment. Let $K_i$, $i = 1, 2$, denote the capital stock in Region $i$. Let $b_i$ be the productivity of capital and $s_i$ be the marginal propensity to save in Region $i$. Since the investment funds for the two regions come from the savings in the whole economy, we have

$$\displaystyle{\dot{K}_{1} +\dot{ K}_{2} = b_{1}s_{1}K_{1} + b_{2}s_{2}K_{2} = g_{1}K_{1} + g_{2}K_{2},}$$

where $g_i = b_i s_i$. Let $u$ denote the control variable representing the fraction of investment allocated to Region 1 with the remainder going to Region 2. Clearly,

$$\displaystyle{0 \leq u \leq 1,} \tag{2.81}$$

and

$$\displaystyle{\dot{K}_{1} = u(g_{1}K_{1} + g_{2}K_{2}),\;K_{1}(0) = a_{1} > 0,} \tag{2.82}$$

$$\displaystyle{\dot{K}_{2} = (1 - u)(g_{1}K_{1} + g_{2}K_{2}),\;K_{2}(0) = a_{2} > 0.} \tag{2.83}$$

The optimal control problem is to maximize the productivity of the whole economy at time T. Thus, the objective is:

$$\displaystyle{\max \{J = b_{1}K_{1}(T) + b_{2}K_{2}(T)\}}$$

subject to (2.81), (2.82), and (2.83).

(a) Use the maximum principle to derive the form of the optimal policy.
(b) Assume $b_2 > b_1$. Show that $u^{*}(t) = 0$ for $t \in [\hat{t},T]$, where $\hat{t}$ is a switching point and $0 \leq \hat{t} < T$.
(c) If you are ambitious, find the $\hat{t}$ of part (b).

E 2.23

Investment Allocation. Let $K$ denote the capital stock and $\lambda K$ its output rate with $\lambda > 0$. For simplicity in notation, we set the productivity factor $\lambda = 1$. Let $u$ denote the invested fraction of the output. Then, $uK$ is the investment rate and $(1 - u)K$ is the consumption rate. Let us assume an exponential utility $1 - e^{-C}$ of consumption $C$. Solve the resulting optimal control problem:

$$\displaystyle{\max \left \{J =\int _{ 0}^{T}[1 - e^{-(1-u(t))K(t)}]dt\right \}}$$

subject to

$$\displaystyle{\dot{K}(t) = u(t)K(t),\;K(0) = K_{0},\;K(T)\mbox{ free},\;0 \leq u(t) \leq 1,\;0 \leq t \leq T.}$$

Assume $T > 1$ and $0 < K_0 < 1 - e^{1-T}$. Obtain explicitly the optimal investment allocation $u^{*}(t)$, optimal capital $K^{*}(t)$, and the adjoint variable $\lambda(t)$, $0 \leq t \leq T$.

E 2.24

The rate at which a new product can be sold at any time $t$ is $f(p(t))g(Q(t))$, where $p$ is the price and $Q$ is cumulative sales. We assume $f'(p) < 0$; that is, sales vary inversely with price. Also, $g'(Q) \gtrless 0$ for $Q \lessgtr Q_1$, respectively, where $Q_1 > 0$ is a constant known as the saturation level. For a given price, current sales grow with past sales in the early stages as people learn about the good from past purchasers. But as cumulative sales increase, there is a decline in the number of people who have not yet purchased the good. Eventually the sales rate for any given price falls, as the market becomes saturated. The unit production cost $c$ may be constant or may decline with cumulative sales if the firm learns how to produce less expensively with experience: $c = c(Q)$, $c'(Q) \leq 0$. Formulate and solve the optimal control problem in order to characterize the price policy $p(t)$, $0 \leq t \leq T$, that maximizes profits from this new “fad” over a fixed horizon $T$. Specifically, show that in marketing a new product, its optimal price rises while the market expands to its saturation level and falls as the market matures beyond the saturation level.

E 2.25

Suppose $H(x,u,\lambda,t) =\lambda ux -\frac{1} {2}u^{2}$ and Ω(t) = [0, 1] for all t.

(a) Show that the form of the optimal control is given by the function
$$\displaystyle{\begin{array}{c} u^{{\ast}}(x,\lambda ) = \mbox{ sat}[0,1;\lambda x] = \left \{\begin{array}{ccl} \lambda x&\mbox{ if}&0 \leq \lambda x \leq 1, \\ 1&\mbox{ if}&\lambda x > 1, \\ 0&\mbox{ if}&\lambda x < 0.\end{array} \right. \end{array} }$$
(b) Verify that (2.63) holds for all values of $x$ and $\lambda$.
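
The saturation form in part (a) can be spot-checked numerically. The sketch below compares $\mbox{sat}[0,1;\lambda x]$ with a brute-force maximizer of $H = \lambda u x - \frac{1}{2}u^{2}$ over a grid on [0, 1]. The function names `sat`, `hamiltonian`, and `grid_argmax` and the sample points are illustrative choices; the check is a plausibility test at a few points, not a proof.

```python
def sat(lo, hi, z):
    """Saturation function sat[lo, hi; z]: clip z to the interval [lo, hi]."""
    return max(lo, min(hi, z))


def hamiltonian(x, u, lam):
    """H(x, u, lambda) = lambda*u*x - u^2/2 (no explicit time dependence here)."""
    return lam * u * x - 0.5 * u * u


def grid_argmax(x, lam, n=1000):
    """Brute-force maximizer of H over u in [0, 1] on a uniform grid."""
    grid = [i / n for i in range(n + 1)]
    return max(grid, key=lambda u: hamiltonian(x, u, lam))


# One sample per branch: interior, upper bound, and lower bound (twice).
for x, lam in [(2.0, 0.25), (2.0, 1.5), (-1.0, 0.7), (0.5, -3.0)]:
    print(x, lam, sat(0.0, 1.0, lam * x), grid_argmax(x, lam))
```

Because $H$ is strictly concave in $u$, the unconstrained maximizer $\lambda x$ is simply clipped to the admissible interval, which is what the grid search reproduces.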

E 2.26

Show that the derived Hamiltonians $H^{0}$ found in Examples 2.4 and 2.6 satisfy the concavity condition required for the sufficiency result in Sect. 2.4.

E 2.27

If $F$ and $f$ are concave in $x$ and $u$ and if $\lambda(t) \geq 0$, show that the derived Hamiltonian $H^{0}$ is concave in $x$. Note that the concavity of $F$ and $f$ is easier to check than the concavity of $H^{0}$ required in Theorem 2.1 on sufficiency conditions.

E 2.28

A simple controlled dynamical system is modeled by the scalar equation

$$\displaystyle{\dot{x} = x + u.}$$

The fixed-end-point optimal control problem consists in steering $x(t)$ from an initial state $x(0) = x_0$ to the target $x(1) = 0$, such that

$$\displaystyle{J(u) = \frac{1} {4}\int _{0}^{1}u^{4}dt}$$

is minimized. Use the maximum principle to show that the optimal control is given by

$$\displaystyle{u^{{\ast}}(t) = \frac{4x_{0}} {3} (e^{-4/3} - 1)^{-1}e^{-t/3}.}$$
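
A quick feasibility check of the stated control is possible before deriving it: substituting $u^{*}(t)$ into $\dot{x} = x + u$ should drive the state to $x(1) = 0$ for any $x_0$. The sketch below does this with a classical fourth-order Runge-Kutta integrator. The names `u_star` and `terminal_state` are hypothetical, and the check confirms only that the end-point condition is met, not optimality.

```python
import math


def u_star(t, x0):
    """Candidate control from the exercise statement."""
    return (4.0 * x0 / 3.0) / (math.exp(-4.0 / 3.0) - 1.0) * math.exp(-t / 3.0)


def terminal_state(x0, n=2000):
    """Integrate x' = x + u*(t) on [0, 1] with classical RK4 and return x(1)."""
    def f(t, x):
        return x + u_star(t, x0)

    h = 1.0 / n
    t, x = 0.0, x0
    for _ in range(n):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h * k1 / 2)
        k3 = f(t + h / 2, x + h * k2 / 2)
        k4 = f(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x


# The terminal state should vanish (up to integration error) for every initial state.
for x0 in (1.0, -2.0, 5.0):
    print(x0, terminal_state(x0))
```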

E 2.29

Perform the following:

(a) Solve the optimal consumption problem of Example 1.3 with $U(C) = \ln C$ and $B = 0$.

Hint: Since $C(t) \geq 0$, we can replace the state constraint $W(t) \geq 0$, $t \in [0, T]$, by the terminal condition $W(T) = 0$, and then use the transversality condition given in (2.75).

(b) Find the rate of change of optimal consumption over time and conclude that consumption remains constant when $r = \rho$, increases when $r > \rho$, and decreases when $r < \rho$.

E 2.30

Perform the following:

(a) Formulate the TPBVP (2.32) and its discrete version for the problem in Example 2.8, but with a new initial condition $x(0) = 1$.
(b) Solve the discrete version of the TPBVP by using Excel.

E 2.31

Solve explicitly

$$\displaystyle{\mbox{ max}\left \{J = -\int _{0}^{2}x(t)dt\right \}}$$

subject to

$$\displaystyle{\dot{x}(t) = u(t),\;x(0) = 1,\;x(2)\; =\; 0,}$$

$$\displaystyle{-a\; \leq \; u(t) \leq b,\;a > 1/2,\;b > 0.}$$

Obtain the optimal $x^{*}(t)$, $u^{*}(t)$, and all required multipliers.

Bibliography

Arrow KJ, Kurz M (1970) Public investment, the rate of return, and optimal fiscal policy. The Johns Hopkins Press, Baltimore
Bell DJ, Jacobson DH (1975) Singular optimal control. Academic Press, New York
Bellman RE (1957) Dynamic programming. Princeton University Press, Princeton
Berkovitz LD (1961) Variational methods in problems of control and programming. J Math Anal Appl 3:145–169
Bettiol P, Vinter RB (2010) Sensitivity interpretations of the costate variable for optimal control problems with state constraints. SIAM J Control Optim 48(5):3297–3317
Boltyanskii VG (1971) Mathematical methods of optimal control. Holt, Rinehart & Winston, New York
Bryant GF, Mayne DQ (1974) The maximum principle. Int J Control 20:1021–1054
Bryson AE Jr, Ho Y-C (1975) Applied optimal control. Optimization, estimation and control. Taylor & Francis, New York
Cernea A, Frankowska H (2005) A connection between the maximum principle and dynamic programming for constrained control problems. SIAM J Control Optim 44:673–703
Cesari L (1983) Optimization - theory and applications: problems with ordinary differential equations. Springer, New York
Clarke FH (1983) Optimization and nonsmooth analysis. Wiley, New York
Clarke FH (1989) Methods of dynamic and nonsmooth optimization. Society for Industrial and Applied Mathematics, Philadelphia
Cohen KJ, Cyert RM (1965) Theory of the firm: resource allocation in a market economy. Prentice-Hall, Englewood Cliffs
Derzko NA, Sethi SP, Thompson GL (1984) Necessary and sufficient conditions for optimal control of quasilinear partial differential systems. J Optim Theory Appl 43:89–101
Feichtinger G, Hartl RF (1986) Optimale Kontrolle Ökonomischer Prozesse: Anwendungen des Maximumprinzips in den Wirtschaftswissenschaften. Walter De Gruyter, Berlin
Fel’dbaum AA (1965) Optimal control systems. Academic Press, New York
Halkin H (1967) On the necessary condition for optimal control of non-linear systems. In: Leitmann G (ed) Topics in optimization. Academic Press, New York
Hartberger RJ (1973) A proof of the Pontryagin maximum principle for initial-value problem. J Optim Theory Appl 11:139–145
Hartl RF, Sethi SP, Vickson RG (1995) A survey of the maximum principles for optimal control problems with state constraints. SIAM Rev 37(2):181–218
Kirk DE (1970) Optimal control theory: an introduction. Prentice-Hall, Englewood Cliffs
Leitmann G (1981) The calculus of variations and optimal control. In: Miele A (ed) Series mathematical concepts and methods in science and engineering. Plenum Press, New York
Mangasarian OL (1966) Sufficient conditions for the optimal control of nonlinear systems. SIAM J Control 4:139–152
Pontryagin LS, Boltyanskii VG, Gamkrelidze RV, Mischenko EF (1962) The mathematical theory of optimal processes. Wiley, New York
Roberts SM, Shipman JS (1972) Two-point boundary value problems: shooting methods. Elsevier, New York
Seierstad A, Sydsæter K (1987) Optimal control theory with economic applications. North-Holland, Amsterdam
Takayama A (1974) Mathematical economics. The Dryden Press, Hinsdale
Yong J, Zhou XY (1999) Stochastic controls: Hamiltonian systems and HJB equations. Springer, New York

Author information

Authors and Affiliations

Jindal School of Management, SM30, University of Texas at Dallas, Richardson, TX, USA
Suresh P. Sethi

About this chapter

Cite this chapter

Sethi, S.P. (2019). The Maximum Principle: Continuous Time. In: Optimal Control Theory. Springer, Cham. https://doi.org/10.1007/978-3-319-98237-3_2

DOI: https://doi.org/10.1007/978-3-319-98237-3_2
Published: 29 November 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98236-6
Online ISBN: 978-3-319-98237-3
eBook Packages: Business and Management, Business and Management (R0)
