Abstract
Some classical references in the literature include second-order necessary conditions for the isoperimetric problem of Lagrange in the calculus of variations involving inequality and equality constraints. Those conditions take into account, in the sets of critical directions, the sign of the Lagrange multipliers of the extremal under consideration. However, they hold under a strong assumption of normality relative to a set defined only by equality constraints for active indices. In this paper, we show how the same conditions can be preserved under a weaker assumption, thus providing a wider range of applicability.
1 Introduction
In this paper, we solve a fundamental question in optimization theory. It deals with different criteria for regularity in a calculus of variations context and some of its consequences for first- and second-order necessary conditions. In particular, our main result allows us to weaken the usual assumption of strong normality found in the literature and, at the same time, preserve the corresponding conditions on the sets of extremals and critical directions.
The calculus of variations problem we shall be concerned with is a fixed endpoint problem of Lagrange posed over piecewise smooth functions and involving inequality and equality isoperimetric constraints. In a recent paper (see [1]), we studied different assumptions found in the literature under which, for that problem, second-order necessary conditions are well known, and showed that all those assumptions are equivalent to a form of strong normality defined in [2]. In this paper, we shall prove, in particular, that those second-order conditions for the problem under consideration can be derived under a much weaker assumption.
One can find in the literature different references treating this and more general problems, some inserted in an optimal control context. In particular, let us mention some differences between our main result, which corresponds to an improvement of Theorem 2.2, and the fundamental research developed in [3,4,5,6,7,8]. In those references, one of the main distinctive features is that the necessary conditions obtained make sense, and are derived, without a priori assumptions on the normality of the extremal under consideration. The excellent survey given in [5] provides a full account of these results as well as others related to the inverse function theorem. The paper [6] continues an investigation begun in [3, 4, 8], extending some of the results to the case of mixed constraints. On the other hand, the second-order conditions derived in those references (see, for example, [5, Theorem 2.1]) are expressed in terms of the maximum of a quadratic form, over certain Lagrange multipliers, for all the elements of the so-called critical cone which, for our problem, corresponds to the set of tangential constraints with respect to the original set of inequality and equality constraints. In contrast, Theorem 2.4 provides second-order conditions on the set of tangential constraints with respect to a subset of the previous one, under the assumption of normality relative not to a set defined only by equality constraints for active indices, as in Theorem 2.2, but to a set which properly contains it.
Other references on the subject provide different approaches to the derivation of second-order necessary conditions. Some deal with implicit function theorems (see, for example, [2, 9,10,11,12]), the maximum of a quadratic form for different types of minima [13,14,15], or new notions of conjugacy and critical directions [16,17,18,19].
The paper is organized as follows. In Sect. 2, we pose the problem we shall deal with and state two well-known results on first- and second-order necessary conditions. Both results require a standard assumption of “strong normality” and, based on a particular notion of regularity, we show how they can be easily established. In the same section, we introduce a more general notion of normality and state that the previous, standard second-order conditions hold under the weaker assumption of normality relative to a subset of the set of isoperimetric constraints, which takes into account the sign of the corresponding Lagrange multipliers. This is the main result of the paper. A result on necessary conditions in terms of a more general notion of regularity is then proved and, as explained at the end of that section, our main result turns out to be a simple consequence of the fact that regularity relative to a set S is implied by normality relative to the same set. Section 3 is devoted to proving this fact by introducing the notion of properness relative to a set. Finally, in Sect. 4, we provide a simple example which illustrates the usefulness of our result and for which the classical theory on second-order conditions cannot be applied.
2 The Problem and Main Results
Let us begin by stating the problem. The given data correspond to an interval \(T:=[t_0,t_1]\) in \(\mathbb {R}\), two points \(\xi _0\), \(\xi _1\) in \(\mathbb {R}^n\), functions L and \(L_\gamma \) mapping \(T \times \mathbb {R}^n\times \mathbb {R}^n\) to \(\mathbb {R}\) and scalars \(b_\gamma \) in \(\mathbb {R}\) \((\gamma =1,\ldots ,q)\). Denote by X the space of piecewise \(C^1\) functions mapping T to \(\mathbb {R}^n\) called arcs or trajectories, and let \(X_e\) be the set of all arcs x satisfying the endpoint constraints \(x(t_0)=\xi _0\) and \(x(t_1)=\xi _1\). The admissible arcs are the elements of
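\[ S := \{ x \in X_e : I_\alpha (x) \le 0 \ (\alpha \in R), \ I_\beta (x) = 0 \ (\beta \in Q) \} \]
where \(R=\{ 1,\ldots ,r \}\), \(Q=\{ r+1,\ldots ,q \}\), and
\[ I_\gamma (x) := b_\gamma + \int _{t_0}^{t_1} L_\gamma (t,x(t),{\dot{x}}(t)) \, dt \quad (\gamma =1,\ldots ,q). \]
The problem we shall deal with, which we label (P), is that of minimizing I on S, where
\[ I(x) := \int _{t_0}^{t_1} L(t,x(t),{\dot{x}}(t)) \, dt. \]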
An arc x is said to solve (P), if x is admissible and \(I(x) \le I(y)\) for all admissible arcs y. For any \(x \in X\), the notation \(({{\tilde{x}}}(t))\) represents \((t,x(t),{\dot{x}}(t))\) and we assume that L, \(L_\gamma \) are \(C^2\).
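Given an arc x, consider the first variation of I along x given by
\[ I^\prime (x;y) := \int _{t_0}^{t_1} \{ L_x({{\tilde{x}}}(t)) y(t) + L_{{\dot{x}}}({{\tilde{x}}}(t)) {\dot{y}}(t) \} \, dt \quad (y \in X) \]
and the second variation of I along x given by
\[ I^{\prime \prime }(x;y) := \int _{t_0}^{t_1} 2\Omega (t,y(t),{\dot{y}}(t)) \, dt \quad (y \in X) \]
where, for all \((t,y,{\dot{y}}) \in T \times \mathbb {R}^n\times \mathbb {R}^n\),
\[ 2\Omega (t,y,{\dot{y}}) := \langle y, L_{xx}({{\tilde{x}}}(t)) y \rangle + 2 \langle y, L_{x{\dot{x}}}({{\tilde{x}}}(t)) {\dot{y}} \rangle + \langle {\dot{y}}, L_{{\dot{x}}{\dot{x}}}({{\tilde{x}}}(t)) {\dot{y}} \rangle . \]
The first and second variations of other integrals such as \(I_\gamma \) are defined in a similar way. Define the set of admissible variations as
\[ Y := \{ y \in X : y(t_0) = y(t_1) = 0 \}. \]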
2.1 Strong Normality and p-Regularity
For first- and second-order necessary conditions, one is usually interested in proving linear independence on Y of the first variations \(I^\prime _\gamma (x_0;y)\) (for all \(\gamma \in I_a (x_0) \cup Q\)) of \(I_\gamma \) along an admissible arc \(x_0\), where
\[ I_a(x_0) := \{ \alpha \in R : I_\alpha (x_0) = 0 \} \]
denotes the set of active indices at \(x_0\). This is equivalent to the existence of \(y_\gamma \in Y\) \((\gamma \in I_a(x_0) \cup Q)\) such that
\[ \det \bigl ( I^\prime _i(x_0;y_j) \bigr ) \ne 0 \quad (i,j \in I_a(x_0) \cup Q). \]
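Let us call this property strong normality or, as it will become apparent from the theory to follow, normality relative to \(S_0\), where
\[ S_0 := \{ x \in X_e : I_\alpha (x) = 0 \ (\alpha \in I_a(x_0)), \ I_\beta (x) = 0 \ (\beta \in Q) \}. \]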
Definition 2.1
We shall call a pair \((x_0,\lambda ) \in S \times \mathbb {R}^q\) an extremal, if the following conditions hold:
- i. \(\lambda _\alpha \ge 0\) and \(\lambda _\alpha I_\alpha (x_0)=0\) \((\alpha \in R)\).
- ii. If \(J(x):= I(x) + \sum _1^q \lambda _\gamma I_\gamma (x)\), then \(J^\prime (x_0;y)=0\) for all \(y \in Y\).
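Condition (ii), as is well known, is equivalent to the existence of \(c \in \mathbb {R}^n\) such that
\[ F_{{\dot{x}}}({{\tilde{x}}}_0(t)) = \int _{t_0}^t F_x({{\tilde{x}}}_0(s)) \, ds + c \quad (t \in T) \]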
where \(F = L + \sum _1^q \lambda _\gamma L_\gamma \) (see [2]). Denote by \({{\mathcal {E}}}\) the set of all extremals.
We have the following two well-known results on necessary conditions (see, for example, [2]). Both require a strong normality assumption on the solution to the problem.
Theorem 2.1
If \(x_0\) solves (P) and is strongly normal, then there exists a unique \(\lambda \in \mathbb {R}^q\) such that \((x_0,\lambda ) \in {{\mathcal {E}}}\).
Theorem 2.2
Suppose \(\exists \lambda \in \mathbb {R}^q\) such that \((x_0,\lambda ) \in {{\mathcal {E}}}\). If \(x_0\) solves (P) and is strongly normal, then \(J^{\prime \prime }(x_0;y) \ge 0\) for all \(y \in Y\) satisfying
- a. \(I^\prime _{\alpha }(x_0;y) \le 0 \ (\alpha \in I_a(x_0),\ \lambda _\alpha =0)\);
- b. \(I^\prime _{\beta }(x_0;y) = 0 \ (\beta \in R \ \hbox {with}\ \lambda _\beta > 0, \ \hbox {or}\ \beta \in Q)\).
Let us briefly explain how, based on a particular notion of regularity, these two results can be easily established. We begin by stating the following well-known property of linear functionals on real vector spaces (see [2, 20]).
Lemma 2.1
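Let X be a real vector space. Suppose \(L, L_i\) are linear functionals on X \((i \in A \cup B, \ \hbox {where}\ A=\{1,\ldots ,p\},\ B=\{p+1,\ldots ,m\})\), and
\[ {{\mathcal {R}}} := \{ x \in X : L_\alpha (x) \le 0 \ (\alpha \in A), \ L_\beta (x) = 0 \ (\beta \in B) \}. \]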
If \(L(x) \ge 0\) for all \(x\in {{\mathcal {R}}}\), then \(\exists \{\lambda _i\}_1^m\) such that \(\lambda _\alpha \ge 0\) \((\alpha \in A)\) and such that \(L(x) + \sum _1^m \lambda _iL_i(x) = 0\) \((x \in X)\). If \(\{L_i\}_1^m\) is linearly independent, then \(\{\lambda _i\}_1^m\) is unique.
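Now, define the set of tangential constraints at \(x_0 \in S\) by
\[ {{\mathcal {R}}}_S(x_0) := \{ y \in Y : I^\prime _\alpha (x_0;y) \le 0 \ (\alpha \in I_a(x_0)), \ I^\prime _\beta (x_0;y) = 0 \ (\beta \in Q) \} \]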
and the set of (positive) curvilinear tangents of S at \(x_0\) by
Clearly, \(C_S(x_0) \subset {{\mathcal {R}}}_S(x_0)\) since, for all \(y \in C_S(x_0)\),
If the converse holds, that is, if the two sets coincide, the arc \(x_0\) will be called p-regular. Note that in this event, if \(x_0\) solves the problem, \(y \in {{\mathcal {R}}}_S(x_0)\) and \(\delta > 0\) and x are as in the definition of \(C_S(x_0)\), then \(g(\epsilon ):=I(x(\cdot ,\epsilon ))\) has a local minimum at \(\epsilon = 0\) and so \(g^\prime (0) = I^\prime (x_0;y) \ge 0\). In Theorem 2.1, the existence and uniqueness of \(\lambda _1,\ldots ,\lambda _q\) satisfying (i) and (ii) in the definition of \({{\mathcal {E}}}\), follows by Lemma 2.1. Finally, by an application of the implicit function theorem, one can show that strong normality implies p-regularity.
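For Theorem 2.2, consider the subset of S given by
\[ S_1 := \{ x \in S : I_\alpha (x) = 0 \ (\alpha \in R, \ \lambda _\alpha > 0) \} \]
and note that \(S_1 = \{ x \in S : J(x) = I(x) \}\) and \({{\mathcal {R}}}_{S_1}(x_0)\) is precisely the set of all \(y \in Y\) satisfying (a) and (b) of that theorem, that is,
\[ {{\mathcal {R}}}_{S_1}(x_0) = \{ y \in Y : I^\prime _\alpha (x_0;y) \le 0 \ (\alpha \in I_a(x_0), \ \lambda _\alpha = 0), \ I^\prime _\beta (x_0;y) = 0 \ (\beta \in R \ \hbox {with}\ \lambda _\beta > 0, \ \hbox {or}\ \beta \in Q) \}. \]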
Since strong normality relative to S is equivalent to strong normality relative to \(S_1\), our assumption implies that \({{\mathcal {R}}}_{S_1}(x_0) = C_{S_1}(x_0)\). Thus, for any y in \({{\mathcal {R}}}_{S_1}(x_0)\), there exist \(\delta \) and x as in the definition of \(C_{S_1}(x_0)\) and so, if we define \(g(\epsilon ):=I(x(\cdot ,\epsilon ))\) as before, then \(g(\epsilon ) = J(x(\cdot ,\epsilon ))\) by definition of \(S_1\), and therefore, as one readily verifies, \(J^{\prime \prime }(x_0;y) = g^{\prime \prime }(0) \ge 0\).
Let us emphasize that, in both proofs (which can be seen with full detail in, for example, [1, 2]), as well as in other proofs found in the literature (with a different but, as explained in [1], equivalent assumption in an optimal control context in [2, 9, 10]), the assumption of strong normality is basic. This notion, as mentioned before, is related to the set \(S_0\) when a different, more general definition of normality is adopted.
2.2 Normality and Regularity Relative to S
An arc \(x_0\) will be said to be normal relative to S, if \(\lambda =0\) is the only solution of
- i. \(\lambda _\alpha \ge 0\) and \(\lambda _\alpha I_\alpha (x_0)=0\) \((\alpha \in R)\).
- ii. \(\sum _1^q \lambda _\gamma I^\prime _\gamma (x_0;y)=0\) for all \(y \in Y\).
To understand the origin of this definition, let us state the following well-known set of first-order conditions given in [2].
Theorem 2.3
If \(x_0\) solves (P), then there exist \(\lambda _0 \ge 0\) and \(\lambda _1,\ldots ,\lambda _q\) not all zero, such that
- i. \(\lambda _\alpha \ge 0\) and \(\lambda _\alpha I_\alpha (x_0)=0\) \((\alpha \in R)\).
- ii. If \(J_0(x) := \lambda _0 I(x) + \sum _1^q \lambda _\gamma I_\gamma (x)\), then \(J^\prime _0(x_0;y) = 0\) for all \(y \in Y\).
Clearly, if \(x_0\) is a solution to the problem and is normal relative to S then, necessarily, \(\lambda _0 > 0\) in Theorem 2.3 and the multipliers can be chosen so that \(\lambda _0=1\). In this event, the conclusion of Theorem 2.1 follows except possibly for the uniqueness of the multipliers. Normality relative to S will be called weak normality.
Note that this notion, applied to \(S_0\), is equivalent to strong normality since, in view of the definition, \(x_0\) is normal relative to \(S_0\) if \(\lambda =0\) is the only solution of
- i. \(\lambda _\alpha I_\alpha (x_0)=0\) \((\alpha \in R)\).
- ii. \(\sum _1^q \lambda _\gamma I^\prime _\gamma (x_0;y)=0\) for all \(y \in Y\).
This notion of normality can, of course, be applied also to the set \(S_1\). An arc \(x_0\) is normal relative to \(S_1\), if \(\mu =0\) is the only solution of
- i. \(\mu _\alpha \ge 0\) and \(\mu _\alpha I_\alpha (x_0)=0\) \((\alpha \in R,\ \lambda _\alpha =0)\).
- ii. \(\sum _1^q \mu _\gamma I^\prime _\gamma (x_0;y)=0\) for all \(y \in Y\).
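The gap between these notions of normality can be illustrated in a finite-dimensional analog. The following is a minimal sketch under stated assumptions, not part of the theory above: the first variations of two active inequality constraints are replaced by hypothetical vectors `g1`, `g2` in \(\mathbb {R}^n\), strong normality by their linear independence, and normality (with the sign condition on the multipliers) by the absence of a nonzero \(\mu \ge 0\) with \(\mu _1 g_1 + \mu _2 g_2 = 0\). All function names are illustrative.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def linearly_independent(g1, g2, tol=1e-12):
    """Analog of strong normality for two active constraints.

    The Gram determinant |g1|^2 |g2|^2 - (g1.g2)^2 is positive
    exactly when g1 and g2 are linearly independent.
    """
    gram = dot(g1, g1) * dot(g2, g2) - dot(g1, g2) ** 2
    return gram > tol


def positively_independent(g1, g2, tol=1e-12):
    """Analog of normality with sign conditions on the multipliers.

    Returns True when mu = (0, 0) is the only mu >= 0 satisfying
    mu1*g1 + mu2*g2 = 0.
    """
    if dot(g1, g1) <= tol or dot(g2, g2) <= tol:
        return False  # a zero vector admits mu = (1, 0) or (0, 1)
    if linearly_independent(g1, g2, tol):
        return True
    # Dependent and nonzero: g2 = c*g1. A nonzero mu >= 0 exists only if c < 0.
    return dot(g1, g2) > 0


# Two identical gradients: dependent, yet no nonzero mu >= 0 annihilates them.
print(linearly_independent((1, 0), (1, 0)))    # strong-normality analog fails
print(positively_independent((1, 0), (1, 0)))  # sign-constrained analog holds
```

In this analog, the pair `(1, 0), (1, 0)` behaves like an arc that is normal in the weaker sense but not strongly normal, whereas the anti-parallel pair `(1, 0), (-2, 0)` fails both tests.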
A fundamental question posed in the literature (see also [9, 11, 12, 17,18,19] for other problems in optimal control) is if, in Theorem 2.2, the assumption of strong normality can be replaced with that of normality relative to \(S_1\), without altering the set \({{\mathcal {R}}}_{S_1}(x_0)\) of critical directions where the second-order conditions hold, that is, the set of all \(y \in Y\) satisfying conditions (a) and (b) of Theorem 2.2. In other words, the question is whether the following theorem is valid or not.
Theorem 2.4
Suppose \(\exists \lambda _1,\ldots ,\lambda _q\) such that
- i. \(\lambda _\alpha \ge 0\) and \(\lambda _\alpha I_\alpha (x_0)=0\) \((\alpha \in R)\).
- ii. If \(J(x):= I(x) + \sum _1^q \lambda _\gamma I_\gamma (x)\), then \(J^\prime (x_0;y)=0\) for all \(y \in Y\).
If \(x_0\) solves (P) and is normal relative to \(S_1 = \{ x \in S : J(x) = I(x) \}\), then \(J^{\prime \prime }(x_0;y) \ge 0\) for all \(y \in Y\) satisfying
- a. \(I^\prime _{\alpha }(x_0;y) \le 0 \ (\alpha \in I_a(x_0),\ \lambda _\alpha =0)\);
- b. \(I^\prime _{\beta }(x_0;y) = 0 \ (\beta \in R \ \hbox {with}\ \lambda _\beta > 0, \ \hbox {or}\ \beta \in Q)\).
We shall pose and solve this question by means of a notion of regularity slightly different from the previous one, defined in terms not of curvilinear but of sequential tangents. Based on the weak norm on X,
let us introduce the notion of tangent cone which corresponds to a generalization of the one given by Hestenes [2, 20] for finite-dimensional spaces (see also [21] for equivalent definitions). In what follows, the letter q should not be confused with the cardinality of \(R \cup Q\).
We shall say that a sequence \(\{x_q\} \subset X\) converges to \(x_0\) in the direction y if y is a unit arc, \(x_q \not = x_0\), and
The tangent cone of S at \(x_0\), which we shall denote by \(T_S(x_0)\), is the cone determined by the unit arcs \(y \in Y\) for which there exists a sequence \(\{x_q\}\) in S converging to \(x_0\) in the direction y.
Note that, equivalently, \(T_S(x_0)\) is the set of all \(y \in Y\) for which there exist sequences \(\{x_q\}\) in S and \(\{\epsilon _q\}\) of positive numbers such that
This follows since, if \(\{x_q\}\) and \(\{\epsilon _q\}\) satisfy (1), then we have
Hence, if \(y \not \equiv 0\), then \(\Vert x_q - x_0\Vert \not = 0\) for large values of q and
Therefore, if y is a unit arc and there exist \(\{x_q\} \subset S\) and \(\{\epsilon _q > 0\}\) satisfying (1), then we can choose \(\epsilon _q = \Vert x_q - x_0\Vert \) in (1).
A fundamental property satisfied by this norm is that, as shown in [2], if \(\{x_q\}\) converges to \(x_0\) in the direction y, then
Similarly, for the second variation,
The first of these relations clearly implies that \(T_S(x_0) \subset {{\mathcal {R}}}_S(x_0)\), and we shall say that \(x_0 \in S\) is a regular arc of S if \(T_S(x_0) = {{\mathcal {R}}}_S(x_0)\). In terms of this notion, we obtain the following first- and second-order necessary conditions.
Theorem 2.5
If \(x_0\) solves (P) and is a regular arc of S, then \(\exists \lambda \in \mathbb {R}^q\) such that \((x_0,\lambda ) \in {{\mathcal {E}}}\).
Theorem 2.6
Suppose \(\exists \lambda \in \mathbb {R}^q\) such that \((x_0,\lambda ) \in {{\mathcal {E}}}\). If \(x_0\) solves (P) and is a regular arc of \(S_1=\{ x \in S : J(x) = I(x) \}\), then \(J^{\prime \prime }(x_0;y) \ge 0\) for all \(y \in {{\mathcal {R}}}_{S_1}(x_0)\).
To prove these results, suppose \(x_0\) solves (P). Clearly, by (2), \(I^\prime (x_0;y) \ge 0\) for all \(y \in T_S(x_0)\). By regularity relative to S, this holds on \({{\mathcal {R}}}_S(x_0)\), and the conclusion of Theorem 2.5 follows by Lemma 2.1. Now, if \(y \in T_{S_1}(x_0)\) is a unit arc and \(\{x_q\} \subset S_1\) a sequence converging to \(x_0\) in the direction y then, since \(J(x) = I(x)\) on \(S_1\), we have \(J(x_q) \ge J(x_0)\), and therefore, by (3), \(J^{\prime \prime }(x_0;y) \ge 0\). The result then follows by regularity relative to \(S_1\).
Note that the only difference between Theorems 2.4 and 2.6 is the assumption imposed on the extremal. For the former, we have normality relative to \(S_1\) while, for the latter, it is regularity relative to \(S_1\). Therefore, Theorem 2.4 will be established, as a simple consequence of Theorem 2.6, if normality (relative to S) implies regularity (relative to S). This result would thus allow us to weaken the usual assumption of strong normality while preserving the conditions on the sets of extremals and critical directions. This is the fundamental question in optimization theory, mentioned at the beginning of the introduction, which we shall now solve.
3 Normality, Regularity and Properness
Recall that an arc \(x_0 \in S\) is called regular relative to S if \({{\mathcal {R}}}_S(x_0) \subset T_S(x_0)\) (implying equality), and normal relative to S if \(\lambda =0\) is the only solution to
- i. \(\lambda _\alpha \ge 0\) and \(\lambda _\alpha I_\alpha (x_0)=0\) \((\alpha \in R)\).
- ii. \(\sum _1^q \lambda _\gamma I^\prime _\gamma (x_0;y)=0\) for all \(y \in Y\).
The purpose of this section is to prove that normality implies regularity. Let us begin with a new concept which, as we shall see below, characterizes normality relative to S.
Definition 3.1
We call \(x_0 \in S\) proper relative to S if
- a. \(\{ I^\prime _\beta (x_0;y) : \beta \in Q \}\) is linearly independent on Y.
- b. \(\exists y \in Y\) such that \(I^\prime _\alpha (x_0;y) < 0\) \((\alpha \in I_a(x_0))\) and \(I^\prime _\beta (x_0;y) = 0\) \((\beta \in Q)\).
To prove that properness and normality are equivalent, we shall make use of the following auxiliary result on linear functionals (see [2, 20]).
Lemma 3.1
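Let \(L_1,\ldots ,L_m\) be linear forms on X, a real vector space, and let
\[ {{\mathcal {R}}} := \{ x \in X : L_\alpha (x) \le 0 \ (\alpha \in A), \ L_\beta (x) = 0 \ (\beta \in B) \} \]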
where \(A=\{1,\ldots ,p\}\) and \(B=\{p+1,\ldots ,m\}\). Suppose that \(L_i(x) = 0\) for all \(x \in {{\mathcal {R}}}\) and \(i \in A \cup B\). Then there exist \(\lambda _\alpha > 0\) \((\alpha \in A)\) and \(\lambda _\beta \in \mathbb {R}\) \((\beta \in B)\) such that \(\sum _1^m \lambda _iL_i(x) = 0\) \((x \in X)\).
Proposition 3.1
Let \(x_0 \in S\). Then \(x_0\) is normal relative to S \(\Leftrightarrow \) \(x_0\) is proper relative to S.
Proof
Without loss of generality, assume that all constraints are active at \(x_0\), that is, \(R = I_a(x_0)\).
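“\(\Leftarrow \)”: Suppose \(x_0\) is a proper arc of S. Let \(\lambda \) satisfy (i) and (ii) above. If \(R = \emptyset \), then \(\lambda = 0\). If \(R \ne \emptyset \), choose a trajectory y satisfying 3.1(b). We have
\[ 0 = \sum _1^q \lambda _\gamma I^\prime _\gamma (x_0;y) = \sum _{\alpha \in R} \lambda _\alpha I^\prime _\alpha (x_0;y) \]
and, since \(\lambda _\alpha \ge 0\) and \(I^\prime _\alpha (x_0;y) < 0\) \((\alpha \in R)\), we have \(\lambda _\alpha = 0\) \((\alpha \in R)\). Thus,
\[ \sum _{\beta \in Q} \lambda _\beta I^\prime _\beta (x_0;y) = 0 \quad \hbox {for all}\ y \in Y \]
and, by 3.1(a), \(\lambda _\beta = 0\) \((\beta \in Q)\).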
“\(\Rightarrow \)”: Suppose \(x_0\) is a normal arc of S. Clearly, 3.1(a) holds. Without loss of generality, \(R \ne \emptyset \). Define
\[ C := \{ \alpha \in R : I^\prime _\alpha (x_0;y) < 0 \ \hbox {for some}\ y \in {{\mathcal {R}}}_S(x_0) \} \]
and let \(D := R \sim C = \{\alpha \in R : \alpha \not \in C\}\), so that \(I^\prime _\gamma (x_0;y) = 0\) for all \(\gamma \in D \cup Q\) and \(y \in {{\mathcal {R}}}_S(x_0)\). Let
and consider their corresponding sets of tangential constraints at \(x_0\):
We claim that \({{\mathcal {R}}}_V(x_0) = {{\mathcal {R}}}_{V_0}(x_0)\). To prove it, consider the following subset of S:
and the tangential constraints at \(x_0\) associated with \({{\tilde{S}}}\):
Clearly \({{\mathcal {R}}}_S(x_0) = {{\mathcal {R}}}_{{\tilde{S}}}(x_0)\). Without loss of generality, \(C \not = \emptyset \) for, otherwise, \({{\mathcal {R}}}_{{\tilde{S}}}(x_0) = {{\mathcal {R}}}_{V_0}(x_0)\) and \({{\mathcal {R}}}_S(x_0) = {{\mathcal {R}}}_V(x_0)\). Select, for each \(\alpha \in C\), a trajectory \(y_\alpha \in {{\mathcal {R}}}_S(x_0)\) such that \(I^\prime _\alpha (x_0;y_\alpha ) < 0\), and set \({{\hat{y}}}:=\sum _{\alpha \in C} y_\alpha \). Note that
Let \(y \not \equiv 0\) in \({{\mathcal {R}}}_V(x_0)\), and let \(\epsilon > 0\) be such that
Note that
and, therefore, \(y + \epsilon {{\hat{y}}}\in {{\mathcal {R}}}_S(x_0) = {{\mathcal {R}}}_{{\tilde{S}}}(x_0)\). We conclude that
and so \(y \in {{\mathcal {R}}}_{V_0}(x_0)\). This proves the claim.
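Now, by Lemma 3.1, \(\exists \lambda _\alpha > 0\) \((\alpha \in D)\) and \(\lambda _\beta \in \mathbb {R}\) \((\beta \in Q)\), such that, for all \(y \in Y\),
\[ \sum _{\alpha \in D} \lambda _\alpha I^\prime _\alpha (x_0;y) + \sum _{\beta \in Q} \lambda _\beta I^\prime _\beta (x_0;y) = 0. \]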
If \(D \not = \emptyset \), we contradict normality. Thus, \(D = \emptyset \) and so, as before, we can find \(y \in Y\) satisfying 3.1(b) by selecting, for each \(\alpha \in R\), a trajectory \(y_\alpha \in {{\mathcal {R}}}_S(x_0)\) such that \(I^\prime _\alpha (x_0;y_\alpha ) < 0\), and setting \(y :=\sum _{\alpha \in R} y_\alpha \). \(\square \)
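Condition 3.1(b) can be made concrete in a finite-dimensional analog. The sketch below is only an illustration under assumptions that are not in the paper: there are no equality constraints, the first variations of two active inequality constraints are replaced by hypothetical vectors `g1`, `g2` in \(\mathbb {R}^n\), and a direction `d` plays the role of the arc y in 3.1(b). The candidate direction used, the negated sum of the normalized vectors, is one convenient choice among many.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def proper_direction(g1, g2, tol=1e-12):
    """Return d with g1.d < 0 and g2.d < 0, or None if none is found.

    With unit vectors u1, u2 and d = -(u1 + u2), one has
    gi.d = -|gi| (1 + cos(theta)), where theta is the angle between
    g1 and g2. This is negative unless the vectors are anti-parallel,
    in which case no such direction exists at all. Pairs that are
    anti-parallel up to the tolerance are also reported as None.
    """
    n1 = math.sqrt(dot(g1, g1))
    n2 = math.sqrt(dot(g2, g2))
    if n1 <= tol or n2 <= tol:
        return None  # a zero vector can never give a strict inequality
    d = tuple(-(a / n1 + b / n2) for a, b in zip(g1, g2))
    if dot(g1, d) < -tol and dot(g2, d) < -tol:
        return d
    return None


# Parallel vectors (dependent, but no nonzero mu >= 0 annihilates them):
print(proper_direction((1, 0), (1, 0)) is not None)
# Anti-parallel vectors (mu = (2, 1) >= 0 annihilates them):
print(proper_direction((1, 0), (-2, 0)) is None)
```

In this analog, a strictly decreasing direction exists exactly when no nonzero \(\mu \ge 0\) satisfies \(\mu _1 g_1 + \mu _2 g_2 = 0\), which mirrors the equivalence stated in Proposition 3.1.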
In the next result, we shall make use of the closedness of the tangent cone. This property can be easily seen as follows. Suppose \(y \not \equiv 0\), \(\{y_q\} \subset T_S(x_0)\) and \(y_q\) converges to y. Since \(y_q/\Vert y_q\Vert \rightarrow y/\Vert y\Vert \), it is sufficient to consider the case in which \(\Vert y\Vert = \Vert y_q\Vert = 1\). For all q, select \(x_q \in S\) such that
We then have
Hence, (1) holds, and so \(y \in T_S(x_0)\), as was to be proved.
Theorem 3.1
If \(x_0\) is a proper arc of S, then it is a regular arc of S.
Proof
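Assume, without loss of generality, that \(R = I_a(x_0)\). By 3.1(b), \(R=C\), that is, \(D = \emptyset \). From the theory for equality constraints (see, for example, [1, 2]), 3.1(a) implies that \(x_0\) is regular relative to
\[ V_0 = \{ x \in X_e : I_\beta (x) = 0 \ (\beta \in Q) \}, \]
that is,
\[ {{\mathcal {R}}}_{V_0}(x_0) = T_{V_0}(x_0). \]
Assume \(R \not =\emptyset \) since, otherwise, \(V_0 = S\). Define
\[ K_S(x_0) := \{ y \in Y : I^\prime _\alpha (x_0;y) < 0 \ (\alpha \in R), \ I^\prime _\beta (x_0;y) = 0 \ (\beta \in Q) \}. \]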
We claim that \(K_S(x_0) \subset T_S(x_0)\). To prove it, let \(y \in K_S(x_0)\) and observe that, since \(y \in {{\mathcal {R}}}_{V_0}(x_0) = T_{V_0}(x_0)\), there exist sequences \(\{x_q\}\) in \(V_0\) and \(\{\epsilon _q > 0\}\) such that (1) holds. Therefore,
Since \(I^\prime _\alpha (x_0;y) < 0\) \((\alpha \in R)\), \(\exists N\) such that, for all \(q \ge N\), \(I_\alpha (x_q) < 0\) \((\alpha \in R)\). Since \(\{x_q\} \subset V_0\), it follows that, for \(q \ge N\), \(x_q \in S\). Hence, \(y \in T_S(x_0)\), and this proves the claim. Finally, let \({{\hat{y}}}\in {{\mathcal {R}}}_S(x_0)\) and let y satisfy 3.1(b), so that \(y \in K_S(x_0)\). Then, for all \(\epsilon > 0\), \({{\hat{y}}}+ \epsilon y \in K_S(x_0) \subset T_S(x_0)\) and, since \(T_S(x_0)\) is closed, \({{\hat{y}}}\in T_S(x_0)\). \(\square \)
We have proved that normality implies regularity. In particular, this implies, as explained before, that Theorem 2.4 is true.
4 An Example and Applications
In this section, we provide a simple example that illustrates one of the main consequences of our result. For this example, the classical theory cannot be applied since, for the extremal \((x_0,\lambda )\) under consideration, \(x_0\) is not strongly normal. Therefore, Theorem 2.2 yields no information. However, by an application of Theorem 2.4, we can conclude that \(x_0\) is not a solution to the problem. This example shows a clear advantage of the main result of the paper over previous ones found in the literature. We end the paper with some comments on possible applications of the results obtained.
Example 4.1
Let \(a = \pi /2\) and consider the problem of minimizing
subject to \(x(0) = 1\), \(x(a) = -1\),
In this case, \(T = [0,a]\), \(n=1\), \(r=2\), \(\xi _0 = 1\), \(\xi _1=-1\),
Consider the function
and note that \(F_{\dot{x}}(t,x,{\dot{x}}) = - {\dot{x}}+ \lambda _2\) and \(F_x(t,x,{\dot{x}}) = x + \lambda _1 + \lambda _2\). Euler’s equation is, therefore, given by \({{\ddot{x}}}(t) + x(t) + \lambda _1 + \lambda _2 = 0\), whose general solution is
\[ x(t) = c_1 \sin t + c_2 \cos t - \lambda _1 - \lambda _2 \quad (t \in T). \]
The constraints \(x(0) = 1\) and \(x(a) = -1\) imply that \(c_1 = \lambda _1 + \lambda _2 - 1\) and \(c_2 = \lambda _1 + \lambda _2 + 1\).
Let us consider the arc
\[ x_0(t) := \cos t - \sin t \quad (t \in T) \]
and let \(\lambda = (\lambda _1,\lambda _2) \equiv (0,0)\). Clearly, \(x_0\) is admissible since it satisfies the endpoint constraints and \(I_1(x_0) = I_2(x_0) = 0\). Moreover, in view of the above argument, \((x_0,\lambda )\) is an extremal with \(c_1=-1\) and \(c_2=1\). Observe now that, for this particular multiplier, we have
Also, by definition, \(x_0\) will be normal relative to \(S = S_1\) if \(\mu _1 = \mu _2 = 0\) is the only solution to
- i. \(\mu _1 \ge 0\), \(\mu _2 \ge 0\);
- ii. \(\mu _1 I^\prime _1(x_0;y) + \mu _2 I^\prime _2(x_0;y) = 0\) for all \(y \in Y\).
That this is indeed the case follows since, for all \(y \in Y\),
and so (i) and (ii) imply \(\mu \equiv 0\). On the other hand, \(x_0\) is not strongly normal since, without imposing condition (i), we have nonnull solutions to (ii) such as \(\mu \equiv (1,-1)\). Therefore, we cannot invoke Theorem 2.2. However, if we define
then \(y \in Y\) with \(I^\prime _1(x_0;y) \le 0\), \(I^\prime _2(x_0;y) \le 0\) and
By Theorem 2.4, the admissible arc \(x_0\), with \((x_0,\lambda )\) an extremal, is not a solution to the problem.
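The conclusion of the example can be checked numerically. The sketch below rests on assumptions that should be read as such, since the displayed formulas of the example are not reproduced in this copy: the integrands are taken to be \(L = (x^2 - {\dot{x}}^2)/2\), \(L_1 = x\) and \(L_2 = x + {\dot{x}}\) (up to additive constants), which is consistent with the stated \(F_x\) and \(F_{\dot{x}}\); the arc is \(x_0(t) = \cos t - \sin t\), consistent with \(c_1 = -1\), \(c_2 = 1\) and \(\lambda = 0\); and the variation \(y(t) = -\sin 2t\) is a hypothetical choice that need not coincide with the one used in the example. With \(\lambda = 0\) we have \(J = I\), so the second variation is the integral of \(y^2 - {\dot{y}}^2\).

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


a = math.pi / 2

# Assumed arc and hypothetical variation (see the lead-in for the assumptions).
x0 = lambda t: math.cos(t) - math.sin(t)
y = lambda t: -math.sin(2 * t)
dy = lambda t: -2 * math.cos(2 * t)

# Integral of x0 over [0, a]; it vanishes, consistent with both constraints being active.
int_x0 = simpson(x0, 0.0, a)

# First variations of the assumed constraints along x0 in the direction y.
dI1 = simpson(y, 0.0, a)                        # integral of y
dI2 = simpson(lambda t: y(t) + dy(t), 0.0, a)   # integral of y + y'

# Second variation of J = I (lambda = 0) for L = (x^2 - x'^2)/2.
d2J = simpson(lambda t: y(t) ** 2 - dy(t) ** 2, 0.0, a)

print(int_x0, dI1, dI2, d2J)
```

Under these assumptions both first variations equal \(-1\), so y satisfies condition (a) of Theorem 2.4, while the second variation equals \(-3\pi /4 < 0\), which is the kind of violation the example relies on.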
Both the theory of necessary conditions for problems in the calculus of variations involving isoperimetric constraints and its applications to real problems have a long history. Perhaps the best known problems of this type include that of determining a closed curve of given length which encloses maximum area, and the shape of a flexible rope of uniform density that hangs at rest with its endpoints fixed. Quoting [22], “the study of the [calculus of variations] problem (and its numerous variants) is over three centuries old, yet its interest has not waned. Its applications are numerous in geometry and differential equations, in mechanics and physics, and in areas as diverse as engineering, medicine, economics, and renewable resources. It is not surprising, then, that modeling and numerical analysis play a large role in the subject today.”
This paper, however, focuses on one of the crucial mathematical issues: second-order necessary conditions for optimality. For the fundamental aspect of applications to real problems, we refer the reader to [22] for problems in elasticity and acoustics, [23,24,25] in economics and management, [26, 27] in physics and engineering, [28] in biology, and [29] in medicine (seeking, for example, the optimal dose to inject to a patient during a therapy).
5 Conclusions
This paper shows how the classical theory of second-order necessary conditions for the isoperimetric problem of Lagrange in the calculus of variations, involving inequality and equality constraints, can be substantially improved. The new assumption, under which the conditions are obtained, deals with the notion of normality relative to a set of constraints which takes into account the sign of the corresponding Lagrange multipliers, instead of the usual set defined only by equality constraints for active indices. The proof provided is based on the relation between the three notions appearing in the title of the paper. It is shown that normality is equivalent to properness which, in turn, implies regularity. It is of interest to see whether these notions, and the conditions obtained, can be generalized to isoperimetric problems in optimal control.
References
1. Becerril, J.A., Rosenblueth, J.F.: Necessity for isoperimetric inequality constraints. Discret. Contin. Dyn. Syst. Ser. A 37, 1129–1158 (2017)
2. Hestenes, M.R.: Calculus of Variations and Optimal Control Theory. Wiley, New York (1966)
3. Arutyunov, A.V.: Second-order conditions in extremal problems: the abnormal points. Trans. Am. Math. Soc. 350, 4341–4365 (1998)
4. Arutyunov, A.V.: Second-order necessary conditions in optimal control problems. Dokl. Math. 61, 158–161 (2000)
5. Arutyunov, A.V.: Smooth abnormal problems in extremum theory and analysis. Russ. Math. Surv. 67, 3–62 (2012)
6. Arutyunov, A.V., Karamzin, D.Y.: Necessary conditions for a weak minimum in an optimal control problem with mixed constraints. Differ. Equ. 41, 1532–1543 (2005)
7. Arutyunov, A.V., Pereira, F.L.: Second-order necessary optimality conditions for problems without a priori normality assumptions. Math. Oper. Res. 31, 1–12 (2006)
8. Arutyunov, A.V., Vereshchagina, Y.S.: On necessary second-order conditions in optimal control problems. Differ. Equ. 38, 1531–1540 (2002)
9. de Pinho, M.R., Rosenblueth, J.F.: Mixed constraints in optimal control: an implicit function theorem approach. IMA J. Math. Control Inf. 24, 197–218 (2007)
10. Gilbert, E.G., Bernstein, D.S.: Second order necessary conditions in optimal control: accessory-problem results without normality conditions. J. Optim. Theory Appl. 41, 75–106 (1983)
11. Loewen, P.D., Zheng, H.: Generalized conjugate arcs in optimal control. In: Proceedings of the 33rd IEEE Conference on Decision and Control, Lake Buena Vista, Florida, Vol. 4, pp. 4004–4008 (1994)
12. Loewen, P.D., Zheng, H.: Generalized conjugate points for optimal control problems. Nonlinear Anal. Theory Methods Appl. 22, 771–791 (1994)
13. Levitin, E., Milyutin, A., Osmolovskiǐ, N.P.: Conditions of high order for a local minimum for problems with constraints. Russ. Math. Surv. 33, 97–168 (1978)
14. Milyutin, A.A., Osmolovskiǐ, N.P.: Calculus of Variations and Optimal Control. Translations of Mathematical Monographs. American Mathematical Society, Providence (1998)
15. Osmolovskiǐ, N.P.: Second order conditions for a weak local minimum in an optimal control problem (necessity, sufficiency). Sov. Math. Dokl. 16, 1480–1484 (1975)
16. Rosenblueth, J.F.: A new notion of conjugacy for isoperimetric problems. Appl. Math. Optim. 50, 209–228 (2004)
17. Rosenblueth, J.F.: Modified critical directions for inequality control constraints. WSEAS Trans. Syst. Control. 10, 215–227 (2015)
18. Rosenblueth, J.F., Sánchez Licea, G.: Cones of critical directions in optimal control. Int. J. Appl. Math. Inform. 7, 55–67 (2013)
19. Cortez del Río, K.L., Rosenblueth, J.F.: A second order constraint qualification for certain classes of optimal control problems. WSEAS Trans. Syst. Control. 11, 419–424 (2016)
20. Hestenes, M.R.: Optimization Theory. The Finite Dimensional Case. Wiley, New York (1975)
21. Giorgi, G., Guerraggio, A., Thierfelder, J.: Mathematics of Optimization: Smooth and Nonsmooth Case. Elsevier, Amsterdam (2004)
22. Clarke, F.: Functional Analysis, Calculus of Variations and Optimal Control. Springer, London (2013)
23. Cullingford, G., Prideaux, J.D.C.A.: A variational study of optimal resource profiles. Manag. Sci. 19, 1067–1081 (1973)
24. Hadley, G., Kemp, M.C.: Variational Methods in Economics. North-Holland, Amsterdam (1971)
25. Kamien, M.I., Schwartz, N.L.: Dynamic Optimization. The Calculus of Variations and Optimal Control in Economics and Management. Elsevier Science Publishing, New York (1991)
26. Troutman, J.L.: Variational Calculus and Optimal Control. Optimization with Elementary Convexity. Springer, New York (1996)
27. Weinstock, R.: Calculus of Variations with Applications to Physics and Engineering. Dover Publications, New York (1974)
28. Papst, I.: A biological application of the calculus of variations. Waterloo Math. Rev. 1, 3–16 (2011)
29. Elmouki, I., Saadi, S.: BCG immunotherapy optimization on an isoperimetric optimal control problem for the treatment of superficial bladder cancer. Int. J. Dyn. Control 4, 339–345 (2016)
Communicated by Dean A. Carlson.
Becerril, J.A., Rosenblueth, J.F. The Importance of Being Normal, Regular and Proper in the Calculus of Variations. J Optim Theory Appl 172, 759–773 (2017). https://doi.org/10.1007/s10957-017-1070-y