4.1 Introduction

The aim of this paper is to obtain optimality conditions for the semivectorial bilevel optimal control problems introduced in [17], where existence results were established.

Semivectorial bilevel optimal control problems are bilevel problems where the upper level corresponds to a scalar optimization problem and the lower level to a multiobjective optimal control problem. Multiobjective optimal control problems arise in many application areas where several conflicting objectives need to be considered. Minimizing several objective functionals leads to solutions such that none of the objective functional values can be improved further without deteriorating another. The set of all such solutions is referred to as the efficient (also called Pareto optimal, noninferior, or nondominated) set of solutions (see, e.g. [38]). The lower level of the semivectorial bilevel optimal control problems can be associated with one player having p objectives or with a “grand coalition” of a p-player “cooperative differential game”, each player having his own objective and control function. We consider situations in which these p players react as “followers” to every decision imposed by a “leader” (who acts at the so-called upper level). Since the best reply correspondence of the followers is in general not uniquely determined, the leader cannot predict the followers' choice simply on the basis of their rational behaviour. Thus, the choice of the best strategy from the leader's point of view depends on how the followers choose a strategy amongst their best responses. In this paper, we will consider two (extreme) possibilities:

  1. The optimistic situation, when for every decision of the leader, the followers will choose a strategy amongst the efficient controls which minimizes the (scalar) objective of the leader; in this case the leader will choose a strategy which minimizes the best he can obtain amongst all the best responses of the followers;

  2. The pessimistic situation, when the followers can choose amongst the efficient controls one which maximizes the (scalar) objective of the leader; in this case the leader will choose a strategy which minimizes the worst he could obtain amongst all the best responses of the followers.

The semivectorial bilevel control problems which model these two situations, and which will be described in the next section, include the following problems, which have been intensively studied in the last decades; we therefore give only a few of the earlier references:

  • Optimizing a scalar-valued function over the efficient set associated to a multiobjective optimization (mathematical programming) problem (introduced in [47] and investigated in [8–13, 25–27, 33, 36, 37]; see [50] for a survey).

  • Optimizing a scalar-valued function over an efficient control set associated to a multiobjective optimal control problem (introduced and investigated in [15], followed by [18]).

  • Semivectorial bilevel static problems (introduced and investigated in [16], followed by [3, 14, 22, 30, 31, 51], for the optimistic case).

  • Stackelberg problems (introduced in [49] and investigated, e.g. in [6, 40, 43]).

  • Bilevel optimization problems (see, e.g. [24, 28, 29, 41, 44, 45] for an extensive bibliography).

  • Stackelberg dynamic problems (introduced in [23, 48] and investigated, e.g. in [5, 6, 42, 45, 46], the last being a book with an extensive bibliography).

In this paper, using scalarization techniques as in [17], we rewrite the optimistic and pessimistic semivectorial bilevel control problems as bilevel problems whose lower level is a scalar optimization problem admitting a unique solution. We are thus able to give optimality conditions for the lower level problem in the general case (supposing that the leader's controls are bounded) using the Pontryagin maximum principle. Theoretically, this allows one to obtain, under suitable conditions, the dependence of the optimal control on the leader's variables. However, this approach is very difficult to apply because one needs to solve a bilocal (two-point boundary value) problem. That is why we consider the particular but important case where the followers' problem is linear-quadratic. In this case we show that, using a resolvent matrix obtained from the data, we can explicitly solve the bilocal problem and express the optimal control and the state as functions of the leader's variables, and we show that these dependencies are continuously differentiable. Finally, we present optimality conditions for the upper levels of the optimistic and pessimistic problems.

4.2 Preliminaries and Problem Statement

All the assumptions and notations considered in this section and introduced in [17] will be kept throughout this paper.

For the leader we denote by \(J_{l}\) the scalar objective, by \(u_{l}\) the control function and by \(\mathcal{U}_{l}\) the set of admissible controls. For the followers we denote by \(\mathbf{J_{f}} = (J_{1},\ldots,J_{p})\) the vector objective (p scalar objectives) and by \(\mathbf{u_{f}} = (u_{1},\ldots,u_{p})\) the control function whose values belong to the set \(\mathbf{U_{f}} = U_{1} \times \cdots \times U_{p} \subseteq {\mathbb{R}}^{m_{f}} = {\mathbb{R}}^{m_{1}} \times \cdots \times {\mathbb{R}}^{m_{p}}\). \(\mathbf{U_{f}}\) is assumed to be nonempty, closed and convex, and \(0 \in \mathbf{U_{f}}\). Real numbers \(t_{0}\), \(T\) are fixed (\(t_{0} < T\)) and represent respectively the initial time and an upper bound of the final time. The set of final time values is \(\mathcal{T} = [\underline{t},\bar{t}\,] \subset ]t_{0},T[\), where \(\underline{t} \leq \bar{t}\). The final time, denoted by \(t_{1} \in \mathcal{T}\), may be variable and is decided by the leader; hence \(t_{1}\) is fixed in the followers' problem. We assume that

$$\displaystyle\begin{array}{rcl} \mathcal{U}_{l}& \subset & L_{2}^{m_{l} }([t_{0},T])\quad \mbox{ is closed, nonempty and convex.}{}\end{array}$$
(4.1)

For each fixed \((t_{1},u_{l}) \in \mathcal{T} \times \mathcal{U}_{l}\), the followers have to solve the following parametric multiobjective control problem, called lower level problem:

$$\displaystyle{(\mathbf{LL})_{(t_{1},u_{l})}\left \{\begin{array}{ll} \mathbf{MIN}_{_{_{_{ (\mathbf{u_{f}},x)}}}}\ \;\mathbf{J_{f}}(t_{1},u_{l},\mathbf{u_{f}},x)\; \\ \mbox{ subject to }(\mathbf{u_{f}},x)\mbox{ verifies (4.2)\textendash (4.5)} \end{array} \right.}$$
$$\displaystyle\begin{array}{rcl} \mathbf{u_{f}}(t)& \in & \mathbf{U_{f}}\mbox{ a.e. on }[t_{0},T],\;\;\mathbf{u_{f}}(t) = 0\mbox{ a.e. on }[t_{1},T],{}\end{array}$$
(4.2)
$$\displaystyle\begin{array}{rcl} \dot{x}(t)& =& A(t)\,x(t) + B_{l}(t)\,u_{l}(t) + \mathbf{B_{f}}(t)\mathbf{u_{f}}(t)\ \mbox{ a.e. on }[t_{0},t_{1}],{}\end{array}$$
(4.3)
$$\displaystyle\begin{array}{rcl} x(t_{0})& =& x_{0},{}\end{array}$$
(4.4)
$$\displaystyle\begin{array}{rcl} x(t_{1})& \in & \mathcal{F},{}\end{array}$$
(4.5)

where \(A: [t_{0},T] \rightarrow {\mathbb{R}}^{n\times n},\;B_{l}: [t_{0},T] \rightarrow {\mathbb{R}}^{n\times m_{l}}\mathrm{and}\;\mathbf{B_{f}}: [t_{ 0},T] \rightarrow {\mathbb{R}}^{n\times m_{f}}\) are continuous matrix-valued functions and the control function \(\mathbf{u_{f}} = (u_{1},\ldots,u_{p}) \in L_{2}^{m_{f}}([t_{ 0},T]) = L_{2}^{m_{1}}([t_{ 0},T]) \times \cdots \times L_{2}^{m_{p}}([t_{ 0},T])\).

\(L_{2}^{m}([t_{0},T])\) stands for the usual Hilbert space of equivalence classes (two functions are equivalent iff they coincide a.e.) of (Lebesgue) measurable functions u from \([t_{0},T]\) to \({\mathbb{R}}^{m}\) such that the function \(t\mapsto {u}^{T}(t)u(t)\) is (Lebesgue) integrable over \([t_{0},T]\), endowed with the norm \(\|u\|_{2}:={ \left (\int _{t_{0}}^{T}{u}^{T}(t)u(t)\mathrm{d}t\right )}^{1/2}\). The target set \(\mathcal{F}\subset {\mathbb{R}}^{n}\) is assumed to be closed, convex and nonempty.

The initial state \(x_{0} \in {\mathbb{R}}^{n}\) is specified.

For each \(u = (t_{1},u_{l},\mathbf{u_{f}}) \in \mathcal{T} \times L_{2}^{m_{l}}([t_{ 0},T]) \times L_{2}^{m_{f}}([t_{ 0},T])\), under the above assumptions, there exists a unique solution (in the sense of Carathéodory) \(x_{u}\) of the Cauchy problem (4.3) and (4.4), and \(x_{u} \in H_{1}^{n}([t_{0},t_{1}])\). \(H_{1}^{n}([t_{0},t_{1}])\) stands for the Hilbert space of absolutely continuous functions from \([t_{0},t_{1}]\) to \({\mathbb{R}}^{n}\) with derivative in \(L_{2}^{n}([t_{0},t_{1}])\), endowed with the norm \(x\mapsto \|x\|:= {(\|\dot{x}\|_{2}^{2} +\| x\|_{2}^{2})}^{1/2}\).

The feasible set \(\mathcal{S}(t_{1},u_{l})\) for the problem (LL)\(_{(t_{1},u_{l})}\) is defined in the following way:

$$\displaystyle{ \mathcal{S}(t_{1},u_{l}) =\{ (\mathbf{u_{f}},x) \in L_{2}^{m_{f} }([t_{0},T])\times H_{1}^{n}([t_{ 0},t_{1}])\vert \;(\mathbf{u_{f}},x)\;\mbox{ verifies relations (4.2)\textendash (4.5)}\}. }$$
(4.6)

Thus, problem \((\mathbf{LL)}_{(t_{1},u_{l})}\) can be written as

$$\displaystyle{(\mathbf{LL})_{(t_{1},u_{l})}\mathbf{MIN}_{_{_{_{ (\mathbf{u_{f}},x)\in \mathcal{S}(t_{1},u_{l})}}}}\ \;\mathbf{J_{f}}(t_{1},u_{l},\mathbf{u_{f}},x).}$$

Next we give the following standard definitions.

Definition 4.1.

For problem (LL)\(_{(t_{1},u_{l})}\) the element \((\mathbf{\bar{u}_{f}},\bar{x}) \in \mathcal{S}(t_{1},u_{l})\) is said to be

  • An efficient (or Pareto) control process if there is no element \((\mathbf{u_{f}},x) \in \mathcal{S}(t_{1},u_{l})\) satisfying

    $$\displaystyle{\forall i \in \{ 1,\ldots,p\}\quad J_{i}(t_{1},u_{l},\mathbf{u_{f}},x) \leq J_{i}(t_{1},u_{l},\mathbf{\bar{u}_{f}},\bar{x})}$$

    and

    $$\displaystyle{\exists i_{0} \in \{ 1,\ldots,p\}\;J_{i_{0}}(t_{1},u_{l},\mathbf{u_{f}},x) < J_{i_{0}}(t_{1},u_{l},\mathbf{\bar{u}_{f}},\bar{x}).}$$
  • A weakly efficient (or weakly Pareto) control process if there is no element \((\mathbf{u_{f}},x) \in \mathcal{S}(t_{1},u_{l})\) satisfying

    $$\displaystyle{\forall i \in \{ 1,\ldots,p\}\quad J_{i}(t_{1},u_{l},\mathbf{u_{f}},x) < J_{i}(t_{1},u_{l},\mathbf{\bar{u}_{f}},\bar{x}).}$$
  • A properly efficient (or properly Pareto) control process (see [34] or [19, 38] for generalizations) if it is an efficient control process and there exists a real number M > 0 such that for every \(i \in \{1,\ldots,p\}\) and every \((\mathbf{u_{f}},x) \in \mathcal{S}(t_{1},u_{l})\) with \(J_{i}(t_{1},u_{l},\mathbf{u_{f}},x) < J_{i}(t_{1},u_{l},\mathbf{\bar{u}_{f}},\bar{x})\), at least one \(k \in \{1,\ldots,p\}\) exists with \(J_{k}(t_{1},u_{l},\mathbf{u_{f}},x) > J_{k}(t_{1},u_{l},\mathbf{\bar{u}_{f}},\bar{x})\) and

    $$\displaystyle{ \frac{J_{i}(t_{1},u_{l},\mathbf{\bar{u}_{f}},\bar{x}) - J_{i}(t_{1},u_{l},\mathbf{u_{f}},x)} {J_{k}(t_{1},u_{l},\mathbf{u_{f}},x) - J_{k}(t_{1},u_{l},\mathbf{\bar{u}_{f}},\bar{x})} \leq M.}$$
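
These notions can be compared mechanically on a finite set of attainable objective vectors. The following sketch (invented data, p = 2; it only illustrates the componentwise comparisons, not the control problem) flags efficient and weakly efficient points; the vector \((1,4 + 10^{-9})\) is weakly efficient but not efficient, which separates the first two notions.

```python
# Illustration of Definition 4.1 on finitely many attainable objective
# vectors (invented data; rows of J are values of the p = 2 objectives).
import numpy as np

J = np.array([[1.0, 4.0],
              [2.0, 2.0],
              [3.0, 1.0],
              [3.0, 3.0],          # dominated by (2, 2) in every component
              [1.0, 4.0 + 1e-9]])  # weakly efficient but not efficient

def efficient(k):
    # no j with J[j] <= J[k] componentwise and J[j] < J[k] somewhere
    return not any((J[j] <= J[k]).all() and (J[j] < J[k]).any()
                   for j in range(len(J)) if j != k)

def weakly_efficient(k):
    # no j with J[j] < J[k] in every component
    return not any((J[j] < J[k]).all() for j in range(len(J)) if j != k)

for k in range(len(J)):
    print(J[k], "efficient:", efficient(k), "weakly efficient:", weakly_efficient(k))
```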

In the sequel the symbol σ ∈ { e, we, pe} stands for “efficient” when σ = e, “weakly efficient” when σ = we and “properly efficient” when σ = pe.

The set of all σ-control processes associated to problem (LL)\(_{(t_{1},u_{l})}\) will be denoted by \(\mathcal{P}_{\sigma }(t_{1},u_{l})\).

Finally we consider the following semivectorial bilevel optimal control problems:

$$\displaystyle{\mbox{ (OSVBC)}_{\sigma }\min _{(t_{1},u_{l})\in \mathcal{T}\times \mathcal{U}_{l}}\min _{(\mathbf{u_{f}},x)\in \mathcal{P}_{\sigma }(t_{1},u_{l})}J_{l}(t_{1},u_{l},\mathbf{u_{f}},x)}$$

called optimistic semivectorial bilevel control problem and

$$\displaystyle{\mbox{ (PSVBC)}_{\sigma }\min _{(t_{1},u_{l})\in \mathcal{T}\times \mathcal{U}_{l}}\sup _{(\mathbf{u_{f}},x)\in \mathcal{P}_{\sigma }(t_{1},u_{l})}J_{l}(t_{1},u_{l},\mathbf{u_{f}},x)}$$

called pessimistic semivectorial bilevel control problem.

Remark 4.2.

Note that the terminal time \(t_{1}\) is fixed for the lower level problem, but it is a decision variable for the leader. Of course, a particular case is obtained when the terminal time \(t_{1}\) is fixed for the leader too, i.e. when \(\mathcal{T} =\{ t_{1}\}\).

Remark 4.3.

(LL)\(_{(t_{1},u_{l})}\) may also be considered as the problem to be solved by the grand coalition of a p-player cooperative differential game (see [35] and its extensive reference list), where the functional \(J_{i}\) and the control \(u_{i}\) represent the payoff and the control of player number \(i\), \(i \in \{1,\ldots,p\}\). Then, our optimistic semivectorial bilevel problem corresponds to a strong Stackelberg problem in which, for any choice of \((t_{1},u_{l})\), the leader can force the followers to choose amongst the σ-control processes one which minimizes the leader's payoff. On the other hand, the pessimistic semivectorial bilevel problem corresponds to a weak Stackelberg problem in which, for any choice of the leader's variables \((t_{1},u_{l})\), the followers could choose amongst the σ-control processes one which is the worst for the leader.

We assume that for all \(t_{1} \in [t_{0},T]\) and all \((u_{l},\mathbf{u_{f}},x) \in L_{2}^{m_{l}}([t_{ 0},T]) \times L_{2}^{m_{f}}([t_{ 0},T]) \times H_{1}^{n}([t_{ 0},t_{1}])\), we have

$$\displaystyle{J_{l}(t_{1},u_{l},\mathbf{u_{f}},x) =\int _{ t_{0}}^{t_{1} }f_{l}(t,u_{l}(t),\mathbf{u_{f}}(t),x(t))\mathrm{d}t,}$$

and also, for all \(i \in \{1,\ldots,p\}\),

$$\displaystyle{J_{i}(t_{1},u_{l},\mathbf{u_{f}},x) =\psi _{i}(x(t_{1})) +\int _{ t_{0}}^{t_{1} }f_{i}(t,u_{l}(t),\mathbf{u_{f}}(t),x(t))\mathrm{d}t,}$$

where, for all \(i \in \{1,\ldots,p\}\), the functions \(\psi _{i}: {\mathbb{R}}^{n} \rightarrow \mathbb{R}\) and \(f_{i},f_{l}: [t_{0},T] \times {\mathbb{R}}^{m_{l}} \times {\mathbb{R}}^{m_{f}} \times {\mathbb{R}}^{n} \rightarrow \mathbb{R}\) verify the following preliminary assumptions:

$$\displaystyle{(\mathcal{P}\mathcal{A})\left \{\begin{array}{@{}l@{}} \bullet \!\!\quad \psi _{i},f_{i},f_{l}\mbox{ are continuously differentiable;} \\ \bullet \!\!\quad \mbox{ there exist integrable functions $a_{i},\,a_{l}: [t_{0},T] \rightarrow \mathbb{R}$ and real numbers} \\ \!\!\quad \;\;\mbox{ $b_{i},b_{l},c_{i},c_{l},d_{i},d_{l}$, such that, for all }\!(t,u_{l},\mathbf{u_{f}},x) \in \! [t_{0},T]\times \,{\mathbb{R}}^{m_{l}}\times \,{\mathbb{R}}^{m_{f}}\times \,{\mathbb{R}}^{n}\!\!, \\ \!\!\quad \;\;f_{i}(t,u_{l},\mathbf{u_{f}},x) \geq a_{i}(t) + b_{i}{x}^{T}x + c_{i}u_{l}^{T}u_{l} + d_{i}{\mathbf{u_{f}}}^{T}\mathbf{u_{f}}, \\ \!\!\quad \;\;f_{l}(t,u_{l},\mathbf{u_{f}},x) \geq a_{l}(t) + b_{l}{x}^{T}x + c_{l}u_{l}^{T}u_{l} + d_{l}{\mathbf{u_{f}}}^{T}\mathbf{u_{f}}; \\ \bullet \!\!\quad \mbox{ $\psi _{i}$ is a convex function;} \\ \bullet \!\!\quad \mbox{ for each fixed $t \in [t_{0},T]$, the function $f_{i}(t,\cdot,\cdot,\cdot )$ is convex} \\ \quad \;\;\mbox{ on ${\mathbb{R}}^{m_{l}} \times {\mathbb{R}}^{m_{f}} \times {\mathbb{R}}^{n}$.} \end{array} \right.}$$

4.3 The Lower Level Problem

Let \(t_{1} \in \mathcal{T}\) be fixed, and let \(\Phi : [t_{0},t_{1}] \times [t_{0},t_{1}] \rightarrow {\mathbb{R}}^{n\times n}\) be the matrix-valued function satisfying, for each \(s \in [t_{0},t_{1}]\),

$$\displaystyle\begin{array}{rcl} \forall t \in [t_{0},t_{1}]\quad \frac{\partial \Phi } {\partial t} (t,s)& =& A(t)\Phi (t,s){}\end{array}$$
(4.7)
$$\displaystyle\begin{array}{rcl} & & \qquad \Phi (s,s) = I_{n}{}\end{array}$$
(4.8)

where \(I_{n}\) is the identity matrix.

Since, for each \((u_{l},\mathbf{u_{f}}) \in L_{2}^{m_{l}}([t_{ 0},T]) \times L_{2}^{m_{f}}([t_{ 0},T])\), the unique solution \(x_{(t_{1},u_{l},\mathbf{u_{f}})} \in H_{1}^{n}([t_{0},t_{1}])\) of the Cauchy problem (4.3) and (4.4) is given by

$$\displaystyle{\forall t \in [t_{0},t_{1}]\quad x_{(t_{1},u_{l},\mathbf{u_{f}})}(t) = \Phi (t,t_{0})x_{0} +\int _{ t_{0}}^{t}\Phi (t,s)(B_{ l}(s)u_{l}(s) + \mathbf{B_{f}}(s)\mathbf{u_{f}}(s))\mathrm{d}s,}$$

it is clear that the map \((u_{l},\mathbf{u_{f}})\mapsto x_{(t_{1},u_{l},\mathbf{u_{f}})}\) is affine from \(L_{2}^{m_{l}}([t_{ 0},T]) \times L_{2}^{m_{f}}([t_{ 0},T])\) to \(H_{1}^{n}([t_{0},t_{1}])\). Moreover, using the Cauchy–Schwarz inequality, we easily obtain that the map \((u_{l},\mathbf{u_{f}})\mapsto x_{(t_{1},u_{l},\mathbf{u_{f}})}\) is also continuous from \(L_{2}^{m_{l}}([t_{ 0},T]) \times L_{2}^{m_{f}}([t_{ 0},T])\) to \(H_{1}^{n}([t_{0},t_{1}])\).
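
As a quick numerical illustration (a sketch with made-up data, not part of the development), one can integrate the matrix Cauchy problem (4.7) and (4.8) for \(\Phi\) and check the variation-of-constants formula above against a direct integration of (4.3) and (4.4):

```python
# Sketch (hypothetical data): compute Phi from (4.7)-(4.8) and verify the
# variation-of-constants formula against direct integration of (4.3)-(4.4).
import numpy as np
from scipy.integrate import solve_ivp

n = 2
t0, t1 = 0.0, 1.0
x0 = np.array([1.0, 0.0])
A  = lambda t: np.array([[0.0, 1.0], [-1.0, -0.1 * t]])
Bl = lambda t: np.array([[0.0], [1.0]])
Bf = lambda t: np.array([[1.0], [0.0]])
ul = lambda t: np.array([np.sin(t)])   # leader's control (fixed here)
uf = lambda t: np.array([np.cos(t)])   # followers' control

def Phi(t, s):
    """Phi(t, s): solution of d/dt Phi = A(t) Phi with Phi(s, s) = I."""
    if t == s:
        return np.eye(n)
    sol = solve_ivp(lambda tau, y: (A(tau) @ y.reshape(n, n)).ravel(),
                    (s, t), np.eye(n).ravel(), rtol=1e-10, atol=1e-12)
    return sol.y[:, -1].reshape(n, n)

# x(t1) from the variation-of-constants formula (trapezoidal quadrature).
ss = np.linspace(t0, t1, 101)
vals = [Phi(t1, s) @ (Bl(s) @ ul(s) + Bf(s) @ uf(s)) for s in ss]
x_formula = Phi(t1, t0) @ x0 + sum(
    0.5 * (ss[k + 1] - ss[k]) * (vals[k] + vals[k + 1]) for k in range(len(ss) - 1))

# x(t1) by direct integration of the state equation.
rhs = lambda t, x: A(t) @ x + Bl(t) @ ul(t) + Bf(t) @ uf(t)
x_direct = solve_ivp(rhs, (t0, t1), x0, rtol=1e-10, atol=1e-12).y[:, -1]
print(x_formula, x_direct)  # the two values agree up to quadrature error
```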

For each \(i = 1,\ldots,p\), consider the functional

$$\displaystyle{ (u_{l},\mathbf{u_{f}})\mapsto \tilde{J}_{i}(t_{1},u_{l},\mathbf{u_{f}}):= J_{i}(t_{1},u_{l},\mathbf{u_{f}},x_{(t_{1},u_{l},\mathbf{u_{f}})}). }$$
(4.9)

Define also

$$\displaystyle{ (u_{l},\mathbf{u_{f}})\mapsto \tilde{J}_{l}(t_{1},u_{l},\mathbf{u_{f}}):= J_{l}(t_{1},u_{l},\mathbf{u_{f}},x_{(t_{1},u_{l},\mathbf{u_{f}})}). }$$
(4.10)

From [17, Lemmas 1 and 2] and the fact that \(x_{(t_{1},\cdot,\cdot )}\) is continuous and affine from \(L_{2}^{m_{l}}([t_{ 0},T]) \times L_{2}^{m_{f}}([t_{ 0},T])\) to \(H_{1}^{n}([t_{0},t_{1}])\), we obtain the following.

Lemma 4.4.

For each i = 1,…,p, the functional   \(\tilde{J}_{i}(t_{1},\cdot,\cdot ): L_{2}^{m_{l}}([t_{ 0},T]) \times L_{2}^{m_{f}}([t_{ 0},T]) \rightarrow \mathbb{R} \cup \{ +\infty \}\) is well defined, lower semicontinuous and convex.

Also \(\tilde{J}_{l}(t_{1},\cdot,\cdot ): L_{2}^{m_{l}}([t_{ 0},T]) \times L_{2}^{m_{f}}([t_{ 0},T]) \rightarrow \mathbb{R} \cup \{ +\infty \}\) is well defined and lower semicontinuous.

For each \((t_{1},u_{l}) \in \mathcal{T} \times \mathcal{U}_{l}\) [see (4.1)], denote

$$\displaystyle\begin{array}{rcl} \mathcal{U}_{f}(t_{1},u_{l})& =& \{\mathbf{u_{f}} \in L_{2}^{m_{f} }([t_{0},T])\vert \;\mathbf{u_{f}}(t) \in \mathbf{U_{f}}\;\mbox{ a.e. on }[t_{0},T], \\ & & \;\;\;\mathbf{u_{f}}(t) = 0\;\mbox{ a.e. on }[t_{1},T],\;\;x_{(t_{1},u_{l},\mathbf{u_{f}})}(t_{1}) \in \mathcal{F}\}.{}\end{array}$$
(4.11)

For each \((t_{1},u_{l}) \in \big(\mathbb{R} \times L_{2}^{m_{l}}([t_{ 0},T])\big) \setminus \big(\mathcal{T} \times \mathcal{U}_{l}\big)\) we put \(\mathcal{U}_{f}(t_{1},u_{l}) = \varnothing \). Thus \(\mathcal{U}_{f}\) is a set-valued function \(\mathcal{U}_{f}: \mathbb{R} \times L_{2}^{m_{l}}([t_{ 0},T]) \rightrightarrows L_{2}^{m_{f}}([t_{ 0},T])\).

Recall that

$$\displaystyle{\mathrm{dom}\;(\mathcal{U}_{f}):=\{ (t_{1},u_{l}) \in \mathbb{R} \times L_{2}^{m_{l} }([t_{0},T])\vert \;\mathcal{U}_{f}(t_{1},u_{l})\neq \varnothing \}}$$

and

$$\displaystyle{\mathrm{Gr}\,(\mathcal{U}_{f}) =\{ (t_{1},u_{l},\mathbf{u_{f}}) \in \mathbb{R} \times L_{2}^{m_{l} }([t_{0},T]) \times L_{2}^{m_{f} }([t_{0},T])\vert \;\mathbf{u_{f}} \in \mathcal{U}_{f}(t_{1},u_{l})\}.}$$

We will assume in the sequel that

\((\mathcal{H})\quad \mathrm{dom}\;(\mathcal{U}_{f}) = \mathcal{T} \times \mathcal{U}_{l}.\)

Proposition 4.5.

Each of the following is a sufficient condition for \((\mathcal{H})\):

  (a)

    \(\mathcal{F} = {\mathbb{R}}^{n}\) .

  (b)

    For each \(t_{1} \in \mathcal{T}\) , the linear system

    $$\displaystyle{\dot{x}(t) = A(t)x(t) + \mathbf{B_{f}}(t)\mathbf{u_{f}}(t),\quad x(t_{0}) = 0,\;\;\mathbf{u_{f}}(t) \in \mathbf{U_{f}}\ \mathrm{a.e.\ on}\ [t_{0},t_{1}]}$$

    is controllable, i.e. for any \(x_{1} \in {\mathbb{R}}^{n}\), there exists \(\mathbf{u_{f}} \in L_{2}^{m_{f}}([t_{ 0},t_{1}])\) such that \(\mathbf{u_{f}}(t) \in \mathbf{U_{f}}\) a.e. on \([t_{0},t_{1}]\), and the corresponding solution verifies \(x(t_{1}) = x_{1}\).

Proof.

It is easy to adapt the proof given in [17, Proposition 1], where the initial condition is \(x(t_{0}) = x_{0}\) (instead of \(x(t_{0}) = 0\) as above). ■ 

It can be easily proved that \(\mathcal{U}_{f}(t_{1},u_{l})\) is a convex subset of \(L_{2}^{m_{f}}([t_{ 0},T])\). Thus the problem (LL)\(_{(t_{1},u_{l})}\) can be rewritten as a p-objective convex optimization problem:

$$\displaystyle{(\mathbf{M})_{(t_{1},u_{l})}\left \{\begin{array}{ll} \mathbf{MIN}_{_{_{_{\mathbf{ u_{f}}}}}}\;\;\;(\tilde{J}_{1}(t_{1},u_{l},\mathbf{u_{f}}),\ldots,\tilde{J}_{p}(t_{1},u_{l},\mathbf{u_{f}})) \\ \mbox{ subject to }\;\;\mathbf{u_{f}} \in \mathcal{U}_{f}(t_{1},u_{l}). \end{array} \right.}$$

Definition 4.6.

Let σ ∈ { e, we, pe}. An element \(\mathbf{u_{f}} \in L_{2}^{m_{f}}([t_{ 0},T])\) will be called a σ-control of problem (M)\(_{(t_{1},u_{l})}\) iff \((\mathbf{u_{f}},x_{(t_{1},u_{l},\mathbf{u_{f}})})\) is a σ-control process of problem (LL)\(_{(t_{1},u_{l})}\). We will denote by \(\mathcal{E}_{\sigma }(t_{1},u_{l})\) the set of all σ-controls of the p-objective optimization problem \((\mathbf{M})_{(t_{1},u_{l})}\).

Thus, using Lemma 4.4 and the well-known scalarization results from vector optimization [38, p. 302] we obtain the following.

Theorem 4.7 (see [17]). 

 Let \((t_{1},u_{l}) \in \mathcal{T} \times \mathcal{U}_{l}\) and \(\mathbf{\hat{u}_{f}} \in \mathcal{U}_{f}(t_{1},u_{l})\) , where \(\mathcal{U}_{l}\) and \(\mathcal{U}_{f}\) are given in (4.1) and (4.11) , respectively. The control process \((\mathbf{\hat{u}_{f}},x_{(t_{1},u_{l},\mathbf{\hat{u}_{f}})})\) is weakly (resp. properly) efficient for problem ( LL ) \(_{(t_{1},u_{l})}\) if and only if there exist nonnegative (resp. positive) real numbers \(\theta _{1},\ldots,\theta _{p}\) with \(\sum _{i=1}^{p}\theta _{i} = 1\) such that \(\mathbf{\hat{u}_{f}}\) is an optimal control for the classical scalar optimal control problem:

$$\displaystyle{\mathrm{(S)}_{(\theta _{1},\ldots,\theta _{p},t_{1},u_{l})}\left \{\begin{array}{ll} \min _{\mathbf{u_{f}}}\,\sum _{i=1}^{p}\theta _{ i}\tilde{J}_{i}(t_{1},u_{l},\mathbf{u_{f}}) \\ \mathrm{subject\ to}\;\mathbf{u_{f}} \in \mathcal{U}_{f}(t_{1},u_{l}). \end{array} \right.}$$
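
The scalarization mechanism of Theorem 4.7 is easy to visualize on a finite-dimensional analogue. The sketch below (a static toy problem with invented data, not the control problem itself) sweeps the weights over the simplex and minimizes the weighted sum; each weight vector produces one (properly) efficient point:

```python
# Finite-dimensional analogue of Theorem 4.7 (illustrative, invented data):
# minimizing theta_1*J1 + theta_2*J2 while sweeping theta over the simplex
# traces out efficient points of a strictly convex bi-objective problem.
import numpy as np
from scipy.optimize import minimize

J1 = lambda u: (u[0] - 1.0) ** 2 + u[1] ** 2
J2 = lambda u: u[0] ** 2 + (u[1] - 2.0) ** 2

for th1 in np.linspace(0.1, 0.9, 5):          # theta in ]0,1[^2 with sum 1
    th2 = 1.0 - th1
    u_star = minimize(lambda u: th1 * J1(u) + th2 * J2(u), np.zeros(2)).x
    print(f"theta=({th1:.1f},{th2:.1f})  u*={np.round(u_star, 3)}  "
          f"(J1,J2)=({J1(u_star):.3f},{J2(u_star):.3f})")
```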

In the sequel we need the following sets:

$$\displaystyle{ \Theta _{\sigma } = \left \{\begin{array}{ll} \{(\theta _{1},\ldots,\theta _{p}) \in ]0,1{[}^{p}\vert \sum _{i=1}^{p}\theta _{i} = 1\}\;\;\;\;\mbox{ if }\;\sigma = pe\\ \\ \{(\theta _{1},\ldots,\theta _{p}) \in {[0,1]}^{p}\vert \sum _{i=1}^{p}\theta _{i} = 1\}\;\;\;\;\mbox{ if }\;\sigma = we \end{array} \right. }$$
(4.12)

and the following hypotheses:

$$\displaystyle{H_{\sigma }(t_{1}):\; \left \{\begin{array}{ll} (\exists i \in \{ 1,\ldots,p\})\;\big(\forall (t,v,x) \in [t_{0},t_{1}] \times {\mathbb{R}}^{m_{l}} \times {\mathbb{R}}^{n}\big) \\ \mathbf{u_{f}}\mapsto f_{i}(t,v,\mathbf{u_{f}},x)\;\mbox{ is strictly convex on }{\mathbb{R}}^{m_{f}}\qquad \qquad \mbox{ if }\sigma = pe\\ \\ (\forall i \in \{ 1,\ldots,p\})\;\big(\forall (t,v,x) \in [t_{0},t_{1}] \times {\mathbb{R}}^{m_{l}} \times {\mathbb{R}}^{n}\big) \\ \mathbf{u_{f}}\mapsto f_{i}(t,v,\mathbf{u_{f}},x)\;\mbox{ is strictly convex on }{\mathbb{R}}^{m_{f}}\qquad \qquad \mbox{ if }\sigma = we \end{array} \right.}$$

and

$$\displaystyle{(Hc)_{\sigma }:\ \left \{\begin{array}{@{}l@{\quad }l@{}} \forall i \in \{ 1,\ldots,p\}:\;\psi _{i} \geq 0,\,b_{i} = c_{i} = 0,\;d_{i} \geq 0,\;\sum _{j=1}^{p}d_{j} > 0\quad &\mbox{ if }\sigma = pe \\ \forall i \in \{ 1,\ldots,p\}:\;\psi _{i} \geq 0,\,b_{i} = c_{i} = 0,\;d_{i} > 0 \quad &\mbox{ if }\sigma = we, \end{array} \right.}$$

where \(b_{i},c_{i},d_{i}\) have been introduced in the preliminary assumptions \((\mathcal{P}\mathcal{A})\).

Theorem 4.8 (see [17]). 

Let \(\sigma \in \{ we,pe\}\) and \((t_{1},u_{l}) \in \mathcal{T} \times \mathcal{U}_{l}\) . Assume that \(H_{\sigma }(t_{1})\) holds. Moreover, suppose that at least one of the following hypotheses holds:

  (i)

      \(\mathbf{U_{f}}\) is bounded.

  (ii)

    \((Hc)_{\sigma }\).

Then, for each \(\mathbf{\theta } = (\theta _{1},\ldots,\theta _{p}) \in \Theta _{\sigma }\) , there exists a unique optimal control \(\mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot ) \in \mathcal{U}_{f}(t_{1},u_{l})\) of the scalar problem (S) \(_{(\theta,t_{1},u_{l})}\) .

It is obvious that, according to Theorem 4.7, \(\mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )\) is a σ-control for the multiobjective problem (M)\(_{(t_{1},u_{l})}\). Moreover, Theorem 4.7 also implies that for each σ-control \(\mathbf{u_{f}} \in \mathcal{U}_{f}(t_{1},u_{l})\) of the multiobjective problem (M)\(_{(t_{1},u_{l})}\), there exists \(\theta \in \Theta _{\sigma }\) such that \(\mathbf{u_{f}}\) is the unique optimal control of the scalar problem \((S)_{(\theta,t_{1},u_{l})}\).

Thus we can state the following.

Corollary 4.9.

Let \((t_{1},u_{l}) \in \mathcal{T} \times \mathcal{U}_{l}\) . Under the hypotheses of Theorem  4.8 we have that the correspondence \(\theta \mapsto \mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )\) is a surjection from Θ σ to the set \(\mathcal{E}_{\sigma }(t_{1},u_{l})\) .

In the sequel we will keep all the hypotheses of Theorem 4.8 in addition to the preliminary assumptions \((\mathcal{P}\mathcal{A})\).

4.4 Equivalent Formulations of Problems (OSVBC) σ and (PSVBC) σ

Consider, for each \((\theta,t_{1},u_{l}) \in \Theta _{\sigma } \times \mathcal{T} \times \mathcal{U}_{l} \subset {\mathbb{R}}^{p} \times \mathbb{R} \times L_{2}^{m_{l}}([t_{ 0},T])\), the function \(F(\theta,t_{1},u_{l},\cdot ): \mathcal{U}_{f}(t_{1},u_{l}) \rightarrow \mathbb{R}\) defined by

$$\displaystyle{\forall \mathbf{u_{f}} \in \mathcal{U}_{f}(t_{1},u_{l})\quad \quad \quad F(\theta,t_{1},u_{l},\mathbf{u_{f}}):=\sum _{ i=1}^{p}\theta _{ i}\tilde{J}_{i}(t_{1},u_{l},\mathbf{u_{f}}),}$$

where \(\mathcal{U}_{f}(t_{1},u_{l})\) and \(\tilde{J}_{i}\) are given respectively in (4.11) and (4.9).

Note that problem (OSVBC) σ can be written equivalently as an optimistic semivectorial bilevel optimization problem:

$$\displaystyle{(\mathrm{OSVB})_{\sigma }\min _{(t_{1},u_{l})\in \mathcal{T}\times \mathcal{U}_{l}}\min _{\mathbf{u_{f}}\in \mathcal{E}_{\sigma }(t_{1},u_{l})}\tilde{J}_{l}(t_{1},u_{l},\mathbf{u_{f}}).}$$

According to Theorem 4.8, for each \((\theta,t_{1},u_{l}) \in \Theta _{\sigma } \times \mathcal{T} \times \mathcal{U}_{l}\), there exists a unique minimizer \(\mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot ) \in \mathcal{U}_{f}(t_{1},u_{l})\) of F(θ, t 1, u l ,  ⋅) over \(\mathcal{U}_{f}(t_{1},u_{l})\). According to Corollary 4.9, for each \((t_{1},u_{l}) \in \mathcal{T} \times \mathcal{U}_{l}\), we have

$$\displaystyle{ \mathcal{E}_{\sigma }(t_{1},u_{l}) =\bigcup _{\theta \in \Theta _{\sigma }}\{\mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )\}. }$$
(4.13)

Then we obviously have the following.

Proposition 4.10 (see [17]). 

Problem (OSVB)σ is equivalent to the problem

$$\displaystyle{\min _{(t_{1},u_{l})\in \mathcal{T}\times \mathcal{U}_{l}}\min _{\theta \in \Theta _{\sigma }}\tilde{J}_{l}(t_{1},u_{l},\mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )).}$$

Thus, the optimistic semivectorial problem (OSVB) σ can be rewritten as an optimistic bilevel optimization problem (also called strong Stackelberg problem):

$$\displaystyle{ \mbox{ (OB)}_{\sigma }\quad \left \{\begin{array}{ll} \min _{(t_{1},u_{l})\in \mathcal{T}\times \mathcal{U}_{l}}\min _{\theta \in \Theta _{\sigma }}\tilde{J}_{l}(t_{1},u_{l},\mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot ))\\ \\ \mbox{ where }\mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )\mbox{ is the unique minimizer to the problem }\\ \\ \mbox{ (S)}_{(\theta,t_{1},u_{l})}: \quad \quad \min _{\mathbf{u_{f}}\in \mathcal{U}_{f}(t_{1},u_{l})}F(\theta,t_{1},u_{l},\mathbf{u_{f}}). \end{array} \right. }$$

Here the upper and lower levels are given by scalar optimization problems and the lower level admits a unique solution.

In the same way the pessimistic semivectorial problem can be rewritten as a pessimistic bilevel optimization problem (leading to a so-called weak Stackelberg problem; see [20] where this terminology was introduced).

Proposition 4.11 (see [17]). 

Problem (PSVBC)σ is equivalent to the problem

$$\displaystyle{\min _{(t_{1},u_{l})\in \mathcal{T}\times \mathcal{U}_{l}}\sup _{\theta \in \Theta _{\sigma }}\tilde{J}_{l}(t_{1},u_{l},\mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )).}$$

Finally, we can rewrite that problem as

$$\displaystyle{\mbox{ (PB)}_{\sigma }\quad \left \{\begin{array}{ll} \min _{(t_{1},u_{l})\in \mathcal{T}\times \mathcal{U}_{l}}\sup _{\theta \in \Theta _{\sigma }}\tilde{J}_{l}(t_{1},u_{l},\mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )) \\ \mbox{ where $\mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )$ is the unique minimizer of the problem }\\ \\ (S)_{(\theta,t_{1},u_{l})}: \quad \min _{\mathbf{u_{f}}\in \mathcal{U}_{f}(t_{1},u_{l})}F(\theta,t_{1},u_{l},\mathbf{u_{f}}). \end{array} \right.}$$

4.5 Necessary and Sufficient Conditions for the Scalarized Lower Level Problem

Let \((t_{1},u_{l}) \in \mathcal{T} \times \mathcal{U}_{l}\) and \(\theta = (\theta _{1},\ldots,\theta _{p}) \in \Theta _{\sigma }\) be given. The scalarized problem (S)\(_{(\theta,t_{1},u_{l})}\) can be written as

$$\displaystyle\begin{array}{rcl} \min _{(\mathbf{u_{f}},x)\in L_{2}^{m_{f}}([t_{0},T])\times H_{1}^{n}([t_{0},t_{1}])}\left [\sum _{i=1}^{p}\right.& & \left.\theta _{ i}\psi _{i}(x(t_{1})) +\int _{ t_{0}}^{t_{1} }\left (\sum _{i=1}^{p}\theta _{ i}f_{i}(t,u_{l}(t),\mathbf{u_{f}}(t),x(t))\right )\mathrm{d}t\right ] {}\\ \mbox{ s.t.}\quad \quad \qquad \mathbf{u_{f}}(t)& \in & \mathbf{U_{f}}\mbox{ a.e. on }[t_{0},T],\;\;\;\mathbf{u_{f}}(t) = 0\mbox{ a.e. on }[t_{1},T], {}\\ \dot{x}(t)& =& A(t)\,x(t) + B_{l}(t)\,u_{l}(t) + \mathbf{B_{f}}(t)\mathbf{u_{f}}(t)\ \mbox{ a.e. on }[t_{0},t_{1}] {}\\ x(t_{0})& =& x_{0} {}\\ x(t_{1})& \in & \mathcal{F}. {}\\ \end{array}$$

Let \(H: [t_{0},t_{1}] \times {\mathbb{R}}^{m_{l}} \times {\mathbb{R}}^{m_{f}} \times {\mathbb{R}}^{n} \times \mathbb{R} \times {\mathbb{R}}^{n} \rightarrow \mathbb{R}\) be the Hamilton-Pontryagin function associated to this control problem (see, e.g. [2] or [39]) defined by

$$\displaystyle{H(t,u_{l},\mathbf{u_{f}},x,\lambda _{0},\lambda ) {=\lambda }^{T}\Big(A(t)x + B_{ l}(t)u_{l} + \mathbf{B_{f}}(t)\mathbf{u_{f}}\Big) -\lambda _{0}\sum _{i=1}^{p}\theta _{ i}f_{i}(t,u_{l},\mathbf{u_{f}},x).}$$

Let \(\lambda (\cdot ) = (\lambda _{1}(\cdot ),\ldots,\lambda _{n}(\cdot )) \in W_{1,\infty }^{n}([t_{0},t_{1}])\) be the adjoint function, where \(W_{1,\infty }^{n}([t_{0},t_{1}])\) is the Banach space of absolutely continuous functions from [t 0, t 1] to \({\mathbb{R}}^{n}\) having derivative in the Banach space \(L_{\infty }^{n}([t_{0},t_{1}])\) of essentially bounded measurable functions (see, e.g. [21] for details).

Since we use \(L_{2}\) controls, whereas the Pontryagin maximum principle usually applies to controls in \(L_{\infty }\), we will consider two particular situations in which we are able to obtain necessary and sufficient conditions for problem (S)\(_{(\theta,t_{1},u_{l})}\), as stated below.

4.5.1 The Case When \(\mathbf{U_{f}}\) Is Bounded and \(\mathcal{U}_{l} \subset L_{\infty }^{m_{l}}([t_{ 0},T]) \cap L_{2}^{m_{l}}([t_{ 0},T])\)

In this subsection we assume that the set \(\mathbf{U_{f}}\) is bounded (and closed, convex, with nonempty interior) and that the leader's controls are essentially bounded, i.e. \(\mathcal{U}_{l} \subset L_{\infty }^{m_{l}}([t_{ 0},T]) \cap L_{2}^{m_{l}}([t_{ 0},T])\). Also, suppose that the target set is \(\mathcal{F} =\{ x \in {\mathbb{R}}^{n}\vert \,Gx = a\}\), where the matrix \(G \in {\mathbb{R}}^{k\times n}\) and the vector \(a \in {\mathbb{R}}^{k}\) are given. Moreover, we assume that rank(G) = k > 0. However, the results presented in this subsection are also valid when \(\mathcal{F} = {\mathbb{R}}^{n}\) by taking G = 0, a = 0.

We obtain the following.

Theorem 4.12 (Necessary conditions). 

Let \((\mathbf{u_{f}}_{{\ast}},x_{{\ast}}) \in L_{2}^{m_{f}}([t_{ 0},T]) \times H_{1}^{n}([t_{ 0},t_{1}])\) be an optimal control process for problem (S) \(_{(\theta,t_{1},u_{l})}\) . Then there exist \(\lambda (\cdot ) \in W_{1,\infty }^{n}([t_{0},t_{1}])\) , a nonnegative real number \(\lambda _{0}\) and a vector \(v \in {\mathbb{R}}^{k}\) with \((\lambda (\cdot ),\lambda _{0},v)\neq 0\) such that

$$ \displaystyle\begin{array}{rcl} \dot{{\lambda }}^{T}(t)& =& {-\lambda }^{T}(t)\,A(t) +\lambda _{ 0}\sum _{i=1}^{p}\theta _{ i}\,\frac{\partial f_{i}} {\partial x} (t,u_{l}(t),\mathbf{u_{f}}_{{\ast}}(t),x_{{\ast}}(t))\,,\ \mathrm{a.e.\ on}\ [t_{0},t_{1}]{}\end{array}$$
(4.14)
$$ \displaystyle\begin{array}{rcl} {\lambda }^{T}(t_{ 1})& =& -\lambda _{0}\sum _{i=1}^{p}\theta _{ i}\, \frac{\partial \psi _{i}} {\partial x}(x_{{\ast}}(t_{1})) + {v}^{T}G\,,{}\end{array}$$
(4.15)

and, for almost all t ∈ [t 0 ,t 1 ],

$$\displaystyle{ H(t,u_{l}(t),\mathbf{u_{f}}_{{\ast}}(t),x_{{\ast}}(t),\lambda _{0},\lambda (t)) =\max _{\mathbf{v_{f}}\in \mathbf{U_{f}}}H(t,u_{l}(t),\mathbf{v_{f}},x_{{\ast}}(t),\lambda _{0},\lambda (t)). }$$
(4.16)

Moreover, if the linearized system

$$\displaystyle\begin{array}{rcl} \dot{x}(t)& =& A(t)\,x(t) + \mathbf{B_{f}}(t)\mathbf{u_{f}}(t)\ \ \mathrm{a.e.\ on}\ [t_{0},t_{1}]{}\end{array}$$
(4.17)
$$\displaystyle\begin{array}{rcl} x(t_{0})& =& 0{}\end{array}$$
(4.18)

is controllable, then we can take \(\lambda _{0} = 1\) above.

Sufficient conditions. Let \((x_{{\ast}},\mathbf{u_{f}}_{{\ast}}) \in H_{1}^{n}([t_{0},t_{1}]) \times L_{2}^{m_{f}}([t_{ 0},T])\) verify (4.2)–(4.5) . If there exist \(\lambda (\cdot ) \in W_{1,\infty }^{n}([t_{0},t_{1}])\) and \(v \in {\mathbb{R}}^{k}\) such that (4.14)–(4.16) are verified with \(\lambda _{0} = 1\), then \((x_{{\ast}},\mathbf{u_{f}}_{{\ast}})\) is an optimal control process for problem (S) \(_{(\theta,t_{1},u_{l})}\) .

Proof.

Since \(\mathbf{U_{f}}\) is bounded, \(\{\mathbf{u_{f}}(\cdot ) \in L_{2}^{m_{f}}([t_{ 0},T])\vert \,\mathbf{u_{f}}(t) \in \mathbf{U_{f}}\} \subset L_{\infty }^{m_{f}}([t_{ 0},T])\). Likewise, by assumption \(u_{l}(\cdot ) \in L_{\infty }^{m_{l}}([t_{ 0},t_{1}])\). Thus we have \(\mathbf{u_{f}}_{{\ast}}\in L_{\infty }^{m_{f}}([t_{ 0},T])\); hence \(x_{{\ast}}\in W_{1,\infty }^{n}([t_{0},t_{1}])\) and \(\lambda (\cdot ) \in W_{1,\infty }^{n}([t_{0},t_{1}])\). Therefore we can apply [39, Theorem 5.19] to obtain the first part (necessary conditions). Note that [39, Theorem 5.19] is stated for autonomous systems, but the same proof applies to non-autonomous systems.

For the second part (sufficient conditions) we can use [39, Theorem 5.22], which also holds for non-autonomous systems with the same proof. ■ 

Remark 4.13.

Since \(\mathbf{U_{f}}\) is convex and closed and H is concave w.r.t. \(\mathbf{u_{f}}\), relation (4.16) can equivalently be written as a variational inequality:

$$\displaystyle{\begin{array}{ll} \forall \mathbf{v_{f}} \in \mathbf{U_{f}}\quad \Big{(\lambda }^{T}(t)\mathbf{B_{f}}(t) -\lambda _{0}\sum _{i=1}^{p}\theta _{i}\frac{\partial f_{i}} {\partial \mathbf{u_{f}}}(t,u_{l}(t),\mathbf{u_{f}}_{{\ast}}(t),x_{{\ast}}(t))\Big)(\mathbf{v_{f}} -\mathbf{u_{f}}_{{\ast}}(t)) \leq 0& \\ \mbox{ a.e. on }[t_{0},t_{1}]. & \end{array} }$$
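
A static toy analogue of this equivalence (invented data) can be checked numerically: for a concave quadratic maximized over a convex box, the maximizer satisfies the variational inequality at every point of the box.

```python
# Toy static analogue of Remark 4.13 (invented data): for a concave quadratic
# H over a convex box, the maximizer u* satisfies grad H(u*)^T (v - u*) <= 0
# for all v in the box.
import numpy as np
from scipy.optimize import minimize

D = np.array([[2.0, 0.3], [0.3, 1.0]])   # positive definite, so H is concave
c = np.array([3.0, -1.0])
H     = lambda u: c @ u - 0.5 * u @ D @ u
gradH = lambda u: c - D @ u
bounds = [(-1.0, 1.0), (-1.0, 1.0)]      # the convex set playing the role of U_f

u_star = minimize(lambda u: -H(u), np.zeros(2), bounds=bounds).x
vs = np.random.default_rng(0).uniform(-1.0, 1.0, size=(1000, 2))  # samples v
print(max(gradH(u_star) @ (v - u_star) for v in vs) <= 1e-6)      # True
```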

Finally, we can conclude the following.

Corollary 4.14.

Let \((t_{1},u_{l}) \in \mathcal{T} \times \mathcal{U}_{l}\) , and let θ ∈ Θ σ . Assume that the linearized system (4.17) and (4.18) is controllable. Let \(\mathbf{u_{f}} \in L_{2}^{m_{f}}([t_{ 0},T])\) . Then \(\mathbf{u_{f}}(\cdot ) = \mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )\) (i.e. \(\mathbf{u_{f}}\) is the unique optimal control for problem (S) \(_{(\theta,t_{1},u_{l})}\) presented in Theorem  4.8 ) if, and only if, there exists \(\big(x(\cdot ),\lambda (\cdot ),v\big) \in H_{1}^{n}([t_{0},t_{1}]) \times W_{1,\infty }^{n}([t_{0},t_{1}]) \times {\mathbb{R}}^{k}\) such that

$$\displaystyle\begin{array}{rcl} \mathbf{u_{f}}(t)& \in & \mathbf{U_{f}}\ \mathrm{a.e.\ on}\ [t_{0},T],\;\;\;\mathbf{u_{f}}(t) = 0\ \mathrm{a.e.\ on}\ [t_{1},T],{}\end{array}$$
(4.19)
$$\displaystyle\begin{array}{rcl} \dot{x}(t)& =& A(t)\,x(t) + B_{l}(t)\,u_{l}(t) + \mathbf{B_{f}}(t)\mathbf{u_{f}}(t)\ \ \mathrm{a.e.\ on}\ [t_{0},t_{1}],{}\end{array}$$
(4.20)
$$\displaystyle\begin{array}{rcl} x(t_{0})& =& x_{0},{}\end{array}$$
(4.21)
$$\displaystyle\begin{array}{rcl} Gx(t_{1})& =& a,{}\end{array}$$
(4.22)
$$ \displaystyle\begin{array}{rcl} \dot{{\lambda }}^{T}(t)& =& {-\lambda }^{T}(t)\,A(t) +\sum _{ i=1}^{p}\theta _{ i}\,\frac{\partial f_{i}} {\partial x} (t,u_{l}(t),\mathbf{u_{f}}(t),x(t))\,\ \mathrm{a.e.\ on}\ [t_{0},t_{1}],\quad {}\end{array}$$
(4.23)
$$ \displaystyle\begin{array}{rcl} {\lambda }^{T}(t_{ 1})& =& -\sum _{i=1}^{p}\theta _{ i}\, \frac{\partial \psi _{i}} {\partial x}(x(t_{1})) + {v}^{T}G\,,{}\end{array}$$
(4.24)

and, for almost all t ∈ [t 0 ,t 1 ],

$$\displaystyle{ \forall \mathbf{v_{f}} \in \mathbf{U_{f}}\quad \quad \Big{(\lambda }^{T}(t)\mathbf{B_{ f}}(t) -\sum _{i=1}^{p}\theta _{ i} \frac{\partial f_{i}} {\partial \mathbf{u_{f}}}(t,u_{l}(t),\mathbf{u_{f}}(t),x(t))\Big)(\mathbf{v_{f}} -\mathbf{u_{f}}(t)) \leq 0. }$$
(4.25)

4.5.2 The Case \(\mathbf{U_{f}} = {\mathbb{R}}^{m_{f}}\): The Followers' Problem Is Linear-Quadratic; Explicit Expressions of \(\mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )\) and \(x_{(t_{1},u_{l},\mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot ))}\)

In this subsection we consider the case when \(\mathbf{U_{f}} = {\mathbb{R}}^{m_{f}}\), \(\mathcal{U}_{l}\) is an arbitrary closed, convex set with nonempty interior in \(L_{2}^{m_{l}}([t_{ 0},T])\) and the endpoint is free, i.e. the target set is \(\mathcal{F} = {\mathbb{R}}^{n}\). The objectives of the followers are quadratic, i.e. for \(i = 1,\ldots,p\) and \((t,u_{l},\mathbf{u_{f}},x) \in [t_{0},T] \times {\mathbb{R}}^{m_{l}} \times {\mathbb{R}}^{m_{f}} \times {\mathbb{R}}^{n}\)

$$\displaystyle{f_{i}(t,u_{l},\mathbf{u_{f}},x) = {x}^{T}Q_{ i}(t)x +{ \mathbf{u_{f}}}^{T}R_{ i}(t)\mathbf{u_{f}},}$$

where \(Q_{i}(\cdot ): [t_{0},T] \rightarrow {\mathbb{R}}^{n\times n}\) and \(R_{i}(\cdot ): [t_{0},T] \rightarrow {\mathbb{R}}^{m_{f}\times m_{f}}\) are continuous positive semidefinite matrix-valued functions.

Also

$$\displaystyle{\psi _{i}(x) = {x}^{T}Q_{ i}^{f}x,}$$

where \(Q_{i}^{f}\) is a symmetric positive semidefinite matrix.

Moreover we make the following assumption:

$$\displaystyle{(\mathrm{HLQP})_{\sigma }: \quad \left \{\begin{array}{@{}l@{\quad }l@{}} \forall (i,t) \in \{ 1,\ldots,p\} \times [t_{0},T]\quad R_{i}(t) > 0 \quad &\mbox{ if }\sigma = we, \\ (\exists i \in \{ 1,\ldots,p\})\,(\forall t \in [t_{0},T])\quad R_{i}(t) > 0\quad &\mbox{ if }\sigma = pe. \end{array} \right.}$$

Note that this particular choice of \(f_{i}\) and \(\psi _{i}\) agrees with all the assumptions \((\mathcal{P}\mathcal{A})\).

Let us denote

$$\displaystyle{Q(\theta,\cdot ) =\sum _{ i=1}^{p}\theta _{ i}Q_{i}(\cdot );\quad R(\theta,\cdot ) =\sum _{ i=1}^{p}\theta _{ i}R_{i}(\cdot );\quad {Q}^{f}(\theta ) =\sum _{ i=1}^{p}\theta _{ i}Q_{i}^{f}.}$$

Thus, the scalarized problem (S)\(_{(\theta,t_{1},u_{l})}\) becomes the linear-quadratic problem

$$\displaystyle{\mbox{ (LQP)}\;\;\left \{\begin{array}{ll} \min \Big(x{(t_{1})}^{T}{Q}^{f}(\theta )x(t_{ 1}) +\int _{ t_{0}}^{t_{1} }(x{(t)}^{T}Q(\theta,t)x(t) + \mathbf{u_{ f}}{(t)}^{T}R(\theta,t)\mathbf{u_{ f}}(t))\mathrm{d}t\Big) \\ \mbox{ s.t.}\quad \dot{x}(t) = A(t)x(t) + \mathbf{B_{f}}(t)\mathbf{u_{f}}(t) + B_{l}(t)u_{l}(t)\quad \mbox{ a.e. on }[t_{0},t_{1}], \\ \ \;\quad \quad x(t_{0}) = x_{0}. \end{array} \right.}$$

We have the following result, which is probably known also for \(L_{2}\) controls; we present a proof for the sake of completeness.

Theorem 4.15.

Let \((x_{{\ast}}(\cdot ),\mathbf{u_{f}}_{{\ast}}(\cdot )) \in H_{1}^{n}([t_{0},t_{1}]) \times L_{2}^{m_{f}}([t_{ 0},t_{1}])\) verify the differential system and the initial condition for problem (LQP). Then the control process \((x_{{\ast}}(\cdot ),\mathbf{u_{f}}_{{\ast}}(\cdot ))\) is optimal for problem (LQP) if, and only if, there exists a function \(\lambda (\cdot ) \in H_{1}^{n}([t_{0},t_{1}])\) such that

$$ \displaystyle\begin{array}{rcl} \dot{{\lambda }}^{T}(t)& =& {-\lambda }^{T}(t)A(t) - x_{ {\ast}}^{T}(t)Q(\theta,t)\;\ \mathrm{a.e.\ on}\ [t_{ 0},t_{1}],{}\end{array}$$
(4.26)
$$ \displaystyle\begin{array}{rcl} {\lambda }^{T}(t_{ 1})& =& x_{{\ast}}^{T}(t_{ 1}){Q}^{f}(\theta ),{}\end{array}$$
(4.27)
$$\displaystyle\begin{array}{rcl} \mathbf{u_{f}}_{{\ast}}(t)& =& -{R}^{-1}(\theta,t){\mathbf{B_{ f}}}^{T}(t)\lambda (t)\quad \ \mathrm{a.e.\ on}\ \;[t_{ 0},t_{1}].{}\end{array}$$
(4.28)

Proof.

Assume that \(\lambda (\cdot ) \in H_{1}^{n}([t_{0},t_{1}])\) verifies (4.26)–(4.28). Let \((x,\mathbf{u_{f}}) \in H_{1}^{n}([t_{0},t_{1}]) \times L_{2}^{m_{f}}([t_{ 0},t_{1}])\) verify the differential system and the initial condition for problem (LQP). We have, for almost all \(t \in [t_{0},t_{1}]\),

$$\displaystyle\begin{array}{rcl} & & \frac{\mathrm{d}} {\mathrm{d}t}\Big{(\lambda }^{T}(t)(x(t) - x_{ {\ast}}(t))\Big) {=\dot{\lambda } }^{T}(t)(x(t) - x_{ {\ast}}(t)) {+\lambda }^{T}(t)(\dot{x}(t) -\dot{ x}_{ {\ast}}(t)) {}\\ & & \qquad =\, -{(\lambda }^{T}(t)A(t) + x_{ {\ast}}^{T}(t)Q(\theta,t))(x(t) - x_{ {\ast}}(t)) {}\\ & & \qquad \quad {+\lambda }^{T}(t)\Big(A(t)(x(t) - x_{ {\ast}}(t)) + \mathbf{B_{f}}(t)(\mathbf{u_{f}}(t) -\mathbf{u_{f}}_{{\ast}}(t))\Big) {}\\ & & \qquad =\, -x_{{\ast}}^{T}(t)Q(\theta,t)(x(t) - x_{ {\ast}}(t)) -\mathbf{u_{f}}_{{\ast}}^{T}(t)R(\theta,t)(\mathbf{u_{ f}}(t) -\mathbf{u_{f}}_{{\ast}}(t)). {}\\ \end{array}$$

Using the initial conditions for \(x(\cdot )\), \(x_{{\ast}}(\cdot )\) and the final condition for \(\lambda (\cdot )\), we get by integration

$$\displaystyle{ \begin{array}{ll} x_{{\ast}}^{T}(t_{1}){Q}^{f}(\theta )(x(t_{1}) - x_{{\ast}}(t_{1})) =& -\int _{t_{0}}^{t_{1}}\Big(x_{{\ast}}^{T}(t)Q(\theta,t)(x(t) - x_{{\ast}}(t)) \\ & + \mathbf{u_{f}}_{{\ast}}^{T}(t)R(\theta,t)(\mathbf{u_{f}}(t) -\mathbf{u_{f}}_{{\ast}}(t))\Big)\mathrm{d}t. \end{array} }$$
(4.29)

Denote

$$\displaystyle{J(x(\cdot ),\mathbf{u_{f}}(\cdot )) =\Big (x{(t_{1})}^{T}{Q}^{f}(\theta )x(t_{ 1})+\int _{t_{0}}^{t_{1} }(x{(t)}^{T}Q(\theta,t)x(t)+\mathbf{u_{ f}}{(t)}^{T}R(\theta,t)\mathbf{u_{ f}}(t))\mathrm{d}t\Big).}$$

For any symmetric positive semidefinite matrix P and all vectors \(v,v_{{\ast}}\), expanding \({(v - v_{{\ast}})}^{T}P(v - v_{{\ast}}) \geq 0\) gives

$$\displaystyle{{v}^{T}Pv - v_{ {\ast}}^{T}Pv_{ {\ast}}\geq 2v_{{\ast}}^{T}P(v - v_{ {\ast}}).}$$

Therefore

$$\displaystyle\begin{array}{rcl} J(x(\cdot ),\mathbf{u_{f}}(\cdot )) - J(x_{{\ast}}(\cdot ),\mathbf{u_{f}}_{{\ast}}(\cdot ))& \geq & 2\Big[x_{{\ast}}^{T}(t_{ 1}){Q}^{f}(\theta )(x(t_{ 1}) - x_{{\ast}}(t_{1})) {}\\ & +& \int _{t_{0}}^{t_{1} }\Big(x_{{\ast}}^{T}(t)Q(\theta,t)(x(t) - x_{ {\ast}}(t)) {}\\ & +& \mathbf{u_{f}}_{{\ast}}^{T}(t)R(\theta,t)(\mathbf{u_{ f}}(t) -\mathbf{u_{f}}_{{\ast}}(t))\Big)\mathrm{d}t\Big]. {}\\ \end{array}$$

From (4.29) the last expression is zero; hence \(J(x(\cdot ),\mathbf{u_{f}}(\cdot )) - J(x_{{\ast}}(\cdot ),\mathbf{u_{f}}_{{\ast}}(\cdot )) \geq 0\). Thus \((x_{{\ast}}(\cdot ),\mathbf{u_{f}}_{{\ast}}(\cdot ))\) is an optimal control process for problem (LQP).

Conversely, let \((x_{{\ast}}(\cdot ),\mathbf{u_{f}}_{{\ast}}(\cdot )) \in H_{1}^{n}([t_{0},t_{1}]) \times L_{2}^{m_{f}}([t_{ 0},t_{1}])\) be a solution of (LQP) (which exists and is unique according to Theorem 4.8). Let \(\lambda (\cdot ) \in H_{1}^{n}([t_{0},t_{1}])\) be the solution of the linear system (4.26) verifying the final condition (4.27). For any \(\mathbf{u_{f}}(\cdot ) \in L_{2}^{m_{f}}([t_{ 0},t_{1}])\), denoting by x( ⋅) the corresponding solution of the differential system and the initial condition for problem (LQP), we have (by a computation similar to the one above)

$$\displaystyle{\begin{array}{ll} {\lambda }^{T}(t_{ 1})(x(t_{1}) - x_{{\ast}}(t_{1})) =& -\int _{t_{0}}^{t_{1}}\Big(x_{ {\ast}}^{T}(t)Q(\theta,t)(x(t) - x_{ {\ast}}(t)) \\ & {-\lambda }^{T}(t)\mathbf{B_{f}}(t)(\mathbf{u_{f}}(t) -\mathbf{u_{f}}_{{\ast}}(t))\Big)\mathrm{d}t.\end{array} }$$

On the other hand, using the fact that the directional derivative of J at the optimal point \((x_{{\ast}}(\cdot ),\mathbf{u_{f}}_{{\ast}}(\cdot ))\) in the direction \((x(\cdot ),\mathbf{u_{f}}(\cdot )) - (x_{{\ast}}(\cdot ),\mathbf{u_{f}}_{{\ast}}(\cdot ))\) is nonnegative, we have

$$\displaystyle{\begin{array}{ll} x_{{\ast}}^{T}(t_{1}){Q}^{f}(\theta )(x(t_{1}) - x_{{\ast}}(t_{1}))& +\int _{ t_{0}}^{t_{1}}(x_{{\ast}}^{T}(t)Q(\theta,t)(x(t) - x_{{\ast}}(t)) \\ & + \mathbf{u_{f}}_{{\ast}}^{T}(t)R(\theta,t)(\mathbf{u_{f}}(t) -\mathbf{u_{f}}_{{\ast}}(t)))\mathrm{d}t \geq 0. \end{array} }$$

Finally we obtain

$$\displaystyle{\int _{t_{0}}^{t_{1} }{(\lambda }^{T}(t)\mathbf{B_{ f}}(t) + \mathbf{u_{f}}_{{\ast}}^{T}(t)R(\theta,t))(\mathbf{u_{ f}}(t) -\mathbf{u_{f}}_{{\ast}}(t))\mathrm{d}t \geq 0.}$$

Since \(\mathbf{u_{f}}(\cdot )\) can be arbitrarily chosen in \(L_{2}^{m_{f}}([t_{ 0},t_{1}])\), it follows that \({\lambda }^{T}(t)\mathbf{B_{f}}(t) + \mathbf{u_{f}}_{{\ast}}^{T}(t)R(\theta,t) = 0\) a.e. on \([t_{0},t_{1}]\), that is, (4.28) is satisfied.

 ■ 

Next we will show that, in the linear-quadratic case, it is possible to compute the optimal control and state explicitly as functions of the parameters \(\theta\), \(t_{1}\), \(u_{l}\) by means of a \(2n \times 2n\) resolvent matrix of a linear differential system built from the data. This fact will allow us to find explicit optimality conditions for our bilevel problems.

Recall that \(\mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )\) denotes the unique optimal control of the scalarized problem (S)\(_{(\theta,t_{1},u_{l})}\). The corresponding unique state and adjoint state (verifying Theorem 4.15) will be denoted by \(x(\theta,t_{1},u_{l},\cdot )\) and \(\lambda (\theta,t_{1},u_{l},\cdot )\).

To be more precise, the functions \(x(\theta,t_{1},u_{l},\cdot )\) and \(\lambda (\theta,t_{1},u_{l},\cdot )\) verify the following linear boundary value problem:

$$\displaystyle\begin{array}{rcl} & & \frac{\partial x} {\partial t} (\theta,t_{1},u_{l},t) =\, A(t)x(\theta,t_{1},u_{l},t) -\mathbf{B_{f}}(t){R}^{-1}(\theta,t)\mathbf{B_{ f}}{(t)}^{T}\lambda (\theta,t_{ 1},u_{l},t) \\ & & \qquad \qquad \, + B_{l}(t)u_{l}(t)\quad \mbox{ a.e. on }[t_{0},t_{1}], {}\end{array}$$
(4.30)
$$\displaystyle\begin{array}{rcl} & & \frac{\partial \lambda } {\partial t}(\theta,t_{1},u_{l},t) = -A{(t)}^{T}\lambda (\theta,t_{ 1},u_{l},t) - Q(\theta,t)x(\theta,t_{1},u_{l},t)\ \ \mbox{ a.e. on }[t_{0},t_{1}],{}\end{array}$$
(4.31)
$$\displaystyle\begin{array}{rcl} & & x(\theta,t_{1},u_{l},t_{0}) =\, x_{0},{}\end{array}$$
(4.32)
$$\displaystyle\begin{array}{rcl} & & \lambda (\theta,t_{1},u_{l},t_{1}) =\, {Q}^{f}(\theta )x(\theta,t_{ 1},u_{l},t_{1}){}\end{array}$$
(4.33)

and

$$\displaystyle{ \mathbf{u_{f}}(\theta,t_{1},u_{l},t) = -{R}^{-1}(\theta,t){\mathbf{B_{ f}}}^{T}(t)\lambda (\theta,t_{ 1},u_{l},t)\quad \mbox{ a.e. on }[t_{0},t_{1}]. }$$
(4.34)

Given \(t_{1} \in \mathcal{T}\) and \(\theta \in \Theta _{\sigma }\), consider the matrix-valued function \(P(\theta,t_{1},\cdot ): [t_{0},t_{1}] \rightarrow {\mathbb{R}}^{n\times n}\) which, under our hypotheses on the matrices \({Q}^{f}(\theta )\), \(Q(\theta,t)\), \(R(\theta,t)\), is the unique continuously differentiable solution (see, e.g. [1]) of the Riccati matrix differential equation (RMDE) on \([t_{0},t_{1}]\):

$$\displaystyle{\begin{array}{ll} \frac{\partial P} {\partial t} (\theta,t_{1},t) =&\, - A{(t)}^{T}P(\theta,t_{ 1},t) - P(\theta,t_{1},t)A(t) - Q(\theta,t) \\ &\, + P(\theta,t_{1},t)\mathbf{B_{f}}(t)R{(\theta,t)}^{-1}\mathbf{B_{f}}{(t)}^{T}P(\theta,t_{1},t)\end{array} }$$

satisfying the final time condition

$$\displaystyle{ P(\theta,t_{1},t_{1}) = {Q}^{f}(\theta ). }$$
(4.35)

Moreover, \(P(\theta,t_{1},t)\) is a symmetric positive definite matrix for each t.

Following [18], we can express P in terms of a resolvent matrix depending directly on the data. Thus, consider for all \((\theta,t) \in \Theta _{\sigma } \times [t_{0},t_{1}]\) the \(2n \times 2n\) matrix which defines the linear system (4.30) and (4.31):

$$\displaystyle{L(\theta,t) = \left (\begin{array}{cc} A(t) & -\mathbf{B_{f}}(t){R}^{-1}(\theta,t){\mathbf{B_{f}}}^{T}(t)\\ \\ - Q(\theta,t)& - {A}^{T}(t) \end{array} \right ).}$$

The proof of the following result can be found in [18].

Proposition 4.16.

Let \(\Psi (\theta,\cdot,\cdot )\) be the resolvent (or state transition) matrix associated to the linear differential system defined by \(L(\theta,t)\), i.e. for each \(s \in [t_{0},T]\), \(\Psi (\theta,\cdot,s)\) satisfies the Cauchy problem:

$$\displaystyle{\frac{\partial \Psi } {\partial t} (\theta,t,s) = L(\theta,t)\Psi (\theta,t,s),\;\;t \in [t_{0},T],\quad \Psi (\theta,s,s) = I_{2n}.}$$

Let us divide the matrix Ψ(θ,t,s) into four n × n blocks:

$$\displaystyle{\Psi (\theta,t,s) = \left (\begin{array}{ll} \Psi _{11}(\theta,t,s)&\Psi _{12}(\theta,t,s) \\ \Psi _{21}(\theta,t,s)&\Psi _{22}(\theta,t,s) \end{array} \right ).}$$

Then, for all t ∈ [t 0 ,t 1 ], the matrix \([\Psi _{11}(\theta,t,t_{1}) + \Psi _{12}(\theta,t,t_{1}){Q}^{f}(\theta )]\) is invertible and

$$\displaystyle{ P(\theta,t_{1},t) =\Big [\Psi _{21}(\theta,t,t_{1}) + \Psi _{22}(\theta,t,t_{1}){Q}^{f}(\theta )\Big]\Big{[\Psi _{ 11}(\theta,t,t_{1}) + \Psi _{12}(\theta,t,t_{1}){Q}^{f}(\theta )\Big]}^{-1}. }$$
(4.36)
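
Formula (4.36) is straightforward to check numerically. The sketch below (toy time-invariant data with a fixed \(\theta\) absorbed into \(Q\), \(R\), \(Q^{f}\)) forms \(L(\theta,t)\), integrates the \(2n \times 2n\) resolvent backward from \(t_{1}\), builds \(P\) by (4.36) and compares it with a direct backward integration of (RMDE):

```python
# Sketch of Proposition 4.16 (toy time-invariant data, fixed theta): P built
# from the resolvent blocks via (4.36) versus direct integration of (RMDE).
import numpy as np
from scipy.integrate import solve_ivp

n = 2
t0, t1 = 0.0, 1.0
A  = np.array([[0.0, 1.0], [-1.0, -0.5]])   # hypothetical data
Bf = np.array([[0.0], [1.0]])
Q  = np.diag([1.0, 0.5])                    # Q(theta, t)
R  = np.array([[1.0]])                      # R(theta, t) > 0
Qf = np.eye(n)                              # Q^f(theta)
S  = Bf @ np.linalg.inv(R) @ Bf.T
L  = np.block([[A, -S], [-Q, -A.T]])        # the 2n x 2n matrix L(theta, t)

def P_resolvent(t):
    # Psi(theta, t, t1): integrate dPsi/dt = L Psi from t1 down to t.
    sol = solve_ivp(lambda tau, y: (L @ y.reshape(2 * n, 2 * n)).ravel(),
                    (t1, t), np.eye(2 * n).ravel(), rtol=1e-10, atol=1e-12)
    Psi = sol.y[:, -1].reshape(2 * n, 2 * n)
    Psi11, Psi12 = Psi[:n, :n], Psi[:n, n:]
    Psi21, Psi22 = Psi[n:, :n], Psi[n:, n:]
    return (Psi21 + Psi22 @ Qf) @ np.linalg.inv(Psi11 + Psi12 @ Qf)  # (4.36)

def P_riccati(t):
    rhs = lambda tau, y: (-A.T @ y.reshape(n, n) - y.reshape(n, n) @ A - Q
                          + y.reshape(n, n) @ S @ y.reshape(n, n)).ravel()
    return solve_ivp(rhs, (t1, t), Qf.ravel(),
                     rtol=1e-10, atol=1e-12).y[:, -1].reshape(n, n)

print(np.max(np.abs(P_resolvent(t0) - P_riccati(t0))))  # ~ 1e-8 or smaller
```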

Next, let us denote by \(\xi (\theta,t_{1},u_{l},\cdot ) \in H_{1}^{n}([t_{0},t_{1}])\) the unique solution of the following linear Cauchy problem:

$$\displaystyle\begin{array}{rcl} \frac{\partial \xi } {\partial t}(\theta,t_{1},u_{l},t)& =& \,\big(-A{(t)}^{T} + P(\theta,t_{ 1},t)\mathbf{B_{f}}(t){R}^{-1}(\theta,t)\mathbf{B_{ f}}{(t)}^{T}\big)\xi (\theta,t_{1},u_{l},t) \\ & \,-& P(\theta,t_{1},t)B_{l}(t)u_{l}(t)\quad \mbox{ a.e. on }[t_{0},t_{1}], {}\end{array}$$
(4.37)
$$\displaystyle\begin{array}{rcl} \xi (\theta,t_{1},u_{l},t_{1})& =& \,0.{}\end{array}$$
(4.38)

Lemma 4.17.

For all t ∈ [t 0 ,t 1 ] we have

$$\displaystyle{ \lambda (\theta,t_{1},u_{l},t) = P(\theta,t_{1},t)x(\theta,t_{1},u_{l},t) +\xi (\theta,t_{1},u_{l},t). }$$
(4.39)

Proof.

Computing the derivative \(\frac{\partial } {\partial t}\Big(\lambda (\theta,t_{1},u_{l},t) - P(\theta,t_{1},t)x(\theta,t_{1},u_{l},t) -\xi (\theta,t_{1},u_{l},t)\Big)\) and using (4.30)–(4.33), (RMDE), (4.35), (4.37) and (4.38), the result follows. Indeed, writing \(D:=\lambda -Px-\xi\) (arguments suppressed), the terms combine to

$$\displaystyle{\dot{D} =\dot{\lambda } -\dot{P}x - P\dot{x} -\dot{\xi } =\big (-{A}^{T} + P\mathbf{B_{f}}{R}^{-1}{\mathbf{B_{f}}}^{T}\big)D,}$$

while \(D(t_{1}) = {Q}^{f}(\theta )x(t_{1}) - {Q}^{f}(\theta )x(t_{1}) - 0 = 0\); by uniqueness of the solution of this linear Cauchy problem, \(D \equiv 0\), which is (4.39). ■ 

Denote by \(\Xi (\theta,t_{1},\cdot,\cdot )\) the resolvent matrix associated to (4.37), i.e. for all \((\theta,t_{1},s) \in \Theta _{\sigma } \times \mathcal{T} \times [t_{0},T]\)

$$\displaystyle\begin{array}{rcl} & & \frac{\partial \Xi } {\partial t} (\theta,t_{1},t,s) =\,\big (-A{(t)}^{T} + P(\theta,t_{ 1},t)\mathbf{B_{f}}(t){R}^{-1}(\theta,t)\mathbf{B_{ f}}{(t)}^{T}\big)\Xi (\theta,t_{1},t,s),\;t \in [t_{0},T]{}\end{array}$$
(4.40)
$$\displaystyle\begin{array}{rcl} & & \Xi (\theta,t_{1},s,s) =\, I_{n}.{}\end{array}$$
(4.41)

Based on this, we are able to solve the boundary value problem (4.30)–(4.33) in terms of the data.

Corollary 4.18.

For all \((\theta,t_{1},u_{l}) \in \Theta _{\sigma } \times \mathcal{T} \times L_{2}^{m_{l}}([t_{ 0},T])\) and for all t ∈ [t 0 ,t 1 ] we have

$$\displaystyle\begin{array}{rcl} \left (\begin{array}{c} x(\theta,t_{1},u_{l},t)\\ \\ \lambda (\theta,t_{1},u_{l},t) \end{array} \right ) =& \,\Psi (\theta,t,t_{0})\left (\begin{array}{c} x_{0}\\ \\ P(\theta,t_{1},t_{0})x_{0} +\xi (\theta,t_{1},u_{l},t_{0}) \end{array} \right ) & {}\\ & \,+\int _{t_{0}}^{t}\Psi (\theta,t,s)\left (\begin{array}{c} B_{l}(s)u_{l}(s)\\ \\ 0 \end{array} \right )\mathrm{d}s,& {}\\ \end{array}$$

where

$$\displaystyle{\xi (\theta,t_{1},u_{l},t_{0}) =\int _{ t_{0}}^{t_{1} }\Xi (\theta,t_{1},t_{0},s)P(\theta,t_{1},s)B_{l}(s)u_{l}(s)\mathrm{d}s.}$$
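
Assembling the pieces of Corollary 4.18 numerically (toy time-invariant data, so that \(\Psi (\theta,t,s)\) is a matrix exponential): \(P\) is obtained by backward integration of (RMDE) (equal to (4.36) by Proposition 4.16), \(\xi (\theta,t_{1},u_{l},t_{0})\) by the quadrature formula above, and the terminal condition (4.33) serves as a consistency check.

```python
# Sketch of Corollary 4.18 (toy time-invariant data): build xi(., t0) from the
# quadrature formula, propagate (x, lambda), and check lambda(t1) = Q^f x(t1).
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

n = 2
t0, t1 = 0.0, 1.0
x0 = np.array([1.0, -1.0])
A  = np.array([[0.0, 1.0], [-1.0, -0.5]])   # hypothetical data
Bf, Bl = np.array([[0.0], [1.0]]), np.array([[1.0], [0.0]])
Q, R, Qf = np.diag([1.0, 0.5]), np.array([[1.0]]), np.eye(n)
ul = lambda t: np.array([np.sin(t)])        # leader's control
S = Bf @ np.linalg.inv(R) @ Bf.T
L = np.block([[A, -S], [-Q, -A.T]])         # time-invariant: Psi(t,s) = expm(L(t-s))

# P(theta, t1, .) by backward integration of (RMDE) (equals (4.36)).
ric = lambda t, y: (-A.T @ y.reshape(n, n) - y.reshape(n, n) @ A - Q
                    + y.reshape(n, n) @ S @ y.reshape(n, n)).ravel()
Ps = solve_ivp(ric, (t1, t0), Qf.ravel(), dense_output=True, rtol=1e-10, atol=1e-12)
P = lambda t: Ps.sol(t).reshape(n, n)

def Xi_t0(s):
    # Xi(theta, t1, t0, s): resolvent of (4.37) integrated from s down to t0.
    if s == t0:
        return np.eye(n)
    rhs = lambda t, y: ((-A.T + P(t) @ S) @ y.reshape(n, n)).ravel()
    return solve_ivp(rhs, (s, t0), np.eye(n).ravel(),
                     rtol=1e-10, atol=1e-12).y[:, -1].reshape(n, n)

trap = lambda g, ts: sum(0.5 * (ts[k + 1] - ts[k]) * (g[k] + g[k + 1])
                         for k in range(len(ts) - 1))
ss = np.linspace(t0, t1, 81)
xi0 = trap([Xi_t0(s) @ P(s) @ Bl @ ul(s) for s in ss], ss)  # the formula above

# Propagate (x, lambda) as in Corollary 4.18 and test (4.33).
y0 = np.concatenate([x0, P(t0) @ x0 + xi0])
forcing = [expm(L * (t1 - s)) @ np.concatenate([Bl @ ul(s), np.zeros(n)]) for s in ss]
y1 = expm(L * (t1 - t0)) @ y0 + trap(forcing, ss)
print(np.max(np.abs(y1[n:] - Qf @ y1[:n])))  # ~ 0 up to quadrature error
```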

Remark 4.19.

The right-hand sides of the formulas giving \(x(\theta,t_{1},u_{l},t)\) and \(\lambda (\theta,t_{1},u_{l},t)\) in Corollary 4.18 are defined for all \((t_{1},t) \in ]t_{0},T[\times [t_{0},T]\) (and not only for \((t_{1},t) \in \mathcal{T} \times [t_{0},t_{1}]\)) and for all θ belonging to an open convex set \(\Omega\) with \(\Theta _{\sigma } \subseteq \Omega\). Indeed, the formulas in Corollary 4.18 have a meaning as long as \(R(\theta,t) > 0\).

When σ = pe, by (HLQP) pe it is obvious that we can take \(\Omega = \mathbb{R}_{++}^{p}\).

When σ = we, the continuous function \([t_{0},T] \times {\mathbb{R}}^{m_{f}} \ni (t,\mathbf{u}_{ f})\mapsto {\mathbf{u_{f}}}^{T}R_{ i}(t)\mathbf{u_{f}}\) attains its minimum value, say \(\alpha _{i}\), on the compact set \([t_{0},T] \times \mathbb{S}\), where \(\mathbb{S}\) is the unit sphere in \({\mathbb{R}}^{m_{f}}\), \(i = 1,\ldots,p\). According to (HLQP)\(_{we}\) we have \(\alpha _{i} > 0\) for all i. Then, it is easy to see that we can take

$$\displaystyle{\Omega =\{\theta \in {\mathbb{R}}^{p}\vert \sum _{ i=1}^{p}\theta _{ i}\alpha _{i} > 0\}.}$$

We will extend the functions \(x(\cdot,\cdot,\cdot,\cdot )\) and \(\lambda (\cdot,\cdot,\cdot,\cdot )\) via these formulas as continuous functions from \(\Omega \times ]t_{0},T[\times L_{2}^{m_{l}}([t_{ 0},T]) \times [t_{0},T]\) to \({\mathbb{R}}^{n}\). Moreover, based on (4.34), we will also extend the function \(\mathbf{u_{f}}(\cdot,\cdot,\cdot,\cdot )\) as a continuous function from \(\Omega \times ]t_{0},T[\times L_{2}^{m_{l}}([t_{ 0},T]) \times [t_{0},T]\) to \({\mathbb{R}}^{m_{f}}\). These extensions will be needed further on in order to obtain optimality conditions for the upper level.

Using the differentiability of the solution of a differential equation with respect to parameters, together with some straightforward computation, we obtain the following.

Proposition 4.20.

The resolvent \(\Psi (\cdot,\cdot,\cdot )\) is continuously differentiable on \(\Omega \times [t_{0},T] \times [t_{0},T]\). We have the following formulas for all \((\theta,t,s) \in \Omega \times [t_{0},T] \times [t_{0},T]\) and \(i = 1,\ldots,p\):

$$\displaystyle\begin{array}{rcl} \frac{\partial \Psi } {\partial \theta _{i}} (\theta,t,s)& =& \int _{s}^{t}\Psi (\theta,t,\tau )\frac{\partial L} {\partial \theta _{i}} (\theta,\tau )\Psi (\theta,\tau,s)\mathrm{d}\tau,\quad \mathrm{where}{}\end{array}$$
(4.42)
$$\displaystyle\begin{array}{rcl} \frac{\partial L} {\partial \theta _{i}} (\theta,t)& =& \left (\begin{array}{cc} 0 &\mathbf{B_{f}}(t){R}^{-1}(\theta,t)R_{i}(t){R}^{-1}(\theta,t)\mathbf{B_{f}}{(t)}^{T}\\ \\ - Q_{i}(t)& 0 \end{array} \right ),{}\end{array}$$
(4.43)
$$\displaystyle\begin{array}{rcl} \frac{\partial \Psi } {\partial s} (\theta,t,s)& =& -\Psi (\theta,t,s)L(\theta,s).{}\end{array}$$
(4.44)
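
Formulas (4.42)–(4.44) lend themselves to a finite-difference check; the sketch below (toy time-invariant data with p = 2, so that \(\Psi (\theta,t,s) = {e}^{L(\theta )(t-s)}\)) validates (4.42) for i = 1 and (4.44):

```python
# Check of (4.42)-(4.44) on toy time-invariant data (p = 2): the integral
# formula for dPsi/dtheta_1 and the s-derivative formula are compared with
# central finite differences.
import numpy as np
from scipy.linalg import expm

n = 2
A  = np.array([[0.0, 1.0], [-1.0, -0.5]])   # hypothetical data
Bf = np.array([[0.0], [1.0]])
Q1, Q2 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
R1, R2 = np.array([[1.0]]), np.array([[2.0]])

def Lmat(th):
    Rinv = np.linalg.inv(th[0] * R1 + th[1] * R2)
    return np.block([[A, -Bf @ Rinv @ Bf.T], [-(th[0] * Q1 + th[1] * Q2), -A.T]])

def dL_dth1(th):                             # formula (4.43) with i = 1
    Rinv = np.linalg.inv(th[0] * R1 + th[1] * R2)
    return np.block([[np.zeros((n, n)), Bf @ Rinv @ R1 @ Rinv @ Bf.T],
                     [-Q1, np.zeros((n, n))]])

th = np.array([0.4, 0.6]); s, t = 0.2, 1.0; h = 1e-6
Psi = lambda th, t, s: expm(Lmat(th) * (t - s))

# (4.42): integral formula vs central finite difference in theta_1.
taus = np.linspace(s, t, 801)
vals = [Psi(th, t, tau) @ dL_dth1(th) @ Psi(th, tau, s) for tau in taus]
integral = sum(0.5 * (taus[k + 1] - taus[k]) * (vals[k] + vals[k + 1])
               for k in range(len(taus) - 1))
fd = (Psi(th + np.array([h, 0.0]), t, s)
      - Psi(th - np.array([h, 0.0]), t, s)) / (2 * h)
print(np.max(np.abs(integral - fd)))         # small (quadrature + FD error)

# (4.44): dPsi/ds = -Psi(theta, t, s) L(theta, s).
fd_s = (Psi(th, t, s + h) - Psi(th, t, s - h)) / (2 * h)
print(np.max(np.abs(fd_s + Psi(th, t, s) @ Lmat(th))))
```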

By (4.36) and the previous proposition we obtain immediately the following.

Proposition 4.21.

The matrix-valued function P(⋅,⋅,⋅) is continuously differentiable on Ω × [t 0 ,T] × [t 0 ,T] and verifies the following formulas:

$$\displaystyle\begin{array}{rcl} & & \frac{\partial P} {\partial \theta _{i}} (\theta,t_{1},t) =\Big [\frac{\partial \Psi _{21}} {\partial \theta _{i}} (\theta,t,t_{1}) + \frac{\partial \Psi _{22}} {\partial \theta _{i}} (\theta,t,t_{1}){Q}^{f}(\theta ) + \Psi _{ 22}(\theta,t,t_{1})Q_{i}^{f}\Big] \\ & & \qquad \times \Big {[\Psi _{11}(\theta,t,t_{1}) + \Psi _{12}(\theta,t,t_{1}){Q}^{f}(\theta )\Big]}^{-1} \\ & & \qquad -\Big [\Psi _{21}(\theta,t,t_{1}) + \Psi _{22}(\theta,t,t_{1}){Q}^{f}(\theta )\Big]\Big{[\Psi _{ 11}(\theta,t,t_{1}) + \Psi _{12}(\theta,t,t_{1}){Q}^{f}(\theta )\Big]}^{-1} \\ & & \qquad \times \Big [\frac{\partial \Psi _{11}} {\partial \theta _{i}} (\theta,t,t_{1}) + \frac{\partial \Psi _{12}} {\partial \theta _{i}} (\theta,t,t_{1}){Q}^{f}(\theta ) + \Psi _{ 12}(\theta,t,t_{1})Q_{i}^{f}\Big] \\ & & \qquad \times \Big {[\Psi _{11}(\theta,t,t_{1}) + \Psi _{12}(\theta,t,t_{1}){Q}^{f}(\theta )\Big]}^{-1} {}\end{array}$$
(4.45)

and

$$\displaystyle{\frac{\partial \Psi } {\partial \theta _{i}} (\theta,t,s) = \left (\begin{array}{cc} \frac{\partial \Psi _{11}} {\partial \theta _{i}} (\theta,t,s)&\frac{\partial \Psi _{12}} {\partial \theta _{i}} (\theta,t,s) \\ \\ \frac{\partial \Psi _{21}} {\partial \theta _{i}} (\theta,t,s)&\frac{\partial \Psi _{22}} {\partial \theta _{i}} (\theta,t,s) \end{array} \right ).}$$

Using an analogous calculation we obtain

$$\displaystyle\begin{array}{rcl} & \quad & \frac{\partial P} {\partial t_{1}}(\theta,t_{1},t) \\ & & =\Big [\frac{\partial \Psi _{21}} {\partial t_{1}} (\theta,t,t_{1}) + \frac{\partial \Psi _{22}} {\partial t_{1}} (\theta,t,t_{1}){Q}^{f}(\theta )\Big]\Big{[\Psi _{ 11}(\theta,t,t_{1}) + \Psi _{12}(\theta,t,t_{1}){Q}^{f}(\theta )\Big]}^{-1} \\ & & \quad -\Big [\Psi _{21}(\theta,t,t_{1}) + \Psi _{22}(\theta,t,t_{1}){Q}^{f}(\theta )\Big]\Big{[\Psi _{ 11}(\theta,t,t_{1}) + \Psi _{12}(\theta,t,t_{1}){Q}^{f}(\theta )\Big]}^{-1} \\ & & \quad \times \Big [\frac{\partial \Psi _{11}} {\partial t_{1}} (\theta,t,t_{1}) + \frac{\partial \Psi _{12}} {\partial t_{1}} (\theta,t,t_{1}){Q}^{f}(\theta )\Big]\Big{[\Psi _{ 11}(\theta,t,t_{1}) + \Psi _{12}(\theta,t,t_{1}){Q}^{f}(\theta )\Big]}^{-1}.{}\end{array}$$
(4.46)

The computation of \(\frac{\partial \Psi _{ij}} {\partial t_{1}} (\theta,t,t_{1})\) can be obtained using (4.44) :

$$\displaystyle{ \left (\begin{array}{cc} \frac{\partial \Psi _{11}} {\partial t_{1}} (\theta,t,t_{1})&\frac{\partial \Psi _{12}} {\partial t_{1}} (\theta,t,t_{1}) \\ \\ \frac{\partial \Psi _{21}} {\partial t_{1}} (\theta,t,t_{1})&\frac{\partial \Psi _{22}} {\partial t_{1}} (\theta,t,t_{1}) \end{array} \right ) = -\left (\begin{array}{cc} \Psi _{11}(\theta,t,t_{1})&\Psi _{12}(\theta,t,t_{1})\\ \\ \\ \Psi _{21}(\theta,t,t_{1})&\Psi _{22}(\theta,t,t_{1}) \end{array} \right )L(\theta,t_{1}). }$$
(4.47)

Proposition 4.22.

The resolvent \(\Xi (\cdot,\cdot,\cdot,\cdot )\) is continuously differentiable on \(\Omega \times ]t_{0},T[\times [t_{0},T] \times [t_{0},T]\), and denoting

$$\displaystyle{ \mathcal{A}(\theta,t_{1},t):= -A{(t)}^{T} + P(\theta,t_{ 1},t)\mathbf{B_{f}}(t){R}^{-1}(\theta,t)\mathbf{B_{ f}}{(t)}^{T}, }$$
(4.48)

we have

$$\displaystyle\begin{array}{rcl} \frac{\partial \Xi } {\partial \theta _{i}} (\theta,t_{1},t,s)& =& \int _{s}^{t}\Xi (\theta,t_{ 1},t,\tau )\frac{\partial \mathcal{A}} {\partial \theta _{i}} (\theta,t_{1},\tau )\Xi (\theta,t_{1},\tau,s)\mathrm{d}\tau,{}\end{array}$$
(4.49)
$$\displaystyle\begin{array}{rcl} \frac{\partial \Xi } {\partial t_{1}}(\theta,t_{1},t,s)& =& \int _{s}^{t}\Xi (\theta,t_{ 1},t,\tau )\frac{\partial \mathcal{A}} {\partial t_{1}}(\theta,t_{1},\tau )\Xi (\theta,t_{1},\tau,s)\mathrm{d}\tau,{}\end{array}$$
(4.50)
$$\displaystyle\begin{array}{rcl} \frac{\partial \Xi } {\partial s} (\theta,t_{1},t,s)& =& -\Xi (\theta,t_{1},t,s)\mathcal{A}(\theta,t_{1},s).{}\end{array}$$
(4.51)

The partial derivatives of \(\mathcal{A}(\theta,t_{1},t)\) can be computed using (4.36), Proposition 4.21 and the obvious formula:

$$\displaystyle{ \frac{\partial } {\partial \theta _{i}}{R}^{-1}(\theta,t) = -{R}^{-1}(\theta,t)R_{ i}(t){R}^{-1}(\theta,t).}$$

Proposition 4.23.

For all \((\theta,t_{1}) \in \Omega \times ]t_{0},T[\), the maps \(u_{l}\mapsto x(\theta,t_{1},u_{l},\cdot )\) and \(u_{l}\mapsto \lambda (\theta,t_{1},u_{l},\cdot )\), respectively \(u_{l}\mapsto \mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )\), are affine and continuous from \(L_{2}^{m_{l}}([t_{ 0},T])\) to \(H_{1}^{n}([t_{0},t_{1}])\), respectively from \(L_{2}^{m_{l}}([t_{ 0},T])\) to \(L_{2}^{m_{f}}([t_{ 0},T])\). Therefore they are continuously Fréchet differentiable on \(L_{2}^{m_{l}}([t_{ 0},T])\) and, for any \(u_{l} \in L_{2}^{m_{l}}([t_{ 0},t_{1}])\), their Fréchet differentials (which are linear continuous maps from \(L_{2}^{m_{l}}([t_{ 0},T])\) to \(H_{1}^{n}([t_{0},t_{1}])\), respectively from \(L_{2}^{m_{l}}([t_{ 0},T])\) to \(L_{2}^{m_{f}}([t_{ 0},T])\)) verify, for all \(h \in L_{2}^{m_{l}}([t_{ 0},T])\) and all \(t \in [t_{0},t_{1}]\):

$$\displaystyle\begin{array}{rcl} & & \frac{\partial } {\partial u_{l}}x(\theta,t_{1},u_{l},t) \cdot h =\, \Psi _{12}(\theta,t,t_{0})\int _{t_{0}}^{t_{1} }\Xi (\theta,t_{1},t_{0},s)P(\theta,t_{1},s)B_{l}(s)h(s)\mathrm{d}s \\ & & \qquad \qquad \qquad \qquad \, +\int _{ t_{0}}^{t}\Psi _{ 11}(\theta,t,s)B_{l}(s)h(s)\mathrm{d}s {}\end{array}$$
(4.52)
$$\displaystyle\begin{array}{rcl} & & \frac{\partial } {\partial u_{l}}\lambda (\theta,t_{1},u_{l},t) \cdot h =\, \Psi _{22}(\theta,t,t_{0})\int _{t_{0}}^{t_{1} }\Xi (\theta,t_{1},t_{0},s)P(\theta,t_{1},s)B_{l}(s)h(s)\mathrm{d}s \\ & & \qquad \qquad \qquad \qquad \, +\int _{ t_{0}}^{t}\Psi _{ 21}(\theta,t,s)B_{l}(s)h(s)\mathrm{d}s {}\end{array}$$
(4.53)
$$\displaystyle\begin{array}{rcl} & & \frac{\partial } {\partial u_{l}}\mathbf{u_{f}}(\theta,t_{1},u_{l},t) \cdot h =\, -{R}^{-1}(\theta,t)\mathbf{B_{ f}}{(t)}^{T} \frac{\partial } {\partial u_{l}}\lambda (\theta,t_{1},u_{l},t) \cdot h.{}\end{array}$$
(4.54)

Proof.

It is easy to see from Corollary 4.18, (4.30) and (4.31) that the maps \(u_{l}\mapsto x(\theta,t_{1},u_{l},\cdot )\) and \(u_{l}\mapsto \lambda (\theta,t_{1},u_{l},\cdot )\) are affine and continuous from \(L_{2}^{m_{l}}([t_{0},T])\) to \(H_{1}^{n}([t_{0},t_{1}])\); hence (4.52) and (4.53) hold. Then, by (4.34), we obtain that the map \(u_{l}\mapsto \mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )\) from \(L_{2}^{m_{l}}([t_{0},T])\) to \(L_{2}^{m_{f}}([t_{0},T])\) is affine and continuous, and we get (4.54). ■ 

Theorem 4.24 (Regularity of u f( ⋅,  ⋅,  ⋅,  ⋅) and x( ⋅,  ⋅,  ⋅,  ⋅)). 

  1.

    The functions \(\mathbf{u_{f}}(\cdot,\cdot,\cdot,\cdot ): \Omega \times ]t_{0},T[\times L_{2}^{m_{l}}([t_{ 0},T]) \times [t_{0},T] \rightarrow {\mathbb{R}}^{m_{f}}\) and \(x(\cdot,\cdot,\cdot,\cdot ): \Omega \times ]t_{0},T[\times L_{2}^{m_{l}}([t_{ 0},T]) \times [t_{0},T] \rightarrow {\mathbb{R}}^{n}\) are continuous.

  2.

    The function \((\theta,t_{1},u_{l})\mapsto \mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )\) from \(\Omega \times ]t_{0},T[\times L_{2}^{m_{l}}([t_{ 0},T])\) to \(L_{2}^{m_{f}}([t_{ 0},T])\) is continuous as well as the function \((\theta,t_{1},u_{l})\mapsto x(\theta,t_{1},u_{l},\cdot )\) from \(\Omega \times ]t_{0},T[\times L_{2}^{m_{l}}([t_{ 0},T])\) to \(L_{2}^{n}([t_{0},T])\) .

  3.

    For each fixed \((\bar{\theta },\bar{t}_{1},\bar{u}_{l}) \in \Omega \times ]t_{0},T[\times L_{2}^{m_{l}}([t_{ 0},T])\):

    • The function \(\theta \mapsto \mathbf{u_{f}}(\theta,\bar{t}_{1},\bar{u}_{l},\cdot )\) from Ω to \(L_{2}^{m_{f}}([t_{ 0},T])\) and the function \(\theta \mapsto x(\theta,\bar{t}_{1},\bar{u}_{l},\cdot )\) from Ω to \(L_{2}^{n}([t_{0},T])\) are continuously Fréchet differentiable on Ω.

    • The function \(u_{l}\mapsto \mathbf{u_{f}}(\bar{\theta },\bar{t}_{1},u_{l},\cdot )\) from \(L_{2}^{m_{l}}([t_{ 0},T])\) to \(L_{2}^{m_{f}}([t_{ 0},T])\) and the function \(u_{l}\mapsto x(\bar{\theta },\bar{t}_{1},u_{l},\cdot )\) from \(L_{2}^{m_{l}}([t_{ 0},T])\) to \(H_{1}^{n}([t_{0},T])\) are continuously Fréchet differentiable.

    • The functions \(t_{1}\mapsto \mathbf{u_{f}}(\bar{\theta },t_{1},\bar{u}_{l},\cdot )\) from ]t 0 ,T[ to \(L_{2}^{m_{f}}([t_{ 0},T])\) and \(t_{1}\mapsto x(\bar{\theta },t_{1},\bar{u}_{l},\cdot )\) from ]t 0 ,T[ to \(L_{2}^{n}([t_{0},T])\) are a.e. differentiable on ]t 0 ,T[, and for almost all t 1 ∈]t 0 ,T[, \(\frac{\partial \mathbf{u_{f}}} {\partial t_{1}} (\bar{\theta },\bar{t}_{1},\bar{u}_{l},\cdot ) \in L_{2}^{m_{f} }([t_{0},T])\) and \(\frac{\partial x} {\partial t_{1}}(\bar{\theta },\bar{t}_{1},\bar{u}_{l},\cdot ) \in L_{2}^{n}([t_{ 0},T])\) .

      Moreover, for each t 1 ∈]t 0 ,T[ such that \(\bar{u}_{l}\) is continuous at t 1 , these functions are differentiable in t 1 .

  4.

    The functions u f (⋅,⋅,⋅,⋅), x(⋅,⋅,⋅,⋅) and their partial derivatives can be explicitly represented as functions of the data (supposing we are able to compute the resolvent matrices Ψ and Ξ).

Proof.

By Corollary 4.18, Remark 4.19 and Propositions 4.20–4.23, we obtain points 1 and 4.

To prove point 2 we will use the fact that, by Corollary 4.18, we can write

$$\displaystyle{x(\theta,t_{1},u_{l},t) =\alpha (\theta,t_{1},t) +\int _{ t_{0}}^{T}X(\theta,t_{ 1},t,s)u_{l}(s)\mathrm{d}s,}$$

where

$$\displaystyle{\alpha (\theta,t_{1},t) =\big (\Psi _{11}(\theta,t,t_{0}) + \Psi _{12}(\theta,t,t_{0})P(\theta,t_{1},t_{0})\big)x_{0}}$$

and X(θ, t 1, t, s) is given later by relations (4.61) and (4.63). Obviously \(\alpha: \Omega \times ]t_{0},T[\times [t_{0},T] \rightarrow {\mathbb{R}}^{n}\) is continuous; for each s ∈ [t 0, T], the \({\mathbb{R}}^{n\times m_{l}}\)-valued map X( ⋅,  ⋅,  ⋅, s) is continuous on \(\Omega \times ]t_{0},T[\times [t_{0},T]\); and, for each \((\theta,t_{1},t) \in \Omega \times ]t_{0},T[\times [t_{0},T]\), \(X(\theta,t_{1},t,\cdot ) \in L_{2}^{n\times m_{l}}([t_{0},T])\).

We obtain easily that the function (θ, t 1)↦α(θ, t 1,  ⋅) is continuous from Ω ×]t 0, T[ to \(\mathcal{C}([t_{0},T]; {\mathbb{R}}^{n})\), where \(\mathcal{C}([t_{0},T]; {\mathbb{R}}^{n})\) is the Banach space of continuous functions on [t 0, T] with values in \({\mathbb{R}}^{n}\) endowed with the uniform convergence norm.

Since the embedding \(\mathcal{C}([t_{0},T]; {\mathbb{R}}^{n}) \subset L_{2}^{n}([t_{0},T])\) is continuous, we obtain that the function (θ, t 1)↦α(θ, t 1,  ⋅) is continuous from Ω ×]t 0, T[ to \(L_{2}^{n}([t_{0},T])\).

Also, using Lebesgue’s dominated convergence theorem, we obtain easily that the function (θ, t 1, t)↦X(θ, t 1, t,  ⋅) is continuous from Ω ×]t 0, T[ ×[t 0, T] to \(L_{2}^{n\times m_{l}}([t_{ 0},T])\). Denoting \(y(\theta,t_{1},u_{l},t) =\int _{ t_{0}}^{T}X(\theta,t_{1},t,s)u_{l}(s)\mathrm{d}s\), and writing

$$\displaystyle{\begin{array}{ll} y(\theta \prime,t_{1}\prime,u_{l}\prime,t) - y(\theta,t_{1},u_{l},t) =&\,\big(y(\theta \prime,t_{1}\prime,u_{l}\prime,t) - y(\theta \prime,t_{1}\prime,u_{l},t)\big) \\ &\, +\big (y(\theta \prime,t_{1}\prime,u_{l},t) - y(\theta,t_{1},u_{l},t)\big), \end{array} }$$

we obtain that

$$\displaystyle{\begin{array}{ll} \vert y(\theta \prime,t_{1}\prime,u_{l}\prime,t) - y(\theta,t_{1},u_{l},t)\vert \leq &\,\|X(\theta \prime,t_{1}\prime,t,\cdot )\|_{2} \cdot \| u_{l}\prime - u_{l}\|_{2} \\ & +\| X(\theta \prime,t_{1}\prime,t,\cdot ) - X(\theta,t_{1},t,\cdot )\|_{2} \cdot \| u_{l}\|_{2} \end{array} }$$

which finally proves the continuity of the function \((\theta,t_{1},u_{l})\mapsto x(\theta,t_{1},u_{l},\cdot )\) from \(\Omega \times ]t_{0},T[\times L_{2}^{m_{l}}([t_{0},T])\) to \(L_{2}^{n}([t_{0},T])\).

With similar arguments we can prove the continuity of the function \((\theta,t_{1},u_{l})\mapsto \mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )\) from \(\Omega \times ]t_{0},T[\times L_{2}^{m_{l}}([t_{0},T])\) to \(L_{2}^{m_{f}}([t_{0},T])\), as well as point 3. ■ 

4.6 Optimality Conditions for the Upper Level, i.e. for Problems (OB) σ and (PB) σ

In this section we restrict ourselves to the case considered in Sect. 4.5.2. Moreover, we suppose that \(\mathcal{U}_{l}\) is the closed ball

$$\displaystyle{ \mathcal{U}_{l} = \left \{u_{l} \in L_{2}^{m_{l} }([t_{0},T])\,\vert \;\|u_{l}\|_{2} \leq R\right \}, }$$
(4.55)

where R is a strictly positive real number.

4.6.1 The Optimistic Bilevel Problem

We begin with some preliminary results in order to obtain an existence result when \(\mathcal{U}_{f}\) is not assumed to be bounded, in which case we cannot apply the results obtained in [17]. We could adapt the proofs given in [17], but we give direct proofs for the sake of completeness.

Lemma 4.25.

Let X and Y be arbitrary sets and let \(J: X \times Y \rightarrow \mathbb{R} \cup \{ +\infty \}\) be such that, for each x ∈ X, the set argmin  J(x,⋅) is nonempty. Then the problems

$$\displaystyle{ \min _{(x,y)\in X\times Y }J(x,y) }$$
(4.56)

and

$$\displaystyle{ \min _{x\in X}\min _{y\in Y }J(x,y) }$$
(4.57)

are equivalent, i.e. problem (4.56) is solvable if and only if problem (4.57) is solvable; in this case the solution sets coincide, as do the minimal values.

Proof.

Let \((\hat{x},\hat{y}) \in X \times Y\) be a solution for problem (4.56), i.e. \((\hat{x},\hat{y}) \in \mbox{ argmin }J(\cdot,\cdot )\). Then, for each x ∈ X, we have obviously \(J(\hat{x},\hat{y}) =\min _{y\in Y }J(\hat{x},y) \leq \min _{y\in Y }J(x,y)\); hence \(J(\hat{x},\hat{y}) =\min _{x\in X}\min _{y\in Y }J(x,y)\), and \((\hat{x},\hat{y})\) is a solution for problem (4.57).

Conversely, let \((\bar{x},\bar{y})\) be a solution for problem (4.57). This means that, for all x ∈ X and y′ ∈ argmin J(x,  ⋅), we have \(J(\bar{x},\bar{y}) \leq J(x,y\prime) =\min _{y\in Y }J(x,y)\); hence, for all (x, y) ∈ X ×Y, we have \(J(\bar{x},\bar{y}) \leq J(x,y)\). Therefore \((\bar{x},\bar{y})\) is a solution for problem (4.56). ■ 
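A toy finite instance illustrating the lemma (the sets X, Y and the function J below are arbitrary placeholders):

```python
from itertools import product

# Toy illustration of Lemma 4.25: minimising J jointly over X x Y gives the
# same value (and the same minimisers) as the nested min-min problem.
X, Y = [0, 1, 2], [-1, 0, 1]
J = lambda x, y: (x - 1) ** 2 + (y - x) ** 2

joint = min(J(x, y) for x, y in product(X, Y))       # problem (4.56)
nested = min(min(J(x, y) for y in Y) for x in X)     # problem (4.57)
assert joint == nested == 0                          # attained at (x, y) = (1, 1)
```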

Lemma 4.26.

Let X = X′× X″ where X′ is a compact metric space, X″ is a closed bounded convex set in a reflexive Banach space \(\mathcal{X}\prime\prime\), and let Y be a compact metric space. Let \(J: X \times Y \rightarrow \mathbb{R} \cup \{ +\infty \}\) be a lower semicontinuous function on the topological product space X′× (X″,s) × Y, where s denotes the topology on X″ induced by the strong topology of \(\mathcal{X}\prime\prime\) . Suppose that J(x′,⋅,y) is convex for each fixed (x′,y) ∈ X′× Y.

Then the hypotheses of Lemma  4.25 are fulfilled, and argmin  J(⋅,⋅,⋅)≠∅.

Proof.

From the Banach–Alaoglu–Kakutani theorem, X″ is compact for the weak topology of \(\mathcal{X}\prime\prime\), denoted w. Thus X ×Y = (X′ ×X″) ×Y is compact in the topological product space \([X\prime \times (\mathcal{X}\prime\prime,w)] \times Y\). Let us show that J is sequentially lower semicontinuous on [X′ ×(X″, w X″ )] ×Y, where w X″ stands for the topology on X″ induced by the weak topology of \(\mathcal{X}\prime\prime\). Indeed, for any real α, let us denote

$$\displaystyle{SL_{\alpha } =\{ (x\prime,x\prime\prime,y) \in X\prime \times X\prime\prime \times Y \vert J(x\prime,x\prime\prime,y) \leq \alpha \}.}$$

Since J is lower semicontinuous on X′ ×(X″, s) ×Y, the set SL α is closed in X′ ×(X″, s) ×Y. Consider now a sequence \(((x\prime_{k},x\prime\prime_{k},y_{k}))_{k}\) in SL α convergent to some (x′, x″, y) in \(X\prime \times (\mathcal{X}\prime\prime,w) \times Y\). Since (x″ k ) converges weakly to x″, by Mazur's lemma [32, p. 6], there is a sequence \((\bar{x}\prime\prime_{k})\) converging to x″ in \((\mathcal{X}\prime\prime,s)\) such that, for any k, \(\bar{x}\prime\prime_{k}\) is a convex combination of the x″ k 's. Then, by the convexity of X″ and of J(x′ k ,  ⋅, y k ), we have \(\bar{x}\prime\prime_{k} \in X\prime\prime\) and

$$\displaystyle{J(x\prime_{k},\bar{x}\prime\prime_{k},y_{k}) \leq J(x\prime_{k},x\prime\prime_{k},y_{k}) \leq \alpha.}$$

Thus \((x\prime_{k},\bar{x}\prime\prime_{k},y_{k}) \in SL_{\alpha }\) and \((x\prime_{k},\bar{x}\prime\prime_{k},y_{k})\) converges to (x′, x″, y) in X′ ×(X″, s) ×Y; hence (x′, x″, y) ∈ SL α . Therefore SL α is sequentially closed in \(X\prime \times (\mathcal{X}\prime\prime,w) \times Y\); hence J is sequentially lower semicontinuous on \(X\prime \times (\mathcal{X}\prime\prime,w) \times Y\). Finally, by Weierstrass' theorem, we obtain that argmin J( ⋅,  ⋅,  ⋅) ≠ ∅.

Let now \(x = (x\prime,x\prime\prime) \in X = X\prime \times X\prime\prime\) be fixed. Since Y is compact and J(x,  ⋅) is lower semicontinuous on Y, we obtain from Weierstrass' theorem that argmin J(x,  ⋅) ≠ ∅. ■ 

Let \(\hat{J}_{l}: \Omega \times ]t_{0},T[\times \mathcal{U}_{l} \rightarrow \mathbb{R} \cup \{ +\infty \}\) be defined by

$$\displaystyle{ \hat{J}_{l}(\theta,t_{1},u_{l}):=\tilde{ J}_{l}(t_{1},u_{l},\mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )) = J_{l}(t_{1},u_{l},\mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot ),x(\theta,t_{1},u_{l},\cdot )). }$$
(4.58)

Theorem 4.27.

In addition to hypotheses \((\mathcal{P}\mathcal{A})\) we suppose that, for each t ∈ [t 0 ,T], f l (t,⋅,⋅,⋅) is a convex function.

Moreover we suppose the following hypothesis:

$$\displaystyle{ \mathrm{(Hf)}\quad \left \{\begin{array}{l} \text{there exist some }\alpha \in L_{\infty }([t_{0},T])\text{ and some real constant }\beta \text{ such that,} \\ \text{for almost all }t \in [t_{0},T]\text{ and for all }(u_{l},\mathbf{u_{f}},x) \in {\mathbb{R}}^{m_{l}} \times {\mathbb{R}}^{m_{f}} \times {\mathbb{R}}^{n}, \\ \quad \left \vert \nabla _{(u_{l},\mathbf{u_{f}},x)}f_{l}(t,u_{l},\mathbf{u_{f}},x)\right \vert \leq \alpha (t) +\beta \vert (u_{l},\mathbf{u_{f}},x)\vert.\end{array} \right. }$$
(4.59)

Then problem (OB) we has at least one solution and it is equivalent to the problem

$$\displaystyle{(P_{l})\qquad \min _{(\theta,t_{1},u_{l})\in \Theta _{we}\times \mathcal{T}\times \mathcal{U}_{l}}\hat{J}_{l}(\theta,t_{1},u_{l}).}$$

Proof.

We will show that all the hypotheses of Lemma 4.26 are fulfilled (denoting \(X\prime = \mathcal{T},\;X\prime\prime = \mathcal{U}_{l},Y = \Theta _{we}\), \(\mathcal{X}\prime\prime = L_{2}^{m_{l}}([t_{ 0},T])\), \(x\prime = t_{1},\;x\prime\prime = u_{l},y =\theta,\) \(J(x\prime,x\prime\prime,y) =\hat{ J}_{l}(\theta,t_{1},u_{l})\)), and then the conclusion follows from Lemma 4.25.

\(\mathcal{U}_{l}\) is (strongly) closed, bounded and convex in \(L_{2}^{m_{l}}([t_{ 0},T])\); \(\mathcal{T}\) and Θ we are compact. For fixed \((t_{1},\theta ) \in \mathcal{T} \times \Theta _{we}\), the function \(\hat{J}_{l}(\theta,t_{1},\cdot )\) is convex since, for any t ∈ [t 0, T], the function f l (t,  ⋅,  ⋅,  ⋅) is convex, and \(u_{l}\mapsto \mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )\) and \(u_{l}\mapsto x(\theta,t_{1},u_{l},\cdot )\) are affine functions by Proposition 4.23.

To finish the proof it is sufficient to show that \(\hat{J}_{l}\) is lower semicontinuous on \(\Theta _{we} \times \mathcal{T} \times \mathcal{U}_{l}\), where \(\mathcal{U}_{l}\) is endowed with the topology induced by the strong topology of \(L_{2}^{m_{l}}([t_{ 0},T])\). Let \({(\theta }^{k},t_{1}^{k},u_{l}^{k})_{k}\) be a sequence in \(\Theta _{we} \times \mathcal{T} \times \mathcal{U}_{l}\) which converges (strongly) to an element \((\bar{\theta },\bar{t}_{1},\bar{u}_{l})\). Since \(\Theta _{we} \times \mathcal{T} \times \mathcal{U}_{l}\) is closed we have \((\bar{\theta },\bar{t}_{1},\bar{u}_{l}) \in \Theta _{we} \times \mathcal{T} \times \mathcal{U}_{l}\).

We obtain from Lemma 4.4, Theorem 4.24 and (4.58) that, for each fixed \(t_{1} \in \mathcal{T}\), the function \(\hat{J}_{l}(\cdot,t_{1},\cdot )\) is lower semicontinuous. On the other hand we have

$$\displaystyle\begin{array}{rcl} \hat{J}_{l}{(\theta }^{k},t_{ 1}^{k},u_{ l}^{k})& =& \hat{J}_{ l}{(\theta }^{k},\bar{t}_{ 1},u_{l}^{k}) + (\hat{J}_{ l}{(\theta }^{k},t_{ 1}^{k},u_{ l}^{k}) -\hat{ J}_{ l}{(\theta }^{k},\bar{t}_{ 1},u_{l}^{k})), {}\\ \end{array}$$

and the term \((\hat{J}_{l}{(\theta }^{k},t_{1}^{k},u_{l}^{k}) -\hat{ J}_{l}{(\theta }^{k},\bar{t}_{1},u_{l}^{k}))\) tends to 0 as k → +∞. Indeed,

$$\displaystyle{ \begin{array}{ll} \hat{J}_{l}{(\theta }^{k},t_{1}^{k},u_{l}^{k}) -\hat{ J}_{l}{(\theta }^{k},\bar{t}_{1},u_{l}^{k}) =&\,\int _{t_{0}}^{t_{1}^{k} }f_{l}(t,u_{l}^{k}(t),\mathbf{u_{f}}{(\theta }^{k},t_{1}^{k},u_{l}^{k},t),x{(\theta }^{k},t_{1}^{k},u_{l}^{k},t))\mathrm{d}t \\ & -\int _{t_{0}}^{\bar{t}_{1}}f_{ l}(t,u_{l}^{k}(t),\mathbf{u_{f}}{(\theta }^{k},\bar{t}_{ 1},u_{l}^{k},t),x{(\theta }^{k},\bar{t}_{ 1},u_{l}^{k},t))\mathrm{d}t. \end{array} }$$
(4.60)

Since the sequence (u l k) is bounded in \(L_{2}^{m_{l}}([t_{0},T])\), by (Hf) and Theorem 4.24 there is a constant M > 0 such that, for all \(k \in \mathbb{N}\) and almost all t ∈ [t 0, T],

$$\displaystyle{\vert f_{l}(t,u_{l}^{k}(t),\mathbf{u_{ f}}{(\theta }^{k},t_{ 1}^{k},u_{ l}^{k},t),x{(\theta }^{k},t_{ 1}^{k},u_{ l}^{k},t))\vert \leq M}$$

and

$$\displaystyle{\vert f_{l}(t,u_{l}^{k}(t),\mathbf{u_{ f}}{(\theta }^{k},\bar{t}_{ 1},u_{l}^{k},t),x{(\theta }^{k},\bar{t}_{ 1},u_{l}^{k},t))\vert \leq M.}$$

Finally, let us show that both integrals in (4.60) have the same limit as k → +∞, which is \(\int _{t_{0}}^{\bar{t}_{1} }f_{l}(t,\bar{u}_{l}(t),\mathbf{u_{f}}(\bar{\theta },\bar{t}_{1},\bar{u}_{l},t),x(\bar{\theta },\bar{t}_{1},\bar{u}_{l},t))\mathrm{d}t\). To do this it is sufficient to prove that these convergences hold for a subsequence. Since (u l k) converges in \(L_{2}^{m_{l}}([t_{0},T]),\) there exists a subsequence \((u_{l}^{k\prime})_{k\prime}\) such that \((u_{l}^{k\prime}(t))_{k\prime}\) converges to \(\bar{u}_{l}(t)\) a.e. on [t 0, T]. Then, we can apply Lebesgue's dominated convergence theorem to obtain the last claim.

Therefore, using the fact that for each \(t_{1} \in \mathcal{T}\) the function \(\hat{J}_{l}(\cdot,t_{1},\cdot )\) is lower semicontinuous, we obtain

$$\displaystyle{\liminf _{k\rightarrow +\infty }\hat{J}_{l}{(\theta }^{k},t_{1}^{k},u_{l}^{k}) =\liminf _{k\rightarrow +\infty }\hat{J}_{l}{(\theta }^{k},\bar{t}_{1},u_{l}^{k}) \geq \hat{ J}_{l}(\bar{\theta },\bar{t}_{1},\bar{u}_{l}).}$$

 ■ 

We denote \((f_{l})\prime_{u_{l}}(\cdot,\cdot,\cdot,\cdot ): [t_{0},T]\times {\mathbb{R}}^{m_{l}}\times {\mathbb{R}}^{m_{f}}\times {\mathbb{R}}^{n} \rightarrow {\mathbb{R}}^{m_{l}},\;(f_{ l})\prime_{\mathbf{u_{f}}}(\cdot,\cdot,\cdot,\cdot ): [t_{0},T]\times {\mathbb{R}}^{m_{l}}\times {\mathbb{R}}^{m_{f}}\times {\mathbb{R}}^{n} \rightarrow {\mathbb{R}}^{m_{f}},\;(f_{ l})\prime_{x}(\cdot,\cdot,\cdot,\cdot ): [t_{0},T]\times {\mathbb{R}}^{m_{l}}\times {\mathbb{R}}^{m_{f}}\times {\mathbb{R}}^{n} \rightarrow {\mathbb{R}}^{n}\) the partial derivatives of f l with respect to the variables in the second, third and fourth positions, respectively.

Also, let us denote for all \((\theta,t_{1},t,s) \in \Omega \times ]t_{0},T[\times [t_{0},T] \times [t_{0},T]\),

$$\displaystyle\begin{array}{rcl} X(\theta,t_{1},t,s)& =& \,\Big[\chi _{[t_{0},t_{1}]}(s)\Psi _{12}(\theta,t,t_{0})\Xi (\theta,t_{1},t_{0},s)P(\theta,t_{1},s) \\ & +& \chi _{[t_{0},t]}(s)\Psi _{11}(\theta,t,s)\Big]B_{l}(s) {}\end{array}$$
(4.61)
$$\displaystyle\begin{array}{rcl} Y (\theta,t_{1},t,s)& =& \,-{R}^{-1}(\theta,t)\mathbf{B_{ f}}{(t)}^{T}\Big[\chi _{ [t_{0},t_{1}]}(s)\Psi _{22}(\theta,t,t_{0})\Xi (\theta,t_{1},t_{0},s)P(\theta,t_{1},s) \\ & +& \chi _{[t_{0},t]}(s)\Psi _{21}(\theta,t,s)\Big]B_{l}(s), {}\end{array}$$
(4.62)

where \(\chi _{[t_{0},t]}: [t_{0},T] \rightarrow \mathbb{R}\) is the characteristic function

$$\displaystyle{ \chi _{[t_{0},t]}(s) = \left \{\begin{array}{ll} 1&\mbox{ if }s \in [t_{0},t],\\ 0 &\mbox{ otherwise.} \end{array} \right. }$$
(4.63)

Thus, formulas (4.52) and (4.54) become

$$\displaystyle\begin{array}{rcl} \frac{\partial } {\partial u_{l}}x(\theta,t_{1},u_{l},\cdot ) \cdot h& =& \int _{t_{0}}^{T}X(\theta,t_{ 1},\cdot,s)h(s)\mathrm{d}s,{}\end{array}$$
(4.64)
$$\displaystyle\begin{array}{rcl} \frac{\partial } {\partial u_{l}}\mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot ) \cdot h& =& \int _{t_{0}}^{T}Y (\theta,t_{ 1},\cdot,s)h(s)\mathrm{d}s.{}\end{array}$$
(4.65)
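Numerically, (4.64) and (4.65) are simply applications of integral operators with kernels X and Y. A minimal quadrature sketch of (4.64), with a hypothetical scalar kernel mimicking the characteristic-function structure of (4.61):

```python
import numpy as np

# Sketch: evaluating the Frechet differential (4.64) by a midpoint rule.
# The kernel below is a hypothetical scalar stand-in for X(theta, t1, t, s);
# note the factor (s <= t), echoing chi_[t0,t](s) in (4.61).
t0, T, N = 0.0, 1.0, 400
s = np.linspace(t0, T, N, endpoint=False) + (T - t0) / (2 * N)  # midpoints
w = (T - t0) / N                                                # quadrature weight

def X_kernel(t, s):
    return np.exp(-(t - s)) * (s <= t)

h = np.sin(2 * np.pi * s)                # a direction h in L2([t0, T])
t_grid = np.linspace(t0, T, 11)
dx = np.array([w * np.sum(X_kernel(t, s) * h) for t in t_grid])
print(dx)                                # values of ((d/du_l) x . h)(t)
```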

The next result is needed to ensure the differentiability of \(\hat{J}_{l}\).

Lemma 4.28.

Suppose that f l satisfies the hypothesis (Hf) given in Theorem  4.27 , in addition to the hypothesis \((\mathcal{P}\mathcal{A})\) . Then, for each fixed t 1 ∈]t 0 ,T[, the functional \(\hat{J}_{l}(\cdot,t_{1},\cdot ): \Omega \times L_{2}^{m_{l}}([t_{ 0},T]) \rightarrow \mathbb{R}\) is well defined and continuously Fréchet differentiable. Its partial derivatives with respect to θ i , i = 1,…,p are given by

$$\displaystyle\begin{array}{rcl} \frac{\partial \hat{J}_{l}} {\partial \theta _{i}} (\theta,t_{1},u_{l})& =& \int _{t_{0}}^{t_{1} }(f_{l})\prime_{\mathbf{u_{f}}}{(t,u_{l}(t),\mathbf{u_{f}}(\theta,t_{1},u_{l},t),x(\theta,t_{1},u_{l},t))}^{T}\frac{\partial \mathbf{u_{f}}} {\partial \theta _{i}} (\theta,t_{1},u_{l},t)\mathrm{d}t \\ & +& \int _{t_{0}}^{t_{1} }(f_{l})\prime_{x}{(t,u_{l}(t),\mathbf{u_{f}}(\theta,t_{1},u_{l},t),x(\theta,t_{1},u_{l},t))}^{T}\frac{\partial x} {\partial \theta _{i}} (\theta,t_{1},u_{l},t)\mathrm{d}t.{}\end{array}$$
(4.66)

Its partial Fréchet gradient with respect to u l at (θ,t 1 ,u l ) is given, for almost all s ∈ [t 0 ,t 1 ], by

$$\displaystyle\begin{array}{rcl} \nabla _{u_{l}}\hat{J}_{l}(\theta,t_{1},u_{l})(s)& =& (f_{l})\prime_{u_{l}}(s,u_{l}(s),\mathbf{u_{f}}(\theta,t_{1},u_{l},s),x(\theta,t_{1},u_{l},s)) \\ & +& \int _{t_{0}}^{T}{Y }^{T}(\theta,t_{1},t,s)(f_{l})\prime_{\mathbf{u_{f}}}(t,u_{l}(t),\mathbf{u_{f}}(\theta,t_{1},u_{l},t),x(\theta,t_{1},u_{l},t))\mathrm{d}t \\ & +& \int _{t_{0}}^{T}{X}^{T}(\theta,t_{1},t,s)(f_{l})\prime_{x}(t,u_{l}(t),\mathbf{u_{f}}(\theta,t_{1},u_{l},t),x(\theta,t_{1},u_{l},t))\mathrm{d}t.{}\end{array}$$
(4.67)

Moreover, for each fixed \((\theta,u_{l}) \in \Omega \times L_{2}^{m_{l}}([t_{ 0},T])\) , the function \(\hat{J}_{l}(\theta,\cdot,u_{l}) \in H_{1}([t_{0},T])\) , and for almost all t 1 ∈]t 0 ,T[, its derivative is given by

$$\displaystyle\begin{array}{rcl} \frac{\partial \hat{J}_{l}} {\partial t_{1}} (\theta,t_{1},u_{l})& =& f_{l}(t_{1},u_{l}(t_{1}),\mathbf{u_{f}}(\theta,t_{1},u_{l},t_{1}),x(\theta,t_{1},u_{l},t_{1})) \\ & +& \int _{t_{0}}^{t_{1} }(f_{l})\prime_{\mathbf{u_{f}}}{(t,u_{l}(t),\mathbf{u_{f}}(\theta,t_{1},u_{l},t),x(\theta,t_{1},u_{l},t))}^{T}\frac{\partial \mathbf{u_{f}}} {\partial t_{1}} (\theta,t_{1},u_{l},t)\mathrm{d}t \\ & +& \int _{t_{0}}^{t_{1} }(f_{l})\prime_{x}{(t,u_{l}(t),\mathbf{u_{f}}(\theta,t_{1},u_{l},t),x(\theta,t_{1},u_{l},t))}^{T} \frac{\partial x} {\partial t_{1}}(\theta,t_{1},u_{l},t)\mathrm{d}t.{}\end{array}$$
(4.68)

In particular, at each point t 1 such that u l is continuous at t 1 (see Theorem  4.24 , point 3), the real-valued function \(t\mapsto \hat{J}_{l}(\theta,t,u_{l})\) is differentiable at t 1 .

Proof.

By [4, Example 2, p. 20] we have that the functional \(J_{l}(t_{1},\cdot,\cdot,\cdot ): L_{2}^{m_{l}}([t_{ 0},T]) \times L_{2}^{m_{f}}([t_{ 0},T]) \times H_{1}^{n}([t_{ 0},T]) \rightarrow \mathbb{R}\) is well defined and is continuously Fréchet differentiable for each fixed t 1 ∈ ]t 0, T[. Moreover, its partial derivatives satisfy, for all \((t_{1},u_{l},\mathbf{u_{f}},x) \in ]t_{0},T[\times L_{2}^{m_{l}}([t_{ 0},T]) \times L_{2}^{m_{f}}([t_{ 0},T]) \times H_{1}^{n}([t_{ 0},T])\), the following equations:

$$\displaystyle\begin{array}{rcl} \frac{\partial J_{l}} {\partial u_{l}}(t_{1},u_{l},\mathbf{u_{f}},x) \cdot v& =& \int _{t_{0}}^{t_{1} }(f_{l})\prime_{u_{l}}{(t,u_{l}(t),\mathbf{u_{f}}(t),x(t))}^{T}v(t)\mathrm{d}t\quad \forall v \in L_{ 2}^{m_{l} }([t_{0},T]), {}\\ \frac{\partial J_{l}} {\partial \mathbf{u_{f}}}(t_{1},u_{l},\mathbf{u_{f}},x) \cdot w& =& \int _{t_{0}}^{t_{1} }(f_{l})\prime_{\mathbf{u_{f}}}{(t,u_{l}(t),\mathbf{u_{f}}(t),x(t))}^{T}w(t)\mathrm{d}t\quad \forall w \in L_{ 2}^{m_{f} }([t_{0},T]), {}\\ \frac{\partial J_{l}} {\partial x} (t_{1},u_{l},\mathbf{u_{f}},x) \cdot z& =& \int _{t_{0}}^{t_{1} }(f_{l})\prime_{x}{(t,u_{l}(t),\mathbf{u_{f}}(t),x(t))}^{T}z(t)\mathrm{d}t\quad \forall z \in H_{ 1}^{n}([t_{ 0},T]). {}\\ \end{array}$$

Also, for each fixed \((u_{l},\mathbf{u_{f}},x) \in L_{2}^{m_{l}}([t_{ 0},T]) \times L_{2}^{m_{f}}([t_{ 0},T]) \times H_{1}^{n}([t_{ 0},T])\) and for almost all t 1 ∈ ]t 0, T],

$$\displaystyle{ \frac{\partial J_{l}} {\partial t_{1}} (t_{1},u_{l},\mathbf{u_{f}},x) = f_{l}(t_{1},u_{l}(t_{1}),\mathbf{u_{f}}(t_{1}),x(t_{1})). }$$

Let us identify, using the Riesz–Fréchet theorem, the Hilbert spaces \(L_{2}^{m_{l}}([t_{0},T])\), \(L_{2}^{m_{f}}([t_{0},T])\) and \(L_{2}^{n}([t_{0},T])\) with their duals, but let us not identify \(H_{1}^{n}([t_{0},T])\) with its dual \(H_{1}^{n}{([t_{0},T])}^{{\ast}}\). Based on the fact that (see [21, pp. 81–82] for details)

$$\displaystyle{H_{1}^{n}([t_{ 0},T]) \subset L_{2}^{n}([t_{ 0},T]) \equiv L_{2}^{n}{([t_{ 0},T])}^{{\ast}}\subset H_{ 1}^{n}{([t_{ 0},T])}^{{\ast}}}$$

and that both embeddings are continuous and dense, and that the duality product between \(H_{1}^{n}([t_{0},T])\) and \(H_{1}^{n}{([t_{0},T])}^{{\ast}}\) coincides with the inner product in \(L_{2}^{n}([t_{0},T])\) on \(H_{1}^{n}([t_{0},T]) \times L_{2}^{n}([t_{0},T])\), we have that the Fréchet gradients \(\nabla _{u_{l}}J_{l}(t_{1},u_{l},\mathbf{u_{f}},x) \in L_{2}^{m_{l}}([t_{0},T])\), \(\nabla _{\mathbf{u_{f}}}J_{l}(t_{1},u_{l},\mathbf{u_{f}},x) \in L_{2}^{m_{f}}([t_{0},T])\) and \(\nabla _{x}J_{l}(t_{1},u_{l},\mathbf{u_{f}},x) \in L_{2}^{n}([t_{0},T])\) are given for almost all t ∈ [t 0, T] by

$$\displaystyle\begin{array}{rcl} \nabla _{u_{l}}J_{l}(t_{1},u_{l},\mathbf{u_{f}},x)(t)& = \left \{\begin{array}{@{}l@{\quad }l@{}} (f_{l})\prime_{u_{l}}(t,u_{l}(t),\mathbf{u_{f}}(t),x(t)),\,\mbox{ if }\;t \in [t_{0},t_{1}],\quad \\ 0,\,\mbox{ if }\;t \in \left ]t_{1},T\right ], \quad \end{array} \right.& {}\\ \nabla _{\mathbf{u_{f}}}J_{l}(t_{1},u_{l},\mathbf{u_{f}},x)(t)& = \left \{\begin{array}{@{}l@{\quad }l@{}} (f_{l})\prime_{\mathbf{u_{f}}}(t,u_{l}(t),\mathbf{u_{f}}(t),x(t)),\,\mbox{ if }\;t \in [t_{0},t_{1}],\quad \\ 0,\,\mbox{ if }\;t \in \left ]t_{1},T\right ], \quad \end{array} \right.& {}\\ \nabla _{x}J_{l}(t_{1},u_{l},\mathbf{u_{f}},x)(t)& = \left \{\begin{array}{@{}l@{\quad }l@{}} (f_{l})\prime_{x}(t,u_{l}(t),\mathbf{u_{f}}(t),x(t)),\,\mbox{ if }\;t \in [t_{0},t_{1}],\quad \\ 0,\,\mbox{ if }\;t \in \left ]t_{1},T\right ], \quad \end{array} \right.& {}\\ \end{array}$$

Now, using the chain rule in (4.58), we obtain immediately (4.66) and (4.68) and also

$$\displaystyle\begin{array}{rcl} \nabla _{u_{l}}\hat{J}_{l}(\theta,t_{1},u_{l})(t)& =& (f_{l})\prime_{u_{l}}(t,u_{l}(t),\mathbf{u_{f}}(\theta,t_{1},u_{l},t),x(\theta,t_{1},u_{l},t)) \\ & +&{ \left ( \frac{\partial } {\partial u_{l}}\mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )\right )}^{{\ast}}(f_{ l})\prime_{\mathbf{u_{f}}}(t,u_{l}(t),\mathbf{u_{f}}(\theta,t_{1},u_{l},t),x(\theta,t_{1},u_{l},t)) \\ & +&{ \left ( \frac{\partial } {\partial u_{l}}x(\theta,t_{1},u_{l},\cdot )\right )}^{{\ast}}(f_{ l})\prime_{x}(t,u_{l}(t),\mathbf{u_{f}}(\theta,t_{1},u_{l},t),x(\theta,t_{1},u_{l},t)),{}\end{array}$$
(4.69)

and, for almost all t ∈ ]t 1, T],  \(\nabla _{u_{l}}\hat{J}_{l}(\theta,t_{1},u_{l})(t) = 0,\) where M  ∗  stands for the adjoint operator of a linear continuous operator M between two Hilbert spaces.

Fix \((\theta,t_{1},u_{l}) \in \Omega \times ]t_{0},T[\times L_{2}^{m_{l}}([t_{0},T])\). Since the embedding \(H_{1}^{n}([t_{0},T]) \subset L_{2}^{n}([t_{0},T])\) is continuous, we can consider the partial Fréchet derivative \(\frac{\partial } {\partial u_{l}}x(\theta,t_{1},u_{l},\cdot )\) as a linear continuous operator from \(L_{2}^{m_{l}}([t_{0},T])\) to \(L_{2}^{n}([t_{0},T])\). Denote by ⟨ ⋅,  ⋅⟩ n the inner product in \(L_{2}^{n}([t_{0},T])\). For all \(h \in L_{2}^{m_{l}}([t_{0},T]),\;k \in L_{2}^{n}([t_{0},T])\) we have

$$\displaystyle\begin{array}{rcl} \langle \frac{\partial } {\partial u_{l}}x(\theta,t_{1},u_{l},\cdot )h,k\rangle _{n}& =& \int _{t_{0}}^{T}{k}^{T}(t)\left (\int _{t_{0}}^{T}X(\theta,t_{1},t,s)h(s)\mathrm{d}s\right )\mathrm{d}t {}\\ & =& \int _{t_{0}}^{T}{h}^{T}(s)\left (\int _{t_{0}}^{T}{X}^{T}(\theta,t_{1},t,s)k(t)\mathrm{d}t\right )\mathrm{d}s {}\\ & =& \langle h,{\left ( \frac{\partial } {\partial u_{l}}x(\theta,t_{1},u_{l},\cdot )\right )}^{{\ast}}k\rangle _{m_{l}}; {}\\ \end{array}$$

hence

$$\displaystyle{{ \left ( \frac{\partial } {\partial u_{l}}x(\theta,t_{1},u_{l},\cdot )\right )}^{{\ast}}\cdot k =\int _{t_{0}}^{T}{X}^{T}(\theta,t_{1},t,\cdot )k(t)\mathrm{d}t. }$$
(4.70)

In the same way we get for all \(k \in L_{2}^{m_{f}}([t_{ 0},T])\)

$$\displaystyle{{ \left ( \frac{\partial } {\partial u_{l}}\mathbf{u_{f}}(\theta,t_{1},u_{l},\cdot )\right )}^{{\ast}}\cdot k =\int _{t_{0}}^{T}{Y }^{T}(\theta,t_{1},t,\cdot )k(t)\mathrm{d}t. }$$
(4.71)

Finally (4.67) follows from (4.69). ■ 
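The adjoint computations (4.70) and (4.71) have a transparent discrete counterpart: after quadrature, the operator h ↦ ∫X(t,s)h(s)ds becomes a matrix, and its L 2 -adjoint applies the transposed kernel. A sketch with a placeholder kernel:

```python
import numpy as np

# Sketch of (4.70) after discretisation: (Mh)(t) = sum_s w * X(t,s) h(s), and
# the L2-adjoint M* applies the transposed kernel, so <Mh, k> = <h, M*k>.
# The kernel is a placeholder, not the X of (4.61).
N = 150
w = 1.0 / N
t = np.linspace(0.0, 1.0, N)
X = np.sin(t[:, None] + 2.0 * t[None, :])   # placeholder for X(theta, t1, t, s)

h = np.cos(3.0 * t)
k = np.exp(-t)
lhs = w * np.dot(w * (X @ h), k)            # <Mh, k>_n
rhs = w * np.dot(h, w * (X.T @ k))          # <h, M*k>_{m_l}
print(abs(lhs - rhs))                       # zero up to rounding
```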

Theorem 4.29 (First-order necessary conditions when the final time is fixed, i.e. \(\mathcal{T} =\{ t_{1}\}\)). 

Suppose that \(\mathcal{T} =\{ t_{1}\}\) , and f l satisfies hypotheses \((\mathcal{P}\mathcal{A})\) , (Hf), and f l (t,⋅,⋅,⋅) is convex for all t ∈ [t 0 ,T].

Let \((\bar{\theta },\bar{u}_{l}) \in \Theta _{we} \times \mathcal{U}_{l}\) solve (OB) we . Then there are nonnegative real numbers μ,l 1 ,…,l p and a real number ν such that

$$\displaystyle\begin{array}{rcl} \nabla _{u_{l}}\hat{J}_{l}(\bar{\theta },t_{1},\bar{u}_{l})(t) +\mu \bar{ u}_{l}(t)& =& 0\quad \quad \mathrm{a.e.\ on}\ [t_{0},T],{}\end{array}$$
(4.72)
$$\displaystyle\begin{array}{rcl} \frac{\partial \hat{J}_{l}} {\partial \theta _{i}} (\bar{\theta },t_{1},\bar{u}_{l}) - l_{i}+\nu & =& 0,\quad \quad \quad i = 1,\ldots,p,{}\end{array}$$
(4.73)
$$\displaystyle\begin{array}{rcl} \mu (\|\bar{u}_{l}\|_{2} - R)& =& 0,{}\end{array}$$
(4.74)
$$\displaystyle\begin{array}{rcl} l_{i}\bar{\theta }_{i}& =& 0,\quad \quad \quad i = 1,\ldots,p,{}\end{array}$$
(4.75)

and of course

$$\displaystyle\begin{array}{rcl} \sum _{i=1}^{p}\bar{\theta }_{ i}& =& 1,{}\end{array}$$
(4.76)
$$\displaystyle\begin{array}{rcl} \|\bar{u}_{l}\|_{2} \leq R,\quad \bar{\theta }_{i}& \geq & 0,\quad \quad \quad i = 1,\ldots,p.{}\end{array}$$
(4.77)

Remark 4.30.

According to (4.67), equation (4.72) is a Fredholm integral equation in the unknown \(\bar{u}_{l}\) (linear if f l (t,  ⋅,  ⋅,  ⋅) is quadratic, a case which satisfies hypothesis (Hf)), depending on the p + 1 parameters μ and \(\bar{\theta }_{1},\ldots,\bar{\theta }_{p}\). Assuming that we are able to solve this integral equation, (4.73)–(4.76) represent a nonlinear system of 2p + 2 equations in the 2p + 2 unknowns μ, ν, θ i , l i . A similar remark applies to the next theorem.
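For illustration, a Nyström (quadrature) discretisation reduces such a linear Fredholm equation of the second kind to a finite linear system; the kernel K and right-hand side g below are placeholders, not the data arising from (4.67):

```python
import numpy as np

# Sketch: Nystrom discretisation of a linear Fredholm equation of the second
# kind, mu*u(s) + int K(s,t) u(t) dt = g(s) -- the structure Remark 4.30
# attributes to (4.72) when f_l(t,.,.,.) is quadratic.
t0, T, N = 0.0, 1.0, 200
mu = 0.5
t = np.linspace(t0, T, N)
w = (T - t0) / N                               # rectangle-rule weight
K = np.exp(-np.abs(t[:, None] - t[None, :]))   # placeholder kernel
g = np.cos(np.pi * t)                          # placeholder right-hand side

u = np.linalg.solve(mu * np.eye(N) + w * K, g) # discretised optimality condition
print(u[:5])
```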

Theorem 4.31 (First-order necessary conditions when the final time \(t_{1} \in \mathcal{T} = [\underline{t},\overline{t}\,] \subset ]t_{0},T[\)). 

Suppose that f l satisfies hypotheses \((\mathcal{P}\mathcal{A})\) , (Hf) and f l (t,⋅,⋅,⋅) is convex for all t ∈ [t 0 ,T].

Let \((\bar{t}_{1},\bar{\theta },\bar{u}_{l}) \in \mathcal{T} \times \Theta _{we} \times \mathcal{U}_{l}\) solve (OB) we . Suppose that \(\bar{u}_{l}\) is continuous at \(\bar{t}_{1}\) (see Theorem  4.24 , point 3). Then there are nonnegative real numbers \(\mu,l_{1},\ldots,l_{p},l_{p+1},l_{p+2}\) and a real number ν such that

$$\displaystyle\begin{array}{rcl} \nabla _{u_{l}}\hat{J}_{l}(\bar{\theta },\bar{t}_{1},\bar{u}_{l})(t) +\mu \bar{ u}_{l}(t)& =& 0\quad \quad \mathrm{a.e.\ on}\ [t_{0},T],{}\end{array}$$
(4.78)
$$\displaystyle\begin{array}{rcl} \frac{\partial \hat{J}_{l}} {\partial \theta _{i}} (\bar{\theta },\bar{t}_{1},\bar{u}_{l}) - l_{i}+\nu & =& 0,\quad \quad i = 1,\ldots,p,{}\end{array}$$
(4.79)
$$\displaystyle\begin{array}{rcl} \frac{\partial \hat{J}_{l}} {\partial t_{1}} (\bar{\theta },\bar{t}_{1},\bar{u}_{l}) - l_{p+1} + l_{p+2}& =& 0,{}\end{array}$$
(4.80)
$$\displaystyle\begin{array}{rcl} \mu (\|\bar{u}_{l}\|_{2} - R)& =& 0,{}\end{array}$$
(4.81)
$$\displaystyle\begin{array}{rcl} l_{i}\bar{\theta }_{i}& =& 0,\quad \quad i = 1,\ldots,p,{}\end{array}$$
(4.82)
$$\displaystyle\begin{array}{rcl} l_{p+1}(\bar{t}_{1} -\underline{ t})& =& 0,{}\end{array}$$
(4.83)
$$\displaystyle\begin{array}{rcl} l_{p+2}(\overline{t} -\bar{ t}_{1})& =& 0,{}\end{array}$$
(4.84)

and of course

$$\displaystyle\begin{array}{rcl} \sum _{i=1}^{p}\bar{\theta }_{ i}& =& 1,{}\end{array}$$
(4.85)
$$\displaystyle\begin{array}{rcl} \|\bar{u}_{l}\|_{2} \leq R,\quad \bar{\theta }_{i}& \geq & 0,\quad \quad \quad i = 1,\ldots,p.{}\end{array}$$
(4.86)

The proof of Theorems 4.29 and 4.31 is a direct application of the generalized Lagrange multiplier rule under the Kurcyusz–Robinson–Zowe regularity condition (see [39, Theorem 5.3]) and is based on Theorem 4.27 and Lemma 4.28.

4.6.2 The Pessimistic Bilevel Problem

In this section we assume that f l (t,  ⋅,  ⋅,  ⋅) is quadratic, i.e. for all \((t,u_{l},\mathbf{u_{f}},x) \in [t_{0},T] \times {\mathbb{R}}^{m_{l}} \times {\mathbb{R}}^{m_{f}} \times {\mathbb{R}}^{n}\),

$$\displaystyle{ f_{l}(t,u_{l},\mathbf{u_{f}},x) = u_{l}^{T}S_{ l}(t)u_{l} +{ \mathbf{u_{f}}}^{T}R_{ l}(t)\mathbf{u_{f}} + {x}^{T}Q_{ l}(t)x, }$$
(4.87)

where \(S_{l}(\cdot ),R_{l}(\cdot ),Q_{l}(\cdot )\) are continuous symmetric matrix-valued functions. Note that this function satisfies hypotheses \((\mathcal{P}\mathcal{A})\) and (Hf).

According to [4, Example 3, p. 14] the functional \(J_{l}(t_{1},\cdot,\cdot,\cdot ): L_{2}^{m_{l}}([t_{0},T]) \times L_{2}^{m_{f}}([t_{0},T]) \times H_{1}^{n}([t_{0},T]) \rightarrow \mathbb{R}\) is well defined and continuous. Therefore, by Theorem 4.24, the functional \(\hat{J}_{l}(\cdot,\cdot,\cdot )\) has finite values and is continuous on \(\Theta _{we} \times \mathcal{T} \times \mathcal{U}_{l}\).

Moreover, since Θ we is compact, the pessimistic problem (PB) we can be written as

$$\displaystyle{\min _{(t_{1},u_{l})\in \mathcal{T}\times \mathcal{U}_{l}}\max _{\theta \in \Theta _{we}}\hat{J}_{l}(\theta,t_{1},u_{l}).}$$
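After discretisation, this min–max structure can be treated by an inner maximisation over the compact simplex Θ we and an outer minimisation over the ball \(\mathcal{U}_{l}\). A toy sketch with p = 2, a scalar control and a placeholder objective standing in for \(\hat{J}_{l}\):

```python
import numpy as np

# Toy discretised min-max sketch of the pessimistic problem above (p = 2,
# scalar control): inner max over the simplex Theta_we, outer min over the
# ball |u| <= R. J_toy is a placeholder, not the composed objective (4.58).
def J_toy(theta, u):
    return (u - theta[0]) ** 2 + 0.5 * theta[1] * u

thetas = [(a, 1.0 - a) for a in np.linspace(0.0, 1.0, 101)]  # simplex grid
us = np.linspace(-1.0, 1.0, 201)                             # R = 1

worst = lambda u: max(J_toy(th, u) for th in thetas)         # inner maximisation
u_best = min(us, key=worst)                                  # outer minimisation
print(u_best, worst(u_best))
```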

Theorem 4.32 (First-order necessary conditions when the final time is fixed, i.e. \(\mathcal{T} =\{ t_{1}\}\)). 

Suppose that \(\mathcal{T} =\{ t_{1}\}\) .

Let \((\bar{\theta },\bar{u}_{l}) \in \Theta _{we} \times \mathcal{U}_{l}\) solve (PB) we . Then there are nonnegative real numbers μ,l 1 ,…,l p and a real number ν such that

$$\displaystyle\begin{array}{rcl} \nabla _{u_{l}}\hat{J}_{l}(\bar{\theta },t_{1},\bar{u}_{l})(t) +\mu \bar{ u}_{l}(t)& =& 0\quad \quad \mathrm{a.e.\ on}\ [t_{0},T],{}\end{array}$$
(4.88)
$$\displaystyle\begin{array}{rcl} \frac{\partial \hat{J}_{l}} {\partial \theta _{i}} (\bar{\theta },t_{1},\bar{u}_{l}) + l_{i}+\nu & =& 0,\quad \quad i = 1,\ldots,p,{}\end{array}$$
(4.89)
$$\displaystyle\begin{array}{rcl} \mu (\|\bar{u}_{l}\|_{2} - R)& =& 0,{}\end{array}$$
(4.90)
$$\displaystyle\begin{array}{rcl} l_{i}\bar{\theta }_{i}& =& 0,\quad \quad i = 1,\ldots,p,{}\end{array}$$
(4.91)

and of course

$$\displaystyle\begin{array}{rcl} \sum _{i=1}^{p}\bar{\theta }_{ i}& =& 1,{}\end{array}$$
(4.92)
$$\displaystyle\begin{array}{rcl} \|\bar{u}_{l}\|_{2} \leq R,\quad \bar{\theta }_{i}& \geq & 0,\quad \quad i = 1,\ldots,p.{}\end{array}$$
(4.93)

Proof.

We have that \(\bar{\theta }\) is a maximizer of \(\hat{J}_{l}(\cdot,t_{1},\bar{u}_{l})\) over Θ we . By the Karush–Kuhn–Tucker theorem, since on Θ we the gradients of the active constraints are linearly independent (hence the Mangasarian–Fromowitz regularity condition holds), and based on Lemma 4.28, we obtain that there are nonnegative reals \(l_{1},\ldots,l_{p}\) and a real ν such that (4.89) and (4.91) hold, and of course (4.92) and (4.93).

Moreover, \(\bar{u}_{l}\) is a minimizer of \(\hat{J}_{l}(\bar{\theta },t_{1},\cdot )\) over the ball \(\mathcal{U}_{l}\). By the generalized Lagrange multiplier rule under the Kurcyusz–Robinson–Zowe regularity condition (see [39, Theorem 5.3]), and based on Lemma 4.28, we obtain (4.88) and (4.90). ■ 

Theorem 4.33 (First-order necessary conditions when the final time \(t_{1} \in \mathcal{T} = \left [\underline{t},\overline{t}\right ] \subset \left ]t_{0},T\right [\)). 

Let \((\bar{t}_{1},\bar{\theta },\bar{u}_{l}) \in \mathcal{T} \times \Theta _{we} \times \mathcal{U}_{l}\) solve (PB) we . Suppose that \(\bar{u}_{l}\) is continuous at \(\bar{t}_{1}\) (see Theorem  4.24 , point 3). Then there are nonnegative real numbers \(\mu,l_{1},\ldots,l_{p},l_{p+1},l_{p+2}\) and a real number ν such that

$$\displaystyle\begin{array}{rcl} \nabla _{u_{l}}\hat{J}_{l}(\bar{\theta },\bar{t}_{1},\bar{u}_{l})(t) +\mu \bar{ u}_{l}(t)& =& 0\quad \quad \mathrm{a.e.\ on}\ [t_{0},T],{}\end{array}$$
(4.94)
$$\displaystyle\begin{array}{rcl} \frac{\partial \hat{J}_{l}} {\partial \theta _{i}} (\bar{\theta },\bar{t}_{1},\bar{u}_{l}) + l_{i}+\nu & =& 0,\quad \quad i = 1,\ldots,p,{}\end{array}$$
(4.95)
$$\displaystyle\begin{array}{rcl} \frac{\partial \hat{J}_{l}} {\partial t_{1}} (\bar{\theta },\bar{t}_{1},\bar{u}_{l}) - l_{p+1} + l_{p+2}& =& 0,{}\end{array}$$
(4.96)
$$\displaystyle\begin{array}{rcl} \mu (\|\bar{u}_{l}\|_{2} - R)& =& 0,{}\end{array}$$
(4.97)
$$\displaystyle\begin{array}{rcl} l_{i}\bar{\theta }_{i}& =& 0,\quad \quad i = 1,\ldots,p,{}\end{array}$$
(4.98)
$$\displaystyle\begin{array}{rcl} l_{p+1}(\bar{t}_{1} -\underline{ t})& =& 0,{}\end{array}$$
(4.99)
$$\displaystyle\begin{array}{rcl} l_{p+2}(\overline{t} -\bar{ t}_{1})& =& 0,{}\end{array}$$
(4.100)

and of course

$$\displaystyle\begin{array}{rcl} \sum _{i=1}^{p}\bar{\theta }_{ i}& =& 1,{}\end{array}$$
(4.101)
$$\displaystyle\begin{array}{rcl} \|\bar{u}_{l}\|_{2} \leq R,\quad \bar{\theta }_{i}& \geq & 0,\quad \quad i = 1,\ldots,p.{}\end{array}$$
(4.102)

The proof is analogous to that of Theorem 4.32.

Remark 4.34.

A comment similar to Remark 4.30 applies to the last two theorems. Moreover, in this case the computation of the partial derivatives and gradients in Lemma 4.28 is simplified since, by (4.87), we have

$$\displaystyle{\begin{array}{ll} &(f_{l})\prime_{u_{l}}(t,u_{l},\mathbf{u_{f}},x) = 2u_{l}^{T}S_{l}(t), \\ &(f_{l})\prime_{\mathbf{u_{f}}}(t,u_{l},\mathbf{u_{f}},x) = 2{\mathbf{u_{f}}}^{T}R_{l}(t), \\ &(f_{l})\prime_{x}(t,u_{l},\mathbf{u_{f}},x) = 2{x}^{T}Q_{l}(t).\end{array} }$$
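These expressions follow directly from (4.87); as a sanity check, here is a finite-difference sketch with frozen, hypothetical symmetric matrices in place of S l (t), R l (t), Q l (t):

```python
import numpy as np

# Finite-difference check of the quadratic-case gradients above; Sl, Rl, Ql
# are frozen hypothetical symmetric matrices, not the paper's data.
rng = np.random.default_rng(1)
m_l, m_f, n = 2, 3, 2
sym = lambda k: (lambda B: (B + B.T) / 2)(rng.standard_normal((k, k)))
Sl, Rl, Ql = sym(m_l), sym(m_f), sym(n)

f = lambda u, v, x: u @ Sl @ u + v @ Rl @ v + x @ Ql @ x   # f_l of (4.87)
u, v, x = rng.standard_normal(m_l), rng.standard_normal(m_f), rng.standard_normal(n)

h = 1e-6
num = np.array([(f(u + h * e, v, x) - f(u - h * e, v, x)) / (2 * h)
                for e in np.eye(m_l)])
print(np.max(np.abs(num - 2 * Sl @ u)))   # matches (f_l)'_{u_l} = 2 u^T S_l
```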