19.1 Introduction

The title of this paper resembles the name of the third chapter in the remarkable book of Borwein and Zhu [3]. But the emphasis is somewhat different. In [3] variational techniques directly “refer to proofs by way of establishing that an appropriate auxiliary function attains a minimum.” This interpretation of variational analysis is close to its original meaning in Morse’s classical monograph [33]. But a more recent, and also already classical, monograph by Rockafellar and Wets [37] offers a much broader interpretation of the term which has become widely accepted by now. Pursuit of maximum generality is inherent in this interpretation, and this is definitely an indication of the power and ambition of the emerging theory. But there are also some dangers.

Needless to say, general theories often offer good starting observation points for attacking concrete problems. But it is equally true that neglecting specific features and details may lead to heavy and awkward constructions poorly connected with the nature of the problem. These are rather trivial remarks, but there is a worrying tendency to indiscriminately use techniques of general variational analysis as the main tool for studying very well-structured problems, for which a proper use of techniques that take the specific structure into account could be much more efficient. (Footnote 1)

My main intention in this paper is to demonstrate the interplay between methods of general variational analysis, on the one hand, and convex analysis, on the other (with more emphasis on the efficiency of the latter), in treating problems that arise in general variational analysis when it is applied in the substantial or virtual presence of convexity.

We shall consider several problems connected either with (metric) regularity and perturbation stability of set-valued mappings or with first-order optimality conditions. Not all of the problems are originally convex, but in each case the analysis involves convexity in one way or another. The paper basically surveys some results that have recently been or are about to be published, although there are some new results as well (those not supplied with references in the text or in the comments at the end of each section), e.g., Theorem 19.28, containing an exact estimate for the modulus of Lipschitz stability of the set of solutions of a generalized equation with respect to joint variations of the right-hand part and linear perturbations of the set-valued mapping. For the majority of the other results we just give new proofs, usually shorter than those available in the literature.

Now about the content of the paper. The next section contains necessary information from convex and modern variational analysis. In the third section we speak about calculation of global error bounds, for convex functions in the first part of the section and for nonconvex l.s.c. functions in the second. This problem is an excellent example of the effect of specific structural properties of the objects and of the concrete setting of the problem on the choice of technical instruments to solve the problem and on the result itself. We demonstrate this by considering first convex functions separately on Hilbert spaces, reflexive spaces, and arbitrary Banach spaces, and then arbitrary lower semicontinuous functions on Banach spaces. The general results presented in the section can be found in Azé-Corvellec’s papers [1, 2] and the monograph [46] by Zalinescu, although our statements and proofs contain some new elements (more detailed bibliographic information is contained in the comments concluding each of the last three sections). In the fourth section we consider set-valued mappings with convex graphs and systems of convex inequalities. The main questions discussed in the section relate to metric regularity (exact formulas for regularity moduli) and stability estimates of the regularity properties with respect to linear perturbation. We start with a new and very short proof of the Robinson-Ursescu theorem substantially based on the general metric regularity criterion of [20]. Subsequent results are closely related to Canovas et al. [48], Ioffe-Sekiguchi [25], and Ioffe [23]. The first statement of the final result of the section, Theorem 19.28, applies to all set-valued mappings with closed graph.

The main message of the last section is that, as far as first-order necessary optimality conditions are concerned, smooth nonconvex inequality constraint or cost functions do not exist. By that I mean that if such a function appears in the statement of the problem, it can be replaced by a convex continuous local majorant with the same value and the same derivative at the point. This is the content of the main lemma in the last section. This simple lemma does not seem to have appeared earlier, but a similar and even more elaborate idea was behind the Levitin-Milyutin-Osmolovski upper approximation in [27] (see also [19]). Using the convex majorants offered by the lemma dramatically simplifies the derivation of necessary optimality conditions.

A discussion with B. Mordukhovich after his talk (based on his joint paper with Nghia [32]) at the J. Borwein anniversary conference in Vancouver revealed, however, that the utility of such tricks may not be obvious to everybody. So we show here that it is possible to furnish, based on the lemma, fairly elementary and short proofs of the main results of [32] and even of stronger versions of some of them. In fact, the message can be even strengthened: the convex majorant that appears after the application of the lemma is necessarily strictly differentiable, no matter whether the original function has this property or not. Thus, when we deal with first-order optimality conditions, for inequality constraint and cost functions the differentiability and strict differentiability assumptions make no difference. Of course, the difference is substantial for equality constraints.

In the text we give references mainly for results that are stated without proofs. The statements of known results supplied with proofs may slightly differ from the original statements, and we give relevant references in the comments at the ends of the sections.

Notation. Throughout the paper X, Y, etc. are Banach spaces, \({X}^{{\ast}}\) is the topological dual of X, and \(\langle {x}^{{\ast}},x\rangle\) is the canonical pairing on \({X}^{{\ast}}\times X\). By d(x, Q) we denote the distance from x to Q. We shall always consider \({X}^{{\ast}}\) with the weak*-topology, and the symbol cl* means closure in this topology. The symbol F: X ⇉ Y is used for set-valued mappings from X into Y; \({F}^{-1}\) is the inverse mapping: \({F}^{-1}(y) =\{ x:\; y \in F(x)\}\).

If Q ⊂ X, then conv Q is the convex hull of Q, \(\mathrm{cone}\,Q = \cup _{\lambda \geq 0}\lambda Q\) is the cone generated by Q, \(S_{Q}({x}^{{\ast}})\) stands for the support function of Q, and \(\mathrm{Ind}_{Q}\) for the indicator function of Q:

$$\displaystyle{S_{Q}({x}^{{\ast}}) =\sup _{ x\in Q}\langle {x}^{{\ast}},x\rangle;\qquad \mathrm{Ind}_{Q}(x) = \left \{\begin{array}{cl} 0, &\mathrm{if}\;x \in Q, \\ \infty,&\mathrm{otherwise}. \end{array} \right.}$$
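These two objects are easy to experiment with numerically. The following sketch (ours, not part of the original text) evaluates the support function of a polytope \(Q = \mathrm{conv}\{q_{1},\ldots,q_{m}\}\), for which the supremum of the linear function \(\langle {x}^{{\ast}},\cdot \rangle\) over Q is attained at one of the generating points, together with the indicator function:

```python
import numpy as np

# S_Q(x*) for a polytope Q = conv{q_1,...,q_m}: the supremum of <x*, .>
# over Q is attained at one of the generating points, so a maximum suffices.
def support_function(points, x_star):
    return max(float(np.dot(x_star, q)) for q in points)

# Ind_Q(x): 0 on Q, +infinity outside (membership supplied as a predicate).
def indicator(member, x):
    return 0.0 if member(x) else float("inf")

square = [np.array(v, dtype=float) for v in [(0, 0), (1, 0), (0, 1), (1, 1)]]
in_square = lambda x: 0 <= x[0] <= 1 and 0 <= x[1] <= 1

print(support_function(square, np.array([2.0, 3.0])))   # 5.0, attained at the vertex (1,1)
print(indicator(in_square, np.array([0.5, 0.5])))       # 0.0
print(indicator(in_square, np.array([2.0, 0.0])))       # inf
```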

19.2 Preliminaries

19.2.1 A Few Facts from Convex Analysis

Let X be a Banach space and f an extended-real-valued convex function on X. We call f proper if f(x) > −∞ for all x and f is not identically +∞. We define as usual the domain \(\mathrm{dom}\,f =\{ x:\; f(x) <\infty \}\) and epigraph \(\mathrm{epi}\,f =\{ (x,\alpha ) \in X \times \mathbb{R}:\;\alpha \geq f(x)\}\) of f. It is said that f is closed if epi f is a closed set or, equivalently, if f is lower semicontinuous. The (Fenchel) conjugate of f is

$$\displaystyle{{f}^{{\ast}}({x}^{{\ast}}) =\sup _{ u}(\langle {x}^{{\ast}},u\rangle - f(u)).}$$

The closure cl f of a convex function f is the greatest closed convex function majorized by f:

$$\displaystyle{ \mathrm{cl}\,f(x) =\liminf _{u\rightarrow x}f(u) = {f}^{{\ast}{\ast}}(x) =\sup _{{ x}^{{\ast}}}(\langle {x}^{{\ast}},x\rangle - {f}^{{\ast}}({x}^{{\ast}})). }$$
(19.1)

Observe that the lower bounds of a function and its closure coincide.
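Equality (19.1) can be checked numerically on a simple one-dimensional example (ours, not from the text). The sketch below approximates the conjugate and the biconjugate of f(x) = x²/2 on a grid; this f is self-conjugate, and the biconjugate recovers f up to discretization error:

```python
import numpy as np

# Grid approximation of the Fenchel conjugate f*(x*) = sup_u (x*·u - f(u))
# and of the biconjugate; for the proper closed convex f(x) = x^2/2 the
# biconjugate recovers f, in line with (19.1).
grid = np.linspace(-5.0, 5.0, 2001)

def conjugate(f_vals, duals):
    # discrete sup over the grid for each dual point
    return np.array([np.max(d * grid - f_vals) for d in duals])

f = 0.5 * grid**2                    # self-conjugate function
f_star = conjugate(f, grid)          # approximately x*^2 / 2
f_bistar = conjugate(f_star, grid)   # approximately f itself

print(float(np.max(np.abs(f_star - 0.5 * grid**2))))  # essentially zero
print(float(np.max(np.abs(f_bistar - f))))            # essentially zero
```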

Recall that the subdifferential of f at x ∈ dom f is

$$\displaystyle{\partial f(x) =\{ {x}^{{\ast}}\in {X}^{{\ast}}:\;\langle {x}^{{\ast}},h\rangle \leq f(x + h) - f(x),\quad \forall \;h \in X\}.}$$

It is always a weak* closed set, and it is bounded (hence weak*-compact) if f is continuous at x. Moreover, the set-valued mapping \(x\mapsto \partial f(x)\) is norm-to-weak* upper semicontinuous; in particular, the function \(\mathrm{d}(0,\partial f(\cdot ))\) is lower semicontinuous.

If x ∈ dom f, then the directional derivative of f at x in the direction h is

$$\displaystyle{f^{\prime}(x;h) =\lim _{t\rightarrow +0}{t}^{-1}(f(x + th) - f(x)).}$$

This limit, finite or infinite, always exists, thanks to the following elementary fact: if \(\varphi (t)\) is a convex function on the real line and \(t_{1} <t_{2} \leq t_{4},\;t_{1} \leq t_{3} <t_{4}\) all belong to the domain of \(\varphi\), then

$$\displaystyle{ \frac{\varphi (t_{2}) -\varphi (t_{1})} {t_{2} - t_{1}} \leq \frac{\varphi (t_{4}) -\varphi (t_{3})} {t_{4} - t_{3}}. }$$
(19.2)

As another immediate consequence, we get f′(x; h) ≥ − f′(x; − h).
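Both (19.2) and the consequence f′(x; h) ≥ −f′(x; −h) are straightforward to verify on a concrete convex function; the following sketch (ours) does so for φ(t) = eᵗ:

```python
import math

# phi(t) = e^t is convex on the line; (19.2) says the difference quotient over
# [t1, t2] does not exceed the one over [t3, t4] when t1 < t2 <= t4, t1 <= t3 < t4.
phi = math.exp

def quotient(t1, t2):
    return (phi(t2) - phi(t1)) / (t2 - t1)

t1, t2, t3, t4 = 0.0, 0.5, 0.3, 1.0
print(quotient(t1, t2) <= quotient(t3, t4))   # True

# Consequence f'(x; h) >= -f'(x; -h), checked with one-sided quotients at x = 0.
t = 1e-6
right = (phi(t) - phi(0.0)) / t     # approximates phi'(0; +1)
left = (phi(-t) - phi(0.0)) / t     # approximates phi'(0; -1)
print(right >= -left)               # True
```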

The function f′(x; ⋅) is sublinear, that is, convex and positively homogeneous of degree one: \(f^{\prime}(x;\lambda h) =\lambda f^{\prime}(x;h)\) if λ > 0. It is continuous if f is continuous at x. Note that the closure of a sublinear function is also a sublinear function. The connection between the subdifferential and the directional derivative is described by the equality

$$\displaystyle{\sup _{{x}^{{\ast}}\in \partial f(x)}\langle {x}^{{\ast}},h\rangle = (\mathrm{cl}\,f^{\prime}(x;\cdot ))(h).}$$

We also have that f(x) = (cl f)(x) and \(\partial f(x) = \partial (\mathrm{cl}\,f)(x)\) if \(\partial f(x)\neq \varnothing\).

The following proposition reveals the connection between the directional derivative and the subdifferential at the same point.

Proposition 19.1.

Let f be a proper closed convex function on a Banach space X. Then

$$\displaystyle{\mathrm{d}(0,\partial f(x)) =\sup _{\|h\|\leq 1}(-f^{\prime}(x;\cdot ))(h) =\sup _{\|h\|\leq 1}(-\mathrm{cl}\,(f^{\prime}(x;\cdot )))(h).}$$

(Here we adopt the standard conventions: \(\sup \varnothing = -\infty\), \(\inf \varnothing = \infty\), so that\(d(x,\varnothing ) = \infty\).)

Proof.

If \(\partial f(x) = \varnothing\), the equality holds by the standard convention. Furthermore, 0 ∈ ∂ f(x) if and only if f′(x; h) ≥ 0 for all h, and the equality obviously holds in this case (just take h = 0). Assume now that \(0\not\in \partial f(x)\neq \varnothing\). Then d(0, ∂ f(x)) > 0. As ∂ f(x) is a weak* closed set and \(\|\cdot \|\) is weak* l.s.c., there is an \({x}^{{\ast}}\in \partial f(x)\) such that \(\|{x}^{{\ast}}\| = \mathrm{d}(0,\partial f(x))\). Take a small \(\varepsilon> 0\) and set \(Q_{\varepsilon } = (1-\varepsilon )\|{x}^{{\ast}}\|B_{{X}^{{\ast}}}\). This set is weak*-compact and does not meet ∂ f(x). Therefore there is an h ∈ X, \(\|h\| = 1\), separating the sets, that is,

$$\displaystyle{\sup _{{u}^{{\ast}}\in Q_{\varepsilon }}\langle {u}^{{\ast}},h\rangle = (1-\varepsilon )\|{x}^{{\ast}}\| \leq \inf _{{u}^{{\ast}}\in \partial f(x)}\langle {u}^{{\ast}},h\rangle.}$$

On the other hand

$$\displaystyle{\inf _{{u}^{{\ast}}\in \partial f(x)}\langle {u}^{{\ast}},h\rangle = -\sup _{{ u}^{{\ast}}\in \partial f(x)}\langle {u}^{{\ast}},-h\rangle = -\mathrm{cl}\,(f^{\prime}(x;\cdot ))(-h)}$$

and we get \(\vert \mathrm{d}(0,\partial f(x)) - (-\mathrm{cl}\,(f^{\prime}(x;\cdot )))(-h)\vert \leq \varepsilon \,\mathrm{d}(0,\partial f(x))\). As \(\varepsilon\) was arbitrary, the result follows.

Given a convex set Q ⊂ X and an \(\overline{x} \in Q\), the tangent cone to Q at \(\overline{x}\) is

$$\displaystyle{T(Q,\overline{x}) =\mathrm{ cl}\,[\mathrm{cone}\,(Q -\overline{x})] =\mathrm{ cl}\,\Big(\bigcup _{\lambda>0}\lambda (Q -\overline{x})\Big)}$$

and the normal cone to Q at \(\overline{x}\) is

$$\displaystyle{N(Q,\overline{x}) =\{ {x}^{{\ast}}\in {X}^{{\ast}}:\;\langle {x}^{{\ast}},x -\overline{x}\rangle \leq 0,\;\forall \;x \in Q\}.}$$

If \(Q_{i}\), i = 1, 2, are convex sets such that \(Q_{1}\) meets the interior of \(Q_{2}\), then for any \(x \in Q_{1} \cap Q_{2}\)

$$\displaystyle{N(Q_{1} \cap Q_{2},x) \subset N(Q_{1},x) + N(Q_{2},x).}$$

If \(F: X \rightrightarrows Y\) is a set-valued mapping with convex graph and \((\bar{x},\bar{y}) \in \mathrm{ Graph}\,F\), then \(T(\mathrm{Graph}\,F,(\bar{x},\bar{y}))\) can be considered the graph of the set-valued mapping \(DF(\bar{x},\bar{y})\) defined by

$$\displaystyle{DF(\bar{x},\bar{y})(h) =\{ v \in Y: (h,v) \in T(\mathrm{Graph}\,F,(\bar{x},\bar{y}))\}.}$$

This mapping is usually called the derivative (or contingent derivative) of F at \((\bar{x},\bar{y})\).

19.2.2 A Few Facts from General Variational Analysis

We need some facts from the local (metric) regularity theory of variational analysis. Given a set-valued mapping F: X ⇉ Y and an \((\bar{x},\bar{y}) \in \mathrm{ Graph}\,F\), it is said that:

  • F is open (or covering) at a linear rate at (or near) \((\bar{x},\bar{y})\) if there are r > 0 and \(\varepsilon> 0\) such that

    $$\displaystyle{B(v,tr) \cap B(\overline{y},\varepsilon ) \subset F(B(x,t))}$$

    whenever \(\|x -\overline{x}\| <\varepsilon\), \(0 \leq t <\varepsilon\) and v ∈ F(x). The upper bound of such r is called the modulus or rate of surjection or openness of F at \((\bar{x},\bar{y})\) and is denoted \(\mathrm{sur}\,F(\overline{x}\vert \overline{y})\). If no such r and \(\varepsilon\) exist, we set \(\mathrm{sur}\,F(\overline{x}\vert \overline{y}) = 0\).

  • F is metrically regular at (or near) \((\bar{x},\bar{y})\) if there are K > 0 and \(\varepsilon> 0\) such that

    $$\displaystyle{\mathrm{d}(x,{F}^{-1}(y)) \leq K\mathrm{d}(y,F(x))}$$

    whenever \(\|x -\overline{x}\| <\varepsilon\) and \(\|y -\overline{y}\| <\varepsilon\). The lower bound of such K is called the modulus or rate of metric regularity of F at \((\bar{x},\bar{y})\) and is denoted \(\mathrm{reg}\,F(\overline{x}\vert \overline{y})\). If no such K and \(\varepsilon\) exist, we set \(\mathrm{reg}\,F(\overline{x}\vert \overline{y}) = \infty\).

  • F  − 1 is said to be pseudo-Lipschitz or to have the Aubin property at (or near) \((\overline{y},\overline{x})\) if there are K > 0 and \(\varepsilon> 0\) such that

    $$\displaystyle{\mathrm{d}(x,{F}^{-1}(y)) \leq K\mathrm{d}(y,v)}$$

    whenever \(\|x -\overline{x}\| <\varepsilon,\|y -\overline{y}\| <\varepsilon\) and v ∈ F(x). The lower bound of such K is called the Lipschitz modulus or rate of \({F}^{-1}\) at \((\overline{y}\vert \overline{x})\) and is denoted \(\mathrm{lip}\,{F}^{-1}(\overline{y}\vert \overline{x})\). If no such K or \(\varepsilon\) exist, we set \(\mathrm{lip}\,{F}^{-1}(\overline{y}\vert \overline{x}) = \infty\).
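For a simple class of mappings these quantities can be computed exactly. The sketch below (our illustration, not from the text) checks the metric regularity inequality for an invertible linear map F(x) = Ax, for which the estimate holds globally with K = ∥A⁻¹∥:

```python
import numpy as np

# For an invertible linear F(x) = Ax the inequality
#   d(x, F^{-1}(y)) <= K d(y, F(x))
# holds globally with K = ||A^{-1}|| (spectral norm), since
# x - A^{-1}y = A^{-1}(Ax - y).  We check it at random points.
rng = np.random.default_rng(0)
A = np.array([[2.0, 1.0], [0.0, 3.0]])
A_inv = np.linalg.inv(A)
K = np.linalg.norm(A_inv, 2)

ok = all(
    np.linalg.norm(x - A_inv @ y) <= K * np.linalg.norm(y - A @ x) + 1e-12
    for x, y in (rng.normal(size=(2, 2)) for _ in range(1000))
)
print(ok)   # True
```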

Theorem 19.2 (equivalence theorem). 

Under the convention that 0 ⋅∞ = 1, for any set-valued mapping F: X ⇉ Y and any \((\bar{x},\bar{y}) \in \mathrm{ Graph}\,F\)

$$\displaystyle{\mathrm{sur}\,F(\overline{x}\vert \overline{y}) \cdot \mathrm{ reg}\,F(\overline{x}\vert \overline{y}) = 1;\qquad \mathrm{reg}\,F(\overline{x}\vert \overline{y}) =\mathrm{ lip}\,{F}^{-1}(\overline{y}\vert \overline{x}).}$$

We say that F is regular at \((\bar{x},\bar{y})\) if the three properties are satisfied at \((\bar{x},\bar{y})\).

Theorem 19.3 (criterion for local regularity). 

Let F be a set-valued mapping whose graph is locally complete in the product metric, and let \((\bar{x},\bar{y}) \in \mathrm{ Graph}\,F\) . Then F is regular near \((\bar{x},\bar{y})\) if and only if there are \(\varepsilon> 0\) and r > 0 such that for any x, y, and v satisfying \(d(x,\overline{x}) <\varepsilon,\;d(v,\overline{y}) <\varepsilon,\;y \in F(x)\) , and \(0 <d(y,v) <\varepsilon\) , there is a pair (u,z) ∈ Graph  F, (u,z)≠(x,y), and a ξ > 0 such that

$$\displaystyle{ \|v - z\| \leq \| v - y\| - rd_{\xi }((x,y),(u,z)). }$$
(19.3)

The upper bound of such r coincides with  \(\mathrm{sur}\,F(\overline{x}\vert \overline{y})\) .

Moreover, if F is u.s.c. in the sense that the functions \(\varphi _{y} = d(y,F(\cdot ))\) are l.s.c. for every y, then (19.3) can be replaced by

$$\displaystyle{\|v - z\| \leq \| v - y\| - r\|x - u\|.}$$

Here \(d_{\xi }\) is the distance in X × Y given by \(d_{\xi }((x,y),(u,z)) =\max \{\|x - u\|,\xi \|y - z\|\}\). Note that the definitions and results extend without change (except for the obvious replacement of norms by distances) to arbitrary metric spaces. For the proofs of the theorems see, e.g., [20, 22].

The geometric meaning of the criterion is obvious: for any observation point (x, y) of the graph (close to \((\bar{x},\bar{y})\)) and any target v ≠ y you can find a better observation position (u, z) ∈ Graph F such that the gain in the distance to v is proportional to the distance between the observation points. Less obvious is that the criterion is an excellent practical instrument, often better than more sophisticated means using slopes and subdifferentials. (Footnote 2)

Calculation of regularity rates is typically a difficult task. But in certain cases it is sufficiently easy. Denote by Epi f the set-valued mapping \(X \rightrightarrows \mathbb{R}\) whose graph coincides with the epigraph of f:

$$\displaystyle{\mathrm{Epi}\,f(x) =\{\alpha \in \mathbb{R}:\;\alpha \geq f(x)\}.}$$

Proposition 19.4.

Let f be a closed convex function on X. Then for any x ∈ dom  f

$$\displaystyle{ \mathrm{sur}\,(\mathrm{Epi}\,f)(x,f(x)) =\liminf _{(u,f(u))\rightarrow (x,f(x))}d(0,\partial f(u)). }$$
(19.4)

Proof.

Note that for α ∈ Epi f(u) the inclusion \(B(\alpha,\varepsilon ) =\alpha +[-\varepsilon,\varepsilon ] \subset \mathrm{ Epi}\,f(B(u,t))\) is equivalent to \(\alpha +[-\varepsilon,\infty ) \subset \mathrm{ Epi}\,f(B(u,t))\). On the other hand, let \(\partial f(u)\neq \varnothing\). Set \(\rho = \mathrm{d}(0,\partial f(u))\). We have \(f(u + th) \geq f(u) + t\langle {u}^{{\ast}},h\rangle\) for any h ∈ X, any \({u}^{{\ast}}\in \partial f(u)\), and any t > 0. Together with Proposition 19.1 this implies that for any \(\varepsilon> 0\)

$$\displaystyle{f(u) + t(1-\varepsilon )\rho [-1,\infty ) \subset \mathrm{ Epi}\,f(B(u,t)) \subset f(u) + t\rho [-1,\infty )}$$

and (19.4) follows (as d(0, ∂ f( ⋅)) is an l.s.c. function).

The contingent or Bouligand tangent cone to Q at x ∈ Q is

$$\displaystyle{T_{B}(Q,x) =\limsup _{\lambda \rightarrow +0}{\lambda }^{-1}(Q - x) =\{ h:\;\liminf _{\lambda \rightarrow +0}{\lambda }^{-1}\mathrm{d}(x +\lambda h,Q) = 0\}.}$$

The polar of T B (Q, x)

$$\displaystyle{N_{DH}(Q,x) =\{ {x}^{{\ast}}\in {X}^{{\ast}}:\;\langle {x}^{{\ast}},h\rangle \leq 0,\quad \forall \;h \in T_{B}(Q,x)\}}$$

is called the Dini-Hadamard normal cone to Q at x. Any normal cone to the graph of a set-valued mapping X ⇉ Y can be viewed as the graph of a set-valued mapping from Y  ∗  into X  ∗ . In particular, given F: X ⇉ Y and \((x,y) \in \mathrm{ Graph}\,F\), the set-valued mapping

$$\displaystyle{{y}^{{\ast}}\mapsto \{{x}^{{\ast}}:\; ({x}^{{\ast}},-{y}^{{\ast}}) \in N_{ DH}(\mathrm{Graph}\,F,(x,y))\}:= D_{DH}^{{\ast}}F(x,y)({y}^{{\ast}})}$$

is called the Dini-Hadamard coderivative of F at (x, y). We shall not need coderivatives associated with other types of normal cones.

The Clarke tangent cone T C (Q, x) to Q at x is the collection of h ∈ X such that for any sequence (x n ) ⊂ Q converging to x and any sequence (t n ) of positive numbers converging to zero there is a sequence (h n ) converging to h such that \(x_{n} + t_{n}h_{n} \in Q\). This is always a closed convex cone. Its polar N C (Q, x) is called Clarke’s normal cone to Q at x.

The inclusions \(T_{C}(Q,x) \subset T_{B}(Q,x)\) and \(N_{DH}(Q,x) \subset N_{C}(Q,x)\) always hold. If we actually have equalities, then Q is called Clarke regular at x ∈ Q.

19.3 Global Error Bounds

Let X be a metric space, and let f be an extended-real-valued function on X. We define the domain of f by \(\mathrm{dom}\,f =\{ x:\; \vert f(x)\vert <\infty \}\) and set \([f \leq \alpha ] =\{ x \in \mathrm{ dom}\,f:\; f(x) \leq \alpha \}\), \([f =\alpha ] =\{ x \in \mathrm{ dom}\,f:\; f(x) =\alpha \}\), etc.

Definition 19.5.

Suppose \([f \leq \alpha ]\neq \varnothing\). A number K ≥ 0 is called a global error bound for f at level α if

$$\displaystyle{\mathrm{d}(x,[f \leq \alpha ]) \leq K{(f(x)-\alpha )}^{+},\quad \forall \;x \in X.}$$

Clearly the set of all global error bounds has a minimal element. We shall denote by \(K_{f}(\alpha )\) the smallest global error bound for f at level α. The reciprocal quantity \(K_{f}{(\alpha )}^{-1}\) is sometimes called the condition number of f at the level α.

We shall look for estimates or exact expressions for global error bounds (condition numbers) that use only infinitesimal information about the function.

19.3.1 Convex Function on a Banach Space

As follows from the title of the subsection we shall consider here the case when X is a Banach space and f is a convex function. We assume throughout that

(A 1):

f is a proper closed convex function and \([f \leq \alpha ]\neq \varnothing\).

In this subsection we prove the following theorem.

Theorem 19.6.

Let X be a Banach space and f a proper closed convex function on X satisfying ( A 1 ). Then for any α with \([f \leq \alpha ]\neq \varnothing\)

$$\displaystyle\begin{array}{rcl} K_{f}{(\alpha )}^{-1}& =& \inf _{ x\in [f>\alpha ]}\;\sup _{\|h\|\leq 1}(-f^{\prime}(x;h)) \\ & =& \inf _{x\in [f>\alpha ]}\mathrm{d}(0,\partial f(x)) \\ & =& \inf _{x\in [f>\alpha ]}\mathrm{sur}\,(\mathrm{Epi}\,f)(x,f(x)).{}\end{array}$$
(19.5)

Moreover, if X is a Hilbert space, then we also have

$$\displaystyle{ K_{f}{(\alpha )}^{-1} =\inf _{ x\in [f=\alpha ]}\ \;\inf _{h\in N([f\leq \alpha ],x),\|h\|=1}f^{\prime}(x;h). }$$
(19.6)

Proof.

The second and the third equalities follow from Propositions 19.1 and 19.4. So in order to prove the first statement we only have to show that \(K_{f}{(\alpha )}^{-1}\) coincides with any of the quantities on the right. We shall do this separately for Hilbert spaces (along with (19.6)), reflexive spaces, and general Banach spaces. Set for brevity \(S = [f \leq \alpha ]\), \(S_{0} = [f =\alpha ]\).

  1.

    The Case of a Hilbert Space. To begin with, consider a continuous convex function \(\varphi\) on the real segment [0, T] which is equal to zero at \(0\) and strictly positive on (0, T]. Denote by \(\varphi ^{\prime}(t\pm )\) the right and left derivatives of \(\varphi\) at t. We have \(\varphi ^{\prime}(t+) +\varphi ^{\prime}(t-) \geq 0\) for all t ∈ (0, T) and (by (19.2))

    $$\displaystyle{ \varphi ^{\prime}(0+) =\lim _{t\rightarrow 0}\varphi ^{\prime}(t) =\lim _{t\rightarrow 0}(-\varphi ^{\prime}(t-)) =\inf _{t>0}(-\varphi ^{\prime}(t-)). }$$
    (19.7)

    It follows further from (19.2) and the mean value theorem that for any t > 0 there is a τ ∈ (0, t) such that \(-t\varphi ^{\prime}(t-) \geq \varphi (t) \geq -\tau \varphi ^{\prime}(\tau -)\). Together with (19.7) this implies that

    $$\displaystyle\begin{array}{rcl} \sup \{k \geq 0: kt \leq \varphi (t),\ \forall t \in [0,T]\}& =& \varphi ^{\prime}(0+) \\ & =& -\lim _{t\rightarrow 0}\varphi ^{\prime}(t-) =\inf _{t>0}(-\varphi ^{\prime}(t-)).\qquad \quad {}\end{array}$$
    (19.8)

    Let \(x \in (\mathrm{dom}\,f)\setminus S\), and let \(\overline{x} \in S\) be such that \(\|x -\overline{x}\| = \mathrm{d}(x,S)\). As f(x) < ∞, we necessarily have \(\overline{x} \in S_{0}\). Set \(T =\| x -\overline{x}\|\), \(h = {T}^{-1}(x -\overline{x})\), and let \(\varphi (t) = f(\overline{x} + th),\;t \in [0,T]\). It is clear that \(\varphi ^{\prime}(t+) = f^{\prime}(\overline{x} + th;h)\) and \(\varphi ^{\prime}(t-) = f^{\prime}(\overline{x} + th;-h)\). Note further that for any \(\overline{x} \in S\) either \(N(S,\overline{x}) =\{ 0\}\), or \(\overline{x} + h\not\in \mathrm{dom}\,f\) for every nonzero \(h \in N(S,\overline{x})\), in which case \(f^{\prime}(\overline{x};h) = \infty\) for such h, or finally there is an \(h \in N(S,\overline{x})\), \(\|h\| = 1\), such that \(f(\overline{x} + th) <\infty\) for some positive t. In the last case \(\overline{x}\) is necessarily the element of S closest to \(\overline{x} + th\). Combining this with (19.8), we get

    $$\displaystyle{\begin{array}{lcl} K_{f}{(\alpha )}^{-1} & =&\sup \{k \geq 0:\; kd(x,S) \leq f(x)-\alpha,\;\forall \;x\not\in S\} \\ & =&\inf _{\overline{x}\in S_{0}}\ \inf _{h\in N(S,\overline{x}),\|h\|=1}f^{\prime}(\overline{x};h) =\inf _{x\not\in S}\;\sup _{\|h\|\leq 1}(-f^{\prime}(x;h)).\end{array} }$$

    This proves both (19.6) and the first equality in (19.5).

  2.

    The General Case: Proof that

    $$\displaystyle{ K_{f}{(\alpha )}^{-1} \leq \inf _{ x\in [f>\alpha ]}\;\sup _{\|h\|\leq 1}(-f^{\prime}(x;h)) = r. }$$
    (19.9)

    Take an \(x \in [f>\alpha ] \cap \mathrm{ dom}\,f\) and an \(\overline{x} \in S_{0}\). Set \(u = \overline{x} - x\) and \(h = u/\|u\|\). We have from (19.2)

    $$\displaystyle{\frac{f(x) - f(\overline{x})} {\|x -\overline{x}\|} \leq -f^{\prime}(x;h)}$$

    and (19.9) follows because \(\overline{x}\) can be chosen to make \(\|x -\overline{x}\|\) arbitrarily close to d(x, S). So it remains to prove the opposite inequality for which we can assume that r > 0.

  3.

    The Case of a Reflexive Space: Completion of the Proof. We have \(\inf _{\|h\|\leq 1}f^{\prime}(x;h) \leq -r\) for all x ∈ [f > α]. This means that for any such x and any r′ < r there is a t > 0 such that for some u we have

    $$\displaystyle{ \|u - x\| \leq t,\qquad f(u) \leq f(x) - r^{^{\prime}}{t}. }$$
    (19.10)

    Fix x and r′, denote by TU(x) the collection of pairs (u, t) satisfying (19.10), and consider the lower bound β of f(u) over TU(x).

    We claim that there is a \((u,t) \in TU(x)\) such that f(u) ≤ α. This is obvious if β < α. If β > −∞, the set TU(x) is convex and bounded by (19.10) and, as f is lower semicontinuous, it is a closed set. Since X is a reflexive space, TU(x) is weakly compact, so the lower bound is attained at some \((\overline{u},\bar{t})\). If we had β > α, then there would be a pair \((u,t) \in TU(\overline{u})\), in which case \((u,t +\bar{ t}) \in TU(x)\) and f(u) < β, a contradiction. Thus \(f(\overline{u}) =\beta =\alpha\).

  4.

    General Case: Completion of the Proof. The last argument does not work if X is not reflexive. In this case, by Ekeland’s variational principle, for any δ > 0 there is a pair \((\overline{u},\bar{t}) \in TU(x)\) such that \(f(u) +\delta \| u -\overline{u}\|\) attains its minimum at \(\overline{u}\). We take δ < r′. If \(f(\overline{u}) =\alpha\), we are done. If \(f(\overline{u})>\alpha\), there is an h with \(\|h\| = 1\) such that \(-f^{\prime}(\overline{u};h)> r^{\prime}\), that is, \(f(\overline{u} + th) < f(\overline{u}) - r^{\prime}t\) for some t > 0. Set \(u = \overline{u} + th\). Then \((u,t) \in TU(\overline{u})\) and we get a contradiction with the definition of \(\overline{u}\), proving the claim. As x is an arbitrary point of [f > α], it follows that \(K_{f}(\alpha ) \leq 1/r^{\prime}\), and the desired inequality follows since r′ can be arbitrarily close to r.
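The equalities (19.5) can be observed numerically on a one-dimensional example of our own: f(x) = x² with α = 1, for which the smallest error bound is 1/2 and the infimum of d(0, ∂f(x)) over [f > 1] is 2:

```python
import numpy as np

# f(x) = x^2, alpha = 1, [f <= 1] = [-1, 1].  For |x| > 1 we have
# d(x, [f <= 1]) = |x| - 1 and d(0, df(x)) = 2|x|, so (19.5) predicts
# K_f(1)^{-1} = inf_{|x| > 1} 2|x| = 2, i.e. K_f(1) = 1/2.
xs = np.linspace(1.0001, 100.0, 200000)   # sample of [f > 1]; x > 0 suffices by symmetry

ratios = (xs - 1.0) / (xs**2 - 1.0)       # d(x, [f <= 1]) / (f(x) - 1)
K_f = float(ratios.max())                 # approximates the smallest global error bound
inf_subdiff = float((2.0 * xs).min())     # approximates inf d(0, df(x)) over [f > 1]

print(abs(K_f - 0.5) < 1e-3)              # True
print(abs(1.0 / K_f - inf_subdiff) < 1e-2)   # True
```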

Remark 19.7.

Observe that in the cases of Hilbert and reflexive X we only needed elementary convex analysis, whereas in the general case we were compelled to invoke Ekeland’s variational principle. It would be interesting (albeit doubtful) to find a proof based entirely on convex analysis in the general case as well. An alternative simple proof that \(K_{f}{(\alpha )}^{-1} \geq \inf _{x\in [f>\alpha ]}d(0,\partial f(x))\), also based on Ekeland’s principle, easily follows from Lemma 19.8 below (see also [26]).

19.3.2 General Results on Global Error Bounds

Here we consider the general case of an l.s.c. function on a complete metric space, and the ways leading from these results to those of the preceding subsection. Recall [13] that the slope of f at x is

$$\displaystyle{\vert \nabla f\vert (x) =\limsup _{u\rightarrow x,u\neq x}\frac{{(f(x) - f(u))}^{+}} {d(x,u)} }$$

where \({\alpha }^{+} =\max \{\alpha,0\}\).
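A finite-difference proxy for the slope (our sketch; a true limsup would require sampling all nearby u, which in one dimension reduces to the two sides of x) already reproduces the expected values for f(x) = |x| on the line:

```python
# Finite-difference sketch of the slope |∇f|(x) = limsup_{u→x} (f(x)-f(u))^+ / d(x,u)
# for a function on the real line; with step h we sample u on both sides of x.
def slope(f, x, h=1e-7):
    best = 0.0
    for u in (x - h, x + h):
        best = max(best, max(f(x) - f(u), 0.0) / abs(x - u))
    return best

print(slope(abs, 2.0))   # 1.0: f(x) = |x| decreases at unit rate towards the origin
print(slope(abs, 0.0))   # 0.0: the slope vanishes at the minimum
```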

Lemma 19.8.

Let X be a complete metric space and f a lower semicontinuous function on X. Assume that for some x ∈ dom  f, and α < f(x), we have |∇f|(u) ≥ r > 0 if α < f(u) ≤ f(x). Then \([f \leq \alpha ]\neq \varnothing\) and \(d(x,[f \leq \alpha ]) \leq {r}^{-1}{(f(x)-\alpha )}^{+}\) .

Proof.

Set g(u) = (f(u) − α) + . By Ekeland’s principle for any δ > 0 there is a \(\overline{u}\) such that \(g(\overline{u}) \leq g(x)\), \(\mathrm{d}(\overline{u},x) \leq g(x)/\delta\), and \(g(u) +\delta \mathrm{d}(u,\overline{u}) \geq g(\overline{u})\) for all u. It follows that \(\vert \nabla g\vert (\overline{u}) \leq \delta\). If δ < r, this can happen only if \(g(\overline{u}) = 0\) for otherwise we would have \(\vert \nabla g\vert (\overline{u}) = \vert \nabla f\vert (\overline{u}) \geq r\). Taking δ arbitrarily close to (and still smaller than) r, we prove the second statement.

Denote by K f (α, β) (where β > α) the lower bound of K such that

$$\displaystyle{d(x,[f \leq \alpha ]) \leq K{(f(x)-\alpha )}^{+}\ \mbox{ if}\ \alpha <f(x) \leq \beta.}$$

Clearly, \(K_{f}(\alpha ) =\lim _{\beta \rightarrow \infty }K_{f}(\alpha,\beta )\).

Theorem 19.9.

Let X be a complete metric space and f a lower semicontinuous function on X. If \([f \leq \alpha ]\neq \varnothing\) , then

$$\displaystyle{\inf _{x\in [\alpha <f\leq \beta ]}\vert \nabla f\vert (x) =\inf _{\gamma \in [\alpha,\beta )}K_{f}{(\gamma,\beta )}^{-1}.}$$

Proof.

Set \(r =\inf _{x\in [\alpha <f\leq \beta ]}\vert \nabla f\vert (x)\). The inequality \(K_{f}{(\gamma,\beta )}^{-1} \geq r\) for \(\alpha \leq \gamma <\beta\) is immediate from Lemma 19.8. This proves that the left side of the equality cannot be greater than the quantity on the right. To prove the opposite inequality it is natural to assume that \(K_{f}{(\gamma,\beta )}^{-1} \geq \xi> 0\) for all \(\gamma \in [\alpha,\beta )\). For any x ∈ [f > α] and any \(\varepsilon> 0\) such that \(f(x)-\varepsilon>\alpha\), choose a \(u = u(\varepsilon ) \in [f \leq f(x)-\varepsilon ]\) such that \(\mathrm{d}(x,u) \leq (1+\varepsilon )\mathrm{d}(x,[f \leq f(x)-\varepsilon ]) \leq {(1+\varepsilon )\xi }^{-1}\varepsilon\), so that u → x as \(\varepsilon \rightarrow 0\). On the other hand, ξd(x, u) ≤ f(x) − f(u), which (as u ≠ x) implies that \(\xi \leq \vert \nabla f\vert (x)\), and the result follows.

As an immediate consequence we get

Corollary 19.10.

Under the assumption of the theorem

$$\displaystyle{K_{f}{(\alpha )}^{-1} \geq \inf _{ x\in [f>\alpha ]}\vert \nabla f\vert (x).}$$

A trivial example of a function f having an isolated local minimum at a certain \(\overline{x}\) and such that \(\inf f <f(\overline{x})\) shows that the inequality can be strict. This may happen of course even if the slope is different from zero everywhere on \([f>\alpha ]\). In this case an estimate of another sort can be obtained. Set (for \(\beta>\alpha\))

$$\displaystyle{d_{f}(\alpha,\beta ) =\sup _{x\in [f\leq \beta ]}\mathrm{d}(x,[f \leq \alpha ])}$$

and define the functions

$$\displaystyle{\kappa _{f,\varepsilon }(t) =\sup \{ \frac{1} {\vert \nabla f\vert (x)}:\; \vert f(x) - t\vert <\varepsilon \};\quad \kappa _{f}(t) =\lim _{\varepsilon \rightarrow 0}\kappa _{f,\varepsilon }(t).}$$

Proposition 19.11.

Let β > α. Assume that \([f \leq \alpha ]\neq \varnothing\) and |∇f|(x) ≥ r > 0 if x ∈ [α < f ≤β]. Then

$$\displaystyle{d_{f}(\alpha,\beta ) \leq \int _{\alpha }^{\beta }\kappa _{f}(t)\mathrm{d}t.}$$

Proof.

First we note that κ f is measurable (so, as it is nonnegative, the integral makes sense). Indeed, it is enough to verify that every \(\kappa _{f,\varepsilon }\) is measurable; in fact, the latter is even lower semicontinuous. To see this, take a δ > 0 and find an x with \(\vert f(x) - t\vert <\varepsilon\) such that \(\vert \nabla f\vert {(x)}^{-1}>\kappa _{f,\varepsilon }(t)-\delta\). Take a positive \(\gamma <\varepsilon -\vert f(x) - t\vert\). Then for any τ with | t − τ | < γ we have \(\vert f(x) -\tau \vert <\varepsilon\) and therefore \(\kappa _{f,\varepsilon }(\tau ) \geq \kappa _{f,\varepsilon }(t)-\delta\).

Now fix an \(\varepsilon> 0\) and let \(\alpha =\tau _{0} <\ldots <\tau _{k} =\beta\) be a partition of [α, β] with \((1/2)(\tau _{i} -\tau _{i-1}) =\varepsilon _{i} <\varepsilon\). Set \(t_{i} = (\tau _{i} +\tau _{i-1})/2\), i = 1,…,k. As follows from Theorem 19.9, \(d_{f}(\tau _{i-1},\tau _{i}) \leq \kappa _{f,\varepsilon _{i}}(t_{i})(\tau _{i} -\tau _{i-1})\) and therefore

$$\displaystyle{d_{f}(\alpha,\beta ) \leq \sum _{i=1}^{k}\kappa _{ f,\varepsilon _{i}}(t_{i})(\tau _{i+1} -\tau _{i}) \leq \sum _{i=1}^{k}\kappa _{ f,\varepsilon }(t_{i})(\tau _{i+1} -\tau _{i}).}$$

Passing to the limit over the net of all partitions of [α, β] we conclude that

$$\displaystyle{d_{f}(\alpha,\beta ) \leq \int _{\alpha }^{\beta }\kappa _{f,\varepsilon }(t)\mathrm{d}t.}$$

The result now follows from the Lebesgue dominated convergence theorem since, by the assumption, \(\kappa _{f,\varepsilon }(t) \leq {r}^{-1}\) for all \(\varepsilon\) whenever t ∈ (α, β].
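The bound of Proposition 19.11 can even be attained. The following numerical sketch (our illustration, not part of the original text) checks this for \(f(x) = {x}^{2}\) on the real line, where \(\vert \nabla f\vert (x) = 2\vert x\vert\), \([f \leq \alpha ] = [-\sqrt{\alpha },\sqrt{\alpha }]\), so that \(d_{f}(\alpha,\beta ) = \sqrt{\beta }-\sqrt{\alpha }\) and \(\kappa _{f}(t) = 1/(2\sqrt{t})\):

```python
import math

# Illustrative check of Proposition 19.11 for f(x) = x^2 on the real line
# (an example added here; it is not part of the original text).
# |grad f|(x) = 2|x|, [f <= a] = [-sqrt(a), sqrt(a)], hence
# d_f(a, b) = sqrt(b) - sqrt(a) and kappa_f(t) = 1/(2 sqrt(t)).

def d_f(a, b):
    """Exact value of sup_{x in [f<=b]} dist(x, [f<=a]) for f(x) = x^2."""
    return math.sqrt(b) - math.sqrt(a)

def kappa_f(t):
    """kappa_f(t) = sup{ 1/|grad f|(x) : f(x) = t } = 1/(2 sqrt(t))."""
    return 1.0 / (2.0 * math.sqrt(t))

def integral_kappa(a, b, n=100000):
    """Midpoint-rule approximation of the bound  int_a^b kappa_f(t) dt."""
    h = (b - a) / n
    return h * sum(kappa_f(a + (i + 0.5) * h) for i in range(n))

a, b = 1.0, 4.0
# For this f the estimate of Proposition 19.11 holds with equality:
# int_1^4 dt/(2 sqrt(t)) = sqrt(4) - sqrt(1) = 1 = d_f(1, 4).
print(d_f(a, b), integral_kappa(a, b))
```

For this particular f the inequality of the proposition is an equality, which shows that the estimate cannot be improved in general.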

Returning to the case of a convex function on a Banach space, we first state the following elementary fact that serves as a bridge between the general and convex situations.

Proposition 19.12.

Let f be a convex function on a Banach space X, and let \(x \in \mathrm{ dom}\,f\). Then

$$\displaystyle{\vert \nabla f\vert (x) =\sup _{\|h\|\leq 1}(-f^{\prime}(x;h)) = \mathrm{d}(0,\partial f(x)).}$$

Proof.

Clearly | ∇ f | (x) = 0 if and only if f′(x;h) ≥ 0 for all h, in which case the supremum is zero (attained at h = 0). If \(\|h\| = 1\) and u = x + th with t > 0, then \(t =\| x - u\|\), so the equality \(-f^{\prime}(x;h) =\lim _{t\rightarrow +0}{t}^{-1}(f(x) - f(x + th))\) implies − f′(x; h) ≤ | ∇ f | (x). On the other hand, as \(f^{\prime}(x;h) \leq {t}^{-1}(f(x + th) - f(x))\) for all t and h, for a given u ≠ x we get \({\|u - x\|}^{-1}(f(x) - f(u)) \leq -f^{\prime}(x;h)\) if we set \(t =\| u - x\|\) and \(h = {t}^{-1}(u - x)\); hence \(\vert \nabla f\vert (x) \leq \sup _{\|h\|\leq 1}(-f^{\prime}(x;h))\). The second equality follows, since f′(x;⋅) is the support function of \(\partial f(x)\), from the standard duality formula for the distance from a point to a closed convex set.
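As a quick one-dimensional illustration of Proposition 19.12 (ours, not part of the original text), one can check numerically that the slope of f(x) = |x| coincides with the distance from zero to the subdifferential:

```python
# Numerical sanity check (our illustration) of Proposition 19.12 for
# f(x) = |x| on the real line: the slope
#   |grad f|(x) = limsup_{u -> x} (f(x) - f(u))^+ / |x - u|
# should equal dist(0, subdifferential of f at x).

def f(x):
    return abs(x)

def slope(x, radius=1e-3, n=2000):
    """Approximate |grad f|(x) by sampling points u on both sides of x."""
    best = 0.0
    for i in range(1, n + 1):
        for sgn in (-1.0, 1.0):
            u = x + sgn * radius * i / n
            best = max(best, (f(x) - f(u)) / abs(x - u))
    return best

def dist_zero_subdiff(x):
    """d(0, ∂f(x)) for f = |.|: ∂f(x) = {sign(x)} if x != 0, [-1, 1] if x = 0."""
    return 0.0 if x == 0 else 1.0

for x in (0.0, 0.3, -2.0):
    print(x, slope(x), dist_zero_subdiff(x))
```

At the minimizer x = 0 both quantities vanish (as 0 ∈ ∂f(0)); away from it both equal 1.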

Proposition 19.13.

Let f be a convex function on a Banach space X. Assume that \([f \leq \alpha ]\neq \varnothing\) . Let β > α. Then for any γ ∈ (α,β)

$$\displaystyle{K_{f}(\alpha,\beta ) \geq K_{f}(\gamma,\beta ).}$$

Proof.

We may assume that \(K_{f}(\alpha,\beta ) <\infty\) and K f (γ, β) > 0 (which by definition means that \([f>\gamma ] \cap \mathrm{ dom}\,f\neq \varnothing\)). Take an \(x \in [f>\gamma ] \cap \mathrm{ dom}\,f\) and a K > K f (α, β) and find a u ∈ [f ≤ α] such that \(\|x - u\| \leq K(f(x)-\alpha )\). As earlier, we may assume that f(u) = α. As α < γ < f(x), there is a t ∈ (0,1) such that f(w) = γ for w = tu + (1 − t)x. By convexity t(f(x) − α) ≤ f(x) − γ. We therefore have

$$\displaystyle{\|x - w\| = t\|x - u\| \leq \| x - u\|\frac{f(x)-\gamma } {f(x)-\alpha } \leq K(f(x)-\gamma ).}$$

This is true for all \(x \in [f>\gamma ] \cap \mathrm{ dom}\,f\) and all K > K f (α, β), whence the result.

Combining Theorem 19.9 and Propositions 19.12 and 19.13 we get still another proof of the first equality in Theorem 19.6.

19.3.3 Comments

Following the pioneering 1952 work by Hoffman [18], error bounds, both for nonconvex and, especially, convex functions, have been intensively studied, particularly during the last two or three decades, both theoretically, in connection with metric regularity, and in view of their role in numerical analysis; see, e.g., [12, 17, 29, 31, 34, 39, 40, 44, 47, 48]. A finite-dimensional version of (19.6) was proved by Lewis and Pang [29]. The equality can actually be extended to reflexive spaces (see Azé-Corvellec [1]). The equality \(K_{f}{(\alpha )}^{-1} =\inf \{ d(0,\partial f(x)):\; x \in [f>\alpha ]\}\) in Theorem 19.6 was proved by Zalinescu [45] (see also [46], Proposition 3.10.8, and for earlier results [11]). The first two equalities in the theorem can be found in [1, 2]. Theorem 19.9 and Proposition 19.13 were proved by Azé and Corvellec in [1]. These papers also contain fairly thorough bibliographic comments.

19.4 Convex Set-Valued Mappings

Let X and Y be Banach spaces and F: X ⇉ Y. We shall say that F is a convex mapping if its graph Graph F is a convex set. In this section we shall mainly discuss the regularity problems for convex mappings.

19.4.1 Theorem of Robinson–Ursescu

The standard statement of the Robinson-Ursescu theorem reads: Let X and Y be Banach spaces, and let F: X ⇉ Y be a set-valued mapping with convex and locally closed graph. Assume that the image of X under F has a nonempty interior. Then F is regular at every \((\bar{x},\bar{y}) \in \mathrm{ Graph}\,F\) such that \(\overline{y} \in \mathrm{ int}\,F(X)\).

This theorem can be rightfully viewed as an extension of the Banach-Schauder open mapping theorem. Moreover, the original proofs of the theorem followed the pattern of the classical proof of the open mapping theorem: first the Baire category theorem is applied to show that under the assumptions \(\overline{y}\) belongs to the interior of the closure of the F-image of some ball around \(\overline{x}\), and then basically the same iteration scheme as in the classical Banach proof is applied to show that the closure operation can be dropped and \(\overline{y}\) belongs to the interior of the F-image (of the same ball) itself. Later it became clear that (as in many other results of the regularity theory) instead of the iteration procedure the variational principle of Ekeland can be used at the second stage of the proof. The latter is the basic fact behind the proof of the general regularity criterion of Theorem 19.3 quoted in the previous section.

The following elementary and short argument shows that the conclusion of the second part of the proof of the Robinson-Ursescu theorem, even in a more precise quantitative form, is a simple consequence of the general regularity criterion of Theorem 19.3. The only use of convexity in this argument is connected with the following obvious observation.

Proposition 19.14.

Let \(F: X \rightrightarrows Y\) be a set-valued mapping with convex graph. If α > 0 and β > 0 are such that F(B(x,α)) is dense in B(y,β) and y ∈ F(x), then for any λ ∈ (0,1) the F-image of B(x,λα) is dense in B(y,λβ).

Passing to the proof of the Robinson-Ursescu theorem, we first find α > 0 and β > 0 such that \(B(\overline{y},\beta ) \subset \mathrm{ cl}\,F(B(\overline{x},\alpha ))\) (whose existence as we have mentioned is proved through a standard application of the Baire category theorem) and then fix an \(\varepsilon> 0\) such that \(4\varepsilon <\min \{\alpha,\beta \}\). Let x, y, v satisfy \(\|x -\overline{x}\| <\varepsilon,\;y \in F(x)\), \(\|v -\overline{y}\| <\varepsilon,\;\|v - y\| <\varepsilon\). Then

$$\displaystyle{B(y,\beta -2\varepsilon ) \subset B(\overline{y},\beta ) \subset \mathrm{ cl}\,F(B(\overline{x},\alpha )) \subset \mathrm{ cl}\,F(B(x,\alpha +\varepsilon )).}$$

Setting \(\xi = (\alpha +\varepsilon )/(\beta -2\varepsilon )\) we get from Proposition 19.14 that \(B(y,t) \subset \mathrm{ cl}\,F(B(x,\xi t))\) if, e.g., \(t \in (0,\varepsilon )\). Let z = λ v + (1 − λ)y for some λ ∈ (0, 1). Then \(t =\| z - y\| <\varepsilon\) and there is a u with \(\|u - x\| \leq \xi t\) such that z ∈ F(u). We have

$$\displaystyle{d_{\xi }((x,y),(u,z)) =\max \{\| x - u\|,\xi \|y - z\|\} =\xi \| y - z\|}$$

and therefore \(\|v - z\| =\| v - y\| -\| z - y\| \leq \| v - y\| - rd_{\xi }((x,y),(u,z))\) if r < ξ  − 1. A reference to Theorem 19.3 completes the proof. Moreover, we see that

$$\displaystyle{\mathrm{sur}\,F(\overline{x}\vert \overline{y}) \geq \beta /\alpha.}$$

19.4.2 Regularity Moduli of Convex Multifunctions

The next question we shall discuss in this section is how to compute regularity moduli of convex multifunctions. As immediately follows from the arguments concluding the previous subsection (with \(\varepsilon \rightarrow 0\)), \(\mathrm{sur}\,F(\overline{x}\vert \overline{y}) \geq \beta /\alpha\). A slight elaboration of this result gives a more precise conclusion.

Proposition 19.15.

Let F: X ⇉ Y have a convex and locally closed graph. Then

$$\displaystyle{\begin{array}{lcl} \mathrm{sur}\,F(\overline{x}\vert \overline{y})& =&\lim _{\varepsilon \rightarrow 0}\sup \{r \geq 0:\; B(\overline{y},r\varepsilon ) \subset \mathrm{ cl}\,F(B(\overline{x},\varepsilon ))\} \\ & =&\lim _{\varepsilon \rightarrow 0}{\varepsilon }^{-1}\sup \{t \geq 0:\; B(\overline{y},t) \subset \mathrm{ cl}\,F(B(\overline{x},\varepsilon ))\}.\end{array} }$$

Proof.

The second equality is obvious (just take \(r\varepsilon = t\)). The first equality is trivial if \(\overline{y}\) does not belong to \(\mathrm{int}\,\mathrm{cl}\,[F(B(\overline{x},\varepsilon ))]\). For the case when \(\overline{y}\) lies in the interior of \(\mathrm{cl}\,F(B(\overline{x},\varepsilon ))\), the inequality \(\mathrm{sur}\,F(\overline{x}\vert \overline{y}) \geq \sup \{ r \geq 0:\; B(\overline{y},r\varepsilon ) \subset \mathrm{ cl}\,F(B(\overline{x},\varepsilon ))\}\) follows from the proof following Proposition 19.14: just take \(\beta = r\varepsilon\) and \(\alpha =\varepsilon\). And the opposite inequality is immediate from the definition of the modulus of surjection.

Although the formula can hardly be recommended for practical computation of surjection moduli, it brings a substantial simplification compared to the general case, as there is no longer a need to verify similar inclusions for other points of the graph close to \((\bar{x},\bar{y})\). A duality-based working formula for the surjection modulus is offered by the following theorem.

Theorem 19.16.

Let \(F: X \rightrightarrows Y\) be a set-valued mapping with convex and locally closed graph. If \(\overline{y} \in F(\overline{x})\) , then

$$\displaystyle{\mathrm{sur}\,F(\overline{x}\vert \overline{y}) =\lim _{\varepsilon \rightarrow +0}\ \inf _{\|{y}^{{\ast}}\|=1}\ \inf _{{x}^{{\ast}}}\Big(\|{x}^{{\ast}}\| + \frac{1} {\varepsilon } S_{\mathrm{Graph}\,F-(\bar{x},\bar{y})}({x}^{{\ast}},{y}^{{\ast}})\Big).}$$

Proof.

To begin with we observe the following. Let Q ⊂ Y be a closed convex set and \(\overline{y} \in Q\). Then \(B(\overline{y},r) \subset Q\) if and only if \(\sup \{\langle {y}^{{\ast}},y -\overline{y}\rangle:\; y \in Q\} \geq r\) for any y  ∗  with \(\|{y}^{{\ast}}\| = 1\). It follows that the lower bound of the supremum over the unit sphere in Y  ∗  coincides with the upper bound of r ≥ 0 such that \(B(\overline{y},r) \subset Q\).

We have furthermore

$$\displaystyle{ \begin{array}{l} \sup \{r \geq 0:\; B(\overline{y},r) \subset \mathrm{ cl}\,F(B(\overline{x},\varepsilon ))\} \\ \qquad \qquad \qquad =\inf _{\|{y}^{{\ast}}\|=1}\sup \{\langle {y}^{{\ast}},y -\overline{y}\rangle:\; y \in F(\overline{x} + h),\;\|h\| \leq \varepsilon \} \\ \qquad \qquad \qquad =\inf _{\|{y}^{{\ast}}\|=1}\sup \{\langle {y}^{{\ast}},v\rangle:\; (h,v) \in \mathrm{ Graph}\,F - (\bar{x},\bar{y}),\;\|h\| \leq \varepsilon \} \\ \qquad \qquad \qquad =\inf _{\|{y}^{{\ast}}\|=1}{(\mathrm{Ind}_{\mathrm{Graph}\,F-(\bar{x},\bar{y})} +\mathrm{ Ind}_{\varepsilon B\times Y })}^{{\ast}}(0,{y}^{{\ast}}). \end{array} }$$
(19.11)

As \((0,0) \in (\mathrm{Graph}\,F - (\bar{x},\bar{y})) \cap \mathrm{ int}\,(\varepsilon B \times Y )\), it follows from the standard duality between summation and infimal convolution:

$$\displaystyle{\begin{array}{l} {(\mathrm{Ind}_{\mathrm{Graph}\,F-(\bar{x},\bar{y})} +\mathrm{ Ind}_{\varepsilon B\times Y })}^{{\ast}}(0,{y}^{{\ast}}) \\ \qquad \qquad \qquad =\inf _{({x}^{{\ast}},{v}^{{\ast}})}\{S_{\mathrm{Graph}\,F-(\bar{x},\bar{y})}({x}^{{\ast}},{v}^{{\ast}}) + S_{\varepsilon B\times Y }(-{x}^{{\ast}},{y}^{{\ast}}- {v}^{{\ast}})\} \\ \qquad \qquad \qquad =\inf _{{x}^{{\ast}}}\{S_{\mathrm{Graph}\,F-(\bar{x},\bar{y})}({x}^{{\ast}},{y}^{{\ast}}) +\varepsilon \| {x}^{{\ast}}\|\} \\ \qquad \qquad \qquad =\varepsilon \inf _{{x}^{{\ast}}}\{\|{x}^{{\ast}}\| {+\varepsilon }^{-1}S_{\mathrm{ Graph}\,F-(\bar{x},\bar{y})}({x}^{{\ast}},{y}^{{\ast}})\}. \end{array} }$$

Together with (19.11) and Proposition 19.15 this completes the proof.
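In the simplest single-valued linear case F(x) = Cx the formula of Theorem 19.16 becomes completely explicit: \(\mathrm{Graph}\,F - (\bar{x},\bar{y})\) is the subspace {(h, Ch)}, whose support function vanishes when \({x}^{{\ast}} + {C}^{{\ast}}{y}^{{\ast}} = 0\) and is +∞ otherwise, so the theorem yields \(\mathrm{sur}\,F =\inf \{\| {C}^{{\ast}}{y}^{{\ast}}\|:\;\| {y}^{{\ast}}\| = 1\}\), which in the Euclidean setting is the smallest singular value of C. The sketch below (our addition, with a hypothetical 2 × 2 matrix) confirms this numerically:

```python
import math

# Our illustration of Theorem 19.16 in the linear case F(x) = Cx on R^2:
# sur F = inf{ |C^T y*| : |y*| = 1 } = smallest singular value of C.
C = [[3.0, 0.0],
     [1.0, 2.0]]  # hypothetical example matrix

def ct_y(y):
    """C^T y for the 2x2 matrix C."""
    return (C[0][0] * y[0] + C[1][0] * y[1],
            C[0][1] * y[0] + C[1][1] * y[1])

def sur_linear(n=100000):
    """Minimize |C^T y| over the unit circle by brute-force sampling."""
    best = float("inf")
    for i in range(n):
        t = 2.0 * math.pi * i / n
        v = ct_y((math.cos(t), math.sin(t)))
        best = min(best, math.hypot(v[0], v[1]))
    return best

def smallest_singular_value():
    """Closed form for a 2x2 matrix via the eigenvalues of C^T C."""
    a = C[0][0] ** 2 + C[1][0] ** 2
    b = C[0][0] * C[0][1] + C[1][0] * C[1][1]
    c = C[0][1] ** 2 + C[1][1] ** 2
    lam_min = (a + c - math.sqrt((a - c) ** 2 + 4.0 * b * b)) / 2.0
    return math.sqrt(lam_min)

print(sur_linear(), smallest_singular_value())
```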

19.4.3 Systems of Convex Inequalities

We now consider the system of relations

$$\displaystyle{ \varphi _{t}(x) \leq b_{t},\quad t \in T, }$$
(19.12)

where x ∈ X, X is a Banach space, T is a set of an arbitrary nature, and for any t, \(\varphi _{t}\) is a proper closed convex function on X and \(b_{t} \in \mathbb{R}\). Set b = (b t ) and let \(\mathcal{S}(b)\) be the set of solutions of (19.12). Clearly, \(\mathcal{S}(b)\) is a closed convex set (possibly empty). A natural question is about Lipschitz stability of the set-valued mapping \(\mathcal{S}\) with respect to small perturbations of \(b\) near some nominal value \(\overline{b}\).

Although we impose no a priori restrictions on elements of \(\overline{b}\), there is no loss of generality in assuming that \(\overline{b} = 0\). Otherwise, we can consider, instead of \(\varphi _{t}\), the functions \(\varphi _{t} -\overline{b}_{t}\). As perturbations of the right-hand side we shall consider arbitrary uniformly bounded real-valued functions on T, that is, elements of the space \(\ell_{\infty }(T)\) with the standard uniform norm. As follows from the equivalence theorem, Lipschitz stability of solutions of (19.12) with \(\overline{b} = 0\) is guaranteed by regularity at \((\overline{x},0)\) of the following set-valued mapping from X into \(\ell_{\infty }(T)\):

$$\displaystyle{F(x) =\{ a = (a_{t}) \in \ell_{\infty }(T):\; a_{t} \geq \varphi _{t}(x),\;\forall \ t \in T\}}$$

and

$$\displaystyle{\mathrm{lip}\,\mathcal{S}(0;\overline{x}) = {(\mathrm{sur}\,F(\overline{x}\vert 0))}^{-1}.}$$

Set

$$\displaystyle{\Phi (x) =\sup _{t\in T}(\varphi _{t}(x) -\overline{b}_{t}).}$$

Clearly, \(\Phi (\overline{x}) \leq 0\).

Theorem 19.17.

Let \(\overline{x}\) be a solution of (19.12) with \(b = \overline{b} = 0\) . Then either \(\mathrm{sur}\,F(\overline{x}\vert \overline{b}) = \infty\) or \(\Phi (\overline{x}) = 0\), \(\partial \Phi (\overline{x})\neq \varnothing\), and

$$\displaystyle{\mathrm{sur}\,F(\overline{x}\vert \overline{b}) = \mathrm{d}(0,\partial \Phi (\overline{x})).}$$

Thus the theorem effectively says that Lipschitz stability of the solution map \(\mathcal{S}\) at \((0,\overline{x})\) is equivalent to Lipschitz stability of the solution set of the single convex inequality

$$\displaystyle{\Phi (x) =\sup _{t\in T}\varphi _{t}(x) \leq \alpha }$$

at \((0,\overline{x})\) with the same Lipschitz modulus equal to \({[\mathrm{d}(0,\partial \Phi (\overline{x}))]}^{-1}\).
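To make this concrete, here is a small numerical check (our illustration, not from the original text) for the simplest system of two affine inequalities \(x_{1} \leq 0,\ x_{2} \leq 0\) in \({\mathbb{R}}^{2}\), where \(\Phi (x) =\max \{ x_{1},x_{2}\}\), \(\partial \Phi (0) = \mathrm{conv}\{(1,0),(0,1)\}\) and \(\mathrm{d}(0,\partial \Phi (0)) = 1/\sqrt{2}\):

```python
import math, random

# Our illustration of the error bound attached to Theorem 19.17 in R^2 with
# Phi(x) = max(x1, x2) (two affine inequalities x1 <= 0, x2 <= 0, b-bar = 0).
# [Phi <= 0] is the nonpositive quadrant, subdifferential of Phi at 0 is
# conv{(1,0), (0,1)}, d(0, dPhi(0)) = 1/sqrt(2), so the predicted constant
# is K = sqrt(2).

def Phi(x):
    return max(x)

def dist_to_solution_set(x):
    """Distance from x to {x : x1 <= 0, x2 <= 0} (coordinatewise projection)."""
    return math.hypot(max(x[0], 0.0), max(x[1], 0.0))

K = math.sqrt(2.0)  # = 1 / d(0, dPhi(0))

random.seed(0)
for _ in range(1000):
    x = (random.uniform(-2, 2), random.uniform(-2, 2))
    assert dist_to_solution_set(x) <= K * max(Phi(x), 0.0) + 1e-12

# The bound is tight along the diagonal: at x = (1, 1) both sides equal sqrt(2).
print(dist_to_solution_set((1.0, 1.0)), K * max(Phi((1.0, 1.0)), 0.0))
```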

Applying the theorem to the simplest case when T is a singleton, that is, when we deal with one convex function f and \(f(\overline{x}) = \overline{\alpha }\), we conclude (again by virtue of the equivalence theorem) that

$$\displaystyle{\mathrm{d}(x,[f \leq \alpha ]) \leq K{(f(x)-\alpha )}^{+}}$$

for all x and α close to \(\overline{x}\) and \(\overline{\alpha }\), respectively, with \(K = {(\mathrm{d}(0,\partial f(\overline{x})))}^{-1}\), provided \(\partial f(\overline{x})\neq \varnothing\). (Note that regularity of f in this sense is a stronger property than the existence of a local error bound at the level \(\overline{\alpha }\).) We can now proceed with the proof of the theorem.

Proof.

So we assume in the proof that \(\overline{b} = 0\). We may also harmlessly assume that \(\varphi _{t}\) are uniformly bounded from below (otherwise we can replace \(\varphi _{t}\), say by \(\max \{\varphi _{t},-1\}\)).

  1.

    The cone \(\mathcal{K} =\{ a \in \ell_{\infty }:\; a_{t} \geq 0,\;\forall \;t \in T\}\) defines the standard order in \(\ell_{\infty }(T)\). The dual cone \({\mathcal{K}}^{{\ast}}\) consists of all \({p}^{{\ast}}\in {(\ell_{\infty })}^{{\ast}}\) such that \(\langle {p}^{{\ast}},a\rangle\) ≥ 0 if a t  ≥ 0 for all t. We shall simply write p  ∗  ≥ 0 for elements of \({\mathcal{K}}^{{\ast}}\). For any p  ∗  ≥ 0, we define the function

    $$\displaystyle{({p}^{{\ast}}\circ F)(x) =\inf \{\langle {p}^{{\ast}},a\rangle:\; a \in F(x)\}}$$

    (clearly, the infimum is \(-\infty\) if \({p}^{{\ast}}\not\in {\mathcal{K}}^{{\ast}}\)). This function is obviously convex. We claim that for any \({x}^{{\ast}}\) the function p  ∗ ↦(p  ∗  ∘ F) ∗ (x  ∗ ) on \({(\ell_{\infty })}^{{\ast}}\) is convex and weak  ∗  lower semicontinuous on its domain. Indeed, convexity follows from the obvious inequality:

    $$\displaystyle{\begin{array}{l} \sup _{x}\big(\langle {x}^{{\ast}},x\rangle - ((\alpha p_{ 1}^{{\ast}} + (1-\alpha )p_{ 2}^{{\ast}}) \circ F)(x)\big) \\ \qquad \qquad =\sup _{x}\big(\alpha (\langle {x}^{{\ast}},x\rangle - (p_{ 1}^{{\ast}}\circ F)(x)) + (1-\alpha )(\langle {x}^{{\ast}},x\rangle - (p_{ 2}^{{\ast}}\circ F)(x))\big) \\ \qquad \qquad \leq \alpha \sup _{x}(\langle {x}^{{\ast}},x\rangle - (p_{ 1}^{{\ast}}\circ F)(x)) + (1-\alpha )\sup _{ x}(\langle {x}^{{\ast}},x\rangle - (p_{ 2}^{{\ast}}\circ F)(x)).\end{array} }$$

    On the other hand, if \(a \in \ell_{\infty }\), then \({p}^{{\ast}}\mapsto \langle {p}^{{\ast}},a\rangle\) is linear and weak ∗ -continuous. It follows that for any x  ∗  the function \({p}^{{\ast}}\mapsto {({p}^{{\ast}}\circ F)}^{{\ast}}({x}^{{\ast}})\) is the upper envelope of the affine and weak ∗ -continuous functions \(\langle {x}^{{\ast}},x\rangle -\langle {p}^{{\ast}},a\rangle\) corresponding to (x, a) ∈ Graph F, hence weak ∗  lower semicontinuous.

  2.

    Set \({P}^{{\ast}} =\{ {p}^{{\ast}}\geq 0,\;\|{p}^{{\ast}}\| = 1\}\). We shall show next that

    $$\displaystyle{ \Phi (x) =\sup _{{p}^{{\ast}}\in {P}^{{\ast}}}({p}^{{\ast}}\circ F)(x);\qquad {\Phi }^{{\ast}}({x}^{{\ast}}) =\inf _{{ p}^{{\ast}}\in {P}^{{\ast}}}{({p}^{{\ast}}\circ F)}^{{\ast}}({x}^{{\ast}}). }$$
    (19.13)

    Indeed, the inequality \(({p}^{{\ast}}\circ F)(x) \leq \Phi (x)\) for \({p}^{{\ast}}\in {P}^{{\ast}}\) is obvious (for a nonnegative p  ∗  of norm one we have \(\langle {p}^{{\ast}},a\rangle \leq \sup _{t}a_{t}\)). The opposite inequality follows from the fact that \((\delta _{t} \circ F)(x) =\varphi _{t}(x)\), where δ t is the “Dirac measure” at t: \(\langle \delta _{t},a\rangle = a_{t}\). This proves the first equality.

    As P  ∗  is a convex and weak ∗ -compact set, it follows, in view of the minimax theorem of Sion [38], that

    $$\displaystyle{\begin{array}{lcl} {\Phi }^{{\ast}}({x}^{{\ast}})& =&\sup _{x}\big(\langle {x}^{{\ast}},x\rangle -\sup _{{p}^{{\ast}}\in {P}^{{\ast}}}({p}^{{\ast}}\circ F)(x)\big) \\ & =&\sup _{x}\inf _{{p}^{{\ast}}\in {P}^{{\ast}}}(\langle {x}^{{\ast}},x\rangle - ({p}^{{\ast}}\circ F)(x)) \\ & =&\inf _{{p}^{{\ast}}\in {P}^{{\ast}}}\sup _{x}(\langle {x}^{{\ast}},x\rangle - ({p}^{{\ast}}\circ F)(x)) \\ & =&\inf _{{p}^{{\ast}}\in {P}^{{\ast}}}{({p}^{{\ast}}\circ F)}^{{\ast}}({x}^{{\ast}}).\end{array} }$$

    As the function \({p}^{{\ast}}\mapsto {({p}^{{\ast}}\circ F)}^{{\ast}}({x}^{{\ast}})\) is weak ∗  l.s.c., it follows that the infimum in the last expression is attained, so that \({\Phi }^{{\ast}}({x}^{{\ast}}) = {({p}^{{\ast}}\circ F)}^{{\ast}}({x}^{{\ast}})\) for some p  ∗  ∈ P  ∗ .

  3.

    We have for \({x}^{{\ast}}\in {X}^{{\ast}},\ {p}^{{\ast}}\in {(\ell_{\infty })}^{{\ast}}\)

    $$\displaystyle{S_{\mathrm{Graph}\,F-(\overline{x},0)}({x}^{{\ast}},-{p}^{{\ast}}) =\sup \{\langle {x}^{{\ast}},x\rangle -\langle {p}^{{\ast}},a\rangle:\; a_{ t} \geq \varphi _{t}(x + \overline{x}),\;\forall \;t \in T\}.}$$

    If \(S_{\mathrm{Graph}\,F-(\overline{x},0)}({x}^{{\ast}},-{p}^{{\ast}}) <\infty\), then necessarily p  ∗  ≥ 0 and

    $$\displaystyle{S_{\mathrm{Graph}\,F-(\overline{x},0)}({x}^{{\ast}},-{p}^{{\ast}}) =\sup _{ x}\big(\langle {x}^{{\ast}},x\rangle - ({p}^{{\ast}}\circ F)(x + \overline{x})\big) = {({p}^{{\ast}}\circ F)}^{{\ast}}({x}^{{\ast}}) -\langle {x}^{{\ast}},\overline{x}\rangle.}$$

    Thus Theorem 19.16 along with (19.13) and the last equality gives

    $$\displaystyle{\mathrm{sur}\,F(\overline{x}\vert 0) =\lim _{\varepsilon \rightarrow 0}\inf _{{x}^{{\ast}}}\Big(\|{x}^{{\ast}}\| + \frac{1} {\varepsilon } ({\Phi }^{{\ast}}({x}^{{\ast}}) -\langle {x}^{{\ast}},\overline{x}\rangle )\Big).}$$

    If \(\mathrm{sur}\,F(\overline{x}\vert 0) = r <\infty\), then for any \(\varepsilon> 0\) the infimum is attained at a certain \({x}^{{\ast}}(\varepsilon )\) with \(\|{x}^{{\ast}}(\varepsilon )\| \leq r\) (indeed, \({\Phi }^{{\ast}}({x}^{{\ast}}) -\langle {x}^{{\ast}},\overline{x}\rangle \geq -\Phi (\overline{x}) \geq 0\) and the function in the parentheses is weak ∗  lower semicontinuous and nondecreasing as \(\varepsilon \rightarrow 0\)).

    Let x  ∗  be a weak ∗  limit point of \(({x}^{{\ast}}(\varepsilon ))\) as \(\varepsilon \rightarrow 0\). Then necessarily \({\Phi }^{{\ast}}({x}^{{\ast}}) -\langle {x}^{{\ast}},\overline{x}\rangle \leq 0\) which (as \(\Phi (\overline{x}) \leq 0\)) may happen only if \(\Phi (\overline{x}) = 0\) and \({x}^{{\ast}}\in \partial \Phi (\overline{x})\). On the other hand, if \({x}^{{\ast}}\in \partial \Phi (\overline{x})\) and \(\Phi (\overline{x}) = 0\), we get

    $$\displaystyle{\mathrm{sur}\,F(\overline{x}\vert 0) =\inf \{\| {x}^{{\ast}}\|:\; {x}^{{\ast}}\in \partial \Phi (\overline{x})\}}$$

    and the proof is completed.

Remark 19.18.

It is to be emphasized that at no point in the proof has the representation of elements of \({(\ell_{\infty }(T))}^{{\ast}}\) by finitely additive measures been needed.

19.4.4 Perfect Regularity and Linear Perturbations

As follows from the formula for the modulus of surjection in Theorem 19.16, the value of the modulus is fully determined by the restriction of the support function of \(\mathrm{Graph}\,F - (\bar{x},\bar{y})\) to the set on which it is smaller than \(\varepsilon> 0\), no matter how small this \(\varepsilon\) is. But in general we cannot replace such sets by the zero level set of the support function (which, as is easy to see, is precisely the normal cone to \(\mathrm{Graph}\,F\) at \((\bar{x},\bar{y})\)).

Example 19.19.

Let X = Y = L 2[0, 1] and let \(F: X \rightrightarrows Y\) be defined by F(x) = x + K, where K is the cone of nonnegative functions. Let \(\overline{x}(t) \equiv -1\) and \(\overline{y}(t) \equiv 0\). Clearly, \(\overline{y} \in F(\overline{x})\). Direct calculation gives

$$\displaystyle{S_{\mathrm{Graph}\,F-(\bar{x},\bar{y})}({x}^{{\ast}},{y}^{{\ast}}) = \left \{\begin{array}{cl} \int _{0}^{1}\vert {y}^{{\ast}}(t)\vert \mathrm{d}t,&\mathrm{if}\;{x}^{{\ast}} + {y}^{{\ast}} = 0,\ {y}^{{\ast}}(t) \leq 0\;\mathrm{a.e.} \\ \infty, &\mathrm{otherwise}. \end{array} \right.}$$

As the infimum of the L 1-norm on the unit sphere of L 2 is zero, it follows that \(\mathrm{sur}\,F(\overline{x}\vert \overline{y}) = 1\).

On the other hand, the zero level set of the support function contains only the zero element, so by the standard convention (\(\inf _{\varnothing } = \infty\)) we conclude that, restricting the infimum in the formula of Theorem 19.16 to this set (which does not meet the unit sphere), we get \(\infty\), not 1.
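The non-attainment phenomenon behind Example 19.19 is easy to see on concrete functions. The following sketch (our illustration) evaluates, in closed form, the step functions \(y_{n}^{{\ast}} = -n\,\mathbf{1}_{[0,1/{n}^{2}]}\), which have unit L 2-norm while their L 1-norm 1∕n tends to zero:

```python
import math

# Our numerical companion to Example 19.19: on L2[0,1] the infimum of the
# L1-norm over the unit sphere of L2 is zero but is not attained.
# Take y_n = -n on [0, 1/n^2] and 0 elsewhere; then ||y_n||_2 = 1 while
# ||y_n||_1 = 1/n -> 0, so the infimum in the formula of Theorem 19.16 is
# approached but never attained by a nonzero normal vector.

def l2_norm(n):
    # ||y_n||_2 = sqrt(n^2 * (1/n^2)) = 1
    return math.sqrt(n * n * (1.0 / (n * n)))

def l1_norm(n):
    # ||y_n||_1 = n * (1/n^2) = 1/n
    return n * (1.0 / (n * n))

for n in (1, 10, 100, 1000):
    print(n, l2_norm(n), l1_norm(n))
```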

Definition 19.20.

We shall say that F is perfectly regular at \((\bar{x},\bar{y})\) if

$$\displaystyle{ \mathrm{sur}\,F(\overline{x}\vert \overline{y}) =\inf \{\| {x}^{{\ast}}\|:\; ({x}^{{\ast}},{y}^{{\ast}}) \in N(\mathrm{Graph}\,F,(\bar{x},\bar{y})),\;\|{y}^{{\ast}}\| = 1\}. }$$
(19.14)


An example of an infinite-dimensional perfectly regular mapping is the mapping F associated with the system of convex inequalities (19.12) in the concluding part of the previous subsection (see [23]).

It is possible to give a primal characterization of perfect regularity. Recall first that a convex process is a set-valued mapping whose graph is a convex cone, and note that the surjection modulus of a convex process \(A: X \rightrightarrows Y\) coincides with \(\sup \{r \geq 0:\; rB_{Y } \subset A(B_{X})\}\). The latter is implicitly contained in [28]. It is a simple consequence of the fact that the inclusion x + K ⊂ K holds for any point x of a convex cone K. Indeed, it follows that, given a convex process A, the inclusion \(rB_{Y } \subset A(B_{X})\) implies that for any \((x,y) \in \mathrm{ Graph}\,A\) we have \(y + rB_{Y } \subset A(x + B_{X})\).

If F is a convex set-valued mapping and \((\bar{x},\bar{y}) \in \mathrm{ Graph}\,F\), then the set-valued mapping \(DF(\bar{x},\bar{y})\) whose graph is \(T(\mathrm{Graph}\,F,(\bar{x},\bar{y}))\) is a convex process. It is clear that the tangent cone \(T(\mathrm{Graph}\,F,(\bar{x},\bar{y}))\) to Graph F at \((\bar{x},\bar{y})\) contains \(\mathrm{Graph}\,F - (\bar{x},\bar{y})\). Therefore

$$\displaystyle\begin{array}{rcl} \mathrm{sur}\,F(\overline{x}\vert \overline{y})& \leq & \sup \{r \geq 0:\; B(\overline{y},tr) \subset F(B(\overline{x},t))\} \\ & \leq & \sup \{r \geq 0:\; rB_{Y } \subset DF(\bar{x},\bar{y})(B_{X})\} =\mathrm{ sur}\,DF(\bar{x},\bar{y})(0\vert 0).\qquad \quad {}\end{array}$$
(19.15)

On the other hand the support function of \(T(\mathrm{Graph}\,F,(\bar{x},\bar{y}))\) is precisely the indicator of \(N(\mathrm{Graph}\,F,(\bar{x},\bar{y}))\) and therefore, by Theorem 19.16, the right-hand side of the equality (19.14) in Definition 19.20 is the modulus of surjection of \(DF(\bar{x},\bar{y})\) at \((0,0)\). Thus

Proposition 19.21.

A convex mapping F is perfectly regular at \((\bar{x},\bar{y}) \in \mathrm{ Graph}\,F\) if and only if the surjection moduli of F at \((\bar{x},\bar{y})\) and of its derivative \(DF(\bar{x},\bar{y})\) at the origin coincide.

The following two propositions offer some sufficient conditions for perfect regularity.

Proposition 19.22 ([25], Proposition 5). 

Let \(F: X \rightrightarrows Y\) be a set-valued mapping with convex and locally closed graph. Suppose there is a weak-star closed convex subset Q ∗  of the unit sphere in Y ∗  such that for some \((\bar{x},\bar{y}) \in \mathrm{ Graph}\,F\)

$$\displaystyle{S_{\mathrm{Graph}\,F-(\bar{x},\bar{y})}({x}^{{\ast}},{y}^{{\ast}}) <\infty \;\mbox{ and }\;\|{y}^{{\ast}}\| = 1\quad \Rightarrow \quad {y}^{{\ast}}\in {Q}^{{\ast}}.}$$

Then F is perfectly regular at \((\bar{x},\bar{y})\) .

The simplest situation when the conditions of the last proposition are satisfied occurs when Y is a space of continuous functions over a compact set T, Q  ∗  is the set of probability measures on T, and K, the cone of nonnegative elements of Y, is contained in F(0).

The second proposition is an easy consequence of Theorem 19.16.

Proposition 19.23.

Let F be as above and \((\bar{x},\bar{y}) \in \mathrm{ Graph}\,F\) . For any \(\varepsilon> 0\) set

$$\displaystyle{\mathcal{L}_{\varepsilon } =\{ ({x}^{{\ast}},{y}^{{\ast}}):\; S_{\mathrm{ Graph}\,F-(\bar{x},\bar{y})}({x}^{{\ast}},{y}^{{\ast}}) \leq \varepsilon,\;\|{x}^{{\ast}}\|\leq 1,\;\|{y}^{{\ast}}\|\leq 1\}.}$$

If the excess of \(\mathcal{L}_{\varepsilon }\) over \(N(\mathrm{Graph}\,F,(\bar{x},\bar{y}))\)

$$\displaystyle{\mathrm{ex}(\mathcal{L}_{\varepsilon },N(\mathrm{Graph}\,F,(\bar{x},\bar{y}))) =\sup \{ \mathrm{d}(({x}^{{\ast}},{y}^{{\ast}}),N(\mathrm{Graph}\,F,(\bar{x},\bar{y}))):\; ({x}^{{\ast}},{y}^{{\ast}}) \in \mathcal{L}_{\varepsilon }\}}$$

goes to zero when \(\varepsilon \rightarrow 0\) , then \(F\) is perfectly regular at \((\bar{x},\bar{y})\) .

Note that the condition of the proposition is automatically satisfied if both X and Y are finite dimensional.

Our main interest in this subsection is the effect of linear perturbations of F on regularity moduli. Specifically we shall consider mappings F + A with A being a linear bounded operator from X into Y. We have (setting y = v + Ax)

$$\displaystyle{\begin{array}{lcl} S_{\mathrm{Graph}\,(F+A)-(\overline{x},\overline{y}+A\overline{x})}({x}^{{\ast}},{y}^{{\ast}})& =&\sup \{\langle {x}^{{\ast}},x -\overline{x}\rangle +\langle {y}^{{\ast}},y - (\overline{y} + A\overline{x})\rangle: \\ & &y \in F(x) + Ax\} \\ & =&\sup \{\langle {x}^{{\ast}} + {A}^{{\ast}}{y}^{{\ast}},x -\overline{x}\rangle +\langle {y}^{{\ast}},v -\overline{y}\rangle:\; v \in F(x)\} \\ & =&S_{\mathrm{Graph}\,F-(\bar{x},\bar{y})}({x}^{{\ast}} + {A}^{{\ast}}{y}^{{\ast}},{y}^{{\ast}}).\end{array} }$$

Theorem 19.16 now immediately gives

Proposition 19.24.

Let \(F: X \rightrightarrows Y\) be a set-valued mapping with convex closed graph, let \((\bar{x},\bar{y}) \in \mathrm{ Graph}\,F\) , and let A: X → Y be a bounded linear operator. Then

$$\displaystyle{\mathrm{sur}\,(F + A)(\overline{x}\vert \overline{y} + A\overline{x}) =\lim _{\varepsilon \rightarrow 0}\inf _{\|{y}^{{\ast}}\|=1}\inf _{{x}^{{\ast}}}(\|{x}^{{\ast}}- {A}^{{\ast}}{y}^{{\ast}}\| + \frac{1} {\varepsilon } S_{\mathrm{Graph}\,F-(\bar{x},\bar{y})}({x}^{{\ast}},{y}^{{\ast}}))}$$

and consequently

$$\displaystyle{\mathrm{sur}\,(F + A)(\overline{x}\vert \overline{y} + A\overline{x}) \geq \mathrm{ sur}\,F(\overline{x}\vert \overline{y}) -\| A\|.}$$

This is a version of Milyutin’s perturbation theorem [14, 16, 20] for the specific case of a convex mapping and a linear perturbation. A natural question that arises in connection with the last result is about the minimal norm of a linear perturbation which destroys regularity.

Definition 19.25 ([15]). 

The radius of regularity of F: X ⇉ Y at \((\bar{x},\bar{y}) \in \mathrm{ Graph}\,F\) is the lower bound of norms of linear bounded operators A: X → Y such that \(\mathrm{sur}\,(F + A)(\overline{x}\vert \overline{y} + A\overline{x}) = 0\). We shall denote it \(\mathrm{rad}\,F(\overline{x}\vert \overline{y})\).

Theorem 19.26.

Let \(F: X \rightrightarrows Y\) be a set-valued mapping with convex closed graph, and let \((\bar{x},\bar{y}) \in \mathrm{ Graph}\,F\) . Suppose that F is perfectly regular at \((\overline{x},\overline{y})\) . Then

$$\displaystyle{\mathrm{rad}\,F(\overline{x}\vert \overline{y}) =\mathrm{ sur}\,F(\overline{x}\vert \overline{y}).}$$

Note that the condition of the theorem is satisfied under the assumptions of Propositions 19.22 and 19.23.

Proof.

It follows from Proposition 19.24 that \(\mathrm{rad}\,F(\overline{x}\vert \overline{y}) \geq \mathrm{ sur}\,F(\overline{x}\vert \overline{y})\), so we have to prove the opposite inequality. Set \(r:=\mathrm{ sur}\,F(\overline{x}\vert \overline{y})\). The theorem is obviously valid if r = 0. So we assume that r > 0. As F is perfectly regular at \((\bar{x},\bar{y})\), for any \(\varepsilon> 0\) there are \((x_{\varepsilon }^{{\ast}},y_{\varepsilon }^{{\ast}}) \in N(\mathrm{Graph}\,F,(\bar{x},\bar{y}))\) such that \(\|y_{\varepsilon }^{{\ast}}\| = 1,\;\|x_{\varepsilon }^{{\ast}}\|\leq (1+\varepsilon )r\). In particular we have

$$\displaystyle{ S_{\mathrm{Graph}\,F-(\bar{x},\bar{y})}(x_{\varepsilon }^{{\ast}},y_{\varepsilon }^{{\ast}}) = 0 }$$
(19.16)

Let further \(x_{\varepsilon } \in X\) and \(y_{\varepsilon } \in Y\) satisfy

$$\displaystyle{\|x_{\varepsilon }\| =\| y_{\varepsilon }\| = 1,\quad \langle x_{\varepsilon }^{{\ast}},x_{\varepsilon }\rangle \geq (1-\varepsilon )\|x_{\varepsilon }^{{\ast}}\|,\quad \langle y_{\varepsilon }^{{\ast}},y_{\varepsilon }\rangle \geq (1-\varepsilon ).}$$

We use these four vectors to define an operator \(A_{\varepsilon }: X \rightarrow Y\) as follows:

$$\displaystyle{A_{\varepsilon }x = \frac{\langle x_{\varepsilon }^{{\ast}},x\rangle } {\langle y_{\varepsilon }^{{\ast}},y_{\varepsilon }\rangle }y_{\varepsilon }.}$$

Then \(\|A_{\varepsilon }\| \leq \dfrac{1+\varepsilon } {1-\varepsilon }r\) and

$$\displaystyle{A_{\varepsilon }^{{\ast}}{y}^{{\ast}} = \frac{\langle {y}^{{\ast}},y_{\varepsilon }\rangle } {\langle y_{\varepsilon }^{{\ast}},y_{\varepsilon }\rangle }x_{\varepsilon }^{{\ast}}.}$$

In particular we see that \(x_{\varepsilon }^{{\ast}} = A_{\varepsilon }^{{\ast}}y_{\varepsilon }^{{\ast}}\). Combining this with Proposition 19.24 and (19.16) we get \(\mathrm{sur}\,(F + A_{\varepsilon })(\overline{x}\vert \overline{y} + A_{\varepsilon }\overline{x}) = 0\), that is, \(\mathrm{rad}\,F(\overline{x}\vert \overline{y}) \leq \| A_{\varepsilon }\| \rightarrow r\) as \(\varepsilon \rightarrow 0\).
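A finite-dimensional illustration (ours, not part of the original text) of Theorem 19.26 and of the Milyutin-type bound of Proposition 19.24: for the linear mapping F(x) = Cx with C = diag(3, 1) on \({\mathbb{R}}^{2}\), sur F equals the smallest singular value 1 of C, and the rank-one perturbation A = diag(0, −1) of norm exactly 1 destroys regularity, just as in the Eckart–Young theorem:

```python
import math

# Our finite-dimensional illustration of Theorem 19.26 (and of Proposition
# 19.24) for the linear mapping F(x) = Cx with C = diag(3, 1) on R^2.
# Here sur F = smallest singular value of C = 1, and the perturbation
# A = diag(0, -1) of norm 1 makes C + A singular: rad F = sur F = 1.

def singular_values_2x2(m):
    """Singular values of a 2x2 matrix via the eigenvalues of M^T M."""
    a = m[0][0] ** 2 + m[1][0] ** 2
    b = m[0][0] * m[0][1] + m[1][0] * m[1][1]
    c = m[0][1] ** 2 + m[1][1] ** 2
    disc = math.sqrt((a - c) ** 2 + 4.0 * b * b)
    return (math.sqrt((a + c + disc) / 2.0),
            math.sqrt(max((a + c - disc) / 2.0, 0.0)))

C = [[3.0, 0.0], [0.0, 1.0]]
sur_F = singular_values_2x2(C)[1]                 # = 1.0

A = [[0.0, 0.0], [0.0, -1.0]]                     # operator norm 1
norm_A = singular_values_2x2(A)[0]                # = 1.0
C_plus_A = [[3.0, 0.0], [0.0, 0.0]]
sur_perturbed = singular_values_2x2(C_plus_A)[1]  # = 0.0: regularity destroyed

# Milyutin-type bound of Proposition 19.24: sur(F + A') >= sur F - ||A'||.
A2 = [[0.1, 0.0], [0.0, 0.2]]
C_plus_A2 = [[3.1, 0.0], [0.0, 1.2]]
assert singular_values_2x2(C_plus_A2)[1] >= sur_F - singular_values_2x2(A2)[0]

print(sur_F, norm_A, sur_perturbed)
```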

Remark 19.27.

  1.

    The perfect regularity condition is not necessary for the equality of the radius of regularity and the modulus of surjection. It can be easily verified that the equality holds in Example 19.19: just take A to be minus identity.

  2.

    For mappings (even nonconvex) between finite-dimensional spaces the equality holds [15]. It is also known that the equality may fail already for single-valued Lipschitz mappings from a Hilbert space into itself [21]. It would be interesting to find an example of a convex mapping for which the equality does not hold.

A natural related problem concerns stability of solutions of the inclusion

$$\displaystyle{ y \in F(x) + Ax }$$
(19.17)

with respect to small variations of both y and A around some nominal values \(\overline{y}\) and \(\overline{A}\), given a nominal solution \(\overline{x}\) corresponding to \(\overline{y}\) and \(\overline{A}\). This question moves us beyond the realm of convex problems (as (x, A)↦Ax is not a convex mapping).

Let S(y, A) denote the set of solutions of (19.17). Our goal is to find the Lipschitz modulus of S at the nominal point. By the equivalence theorem, all we need is to find the modulus of surjection of the inverse mapping

$$\displaystyle{\Phi (x) =\{ (y,A) \in Y \times \mathcal{L}(X,Y ):\; y \in F(x) + Ax\}.}$$

Here \(\mathcal{L}(X,Y )\) is the space of linear bounded operators from X into Y with the standard operator norm. To correctly state the question we need to fix some norm in \(Y \times \mathcal{L}(X,Y )\) (assuming the norms in X and Y are given). To this end, we take a norm \(\nu\) in \({\mathbb{R}}^{2}\) and set

$$\displaystyle{\|(y,A)\| =\nu (\|y\|,\|A\|).}$$

We assume for convenience that \(c \cdot \max \{\| y\|,\|A\|\} \geq \| (y,A)\| \geq \max \{\| y\|,\|A\|\}\) for some c > 1. By ν  ∗  we denote the dual norm on \({\mathbb{R}}^{2}\).
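As a quick numerical illustration (my own, not from the text): the dual norm \({\nu }^{{\ast}}(u_{1},u_{2}) =\sup \{ u_{1}v_{1} + u_{2}v_{2}:\;\nu (v_{1},v_{2}) \leq 1\}\) can be approximated by a grid search over the \(\nu\)-unit ball. For \(\nu =\max\) the dual norm is the \(\ell^{1}\)-norm, so \({\nu }^{{\ast}}(1,s) = 1 + s\); this is the quantity that appears below with \(s =\| \overline{x}\|\).

```python
import itertools

def dual_norm(u, nu, n=400):
    # Approximate nu*(u) = sup{<u, v> : nu(v) <= 1} by grid search over
    # the square [-1, 1]^2, which contains the nu-unit ball because nu
    # dominates the max-norm (the standing assumption ||(y,A)|| >= max{||y||, ||A||}).
    pts = [-1 + 2 * i / n for i in range(n + 1)]
    return max(u[0] * v1 + u[1] * v2
               for v1, v2 in itertools.product(pts, pts)
               if nu(v1, v2) <= 1.0)

nu_max = lambda a, b: max(abs(a), abs(b))  # nu = max-norm on R^2
s = 0.75                                   # plays the role of ||x_bar||
print(dual_norm((1.0, s), nu_max))         # close to 1 + s = 1.75
```

The grid contains the extreme point (1, 1) of the max-norm ball, so for this particular \(\nu\) the approximation is exact.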

Theorem 19.28.

Let F: X ⇉ Y be a set-valued mapping with closed graph. Let \(\overline{A} \in \mathcal{L}(X,Y )\) and \((\bar{x},\bar{y}) \in \mathrm{ Graph}\,(F + \overline{A})\) be given. Then

$$\displaystyle{ \mathrm{sur}\,\Phi (\overline{x}\vert (\overline{y},\overline{A})) \geq \frac{1} {{\nu }^{{\ast}}(1,\|\overline{x}\|)}\,\mathrm{sur}\,(F + \overline{A})(\overline{x}\vert \overline{y} + \overline{A}\overline{x}). }$$
(19.18)

The equality holds if F is a convex mapping which is perfectly regular at \((\bar{x},\bar{y})\) . Therefore

$$\displaystyle{\mathrm{lip}\,S((\overline{y},\overline{A})\vert \overline{x}) \leq {\nu }^{{\ast}}(1,\|\overline{x}\|)\,\mathrm{reg}\,(F + \overline{A})(\overline{x}\vert \overline{y} + \overline{A}\overline{x})}$$

with the equality if F has the property specified above.

Proof.

With no loss of generality we may assume in the proof that \(\overline{y} = 0\) and \(\overline{A} = 0\). Set \(r =\mathrm{ sur}\,F(\overline{x}\vert 0)\).

  1.

    The inequality (19.18) automatically holds if r = 0, so we assume r > 0. Take a positive ρ < r. To prove the statement it will be sufficient to show that there is a δ > 0 such that whenever

    $$\displaystyle{ \|x -\overline{x}\| <\delta,\;\|y\| <\delta,\;\|A\| <\delta,\;(y,A) \in \Phi (x),\;t \in (0,\delta ), }$$
    (19.19)

    we have

    $$\displaystyle{ B\big((y,A), \frac{\rho } {{\nu }^{{\ast}}(1,\|\overline{x}\|)}\,t\big) \subset \Phi (B(x,t)). }$$
    (19.20)

    The definition of the modulus of surjection along with Milyutin’s theorem implies that there is an \(\varepsilon> 0\) such that for x,  y,  t, and A satisfying (19.19) with δ replaced by \(\varepsilon\) the inclusion

    $$\displaystyle{ B(y,\rho t) \subset (F + A)(B(x,t)) }$$
    (19.21)

    holds. Take a δ > 0 satisfying

    $$\displaystyle{\delta <\frac{\varepsilon } {2},\quad \delta (1 +\delta +\|\overline{x}\|) <\varepsilon.}$$

    Let x,  y,  t, and A satisfy (19.19) and

    $$\displaystyle{ (y^{\prime},A^{\prime}) \in B\big((y,A), \frac{\rho } {{\nu }^{{\ast}}(1,\|\overline{x}\|)}\,t\big). }$$
    (19.22)

    We have \(y \in F(x) + Ax = F(x) + A^{\prime}(x) + (A - A^{\prime})x\), that is,

    $$\displaystyle{ y - (A - A^{\prime})x \in F(x) + A^{\prime}x }$$
    (19.23)

    and

    $$\displaystyle{\|y - (A - A^{\prime})x\| \leq \| y\| +\| A - A^{\prime}\|\|x\| \leq \delta +\delta (\|\overline{x}\|+\delta ) <\varepsilon.}$$

    On the other hand, by (19.22)

    $$\displaystyle{\begin{array}{lcl} \|y^{\prime} - (y - (A - A^{\prime})x)\|& \leq &\|y^{\prime} - y\| +\| A^{\prime} - A\|\|x\|\\ & \leq &{\nu }^{{\ast}}(1,\|x\|)\,\|(y^{\prime} - y,A^{\prime} - A)\| \leq \rho t. \end{array} }$$

    By (19.21) and (19.23) there must be a u such that \(\|u - x\| <t\) and y′ ∈ F(u) + A′ u which means that \((y^{\prime},A^{\prime}) \in \Phi (u)\) and (19.20) follows. This completes the proof of (19.18).

  2.

    To prove the equality in the case of a convex F which is perfectly regular at \((\overline{x},0)\), note first that for any Ψ: X ⇉ Y with \((\bar{x},\bar{y}) \in \mathrm{ Graph}\,\Psi\)

    $$\displaystyle{ \mathrm{sur}\,\Psi (\overline{x}\vert \overline{y}) \leq \sup \{ r \geq 0:\; B(\overline{y},tr) \subset \Psi (B(\overline{x},t))\} }$$
    (19.24)

    for all sufficiently small t ≥ 0. This is immediate from the definition.

    Consider the operator \(\Lambda : Y \times \mathcal{L}(X,Y ) \rightarrow Y\) defined by \(\Lambda (y,A) = y - A\overline{x}\). We claim that

    $$\displaystyle{ \mathrm{ex}\big(\Lambda (\Phi (x) \cap tB_{Y \times \mathcal{L}(X,Y )}),F(x)\big) \leq ct\|x -\overline{x}\|. }$$
    (19.25)

    Here ex(Q, P) stands for the excess of Q over P:

    $$\displaystyle{\mathrm{ex}(Q,P) =\sup _{u\in Q}d(u,P).}$$

    Indeed, let (y, A) ∈ Φ(x) and \(\|(y,A)\| \leq t\). Then y − Ax ∈ F(x), that is, \(\Lambda (y,A) \in F(x) +\| A\|\|x -\overline{x}\|B_{Y }\). This means that \(d(\Lambda (y,A),F(x)) \leq \| A\|\|x -\overline{x}\|\) and (19.25) follows.

    We observe furthermore that \(F(x) \subset DF(\overline{x},0)(x -\overline{x})\) (as the graph of F is convex), so (19.25) implies that

    $$\displaystyle{ \mathrm{ex}\big(\Lambda (\Phi (x) \cap tB_{Y \times \mathcal{L}(X,Y )}),(DF(\overline{x},0)(x -\overline{x}))\big) \leq ct\|x -\overline{x}\|. }$$
    (19.26)

    This along with (19.24) (applied to Ψ = Λ ∘ Φ) and (19.15) implies that

    $$\displaystyle{ \mathrm{sur}\,(\Lambda \circ \Phi )(\overline{x}\vert 0) \leq \mathrm{ sur}\,DF(\overline{x},0). }$$
    (19.27)

    The inequality

    $$\displaystyle{ \mathrm{sur}\,(\Lambda \circ \Phi )(\overline{x}\vert 0) \geq \mathrm{ sur}\,\Lambda (0,0) \cdot \mathrm{ sur}\,\Phi (\overline{x}\vert (0,0)) }$$
    (19.28)

    is straightforward. Finally, as Λ is a linear bounded operator, its surjection modulus is the same at any point and

    $$\displaystyle{ \mathrm{sur}\,\Lambda =\inf \{\| {\Lambda }^{{\ast}}{y}^{{\ast}}\|:\;\| {y}^{{\ast}}\| = 1\}. }$$
    (19.29)

    We have

    $$\displaystyle{\langle {y}^{{\ast}},y - A\overline{x}\rangle =\langle {y}^{{\ast}},y\rangle -\langle {y}^{{\ast}}\otimes \overline{x},A\rangle,}$$

    so \({\Lambda }^{{\ast}}({y}^{{\ast}}) = ({y}^{{\ast}},-{y}^{{\ast}}\otimes \overline{x})\) and

    $$\displaystyle{\|{\Lambda }^{{\ast}}{y}^{{\ast}}\| =\sup \{\langle {y}^{{\ast}},y\rangle -\langle {y}^{{\ast}}\otimes \overline{x},A\rangle:\;\nu (\|y\|,\|A\|) \leq 1\} ={\nu }^{{\ast}}(\|{y}^{{\ast}}\|,\|{y}^{{\ast}}\otimes \overline{x}\|)}$$

    and therefore (as \(\|{y}^{{\ast}}\otimes \overline{x}\| =\| {y}^{{\ast}}\|\|\overline{x}\|\)) \(\mathrm{sur}\,\Lambda ={\nu }^{{\ast}}(1,\|\overline{x}\|)\). Combining this with (19.27), (19.28), and (19.29) and taking into account that F is perfectly regular at \((\overline{x},0)\), we complete the proof.

19.4.5 Comments

For the original proofs of the Robinson-Ursescu theorem see [36, 42]. In both publications this was, as we have mentioned, a purely qualitative result. But the first estimate (an upper estimate for the modulus of regularity) for convex set-valued mappings was obtained by Robinson even earlier in [35] for so-called linear constraint systems, using the Hörmander homogenization transform, which associates with any convex set Q ⊂ X the cone in \(X \times \mathbb{R}\) generated by the set \(Q \times \{ 1\}\). Robinson worked with the norm \(\max \{\|x\|,\vert t\vert \}\) in \(X \times \mathbb{R}\). In [25] we showed that the lower bound (over \(\varepsilon\)) of Robinson-type formulas corresponding to the norms \(\max \{\|x\|,\varepsilon \vert t\vert \}\) in \(X \times \mathbb{R}\) gives the exact value of the modulus of metric regularity. It was actually the first corollary of Theorem 19.16, also proved in [25]. Here the proof of the theorem has been substantially simplified. Systems of convex and linear inequalities have been thoroughly studied by Canovas et al. in [48]. My interest in the subject has been stimulated by some of their works. Theorem 19.17 was originally proved in [23]. The concept of perfect regularity was introduced in [25], along with Theorem 19.26. Its predecessor for arbitrary maps can be found in [23], but there it was assumed that F + A is perfectly regular at \((\overline{x},\overline{y} + A\overline{x})\) for any A. Theorem 19.28 is also a new result, as mentioned in the introduction. An earlier result of this sort was established in the recent paper [6] for systems of convex inequalities in \({\mathbb{R}}^{n}\) in which every inequality was independently perturbed by linear functions. Observe that the set-valued mappings associated with systems of convex inequalities are perfectly regular at all points of their graphs [23].

19.5 First-Order Necessary Conditions in Mathematical Programming

We start with the principal lemma.

Lemma 19.29 (Lemma on convex majorant). 

Let f be a function on a Banach space X which is Fréchet differentiable at a certain \(\overline{x}\) . Then there is an \(\varepsilon> 0\) and a convex continuous function \(\varphi\) on X with the following properties:

  (a)

    \(\varphi (\overline{x}) = f(\overline{x})\) and \(\varphi (x) \geq f(x)\) if \(\|x -\overline{x}\| <\varepsilon\) .

  (b)

    \(\varphi\) is strictly differentiable at \(\overline{x}\) and \(\varphi ^{\prime}(\overline{x}) = f^{\prime}(\overline{x})\) .

  (c)

    \(0 \leq \varphi (x) -\varphi (\overline{x}) -\langle \varphi ^{\prime}(\overline{x}),x -\overline{x}\rangle \leq 2\|x -\overline{x}\|\) for all x.

Proof.

Let \({x}^{{\ast}} = f^{\prime}(\overline{x})\). Then \(\vert f(x) - f(\overline{x}) -\langle {x}^{{\ast}},x -\overline{x}\rangle \vert \leq r(\|x -\overline{x}\|)\), where r(λ) ≥ 0 and λ  − 1 r(λ) → 0 as λ → 0. Without loss of generality we may assume that r is non-decreasing. As r(λ) = o(λ), there is an \(\varepsilon> 0\) such that \(r(\lambda ) \leq \lambda\) if \(\lambda \leq \varepsilon\). Set

$$\displaystyle{g(\xi ) =\sup _{0<\lambda \leq \varepsilon }r(\lambda )\frac{2\xi -\lambda } {\lambda }.}$$

Then g is a convex l.s.c. function on \(\mathbb{R}\) as the upper envelope of a family of affine functions. We have g(ξ) ≤ 2ξ for all ξ ≥ 0. Furthermore, g(ξ) = 0 for ξ ≤ 0 and g(ξ) > 0 if ξ > 0. Thus g is convex continuous on [0, ∞). It is also clear that g is strictly increasing on [0, ∞).

We notice next that for any ξ the function under the sign of supremum is nonnegative if and only if λ ≤ 2ξ, so for any ξ we can take supremum over (0, 2ξ]. It follows that

$$\displaystyle{0 \leq g(\xi ) \leq 2\xi \sup _{0<\lambda \leq 2\xi }\frac{r(\lambda )} {\lambda } = o(\xi ),\quad \mathrm{as}\;\xi \rightarrow 0.}$$

Define \(\varphi\) by

$$\displaystyle{\varphi (x) = f(\overline{x}) +\langle {x}^{{\ast}},x -\overline{x}\rangle + g(\|x -\overline{x}\|).}$$

Then (a) and (c) are obvious, as are the differentiability of \(\varphi\) at \(\overline{x}\) and the equality of the derivatives of \(\varphi\) and f at \(\overline{x}\). Further, for ξ ≥ ξ ′ ≥ 0 we have

$$\displaystyle{g(\xi ) - g(\xi ^{\prime}) \leq \sup _{0<\lambda \leq 2\xi }\big[r(\lambda )\frac{2\xi -\lambda } {\lambda } - r(\lambda )\frac{2\xi ^{\prime}-\lambda } {\lambda } \big] \leq 2\sup _{0<\lambda \leq 2\xi }\frac{r(\lambda )} {\lambda } (\xi -\xi ^{\prime})}$$

which immediately implies (b).
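A quick numerical check of the construction (my own illustration, with the hypothetical remainder r(λ) = λ², for which r(λ) ≤ λ on (0, ε] with ε = 1): maximizing λ(2ξ − λ) over λ gives λ = ξ, so here g(ξ) = ξ² for 0 < ξ ≤ ε.

```python
def g(xi, r=lambda lam: lam * lam, eps=1.0, n=2000):
    # g(xi) = sup_{0 < lam <= eps} r(lam) * (2*xi - lam) / lam,
    # approximated on a grid; for xi <= 0 the supremum equals 0.
    lams = (eps * i / n for i in range(1, n + 1))
    return max(0.0, max(r(lam) * (2 * xi - lam) / lam for lam in lams))

for xi in (0.1, 0.25, 0.5):
    assert abs(g(xi) - xi * xi) < 1e-3   # g(xi) = xi^2 for this choice of r
    assert g(xi) <= 2 * xi               # the bound behind property (c)
```

The choice r(λ) = λ² is mine, made only so that the supremum can be evaluated in closed form and compared against the grid value.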

Corollary 19.30.

Let T be a set and for each t ∈ T let f t be a function on X Fréchet differentiable at \(\overline{x}\) . Moreover, we assume that the f t are uniformly differentiable in the sense that there is an r(⋅) common to all f t . Then there are functions \(\varphi _{t}\) (t ∈ T) and an \(\varepsilon> 0\) such that

  (a)

    For any t the function \(\varphi _{t}\) is convex continuous on X, Fréchet differentiable at \(\overline{x}\) , and such that \(\varphi _{t}(\overline{x}) = f_{t}(\overline{x})\), \(\varphi _{t}^{\prime}(\overline{x}) = f_{t}^{\prime}(\overline{x})\) and \(\varphi _{t}(x) \geq f_{t}(x)\) if \(\|x -\overline{x}\| <\varepsilon\) .

  (b)

    \(\varphi _{t}\) are uniformly strictly differentiable at \(\overline{x}\) , that is,

    $$\displaystyle{\lim _{\delta \rightarrow 0}\sup \{\|x - x^{\prime}\|^{-1}\vert \varphi _{t}(x) -\varphi _{t}(x^{\prime}) -\langle \varphi _{t}^{\prime}(\overline{x}),x - x^{\prime}\rangle \vert :\; t \in T,\;x,x^{\prime} \in B(\overline{x},\delta ),\;x\neq x^{\prime}\} = 0.}$$
  (c)

    The inequality \(0 \leq \varphi _{t}(x) -\varphi _{t}(\overline{x}) -\langle \varphi _{t}^{\prime}(\overline{x}),x -\overline{x}\rangle \leq 2\|x -\overline{x}\|\) holds for all x ∈ X and t ∈ T.

Proof.

Just set \(\varphi _{t}(x) = f_{t}(\overline{x}) +\langle f_{t}^{\prime}(\overline{x}),x -\overline{x}\rangle + g(\|x -\overline{x}\|)\) with the same g as in the lemma.

An immediate consequence of the lemma is that deriving first-order necessary optimality conditions becomes practically trivial when the cost function and the inequality constraint functions are Fréchet differentiable at the solution. Indeed, consider the problem:

(P 1):

minimize f 0(x) s.t.   \(f_{t}(x) \leq 0,\;t \in T;\;x \in Q\).

Here T is an arbitrary set, finite or infinite. The nature of the constraint x ∈ Q is not essential for the time being.

If f 0 and all f t are Fréchet differentiable at \(\overline{x}\), then we denote by \(\varphi _{0}\) and \(\varphi _{t}\) convex functions obtained from f 0 and f t using Lemma 19.29. Set further

$$\displaystyle{\varphi (x) =\sup _{t\in T}\varphi _{t}(x).}$$

Set \(T_{\varepsilon } =\{ t \in T:\;\varphi _{t}(\overline{x}) \geq -\varepsilon \} =\{ t \in T:\; f_{t}(\overline{x}) \geq -\varepsilon \}\). We assume that

(A 2):

  there is an \(\varepsilon> 0\) such that the set \(\{f_{t}^{\prime}(\overline{x}):\; t \in T_{\varepsilon }\}\) is norm bounded in X  ∗ .

By Corollary 19.30 \(\varphi\) is a convex continuous function satisfying for some K > 0

$$\displaystyle{ \varphi (x) \leq \varphi (\overline{x}) + K\|x -\overline{x}\|,\quad \forall x \in X. }$$
(19.30)

For the subdifferential of \(\varphi\) at \(\overline{x}\) we have the formula (see, e.g., [30, 43])

$$\displaystyle{ \partial \varphi (\overline{x}) =\bigcap _{\varepsilon>0}\mathrm{{cl}\,}^{{\ast}}\mathrm{conv}\,\Big(\bigcup _{ t\in T_{\varepsilon }}\partial \varphi _{t}(\overline{x})\Big) =\bigcap _{\varepsilon>0}\mathrm{{cl}\,}^{{\ast}}\mathrm{conv}\,\Big(\bigcup _{ t\in T_{\varepsilon }}\{f^{\prime}_{t}(\overline{x})\}\Big). }$$
(19.31)

Under additional conditions it is possible to guarantee the equality

$$\displaystyle{\partial \varphi (\overline{x}) =\mathrm{{ cl}\,}^{{\ast}}\mathrm{conv}\,\Big(\bigcup _{ t\in T_{0}}f^{\prime}_{t}(\overline{x})\Big),}$$

for instance, if

(QC 1):

T is a compact Hausdorff space and the function \(t\mapsto \varphi _{t}(x)\) is upper semicontinuous for every x in a neighborhood of \(\overline{x},\)

or

(QC 2):

there is an \(\varepsilon> 0\) such that the set \(\mathrm{conv}\,\{(\varphi _{t}^{\prime}(\overline{x}),\varphi _{t}(\overline{x})):\; t \in T_{\varepsilon }\}\) is weak ∗  closed.

Both conditions are well known and their verification does not present any difficulty. Moreover, in both cases it is possible to give precise representations for elements of \(\partial \varphi (\overline{x})\): if (QC 1) holds and X is a separable Banach space, then for any \({x}^{{\ast}}\in \partial \varphi (\overline{x})\) there is a probability measure μ on T 0 such that

$$\displaystyle{ {x}^{{\ast}} =\int _{ T_{0}}\varphi _{t}^{\prime}(\overline{x})\mathrm{d}\mu (t) }$$
(19.32)

(see, e.g., [9, 24]). The same is obviously true if (QC 2) holds, but in this case we can consider only measures μ supported on finite sets, so that the necessary conditions assume the form: for any \({x}^{{\ast}}\in \partial \varphi (\overline{x})\) there are \(t_{i} \in T_{0},\;i = 1,\ldots,k\) and positive numbers \(\alpha _{i}\) with ∑ α i  = 1 such that

$$\displaystyle{ {x}^{{\ast}} =\alpha _{ 1}\varphi _{t_{1}}^{\prime}(\overline{x}) + \cdots +\alpha _{k}\varphi _{t_{k}}^{\prime}(\overline{x}). }$$
(19.33)
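As a toy finite illustration of this representation (my own, with hypothetical data): take T = {1, 2} with \(f_{1}(x) = (x_{1} - 1)^{2} + x_{2}^{2}\) and \(f_{2}(x) = (x_{1} + 1)^{2} + x_{2}^{2}\). Then \(\varphi =\max (f_{1},f_{2})\) attains its unconstrained minimum at \(\overline{x} = (0,0)\), both indices are active (T 0  = T), and 0 ∈ ∂φ(x̄) is recovered as a convex combination of the active gradients with α 1  = α 2  = 1∕2.

```python
f1 = lambda x1, x2: (x1 - 1) ** 2 + x2 ** 2
f2 = lambda x1, x2: (x1 + 1) ** 2 + x2 ** 2
phi = lambda x1, x2: max(f1(x1, x2), f2(x1, x2))

# Gradients of the two active functions at x_bar = (0, 0).
grads = [(-2.0, 0.0), (2.0, 0.0)]
alphas = [0.5, 0.5]                      # positive weights summing to 1
x_star = tuple(sum(a * gr[i] for a, gr in zip(alphas, grads)) for i in (0, 1))
assert x_star == (0.0, 0.0)              # 0 lies in conv{f_t'(x_bar): t in T_0}

# Grid check that (0, 0) indeed minimizes phi.
pts = [-1 + 2 * i / 100 for i in range(101)]
assert all(phi(u, v) >= phi(0.0, 0.0) for u in pts for v in pts)
```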

Let us return to the problem (P 1) assuming that \(\overline{x}\) is a solution. It is straightforward to see that \(\overline{x}\) also solves

(P 2):

minimize \(\varphi _{0}(x)\) s.t.  \(\varphi (x) \leq 0,\;x \in Q\).

Thus a problem with infinitely many inequality constraints reduces to a simple problem with one inequality constraint and both cost and constraint functions convex continuous. Further analysis depends of course on the structure of the constraint x ∈ Q. The simplest case occurs when Q is defined by the condition F(x) = 0 with F: X → Y strictly differentiable at \(\overline{x}\) and the image of \(F^{\prime}(\overline{x})\) (i.e., \(F^{\prime}(\overline{x})(X)\)) being a subspace of finite codimension. In this case (P 2) is a standard problem for which the Lagrange multiplier rule holds: there are λ 0 ≥ 0,  λ ≥ 0, and a y  ∗  ∈ Y  ∗ , not all equal to zero and such that

$$\displaystyle{ 0 \in \lambda _{0}\varphi _{0}^{\prime}(\overline{x}) +\lambda \partial \varphi (\overline{x}) + {(F^{\prime}(\overline{x}))}^{{\ast}}{y}^{{\ast}}. }$$
(19.34)

The standard constraint qualification condition, namely that \(F^{\prime}(\overline{x})\) is surjective (that is to say, \(F^{\prime}(\overline{x})(X) = Y\)) and that there is an h ∈ X such that \(F^{\prime}(\overline{x})h = 0\) and \(\varphi (\overline{x} + h) <0\), guarantees that λ 0 > 0. In view of (19.31) the “Slater” part of this condition simply means that there are \(\varepsilon> 0\) and δ > 0 such that \(\varphi _{t}(\overline{x} + h) \leq -\delta\) for all \(t \in T_{\varepsilon }\). We can further specify the necessary condition (19.34) under (QC 1), (QC 2), or alike. There is also no difficulty in expressing all these conclusions in terms of the original problem (P 1). The standard constraint qualification condition now gives precisely the “perturbed Mangasarian-Fromovitz qualification condition” (PMFQC) of [32].

The conclusions of the last paragraph contain (with the exception of one theorem) all main results of [32].Footnote 4 But we can go a step further and easily get necessary conditions for more general types of the constraint x ∈ Q. The following proposition is straightforward.

Proposition 19.31.

Let f 0 and all f t , t ∈ T be Fréchet differentiable at \(\overline{x}\) . We assume that \(f_{0}(\overline{x}) = 0\) . Set

$$\displaystyle{\psi (x) =\max \{\varphi _{0}(x),\varphi (x)\}.}$$

If \(\overline{x}\) is a local solution of ( P 1 ), then it is a local solution of the problem

( P 3):

minimize \(\psi (x)\) s.t.   x ∈ Q.

In particular if ( A 2 ) holds then there is an N > 0 such that \(\overline{x}\) is an unconditional local minimizer of \(\psi (x) + N\mathrm{d}(x,Q)\) .

For the first statement see, e.g., [19] and for the second, e.g., Proposition 2.4.3 in [10]. Recall that \(\varphi\) is Lipschitz if \(\{f_{t}^{\prime}(\overline{x}),\;t \in T\}\) is a bounded set.
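The distance-penalty statement can be sensed on a one-dimensional toy example (my own, not from the text): ψ(x) =  − x with Lipschitz constant 1 and Q = ( − ∞, 0]. The constrained minimum is at \(\overline{x} = 0\), and for any N > 1 the penalized function ψ(x) + N d(x, Q) =  − x + N max {x, 0} has an unconditional minimum at the same point.

```python
# Toy exact-penalization check: psi(x) = -x, Q = (-inf, 0], N = 2 > 1.
psi = lambda x: -x
dist_Q = lambda x: max(x, 0.0)           # distance from x to Q = (-inf, 0]
h = lambda x: psi(x) + 2.0 * dist_Q(x)   # penalized function psi + N*d(., Q)

xs = [-2 + 4 * i / 1000 for i in range(1001)]
best = min(xs, key=h)
assert best == 0.0   # unconditional minimizer coincides with the constrained one
```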

If the constraint x ∈ Q is not convex, nonconvex subdifferentials of one sort or another become in principle necessary for further analysis. However, reasonable optimality conditions can be obtained only under the assumption that the behavior of Q near \(\overline{x}\) is sufficiently regular, which considerably simplifies the situation.

Proposition 19.32.

We assume that

  (a)

    ( A 2 ) holds

  (b)

    Q is Clarke regular at \(\overline{x}\)

If \(\overline{x}\) is a solution of ( P 1 ) then there is a λ ∈ [0,1] such that

$$\displaystyle{ 0 \in \lambda \partial \varphi _{0}(\overline{x}) + (1-\lambda )\partial \varphi (\overline{x}) + N_{DH}(Q,\overline{x}). }$$
(19.35)

If moreover the Slater-type qualification condition

( QC 3):

there is an \(h \in T_{B}(Q,\overline{x})\) such that \(\psi (\overline{x} + h) <0,\)

is satisfied, then \(\lambda> 0\) .

Proof.

The first statement follows from Proposition 19.31 and the standard calculus rules for Clarke’s generalized gradient, since ψ is globally Lipschitz by (a). The second statement follows from (b) and, again, the continuity of ψ, since (19.35) cannot then hold with λ = 0.

The specification of the result for one or another structure of the constraint x ∈ Q also does not present much difficulty. Let for instance \(Q =\{ x:\; 0 \in F(x)\}\), where \(F: X \rightrightarrows Y\).

Proposition 19.33.

Let \(F: X \rightrightarrows Y\) , and let \((\overline{x},0) \in \mathrm{ Graph}\,F\) . Assume that F is metrically regular at \((\overline{x},0)\) . If \(\varphi\) is a function on X which is Lipschitz near \(\overline{x}\) and which attains a local minimum at \(\overline{x}\) subject to 0 ∈ F(x), then there is a \(K> 0\) such that the function \(g(x,y) =\varphi (x) + K\|y\|\) attains a local minimum at \((\overline{x},0)\) subject to \((x,y) \in \mathrm{ Graph}\,F\) .

Proof.

Let \(\ell\) be the Lipschitz constant of \(\varphi\) in a neighborhood of \(\overline{x}\). Take \(K>\ell\,\mathrm{ reg}\,F(\overline{x}\vert 0)\). By the equivalence theorem, for any \((x,y) \in \mathrm{ Graph}\,F\) sufficiently close to \((\overline{x},0)\) there is a u such that 0 ∈ F(u) and \(\ell\|x - u\| \leq K\|y\|\). For such (x, y) and u, we have

$$\displaystyle{\varphi (x) + K\|y\| \geq \varphi (x) +\ell\| x - u\| \geq \varphi (u) \geq \varphi (\overline{x})}$$

as claimed.
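Again a one-dimensional sanity check (my own toy instance): F(x) = {2x}, so F  − 1 (0) = {0} and \(\mathrm{reg}\,F(0\vert 0) = 1/2\); φ(x) = |x − 1| is Lipschitz with ℓ = 1 and attains its minimum subject to 0 ∈ F(x) at \(\overline{x} = 0\). Any K > ℓ ⋅reg F = 1∕2 should then work; take K = 1.

```python
phi = lambda x: abs(x - 1.0)
# g(x, y) = phi(x) + K*|y| restricted to Graph F, where y = 2x and K = 1.
g = lambda x: phi(x) + 1.0 * abs(2.0 * x)

xs = [-2 + 4 * i / 1000 for i in range(1001)]
best = min(xs, key=g)
assert best == 0.0   # minimum over the graph sits at the constrained solution x = 0
```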

Proposition 19.34.

Let F be metrically subregular at \((\overline{x},0) \in \mathrm{ Graph}\,F\) , that is (see, e.g., [16]),

$$\displaystyle{\mathrm{d}(x,{F}^{-1}(0)) \leq K\mathrm{d}(0,F(x))}$$

for all x of a neighborhood of \(\overline{x}\) . Then

$$\displaystyle{T_{B}(Q,\overline{x}) =\{ h:\; (h,0) \in T_{B}(\mathrm{Graph}\,F,(\overline{x},0))\}.}$$

Proof.

If \(h \in T_{B}(Q,\overline{x})\), then \(0 \in F(\overline{x} + t_{n}h_{n})\) for some \(t_{n} \rightarrow +0,\ h_{n} \rightarrow h\), that is, \((\overline{x},0) + t_{n}(h_{n},0) \in \mathrm{ Graph}\,F\); hence \((h,0) \in T_{B}(\mathrm{Graph}\,F,(\overline{x},0))\). Conversely, if the last inclusion holds, then there are t n  → +0, h n  → h, and v n  → 0 such that \(t_{n}v_{n} \in F(\overline{x} + t_{n}h_{n})\). By subregularity, \(\mathrm{d}(\overline{x} + t_{n}h_{n},{F}^{-1}(0)) \leq t_{n}K\|v_{n}\|\), which means that there are u n  ∈ X with \(\|u_{n}\| \leq 2K\|v_{n}\|\) such that \(0 \in F(\overline{x} + t_{n}(h_{n} + u_{n}))\); but h n  + u n  → h, whence \(h \in T_{B}(Q,\overline{x})\).
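A minimal worked instance of the tangent-cone formula (my own illustration, not from the text): take \(X = Y = \mathbb{R}\) and F(x) = [x, ∞), for which everything can be computed by hand.

```latex
\[
\begin{aligned}
&X = Y = \mathbb{R},\qquad F(x) = [x,\infty),\qquad Q = F^{-1}(0) = (-\infty,0],\\
&\mathrm{d}\big(x,F^{-1}(0)\big) = x^{+} = \mathrm{d}\big(0,F(x)\big)
  \quad\text{(metric subregularity with } K = 1\text{)},\\
&\mathrm{Graph}\,F = \{(x,y):\ y \ge x\},\qquad
  T_{B}\big(\mathrm{Graph}\,F,(0,0)\big) = \{(h,v):\ v \ge h\},\\
&\{h:\ (h,0) \in T_{B}(\mathrm{Graph}\,F,(0,0))\} = (-\infty,0] = T_{B}(Q,0),
\end{aligned}
\]
```

in agreement with the proposition.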

Combining the last three propositions and taking into account that metric regularity implies subregularity, we get

Proposition 19.35.

Consider ( P 1 ) with Q = F −1 (0), where F: X ⇉ Y is a set-valued mapping with closed graph. Assume that

  (a)

    ( A 2 ) holds

  (b)

    F is metrically regular at \((\overline{x},0)\)

  (c)

    Graph  F is Clarke regular at \((\overline{x},0)\)

If \(\overline{x}\) is a solution of ( P 1 ) then there are λ ∈ [0,1] and y  ∗  ∈ Y  ∗  such that

$$\displaystyle{0 \in \lambda \varphi _{0}^{\prime}(\overline{x}) + (1-\lambda )\partial \varphi (\overline{x}) + D_{DH}^{{\ast}}F(\overline{x},0)({y}^{{\ast}}).}$$

If moreover the Slater-type qualification condition

( QC 3):

∃ h ∈ X such that \((h,0) \in T_{B}(\mathrm{Graph}\,F,(\overline{x},0))\) and \(\varphi (\overline{x} + h) <0,\)

is satisfied, then λ > 0.

Proof.

The first statement is a consequence of the first part of Proposition 19.32 together with Proposition 19.33, and the second is a consequence of the second parts of Propositions 19.32 and 19.34.

As above we can make the necessary condition more specific using either (19.32) or (19.33) under (QC 1) or (QC 2). Thus we finally get

Proposition 19.36.

We posit the assumptions of Proposition 19.35.

  (a)

    If ( QC 1 ) holds with X being a separable Banach space, then there are λ ∈ [0,1], y  ∗  ∈ Y  ∗ , and a probability measure μ supported on T 0 such that

    $$\displaystyle{0 \in \lambda \varphi _{0}^{\prime}(\overline{x}) + (1-\lambda )\int _{T_{0}}f_{t}^{\prime}(\overline{x})\mathrm{d}\mu (t) + D_{DH}^{{\ast}}F(\overline{x},0)({y}^{{\ast}}).}$$
  (b)

    If ( QC 2 ) holds then there are t i ∈ T 0 , i = 1,…,k, y  ∗  ∈ Y  ∗ , and nonnegative \(\lambda,\lambda _{1},\ldots,\lambda _{k}\) such that

    $$\displaystyle{0 \in \lambda f_{0}^{\prime}(\overline{x}) +\sum _{ i=1}^{k}\lambda _{ i}f_{t_{i}}^{\prime}(\overline{x}) + D_{DH}^{{\ast}}F(\overline{x},0)({y}^{{\ast}}).}$$

    In either case the condition

∃ h ∈ X such that \(\langle f_{t}^{\prime}(\overline{x}),h\rangle <0\) for all t ∈ T and \((h,0) \in T_{B}(\mathrm{Graph}\,F,(\overline{x},0)),\)

implies that necessarily λ > 0.

19.5.1 Comments

As mentioned in the Introduction, this section has been written following a discussion with B. Mordukhovich at J. Borwein’s 60th anniversary conference in Vancouver in May 2011.Footnote 5 My point was that generalized subdifferentiation is not an adequate tool for treating semi-infinite programming problems with differentiable data, and the aim of the first part of the section was to demonstrate that convex analysis offers a much more viable alternative. Another consequence of the discussion of this section is that, as far as first-order optimality conditions are concerned, semi-infinite programming with differentiable or convex inequalities and cost functions is not a particularly meaningful object to study: all results can be obtained from the standard results for problems with finitely many inequality constraints and convex subdifferential calculus. On the other hand, semi-infinite programming with non-differentiable inequality constraint functions remains rather terra incognita, and I am not aware of any results relating to first-order necessary conditions for such problems. And of course for other problems, e.g., stability of solutions (or even feasible sets), the infinite number of constraints is an additional and serious challenge (see, e.g., [48, 23, 41]).