Abstract
We study the implications of a well-known metric inequality condition on sets to the applicability of standard necessary optimality conditions for constrained optimization problems when a new constraint is added. We compare this condition with several other constraint qualification conditions in the literature, and due to its metric nature, we apply it to nonsmooth optimization problems in order to perform first a penalization and then to give optimality conditions in terms of generalized differentiability.
1 Introduction and Preliminaries
The starting point of the investigation we propose in this work is the question we briefly present below. Generally, if one considers a constrained optimization problem, then, in order to write necessary optimality conditions, one usually has to impose some constraint qualification conditions. Suppose that we add a new constraint to the current system of constraints. Of course, the problem can dramatically change, and even if the initial system satisfies a constraint qualification condition, the new system can fail to do so. We asked ourselves whether one can give a condition that links the old system of constraints and the new constraint in such a way that the optimality conditions apply for the new problem, without checking a constraint qualification condition for the whole new system of constraints. Actually, we started by asking this question in the case of a smooth optimization problem with inequality constraints, and then we observed that, in order to keep the requirements as minimal as possible, we arrive at a metric condition—denoted (MC) in the following—that naturally comes into play for other types of optimization problems, including nonsmooth ones, and for some penalization results of Clarke’s type as well. It turns out that this metric condition is equivalent to another well-known metric inequality condition, studied for about three decades under various names (metric inequality, linear regularity, metric regularity, linear coherence, and subtransversality), which has been used, among others, as a key assumption in the study of the rate of convergence of some algorithms related to set intersection problems, especially in establishing linear convergence of sequences generated by the method of alternating projections.
The way we use it in this paper follows another main stream of possible applications, namely as a qualification condition for formulae involving normal or tangent cones to intersections of sets, generically known as “Conical Hull Intersection Property” (CHIP)-type conditions in the convex framework, and extended to the nonconvex case by many authors. In this work, we choose to call this condition subtransversality and we denote it by (Str). Not surprisingly, having in mind that metric subregularity type conditions can successfully replace the “full” regularity in qualification conditions, and the well-known fact that the Mangasarian–Fromovitz constraint qualification (MFCQ) is equivalent to the metric regularity of a certain epigraph-type multifunction—a condition which we denote by (MRCQ)—we asked ourselves what kind of subregularity corresponds to the (Str) condition. Some immediate candidates, involving difference and distance type multifunctions, are indeed metrically subregular iff (Str) holds, but a more interesting object is, of course, the epigraph multifunction mentioned above in relation with (MFCQ). As a consequence, we studied the implications between (Str) and the metric subregularity of the epigraphical multifunction associated to systems of inequalities. This type of condition was recently introduced and studied under the name of “metric subregularity constraint qualification” (MSCQ). It turns out that, in fact, in the context of inequality constraints involving \(C^{1}\) functions, (Str) is strictly weaker than the (MSCQ) condition. Finally, we observed that all the constraint qualification conditions discussed are only sufficient for the nonconvex (CHIP) property we study in this work.
The next figure summarizes the relations between all the conditions discussed above, as will be made clear in the third section, along with more comments and references.
Another important feature of the metric condition (MC) is that it can be employed in nonsmooth settings, for problems much more general than those we started with. As a consequence, the fourth section is twofold. On the one hand, we apply some facts from the previous sections to directional Pareto minima in order to obtain necessary optimality conditions, and on the other hand, we employ the general pattern of the metric inequality under study to penalize scalar nonsmooth optimization problems with multiple constraints. The latter approach allows us to derive necessary optimality conditions in terms of limiting generalized differentiation for the problems under consideration.
The notation we use is fairly standard. If X is a normed vector space, then we denote by B(x, r), D(x, r) and S(x, r) the open ball, the closed ball and the sphere of center \(x\in X\) and radius \(r>0,\) respectively. For a set \(A\subset X,\) we denote by \({\text {int}}A,\) \({\text {cl}}A,\) \({\text {bd}}A\) its topological interior, closure and boundary, respectively. The cone generated by A is designated by \({{\text {cone}}A,}\) and the convex hull of A is \({\text {conv}} A\). The distance from a point \(x\in X\) to a nonempty set \(A\subset X\) is \(d(x,A):=\inf \left\{ \left\| x-a\right\| \mid a\in A\right\} \) and the distance function to A is \(d_{A}:X\rightarrow {\mathbb {R}}\) given by \(d_{A}(x):=d(x,A)\). The topological dual of X is \(X^{*},\) and the negative polar of A is
$$\begin{aligned} A^{-}:=\left\{ x^{*}\in X^{*}\mid x^{*}(a)\le 0,\ \forall a\in A\right\} . \end{aligned}$$
The positive polar of A is \(A^{+}:=-A^{-}.\) Of course, \(A^{-}=\left( {\text {cone}}A\right) ^{-}.\)
Let D be a nonempty subset of X and \({\overline{x}}\in X.\) The first-order Bouligand tangent cone to D at \({\overline{x}}\) is the set
$$\begin{aligned} T_{B}(D,{\overline{x}}):=\left\{ u\in X\mid \exists (t_{n})\downarrow 0,\exists (u_{n})\rightarrow u,\forall n\in {\mathbb {N}}:{\overline{x}}+t_{n}u_{n}\in D\right\} , \end{aligned}$$
where \((t_{n})\downarrow 0\) means \((t_{n})\subset (0,\infty )\) and \(t_{n} \rightarrow 0.\) The first-order Ursescu tangent cone to D at \({\overline{x}}\) is the set
$$\begin{aligned} T_{U}(D,{\overline{x}}):=\left\{ u\in X\mid \forall (t_{n})\downarrow 0,\exists (u_{n})\rightarrow u,\forall n\in {\mathbb {N}}:{\overline{x}}+t_{n}u_{n}\in D\right\} . \end{aligned}$$
The first-order Dubovitskij–Miljutin tangent set to D at \({\overline{x}}\) is the set
$$\begin{aligned} T_{DM}(D,{\overline{x}}):=\left\{ u\in X\mid \forall (t_{n})\downarrow 0,\forall (u_{n})\rightarrow u,\exists n_{0}\in {\mathbb {N}},\forall n\ge n_{0}:{\overline{x}}+t_{n}u_{n}\in D\right\} . \end{aligned}$$
The Bouligand and Ursescu tangent cones are closed sets, and \(T_{U} (D,{\overline{x}})\subset T_{B}(D,{\overline{x}}).\) The fact that \(T_{B} (D,{\overline{x}})=X\backslash T_{DM}(X\backslash D,{\overline{x}})\) shows that the Dubovitskij–Miljutin tangent set to D at \({\overline{x}}\) is open. Moreover, for \(*\in \{B,U,DM\}\), we have \(T_{*}(D,{\overline{x}})=T_{*}({\text {cl}}D,{\overline{x}})\). If \({\overline{x}}\in {\text {int}}A\) for some set \(A\subset X,\) then \(T_{*}(D,{\overline{x}})=T_{*}(D\cap A,{\overline{x}})\). It is well known that \({\overline{x}}\in {\text {int}}D\) if and only if \(T_{DM} (D,{\overline{x}})=X\) and \({\overline{x}}\in {\text {cl}}D\) if and only if \(T_{B}(D,{\overline{x}})\ne \emptyset .\)
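In finite dimensions, membership in the Bouligand cone can be tested numerically through the equivalent characterization \(u\in T_{B}(D,{\overline{x}})\) if and only if \(\liminf _{t\downarrow 0}t^{-1}d({\overline{x}}+tu,D)=0.\) A minimal sketch, for an illustrative set not taken from the text, namely \(D=\{(x,y)\mid y\ge x^{2}\}\) at \({\overline{x}}=(0,0)\), with a brute-force approximation of the distance function:

```python
import math

# Illustrative set (not from the text): D = {(x, y) : y >= x**2} in R^2.
def dist_to_D(p):
    # Brute-force distance from p to D: zero inside D, otherwise the
    # minimum distance to sampled boundary points (s, s**2).
    x, y = p
    if y >= x * x:
        return 0.0
    return min(math.hypot(x - s, y - s * s)
               for s in (i / 10000.0 - 2.0 for i in range(40001)))

def ratio(u, t):
    # t^{-1} d(xbar + t*u, D) with xbar = (0, 0)
    return dist_to_D((t * u[0], t * u[1])) / t

# u = (1, 0) is tangent at the origin: the ratio tends to 0 as t -> 0 ...
ratios_tangent = [ratio((1.0, 0.0), t) for t in (1e-1, 1e-2, 1e-3)]
# ... while u = (0, -1) is not: the ratio stays equal to 1.
ratios_outward = [ratio((0.0, -1.0), t) for t in (1e-1, 1e-2, 1e-3)]
```

The same sampling pattern applies verbatim to \(T_{U}\), where the liminf is replaced by a limit over all sequences \((t_{n})\downarrow 0\).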
Now, we briefly collect some basic facts concerning the limiting generalized calculus (see [26]). This calculus relies on the concept of normal cone, and even if this object can be defined on general Banach spaces, its main applicable features hold on Asplund spaces, which represent a special class of Banach spaces: X is Asplund iff every continuous convex function on any open convex set \(U\subset X\) is Fréchet differentiable at the points of a dense \(G_{\delta }-\)subset of U. A very important property of Asplund spaces is that every bounded sequence of the topological dual admits a \(w^{*}-\)convergent subsequence.
Take a nonempty subset S of the Asplund space X and pick \(x\in S.\) Then the Fréchet normal cone to S at x is
$$\begin{aligned} {\widehat{N}}(S,x):=\left\{ x^{*}\in X^{*}\mid \underset{u\overset{S}{\rightarrow }x}{\limsup }\,\frac{x^{*}(u-x)}{\left\| u-x\right\| }\le 0\right\} , \end{aligned}$$
where \(u\overset{S}{\rightarrow }x\) means that \(u\rightarrow x\) and \(u\in S.\)
Let \({\overline{x}}\in S.\) The limiting normal cone to S at \({\overline{x}}\) is
$$\begin{aligned} N(S,{\overline{x}}):=\left\{ x^{*}\in X^{*}\mid \exists (x_{n})\overset{S}{\rightarrow }{\overline{x}},\exists (x_{n}^{*})\overset{w^{*}}{\rightarrow }x^{*},\forall n\in {\mathbb {N}}:x_{n}^{*}\in {\widehat{N}}(S,x_{n})\right\} . \end{aligned}$$
If \(S\subset X\) is a convex set, then
$$\begin{aligned} N(S,{\overline{x}})={\widehat{N}}(S,{\overline{x}})=\left\{ x^{*}\in X^{*}\mid x^{*}(x-{\overline{x}})\le 0,\ \forall x\in S\right\} \end{aligned}$$
and coincides with the negative polar of \(T_{B}(S,{\overline{x}}).\)
Let \(F:X\rightrightarrows Y\) be a set-valued map between the Asplund spaces X and Y, and \(({\overline{x}},{\overline{y}})\in {\text {Gr}}F:=\left\{ \left( x,y\right) \in X\times Y\mid y\in F\left( x\right) \right\} .\) Then the limiting coderivative of F at \(({\overline{x}},{\overline{y}})\) is the set-valued map \(D^{*}F({\overline{x}},{\overline{y}}):Y^{*}\rightrightarrows X^{*}\) given by
$$\begin{aligned} D^{*}F({\overline{x}},{\overline{y}})(y^{*}):=\left\{ x^{*}\in X^{*}\mid (x^{*},-y^{*})\in N({\text {Gr}}F,({\overline{x}},{\overline{y}}))\right\} . \end{aligned}$$
As usual, when \(F:=f\) is a function, since \({\overline{y}}\in F\left( {\overline{x}}\right) \) means \({\overline{y}}=f\left( {\overline{x}}\right) ,\) we write \(D^{*}f\left( {\overline{x}}\right) \) for \(D^{*}f\left( {\overline{x}},{\overline{y}}\right) .\)
Let \(f:X\rightarrow \mathbb {R\cup \{+\infty \}}\) be finite at \({\overline{x}}\in X\) and lower semicontinuous around \({\overline{x}};\) the Fréchet subdifferential of f at \({\overline{x}}\) is given by
$$\begin{aligned} {\widehat{\partial }}f({\overline{x}}):=\left\{ x^{*}\in X^{*}\mid (x^{*},-1)\in {\widehat{N}}({\text {epi}}f,({\overline{x}},f({\overline{x}})))\right\} , \end{aligned}$$
where \({\text {epi}}f\) denotes the epigraph of f; similarly, the limiting subdifferential of f at \({\overline{x}}\) is given by
$$\begin{aligned} \partial f({\overline{x}}):=\left\{ x^{*}\in X^{*}\mid (x^{*},-1)\in N({\text {epi}}f,({\overline{x}},f({\overline{x}})))\right\} . \end{aligned}$$
One always has \({\widehat{\partial }}f({\overline{x}})\subset \partial f({\overline{x}}).\) Note that a generalized Fermat rule holds: if \({\overline{x}}\) is a local minimum point for f then \(0\in {\widehat{\partial }}f({\overline{x}}).\)
It is well known that if A is a closed set and \({\overline{x}}\in A\), then
If f is a convex function, then both \({\widehat{\partial }}f({\overline{x}})\) and \(\partial f({\overline{x}})\) coincide with the Fenchel subdifferential.
Next, we recall the standard definitions of metric regularity and metric subregularity. Given a set-valued mapping \(F:X\rightrightarrows Y\) between normed vector spaces, one says that F is metrically regular around \(\left( {\overline{x}},{\overline{y}}\right) \in {\text {Gr}}F\) if there exist \(r,\alpha >0\) such that for all \(\left( x,y\right) \in B\left( \left( {\overline{x}},{\overline{y}}\right) ,r\right) ,\)
$$\begin{aligned} d\left( x,F^{-1}(y)\right) \le \alpha d\left( y,F(x)\right) . \end{aligned}$$
One says that F is metrically subregular at \(\left( {\overline{x}} ,{\overline{y}}\right) \in {\text {Gr}}F\) if there exist \(r,\alpha >0\) such that for all \(x\in B\left( {\overline{x}},r\right) ,\)
$$\begin{aligned} d\left( x,F^{-1}({\overline{y}})\right) \le \alpha d\left( {\overline{y}},F(x)\right) . \end{aligned}$$
It is well known that metric regularity (resp. metric subregularity) of F around (resp. at) \(\left( {\overline{x}},{\overline{y}}\right) \) is equivalent to the Aubin property (resp. calmness) of \(F^{-1}\) around (resp. at) \(\left( {\overline{y}},{\overline{x}}\right) \) (see [9]).
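For a single-valued linear map the defining inequality of metric regularity can be verified by hand; a small sketch for the map F(x) = 2x (our choice, purely for illustration), where \(F^{-1}(y)=\{y/2\}\) and the inequality holds with modulus \(\alpha =1/2\):

```python
# Illustrative single-valued map (our choice): F(x) = 2x, so F^{-1}(y) = {y/2}.
def F(x):
    return 2.0 * x

def dist_to_preimage(x, y):
    # d(x, F^{-1}(y)) = |x - y/2|
    return abs(x - y / 2.0)

def residual(x, y):
    # d(y, F(x)) = |2x - y|
    return abs(F(x) - y)

# Metric regularity holds around (0, 0) with modulus alpha = 1/2,
# since |x - y/2| = (1/2)|2x - y| for every pair (x, y).
pairs = [(0.1, 0.05), (-0.2, 0.3), (0.01, -0.02), (0.0, 0.0)]
regular = all(dist_to_preimage(x, y) <= 0.5 * residual(x, y) + 1e-12
              for x, y in pairs)
```

Metric subregularity fixes \(y={\overline{y}}\) in the same inequality, which is why it is a strictly weaker requirement.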
Finally, we recall the Gerstewitz (Tammer) scalarization functional (see, for instance, [17, Sect. 2.3]). Let \(K\subset Y\) be a closed convex cone with nonempty interior. Then for every \(e\in {\text {int}}K\), the functional \(s_{e}:Y\rightarrow {\mathbb {R}}\) given by
$$\begin{aligned} s_{e}(y):=\inf \left\{ t\in {\mathbb {R}}\mid y\in te-K\right\} \end{aligned}$$
is continuous and sublinear. Moreover, for every \(t\in {\mathbb {R}}\),
$$\begin{aligned} \left\{ y\in Y\mid s_{e}(y)\le t\right\} =te-K. \end{aligned}$$
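For the model cone \(K={\mathbb {R}}_{+}^{n}\) (an assumption of ours, not a case treated in the text), the infimum defining \(s_{e}\) admits a closed form: \(y\in te-K\) means \(y_{i}\le te_{i}\) for every coordinate, so \(s_{e}(y)=\max _{i}y_{i}/e_{i}\). A short sketch under this assumption:

```python
# Closed form of the Gerstewitz functional for the assumed model cone
# K = R^n_+ : y in t*e - K means y_i <= t*e_i for every i, hence
# s_e(y) = max_i y_i / e_i.
def s_e(y, e):
    return max(yi / ei for yi, ei in zip(y, e))

e = (1.0, 1.0)
values = [s_e((2.0, -1.0), e), s_e((0.5, 0.5), e), s_e((-3.0, -1.0), e)]
# Sublevel-set property: {y : s_e(y) <= t} = t*e - K; for instance
# (2.0, -1.0) lies in 2e - K, but not in t*e - K for any t < 2.
```

The sublinearity of \(s_{e}\) is visible in this formula, since a coordinatewise maximum of linear functionals is sublinear.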
2 Adding a New Constraint
We start by illustrating on a smooth optimization problem the main question we deal with in this paper. Let \(f,g:X\rightarrow {\mathbb {R}}\) be continuously differentiable functions. Consider the basic optimization problem
$$\begin{aligned} \min f(x)\text { subject to }g(x)\le 0, \end{aligned}$$
and let \({\overline{x}}\in X\) be an optimal solution of this problem. The first-order necessary optimality condition is
$$\begin{aligned} \nabla f({\overline{x}})(u)\ge 0,\quad \forall u\in T_{B}(M_{g},{\overline{x}}), \end{aligned}$$(2.1)
where
$$\begin{aligned} M_{g}:=\left\{ x\in X\mid g(x)\le 0\right\} \end{aligned}$$
is the set of feasible points. We see that one important issue is to describe the cone \(T_{B}(M_{g},{\overline{x}})\). Clearly, if the constraint is not active at the feasible point \({\overline{x}}\) (that is, \(g({\overline{x}})<0\)), then \(T_{B}(M_{g},{\overline{x}})=X\) and (2.1) becomes \(\nabla f({\overline{x}})=0\) (Fermat’s Theorem).
Otherwise, if the constraint is active at \({\overline{x}}\), i.e., \(g(\overline{x})=0,\) we have to suppose that \(\nabla g({\overline{x}})\ne 0\) in order to obtain that
$$\begin{aligned} T_{B}(M_{g},{\overline{x}})=\left\{ u\in X\mid \nabla g({\overline{x}})(u)\le 0\right\} . \end{aligned}$$(2.2)
In order to show this, observe first that
$$\begin{aligned} {\text {cl}}T_{DM}(M_{g},{\overline{x}})\subset T_{B}(M_{g},{\overline{x}})\subset \left\{ u\in X\mid \nabla g({\overline{x}})(u)\le 0\right\} \subset {\text {cl}}T_{DM}(M_{g},{\overline{x}}), \end{aligned}$$
where the first inclusion holds because \(T_{DM}(M_{g},{\overline{x}})\subset T_{B}(M_{g},{\overline{x}})\) and the latter cone is closed, while the remaining two inclusions are proved below.
Let now \(u\in T_{B}(M_{g},{\overline{x}}),\) meaning that there exist \((t_{n})\downarrow 0,(u_{n})\rightarrow u,n_{0}\in {\mathbb {N}}\), such that for all \(n\ge n_{0},\)
$$\begin{aligned} g({\overline{x}}+t_{n}u_{n})\le 0. \end{aligned}$$
Since g is differentiable, there exists \((v_{n})\rightarrow 0\) such that for all \(n\ge n_{0}\)
$$\begin{aligned} g({\overline{x}}+t_{n}u_{n})=g({\overline{x}})+t_{n}\left( \nabla g({\overline{x}})(u_{n})+v_{n}\right) \le 0, \end{aligned}$$
i.e., since \(g({\overline{x}})=0,\)
$$\begin{aligned} \nabla g({\overline{x}})(u_{n})+v_{n}\le 0. \end{aligned}$$
Whence, passing to the limit in the relation \(\nabla g({\overline{x}} )(u_{n})+v_{n}\le 0,\) one gets that \(\nabla g({\overline{x}})(u)\le 0.\)
Take now \(u\in X\) such that \(\nabla g({\overline{x}})(u)<0.\) Notice that such an element exists since \(\nabla g({\overline{x}})\ne 0.\) Take \((t_{n})\downarrow 0\) and \((u_{n})\rightarrow u.\) Again, the differentiability of g means
$$\begin{aligned} g({\overline{x}}+t_{n}u_{n})=t_{n}\left( \nabla g({\overline{x}})(u_{n})+v_{n}\right) , \end{aligned}$$
with \(\left( v_{n}\right) \rightarrow 0.\) Since \(\nabla g({\overline{x}})(u)<0\) and \((u_{n})\rightarrow u,\) for all n large enough, \(\nabla g(\overline{x})(u_{n})+v_{n}<0\), whence \(g\left( {\overline{x}}+t_{n}u_{n}\right) <0.\) This means that \({\overline{x}}+t_{n}u_{n}\in M_{g}\) and we get \(u\in T_{DM}(M_{g},{\overline{x}}).\)
Let now \(v\in X\) such that \(\nabla g({\overline{x}})(v)\le 0\) and \(\lambda \in (0,1).\) Clearly,
$$\begin{aligned} \nabla g({\overline{x}})(\lambda u+(1-\lambda )v)=\lambda \nabla g({\overline{x}})(u)+(1-\lambda )\nabla g({\overline{x}})(v)<0, \end{aligned}$$
whence, from the previous step, \(\lambda u+(1-\lambda )v\in T_{DM} (M_{g},{\overline{x}}).\) Passing to the limit with \(\lambda \rightarrow 0,\) we get \(v\in {\text {cl}}T_{DM}(M_{g},{\overline{x}}),\) and all the inclusions are proved.
We want to underline here that the essential assumption for getting (2.2) is \(\nabla g({\overline{x}})\ne 0.\)
Now condition (2.1) becomes
$$\begin{aligned} \nabla f({\overline{x}})(u)\ge 0,\quad \forall u\in X\text { with }\nabla g({\overline{x}})(u)\le 0, \end{aligned}$$
which can be seen as the fact that \(0\in X\) is an optimal solution to the linear problem
$$\begin{aligned} \min \nabla f({\overline{x}})(u)\text { subject to }\nabla g({\overline{x}})(u)\le 0. \end{aligned}$$
Then, since for linear problems there is no need of supplementary qualification conditions for applying the Karush–Kuhn–Tucker Theorem, we get \(\lambda \ge 0\) such that
$$\begin{aligned} \nabla f({\overline{x}})+\lambda \nabla g({\overline{x}})=0. \end{aligned}$$
For the scalar function g, the condition \(\nabla g({\overline{x}})\ne 0\) is equivalent to the well-known Mangasarian–Fromovitz constraint qualification condition which we recall in greater generality below.
Let us consider a general constraint system \(h(x)\le 0,\) where \(h:X\rightarrow {\mathbb {R}}^{n}\) is a \(C^{1}\) function and \(h(x)\le 0\) means that \(h_{i}(x)\le 0\) for all \(i\in \overline{1,n}.\) The Mangasarian–Fromovitz constraint qualification condition at \({\overline{x}}\) when h is active (that is, all scalar coordinate functions are active) is:
$$\begin{aligned} \text {there exists }u\in X\text { such that }\nabla h_{i}({\overline{x}})(u)<0\text { for all }i\in \overline{1,n}. \end{aligned}$$(MFCQ)
It is also well known (see [6]) that (MFCQ) is equivalent to the metric regularity around \(\left( {\overline{x}},0\right) \) of the set-valued map \({\tilde{h}}:X\rightrightarrows {\mathbb {R}}^{n}\) given by
$$\begin{aligned} {\tilde{h}}(x):=h(x)+{\mathbb {R}}_{+}^{n}. \end{aligned}$$
We denote this metric regularity condition by (MRCQ). The equivalence between (MFCQ) and (MRCQ) was first made explicit by Robinson in [34].
There is an important amount of literature that underlines the idea that metric subregularity still ensures the validity of many facts in optimization and, in particular, can replace metric regularity in qualification conditions (see [20, 37] and the references therein). For instance, in the context we discuss here, the metric subregularity qualification condition (MSCQ) introduced in [16] and extensively used in many other works (see, e.g., [15] and the references therein) amounts to saying that the above set-valued map is metrically subregular at \(\left( {\overline{x}},0\right) .\)
We continue our presentation by adding a new scalar inequality constraint. Thus, suppose that the constraint is expressed in the same way, but with a function \(g=(g_{1},g_{2}):X\rightarrow {\mathbb {R}}^{2}.\) Let \({\overline{x}}\) be a feasible point. In the case of active constraints, that is, \(g(\overline{x})=0\in {\mathbb {R}}^{2},\) the Mangasarian–Fromovitz condition is: There exists \(u\in X\) such that \(\nabla g_{1}({\overline{x}})(u)<0\) and \(\nabla g_{2}({\overline{x}})(u)<0\). Along the same lines as before, this condition ensures that
$$\begin{aligned} T_{B}(M_{g},{\overline{x}})=\left\{ u\in X\mid \nabla g_{1}({\overline{x}})(u)\le 0,\ \nabla g_{2}({\overline{x}})(u)\le 0\right\} , \end{aligned}$$
where now \(M_{g}=M_{g_{1}}\cap M_{g_{2}}.\)
In particular, this means that
$$\begin{aligned} T_{B}(M_{g_{1}}\cap M_{g_{2}},{\overline{x}})=T_{B}(M_{g_{1}},{\overline{x}})\cap T_{B}(M_{g_{2}},{\overline{x}}), \end{aligned}$$
and in fact, this is the essential relationship to get, by the same argument as in the case of a single scalar-valued constraint, that there exist \(\lambda _{1},\lambda _{2}\ge 0\) such that
$$\begin{aligned} \nabla f({\overline{x}})+\lambda _{1}\nabla g_{1}({\overline{x}})+\lambda _{2}\nabla g_{2}({\overline{x}})=0. \end{aligned}$$
Now if another scalar-valued constraint comes into play, that is, if we have \(g=(g_{1},g_{2},g_{3}):X\rightarrow {\mathbb {R}}^{3},\) then for an optimal solution \({\overline{x}},\) if the new constraint is active as well, the corresponding Mangasarian–Fromovitz condition (there exists \(u\in X\) such that \(\nabla g_{i}({\overline{x}})(u)<0\) for \(i\in \overline{1,3}\)) ensures, similarly,
$$\begin{aligned} T_{B}(M_{g_{1}}\cap M_{g_{2}}\cap M_{g_{3}},{\overline{x}})=\bigcap _{i=1}^{3}T_{B}(M_{g_{i}},{\overline{x}})=\left\{ u\in X\mid \nabla g_{i}({\overline{x}})(u)\le 0,\ i\in \overline{1,3}\right\} , \end{aligned}$$
and then the existence of \(\lambda _{1},\lambda _{2},\lambda _{3}\ge 0\) such that
$$\begin{aligned} \nabla f({\overline{x}})+\lambda _{1}\nabla g_{1}({\overline{x}})+\lambda _{2}\nabla g_{2}({\overline{x}})+\lambda _{3}\nabla g_{3}({\overline{x}})=0. \end{aligned}$$
Therefore, according to the facts above, every time we add another scalar-valued constraint which is active at the underlying point \({\overline{x}}\), one has to impose the Mangasarian–Fromovitz condition on the whole system, and this condition is stronger than the Mangasarian–Fromovitz conditions for each of the components of g (that is, the Mangasarian–Fromovitz condition formally applied to each \(g_{i}\)). For this reason, we aim at presenting a situation where one can replace the general Mangasarian–Fromovitz condition with Mangasarian–Fromovitz conditions for each coordinate function, under a supplementary hypothesis. This idea will be the subject of several generalizations, since it will become clear that the additional assumption we impose can be extended to nondifferentiable settings.
3 A Metric Condition
We start the investigation in order to get a condition with the above mentioned features by considering again the situation of \(g=(g_{1} ,g_{2}):X\rightarrow {\mathbb {R}}^{2}\). Suppose that \(g_{1}({\overline{x}})=0,\) \(g_{2}({\overline{x}})=0,~\nabla g_{1}({\overline{x}})\ne 0\) and \(\nabla g_{2}({\overline{x}})\ne 0.\) Firstly, looking again at the above arguments, observe that we have
$$\begin{aligned} T_{B}(M_{g_{1}}\cap M_{g_{2}},{\overline{x}})\subset T_{B}(M_{g_{1}},{\overline{x}})\cap T_{B}(M_{g_{2}},{\overline{x}}). \end{aligned}$$
In other words, (MFCQ) is used exactly to show that we have the reverse inclusion in the above relation. However, such an inclusion can be obtained as well via some regularity assumptions on the sets. We refer the reader to the paper [12] and the references therein for some steps in this direction of investigation.
In the next result, we employ a natural metric condition which yields the desired inclusions. We will see below that this condition is in fact a subtransversality condition on sets studied in depth by several authors. On the other hand, the property we deduce in Theorem 3.1 is a nonconvex generalized version of the (CHIP) property in the sense given in the papers [2] and [3]. Properties of this type for the normal cones are extended to nonconvex cases in many works; for additional bibliography, see the references after Proposition 3.3.
Theorem 3.1
Let X be a normed vector space and \(M_{1},M_{2}\subset X\) be closed sets. Take \({\overline{x}}\in M_{1}\cap M_{2}.\) Suppose that the following regularity assumption holds: There exist \(s>0,\mu >0\) such that for all \(x\in B({\overline{x}},s)\cap M_{1},\)
$$\begin{aligned} d(x,M_{1}\cap M_{2})\le \mu d(x,M_{2}). \end{aligned}$$(MC)
Then
$$\begin{aligned} T_{B}(M_{1},{\overline{x}})\cap T_{U}(M_{2},{\overline{x}})&\subset T_{B}(M_{1}\cap M_{2},{\overline{x}}),\\ T_{U}(M_{1},{\overline{x}})\cap T_{B}(M_{2},{\overline{x}})&\subset T_{B}(M_{1}\cap M_{2},{\overline{x}}),\\ T_{U}(M_{1},{\overline{x}})\cap T_{U}(M_{2},{\overline{x}})&=T_{U}(M_{1}\cap M_{2},{\overline{x}}). \end{aligned}$$
Proof
Take \(u\in T_{B}(M_{1},{\overline{x}})\cap T_{U} (M_{2},{\overline{x}}),\) i.e., \(u\in T_{B}(M_{1},{\overline{x}})\) and \(u\in T_{U}(M_{2},{\overline{x}}).\) Then there exist \((t_{n})\downarrow 0,(u_{n} )\rightarrow u,\) \(\left( v_{n}\right) \rightarrow u\) with \(\overline{x}+t_{n}u_{n}\in M_{1}\) and \({\overline{x}}+t_{n}v_{n}\in M_{2}\) for all n large enough. Observe that one can apply the regularity assumption since \({\overline{x}}+t_{n}u_{n}\in B({\overline{x}},s)\) for n large enough: there exists \(p_{n}\in M_{1}\cap M_{2}\) with
$$\begin{aligned} \left\| {\overline{x}}+t_{n}u_{n}-p_{n}\right\| \le \mu d({\overline{x}}+t_{n}u_{n},M_{2})+t_{n}n^{-1}. \end{aligned}$$
Then for every n as above,
$$\begin{aligned} d({\overline{x}}+t_{n}u_{n},M_{2})\le \left\| ({\overline{x}}+t_{n}u_{n})-({\overline{x}}+t_{n}v_{n})\right\| =t_{n}\left\| u_{n}-v_{n}\right\| , \end{aligned}$$
whence
$$\begin{aligned} \left\| {\overline{x}}+t_{n}u_{n}-p_{n}\right\| \le t_{n}\left( \mu \left\| u_{n}-v_{n}\right\| +n^{-1}\right) . \end{aligned}$$
We infer that \(u_{n}^{\prime }:=t_{n}^{-1}(p_{n}-{\overline{x}})\rightarrow u\) which, by the fact that for all n large enough
$$\begin{aligned} {\overline{x}}+t_{n}u_{n}^{\prime }=p_{n}\in M_{1}\cap M_{2}, \end{aligned}$$
allows us to conclude the proof of the first inclusion of the theorem. Now, the other relations are similar. Notice that for the equality in the third relation one takes into account the simple inclusion
$$\begin{aligned} T_{U}(M_{1}\cap M_{2},{\overline{x}})\subset T_{U}(M_{1},{\overline{x}})\cap T_{U}(M_{2},{\overline{x}}). \end{aligned}$$
The proof is complete.\(\square \)
Remark 3.2
Observe that if X is finite dimensional, under condition (MC) one can prove the following stronger assertion: for all \(u\in T_{B}(M_{1},{\overline{x}})\) and \(v\in T_{U}(M_{2},{\overline{x}}),\) there is \(w\in T_{B}(M_{1}\cap M_{2},{\overline{x}})\) such that
$$\begin{aligned} \left\| u-w\right\| \le \mu \left\| u-v\right\| . \end{aligned}$$
Indeed, one can follow with obvious modifications the proof above and observe that the sequence \(\left( t_{n}^{-1}(p_{n}-{\overline{x}})\right) \) is bounded, whence it has a convergent subsequence whose limit satisfies the requirements for w.
In order to have a better insight on condition (MC), we present next some characterizations. Firstly, we recall that a function \(f:X\rightarrow Y\) is metrically subregular at \(({\overline{x}},f({\overline{x}}))\) with respect to \(M\subset X\) when \({\overline{x}}\in M\) and there exist \(s>0,\) \(\mu >0\) such that for every \(u\in B({\overline{x}},s)\cap M\)
$$\begin{aligned} d\left( u,f^{-1}(f({\overline{x}}))\cap M\right) \le \mu \left\| f(u)-f({\overline{x}})\right\| . \end{aligned}$$
Observe that the metric subregularity of f at \(({\overline{x}},f(\overline{x}))\) with respect to M is exactly the metric subregularity at \(({\overline{x}},f({\overline{x}}))\) of the set-valued map \(f_{M}:X\rightrightarrows Y\) given by \(f_{M}(x):=\{f(x)\}\) if \(x\in M\) and \(f_{M}(x):=\emptyset \) otherwise.
Note that in the paper [12] we used some metric subregularity conditions in getting calculus rules for tangent cones.
Now we present the equivalence (up to a change of the involved constants) of several metric conditions. Essentially, all these assertions are known in the literature (see [21] and [19]). We give only some details in order to emphasize the features which will be of interest for us in the sequel. On product spaces, we consider the sum norm.
Proposition 3.3
Take \({\overline{x}}\in M_{1}\cap M_{2}.\) The next assertions are equivalent:

(i) there exist \(s,\mu >0\) such that for all \(x\in B({\overline{x}},s)\cap M_{1},\)
$$\begin{aligned} d(x,M_{1}\cap M_{2})\le \mu d(x,M_{2}); \end{aligned}$$

(ii) there exist \(r,t,\mu >0\) such that for all \(x\in B({\overline{x}},r)\cap M_{1},\)
$$\begin{aligned} d(x,M_{1}\cap M_{2})\le \mu d(x,B({\overline{x}},t)\cap M_{2}); \end{aligned}$$

(iii) there exist \(r,\nu >0\) such that for all \(x\in B({\overline{x}},r)\cap M_{2},\)
$$\begin{aligned} d(x,M_{1}\cap M_{2})\le \nu d(x,M_{1}); \end{aligned}$$

(iv) there exist \(r,\nu >0\) such that for all \(x\in B({\overline{x}},r),\)
$$\begin{aligned} d(x,M_{1}\cap M_{2})\le \nu \left( d(x,M_{1})+d(x,M_{2})\right) ; \end{aligned}$$(Str)

(v) the function \(g:X\times X\rightarrow X\) given by \(g(x,y):=x-y\) is metrically subregular at \(({\overline{x}},{\overline{x}},0)\) with respect to \(M_{1}\times M_{2}\);

(vi) the function \(h:X\times X\rightarrow {\mathbb {R}}\) given by \(h(x,y):=d(x,y)\) is metrically subregular at \(({\overline{x}},{\overline{x}},0)\) with respect to \(M_{1}\times M_{2}\).
Proof
\((i)\Rightarrow (ii)\) This is obvious for \(r:=s,\) since \(d(x,M_{2})\le d(x,B({\overline{x}},t)\cap M_{2}).\)
\((ii)\Rightarrow (i)\) Consider \(s:=\min \left\{ r,t\right\} \) and \(x\in B({\overline{x}},3^{-1}s)\cap M_{1}.\) Then for every \(\varepsilon \in (0,3^{-1}s),\) there is \(x_{\varepsilon }\in M_{2}\) such that
$$\begin{aligned} \left\| x-x_{\varepsilon }\right\| \le d(x,M_{2})+\varepsilon . \end{aligned}$$
Then
$$\begin{aligned} \left\| x_{\varepsilon }-{\overline{x}}\right\| \le \left\| x_{\varepsilon }-x\right\| +\left\| x-{\overline{x}}\right\| \le d(x,M_{2})+\varepsilon +3^{-1}s<3\cdot 3^{-1}s=s\le t, \end{aligned}$$
whence \(x_{\varepsilon }\in B({\overline{x}},t)\cap M_{2},\) so
$$\begin{aligned} d(x,M_{1}\cap M_{2})\le \mu d(x,B({\overline{x}},t)\cap M_{2})\le \mu \left( d(x,M_{2})+\varepsilon \right) . \end{aligned}$$
Since this is true for every \(\varepsilon \) small enough, we arrive at the conclusion.
\((i)\Rightarrow (iv)\) Define \(r:=2^{-1}s\) and take \(x\in B({\overline{x}},r).\) Then \(d(x,M_{1})\le \left\| x-{\overline{x}}\right\| <2^{-1}s,\) hence there is \(m_{1}\in M_{1}\) such that \(\left\| x-m_{1}\right\| <2^{-1}s.\) Therefore,
$$\begin{aligned} \left\| m_{1}-{\overline{x}}\right\| \le \left\| m_{1}-x\right\| +\left\| x-{\overline{x}}\right\| <2^{-1}s+2^{-1}s=s, \end{aligned}$$
\(m_{1}\in M_{1}\cap B({\overline{x}},s),\) and by (i)
$$\begin{aligned} d(m_{1},M_{1}\cap M_{2})\le \mu d(m_{1},M_{2}). \end{aligned}$$
But then
$$\begin{aligned} d(x,M_{1}\cap M_{2})&\le \left\| x-m_{1}\right\| +d(m_{1},M_{1}\cap M_{2})\le \left\| x-m_{1}\right\| +\mu d(m_{1},M_{2})\\&\le \left\| x-m_{1}\right\| +\mu \left( \left\| m_{1}-x\right\| +d(x,M_{2})\right) =(1+\mu )\left\| x-m_{1}\right\| +\mu d(x,M_{2}). \end{aligned}$$
Since \(m_{1}\) can be chosen such that \(\left\| x-m_{1}\right\| \) is arbitrarily close to \(d(x,M_{1}),\) it follows that (Str) holds for \(\nu :=1+\mu \).
\((iv)\Rightarrow (i)\) This is obvious for \(s:=r\) and \(\mu :=\nu .\)
\((iv)\Leftrightarrow (iii)\) This follows by the symmetry of conditions in the items (i) and (iii) and by \((i)\Leftrightarrow (iv).\)
\((i)\Leftrightarrow (v)\) See [19, Proposition 7.6].
\((v)\Leftrightarrow (vi)\) Observe that \(g^{-1}(0)=h^{-1}(0)\) and that \(d(0,g(x,y))=\left\| x-y\right\| =d(x,y)=\left| h(x,y)\right| =d(0,h(x,y)).\) \(\square \)
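The equivalence of (i) and (iv) can be checked numerically on a toy configuration of our own choosing, say the two coordinate axes of \({\mathbb {R}}^{2}\), where both (MC) and (Str) hold with constant 1:

```python
import math
import random

# Toy configuration (our choice): M1 = x-axis, M2 = y-axis in R^2,
# so M1 ∩ M2 = {(0, 0)} and all distances are explicit.
def d_M1(p):
    return abs(p[1])

def d_M2(p):
    return abs(p[0])

def d_cap(p):
    return math.hypot(p[0], p[1])

random.seed(0)
points = [(random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0))
          for _ in range(1000)]

# (Str): d(x, M1 cap M2) <= nu*(d(x, M1) + d(x, M2)) holds with nu = 1.
str_holds = all(d_cap(p) <= d_M1(p) + d_M2(p) + 1e-12 for p in points)
# (MC): for x in M1, d(x, M1 cap M2) <= mu*d(x, M2) holds with mu = 1.
mc_holds = all(d_cap((x, 0.0)) <= d_M2((x, 0.0)) + 1e-12 for x, _ in points)
```

Of course, such sampling only refutes or corroborates a candidate constant; the proof above is what yields the exact relationship \(\nu =1+\mu \) between the moduli.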
The metric condition (iv) is very well studied in the literature and is known under various names: local regularity, metric regularity, linear coherence, metric inequality, subtransversality (see, e.g., [19, 21,22,23,24,25, 27, 28, 30, 32]). For more historical details and an extensive bibliography, see [21].
Remark 3.4
Notice that (MC) is exactly the fact that \({\overline{x}}\) is a local minimum on \(M_{1}\) of the function
$$\begin{aligned} x\mapsto d(x,M_{1}\cap M_{2})-\mu d(x,M_{2}). \end{aligned}$$
Since the latter function is \((1+\mu )-\)Lipschitz, by Clarke’s penalization principle ([8, Proposition 2.4.3]), \({\overline{x}}\) is a local minimum (without constraints) of
$$\begin{aligned} x\mapsto d(x,M_{1}\cap M_{2})-\mu d(x,M_{2})+(1+\mu )d(x,M_{1}), \end{aligned}$$
which implies that for x around \({\overline{x}}\)
$$\begin{aligned} d(x,M_{1}\cap M_{2})\le \mu d(x,M_{2})+(1+\mu )d(x,M_{1}). \end{aligned}$$
This is another argument for \((i)\Rightarrow (iv)\) of the proposition above.
Remark 3.5
The fact that \({\overline{x}}\) is a local minimum of the function
$$\begin{aligned} x\mapsto d(x,M_{1}\cap M_{2})-\mu d(x,M_{2})+\iota _{M_{1}}(x), \end{aligned}$$
where \(\iota \) is the indicator function, means that the condition (MC) can be seen as a minimality condition for a difference function. Necessary and sufficient conditions for it can be devised in terms of subdifferentials of the involved functions (hence, in our case, in terms of normal cones to \(M_{1},M_{2},M_{1}\cap M_{2}\)) following the results in [31]. Moreover, in the convex case (that is, both \(M_{1}\) and \(M_{2}\) are convex) the local minimality becomes global minimality, and an equivalent condition in terms of \(\varepsilon -\)subdifferentials (and \(\varepsilon -\)normal cones) is to be found in [18].
For instance, on Asplund spaces, some well-known consequences of relation (Str) are formulae for the Fréchet/limiting normal cone to intersection of sets (see, e.g., [19, 32]).
Remark 3.6
Notice that, for the convex case, the necessity of some metric inequalities for ensuring the validity of constraint qualification conditions is discussed in [35].
Combining the discussions above, we present two consequences for the systems with multiple constraints considered in the previous section.
Corollary 3.7
Let \(g=\left( g_{1},g_{2}\right) :X\rightarrow {\mathbb {R}}^{2}\) be differentiable, and consider \({\overline{x}}\in X\) such that \(g_{1}(\overline{x})=g_{2}({\overline{x}})=0\) and \(\nabla g_{1}({\overline{x}})\ne 0,\nabla g_{2}({\overline{x}})\ne 0.\) If there exist \(s>0,\mu >0\) such that for all \(x\in B({\overline{x}},s)\cap M_{g_{1}},\)
$$\begin{aligned} d(x,M_{g_{1}}\cap M_{g_{2}})\le \mu d(x,M_{g_{2}}), \end{aligned}$$
then
$$\begin{aligned} T_{B}(M_{g_{1}}\cap M_{g_{2}},{\overline{x}})=\left\{ u\in X\mid \nabla g_{1}({\overline{x}})(u)\le 0,\ \nabla g_{2}({\overline{x}})(u)\le 0\right\} . \end{aligned}$$
Corollary 3.8
Let \(g=\left( g_{1},g_{2},g_{3}\right) :X\rightarrow {\mathbb {R}}^{3}\) be differentiable, and consider \({\overline{x}}\in X\) such that \(g_{i}({\overline{x}})=0\) and \(\nabla g_{i}({\overline{x}})\ne 0\) for \(i\in \overline{1,3}.\) If there exist \(s,t,\mu ,\gamma >0\) such that for all \(x\in B({\overline{x}},s)\cap M_{g_{1}},\)
$$\begin{aligned} d(x,M_{g_{1}}\cap M_{g_{2}})\le \mu d(x,M_{g_{2}}), \end{aligned}$$
and for all \(x\in B({\overline{x}},t)\cap M_{g_{1}}\cap M_{g_{2}},\)
$$\begin{aligned} d(x,M_{g_{1}}\cap M_{g_{2}}\cap M_{g_{3}})\le \gamma d(x,M_{g_{3}}), \end{aligned}$$
then
$$\begin{aligned} T_{B}(M_{g_{1}}\cap M_{g_{2}}\cap M_{g_{3}},{\overline{x}})=\left\{ u\in X\mid \nabla g_{i}({\overline{x}})(u)\le 0,\ i\in \overline{1,3}\right\} . \end{aligned}$$
Consider a \(C^{1}\) function \(g=(g_{1},g_{2}):X\rightarrow {\mathbb {R}}^{2},\) the associated constraint system of inequalities \(g(x)\le 0,\) and let \({\overline{x}}\) be a feasible point where both scalar constraints are active. Let us discuss the relationship between (MFCQ), (Str) and (MSCQ).
Proposition 3.9
In the above notation, we have the following implications:

(i) \((MFCQ)\Leftrightarrow (MRCQ)\Rightarrow (MSCQ);\)

(ii) \((MSCQ)\Rightarrow (Str);\)

(iii) \(\left[ (MSCQ)\text { for }g_{1}\text { and }g_{2}\right] +(Str)\Rightarrow (MSCQ)\text { for }(g_{1},g_{2}).\)
Proof
(i) All the implications are known (see the above comments).
(ii) Firstly, remark that \(M_{g_{i}}={\tilde{g}}_{i}^{-1}\left( 0\right) \) for \(i=1,2.\) Then (MSCQ) means that there are some constants \(r,\alpha >0\) such that for all \(x\in B\left( {\overline{x}},r\right) \)
$$\begin{aligned} d\left( x,M_{g_{1}}\cap M_{g_{2}}\right) \le \alpha d\left( 0,{\tilde{g}}(x)\right) , \end{aligned}$$
where \({\tilde{g}}(x):=g(x)+{\mathbb {R}}_{+}^{2}.\)
Since the functions \(g_{i}\) are \(C^{1},\) they are locally Lipschitz around \({\overline{x}}.\) Hence, the \({\tilde{g}}_{i}\) are Lipschitz multifunctions around \(\left( {\overline{x}},0\right) ,\) meaning that \({\tilde{g}}_{i}^{-1}\) are metrically regular around \(\left( 0,{\overline{x}}\right) .\) This obviously implies that there exists \(\beta >0\) such that
$$\begin{aligned} d\left( 0,{\tilde{g}}_{i}(x)\right) \le \beta d\left( x,M_{g_{i}}\right) ,\quad i=1,2, \end{aligned}$$
for all x sufficiently close to \({\overline{x}}.\) The two relations above imply the (Str) condition.
(iii) The assumption made means that there are \(r,\alpha ,\beta >0\) such that for all \(x\in B\left( {\overline{x}},r\right) \)
$$\begin{aligned} d\left( x,M_{g_{i}}\right) \le \alpha d\left( 0,{\tilde{g}}_{i}(x)\right) ,\ i=1,2,\quad \text {and}\quad d\left( x,M_{g_{1}}\cap M_{g_{2}}\right) \le \beta \left( d(x,M_{g_{1}})+d(x,M_{g_{2}})\right) . \end{aligned}$$
Then for all \(x\in B\left( {\overline{x}},r\right) ,\)
$$\begin{aligned} d\left( x,M_{g_{1}}\cap M_{g_{2}}\right) \le \alpha \beta \left( d\left( 0,{\tilde{g}}_{1}(x)\right) +d\left( 0,{\tilde{g}}_{2}(x)\right) \right) =\alpha \beta d\left( 0,{\tilde{g}}(x)\right) , \end{aligned}$$
i.e., (MSCQ) holds.\(\square \)
We present some illustrative examples where we aim to discuss different situations from the perspective of the properties we study here, that is, (MFCQ), (CHIP), (MSCQ), (Str).
Example 3.10
Let \(g_{1},g_{2}:{\mathbb {R}}^{2}\mathbb {\rightarrow R}\) be given by
Consider the sets
Take \(\left( {\overline{x}},{\overline{y}}\right) =0\in {\mathbb {R}}^{2}\) and observe that \(\nabla g_{1}\left( {\overline{x}},{\overline{y}}\right) =(1,-1)=-\nabla g_{2}\left( {\overline{x}},{\overline{y}}\right) \) and (MFCQ) holds for \(g_{1}\) and \(g_{2}\) at \(\left( {\overline{x}},{\overline{y}}\right) ,\) while it does not for \((g_{1},g_{2}).\) We have
whence
On the other hand, \(M_{1}\cap M_{2}=\{0\},\) so \(T_{B}(M_{1}\cap M_{2} ,0)=\{0\},\) whence the equality does not hold. However, the relation (MC) does not hold either. Indeed, the relation (MC) would imply the existence of \(\mu >0\) such that for every small \(x>0,\)
which is impossible. So, (Str) does not hold.
Moreover, (MSCQ) would mean the existence of \(\alpha >0\) such that
for all \(\left( x,y\right) \) close to \(\left( 0,0\right) .\) But, for \(x=y,\) this would mean
for all small x, and this is false. Therefore, (MSCQ) does not hold.
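Since the explicit formulas of \(g_{1}\) and \(g_{2}\) are not reproduced here, the failure of (MC) can be illustrated on representative data consistent with the stated gradients; we assume, for this sketch only, \(g_{1}(x,y)=x-y\) and \(g_{2}(x,y)=y-x+x^{2}\). The two sets then touch tangentially along the line \(y=x\), their intersection reduces to the origin, and the ratio in (MC) blows up along the boundary of \(M_{1}\):

```python
import math

# Assumed representative data, consistent with the gradients stated above
# (the original formulas of g1, g2 are not reproduced here):
#   g1(x, y) = x - y,          M1 = {g1 <= 0} = {y >= x},
#   g2(x, y) = y - x + x**2,   M2 = {g2 <= 0} = {y <= x - x**2},
# so M1 cap M2 = {(0, 0)} while both tangent cones at 0 contain y = x.
def d_M2(p):
    # distance to M2, approximated through the boundary curve y = s - s**2
    x, y = p
    if y <= x - x * x:
        return 0.0
    return min(math.hypot(x - s, y - (s - s * s))
               for s in (i / 100000.0 - 1.0 for i in range(200001)))

def mc_ratio(t):
    # p = (t, t) lies on the boundary of M1, and d(p, M1 cap M2) = ||p||
    p = (t, t)
    return math.hypot(p[0], p[1]) / d_M2(p)

# The ratio blows up as t -> 0, so no constant mu can satisfy (MC).
ratios = [mc_ratio(t) for t in (0.1, 0.01, 0.001)]
```

The quadratic gap between the two boundaries against the linear distance to the intersection is exactly what makes (MC), and hence (Str), fail in such tangential configurations.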
Example 3.11
Let \(g_{1},g_{2}:{\mathbb {R}}^{2}\mathbb {\rightarrow R}\) be given by
Consider the sets
Take \(\left( {\overline{x}},{\overline{y}}\right) =0\in {\mathbb {R}}^{2}\) and observe that \(\nabla g_{1}\left( {\overline{x}},{\overline{y}}\right) =(1,-1)=-\nabla g_{2}\left( {\overline{x}},{\overline{y}}\right) \) and (MFCQ) holds for \(g_{1}\) and \(g_{2}\) at \(\left( {\overline{x}},\overline{y}\right) ,\) while it does not for \((g_{1},g_{2}).\) We have
whence
On the other hand,
so
whence the equality does hold. Moreover, it is not difficult to see that the relation (MC) does hold as well for small s and \(\mu =1\). So, (Str) holds. Also an easy computation shows that (MSCQ) holds.
Now we give examples which show that the converse implications in Proposition 3.9 (i), (ii) do not hold.
Example 3.12
Let \(g_{1},g_{2}:\mathbb {R\rightarrow R}\) be given by
Take \({\overline{x}}=0\in {\mathbb {R}}\). It is clear that (MFCQ) does not hold for \(g_{1}\) and \(g_{2}\). However, (MSCQ) does hold, since \({\tilde{g}} _{1}^{-1}\left( 0\right) \cap {\tilde{g}}_{2}^{-1}\left( 0\right) ={\mathbb {R}},\) hence \(d\left( x,{\tilde{g}}_{1}^{-1}\left( 0\right) \cap {\tilde{g}}_{2}^{-1}\left( 0\right) \right) =0\) for any x.
Example 3.13
Consider now \(g_{1},g_{2}:\mathbb {R\rightarrow R}\) given by
Consider \({\overline{x}}=0\in {\mathbb {R}}\). For this system, (MSCQ) does not hold, since otherwise we should find \(\alpha >0\) such that
for all x close to 0, which is not possible. On the other hand, (Str) trivially holds with \(\mu =1.\)
Remark 3.14
The second item of Proposition 3.9 and the above example show that there are situations where (Str) is strictly weaker than (MSCQ). This fact will be emphasized in the next section. Some interesting equivalences concerning the fulfillment of the Karush–Kuhn–Tucker conditions can be found in [1] and [33].
Moreover, it is well known that translating regularity properties from the image or product space into the preimage space requires, in general, certain regularity assumptions on the involved mappings.
Nevertheless, the conditions (MC) and (MFCQ) are only sufficient for having the equality between the tangent cone to the intersection and the intersection of tangent cones, as the next example shows.
Example 3.15
Consider \(g_{1},g_{2}:{\mathbb {R}}^{2}\mathbb {\rightarrow R}\) given by
and
respectively. We have that \(\nabla g_{1}(0,0)=\left( 0,-1\right) \) and \(\nabla g_{2}(0,0)=\left( 0,1\right) \) whence (MFCQ) holds for \(g_{1}\) and for \(g_{2}\), while it does not hold for \(\left( g_{1},g_{2}\right) \). We have that
Now denote \(f(x)=x^{4}\sin ^{2}\frac{1}{x}\) for any \(x\ne 0\) and consider, for \(n\in {\mathbb {N}}\setminus \left\{ 0\right\} ,\) the number
and the element
Since
one has that
On the other hand,
whence, (MC) would imply the existence of a constant \(\mu >0\) such that for all n large enough,
that is
Of course, this is impossible since the right-hand term converges to 0 as \(n\rightarrow \infty \). This means, obviously, that (Str) does not hold either.
4 Consequences in Optimization
In this section, we present some consequences of the metric condition discussed above, and we give as well some other similar conditions that can be used in various contexts.
Consider a scalar function \(f:X\rightarrow {\mathbb {R}}\) and recall that the Hadamard upper directional derivative of f at \({\overline{x}}\in X\) in direction \(u\in X\) is
while the Hadamard lower directional derivative of f at \({\overline{x}}\) in direction u is
In [7] a concept of directional minimum was introduced and studied in the vectorial setting (see also [14]). We consider this concept here, but, initially, for the sake of simplicity, we restrict ourselves to the scalar case.
Definition 4.1
Let \(f:X\rightarrow {\mathbb {R}}\) be a function and \(A\subset X\), \(L\subset S(0,1)\) be nonempty closed sets. One says that \({\overline{x}}\in A\) is a local directional minimum point for f on A with respect to (the set of directions) L if there exists a neighborhood U of \({\overline{x}}\) such that for every \(x\in U\cap A\cap \left( {\overline{x}}+{\text {cone}}L\right) ,\) \(f({\overline{x}})\le f(x).\)
Proposition 4.2
Let \(f:X\rightarrow {\mathbb {R}}\) be a function, \(A\subset X\), \(L\subset S(0,1)\) be nonempty closed sets and \({\overline{x}}\in A.\) Suppose that there exist \(s>0,\mu >0\) such that
-
(i)
If \({\overline{x}}\) is a local directional minimum point for f on A with respect to L, then \(d_{+}f({\overline{x}},u)\ge 0\) for all \(u\in T_{B}(A,{\overline{x}})\cap L.\)
-
(ii)
Moreover, if X is finite dimensional and \(d_{-}f({\overline{x}},u)>0\) for all \(u\in T_{B}(A,{\overline{x}})\cap L,\) then there exists \(\alpha >0\) such that \({\overline{x}}\) is a local directional minimum point for \(f\left( \cdot \right) -\alpha \left\| \cdot -{\overline{x}}\right\| \) on A with respect to L.
Proof
Firstly, observe that, according to Theorem 3.1, the metric assumption imposed ensures that
Since, obviously,
and
we actually get
Now the proof of both items is quite standard, but we present it here for the sake of clarity.
(i) Remark that the minimality of \({\overline{x}}\) on A with respect to a set of directions L is in fact the minimality of \({\overline{x}}\) on \(A\cap \left( {\overline{x}}+{\text {cone}}L\right) .\) Therefore, for every \(u\in T_{B}(A,{\overline{x}})\cap {\text {cone}}L=T_{B}(A\cap \left( \overline{x}+{\text {cone}}L\right) ,{\overline{x}}),\ \)there are some sequences \((t_{n})\downarrow 0,(u_{n})\rightarrow u\) such that for all \(n\in {\mathbb {N}},{\overline{x}}+t_{n}u_{n}\in U\cap A\cap \left( \overline{x}+{\text {cone}}L\right) ,\) whence, for n large enough \(f\left( {\overline{x}}+t_{n}u_{n}\right) \ge f({\overline{x}}).\) Consequently,
(ii) Suppose, by way of contradiction, that for every positive integer n there exists \(x_{n}\in A\cap \left( {\overline{x}}+{\text {cone}}L\right) \cap B({\overline{x}},n^{-1})\) such that
Then for all \(n\ge 1,\) \(x_{n}\ne {\overline{x}}\) and
Since X is finite dimensional, we can suppose, without loss of generality, that \(\left( \frac{x_{n}-{\overline{x}}}{\left\| x_{n}-\overline{x}\right\| }\right) \) converges to an element u which clearly is in \(T_{B}(A,{\overline{x}})\cap L.\) Therefore
and this is a contradiction. Therefore, the conclusion holds.\(\square \)
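The mechanism of item (i) can be illustrated numerically. The instance below is ours, not from the text: \(f(x)=|x|-x/2\), \(A=[0,\infty )\), \(L=\{1\}\), \({\overline{x}}=0\), so \(A={\overline{x}}+{\text {cone}}L\) and the metric assumption holds trivially with \(\mu =1\); sampled difference quotients then approximate the Hadamard upper derivative \(d_{+}f(0,1)=1/2\ge 0\), as item (i) predicts.

```python
import numpy as np

# Hypothetical instance (ours, not from the paper): f(x) = |x| - x/2 on
# A = [0, inf) with L = {1}, xbar = 0.  Since f(x) = x/2 >= 0 on
# cone L = [0, inf), xbar is a local directional minimum, so by
# Proposition 4.2(i) the Hadamard upper derivative d_+ f(0, 1) >= 0.
f = lambda x: abs(x) - 0.5 * x

ts = 10.0 ** -np.arange(1, 9)                # t -> 0+
us = 1.0 + np.linspace(-1e-3, 1e-3, 21)      # u' -> u = 1
# sampled difference quotients (f(xbar + t u') - f(xbar)) / t
d_plus = max((f(t * u) - f(0.0)) / t for t in ts for u in us)

assert d_plus >= 0.0                         # conclusion of item (i)
```

The estimate `d_plus` is close to \(1/2\), the exact value of \(d_{+}f(0,1)\) for this instance.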
Remark 4.3
According to Proposition 3.3, one can replace (4.1) by any of the equivalent corresponding conditions.
From our point of view, an interesting fact is that metric conditions of the type (MC) come into play as weak assumptions to ensure the validity of some penalization principles. Even if all the conditions in Proposition 3.3 are equivalent, the change of constants is important in the construction of the penalty function. More details will be given after the next results.
Theorem 4.4
(penalty of an intersection of sets) Let \(f:X\rightarrow {\mathbb {R}}\) be a function and \(A,B\subset X\) be nonempty, closed sets. Let \({\overline{x}}\in A\cap B\) be a local minimum point for f on \(A\cap B\). Suppose that
-
(i)
there exist \(\varepsilon >0\) and \(\ell >0\) such that f is \(\ell -\)Lipschitz on \(B({\overline{x}},\varepsilon );\)
-
(ii)
there exist \(s>0,\mu >0\) such that for all \(x\in B({\overline{x}},s)\cap A,\)
$$\begin{aligned} d(x,A\cap B)\le \mu d(x,B). \end{aligned}$$
Then \({\overline{x}}\) is a local minimum point for \(f+\ell \mu d(\cdot ,B)\) on A and a local minimum point (without constraints) for
$$\begin{aligned} f+\ell \left( 1+\mu \right) d(\cdot ,A)+\ell \mu d(\cdot ,B). \end{aligned}$$(4.2)
Proof
We prove first that \({\overline{x}}\) is a local minimum point on A for
Let \(\alpha >0\) be the radius of the ball given by the local minimality of \({\overline{x}}.\) We choose \(\delta >0\) such that \(\delta <\min \left\{ \left( 1+\mu \right) ^{-1}\alpha ,s,\left( 1+\mu \right) ^{-1}\varepsilon \right\} \) and \(x\in A\cap B({\overline{x}},\delta ).\) Since \(f({\overline{x}})+\ell \mu d({\overline{x}},B)=f({\overline{x}}),\) we have to show that
If \(x\in A\cap B({\overline{x}},\delta )\cap B,\) this is obvious, by the property of \({\overline{x}}.\) Take \(x\in A\cap B({\overline{x}},\delta )\setminus B.\) By (ii),
whence, for every \(\rho \in (0,\min \left\{ \alpha ,\varepsilon \right\} -\delta \left( 1+\mu \right) ),\)
This means that there is \(x_{\rho }\in A\cap B\) such that
But,
Consequently, \(x_{\rho }\in A\cap B\cap B({\overline{x}},\alpha ),\) whence \(f({\overline{x}})\le f(x_{\rho }).\) Then, by (i),
Letting \(\rho \rightarrow 0\) we get the desired inequality, whence the claim is proved.
Now, observe that \(f+\ell \mu d(\cdot ,B)\) is locally Lipschitz around \({\overline{x}}\) with the Lipschitz constant \(\ell (1+\mu ).\) We apply the standard Clarke penalization principle and we get the conclusion.\(\square \)
Remark 4.5
Observe that the usual Clarke penalization principle corresponds to the case \(A=B\) in Theorem 4.4.
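A one-dimensional numeric sketch of Theorem 4.4, with data assumed by us (not taken from the paper), can make the penalization explicit: take \(A=[0,1]\), \(B=[-1,0]\), so \(A\cap B=\{0\}\), and \(f(x)=x\), which is \(1\)-Lipschitz; condition (ii) holds with \(\mu =1\) since \(d(x,A\cap B)=x=d(x,B)\) for \(x\in A\).

```python
import numpy as np

# Hypothetical data (ours, not the paper's): A = [0, 1], B = [-1, 0], so
# A ∩ B = {0}; f(x) = x is 1-Lipschitz (ell = 1), and condition (ii)
# holds with mu = 1 since d(x, A ∩ B) = x = d(x, B) for every x in A.
ell, mu = 1.0, 1.0
f = lambda x: x
d_A = lambda x: max(0.0, -x, x - 1.0)     # distance to A = [0, 1]
d_B = lambda x: max(0.0, x, -1.0 - x)     # distance to B = [-1, 0]

# the penalized function (4.2): f + ell(1 + mu) d(., A) + ell mu d(., B)
pen = lambda x: f(x) + ell * (1 + mu) * d_A(x) + ell * mu * d_B(x)

xs = np.linspace(-0.5, 0.5, 1001)
vals = [pen(x) for x in xs]
# xbar = 0 stays an unconstrained minimizer of the penalized function
assert min(vals) >= pen(0.0) - 1e-12
assert abs(xs[int(np.argmin(vals))]) < 1e-6
```

On this instance the penalized function equals \(-x\) on \([-0.5,0]\) and \(2x\) on \([0,0.5]\), so the constrained minimizer \({\overline{x}}=0\) is indeed an unconstrained one.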
Remark 4.6
One can employ as well the following argument: if (ii) holds, then, according to Proposition 3.3 (i)\(\Leftrightarrow \)(iv), for all x close to \({\overline{x}}\)
and the Clarke penalization principle gives \(\ell \left( 1+\mu \right) d(\cdot ,A)+\ell \left( 1+\mu \right) d(\cdot ,B)\) as penalization term.
Remark 4.7
Similarly to the situation described in the above remark, the use of condition (MC) instead of (Str) could be of importance in the study of the rate of convergence of some algorithms related to the method of alternating projections or, more generally, to set intersection problems (see [4, 5, 10, 29]).
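The role such metric conditions play in convergence rates can be seen on a toy alternating-projections instance of our own choosing: let A be the closed unit disc and B a horizontal line. When the line is \(\{y=0.5\}\), the sets meet transversally and the residual gap decays linearly; when it is \(\{y=1\}\), the sets are tangent, subtransversality fails, and the decay is dramatically slower.

```python
import numpy as np

def proj_disc(p):
    # projection onto A, the closed unit disc
    n = np.linalg.norm(p)
    return p if n <= 1.0 else p / n

def gap_after(height, iters):
    # alternate projections between A and the line B = {y = height},
    # returning the residual distance from the A-iterate to B
    x = np.array([2.0, -1.0])
    for _ in range(iters):
        x = proj_disc(np.array([x[0], height]))
    return abs(x[1] - height)

fast = gap_after(0.5, 100)   # transversal intersection: linear convergence
slow = gap_after(1.0, 100)   # tangential intersection: subtransversality fails
assert fast < 1e-8 < slow
```

In the tangential case the gap behaves like \(O(1/k)\) after k iterations, while in the transversal case it contracts by a constant factor per iteration.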
We apply this generalized penalty result for getting necessary optimality conditions in the dual space for directional minima.
Corollary 4.8
Let X be an Asplund space, \(f:X\rightarrow {\mathbb {R}}\) be a function, \(A\subset X\), \(L\subset S\left( 0,1\right) \) be nonempty closed sets and \({\overline{x}}\in A.\) Suppose that:
-
(i)
\({\overline{x}}\) is a local directional minimum point for f on A with respect to L;
-
(ii)
there exist \(\varepsilon >0\) and \(\ell >0\) such that f is \(\ell -\)Lipschitz on \(B({\overline{x}},\varepsilon );\)
-
(iii)
there exist \(s>0,\mu >0\) such that for all \(x\in B({\overline{x}},s)\cap A,\)
$$\begin{aligned} d(x,A\cap \left( {\overline{x}}+{\text {cone}}L\right) )\le \mu d(x,{\overline{x}}+{\text {cone}}L). \end{aligned}$$
Then there are \(u^{*}\in N(A,{\overline{x}}),v^{*}\in N\left( {\text {cone}}L,0\right) \) with \(\left\| u^{*}\right\| \le \ell \left( 1+\mu \right) \ \)and \(\left\| v^{*}\right\| \le \ell \mu ,\) such that
$$\begin{aligned} -u^{*}-v^{*}\in \partial f({\overline{x}}). \end{aligned}$$
Proof
Since \({\overline{x}}\) is a local directional minimum point for f on A with respect to L, then it is a local minimum point for f on \(A\cap \left( {\overline{x}}+{\text {cone}}L\right) \). Then, on the basis of (ii) and (iii) we can apply Theorem 4.4 to get that \({\overline{x}}\) is a local minimum point for \(f+\ell \left( 1+\mu \right) d(\cdot ,A)+\ell \mu d(\cdot ,{\overline{x}}+{\text {cone}}L).\) Therefore, since all the functions are locally Lipschitz around \({\overline{x}},\) one can apply the exact subdifferential calculus rules (see [26, Theorem 3.36]) to get
The conclusion follows.\(\square \)
Remark 4.9
The importance of having constants as small as possible is apparent in the conclusion of Corollary 4.8 in the estimation of the norms of the generalized Lagrange multipliers \(u^{*}\) and \(v^{*}.\) Observe that if \({\text {cone}}L\) is convex, then \(N\left( {\text {cone}} L,0\right) =L^{-}.\)
Remark 4.10
In view of the discussion in the preceding section, the condition displayed in (iii) is, in general, weaker than (MSCQ)-type conditions.
We illustrate the above results by the following example.
Example 4.11
Let \(X:={\mathbb {R}}^{2},\) \(A:=\left[ 0,1\right] ^{2},\) \(L:=S\left( \left( 0,0\right) ,1\right) \cap \left\{ \left( x,y\right) \in {\mathbb {R}}^{2}\mid y\ge x\ge 0\right\} .\) Consider the function \(f:{\mathbb {R}}^{2} \rightarrow {\mathbb {R}},\) \(f(x,y)=2y-x.\) Clearly, \(\left( 0,0\right) \) is a directional minimum point for f on A with respect to L. Moreover, the assumptions of Corollary 4.8 are satisfied with \(\ell =\sqrt{5}\) and \(\mu =1.\) Here \(\partial f(0,0)=\left\{ \left( -1,2\right) \right\} \) and there are \(u^{*}=\left( 0,-1\right) \in N(A,\left( 0,0\right) ),\) \(v^{*}=\left( 1,-1\right) \in L^{-}\) satisfying the conclusion.
On the other hand, if we change only the set of directions to
then one cannot find \(u^{*},\) \(v^{*}\) to satisfy the conclusion, confirming that this time \(\left( 0,0\right) \) is not a directional minimum point for f on A with respect to L.
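The data of Example 4.11 can be verified numerically; the short script below (using only the quantities given in the example) checks both the directional minimality of \((0,0)\) and the multiplier identity from Corollary 4.8.

```python
import numpy as np

# Example 4.11: f(x, y) = 2y - x, A = [0,1]^2, cone L = {(x, y): y >= x >= 0}.
# On A ∩ cone L we have f(x, y) = 2y - x >= 2x - x = x >= 0 = f(0, 0).
pts = np.random.default_rng(0).uniform(0.0, 1.0, size=(10_000, 2))
mask = pts[:, 1] >= pts[:, 0]                 # points of A lying in cone L
assert np.all(2 * pts[mask, 1] - pts[mask, 0] >= 0.0)

# Multiplier identity of Corollary 4.8: with u* = (0,-1) in N(A,(0,0)) and
# v* = (1,-1) in L^-, -u* - v* = (-1, 2), which is the gradient of f at (0,0).
u_star, v_star = np.array([0.0, -1.0]), np.array([1.0, -1.0])
assert np.allclose(-u_star - v_star, np.array([-1.0, 2.0]))
```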
In the next result we point out that a metric condition can be imposed as well for a functional constraint in order to get a penalty result. Let Z be a normed vector space, \(g:X\rightarrow Z\) be a function and \(Q\subset Z\) be a pointed closed convex cone. As above, one defines the set-valued map \({\tilde{g}}:X\rightrightarrows Z\) given by \({\tilde{g}}(x):=g(x)+Q\) and one considers \(g^{-1}\left( -Q\right) ={\tilde{g}}^{-1}\left( 0\right) \) as the feasible set.
Theorem 4.12
(penalty for a functional constraint) Let \({\overline{x}}\in g^{-1}(-Q)\) be a local minimum of f on \(g^{-1}(-Q)\). Suppose that
-
(i)
there exist \(\varepsilon >0\) and \(\ell >0\) such that f is \(\ell -\)Lipschitz on \(B({\overline{x}},\varepsilon );\)
-
(ii)
there exist \(s,\mu >0\) such that for all \(x\in g^{-1}\left( -Q+B(0,s)\right) \cap B\left( {\overline{x}},s\right) \)
$$\begin{aligned} d(x,g^{-1}(-Q))\le \mu d(0,{\tilde{g}}(x)\cap B(0,s)). \end{aligned}$$
Then \(({\overline{x}},0)\) is a local minimum for the function \((x,z)\mapsto f(x)+\ell \mu \left\| z\right\| +\ell \left( 1+\mu \right) d(\left( x,z\right) ,{\text {Gr}}{\tilde{g}}).\) If, moreover, X, Z are Asplund spaces, then
$$\begin{aligned} -\partial f({\overline{x}})\cap D(0,\ell \left( 1+\mu \right) )\cap D^{*}{\tilde{g}}({\overline{x}},0)(Q^{+}\cap D(0,\ell \mu ))\ne \emptyset . \end{aligned}$$
Proof
Denote \(h:X\times Z\rightarrow {\mathbb {R}}\), \(h(x,z)=f(x)+\ell \mu \left\| z\right\| .\) Clearly, \(h(\overline{x},0)=f({\overline{x}})\) and we show first that \(({\overline{x}},0)\) is a local minimum on \({\text {Gr}}{\tilde{g}}\) for h. Thus, we have to show that there is \(\delta >0\) such that for all \((x,z)\in \left( B\left( \overline{x},\delta \right) \times B(0,\delta )\right) \cap {\text {Gr}}\tilde{g},\)
Let \(\alpha \) be the radius of the ball given by the local minimality of \({\overline{x}}.\) Take \(\delta >0\) such that \(\delta <\min \left\{ \varepsilon \left( 1+\mu \right) ^{-1},s,\alpha \left( 1+\mu \right) ^{-1}\right\} \) and \((x,z)\in \left( B\left( {\overline{x}},\delta \right) \times B(0,\delta )\right) \cap {\text {Gr}}{\tilde{g}}.\) If \(z\in -Q,\) then by the property of \({\overline{x}},\)
Suppose that \(z\notin -Q.\) Then there is \(q\in Q\) such that \(z-q=g\left( x\right) ,\) hence \(x\in B\left( {\overline{x}},\delta \right) \cap g^{-1}(z-q)\subset B\left( {\overline{x}},s\right) \cap g^{-1}\left( -Q+B(0,s)\right) .\) Now, by (ii),
Then for every \(\rho \in (0,\min \left\{ \alpha ,\varepsilon \right\} -\delta \left( 1+\mu \right) ),\) there is \(x_{\rho }\in g^{-1}(-Q)\) such that
Then
Therefore, by (i),
For \(\rho \rightarrow 0,\) we get the claim.
Since h is locally Lipschitz around \(({\overline{x}},0)\) with constant \(\ell \left( 1+\mu \right) ,\) we apply the standard Clarke penalization principle, and we get the first conclusion.
For the second part, by applying the generalized Fermat rule for limiting subdifferential on Asplund spaces, we get
Then there are \(x^{*}\in \partial f({\overline{x}}),\) \(z^{*}\in D(0,\ell \mu )\) such that
In particular, this means that
Taking into account the definition of the limiting normal cone, \(\left( -x^{*},-z^{*}\right) \) can be approximated in the weak-star topology by elements \((-x_{n}^{*},-z_{n}^{*})\) of the Fréchet normal cone to \({\text {Gr}}{\tilde{g}}\) at points close to \(\left( {\overline{x}},0\right) .\) According to [11, Lemma 3.2], the elements \(z_{n}^{*}\) belong to \(Q^{+}\) and, since this set is weakly-star closed, \(z^{*}\in Q^{+}.\) Therefore, \(z^{*}\in Q^{+}\cap D(0,\ell \mu ),\) and we get the conclusion.\(\square \)
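The primal part of Theorem 4.12 can be checked on a hypothetical one-dimensional instance of our own (not from the paper): \(g(x)=x\), \(Q={\mathbb {R}}_{+}\), so \(g^{-1}(-Q)=(-\infty ,0]\), and \(f(x)=-x\) attains its constrained minimum at \({\overline{x}}=0\) with \(\ell =1\); condition (ii) holds with \(\mu =1\) since \(d(x,(-\infty ,0])=\max (x,0)=d(0,x+{\mathbb {R}}_{+})\).

```python
import numpy as np

# Hypothetical instance (ours): g(x) = x, Q = R_+, f(x) = -x, xbar = 0,
# ell = mu = 1.  Gr g~ = {(x, z): z >= x} is a half-plane.
ell, mu = 1.0, 1.0
f = lambda x: -x

# Euclidean distance from (x, z) to the half-plane {z >= x}:
d_graph = lambda x, z: max(0.0, (x - z) / np.sqrt(2.0))

# the penalized function of Theorem 4.12
pen = lambda x, z: f(x) + ell * mu * abs(z) + ell * (1 + mu) * d_graph(x, z)

grid = np.linspace(-0.5, 0.5, 101)
vals = [pen(x, z) for x in grid for z in grid]
# (xbar, 0) = (0, 0) is an unconstrained local minimum of pen
assert min(vals) >= pen(0.0, 0.0) - 1e-12
```

A short case analysis confirms the numeric check: on \(\{z\ge x\}\) the penalized value is \(|z|-x\ge 0\), and on \(\{z<x\}\) it is \(|z|-x+\sqrt{2}(x-z)\ge 0\), with equality only at \((0,0)\).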
Remark 4.13
Notice that the dual conditions in Corollary 4.8 and the last part of Theorem 4.12 are consequences of the corresponding primal conditions and the sum rule for limiting subdifferentials.
By combining both penalty principles above, one can similarly obtain optimality conditions for directional minima under functional constraints. We leave the details to the interested reader.
Finally, we give some optimality conditions for a concept of directional minimum with respect to two sets of directions for vectorial functions (see [14]). One needs an auxiliary result.
Lemma 4.14
Let X, Y be Banach spaces, \(f:X\rightarrow Y\) a continuously differentiable function, \(M\subset S_{Y}\) a closed and nonempty set, and \({\overline{x}}\in X\). Suppose that one of the following sets of conditions holds:
-
(i)
\({\text {cone}}M\) is convex and there exists \(u_{0}\in X\) such that \(\nabla f({\overline{x}})(u_{0})\in -{\text {int}}({\text {cone}}M)\);
-
(ii)
the map \(g:X\times Y\rightarrow Y\) given by \(g(x,y)=f(x)-y\) is metrically subregular at \(({\overline{x}},f({\overline{x}}),0)\) with respect to \(X\times f^{-1}(f({\overline{x}})-{\text {cone}}M)\).
Then
$$\begin{aligned} T_{B}(f^{-1}(f({\overline{x}})-{\text {cone}}M),{\overline{x}})=\nabla f({\overline{x}})^{-1}(-{\text {cone}}M). \end{aligned}$$(4.3)
Proof
Provided that f is continuously differentiable, it is always true that \(T_{B}(f^{-1}(f({\overline{x}})-{\text {cone}} M),{\overline{x}})\subset \nabla f({\overline{x}})^{-1}(-{\text {cone}}M)\). Indeed, for an arbitrary \(u\in T_{B}(f^{-1}(f({\overline{x}} )-{\text {cone}}M),{\overline{x}})\), there exist \((t_{n})\downarrow 0\) and \(\left( u_{n}\right) \rightarrow u\) such that \(f({\overline{x}}+t_{n}u_{n})\in f({\overline{x}})-{\text {cone}}M\) for all n. Since f is continuously differentiable at \({\overline{x}}\), this means that there exists a function \(\alpha :X\rightarrow Y\) such that \(\lim _{h\rightarrow 0}\alpha (\overline{x}+h)=\alpha ({\overline{x}})=0\) and
Therefore,
and from here, dividing by \(t_{n}\) and passing to the limit, one gets \(u\in \nabla f({\overline{x}})^{-1}(-{\text {cone}}M)\).
If the condition from (ii) holds, the conclusion is granted by [12, Theorem 3.1].
Suppose now that we are under the assumptions from (i). Let \((t_{n})\) be an arbitrary sequence of positive numbers converging to 0, and set \(u_{n}:=u_{0}\). Using the differentiability of f, we can write
where \(\alpha :X\rightarrow Y\) is such that \(\lim \limits _{n\rightarrow \infty }\alpha ({\overline{x}}+t_{n}u_{n})=\alpha ({\overline{x}})=0.\) Therefore, since \(\nabla f({\overline{x}})(u_{0})\in -{\text {int}}({\text {cone}}M)\), we have that for all n large enough, \(f({\overline{x}}+t_{n}u_{0})\in f({\overline{x}})-{\text {cone}}M\), that is \(u_{0}\in T_{U}(f^{-1} (f({\overline{x}})-{\text {cone}}M),{\overline{x}})\).
Consider now \(v\in X\) such that \(\nabla f({\overline{x}})(v)\in -{\text {cone}}M\), and \(\lambda \in (0,1)\). Then
Hence, from the previous step,
Passing to the limit with \(\lambda \rightarrow 0\), and taking into account that the adjacent cone is closed, we get that \(v\in T_{U}(f^{-1}(f(\overline{x})-{\text {cone}}M),{\overline{x}})\subset T_{B}(f^{-1}(f(\overline{x})-{\text {cone}}M),{\overline{x}})\), and this completes the proof. \(\square \)
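Formula (4.3) can be sanity-checked on an assumed smooth instance of our own (not from the paper): \(f(x,y)=x^{2}+y\), \(M=\{1\}\), so \({\text {cone}}M={\mathbb {R}}_{+}\), \(f^{-1}(f(0,0)-{\text {cone}}M)=\{(x,y):y\le -x^{2}\}\), \(\nabla f(0,0)=(0,1)\), and (4.3) predicts the tangent cone \(\{u:u_{2}\le 0\}\); condition (i) holds with \(u_{0}=(0,-1)\).

```python
import numpy as np

# f(x, y) = x**2 + y, cone M = R_+: the constraint set is {y <= -x**2},
# grad f(0,0) = (0, 1), and (4.3) gives the tangent cone {u : u2 <= 0}.
grad = np.array([0.0, 1.0])

def in_set(p):
    # membership in f^{-1}(f(0,0) - cone M) = {(x, y): y <= -x**2}
    return p[1] <= -p[0] ** 2

def ray_enters(u):
    # for this set, 0 + t*u lies in it for some small t > 0 exactly when
    # grad @ u < 0; boundary directions (grad @ u = 0) need perturbed
    # sequences u_n -> u, as in the proof, and are skipped below
    return any(in_set((2.0 ** -k) * u) for k in range(10, 51))

rng = np.random.default_rng(1)
for u in rng.normal(size=(200, 2)):
    if grad @ u < -1e-12:
        assert ray_enters(u)          # strict descent directions are tangent
    elif grad @ u > 1e-12:
        assert not ray_enters(u)      # ascent directions are not
```

The boundary case \(\nabla f(0,0)(u)=0\) (e.g. \(u=(1,0)\)) still belongs to the tangent cone, but only along perturbed sequences \(u_{n}=(1,-t_{n})\), which the crude ray test above cannot detect; randomly sampled directions avoid this measure-zero set.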
We give next a more involved concept of directional minimality (see [14]).
Definition 4.15
Let X and Y be normed vector spaces, \(K\subset Y\) a closed ordering cone with nonempty interior, \(f:X\rightarrow Y\), and \(A\subset X\), \(L\subset S_{X} \), \(M\subset S_{Y}\) closed sets. One says that \({\overline{x}}\in A\) is a weak local directional Pareto minimum point for f with respect to the sets of directions L and M on A if there exists a neighborhood U of \({\overline{x}}\) such that
We present now the necessary optimality conditions.
Theorem 4.16
Let X, Y be Banach spaces, \(f:X\rightarrow Y\) a continuously differentiable function, \(K\subset Y\) a closed convex ordering cone with nonempty interior, \(A\subset X\), \(L\subset S_{X}\) and \(M\subset S_{Y}\) closed and nonempty sets, and \({\overline{x}}\in A\). Assume there exist \(s,\mu ,t,\gamma >0\) such that
and
Suppose also that either (i) or (ii) from Lemma 4.14 hold, and \({\overline{x}}\) is a weak local directional Pareto minimum for f on A with respect to L and M.
Then
Proof
First, we observe that the metric assumptions ensure the equality
But \(T_{B}({\overline{x}}+{\text {cone}}L,{\overline{x}} )={\text {cone}}L\) and since we are under assumptions (i) or (ii) from Lemma 4.14, we also have
Therefore,
The minimality of \({\overline{x}}\) means there exists a neighborhood U of \({\overline{x}}\) such that
That is, for every \(e\in {\text {int}}K\) and every \(x\in A\cap U\cap ({\overline{x}}+{\text {cone}}L)\cap f^{-1}(f(\overline{x})-{\text {cone}}M)\), we have \(s_{e}(f(x)-f({\overline{x}}))\ge 0\) since \(f(x)-f({\overline{x}})\notin -{\text {int}}K\). We can now see \({\overline{x}}\) as a local minimum point for the scalar function \(g:X\rightarrow {\mathbb {R}}\) given by \(g(x)=s_{e}(f(x)-f({\overline{x}}))\) on \(A\cap ({\overline{x}}+{\text {cone}}L)\cap f^{-1}(f(\overline{x})-{\text {cone}}M)\). Note that \(g({\overline{x}})=s_{e}(0)=0\) because \(s_{e}\) is sublinear.
Consider now \(u\in T_{B}(A,{\overline{x}})\cap {\text {cone}}(L)\cap \nabla f({\overline{x}})^{-1}(-{\text {cone}}M)\). There exist \((t_{n} )\downarrow 0\) and \(u_{n}\rightarrow u\) such that
due to (4.4). Moreover, since \({\overline{x}}+t_{n}u_{n}\rightarrow {\overline{x}}\), we also have that \({\overline{x}}+t_{n}u_{n}\in U\) for n sufficiently large. Hence,
for all n sufficiently large. There exists a function \(\alpha :X\rightarrow Y\) such that \(\lim _{h\rightarrow 0}\alpha ({\overline{x}}+h)=\alpha (\overline{x})=0\), so that we can write
Dividing by \(t_{n}\) and passing to the limit, we get \(s_{e}(\nabla f({\overline{x}})(u))\ge 0\), that is \(\nabla f({\overline{x}})(u)\notin -{\text {int}}K\). The proof is complete. \(\square \)
5 Concluding Remarks
Besides the known applications of the metric inequality on sets that we discuss in this work, we study its implications in at least two respects related to optimization problems. First, it is an appropriate replacement for some constraint qualification conditions when a new constraint is added. Second, it is naturally involved in penalization results when several constraints are considered. By its nature, as underlined above, the metric inequality is rather weak, and this offers the perspective of applying the general principle behind the results in this work to other research directions. For instance, the metric relations we study here can also be used to derive calculus rules for tangent cones, replacing the conditions based on subregularity of mappings initiated in [12]. Moreover, penalization procedures for vectorial problems with multiple constraints, based on results such as those in [36] and [13], are envisaged in future works.
Data Availability Statement
This manuscript has no associated data.
References
Azé, D.: Characterizations of the Lagrange-Karush-Kuhn-Tucker property. http://www.optimization-online.org/DB_FILE/2014/12/4689.pdf
Bauschke, H.H., Borwein, J.M., Li, W.: Strong conical hull intersection property, bounded linear regularity, Jameson's property (G), and error bounds in convex optimization. Math. Program. Ser. A 86, 135–160 (1999)
Bauschke, H.H., Borwein, J.M., Tseng, P.: Bounded linear regularity, strong CHIP, and CHIP are distinct properties. J. Convex Anal. 7, 395–412 (2000)
Bauschke, H.H., Luke, D.R., Phan, H.M., Wang, X.: Restricted normal cones and the method of alternating projections: Theory. Set-Valued Var. Anal. 21, 431–473 (2013)
Bauschke, H.H., Luke, D.R., Phan, H.M., Wang, X.: Restricted normal cones and the method of alternating projections: applications. Set-Valued Var. Anal. 21, 475–501 (2013)
Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer, New York (2000)
Chelmuş, T., Durea, M., Florea, E.-A.: Directional Pareto efficiency: concepts and optimality conditions. J. Optim. Theory Appl. 182, 336–365 (2019)
Clarke, F.H.: Optimization and Nonsmooth Analysis. Wiley, New York (1983)
Dontchev, A.L., Rockafellar, R.T.: Implicit Functions and Solution Mappings. Springer, Berlin (2009)
Drusvyatskiy, D., Ioffe, A.D., Lewis, A.S.: Transversality and alternating projections for nonconvex sets. Found. Comput. Math. 15, 1637–1651 (2015)
Durea, M., Strugariu, R.: On some Fermat rules for set-valued optimization problems. Optimization 60, 575–591 (2011)
Durea, M., Strugariu, R.: Calculus of tangent sets and derivatives of set-valued maps under metric subregularity conditions. J. Global Optim. 56, 587–603 (2013)
Durea, M., Strugariu, R.: Vectorial penalization for generalized functional constrained problems. J. Global Optim. 68, 899–923 (2017)
Florea, E.-A., Maxim, D.: Directional openness for epigraphical mappings and optimality conditions for directional efficiency. Optimization 70, 321–344 (2021)
Gfrerer, H., Mordukhovich, B.S.: Complete characterizations of tilt stability in nonlinear programming under weakest qualification conditions. SIAM J. Optim. 25, 2081–2119 (2015)
Gfrerer, H., Ye, J.J.: New constraint qualifications for mathematical programs with equilibrium constraints via variational analysis. SIAM J. Optim. 27, 842–865 (2017)
Göpfert, A., Riahi, H., Tammer, C., Zălinescu, C.: Variational Methods in Partially Ordered Spaces. Springer, Berlin (2003)
Hiriart-Urruty, J.-B.: From convex optimization to nonconvex optimization. Necessary and sufficient conditions for global optimality. In: Clarke, F.H., Dem’yanov, V.F., Giannessi, F. (eds.): Nonsmooth optimization and related topics, vol. 43, pp. 219–240. Springer Ettore Majorana International Science Series (1989)
Ioffe, A.D.: Variational Analysis of Regular Mappings Theory and Applications. Springer, Cham (2017)
Ioffe, A.D., Outrata, J.V.: On metric and calmness qualification conditions in subdifferential calculus. Set-Valued Var. Anal. 16, 199–227 (2008)
Kruger, A.Y., Luke, D.R., Thao, N.H.: About subtransversality of collections of sets. Set-Valued Var. Anal. 25, 701–729 (2017)
Kruger, A.Y., Luke, D.R., Thao, N.H.: Set regularities and feasibility problems. Math. Program. 168, 279–311 (2018)
Kruger, A.Y., Thao, N.H.: Quantitative characterizations of regularity properties of collections of sets. J. Optim. Theory Appl. 164, 41–67 (2015)
Li, S.J., Meng, K.W., Penot, J.-P.: Calculus rules for derivatives of multimaps. Set-Valued Var. Anal. 17, 21–39 (2009)
Li, S., Penot, J.-P., Xue, X.: Codifferential calculus. Set-Valued Var. Anal. 19, 505–536 (2011)
Mordukhovich, B.S.: Variational Analysis and Generalized Differentiation. Springer, Berlin (2006)
Ng, K.F., Zang, R.: Linear regularity and \(\varphi \)-regularity of nonconvex sets. J. Math. Anal. Appl. 328, 257–280 (2007)
Ngai, H.V., Théra, M.: Metric inequality, subdifferential calculus and applications. Set-Valued Var. Anal. 9, 187–216 (2001)
Pang, C.H.J.: Nonconvex set intersection problems: from projection methods to the Newton method for super-regular sets, arXiv:1506.08246
Penot, J.-P.: Metric estimates for the calculus of multimapping. Set-Valued Var. Anal. 5, 291–308 (1997)
Penot, J.-P.: On the minimization of difference functions. J. Global Optim. 12, 373–382 (1998)
Penot, J.-P.: Calculus without derivatives. Springer, New York (2013)
Penot, J.-P.: Error bounds and multipliers in constrained optimization problems with tolerance. SIAM J. Optim. 29, 522–540 (2019)
Robinson, S.M.: Stability theory for systems of inequalities, part II: differentiable nonlinear systems. SIAM J. Numer. Anal. 13, 497–513 (1976)
Tiba, D., Zălinescu, C.: On the necessity of some constraint qualification conditions in convex programming. J. Convex Anal. 11, 95–110 (2004)
Ye, J.J.: The exact penalty principle. Nonlinear Anal. Theory Methods Appl. 75, 1642–1654 (2012)
Zheng, X.Y., Ng, K.F.: Metric subregularity and constraint qualifications for convex generalized equations in Banach spaces. SIAM J. Optim. 18, 437–460 (2007)
Acknowledgements
The research of D.-E. Maxim and R. Strugariu was supported by the grant PN-III-P1-1.1-TE-2016-0868 of Romanian Ministry of Education and Research, CNCS–UEFISCDI. The authors thank the two anonymous reviewers for their constructive comments which significantly improved the presentation of the paper.
Communicated by Asen L. Dontchev.
Durea, M., Maxim, D. & Strugariu, R. Metric Inequality Conditions on Sets and Consequences in Optimization. J Optim Theory Appl 189, 744–771 (2021). https://doi.org/10.1007/s10957-021-01848-5