1 Introduction and Preliminaries

The starting point of the investigation we propose in this work is the question we briefly present below. Generally, if one considers a constrained optimization problem, then, in order to write necessary optimality conditions, some constraint qualification conditions are needed. Suppose that we add to the current system of constraints a new constraint. Of course, the problem can dramatically change, and even if the initial system satisfies a constraint qualification condition, the new system can fail to do so. We asked ourselves whether one can give a condition linking the old system of constraints and the new constraint in such a way that the optimality conditions apply to the new problem, without checking a constraint qualification condition for the whole new system of constraints. Actually, we started by asking this question in the case of a smooth optimization problem with inequality constraints, and then we observed that, in order to keep the requirements as minimal as possible, we arrive at a metric condition, denoted (MC) in the following, that naturally comes into play for other types of optimization problems, including nonsmooth ones, and for some penalization results of Clarke’s type as well. It turns out that this metric condition is equivalent to another well-known metric inequality condition, studied under various names (metric inequality, linear regularity, metric regularity, linear coherence, and subtransversality) for about thirty years, which has been used, among other things, as a key assumption in the study of the rate of convergence of algorithms related to set intersection problems, especially in establishing linear convergence of sequences generated by the method of alternating projections. The way we use it in this paper follows another main stream of possible applications, namely as a qualification condition for formulae involving normal or tangent cones to intersections of sets, generically known as “Conical Hull Intersection Property” (CHIP)-type conditions in the convex framework, and extended to the nonconvex case by many authors. In this work, we choose to call this condition subtransversality and we denote it by (Str). Not surprisingly, having in mind that metric subregularity type conditions can successfully replace “full” regularity in qualification conditions, and the well-known fact that the Mangasarian–Fromovitz constraint qualification (MFCQ) is equivalent to the metric regularity of a certain epigraph-type multifunction (a condition which we denote (MRCQ)), we asked ourselves what kind of subregularity corresponds to the (Str) condition. Some immediate candidates, involving difference and distance type multifunctions, are indeed metrically subregular iff (Str) holds, but a more interesting object is, of course, the epigraph multifunction mentioned above in relation with (MFCQ). As a consequence, we studied the implications between (Str) and the metric subregularity of the epigraphical multifunction associated with systems of inequalities. Conditions of this type were recently introduced and studied under the name of “metric subregularity constraint qualification” (MSCQ) conditions. It turns out that, in fact, in the context of inequality constraints involving \(C^{1}\) functions, (Str) is strictly weaker than the (MSCQ) condition. Finally, we observed that all the constraint qualification conditions are only sufficient for establishing the nonconvex (CHIP) property we study in this work.

The next figure summarizes the relations between all the conditions discussed above, as will be made clear in the third section, along with more comments and references.

Another important feature of the metric condition (MC) is that it can be employed in nonsmooth settings, for problems much more general than those we started with. As a consequence, the fourth section is twofold. On the one hand, we apply some facts collected in the previous sections to directional Pareto minima in order to get necessary optimality conditions, and, on the other hand, we employ the general pattern of the metric inequality under study to penalize scalar nonsmooth optimization problems with multiple constraints. The latter approach allows us to derive necessary optimality conditions in terms of limiting generalized differentiation techniques for the problems under consideration.

The notation we use is fairly standard. If X is a normed vector space, then we denote by \(B(x,r),\) \(D(x,r)\) and \(S(x,r)\) the open ball, the closed ball and the sphere of center \(x\in X\) and radius \(r>0,\) respectively. For a set \(A\subset X,\) we denote by \({\text {int}}A,\) \({\text {cl}}A,\) \({\text {bd}}A\) its topological interior, closure and boundary, respectively. The cone generated by A is designated by \({{\text {cone}}A,}\) and the convex hull of A is \({\text {conv}} A\). The distance from a point \(x\in X\) to a nonempty set \(A\subset X\) is \(d(x,A):=\inf \left\{ \left\| x-a\right\| \mid a\in A\right\} \) and the distance function to A is \(d_{A}:X\rightarrow {\mathbb {R}}\) given by \(d_{A}(x):=d(x,A)\). The topological dual of X is \(X^{*},\) and the negative polar of A is

$$\begin{aligned} A^{-}=\{x^{*}\in X^{*}\mid x^{*}(a)\le 0,\forall a\in A\}. \end{aligned}$$

The positive polar of A is \(A^{+}:=-A^{-}.\) Of course, \(A^{-}=\left( {\text {cone}}A\right) ^{-}.\)

Let D be a nonempty subset of X and \({\overline{x}}\in X.\) The first-order Bouligand tangent cone to D at \({\overline{x}}\) is the set

$$\begin{aligned} T_{B}(D,{\overline{x}}):=\{u\in X\mid \exists (t_{n})\downarrow 0,\exists (u_{n})\rightarrow u,\forall n\in {\mathbb {N}},{\overline{x}}+t_{n}u_{n}\in D\} \end{aligned}$$

where \((t_{n})\downarrow 0\) means \((t_{n})\subset (0,\infty )\) and \(t_{n} \rightarrow 0.\) The first-order Ursescu tangent cone to D at \({\overline{x}}\) is the set

$$\begin{aligned} T_{U}(D,{\overline{x}}):=\{u\in X\mid \forall (t_{n})\downarrow 0,\exists (u_{n})\rightarrow u,\forall n\in {\mathbb {N}},{\overline{x}}+t_{n}u_{n}\in D\}. \end{aligned}$$

The first-order Dubovitskij–Miljutin tangent set to D at \({\overline{x}}\) is the set

$$\begin{aligned} T_{DM}(D,{\overline{x}}):=\{u\in X\mid \forall (t_{n})\downarrow 0,\forall (u_{n})\rightarrow u,\exists n_{0}\in {\mathbb {N}}\text {,}\forall n\ge n_{0},{\overline{x}}+t_{n}u_{n}\in D\}. \end{aligned}$$

The Bouligand and Ursescu tangent cones are closed sets, and \(T_{U} (D,{\overline{x}})\subset T_{B}(D,{\overline{x}}).\) The fact that \(T_{B} (D,{\overline{x}})=X\backslash T_{DM}(X\backslash D,{\overline{x}})\) shows that the Dubovitskij–Miljutin tangent set to D at \({\overline{x}}\) is open. Moreover, for \(*\in \{B,U,DM\}\), we have \(T_{*}(D,{\overline{x}})=T_{*}({\text {cl}}D,{\overline{x}})\). If \({\overline{x}}\in {\text {int}}A,\) then \(T_{*}(D,{\overline{x}})=T_{*}(D\cap A,{\overline{x}})\). It is well known that \({\overline{x}}\in {\text {int}}D\) if and only if \(T_{DM} (D,{\overline{x}})=X\) and \({\overline{x}}\in {\text {cl}}D\) if and only if \(T_{B}(D,{\overline{x}})\ne \emptyset .\)
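A membership test convenient for computation: \(u\in T_{B}(D,{\overline{x}})\) if and only if \(\liminf _{t\downarrow 0}t^{-1}d({\overline{x}}+tu,D)=0\) (and \(u\in T_{U}(D,{\overline{x}})\) iff the corresponding limit is 0). The following minimal sketch, entirely illustrative and not part of the development below, probes this quotient for \(D=\{(x,y)\mid y\le x-x^{2}\}\) at \({\overline{x}}=(0,0)\), where \(T_{B}(D,(0,0))=\{(u_{1},u_{2})\mid u_{2}\le u_{1}\}\); the helper `dist_to_D` is our own construction.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# d(p, D) for D = {(x, y) : y <= x - x**2}: either p already lies in D,
# or the nearest point of D lies on the boundary parabola y = x - x^2
def dist_to_D(p):
    if p[1] <= p[0] - p[0]**2:
        return 0.0
    sq = lambda x: (x - p[0])**2 + (x - x**2 - p[1])**2
    res = minimize_scalar(sq, bounds=(-2, 2), method="bounded",
                          options={"xatol": 1e-12})
    return np.sqrt(res.fun)

# quotient t^{-1} d(xbar + t*u, D): it tends to 0 exactly for tangent directions
for u in (np.array([1.0, 1.0]), np.array([1.0, 2.0])):
    print(u, [dist_to_D(t * u) / t for t in (1e-1, 1e-3, 1e-5)])
# -> tends to 0 for u = (1, 1) (tangent), to 1/sqrt(2) for u = (1, 2) (not tangent)
```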

Now, we briefly collect some basic facts concerning the limiting generalized calculus (see [26]). This calculus relies on the concept of normal cone, and even if this object can be defined on general Banach spaces, its main applicable features hold on Asplund spaces, which represent a special class of Banach spaces: X is Asplund iff every continuous convex function on any open convex set \(U\subset X\) is Fréchet differentiable at the points of a dense \(G_{\delta }-\)subset of U. A very important property of Asplund spaces is that every bounded sequence of the topological dual admits a \(w^{*}-\)convergent subsequence.

Take a nonempty subset \(S\ \)of the Asplund space X and pick \(x\in S.\) Then the Fréchet normal cone to S at x is

$$\begin{aligned} {\widehat{N}}(S,x):=\left\{ x^{*}\in X^{*}\mid \underset{u\overset{S}{\rightarrow }x}{\lim \sup }\frac{x^{*}(u-x)}{\left\| u-x\right\| }\le 0\right\} , \end{aligned}$$

where \(u\overset{S}{\rightarrow }x\) means that \(u\rightarrow x\) and \(u\in S.\)

Let \({\overline{x}}\in S.\) The limiting normal cone to S at \({\overline{x}}\) is

$$\begin{aligned} N(S,{\overline{x}}):=\{x^{*}\in X^{*}\mid \exists x_{n}\overset{S}{\rightarrow }{\overline{x}},x_{n}^{*}\overset{w^{*}}{\rightarrow }x^{*},x_{n}^{*}\in {\widehat{N}}(S,x_{n}),\forall n\in {\mathbb {N}}\}. \end{aligned}$$

If \(S\subset X\) is a convex set, then

$$\begin{aligned} N(S,{\overline{x}})=\{x^{*}\in X^{*}\mid x^{*}(x-{\overline{x}} )\le 0,\forall x\in S\} \end{aligned}$$

and coincides with the negative polar of \(T_{B}(S,{\overline{x}}).\)

Let \(F:X\rightrightarrows Y\) be a set-valued map between the Asplund spaces X and Y,  and \(({\overline{x}},{\overline{y}})\in {\text {Gr}}F:=\left\{ \left( x,y\right) \in X\times Y\mid y\in F\left( x\right) \right\} .\) Then the limiting coderivative of F at \(({\overline{x}},{\overline{y}})\) is the set-valued map \(D^{*}F({\overline{x}},{\overline{y}}):Y^{*}\rightrightarrows X^{*}\) given by

$$\begin{aligned} D^{*}F({\overline{x}},{\overline{y}})(y^{*}):=\{x^{*}\in X^{*} \mid (x^{*},-y^{*})\in N({\text {Gr}}F,({\overline{x}},\overline{y}))\}. \end{aligned}$$

As usual, when \(F:=f\) is a function, since \({\overline{y}}\in F\left( {\overline{x}}\right) \) means \({\overline{y}}=f\left( {\overline{x}}\right) ,\) we write \(D^{*}f\left( {\overline{x}}\right) \) for \(D^{*}f\left( {\overline{x}},{\overline{y}}\right) .\)

Let \(f:X\rightarrow \mathbb {R\cup \{+\infty \}}\) be finite at \({\overline{x}}\in X\ \)and lower semicontinuous around \({\overline{x}};\) the Fréchet subdifferential of f at \({\overline{x}}\) is given by

$$\begin{aligned} {\widehat{\partial }}f({\overline{x}}):=\{x^{*}\in X^{*}\mid (x^{*} ,-1)\in {\widehat{N}}({\text {epi}}f,({\overline{x}},f({\overline{x}})))\}, \end{aligned}$$

where \({\text {epi}}f\) denotes the epigraph of f; similarly, the limiting subdifferential of f at \({\overline{x}}\) is given by

$$\begin{aligned} \partial f({\overline{x}}):=\{x^{*}\in X^{*}\mid (x^{*},-1)\in N({\text {epi}}f,({\overline{x}},f({\overline{x}})))\}. \end{aligned}$$

One always has \({\widehat{\partial }}f({\overline{x}})\subset \partial f({\overline{x}}).\) Note that a generalized Fermat rule holds: if \({\overline{x}}\) is a local minimum point for f then \(0\in {\widehat{\partial }}f({\overline{x}}).\)

It is well known that if A is a closed set and \({\overline{x}}\in A\), then

$$\begin{aligned} \partial d_{A}\left( {\overline{x}}\right) \subset N\left( A,\overline{x}\right) \cap D\left( 0,1\right) . \end{aligned}$$

If f is a convex function, then both \({\widehat{\partial }}f({\overline{x}})\) and \(\partial f({\overline{x}})\) coincide with the Fenchel subdifferential.

Next, we recall the standard definitions of metric regularity and metric subregularity. Given a set-valued mapping \(F:X\rightrightarrows Y\) between normed vector spaces, one says that F is metrically regular around \(\left( {\overline{x}},{\overline{y}}\right) \in {\text {Gr}}F\) if there exist \(r,\alpha >0\) such that for all \(\left( x,y\right) \in B\left( \left( {\overline{x}},{\overline{y}}\right) ,r\right) ,\)

$$\begin{aligned} d\left( x,F^{-1}\left( y\right) \right) \le \alpha d\left( y,F\left( x\right) \right) . \end{aligned}$$

One says that F is metrically subregular at \(\left( {\overline{x}} ,{\overline{y}}\right) \in {\text {Gr}}F\) if there exist \(r,\alpha >0\) such that for all \(x\in B\left( {\overline{x}},r\right) ,\)

$$\begin{aligned} d\left( x,F^{-1}\left( {\overline{y}}\right) \right) \le \alpha d\left( {\overline{y}},F\left( x\right) \right) . \end{aligned}$$

It is well known that metric regularity of F around \(\left( {\overline{x}},{\overline{y}}\right) \) is equivalent to the Aubin property of \(F^{-1}\) around \(\left( {\overline{y}},{\overline{x}}\right) \), and that metric subregularity of F at \(\left( {\overline{x}},{\overline{y}}\right) \) is equivalent to the calmness of \(F^{-1}\) at \(\left( {\overline{y}},{\overline{x}}\right) \) (see [9]).
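For orientation, a toy check in \({\mathbb {R}}\) (our illustration): \(F(x)=\{2x\}\) is metrically regular around \((0,0)\) with \(\alpha =1/2\), whereas \(F(x)=\{x^{2}\}\) is not even metrically subregular at \((0,0)\), since \(d(x,F^{-1}(0))=|x|\) cannot be dominated by \(\alpha x^{2}\) near 0.

```python
import numpy as np

xs = np.linspace(-0.5, 0.5, 1001)
xs = xs[xs != 0.0]
# F(x) = {2x}: ratio d(x, F^{-1}(0)) / d(0, F(x)) = |x| / |2x| = 1/2, bounded
print(np.max(np.abs(xs) / np.abs(2 * xs)))   # 0.5 -> subregular (alpha = 1/2 works)
# F(x) = {x^2}: ratio |x| / x^2 = 1/|x| is unbounded near 0
print(np.max(np.abs(xs) / xs**2))            # grows as the grid refines: not subregular
```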

Finally, we recall the Gerstewitz (Tammer) scalarization functional (see, for instance, [17, Sect. 2.3]). Let \(K\subset Y\) be a closed convex cone with nonempty interior. Then for every \(e\in {\text {int}}K\), the functional \(s_{e}:Y\rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} s_{e}(y):=\inf \{t\in {\mathbb {R}}\mid te\in y+K\} \end{aligned}$$

is continuous and sublinear. Moreover, for every \(t\in {\mathbb {R}}\),

$$\begin{aligned} \{y\in Y\mid s_{e}(y)<t\}=te-{\text {int}}K. \end{aligned}$$
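As a concrete instance (our illustration): for \(Y={\mathbb {R}}^{2}\), \(K={\mathbb {R}}_{+}^{2}\) and \(e=(1,1)\), the condition \(te\in y+K\) reads \(t\ge y_{1}\) and \(t\ge y_{2}\), so \(s_{e}(y)=\max \{y_{1},y_{2}\}\). The brute-force sketch below, with a grid standing in for the infimum, checks this closed form.

```python
import numpy as np

# Gerstewitz functional for K = R^2_+ and e = (1, 1): s_e(y) = inf{t : t*e in y + K}
def s_e(y, ts=np.linspace(-10.0, 10.0, 2_000_001)):
    feasible = ts[(ts >= y[0]) & (ts >= y[1])]   # t*e - y in K, componentwise
    return feasible[0]                           # smallest feasible t on the grid

for y in ([0.3, -1.2], [-2.0, -0.5], [4.0, 4.0]):
    print(y, s_e(y), max(y))                     # grid value vs closed form max(y1, y2)
```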

2 Adding a New Constraint

We start by illustrating on a smooth optimization problem the main question we deal with in this paper. Let \(f,g:X\rightarrow {\mathbb {R}}\) be continuously differentiable functions. Consider the basic optimization problem

$$\begin{aligned} \min f(x),\text { subject to }g(x)\le 0, \end{aligned}$$

and let \({\overline{x}}\in X\) be an optimal solution of this problem. The first-order necessary optimality condition is

$$\begin{aligned} \nabla f({\overline{x}})(u)\ge 0,\text { }\forall u\in T_{B}(M_{g},{\overline{x}}), \end{aligned}$$
(2.1)

where

$$\begin{aligned} M_{g}:=\{x\in X\mid g(x)\le 0\} \end{aligned}$$

is the set of feasible points. We see that one important issue is to describe the cone \(T_{B}(M_{g},{\overline{x}})\). Clearly, if the constraint is not active at the feasible point \({\overline{x}}\) (that is, \(g({\overline{x}})<0\)), then \(T_{B}(M_{g},{\overline{x}})=X\) and (2.1) becomes \(\nabla f({\overline{x}})=0\) (Fermat’s Theorem).

Otherwise, if the constraint is active at \({\overline{x}}\), i.e., \(g(\overline{x})=0,\) we have to suppose that \(\nabla g({\overline{x}})\ne 0\) in order to obtain that

$$\begin{aligned} T_{B}(M_{g},{\overline{x}})=T_{U}(M_{g},{\overline{x}})={\text {cl}} T_{DM}(M_{g},{\overline{x}})=\left\{ u\in X\mid \nabla g({\overline{x}} )(u)\le 0\right\} . \end{aligned}$$
(2.2)

In order to show this, observe first that

$$\begin{aligned} {\text {cl}}T_{DM}(M_{g},{\overline{x}})\subset T_{U}(M_{g},\overline{x})\subset T_{B}(M_{g},{\overline{x}}). \end{aligned}$$

Let now \(u\in T_{B}(M_{g},{\overline{x}}),\) meaning that there exist \((t_{n})\downarrow 0,(u_{n})\rightarrow u,n_{0}\in {\mathbb {N}}\), such that for all \(n\ge n_{0},\)

$$\begin{aligned} g\left( {\overline{x}}+t_{n}u_{n}\right) \le 0. \end{aligned}$$

Since g is differentiable, there exists \((v_{n})\rightarrow 0\) such that for all \(n\ge n_{0}\)

$$\begin{aligned} g\left( {\overline{x}}+t_{n}u_{n}\right) =g({\overline{x}})+t_{n}\nabla g({\overline{x}})(u_{n})+t_{n}v_{n}, \end{aligned}$$

i.e.,

$$\begin{aligned} t_{n}\left( \nabla g({\overline{x}})(u_{n})+v_{n}\right) \le 0. \end{aligned}$$

Hence, dividing by \(t_{n}>0\) and then passing to the limit in the relation \(\nabla g({\overline{x}} )(u_{n})+v_{n}\le 0,\) one gets that \(\nabla g({\overline{x}})(u)\le 0.\)

Take now \(u\in X\) such that \(\nabla g({\overline{x}})(u)<0.\) Notice that such an element exists since \(\nabla g({\overline{x}})\ne 0.\) Take \((t_{n})\downarrow 0\) and \((u_{n})\rightarrow u.\) Again, the differentiability property of g means

$$\begin{aligned} g\left( {\overline{x}}+t_{n}u_{n}\right)&=g({\overline{x}})+t_{n}\nabla g({\overline{x}})(u_{n})+t_{n}v_{n}\\&=t_{n}\left( \nabla g({\overline{x}})(u_{n})+v_{n}\right) , \end{aligned}$$

with \(\left( v_{n}\right) \rightarrow 0.\) Since \(\nabla g({\overline{x}})(u)<0\) and \((u_{n})\rightarrow u,\) for all n large enough, \(\nabla g(\overline{x})(u_{n})+v_{n}<0\), whence \(g\left( {\overline{x}}+t_{n}u_{n}\right) <0.\) This means that \({\overline{x}}+t_{n}u_{n}\in M_{g}\) and we get \(u\in T_{DM}(M_{g},{\overline{x}}).\)

Let now \(v\in X\) be such that \(\nabla g({\overline{x}})(v)\le 0\) and take \(\lambda \in (0,1).\) Clearly

$$\begin{aligned} \nabla g({\overline{x}})(\lambda u+(1-\lambda )v)<0, \end{aligned}$$

whence, from the previous step, \(\lambda u+(1-\lambda )v\in T_{DM} (M_{g},{\overline{x}}).\) Passing to the limit with \(\lambda \rightarrow 0,\) we get \(v\in {\text {cl}}T_{DM}(M_{g},{\overline{x}}),\) and all the inclusions are proved.

We want to underline here that the essential assumption for getting (2.2) is \(\nabla g({\overline{x}})\ne 0.\)

Now condition (2.1) becomes

$$\begin{aligned} \nabla f({\overline{x}})(u)\ge 0\text {, subject to }\nabla g(\overline{x})(u)\le 0, \end{aligned}$$

which can be seen as the fact that \(0\in X\) is an optimal solution to the linear problem

$$\begin{aligned} \min \nabla f({\overline{x}})(u),\text { subject to }\nabla g({\overline{x}} )(u)\le 0. \end{aligned}$$

Then, since for linear problems there is no need for supplementary qualification conditions when applying the Karush–Kuhn–Tucker Theorem, we get \(\lambda \ge 0\) such that

$$\begin{aligned} \nabla f({\overline{x}})+\lambda \nabla g({\overline{x}})=0. \end{aligned}$$

For the scalar function g,  the condition \(\nabla g({\overline{x}})\ne 0\) is equivalent to the well-known Mangasarian–Fromovitz constraint qualification condition which we recall in greater generality below.
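Before passing to the general case, a small numerical sketch (a hypothetical instance of our choosing, not taken from the paper) may help fix ideas: for \(f(x,y)=(x-1)^{2}+(y-1)^{2}\) and \(g(x,y)=x+y\), the minimizer of f over \(M_{g}\) is \({\overline{x}}=(0,0)\), the constraint is active there, \(\nabla g({\overline{x}})=(1,1)\ne 0\), and \(\lambda =2\) closes the stationarity equation.

```python
import numpy as np
from scipy.optimize import minimize

# min (x-1)^2 + (y-1)^2  subject to  x + y <= 0; scipy's "ineq" means fun >= 0
f = lambda z: (z[0] - 1)**2 + (z[1] - 1)**2
res = minimize(f, x0=np.array([-1.0, -1.0]),
               constraints=[{"type": "ineq", "fun": lambda z: -(z[0] + z[1])}])

xbar   = res.x                    # ~ (0, 0); the constraint is active here
grad_f = 2 * (xbar - 1)           # ~ (-2, -2)
grad_g = np.array([1.0, 1.0])     # nonzero, so the qualification condition holds
lam = -(grad_f @ grad_g) / (grad_g @ grad_g)   # multiplier from stationarity, ~ 2
print(xbar, lam, grad_f + lam * grad_g)        # stationarity residual ~ (0, 0)
```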

Let us consider a general constraint system \(h(x)\le 0,\) where \(h:X\rightarrow {\mathbb {R}}^{n}\) is a \(C^{1}\) function and \(h(x)\le 0\) means that \(h_{i}(x)\le 0\) for all \(i\in \overline{1,n}.\) The Mangasarian–Fromovitz constraint qualification condition at \({\overline{x}}\) when h is active (that is, all scalar coordinate functions are active) is

$$\begin{aligned} {\exists u\in X:\nabla h({\overline{x}})(u)<0.} \end{aligned}$$
(MFCQ)

It is also well known (see [6]) that (MFCQ) is equivalent to the metric regularity around \(\left( {\overline{x}},0\right) \) of the set-valued map \({\tilde{h}}:X\rightrightarrows {\mathbb {R}}^{n}\) given by

$$\begin{aligned} {\tilde{h}}\left( x\right) :=h\left( x\right) +{\mathbb {R}}_{+}^{n}. \end{aligned}$$

We denote this metric regularity condition by (MRCQ). The equivalence between (MFCQ) and (MRCQ) was first made explicit by Robinson in [34].

There is a substantial amount of literature underlining the idea that metric subregularity still ensures the validity of many facts in optimization and, in particular, can replace metric regularity in qualification conditions (see [20, 37] and the references therein). For instance, in the context we discuss here, the metric subregularity qualification condition (MSCQ) introduced in [16] and extensively used in many other works (see, e.g., [15] and the references therein) amounts to saying that the above set-valued map is metrically subregular at \(\left( {\overline{x}},0\right) .\)

We continue our presentation by adding a new scalar inequality constraint. Therefore, suppose that the constraint is expressed in the same way, but with a function \(g=(g_{1},g_{2}):X\rightarrow {\mathbb {R}}^{2}.\) Let \({\overline{x}}\) be a feasible point. In the case of active constraints, that is, \(g(\overline{x})=0\in {\mathbb {R}}^{2},\) the Mangasarian–Fromovitz condition is: There exists \(u\in X\) such that \(\nabla g_{1}({\overline{x}})(u)<0\) and \(\nabla g_{2}({\overline{x}})(u)<0\). Along the same lines as before, this condition ensures that

$$\begin{aligned} T_{B}(M_{g},{\overline{x}})&=T_{U}(M_{g},{\overline{x}})={\text {cl}} T_{DM}(M_{g},{\overline{x}})\\&=\left\{ u\in X\mid \nabla g_{1}({\overline{x}})(u)\le 0,\nabla g_{2}({\overline{x}})(u)\le 0\right\} . \end{aligned}$$

In particular, this means that

$$\begin{aligned} T_{B}(M_{g},{\overline{x}})=T_{B}(M_{g_{1}},{\overline{x}})\cap T_{B}(M_{g_{2} },{\overline{x}}), \end{aligned}$$

and in fact, this is the essential relationship needed to get, by the same argument as in the case of a single scalar-valued constraint, that there exist \(\lambda _{1},\lambda _{2}\ge 0\) such that

$$\begin{aligned} \nabla f({\overline{x}})+\lambda _{1}\nabla g_{1}({\overline{x}})+\lambda _{2}\nabla g_{2}({\overline{x}})=0. \end{aligned}$$

Now if another scalar-valued constraint comes into play, that is, if we have \(g=(g_{1},g_{2},g_{3}):X\rightarrow {\mathbb {R}}^{3},\) then for an optimal solution \({\overline{x}},\) if the new constraint is active as well, the corresponding Mangasarian–Fromovitz condition (there exists \(u\in X\) such that \(\nabla g_{i}({\overline{x}})(u)<0\) for \(i\in \overline{1,3}\)) ensures, similarly,

$$\begin{aligned} T_{B}(M_{g},{\overline{x}})= {\displaystyle \bigcap \limits _{i=1}^{3}} T_{B}(M_{g_{i}},{\overline{x}}), \end{aligned}$$

and then the existence of \(\lambda _{1},\lambda _{2},\lambda _{3}\ge 0\) such that

$$\begin{aligned} \nabla f({\overline{x}})+ {\displaystyle \sum \limits _{i=1}^{3}} \lambda _{i}\nabla g_{i}({\overline{x}})=0. \end{aligned}$$

Therefore, according to the facts above, every time we consider another scalar-valued constraint which is active at the underlying point \(\overline{x}\), one has to impose the Mangasarian–Fromovitz condition for the whole system, and this condition is stronger than the Mangasarian–Fromovitz conditions for each of the components of g (that is, the Mangasarian–Fromovitz condition formally applied to each \(g_{i}\)). Therefore, we aim at presenting a situation when one can replace the general Mangasarian–Fromovitz condition with Mangasarian–Fromovitz conditions for each coordinate function, under a supplementary hypothesis. This idea will be the subject of some generalizations, since it will become clear that the additional assumption we impose can be extended to nondifferentiable settings.

3 A Metric Condition

We start the investigation towards a condition with the above-mentioned features by considering again the situation of \(g=(g_{1} ,g_{2}):X\rightarrow {\mathbb {R}}^{2}\). Suppose that \(g_{1}({\overline{x}})=0,\) \(g_{2}({\overline{x}})=0,~\nabla g_{1}({\overline{x}})\ne 0\) and \(\nabla g_{2}({\overline{x}})\ne 0.\) Firstly, looking again at the above arguments, observe that we have

$$\begin{aligned} T_{B}(M_{g},{\overline{x}})&\subset T_{B}(M_{g_{1}},{\overline{x}})\cap T_{B}(M_{g_{2}},{\overline{x}})=T_{U}(M_{g_{1}},{\overline{x}})\cap T_{B} (M_{g_{2}},{\overline{x}})\\&=T_{B}(M_{g_{1}},{\overline{x}})\cap T_{U}(M_{g_{2}},{\overline{x}} )=T_{U}(M_{g_{1}},{\overline{x}})\cap T_{U}(M_{g_{2}},{\overline{x}}). \end{aligned}$$

In other words, (MFCQ) is used exactly to show that we have the reverse inclusion in the above relation. However, such an inclusion can be obtained as well via some regularity assumptions on the sets. We refer the reader to the paper [12] and the references therein for some steps in this direction of investigation.

In the next result, we employ a natural metric condition which yields the desired inclusions. We will see below that this condition is in fact a subtransversality condition on sets studied in depth by several authors. On the other hand, the property we deduce in Theorem 3.1 is a nonconvex generalized version of the (CHIP) property in the sense given in the papers [2] and [3]. Properties of this type for the normal cones are extended to nonconvex cases in many works; for additional bibliography, see the references after Proposition 3.3.

Theorem 3.1

Let X be a normed vector space and \(M_{1},M_{2}\subset X\) be closed sets. Take \({\overline{x}}\in M_{1}\cap M_{2}.\) Suppose that the following regularity assumption holds: There exist \(s>0,\mu >0\) such that for all \(x\in B({\overline{x}},s)\cap M_{1},\)

$$\begin{aligned} d(x,M_{1}\cap M_{2})\le \mu d(x,M_{2}). \end{aligned}$$
(MC)

Then

$$\begin{aligned} T_{B}(M_{1},{\overline{x}})\cap T_{U}(M_{2},{\overline{x}})&\subset T_{B} (M_{1}\cap M_{2},{\overline{x}})\\ T_{U}(M_{1},{\overline{x}})\cap T_{B}(M_{2},{\overline{x}})&\subset T_{B} (M_{1}\cap M_{2},{\overline{x}})\\ T_{U}(M_{1},{\overline{x}})\cap T_{U}(M_{2},{\overline{x}})&=T_{U}(M_{1}\cap M_{2},{\overline{x}}). \end{aligned}$$
(CHIP)

Proof

Take \(u\in T_{B}(M_{1},{\overline{x}})\cap T_{U} (M_{2},{\overline{x}}),\) i.e., \(u\in T_{B}(M_{1},{\overline{x}})\) and \(u\in T_{U}(M_{2},{\overline{x}}).\) Then there exist \((t_{n})\downarrow 0,(u_{n} )\rightarrow u,\) \(\left( v_{n}\right) \rightarrow u\) with \(\overline{x}+t_{n}u_{n}\in M_{1}\) and \({\overline{x}}+t_{n}v_{n}\in M_{2}\) for all n large enough. Observe that one can apply the regularity assumption since \({\overline{x}}+t_{n}u_{n}\in B({\overline{x}},s)\) for n large enough: there exists \(p_{n}\in M_{1}\cap M_{2}\) with

$$\begin{aligned} \left\| {\overline{x}}+t_{n}u_{n}-p_{n}\right\| <\mu \left\| {\overline{x}}+t_{n}u_{n}-{\overline{x}}-t_{n}v_{n}\right\| +t_{n}^{2}. \end{aligned}$$

Then for every n as above,

$$\begin{aligned} \left\| {\overline{x}}+t_{n}u_{n}-p_{n}\right\| <\mu \cdot t_{n}\left\| u_{n}-v_{n}\right\| +t_{n}^{2}, \end{aligned}$$

whence

$$\begin{aligned} \left\| t_{n}^{-1}(p_{n}-{\overline{x}})-u_{n}\right\| <\mu \left\| u_{n}-v_{n}\right\| +t_{n}. \end{aligned}$$

We infer that \(u_{n}^{\prime }:=t_{n}^{-1}(p_{n}-{\overline{x}})\rightarrow u\) which, by the fact that for all n large enough

$$\begin{aligned} {\overline{x}}+t_{n}u_{n}^{\prime }=p_{n}\in M_{1}\cap M_{2}, \end{aligned}$$

allows us to conclude the proof of the first inclusion of the theorem. Now, the other relations are similar. Notice that for the equality in the third relation one takes into account the simple inclusion

$$\begin{aligned} T_{U}(M_{1}\cap M_{2},{\overline{x}})\subset T_{U}(M_{1},{\overline{x}})\cap T_{U}(M_{2},{\overline{x}}). \end{aligned}$$

The proof is complete.\(\square \)

Remark 3.2

Observe that if X is finite dimensional, under condition (MC) one can prove the following stronger assertion: for all \(u\in T_{B}(M_{1},{\overline{x}})\) and \(v\in T_{U}(M_{2},{\overline{x}}),\) there is \(w\in T_{B}(M_{1}\cap M_{2},{\overline{x}})\) such that

$$\begin{aligned} \left\| w-u\right\| \le \mu \left\| v-u\right\| . \end{aligned}$$

Indeed, one can follow the proof above with obvious modifications and observe that the sequence \(\left( t_{n}^{-1}(p_{n}-{\overline{x}})\right) \) is bounded, whence it has a convergent subsequence whose limit satisfies the requirements for w.

In order to have better insight into condition (MC), we present next some characterizations. Firstly, we recall that a function \(f:X\rightarrow Y\) is metrically subregular at \(({\overline{x}},f({\overline{x}}))\) with respect to \(M\subset X\) when \({\overline{x}}\in M\) and there exist \(s>0,\) \(\mu >0\) such that for every \(u\in B({\overline{x}},s)\cap M\)

$$\begin{aligned} d(u,f^{-1}(f({\overline{x}}))\cap M)\le \mu \left\| f({\overline{x}} )-f(u)\right\| . \end{aligned}$$

Observe that the metric subregularity of f at \(({\overline{x}},f(\overline{x}))\) with respect to \(M\ \)is exactly the metric subregularity of the set-valued map \(f_{M}:X\rightrightarrows Y\ \)given by

$$\begin{aligned} f_{M}\left( x\right) =\left\{ \begin{array} [c]{l} \left\{ f\left( x\right) \right\} ,\text { if }x\in M\\ \emptyset ,\text { otherwise.} \end{array} \right. \end{aligned}$$

Note that in the paper [12] we used some metric subregularity conditions in getting calculus rules for tangent cones.

Now we present the equivalence (up to a change of the involved constants) of several metric conditions. Essentially, all these assertions are known in the literature (see [21] and [19]). We give only some details in order to emphasize the features which will be of interest for us in the sequel. On product spaces, we consider the sum norm.

Proposition 3.3

Take \({\overline{x}}\in M_{1}\cap M_{2}.\) The next assertions are equivalent:

  1. (i)

    there exist \(s,\mu >0\) such that for all \(x\in B({\overline{x}},s)\cap M_{1},\)

    $$\begin{aligned} d(x,M_{1}\cap M_{2})\le \mu d(x,M_{2}). \end{aligned}$$
  2. (ii)

    there exist \(r,t,\mu >0\) such that for all \(x\in B({\overline{x}},r)\cap M_{1},\)

    $$\begin{aligned} d(x,M_{1}\cap M_{2})\le \mu d(x,B({\overline{x}},t)\cap M_{2}). \end{aligned}$$
  3. (iii)

    there exist \(r,\nu >0\) such that for all \(x\in B({\overline{x}},r)\cap M_{2},\)

    $$\begin{aligned} d(x,M_{1}\cap M_{2})\le \nu d(x,M_{1}). \end{aligned}$$
  4. (iv)

    there exist \(r,\nu >0\) such that for all \(x\in B({\overline{x}},r),\)

    $$\begin{aligned} d(x,M_{1}\cap M_{2})\le \nu \left( d(x,M_{1})+d(x,M_{2})\right) . \end{aligned}$$
    (Str)
  5. (v)

    the function \(g:X\times X\rightarrow X\) given by \(g(x,y):=x-y\) is metrically subregular at \(({\overline{x}},{\overline{x}},0)\) with respect to \(M_{1}\times M_{2}\).

  6. (vi)

    the function \(h:X\times X\rightarrow {\mathbb {R}}\) given by \(h(x,y):=d(x,y)\) is metrically subregular at \(({\overline{x}},{\overline{x}},0)\) with respect to \(M_{1}\times M_{2}\).

Proof

\((i)\Rightarrow (ii)\) This is obvious for \(r:=s,\) since \(d(x,M_{2})\le d(x,B({\overline{x}},t)\cap M_{2}).\)

\((ii)\Rightarrow (i)\) Consider \(s:=\min \left\{ r,t\right\} \) and \(x\in B({\overline{x}},3^{-1}s)\cap M_{1}.\) Then for every \(\varepsilon \in (0,3^{-1}s),\) there is \(x_{\varepsilon }\in M_{2}\) such that

$$\begin{aligned} \left\| x-x_{\varepsilon }\right\|<d(x,M_{2})+\varepsilon \le \left\| x-{\overline{x}}\right\| +\varepsilon <2\cdot 3^{-1}s. \end{aligned}$$

Then

$$\begin{aligned} \left\| x_{\varepsilon }-{\overline{x}}\right\| \le \left\| x-x_{\varepsilon }\right\| +\left\| x-{\overline{x}}\right\| <s\le t, \end{aligned}$$

whence \(x_{\varepsilon }\in B({\overline{x}},t)\cap M_{2},\) so

$$\begin{aligned} d(x,M_{1}\cap M_{2})\le \mu d(x,B({\overline{x}},t)\cap M_{2})\le \mu \left\| x-x_{\varepsilon }\right\| <\mu \left( d(x,M_{2})+\varepsilon \right) . \end{aligned}$$

Since this is true for every \(\varepsilon \) small enough, we arrive at the conclusion.

\((i)\Rightarrow (iv)\) Define \(r=2^{-1}s\) and take \(x\in B({\overline{x}},r).\) Then \(d(x,M_{1})\le \left\| x-{\overline{x}}\right\| <2^{-1}s,\) hence there is \(m_{1}\in M_{1}\) such that \(\left\| x-m_{1}\right\| <2^{-1}s.\) Therefore,

$$\begin{aligned} \left\| m_{1}-{\overline{x}}\right\| \le \left\| m_{1}-x\right\| +\left\| x-{\overline{x}}\right\| <s, \end{aligned}$$

\(m_{1}\in M_{1}\cap B({\overline{x}},s),\) and by (i)

$$\begin{aligned} d(m_{1},M_{1}\cap M_{2})\le \mu d(m_{1},M_{2}). \end{aligned}$$

But then

$$\begin{aligned} d(x,M_{1}\cap M_{2})&\le \left\| x-m_{1}\right\| +d(m_{1},M_{1}\cap M_{2})\\&\le \left\| x-m_{1}\right\| +\mu d(m_{1},M_{2})\\&\le \left\| x-m_{1}\right\| +\mu \left( \left\| m_{1}-x\right\| +d(x,M_{2})\right) \\&\le (1+\mu )\left( \left\| x-m_{1}\right\| +d(x,M_{2})\right) . \end{aligned}$$

Since \(m_{1}\) can be chosen such that \(\left\| x-m_{1}\right\| \) is arbitrarily close to \(d(x,M_{1}),\) it follows that (Str) holds for \(\nu :=1+\mu \).

\((iv)\Rightarrow (i)\) This is obvious for \(s:=r\) and \(\mu :=\nu .\)

\((iv)\Leftrightarrow (iii)\) This follows by the symmetry of conditions in the items (i) and (iii) and by \((i)\Leftrightarrow (iv).\)

\((i)\Leftrightarrow (v)\) See [19, Proposition 7.6].

\((v)\Leftrightarrow (vi)\) Observe that \(g^{-1}(0)=h^{-1}(0)\) and that \(d(0,g(x,y))=\left\| x-y\right\| =d(x,y)=d(0,d(x,y))=d(0,h(x,y)).\) \(\square \)

The metric condition (iv) is very well studied in the literature and is known under various names: local regularity, metric regularity, linear coherence, metric inequality, subtransversality (see, e.g., [19, 21,22,23,24,25, 27, 28, 30, 32]). For more historical details and an extensive bibliography, see [21].

Remark 3.4

Notice that (MC) is exactly the fact that \({\overline{x}}\) is a local minimum on \(M_{1}\) of the function

$$\begin{aligned} x\mapsto \mu d_{M_{2}}(x)-d_{M_{1}\cap M_{2}}(x). \end{aligned}$$

Since the latter function is \((1+\mu )-\)Lipschitz, by Clarke’s penalization principle ( [8, Proposition 2.4.3]), \({\overline{x}}\) is a local minimum (without constraints) of

$$\begin{aligned} x\mapsto \mu d_{M_{2}}(x)-d_{M_{1}\cap M_{2}}(x)+\left( 1+\mu \right) d_{M_{1}}(x), \end{aligned}$$

which implies that for x around \({\overline{x}}\)

$$\begin{aligned} d_{M_{1}\cap M_{2}}(x)\le \mu d_{M_{2}}(x)+\left( 1+\mu \right) d_{M_{1} }(x)\le \left( 1+\mu \right) \left( d_{M_{1}}(x)+d_{M_{2}}(x)\right) . \end{aligned}$$

This is another argument for \((i)\Rightarrow (iv)\) of the proposition above.

Remark 3.5

The fact that \({\overline{x}}\) is a local minimum of the function

$$\begin{aligned} x\mapsto \mu d_{M_{2}}(x)+\iota _{M_{1}}(x)-d_{M_{1}\cap M_{2}}(x), \end{aligned}$$
(3.1)

where \(\iota _{M_{1}}\) is the indicator function of \(M_{1}\), means that the condition (MC) can be seen as a minimality condition for a difference of functions. Necessary and sufficient conditions for it can be devised in terms of subdifferentials of the involved functions (hence, in our case, in terms of normal cones to \(M_{1},M_{2},M_{1}\cap M_{2}\)) following the results in [31]. Moreover, in the convex case (that is, both \(M_{1}\) and \(M_{2}\) are convex) the local minimality becomes global minimality, and an equivalent condition in terms of \(\varepsilon -\)subdifferentials (and \(\varepsilon -\)normal cones) can be found in [18].

For instance, on Asplund spaces, some well-known consequences of relation (Str) are formulae for the Fréchet/limiting normal cone to intersection of sets (see, e.g., [19, 32]).

Remark 3.6

Notice that, for the convex case, the necessity of some metric inequalities for ensuring the validity of constraint qualification conditions is discussed in [35].

Combining the discussions above, we present two consequences for the systems with multiple constraints considered in the previous section.

Corollary 3.7

Let \(g=\left( g_{1},g_{2}\right) :X\rightarrow {\mathbb {R}}^{2}\) be differentiable, consider \({\overline{x}}\in X\) such that \(g_{1}(\overline{x})=g_{2}({\overline{x}})=0\) and \(\nabla g_{1}({\overline{x}})\ne 0,\nabla g_{2}({\overline{x}})\ne 0.\) If there exist \(s>0,\mu >0\) such that for all \(x\in B({\overline{x}},s)\cap M_{g_{1}},\)

$$\begin{aligned} d(x,M_{g_{1}}\cap M_{g_{2}})\le \mu d(x,M_{g_{2}}), \end{aligned}$$

then

$$\begin{aligned} T_{B}(M_{g},{\overline{x}})=\left\{ u\in X\mid \nabla g({\overline{x}} )(u)\le 0\right\} . \end{aligned}$$

Corollary 3.8

Let \(g=\left( g_{1},g_{2},g_{3}\right) :X\rightarrow {\mathbb {R}}^{3}\) be differentiable, consider \({\overline{x}}\in X\) such that \(g_{i}({\overline{x}})=0\) and \(\nabla g_{i}({\overline{x}})\ne 0\) for \(i\in \overline{1,3}.\) If there exist \(s,t,\mu ,\gamma >0\) such that for all \(x\in B({\overline{x}},s)\cap M_{g_{1}},\)

$$\begin{aligned} d(x,M_{g_{1}}\cap M_{g_{2}})\le \mu d(x,M_{g_{2}}), \end{aligned}$$

and for all \(x\in B({\overline{x}},t)\cap M_{g_{1}}\cap M_{g_{2}}\)

$$\begin{aligned} d(x,M_{g_{1}}\cap M_{g_{2}}\cap M_{g_{3}})\le \gamma d(x,M_{g_{3}}), \end{aligned}$$

then

$$\begin{aligned} T_{B}(M_{g},{\overline{x}})=\left\{ u\in X\mid \nabla g({\overline{x}} )(u)\le 0\right\} . \end{aligned}$$

Consider a \(C^{1}\) function \(g=(g_{1},g_{2}):X\rightarrow {\mathbb {R}}^{2},\) the associated constraint system of inequalities \(g(x)\le 0,\) and let \({\overline{x}}\) be a feasible point where both scalar constraints are active. Let us discuss the relationship between (MFCQ), (Str) and (MSCQ).

Proposition 3.9

In the above notation, we have the following implications:

  1. (i)

    \((MFCQ )\Leftrightarrow (MRCQ) \Rightarrow (MSCQ) ;\)

  2. (ii)

    \((MSCQ) \Rightarrow (Str) ;\)

  3. (iii)

    \(\left[ (MSCQ) \text { for }g_{1}\text { and }g_{2}\right] +(Str) \Rightarrow (MSCQ) \) for \((g_{1},g_{2}).\)

Proof

(i) All the implications are known (see the above comments).

(ii) Firstly, remark that \(M_{g_{i}}={\tilde{g}}_{i}^{-1}\left( 0\right) ,\) for \(i=1,2.\) Then (MSCQ) means that there are some constants \(r,\alpha >0\) such that for all \(x\in B\left( {\overline{x}},r\right) \)

$$\begin{aligned} d\left( x,{\tilde{g}}^{-1}\left( 0,0\right) \right)= & {} d\left( x,\tilde{g}_{1}^{-1}\left( 0\right) \cap {\tilde{g}}_{2}^{-1}\left( 0\right) \right) \le \alpha d\left( (0,0),{\tilde{g}}(x)\right) \\= & {} \alpha \left( d\left( 0,{\tilde{g}}_{1}\left( x\right) \right) +d\left( 0,{\tilde{g}}_{2}\left( x\right) \right) \right) . \end{aligned}$$

Since \(g_{i}\) are \(C^{1},\) they are locally Lipschitz around \({\overline{x}}.\) Hence \({\tilde{g}}_{i}\) are Lipschitz multifunctions around \(\left( {\overline{x}},0\right) ,\) meaning that \({\tilde{g}}_{i}^{-1}\) are metrically regular around \(\left( 0,{\overline{x}}\right) .\) This obviously implies that there exists \(\beta >0\) such that

$$\begin{aligned} d\left( 0,{\tilde{g}}_{i}\left( x\right) \right) \le \beta d(x,{\tilde{g}} _{i}^{-1}\left( 0\right) ) \end{aligned}$$

for all x sufficiently close to \({\overline{x}}.\) The two relations above imply (Str) condition.

(iii) The assumption made means that there are \(r,\alpha ,\beta >0\) such that for all \(x\in B\left( {\overline{x}},r\right) \)

$$\begin{aligned}&d\left( x,{\tilde{g}}_{i}^{-1}\left( 0\right) \right) \le \alpha d\left( 0,{\tilde{g}}_{i}(x)\right) ,i=1,2\\&d\left( x,{\tilde{g}}_{1}^{-1}\left( 0\right) \cap {\tilde{g}}_{2} ^{-1}\left( 0\right) \right) \le \beta \left( d(x,{\tilde{g}}_{1}^{-1}\left( 0\right) )+d(x,{\tilde{g}}_{2}^{-1}\left( 0\right) )\right) . \end{aligned}$$

Then for all \(x\in B\left( {\overline{x}},r\right) ,\)

$$\begin{aligned} d\left( x,{\tilde{g}}_{1}^{-1}\left( 0\right) \cap {\tilde{g}}_{2}^{-1}\left( 0\right) \right) \le \alpha \beta \left( d\left( 0,{\tilde{g}}_{1}(x)\right) +d\left( 0,{\tilde{g}}_{2}(x)\right) \right) , \end{aligned}$$

i.e., (MSCQ) holds.\(\square \)

We present some illustrative examples where we aim to discuss different situations from the perspective of the properties we study here, that is, (MFCQ), (CHIP), (MSCQ), (Str).

Example 3.10

Let \(g_{1},g_{2}:{\mathbb {R}}^{2}\mathbb {\rightarrow R}\) be given by

$$\begin{aligned} g_{1}(x,y)=x-y,\text { }g_{2}(x,y)=x^{2}-x+y. \end{aligned}$$

Consider the sets

$$\begin{aligned} M_{1}=M_{g_{1}}=\left\{ (x,y)\in {\mathbb {R}}^{2}\mid x\le y\right\} \text { and }M_{2}=M_{g_{2}}=\left\{ (x,y)\in {\mathbb {R}}^{2}\mid y\le x-x^{2} \right\} . \end{aligned}$$

Take \(\left( {\overline{x}},{\overline{y}}\right) =0\in {\mathbb {R}}^{2}\) and observe that \(\nabla g_{1}\left( {\overline{x}},{\overline{y}}\right) =(1,-1)=-\nabla g_{2}\left( {\overline{x}},{\overline{y}}\right) \) and (MFCQ) holds for \(g_{1}\) and \(g_{2}\) at \(\left( {\overline{x}},{\overline{y}}\right) ,\) while it does not for \((g_{1},g_{2}).\) We have

$$\begin{aligned} T_{B}(M_{1},0)=\left\{ (x,y)\in {\mathbb {R}}^{2}\mid x\le y\right\} \text { and }T_{B}(M_{2},0)=\left\{ (x,y)\in {\mathbb {R}}^{2}\mid y\le x\right\} , \end{aligned}$$

whence

$$\begin{aligned} T_{B}(M_{1},0)\cap T_{B}(M_{2},0)=\left\{ (x,y)\in {\mathbb {R}}^{2}\mid x=y\right\} . \end{aligned}$$

On the other hand, \(M_{1}\cap M_{2}=\{0\},\) so \(T_{B}(M_{1}\cap M_{2} ,0)=\{0\},\) whence the equality does not hold. However, the relation (MC) does not hold either. Indeed, the relation (MC) would imply the existence of \(\mu >0\) such that for every small \(x>0,\)

$$\begin{aligned} x<x\sqrt{2}= & {} d(\left( x,x\right) ,M_{1}\cap M_{2})\le \mu d\left( \left( x,x\right) ,M_{2}\right) \\\le & {} \mu \left\| \left( x,x\right) -\left( x,x-x^{2}\right) \right\| =\mu x^{2}, \end{aligned}$$

which is impossible. So, (Str) does not hold.

Moreover, (MSCQ) would mean the existence of \(\alpha >0\) such that

$$\begin{aligned} \left\| \left( x,y\right) \right\| \le \alpha \left[ \max \left\{ x-y,0\right\} +\max \left\{ x^{2}-x+y,0\right\} \right] \end{aligned}$$

for all \(\left( x,y\right) \) close to \(\left( 0,0\right) .\) But, for \(x=y,\) this would mean

$$\begin{aligned} 2\left| x\right| \le \alpha x^{2} \end{aligned}$$

for all small x,  and this is false. Therefore, (MSCQ) does not hold.
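The failure is also plain numerically (our sketch, reproducing the estimate of the example along the diagonal): the ratio that a constant \(\mu \) would have to dominate grows like \(\sqrt{2}/x\).

```python
import numpy as np

# Along (x, x) in M1: d((x, x), M1 ∩ M2) = x*sqrt(2) since M1 ∩ M2 = {(0, 0)},
# while d((x, x), M2) <= x**2 (drop vertically to the point (x, x - x**2) of M2).
for x in (1e-1, 1e-2, 1e-3, 1e-4):
    print(f"x = {x:.0e}:  mu would need to exceed {x * np.sqrt(2) / x**2:.1e}")
```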

Example 3.11

Let \(g_{1},g_{2}:{\mathbb {R}}^{2}\mathbb {\rightarrow R}\) be given by

$$\begin{aligned} g_{1}(x,y)=x-y,\text { }g_{2}(x,y)=-x+y. \end{aligned}$$

Consider the sets

$$\begin{aligned} M_{1}&=M_{g_{1}}=\left\{ (x,y)\in {\mathbb {R}}^{2}\mid x\le y\right\} \\ M_{2}&=M_{g_{2}}=\left\{ (x,y)\in {\mathbb {R}}^{2}\mid y\le x\right\} . \end{aligned}$$

Take \(\left( {\overline{x}},{\overline{y}}\right) =0\in {\mathbb {R}}^{2}\) and observe that \(\nabla g_{1}\left( {\overline{x}},{\overline{y}}\right) =(1,-1)=-\nabla g_{2}\left( {\overline{x}},{\overline{y}}\right) \) and (MFCQ) holds for \(g_{1}\) and \(g_{2}\) at \(\left( {\overline{x}},\overline{y}\right) ,\) while it does not for \((g_{1},g_{2}).\) We have

$$\begin{aligned} T_{B}(M_{1},0)=\left\{ (x,y)\in {\mathbb {R}}^{2}\mid x\le y\right\} \text { and }T_{B}(M_{2},0)=\left\{ (x,y)\in {\mathbb {R}}^{2}\mid y\le x\right\} , \end{aligned}$$

whence

$$\begin{aligned} T_{B}(M_{1},0)\cap T_{B}(M_{2},0)=\left\{ (x,y)\in {\mathbb {R}}^{2}\mid x=y\right\} . \end{aligned}$$

On the other hand,

$$\begin{aligned} M_{1}\cap M_{2}=\left\{ (x,y)\in {\mathbb {R}}^{2}\mid x=y\right\} , \end{aligned}$$

so

$$\begin{aligned} T_{B}(M_{1}\cap M_{2},0)=\left\{ (x,y)\in {\mathbb {R}}^{2}\mid x=y\right\} , \end{aligned}$$

whence the equality does hold. Moreover, it is not difficult to see that the relation (MC) does hold as well for small s and \(\mu =1\). So, (Str) holds. Also an easy computation shows that (MSCQ) holds.
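Here everything can be confirmed by direct computation with closed-form distances (a quick sanity check of ours, using the Euclidean norm of the plane).

```python
import numpy as np

# (Str) for M1 = {x <= y}, M2 = {y <= x}: M1 ∩ M2 is the diagonal {x = y}
d_M1  = lambda p: max(p[0] - p[1], 0.0) / np.sqrt(2)
d_M2  = lambda p: max(p[1] - p[0], 0.0) / np.sqrt(2)
d_cap = lambda p: abs(p[0] - p[1]) / np.sqrt(2)

rng = np.random.default_rng(0)
pts = rng.uniform(-1.0, 1.0, size=(100_000, 2))
ratios = [d_cap(p) / (d_M1(p) + d_M2(p)) for p in pts if d_M1(p) + d_M2(p) > 0]
print(max(ratios))   # equals 1: (Str) holds with nu = 1, matching mu = 1 in (MC)
```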

Now we give examples which show that the converse implications in Proposition 3.9 (i), (ii) do not hold.

Example 3.12

Let \(g_{1},g_{2}:\mathbb {R\rightarrow R}\) be given by

$$\begin{aligned} g_{1}(x)=g_{2}\left( x\right) =-x^{2}. \end{aligned}$$

Take \({\overline{x}}=0\in {\mathbb {R}}\). It is clear that (MFCQ) does not hold for \(g_{1}\) and \(g_{2}\). However, (MSCQ) does hold, since \({\tilde{g}} _{1}^{-1}\left( 0\right) \cap {\tilde{g}}_{2}^{-1}\left( 0\right) ={\mathbb {R}},\) hence \(d\left( x,{\tilde{g}}_{1}^{-1}\left( 0\right) \cap {\tilde{g}}_{2}^{-1}\left( 0\right) \right) =0\) for any x.

Example 3.13

Consider now \(g_{1},g_{2}:\mathbb {R\rightarrow R}\) given by

$$\begin{aligned} g_{1}(x)=x^{2},\text { }g_{2}\left( x\right) =-x^{2}. \end{aligned}$$

Consider \({\overline{x}}=0\in {\mathbb {R}}\). For this system, (MSCQ) does not hold, since otherwise there would exist \(\alpha >0\) such that

$$\begin{aligned} d\left( x,{\tilde{g}}_{1}^{-1}\left( 0\right) \cap {\tilde{g}}_{2}^{-1}\left( 0\right) \right) =\left| x\right| \le \alpha x^{2}, \end{aligned}$$

for all x close to 0,  which is not possible. On the other hand, (Str) trivially holds with \(\mu =1.\)

Remark 3.14

The second item of Proposition 3.9 and the above example show that there are situations where (Str) is strictly weaker than (MSCQ). This fact will be emphasized in the next section. Some interesting equivalences concerning the fulfillment of Karush–Kuhn–Tucker conditions can be found in [1] and [33].

Moreover, it is well known that translating regularity properties from the image or product space into the preimage space requires, in general, certain regularity assumptions on the involved mappings.

Nevertheless, the conditions (MC) and (MFCQ) are only sufficient for having the equality between the tangent cone to the intersection and the intersection of tangent cones, as the next example shows.

Example 3.15

Consider \(g_{1},g_{2}:{\mathbb {R}}^{2}\mathbb {\rightarrow R}\) given by

$$\begin{aligned} g_{1}(x,y)=\left\{ \begin{array} [c]{l} x^{4}\sin ^{2}\frac{1}{x}-y,\text { if }x\ne 0\\ 0,\text { if }x=0, \end{array} \right. \end{aligned}$$

and

$$\begin{aligned} g_{2}(x,y)=\left\{ \begin{array} [c]{l} x^{4}\sin ^{2}\frac{1}{x}+y,\text { if }x\ne 0\\ 0,\text { if }x=0, \end{array} \right. \end{aligned}$$

respectively. We have that \(\nabla g_{1}(0,0)=\left( 0,-1\right) \) and \(\nabla g_{2}(0,0)=\left( 0,1\right) \) whence (MFCQ) holds for \(g_{1}\) and for \(g_{2}\), while it does not hold for \(\left( g_{1},g_{2}\right) \). We have that

$$\begin{aligned} T_{B}(M_{g_{1}},\left( 0,0\right) )\cap T_{B}(M_{g_{2}},\left( 0,0\right) )&=T_{B}(M_{g_{1}}\cap M_{g_{2}},\left( 0,0\right) )\\&=\left\{ (a,0)\mid a\in {\mathbb {R}}\right\} . \end{aligned}$$

Now denote \(f(x)=x^{4}\sin ^{2}\frac{1}{x}\) for any \(x\ne 0\) and consider, for \(n\in {\mathbb {N}}\setminus \left\{ 0\right\} ,\) the number

$$\begin{aligned} {\overline{x}}_{n}=\frac{1}{2\pi }\left( \frac{1}{n}+\frac{1}{n+1}\right) =\frac{2n+1}{2\pi n\left( n+1\right) }, \end{aligned}$$

and the element

$$\begin{aligned} \left( {\overline{x}}_{n},f\left( {\overline{x}}_{n}\right) \right) \in M_{g_{1}}. \end{aligned}$$

Since

$$\begin{aligned} M_{g_{1}}\cap M_{g_{2}}=\left\{ (0,0)\right\} \cup \left\{ \left( \frac{1}{k\pi },0\right) \mid k\in {\mathbb {Z}}\setminus \left\{ 0\right\} \right\} , \end{aligned}$$

one has that

$$\begin{aligned} \left| {\overline{x}}_{n}-\frac{1}{n\pi }\right| \le d(\left( {\overline{x}}_{n},f\left( {\overline{x}}_{n}\right) \right) ,M_{g_{1}}\cap M_{g_{2}}). \end{aligned}$$

On the other hand,

$$\begin{aligned} d(\left( {\overline{x}}_{n},f\left( {\overline{x}}_{n}\right) \right) ,M_{g_{2}})\le 2f\left( {\overline{x}}_{n}\right) , \end{aligned}$$

whence, (MC) would imply the existence of a constant \(\mu >0\) such that for all n large enough,

$$\begin{aligned} \left| {\overline{x}}_{n}-\frac{1}{n\pi }\right| =\frac{1}{2\pi n(n+1)}\le 2\mu \left( \frac{2n+1}{2\pi n\left( n+1\right) }\right) ^{4}\sin ^{2}\left( \frac{2\pi n\left( n+1\right) }{2n+1}\right) , \end{aligned}$$

that is

$$\begin{aligned} 4\pi ^{3}\le \mu \frac{\left( 2n+1\right) ^{4}}{n^{3}\left( n+1\right) ^{3} }\sin ^{2}\left( \frac{2\pi n\left( n+1\right) }{2n+1}\right) . \end{aligned}$$

Of course, this is impossible since the right-hand term converges to 0 as \(n\rightarrow \infty \). This means, obviously, that (Str) does not hold either.
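Numerically (our sketch), the ratio of the two displayed bounds, which \(2\mu \) would have to dominate, grows roughly like \(n^{2}\):

```python
import numpy as np

f = lambda x: x**4 * np.sin(1.0 / x)**2
for n in (10, 100, 1000, 10_000):
    xbar = (2*n + 1) / (2 * np.pi * n * (n + 1))   # midpoint of 1/(n pi) and 1/((n+1) pi)
    lhs  = 1.0 / (2 * np.pi * n * (n + 1))         # lower bound for d(., M_g1 ∩ M_g2)
    rhs  = 2 * f(xbar)                             # upper bound for d(., M_g2)
    print(n, lhs / rhs)                            # the needed mu blows up like n**2
```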

4 Consequences in Optimization

In this section, we present some consequences of the metric condition discussed above, and we give as well some other similar conditions that can be used in various contexts.

Consider a scalar function \(f:X\rightarrow {\mathbb {R}}\) and recall that the Hadamard upper directional derivative of f at \({\overline{x}}\in X\) in direction \(u\in X\) is

$$\begin{aligned} d_{+}f({\overline{x}},u)=\underset{u^{\prime }\rightarrow u,t\downarrow 0}{{\text {limsup}}}t^{-1}(f({\overline{x}}+tu^{\prime })-f({\overline{x}})), \end{aligned}$$

while the Hadamard lower directional derivative of f at \({\overline{x}}\) in direction u is

$$\begin{aligned} d_{-}f({\overline{x}},u)=\underset{u^{\prime }\rightarrow u,t\downarrow 0}{{\text {liminf}}}t^{-1}(f({\overline{x}}+tu^{\prime })-f({\overline{x}})). \end{aligned}$$
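A one-dimensional illustration (ours): for \(f(x)=x\sin (1/x)\), \(f(0)=0\), one has \(d_{+}f(0,1)=1\) and \(d_{-}f(0,1)=-1\), because the difference quotient \(t^{-1}f(t)=\sin (1/t)\) sweeps through \([-1,1]\) as \(t\downarrow 0\); sampling along the fixed direction \(u^{\prime }=u\) already exhibits both extreme cluster values.

```python
import numpy as np

# difference quotients t^{-1}(f(0 + t) - f(0)) = sin(1/t) for f(x) = x*sin(1/x)
ts = np.logspace(-2, -6, 200_000)
quotients = np.sin(1.0 / ts)
print(quotients.max(), quotients.min())   # ~ +1 and -1: d_+f(0,1) = 1, d_-f(0,1) = -1
```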

In [7] a concept of directional minimum was introduced and studied in the vectorial setting (see also [14]). We consider this concept here, but, initially, for the sake of simplicity, we restrict ourselves to the scalar case.

Definition 4.1

Let \(f:X\rightarrow {\mathbb {R}}\) be a function and \(A\subset X\), \(L\subset S(0,1)\) be nonempty closed sets. One says that \({\overline{x}}\in A\) is a local directional minimum point for f on A with respect to (the set of directions) L if there exists a neighborhood U of \({\overline{x}}\) such that for every \(x\in U\cap A\cap \left( {\overline{x}}+{\text {cone}}L\right) ,\) \(f({\overline{x}})\le f(x).\)

Proposition 4.2

Let \(f:X\rightarrow {\mathbb {R}}\) be a function, \(A\subset X\), \(L\subset S(0,1)\) be nonempty closed sets and \({\overline{x}}\in A.\) Suppose that there exist \(s>0,\mu >0\) such that

$$\begin{aligned} \forall x\in B({\overline{x}},s)\cap A:d(x,A\cap \left( {\overline{x}} +{\text {cone}}L\right) )\le \mu d(x,{\overline{x}}+{\text {cone}} L). \end{aligned}$$
(4.1)
  1. (i)

    If \({\overline{x}}\) is a local directional minimum point for f on A with respect to L,  then \(d_{+}f({\overline{x}},u)\ge 0\) for all \(u\in T_{B}(A,{\overline{x}})\cap L.\)

  2. (ii)

    Moreover, if X is finite dimensional and \(d_{-}f({\overline{x}},u)>0\) for all \(u\in T_{B}(A,{\overline{x}})\cap L,\) then there exists \(\alpha >0\) such that \({\overline{x}}\) is a local directional minimum point for \(f\left( \cdot \right) -\alpha \left\| \cdot -{\overline{x}}\right\| \) on A with respect to L.

Proof

Firstly, observe that, according to Theorem 3.1, the metric assumption imposed ensures that

$$\begin{aligned} T_{B}(A,{\overline{x}})\cap T_{U}({\overline{x}}+{\text {cone}} L,{\overline{x}})\subset T_{B}(A\cap \left( {\overline{x}}+{\text {cone}} L\right) ,{\overline{x}}). \end{aligned}$$

Since, obviously,

$$\begin{aligned} T_{B}(A\cap \left( {\overline{x}}+{\text {cone}}L\right) ,\overline{x})\subset T_{B}(A,{\overline{x}})\cap T_{B}({\overline{x}}+{\text {cone}} L,{\overline{x}}) \end{aligned}$$

and

$$\begin{aligned} T_{B}({\overline{x}}+{\text {cone}}L,{\overline{x}})=T_{U}(\overline{x}+{\text {cone}}L,{\overline{x}})=T_{U}({\text {cone}} L,0)={\text {cone}}L, \end{aligned}$$

we actually get

$$\begin{aligned} T_{B}(A\cap \left( {\overline{x}}+{\text {cone}}L\right) ,\overline{x})=T_{B}(A,{\overline{x}})\cap {\text {cone}}L. \end{aligned}$$

Now the proof of both items is quite standard, but we present it here for the sake of clarity.

(i) Remark that the minimality of \({\overline{x}}\) on A with respect to a set of directions L is in fact the minimality of \({\overline{x}}\) on \(A\cap \left( {\overline{x}}+{\text {cone}}L\right) .\) Therefore, for every \(u\in T_{B}(A,{\overline{x}})\cap {\text {cone}}L=T_{B}(A\cap \left( \overline{x}+{\text {cone}}L\right) ,{\overline{x}}),\ \)there are some sequences \((t_{n})\downarrow 0,(u_{n})\rightarrow u\) such that for all \(n\in {\mathbb {N}},{\overline{x}}+t_{n}u_{n}\in U\cap A\cap \left( \overline{x}+{\text {cone}}L\right) ,\) whence, for n large enough \(f\left( {\overline{x}}+t_{n}u_{n}\right) \ge f({\overline{x}}).\) Consequently,

$$\begin{aligned} d_{+}f({\overline{x}},u)\ge \underset{n\rightarrow \infty }{{\text {limsup}} ~}t_{n}^{-1}\left( f({\overline{x}}+t_{n}u_{n})-f({\overline{x}})\right) \ge 0. \end{aligned}$$

(ii) Suppose, by way of contradiction, that for every positive integer n there exists \(x_{n}\in A\cap \left( {\overline{x}}+{\text {cone}}L\right) \cap B({\overline{x}},n^{-1})\) such that

$$\begin{aligned} f(x_{n})<f({\overline{x}})+n^{-1}\left\| x_{n}-{\overline{x}}\right\| . \end{aligned}$$

Then for all \(n\ge 1,\) \(x_{n}\ne {\overline{x}}\) and

$$\begin{aligned} \left\| x_{n}-{\overline{x}}\right\| ^{-1}\left( f(x_{n})-f(\overline{x})\right)= & {} \left\| x_{n}-{\overline{x}}\right\| ^{-1}\left( f\left( {\overline{x}}+\left\| x_{n}-{\overline{x}}\right\| \frac{x_{n}-\overline{x}}{\left\| x_{n}-{\overline{x}}\right\| }\right) -f(\overline{x})\right) \\< & {} \frac{1}{n}. \end{aligned}$$

Since X is finite dimensional, we can suppose, without loss of generality, that \(\left( \frac{x_{n}-{\overline{x}}}{\left\| x_{n}-\overline{x}\right\| }\right) \) converges to an element u which clearly is in \(T_{B}(A,{\overline{x}})\cap L.\) Therefore

$$\begin{aligned} d_{-}f({\overline{x}},u)\le \underset{n\rightarrow \infty }{{\text {liminf}} }\left\| x_{n}-{\overline{x}}\right\| ^{-1}\left( f\left( \overline{x}+\left\| x_{n}-{\overline{x}}\right\| \frac{x_{n}-{\overline{x}} }{\left\| x_{n}-{\overline{x}}\right\| }\right) -f({\overline{x}})\right) \le 0, \end{aligned}$$

and this is a contradiction. Therefore, the conclusion holds.\(\square \)

Remark 4.3

According to Proposition 3.3, one can replace (4.1) by any of the equivalent corresponding conditions.

From our point of view, an interesting fact is that metric conditions of the type (MC) come into play as weak assumptions to ensure the validity of some penalization principles. Even if all the conditions in Proposition 3.3 are equivalent, the change of constants is important in the construction of the penalty function. More details will be given after the next results.

Theorem 4.4

(penalty of an intersection of sets) Let \(f:X\rightarrow {\mathbb {R}}\) be a function and \(A,B\subset X\) be nonempty, closed sets. Let \({\overline{x}}\in A\cap B\) be a local minimum point for f on \(A\cap B\). Suppose that

  1. (i)

    there exist \(\varepsilon >0\) and \(\ell >0\) such that f is \(\ell -\)Lipschitz on \(B({\overline{x}},\varepsilon );\)

  2. (ii)

    there exist \(s>0,\mu >0\) such that for all \(x\in B({\overline{x}},s)\cap A,\)

    $$\begin{aligned} d(x,A\cap B)\le \mu d(x,B). \end{aligned}$$

    Then \({\overline{x}}\) is a local minimum point for \(f+\ell \mu d(\cdot ,B)\) on A and a local minimum point (without constraints) for

    $$\begin{aligned} f+\ell \left( 1+\mu \right) d(\cdot ,A)+\ell \mu d(\cdot ,B). \end{aligned}$$
    (4.2)

Proof

We prove first that \({\overline{x}}\) is a local minimum point on A for

$$\begin{aligned} f+\ell \mu d(\cdot ,B). \end{aligned}$$

Let \(\alpha >0\) be the radius of the ball given by the local minimality of \({\overline{x}}.\) We choose \(\delta >0\) such that \(\delta <\min \left\{ \left( 1+\mu \right) ^{-1}\alpha ,s,\left( 1+\mu \right) ^{-1}\varepsilon \right\} \) and \(x\in A\cap B({\overline{x}},\delta ).\) Since \(f({\overline{x}})+\ell \mu d({\overline{x}},B)=f({\overline{x}}),\) we have to show that

$$\begin{aligned} f({\overline{x}})\le f(x)+\ell \mu d(x,B). \end{aligned}$$

If \(x\in A\cap B({\overline{x}},\delta )\cap B,\) this is obvious, by the property of \({\overline{x}}.\) Take \(x\in A\cap B({\overline{x}},\delta )\setminus B.\) By (ii), 

$$\begin{aligned} d(x,A\cap B)\le \mu d(x,B), \end{aligned}$$

whence, for every \(\rho \in (0,\min \left\{ \alpha ,\varepsilon \right\} -\delta \left( 1+\mu \right) ),\)

$$\begin{aligned} d(x,A\cap B)<\mu d(x,B)+\rho . \end{aligned}$$

This means that there is \(x_{\rho }\in A\cap B\) such that

$$\begin{aligned} \left\| x-x_{\rho }\right\| <\mu d(x,B)+\rho \le \mu \left\| x-{\overline{x}}\right\| +\rho . \end{aligned}$$

But,

$$\begin{aligned} \left\| x_{\rho }-{\overline{x}}\right\| \le \left\| x_{\rho }-x\right\| +\left\| x-{\overline{x}}\right\| \le \mu \delta +\rho +\delta =\delta \left( 1+\mu \right) +\rho <\min \left\{ \alpha ,\varepsilon \right\} . \end{aligned}$$

Consequently, \(x_{\rho }\in A\cap B\cap B({\overline{x}},\alpha ),\) whence \(f({\overline{x}})\le f(x_{\rho }).\) Then, by (i), 

$$\begin{aligned} f({\overline{x}})\le f(x_{\rho })\le f(x)+\ell \left\| x-x_{\rho }\right\| \le f(x)+\ell \mu d(x,B)+\ell \rho . \end{aligned}$$

Letting \(\rho \rightarrow 0\) we get the desired inequality, whence the claim is proved.

Now, observe that \(f+\ell \mu d(\cdot ,B)\) is locally Lipschitz around \({\overline{x}}\) with the Lipschitz constant \(\ell (1+\mu ).\) We apply the standard Clarke penalization principle and we get the conclusion.\(\square \)

Remark 4.5

Observe that the usual Clarke penalization principle corresponds to the case \(A=B\) in Theorem 4.4.
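A numerical sanity check of Theorem 4.4 (our construction): take \(A=\{x\le y\}\) and \(B=\{y\le x\}\) from Example 3.11, so \(\mu =1\), let \(f(x,y)=(x-1)^{2}+(y+1)^{2}\), which has \((0,0)\) as a local minimum on \(A\cap B\), and take \(\ell =4\), a valid Lipschitz constant of f on \(B(0,1/2)\). The grid search below confirms that \((0,0)\) also minimizes the penalized function \(f+\ell \mu d(\cdot ,B)\) on A.

```python
import numpy as np

f   = lambda z: (z[0] - 1)**2 + (z[1] + 1)**2
d_B = lambda z: max(z[1] - z[0], 0.0) / np.sqrt(2)     # distance to B = {y <= x}
F   = lambda z: f(z) + 4 * 1 * d_B(z)                  # penalized function, ell*mu = 4

grid = np.linspace(-0.4, 0.4, 401)
vals = [F((x, y)) for x in grid for y in grid if x <= y]   # grid points of A near (0, 0)
print(min(vals) >= F((0.0, 0.0)) - 1e-12)                  # True: (0, 0) minimizes F on A
```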

Remark 4.6

One can employ as well the following argument: if (ii) holds, then, according to Proposition 3.3 (i)\(\Leftrightarrow \)(iv), for all x close to \({\overline{x}}\)

$$\begin{aligned} d(x,A\cap B)\le \left( 1+\mu \right) \left( d(x,A)+d(x,B)\right) \end{aligned}$$

and the Clarke penalization principle gives \(\ell \left( 1+\mu \right) d(\cdot ,A)+\ell \left( 1+\mu \right) d(\cdot ,B)\) as penalization term.

Remark 4.7

Similarly to the situation described in the above remark, the use of condition (MC) instead of (Str) could be of importance in the study of the rate of convergence of some algorithms related to the method of alternating projections or, more generally, set intersection problems (see [4, 5, 10, 29]).

We apply this generalized penalty result to obtain necessary optimality conditions in the dual space for directional minima.

Corollary 4.8

Let X be an Asplund space, \(f:X\rightarrow {\mathbb {R}}\) be a function, \(A\subset X\), \(L\subset S\left( 0,1\right) \) be nonempty closed sets and \({\overline{x}}\in A.\) Suppose that:

  1. (i)

    \({\overline{x}}\) is a local directional minimum point for f on A with respect to L

  2. (ii)

    there exist \(\varepsilon >0\) and \(\ell >0\) such that f is \(\ell -\)Lipschitz on \(B({\overline{x}},\varepsilon );\)

  3. (iii)

    there exist \(s>0,\mu >0\) such that for all \(x\in B({\overline{x}},s)\cap A,\)

    $$\begin{aligned} d(x,A\cap \left( {\overline{x}}+{\text {cone}}L\right) )\le \mu d(x,{\overline{x}}+{\text {cone}}L). \end{aligned}$$

    Then there are \(u^{*}\in N(A,{\overline{x}}),v^{*}\in N\left( {\text {cone}}L,0\right) \) with \(\left\| u^{*}\right\| \le \ell \left( 1+\mu \right) \ \)and \(\left\| v^{*}\right\| \le \ell \mu ,\) such that

    $$\begin{aligned} -u^{*}-v^{*}\in \partial f({\overline{x}}). \end{aligned}$$

Proof

Since \({\overline{x}}\) is a local directional minimum point for f on A with respect to L,  it is a local minimum point for f on \(A\cap \left( {\overline{x}}+{\text {cone}}L\right) \). Then, on the basis of (ii) and (iii) we can apply Theorem 4.4 to get that \({\overline{x}}\) is a local minimum point for \(f+\ell \left( 1+\mu \right) d(\cdot ,A)+\ell \mu d(\cdot ,{\overline{x}}+{\text {cone}}L).\) Therefore, since all the functions are locally Lipschitz around \({\overline{x}},\) one can apply the exact subdifferential calculus rules (see [26, Theorem 3.36]) to get

$$\begin{aligned} 0&\in \partial \left( f+\ell \left( 1+\mu \right) d(\cdot ,A)+\ell \mu d(\cdot ,{\overline{x}}+{\text {cone}}L)\right) ({\overline{x}})\\&\subset \partial f({\overline{x}})+\ell \left( 1+\mu \right) \partial d(\cdot ,A)\left( {\overline{x}}\right) +\ell \mu \partial d(\cdot ,\overline{x}+{\text {cone}}L)\left( {\overline{x}}\right) . \end{aligned}$$

The conclusion follows.\(\square \)

Remark 4.9

The importance of having constants as small as possible is apparent in the conclusion of Corollary 4.8, where they bound the norms of the generalized Lagrange multipliers \(u^{*}\) and \(v^{*}.\) Observe that if \({\text {cone}}L\) is convex, then \(N\left( {\text {cone}} L,0\right) =L^{-}.\)

Remark 4.10

In view of the discussion in the preceding section, the condition displayed in (iii) is, in general, weaker than (MSCQ)-type conditions.

We illustrate the above results by the following example.

Example 4.11

Let \(X:={\mathbb {R}}^{2},\) \(A:=\left[ 0,1\right] ^{2},\) \(L:=S\left( \left( 0,0\right) ,1\right) \cap \left\{ \left( x,y\right) \in {\mathbb {R}}^{2}\mid y\ge x\ge 0\right\} .\) Consider the function \(f:{\mathbb {R}}^{2} \rightarrow {\mathbb {R}},\) \(f(x,y)=2y-x.\) Clearly, \(\left( 0,0\right) \) is a directional minimum point for f on A with respect to L. Moreover, the assumptions of Corollary 4.8 are satisfied with \(\ell =\sqrt{5}\) and \(\mu =1.\) Since \(\partial f(0,0)=\left\{ \left( -1,2\right) \right\} ,\) one can take \(u^{*}=\left( 0,-1\right) \in N(A,\left( 0,0\right) )\) and \(v^{*}=\left( 1,-1\right) \in L^{-}\) to satisfy the conclusion.

On the other hand, if we change only the set of directions to

$$\begin{aligned} L:=S\left( \left( 0,0\right) ,1\right) \cap \left\{ \left( x,y\right) \in {\mathbb {R}}^{2}\mid y\ge 3^{-1}\cdot x\ge 0\right\} , \end{aligned}$$

then one cannot find \(u^{*},\) \(v^{*}\) to satisfy the conclusion, confirming that this time \(\left( 0,0\right) \) is not a directional minimum point for f on A with respect to L.
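The computations in this example can be confirmed mechanically. Below is a short verification script (our own illustration, not part of the original argument): the first part checks the multiplier identity and the norm bounds for the initial choice of L, using the generators \((0,1)\) and \((1,1)\) of \({\text {cone}}L;\) the second part scans a grid to confirm that, for the modified cone generated by \((0,1)\) and \((3,1),\) the system \(u^{*}+v^{*}=(1,-2),\) \(u^{*}\in N(A,(0,0))=(-\infty ,0]^{2},\) \(v^{*}\in L^{-}\) has no solution.

```python
# Verification script for Example 4.11 (our own illustration).
import numpy as np

grad_f = np.array([-1.0, 2.0])                 # \partial f(0,0) = {(-1, 2)}
ell, mu = np.sqrt(5.0), 1.0

# First case: cone L = {(x, y) : y >= x >= 0}, generated by (0,1) and (1,1).
u_star = np.array([0.0, -1.0])                 # in N(A, (0,0)) = (-inf, 0]^2
v_star = np.array([1.0, -1.0])
for d in (np.array([0.0, 1.0]), np.array([1.0, 1.0])):
    assert v_star @ d <= 0.0                   # v* lies in the polar cone L^-
assert np.allclose(-u_star - v_star, grad_f)   # -u* - v* in \partial f(0,0)
assert np.linalg.norm(u_star) <= ell * (1 + mu) and np.linalg.norm(v_star) <= ell * mu

# Second case: cone L = {(x, y) : y >= x/3 >= 0}, generated by (0,1) and (3,1),
# so L^- = {(a, b) : b <= 0, 3a + b <= 0}.  We need u* + v* = (1, -2) with
# u* <= 0 componentwise; a grid scan confirms the system has no solution.
solvable = any(
    1 - a <= 0 and -2 - b <= 0
    for a in np.linspace(-5, 5, 201)
    for b in np.linspace(-5, 5, 201)
    if b <= 0 and 3 * a + b <= 0
)
print("second case solvable:", solvable)       # prints: second case solvable: False
```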

In the next result we point out that a metric condition can also be imposed on a functional constraint in order to obtain a penalty result. Let Z be a normed vector space, \(g:X\rightarrow Z\) be a function and \(Q\subset Z\) be a pointed closed convex cone. As above, one defines the set-valued map \({\tilde{g}}:X\rightrightarrows Z\) given by \({\tilde{g}}(x):=g(x)+Q\) and one considers \(g^{-1}\left( -Q\right) ={\tilde{g}}^{-1}\left( 0\right) \) as the feasible set.

Theorem 4.12

(penalty for a functional constraint) Let \({\overline{x}}\in g^{-1}(-Q)\) be a local minimum of f on \(g^{-1}(-Q)\). Suppose that

  1. (i)

    there exist \(\varepsilon >0\) and \(\ell >0\) such that f is \(\ell -\)Lipschitz on \(B({\overline{x}},\varepsilon );\)

  2. (ii)

    there exist \(s,\mu >0\) such that for all \(x\in g^{-1}\left( -Q+B(0,s)\right) \cap B\left( {\overline{x}},s\right) \)

    $$\begin{aligned} d(x,g^{-1}(-Q))\le \mu d(0,{\tilde{g}}(x)\cap B(0,s)). \end{aligned}$$

    Then \(({\overline{x}},0)\) is a local minimum for the function \((x,z)\mapsto f(x)+\ell \mu \left\| z\right\| +\ell \left( 1+\mu \right) d(\left( x,z\right) ,{\text {Gr}}{\tilde{g}}).\) If, moreover, X, Z are Asplund spaces, then

    $$\begin{aligned} -\partial f({\overline{x}})\cap D(0,\ell \left( 1+\mu \right) )\cap D^{*}{\tilde{g}}({\overline{x}},0)(Q^{+}\cap D(0,\ell \mu ))\ne \emptyset . \end{aligned}$$

Proof

Denote by \(h:X\times Z\rightarrow {\mathbb {R}}\) the function \(h(x,z)=f(x)+\ell \mu \left\| z\right\| .\) Clearly, \(h(\overline{x},0)=f({\overline{x}}),\) and we first show that \(({\overline{x}},0)\) is a local minimum of h on \({\text {Gr}}{\tilde{g}}.\) Thus, we have to show that there is \(\delta >0\) such that for all \((x,z)\in \left( B\left( \overline{x},\delta \right) \times B(0,\delta )\right) \cap {\text {Gr}}\tilde{g},\)

$$\begin{aligned} f({\overline{x}})\le f(x)+\ell \mu \left\| z\right\| . \end{aligned}$$

Let \(\alpha \) be the radius of the ball given by the local minimality of \({\overline{x}}.\) Take \(\delta >0\) such that \(\delta <\min \left\{ \varepsilon \left( 1+\mu \right) ^{-1},s,\alpha \left( 1+\mu \right) ^{-1}\right\} \) and \((x,z)\in \left( B\left( {\overline{x}},\delta \right) \times B(0,\delta )\right) \cap {\text {Gr}}{\tilde{g}}.\) If \(z\in -Q,\) then by the property of \({\overline{x}},\)

$$\begin{aligned} f({\overline{x}})\le f(x)\le f(x)+\ell \mu \left\| z\right\| . \end{aligned}$$

Suppose that \(z\notin -Q.\) Then there is \(q\in Q\) such that \(z-q=g\left( x\right) ,\) hence \(x\in B\left( {\overline{x}},\delta \right) \cap g^{-1}(z-q)\subset B\left( {\overline{x}},s\right) \cap g^{-1}\left( -Q+B(0,s)\right) .\) Now, by (ii), 

$$\begin{aligned} d(x,g^{-1}(-Q))\le \mu d(0,{\tilde{g}}(x)\cap B(0,s))\le \mu \left\| z\right\| . \end{aligned}$$

Then for every \(\rho \in (0,\min \left\{ \alpha ,\varepsilon \right\} -\delta \left( 1+\mu \right) ),\) there is \(x_{\rho }\in g^{-1}(-Q)\) such that

$$\begin{aligned} \left\| x-x_{\rho }\right\| <\mu \left\| z\right\| +\rho \le \mu \delta +\rho . \end{aligned}$$

Then

$$\begin{aligned} \left\| x_{\rho }-{\overline{x}}\right\| \le \left\| x_{\rho }-x\right\| +\left\| x-{\overline{x}}\right\| \le \mu \delta +\rho +\delta =\delta \left( 1+\mu \right) +\rho <\min \left\{ \alpha ,\varepsilon \right\} . \end{aligned}$$

Therefore, by (i), 

$$\begin{aligned} f({\overline{x}})\le f(x_{\rho })\le f(x)+\ell \left\| x-x_{\rho }\right\| \le f(x)+\ell \mu \left\| z\right\| +\ell \rho . \end{aligned}$$

Letting \(\rho \rightarrow 0,\) we get the claim.

Since h is locally Lipschitz around \(({\overline{x}},0)\) with constant \(\ell \left( 1+\mu \right) ,\) we apply the standard Clarke penalization principle, and we get the first conclusion.

For the second part, applying the generalized Fermat rule and the sum rule for the limiting subdifferential on Asplund spaces, we get

$$\begin{aligned} \left( 0,0\right) \in \partial f({\overline{x}})\times \left\{ 0\right\} +\left\{ 0\right\} \times \ell \mu \partial \left\| \cdot \right\| (0)+\ell \left( 1+\mu \right) \partial d(\left( \cdot ,\cdot \right) ,{\text {Gr}}{\tilde{g}})\left( {\overline{x}},0\right) . \end{aligned}$$

Then there are \(x^{*}\in \partial f({\overline{x}}),\) \(z^{*}\in D(0,\ell \mu )\) such that

$$\begin{aligned} \left( -x^{*},-z^{*}\right)&\in N\left( {\text {Gr}} {\tilde{g}},\left( {\overline{x}},0\right) \right) \text { and}\\ \left\| \left( x^{*},z^{*}\right) \right\|&\le \ell \left( 1+\mu \right) . \end{aligned}$$

In particular, this means that

$$\begin{aligned} \left\| x^{*}\right\|&\le \ell \left( 1+\mu \right) \text { and}\\ -x^{*}&\in D^{*}{\tilde{g}}({\overline{x}},0)\left( z^{*}\right) . \end{aligned}$$

Taking into account the definition of the limiting normal cone, \(\left( -x^{*},-z^{*}\right) \) can be approximated in the weak-star topology by elements \((-x_{n}^{*},-z_{n}^{*})\) of the Fréchet normal cone to \({\text {Gr}}{\tilde{g}}\) at points close to \(\left( {\overline{x}},0\right) .\) According to [11, Lemma 3.2], the elements \(z_{n}^{*}\) belong to \(Q^{+}\) and, since this set is weak-star closed, \(z^{*}\in Q^{+}.\) Therefore, \(z^{*}\in Q^{+}\cap D(0,\ell \mu )\) and we get the conclusion.\(\square \)
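As with Theorem 4.4, the penalty above can be probed numerically. The following sketch uses assumed toy data (none of it from the statement itself): \(X=Z={\mathbb {R}},\) \(Q={\mathbb {R}}_{+},\) \(g(x)=x,\) \(f(x)=-x,\) so that the feasible set is \((-\infty ,0],\) \({\overline{x}}=0,\) and \(\ell =\mu =1\) are admissible in (i) and (ii).

```python
# Toy numerical check of the penalty in Theorem 4.12 (assumed 1-D data):
#   X = Z = R,  Q = R_+,  g(x) = x,  so g^{-1}(-Q) = (-inf, 0],
#   f(x) = -x attains its minimum on the feasible set at xbar = 0,
#   and ell = mu = 1 are admissible in (i) and (ii).
import numpy as np

rng = np.random.default_rng(1)
ell, mu = 1.0, 1.0

def dist_graph(x, z):
    # Gr g~ = {(x, z) : z >= g(x)} = {(x, z) : z >= x}
    return max(x - z, 0.0) / np.sqrt(2.0)

def penalized(x, z):
    return -x + ell * mu * abs(z) + ell * (1 + mu) * dist_graph(x, z)

for _ in range(10_000):
    x, z = rng.uniform(-0.5, 0.5, size=2)
    # condition (ii): d(x, g^{-1}(-Q)) = max(x, 0) and d(0, g~(x)) = max(x, 0)
    assert max(x, 0.0) <= mu * max(x, 0.0) + 1e-12
    # (xbar, 0) = (0, 0) is an unconstrained minimum of the penalized function
    assert penalized(x, z) >= penalized(0.0, 0.0) - 1e-12
print("Theorem 4.12 penalty verified on random samples")
```

In this toy instance the dual condition can also be read off by hand: \(\partial f(0)=\{-1\}\) and \(N({\text {Gr}}{\tilde{g}},(0,0))=\{t(1,-1)\mid t\ge 0\},\) so \(x^{*}=-1\) and \(z^{*}=1\in Q^{+}\cap D(0,\ell \mu )\) realize the nonempty intersection.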

Remark 4.13

Notice that the dual conditions in Corollary 4.8 and the last part of Theorem 4.12 are consequences of the corresponding primal conditions and the sum rule for limiting subdifferentials.

By combining both penalty principles above, one can similarly obtain optimality conditions for directional minima under functional constraints. We leave the details to the interested reader.

Finally, we give some optimality conditions for a concept of directional minimum with respect to two sets of directions for vectorial functions (see [14]). We first need an auxiliary result.

Lemma 4.14

Let X, Y be Banach spaces, \(f:X\rightarrow Y\) a continuously differentiable function, \(M\subset S_{Y}\) a closed and nonempty set, and \({\overline{x}}\in X\). Suppose that one of the following sets of conditions holds:

  1. (i)

    \({\text {cone}}M\) is convex and there exists \(u_{0}\in X\) such that \(\nabla f({\overline{x}})(u_{0})\in -{\text {int}}({\text {cone}}M)\);

  2. (ii)

    the map \(g:X\times Y\rightarrow Y\) given by \(g(x,y)=f(x)-y\) is metrically subregular at \(({\overline{x}},f({\overline{x}}),0)\) with respect to \(X\times f^{-1}(f({\overline{x}})-{\text {cone}}M)\). Then

    $$\begin{aligned} T_{B}(f^{-1}(f({\overline{x}})-{\text {cone}}M),{\overline{x}})=\nabla f({\overline{x}})^{-1}(-{\text {cone}}M). \end{aligned}$$
    (4.3)

Proof

Since f is continuously differentiable, it is always true that \(T_{B}(f^{-1}(f({\overline{x}})-{\text {cone}} M),{\overline{x}})\subset \nabla f({\overline{x}})^{-1}(-{\text {cone}}M)\). Indeed, for an arbitrary \(u\in T_{B}(f^{-1}(f({\overline{x}} )-{\text {cone}}M),{\overline{x}})\), there exist \((t_{n})\downarrow 0\) and \(\left( u_{n}\right) \rightarrow u\) such that \(f({\overline{x}}+t_{n}u_{n})\in f({\overline{x}})-{\text {cone}}M\) for all n. Since f is continuously differentiable at \({\overline{x}}\), there exists a function \(\alpha :X\rightarrow Y\) such that \(\lim _{h\rightarrow 0}\alpha (\overline{x}+h)=\alpha ({\overline{x}})=0\) and

$$\begin{aligned} f({\overline{x}}+t_{n}u_{n})=f({\overline{x}})+t_{n}\nabla f({\overline{x}} )(u_{n})+t_{n}\alpha ({\overline{x}}+t_{n}u_{n})\Vert u_{n}\Vert . \end{aligned}$$

Therefore,

$$\begin{aligned} t_{n}\nabla f({\overline{x}})(u_{n})+t_{n}\alpha ({\overline{x}}+t_{n}u_{n})\Vert u_{n}\Vert \in -{\text {cone}}M, \end{aligned}$$

and from here, dividing by \(t_{n}\) and passing to the limit (recall that \({\text {cone}}M\) is closed, since M is a closed subset of the unit sphere), one gets \(u\in \nabla f({\overline{x}})^{-1}(-{\text {cone}}M)\).

If the condition from (ii) holds, the conclusion follows from [12, Theorem 3.1].

Suppose now that we are under the assumptions from (i). Let \((t_{n})\) be an arbitrary sequence of positive numbers converging to 0, and set \(u_{n}:=u_{0}\). Using the differentiability of f, we can write

$$\begin{aligned} f({\overline{x}}+t_{n}u_{n})=&f({\overline{x}})+t_{n}\nabla f(\overline{x})(u_{n})+t_{n}\alpha ({\overline{x}}+t_{n}u_{n})\Vert u_{n}\Vert \\ =&f({\overline{x}})+t_{n}[\nabla f({\overline{x}})(u_{0})+\alpha (\overline{x}+t_{n}u_{0})\Vert u_{0}\Vert ], \end{aligned}$$

where \(\alpha :X\rightarrow Y\) is such that \(\lim \limits _{n\rightarrow \infty }\alpha ({\overline{x}}+t_{n}u_{n})=\alpha ({\overline{x}})=0.\) Therefore, since \(\nabla f({\overline{x}})(u_{0})\in -{\text {int}}({\text {cone}}M)\), we have that for all n large enough, \(f({\overline{x}}+t_{n}u_{0})\in f({\overline{x}})-{\text {cone}}M\), that is \(u_{0}\in T_{U}(f^{-1} (f({\overline{x}})-{\text {cone}}M),{\overline{x}})\).

Consider now \(v\in X\) such that \(\nabla f({\overline{x}})(v)\in -{\text {cone}}M\), and \(\lambda \in (0,1)\). Then

$$\begin{aligned} \nabla f({\overline{x}})(\lambda u_{0}+(1-\lambda )v)\in -({\text {int}}({\text {cone}}M)+{\text {cone}}M)=-{\text {int}}({\text {cone}}M). \end{aligned}$$

Hence, from the previous step,

$$\begin{aligned} \lambda u_{0}+(1-\lambda )v\in T_{U}(f^{-1}(f({\overline{x}} )-{\text {cone}}M),{\overline{x}}). \end{aligned}$$

Passing to the limit with \(\lambda \rightarrow 0\), and taking into account that the adjacent cone is closed, we get that \(v\in T_{U}(f^{-1}(f(\overline{x})-{\text {cone}}M),{\overline{x}})\subset T_{B}(f^{-1}(f(\overline{x})-{\text {cone}}M),{\overline{x}})\), and this completes the proof. \(\square \)
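Formula (4.3) can also be visualized on a concrete smooth map. The script below is an assumed toy instance, not taken from the text: \(f(x,y)=x+y^{2},\) \({\overline{x}}=(0,0),\) \(M=\{1\}\subset S_{Y},\) so \({\text {cone}}M={\mathbb {R}}_{+}\) and condition (i) holds with \(u_{0}=(-1,0).\) It compares, over sampled unit directions, membership in \(\nabla f({\overline{x}})^{-1}(-{\text {cone}}M)=\{(u,v)\mid u\le 0\}\) with an empirical tangency test for \(f^{-1}(f({\overline{x}})-{\text {cone}}M)=\{(x,y)\mid x+y^{2}\le 0\}.\)

```python
# Sampling check of formula (4.3) in Lemma 4.14 on an assumed toy instance:
#   f(x, y) = x + y^2,  xbar = (0, 0),  M = {1} in S_Y = {-1, 1},
# so cone M = R_+, condition (i) holds with u0 = (-1, 0), and
#   S := f^{-1}(f(xbar) - cone M) = {(x, y) : x + y^2 <= 0},
#   grad f(xbar)^{-1}(-cone M)   = {(u, v) : u <= 0}.
import numpy as np

def tangent_gap(u, t):
    # per-unit-step residual of xbar + t*u relative to S, measured by the
    # horizontal correction needed to land in S; tends to max(u[0], 0) as t -> 0
    p = t * np.asarray(u)
    return max(p[0] + p[1] ** 2, 0.0) / t

for theta in np.linspace(0.0, 2 * np.pi, 360, endpoint=False):
    u = np.array([np.cos(theta), np.sin(theta)])
    predicted = u[0] <= 1e-9                    # u in grad f(xbar)^{-1}(-cone M)
    empirical = tangent_gap(u, 1e-6) < 1e-4     # u (approximately) in T_B(S, xbar)
    assert predicted == empirical
print("formula (4.3) confirmed on 360 sampled directions")
```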

We give next a more involved concept of directional minimality (see [14]).

Definition 4.15

Let X and Y be normed vector spaces, \(K\subset Y\) a closed ordering cone with nonempty interior, \(f:X\rightarrow Y\), and \(A\subset X\), \(L\subset S_{X}\), \(M\subset S_{Y}\) closed sets. One says that \({\overline{x}}\in A\) is a weak local directional Pareto minimum point for f with respect to the sets of directions L and M on A if there exists a neighborhood U of \({\overline{x}}\) such that

$$\begin{aligned}{}[\left[ f(A\cap U\cap ({\overline{x}}+{\text {cone}}L))\cap (f({\overline{x}})-{\text {cone}}M)\right] -f({\overline{x}})]\cap (-{\text {int}}K)=\emptyset . \end{aligned}$$

We now present the necessary optimality conditions.

Theorem 4.16

Let X, Y be Banach spaces, \(f:X\rightarrow Y\) a continuously differentiable function, \(K\subset Y\) a closed convex ordering cone with nonempty interior, \(A\subset X\), \(L\subset S_{X}\) and \(M\subset S_{Y}\) closed and nonempty sets, and \({\overline{x}}\in A\). Assume there exist \(s,\mu ,t,\gamma >0\) such that

$$\begin{aligned} d(x,A\cap ({\overline{x}}+{{\text {cone}}}L))\le \mu d(x,\overline{x}+{\text {cone}}L),\ \forall x\in B({\overline{x}},s)\cap A, \end{aligned}$$

and

$$\begin{aligned}&d(x,A\cap ({\overline{x}}+{{\text {cone}}}L)\cap f^{-1}(f(\overline{x})-{{\text {cone}}}M))\le \gamma d(x,f^{-1}(f({\overline{x}} )-{{\text {cone}}}M)),\\&\forall x\in B({\overline{x}},t)\cap A\cap ({\overline{x}}+{\text {cone}}L). \end{aligned}$$

Suppose also that either (i) or (ii) from Lemma 4.14 holds, and that \({\overline{x}}\) is a weak local directional Pareto minimum for f on A with respect to L and M.

Then

$$\begin{aligned} \nabla f({\overline{x}})(u)\notin -{\text {int}}K,\ \forall u\in T_{B}(A,{\overline{x}})\cap {{\text {cone}}}L\cap \nabla f(\overline{x})^{-1}(-{{\text {cone}}}M). \end{aligned}$$

Proof

First, we observe that the two metric assumptions, applied successively, ensure the equality

$$\begin{aligned}&T_{B}(A\cap ({\overline{x}}+{\text {cone}}L)\cap f^{-1}(f(\overline{x})-{\text {cone}}M),{\overline{x}})\\&\quad =T_{B}(A,{\overline{x}})\cap T_{B}({\overline{x}}+{\text {cone}}L,{\overline{x}})\cap T_{B} (f^{-1}(f({\overline{x}})-{\text {cone}}M),{\overline{x}}). \end{aligned}$$

But \(T_{B}({\overline{x}}+{\text {cone}}L,{\overline{x}} )={\text {cone}}L\) and since we are under assumptions (i) or (ii) from Lemma 4.14, we also have

$$\begin{aligned} T_{B}(f^{-1}(f({\overline{x}})-{\text {cone}}M),{\overline{x}})=\nabla f({\overline{x}})^{-1}(-{\text {cone}}M). \end{aligned}$$

Therefore,

$$\begin{aligned}&T_{B}(A\cap ({\overline{x}}+{\text {cone}}L)\cap f^{-1}(f(\overline{x})-{\text {cone}}M),{\overline{x}})\nonumber \\&\quad =T_{B}(A,{\overline{x}})\cap {\text {cone}}L\cap \nabla f({\overline{x}})^{-1}(-{\text {cone}}M). \end{aligned}$$
(4.4)

The minimality of \({\overline{x}}\) means that there exists a neighborhood U of \({\overline{x}}\) such that

$$\begin{aligned}{}[\left[ f(A\cap U\cap ({\overline{x}}+{{\text {cone}}} L))\cap (f({\overline{x}})-{{\text {cone}}}M)\right] -f({\overline{x}} )]\cap (-{\text {int}}K)=\emptyset . \end{aligned}$$

That is, for every \(e\in {\text {int}}K\) and every \(x\in A\cap U\cap ({\overline{x}}+{\text {cone}}L)\cap f^{-1}(f(\overline{x})-{\text {cone}}M)\), we have \(s_{e}(f(x)-f({\overline{x}}))\ge 0\) since \(f(x)-f({\overline{x}})\notin -{\text {int}}K\). We can now see \({\overline{x}}\) as a local minimum point for the scalar function \(g:X\rightarrow {\mathbb {R}}\) given by \(g(x)=s_{e}(f(x)-f({\overline{x}}))\) on \(A\cap ({\overline{x}}+{\text {cone}}L)\cap f^{-1}(f(\overline{x})-{\text {cone}}M)\). Note that \(g({\overline{x}})=s_{e}(0)=0\) because \(s_{e}\) is sublinear.

Consider now \(u\in T_{B}(A,{\overline{x}})\cap {\text {cone}}L\cap \nabla f({\overline{x}})^{-1}(-{\text {cone}}M)\). There exist \((t_{n})\downarrow 0\) and \(u_{n}\rightarrow u\) such that

$$\begin{aligned} {\overline{x}}+t_{n}u_{n}\in A\cap ({\overline{x}}+{{\text {cone}}}L)\cap f^{-1}(f({\overline{x}})-{{\text {cone}}}M), \end{aligned}$$

due to (4.4). Moreover, since \({\overline{x}}+t_{n}u_{n}\rightarrow {\overline{x}}\), we also have that \({\overline{x}}+t_{n}u_{n}\in U\) for n sufficiently large. Hence,

$$\begin{aligned} g({\overline{x}}+t_{n}u_{n})\ge g({\overline{x}})\Leftrightarrow s_{e} (f({\overline{x}}+t_{n}u_{n})-f({\overline{x}}))\ge 0, \end{aligned}$$

for all n sufficiently large. As in the proof of Lemma 4.14, since f is differentiable at \({\overline{x}}\), there exists a function \(\alpha :X\rightarrow Y\) such that \(\lim _{h\rightarrow 0}\alpha ({\overline{x}}+h)=\alpha (\overline{x})=0\), so that we can write

$$\begin{aligned} s_{e}(t_{n}\nabla f({\overline{x}})(u_{n})+t_{n}\alpha ({\overline{x}}+t_{n} u_{n})\Vert u_{n}\Vert )\ge 0. \end{aligned}$$

Dividing by \(t_{n}\) and passing to the limit, we get \(s_{e}(\nabla f({\overline{x}})(u))\ge 0\), that is \(\nabla f({\overline{x}})(u)\notin -{\text {int}}K\). The proof is complete. \(\square \)

5 Concluding Remarks

Besides the known applications of the metric inequality for sets discussed in this work, we have studied its implications in at least two aspects of optimization problems. First, it is an appropriate replacement for some constraint qualification conditions when a new constraint is added. Secondly, it naturally enters penalization results when several constraints are considered. As underlined, the metric inequality is by its nature rather weak, which opens the perspective of applying the general principle behind our results to other research directions. In particular, the metric relations studied here could also be used to derive calculus rules for tangent cones, replacing the conditions based on subregularity of mappings initiated in [12]. Moreover, penalization procedures for vectorial problems with multiple constraints, based on results such as those in [36] and [13], are envisaged for future work.