1 Introduction

In 2007, a difficult class of optimization problems, called mathematical programs with vanishing constraints (MPVCs for short), was introduced by Kanzow and his co-authors in [1, 2]. MPVCs have since received attention from different fields. Some of their applications in topology optimization were presented in [1]. Also, MPVCs are generalizations of another group of optimization problems, named mathematical programs with equilibrium constraints (MPECs in brief).

Since MPVCs are a generalized form of MPECs, it is natural to seek to extend the results on MPECs to them. As is well known, for MPECs the assumptions under which necessary conditions of Karush–Kuhn–Tucker (KKT) type hold are very restrictive. Therefore, the KKT conditions are replaced by other conditions that have the style and appearance of the KKT conditions, but whose coefficients are less constrained than the KKT coefficients. The latter conditions, called stationary conditions, can be proved under constraint qualifications whose occurrence is more natural [3,4,5,6,7,8,9].

The same process has been repeated for MPVCs, and numerous constraint qualifications have been introduced, leading to different stationary conditions. For instance, the first-order stationary conditions for MPVCs can be found in [1, 10, 11]. The second-order stationary conditions and an exact penalty theorem are given in [2] and [12], respectively. Also, relaxation methods have been presented in [13], stability in [14], and numerical algorithms in [15]. In addition, weak and strong duality results for MPVCs can be found in [16].

In the works referenced above, all functions defining the MPVC are continuously differentiable. Unlike MPECs, whose non-smooth form has been studied in several references, e.g., [3, 7,8,9, 17], the earlier works on MPVCs keep the restrictive assumption of continuous differentiability of the functions. So, to the best of our knowledge, the present article is the first to study the stationary conditions for MPVCs whose constraints are non-smooth, i.e., not necessarily differentiable. Throughout this paper, we assume that the objective function is differentiable.

Since the feasible set of an MPVC is not necessarily convex, even when the functions constructing it are convex (to be discussed in the next sections), the common methods of convex analysis are not applicable here. Therefore, we take the non-smooth analysis approach. As a suitable sub-differential, we select one that is not only convex-valued but also has known calculus rules. The best choice is the Clarke sub-differential. Accordingly, we assume that the functions defining the problem are locally Lipschitz.

The structure of the subsequent sections of this paper is as follows: In Sect. 2, we recall the required definitions, theorems, and relations of non-smooth analysis. In Sect. 3, we introduce various constraint qualifications for a system with non-smooth vanishing constraints and present the relations between them. In Sect. 4, we apply these constraint qualifications to obtain different kinds of stationary conditions for the problem.

2 Notations and Preliminaries

In this section, we provide an overview of some notations and preliminary results that will be used throughout this paper. Further details and examples concerning these results can be found in the books [18,19,20].

First, we recall that \({\mathbb {R}}_+\) and \(\langle x,y \rangle \) denote, respectively, the nonnegative real numbers \([0,+\infty [\) and the standard inner product of vectors \(x,y \in {\mathbb {R}}^n\). For a non-empty subset M of \( {\mathbb {R}}^n \), let:

$$\begin{aligned}&M^-:=\{x\in {\mathbb {R}}^n :\langle x,d \rangle \le 0, \quad \forall d\in M \},\\&M^s:=\{x\in {\mathbb {R}}^n :\langle x,d \rangle < 0, \quad \forall d\in M \},\\&M^{\bot }:=M^-\cap (-M)^-=\{x\in {\mathbb {R}}^n :\langle x,d \rangle = 0, \quad \forall d\in M \}. \end{aligned}$$

It can easily be shown that if \( M^s \ne \emptyset \), then \( cl({M^s})=M^- \), where \( cl({M^s}) \) denotes the topological closure of \( M^s \). Here, it is necessary to note that for given closed convex cones \( M_1,\ M_2,\ M_3,\) and \(M_4, \) we have:
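For a finite set M, membership in \(M^-\) and \(M^s\) reduces to finitely many inequalities. The following Python sketch (an illustration only, not part of the analysis; the set M and the test vectors are chosen arbitrarily) checks such membership:

```python
# Membership tests for the polar M^- and strict polar M^s of a finite set M.
def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def in_polar(x, M):
    # x in M^-  iff  <x, d> <= 0 for every d in M
    return all(inner(x, d) <= 0 for d in M)

def in_strict_polar(x, M):
    # x in M^s  iff  <x, d> < 0 for every d in M
    return all(inner(x, d) < 0 for d in M)

M = [(1.0, 0.0), (0.0, 1.0)]  # M^- is then the third quadrant

assert in_polar((-1.0, -2.0), M) and in_strict_polar((-1.0, -2.0), M)
# a boundary point: in M^- but not in M^s, consistent with cl(M^s) = M^-
assert in_polar((0.0, -1.0), M) and not in_strict_polar((0.0, -1.0), M)
assert not in_polar((0.5, -1.0), M)
```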

$$\begin{aligned}&\left( M_1 \cup M_2\right) ^-=M_1^- \cap M_2^-,\ \ \ \ \left( M_1 \times M_2\right) ^-=M_1^- \times M_2^-, \end{aligned}$$
(1)
$$\begin{aligned}&\left( M_1^-\times M_2^-\right) \cap \left( M_3^-\times M_4^-\right) =\left( M_1^- \cap M_3^-\right) \times \left( M_2^- \cap M_4^-\right) . \end{aligned}$$
(2)

The convex hull and the convex cone (containing origin) of \(M \subseteq {\mathbb {R}}^n\) are, respectively, denoted by \( \mathrm{conv}(M) \) and \( \mathrm{cone}(M) \). It is easy to see that if \( \{M_\kappa \}_{\kappa \in \varLambda } \) is a class of convex subsets of \( {\mathbb {R}}^n \) with \(|\varLambda |<\infty \) and with the convention \( \bigcup _{\kappa \in \emptyset }M_\kappa =\emptyset \), then one has:

$$\begin{aligned}&\mathrm{cone}\left( \bigcup _{\kappa \in \varLambda }M_\kappa \right) =\left\{ \sum _{\kappa \in \varLambda }\alpha _\kappa d_\kappa :\alpha _\kappa \ge 0, \ d_\kappa \in M_\kappa \right\} . \end{aligned}$$

The famous bipolar theorem states that \( (M^-)^-=cl \big ({\mathrm{cone}}(M)\big ) \) for each \( \emptyset \ne M\subseteq {\mathbb {R}}^n \). We define the normal cone of a convex set \( A\subseteq {{\mathbb {R}}}^n \) at \( x_0 \in cl({A})\) as

$$\begin{aligned} N(A,x_0):=\{d\in {{\mathbb {R}}}^n :\langle d , x-x_0 \rangle \le 0, \quad \forall x\in A\}. \end{aligned}$$

The following relation, valid for any subset K of \( {\mathbb {R}}^n \), follows immediately from the bipolar theorem:

$$\begin{aligned} N(K^-,0)=cl\big ( {\mathrm{cone}} (K)\big ). \end{aligned}$$
(3)

Let \(\psi :{\mathbb {R}}^n \rightarrow {{\mathbb {R}}}^*:={\mathbb {R}} \cup \{+\infty \} \) be a locally Lipschitz function, and \(x_0 \in dom \) \( \psi \). The Clarke sub-differential of \(\psi \) at \(x_0\) is defined as

$$\begin{aligned} \partial _c\psi (x_0):=\{\xi \in {\mathbb {R}}^n :\psi ^\circ (x_0;v)\ge \langle \xi , v \rangle , \quad \forall v\in {\mathbb {R}}^n \}, \end{aligned}$$

where \(\psi ^\circ (x_0;v)\) denotes the Clarke generalized directional derivative of \(\psi \) at \(x_0\) in direction \(v \in {\mathbb {R}}^n\),

$$\begin{aligned} \psi ^\circ (x_0;v)=\limsup _{y\rightarrow x_0,\ t\downarrow 0}\dfrac{\psi (y+tv)-\psi (y)}{t}. \end{aligned}$$
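As a numerical illustration (not part of the analysis), the limsup above can be approximated by maximizing difference quotients over base points y near \(x_0\) and small steps t; for \(\psi (x)=|x|\) and \(x_0=0\), the exact value is \(\psi ^\circ (0;v)=|v|\):

```python
# Crude approximation of the Clarke generalized directional derivative
# psi°(x0; v) = limsup_{y -> x0, t -> 0+} (psi(y + t*v) - psi(y)) / t.
def clarke_dd(psi, x0, v, radius=0.1, n=400):
    quotients = []
    for i in range(-n, n + 1):
        y = x0 + radius * i / n        # base points y approaching x0
        for t in (1e-3, 1e-4, 1e-5):   # step sizes t decreasing to 0+
            quotients.append((psi(y + t * v) - psi(y)) / t)
    return max(quotients)

# psi(x) = |x| at x0 = 0: psi°(0; v) = |v|
assert abs(clarke_dd(abs, 0.0, 1.0) - 1.0) < 1e-6
assert abs(clarke_dd(abs, 0.0, -2.0) - 2.0) < 1e-6
```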

It should be noted that if \( \psi \) is continuously differentiable at \( x_0 \), then \( \partial _c \psi (x_0)=\{\nabla \psi (x_0)\} \). Consequently, statements expressed in terms of \(\partial _c \) generalize those expressed via the gradient for \(C^1\) functions. Moreover, it can be shown that if \( \psi \) is a locally Lipschitz function, then \( \partial _c \psi (x_0) \) is a non-empty, convex, and compact set. Also, \( \psi ^\circ (x_0;v) \) is a convex function with respect to v.

If \( \psi _1 \) and \( \psi _2 \) are locally Lipschitz functions, then the following relationships always hold:

$$\begin{aligned}&\partial _c(\psi _1+\psi _2)(x_0)\subseteq \partial _c\psi _1(x_0)+\partial _c\psi _2(x_0), \end{aligned}$$
(4)
$$\begin{aligned}&\partial _c(\lambda \psi _1)(x_0)=\lambda \partial _c\psi _1(x_0),\ \ \quad \forall \lambda \in {\mathbb {R}}, \end{aligned}$$
(5)
$$\begin{aligned}&\partial _c(\psi _1\psi _2)(x_0)\subseteq \psi _1(x_0)\partial _c\psi _2(x_0)+\psi _2(x_0)\partial _c\psi _1(x_0), \end{aligned}$$
(6)
$$\begin{aligned}&\psi ^\circ (x_0;v)=\max \{\langle \xi ,v \rangle :\xi \in \partial _c\psi (x_0)\}. \end{aligned}$$
(7)

The following three approximating cones for \(\emptyset \ne M \subseteq {\mathbb {R}}^n\) at \(x_0 \in cl({M})\) will be required in this article:

  • The contingent cone \(\varGamma (M,x_0)\),

    $$\begin{aligned} \varGamma (M,x_0):=\Big \{v\in {\mathbb {R}}^n :\exists t_r \downarrow 0,\ \exists v_r \rightarrow v\ \text {such that}\ x_0+t_r v_r \in M \ \ \forall r\in \mathbb {N} \Big \} . \end{aligned}$$
  • The cone of feasible directions \(\varUpsilon (M,x_0)\),

    $$\begin{aligned} \varUpsilon (M,x_0):=\Big \{v \in {\mathbb {R}}^n :\exists \varepsilon >0,\ \text {such that}\ x_0+t v \in M \ \ \ \forall t\in ]0,\varepsilon [ \Big \} . \end{aligned}$$
  • The cone of attainable directions \(\varOmega (M,x_0)\),

    $$\begin{aligned} \varOmega (M,x_0):=\Big \{v \in {\mathbb {R}}^n :\forall t_r \downarrow 0, \ \exists v_r \rightarrow v \ \text {such that}\ x_0+t_r v_r \in M \ \ \ \forall r\in \mathbb {N} \Big \} . \end{aligned}$$

Remark 2.1

The cone of attainable directions of M at \(x_0\) admits another representation as follows (see, e.g., [19]):

\(v\in \varOmega (M, x_0)\), if and only if there exists a scalar \(\delta >0\) together with a mapping \(\varphi :{\mathbb {R}} \rightarrow {\mathbb {R}}^n\) such that \(\varphi (\beta )\in M\) for all \(\beta \in ]0,\delta [\), \(\varphi (0)=x_0\), and \(\lim _{\beta \downarrow 0}\frac{\varphi (\beta )-\varphi (0)}{\beta }=v\).

These cones are related as follows:

$$\begin{aligned} cl\big ({\varUpsilon (M,x_0)}\big ) \subseteq \varOmega (M,x_0) \subseteq \varGamma (M,x_0)\subseteq cl \big ({\mathrm{cone}}\left( \varGamma (M,x_0)\right) \big ). \end{aligned}$$
(8)
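The inclusions in (8) can be strict. For instance, take the parabola \(M=\{(x_1,x_2)\in {\mathbb {R}}^2 :x_2=x_1^2\}\) and \(x_0=(0,0)\): no half-line emanating from the origin stays in M, while every horizontal direction is attainable along the curve, so that

$$\begin{aligned} cl\big ({\varUpsilon (M,x_0)}\big )=\{(0,0)\} \subsetneqq \varOmega (M,x_0)= \varGamma (M,x_0)={\mathbb {R}} \times \{0\}. \end{aligned}$$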

Note that if the continuously differentiable function \(\psi (\cdot )\) attains its minimum on \(M\subseteq {\mathbb {R}}^n\) at \(x_0 \in M\), then:

$$\begin{aligned} -\nabla \psi (x_0)\in N_F(M,x_0), \end{aligned}$$
(9)

where \(N_F(M,x_0)\) denotes the Fréchet normal cone of M at \(x_0\), which is defined as \( N_F(M,x_0):=\left( \varGamma (M,x_0)\right) ^- \).
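As a simple illustration of (9), let \(\psi (x)=x_1\) and \(M={\mathbb {R}}_+^2\); then \(\psi \) attains its minimum on M at \(x_0=(0,0)\), and

$$\begin{aligned} \varGamma (M,x_0)={\mathbb {R}}_+^2, \qquad N_F(M,x_0)=\left( {\mathbb {R}}_+^2\right) ^-={\mathbb {R}}_-^2, \qquad -\nabla \psi (x_0)=(-1,0)\in N_F(M,x_0). \end{aligned}$$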

3 Constraint Qualifications

Let \(I:=\{1,\ldots ,m\}\), and for each \(i\in I\) let the functions \(G_i\) and \(H_i\) be locally Lipschitz from \({\mathbb {R}}^n\) to \({{\mathbb {R}}}^*\). In this section, we introduce and compare several constraint qualifications (CQs) for the following system with vanishing constraints (SVC in short):

$$\begin{aligned} \varPi := \left\{ \begin{array}{ll} H_i(x)\ge 0 , &{} \quad i\in I,\\ H_i(x)G_i(x)\le 0, &{} \quad i\in I. \end{array} \right. \end{aligned}$$

The feasible set of \(\varPi \) is assumed to be non-empty, i.e.,

$$\begin{aligned} S:=\{ x\in {\mathbb {R}}^n :H_i(x)\ge 0\ \ \text {and}\ \ H_i(x)G_i(x)\le 0\ \ \text {for all}\ i\in I \}\ne \emptyset . \end{aligned}$$

Considering a feasible point \(\hat{x} \in S\) (this point will be fixed throughout this section), we define the following index sets:

$$\begin{aligned}&I_{+0}:=\{i\in I:H_i(\hat{x})>0 ,\ G_i(\hat{x})=0\}, \\&I_{+-}:=\{i\in I:H_i(\hat{x})>0 ,\ G_i(\hat{x})<0\}, \\&I_{0+}:=\{i\in I:H_i(\hat{x})=0 ,\ G_i(\hat{x})>0\}, \\&I_{00}:=\{i\in I:H_i(\hat{x})=0 ,\ G_i(\hat{x})=0\}, \\&I_{0-}:=\{i\in I:H_i(\hat{x})=0 ,\ G_i(\hat{x})<0\}. \end{aligned}$$
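These index sets are straightforward to compute at a given feasible point. The following Python sketch (an illustration only) does so for the data of Example 3.1 below, for which \(\hat{x}=(0,0)\) yields \(I_{00}=\{1,2\}\):

```python
# Index sets of the SVC at a feasible point x; keys encode the signs of
# (H_i(x), G_i(x)), e.g. "0-" stands for H_i(x) = 0, G_i(x) < 0.
H = {1: lambda x: -x[1] ** 2, 2: lambda x: x[0] ** 2}
G = {1: lambda x: abs(x[0]) - x[1], 2: lambda x: x[0] - abs(x[1])}

def index_sets(H, G, x, tol=1e-12):
    sets = {"+0": set(), "+-": set(), "0+": set(), "00": set(), "0-": set()}
    for i in H:
        h, g = H[i](x), G[i](x)
        hs = "0" if abs(h) <= tol else "+"   # feasibility forces H_i(x) >= 0
        gs = "0" if abs(g) <= tol else ("+" if g > 0 else "-")
        if hs + gs == "++":                  # H_i > 0 and G_i > 0 is infeasible
            raise ValueError("point is not feasible")
        sets[hs + gs].add(i)
    return sets

s = index_sets(H, G, (0.0, 0.0))
assert s["00"] == {1, 2} and not (s["+0"] | s["+-"] | s["0+"] | s["0-"])
```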

For given \( I_* \subseteq I \) and \( \theta \in \{G,H \} \), let

$$\begin{aligned} \sigma _{I_*}^\theta :=\bigcup _{i\in I_*}\partial _c\theta _i(\hat{x}). \end{aligned}$$

Notice that if \(I_{00}=\emptyset \), then \(\varPi \) is locally equivalent to the following system:

$$\begin{aligned} \varPi ^*:= \left\{ \begin{array}{rl} H_i(x)=0 , &{}\quad G_i(x)>0,\quad i\in I_{0+} ,\\ H_i(x)\ge 0 , &{}\quad G_i(x)\le 0,\quad i\in I_{0-}\cup I_{+-}\cup I_{+0}\ . \end{array} \right. \end{aligned}$$

Since \(\varPi ^*\) has the classic form of nonlinear systems, defining CQs for it is easy. However, when \(I_{00}\ne \emptyset \), there is no classic system that is equivalent to \(\varPi \). Thus, we consider a tightened system for \(\varPi \), i.e., a classic system whose feasible set is contained in S. For this purpose, it is easy to see that the following system is a tightened system for \(\varPi \) when \(I_{00}=\emptyset \):

$$\begin{aligned} \varPi _*:=\left\{ \begin{array}{rl} H_i(x)=0 , &{} \quad G_i(x)\ge 0,\quad i\in I_{0+} ,\\ H_i(x)\ge 0 , &{} \quad G_i(x)\le 0,\quad i\in I_{0-}\cup I_{+-}\cup I_{+0}\ . \end{array} \right. \end{aligned}$$

To add the functions \( H_i \) and \( G_i \) for \( i\in I_{00} \) to \(\varPi _*\), we can add them either to the first line or to the second line of it. Thus, we obtain the following tightened systems:

$$\begin{aligned} \varPi _1:= & {} \left\{ \begin{array}{rl} H_i(x)=0 , &{} \quad G_i(x)\ge 0,\quad i\in I_{0+}\cup I_{00} ,\\ H_i(x)\ge 0 , &{} \quad G_i(x)\le 0,\quad i\in I_{0-}\cup I_{+-}\cup I_{+0} , \end{array} \right. \\ \varPi _2:= & {} \left\{ \begin{array}{rl} H_i(x)=0 , &{} \quad G_i(x)\ge 0,\quad i\in I_{0+},\\ H_i(x)\ge 0 , &{} \quad G_i(x)\le 0,\quad i\in I_{0-}\cup I_{+-}\cup I_{+0}\cup I_{00}\ . \end{array} \right. \end{aligned}$$

Considering \(\varPi _1\) and \(\varPi _2\), the corresponding linearized cones at \(\hat{x}\) are, respectively, given by \(\mathcal {L}_1^-\) and \(\mathcal {L}_2^-\), in which

$$\begin{aligned}&\mathcal {L}_1:=\sigma _{I_{0+}\cup I_{00}}^H \cup \left( -\sigma _{I_{0+}\cup I_{00}}^H\right) \cup \left( -\sigma _{I_{0-}}^H\right) \cup \left( \sigma _{I_{+0}\cup I_{00}}^G\right) ,\\&\mathcal {L}_2:=\sigma _{I_{0+}}^H \cup \left( -\sigma _{I_{0+} }^H\right) \cup \left( -\sigma _{I_{0-}\cup I_{00}}^H\right) \cup \left( \sigma _{I_{+0}\cup I_{00}}^G\right) . \end{aligned}$$

Following [1, 10, 11], we also define the linearized cone \(\mathcal {L}^-\) as

$$\begin{aligned} \mathcal {L}:=\sigma _{I_{0+}}^H \cup \left( -\sigma _{I_{0+}}^H\right) \cup \left( -\sigma _{I_{0-}\cup I_{00}}^H\right) \cup \left( \sigma _{I_{+0}}^G\right) . \end{aligned}$$

Now, we can define new versions of the Abadie, Kuhn–Tucker, Guignard, Zangwill, Mangasarian–Fromovitz, and linear independence CQs, extending the CQs introduced in [1, 2, 10, 11, 13,14,15]. We also introduce some new useful CQs for \(\varPi \).

Definition 3.1

We say that \(\varPi \) satisfies

  • the MPVC-LICQ at \(\hat{x}\) if \(\varPi _1\) satisfies the LICQ at \(\hat{x}\), i.e.,

    $$\begin{aligned}&0\in \sum _{i\in I_{0+}\cup I_{0-}\cup I_{00} } \alpha _i \partial _cH_i(\hat{x}) \\&\quad +\,\sum _{i\in I_{00}\cup I_{+0}}\beta _i \partial _cG_i(\hat{x}) \Rightarrow \alpha _i=0 \quad (\forall i\in I_{0+}\cup I_{0-}\cup I_{00}), \beta _i=0 \quad (\forall i\in I_{00}\cup I_{+0}) . \end{aligned}$$
  • the MPVC-MFCQ at \(\hat{x}\) if \(\varPi _1\) satisfies the MFCQ at \(\hat{x}\), i.e.,

    1. (i)

      \( 0 \in \sum _{i\in I_{00}\cup I_{0+}} \alpha _i \partial _cH_i(\hat{x})\Rightarrow \alpha _i=0 , \qquad \forall i\in I_{00}\cup I_{0+}, \)

    2. (ii)

      \( \left( \sigma ^H_{I_{00}\cup I_{0+}}\right) ^\perp \cap \left( -\sigma ^H_{I_{0-}}\right) ^s\cap \left( \sigma ^G_{I_{+0}\cup I_{00}}\right) ^s\ne \emptyset .\)

  • the \(\widehat{GCQ}\) at \(\hat{x}\) if

    $$\begin{aligned} \mathcal {L}_1^- \subseteq cl\big (\mathrm{conv}\left( \varGamma (S,\hat{x})\right) \big ). \end{aligned}$$
  • the VC-MFCQ at \(\hat{x}\) if

    1. (i)

      \(0 \in \sum _{i\in I_{00}\cup I_{0+}} \alpha _i \partial _cH_i(\hat{x})\Rightarrow \alpha _i=0 ,\qquad \forall i\in I_{00}\cup I_{0+}, \)

    2. (ii)

      \( \left( \sigma ^H_{I_{00}\cup I_{0+}}\right) ^\perp \cap \left( -\sigma ^H_{I_{0-}}\right) ^s\cap \left( \sigma ^G_{I_{+0}}\right) ^s\ne \emptyset \).

  • the VC-GCQ at \(\hat{x}\) if

    $$\begin{aligned} \mathcal {L}_2^- \subseteq cl\big (\mathrm{conv}\left( \varGamma (S,\hat{x})\right) \big ). \end{aligned}$$
  • the weak GCQ (resp., ACQ, KTCQ, ZCQ), denoted briefly by WGCQ (resp., WACQ, WKTCQ, WZCQ), at \(\hat{x}\) if

    $$\begin{aligned} {\mathcal {L}}^-\subseteq cl\big (\mathrm{conv}\left( \varGamma (S,\hat{x})\right) \big )\ \Big (\mathrm{resp}.,\ \subseteq \varGamma (S,\hat{x}),\ \subseteq \varOmega (S,\hat{x}),\ \subseteq cl\big ({\varUpsilon (S,\hat{x})}\big )\Big ). \end{aligned}$$
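In the smooth case, each sub-differential above reduces to a singleton gradient, and the MPVC-LICQ amounts to linear independence of the listed gradients. The following Python sketch (the gradient data is hypothetical, chosen only for illustration) checks this via the rank of the stacked gradient matrix:

```python
import numpy as np

# Smooth MPVC-LICQ check: the gradients of H_i (i in I_{0+} u I_{0-} u I_{00})
# and of G_i (i in I_{00} u I_{+0}) must be linearly independent.
def licq_holds(gradients):
    A = np.asarray(gradients, dtype=float)
    return np.linalg.matrix_rank(A) == A.shape[0]

grads = [[1.0, 0.0, 0.0],   # hypothetical gradient of some H_i at x^
         [0.0, 1.0, 0.0],   # hypothetical gradient of another H_i at x^
         [0.0, 1.0, 1.0]]   # hypothetical gradient of some G_i at x^
assert licq_holds(grads)                       # MPVC-LICQ holds

assert not licq_holds([[1.0, 0.0, 0.0],
                       [2.0, 0.0, 0.0]])       # dependent rows: LICQ fails
```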

Remark 3.1

  1. (i)

Since the feasible set of \( \varPi _2 \) (resp. \( \varPi _1 \)) is included in S and is not necessarily equal to it, the VC-GCQ (resp. \(\widehat{GCQ}\)) for \( \varPi \) is not equivalent to the GCQ for \( \varPi _2 \) (resp. \( \varPi _1 \)). Note, however, that the GCQ for \( \varPi _2 \) (resp. \( \varPi _1 \)) implies the VC-GCQ (resp. \(\widehat{GCQ}\)) for \( \varPi \).

  2. (ii)

There is no implication relation between the VC-MFCQ for \( \varPi \) and the MFCQ for \( \varPi _2 \) at \(\hat{x}\). Note that the MFCQ for \( \varPi _2 \) is irrelevant for our purposes, so we do not consider it.

  3. (iii)

Unlike Ref. [1], the inclusion \({\mathcal {L}}^-\subseteq cl\big (\mathrm{conv}\left( \varGamma (S,\hat{x})\right) \big )\) is not named GCQ in the above definition. The reason for this choice is that the classic GCQ for \(\varPi \) is formulated as

    $$\begin{aligned} \left( -\sigma _{ I_{0+}\cup I_{0-}\cup I_{00}}^H\right) ^-\cap \left( \bigcup _{i\in {I_{+0}\cup I_{0+}\cup I_{0-}\cup I_{00}}} \partial _c \big ( H_iG_i \big )(\hat{x}) \right) ^-\subseteq cl\big (\mathrm{conv}\left( \varGamma (S,\hat{x})\right) \big ). \end{aligned}$$
    (10)

    On the other hand, owing to

    $$\begin{aligned} \partial _c(G_i H_i)(\hat{x})\subseteq G_i(\hat{x})\partial _c H_i(\hat{x})+H_i(\hat{x})\partial _c G_i(\hat{x}), \end{aligned}$$

    we conclude that:

    $$\begin{aligned}&\left( -\sigma _{ I_{0+}\cup I_{0-}\cup I_{00}}^H\right) ^-\cap \left( \bigcup _{i\in {I_{+0}\cup I_{0+}\cup I_{0-}\cup I_{00}}} \partial _c \big ( H_iG_i \big )(\hat{x}) \right) ^- \\&\quad \supseteq \bigcap _{i\in I_{0+}}\left[ \Big (-\partial _cH_i(\hat{x})\Big )^-\cap \Big (G_i(\hat{x}) \partial _cH_i(\hat{x})\Big )^- \right] \cap \bigcap _{i\in I_{00}}\Big (-\partial _cH_i(\hat{x})\Big )^- \cap \\&\qquad \bigcap _{i\in I_{0-}}\left[ \Big (-\partial _cH_i(\hat{x})\Big )^-\cap \Big (G_i(\hat{x}) \partial _cH_i(\hat{x})\Big )^- \right] \cap \bigcap _{i\in I_{+0}}\Big (H_i(\hat{x})\partial _cG_i(\hat{x})\Big )^-\\&\quad = \bigcap _{i\in I_{0+}}\Big (-\partial _cH_i(\hat{x})\Big )^-\cap \bigcap _{i\in I_{0+}}\Big (\partial _cH_i(\hat{x})\Big )^-\cap \bigcap _{i\in I_{00}}\Big (-\partial _cH_i(\hat{x})\Big )^-\cap \\&\qquad \bigcap _{i\in I_{0-}}\Big (-\partial _cH_i(\hat{x})\Big )^-\cap \bigcap _{i\in I_{+0}}\Big (\partial _cG_i(\hat{x})\Big )^- =\mathcal {L}^-. \end{aligned}$$

Therefore, \(\mathcal {L}^-\subseteq cl\big (\mathrm{conv}\left( \varGamma (S,\hat{x})\right) \big )\) is weaker than (10). Note that if the functions \(H_i\) and \(G_i\) are continuously differentiable, then [1, Lemma 4] shows that equality holds in the last inclusion, and so \(\mathcal {L}^-\subseteq cl\big (\mathrm{conv}\left( \varGamma (S,\hat{x})\right) \big )\) is equivalent to (10).

Recall that by adding the index set \( I_{00} \) as a whole to system \(\varPi _*\), we obtained the systems \(\varPi _1\) and \(\varPi _2\). Now, in another way, we add the set \( I_{00} \) to system \(\varPi _*\) after dividing it into two parts. To this aim, we call an ordered pair \( (\beta _1,\beta _2) \) a partition of \( I_{00} \), and we write \( (\beta _1,\beta _2) \in \mathcal {P}(I_{00}) \), if \( \beta _1 \) and \( \beta _2 \) are non-empty subsets of \( I_{00} \) such that \( \beta _1\cap \beta _2=\emptyset \) and \( \beta _1\cup \beta _2=I_{00} \). Motivated by [10], we add \( \beta _1 \) to the first line and \( \beta _2 \) to the second line of system \(\varPi _*\), obtaining the following system:

$$\begin{aligned} \varPi ^{(\beta _1,\beta _2)}:=\left\{ \begin{array}{rl} H_i(x)=0 , &{} \quad G_i(x)\ge 0,\quad i\in I_{0+}\cup \beta _1,\\ H_i(x)\ge 0 , &{} \quad G_i(x)\le 0,\quad i\in I_{0-}\cup I_{+-}\cup I_{+0}\cup \beta _2\ . \end{array} \right. \end{aligned}$$

The feasible set of \( \varPi ^{(\beta _1,\beta _2)} \) is denoted by \( S^{(\beta _1,\beta _2)}\). Also, we consider the linearized cone \(\left( \mathcal {L}^{(\beta _1,\beta _2)}\right) ^- \) at \(\hat{x}\), in which

$$\begin{aligned} \left( \mathcal {L}^{(\beta _1,\beta _2)}\right) :=\left( \sigma ^H_{I_{0+}\cup \beta _1}\right) \cup \left( -\sigma ^H_{I_{0+}\cup \beta _1}\right) \cup \left( -\sigma ^H_{I_{0-}\cup \beta _2}\right) \cup \left( \sigma ^G_{I_{+0}\cup \beta _2}\right) . \end{aligned}$$

Following [10, 11], we introduce another linearized cone for \(\varPi \) as \(\mathcal {L}^\bigtriangleup :=\mathcal {L}^- \cap \mathcal {T}\), where

$$\begin{aligned} \mathcal {T}:=\{d\in \mathbb {R}^n :H_i^\circ (\hat{x};d)~G_i^\circ (\hat{x};d)\le 0,\ \ i\in I_{00}\}. \end{aligned}$$

It is worth mentioning that the product of the convex functions \(d \mapsto H_i^\circ (\hat{x};d)\) and \(d \mapsto G_i^\circ (\hat{x};d)\) is, in general, non-convex. So, unlike \(\mathcal {L}^-,\ \mathcal {L}_1^-,\ \mathcal {L}_2^-,\) and \(\left( \mathcal {L}^{(\beta _1,\beta _2)}\right) ^-\), the linearized cone \(\mathcal {L}^\bigtriangleup \) is not convex.
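For instance, for a single index \(i\in I_{00}\) with (hypothetical, smooth) data satisfying \(H_i^\circ (\hat{x};d)=d_1\) and \(G_i^\circ (\hat{x};d)=d_2\), one obtains

$$\begin{aligned} \mathcal {T}=\{d\in {\mathbb {R}}^2 :d_1 d_2\le 0\}, \end{aligned}$$

the union of the second and fourth quadrants, which is clearly non-convex.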

Since the proof of the following lemma is quite similar to that of [11, Lemma 2.4], we do not repeat it.

Lemma 3.1

The following equality is always true:

$$\begin{aligned} \varGamma (S,\hat{x})=\bigcup _{(\beta _1,\beta _2)\in \mathcal {P}(I_{00})}\varGamma \left( S^{(\beta _1,\beta _2)},\hat{x}\right) . \end{aligned}$$

Lemma 3.2

The following inclusion holds true:

$$\begin{aligned} \bigcup _{(\beta _1,\beta _2)\in \mathcal {P}(I_{00})} \left( \mathcal {L}^{(\beta _1,\beta _2)}\right) ^- \subseteq \mathcal {L}^\bigtriangleup . \end{aligned}$$

Proof

Assume that \( d\in \left( \mathcal {L}^{(\beta _1,\beta _2)}\right) ^-\) for some \( (\beta _1,\beta _2)\in \mathcal {P}(I_{00}) \). Then,

$$\begin{aligned} d\in \left( \sigma ^H_{I_{0+}}\right) ^\bot \cap \left( \sigma ^H_{\beta _{1}}\right) ^\bot \cap \left( -\sigma ^H_{I_{0-}}\right) ^- \cap \left( -\sigma ^H_{\beta _{2}}\right) ^- \cap \left( \sigma ^G_{I_{+0}}\right) ^- \cap \left( \sigma ^G_{\beta _{2}}\right) ^-. \end{aligned}$$

It is enough to prove that:

$$\begin{aligned} d\in \left( -\sigma ^H_{I_{00}}\right) ^- \cap \mathcal {T}. \end{aligned}$$

From \( d\in \left( \sigma ^H_{\beta _{1}}\right) ^\bot \cap \left( -\sigma ^H_{\beta _{2}}\right) ^- \), it can be concluded that \(d\in \left( -\sigma ^H_{I_{00}}\right) ^- \).

From \( d\in \left( -\sigma ^H_{\beta _{2}}\right) ^- \), we get \( \langle d,s \rangle \ge 0 \) for each \( s \in \sigma ^H_{\beta _{2}} \). Therefore, \( H_i^\circ (\hat{x};d)\ge 0 \) for each \( i \in \beta _2 \). Similarly, from \( d\in (\sigma ^G_{\beta _{2}})^- \) we can conclude that \( G_i^\circ (\hat{x};d)\le 0 \) for each \( i \in \beta _2 \). Hence,

$$\begin{aligned} H_i^\circ (\hat{x};d) ~ G_i^\circ (\hat{x};d)\le 0, \quad \forall i\in \beta _2. \end{aligned}$$
(11)

Also, from \( d\in (\sigma ^H_{\beta _{1}})^\bot \), we deduce that \( H_i^\circ (\hat{x};d)=0 \) for each \( i \in \beta _1 \). This equality and (11) imply that:

$$\begin{aligned} H_i^\circ (\hat{x};d) ~ G_i^\circ (\hat{x};d)\le 0, \quad i\in \beta _1\cup \beta _2=I_{00}\ \Longrightarrow \ d\in \mathcal {T}, \end{aligned}$$

and the lemma is proved. \(\square \)

In [11, Lemma 2.4], the remarkable fact was shown that if the functions \( H_i \) and \( G_i \) are smooth, then equality holds in Lemma 3.2. The following example shows that the inclusion in Lemma 3.2 can be strict in the non-smooth case.

Example 3.1

Take in \(\varPi \),

$$\begin{aligned} H_1(x_1,x_2)= & {} -x_2^2, \qquad G_1(x_1,x_2)=|x_1|-x_2, \\ H_2(x_1,x_2)= & {} x_1^2, \qquad \quad G_2(x_1,x_2)=x_1-|x_2|. \end{aligned}$$

It is easy to see that the feasible set of this system is \(S={\mathbb {R}}_- \times \{ 0\}\). Considering \(\hat{x} =(0,0)\), we deduce that \(I_{00}=\{1,2\}\) and \(\mathcal {P}(I_{00}) =\big \{ (\beta _1,\beta _2),\ (\beta _2,\beta _1) \big \}\), in which \(\beta _1=\{1\}\) and \(\beta _2=\{2\}\). By a short calculation, we get:

$$\begin{aligned} \partial _c H_1(\hat{x})= & {} \{(0,0) \}, \quad H_1^0(\hat{x};d)=0, \\ \partial _c H_2(\hat{x})= & {} \{(0,0) \},\quad H_2^0(\hat{x};d)=0, \\ \partial _c G_1(\hat{x})= & {} [-1,1] \times \{-1 \},\quad G_1^0(\hat{x};d)=|d_1|-d_2, \\ \partial _c G_2(\hat{x})= & {} \{1 \}\times [-1,1] ,\quad G_2^0(\hat{x};d)=d_1+|d_2|. \end{aligned}$$

Note that, by (7), \(G_2^0(\hat{x};d)=\max _{s\in [-1,1]}(d_1+sd_2)=d_1+|d_2|\). Thus, we conclude that

$$\begin{aligned}&\left( \mathcal {L}^{(\beta _1,\beta _2)}\right) ^-=\left( \sigma ^H_{\beta _1} \right) ^\bot \cap \left( -\sigma ^H_{\beta _2} \right) ^- \cap \left( \sigma ^G_{\beta _2} \right) ^-=\{d\in {\mathbb {R}}^2 :d_1+|d_2|\le 0\}, \\&\left( \mathcal {L}^{(\beta _2,\beta _1)}\right) ^-=\left( \sigma ^H_{\beta _2} \right) ^\bot \cap \left( -\sigma ^H_{\beta _1} \right) ^- \cap \left( \sigma ^G_{\beta _1} \right) ^-=\{d\in {\mathbb {R}}^2 :|d_1|-d_2\le 0\}, \\&\mathcal {L}^-= \left( -\sigma ^H_{I_{00}} \right) ^- ={\mathbb {R}} \times {\mathbb {R}}, \\&\mathcal {T}= \{d\in {\mathbb {R}}^2 :\ 0\times G_i^0(\hat{x};d)\le 0,\ \ i=1,2\} ={\mathbb {R}} \times {\mathbb {R}}. \end{aligned}$$

Therefore,

$$\begin{aligned}&\bigcup _{(\beta _1,\beta _2)\in \mathcal {P}(I_{00})} \left( \mathcal {L}^{(\beta _1,\beta _2)}\right) ^-=\{d\in {\mathbb {R}}^2 :d_1+|d_2|\le 0\} \cup \{d\in {\mathbb {R}}^2 :|d_1|\le d_2\} \\&\qquad \subsetneqq {\mathbb {R}} \times {\mathbb {R}} =\mathcal {L}^- \cap \mathcal {T}=\mathcal {L}^\bigtriangleup . \end{aligned}$$

(For instance, \(d=(1,0)\) belongs to \(\mathcal {L}^\bigtriangleup \), but to neither cone in the union.)
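The support-function formula (7) behind these computations can also be evaluated numerically. The Python sketch below (an illustration only) recovers \(G_1^\circ (\hat{x};d)=|d_1|-d_2\) and \(G_2^\circ (\hat{x};d)=d_1+|d_2|\) — the plus sign comes from the maximum over \(\partial _cG_2(\hat{x})=\{1\}\times [-1,1]\) — and tests membership in the polar cones:

```python
# The subdifferentials here are polytopes, so the support function in (7)
# is attained at a vertex: G_i°(x^; d) = max <xi, d> over xi in the polytope.
def support(vertices, d):
    return max(v1 * d[0] + v2 * d[1] for (v1, v2) in vertices)

dG1 = [(-1.0, -1.0), (1.0, -1.0)]  # vertices of [-1,1] x {-1}
dG2 = [(1.0, -1.0), (1.0, 1.0)]    # vertices of {1} x [-1,1]

for d in [(0.3, -0.7), (-1.2, 0.4), (2.0, 2.0)]:
    assert abs(support(dG1, d) - (abs(d[0]) - d[1])) < 1e-12  # G_1°(x^; d)
    assert abs(support(dG2, d) - (d[0] + abs(d[1]))) < 1e-12  # G_2°(x^; d)

# polar cone membership: support <= 0 over dG2 iff d_1 + |d_2| <= 0
assert support(dG2, (-2.0, 1.0)) <= 0   # d = (-2, 1) belongs to the polar
assert support(dG2, (1.0, 0.0)) > 0     # d = (1, 0) does not
```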

Corollary 3.1

The following assertion is valid:

$$\begin{aligned} \varGamma (S,\hat{x}) \subseteq \mathcal {L}^\bigtriangleup . \end{aligned}$$

Proof

According to Lemmas 3.1 and 3.2, we can conclude that:

$$\begin{aligned} \varGamma (S,\hat{x})=\bigcup _{(\beta _1,\beta _2)\in \mathcal {P}(I_{00})}\varGamma \left( S^{(\beta _1,\beta _2)},\hat{x}\right) {\mathop {\subseteq }\limits ^{(*)}}\bigcup _{(\beta _1,\beta _2)\in \mathcal {P}(I_{00})}\left( \mathcal {L}^{(\beta _1,\beta _2)}\right) ^- \subseteq \mathcal {L}^\bigtriangleup , \end{aligned}$$

in which the inclusion \((*)\) always holds (see, e.g., [19]). \(\square \)

The last corollary leads us to the following definition.

Definition 3.2

We say that \(\varPi \) satisfies the MPVC-GCQ (resp., MPVC-ACQ, MPVC-KTCQ, MPVC-ZCQ) at \(\hat{x}\) if

$$\begin{aligned} \mathcal {L}^\bigtriangleup \subseteq cl\big (\mathrm{conv}\left( \varGamma (S,\hat{x})\right) \big )\ \Big (\mathrm{resp}.,\ \subseteq \varGamma (S,\hat{x}),\ \subseteq \varOmega (S,\hat{x}),\ \subseteq cl\big ({\varUpsilon (S,\hat{x})}\big ) \Big ). \end{aligned}$$

Now, we can state the main theorem of this section.

Theorem 3.1

The following diagram summarizes the relationships between the defined CQs:

(12)

where implication (d) holds when \(I_{00}=\emptyset \) and the functions \(H_i\), \(i\in I_{0+}\), are linear, and implication (c) holds if \(I_{00}=\emptyset \) and the functions \(H_i\), \(i\in I_{0+}\), are continuously differentiable at \(\hat{x}\).

Proof

  • \((a)\text { and } (b)\): Straightforward.

  • \((e), (f), (j), (i), (l)\text { and } (n) \): The conclusions are immediate from (8).

  • \((m), (h), (g) \text { and } (k)\): The implications are straightforward consequences of \(\mathcal {L}^\bigtriangleup \subseteq \mathcal {L}^-.\)

  • (p): It is enough to prove that \(\mathcal {L}^-_2 \subseteq \mathcal {L}^\bigtriangleup .\) To this end, let \( d\in \mathcal {L}^-_2 = \left( \sigma ^H_{I_{0+}}\right) ^\bot \cap \left( -\sigma ^H_{I_{0-}\cup I_{00}}\right) ^- \cap \left( \sigma ^G_{I_{+0}\cup I_{00}}\right) ^-\). From \( d\in (-\sigma ^H_{I_{00}})^- \) and \( d\in (\sigma ^G_{I_{00}})^- \), we can conclude that:

    $$\begin{aligned}&\langle d,\zeta \rangle \le 0, \qquad \forall \zeta \in \sigma ^G_{I_{00}}\ \Longrightarrow \ G_i^\circ (\hat{x};d)\le 0, \qquad i\in I_{00}, \\&\langle d,\xi \rangle \ge 0, \qquad \forall \xi \in \sigma ^H_{I_{00}}\ \Longrightarrow \ H_i^\circ (\hat{x};d)\ge 0, \qquad i\in I_{00}. \end{aligned}$$

    Therefore, for each \( i\in I_{00} \) we have:

    $$\begin{aligned} H_i^\circ (\hat{x};d) ~ G_i^\circ (\hat{x};d)\le 0 \ \Longrightarrow \ d\in \mathcal {T}. \end{aligned}$$

Since \( \mathcal {L} \subseteq \mathcal {L}_2 \), we also have \( d\in \mathcal {L}_2^- \subseteq \mathcal {L}^- \). Together with \( d\in \mathcal {T} \), this means \( d\in \mathcal {L}^\bigtriangleup \), which gives the assertion.

  • (q): It is a direct consequence of the inclusion \(\mathcal {L}_2 \subseteq \mathcal {L}_1\).

  • (d): Let

    $$\begin{aligned} d\in \left( \sigma ^H_{I_{0+}} \right) ^\bot \cap \left( -\sigma ^H_{I_{0-}} \right) ^s \cap \left( \sigma ^G_{I_{+0}} \right) ^s. \end{aligned}$$
    (13)

Assume that \(H_i(x)=a_ix\) for each \(i\in I_{0+}\). From \(d\in \left( \sigma ^H_{I_{0+}} \right) ^\bot =\{ a_i :i\in I_{0+}\}^\bot \) and \(a_i \hat{x}=0\) (since \(H_i(\hat{x})=0\) for all \(i\in I_{0+}\)), we get:

$$\begin{aligned} H_i(\hat{x} +td)=a_i \hat{x} + t a_i d=0,\ \ \ \ \forall i\in I_{0+}. \end{aligned}$$
(14)

The definition of Clarke directional derivative and \(d\in \left( \sigma ^G_{I_{+0}} \right) ^s\) imply that for some \(\delta _1>0\) we have:

$$\begin{aligned} G_i^0(\hat{x}; d)<0 \Rightarrow G_i(\hat{x} +td)-\underbrace{G_i(\hat{x})}_{=0}<0,\ \ \forall t\in ]0,\delta _1[,\ \forall i\in I_{+0}. \end{aligned}$$
(15)

Put \(\widehat{H}_i (x):=-H_i(x)\) for \(i\in I_{0-}\). According to \(d\in \left( -\sigma ^H_{I_{0-}} \right) ^s\), we conclude that

$$\begin{aligned} \big< d,\xi \big > <0, \ \ \ \ \forall \xi \in \partial _c \widehat{H}_i (\hat{x}),\ \forall i\in I_{0-}. \end{aligned}$$

This means \(\widehat{H}_i^0 (\hat{x};d)<0\), and similar to (15), we find a scalar \(\delta _2>0\) such that for all \(t\in ]0,\delta _2[\) and \(i\in I_{0-}\) one has \(\widehat{H}_i(\hat{x} +t d)<0\). Thus,

$$\begin{aligned} H_i(\hat{x} +td)>0,\ \ \ \ \forall i\in I_{0-},\ \forall t\in ]0,\delta _2[. \end{aligned}$$
(16)

Taking \(\delta :=\min \{\delta _1,\delta _2 \}>0\), and shrinking \(\delta \) if necessary so that, by continuity, \(H_i(\hat{x}+td)>0\) for \(i\in I_{+-}\cup I_{+0}\) and \(G_i(\hat{x}+td)<0\) for \(i\in I_{0-}\cup I_{+-}\), from (14)-(16) we deduce that \(\hat{x}+td \in S\) for each \(t\in ]0,\delta [\). This means that \(d \in \varUpsilon (S,\hat{x})\), and so, with regard to (13), we get:

$$\begin{aligned} \left( \sigma ^H_{I_{0+}} \right) ^\bot \cap \left( -\sigma ^H_{I_{0-}} \right) ^s \cap \left( \sigma ^G_{I_{+0}} \right) ^s \subseteq \varUpsilon (S,\hat{x}). \end{aligned}$$

Therefore, one has:

$$\begin{aligned} \mathcal {L}^-= & {} \left( \sigma ^H_{I_{0+}} \right) ^\bot \cap \left( -\sigma ^H_{I_{0-}} \right) ^- \cap \left( \sigma ^G_{I_{+0}} \right) ^-\\= & {} cl{\left( \left( \sigma ^H_{I_{0+}} \right) ^\bot \cap \left( -\sigma ^H_{I_{0-}} \right) ^s \cap \left( \sigma ^G_{I_{+0}} \right) ^s\right) } \subseteq cl\big ({\varUpsilon (S,\hat{x})}\big )\subseteq \varGamma (S,\hat{x}). \end{aligned}$$

  • (c): Let

$$\begin{aligned} d\in \{\nabla H_i(\hat{x}) :i\in I_{0+} \}^\bot \cap \left( -\sigma ^H_{I_{0-}} \right) ^s \cap \left( \sigma ^G_{I_{+0}} \right) ^s. \end{aligned}$$

According to [21, Theorem F], and recalling the linear independence of \(\{\nabla H_i(\hat{x}) :i\in I_{0+} \} \), we can find a neighborhood U of \(\hat{x}\) and a function \(\psi : U \longrightarrow {\mathbb {R}}^n\) such that \(\psi \) is continuous on U and differentiable at \(\hat{x}\) with

$$\begin{aligned} \psi (\hat{x})= & {} 0,\ \ \ \ \ \nabla \psi (\hat{x})=0,\nonumber \\ H_i \big (x+\psi (x)\big )= & {} \big <\nabla H_i (\hat{x}), x-\hat{x} \big >,\ \ \ \ \ \forall i\in I_{0+}\ \ \ \forall x \in U. \end{aligned}$$
(17)

For each \(\alpha \in {\mathbb {R}}\) with \(\hat{x} +\alpha d \in U\), let \(\varphi (\alpha ):=\hat{x} +\alpha d +\psi (\hat{x} +\alpha d ).\) Owing to \(d\in \{\nabla H_i(\hat{x}) :i\in I_{0+} \}^\bot \), there exists \(\varepsilon >0\) such that for each \(\eta \in ]0,\varepsilon [\) one has:

$$\begin{aligned} H_i\big (\varphi (\eta )\big )=H_i \big ( \hat{x} +\eta d +\psi (\hat{x} +\eta d)\big )=\eta \big < \nabla H_i(\hat{x}),d\big >=0,\ \ \ \ i\in I_{0+}. \end{aligned}$$
(18)

Taking \(\widehat{H}_i (x):=-H_i(x)\) for \(i\in I_{0-}\), and according to \(d\in \left( -\sigma ^H_{I_{0-}} \right) ^s=\left( \bigcup _{i\in {I_{0-}}}\partial _c\widehat{H}_i(\hat{x}) \right) ^s\), we get \(\widehat{H}_i^0(\hat{x};d)<0\). Consequently, there exist \(\varepsilon _1>0\) and \(\beta >0\) such that:

$$\begin{aligned} \widehat{H}_i(\hat{x}+td)-\overbrace{\widehat{H}_i(\hat{x})}^{=0}\le -\beta t,\ \ \ \ \forall t\in ]0,\varepsilon _1[. \end{aligned}$$
(19)

On the other hand, if we denote the Lipschitz constant of \(\widehat{H}_i(\cdot )\) around \(\hat{x}\) by \(\rho _i\), then there exists \(\varepsilon _2>0\) such that:

$$\begin{aligned} \widehat{H}_i(\hat{x}+t\widetilde{d})- \widehat{H}_i(\hat{x}+td)\le \rho t \Vert \widetilde{d} -d \Vert ,\ \ \ \ \forall t\in ]0,\varepsilon _2[,\ \forall \widetilde{d} \in {\mathbb {R}}^n, \end{aligned}$$
(20)

where \(\rho :=\max \{\rho _i :i\in I_{0-}\}\). Since

$$\begin{aligned} 0=\nabla \psi (\hat{x})\,d=\lim _{t\rightarrow 0}\frac{\psi (\hat{x} +t d)-\overbrace{\psi (\hat{x})}^{=0}}{t}=\lim _{t\rightarrow 0}\frac{\psi (\hat{x} +t d)}{t}, \end{aligned}$$
(21)

we can find a scalar \(\varepsilon _3>0\) such that:

$$\begin{aligned} \left\| \frac{\psi (\hat{x} +t d)}{t}\right\| <\frac{\beta }{2\rho },\quad \forall t\in ]0,\varepsilon _3[. \end{aligned}$$
(22)

Taking \(\eta \in ]0,\bar{\varepsilon }[\), where \(\bar{\varepsilon }:= \min \{\varepsilon _1,\varepsilon _2,\varepsilon _3 \}\), we conclude that for all \(i\in I_{0-}\) one has

$$\begin{aligned} \widehat{H}_i\big (\varphi (\eta )\big )= & {} \widehat{H}_i\left( \hat{x}+\eta \left( d+\frac{\psi (\hat{x} +\eta d)}{\eta }\right) \right) -\widehat{H}_i(\hat{x}+\eta d)+ \widehat{H}_i(\hat{x}+\eta d)\\&{\mathop {\le }\limits ^{(20)}}\rho \eta ~ \left\| \frac{\psi (\hat{x} +\eta d)}{\eta } \right\| + \widehat{H}_i(\hat{x}+\eta d)\\&{\mathop {\le }\limits ^{(22)}}\rho \eta ~ \frac{\beta }{2\rho }+\widehat{H}_i(\hat{x}+\eta d){\mathop {\le }\limits ^{(19)}} \frac{\eta \beta }{2}-\eta \beta =-\frac{\eta \beta }{2}<0, \end{aligned}$$

which ensures that

$$\begin{aligned} H_i\big (\varphi (\eta )\big )<0,\ \ \ \ \ \ \forall \eta \in ]0,\bar{\varepsilon }[,\ \forall i\in I_{0-}. \end{aligned}$$
(23)

By \(d\in \left( \sigma ^G_{I_{+0}} \right) ^s\) and an argument similar to the one above, we can find a scalar \(\tilde{\varepsilon }>0\) such that:

$$\begin{aligned} G_i\big (\varphi (\eta )\big )<0,\ \ \ \ \ \ \forall \eta \in ]0,\tilde{\varepsilon }[,\ \forall i\in I_{+0}. \end{aligned}$$

The last inequality, together with (18) and (23), implies that for all \(\eta \in \left]0,\min \{\varepsilon ,\bar{\varepsilon },\tilde{\varepsilon }\}\right[\) we have \(\varphi (\eta )\in S\). Also,

$$\begin{aligned} \lim _{\eta \rightarrow 0}\frac{\varphi (\eta )-\varphi (0)}{\eta }=\lim _{\eta \rightarrow 0}\frac{\hat{x} +\eta d +\psi (\hat{x} +\eta d )-\hat{x} -\psi (\hat{x} )}{\eta }{\mathop {=}\limits ^{(21)}}d, \end{aligned}$$

and so \(d\in \varOmega (S,\hat{x})\).

Thus, we have proved that:

$$\begin{aligned} \{\nabla H_i(\hat{x}) :i\in I_{0+} \}^\bot \cap \left( -\sigma ^H_{I_{0-}} \right) ^s \cap \left( \sigma ^G_{I_{+0}} \right) ^s \subseteq \varOmega (S,\hat{x}), \end{aligned}$$

which ensures that

$$\begin{aligned} \mathcal {L}^- = cl{\left( \{\nabla H_i(\hat{x}) :i\in I_{0+} \}^\bot \cap \left( -\sigma ^H_{I_{0-}} \right) ^s \cap \left( \sigma ^G_{I_{+0}} \right) ^s\right) } \subseteq cl\big ({\varOmega (S,\hat{x})}\big )\subseteq \varOmega (S,\hat{x}). \ \ \ \ \end{aligned}$$

\(\square \)

Fig. 1 Feasible set of \(\widehat{\varPi }\)

Example 3.2

Consider the following system,

$$\begin{aligned} \widehat{\varPi }:=\left\{ \begin{array}{ll} x_2\ge 0,\\ x_2(x_2-|x_1|)\le 0. \end{array} \right. \end{aligned}$$

with the following data:

$$\begin{aligned} H_1(x_1,x_2)= x_2,\ \ G_1(x_1,x_2)=x_2-|x_1|,\ \ \hat{x}=(0,0),\ \ I=\{1\}. \end{aligned}$$

The feasible set of \(\widehat{\varPi }\) is drawn in Fig. 1. A simple calculation shows that:

$$\begin{aligned} I_{00}=\{1\}, \ \ \ \partial _cH_1(\hat{x})=\{(0,1)\},\ \ \ \partial _c G_1(\hat{x})=[-1,1]\times \{1\}. \end{aligned}$$

The MPVC-LICQ is not satisfied at \(\hat{x}\), since the choice \(\alpha _1=1\) and \(\beta _1=-1\) satisfies the following inclusion,

$$\begin{aligned} 0\in \alpha _1 \{(0,1)\}+\beta _1 \big ([-1,1]\times \{1\}\big ). \end{aligned}$$

According to

$$\begin{aligned}&0\in \alpha _1 \{(0,1)\}\ \Longrightarrow \ \alpha _1=0, \nonumber \\&\{(0,1)\}^\bot \cap \Big ([-1,1]\times \{1\}\Big )^s=\big ({\mathbb {R}} \times \{0\}\big ) \cap \big \{d\in {\mathbb {R}}^2 :d_2<-|d_1| \big \}=\emptyset , \end{aligned}$$
(24)

we conclude that the MPVC-MFCQ is not satisfied at \(\hat{x}\), while the implication (24) and the equality \(\{(0,1)\}^\bot ={\mathbb {R}} \times \{0\}\ne \emptyset \) show that the VC-MFCQ holds at \(\hat{x}\). Thus, VC-MFCQ is strictly weaker than MPVC-MFCQ for non-smooth SVCs. Also,

$$\begin{aligned}&\mathcal {L}^- = \{-(0,1)\}^- ={\mathbb {R}} \times {\mathbb {R}}_+,\\&\mathcal {L}_1^- = \{(0,1)\}^\bot \cap \Big ([-1,1]\times \{1\}\Big )^-=\big ({\mathbb {R}} \times \{0\}\big ) \cap \big \{d\in {\mathbb {R}}^2 :d_2\le -|d_1| \big \}=\{(0,0)\}, \\&\mathcal {L}_2^- = \{-(0,1)\}^- \cap \Big ([-1,1]\times \{1\}\Big )^-=\{(0,0)\},\\&\mathcal {L}^\bigtriangleup =\mathcal {L}^- \cap \{d\in {\mathbb {R}}^2 :d_2 \ge 0,\ d_2(d_2-|d_1|)\le 0 \}=S. \end{aligned}$$

The above equalities show that WZCQ, WKTCQ, and WACQ fail, whereas WGCQ, MPVC-ZCQ, MPVC-KTCQ, MPVC-ACQ, MPVC-GCQ, VC-GCQ, and \(\widehat{GCQ}\) hold at \(\hat{x}\). Thus, the converse implications of \([WZCQ\Rightarrow MPVC\text {-}ZCQ]\), \([WKTCQ\Rightarrow MPVC\text {-}KTCQ]\), \([WACQ\Rightarrow MPVC\text {-}ACQ]\), and \([WACQ\Rightarrow WGCQ]\) do not hold. Also, the assumption \( I_{00}=\emptyset \) cannot be dropped in the implications \([VC\text {-}MFCQ \Rightarrow WKTCQ]\) and \([VC\text {-}MFCQ \Rightarrow WZCQ]\).
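For a quick numerical illustration of the failure of MPVC-MFCQ established via (24), the following Python sketch (our own discretization; the function and variable names are hypothetical, not part of the example) tests directions of \(\{(0,1)\}^\bot ={\mathbb {R}}\times \{0\}\) against a sampling of \(\partial _cG_1(\hat{x})=[-1,1]\times \{1\}\):

```python
def in_strict_polar(d, gens):
    # d lies in the strict polar of conv(gens) iff <g, d> < 0 for every g
    return all(g[0] * d[0] + g[1] * d[1] < 0.0 for g in gens)

# discretization of partial_c G_1(xhat) = [-1, 1] x {1}
gens = [(a / 10.0, 1.0) for a in range(-10, 11)]

# directions in {(0,1)}^perp have the form (t, 0); the generator (0, 1)
# already gives inner product 0, so none of them is strictly feasible
hits = [t for t in range(-20, 21) if in_strict_polar((t / 10.0, 0.0), gens)]
print(hits)  # []
```

The empty result agrees with (24): no direction orthogonal to \(\nabla H_1(\hat{x})\) can make a strictly negative inner product with every element of \(\partial _cG_1(\hat{x})\).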

4 Stationarity Conditions

In this section, motivated by [12, 14, 15], we will be mainly concerned with the mathematical program with vanishing constraints (MPVC) given as

$$\begin{aligned} (P):\qquad \min \ f(x)\quad \mathrm{s.t.}\quad H_i(x)\ge 0,\ \ G_i(x)H_i(x)\le 0,\qquad i\in I, \end{aligned}$$

where \( f :\mathbb {R}^n\longrightarrow \mathbb {R} \) is a continuously differentiable function and S is defined as in the previous section. It should be noted that the general form of an MPVC, as considered in [1, 2, 10, 11, 13, 16], includes inequality constraints \( g_j(x)\le 0,\ j\in J \) and equality constraints \( h_t(x)=0,\ t\in T \) for some finite index sets J and T. Since adding these constraints to problem (P) causes no additional technical difficulties and merely lengthens the formulas, we omit them and deal only with problem (P).

Now, we can state a KKT-type necessary optimality condition for problem (P). The following theorem is a non-smooth version of [1, Theorem 1].

Theorem 4.1

Let \( \hat{x} \) be a local solution of (P) such that WGCQ holds at \(\hat{x}\). If \(\mathrm{cone}(\mathcal {L}) \) is a closed cone, and the objective function \(f(\cdot )\) is continuously differentiable at \(\hat{x}\), then there exist coefficients \( \lambda _i^H \) and \( \lambda _i^G \), for \( i\in I \), such that:

$$\begin{aligned} 0\in \nabla f(\hat{x}) +\sum _{i\in I}\Big (-\lambda _i^H\partial _cH_i(\hat{x})+\lambda _i^G\partial _cG_i(\hat{x})\Big ), \end{aligned}$$
(25)

and

$$\begin{aligned}&\lambda _i^H=0\quad (i\in I_{+0}\cup I_{+-}) ,\quad \lambda _i^H\ge 0 \quad (i\in I_{0-}\cup I_{00}), \quad \lambda _i^H \ \mathrm{free} \quad (i\in I_{0+}), \end{aligned}$$
(26)
$$\begin{aligned}&\lambda _i^G=0\quad (i\in I_{0+} \cup I_{0-}\cup I_{00} \cup I_{+-}) , \quad \lambda _i^G\ge 0 \quad (i\in I_{+0}). \end{aligned}$$
(27)

Proof

Owing to (9), WGCQ, the bipolar theorem, and the closedness of \( \mathrm{cone}(\mathcal {L}) \), we conclude that

$$\begin{aligned} -\nabla f(\hat{x}) \in N_F(S,\hat{x})=\left( \varGamma (S,\hat{x}) \right) ^- =\left( cl\big (\mathrm{conv}\left( \varGamma (S,\hat{x})\right) \big ) \right) ^- \subseteq \mathcal {L}^{--}=cl\big ({\mathrm{cone}}(\mathcal {L})\big ) =\mathrm{cone}(\mathcal {L}). \end{aligned}$$

Considering the structure of convex cones, we can find nonnegative scalars

$$\begin{aligned} a_i\ (i\in I_{0+}) ,\quad b_i\ (i\in I_{0+}) ,\quad c_i \ (i\in I_{00}\cup I_{0-}) ,\quad d_i \ (i\in I_{+0}), \end{aligned}$$

such that

$$\begin{aligned} -\nabla f(\hat{x}) \in \sum _{i\in I_{0+}}a_i \partial _cH_i(\hat{x})+\sum _{i\in I_{0+}}-b_i\partial _cH_i(\hat{x})+\sum _{i\in I_{00}\cup I_{0-}}-c_i\partial _cH_i(\hat{x})+\sum _{i\in I_{+0}}d_i\partial _cG_i(\hat{x}). \end{aligned}$$

Taking

$$\begin{aligned} \lambda _i^H:= & {} \left\{ \begin{array}{ll} -(a_i-b_i),\ &{} \quad \text {if }\ i\in I_{0+},\\ c_i \ &{}\quad \text {if }\ i\in I_{00}\cup I_{0-},\\ 0\ &{} \quad \text {if }\ i\in I_{+-}\cup I_{+0}, \end{array} \right. \\ \lambda _i^G:= & {} \left\{ \begin{array}{rl} d_i \ &{} \quad \text {if } \ i\in I_{+0},\\ 0 \ &{}\quad \text {if }\ i\in I_{+-}\cup I_{0+} \cup I_{0-}\cup I_{00}, \end{array} \right. \end{aligned}$$

the result is proved. \(\square \)

Conditions (25)–(27) were named the KKT conditions and the strongly stationary conditions for (P) in [1] and [10], respectively. We will also call them the strongly stationary conditions (SSC for short).

The following example shows that the smoothness assumption on \(f(\cdot )\) in the above theorem cannot be relaxed to local Lipschitz continuity, nor can \(\nabla f(\hat{x})\) be replaced by \(\partial _c f(\hat{x})\) in (25).

Example 4.1

Consider the optimization problem

$$\begin{aligned} \min \ \ -x_2+|x_1-x_2| \ \ \ \mathrm{s.t.}\ \ \ x\in S=(\{0\}\times \mathbb {R}_+)\cup (\mathbb {R}_+ \times \{0\} ). \end{aligned}$$

We can formalize this problem as an MPVC with the following data,

$$\begin{aligned}&H_1(x_1,x_2)=x_2,\quad H_2(x_1,x_2)=x_1,\quad G_1(x_1,x_2)=-x_1, \quad G_2(x_1,x_2)=-x_2,\\&f(x_1,x_2)=-x_2+|x_1-x_2|. \end{aligned}$$

We observe that \( \hat{x}=(0,0) \) is an optimal solution of the problem, and \(I_{00}=\{1,2\}\). Since

$$\begin{aligned} \mathcal {L}^-=\left( -\sigma ^H_{I_{00}} \right) ^-=\left( \{(-1,0) , (0,-1)\} \right) ^- = \mathbb {R}_+\times \mathbb {R}_+= \mathrm{conv}(\varGamma (S,\hat{x})), \end{aligned}$$

the WGCQ is satisfied at \(\hat{x}\). It is easy to show that \(\mathrm{cone}(\mathcal {L})\) is closed; nevertheless, the following SSC-type relations do not hold for any nonnegative scalars \( \lambda _1 ^H,~\lambda _2 ^H,~\lambda _1 ^G,\) and \(\lambda _2^G\):

$$\begin{aligned}&\left( {\begin{array}{c}0\\ 0\end{array}}\right) =\left( {\begin{array}{c}\gamma \\ -1-\gamma \end{array}}\right) - \lambda _1^H\left( {\begin{array}{c}0\\ 1\end{array}}\right) -\lambda _2^H\left( {\begin{array}{c}1\\ 0\end{array}}\right) +\lambda _1^G\left( {\begin{array}{c}-1\\ 0\end{array}}\right) +\lambda _2^G\left( {\begin{array}{c}0\\ 1\end{array}}\right) , \\&\quad \gamma \in [-1,1],\quad \quad \lambda _1^G=\lambda _2^G=0. \end{aligned}$$
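To see concretely why no such multipliers exist, note that the two coordinate equations force \(\lambda _2^H=\gamma \) and \(\lambda _1^H=-1-\gamma \), which cannot both be nonnegative for any \(\gamma \in [-1,1]\). A brute-force Python check (a minimal sketch; the function names are ours, not part of the example) confirms this:

```python
def forced_multipliers(gamma):
    """Solve the two coordinate equations of the SSC-type system
    (with lambda^G_1 = lambda^G_2 = 0) for the H-multipliers."""
    lam_h2 = gamma          # first coordinate:  0 = gamma - lambda^H_2
    lam_h1 = -1.0 - gamma   # second coordinate: 0 = -1 - gamma - lambda^H_1
    return lam_h1, lam_h2

# scan gamma over [-1, 1]; the system is solvable iff both forced
# multipliers are nonnegative for some gamma, which never happens
feasible = any(
    all(lam >= 0.0 for lam in forced_multipliers(g / 100.0))
    for g in range(-100, 101)
)
print(feasible)  # False
```

Nonnegativity of \(\lambda _2^H\) requires \(\gamma \ge 0\), while nonnegativity of \(\lambda _1^H\) requires \(\gamma \le -1\); the two requirements are incompatible.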

The following example shows that the assumption of closedness of \(\mathrm{cone}(\mathcal {L})\) in Theorem 4.1 cannot be waived.

Example 4.2

Let \(H_1(x_1,x_2)=1\), and let \(G_1(x_1,x_2)\) be the support function of

$$\begin{aligned} Q:=\{ (-q,-q^2) :0\le q \le 1 \}. \end{aligned}$$

Thus, \(S=\{(x_1,x_2)\in {\mathbb {R}}^2 :x_1\ge 0,\quad x_1+x_2 \ge 0 \}\). Clearly, \(\hat{x}=(0,0)\) is the optimal solution of the problem \(\min _{x\in S}f(x)\), in which \(f(x_1,x_2)=x_1\). By a short calculation, we get \(I_{+0}=\{1\}\), and

$$\begin{aligned}&\mathcal {L}=\partial _c G_1(\hat{x})=\mathrm{conv}(Q)=\{(x_1,x_2)\in {\mathbb {R}}^2 :-1\le x_1\le 0,\ x_1 \le x_2\le -x_1^2 \},\\&\mathcal {L}^-=S=cl\Big (\mathrm{conv}\big (\varGamma (S,\hat{x})\big )\Big ),\\&\mathrm{cone} \big (\mathcal {L} \big )=\{(x_1,x_2)\in {\mathbb {R}}^2 :x_1 \le x_2 ,\ x_1<0,\ x_2<0 \}\cup \{(0,0)\}. \end{aligned}$$

Therefore, the WGCQ is satisfied at \(\hat{x}\), while \(\mathrm{cone} \big (\mathcal {L} \big )\) is not closed. Obviously, the following SSC-type relations do not hold for any nonnegative scalars \(\lambda _1^H\) and \(\lambda _1^G\):

$$\begin{aligned} (0,0)\in \{(1,0)\}-\lambda _1^H\{(0,0)\}+\lambda _1^G\, \mathrm{conv}(Q),\ \ \ \lambda _1^H=0. \end{aligned}$$
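The non-closedness of \(\mathrm{cone}(\mathcal {L})\) can be made concrete: the points \((-1,-q)\), \(q\in ]0,1]\), lie in the cone, while their limit \((-1,0)\) does not. A minimal Python check (the membership test is written directly from the displayed formula; the names are ours):

```python
def in_cone_L(x1, x2):
    # cone(L) = {x1 <= x2, x1 < 0, x2 < 0} together with the origin
    return (x1 <= x2 and x1 < 0.0 and x2 < 0.0) or (x1, x2) == (0.0, 0.0)

# a sequence in cone(L) whose limit escapes the cone
seq_in = all(in_cone_L(-1.0, -q / 100.0) for q in range(1, 101))
limit_in = in_cone_L(-1.0, 0.0)
print(seq_in, limit_in)  # True False
```

Since the limit point is missing, \(\mathrm{cone}(\mathcal {L})\) fails to be closed, which is exactly what breaks the conclusion of Theorem 4.1 here.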

As diagram (12) shows, VC-GCQ is weaker than WGCQ; an example in [1, page 93] shows that WGCQ is in fact strictly stronger than VC-GCQ. As a result, one cannot pass from VC-GCQ (resp. \(\widehat{GCQ}\)) to SSC, but one can reach some weaker conditions, which we will call VC-stationary conditions (resp. weakly stationary conditions).

Since the proof of the following theorem is similar to that of Theorem 4.1, we omit it.

Theorem 4.2

Suppose that \( \hat{x} \) is a local solution of (P) and VC-GCQ (resp. \(\widehat{GCQ}\)) holds at \( \hat{x} \). If \( \mathrm{cone}(\mathcal {L}_2) \) (resp. \( \mathrm{cone}(\mathcal {L}_1) \)) is a closed cone, then there exist real coefficients \( \lambda _i^H \) and \( \lambda _i^G\), for \( i\in I \), satisfying (25) and

$$\begin{aligned}&\left. \begin{array}{rl} &{} \lambda _i^H=0 \qquad (i\in I_{+-}\cup I_{+0}),\quad \lambda _i^H \ \mathrm{free} \ (i\in I_{0+}), \quad \lambda _i^H\ge 0 \quad (i\in I_{0-}\cup I_{00}),\\ &{}\lambda _i^G=0\quad (i\in I_{+-}\cup I_{0+}\cup I_{0-}),\quad \lambda _i^G\ge 0 \quad (i\in I_{+0}\cup I_{00}). \end{array} \qquad \right. \end{aligned}$$
(28)
$$\begin{aligned}&\quad \left( \mathrm{resp}. \left. \begin{array}{rl} &{} \lambda _i^H=0 \qquad (i\in I_{+-}\cup I_{+0}),\quad \lambda _i^H \ \mathrm{free} \ (i\in I_{0+}\cup I_{00}), \quad \lambda _i^H\ge 0 \quad (i\in I_{0-}),\\ &{}\lambda _i^G=0\quad (i\in I_{+-}\cup I_{0+}\cup I_{0-}),\quad \lambda _i^G\ge 0 \quad (i\in I_{+0}\cup I_{00}). \end{array}\right. \right) \qquad \nonumber \\ \end{aligned}$$
(29)

As mentioned before, conditions (25) and (28) are referred to as the VC-stationary conditions (VCSC in brief), and conditions (25) and (29) are named the weakly stationary conditions (WSC for short).

Theorem 4.3

Assume that \( \hat{x} \) is an optimal solution of (P) , and MPVC-GCQ is satisfied at \( \hat{x} \). If \( \mathrm{cone}\left( \mathcal {L}^{(\beta _1^*,\beta _2^*)}\right) \) and \( \mathrm{cone}\left( \mathcal {L}^{(\beta _2^*,\beta _1^*)}\right) \) are closed for some \( (\beta _1^*,\beta _2^*)\in \mathcal {P}(I_{00}) \), then the strongly stationary conditions hold at \(\hat{x}\).

Proof

Considering the classic formula (9), MPVC-GCQ, Lemma 3.2, and the bipolar theorem, one can see that

$$\begin{aligned}&-\nabla f(\hat{x})\in \left( \varGamma (S,\hat{x})\right) ^-\subseteq \left( \mathcal {L}^\bigtriangleup \right) ^-\subseteq \left( \bigcup _{(\beta _1,\beta _2)\in \mathcal {P}(I_{00})}\left( \mathcal {L}^{(\beta _1,\beta _2)}\right) ^- \right) ^-\nonumber \\&\quad = \bigcap _{(\beta _1,\beta _2)\in \mathcal {P}(I_{00})}\left( \mathcal {L}^{(\beta _1,\beta _2)} \right) ^{--}= \bigcap _{(\beta _1,\beta _2)\in \mathcal {P}(I_{00})}cl\Big ({\mathrm{cone}}\left( \mathcal {L}^{(\beta _1,\beta _2)}\right) \Big ) \nonumber \\&\quad \subseteq cl\left( {\mathrm{cone}}\left( \mathcal {L}^{(\beta _1^*,\beta _2^*)}\right) \right) \cap cl\left( {\mathrm{cone}}\left( \mathcal {L}^{(\beta _2^*,\beta _1^*)}\right) \right) =\mathrm{cone}\left( \mathcal {L}^{(\beta _1^*,\beta _2^*)}\right) \cap \mathrm{cone}\left( \mathcal {L}^{(\beta _2^*,\beta _1^*)}\right) . \end{aligned}$$
(30)

On the other hand, according to the definition of \( \mathcal {L}^{(\beta _1^*,\beta _2^*)} \) and \( \mathcal {L}^{(\beta _2^*,\beta _1^*)} \) we have

$$\begin{aligned} \mathrm{cone}\left( \mathcal {L}^{(\beta _{1}^{*}, \beta _{2}^{*})}\right)&=\bigcup \bigg \{ \underset{i\in I_{0+}\cup \beta _{1}^{*}}{\sum }- \lambda _{i}^{H}\partial _{c}H_{i}(\hat{x})+ \underset{i\in I_{0-}\cup \beta _{2}^{*}}{\sum }- \lambda _{i}^{H}\partial _{c}H_{i}(\hat{x})\\&\quad +\,\underset{i\in I_{+0}\cup \beta _{2}^{*}}{\sum }\lambda _{i}^{G}\partial _{c}G_{i}(\hat{x}) :\\&\qquad \lambda _{i}^{H}\ \mathrm{free} \quad i\in I_{0+}\cup \beta _{1}^{*}; \ \lambda _{i}^{H}\ge 0\quad i\in I_{0-}\cup \beta _{2}^{*}\\&\qquad \lambda _{i}^{G}\ge 0\quad i\in I_{+0}\cup \beta _{2}^{*}\bigg \}, \end{aligned}$$

and

$$\begin{aligned} \mathrm{cone}\left( \mathcal {L}^{(\beta _{2}^{*}, \beta _{1}^{*})}\right)&=\bigcup \bigg \{ \underset{i\in I_{0+}\cup \beta _{2}^{*}}{\sum }- \lambda _{i}^{H}\partial _{c}H_{i}(\hat{x}) +\underset{i\in I_{0-}\cup \beta _{1}^{*}}{\sum }- \lambda _{i}^{H}\partial _{c}H_{i}(\hat{x})\\&\quad +\,\underset{i\in I_{+0}\cup \beta _{1}^{*}}{\sum }\lambda _{i}^{G}\partial _{c}G_{i}(\hat{x}):\\&\qquad \lambda _{i}^{H}\ \mathrm{free} \quad i\in I_{0+}\cup \beta _{2}^{*};\ \lambda _{i}^{H}\ge 0\quad i\in I_{0-}\cup \beta _{1}^{*};\\&\qquad \lambda _{i}^{G}\ge 0\quad i\in I_{+0}\cup \beta _{1}^{*}\bigg \}. \end{aligned}$$

Therefore,

$$\begin{aligned}&\mathrm{cone}\left( \mathcal {L}^{(\beta _{1}^{*}, \beta _{2}^{*})}\right) \cap \mathrm{cone}\left( \mathcal {L}^{(\beta _{2}^{*}, \beta _{1}^{*})}\right) = \bigcup \bigg \{ \underset{i\in I_{0+}\cup I_{0-}\cup I_{00}}{\sum }-\lambda _{i}^{H}\partial _{c}H_{i}(\hat{x})+\underset{i\in I_{+0}}{\sum }\lambda _{i}^{G}\partial _{c}G_{i}(\hat{x}):\nonumber \\&\qquad \lambda _{i}^{H}\ \mathrm{free}\quad i\in I_{0+} ; \ \lambda _{i}^{H}\ge 0\quad i\in \beta _{1}^{*}\cup \beta _{2}^{*}=I_{00};\ \lambda _{i}^{H}\ge 0\quad i\in I_{0-}; \quad \lambda _{i}^{G}\ge 0 \quad i\in I_{+0}\bigg \}\nonumber \\&=\bigcup \bigg \{ \underset{i\in I}{\sum }(-\lambda _{i}^{H}\partial _{c}H_{i}(\hat{x})+ \lambda _{i}^{G}\partial _{c}G_{i}(\hat{x})):\lambda _{i}^{H}\ \mathrm{free}\quad i\in I_{0+};\ \lambda _{i}^{H}= 0\quad i\in I_{+0}\cup I_{+-}; \nonumber \\&\qquad \lambda _{i}^{H}\ge 0 \quad i\in I_{00}\cup I_{0-};\ \lambda _{i}^{G}\ge 0\quad i\in I_{+0};\ \lambda _{i}^{G}=0\quad i\in I_{0+}\cup I_{0-}\cup I_{00} \cup I_{+-}\bigg \}. \end{aligned}$$
(31)

Combining (30) and (31), the result is proved. \(\square \)

As we saw in the proof of Theorem 4.3,

$$\begin{aligned} \mathrm{cone}( \mathcal {L}^{(\beta _1, \beta _2)}) \cap \mathrm{cone}( \mathcal {L}^{(\beta _2, \beta _1)})=\mathrm{cone}(\mathcal {L}), \end{aligned}$$

for each \( (\beta _1, \beta _2)\in \mathcal {P}(I_{00}) \); hence, the closedness of \( \mathrm{cone}( \mathcal {L}^{(\beta _1, \beta _2)}) \) and \(\mathrm{cone}( \mathcal {L}^{(\beta _2, \beta _1)})\) implies the closedness of \( \mathrm{cone}(\mathcal {L}) \), while the converse does not hold.

Two different sets of assumptions lead to the SSC:

(\(\mathfrak {a}\)):

WGCQ and closedness of \(\mathrm{cone}(\mathcal {L})\) (Theorem 4.1).

(\(\mathfrak {b}\)):

MPVC-GCQ and closedness of \( \mathrm{cone}( \mathcal {L}^{(\beta _1, \beta _2)}) \) and \(\mathrm{cone}( \mathcal {L}^{(\beta _2, \beta _1)})\) (Theorem 4.3).

Since MPVC-GCQ is weaker than WGCQ (see diagram (12)) and the closedness condition in (\(\mathfrak {b}\)) is stronger than that in (\(\mathfrak {a}\)), neither of Theorems 4.1 and 4.3 is better than the other. It is worth mentioning that if \(H_i\) and \(G_i\) are continuously differentiable, their sub-differentials are singletons, and so the closedness conditions in (\(\mathfrak {a}\)) and (\(\mathfrak {b}\)) hold automatically. Hence, as mentioned in [11], Theorem 4.3 is strictly better than Theorem 4.1 for MPVCs with smooth constraints.

5 Conclusions

Calculating the Clarke sub-differential of a function \(\hbar :{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) might be difficult in general. However, the functions considered in our work are locally Lipschitz, which supports the numerical viability of the presented necessary conditions. This is due to an important corollary of Rademacher's Theorem [18], which says: given \(\bar{x} \in {\mathbb {R}}^n\),

$$\begin{aligned} \partial _c \hbar (\bar{x})=\mathrm{conv}\left( \{\lim _{r\rightarrow \infty }\nabla \hbar (x_r):\ x_r\rightarrow \bar{x} ,\ x_r\in Q\}\right) , \end{aligned}$$
(32)

in which Q is any full-measure subset of \({\mathbb {R}}^n\) consisting of points at which \(\hbar \) is differentiable. According to (32), the gradient sampling approach [22] can be utilized to generate a suitable approximation of \(\partial _c \hbar (\bar{x})\) for use in our results. We briefly describe this approach. Considering an appropriate ball centered at \(\bar{x}\), say \(U_{\bar{x}}\), a large number of points in \(U_{\bar{x}}\) are randomly generated. In the second step, among the randomly generated points, those at which \(\hbar \) is differentiable are selected. Assume that these points are \(x_1,x_2,\ldots ,x_k\). According to (32),

$$\begin{aligned} C_k:=\mathrm{conv}\left( \{\nabla \hbar (x_r):r=1,2,\ldots ,k\}\right) \end{aligned}$$

is considered as an approximation of \(\partial _c \hbar (\bar{x})\) and is applied in our results. The accuracy of this approximation can be improved by increasing k. See [22] for more details about gradient sampling techniques. Another sampling method for computing \(\partial _c \hbar (\bar{x})\), based on the \(\varepsilon \)-sub-differential introduced by Goldstein, has been presented in [23].
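As an illustration of this scheme, the following Python sketch (our own toy example, not from the references: the function \(\hbar (x_1,x_2)=|x_1|\) and all names are ours) samples gradients in a ball around \(\bar{x}=(0,0)\), where \(\partial _c \hbar (\bar{x})=[-1,1]\times \{0\}\):

```python
import random

def grad_hbar(x):
    # gradient of hbar(x1, x2) = |x1| wherever it exists (x1 != 0);
    # the non-differentiability set {x1 = 0} has measure zero
    return (1.0 if x[0] > 0 else -1.0, 0.0)

def sample_gradients(center, radius, k, seed=0):
    """Gradient sampling: draw k random points of the ball around `center`
    at which hbar is differentiable and collect the gradients there; by (32),
    their convex hull C_k approximates the Clarke sub-differential."""
    rng = random.Random(seed)
    grads = []
    while len(grads) < k:
        x = (center[0] + rng.uniform(-radius, radius),
             center[1] + rng.uniform(-radius, radius))
        inside = (x[0] - center[0]) ** 2 + (x[1] - center[1]) ** 2 <= radius ** 2
        if inside and x[0] != 0.0:   # keep only differentiability points
            grads.append(grad_hbar(x))
    return grads

grads = sample_gradients((0.0, 0.0), 0.1, 500)
g1 = [g[0] for g in grads]
# conv of the sampled gradients approximates [-1, 1] x {0}
print(min(g1), max(g1))  # -1.0 1.0
```

Here the sampled gradients are exactly the extreme points \((\pm 1,0)\) of \(\partial _c \hbar (\bar{x})\), so \(C_k\) recovers the sub-differential; for general \(\hbar \), increasing k refines the approximation.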