Abstract
This paper aims at studying a broad class of mathematical programming with non-differentiable vanishing constraints. First, we are interested in some various qualification conditions for the problem. Then, these constraint qualifications are applied to obtain, under different conditions, several stationary conditions of type Karush/Kuhn–Tucker.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
Since 2007, a difficult class of optimization problems was introduced by Kanzow and his co-authors in [1, 2]; it was called mathematical programming with vanishing constraints (MPVC for short). The MPVCs received attention from different fields. Some of their applications in topological optimization have been introduced in [1]. Also, MPVCs are generalizations of another group of optimization problems, named mathematical programming with equilibrium constraints (MPEC in brief).
Since MPVCs are a generalized form of MPECs, it seems natural that we seek to generalize the results of MPECs to them. As we know, in MPECs, occurrence of situations, which could lead to necessary conditions in Karush/Kuhn–Tucker (KKT) types, is very limiting and hindering. Therefore, instead of KKT condition some other conditions are substituted. They have the style and appearance of KKT condition, and also, their coefficients are more free than KKT coefficients. The recent conditions, which are called stationary conditions, can be proved under some constraint qualifications that their occurrence is more natural [3,4,5,6,7,8,9].
The same process has been repeated for MPVCs, and numerous constraint qualifications have been introduced, which lead to different stationary conditions. For instance, the first-order stationary conditions for MPVCs can be found in [1, 10, 11]. The second-order stationary conditions and exact penalty theorem are given in [2] and [12], respectively. Also, relaxation methods in [13], stability in [14], and numerical algorithms in [15] have been presented. In addition, weak and strong duality results for MPVCs could be seen in [16].
In the previous works that referenced earlier, all functions which define MPVC are continuously differentiable. Unlike the MPECs which their non-smooth form has been studied in some references as in [3, 7,8,9, 17], the earlier works for MPVCs assume the limiting criteria of continuously differentiability for functions. So, to the best of our knowledge, the current article is the first that studies the stationary conditions for MPVCs where their constraints are non-smooth, meaning that they are not necessarily differentiable. In this paper, we assume the objective function is differentiable.
Since the feasible set of an MPVC is not necessarily convex, even under the criteria of convexity of the functions which construct it (to be discussed in next sections), applying common methods of convex analysis is not applicable here. Therefore, we take the non-smooth analysis approach. To choose a suitable sub-differential, we select a sub-differential which not only is convex but also its calculation rules are known. The best selection is Clarke sub-differential. Therefore, we consider the functions which define the problem to be locally Lipschitz.
The structure of the subsequent sections of this paper is as follows: In Sect. 2, we define required definitions, theorems, and relations of non-smooth analysis. In Sect. 3, we will introduce some various constraint qualifications for a system with non-smooth vanishing constraints. Also, relations between defined constraint qualifications are presented in Sect. 3. In Sect. 4, we apply these constraint qualifications to obtain different kinds of stationary conditions for the problem.
2 Notations and Preliminaries
In this section, we provide an overview for some notations and preliminary results that will be used throughout this paper. Further details and examples of these results can be found in the books [18,19,20].
First, we recall that \({\mathbb {R}}_+\) and \(\langle x,y \rangle \) denote, respectively, the nonnegative real numbers \([0,+\infty [\) and the standard inner product of vectors \(x,y \in {\mathbb {R}}^n\). For a non-empty subset M of \( {\mathbb {R}}^n \), let:
It can easily be shown that if \( M^s \ne \emptyset \), then \( cl({M^s})=M^- \), which \( cl({M^s}) \) denotes the topological closure of \( M^s \). Here, it is necessary to note that for given closed convex cones \( M_1,\ M_2,\ M_3,\) and \(M_4, \) we have:
The convex hull and the convex cone (containing origin) of \(M \subseteq {\mathbb {R}}^n\) are, respectively, denoted by \( \mathrm{conv}(M) \) and \( \mathrm{cone}(M) \). It is easy to see that if \( \{M_\kappa \}_{\kappa \in \varLambda } \) is a class of convex subsets of \( {\mathbb {R}}^n \) with \(|\varLambda |<\infty \) and with the convention \( \bigcup _{\kappa \in \emptyset }M_\kappa =\emptyset \), then one has:
The famous bipolar theorem says that \( (M^-)^-=cl \big ({\mathrm{cone}}(M)\big ) \) for each \( \emptyset \ne M\subseteq {\mathbb {R}}^n \). We define the normal cone of convex set \( A\subseteq {{\mathbb {R}}}^n \) at \( x_0 \in cl({A})\) as
The following relation for any subset K of \( {\mathbb {R}}^n \) is immediately from bipolar theorem:
Let \(\psi :{\mathbb {R}}^n \rightarrow {{\mathbb {R}}}^*:={\mathbb {R}} \cup \{+\infty \} \) be a locally Lipschitz function, and \(x_0 \in dom \) \( \psi \). The Clarke sub-differential of \(\psi \) at \(x_0\) is defined as
where \(\psi ^\circ (x_0;v)\) denotes the Clarke generalized directional derivative of \(\psi \) at \(x_0\) in direction \(v \in {\mathbb {R}}^n\),
It should be noted that if \( \psi \) is continuously differentiable at \( x_0 \), then \( \partial _c \psi (x_0)=\{\triangledown \psi (x_0)\} \). Consequently, the sentences expressed in terms of \(\partial _c \) are generalizations of sentence that are expressed with gradient for \(C^1\) functions. Moreover, it can be shown that if \( \psi \) is a locally Lipschitz function, \( \partial _c \psi (x_0) \) is a non-empty, convex, and compact set. Also, \( \psi ^\circ (x;v) \) is a convex function with respect to v.
If \( \psi _1 \) and \( \psi _2 \) are locally Lipschitz functions, the following relationships should always hold:
The following three approximate cones for \(\emptyset \ne M \subseteq {\mathbb {R}}^n\) at \(x_0 \in cl({M})\) will be requested in this article:
-
The contingent cone \(\varGamma (M,x_0)\),
$$\begin{aligned} \varGamma (M,x_0):=\Big \{v\in {\mathbb {R}}^n :\exists t_r \downarrow 0,\ \exists v_r \rightarrow v\ \text {such that}\ x_0+t_r v_r \in M \ \ \forall r\in \mathbb {N} \Big \} . \end{aligned}$$ -
The cone of feasible directions \(\varUpsilon (M,x_0)\),
$$\begin{aligned} \varUpsilon (M,x_0):=\Big \{v \in {\mathbb {R}}^n :\exists \varepsilon >0,\ \text {such that}\ x_0+t v \in M \ \ \ \forall t\in ]0,\varepsilon [ \Big \} . \end{aligned}$$ -
The cone of attainable directions \(\varOmega (M,x_0)\),
$$\begin{aligned} \varOmega (M,x_0):=\Big \{v \in {\mathbb {R}}^n :\forall t_r \downarrow 0, \ \exists v_r \rightarrow v \ \text {such that}\ x_0+t_r v_r \in M \ \ \ \forall r\in \mathbb {N} \Big \} . \end{aligned}$$
Remark 2.1
The cone of attainable directions of M at \(x_0\) has an another representation as follows (see, e.g., [19]):
\(v\in \varOmega (M, x_0)\), if and only if there exists a scalar \(\delta >0\) together with a mapping \(\varphi :{\mathbb {R}} \rightarrow {\mathbb {R}}^n\) such that \(\varphi (\beta )\in M\) for all \(\beta \in ]0,\delta [\), \(\varphi (0)=x_0\), and \(\lim _{\beta \downarrow 0}\frac{\varphi (\beta )-\varphi (0)}{\beta }=v\).
The relationship in these cones is as
Note that if the continuously differentiable function \(\psi (\cdot )\) attaints its minimum on \(M\subseteq {\mathbb {R}}^n\) at \(x_0 \in M\), then:
where \(N_F(M,x_0)\) denotes the Fréchet normal cone of M at \(x_0\), which is defined as \( N_F(M,x_0):=\left( \varGamma (M,x_0)\right) ^- \).
3 Constraint Qualifications
Let \(I:=\{1,\ldots ,m\}\), and for each \(i\in I\) the functions \(G_i\) and \(H_i\) are locally Lipschitz from \({\mathbb {R}}^n\) to \({{\mathbb {R}}}^*\). In this section, we introduce and compare several constraint qualifications (CQ) for the following system with vanishing constraints (SVC in short):
The feasible set of \(\varPi \) is assumed to be non-empty, i.e.,
Considering a feasible point \(\hat{x} \in S\) (this point will be fixed throughout this section), we define following index sets:
For given \( I_\star \subseteq I \) and \( \theta \in \{G,H \} \), let
Notice, if \(I_{00}=\emptyset \), then \(\varPi \) is locally equivalent with the following system:
Since \(\varPi ^*\) has the classic form of nonlinear systems, definition of CQs for it is so easy. But, when \(I_{00}\ne \emptyset \), there is no any classic system that is equivalent to \(\varPi \). Thus, we consider the tightened system for \(\varPi \), which is a classic system that its feasible set is contained in S. For this purpose, it is easy to see that the following system is a tightened system for \(\varPi \) when \(I_{00}=\emptyset \):
To add the functions \( H_i \) and \( G_i \) for \( i\in I_{00} \) to \(\varPi _*\), we can add them to the first line or to the second line of it. Thus, we achieve the following tightened systems:
Considering \(\varPi _1\) and \(\varPi _2\), the corresponding linearized cones at \(\hat{x}\) are, respectively, given by \(\mathcal {L}_1^-\) and \(\mathcal {L}_2^-\), in which
Following [1, 10, 11], we also define the linearized cone \(\mathcal {L}^-\) as
Now, we can define some new version of Abadie, Khun–Tucker, Guignard, Zangwill, Mangasarian–Fromovitz, and linear independent CQs in extension of the CQs that are introduced in [1, 2, 10, 11, 13,14,15]. We also introduce some new useful CQs for \(\varPi \).
Definition 3.1
We say that \(\varPi \) satisfies
-
the MPVC-LICQ at \(\hat{x}\) if \(\varPi _1\) satisfies the LICQ at \(\hat{x}\), i.e.,
$$\begin{aligned}&0\in \sum _{i\in I_{0+}\cup I_{0-}\cup I_{00} } \alpha _i \partial _cH_i(\hat{x}) \\&\quad +\,\sum _{i\in I_{00}\cup I_{+0}}\beta _i \partial _cG_i(\hat{x}) \Rightarrow \alpha _i=0 \quad (\forall i\in I_{0+}\cup I_{0-}\cup I_{00}), \beta _i=0 \quad (\forall i\in I_{00}\cup I_{+0}) . \end{aligned}$$ -
the MPVC-MFCQ at \(\hat{x}\) if \(\varPi _1\) satisfies the MFCQ at \(\hat{x}\), i.e.,
-
(i)
\( 0 \in \sum _{i\in I_{00}\cup I_{0+}} \alpha _i \partial _cH_i(\hat{x})\Rightarrow \alpha _i=0 , \qquad \forall i\in I_{00}\cup I_{0+}, \)
-
(ii)
\( \left( \sigma ^H_{I_{00}\cup I_{0+}}\right) ^\perp \cap \left( -\sigma ^H_{I_{0-}}\right) ^s\cap \left( \sigma ^G_{I_{+0}\cup I_{00}}\right) ^s\ne \emptyset .\)
-
(i)
-
the \(\widehat{GCQ}\) at \(\hat{x}\) if
$$\begin{aligned} \mathcal {L}_1^- \subseteq cl\big (\mathrm{conv}\left( \varGamma (S,\hat{x})\right) \big ). \end{aligned}$$ -
the VC-MFCQ at \(\hat{x}\) if
-
(i)
\(0 \in \sum _{i\in I_{00}\cup I_{0+}} \alpha _i \partial _cH_i(\hat{x})\Rightarrow \alpha _i=0 ,\qquad \forall i\in I_{00}\cup I_{0+}, \)
-
(ii)
\( \left( \sigma ^H_{I_{00}\cup I_{0+}}\right) ^\perp \cap \left( -\sigma ^H_{I_{0-}}\right) ^s\cap \left( \sigma ^G_{I_{+0}}\right) ^s\ne \emptyset \).
-
(i)
-
the VC-GCQ at \(\hat{x}\) if
$$\begin{aligned} \mathcal {L}_2^- \subseteq cl\big (\mathrm{conv}\left( \varGamma (S,\hat{x})\right) \big ). \end{aligned}$$ -
the weak GCQ (resp., ACQ, KTCQ, ZCQ), denoted shortly by WGCQ (resp., WACQ, WKTCQ, WZCQ), at \(\hat{x}\) if
$$\begin{aligned} {\mathcal {L}}^-\subseteq cl\big (\mathrm{conv}\left( \varGamma (S,\hat{x})\right) \big )\ \Big (\mathrm{resp}.,\ \subseteq \varGamma (S,\hat{x}),\ \subseteq \varOmega (S,\hat{x}),\ \subseteq cl\big ({\varUpsilon (S,\hat{x})}\big )\Big ). \end{aligned}$$
Remark 3.1
-
(i)
Since the feasible set of \( \varPi _2 \) (resp. \( \varPi _1 \)) is included in S and is not necessarily equal with it, therefore VC-GCQ (resp. \(\widehat{GCQ}\)) for \( \varPi \) is not equivalent to GCQ for \( \varPi _2 \) (resp. \( \varPi _1 \)). Note that GCQ for \( \varPi _2 \) (resp. \( \varPi _1 \)) implies the VC-GCQ (resp. \(\widehat{GCQ}\)) for \( \varPi \).
-
(ii)
There are no implication relations between the VC-MFCQ for \( \varPi \) and the MFCQ for \( \varPi _2 \) at \(\hat{x}\). Note that the MFCQ for \( \varPi _2 \) is irrelevant for us and we did not consider.
-
(iii)
Unlike the Ref. [1], the inclusion \({\mathcal {L}}^-\subseteq cl\big (\mathrm{conv}\left( \varGamma (S,\hat{x})\right) \big )\) is not named GCQ in above definition. The reason of this naming is that the classic GCQ for \(\varPi \) is formalized as
$$\begin{aligned} \left( -\sigma _{ I_{0+}\cup I_{0-}\cup I_{00}}^H\right) ^-\cap \left( \bigcup _{i\in {I_{+0}\cup I_{0+}\cup I_{0-}\cup I_{00}}} \partial _c \big ( H_iG_i \big )(\hat{x}) \right) ^-\subseteq cl\big (\mathrm{conv}\left( \varGamma (S,\hat{x})\right) \big ). \end{aligned}$$(10)On the other hand, owing to
$$\begin{aligned} \partial _c(G_i H_i)(\hat{x})\subseteq G_i(\hat{x})\partial _c H_i(\hat{x})+H_i(\hat{x})\partial _c G_i(\hat{x}), \end{aligned}$$we conclude that:
$$\begin{aligned}&\left( -\sigma _{ I_{0+}\cup I_{0-}\cup I_{00}}^H\right) ^-\cap \left( \bigcup _{i\in {I_{+0}\cup I_{0+}\cup I_{0-}\cup I_{00}}} \partial _c \big ( H_iG_i \big )(\hat{x}) \right) ^- \\&\quad \supseteq \bigcap _{i\in I_{0+}}\left[ \Big (-\partial _cH_i(\hat{x})\Big )^-\cap \Big (G_i(\hat{x}) \partial _cH_i(\hat{x})\Big )^- \right] \cap \bigcap _{i\in I_{00}}\Big (-\partial _cH_i(\hat{x})\Big )^- \cap \\&\qquad \bigcap _{i\in I_{0-}}\left[ \Big (-\partial _cH_i(\hat{x})\Big )^-\cap \Big (G_i(\hat{x}) \partial _cH_i(\hat{x})\Big )^- \right] \cap \bigcap _{i\in I_{+0}}\Big (H_i(\hat{x})\partial _cG_i(\hat{x})\Big )^-\\&\quad = \bigcap _{i\in I_{0+}}\Big (-\partial _cH_i(\hat{x})\Big )^-\cap \bigcap _{i\in I_{0+}}\Big (\partial _cH_i(\hat{x})\Big )^-\cap \bigcap _{i\in I_{00}}\Big (-\partial _cH_i(\hat{x})\Big )^-\cap \\&\qquad \bigcap _{i\in I_{0-}}\Big (-\partial _cH_i(\hat{x})\Big )^-\cap \bigcap _{i\in I_{+0}}\Big (\partial _cG_i(\hat{x})\Big )^- =\mathcal {L}^-. \end{aligned}$$Therefore, \(\mathcal {L}^-\subseteq cl\big (\mathrm{conv}\left( \varGamma (S,\hat{x})\right) \big )\) is weaker than (10). Note that if the functions \(H_i\) and \(G_i\) are continuously differentiable, [1, Lemma 4] shows that equality holds in last inclusion, and so \(\mathcal {L}^-\subseteq cl\big (\mathrm{conv}\left( \varGamma (S,\hat{x})\right) \big )\) is equivalent to (10).
We recall that by adding the index set \( I_{00} \) as a whole to system \(\varPi _*\), we attained to systems \(\varPi _1\) and \(\varPi _2\). Now, in another way, we add the set \( I_{00} \) to systems \(\varPi _*\) in which \( I_{00} \) has been divided to two parts. To this aim, we call the ordered pair \( (\beta _1,\beta _2) \), a partition of \( I_{00} \) and we write \( (\beta _1,\beta _2) \in \mathcal {P}(I_{00}) \), if \( \beta _1 \) and \( \beta _2 \) are non-empty subsets of \( I_{00} \) such that \( \beta _1\cap \beta _2=\emptyset \) and \( \beta _1\cup \beta _2=I_{00} \). Motivated by [10], we add \( \beta _1 \) to the first line and \( \beta _2 \) to the second line of system \(\varPi _*\), and we get the below system:
The feasible set of \( \varPi ^{(\beta _1,\beta _2)} \) is denoted by \( S^{(\beta _1,\beta _2)}\). Also, we consider the linearized cone \(\left( \mathcal {L}^{(\beta _1,\beta _2)}\right) ^- \) at \(\hat{x}\), in which
Following [10, 11], we introduce another linearized cone for \(\varPi \) as \(\mathcal {L}^\bigtriangleup :=\mathcal {L}^- \cap \mathcal {T}\), where
It is worth mentioning that the product of convex functions \(d \rightarrow H_i^\circ (\hat{x};d)\) and \(d \rightarrow G_i^\circ (\hat{x};d)\) is, in general, non-convex. So, unlike to \(\mathcal {L}^-,\ \mathcal {L}_1^-,\ \mathcal {L}_2^-,\) and \(\left( \mathcal {L}^{(\beta _1,\beta _2)}\right) ^-\), the linearized cone \(\mathcal {L}^\bigtriangleup \) is not convex.
Since the proof of following lemma is just similar to the proof of [11, Lemma 2.4], we do not repeat it.
Lemma 3.1
The following equality is always true:
Lemma 3.2
The following inclusion holds true:
Proof
Assume that \( d\in \left( \mathcal {L}^{(\beta _1,\beta _2)}\right) ^-\). Then,
It is enough to prove that:
From \( d\in \left( \sigma ^H_{\beta _{1}}\right) ^\bot \cap \left( -\sigma ^H_{\beta _{2}}\right) ^- \), it can be concluded that \(d\in \left( -\sigma ^H_{I_{00}}\right) ^- \).
From \( d\in \left( -\sigma ^H_{\beta _{2}}\right) ^- \), we get \( \langle d,s \rangle \ge 0 \) for each \(i\in \beta _2 \) and \( s \in \sigma ^H_{\beta _{2}} \). Therefore, \( H_i^\circ (\hat{x};d)\ge 0 \) for each \( i \in \beta _2 \). Similarly, from \( d\in (\sigma ^G_{\beta _{2}})^- \) we can conclude that \( G_i^\circ (\hat{x};d)\le 0 \) for each \( i \in \beta _2 \). Hence,
Also, from \( d\in (\sigma ^H_{\beta _{1}})^\bot \), we deduce that \( H_i^\circ (\hat{x};d)=0 \) for each \( i \in \beta _1 \). This equality and (11) imply that:
and the lemma is proved. \(\square \)
In [11, lemma 2.4], the remarkable fact was shown that, if functions \( H_i \) and \( G_i \) are smooth, then equality holds in Lemma 3.2. The following example shows that inclusion of Lemma 3.2 can be strict in non-smooth case.
Example 3.1
Take in \(\varPi \),
It is easy to see that the feasible set of this system is \(S={\mathbb {R}}_- \times \{ 0\}\). Considering \(\hat{x} =(0,0)\), we deduce that \(I_{00}=\{1,2\}\) and \(\mathcal {P}(I_{00}) =\big \{ (\beta _1,\beta _2),\ (\beta _2,\beta _1) \big \}\), in which \(\beta _1=\{1\}\) and \(\beta _2=\{2\}\). By a short calculation, we get,
Thus, we conclude that
Therefore,
Corollary 3.1
The following assertion is valid:
Proof
According to Lemmas 3.1 and 3.2, we can conclude that:
in which the inclusion \((*)\) always holds (see, e.g., [19]). \(\square \)
The last corollary leads us to the following definition.
Definition 3.2
We say that \(\varPi \) satisfies the MPVC-GCQ (resp., MPVC-ACQ, MPVC-KTCQ, MPVC-ZCQ) at \(\hat{x}\) if
Now, we can state the main theorem of this section.
Theorem 3.1
The following diagram summarizes the relationships between the defined CQs:
where (d) holds when \(I_{00}=\emptyset \) and \(H_i\) functions are linear for \(i\in I_{0+}\), and implication (c) holds if \(I_{00}=\emptyset \) and \(H_i\) functions for \(i\in I_{0+}\) are continuously differentiable at \(\hat{x}\).
Proof
-
\((a)\text { and } (b)\): Straightforward.
-
\((e), (f), (j), (i), (l)\text { and } (n) \): The conclusions are immediate from (8).
-
\((m), (h), (g) \text { and } (k)\): The implications are straightforward consequences of \(\mathcal {L}^\bigtriangleup \subseteq \mathcal {L}^-.\)
-
(p): It is enough to prove that \(\mathcal {L}^-_2 \subseteq \mathcal {L}^\bigtriangleup .\) To this end, let \( d\in \mathcal {L}^-_2 = \left( \sigma ^H_{I_{0+}}\right) ^\bot \cap \left( -\sigma ^H_{I_{0-}\cup I_{00}}\right) ^- \cap \left( \sigma ^G_{I_{+0}\cup I_{00}}\right) ^-\). From \( d\in (-\sigma ^H_{I_{00}})^- \) and \( d\in (\sigma ^G_{I_{00}})^- \), we can conclude that:
$$\begin{aligned}&\langle d,\zeta \rangle \le 0, \qquad \forall \zeta \in \sigma ^G_{I_{00}}\ \Longrightarrow \ G_i^\circ (\hat{x};d)\le 0, \qquad i\in I_{00}, \\&\langle d,\xi \rangle \ge 0, \qquad \forall \xi \in \sigma ^H_{I_{00}}\ \Longrightarrow \ H_i^\circ (\hat{x};d)\ge 0, \qquad i\in I_{00}. \end{aligned}$$Therefore, for each \( i\in I_{00} \) we have:
$$\begin{aligned} H_i^\circ (\hat{x};d) ~ G_i^\circ (\hat{x};d)\le 0 \ \Longrightarrow \ d\in \mathcal {T}. \end{aligned}$$This means \( d\in \mathcal {L}^\bigtriangleup \) which gives the assertion.
-
(q): It is direct consequence of inclusion \(\mathcal {L}_2 \subseteq \mathcal {L}_1\).
-
(d): Let
$$\begin{aligned} d\in \left( \sigma ^H_{I_{0+}} \right) ^\bot \cap \left( -\sigma ^H_{I_{0-}} \right) ^s \cap \left( \sigma ^G_{I_{+0}} \right) ^s. \end{aligned}$$(13)
Assume that \(H_i(x)=a_ix\) for each \(i\in I_{0+}\). From \(d\in \left( \sigma ^H_{I_{0+}} \right) ^\bot =\{ a_i :i\in I_{0+}\}^\bot \) and \(a_i \hat{x}=0\) (since \(H_i(\hat{x})=0\) for all \(i\in I_{0+}\)), we get:
The definition of Clarke directional derivative and \(d\in \left( \sigma ^G_{I_{+0}} \right) ^s\) imply that for some \(\delta _1>0\) we have:
Put \(\widehat{H}_i (x):=-H_i(x)\) for \(i\in I_{0-}\). According to \(d\in \left( -\sigma ^H_{I_{0-}} \right) ^s\), we conclude that
This means \(\widehat{H}_i^0 (\hat{x};d)<0\), and similar to (15), we find a scalar \(\delta _2>0\) such that for all \(t\in ]0,\delta _2[\) and \(i\in I_{0-}\) one has \(\widehat{H}_i(\hat{x} +t d)<0\). Thus,
Taking \(\delta :=\min \{\delta _1,\delta _2 \}>0\), from (14)-(16) we deduce that \(\hat{x}+td \in S\) for each \(t\in ]0,\delta [\). This means that \(d \in \varUpsilon (S,\hat{x})\), and so, with regard to (13), we get:
Therefore, one has:
(c): Let
According to [21, Theorem F], and recalling the linearly independence of \(\{\nabla H_i(\hat{x}) :i\in I_{0+} \} \), we can find a neighborhood U of \(\hat{x}\) and a function \(\psi : U \longrightarrow {\mathbb {R}}^n\) such that \(\psi \) is continuous on U and differentiable at \(\hat{x}\) with
For each \(\alpha \in {\mathbb {R}}\) with \(\hat{x} +\alpha d \in U\), let \(\varphi (\alpha ):=\hat{x} +\alpha d +\psi (\hat{x} +\alpha d ).\) Owing to \(d\in \{\nabla H_i(\hat{x}) :i\in I_{0+} \}^\bot \), there exists \(\varepsilon >0\) such that for each \(\eta \in ]0,\varepsilon [\) one has:
Taking \(\widehat{H}_i (x):=-H_i(x)\) for \(i\in I_{0-}\), and according to \(d\in \left( -\sigma ^H_{I_{0-}} \right) ^s=\left( \bigcup _{i\in {I_{0-}}}\partial _c\widehat{H}_i(\hat{x}) \right) ^s\), we get \(\widehat{H}_i^0(\hat{x};d)<0\). Consequently, there exist \(\varepsilon _1>0\) and \(\beta >0\) such that:
On the other hand, if we denote the Lipschitz constance of \(\widehat{H}_i(\cdot )\) around \(\hat{x}\) by \(\rho _i\), then there exists \(\varepsilon _2>0\) such that:
where \(\rho :=\max \{\rho _i :i\in I_{0-}\}\). Since
we find a positive scalar \(\varepsilon _3>0\) such that:
Taking \(\eta \in ]0,\bar{\varepsilon }[\) in which \(\bar{\varepsilon }:= \min \{\varepsilon _1,\varepsilon _2,\varepsilon _3 \}\), we conclude that for all \(i\in I_{0-}\) one has
which ensure that
By \(d\in \left( \sigma ^G_{I_{+0}} \right) ^s\) and the latter proof, we can find a scalar \(\tilde{\varepsilon }>0\) such that:
The last inequality together (18) and (23) implies that for all \(\eta \in \left]0,\min \{\varepsilon ,\bar{\varepsilon },\tilde{\varepsilon }\}\right[\) we have \(\varphi (\eta )\in S\). Also,
and so \(d\in \varOmega (S,\hat{x})\).
Thus, we proved that:
which ensure that
\(\square \)
Example 3.2
Consider the following system,
with following data:
We have drawn the feasible set of \(\widehat{\varPi }\) in Fig. 1. With a simple calculation, it could be seen that:
The MPVC-LICQ is not satisfied at \(\hat{x}\), since we can choose \(\alpha _1=1\) and \(\beta _1=-1\) in the following inclusion,
According to
we conclude that the MPVC-MFCQ is not satisfied at \(\hat{x}\), while the implication (24) and equality \(\{(0,1)\}^\bot ={\mathbb {R}} \times \{(0,0)\}\ne \emptyset \) show that the VC-MFCQ holds at \(\hat{x}\). Thus, VC-MFCQ is strictly weaker than MPVC-MFCQ for non-smooth SVCs. Also,
Above equalities show that WZCQ, WKTCQ, and WACQ fail, whereas WGCQ, MPVC-ZCQ, MPVC-KTCQ, MPVC-ACQ, MPVC-GCQ, VC-GCQ, and \(\widehat{GCQ}\) hold at \(\hat{x}\). Thus, the inverse implications of \([WZCQ\Rightarrow MPVC\text {-}ZCQ]\), \([WKTCQ\Rightarrow MPVC\text {-}KTCQ]\), \([WACQ\Rightarrow MPVC\text {-}ACQ]\), and \([WACQ\Rightarrow WGCQ]\) do not hold. Also, the assumption \( I_{00}=\emptyset \) could not be eliminated in the implications \([VC\text {-}MFCQ \Rightarrow WKTCQ]\) and \([VC\text {-}MFCQ \Rightarrow WZCQ]\).
4 Stationarity Conditions
In this section, motivated by [12, 14, 15], we will be mainly concerned with the mathematical problem with vanishing constraints (MPVC) given as
where \( f :\mathbb {R}^n\longrightarrow \mathbb {R} \) is a continuously differentiable function and S is defined as previous section. It should be noted that the general form of a MPVC which has been considered in [1, 2, 10, 11, 13, 16] includes inequality constraints \( g_j(x)\le 0,\ j\in J \) and equality constraints \( h_t(x)=0,\ t\in T \) for some finite index sets J and T. Since adding these constraints to problem (P) does not increase the technical problems of the issue and just prolongs the formulas, we ignore them and just deal with problem (P) .
Now, we can state the KKT type necessary optimality condition for problem (P). The following theorem is non-smooth version of [1, Theorem 1].
Theorem 4.1
Let \( \hat{x} \) be a local solution of (P) such that WGCQ holds at \(\hat{x}\). If \(\mathrm{cone}(\mathcal {L}) \) is a closed cone, and the objective function \(f(\cdot )\) is continuously differentiable at \(\hat{x}\), then there exist coefficients \( \lambda _i^H \) and \( \lambda _i^G \), for \( i\in I \), such that:
and
Proof
Owing to (9), WGCQ, bipolar theorem, and closedness of \( \mathrm{cone}(\mathcal {L}) \), we conclude that
Considering the structure of convex cones, we can find nonnegative scalars
such that
Tacking
the result is proved. \(\square \)
Conditions (25)–(27) were named the KKT conditions and strongly stationary conditions for (P) in [1, 10], respectively. We will call them the strongly stationary conditions (SSC in short) too.
The following example shows that we cannot replace the smoothness condition of \(f(\cdot )\) by its Lipschitzian condition in above theorem, and we cannot replace the \(\nabla f(\hat{x})\) by \(\partial _c f(\hat{x})\) in (25) too.
Example 4.1
Consider the optimization problem as
We can formalize this problem as a MPVC with following data,
We observe that \( \hat{x}=(0,0) \) is an optimal solution of problem, and \(I_{00}=\{1,2\}\). Since
the WGCQ is satisfied at \(\hat{x}\). It is easy to show that the \(\mathrm{cone}(\mathcal {L})\) is closed, and also, the below SSC type relations do not hold for any nonnegative scalars \( \lambda _1 ^H,~\lambda _2 ^H,~\lambda _1 ^G,\) and \(\lambda _2^G\):
The following example shows the assumption of closedness of \(\mathrm{cone}(\mathcal {L})\) in Theorem 4.1 cannot be waived.
Example 4.2
Let \(H_1(x_1,x_2)=1\), and \(G_1(x_1,x_2)\) is the support function of
Thus, \(S=\{(x_1,x_2)\in {\mathbb {R}}^2 :x_1\ge 0,\quad x_1+x_2 \ge 0 \}\). Clearly, \(\hat{x}=(0,0)\) is the optimal solution of the problem \(\min _{x\in S}f(x)\), in which \(f(x_1,x_2)=x_1\). By a short calculation, we get \(I_{+0}=\{1\}\), and
Therefore, the WGCQ is satisfied at \(\hat{x}\), and \(\mathrm{cone} \big (\mathcal {L} \big )\) is not closed. Obviously, following SSC type relations do not hold for each of the nonnegative scalars \(\lambda _1^H\) and \(\lambda _1^G\):
As diagram (12) shows, VC-GCQ is weaker than WGCQ. In [1, page 93], with an example shows that WGCQ is strictly stronger than VC-GCQ. As a result, it seems that one cannot get from VC-GCQ (resp. \(\widehat{GCQ}\)) to SSC, but it can reach to some weaker conditions which in future, we call them VC-stationary conditions (reap. weakly stationary conditions).
Since the proof of following theorem is similar to the proof of Theorem 4.1, we ignore it.
Theorem 4.2
Suppose that \( \hat{x} \) is a local solution of (P) and VC-GCQ (resp. \(\widehat{GCQ}\)) holds at \( \hat{x} \). If \( \mathrm{cone}(\mathcal {L}_2) \) (resp. \( \mathrm{cone}(\mathcal {L}_1) \)) be a closed cone, then there exist real coefficients \( \lambda _i^H \) and \( \lambda _i^G\), for \( i\in I \), which satisfy in (25) and
As mentioned before, conditions (25) and (28) are refereed by VC-stationary condition (VCSC in brief), and conditions (25) and (29) are named weakly stationary conditions (shortly, WSC).
Theorem 4.3
Assume that \( \hat{x} \) is an optimal solution of (P) , and MPVC-GCQ is satisfied at \( \hat{x} \). If \( \mathrm{cone}\left( \mathcal {L}^{(\beta _1^*,\beta _2^*)}\right) \) and \( \mathrm{cone}\left( \mathcal {L}^{(\beta _2^*,\beta _1^*)}\right) \) are closed for some \( (\beta _1^*,\beta _2^*)\in \mathcal {P}(I_{00}) \), then the strongly stationary conditions hold at \(\hat{x}\).
Proof
Considering the classic formula (9), MPVC-GCQ, Lemma 3.2, and bipolar theorem, one can see that
On the other hand, according to the definition of \( \mathcal {L}^{(\beta _1^*,\beta _2^*)} \) and \( \mathcal {L}^{(\beta _2^*,\beta _1^*)} \) we have
and
Therefore,
Combining the virtues of (30) and (31) , the result is proved. \(\square \)
As we saw in proof of Theorem 4.3,
for each \( (\beta _1, \beta _2)\in \mathcal {P}(I_{00}) \), and so, the closedness of \( \mathrm{cone}( \mathcal {L}^{(\beta _1, \beta _2)}) \) and \(\mathrm{cone}( \mathcal {L}^{(\beta _2, \beta _1)})\) leads us to closedness of \( \mathrm{cone}(\mathcal {L}) \), while closedness of \( \mathrm{cone}(\mathcal {L}) \) does not lead to them.
There are two different assumptions in getting the SSC:
- (\(\mathfrak {a}\)):
-
WGCQ and closedness of \(\mathrm{cone}(\mathcal {L})\) (Theorem 4.1).
- (\(\mathfrak {b}\)):
-
MPVC-GCQ and closedness of \( \mathrm{cone}( \mathcal {L}^{(\beta _1, \beta _2)}) \) and \(\mathrm{cone}( \mathcal {L}^{(\beta _2, \beta _1)})\) (Theorem 4.3).
Since the MPVC-GCQ is weaker than WGCQ (see diagram (12)) and closedness condition in (\(\mathfrak {b}\)) is stronger than in (\(\mathfrak {a}\)), none of Theorems 4.1 and 4.3 are better than the other. It is worth mentioning that if \(H_i\) and \(G_i\) are continuously differentiable, their sub-differentials contain single element, and so, the closedness conditions in (\(\mathfrak {a}\)) and (\(\mathfrak {b}\)) automatically hold. Hence, as mentioned in [11], Theorem 4.3 is strictly better than Theorem 4.1 for MPVCs with smooth constraints.
5 Conclusions
Calculating the Clarke sub-differential of a function \(\hbar :{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) might be difficult in general. However, the functions considered in our work are locally Lipschitz and it empowers the numerical viability of our presented necessary conditions. This is due to an important corollary of the Rademacher’s Theorem [18] which says: Given \(\bar{x} \in {\mathbb {R}}^n\),
in which Q is any full-measure subset of \({\mathbb {R}}^n\) that includes all the points where \(\hbar \) is differentiable. According to (32), the gradient sampling approach [22] can be utilized to generate a suitable approximation of \(\partial _c \hbar (\bar{x})\) and applying in our results. Here, we address this approach briefly. Considering an appropriate ball centered at \(\bar{x}\), say \(U_{\bar{x}}\), a large number of points in \(U_{\bar{x}}\) are randomly generated. At the second steep, among randomly generated points, the points at which \(\hbar \) is differentiable are selected. Assume that these points are \(x_1,x_2,\ldots ,x_k\). According to (32),
is considered as an approximation of \(\partial _c \hbar (\bar{x})\) and is applied in our results. The accuracy of this approximation can be improved by increasing k. See [22] for more details about gradient sampling techniques. To compute \(\partial _c \hbar (\bar{x})\), another sampling method has been presented in [23]. It works based on \(\varepsilon \)-sub-differential introduced by Goldstein.
References
Achtziger, W., Kanzow, C.: Mathematical programs with vanishing constraints: optimality conditions and constraint qualifications. Math. Program. 114, 69–99 (2007)
Hoheisel, T., Kanzow, C.: First- and second-order optimality conditions for mathematical programs with vanishing constraints. Appl. Math. 52, 495–514 (2007)
Ansari Ardali, A., Movahedian, N., Nobakhtian, S.: Optimality conditions for nonsmooth mathematical programs with equilibrium constraints, using convexificators. Optimization 65, 67–85 (2016)
Kanzi, N., Nobakhtian, S.: Nonsmooth multiobjective semi-infinite problems with mixed constraints. Pac. J. Optim. 13, 43–53 (2017)
Pandey, Y., Mishra, S.K.: Duality for nonsmooth optimization problems with equilibrium constraints, using convexificators. J. Optim. Theory Appl. 171, 694–707 (2016)
Bigi, G., Pappalardo, M., Passacantando, M.: Optimization tools for solving equilibrium problems with nonsmooth data. J. Optim. Theory Appl. 171, 887–905 (2016)
Chieu, N.H., Lee, G.M.: Constraint qualifications for mathematical programs with equilibrium constraints and their local preservation property. J. Optim. Theory Appl. 163, 755–776 (2014)
Movahedian, N.: Bounded Lagrange multiplier rules for general nonsmooth problems and application to mathematical programs with equilibrium constraints. J. Global Optim. 67, 829–850 (2017)
Luu, D.V.: Optimality condition for local efficient solutions of vector equilibrium problems via convexificators and applications. J. Optim. Theory Appl. 171, 643–665 (2016)
Hoheisel, T., Kanzow, C.: Stationarity conditions for mathematical programs with vanishing constraints using weak constraint qualifications. J. Math. Anal. Appl. 337, 292–310 (2008)
Hoheisel, T., Kanzow, C.: On the Abadie and Guignard constraint qualifications for mathematical programs with vanishing constraints. Optimization 58, 431–448 (2009)
Hoheisel, T., Kanzow, C., Outrata, J.: Exact penalty results for mathematical programs with vanishing constraints. Nonlinear Anal. 72, 2514–2526 (2010)
Achtziger, W., Hoheisel, T., Kanzow, C.: A smoothing-regularization approach to mathematical programs with vanishing constraints. Comput. Optim. Appl. 55, 733–767 (2013)
Izmailov, A.F., Solodov, M.V.: Mathematical programs with vanishing constraints: optimality conditions, sensitivity, and a relaxation method. J. Optim. Theory Appl. 142, 501–532 (2009)
Izmailov, A.F., Pogosyan, A.L.: Optimality conditions and Newton-type methods for mathematical programs with vanishing constraints. Comput. Math. Math. Phys. 49, 1128–1140 (2009)
Mishra, S.K., Singh, V., Laha, V.: On duality for mathematical programs with vanishing constraints. Annal. Oper. Res. 243, 249–272 (2016)
Mordukhovich, B.: Optimization and equilibrium problems with equilibrium constraints in infinite-dimensional spaces. Optimization 57, 715–741 (2008)
Clarke, F.H.: Optimization and Nonsmooth Analysis. Wiley, Hoboken (1983)
Giorgi, G., Gwirraggio, A., Thierselder, J.: Mathematics of Optimization; Smooth and Nonsmooth cases. Elsevier, Amsterdam (2004)
Hiriart- Urruty, J.B., Lemarechal, C.: Convex Analysis and Minimization Algorithms, I & II. Springer, Berlin (1991)
Halkin, H.: Implicit functions and optimization problems without continuous differentiability of the data. SIAM J. Control 12, 229–236 (1974)
Burke, J.V., Lewis, A.S., Overton, M.L.: Approximating subdifferentials by random sampling of gradients. Math. Oper. Res. 27, 567–584 (2002)
Burke, J.V., Lewis, A.S., Overton, M.L.: A robust gradient sampling algorithm for nonsmooth, nonconvex optimization. SIAM J. Optim. 15, 751–779 (2005)
Acknowledgements
The authors are very grateful to the referees for their constructive comments. Also, we express our thanks to Professor M. Soleimani-damaneh for suggesting the conclusions of the paper.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Kazemi, S., Kanzi, N. Constraint Qualifications and Stationary Conditions for Mathematical Programming with Non-differentiable Vanishing Constraints. J Optim Theory Appl 179, 800–819 (2018). https://doi.org/10.1007/s10957-018-1373-7
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10957-018-1373-7