Abstract
In this paper, we consider a nonsmooth optimization problem with a convex feasible set described by constraint functions which are not necessarily convex, differentiable, or locally Lipschitz. Utilizing upper regular convexificators, we characterize the normal cone of the feasible set and derive KKT-type necessary and sufficient optimality conditions. Under some assumptions, we show that the set of KKT multipliers is bounded. We also characterize the set of optimal solutions and introduce a linear approximation of the original problem which is useful in checking optimality. The obtained results extend various results existing in the literature to a more general setting.
1 Introduction
Our motivation in this article comes from the recent works of Lasserre (2010) and Dutta and Lalitha (2013). We consider the optimization problem
$$\begin{aligned} \min \ f(x) \quad \text {s.t.}\quad g_i(x)\le 0, \quad i\in I, \end{aligned}$$(1)
where \(f, g_i:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\), (\(i\in I:=\{1, \ldots ,m\})\) are real-valued functions. The set of feasible solutions of (1) is
$$\begin{aligned} K:=\{x\in {\mathbb {R}}^n : g_i(x)\le 0,~ i\in I\}. \end{aligned}$$
The problem in the presence of a convex objective function and convex constraint functions has been studied by various scholars; see e.g. Bazaraa et al. (2006), Hiriart-Urruty and Lemarechal (1993), Rockafellar and Wets (1998) and the references therein. In the current work, we focus on the case in which the constraint functions \(g_i\) are nonconvex, nondifferentiable and even discontinuous. These cases are important in both theory and applications. In various practical problems, it occurs that some function \(g_i\) is not convex while K is convex. For example, K is a convex set if the \(g_i\)’s are quasiconvex functions. It is also simple to observe that K might be convex while the \(g_i\) functions are neither convex nor quasiconvex. For example, the set
\(K=\{x\in {\mathbb {R}}^2 : 1-x_1x_2\le 0,~ -x_1\le 0,~ -x_2\le 0\}\) is convex while the constraint function \(g_1(x_1, x_2)=1-x_1x_2\) is neither convex nor quasiconvex; see Lasserre (2010). As an interesting applied problem, consider an optimization model with probabilistic constraints (OMPC) as follows:
$$\begin{aligned} \min \ {\mathbb {E}}[h(x,Z)] \quad \text {s.t.}\quad {\mathbb {P}}[r_i(x,Z)\ge 0,~ i\in I]\ge p, \end{aligned}$$
where \(p\in (0,1)\) is a fixed probability level, and \(x\in {\mathbb {R}}^n\) and \(Z\in {\mathbb {R}}^p\) are the decision and random vectors, respectively. Here, \({\mathbb {E}}[\cdot ]\) and \({\mathbb {P}}[\cdot ]\) stand for the expected value and the probability function, respectively, and we assume that these functions are well-defined. The most important application of OMPC appears in portfolio selection (see Dentcheva 2006, p. 50), where h is a return function and the \(r_i\) functions correspond to a risk measure. One of the vital questions in this area is imposing suitable conditions on the \(r_i(\cdot ,\cdot )\) functions which make the feasible set of OMPC convex. If \(r_i(\cdot ,\cdot )\) (\(i\in I\)) are quasiconcave functions jointly in both arguments, and Z is a random variable with an \(\alpha \)-concave probability distribution (see Dentcheva 2006, p. 53), then the feasible set \(\{x\in {\mathbb {R}}^n: {\mathbb {P}}[r_i(x,Z)\ge 0, i\in I]\ge p\}\) is convex; see Dentcheva (2006, Sect. 2.2.3). See also Luedtke and Ahmed (2008) and Nemirovski and Shapiro (2006).
As another practical example, Marcille et al. (2012) proposed a model for resource allocation in ad hoc networks which minimizes a convex function over a convex feasible set with nonconvex constraint functions (see Marcille et al. 2012, Problem 1 and Lemma 1). They provided the associated KKT conditions, invoking Lasserre's (2010) results, leading to an optimal resource allocation algorithm. More examples of problems with nonconvex constraint functions and a convex feasible set can be found in the literature; see e.g. problems with fractional constraints (Schaible 1981) and financial models under cash-subadditive risk measures (Cerreia-Vioglio et al. 2011), among others.
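The claims in Lasserre's example above can be spot-checked numerically. The following pure-Python sketch (assuming, as in Lasserre 2010, the set \(K=\{x\in {\mathbb {R}}^2_{+} : x_1x_2\ge 1\}\)) exhibits midpoint violations of convexity and quasiconvexity for \(g_1\), and finds no midpoint violation of convexity for K over random samples — a necessary check, not a proof:

```python
import random

random.seed(0)

def g1(x1, x2):          # constraint function g1(x) = 1 - x1*x2
    return 1.0 - x1 * x2

# g1 is not convex: the midpoint value exceeds the average of the endpoints.
a, b = (1.0, 1.0), (-1.0, -1.0)
mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
assert g1(*mid) > (g1(*a) + g1(*b)) / 2   # 1 > 0

# g1 is not quasiconvex: the midpoint value exceeds the max of the endpoints.
assert g1(*mid) > max(g1(*a), g1(*b))     # 1 > 0

# K = {x >= 0 : g1(x) <= 0} nevertheless passes a random midpoint test
# for convexity.
def in_K(x1, x2):
    return x1 >= 0 and x2 >= 0 and g1(x1, x2) <= 1e-12

pts = []
while len(pts) < 200:
    x1, x2 = random.uniform(0, 5), random.uniform(0, 5)
    if in_K(x1, x2):
        pts.append((x1, x2))

for (u1, u2) in pts:
    for (v1, v2) in pts[:20]:
        assert in_K((u1 + v1) / 2, (u2 + v2) / 2)
print("no convexity violation found for K")
```

The midpoint test cannot certify convexity, but a single violation would disprove it; for this K none exists, since \((u_1+v_1)(u_2+v_2)\ge 4\) whenever \(u_1u_2\ge 1\) and \(v_1v_2\ge 1\).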
Lasserre (2010) considered the convexity of K without any convexity assumption on the \(g_i\) functions and obtained KKT optimality conditions under the Slater constraint qualification and a non-degeneracy condition on the \(g_i\)’s. He investigated only differentiable problems; Dutta and Lalitha (2013) then extended Lasserre’s work and derived KKT conditions when the \(g_i\)’s are locally Lipschitz and regular in the sense of Clarke et al. (1998). Martínez-Legaz (2015) dealt with the problem with tangentially convex constraint functions and a pseudoconvex objective function. In these studies, the objective and constraint functions are assumed to be continuous and at least directionally differentiable, while in the present paper we work with discontinuous functions. Optimization models involving nonsmooth and even discontinuous functions arise in various fields, including batch production (Imo and Leech 1984) and switching regression (Tishler and Zang 1979). See also Moreau and Aeyels (2000) and the references therein.
In the present article, we study the problem without the locally Lipschitz assumption, and we use convexificators in the presence of nonsmooth data (nondifferentiable/discontinuous functions). The concept of a convexificator applies to a broad class of functions and leads to sharper results in nonsmooth optimization. Convexificators enjoy nice properties, including a mean value theorem, calculus rules, optimality conditions, etc., under mild assumptions. However, some calculus rules, such as an exact chain rule, do not hold in general; see Jeyakumar and Luc (1999) for more details.
The convexificator notion has been studied in several recent works, including Demyanov and Rubinov (1995), Dutta and Chandra (2004), Jeyakumar and Luc (1999), Luc (2002) and Luu (2014), for characterizing optimality. In the first part of the present work, a main result is obtained showing that the normal cone of K can be represented in terms of the upper regular convexificators of the \(g_i\)’s under some conditions. Afterwards, KKT-type optimality conditions are derived from this representation, leading to an extended version of some main results provided in Dutta and Lalitha (2013), Lasserre (2010) and Martínez-Legaz (2015). Furthermore, a result is proved to obtain KKT conditions from FJ conditions. We prove the boundedness of the set of KKT multipliers and derive a characterization of the solution set. Finally, we build a linear semi-infinite problem to check the optimality of a given feasible solution.
The rest of the paper unfolds as follows. After providing some preliminaries in Sect. 2, we briefly review the convexificator notion and its relationship with other well-known generalized gradients in Sect. 3. This section addresses some constraint qualifications as well. The characterization of the convexity of the feasible set and of its normal cone is given in Sect. 4. Deriving KKT conditions from the characterization of the normal cone, obtaining KKT conditions from FJ conditions, and establishing the boundedness of the set of KKT multipliers are presented in Sect. 5. Finally, in Sect. 6 we obtain some results characterizing the set of optimal solutions; and, via a linear approximation, we relate the main problem to a linear semi-infinite optimization problem to check the optimality of a given feasible solution.
2 Preliminaries
In this section we provide some preliminaries.
For a set \(\varOmega \subseteq {\mathbb {R}}^n\), we use the notations \(co\, \varOmega \), \(int \, \varOmega \), \(bd\, \varOmega \) and \(cl\, \varOmega \) to denote the convex hull, the interior, the boundary and the closure of \(\varOmega \), respectively. Throughout the paper, the considered norm \(\Vert .\Vert \) is the Euclidean norm, i.e., \(\Vert .\Vert =\Vert .\Vert _2\) and we use the convention \(\infty - \infty = \infty \).
A nonempty set \(C\subseteq {\mathbb {R}}^n\) is called a cone if for each \(x\in C\) and each scalar \(\lambda \ge 0\), we have \(\lambda x\in C\). A cone C is said to be pointed whenever \(C\cap (-C)=\{0\}\). For a convex set \(\varOmega \subseteq {\mathbb {R}}^n\), the cone of feasible directions, the tangent cone and the normal cone of \(\varOmega \) at \(\bar{x}\in cl\, \varOmega \), denoted by \(D_\varOmega (\bar{x})\), \(T_\varOmega (\bar{x})\) and \(N_\varOmega (\bar{x})\), respectively, are defined as
$$\begin{aligned} D_\varOmega (\bar{x})&:=\{d\in {\mathbb {R}}^n : \bar{x}+td\in \varOmega ~\text {for some}~ t>0\},\\ T_\varOmega (\bar{x})&:=cl\,\{t(x-\bar{x}) : x\in \varOmega ,~ t\ge 0\},\\ N_\varOmega (\bar{x})&:=\{d\in {\mathbb {R}}^n : \langle d, x-\bar{x}\rangle \le 0,~ \forall x\in \varOmega \}. \end{aligned}$$
The polar cone of a set \(\varOmega \subseteq {\mathbb {R}}^n\) is defined by
$$\begin{aligned} \varOmega ^\circ :=\{d\in {\mathbb {R}}^n : \langle d, x\rangle \le 0,~ \forall x\in \varOmega \}. \end{aligned}$$
If \(\varOmega \) is convex, then \(cl\, D_\varOmega (\bar{x})=T_\varOmega (\bar{x})\) and \(T^\circ _\varOmega (\bar{x})=N_\varOmega (\bar{x})\). The convex cone generated by \(\varOmega \subseteq {\mathbb {R}}^n\) is defined as follows:
$$\begin{aligned} cone(\varOmega ):=\{t x : t\ge 0,~ x\in co\, \varOmega \}. \end{aligned}$$
If \(\varOmega _1, \ldots , \varOmega _l\subseteq {\mathbb {R}}^n\) are convex sets, then it can be shown that
$$\begin{aligned} cone\Big (\bigcup _{i=1}^{l}\varOmega _i\Big )=\Big \{\sum _{i=1}^{l}\lambda _i x_i : \lambda _i\ge 0,~ x_i\in \varOmega _i,~ i=1,\ldots ,l\Big \}. \end{aligned}$$
3 Convexificator and CQs
The upper Dini directional derivative, defined below, plays a central role in this work. Hereafter, we assume that \(h(\bar{x})\) is finite for a given \(\bar{x}\in {\mathbb {R}}^n\).
Definition 1
Let \(h:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\). The upper Dini directional derivative of h at \(\bar{x}\) in direction \(d\in {\mathbb {R}}^n\) is defined by
$$\begin{aligned} h^+(\bar{x}; d):=\limsup _{t\downarrow 0}\frac{h(\bar{x}+td)-h(\bar{x})}{t}. \end{aligned}$$
If \(h:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) is locally Lipschitz, then the upper Dini directional derivative exists and is finite.
The directional derivative of h at \(\bar{x}\in {\mathbb {R}}^n\) in direction \(d\in {\mathbb {R}}^n\), denoted by \(h'(\bar{x}; d)\), is defined as
$$\begin{aligned} h'(\bar{x}; d):=\lim _{t\downarrow 0}\frac{h(\bar{x}+td)-h(\bar{x})}{t}, \end{aligned}$$
whenever this limit exists.
Definition 2
(Jeyakumar and Luc 1999) The function \(h: {\mathbb {R}}^n\rightarrow {\mathbb {R}}\) is said to have an upper regular convexificator (URC) \(\partial ^* h(\bar{x})\) at \(\bar{x}\) if \(\partial ^* h(\bar{x})\subseteq {\mathbb {R}}^n\) is closed and, for each \(d\in {\mathbb {R}}^n\),
$$\begin{aligned} h^+(\bar{x}; d)=\sup _{\zeta \in \partial ^* h(\bar{x})}\langle \zeta , d\rangle . \end{aligned}$$
See Jeyakumar and Luc (1999) for properties and applications of URCs.
Definition 3
(Clarke et al. 1998) Let \(h: {\mathbb {R}}^n\rightarrow {\mathbb {R}}\) be Lipschitz near \(\bar{x}\in {\mathbb {R}}^n\). The Clarke generalized directional derivative of h at \(\bar{x}\) in the direction \(d\in {\mathbb {R}}^n\), denoted by \(h^\circ (\bar{x}; d)\), is defined as
$$\begin{aligned} h^\circ (\bar{x}; d):=\limsup _{y\rightarrow \bar{x},~ t\downarrow 0}\frac{h(y+td)-h(y)}{t}. \end{aligned}$$
Definition 4
(Clarke et al. 1998) Let \(h: {\mathbb {R}}^n\rightarrow {\mathbb {R}}\) be Lipschitz near \(\bar{x}\in {\mathbb {R}}^n\). The Clarke generalized gradient of h at \(\bar{x}\), denoted by \(\partial ^{Cl} h(\bar{x})\), is defined as
$$\begin{aligned} \partial ^{Cl} h(\bar{x}):=\{\zeta \in {\mathbb {R}}^n : h^\circ (\bar{x}; d)\ge \langle \zeta , d\rangle ,~ \forall d\in {\mathbb {R}}^n\}. \end{aligned}$$
Remark 1
If h is locally Lipschitz at \(\bar{x}\) with a URC \(\partial ^*h(\bar{x})\) at this point, then due to the inequality \(h^+(\bar{x};d)\le h^{\circ }(\bar{x};d)\), one has
$$\begin{aligned} \sup _{\zeta \in \partial ^* h(\bar{x})}\langle \zeta , d\rangle \le h^{\circ }(\bar{x};d)=\max _{\zeta \in \partial ^{Cl} h(\bar{x})}\langle \zeta , d\rangle ,~~~\forall d\in {\mathbb {R}}^n. \end{aligned}$$
Now, from Hiriart-Urruty and Lemarechal (1993, Theorem V.3.3.1), we have
$$\begin{aligned} cl\, co\big (\partial ^* h(\bar{x})\big )\subseteq \partial ^{Cl} h(\bar{x}). \end{aligned}$$
This inclusion relation shows that one may obtain a URC smaller than the Clarke generalized gradient. Notice that finding smaller subdifferential sets is an important issue in optimization from a numerical standpoint.
As mentioned in the preceding section, related results in various papers have been obtained for directionally differentiable functions. Moreover, in some works, the functions in question are assumed to be regular (i.e. \(h'(\bar{x};d)\) exists and equals \(h^\circ (\bar{x};d)\) for every \(d\in {\mathbb {R}}^n\)). However, even Lipschitz functions are not always directionally differentiable. For example, look at \(h:{\mathbb {R}}\rightarrow {\mathbb {R}}\) defined by
This function is Lipschitz on \({\mathbb {R}}\) with a Lipschitz constant \(k=2\), while it is not directionally differentiable at \(\bar{x}=0\) (see Jeyakumar and Luc 2008, p. 7). However,
which implies that \(\partial ^* h(\bar{x})=\{0,1\}\) is a URC of h at \(\bar{x}\). It is seen that \(\partial ^* h(\bar{x})\subsetneqq \partial ^{Cl} h(\bar{x})=[-2,2]\).
According to the discussion and example above, even for locally Lipschitz functions, the URC notion may provide sharper results than Clarke generalized gradient.
The class of functions that admit a URC is rich. Gâteaux differentiable functions and regular functions in the sense of Clarke et al. (1998) are important members of this class. The tangential subdifferential (of a tangentially convex function) is also a URC (see Martínez-Legaz 2015, Definition 5). Moreover, if \(h: {\mathbb {R}}^n\rightarrow {\mathbb {R}}\) is locally Lipschitz, then the Clarke generalized gradient (Clarke et al. 1998) and the Michel and Penot (1992) subdifferential are upper semi-regular convexificators (Dutta and Chandra 2002). In the present study we use URCs as a strong tool for working with discontinuous and nondifferentiable functions.
Now we recall some constraint qualification conditions from the literature.
Let
$$\begin{aligned} K:=\{x\in {\mathbb {R}}^n : g_i(x)\le 0,~ i\in I\}, \end{aligned}$$(3)
where \(g_i: {\mathbb {R}}^n\rightarrow {\mathbb {R}}\) for each \(i\in I\) is not necessarily continuous or differentiable.
Definition 5
We say that the Slater Constraint Qualification (SCQ) holds for (1) if there exists \(x\in K\) such that \(g_i(x)<0\) for all \(i\in I\).
Lasserre (2010), Dutta and Lalitha (2013) and Martínez-Legaz (2015) use SCQ. Since the constraint functions in their work are continuous, SCQ implies \(int\, K\ne \emptyset \). However, we use \(int\, K\ne \emptyset \) directly in our results because we do not assume continuity.
For a given \(\bar{x}\in K\), set
$$\begin{aligned} I(\bar{x}):=\{i\in I : g_i(\bar{x})=0\}. \end{aligned}$$
Assumption 1
Hereafter, given \(\bar{x}\in K\), we assume that \(I(\bar{x})\ne \emptyset \) and that each \(g_i\), \(i\in I(\bar{x})\), has a URC \(\partial ^* g_i(\bar{x})\) at \(\bar{x}\).
Definition 6
We say that the Linear Independence Constraint Qualification (LICQ) holds at \(\bar{x}\in K\) if, for every choice of \(\zeta _i\in \partial ^* g_i(\bar{x})\), \(i\in I(\bar{x})\), the vectors \(\zeta _i\), \(i\in I(\bar{x})\), are linearly independent.
Definition 7
We say that the Constraint Qualification (CQ) holds at \(\bar{x}\in K\) if
$$\begin{aligned} 0\notin co\Big (\bigcup _{i\in I(\bar{x})}\partial ^* g_i(\bar{x})\Big ). \end{aligned}$$
Definition 8
We say that the generalized Lasserre Constraint Qualification (GLCQ) holds at \(\bar{x}\in K\) if \(0\notin \partial ^* g_i(\bar{x})\) for each \(i\in I(\bar{x})\).
It is not difficult to show that LICQ \(\Rightarrow \) CQ \(\Rightarrow \) GLCQ, but the converse implications do not hold in general.
4 Convexity and characterization of normal cone
In this section, we characterize the convexity of the feasible set K, as represented in (3), and its normal cone at a given point \(\bar{x}\in K\). It is shown that the normal cone is the closure of the convex cone generated by \(\cup _{i\in I(\bar{x})} \partial ^* g_i(\bar{x})\).
The following theorem characterizes the convexity of K using URCs. This theorem extends Lasserre (2010, Lemma 2.2), Dutta and Lalitha (2013, Proposition 2.2) and Martínez-Legaz (2015, Proposition 6). We assume
$$\begin{aligned} I(x)\ne \emptyset ,~~~\forall x\in bd\, K. \end{aligned}$$(4)
If constraint functions are continuous, then this condition automatically holds.
Theorem 1
Let \(int\, K\ne \emptyset \) and GLCQ hold. Furthermore, assume that K is closed and (4) holds. Then K is convex if and only if for every \(i\in I\) and every \(x\in K\) with \(g_i(x)=0\), there exists \(\zeta \in \partial ^* g_i(x)\) such that \(\langle \zeta , y-x\rangle \le 0\) for each \(y\in K\).
Proof
Assume that K is convex. On the contrary, suppose that there exist \(i\in I\) and \(x\in K\) with \(g_i(x)=0\) such that for each \(\zeta \in \partial ^* g_i(x)\), \(\langle \zeta , y- x\rangle >0\) for some \(y\in K\). Hence, \(g_i^+(x; y-x)>0\), which leads to \(g_i(x+t(y-x))>0\) for some sufficiently small \(t>0\). This contradicts the convexity of K.
Conversely, let \(\bar{x}\) be any boundary point of K. Due to (4), there exists some \(j\in I\) such that \(g_j(\bar{x})=0\). Hence, by assumption, there exists \(\zeta \in \partial ^* g_j(\bar{x})\) such that \(\langle \zeta , y-\bar{x}\rangle \le 0\) for each \(y\in K\). Since GLCQ holds, we have \(\zeta \ne 0\). Thus, \(\zeta \) is the normal vector of a non-trivial supporting hyperplane to K at \(\bar{x}\). Now, since \(int\, K\ne \emptyset \), by Theorem 1.3.3 in Schneider (1994), K is convex. \(\square \)
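The "only if" direction of Theorem 1 can be sanity-checked on Lasserre's example from the introduction: at a boundary point x with \(g(x)=0\), the gradient \(\zeta =\nabla g(x)\) (a singleton URC here, since g is smooth) should satisfy \(\langle \zeta , y-x\rangle \le 0\) for all \(y\in K\). A pure-Python spot check, assuming \(K=\{x\in {\mathbb {R}}^2_{+} : 1-x_1x_2\le 0\}\):

```python
import random

random.seed(1)

def grad_g(x1, x2):
    # g(x) = 1 - x1*x2, so grad g(x) = (-x2, -x1);
    # for smooth g this singleton serves as an upper regular convexificator.
    return (-x2, -x1)

# boundary points of K with g(x) = 0: x1 > 0 and x2 = 1/x1
boundary = [(t, 1.0 / t) for t in (0.2, 0.5, 1.0, 2.0, 5.0)]

# random sample of K = {x >= 0 : x1*x2 >= 1}
sample = []
while len(sample) < 300:
    y1, y2 = random.uniform(0, 6), random.uniform(0, 6)
    if y1 * y2 >= 1.0:
        sample.append((y1, y2))

for (x1, x2) in boundary:
    z1, z2 = grad_g(x1, x2)
    for (y1, y2) in sample:
        # <grad g(x), y - x> <= 0: grad g(x) supports K at x
        assert z1 * (y1 - x1) + z2 * (y2 - x2) <= 1e-9
print("supporting-hyperplane condition holds at all sampled points")
```

Indeed, \(\langle \nabla g(x), y-x\rangle = 2-(x_2y_1+x_1y_2)\le 2-2\sqrt{x_1x_2\,y_1y_2}\le 0\) by the AM-GM inequality, matching the numerical check.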
Theorem 2 is devoted to characterization of the normal cone of K at a given point \(\bar{x}\in K\).
Assumption 2
In the rest of this section, we consider K, as defined in (3), to be a convex set, where the \(g_i\)’s are not necessarily locally Lipschitz, continuous or convex.
Theorem 2
Let \(\bar{x}\in K\). If \(g_i\) for \(i\notin I(\bar{x})\) is upper semicontinuous at \(\bar{x}\), \(int\, K\ne \emptyset \) and GLCQ holds at \(\bar{x}\), then
$$\begin{aligned} N_K(\bar{x})=cl\, cone\Big (\bigcup _{i\in I(\bar{x})}\partial ^* g_i(\bar{x})\Big ). \end{aligned}$$
Proof
As \(N_K(\bar{x})\) is a closed convex cone, to justify the inclusion
$$\begin{aligned} cl\, cone\Big (\bigcup _{i\in I(\bar{x})}\partial ^* g_i(\bar{x})\Big )\subseteq N_K(\bar{x}), \end{aligned}$$
it is sufficient to show that \(\partial ^* g_i(\bar{x})\subseteq N_K(\bar{x})\) for each \(i\in I(\bar{x})\).
Let \(i\in I(\bar{x})\) be arbitrary. Since K is convex, for each \(x\in K\) we have \(g_i^+(\bar{x}; x - \bar{x})\le 0\), leading to
$$\begin{aligned} \langle \zeta , x-\bar{x}\rangle \le 0,~~~\forall \zeta \in \partial ^* g_i(\bar{x}),~ \forall x\in K. \end{aligned}$$
Hence, according to the definition of the normal cone of a convex set, we have \(\partial ^* g_i(\bar{x})\subseteq N_K(\bar{x})\). Therefore,
$$\begin{aligned} cl\, cone\Big (\bigcup _{i\in I(\bar{x})}\partial ^* g_i(\bar{x})\Big )\subseteq N_K(\bar{x}). \end{aligned}$$(5)
Set
$$\begin{aligned} \varGamma (\bar{x}):=\bigcup _{i\in I(\bar{x})}\partial ^* g_i(\bar{x}). \end{aligned}$$
To show the converse inclusion, we claim that \(\varGamma ^{\circ }(\bar{x})\subseteq T_K(\bar{x})\). Let \(y\in int\, K\) be arbitrary. There exists some \(\epsilon >0\) such that \(y+\epsilon d\in K\) for each \(d\in {\mathbb {R}}^n\) with \(\Vert d\Vert \le 1\). Hence, from the convexity of K, for each \(i\in I(\bar{x})\) and each \(d\in {\mathbb {R}}^n\) with \(\Vert d\Vert \le 1\),
$$\begin{aligned} g_i^+(\bar{x}; y+\epsilon d-\bar{x})\le 0. \end{aligned}$$
Therefore,
$$\begin{aligned} \langle \zeta , y+\epsilon d-\bar{x}\rangle \le 0,~~~\forall \zeta \in \partial ^* g_i(\bar{x}). \end{aligned}$$
By setting \(d=\frac{\zeta }{\Vert \zeta \Vert }\), we get
$$\begin{aligned} \langle \zeta , y-\bar{x}\rangle \le -\epsilon \Vert \zeta \Vert ,~~~\forall \zeta \in \partial ^* g_i(\bar{x}). \end{aligned}$$
Hence, because of \(0\notin \partial ^* g_i(\bar{x})\) and the closedness of \(\partial ^* g_i(\bar{x})\), the distance \(\delta _i:=\inf \{\Vert \zeta \Vert : \zeta \in \partial ^* g_i(\bar{x})\}\) is positive. Therefore,
$$\begin{aligned} \sup _{\zeta \in \partial ^* g_i(\bar{x})}\langle \zeta , y-\bar{x}\rangle \le -\epsilon \delta _i<0,~~~\forall i\in I(\bar{x}),~ \forall y\in int\, K. \end{aligned}$$(6)
Let \(d\in \varGamma ^\circ (\bar{x})\) and \(y\in int\, K\). From (6) and the definition of URC, for each \(t>0\) and \(i\in I(\bar{x})\) we have
$$\begin{aligned} g_i^+\big (\bar{x}; d+t(y-\bar{x})\big )=\sup _{\zeta \in \partial ^* g_i(\bar{x})}\big (\langle \zeta , d\rangle +t\langle \zeta , y-\bar{x}\rangle \big )<0. \end{aligned}$$
Therefore, for sufficiently small \(\lambda >0\), \(g_i\big (\bar{x}+\lambda (d+t(y-\bar{x}))\big )\le 0\) for each \(t>0\) and \(i\in I(\bar{x})\). On the other hand, considering \(t>0\), for each \(i\notin I(\bar{x})\), due to the upper semicontinuity assumption, \(g_i\big (\bar{x}+\lambda (d+t(y-\bar{x}))\big )\le 0\) for sufficiently small \(\lambda >0\). Hence \( d+t(y- \bar{x})\in D_K(\bar{x})\) for each \(t>0\). This implies \(d\in cl\, D_K(\bar{x})=T_K(\bar{x})\). Therefore,
$$\begin{aligned} \varGamma ^{\circ }(\bar{x})\subseteq T_K(\bar{x}). \end{aligned}$$
Now, we have
$$\begin{aligned} N_K(\bar{x})=T_K^{\circ }(\bar{x})\subseteq \varGamma ^{\circ \circ }(\bar{x})=cl\, cone\big (\varGamma (\bar{x})\big ). \end{aligned}$$
Therefore, according to (5), \(N_K(\bar{x})=cl\, cone\left( \cup _{i\in I(\bar{x})} \partial ^* g_i(\bar{x})\right) \) and the proof is completed. \(\square \)
Remark 2
Under the assumptions of Theorem 2, if \(cone(\varGamma (\bar{x}))\) is closed, then by the mentioned theorem and because of
$$\begin{aligned} cone(\varGamma (\bar{x}))=\Big \{\sum _{i\in I(\bar{x})}\lambda _i \zeta _i : \lambda _i\ge 0,~ \zeta _i\in co(\partial ^*g_i(\bar{x}))\Big \}, \end{aligned}$$
for each \(d\in N_K(\bar{x})\),
$$\begin{aligned} d=\sum _{i\in I(\bar{x})}\lambda _i \zeta _i,~~~\zeta _i\in co(\partial ^*g_i(\bar{x})), \end{aligned}$$
for some \(\lambda _i\ge 0\), (\(i\in I(\bar{x})\)). Furthermore, due to (6), \(0\notin co(\partial ^*g_i(\bar{x}))\) for any \(i\in I(\bar{x})\). So, \(cone(\varGamma (\bar{x}))\) is closed provided that \(\partial ^*g_i(\bar{x})\) is bounded for each \(i\in I(\bar{x})\).
For the sake of convenience and to obtain sharper results, in some theorems we use a base of the cone generated by a convexificator instead of the convexificator itself. Given a cone C, a set \(B\subseteq C\) is called a base of C if \(0\notin B\) and for each \(c\in C\backslash \{0\}\) there are unique \(b\in B\) and \(t> 0\) such that \(c=tb.\)
Let the assumptions of Theorem 2 hold and let \(cone(\partial ^* g_i(\bar{x}))\) be closed for each \(i\in I(\bar{x})\). From (6), we have
$$\begin{aligned} \langle c, y-\bar{x}\rangle <0,~~~\forall c\in cone(\partial ^* g_i(\bar{x}))\backslash \{0\},~ \forall y\in int\, K. \end{aligned}$$
Therefore, \(cone(\partial ^* g_i(\bar{x}))\) is a closed pointed cone, and it has a closed convex bounded base \(B_i\) (see Luc 1989, Remark 1.6), leading to
$$\begin{aligned} cone(\partial ^* g_i(\bar{x}))=\{t b : t\ge 0,~ b\in B_i\}. \end{aligned}$$
Corollary 1
Under the assumptions of Theorem 2, if \(cone(\partial ^* g_i(\bar{x}))\) is closed for each \(i\in I(\bar{x})\), then
$$\begin{aligned} N_K(\bar{x})=\Big \{\sum _{i\in I(\bar{x})}\lambda _i b_i : \lambda _i\ge 0,~ b_i\in B_i\Big \}. \end{aligned}$$
There are known ways for obtaining the base \(B_i\); see e.g. Luc (1989, Proposition 1.10). The representation given in Corollary 1 will help us to prove the boundedness of KKT multipliers in the next section.
5 KKT condition
5.1 KKT via FJ
We start this section by recalling FJ and KKT point notions.
Definition 9
(i) We say that \(\bar{x}\in K\) is a Fritz John (FJ) point for (1) if there exist \(\lambda _0, \lambda _1, \ldots , \lambda _m\ge 0\), not all zero, such that
$$\begin{aligned}&0\in \lambda _0 co(\partial ^* f(\bar{x}))+ \sum _{i=1}^m \lambda _i co(\partial ^* g_i(\bar{x})), \end{aligned}$$(8)
$$\begin{aligned}&\lambda _i g_i(\bar{x})=0,~~\forall i\in I. \end{aligned}$$(9)
(ii) We say that \(\bar{x}\in K\) is a KKT point for (1) if it is an FJ point with \(\lambda _0=1\).
Dutta and Chandra (2002, Theorem 3.3) derived a necessary FJ optimality condition for (1) in terms of convexificators. Theorem 3 shows that the KKT conditions are derived if one adds \(int\ K\ne \emptyset \) and GLCQ to the assumptions imposed in Dutta and Chandra (2002, Theorem 3.3).
Theorem 3
Let \(\bar{x}\in K\) be an FJ point. If \(int\ K\ne \emptyset \) and GLCQ holds, then \(\bar{x}\) is a KKT point.
Proof
Assume \(\lambda _0=0\) in (8). Then \(0= \sum _{i=1}^m \lambda _i \zeta _i\) for some \(\zeta _i\in co(\partial ^* g_i(\bar{x})),~i=1,2,\ldots ,m.\) Thus
$$\begin{aligned} 0=\sum _{i\in J}\lambda _i \zeta _i, \end{aligned}$$
where \(J:=\{ i\in I: \lambda _i >0\}\subseteq I(\bar{x})\). For each \(i\in I(\bar{x})\) and \(y\in K\), as K is convex, \(g_i^+(\bar{x}; y - \bar{x})\le 0\), which implies \(\langle \zeta _i, y- \bar{x}\rangle \le 0\). Thus
$$\begin{aligned} \langle \zeta _i, y-\bar{x}\rangle =0,~~~\forall i\in J,~ \forall y\in K. \end{aligned}$$
Since there exists some \(\hat{x}\in int\,K\), for any \(v\in {\mathbb {R}}^n\) and \(t>0\) sufficiently small we have \(\hat{x}+ tv\in K\); and hence
$$\begin{aligned} \langle \zeta _i, \hat{x}+tv-\bar{x}\rangle =0,~~~\forall i\in J. \end{aligned}$$
Therefore, \(\langle \zeta _i , \hat{x}-\bar{x}\rangle + \langle \zeta _i , tv\rangle =0\), which leads to
$$\begin{aligned} \langle \zeta _i, tv\rangle =0. \end{aligned}$$
Hence for each \(v\in {\mathbb {R}}^n\) and each \(i\in J\), \(\langle \zeta _i ,v\rangle =0\). This implies \(\zeta _i=0\) for each \(i\in J\). Therefore, \(0\in co(\partial ^* g_i(\bar{x}))\) for all \(i\in J\). On the other hand, invoking GLCQ and \(int\,K\ne \emptyset \), in a manner similar to the proof of Theorem 2, Eq. (6) is derived, leading to \(0\notin co(\partial ^* g_i(\bar{x}))\) for any \(i\in J\). This contradiction shows \(\lambda _0 > 0\), and without loss of generality one may take \(\lambda _0 = 1\). \(\square \)
5.2 KKT without FJ
In Dutta and Lalitha (2013), Lasserre (2010) and Martínez-Legaz (2015), the authors investigated KKT conditions at optimal solutions of (1). To this end, they consider FJ conditions and then impose some constraint qualifications to get KKT conditions. Their approach can be summarized in the following diagram:
Now we are going to use the characterization of \(N_K(\bar{x})\), proved in Theorem 2, to get KKT conditions without using FJ conditions. The diagram below clarifies our approach:
Theorem 4
Assume that f has a URC \(\partial ^*f(\bar{x})\) at \(\bar{x}\in K\). If \(\bar{x}\) is an optimal solution of (1), then
$$\begin{aligned} 0\in cl\big (co(\partial ^* f(\bar{x}))+N_K(\bar{x})\big ). \end{aligned}$$
Proof
We claim
$$\begin{aligned} f^+(\bar{x}; d)\ge 0,~~~\forall d\in D_K(\bar{x}). \end{aligned}$$(10)
By indirect proof, suppose \(f^+(\bar{x};d)=\displaystyle \sup _{\eta \in \partial ^* f(\bar{x})}\langle \eta , d\rangle < 0\) for some \(d\in D_K(\bar{x})\). This implies the existence of some \(t>0\) satisfying \(\bar{x}+td\in K\) and \(f(\bar{x}+td)< f(\bar{x}),\) which contradicts the optimality of \(\bar{x}\). From (10), we have
$$\begin{aligned} \sup _{\eta \in \partial ^* f(\bar{x})}\langle \eta , d\rangle \ge 0,~~~\forall d\in T_K(\bar{x}). \end{aligned}$$
Therefore,
$$\begin{aligned} \sup _{\eta \in \partial ^* f(\bar{x})}\langle \eta , d\rangle + I_{T_K(\bar{x})}(d)\ge 0,~~~\forall d\in {\mathbb {R}}^n, \end{aligned}$$
where \(I_{T_K(\bar{x})}(\cdot )\) stands for the indicator function (\(I_{T_K(\bar{x})}(d)\) equals zero if \(d\in T_K(\bar{x})\) and \(\infty \) otherwise). By Hiriart-Urruty and Lemarechal (1993, Example V.2.3.1), \(I_{T_K(\bar{x})}(d)=\displaystyle \sup \nolimits _{\eta \in N_K(\bar{x})} \langle \eta , d\rangle \) for each d. Thus
$$\begin{aligned} \sup _{\eta \in co(\partial ^* f(\bar{x}))+N_K(\bar{x})}\langle \eta , d\rangle \ge 0,~~~\forall d\in {\mathbb {R}}^n. \end{aligned}$$
Now, invoking Hiriart-Urruty and Lemarechal (1993, Theorem V.3.3.1), we have
$$\begin{aligned} 0\in cl\big (co(\partial ^* f(\bar{x}))+N_K(\bar{x})\big ), \end{aligned}$$
and the proof is completed. \(\square \)
Corollary 2
Assume that \(\bar{x}\in K\), \(int\, K\ne \emptyset \), GLCQ holds, and f has a bounded URC \(\partial ^*f(\bar{x})\) at \(\bar{x}\). Furthermore, assume that \(g_i\) for \(i\notin I(\bar{x})\) is upper semicontinuous at \(\bar{x}\). If \(\bar{x}\) is an optimal solution of (1) and \(cone(\varGamma (\bar{x}))\) is closed, then there exist \(\lambda _i\ge 0\), \(i\in I(\bar{x})\), such that
$$\begin{aligned} 0\in co(\partial ^* f(\bar{x}))+\sum _{i\in I(\bar{x})}\lambda _i co(\partial ^* g_i(\bar{x})). \end{aligned}$$
Proof
It results from Theorem 4 and Remark 2. \(\square \)
Theorem 4 and Corollary 2 present necessary optimality conditions. In Theorem 5, we obtain a sufficient optimality condition in the presence of an asymptotic pseudoconvex objective function. Along the lines of Yang (2005), we use the following generalized definition of pseudoconvexity w.r.t. convexificators.
Definition 10
Assume that \(h:\varOmega \subseteq {\mathbb {R}}^n \rightarrow {\mathbb {R}}\cup \{\infty \}\) admits a URC at \(\bar{x}\in \varOmega \). The function h is said to be asymptotic pseudoconvex (a-pseudoconvex in brief) at \(\bar{x}\) if for every \(y\in \varOmega \),
$$\begin{aligned} \sup _{\zeta \in \partial ^* h(\bar{x})}\langle \zeta , y-\bar{x}\rangle \ge 0 ~~\Longrightarrow ~~ h(y)\ge h(\bar{x}). \end{aligned}$$
This function is called a-pseudoconvex on \(\varOmega \) if it is a-pseudoconvex at any point of \(\varOmega \).
Theorem 5
Assume that f is a-pseudoconvex with a URC \(\partial ^*f(\bar{x})\) at \(\bar{x}\in K\). If there exist \(\lambda _i\ge 0\), \(i\in I(\bar{x})\), such that
$$\begin{aligned} 0\in co(\partial ^* f(\bar{x}))+\sum _{i\in I(\bar{x})}\lambda _i co(\partial ^* g_i(\bar{x})), \end{aligned}$$(11)
then \(\bar{x}\) is an optimal solution of (1).
Proof
According to the first part of the proof of Theorem 2, Eq. (11) implies \(0\in co(\partial ^* f(\bar{x}))+N_K(\bar{x})\). Hence, \(-\eta \in N_K(\bar{x})\) for some \(\eta \in co(\partial ^* f(\bar{x}))\), which implies
$$\begin{aligned} \langle \eta , y-\bar{x}\rangle \ge 0,~~~\forall y\in K. \end{aligned}$$
Now, the proof is completed due to the a-pseudoconvexity of f at \(\bar{x}\). \(\square \)
Corollary 3
Let \(\bar{x}\in K\). Assume that \(g_i\) for \(i\notin I(\bar{x})\) is upper semicontinuous at \(\bar{x}\), \(int\, K\ne \emptyset \) and GLCQ holds at \(\bar{x}\). Furthermore, assume that f has a bounded URC \(\partial ^*f(\bar{x})\). If \(\bar{x}\) is an optimal solution of (1) and, for each \(i\in I(\bar{x})\), \(cone(\partial ^* g_i(\bar{x}))\) is closed, then there exist \(\lambda _i\ge 0\), \(i\in I(\bar{x})\), such that
$$\begin{aligned} 0\in co(\partial ^* f(\bar{x}))+\sum _{i\in I(\bar{x})}\lambda _i B_i, \end{aligned}$$
where \(B_i\) is a convex compact base of \(cone(\partial ^* g_i(\bar{x}))\). If f is a-pseudoconvex at \(\bar{x}\), this condition is sufficient for optimality.
Proof
Apply Corollary 1, Theorem 4, and Theorem 5. \(\square \)
As mentioned before, in the current subsection the KKT conditions are obtained from optimality directly, in contrast to Dutta and Lalitha (2013, Theorem 2.4), Lasserre (2010, Theorem 2.3), and Martínez-Legaz (2015, Theorem 9). Furthermore, here we have used the upper regular convexificator instead of the usual gradient [used in Lasserre (2010)], the Clarke generalized gradient [used in Dutta and Lalitha (2013)], and the tangential subdifferential [used in Martínez-Legaz (2015)]. Moreover, here the \(g_i\) functions are not assumed to be locally Lipschitz or directionally differentiable. It is worth mentioning that Dutta and Lalitha (2013, Theorem 2.4), Lasserre (2010, Theorem 2.3), and Martínez-Legaz (2015, Theorem 9) result from Corollary 2 and Theorem 5.
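On a smooth instance the URC reduces to the gradient singleton, so the KKT condition can be verified by direct computation. A minimal sketch on an instance of our own choosing (not from the paper): minimize \(f(x)=x_1+x_2\) over Lasserre's set \(K=\{x\in {\mathbb {R}}^2_{+} : 1-x_1x_2\le 0\}\), whose minimizer is \(\bar{x}=(1,1)\):

```python
def grad_f(x1, x2):
    # f(x) = x1 + x2, so grad f is constant
    return (1.0, 1.0)

def grad_g(x1, x2):
    # g(x) = 1 - x1*x2; g is active at xbar since g(1, 1) = 0
    return (-x2, -x1)

xbar = (1.0, 1.0)
lam = 1.0   # candidate KKT multiplier for the single active constraint

# stationarity: 0 in grad f(xbar) + lam * grad g(xbar)
residual = tuple(gf + lam * gg
                 for gf, gg in zip(grad_f(*xbar), grad_g(*xbar)))
assert residual == (0.0, 0.0)

# complementary slackness: lam * g(xbar) = 0
g_val = 1.0 - xbar[0] * xbar[1]
assert lam * g_val == 0.0
print("(1, 1) is a KKT point with multiplier", lam)
```

Since f is linear (hence a-pseudoconvex), Theorem 5 confirms that this KKT point is optimal.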
We conclude this subsection with an example. In the following example, the constraint function g is not locally Lipschitz while the feasible set is closed and convex.
Example 1
Consider
where \(f(x_1,x_2)=-x_1-x_2\) and
Set \(K=\{ (x_1,x_2) : g(x_1,x_2)\le 0\}\). It can be seen that
The sets \(\partial ^* f(0,0)=\{(-1,-1)\}\) and \(\partial ^* g(0,0)=\{ (1+ t , 1+ \frac{1}{t}) : t>0\}\cup \{(1,1)\}\) are URCs of f and g at \(\bar{x}=(0,0)\), respectively. Furthermore, \((0,0)\notin \partial ^* g(0,0)\), so GLCQ holds. On the other hand, \((1,1)\in \partial ^* g(0,0)\); hence \((0,0)\in \partial ^* f(0,0)+\partial ^* g(0,0)\). Therefore, (0, 0) is a KKT point and hence an optimal solution.
5.3 Boundedness of KKT multipliers
This subsection is devoted to a result about the boundedness of the set of KKT multipliers. The boundedness of KKT multipliers is very useful in sketching numerical algorithms and duality results (Li and Zhang 2010; Nguyen et al. 1980). Without loss of generality, assume that \(I(\bar{x})=\{1, \ldots , s\}\subseteq I\). Set
$$\begin{aligned} \mathcal {M}(\bar{x}):=\Big \{(\lambda _1, \ldots , \lambda _s)\in {\mathbb {R}}^s_{+} : 0\in co(\partial ^* f(\bar{x}))+\sum _{i=1}^{s}\lambda _i B_i\Big \}, \end{aligned}$$
where \(B_i\) is a convex compact base of \(cone(\partial ^* g_i(\bar{x}))\).
Theorem 6
Let \(\bar{x}\in K\). Assume that \(int\, K\ne \emptyset \), GLCQ holds at \(\bar{x}\), and the objective function f has a bounded URC \(\partial ^*f(\bar{x})\). Then \(\mathcal {M}(\bar{x})\) is a bounded set.
Proof
Arguing by contradiction, without loss of generality, assume that there is a sequence \(\{(\lambda _1^k, \ldots , \lambda _s^k)\}_{k\in {\mathbb {N}}}\subseteq \mathcal {M}(\bar{x})\) such that \(\lambda _1^k\rightarrow \infty \) as \(k\rightarrow \infty \). For each \(k\in {\mathbb {N}}\), there exist \(\eta ^k\in co(\partial ^* f(\bar{x}))\) and \(\zeta _i^k\in B_i\) such that
$$\begin{aligned} 0=\eta ^k+\sum _{i=1}^{s}\lambda _i^k \zeta _i^k. \end{aligned}$$
In a manner similar to the second part of the proof of Theorem 2, we have \(\langle \eta _i, y-\bar{x}\rangle <0\) for each \(i\in I(\bar{x})\), each \(\eta _i\in \partial ^* g_i(\bar{x})\), and each \(y\in int\, K\). Thus
$$\begin{aligned} 0=\langle \eta ^k, y-\bar{x}\rangle +\sum _{i=1}^{s}\lambda _i^k \langle \zeta _i^k, y-\bar{x}\rangle \le \langle \eta ^k, y-\bar{x}\rangle +\lambda _1^k \langle \zeta _1^k, y-\bar{x}\rangle . \end{aligned}$$
Since \(co(\partial ^* f(\bar{x}))\) and \(B_1\) are compact, by working with subsequences if necessary, one may assume \(\eta ^k\rightarrow \eta \in co(\partial ^* f(\bar{x}))\) and \(\zeta _1^k\rightarrow \zeta _1\in B_1\) as \(k\rightarrow \infty \). Hence, letting \(k\rightarrow \infty \) and taking \(\langle \zeta _1, y-\bar{x}\rangle <0\) and \(\lambda _1^k\rightarrow \infty \) into account, we get
$$\begin{aligned} 0\le \langle \eta , y-\bar{x}\rangle +\lim _{k\rightarrow \infty }\lambda _1^k \langle \zeta _1^k, y-\bar{x}\rangle =-\infty . \end{aligned}$$
This contradiction proves the theorem. \(\square \)
6 Characterizations of the solution set
6.1 Convexificators at optimality
In this subsection, we characterize the solution set of the optimization problem (1) in terms of convexificators, invoking the KKT-type conditions obtained in Sect. 5. Mangasarian (1988) first introduced several characterizations of the solution set of a convex optimization problem. Since then, many attempts have been made on this subject, especially toward weakening the convexity requirements on the objective function and generalizing the results to the nonsmooth case; see e.g. Burke and Ferris (1991), Dinh et al. (2006), Zhao and Yang (2013), and the references therein.
Assume that the solution set of (1), denoted by
$$\begin{aligned} S:=\{x\in K : f(x)\le f(y),~ \forall y\in K\}, \end{aligned}$$
is nonempty. First we obtain some inclusion characterizations.
Proposition 1
Let \(\bar{x}\in S\) be given. Assume that S is convex. Then we have
Furthermore, for any convex set A satisfying \(S\subseteq A\subseteq K\), we have
Proof
We prove only \(S\subseteq S_1\). The proof of the other inclusions is similar. Let \(x\in S\) be arbitrary. By the convexity of S, for sufficiently small \(t>0\), \(\bar{x}+t(x - \bar{x})\in S\). This implies \(f^+(\bar{x}; x - \bar{x})=0\), leading to \(\langle \eta , x - \bar{x}\rangle \le 0\) for each \(\eta \in \partial ^*f(\bar{x})\). Therefore, \(\langle \eta , x - \bar{x}\rangle =0\) for each \(\eta \in co(\partial ^*f(\bar{x}))\cap -N_K(\bar{x})\). Let \(y\in K\) be arbitrary. For each \(\bar{\eta }\in co(\partial ^*f(\bar{x}))\cap -N_K(\bar{x})\), we have
$$\begin{aligned} \langle \bar{\eta }, y-x\rangle =\langle \bar{\eta }, y-\bar{x}\rangle +\langle \bar{\eta }, \bar{x}-x\rangle =\langle \bar{\eta }, y-\bar{x}\rangle \ge 0. \end{aligned}$$
Hence, \(\bar{\eta }\in -N_K(x)\) and the proof is completed. \(\square \)
The convexity of S assumed in this subsection holds if f is quasiconvex. If f is continuous and a-pseudoconvex, then it is quasiconvex (Yang 2005, Definitions 3.1 and 3.2 and Theorem 3.1), and hence S is convex.
Proposition 2 provides a full characterization of the solution set S.
Proposition 2
Let \(\bar{x}\in S\) be given. Assume that S is convex, \(co(\partial ^* f(\bar{x})) + N_K(\bar{x})\) is closed, and f is a-pseudoconvex with a URC on K. Then \(S=S_3=S_4=S_5\), where
$$\begin{aligned} S_3&:=\{x\in K : \exists \zeta \in co(\partial ^*f(x))\cap -N_K(x)~ \text {with}~ \langle \zeta , \bar{x}-x\rangle =0\},\\ S_4&:=\{x\in K : \exists \zeta \in co(\partial ^*f(x))~ \text {with}~ \langle \zeta , \bar{x}-x\rangle \ge 0\},\\ S_5&:=\{x\in K : \exists \zeta \in co(\partial ^*f(x))~ \text {with}~ \langle \eta , x-\bar{x}\rangle \le \langle \zeta , \bar{x}-x\rangle ,~ \forall \eta \in \partial ^*f(\bar{x})\}. \end{aligned}$$
Proof
It is clear that \(S_3\subseteq S_4\). Furthermore, from a-pseudoconvexity, we get \(S_4\subseteq S\). We establish \(S\subseteq S_3\), \(S\subseteq S_5\), and \(S_5\subseteq S_4\).
\(S\subseteq S_3\): Let \(x\in S\). By Theorem 4 and closedness of \(co(\partial ^* f(\bar{x})) + N_K(\bar{x})\), there exists some \(\zeta \in co(\partial ^*f(x))\cap -N_K(x)\), and similar to the proof of Proposition 1 we get \(\langle \zeta , \bar{x}- x\rangle =0\).
\(S\subseteq S_5\): Let \(x\in S\). Similar to the preceding part, there exists some \(\zeta \in co(\partial ^*f(x))\) satisfying \(\langle \zeta , \bar{x}- x\rangle =0\). As S is convex and \(\bar{x}\in S\), we have \(f^+(\bar{x}; x - \bar{x})=0\). Thus for each \(\eta \in \partial ^*f(\bar{x})\), \(\langle \eta , x - \bar{x}\rangle \le 0=\langle \zeta , \bar{x}- x \rangle \).
\(S_5\subseteq S_4\): Let \(x\in S_5\). Since \(\bar{x}\in S\) and K is convex, \(f^+(\bar{x}; x -\bar{x})\ge 0\). Hence, due to \( x\in S_5\), there exists some \(\zeta \in co(\partial ^*f(x))\) such that
$$\begin{aligned} \langle \zeta , \bar{x}-x\rangle \ge \sup _{\eta \in \partial ^*f(\bar{x})}\langle \eta , x-\bar{x}\rangle =f^+(\bar{x}; x-\bar{x})\ge 0. \end{aligned}$$
\(\square \)
6.2 Linear approximation
In this subsection, we obtain a linear semi-infinite problem (Goberna and López 1998) to check the optimality of a given feasible solution of (1). To the best of our knowledge, Soleimani-damaneh (2008, 2010) is the first scholar who dealt with this problem under locally Lipschitz data.
Let \(\bar{x}\) be a feasible solution of (1). Assume that f has a URC \(\partial ^*f(\bar{x})\) at \(\bar{x}\). Let \(\bar{\eta }\in co(\partial ^*f(\bar{x}))\). Consider the following linear semi-infinite programming problem:
$$\begin{aligned} \min \ f(\bar{x})+\langle \bar{\eta }, x-\bar{x}\rangle \quad \text {s.t.}\quad x\in K_1, \end{aligned}$$(12)
where
$$\begin{aligned} K_1:=\{x\in {\mathbb {R}}^n : \langle \zeta , x-\bar{x}\rangle \le 0,~ \forall \zeta \in \partial ^* g_i(\bar{x}),~ \forall i\in I(\bar{x})\}. \end{aligned}$$
Theorems 7 and 8 study the relationships between Problems (1) and (12).
Theorem 7
Assume that \(\partial ^*f(\bar{x})\) is bounded, \(int K\ne \emptyset \) and GLCQ holds at \(\bar{x}\). Furthermore, assume that \(g_i\) for \(i\notin I(\bar{x})\) is upper semicontinuous at \(\bar{x}\). If \(\bar{x}\) is an optimal solution of (1) and \(cone(\varGamma (\bar{x}))\) is closed, then \(\bar{x}\) is an optimal solution of (12) for some \(\bar{\eta }\in co(\partial ^*f(\bar{x}))\).
Proof
From Corollary 2, there exist \(\lambda _i\ge 0\), \(i\in I(\bar{x})\) such that
Therefore,
for some \(\bar{\eta }\in co (\partial ^* f(\bar{x}))\) and \(\bar{\zeta }_i\in co(\partial ^* g_i(\bar{x}))\). Let \(x'\) be a feasible solution to (12) corresponding to \(\bar{\eta }\). By (13),
Therefore, \(\bar{x}\) is an optimal solution for (12) corresponding to \(\bar{\eta }\). \(\square \)
We now prove the converse of Theorem 7.
Theorem 8
Assume that f is a-pseudoconvex with a URC \(\partial ^*f(\bar{x})\) at \(\bar{x}\). If \(\bar{x}\) is an optimal solution to (12) for some \(\bar{\eta }\in co(\partial ^*f(\bar{x}))\), then \(\bar{x}\) is an optimal solution to (1).
Proof
Let \(x\in K\) be arbitrary. Since K is convex, for each \(i\in I(\bar{x})\), \(g_i^+(\bar{x}; x - \bar{x})\le 0\). Hence,
which leads to \(x\in K_1\). Therefore, \(K\subseteq K_1\) which implies \(N_{K_1}(\bar{x})\subseteq N_{K}(\bar{x})\).
Since \(\bar{x}\) solves (12) for some \(\bar{\eta }\in co(\partial ^*f(\bar{x}))\), the objective function \(f(\bar{x}) + \langle \bar{\eta }, x - \bar{x}\rangle \) is linear in x, and the feasible set \(K_1\) is convex, we have \(0\in \bar{\eta }+ N_{K_1}(\bar{x}).\) Hence,
Therefore, \(\bar{x}\) solves (1), by the a-pseudoconvexity assumption on f. \(\square \)
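The optimality test behind Theorems 7 and 8 can be sketched numerically: discretize the semi-infinite constraint system of (12) by finitely many elements \(\zeta\) of the convexificators of the active constraints, and check whether any feasible direction \(d = x - \bar{x}\) strictly decreases the linearized objective \(\langle \bar{\eta }, d\rangle \). The data below (the vector `eta_bar` and the two `zetas`) are illustrative assumptions, not taken from the paper, and the sampling is only a heuristic necessary check.

```python
import random

# Illustrative data (assumed, not from the paper): a point in R^2 with
# eta_bar in co(d*f(x_bar)) and two convexificator elements of active g_i.
eta_bar = [1.0, 1.0]
zetas = [[-1.0, 0.0], [0.0, -1.0]]  # linearized constraints <zeta, d> <= 0

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def is_feasible_direction(d, tol=1e-12):
    # d is feasible for K_1 - x_bar if <zeta, d> <= 0 for every sampled zeta
    return all(dot(z, d) <= tol for z in zetas)

# Heuristic check: sample directions in a box; x_bar passes the test if no
# sampled feasible direction strictly decreases the linearized objective,
# i.e. the best sampled value never drops below the value 0 at d = 0.
random.seed(0)
best = 0.0
for _ in range(20000):
    d = [random.uniform(-1.0, 1.0) for _ in eta_bar]
    if is_feasible_direction(d):
        best = min(best, dot(eta_bar, d))

print(best >= -1e-9)  # True: no sampled feasible descent direction exists
```

This only probes the necessary condition of Theorem 7 on a finite sample; the converse direction, that passing the linearized test certifies optimality in (1), is what Theorem 8 supplies under a-pseudoconvexity.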
6.3 An example
To illustrate our results, we look at a specific model arising in stochastic programming with probabilistic constraints. Assume that A is an \(s\times n\) matrix and \(C\subseteq {\mathbb {R}}^n\) is a convex set. Consider the following optimization problem:
where \(p\in (0,1)\) is given and \(F(\cdot )\) corresponds to a probability distribution function; see Henrion and Römisch (1999, pp. 55–56) for more detail about applications of this model.
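A common way such models are written in the chance-constrained literature (a sketch, under the assumption that the probabilistic constraint involves a random right-hand side \(\xi \) with distribution function F) is:

```latex
\min\; f(x)
\quad\text{s.t.}\quad
\mathbb{P}(Ax \ge \xi) = F(Ax) \ge p,
\quad x \in C,
```

so that the probabilistic constraint reduces to a deterministic constraint through F, which in general is neither convex, nor differentiable, nor even continuous.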
Example 2
Consider
where
and the probability distribution function F is given by
The function F is neither convex nor concave, while
is a convex set. Both F and f are discontinuous at \(\bar{x}=1\) while two sets \(\partial ^*F(\bar{x})=[\frac{1}{2}, \infty )\) and \(\partial ^*f(\bar{x})=(-\infty , -1]\) are URCs of these functions at \(\bar{x}\), respectively.
Let \(y\in K\). Let \(\{\eta _k\}_{k\in {\mathbb {N}}}\subseteq \partial ^*f(\bar{x})\) be such that \(\displaystyle \lim \nolimits _{k\longrightarrow \infty }\eta _k(y - \bar{x})\ge 0\). Due to \(\eta _k\le -1,\) this implies \(y\le \bar{x}\) and then \(f(y)\ge f(\bar{x})\). Therefore, f is a-pseudoconvex at \(\bar{x}\). Furthermore, \(int\, K\ne \emptyset \). Moreover, \(F(\bar{x})=\frac{1}{2}\) and \(0\notin \partial ^*F(\bar{x})\), leading to GLCQ at \(\bar{x}\). Thus, according to Theorem 2,
and one may consider \(B=\{1\}\) as a convex compact base for this closed pointed convex cone. Since \(0=\eta +\lambda \zeta \) with \(\eta =-1\in \partial ^*f(\bar{x})\), \(\lambda =1\), and \(\zeta =1\in B\), by Theorem 5 (sufficient part in Corollary 3), \(\bar{x}\) is an optimal solution.
Notice that here the objective function is quasiconvex and hence the solution set S is convex. Now consider \(\hat{x}\in K\), \(\hat{x}\ne \bar{x}\). We have \(\hat{x}<1\), and f is differentiable at \(\hat{x}\) with \(f'(\hat{x})=-1\). Hence, \(f'(\hat{x})(\hat{x}-\bar{x})\ne 0\), which implies \(\hat{x}\notin S\) by Proposition 2. Therefore, \(S=\{\bar{x}\}\).
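The certificate in Example 2 can be checked mechanically. The paper's explicit formulas for f and F appear in displayed equations; purely as an illustration we use a stand-in f that matches the stated properties (discontinuous at \(\bar{x}=1\), \(f'(x)=-1\) for \(x<1\), minimized over \(K\subseteq (-\infty ,1]\) at \(\bar{x}\)), together with the multipliers \(\eta =-1\), \(\lambda =1\), \(\zeta =1\) from the example:

```python
# Illustrative stand-in for Example 2 (the paper's exact formulas for f and F
# are in its displayed equations; this f merely matches the stated properties:
# discontinuous at x_bar = 1, f'(x) = -1 for x < 1, minimized over K at x_bar).
def f(x):
    return -2.0 if x == 1.0 else -x

x_bar = 1.0

# KKT-type identity from the example: 0 = eta + lam * zeta with
# eta = -1 in d*f(x_bar), lam = 1, and zeta = 1 in the base B = {1}.
eta, lam, zeta = -1.0, 1.0, 1.0
print(eta + lam * zeta == 0.0)  # True

# Check minimality of x_bar over a grid of K (assumed here K = (-inf, 1]):
grid = [x_bar - 0.001 * k for k in range(4000)]  # points in [-3, 1]
print(all(f(y) >= f(x_bar) for y in grid))  # True: f(y) = -y >= -1 > -2
```

The grid check mirrors the a-pseudoconvexity argument: every feasible y lies to the left of \(\bar{x}\), and there f does not fall below \(f(\bar{x})\).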
References
Bazaraa MS, Sherali HD, Shetty CM (2006) Nonlinear programming. Wiley, New Jersey
Burke JV, Ferris MC (1991) Characterization of solution sets of convex programs. Oper Res Lett 10:57–60. doi:10.1016/0167-6377(91)90087-6
Cerreia-Vioglio S, Maccheroni F, Marinacci M, Montrucchio L (2011) Risk measures: rationality and diversification. Math Financ 21:743–774. doi:10.1111/j.1467-9965.2010.00450.x
Clarke FH, Ledyaev YS, Stern RJ, Wolenski PR (1998) Nonsmooth analysis and control theory. Springer, New York
Demyanov VF, Rubinov AM (1995) Constructive nonsmooth analysis. Peter Lang, Frankfurt
Dentcheva D (2006) Optimization models with probabilistic constraints. In: Calafiore G, Dabbene F (eds) Probabilistic and randomized methods for design under uncertainty. Springer, London
Dinh N, Jeyakumar V, Lee GM (2006) Lagrange multiplier characterizations of solution sets of constrained pseudolinear optimization problems. Optimization 55:241–250. doi:10.1080/02331930600662849
Dutta J, Chandra S (2002) Convexifactors, generalized convexity, and optimality conditions. J Optim Theory Appl 113:41–64. doi:10.1023/A:1014853129484
Dutta J, Chandra S (2004) Convexifactors, generalized convexity and vector optimization. Optimization 53:77–94. doi:10.1080/02331930410001661505
Dutta J, Lalitha CS (2013) Optimality conditions in convex optimization revisited. Optim Lett 7:221–229. doi:10.1007/s11590-011-0410-3
Goberna MA, López MA (1998) Linear semi-infinite optimization. Wiley, Chichester
Henrion R, Römisch W (1999) Metric regularity and quantitative stability in stochastic programs with probabilistic constraints. Math Program 84:55–88. doi:10.1007/s10107980016a
Hiriart-Urruty JB, Lemarechal C (1993) Convex analysis and minimization algorithms I. Springer, Berlin
Imo II, Leech DJ (1984) Discontinuous optimization in batch production using SUMT. Int J Prod Res 22:313–321. doi:10.1080/00207548408942456
Jeyakumar V, Luc DT (1999) Nonsmooth calculus, minimality, and monotonicity of convexificators. J Optim Theory Appl 101:599–621. doi:10.1023/A:1021790120780
Jeyakumar V, Luc DT (2008) Nonsmooth vector functions and continuous optimization. Springer, New York
Lasserre JB (2010) On representations of the feasible set in convex optimization. Optim Lett 4:1–5. doi:10.1007/s11590-009-0153-6
Li XF, Zhang JZ (2010) Existence and boundedness of the Kuhn-Tucker multipliers in nonsmooth multiobjective optimization. J Optim Theory Appl 145:373–386. doi:10.1007/s10957-009-9644-y
Luc DT (1989) Theory of vector optimization. Springer, Berlin
Luc DT (2002) A multiplier rule for multiobjective programming problems with continuous data. SIAM J Optim 13:168–178. doi:10.1137/S1052623400378286
Luedtke J, Ahmed S (2008) A sample approximation approach for optimization with probabilistic constraints. SIAM J Optim 19:674–699. doi:10.1137/070702928
Luu DV (2014) Necessary and sufficient conditions for efficiency via convexificators. J Optim Theory Appl 160:510–526. doi:10.1007/s10957-013-0377-6
Mangasarian OL (1988) A simple characterization of solution sets of convex programs. Oper Res Lett 7:21–26. doi:10.1016/0167-6377(88)90047-8
Marcille S, Ciblat P, Martret CJL (2012) Resource allocation for type-I HARQ based wireless Ad Hoc networks. IEEE Wirel Commun Lett 1:597–600. doi:10.1109/WCL.2012.083012.120519
Martínez-Legaz JE (2015) Optimality conditions for pseudoconvex minimization over convex sets defined by tangentially convex constraints. Optim Lett 9:1017–1023. doi:10.1007/s11590-014-0822-y
Michel P, Penot JP (1992) A generalized derivative for calm and stable functions. Differ Integral Equ 5:433–454
Moreau L, Aeyels D (2000) Optimization of discontinuous functions: a generalized theory of differentiation. SIAM J Optim 11:53–69. doi:10.1137/S1052623499354679
Nemirovski A, Shapiro A (2006) Convex approximations of chance constrained programs. SIAM J Optim 17:969–996. doi:10.1137/050622328
Nguyen VH, Strodiot J-J, Mifflin R (1980) On conditions to have bounded multipliers in locally Lipschitz programming. Math Program 18:100–106. doi:10.1007/BF01588302
Rockafellar RT, Wets RJB (1998) Variational analysis. Springer, Berlin
Schaible S (1981) Fractional programming: applications and algorithms. Eur J Oper Res 7:111–120. doi:10.1016/0377-2217(81)90272-1
Schneider R (1994) Convex bodies: the Brunn–Minkowski theory. Cambridge University Press, Cambridge
Soleimani-damaneh M (2008) Infinite (semi-infinite) problems to characterize the optimality of nonlinear optimization problems. Eur J Oper Res 188:49–56. doi:10.1016/j.ejor.2007.04.026
Soleimani-damaneh M (2010) Nonsmooth optimization using Mordukhovich’s subdifferential. SIAM J Control Optim 48:3403–3432. doi:10.1137/070710664
Tishler A, Zang I (1979) A switching regression method using inequality conditions. J Econom 11:259–274. doi:10.1016/0304-4076(79)90040-X
Yang XQ (2005) Continuous generalized convex functions and their characterizations. Optimization 54:495–506. doi:10.1080/02331930500100163
Zhao KQ, Yang XM (2013) Characterizations of the solution set for a class of nonsmooth optimization problems. Optim Lett 7:685–694. doi:10.1007/s11590-012-0471-y
Acknowledgements
The authors would like to express their gratitude to the associate editor and anonymous referees for their helpful comments on the first version of the paper. This work has been supported by the Center for International Scientific Studies and Collaboration (CISSC). The research of the second author was in part supported by a grant from IPM (No. 95260124).
Cite this article
Kabgani, A., Soleimani-damaneh, M. & Zamani, M. Optimality conditions in optimization problems with convex feasible set using convexificators. Math Meth Oper Res 86, 103–121 (2017). https://doi.org/10.1007/s00186-017-0584-2