Abstract
Using an augmented Lagrangian approach, we study the existence of augmented Lagrange multipliers of a semi-infinite programming problem and discuss their characterizations in terms of saddle points. In the case of a sharp Lagrangian, we obtain a first-order necessary condition for the existence of an augmented Lagrange multiplier for the semi-infinite programming problem and some first-order sufficient conditions by assuming inf-compactness of the data functions and the extended Mangasarian–Fromovitz constraint qualification. Using a valley at 0 augmenting function and assuming suitable second-order sufficient conditions, we obtain the existence of an augmented Lagrange multiplier for the semi-infinite programming problem.
1 Introduction
The semi-infinite programming problem (SIP in short) has attracted much attention due to its various applications in engineering design, optimal control, economic equilibria, etc. It has become an active field of research in applied mathematics. For example, see Fiacco and Kortanek [1], Goberna and López [2], Polak [3], Polak and Wuu [4], Polak [5] and the references therein. The main difficulty in solving (SIP) comes from the fact that it has an infinite number of constraints, which makes it hard to design algorithms for solving it. However, in recent years, many effective methods have been proposed for solving semi-infinite programming, such as discretization methods (see Hettich [6], Still [7]), reduction methods (see Hettich and Kortanek [8], Reemtsen and Görner [9]) and exchange methods (see Goberna and López [2], Hettich and Kortanek [8], Reemtsen and Görner [9], Zhang et al. [10]).
For studying duality properties of (SIP), we will use an augmented Lagrangian approach, which is crucial for dealing with nonconvex and constrained optimization problems. The augmented Lagrangian function was first proposed independently by Hestenes [11] and Powell [12] to solve an optimization problem with equality constraints. This was extended to solve an inequality constrained optimization problem by Buys [13]. The augmented Lagrangian method has been used to solve an optimization problem with both equality and inequality constraints in Rockafellar [14]. Under a growth condition, one can relate the value of the dual problem to the behavior of the perturbation function. In Rockafellar [14], the definition of a quadratic growth condition was introduced and a necessary and sufficient condition for a zero duality gap between a constrained optimization problem and its augmented Lagrangian dual problem was obtained. In Rockafellar and Wets [15, Section 11K*], an augmented Lagrangian with a convex augmenting function was introduced and the corresponding zero duality property was established. The definitions of several growth conditions were delineated by Penot [16]. By assuming that the augmenting function satisfies the level coercivity condition and the perturbation function satisfies some growth condition, several zero duality gap results between the primal problem and its augmented Lagrangian dual problem were obtained in Penot [16]. A level-bounded augmenting function was given in Huang and Yang [17], where the convexity of augmenting functions in Rockafellar and Wets [15] was replaced by a level-boundedness condition. A valley at 0 augmenting function was given in Zhou and Yang [18], where the level-boundedness condition of the augmenting functions in Huang and Yang [17] was replaced by the valley at 0 property. The latter property allows one to introduce a duality scheme with zero duality gap.
More general frameworks for establishing zero duality gap and the existence of exact penalty parameters were introduced in [19,20,21]. Optimization problems in which the constraints are given in terms of a set inclusion were considered by Shapiro and Sun [22]. Using a quadratic growth condition (a property originally introduced by Rockafellar [14]), and a quadratic augmented Lagrangian, Shapiro and Sun established strong duality and second-order conditions ensuring existence of Lagrange multipliers (see [22, Theorems 2.4 and 3.4]). The concept of augmented Lagrange multiplier is related with that of a global saddle point of augmented Lagrangian functions. This connection was exploited by Sun et al. [23], who considered four classes of augmented Lagrangian functions. Under second-order sufficient conditions and certain additional conditions, Sun et al. [23] established the existence of local and global saddle points of these four kinds of augmented Lagrangian functions. Rückmann and Shapiro [24] studied the (SIP) via an augmented Lagrangian approach with a nonnegative convex augmenting function and a reduction approach method. They obtained some necessary and sufficient conditions for the existence of an augmented Lagrange multiplier. Furthermore, they discussed two particular cases: the proximal Lagrangian and the sharp Lagrangian.
Sufficient conditions for the existence of an augmented multiplier of an augmented Lagrangian with a valley at 0 augmenting function of a cone constrained optimization problem were obtained in Zhou et al. [25] by using local saddle point conditions and bounded conditions of a perturbed constraint set. In the present paper, we will use the augmented Lagrangian to study \(\text {(SIP)}\). Firstly, we use the sharp Lagrangian to derive first-order necessary conditions and first-order sufficient conditions for the existence of an augmented Lagrange multiplier for \(\text {(SIP)}\) (see Theorems 3.1 and 3.2). Secondly, we consider an augmented Lagrangian with an augmenting function having a valley at 0 property. This allows us to obtain the existence of an augmented Lagrange multiplier for \(\text {(SIP)}\) under two possible assumptions: (i) that \(\text {(SIP)}\) has a unique global optimal solution, and (ii) that the objective and the constraint functions satisfy a generalized representation condition originally introduced in [26].
Since the augmenting function used in this paper may be nonconvex and non-quadratic, standard techniques such as those used in Rückmann and Shapiro [24] and in Shapiro and Sun [22] to show the existence of augmented Lagrange multipliers cannot be applied here. For the same reasons, the saddle point results we derive cannot make use of the techniques used in Sun et al. [23]. As a result, both the results and the proof techniques we use in the present paper are different from those in the last three references.
The structure of the paper is outlined as follows: in Sect. 2, we introduce some notations and preliminary results needed in the sequel of this article. In Sect. 3, we obtain some first-order optimality conditions for the existence of an augmented Lagrange multiplier of the sharp Lagrangian problem. In Sect. 4, we obtain some second-order conditions for the existence of an augmented Lagrange multiplier of a valley at 0 augmented Lagrangian for \((\text {SIP})\). The last section, Sect. 5, contains some concluding remarks and an open question.
2 Preliminaries
We consider a semi-infinite programming problem (SIP, in short) in the following form
where \(\varOmega \) is a (possibly infinite) nonempty set, \(f: {\mathbb {R}}^n\rightarrow {\mathbb {R}}\) and \( g:{\mathbb {R}}^n\times \varOmega \rightarrow {\mathbb {R}}\) are real-valued functions. (SIP) is said to be convex if f and \(g(\cdot ,\omega ) (\omega \in \varOmega )\) are convex.
Let \({\mathbb {R}}_+:=[0,+\infty )\), \({\mathbb {R}}_{++}:=(0,+\infty )\), \({\mathbb {R}}_{\pm \infty }:=[-\infty ,+\infty ]\), and let the indicator function of a set \(Z \subset {\mathbf {Y}}\) be defined by \(\delta _Z(y) = 0, \text{ if } y \in Z, \text{ and } \delta _Z(y) =+\infty , \text{ otherwise. }\)
Let \(\varOmega \) be a nonempty set and \({\mathbb {R}}^{\varOmega }\) be the set of all functions from \(\varOmega \) to \({\mathbb {R}}\). Denote by \({\mathbb {R}}^{(\varOmega )}\) the subset of \({\mathbb {R}}^{\varOmega }\) consisting of all functions from \(\varOmega \) to \({\mathbb {R}}\) which are nonzero in at most a finite subset of \(\varOmega \). More precisely, \({\mathbb {R}}^{(\varOmega )}\) is defined as follows.
and \({\mathbb {R}}_+^{(\varOmega )} := \{ \lambda \in {\mathbb {R}}^{(\varOmega )} : \lambda (\omega ) \ge 0, \forall \omega \in \varOmega \}\). For \(\lambda \in {\mathbb {R}}^{(\varOmega )}\), let
Therefore, we say that \({\mathbb {R}}^{(\varOmega )}\) is the subset of \({\mathbb {R}}^{\varOmega }\) consisting of all functions with finite support. The set \({\mathbb {R}}^{(\varOmega )}\) will play the role of the dual space, and hence will contain the Lagrange multipliers.
For \(\lambda \in {\mathbb {R}}^{(\varOmega )}\) and \(y\in {\mathbb {R}}^{\varOmega }\), we can define a scalar product in a natural way:
For a finite set \(\varGamma =\{\omega _1,\ldots ,\omega _m\}\subset \varOmega \) and \(y\in {\mathbb {R}}^{\varOmega }\), denote by \(y^\varGamma (\cdot )\) the restriction of function \(y(\cdot )\) to \(\varGamma \), i.e.,
By abuse of notation, we may use the same symbol \(y^\varGamma \) to denote the finite dimensional vector that has for coordinates \(y^\varGamma =(y(\omega _1),\ldots ,y(\omega _m))^T\).
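To make the finite-support formalism concrete, the following sketch (ours, purely illustrative; all names are our own) represents a multiplier \(\lambda \in {\mathbb {R}}^{(\varOmega )}\) as a Python dictionary whose keys form \(\text {supp}(\lambda )\), and implements the scalar product \(\langle \lambda ,y\rangle \) and the restriction \(y^\varGamma \) just described:

```python
# Sketch: finite-support functions on an index set Omega, modelled as dicts.
# Every index absent from the dict is implicitly mapped to 0, so the scalar
# product below is the finite sum over supp(lambda).

def support(lam):
    """supp(lambda): the finite set of indices where lambda is nonzero."""
    return {w for w, v in lam.items() if v != 0.0}

def pairing(lam, y):
    """<lambda, y> = sum over supp(lambda) of lambda(w) * y(w)."""
    return sum(v * y(w) for w, v in lam.items())

def restrict(y, gamma):
    """y^Gamma: the restriction of y to the finite set Gamma, as a vector."""
    return [y(w) for w in gamma]

# Example with Omega = [0, 1]: y is an arbitrary function, lam has support {0.5, 1}.
lam = {0.5: 2.0, 1.0: -1.0}
y = lambda w: w ** 2

assert support(lam) == {0.5, 1.0}
assert pairing(lam, y) == 2.0 * 0.25 - 1.0   # = -0.5
assert restrict(y, [0.5, 1.0]) == [0.25, 1.0]
```

Even when \(\varOmega \) is infinite, only the finite support of \(\lambda \) ever enters these computations, which is exactly why \({\mathbb {R}}^{(\varOmega )}\) can play the role of the dual space.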
Let \(\sigma :{\mathbb {R}}^{(\varOmega )}\rightarrow {\mathbb {R}}_+\) be a function with \(\sigma (0)=0\). We refer to the function \(\sigma \) as an augmenting function.
For a given \(x\in {\mathbb {R}}^n\), we define the set
which is the set of “admissible” perturbations at the point x.
We will now use all these elements to define the augmented Lagrangian for (SIP).
Definition 2.1
Fix \(\lambda \in {\mathbb {R}}^{(\varOmega )}\) and a penalty parameter \(r>0\). The augmented Lagrangian for the problem \({(\text {SIP})}\) is the function \(L^{\varOmega }:{\mathbb {R}}^n\times {{\mathbb {R}}^{(\varOmega )}}\times {\mathbb {R}}_{++}\rightarrow {\mathbb {R}}_{\pm \infty }\) defined by
Denote by \(\text {val }{(\text {SIP})}\) the optimal value of the problem \((\text {SIP})\), and denote by \(X_0\) the set of feasible solutions of \({(\text {SIP})}\), i.e.,
Let \(x^* \in X_0\). Denote the active constraint index set at \(x^*\) by
We assume throughout the paper that the optimal value of problem \(\text {(SIP)}\) is finite.
Fix \(y\in {\mathbb {R}}^{\varOmega }\) and denote by val \((P_{y})\) the optimal value of the following parameterized version of \( \text {(SIP)}\):
With this notation, for \(y=0\) we naturally obtain val \( (P_{0})=\text {val }\text {(SIP)}\).
Let \(\lambda \in {\mathbb {R}}^{(\varOmega )}\), \(\varGamma =\text {supp}(\lambda )\) and \(r>0\). Define \(v_r^\varOmega :{\mathbb {R}}^{\varOmega }\rightarrow {\mathbb {R}}_{\pm \infty }\) as
The assumption that \(\sigma (0)=0\) implies that \(v_r^\varOmega (0)=\text {val }\text {(SIP)}\) for every \(r>0\). By (2)–(3), we always have that
When there exists \(r>0\) such that the above expression holds as an equality, we say that \(\lambda \) is an augmented Lagrange multiplier with penalty parameter r. We make this precise in the next definition.
Definition 2.2
\(\lambda \in {\mathbb {R}}^{(\varOmega )}\) is said to be an augmented Lagrange multiplier for (SIP) if val (SIP) is finite and there exists \(r\ge 0\) such that
If (5) holds for r, then we say that the augmented Lagrange multiplier \(\lambda \) has a penalty parameter r.
Remark 2.1
-
(i)
If (5) holds for a given \(r>0\), then \(\lambda \) is an augmented Lagrange multiplier for (SIP) with penalty parameter \(r'\), for every \(r'>r\). This is a direct consequence of (3) and the fact that \(\sigma \) is nonnegative.
-
(ii)
It follows from (5) that \(\lambda \) is an augmented Lagrange multiplier for \(\text {(SIP)}\) if and only if it satisfies an inequality resembling the subgradient inequality for convex functions, namely \(\lambda \in \partial v_r^\varOmega (0)\), where \(\lambda \in \partial v_r^\varOmega (y_1)\) means that \( v_r^\varOmega (y)-v_r^\varOmega (y_1)\ge \langle \lambda , y-y_1\rangle \) for all \( y\in {\mathbb {R}}^{\varOmega }\). Note that \(\partial v_r^\varOmega (0)\) is well-defined only if \(v_r^\varOmega (0)=\text {val }\text {(SIP)}\) is finite.
2.1 Discretizations and Lagrangian Functions for (SIP)
A finite set \(\varGamma :=\{\omega _1,\dots , \omega _m\}\subset \varOmega \) naturally induces a discretized version of (SIP), in which only the constraints corresponding to the elements of \(\varGamma \) are taken into account. This discretized problem is denoted \(({P}^\varGamma )\), and defined as:
Denote by \(S^\varGamma \) the set of optimal solutions of (\(P^\varGamma )\). As in (\(\hbox {P}_y\)), we can perturb the discretized problem (\(P^\varGamma )\), this time with \({\bar{y}}=(\bar{y}_1, \dots , \bar{y}_m)^T\in {\mathbb {R}}^m\). This parameterized problem becomes:
We denote by \({v}^\varGamma ({\bar{y}})\) the optimal value of \((P_{{\bar{y}}}^\varGamma )\). Recall that \( y^\varGamma (\cdot )\) is the restriction of the function \(y(\cdot )\) to \(\varGamma \) [cf. (1)]. Define \( \sigma ^\varGamma ({\bar{y}}):=\sigma ( y^\varGamma ). \) This allows us to define, as in (3):
We define next the Lagrangians associated with the discretized problem (P\(_{{\bar{y}}}^\varGamma )\).
Definition 2.3
Given a finite set \(\varGamma :=\{\omega _1,\dots , \omega _m\}\subset \varOmega \), the augmented Lagrangian and the classical Lagrangian for problem \((P^\varGamma )\) are defined respectively by
where \(x\in {\mathbb {R}}^n,{\bar{\lambda }}\in {\mathbb {R}}^m\) and \(r\in {\mathbb {R}}_+\).
Remark 2.2
Consider Problem \(({P}_{{\bar{y}}}^\varGamma )\), and let \(y\in {\mathbb {R}}^{(\varOmega )}\) be such that \(y(\omega _j)=:\bar{y}_j,\) for all \(\omega _j\in \varGamma \). In this situation, the discretized problem \(({P}^\varGamma _{{\bar{y}}}\)) is a relaxation of problem \(({P}_y)\), in the sense that the constraint set in \(({P}_{{\bar{y}}}^\varGamma )\) is larger than the one in \(({P}_y)\). Namely,
This fact and (3) yield
where \({\bar{y}}\) is the restriction of \(y\in {\mathbb {R}}^{(\varOmega )}\) to the set \(\varGamma \).
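The relaxation inequality (7) can be checked numerically on a toy instance. In the sketch below (all data are ours; for concreteness we take an \(\ell _1\)-type augmenting function, which only sees \(y^\varGamma \)), the perturbed problems reduce to one-dimensional minimizations, and the inequality between \(v_{r}^{\varOmega }(y)\) and \(v_{r}^{\varGamma }({\bar{y}})\) follows because the discretized problem keeps fewer constraints:

```python
# Toy check of the relaxation inequality (7).  All data are ours: take
# f(x) = x^2 and g(x, w) = w - x on the finite index set Omega = {0, 0.5, 1},
# with Gamma = {1}.  The perturbed problem (P_y) asks for x >= w + y(w) for
# all w, so its value is max(0, max_w (w + y(w)))^2; the discretized problem
# only keeps the constraints indexed by Gamma, hence has a smaller value.

omega = [0.0, 0.5, 1.0]
gamma = [1.0]
y = {0.0: 0.0, 0.5: 0.9, 1.0: 0.2}      # a fixed perturbation

def val(indices):
    """Optimal value of min x^2  s.t.  w - x + y(w) <= 0 for w in indices."""
    bound = max(w + y[w] for w in indices)
    return max(bound, 0.0) ** 2

r = 2.0
sigma_part = sum(abs(y[w]) for w in gamma)   # sigma only sees y^Gamma (l1 assumed)

v_omega = val(omega) + r * sigma_part        # v_r^Omega(y)
v_gamma = val(gamma) + r * sigma_part        # v_r^Gamma(ybar)

assert v_omega >= v_gamma                    # inequality (7)
```

Here \(v_{r}^{\varOmega }(y)=1.96+0.4\) strictly exceeds \(v_{r}^{\varGamma }({\bar{y}})=1.44+0.4\), since the dropped constraint at \(\omega =0.5\) is the binding one.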
We define next an augmented Lagrangian multiplier for the discretized problems.
Definition 2.4
Fix \(\varGamma :=\{\omega _1,\ldots ,\omega _m\}\subset \varOmega \). A vector \(\lambda \in {\mathbb {R}}^m\) is said to be an augmented Lagrange multiplier for \(({P}^\varGamma )\) if val \(({P}^\varGamma )\) is finite and there exists \(r\ge 0\) such that
If (8) holds for r, then we say that the augmented Lagrange multiplier \(\lambda \) has a penalty parameter r.
The following is an example where the augmented Lagrangian for \(({P}^\varGamma )\) with a separable augmenting function is computed explicitly.
Example 2.1
If \(\sigma ^\varGamma \) is separable, i.e., \(\sigma ^\varGamma ({\bar{y}}) = \sum _{j =1}^{m} \sigma _j({\bar{y}}_j)\), then the expression of the augmented Lagrangian can be simplified to
Assume furthermore that \(\sigma _j({\bar{y}}_j)=\max \{{\bar{y}}_j^2, \sqrt{|{\bar{y}}_j|}\}\) for all \(j=1,\ldots ,m,\) and let \(\psi _j:{\mathbb {R}}\times {\mathbb {R}}\rightarrow {\mathbb {R}}\) be defined as \(\psi _j(a,b) = \inf \{ -bz + \max \{z^2, \sqrt{|z|}\} : z \le -a \};\) then
Thus, we obtain
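The infimum defining \(\psi _j\) has no convenient closed form, but it can be approximated by a one-dimensional grid search; this is safe here because the quadratic branch of the augmenting function dominates for large \(|z|\). A minimal sketch (the grid bounds are our assumption):

```python
import math

# Brute-force evaluation of the marginal functions psi_j from Example 2.1,
#   psi_j(a, b) = inf { -b*z + max{z^2, sqrt|z|} : z <= -a },
# by a grid search on [-a - depth, -a].  Since z^2 dominates -b*z for large
# |z|, truncating the half-line at finite depth is harmless for moderate b.

def psi(a, b, depth=10.0, step=1e-3):
    best = math.inf
    n = int(depth / step) + 1
    for k in range(n):
        z = -a - k * step        # grid over z <= -a, starting at the boundary
        best = min(best, -b * z + max(z * z, math.sqrt(abs(z))))
    return best

assert psi(0.0, 0.0) == 0.0                 # attained at z = 0
assert abs(psi(1.0, 0.0) - 1.0) < 1e-9      # z <= -1 forces max = z^2, min at z = -1
assert abs(psi(0.0, -1.0)) < 1e-12          # z + max{z^2, sqrt|z|} >= 0 on z <= 0
```

The three checks match the hand computation: on \(z\le -1\) the quadratic branch is active, while near \(0\) the \(\sqrt{|z|}\) branch keeps the infimum at the boundary.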
The following lemma shows that the optimal value of the discretized augmented Lagrangian is always equal to the augmented Lagrangian for (SIP). It also relates the discretized Lagrangian with the classical one.
In the sequel, we will use the following notation: given a finite set \(\varGamma :=\{\omega _1,\ldots ,\omega _m\}\subset \varOmega \), and \((x,u)\in {\mathbb {R}}^n\times {\mathbb {R}}_+^m\), call \(\gamma (x,u)\in {\mathbb {R}}^m\) the vector defined by
Lemma 2.1
Let \(f: {\mathbb {R}}^n\rightarrow {\mathbb {R}}\), \( g: {\mathbb {R}}^n\times \varOmega \rightarrow {\mathbb {R}}\) be real-valued functions, and let \(\lambda \in {\mathbb {R}}^{(\varOmega )}\), \(\sigma :{\mathbb {R}}^{\varOmega }\rightarrow {{\mathbb {R}}_+}\) be a real-valued function with \(\sigma (0)=0\), \(\varGamma =\text {supp}(\lambda )=\{\omega _1,\dots , \omega _m\}\subset \varOmega \) and \(r>0\). Then
-
(i)
$$\begin{aligned} L^\varOmega (x,\lambda ,r) = L^\varGamma (x,{\bar{\lambda }},r),\quad x\in {\mathbb {R}}^n, \end{aligned}$$(10)
where \({\bar{\lambda }}= (\lambda (\omega _1),\ldots , \lambda (\omega _m))^T\).
-
(ii)
For every \(x\in {\mathbb {R}}^n\), we have
$$\begin{aligned} \begin{aligned} L^\varGamma (x,{\bar{\lambda }},r)&= f(x)+\sum _{j=1}^{m }\lambda (\omega _j)g(x,\omega _j)+\inf _{u\ge 0}\left\{ \sum _{j=1}^{m} \lambda (\omega _j) u_j+r\sigma ^\varGamma (\gamma (x,u))\right\} \\&= L_0^\varGamma (x,{\bar{\lambda }})+\inf _{u\ge 0}\left\{ \sum _{j=1}^{m} \lambda (\omega _j) u_j+r\sigma ^\varGamma (\gamma (x,u))\right\} ,\\ \end{aligned} \end{aligned}$$(11)where \(\gamma (x,u)\) is as in (9).
Proof
(i) Let \( x\in {\mathbb {R}}^n\). For any \(y\in {\mathbb {R}}^{\varOmega }\) such that \(g(x,\omega )+y(\omega )\le 0,\, \omega \in \varGamma \), set \(\bar{y}_j=y(\omega _j), j=1,\ldots ,m\), so \(\bar{y}=(\bar{y}_1,\ldots ,\bar{y}_m)^T\), and we have \(g(x,\omega _j)+\bar{y}_j\le 0\) for all \(\omega _j \in \varGamma \). The definitions then yield
where we used (6) in the last inequality. Taking now infimum over \(y\in D(x)\) and using (2), we obtain
Therefore, (10) holds. To prove the opposite inequality, define the set
The set \(D^\varGamma (x)\) can be identified with the set
Given any function \(y\in D^\varGamma (x)\), call \({\bar{y}}\in {\mathbb {R}}^m\) the vector of images of \(y\). Since \(\text {supp}(y)\subset \varGamma \) and \(y^\varGamma ={\bar{y}}\), we have that \(\sigma (y^\varGamma )=\sigma ^\varGamma ({\bar{y}})\) and \(\langle \lambda ,y \rangle =\langle \bar{\lambda },y^\varGamma \rangle = \langle \bar{\lambda },{\bar{y}}\rangle \), where we used the definition of \(\bar{\lambda }\) in the latter equality. Altogether, we can write
which proves the opposite inequality.
(ii) Fix \(x\in {\mathbb {R}}^n\) and take any \(y\in {\mathbb {R}}^{\varOmega }\) such that \(g(x,\omega _j)+ y(\omega _j)\le 0\) for all \(j=1, \dots , m\). Define \(u_j:=-(g(x,\omega _j)+ y(\omega _j))\) for all \(j=1, \dots , m\). Using (6), we can write
That is, the augmented Lagrangian \(L^{\varGamma }(x,{\bar{\lambda }},r)\) of \((P^\varGamma )\) can be expressed as (11). \(\square \)
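The change of variables \(u_j=-(g(x,\omega _j)+y(\omega _j))\) used in the proof can be sanity-checked numerically. In the sketch below (toy data, all ours; we assume the separable \(\ell _1\) augmenting function and the perturbational form \(f(x)-\langle \lambda ,y\rangle +r\sigma (y)\) implicit in the proof), both sides of (11) are computed over the same grid of slacks and agree:

```python
import itertools

# Numerical sanity check of the identity (11) in Lemma 2.1(ii) on toy data,
# with an assumed separable l1 augmenting function sigma(y) = sum_j |y_j|.
# Form A minimizes f(x) - <lam, ybar> + r*sigma(ybar) over perturbations
# ybar_j = -(g_j + u_j), u_j >= 0 (the substitution used in the proof);
# form B is the right-hand side of (11).  All numbers here are made up.

f_x = 1.0                       # f(x) at a fixed point x
g = [-0.5, 0.3]                 # g(x, omega_j), j = 1, 2
lam = [1.0, 2.0]                # lambda(omega_j) >= 0
r = 4.0

def sigma(y):                   # assumed l1 augmenting function
    return sum(abs(t) for t in y)

grid = [k * 0.01 for k in range(151)]        # u_j in [0, 1.5]
U = list(itertools.product(grid, repeat=2))

# Form A: f(x) - <lam, ybar> + r*sigma(ybar), with ybar_j = -(g_j + u_j)
form_a = min(
    f_x
    - sum(l * (-(gj + uj)) for l, gj, uj in zip(lam, g, u))
    + r * sigma([-(gj + uj) for gj, uj in zip(g, u)])
    for u in U
)

# Form B: the right-hand side of (11)
form_b = f_x + sum(l * gj for l, gj in zip(lam, g)) + min(
    sum(l * uj for l, uj in zip(lam, u))
    + r * sigma([-(gj + uj) for gj, uj in zip(g, u)])
    for u in U
)

assert abs(form_a - form_b) < 1e-9
assert abs(form_a - 2.8) < 1e-6   # feasible coordinate contributes 0, infeasible one (lam_2 + r)*g_2
```

Both minimizations run over the same finite grid, so the agreement reflects precisely the term-by-term algebra of the substitution in the proof.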
Lemma 2.2
Let \(f: {\mathbb {R}}^n\rightarrow {\mathbb {R}}\) and \( g:{\mathbb {R}}^n\times \varOmega \rightarrow {\mathbb {R}}\) be real-valued functions, and let \(\sigma :{\mathbb {R}}^{\varOmega }\rightarrow {{\mathbb {R}}_+}\) be a real-valued function with \(\sigma (0)=0\), \(x^*\in X_0\), \(\lambda ^*\in {\mathbb {R}}_+^{(\varOmega )}\), \(\varGamma =\text {supp}(\lambda ^*)= \{\omega _1, \dots , \omega _m\}\). Assume that the KKT conditions for (SIP) are satisfied at \(x^*\) and \(\lambda ^*\), i.e.,
and
Then
where \({\bar{\lambda }}^*= (\lambda ^*(\omega _1),\ldots , \lambda ^*(\omega _m))\).
Proof
Since \(x^*\in X_0\), we have \(g(x^*,\omega )\le 0, \omega \in \varOmega \). It follows from (2) that
Letting \(\tilde{u}_j=- g(x^*,\omega _j)\ge 0\) for all \(j=1,\ldots , m\), we obtain
By using (13), we have
Noticing the KKT conditions and combining (16) and (17) with (11), we have
This, together with (10) and (15), implies that (14) holds. \(\square \)
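Lemma 2.2 can be illustrated on a one-dimensional toy instance of (SIP) (entirely ours): \(f(x)=x^2\), \(g(x,\omega )=\omega -x\) for \(\omega \in [0,1]\), so that \(X_0=[1,+\infty )\), \(x^*=1\), and the KKT conditions hold with \(\lambda ^*(1)=2\) on the support \(\varGamma =\{1\}\). With an assumed \(\ell _1\) augmenting function, the right-hand side of (11) evaluates to \(f(x^*)\), as (14) predicts:

```python
# Toy instance illustrating Lemma 2.2: f(x) = x^2, g(x, w) = w - x for
# w in Omega = [0, 1], so the feasible set is x >= 1 and x* = 1 is optimal.
# KKT at x* holds with lambda*(1) = 2 (active index w = 1), supp(lambda*) = {1}.
# With an assumed l1 augmenting function, formula (11) should give
# L^Gamma(x*, lambda*, r) = f(x*) = val(SIP) = 1 for any r > 0.

f = lambda x: x * x
g = lambda x, w: w - x

x_star, lam_star, r = 1.0, 2.0, 3.0
g_star = g(x_star, 1.0)                    # active constraint: g(x*, 1) = 0

inner = min(
    lam_star * u + r * abs(-(g_star + u))  # lam*u + r*sigma(gamma(x*, u))
    for u in (k * 0.01 for k in range(201))
)
L_gamma = f(x_star) + lam_star * g_star + inner

assert abs(L_gamma - f(x_star)) < 1e-9     # L^Gamma(x*, lam*, r) = f(x*), as in (14)
```

The inner infimum is attained at the zero slack, so neither the multiplier nor the penalty term contributes at the KKT point.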
2.2 Duality Properties of (SIP)
In this section, we establish some properties and strong duality results for (SIP), including first-order optimality conditions and a global saddle property.
Lemma 2.3
Let \(f: {\mathbb {R}}^n\rightarrow {\mathbb {R}}\) and \( g:{\mathbb {R}}^n\times \varOmega \rightarrow {\mathbb {R}}\) be real-valued functions, and let \(\sigma :{\mathbb {R}}^{\varOmega }\rightarrow {{\mathbb {R}}_+}\) be a real-valued function with \(\sigma (0)=0\), \(x^*\in X_0\), \(\lambda ^*\in {\mathbb {R}}_+^{(\varOmega )}\), \(\varGamma =\text {supp}(\lambda ^*)= \{\omega _1, \dots , \omega _m\}\). Assume that the KKT conditions (12) and (13) for (SIP) are satisfied at \(x^*\) and \(\lambda ^*\). If there exists \(r^*>0\), such that \(\lambda ^*\) is an augmented Lagrange multiplier of the problem (SIP) with the penalty parameter \(r^*\) and \(f(x^*)={\text {val }}\) (SIP), then
Conversely, if
then \(\lambda ^*\) is an augmented Lagrange multiplier of (SIP) with the penalty parameter \(r^*\).
Proof
As \(\lambda ^*\) is an augmented Lagrange multiplier of \(\text {(SIP)}\) with the penalty parameter \(r^*\), we have
Therefore,
It follows from (14), (4) and (20) that
The above expression implies that
Let \(\lambda \in {\mathbb {R}}^{(\varOmega )}.\) Since \(x^*\in X_0\) and \(0\in {\mathbb {R}}^{\varOmega }\), it follows from (2) that
By (21) and (22), we obtain both inequalities in (18).
Assume that there exists \(r^*>0\), such that (19) holds. Then we obtain
It follows from (4) and (23) that
This implies that
That is, \(\lambda ^*\) is an augmented Lagrange multiplier of \(\text {(SIP)}\) with penalty parameter \(r^*\). \(\square \)
Remark 2.3
By Lemma 2.3, if \(\lambda ^*\in {\mathbb {R}}^{(\varOmega )}\) is an augmented Lagrange multiplier of (SIP) with the penalty parameter \(r^*\), and \(x^*\) is an optimal solution of (SIP) and satisfies (12) and (13), then \((x^*,\lambda ^*)\) is a saddle point of \(L^\varOmega (\cdot , \cdot , r^*)\). Conversely, if for some \(r^*>0\), \((x^*,\lambda ^*)\) is a saddle point of \(L^\varOmega (\cdot , \cdot , r^*)\), then \(\lambda ^*\) is an augmented Lagrange multiplier of (SIP) with the penalty parameter \(r^*\).
3 First-order Conditions for Augmented Lagrange Multipliers
In this section, we focus on the augmented Lagrangian in which the augmenting function is given by the norm of the function \(y^\varGamma \), the restriction of \(y\in {\mathbb {R}}^{\varOmega }\) to a fixed finite subset \(\varGamma \subset \varOmega \). Namely, given \(\varGamma :=\{\omega _1,\ldots ,\omega _m\}\) we consider \(\sigma ^\varGamma :{\mathbb {R}}^{(\varOmega )}\rightarrow {\mathbb {R}}_+\), defined as
The resulting augmented Lagrangian, inspired by [15, Example 11.58], is called the sharp Lagrangian. It is a classical fact from convex programming that the sharp Lagrangian admits an exact penalty parameter (see, e.g., [27, Theorem 9.3.1]). We will show that this can also be achieved in our more general context. First, we will obtain necessary optimality conditions for the existence of an augmented Lagrange multiplier of the sharp Lagrangian problem. Second, we establish sufficient conditions for the existence of an augmented Lagrange multiplier, by assuming inf-compactness of the functions involved and an extended Mangasarian–Fromovitz constraint qualification. We start by establishing first-order necessary conditions for the existence of an augmented Lagrange multiplier of the semi-infinite programming problem \(\text {(SIP)}\).
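For the sharp Lagrangian, the inner infimum in (11) separates across coordinates, and a short calculation (ours, under the extra assumption \(0\le {\bar{\lambda }}_j\le r\)) suggests each coordinate collapses to an exact \(\ell _1\)-type penalty term \(({\bar{\lambda }}_j+r)\max \{g(x,\omega _j),0\}\). The sketch below verifies this per-coordinate identity by grid search:

```python
# Per-coordinate reduction of the sharp Lagrangian (a sketch, not from the
# paper): with sigma^Gamma the l1 norm as in (24) and 0 <= lam_j <= r, one
# expects
#   lam_j*g_j + inf_{u >= 0} { lam_j*u + r*|g_j + u| } = (lam_j + r)*max(g_j, 0),
# i.e. feasible coordinates contribute nothing and infeasible ones are
# penalized exactly.  We check this by a grid search over the slack u.

def coord_term(g_j, lam_j, r, hi=5.0, step=1e-3):
    n = int(hi / step) + 1
    inner = min(lam_j * (k * step) + r * abs(g_j + k * step) for k in range(n))
    return lam_j * g_j + inner

lam_j, r = 1.5, 4.0                           # 0 <= lam_j <= r
for g_j in (-2.0, -0.5, 0.0, 0.7):
    expected = (lam_j + r) * max(g_j, 0.0)
    assert abs(coord_term(g_j, lam_j, r) - expected) < 1e-6
```

This exact-penalty form is consistent with the penalized function appearing in the proof of Theorem 3.1 below.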
Given a fixed finite subset \(\varGamma \subset \varOmega \), we will use the following notation for the set of active constraints of (SIP) and the set of active constraints of \((P^\varGamma )\) at a given \(x \in X_0\), respectively.
Theorem 3.1
Let \(f(\cdot )\) and \( g(\cdot ,\omega ), \omega \in \varOmega \), be real-valued continuously differentiable functions. Fix \(\lambda ^*\in {\mathbb {R}}^{(\varOmega )}\), and call \(\varGamma = \text {supp}(\lambda ^*)=\{\omega _1, \dots , \omega _m\}\). Consider \(\sigma ^\varGamma \) as in (24). Assume that
-
(a)
\(\lambda ^*\) is an augmented Lagrange multiplier of (SIP) with penalty parameter \(r^*\),
-
(b)
\(x^*\) is a local minimum for (SIP), and
-
(c)
\(\text {val }\) (SIP) \(=f(x^*)\).
In this situation, we have that
where
is the cone of feasible directions from \(x^*\).
Proof
Assumption (a) means that
Noting that \(v_{r^*}^{\varOmega }(y)=\text {val }(P_y)+{r^*}\sigma (y^\varGamma )=\text {val }(P_y)+{r^*} \sum _{\omega \in \varGamma }|y(\omega )|\) and \(v_{r^*}^{\varOmega }(0)=\text {val }\text { (SIP)},\) the last inequality is rewritten as
Taking infimum in the above inequality and using (c), we obtain
Using now the definition of \(\text {val }(P_y)\) and the fact (b) that \(x^*\) is a local minimum for \((\text {SIP})\), we obtain
where we used the definition of \(L^\varOmega \) (see (2)) in the last equality, and the fact that \(\varGamma =\text {supp}\,\lambda ^*\) in the second-to-last equality. This expression readily gives
for all \(x\in {\mathbb {R}}^n\). Fix \(x\in {\mathbb {R}}^n\). For each \(\omega _j\in \varGamma \), we have two possibilities: either \(g(x,\omega _j)\le 0\), or \(g(x,\omega _j)> 0\). If \(g(x,\omega _j)\le 0\), take \(\tilde{y}(\omega _j)=0\); otherwise, take \(\tilde{y}(\omega _j)=-g(x,\omega _j)\). Namely, take \(\tilde{y}(\omega ):=-\max \{0,g(x,\omega )\}\) for \(\omega \in \varGamma \) and \(\tilde{y}(\omega )=0\) for \(\omega \not \in \varGamma \). Using this choice of \(\tilde{y}\in {\mathbb {R}}^{\varOmega }\) in (25), we can write
where \((a)^+=\max \{a,0\}\) for any \(a\in {\mathbb {R}}\). Inequality (26) implies that
i.e., \(x^*\) is an unconstrained minimum of \(H(x):=f(x)+\sum _{j=1}^m (\lambda ^*(\omega _j) + r^*) g^+(x,\omega _j)\). Thus it must satisfy the necessary first-order optimality conditions
where \(H^\circ (x^*, d)\) is the Clarke generalized directional derivative of H at \(x^*\) in the direction d, see [29], and the sum is restricted to \(j\in I(x^*)\) because \(x^*\) is feasible and \(\nabla _x g^+(x^*,\omega _j) ^\top d=0\) for all \(j\not \in I(x^*)\) and all \(d\in {\mathbb {R}}^n\). Fix now \(d\in \varPsi (x^*)\). By definition of \(\varPsi (x^*)\), we must have \(\max \{\nabla _x g(x^*,\omega _j)^Td, 0\}=0\), and the above inequality yields
This completes the proof. \(\square \)
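To see the penalized function \(H\) from the proof at work, consider a toy instance (ours, not from the paper): \(f(x)=x^2\), \(g(x,\omega )=\omega -x\) for \(\omega \in [0,1]\), so the feasible set is \(x\ge 1\), \(x^*=1\), and the KKT multiplier on the active index \(\omega =1\) is \(\lambda ^*(1)=2\). For any \(r^*>0\), \(H(x)=x^2+(2+r^*)\max \{1-x,0\}\) should attain its unconstrained minimum at \(x^*\) with value \(f(x^*)=1\):

```python
# Toy illustration of the unconstrained exact-penalty minimization in the
# proof of Theorem 3.1.  Instance (ours): f(x) = x^2, g(x, w) = w - x on
# Omega = [0, 1]; only w = 1 is active at x* = 1, with lambda*(1) = 2 from
# the KKT conditions (2*x* - lambda = 0).  Then
#   H(x) = x^2 + (lambda* + r*) * max(1 - x, 0)
# has its global minimum at x* = 1, for any r* > 0.

r_star = 0.5
H = lambda x: x * x + (2.0 + r_star) * max(1.0 - x, 0.0)

xs = [-3.0 + k * 0.001 for k in range(6001)]   # grid on [-3, 3]
x_min = min(xs, key=H)

assert abs(x_min - 1.0) < 1e-9
assert abs(H(x_min) - 1.0) < 1e-9
```

On \(x<1\) the penalty slope \(2x-(2+r^*)\) stays negative, and on \(x>1\) the objective \(x^2\) increases, so the minimum sits exactly at the constrained optimum, which is the exactness phenomenon the theorem exploits.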
The last result in this section establishes sufficient first-order conditions for the existence of an augmented Lagrange multiplier. Given \(\varGamma =\{\omega _1,\ldots , \omega _m\}\subset \varOmega \), define
the subset of points which are feasible for problem \((P^\varGamma )\). We will make the following assumptions on the objective \(f(\cdot ) \) and the subset of constraint functions \(g(\cdot ,\omega )\) for \(\omega \in \varGamma \).
Assumption 3.1
Let \(f(x)\) and \(g(x,\omega )\) (\(\omega \in \varGamma \)) be inf-compact; that is, there exists \(\alpha \in {\mathbb {R}}\) such that, for every \(\omega \in \varGamma \), the sets \(\{ x \in {\mathbb {R}}^n : f(x) \le \alpha \}\) and \(\{ x \in {\mathbb {R}}^n : g(x,\omega ) \le \alpha \}\) are nonempty and compact.
Remark 3.1
If Assumption 3.1 holds, then the inf-compactness of f implies that the solution set \(S^\varGamma \) of the discretized problem \((P^{\varGamma })\) is nonempty and compact.
The extended Mangasarian–Fromovitz constraint qualification (EMFCQ in short) is said to hold for \((P^\varGamma )\) at a feasible solution \(x_0\), see Bonnans and Shapiro [28, p. 510], if there exists \(z\in {\mathbb {R}}^n\) such that
Assumption 3.2
The (EMFCQ) holds for the reduced problem \(({P}^\varGamma )\) at every \(x\in S^\varGamma \).
Let \(x\in X_0\) and define the set of (classical) Lagrange multipliers associated with x:
It is known that if Assumption 3.2 holds, then \(\varLambda ^\varGamma (x)\) is bounded (cf. Gauvin [30], see also Bonnans and Shapiro [28, p. 510]). A general result on the boundedness of the set of Lagrange multipliers for an optimization problem with a closed set inclusion constraint is given in Burke [31].
Theorem 3.2
Assume that (SIP) is convex. Let \(\lambda ^*\in {\mathbb {R}}^{(\varOmega )}\), and \(\varGamma =\text {supp}( \lambda ^*)= \{\omega _1, \dots , \omega _m\}\). Write \({\bar{\lambda }}_j:=\lambda ^*(\omega _j)\), \(j=1,\ldots ,m\). Suppose that \(\text {val }(\text {SIP})=\text {val }(P^\varGamma )\). Let \(f(\cdot )\) and \( g(\cdot ,\omega ), \omega \in \varOmega \), be real-valued continuously differentiable functions, and let \(\sigma ^\varGamma \) be as in (24). Suppose that Assumptions 3.1 and 3.2 are satisfied. Then, there exists \(r^*> 0\), such that
-
(i)
\({\bar{\lambda }}= ({\bar{\lambda }}_1, \dots , {\bar{\lambda }}_m)^T \) is an augmented Lagrange multiplier of \(( P^\varGamma )\), with penalty parameter \(r^*\), and
-
(ii)
\(\lambda ^*\in {\mathbb {R}}^{(\varOmega )}\) is an augmented Lagrange multiplier of (SIP), with the same penalty parameter \(r^*\).
Proof
(i) First, we claim that there exists \(\alpha _0\in {\mathbb {R}}\) such that \(f(x)\ge \alpha _0, \forall x\in {\mathbb {R}}^n\). Indeed, if this is not the case, there exists a sequence \(\{x_k\}\) such that \(f(x_k)\rightarrow -\infty \) as \(k\rightarrow \infty \). By Assumption 3.1, there exists \(\alpha \in {\mathbb {R}}\), such that the set
is nonempty and compact. So, \(x_k\in {\mathbb {E}}\) for large enough k. Thus, there exists a subsequence \(\{x_{k_j}\} \) of \(\{x_k\}\), such that \(x_{k_j}\rightarrow x_0\) as \(j\rightarrow \infty \). Since f is continuous, \(x_0\in {\mathbb {E}}\) and \(f(x_{k_j})\rightarrow f(x_0)\) as \(j\rightarrow \infty \), which is impossible.
Recall that \({v}^\varGamma ({\bar{y}})\) is the optimal value of \((P_{{\bar{y}}}^\varGamma )\). Thus
since \(f(x)\ge \alpha _0, \forall x\in {\mathbb {R}}^n\).
To complete our proof, we will use similar arguments to the ones used in Rückmann and Shapiro [24, Theorem 3]. It follows from Assumption 3.1 that the set of optimal solutions of \((P^\varGamma )\) is compact. By Assumption 3.2, the sets \(\varLambda ^\varGamma (x)\) are uniformly bounded for all \(x\) in a neighborhood of any point \({\bar{x}}\in S^\varGamma \) (cf. Proposition 4.43 of Bonnans and Shapiro [28]). Thus, we obtain that the set \(\bigcup _{x\in S^\varGamma }\varLambda ^\varGamma (x)\) is bounded as \(S^\varGamma \) is compact, and then \(\sup _{\lambda \in \bigcup _{x\in S^\varGamma }\varLambda ^\varGamma (x)}\Vert \lambda -{\bar{\lambda }}\Vert \) is finite. Since, for \((P^\varGamma )\), Robinson's constraint qualification is equivalent to the EMFCQ (see Theorem 4.27 of Bonnans and Shapiro [28]), we have
where \((v^\varGamma )'_{-}(0,d)\) is the lower Hadamard directional derivative of \(v^\varGamma (\cdot )\) at 0 in a unit direction \(d\in {\mathbb {R}}^m\), more precisely,
Using the above definition and (29), we can write, for a fixed \(\varepsilon >0\)
Hence, by definition of the supremum in the left-hand side there exists \(\tau >0\) such that for all \(d'\in \tau B_{{\mathbb {R}}^m}\) (where \(B_{{\mathbb {R}}^m}\) is the unit ball in \({\mathbb {R}}^m\)), and every \(t\in [0,\tau ]\) we can write
Multiplying by t the expression above and restricting the quotient in the left-hand side to \(d'=d\), we can write (30) in terms of \({\bar{y}}:=td\in \tau B_{{\mathbb {R}}^m}\), to obtain
Let us find a suitable lower bound for the first term in the right-hand side of (31). We can write
Taking now the infimum in this expression, we obtain
The boundedness of the set \(\bigcup _{x\in S^\varGamma }\varLambda ^\varGamma (x)\) implies the existence of \(r_1>0\) such that
Using this fact, and (32)-(31), we obtain
Noting that every \({\bar{y}}\in \tau B_{{\mathbb {R}}^m}\) can be written as \({\bar{y}}:=td\) for \(t\in [0,\tau ]\) and d a unit vector, we conclude that the inequality in (33) holds for every \({\bar{y}}\in \tau B_{{\mathbb {R}}^m}\). Altogether, we can write for all \({\bar{y}}\in \tau B_{{\mathbb {R}}^m}\)
where we used the definition of \({v}_{r}^\varGamma \) in the first equality, the definition of \(\sigma ^\varGamma \) in the second equality, and (33) in the inequality. This implies that for \(r>(\varepsilon +r_1)\), we obtain the inequality in the definition of an augmented Lagrangian for \((P^\varGamma )\), for every \({\bar{y}}\in \tau B_{{\mathbb {R}}^m}\). Now we need to establish this inequality for \({\bar{y}}\not \in \tau B_{{\mathbb {R}}^m}\). Recalling that \(v^\varGamma (0)=\mathrm{val}(P^\varGamma )\), and using also (28), we can write
where the first inequality comes from (28) and the last inequality follows from choosing \(r>\Vert {\bar{\lambda }}\Vert \) and the fact that \({\bar{y}}\not \in \tau B_{{\mathbb {R}}^m}\). Given these two inequalities, we can choose
With this choice, (34), (35) and (36) yield
From (8) \({\bar{\lambda }}\) is an augmented Lagrange multiplier of \((P^\varGamma )\), with penalty parameter \(r^*\).
(ii) Since \(\text {val }\text {(SIP)}=\text {val (P}^\varGamma )\), we have that \(v_{r^*}^{\varOmega }(0)=v_{r^*}^\varGamma (0)=v^\varGamma (0)=\text {val }(\text {SIP})\). Fix an arbitrary \( y\in {\mathbb {R}}^{(\varOmega )}\); we can take its restriction \(y^\varGamma \) to \(\varGamma \), namely set
By Remark 2.2 and (7) we have \(v_{r^*}^{\varOmega }(y)\ge v_{r^*}^{\varGamma }(y^\varGamma )\). So, it follows from (37) applied to \(y:=y^\varGamma \) that
Therefore, \({\bar{\lambda }}\in {\mathbb {R}}^{(\varOmega )}\) is an augmented Lagrange multiplier of \((\text {SIP})\), with penalty parameter \(r^*\).
Remark 3.2
Note that the existence of an augmented Lagrange multiplier is a global property. The proof of Theorem 3.2 shows that Assumption 3.1 is essential to obtain the existence of an augmented Lagrange multiplier of \((\text {SIP})\). Indeed, by Assumption 3.1 and the continuity property of f, we obtain that (28) and (35) hold.
4 Second-order Conditions for Augmented Lagrange Multipliers
In this section, we investigate second-order conditions for the existence of an augmented Lagrange multiplier for the semi-infinite programming problem \((\text {SIP})\). Recall that for \(y\in {\mathbb {R}}^{\varOmega }\), the norm of the restriction \(y^\varGamma \) to a finite set \(\varGamma =\{\omega _1,\ldots ,\omega _m\}\) is defined as \(\Vert y^\varGamma \Vert =\sum _{\omega \in \varGamma }|y(\omega )|\), i.e., the \(\ell _1\)-norm in \({\mathbb {R}}^m\) of the vector \(y^\varGamma =(y(\omega _1),\ldots ,y(\omega _m))^T\). We will also use \(\Vert \cdot \Vert _{\infty }\) for denoting the infinity norm in \({\mathbb {R}}^m\). Namely, given \(z\in {\mathbb {R}}^m\), we write \(\Vert z\Vert _{\infty }:=\max _{j=1,\ldots ,m}|z_j|\). Next we state a crucial assumption we make on \(\sigma \).
Definition 4.1
Let \(\sigma :{\mathbb {R}}^{(\varOmega )}\rightarrow {{\mathbb {R}}_+}\) be a real-valued function and \(\varGamma \) be a finite subset of \(\varOmega \).
-
(a)
We say that \(\sigma \) has a valley at 0 with respect to \(\varGamma \) in \({\mathbb {R}}^{\varOmega }\) if \(\sigma \) is continuous at 0 with \(\sigma (0)=0\), and for every \(\delta >0\) we have \(c_\delta =\inf _{y\in {\mathbb {R}}^{\varOmega }, \Vert y^\varGamma \Vert \ge \delta } \sigma (y^\varGamma ) >0\).
-
(b)
Fix \(p\in ]0,1]\). We say that \(\sigma \) satisfies a p-growth condition at 0 with respect to \(\varGamma \) if there exists \(\rho _0>0\) such that
$$\begin{aligned} \lim _{y\in {\mathbb {R}}^{\varOmega },y\rightarrow 0}\frac{\sigma (y^\varGamma )}{\Vert y^\varGamma \Vert ^p}=\rho _0>0. \end{aligned}$$(38)
Remark 4.1
Under Assumption (a) in Definition 4.1, there exists \(\alpha >0\) such that the level set
is bounded. Indeed, if this is not true, then for all \(\alpha >0\) the set \(V_{\alpha }\) is unbounded. Take \(0<\alpha <c_1\), where \(c_1>0\) is as in Definition 4.1(a) for \(\delta =1\). Since \(V_{\alpha }\) is unbounded there exists \(z\in V_{\alpha }\) such that \(\Vert z\Vert >1\). Altogether, we have
contradicting the choice of \(\alpha \). Hence, the claim is true and there exists \(\alpha _0>0\) such that \(V_{\alpha _0}\) is bounded. For future use, denote by \(M_{\alpha _0}\) the positive constant such that
where \(B[0,M_{\alpha _0}]\) is the ball centered at 0 with radius \(M_{\alpha _0}\).
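A minimal sketch of the objects in this remark, under the assumption that \(V_{\alpha }\) denotes the \(\alpha \)-level set of \(\sigma \) on \({\mathbb {R}}^m\) (identifying \(\varGamma \)-supported vectors with vectors in \({\mathbb {R}}^m\), as in the proof of Lemma 4.2 below), is
$$\begin{aligned} V_{\alpha }:=\{z\in {\mathbb {R}}^m \,:\, \sigma (z)\le \alpha \},\qquad V_{\alpha _0}\subset B[0,M_{\alpha _0}]. \end{aligned}$$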
The properties described in Definition 4.1 are used for establishing a lower bound on the augmented Lagrangian. The next technical lemma shows why they are instrumental in establishing this bound.
Lemma 4.1
Fix a pair \((\hat{x},\hat{\lambda })\in {\mathbb {R}}^n\times {\mathbb {R}}_+^{(\varOmega )}\) and denote by \(\varGamma =\text {supp}(\hat{\lambda })= \{\omega _1, \dots , \omega _m\}\). Assume that
-
(i)
For every \(\omega \in \varGamma \), the function \( g(\cdot ,\omega ):{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) is continuous at \(\hat{x}\), and
-
(ii)
\(\sigma :{\mathbb {R}}^{(\varOmega )}\rightarrow {{\mathbb {R}}_+}\) has a valley at 0 with respect to \(\varGamma \), and it satisfies a p-growth condition at 0 with respect to \(\varGamma \).
In this situation, there exist an open neighborhood \(N(\hat{x})\) of \(\hat{x}\) and \(r'>0\) such that for every \(x\in N(\hat{x})\) and every \(r>r'\) we have
Proof
The statement of the lemma is clearly true when \(\hat{\lambda }=0\), so it is enough to prove the lemma for \(\hat{\lambda }\not =0\). As in (9) define \(\gamma (x,u)\in {\mathbb {R}}^m\) as \([\gamma (x,u)]_j:=-g(x,\omega _j)-u_j\) for all \(j=1,\ldots ,m.\) With this notation, we have that \([\gamma (x,u)]_j\le -g(x,\omega _j)\) for all \(u\ge 0\). The infimum in the statement of the lemma thus re-writes as
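Assuming, as the definition of \(A(r,x,u)\) below indicates, that the display (39) is the infimum over \(u\ge 0\) of the bracketed expression, the quantity to be bounded takes the form
$$\begin{aligned} \inf _{u\in {\mathbb {R}}_+^m}\Big \{\sum _{j=1}^m \hat{\lambda }(\omega _j)\big (g(x,\omega _j)+u_j\big )+r\,\sigma (\gamma (x,u))\Big \}\ge 0, \end{aligned}$$
where we used \([-\gamma (x,u)]_j=g(x,\omega _j)+u_j\).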
Fix \(r_0>0\). The continuity of g implies that there exists a constant \(C_1>0\) such that
This allows us to find a lower bound for the first term in the argument of (39). Indeed,
where we used the fact that \(u_j\ge 0\) in the first inequality. The p-growth condition assumed in part (ii) implies that
Therefore, there exists \(r_1>0\) such that
for every \(z\in B(0,r_1)\subset {\mathbb {R}}^m\). Take \(r_2<\min \{1,r_1\}\). This yields
where we used the fact that \(p\in (0,1]\) in the last inequality.
Call \(A(r,x,u):=\sum _{j=1}^m \hat{\lambda }(\omega _j) ([-\gamma (x,u)]_j) +r\sigma (\gamma (x,u))\), the expression between the curly brackets in (39). We will show that
for large enough \(r>0\). We consider first the case in which \(u\ge 0\) is such that \(\Vert \gamma (x,u)\Vert <r_2\). In this case, using (41) for \(z:=\gamma (x,u)\), we obtain
which is positive for \(r> \frac{2 \Vert \hat{\lambda }\Vert _{\infty }}{\rho _0} =: r_3\). Hence, for all \(x\in B(\hat{x},r_0)\) we can write
To complete the proof, we consider now those \(u\ge 0\) such that \(\Vert \gamma (x,u)\Vert \ge r_2\). Using the valley at 0 property and (40), we have
which is positive when \(r>\frac{C_1\Vert \hat{\lambda }\Vert _{\infty }}{c_{r_2}}=:r_4\). Hence, for all \(x\in B(\hat{x},r_0)\) we have
From (43) and (42) we deduce that, for \(r>\max \{r_3,r_4\}=:r'\), the statement of the lemma is true for \(N(\hat{x}):= B(\hat{x},r_0)\). \(\square \)
The next result will be used to show that the Lagrangian \(L^\varGamma \) is lsc when \(\sigma \) has a valley at 0 with respect to \(\varGamma \).
Lemma 4.2
Fix a pair \((\hat{x},\hat{\lambda })\in {\mathbb {R}}^n\times {\mathbb {R}}_+^{(\varOmega )}\), and denote by \(\varGamma =\text {supp}(\hat{\lambda })= \{\omega _1, \dots , \omega _m\}\). Assume that
-
(i)
\( g(\cdot ,\omega ), \omega \in \varGamma \), is real-valued and continuous,
-
(ii)
\(\sigma :{\mathbb {R}}^{(\varOmega )}\rightarrow {{\mathbb {R}}_+}\) is lsc and it has a valley at 0 with respect to \(\varGamma \).
For a fixed \(r>0\), define \(q(\cdot ,r):{\mathbb {R}}^n\rightarrow {\mathbb {R}}_{\pm \infty }\) as
Then, for \(r>0\) large enough, the function \(q(\cdot ,r)\) is lsc in \({\mathbb {R}}^n\).
Proof
Fix \(r>0\). Set \(Q(r,x,u):=\sum _{j=1}^m \hat{\lambda }(\omega _j) ([-\gamma (x,u)]_j) +r\sigma (\gamma (x,u)) +\delta _{{\mathbb {R}}_+^m}(u)\). Then, we can write
Theorem 1.17 in [15] states that \(q(\cdot ,r)\) is lsc if (a) \(Q(r,\cdot ,\cdot )\) is lsc, and (b) \(Q(r,\cdot ,\cdot )\) is level-bounded in u locally uniformly in x. Condition (b) can be re-stated as follows. For every \(\alpha \in {\mathbb {R}}\) there exists a ball \(B(\hat{x},r_0)\) such that the set
Since the \({\mathbb {R}}^n\)-coordinate of \(W_{\alpha }(r)\) is bounded by definition, it is enough to show that the \({\mathbb {R}}^m\)-coordinate of \(W_{\alpha }(r)\) is bounded. We start by noting that \(Q(r,\cdot ,\cdot )\) is lsc. This is a consequence of the fact that \(Q(r,\cdot ,\cdot )\) is the sum of lower semicontinuous functions. Indeed, the continuity assumption (i) on g, together with the lower semicontinuity of \(\sigma \) assumed in (ii), implies that \(\sigma \circ \gamma (\cdot ,\cdot )\) is lsc. The first term in the expression of \(Q(r,\cdot ,\cdot )\) is continuous in u and hence lsc. The third term in the expression of \(Q(r,\cdot ,\cdot )\) is \(\delta _{{\mathbb {R}}_+^m}\), which is lsc because the set \({\mathbb {R}}_+^m\) is closed. This proves that \(Q(r,\cdot ,\cdot )\) is lsc for every \(r>0\).
Let us show now that, for \(r>0\) and large enough, \(Q(r,\cdot ,\cdot )\) is level-bounded in u locally uniformly in x. Namely, we will prove that (44) holds for \(r_0=1\) and for \(r>0\) large enough. The continuity of g implies that there exists \(C_1>0\) such that
Note that, for every fixed \(\omega _{j_0}\) we have
Assume that \((x,u)\in W_{\alpha }(r)\) (and hence \(x\in B[\hat{x},1]\)), then we must have \(u\ge 0\) and we can write a lower bound for the first term in the expression of Q as follows:
where we used the fact that \(u_j\ge 0\) in the first inequality and (45) in the last one. Altogether, we can write, for every \((x,u)\in W_{\alpha }(r):=\{(x,u)\,:\, x\in B(\hat{x},1),\, Q(r,x,u)\le \alpha \}\),
Therefore, we deduce that
By Remark 4.1 there exists \(\alpha _0\) such that the set
is bounded. Take \(r>0\) such that \(\frac{\alpha +C_1\Vert \hat{\lambda }\Vert }{r}\le \alpha _0\). For this choice of r, we have that, if \((x,u)\in W_{\alpha }(r)\), then \(\gamma (x,u)\in V_{\alpha _0}\).
Let \(M_{\alpha _0}\) be as in Remark 4.1. We can thus write, for every u such that \((x,u)\in W_{\alpha }(r)\),
showing that, for this choice of r, we have
Hence, (44) holds for \(r>0\) large enough. By Theorem 1.17 in [15], the function \(q(\cdot ,r)\) is lsc for \(r>0\) large enough. \(\square \)
This lemma readily gives the lower semicontinuity of \(L^\varGamma (\cdot ,\lambda ,r)\) for large enough r.
Proposition 4.1
Fix a pair \((\hat{x},\hat{\lambda })\in {\mathbb {R}}^n\times {\mathbb {R}}_+^{(\varOmega )}\), and denote by \(\varGamma =\text {supp}(\hat{\lambda })= \{\omega _1, \dots , \omega _m\}\). Assume that
-
(i)
\(f(\cdot )\) is real-valued and lsc and \( g(\cdot ,\omega ), \omega \in \varGamma \), is real-valued and continuous,
-
(ii)
\(\sigma :{\mathbb {R}}^{(\varOmega )}\rightarrow {{\mathbb {R}}_+}\) is lsc and it has a valley at 0 with respect to \(\varGamma \).
In this situation, \(L^\varGamma (\cdot ,\lambda ,r)\) is lsc for \(r>0\) large enough.
Proof
By definition, we know that
where we are using the notation of Lemma 4.2. By Assumption (i), the first term is lsc. Since we are under the conditions of Lemma 4.2, the second term is lsc for r large enough. Hence, the whole expression is lsc for r large enough. \(\square \)
Remark 4.2
With the notation of Definition 4.1, the sharp augmenting function \(\sigma ^\varGamma (y):=\Vert y^\varGamma \Vert \) used in Sect. 3 has a valley at 0 with \(c_{\delta }=\delta \), and it satisfies a p-growth condition at 0 with respect to \(\varGamma \) for \(p=1\). More generally, the augmenting function \(\sigma _p^\varGamma (y):=\Vert y^\varGamma \Vert ^p\) for \(p\in (0,1]\) has a valley at 0 with \(c_{\delta }=\delta ^p\), and it satisfies a p-growth condition at 0.
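Both claims in this remark can be checked directly; identifying \(y^\varGamma \) with a vector in \({\mathbb {R}}^m\), one computes
$$\begin{aligned} c_{\delta }=\inf _{\Vert y^\varGamma \Vert \ge \delta }\Vert y^\varGamma \Vert ^p=\delta ^p>0, \qquad \lim _{y\rightarrow 0}\frac{\sigma _p^\varGamma (y)}{\Vert y^\varGamma \Vert ^p}=\lim _{y\rightarrow 0}\frac{\Vert y^\varGamma \Vert ^p}{\Vert y^\varGamma \Vert ^p}=1=:\rho _0. \end{aligned}$$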
The following lemma uses an augmenting function as in Definition 4.1 (a), (b) and a second-order sufficient condition, see [32].
Lemma 4.3
Let \(f(\cdot )\) and \( g(\cdot ,\omega ), \omega \in \varOmega \), be real-valued twice continuously differentiable functions, let \(x^*\in X_0\), \(\lambda ^*\in {\mathbb {R}}^{(\varOmega )}\) and \(\varGamma =\text {supp}(\lambda ^*)= \{\omega _1, \dots , \omega _m\}\), and let \(\sigma :{\mathbb {R}}^{(\varOmega )}\rightarrow {{\mathbb {R}}_+}\) be lsc, have a valley at 0 with respect to \(\varGamma \), and satisfy a p-growth condition at 0 with respect to \(\varGamma \).
Assume that the KKT conditions (12) and (13) for (SIP) are satisfied, and that the second-order sufficient condition given by
holds, where \({V}(x^*)\) is defined by
Then there exist \(r^*>0\) and a neighborhood \(N(x^*)\) of \(x^*\) such that
Proof
By contradiction, suppose that there exist two sequences \(\{r_k\}\) and \(\{x_k\}\) with \(r_k>0\), \(x_k\not =x^*\), such that \(r_k\rightarrow \infty \), \(x_k\rightarrow x^*\), as \(k \rightarrow \infty \) and
We have
where the first inequality follows from (11) and the nonnegativity of \(\lambda ^*(\omega )\) (\(\forall \omega \in \varOmega )\) (since (13) holds), the second equality is due to (10) and the last inequality comes from (49) and (14). Define \(v_k:=x_k-x^*\) and \(s_k:=\frac{v_k}{\Vert v_k\Vert }\). By compactness of the ball, there exists a subsequence of \(\{s_{k}\}\) converging to a vector s with \(\Vert s\Vert =1\). Without loss of generality, suppose that \(s=\lim _{k\rightarrow \infty }s_k\). We will show first that \(s\in V(x^*)\). We can write
where \(A(r,x,u)\) is as in Lemma 4.1. Since we are under the conditions of Lemma 4.1, there exist \(r',r_0>0\) such that the last term is nonnegative for \(r_k>r'\) and \(x_k\in B(x^*,r_0)\). Since \(r_k\) tends to infinity and \(x_k\) tends to \(x^*\), we can find \(k_0\) such that \(r_k>r'\) and \(x_k\in B(x^*,r_0)\) for \(k\ge k_0\). We assume from now on that \(k\ge k_0\). Using this fact and the Taylor expansion of f, we can write
where \(\lim _k \frac{o(\Vert v_k\Vert )}{\Vert v_k\Vert }=0\). Taking limit yields
which is one of the inequalities defining \(V(x^*)\). To show that the remaining inequalities in \(V(x^*)\) hold for s, write again
where the first inequality follows from (49), the first equality follows from the definitions of \(A(r_k,x_k,u)\) and \(L_0^\varGamma \), and the second equality comes from (12) in Lemma 2.2. Thus, we have
since \(r_k\rightarrow \infty \), as \( k\rightarrow \infty \). Therefore, there exists \(u^k\in {\mathbb {R}}_+^m\) such that
and then
Since \(\sigma \) has a valley at 0, this can only happen if
The p-growth condition now implies that
Note that \(0<p\le 1\) and \(|g(x_k, \omega _j)+u_j^k|\ge g^+(x_k, \omega _j)\) for all \(\omega _j\in \varGamma (x^*)\). Proceeding as in the proof of Lemma 4.1, and taking k large enough so that \(\Vert \gamma (x_k,u^k)\Vert <1\), we obtain
From (54) we can take also k large enough so that \(\frac{\sigma (\gamma (x_k,u^k))}{\Vert \gamma (x_k,u^k)\Vert ^p}>\rho _0/2\). Combine this with (55) to write
Using the above expression and (53), we deduce that
Since the functions \(g(\cdot ,\omega _j),\,\omega _j\in \varGamma (x^*)\) are continuously differentiable at \(x^*\), and \(g(x^*,\omega _j)=0\) for all \(\omega _j\in \varGamma (x^*)\), we use the Taylor expansion of \(g(\cdot ,\omega _j)\) to write
for all \(\omega _j\in \varGamma (x^*)\). Since each term in the sum of (56) is nonnegative, we deduce that
for all \(\omega _j\in \varGamma (x^*)\). Use this fact in (57) to obtain
for all \(\omega _j\in \varGamma (x^*)\). This readily gives
for all \(\omega _j\in \varGamma (x^*)\). Combining this fact and (52), we conclude that \(s\in V(x^*).\) The Taylor expansion of \(L_0^\varGamma \) and the first-order conditions yield
Assumption (46) and the fact that \(s\in V(x^*)\), imply that, for k large enough, the expression between parentheses is positive. Hence, we have that
contradicting (50) for large enough k. This implies that (49) cannot hold, and therefore (48) must hold. \(\square \)
The following result establishes existence of an augmented Lagrange multiplier for (SIP), under the conditions of Lemma 4.3 and some boundedness assumptions on f and \(g(\cdot ,\omega )\).
Theorem 4.1
Let \(x^*\) be the unique global optimal solution of \((P^\varGamma )\). Suppose that all conditions in Lemma 4.3 are satisfied. Moreover, if
-
(i)
f and \( g(x,\omega )\ (\omega \in \varGamma )\) are bounded below,
-
(ii)
there exists \(\mu _0>0\) such that the set below is bounded
$$\begin{aligned} \varTheta :=\{x\in {\mathbb {R}}^n:g(x,\omega )\le \mu _0,\ \omega \in \varGamma \}, \end{aligned}$$(58)
then \(\lambda ^*\) is an augmented Lagrange multiplier of (SIP).
Proof
By Condition (i), there exists \(\xi _0>-\infty \) such that
By Lemma 2.2 (ii), it is enough to prove that there exists \(r_0^*>0\) such that (19) holds for every \(x\in {\mathbb {R}}^n\). We establish this fact in two steps, according to whether x belongs or not to the set \(\varTheta \) defined in (58). We use in this proof the notation of Lemma 4.1.
Step I In this step, we prove that there exists \(r_1^*>0\) such that (19) holds on \({\mathbb {R}}^n\backslash \varTheta \). If \(x\in {\mathbb {R}}^n \backslash \varTheta \), then \(g(x,\omega _{j_0}) >\mu _0\) for some \(\omega _{j_0}\in \varGamma \). Thus, for every \(u\ge 0\) we have
Since \(\sigma \) has a valley at 0, there exists \( c_{\mu _0}>0\) such that
By (11), (59) and (60), we have
Then there exists \(r_1^*>0\) such that
This inequality, combined with (10) in Lemma 2.1 and (14) in Lemma 2.2, yields that
That is, (19) holds on \( {\mathbb {R}}^n \backslash \varTheta \).
Step II We prove in this step that there exists \(r_2^*>0\) such that (19) holds for \(x\in \varTheta \). By contradiction, suppose that there exist sequences \(\{r_k\}\) and \(\{x_k\}\) with \(r_k>0\), \(x_k\in \varTheta \), such that \(r_k\rightarrow +\infty \) and
where we used (14) in the equalities. Set \(\varGamma = \{\omega _1, \dots , \omega _m\}\). It follows from (11) and (62) that
Since \(r_k\rightarrow +\infty \) the above inequality implies that
Therefore, we can take a sequence \(\{u^k\}\subset {\mathbb {R}}_+^m\) such that
By condition (ii), \(\varTheta \) is bounded, and the sequence \(\{x_k\}\) has at least one cluster point \(\bar{x}\). Without loss of generality, we assume that \(x_k\rightarrow {\bar{x}}.\) We claim that \({\bar{x}}\in X_0^\varGamma \), where \( X_0^\varGamma \) is defined by (27). Indeed, the fact that \(\sigma \) has a valley at 0 and the last expression imply that, for every \(\omega _j\in \varGamma \) we have
Since g is continuous, the sequence \(\{g(x_k,\omega _j)\}\) is bounded for every j, and hence we conclude that the sequence \(\{u^k\}\subset {\mathbb {R}}_+^m\) is bounded too. Therefore, they both have convergent subsequences. We may and do denote these still by \(\{g(x_k,\omega _j)\}\) and \(\{u^k\}\). Hence, we can write, for every \(\omega _j\in \varGamma \),
because \(\{u^k\}\subset {\mathbb {R}}_+^m\). Therefore, \({\bar{x}}\in X_0^\varGamma \). Since we are under the conditions of Lemma 4.3, there exist \(r^*>0\) and a neighborhood \(N(x^*)\) of \(x^*\) such that (48) holds, i.e.,
It follows from (62) and (64) that \(x_k\notin N(x^*)\). Indeed, for large enough k we must have \(r_k>r^*\) and hence (62) gives
where we used the fact that \(r_k>r^*\) in the first inequality. This expression and (64) yield \(x_k\not \in N(x^*)\). Since \(x_k\rightarrow {\bar{x}}\), we deduce that \({\bar{x}}\not = x^*\). By assumption, \(x^*\) is a unique global optimal solution of \((P^\varGamma )\) and \(\bar{x}\) is feasible for \((P^\varGamma )\), hence we must have \(f({\bar{x}})> f(x^*)\). Let \(\beta =f({\bar{x}})- f(x^*)\). By using (10) and (11), we have
Since we are under the conditions of Lemma 4.1, the last term in (65) is nonnegative for r large enough. Using this in (65) gives
Fix \(k_0\) such that \(r_k\ge r'\) for all \(k\ge k_0\), where \(r'>0\) is as in (66). Using Proposition 4.1, we may and do assume that \(r'\) is large enough such that \(L^\varGamma (\cdot , \lambda ^*, r')\) is lsc. Then by (62) we obtain
where we used (62) in the first inequality, the fact that \(L^\varGamma (x_k,\lambda ^*,r_k)\ge L^\varGamma ({\bar{x}},\lambda ^*,r')\) if \(r_k\ge r'\) in the second inequality, lower semicontinuity of \(L^\varGamma (\cdot , \lambda ^*, r')\) in the third inequality, (10) in the equality, and (66) in the last inequality. This expression entails a contradiction and hence there exists \(r_2^*>0\) such that
Equivalently, (19) holds on \(\varTheta \). Let \(r_0^*=\max \{r_1^*, r_2^*\}\). It follows from (61) and (67) that (19) holds on \({\mathbb {R}}^n\). The proof is complete. \(\square \)
Remark 4.3
Shapiro and Sun [22] established second-order conditions ensuring the existence of an augmented Lagrange multiplier when the augmenting function is assumed to be a quadratic function and condition (R) (called the quadratic growth condition in Rockafellar [14]) is satisfied. Under some second-order sufficient conditions, and assuming that the global optimal solution of the primal problem is unique, Sun et al. [23] obtained existence results of global saddle points for four classes of augmented Lagrangian functions (by Remark 2.3, this implies existence of augmented Lagrange multipliers). In one of these classes, the augmenting function \(\sigma \) is twice differentiable and convex, while in the other three classes the augmenting function is assumed to be twice differentiable. In our analysis (see Theorem 4.1), the augmenting function is not necessarily convex or differentiable. Hence, the result and the proof techniques in Theorem 4.1 are different from those in Shapiro and Sun [22] and Sun et al. [23]. Moreover, it is known that some nonconvex and nondifferentiable penalty functions, which are special cases of the augmented Lagrangians with nonconvex and nondifferentiable augmenting functions, are able to provide a smaller exact penalty parameter than that of the classical \(l_1\) penalty function, see [33, Theorem 4.9]; indeed, it has been shown that, in some practical applications, nonconvex and nondifferentiable penalty functions need only smaller penalty parameters than the classical \(l_1\) penalty function to achieve the required accuracy, see [34].
Next we apply a global second-order sufficient condition [26] to obtain the existence of an augmented Lagrange multiplier.
Definition 4.2
Let X be a subset of \({\mathbb {R}}^n\), \(x^*\in X\), \( W(x^*)\) be a subset of \({\mathbb {R}}^n\) and \(f:X\rightarrow {\mathbb {R}}\) be twice continuously differentiable at \(x^*\). We say that a generalized representation condition holds for f at \(x^*\) on X with respect to \(\eta (x,x^*) \in W(x^*)\) if, for every \(x \in X\),
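One common form of such a representation condition, stated here as an assumption (the stripped display is reconstructed; cf. Yang [26]), is the exact second-order expansion
$$\begin{aligned} f(x)=f(x^*)+\nabla f(x^*)^\top \eta (x,x^*)+\frac{1}{2}\,\eta (x,x^*)^\top \nabla ^2 f(x^*)\,\eta (x,x^*). \end{aligned}$$
When \(\eta (x,x^*)=x-x^*\), this reduces to requiring that f coincide with its second-order Taylor polynomial at \(x^*\), as is the case for quadratic functions.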
Remark 4.4
There are some functions that satisfy the generalized representation condition in Definition 4.2 (cf. Yang [26]). For example, for a linear fractional function
where \(a, b \in {\mathbb {R}}^n\), \(c, s\in {\mathbb {R}}\) and X is a convex set such that \(a^\top x+s>0, \forall x \in X.\) It is easy to show that
Then f(x) satisfies the generalized representation condition on X with
Theorem 4.2
Let \(f(\cdot )\) and \( g(\cdot ,\omega ), \omega \in \varOmega \), be real-valued twice continuously differentiable functions, let \(x^*\in X_0\), \(\lambda ^*\in {\mathbb {R}}^{(\varOmega )}\), \(\varGamma =\text {supp}(\lambda ^*)= \{\omega _1, \dots , \omega _m\}\), \(V(x^*)\) be defined by (47), and let \(\sigma :{\mathbb {R}}^{(\varOmega )}\rightarrow {{\mathbb {R}}_+}\) have a valley at 0 with respect to \(\varGamma \) and satisfy (38). Assume that the KKT conditions (12) and (13) for problem (SIP) and Conditions (i), (ii) in Theorem 4.1 are satisfied. Moreover, if there exists a set \(W(x^*) \subset {\mathbb {R}}^n {\setminus } \{0\}\) such that \(V(x^*) \subset W(x^*)\) and the generalized representation condition holds for f(x) and \(g(x,\omega _j), j=1,\ldots ,m,\) at \(x^*\) on \( X_0^\varGamma \) with respect to the same \(\eta (x,x^*)\in W(x^*)\), where \(\eta (\cdot ,x^*): {\mathbb {R}}^n \rightarrow W(x^*)\) is continuous,
Then \(\lambda ^*\) is an augmented Lagrange multiplier of (SIP).
Proof
As in the proof of Theorem 4.1, we will prove that there exists \(r^*>0\) such that (19) holds on \( {\mathbb {R}}^n= \varTheta \cup ( {\mathbb {R}}^n \backslash \varTheta )\), where \(\varTheta \) is defined by (58). The proof for the case in which \(x\not \in \varTheta \) follows the same line of argument as Step I of the proof of Theorem 4.1 and hence is omitted; namely, Step I of that proof can be quoted to conclude that there exists \(r_1^*>0\) such that (19) holds on \( {\mathbb {R}}^n \backslash \varTheta \). To complete the proof, we prove that there exists \(r_2^*>0\) such that (19) holds on \(\varTheta \). By contradiction, suppose that there exist sequences \(\{r_k\}\) and \(\{x_k\}\) with \(r_k>0\), \(x_k\in \varTheta \), such that \(r_k\rightarrow +\infty \) and (62) holds. Again, steps similar to those in Step II of the proof of Theorem 4.1 establish (63) and prove that \({\bar{x}}\in X_0^\varGamma \), where \( X_0^\varGamma \) is defined by (27). Because \(V(x^*) \subset W(x^*)\), it follows from Lemma 4.3 that there exist \(r^*>0\) and a neighborhood \(N(x^*)\) of \(x^*\) such that (48) holds, i.e.,
Thus, it follows from (62) that \({\bar{x}}\not = x^*\), since \(x_k\rightarrow {\bar{x}}\), as \(k\rightarrow \infty \).
By assumption, there is \(\eta ({\bar{x}},x^*) \in W(x^*)\) such that the generalized representation conditions for f(x) and \(g(x,\omega _j), j=1,\ldots ,m,\) hold. By (68), we have
Thus, for k sufficiently large, and by continuity of \(\eta (\cdot , x^*)\),
Thus, by the KKT conditions and (69), we have
which contradicts (63). Therefore, there exists \(r_2^*>r^*>0\) such that (19) holds on \(\varTheta \).
Let \(r^*=\max \{r_1^*,r_2^*\}\). Therefore, (19) holds on \( {\mathbb {R}}^n\). By Lemma 2.2, \(\lambda ^*\) is an augmented Lagrange multiplier of \((\text {SIP})\) with penalty parameter \(r^*\). \(\square \)
Remark 4.5
If Assumption 3.1 holds, then conditions (i) and (ii) of Theorem 4.1 are satisfied (see the proof of Theorem 3.2).
5 Conclusions
In the case of a sharp Lagrangian, we obtained first-order necessary or sufficient conditions for the existence of an augmented Lagrange multiplier for (SIP). Using a valley at 0 augmenting function, we obtained second-order sufficient conditions for the existence of an augmented Lagrange multiplier for (SIP). We employed the discretization technique in our work and provided some characterizations of augmented Lagrange multipliers for (SIP) in terms of saddle points. However, we could not establish the equivalence of the (SIP) and its discretized optimization problem when an augmented Lagrange multiplier exists. We leave this question as an open problem for our future research.
References
Fiacco, A.V., Kortanek, K.O.: Semi-Infinite Programming and Applications, Lecture Notes in Economics and Mathematical Systems, vol. 215. Springer, New York (1983)
Goberna, M.A., López, M.A.: Linear Semi-Infinite Optimization. Wiley, New York (1998)
Polak, E.: On the mathematical foundations of nondifferentiable optimization in engineering design. SIAM Rev. 29, 21–89 (1987)
Polak, E., Wuu, T.L.: On the design of stabilizing compensators via semi-infinite optimization. IEEE Trans. Autom. Control 34, 196–200 (1989)
Polak, E.: On the use of consistent approximations in the solution of semi-infinite optimization and optimal control problems. Math. Program. 62, 385–414 (1993)
Hettich, R.: An implementation of a discretization method for semi-infinite programming. Math. Program. 34, 354–361 (1986)
Still, G.: Discretization in semi-infinite programming: the rate of convergence. Math. Program. 91, 53–69 (2001)
Hettich, R., Kortanek, K.O.: Semi-infinite programming: theory, methods and applications. SIAM Rev. 35, 380–429 (1993)
Reemtsen, R., Görner, S.: Numerical methods for semi-infinite programming: a survey. In: Reemtsen, R., Rückmann, J. (eds.) Semi-Infinite Programming, pp. 195–275. Kluwer Academic Publishers, Boston (1998)
Zhang, L.P., Wu, S.Y., López, M.A.: A new exchange method for convex semi-infinite programming. SIAM J. Optim. 20, 2959–2977 (2010)
Hestenes, M.R.: Multiplier and gradient methods. J. Optim. Theory Appl. 4, 303–320 (1969)
Powell, M.J.D.: A method for nonlinear constraints in minimization problems. In: Fletcher, R. (ed.) Optimization, pp. 283–298. Academic Press, New York (1969)
Buys, J.D.: Dual algorithms for constrained optimization problems. Ph.D. thesis, University of Leiden, Leiden, The Netherlands (1972)
Rockafellar, R.T.: Augmented Lagrange multiplier functions and duality in nonconvex programming. SIAM J. Control Optim. 12, 268–285 (1974)
Rockafellar, R.T., Wets, R.J.B.: Variational Analysis. Springer, Berlin (1998)
Penot, J.P.: Augmented Lagrangians, duality and growth conditions. J. Nonlinear Convex Anal. 3, 283–302 (2002)
Huang, X.X., Yang, X.Q.: A unified augmented Lagrangian approach to duality and exact penalization. Math. Oper. Res. 28, 533–552 (2003)
Zhou, Y.Y., Yang, X.Q.: Some results about duality and exact penalization. J. Glob. Optim. 29, 497–509 (2004)
Burachik, R.S., Rubinov, A.: Abstract convexity and augmented Lagrangians. SIAM J. Optim. 18(2), 413–436 (2007)
Burachik, R.S.: On primal convergence for augmented Lagrangian duality. Optimization 60(8–9), 979–990 (2011)
Burachik, R.S., Iusem, A.N., Melo, J.G.: The exact penalty map for nonsmooth and nonconvex optimization. Optimization 64(4), 717–738 (2015)
Shapiro, A., Sun, J.: Some properties of the augmented Lagrangian in cone constrained optimization. Math. Oper. Res. 29, 479–491 (2004)
Sun, X.L., Li, D., McKinnon, K.I.M.: On saddle points of augmented Lagrangians for constrained nonconvex optimization. SIAM J. Optim. 15, 1128–1146 (2005)
Rückmann, J.J., Shapiro, A.: Augmented Lagrangians in semi-infinite programming. Math. Program. Ser. B 116, 499–512 (2009)
Zhou, Y.Y., Zhou, J.C., Yang, X.Q.: Existence of augmented Lagrange multipliers for cone constrained optimization problems. J. Glob. Optim. 58(2), 243–260 (2014)
Yang, X.Q.: Second-order global optimality conditions for convex composite optimization. Math. Program. 81, 327–347 (1998)
Bazaraa, M.S., Sherali, H.D., Shetty, C.M.: Nonlinear Programming: Theory and Algorithms, 3rd edn. Wiley-Interscience, Hoboken, New Jersey (2006)
Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer, New York (2000)
Clarke, F.H.: Optimization and Nonsmooth Analysis, Classics in Applied Mathematics. SIAM, Philadelphia (1983)
Gauvin, J.: A necessary and sufficient regularity condition to have bounded multipliers in nonconvex programming. Math. Program. 12, 136–138 (1977)
Burke, J.V.: An exact penalization viewpoint of constrained optimization. SIAM J. Control Optim. 29, 968–998 (1991)
Han, S.P., Mangasarian, O.L.: Exact penalty functions in nonlinear programming. Math. Program. 17, 251–269 (1979)
Rubinov, A.M., Yang, X.Q.: Lagrange-type Functions in Constrained Non-convex Optimization. Kluwer Academic Publishers, Boston (2003)
Wang, S., Yang, X.Q., Teo, K.L.: Power penalty method for a linear complementarity problem arising from American option valuation. J. Optim. Theory Appl. 129(2), 227–254 (2006)
Acknowledgements
The authors are grateful to the referees and the Associate Editor for their careful reading and comments, which have improved the final presentation of the paper. The work described in this paper was partially supported by Grants from the Research Grants Council of the Hong Kong Special Administrative Region, China (PolyU 5334/08E and PolyU 5292/13E) and by the Natural Science Foundation of China (11471235, 11171247 and 11371273).
Communicated by Marco Antonio López-Cerdá.
This paper is dedicated to the memory of the late Professor Vladimir Demyanov.
Burachik, R.S., Yang, X.Q. & Zhou, Y.Y. Existence of Augmented Lagrange Multipliers for Semi-infinite Programming Problems. J Optim Theory Appl 173, 471–503 (2017). https://doi.org/10.1007/s10957-017-1091-6
Keywords
- Semi-infinite programming
- Augmented Lagrange multiplier
- Optimality conditions
- Sharp Lagrangian
- A valley at 0 augmenting function