Abstract
In this paper, we consider a nonlinear programming problem for which the constraint set may be infeasible. We propose an algorithm based on a large family of augmented Lagrangian functions and analyze its global convergence properties taking into account the possible infeasibility of the problem. We show that, in a finite number of iterations, the algorithm either detects the infeasibility of the problem or finds an approximate feasible/optimal solution with any required precision. We illustrate, by means of numerical experiments, that our algorithm is reliable for different Lagrangian/penalty functions proposed in the literature.
1 Introduction
It is well known that augmented Lagrangian methods are efficient tools for solving constrained optimization problems in many application areas. The augmented Lagrangian for optimization problems with equality constraints was proposed by Hestenes [12] and Powell [23] and extended by Rockafellar to inequality constraints [24]. The Powell–Hestenes–Rockafellar (PHR) augmented Lagrangian function has been widely used in practice; see for example [1, 8], related to the packages Algencan and Lancelot, respectively.
In each iteration of an augmented Lagrangian method, a subproblem with simple constraints is approximately solved. One drawback of the PHR approach is that the objective functions of the subproblems are not twice continuously differentiable, which can result in difficulties to apply a Newton-like method. This motivates the study of different augmented Lagrangian functions. Many nonlinear Lagrangian/penalty functions have been proposed and/or analyzed by many authors, see for example [2, 3, 6, 11, 14, 16, 17, 21, 22, 25, 29].
It is desirable that optimization algorithms have effective stopping criteria. Ideally, an algorithm should stop at an iterate for which it is possible to ensure feasibility and optimality up to arbitrarily small given precisions. On the other hand, if the problem is infeasible, the algorithm should stop with a certificate of infeasibility. In the context of global optimization and PHR augmented Lagrangian methods, this issue was considered in [5]. In the local optimization field, strategies handling possible infeasibility were considered, for example, in [7, 19, 30] in the context of SQP, augmented Lagrangian and Interior Point methods, respectively. In some practical applications, it may be useful to accept some level of infeasibility (for example when the constraints involve modeling errors) and distinguish infeasible points according to their objective function values. These issues were considered in [4].
In this paper, we consider a large family of augmented Lagrangian methods introduced by Kort and Bertsekas [2, 14]. We propose an algorithm based on this class and, accepting inexact global solutions of the subproblems, study its asymptotic convergence properties. We prove that the limit points of the sequence generated by the algorithm are global minimizers of an infeasibility measure, which is specified for some penalty functions. Based on these results, a second algorithm is proposed, which stops after a finite number of iterations at an approximate feasible/optimal solution or detects that the problem is infeasible. In particular, we extend some results of [5] to our more general setting. We also present some numerical experiments illustrating the applicability of the algorithm.
This paper is organized as follows. In Sect. 2, we present the optimization problem and the class of Lagrangian/penalty functions to be analyzed. In Sect. 3, we propose an algorithm and examine its asymptotic convergence properties. We also propose a second algorithm that takes into account the possible infeasibility of the problem. In Sect. 4, we present some numerical experiments analyzing our algorithm for some Lagrangian/penalty functions. The last section is devoted to conclusions and final remarks.
Notation
The \(\alpha \)-norm and the maximum norm of \(x \in {{\mathbb {R}}}^m\) are denoted, respectively, by \( \Vert x \Vert _\alpha :=\left( \sum _{i=1}^{m}|x_i|^{\alpha }\right) ^{1/\alpha }\) and \(\Vert x \Vert _\infty :=\max (|x_{1}|,\ldots ,|x_m|)\). When \(\alpha =2\), we use the notation \(\Vert \cdot \Vert \). We denote by \(\mathbb {R}_{+}\) and \(\mathbb {R}_{++}\) the sets of non-negative and strictly positive real numbers, respectively. For \(v \in \mathbb {R}^n\), we write \(v_{+} := (\max \{0, v_1\}, \ldots , \max \{0, v_n\})\). Given an infinite subset \(K = (k_1, k_2, \ldots ) \subseteq \mathbb {N}\) (with \(k_j < k_{j+1}\) for all \(j\)), we write \(K \displaystyle \mathop {\subset }_{\infty }\mathbb {N}\).
2 General Lagrangian function
In this paper, we consider the inequality constrained optimization problem
where \( f: \mathbb {R}^n \rightarrow \mathbb {R}\) and \(g: \mathbb {R}^n \rightarrow \mathbb {R}^m\) are continuous and \(\Omega \subset \mathbb {R}^n \) is a nonempty compact set. In general, \(\Omega \) is defined by simple constraints such as affine constraints or box constraints.
In order to solve problem (1) or identify if the feasible set is empty, we propose an algorithm based on the general class of Lagrangians introduced by Kort and Bertsekas [2, 14]
where \(x\in \Omega ,\) \(\lambda \in \mathbb {R}^{m}_{+}, \rho >0\), and the penalty function \(W:{{\mathbb {R}}}^2\rightarrow {{\mathbb {R}}}\) satisfies the following conditions:
- (H1) \(W\) is continuously differentiable on \({{\mathbb {R}}}\times ]0,\infty [\) and possesses, for all \(s \in {{\mathbb {R}}}\), the right derivative
$$\begin{aligned} \lim _{t \rightarrow 0^+}\frac{W(s,t)-W(s,0)}{t}; \end{aligned}$$
- (H2) \(W(s,\cdot )\) is concave on \({{\mathbb {R}}}_{+}\) for each fixed \(s\in {{\mathbb {R}}}\);
- (H3) for each fixed \(t \in {{\mathbb {R}}}_{+}\), \(W(\cdot ,t)\) is convex on \({{\mathbb {R}}}\) and satisfies the following strict convexity condition: if \(s_0>0\) or \(W'_s(s_0,t)>0\), then \(W(s,t)- W(s_0,t)>(s-s_0)W'_s(s_0,t)\) for \(s\ne s_0\), where \(W'_s(s,t)\) denotes the partial derivative of \(W(s,t)\) with respect to \(s\);
- (H4) \(W(0,t)=0\) and \(W'_s(0,t)=t\), for all \(t\in {\mathbb {R}}_{+}\);
- (H5) \( \lim _{s\rightarrow -\infty }W'_s(s,t)=0\), for all \(t\in {\mathbb {R}}_{+}\);
- (H6) \( \inf _{s\in {{\mathbb {R}}}}W(s,t)> -\infty \), for all \(t\in {{\mathbb {R}}}_{+}\);
- (H7) \( \lim _{s\rightarrow \infty }\frac{W(s,t)}{s}=\infty \), for all \(t\in {{\mathbb {R}}}_{+}\).
Remark 1
Since, by (H3), \(W(\cdot ,t)\) is convex, it follows that \(W'_s(\cdot ,t)\) is monotone non-decreasing. Hence, (H5) implies
which shows that \(W(\cdot ,t)\) is non-decreasing for every \(t\in {{\mathbb {R}}}_{+}\). It is also possible to show that (H7) is equivalent to \(\lim _{s\rightarrow \infty }{W'_s(s,t)}=\infty \), for all \(t\in {{\mathbb {R}}}_{+}\).
The family of penalty functions satisfying (H1)–(H7) is very large. Next, we present two subclasses of functions satisfying these conditions. The first one generalizes the classical quadratic augmented Lagrangian, and the second one contains some exponential type functions. In particular, every function in the second family is twice continuously differentiable. For more details on these functions and other examples, see for instance [2].
Example 1
Let \(W:{{\mathbb {R}}}^2\rightarrow {{\mathbb {R}}}\) such that
where \(\phi :{{\mathbb {R}}}\rightarrow {{\mathbb {R}}}\) satisfies the following conditions:
- (E1) \(\phi \) is continuously differentiable and strictly convex on \({{\mathbb {R}}}\);
- (E2) \(\phi (0)=0\), \(\phi '(0)=0\), \(\lim _{s \rightarrow -\infty }\phi '(s)=-\infty \);
- (E3) \(\lim _{s \rightarrow \infty } \frac{\phi (s)}{|s|}=\infty .\)
It is not difficult to prove that the penalty function \(W(s,t)\) defined in (3) satisfies (H1)–(H7), see [2]. In the following, we consider some examples of \(\phi \) satisfying conditions (E1)–(E3):
- (i) \(\phi (s)=\alpha ^{-1}|s|^{\alpha }\), \(\alpha >1\);
- (ii) \(\phi (s)=\alpha ^{-1}|s|^{\alpha }+(1/2)s^2\), \(\alpha >1\);
- (iii) \(\phi (s)=\cosh (s)-1\).
Taking \(\phi (s)=(1/2)s^2\) in (3), we obtain
Hence, after simple calculations, the corresponding augmented Lagrangian function (2) can be rewritten as
which is the PHR augmented Lagrangian function [24]. Now, if \(\phi (s)=\alpha ^{-1}|s|^{\alpha }\), where \(\alpha >1\), or \(\phi (s)=\cosh (s)-1\), we obtain the following penalty functions, respectively,
and
(Recall that \(\sinh ^{-1}(x)=\ln (x+\sqrt{x^2+1})\)).
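As a quick numerical sanity check (ours, not part of the paper), the snippet below verifies that the closed max-form of the PHR penalty agrees with the piecewise form obtained from (3) with \(\phi (s)=(1/2)s^2\), and that (H4) holds. The function names are illustrative.

```python
def W_phr(s, t):
    # Closed max-form of the PHR penalty (see [24]):
    # W(s, t) = ((max{0, t + s})^2 - t^2) / 2
    return (max(0.0, t + s) ** 2 - t ** 2) / 2.0

def W_piecewise(s, t):
    # Piecewise form obtained from (3) with phi(s) = s^2/2:
    #   W(s, t) = t*s + s^2/2  if s >= -t
    #           = -t^2/2       otherwise (penalty flattens out, cf. (H6))
    return t * s + s ** 2 / 2.0 if s >= -t else -t ** 2 / 2.0

# The two forms agree on a small grid of (s, t) pairs.
checks = []
for s in [-3.0, -1.0, -0.5, 0.0, 0.7, 2.0]:
    for t in [0.0, 0.5, 1.0, 3.0]:
        checks.append(abs(W_phr(s, t) - W_piecewise(s, t)) < 1e-12)

# (H4): W(0, t) = 0 and W'_s(0, t) = t (central finite difference).
h, t0 = 1e-6, 1.3
deriv_at_zero = (W_phr(h, t0) - W_phr(-h, t0)) / (2 * h)
```

Here the equality of the two forms is elementary: for \(s \ge -t\), \(((t+s)^2-t^2)/2 = ts + s^2/2\), while for \(s < -t\) the max vanishes and \(W = -t^2/2\).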
We mention that, similarly to the quadratic augmented Lagrangian, the penalty functions of Example 1 may not be twice continuously differentiable. Next, we present some examples of penalty functions with the advantage that the corresponding augmented Lagrangian functions are twice continuously differentiable, see [2] for details.
Example 2
Let \(W:{{\mathbb {R}}}^2\rightarrow {{\mathbb {R}}}\) such that
where \( \psi ,\xi :{{\mathbb {R}}}\rightarrow {{\mathbb {R}}}\) satisfy the following conditions:
- (A1) \(\psi \) and \(\xi \) are twice continuously differentiable and convex;
- (A2) \(\psi (0)=0\), \(\psi '(0)=1\) and \(\psi ''(s)>0\) for all \(s\in \mathbb {R}\);
- (A3) \(\lim _{s \rightarrow -\infty }\psi (s)>-\infty \), \(\lim _{s \rightarrow -\infty }\psi '(s)=0\) and \(\lim _{s \rightarrow \infty }\psi '(s)=\infty \);
- (A4) \(\xi (s)=0\) for all \(s\le 0\) and \(\xi (s)>0\) for all \(s>0\);
- (A5) \(\xi ''(0)=0\) and \(\lim _{s \rightarrow \infty }\xi '(s)=\infty \).
An interesting example in this family of penalty functions is obtained by taking \(\psi (s)=e^s-1\) and \(\xi (s)=\frac{1}{3}[\max \{0,s\}]^3\). In this case,
and the associated augmented Lagrangian is
Another example of a twice continuously differentiable penalty function is obtained by taking \(\psi \) and \(\xi \) as
Thus, the corresponding penalty function (8) becomes
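The smoothness advantage of this subclass can be checked numerically. The snippet below is our illustration, assuming the additive combination \(W(s,t)=t\,\psi (s)+\xi (s)\) as the reading of the elided formula (8); it verifies (H4) and that the second derivative in \(s\) has no jump at \(s=0\), for \(\psi (s)=e^s-1\) and \(\xi (s)=\frac{1}{3}[\max \{0,s\}]^3\).

```python
import math

def W_exp(s, t):
    # Assumed form of (8): W(s, t) = t*psi(s) + xi(s),
    # with psi(s) = e^s - 1 and xi(s) = (1/3) max{0, s}^3.
    return t * (math.exp(s) - 1.0) + (max(0.0, s) ** 3) / 3.0

def W_exp_ss(s, t):
    # Second partial derivative in s: t*e^s + 2*max{0, s},
    # which is continuous, so W is twice continuously differentiable.
    return t * math.exp(s) + 2.0 * max(0.0, s)

h, t0 = 1e-6, 0.8
# (H4): W(0, t) = 0 and W'_s(0, t) = t (central finite difference).
first = (W_exp(h, t0) - W_exp(-h, t0)) / (2 * h)
# No jump of the second derivative across s = 0.
jump = abs(W_exp_ss(1e-9, t0) - W_exp_ss(-1e-9, t0))
```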
In the following, we present a lemma which is essential for the analysis of our algorithm.
Lemma 1
Let \(s\in {{\mathbb {R}}}\), \(t\ge 0\), and let \(\{t_k\}\subset {{\mathbb {R}}}_{+}\) be a bounded sequence. Then:
- (i) \(W_t'(s,t)\ge s\);
- (ii) \(sW_s'(s,t)\ge W(s,t)\ge tW_t'(s,t)\ge st;\)
- (iii) \(W_s'(s,t)=t\) implies \(s\le 0\) and \(st=0\);
- (iv) \(\lim _{k\rightarrow \infty } \frac{1}{r_k}\inf _{s\in \mathbb {R}} W(s,t_k)=0,\) whenever \(r_k \rightarrow \infty \).
Proof
For the proofs of items (i)–(iii), see [2, Proposition 5.1]. Let us prove item (iv). It follows by assumptions (H4) and (H6) that
From (H2), \(q(t):= \inf _{s\in \mathbb {R}} W(s,t)\) is concave on \([0,+\infty [\). Let \(0<t_k\le b \), \(v_k \in \partial q(t_k)\) and \(v \in \partial q(b)\). Therefore, using the supergradient inequality \(q(0)\le q(t_k)-v_kt_k \) and the monotonicity of the superdifferential (\(t_k\le b\Rightarrow v\le v_k \)), we obtain
Since the last estimate also holds for \(t_k=0\) and \(q(t)\le 0\) for all \(t\ge 0\), it follows that \(q(t_k)\) is bounded for \(0\le t_k\le b \). Therefore, as \(r_k \rightarrow \infty \), we have
\(\square \)
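Lemma 1 holds for every \(W\) in the class. As a concrete illustration (our choice, using the PHR instance of \(W\)), the snippet below checks items (i) and (ii) on a small grid; the explicit partial derivatives used are elementary facts about the max-form of the PHR penalty.

```python
# Numerical sanity check of Lemma 1, items (i)-(ii), for the PHR penalty
# W(s,t) = ((max{0, t+s})^2 - t^2)/2, whose partial derivatives are
# W'_s(s,t) = max{0, t+s} and W'_t(s,t) = s if t+s > 0, else -t.

def W(s, t):
    return (max(0.0, t + s) ** 2 - t ** 2) / 2.0

def W_s(s, t):
    return max(0.0, t + s)

def W_t(s, t):
    return s if t + s > 0 else -t

ok_i, ok_ii = True, True
for s in [-2.0, -1.0, -0.3, 0.0, 0.4, 1.5]:
    for t in [0.0, 0.2, 1.0, 2.5]:
        ok_i &= W_t(s, t) >= s - 1e-12                  # item (i)
        # item (ii): s*W'_s >= W >= t*W'_t >= s*t, checked as a chain
        chain = (s * W_s(s, t), W(s, t), t * W_t(s, t), s * t)
        ok_ii &= all(a >= b - 1e-12 for a, b in zip(chain, chain[1:]))
```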
3 Algorithms
In this section, we study global convergence properties of an augmented Lagrangian method associated with the Lagrangian function defined in (2). We state the algorithm and show that its generated sequence satisfies an asymptotic infeasibility measure. Based on this result, we show some infeasibility measures satisfied by the limit points of the sequence generated by the algorithm when specific examples of penalty functions are considered. We also analyze some properties of the algorithm, which are essential for the feasibility/infeasibility test presented in the second algorithm. In order to state the algorithm, let \(\{\varepsilon _k\}\subset {{\mathbb {R}}}_{+}\) be a bounded sequence, which is used to estimate the quality of approximate solutions of the augmented Lagrangian subproblems. Also, let us define \(\sigma (x,\lambda ,\rho ):=(\sigma _1(x,\lambda ,\rho ),\ldots ,\sigma _m(x,\lambda ,\rho ) )^T\) with
It follows from item (iii) of Lemma 1 that \(\left\| \sigma (x,\lambda ,\rho )\right\| _{\infty }\) may be used to measure the feasibility and complementarity conditions at an approximate solution \(x\) with multiplier \(\lambda \). We shall use this measure to decide when to update the penalty parameter.
Algorithm 1
Let \(\gamma >1, 0 < \tau < 1, \rho _1>0, \lambda _{\max }>0\), and \({\lambda }^1 \in {\mathbb {R}}^m\) such that \({{\lambda }}^1_i \in [0, \lambda _{\max }],\) for all \(i=1,\dots ,m\). Assume that \(\{\varepsilon _k\}\) is a bounded positive sequence and initialize \(k \leftarrow 1\).
- Step 1 (Subproblem): Find \(x^k \in \Omega \) such that
$$\begin{aligned} L(x^k, {\lambda }^k,\rho _k) \le L(x, {\lambda }^k,\rho _k) + \varepsilon _k \end{aligned}$$(12)
for all \(x \in \Omega \).
- Step 2 (Update penalty parameter): If \(k=1\) or
$$\begin{aligned} \Vert \sigma (x^k,{\lambda }^k,\rho _k)\Vert _{\infty } \le \tau \; \Vert \sigma (x^{k-1},{\lambda }^{k-1},\rho _{k-1})\Vert _{\infty }, \end{aligned}$$(13)
set \(\rho _{k+1} = \rho _k\). Otherwise, set \(\rho _{k+1} = \gamma \rho _k\).
- Step 3 (Update multipliers): Compute \( {\lambda }^{k+1}_i \in [0, \lambda _{\max }],\) for all \( i=1,\dots ,m\). Set \(k \leftarrow k+1\) and go to Step 1.
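To make the structure of Algorithm 1 concrete, the following minimal sketch (ours, not the authors' implementation) applies it to a toy one-dimensional problem with the PHR penalty. Two assumptions are labeled in the comments: since formula (11) for \(\sigma \) is not reproduced in this excerpt, the common Algencan-style measure \(\sigma = \max \{g(x), -\lambda /\rho \}\) is used as a stand-in, and the subproblem is "globally" solved by brute force on a grid.

```python
# Toy instance: min f(x) = x  s.t.  g(x) = 1 - x <= 0,  Omega = [-2, 2];
# the global minimizer is x* = 1.
import numpy as np

f = lambda x: x
g = lambda x: 1.0 - x
grid = np.linspace(-2.0, 2.0, 4001)   # brute-force "global" subproblem solver

gamma_pen, tau, lam_max = 10.0, 0.5, 1e10
rho, lam = 10.0, 0.0
sigma_prev = None
for k in range(20):
    # Step 1: eps_k-global minimization of the PHR augmented Lagrangian over Omega
    L = f(grid) + (np.maximum(0.0, lam + rho * g(grid)) ** 2 - lam ** 2) / (2 * rho)
    x = grid[np.argmin(L)]
    # feasibility/complementarity measure (assumed stand-in for (11))
    sigma = max(g(x), -lam / rho)
    # Step 3: safeguarded multiplier update (PHR form of rule (14)), using rho_k
    lam_new = min(max(0.0, lam + rho * g(x)), lam_max)
    # Step 2: increase rho only if sigma did not decrease sufficiently
    if sigma_prev is not None and abs(sigma) > tau * abs(sigma_prev):
        rho *= gamma_pen
    sigma_prev = sigma
    lam = lam_new
```

On this toy problem the iterates settle at \(x \approx 1\) after a couple of outer iterations, with the multiplier approaching the exact value \(\lambda ^* = 1\).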
We mention that, as in many augmented Lagrangian methods, solving the subproblems is usually a difficult task. The underlying assumption is that solving a sequence of optimization problems with simple constraints is much easier than solving the constrained optimization problem (1) directly. To alleviate this task, we accept approximate solutions of the subproblems. In the first iterations, the tolerance \(\varepsilon _k\) may be loose, and it should be tightened as the algorithm progresses. Moreover, it follows from Remark 1 that, if problem (1) is convex, then the subproblems are also convex, which implies that every stationary point of a subproblem is a global solution.
In order to prove the theoretical results, we require the boundedness of the Lagrange multipliers, see Step 3 of Algorithm 1. In practice, the rule used to update the Lagrange multipliers is decisive for the performance of the algorithm. For the family of augmented Lagrangian functions studied in this paper, we consider the safeguarded strategy to update the Lagrange multipliers given by
A similar strategy is also considered in [1, 3, 5, 16].
In the following, we present some properties of Algorithm 1. These properties motivate our last algorithm, which incorporates an infeasibility test. Let us consider the function \(M:\mathbb {R}^m \times \mathbb {R}_{+}^{m} \times \mathbb {R}_{++} \rightarrow \mathbb {R}\) defined by
Proposition 2
Assume that \(\{x^k\}\) is a sequence generated by Algorithm 1. Then, for all \(z \in \Omega \),
for any continuous function \(\beta :\;]0,\infty [\;\rightarrow \mathbb {R}_{++}\) such that \(\lim _{s \rightarrow \infty }\beta (s)=\infty \). In particular, if problem (1) is feasible, then every limit point of \(\{x^ k\}\) is feasible.
Proof
Let \(x^* \in \Omega \) be an arbitrary limit point of \(\{x^k\}\) (such a point exists because \(\Omega \) is compact). Let \(K \displaystyle \mathop {\subset }_{\infty }\mathbb {N}\) such that \(\lim _{k \in K} x^k = x^*\). As \(\{\rho _k\}\) is non-decreasing, we may consider two cases: \(\{\rho _k\}\) bounded and \(\rho _k\rightarrow \infty \).
Case 1 (\(\{\rho _k\}\) bounded): By Step 2 of Algorithm 1, we have \(\lim _{k \rightarrow \infty }\Vert \sigma (x^k,\lambda ^k,\rho _k)\Vert _{\infty }=0,\) or equivalently,
Since \(\{\lambda ^k\}\) and \(\{\rho _k\}\) are bounded, we may assume, without loss of generality, \(\lambda ^k\rightarrow \bar{\lambda }\ge 0\), and \(\rho _k\rightarrow \bar{\rho }>0\). So, (17) implies
This result, combined with item (iii) of Lemma 1, gives \(g({x^*})\le 0\), which trivially implies the second part of the proposition when \(\{\rho _k\}\) is bounded. Now, since \( g_i({x^*})\le 0\) for all \(i=1,\ldots , m\), \(W(\cdot ,t)\) is non-decreasing for every \(t\ge 0\), and \(W(0,\bar{\lambda }_i)=0\), we obtain
Similarly, observing that \(0\le \max \{0,g_i(z)\}=g_i(z)_{+}\), we have
Therefore, (16) is implied by (18) and (19).
Case 2 (\(\rho _k\rightarrow \infty \)): For any \(z \in \Omega \), we have, by Step 1 of Algorithm 1
Since \(g_i(z)\le g_i(z)_{+}\) for all \(i=1, \ldots ,m\) and \(W(\cdot ,t)\) is non-decreasing for every \(t\ge 0\), taking into account (2), the last inequality becomes
Hence, using the functions \(\beta \) and M [see (15)], it follows from the last inequality that
As \(\{\varepsilon _k\}\) is bounded, \(f\) is continuous, \(\lim _{\rho _k \rightarrow \infty }\beta (\rho _k)=\infty \), and \(\Omega \) is compact, (16) follows from the last inequality. This concludes the proof of the first part of the proposition.
Observing that when \(\{\rho _k\}\) is bounded, the last assertion of the proposition has already been proved, we may assume that \(\rho _k \rightarrow \infty \). Let \(z \in \Omega \) be a global minimizer of (1). Hence, using that \(\max \{0,g_i(z)\}=0, W(0,{\lambda }^k_i)=0\) and (16), we obtain
for any continuous function \(\beta :\;]0,\infty [\;\rightarrow \mathbb {R}_{++}\) such that \( \lim _{s \rightarrow \infty }\beta (s)=\infty \). Suppose, by contradiction, that \(x^*\) is not feasible for problem (1). Then there exists some \(i_0\) such that \(g_{i_0}(x^{*})> 0\). Hence, for some \(\delta >0\), it follows from the continuity of \(g_{i_0}\) that there exists \(k_0>0\) such that \(g_{i_0}(x^{k})\ge \delta \) for all \(k\ge k_0\), \(k \in K\). Consequently, using the fact that \(W(\cdot ,t)\) is non-decreasing for every \(t\ge 0\), we obtain
where we have used properties of infimum and item (i) of Lemma 1 in the last inequality. It follows by item (iv) of Lemma 1 and properties of function \(\beta \) that
Now, taking \(\beta (s)={W(\delta s,0)}/{s}\), (H7) implies \(\lim _{s\rightarrow \infty }\beta (s)=\infty \). Hence, it follows by (21) and (22) that
which is a contradiction with (20), concluding the proof. \(\square \)
Proposition 2 presents an asymptotic infeasibility measure associated with the general augmented Lagrangian function (2). In the following, we specify this infeasibility measure for some specific penalty functions.
Corollary 3
Let \(\{x^k\}\) be an infinite sequence generated by Algorithm 1 with \(W\) defined in (6). For any \(x^*\) limit point of \(\{x^k\}\), and for all \(z \in \Omega \) we have
Proof
Let \(K \displaystyle \mathop {\subset }_{\infty }\mathbb {N}\) be such that \(\lim _{k \in K} x^k = x^*\). Using the definition in (6) and \(\rho _k>0\) for all \(k \in \mathbb {N}\), we obtain
First, consider the case in which \(\rho _k \rightarrow \infty \). It follows from this expression of \(W\) that
It is easy to see that a similar argument shows
Therefore, by Proposition 2 with \(\beta (s)=s^{\alpha -1}\), we conclude that (23) holds in the case \(\rho _k \rightarrow \infty \).
In the case \(\{\rho _k\}\) bounded, similarly to the first part of the proof of Proposition 2, we obtain that each limit point \(x^*\) of \(\{x^k\}\) is feasible for problem (1), which trivially implies (23). \(\square \)
Remark 2
For the PHR augmented Lagrangian function (5), Corollary 3 implies the infeasibility measure \(\Vert g(x^*)_+\Vert \le \Vert g(z)_+\Vert \), for every \(z\in \Omega \) and \(x^*\) limit point of \(\{x^k\}\) (see [5, Theorem 2.1] for a similar result). If we consider the penalty function (10), similarly to the proof of Corollary 3 with \( \beta (s)=s^{2}\), we obtain \(\Vert g(x^*)_+\Vert _3\le \Vert g(z)_+\Vert _3\), for every \(z\in \Omega \) and \(x^*\) limit point of \(\{x^k\}\).
Next, we present an infeasibility measure for the limit points of the sequence \(\{x^k\}\) generated by Algorithm 1 when the exponential type function is considered.
Corollary 4
Assume that \(\{x^k\}\) is a sequence generated by Algorithm 1 with \(W\) defined in (9). Let \(x^*\in \Omega \) be a limit point of \(\{x^k\}\) such that \(\displaystyle \lim _{k \in K} x^k = x^*\) with \(\displaystyle K\displaystyle \mathop {\subset }_{\infty }\mathbb {N}\). Suppose that if \(g_i(x^*)>0\) then \(\displaystyle \limsup _{j\in K}\lambda _i^j>0\). Then, for all \(z \in \Omega \),
Proof
Let us assume that problem (1) is infeasible, otherwise, it follows from the last statement of Proposition 2 that (24) trivially holds. Suppose by contradiction that there exists \(z\in \Omega \) such that \(\Vert g(x^*)_+\Vert _{\infty }> \Vert g(z)_+\Vert _{\infty }\). Since problem (1) is infeasible, \(\Vert g(z)_+\Vert _{\infty }>0\) and \(\rho _k \rightarrow \infty \) (see Step 2 of Algorithm 1). Therefore, there exists an index \(i_0\in \{1,\ldots ,m\}\) and \(\delta >0\) such that
As \(g\) is continuous, there exists \(k_0>0\) such that \(g_{i_0}(x^k)>\Vert g(z)_+\Vert _{\infty }+\delta \), for any \(k\ge k_0, k\in K\). In particular, it follows from our assumptions that \( \limsup _{k\in K}\lambda _{i_0}^k>0\). Observing that \(0\le \lambda ^k_i\le \lambda _{\max } \) and considering the function \(M\) of (15) and \(W\) defined in (9), it follows that
Therefore, since \(z\) is fixed, defining \(\displaystyle \beta (s)=e^{s\Vert g(z)_+\Vert _{\infty }}\) we have
As \( \limsup _{k\in K}\lambda _{i_0}^k>0\), we obtain \( \limsup _{k\in K}( \lambda _{i_0}^k{e^{\rho _k\delta }})/{\rho _k}=\infty .\) Combining this with the previous inequalities yields
On the other hand,
where the first and third inequalities are due to the fact that the multipliers are non-negative and bounded by \(\lambda _{\max }\), respectively. In the second one, we used \(g_i(z)_+\le \Vert g(z)_+\Vert _{\infty }\), for \(i=1,\ldots ,m.\) Since \(\{\rho _k\}\) tends to infinity, we obtain from (26)
which combined with (25) gives a contradiction, by Proposition 2. The proof of the corollary follows. \(\square \)
Remark 3
We mention that the update rule for the Lagrange multipliers is general, see Step 3 of Algorithm 1. Therefore, the additional asymptotic condition of Corollary 4 may be enforced during the progress of the algorithm. If the update rule (14) is considered, then the asymptotic condition is satisfied when \(\Vert x^{k+1}-x^k\Vert \rightarrow 0\). We also mention that Corollary 4 holds for the penalty function defined in (7).
Next, we present an estimate which is used in Algorithm 2 to define a criterion to detect an approximate solution of problem (1). We shall consider for any \(k\),
Lemma 5
Assume that \(\{x^k\}\) is an infinite sequence generated by Algorithm 1. Suppose that problem (1) is feasible. Then, for any \(k\)
for every feasible point \(z\).
Proof
By Step 1 of Algorithm 1 and by (2), we have
Let \(z\) be a feasible point. Therefore, since \(g(z) \le 0\), \(W(0,t)=0\) for \(t\ge 0\), and \(W(\cdot ,t)\) is non-decreasing for every \(t\ge 0\), we have
Combining this result with (27) and (28), we obtain
which proves the lemma. \(\square \)
In the following theorem, we present a criterion to detect infeasibility after a finite number of iterations of Algorithm 1. We assume that it is possible to obtain a bounded sequence \(\{c_k\}\) satisfying
In the numerical experiments section, we describe a way to obtain \(c_k\). Clearly, since \(f\) is continuous and \(\Omega \) is compact, the sequence \(\{c_k\}\) may be assumed to be bounded.
Theorem 6
Assume that \(\{x^k\}\) is an infinite sequence generated by Algorithm 1. Then, problem (1) is infeasible if, and only if, there exists \(k \in \mathbb {N}\) such that
where \(\gamma _k\) is defined in (27).
Proof
Suppose the feasible region of (1) is non-empty. Then, it follows from (29) and Lemma 5 that, for every feasible point \(z\),
This means that the infeasibility test (30) fails to be fulfilled.
Reciprocally, suppose that problem (1) is infeasible. In this case, it follows from Step 2 of Algorithm 1 that \(\rho _k \rightarrow \infty \). Also, for any subsequence \(\{x^k\}_{k \in K}\), there exist \(k_0>0\), \(i_0\in \{1,\ldots ,m\}\) and \(\delta >0\) such that \(g_{i_0}(x^{k})> \delta \) for all \(k\in K\), \(k\ge k_0\). Consequently, using the fact that \(W(\cdot ,t)\) is non-decreasing for every \(t\ge 0\), we obtain, for all \(k\in K\), \(k\ge k_0\),
where we have used properties of infimum and item (i) of Lemma 1 to get the last inequality. It follows by item (iv) of Lemma 1 that
On the other hand, (H7) implies
Therefore, it follows from (31) that
Hence, the boundedness of \(\{\varepsilon _k\}\) and \(\{c_k\}\) implies the fulfillment of test (30), for \(k\) large enough. \(\square \)
In the following theorem, we show that an optimality condition holds after a finite number of iterations of Algorithm 1, under the assumption that problem (1) is feasible and \(\{ \varepsilon _k\} \) tends to zero.
Theorem 7
Assume that \(\{x^k\}\) is an infinite sequence generated by Algorithm 1. Let \(K \displaystyle \mathop {\subset }_{\infty }\mathbb {N}\) and \(x^* \in \Omega \) be such that \(\lim _{k \in K} x^k = x^*\). Suppose that problem (1) is feasible and \(\lim _{k \rightarrow \infty } \varepsilon _k = 0\). Then,
where \(\gamma _k\) is defined in (27). As a consequence, given \(\varepsilon _{\text{ opt }}>0\), for large enough \(k\),
which in turn implies
Proof
It follows from Proposition 2 that \(x^*\) is feasible for problem (1). Since the feasible region of (1) is non-empty and compact, there exists a global minimizer \(z \in \Omega \). By Lemma 5,
First, we shall prove (32) in the case that \(\rho _k \rightarrow \infty \). For this, note that item (iv) of Lemma 1 implies
Using the definition of \(\gamma _k\) in (27), properties of infimum and the last equality, we obtain
On the other hand, taking the infimum limit for \(k \in K\) in (33) and using \({ \varepsilon _k} \rightarrow 0\) and \(f(z)\le f(x^*)\), we obtain
which, combined with (34), proves (32) in the case that \(\rho _k \rightarrow \infty \).
Now, if \(\{\rho _k\}\) is bounded, by Step 2 of Algorithm 1 we have \(\lim _{k \rightarrow \infty }\Vert \sigma (x^k,\lambda ^k,\rho _k)\Vert _{\infty }=0,\) or equivalently,
As \(\{\lambda ^k\}\) and \(\{\rho _k\}\) are bounded, we can assume that \(\lambda ^k\rightarrow \bar{\lambda }\ge 0\) and \(\rho _k\rightarrow \bar{\rho }>0\). So, (35) implies
which, combined with item (iii) of Lemma 1, gives
Now, using item (ii) of Lemma 1, we have for all \(k\) and \(i\)
Taking the limit for \(k \in K\) in the last inequality, together with (36) and (37), it follows that
which, combined with (27), yields (32) in the case \(\{\rho _k\}\) bounded. Thus, the proof of (32) is concluded.
The second statement of the theorem follows from the first result and the assumption \( \varepsilon _k \rightarrow 0\). To establish the last part of the theorem, combine the second statement with Lemma 5. \(\square \)
Due to the previous theorems, we are able to define a variation of Algorithm 1, for which we can guarantee finite termination with certificates of infeasibility or optimality up to given precisions. To state Algorithm 2, we assume that \(\varepsilon _{\text{ feas }} > 0\) and \(\varepsilon _{\text{ opt }} > 0\) are user-given tolerances for feasibility and optimality respectively.
Algorithm 2
Let \(\gamma >1, 0 < \tau < 1, \rho _1>0, \lambda _{\max }>0\), and \({\lambda }^1 \in {{\mathbb {R}}}^m\) such that \({{\lambda }}^1_i \in [0, \lambda _{\max }],\) for all \(i=1,\dots ,m\). Assume that \(\{\varepsilon _k\}\) is a bounded positive sequence and initialize \(k \leftarrow 1\).
- Step 1 (Subproblem): Obtain \(x^k \in \Omega \) satisfying
$$\begin{aligned} L(x^k, {\lambda }^k,\rho _k) \le L(z, {\lambda }^k,\rho _k) + \varepsilon _k,\qquad \forall z \in \Omega . \end{aligned}$$
- Step 2 (Infeasibility test): Let
$$\begin{aligned} \gamma _k := -\frac{1}{\rho _k}\sum _{i=1}^{m}[W(\rho _k g_i(x^k),\lambda ^k_i)]. \end{aligned}$$
Compute \(c_k > 0\) such that \(| f(x^k) - f(z) | \le c_k \) for all \(z \in \Omega \). If
$$\begin{aligned} \gamma _k + \varepsilon _k < - c_k, \end{aligned}$$
stop, declaring Infeasible problem.
- Step 3 (Feasibility and optimality test): If
$$\begin{aligned} \Vert g(x^k)_+\Vert _{\infty } \le \varepsilon _{\text{ feas }} \;\; \text{ and } \;\; \gamma _k + \varepsilon _k \le \varepsilon _{\text{ opt }}, \end{aligned}$$
stop, declaring Solution found.
- Step 4 (Update penalty parameter): If \(k=1\) or
$$\begin{aligned} \Vert \sigma (x^k,\lambda ^k,\rho _k)\Vert _{\infty } \le \tau \; \Vert \sigma (x^{k-1},\lambda ^{k-1},\rho _{k-1})\Vert _{\infty }, \end{aligned}$$
set \(\rho _{k+1} = \rho _k\). Otherwise, set \(\rho _{k+1} := \gamma \rho _k\).
- Step 5 (Update multipliers): Compute \( \lambda ^{k+1}_i=\min \{W'_s(\rho _kg_i(x^k),\lambda _i^k), \lambda _{\max }\},\) for all \( i=1,\dots ,m\). Set \(k \leftarrow k+1\) and go to Step 1.
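The infeasibility test of Step 2 can be seen in action on a deliberately infeasible toy instance. The sketch below (ours, with the PHR penalty, simplified bookkeeping, and an always-increasing \(\rho \) instead of the rule of Step 4) shows \(\gamma _k\) dropping below \(-c_k - \varepsilon _k\), triggering the Infeasible problem declaration.

```python
# Infeasible toy instance: min x  s.t.  g(x) = x + 3 <= 0 over Omega = [-2, 2];
# note g(x) >= 1 on Omega. A valid bound is |f(x) - f(z)| <= 4 =: c_k on Omega.
import numpy as np

f = lambda x: x
g = lambda x: x + 3.0
grid = np.linspace(-2.0, 2.0, 2001)

def W(s, lam):                       # PHR penalty, closed max-form
    return (np.maximum(0.0, lam + s) ** 2 - lam ** 2) / 2.0

rho, lam, lam_max, c_k, eps_k = 10.0, 0.0, 1e10, 4.0, 1e-4
declared_infeasible = False
for k in range(30):
    # Step 1: brute-force "global" minimization over the box
    x = grid[np.argmin(f(grid) + W(rho * g(grid), lam) / rho)]
    # Step 2: infeasibility test
    gamma_k = -W(rho * g(x), lam) / rho
    if gamma_k + eps_k < -c_k:
        declared_infeasible = True
        break
    # Step 5: safeguarded PHR multiplier update, then drive rho up
    lam = min(max(0.0, lam + rho * g(x)), lam_max)
    rho *= 10.0
```

Since \(g(x)\ge 1\) on \(\Omega \), \(\gamma _k \approx -\rho _k/2 \rightarrow -\infty \), so the test fires after very few iterations.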
The next result is an immediate consequence of Theorems 6 and 7.
Theorem 8
Let \(\{x^k\}\) be generated by Algorithm 2 with \(\lim _{k \rightarrow \infty } \varepsilon _k = 0\). Then, after a finite number of iterations, the algorithm either finds an approximate solution \(x^k\) satisfying
for all \(z \in \Omega \) such that \(g(z) \le 0\), or stops, certifying the problem is infeasible.
4 Numerical experiments
This section summarizes the results of the numerical experiments we carried out in order to verify the effectiveness of Algorithm 2 for solving problem (1) or identifying that it is infeasible. We implemented Algorithm 2 considering five different augmented Lagrangian functions (2), with the penalty functions (4), (6) with \(\alpha =4/3\), (7), (9) and (10), which will be denoted augmented Lagrangians of type (ALtype) 1, 2, 3, 4 and 5, respectively.
Our algorithm was implemented as a modification of the software Algencan, a Fortran 77 code freely available at http://www.ime.usp.br/~egbirgin/tango/ and based on the theory presented in [1]. Similarly to [3], our main modification is the implementation of different augmented Lagrangian functions. The main differences between our code and the one implemented in [3] are a multistart strategy to solve the subproblems and the introduction of the stopping criteria described in Steps 2 and 3 of Algorithm 2. The experiments were run on a 3.4 GHz Intel(R) i7 with 4 processors, 8 GB of RAM and the Linux operating system.
4.1 Implementation details
The performance of algorithms is, in general, sensitive to the choice of the initial parameters. For comparison purposes, we kept the same configuration of initial algorithmic parameters for all augmented Lagrangian versions. We set \(\varepsilon _{\text{ feas }} = \varepsilon _{\text{ opt }} = 10^{-4}, \lambda ^1=0, \lambda _{\max }=10^{10}, \rho _1=10, \gamma = 10\) and \(\tau =0.5\). We also set \(\varepsilon _1=10^{-4}\) and \(\varepsilon _k=\max \{10^{-(k+2)},10^{-8}\}\) for \(k\ge 2\).
In all considered problems, the set \(\Omega \) is a box \(\{ x \in \mathbb {R}^n \;| \; l \le x \le u \}\). We use the local optimization package Gencan to solve the box-constrained subproblems. Gencan is an active-set method that uses a truncated Newton approach with line searches within the faces and spectral projected gradient iterations to leave the faces, see [1] and references therein. In all implemented versions, we replaced each product of the Hessian of the augmented Lagrangian with a vector by an incremental quotient, using the approximation
with a small value of \(t\).
Step 1 of Algorithm 2 requires the global solution of the subproblems. However, we use Gencan, which is a local optimization algorithm, to solve the subproblems. In each outer iteration, we ran Gencan 30 times from different randomly generated initial points (feasible with respect to the box \(\Omega \)) and took as the solution the point with the smallest value of the subproblem objective function. By adopting this strategy, we assume that we are able to find a global minimizer of each subproblem. We call this strategy multistart.
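The multistart strategy can be sketched as follows. This is our illustration, not the actual Fortran code: the local solver is a plain projected-gradient loop standing in for Gencan, and the multimodal test function is invented for the example.

```python
import math
import random

lo, hi = -4.0, 4.0                                 # the box Omega

def obj(x):
    # illustrative multimodal objective with several local minima
    return math.cos(3.0 * x) + x * x / 10.0

def grad(x):
    return -3.0 * math.sin(3.0 * x) + x / 5.0

def local_solve(x0, step=0.01, iters=500):
    # crude projected gradient descent: a stand-in for Gencan
    x = x0
    for _ in range(iters):
        x = min(hi, max(lo, x - step * grad(x)))   # gradient step + projection
    return x

random.seed(0)
starts = [random.uniform(lo, hi) for _ in range(30)]   # 30 random initial points
candidates = [local_solve(x0) for x0 in starts]
best = min(candidates, key=obj)                        # keep the best local result
```

With enough well-spread starting points, the best local result is very likely (though of course not guaranteed) to be a global minimizer, which is exactly the assumption stated above.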
For simplicity, let us denote the objective function of the \(k\)-th subproblem, \(L (x,\lambda ^k,\rho _k)\), by \(L (x)\). Given a tolerance \(\bar{\varepsilon }_k>0\), Gencan stops at a point \(y\) if
\[ \Vert P_{\Omega }\big (y-\nabla L (y)\big )-y \Vert \le \bar{\varepsilon }_k, \qquad (38) \]
where \(P_{\Omega }(x)\) is the Euclidean projection of \(x\) on the box \(\Omega \). Therefore, the inequality (38) holds for \(y\) equal to \(x^k\). The tolerance \(\bar{\varepsilon }_k\) used in the stopping criterion (38) of the \(k\)-th subproblem is related to the gap \(\varepsilon _k\) at Step 1 of Algorithm 2 by
where
Let us justify the relation (39). Denote by \(x_*^k\) a global minimizer of the \(k\)-th subproblem. Since the box \(\Omega \) is compact and convex, using properties of the Euclidean projection, we obtain
Then,
and, by the Cauchy–Schwarz inequality,
Therefore, by the stopping criterion (38) and the definition of \(d(x)\), we have
If \(L(x)\) is a convex function, it follows by the gradient inequality that
or, equivalently,
Hence, using the Cauchy–Schwarz inequality, stopping criterion (38) and (40), we obtain
which, combined with (39), gives the desired inequality in Step 1 of Algorithm 2. Now, consider the case where \(L(x)\) is not a convex function. If we assume that \(x^k\) is close enough to \(x_*^k\) then the linear approximation \(L(x^k)+ \left<\nabla L(x^k),x_*^k-x^k\right>\) of \(L(x_*^k)\) is “acceptable”. Therefore, we can proceed as in the convex case to conclude that the inequality in Step 1 of Algorithm 2 is “approximately” satisfied.
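Assuming, as in the discussion above, that the stopping test compares the Euclidean norm of \(P_{\Omega }(y-\nabla L(y))-y\) with the tolerance, the test can be sketched as (hypothetical Python illustration, not the Gencan code):

```python
import numpy as np

def project_box(x, lower, upper):
    """Euclidean projection onto the box {x : lower <= x <= upper}."""
    return np.minimum(np.maximum(x, lower), upper)

def subproblem_converged(y, grad_y, lower, upper, tol):
    """Stop when ||P_Omega(y - grad L(y)) - y|| <= tol, our assumed
    form of the projected-gradient stopping criterion."""
    return np.linalg.norm(project_box(y - grad_y, lower, upper) - y) <= tol
```

The measure vanishes at interior stationary points and at boundary points where the negative gradient points outside the box, i.e. at stationary points of the box-constrained subproblem.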
The constant \(c_k\) at Step 2 of Algorithm 2 is computed as suggested in [5]. By interval arithmetic, an interval \([f^{\min },f^{\max }]\) is computed such that \(f^{\min } \le f(x) \le f^{\max }\) for all \(x \in \Omega \). Then, \(c_k\) is calculated by
For interval analysis calculations we use the Intlib library [13].
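The kind of enclosure that Intlib provides can be mimicked with a toy interval class (a deliberately naive Python sketch, not the Intlib API; real interval libraries also round endpoints outward):

```python
class Interval:
    """Toy interval arithmetic: if x lies in [lo, hi], the natural
    interval extension of an expression encloses its range on the box
    (possibly with overestimation)."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)
    def __mul__(self, other):
        p = (self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi)
        return Interval(min(p), max(p))

# Enclosure of f(x) = x*x + x over the box x in [-1, 2]:
f_box = Interval(-1.0, 2.0) * Interval(-1.0, 2.0) + Interval(-1.0, 2.0)
# f_box.lo <= f(x) <= f_box.hi for every x in [-1, 2]
```

In this example the enclosure is \([-3, 6]\), while the true range of \(x^2+x\) on \([-1,2]\) is \([-0.25, 6]\): the bounds are valid but not tight, which is all that the computation of \(c_k\) requires.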
4.2 Some feasible examples
The main purpose of this section is to illustrate the performance of Algorithm 2 with the five types of augmented Lagrangian functions on some feasible problems. In the following experiments, we considered the set of problems with inequality constraints analyzed in [5], which come from the global optimization literature. Each problem is feasible and the optimal value of the objective function is known. Therefore, it is possible to check whether a particular solution given by Algorithm 2 is indeed a global minimizer.
In this section, we are also interested in checking whether the multistart strategy to solve the subproblems and the relation (39) are appropriate. The reliability of these approaches is corroborated if the problems are successfully solved. We say that a problem has been successfully solved if Algorithm 2 stops at Step 3 and the final iterate \(x^*\) satisfies \(f(x^*)\le f(z)+\varepsilon _{opt}\) for all feasible \(z\). In the following, we describe the problems considered in this section.
Table 1 shows the performance of Algorithm 2 considering the five types of augmented Lagrangian functions. In the table, the first three columns identify the problem and the numbers of variables and constraints. “ALtype” identifies the considered augmented Lagrangian function, “Iter” is the number of augmented Lagrangian iterations, \(f(x^*)\) is the value of the objective function at the final iterate \(x^*\), \(\Vert g(x^*)_+\Vert _{\infty }\) is the infeasibility measurement at \(x^*\), \(\gamma _k+\varepsilon _k\) is the value computed at Steps 2 and 3 in the last iteration, and “Global” indicates whether the final iterate is a global solution, in the sense that \(f(x^*)\le f(z)+\varepsilon _{opt}\) for all feasible \(z\), or not.
In all instances, Algorithm 2 stopped at Step 3 in a finite number of iterations, reporting that a solution was found. In fact, note that the values of \(\gamma _k+\varepsilon _k\) in Table 1 are less than or equal to \(\varepsilon _{opt}=10^{-4}\). The last column shows that all problems were successfully solved by Algorithm 2 for each augmented Lagrangian type. Analyzing the column “Iter” of Table 1, we note that, for these problems, the algorithm has a similar performance for all the augmented Lagrangian functions considered. Moreover, these results allow us to conclude that the multistart strategy to solve the subproblems and the relation (39) are appropriate.
4.3 Kissing problems
In this section, our aim is to illustrate the ability of our algorithm to detect whether a problem is infeasible. For this, we analyze our infeasibility test on some infeasible problems. To be self-contained, we also consider some feasible problems.
Given a set of \(nSph\) \(N\)-dimensional unit spheres, we consider the problem of finding the coordinates of the sphere centers in such a way that each sphere touches another given one, with no overlap, while maximizing the sum of the squared distances between the sphere centers.
Let us denote by \(C_i\in \mathbb {R}^{N}\) the coordinate vector of the center of the \(i\)-th sphere, \(i=1,\dots ,nSph\). This problem can be modeled as follows:
\[ \begin{array}{ll} \text{Minimize} & -\displaystyle \sum _{1\le i<j\le nSph} \Vert C_i-C_j\Vert ^2 \\ \text{subject to} & \Vert C_i\Vert ^2 = 4, \quad i=1,\dots ,nSph, \\ & \Vert C_i-C_j\Vert ^2 \ge 4, \quad 1\le i<j\le nSph. \end{array} \qquad (41) \]
The first set of constraints requires that each considered sphere touches another one centered at the origin, while the second set determines that the spheres must be placed without overlapping. Simple devices can be used to transform problem (41) into the form (1). For example, we may rewrite each equality constraint as two inequality constraints. After making the necessary modifications, problem (41) has \(n= N \times nSph\) variables, \(2\times nSph\) constraints arising from the first set and another \(C^{nSph}_{2}\) arising from the second one, totaling \(m = 2\times nSph + C^{nSph}_{2}\) constraints, in addition to the box constraints.
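The constraint count above can be checked with a small sketch that assembles the constraints in the inequality form \(g(x)\le 0\) (a hypothetical Python illustration; the equality constraints are split into two inequalities as described):

```python
import numpy as np

def kissing_constraints(C):
    """Constraints g <= 0 for nSph unit spheres with centers C[i]:
    ||C_i||^2 = 4 (touch the central unit sphere, split into two
    inequalities) and ||C_i - C_j||^2 >= 4 (no overlap)."""
    nSph = len(C)
    g = []
    for i in range(nSph):
        r2 = float(np.dot(C[i], C[i]))
        g += [r2 - 4.0, 4.0 - r2]          # equality as two inequalities
    for i in range(nSph):
        for j in range(i + 1, nSph):
            d2 = float(np.sum((C[i] - C[j]) ** 2))
            g.append(4.0 - d2)             # non-overlap
    return np.array(g)  # m = 2*nSph + nSph*(nSph-1)/2 entries
```

For example, for \(N=2\) and \(nSph=3\), three circles of radius 1 whose centers lie at angles \(0^\circ , 120^\circ , 240^\circ \) on a circle of radius 2 satisfy all 9 constraints.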
The kissing number is defined as the maximum number of non-overlapping \(N\)-dimensional unit spheres that can be arranged so that each of them touches another given \(N\)-dimensional unit sphere. For \(N=2\) the kissing number is equal to 6, while for \(N=3\) and \(N=4\) it is equal to 12 and 24, respectively [20, 27]. Therefore, for \(N=2\), problem (41) is feasible if \(1\le nSph\le 6\) and infeasible if \(nSph>6\). Analogously, feasible instances of problem (41) are obtained when \(N=3\) and \(1\le nSph\le 12\) or when \(N=4\) and \(1\le nSph\le 24\). Infeasible instances may be generated with \(N=3\) and \(nSph> 12\) or with \(N=4\) and \(nSph> 24\).
For illustrative purposes, we considered six instances of problem (41) with \(N=2\) and \(nSph=3, 6, 7, 10, 12\) and 15. The first two instances are feasible, while the remaining ones are infeasible. The problems were solved with different types of augmented Lagrangians, specified in Fig. 1. For all instances, Algorithm 2 behaved according to the theory: for the feasible instances it stopped at Step 3 reporting that a solution was found, while for the infeasible ones it stopped at Step 2 declaring infeasibility. Figure 1 illustrates the “solutions”.
For the infeasible instances illustrated in Fig. 1, Algorithm 2 stopped at outer iteration 4, 3, 3 and 3, respectively. Even though Algorithm 2 stopped after few iterations, the nice arrangement of the circles corroborates the result of Proposition 2, which, in some sense, establishes that the limit points of the sequence generated by Algorithm 2 are global minimizers of an infeasibility measure. See Corollary 3 for ALtypes 1 and 2, Remark 2 for ALtype 5, Corollary 4 for ALtype 4 and Remark 3 for ALtype 3.
In order to verify the applicability of our approach, we ran Algorithm 2 on feasible and infeasible instances of problem (41) considering the five types of augmented Lagrangians. Table 2 shows the performance of Algorithm 2 with the five augmented Lagrangian functions on three feasible instances of problem (41). These problems correspond to dimensions \(N=2, 3\) and 4 with \(nSph\) equal to the respective kissing number. Therefore, it is reasonable to believe that these are the most difficult feasible problems for each considered dimension. In the table, the first four columns report the dimension \(N\), the number of \(N\)-dimensional spheres \(nSph\), and the numbers of variables and constraints. “ALtype” identifies the considered augmented Lagrangian function, “Iter” is the number of augmented Lagrangian iterations, \(f(x^*)\) is the value of the objective function at the final iterate \(x^*\), \(\Vert g(x^*)_+\Vert _{\infty }\) is the infeasibility measurement at \(x^*\), and \(\gamma _k+\varepsilon _k\) and \(-c_k\) are the values computed at Steps 2 and 3 in the last iteration. In all three feasible problems, for all augmented Lagrangian types, Algorithm 2 stopped at Step 3 reporting that a solution was found. In fact, note that the values of \(\gamma _k+\varepsilon _k\) in Table 2 are less than \(\varepsilon _{opt}=10^{-4}\).
Table 3 shows the performance of Algorithm 2 with the five augmented Lagrangian functions on six infeasible instances of problem (41). Three of the six considered instances are arguably the most difficult infeasible problems for each considered dimension: they correspond to dimensions \(N=2, 3\) and 4 with \(nSph\) equal to the respective kissing number plus one. Table 3 contains the same information as Table 2. In all infeasible instances, Algorithm 2 stopped at Step 2 reporting infeasibility. In fact, comparing the last two columns of Table 3, \(\gamma _k+\varepsilon _k < -c_k\) in all cases. The results in Tables 2 and 3 show that the performance of Algorithm 2 is in agreement with the theory.
4.4 Comments on the numerical experiments
In order to compare the performance of the five types of augmented Lagrangian functions, we may consider the numbers of augmented Lagrangian iterations (columns “Iter”) for the problems presented in Sects. 4.2 and 4.3. For the ten feasible problems of our numerical experiments, the total numbers of augmented Lagrangian iterations are 32, 46, 37, 37 and 36 for ALtypes 1, 2, 3, 4 and 5, respectively. For the six infeasible problems, they are 38, 62, 25, 24 and 23, respectively.
Based on our numerical experiments, we may conclude that Algorithm 2 with ALtype 2 presents the worst performance for both feasible and infeasible instances. It is interesting to point out that the augmented Lagrangians of ALtypes 3, 4 and 5 proved to be as efficient as the PHR augmented Lagrangian (ALtype 1) for feasible instances. Moreover, we observe that Algorithm 2 with these Lagrangians detected infeasibility in fewer iterations than with PHR, which may translate into computational savings. These observations motivate further research on general augmented Lagrangian methods for nonlinear programming under possible infeasibility.
5 Conclusions
In this article, we considered a nonlinear programming problem for which the constraint set may be infeasible. We studied an augmented Lagrangian method based on the penalty functions of Kort and Bertsekas. A general algorithm was stated and its convergence properties were analyzed. For some specific penalty functions, we showed that every limit point of the sequence generated by the algorithm is a global minimizer of an infeasibility measure. Some results regarding optimality were also presented. Based on these results, a second algorithm was proposed, which stops after a finite number of iterations at an approximate solution or detects that the problem is infeasible. We also presented numerical experiments illustrating the applicability of the algorithm. It is interesting to point out that some other augmented Lagrangians exhibited computational performance similar to the classical quadratic augmented Lagrangian (PHR). Besides, for infeasible problems, our algorithm based on these other Lagrangians detected infeasibility in fewer iterations than PHR. These numerical results motivate further study and analysis of general Lagrangian methods.
References
Andreani, R., Birgin, E.G., Martínez, J.M., Schuverdt, M.: On augmented Lagrangian methods with general lower-level constraints. SIAM J. Optim. 18(4), 1286–1309 (2008)
Bertsekas, D.P.: Constrained Optimization and Lagrange Multiplier Methods. Academic Press, New York (1982)
Birgin, E.G., Castillo, R.A., Martínez, J.M.: Numerical comparison of augmented Lagrangian algorithms for nonconvex problems. Comput. Optim. Appl. 31(1), 31–55 (2005)
Birgin, E.G., Martínez, J.M., Prudente, L.F.: Optimality properties of an augmented Lagrangian method on infeasible problems. Comput. Optim. Appl. doi:10.1007/s10589-014-9685-5
Birgin, E.G., Martínez, J.M., Prudente, L.F.: Augmented Lagrangians with possible infeasibility and finite termination for global nonlinear programming. J. Glob. Optim. 58(2), 207–242 (2014)
Burachik, R.S., Iusem, A.N., Melo, J.G.: A primal dual modified subgradient algorithm with sharp Lagrangian. J. Glob. Optim. 46(3), 347–361 (2010)
Byrd, R., Curtis, F., Nocedal, J.: Infeasibility detection and SQP methods for nonlinear optimization. SIAM J. Optim. 20(5), 2281–2299 (2010)
Conn, A.R., Gould, N.I.M., Toint, Ph.L.: Lancelot: A Fortran Package for Large-Scale Nonlinear Optimization (Release A). Springer, Secaucus (1992)
Floudas, C.A.: Deterministic Global Optimization: Theory, Methods and Application. Kluwer Academic Publishers, Dordrecht (1999)
Floudas, C.A., Visweswaran, V.: A global optimization algorithm (GOP) for certain classes of nonconvex NLPs: II. Application of theory and test problems. Comput. Chem. Eng. 14, 1419–1434 (1990)
Griva, I., Polyak, R.: Primal-dual nonlinear rescaling method with dynamic scaling parameter update. Math. Program. 106(2), 237–259 (2006)
Hestenes, M.R.: Multiplier and gradient methods. J. Optim. Theory Appl. 4(5), 303–320 (1969)
Kearfott, R.B., Dawande, M., Du, K., Hu, C.: Algorithm 737: INTLIB, a portable Fortran 77 interval standard-function library. ACM Trans. Math. Softw. 20(4), 447–459 (1994)
Kort, B., Bertsekas, D.P.: Combined primal-dual and penalty methods for convex programming. SIAM J. Control Optim. 14(2), 268–294 (1976)
Liebman, J., Lasdon, L., Schrage, L., Waren, A.: Modeling and Optimization with GINO. The Scientific Press, Palo Alto (1986)
Luo, H.Z., Sun, X.L., Li, D.: On the convergence of augmented Lagrangian methods for constrained global optimization. SIAM J. Optim. 18(4), 1209–1230 (2007)
Mangasarian, O.: Unconstrained Lagrangians in nonlinear programming. SIAM J. Control Optim. 13(4), 772–791 (1975)
Manousiouthakis, M., Sourlas, D.: A global optimization approach to rationally constrained rational programming. Chem. Eng. Comm. 115(1), 127–147 (1992)
Martínez, J.M., Prudente, L.F.: Handling infeasibility in a large-scale nonlinear optimization algorithm. Numer. Algorithms 60(2), 263–277 (2012)
Musin, O.R.: The kissing number in four dimensions. Ann. Math. 168, 1–32 (2008)
Polyak, R.: Modified barrier functions (theory and methods). Math. Program. 54(1–3), 177–222 (1992)
Polyak, R., Griva, I.: Primal-dual nonlinear rescaling method for convex optimization. J. Optim. Theory Appl. 122(1), 111–156 (2004)
Powell, M.J.D.: A method for nonlinear constraints in minimization problems. In: Fletcher, R. (ed.) Optimization, pp. 283–298. Academic Press, New York (1969)
Rockafellar, R.: Augmented Lagrange multiplier functions and duality in nonconvex programming. SIAM J. Control Optim. 12(2), 268–285 (1974)
Rubinov, A.M., Yang, X.Q.: Lagrange-Type Functions in Constrained Non-convex Optimization. Kluwer Academic Publishers, Dordrecht (2003)
Sahinidis, N.V., Grossmann, I.E.: Reformulation of the multiperiod milp model for capacity expansion of chemical processes. Oper. Res. 40(1), S127–S144 (1992)
Schütte, K., van der Waerden, B.L.: Das Problem der dreizehn Kugeln. Math. Ann. 125(1), 325–334 (1952)
Swaney, R.E.: Global Solution of Algebraic Nonlinear Programs. AIChE Annual Meeting, Chicago (1990)
Wang, C.Y., Li, D.: Unified theory of augmented Lagrangian methods for constrained global optimization. J. Glob. Optim. 44(3), 433–458 (2009)
Wächter, A., Biegler, L.T.: On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Math. Program. 106(1), 25–57 (2006)
Acknowledgments
This work was supported in part by Projeto 136/11-CAPES/MES-Cuba, CNPq Grants 471815/2012-8 and 444134/2014-0, and FAPEG/GO.
Gonçalves, M.L.N., Melo, J.G. & Prudente, L.F. Augmented Lagrangian methods for nonlinear programming with possible infeasibility. J Glob Optim 63, 297–318 (2015). https://doi.org/10.1007/s10898-015-0289-0