Abstract
For nonlinear programming problems, we propose a new class of smooth exact penalty functions, which includes both barrier-type and exterior-type penalty functions as special cases. We develop necessary and sufficient conditions for the exact penalty property and for the inverse proposition of exact penalization, respectively. Furthermore, we establish the equivalence between these penalty functions and the classical simple exact penalty functions in the sense of the exactness property. In addition, a feasible penalty function algorithm is proposed. The convergence analysis of the algorithm is presented, including its global convergence and finite termination properties. Finally, numerical results are reported.
1 Introduction
The penalty function algorithm is an important method for solving constrained optimization problems: by augmenting a penalty term, it transforms the original problem into a single unconstrained (or simply constrained) problem, or into a sequence of unconstrained problems. For classical penalty function algorithms, the penalty parameter must be driven to infinity in a limiting sense to recover an optimal solution of the original problem, which causes numerical instability in implementation. To avoid this difficulty, we require the penalty function to satisfy the following property: it recovers an exact solution of the original problem for reasonable finite values of the penalty parameter. A penalty function possessing this property is said to be exact. However, it should be noted that some exact penalty functions have the disadvantage that they either require the Jacobian [7, 9, 14, 15, 20] or are no longer smooth (the \(l_1\) or \(l_\infty \) penalty functions, etc. [1, 3, 4, 6, 12, 16, 18, 19, 21, 22]).
In this paper, we are mainly concerned with the following nonlinear programming problem
where \([u,v]=\{x\in \mathbb{R }^n | u\le x\le v \},\) \(u\in (\{-\infty \}\cup \mathbb{R })^n,\) \(v\in (\{+\infty \}\cup \mathbb{R })^n\) and \(\mathrm{int}[u,v]\ne \emptyset \). The functions \(f:D\rightarrow \mathbb{R }\) and \(F:D\rightarrow \mathbb{R }^m\) are continuously differentiable on an open set \(D\) satisfying \([u,v]\subset D.\) We always assume that the feasible region is nonempty and the function \(f\) is bounded below on \(D\), because \(f\) can be replaced by \(e^f\) otherwise.
Let \(\omega \in \mathbb{R }^m\) be fixed. The problem \((P)\) can be written equivalently as
For this problem, Huyer and Neumaier [13] introduce the following exact penalty function
where \(q>0\) is a given positive constant, \(\sigma >0\) is a penalty parameter, \(\beta : [0,\overline{\varepsilon }]\rightarrow [0,\infty )\) is continuous on \([0,\overline{\varepsilon }]\) with \(\overline{\varepsilon }>0\) and continuously differentiable on \((0,\overline{\varepsilon }]\) with \(\beta (0)=0\). The term \(\Delta (x,\varepsilon )\) measures the violation of the constraints, i.e.,
The corresponding penalty problem to (1.1) is
Note that \(f_\sigma \) is continuously differentiable on \(D_q=\{(x,\varepsilon )\in D\times (0,\overline{\varepsilon })| \Delta (x,\varepsilon )<q^{-1}\}\). The exact penalty property of \(f_\sigma \) is discussed thoroughly in [13]; e.g., for the smooth case, the exact penalty property is established provided that \(D_F\) (see (3.1)) is bounded and each \(x\in D_F\) satisfies the Mangasarian-Fromovitz constraint qualification [13, Theorem 2.1], while for the nonsmooth case of \((P)\), sufficient conditions are established via regularity techniques and other strong conditions [13, Theorem 5.3].
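Since the displayed definitions of \(f_\sigma \) and \(\Delta \) are not reproduced above, the following minimal Python sketch illustrates how the barrier term \(\frac{\Delta }{1-q\Delta }\) behaves, under the assumption (a hypothesis, not taken from this excerpt) of a Huyer–Neumaier-style violation measure \(\Delta (x,\varepsilon )=\Vert F(x)-\varepsilon \omega \Vert ^2/\varepsilon \) and the choice \(\beta (\varepsilon )=\varepsilon \):

```python
import numpy as np

# Toy instance of (P): minimize f over [u, v] subject to F(x) = 0.
def f(x):
    return float(x[0] ** 2 + x[1] ** 2)

def F(x):
    return np.array([x[0] + x[1] - 1.0])  # single equality constraint

# Constraint-violation measure; the exact form of Delta is not shown in
# this excerpt, so we ASSUME the Huyer-Neumaier-style choice
# Delta(x, eps) = ||F(x) - eps*omega||^2 / eps  (a hypothesis).
omega = np.array([0.0])

def Delta(x, eps):
    r = F(x) - eps * omega
    return float(r @ r) / eps

def f_sigma(x, eps, sigma, q=1.0, beta=lambda e: e):
    d = Delta(x, eps)
    if d >= 1.0 / q:          # outside the barrier's domain D_q
        return np.inf
    return f(x) + d / (1.0 - q * d) + sigma * beta(eps)

# At a feasible point, the violation term vanishes as eps -> 0+.
x_feas = np.array([0.5, 0.5])              # F(x_feas) = 0
print(f_sigma(x_feas, 1e-6, sigma=10.0))   # close to f(x_feas) = 0.5
```

The barrier character is visible in the guard `d >= 1.0 / q`: the penalty is finite only where \(\Delta (x,\varepsilon )<q^{-1}\), matching the domain \(D_q\).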
The above penalty function \(f_{\sigma }\) is simple and exact. Here, we call a penalty function “simple” if it involves only the objective and constraint functions of the original problem \((P)\) and does not involve gradients or the Jacobian. What chiefly distinguishes \(f_\sigma \) from the classical simple exact penalty functions is its good smoothness property, which the latter do not enjoy [2, 10]. In this paper, we restrict our focus primarily to the case where \((P)\) is smooth. The main results are as follows.
- (i) The term \(\frac{\Delta (x,\varepsilon )}{1-q\Delta (x,\varepsilon )}\) in (1.1) plays the role of a barrier term. However, computing an interior point is not easy in practical applications. Therefore, we extend this term to a class of convex functions. This allows us to present a unified framework for both barrier-type and exterior-type penalty functions, in which the latter are smooth on \([u,v]\times (0,{\overline{\varepsilon }})\) and have a larger smooth region than the former. The extended penalty function class (2.1) in Sect. 2 provides several alternatives in the design of penalty function algorithms.
- (ii) Based on (2.1) in Sect. 2, the corresponding penalty problems for \((P)\) are proposed. We first introduce the extended Mangasarian-Fromovitz constraint qualification, which generalizes the classical Mangasarian-Fromovitz constraint qualification from a single point to an infinite sequence. A sufficient condition for the exact penalty property of \(\widetilde{f}_\sigma \) is established, provided that \(\nabla f\) is bounded on \(D_F\) and the extended Mangasarian-Fromovitz constraint qualification holds (see Theorem 3.1). This result generalizes [13, Theorem 2.1]. Furthermore, necessary conditions for the exact penalty property are obtained (see Theorem 3.2).
- (iii) Through a series of technical lemmas, we show that these necessary conditions are also necessary and sufficient conditions for the inverse proposition of exact penalization (see Theorem 4.1). This result precisely characterizes the equivalence between \(\widetilde{f}_\sigma \) and the classical simple exact penalty functions in the sense of the exactness property. It demonstrates that \(\widetilde{f}_\sigma \) possesses the exactness property of the classical simple penalty functions as well as the smoothness property, which is not shared by the classical simple exact penalty functions.
- (iv) It should be mentioned that, even if the original programming problem has optimal solutions, a penalty function may fail to locate one. In this case, it is impossible to obtain optimal solutions via penalty function algorithms, because some penalty function algorithms must find the optimal solution of a penalty subproblem at each iteration. This may occur even for smooth penalty functions that are bounded from below. For example, although the penalty function proposed in [8] is smooth and bounded from below, an example shows that it may fail to locate optimal solutions even though the original problem has an optimal solution. This renders the penalty function algorithm proposed in [8] and its revised version inapplicable. In addition, some papers discuss the convergence of iterative sequences generated by penalty function algorithms to FJ points or KKT points; see [5, 17, 20]. In order to avoid the situations mentioned above, we present a class of revised penalty function algorithms. The main feature of this class of algorithms is that, if the optimal solution of the penalty subproblem does not exist, we resort to finding a \(\delta \)-optimal solution of the penalty subproblem. Note that a \(\delta \)-optimal solution always exists, because the penalty functions are bounded from below. This ensures that the revised penalty function algorithms are always feasible. In addition, utilizing the exact penalty property and the structural features of this class of penalty functions, we show that, under certain conditions, the proposed algorithm has the finite termination property, i.e., it terminates at an optimal solution of the original problem after finitely many iterations; otherwise, by a perturbation theorem, the global convergence property is established, that is, every accumulation point of the sequence generated by the algorithm is an optimal solution of the original problem.
The organization of this paper is as follows. Section 2 extends the penalty function proposed by Huyer and Neumaier (1.1) to a class of penalty functions (2.1), establishes corresponding penalty problems and introduces the extended Mangasarian-Fromovitz constraint qualification. The necessary and sufficient conditions for exact penalty property are developed in Sect. 3. In Sect. 4, necessary and sufficient conditions for the inverse propositions of exact penalization are established. Additionally, the equivalence relationship between the new class of penalty functions and the classical simple exact penalty function in the sense of exactness property is established. Section 5 is devoted to the revised penalty function algorithm, where the finite termination property and global convergence property of the proposed algorithm are discussed, respectively. Numerical results are reported in Sect. 6.
2 A class of exact penalty functions
In this section, we extend the term \(\frac{\Delta (x,\varepsilon )}{1-q\Delta (x,\varepsilon )}\) in (1.1) to a class of convex functions. This allows us to present a unified framework for some barrier-type and exterior-type penalty functions. Given \(a\in (0,+\infty ]\), let a function \(\phi :[0,a)\rightarrow [0,+\infty )\) satisfy
\((i_1)\) \(\phi \) is convex and continuously differentiable on \([0,a)\) with \(\phi (0)=0\).
\((i_2)\) \(\phi '(t)>0\) for all \(t\in [0,a)\).
Many functions satisfy the conditions \((i_1),(i_2)\), for example
Utilizing \(\phi \), a penalty function is given by
It is easy to see that \(\widetilde{f}_\sigma \) is continuously differentiable on \(D_a=\{(x,\varepsilon )\in D\times (0,\overline{\varepsilon })|\Delta (x,\varepsilon )<a\}\). The barrier-type penalty functions correspond to the case of \(a<+\infty \), and the exterior-type ones correspond to the case of \(a=+\infty \).
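As an illustration, the conditions \((i_1)\) and \((i_2)\) can be spot-checked numerically on a grid. The particular choices below, the barrier function \(\phi (t)=\frac{t}{1-qt}\) with \(a=1/q\) (which recovers (1.1)) and two exterior choices with \(a=+\infty \), are our own illustrative examples:

```python
import numpy as np

# Candidate phi's: the barrier choice phi(t) = t/(1 - q t) recovers (1.1);
# the two exterior examples (a = +inf) are our own illustrations.
q = 1.0
phis = {
    "barrier (a = 1/q)": lambda t: t / (1.0 - q * t),   # domain [0, 1/q)
    "exterior, linear":  lambda t: t,                   # domain [0, +inf)
    "exterior, exp":     lambda t: np.exp(t) - 1.0,     # domain [0, +inf)
}

def satisfies_i1_i2(phi, a, n=200):
    """Spot-check (i1): convexity and phi(0) = 0, and (i2): phi' > 0,
    via finite differences on a grid inside [0, a); a test, not a proof."""
    t = np.linspace(0.0, min(a, 0.9) * 0.99, n)   # stay inside [0, a)
    v = phi(t)
    h = t[1] - t[0]
    d1 = np.diff(v) / h            # forward differences ~ phi'
    d2 = np.diff(d1) / h           # ~ phi''
    return abs(v[0]) < 1e-12 and np.all(d1 > 0) and np.all(d2 > -1e-8)

for name, phi in phis.items():
    a = 1.0 / q if "barrier" in name else np.inf
    print(name, satisfies_i1_i2(phi, a))
```

Each candidate passes the spot-check, and the barrier versus exterior distinction is exactly the dichotomy \(a<+\infty \) versus \(a=+\infty \) described above.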
The corresponding penalty problem is defined as
Let us first recall the Mangasarian-Fromovitz constraint qualification. Consider
where \(g:D\rightarrow \mathbb{R }^l\) is continuously differentiable, and \(F\) is defined as before. Denote by \(\nabla F\in \mathbb{R }^{n\times m}\) the transpose of Jacobian of \(F\).
Definition 2.1
For constraint system (2.2), we say that Mangasarian-Fromovitz constraint qualification (hereafter, MFCQ for short) holds at \(z^*\in \mathbb{R }^n\), if
- (1) \(g(z^*)\le 0\) and \(\mathrm{rank}(\nabla F(z^*))=m\);
- (2) there exists \(p\in \mathbb{R }^n\) such that \(\nabla F(z^*)^T p=0\) and \(\nabla g_j(z^*)^Tp<0\) for all \(j\in J(z^*)\),
where \(J(z^*)=\{j\in J|g_j(z^*)=0\}\) and \(J=\{1,2,\ldots ,l\}\).
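Definition 2.1 can also be checked computationally: condition (2) is a linear feasibility problem, since by rescaling \(p\) the strict inequalities \(\nabla g_j(z^*)^Tp<0\) can be replaced by \(\nabla g_j(z^*)^Tp\le -1\). The sketch below uses hypothetical gradient data (not from the paper) and `scipy.optimize.linprog`:

```python
import numpy as np
from scipy.optimize import linprog

def mfcq_holds(JF, Jg_active, tol=1e-9):
    """Check MFCQ at a point, given JF = grad F (n x m, the transpose of
    the Jacobian) and the rows grad g_j(z*)^T for the active indices.

    Condition (2) becomes the feasibility LP: find p with
    JF^T p = 0 and Jg_active p <= -1 (rescaling shows this is
    equivalent to the strict inequalities)."""
    n, m = JF.shape
    if np.linalg.matrix_rank(JF) < m:          # condition (1) fails
        return False
    res = linprog(np.zeros(n),
                  A_ub=Jg_active, b_ub=-np.ones(len(Jg_active)),
                  A_eq=JF.T, b_eq=np.zeros(m),
                  bounds=[(None, None)] * n)   # p is sign-unrestricted
    return res.status == 0                      # 0 = feasible solve

# Hypothetical data: one equality gradient, one active inequality gradient.
JF = np.array([[1.0], [1.0]])                  # grad F(z*), n = 2, m = 1
Jg = np.array([[1.0, -1.0]])                   # grad g_1(z*)^T, active
print(mfcq_holds(JF, Jg))                      # p = (-1/2, 1/2) works
```

Here \(p=(-\tfrac{1}{2},\tfrac{1}{2})^T\) satisfies both requirements, so the check succeeds; dropping the full-rank condition or zeroing the inequality gradient makes it fail.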
For an infinite sequence \(K\subset \{1,2,\ldots \}\) and \(\{z^k\}_{k\in K}\subset \mathbb{R }^n\), denote
Definition 2.2
For constraint system (2.2), we say that an extended Mangasarian-Fromovitz constraint qualification (hereafter, EMFCQ for short) holds for \(\{z^k\}_{k\in K}\), if there exist a matrix \(\nabla F^*\) and an infinite subset \(K_0\subset K\) such that
- (1) \(\lim \limits _{k\in K_0,k\rightarrow \infty }\nabla F(z^k)=\nabla F^*\) and \(\mathrm{rank}(\nabla F^*)=m\);
- (2) there exists \(p\in \mathbb{R }^n\) such that \((\nabla F^*)^T p=0\) and \(\limsup \limits _{k\in K_0,k\rightarrow \infty }\nabla g_j(z^k)^T p<0\) for all \(j\in J^+(K_0)\).
Obviously, if MFCQ holds at \(z^*\), then EMFCQ holds for the constant sequence \(\{z^k\}_{k\in K}\) satisfying \(z^k=z^*\) for all \(k\in K\). Furthermore, from Definitions 2.1 and 2.2, we readily get the following result.
Proposition 2.1
Let \(\{z^k\}_{k\in K}\subset \mathbb{R }^n\). If the standard MFCQ holds for an accumulation point \(z^*\) of the sequence \(\{z^k\}_{k\in K}\) for (2.2), then EMFCQ holds for \(\{z^k\}_{k\in K}\).
The following example shows that EMFCQ is suitable for an unbounded and infeasible sequence.
Example 2.1
Consider the following constraint system
It is easy to see that EMFCQ holds for the unbounded sequence \(\{z^k\}= \{(k+\frac{1}{k},k,1+\frac{1}{k})^T\}\). Let us now discuss EMFCQ in more detail. Clearly, by Definition 2.1, Definition 2.2 and Example 2.1, we generalize MFCQ from a single point \(z^*\) with \(g(z^*)\le 0\) to an infinite sequence \(\{z^k\}_{k\in K}\). This sequence may be unbounded and infeasible, i.e., it need not satisfy \(g(z^k)\le 0\). Therefore, EMFCQ is suited to weakening some assumptions on the exactness property in the existing literature; in particular, it can be used to remove the level-boundedness assumption. For instance, as stated in [13, Theorem 2.1], the exactness property of the penalty function \(f_{\sigma }(z,\varepsilon )\) is established provided that the level set \(D_F\) is bounded and every point \(z\in D_F\) satisfies MFCQ. These conditions ensure that the sequence \(\{z^k\}_{k\in K}\) is bounded and every accumulation point satisfies MFCQ, where \((z^k,\varepsilon _k)\) is a local optimal solution of the penalty problem \((\tilde{P}_\sigma )\). According to Proposition 2.1, EMFCQ then holds for \(\{z^k\}_{k\in K}\). Therefore, we can show that \(\tilde{f}_\sigma (z,\varepsilon )\) is also exact (see Theorem 3.1 below) by requiring only that EMFCQ holds for \(\{z^k\}_{k\in K}\) and that \(\nabla f\) is bounded on \(D_F\). Consequently, [13, Theorem 2.1] is a special case of our result (see Corollary 3.1 and Remark 3.1 below).
Similarly, the level-boundedness assumption in the convergence analysis of penalty function methods can be weakened by EMFCQ. For example, in [8], a class of penalty function methods (see [8, Algorithm 1]) was proposed for the nonlinear programming problem with inequality constraints. Notice that, in general, the iteration point \(z^k\) generated by Algorithm 1 is infeasible. Under the assumptions that the level set \(\Omega _\varepsilon \) is bounded and all global optimal solutions of the original problem satisfy MFCQ (see [8, Assumptions \((H1)\) and \((H2)\)]), an infinite sequence \(\{z^k\}\) generated by Algorithm 1 is proven to be feasible for \(k\) sufficiently large. Similar to the discussion above, it immediately follows from Assumptions \((H1)\), \((H2)\) and [8, Lemma 3] that \(\{z^k\}\) is bounded and every accumulation point satisfies MFCQ. Thus, we recover [8, Theorem 1] by replacing these assumptions with the requirements that EMFCQ holds for \(\{z^k\}_{k\in K}\) and that \(\nabla f\) is bounded on the level set. This indicates that [8, Theorem 1] is also a special case of our work.
3 Exact penalty property
This section deals mainly with the “exact” property of the penalty functions (2.1). In particular, [13, Theorem 2.1] is generalized by using EMFCQ, instead of standard MFCQ. Define the level set
where \(a\in (0,+\infty ]\). If \(a=+\infty \), then \(D_F\) reduces to \([u,v]\). Before proceeding, we need the following assumptions.
Assumption (A)
\((A_1)\) \(\nabla f\) is bounded on \(D_F\);
\((A_2)\) For any infinite subsequence \(\{\sigma _k\}_{k\in K}\rightarrow +\infty \), EMFCQ holds for \(\{x^k\}_{k\in K}\), where \((x^k,\varepsilon _k)\) is a local optimal solution of \((\widetilde{P}_{\sigma _k})\) with finite value.
Lemma 3.1
Suppose that there exists \(\beta _1>0\) such that \(\beta '(\varepsilon )\ge \beta _1\) for all \(\varepsilon \in (0,\overline{\varepsilon }]\). If \((x,\varepsilon )\) is a KKT point of \((\widetilde{P}_\sigma )\) with \(\varepsilon >0\), then
where \(\Delta \) denotes \(\Delta (x,\varepsilon )\) for simplification.
Proof
If \((x,\varepsilon )\) is a KKT point of \((\widetilde{P}_\sigma )\) with \(\varepsilon >0\), then by the construction of \(\widetilde{f}_\sigma \), there exist \(\lambda ,~\eta \in \mathbb{R }^n_+,~\lambda _{n+1}\ge 0,\) and \(\eta _{n+1}\ge 0\) such that
It follows from (3.2) and (3.3) that
which, combined with the fact that \(\phi '(\Delta )>0\), implies
Rearranging (3.4) yields
The convexity of \(\phi \), the condition \((i_2)\) and the fact \(\phi (0)=0\) ensure that
Combining (3.5) and (3.6) yields
In view of \(\beta '(\varepsilon )\ge \beta _1\) and the monotonicity of \(\phi '\) (since \(\phi \) is convex), the above inequality implies that
\(\square \)
Lemma 3.2
Suppose that there exists \(\beta _1>0\) such that \(\beta '(\varepsilon )\ge \beta _1\) for all \(\varepsilon \in (0,\overline{\varepsilon }]\). If \((x^*,\varepsilon ^*)\) is a local optimal solution of \((\widetilde{P}_\sigma )\) with finite optimal value, then \(x^*\) is a local optimal solution of \((P)\) if and only if \(\varepsilon ^*=0\).
Proof
Since the constraint functions of \((\widetilde{P}_\sigma )\) are all linear, \((x^*,\varepsilon ^*)\) is also a KKT point of \((\widetilde{P}_\sigma )\). If \(x^*\) is a local optimal solution of \((P)\), then \(F(x^*)=0\). According to Lemma 3.1, we must have \(\varepsilon ^*=0\). Conversely, if \(\varepsilon ^*=0\), then, taking into account the finiteness of \(\widetilde{f}_\sigma (x^*,0)\) and the construction of \(\widetilde{f}_\sigma \), we have \(F(x^*)=0\), and hence \(x^*\) is a local optimal solution of \((P)\). \(\square \)
Lemma 3.3
Suppose that there exists \(\beta _1>0\) such that \(\beta '(\varepsilon )\ge \beta _1\) for all \(\varepsilon \in (0,\overline{\varepsilon }]\). If \((x^k,\varepsilon _k)\) is a KKT point of \((\widetilde{P}_{\sigma _k})\) with \(\varepsilon _k>0\), then, as \(\sigma _k\rightarrow +\infty \) \((k\rightarrow \infty )\), we have
where \(\Delta _k:=\Delta (x^k,\varepsilon _k)\).
Proof
Lemma 3.1 implies that
Note that
Therefore,
which, together with the monotonicity of \(\phi '\), yields
\(\square \)
Theorem 3.1
Suppose that Assumptions \((A_1)\) and \((A_2)\) hold, and that there exists \(\beta _1>0\) such that \(\beta '(\varepsilon )\ge \beta _1\) for all \(\varepsilon \in (0,\overline{\varepsilon }]\). When \(\sigma >0\) is sufficiently large, if \((x^*,\varepsilon ^*)\) is a local optimal solution of \((\widetilde{P}_\sigma )\) with finite optimal value, then \(\varepsilon ^*=0\). Furthermore, \(x^*\) is a local optimal solution of \((P)\).
Proof
It suffices to show that \(\varepsilon ^*=0\). Suppose on the contrary that there exist \(\sigma _k\rightarrow +\infty \) as \(k\rightarrow \infty \) and a sequence of local optimal solutions \((x^k,\varepsilon _k)\) of \((\widetilde{P}_{\sigma _k})\) with \(\widetilde{f}_{\sigma _k}(x^k,\varepsilon _k)\) finite and \(\varepsilon _k>0\). Since all constraint functions are linear, \((x^k,\varepsilon _k)\) is a KKT point of \((\widetilde{P}_{\sigma _k})\), i.e., there exist \(\lambda ^k,\eta ^k\in \mathbb{R }^n_+\) such that
Since the index set \(\{1,\ldots ,n\}\) is finite, there exists an infinite subset \(K\subset \{1,2,\ldots \}\) such that for \(k\in K\),
where
Invoking Assumption \((A_2)\), there exist an infinite subset \(K_0\subset K\) and \(p\in \mathbb{R }^n\) such that
Putting (3.7)–(3.11) together yields
where \(\partial f(x)/\partial x_i\) denotes the partial derivative of \(f\) with respect to \(x_i\). Let
Lemma 3.3 implies
Since \(\Vert q_k\Vert =1\), the sequence \(\{q_k\}\) is bounded, and we can assume, without loss of generality, that
It then follows from (3.14) that
From Assumption \((A_1)\), (3.12), (3.15), and (3.16), taking limits in the above formula yields
Combining this with (3.12) implies
which, together with (3.13) and (3.17), yields \((\nabla F^*)\tilde{q}=0\), and hence \(\tilde{q}=0\), since \(\nabla F^*\) has full column rank by (3.12). This contradicts \(\tilde{q}\ne 0\) in (3.16). Therefore, the desired result readily follows from Lemma 3.2. \(\square \)
As a corollary of Theorem 3.1, we have
Corollary 3.1
Suppose that
- (1) the set \(D_F\) is bounded,
- (2) the standard MFCQ holds at each point of \(D_F\),
- (3) there exists \(\beta _1>0\) such that \(\beta '(\varepsilon )\ge \beta _1\) for all \(\varepsilon \in (0,\overline{\varepsilon }]\).
Then, whenever \(\sigma >0\) is large enough, every local optimal solution \((x^*,\varepsilon ^*)\) of \((\widetilde{P}_\sigma )\) with finite value satisfies \(\varepsilon ^*=0\). Furthermore, \(x^*\) is a local optimal solution of \((P)\).
Proof
The validity of Assumption \((A_1)\) follows from condition \((1)\), and that of Assumption \((A_2)\) from conditions \((1)\), \((2)\), and Proposition 2.1. \(\square \)
Remark 3.1
In particular, Corollary 3.1 recovers [13, Theorem 2.1] by taking \(\phi (t)=\frac{t}{1-qt}\).
Inspired by [13], a necessary condition for the exact penalty property is given below. Toward this end, consider a class of functions \(\Phi :[0,a )\rightarrow [0,+\infty )\), where \(a>0\), satisfying
- \((j_1)\) \(\Phi \) is continuous and increasing on \([0,a)\) with \(\Phi (0)=0\),
- \((j_2)\) there exists \(a'\in (0,a)\) such that \(\Phi (t)\ge t\) for all \(t\in [0,a']\).
Such a class of functions includes several important functions as special cases, for example
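The displayed examples are not reproduced above; two natural candidates (our own illustrations) are \(\Phi (t)=\sqrt{t}\), with \(a'=1\) since \(\sqrt{t}\ge t\) on \([0,1]\), and \(\Phi (t)=t\). They can be spot-checked against \((j_1)\) and \((j_2)\):

```python
import numpy as np

def satisfies_j1_j2(Phi, a_prime, n=500):
    """Spot-check (j1): Phi(0) = 0 and Phi increasing, and (j2):
    Phi(t) >= t on [0, a']; a grid test, not a proof."""
    t = np.linspace(0.0, a_prime, n)
    v = Phi(t)
    return (abs(v[0]) < 1e-12
            and np.all(np.diff(v) > 0)       # increasing
            and np.all(v >= t - 1e-12))      # dominates t on [0, a']

# Phi(t) = sqrt(t) with a' = 1, and Phi(t) = t (illustrative candidates).
print(satisfies_j1_j2(np.sqrt, 1.0), satisfies_j1_j2(lambda t: t, 1.0))
```

Condition \((j_2)\) is the decisive one in Assumption (B) below: it forces \(\Phi \) to dominate the identity near the origin, as Example 3.1 illustrates.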
Assumption (B)
Let \(x^*\) be a feasible point of \((P)\). There exist \(\gamma >0\) and a neighborhood \(N(x^*)\) such that
where \(\Phi \) satisfies the conditions \((j_1)\) and \((j_2)\).
The function \(\Phi \) plays a key role in (3.18), which is illustrated by the following example.
Example 3.1
Clearly, the optimal solution is \(x^*=0\). If \(\Phi (t)=t\), i.e., \(\Phi (\Vert F(x)\Vert )=x^2\), then (3.18) fails for every \(\gamma >0\), whereas if \(\Phi (t)=\sqrt{t}\), i.e., \(\Phi (\Vert F(x)\Vert )=|x|\), then (3.18) holds with \(\gamma =1\).
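The displayed instance is not reproduced above; consistent with \(\Phi (\Vert F(x)\Vert )=x^2\) resp. \(|x|\), we assume it is \(\min \ x\) subject to \(F(x)=x^2=0\) on \([-1,1]\), with \(x^*=0\) (a hypothesis). The dichotomy can then be checked numerically:

```python
import numpy as np

# Assumed instance (the display is stripped from this excerpt):
#   min f(x) = x   s.t.   F(x) = x^2 = 0,   x in [-1, 1],   x* = 0.
f = lambda x: x
F = lambda x: x ** 2

# Phi(t) = t: for EVERY gamma > 0, the point x = -1/(2*gamma) lies in any
# neighborhood of 0 for gamma large and violates (3.18):
for gamma in (1.0, 1e3, 1e6):
    x = -1.0 / (2.0 * gamma)
    lhs = f(x) + gamma * abs(F(x))       # = -1/(4*gamma) < 0 = f(x*)
    print(gamma, lhs < 0)                # always True -> (3.18) fails

# Phi(t) = sqrt(t): f(x) + 1*sqrt(|F(x)|) = x + |x| >= 0, so gamma = 1 works.
xs = np.linspace(-1.0, 1.0, 401)
print(all(x + np.sqrt(abs(F(x))) >= -1e-12 for x in xs))  # True (up to rounding)
```

The failure for \(\Phi (t)=t\) is precisely because \(x^2=o(|x|)\) near \(x^*\), so no finite \(\gamma \) can compensate the linear decrease of \(f\); \(\Phi (t)=\sqrt{t}\) restores condition \((j_2)\) in terms of \(|x|\).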
Theorem 3.2
Suppose that \(\beta (\varepsilon )=\Phi (\varepsilon )\) for \(\varepsilon >0\) sufficiently small. If \((x^*,0)\) is a local optimal solution of \((\widetilde{P}_{\sigma _0})\) with finite value for some \(\sigma _0>0\), then Assumption (B) holds at \(x^*\) for \(\Phi \) with \(\gamma \ge \sigma _0+\phi '(0)(1+\Vert \omega \Vert )^2\).
Proof
Let \(\gamma \ge \sigma _0+\phi '(0)(1+\Vert \omega \Vert )^2\). Suppose on the contrary that there exists a sequence \(x^k\in [u,v]\) converging to \(x^*\) such that
Since \((x^*,0)\) is a local optimal solution of \((\widetilde{P}_{\sigma _0})\) and the corresponding optimal value is finite, it follows from the construction of \(\widetilde{f}_\sigma \) that \(x^*\) is feasible (cf. (2.1)), and thus \(x^*\) is a local optimal solution of \((P)\). Therefore, \(x^k\) is infeasible by (3.19). This, together with the continuity of \(F\), implies that
Let \(\varepsilon _k=\Vert F(x^k)\Vert \), i.e., \(\varepsilon _k>0\) and \(\varepsilon _k\rightarrow 0\) as \(k\rightarrow \infty .\) Notice that
Thus,
As a result,
whenever \(k\) is sufficiently large, since \(\phi '\) is continuous and \(\phi '(0)>0\). Since \((x^*,0)\) is a local optimal solution of \((\widetilde{P}_{\sigma _0})\), for \(k\) large enough we have
which leads to a contradiction. \(\square \)
4 Inverse propositions for exact penalization
In this section, we show that, if \(x^*\) is a local optimal solution of \((P)\), then Assumption (B) introduced in Sect. 3 is a necessary and sufficient condition for \((x^*,0)\) to be a local optimal solution of \((\widetilde{P}_\sigma )\). Based on Theorem 3.2, we further establish the equivalence between this new class of exact penalty functions and the classical simple exact penalty functions in the sense of the “exact penalty property”. Our conclusions clarify that this class of penalty functions (2.1) possesses not only the exactness property of the classical simple penalty functions, but also the smoothness property, which the classical simple exact penalty functions do not share.
Theorem 4.1
Let \(x^*\) be a local optimal solution of \((P)\). The following two statements hold:
- (1) If Assumption (B) holds at \(x^*\) for \(\Phi \), and \(\beta (\varepsilon )\ge \Phi (\sqrt{\varepsilon })\) for \(\varepsilon >0\) sufficiently small, then \((x^*,0)\) is a local optimal solution of \((\widetilde{P}_\sigma )\) for all \(\sigma \ge \gamma \);
- (2) Let \(\beta (\varepsilon )=\Phi (\varepsilon )\) for \(\varepsilon >0\) sufficiently small. If there exists \(\sigma _0>0\) such that \((x^*,0)\) is a local optimal solution of \((\widetilde{P}_\sigma )\) for all \(\sigma \ge \sigma _0\), then Assumption (B) holds at \(x^*\) for \(\Phi \) with \(\gamma \ge \sigma _0+\phi '(0)(1+\Vert \omega \Vert )^2\).
Proof
We only need to show the validity of part (1), since part (2) follows from Theorem 3.2. Suppose on the contrary that for some \(\sigma \ge \gamma \), there exist \((x^k,\varepsilon _k)\in {[u,v]\times (0,\overline{\varepsilon }]}\), \(x^k\rightarrow x^*\), and \(\varepsilon _k\rightarrow 0\) as \(k\rightarrow \infty \) such that \(f(x^*)=\widetilde{f}_\sigma (x^*,0)>\widetilde{f}_\sigma (x^k, \varepsilon _k)\), i.e.,
where we have used the gradient inequality of convex functions, i.e., \(\phi (\Delta _k)\ge \phi (0)+\phi '(0)\Delta _k=\phi '(0)\Delta _k\). Taking limits on both sides of (4.1) yields \( \lim \limits _{k\rightarrow \infty }\Delta _k/{\varepsilon _k}= 0,\) which means that
whenever \(k\) is sufficiently large. So,
Since \(\varepsilon _k\rightarrow 0\), by hypothesis, for \(k\) large enough,
Therefore, we conclude that for \(k\) sufficiently large,
which leads to a contradiction. \(\square \)
The concept of regular zero is introduced in [13, Definition 4.1]. To be consistent with their notation, let us define \(H:\mathbb{R }^n\rightarrow \mathbb{R }^p\) with \(p=m+|I^*|\le n\) and
where \(|I^*|\) denotes the number of elements of \(I^*=I^*_1\cup I^*_2\), and \(I^*_1\) and \(I^*_2\) are, respectively, the active index sets of \(u\le x^*\) and \(x^*\le v\), i.e.,
According to Theorem 4.1, we obtain
Corollary 4.1
Let \(x^*\) be a local optimal solution of \((P)\) and \(\beta (\varepsilon )\ge \sqrt{\varepsilon }\) for all \(\varepsilon \) sufficiently small. If \(x^*\) is a regular zero of \(H\), then there exists \(\gamma >0\) such that \((x^*,0)\) is a local optimal solution of \((\widetilde{P}_\sigma )\) for all \(\sigma \ge \gamma \).
Proof
Since \(x^*\) is a regular zero of \(H\), it then follows from [13, Lemmas 5.1 and 5.2] that Assumption (B) holds at \(x^*\) for \(\Phi (t)=t\). Combining this and Theorem 4.1 yields the desired result. \(\square \)
Remark 4.1
For the smooth case, Corollary 4.1 reduces to [13, Theorem 5.3] by taking \(\phi (t)=\frac{t}{1-qt}\).
Corollary 4.2
Let \(x^*\) be a strict local optimal solution of \((P)\) and \(\beta (\varepsilon )\ge \sqrt{\varepsilon }\) for \(\varepsilon >0\) sufficiently small. If MFCQ holds at \(x^*\), then there exists \(\gamma >0\) such that \((x^*,0)\) is a local optimal solution of \((\widetilde{P}_\sigma )\) for all \(\sigma \ge \gamma \).
Proof
Since MFCQ holds at \(x^*\), it then follows from [10, Theorem 4.4] that Assumption (B) holds at \(x^*\) for \(\Phi (t)=t\). Therefore, the desired result follows from Theorem 4.1. \(\square \)
The two sufficient conditions given in Corollaries 4.1 and 4.2 are in fact independent of each other, as the following examples illustrate.
Example 4.1
It is easy to see that MFCQ holds at \(x^*=(0,0)\), while \(x^*\) is not a regular zero of \(H\).
Now let us show that the second-order sufficient conditions guarantee the validity of Assumption (B). Recall that, for \((P)\), the second-order sufficient conditions are said to hold at \(x^*\) if
- (a) \(x^*\) is a KKT point, i.e., there exist \(\lambda ^*,\eta ^*\in {\mathbb{R }^n_+}\) and \(\mu ^*\in \mathbb{R }^m\) such that
$$\begin{aligned}&\nabla f(x^*)-\lambda ^*+\eta ^*+\nabla F(x^*)\mu ^*= 0, \end{aligned}$$(4.4)
$$\begin{aligned}&\inf (\lambda ^*_i,x^*_i-u_i)= \inf (\eta ^*_i,v_i-x^*_i)=0,\quad i=1,2,\ldots ,n. \end{aligned}$$(4.5)
- (b) the matrix \(\nabla _{xx}^2L(x^*,\mu ^*)\) is positive definite on the cone \(\{d\ne 0\,|\,\nabla F(x^*)d=0,\ d_i=0 \text{ whenever } \lambda _i^*>0 \text{ or } \eta _i^*>0\}\), where \(L(x,\mu )=f(x)+\mu ^T F(x)\).
The following example shows that the second-order sufficient conditions are independent of MFCQ and regular zero of \(H\).
Example 4.2
It is easy to check that the second-order sufficient conditions hold at the point \(x^*=(0,0)\), while MFCQ does not hold there and \(x^*\) is not a regular zero of \(H\).
Corollary 4.3
If \(\beta (\varepsilon )\ge \sqrt{\varepsilon }\) for all \(\varepsilon >0\) sufficiently small and the second-order sufficient conditions hold at \(x^*\), then there exists \(\gamma >0\) such that \((x^*,0)\) is a strict local optimal solution of \((\widetilde{P}_\sigma )\) for all \(\sigma \ge \gamma \).
Proof
Since the second-order sufficient conditions hold at \(x^*\), it then readily follows from [10, Theorem 4.6] that Assumption (B) is valid for \(\Phi (t)=t\) with the inequality being strict. The desired result follows from Theorem 4.1. \(\square \)
Now we propose a new sufficient condition for the validity of Assumption (B) from a different perspective.
Proposition 4.1
Let \(x^*\) be a local optimal solution of \((P)\). If \(x^*\) is a KKT point and
where \(\lambda ^*,\eta ^*\in {\mathbb{R }^n_+},~\mu ^*\in \mathbb{R }^m\) are the corresponding Lagrangian multipliers, then Assumption (B) holds at \(x^*\) for \(\Phi \) with \(\gamma =1+\Vert \mu ^*\Vert \).
Proof
It is sufficient to show the existence of a neighborhood of \(x^*\), say \(N(x^*)\), such that
Since \(x^*\) is a local optimal solution of \((P)\), then there exists a neighborhood of \(x^*\), say \(\widetilde{N}(x^*)\), such that (4.7) holds true for all \(x\in \widetilde{N}(x^*)\cap [u,v]\) satisfying \(F(x)=0\). Now consider the case of \(x\in \widetilde{N}(x^*)\cap [u,v]\) with \(F(x)\ne 0\). Putting (4.4), (4.5), and (4.6) together yields
Note that there exists a neighborhood \(N(x^*)\subset \widetilde{N}(x^*)\) of \(x^*\) such that
whenever \(x\in N(x^*)\cap [u,v]\) with \(F(x)\ne 0\). Therefore,
This yields the inequality as desired. \(\square \)
It should be emphasized that the condition (4.6) is also independent of other conditions given in the previous discussion, which is illustrated by the following example.
Example 4.3
By a simple calculation, at \(x^*=0\), the condition (4.6) holds when \(\Phi (t)=\sqrt{t}\), while MFCQ, the second-order sufficient conditions, and the regular-zero condition on \(F\) do not hold at \(x^*\). In addition, the condition (4.6) also holds for Example 4.2 when \(\Phi (t)=t\).
Applying Theorem 4.1 and Proposition 4.1 yields the following corollary.
Corollary 4.4
Let \(x^*\) be a local optimal solution of \((P)\) and \(\beta (\varepsilon )\ge \Phi (\sqrt{\varepsilon })\) for all \(\varepsilon >0\) sufficiently small. If \(x^*\) is a KKT point and the condition (4.6) holds, then \((x^*,0)\) is a local optimal solution of \((\widetilde{P}_\sigma )\) with \(\sigma \ge 1+\Vert \mu ^*\Vert \).
We now turn our attention to the relationship between this new class of exact penalty functions and the classical simple penalty functions in the sense of the “exactness” property. Define
where \(\Phi \) satisfies \((j_1)\) and \((j_2)\). Clearly, \(f_\gamma \) is a classical simple and exact penalty function. The associated penalty problem is
Note that (3.18) in Assumption (B) is equivalent to saying that \(x^*\) is a local optimal solution of \((P_\gamma )\). Let \(x^*\) be a local optimal solution of \((P)\). According to Theorem 4.1, the relationship between \((\widetilde{P}_\sigma )\) and \((P_\gamma )\) is summarized as follows:
- (1) Suppose that \(\beta (\varepsilon )\ge \Phi (\sqrt{\varepsilon })\) for \(\varepsilon >0\) sufficiently small. If there exists \(\gamma >0\) such that \(x^*\) is a local optimal solution of the penalty problem \((P_{\gamma })\), then \((x^*,0)\) is a local optimal solution of the penalty problem \((\tilde{P}_{\sigma })\) when \(\sigma >0\) is sufficiently large.
- (2) Suppose that \(\beta (\varepsilon )= \Phi (\varepsilon )\) for \(\varepsilon >0\) sufficiently small. If there exists \(\sigma >0\) such that \((x^*,0)\) is a local optimal solution of the penalty problem \((\tilde{P}_{\sigma })\), then \(x^*\) is a local optimal solution of the penalty problem \((P_{\gamma })\) when \(\gamma >0\) is sufficiently large.
5 Penalty function methods
In this section, we present a revised penalty function algorithm via \(\tilde{f}_{\sigma }\). The algorithm is always feasible, since \(\tilde{f}_{\sigma }\) is bounded from below. It should be noted that the parameter \(\varepsilon \) in \(\tilde{f}_{\sigma }\) plays a key role in the framework of the algorithm. It follows from Assumption (A) and Theorem 3.1 that the proposed algorithm terminates at an optimal solution of \((P)\) after finitely many iterations (see Theorem 5.1). We further analyze the case where Assumption (A) does not hold. In this case, we present a perturbation theorem for the algorithm (see Theorem 5.2). This allows us to obtain the global convergence property of the algorithm (see Corollary 5.1). Furthermore, necessary and sufficient conditions for a zero duality gap between the original problem \((P)\) and its dual problem \((L)\), defined via the penalty function \(\tilde{f}_{\sigma }\), are also derived (see Corollary 5.2).
Assumption (C)
The function \(\beta \) satisfies \(\inf \limits _{\varepsilon \ge \varepsilon _0}\beta (\varepsilon )>0\) for all \(\varepsilon _0>0\).
Indeed, the function \(\beta \) given in the previous discussion satisfies this assumption.
Algorithm 5.1
Let \(\alpha \in (0,1)\) be a sufficiently small number, \(~\delta _0>0,~\sigma _0\ge 1\), and set \(k:=0\);
Step 1. If \(argmin(\widetilde{P}_{\sigma _k})\ne \emptyset \), solve
and go to Step 2. Otherwise, solve
i.e.,
and go to Step 3.
Step 2. If \(\varepsilon _k=0\), stop; Otherwise, go to Step 4.
Step 3. If \(\varepsilon _k=0\) and \(\delta _k\le \alpha \), stop; Otherwise, go to Step 4.
Step 4. Let
where \(\rho > 1\) is a constant.
Step 5. Let \(k:=k+1\) and go back to Step 1.
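As a rough illustration, the loop structure of Algorithm 5.1 might be sketched as follows. The subproblem solver is a placeholder (the exact subproblem objectives are given in Step 1), and the Step 4 updates \(\sigma _{k+1}=\rho \sigma _k\), \(\delta _{k+1}=\delta _k/2\) are taken from the iteration strategy referred to in the proof of Lemma 5.1; everything else is a simplifying assumption, not the authors' implementation.

```python
def penalty_algorithm(solve_subproblem, alpha=1e-3, delta0=1.0,
                      sigma0=1.0, rho=2.0, max_iter=100):
    """Sketch of the loop structure of Algorithm 5.1.

    `solve_subproblem(sigma, delta)` is a placeholder: it should return a
    (delta-)optimal pair (x, eps) of the penalty subproblem together with
    a flag `exact` telling whether an exact minimizer was found (Step 1).
    """
    sigma, delta = sigma0, delta0
    for k in range(max_iter):
        x, eps, exact = solve_subproblem(sigma, delta)   # Step 1
        if exact and eps == 0:                           # Step 2
            return x, eps, k
        if (not exact) and eps == 0 and delta <= alpha:  # Step 3
            return x, eps, k
        sigma *= rho    # Step 4: enlarge the penalty parameter (rho > 1)
        delta /= 2      # Step 4: tighten the optimality tolerance
    return x, eps, max_iter                              # Step 5 loops back
```

In the exact-penalty setting of Theorem 5.1, the solver eventually returns \(\varepsilon _k=0\) once \(\sigma _k\) is large enough, so the loop terminates finitely.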
This algorithm is always feasible, since the existence of a \(\delta \)-optimal solution is ensured by the lower-boundedness of \(\widetilde{f}_\sigma \) over \(D\times [0,\overline{\varepsilon }]\). In addition, due to the smoothness of \(\widetilde{f}_{\sigma _k}(\cdot ,\varepsilon )\) on \(D_F\), many descent algorithms can be used to find \((x^k,\varepsilon _k)\). Given \(\eta \ge 0\), define a perturbation function of \((P)\) as follows
Clearly, \(\theta \) is a nonincreasing function with \(\theta (0)\) equal to the optimal value of \((P)\). Since \(f(x)\) is lower-bounded on \([u,v]\), the limit of \(\theta \) as \(\eta \) approaches \(0^+\) exists and
i.e., \(\theta \) is upper semicontinuous at zero.
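To spell the claim out: \(\theta \) is nonincreasing and bounded below, so the one-sided limit exists and is dominated by \(\theta (0)\):
$$\begin{aligned} \lim \limits _{\eta \rightarrow 0^+}\theta (\eta )=\sup \limits _{\eta >0}\theta (\eta )\le \theta (0), \end{aligned}$$
which is precisely upper semicontinuity of \(\theta \) at zero.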
Theorem 5.1
Under the assumptions of Theorem 3.1, let \(\{(x^k,\varepsilon _k)\}\) be a sequence generated by Algorithm 5.1 with \((x^k,\varepsilon _k)\in argmin (\widetilde{P}_{\sigma _k})\). Then there exists \(k_0\) such that \(\varepsilon _k=0\) whenever \(\sigma _k\ge \sigma _{k_0}\), i.e., \(x^k\) is an optimal solution of \((P)\).
Proof
From Theorem 3.1, there exists \(k_0\) such that \(\varepsilon _k=0\) whenever \(\sigma _k\ge \sigma _{k_0}\). It readily follows from the finiteness of \(\tilde{f}_{\sigma _k}(x^k,0)\) and the definition of \(\tilde{f}_{\sigma }\) that \(F(x^k)=0\), i.e., \(\tilde{f}_{\sigma _k}(x^k,0)=f(x^k)\). Note that \((x^k,0)\in argmin (\widetilde{P}_{\sigma _k})\) yields that \(x^k\in argmin (P)\). \(\square \)
Lemma 5.1
Let \(\beta \) satisfy Assumption \((C)\), and \(\{(x^k,\varepsilon _k)\}\) be an infinite sequence generated by Algorithm 5.1. Then
- (1) \(\exists k_0>0\) such that \(\varepsilon _k>0\) for all \(k\ge k_0\);
- (2) \(\lim \limits _{k\rightarrow \infty }\varepsilon _k=0\);
- (3) \(\lim \limits _{k\rightarrow \infty }F(x^k)=0\).
Proof
Note that the condition \(\delta _k\le \alpha \) always holds for \(k\) large enough, since \(\delta _{k+1}=\delta _k/2\) by Step 4. According to Step 2 and Step 3, this means that the termination condition is \(\varepsilon _k=0\). In other words, the algorithm generates an infinite sequence only if \(\varepsilon _k>0\) for all \(k\) sufficiently large. By the iteration strategy on \(\sigma _k\) in Step 4, we have
From Step 1, we have
where \(\overline{x}\) is a feasible point of \((P)\). Note that \(f\) is bounded from below on \(D\) and \(\phi \) is nonnegative. It immediately follows from (5.2) and (5.3) that \(\lim \limits _{k\rightarrow \infty }\beta (\varepsilon _k)=0,\) which, together with Assumption (C), further implies that
Similarly, according to the nonnegativity of \(\beta \), we get from (5.3) and (5.4) that \(\lim \limits _{k\rightarrow \infty }\phi (\Delta _k)= 0\), and hence
where we have used the gradient inequality of convex function \(\phi \), i.e., \(\phi (\Delta _k)\ge \phi '(0)\Delta _k\ge 0\). Since \(\Vert F(x^k)\Vert \le \sqrt{\Delta _k}+\varepsilon _k\Vert \omega \Vert \), putting (5.4) and (5.5) together yields \(\lim \limits _{k\rightarrow \infty }\Vert F(x^k)\Vert =0\) as claimed. \(\square \)
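The final triangle-inequality step can be written out explicitly: with \(\Delta _k=\Vert F(x^k)-\varepsilon _k\omega \Vert ^2\) (mirroring the definition of \(\overline{\Delta }_k\) used later in the proof of Theorem 5.2),
$$\begin{aligned} \Vert F(x^k)\Vert \le \Vert F(x^k)-\varepsilon _k\omega \Vert +\varepsilon _k\Vert \omega \Vert =\sqrt{\Delta _k}+\varepsilon _k\Vert \omega \Vert , \end{aligned}$$
so (5.4) and (5.5) together force \(\Vert F(x^k)\Vert \rightarrow 0\).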
The global convergence property of Algorithm 5.1 is given below.
Theorem 5.2
Let \(\beta \) satisfy Assumption \((C)\). The following statements hold:
- (1) If Algorithm 5.1 stops after finitely many iterations, then \(x^k\) is either an optimal solution or an \(\alpha \)-optimal solution of \((P)\).
- (2) If an infinite sequence is generated, then
$$\begin{aligned} \lim \limits _{k\rightarrow \infty }f(x^k)= \lim \limits _{\eta \rightarrow 0^+}\theta (\eta ), \end{aligned}$$and
$$\begin{aligned} \lim \limits _{k\rightarrow \infty }\inf \{\widetilde{f}_{\sigma _k} (x,\varepsilon )~|~(x,\varepsilon )\in {[u,v]\times [0,\overline{\varepsilon }]}\} =\lim \limits _{\eta \rightarrow 0^+}\theta (\eta ). \end{aligned}$$
Proof
(1). Suppose the algorithm stops at the \(k\)-th iteration, then \(\varepsilon _k=0\). Since \(\widetilde{f}_{\sigma _k}(x^k,0)\) is finite, then \(F(x^k)=0,\) and hence \(\widetilde{f}_{\sigma _k}(x^k,0)=f(x^k)\) by the definition of \(\widetilde{f}_{\sigma }\). Therefore, according to Step 1, we have \((x^k,0)\in argmin (\widetilde{P}_{\sigma _k})\) or \((x^k,0)\in \delta _k- argmin (\widetilde{P}_{\sigma _k})\). This together with \(\widetilde{f}_{\sigma _k}(x^k,0)=f(x^k)\) and \(\delta _k\le \alpha \) yields \(x^k\in argmin(P)\) or \(x^k\in \alpha - argmin(P)\).
(2). If the algorithm does not stop after finitely many iterations, then \(\lim \limits _{k\rightarrow \infty }\sigma _k=+\infty \). By Lemma 5.1, there exists \(k_0\) such that \(\varepsilon _k>0\) for all \(k\ge k_0\). Since \(\widetilde{f}_{\sigma _k}(x^k,\varepsilon _k)\) is finite, we have
The continuity of \(\beta \) and the fact \(\beta (0)=0\) guarantee the existence of \(\overline{\varepsilon }_k>0\) such that \(\lim \limits _{k\rightarrow \infty }\overline{\varepsilon }_k=0\) and
Indeed, this is ensured by choosing \(\overline{\varepsilon }_k\) such that \(\beta (\overline{\varepsilon }_k)\le 1/\sigma _k^2.\) Choose another sequence \(\{\xi _k\}\) with \(\xi _k>0\) and \(\lim \limits _{k\rightarrow \infty }\xi _k= 0\). According to the definition of infimum in \(\theta \), there exist \(y^k\in [u,v]\) such that \(\Vert F(y^k)\Vert \le \overline{\varepsilon }_k,\) and
Let \(\overline{\Delta }_k=\Vert F(y^k)-\overline{\varepsilon }_k\omega \Vert ^2. \) Since \(\Vert F(y^k)\Vert \le \overline{\varepsilon }_k\), then
from which, together with the fact \(\lim \limits _{k\rightarrow \infty }\overline{\varepsilon }_k=0\), we get \(\overline{\Delta }_k<a\) for all \(k\) sufficiently large. Thus, \(\widetilde{f}_{\sigma _k}(y^k,\overline{\varepsilon }_k)\) is finite by (2.1). Lemma 5.1 implies that, for any \(\eta >0\), we have \(\Vert F(x^k)\Vert \le \eta \) for \(k\) large enough. Hence, \(\theta (\eta )\le f(x^k)\) holds by the definition of \(\theta \). Therefore,
where the fifth inequality comes from (5.7) and the convexity of \(\phi \) as before, and the last inequality is due to (5.8). Using (5.6) and the arbitrariness of \(\eta >0\), we get the desired result by taking limits on both sides of the above inequality. \(\square \)
Corollary 5.1
Let \(\beta \) satisfy Assumption \((C)\), and \(\{(x^k,\varepsilon _k)\}\) be an infinite sequence generated by Algorithm 5.1. The following statements hold.
- (1) \(\lim \limits _{k\rightarrow \infty }f(x^k)=\theta (0)\) if and only if \(\theta \) is lower semicontinuous at zero.
- (2) If \(x^*\) is an accumulation point of \(\{x^k\}\), then \(x^*\) is a global optimal solution of \((P)\).
Proof
(1). This follows directly from Part 2) of Theorem 5.2.
(2). If \(x^*\) is an accumulation point, then \(F(x^*)=0\) by Lemma 5.1, i.e., \(x^*\) is feasible. According to Part 2) in Theorem 5.2, we get from (5.1) that
which implies \(f(x^*)=\theta (0)\), i.e., \(x^*\) is a global optimal solution of \((P)\). \(\square \)
Using the penalty function \(\widetilde{f}_\sigma \), define the associated dual optimization problem
where \(l(\sigma )=\inf \{\widetilde{f}_\sigma (x,\varepsilon )~|~(x,\varepsilon )\in {[u,v]\times [0,\overline{\varepsilon }]}\}\). It is easy to see that the weak dual theorem holds, i.e.,
Note that \(l\) is a nondecreasing function. Therefore, invoking Part 2) of Theorem 5.2, we develop the following necessary and sufficient condition for the validity of the zero duality gap property, i.e.,
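For reference, the weak duality chain behind this statement reads: for any \(x\) feasible for \((P)\), so that \(F(x)=0\) and \(\widetilde{f}_\sigma (x,0)=f(x)\),
$$\begin{aligned} l(\sigma )\le \widetilde{f}_\sigma (x,0)=f(x)\quad \text{ for all } \sigma >0,\qquad \text{ hence }\quad \sup \limits _{\sigma >0}l(\sigma )\le \theta (0), \end{aligned}$$
and the zero duality gap property asserts that this inequality holds with equality.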
Corollary 5.2
Let \(\beta \) satisfy Assumption \((C)\). Then the zero duality gap property between \((P)\) and \((L)\) holds if and only if \(\theta \) is lower semicontinuous at zero.
6 Numerical results
To give some insight into the behavior of Algorithm 5.1, numerical tests are performed on four nonlinear programming problems with equality constraints taken from [11]. The algorithm is implemented in Matlab 7.8.0 and run on an Intel Core 2 CPU (2.39 GHz, 1.99 GB memory). We use \(\Vert \nabla _{(x,\varepsilon )}\widetilde{f}_{\sigma }(x,\varepsilon )\Vert \le 10^{-6} \) as the stopping criterion. Tables 1, 2, 3 and 4 show the computational results for the corresponding problems with the following items:
We perform numerical tests using different choices of \(\phi _i,~i=1,4,6\), defined in Sect. 2, where \(\phi _1(t)\) is a barrier-type penalty function and \(\phi _4(t), \phi _6(t)\) are exterior-type penalty functions. Here, \(\beta (\varepsilon )\) in (2.1) is set to \(\sqrt{\varepsilon }\), and \(\rho =2\) or \(\rho =4\). Tables 1, 2, 3 and 4 illustrate the practical behavior of the algorithm. Since the proposed penalty function has good smoothness properties, the penalty subproblems can be (approximately) solved by unconstrained smooth minimization techniques; in our numerical experiments, trust-region methods are employed for solving the penalty subproblems.
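To illustrate how such a smooth penalty subproblem can be solved, the following toy sketch (not taken from the paper) minimizes a simplified exterior-type penalty \(f(x)+\sigma \Vert F(x)\Vert ^2\) for the problem \(\min x_1^2+x_2^2\) s.t. \(x_1+x_2=1\). The choices \(\phi (t)=t\), \(\omega =0\), \(\varepsilon \) fixed at \(0\), and plain gradient descent in place of a trust-region method are all simplifying assumptions.

```python
def f(x):
    """Toy objective: f(x) = x1^2 + x2^2."""
    return x[0] ** 2 + x[1] ** 2

def F(x):
    """Single equality constraint, F(x) = x1 + x2 - 1 = 0."""
    return x[0] + x[1] - 1.0

def penalty(x, sigma):
    # Simplified exterior-type penalty in the spirit of (2.1):
    # phi(t) = t, omega = 0, eps fixed at 0, penalty term sigma*||F(x)||^2.
    return f(x) + sigma * F(x) ** 2

def minimize_penalty(sigma, x0=(0.0, 0.0), iters=200):
    """Plain gradient descent on the penalty subproblem (a stand-in for
    the trust-region solver used in the experiments)."""
    lr = 1.0 / (2.0 + 4.0 * sigma)  # safe step size for this quadratic
    x = list(x0)
    for _ in range(iters):
        g = 2.0 * sigma * F(x)
        grad = [2.0 * x[0] + g, 2.0 * x[1] + g]
        x = [xi - lr * gi for xi, gi in zip(x, grad)]
    return x

# As sigma grows, the penalized minimizer sigma/(1+2*sigma)*(1,1)
# approaches the true solution (0.5, 0.5); the constraint violation
# |F(x)| shrinks like 1/(1+2*sigma).
for sigma in (1.0, 10.0, 100.0):
    x = minimize_penalty(sigma)
    print(sigma, x, F(x))
```

This mirrors the observed behavior in Tables 1, 2, 3 and 4: moderate penalty parameters already drive the constraint violation close to zero.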
Example 6.1
([11])
The point \(\bar{x}=(1,1,0,0)\) is the unique (global) minimizer with the optimal objective function value \(-1.0000\).
Example 6.2
([11])
The point \(\bar{x}=(1,1,1,1,1)\) is the (global) minimizer with the optimal objective function value 0.
Example 6.3
([11])
The point \(\bar{x}=(-\frac{33}{349},\frac{11}{349},\frac{180}{349}, -\frac{158}{349},\frac{11}{349})\) is a minimizer with the optimal objective function value \(\frac{1859}{349}\).
Example 6.4
([11])
where \(c_1=-6.089,c_2=-17.164,c_3=-34.054, c_4=-5.914,c_5=-24.721, c_6=-14.986,c_7=-24.100, c_8=-10.708, c_9=-26.662,c_{10}=-22.179\). The point \(\bar{x}=(0.1083,0.9438, 0,0.0004,0.4978,0.0040,0,0,0,0)\) is a local (though not unique) minimizer with the optimal objective function value \(-30.581215\).
As shown, for all choices of \(\phi _i\), as the penalty parameter gradually increases, the values of the indicator variable \(\varepsilon _k\) and the constraint violation measure \(\Delta (x^k,\varepsilon _k)\) tend to zero as desired. Additionally, it is not difficult to observe that the minimizers can be obtained without requiring large penalty parameters for the choices of \(\phi _i\) considered here. The numerical performance verifies the correctness of the developed theory. For example, as illustrated in Tables 1, 2, 3 and 4, the iterates \((x^k,\varepsilon _k)\) are already quite close to the point \((\bar{x},0)\), where \(\bar{x}\) is a minimizer of the original problem. As stated in Theorem 3.1, the optimal solution of the penalty problem must take the form \((x^k,\varepsilon _k)\) with \(\varepsilon _k=0\) for sufficiently large penalty parameters, which further means that \(x^k\) is a local optimal solution of the original problem \((P)\).
In summary, our numerical experiments on four examples of nonlinear programming problems with equality constraints confirm the efficiency of the proposed algorithm. As shown in Tables 1, 2, 3 and 4, the numerical outputs for the different choices of \(\phi _i,~ i=1,4,6\), show no significant differences, which demonstrates that the class of convex functions, providing a unified framework for barrier-type and exterior-type functions, is effective for solving nonlinear programming problems. Nevertheless, there is a slight difference in implementation between exterior-type and barrier-type penalty functions: for barrier-type penalty functions, the additional constraint \(\Delta (x,\varepsilon )<a\) must be taken into account, for instance, \(a=q^{-1}\) for \(\phi _1(t)\), \(a=\frac{\pi }{2}\) for \(\phi _2(t)\), and \(a=1\) for \(\phi _3(t)\).
References
Antczak, T.: Exact penalty functions method for mathematical programming problems involving invex functions. Eur. J. Oper. Res. 198, 29–36 (2009)
Bazaraa, M., Sherali, H., Shetty, C.: Nonlinear Programming: Theory and Algorithms. Wiley, New York (2006)
Burke, J.: Calmness and exact penalization. SIAM J. Control Optim. 29, 493–497 (1991)
Cominetti, R., Dussault, J.P.: Stable exponential-penalty algorithm with superlinear convergence. J. Optim. Theory Appl. 83, 285–390 (1994)
Chen, L., Goldfarb, D.: Interior-point \(l_2\) penalty methods for nonlinear programming with strong global convergence properties. Math. Program. 108, 1–36 (2006)
Evans, J.P., Gould, F.J., Tolle, J.W.: Exact penalty functions in nonlinear programming. Math. Program. 4, 72–97 (1973)
Fletcher, R.: An exact penalty function for nonlinear programming with inequalities. Math. Program. 5, 129–150 (1973)
Gonzaga, C.C., Castillo, R.A.: A nonlinear programming algorithm based on non-coercive penalty functions. Math. Program. 96, 87–101 (2003)
Han, S.P., Mangasarian, O.L.: A dual differentiable exact penalty function. Math. Program. 25, 293–306 (1983)
Han, S.P., Mangasarian, O.L.: Exact penalty functions in nonlinear programming. Math. Program. 17, 251–269 (1979)
Hock, W., Schittkowski, K.: Test Examples for Nonlinear Programming Codes. Springer, New York (1981)
Hoheisel, T., Kanzowa, C., Outrata, J.: Exact penalty results for mathematical programs with vanishing constraints. Nonlinear Anal. 72, 2514–2526 (2010)
Huyer, W., Neumaier, A.: A new exact penalty function. SIAM J. Optim. 13, 1141–1158 (2003)
Li, W., Peng, J.: Exact penalty functions for constrained minimization problems via regularized gap function for variational inequalities. J. Global Optim. 37, 85–94 (2007)
Lucidi, S.: New results on a continuously differentiable exact penalty function. SIAM J. Optim. 2, 558–574 (1992)
Luo, Z.Q., Pang, J.S., Ralph, D.: Mathematical Programs with Equilibrium Constraints. Cambridge University Press, Cambridge (1996)
Meng, Z.Q., Dang, C.Y., Yang, X.Q.: On the smoothing of the square-root exact penalty function for inequality constrained optimization. Comput. Optim. Appl. 35, 375–398 (2006)
Di Pillo, G.: Exact penalty methods. In: Spedicato, E. (ed.) Algorithms for Continuous Optimization. Kluwer, The Netherlands (1994)
Di Pillo, G., Grippo, L.: An exact penalty function method with global convergence properties for nonlinear programming problems. Math. Program. 36, 1–18 (1986)
Di Pillo, G., Lucidi, S.: An augmented Lagrangian function with improved exactness properties. SIAM J. Optim. 12, 376–406 (2001)
Tseng, P., Bertsekas, D.P.: On the convergence of the exponential multiplier method for convex programming. Math. Program. 60, 1–19 (1993)
Zaslavski, A.J.: Existence of exact penalty for constrained optimization problems in Hilbert spaces. Nonlinear Anal. 67, 238–248 (2007)
Acknowledgments
The authors would like to thank referees for their constructive comments and suggestions, which significantly improved the presentation of the paper. The authors also thank Prof. Zhiyuan Tian in Qingdao University for his help.
Additional information
This research was supported by the National Natural Science Foundation of China (10971118, 11271226, 11271233, 11101248, 10901096).
Wang, C., Ma, C. & Zhou, J. A new class of exact penalty functions and penalty algorithms. J Glob Optim 58, 51–73 (2014). https://doi.org/10.1007/s10898-013-0111-9
Keywords
- Nonlinear programming
- Smooth and exact penalty function
- Constraint qualifications
- Penalty function algorithms