Abstract
We propose a two-phase-SQP (Sequential Quadratic Programming) algorithm for equality-constrained optimization problems. An iteration process is developed in which two quadratic subproblems are solved at each iteration. It is proved that, under suitable assumptions and without computing further higher-order derivatives, this iteration process achieves a higher-order local convergence rate than the Newton-SQP scheme. Theoretical advantages of the method and a note on the \(l_{1}\) merit function associated with it are provided.
1 Introduction
The sequential quadratic programming (SQP) method is one of the most successful methods for solving constrained nonlinear optimization problems. It is an iterative procedure which generates a sequence of points (not necessarily feasible points), obtained by solving quadratic programming subproblems, that converges to a Karush-Kuhn-Tucker (KKT) point. The idea was first proposed by Wilson [1] in 1963. Since then, the SQP method has been studied extensively by many researchers (see [2–8]). The reader may consult Boggs and Tolle [9], Gould et al. [10], and Schittkowski and Yuan [11] for comprehensive reviews of SQP algorithms. Some SQP methods use a convex quasi-Newton approximation, which makes the algorithm slow for large-scale problems, whereas others employ the exact Hessian of the Lagrangian, yielding Newton-like methods. Under suitable assumptions, the Newton version of the SQP algorithm converges to the minimum point without the use of a line search or additional parameters. The Newton-SQP framework amounts to applying Newton's method to the system derived from the KKT conditions; in that case it is often difficult to choose an initial iterate close enough to the true solution to guarantee convergence. The SQP methods discussed so far exhibit at most superlinear or quadratic local convergence. Over the past few decades, researchers have shown considerable interest (see [12–16]) in iterative algorithms with higher-order convergence. We also note that equality-constrained optimization is a branch of special interest in constrained optimization theory (see [17, 18]).
In this paper, we suggest a two-phase-SQP method for equality-constrained optimization problems, which provides local cubic-order convergence under mild assumptions. We also discuss conditions on the line search associated with the scheme under the \(l_{1}\) merit function.
Section 2 explains the Newton-SQP method. The two-phase-SQP is proposed in Sect. 3 which is followed by the convergence analysis in Sect. 4. We propose a note on \(l_{1}\) merit function of the new scheme in Sect. 5. Numerical examples are provided in Sect. 6, and finally some concluding remarks are provided in Sect. 7.
2 Background: Existing Newton-SQP Method
Consider the following optimization problem with equality constraints:

(EP) \(\quad \min _{x \in {\mathbb {R}}^n} f(x) \quad \text {subject to} \quad h(x)=0,\)

where \(f : {\mathbb {R}}^n \rightarrow {\mathbb {R}}\) and \(h: {\mathbb {R}}^n \rightarrow {\mathbb {R}}^m\) are smooth functions. The motivation behind the local SQP approach is to model (EP) at the current iterate \(x_k\) by a quadratic programming subproblem, then use the minimizer of the subproblem to define a new iterate \(x_{k+1}\).
The Lagrangian function of (EP) is \({\mathscr {L}} (x,\lambda )=f(x)-\lambda ^\mathrm{T}h(x)\), \(\lambda \in {\mathbb {R}}^{m}\). Denote \(A(x)=[ \nabla h_{1}(x), \nabla h_{2}(x), \cdots , \nabla h_{m}(x)]^\mathrm{T}\), where \(h_{i}(x)\) is the \(i\mathrm{th}\) component of the vector \(h(x)\). The KKT system for (EP) is

\(P(x, \lambda ) \triangleq \begin{pmatrix} \nabla f(x) - A(x)^\mathrm{T} \lambda \\ h(x) \end{pmatrix} = 0. \quad (2.2)\)
Any solution \((x^{*}, \lambda ^{*})\) of (EP) at which \(A(x^{*})\) has full rank satisfies (2.2). System (2.2) consists of \(n+m\) equations in the \(n+m\) unknowns \(x\) and \(\lambda \). Let \(P^{\prime } (x_{k}, \lambda _{k})\) be the first-order Fréchet derivative of P at \((x_{k}, \lambda _{k})\). The next iterate \((x_{k+1}, \lambda _{k+1})\) is given by \(x_{k+1}=x_{k}+\widehat{x}_{k}\) and \(\lambda _{k+1}=\lambda _{k}+\widehat{\lambda }_{k}\), where \((\widehat{x}_{k}, \widehat{\lambda }_{k})\) is the value of \((\alpha , \beta ) \in {\mathbb {R}}^n \times {\mathbb {R}}^m\) satisfying the following system:

\(P(x_{k}, \lambda _{k}) + P^{\prime }(x_{k}, \lambda _{k}) \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = 0,\)

which is the same as

\(\begin{pmatrix} \nabla _{xx}^{2} {\mathscr {L}}_{k} & -A_{k}^\mathrm{T} \\ A_{k} & 0 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = -\begin{pmatrix} \nabla f_{k} - A_{k}^\mathrm{T} \lambda _{k} \\ h_{k} \end{pmatrix}, \quad (2.3)\)
where \({\mathscr {L}}_{k} \triangleq {\mathscr {L}}(x_{k}, \lambda _{k})\), \(f_{k} \triangleq f(x_{k})\), \(A_{k} \triangleq A(x_{k})\), and \(h_{k} \triangleq h(x_{k})\).
The Newton step generated from the iterate \((x_{k}, \lambda _{k})\) is thus given by

\((x_{k+1}, \lambda _{k+1}) = (x_{k}+\widehat{x}_{k},\; \lambda _{k}+\widehat{\lambda }_{k}), \quad (2.4)\)

where \(\widehat{x}_{k}\) and \(\widehat{\lambda }_{k}\) can be obtained by solving (2.3). If the following assumptions hold, then (2.3) has a unique solution (see [19]).
Assumptions

- (A-1) The constraint Jacobian matrix \(A(x)\) has full rank.
- (A-2) The matrix \(\nabla _{xx}^{2} {\mathscr {L}}(x, \lambda )\) is positive definite on the tangent space of the constraints, i.e., \(d^\mathrm{T} \nabla _{xx}^{2} {\mathscr {L}}(x, \lambda ) d>0\) for all \(d \in \{d \in {\mathbb {R}}^{n} ~|~ A(x)d=0,~ d \ne 0 \}\).
Consider the following quadratic programming problem as an approximate model of (EP) at \((x_{k}, \lambda _{k})\):

(QP\(_k\)) \(\quad \min _{s \in {\mathbb {R}}^{n}} \; \nabla f_{k}^\mathrm{T} s + \frac{1}{2} s^\mathrm{T} \nabla _{xx}^{2} {\mathscr {L}}_{k}\, s \quad \text {subject to} \quad A_{k} s + h_{k} = 0.\)
The Lagrange function of (QP\(_k\)) is

\(\widetilde{{\mathscr {L}}}(s, l) = \nabla f_{k}^\mathrm{T} s + \frac{1}{2} s^\mathrm{T} \nabla _{xx}^{2} {\mathscr {L}}_{k}\, s - l^\mathrm{T} (A_{k} s + h_{k}),\)

where l is the Lagrange multiplier of (QP\(_k\)). If assumptions (A-1) and (A-2) hold, then (\(\mathrm{QP}_k\)) has a unique solution \((s_{k},l_{k})\) that satisfies the first-order KKT conditions of (\(\mathrm{QP}_k\)), which are

\(\nabla _{xx}^{2} {\mathscr {L}}_{k}\, s_{k} - A_{k}^\mathrm{T} l_{k} = -\nabla f_{k}, \qquad A_{k} s_{k} = -h_{k}. \quad (2.5)\)
These two equations can be written in the following matrix form:

\(\begin{pmatrix} \nabla _{xx}^{2} {\mathscr {L}}_{k} & -A_{k}^\mathrm{T} \\ A_{k} & 0 \end{pmatrix} \begin{pmatrix} s_{k} \\ l_{k} \end{pmatrix} = \begin{pmatrix} -\nabla f_{k} \\ -h_{k} \end{pmatrix}. \quad (2.6)\)
Subtracting \(A_{k}^\mathrm{T} \lambda _{k}\) from both sides of the first equation of (2.3), we have

\(\nabla _{xx}^{2} {\mathscr {L}}_{k}\, \widehat{x}_{k} - A_{k}^\mathrm{T} (\lambda _{k} + \widehat{\lambda }_{k}) = -\nabla f_{k},\)

i.e.,

\(\begin{pmatrix} \nabla _{xx}^{2} {\mathscr {L}}_{k} & -A_{k}^\mathrm{T} \\ A_{k} & 0 \end{pmatrix} \begin{pmatrix} \widehat{x}_{k} \\ \lambda _{k+1} \end{pmatrix} = \begin{pmatrix} -\nabla f_{k} \\ -h_{k} \end{pmatrix}, \quad (2.7)\)

where \(\lambda _{k+1}= \lambda _{k}+\widehat{\lambda }_{k}\).
Now the local SQP iterates are \((x_{k+1}, \lambda _{k+1})\), where \(x_{k+1}=x_{k}+s_{k}\) and \(\lambda _{k+1}=l_{k}\). If assumptions (A-1) and (A-2) hold, then \(P^{\prime }(x_{k}, \lambda _{k})\) is a nonsingular matrix. Hence, from (2.6) and (2.7), we have \(s_{k}=\widehat{x}_{k}\) and \(l_{k}=\lambda _{k+1}\). In this way, the new iterate \((x_{k+1}, \lambda _{k+1})\) can be generated by solving (\(\mathrm{QP}_k\)). Further, it is proved in [9] that if \((x_{0}, \lambda _{0})\) is chosen close to the solution \((x^{*}, \lambda ^{*})\) of (EP), then the sequence \(\{(x_{k}, \lambda _{k})\}\) converges to \((x^{*}, \lambda ^{*})\) quadratically.
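To make the iteration concrete, the local Newton-SQP scheme above can be sketched in NumPy; the function names and the small circle-constrained test problem are our illustrative choices, not taken from the paper.

```python
import numpy as np

def newton_sqp(grad_f, hess_L, h, jac_h, x0, lam0, tol=1e-12, max_iter=50):
    """Local Newton-SQP sketch for min f(x) s.t. h(x) = 0:
    solve the linear KKT system (2.3) at each iterate."""
    x, lam = np.asarray(x0, float), np.asarray(lam0, float)
    n, m = x.size, lam.size
    for k in range(max_iter):
        A = jac_h(x)                                        # m x n Jacobian A(x)
        P = np.concatenate([grad_f(x) - A.T @ lam, h(x)])   # KKT residual P(x, lam)
        if np.linalg.norm(P) < tol:
            return x, lam, k
        K = np.block([[hess_L(x, lam), -A.T],
                      [A, np.zeros((m, m))]])               # P'(x, lam)
        step = np.linalg.solve(K, -P)
        x, lam = x + step[:n], lam + step[n:]
    return x, lam, max_iter

# Hypothetical test problem: min (x1-2)^2 + x2^2  s.t.  x1^2 + x2^2 = 1,
# with minimizer x* = (1, 0) and multiplier lambda* = -1.
grad_f = lambda x: np.array([2 * (x[0] - 2), 2 * x[1]])
h = lambda x: np.array([x[0]**2 + x[1]**2 - 1])
jac_h = lambda x: np.array([[2 * x[0], 2 * x[1]]])
hess_L = lambda x, lam: 2 * (1 - lam[0]) * np.eye(2)        # Hess f - lam * Hess h

x, lam, iters = newton_sqp(grad_f, hess_L, h, jac_h, [1.2, 0.2], [0.0])
```

Starting this close to \((x^{*}, \lambda ^{*})\), the iteration should converge in a few steps, consistent with the local quadratic rate.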
In the next section, this concept is used to develop a new algorithm to solve (EP) with higher-order convergence property.
3 Two-Phase-SQP Technique
Let \((s_{k}, l_{k})\) be the solution of (\(\mathrm{QP}_k\)) obtained by the process described in Sect. 2, where \(l_{k}\) is the Lagrange multiplier at the \(k\mathrm{th}\) stage. Denote

\(x_{\bar{k}} \triangleq x_{k}+s_{k}, \qquad \lambda _{\bar{k}} \triangleq l_{k}. \quad (3.1)\)
At this new point \((x_{\bar{k}},\lambda _{\bar{k}})\) find \(A_{\bar{k}}\triangleq [\nabla h_{1}(x_{\bar{k}}),\nabla h_{2}(x_{\bar{k}}), \cdots , \nabla h_{m}(x_{\bar{k}})]^\mathrm{T}\) and \(\nabla _{xx}^{2}{\mathscr {L}}_{{\bar{k}}} \triangleq \nabla _{xx}^{2} {\mathscr {L}}(x_{\bar{k}},\lambda _{\bar{k}})\). Consider the following optimization problem \((\mathrm{QP}_k1)\) at \((x_{k},\lambda _{k})\):
where \(B_{k}=A_{\bar{k}}-A_{k}\), \(C_{k}=A_{\bar{k}} +A_{k}~\).
\((\mathrm{QP}_k1)\) is a convex quadratic programming problem on the tangent subspace \(\{ d~|~A(x)d=0 \}\), on which \(d^\mathrm{T} \nabla _{xx}^{2}{\mathscr {L}}(x,\lambda ) d>0\). Assume that \(C_{k}\) has full rank. Then \((\mathrm{QP}_k1)\) has a unique solution \((s_{\bar{k}}, l_{\bar{k}})\), where \(l_{\bar{k}}\) is the Lagrange multiplier of \((\mathrm{QP}_k1)\), and this solution satisfies the first-order KKT optimality conditions of \((\mathrm{QP}_k1)\). From these KKT optimality conditions,
Adding \(\frac{1}{2}C_{k}^\mathrm{T} \lambda _{k}\) to both sides of (3.2a), we get
(3.3a) and (3.3b) can be written as the following matrix equation:
This is equivalent to
i.e.,
Denote
The iteration process discussed above can be summarized in the following two steps:
First step:
where (\(\widehat{x}_{k}\),\(\widehat{\lambda }_{k}\)) is the value of \((\alpha , \beta )\) satisfying
Second step:
where (\(\widehat{x}_{\bar{k}}\),\(\widehat{\lambda }_{\bar{k}}\)) is the value of \((\alpha , \beta )\) satisfying
Since \(\lambda _{k+1}=\lambda _{k} +\widehat{ \lambda }_{\bar{k}}\), from (3.4) and (3.6a) we obtain \(\lambda _{k+1}=\lambda _{k}+(l_{\bar{k}}-\lambda _{k})=l_{\bar{k}}\). Hence the new iterate \((x_{k+1}, \lambda _{k+1})\) can be generated by solving the KKT optimality conditions given in (3.2a) or, equivalently, the systems (3.5) and (3.6).
We state the proposed two-phase-SQP (TP-SQP) scheme in its simplest form in Algorithm 1.
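In code, one iteration of the scheme solves two linear KKT systems: phase one takes the Newton-SQP step to the intermediate point, and phase two reuses the same residual with the average of the KKT matrices at the two points. This is our reading of (3.5)-(3.6) as an averaged-Jacobian Newton variant; the function names and the test problem below are illustrative, not from the paper.

```python
import numpy as np

def kkt_parts(grad_f, hess_L, h, jac_h, x, lam, m):
    """KKT residual P(x, lam) and its Jacobian P'(x, lam)."""
    A = jac_h(x)
    P = np.concatenate([grad_f(x) - A.T @ lam, h(x)])
    K = np.block([[hess_L(x, lam), -A.T], [A, np.zeros((m, m))]])
    return P, K

def tp_sqp(grad_f, hess_L, h, jac_h, x0, lam0, tol=1e-12, max_iter=50):
    """Two-phase-SQP sketch: phase one is a Newton-SQP step to the
    intermediate point; phase two replaces the KKT matrix by the
    average of the matrices at the two points."""
    x, lam = np.asarray(x0, float), np.asarray(lam0, float)
    n, m = x.size, lam.size
    for k in range(max_iter):
        P, K1 = kkt_parts(grad_f, hess_L, h, jac_h, x, lam, m)
        if np.linalg.norm(P) < tol:
            return x, lam, k
        s1 = np.linalg.solve(K1, -P)                 # first phase, cf. (3.5)
        xb, lb = x + s1[:n], lam + s1[n:]            # intermediate point
        _, K2 = kkt_parts(grad_f, hess_L, h, jac_h, xb, lb, m)
        s2 = np.linalg.solve(0.5 * (K1 + K2), -P)    # second phase, cf. (3.6)
        x, lam = x + s2[:n], lam + s2[n:]
    return x, lam, max_iter

# Same hypothetical test problem: min (x1-2)^2 + x2^2  s.t.  x1^2 + x2^2 = 1.
grad_f = lambda x: np.array([2 * (x[0] - 2), 2 * x[1]])
h = lambda x: np.array([x[0]**2 + x[1]**2 - 1])
jac_h = lambda x: np.array([[2 * x[0], 2 * x[1]]])
hess_L = lambda x, lam: 2 * (1 - lam[0]) * np.eye(2)

x, lam, iters = tp_sqp(grad_f, hess_L, h, jac_h, [1.2, 0.2], [0.0])
```

Note that each iteration factors two KKT matrices but evaluates no derivative beyond those already required by Newton-SQP.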
For solving the equality-constrained optimization problem, a good initial estimate \(x_{0}\) for \(x^{*}\) can be used to obtain a good initial estimate \(\lambda _{0}\) for the optimal multiplier vector \(\lambda ^{*}\), which, in the light of [9], may be taken as the least-squares solution

\(\lambda _{0} = [A(x_{0}) A(x_{0})^\mathrm{T}]^{-1} A(x_{0}) \nabla f(x_{0}). \quad (3.7)\)
\(\lambda _{0}\) can be made arbitrarily close to \(\lambda ^{*}\) by choosing \(x_{0}\) close to \(x^{*}\).
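A minimal sketch of such a least-squares multiplier estimate, a standard choice in the spirit of [9] (the function name `lambda0_estimate` and the sample data are ours):

```python
import numpy as np

def lambda0_estimate(grad_f0, A0):
    """Least-squares multiplier estimate: solve (A0 A0^T) lam = A0 grad_f0,
    i.e., lam0 minimizes ||grad_f0 - A0^T lam||_2 (A0 assumed full rank)."""
    return np.linalg.solve(A0 @ A0.T, A0 @ grad_f0)

# Example values at x0 = (1.2, 0.2) for min (x1-2)^2 + x2^2 s.t. x1^2 + x2^2 = 1:
A0 = np.array([[2.4, 0.4]])          # A(x0), here 1 x 2
grad_f0 = np.array([-1.6, 0.4])      # grad f(x0)
lam0 = lambda0_estimate(grad_f0, A0)
```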
4 Convergence Analysis
Theorem 4.1
If P is a thrice Fréchet differentiable function and \((x_{0},\lambda _{0})\) is sufficiently close to the solution \((x^{*},\lambda ^{*})\) of (EP), then the sequence \(\{(x_{k}, \lambda _{k})\}\) generated by Algorithm 1 converges cubically to \((x^{*}, \lambda ^{*})\).
Proof
Let \(u_{k} \triangleq (x_{k}, \lambda _{k})\) and \(v_{k} \triangleq (x_{\bar{k}}, \lambda _{\bar{k}})\) be \((n+m)\)-dimensional vectors. Then the steps (3.5) and (3.6) can be written as follows:

\(v_{k} = u_{k} - [P^{\prime }(u_{k})]^{-1} P(u_{k}), \quad (4.1a)\)

\(u_{k+1} = u_{k} - 2\,[P^{\prime }(u_{k}) + P^{\prime }(v_{k})]^{-1} P(u_{k}). \quad (4.1b)\)
We denote \(\alpha =(x^{*}, \lambda ^{*})\). Here \(P:{\mathbb {R}}^{n+m}\rightarrow {\mathbb {R}}^{n+m}\). Using Taylor expansion of \(P(\alpha )\) about \(u_{k}\), we have
Subtracting \(\alpha \) from both sides of (4.1b) and denoting \(e_{k}=u_{k}-\alpha \), we have
Since \((x^{*}, \lambda ^{*})\) is the solution of (EP), so \(P(x^{*}, \lambda ^{*})=0\). From (4.2),
Expanding \(P^{\prime }(v_{k})\) with respect to \(u_{k}\) we get,
The right-hand side of (4.3) gives
Using (4.6), expression (4.3) can be written as
where
This shows that Algorithm 1 has cubic-order convergence.
One may observe the following theoretical advantages of the TP-SQP scheme:

- Without computing higher-order derivatives, higher-order convergence is achieved in comparison to the Newton-SQP framework.
- The TP-SQP scheme involves the same order of arithmetic operations as the Newton-SQP method.
One may associate a line search technique with each iteration of Algorithm 1 with the help of a merit function, which decides whether a trial step should be accepted. In the next section, we restrict our discussion to the \(l_{1}\) merit function for the proposed scheme and show how to choose the penalty parameter \(\mu \) so that the chosen direction is a descent direction in every iteration.
5 A Note on \(l_{1}\)-Merit Function for the Proposed Scheme
The \(l_{1}\)-merit function \(\phi _{1} : {\mathbb {R}}^{n+1} \rightarrow {\mathbb {R}}\) is of the form

\(\phi _{1}(x; \mu ) = f(x) + \mu \Vert h(x)\Vert _{1},\)

where \(\mu > 0\) is the penalty parameter and \(\Vert \cdot \Vert _{1}\) is the \(l_{1}\) norm. In the TP-SQP method, the step length is of unit length. A step length \(\alpha _{k} \in (0,1]\) is accepted if the following decrease condition holds:

\(\phi _{1}(x_{k} + \alpha _{k} p_{k}; \mu ) \leqslant \phi _{1}(x_{k}; \mu ) + \eta \, \alpha _{k}\, {\mathscr {D}}(\phi _{1}(x_{k}; \mu ); p_{k}),\)

where \(\eta \in (0,1)\) and \({\mathscr {D}}(\phi _{1}(x_{k}; \mu );p_{k})\) denotes the directional derivative of \(\phi _{1}\) at \(x_{k}\) in the direction \(p_{k}\). This requirement is analogous to the Armijo condition of unconstrained optimization, provided that \(p_{k}\) is a descent direction, that is, \({\mathscr {D}}(\phi _{1}(x_{k}; \mu );p_{k}) \leqslant 0\). The descent property holds if \(\mu \) is chosen sufficiently large, as shown in the following theorem.
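The decrease condition can be sketched as a simple backtracking loop on \(\phi _{1}\); the helper names and the one-dimensional check are our illustrative choices.

```python
import numpy as np

def phi1(f, h, x, mu):
    """l1 merit function: phi_1(x; mu) = f(x) + mu * ||h(x)||_1."""
    return f(x) + mu * np.abs(h(x)).sum()

def backtrack(f, h, x, p, mu, D, eta=0.3, max_halvings=30):
    """Shrink alpha in (0, 1] until the sufficient-decrease condition
    phi1(x + a p) <= phi1(x) + eta * a * D holds, where D is the
    directional derivative of phi1 at x along p (assumed negative)."""
    a, base = 1.0, phi1(f, h, x, mu)
    for _ in range(max_halvings):
        if phi1(f, h, x + a * p, mu) <= base + eta * a * D:
            return a
        a *= 0.5
    return a

# Illustrative 1-D check: f(x) = x^2, h(x) = x - 1, direction p = -1 at x = 3.
f = lambda x: (x**2).sum()
h = lambda x: x - 1.0
x, p, mu = np.array([3.0]), np.array([-1.0]), 1.0
D = 2 * 3.0 * (-1.0) + mu * (-1.0)   # grad f . p plus l1 term (h > 0 here) = -7
alpha = backtrack(f, h, x, p, mu, D)
```

Here the full step \(\alpha = 1\) already satisfies the condition, consistent with the unit step length used by the local scheme.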
Theorem 5.1
For the TP-SQP scheme, if \(\max \{ \Vert A_{k}s_{\bar{k}} \Vert _{1} , \Vert \frac{1}{2}(A_{\bar{k}}- A_{k}) s_{\bar{k}} \Vert _{1} \} \leqslant \Vert h_{k} \Vert _{1}\), then \(\mu > 4 \Vert \lambda _{k} \Vert _{\infty } + 2\Vert \lambda _{k+1} \Vert _{\infty }\) implies

\({\mathscr {D}}(\phi _{1}(x_{k}; \mu ); s_{\bar{k}}) < 0.\)
Proof
This implies
i.e.,
Multiplying both sides of (3.2a) on the left by \(s_{\bar{k}}^\mathrm{T}\) and using \(l_{\bar{k}}= \lambda _{k+1}\), the above inequality becomes
Using (3.3a), we get
Using the given condition \(\max ~ \{ \Vert A_{k} s_{\bar{k}} \Vert _{1} , \Vert \frac{1}{2}(A_{\bar{k}}- A_{k}) s_{\bar{k}} \Vert _{1} \} \leqslant \Vert h_{k} \Vert _{1}\), we get
It follows that, under the stated assumption, \(s_{\bar{k}}\) is a descent direction for \(\phi _{1}\) if \(\nabla ^{2}_{xx} {\mathscr {L}}_{{\bar{k}}}\) is positive definite and \(\mu > 2 \Vert \lambda _{k+1} \Vert _{\infty } + 4 \Vert \lambda _{k} \Vert _{\infty }\).
6 Numerical Examples
The following two examples are solved by the proposed algorithm, and the results are analyzed. The algorithm terminates when the norm of the gradient of the Lagrangian falls below a prescribed tolerance.
Example 6.1
Example 6.2
(P1) and (P2) are solved in MATLAB R2013b by the Newton-SQP method and the TP-SQP method with tolerance \(10^{-15}\) and different initial points. The value of \(\lambda _{0}\) is chosen using (3.7). The numbers of iterations required by both methods are provided in Table 1. Since the proposed scheme has the local cubic-order convergence property, the number of iterations it needs to attain the solution is expected to be smaller than that of the Newton-SQP scheme, which has quadratic convergence. Table 1 meets this expectation.
7 Conclusion
In this paper, we have proposed a local scheme with a higher-order convergence property for solving nonlinear equality-constrained optimization problems. Without computing higher-order derivatives, the TP-SQP scheme achieves higher-order convergence than the Newton-SQP method. However, since two quadratic subproblems are solved in each iteration, from the standpoint of software implementation the TP-SQP scheme is preferable to the Newton-SQP method only for those optimization problems on which it takes roughly half as many iterations. In future work, in the light of the TP-SQP scheme, one may propose schemes with order of convergence greater than three.
References
Wilson, R.B.: A simplicial algorithm for concave programming. Ph.D. Dissertation. Harvard University (1963)
Byrd, R.H., Curtis, F.E., Nocedal, J.: Infeasibility detection and SQP methods for nonlinear optimization. SIAM J. Optim. 20(5), 2281–2299 (2010)
Byrd, R.H., Nocedal, J.: An analysis of reduced Hessian methods for constrained optimization. Math. Program. 49(1–3), 285–323 (1990)
Conn, A.R., Gould, N.I., Toint, P.L.: Trust Region Methods, vol. 1. SIAM, Philadelphia (2000)
Han, S.P.: Superlinearly convergent variable metric algorithms for general nonlinear programming problems. Math. Program. 11(1), 263–282 (1976)
Huang, M., Pu, D.: A trust-region SQP method without a penalty or a filter for nonlinear programming. J. Comput. Appl. Math. 281, 107–119 (2015)
Izmailov, A.F., Solodov, M.V.: A truncated SQP method based on inexact interior-point solutions of subproblems. SIAM J. Optim. 20(5), 2584–2613 (2010)
Mo, J., Zhang, K., Wei, Z.: A variant of SQP method for inequality constrained optimization and its global convergence. J. Comput. Appl. Math. 197(1), 270–281 (2006)
Boggs, P.T., Tolle, J.W.: Sequential quadratic programming. Acta Numer. 4, 1–51 (1995)
Gould, N., Orban, D., Toint, P.: Numerical methods for large-scale nonlinear optimization. Acta Numer. 14, 299–361 (2005)
Schittkowski, K., Yuan, Y.X.: Sequential quadratic programming methods. In: Wiley Encyclopedia of Operations Research and Management Science. Wiley, New York (2011)
Babajee, D., Dauhoo, M.: An analysis of the properties of the variants of Newton's method with third order convergence. Appl. Math. Comput. 183(1), 659–684 (2006)
Frontini, M., Sormani, E.: Third-order methods from quadrature formulae for solving systems of nonlinear equations. Appl. Math. Comput. 149(3), 771–782 (2004)
Golbabai, A., Javidi, M.: A third-order Newton type method for nonlinear equations based on modified homotopy perturbation method. Appl. Math. Comput. 191(1), 199–205 (2007)
Kou, J., Li, Y., Wang, X.: A modification of Newton method with third-order convergence. Appl. Math. Comput. 181(2), 1106–1111 (2006)
Sharma, J.R., Guha, R.K., Sharma, R.: An efficient fourth order weighted-Newton method for systems of nonlinear equations. Numer. Algorithms 62(2), 307–323 (2013)
Grapiglia, G.N., Yuan, J., Yuan, Y.X.: A subspace version of the Powell–Yuan trust-region algorithm for equality constrained optimization. J. Oper. Res. Soc. China 1(4), 425–451 (2013)
Liu, X., Yuan, Y.: A sequential quadratic programming method without a penalty function or a filter for nonlinear equality constrained optimization. SIAM J. Optim. 21(2), 545–571 (2011)
Nocedal, J., Wright, S.J.: Numerical Optimization. Springer Series in Operations Research and Financial Engineering, 2nd edn. Springer, New York (2006)
Chakraborty, S.K., Panda, G. Two-Phase-SQP Method with Higher-Order Convergence Property. J. Oper. Res. Soc. China 4, 385–396 (2016). https://doi.org/10.1007/s40305-016-0122-6