1 Introduction

In this paper, we consider equality constrained optimization problems of the form

$$\begin{aligned}&\min _{x\in {\mathbb {R}}^n} f(x) \end{aligned}$$
(1a)
$$\begin{aligned}&\text {subject to }\, c_j(x)=0, \quad j=1, \ldots , m, \end{aligned}$$
(1b)

where f, \(c_j:{\mathbb {R}}^n \rightarrow {\mathbb {R}} \,\, (j=1, \ldots , m)\) are continuously differentiable.

One of the most popular algorithms for (1) is the Sequential Quadratic Programming (SQP) method. At the k-th iteration of the SQP method, a search direction \(d_k\) is computed by solving the subproblem:

$$\begin{aligned}&\min _{d\in {\mathbb {R}}^n} \varphi _k(d)=g_k^Td+\frac{1}{2}d^TB_kd \end{aligned}$$
(2a)
$$\begin{aligned}&\text {subject to }\, A_k^Td+c_k=0, \end{aligned}$$
(2b)

where

$$\begin{aligned} g_k=\nabla f(x_k), \,\, c_k=(c_1(x_k),\ldots ,c_{m}(x_k))^T, \,\, A_k=\begin{bmatrix}\nabla c_1(x_k), \ldots , \nabla c_{m}(x_k) \end{bmatrix}, \end{aligned}$$

and \(B_k\) is an approximation to the Hessian of the Lagrangian of (1). Formally,

$$\begin{aligned} B_k \approx \nabla _x^2 \mathcal {L}(x_k,\lambda _k). \end{aligned}$$

We refer to [2, 6] for a comprehensive treatment of SQP methods and further references. SQP is a very successful algorithm for small and medium-size problems of the form (1). However, it may not be suitable for large scale problems, for example because of the computational cost and memory requirements of each iteration.
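As an illustration of this step (a minimal NumPy sketch of our own; the experiments in Sect. 5 use MATLAB), the subproblem (2) can be solved by assembling and solving its KKT system directly. For large n one would instead use the range space approach of Sect. 3 or an iterative solver.

```python
import numpy as np

def sqp_step_full_kkt(g, c, A, B):
    """Solve the SQP subproblem (2) via its KKT system.

    min_d  g^T d + 0.5 d^T B d   s.t.  A^T d + c = 0,
    with g in R^n, c in R^m, A in R^{n x m}, B in R^{n x n} positive definite.
    Returns the step d and the multipliers lam.
    """
    n, m = A.shape
    # KKT matrix [[B, A], [A^T, 0]] and right-hand side [-g; -c]
    K = np.block([[B, A], [A.T, np.zeros((m, m))]])
    rhs = -np.concatenate([g, c])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]

# tiny example: n = 3 variables, m = 1 linearized constraint
if __name__ == "__main__":
    B = np.eye(3)
    g = np.array([1.0, -2.0, 0.5])
    A = np.array([[1.0], [1.0], [1.0]])   # gradient of a single constraint
    c = np.array([0.3])
    d, lam = sqp_step_full_kkt(g, c, A, B)
    print(d, lam, A.T @ d + c)            # last term should be ~ 0
```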

Currently, solving large scale problems is one of the most important issues in nonlinear optimization. For such problems, the large number of variables, the limited memory storage, and the massive computations required at each iteration constitute the major difficulties. A recent review by Gould et al. [7] discusses the main approaches for handling large scale problems, such as step computation, active set, gradient projection, and interior point methods.

To solve large scale problems, such as large scale nonlinear optimization, nonlinear equations, and nonlinear least squares, Yuan [19, 20] proposed a new approach based on subspace techniques. In addition, a model subspace algorithm was given for unconstrained and equality constrained optimization, and possible choices of subspaces were introduced. In subspace methods, the main issue is how to construct low-dimensional subproblems so that the computational cost of each iteration is reduced compared to standard approaches.

Yuan [19, 20] also showed that many standard algorithms, such as the conjugate gradient method, limited-memory quasi-Newton methods, the projected gradient method, and the null space method, contain ideas or techniques which can be viewed as subspace approaches. Based on these observations, several algorithms have been proposed. In particular, for unconstrained optimization, Wang and Yuan [18] and Wang et al. [17] proposed subspace trust region methods using standard quasi-Newton and limited-memory quasi-Newton approaches, respectively. More detailed discussions on subspace methods for unconstrained optimization can be found in [5, 15,16,17,18, 21]. For nonlinear equality constrained optimization, Grapiglia et al. [9] investigated subspace properties of the Celis-Dennis-Tapia (CDT) problem [3] and proposed a subspace method. In this method, the dimension of the subspace grows as the iterations proceed, and so do the memory requirements and the computational cost; this is the main drawback of the method proposed by Grapiglia et al. [9]. Recently, subspace choices for the CDT problem and subspace properties of the quadratically constrained quadratic program (QCQP) were investigated by Zhao and Fan [22, 23], respectively.

By extending the method in [19], we propose a new subspace SQP method for large scale nonlinear equality constrained optimization problems. Since our method controls the dimension of the subspace at each iteration, a rapid growth of the dimension is avoided. Furthermore, we adopt a line search; as far as we know, a line search has not yet been applied in the context of subspace methods for equality constrained optimization.

In the case of few constraints, our method is equivalent to the standard SQP method in the full space. In the case of many constraints, we reduce the number of constraints in the subproblem and we show that the solution of the subspace subproblem is a descent direction of a particular exact penalty function. Global convergence properties of the proposed method are given for both cases.

This paper is organized as follows. In Sect. 2, we introduce the subspace method for unconstrained optimization in [19]. In Sect. 3, we describe a new subspace technique. Based on it, we propose a subspace method for equality constrained optimization by extending the method in [19]. In Sect. 4, the convergence of the proposed method is established. Numerical results are reported in Sect. 5. Finally, concluding remarks are given in Sect. 6.

2 Subspace methods for unconstrained optimization

In this section, we review the subspace method proposed in [19]. Let us consider unconstrained optimization problems of the form

$$\begin{aligned} \min _{x\in {\mathbb {R}}^n} f(x). \end{aligned}$$
(3)

The choice of a subspace can be motivated by limited-memory quasi-Newton methods [11]. In general, to approximate the Hessian \(\nabla ^2f(x_k)\), quasi-Newton methods for (3) use the secant equation

$$\begin{aligned} B_ks_{k-1}=y_{k-1}, \end{aligned}$$

where \(s_{k-1}=x_k-x_{k-1}\) and \(y_{k-1}=g_k-g_{k-1}\). A typical example is the famous BFGS method [12]:

$$\begin{aligned} B_k=B_{k-1}-\frac{B_{k-1}s_{k-1}s_{k-1}^TB_{k-1}}{s_{k-1}^TB_{k-1}s_{k-1}} +\frac{y_{k-1}y_{k-1}^T}{s_{k-1}^Ty_{k-1}}. \end{aligned}$$

Thus, \(B_k\) is updated from \(B_{k-1}\), \(s_{k-1}\), and \(y_{k-1}\). However, for large scale problems, storing the dense \(n\times n\) matrix \(B_{k-1}\) may exceed the available memory.
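For illustration only (the method below uses the limited-memory variant rather than the full matrix), the BFGS formula translates directly into NumPy; it assumes \(s_{k-1}^Ty_{k-1}>0\), which keeps the update positive definite:

```python
import numpy as np

def bfgs_update(B, s, y):
    """One BFGS update of the Hessian approximation B with the pair (s, y).

    Requires s^T y > 0 so that the updated matrix stays positive definite.
    """
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (s @ y)
```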

As a variant of quasi-Newton methods, limited-memory quasi-Newton methods [11] use only a few of the most recent pairs; given \({\bar{m}}\), they periodically approximate the Hessian using \(B_{k}^{(i-1)}\), \(s_{k-{\bar{m}}-1+i}\), and \(y_{k-{\bar{m}}-1+i}\) for \(i=1,\ldots , {\bar{m}}\), starting from the initial guess \(B_k^{(0)}=\delta _k I\) with \(\delta _k > 0\). Byrd et al. [1] proposed a compact representation of the limited-memory BFGS matrix

$$\begin{aligned} B_k=B_k^{({\bar{m}})}=\delta _kI+W_kD_kW_k^T, \end{aligned}$$

where \(D_k\) is a \(2{\bar{m}}\times 2{\bar{m}}\) matrix and

$$\begin{aligned} W_k=\left[ S_k \quad Y_k \right] =\left[ s_{k-1}, s_{k-2}, \ldots , s_{k-{\bar{m}}}, y_{k-1}, y_{k-2}, \ldots , y_{k-{\bar{m}}}\right] \in {\mathbb {R}}^{n\times 2{\bar{m}}}. \end{aligned}$$

If a line search type method is applied, we have \(s_k=\alpha _kd_k=-\alpha _kB_k^{-1}g_k\), while for trust region algorithms one has \(s_k=-(B_k+\nu _kI)^{-1}g_k\) with \(\nu _k \ge 0\). In either case, it can be shown that \(s_k\) belongs to the subspace

$$\begin{aligned} \text {span} \left\{ g_k, s_{k-1}, \ldots , s_{k-{\bar{m}}}, y_{k-1}, \ldots , y_{k-{\bar{m}}} \right\} . \end{aligned}$$
(4)

Based on the subspace (4), Wang et al. [17] proposed a subspace trust region method for large scale unconstrained optimization problems (3). Their preliminary numerical results are encouraging: in their experiments, the problem dimensions range from 500 to 10,000, while the subspace dimensions remain low, between 6 and 16. In the next section, we propose a new subspace method for the equality constrained optimization problem, using a line search framework.

3 A new subspace method for equality constrained optimization

In this section, we describe a new subspace technique for solving the equality constrained optimization problem (1). Suppose that at the k-th iteration, we have the current iterate point \(x_k\) and the subspace \({\mathcal {T}}_k\). We denote the dimension of \({\mathcal {T}}_k\) by \(i_k\) and let \(Q_k=\left[ q_1^{(k)},q_2^{(k)},\ldots ,q_{i_k}^{(k)}\right] \) be a matrix of linearly independent vectors in \({\mathcal {T}}_k\). Since any vector \(d\in {\mathcal {T}}_k\) can be written as \(d=Q_kz\) for some \(z\in {\mathbb {R}}^{i_k}\), the SQP subproblem with respect to the subspace \({\mathcal {T}}_k\) is given as

$$\begin{aligned}&\min _{z\in {\mathbb {R}}^{i_k}} {\bar{\varphi }}_k(z)=(Q_k^Tg_k)^Tz+\frac{1}{2}z^TQ_k^TB_kQ_kz \end{aligned}$$
(5a)
$$\begin{aligned}&\text {subject to }\, A_k^TQ_kz+c_k=0. \end{aligned}$$
(5b)

However, when \(m > i_k\), the linearized constraints (5b) may be infeasible because the system is over-determined. In this case, to guarantee that the feasible set is nonempty, we consider the following version of (5) with a reduced set of constraints:

$$\begin{aligned}&\min _{z\in {\mathbb {R}}^{i_k}} {\bar{\varphi }}_k(z)=(Q_k^Tg_k)^Tz+\frac{1}{2}z^TQ_k^TB_kQ_kz \end{aligned}$$
(6a)
$$\begin{aligned}&\text {subject to }\, P_k^T(A_k^TQ_kz+c_k)=0, \end{aligned}$$
(6b)

where \(P_k\) is a projection (selection) matrix; a specific choice of \(P_k\) is given in (14) below. The Karush–Kuhn–Tucker (KKT) system of (6) is as follows:

$$\begin{aligned} Q_k^T (B_k Q_k z + A_kP_k \mu + g_k)= & {} 0, \end{aligned}$$
(7)
$$\begin{aligned} P_k^T(A_k^TQ_k z + c_k)= & {} 0, \end{aligned}$$
(8)

where \(\mu \) is the vector of Lagrange multipliers of (6).
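Since (7)–(8) involve only the \(i_k\) subspace coordinates \(z\) and the \(j_k\) selected multipliers \(\mu \), the reduced KKT system is small and can be solved directly. The following NumPy sketch of this step is our own illustration; it assumes the reduced constraint gradients \(P_k^TA_k^TQ_k\) have full row rank so that the system is nonsingular.

```python
import numpy as np

def subspace_sqp_step(g, c, A, B, Q, P):
    """Solve the reduced KKT system (7)-(8) of subproblem (6).

    Q: n x i_k basis of the subspace, P: m x j_k selection of constraints.
    Returns the full-space direction d = Q z and lambda^+ = P mu.
    """
    ik, jk = Q.shape[1], P.shape[1]
    AP = A @ P                                   # n x j_k
    K = np.block([[Q.T @ B @ Q, Q.T @ AP],
                  [AP.T @ Q,    np.zeros((jk, jk))]])
    rhs = -np.concatenate([Q.T @ g, P.T @ c])
    sol = np.linalg.solve(K, rhs)
    z, mu = sol[:ik], sol[ik:]
    return Q @ z, P @ mu                         # d_k and lambda_k^+
```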

Now we present a subspace SQP method for equality constrained optimization. It can be considered as a modification of the subspace method for equality constrained optimization proposed by Yuan [19].

Algorithm 1 (Subspace SQP algorithm)

  Step 1

    Given \(x_0\), set an initial subspace \({\mathcal {T}}_0\), an initial penalty parameter \(\sigma _0> 0\), a tolerance \(\epsilon >0\), and the iteration number \(k{:}{=}0\).

  Step 2

    Solve the subspace subproblem (6):

    $$\begin{aligned}&\min _{z\in {\mathbb {R}}^{i_k}} {\bar{\varphi }}_k(z)=(Q_k^Tg_k)^Tz+\frac{1}{2}z^TQ_k^TB_kQ_kz\\&\text {subject to }\, P_k^T(A_k^TQ_kz+c_k)=0, \end{aligned}$$

    to obtain \(z_k\) and the Lagrange multipliers \(\mu _k\), and set \(d_k=Q_kz_k\) and \(\lambda _k^{+}=P_k\mu _k\). If \(\left\| d_k\right\| \le \epsilon \), then stop.

  Step 3

    Perform a line search to obtain the stepsize \(\alpha _k\) which satisfies

    $$\begin{aligned} \phi (x_k+\alpha _k d_k, \sigma _k) < \phi (x_k, \sigma _k) \end{aligned}$$

    and set

    $$\begin{aligned} x_{k+1}= & {} x_k+\alpha _k d_k\\ \lambda _{k+1}= & {} \lambda _k+\alpha _k (\lambda _k^{+}-\lambda _k) \end{aligned}$$

    where \(\phi (x,\sigma )\) is a penalty function and \(\sigma \) is a penalty parameter.

  Step 4

    Generate \(\sigma _{k+1}\), subspace \({\mathcal {T}}_{k+1}\), and \(\bar{\varphi }_{k+1}(z)\).

  Step 5

    Set \(k{:}{=}k+1\) and go to Step 2.
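To fix ideas, the following self-contained Python sketch runs Algorithm 1 on a small artificial problem of our own. It simplifies several ingredients of the actual implementation in Sect. 5: \(B_k\) is kept equal to the identity instead of being updated by the damped limited-memory BFGS formula, plain gradient differences replace \({\bar{y}}_k\), and \(P_k=I\) since the toy problem has a single constraint.

```python
import numpy as np

# illustrative toy instance (our own, not from the paper): n = 5, m = 1
a = np.array([1.0, 2.0, 0.5, -1.0, 0.3])

def f(x):      return 0.5 * np.sum((x - a) ** 2)
def grad_f(x): return x - a
def c(x):      return np.array([x @ x - 1.0])          # single equality constraint
def jac_c(x):  return (2.0 * x).reshape(-1, 1)         # A_k: constraint gradients as columns

def penalty(x, sigma):                                  # L_inf exact penalty, cf. (12)
    return f(x) + sigma * np.max(np.abs(c(x)))

x = np.full(5, 0.1)
B = np.eye(5)                                           # B_k kept fixed for brevity
s_prev = ybar_prev = None
eps, eta, tau, rho_bar = 1e-6, 1e-3, 0.5, 1e-3

for k in range(50):
    g, ck, A = grad_f(x), c(x), jac_c(x)
    # Step 1/4: build the subspace T_k and an orthonormal basis Q_k
    cols = [g] + ([s_prev, ybar_prev] if s_prev is not None else []) + [A[:, 0]]
    Q, _ = np.linalg.qr(np.column_stack(cols))
    # Step 2: reduced KKT system (7)-(8); here m is small, so P_k = I
    ik, m = Q.shape[1], A.shape[1]
    K = np.block([[Q.T @ B @ Q, Q.T @ A], [A.T @ Q, np.zeros((m, m))]])
    sol = np.linalg.solve(K, -np.concatenate([Q.T @ g, ck]))
    d, mu = Q @ sol[:ik], sol[ik:]
    if np.linalg.norm(d) <= eps:
        break
    # Step 3: backtracking line search on the penalty function
    sigma = np.sum(np.abs(mu)) + rho_bar                # penalty parameter rule (15)
    slope = g @ d - sigma * np.max(np.abs(ck))          # bound on the directional derivative
    alpha = 1.0
    while alpha > 1e-10 and penalty(x + alpha * d, sigma) > penalty(x, sigma) + eta * alpha * slope:
        alpha *= tau
    # plain gradient differences; the paper uses Lagrangian gradients and damping
    s_prev, ybar_prev = alpha * d, grad_f(x + alpha * d) - g
    x = x + alpha * d

print("x* =", x, " c(x*) =", c(x))
```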

The key issues in Algorithm 1 are how to choose the subspace and how to solve the corresponding subspace subproblem quickly. We adopt the damped limited-memory BFGS update formula for equality constrained problems, which guarantees that \(B_k\) is positive definite [13]. With the relation \(H_k=B_k^{-1}\), we apply the following formulae:

$$\begin{aligned} B_0=I, \quad H_0=I. \end{aligned}$$

For \(k \ge 1\),

$$\begin{aligned}&B_k=\delta _k I -\left[ \delta _k S_k\quad {\bar{Y}}_k \right] \begin{bmatrix} \delta _k S_k^T S_k&L_k\\ L_k^T&-D_k \end{bmatrix}^{-1}\begin{bmatrix} \delta _kS_k^T \\ {\bar{Y}}_k^T \end{bmatrix}, \\&H_k=\gamma _k I + \left[ S_k\quad \gamma _k{\bar{Y}}_k \right] \begin{bmatrix} R_k^{-T}(D_k+\gamma _k {\bar{Y}}_k^T{\bar{Y}}_k)R_k^{-1}&-R_k^{-T}\\ -R_k^{-1}&0\end{bmatrix}\begin{bmatrix} S_k^T \\ \gamma _k{\bar{Y}}_k^T \end{bmatrix}, \end{aligned}$$

where

$$\begin{aligned}&S_k = \left\{ \begin{array}{ll} \left[ s_0,\ldots ,s_{k-1}\right] &{} \text { if}\quad k \le {\bar{m}}\\ \left[ s_{k-{\bar{m}}},\ldots ,s_{k-1}\right] &{} \text { otherwise}\\ \end{array} \right. , \\&{\bar{Y}}_k = \left\{ \begin{array}{ll} \left[ {\bar{y}}_0,\ldots ,{\bar{y}}_{k-1}\right] &{} \text { if} \quad k \le {\bar{m}}\\ \left[ {\bar{y}}_{k-{\bar{m}}},\ldots , {\bar{y}}_{k-1}\right] &{} \text { otherwise}\\ \end{array} \right. , \\&D_k = \left\{ \begin{array}{ll} \text {diag}\left( s_0^T{\bar{y}}_0,\ldots ,s_{k-1}^T{\bar{y}}_{k-1}\right) &{} \text { if}\quad k \le {\bar{m}}\\ \text {diag}\left( s_{k-{\bar{m}}}^T{\bar{y}}_{k-{\bar{m}}},\ldots ,s_{k-1}^T{\bar{y}}_{k-1}\right) &{} \text { otherwise}\\ \end{array} \right. , \\&\delta _k=\frac{s_{k-1}^T{\bar{y}}_{k-1}}{\left\| s_{k-1}\right\| _2^2}, \quad \gamma _k = \frac{1}{\delta _k}, \\&(L_k)_{ij} = \left\{ \begin{array}{ll} {\tilde{s}}_{i-1}^T {\tilde{y}}_{j-1} &{} \text { if}\quad i> j \\ 0&{} \text { otherwise}\\ \end{array} \right. ,\quad i,j = \left\{ \begin{array}{ll} 1,2,\ldots ,k &{} \text { if}\quad k \le {\bar{m}}\\ 1,2,\ldots ,{\bar{m}} &{} \text { otherwise}\\ \end{array} \right. , \\&(R_k)_{ij} = \left\{ \begin{array}{ll} {\tilde{s}}_{i-1}^T {\tilde{y}}_{j-1} &{} \text { if}\quad i \le j \\ 0&{} \text { otherwise}\\ \end{array} \right. ,\quad i,j = \left\{ \begin{array}{ll} 1,2,\ldots ,k &{} \text { if}\quad k \le {\bar{m}}\\ 1,2,\ldots ,{\bar{m}} &{} \text { otherwise}\\ \end{array} \right. , \\&{\tilde{s}}_i = \left\{ \begin{array}{ll} s_i &{} \text { if}\quad k \le {\bar{m}}\\ s_{k-{\bar{m}}+i}&{} \text { otherwise}\\ \end{array} \right. ,\quad {\tilde{y}}_i = \left\{ \begin{array}{ll} {\bar{y}}_i &{} \text { if}\quad k \le {\bar{m}}\\ {\bar{y}}_{k-{\bar{m}}+i}&{} \text { otherwise}\\ \end{array} \right. , \\&s_i=x_{i+1}-x_i, \quad y_i=\nabla _x {\mathcal {L}}(x_{i+1},\lambda _{i+1})-\nabla _x \mathcal {L}(x_i,\lambda _i), \\&\theta _i= \left\{ \begin{array}{ll} 1 &{} \text { if}\quad s_i^Ty_i\ge 0.2s_i^TB_is_i\\ (0.8s_i^TB_is_i)/(s_i^TB_is_i-s_i^Ty_i) &{} \text { otherwise}\\ \end{array} \right. , \\&{\bar{y}}_i=\theta _iy_i+(1-\theta _i)B_is_i, \end{aligned}$$

and \(\lambda _i\) are the Lagrange multipliers. The modified differences \({\bar{y}}_i\) of the gradients of the Lagrangian are used to guarantee that \(B_k\) is positive definite. See [13] for details.
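A sketch of the damping step and of the compact form of \(B_k\) above (dense NumPy for clarity; this is our own transcription, not the code used in Sect. 5). The stored pairs are assumed to be kept as the columns of \(S_k\) and \({\bar{Y}}_k\):

```python
import numpy as np

def damped_y(s, y, B):
    """Damping (the theta_i rule above): ybar = theta*y + (1-theta)*B s, so that s^T ybar > 0."""
    Bs = B @ s
    sBs, sy = s @ Bs, s @ y
    theta = 1.0 if sy >= 0.2 * sBs else (0.8 * sBs) / (sBs - sy)
    return theta * y + (1.0 - theta) * Bs

def compact_lbfgs_B(S, Ybar, n):
    """Dense B_k from the compact representation above; S, Ybar hold the stored pairs as columns."""
    delta = (S[:, -1] @ Ybar[:, -1]) / (S[:, -1] @ S[:, -1])
    SY = S.T @ Ybar
    D = np.diag(np.diag(SY))            # D_k = diag(s_i^T ybar_i)
    L = np.tril(SY, -1)                 # L_k = strictly lower triangular part of S^T Ybar
    W = np.hstack([delta * S, Ybar])
    M = np.block([[delta * (S.T @ S), L], [L.T, -D]])
    return delta * np.eye(n) - W @ np.linalg.solve(M, W.T)
```

In an actual implementation \(B_k\) is never formed densely; the compact form is used only to apply \(B_k\) (or \(H_k\)) to vectors, at a cost proportional to \(n{\bar{m}}\) per product.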

There are a variety of methods for solving the subproblem (2) and the corresponding subspace subproblem (6). In our numerical test, we use the range space method for these subproblems since \(B_k^{-1}\) is explicitly given as \(H_k\) through the quasi-Newton update formula [12]. More specifically, the range space method solves the following two systems

$$\begin{aligned} (A_k^TB_k^{-1}A_k) \lambda= & {} -A_k^TB_k^{-1}g_k +c_k, \end{aligned}$$
(9)
$$\begin{aligned} d= & {} -B_k^{-1}(g_k+A_k\lambda ), \end{aligned}$$
(10)

for the solution of the subproblem (2).
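A minimal sketch of this range space solve (our own illustration): `apply_H` stands for the action of \(H_k=B_k^{-1}\) on a vector or matrix, which the quasi-Newton representation provides without forming \(B_k^{-1}\) explicitly.

```python
import numpy as np

def range_space_step(g, c, A, apply_H):
    """Range-space solution of subproblem (2): solve (9) for lambda, then (10) for d."""
    HA = apply_H(A)                                 # B^{-1} A, n x m
    Hg = apply_H(g)                                 # B^{-1} g
    lam = np.linalg.solve(A.T @ HA, -A.T @ Hg + c)  # system (9)
    d = -apply_H(g + A @ lam)                       # step (10)
    return d, lam

# usage with an explicit inverse Hessian approximation H (illustrative)
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 6, 2
    H = np.eye(n)
    A = rng.standard_normal((n, m))
    g = rng.standard_normal(n)
    c = rng.standard_normal(m)
    d, lam = range_space_step(g, c, A, lambda v: H @ v)
    print(np.linalg.norm(A.T @ d + c))              # ~ 0: linearized constraints satisfied
```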

In Step 3, we use a backtracking line search, which starts with \(\alpha =1\). The stepsize \(\alpha _k=\alpha \) is accepted if

$$\begin{aligned} \phi (x_k+\alpha _kd_k, \sigma _k)\le \phi (x_k, \sigma _k)+\eta \alpha _kD_{d_k}\phi (x_k, \sigma _k), \end{aligned}$$

where \(\eta \in (0, \frac{1}{2})\) and \(D_{d}\phi (x, \sigma )\) is the directional derivative of \(\phi \) at \(x\) in the direction \(d\). Otherwise, we set \(\alpha \leftarrow \tau \alpha \) with \(\tau \in (0,1)\) and repeat the test. More details of our numerical experiments are given in Sect. 5.
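A sketch of this backtracking procedure (our own illustration, with the same default parameter values as in Sect. 5 and a tiny-stepsize safeguard): `phi` is the penalty function with \(\sigma _k\) fixed, and `dphi` is the directional derivative \(D_{d_k}\phi (x_k,\sigma _k)\).

```python
import numpy as np

def backtracking(phi, x, d, dphi, eta=1e-3, tau=0.5, alpha_min=1e-6):
    """Armijo backtracking on the penalty function phi along direction d.

    dphi is (an upper bound on) the directional derivative D_d phi(x, sigma) < 0.
    """
    alpha = 1.0
    phi0 = phi(x)
    while phi(x + alpha * d) > phi0 + eta * alpha * dphi:
        alpha *= tau
        if alpha < alpha_min:        # mirrors the tiny-stepsize safeguard of Sect. 5
            return None
    return alpha

# example: simple quadratic merit function along a descent direction
if __name__ == "__main__":
    phi = lambda x: float(x @ x)
    x, d = np.array([1.0, 1.0]), np.array([-1.0, -1.0])
    print(backtracking(phi, x, d, dphi=2 * (x @ d)))
```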

4 Convergence analysis

In this section, we analyze convergence properties of the proposed method under the following assumptions:

Assumption 1

The starting point and all succeeding iterates lie in some closed, bounded and convex region \({\mathcal {C}}\) in \({\mathbb {R}}^n\).

Assumption 2

The columns of \(A(x)=\begin{bmatrix}\nabla c_1(x), \ldots , \nabla c_{m}(x) \end{bmatrix}\) are linearly independent for all \(x\in {\mathcal {C}}\).

Assumption 3

For all \(d\in {\mathbb {R}}^n\), there are positive constants \(\beta _1\) and \(\beta _2\) such that

$$\begin{aligned} \beta _1 \Vert d\Vert _2^2 \le d^TB_kd \le \beta _2 \Vert d\Vert _2^2 \end{aligned}$$

for all k.

These assumptions are commonly used to prove global convergence of algorithms for constrained optimization problems [2].

We consider two cases depending on the number of constraints m relative to the number of variables n. For simplicity, we assume \(k \ge {\bar{m}}\).

4.1 Case I: Few constraints

First, we consider the case of relatively few constraints in the problem (1). Since \(m \ll n\), the subproblem (5) is used at each iteration. The following theorem shows that subproblem (2) is equivalent to subproblem (5). Its proof is an adaptation of Lemma 2.2 in [18].

Theorem 1

Suppose \(B_k\) is the k-th updated matrix by the damped limited-memory BFGS update formula. Let \(d_k\) be a solution of problem (2), and \(s_k=\alpha _k d_k\) with stepsize \(\alpha _k>0\). Let

$$\begin{aligned} {{\mathcal {T}}^F_k}=\text {span} \left\{ g_k, s_{k-{\bar{m}}}, \ldots , s_{k-1}, {\bar{y}}_{k-{\bar{m}}}, \ldots , {\bar{y}}_{k-1}, \nabla c_1(x_k), \ldots , \nabla c_{m}(x_k)\right\} . \end{aligned}$$

If \(Q_k\) is an orthonormal basis matrix of \({\mathcal {T}}^F_k\), then subproblem (2) is equivalent to (5).

Proof

We denote the dimension of the subspace \({\mathcal {T}}^F_k\) by r. Let \(U_k\in {\mathbb {R}}^{n\times (n-r)}\) be a matrix such that \(\left[ Q_k, \,\, U_k\right] \) is an \(n\times n\) orthogonal matrix. Then, for each \(d\in {\mathbb {R}}^n\), there exists one and only one pair \(z\in {\mathbb {R}}^r\), \(u\in {\mathbb {R}}^{n-r}\) such that \(d=Q_kz+U_ku\). Thus, it follows that

$$\begin{aligned} \varphi _k(d)= & {} g_k^T d + \frac{1}{2} d^T B_k d\nonumber \\= & {} g_k^T (Q_k z + U_k u) + \frac{1}{2} (Q_k z + U_k u)^T B_k (Q_k z + U_k u)\nonumber \\= & {} (Q_k^T g_k)^T z + (U_k^T g_k)^T u+\frac{1}{2} z^TQ_k^T B_k Q_k z +u^TU_k^T B_k Q_k z\nonumber \\&+\frac{1}{2} u^TU_k^T B_k U_k u. \end{aligned}$$
(11)

By the choice of \({\mathcal {T}}^F_k\), we have \(S_k^TU_k=0\), \({\bar{Y}}_k^TU_k=0\), and \(g_k^TU_k=0\). Hence, it follows that

$$\begin{aligned} B_kU_k=\delta _kU_k-\left[ \delta _k S_k\quad {\bar{Y}}_k \right] \begin{bmatrix} \delta _k S_k^T S_k&L_k\\ L_k^T&-D_k \end{bmatrix}^{-1}\begin{bmatrix} \delta _kS_k^TU_k \\ {\bar{Y}}_k^TU_k \end{bmatrix}=\delta _kU_k, \end{aligned}$$

and \(U_k^T B_k Q_k=\delta _k U_k^T Q_k =0\). By (11), we have

$$\begin{aligned} \varphi _k(d)=(Q_k^T g_k)^T z + \frac{1}{2} z^TQ_k^T B_k Q_k z+\frac{1}{2}\delta _k \left\| u\right\| _2^2. \end{aligned}$$

Since \(A_k^TU_k=0\), we have

$$\begin{aligned} 0=A_k^T d + c_k=A_k^T(Q_kz+U_k u)+c_k=A_k^T Q_k z+c_k. \end{aligned}$$

The above relations indicate that (2) is equivalent to

$$\begin{aligned}&\min _{z, \, u} \quad \quad \quad (Q_k^Tg_k)^Tz+\frac{1}{2}z^TQ_k^TB_kQ_kz+\frac{1}{2}\delta _k \left\| u \right\| _2^2\\&\text {subject to }\, A_k^TQ_kz+c_k=0. \end{aligned}$$

Since \(\delta _k>0\), the above problem attains its minimum with respect to u at \(u=0\), and hence it is equivalent to (5). Therefore (2) is equivalent to (5) with \(d_k=Q_kz_k\).    \(\square \)

Theorem 1 ensures that, in this case, the proposed method generates the same iterates as the standard SQP method. Thus, we can obtain the global convergence of the proposed method from Theorem 4.3 of Boggs and Tolle [2]. For completeness, we restate this theorem for the \(L_1\) exact penalty function [10]

$$\begin{aligned} \phi _1(x,\sigma )=f(x)+\sigma \left\| c(x)\right\| _1. \end{aligned}$$

Theorem 2

(Boggs and Tolle [2]) Assume that \(\sigma \) is chosen by

$$\begin{aligned} \sigma = \left\| \lambda _k^{+}\right\| _{\infty } +\bar{\rho } \end{aligned}$$

for some constant \(\bar{\rho } > 0\). Then the proposed algorithm started at any point \(x_0\) with stepsize \(\alpha \ge \bar{\alpha } > 0\) chosen to satisfy

$$\begin{aligned} \phi _1(x_k+\alpha d_k, \sigma _k)\le \phi _1(x_k, \sigma _k)+\eta \alpha D_{d_k}\phi _1(x_k, \sigma _k) \end{aligned}$$

converges to a KKT point of (1).

4.2 Case II: Many constraints

Now, we consider the case of relatively many constraints, i.e. \(m \approx n\) in (1). In this case, problem (6) is solved at each iteration. Since we reduce the number of constraints in the subproblem, we cannot expect the proposed method to generate the same iterates as the standard SQP method. Nevertheless, we can show that the search direction is a descent direction for the \(L_{\infty }\) exact penalty function [10]

$$\begin{aligned} \phi _{\infty }(x,\sigma )=f(x)+\sigma \left\| c(x)\right\| _{\infty }. \end{aligned}$$
(12)

Using this property, the global convergence of the proposed method is obtained.

First of all, we state the following result of Yuan [20]. It is used to show that the solution of the subproblem (6) is a descent direction of \(\phi _{\infty }(x,\sigma )\).

Lemma 1

(Yuan [20]) Suppose we sort the violations in decreasing order:

$$\begin{aligned} \left| c_{l_1}(x_k)\right| \ge \left| c_{l_2}(x_k)\right| \ge \cdots \ge \left| c_{l_{m}}(x_k)\right| , \end{aligned}$$
(13)

and suppose that \(j_k\) satisfies \(|c_{l_{j_k+1}}(x_k)| < \Vert c_k\Vert _{\infty }\). If the search direction \(d_k\) satisfies

$$\begin{aligned} P_k^T(c_k+A_k^Td_k)=0, \end{aligned}$$

where

$$\begin{aligned} P_k=\left[ e_{l_1},e_{l_2},\ldots ,e_{l_{j_k}}\right] \in {\mathbb {R}}^{m \times j_k}, \end{aligned}$$
(14)

then \(d_k\) is a descent direction of \(\left\| c(x)\right\| _{\infty }\) at \(x_k\).

Using Lemma 1, we derive the following theorem for a descent property of the search direction.

Theorem 3

Suppose \(\{x_k\}\), \(\{d_k\}\), \(\{\sigma _k\}\), \(\{c_k\}\), \(\{Q_k\}\), \(\{P_k\}\), and \(\{B_k\}\) are sequences generated by the proposed subspace SQP algorithm under Assumptions 2 and 3. If \(|c_{l_{j_k+1}}(x_k)| < \left\| c_k\right\| _{\infty }\), \(P_k\) is given by (14), and \(\sigma _k\) is chosen sufficiently large, then the search direction \(d_k\) is a descent direction for \(\phi _{\infty }(x,\sigma )\); that is, the directional derivative at \(x_k\) in the direction \(d_k\) satisfies

$$\begin{aligned} D_{d_k}\phi _{\infty }(x_k,\sigma _k)<0. \end{aligned}$$

Proof

Using Lemma 1, we obtain that

$$\begin{aligned} D_{d_k}\phi _{\infty }(x_k,\sigma _k)= & {} g_k ^Td_k-\sigma _k \Vert c_k\Vert _{\infty }\\= & {} g_k^T Q_k z_k - \sigma _k \Vert c_k\Vert _{\infty }\\= & {} -z_k^T Q_k^T B_k Q_k z_k - \mu _k^T P_k^T A_k^T Q_k z_k -\sigma _k \Vert c_k\Vert _{\infty }\\= & {} -z_k^T Q_k^T B_k Q_k z_k + \mu _k^T P_k^T c_k -\sigma _k \Vert c_k\Vert _{\infty }\\\le & {} -z_k^T Q_k^T B_k Q_k z_k -(\sigma _k - \Vert \mu _k\Vert _1)\Vert c_k\Vert _{\infty }, \end{aligned}$$

where the third equality follows from (7), the fourth equality from (8), and the last inequality from Hölder’s inequality. Assumption 3 implies that \(-z_k^T Q_k^T B_k Q_k z_k\) is negative whenever \(z_k\ne 0\), which holds as long as the algorithm has not terminated. Thus, if \(\sigma _k\) is sufficiently large, that is, \(\sigma _k > \Vert \mu _k\Vert _1\), then \(D_{d_k}\phi _{\infty }(x_k,\sigma _k)<0\). Hence the search direction \(d_k\) is a descent direction for \(\phi _{\infty }(x,\sigma )\). \(\square \)

Now we establish the global convergence of our algorithm by applying the idea of Powell [14].

Theorem 4

Suppose \(\{x_k\}\), \(\{d_k\}\), \(\{\sigma _k\}\), \(\{c_k\}\), \(\{Q_k\}\), \(\{P_k\}\), and \(\{B_k\}\) are sequences generated by the proposed subspace SQP algorithm under Assumptions 1–3. If \(|c_{l_{j_k+1}}(x_k)| < \left\| c_k\right\| _{\infty }\), \(P_k\) is given by (14), \(\sigma _k\) is chosen as

$$\begin{aligned} \sigma _k = \left\| \mu _k\right\| _1 +\bar{\rho } \end{aligned}$$
(15)

for some constant \(\bar{\rho } > 0\), \(Q_k\) is an orthonormal basis matrix of \({\mathcal {T}}^M_k\) where

$$\begin{aligned} {{\mathcal {T}}^M_k}=\text {span} \left\{ g_k, s_{k-{\bar{m}}}, \ldots , s_{k-1}, {\bar{y}}_{k-{\bar{m}}}, \ldots , {\bar{y}}_{k-1}, \nabla c_{l_1}(x_k), \ldots , \nabla c_{l_{j_k}}(x_k)\right\} , \end{aligned}$$

and the stepsize \(\alpha _k > 0\) satisfies

$$\begin{aligned} \phi _{\infty }(x_k+\alpha _k d_k, \sigma _k)\le \phi _{\infty }(x_k, \sigma _k)+\eta \alpha _k D_{d_k}\phi _{\infty }(x_k, \sigma _k) \end{aligned}$$
(16)

for \(\eta \in (0, \frac{1}{2})\), then any limit point of \(\{x_k\}\) is a KKT point of (1).

Proof

First, we show that if \(d_k=0\), then \(x_k\) is a KKT point of the original problem (1). Using \(d_k=Q_kz_k\) and \(\lambda _k^{+}=P_k\mu _k\), we can rewrite Eqs. (7) and (8) as follows:

$$\begin{aligned} Q_k^T(B_k d_k +A_k\lambda _k^{+}+g_k)= & {} 0, \end{aligned}$$
(17)
$$\begin{aligned} P_k^T(A_k^Td_k+c_k)= & {} 0. \end{aligned}$$
(18)

With a matrix \(U_k\) which makes \(\left[ Q_k, \,\, U_k\right] \) an \(n\times n\) orthogonal matrix, we have the following relations:

$$\begin{aligned}&B_kU_k=\delta _kU_k,\,\, U_k^TA_kP_k=0,\,\, U_k^TB_kQ_k=\delta _kU_k^TQ_k=0,\,\, U_k^Tg_k=0, \nonumber \\&U_k^T(B_k d_k +A_k\lambda _k^{+}+g_k)=\delta _k U_k^T Q_k z_k+U_k^TA_kP_k\mu _k+U_k^Tg_k=0. \end{aligned}$$
(19)

Using (17) and (19), we have

$$\begin{aligned} B_k d_k +A_k\lambda _k^{+}+g_k= & {} 0, \nonumber \\ P_k^T(A_k^Td_k+c_k)= & {} 0. \end{aligned}$$
(20)

If \(d_k=0\), then

$$\begin{aligned} A_k\lambda _k^{+}+g_k= & {} 0, \\ P_k^Tc_k= & {} 0. \end{aligned}$$

From the ordering (13), \(P_k^Tc_k=0\) implies \(c_k=0\), since the components selected by \(P_k\) are those with the largest magnitude. Thus, \(x_k\) is a KKT point of the original problem (1) with the Lagrange multiplier \(\lambda _k^{+}\).

Now suppose \(d_k \ne 0\) for all k. We define the following two functions \({\bar{L}}_k\) and \(\bar{\phi }_k\) of d:

$$\begin{aligned} {\bar{L}}_k(x_k+d)= & {} f_k+g_k^T(Q_kz)+\frac{1}{2}(Q_kz)^TB_k(Q_kz)+\mu _k^TP_k^T(A_k^TQ_kz+c_k)\nonumber \\= & {} f_k+g_k^Td+\frac{1}{2}d^TB_kd+{\lambda _k^+}^T(A_k^Td+c_k),\nonumber \\ \bar{\phi }_k(x_k+d)= & {} f_k+g_k^T(Q_kz)+\frac{1}{2}(Q_kz)^TB_k(Q_kz)+\sigma _k \left\| P_k^T(A_k^TQ_kz+c_k)\right\| _{\infty } \nonumber \\= & {} f_k+g_k^Td+\frac{1}{2}d^TB_kd+\sigma _k \left\| P_k^T(A_k^Td+c_k)\right\| _{\infty }, \end{aligned}$$
(21)

with \(d=Q_kz\), \(\lambda _k^{+}=P_k\mu _k\). These functions are convex and achieve their minimum values at \(d=d_k\). Because of (20), we have

$$\begin{aligned} {\bar{L}}_k(x_k)={\bar{L}}_k(x_k+d_k)+\frac{1}{2}d_k^TB_kd_k\ge {\bar{L}}_k(x_k+d_k). \end{aligned}$$
(22)

The difference between these two functions is bounded from below as follows:

$$\begin{aligned} \bar{\phi }_k(x_k+d)-{\bar{L}}_k(x_k+d)= & {} \sigma _k\left\| P_k^T(A_k^Td+c_k)\right\| _{\infty }-\mu _k^TP_k^T(A_k^Td+c_k) \nonumber \\\ge & {} (\sigma _k-\Vert \mu _k\Vert _1)\left\| P_k^T(A_k^Td+c_k)\right\| _{\infty } \nonumber \\\ge & {} \bar{\rho }\left\| P_k^T(A_k^Td+c_k)\right\| _{\infty }. \end{aligned}$$
(23)

If any of the linearized constraints in (6b) is violated by d, then we have

$$\begin{aligned} \bar{\phi }_k(x_k+d) > {\bar{L}}_k(x_k+d). \end{aligned}$$

In this case, if \(x_k\) is infeasible with respect to the given nonlinear constraints (1b), we have the bound

$$\begin{aligned} \bar{\phi }_k(x_k) > {\bar{L}}_k(x_k)\ge {\bar{L}}_k(x_k+d_k)=\bar{\phi }_k(x_k+d_k). \end{aligned}$$
(24)

If, on the other hand, \(x_k\) is feasible, we have the condition

$$\begin{aligned} \bar{\phi }_k(x_k) = {\bar{L}}_k(x_k)> {\bar{L}}_k(x_k+d_k)=\bar{\phi }_k(x_k+d_k), \end{aligned}$$
(25)

where the strict inequality follows from the fact that the term \(d_k^TB_kd_k\) in (22) is positive. Hence the quantity

$$\begin{aligned} t_k=\bar{\phi }_k(x_k)-\bar{\phi }_k(x_k+d_k) \end{aligned}$$

is positive. In addition, we have

$$\begin{aligned} \Vert c(x_k+\alpha d_k)\Vert _{\infty }= & {} \Vert c_k\Vert _{\infty }+\alpha D_{d_k}\Vert c(x_k)\Vert _{\infty }+o(\alpha )\\= & {} \Vert c_k\Vert _{\infty }-\alpha \Vert c_k\Vert _{\infty }+o(\alpha ) \end{aligned}$$

and

$$\begin{aligned} \bar{\phi }_k(x_k+\alpha d_k) = f_k+\alpha g_k^Td_k+\frac{1}{2}\alpha ^2 d_k^TB_kd_k+\sigma _k (1-\alpha )\left\| c_k\right\| _{\infty }. \end{aligned}$$

Owing to the above two relations and the convexity of the function (21), we have

$$\begin{aligned} \phi _\infty (x_k,\sigma )-\phi _{\infty }(x_k+\alpha d_k,\sigma )= & {} \bar{\phi }_k(x_k)-\bar{\phi }_k(x_k+\alpha d_k)+o(\alpha )\nonumber \\\ge & {} \alpha [\bar{\phi }_k(x_k)-\bar{\phi }_k(x_k+d_k)]+o(\alpha )\nonumber \\= & {} \alpha t_k +o(\alpha ),\,\,\, 0\le \alpha \le 1. \end{aligned}$$
(26)

Therefore the reduction of the penalty function

$$\begin{aligned} \phi _{\infty }(x_k+\alpha _k d_k, \sigma _k) < \phi _{\infty }(x_k, \sigma _k) \end{aligned}$$

can be achieved by a sufficiently small and positive \(\alpha _k\). From the choice of \(\alpha _k\) satisfying (16), we can replace the above condition on the stepsize by the inequality

$$\begin{aligned} \phi _{\infty }(x_{k+1},\sigma )\le \phi _{\infty }(x_k,\sigma )-\eta \alpha _k t_k, \end{aligned}$$
(27)

since \(D_{d_k}\phi _{\infty }(x_k,\sigma _k)\le \bar{\phi }_k(x_k+d_k)-\bar{\phi }_k(x_k)=-t_k<0\).

Let \(\xi \) be a small positive constant, and consider the iterations with \(t_k > \xi \). Because continuity of the first order derivatives on a compact domain implies uniform continuity, and the vectors \(d_k\) in (26) are bounded, there is a positive constant \(\beta (\xi )\) such that the inequality (27) holds for any \(\alpha _k \in [0,\beta (\xi )]\). Thus, by the line search process, we deduce that

$$\begin{aligned} \left[ \phi _\infty (x_k,\sigma )-\phi _\infty (x_{k+1},\sigma )\right] \ge \eta \xi \bar{\alpha }, \end{aligned}$$

where \(\bar{\alpha }\in (0,\beta (\xi )]\). However, the reduction in the line search tends to zero, because \(\{\phi _{\infty }(x_k,\sigma ):k=1,2,3,\ldots \}\) is a monotonically decreasing sequence which is bounded below. Therefore \(t_k\) tends to zero, that is,

$$\begin{aligned} \lim _{k \rightarrow \infty }[\bar{\phi }_k(x_k)-\bar{\phi }_k(x_k+d_k)]=0. \end{aligned}$$
(28)

Because either (24) or (25) is satisfied for each k, the limit (28) gives

$$\begin{aligned} \lim _{k \rightarrow \infty }[\bar{\phi }_k(x_k)-{\bar{L}}_k(x_k)]=0 \end{aligned}$$
(29)

and

$$\begin{aligned} \lim _{k \rightarrow \infty }[{\bar{L}}_k(x_k)-{\bar{L}}_k(x_k+d_k)]=0. \end{aligned}$$
(30)

By letting \(d=0\) in the estimate (23) and using the condition (15), we deduce from (29) that \(\Vert P_k^Tc_k\Vert _{\infty }=\Vert c_k\Vert _{\infty }\rightarrow 0\); hence all limit points of the sequence \(\{x_k:k=1,2,3,\ldots \}\) satisfy the nonlinear constraints (1b).

Without loss of generality, we assume that \(\{x_k:k=1,2,3,\ldots \}\) has only one limit point, say \(x_*\). By the above arguments, \(c_j(x_*)=0\) for all \(j=1,\dots ,m\). From (20), we have that

$$\begin{aligned} \lim _{k \rightarrow \infty } \left\| B_k d_k+A_k\lambda _k^{+}+g_k\right\| =0. \end{aligned}$$
(31)

Since (22) and (30) imply \(B_k^{1/2}d_k\rightarrow 0\), and the matrices \(\{B_k : k=1,2,3,\ldots \}\) are uniformly bounded, the limit (31) still holds after removing the term \(B_k d_k\). Because the Lagrange multipliers are bounded, by passing to a subsequence if necessary we deduce that \(A_*\lambda _*^{+}+g_*=0\), where \(\lambda _*^{+}\) is a cluster point of \(\{\lambda _k^{+}\}\), \(\lim _{k \rightarrow \infty }A_k=A_*\), and \(\lim _{k \rightarrow \infty }g_k=g_*\). Hence \(x_*\) is a KKT point of (1). \(\square \)

5 Numerical results

In this section, we describe the implementation of the proposed algorithm SubSQP and report numerical results on a selection of CUTEst problems [8]. To show the effectiveness of SubSQP on large scale nonlinear equality constrained problems, we test it on 11 CUTEst problems. In most of these problems, the number of variables is at least 5000, the number of constraints exceeds half of the number of variables, and all constraints are nonlinear equalities. The experiment also includes 5 problems with few constraints. To show the soundness and efficiency of SubSQP, we compare it with the standard SQP method. All tests are performed on an Intel Core i5 3.50 GHz CPU with 8 GB memory, running Ubuntu 16.04 LTS and MATLAB R2016b.

For numerical tests, parameters are chosen as \(\eta = 10^{-3}\), \(\tau =0.5\), \(\sigma _k = \Vert \mu _k\Vert _1+\bar{\rho }\), \(\bar{\rho }=10^{-3}\), \(\epsilon = 10^{-3}\) and the maximum number of iterations is 100. Since we are mostly interested in large scale problems, we use the \(L_{\infty }\) exact penalty function

$$\begin{aligned} \phi _{\infty }(x,\sigma )=f(x)+\sigma \left\| c(x)\right\| _{\infty }, \end{aligned}$$

to measure progress towards a solution.

In order to prevent a rapid increase of the subspace dimension in the case of many constraints, we limit the number of constraint gradients \(\nabla c_{l_j}(x_k)\) in the subspace to \({\bar{m}}\) for all k, unless specified otherwise. To decrease the constraint violations effectively, we select the constraints with the \({\bar{m}}\) largest violations in magnitude after sorting:

$$\begin{aligned} \left| c_{l_1}(x_k)\right| \ge \left| c_{l_2}(x_k)\right| \ge \cdots \ge \left| c_{l_{m}}(x_k)\right| . \end{aligned}$$

By including the gradients of those constraints in the subspace \({\mathcal {T}}_k^M\), the step \(s_k=\alpha _k d_k\) generated by the proposed method moves in a direction that reduces the most severe constraint violations. Accordingly, we choose \(P_k\) in (14) as follows:

$$\begin{aligned} P_k=\left[ e_{l_1},e_{l_2},\ldots ,e_{l_{{\bar{m}}}}\right] \in {\mathbb {R}}^{m \times {\bar{m}}}, \end{aligned}$$

and set the subspace \({\mathcal {T}}_k^M\):

$$\begin{aligned} {\mathcal {T}}_k^M=\text {span} \left\{ g_k, s_{k-{\bar{m}}}, \ldots , s_{k-1}, {\bar{y}}_{k-{\bar{m}}}, \ldots , {\bar{y}}_{k-1}, \nabla c_{l_1}(x_k), \ldots , \nabla c_{l_{{\bar{m}}}}(x_k)\right\} . \end{aligned}$$

In the case of few constraints, all m constraint gradients are used to construct the subspace \({\mathcal {T}}_k^F\).
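As an illustration of this construction (our own sketch; the actual implementation uses the reorthogonalization procedure of [4] mentioned below, here replaced by a plain QR factorization), assuming the stored pairs are kept as the columns of \(S\) and \({\bar{Y}}\):

```python
import numpy as np

def build_subspace(g, c_val, A, S, Ybar, m_bar):
    """Build P_k and an orthonormal basis Q_k of T_k^M for the many-constraints case.

    A has the constraint gradients as columns; S, Ybar hold the stored pairs.
    """
    m = A.shape[1]
    order = np.argsort(-np.abs(c_val))            # indices l_1, l_2, ... by decreasing |c_j(x_k)|
    sel = order[:min(m_bar, m)]                   # keep the m_bar most violated constraints
    P = np.eye(m)[:, sel]                         # P_k = [e_{l_1}, ..., e_{l_{m_bar}}]
    cols = np.column_stack([g, S, Ybar, A[:, sel]])
    # plain QR for brevity; nearly dependent columns would be dropped in practice
    Q, _ = np.linalg.qr(cols)
    return Q, P
```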

To compute the orthonormal basis matrix \(Q_k\), the reorthogonalization procedure in [4] is applied. If \(A_k\) is nearly rank deficient, SubSQP may discard some constraint gradients during the orthogonalization process. We use the conjugate gradient method to solve (9), with the relative residual norm and a tolerance of \(10^{-4}\) as its stopping condition.
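The coefficient matrix \(A_k^TH_kA_k\) of (9) need not be formed explicitly; the conjugate gradient method only requires products with it. The following sketch of CG with the relative-residual stopping rule is our own transcription in Python (the experiments themselves are in MATLAB), with `matvec` standing for the map \(\lambda \mapsto A_k^TH_k(A_k\lambda )\).

```python
import numpy as np

def cg(matvec, b, tol=1e-4, max_iter=500):
    """Conjugate gradient for S x = b, stopping when ||b - S x|| <= tol * ||b||."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rr = r @ r
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        if np.sqrt(rr) <= tol * b_norm:
            break
        Sp = matvec(p)
        alpha = rr / (p @ Sp)
        x += alpha * p
        r -= alpha * Sp
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x

# solving (9): the matvec applies lambda -> A^T H (A lambda) without forming the matrix
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, m = 50, 10
    A = rng.standard_normal((n, m))
    H = np.eye(n)
    g = rng.standard_normal(n)
    c = rng.standard_normal(m)
    rhs = -A.T @ (H @ g) + c
    lam = cg(lambda v: A.T @ (H @ (A @ v)), rhs)
    print(np.linalg.norm(A.T @ (H @ (A @ lam)) - rhs))   # small residual
```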

Table 1 Numerical results on CUTEst problems

To improve the performance of the proposed method, we skip the update \((s_k, {\bar{y}}_k)\) if

$$\begin{aligned} 10^{-8}{\bar{y}}_k^T{\bar{y}}_k\ge {\bar{y}}_k^Ts_k, \end{aligned}$$

and reset the subspace \({\mathcal {T}}_k\) if

$$\begin{aligned} |f_{k}-f_{k+1}|\le 10^{-4}|f_{k}| \end{aligned}$$

and

$$\begin{aligned} | \Vert c_{k}\Vert _{\infty }-\Vert c_{k+1}\Vert _{\infty } |\le 10^{-4}\Vert c_{k}\Vert _{\infty }. \end{aligned}$$

The algorithm is stopped when

$$\begin{aligned} \Vert g_k+A_k\lambda _k\Vert _{\infty }\le \epsilon \ \text { and } \ \Vert c_k\Vert _{\infty }\le \epsilon \quad \text {or} \quad \Vert d_k\Vert _2\le \epsilon . \end{aligned}$$

We also terminate each algorithm when it generates a tiny stepsize \(\alpha _k < 10^{-6}\).

The numerical results on the CUTEst test problems are reported in Tables 1 and 2. In Table 1, n denotes the number of variables, m the number of constraints, and \({\bar{m}}\) the parameter used for constructing the subspace. Since SQP and SubSQP may return different solutions due to the use of a subspace in SubSQP, we report the number of iterations, the CPU time, the final objective value \(f_*\), and the infinity norm of the final constraint violation \(\Vert c_*\Vert _{\infty }\) in Tables 1 and 2.

Table 2 Function values and constraint violations on CUTEst problems

In the case of many constraints, the computational cost per iteration of SubSQP is lower than that of SQP, especially on large scale problems, since SubSQP works in the proposed low-dimensional subspace. However, SubSQP may require more iterations, depending on the choice of subspace. Overall, SubSQP outperforms SQP in terms of the final objective value, as shown in Table 2. In particular, for the problems DTOC2, LUKVLE16, and LUKVLE2, SubSQP is faster than SQP and attains a smaller constraint violation. In the case of few constraints, the results of SubSQP are similar to those of SQP; since SubSQP requires an additional orthogonalization process, it is slower than SQP.

6 Conclusion

In this paper, we present a subspace SQP method to solve large scale nonlinear equality constrained optimization problems. The subspace technique is a promising approach for solving large scale optimization problems [21], since it can reduce both computational cost and memory requirements. To apply the technique efficiently to equality constrained problems, we propose a subspace spanned by the gradient of the objective function, the previous steps, the modified differences of the gradients of the Lagrangian, and the gradients of the constraints with the largest violations, based on the damped limited-memory BFGS update.

With this subspace, we prove that our method has a global convergence property. Numerical results on a selection of CUTEst problems show that the proposed algorithm performs better on large scale problems than the standard SQP method. Future work concerns different choices of the subspace, together with the accompanying analysis and implementation.