1 Introduction

The bound constrained problem plays an important role in practical optimization. Many practical problems can be converted into this form. In addition, it often arises as a subproblem of augmented Lagrangian or penalty schemes for general constrained optimization [1]. In recent work on complementarity problems and variational inequalities, these problems are efficiently reduced to bound constrained minimization problems [2]. Thus, the development of numerical algorithms that solve it efficiently, especially when the dimension is large, is important in both theory and applications. The bound constrained optimization problem has received much attention in recent decades. We refer to the papers [1, 3] for a review of recent advances in this area.

Active set methods are widely used for solving the bound constrained optimization problem. Early active set methods [4] are quite efficient for small dimensional problems, but are unattractive for large-scale problems [5]. The main reason is that, typically, at each step of the algorithm at most one constraint can be added to or dropped from the active set. In the worst case, each of the \(3^n\) possible active sets may be visited before the optimal one is discovered [1]. Recently, there has been growing interest in the design of active set methods that are capable of making rapid changes to incorrect predictions [1, 6–11].

In [7], Facchinei, Júdice, and Soares proposed an efficient identification technique of the active set at the solution and developed an active set Newton’s algorithm for large-scale nonlinear programming with box constraints. The reported numerical results in [7] showed that this identification technique works well. Recently, based on the identification technique, Xiao and Hu [12] proposed an active set subspace Barzilai–Borwein gradient method (SBB). Preliminary numerical results show that the identification technique works well and the SBB method is competitive with the well-known SPG2 method [13].

Conjugate gradient methods are very efficient for solving large-scale unconstrained optimization problems due to their simplicity and low storage requirements [14]. However, studies of conjugate gradient methods for nonquadratic bound constrained optimization are relatively rare. In this paper, by using the active set identification technique in [7] and the recently developed modified Polak–Ribière–Polyak (PRP) method [15], we propose an active set modified PRP method for the bound constrained optimization problem. Specifically, at each iteration the active variables and free variables are identified by the technique of [7]; we use the method in [7] to update the active variables and the modified PRP method to update the free variables. The use of the modified PRP method keeps the storage requirements low, so the method can be applied to large-scale problems. In addition, the method possesses the following advantages: all iterates are feasible and rapid changes in the active set are allowed. Under appropriate conditions, we show that the proposed method is globally convergent. Numerical experiments on the bound constrained problems from the CUTEr [16] collection with more than 500 variables show that the proposed method is competitive with some existing methods such as the SPG2 method [13] and the ASA method [3].

The remainder of the paper is organized as follows. We propose the algorithm in Sect. 2. In Sect. 3, we show that the proposed algorithm is globally convergent. In Sect. 4, we test the performance of the proposed algorithm and compare it with the SPG2 method and the ASA method.

2 Motivation and Properties

Throughout the paper, \(\|\cdot\|\) denotes the Euclidean norm of vectors. Let \(P_{\varOmega}(x)\) denote the projection of \(x\) onto the set \(\varOmega\). If \(w=(w_1,w_2,\ldots,w_n)^T\) is an \(n\)-dimensional vector and \(I\subset\{1,2,\ldots,n\}\) is an index set, we denote by \(w_I\) the subvector with components \(w_i\), \(i\in I\), and write \(w_I\geq 0\) to mean that \(w_i\geq 0\) for all \(i\in I\).
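For the box-shaped feasible sets considered in this paper, \(P_{\varOmega}\) has a simple closed form: it clips each component of \(x\) to its bounds. A minimal sketch in Python (the function name is ours):

```python
import numpy as np

def project_box(x, l, u):
    """Projection P_Omega onto the box {x : l <= x <= u} (componentwise clip)."""
    return np.minimum(np.maximum(x, l), u)
```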

Consider the solution of the bound constrained nonlinear programming problem

$$ \min f(x), \quad x\in \varOmega:=\bigl\{x\in {\mathcal{R}}^n: l\leq x\leq u\bigr\}, $$
(1)

where f is a real-valued, continuously differentiable function on an open set containing Ω, l and u are constant vectors, and the inequalities are understood componentwise. Let \(\bar{x}\) be a stationary point of (1), and consider the associated active sets

$$\bar{L}:=\{i:\bar{x}_i=l_i\}, \quad\quad \bar{U}:=\{i:\bar{x}_i=u_i\}. $$

Furthermore, let

$$\bar{F}:=\{1,\ldots,n\}\setminus(\bar{L}\cup\bar{U}) $$

be the set of the free variables. By using this notation, a vector \(\bar{x}\) is said to be a stationary point for (1) iff it satisfies:

$$ \left \{\begin{array}{ll} i\in \bar{L} & \Rightarrow g_i(\bar{x}) \geq 0 \\ i \in \bar{F} & \Rightarrow g_i(\bar{x})= 0 \\ i \in \bar{U} & \Rightarrow g_i(\bar{x})\leq 0\\ \end{array} \right . $$
(2)

where \(g_i(x)\) is the \(i\)th component of the gradient vector of f at x. Facchinei, Júdice, and Soares [7] gave the following approximations L(x), F(x), and U(x) to \(\bar{L}\), \(\bar{F}\), and \(\bar{U}\), respectively:

$$L(x):=\bigl\{i: x_i\leq l_i+a_i(x)g_i(x)\bigr\},\quad\quad U(x):=\bigl\{i: x_i\geq u_i+b_i(x)g_i(x)\bigr\},\quad\quad F(x):=\{1,\ldots,n\}\setminus\bigl(L(x)\cup U(x)\bigr), $$

where \(a_i(x)\) and \(b_i(x)\), \(i=1,\ldots,n\), are nonnegative, continuous, and bounded functions defined on Ω such that, if \(x_i=l_i\) or \(x_i=u_i\), then \(a_i(x)>0\) or \(b_i(x)>0\), respectively. The following theorem shows that L(x), F(x), and U(x) are indeed good estimates of \(\bar{L}\), \(\bar{F}\), and \(\bar{U}\). For the proof, see Theorem 3.1 in [7].
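As an illustration, the identification step can be coded in a few lines; the sketch below assumes the estimates take the form displayed above and uses the constant choice \(a_i(x)=b_i(x)=10^{-6}\) adopted in the experiments of Sect. 4 (function and variable names are ours):

```python
import numpy as np

def estimate_active_sets(x, g, l, u, a=1e-6, b=1e-6):
    """Estimates L(x), U(x), F(x); a and b play the role of a_i(x), b_i(x).

    Returns boolean masks over the indices 1, ..., n.
    """
    L_mask = x <= l + a * g        # estimated lower-active set L(x)
    U_mask = x >= u + b * g        # estimated upper-active set U(x)
    F_mask = ~(L_mask | U_mask)    # estimated free set F(x)
    return L_mask, U_mask, F_mask
```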

Theorem 2.1

For any \(x\in\varOmega\), \(L(x)\cap U(x)=\emptyset\). Furthermore, if \(\bar{x}\) is a stationary point of problem (1) at which strict complementarity holds, then there exists a neighborhood \(N(\bar{x})\) of \(\bar{x}\) such that

$$L(x)=\bar{L}, \quad\quad F(x)=\bar{F}, \quad\quad U(x)=\bar{U}, \quad \forall \ x\in N(\bar{x}). $$

Following the idea of [7], we now develop an active set modified PRP method. We first define the search direction. Let \(x^k\in\varOmega\) be the current point at iteration k. For simplicity, we write \(L^k:=L(x^k)\), \(U^k:=U(x^k)\), and \(F^k:=F(x^k)\). Define the direction \(d^{k}=(d^{k}_{L^{k}},d^{k}_{F^{k}},d^{k}_{U^{k}})^{T}\) by

$$ d_{i}^k:=l_i-x_{i}^k, \quad i\in L^k $$
(3)

and

$$ d_{i}^k:=u_i-x_{i}^k, \quad i\in U^k. $$
(4)

In what follows, we define \(d^{k}_{F^{k}}\). To this end, we define the set of active indices of f at \(x^k\) as follows:

$$A\bigl(x^k\bigr):=\bigl\{ i:x^k_{i}=l_i \ \mbox{or}\ x^k_{i}=u_i\bigr\}. $$

The active indices are further subdivided into the following four subsets:

$$A_1\bigl(x^k\bigr):=\bigl\{i:x^k_{i}=l_i, g_i\bigl(x^k\bigr)\geq 0\bigr\}, \quad\quad A_2 \bigl(x^k\bigr):=\bigl\{i:x^k_{i}=l_i, g_i\bigl(x^k\bigr) <0\bigr\} $$

and

$$A_3\bigl(x^k\bigr):=\bigl\{i:x^k_{i}=u_i, g_i\bigl(x^k\bigr) \leq 0\bigr\}, \quad\quad A_4\bigl(x^k\bigr):=\bigl\{i:x^k_{i}=u_i, g_i\bigl(x^k\bigr) > 0\bigr\}. $$
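In code, these four subsets are obtained directly from the sign pattern of the gradient on the active indices; a small sketch (the function name is ours):

```python
import numpy as np

def classify_active(x, g, l, u):
    """Split the active indices A(x) into A_1, A_2, A_3, A_4 (boolean masks)."""
    at_l, at_u = np.isclose(x, l), np.isclose(x, u)
    A1 = at_l & (g >= 0)   # at lower bound, gradient nonnegative
    A2 = at_l & (g < 0)    # at lower bound, gradient negative
    A3 = at_u & (g <= 0)   # at upper bound, gradient nonpositive
    A4 = at_u & (g > 0)    # at upper bound, gradient positive
    return A1, A2, A3, A4
```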

It is clear that a direction d is a feasible direction of f at \(x^k\) if and only if \(d_i\geq 0\) for \(i\in A_1(x^k)\cup A_2(x^k)\) and \(d_i\leq 0\) for \(i\in A_3(x^k)\cup A_4(x^k)\). Furthermore, from the definitions of \(L(x^k)\), \(U(x^k)\), and \(A_1(x^k)\)–\(A_4(x^k)\), we can observe that the following fact holds:

$$A_1\bigl(x^k\bigr)\subseteq L\bigl(x^k\bigr),\quad\quad A_3\bigl(x^k\bigr)\subseteq U\bigl(x^k\bigr),\quad\quad A_2\bigl(x^k\bigr)\cap L\bigl(x^k\bigr)=\emptyset,\quad\quad A_4\bigl(x^k\bigr)\cap U\bigl(x^k\bigr)=\emptyset. $$

By (3) and (4), we have \(d^{k}_{i} \geq 0\) for \(i\in A_{1}(x^{k})\) and \(d^{k}_{i} \leq 0\) for \(i\in A_{3}(x^{k})\). However, we cannot determine the relation between \(A_2(x^k)\) and \(U(x^k)\), nor the relation between \(A_4(x^k)\) and \(L(x^k)\). Let \(A^{*}_{2}(x^{k}):=A_{2}(x^{k}) \cap U(x^{k})\), \(A^{*}_{4}(x^{k}):=A_{4}(x^{k})\cap L(x^{k})\), and \(F^{*}(x^{k}):=F(x^{k})\setminus((A_{4}(x^{k})-A^{*}_{4}(x^{k})) \cup (A_{2}(x^{k})-A^{*}_{2}(x^{k})))\). It follows from (3) and (4) that \(d^{k}_{i} \geq 0\) for \(i\in A^{*}_{2}(x^{k})\) and \(d^{k}_{i} \leq 0\) for \(i\in A^{*}_{4}(x^{k})\). Thus, we let

$$ d^k_{i}:=-g_{i}^k, \quad i\in \bigl(A_4\bigl(x^k\bigr)-A^{*}_4 \bigl(x^k\bigr)\bigr) \cup \bigl(A_2\bigl(x^k \bigr)-A^{*}_2\bigl(x^k\bigr)\bigr) $$
(5)

and

$$ d^{k}_{F^{*k}}:=\left \{\begin{array}{l@{\quad}l} -g^{k}_{F^{*k}}, & \mbox{if\ } k=0 \ \mbox{or} \ \big\|g^{k-1}_{F^{*k}}\big\|\leq \epsilon, \\[4pt] -g^{k}_{F^{*k}}+\beta_{k}^{\mathrm{PRP}}d^{k-1}_{F^{*k}}-\eta_ky_{F^{*k}}^{k-1}, & \mbox{if } k>0 \ \mbox{and} \ \big\|g^{k-1}_{F^{*k}}\big\| > \epsilon, \end{array} \right . $$
(6)

where

$$\beta_{k}^{\mathrm{PRP}}:=\frac{(g^{k}_{F^{*k}})^{T}y^{k-1}_{F^{*k}}}{\|g^{k-1}_{F^{*k}}\|^{2}},\quad\quad \eta_{k}:=\frac{(g^{k}_{F^{*k}})^{T}d^{k-1}_{F^{*k}}}{\|g^{k-1}_{F^{*k}}\|^{2}},\quad\quad y^{k-1}_{F^{*k}}:=g^{k}_{F^{*k}}-g^{k-1}_{F^{*k}}, $$

and ϵ>0 is a given constant. From the above discussion, we can see that \(d^k\) is a feasible direction of f at \(x^k\). Moreover, it is not difficult to see from (5) and (6) that \(d_{F^{k}}^{k}\) satisfies

$$ \bigl(g_{F^k}^k\bigr)^Td_{F^k}^k=- \big\|g_{F^k}^k\big\|^2. $$
(7)
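To summarize the construction (3)–(6), the following sketch assembles \(d^k\) from the pieces above; it assumes the identification estimates and the three-term PRP quantities take the forms given earlier, and all function and variable names are ours:

```python
import numpy as np

def search_direction(x, g, l, u, prev=None, eps=1e-6):
    """Assemble the direction d^k from (3)-(6).

    prev is None at k = 0; otherwise a dict holding the previous full
    gradient prev["g"] and previous full direction prev["d"].
    """
    L_mask, U_mask, F_mask = estimate_active_sets(x, g, l, u)
    d = np.zeros_like(x)
    d[L_mask] = l[L_mask] - x[L_mask]                 # (3)
    d[U_mask] = u[U_mask] - x[U_mask]                 # (4)

    # free indices that sit on a bound with a "wrong sign" gradient: (5)
    A2_out = F_mask & np.isclose(x, l) & (g < 0)      # A_2 minus A_2^*
    A4_out = F_mask & np.isclose(x, u) & (g > 0)      # A_4 minus A_4^*
    d[A2_out | A4_out] = -g[A2_out | A4_out]

    Fstar = F_mask & ~(A2_out | A4_out)               # F^*(x^k)
    gF = g[Fstar]
    if prev is None or np.linalg.norm(prev["g"][Fstar]) <= eps:
        d[Fstar] = -gF                                # first branch of (6)
    else:
        g_old, d_old = prev["g"][Fstar], prev["d"][Fstar]
        y = gF - g_old
        denom = np.dot(g_old, g_old)                  # ||g^{k-1}_{F*}||^2
        beta = np.dot(gF, y) / denom                  # beta_k^PRP
        eta = np.dot(gF, d_old) / denom               # eta_k
        d[Fstar] = -gF + beta * d_old - eta * y       # second branch of (6)
    return d
```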

The following two theorems show that \(d^k\) is a descent direction of f at \(x^k\) if \(x^k\) is not a stationary point of (1).

Theorem 2.2

If \(x^k\in\varOmega\), then \(d^k=0\) if and only if \(x^k\) is a stationary point of (1).

Proof

Let \(d^k=0\). For \(i\in L^k\), by (3) and the definition of \(L^k\), we have

$$0=d^{k}_i=l_i-x_{i}^k \geq -a_i\bigl(x^k\bigr) g_i \bigl(x^k\bigr). $$

Since \(x_{i}^{k}=l_{i}\) and \(a_i(x^k)>0\), it holds that \(g_i(x^k)\geq 0\). For each \(i\in U^k\), we have

$$0=d^{k}_i=u_i-x_{i}^k \leq -b_{i}\bigl(x^k\bigr)g_i \bigl(x^k\bigr). $$

This implies that \(g_i(x^k)\leq 0\). From the above discussion, we have \(A_2(x^k)\cup A_4(x^k)=\emptyset\). Thus, \(F^{*k}=F^k\). For \(i\in F^k\), applying the Cauchy–Schwarz inequality to (7), we have

$$\big\|g_{F^k}^k \big\| \leq \big\|d_{F^k}^k \big\|. $$

Hence, \(g_i(x^k)=0\) for each \(i\in F^k\). The above argument shows that \(x^k\) is a stationary point of f on Ω.

Conversely, suppose that \(x^k\) is a stationary point of f on Ω. Then we have \(A_2(x^k)\cup A_4(x^k)=\emptyset\) and \(F^{*k}=F^k\). It follows from (2) that

$$L^k=\{i:x^k_i=l_i\}, \quad\quad F^k=\{i:l_i<x^k_i<u_i\}, \quad\quad U^k=\{i:x^k_i=u_i\}. $$

By the definition of \(d^k\), we immediately have \(d^k=0\). The proof is complete. □

Theorem 2.3

Suppose that \(x^k\) is not a stationary point of (1). Then the direction \(d^k\) determined by (3)–(6) is a descent direction of f at \(x^k\).

Proof

Following an argument similar to the proof of Theorem 3.2 in [7], it is not difficult to show that, for each \(i\in L^k\cup U^k\), there exists a positive scalar \(\gamma_i\) such that

$$ g^{k}_i d^{k}_i \leq -\gamma_i \bigl(d_{i}^k\bigr)^2. $$
(8)

This together with (7) implies that

$$ \bigl(g^k\bigr)^Td^k\leq - \big\|g_{F^k}^k\big\|^2-\sum_{i \in L^k \cup U^k}\gamma_i \bigl(d_{i}^k \bigr)^2. $$
(9)

Since \(x^k\) is not a stationary point of (1), by Theorem 2.2, there must exist an index i such that at least one of the following two possibilities holds: (i) \(i\in F^k\) and \(g_i(x^k)\neq 0\); (ii) \(i\in L^k\cup U^k\) and \(d_i^k\neq 0\).

Therefore, the right-hand side of (9) must be negative. In other words, d k is a descent direction of f at x k. The proof is complete. □

Based on the above discussion, we propose an active set PRP-type method (APRP) for (1) as follows.

Algorithm 2.1

(APRP)

Step 0.:

Given an initial point \(x^0\in\varOmega\) and positive constants ρ, β, \(\epsilon_1\), and δ with ρ, δ∈ ]0,1[. Set k:=0.

Step 1.:

If \(\|P_{\varOmega}(x^k-g^k)-x^k\|\leq \epsilon_1\), then stop.

Step 2.:

Compute d k by (3)–(6).

Step 3.:

Determine \(\alpha^k:=\max\{\beta\rho^j,\ j=0,1,\ldots\}\) satisfying

$$ f\bigl(P_{\varOmega}\bigl(x^k+\alpha^kd^k \bigr)\bigr)\leq f\bigl(x^k\bigr)-\delta\bigl(\alpha^{k} \big\|d^k\big\|\bigr)^2. $$
(10)
Step 4.:

Let the next iterate be \(x^{k+1}:=x^k+\alpha^k d^k\).

Step 5.:

Let k:=k+1 and go to Step 1.
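For concreteness, Algorithm 2.1 can be sketched as follows, reusing the helper functions from the earlier sketches (all names are ours; in Step 4 the sketch applies the projection, which by Remark 2.1 coincides with \(x^k+\alpha^k d^k\) for sufficiently small stepsizes):

```python
import numpy as np

def aprp(f, grad, x0, l, u, delta=1e-4, rho=0.5, beta0=1.0,
         eps1=1e-6, max_iter=5000):
    """A sketch of Algorithm 2.1 (APRP)."""
    x = np.minimum(np.maximum(x0, l), u)   # start from a feasible point
    prev = None
    for k in range(max_iter):
        g = grad(x)
        # Step 1: stop when the projected-gradient residual is small
        if np.linalg.norm(project_box(x - g, l, u) - x) <= eps1:
            break
        # Step 2: search direction from (3)-(6)
        d = search_direction(x, g, l, u, prev)
        # Step 3: backtracking line search on condition (10)
        alpha, fx, nd2 = beta0, f(x), np.dot(d, d)
        while f(project_box(x + alpha * d, l, u)) > fx - delta * alpha ** 2 * nd2:
            alpha *= rho
        # Step 4: next iterate (projected; see the remark in the lead-in)
        x = project_box(x + alpha * d, l, u)
        prev = {"g": g, "d": d}
    return x
```

In an actual implementation, the trial step β would be chosen adaptively at each iteration, as described in Sect. 4.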

Remark 2.1

Since the direction \(d^k\) is a feasible descent direction of f at \(x^k\), there exists a positive constant \(\beta^k\) such that \(x^k+\alpha d^k\in\varOmega\) whenever \(0<\alpha<\beta^k\). This implies

$$f\bigl(P_{\varOmega}\bigl(x^k+\alpha^{k}d^k \bigr)\bigr)=f\bigl(x^k+\alpha^{k}d^k\bigr) \quad \mbox{for} \ 0< \alpha^{k} <\beta^k. $$

Clearly, Theorem 2.3 implies that condition (10) holds after a finite number of reductions of β. Consequently, Algorithm 2.1 is well defined. In addition, it follows from (10) that the sequence of function values \(\{f(x^k)\}\) is decreasing. We also have from (10) that

$$\sum_{k=0}^{\infty}\bigl(\alpha^{k} \bigr)^2\big\|d^k\big\|^2 <\infty, $$

if f is bounded from below. In particular, we have

$$ \lim_{k \rightarrow \infty}\alpha^k\big\|d^k\big\|=0. $$
(11)
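Indeed, (10) together with the identity in Remark 2.1 gives \(f(x^{k+1})\leq f(x^k)-\delta(\alpha^{k})^2\|d^k\|^2\); summing over \(k=0,\ldots,K\) and telescoping,

$$\delta\sum_{k=0}^{K}\bigl(\alpha^{k}\bigr)^2\big\|d^k\big\|^2 \leq f\bigl(x^0\bigr)-f\bigl(x^{K+1}\bigr)\leq f\bigl(x^0\bigr)-\inf_{x\in \varOmega}f(x), $$

and letting K→∞ yields the stated bound whenever f is bounded from below on Ω.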

3 Convergence Analysis

In this section, we analyze the convergence of Algorithm 2.1. To this end, we make the following assumptions.

Assumption 3.1

(i) The level set \(D:=\{x\in\varOmega: f(x)\leq f(x^0)\}\) is bounded.

(ii) In some neighborhood N of D, f is continuously differentiable and its gradient is Lipschitz continuous, namely, there exists a constant L>0 such that

    $$ \big\|g(x)-g(y)\big\|\leq L \|x-y\|,\quad\forall x, y \in N. $$
    (12)

Since \(\{f(x^k)\}\) is decreasing, it is clear that the sequence \(\{x^k\}\) generated by Algorithm 2.1 is contained in D. In addition, from Assumption 3.1, we deduce that there is a constant γ>0 such that

$$ \big\|g(x)\big\| \leq \gamma,\quad \forall x \in D. $$
(13)

The following lemma shows that the sequence \(\{d^k\}\) generated by Algorithm 2.1 is also bounded.

Lemma 3.1

Let the sequence {x k} be generated by Algorithm 2.1. Then there exists a constant M>0 such that

$$ \big\|d^k\big\| \leq M,\quad \forall k. $$
(14)

Proof

First, by (8), there exists a constant \(\bar{\gamma}>\gamma\) such that

$$\big\|d^k_{L^k \cup U^k}\big\| \leq \bar{\gamma}. $$

This together with the definition of \(d^k\), (12), and (13) implies that

$$\big\|d^k\big\| \leq \big\|d^k_{L^k\cup U^k}\big\|+\big\|d^k_{F^k}\big\| \leq 2\bar{\gamma}+\frac{2L\bar{\gamma}}{\epsilon^2}\alpha^{k-1}\big\|d^{k-1}\big\|\,\big\|d^{k-1}\big\|. $$

By (11), there exist a constant r∈]0,1[ and an integer \(k_0\) such that the following inequality holds for all \(k>k_0\):

$$\frac{2L\bar{\gamma}}{\epsilon^2}\alpha^{k-1}\big\|d^{k-1}\big\| \leq r. $$

Hence, we have for all \(k>k_0\)

$$\big\|d^k\big\| \leq 2\bar{\gamma}+r\big\|d^{k-1}\big\| \leq 2\bar{\gamma}\bigl(1+r+\cdots+r^{k-k_0-1}\bigr)+r^{k-k_0}\big\|d^{k_0}\big\| \leq \frac{2\bar{\gamma}}{1-r}+\big\|d^{k_0}\big\|. $$

Letting \(M:=\max\{\|d^{0}\|,\|d^{1}\|,\ldots, \|d^{k_{0}}\|,\frac{2\bar{\gamma}}{1-r}+\|d^{k_{0}}\|\}\), we get (14). □

Lemma 3.2

Let the sequences \(\{x^k\}\) and \(\{d^k\}\) be generated by Algorithm 2.1. If there are subsequences \(\{x^{k}\}_{k\in K}\rightarrow \widehat{x}\) and \(\{d^{k}\}_{k\in K}\rightarrow 0\), then \(\widehat{x}\) is a stationary point of (1).

Proof

Since the number of distinct sets \(L^k\), \(U^k\), and \(F^k\) is finite, without loss of generality, we can assume that the following relations hold for all \(k\in K\):

$$L^k=L,\quad\quad F^k=F,\quad\quad U^k=U. $$

Since \(d^{k}_{L}\rightarrow 0\) and \(d^{k}_{U} \rightarrow 0\), we obviously have

$$\widehat{x}_L=l_{L},\quad g_{L}(\widehat{x}) \geq 0,\quad\mbox{and}\quad \widehat{x}_U=u_{U},\quad g_{U}(\widehat{x}) \leq 0. $$

On the other hand, by (7), we have

$$\big\|g^k_{F}\big\|\leq \big\|d^k_{F}\big\|. $$

Consequently, we obtain \(\|g_{F}(\widehat{x})\|=0\), and hence \(\widehat{x}\) is a stationary point of (1). The proof is complete. □

The following theorem shows that there exists an accumulation point of \(\{x^k\}\) that is a stationary point of (1).

Theorem 3.1

Let {x k} be generated by Algorithm 2.1. Then we have

$$ \liminf_{k \rightarrow \infty} \big\|d^k\big\|=0. $$
(15)

Proof

For the sake of contradiction, we suppose that (15) is not true. Then there exists a constant ϵ>0 such that

$$\big\| d^k \big\| \geq \epsilon , \quad \forall k. $$

This together with (11) implies that \(\lim_{k\rightarrow\infty}\alpha^k=0\). By the line search rule, the stepsize \(\rho^{-1}\alpha^{k}\) does not satisfy (10) when k is sufficiently large. This means

$$ f\bigl(x^k+\rho^{-1} \alpha^{k} d^k\bigr)-f\bigl(x^k\bigr)>-\delta \rho^{-2} \bigl(\alpha^{k}\bigr)^2\big\|d^k\big\|^2. $$
(16)

By the mean-value theorem and (12), there exists a constant \(0<\theta^k<1\) such that

$$f\bigl(x^k+\rho^{-1}\alpha^{k}d^k\bigr)-f\bigl(x^k\bigr)=\rho^{-1}\alpha^{k}g\bigl(x^k+\theta^k\rho^{-1}\alpha^{k}d^k\bigr)^Td^k \leq \rho^{-1}\alpha^{k}\bigl(g^k\bigr)^Td^k+L\rho^{-2}\bigl(\alpha^{k}\bigr)^2\big\|d^k\big\|^2, $$

where L is the Lipschitz constant of g. Substituting the last inequality into (16), we get for all k sufficiently large

$$-\bigl(g^k\bigr)^Td^k < (L+\delta)\rho^{-1}\alpha^{k}\big\|d^k\big\|^2. $$

Since \(\{d^k\}\) is bounded and \(\lim_{k\rightarrow\infty}\alpha^k\|d^k\|=0\), the last inequality implies

$$\lim_{k \rightarrow \infty}-\bigl(g^{k}\bigr)^Td^k=0. $$

This together with (9) implies that

$$ \lim_{k \rightarrow \infty} \big\|g^{k}_{F^k} \big\|^2=0\quad \mbox{and}\quad\lim_{k\rightarrow \infty} \sum _{i \in L^k \cup U^k} \gamma_i \bigl(d_{i}^k \bigr)^2=0. $$
(17)

Since \(\{x^k\}\) is bounded, there exists an infinite index set \(K_1\) such that \(\lim_{k\in K_{1}}x^{k}=x^{*}\). Hence, (17) together with (3)–(4) implies that \(x^*\) is a stationary point of (1). By Theorem 2.2, we have \(\lim_{k \in K_{1}}\|d^{k}\|=0\). This yields a contradiction. □

4 Numerical Experiments

In this section, we report some numerical experiments to test the performance of the APRP method and compare it with the SPG2 method [13, 17] and the ASA method [3]. The SPG2 code was obtained from Birgin's web page at http://www.ime.usp.br/~egbirgin/tango/, and the ASA code from Hager's web page at http://www.math.ufl.edu/~hager/. The SPG2 and APRP codes were written in Fortran 77 and the ASA code in C. All codes were run on a PC with a Pentium IV 3.0 GHz processor and 512 MB of RAM under a Linux operating system.

The ASA and SPG2 codes were run with their default parameters. We implemented the APRP code with the following parameters: \(\delta=10^{-4}\), \(\rho=0.5\), \(\epsilon=10^{-6}\), and

$$\beta:=\left \{ \begin{array}{l@{\quad}l} \min\bigl\{1.0,1.01\frac{g^T_{k-1}d_{k-1}}{g_k^Td_k}\bigr\}, &\mbox{if} \ k>0,\\[4pt] \frac{1}{\|g(x_0)\|_{\infty}}, & \mbox{if}\ k=0. \end{array} \right . $$
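A small sketch of this trial-step rule (the function name is ours):

```python
import numpy as np

def trial_step(k, g, d, g_prev=None, d_prev=None, g0=None):
    """Initial trial stepsize beta used in the backtracking search of Step 3."""
    if k == 0:
        return 1.0 / np.linalg.norm(g0, np.inf)
    # both inner products are negative for descent directions, so the ratio > 0
    return min(1.0, 1.01 * np.dot(g_prev, d_prev) / np.dot(g, d))
```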

Moreover, we also tested the APRP method with different choices of the parameters \(a_i(x)\) and \(b_i(x)\) and found that \(a_i(x)=b_i(x)=10^{-6}\) is the best choice.

The test set consists of 52 box constrained problems from the CUTEr library [16] with dimensions between 500 and 15625. For each test problem and all tested methods, the termination condition is

$$\big\|P_{\varOmega}\bigl(x^k- g\bigl(x^k\bigr) \bigr)-x^k\big\|_{\infty}\leq 10^{-6}. $$

A limit of 5000 iterations is also imposed. In running the numerical experiments, we checked whether different codes converged to different local minimizers; when comparing the codes, we restricted ourselves to the test problems in which all codes converged to the same local minimizer. The detailed numerical results, including the CPU time in seconds and the numbers of iterations, function evaluations, and gradient evaluations for each of the tested methods, are posted at the following web site: http://bgchengwy.blog.163.com/.

First, we note that the SPG2 method fails to satisfy the termination condition for the problems LINVERSE (999), QRTQUAD (1200–5000), and JNLBRNGB (9375–15625), while the APRP method and the ASA method fail to satisfy it only for the problems QRTQUAD (1200–5000). To gain more insight into the numerical results, we consider the statistics "iter," "fun," "grad," and "time," and list in Table 1 the number of problems for which each method achieves the best value of the number of iterations, function evaluations, gradient evaluations, and CPU time.

Table 1 Statistical data for the 52 problems

From the above table, we observe that the APRP method works efficiently: it requires fewer iterations, function evaluations, and gradient evaluations, and less CPU time. To further illustrate the efficiency of the APRP method, we adopt the performance profiles introduced by Dolan and Moré [18] to evaluate the CPU time in Fig. 1. It is not difficult to see from Fig. 1 that the APRP method performs best among all the methods with respect to CPU time. In particular, both the APRP method and the ASA method solve about 96 % of the test problems, whereas the SPG2 method solves about 90 % of the test problems. In summary, the results in Table 1 and Fig. 1 show that the APRP method provides an efficient method for solving large-scale nonlinear box constrained problems.
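For reference, such profiles can be computed from a matrix of CPU times (problems × solvers) in a few lines, following Dolan and Moré [18]; the sketch below and its names are ours, and failures are encoded as infinite times:

```python
import numpy as np

def performance_profile(T, taus):
    """T[p, s]: CPU time of solver s on problem p (np.inf if it failed).

    Returns rho[s, j], the fraction of problems that solver s solves within
    a factor taus[j] of the best solver on each problem (assumes every
    problem is solved by at least one solver).
    """
    n_prob, n_solv = T.shape
    ratios = T / T.min(axis=1, keepdims=True)   # performance ratios r_{p,s}
    rho = np.empty((n_solv, len(taus)))
    for j, tau in enumerate(taus):
        rho[:, j] = np.sum(ratios <= tau, axis=0) / n_prob
    return rho
```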

Fig. 1 Performance profiles based on CPU time

5 Conclusions

In this paper, we have introduced an active set modified PRP method for large-scale optimization with simple bounds on the variables. Under appropriate conditions, we have shown that the proposed method is globally convergent. The algorithm uses an active set identification technique to estimate the active variables at each iteration, and determines the search direction for the free variables by the modified PRP conjugate gradient method. The active set identification strategy allows rapid changes in the active set from one iteration to the next, and the cost of computing the search direction for the active variables is low. We believe that our algorithm is practical and effective for the test problems.