Abstract
In this paper, we introduce a combination of a family of conjugate gradient (CG) methods with the trust-region method. Whenever the trust-region algorithm is unsuccessful, the family of CG methods is used to avoid re-solving the trust-region subproblem. The computational cost of this family is negligible. Global convergence of the new approach is proved and numerical experiments are reported.
1 Introduction
We introduce a trust-region method combined with conjugate gradient (CG) methods to solve the system of nonlinear equations
where \(F:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}^{n}\) is a continuously differentiable mapping, that is,
for which each function \(F_i:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}\), \(i=1,\ldots ,n\), is smooth. Let us denote by \(x^*\) a solution or root of (1). Under certain circumstances, the nonlinear system (1) can be written as the following unconstrained optimization problem
in which \(\Vert \cdot \Vert \) denotes the Euclidean norm, cf. [24]. There are some popular methods to solve (2) such as [6, 10,11,12,13,14, 16, 21, 25,26,27,28,29].
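To make the reformulation concrete, here is a minimal sketch, assuming the standard least-squares merit function \(f(x)=\frac{1}{2}\Vert F(x)\Vert ^2\) with gradient \(g(x)=J(x)^TF(x)\) (the exact scaling used in (2) is not reproduced in this excerpt):

```python
import numpy as np

def merit_and_gradient(F, J, x):
    """Merit function f(x) = 0.5 * ||F(x)||^2 for the system F(x) = 0,
    together with its gradient g(x) = J(x)^T F(x)."""
    Fx = np.asarray(F(x), dtype=float)
    return 0.5 * Fx @ Fx, np.asarray(J(x), dtype=float).T @ Fx
```

A root \(x^*\) of F is then a global minimizer of f with \(f(x^*)=0\), which is why methods for (2) can be applied to (1).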
Contribution In this paper, we aim to present a derivative-free trust-region algorithm in which a family of CG methods is used whenever iterations of the traditional trust-region (TTR) method are unsuccessful. In addition, the global convergence and q-quadratic convergence of the new algorithm are proved. Numerical results show that the new algorithm is efficient for solving systems of nonlinear equations.
Organization This paper is organized as follows: In Sect. 2, we briefly describe the traditional trust-region method. In the next section, we first introduce a family of CG methods and then describe the structure of the new algorithm. In Sect. 4, the global convergence of the new algorithm is investigated under suitable assumptions. Numerical results are reported in Sect. 5. Finally, some concluding remarks are given in Sect. 6.
2 Review of traditional trust-region method
The trust-region method is an efficient class of global approaches for solving (1). It uses the information gathered about f to produce a quadratic model function \(m_k\) whose behavior close to the current point \(x_k\) is similar to that of the actual objective function f. We now describe the trust-region method in a little more detail. If x is far from \(x_k\), then \(m_k\) may not be a good approximation of f, so the minimization of \(m_k\) is restricted to a region around \(x_k\), as follows
where \(\varDelta _k>0\) is the trust-region radius. In other words, the trust-region method finds a trial step \(d_k\) by computing an approximate solution of the following model subproblem
where \(f_k:=f(x_k)\), \(F_k:=F(x_k)\) and \(J_k:=F'(x_k)\) is an approximation of the Jacobian of F at \(x_k\). In order to update \(\varDelta _k\) and decide whether to accept \(d_k\), the trust-region ratio is defined by
which measures the agreement between the actual reduction in the function f and the reduction predicted by the model. If \(r_k\ge \mu _1\), where \(\mu _1 \in (0,1)\), the iteration is called successful and \(x_{k+1}:=x_k+d_k\). Otherwise, the iteration is called unsuccessful and \(x_{k+1}:=x_k\).
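The acceptance test and radius update just described can be sketched as follows. The parameter values \(\mu _1=0.1\), \(\mu _2=0.9\) and \(c_1=0.25\) follow the settings reported in Sect. 5; the expansion factor 2 for very successful iterations is illustrative, not necessarily the paper's exact rule:

```python
import numpy as np

def tr_update(f, x, d, model_decrease, delta,
              mu1=0.1, mu2=0.9, c1=0.25, delta_max=1e6):
    """One acceptance/radius-update step of a basic trust-region loop.

    r = (f(x) - f(x+d)) / (m(x) - m(x+d)) measures how well the model
    predicted the actual reduction in f."""
    actual = f(x) - f(x + d)
    r = actual / model_decrease
    if r >= mu1:                      # successful: accept the step
        x_new = x + d
        if r >= mu2:                  # very successful: enlarge the region
            delta = min(2.0 * delta, delta_max)
    else:                             # unsuccessful: reject and shrink
        x_new = x
        delta = c1 * delta
    return x_new, delta, r
```

When the model matches f well (\(r_k\) close to 1), the radius grows and longer steps become possible; when the model is poor, the step is rejected and the region shrinks.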
We now present the traditional trust-region algorithm (TTR).
In TTR, the loop from Line 3 to Line 19 is called the outer cycle and the loop from Line 5 to Line 13 is called the inner cycle. In addition, if \(r_k<\mu _1\) (Line 10), then \(\varDelta _k\) is reduced in Line 11; if \(r_k\ge \mu _2\) (Line 15), then \(\varDelta _k\) is increased in Line 16 and the iteration is called very successful. Whenever \(r_k<\mu _1\), TTR solves the trust-region subproblem (3) several times, which increases the computational cost.
3 Motivation and algorithmic structure
It is well known that TTR has some drawbacks, see [1, 2, 7, 10, 28]. One of the main drawbacks is that re-solving the trust-region subproblem whenever iterations are unsuccessful increases the computational cost. To overcome this shortcoming, we can apply low-memory techniques that decrease the total number of iterations without re-solving the trust-region subproblem. Hence, under no circumstances do we want to re-solve the trust-region subproblem when an iteration is unsuccessful.
Conjugate gradient methods are efficient tools for avoiding the re-solution of the trust-region subproblem: they have low computational costs and strong global convergence properties have been established for them. Researchers have introduced various CG methods that require no matrix storage, cf. [8, 15, 17, 18]. These methods produce an iterative sequence \(\{x_k\}_{k\ge 0}\) satisfying \(x_{k+1}:=x_k+\alpha _kd_k\), where \(\alpha _k\) is a step-size and \(d_k\) is a direction defined by
for which \(g_k:=J_k^TF_k\) and \(\beta _k\) is a scalar. Several well-known formulas for \(\beta _k\) are given by
where \(y_{k-1}:=g_k-g_{k-1}\). Now, we introduce a new conjugate gradient method as follows:
in which
\(\beta _k\) can be chosen from formulas (6)–(9), \(0<\beta _{\min }<\beta _{\max }<+\infty \) and \(d_{k}\) is the solution of (3). The direction \(d_k^{cg}\) exploits the advantages of \(d_k\) without any additional computational cost; hence, its computational cost is small.
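Since the displayed formulas (5)–(9) are not reproduced in this excerpt, the following sketch shows the classical update \(d_k=-g_k+\beta _kd_{k-1}\) with the textbook \(\beta \)-formulas of the four methods cited here: Fletcher-Reeves [15], Hestenes-Stiefel [18], Dai-Yuan [8] and Hager-Zhang [17]:

```python
import numpy as np

def beta_fr(g, g_prev, d_prev):
    """Fletcher-Reeves: ||g_k||^2 / ||g_{k-1}||^2."""
    return g @ g / (g_prev @ g_prev)

def beta_hs(g, g_prev, d_prev):
    """Hestenes-Stiefel: g_k^T y_{k-1} / d_{k-1}^T y_{k-1}."""
    y = g - g_prev
    return g @ y / (d_prev @ y)

def beta_dy(g, g_prev, d_prev):
    """Dai-Yuan: ||g_k||^2 / d_{k-1}^T y_{k-1}."""
    y = g - g_prev
    return g @ g / (d_prev @ y)

def beta_hz(g, g_prev, d_prev):
    """Hager-Zhang: (y - 2 d ||y||^2 / d^T y)^T g / d^T y."""
    y = g - g_prev
    dy = d_prev @ y
    return (y - 2.0 * d_prev * (y @ y) / dy) @ g / dy

def cg_direction(g, g_prev, d_prev, beta_rule):
    """d_k = -g_k + beta_k d_{k-1}, with d_0 = -g_0."""
    if d_prev is None:
        return -g
    return -g + beta_rule(g, g_prev, d_prev) * d_prev
```

In the setting of this paper \(g_k:=J_k^TF_k\) and \(y_{k-1}:=g_k-g_{k-1}\); no matrix beyond \(J_k^TF_k\) needs to be stored, which is the low-memory property exploited by the new algorithm.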
To establish the convergence results of CG methods, the step-size \(\alpha _k\) should be obtained by an exact or inexact line search technique, cf. [24]. Here, we use a backtracking (BT) procedure to find the step-size \(\alpha _k\), as follows:
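A minimal sketch of such a backtracking procedure, assuming an Armijo-type sufficient-decrease test \(f(x_k+\alpha d_k^{cg})\le f(x_k)+\gamma \alpha g_k^Td_k^{cg}\) with reduction factor \(\sigma \) (consistent with the use of \(\gamma \in [\frac{1}{2},1)\) and \(\sigma \) in the proofs of Lemmas 1 and 2, though the paper's displayed procedure is not reproduced here):

```python
import numpy as np

def backtrack(f, x, d, g, sigma=0.5, gamma=0.5, max_iter=50):
    """Backtracking line search: return the largest alpha = sigma^i with
    f(x + alpha d) <= f(x) + gamma * alpha * g^T d."""
    fx = f(x)
    gd = g @ d           # directional derivative g_k^T d_k^{cg}
    alpha = 1.0
    for _ in range(max_iter):
        if f(x + alpha * d) <= fx + gamma * alpha * gd:
            return alpha
        alpha *= sigma   # shrink the trial step
    return alpha
```

Because \(d_k^{cg}\) is a descent direction (\(g_k^Td_k^{cg}<0\), see (14)), the loop terminates after finitely many reductions, which is exactly the content of Lemma 2 below.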
We now introduce the conjugate gradient trust-region algorithm to solve (2), as follows:
To prove the global convergence of the new method, we make the following assumptions:
- (A1) :
-
Let \(\varOmega \) be a compact convex set such that \(L(x_0)\subseteq \varOmega \), where the level set \(L(x_0):=\{x\in {\mathbb {R}}^n \mid f(x)\le f(x_{0})\}\) is bounded for any given \(x_{0}\in {\mathbb {R}}^{n}\) and F(x) is continuously differentiable on \(\varOmega \).
- (A2) :
-
The matrix J(x) is bounded and uniformly nonsingular on \(\varOmega \), i.e., there exist constants \(0<M_0\le 1 \le M_1\) such that
$$\begin{aligned} \Vert J(x)\Vert \le M_1 \ \ \text {and} \ \ M_0\Vert F(x)\Vert \le \Vert J(x)^T F(x)\Vert . \end{aligned}$$(11)
- (A3) :
-
The decrease on the model \(m_k\) is at least as much as a fraction of that obtained by the Cauchy point, i.e., there exists a constant \(\zeta \in (0,1)\) such that
$$\begin{aligned} m_k(x_k)-m_k(x_k+d_k) \ge \zeta ~ \Vert g_k\Vert ~ \min \left[ \varDelta _k,\frac{\Vert g_k\Vert }{\Vert J_k^T J_k\Vert }\right] . \end{aligned}$$(12)
- (A4) :
-
To prove strong theoretical results for the proposed algorithm, it is necessary that the step \(d_k\), the solution of (3), satisfies
$$\begin{aligned} g_k^Td_k\le -\zeta \Vert g_k\Vert \min \left[ \varDelta _k,\frac{\Vert g_k\Vert }{\Vert J_k^TJ_k\Vert }\right] , \end{aligned}$$(13)
where \(\zeta \in (0,1)\).
Note that by Assumption (A4) and (10) we get
Let us define two index sets
In other words, \({\mathcal {I}}_2\) contains the iterations of CGTR that use CG methods and \({\mathcal {I}}_1\) contains the iterations of CGTR that do not use them.
4 Convergence theory
We now investigate the global convergence and the q-quadratic convergence rate of CGTR.
Remark 1
Suppose that there exists a constant \(\kappa >0\) such that \(\Vert d_k\Vert \le \kappa \Vert g_k\Vert \). This fact implies that
By Remark 1, the following lemma helps us to establish the global convergence.
Lemma 1
Suppose that the sequence \(\{x_k\}_{k\ge 0}\) is generated by CGTR. Then, for sufficiently large \(k\in {\mathcal {I}}_2\), the step-size \(\alpha _k\) satisfies
Proof
First, let us define \(\alpha :=\alpha _k/\sigma \). By the BT procedure, we have
which can be rewritten as
In addition, by Taylor’s theorem, there exists a \(\xi \in \left[ x_k,x_k+\alpha d^{cg}_k\right] \) such that
On the other hand, Assumption (A2), for any \(\xi \in \left[ x_k,x_k+\alpha d^{cg}_k\right] \), implies that
Then, by canceling \(\alpha \) from both sides of (15), we get
leading to
Assumptions (A2) and (A4), together with the above inequality, yield
By recalling this fact that \(\Vert d^{cg}_k\Vert \le {{\widetilde{\kappa }}}\Vert g_k\Vert {\mathop {\le }\limits ^{(A2)}} {{\widetilde{\kappa }}} M_1\Vert F_k\Vert \), we get
which completes the proof. \(\square \)
Lemma 2
Suppose that the sequence \(\{x_k\}_{k\ge 0}\) is generated by CGTR. Then, the BT loop in CGTR is well-defined.
Proof
For \(k\in {\mathcal {I}}_2\), we show that the backtracking loop terminates in a finite number of steps. Assume, by contradiction, that there exists \(k\in {\mathcal {I}}_2\) such that
After rewriting (18) as
we take the limit as \(i\rightarrow \infty \), which leads to
because f is a differentiable function. Since \(\gamma \in [\frac{1}{2},1)\), we obtain \(g_k^Td^{cg}_k\ge 0\), which contradicts (14). \(\square \)
Under Assumptions (A1)–(A5), the main global convergence result for CGTR is established by the following theorem.
Theorem 1
Suppose that Assumptions (A1)–(A5) hold. Then Algorithm 2 either stops at a stationary point of f(x) or generates an infinite sequence \(\{x_k\}\) such that
Proof
The proof is by contradiction. Assume that the opposite of (19) holds, that is, there exist a constant \(\epsilon >0\) and an infinite subset \(K\subseteq {\mathbb {N}}\) such that
We split the proof into the following two cases:
Case 1 For \(k\in {\mathcal {I}}_1\). Substituting (20) into (12) and using (A2), we get
Taking the limit on both sides of (21) leads to
In other words, there exist a \(\delta >0\) and \(m\in {\mathbb {N}}\) such that
which, as \(k\rightarrow \infty \), yields a contradiction because \(0<\theta _1\le 0\) is impossible. Hence, hypothesis (20) cannot hold in this case.
Case 2 For \(k\in {\mathcal {I}}_2\). Lemma 1, along with Assumptions (A2) and (A4), implies
Similar to Case 1, (22) holds. Hence, as \(k\rightarrow \infty \), the above inequality results in
yielding a contradiction. Hence, our original assertion (20) must be false, yielding (19). \(\square \)
Now, to investigate the quadratic convergence of the new method, we present the following assumption.
(A5) The matrix J(x) is Lipschitz continuous in \(L(x_0)\), with Lipschitz constant \(\gamma _L\).
Since \(d_k^{cg}\) satisfies (14), the quadratic convergence rate of the sequence generated by CGTR can, under some standard assumptions, be established similarly to Theorem 2 in [3].
5 Numerical experiments
In this section, we compare five different solvers on a set of test problems. First, the four CG-based variants of the new algorithm are introduced as follows:
-
CGFRTR: The trust-region algorithm combined with conjugate gradient method proposed by Fletcher and Reeves [15];
-
CGHSTR: The trust-region algorithm combined with conjugate gradient method proposed by Hestenes and Stiefel [18];
-
CGHZTR: The trust-region algorithm combined with conjugate gradient method proposed by Hager and Zhang [17];
-
CGDYTR: The trust-region algorithm combined with conjugate gradient method proposed by Dai and Yuan [8].
Then, we compare CGFRTR, CGHSTR, CGHZTR and CGDYTR with the traditional trust-region algorithm (TTR) proposed by Conn et al. [7].
All algorithms are run on a collection of systems of nonlinear equations with dimensions ranging from 2 to 1000, selected from a wide range of the literature. We have tested our implementation on the set of test functions of [22] (problems 1–28), [20] (problems 29–55) and [23] (problems 56–62), respectively. Table 1 provides the name and dimension of each test problem. All codes are written in the MATLAB 15 programming environment with double-precision format in the same subroutine. All algorithms are stopped if
which is the main termination criterion. Otherwise, we count the corresponding test run as a failure if the total number of iterations exceeds 1000. We use the Steihaug-Toint procedure to solve the trust-region subproblems of the proposed algorithms, which terminates at \(x_k+d\) if
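The Steihaug-Toint procedure is truncated CG applied to the subproblem (3), stepping to the trust-region boundary when negative curvature is encountered or the radius is exceeded. A generic sketch, assuming a residual-based stopping tolerance (the paper's specific termination condition is not reproduced in this excerpt):

```python
import numpy as np

def boundary_tau(d, p, delta):
    """Positive root tau of ||d + tau*p|| = delta."""
    a = p @ p
    b = 2.0 * (d @ p)
    c = d @ d - delta * delta
    return (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

def steihaug(g, B, delta, tol=1e-8):
    """Steihaug-Toint truncated CG for
        min  g^T d + 0.5 d^T B d   s.t.  ||d|| <= delta."""
    n = g.size
    d = np.zeros(n)
    r = g.copy()          # residual of B d + g = 0
    p = -r
    if np.linalg.norm(r) < tol:
        return d
    for _ in range(2 * n):
        Bp = B @ p
        curv = p @ Bp
        if curv <= 0.0:   # negative curvature: go to the boundary
            return d + boundary_tau(d, p, delta) * p
        alpha = (r @ r) / curv
        if np.linalg.norm(d + alpha * p) >= delta:
            return d + boundary_tau(d, p, delta) * p
        d = d + alpha * p
        r_new = r + alpha * Bp
        if np.linalg.norm(r_new) < tol:
            return d
        p = -r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return d
```

For subproblem (3) one takes \(g=J_k^TF_k\) and \(B=J_k^TJ_k\); the procedure needs only matrix-vector products with B, so the full Gauss-Newton matrix never has to be formed.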
The finite-difference formula is used to evaluate the Jacobian matrix \(J_k\), as follows
where \([J_k]_{\cdot j}\) denotes the jth column of \(J_k\), \(e_j\) is the jth vector of the canonical basis and
in which \(\epsilon _m\) denotes the machine epsilon provided by the MATLAB function eps. For all algorithms, we choose \(\varDelta _0=1\) (see [25]) and employ the parameters \(\mu _1=0.1\), \(\mu _2=0.9\), \(c_1=0.25\) and \(c_2=0.3\). In addition, \(\varDelta _{k}\) is updated as in [24] by the following formula
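A sketch of the forward-difference Jacobian approximation described above; the step \(h_j=\sqrt{\epsilon _m}\,\max (|x_j|,1)\) is a common choice consistent with the use of machine epsilon, though the paper's exact step formula is not reproduced in this excerpt:

```python
import numpy as np

def fd_jacobian(F, x):
    """Forward-difference Jacobian: column j is approximately
    (F(x + h_j e_j) - F(x)) / h_j."""
    x = np.asarray(x, dtype=float)
    n = x.size
    Fx = np.asarray(F(x), dtype=float)
    J = np.empty((Fx.size, n))
    sqrt_eps = np.sqrt(np.finfo(float).eps)   # MATLAB's eps equivalent
    for j in range(n):
        h = sqrt_eps * max(abs(x[j]), 1.0)
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (np.asarray(F(x + e), dtype=float) - Fx) / h
    return J
```

Each column costs one extra evaluation of F, so one approximate Jacobian costs n function evaluations per iteration; this is what makes the overall algorithm derivative-free.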
The performance profiles of all algorithms with respect to the number of (successful) iterations (\(N_i\)), the number of function evaluations (\(N_f\)) and the CPU time (\(C_t\)) are given in Figs. 1, 2 and 3, respectively. In these figures, P designates the percentage of problems that are solved within a factor \(\tau \) of the best solver, cf. [9].
Figure 1 shows that CGDYTR, CGHSTR and CGHZTR are the best solvers, in terms of the number of iterations, on 92%, 90% and 90% of the problems, respectively. In Fig. 2, it can be seen that CGHSTR, CGHZTR and CGDYTR are the best solvers on approximately 95%, 95% and 88% of the test problems. Figure 3 shows that CGDYTR is the best solver, in terms of CPU time, on 30% of the problems. These results show that the proposed algorithms are efficient for solving systems of nonlinear equations.
6 Concluding remarks
We presented a new CG trust-region strategy to solve nonlinear systems. In order to avoid re-solving the trust-region subproblem, with its high computational cost, we incorporated a family of CG methods into the trust-region algorithm, combining them with the direction generated in the previous iteration by the Steihaug-Toint procedure, which uses Hessian information. The global and q-quadratic convergence properties of CGTR are established. Numerical results on a set of nonlinear systems show that CGDYTR is the best solver for solving nonlinear systems.
References
Ahookhosh, M., Amini, K.: A nonmonotone trust-region method with adaptive radius for unconstrained optimization. Comput. Math. Appl. 60, 411–422 (2010)
Ahookhosh, M., Amini, K., Peyghami, M.R.: A nonmonotone trust-region line search method for large-scale unconstrained optimization. Appl. Math. Modell. 36, 478–487 (2012)
Amini, K., Shiker Mushtak, A.K., Kimiaei, M.: A line search trust-region algorithm with nonmonotone adaptive radius for a system of nonlinear equations. 4OR-Q. J. Oper. Res. 4(2), 132–152 (2016)
Bellavia, S., Macconi, M., Morini, B.: STRSCNE: a scaled trust-region solver for constrained nonlinear equations. Comput. Optim. Appl. 28, 31–50 (2004)
Bouaricha, A., Schnabel, R.B.: Tensor methods for large sparse systems of nonlinear equations. Math. Program. 82, 377–400 (1998)
Broyden, C.G.: The convergence of an algorithm for solving sparse nonlinear systems. Math. Comput. 25(114), 285–294 (1971)
Conn, A.R., Gould, N.I.M., Toint, PhL: Trust-Region Methods. Society for Industrial and Applied Mathematics SIAM, Philadelphia (2000)
Dai, Y.H., Yuan, Y.: A nonlinear conjugate gradient method with a strong global convergence property. SIAM J. Optim. 10, 177–182 (1999)
Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91, 201–213 (2002)
Esmaeili, H., Kimiaei, M.: A new adaptive trust-region method for system of nonlinear equations. Appl. Math. Model. 38(11–12), 3003–3015 (2014)
Fan, J.Y.: Convergence rate of the trust region method for nonlinear equations under local error bound condition. Comput. Optim. Appl. 34, 215–227 (2005)
Fan, J.Y.: An improved trust region algorithm for nonlinear equations. Comput. Optim. Appl. 48(1), 59–70 (2011)
Fan, J.Y., Pan, J.Y.: A modified trust region algorithm for nonlinear equations with new updating rule of trust region radius. Int. J. Comput. Math. 87(14), 3186–3195 (2010)
Fasano, G., Lampariello, F., Sciandrone, M.: A truncated nonmonotone Gauss-Newton method for large-scale nonlinear least-squares problems. Comput. Optim. Appl. 34(3), 343–358 (2006)
Fletcher, R., Reeves, C.: Function minimization by conjugate gradients. Comput. J. 7, 149–154 (1964)
Grippo, L., Sciandrone, M.: Nonmonotone derivative-free methods for nonlinear equations. Comput. Optim. Appl. 37, 297–328 (2007)
Hager, W.W., Zhang, H.: A new conjugate gradient method with guaranteed descent and an efficient line search. SIAM J. Optim. 16, 170–192 (2005)
Hestenes, M.R., Stiefel, E.L.: Methods of conjugate gradients for solving linear systems. J. Res. Nat. Bur. Stand. 49, 409–436 (1952)
Kimiaei, M., Rahpeymaii, F.: A new nonmonotone line-search trust-region approach for nonlinear systems. TOP (2019). https://doi.org/10.1007/s11750-019-00497-2
La Cruz, W., Martínez, J.M., Raydan, M.: Spectral residual method without gradient information for solving large-scale nonlinear systems of equations: theory and experiments. Technical Report RT-04-08 (2004)
Li, D.H., Fukushima, M.: A derivative-free line search and global convergence of Broyden-like method for nonlinear equations. Optim. Methods Softw. 13, 181–201 (2000)
Lukšan, L., Vlček, J.: Sparse and partially separable test problems for unconstrained and equality constrained optimization, Technical Report. No. 767, (1999)
Moré, J.J., Garbow, B.S., Hillström, K.E.: Testing unconstrained optimization software. ACM Trans. Math. Softw. 7, 17–41 (1981)
Nocedal, J., Wright, S.J.: Numerical Optimization. Springer, New York (2006)
Toint, PhL: Numerical solution of large sets of algebraic nonlinear equations. Math. Comput. 46(173), 175–189 (1986)
Yuan, G., Lu, S., Wei, Z.: A new trust-region method with line search for solving symmetric nonlinear equations. Int. J. Comput. Math. 88(10), 2109–2123 (2011)
Yuan, G.L., Wei, Z.X., Lu, X.W.: A BFGS trust-region method for nonlinear equations. Computing. 92(4), 317–333 (2011)
Yuan, Y.: Trust region algorithm for nonlinear equations. Information. 1, 7–21 (1998)
Zhang, J., Wang, Y.: A new trust region method for nonlinear equations. Math. Methods Oper. Res. 58, 283–298 (2003)
Cite this article
Rahpeymaii, F. An efficient conjugate gradient trust-region approach for systems of nonlinear equation. Afr. Mat. 30, 597–609 (2019). https://doi.org/10.1007/s13370-019-00669-0
Keywords
- Nonlinear equations
- Trust-region framework
- Conjugate gradient
- Global theory
- Derivative-free optimization