1 Introduction

We introduce a trust-region method combined with conjugate gradient (CG) methods to solve the system of nonlinear equations

$$\begin{aligned}&F(x)=0,\nonumber \\&x \in {\mathbb {R}}^{n}, \end{aligned}$$
(1)

where \(F:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}^{n}\) is a continuously differentiable mapping, that is,

$$\begin{aligned} F(x):=\left( \begin{array}{c} F_1(x) \\ F_2(x) \\ \vdots \\ F_n(x) \\ \end{array} \right) , \end{aligned}$$

for which each function \(F_i:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}\), \(i=1,\ldots ,n\), is smooth. Let \(x^*\) denote a solution or root of (1). Under certain circumstances, the nonlinear system (1) can be written as the following unconstrained optimization problem

$$\begin{aligned}&\text {min}~f(x):=\frac{1}{2}\Vert F(x)\Vert ^2,\nonumber \\&\text {s.t}\ \ x \in {\mathbb {R}}^{n}, \end{aligned}$$
(2)

in which \(\Vert \cdot \Vert \) denotes the Euclidean norm, cf. [24]. Popular methods for solving (2) include those in [6, 10,11,12,13,14, 16, 21, 25,26,27,28,29].

Contribution In this paper, we aim to present a derivative-free trust-region algorithm in which a family of CG methods is employed whenever iterations of the traditional trust-region (TTR) method are unsuccessful. In addition, the global convergence and q-quadratic convergence of the new algorithm are proved. Numerical results show that the new algorithm is efficient for solving systems of nonlinear equations.

Organization This paper is organized as follows: In Sect. 2, we briefly describe the traditional trust-region method. In the next section, we first introduce a family of CG methods and then describe the structure of the new algorithm. In Sect. 4, the global convergence of the new algorithm is investigated under some suitable assumptions. Numerical results are reported in Sect. 5. Finally, some concluding remarks are given in Sect. 6.

2 Review of traditional trust-region method

The trust-region method is an efficient class of globally convergent approaches for solving (1). It uses the information gathered about f to build a quadratic model function \(m_k\) whose behavior near the current point \(x_k\) is similar to that of the actual objective function f. We now describe the trust-region method in a little more detail. Since \(m_k\) may be a poor approximation of f when x is far from \(x_k\), the search for a minimizer of \(m_k\) is restricted to a region around \(x_k\), defined by

$$\begin{aligned} \varOmega _k:=\{x\mid \Vert x-x_k\Vert \le \varDelta _k\}, \end{aligned}$$

where \(\varDelta _k>0\) is the trust-region radius. In other words, the trust-region method finds a trial step \(d_k\) by computing an approximate solution of the following model subproblem

$$\begin{aligned}&\text {min}~ m_{k}(x_k+d):=\frac{1}{2}\Vert F_k+J_kd\Vert ^2=f_k+d^T J_k^T F_k+\frac{1}{2}d^T J_k^TJ_k d,\nonumber \\&\text {subject to}~ d \in {\mathbb {R}}^n~ \text {and} ~\Vert d\Vert \le \varDelta _k, \end{aligned}$$
(3)

where \(f_k:=f(x_k)\), \(F_k:=F(x_k)\) and \(J_k:=F'(x_k)\) is an approximation of the Jacobian of F at \(x_k\). In order to update \(\varDelta _k\) and decide whether to accept \(d_k\), the trust-region ratio is defined by

$$\begin{aligned} r_k:=\frac{f_k-f(x_k+d_k)}{m_{k}(x_k)-m_{k}(x_k+d_k)}, \end{aligned}$$
(4)

which measures the agreement between the actual reduction in the function f and the reduction predicted by the model. If \(r_k\ge \mu _1\), where \(\mu _1 \in (0,1)\), the iteration is called successful and \(x_{k+1}:=x_k+d_k\). Otherwise, the iteration is called unsuccessful and \(x_{k+1}:=x_k\).

We now present the traditional trust-region algorithm (TTR).

[Algorithm 1 (TTR): the traditional trust-region algorithm]

In TTR, the loop from Line 3 to Line 19 is called the outer cycle and the loop from Line 5 to Line 13 is called the inner cycle. In addition, if \(r_k<\mu _1\) (Line 10), then \(\varDelta _k\) is reduced in Line 11, and if \(r_k\ge \mu _2\) (Line 15), then \(\varDelta _k\) is increased in Line 16 and the iteration is called very successful. Whenever \(r_k<\mu _1\), TTR solves the trust-region subproblem (3) several times, which increases the computational cost.
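To make this structure concrete, the following is a minimal Python sketch of a TTR-style loop under the rules described above; it is an illustration, not the authors' exact listing. The subproblem (3) is solved only approximately by the Cauchy point, and the helper names (`cauchy_step`, `ttr`) and parameter defaults are assumptions of this sketch.

```python
import numpy as np

def cauchy_step(J, Fx, delta):
    # Cauchy point of model (3): minimize m_k along -g_k inside the region.
    g = J.T @ Fx
    gn = np.linalg.norm(g)
    if gn == 0.0:
        return np.zeros_like(g)                          # model is already stationary
    Jg = J @ g
    t = (g @ g) / (Jg @ Jg) if Jg @ Jg > 0 else delta / gn
    t = min(t, delta / gn)                               # respect ||d|| <= Delta_k
    return -t * g

def ttr(F, jac, x, delta=1.0, mu1=0.1, mu2=0.9, shrink=0.25, grow=2.0,
        tol=1e-6, max_iter=1000):
    for _ in range(max_iter):
        Fx, J = F(x), jac(x)
        if np.linalg.norm(Fx) <= tol * np.sqrt(x.size):
            break
        f = 0.5 * Fx @ Fx
        g = J.T @ Fx
        if np.linalg.norm(g) == 0.0:
            break                                        # stationary point of f
        # inner cycle: shrink Delta_k and re-solve (3) until r_k >= mu1
        while True:
            d = cauchy_step(J, Fx, delta)
            Fn = F(x + d)
            pred = f - 0.5 * np.linalg.norm(Fx + J @ d) ** 2   # model decrease
            r = (f - 0.5 * Fn @ Fn) / pred                     # ratio (4)
            if r >= mu1:
                break                                    # successful iteration
            delta *= shrink                              # unsuccessful: reduce Delta_k
        x = x + d
        if r >= mu2:
            delta *= grow                                # very successful: enlarge Delta_k
    return x
```

A dogleg or Steihaug-Toint step could replace `cauchy_step` here without changing the surrounding logic; the point of the sketch is that every unsuccessful iteration re-solves (3).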

3 Motivation and algorithmic structure

It is well known that TTR has some drawbacks, see [1, 2, 7, 10, 28]. One of the main drawbacks of TTR is that re-solving the trust-region subproblem whenever an iteration is unsuccessful increases the computational cost. To overcome this shortcoming, we can apply low-memory techniques that decrease the total number of iterations without re-solving the trust-region subproblem. Hence, under no circumstances do we want to re-solve the trust-region subproblem when an iteration is unsuccessful.

Conjugate gradient methods are efficient tools for avoiding the re-solution of the trust-region subproblem: they have low computational cost, and strong global convergence properties have been established for them. Various CG methods that require no matrix storage have been introduced, cf. [8, 15, 17, 18]. These methods produce an iterative sequence \(\{x_k\}_{k\ge 0}\) satisfying \(x_{k+1}:=x_k+\alpha _kd_k\), in which \(\alpha _k\) is a step-size and \(d_k\) is a direction defined by

$$\begin{aligned} d_k:= {\left\{ \begin{array}{ll} -g_k, &{} \quad \text {if} \ \ k=0, \\ -g_k+\beta _kd_{k-1}, &{} \quad \text {if} \ \ k\ge 1, \end{array}\right. } \end{aligned}$$
(5)

for which \(g_k:=J_k^TF_k\) and \(\beta _k\) is a scalar. Several well-known formulas for \(\beta _k\) are given by

$$ \begin{aligned}&\beta _k^{^{HS}}:=\frac{g_k^Ty_{k-1}}{d_{k-1}^Ty_{k-1}},\quad ({\textsc {Hestenes \& Stiefel}} [18]) \end{aligned}$$
(6)
$$ \begin{aligned}&\beta _k^{^{FR}}:=\frac{\Vert g_k\Vert ^2}{\Vert g_{k-1}\Vert ^2}, \quad ({\textsc {Fletcher \& Reeves}}\ [15]) \end{aligned}$$
(7)
$$ \begin{aligned}&\beta _k^{^{DY}}:=\frac{\Vert g_k\Vert ^2}{d_{k-1}^Ty_{k-1}},\quad ({\textsc {Dai \& Yuan}}\ [8]) \end{aligned}$$
(8)
$$ \begin{aligned}&\beta _k^{^{HZ}}:=\beta _k^{^{HS}}-2\Vert y_{k-1}\Vert ^2\frac{d^T_{k-1}g_k}{(d_{k-1}^Ty_{k-1})^2}, \ \ ({\textsc {Hager \& Zhang}}\ [17]) \end{aligned}$$
(9)

where \(y_{k-1}:=g_k-g_{k-1}\). Now, we introduce a new conjugate gradient method as follows:

$$\begin{aligned} d^{cg}_k:= {\left\{ \begin{array}{ll} -g_k, &{} \quad \text{ if } \ \ k=0,\\ -g_k+\beta _k^{cg}d_{k}, &{} \quad \text{ if } \ \ k\ge 1, \end{array}\right. } \end{aligned}$$
(10)

in which

$$\begin{aligned} \beta _k^{cg}:=\min \{\max \{\beta _k,\beta _{\min }\},\beta _{\max }\}, \end{aligned}$$

where \(\beta _k\) can be chosen from formulas (6)–(9), \(0<\beta _{\min }<\beta _{\max }<+\infty \) and \(d_{k}\) is the solution of (3). The direction \(d_k^{cg}\) takes advantage of the already computed step \(d_k\), so it incurs only a small additional computational cost.
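As an illustration, the following sketch computes the classical coefficients (6)–(9), the safeguarded coefficient \(\beta _k^{cg}\), and the direction \(d^{cg}_k\) of (10). The function names and the default safeguards \(\beta _{\min }\), \(\beta _{\max }\) are assumptions made for this sketch.

```python
import numpy as np

def beta_classical(g, g_prev, d_prev, rule="HS"):
    # Classical CG coefficients (6)-(9); y_{k-1} = g_k - g_{k-1}.
    y = g - g_prev
    if rule == "HS":                                    # Hestenes & Stiefel (6)
        return (g @ y) / (d_prev @ y)
    if rule == "FR":                                    # Fletcher & Reeves (7)
        return (g @ g) / (g_prev @ g_prev)
    if rule == "DY":                                    # Dai & Yuan (8)
        return (g @ g) / (d_prev @ y)
    if rule == "HZ":                                    # Hager & Zhang (9)
        hs = (g @ y) / (d_prev @ y)
        return hs - 2.0 * (y @ y) * (d_prev @ g) / (d_prev @ y) ** 2
    raise ValueError(rule)

def cg_direction(g, g_prev, d_tr, d_prev, rule="HS",
                 beta_min=1e-8, beta_max=1e8):
    # Direction (10): combine -g_k with the trust-region step d_k of (3),
    # using beta_k^{cg} = min{max{beta_k, beta_min}, beta_max}.
    beta = beta_classical(g, g_prev, d_prev, rule)
    beta_cg = min(max(beta, beta_min), beta_max)        # clipping to [beta_min, beta_max]
    return -g + beta_cg * d_tr
```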

To establish the convergence results of CG methods, the step-size \(\alpha _k\) should be obtained by an exact or inexact line search technique, cf. [24]. Here, we use a backtracking (BT) procedure to find the step-size \(\alpha _k\), as follows:

[Procedure BT: backtracking line search]
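A minimal sketch of an Armijo-type backtracking step consistent with the sufficient-decrease test used in the analysis of Sect. 4; the initial trial step, the iteration cap, and the function name are assumptions of this sketch.

```python
def backtracking(f, x, d, g, gamma=0.5, sigma=0.5, alpha0=1.0, max_back=50):
    # Find alpha with f(x + alpha*d) <= f(x) + gamma * alpha * g^T d,
    # shrinking the trial step by the factor sigma each time the test fails.
    fx = f(x)
    gtd = g @ d                      # g_k^T d_k^{cg}, negative for a descent direction
    alpha = alpha0
    for _ in range(max_back):
        if f(x + alpha * d) <= fx + gamma * alpha * gtd:
            return alpha
        alpha *= sigma
    return alpha                     # fallback; Lemma 2 rules this out in theory
```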

We now introduce the conjugate gradient trust-region (CGTR) algorithm to solve (2), as follows:

[Algorithm 2 (CGTR): the conjugate gradient trust-region algorithm]
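The sketch below shows the intended structure, reusing `cauchy_step`, `cg_direction`, and `backtracking` from the sketches above: when the trust-region step is rejected (\(r_k<\mu _1\)), the CG direction (10) and the BT line search are used instead of re-solving (3). This is a hedged reconstruction, not the authors' exact listing; in particular, the handling of \(g_{k-1}\) and \(d_{k-1}\) and the radius update on rejected steps are assumptions.

```python
import numpy as np

def cgtr(F, jac, x, delta=1.0, mu1=0.1, mu2=0.9, c1=0.25, c2=2.0,
         rule="HS", tol=1e-6, max_iter=1000):
    g_prev = d_prev = None
    for _ in range(max_iter):
        Fx, J = F(x), jac(x)
        if np.linalg.norm(Fx) <= tol * np.sqrt(x.size):
            break
        f = 0.5 * Fx @ Fx
        g = J.T @ Fx
        if np.linalg.norm(g) == 0.0:
            break                                          # stationary point of f
        d = cauchy_step(J, Fx, delta)                      # approximate solution of (3)
        pred = f - 0.5 * np.linalg.norm(Fx + J @ d) ** 2   # model decrease
        Fn = F(x + d)
        r = (f - 0.5 * Fn @ Fn) / pred                     # ratio (4)
        if r >= mu1:                                       # successful: accept the TR step
            x = x + d
            if r >= mu2:
                delta *= c2                                # very successful: enlarge radius
        else:                                              # unsuccessful: CG step, no re-solve of (3)
            if g_prev is None:
                d_cg = -g                                  # k = 0 case of (10)
            else:
                d_cg = cg_direction(g, g_prev, d, d_prev, rule)
            alpha = backtracking(lambda z: 0.5 * F(z) @ F(z), x, d_cg, g)
            x = x + alpha * d_cg
            delta *= c1                                    # assumed: shrink the radius as in TTR
        g_prev, d_prev = g, d
    return x
```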

To prove the global convergence of the new method, we present the following assumptions:

(A1) :

Let \(\varOmega \) be a compact convex set such that \(L(x_0)\subseteq \varOmega \), where the level set \(L(x_0):=\{x\in {\mathbb {R}}^n \mid f(x)\le f(x_{0})\}\) is bounded for any given \(x_{0}\in {\mathbb {R}}^{n}\), and F(x) is continuously differentiable on \(\varOmega \).

(A2) :

The matrix J(x) is bounded and uniformly nonsingular on \(\varOmega \), i.e., there exist constants \(0<M_0\le 1 \le M_1\) such that

$$\begin{aligned} \Vert J(x)\Vert \le M_1 \ \ \text {and} \ \ M_0\Vert F(x)\Vert \le \Vert J(x)^T F(x)\Vert , \end{aligned}$$
(11)

see [10, 11].

(A3) :

The decrease in the model \(m_k\) is at least a fraction of that obtained by the Cauchy point, i.e., there exists a constant \(\zeta \in (0,1)\) such that

$$\begin{aligned} m_k(x_k)-m_k(x_k+d_k) \ge \zeta ~ \Vert g_k\Vert ~ \min \left[ \varDelta _k,\frac{\Vert g_k\Vert }{\Vert J_k^T J_k\Vert }\right] , \end{aligned}$$
(12)

for all \(k \in {\mathbb {N}}_0\), see [10, 11, 24].

(A4) :

To prove strong theoretical results for the proposed algorithm, it is necessary that the step \(d_k\), the solution of (3), satisfies

$$\begin{aligned} g_k^Td_k\le -\zeta \Vert g_k\Vert \min \left[ \varDelta _k,\frac{\Vert g_k\Vert }{\Vert J_k^TJ_k\Vert }\right] , \end{aligned}$$
(13)

where \(\zeta \in (0,1)\).

Note that by Assumption (A4) and (10) we get

$$\begin{aligned} g_k^Td^{cg}_k= & {} -g_k^Tg_k+\beta _k^{cg}g_k^Td_k\nonumber \\\le & {} -\Vert g_k\Vert ^2-\beta _k^{cg}\zeta \Vert g_k\Vert \min \left[ \varDelta _k,\frac{\Vert g_k\Vert }{\Vert J_k^TJ_k\Vert }\right] \nonumber \\\le & {} -\Vert g_k\Vert \Big (\Vert g_k\Vert +\beta _{\min }\zeta \min \left[ \varDelta _k,\frac{\Vert g_k\Vert }{\Vert J_k^TJ_k\Vert }\right] \Big ). \end{aligned}$$
(14)

Let us define two index sets

$$\begin{aligned} {\mathcal {I}}_1:=\{k\mid r_k\ge \mu _1\} \ \ \text {and} \ \ {\mathcal {I}}_2:=\{k\mid r_k<\mu _1\}. \end{aligned}$$

In other words, \({\mathcal {I}}_2\) contains the iterations of CGTR in which the CG step is used, and \({\mathcal {I}}_1\) contains the iterations in which it is not.

4 Convergence theory

We now investigate the global convergence and the q-quadratic convergence rate of CGTR.

Remark 1

Suppose that there exists a positive constant \(\kappa >0\) such that \(\Vert d_k\Vert \le \kappa \Vert g_k\Vert \). This fact implies that

$$\begin{aligned} \Vert d^{cg}_k\Vert \le \Vert g_k\Vert +\beta _{k}^{cg}\Vert d_k\Vert \le \Vert g_k\Vert +\beta _{\max }\kappa \Vert g_k\Vert =\underbrace{(1+\beta _{\max }\kappa )}_{=:{{\widetilde{\kappa }}}}\Vert g_k\Vert .\end{aligned}$$

By Remark 1, the following lemma helps us to establish the global convergence.

Lemma 1

Suppose that the sequence \(\{x_k\}_{k\ge 0}\) is generated by CGTR. Then, for sufficiently large \(k\in {\mathcal {I}}_2\), the step-size \(\alpha _k\) satisfies

$$\begin{aligned} \alpha _k>\frac{2\sigma (1-\gamma )M_0\left( M_0\Vert F_k\Vert +\beta _{\min }\zeta \min \left[ \varDelta _k,\frac{M_0\Vert F_k\Vert }{M_1^2}\right] \right) }{M_1^3{{\widetilde{\kappa }}}^2 \Vert F_k\Vert }. \end{aligned}$$

Proof

First, let us define \(\alpha :=\alpha _k/\sigma \). By the BT procedure, we have

$$\begin{aligned} f_k+\gamma \alpha g_k^Td^{cg}_k<f\left( x_k+\alpha d^{cg}_k\right) , \end{aligned}$$

which can be rewritten as

$$\begin{aligned} \gamma \alpha g_k^Td^{cg}_k <f\left( x_k+\alpha d^{cg}_k\right) -f_k. \end{aligned}$$
(15)

In addition, by Taylor’s theorem, there exists a \(\xi \in \left[ x_k,x_k+\alpha d^{cg}_k\right] \) such that

$$\begin{aligned} f\left( x_k+\alpha d^{cg}_k\right) -f_k=\alpha g_k^T d^{cg}_k+\frac{1}{2}\alpha ^2 \left( d^{cg}_k\right) ^TJ(\xi )^TJ(\xi )d^{cg}_k {\mathop {>}\limits ^{(15)}}\gamma \alpha g_k^Td^{cg}_k. \end{aligned}$$
(16)

On the other hand, Assumption (A2), for any \(\xi \in \left[ x_k,x_k+\alpha d^{cg}_k\right] \), implies that

$$\begin{aligned} \frac{1}{2}\left( d^{cg}_k\right) ^TJ(\xi )^TJ(\xi )d^{cg}_k\le \frac{M_1^2}{2}\Vert d^{cg}_k\Vert ^2. \end{aligned}$$
(17)

Then, dividing both sides of (16) by \(\alpha >0\) and using (17), we get

$$\begin{aligned} \begin{aligned} \gamma g_k^Td^{cg}_k&< g_k^Td^{cg}_k+\frac{1}{2}\alpha \left( d^{cg}_k\right) ^TJ(\xi )^TJ(\xi )d^{cg}_k\\&< g_k^Td^{cg}_k+\frac{1}{2}M_1^2\alpha \Vert d^{cg}_k\Vert ^2,\\ \end{aligned} \end{aligned}$$

leading to

$$\begin{aligned} -(1-\gamma )g_k^Td^{cg}_k<\frac{1}{2}\alpha M_1^2\Vert d^{cg}_k\Vert ^2. \end{aligned}$$

Assumptions (A2) and (A4), together with the above inequality, result in

$$\begin{aligned} (1-\gamma )M_0\Vert F_k\Vert \left( M_0\Vert F_k\Vert +\beta _{\min }\zeta \min \left[ \varDelta _k,\frac{M_0\Vert F_k\Vert }{M_1^2}\right] \right) <\frac{M_1^2}{2}\frac{\alpha _k}{\sigma }\Vert d^{cg}_k\Vert ^2. \end{aligned}$$

Recalling that \(\Vert d^{cg}_k\Vert \le {{\widetilde{\kappa }}}\Vert g_k\Vert {\mathop {\le }\limits ^{(A2)}} {{\widetilde{\kappa }}} M_1\Vert F_k\Vert \), we get

$$\begin{aligned} \alpha _k> & {} \frac{2\sigma (1-\gamma )M_0\Vert F_k\Vert \left( M_0\Vert F_k\Vert +\beta _{\min }\zeta \min \left[ \varDelta _k,\frac{M_0\Vert F_k\Vert }{M_1^2}\right] \right) }{M_1^2\Vert d^{cg}_k\Vert ^2}\\\ge & {} \frac{2\sigma (1-\gamma )M_0\Vert F_k\Vert \left( M_0\Vert F_k\Vert +\beta _{\min }\zeta \min \left[ \varDelta _k,\frac{M_0\Vert F_k\Vert }{M_1^2}\right] \right) }{M_1^2({{\widetilde{\kappa }}} M_1\Vert F_k\Vert )^2}\\\ge & {} \frac{2\sigma (1-\gamma )M_0\left( M_0\Vert F_k\Vert +\beta _{\min }\zeta \min \left[ \varDelta _k,\frac{M_0\Vert F_k\Vert }{M_1^2}\right] \right) }{M_1^3{{\widetilde{\kappa }}}^2 \Vert F_k\Vert }\\ \end{aligned}$$

which completes the proof. \(\square \)

Lemma 2

Suppose that the sequence \(\{x_k\}_{k\ge 0}\) is generated by CGTR. Then, the BT loop in CGTR is well-defined.

Proof

For \(k\in {\mathcal {I}}_2\), we show that the backtracking loop terminates in a finite number of steps. Let us, by contradiction, assume that there exists \(k\in {\mathcal {I}}_2\) such that

$$\begin{aligned} f\left( x_k+\sigma ^i\alpha _kd^{cg}_k\right) > f_k+\gamma \sigma ^i\alpha _kg_k^Td^{cg}_k, \quad \forall i\in {\mathbb {N}}_0. \end{aligned}$$
(18)

After rewriting (18) as

$$\begin{aligned} \frac{f(x_k+\sigma ^i\alpha _kd^{cg}_k)-f_k}{\sigma ^i\alpha _k}>\gamma g_k^Td^{cg}_k , \quad \forall i\in {\mathbb {N}}_0, \end{aligned}$$

taking the limit as \(i\rightarrow \infty \) leads to

$$\begin{aligned} g_k^Td^{cg}_k\ge \gamma g_k^Td^{cg}_k, \end{aligned}$$

because f is a differentiable function. Since \(\gamma \in [\frac{1}{2},1)\), we obtain \(g_k^Td^{cg}_k\ge 0\), which contradicts (14). \(\square \)

Under Assumptions (A1)–(A5), the main global convergence result for CGTR is established by the following theorem.

Theorem 1

Suppose that Assumptions (A1)–(A5) hold. Then Algorithm 2 either stops at a stationary point of f(x) or generates an infinite sequence \(\{x_k\}\) such that

$$\begin{aligned} \lim _{k\rightarrow \infty }\Vert F_k\Vert =0. \end{aligned}$$
(19)

Proof

The proof is by contradiction. Suppose that (19) does not hold, that is, there exist a constant \(\epsilon >0\) and an infinite subset \(K\subseteq {\mathbb {N}}\) such that

$$\begin{aligned} \Vert F_k\Vert >\epsilon ,\quad \text {for all}\; k\in K. \end{aligned}$$
(20)

We consider the following two cases:

Case 1 For \(k\in {\mathcal {I}}_1\). By using (20) in (12) together with (A2), we get

$$\begin{aligned} \begin{aligned} f_k-f(x_k+d_k)&\ge \mu _1 [m_k(x_k)-m_k(x_k+d_k)]\\&\ge \mu _1\zeta M_0 \Vert F_k\Vert \min \left\{ \varDelta _k,\frac{M_0 \Vert F_k\Vert }{M_1^2}\right\} \\&> \mu _1 \zeta M_0 \epsilon \min \left\{ \varDelta _k,\frac{M_0 \epsilon }{M_1^2}\right\} .\\ \end{aligned} \end{aligned}$$
(21)

Since \(\{f_k\}\) is nonincreasing and bounded below, the left-hand side of (21) tends to zero; hence, taking the limit on both sides of (21) leads to

$$\begin{aligned} \lim _{k\rightarrow \infty }\varDelta _k=0. \end{aligned}$$

In other words, there exist \(\delta >0\) and \(m\in {\mathbb {N}}\) such that

$$\begin{aligned} \varDelta _k<\delta ,\quad \forall k\ge m. \end{aligned}$$
(22)

Setting (22) in (21) leads to

$$\begin{aligned} f_k-f(x_k+d_k)\ge \mu _1 \zeta M_0 \epsilon \min \left\{ \delta ,\frac{M_0 \epsilon }{M_1^2}\right\} :=\theta _1>0, \end{aligned}$$

Since the left-hand side tends to zero as \(k\rightarrow \infty \), this implies \(0<\theta _1\le 0\), a contradiction. Hence, the hypothesis (20) cannot hold in this case.

Case 2 For \(k\in {\mathcal {I}}_2\). Lemma 1, together with Assumptions (A2) and (A4), implies

$$\begin{aligned} f(x_k+\alpha _kd^{cg}_k)\le & {} f_k+\gamma \alpha _kg^T_kd^{cg}_k\\\le & {} f_k+\gamma \left( \frac{2\sigma (1-\gamma )M_0\left( M_0\Vert F_k\Vert + \beta _{\min }\zeta \min \left[ \varDelta _k,\frac{M_0\Vert F_k\Vert }{M_1^2}\right] \right) }{M_1^3{\tilde{\kappa }}^2 \Vert F_k\Vert }\right) \\&\quad \times \left( -\zeta M_0\Vert F_k\Vert \min \left[ \varDelta _k,\frac{M_0\Vert F_k\Vert }{M_1^2}\right] \right) \\&{\mathop {\le }\limits ^{(20)}} f_k-\frac{2\sigma \gamma (1-\gamma )\zeta M_0^2}{M_1^3{\tilde{\kappa }}^2}\left( M_0\epsilon + \beta _{\min }\zeta \min \left[ \varDelta _k,\frac{M_0\epsilon }{M_1^2}\right] \right) \left( \min \left[ \varDelta _k,\frac{M_0\epsilon }{M_1^2}\right] \right) .\\ \end{aligned}$$

Similar to Case 1, (22) holds. Hence, as \(k\rightarrow \infty \), the above inequality yields

$$\begin{aligned} 0\ge \frac{2\sigma \gamma (1-\gamma )\zeta M_0^2}{M_1^3{\tilde{\kappa }}^2}\left( M_0\epsilon + \beta _{\min }\zeta \min \left[ \delta ,\frac{M_0\epsilon }{M_1^2}\right] \right) \left( \min \left[ \delta ,\frac{M_0\epsilon }{M_1^2}\right] \right) :=\theta _2>0, \end{aligned}$$

which is a contradiction. Hence, our original assertion (20) must be false, and (19) follows. \(\square \)

Now, to investigate the quadratic convergence of the new method, we present the following assumption.

(A5) The matrix J(x) is Lipschitz continuous in \(L(x_0)\), with Lipschitz constant \(\gamma _L\).

Since \(d_k^{cg}\) satisfies (14), the q-quadratic convergence rate of the sequence generated by CGTR can be established, under some standard assumptions, in a manner similar to Theorem 2 in [3].

Table 1 List of test functions

5 Numerical experiments

In this section, we compare five different solvers on a set of test problems. First, four variants of the new algorithm are introduced as follows:

  • CGFRTR: The trust-region algorithm combined with conjugate gradient method proposed by Fletcher and Reeves [15];

  • CGHSTR: The trust-region algorithm combined with conjugate gradient method proposed by Hestenes and Stiefel [18];

  • CGHZTR: The trust-region algorithm combined with conjugate gradient method proposed by Hager and Zhang [17];

  • CGDYTR: The trust-region algorithm combined with conjugate gradient method proposed by Dai and Yuan [8].

Then, we compare CGFRTR, CGHSTR, CGHZTR and CGDYTR with the traditional trust-region algorithm (TTR) proposed by Conn et al. [7].

Fig. 1 Iterates performance profile for the presented algorithms

Fig. 2 Function evaluations performance profile for the presented algorithms

Fig. 3 CPU time performance profile for the presented algorithms

All algorithms are run on a collection of systems of nonlinear equations with dimensions from 2 to 1000, selected from a wide range of the literature. We have tested our implementation on the set of test functions from [22] (problems 1–28), [20] (problems 29–55) and [23] (problems 56–62), respectively. Table 1 provides the name and dimension of each test problem. All codes are written in the MATLAB 15 programming environment with double precision format in the same subroutine. All algorithms are stopped if

$$\begin{aligned} \Vert F_k\Vert \le 10^{-6}\sqrt{n}, \end{aligned}$$

which is the main termination criterion. Otherwise, we count the corresponding test run as a failure if the total number of iterations exceeds 1000. We use the Steihaug-Toint procedure to solve the trust-region subproblems of the proposed algorithms, which terminates at \(x_k+d\) if

$$\begin{aligned} \Vert \nabla m_k(x_k+d)\Vert \le \min \left\{ \frac{1}{10},\Vert \nabla m_k(x_k)\Vert ^{\frac{1}{2}}\right\} \Vert \nabla m_k(x_k)\Vert . \end{aligned}$$
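For completeness, a compact sketch of a Steihaug-Toint truncated CG solver for (3) with the termination rule above, using the Gauss-Newton model matrix \(B_k=J_k^TJ_k\); the function names and the iteration cap are assumptions of this sketch.

```python
import numpy as np

def steihaug_toint(J, Fx, delta, max_cg=None):
    # Approximately solve (3): min 0.5*||F_k + J_k d||^2  s.t. ||d|| <= delta,
    # i.e. CG on B d = -g with B = J^T J, g = J^T F_k, truncated at the boundary.
    g = J.T @ Fx
    gnorm = np.linalg.norm(g)
    if gnorm == 0.0:
        return np.zeros_like(g)                 # x_k is already stationary for m_k
    tol = min(0.1, np.sqrt(gnorm)) * gnorm      # termination rule quoted above
    d = np.zeros_like(g)
    r, p = g.copy(), -g.copy()                  # r_j = grad m_k(x_k + d_j)
    max_cg = max_cg or 2 * g.size
    for _ in range(max_cg):
        Bp = J.T @ (J @ p)
        pBp = p @ Bp
        if pBp <= 0:                            # zero curvature of the Gauss-Newton model
            return d + _to_boundary(d, p, delta) * p
        alpha = (r @ r) / pBp
        if np.linalg.norm(d + alpha * p) >= delta:
            return d + _to_boundary(d, p, delta) * p   # step hits the boundary
        d = d + alpha * p
        r_new = r + alpha * Bp
        if np.linalg.norm(r_new) <= tol:
            return d
        p = -r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return d

def _to_boundary(d, p, delta):
    # Positive tau with ||d + tau*p|| = delta.
    dp, pp, dd = d @ p, p @ p, d @ d
    return (-dp + np.sqrt(dp * dp + pp * (delta * delta - dd))) / pp
```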

A finite-difference formula is used to approximate the Jacobian matrix \(J_k\), as follows

$$\begin{aligned}{}[J_k]_{\cdot j}\approx \frac{1}{h_j}(F(x_k+h_je_j)-F_k), \end{aligned}$$

where \([J_k]_{\cdot j}\) denotes the jth column of \(J_k\), \(e_j\) is the jth vector of the canonical basis and

$$\begin{aligned} h_j:=\left\{ \begin{array}{ll} \sqrt{\epsilon _m} &{} \quad \hbox {if }\;x_{k_j}=0, \\ \sqrt{\epsilon _m} \text {sign}(x_{k_j})\max \{|x_{k_j}|,\frac{\Vert x_k\Vert _1}{n}\} , &{} \quad \hbox {else.} \\ \end{array} \right. \end{aligned}$$

in which \(\epsilon _m\) denotes the machine epsilon provided by the Matlab function eps. For all algorithms, we choose \(\varDelta _0=1\) (see [25]) and employ the parameters \(\mu _1=0.1\), \(\mu _2=0.9\), \(c_1=0.25\) and \(c_2=0.3\). In addition, \(\varDelta _{k}\) is updated, as in [24], by the following formula

$$\begin{aligned} \varDelta _{k+1}= \left\{ \begin{array}{ll} c_1\Vert d_k\Vert , \ \ &{} \quad \hbox {if }\; r_k < \mu _1, \\ \varDelta _k, \ \ &{} \quad \hbox {if }\; \mu _1 \le r_k \le \mu _2, \\ c_2 \varDelta _k, \ \ &{} \quad \hbox {if }\; r_k \ge \mu _2. \\ \end{array} \right. \\ \end{aligned}$$
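A brief sketch of the finite-difference Jacobian and the radius update just described; the column-wise loop is a direct reading of the formulas above, and the function names are assumptions.

```python
import numpy as np

def fd_jacobian(F, x, Fx=None):
    # Column-wise forward differences: [J]_{.,j} ~ (F(x + h_j e_j) - F(x)) / h_j.
    eps_m = np.finfo(float).eps
    Fx = F(x) if Fx is None else Fx
    n = x.size
    J = np.empty((Fx.size, n))
    for j in range(n):
        if x[j] == 0.0:
            h = np.sqrt(eps_m)
        else:
            h = np.sqrt(eps_m) * np.sign(x[j]) * max(abs(x[j]),
                                                     np.linalg.norm(x, 1) / n)
        e = np.zeros(n); e[j] = 1.0
        J[:, j] = (F(x + h * e) - Fx) / h
    return J

def update_radius(delta, r, d_norm, mu1=0.1, mu2=0.9, c1=0.25, c2=0.3):
    # Radius update rule quoted above, with the stated parameter values.
    if r < mu1:
        return c1 * d_norm
    if r >= mu2:
        return c2 * delta
    return delta
```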

The performance profiles of all algorithms, for the number of (successful) iterations (\(N_i\)), the number of function evaluations (\(N_f\)) and the CPU time (\(C_t\)), are given in Figs. 1, 2 and 3, respectively. In these figures, P designates the percentage of problems that are solved within a factor \(\tau \) of the best solver, cf. [9].
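As a reference for how Figs. 1–3 are read, a small sketch of the performance-profile computation of [9]: for each solver, the profile value at \(\tau \) is the fraction of problems on which that solver's cost is within a factor \(\tau \) of the best cost. Treating failed runs as infinite cost is an assumption of this sketch.

```python
import numpy as np

def performance_profile(costs, taus):
    # costs: (n_problems, n_solvers) array of N_i, N_f or C_t; np.inf marks a failure.
    costs = np.asarray(costs, dtype=float)
    best = costs.min(axis=1, keepdims=True)          # best solver per problem
    ratios = costs / best                            # performance ratios r_{p,s}
    # row per tau, column per solver: fraction of problems with r_{p,s} <= tau
    return np.array([[np.mean(ratios[:, s] <= t) for s in range(costs.shape[1])]
                     for t in taus])
```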

Figure 1 shows that CGDYTR, CGHSTR and CGHZTR are the best solvers, in terms of the number of iterations, on 92%, 90% and 90% of the problems, respectively. In Fig. 2, it can be seen that CGHSTR, CGHZTR and CGDYTR are the best solvers on approximately 95%, 95% and 88% of the test problems. Figure 3 shows that CGDYTR is the best solver, in terms of CPU time, on 30% of the problems. These results show that the proposed algorithms are efficient for solving systems of nonlinear equations.

6 Concluding remarks

We presented a new CG trust-region strategy to solve nonlinear systems. In order to avoid re-solving the trust-region subproblem, which is computationally expensive, we incorporated a family of CG methods into the trust-region algorithm such that the CG step is combined with the direction generated by the Steihaug-Toint procedure, which uses Hessian information. The global and q-quadratic convergence properties of CGTR were established. Numerical results on a set of nonlinear systems show that CGDYTR is the best solver for solving nonlinear systems.