Abstract
In this paper, we introduce a combination of a family of conjugate gradient (CG) methods with the trust-region method. Whenever the trust-region algorithm is unsuccessful, the family of CG methods is used to avoid re-solving the trust-region subproblem. The computational cost of this family is negligible. Global convergence of the new approach is proved and numerical experiments are reported.
1 Introduction
We introduce a trust-region method combined with conjugate gradient (CG) methods to solve the system of nonlinear equations
where \(F:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}^{n}\) is a continuously differentiable mapping, that is,
for which each function \(F_i:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}\), \(i=1,\ldots ,n\), is smooth. Let us denote by \(x^*\) a solution or root of (1). Under certain circumstances, the nonlinear system (1) can be written as the following unconstrained optimization problem
in which \(\Vert \cdot \Vert \) denotes the Euclidean norm, cf. [24]. There are some popular methods to solve (2) such as [6, 10,11,12,13,14, 16, 21, 25,26,27,28,29].
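To make the reformulation concrete, here is a minimal sketch, assuming the standard least-squares merit function \(f(x)=\frac{1}{2}\Vert F(x)\Vert ^2\) with gradient \(g(x)=J(x)^TF(x)\) (the exact scaling used in (2) is not reproduced in this excerpt):

```python
import numpy as np

def merit_and_gradient(F, J, x):
    """Merit function f(x) = 0.5 * ||F(x)||^2 for the system F(x) = 0,
    together with its gradient g(x) = J(x)^T F(x)."""
    Fx = np.asarray(F(x), dtype=float)
    return 0.5 * Fx @ Fx, np.asarray(J(x), dtype=float).T @ Fx
```

A root \(x^*\) of F is then a global minimizer of f with \(f(x^*)=0\), which is why methods for (2) can be applied to (1).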
Contribution In this paper, we aim to present a derivative-free trust-region algorithm in which a family of CG methods is used whenever iterations of the traditional trust-region (TTR) method are unsuccessful. In addition, the global convergence and q-quadratic convergence of the new algorithm are proved. Numerical results show that the new algorithm is efficient for solving systems of nonlinear equations.
Organization This paper is organized as follows: In Sect. 2, we briefly describe the traditional trust-region method. In the next section, we first introduce a family of CG methods and then describe the structure of the new algorithm. In Sect. 4, the global convergence of the new algorithm is investigated under suitable assumptions. Numerical results are reported in Sect. 5. Finally, some concluding remarks are given in Sect. 6.
2 Review of traditional trust-region method
The trust-region method is an efficient class of global approaches for solving (1). It uses the information gathered about f to produce a quadratic model function \(m_k\) whose behavior close to the current point \(x_k\) is similar to that of the actual objective function f. We now describe the trust-region method in a little more detail. If x is far from \(x_k\), then \(m_k\) may not be a good approximation of f, so the minimization of \(m_k\) is restricted to a region around \(x_k\), as follows
where \(\varDelta _k>0\) is the trust-region radius. In other words, the trust-region method finds a trial step \(d_k\) by computing an approximate solution of the following model subproblem
where \(f_k:=f(x_k)\), \(F_k:=F(x_k)\) and \(J_k:=F'(x_k)\) is an approximation of the Jacobian of F at \(x_k\). In order to update \(\varDelta _k\) and decide whether to accept \(d_k\), the trust-region ratio is defined by
which measures the agreement between the actual reduction in the function f and the reduction predicted by the model. If \(r_k\ge \mu _1\), where \(\mu _1 \in (0,1)\), the iteration is called successful and \(x_{k+1}:=x_k+d_k\). Otherwise, the iteration is called unsuccessful and \(x_{k+1}:=x_k\).
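The acceptance test and radius update just described can be sketched as follows. The parameter values \(\mu _1=0.1\), \(\mu _2=0.9\) and \(c_1=0.25\) follow the settings reported in Sect. 5; the expansion factor 2 for very successful iterations is illustrative, not necessarily the paper's exact rule:

```python
import numpy as np

def tr_update(f, x, d, model_decrease, delta,
              mu1=0.1, mu2=0.9, c1=0.25, delta_max=1e6):
    """One acceptance/radius-update step of a basic trust-region loop.

    r = (f(x) - f(x+d)) / (m(x) - m(x+d)) measures how well the model
    predicted the actual reduction in f."""
    actual = f(x) - f(x + d)
    r = actual / model_decrease
    if r >= mu1:                      # successful: accept the step
        x_new = x + d
        if r >= mu2:                  # very successful: enlarge the region
            delta = min(2.0 * delta, delta_max)
    else:                             # unsuccessful: reject and shrink
        x_new = x
        delta = c1 * delta
    return x_new, delta, r
```

When the model matches f well (\(r_k\) close to 1), the radius grows and longer steps become possible; when the model is poor, the step is rejected and the region shrinks.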
We now present the traditional trust-region algorithm (TTR).
In TTR, the loop from Line 3 to Line 19 is called the outer cycle and the loop from Line 5 to Line 13 is called the inner cycle. In addition, if \(r_k<\mu _1\) (Line 10), then \(\varDelta _k\) is reduced in Line 11; if \(r_k\ge \mu _2\) (Line 15), then \(\varDelta _k\) is increased in Line 16 and the iteration is called very successful. Whenever \(r_k<\mu _1\), TTR solves the trust-region subproblem (3) several times, which increases the computational cost.
3 Motivation and algorithmic structure
It is well known that TTR has some drawbacks, see [1, 2, 7, 10, 28]. One of the main drawbacks is that re-solving the trust-region subproblem whenever iterations are unsuccessful increases the computational cost. To overcome this shortcoming, we can apply low-memory techniques that decrease the total number of iterations without re-solving the trust-region subproblem. Hence, under no circumstances do we want to re-solve the trust-region subproblem when an iteration is unsuccessful.
Conjugate gradient methods are efficient tools for avoiding the re-solution of the trust-region subproblem: they have low computational costs and strong global convergence properties have been established for them. Researchers have introduced various CG methods that require no matrix storage, cf. [8, 15, 17, 18]. These methods produce an iterative sequence \(\{x_k\}_{k\ge 0}\) satisfying \(x_{k+1}:=x_k+\alpha _kd_k\), where \(\alpha _k\) is a step-size and \(d_k\) is a direction defined by
for which \(g_k:=J_k^TF_k\) and \(\beta _k\) is a scalar. Several well-known formulas for \(\beta _k\) are given by
where \(y_{k-1}:=g_k-g_{k-1}\). Now, we introduce a new conjugate gradient method as follows:
in which
\(\beta _k\) can be chosen from formulas (6)–(9), \(0<\beta _{\min }<\beta _{\max }<+\infty \) and \(d_{k}\) is the solution of (3). The direction \(d_k^{cg}\) exploits the advantages of \(d_k\) without any additional computational cost; hence, its computational cost is small.
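Since the displayed formulas (5)–(9) are not reproduced in this excerpt, the following sketch shows the classical update \(d_k=-g_k+\beta _kd_{k-1}\) with the textbook \(\beta \)-formulas of the four methods cited here: Fletcher-Reeves [15], Hestenes-Stiefel [18], Dai-Yuan [8] and Hager-Zhang [17]:

```python
import numpy as np

def beta_fr(g, g_prev, d_prev):
    """Fletcher-Reeves: ||g_k||^2 / ||g_{k-1}||^2."""
    return g @ g / (g_prev @ g_prev)

def beta_hs(g, g_prev, d_prev):
    """Hestenes-Stiefel: g_k^T y_{k-1} / d_{k-1}^T y_{k-1}."""
    y = g - g_prev
    return g @ y / (d_prev @ y)

def beta_dy(g, g_prev, d_prev):
    """Dai-Yuan: ||g_k||^2 / d_{k-1}^T y_{k-1}."""
    y = g - g_prev
    return g @ g / (d_prev @ y)

def beta_hz(g, g_prev, d_prev):
    """Hager-Zhang: (y - 2 d ||y||^2 / d^T y)^T g / d^T y."""
    y = g - g_prev
    dy = d_prev @ y
    return (y - 2.0 * d_prev * (y @ y) / dy) @ g / dy

def cg_direction(g, g_prev, d_prev, beta_rule):
    """d_k = -g_k + beta_k d_{k-1}, with d_0 = -g_0."""
    if d_prev is None:
        return -g
    return -g + beta_rule(g, g_prev, d_prev) * d_prev
```

In the setting of this paper \(g_k:=J_k^TF_k\) and \(y_{k-1}:=g_k-g_{k-1}\); no matrix beyond \(J_k^TF_k\) needs to be stored, which is the low-memory property exploited by the new algorithm.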
To establish the convergence results of CG methods, the step-size \(\alpha _k\) should be obtained by an exact or inexact line search technique, cf. [24]. Here, we use a backtracking (BT) procedure to find the step-size \(\alpha _k\), as follows:
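A minimal sketch of such a backtracking procedure, assuming an Armijo-type sufficient-decrease test \(f(x_k+\alpha d_k^{cg})\le f(x_k)+\gamma \alpha g_k^Td_k^{cg}\) with reduction factor \(\sigma \) (consistent with the use of \(\gamma \in [\frac{1}{2},1)\) and \(\sigma \) in the proofs of Lemmas 1 and 2, though the paper's displayed procedure is not reproduced here):

```python
import numpy as np

def backtrack(f, x, d, g, sigma=0.5, gamma=0.5, max_iter=50):
    """Backtracking line search: return the largest alpha = sigma^i with
    f(x + alpha d) <= f(x) + gamma * alpha * g^T d."""
    fx = f(x)
    gd = g @ d           # directional derivative g_k^T d_k^{cg}
    alpha = 1.0
    for _ in range(max_iter):
        if f(x + alpha * d) <= fx + gamma * alpha * gd:
            return alpha
        alpha *= sigma   # shrink the trial step
    return alpha
```

Because \(d_k^{cg}\) is a descent direction (\(g_k^Td_k^{cg}<0\), see (14)), the loop terminates after finitely many reductions, which is exactly the content of Lemma 2 below.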
We now introduce the conjugate gradient trust-region algorithm to solve (2), as follows:
To prove the global convergence of the new method, we make the following assumptions:
- (A1) :
-
Let \(\varOmega \) be a compact convex set such that \(L(x_0)\subseteq \varOmega \), where the level set \(L(x_0):=\{x\in {\mathbb {R}}^n \mid f(x)\le f(x_{0})\}\) is bounded for any given \(x_{0}\in {\mathbb {R}}^{n}\) and F(x) is continuously differentiable on \(\varOmega \).
- (A2) :
-
The matrix J(x) is bounded and uniformly nonsingular on \(\varOmega \), i.e., there exist constants \(0<M_0\le 1 \le M_1\) such that
$$\begin{aligned} \Vert J(x)\Vert \le M_1 \ \ \text {and} \ \ M_0\Vert F(x)\Vert \le \Vert J(x)^T F(x)\Vert . \end{aligned}$$(11)
- (A3) :
-
The decrease on the model \(m_k\) is at least as much as a fraction of that obtained by the Cauchy point, i.e., there exists a constant \(\zeta \in (0,1)\) such that
$$\begin{aligned} m_k(x_k)-m_k(x_k+d_k) \ge \zeta ~ \Vert g_k\Vert ~ \min \left[ \varDelta _k,\frac{\Vert g_k\Vert }{\Vert J_k^T J_k\Vert }\right] . \end{aligned}$$(12)
- (A4) :
-
To prove strong theoretical results for the proposed algorithm, it is necessary that the step \(d_k\), the solution of (3), satisfies
$$\begin{aligned} g_k^Td_k\le -\zeta \Vert g_k\Vert \min \left[ \varDelta _k,\frac{\Vert g_k\Vert }{\Vert J_k^TJ_k\Vert }\right] , \end{aligned}$$(13)
where \(\zeta \in (0,1)\).
Note that by Assumption (A4) and (10) we get
Let us define two index sets
In other words, \({\mathcal {I}}_2\) contains the iterations of CGTR that use CG methods and \({\mathcal {I}}_1\) contains the iterations of CGTR that do not use them.
4 Convergence theory
We now investigate the global convergence and the q-quadratic convergence rate of CGTR.
Remark 1
Suppose that there exists a constant \(\kappa >0\) such that \(\Vert d_k\Vert \le \kappa \Vert g_k\Vert \). This fact implies that
By Remark 1, the following lemma helps us to establish the global convergence.
Lemma 1
Suppose that the sequence \(\{x_k\}_{k\ge 0}\) is generated by CGTR. Then, for sufficiently large \(k\in {\mathcal {I}}_2\), the step-size \(\alpha _k\) satisfies
Proof
First, let us define \(\alpha :=\alpha _k/\sigma \). By the BT procedure, we have
which can be rewritten as
In addition, by Taylor’s theorem, there exists a \(\xi \in \left[ x_k,x_k+\alpha d^{cg}_k\right] \) such that
On the other hand, Assumption (A2), for any \(\xi \in \left[ x_k,x_k+\alpha d^{cg}_k\right] \), implies that
Then, by canceling \(\alpha \) from both sides of (15), we get
leading to
Assumptions (A2) and (A4), together with the above inequality, yield
By recalling this fact that \(\Vert d^{cg}_k\Vert \le {{\widetilde{\kappa }}}\Vert g_k\Vert {\mathop {\le }\limits ^{(A2)}} {{\widetilde{\kappa }}} M_1\Vert F_k\Vert \), we get
which completes the proof. \(\square \)
Lemma 2
Suppose that the sequence \(\{x_k\}_{k\ge 0}\) is generated by CGTR. Then, the BT loop in CGTR is well-defined.
Proof
For \(k\in {\mathcal {I}}_2\), we show that the backtracking loop terminates in a finite number of steps. Assume, by contradiction, that there exists \(k\in {\mathcal {I}}_2\) such that
After rewriting (18) as
we take the limit as \(i\rightarrow \infty \), which leads to
because f is a differentiable function. Since \(\gamma \in [\frac{1}{2},1)\), we obtain \(g_k^Td^{cg}_k\ge 0\), which contradicts (14). \(\square \)
Under Assumptions (A1)–(A5), the main global convergence result for CGTR is established by the following theorem.
Theorem 1
Suppose that Assumptions (A1)–(A5) hold. Then Algorithm 2 either stops at a stationary point of f(x) or generates an infinite sequence \(\{x_k\}\) such that
Proof
The proof is by contradiction. Assume that the opposite of (19) holds, that is, there exist a constant \(\epsilon >0\) and an infinite subset \(K\subseteq {\mathbb {N}}\) such that
We split the proof into the following two cases:
Case 1 For \(k\in {\mathcal {I}}_1\). Substituting (20) into (12) and using (A2), we get
Taking the limit on both sides of (21) leads to
In other words, there exist a \(\delta >0\) and \(m\in {\mathbb {N}}\) such that
which, as \(k\rightarrow \infty \), yields a contradiction because \(0<\theta _1\le 0\) is impossible. Hence, hypothesis (20) cannot hold in this case.
Case 2 For \(k\in {\mathcal {I}}_2\). Lemma 1, along with Assumptions (A2) and (A4), implies
Similar to Case 1, (22) holds. Hence, as \(k\rightarrow \infty \), the above inequality results in
yielding a contradiction. Hence, our original assertion (20) must be false, yielding (19). \(\square \)
Now, to investigate the quadratic convergence of the new method, we present the following assumption.
(A5) The matrix J(x) is Lipschitz continuous in \(L(x_0)\), with Lipschitz constant \(\gamma _L\).
Since \(d_k^{cg}\) satisfies (14), the quadratic convergence rate of the sequence generated by CGTR can, under some standard assumptions, be established similarly to Theorem 2 in [3].
5 Numerical experiments
In this section, we compare five different solvers on a set of test problems. First, the four CG-based variants of the new algorithm are introduced as follows:
-
CGFRTR: The trust-region algorithm combined with conjugate gradient method proposed by Fletcher and Reeves [15];
-
CGHSTR: The trust-region algorithm combined with conjugate gradient method proposed by Hestenes and Stiefel [18];
-
CGHZTR: The trust-region algorithm combined with conjugate gradient method proposed by Hager and Zhang [17];
-
CGDYTR: The trust-region algorithm combined with conjugate gradient method proposed by Dai and Yuan [8].
Then, we compare CGFRTR, CGHSTR, CGHZTR and CGDYTR with the traditional trust-region algorithm (TTR) proposed by Conn et al. [7].
All algorithms are run on a collection of systems of nonlinear equations with dimensions ranging from 2 to 1000, selected from a wide range of the literature. We have tested our implementation on the set of test functions of [22] (problems 1–28), [20] (problems 29–55) and [23] (problems 56–62), respectively. Table 1 provides the name and dimension of each test problem. All codes are written in the MATLAB 15 programming environment with double-precision format in the same subroutine. All algorithms are stopped if
which is the main termination criterion. Otherwise, we count the corresponding test run as a failure if the total number of iterations exceeds 1000. We use the Steihaug-Toint procedure to solve the trust-region subproblems of the proposed algorithms, which terminates at \(x_k+d\) if
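The Steihaug-Toint procedure is truncated CG applied to the subproblem (3), stepping to the trust-region boundary when negative curvature is encountered or the radius is exceeded. A generic sketch, assuming a residual-based stopping tolerance (the paper's specific termination condition is not reproduced in this excerpt):

```python
import numpy as np

def boundary_tau(d, p, delta):
    """Positive root tau of ||d + tau*p|| = delta."""
    a = p @ p
    b = 2.0 * (d @ p)
    c = d @ d - delta * delta
    return (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

def steihaug(g, B, delta, tol=1e-8):
    """Steihaug-Toint truncated CG for
        min  g^T d + 0.5 d^T B d   s.t.  ||d|| <= delta."""
    n = g.size
    d = np.zeros(n)
    r = g.copy()          # residual of B d + g = 0
    p = -r
    if np.linalg.norm(r) < tol:
        return d
    for _ in range(2 * n):
        Bp = B @ p
        curv = p @ Bp
        if curv <= 0.0:   # negative curvature: go to the boundary
            return d + boundary_tau(d, p, delta) * p
        alpha = (r @ r) / curv
        if np.linalg.norm(d + alpha * p) >= delta:
            return d + boundary_tau(d, p, delta) * p
        d = d + alpha * p
        r_new = r + alpha * Bp
        if np.linalg.norm(r_new) < tol:
            return d
        p = -r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return d
```

For subproblem (3) one takes \(g=J_k^TF_k\) and \(B=J_k^TJ_k\); the procedure needs only matrix-vector products with B, so the full Gauss-Newton matrix never has to be formed.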
The finite-difference formula is used to evaluate the Jacobian matrix \(J_k\), as follows
where \([J_k]_{\cdot j}\) denotes the jth column of \(J_k\), \(e_j\) is the jth vector of the canonical basis and
in which \(\epsilon _m\) denotes the machine epsilon provided by the MATLAB function eps. For all algorithms, we choose \(\varDelta _0=1\) (see [25]) and employ the parameters \(\mu _1=0.1\), \(\mu _2=0.9\), \(c_1=0.25\) and \(c_2=0.3\). In addition, \(\varDelta _{k}\) is updated as in [24] by the following formula
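A sketch of the forward-difference Jacobian approximation described above; the step \(h_j=\sqrt{\epsilon _m}\,\max (|x_j|,1)\) is a common choice consistent with the use of machine epsilon, though the paper's exact step formula is not reproduced in this excerpt:

```python
import numpy as np

def fd_jacobian(F, x):
    """Forward-difference Jacobian: column j is approximately
    (F(x + h_j e_j) - F(x)) / h_j."""
    x = np.asarray(x, dtype=float)
    n = x.size
    Fx = np.asarray(F(x), dtype=float)
    J = np.empty((Fx.size, n))
    sqrt_eps = np.sqrt(np.finfo(float).eps)   # MATLAB's eps equivalent
    for j in range(n):
        h = sqrt_eps * max(abs(x[j]), 1.0)
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (np.asarray(F(x + e), dtype=float) - Fx) / h
    return J
```

Each column costs one extra evaluation of F, so one approximate Jacobian costs n function evaluations per iteration; this is what makes the overall algorithm derivative-free.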
The performance profiles of all algorithms with respect to the number of (successful) iterations (\(N_i\)), the number of function evaluations (\(N_f\)) and the CPU time (\(C_t\)) are given in Figs. 1, 2 and 3, respectively. In these figures, P designates the percentage of problems that are solved within a factor \(\tau \) of the best solver, cf. [9].
Figure 1 shows that CGDYTR, CGHSTR and CGHZTR are the best solvers, in terms of the number of iterations, on 92%, 90% and 90% of the problems, respectively. In Fig. 2, it can be seen that CGHSTR, CGHZTR and CGDYTR are the best solvers on approximately 95%, 95% and 88% of the test problems. Figure 3 shows that CGDYTR is the best solver, in terms of CPU time, on 30% of the problems. These results show that the proposed algorithms are efficient for solving systems of nonlinear equations.
6 Concluding remarks
We presented a new CG trust-region strategy to solve nonlinear systems. In order to avoid re-solving the trust-region subproblem, with its high computational cost, we incorporated a family of CG methods into the trust-region algorithm, combining them with the direction generated in the previous iteration by the Steihaug-Toint procedure, which uses Hessian information. The global and q-quadratic convergence properties of CGTR are established. Numerical results on a set of nonlinear systems show that CGDYTR is the best solver for solving nonlinear systems.
References
Ahookhosh, M., Amini, K.: A nonmonotone trust-region method with adaptive radius for unconstrained optimization. Comput. Math. Appl. 60, 411–422 (2010)
Ahookhosh, M., Amini, K., Peyghami, M.R.: A nonmonotone trust-region line search method for large-scale unconstrained optimization. Appl. Math. Modell. 36, 478–487 (2012)
Amini, K., Shiker Mushtak, A.K., Kimiaei, M.: A line search trust-region algorithm with nonmonotone adaptive radius for a system of nonlinear equations. 4OR-Q. J. Oper. Res. 4(2), 132–152 (2016)
Bellavia, S., Macconi, M., Morini, B.: STRSCNE: a scaled trust-region solver for constrained nonlinear equations. Comput. Optim. Appl. 28, 31–50 (2004)
Bouaricha, A., Schnabel, R.B.: Tensor methods for large sparse systems of nonlinear equations. Math. Program. 82, 377–400 (1998)
Broyden, C.G.: The convergence of an algorithm for solving sparse nonlinear systems. Math. Comput. 25(114), 285–294 (1971)
Conn, A.R., Gould, N.I.M., Toint, PhL: Trust-Region Methods. Society for Industrial and Applied Mathematics SIAM, Philadelphia (2000)
Dai, Y.H., Yuan, Y.: A nonlinear conjugate gradient method with a strong global convergence property. SIAM J. Optim. 10, 177–182 (1999)
Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91, 201–213 (2002)
Esmaeili, H., Kimiaei, M.: A new adaptive trust-region method for system of nonlinear equations. Appl. Math. Model. 38(11–12), 3003–3015 (2014)
Fan, J.Y.: Convergence rate of the trust region method for nonlinear equations under local error bound condition. Comput. Optim. Appl. 34, 215–227 (2005)
Fan, J.Y.: An improved trust region algorithm for nonlinear equations. Comput. Optim. Appl. 48(1), 59–70 (2011)
Fan, J.Y., Pan, J.Y.: A modified trust region algorithm for nonlinear equations with new updating rule of trust region radius. Int. J. Comput. Math. 87(14), 3186–3195 (2010)
Fasano, G., Lampariello, F., Sciandrone, M.: A truncated nonmonotone Gauss-Newton method for large-scale nonlinear least-squares problems. Comput. Optim. Appl. 34(3), 343–358 (2006)
Fletcher, R., Reeves, C.: Function minimization by conjugate gradients. Comput. J. 7, 149–154 (1964)
Grippo, L., Sciandrone, M.: Nonmonotone derivative-free methods for nonlinear equations. Comput. Optim. Appl. 37, 297–328 (2007)
Hager, W.W., Zhang, H.: A new conjugate gradient method with guaranteed descent and an efficient line search. SIAM J. Optim. 16, 170–192 (2005)
Hestenes, M.R., Stiefel, E.L.: Methods of conjugate gradients for solving linear systems. J. Res. Nat. Bur. Stand. 49, 409–436 (1952)
Kimiaei, M., Rahpeymaii, F.: A new nonmonotone line-search trust-region approach for nonlinear systems. TOP (2019). https://doi.org/10.1007/s11750-019-00497-2
La Cruz, W., Martínez, J.M., Raydan, M.: Spectral residual method without gradient information for solving large-scale nonlinear systems of equations: theory and experiments. Technical Report RT-04-08 (2004)
Li, D.H., Fukushima, M.: A derivative-free line search and global convergence of Broyden-like method for nonlinear equations. Optim. Methods Softw. 13, 181–201 (2000)
Lukšan, L., Vlček, J.: Sparse and partially separable test problems for unconstrained and equality constrained optimization, Technical Report. No. 767, (1999)
Moré, J.J., Garbow, B.S., Hillström, K.E.: Testing unconstrained optimization software. ACM Trans. Math. Softw. 7, 17–41 (1981)
Nocedal, J., Wright, S.J.: Numerical Optimization. Springer, New York (2006)
Toint, PhL: Numerical solution of large sets of algebraic nonlinear equations. Math. Comput. 46(173), 175–189 (1986)
Yuan, G., Lu, S., Wei, Z.: A new trust-region method with line search for solving symmetric nonlinear equations. Int. J. Comput. Math. 88(10), 2109–2123 (2011)
Yuan, G.L., Wei, Z.X., Lu, X.W.: A BFGS trust-region method for nonlinear equations. Computing. 92(4), 317–333 (2011)
Yuan, Y.: Trust region algorithm for nonlinear equations. Information. 1, 7–21 (1998)
Zhang, J., Wang, Y.: A new trust region method for nonlinear equations. Math. Methods Oper. Res. 58, 283–298 (2003)
Cite this article
Rahpeymaii, F. An efficient conjugate gradient trust-region approach for systems of nonlinear equation. Afr. Mat. 30, 597–609 (2019). https://doi.org/10.1007/s13370-019-00669-0
Keywords
- Nonlinear equations
- Trust-region framework
- Conjugate gradient
- Global theory
- Derivative-free optimization