Abstract
In this investigation, a new two-step Newton method for solving convex unconstrained optimization problems is developed. The proposed method is based on Traub’s iterative scheme (Iterative methods for the solution of equations, Prentice Hall, Englewood Cliffs, 1964), which is extended to n variables. The presented two-step algorithm is a modification of the Newton method for solving unconstrained optimization problems. The convergence analysis for this iterative algorithm is established under suitable conditions. Various numerical examples are given to illustrate the efficiency and performance of the newly suggested method.
1 Introduction
The general form of optimization problems can be written as follows:
where E and I are the index sets of equality and inequality constraints, respectively, and \(c_{i}(x), (i = 1,\ldots ,m,~ i \in E \cup I)\) are the constraint functions. When both the objective function and the constraint functions are linear, the problem is called linear programming; otherwise, it is called nonlinear programming. A problem that involves no equality or inequality constraints is said to be an unconstrained optimization problem. We now consider an unconstrained optimization problem
where \(f: \mathbb {R}^n \rightarrow \mathbb {R}\) is a smooth, continuously differentiable function. It is not easy to find a global minimizer of f(x), because our knowledge of the objective function is usually only local. Therefore, most algorithms are able to find only a local minimizer, that is, a point that achieves the smallest value of f(x) in its neighborhood [2]. In other words, a point \(x^{*}\) is a local minimizer if there is a neighborhood \(\mathcal {N}\) of \(x^{*}\) (an open set that contains \(x^{*}\)) such that \(f(x^ {*} ) \le f(x)\) for all \(x\in \mathcal {N}\). Most numerical methods for unconstrained optimization can be classified into two groups: line search algorithms and trust region algorithms. There are many useful algorithms for solving problem (2), such as the conjugate gradient methods, the trust region methods, the quasi-Newton methods, the classical Newton method, the Nelder–Mead simplex method for problems with noisy functions, and the Levenberg–Marquardt method [2,3,4]. Among the methods mentioned above, the classical Newton method is well known for its fast convergence. There are several modifications of the Newton method for unconstrained minimization that achieve global and local convergence; see [2, 4] and the references therein. In the Newton method, positive definiteness of the Hessian matrix of the objective function is an essential condition for obtaining the local minimum and fast local convergence. Zhou and Toh [5] introduced a new algorithm for monotone equations and showed its superlinear convergence under a local error-bound assumption that is weaker than the standard nonsingularity condition. A new trust region method for nonlinear equations with the trust region radius converging to zero is proposed in [6], and its convergence under some weak conditions is established. Li et al. 
[7] obtained two regularized Newton methods for convex minimization problems in which the Hessian at solutions may be singular, and showed that if the objective function is in LC\(^2\), then the methods possess local quadratic convergence under a local error bound condition without the requirement of isolated nonsingular solutions. Zhou and Chen [8] proposed a modified regularized Newton method for convex minimization problems whose Hessian matrices may be singular. Nesterov and Polyak [9] proposed a cubic regularization of the Newton method; at each iteration, it requires solving an unconstrained minimization problem. Nesterov and Nemirovsky [10] presented the class of self-concordant functions, that is, three times differentiable convex functions whose second and third derivatives satisfy a particular condition at each point. Polyak [11] introduced the regularized Newton method (RNM) for unconstrained convex optimization; for any convex function with a bounded optimal set, the RNM generates a sequence that converges to the optimal set from any starting point. Dehghan Niri et al. [12] introduced a new modification of the Newton method to solve unconstrained optimization problems. They also used a new improved Newton method in a trust region algorithm for unconstrained minimization problems and analyzed its local and global convergence [13]. Recently, Dehghan Niri et al. [14] proposed a new regularized Newton method based on the Q.I.F. method to solve optimization problems. One application of the above algorithms is the solution of systems of nonlinear equations; for example, the systems of nonlinear equations appearing in [15,16,17,18,19] can be solved by using these methods.
In this paper, we introduce a new algorithm to solve convex unconstrained optimization problems. The organization of the paper is as follows. In Sect. 2, a new algorithm for solving unconstrained minimization problems is presented. Its convergence analysis is given in Sect. 3. Numerical results comparing the new proposed method with other algorithms are reported in Sect. 4, and conclusions are drawn in Sect. 5.
2 Description of the method
In this section, we give a brief review of the Newton method for unconstrained optimization problems, as described in [2]. Consider the minimization problem,
where \(f:\mathbb {R}^n \longrightarrow \mathbb {R}\) is a smooth, twice continuously differentiable function. The gradient \(\nabla f(x)\) and the Hessian matrix \(\nabla ^{2} f(x)\) are denoted by g(x) and H(x), respectively. A line search method computes, at each iteration, a search direction \(p_{k}\) and then decides how far to move along that direction. The iterations are as follows:
where the positive scalar \( \alpha _{k} \) is called the step length. The success of a line search method depends on the appropriate selection of both the step length \( \alpha _{k} \) and the direction \(p_{k}\). Most line search methods require \(p_{k}\) to be a descent direction, because this property guarantees that the function f(x) can be reduced along that direction. The search direction can be defined as:
where \( B_{k} \) is a symmetric and nonsingular matrix.
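When \(B_{k}\) is symmetric positive definite, the resulting direction \(p_{k}=-B_{k}^{-1}\nabla f_{k}\) is automatically a descent direction, since \(\nabla f_{k}^{T}p_{k}=-\nabla f_{k}^{T}B_{k}^{-1}\nabla f_{k}<0\). A minimal numerical sketch (the matrix and gradient below are hypothetical values chosen only for illustration):

```python
import numpy as np

B = np.array([[4.0, 1.0],
              [1.0, 3.0]])        # hypothetical symmetric positive definite B_k
g = np.array([1.0, -2.0])         # hypothetical gradient at x_k
p = -np.linalg.solve(B, g)        # search direction p_k = -B_k^{-1} g_k
directional_derivative = g @ p    # g_k^T p_k, negative for a descent direction
```

Solving the linear system is preferred to forming \(B_{k}^{-1}\) explicitly, both for cost and numerical stability.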
The Newton method is a powerful technique for solving nonlinear equations; it is an application of Taylor polynomials for finding roots of functions. The Newton method is an iterative method for finding a simple root \(x^{*}\) of the nonlinear equation \(f(x)=0\), i.e., \(f(x^{*})=0,~f'(x^{*})\ne 0\), by using
that converges quadratically in some neighborhood of \(x^{*}\) [1, 20,21,22]. Newton method can also be used to find a minimum or maximum of a function. The derivative is zero at a minimum or maximum, so minima and maxima can be found by applying Newton method to the derivative. The iteration becomes:
The above method can be generalized to several dimensions by replacing the derivative with the gradient, \( \nabla f(x_{k})\), and the reciprocal of the second derivative with the inverse of the Hessian matrix, \( \nabla ^{2}f(x_{k}) \). Thus, we have the following iterative formula
In the line search Newton method, \( B_{k} \) is the Hessian matrix \( \nabla ^{2}f(x_{k}) \). If the Hessian matrix is not positive definite, or is close to being singular, then we can modify this matrix before or during the solution process. Following is a general description of this method.
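As a concrete illustration, here is a minimal sketch of the pure Newton iteration \(x_{k+1}=x_{k}-(\nabla ^{2}f(x_{k}))^{-1}\nabla f(x_{k})\), with no Hessian modification \(E_{k}\); the test function is a hypothetical example, not one from the paper's test set:

```python
import numpy as np

def newton_minimize(grad, hess, x0, tol=1e-10, max_iter=50):
    """Pure Newton iteration: x <- x - H(x)^{-1} g(x), stopped on ||g||."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x = x - np.linalg.solve(hess(x), g)   # solve H p = g rather than invert H
    return x

# Hypothetical test function f(x, y) = (x - 1)^2 + (y + 2)^4 with minimizer (1, -2).
grad = lambda x: np.array([2 * (x[0] - 1), 4 * (x[1] + 2) ** 3])
hess = lambda x: np.array([[2.0, 0.0], [0.0, 12 * (x[1] + 2) ** 2]])
x_star = newton_minimize(grad, hess, [3.0, 0.0])
```

In the quadratic coordinate the iteration lands on the minimizer in one step; in the quartic coordinate, where the Hessian degenerates at the solution, it converges only linearly with ratio 2/3, which illustrates why nearly singular Hessians motivate the modifications discussed above.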
The choice of the Hessian modification \( E_{k}\) is crucial to the effectiveness of the method. The modified Newton method for a multiple root \(x^{*}\) of multiplicity m, i.e., \(f^{(j) }(x^{*})=0,~j=0,1,\ldots ,m-1\) and \(f^{(m)}(x^{*})\ne 0\), is quadratically convergent and is written as:
If the multiplicity m is unknown, the standard Newton method converges only linearly, with rate \(\frac{(m-1)}{m}\) [23]. Traub [1] used the transformation \(u(x)=\frac{f(x)}{f'(x)}\) instead of f(x) for computing a multiple root of \(f(x)=0\). The problem of finding a multiple root is thereby reduced to the problem of finding a simple root of the transformed equation \(u(x)=0\), so any iterative method can be applied while preserving its original order of convergence. Applying the standard Newton method (6) to \(u(x)=0\), we obtain
This method can be extended to n-variable functions \((f:\mathbb {R}^{n} \rightarrow \mathbb {R})\) as
Now, we propose a two-step algorithm to solve unconstrained optimization problems by using methods (8) and (11).
Clearly, the matrix \( (g_{k}g_{k}^{T}-f_{k}H_{k})\) is symmetric. Furthermore, this algorithm requires no step-length computation: \( \alpha _{k}=1\) is used at each iteration.
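To see where the matrix \((g_{k}g_{k}^{T}-f_{k}H_{k})\) comes from, note that applying Newton's method to \(u=f/f'\) in one variable gives \(x-\frac{f f'}{(f')^{2}-f f''}\); replacing \(f'\) by the gradient and \(f''\) by the Hessian leads to a step built from exactly this matrix. The following is a hedged sketch of that single step (the test function is a hypothetical example; for a double root, \(u\) is linear, so one step lands exactly on the root):

```python
import numpy as np

def traub_step(f, grad, hess, x):
    """One n-variable Traub-type step: x - (g g^T - f H)^{-1} (f g)."""
    g, H, fx = grad(x), hess(x), f(x)
    M = np.outer(g, g) - fx * H        # the symmetric matrix g g^T - f H
    return x - np.linalg.solve(M, fx * g)

# Hypothetical 1-variable check: f(x) = (x - 2)^2 has a double root at 2, and
# u(x) = f/f' = (x - 2)/2 is linear, so a single step reaches the root exactly.
f = lambda x: (x[0] - 2.0) ** 2
grad = lambda x: np.array([2 * (x[0] - 2.0)])
hess = lambda x: np.array([[2.0]])
x_new = traub_step(f, grad, hess, np.array([5.0]))
```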
3 Convergence analysis
In this section, we study the convergence of the proposed method. The following assumptions are imposed throughout the paper.
Assumption 3.1
- (A1):
  Let \(f:\Lambda \subset \mathbb {R}^{n} \rightarrow \mathbb {R}\) be defined on the bounded and closed set \(\Lambda \). Suppose f is twice continuously differentiable and let \(\overline{\Lambda } (x_{0})\) denote the closure of the level set
  $$\begin{aligned} \Lambda (x_{0})=\lbrace x: x\in \Lambda , f(x)\le f(x_{0}) \rbrace . \end{aligned}$$(12)
- (A2):
  \(\nabla f(x)\) and \(\nabla ^{2}f (x)\) are both Lipschitz continuous; that is, there exist constants \(L_{1} > 0\) and \(L_{2} > 0\) such that
  $$\begin{aligned} \Vert \nabla f(x)-\nabla f(y)\Vert \le L_{1} \Vert x-y\Vert ,~~~ x, y\in \mathbb {R}^{n}, \end{aligned}$$(13)
  and
  $$\begin{aligned} \Vert \nabla ^{2}f (x)-\nabla ^{2}f (y)\Vert \le L_{2} \Vert x-y\Vert ,~~~ x, y\in \mathbb {R}^{n}. \end{aligned}$$(14)
- (A3):
  \( 2(\nabla f_{k}^{T}(f_{k} \nabla ^{2}f_{k})^{-1} \nabla f_{k})\le 1. \)
- (A4):
  \( L= \max \{L_{1}, L_{2}\} \) and \( \gamma =L\Vert (\nabla ^{2} f(x^{*}))^{-1}\Vert < 1\).
- (A5):
  \(\theta \in \mathbb {R} \) and \(\vert \theta \vert < \frac{1-\gamma }{6\gamma }. \)
Theorem 3.2
Suppose A is a nonsingular \(N \times N\) matrix, U is \(N \times M\), and V is \(M \times N\). Then \(A + UV\) is nonsingular if and only if \(I + V A^{-1} U\) is a nonsingular \(M \times M\) matrix. If this is the case, then
This is the Sherman–Morrison–Woodbury formula [3, 24, 25]. See [3] for further generalizations.
Proposition 3.3
[3] Let B be a nonsingular \(n \times n\) matrix and let \( u, v \in \mathbb {R}^n. \) Then \(B + uv^{T}\) is invertible if and only if \(1 + v^{T}B^{-1}u \ne 0.\) In this case,
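The rank-one formula is easy to verify numerically; a quick sketch with random data (illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
B = np.eye(3) + 0.1 * rng.standard_normal((3, 3))    # nonsingular (close to I)
u = rng.standard_normal(3)
v = rng.standard_normal(3)

Binv = np.linalg.inv(B)
denom = 1.0 + v @ Binv @ u                           # must be nonzero
lhs = np.linalg.inv(B + np.outer(u, v))              # direct inverse
rhs = Binv - (Binv @ np.outer(u, v) @ Binv) / denom  # Sherman-Morrison update
err = np.max(np.abs(lhs - rhs))
```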
Lemma 3.4
Suppose that Assumption 3.1(A1) and (A3) hold. Then
- (I):
\( \Big | \frac{\nabla f_{k}^{T} (-f_{k}\nabla ^{2}f_{k})^{-1}\nabla f_{k}}{1+\nabla f_{k}^{T}(-f_{k}\nabla ^{2}f_{k})^{-1}\nabla f_{k}} \Big | \le 1\).
- (II):
\((-f_{k}\nabla ^{2}f_{k}+ \nabla f_{k} \nabla f_{k}^{T})^{-1}=(-f_{k}\nabla ^{2}f_{k})^{-1}\left( I-\frac{\nabla f_{k} \nabla f_{k}^{T} (-f_{k}\nabla ^{2}f_{k})^{-1}}{1+\nabla f_{k}^{T}(-f_{k}\nabla ^{2}f_{k})^{-1}\nabla f_{k}}\right) \).
Proof
From Assumption 3.1(A3), we have
and hence
According to Theorem 3.2 and Proposition 3.3, we set \( B=-f_{k}\nabla ^{2}f_{k}\) and \(u=v=\nabla f_{k}\). From (17) we obtain
Therefore, the matrix \(B + uv^{T}\) is invertible and we can get
\(\square \)
Theorem 3.5
Suppose that Assumption 3.1(A1)–(A5) hold. Assume that
and
Then the iteration \(x_{k+1} = x_{k} +\theta p_{k}+ (1-\theta ) \tilde{p}_{k}\) generated by Algorithm 2 converges to \(x^{*}\), where \( x_{0} \) is the starting point.
Proof
In this proof, \(f_k\), \(\nabla f_k\), and \(\nabla ^2 f_k\) denote \(f(x_k)\), \(\nabla f(x_k)\), and \(\nabla ^2 f(x_k)\), respectively. By using Lemma 3.4-II,
Without loss of generality, we assume that \( \nabla ^{2} f_{k} \) is a positive definite matrix. According to (22) and (23), we have
and hence
Since
then
Now from Lemma 3.4-I, Assumption 3.1(A2) and (27), we obtain
Since \(\nabla ^{2}f(x^{*})\) is nonsingular and \( \nabla ^{2}f(x_{k})\rightarrow \nabla ^{2}f(x^{*}), \) this implies that \( \Vert (\nabla ^{2} f(x_{k}))^{-1}\Vert \le 2 \Vert (\nabla ^{2} f(x^{*}))^{-1}\Vert \) for all sufficiently large k. Therefore from Assumption 3.1(A4)
Now, for any integer k, either (i) \( \Vert x_{k}-x^{*} \Vert \le 1\) or (ii) \(\Vert x_{k}-x^{*} \Vert > 1\) holds. We consider the following cases:
Case (i): This case implies that
and with the help of (30) we have
If the starting point is sufficiently near \( x^{*}, \) then, by using Assumption 3.1(A5), the sequence \( \lbrace x_{k}\rbrace \) converges to \( x^{*}. \)
Case (ii): This case implies that
and therefore
and, similarly to the previous case, by using Assumption 3.1(A5) the sequence \( \lbrace x_{k}\rbrace \) converges to \( x^{*}. \)\(\square \)
Remark 1
For \( \theta =0\), the proposed method reduces to the standard Newton method.
4 Numerical results
In this section, we report results of numerical experiments with the proposed algorithm. The effectiveness of the proposed method is compared with the classical Newton method and the modified regularized Newton (MRN) method proposed by Zhou and Chen [8]. We set \( \tau =10^{-6} \), and the stopping condition is \( \Vert g(x_{k})\Vert \le 10^{-5}. \) Moreover, we set \( p_{0}=0.0001, p_{1}=0.25, p_{2}=0.75, \mu _{1}=10^{-2} \) and \(m=10^{-8}\) in the MRN method. \(N_f\) denotes the number of objective function evaluations and \(N_g\) the number of gradient evaluations. The codes were written in Matlab 12.0, and the runs were made on a 2.3 GHz PC with 8 GB of memory. The following test functions with standard starting points [26, 27] and [28] are used for the comparison. The obtained numerical results are summarized in Tables 1 and 2. As shown in these tables, the proposed method (PM) is preferable to the classical Newton method (NM) and the MRN method. We used the performance profiles proposed by Dolan and Moré [29] to compare and evaluate each implementation on the test functions. Figures 1, 2, 3, 4, 5 and 6 give the performance profiles of the proposed method with \(\theta =0.25, 0.5\) and 0.75, the MRN method, and the classical Newton method with respect to the number of objective function evaluations \((N_{f})\) and the number of gradient evaluations \((N_{g})\); “Dim” denotes the dimension of the problem. Figures 1, 2 and 3 show the performance of the considered algorithms in terms of the number of function evaluations. From these figures, we observe that for \(\theta =0.25\) and \(\theta =0.5\) the proposed algorithm outperforms the others. Also, according to Figs. 4, 5 and 6, the proposed method with \(\theta =0.25\) performs better than the other methods.
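For readers unfamiliar with Dolan–Moré profiles, the plotted quantity \(\rho _{s}(\tau )\) is the fraction of problems on which solver \(s\) is within a factor \(\tau \) of the best solver. A minimal sketch (the cost matrix below is hypothetical, not the paper's data):

```python
import numpy as np

def performance_profile(costs, taus):
    """costs[i, s]: cost (e.g. N_f) of solver s on problem i.
    Returns rho[s, j]: fraction of problems on which solver s is within
    a factor taus[j] of the best solver on that problem."""
    ratios = costs / costs.min(axis=1, keepdims=True)   # performance ratios
    return np.array([[np.mean(ratios[:, s] <= tau) for tau in taus]
                     for s in range(costs.shape[1])])

# Hypothetical evaluation counts: 3 problems (rows) x 2 solvers (columns).
costs = np.array([[10.0, 12.0],
                  [20.0, 18.0],
                  [15.0, 30.0]])
rho = performance_profile(costs, taus=[1.0, 1.5, 2.0])
```

\(\rho _{s}(1)\) is the fraction of problems on which solver \(s\) is the best, while \(\rho _{s}(\tau )\) for large \(\tau \) measures robustness.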
4.1 Systems of nonlinear equations
In this part, we solve systems of nonlinear equations by using the proposed algorithm. Consider the following nonlinear system of equations
where \(F(x)=(f_{1}(x),f_{2}(x), \ldots , f_{n}(x))\) and \( x \in \mathbb {R}^{n} \). This system can be written in expanded form as
To solve (34) with the proposed algorithm, we set \( f(x)=\sum \nolimits _{i=1}^{n} f_{i}^{^{2}}(x) \). We have tested the proposed method on the following problems. In all problems, the stopping criterion is \( \Vert f(x_{k})\Vert < 10^{-15}\).
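The reformulation can be illustrated with a small hypothetical system (not one of the paper's test problems): a root of \(F\) is exactly a global minimizer of the merit function \(f(x)=\sum _{i}f_{i}^{2}(x)\), at which \(f\) vanishes and its gradient \(2J(x)^{T}F(x)\) is zero:

```python
import numpy as np

def F(x):
    # Hypothetical 2x2 system with roots (3, 0) and (0, 3)
    return np.array([x[0] + x[1] - 3.0, x[0] ** 2 + x[1] ** 2 - 9.0])

def merit(x):
    # f(x) = sum_i f_i(x)^2
    return float(F(x) @ F(x))

def grad_merit(x):
    # Analytic gradient of the merit function: 2 J(x)^T F(x)
    Jx = np.array([[1.0, 1.0], [2 * x[0], 2 * x[1]]])
    return 2.0 * Jx.T @ F(x)

root = np.array([3.0, 0.0])
val_at_root = merit(root)          # 0 exactly at a root of the system
grad_at_root = grad_merit(root)    # stationary point of the merit function
```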
Problem 1
Consider the system of two equations [30]:
with the initial value \( x_{0}=(1.5, 1.5)\) and exact solution \(x^{*} =(0, 0).\)
Problem 2
Consider the system of three equations [30]:
with the initial value \(x_{0}=(-1, 1, -1)\) and exact solution
Problem 3
Consider the following system of \( n=16 \) equations, \( 1\le i \le n-1 \) [31]:
with the initial value \( x_{0}=(-0.85,-0.85, \ldots , -0.85)\) and exact solution
Problem 4
Next, consider a system of 25 equations [32]:
The initial guess is \( x_{0}=(0.2, 0.2, \ldots , 0.2) \) and the root correct up to 16 digits is given by \( x^{*}=(0.2142756147552158,0.2142756147552158, \ldots , 0.2142756147552158).\)
Table 3 reports the results for the considered nonlinear systems solved by the proposed algorithm, the MRN method, and the classical Newton method. From Table 3, we can see that on these problems the proposed method performs better than the other two algorithms.
5 Conclusions
In the present work, we proposed a new algorithm for solving convex unconstrained optimization problems. The proposed method is based on Traub’s iterative algorithm [1], extended to n variables. The local convergence of the proposed method is also established. A notable property of this method is that it requires no step-length computation: \(\alpha _{k}=1\) is used at each iteration. The numerical results and comparisons with other algorithms confirm the efficiency and robustness of our algorithm. In addition, comparing the performance profiles for \(\theta =0.25, 0.5, 0.75\) shows that the numerical results for \(\theta = 0.25\) are better than for the other two values of \(\theta \). The algorithm was also applied to systems of nonlinear equations, where it produced better numerical results than the MRN method and the classical Newton method.
References
Traub, J.F.: Iterative Methods for the Solution of Equations. Prentice Hall, Englewood Cliffs (1964)
Nocedal, J., Wright, S.: Numerical Optimization, 2nd edn. Springer, New York (2006)
Kelley, C.T.: Iterative Methods for Linear and Nonlinear Equations. North Carolina State University, Chapel Hill (1995)
Dennis, J.E., Schnabel, R.B.: Numerical Methods for Unconstrained Optimization and Nonlinear Equations. SIAM, Philadelphia (1996)
Zhou, G., Toh, K.C.: Superlinear convergence of a Newton-type algorithm for monotone equations. J. Optim. Theory Appl. 125, 205–221 (2005)
Fan, J.Y.: Convergence rate of the trust region method for nonlinear equations under local error bound condition. Comput. Optim. Appl. 34, 215–227 (2006)
Li, D.H., Fukushima, M., Qi, L., Yamashita, N.: Regularized Newton methods for convex minimization problems with singular solutions. Comput. Optim. Appl. 28, 131–147 (2004)
Zhou, W., Chen, X.: On the convergence of a modified regularized Newton method for convex optimization with singular solutions. Comput. Appl. Math. 239, 179–188 (2013)
Nesterov, Y., Polyak, B.T.: Cubic regularization of Newton method and its global performance. Math. Program. 108, 177–205 (2006)
Nesterov, Y., Nemirovsky, A.: Interior Point Polynomial Algorithms in Convex Programming. SIAM, Philadelphia (1994)
Polyak, R.A.: Regularized Newton method for unconstrained convex optimization. Math. Program. 120, 125–145 (2009)
Dehghan Niri, T., Hosseini, M.M., Heydari, M.: An efficient improvement of the Newton method for solving nonconvex optimization problems. Comput. Methods Differ. Equ. 7, 69–85 (2019)
Heydari, M., Dehghan Niri, T., Hosseini, M.M.: A new modified trust region algorithm for solving unconstrained optimization problems. J. Math. Ext. 12, 115–135 (2018)
Dehghan Niri, T., Shahzadeh Fazeli, S.A., Hosseini, M.M.: Using a new regularized factorization method for unconstrained optimization problems. Int. J. Numer. Model. Electron. Netw. Devices Fields (2019). https://doi.org/10.1002/jnm.2580
Abu Arqub, O., Al-Smadi, M.: Atangana–Baleanu fractional approach to the solutions of Bagley–Torvik and Painlevé equations in Hilbert space. Chaos Solitons Fractals 117, 161–167 (2018)
Abu Arqub, O., Maayah, B.: Numerical solutions of integrodifferential equations of Fredholm operator type in the sense of the Atangana–Baleanu fractional operator. Chaos Solitons Fractals 117, 117–124 (2018)
Abu Arqub, O.: Solutions of time-fractional Tricomi and Keldysh equations of Dirichlet functions types in Hilbert space. Numer. Methods Partial Differ. Equ. 34, 1759–1780 (2018)
Abu Arqub, O.: Numerical solutions for the Robin time-fractional partial differential equations of heat and fluid flows based on the reproducing kernel algorithm. Int. J. Numer. Methods Heat Fluid Flow 28, 828–856 (2018)
Abu Arqub, O.: Fitted reproducing kernel Hilbert space method for the solutions of some certain classes of time-fractional partial differential equations subject to initial and Neumann boundary conditions. Comput. Math. Appl. 73, 1243–1261 (2017)
Aslam Noor, M.: New classes of iterative methods for nonlinear equations. Appl. Math. Comput. 191, 128–131 (2007)
Aslam Noor, M., Inayat Noor, K., Mohyud-Din, S.T., Shabbir, A.: An iterative method with cubic convergence for nonlinear equations. Appl. Math. Comput. 183, 1249–1255 (2006)
Aslam Noor, M., Asghar Khan, W., Hussain, A.: A new modified Halley method without second derivatives for nonlinear equation. Appl. Math. Comput. 189, 1268–1273 (2007)
Atkinson, K.E.: An Introduction to Numerical Analysis, 2nd edn. Wiley, Singapore (1988)
Duncan, W.J.: Some devices for the solution of large sets of simultaneous linear equations (with an appendix on the reciprocation of partitioned matrices). Lond. Edinb. Dublin Philos. Mag. J. Sci. 35, 660–670 (1944)
Woodbury, M.A.: Inverting Modified Matrices, Memorandum Report 42. Statistical Research Group, Princeton, NJ (1950)
Andrei, N.: Test Functions for Unconstrained Optimization. Academy of Romanian Scientists, Bucharest (2004)
Andrei, N.: An unconstrained optimization test functions collection. Adv. Model. Optim. 10, 147–161 (2008)
Moré, J.J., Garbow, B.S., Hillstrom, K.E.: Testing unconstrained optimization software. ACM Trans. Math. Softw. 7, 17–41 (1981)
Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91, 201–213 (2002)
Sharma, J.R., Gupta, P.: An efficient fifth order method for solving systems of nonlinear equations. Comput. Math. Appl. 67, 591–601 (2014)
Xiao, X., Yin, H.: A new class of methods with higher order of convergence for solving systems of nonlinear equations. Appl. Math. Comput. 264, 300–309 (2015)
Narang, M., Bhatia, S., Kanwar, V.: New two-parameter Chebyshev–Halley-like family of fourth and sixth-order methods for systems of nonlinear equations. Appl. Math. Comput. 275, 394–403 (2016)
Dehghan Niri, T., Shahzadeh Fazeli, S.A. & Heydari, M. A two-step improved Newton method to solve convex unconstrained optimization problems. J. Appl. Math. Comput. 62, 37–53 (2020). https://doi.org/10.1007/s12190-019-01272-z