1 Introduction

In this paper, we consider the system of nonlinear equations

$$\begin{aligned} F(x)=0, \end{aligned}$$
(1)

where \(F(x):{\mathbb {R}}^n\rightarrow {\mathbb {R}}^n\) is a continuously differentiable function. We assume that the system of nonlinear Eq. (1) has a nonempty solution set \(X_*\); throughout, \(\Vert \cdot \Vert \) denotes the Euclidean 2-norm.

The system of nonlinear equations is a cornerstone of many fields of science, including engineering, finance, and machine learning. Many well-known iterative methods have been developed to find one or more solutions of the system of nonlinear Eq. (1); they may be summarized by two strategies: the line search and the trust region paradigms. A line search strategy computes a descent direction and a suitable stepsize to obtain a new estimate of the minimum point, while a trust region method chooses a trial step within a region around the current iterate in which the model is trusted. Both strategies include globally convergent line search and trust region methods, Newton-based methods, conjugate gradient methods, and the quasi-Newton BFGS, DFP and SR1 methods, which are applicable to systems of nonlinear equations and unconstrained optimization [1,2,3]. Moreover, iterative techniques are widely used not only for solving systems of nonlinear equations and unconstrained optimization problems, but also for solving matrix equations. Some gradient-based and least-squares-based iterative methods have been used to find the solutions of matrix equations that are often encountered in systems and control [4, 5].

Although there are many efficient iterative methods for finding a solution of a system of nonlinear equations, none of them is guaranteed to obtain solutions for all such systems; many iterative methods are specialized for particular classes of problems. Therefore, how to apply numerical methods appropriately is an important problem that has attracted many researchers. According to the characteristics of the nonlinear problem at hand, a suitable iterative method should be chosen for the class of systems of nonlinear equations arising in applied science, and there have been many works on this recently. For instance, in system identification, many identification methods and parameter estimation approaches have been developed for linear systems, bilinear systems [6] and nonlinear systems. Among them, many gradient-based and least-squares-based methods have been widely used for stochastic systems [7, 8], parameter identification [9], dual-rate systems [10, 11], and state space systems [12], as well as many identification methods for Hammerstein nonlinear systems [13].

In this paper, we focus on numerical methods for solving the system of nonlinear equations. Due to the characteristics of different nonlinear problems, many iterative algorithms cannot yet be utilized efficiently, and many improved modifications of the classical iterative methods have therefore been proposed [14,15,16]. Here, we improve the classical Levenberg–Marquardt method for solving the system of nonlinear equations and discuss its convergence under some conditions.

The rest of this paper is organized as follows. In the next section, we propose the modified Levenberg–Marquardt algorithm for the system of nonlinear Eq. (1). Then, in Sect. 3, the convergence rate of the algorithm is discussed under the \(\gamma \)-Hölderian local error bound condition and the v-Hölderian continuity of the Jacobian. In Sect. 4, the global convergence result is given when a line search is used. Some numerical results are reported in Sect. 5. Finally, we conclude the paper in Sect. 6.

2 The modified Levenberg–Marquardt method

One of the most well-known methods for solving the system of nonlinear Eq. (1) is the Levenberg–Marquardt method (hereinafter referred to as the LM method) [17, 18], which computes the following trial step at every iteration

$$\begin{aligned} d_k=-\left( J_k^TJ_k+\lambda _kI\right) ^{-1}J_k^TF_k, \end{aligned}$$
(2)

where \(F_k=F(x_k)\), \(J_k=J(x_k)=F'(x_k)\) is the Jacobian, and \(\lambda _k\) is the LM parameter updated by an appropriate rule. The LM method converges quadratically when the Jacobian J(x) is Lipschitz continuous and nonsingular at a solution. To improve the computational efficiency and theoretical results of the LM method, the choices of the LM parameter and the convergence conditions have been discussed intensively in the recent literature [19].
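For readers who wish to experiment, the trial step (2) amounts to solving the regularized normal equations \(\left( J_k^TJ_k+\lambda _kI\right) d_k=-J_k^TF_k\). The following minimal sketch (our illustration, not part of any published code; all names are ours) computes one such step with NumPy.

```python
import numpy as np

def lm_trial_step(F_k, J_k, lam_k):
    """Solve (J_k^T J_k + lam_k * I) d_k = -J_k^T F_k, i.e. the LM trial step (2)."""
    n = J_k.shape[1]
    A = J_k.T @ J_k + lam_k * np.eye(n)   # symmetric positive definite for lam_k > 0
    return np.linalg.solve(A, -(J_k.T @ F_k))
```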

Let \(N(x_*,b)=\left\{ x\mid \Vert x-x_*\Vert \le b\right\} \subset {\mathbb {R}}^n\), where \(0<b<1\) is a constant, be a neighbourhood of \(x_*\in X_*\). By choosing the LM parameter as \(\lambda _k=\Vert F_k\Vert ^2\), Yamashita and Fukushima [20] showed that the LM method has quadratic convergence under the following local error bound condition, which is weaker than nonsingularity:

$$\begin{aligned} c~\textrm{dist}\left( x,X_*\right) \le \Vert F(x)\Vert ,\quad \forall x\in {N(x_*,b)}, \end{aligned}$$
(3)

where c is a positive constant, and \(\textrm{dist}\left( x,X_*\right) =\inf _{{\bar{x}}\in X_*}\Vert x-{\bar{x}}\Vert \) means the distance from x to \(X_*\). It is obvious that, if \(J(x_*)\) is nonsingular at a solution \(x_*\) of (1), then \(x_*\) is an isolated solution and \(\Vert F(x)\Vert \) provides a local error bound on some neighbourhood of \(x_*\).

Although the local error bound condition (3) is weaker than nonsingularity of the Jacobian \(J_k\), it still fails for some ill-conditioned nonlinear equations arising in application fields such as biochemical systems. The following condition, called the \(\gamma \)-Hölderian local error bound, has therefore been proposed:

$$\begin{aligned} c~\textrm{dist}(x,X_*)\le \Vert F(x)\Vert ^\gamma ,\quad \forall x\in N(x_*,b), \end{aligned}$$
(4)

where \(\gamma \in (0,1]\) and \(c>0\) is a constant. For example, for the Powell singular function [21], the local error bound condition (3) is not valid, but the \(\gamma \)-Hölderian local error bound condition (4) holds with \(\gamma =1/2\) around the origin [22].
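A simpler one-dimensional illustration of the difference between the two conditions: for \(F(x)=x^2\) we have \(X_*=\{0\}\) and \(\textrm{dist}(x,X_*)=|x|\), so no constant \(c>0\) satisfies \(c|x|\le \Vert F(x)\Vert =x^2\) near the origin, i.e. (3) fails, whereas \(\Vert F(x)\Vert ^{1/2}=|x|\) shows that (4) holds with \(\gamma =1/2\) and \(c=1\).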

The \(\gamma \)-Hölderian local error bound condition (4) is obviously a generalization of the local error bound condition (3), which relaxes the exponent \(\gamma \) from 1 to the interval (0, 1]. Recently, under the \(\gamma \)-Hölderian local error bound condition, the convergence of some modified LM methods has been discussed with several choices of the LM parameter [23,24,25,26].

From the definition of the LM trial step (2), it can be seen that the choice of the LM parameter \(\lambda _k\) has a great influence on the computational efficiency of the LM method. Therefore, in order to improve the computational efficiency, many choices of the LM parameter, besides \(\lambda _k=\Vert F_k\Vert ^2\) mentioned above, have been proposed. For example, Fan and Yuan [27] chose \(\lambda _k=\Vert F_k\Vert \), Dennis and Schnabel [28] employed \(\lambda _k=O(\Vert J_k^TF_k\Vert )\), Fischer [29] took \(\lambda _k=\Vert J_k^TF_k\Vert \), and Ma and Jiang [30] proposed the convex combination of the choices \(\Vert F_k\Vert \) and \(\Vert J_k^TF_k\Vert \), i.e. \(\lambda _k=\theta \Vert F_k\Vert +(1-\theta )\Vert J_k^TF_k\Vert \) with \(\theta \in [0,1]\). Other choices have also been given [31,32,33]. To make the LM parameter belong to an interval proportional to \(\Vert F_k\Vert \), some algebraic rules for updating it have also been given [34, 35].

Since a fixed choice of the LM parameter may behave poorly when the sequence \(\{x_k\}\) generated by the LM method is very close to, or very far away from, the solution set, Fan et al. [36] and Chen [37] extended the choice \(\lambda _k=\Vert F_k\Vert ^2\) to \(\lambda _k=\Vert F_k\Vert ^\delta \) with \(\delta \in [1,2]\). Inspired by these observations, we consider the choice

$$\begin{aligned} \lambda _k=\theta \Vert F_k\Vert ^\delta +(1-\theta )\Vert J_k^TF_k\Vert ^\delta , \end{aligned}$$
(5)

where \(\delta \in [1,2]\) and \(\theta \in [0,1]\). It is clear that, if we take \(\delta =1\), the choice (5) reduces to that of Ma and Jiang [30], and if \(\delta =\theta =1\), it reduces to that of Fan and Yuan [27], etc. It can also be viewed as a convex combination of \(\Vert F_k\Vert ^\delta \) and \(\Vert J_k^TF_k\Vert ^\delta \). Therefore, the choice (5) can be regarded as an extension of many other choices.

Now, we consider the following modified LM algorithm, with the LM parameter chosen as in (5), for the system of nonlinear Eq. (1).

[Algorithm 1: the modified Levenberg–Marquardt method with LM parameter (5) and unit step size]
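Since the algorithm box is not reproduced above, the following Python sketch illustrates one way Algorithm 1 can be realised: the LM parameter is computed from (5), the trial step from (2), and the full (unit) step is taken at every iteration, which is the variant analysed in Sect. 3. The function names, the stopping test and the iteration limit are our own illustrative assumptions, not the exact statement of Algorithm 1.

```python
import numpy as np

def modified_lm(F, J, x0, theta=0.5, delta=1.0, tol=1e-6, max_iter=1000):
    """Sketch of Algorithm 1: modified LM iteration with unit step size.

    F : callable, F(x) -> 1-D array (the residual)
    J : callable, J(x) -> 2-D array (the Jacobian)
    The LM parameter follows (5): theta*||F_k||^delta + (1-theta)*||J_k^T F_k||^delta.
    """
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        Fk, Jk = F(x), J(x)
        g = Jk.T @ Fk                                   # J_k^T F_k
        if np.linalg.norm(g) <= tol:                    # stopping rule used in Sect. 5
            break
        lam = (theta * np.linalg.norm(Fk) ** delta
               + (1.0 - theta) * np.linalg.norm(g) ** delta)      # LM parameter (5)
        d = np.linalg.solve(Jk.T @ Jk + lam * np.eye(x.size), -g)  # trial step (2)
        x = x + d                                       # unit step size
    return x, k
```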

To broaden the theoretical applicability of the modified LM method (Algorithm 1), we analyse its convergence under the \(\gamma \)-Hölderian local error bound condition (4) instead of the commonly used local error bound condition (3).

3 Convergence rate

The following assumptions are needed throughout the paper to establish the convergence of Algorithm 1.

Assumption 1

The following conditions hold for some \(x_*\in X_*\).

  1. (a)

    F(x) admits a \(\gamma \)-Hölderian local error bound property in the neighbourhood of \(x_*\in X_*\), where \(\gamma \in (0,1]\), i.e., there exists a positive constant \(c>0\) such that

    $$\begin{aligned} c~\textrm{dist}(x,X_*)\le \Vert F(x)\Vert ^\gamma ,\quad \forall x\in N(x_*,b). \end{aligned}$$
    (6)
  2. (b)

    J(x) is v-Hölderian continuous, where \(v\in (0,1]\), i.e., there exists a constant \(\kappa _1>0\) such that

    $$\begin{aligned} \Vert J(x)-J(y)\Vert \le \kappa _1\Vert x-y\Vert ^v,\quad \forall x,y\in N(x_*,b). \end{aligned}$$
    (7)

Remark 1

If \(\gamma =1\), then the \(\gamma \)-Hölderian local error bound condition (6) will be the local error bound condition (3). If \(v=1\), then the v-Hölderian continuity of the Jacobian (7) will be Lipschitz continuity of the Jacobian.

It follows from Assumption 1 and the mean value theorem that

$$\begin{aligned} \Vert F(y)-F(x)-J(x)(y-x)\Vert&=\left\| \int _0^1J\left( x+t(y-x)\right) (y-x)dt-J(x)(y-x)\right\| \nonumber \\&\le \Vert y-x\Vert \int _0^1\left\| J\left( x+t(y-x)\right) -J(x)\right\| dt\nonumber \\&\le \frac{\kappa _1}{1+v}\Vert y-x\Vert ^{1+v} \quad \hbox { for all}\ x,y\in N(x_*,b). \end{aligned}$$
(8)

It can also be seen that there is a constant \(\kappa _2>0\) such that

$$\begin{aligned} \Vert J(x)\Vert \le \kappa _2,\quad \forall x\in N(x_*,b). \nonumber \\ \Vert F(y)-F(x)\Vert \le \kappa _2\Vert y-x\Vert ,\quad \forall x,y\in N(x_*,b). \end{aligned}$$
(9)

Suppose that Algorithm 1 starts with an initial point \(x_0\) that is sufficiently close to \(x_*\) so that the sequence \(\{x_k\}_{k=0}^{\infty }\) stays in the neighbourhood \(N(x_*,b)\). It is worth emphasizing that \(x_*\) may not be an isolated zero of F(x), and \(x_k\), if it converges, may converge to some other point in \(X_*\). For \(x\in N(x_*,b)\), denote by \({\tilde{x}}\in X_*\) (also written \({\bar{x}}\) in the sequel) a point that satisfies

$$\begin{aligned} \Vert {\tilde{x}}-x\Vert =\textrm{dist}\left( x,X_*\right) . \end{aligned}$$
(10)

Now, we will study the local convergence of our modified LM method, i.e., Algorithm 1.

Define

$$\begin{aligned} d(x)=-\left( J(x)^TJ(x)+\lambda (x)I\right) ^{-1}J(x)^TF(x) \end{aligned}$$
(11)

and

$$\begin{aligned} \lambda (x)=\theta \Vert F(x)\Vert ^{\delta }+(1-\theta )\Vert J(x)^TF(x)\Vert ^{\delta }. \end{aligned}$$
(12)

Lemma 2

Let Assumption 1 hold. If \(v\ge 2(1/\gamma -1)\), then there exist some positive constants \(c_1, c_2\) such that

$$\begin{aligned} c_1\textrm{dist}(x,X^*)^{\delta /\gamma }\le \lambda (x)\le c_2\textrm{dist}(x,X^*)^\delta ,\quad \forall x\in N(x^*,b). \end{aligned}$$
(13)

Proof

First we will show the second inequality in (13). From (9), we have

$$\begin{aligned} \lambda (x)&=\theta \Vert F(x)\Vert ^\delta +(1-\theta )\Vert J(x)^TF(x)\Vert ^\delta \le \left( \theta +(1-\theta )\Vert J(x)\Vert ^\delta \right) \Vert F(x)\Vert ^\delta \nonumber \\&=\left( \theta +(1-\theta )\Vert J(x)\Vert ^\delta \right) \left\| F({\bar{x}})-F(x)\right\| ^\delta \le \left( \theta +(1-\theta )\Vert J(x)\Vert ^\delta \right) \kappa _2^\delta \Vert {\bar{x}}-x\Vert ^\delta \nonumber \\&\le \kappa _2^\delta \left( \theta +(1-\theta )\kappa _2^\delta \right) \Vert {\bar{x}}-x\Vert ^\delta . \end{aligned}$$

This proves the second inequality in (13) with \(c_2=\kappa _2^\delta \left( \theta +(1-\theta )\kappa _2^\delta \right) \).

Now, we show the first inequality in (13). From

$$\begin{aligned} \Vert F(x)\Vert ^2&=F(x)^TF(x)\nonumber \\&=F(x)^T\left( F({\bar{x}})+J(x)(x-{\bar{x}})\right) +F(x)^T\left( F(x)-F({\bar{x}})-J(x)(x-{\bar{x}})\right) , \end{aligned}$$

we have

$$\begin{aligned} F(x)^TJ(x)(x-{\bar{x}})= \Vert F(x)\Vert ^2-F(x)^T\left( F(x)-F({\bar{x}})-J(x)(x-{\bar{x}})\right) . \end{aligned}$$

From the properties of the norm, we get

$$\begin{aligned} \Vert F(x)^TJ(x)(x-{\bar{x}})\Vert \ge \Vert F(x)\Vert ^2-\Vert F(x)^T\Vert ~\Vert F(x)-F({\bar{x}})-J(x)(x-{\bar{x}})\Vert .\nonumber \\ \end{aligned}$$
(14)

Since \(v\ge 2(1/\gamma -1)\), it follows from (6), (8), (9) and (14) that

$$\begin{aligned} \Vert J(x)^TF(x)\Vert ~\Vert {\bar{x}}-x\Vert \ge c^{2/\gamma }\Vert {\bar{x}}-x\Vert ^{2/\gamma }-\frac{\kappa _1\kappa _2}{1+v}\Vert {\bar{x}}-x\Vert ^{2+v}. \end{aligned}$$

Thus, we have

$$\begin{aligned} \Vert J(x)^TF(x)\Vert \ge&c^{2/\gamma }\Vert {\bar{x}}-x\Vert ^{2/\gamma -1}-\frac{\kappa _1\kappa _2}{1+v}\Vert {\bar{x}}-x\Vert ^{1+v} \ge c_3\Vert {\bar{x}}-x\Vert ^{2/\gamma -1} \end{aligned}$$

where \(c_3>0\) is some positive constant. Hence

$$\begin{aligned} \lambda (x)&=\theta \Vert F(x)\Vert ^\delta +(1-\theta )\Vert J(x)^TF(x)\Vert ^\delta \nonumber \\&\ge \theta c^{\delta /\gamma }\Vert {\bar{x}}-x\Vert ^{\delta /\gamma }+c_3(1-\theta )\Vert {\bar{x}}-x\Vert ^{\delta (2/\gamma -1)}\nonumber \\&\ge c_1\Vert {\bar{x}}-x\Vert ^{\delta /\gamma }. \end{aligned}$$

This proves the first inequality in (13) with \(c_1\) being a positive constant. \(\square \)

Lemma 3

Let the assumptions of Lemma 2 hold. Then the following inequality holds with a positive constant \(c_4>0\):

$$\begin{aligned} \Vert d(x)\Vert \le c_4\textrm{dist}(x,X^*)^{\min \{1,1+v-\delta /2\gamma \}},\quad \forall x\in N(x^*,b/2). \end{aligned}$$
(15)

Proof

For all \(x\in N(x^*,b/2)\), we get

$$\begin{aligned} \Vert {\bar{x}}-x^*\Vert \le \Vert {\bar{x}}-x\Vert +\Vert x-x^*\Vert \le b. \end{aligned}$$
(16)

Thus, \({\bar{x}}\in N(x^*,b)\).

Define the convex quadratic function \(\varphi (d)\) as

$$\begin{aligned} \varphi (d)=\Vert F(x)+J(x)d\Vert ^2+\lambda (x)\Vert d\Vert ^2. \end{aligned}$$
(17)

Since \(\nabla \varphi (d)=2\left( J(x)^TJ(x)+\lambda (x)I\right) d+2J(x)^TF(x)\), the trial step d(x) defined in (11) is a minimizer of \(\varphi (d)\). By Assumption 1, Lemma 2, (16) and (17), we get

$$\begin{aligned} \Vert d(x)\Vert ^2&\le \frac{\varphi (d(x))}{\lambda (x)}\le \frac{\varphi ({\bar{x}}-x)}{\lambda (x)} =\frac{\Vert F(x)+J(x)({\bar{x}}-x)\Vert ^2+\lambda (x)\Vert {\bar{x}}-x\Vert ^2}{\lambda (x)}\nonumber \\&=\frac{\Vert F({\bar{x}})-F(x)-J(x)({\bar{x}}-x)\Vert ^2}{\lambda (x)}+\Vert {\bar{x}}-x\Vert ^2\nonumber \\&\le \frac{\kappa _1^2}{c_1(1+v)^2}\Vert {\bar{x}}-x\Vert ^{2(1+v)-\delta /\gamma }+\Vert {\bar{x}}-x\Vert ^2\nonumber \\&\le \left( \frac{\kappa _1^2}{c_1(1+v)^2}+1\right) \Vert {\bar{x}}-x\Vert ^{2\min \{1,1+v-\delta /2\gamma \}}. \end{aligned}$$

Letting \(c_4=\sqrt{\kappa _1^2/\left( c_1(1+v)^2\right) +1}\), we obtain (15). \(\square \)

Lemma 4

Let the assumptions of Lemma 2 hold and assume \(x,x+d(x)\in N(x_*,b/2)\). Then the following inequality holds with a positive constant \(c_5\):

$$\begin{aligned} \textrm{dist}(x+d(x),X_*)\le c_5\textrm{dist}(x,X_*)^{\gamma \min \{1+v,1+\delta /2,2(1+v-\delta /2\gamma )\}}. \end{aligned}$$
(18)

Proof

From the definition (17) of \(\varphi (d)\) and Lemma 2, we have

$$\begin{aligned} \Vert F(x)+J(x)d(x)\Vert ^2&\le \varphi (d(x))\le \varphi ({\tilde{x}}-x)\le \Vert F(x)+J(x)({\tilde{x}}-x)\Vert ^2 +\lambda (x)\Vert {\tilde{x}}-x\Vert ^2\nonumber \\&\le \frac{\kappa _1^2}{(1+v)^2}\Vert {\tilde{x}}-x\Vert ^{2(1+v)}+c_2\Vert {\tilde{x}}-x\Vert ^{2+\delta }\nonumber \\&\le \left( \frac{\kappa _1^2}{(1+v)^2}+c_2\right) \Vert {\tilde{x}}-x\Vert ^{2\min \{1+v,1+\delta /2\}}. \end{aligned}$$

Then we have

$$\begin{aligned} \Vert F(x)+J(x)d(x)\Vert \le c_6\Vert {\tilde{x}}-x\Vert ^{\min \{1+v,1+\delta /2\}}, \end{aligned}$$
(19)

where \(c_6=\sqrt{\kappa _1^2/(1+v)^2+c_2}\) is a positive constant.

It follows from Assumption 1, Lemma 3 and (19) that

$$\begin{aligned} c~\textrm{dist}(x+d(x),X_*)&\le \Vert F(x+d(x))\Vert ^\gamma \le \left( \Vert F(x)+J(x)d(x)\Vert +\kappa _1\Vert d(x)\Vert ^2\right) ^\gamma \\&\le \left( c_6\Vert {\tilde{x}}-x\Vert ^{\min \{1+v,1+\delta /2\}} +\kappa _1c_4^2\Vert {\tilde{x}}-x\Vert ^{2\min \{1,1+v-\delta /2\gamma \}}\right) ^\gamma \\&\le \left( c_6+\kappa _1c_4^2\right) ^\gamma \Vert {\tilde{x}}-x\Vert ^{\gamma \min \{1+v,1+\delta /2,2,2(1+v-\delta /2\gamma )\}}. \end{aligned}$$

Hence the inequality (18) holds with \(c_5=c^{-1}\left( c_6+\kappa _1c_4^2\right) ^\gamma \). \(\square \)

It is clear that, from (11) and (12), we have \(d_k=d(x_k)\), \(\lambda _k=\lambda (x_k)\) and \(x_{k+1}=x_k+d_k\). Then it follows from Lemmas 2, 3 and 4 that, at every iteration,

$$\begin{aligned} c_1\textrm{dist}(x_k,X^*)^{\delta /\gamma }&\le \lambda _k\le c_2\textrm{dist}(x_k,X^*)^\delta ,\\ \Vert d_k\Vert&\le c_4\textrm{dist}(x_k,X^*)^{\min \{1,1+v-\delta /2\gamma \}},\\ \textrm{dist}(x_{k+1},X^*)&\le c_5\textrm{dist}(x_k,X^*)^{\gamma \min \{1+v,1+\delta /2,2(1+v-\delta /2\gamma )\}} \end{aligned}$$

hold provided that \(x_k,x_{k+1}\in N(x_*,b/2)\).

Remark 2

Let

$$\begin{aligned} \alpha =&\min \{\gamma (1+v),\gamma (1+\delta /2),2\gamma (1+v)-\delta \}, \\ \beta =&\min \{1,1+v-\delta /2\gamma \}. \end{aligned}$$

To analyse the convergence rate of the sequence \(\{x_k\}\) generated by Algorithm 1, the values of \(\alpha \) and \(\beta \) appearing in Lemmas 3 and 4 should be discussed. Obviously, a larger value of \(\alpha \) leads to a better convergence rate. Moreover, to deduce a convergence result, one needs \(\alpha >1\), which holds if and only if

$$\begin{aligned} \gamma >\frac{1}{1+v}\quad \text{ and }\quad 2\left( \frac{1}{\gamma }-1\right)<\delta <2\gamma (1+v)-1, \end{aligned}$$
(20)

which imposes an additional requirement on the values of \(\gamma \), v and \(\delta \).
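For illustration, in the Lipschitz case \(\gamma =v=1\), condition (20) reduces to \(0<\delta <3\), so every \(\delta \in [1,2]\) is admissible, and

$$\begin{aligned} \alpha =\min \{2,\ 1+\delta /2,\ 4-\delta \}=1+\delta /2\in [3/2,2],\qquad \beta =\min \{1,\ 2-\delta /2\}=1, \end{aligned}$$

which is consistent with Remark 3 below.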

Since \(\alpha >1\) (which also implies \(\beta >0\)), we have \(\beta \alpha ^l>l\) for all sufficiently large l. As \(\sum _{l=0}^{\infty }\left( \frac{1}{2}\right) ^l=2<\infty \), it follows by comparison that

$$\begin{aligned} c_7=\sum _{l=0}^{\infty }\left( \frac{1}{2}\right) ^{\beta \alpha ^l}<\infty . \end{aligned}$$

Let

$$\begin{aligned} r=\min \left\{ \frac{1}{2c_5^{1/(\alpha -1)}},\frac{b}{2\left( 1+2c_4c_5^{(1-\beta )/(\alpha -1)}c_7\right) }\right\} , \end{aligned}$$
(21)

where \(c_4, c_5, c_7, \alpha , \beta \) are defined above.

Now, we show that if the initial point \(x_0\) of Algorithm 1 is close enough to \(x_*\), then the assumption of Lemma 4 is satisfied at every iteration.

Lemma 5

Suppose that Assumption 1 and (20) hold. If the initial point \(x_0\in N(x_*,r)\), where r is given by (21), then \(x_k\in N(x_*,b/2)\) for every k.

Proof

The proof is by induction on k. Since \(x_0\in N(x_*,r)\) and \(r\le b/2\), the claim holds for \(k=0\). Suppose now that \(x_l\in N(x_*,b/2)\) for all \(l=0,1,\cdots ,k\). In order to show that \(x_{k+1}\in N(x_*,b/2)\), we first note from Lemma 3 that

$$\begin{aligned} \Vert x_{k+1}-x_*\Vert&=\Vert x_k+d_k-x_*\Vert \le \Vert x_k-x_*\Vert +\Vert d_k\Vert \le \Vert x_{k-1}-x_*\Vert +\Vert d_{k-1}\Vert +\Vert d_k\Vert \nonumber \\&\le \cdots \le \Vert x_0-x_*\Vert +\sum _{l=0}^{k}\Vert d_l\Vert \le r+c_4\sum _{l=0}^{k}\textrm{dist}(x_l,X_*)^{\beta }. \end{aligned}$$
(22)

By Lemma 4 and since \(r\le 1/\left( 2c_5^{1/(\alpha -1)}\right) \) by (21), we get

$$\begin{aligned} \textrm{dist}(x_l,X_*)&\le c_5\textrm{dist}(x_{l-1},X_*)^{\alpha }\le c_5c_5^{\alpha }\textrm{dist}(x_{l-2},X_*)^{\alpha ^2}\nonumber \\&\le \cdots \nonumber \\&\le c_5c_5^{\alpha }\cdots c_5^{\alpha ^{l-1}}\textrm{dist}(x_0,X_*)^{\alpha ^l} =c_5^{1+\alpha +\cdots +\alpha ^{l-1}}\textrm{dist}(x_0,X_*)^{\alpha ^l}\nonumber \\&\le c_5^{\frac{\alpha ^{l}-1}{\alpha -1}}\Vert x_0-x_*\Vert ^{\alpha ^l}\le c_5^{\frac{\alpha ^{l}-1}{\alpha -1}}r^{\alpha ^l}\le r\left( c_5^{\frac{1}{\alpha -1}}r\right) ^{\alpha ^l-1}\nonumber \\&\le r\left( \frac{1}{2}\right) ^{\alpha ^l-1}, \end{aligned}$$
(23)

where \(l=0,1,\cdots ,k\).

By (21), (22) and (23), we have

$$\begin{aligned} \Vert x_{k+1}-x_*\Vert&\le r+c_4\sum _{l=0}^{k}\textrm{dist}(x_l,X_*)^\beta \le r+c_4\sum _{l=0}^{k}\left( r\left( \frac{1}{2}\right) ^{\alpha ^l-1}\right) ^{\beta }\nonumber \\&=r+c_4(2r)^\beta \sum _{l=0}^{k}\left( \frac{1}{2}\right) ^{\beta \alpha ^l} \le r\left( 1+2^\beta c_4r^{\beta -1}\sum _{l=0}^{\infty }\left( \frac{1}{2}\right) ^{\beta \alpha ^l}\right) \nonumber \\&\le r\left( 1+2c_4c_5^{(1-\beta )/(\alpha -1)}c_7\right) \le \frac{b}{2}. \end{aligned}$$

The above inequality indicates that \(x_{k+1}\in N\left( x_*,b/2\right) \). \(\square \)

From Lemmas 4 and 5 we obtain the following main theorem.

Theorem 6

Under Assumption 1, if (20) holds and the initial point \(x_0\in N(x_*,r)\), then the sequence \(\{\textrm{dist}(x_k,X_*)\}\) converges to zero superlinearly with order \(\min \{\gamma (1+v),\gamma (1+\delta /2),2\gamma (1+v)-\delta \}\). Moreover, the sequence \(\{x_k\}\) generated by Algorithm 1 converges to a solution \({\tilde{x}}\in X_*\bigcap N(x_*,b/2)\) of the nonlinear Eq. (1).

Proof

The first part of the theorem, namely the convergence rate of the sequence \(\{\textrm{dist}(x_k,X_*)\}\), follows directly from Lemmas 4 and 5. So, only the second part needs to be proved.

By Lemma 5, we have \(x_k\in N(x_*,b/2)\) for every k. Hence, we just have to show that \(\{x_k\}\) converges. From Lemma 3 and (23), we have, for each l,

$$\begin{aligned} \Vert d_l\Vert \le c_4\textrm{dist}(x_l,X_*)^\beta \le c_4r^\beta \left( \frac{1}{2}\right) ^{\beta (\alpha ^l-1)}. \end{aligned}$$
(24)

Setting

$$\begin{aligned} s_k=&\sum _{l=0}^{k}\Vert d_l\Vert \le c_4(2r)^\beta \sum _{l=0}^{k}\left( \frac{1}{2}\right) ^{\beta \alpha ^l} \le c_4c_5^{\frac{\beta }{1-\alpha }}\sum _{l=0}^{k}\left( \frac{1}{2}\right) ^{\beta \alpha ^l}\le c_4c_5^{\frac{\beta }{1-\alpha }}c_7<\infty , \end{aligned}$$

we obtain that \(\{s_k\}\) is a Cauchy sequence. Thus, we have

$$\begin{aligned} \Vert x_{k+p}-x_k\Vert&\le \Vert d_{k+p-1}\Vert +\Vert x_{k+p-1}-x_k\Vert \nonumber \\&\le \cdots \le \sum _{l=k}^{k+p-1}\Vert d_l\Vert \nonumber \\&= \mid s_{k+p-1}-s_{k-1}\mid , \end{aligned}$$
(25)

for any \(k,p\in {\mathbb {N}}\), which indicates that the sequence \(\{x_k\}\) is also a Cauchy sequence. Then \(\{x_k\}\) converges to some \({\tilde{x}}\). Since \(x_{k}\in N(x_*,b/2)\) for all k and \(\{\textrm{dist}(x_k,X_*)\}\) converges to zero, we have \({\tilde{x}}\in X_*\bigcap N(x_*,b/2)\). \(\square \)

Theorem 7

Under the assumptions of Theorem 6, if \(v\ge \delta /2\gamma \), then the sequence \(\{x_k\}\) generated by Algorithm 1 with \(x_0\in N(x_*,r)\) converges to a solution \({\tilde{x}}\) of the nonlinear Eq. (1) with order \(\gamma (1+\delta /2)\).

Proof

It follows from \(v\ge \delta /2\gamma \) that

$$\begin{aligned}&\gamma (1+v)-\gamma \left( 1+\frac{\delta }{2}\right) \\&=\gamma \left( v-\frac{\delta }{2}\right) \ge \gamma \left( \frac{\delta }{2\gamma }-\frac{\delta }{2}\right) \\&=\frac{\gamma \delta }{2}\left( \frac{1}{\gamma }-1\right) \ge 0 \end{aligned}$$

and

$$\begin{aligned}&\left( 2\gamma (1+v)-\delta \right) -\gamma \left( 1+\frac{\delta }{2}\right) \\&=\gamma +2\gamma v-\delta -\frac{\gamma \delta }{2}\ge \gamma -\frac{\gamma \delta }{2}\\&=\frac{\gamma }{2}(2-\delta )\ge 0. \end{aligned}$$

So, from the assumptions of Theorem 6, we have

$$\begin{aligned} \alpha =\min \{\gamma (1+v),\gamma (1+\delta /2),2\gamma (1+v)-\delta \}=\gamma (1+\delta /2)>1. \end{aligned}$$
(26)

Moreover, it follows from (26) and Lemma 4 that \(\textrm{dist}(x_{k+1},X_*)\le \frac{1}{2}\textrm{dist}(x_k,X_*)\) holds for all sufficiently large k. From \(v\ge \delta /2\gamma \), we have \(\beta =1\) in Lemma 3. Letting \(p\rightarrow \infty \) in (25), we deduce

$$\begin{aligned} \Vert {\tilde{x}}-x_k\Vert&\le \sum _{l=k}^{\infty }\Vert d_l\Vert \le c_4\sum _{l=k}^{\infty }\textrm{dist}(x_l,X_*)\le c_4\sum _{l=k}^{\infty }\left( \frac{1}{2}\right) ^{l-k}\textrm{dist}(x_k,X_*)\nonumber \\&\le 2c_4\textrm{dist}(x_k,X_*)\le 2c_4c_5\textrm{dist}(x_{k-1},X_*)^\alpha \le 2c_4c_5\Vert {\tilde{x}}-x_{k-1}\Vert ^\alpha , \end{aligned}$$
(27)

which implies that \(\{x_k\}\) converges with rate \(\gamma (1+\delta /2)\). \(\square \)

Remark 3

  1. (1)

    As shown in Theorem 7, if \(\gamma =v=1\), we have \(\alpha =1+\delta /2\). From (27), we obtain

    $$\begin{aligned} \Vert {\tilde{x}}-x_k\Vert \le 2c_4c_5\Vert {\tilde{x}}-x_{k-1}\Vert ^{1+\delta /2} \end{aligned}$$

    which shows that the sequence \(\{x_k\}\) converges at least superlinearly, and quadratically when \(\delta =2\). With \(\delta =2\) and \(\theta =1\), this result coincides with the result in [20].

  2. (2)

    Under Assumption 1, let \(\gamma =v=1\). Then Algorithm 1 reduces to the algorithm proposed by Fan and Yuan in [27] when \(\theta =1\), \(\delta =1\), to the algorithm proposed by Ma and Jiang in [30, Algorithm 2.1] when \(\delta =1\), and to the algorithm proposed by Fan and Yuan in [36] when \(\theta =1\), \(\delta \in [1,2]\). The result of Theorem 7 is also the same as theirs.

Remark 4

To obtain a faster convergence order, the singular value decomposition (SVD) technique was employed in some papers [27, 30, 36]; we omit it here.

4 Global convergence

In this section, we discuss the global convergence properties of the modified LM method. To guarantee global convergence, the Armijo line search technique is incorporated into the modified LM method.

Take

$$\begin{aligned} \Phi (x)=\frac{1}{2}\Vert F(x)\Vert ^2 \end{aligned}$$

as the merit function for nonlinear equations (1).

We consider the following algorithm.

[Algorithm 2: the modified Levenberg–Marquardt method with Armijo line search]
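As with Algorithm 1, the algorithm box is not reproduced above. The following sketch shows one possible realisation of Algorithm 2: the trial step \(d_k\) is computed as in Algorithm 1, the full step is accepted whenever \(\Vert F(x_k+d_k)\Vert \le {\tilde{\gamma }}\Vert F_k\Vert \), and otherwise the Armijo rule from the proof of Theorem 8 below is applied to \(\Phi \). The parameter values and all names are illustrative assumptions of ours, not the exact constants of Algorithm 2.

```python
import numpy as np

def modified_lm_armijo(F, J, x0, theta=0.5, delta=1.0,
                       alpha_t=1e-4, beta_t=0.5, gamma_t=0.9,
                       tol=1e-6, max_iter=1000):
    """Sketch of Algorithm 2: modified LM step safeguarded by an Armijo line search.

    Phi(x) = 0.5*||F(x)||^2 is the merit function; alpha_t, beta_t, gamma_t play the
    roles of alpha~, beta~, gamma~ (illustrative values).
    """
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        Fk, Jk = F(x), J(x)
        g = Jk.T @ Fk                                   # gradient of Phi at x_k
        if np.linalg.norm(g) <= tol:
            break
        lam = (theta * np.linalg.norm(Fk) ** delta
               + (1.0 - theta) * np.linalg.norm(g) ** delta)
        d = np.linalg.solve(Jk.T @ Jk + lam * np.eye(x.size), -g)
        if np.linalg.norm(F(x + d)) <= gamma_t * np.linalg.norm(Fk):
            x = x + d                                   # accept the unit LM step
            continue
        phi_k, slope = 0.5 * (Fk @ Fk), g @ d           # Phi(x_k), grad Phi(x_k)^T d_k
        t = 1.0                                         # t = beta_t**m, m = 0, 1, 2, ...
        while (0.5 * np.linalg.norm(F(x + t * d)) ** 2 - phi_k
               > alpha_t * t * slope and t > 1e-12):
            t *= beta_t                                 # backtrack
        x = x + t * d
    return x, k
```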

Theorem 8

Suppose Assumption 1 holds. Then the sequence \(\{x_k\}\) generated by Algorithm 2 converges to a stationary point of \(\Phi (x)\). Moreover, if the stationary point \(x_*\) is a solution of the nonlinear Eq. (1), then \(\{x_k\}\) converges to the solution with convergence rate \(\min \{\gamma (1+v),\gamma (1+\delta /2),2\gamma (1+v)-\delta \}\).

Proof

The Armijo line search technique is employed in Algorithm 2. That is, the next iterate is computed as

$$\begin{aligned} x_{k+1}=x_k+{\tilde{\beta }}^md_k, \end{aligned}$$

where \({\tilde{\beta }}^m\) is the step size and m is the smallest nonnegative integer satisfying the following inequality:

$$\begin{aligned} \Phi \left( x_k+{\tilde{\beta }}^md_k\right) -\Phi (x_k)\le {\tilde{\alpha }}{\tilde{\beta }}^m\nabla \Phi (x_k)^Td_k. \end{aligned}$$

Combining this with [38, Eq. (2.10)], we get

$$\begin{aligned} \Vert F(x_{k+1})\Vert ^2\le \Vert F_k\Vert ^2-{\tilde{\alpha }}{\hat{\alpha }}\frac{\left( F_k^TJ_kd_k\right) ^2}{\Vert d_k\Vert ^2}, \end{aligned}$$
(28)

where \({\hat{\alpha }}\) is some positive constant.

Therefore, from the condition \(\Vert F(x_k+d_k)\Vert \le {\tilde{\gamma }}\Vert F_k\Vert \) and (28), we obtain that \(\left\{ \Vert F(x_k)\Vert \right\} \) is monotonically decreasing. Hence, the sequence \(\{x_k\}\) converges to a stationary point of \(\Phi (x)\).

Next we prove the second part. We deduce from the first part that \(\Vert F(x_k+d_k)\Vert \le {\tilde{\gamma }}\Vert F_k\Vert \) holds for all sufficiently large k, that is, \({\tilde{\beta }}^m=1\) holds for all sufficiently large k. Since \(\{x_k\}\) converges to a stationary point \(x_*\) which is a solution of the nonlinear Eq. (1), there exists a large K such that

$$\begin{aligned} \Vert F(x_K)\Vert ^{\alpha \gamma -1}\le \frac{c^\alpha {\tilde{\gamma }}}{\kappa _2c_5}, \end{aligned}$$
(29)

and

$$\begin{aligned} \Vert x_K-x_*\Vert \le r \end{aligned}$$

where c, \(\kappa _2\) and \(c_5\) are defined in Sect. 3, and r is defined by (21).

Let the sequence \(\{y_l\}\) be generated by Algorithm 1 with unit step size and \(y_0=x_K\). Then, by Theorem 6, the sequence \(\{\textrm{dist}(y_l,X_*)\}\) converges to zero with convergence rate \(\min \{\gamma (1+v),\gamma (1+\delta /2),2\gamma (1+v)-\delta \}\). So, we just have to prove that \(x_{K+l}=y_l\) holds for each l, i.e. that \(\{y_l\}\) satisfies

$$\begin{aligned} \Vert F(y_{l+1})\Vert \le {\tilde{\gamma }}\Vert F(y_l)\Vert . \end{aligned}$$

Let \({\bar{y}}_{l+1}\in X_*\) be such that \(\Vert y_{l+1}-{\bar{y}}_{l+1}\Vert =\textrm{dist}(y_{l+1},X_*)\). It follows from Assumption 1 (a), Lemma 4, (9) and (29) that

$$\begin{aligned} \Vert F(y_{l+1})\Vert&=\Vert F(y_{l+1})-F({\bar{y}}_{l+1})\Vert \nonumber \\&\le \kappa _2\textrm{dist}(y_{l+1},X_*) \le \kappa _2c_5\textrm{dist}(y_l,X_*)^\alpha \nonumber \\&\le \frac{\kappa _2c_5\Vert F(y_l)\Vert ^{\alpha \gamma -1}}{c^\alpha }\Vert F(y_l)\Vert \nonumber \\&\le {\tilde{\gamma }}\Vert F(y_l)\Vert \end{aligned}$$
(30)

holds for \({\tilde{\gamma }}<1\) and each l. Inequality (30) implies that \({\tilde{\beta }}^m=1\) holds for all sufficiently large k in Algorithm 2. Thus, \(\{x_k\}\) converges to the solution with convergence rate \(\min \{\gamma (1+v),\gamma (1+\delta /2),2\gamma (1+v)-\delta \}\). \(\square \)

5 Numerical example

In this section, some numerical experiments are carried out to verify the efficiency of Algorithm 1. We first test the following Powell singular function [21] mentioned in Sect. 1:

$$\begin{aligned} F(x)=\left( x_1+10x_2,\sqrt{5}(x_3-x_4),(x_2-2x_3)^2,\sqrt{10}(x_1-x_4)^2\right) ^T, \end{aligned}$$
(31)

where \(x\in {\mathbb {R}}^4\). As mentioned in Sect. 1, the function (31) does not satisfy the local error bound condition (3), but it satisfies the \(\gamma \)-Hölderian local error bound condition (4) with \(\gamma =1/2\).
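As a quick numerical illustration of the two conditions (this check is ours and is only indicative; it examines a single direction, whereas the statements in [22] hold on a full neighbourhood), one can compare \(\Vert F(x)\Vert \) and \(\Vert F(x)\Vert ^{1/2}\) with \(\textrm{dist}(x,X_*)=\Vert x\Vert \) along a null direction of J(0) for the Powell singular function, whose only zero is the origin.

```python
import numpy as np

def powell_singular(x):
    """Powell singular function (31); its unique zero is the origin."""
    x1, x2, x3, x4 = x
    return np.array([x1 + 10.0 * x2,
                     np.sqrt(5.0) * (x3 - x4),
                     (x2 - 2.0 * x3) ** 2,
                     np.sqrt(10.0) * (x1 - x4) ** 2])

u = np.array([-10.0, 1.0, 0.0, 0.0])   # direction in the null space of J(0)
u /= np.linalg.norm(u)

for t in [1e-1, 1e-2, 1e-3, 1e-4]:
    x = t * u                          # dist(x, X_*) = ||x|| = t since X_* = {0}
    nF = np.linalg.norm(powell_singular(x))
    # ||F(x)||/dist -> 0 along this direction, so (3) cannot hold,
    # while ||F(x)||^(1/2)/dist stays bounded away from zero (gamma = 1/2 in (4)).
    print(f"t={t:.0e}  ratio(3)={nF / t:.3e}  ratio(4)={np.sqrt(nF) / t:.3e}")
```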

The other test problems are singular systems of nonlinear equations created by modifying the standard nonsingular problems, as proposed in [21, 39]:

$$\begin{aligned} {\hat{F}}(x)=F(x)-J(x_*)A\left( A^TA\right) ^{-1}A^T(x-x_*), \end{aligned}$$
(32)

where \(x_*\) is a root of the standard nonsingular test function F(x), and the matrix \(A\in {\mathbb {R}}^{n\times k}\) has full column rank with \(1\le k\le n\). It is easy to check that \({\hat{F}}(x_*)=0\) and \({\hat{J}}(x_*)=J(x_*)\left( I-A\left( A^TA\right) ^{-1}A^T\right) \) has rank \(n-k\).

We chose the rank of \({\hat{J}}(x_*)\) to be \(n-1\) by using

$$\begin{aligned} A\in {\mathbb {R}}^{n\times 1},\quad A^T=(1,1,\cdots ,1) \end{aligned}$$
(33)

and \(n-2\) by using

$$\begin{aligned} A\in {\mathbb {R}}^{n\times 2},\quad A^T=\left( \begin{array}{rrrrrr} 1&1&1&1&\cdots &1\\ 1&-1&1&-1&\cdots &\pm 1 \end{array}\right) \end{aligned}$$
(34)

respectively.
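The singular test problems (32) with A as in (33) or (34) can be generated mechanically from any standard nonsingular test function. The following sketch (our illustration; the names and the interface are assumptions) returns \({\hat{F}}\) and its Jacobian \({\hat{J}}\) for a given F, J, root \(x_*\) and rank deficiency k.

```python
import numpy as np

def make_singular_problem(F, J, x_star, k):
    """Build F_hat and J_hat from (32), with A chosen as in (33) (k = 1) or (34) (k = 2)."""
    n = x_star.size
    if k == 1:
        A = np.ones((n, 1))                                                # (33)
    elif k == 2:
        A = np.column_stack([np.ones(n),
                             np.array([(-1.0) ** i for i in range(n)])])   # (34)
    else:
        raise ValueError("only k = 1 or k = 2 are used here")
    P = A @ np.linalg.inv(A.T @ A) @ A.T        # projector A (A^T A)^{-1} A^T
    J_star = J(x_star)

    def F_hat(x):                               # (32): F_hat(x_star) = 0
        return F(x) - J_star @ (P @ (x - x_star))

    def J_hat(x):                               # rank(J_hat(x_star)) = n - k
        return J(x) - J_star @ P

    return F_hat, J_hat
```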

We test several choices of the LM parameter in Algorithm 1. Based on the ranges of \(\theta \) and \(\delta \) in (5), we use \(\theta =0\), 0.5 and 1 and \(\delta =1\), 1.5 and 2 for the LM method with unit step size described in Sect. 3. The algorithm is terminated when \(\Vert J^T_kF_k\Vert \leqslant 10^{-6}\), or when the number of iterations exceeds \(100(n+1)\); for the Powell singular function, we allow up to \(10^6\) iterations instead of \(100(n+1)\). The results for the Powell singular function are tabulated in Table 1, while the results for the systems of nonlinear equations with rank \(n-1\) and \(n-2\) are tabulated in Tables 2 and 3, respectively. We test three starting points \(x_0\), \(10x_0\) and \(100x_0\), where \(x_0\) is suggested by Moré, Garbow and Hillstrom in [21]. The notation “–” is used if the method fails to find the solution within the maximum number of iterations.

Table 1 Results on Powell singular function
Table 2 Results on the first singular test set with rank\((F'(x_*))=n-1\)
Table 3 Results on the second singular test set with rank\((F'(x_*))=n-2\)

The results in Table 1 show that Algorithm 1 is efficient for the Powell singular function even though it does not satisfy the local error bound condition (3). The performance is not perfect, however, because the number of iterations becomes large when the initial point is not close enough to \(x_*\). From the results in all the tables, we can see that Algorithm 1 always performs best when \(\delta =1\), no matter what the value of \(\theta \) is. In fact, when \(\{x_k\}\) is close to the solution set \(X_*\), the LM parameter \(\lambda _k\) with \(\delta =2\) may be very small, even smaller than the machine precision, so it loses its regularizing role. Conversely, when \(\{x_k\}\) is far away from the solution set \(X_*\), the LM parameter \(\lambda _k\) with \(\delta =2\) may be very large, the trial step \(d_k\) becomes small, and this prevents \(\{x_k\}\) from converging rapidly. Also, Algorithm 1 with \(\lambda _k=\Vert F_k\Vert ^\delta \) always outperforms, or at least performs as well as, Algorithm 1 with \(\lambda _k=\Vert J_k^TF_k\Vert ^\delta \).

6 Conclusion

We have presented a modified Levenberg–Marquardt method for solving systems of nonlinear equations with an adaptive choice of the LM parameter, namely a convex combination of \(\Vert F_k\Vert ^\delta \) and \(\Vert J_k^TF_k\Vert ^\delta \). Under the \(\gamma \)-Hölderian local error bound condition on the underlying function and the v-Hölderian continuity of its Jacobian, its local and global convergence has been analysed. These conditions are weaker than the commonly used local error bound condition and the Lipschitz continuity of the Jacobian, so the convergence results apply to a wider class of problems. The numerical experiments clearly show the good performance of our choice of the LM parameter.