Abstract
A new Levenberg–Marquardt (LM) algorithm is proposed for nonlinear equations, where the iterate is updated according to the ratio of the actual reduction to the predicted reduction as usual, but the update of the LM parameter is no longer just based on that ratio. When the iteration is unsuccessful, the LM parameter is increased; but when the iteration is successful, it is updated based on the value of the gradient norm of the merit function. The algorithm converges globally under certain conditions. It also converges quadratically under the local error bound condition, which does not require the nonsingularity of the Jacobian at the solution.
1 Introduction
We consider the system of nonlinear equations
$$\begin{aligned} F(x)=0, \end{aligned}$$(1.1)
where \(F(x): R^n\rightarrow R^n\) is continuously differentiable.
The Levenberg–Marquardt (LM) method is one of the best-known iterative methods for nonlinear equations [5, 6, 15]. At the k-th iteration, it computes the trial step \(d_k\) from
$$\begin{aligned} (J_k^TJ_k+\lambda _kI)d=-J_k^TF_k, \end{aligned}$$
where \(F_k=F(x_k)\), \(J_k=J(x_k)\) is the Jacobian at \(x_k\), and \(\lambda _k\) is the LM parameter, introduced to overcome the difficulties caused by the singularity or near singularity of \(J_k\).
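For concreteness, the trial step above amounts to a few lines of linear algebra. The following sketch (our illustration, not the authors' code) solves the LM linear system for a given parameter \(\lambda \):

```python
import numpy as np

def lm_step(J, F, lam):
    """Solve (J^T J + lam * I) d = -J^T F for the LM trial step d.

    A positive lam keeps the coefficient matrix positive definite,
    so the system is solvable even when J is singular.
    """
    n = J.shape[1]
    return np.linalg.solve(J.T @ J + lam * np.eye(n), -(J.T @ F))
```

With \(\lambda =0\) and a nonsingular Jacobian, this reduces to the Gauss–Newton step.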
Let
$$\begin{aligned} \phi (x)=\Vert F(x)\Vert ^2 \end{aligned}$$
be the merit function of (1.1). Define the actual reduction of the merit function as
$$\begin{aligned} Ared_k=\Vert F_k\Vert ^2-\Vert F(x_k+d_k)\Vert ^2, \end{aligned}$$
the predicted reduction as
$$\begin{aligned} Pred_k=\Vert F_k\Vert ^2-\Vert F_k+J_kd_k\Vert ^2, \end{aligned}$$
and the ratio of the actual reduction to the predicted reduction as
$$\begin{aligned} r_k=\frac{Ared_k}{Pred_k}. \end{aligned}$$
In classical LM methods, one sets
$$\begin{aligned} x_{k+1}= {\left\{ \begin{array}{ll} x_k+d_k, &{} \hbox {if}~ r_k\ge p_0,\\ x_k, &{} \hbox {otherwise}, \end{array}\right. } \end{aligned}$$
where \(p_0\ge 0\) is a constant, and updates the LM parameter as
$$\begin{aligned} \lambda _{k+1}= {\left\{ \begin{array}{ll} c_0\lambda _k, &{} \hbox {if}~ r_k<p_1,\\ \lambda _k, &{} \hbox {if}~ p_1\le r_k\le p_2,\\ c_1\lambda _k, &{} \hbox {if}~ r_k>p_2, \end{array}\right. } \end{aligned}$$
where \(p_0<p_1<p_2<1\), \(0<c_1<1<c_0\) are positive constants (cf. [7, 9, 13, 16, 17]).
It was shown in [14] that, if the LM parameter is chosen as \(\lambda _k=\Vert F_k\Vert ^2\), then the LM method converges quadratically under the local error bound condition, which is weaker than the nonsingularity of the Jacobian at the solution. It was further proved in [4] that the LM method converges quadratically for all \(\lambda _k=\Vert F_k\Vert ^{\delta }\ (\delta \in [1,2])\) under the local error bound condition. In [1], Fan chose
$$\begin{aligned} \lambda _k=\mu _k\Vert F_k\Vert \end{aligned}$$
and updated \(\mu _k\) according to the ratio \(r_k\) as follows:
$$\begin{aligned} \mu _{k+1}= {\left\{ \begin{array}{ll} c_0\mu _k, &{} \hbox {if}~ r_k<p_1,\\ \mu _k, &{} \hbox {if}~ p_1\le r_k\le p_2,\\ \max \{c_1\mu _k, m\}, &{} \hbox {if}~ r_k>p_2, \end{array}\right. } \end{aligned}$$(1.7)
where \(m>0\) is a small constant to prevent the LM parameter from being too small.
Recently, Zhao and Fan [18] took the LM parameter as
$$\begin{aligned} \lambda _k=\mu _k\Vert F_k\Vert , \end{aligned}$$
where the update of \(\mu _k\) is no longer just based on the ratio \(r_k\). When the iteration is unsuccessful (i.e., \(r_k< p_0\)), \(\mu _k\) is increased; but when the iteration is successful (i.e., \(r_k\ge p_0\)), \(\mu _{k+1}\) is updated as
$$\begin{aligned} \mu _{k+1}= {\left\{ \begin{array}{ll} c_0\mu _k, &{} \hbox {if}~ \Vert J_k^TF_k\Vert <p_1/\mu _k,\\ \mu _k, &{} \hbox {if}~ p_1/\mu _k\le \Vert J_k^TF_k\Vert \le p_2/\mu _k,\\ \max \{c_1\mu _k, m\}, &{} \hbox {if}~ \Vert J_k^TF_k\Vert >p_2/\mu _k. \end{array}\right. } \end{aligned}$$(1.9)
It was shown that the global complexity bound of the above LM algorithm is \(O(\varepsilon ^{-2})\), that is, it takes at most \(O(\varepsilon ^{-2})\) iterations to drive the norm of the gradient of the merit function below the desired accuracy \(\varepsilon \).
The logic behind the updating rule (1.9) rests on the fact that the LM step is actually the solution of the trust region subproblem
$$\begin{aligned} \min _{d\in R^n}~\Vert F_k+J_kd\Vert ^2, \quad \hbox {s.t.}~ \Vert d\Vert \le \Delta _k:=\Vert (J_k^TJ_k+\lambda _kI)^{-1}J_k^TF_k\Vert . \end{aligned}$$(1.10)
The size of the step computed by solving (1.10) is proportional to the norm of the model gradient \(\Vert J^T_kF_k\Vert \). Hence the trust region radius, which is of the order of the inverse of \(\mu _k\), should also be of comparable size.
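This inverse relationship is easy to check numerically: for fixed \(J_k\) and \(F_k\), the step norm \(\Vert d_k\Vert \) decreases monotonically as \(\mu _k\) grows. A small sketch (the helper name is ours):

```python
import numpy as np

def lm_step_norm(J, F, mu):
    """Norm of the LM step with parameter lambda = mu * ||F||^2."""
    lam = mu * np.linalg.norm(F) ** 2
    n = J.shape[1]
    d = np.linalg.solve(J.T @ J + lam * np.eye(n), -(J.T @ F))
    return np.linalg.norm(d)
```

Since \(\Vert d(\lambda )\Vert \) is nonincreasing in \(\lambda \), a larger \(\mu _k\) corresponds to a smaller implicit trust region.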
In this paper, we present a new LM algorithm for (1.1), where the LM parameter is computed as
$$\begin{aligned} \lambda _k=\mu _k\Vert F_k\Vert ^2. \end{aligned}$$
We update the iterate \(x_k\) according to the ratio \(r_k\) as in classical LM algorithms. When the iteration is unsuccessful, we increase \(\mu _k\); otherwise, we update \(\mu _{k+1}\) by (1.9). We show that the new LM algorithm preserves the global convergence of classical LM algorithms. We also prove that the algorithm converges quadratically under the local error bound condition.
The paper is organized as follows. In Sect. 2, we present the new LM algorithm for (1.1). The global convergence of the algorithm is also proved. In Sect. 3, we study the convergence rate of the algorithm under the local error bound condition. Some numerical results are given in Sect. 4. Finally, we conclude the paper in Sect. 5.
2 The LM Algorithm and Global Convergence
In this section, we first give the new LM algorithm, then show that the algorithm converges globally under certain conditions.
The LM algorithm is presented as follows.
Algorithm 2.1
(A Levenberg–Marquardt algorithm for nonlinear equations)
Step 1. Given \(x_0 \in R^n\), \(\mu _0>m>0\), \(0<p_0< p_1< p_2<1\), \(c_0>1\), \(0<c_1<1\), \(\varepsilon \ge 0\). Set \(k:=0\).
Step 2. If \(\Vert J_k^TF_k\Vert \le \varepsilon \), then stop. Otherwise, solve
$$\begin{aligned} (J_k^TJ_k+\lambda _kI)d=-J_k^TF_k ~\hbox {with}~ \lambda _k=\mu _k\Vert F_k\Vert ^{2} \end{aligned}$$(2.1)
to obtain \(d_k\).
Step 3. Compute \(r_k=\frac{Ared_k}{Pred_k}\). If \(r_k\ge p_0\), set \(x_{k+1}=x_k+d_k\) and compute \(\mu _{k+1}\) by (1.9); otherwise, set \(x_{k+1}=x_k\) and \(\mu _{k+1}=c_0\mu _k\). Set \(k:=k+1\) and go to Step 2.
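The steps above can be sketched in code as follows. The successful-iteration update of \(\mu _k\) follows our reading of rule (1.9), comparing \(\Vert J_k^TF_k\Vert \) with \(p_1/\mu _k\) and \(p_2/\mu _k\); treat that branch structure as an assumption of this sketch, not the authors' implementation.

```python
import numpy as np

def lm_algorithm(F, J, x0, mu0=1e-8, m=1e-8, p0=1e-4, p1=0.25, p2=0.75,
                 c0=4.0, c1=0.25, eps=1e-6, max_iter=1000):
    """Sketch of Algorithm 2.1 with lambda_k = mu_k * ||F_k||^2.

    The successful-iteration update of mu (rule (1.9)) is assumed to
    compare ||J_k^T F_k|| against p1/mu_k and p2/mu_k.
    """
    x, mu = np.asarray(x0, dtype=float), mu0
    for _ in range(max_iter):
        Fx, Jx = F(x), J(x)
        g = Jx.T @ Fx
        gn = np.linalg.norm(g)
        if gn <= eps:                       # Step 2: stopping test
            break
        lam = mu * np.linalg.norm(Fx) ** 2
        d = np.linalg.solve(Jx.T @ Jx + lam * np.eye(x.size), -g)
        ared = np.linalg.norm(Fx) ** 2 - np.linalg.norm(F(x + d)) ** 2
        pred = np.linalg.norm(Fx) ** 2 - np.linalg.norm(Fx + Jx @ d) ** 2
        r = ared / pred if pred > 0 else 0.0
        if r >= p0:                         # successful: accept the step
            x = x + d
            if gn < p1 / mu:                # assumed form of (1.9)
                mu = c0 * mu
            elif gn > p2 / mu:
                mu = max(c1 * mu, m)
        else:                               # unsuccessful: increase mu
            mu = c0 * mu
    return x
```

On the toy system \(F(x)=(x_1^2-1,\, x_2-2)^T\), this sketch converges from \(x_0=(2,0)^T\) to the root \((1,2)^T\) in a handful of iterations.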
To study the global convergence of Algorithm 2.1, we make the following assumption.
Assumption 2.1
F(x) is continuously differentiable, and both F(x) and its Jacobian J(x) are Lipschitz continuous, i.e., there exist positive constants \(L_1\) and \(L_2\) such that
$$\begin{aligned} \Vert J(x)-J(y)\Vert \le L_1\Vert x-y\Vert , \quad \forall x, y\in R^n, \end{aligned}$$(2.2)
and
$$\begin{aligned} \Vert F(x)-F(y)\Vert \le L_2\Vert x-y\Vert , \quad \forall x, y\in R^n. \end{aligned}$$(2.3)
By a result of Powell [10], we have the following lemma.
Lemma 2.1
The predicted reduction satisfies
$$\begin{aligned} Pred_k\ge \Vert J_k^TF_k\Vert \min \left\{ \Vert d_k\Vert , \frac{\Vert J_k^TF_k\Vert }{\Vert J_k^TJ_k\Vert }\right\} \end{aligned}$$(2.4)
for all k.
Lemma 2.1 implies that the predicted reduction is always nonnegative.
In the following, we first prove the weak global convergence of Algorithm 2.1, that is, at least one accumulation point of the sequence generated by Algorithm 2.1 is a stationary point of the merit function \(\phi (x)\).
Theorem 2.1
Under Assumption 2.1, Algorithm 2.1 terminates in finitely many iterations or satisfies
$$\begin{aligned} \liminf _{k\rightarrow \infty }\Vert J_k^TF_k\Vert =0. \end{aligned}$$
Proof
We prove by contradiction. Suppose that there exists a constant \(\tau >0\) such that
$$\begin{aligned} \Vert J_k^TF_k\Vert \ge \tau , \quad \forall k. \end{aligned}$$
Define the index set of successful iterations:
$$\begin{aligned} S=\{k: r_k\ge p_0\}. \end{aligned}$$
We discuss in two cases.
Case I. S is infinite. Since \(\{\Vert F_k\Vert \}\) is nonincreasing and bounded below, it follows from (2.3) and (2.4) that
So,
Since \(d_k=0\) for \(k\notin S\), we have
Since \(\Vert J_k\Vert \le L_2\) and \(\Vert F_k\Vert \le \Vert F_0\Vert \), by (2.1), we have
On the other hand, it follows from (2.3) and (2.4) that
So, \(r_k\rightarrow 1\). Thus, \(\mu _k\) is updated by (1.9) and \(\Vert J_k^TF_k\Vert >p_2/\mu _k\) for all sufficiently large k. Hence, \(\mu _{k+1}=\max \left\{ c_1\mu _k,m\right\} \) for all large k. Since \(0<c_1<1\), there exists a positive constant \(\tilde{c}\) such that
for all large k. This is a contradiction to (2.9).
Case II. S is finite. Then there exists a \(\tilde{k}\) such that
According to the updating rule of \(x_k\) in Algorithm 2.1, we have \(d_k\rightarrow 0\). By the same arguments as (2.10), we get \(r_k\rightarrow 1\), which contradicts (2.11). The proof is completed.\(\square \)
Based on Theorem 2.1, we can further prove the strong global convergence of Algorithm 2.1, that is, all limit points of the sequence generated by Algorithm 2.1 are stationary points of the merit function \(\phi (x)\). We first give an auxiliary result (cf. [3, Lemma 2.7]).
Lemma 2.2
Let \(b, a_1, \ldots , a_N>0\). Then,
Theorem 2.2
Under Assumption 2.1, Algorithm 2.1 terminates in finitely many iterations or satisfies
$$\begin{aligned} \lim _{k\rightarrow \infty }\Vert J_k^TF_k\Vert =0. \end{aligned}$$(2.13)
Proof
Suppose by contradiction that there exists \(\tau >0\) such that the set
$$\begin{aligned} \Omega =\{k: \Vert J_k^TF_k\Vert \ge \tau \} \end{aligned}$$(2.14)
is infinite. Given \(k\in \Omega \), consider the first index \(l_k>k\) such that \(\Vert J_{l_k}^TF_{l_k}\Vert \le \frac{\tau }{2}\). The existence of such \(l_k\) is guaranteed by Theorem 2.1. By (2.2), (2.3) and \(\Vert F_k\Vert \le \Vert F_0\Vert \),
which yields
Define the set
Then,
It now follows from (2.4), (2.15) and Lemma 2.2 that, for all \(k\in \Omega \),
However, since \(\{\Vert F_k\Vert ^2\}\) is nonincreasing and bounded below, \(\Vert F_k\Vert ^2-\Vert F_{l_k}\Vert ^2\rightarrow 0\). This contradicts (2.16). So, the set \(\Omega \) defined by (2.14) is finite. Therefore, (2.13) holds true. The proof is completed.\(\square \)
3 Local Convergence
We assume that the sequence \(\{x_k\}\) generated by Algorithm 2.1 converges to the solution set \(X^*\) of (1.1) and lies in some neighbourhood of \(x^*\in X^*\). We first give some important properties of the algorithm, then show that the algorithm converges quadratically under the local error bound condition.
We make the following assumption.
Assumption 3.1
(a) F(x) is continuously differentiable, and \(\Vert F(x)\Vert \) provides a local error bound on some neighbourhood of \(x^*\in X^*\), i.e., there exist positive constants \(c>0\) and \(b_1<1\) such that
$$\begin{aligned} \Vert F(x)\Vert \ge c\, {\text {dist}}(x, X^*), \quad \forall x\in N(x^*, b_1). \end{aligned}$$(3.1)
(b) The Jacobian J(x) is Lipschitz continuous on \(N(x^*, b_1)\), i.e., there exists a positive constant \(L_1\) such that
$$\begin{aligned} \Vert J(x)-J(y)\Vert \le L_1\Vert x-y\Vert , \quad \forall x, y\in N(x^*, b_1). \end{aligned}$$(3.2)
Note that, if J(x) is nonsingular at a solution of (1.1), then the solution is isolated, so \(\Vert F(x)\Vert \) provides a local error bound on its neighbourhood. The converse, however, is not necessarily true; see [14] for examples. Thus, the local error bound condition is weaker than nonsingularity.
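A tiny illustration of this gap (our example, not from the paper): \(F(x)=(x_1, x_1)^T\) on \(R^2\) has a Jacobian that is singular everywhere, yet \(\Vert F(x)\Vert =\sqrt{2}\,|x_1|\ge {\text {dist}}(x, X^*)\) with \(X^*=\{x: x_1=0\}\), so a local error bound holds with \(c=1\):

```python
import numpy as np

# Illustrative example (ours): F(x) = (x_1, x_1)^T on R^2.
F = lambda x: np.array([x[0], x[0]])
J = lambda x: np.array([[1.0, 0.0], [1.0, 0.0]])   # singular everywhere
dist = lambda x: abs(x[0])                          # distance to X* = {x_1 = 0}

def error_bound_holds(pts, c=1.0):
    """Check c * dist(x, X*) <= ||F(x)|| at the sample points."""
    return all(c * dist(x) <= np.linalg.norm(F(x)) for x in pts)
```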
By (3.2), we have
$$\begin{aligned} \Vert F(y)-F(x)-J(x)(y-x)\Vert \le L_1\Vert y-x\Vert ^2, \quad \forall x, y\in N(x^*, b_1). \end{aligned}$$(3.3)
Moreover, there exists a constant \(L_2>0\) such that
$$\begin{aligned} \Vert F(x)-F(y)\Vert \le L_2\Vert x-y\Vert , \quad \forall x, y\in N(x^*, b_1). \end{aligned}$$(3.4)
Throughout the paper, we denote by \(\bar{x}_k\) the vector in \(X^*\) that satisfies
$$\begin{aligned} \Vert \bar{x}_k-x_k\Vert ={\text {dist}}(x_k, X^*). \end{aligned}$$
3.1 Some Properties
In the following, we first show the relationship between the length of the trial step \(d_k\) and the distance from \(x_k\) to the solution set.
Lemma 3.1
Under Assumption 3.1, if \(x_k\in N(x^*,b_1/2)\), then
$$\begin{aligned} \Vert d_k\Vert \le c_2\Vert \bar{x}_k-x_k\Vert \end{aligned}$$(3.5)
holds for all sufficiently large k, where \( c_2=\sqrt{L_1^2c^{-2}m^{-1}+1} \) is a positive constant.
Proof
Since \(x_k\in N(x^*, b_1/2)\), we have
$$\begin{aligned} \Vert \bar{x}_k-x^*\Vert \le \Vert \bar{x}_k-x_k\Vert +\Vert x_k-x^*\Vert \le 2\Vert x_k-x^*\Vert \le b_1. \end{aligned}$$
So, \(\bar{x}_k\in N(x^*, b_1)\). Thus, it follows from (1.9) and (3.1) that the LM parameter \(\lambda _k\) satisfies
$$\begin{aligned} \lambda _k=\mu _k\Vert F_k\Vert ^2\ge mc^2\Vert \bar{x}_k-x_k\Vert ^2. \end{aligned}$$
Note that \(d_k\) is also a minimizer of
$$\begin{aligned} \min _{d\in R^n}~\Vert F_k+J_kd\Vert ^2+\lambda _k\Vert d\Vert ^2. \end{aligned}$$
So, we obtain (3.5).\(\square \)
Next we show that the gradient of the merit function also provides a local error bound on some neighbourhood of \(x^*\in X^*\).
Lemma 3.2
Under Assumption 3.1, if \(x_k\in N(x^*,b_1/2)\), then there exists a constant \(c_3>0\) such that
$$\begin{aligned} \Vert J_k^TF_k\Vert \ge c_3\Vert \bar{x}_k-x_k\Vert \end{aligned}$$(3.7)
holds for all sufficiently large k.
Proof
It follows from (3.3) that
Thus,
So,
By (3.1),
Hence, (3.7) holds for sufficiently large k. The proof is completed.\(\square \)
Lemma 3.3
Under Assumption 3.1, if \(x_k\in N(x^*,b_1/2)\), then there exists a positive integer K such that
$$\begin{aligned} r_k\ge p_0, \quad \forall k\ge K. \end{aligned}$$
That is, \(\mu _k\) is updated by (1.9) when \(k\ge K\).
Proof
It follows from (3.4), (3.5) and (3.7) that
This, together with (3.3), (3.4) and \(\Vert F_k+J_kd_k\Vert \le \Vert F_k\Vert \), gives
So, \(r_k\rightarrow 1\). Therefore, we obtain the result.\(\square \)
Let
be two positive constants.
Lemma 3.4
Under Assumption 3.1 and \(c_1\le (1+c_4c_2c_3^{-1})^{-1}\), if \(k\ge K\) and \(\mu _k\Vert J_k^TF_k\Vert >C_1\), then
Proof
It then follows from Lemmas 3.1 and 3.2 that
Since \(\mu _k\Vert J_k^TF_k\Vert >C_1\), by (3.4) and \(\Vert F_k\Vert \le \Vert F_0\Vert \), we have
So, \(\mu _k\ge \frac{m}{c_1}\). It then follows from \(k\ge K\), Lemma 3.3 and the updating rule (1.9) that
By (3.11) and \(c_1\le (1+c_4c_2c_3^{-1})^{-1}\), we have
The proof is completed.\(\square \)
Let
be a positive constant.
The next lemma shows that \(\mu _k\Vert J^T_kF_k\Vert \) is upper bounded.
Lemma 3.5
Under conditions of Lemma 3.4,
Proof
We discuss in two cases.
Case 1 \(\mu _K\Vert J^T_KF_K\Vert \le c_0(1+c_4c_2c_3^{-1})C_1\). Then, we must have
Otherwise, suppose
It follows from (3.11) and \(\mu _{K+1}\le c_0\mu _K\) that
This gives
By Lemma 3.4, we obtain
This is a contradiction to (3.14). So (3.13) holds true.
By induction, we can obtain
Case 2 \(\mu _K\Vert J^T_KF_K\Vert > c_0(1+c_4c_2c_3^{-1})C_1\). Since \(c_0>1\), we have
So, by Lemma 3.4,
If \(\mu _{K+1}\Vert J^T_{K+1}F_{K+1}\Vert > c_0(1+c_4c_2c_3^{-1})C_1\), then by Lemma 3.4 and (3.17),
Otherwise, if \(\mu _{K+1}\Vert J^T_{K+1}F_{K+1}\Vert \le c_0(1+c_4c_2c_3^{-1})C_1\), then by the same arguments as in case 1, we have
In view of (3.17)–(3.19), we obtain
By induction, we can prove that, for all \(k> K\),
The proof is completed.\(\square \)
Let
The following lemma shows that \(\mu _k\Vert F_k\Vert \) is bounded by \(C_3\).
Lemma 3.6
Under conditions of Lemma 3.4,
$$\begin{aligned} \mu _k\Vert F_k\Vert \le C_3 \end{aligned}$$(3.21)
holds for all sufficiently large k.
Proof
It follows from (3.4) that
This, together with (3.7), gives
Thus, by (3.12), we obtain (3.21). The proof is completed.\(\square \)
3.2 Quadratic Convergence
Based on the above lemmas, we study the quadratic convergence of Algorithm 2.1 under the local error bound condition, by using the singular value decomposition (SVD) technique.
Suppose the SVD of \(J(\bar{x}_k)\) is
$$\begin{aligned} J(\bar{x}_k)=\bar{U}_{k,1}\bar{\Sigma }_{k,1}\bar{V}_{k,1}^T, \end{aligned}$$
where \(\bar{\Sigma }_{k,1}={\text {diag}}(\bar{\sigma }_{k,1},\ldots , \bar{\sigma }_{k,r})\) with \(\bar{\sigma }_{k,1} \ge \bar{\sigma }_{k,2}\ge \cdots \ge \bar{\sigma }_{k,r}>0\), and the corresponding SVD of \(J_k\) is
$$\begin{aligned} J_k=U_{k,1}\Sigma _{k,1}V_{k,1}^T+U_{k,2}\Sigma _{k,2}V_{k,2}^T, \end{aligned}$$
where \(\Sigma _{k,1}={\text {diag}}(\sigma _{k,1},\ldots , \sigma _{k,r})\) with \(\sigma _{k,1} \ge \cdots \ge \sigma _{k,r}>0\), and \(\Sigma _{k,2}={\text {diag}}(\sigma _{k,r+1},\ldots , \sigma _{k,n})\) with \(\sigma _{k,r}\ge \cdots \ge \sigma _{k,n}\ge 0\). In the following, if the context is clear, we omit the subscript k in \(\Sigma _{k,i}\) and \(U_{k,i}, V_{k,i}\, (i=1,2)\), and write \(J_k\) as
$$\begin{aligned} J_k=U_1\Sigma _1V_1^T+U_2\Sigma _2V_2^T. \end{aligned}$$
By the theory of matrix perturbation [12] and the Lipschitz continuity of J,
$$\begin{aligned} \Vert {\text {diag}}(\Sigma _1-\bar{\Sigma }_1, \Sigma _2)\Vert \le \Vert J_k-J(\bar{x}_k)\Vert \le L_1\Vert \bar{x}_k-x_k\Vert . \end{aligned}$$(3.22)
So,
$$\begin{aligned} \Vert \Sigma _1-\bar{\Sigma }_1\Vert \le L_1\Vert \bar{x}_k-x_k\Vert \quad \hbox {and}\quad \Vert \Sigma _2\Vert \le L_1\Vert \bar{x}_k-x_k\Vert . \end{aligned}$$
Since \(\{x_k\}\) converges to the solution set \(X^*\), we assume that \(L_1\Vert \bar{x}_k-x_k\Vert \le \bar{\sigma }_r/2\) holds for all sufficiently large k. Then, it follows from (3.22) that
$$\begin{aligned} \sigma _r\ge \bar{\sigma }_r-L_1\Vert \bar{x}_k-x_k\Vert \ge \bar{\sigma }_r/2. \end{aligned}$$(3.23)
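The perturbation bound invoked here is Mirsky's theorem: singular values are 1-Lipschitz with respect to the spectral norm. A quick numerical check (the helper name is ours):

```python
import numpy as np

def max_sv_gap(J, Jbar):
    """Largest gap between corresponding singular values of J and Jbar."""
    s = np.linalg.svd(J, compute_uv=False)
    sbar = np.linalg.svd(Jbar, compute_uv=False)
    return np.max(np.abs(s - sbar))
```

Mirsky's theorem guarantees `max_sv_gap(J, Jbar) <= ||J - Jbar||_2`, which is what the bound above uses with \(\bar{J}=J(\bar{x}_k)\) and the Lipschitz estimate \(\Vert J_k-J(\bar{x}_k)\Vert \le L_1\Vert \bar{x}_k-x_k\Vert \).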
Lemma 3.7
Under Assumption 3.1, if \(x_k\in N(x^*, b_1/2)\), then we have
(a) \(\Vert U_1U_1^TF_k\Vert \le L_2\Vert \bar{x}_k-x_k\Vert \);
(b) \(\Vert U_2U_2^TF_k\Vert \le 2L_1\Vert \bar{x}_k-x_k\Vert ^2\);
where \(L_1, L_2\) are given in (3.2) and (3.4) respectively.
Proof
(a) follows from (3.4) directly.
Denote \(F(\bar{x}_k)\) by \(\bar{F}_k\). By (3.3) and (3.22),
The proof is completed.\(\square \)
Now we can give the main result of this section.
Theorem 3.1
Under Assumption 3.1, the sequence generated by Algorithm 2.1 converges to some solution of (1.1) quadratically.
Proof
By the SVD of \(J_k\),
and
It follows from (3.4), (3.23), Lemmas 3.6 and 3.7 that
where \(c_5=\frac{4L_2^{2}C_3}{\bar{\sigma }_r^2}+2L_1\) is a positive constant. So, by (3.1), (3.3) and Lemma 3.1,
where \(c_6=c_5+c_2^2 L_1\) is a positive constant.
Note that
By (3.24),
holds for all sufficiently large k. Combining (3.5), (3.24) and (3.25), we obtain
The proof is completed.\(\square \)
Remark 3.1
If the Levenberg–Marquardt parameter is chosen as \(\lambda _k=\mu _k\Vert F_k\Vert ^\delta \), where \(\mu _k\) is updated by (1.9) and \(\delta \in (1,2]\), the algorithm converges superlinearly to some solution of the nonlinear equations with the order \(\delta \). The proof is almost the same as above, except that we have \(\Vert F_k+J_kd_k\Vert \le c_5\Vert \bar{x}_k- x_k\Vert ^\delta \) instead of \(\Vert F_k+J_kd_k\Vert \le c_5\Vert \bar{x}_k- x_k\Vert ^2\) in the proof of Theorem 3.1, which then yields \(\Vert d_{k+1}\Vert \le O(\Vert d_k\Vert ^\delta )\).
4 Numerical Results
We test Algorithm 2.1, where the LM parameter is computed by \(\lambda _k=\mu _k\Vert F_k\Vert ^2\) with \(\mu _k\) updated by (1.9), on some singular nonlinear equations, and compare it with two other LM algorithms, which take \(\lambda _k=\mu _k\Vert F_k\Vert \) and \(\lambda _k=\mu _k\Vert F_k\Vert ^2\) with \(\mu _k\) updated by (1.7), respectively.
The test problems are created by modifying the nonsingular problems given by Moré, Garbow and Hillstrom in [8], and have the same form as in [11],
$$\begin{aligned} \hat{F}(x)=F(x)-J(x^*)A(A^TA)^{-1}A^T(x-x^*), \end{aligned}$$
where F(x) is the standard nonsingular test function, \(x^*\) is its root, and \(A\in R^{n\times k}\) has full column rank with \(1\le k\le n\). Obviously, \(\hat{F}(x^*)=0\) and
$$\begin{aligned} \hat{J}(x^*)=J(x^*)\left( I-A(A^TA)^{-1}A^T\right) \end{aligned}$$
has rank \(n-k\). A disadvantage of these problems is that \(\hat{F}(x)\) may have roots that are not roots of F(x). We create two sets of singular problems, with \(\hat{J}(x^*)\) having rank \(n-1\) and \(n-2\), by using
$$\begin{aligned} A=(1,1,\ldots ,1)^T\in R^{n\times 1} \end{aligned}$$
and
$$\begin{aligned} A=\left( \begin{array}{cccc} 1 &{} 1 &{} \cdots &{} 1\\ 1 &{} -1 &{} \cdots &{} (-1)^{n-1} \end{array}\right) ^T\in R^{n\times 2}, \end{aligned}$$
respectively. We also make a slight alteration to the variable dimension problem, which has \(n+2\) equations in n unknowns: we eliminate the \((n-1)\)-th and n-th equations. (The first n equations in the standard problem are linear.)
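The construction of the singular test problems can be sketched in code; the explicit formula for \(\hat{F}\) below is our reconstruction of the Schnabel–Frank modification described above, so treat it as a sketch:

```python
import numpy as np

def make_singular_problem(F, J, x_star, A):
    """Schnabel--Frank style modification (reconstructed from the text):
    Fhat(x) = F(x) - J(x*) A (A^T A)^{-1} A^T (x - x*),
    so Fhat(x*) = 0 and Jhat(x*) has rank n - k for A in R^{n x k}."""
    P = A @ np.linalg.solve(A.T @ A, A.T)   # projector onto range(A)
    J_star = J(x_star)
    Fhat = lambda x: F(x) - J_star @ (P @ (x - x_star))
    Jhat = lambda x: J(x) - J_star @ P
    return Fhat, Jhat
```

For instance, modifying the identity map \(F(x)=x\) with \(A=e_1\) yields \(\hat{F}(x)=(0, x_2)^T\), whose Jacobian at the root has rank \(n-1\).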
We set \(p_0=0.0001, p_1=0.25, p_2=0.75, c_0=4, c_1=0.25, \mu _0=10^{-8}, m=10^{-8}, \varepsilon =10^{-6}\) for all the tests. The algorithm stops when \(\Vert J^T_kF_k\Vert \le \varepsilon \) or the number of iterations exceeds \(100(n+1)\). The results for the first set of problems (rank \(n-1\)) at small scale are listed in Table 1, and those for the second set (rank \(n-2\)) in Table 2. We also test the algorithms on some large scale problems; the results are given in Tables 3 and 4.
The third column of each table indicates whether the starting point is \(x_0\), \(10x_0\) or \(100x_0\), where \(x_0\) is suggested by Moré, Garbow and Hillstrom in [8]; "NF" and "NJ" denote the numbers of function and Jacobian evaluations, respectively. If an algorithm fails to find the solution within \(100(n+1)\) iterations, we denote it by "−"; if it has underflows or overflows, we denote it by "OF". Note that, for general nonlinear equations, a Jacobian evaluation usually costs about n times as much as a function evaluation. So, for small scale problems, we also present the values "NF+n*NJ" to compare the total costs. However, if the Jacobian is sparse, this value is less meaningful. For the large scale problems, the computing time is also given.
From Tables 1 and 2, we can see that Algorithm 2.1 performs about the same as the two other LM algorithms on small scale problems. From Tables 3 and 4, we can see that Algorithm 2.1 outperforms them on most large scale problems.
5 Conclusion and Discussion
In traditional LM algorithms for nonlinear equations, both the iterate and the LM parameter are updated according to the ratio of the actual reduction to the predicted reduction of the merit function (cf. [1, 2]). In this paper, we proposed a new LM algorithm for nonlinear equations, where the LM parameter is taken as \(\lambda _k=\mu _k\Vert F_k\Vert ^2\) with \(\mu _k\) being updated by (1.9). Though the iterate is still updated according to the ratio of the actual reduction to the predicted reduction, the update of \(\mu _k\) is no longer based on it. When the iteration is unsuccessful, \(\mu _k\) is increased; otherwise it is updated based on the value of the gradient norm of the merit function as in (1.9). We proved that all limit points of the sequence generated by the algorithm are stationary points of the merit function under standard conditions. Since the updating rule of \(\mu _k\) changes, the analysis of the convergence rate in this paper is quite different from those in [1, 3]. We developed new techniques to prove the quadratic convergence of the algorithm under the local error bound condition.
We also discussed the choice of the LM parameter \(\lambda _k=\mu _k\Vert F_k\Vert ^\delta \), where \(\mu _k\) is updated by (1.9) and \(\delta \in [1,2)\). By a similar analysis, the algorithm converges with order \(\delta \). We conjecture that the convergence rate is in fact quadratic for any \(\delta \in [1,2)\); this will be our future study.
References
Fan, J.Y.: A modified Levenberg–Marquardt algorithm for singular system of nonlinear equations. J. Comput. Math. 21, 625–636 (2003)
Fan, J.Y.: The modified Levenberg–Marquardt method for nonlinear equations with cubic convergence. Math. Comput. 81, 447–466 (2012)
Fan, J.Y., Pan, J.Y., Song, H.Y.: A retrospective trust region algorithm with trust region converging to zero. J. Comput. Math. 34, 421–436 (2016)
Fan, J.Y., Yuan, Y.X.: On the quadratic convergence of the Levenberg–Marquardt method without nonsingularity assumption. Computing 74, 23–39 (2005)
Levenberg, K.: A method for the solution of certain nonlinear problems in least squares. Quart. Appl. Math. 2, 164–166 (1944)
Marquardt, D.W.: An algorithm for least-squares estimation of nonlinear parameters. SIAM J. Appl. Math. 11, 431–441 (1963)
Moré, J.J.: The Levenberg–Marquardt algorithm: implementation and theory. In: Watson, G.A. (ed.) Numerical Analysis, Lecture Notes in Mathematics, vol. 630, pp. 105–116. Springer, Berlin (1978)
Moré, J.J., Garbow, B.S., Hillstrom, K.E.: Testing unconstrained optimization software. ACM Trans. Math. Softw. 7, 17–41 (1981)
Osborne, M.R.: Nonlinear least squares-the Levenberg–Marquardt algorithm revisited. J. Aust. Math. Soc. 19, 343–357 (1976)
Powell, M.J.D.: Convergence properties of a class of minimization algorithms. In: Mangasarian, O.L., Meyer, R.R., Robinson, S.M. (eds.) Nonlinear Programming 2, pp. 1–27. Academic Press, New York (1975)
Schnabel, R.B., Frank, P.D.: Tensor methods for nonlinear equations. SIAM J. Numer. Anal. 21, 815–843 (1984)
Stewart, G.W., Sun, J.G.: Matrix Perturbation Theory. Academic Press, San Diego (1990)
Wright, S.J., Holt, J.N.: An inexact Levenberg–Marquardt method for large sparse nonlinear least squares. J. Aust. Math. Soc. Ser. B 26, 387–403 (1985)
Yamashita, N., Fukushima, M.: On the rate of convergence of the Levenberg–Marquardt method. Computing (Supplement) 15, 237–249 (2001)
Yuan, Y.X.: Trust region algorithms for nonlinear equations. Information 1, 7–20 (1998)
Yuan, Y.X.: Recent advances in numerical methods for nonlinear equations and nonlinear least squares. Numer. Algebra Control Optim. 1, 15–34 (2011)
Yuan, Y.X.: Recent advances in trust region algorithms. Math. Program. Ser. B 151, 249–281 (2015)
Zhao, R.X., Fan, J.Y.: Global complexity bound of the Levenberg–Marquardt method. Optim. Methods Softw. 31, 805–814 (2016)
Additional information
Jinyan Fan is partially supported by the NSFC Grant 11571234.
Zhao, R., Fan, J. On a New Updating Rule of the Levenberg–Marquardt Parameter. J Sci Comput 74, 1146–1162 (2018). https://doi.org/10.1007/s10915-017-0488-6