
1 Introduction

Solving a nonlinear system of the form \(\varvec{f}(\varvec{x})=\varvec{0}\) with \(\varvec{f}={({f}_1, f_2, \ldots , f_n)}^{T}\) and \(\varvec{x}={(x_1, \ldots , x_n)}^T\) is one of the most fundamental problems in scientific computing. In this paper, we assume that \(\varvec{f}: \mathbb {R}^n \rightarrow \mathbb {R}^n\) and that \(f_1, f_2, \ldots , f_n\) have continuous partial derivatives of all orders. Denote the Jacobian matrix of \(\varvec{f}\) at \(\varvec{x}\) by \(\varvec{f}'(\varvec{x})\).

Newton’s method and its modifications have long played an important role in solving nonlinear systems. Under certain conditions, Newton’s method constructs a sequence of iteration points that converges to a solution of the given nonlinear system. In 1948, the author of [2] established the Kantorovich theorem under the assumption that the Jacobian matrix of the nonlinear system is Lipschitz continuous on some domain. The Kantorovich theorem gives conditions ensuring that a simple solution exists close to the initial point and that the Newton iteration sequence converges quadratically to this simple solution. Using the technique of point estimates, Smale et al. [15,16,17] developed the alpha theory to locate and approximate simple solutions. The alpha theory requires only information about the nonlinear system at the initial point of the Newton iteration. By introducing the dominating sequence technique, Wang and Han [18] improved both the condition and the conclusion of the alpha theory. With the aid of the Schröder operator, Giusti et al. [3] provided a criterion for locating clusters of solutions of univariate nonlinear functions. Later, Giusti et al. [4] generalized their results to locate breadth-one singular solutions of multivariate nonlinear systems. On the practical side of the alpha theory, Hauenstein and Sottile [6] described the program alphaCertified for certifying solutions of polynomial systems. Recently, Hauenstein and Levandovskyy [5] extended alphaCertified to verify solutions of polynomial-exponential systems.

Interval arithmetic is another important tool for verification methods. In the 1960s, Krawczyk [9] first introduced an interval version of Newton’s method for verifying the existence of simple solutions. Moore [10] proposed a computationally verifiable sufficient condition for the interval Newton method given by Krawczyk. Rump [12] made the interval Newton method perform better in practice; the resulting criterion is called Rump’s interval theorem and is implemented in the verifynlss function of the INTLAB toolbox [13] for Matlab. Based on the deflation technique with smoothing parameters, Rump and Graillat [14] described a verification algorithm for multiple solutions of univariate nonlinear functions and double solutions of multivariate nonlinear systems. Further, Li and Zhi [8] provided an algorithm to verify breadth-one singular solutions of polynomial systems, which was later generalized to the verification of isolated singular solutions in [7]. By combining interval algorithms with other methods, Yang et al. [19] investigated the verification of real solutions of positive-dimensional polynomial systems.

In 1980, Rall [11] exhibited the relationship between the Kantorovich theorem and Moore’s interval theorem. In terms of the quantities of the Kantorovich theorem, Rall provided conditions under which Moore’s verifiable sufficient condition holds. For an initial approximation \(\varvec{x}^{(0)}\in \mathbb {R}^n \) and a radius \(\rho >0\), let \(\varvec{X}_{\rho }\) denote the set \(\{\varvec{x}: \Vert \varvec{x}-\varvec{x}^{(0)}\Vert _{\infty }<\rho \}\), and let \(\eta \), B, \(\kappa \) be constants such that

$$\begin{aligned}&\Vert {\varvec{f}'(\varvec{x}^{(0)} )}^{-1}\varvec{f}(\varvec{x}^{(0)}) \Vert _\infty \le \eta ,\\&\Vert {\varvec{f}'(\varvec{x}^{(0)} )}^{-1}\Vert _\infty \le B,\\&\Vert \varvec{f}'(\varvec{u})-\varvec{f}'(\varvec{v})\Vert _\infty \le \kappa \Vert \varvec{u}-\varvec{v}\Vert _\infty , \quad \quad \varvec{u}, \varvec{v}\in \varOmega , \end{aligned}$$

where \(\varOmega \) is a sufficiently large region containing \(\varvec{x}^{(0)}\). Rall’s conclusion is that if

$$h=B\kappa \eta <\frac{1}{4},$$

then

$$ \varvec{x}^{(0)}-{\varvec{f}'(\varvec{x}^{(0)})}^{-1}\varvec{f}(\varvec{x}^{(0)})+(I-{\varvec{f}'(\varvec{x}^{(0)})}^{-1} \varvec{f}'(\varvec{X}_{\rho })) (\varvec{X}_{\rho }-\varvec{x}^{(0)}) \subset \varvec{X}_{\rho }$$

holds for any \(\rho \) satisfying the inequality

$$\frac{1-\sqrt{1-4 h}}{2 h} \eta \le \rho \le \frac{1+\sqrt{1-4 h}}{2 h} \eta .$$

Since the alpha theory and Rump’s interval theorem generalize the Kantorovich theorem and Moore’s interval theorem, respectively, we generalize Rall’s conclusion and discuss the relationship between the alpha theory and Rump’s interval theorem in this paper. Using only information about the given system at the initial approximation, we provide conditions guaranteeing that, after finitely many Newton iterations, we obtain an approximate point whose associated interval vector satisfies the condition of Rump’s interval theorem. The next section gives some notation and background results.

2 Notation and Preliminaries

First of all, we emphasize that the vector norm and the matrix norm \(\Vert \cdot \Vert \) in this paper are both the infinity norm \(\Vert \cdot \Vert _{\infty }\), since the metric for interval vectors is closely related to the infinity norm.

Henceforward, we use boldface letters to express tuples and denote their entries by the same letter with subscripts, for example \(\varvec{\alpha }={({\alpha }_1, \ldots , {\alpha }_n)}^T\). Denote the usual product order on \(\mathbb {R}^n\) by \(\le \), that is, for arbitrary \(\varvec{\alpha }, \varvec{\beta }\in \mathbb {R}^n\), \(\varvec{\alpha }\le \varvec{\beta }\) if and only if \(\alpha _i\le \beta _i\) for \(1\le i\le n\).

For \(\varvec{x}\in \mathbb {R}^n\), if \(\varvec{f}'(\varvec{x})\) is nonsingular, then define

$$\begin{aligned} \alpha (\varvec{f}, \varvec{x})&=\beta (\varvec{f}, \varvec{x})\gamma (\varvec{f}, \varvec{x}),\\ \beta (\varvec{f}, \varvec{x})&=\Vert {\varvec{f}'(\varvec{x})}^{-1} \varvec{f}(\varvec{x})\Vert ,\\ \gamma (\varvec{f}, \varvec{x})&=\sup _{k\ge 2}{\Vert {\varvec{f}'(\varvec{x})}^{-1}\frac{\varvec{f}^{(k)}(\varvec{x})}{k!}\Vert }^{1/(k-1)}. \end{aligned}$$

Given the initial approximation \(\varvec{x}^{(0)}\) with the associated simple root \(\varvec{x}^*\) of \(\varvec{f}\), we let \(\alpha \), \(\beta \) and \(\gamma \) stand for \(\alpha (\varvec{f}, \varvec{x}^{(0)})\), \(\beta (\varvec{f}, \varvec{x}^{(0)})\) and \(\gamma (\varvec{f}, \varvec{x}^{(0)})\), respectively. Applying Newton’s method to \(\varvec{f}\) yields the Newton iteration sequence \(\{\varvec{x}^{(k)}\}\), that is,

$$\varvec{x}^{(k+1)}= \varvec{x}^{(k)}-{\varvec{f}'(\varvec{x}^{(k)})}^{-1}\varvec{f}(\varvec{x}^{(k)}), \quad \quad k\in \mathbb {N}.$$
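In floating-point arithmetic this iteration is only a few lines of code. The following Python sketch is illustrative only; the quadratic test system and the starting point are borrowed from Example 1 in Sect. 4.

```python
import numpy as np

def newton(f, jac, x0, steps=10):
    """Plain Newton iteration: x <- x - f'(x)^{-1} f(x)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x - np.linalg.solve(jac(x), f(x))
    return x

# Test problem: f1 = x1^2 + x2 - 2, f2 = x1 - x2, started at (1.8, 1.8).
f = lambda x: np.array([x[0]**2 + x[1] - 2.0, x[0] - x[1]])
jac = lambda x: np.array([[2.0 * x[0], 1.0], [1.0, -1.0]])
root = newton(f, jac, [1.8, 1.8])   # converges to the simple root (1, 1)
```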

The alpha theory provides convergence conditions, stated only in terms of the values \(\alpha \), \(\beta \) and \(\gamma \), which ensure that the sequence \(\{\varvec{x}^{(k)}\}\) converges to \(\varvec{x}^*\). The dominating sequence technique is a powerful tool for improving the alpha theory. The dominating sequence \(\{t^{(k)}\}\) is produced by the Newton iteration with initial value \(t^{(0)}=0\) for the univariate function

$$ h(t)=\beta -t+\frac{\gamma t^2}{1- \gamma t}, $$

where the equation \(h(t)=0\) has the following two solutions

$$\begin{aligned} t^{*}=\frac{ 2 \beta }{1+\alpha +\sqrt{1-6 \alpha +{\alpha }^2}},\quad t^{**}=\frac{1+\alpha +\sqrt{1-6 \alpha +{\alpha }^2}}{4 \gamma }. \end{aligned}$$
(1)
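As a sanity check, the two roots in (1) can be verified numerically against \(h\); in the Python sketch below, the values \(\beta =0.05\), \(\gamma =1\) are arbitrary choices satisfying \(0<\alpha <3-2\sqrt{2}\).

```python
import math

beta, gamma = 0.05, 1.0          # arbitrary test values
alpha = beta * gamma
s = math.sqrt(1 - 6*alpha + alpha*alpha)

t_star = 2*beta / (1 + alpha + s)            # t^*  in (1)
t_star_star = (1 + alpha + s) / (4*gamma)    # t^** in (1)

def h(t):
    """The majorizing function h(t) = beta - t + gamma*t^2/(1 - gamma*t)."""
    return beta - t + gamma*t*t / (1 - gamma*t)
```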

The following theorem is a version of the alpha theory given by Wang and Han, in which the condition on the quantity \(\alpha \) is the best possible.

Theorem 1

[4, 18] If \(0<\alpha <3-2\sqrt{2}\), then for any \(t^{*}\le \rho <t^{**}\), the system \(\varvec{f}(\varvec{x})=\varvec{0}\) has exactly one simple root \(\varvec{x}^{*}\) in \(\overline{B}(\varvec{x}^{(0)}, \rho )\). In addition, the Newton iteration sequence \(\{\varvec{x}^{(k)}\}\) converges quadratically to \(\varvec{x}^{*}\), and the dominating sequence \(\{t^{(k)}\}\) increases and converges quadratically to \(t^*\). Furthermore, for all \(k\in \mathbb {N}\),

$$\begin{aligned} \Vert \varvec{x}^{(k+1)}-\varvec{x}^{(k)} \Vert&\le t^{(k+1)}-t^{(k)},\\ \Vert \varvec{x}^{(k+1)}-\varvec{x}^{(k)} \Vert&\le {q(\alpha )}^{2^{k}-1} \beta \end{aligned}$$

with

$$\begin{aligned} q(\alpha )=\frac{4 \alpha }{{(1-\alpha +\sqrt{1-6\alpha +\alpha ^2}) }^2}. \end{aligned}$$
(2)

Denote the set of intervals by \(\mathbb {I}\mathbb {R}\). An interval vector \(\mathbf X =[\underline{\varvec{x}}, \overline{\varvec{x}}]\in \mathbb {I}\mathbb {R}^n\) with \(\underline{\varvec{x}}, \overline{\varvec{x}}\in \mathbb {R}^n\) and \(\underline{\varvec{x}}\le \overline{\varvec{x}}\) is defined by

$$\mathbf X =[\underline{\varvec{x}}, \overline{\varvec{x}}]=\{\varvec{x}\in \mathbb {R}^n:\underline{\varvec{x}} \le \varvec{x} \le \overline{\varvec{x}}\}.$$

For \(\varvec{x}\in \mathbb {R}^n\) and \(\mathbf X =[\underline{\varvec{x}}, \overline{\varvec{x}}]\in \mathbb {I}\mathbb {R}^n\), we set \(\varvec{x}+\mathbf X =[\varvec{x}+\underline{\varvec{x}}, \varvec{x}+\overline{\varvec{x}}]\). Let \(\mathbf Y _\rho =\{\varvec{y}\in \mathbb {R}^n: \Vert \varvec{y}\Vert \le \rho \}\); then \(\overline{B}(\varvec{x}, \rho )=\varvec{x}+\mathbf Y _\rho \).

The norm of the interval vector \(\mathbf X =[\underline{\varvec{x}}, \overline{\varvec{x}}]\) is defined by

$$\begin{aligned} \Vert \mathbf X \Vert =\Vert [\underline{\varvec{x}}, \overline{\varvec{x}}]\Vert&=\max \{ \Vert \varvec{x}\Vert : \varvec{x}\in \mathbf X \}. \end{aligned}$$

In addition, \(\mathrm {int}(\mathbf X )\) denotes the interior of the interval vector \(\mathbf X \). Given a set \(\mathbf Z \subset \mathbb {R}^n\), the interval hull of \(\mathbf Z \) is the narrowest interval vector containing \(\mathbf Z \), namely,

$$\text {hull}(\mathbf {Z})= \bigcap \{\mathbf {X}\in \mathbb {I}\mathbb {R}^n: \mathbf {X}\supseteq \mathbf {Z}\}.$$

Given a continuous mapping \(\varvec{g}: \mathbb {R}^{n} \rightarrow \mathbb {R}^{m}\) and an interval vector \(\mathbf X \), the interval vector \(\varvec{g}(\mathbf X )\in \mathbb {I}\mathbb {R}^{m}\) is defined as

$$ \varvec{g}(\mathbf X )=\mathrm {hull}\{\varvec{g}(\varvec{x}):\varvec{x}\in \mathbf X \}.$$

Given an interval matrix \(\mathbf A \in \mathbb {I}\mathbb {R}^{m\times n}\) and an interval vector \(\mathbf X \in \mathbb {I}\mathbb {R}^{n}\), the interval vector \(\mathbf A {} \mathbf X \) is defined by

$$\mathbf A {} \mathbf X =\mathrm {hull}\{A \varvec{x}: A\in \mathbf A , \varvec{x}\in \mathbf X \}.$$

In particular, the norm of the interval matrix \(\mathbf A \) is defined as

$$\Vert \mathbf A \Vert =\max \{\Vert A\Vert : A\in \mathbf A \}.$$
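The interval operations above can be mimicked by a few lines of code. The following minimal Python class is an illustration only: a rigorous implementation such as INTLAB must additionally round the computed endpoints outward.

```python
# Minimal interval arithmetic sketch (no directed rounding).
class Interval:
    def __init__(self, lo, hi):
        assert lo <= hi
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        # the product interval is spanned by the four endpoint products
        p = [self.lo*other.lo, self.lo*other.hi,
             self.hi*other.lo, self.hi*other.hi]
        return Interval(min(p), max(p))

    def norm(self):
        """||X|| = max{|x| : x in X}."""
        return max(abs(self.lo), abs(self.hi))

    def contains_interior(self, other):
        """Is `other` contained in int(self)?"""
        return self.lo < other.lo and other.hi < self.hi
```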

The following theorem is a version of the interval Newton method given by Rump.

Theorem 2

[12] Given \({\varvec{x}}^{(0)}\in \mathbb {R}^n\), \(\mathbf Y \in \mathbb {I} \mathbb {R}^n\) with \(\varvec{0}\in \mathbf Y \), \(R\in \mathbb {R}^{n\times n}\), if

$$ S(\mathbf Y , {\varvec{x}}^{(0)}):=-R {\varvec{f}}({\varvec{x}}^{(0)})+(I-R \varvec{f}'({\varvec{x}}^{(0)}+\mathbf Y )) \mathbf Y \subseteq \mathrm {int} (\mathbf Y ), $$

then there exists a unique \({{\varvec{x}}^{*}}\in {\varvec{x}}^{(0)}+\mathbf Y \) such that \(\varvec{f}({{\varvec{x}}^{*}})=\varvec{0}\).
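For intuition, the condition of Theorem 2 can be checked with plain endpoint arithmetic for the univariate function \(f(x)=x^2-2\); the function, the point \(x^{(0)}=1.5\), the choice \(R=1/f'(x^{(0)})\) and the radius \(\rho =0.1\) are all our own illustrative choices, and directed rounding is omitted.

```python
# Univariate check of Rump's condition for f(x) = x^2 - 2.
x0, rho = 1.5, 0.1
fx0 = x0**2 - 2.0          # f(x0) = 0.25
R = 1.0 / (2.0 * x0)       # R = 1/f'(x0) = 1/3

# f'(x0 + Y) = 2*(x0 + Y) = [2*(x0 - rho), 2*(x0 + rho)]
dlo, dhi = 2.0 * (x0 - rho), 2.0 * (x0 + rho)
# I - R*f'(x0 + Y)
mlo, mhi = 1.0 - R * dhi, 1.0 - R * dlo
# (I - R*f'(x0 + Y)) * Y with Y = [-rho, rho]
prods = [mlo * -rho, mlo * rho, mhi * -rho, mhi * rho]
plo, phi = min(prods), max(prods)
# S(Y, x0) = -R*f(x0) + (I - R*f'(x0 + Y)) * Y
slo, shi = -R * fx0 + plo, -R * fx0 + phi
inclusion = (-rho < slo) and (shi < rho)   # S(Y, x0) in int(Y)?
```

Since the computed \(S(\mathbf Y , x^{(0)})\) lies in the interior of \(\mathbf Y =[-\rho ,\rho ]\), Theorem 2 certifies a unique root of \(x^2-2\) (namely \(\sqrt{2}\)) in [1.4, 1.6].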

3 Main Results

To state the main results of this paper, we need the following functions:

$$\begin{aligned} \psi (u)&=2 u^2-4 u+1,\\ g_1(\alpha )&={\alpha }^2-4 \alpha +10, \\ g_2(\alpha )&={\alpha }^3- 6 {\alpha }^2+21 \alpha +28,\\ g_3(\alpha )&=4{\alpha }^3-25 {\alpha }^2+88 {\alpha }-8, \\ \theta (\alpha )&=\frac{1}{3}\arccos (\frac{g_2(\alpha ) }{ \sqrt{{g_1(\alpha ) }^3}}),\\ \omega _1(\alpha )&= \sqrt{g_1(\alpha )} \cos (\theta (\alpha )+\frac{2\pi }{3})+2+ \frac{\alpha }{2},\\ \omega _2(\alpha )&=\sqrt{g_1(\alpha )} \cos (\theta (\alpha )+\frac{4\pi }{3})+2+ \frac{\alpha }{2},\\ \omega _3(\alpha )&=\sqrt{g_1(\alpha )} \cos (\theta (\alpha ))+2+ \frac{\alpha }{2}. \end{aligned}$$
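These functions translate directly into code. The sketch below also checks, for the arbitrary test pair \(\beta =0.05\), \(\gamma =1\), that \(\omega _j(\alpha )/(3\gamma )\), \(j=1,2,3\), are precisely the three roots of the cubic equation (11) appearing in the proof of Proposition 1.

```python
import math

def psi(u):   return 2*u*u - 4*u + 1
def g1(a):    return a*a - 4*a + 10
def g2(a):    return a**3 - 6*a*a + 21*a + 28
def g3(a):    return 4*a**3 - 25*a*a + 88*a - 8
def theta(a): return math.acos(g2(a) / math.sqrt(g1(a)**3)) / 3

def omega(a, j):
    """omega_j(a) for j = 1, 2, 3."""
    shift = {1: 2*math.pi/3, 2: 4*math.pi/3, 3: 0.0}[j]
    return math.sqrt(g1(a)) * math.cos(theta(a) + shift) + 2 + a/2

# Test data: beta = 0.05, gamma = 1, so alpha = 0.05 < alpha*.
beta, gamma = 0.05, 1.0
a = beta * gamma
cubic = lambda r: (r**3 - (4 + a)/(2*gamma) * r**2
                   + (1 + 2*a)/(2*gamma**2) * r - beta/(2*gamma**2))
residuals = [abs(cubic(omega(a, j) / (3*gamma))) for j in (1, 2, 3)]
```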

Here we give some lemmas and one proposition from which the main theorem will easily follow.

Lemma 1

[15] Given \(\widetilde{\varvec{x}}\in \mathbb {R}^n\), if \(\gamma \Vert \widetilde{\varvec{x}}-{\varvec{x}}^{(0)}\Vert <1-\sqrt{2}/{2}\), then

$$ \gamma (\varvec{f}, \widetilde{\varvec{x}})\le \frac{\gamma }{\psi (\gamma \Vert \widetilde{\varvec{x}}-{\varvec{x}}^{(0)}\Vert )(1-\gamma \Vert \widetilde{\varvec{x}}-{\varvec{x}}^{(0)}\Vert )}.$$

Lemma 2

If \(0<\alpha < 3-2 \sqrt{2}\), then for all \(k\ge 1\),

$$\begin{aligned}&\gamma \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert <1-\frac{\sqrt{2}}{2},\end{aligned}$$
(3)
$$\begin{aligned}&\gamma (f, \varvec{x}^{(k)})\le \frac{\gamma }{{\psi (\gamma \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert )}{(1-\gamma \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert )}}. \end{aligned}$$
(4)

Proof

If \(0<\alpha <3-2 \sqrt{2}\), then by Theorem 1,

$$\begin{aligned} \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert&\le \sum _{j=1}^{k} \Vert \varvec{x}^{(j)}-\varvec{x}^{(j-1)}\Vert \le \sum _{j=1}^{k} (t^{(j)}-t^{(j-1)})\\&\le t^{(k)} \le t^{*}. \end{aligned}$$

Therefore

$$\gamma \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert \le \frac{2 \alpha }{1+ \alpha + \sqrt{1-6 \alpha +\alpha ^2}}.$$

Since the right-hand side of the above inequality increases monotonically from 0 to \(1-\sqrt{2}/2\) as \(\alpha \) goes from 0 to \(3-2\sqrt{2}\), it follows that (3) holds. By means of Lemma 1, (4) follows.    \(\square \)
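The monotonicity used in this step is easy to confirm numerically (illustrative Python check of the bound \(p(\alpha )=2\alpha /(1+\alpha +\sqrt{1-6\alpha +\alpha ^2})\)):

```python
import math

def p(a):
    return 2*a / (1 + a + math.sqrt(1 - 6*a + a*a))

a_max = 3 - 2*math.sqrt(2)
# sample the half-open interval [0, 3 - 2*sqrt(2))
vals = [p(a_max * k / 1000) for k in range(1000)]
increasing = all(x < y for x, y in zip(vals, vals[1:]))
```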

Lemma 3

If

$$\begin{aligned}&0<\rho < \frac{1-\sqrt{2}/{2}}{\gamma }, \end{aligned}$$
(5)
$$\begin{aligned}&\beta +\rho (\frac{1}{{(1-\gamma \rho )}^2}-1)<\rho , \end{aligned}$$
(6)

then

$$\begin{aligned} S(\mathbf{Y }_{\rho }, {\varvec{x}}^{(0)})\subseteq \mathrm {int}(\mathbf{Y }_{\rho }). \end{aligned}$$
(7)

Proof

Given an arbitrary real vector \(\varvec{y}\in \mathbf{Y }_{\rho }\), we expand the Jacobian matrix \(\varvec{f}'(\varvec{x}^{(0)}+\varvec{y})\) into a power series and get

$$\begin{aligned} {\varvec{f}'(\varvec{x}^{(0)})}^{-1} \varvec{f}'(\varvec{x}^{(0)}+\varvec{y})&={\varvec{f}'(\varvec{x}^{(0)})}^{-1}({\varvec{f}'(\varvec{x}^{(0)})}+\sum _{k=2}^{\infty } { \varvec{f}^{(k)}(\varvec{x}^{(0)})} \frac{\varvec{y}^{k-1}}{(k-1)!})\\&=I+\sum _{k=2}^{\infty } k {\varvec{f}'(\varvec{x}^{(0)})}^{-1} \varvec{f}^{(k)}(\varvec{x}^{(0)})\frac{ \varvec{y}^{k-1} }{k!}. \end{aligned}$$

If (5) holds, then

$$\begin{aligned} \Vert I-{\varvec{f}'(\varvec{x}^{(0)})}^{-1} \varvec{f}'(\varvec{x}^{(0)}+\varvec{y})\Vert&\le \sum _{k=2}^{\infty } k \Vert {\varvec{f}'(\varvec{x}^{(0)})}^{-1} \frac{\varvec{f}^{(k)}(\varvec{x}^{(0)})}{k!}\Vert { \Vert \varvec{y}\Vert ^{k-1} }\\&\le \sum _{k=2}^{\infty } k {(\gamma \rho )}^{k-1}\\&=\frac{1}{ {(1-\gamma \rho )}^2}-1. \end{aligned}$$

Suppose now that (5) and (6) hold; then for an arbitrary real vector \(\varvec{y}\in \mathbf{Y }_{\rho }\), we can infer that

$$\begin{aligned} \Vert -{\varvec{f}'(\varvec{x}^{(0)})}^{-1} \varvec{f}(\varvec{x}^{(0)}) +(I-{\varvec{f}'(\varvec{x}^{(0)})}^{-1} \varvec{f}'(\varvec{x}^{(0)}+\varvec{y}))\varvec{y}\Vert < \rho , \end{aligned}$$

which implies that (7) holds.    \(\square \)

Lemma 4

Let

$$\begin{aligned} \alpha ^*&=-\frac{1}{12} {(22247+1320 \sqrt{330})}^{1/3}+\frac{431}{12 {(22247+1320 \sqrt{330})}^{1/3}} +\frac{25}{12} \\&\approx 0.093347623,\nonumber \end{aligned}$$
(8)

then for an arbitrary \(0< \alpha < \alpha ^*\), we have

$$\begin{aligned}&g_1(\alpha ^*)< g_1(\alpha )< g_1(0),\\&g_2(0)<g_2(\alpha )< g_2(\alpha ^*),\\&g_3(0)<g_3(\alpha )< g_3(\alpha ^*)=0,\\&\theta (\alpha ^*)< \theta (\alpha )< \theta (0),\\&\omega _1(0)< \omega _1(\alpha )< \omega _1(\alpha ^*), \\&\omega _2(\alpha ^*)< \omega _2(\alpha )< \omega _2(0)<3-\frac{3\sqrt{2}}{2},\\&\omega _3(\alpha )>3-\frac{3\sqrt{2}}{2}. \end{aligned}$$

Proof

According to Cardano’s root-finding formula, the equation \(g_3(\alpha )=0\) has exactly one positive real root, namely \(\alpha ^*\) as in (8). Obviously, \(g'_1(\alpha )<0\) and \(g'_2(\alpha )>0\) for all \(0<\alpha < \alpha ^*\). Thus both \(\theta (\alpha )\) and \(\omega _2(\alpha )\) decrease monotonically on the interval \((0, 3-2\sqrt{2})\). A routine computation gives rise to

$$\begin{aligned} \omega '_1(\alpha )= \frac{(4-2\alpha )\sqrt{-3 g_3(\alpha )}\sin (\theta (\alpha )+\frac{\pi }{6})+(84-36\alpha )\cos (\theta (\alpha )+\frac{\pi }{6} )}{2\sqrt{-3g_1(\alpha )g_3(\alpha )}}+\frac{1}{2}, \end{aligned}$$

and hence \(\omega '_1(\alpha )>0\) for all \(0<\alpha <\alpha ^*\). The lemma follows immediately from what we have proved.    \(\square \)

Proposition 1

Under the condition (5), the inequality (6) holds if and only if

$$\begin{aligned} 0<&\alpha <\alpha ^{*},\end{aligned}$$
(9)
$$\begin{aligned} \frac{\omega _1(\alpha )}{ 3 \gamma }<&\rho <\frac{\omega _2(\alpha )}{ 3 \gamma }. \end{aligned}$$
(10)

Proof

Define \(\varDelta ={({q}/{2})}^2+{({p}/{3})}^3\) with

$$\begin{aligned} p&=-\frac{1}{3}{(\frac{4+\beta \gamma }{-2{\gamma } })}^2+\frac{1+2 \beta \gamma }{2 {\gamma }^2},\\ q&=2{(\frac{4+\beta \gamma }{-6 \gamma })}^3 +\frac{\beta }{2 {\gamma }^2}-\frac{(4+\beta \gamma )(1+2\beta \gamma )}{-12 {\gamma }^3 }, \end{aligned}$$

then

$$\begin{aligned} p=-\frac{ g_1(\alpha )}{12{\gamma }^2 },\quad q=-\frac{g_2(\alpha ) }{108 {\gamma ^3}}, \quad \varDelta =\frac{g_3(\alpha )}{ 1728 \gamma ^6}. \end{aligned}$$

Since \(g_1(\alpha )>0\) and \(g_2(\alpha )>0\) for all \(\alpha >0\), we have \(p<0\) and \(q<0\), and there are three cases to consider.

Case I: \(\alpha >\alpha ^*\). In this case, \(\varDelta >0\) and the equation

$$\begin{aligned} \rho ^3-\frac{4+\beta \gamma }{2\gamma }\rho ^2+ \frac{1+2\beta \gamma }{2{\gamma }^2}\rho -\frac{\beta }{2 {\gamma }^2}=0 \end{aligned}$$
(11)

has exactly one real solution

$$\begin{aligned} \rho ^*= \root 3 \of {{-\frac{q}{2}+\sqrt{\varDelta }}} +\root 3 \of {{-\frac{q}{2}-\sqrt{\varDelta }}}. \end{aligned}$$

An easy computation yields that for all \(\alpha >\alpha ^*\), \(\rho ^*>{(1-{\sqrt{2}}/{2})}/{\gamma }\). Therefore, in this case, (6) holds if and only if \(\rho >\rho ^*\), which contradicts the condition (5).

Case II: \(\alpha =\alpha ^*\). In this case, \(\varDelta =0\) and (11) has exactly two distinct real solutions

$$-\root 3 \of {\frac{{-{q}}}{2}}, \quad 2\root 3 \of {\frac{{-{q}}}{2}}.$$

Clearly,

$$\begin{aligned} -\root 3 \of {\frac{{-{q}}}{2}}<0, \quad 2\root 3 \of {\frac{{-{q}}}{2}}>\frac{1-{\sqrt{2}}/{2}}{\gamma }. \end{aligned}$$

Hence in this case, (6) can not hold under the condition (5).

Case III: \(0<\alpha <\alpha ^*\). In this case, \(\varDelta <0\) and (11) has three distinct real solutions

$$ \frac{\omega _1(\alpha )}{ 3 \gamma },\quad \frac{\omega _2(\alpha )}{ 3 \gamma },\quad \frac{\omega _3(\alpha )}{ 3 \gamma }. $$

Recalling Lemma 4, we know that for all \(0<\alpha <\alpha ^*\),

$$\begin{aligned}&\frac{\omega _3(\alpha )}{ 3 \gamma }> \frac{1-\sqrt{2}/2}{ \gamma },\\&\frac{\omega _2(\alpha )}{ 3 \gamma }< \frac{1-\sqrt{2}/2}{ \gamma },\\&0<\frac{\omega _1(\alpha )}{ 3 \gamma }<\frac{\omega _2(\alpha )}{ 3 \gamma }. \end{aligned}$$

Thus in this case, (6) holds if and only if (10) holds.

In summary, under the condition (5), the inequality (6) holds if and only if (9) and (10) hold.    \(\square \)

On the basis of \(\alpha \), \(\beta \) and \(\gamma \), the following theorem provides conditions under which (7) holds; it is an immediate consequence of Proposition 1.

Theorem 3

If \(0<\alpha <{\alpha }^*\), then for any \(\rho \) satisfying the inequality

$$\frac{\omega _1(\alpha )}{3 \gamma }< \rho < \frac{\omega _2(\alpha )}{3 \gamma },$$

the condition (7) holds.
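Theorem 3 can be exercised numerically. The sketch below, with the arbitrary test data \(\beta =0.05\), \(\gamma =1\), computes the admissible interval for \(\rho \) and checks inequality (6) at points inside it and just outside it; checking (6) suffices here because, by Lemma 3, (5) and (6) imply (7).

```python
import math

def omega(a, j):
    """omega_j(a), j = 1, 2, 3, as defined in Sect. 3."""
    g1 = a*a - 4*a + 10
    g2 = a**3 - 6*a*a + 21*a + 28
    th = math.acos(g2 / math.sqrt(g1**3)) / 3
    shift = {1: 2*math.pi/3, 2: 4*math.pi/3, 3: 0.0}[j]
    return math.sqrt(g1) * math.cos(th + shift) + 2 + a/2

beta, gamma = 0.05, 1.0            # arbitrary test data with alpha < alpha*
alpha = beta * gamma
lo = omega(alpha, 1) / (3*gamma)   # lower end of the admissible rho-interval
hi = omega(alpha, 2) / (3*gamma)   # upper end

def cond6(rho):
    """Inequality (6): beta + rho*(1/(1 - gamma*rho)^2 - 1) < rho."""
    return beta + rho * (1.0/(1.0 - gamma*rho)**2 - 1.0) < rho

inside = all(cond6(lo + (hi - lo)*t) for t in (0.1, 0.5, 0.9))
outside = (not cond6(0.9*lo)) and (not cond6(1.05*hi))
```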

The following corollary, which can be deduced directly by an easy computation, indicates that the alpha theory is more precise than Rump’s interval theorem.

Corollary 1

If \(0<\alpha <{\alpha }^*\), then

$$\frac{\omega _1(\alpha )}{3 \gamma }> t^*.$$
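The inequality of Corollary 1 can be confirmed on a grid over \((0, \alpha ^*)\); in the Python sketch below, since both sides scale as \(1/\gamma \), we may take \(\gamma =1\), so that \(\beta =\alpha \).

```python
import math

ALPHA_STAR = 0.093347623           # truncated value from (8)

def omega1(a):
    g1 = a*a - 4*a + 10
    g2 = a**3 - 6*a*a + 21*a + 28
    th = math.acos(g2 / math.sqrt(g1**3)) / 3
    return math.sqrt(g1) * math.cos(th + 2*math.pi/3) + 2 + a/2

gamma = 1.0                        # the comparison scales out gamma
ok = all(
    # omega_1(a)/(3*gamma) > t^* = 2*beta/(1 + a + sqrt(1 - 6a + a^2))
    omega1(a) / (3*gamma) > 2*(a/gamma) / (1 + a + math.sqrt(1 - 6*a + a*a))
    for a in (ALPHA_STAR * k / 200 for k in range(1, 200))
)
```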

For the Newton iteration sequence \(\{ \varvec{x}^{(k)}\}\), define

$$\begin{aligned} \rho ^{*}(\varvec{f},\varvec{x}^{(k)} )=\frac{\omega _1(\alpha (\varvec{f},\varvec{x}^{(k)} ))}{3 \gamma (\varvec{f},\varvec{x}^{(k)} )},\quad \rho ^{**}(\varvec{f},\varvec{x}^{(k)} )=\frac{\omega _2(\alpha (\varvec{f},\varvec{x}^{(k)} ))}{3 \gamma (\varvec{f},\varvec{x}^{(k)} )}. \end{aligned}$$
(12)

In view of the quantity \(\alpha \) of the alpha theory proposed by Wang and Han [18], we give the following convergence condition for the interval Newton algorithm proposed by Rump.

Proposition 2

Let \(\lceil \cdot \rceil \) be the integer ceiling function, \(p(\alpha )\) be defined by

$$p(\alpha )=\frac{2\alpha }{1+\alpha +\sqrt{1-6 \alpha +\alpha ^2}},$$

and \(q(\alpha )\) be defined in (2). If \(0<\alpha < 3-2 \sqrt{2}\), then for any

$$\begin{aligned} k\ge \lceil \log _2 ( \frac{ \ln \alpha ^*+\ln (1-p(\alpha ))+\ln \psi (p(\alpha ))-\ln \alpha }{ \ln q(\alpha )}+1)\rceil , \end{aligned}$$
(13)

the condition

$$\begin{aligned} S(\mathbf{Y }_{\rho ^{(k)}}, \varvec{x}^{(k)}) \subseteq \mathrm {int}(\mathbf{Y }_{\rho ^{(k)}}) \end{aligned}$$
(14)

holds for any \(\rho ^{(k)}\) satisfying the inequality

$$\begin{aligned} \rho ^{*}(\varvec{f},\varvec{x}^{(k)} )<\rho ^{(k)}<\rho ^{**}(\varvec{f},\varvec{x}^{(k)} ). \end{aligned}$$
(15)

Proof

By Theorem 1 and Lemma 2, we know that if \(0<\alpha < 3-2 \sqrt{2}\), then for any \(k\in \mathbb {N}\),

$$ \alpha (\varvec{f},\varvec{x}^{(k)} )\le \frac{\alpha {q(\alpha )}^{2^k-1}}{\psi (\gamma \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert )(1-\gamma \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert )}. $$

If \(0<\alpha < 3-2 \sqrt{2}\), then for all \(k \in \mathbb {N}\),

$$\gamma \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert \le p(\alpha )<1-\frac{\sqrt{2}}{2},$$

which implies that for all \(k \in \mathbb {N}\),

$$\psi (\gamma \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert )(1-\gamma \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert ) \ge \psi (p(\alpha ))(1-p(\alpha )).$$

Hence if \(0<\alpha < 3-2 \sqrt{2}\), then for all \(k \in \mathbb {N}\),

$$ \alpha (\varvec{f},\varvec{x}^{(k)} )\le \frac{\alpha {q(\alpha )}^{2^k-1}}{\psi (p(\alpha ))(1-p(\alpha ))}, $$

which implies that \(0<\alpha (\varvec{f},\varvec{x}^{(k)})<\alpha ^*\) holds whenever the iteration number k satisfies inequality (13). The conclusion then follows from Theorem 3.    \(\square \)
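The right-hand side of (13) is cheap to evaluate. A Python sketch, using the truncated value of \(\alpha ^*\) from (8):

```python
import math

ALPHA_STAR = 0.093347623            # truncated value from (8)

def min_iterations(alpha):
    """Smallest k in N permitted by inequality (13)."""
    psi = lambda u: 2*u*u - 4*u + 1
    s = math.sqrt(1 - 6*alpha + alpha*alpha)
    p = 2*alpha / (1 + alpha + s)                 # p(alpha)
    q = 4*alpha / (1 - alpha + s)**2              # q(alpha) from (2)
    num = (math.log(ALPHA_STAR) + math.log(1 - p)
           + math.log(psi(p)) - math.log(alpha))
    return max(0, math.ceil(math.log2(num / math.log(q) + 1)))
```

For instance, for \(\alpha =0.165370210314031\) as in Example 3 below, this bound yields \(k\ge 3\), although, as Example 3 shows, a smaller k may already succeed in practice; the condition is sufficient, not necessary.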

With the aid of the quantity \(\alpha \) of the alpha theory given in [1], the conclusion of the above proposition can be improved as follows.

Proposition 3

If \(0<\alpha \le {(13-3\sqrt{17})}/{4}\), then for any \(k\ge 3\), the condition (14) holds for any \(\rho ^{(k)}\) satisfying the inequality (15).

Proof

Since the function \(\alpha / {\psi (\alpha )}^2\) increases monotonically on the interval \([0, 1-\sqrt{2}/2)\), it follows that \({\alpha }/{{\psi (\alpha )}^2}<1\) for any \(0<\alpha \le {(13-3\sqrt{17})}/{4}\). Recalling Proposition 1 in [15], we can deduce that if \(0<\alpha \le {(13-3\sqrt{17})}/{4}\), then for any \(k\in \mathbb {N}\),

$$\alpha (\varvec{f},\varvec{x}^{(k)})\le {(\frac{\alpha }{{\psi (\alpha )}^2})}^{2^k-1} \alpha .$$

As a result, if \(0<\alpha \le {(13-3\sqrt{17})}/{4}\), then \(0<\alpha (\varvec{f},\varvec{x}^{(3)})<\alpha ^*\).    \(\square \)
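The two numerical facts underlying this proof — that \(\alpha /{\psi (\alpha )}^2<1\) at \(\alpha =(13-3\sqrt{17})/4\), and that the resulting bound falls below \(\alpha ^*\) at \(k=3\) but not yet at \(k=2\) — can be checked directly (Python sketch, with \(\alpha ^*\) truncated as in (8)):

```python
import math

alpha_star = 0.093347623                # truncated value from (8)
a = (13 - 3*math.sqrt(17)) / 4          # ~ 0.15767
psi = lambda u: 2*u*u - 4*u + 1

def bound(k):
    """Right-hand side of the estimate (a/psi(a)^2)^(2^k - 1) * a."""
    return (a / psi(a)**2) ** (2**k - 1) * a

three_ok = bound(3) < alpha_star        # k = 3 suffices
two_fails = bound(2) > alpha_star       # the bound does not yet certify k = 2
```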

4 Examples

In this section, we present some examples to illustrate our conclusions; the computations were carried out in Matlab R2012a with INTLAB V6 under Windows 7. In these examples, an interval displayed as \({-0.0059\_\_\_\_\_\_\_\_\_\_}\) denotes the true interval obtained by subtracting and adding 1 to the last displayed digit, namely,

$$-0.0059\_\_\_\_\_\_\_\_\_\_ =[-0.00600000000000, -0.00580000000000].$$

Example 1

Let

$$\begin{aligned} f_1&=x_1^2+x_2-2=0,\\ f_2&=x_1-x_2=0, \end{aligned}$$

and \(\varvec{x}^{(0)}={(1.8, 1.8)}^T\). Our program implementing the alpha theory computes

$$\alpha ^*<\alpha =0.1436672967< \frac{13-3\sqrt{17}}{4},$$

Then, by Proposition 3, we can immediately deduce that \(0<\alpha (\varvec{f}, \varvec{x}^{(3)})<\alpha ^*\). The values of \(\alpha (\varvec{f}, \varvec{x}^{(k)})\), \(\rho ^*(\varvec{f}, \varvec{x}^{(k)})\), \(\rho ^{**}(\varvec{f}, \varvec{x}^{(k)})\), \(k=1, 2, 3\), are shown in Table 1, and the values of \(\varvec{x}^{(k)}\), \(\rho ^{(k)}\), \(S(\mathbf{Y }_{\rho ^{(k)}}, \varvec{x}^{(k)})\), \(k=1, 2, 3\), are shown in Table 2.
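The quantity \(\alpha \) above can be reproduced exactly: since \(\varvec{f}\) is quadratic, the supremum in the definition of \(\gamma \) is attained at \(k=2\), and with the infinity norm the operator norm of the resulting bilinear map is explicit. An illustrative NumPy sketch:

```python
import numpy as np

x0 = np.array([1.8, 1.8])
fx = np.array([x0[0]**2 + x0[1] - 2.0, x0[0] - x0[1]])
J = np.array([[2.0 * x0[0], 1.0], [1.0, -1.0]])   # Jacobian at x0
Jinv = np.linalg.inv(J)

beta = np.linalg.norm(Jinv @ fx, np.inf)
# D^2 f / 2! maps (u, v) to (u1*v1, 0), so composing with Jinv gives a
# bilinear map whose infinity operator norm is max_i |Jinv[i, 0]|.
gamma = np.max(np.abs(Jinv[:, 0]))
alpha = beta * gamma                 # = 0.1436672967...
```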

Table 1. The values of \(\alpha (\varvec{f}, \varvec{x}^{(k)})\), \(\rho ^*(\varvec{f}, \varvec{x}^{(k)})\) and \(\rho ^{**}(\varvec{f}, \varvec{x}^{(k)})\) about Example 1
Table 2. The values of \(\varvec{x}^{(k)}\), \(\rho ^{(k)}\) and \(S(\mathbf{Y }_{\rho ^{(k)}}, \varvec{x}^{(k)})\) about Example 1

Example 2

[11] Let

$$f_i(\varvec{x})=x_i-0.7 x_i\sum _{j=1}^9 a_{i, j} x_j-1=0, \quad \quad i=1, 2, \ldots , 9,$$

with

$$a_{i, j}=\frac{3}{4} \frac{t_i{(1-t_j^2)}^2 w_j}{t_i+t_j},$$

where \(t_i, w_i\), \(i=1,2, \ldots , 9\), are respectively the nodes and weights of the Gaussian integration rule of order 17 on the interval [0, 1] (see Table 3).

For the initial approximation

$$\varvec{x}^{(0)}={[1.36, 1.36, 1.36, 1.36, 1.36, 1.36, 1.36, 1.36, 1.36]}^T,$$

we have

$$\alpha ^*<\alpha =0.144327213206543< \frac{13-3\sqrt{17}}{4}.$$

It follows by Proposition 3 that \(0<\alpha (\varvec{f}, \varvec{x}^{(3)})<\alpha ^*\). To be precise,

$$\begin{aligned}&\alpha (\varvec{f}, \varvec{x}^{(3)})=9.771724278220212\mathrm {e}\text{- }10,\\&\rho ^*(\varvec{f}, \varvec{x}^{(3)})=2.106476714805001\mathrm {e}\text{- }09,\\&\rho ^{**}(\varvec{f}, \varvec{x}^{(3)})=0.721018979786819. \end{aligned}$$

Choose \(\rho ^{(3)}=2.206476714805001\mathrm {e}\text{- }09\), then \(S(\mathbf{Y }_{\rho ^{(3)}}, \varvec{x}^{(3)})\subseteq \mathrm {int}(\mathbf{Y }_{\rho ^{(3)}})\). The values of \(\varvec{x}^{(3)}\) and \(S(\mathbf{Y }_{\rho ^{(3)}}, \varvec{x}^{(3)})\) are shown in Table 4, where i stands for the coordinate index.

Table 3. The values of \(t_i, w_i\) of Example 2
Table 4. The values of \({\varvec{x}}^{(3)}\) and \(S(\mathbf Y _{\rho ^{(3)}}, \varvec{x}^{(3)})\) about Example 2

Example 3

Let

$$\begin{aligned}&f_i=x_i^2+x_{i+1}-2=0, \quad \quad i=1,2,\ldots , 99,\\&f_{100}=x_{99}-x_{100}=0. \end{aligned}$$

Given \(\varvec{x}^{(0)}\) with \(x_i^{(0)}=1.28\), \(i=1, 2, \ldots , 100\), it follows that

$$ \frac{13-3\sqrt{17}}{4}<\alpha =0.165370210314031<3-2\sqrt{2}.$$

Recalling Proposition 2, we know that there exists \(k\in \mathbb {N}\) such that \(\alpha (\varvec{f}, \varvec{x}^{(k)})<\alpha ^*\). Indeed, already for the first iteration point \(\varvec{x}^{(1)}\), we have

$$\begin{aligned}&\alpha (\varvec{f}, \varvec{x}^{(1)})=0.0209408111113277<\alpha ^*,\\&\rho ^{*}(\varvec{f}, \varvec{x}^{(1)})=0.0229019461130693,\\&\rho ^{**}(\varvec{f}, \varvec{x}^{(1)})=0.291551749088838. \end{aligned}$$

Choose \(\rho ^{(1)}=0.0229019461130694\), then \(S(\mathbf{Y }_{\rho ^{(1)}}, \varvec{x}^{(1)})\subseteq \mathrm {int}(\mathbf{Y }_{\rho ^{(1)}})\).
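The values \(\alpha (\varvec{f}, \varvec{x}^{(0)})\) and \(\alpha (\varvec{f}, \varvec{x}^{(1)})\) reported above can be reproduced in the same way as for Example 1 (NumPy sketch; the closed form for \(\gamma \) again uses the fact that the system is quadratic):

```python
import numpy as np

n = 100

def F(x):
    # f_i = x_i^2 + x_{i+1} - 2 for i = 1..99, f_100 = x_99 - x_100
    return np.append(x[:n-1]**2 + x[1:] - 2.0, x[n-2] - x[n-1])

def J(x):
    m = np.zeros((n, n))
    i = np.arange(n - 1)
    m[i, i] = 2.0 * x[:n-1]          # d f_i / d x_i
    m[i, i + 1] = 1.0                # d f_i / d x_{i+1}
    m[n-1, n-2], m[n-1, n-1] = 1.0, -1.0
    return m

def alpha_of(x):
    Jinv = np.linalg.inv(J(x))
    beta = np.linalg.norm(Jinv @ F(x), np.inf)
    # f is quadratic, so the sup in gamma reduces to the k = 2 term; its
    # infinity operator norm is max_i sum_{j <= 99} |Jinv[i, j]|.
    gamma = np.max(np.sum(np.abs(Jinv[:, :n-1]), axis=1))
    return beta * gamma

x0 = np.full(n, 1.28)
a0 = alpha_of(x0)                               # alpha(f, x^(0))
x1 = x0 - np.linalg.solve(J(x0), F(x0))         # one Newton step
a1 = alpha_of(x1)                               # alpha(f, x^(1))
```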