
1 Introduction

Solving a nonlinear system of the form \(\varvec{f}(\varvec{x})=\varvec{0}\) with \(\varvec{f}={({f}_1, f_2, \ldots , f_n)}^{T}\) and \(\varvec{x}={(x_1, \ldots , x_n)}^T\) is one of the most fundamental problems in scientific computing. In this paper, we assume that \(\varvec{f}: \mathbb {R}^n \rightarrow \mathbb {R}^n\) and that \(f_1, f_2, \ldots , f_n\) have continuous partial derivatives of all orders. Denote the Jacobian matrix of \(\varvec{f}\) at \(\varvec{x}\) by \(\varvec{f}'(\varvec{x})\).

Newton’s method and its modifications have long played an important role in solving nonlinear systems. Under certain conditions, Newton’s method constructs a sequence of iteration points that converges to a solution of the given nonlinear system. In 1948, the author of [2] established the Kantorovich theorem under the assumption that the Jacobian matrix of the nonlinear system is Lipschitz continuous on some domain. The Kantorovich theorem gives conditions ensuring that a simple solution exists close to the initial point and that the Newton iteration sequence converges quadratically to this simple solution. Using the technique of point estimates, Smale et al. [15,16,17] developed the alpha theory to locate and approximate simple solutions. The alpha theory requires only information about the nonlinear system at the initial point of the Newton iteration. By introducing the dominating sequence technique, Wang and Han [18] improved both the condition and the conclusion of the alpha theory. With the aid of the Schröder operator, Giusti et al. [3] provided a criterion for locating clusters of solutions of univariate nonlinear functions. Later, Giusti et al. [4] generalized their results to locate breadth-one singular solutions of multivariate nonlinear systems. On the practical side of the alpha theory, Hauenstein and Sottile [6] described the program alphaCertified for certifying solutions of polynomial systems. Recently, Hauenstein and Levandovskyy [5] extended alphaCertified to verify solutions of polynomial-exponential systems.

Interval arithmetic is another important tool for verification methods. In the 1960s, Krawczyk [9] first introduced an interval version of Newton’s method for verifying the existence of simple solutions. Moore [10] proposed a computationally verifiable sufficient condition for the interval Newton method given by Krawczyk. Rump [12] made the interval Newton method perform better in practice; the resulting criterion is called Rump’s interval theorem and is implemented in the verifynlss function of the INTLAB toolbox [13] for Matlab. Based on the deflation technique with smoothing parameters, Rump and Graillat [14] described a verification algorithm for multiple solutions of univariate nonlinear functions and double solutions of multivariate nonlinear systems. Further, Li and Zhi [8] provided an algorithm to verify breadth-one singular solutions of polynomial systems, which was later generalized to the verification of isolated singular solutions in [7]. By combining interval algorithms with other methods, Yang et al. [19] investigated the verification of real solutions of positive-dimensional polynomial systems.

In 1980, Rall [11] exhibited the relationship between the Kantorovich theorem and Moore’s interval theorem. In terms of the quantities of the Kantorovich theorem, Rall provided conditions under which Moore’s verifiable sufficient condition holds. For an initial approximation \(\varvec{x}^{(0)}\in \mathbb {R}^n \) and a radius \(\rho >0\), let \(\varvec{X}_{\rho }\) denote the set \(\{\varvec{x}: \Vert \varvec{x}-\varvec{x}^{(0)}\Vert _{\infty }<\rho \}\), and let \(\eta \), B, \(\kappa \) be constants such that

$$\begin{aligned}&\Vert {\varvec{f}'(\varvec{x}^{(0)} )}^{-1}\varvec{f}(\varvec{x}^{(0)}) \Vert _\infty \le \eta ,\\&\Vert {\varvec{f}'(\varvec{x}^{(0)} )}^{-1}\Vert _\infty \le B,\\&\Vert \varvec{f}'(\varvec{u})-\varvec{f}'(\varvec{v})\Vert _\infty \le \kappa \Vert \varvec{u}-\varvec{v}\Vert _\infty , \quad \quad \varvec{u}, \varvec{v}\in \varOmega , \end{aligned}$$

where \(\varOmega \) is a sufficiently large region containing \(\varvec{x}^{(0)}\). Rall’s conclusion is that if

$$h=B\kappa \eta <\frac{1}{4},$$

then

$$ \varvec{x}^{(0)}-{\varvec{f}'(\varvec{x}^{(0)})}^{-1}\varvec{f}(\varvec{x}^{(0)})+(I-{\varvec{f}'(\varvec{x}^{(0)})}^{-1} \varvec{f}'(\varvec{X}_{\rho })) (\varvec{X}_{\rho }-\varvec{x}^{(0)}) \subset \varvec{X}_{\rho }$$

holds for any \(\rho \) satisfying the inequality

$$\frac{1-\sqrt{1-4 h}}{2 h} \eta \le \rho \le \frac{1+\sqrt{1-4 h}}{2 h} \eta .$$

Since the alpha theory and Rump’s interval theorem generalize the Kantorovich theorem and Moore’s interval theorem, respectively, we generalize Rall’s conclusion and discuss the relationship between the alpha theory and Rump’s interval theorem in this paper. Using only information about the given system at the initial approximation, we provide conditions guaranteeing that, after finitely many Newton iterations, we obtain an approximate point whose associated interval vector satisfies the condition of Rump’s interval theorem. The next section gives some notation and background results.

2 Notation and Preliminaries

First of all, we emphasize that the vector norm and the matrix norm \(\Vert \cdot \Vert \) in this paper are both the infinity norm \(\Vert \cdot \Vert _{\infty }\), since the metric for interval vectors is closely related to the infinity norm.

Henceforward, we use boldface letters to express tuples and denote their entries by the same letter with subscripts, for example \(\varvec{\alpha }={({\alpha }_1, \ldots , {\alpha }_n)}^T\). Denote the usual product order on \(\mathbb {R}^n\) by \(\le \), that is, for arbitrary \(\varvec{\alpha }, \varvec{\beta }\in \mathbb {R}^n\), \(\varvec{\alpha }\le \varvec{\beta }\) if and only if \(\alpha _i\le \beta _i\) for \(1\le i\le n\).

For \(\varvec{x}\in \mathbb {R}^n\), if \(\varvec{f}'(\varvec{x})\) is nonsingular, then define

$$\begin{aligned} \alpha (\varvec{f}, \varvec{x})&=\beta (\varvec{f}, \varvec{x})\gamma (\varvec{f}, \varvec{x}),\\ \beta (\varvec{f}, \varvec{x})&=\Vert {\varvec{f}'(\varvec{x})}^{-1} \varvec{f}(\varvec{x})\Vert ,\\ \gamma (\varvec{f}, \varvec{x})&=\sup _{k\ge 2}{\Vert {\varvec{f}'(\varvec{x})}^{-1}\frac{\varvec{f}^{(k)}(\varvec{x})}{k!}\Vert }^{1/(k-1)}. \end{aligned}$$

Given the initial approximation \(\varvec{x}^{(0)}\) with the associated simple root \(\varvec{x}^*\) of \(\varvec{f}\), we let \(\alpha \), \(\beta \) and \(\gamma \) stand for \(\alpha (\varvec{f}, \varvec{x}^{(0)})\), \(\beta (\varvec{f}, \varvec{x}^{(0)})\) and \(\gamma (\varvec{f}, \varvec{x}^{(0)})\), respectively. Applying Newton’s method to \(\varvec{f}\) yields the Newton iteration sequence \(\{\varvec{x}^{(k)}\}\), that is,

$$\varvec{x}^{(k+1)}= \varvec{x}^{(k)}-{\varvec{f}'(\varvec{x}^{(k)})}^{-1}\varvec{f}(\varvec{x}^{(k)}), \quad \quad k\in \mathbb {N}.$$
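In floating-point arithmetic this iteration is only a few lines of code. The following Python sketch is illustrative only; the quadratic test system and the starting point are borrowed from Example 1 in Sect. 4.

```python
import numpy as np

def newton(f, jac, x0, steps=10):
    """Plain Newton iteration: x <- x - f'(x)^{-1} f(x)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x - np.linalg.solve(jac(x), f(x))
    return x

# Test problem: f1 = x1^2 + x2 - 2, f2 = x1 - x2, started at (1.8, 1.8).
f = lambda x: np.array([x[0]**2 + x[1] - 2.0, x[0] - x[1]])
jac = lambda x: np.array([[2.0 * x[0], 1.0], [1.0, -1.0]])
root = newton(f, jac, [1.8, 1.8])   # converges to the simple root (1, 1)
```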

The alpha theory provides convergence conditions, stated only in terms of the values \(\alpha \), \(\beta \) and \(\gamma \), which ensure that the sequence \(\{\varvec{x}^{(k)}\}\) converges to \(\varvec{x}^*\). The dominating sequence technique is a powerful tool for improving the alpha theory. The dominating sequence \(\{t^{(k)}\}\) is produced by the Newton iteration with initial value \(t^{(0)}=0\) for the univariate function

$$ h(t)=\beta -t+\frac{\gamma t^2}{1- \gamma t}, $$

where the equation \(h(t)=0\) has the following two solutions

$$\begin{aligned} t^{*}=\frac{ 2 \beta }{1+\alpha +\sqrt{1-6 \alpha +{\alpha }^2}},\quad t^{**}=\frac{1+\alpha +\sqrt{1-6 \alpha +{\alpha }^2}}{4 \gamma }. \end{aligned}$$
(1)
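As a sanity check, the two roots in (1) can be verified numerically against \(h\); in the Python sketch below, the values \(\beta =0.05\), \(\gamma =1\) are arbitrary choices satisfying \(0<\alpha <3-2\sqrt{2}\).

```python
import math

beta, gamma = 0.05, 1.0          # arbitrary test values
alpha = beta * gamma
s = math.sqrt(1 - 6*alpha + alpha*alpha)

t_star = 2*beta / (1 + alpha + s)            # t^*  in (1)
t_star_star = (1 + alpha + s) / (4*gamma)    # t^** in (1)

def h(t):
    """The majorizing function h(t) = beta - t + gamma*t^2/(1 - gamma*t)."""
    return beta - t + gamma*t*t / (1 - gamma*t)
```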

The following theorem is a version of the alpha theory given by Wang and Han, in which the condition on the quantity \(\alpha \) is the best possible.

Theorem 1

[4, 18] If \(0<\alpha <3-2\sqrt{2}\), then for any \(t^{*}\le \rho <t^{**}\), the system \(\varvec{f}(\varvec{x})=\varvec{0}\) has exactly one simple root \(\varvec{x}^{*}\) in \(\overline{B}(\varvec{x}^{(0)}, \rho )\). In addition, the Newton iteration sequence \(\{\varvec{x}^{(k)}\}\) converges quadratically to \(\varvec{x}^{*}\), and the dominating sequence \(\{t^{(k)}\}\) increases and converges quadratically to \(t^*\). Furthermore, for all \(k\in \mathbb {N}\),

$$\begin{aligned} \Vert \varvec{x}^{(k+1)}-\varvec{x}^{(k)} \Vert&\le t^{(k+1)}-t^{(k)},\\ \Vert \varvec{x}^{(k+1)}-\varvec{x}^{(k)} \Vert&\le {q(\alpha )}^{2^{k}-1} \beta \end{aligned}$$

with

$$\begin{aligned} q(\alpha )=\frac{4 \alpha }{{(1-\alpha +\sqrt{1-6\alpha +\alpha ^2}) }^2}. \end{aligned}$$
(2)

Denote the set of intervals by \(\mathbb {I}\mathbb {R}\). An interval vector \(\mathbf X =[\underline{\varvec{x}}, \overline{\varvec{x}}]\in \mathbb {I}\mathbb {R}^n\) with \(\underline{\varvec{x}}, \overline{\varvec{x}}\in \mathbb {R}^n\) and \(\underline{\varvec{x}}\le \overline{\varvec{x}}\) is defined by

$$\mathbf X =[\underline{\varvec{x}}, \overline{\varvec{x}}]=\{\varvec{x}\in \mathbb {R}^n:\underline{\varvec{x}} \le \varvec{x} \le \overline{\varvec{x}}\}.$$

For \(\varvec{x}\in \mathbb {R}^n\) and \(\mathbf X =[\underline{\varvec{x}}, \overline{\varvec{x}}]\in \mathbb {I}\mathbb {R}^n\), we set \(\varvec{x}+\mathbf X =[\varvec{x}+\underline{\varvec{x}}, \varvec{x}+\overline{\varvec{x}}]\). Let \(\mathbf Y _\rho =\{\varvec{y}\in \mathbb {R}^n: \Vert \varvec{y}\Vert \le \rho \}\); then \(\overline{B}(\varvec{x}, \rho )=\varvec{x}+\mathbf Y _\rho \).

The norm of the interval vector \(\mathbf X =[\underline{\varvec{x}}, \overline{\varvec{x}}]\) is defined by

$$\begin{aligned} \Vert \mathbf X \Vert =\Vert [\underline{\varvec{x}}, \overline{\varvec{x}}]\Vert&=\max \{ \Vert \varvec{x}\Vert : \varvec{x}\in \mathbf X \}. \end{aligned}$$

In addition, \(\mathrm {int}(\mathbf X )\) denotes the interior of the interval vector \(\mathbf X \). Given a set \(\mathbf Z \subset \mathbb {R}^n\), the interval hull of \(\mathbf Z \) is the narrowest interval vector containing \(\mathbf Z \), namely,

$$\text {hull}(\mathbf {Z})= \bigcap \{\mathbf {X}\in \mathbb {I}\mathbb {R}^n: \mathbf {X}\supseteq \mathbf {Z}\}.$$

Given a continuous mapping \(\varvec{g}: \mathbb {R}^{n} \rightarrow \mathbb {R}^{m}\) and an interval vector \(\mathbf X \), the interval vector \(\varvec{g}(\mathbf X )\in \mathbb {I}\mathbb {R}^{m}\) is defined as

$$ \varvec{g}(\mathbf X )=\mathrm {hull}\{\varvec{g}(\varvec{x}):\varvec{x}\in \mathbf X \}.$$

Given an interval matrix \(\mathbf A \in \mathbb {I}\mathbb {R}^{m\times n}\) and an interval vector \(\mathbf X \in \mathbb {I}\mathbb {R}^{n}\), the interval vector \(\mathbf A {} \mathbf X \) is defined by

$$\mathbf A {} \mathbf X =\mathrm {hull}\{A \varvec{x}: A\in \mathbf A , \varvec{x}\in \mathbf X \}.$$

In particular, the norm of the interval matrix \(\mathbf A \) is defined as

$$\Vert \mathbf A \Vert =\max \{\Vert A\Vert : A\in \mathbf A \}.$$
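The interval operations above can be mimicked by a few lines of code. The following minimal Python class is an illustration only: a rigorous implementation such as INTLAB must additionally round the computed endpoints outward.

```python
# Minimal interval arithmetic sketch (no directed rounding).
class Interval:
    def __init__(self, lo, hi):
        assert lo <= hi
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        # the product interval is spanned by the four endpoint products
        p = [self.lo*other.lo, self.lo*other.hi,
             self.hi*other.lo, self.hi*other.hi]
        return Interval(min(p), max(p))

    def norm(self):
        """||X|| = max{|x| : x in X}."""
        return max(abs(self.lo), abs(self.hi))

    def contains_interior(self, other):
        """Is `other` contained in int(self)?"""
        return self.lo < other.lo and other.hi < self.hi
```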

The following theorem is a version of the interval Newton method given by Rump.

Theorem 2

[12] Given \({\varvec{x}}^{(0)}\in \mathbb {R}^n\), \(\mathbf Y \in \mathbb {I} \mathbb {R}^n\) with \(\varvec{0}\in \mathbf Y \), \(R\in \mathbb {R}^{n\times n}\), if

$$ S(\mathbf Y , {\varvec{x}}^{(0)}):=-R {\varvec{f}}({\varvec{x}}^{(0)})+(I-R \varvec{f}'({\varvec{x}}^{(0)}+\mathbf Y )) \mathbf Y \subseteq \mathrm {int} (\mathbf Y ), $$

then there exists a unique \({{\varvec{x}}^{*}}\in {\varvec{x}}^{(0)}+\mathbf Y \) such that \(\varvec{f}({{\varvec{x}}^{*}})=\varvec{0}\).
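For intuition, the condition of Theorem 2 can be checked with plain endpoint arithmetic for the univariate function \(f(x)=x^2-2\); the function, the point \(x^{(0)}=1.5\), the choice \(R=1/f'(x^{(0)})\) and the radius \(\rho =0.1\) are all our own illustrative choices, and directed rounding is omitted.

```python
# Univariate check of Rump's condition for f(x) = x^2 - 2.
x0, rho = 1.5, 0.1
fx0 = x0**2 - 2.0          # f(x0) = 0.25
R = 1.0 / (2.0 * x0)       # R = 1/f'(x0) = 1/3

# f'(x0 + Y) = 2*(x0 + Y) = [2*(x0 - rho), 2*(x0 + rho)]
dlo, dhi = 2.0 * (x0 - rho), 2.0 * (x0 + rho)
# I - R*f'(x0 + Y)
mlo, mhi = 1.0 - R * dhi, 1.0 - R * dlo
# (I - R*f'(x0 + Y)) * Y with Y = [-rho, rho]
prods = [mlo * -rho, mlo * rho, mhi * -rho, mhi * rho]
plo, phi = min(prods), max(prods)
# S(Y, x0) = -R*f(x0) + (I - R*f'(x0 + Y)) * Y
slo, shi = -R * fx0 + plo, -R * fx0 + phi
inclusion = (-rho < slo) and (shi < rho)   # S(Y, x0) in int(Y)?
```

Since the computed \(S(\mathbf Y , x^{(0)})\) lies in the interior of \(\mathbf Y =[-\rho ,\rho ]\), Theorem 2 certifies a unique root of \(x^2-2\) (namely \(\sqrt{2}\)) in [1.4, 1.6].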

3 Main Results

To state the main results of this paper, we need the following functions:

$$\begin{aligned} \psi (u)&=2 u^2-4 u+1,\\ g_1(\alpha )&={\alpha }^2-4 \alpha +10, \\ g_2(\alpha )&={\alpha }^3- 6 {\alpha }^2+21 \alpha +28,\\ g_3(\alpha )&=4{\alpha }^3-25 {\alpha }^2+88 {\alpha }-8, \\ \theta (\alpha )&=\frac{1}{3}\arccos (\frac{g_2(\alpha ) }{ \sqrt{{g_1(\alpha ) }^3}}),\\ \omega _1(\alpha )&= \sqrt{g_1(\alpha )} \cos (\theta (\alpha )+\frac{2\pi }{3})+2+ \frac{\alpha }{2},\\ \omega _2(\alpha )&=\sqrt{g_1(\alpha )} \cos (\theta (\alpha )+\frac{4\pi }{3})+2+ \frac{\alpha }{2},\\ \omega _3(\alpha )&=\sqrt{g_1(\alpha )} \cos (\theta (\alpha ))+2+ \frac{\alpha }{2}. \end{aligned}$$
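These functions translate directly into code. The sketch below also checks, for the arbitrary test pair \(\beta =0.05\), \(\gamma =1\), that \(\omega _j(\alpha )/(3\gamma )\), \(j=1,2,3\), are precisely the three roots of the cubic equation (11) appearing in the proof of Proposition 1.

```python
import math

def psi(u):   return 2*u*u - 4*u + 1
def g1(a):    return a*a - 4*a + 10
def g2(a):    return a**3 - 6*a*a + 21*a + 28
def g3(a):    return 4*a**3 - 25*a*a + 88*a - 8
def theta(a): return math.acos(g2(a) / math.sqrt(g1(a)**3)) / 3

def omega(a, j):
    """omega_j(a) for j = 1, 2, 3."""
    shift = {1: 2*math.pi/3, 2: 4*math.pi/3, 3: 0.0}[j]
    return math.sqrt(g1(a)) * math.cos(theta(a) + shift) + 2 + a/2

# Test data: beta = 0.05, gamma = 1, so alpha = 0.05 < alpha*.
beta, gamma = 0.05, 1.0
a = beta * gamma
cubic = lambda r: (r**3 - (4 + a)/(2*gamma) * r**2
                   + (1 + 2*a)/(2*gamma**2) * r - beta/(2*gamma**2))
residuals = [abs(cubic(omega(a, j) / (3*gamma))) for j in (1, 2, 3)]
```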

Here we give some lemmas and one proposition from which the main theorem will easily follow.

Lemma 1

[15] Given \(\widetilde{\varvec{x}}\in \mathbb {R}^n\), if \(\gamma \Vert \widetilde{\varvec{x}}-{\varvec{x}}^{(0)}\Vert <1-\sqrt{2}/{2}\), then

$$ \gamma (\varvec{f}, \widetilde{\varvec{x}})\le \frac{\gamma }{\psi (\gamma \Vert \widetilde{\varvec{x}}-{\varvec{x}}^{(0)}\Vert )(1-\gamma \Vert \widetilde{\varvec{x}}-{\varvec{x}}^{(0)}\Vert )}.$$

Lemma 2

If \(0<\alpha < 3-2 \sqrt{2}\), then for all \(k\ge 1\),

$$\begin{aligned}&\gamma \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert <1-\frac{\sqrt{2}}{2},\end{aligned}$$
(3)
$$\begin{aligned}&\gamma (f, \varvec{x}^{(k)})\le \frac{\gamma }{{\psi (\gamma \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert )}{(1-\gamma \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert )}}. \end{aligned}$$
(4)

Proof

If \(0<\alpha <3-2 \sqrt{2}\), then by Theorem 1,

$$\begin{aligned} \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert&\le \sum _{j=1}^{k} \Vert \varvec{x}^{(j)}-\varvec{x}^{(j-1)}\Vert \le \sum _{j=1}^{k} (t^{(j)}-t^{(j-1)})\\&\le t^{(k)} \le t^{*}. \end{aligned}$$

Therefore

$$\gamma \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert \le \frac{2 \alpha }{1+ \alpha + \sqrt{1-6 \alpha +\alpha ^2}}.$$

Since the right-hand side of the above inequality increases monotonically from 0 to \(1-\sqrt{2}/2\) as \(\alpha \) goes from 0 to \(3-2\sqrt{2}\), it follows that (3) holds. By means of Lemma 1, (4) follows.    \(\square \)
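The monotonicity used in this step is easy to confirm numerically (illustrative Python check of the bound \(p(\alpha )=2\alpha /(1+\alpha +\sqrt{1-6\alpha +\alpha ^2})\)):

```python
import math

def p(a):
    return 2*a / (1 + a + math.sqrt(1 - 6*a + a*a))

a_max = 3 - 2*math.sqrt(2)
# sample the half-open interval [0, 3 - 2*sqrt(2))
vals = [p(a_max * k / 1000) for k in range(1000)]
increasing = all(x < y for x, y in zip(vals, vals[1:]))
```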

Lemma 3

If

$$\begin{aligned}&0<\rho < \frac{1-\sqrt{2}/{2}}{\gamma }, \end{aligned}$$
(5)
$$\begin{aligned}&\beta +\rho (\frac{1}{{(1-\gamma \rho )}^2}-1)<\rho , \end{aligned}$$
(6)

then

$$\begin{aligned} S(\mathbf{Y }_{\rho }, {\varvec{x}}^{(0)})\subseteq \mathrm {int}(\mathbf{Y }_{\rho }). \end{aligned}$$
(7)

Proof

Given an arbitrary real vector \(\varvec{y}\in \mathbf{Y }_{\rho }\), we expand the Jacobian matrix \(\varvec{f}'(\varvec{x}^{(0)}+\varvec{y})\) into a power series and get

$$\begin{aligned} {\varvec{f}'(\varvec{x}^{(0)})}^{-1} \varvec{f}'(\varvec{x}^{(0)}+\varvec{y})&={\varvec{f}'(\varvec{x}^{(0)})}^{-1}({\varvec{f}'(\varvec{x}^{(0)})}+\sum _{k=2}^{\infty } { \varvec{f}^{(k)}(\varvec{x}^{(0)})} \frac{\varvec{y}^{k-1}}{(k-1)!})\\&=I+\sum _{k=2}^{\infty } k {\varvec{f}'(\varvec{x}^{(0)})}^{-1} \varvec{f}^{(k)}(\varvec{x}^{(0)})\frac{ \varvec{y}^{k-1} }{k!}. \end{aligned}$$

If (5) holds, then

$$\begin{aligned} \Vert I-{\varvec{f}'(\varvec{x}^{(0)})}^{-1} \varvec{f}'(\varvec{x}^{(0)}+\varvec{y})\Vert&\le \sum _{k=2}^{\infty } k \Vert {\varvec{f}'(\varvec{x}^{(0)})}^{-1} \frac{\varvec{f}^{(k)}(\varvec{x}^{(0)})}{k!}\Vert { \Vert \varvec{y}\Vert ^{k-1} }\\&\le \sum _{k=2}^{\infty } k {(\gamma \rho )}^{k-1}\\&=\frac{1}{ {(1-\gamma \rho )}^2}-1. \end{aligned}$$

Suppose now that (5) and (6) hold; then for an arbitrary real vector \(\varvec{y}\in \mathbf{Y }_{\rho }\), we can infer that

$$\begin{aligned} \Vert -{\varvec{f}'(\varvec{x}^{(0)})}^{-1} \varvec{f}(\varvec{x}^{(0)}) +(I-{\varvec{f}'(\varvec{x}^{(0)})}^{-1} \varvec{f}'(\varvec{x}^{(0)}+\varvec{y}))\varvec{y}\Vert < \rho , \end{aligned}$$

which implies that (7) holds.    \(\square \)

Lemma 4

Let

$$\begin{aligned} \alpha ^*&=-\frac{1}{12} {(22247+1320 \sqrt{330})}^{1/3}+\frac{431}{12 {(22247+1320 \sqrt{330})}^{1/3}} +\frac{25}{12} \\&\approx 0.093347623,\nonumber \end{aligned}$$
(8)

then for an arbitrary \(0< \alpha < \alpha ^*\), we have

$$\begin{aligned}&g_1(\alpha ^*)< g_1(\alpha )< g_1(0),\\&g_2(0)<g_2(\alpha )< g_2(\alpha ^*),\\&g_3(0)<g_3(\alpha )< g_3(\alpha ^*)=0,\\&\theta (\alpha ^*)< \theta (\alpha )< \theta (0),\\&\omega _1(0)< \omega _1(\alpha )< \omega _1(\alpha ^*), \\&\omega _2(\alpha ^*)< \omega _2(\alpha )< \omega _2(0)<3-\frac{3\sqrt{2}}{2},\\&\omega _3(\alpha )>3-\frac{3\sqrt{2}}{2}. \end{aligned}$$

Proof

According to Cardano’s root-finding formula, the equation \(g_3(\alpha )=0\) has exactly one positive real root, namely \(\alpha ^*\) as in (8). Obviously, \(g'_1(\alpha )<0\) and \(g'_2(\alpha )>0\) for all \(0<\alpha < \alpha ^*\). Thus both \(\theta (\alpha )\) and \(\omega _2(\alpha )\) decrease monotonically on the interval \((0, 3-2\sqrt{2})\). A routine computation gives rise to

$$\begin{aligned} \omega '_1(\alpha )= \frac{(4-2\alpha )\sqrt{-3 g_3(\alpha )}\sin (\theta (\alpha )+\frac{\pi }{6})+(84-36\alpha )\cos (\theta (\alpha )+\frac{\pi }{6} )}{2\sqrt{-3g_1(\alpha )g_3(\alpha )}}+\frac{1}{2}, \end{aligned}$$

and hence \(\omega '_1(\alpha )>0\) for all \(0<\alpha <\alpha ^*\). The lemma follows immediately from what we have proved.    \(\square \)

Proposition 1

Under the condition (5), the inequality (6) holds if and only if

$$\begin{aligned} 0<&\alpha <\alpha ^{*},\end{aligned}$$
(9)
$$\begin{aligned} \frac{\omega _1(\alpha )}{ 3 \gamma }<&\rho <\frac{\omega _2(\alpha )}{ 3 \gamma }. \end{aligned}$$
(10)

Proof

Define \(\varDelta ={({q}/{2})}^2+{({p}/{3})}^3\) with

$$\begin{aligned} p&=-\frac{1}{3}{(\frac{4+\beta \gamma }{-2{\gamma } })}^2+\frac{1+2 \beta \gamma }{2 {\gamma }^2},\\ q&=2{(\frac{4+\beta \gamma }{-6 \gamma })}^3 +\frac{\beta }{2 {\gamma }^2}-\frac{(4+\beta \gamma )(1+2\beta \gamma )}{-12 {\gamma }^3 }, \end{aligned}$$

then

$$\begin{aligned} p=-\frac{ g_1(\alpha )}{12{\gamma }^2 },\quad q=-\frac{g_2(\alpha ) }{108 {\gamma ^3}}, \quad \varDelta =\frac{g_3(\alpha )}{ 1728 \gamma ^6}. \end{aligned}$$

Since \(g_1(\alpha )>0\) and \(g_2(\alpha )>0\) for all \(\alpha >0\), we have \(p<0\) and \(q<0\), and there are three cases to consider.

Case I: \(\alpha >\alpha ^*\). In this case, \(\varDelta >0\) and the equation

$$\begin{aligned} \rho ^3-\frac{4+\beta \gamma }{2\gamma }\rho ^2+ \frac{1+2\beta \gamma }{2{\gamma }^2}\rho -\frac{\beta }{2 {\gamma }^2}=0 \end{aligned}$$
(11)

has exactly one real solution

$$\begin{aligned} \rho ^*= \root 3 \of {{-\frac{q}{2}+\sqrt{\varDelta }}} +\root 3 \of {{-\frac{q}{2}-\sqrt{\varDelta }}}. \end{aligned}$$

An easy computation yields that for all \(\alpha >\alpha ^*\), \(\rho ^*>{(1-{\sqrt{2}}/{2})}/{\gamma }\). Therefore, in this case, (6) holds if and only if \(\rho >\rho ^*\), which contradicts the condition (5).

Case II: \(\alpha =\alpha ^*\). In this case, \(\varDelta =0\) and (11) has exactly two distinct real solutions

$$-\root 3 \of {\frac{{-{q}}}{2}}, \quad 2\root 3 \of {\frac{{-{q}}}{2}}.$$

Clearly,

$$\begin{aligned} -\root 3 \of {\frac{{-{q}}}{2}}<0, \quad 2\root 3 \of {\frac{{-{q}}}{2}}>\frac{1-{\sqrt{2}}/{2}}{\gamma }. \end{aligned}$$

Hence in this case, (6) can not hold under the condition (5).

Case III: \(0<\alpha <\alpha ^*\). In this case, \(\varDelta <0\) and (11) has three distinct real solutions

$$ \frac{\omega _1(\alpha )}{ 3 \gamma },\quad \frac{\omega _2(\alpha )}{ 3 \gamma },\quad \frac{\omega _3(\alpha )}{ 3 \gamma }. $$

Recalling Lemma 4, we know that for all \(0<\alpha <\alpha ^*\),

$$\begin{aligned}&\frac{\omega _3(\alpha )}{ 3 \gamma }> \frac{1-\sqrt{2}/2}{ \gamma },\\&\frac{\omega _2(\alpha )}{ 3 \gamma }< \frac{1-\sqrt{2}/2}{ \gamma },\\&0<\frac{\omega _1(\alpha )}{ 3 \gamma }<\frac{\omega _2(\alpha )}{ 3 \gamma }. \end{aligned}$$

Thus in this case, (6) holds if and only if (10) holds.

In summary, under the condition (5), the inequality (6) holds if and only if (9) and (10) hold.    \(\square \)

On the basis of \(\alpha \), \(\beta \) and \(\gamma \), the following theorem provides conditions under which (7) holds; it is an immediate consequence of Proposition 1.

Theorem 3

If \(0<\alpha <{\alpha }^*\), then for any \(\rho \) satisfying the inequality

$$\frac{\omega _1(\alpha )}{3 \gamma }< \rho < \frac{\omega _2(\alpha )}{3 \gamma },$$

the condition (7) holds.
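Theorem 3 can be exercised numerically. The sketch below, with the arbitrary test data \(\beta =0.05\), \(\gamma =1\), computes the admissible interval for \(\rho \) and checks inequality (6) at points inside it and just outside it; checking (6) suffices here because, by Lemma 3, (5) and (6) imply (7).

```python
import math

def omega(a, j):
    """omega_j(a), j = 1, 2, 3, as defined in Sect. 3."""
    g1 = a*a - 4*a + 10
    g2 = a**3 - 6*a*a + 21*a + 28
    th = math.acos(g2 / math.sqrt(g1**3)) / 3
    shift = {1: 2*math.pi/3, 2: 4*math.pi/3, 3: 0.0}[j]
    return math.sqrt(g1) * math.cos(th + shift) + 2 + a/2

beta, gamma = 0.05, 1.0            # arbitrary test data with alpha < alpha*
alpha = beta * gamma
lo = omega(alpha, 1) / (3*gamma)   # lower end of the admissible rho-interval
hi = omega(alpha, 2) / (3*gamma)   # upper end

def cond6(rho):
    """Inequality (6): beta + rho*(1/(1 - gamma*rho)^2 - 1) < rho."""
    return beta + rho * (1.0/(1.0 - gamma*rho)**2 - 1.0) < rho

inside = all(cond6(lo + (hi - lo)*t) for t in (0.1, 0.5, 0.9))
outside = (not cond6(0.9*lo)) and (not cond6(1.05*hi))
```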

The following corollary, which can be deduced directly by an easy computation, indicates that the alpha theory is more precise than Rump’s interval theorem.

Corollary 1

If \(0<\alpha <{\alpha }^*\), then

$$\frac{\omega _1(\alpha )}{3 \gamma }> t^*.$$
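The inequality of Corollary 1 can be confirmed on a grid over \((0, \alpha ^*)\); in the Python sketch below, since both sides scale as \(1/\gamma \), we may take \(\gamma =1\), so that \(\beta =\alpha \).

```python
import math

ALPHA_STAR = 0.093347623           # truncated value from (8)

def omega1(a):
    g1 = a*a - 4*a + 10
    g2 = a**3 - 6*a*a + 21*a + 28
    th = math.acos(g2 / math.sqrt(g1**3)) / 3
    return math.sqrt(g1) * math.cos(th + 2*math.pi/3) + 2 + a/2

gamma = 1.0                        # the comparison scales out gamma
ok = all(
    # omega_1(a)/(3*gamma) > t^* = 2*beta/(1 + a + sqrt(1 - 6a + a^2))
    omega1(a) / (3*gamma) > 2*(a/gamma) / (1 + a + math.sqrt(1 - 6*a + a*a))
    for a in (ALPHA_STAR * k / 200 for k in range(1, 200))
)
```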

For the Newton iteration sequence \(\{ \varvec{x}^{(k)}\}\), define

$$\begin{aligned} \rho ^{*}(\varvec{f},\varvec{x}^{(k)} )=\frac{\omega _1(\alpha (\varvec{f},\varvec{x}^{(k)} ))}{3 \gamma (\varvec{f},\varvec{x}^{(k)} )},\quad \rho ^{**}(\varvec{f},\varvec{x}^{(k)} )=\frac{\omega _2(\alpha (\varvec{f},\varvec{x}^{(k)} ))}{3 \gamma (\varvec{f},\varvec{x}^{(k)} )}. \end{aligned}$$
(12)

In view of the quantity \(\alpha \) of the alpha theory proposed by Wang and Han [18], we give the following convergence condition for the interval Newton algorithm proposed by Rump.

Proposition 2

Let \(\lceil \cdot \rceil \) be the integer ceiling function, \(p(\alpha )\) be defined by

$$p(\alpha )=\frac{2\alpha }{1+\alpha +\sqrt{1-6 \alpha +\alpha ^2}},$$

and \(q(\alpha )\) be defined in (2). If \(0<\alpha < 3-2 \sqrt{2}\), then for any

$$\begin{aligned} k\ge \lceil \log _2 ( \frac{ \ln \alpha ^*+\ln (1-p(\alpha ))+\ln \psi (p(\alpha ))-\ln \alpha }{ \ln q(\alpha )}+1)\rceil , \end{aligned}$$
(13)

the condition

$$\begin{aligned} S(\mathbf{Y }_{\rho ^{(k)}}, \varvec{x}^{(k)}) \subseteq \mathrm {int}(\mathbf{Y }_{\rho ^{(k)}}) \end{aligned}$$
(14)

holds for any \(\rho ^{(k)}\) satisfying the inequality

$$\begin{aligned} \rho ^{*}(\varvec{f},\varvec{x}^{(k)} )<\rho ^{(k)}<\rho ^{**}(\varvec{f},\varvec{x}^{(k)} ). \end{aligned}$$
(15)

Proof

By Theorem 1 and Lemma 2, we know that if \(0<\alpha < 3-2 \sqrt{2}\), then for any \(k\in \mathbb {N}\),

$$ \alpha (\varvec{f},\varvec{x}^{(k)} )\le \frac{\alpha {q(\alpha )}^{2^k-1}}{\psi (\gamma \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert )(1-\gamma \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert )}. $$

If \(0<\alpha < 3-2 \sqrt{2}\), then for all \(k \in \mathbb {N}\),

$$\gamma \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert \le p(\alpha )<1-\frac{\sqrt{2}}{2},$$

which implies that for all \(k \in \mathbb {N}\),

$$\psi (\gamma \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert )(1-\gamma \Vert \varvec{x}^{(k)}-\varvec{x}^{(0)}\Vert ) \ge \psi (p(\alpha ))(1-p(\alpha )).$$

Hence if \(0<\alpha < 3-2 \sqrt{2}\), then for all \(k \in \mathbb {N}\),

$$ \alpha (\varvec{f},\varvec{x}^{(k)} )\le \frac{\alpha {q(\alpha )}^{2^k-1}}{\psi (p(\alpha ))(1-p(\alpha ))}, $$

which implies that \(0<\alpha (\varvec{f},\varvec{x}^{(k)})<\alpha ^*\) holds whenever the iteration number k satisfies inequality (13). The conclusion then follows from Theorem 3.    \(\square \)
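The right-hand side of (13) is cheap to evaluate. A Python sketch, using the truncated value of \(\alpha ^*\) from (8):

```python
import math

ALPHA_STAR = 0.093347623            # truncated value from (8)

def min_iterations(alpha):
    """Smallest k in N permitted by inequality (13)."""
    psi = lambda u: 2*u*u - 4*u + 1
    s = math.sqrt(1 - 6*alpha + alpha*alpha)
    p = 2*alpha / (1 + alpha + s)                 # p(alpha)
    q = 4*alpha / (1 - alpha + s)**2              # q(alpha) from (2)
    num = (math.log(ALPHA_STAR) + math.log(1 - p)
           + math.log(psi(p)) - math.log(alpha))
    return max(0, math.ceil(math.log2(num / math.log(q) + 1)))
```

For instance, for \(\alpha =0.165370210314031\) as in Example 3 below, this bound yields \(k\ge 3\), although, as Example 3 shows, a smaller k may already succeed in practice; the condition is sufficient, not necessary.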

With the aid of the quantity \(\alpha \) of the alpha theory given in [1], the conclusion of the above proposition can be improved as follows.

Proposition 3

If \(0<\alpha \le {(13-3\sqrt{17})}/{4}\), then for any \(k\ge 3\), the condition (14) holds for any \(\rho ^{(k)}\) satisfying the inequality (15).

Proof

Since the function \(\alpha / {\psi (\alpha )}^2\) increases monotonically on the interval \([0, 1-\sqrt{2}/2)\), it follows that \({\alpha }/{{\psi (\alpha )}^2}<1\) for any \(0<\alpha \le {(13-3\sqrt{17})}/{4}\). Recalling Proposition 1 in [15], we can deduce that if \(0<\alpha \le {(13-3\sqrt{17})}/{4}\), then for any \(k\in \mathbb {N}\),

$$\alpha (\varvec{f},\varvec{x}^{(k)})\le {(\frac{\alpha }{{\psi (\alpha )}^2})}^{2^k-1} \alpha .$$

As a result, if \(0<\alpha \le {(13-3\sqrt{17})}/{4}\), then \(0<\alpha (\varvec{f},\varvec{x}^{(3)})<\alpha ^*\).    \(\square \)
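The two numerical facts underlying this proof — that \(\alpha /{\psi (\alpha )}^2<1\) at \(\alpha =(13-3\sqrt{17})/4\), and that the resulting bound falls below \(\alpha ^*\) at \(k=3\) but not yet at \(k=2\) — can be checked directly (Python sketch, with \(\alpha ^*\) truncated as in (8)):

```python
import math

alpha_star = 0.093347623                # truncated value from (8)
a = (13 - 3*math.sqrt(17)) / 4          # ~ 0.15767
psi = lambda u: 2*u*u - 4*u + 1

def bound(k):
    """Right-hand side of the estimate (a/psi(a)^2)^(2^k - 1) * a."""
    return (a / psi(a)**2) ** (2**k - 1) * a

three_ok = bound(3) < alpha_star        # k = 3 suffices
two_fails = bound(2) > alpha_star       # the bound does not yet certify k = 2
```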

4 Examples

In this section, we present some examples to illustrate our conclusions; the computations were carried out in Matlab R2012a with INTLAB V6 under Windows 7. In these examples, an interval displayed as \({-0.0059\_\_\_\_\_\_\_\_\_\_}\) denotes the true interval obtained by subtracting and adding 1 to the last displayed digit, namely,

$$-0.0059\_\_\_\_\_\_\_\_\_\_ =[-0.00600000000000, -0.00580000000000].$$

Example 1

Let

$$\begin{aligned} f_1&=x_1^2+x_2-2=0,\\ f_2&=x_1-x_2=0, \end{aligned}$$

and \(\varvec{x}^{(0)}={(1.8, 1.8)}^T\). Our program implementing the alpha theory computes

$$\alpha ^*<\alpha =0.1436672967< \frac{13-3\sqrt{17}}{4},$$

Then, by Proposition 3, we can immediately deduce that \(0<\alpha (\varvec{f}, \varvec{x}^{(3)})<\alpha ^*\). The values of \(\alpha (\varvec{f}, \varvec{x}^{(k)})\), \(\rho ^*(\varvec{f}, \varvec{x}^{(k)})\), \(\rho ^{**}(\varvec{f}, \varvec{x}^{(k)})\), \(k=1, 2, 3\), are shown in Table 1, and the values of \(\varvec{x}^{(k)}\), \(\rho ^{(k)}\), \(S(\mathbf{Y }_{\rho ^{(k)}}, \varvec{x}^{(k)})\), \(k=1, 2, 3\), are shown in Table 2.
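The quantity \(\alpha \) above can be reproduced exactly: since \(\varvec{f}\) is quadratic, the supremum in the definition of \(\gamma \) is attained at \(k=2\), and with the infinity norm the operator norm of the resulting bilinear map is explicit. An illustrative NumPy sketch:

```python
import numpy as np

x0 = np.array([1.8, 1.8])
fx = np.array([x0[0]**2 + x0[1] - 2.0, x0[0] - x0[1]])
J = np.array([[2.0 * x0[0], 1.0], [1.0, -1.0]])   # Jacobian at x0
Jinv = np.linalg.inv(J)

beta = np.linalg.norm(Jinv @ fx, np.inf)
# D^2 f / 2! maps (u, v) to (u1*v1, 0), so composing with Jinv gives a
# bilinear map whose infinity operator norm is max_i |Jinv[i, 0]|.
gamma = np.max(np.abs(Jinv[:, 0]))
alpha = beta * gamma                 # = 0.1436672967...
```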

Table 1. The values of \(\alpha (\varvec{f}, \varvec{x}^{(k)})\), \(\rho ^*(\varvec{f}, \varvec{x}^{(k)})\) and \(\rho ^{**}(\varvec{f}, \varvec{x}^{(k)})\) about Example 1
Table 2. The values of \(\varvec{x}^{(k)}\), \(\rho ^{(k)}\) and \(S(\mathbf{Y }_{\rho ^{(k)}}, \varvec{x}^{(k)})\) about Example 1

Example 2

[11] Let

$$f_i(\varvec{x})=x_i-0.7 x_i\sum _{j=1}^9 a_{i, j} x_j-1=0, \quad \quad i=1, 2, \ldots , 9,$$

with

$$a_{i, j}=\frac{3}{4} \frac{t_i{(1-t_j^2)}^2 w_j}{t_i+t_j},$$

where \(t_i, w_i\), \(i=1,2, \ldots , 9\), are respectively the nodes and weights of the Gaussian integration rule of order 17 on the interval [0, 1] (see Table 3).

For the initial approximation

$$\varvec{x}^{(0)}={[1.36, 1.36, 1.36, 1.36, 1.36, 1.36, 1.36, 1.36, 1.36]}^T,$$

we have

$$\alpha ^*<\alpha =0.144327213206543< \frac{13-3\sqrt{17}}{4}.$$

It follows by Proposition 3 that \(0<\alpha (\varvec{f}, \varvec{x}^{(3)})<\alpha ^*\). To be precise,

$$\begin{aligned}&\alpha (\varvec{f}, \varvec{x}^{(3)})=9.771724278220212\mathrm {e}\text{- }10,\\&\rho ^*(\varvec{f}, \varvec{x}^{(3)})=2.106476714805001\mathrm {e}\text{- }09,\\&\rho ^{**}(\varvec{f}, \varvec{x}^{(3)})=0.721018979786819. \end{aligned}$$

Choose \(\rho ^{(3)}=2.206476714805001\mathrm {e}\text{- }09\), then \(S(\mathbf{Y }_{\rho ^{(3)}}, \varvec{x}^{(3)})\subseteq \mathrm {int}(\mathbf{Y }_{\rho ^{(3)}})\). The values of \(\varvec{x}^{(3)}\) and \(S(\mathbf{Y }_{\rho ^{(3)}}, \varvec{x}^{(3)})\) are shown in Table 4, where i stands for the coordinate index.

Table 3. The values of \(t_i, w_i\) of Example 2
Table 4. The values of \({\varvec{x}}^{(3)}\) and \(S(\mathbf Y _{\rho ^{(3)}}, \varvec{x}^{(3)})\) about Example 2

Example 3

Let

$$\begin{aligned}&f_i=x_i^2+x_{i+1}-2=0, \quad \quad i=1,2,\ldots , 99,\\&f_{100}=x_{99}-x_{100}=0. \end{aligned}$$

Given \(\varvec{x}^{(0)}\) with \(x_i^{(0)}=1.28\), \(i=1, 2, \ldots , 100\), it follows that

$$ \frac{13-3\sqrt{17}}{4}<\alpha =0.165370210314031<3-2\sqrt{2}.$$

Recalling Proposition 2, we know that there exists \(k\in \mathbb {N}\) such that \(\alpha (\varvec{f}, \varvec{x}^{(k)})<\alpha ^*\). Indeed, already for the first iteration point \(\varvec{x}^{(1)}\), we have

$$\begin{aligned}&\alpha (\varvec{f}, \varvec{x}^{(1)})=0.0209408111113277<\alpha ^*,\\&\rho ^{*}(\varvec{f}, \varvec{x}^{(1)})=0.0229019461130693,\\&\rho ^{**}(\varvec{f}, \varvec{x}^{(1)})=0.291551749088838. \end{aligned}$$

Choose \(\rho ^{(1)}=0.0229019461130694\), then \(S(\mathbf{Y }_{\rho ^{(1)}}, \varvec{x}^{(1)})\subseteq \mathrm {int}(\mathbf{Y }_{\rho ^{(1)}})\).
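The values \(\alpha (\varvec{f}, \varvec{x}^{(0)})\) and \(\alpha (\varvec{f}, \varvec{x}^{(1)})\) reported above can be reproduced in the same way as for Example 1 (NumPy sketch; the closed form for \(\gamma \) again uses the fact that the system is quadratic):

```python
import numpy as np

n = 100

def F(x):
    # f_i = x_i^2 + x_{i+1} - 2 for i = 1..99, f_100 = x_99 - x_100
    return np.append(x[:n-1]**2 + x[1:] - 2.0, x[n-2] - x[n-1])

def J(x):
    m = np.zeros((n, n))
    i = np.arange(n - 1)
    m[i, i] = 2.0 * x[:n-1]          # d f_i / d x_i
    m[i, i + 1] = 1.0                # d f_i / d x_{i+1}
    m[n-1, n-2], m[n-1, n-1] = 1.0, -1.0
    return m

def alpha_of(x):
    Jinv = np.linalg.inv(J(x))
    beta = np.linalg.norm(Jinv @ F(x), np.inf)
    # f is quadratic, so the sup in gamma reduces to the k = 2 term; its
    # infinity operator norm is max_i sum_{j <= 99} |Jinv[i, j]|.
    gamma = np.max(np.sum(np.abs(Jinv[:, :n-1]), axis=1))
    return beta * gamma

x0 = np.full(n, 1.28)
a0 = alpha_of(x0)                               # alpha(f, x^(0))
x1 = x0 - np.linalg.solve(J(x0), F(x0))         # one Newton step
a1 = alpha_of(x1)                               # alpha(f, x^(1))
```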