1 Introduction

Let \(B_1, B_2\) be Banach spaces, let \(D\subset B_1\) be an open set, and let \(F:D\longrightarrow B_2\) be a continuously differentiable operator in the Fréchet sense. Consider the problem of finding a solution \(x^*\in D\) of the equation

$$\begin{aligned} F(x)=0. \end{aligned}$$
(1)

Numerous applications from the computational sciences reduce to finding the point \(x^*.\) The point \(x^*\) is needed in closed form, but this can be achieved only in special cases. That is why most solution methods for (1) are iterative procedures. The convergence region for these procedures is small in general, limiting their applicability. The error bounds on the distances involved are also pessimistic (in general).

Motivated by optimization concerns, we develop a technique that addresses all these problems. In particular, we determine a subset of \(D\) to which the iterates also belong. On this subset the Lipschitz-type constants involved are at least as tight as the ones on \(D.\) This modification leads to a finer convergence analysis with the following advantages (A):

  1. Extended convergence domain.

  2. Tighter error bounds on the distances involved.

  3. More precise information for the location of the solution.

The advantages (A) are obtained with no additional conditions. We apply our technique to Newton-like procedures [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]. But it is so general that it can be used to extend the applicability of other iterative procedures along the same lines.

In particular, we extend the results by Cibulka et al. in [10], which in turn generalized earlier ones [1,2,3, 9,10,11,12,13,14,15,16,17,18].

2 Convergence

We introduce certain Lipschitz-type conditions to be used in order to compare them. Let \( \beta > 0\) be a given parameter. We use the notation \(U(x,a)\), \(U[x,a]\) to denote, respectively, the open and closed balls in \(B_1\) with center \(x\) and radius \(a>0\).

Definition 2.1

Operator \(F'\) is center Lipschitz continuous on D if there exists \(K_0 > 0\) such that

$$\begin{aligned} \Vert F'(u)-F'(x_0)\Vert \le K_0\Vert u-x_0\Vert \end{aligned}$$
(2)

for all \(u\in D.\)

Set

$$\begin{aligned} D_0=U\left( x_0,\frac{1}{\beta K_0}\right) \cap D. \end{aligned}$$
(3)

Definition 2.2

Operator \(F'\) is \(1-\)Restricted Lipschitz continuous on \(D_0\) if there exists \(K > 0\) such that

$$\begin{aligned} \Vert F'(u)-F'(v)\Vert \le K\Vert u-v\Vert \end{aligned}$$
(4)

for all \(u\in D_0,\, v=u- F'(u)^{-1}F(u)\in D_0.\)

Definition 2.3

Operator \(F'\) is \(2-\)Restricted Lipschitz continuous on \(D_0\) if there exists \(M > 0\) such that

$$\begin{aligned} \Vert F'(u)-F'(v)\Vert \le M\Vert u-v\Vert \end{aligned}$$
(5)

for all \(u,v\in D_0.\)

Definition 2.4

Operator \(F'\) is Lipschitz continuous on D if there exists \(K_1 > 0\) such that

$$\begin{aligned} \Vert F'(u)-F'(v)\Vert \le K_1\Vert u-v\Vert \end{aligned}$$
(6)

for all \(u,v\in D.\)

Remark 2.5

(i) By the definition of \(D_0\) in (3), we have

$$\begin{aligned} D_0\subseteq D, \end{aligned}$$
(7)

so by (2), (4)–(7)

$$\begin{aligned} K_0\le K_1,\,K\le M,\, M\le K_1,\,\text { and}\,\, K\le K_1. \end{aligned}$$
(8)

We also assume that

$$\begin{aligned} K_0\le K. \end{aligned}$$
(9)

If not, then the results that follow hold with \(K_0\) replacing K. Examples where (7)-(9) are strict can be found in the numerical section and in [4,5,6,7]. Hence, the new technique improves our earlier results too [4,5,6,7].

We suppose that there exists \(x_0\in D\) such that \(F'(x_0)^{-1}\in L(B_2,B_1)\) and

$$\begin{aligned} \Vert F'(x_0)^{-1}\Vert \le \beta . \end{aligned}$$
(10)

Notice that using (6) and (10) we obtain by the Banach lemma on invertible operators [4, 5, 13] that

$$\begin{aligned} \Vert F'(x)^{-1}\Vert \le \frac{\beta }{1-\beta K_1\Vert x-x_0\Vert } \end{aligned}$$
(11)

for all \(x\in U(x_0, \frac{1}{\beta K_1}).\) But using the weaker and more precise (2) we have

$$\begin{aligned} \Vert F'(x)^{-1}\Vert \le \frac{\beta }{1-\beta K_0\Vert x-x_0\Vert } \end{aligned}$$
(12)

for all \(x\in U(x_0, \frac{1}{\beta K_0}).\) This modification in the proofs of earlier works [1, 2, 8,9,10,11,12,13,14,15,16,17,18] leads to the advantages (A). It is worth noticing that \(K_0=K_0(F',D),\, K_1=K_1(F',D), \) but \(M=M(F', D_0)\) and \(K=K(F',D_0).\) Notice also that we require (4) to hold only for the Newton iterates and not for all elements in D or \(D_0\) (see also the numerical section).
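To make the comparison between the two invertibility bounds concrete, the following minimal sketch (with hypothetical values \(\beta =1,\) \(K_0=1,\) \(K_1=3,\) not taken from any example of this paper) evaluates both: the \(K_0\)-based bound is never larger, and it remains finite on the longer interval \([0, \frac{1}{\beta K_0}).\)

```python
# Compare the invertibility bounds based on the full Lipschitz constant K1
# and on the smaller center-Lipschitz constant K0 (hypothetical values).
beta, K0, K1 = 1.0, 1.0, 3.0

def bound(K, t, beta=beta):
    """Upper bound for ||F'(x)^{-1}|| at distance t = ||x - x0||,
    valid only while beta*K*t < 1; infinite outside that region."""
    return 1.0 / (1.0 - beta * K * t) if beta * K * t < 1.0 else float("inf")

radius_old = 1.0 / (beta * K1)   # region where the K1-based bound applies
radius_new = 1.0 / (beta * K0)   # larger region for the K0-based bound
assert radius_new > radius_old

# On the common region, the K0-based bound is tighter.
for t in [0.0, 0.1, 0.2, 0.3]:
    assert bound(K0, t) <= bound(K1, t)

print(radius_old, radius_new)
```

Since \(K_0\le K_1,\) the same comparison holds for any admissible pair of constants, which is exactly the mechanism behind the advantages (A).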

In order to further emphasize the importance of introducing the center Lipschitz condition and using it instead of the Lipschitz condition to provide tighter upper bounds on the norm of \(\Vert F'(x)^{-1}\Vert ,\) we present a motivational example.

Example 2.6

Let \(B_1=B_2={\mathbb {R}}.\) Moreover, define the function

$$\begin{aligned} \varphi (x)=\alpha _0x+\alpha _1+\alpha _2\sin \alpha _3 x,\, x_0=0, \end{aligned}$$

where \(\alpha _i,\, i=0,1,2,3\) are given real parameters. Then, it follows that for \(\alpha _3\) sufficiently large and \(\alpha _2\) sufficiently small, \(\frac{K_0}{K_1}\) can be arbitrarily small, i.e., \(\frac{K_0}{K_1}\longrightarrow 0.\)

Next, we get the following results for \(D_0=U(x_0, r), r > 0.\)

Theorem 2.7

(Extended semi-local Kantorovich Theorem [10, 13]) Suppose:

  1. (i) (2), (4) and (10) hold.

  2. (ii) \(\Vert F'(x_0)^{-1}F(x_0)\Vert \le \gamma ,\) and

  3. (iii) \(\delta = \beta K \gamma < \frac{1}{2}\) and \(r\ge r_0=\frac{1-\sqrt{1-2\delta }}{\beta K}.\)

Then, there exists a unique sequence \(\{x_n\}\) satisfying the Newton procedure

$$\begin{aligned} F(x_n)+F'(x_n)(x_{n+1}-x_n)=0 \end{aligned}$$
(13)

initiated at \(x_0\in D.\) Moreover, this sequence converges to a unique solution \(x^*\in U(x_0,r_0)\) of equation \(F(x)=0.\) Furthermore, the convergence rate is quadratic so that

$$\begin{aligned} \Vert x^*-x_{n}\Vert \le e_n:=\frac{\gamma }{\delta }(2\delta )^{2^n}. \end{aligned}$$

Proof

Simply replace \(K_1\) by \(K\) and (6) by (2) and (4) in the proof of the Kantorovich theorem in [10]. Notice that \(M\) can also replace \(K_1\) in this theorem, but we have \(K\le M\). \(\square \)
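As a numerical illustration of Theorem 2.7, the following sketch runs the Newton procedure (13) on the scalar equation \(F(x)=x^2-2=0\) with \(x_0=1.5\) (so \(B_1=B_2={\mathbb {R}}\)); this concrete example is ours, not from [10]. Here \(\beta =\frac{1}{3},\) \(\gamma =\frac{1}{12},\) \(K=2,\) and the bound \(e_n=\frac{\gamma }{\delta }(2\delta )^{2^n}\) is checked at each step.

```python
import math

# Newton procedure (13) for F(x) = x^2 - 2, F'(x) = 2x, started at x0 = 1.5.
F  = lambda x: x * x - 2.0
dF = lambda x: 2.0 * x

x0    = 1.5
beta  = 1.0 / abs(dF(x0))        # ||F'(x0)^{-1}|| = 1/3
gamma = abs(F(x0) / dF(x0))      # length of the first Newton step = 1/12
K     = 2.0                      # Lipschitz constant of F' (here exact)
delta = beta * K * gamma         # = 1/18 < 1/2, so the theorem applies
assert delta < 0.5

x_star = math.sqrt(2.0)
x = x0
for n in range(4):
    e_n = (gamma / delta) * (2.0 * delta) ** (2 ** n)
    assert abs(x_star - x) <= e_n    # error bound of Theorem 2.7 holds
    x = x - F(x) / dF(x)             # Newton step (13)

print(x)  # ~1.4142135623730951
```

The errors shrink quadratically, far faster than the bounds \(e_n\) require.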

Remark 2.8

If \(D_0=D,\) then our Theorem 2.7 reduces to the corresponding one in [10]. But if \(K < K_1,\) then

$$\begin{aligned} \delta _1=\beta K_1 \gamma< \frac{1}{2} \Rightarrow \delta< \frac{1}{2},\,r_0< \bar{r}_0\,\,\text {and}\,\,e_n <\bar{e}_n=\frac{\gamma }{\delta _1}(2\delta _1)^{2^n}, \end{aligned}$$
(14)

where

$$\begin{aligned} r_1\ge \bar{r}_0=\frac{1-\sqrt{1-2\delta _1}}{\beta K_1}. \end{aligned}$$

Implications (14) justify the advantages (A) stated in the introduction. Hence, the new results are always at least as good as the ones in [10]. It is worth noticing that in practice the computation of the constant \(K_1\) used before requires that of the rest of the Lipschitz constants as special cases. Hence, no additional computational effort or conditions are needed to obtain the advantages (A). Moreover, in view of (14), our results can hold in cases where the earlier ones cannot (see also the numerical section).

Set \(B_1=B_2={\mathbb {R}}^i\) in the next result. Consider the Newton-like procedure

$$\begin{aligned} F(x_n)+T_n(x_{n+1}-x_n)=0, \end{aligned}$$
(15)

where \(T_n\) are matrices from \(\bar{\partial }F(x_n),\) the Clarke generalized Jacobian of \(F\) [11].

Theorem 2.9

(Extended local [10]) Suppose (2) and (3) hold for \(x_0=x^*.\) Moreover, for every \(\epsilon > 0\) there exists \(\alpha > 0\) such that for all \(x\in U(x^*,\alpha )\) and \(T\in \bar{\partial }F(x)\)

$$\begin{aligned} \Vert F(x)-F(x^*)-T(x-x^*)\Vert \le \epsilon \Vert x-x^*\Vert . \end{aligned}$$
(16)

Then, there exists a neighborhood U of \(x^*\) such that for each \(x_0\in U\) there exists a sequence \(\{x_n\}\) satisfying (15) with \(\lim _{n\longrightarrow \infty }x_n=x^*.\) The convergence is superlinear.

Proof

Simply replace \(K_1\) by K in the proof of Theorem 1.3 in [10]. \(\square \)
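A minimal sketch of procedure (15) for a nonsmooth map: we take the hypothetical function \(F(x)=2x+|x|-1\) on \({\mathbb {R}}\) (solution \(x^*=\frac{1}{3}\)), whose Clarke generalized Jacobian is \(\{3\}\) for \(x>0,\) \(\{1\}\) for \(x<0,\) and \([1,3]\) at \(0;\) this toy example is ours, not from [10] or [11].

```python
# Newton-like procedure (15): F(x_n) + T_n (x_{n+1} - x_n) = 0,
# where T_n is an element of the Clarke generalized Jacobian of F at x_n.
def F(x):
    return 2.0 * x + abs(x) - 1.0

def clarke_element(x):
    # One selection from the Clarke generalized Jacobian of F:
    # {3} if x > 0, {1} if x < 0; we pick 2 from [1, 3] at x = 0.
    if x > 0.0:
        return 3.0
    if x < 0.0:
        return 1.0
    return 2.0

x = -1.0                     # starting point
for _ in range(20):
    T = clarke_element(x)
    x = x - F(x) / T         # solve F(x_n) + T (x_{n+1} - x_n) = 0
    if abs(F(x)) < 1e-14:
        break

print(x)  # 0.3333333333333333 (= x* = 1/3)
```

Since \(F\) is piecewise linear here, the iteration in fact terminates after finitely many steps, which trivially exhibits the superlinear behavior.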

Next, we solve the inclusion problem

$$\begin{aligned} 0\in F(x)+G(x), \end{aligned}$$
(17)

where F is as before and \(G:B_1\rightrightarrows B_2\) is a set-valued operator whose graph is closed [10]. The Newton-like procedure

$$\begin{aligned} 0\in F(x_n)+F'(x_n)(x_{n+1}-x_n)+G(x_{n+1}) \end{aligned}$$
(18)

shall be used.

Recall that a set-valued operator \(\psi :B_1\rightrightarrows B_2\) is said to be metrically regular at \(x_0\) for \(y_0\) if \(y_0\in \psi (x_0)\) and there exist neighborhoods \(N_1\) of \(x_0\) and \(N_2\) of \(y_0\) and a positive parameter \(\mu \) such that the graph of \(\psi \) denoted by \(gph \psi \) is such that \(gph \psi \cap (N_1\times N_2)\) is closed [11] and

$$\begin{aligned} dist (x, \psi ^{-1}(y))\le \mu \,dist (y, \psi (x)) \,\,\,\,\forall (x,y)\in N_1\times N_2. \end{aligned}$$

The next result, using the concept of metric regularity, extends Theorem 3.1 in [10], which generalized Theorems 6C.1 and 6D.2 from [11].

Theorem 2.10

(Semi-local) Suppose:

  1. (i) Let \(r> 0, b> 0, \beta > 0, p\ge 0\) and \(x_0\in B_1,\, y_0\in F(x_0)+G(x_0)\) be such that \(\beta p <1\) and \(\Vert y_0\Vert < (1-\beta p)\min \{\frac{r}{\beta }, b\}.\)

  2. (ii) Conditions (2) and (4) hold.

  3. (iii) For each \(w\in U(x_0, r)\) the mapping

    $$\begin{aligned} x\longrightarrow Q_w(x):=F(x_0)+F'(w)(x-x_0)+G(x) \end{aligned}$$

    is metrically regular at \(x_0\) for \(y_0\) with constant \(\beta \) and neighborhoods \(U(x_0, r)\) and \(U(y_0,b).\)

  4. (iv) \(\Vert F(x)-F(y)-F'(x)(x-y)\Vert \le p\Vert x-y\Vert \) for each \(x,y\in U(x_0, r).\)

Then, there exists a sequence \(\{x_n\}\) satisfying (18) and convergent \(q-\)superlinearly to a solution \(x^*\) of the inclusion problem (17) (if (2) and (4) are not assumed). Moreover, if (2) and (4) are assumed, then \(\{x_n\}\) converges \(Q-\)quadratically to \(x^*.\)
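To illustrate procedure (18), consider the one-dimensional inclusion \(0\in F(x)+G(x)\) with \(F(x)=e^x-2\) and \(G=N_C,\) the normal cone mapping of \(C=[0,\infty )\) (a standard instance of (17); this concrete example is ours, not from [10]). Since \(F'>0,\) each linearized inclusion in (18) is solved by projecting the Newton point onto \(C.\)

```python
import math

# Newton-like procedure (18) for 0 in F(x) + N_C(x), C = [0, infinity):
# each step solves 0 in F(x_n) + F'(x_n)(x - x_n) + N_C(x).
F  = lambda x: math.exp(x) - 2.0
dF = lambda x: math.exp(x)

def solve_linearized(xn):
    # The linearization g(x) = F(xn) + F'(xn)(x - xn) is strictly increasing,
    # so 0 in g(x) + N_C(x) has the unique solution max(0, root of g),
    # i.e. the Newton point projected onto C.
    return max(0.0, xn - F(xn) / dF(xn))

x = 1.0
for _ in range(25):
    x_next = solve_linearized(x)
    if abs(x_next - x) < 1e-14:
        break
    x = x_next

print(x)  # ~0.6931471805599453 (= ln 2, where F vanishes inside C)
```

Here the solution lies in the interior of \(C,\) so the iterates eventually coincide with plain Newton steps and converge quadratically, as Theorem 2.10 predicts.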

The rest of the results in [10] can be extended along the same lines. The details are left to the motivated reader.

3 Three Examples

The older convergence criteria are compared with the new ones.

Example 3.1

Let us consider a scalar function h defined on the set \(D=[x_0-(1-\xi ), x_0+1-\xi ]\) for \(\xi \in (0,\frac{1}{2})\) by

$$\begin{aligned} h(x)=x^3-\xi . \end{aligned}$$

Choose \(x_0=r=r_1=1.\) Then, we obtain the estimates \(\eta =\frac{1-\xi }{3},\, \beta =\frac{1}{3},\)

$$\begin{aligned} |h'(x)-h'(x_0)|= & {} 3|x^2-x_0^2|\\= & {} 3|x+x_0||x-x_0|\le 3(|x-x_0|+2|x_0|)|x-x_0|\\\le & {} 3(1-\xi +2)|x-x_0|=3(3-\xi )|x-x_0|, \end{aligned}$$

for all \(x\in D, \) so \(K_0=3(3-\xi ),\) \(D_0=U(x_0,\frac{1}{\beta K_0})\cap D=U(x_0, \frac{1}{\beta K_0}),\)

$$\begin{aligned} |h'(y)-h'(x)|= & {} 3|y^2-x^2|\\= & {} 3|y+x||y-x|\le 3(|y-x_0|+|x-x_0|+2|x_0|)|y-x|\\\le & {} 3\left( \frac{1}{\beta K_0}+\frac{1}{\beta K_0}+2\right) |y-x|=6\left( 1+\frac{1}{\beta K_0}\right) |y-x|, \end{aligned}$$

for all \(x,y\in D_0,\) so \(M=6\left( 1+\frac{1}{\beta K_0}\right) .\) A similar computation for all \(x,y\in D\) gives \(K_1=6(2-\xi ).\) Notice that for all \(\xi \in (0, \frac{1}{2})\)

$$\begin{aligned} K_0< K_1\,\,\text {and}\,\, M < K_1. \end{aligned}$$

Next, set \(y=x-h'(x)^{-1}h(x),\, x\in D.\) Then, we have \(x+y=\bar{h}(x),\) where the function \(\bar{h}\) is defined on the interval \(D=[\xi , 2-\xi ]\) by

$$\begin{aligned} \bar{h}(x)=\frac{5x^3+\xi }{3x^2}. \end{aligned}$$

Then, we get by this definition that

$$\begin{aligned} \bar{h}'(x)= & {} \frac{15x^4-6x\xi }{9x^4}\\= & {} \frac{5(x-q)(x^2+xq+q^2)}{3x^3}, \end{aligned}$$

where \(q=\root 3 \of {\frac{2\xi }{5}}\) is the critical point of function \(\bar{h}.\) Notice that \(\xi< q < 2-\xi .\) It follows that this function is decreasing on the interval \((\xi ,q)\) and increasing on the interval \((q, 2-\xi ),\) since \(x^2+xq+q^2 > 0\) and \(x^3 > 0.\) So, we can set

$$\begin{aligned} M_1=\frac{5(2-\xi )^3+\xi }{3(2-\xi )^2}. \end{aligned}$$

Thus, we have

$$\begin{aligned} M_1 < K_0. \end{aligned}$$

But if \(x\in D_0=[1-\frac{1}{\beta K_0}, 1+\frac{1}{\beta K_0}],\) then

$$\begin{aligned} K=\frac{5\lambda ^3+\xi }{3\lambda ^2}, \end{aligned}$$

where \(\lambda =\frac{4-\xi }{3-\xi }.\) Then, we also get \(K < M_1.\) Hence, K can replace \(K_1\) or M in all the previous results [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]. Next, we verify the existing convergence criteria. The criterion in [1, 2, 9,10,11,12,13,14,15] is

$$\begin{aligned} 2\beta K_1\eta \le 1, \end{aligned}$$

which holds only for \(\xi \in (0.5,1),\) i.e., for no \(\xi \in (0,\frac{1}{2}).\)

Ours in [4,5,6,7]:

$$\begin{aligned} 2\beta M\eta \le 1. \end{aligned}$$

But this criterion does not hold for all \(\xi \in (0, \frac{1}{2}).\) Hence, there is no guarantee that Newton’s method converges. In particular, the results in [10] cannot apply.

But the new criterion

$$\begin{aligned} 2\beta K\eta \le 1 \end{aligned}$$

and the conditions of Theorem 2.7 hold for all \(\xi \in (0,\frac{1}{2}).\)

Clearly, the new results extend the range of values \(\xi \) for which Newton’s procedure (13) converges.
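The comparison of the criteria in Example 3.1 can be checked numerically; the sketch below uses the constants derived above (\(\beta =\frac{1}{3},\) \(\eta =\frac{1-\xi }{3},\) \(K_1=6(2-\xi ),\) \(K=\frac{5\lambda ^3+\xi }{3\lambda ^2}\) with \(\lambda =\frac{4-\xi }{3-\xi }\)) at the sample value \(\xi =0.2.\)

```python
# Convergence criteria of Example 3.1 at the sample value xi = 0.2.
xi   = 0.2
beta = 1.0 / 3.0
eta  = (1.0 - xi) / 3.0
K1   = 6.0 * (2.0 - xi)                      # Lipschitz constant on D
lam  = (4.0 - xi) / (3.0 - xi)               # right endpoint of D_0
K    = (5.0 * lam**3 + xi) / (3.0 * lam**2)  # restricted constant on D_0

old_criterion = 2.0 * beta * K1 * eta        # fails: value > 1
new_criterion = 2.0 * beta * K * eta         # holds: value <= 1

assert old_criterion > 1.0
assert new_criterion <= 1.0
print(old_criterion, new_criterion)  # ~1.92  ~0.41
```

So at \(\xi =0.2\) the old criterion fails while the new one holds comfortably, in line with the discussion above.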

Next, we present an example to show that our conditions can be used to solve equations in cases where the ones in [1, 2, 9,10,11,12,13,14,15,16,17] cannot.

Example 3.2

Consider \(B_1=B_2=C[0,1]\) equipped with the max-norm. Set \(D=U(x_0,3).\) Define the operator G on D by

$$\begin{aligned} G(z)(w)=z(w)-y(w)-\int _0^1Q(w,t)z^3(t)dt, \end{aligned}$$
(19)

\(w\in [0,1], \, z\in C[0,1],\) where \(y\in C[0,1]\) is fixed and Q is the Green's kernel defined by

$$\begin{aligned} Q(w,u)=\left\{ \begin{array}{cc} (1-w)u,&{} if\,\, u\le w\\ w(1-u),&{}if\,\,w\le u. \end{array}\right. \end{aligned}$$
(20)

Then, the Fréchet derivative \(G'\) is given by

$$\begin{aligned} {[}G'(v)z](w)=z(w)-3\int _0^1Q(w,t)v^2(t)z(t)dt, \end{aligned}$$
(21)

\(w\in [0,1],\, z\in C[0,1].\) Let \(y(w)=x_0(w)=1.\) Then, using (2)-(21), we obtain \(G'(x_0)^{-1}\in L(B_2,B_1),\) \(\Vert I-G'(x_0)\Vert < \frac{3}{8},\, \Vert G'(x_0)^{-1}\Vert \le \frac{8}{5}:=\beta ,\, \gamma =\frac{1}{5},\) \(K_0=\frac{12}{5},\, K_1=\frac{18}{5},\) and \(D_0=U(1,3)\cap U(1,\frac{5}{12})=U(1,\frac{5}{12}),\) so \(M=\frac{3}{2},\) and \(K_0< K_1,\, M < K_1.\) Set \(K=M.\) Then, the old sufficient convergence criterion is not satisfied, since \(\gamma \beta K_1 =\frac{1}{5}\cdot \frac{8}{5}\cdot \frac{18}{5}=\frac{144}{125}> \frac{1}{2}.\) Therefore, there is no guarantee that Newton's method (13) converges to \(x^*\) under the conditions of the aforementioned references. But our condition holds, since \(\gamma \beta K=\frac{1}{5}\cdot \frac{8}{5}\cdot \frac{3}{2}=\frac{12}{25} <\frac{1}{2}.\) Hence, the conclusions of Theorem 2.7 follow.
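The two criteria in Example 3.2 involve only rational numbers, so they can be verified exactly; a minimal check:

```python
from fractions import Fraction

# Constants of Example 3.2 as exact rationals.
gamma = Fraction(1, 5)
beta  = Fraction(8, 5)
K1    = Fraction(18, 5)   # Lipschitz constant on D
K     = Fraction(3, 2)    # restricted constant on D_0 (= M here)

old = gamma * beta * K1   # 144/125 > 1/2: old criterion fails
new = gamma * beta * K    # 12/25  < 1/2: Theorem 2.7 applies

assert old == Fraction(144, 125) and old > Fraction(1, 2)
assert new == Fraction(12, 25) and new < Fraction(1, 2)
print(old, new)
```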

Example 3.3

Consider \(B_1=B_2=C[0,1]\) and \(D=U[0,1].\) Then the boundary value problem (BVP) [4]

$$\begin{aligned}&\tau (0)=0,\,\tau (1)=1,\\&\tau ''=-\tau ^3-\sigma \tau ^2 \end{aligned}$$

can also be written as

$$\begin{aligned} \tau (t_2)=t_2+\int _0^1 P(t_2,t_1)(\tau ^3(t_1)+\sigma \tau ^2(t_1))dt_1 \end{aligned}$$

where \(\sigma \) is a constant and \(P(t_2,t_1)\) is the Green’s function

$$\begin{aligned} P(t_2,t_1)=\left\{ \begin{array}{cc} t_1(1-t_2),&{}\, t_1\le t_2\\ t_2(1-t_1),&{}\,\, t_2 < t_1. \end{array}\right. \end{aligned}$$

Consider \(F:D\longrightarrow B_2\) as

$$\begin{aligned} {[}F(x)](t_2)=x(t_2)-t_2-\int _0^1P(t_2,t_1)(x^3(t_1)+\sigma x^2(t_1))dt_1. \end{aligned}$$

Let us set \(\tau _0(t_2)=t_2\) and \(D=U(\tau _0, \rho _0).\) Then, clearly \(U(\tau _0,\rho _0)\subset U(0, \rho _0+1),\) since \(\Vert \tau _0\Vert =1.\) If \(2\sigma < 5,\) then

$$\begin{aligned} K_0=\frac{2\sigma +3\rho _0+6}{8}\,\,\text { and}\,\, K_1=\frac{\sigma +6\rho _0+3}{4}. \end{aligned}$$

Hence, \(K_0 < K_1.\)
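Indeed, writing \(K_1=\frac{2\sigma +12\rho _0+6}{8},\) we get \(K_1-K_0=\frac{9\rho _0}{8}>0\) for every \(\rho _0>0.\) A quick numerical check over a grid of (hypothetical) parameter values:

```python
# Check K0 < K1 in Example 3.3 over a grid of parameter values.
def K0(sigma, rho0):
    return (2.0 * sigma + 3.0 * rho0 + 6.0) / 8.0

def K1(sigma, rho0):
    return (sigma + 6.0 * rho0 + 3.0) / 4.0

for sigma in [0.0, 1.0, 2.0, 2.4]:      # values with 2*sigma < 5
    for rho0 in [0.5, 1.0, 2.0]:
        gap = K1(sigma, rho0) - K0(sigma, rho0)
        assert abs(gap - 9.0 * rho0 / 8.0) < 1e-12   # gap = 9*rho0/8
        assert gap > 0.0                              # hence K0 < K1
```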

4 Conclusions

A technique is introduced by which the convergence analysis of the Newton procedures (13) and (18) is consistently extended under weaker conditions. Researchers and practitioners will prefer to use these results over the earlier ones, since they are at least as applicable. The same idea can be used on other methods in order to extend their applicability along the same lines. This will be the focus of our future research.