The unconstrained minimization problem deals with functions \(f(\cdot)\) of \(n\) real variables that are defined and sufficiently smooth in a neighborhood \(U(x^*)\) of the minimizer. Throughout this paper, the set of minimizers of \(f\) is denoted by \(X^* = \operatorname{Arg\,min} f\) and is assumed to be nonempty. The necessary condition for \(x^*\) to be a minimizer of \(f\) is the equality \(f'(x^*) = 0\) together with the positive semidefiniteness of the Hessian matrix of \(f\) at the minimizer. Newton's method has been studied in numerous works, for example, in [26]. An overview of relevant studies can be found in [1]. In this paper, we show that, even if the Hessian matrix is singular at the point \(x^*\), the gradient of the objective function in a neighborhood of \(x^*\) belongs to the image of its second derivative and, hence, the Newtonian system for the descent direction is solvable at points of this neighborhood. This topological property of convexity and optimality provides new insight into Newton-type methods and makes it possible to prove their rate of convergence without the restrictive assumption that the Hessian matrix is nonsingular at \(x^*\). For unconstrained optimization problems, we establish properties that refine Lojasiewicz's inequality [7, 8]; namely, the inequality is generalized to the entire spectrum of derivatives up to a certain order. We consider functions whose derivatives are bounded uniformly in the argument and jointly in the order in a neighborhood \(U(x^*)\) of \(x^*\), i.e., for this neighborhood, there exists a positive constant \(M\) such that the values of the derivatives of any order do not exceed \(M\) in absolute value. Additionally, we consider functions that are sufficiently smooth in a neighborhood of the minimizer, i.e., infinitely differentiable functions. Throughout this paper, without loss of generality, we assume that \(f(x^*) = 0\) and use the notation \(f^{(0)}(x) = f(x)\). The following result holds for functions of one variable.

Lemma 1. If \(x^*\) is an isolated local minimizer of a function \(f: \mathbb{R} \to \mathbb{R}\) that is sufficiently smooth in a neighborhood \(U(x^*)\) and has derivatives that are bounded in \(U(x^*)\) uniformly in the argument and jointly in the order, then there exists an even exponent \(2p\), \(p = p_f \in \mathbb{N}\), such that

$$\begin{gathered} f^{(2p)}(x^*) > 0,\quad f^{(k)}(x^*) = 0, \\ \frac{f^{(k)}(x)}{(x - x^*)^{2p - k}} \geqslant C_k,\quad k = 0, 1, 2, \ldots, 2p - 1, \\ \end{gathered}$$

for all \(x \in \overset{\circ}{U}(x^*)\), where \(C_k = C_{k,f}\), \(k = 0, 1, 2, \ldots, 2p - 1\), are positive constants independent of \(x\).

Proof. Since \(x^*\) is a local minimizer of \(f\), we have \(f'(x^*) = 0\). First, we show that, under the conditions of Lemma 1, it is impossible that \(f^{(k)}(x^*) = 0\) for all \(k = 1, 2, \ldots\). Indeed, in this case, given a fixed point \(x = x^* + t \in U(x^*)\), since the derivatives are bounded in a neighborhood of \(x^*\), the Taylor formula with zero derivatives up to any fixed order \(k\) yields the inequality \(|f'(x^* + \theta t)| \leqslant \frac{M(|\theta t|)^k}{k!}\), where \(M\) is the above-introduced upper bound for the values of the derivatives of \(f\) in the indicated neighborhood and \(\theta \in (0, 1)\). Combining this with \(f(x^* + t) = t\int_0^1 f'(x^* + \theta t)\,d\theta\) yields \(|f(x^* + t)| \leqslant \frac{M|t|^{k+1}}{(k + 1)!}\). Letting \(k \to \infty\) gives \(f(x^* + t) = 0\) for every \(x^* + t \in U(x^*)\), so every point of the neighborhood is a minimizer, which contradicts the fact that the local minimizer \(x^*\) is isolated. Therefore, there exists a finite \(k > 1\) such that \(f^{(k)}(x^*) \neq 0\) and \(f^{(l)}(x^*) = 0\), \(l = 1, 2, \ldots, k - 1\). Moreover, \(k\) can only be even, \(k = 2p\), \(p \in \mathbb{N}\), and \(f^{(2p)}(x^*) > 0\), since \(x^*\) is a local minimizer of \(f\). Now, to complete the proof, we denote \(\frac{1}{(2p + 1)!} f^{(2p)}(x^*)\) by \(C_0\). Then the first inequality of the lemma follows from the Taylor formula and the boundedness of the \((2p + 1)\)th derivative in the neighborhood \(U(x^*)\). The Taylor expansion of \(f^{(k)}(x)\) in a neighborhood of \(x^*\) yields the other inequalities of the lemma for \(k = 1, 2, \ldots, 2p - 1\) with \(C_k = \frac{1}{(2p + 1 - k)!} f^{(2p)}(x^*)\).
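The following minimal check, assuming SymPy is available and using the model function \(f(x) = x^6\) (an assumed example, so \(x^* = 0\) and \(2p = 6\)), illustrates the conclusions of Lemma 1; the printed ratios are the quantities bounded from below by the constants \(C_k\).

```python
# Illustrative check of Lemma 1 for f(x) = x^6 with x* = 0 and 2p = 6
# (an assumed model example, not taken from the paper).
import sympy as sp

x = sp.symbols('x')
f = x**6          # isolated minimizer x* = 0, f^(6)(0) = 720 > 0
p = 3

for k in range(2 * p):
    fk = sp.diff(f, x, k)                     # k-th derivative f^(k)(x)
    ratio = sp.simplify(fk / x**(2 * p - k))  # f^(k)(x) / (x - x*)^(2p - k)
    print(k, fk, ratio)                       # each ratio is a positive constant
```

For this model function the ratios are exactly the constants \(720/(6 - k)!\), in agreement with the lower bounds of the lemma.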

Corollary 1. Under the conditions of Lemma 1, the function f is locally convex.

Indeed, if \(f^{(2)}(x^*) > 0\), then the second derivative remains positive in a small neighborhood of \(x^*\), which means that \(f\) is locally convex. If \(f^{(2)}(x^*) = 0\), then, as is shown in Lemma 1, \(f^{(2p)}(x^*) > 0\) for some \(p = p_f \in \mathbb{N}\), \(p > 1\), and, additionally, \(f^{(k)}(x^*) = 0\) for \(k = 1, 2, \ldots, 2p - 1\). Then \(f^{(2)}(x) \geqslant C_2 (x - x^*)^{2p - 2}\) for any \(x \in U(x^*)\), where \(C_2\) is the positive constant determined in Lemma 1. This inequality also means that \(f\) is locally convex.

For a function of \(n\) variables, the \(k\)th derivative in the direction \(h\) at a point \(x\) is denoted by \(f_h^{(k)}(x)\).

Corollary 2. If \(f: \mathbb{R}^n \to \mathbb{R}\) is a sufficiently smooth function whose derivatives are bounded uniformly in the argument and jointly in the order in a neighborhood \(U(x^*)\) of an isolated minimizer \(x^*\), then, for any \(h \in \mathbb{R}^n\) with \(\|h\| = 1\), there exists an even exponent \(2p_{h,f}\), \(p_{h,f} \in \mathbb{N}\), such that \(f_h^{(2p_{h,f})}(x^*) > 0\), \(f_h^{(k)}(x^*) = 0\), and \(\frac{f_h^{(k)}(x^* + th)}{t^{2p_{h,f} - k}} \geqslant C_k\), \(k = 0, 1, 2, \ldots, 2p_{h,f} - 1\), for all \(t \in (0, \delta_h]\), where \(x^* + \delta_h h \in U(x^*)\) and \(C_k = C_{k,h,f}\), \(k = 0, 1, 2, \ldots, 2p_{h,f} - 1\), are positive constants independent of \(t\); moreover, the function \(f\) is locally convex in the direction \(h\).

Here, the exponent \(p = p_{h,f} \in \mathbb{N}\) is determined by the direction \(h\), \(\|h\| = 1\), and does not depend on small \(t\) for which \(x^* + th \in U(x^*)\), but the value \(\delta_h > 0\) depends on \(h\) and, in the general case, can be arbitrarily small as \(h\) varies. It will be shown in Lemma 2 below that, for a convex function \(f\), we can guarantee the existence of an upper bound for \(p_{h,f} \in \mathbb{N}\) and a positive lower bound for \(\delta_h\) on the set of vectors \(h\), \(\|h\| = 1\).
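A small sketch, assuming SymPy and the illustrative function \(f(x_1, x_2) = x_1^2 + x_2^4\) (an example of ours, not from the text), shows how the directional order \(2p_{h,f}\) of Corollary 2 genuinely depends on the direction \(h\).

```python
# Directional orders 2p_{h,f} for the assumed example f(x1, x2) = x1^2 + x2^4:
# the first nonvanishing directional derivative at x* = 0 has order 2 along
# h = (1, 0) but order 4 along h = (0, 1). The order is invariant under
# scaling of h, so the directions need not be normalized here.
import sympy as sp

t, x1, x2 = sp.symbols('t x1 x2')
f = x1**2 + x2**4

for h in [(1, 0), (0, 1), (1, 1)]:
    phi = f.subs({x1: h[0] * t, x2: h[1] * t})   # f(x* + t h) as a function of t
    order = next(k for k in range(1, 9)
                 if sp.diff(phi, t, k).subs(t, 0) != 0)
    print(h, order)                              # prints 2, 4, and 2
```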

Lemma 1 implies that the Newton operator \(\psi(x) = x - f^{(2)}(x)^{-1} f'(x)\) is well defined in a sufficiently small deleted neighborhood of the solution. In the case of \(\mathbb{R}^n\) of arbitrary dimension, Lemma 1 means that \(f\) is convex along any straight line passing through the point \(x^*\), provided that the function is sufficiently smooth on the intersection of this line with the neighborhood \(U(x^*)\). Additionally, for any straight line passing through \(x^*\) along a vector \(h\), we have \(f_h^{(2p)}(x^*) \geqslant C_2\) for some exponent \(2p\), \(p = p_h \in \mathbb{N}\), and some constant \(C_2 = C_{2,h} > 0\). Lojasiewicz's inequality guarantees that this estimate holds with some \(p\) and \(C_2\) independent of \(h\) if the function is analytic in \(U(x^*)\). Due to the convexity of the function in \(U(x^*)\), we assume only that the function is sufficiently smooth and obtain the "stability" of Lojasiewicz's inequality, which expands the applicability of Newton's method.
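A minimal one-dimensional sketch (the quartic \(f(x) = x^4\) is an assumed example) shows the operator \(\psi\) at work near a minimizer with vanishing second derivative: \(\psi(x) = x - 4x^3/(12x^2) = \frac{2}{3}x\), so the iteration is well defined for \(x \neq 0\) and contracts linearly rather than quadratically.

```python
# Newton's operator psi(x) = x - f'(x)/f''(x) for the assumed example
# f(x) = x^4, whose second derivative vanishes at the minimizer x* = 0.
def psi(x):
    return x - 4 * x**3 / (12 * x**2)   # reduces to (2/3) * x for x != 0

x = 1.0
for k in range(5):
    x_next = psi(x)
    print(k, x_next, x_next / x)        # the ratio stays at 2/3: linear rate
    x = x_next
```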

Lemma 2. If \(f: \mathbb{R}^n \to \mathbb{R}\) is a convex, sufficiently smooth function whose derivatives are bounded uniformly in the argument and jointly in the order in a neighborhood \(U(x^*)\) of an isolated minimizer \(x^*\), then there exist an even exponent \(2p\), \(p = p_f \in \mathbb{N}\), a constant \(C = C_f > 0\), and a number \(\delta = \delta_f > 0\) such that

$$f(x^* + th) \geqslant C t^{2p}$$

for all \(h\), \(\|h\| = 1\), and \(t \in (0, \delta]\).

Proof. Let us prove the existence of \(p = p_f \in \mathbb{N}\) such that, if \(f_h^{(i)}(x^*) = 0\) for \(i = 0, 1, 2, \ldots, 2p'(h) - 1\) and \(f_h^{(2p'(h))}(x^*) > 0\), then \(p'(h) \leqslant p\). Assume the opposite. Then, for some sequences \(h_k\), \(\|h_k\| = 1\), and \(t_k \to +0\), it is true that \(f(x^* + t_k h_k) < C_k t_k^{m_k}\), where \(C_k > 0\) is a bounded sequence and \(m_k \in \mathbb{N}\) is an increasing sequence. Without loss of generality, we assume that \(h_k \to h\) and \(C_k \to C\) as \(k \to \infty\). Let \(\operatorname{conv}\{h_k, h_{k+1}, h_{k+2}, \ldots, h_{k+n}\}\) be denoted by \(V_k\). If \(\dim V_k < n\), this set of vectors is modified by adding linearly independent increments of length on the order of \(C_k t_k^{2p}\) (\(C_k \to 0\), \(k \to \infty\)) to \(n - \dim V_k\) of the vectors, after which we can assume that \(\dim V_k = n\). Next, suppose that \(h'_k \in \operatorname{int} V_k\), \(h''_k = h + \alpha_k (h'_k - h)\), \(\alpha_k \in (0, 1)\), \(\alpha_k = \inf\{\alpha > 0: h + \alpha(h'_k - h) \in V_k\}\). The convexity and continuity of \(f\) imply that \(f(x^* + t_k h''_k) \leqslant 2C t_k^{m_k}\). On the other hand, Lemma 1 implies that, for some \(p = p_h \in \mathbb{N}\), there exists \(l \in \mathbb{N}\), \(l \leqslant p\), for which \(f_h^{(2l)}(x^*) \geqslant C_2 > 0\) for some \(C_2 > 0\). Since \(h''_k \to h\) as \(k \to \infty\), for sufficiently large indices \(k\), we have \(f(x^* + t_k h''_k) \geqslant C_2 t_k^{2l}\), which contradicts the assumption.
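The uniform bound of Lemma 2 can be checked numerically on a convex model function; for the assumed example \(f(x) = \|x\|^4\) one has \(f(x^* + th) = t^4\) for every unit vector \(h\), so \(2p = 4\) and \(C = 1\) serve uniformly over the whole unit sphere.

```python
# Sampling check of the uniform lower bound f(x* + t h) >= C t^(2p) from
# Lemma 2 for the assumed convex example f(x) = ||x||^4 (x* = 0, 2p = 4, C = 1).
import numpy as np

rng = np.random.default_rng(0)
p, C = 2, 1.0
for _ in range(100):
    h = rng.normal(size=3)
    h /= np.linalg.norm(h)                    # random unit direction
    for t in (0.5, 0.1, 0.01):
        f_val = np.linalg.norm(t * h)**4
        assert f_val >= C * t**(2 * p) - 1e-12
print("f(x* + t h) >= C t^(2p) holds on all sampled directions")
```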

Corollary 3. The proof of Lemma 2 implies the existence of an exponent \(2p\), \(p = p_f \in \mathbb{N}\), and constants \(C_k = C_{k,f}\), \(k = 1, 2, \ldots, 2p' - 1\), for which

$$\begin{gathered} f_h^{(2p')}(x^*) > 0,\quad f_h^{(k)}(x^*) = 0,\quad k = 0, 1, \ldots, 2p' - 1, \\ f_h^{(k)}(x^* + th) \geqslant C_k t^{2p - k},\quad 0 < t \leqslant \delta, \\ \end{gathered}$$

moreover, \(p' = p'(h) \leqslant p\) for all \(h\), \(\|h\| = 1\).

For a point \(x = x^* + th \in U(x^*)\), where \(\|h\| = 1\), the basis \(G = G(h) = \{g_1, g_2, \ldots, g_n\}\) in \(\mathbb{R}^n\) and the corresponding index \(q = q(h)\) are determined as follows. Consider the matrices

$$a_k = \frac{1}{(k - 1)!} f^{(k)}(x^*) [h]^{k - 2},\quad k = 2, 3, \ldots.$$

Then, for a sufficiently smooth function \(f\) at the point \(x = x^* + th\), we have the relations

$$\begin{gathered} f^{(2)}(x) th = \sum\limits_{k \geqslant 2} (k - 1) a_k h t^{k - 1}, \\ f'(x) = \sum\limits_{k \geqslant 2} a_k h t^{k - 1},\quad f(x) = \sum\limits_{k \geqslant 2} \frac{a_k [h]^2 t^k}{k}. \\ \end{gathered}$$
(1)

Next, the index \(k_1 \geqslant 2\) is defined as the minimum among those indices \(k\) for which the one-dimensional space \(L_1 = L_{k_1} = \operatorname{Lin}\{a_{k_1} h\} \neq \{0\}\). The straight line \(L_{k_1}\) is redenoted by \(L^1\), the vector \(g_1\) is defined as \(\frac{a_{k_1} h}{\|a_{k_1} h\|}\), and the index \(k_2 > k_1\) is defined as the minimum \(k\) for which \(\operatorname{Lin}\{a_k h\}\) is not contained in \(L^1\). Then \(L_{k_2} = \operatorname{Lin}\{a_{k_2} h\}\), \(\dim(L_{k_2} \oplus L^1) = 2\), the one-dimensional subspace \(L_2\) is defined as \(Pr_{(L^1)^\perp} L_{k_2}\), and the vector \(g_2\) is defined as the normalized vector \(Pr_{L_2} a_{k_2} h\). Here, \(Pr_{L_2}(a_{k_2} h) \subseteq Pr_{L_2} \operatorname{Im} a_{k_2}\). Next, the space \(L^2\) is defined as \(L^1 \oplus L_2\). For each \(j = 3, 4, \ldots, q = q(h)\), in a similar fashion, we determine the straight lines \(L_j = Pr_{(L^{j-1})^\perp} L_{k_j}\), \(j = 3, 4, \ldots, q\), and the corresponding unit vectors \(g_j\), \(j = 1, 2, \ldots, q\). Here, \(q = q(h) = \dim \operatorname{Lin}\{a_2 h, a_3 h, \ldots\} \leqslant n\), and the direct sum \(L^{j-1} \oplus L_j\) is denoted by \(L^j\) and is a \(j\)-dimensional subspace of \(\mathbb{R}^n\). At the final stage, we construct a subspace \(L^q\) of dimension \(q = q(h) \leqslant n\). The orthogonal complement of \(L^q\) is \(H\): \(H = (L^q)^\perp\) for \(q < n\) and \(H = \{0\}\) for \(q = n\). The basis \(G(h)\) in \(\mathbb{R}^n\) is defined as the set consisting of the constructed mutually orthogonal unit vectors \(g_1, g_2, \ldots, g_q\) directed along the constructed orthogonal straight lines \(L_l\), \(l = 1, 2, \ldots, q\), and an arbitrary orthonormal basis \(g_{q+1}, g_{q+2}, \ldots, g_n\) in \(H\).
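In effect, the construction of \(G(h)\) is a Gram–Schmidt pass over the vectors \(a_k h\) that records the indices \(k_j\) at which a new direction first appears. A minimal sketch, with placeholder input vectors rather than derivatives of a specific \(f\), could look as follows.

```python
# Sketch of the basis construction G(h): scan the vectors v_k = a_k h in order
# of k, record each index k_j at which a new direction appears, and
# orthonormalize by projecting onto the orthogonal complement of L^{j-1}.
# The input vectors below are placeholders, not derived from a specific f.
import numpy as np

def build_basis(indexed_vectors, tol=1e-12):
    G, k_indices = [], []
    for k, v in indexed_vectors:               # pairs (k, a_k h), k = 2, 3, ...
        w = v.astype(float).copy()
        for g in G:                            # project out the span L^{j-1}
            w -= (w @ g) * g
        if np.linalg.norm(w) > tol:            # a_k h leaves L^{j-1}: new k_j
            G.append(w / np.linalg.norm(w))
            k_indices.append(k)
    return np.array(G), k_indices

vecs = [(2, np.array([0., 0., 0.])),           # a_2 h = 0, skipped
        (3, np.array([1., 1., 0.])),           # first new direction: k_1 = 3
        (4, np.array([1., 2., 0.])),           # second new direction: k_2 = 4
        (5, np.array([2., 3., 0.]))]           # already lies in L^2, skipped
G, ks = build_basis(vecs)
print(ks)         # [3, 4], i.e. q = 2 and H is the remaining coordinate axis
print(G @ G.T)    # 2 x 2 identity: the vectors g_j are orthonormal
```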

Theorem 1. If \(f: \mathbb{R}^n \to \mathbb{R}\) is a sufficiently smooth function whose derivatives are bounded uniformly in the argument and jointly in the order in a neighborhood \(U(x^*)\) of an isolated minimizer \(x^*\), then, for any fixed \(h\), \(\|h\| = 1\), the system

$$f^{(2)}(x^* + th) s = f'(x^* + th)$$
(2)

is solvable with respect to s for sufficiently small t.

Proof. For any fixed vector \(h\), a solution \(s\) of the system is determined coordinatewise in the basis \(G\) as follows. Let \(Pr_H s = 0\). Then, according to (1), the coordinate \(s_q\) is determined by the condition \(Pr_{L_q}((k_q - 1) a_{k_q} t^{k_q - 2} s) = Pr_{L_q} f'(x) = Pr_{L_q} \sum\limits_{k \geqslant k_q} a_k h t^{k - 1}\). Specifically, \(s_q = \frac{Pr_{L_q} h\, t}{k_q - 1} + o(t)\). Substituting the coordinate \(s_q\) into system (2) yields the coordinate \(s_{q-1} = \frac{Pr_{L_{q-1}} h\, t}{k_{q-1} - 1} + o(t)\). Next, the coordinates \(s_l\), \(l = q - 2, q - 3, \ldots, 1\), are determined sequentially. Generally speaking, the solution of system (2) is not unique, but it follows from (1) that the coordinates of any solution \(s\) in the basis \(G = G(h)\) are given by

$$s_j = \frac{Pr_{L_j} h\, t}{k_j - 1} + o(t),\quad j = 1, 2, \ldots, q.$$
(3)

Remark 1. To obtain a unique solution of system (2), we consider the problem

$$\|s\|^2 \to \min_s,\quad f^{(2)}(x) s = f'(x),$$
(4)

whose solution exists for each fixed \(h\) for all sufficiently small \(t\) under the assumptions made in Theorem 1. The length of the interval in \(t\) within which the solution exists depends on \(h\).
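Numerically, the minimal-norm solution of problem (4) is exactly what the Moore–Penrose pseudoinverse delivers. A short sketch, on the assumed convex example \(f(x) = x_1^4 + x_2^4\) at a point where the Hessian is singular, checks that the system is nevertheless solvable:

```python
# Minimal-norm solution of the possibly singular Newton system (4) via the
# Moore-Penrose pseudoinverse, on the assumed example f(x) = x1^4 + x2^4 at a
# point with singular Hessian (x1 = 0) where the gradient still lies in Im f''.
import numpy as np

def grad(x):
    return np.array([4 * x[0]**3, 4 * x[1]**3])

def hess(x):
    return np.diag([12 * x[0]**2, 12 * x[1]**2])

x = np.array([0.0, 0.5])                   # the Hessian has rank 1 here
s = np.linalg.pinv(hess(x)) @ grad(x)
print(s)                                   # [0, 1/6]: the least-norm solution
print(np.allclose(hess(x) @ s, grad(x)))   # True: system (4) is solvable
```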

Example 1. For the nonconvex function \(f(x) = x_1^4 + (x_2 - x_1^2)^2\), the conclusion of Theorem 1 fails at points of the parabola \(x_2 = 4x_1^2\). Thus, the length of the interval on which Theorem 1 holds tends to zero as the vector \(h\) approaches \((1, 0)\).
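The failure can be verified directly: on this parabola the Hessian has rank 1 while the gradient acquires a component outside its image, so system (2) has no solution there. A sketch of the check:

```python
# Numeric check of Example 1: for f(x) = x1^4 + (x2 - x1^2)^2 the Hessian is
# singular on the parabola x2 = 4 x1^2, and there the gradient leaves the image
# of the Hessian, so the Newton system (2) is unsolvable.
import numpy as np

def grad(x):
    return np.array([8 * x[0]**3 - 4 * x[0] * x[1],
                     2 * (x[1] - x[0]**2)])

def hess(x):
    return np.array([[24 * x[0]**2 - 4 * x[1], -4 * x[0]],
                     [-4 * x[0], 2.0]])

x1 = 0.3
x = np.array([x1, 4 * x1**2])              # a point on the parabola
H, g = hess(x), grad(x)
print(np.linalg.matrix_rank(H))            # 1: the Hessian is singular
residual = H @ np.linalg.pinv(H) @ g - g   # component of g outside Im H
print(np.linalg.norm(residual) > 1e-10)    # True: g is not in Im f''(x)
```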

Below, we show that, in the case of a convex function, there is a neighborhood in \(t\) with a common radius for all vectors of the unit sphere in which problem (4) is solvable. The form of the objective function of problem (4) implies that its solution is given by \(s = f^{(2)}(x)^+ f'(x)\), where \((\cdot)^+\) denotes the pseudoinverse of an operator on its image. The coordinates of the solution of this problem in the basis \(G\) satisfy the condition \(Pr_H s = 0\).

Theorem 2. Under the conditions of Lemma 2, for all \(x\) sufficiently close to \(x^*\), problem (4) is solvable and its solution satisfies the conditions

$$\begin{gathered} (f'(x), s) \geqslant M_1 \|x - x^*\|^{2p}, \\ (f^{(2)}(x) s, s) \geqslant M_2 \|x - x^*\|^{2p}, \\ \end{gathered}$$
(5)

where the exponent \(2p\) is defined in Lemma 2 and the constants \(M_1 = M_{1,f} > 0\) and \(M_2 = M_{2,f} > 0\) do not depend on \(x\).

Proof. Lemma 2 and (1) imply that, for every \(h\), \(\|h\| = 1\), the numbers \(k_j\) determining the basis \(G\) satisfy the following condition: there is an index \(l_1 \in \{k_1, k_1 + 1, \ldots, 2p\}\) for which \(a_{l_1}[h]^2 \geqslant C > 0\); here, the exponent \(2p\) is defined in Lemma 2 and is independent of \(h\), and the constant \(C > 0\) is also independent of \(h\) and is defined in Corollary 3. The last inequality is equivalent to the condition \(\frac{a_{l_1}[h]^2}{l_1 - 1} \geqslant C_1 t^{2p}\) for small \(t\). Denoting \(\frac{C}{2p - 1}\) by \(M_1\), we derive the first inequality stated in the theorem. The second inequality is derived from Corollary 3 in a similar manner.

Corollary 4. Under the conditions of Lemma 2, for a convex function f, the system

$$f^{(2)}(x) s = f'(x)$$
(6)

is solvable with respect to \(s\) for all \(x\) sufficiently close to \(x^*\), which is equivalent to the inclusion

$$f'(x) \in \operatorname{Im} f^{(2)}(x)$$
(7)

irrespective of the rank of the matrix \(f^{(2)}(x)\).

In what follows, the set \(\{x \in \mathbb{R}^n: f(x) \leqslant f(x^*) + \varepsilon\}\) is denoted by \(X_\varepsilon^*\) and the diameter of a set \(A\) is denoted by \(\operatorname{diam} A\).

Theorem 3. Under the conditions of Lemma 2, there exist a positive integer \(p\) and a constant \(\bar{C} < \infty\) for which

$$\operatorname{diam} X_\varepsilon^* < \bar{C} \varepsilon^{\frac{1}{2p}}.$$
(8)
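For illustration (an example of ours, not from the original text): for \(f(x) = x^{2p}\) on \(\mathbb{R}\) with \(x^* = 0\), we have \(X_\varepsilon^* = [-\varepsilon^{1/(2p)}, \varepsilon^{1/(2p)}]\), so \(\operatorname{diam} X_\varepsilon^* = 2\varepsilon^{1/(2p)}\) and estimate (8) holds with any \(\bar{C} > 2\); in particular, the exponent \(\frac{1}{2p}\) in (8) cannot be improved.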

Remark 2. The above-mentioned properties hold under the assumption that the set of minimizers of \(f\) consists of an isolated point. If this assumption is dropped, while the other assumptions of Theorem 1 are satisfied, then we have the representation

$$\operatorname{Arg}\min f = \{x^*\} + L,$$
(9)

where \(L\) is the subspace of \(\mathbb{R}^n\) determined by the condition \(f^{(k)}(x^*)[h]^k = 0\), \(h \in L\), for any positive integer \(k\).

Theorem 1 does not imply that \(f^{(2)}(x)\) is positive definite in a neighborhood of the minimizer \(x^*\). Nevertheless, it guarantees that Newton's method with a suitable initial point is applicable for finding a minimizer of a smooth function, since problem (4) makes it possible to obtain a transition vector for the iterative scheme without assuming that the objective function is convex. Moreover, in this case, we can obtain the monotonicity of Newton's method in the argument and, under stronger assumptions, a linear rate of convergence. If the objective function is convex, then this property holds without additional assumptions.

First, we use Theorem 1 to prove the monotonicity of Newton's scheme in the argument. Specifically, the Newton operator is defined as \(\psi(x, s) = x - f^{(2)}(x)^+ f'(x)\), where \(s = f^{(2)}(x)^+ f'(x)\) is the solution of problem (4). The process of constructing the solution \(s\) in Theorem 1 and the remark to the theorem imply that, for any fixed \(h\), \(\|h\| = 1\), this operator is well defined for sufficiently small \(t\) without assuming that the function \(f\) is convex. To obtain a convergence estimate, the following property is naturally introduced in addition to the assumptions of Theorem 1.

Definition 1. The function \(f\) is said to be weakly regular at the point \(x^*\) if there exists a constant \(d = d_f > 0\) such that \(\max\limits_{j \leqslant q} |h_j| \geqslant d\) for any \(h\), \(\|h\| = 1\). Here, as before, \(q = q(h) = \dim \operatorname{Lin}\{a_2 h, a_3 h, \ldots\}\) and \(h_j\) is the \(j\)th coordinate of the vector \(h\) in the basis \(G\).

Theorem 4. If the function \(f\) is weakly regular at the point \(x^*\), then, under the conditions of Theorem 1, there exists \(\lambda \in (0, 1)\) such that

$$\|\psi(x^* + th, s) - x^*\| \leqslant \lambda t$$

for all sufficiently small t.

Proof. For any solution of system (2), under the weak regularity assumption at the point \(x^*\), there exists an index \(j\) for which \(\|Pr_{L_{k_j}} h\| > d\). It follows that

$$\begin{gathered} \|th - s\| \leqslant \left\| t \sum\limits_{l = 1, l \neq k_j}^q Pr_{L_l} h - Pr_{L_l} s \right\| + t \|Pr_H h\| \\ +\, |t Pr_{L_{k_j}} h - Pr_{L_{k_j}} s| \leqslant t(1 - \alpha) + t\alpha \left(1 - \frac{1}{k_j - 1}\right) \leqslant t \left(1 - \frac{d}{k_j - 1}\right) = \lambda t, \\ \end{gathered}$$
$$\lambda = \lambda(h) = 1 - \frac{d}{k_j - 1} \in (0, 1),\quad \alpha = \alpha(h) \geqslant d.$$

Remark 3. The resulting inequality does not mean the linear convergence of Newton's method, since the \(\lambda\) involved in the estimate depends, generally speaking, on \(h\). In the case of a convex function, the result holds in a stronger form, namely, \(\|\psi(x, s) - x^*\| \leqslant \lambda \|x - x^*\|\) for all \(x\) from a sufficiently small neighborhood of \(x^*\), since the following property is valid for convex functions.

Definition 2. The function \(f\) is said to be uniformly regular at the point \(x^*\) if there exist an index \(m = m_f \in \mathbb{N}\) and a constant \(d = d_f > 0\) such that, for any \(h\), \(\|h\| = 1\), there is \(j \leqslant q\) with \(k_j \leqslant m\) for which \(\|Pr_{L_j} h\| \geqslant d\).

Theorem 5. If the function \(f\) is uniformly regular at the point \(x^*\), then, under the conditions of Theorem 1, there exists \(\lambda \in (0, 1)\) such that \(\|\psi(x, s) - x^*\| \leqslant \lambda \|x - x^*\|\) for all \(x\) from a sufficiently small neighborhood of \(x^*\).

Proof. Theorem 4 implies that the factor \(\lambda = \lambda(h)\) is bounded from above uniformly in \(h\), \(\|h\| = 1\): \(\lambda(h) = 1 - \frac{d}{k_j - 1} \leqslant \lambda' < 1\), where \(j\) is an index with \(k_j \leqslant m\). Such an index \(j\) exists by the uniform regularity assumption. Additionally, the uniform regularity assumption implies that \(\|h_{H^\perp}\| > d\) for some \(d > 0\) independent of \(h\), where \(h_A\) denotes the projection of the vector \(h\) onto the subspace \(A\). This yields an \(h\)-independent estimate for \(\lambda\).

Remark 4. Lemma 2 implies that a convex function is uniformly regular at the point \(x^*\): \(k_j = k_j(h) \leqslant m = 2p\) for all \(h\), \(\|h\| = 1\). Uniform regularity is a necessary and sufficient condition for Newton's method to converge linearly under the hypotheses of Theorem 1 without assuming that the function \(f\) is convex. Newton's iteration has the form

$$x_{k+1} = x_k - f^{(2)}(x_k)^+ f'(x_k),$$

and the convergence rate estimate of the method is linear:

$$\|x_{k+1} - x^*\| \leqslant \lambda \|x_k - x^*\|,\quad \lambda \in (0, 1),\quad k = 0, 1, 2, \ldots,$$

for an initial approximation that is sufficiently close to the solution x* in the case when f is uniformly regular at x*.
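A compact sketch of this pseudoinverse iteration, on the assumed convex test function \(f(x) = x_1^4 + x_2^2\), exhibits the predicted linear rate: the quadratic coordinate is annihilated in one step, while the degenerate quartic coordinate contracts with the factor \(\lambda = 2/3\).

```python
# Sketch of the iteration x_{k+1} = x_k - f''(x_k)^+ f'(x_k) on the assumed
# convex test function f(x) = x1^4 + x2^2; the observed rate is linear with
# lambda = 2/3, governed by the degenerate quartic coordinate.
import numpy as np

def grad(x):
    return np.array([4 * x[0]**3, 2 * x[1]])

def hess(x):
    return np.diag([12 * x[0]**2, 2.0])

x = np.array([1.0, 1.0])
for k in range(8):
    x = x - np.linalg.pinv(hess(x)) @ grad(x)
    print(k, np.linalg.norm(x))   # after the first step the norm decays as (2/3)^k
```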

Corollary 5. Under the conditions of Theorem 2, for a convex function, the rate of convergence of Newton's iteration is linear for any choice of the initial point from a sufficiently small neighborhood of the solution. Indeed, Theorem 2 implies that a convex function is uniformly regular at the solution \(x^*\), which yields the linear convergence estimate.

To ensure that Newton's method converges faster than linearly, we consider a modified operator \(\psi_1(x, s)\) written coordinatewise in the basis \(G\) as \(\psi_1(x, s)_j = x_j - (k_j - 1) s_j\), \(j = 1, 2, \ldots, q\), and \(\psi_1(x, s)_H = x_H - g(x, h)\), where \(g(x, h): H \to H\) and \(s = f^{(2)}(x)^+ f'(x)\). The operator \(\psi_1(x, s)\) determines the modified Newton scheme \(x_{k+1} = \psi_1(x_k, s_k)\). The following result is a consequence of Theorem 1.

Theorem 6. Given a good initial approximation, the modified Newton scheme guarantees a superlinear convergence estimate if and only if the function \(g(x, h)\) satisfies the condition \(\|g(x, h) - h_H\| = o(t)\).
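A one-dimensional sketch, on the assumed example \(f(x) = x^4\) (where \(k_1 = 4\) and \(s = f'(x)/f^{(2)}(x) = x/3\)), contrasts the plain Newton step with the modified one: the correction factor \(k_1 - 1 = 3\) turns the factor-\(2/3\) contraction into an exact hit of the minimizer, illustrating the superlinear effect.

```python
# Modified Newton step psi_1 versus the plain step psi for the assumed example
# f(x) = x^4 in one variable: k_1 = 4 and s = f'(x)/f''(x) = x/3, so
# psi_1(x, s) = x - (k_1 - 1) s reaches the minimizer x* = 0 in a single step.
def newton_step(x):
    return x - 4 * x**3 / (12 * x**2)    # psi:   x -> (2/3) x, linear progress

def modified_step(x, k1=4):
    s = 4 * x**3 / (12 * x**2)           # s = f'(x)/f''(x) = x/3
    return x - (k1 - 1) * s              # psi_1: x -> 0

x0 = 0.7
print(newton_step(x0))                   # 0.466...: contraction by 2/3
print(modified_step(x0))                 # 0.0: superlinear (here exact)
```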