1 Introduction

There exists a relatively rich literature on the behavior of the Newton method near singular solutions of smooth nonlinear equations. With no intention to give a comprehensive survey, we mention only the works [25,26,27] most closely related to our development below, but dealing with equations as smooth as needed (smoothness is not an issue), and with the basic Newton method. In this setting, these works provide natural conditions ensuring linear local convergence of the Newton method from an asymptotically dense starlike domain around a singular solution, and also provide some acceleration techniques based on the established convergence pattern. Partial extensions of these convergence results to wide classes of methods that can be interpreted as a perturbed Newton method were developed in [30]. Acceleration of convergence and a related issue of asymptotic acceptance of the full Newton step by a linesearch globalization procedure were further investigated in [21, 22], while [20] contains some extensions of these results to the case of constrained equations.

Having in mind typical equation reformulations of complementarity problems, an important issue consists of possible extensions of the results mentioned above to equations with restricted smoothness properties. As one example of this kind, the case of piecewise smooth equations was addressed in [19]. Different reformulations of complementarity lead to equations with different smoothness and regularity properties, and as a result, to different methods for solving complementarity problems, and understanding the relative advantages and disadvantages of these methods is of much interest and importance.

In this work, we focus on nonlinear equations with operators differentiable near the solution in question, and with their derivatives being strongly semismooth at this solution, while the second derivatives of the operator may not exist. The concept of strong semismoothness was introduced in [44]; see, e.g., [35, Sect. 1.4] for a recent exposition of the related theory. Local convergence properties of the basic Newton method and some acceleration techniques were studied under similar smoothness assumptions in [42]. The main difference between the results in [42] and our development below is that we deal not only with the basic Newton method, but with its perturbed version covering, in particular, some stabilized modifications of the basic Newton scheme, specially intended for tackling singular (and even nonisolated) solutions. Moreover, we consider not only the local convergence properties, but also the issue of the asymptotic acceptance of the unit stepsize by the algorithms equipped with linesearch for globalization of convergence. The latter line of analysis leads to a new result for the perturbed Newton method, even in the case of arbitrary smoothness.

As will be discussed below, reformulations of complementarity problems possessing the specified smoothness properties necessarily give rise to singularity of solutions violating strict complementarity, and hence serve as a natural source of applications, both in [42] and below.

The rest of the paper is structured as follows. In Sect. 2, we provide the needed preliminaries, and specify the problem setting. Section 3 contains the main result on linear local convergence of the perturbed Newton method framework to a singular solution satisfying a certain 2-regularity property that may only hold at solutions called critical. In Sect. 4, we consider a linesearch globalization procedure for the methods in question, and investigate the issue of asymptotic acceptance of the full step, playing a key role for the potential success of the extrapolation procedure intended for acceleration of convergence to critical solutions. Finally, Sect. 5 contains examples of application of the results obtained to smooth equation reformulations of nonlinear complementarity problems.

Some words about our notation. For any \({\bar{u}},\, {\bar{v}}\in \mathbb {R}^p\), and any given scalars \(\varepsilon >0\) and \(\delta >0\), define the set

$$\begin{aligned} K_{\varepsilon ,\, \delta }({\bar{u}};\, {\bar{v}}):= \{ u\in \mathbb {R}^p\mid \Vert u-{\bar{u}}\Vert \le \varepsilon ,\; \Vert \Vert {\bar{v}}\Vert (u-{\bar{u}}) -\Vert u-{\bar{u}}\Vert {\bar{v}}\Vert \le \delta \Vert u-{\bar{u}}\Vert \Vert {\bar{v}}\Vert \}. \end{aligned}$$
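For nonzero \(u-{\bar{u}}\) and \({\bar{v}}\), dividing the second inequality by \(\Vert u-{\bar{u}}\Vert \Vert {\bar{v}}\Vert \) shows that it bounds by \(\delta \) the distance between the normalized vectors \((u-{\bar{u}})/\Vert u-{\bar{u}}\Vert \) and \({\bar{v}}/\Vert {\bar{v}}\Vert \). The following minimal Python sketch of a membership test may help to visualize this set; the function names `norm` and `in_K` are our own illustrative choices and do not come from the cited works.

```python
import math

def norm(x):
    """Euclidean norm of a vector given as a sequence of floats."""
    return math.sqrt(sum(t * t for t in x))

def in_K(u, u_bar, v_bar, eps, delta):
    """Return True if u belongs to K_{eps,delta}(u_bar; v_bar).

    Hypothetical helper: it checks the two inequalities of the
    definition literally, with the Euclidean norm.
    """
    d = [a - b for a, b in zip(u, u_bar)]  # u - u_bar
    nd, nv = norm(d), norm(v_bar)
    if nd > eps:
        return False
    # || ||v_bar|| (u - u_bar) - ||u - u_bar|| v_bar || <= delta ||u - u_bar|| ||v_bar||
    lhs = norm([nv * di - nd * vi for di, vi in zip(d, v_bar)])
    return lhs <= delta * nd * nv
```

For instance, with \({\bar{u}}= 0\) and \({\bar{v}}= (0,\, 1)\), a point on the positive second coordinate axis within distance \(\varepsilon \) of the origin passes the test for any \(\delta > 0\), while a point on the first coordinate axis fails it for small \(\delta \).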

For a \(q\times p\) matrix A, the null space of the corresponding linear operator is \(\ker A:= \{ v\in \mathbb {R}^p\mid Av = 0\} \). For a mapping \(\Phi :\mathbb {R}^p\rightarrow \mathbb {R}^q\) differentiable at \({\bar{u}}\), we will make use of the unique decomposition of every \(u\in \mathbb {R}^p\) into the sum \(u=u_1+u_2\) with \(u_1\in (\ker \Phi '({\bar{u}}))^\bot \) and \(u_2\in \ker \Phi '({\bar{u}})\), where \(^\bot \) stands for the orthogonal complement of a linear subspace.

2 Preliminaries and Problem Setting

Consider a mapping \(\Phi : \mathbb {R}^p \rightarrow \mathbb {R}^q\) that is differentiable near a point \({\bar{u}}\in \mathbb {R}^p\), but not necessarily twice differentiable, even at \({\bar{u}}\). The analysis in this paper will rely on the assumption that the derivative \(\Phi ': \mathbb {R}^p\rightarrow \mathbb {R}^{q\times p}\) is strongly semismooth at \({\bar{u}}\). According to [35, Sect. 1.4.2], this requirement means that \(\Phi '\) is Lipschitz-continuous near \({\bar{u}}\), directionally differentiable at \({\bar{u}}\) in every direction, and the estimate

$$\begin{aligned} \max _{J\in \partial \Phi '(u)} \Vert \Phi '(u)-\Phi '({\bar{u}})-J(u-{\bar{u}})\Vert = O(\Vert u-{\bar{u}}\Vert ^2) \end{aligned}$$
(1)

holds as \(u\in \mathbb {R}^p\) tends to \({\bar{u}}\). Here, \(\partial \Phi '(u)\) stands for Clarke’s generalized Jacobian of \(\Phi '\) at u [6, Definition 2.6.1]. These “smoothness” assumptions can actually be further relaxed: it would be enough to assume that \(\Phi '\) itself is just calm at \({\bar{u}}\), while \(\Pi \Phi '\) is strongly semismooth at \({\bar{u}}\), with \(\Pi \) being the orthogonal projector onto \(({{\,\textrm{im}\,}}\Phi '({\bar{u}}))^\bot \) in \(\mathbb {R}^q\). We do not pursue this further, in order to keep the presentation reasonably simple.

Let \((\Phi ')'({\bar{u}};\, v)\) stand for the directional derivative of \(\Phi '\) at \({\bar{u}}\) in a direction \(v\in \mathbb {R}^p\). Observe that \((\Phi ')'({\bar{u}};\, \cdot )\) maps \(\mathbb {R}^p\) to \(\mathbb {R}^{q\times p}\), and is positively homogeneous and Lipschitz-continuous. Define

$$\begin{aligned}{} & {} r(u):= \Phi (u)-\Phi ({\bar{u}})-\Phi '({\bar{u}})(u-{\bar{u}})-\frac{1}{2} (\Phi ')'({\bar{u}};\, u-{\bar{u}})(u-{\bar{u}}), \end{aligned}$$
(2)
$$\begin{aligned}{} & {} R(u):= \Phi '(u)-\Phi '({\bar{u}})- (\Phi ')'({\bar{u}};\, u-{\bar{u}}). \end{aligned}$$
(3)

Combining (1) with [35, Proposition 1.71 (c)], from (3), we readily obtain the estimate

$$\begin{aligned} R(u) = O(\Vert u-{\bar{u}}\Vert ^2) \end{aligned}$$
(4)

as \(u\rightarrow {\bar{u}}\). Furthermore, according to (2), by the Newton–Leibniz formula we derive

$$\begin{aligned} r(u)= & {} \int _0^1(\Phi '(\tau u+(1-\tau ){\bar{u}})-\Phi '({\bar{u}}))(u-{\bar{u}})\, d\tau -\frac{1}{2} (\Phi ')'({\bar{u}};\, u-{\bar{u}})(u-{\bar{u}})\nonumber \\= & {} \int _0^1(\Phi '({\bar{u}}+\tau (u-{\bar{u}}))-\Phi '({\bar{u}})- (\Phi ')'({\bar{u}};\, \tau (u-{\bar{u}})))(u-{\bar{u}})\, d\tau \nonumber \\= & {} \int _0^1R({\bar{u}}+\tau (u-{\bar{u}}))(u-{\bar{u}})\, d\tau \nonumber \\= & {} O(\Vert u-{\bar{u}}\Vert ^3), \end{aligned}$$
(5)

where the second equality employs the fact that \((\Phi ')'({\bar{u}};\, \cdot )\) is positively homogeneous, the third is by (3), while the last one is by (4).

The mapping \(\Phi \) is said to be 2-regular at \({\bar{u}}\) in the direction v if the linear operator \(B(v):\ker \Phi '({\bar{u}})\rightarrow ({{\,\textrm{im}\,}}\Phi '({\bar{u}}))^\bot \) defined as the restriction of \(\Pi (\Phi ')'({\bar{u}};\, v)\) to \(\ker \Phi '({\bar{u}})\) is surjective; see the corresponding definitions and their discussion in [32, 33, 42].

Remark 2.1

At this point, we mention that the structure of the set consisting of directions of 2-regularity of \(\Phi \) at \({\bar{u}}\) is not arbitrary. For instance, if \(\Phi \) is twice differentiable at \({\bar{u}}\), it can only be 2-regular at \({\bar{u}}\) in every nonzero direction if either \({{\,\textrm{rank}\,}}\Phi '({\bar{u}}) = q\), or \(\Phi '({\bar{u}}) = 0\).

Indeed, assuming that \({{\,\textrm{rank}\,}}\Phi '({\bar{u}}) < q\), fix any \(w\in ({{\,\textrm{im}\,}}\Phi '({\bar{u}}))^\bot \), and consider the \(p\times p\) matrix \(w\Phi ''({\bar{u}}):= \sum _{i=1}^qw_i\Phi _i''({\bar{u}})\). Then for any \(\widehat{v}\in \mathbb {R}^p\) and \(v\in \ker \Phi '({\bar{u}})\) it holds that

$$\begin{aligned} \langle w,\, B({\widehat{v}})v\rangle= & {} \langle w,\, \Pi \Phi ''({\bar{u}})[{\widehat{v}},\, v]\rangle \nonumber \\= & {} \langle \Pi w,\, \Phi ''({\bar{u}})[{\widehat{v}},\, v]\rangle \nonumber \\= & {} \langle w,\, \Phi ''({\bar{u}})[{\widehat{v}},\, v]\rangle \nonumber \\= & {} \sum _{i=1}^qw_i\langle \Phi _i''({\bar{u}}){\widehat{v}},\, v\rangle \nonumber \\= & {} \left\langle \sum _{i=1}^qw_i\Phi _i''({\bar{u}}){\widehat{v}},\, v\right\rangle \nonumber \\= & {} \langle w\Phi ''({\bar{u}}){\widehat{v}},\, v\rangle , \end{aligned}$$
(6)

where the second equality is due to the symmetry of \(\Pi \), while the third is because \(\Pi \) acts as the identity on \(({{\,\textrm{im}\,}}\Phi '({\bar{u}}))^\bot \). If the matrix \(w\Phi ''({\bar{u}})\) is singular, then there exists \({\widehat{v}}\in \ker w\Phi ''({\bar{u}})\setminus \{ 0\} \), and substituting it into (6), we conclude that \(w\in ({{\,\textrm{im}\,}}B({\widehat{v}}))^\bot \). On the other hand, if \(w\Phi ''({\bar{u}})\) is nonsingular, and \(\Phi '({\bar{u}}) \not = 0\), then there exists \({\widehat{v}}\in \mathbb {R}^p\) such that \(w\Phi ''({\bar{u}}){\widehat{v}}\in (\ker \Phi '({\bar{u}}))^\bot {\setminus }\{ 0\} \), implying, in particular, that \({\widehat{v}}\not = 0\), and again by (6) we have that \(w\in ({{\,\textrm{im}\,}}B({\widehat{v}}))^\bot \).

Therefore, if \({{\,\textrm{rank}\,}}\Phi '({\bar{u}}) < q\) and \(\Phi '({\bar{u}}) \not = 0\), then for any \(w\in ({{\,\textrm{im}\,}}\Phi '({\bar{u}}))^\bot \) we have the existence of a nonzero \({\widehat{v}}\in \mathbb {R}^p\) such that \(w\in ({{\,\textrm{im}\,}}B(\widehat{v}))^\bot \). In particular, if we take \(w\not = 0\), this implies that \(\Phi \) is not 2-regular at \({\bar{u}}\) in the direction \({\widehat{v}}\).

The case when \(\Phi '({\bar{u}}) = 0\) is of course quite a special instance of singularity on its own. Moreover, even in this case, from the considerations above it follows that 2-regularity of \(\Phi \) in any nonzero direction is only possible if there exists no nonzero \(w\in \mathbb {R}^q\) such that the matrix \(w\Phi ''({\bar{u}})\) is singular. But the latter property imposes further restrictions on the dimensions p and q. See, e.g., [1, Theorem 1], implying in particular, that this is not possible when p is odd and \(q\ge 2\). A related observation can be found in [3].

In the rest of the paper, we deal with Newton-type methods for the equation

$$\begin{aligned} \Phi (u) = 0, \end{aligned}$$
(7)

and to that end, we assume that \(p = q\). In this case, \({\bar{u}}\) is called a singular solution of (7) if \(\Phi '({\bar{u}})\) is a singular matrix. Observe that every nonisolated solution is necessarily singular. Observe further that if \({\bar{u}}\) is nonsingular, \(\Phi \) is 2-regular at \({\bar{u}}\) in every direction v, including \(v = 0\). At the same time, \(\Phi \) may be 2-regular at \({\bar{u}}\) in nonzero directions even when \({\bar{u}}\) is singular, and even when \({\bar{u}}\) is a nonisolated solution of (7), and even in directions \({\bar{v}}\in \ker \Phi '({\bar{u}})\). The latter possibility is especially important here, as it will play a crucial role in our analysis below, and leads to the

Key assumption: there exists \({\bar{v}}\in \ker \Phi '({\bar{u}})\) such that the mapping \(\Phi \) is 2-regular at \({\bar{u}}\) in the direction \({\bar{v}}\).

According to Izmailov et al. [31, Theorem 2], a solution \({\bar{u}}\) of (7) is regarded as critical if and only if it violates the local Lipschitzian error bound property

$$\begin{aligned} {{\,\textrm{dist}\,}}(u,\, \Phi ^{-1}(0)) = O(\Vert \Phi (u)\Vert ) \end{aligned}$$
(8)

as \(u\in \mathbb {R}^p\) tends to \({\bar{u}}\). The property in (8) is related to the concept of (weak) sharp minima (see [43, Sect. 5.2.3], and [5]) for the residual function \(\Vert \Phi (\cdot )\Vert \). By Izmailov et al. [31, Theorem 3], every critical solution is necessarily singular, but generally not the other way round. Moreover, the discussion in [31, p. 497] demonstrates that for a singular (e.g., nonisolated) solution \({\bar{u}}\), our key assumption may only hold if \({\bar{u}}\) is a critical solution.
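To illustrate the error bound (8) on a toy case, the following Python sketch compares two scalar equations chosen by us for illustration only (they do not appear in the cited works): \(\Phi (u) = u^2\) and \(\Phi (u) = u\), both with the unique solution \({\bar{u}}= 0\), so that \({{\,\textrm{dist}\,}}(u,\, \Phi ^{-1}(0)) = |u|\). The helper name `error_bound_ratios` is hypothetical.

```python
def error_bound_ratios(phi, dist_to_solutions, points):
    """Ratios dist(u, Phi^{-1}(0)) / |Phi(u)| at the given points.

    The error bound (8) requires these ratios to stay bounded
    as the points approach the solution.
    """
    return [dist_to_solutions(u) / abs(phi(u)) for u in points]

# Sample points approaching the solution 0 (illustrative choice).
points = [10.0 ** (-k) for k in range(1, 7)]

# Phi(u) = u^2: the ratio equals 1/|u| and is unbounded near 0.
ratios_square = error_bound_ratios(lambda u: u * u, abs, points)

# Phi(u) = u: the ratio is identically 1.
ratios_linear = error_bound_ratios(lambda u: u, abs, points)
```

In the first case the ratios grow without bound, so (8) is violated and the solution is critical in the sense recalled above; in the second case \(\Phi '(0) = 1\), the solution is nonsingular, and the ratios stay equal to one.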

3 Local Convergence of Perturbed Newton Methods to Critical Solutions

As in [30], define the perturbed Newton method (pNM) framework for Equation (7) as follows. For a given iterate \(u^k\in \mathbb {R}^p\), the next iterate is \(u^{k+1}=u^k+v^k\), where \(v^k\) is a solution of the linear equation

$$\begin{aligned} \Phi (u^k)+(\Phi '(u^k)+\Omega (u^k))v = \omega (u^k), \end{aligned}$$
(9)

where the mappings \(\Omega :\mathbb {R}^p\rightarrow \mathbb {R}^{p\times p}\) and \(\omega :\mathbb {R}^p\rightarrow \mathbb {R}^p\) are the terms characterizing various kinds of perturbation, and defining specific methods within the pNM framework.
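A single iteration of the pNM framework can be sketched in Python as follows. This is only a minimal illustration under our own assumptions: vectors and matrices are plain lists, the linear system (9) is solved by a small Gaussian elimination routine, and the names `solve_linear` and `pnm_step` are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(a) for a in A[i]] + [float(b[i])] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[piv][c] == 0.0:
            raise ValueError("singular matrix in the pNM subproblem")
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def pnm_step(phi, dphi, u, Omega=None, omega=None):
    """One pNM iteration: solve Phi(u) + (Phi'(u) + Omega(u)) v = omega(u).

    Omega and omega default to zero, which gives the basic Newton step.
    Returns the pair (next iterate, step v).
    """
    n = len(u)
    F, J = phi(u), dphi(u)
    W = Omega(u) if Omega is not None else [[0.0] * n for _ in range(n)]
    w = omega(u) if omega is not None else [0.0] * n
    A = [[J[i][j] + W[i][j] for j in range(n)] for i in range(n)]
    rhs = [w[i] - F[i] for i in range(n)]
    v = solve_linear(A, rhs)
    return [u[i] + v[i] for i in range(n)], v
```

Specific methods within the framework would be obtained by passing particular callables for `Omega` and `omega`; the sketch itself does not verify any of the assumptions imposed on these perturbations below.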

The following is a generalization of Izmailov et al. [30, Lemma 1] and Izmailov et al. [22, Lemma 1] to the case when the first derivative is strongly semismooth, but the second derivative may not exist.

Lemma 3.1

Let \(\Phi :\mathbb {R}^p\rightarrow \mathbb {R}^p\) be differentiable near \({\bar{u}}\in \mathbb {R}^p\), and let the derivative of \(\Phi \) be strongly semismooth at \({\bar{u}}\). Let \({\bar{u}}\) be a solution of Equation (7), and assume that \(\Phi \) is 2-regular at \({\bar{u}}\) in a direction \({\bar{v}}\in \mathbb {R}^p\). Let \(\Omega :\mathbb {R}^p\rightarrow \mathbb {R}^{p\times p}\) and \(\omega :\mathbb {R}^p\rightarrow \mathbb {R}^p\) satisfy the following properties: there exists \(\delta > 0\) such that

$$\begin{aligned} \Omega (u) = O(\Vert u-{\bar{u}}\Vert ),\quad \omega (u) = O(\Vert u-{\bar{u}}\Vert ^2) \end{aligned}$$
(10)

for \(u\in K_{\varepsilon ,\, \delta }({\bar{u}};\, {\bar{v}})\) as \(\varepsilon \rightarrow 0+\), and

$$\begin{aligned} \Pi \Omega (u) = o(\Vert u-{\bar{u}}\Vert ) \end{aligned}$$
(11)

for \(u\in K_{\varepsilon ,\, \delta }({\bar{u}};\, {\bar{v}})\) as \(\varepsilon \rightarrow 0+\) and \(\delta \rightarrow 0+\).

Then there exist \({\bar{\varepsilon }} > 0\) and \({\bar{\delta }} > 0\) such that, for every \(u\in K_{{\bar{\varepsilon }},\, {\bar{\delta }} }({\bar{u}};\, {\bar{v}}){\setminus } \{ {\bar{u}}\} \), the linear operator \(B(u-{\bar{u}})\) is invertible,

$$\begin{aligned} (B(u-{\bar{u}}))^{-1} = O(\Vert u-{\bar{u}}\Vert ^{-1}) \end{aligned}$$
(12)

as \(u\rightarrow {\bar{u}}\), Equation (9) with \(u^k = u\) has a unique solution v, and this solution satisfies

$$\begin{aligned} u_1+v_1-{\bar{u}}_1= & {} O(\Vert u-{\bar{u}}\Vert \Vert u_1-{\bar{u}}_1\Vert )+O(\Vert u-{\bar{u}}\Vert \Vert \Omega (u)\Vert ) \nonumber \\{} & {} +\,O(\Vert \omega (u)\Vert )+O(\Vert u-{\bar{u}}\Vert ^3), \end{aligned}$$
(13)
$$\begin{aligned} u_2+v_2-{\bar{u}}_2= & {} \frac{1}{2} (u_2-{\bar{u}}_2 +(B(u-{\bar{u}}))^{-1}\Pi (\Phi ')'({\bar{u}};\, u-{\bar{u}})(u_1-{\bar{u}}_1))\nonumber \\{} & {} +\,O(\Vert \Pi \Omega (u)\Vert )+O(\Vert u-{\bar{u}}\Vert ^{-1}\Vert \Pi \omega (u)\Vert )\nonumber \\{} & {} +\,O(\Vert u-{\bar{u}}\Vert ^2) \end{aligned}$$
(14)

as \(u\rightarrow {\bar{u}}\).

Proof

The argument below follows the lines of that in [30, Lemma 1], with modifications needed under the current restricted smoothness assumptions. Without loss of generality assume that \({\bar{u}}= 0\).

Multiplying (9) by \((I-\Pi )\) and by \(\Pi \), and employing (2)–(3), Equation (9) with \(u^k = u\in \mathbb {R}^p\) is decomposed into the following two equations:

$$\begin{aligned}&(\Phi '({\bar{u}})+(I-\Pi )((\Phi ')'({\bar{u}};\, u)+R(u)+\Omega (u)))v_1\nonumber \\&\quad = -\Phi '({\bar{u}})u_1 -(I-\Pi )\left( \frac{1}{2}(\Phi ')'({\bar{u}};\, u)u+r(u)-\omega (u)\right) \nonumber \\&\qquad -\,(I-\Pi )((\Phi ')'({\bar{u}};\, u)+R(u)+\Omega (u))v_2 \end{aligned}$$
(15)

and

$$\begin{aligned} \Pi ( (\Phi ')'({\bar{u}};\, u)+R(u)+\Omega (u))(v_1+v_2) = -\Pi \left( \frac{1}{2} (\Phi ')'({\bar{u}};\, u)u+r(u)-\omega (u)\right) . \end{aligned}$$
(16)

Let \({\bar{\varepsilon }} >0\) and \({\bar{\delta }} >0\) be fixed arbitrarily for now, and from this point on, we consider only those \(u\in K_{\bar{\varepsilon },\, {\bar{\delta }} }({\bar{u}};\, {\bar{v}}){\setminus } \{ 0\} \). Define the linear operator \({{\mathcal {A}}}(u):(\ker \Phi '({\bar{u}}))^\bot \rightarrow {{\,\textrm{im}\,}}\Phi '({\bar{u}})\) as the restriction of \((\Phi '({\bar{u}})+(I-\Pi )( (\Phi ')'({\bar{u}};\, u)+R(u)+\Omega (u)))\) to \((\ker \Phi '({\bar{u}}))^\bot \). Furthermore, let \({\widehat{A}}:(\ker \Phi '({\bar{u}}))^\bot \rightarrow {{\,\textrm{im}\,}}\Phi '({\bar{u}})\) be the restriction of \(\Phi '({\bar{u}})\) to \((\ker \Phi '({\bar{u}}))^\bot \). Then, taking into account (5), the equality (15) can be written as

$$\begin{aligned} {{\mathcal {A}}}(u)v_1= & {} -{\widehat{A}}u_1-(I-\Pi )( (\Phi ')'({\bar{u}};\, u)+R(u)+\Omega (u))v_2\nonumber \\{} & {} -\,(I-\Pi )\left( \frac{1}{2} (\Phi ')'({\bar{u}};\, u)u-\omega (u)\right) +O(\Vert u\Vert ^3) \end{aligned}$$
(17)

as \(u\rightarrow 0\).

Evidently, \({\widehat{A}}\) is invertible, and according to (4) and the first condition in (10),

$$\begin{aligned} {{\mathcal {A}}}(u) = {\widehat{A}}+O(\Vert u\Vert ). \end{aligned}$$

This implies that if \({\bar{\varepsilon }} >0\) is small enough, then \(\mathcal{A}(u)\) is invertible, and

$$\begin{aligned} ({{\mathcal {A}}}(u))^{-1} = {\widehat{A}}^{-1}+O(\Vert u\Vert ) \end{aligned}$$
(18)

as \(u\rightarrow 0\); this follows, e.g., from Izmailov and Solodov [35, Lemma A.6]. Therefore, taking also into account the second condition in (10), (17) can be written as

$$\begin{aligned} v_1 =-u_1+{{\mathcal {M}}}(u)v_2+O(\Vert u\Vert ^2), \end{aligned}$$
(19)

where \({{\mathcal {M}}}(u):\ker \Phi '({\bar{u}})\rightarrow (\ker \Phi '({\bar{u}}))^\bot \) is defined by

$$\begin{aligned} {{\mathcal {M}}}(u):= -({{\mathcal {A}}}(u))^{-1}(I-\Pi )( (\Phi ')'({\bar{u}};\, u)+R(u)+\Omega (u)) = O(\Vert u\Vert ) \end{aligned}$$
(20)

as \(u\rightarrow 0\), where the last estimate is again by (4) and by the first condition in (10).

Substituting (19) into (16), and taking into account (4), we obtain the equation

$$\begin{aligned}&\Pi ((\Phi ')'({\bar{u}};\, u)+R(u)+\Omega (u))(I+{{\mathcal {M}}}(u))v_2 \nonumber \\&\quad =-\Pi \left( \frac{1}{2}(\Phi ')'({\bar{u}};\, u)u-\omega (u)\right) \nonumber \\&\qquad +\,\Pi ((\Phi ')'({\bar{u}};\, u)+\Omega (u))u_1+ O(\Vert u\Vert ^3). \end{aligned}$$
(21)

Define the linear operator \({{\mathcal {B}}}(u):\ker \Phi '({\bar{u}})\rightarrow ({{\,\textrm{im}\,}}\Phi '({\bar{u}}))^\bot \) as the restriction of \(\Pi ( (\Phi ')'({\bar{u}};\, u)+R(u)+\Omega (u))(I+{{\mathcal {M}}}(u))\) to \(\ker \Phi '({\bar{u}})\). Then (21) can be written in the form

$$\begin{aligned} {{\mathcal {B}}}(u)v_2=-\frac{1}{2} B(u)u_2 +\Pi \left( \left( \frac{1}{2} (\Phi ')'({\bar{u}};\, u)+\Omega (u)\right) u_1+\omega (u)\right) +O(\Vert u\Vert ^3) \end{aligned}$$
(22)

as \(u\rightarrow 0\).

Observe now that by Izmailov and Solodov [35, Lemma A.6], and by continuity of \((\Phi ')'({\bar{u}};\, \cdot )\) at \({\bar{v}}\), 2-regularity of \(\Phi \) at 0 in the direction \({\bar{v}}\) implies the existence of \(C > 0\) such that B(u) is invertible and

$$\begin{aligned} \Vert (B(u))^{-1}\Vert \le C\Vert u\Vert ^{-1} \end{aligned}$$
(23)

provided \({\bar{\delta }} > 0\) is taken small enough. This yields (12). According to (4), (10), and (20), it further holds that

$$\begin{aligned} {{\mathcal {B}}}(u) = B(u)+\Pi \Omega (u)+O(\Vert u\Vert ^2). \end{aligned}$$

Further reducing \({\bar{\varepsilon }} >0\) and \({\bar{\delta }} >0\) if necessary, by (11) and (23), and again by Izmailov and Solodov [35, Lemma A.6], we now obtain that \({{\mathcal {B}}}(u)\) is invertible, and

$$\begin{aligned} ({{\mathcal {B}}}(u))^{-1} = (B(u))^{-1} +O(\Vert u\Vert ^{-2}\Vert \Pi \Omega (u)\Vert )+O(1) = O(\Vert u\Vert ^{-1}) \end{aligned}$$

as \(u\rightarrow 0\). Therefore, (22) is uniquely solvable, and its unique solution has the form

$$\begin{aligned} v_2= & {} -\frac{1}{2} u_2 +\frac{1}{2}(B(u))^{-1}\Pi (\Phi ')'({\bar{u}};\, u)u_1 +O(\Vert \Pi \Omega (u)\Vert ) \end{aligned}$$
(24)
$$\begin{aligned}{} & {} +\,O(\Vert u\Vert ^{-1}\Vert \Pi \omega (u)\Vert ) +O(\Vert u\Vert ^2)\nonumber \\= & {} O(\Vert u\Vert ), \end{aligned}$$
(25)

as \(u\rightarrow 0\), where the last estimate is by (10) and (23).

Substituting (24) into (17), and employing (4) again, we obtain the equation

$$\begin{aligned} {{\mathcal {A}}}(u)v_1 = -{\widehat{A}}u_1+O(\Vert u\Vert \Vert u_1\Vert )+O(\Vert u\Vert \Vert \Omega (u)\Vert )+O(\Vert \omega (u)\Vert )+O(\Vert u\Vert ^3) \end{aligned}$$

and hence, by (18),

$$\begin{aligned} v_1 = -u_1+O(\Vert u\Vert \Vert u_1\Vert )+O(\Vert u\Vert \Vert \Omega (u)\Vert )+O(\Vert \omega (u)\Vert )+O(\Vert u\Vert ^3) \end{aligned}$$
(26)

as \(u\rightarrow 0\).

From (24) and (26), and from (10), we have the needed estimates (13) and (14). \(\square \)

The next example demonstrates that even in the case of twice continuous differentiability of \(\Phi \), and even in the absence of perturbations, strong semismoothness of \(\Phi '\) is essential for the conclusion of Lemma 3.1 to be valid.

Example 3.1

Let \(p = 2\), \(\Phi (u) = (u_1+3u_2^{7/3}/7,\, u_2^2/2)\). Then \(\Phi \) is everywhere twice continuously differentiable, and the unique solution of (7) is \({\bar{u}}= 0\). Furthermore, for any \(u,\, v\in \mathbb {R}^p\)

$$\begin{aligned} \Phi '(u) = \left( \begin{array}{cc} 1&{}u_2^{4/3}\\ 0&{}u_2 \end{array} \right) ,\quad \Phi '(0) = \left( \begin{array}{cc} 1&{}0\\ 0&{}0 \end{array} \right) ,\quad \Phi ''(0)[v] = \left( \begin{array}{cc} 0&{}0\\ 0&{}v_2 \end{array} \right) . \end{aligned}$$

Therefore, \(\Phi \) is 2-regular at 0 in every nonzero direction in \(\ker \Phi '(0) = \{ 0\} \times \mathbb {R}\).

Assuming that \(u_2\not = 0\), the basic Newton step from \(u^k = u\) (i.e., the unique solution of (9) with \(\Omega \equiv 0\) and \(\omega \equiv 0\)) is \(v = (-u_1+u_2^{7/3}/14,\, -u_2/2)\). In particular, (14) is valid, while (13) (and even a weaker estimate from Izmailov et al. [30, Lemma 1]) is not. The reason is violation of (1).
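The computations in this example are easy to check numerically. The Python sketch below (with function names of our own choosing) evaluates the Newton step directly from (34), compares it with the closed form stated above, and measures \(|u_1+v_1|/\Vert u\Vert ^3\) along the points \(u = (0,\, t)\); for such points and zero perturbations, the right-hand side of (13) reduces to \(O(\Vert u\Vert ^3)\), so this quotient would have to stay bounded if (13) were valid.

```python
import math

def cbrt(t):
    """Real cube root, valid for negative arguments as well."""
    return math.copysign(abs(t) ** (1.0 / 3.0), t)

def phi(u):
    """Phi(u) = (u1 + 3 u2^(7/3) / 7, u2^2 / 2)."""
    u1, u2 = u
    return (u1 + 3.0 * cbrt(u2) ** 7 / 7.0, u2 * u2 / 2.0)

def dphi(u):
    """Jacobian of Phi; its rows are (1, u2^(4/3)) and (0, u2)."""
    _, u2 = u
    return ((1.0, cbrt(u2) ** 4), (0.0, u2))

def newton_step(u):
    """Solve Phi(u) + Phi'(u) v = 0 by back substitution (needs u2 != 0)."""
    f1, f2 = phi(u)
    (a11, a12), (_, a22) = dphi(u)
    v2 = -f2 / a22
    v1 = (-f1 - a12 * v2) / a11
    return (v1, v2)

def closed_form_step(u):
    """The step v = (-u1 + u2^(7/3) / 14, -u2 / 2) stated in the example."""
    u1, u2 = u
    return (-u1 + cbrt(u2) ** 7 / 14.0, -u2 / 2.0)

def scaled_residual(t):
    """|u1 + v1| / ||u||^3 at the point u = (0, t), t != 0."""
    v1, _ = newton_step((0.0, t))
    return abs(v1) / abs(t) ** 3
```

Since \(u_1+v_1 = u_2^{7/3}/14\), the quantity returned by `scaled_residual` equals \(|t|^{-2/3}/14\) up to rounding, and hence grows without bound as \(t\rightarrow 0\), in agreement with the failure of (13).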

Theorem 3.1

Let \(\Phi :\mathbb {R}^p\rightarrow \mathbb {R}^p\) be differentiable near \({\bar{u}}\in \mathbb {R}^p\), and let the derivative of \(\Phi \) be strongly semismooth at \({\bar{u}}\). Let \({\bar{u}}\) be a solution of Equation (7), and assume that \(\Phi \) is 2-regular at \({\bar{u}}\) in a direction \({\bar{v}}\in \ker \Phi '({\bar{u}}){\setminus } \{ 0\} \). Moreover, let \(\Omega :\mathbb {R}^p\rightarrow \mathbb {R}^{p\times p}\) and \(\omega :\mathbb {R}^p\rightarrow \mathbb {R}^p\) satisfy the following properties: there exists \(\delta > 0\) such that, along with (10), the estimates

$$\begin{aligned} \Pi \Omega (u) = O(\Vert u_1-{\bar{u}}_1\Vert )+O(\Vert u-{\bar{u}}\Vert ^2) \end{aligned}$$
(27)

and

$$\begin{aligned} \Pi \omega (u) = O(\Vert u-{\bar{u}}\Vert \Vert u_1-{\bar{u}}_1\Vert )+O(\Vert u-{\bar{u}}\Vert ^3) \end{aligned}$$
(28)

hold for \(u\in K_{\varepsilon ,\, \delta }({\bar{u}};\, {\bar{v}})\) as \(\varepsilon \rightarrow 0+\).

Then, for every \({\widehat{\varepsilon }} >0\) and \({\widehat{\delta }} >0\), there exist \(\varepsilon = \varepsilon ({\bar{v}})>0\) and \(\delta = \delta ({\bar{v}})>0\) such that for any starting point \(u^0\in K_{\varepsilon ,\, \delta }({\bar{u}};\, {\bar{v}})\) there exists a unique sequence \(\{ u^k\} \subset \mathbb {R}^p\) such that for each k it holds that \(u^{k+1} = u^k+v^k\), where \(v^k\) satisfies (9), and for this sequence and for each k, it holds that \(u_2^k\not ={\bar{u}}_2\), \(u^k\in K_{{\widehat{\varepsilon }},\, {\widehat{\delta }} }({\bar{u}};\, {\bar{v}})\), \(\{ u^k\} \) converges to \({\bar{u}}\), \(\{ \Vert u^k-{\bar{u}}\Vert \} \) converges to zero monotonically,

$$\begin{aligned} \frac{\Vert u_1^{k+1}-{\bar{u}}_1\Vert }{\Vert u_2^{k+1}-{\bar{u}}_2\Vert } = O(\Vert u^k-{\bar{u}}\Vert ) \end{aligned}$$
(29)

as \(k\rightarrow \infty \), and

$$\begin{aligned} \lim _{k\rightarrow \infty } \frac{\Vert u_2^{k+1}-{\bar{u}}_2\Vert }{\Vert u_2^k-{\bar{u}}_2\Vert } = \frac{1}{2}. \end{aligned}$$
(30)

Proof

Under the assumption (10), estimates (12)–(14) in Lemma 3.1 further imply that

$$\begin{aligned} u_1+v_1-{\bar{u}}_1= & {} O(\Vert u-{\bar{u}}\Vert ^2), \end{aligned}$$
(31)
$$\begin{aligned} u_2+v_2-{\bar{u}}_2= & {} \frac{1}{2} (u_2-{\bar{u}}_2)+O(\Vert u_1-{\bar{u}}_1\Vert ) \nonumber \\{} & {} +\,O(\Vert \Pi \Omega (u)\Vert )+O(\Vert u-{\bar{u}}\Vert ^{-1}\Vert \Pi \omega (u)\Vert ) \nonumber \\{} & {} +\,O(\Vert u-{\bar{u}}\Vert ^2) \end{aligned}$$
(32)

as \(u\in K_{{\bar{\varepsilon }},\, {\bar{\delta }} }({\bar{u}};\, {\bar{v}}){\setminus } \{ {\bar{u}}\} \) tends to \({\bar{u}}\), where \({\bar{\varepsilon }} > 0\) and \({\bar{\delta }} > 0\) are defined according to Lemma 3.1.

Assuming further that there exists \(\delta > 0\) such that (27), (28) hold for \(u\in K_{\varepsilon ,\, \delta }({\bar{u}};\, {\bar{v}})\) as \(\varepsilon \rightarrow 0+\), the estimate (32) is further simplified to

$$\begin{aligned} u_2+v_2-{\bar{u}}_2 = \frac{1}{2} (u_2-{\bar{u}}_2)+O(\Vert u_1-{\bar{u}}_1\Vert ) +O(\Vert u-{\bar{u}}\Vert ^2) \end{aligned}$$
(33)

as \(u\rightarrow {\bar{u}}\), and the subsequent analysis in the proof of Izmailov et al. [30, Theorem 1] goes through, as it does not further rely on any smoothness assumptions but only on the estimates (31) and (33). This yields the needed result. \(\square \)

Remark 3.1

The flexibility of the assumption on perturbation terms \(\Omega (\cdot )\) and \(\omega (\cdot )\) allows for applications of Theorem 3.1 to various specific Newton-type methods, including those equipped with stabilizing features intended specially for finding singular and even nonisolated solutions. To begin with, taking \(\Omega (\cdot )\equiv 0\) and \(\omega (\cdot )\equiv 0\) recovers the classical Newton method for Equation (7), with the subproblem

$$\begin{aligned} \Phi (u^k)+\Phi '(u^k)v = 0. \end{aligned}$$
(34)

Furthermore, consider the Levenberg–Marquardt method [38, 39] (see also [41, Sect. 10.3]) with the subproblem of the form

$$\begin{aligned} \begin{array}{ll} \text{ minimize }&\displaystyle \frac{1}{2} \Vert \Phi (u^k)+\Phi '(u^k)v \Vert ^2 +\frac{1}{2} \rho (u^k)\Vert v\Vert ^2,\quad v\in \mathbb {R}^p, \end{array} \end{aligned}$$
(35)

where \(\rho :\mathbb {R}^p\rightarrow \mathbb {R}_+\) defines the regularization parameter. For modern local quadratic convergence theories for this method under the local Lipschitzian error bound condition (8) (i.e., noncriticality of the solution in question), and including the associated rules to control the regularization parameter, see [10, 13, 14, 16, 17, 24, 46].

Passing to the case of a critical solution, observe that the subproblem (35) employing the Euclidean norm is equivalent to the linear equation

$$\begin{aligned} (\Phi '(u^k))^\top \Phi (u^k)+((\Phi '(u^k))^\top \Phi '(u^k)+\rho (u^k)I)v = 0, \end{aligned}$$
(36)

and the constructions in [30, Sect. 3.1] allow one to interpret this equation as the subproblem (9) with \(\Omega (\cdot )\) and \(\omega (\cdot )\) possessing the needed properties when \(\rho (\cdot ):= \Vert \Phi (\cdot )\Vert ^\tau \), with \(\tau \ge 2\). This yields a counterpart of Izmailov et al. [30, Corollary 1], saying essentially that under the smoothness and 2-regularity assumptions in Theorem 3.1, the conclusion of this theorem is valid for the Levenberg–Marquardt method.
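For illustration, the Levenberg–Marquardt step defined by the linear equation (36) with \(\rho (u) = \Vert \Phi (u)\Vert ^\tau \) can be sketched in Python as follows. This is a minimal sketch under our own assumptions (plain lists, a small Gaussian elimination solver, hypothetical names `solve_linear` and `lm_step`); it is not the implementation used in the cited works.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(a) for a in A[i]] + [float(b[i])] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[piv][c] == 0.0:
            raise ValueError("singular matrix")
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def lm_step(phi, dphi, u, tau=2.0):
    """Levenberg-Marquardt step v solving (36) with rho(u) = ||Phi(u)||^tau."""
    F, J = phi(u), dphi(u)
    m, n = len(F), len(u)
    rho = math.sqrt(sum(f * f for f in F)) ** tau
    # Matrix Phi'(u)^T Phi'(u) + rho(u) I of the linear equation (36).
    A = [[sum(J[k][i] * J[k][j] for k in range(m)) + (rho if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    # Vector Phi'(u)^T Phi(u).
    g = [sum(J[k][i] * F[k] for k in range(m)) for i in range(n)]
    return solve_linear(A, [-gi for gi in g])
```

Away from a solution the matrix of (36) is positive definite, because \(\rho (u) > 0\) there, so the step is well defined even where \(\Phi '(u)\) is singular. The default `tau=2.0` is merely the smallest value allowed by the condition \(\tau \ge 2\) above.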

Another relevant algorithm in this context is the LP-Newton method introduced in [11], with the iteration subproblem of the form

$$\begin{aligned} \begin{array}{ll} \text{ minimize } &{} \gamma \\ \text{ subject } \text{ to } &{} \Vert \Phi (u^k)+\Phi '(u^k)v\Vert \le \gamma \Vert \Phi (u^k)\Vert ^2, \\ &{}\Vert v \Vert \le \gamma \Vert \Phi (u^k)\Vert ,\\ &{}(v,\, \gamma )\in \mathbb {R}^p\times \mathbb {R}. \end{array} \end{aligned}$$
(37)

As demonstrated in [10, 11] (see also [17]), local convergence properties of this method near noncritical solutions are the same as for the Levenberg–Marquardt method. Yet again, thinking of critical solutions, and following the development in [30, Sect. 3.2], one can embed the LP-Newton method into the pNM framework above, and obtain a counterpart of Izmailov et al. [30, Corollary 2], saying that under the smoothness and 2-regularity assumptions in Theorem 3.1, for every \({\widehat{\varepsilon }} >0\) and \({\widehat{\delta }} >0\), there exist \(\varepsilon = \varepsilon ({\bar{v}})>0\) and \(\delta = \delta ({\bar{v}})>0\) such that for any starting point \(u^0\in K_{\varepsilon ,\, \delta }({\bar{u}};\, {\bar{v}})\) there exists a sequence \(\{ u^k\} \subset \mathbb {R}^p\) such that for each k the pair \((u^{k+1}-u^k,\, \gamma _{k+1})\) with some \(\gamma _{k+1}\) solves (37), and for any such sequence and for each k, it holds that \(u_2^k\not ={\bar{u}}_2\), \(u^k\in K_{{\widehat{\varepsilon }},\, {\widehat{\delta }} }({\bar{u}};\, {\bar{v}})\), \(\{ u^k\} \) converges to \({\bar{u}}\), \(\{ \Vert u^k-{\bar{u}}\Vert \} \) converges to zero monotonically, and (29) and (30) hold. (Observe that uniqueness of \(\{ u^k\} \) is not claimed in this case, and indeed, (37) may have nonunique solutions.)

We finally mention the stabilized Newton–Lagrange (sequential quadratic programming) method for equality-constrained optimization problems [15, 28, 34, 45]; see also [35, Chapter 7]. It can also be covered by Theorem 3.1, thus relaxing the smoothness hypothesis in [30, Sect. 3.3] and generalizing [30, Corollary 3]. We do not go into more detail regarding this issue as this would require an extensive discussion, including introducing terminology not needed in this paper otherwise.

Remark 3.2

An extension of Izmailov et al. [30, Theorem 1] to the case of a constrained equation as in [20, Theorem 3.1] is also possible under the smoothness hypothesis of this work. Consider the problem

$$\begin{aligned} \Phi (u) = 0,\quad u\in P, \end{aligned}$$

where \(P\subset \mathbb {R}^p\) is a given closed convex set. Then the analysis in [20, Sect. 3] allows one to conclude that under the assumptions of Theorem 3.1, with the additional requirement that \({\bar{v}}\) belongs to the interior of the radial cone to P at \({\bar{u}}\), the iterates \(u^k\) in that theorem can be additionally claimed to stay feasible (i.e., to belong to P for all k). This allows one to cover the constrained Gauss–Newton method with the subproblem

$$\begin{aligned} \begin{array}{llll} \text{ minimize }&\displaystyle \frac{1}{2} \Vert \Phi (u^k)+\Phi '(u^k)v\Vert ^2&\text{ subject } \text{ to }&u^k+v\in P; \end{array} \end{aligned}$$

the constrained Levenberg–Marquardt method [4, 12, 37, 47] with the subproblem

$$\begin{aligned} \begin{array}{llll} \text{ minimize }&\displaystyle \frac{1}{2} \Vert \Phi (u^k)+\Phi '(u^k)v \Vert ^2 +\frac{1}{2} \rho (u^k)\Vert v\Vert ^2&\text{ subject } \text{ to }&u^k+v\in P \end{array} \end{aligned}$$

(cf. (35)); the version of the LP-Newton method with the additional constraint [11], with the subproblem

$$\begin{aligned} \begin{array}{ll} \text{ minimize } &{} \gamma \\ \text{ subject } \text{ to } &{} \Vert \Phi (u^k)+\Phi '(u^k)v\Vert \le \gamma \Vert \Phi (u^k)\Vert ^2, \\ &{}\Vert v \Vert \le \gamma \Vert \Phi (u^k)\Vert ,\\ &{}u^k+v\in P \end{array} \end{aligned}$$

(cf. (37)); as well as projected versions of these methods; see [20, Sects. 1.1, 3] for details.

Remark 3.3

According to Izmailov et al. [30, Remark 2], the estimates (29)–(30) in Theorem 3.1 imply that

$$\begin{aligned} \lim _{k\rightarrow \infty }\frac{\Vert u^{k+1}-{\bar{u}}\Vert }{\Vert u^k-{\bar{u}}\Vert } = \frac{1}{2}, \end{aligned}$$

i.e., \(\{ u^k\} \) converges to \({\bar{u}}\) linearly, with an asymptotic ratio exactly equal to 1/2.

This convergence pattern serves as a basis for convergence acceleration techniques [25, 27], one of them being the so-called extrapolation. The simplest variant of it consists of generating an auxiliary sequence \(\{ {\widehat{u}}^k\} \) by doubling the Newtonian step: for each k, set

$$\begin{aligned} {\widehat{u}}^{k+1}=u^k+2v^k. \end{aligned}$$
(38)

According to Griewank [27, Theorem 4.1], one may expect \(\{ \widehat{u}^k\} \) to converge linearly with the asymptotic ratio of 1/4, instead of 1/2 for \(\{ u^k\} \), at least for the basic Newton method with the subproblem (34). Observe that this procedure can be easily incorporated into any implementation of the algorithms discussed above: (38) does not affect the main iteration sequence \(\{ u^k\} \), and incurs no computational overhead except for one extra evaluation of \(\Phi \) needed to assess the quality of the obtained \({\widehat{u}}^{k+1}\). The specified extrapolation procedure will be employed in Sect. 5.
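The effect of the doubled step (38) can be observed on a toy scalar problem. The following is only a minimal sketch: the function \(\Phi (u) = u^2+u^3\) with the singular solution \({\bar{u}}= 0\), the starting point, and the names `newton_with_extrapolation`, `phi`, `dphi` are assumptions made for illustration and are not taken from the paper.

```python
def phi(u):
    """Hypothetical toy equation with the singular root u = 0."""
    return u * u + u * u * u


def dphi(u):
    """Derivative of the toy function; it vanishes at the root."""
    return 2.0 * u + 3.0 * u * u


def newton_with_extrapolation(u0, iters):
    """Run the basic Newton method and record the extrapolated points (38)."""
    u = u0
    main, extra = [u0], []
    for _ in range(iters):
        v = -phi(u) / dphi(u)      # basic Newton step
        extra.append(u + 2.0 * v)  # auxiliary point (38): doubled step
        u = u + v                  # the main sequence is left unchanged
        main.append(u)
    return main, extra


main, extra = newton_with_extrapolation(0.5, 20)
print("main ratio :", abs(main[-1]) / abs(main[-2]))
print("extra ratio:", abs(extra[-1]) / abs(extra[-2]))
```

In this sketch the main sequence contracts with a ratio close to 1/2, while the auxiliary sequence contracts with a ratio close to 1/4, in line with the pattern described above.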

4 Asymptotic Acceptance of the Full Step

We will deal with the issue specified in the title of this section for the following prototype algorithm combining the local perturbed Newton method framework with a linesearch globalization technique.

Algorithm 4.1

Choose \(u^0\in \mathbb {R}^p\), \(\sigma \in (0,\, 1)\), \(\theta \in (0,\, 1)\), and set \(k=0\).

  1. If \(\Phi (u^k) = 0\), stop.

  2. Compute \(v^k\in \mathbb {R}^p\) as a solution of (9).

  3. Set \(\alpha = 1\). If the inequality

    $$\begin{aligned} \Vert \Phi (u^k+\alpha v^k)\Vert \le (1-\sigma \alpha )\Vert \Phi (u^k)\Vert \end{aligned}$$
    (39)

    is satisfied, set \(\alpha _k = \alpha \). Otherwise, replace \(\alpha \) by \(\theta \alpha \), check the inequality (39) again, etc., until (39) becomes valid.

  4. Set \(u^{k+1} = u^k+\alpha _kv^k\).

  5. Increase k by 1 and go to Step 1.

The fact that Algorithm 4.1 (equipped with some further safeguards for the cases when Step 2 fails or produces a direction “of poor quality” [21]) is well-defined and possesses reasonable global convergence properties is supposed to be established for the specific instances of (9) at Step 2. The role of the perturbed Newton method framework is only local, which conforms with the local nature of our analysis, and in principle, those global issues are not the subject of this work, but we will give some related comments in Remark 4.1 below.
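The steps of Algorithm 4.1 can be sketched in a few lines for a scalar equation with the basic Newton direction at Step 2. The sketch below uses the data of Example 5.1 in Sect. 5 (\(p = 1\), \(F(u) = u^2\), with \(\Phi \) given by (51)–(52)). The iteration cap `max_iter` and the lower bound `alpha_min` on the stepsize are safeguards assumed here for illustration; they are not part of the algorithm as stated.

```python
def psi(a, b):
    """Complementarity function (51)."""
    m = min(0.0, a + b)
    return 2.0 * a * b - m * m


def phi(u):
    """Phi from (52) for Example 5.1, where F(u) = u^2."""
    return psi(u, u * u)


def dphi(u):
    """Derivative of phi, i.e., (53) with F(u) = u^2 and F'(u) = 2u."""
    f, df = u * u, 2.0 * u
    return 2.0 * u * df + 2.0 * f - 2.0 * min(0.0, u + f) * (df + 1.0)


def algorithm_4_1(u0, sigma, theta=0.5, max_iter=60, alpha_min=1e-12):
    """Scalar sketch of Algorithm 4.1; returns the last iterate and stepsizes."""
    u, steps = u0, []
    for _ in range(max_iter):
        r = abs(phi(u))
        if r == 0.0:                      # Step 1: stop at a solution
            break
        v = -phi(u) / dphi(u)             # Step 2: basic Newton direction
        alpha = 1.0                       # Step 3: backtracking on test (39)
        while abs(phi(u + alpha * v)) > (1.0 - sigma * alpha) * r:
            alpha *= theta
            if alpha < alpha_min:         # assumed safeguard, not in the paper
                raise RuntimeError("backtracking failed")
        steps.append(alpha)
        u = u + alpha * v                 # Step 4: update the iterate
    return u, steps


u_small, steps_small = algorithm_4_1(-0.5, sigma=0.01)
u_large, steps_large = algorithm_4_1(-0.5, sigma=0.8)
print(set(steps_small), set(steps_large))
```

With the small value of \(\sigma \) the sketch takes only unit stepsizes from this starting point, whereas with \(\sigma = 0.8\) the unit stepsize is rejected at every iteration.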

Theorem 4.1

Under the assumptions of Theorem 3.1, let the estimates (27) and (28) hold with removed \(\Pi \), i.e.,

$$\begin{aligned} \Omega (u) = O(\Vert u_1-{\bar{u}}_1\Vert )+O(\Vert u-{\bar{u}}\Vert ^2) \end{aligned}$$
(40)

and

$$\begin{aligned} \omega (u) = O(\Vert u-{\bar{u}}\Vert \Vert u_1-{\bar{u}}_1\Vert )+O(\Vert u-{\bar{u}}\Vert ^3) \end{aligned}$$
(41)

for \(u\in K_{\varepsilon ,\, \delta }({\bar{u}};\, {\bar{v}})\) as \(\varepsilon \rightarrow 0+\).

Then, for every \({\widehat{\varepsilon }} >0\) and \({\widehat{\delta }} >0\), there exist \(\varepsilon = \varepsilon ({\bar{v}})>0\) and \(\delta = \delta ({\bar{v}})>0\) such that for any starting point \(u^0\in K_{\varepsilon ,\, \delta }({\bar{u}};\, {\bar{v}})\) Algorithm 4.1 with \(\sigma \in (0,\, 3/4)\) uniquely defines the sequence \(\{ u^k\} \), \(u^k\in K_{{\widehat{\varepsilon }},\, {\widehat{\delta }} }({\bar{u}};\, {\bar{v}})\) for all k, and \(\alpha _k = 1\) holds for all k large enough.

Observe that conditions (40) and (41) imply both (10) and (11), and of course cover the case when \(\Omega (\cdot )\equiv 0\) and \(\omega (\cdot )\equiv 0\), in which (9) turns into the basic (unperturbed) Newton scheme (34), while Algorithm 4.1 turns into its instance considered in [22, Algorithm 1]. Therefore, Theorem 4.1 generalizes [22, Proposition 3], both in the sense of weaker smoothness assumptions and in allowing perturbations of the basic Newton scheme.

Proof

As in Lemma 3.1, let \({\bar{u}}= 0\), and let \({\bar{\varepsilon }} > 0\) and \({\bar{\delta }} \in (0,\, 1)\) be chosen according to that lemma. Then for \(u\in K_{{\bar{\varepsilon }},\, {\bar{\delta }} }({\bar{u}};\, {\bar{v}}){\setminus } \{ {\bar{u}}\} \), there exists the unique solution v of (9). Moreover, by the argument in the proof of Izmailov et al. [30, Theorem 1] we then have

$$\begin{aligned} \Vert u_1\Vert \le {\bar{\delta }} \Vert u\Vert \le \frac{{\bar{\delta }} }{1-\bar{\delta }}\Vert u_2\Vert , \end{aligned}$$
(42)

and hence, estimates (31), (33) yield

$$\begin{aligned} u+v = \frac{1}{2} u_2+O( \Vert u_1\Vert )+O( \Vert u\Vert ^2) = \frac{1}{2} u_2+O({\bar{\delta }} \Vert u_2\Vert )+O(\Vert u_2\Vert ^2), \end{aligned}$$
(43)
$$\begin{aligned} \frac{1}{2} u+v = O(\Vert u_1\Vert )+O( \Vert u\Vert ^2) = O( {\bar{\delta }} \Vert u_2\Vert )+O( \Vert u_2\Vert ^2) \end{aligned}$$
(44)

as \(u\rightarrow 0\) and \({\bar{\delta }} \rightarrow 0+\).

According to (2) and (5),

$$\begin{aligned} \Phi (u)= & {} \Phi '({\bar{u}})u+\frac{1}{2} (\Phi ')'({\bar{u}};\, u)u+O(\Vert u\Vert ^3)\nonumber \\= & {} \Phi '({\bar{u}})u_1+\frac{1}{2} (\Phi ')'({\bar{u}};\, u_2)u_2+O(\Vert u_1\Vert ^2)+O(\Vert u_1\Vert \Vert u_2\Vert ) +O(\Vert u\Vert ^3)\nonumber \\= & {} \Phi '({\bar{u}})u_1+\frac{1}{2} (\Phi ')'({\bar{u}};\, u_2)u_2+O({\bar{\delta }} \Vert u_2\Vert ^2) +O(\Vert u_2\Vert ^3) \end{aligned}$$
(45)

as \(u\rightarrow 0\) and \({\bar{\delta }} \rightarrow 0+\), where the second equality is by Lipschitz continuity of \((\Phi ')'({\bar{u}};\, \cdot )\), while the last one is by (42). Furthermore, by the same reasoning, but also employing (43), we obtain that

$$\begin{aligned} \Phi (u+v)= & {} \Phi '({\bar{u}})(u+v)+\frac{1}{2} (\Phi ')'({\bar{u}};\, u+v)(u+v)+O(\Vert u+v\Vert ^3)\nonumber \\= & {} \Phi '({\bar{u}})(u+v)+\frac{1}{8} (\Phi ')'({\bar{u}};\, u_2)u_2+O( \bar{\delta }\Vert u_2\Vert ^2)+O( \Vert u_2\Vert ^3) \end{aligned}$$
(46)

as \(u\rightarrow 0\) and \({\bar{\delta }} \rightarrow 0+\).

Since v is a solution of (9), by (2)–(5) and (40)–(41), we conclude that

$$\begin{aligned} 0= & {} -\Phi (u)-\Phi '(u)v -\Omega (u)v+\omega (u)\\= & {} -\Phi '({\bar{u}})u-\frac{1}{2} (\Phi ')'({\bar{u}};\, u)u-\Phi '({\bar{u}}) v- (\Phi ')'({\bar{u}};\, u)v\\{} & {} +O(\Vert u\Vert ^3)+O(\Vert u\Vert ^2\Vert v\Vert )+O(\Vert u_1\Vert \Vert v\Vert )+O(\Vert u \Vert \Vert u_1 \Vert ), \end{aligned}$$

which by (42), (44) implies that

$$\begin{aligned} \Phi '({\bar{u}})(u+v)= & {} -(\Phi ')'({\bar{u}};\, u)\left( \frac{1}{2} u+v\right) \\{} & {} +O(\Vert u\Vert ^3)+O(\Vert u\Vert ^2\Vert v\Vert )+O(\Vert u_1\Vert \Vert v\Vert )+O(\Vert u \Vert \Vert u_1 \Vert )\\= & {} O( {\bar{\delta }} \Vert u_2\Vert ^2)+O( \Vert u_2\Vert ^3). \end{aligned}$$

Substituting the latter into (46) yields

$$\begin{aligned} \Phi (u+v) = \frac{1}{8} (\Phi ')'({\bar{u}};\, u_2)u_2 +O( {\bar{\delta }} \Vert u_2\Vert ^2)+O( \Vert u_2\Vert ^3) \end{aligned}$$
(47)

as \(u\rightarrow 0\) and \({\bar{\delta }} \rightarrow 0+\).

Estimates (45) and (47) comprise what is needed for the analysis leading to Fischer et al. [22, Proposition 3] to go through when combined with the following additional facts none of which requires stronger smoothness assumptions. First, 2-regularity of \(\Phi \) in a direction \({\bar{v}}\in \ker \Phi '({\bar{u}}){\setminus } \{ 0\}\) implies that \(\Pi (\Phi ')'({\bar{u}};\, {\bar{v}}){\bar{v}} = B(\bar{v}){\bar{v}} \not = 0\), and then it can be seen that \({\bar{\delta }} > 0\) can be chosen in such a way that there exists \(\gamma > 0\) such that

$$\begin{aligned} \Vert (\Phi ')'({\bar{u}};\, u_2)u_2\Vert \ge \Vert \Pi (\Phi ')'({\bar{u}};\, u_2)u_2\Vert \ge \gamma \Vert u_2\Vert ^2. \end{aligned}$$

Second, (13), (33), and (42) imply the estimates

$$\begin{aligned}{} & {} u_1+v_1 = O(\Vert u\Vert \Vert u_1\Vert )+O(\Vert u\Vert ^3) = O( {\bar{\delta }} \Vert u_2\Vert ^2)+O( \Vert u_2\Vert ^3), \end{aligned}$$
(48)
$$\begin{aligned}{} & {} u_2+v_2 = \frac{1}{2} u_2+O(\Vert u_1\Vert )+O(\Vert u\Vert ^2) = \frac{1}{2} u_2+O( {\bar{\delta }} \Vert u_2\Vert )+O( \Vert u_2\Vert ^2)\nonumber \\ \end{aligned}$$
(49)

as \(u\rightarrow 0\) and \({\bar{\delta }} \rightarrow 0+\).

Observe that, unlike in the local convergence result of Theorem 3.1, the estimate (48) (which is sharper than (31)) is essential here: together with (49), it allows one to conclude that for every \({\bar{\gamma }} > 0\), one can choose \({\bar{\varepsilon }} > 0\) and \({\bar{\delta }} > 0\) in such a way that

$$\begin{aligned} \Vert u_1+v_1\Vert \le {\bar{\gamma }} \Vert u_2+v_2\Vert ^2, \end{aligned}$$

yielding another key ingredient of this analysis. \(\square \)

Remark 4.1

Algorithm 4.1 makes perfect sense when used with the basic Newton scheme (34) at Step 2 (i.e., with \(\Omega (\cdot )\equiv 0\) and \(\omega (\cdot )\equiv 0\) in (9)), and with the Euclidean norm in (39) at Step 3; see the related comments in [22]. In some sense, this remains true for the Levenberg–Marquardt method with the iteration system (36) as well, since the function \(\varphi :\mathbb {R}^p\rightarrow \mathbb {R}_+\), \(\varphi (u):= \Vert \Phi (u)\Vert \), defined using the Euclidean norm, is differentiable at any point \(u^k\) such that \(\Phi (u^k)\not = 0\) (cf. Step 1 of Algorithm 4.1), and

$$\begin{aligned} \varphi '(u^k) = (\Phi '(u^k))^\top \Phi (u^k)/\Vert \Phi (u^k)\Vert , \end{aligned}$$

and hence, for the solution \(v^k\) of (36) it holds that

$$\begin{aligned} \langle \varphi '(u^k),\, v^k\rangle = -\langle ((\Phi '(u^k))^\top \Phi '(u^k)+\rho (u^k)I)v^k,\, v^k\rangle /\Vert \Phi (u^k)\Vert < 0. \end{aligned}$$

Therefore, \(v^k\) is a direction of descent for \(\varphi \) at \(u^k\). That said, we emphasize that here we only discuss the possibility, in principle, of using the Levenberg–Marquardt directions with linesearch tests like (39), i.e., we only consider the descent property of these directions for the residual. In particular, we do not discuss finite termination of backtracking procedures using this test, as this would still not guarantee global convergence of the overall algorithm, and we do not state here any formal results of this kind, as this is beyond the scope of this paper, which focuses on local analysis. Moreover, actual linesearch algorithms with known global convergence guarantees, involving the Levenberg–Marquardt directions, either employ, in a hybrid manner, some kind of safeguards for the case when the quality of descent is insufficient (like in [7, 13]), or special linesearch tests (like in [23]).
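The descent property can be checked numerically on a small instance. The sketch below assumes that the iteration system (36) has the form \(((\Phi '(u^k))^\top \Phi '(u^k)+\rho (u^k)I)v = -(\Phi '(u^k))^\top \Phi (u^k)\), as suggested by the displayed formula for \(\langle \varphi '(u^k),\, v^k\rangle \); the \(2\times 2\) data (a singular Jacobian `J` and a residual `F`) and the helper name `lm_direction` are hypothetical and chosen only for illustration.

```python
import math


def lm_direction(J, F, rho):
    """Solve (J^T J + rho I) v = -J^T F for a 2x2 matrix J by Cramer's rule.

    Returns the direction v and the vector g = J^T F.
    """
    a11 = J[0][0] ** 2 + J[1][0] ** 2 + rho
    a12 = J[0][0] * J[0][1] + J[1][0] * J[1][1]
    a22 = J[0][1] ** 2 + J[1][1] ** 2 + rho
    g1 = J[0][0] * F[0] + J[1][0] * F[1]
    g2 = J[0][1] * F[0] + J[1][1] * F[1]
    det = a11 * a22 - a12 * a12        # positive whenever rho > 0
    v1 = (-g1 * a22 + g2 * a12) / det
    v2 = (-g2 * a11 + g1 * a12) / det
    return (v1, v2), (g1, g2)


# Hypothetical data: a singular Jacobian and a nonzero residual.
J = [[1.0, 2.0], [2.0, 4.0]]
F = (1.0, 1.0)
norm_F = math.hypot(F[0], F[1])
rho = norm_F ** 2                      # regularization parameter ||Phi||^2

v, g = lm_direction(J, F, rho)
slope = (g[0] * v[0] + g[1] * v[1]) / norm_F   # <phi'(u), v>
print("directional derivative of the residual norm:", slope)
```

The printed directional derivative is negative even though `J` is singular, which is the descent property discussed above.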

Observe that the result on asymptotic acceptance of the full step for the Levenberg–Marquardt method with \(\rho (\cdot ):= \Vert \Phi (\cdot )\Vert ^\tau \), \(\tau \ge 2\), in cases of convergence to (critical) solutions with the needed 2-regularity property, following from Theorem 4.1 and considerations in [30, Sect. 3.1] (recall also Remark 3.1), is new even in the case of twice differentiable \(\Phi \).

As for the LP-Newton method, the natural choice of the norm in the subproblem (37) is the infinity-norm, as it makes (37) a linear programming problem. In any case, the globalization procedure proposed in [18, Algorithm 1] employs the stepsize test of the form

$$\begin{aligned} \Vert \Phi (u^k+\alpha v^k)\Vert \le (1-\sigma \alpha )\Vert \Phi (u^k)\Vert +\sigma \alpha \gamma _{k+1}\Vert \Phi (u^k)\Vert ^2 \end{aligned}$$

with the same norm as the one appearing in (37). This test is evidently weaker than (39) (with the same norm), and hence, accepts the unit stepsize once (39) does. Therefore, Theorem 4.1 and considerations in [30, Sect. 3.2] (recall also Remark 3.1 again) yield the result on asymptotic acceptance of the full step for the LP-Newton method, under the needed assumptions.

To conclude this section, we note that, unlike in [22], under the current smoothness assumptions one cannot expect the set of excluded directions for starlike domains of convergence and asymptotic acceptance of the full step to be thin, even for the basic (unperturbed) Newton method; see Examples 5.1–5.3 below.

5 Applications to a Smooth Reformulation of Nonlinear Complementarity Problems and Numerical Results

Consider the nonlinear complementarity problem (NCP)

$$\begin{aligned} u\ge 0,\quad F(u)\ge 0,\quad \langle u,\, F(u)\rangle =0, \end{aligned}$$
(50)

where \(F:\mathbb {R}^p\rightarrow \mathbb {R}^p\) is a given smooth mapping. Using the complementarity function \(\psi :\mathbb {R}\times \mathbb {R}\rightarrow \mathbb {R}\),

$$\begin{aligned} \psi (a,\, b):= 2ab-(\min \{ 0,\, a+b\} )^2 \end{aligned}$$
(51)

(originally introduced in [9]), NCP (50) is equivalently reformulated as (7) with

$$\begin{aligned} \Phi (u):= \psi (u,\, F(u)), \end{aligned}$$
(52)

where \(\psi \) is applied componentwise. The function \(\psi \) in (51) is one of known smooth complementarity functions [36, 40]: assuming that F is differentiable at \(u\in \mathbb {R}^p\), the corresponding mapping defined in (52) is also differentiable at u, with the Jacobian \(\Phi '(u)\) having the rows

$$\begin{aligned} \Phi _i'(u) = 2u_iF_i'(u)+2F_i(u)e^i-2\min \{ 0,\, u_i+F_i(u)\} (F_i'(u)+e^i),\quad i = 1,\, \ldots ,\, p, \end{aligned}$$
(53)

where \(e^1,\, \ldots ,\, e^p\) is the standard basis in \(\mathbb {R}^p\). From [35, Proposition 1.75] it then follows that if \(F'\) is strongly semismooth at \({\bar{u}}\in \mathbb {R}^p\) (in particular, if it is twice differentiable near \({\bar{u}}\), with its second derivative being Lipschitz-continuous near \({\bar{u}}\)), then \(\Phi '\) is strongly semismooth at \({\bar{u}}\).
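The reformulation (51)–(52) and the Jacobian rows (53) translate directly into code. The following is a minimal sketch in plain Python; the function names and the finite-difference check are assumptions made for illustration, and the test mapping \(F(u) = (u_2-1,\, u_1)\) is the one from Example 5.2.

```python
def psi(a, b):
    """Complementarity function (51)."""
    m = min(0.0, a + b)
    return 2.0 * a * b - m * m


def phi(u, F):
    """Mapping (52): psi applied componentwise to (u, F(u))."""
    return [psi(ui, fi) for ui, fi in zip(u, F(u))]


def jacobian(u, F, dF):
    """Rows of Phi'(u) according to (53); dF(u) returns the rows F_i'(u)."""
    Fu, dFu = F(u), dF(u)
    p = len(u)
    rows = []
    for i in range(p):
        m = min(0.0, u[i] + Fu[i])
        row = [2.0 * (u[i] - m) * dFu[i][j] for j in range(p)]
        row[i] += 2.0 * (Fu[i] - m)      # coefficient of e^i
        rows.append(row)
    return rows


def fd_jacobian(u, F, h=1e-6):
    """Central finite-difference approximation of Phi'(u), for checking only."""
    p = len(u)
    J = [[0.0] * p for _ in range(p)]
    for j in range(p):
        up, um = list(u), list(u)
        up[j] += h
        um[j] -= h
        fp, fm = phi(up, F), phi(um, F)
        for i in range(p):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return J


def F_ex(u):
    """Mapping of Example 5.2."""
    return [u[1] - 1.0, u[0]]


def dF_ex(u):
    """Rows F_1'(u) = e^2 and F_2'(u) = e^1 for Example 5.2."""
    return [[0.0, 1.0], [1.0, 0.0]]


point = [-0.3, 0.4]
print(jacobian(point, F_ex, dF_ex))
print(fd_jacobian(point, F_ex))
```

The two printed matrices agree up to the finite-difference error, which illustrates that \(\Phi \) is differentiable even at points where \(u_i+F_i(u) < 0\).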

If \({\bar{u}}\) is a solution of NCP (50), then the disjoint index sets

$$\begin{aligned} \begin{array}{c} I_0({\bar{u}}):= \{ i = 1,\, \ldots ,\, p\mid {\bar{u}}_i = F_i({\bar{u}}) = 0\}, \\ I_1({\bar{u}}):= \{ i = 1,\, \ldots ,\, p\mid {\bar{u}}_i> 0,\; F_i({\bar{u}}) = 0\}, \\ I_2({\bar{u}}):= \{ i = 1,\, \ldots ,\, p\mid {\bar{u}}_i = 0,\; F_i({\bar{u}}) > 0\}, \end{array} \end{aligned}$$

provide a partition of \(\{ 1,\, \ldots ,\, p\} \), and from (53) we have

$$\begin{aligned} \Phi _i'({\bar{u}}) = \left\{ \begin{array}{ll} 0&{}\text{ if } i\in I_0({\bar{u}}),\\ 2{\bar{u}}_iF_i'({\bar{u}})&{}\text{ if } i\in I_1({\bar{u}}),\\ 2F_i({\bar{u}})e^i&{}\text{ if } i\in I_2({\bar{u}}). \end{array} \right. \end{aligned}$$
(54)

This implies that if \(I_0({\bar{u}})\not = \emptyset \), meaning violation of the strict complementarity condition at \({\bar{u}}\), then \({\bar{u}}\) is necessarily a singular solution of Equation (7).

From (53) one can easily obtain that for any \(v\in \mathbb {R}^p\) and \(i\in I_0({\bar{u}})\)

$$\begin{aligned} (\Phi _i')'({\bar{u}};\, v)= & {} 2(v_i-\min \{ 0,\, v_i+\langle F_i'({\bar{u}}),\, v\rangle \})F_i'({\bar{u}})\\{} & {} +\,2(\langle F_i'({\bar{u}}),\, v\rangle -\min \{ 0,\, v_i+\langle F_i'({\bar{u}}),\, v\rangle \} )e^i\\= & {} 2\max \{ v_i,\, -\langle F_i'({\bar{u}}),\, v\rangle \} F_i'({\bar{u}}) -2\min \{ v_i,\, -\langle F_i'({\bar{u}}),\, v\rangle \} e^i. \end{aligned}$$

Then from (54), we derive that the key assumption of 2-regularity of \(\Phi \) at \({\bar{u}}\) in some direction \({\bar{v}}\in \ker \Phi '({\bar{u}})\) automatically holds for any \({\bar{v}}\in \mathbb {R}^p\) such that

$$\begin{aligned} \langle F_i'({\bar{u}}),\, {\bar{v}}\rangle = 0,\; i\in I_1({\bar{u}}),\quad {\bar{v}}_i = 0,\; i\in I_2({\bar{u}}), \end{aligned}$$
(55)

and the matrix with the rows

$$\begin{aligned} \begin{array}{c} \max \{ {\bar{v}}_i,\, -\langle F_i'({\bar{u}}),\, {\bar{v}}\rangle \}F_i'({\bar{u}}) -\min \{ {\bar{v}}_i,\, -\langle F_i'({\bar{u}}),\, {\bar{v}}\rangle \} e^i,\; i\in I_0({\bar{u}}), \\ F_i'({\bar{u}}),\; i\in I_1({\bar{u}}), \\ e^i,\; i\in I_2({\bar{u}}), \end{array} \end{aligned}$$
(56)

is nonsingular. The latter sufficient condition for 2-regularity of \(\Phi \) at \({\bar{u}}\) in a direction \({\bar{v}}\) evidently implies that

$$\begin{aligned} F_i'({\bar{u}}),\; i\in I_1({\bar{u}}),\quad e^i,\; i\in I_2({\bar{u}}),\quad \hbox {are linearly independent}, \end{aligned}$$
(57)

and moreover, this sufficient condition also becomes necessary under (57). The general characterization of 2-regularity in the current context, not assuming (57), can be found in [42]. For easier understanding of the essence of the properties in question, here we restrict ourselves to the case when singularity is imposed in a natural way, i.e., only by violation of strict complementarity at \({\bar{u}}\), or, in other words, when (57) holds. That said, see Example 5.3 below, demonstrating the case when the key assumption holds in the absence of (57).

Example 5.1

[19, Example 1] Let \(p = 1\), \(F(u) = u^2\). Then NCP (50) has the unique solution \({\bar{u}}= 0\), with \(I_1({\bar{u}}) = I_2({\bar{u}}) = \emptyset \), \(F'({\bar{u}}) = 0\), and the first line in (56) is positive if \({\bar{v}}< 0\), and equals 0 otherwise. Therefore, the key assumption holds with any \({\bar{v}}< 0\), but not with \({\bar{v}}\ge 0\).

Being initialized at \(u^0 < 0\), Algorithm 4.1 employing the basic Newton method, and with \(\sigma < 3/4\), converges to \({\bar{u}}\) by full steps (from some iteration on), and the rate of convergence is linear with the asymptotic ratio 1/2. For \(\sigma \ge 3/4\), the full step is never accepted (the ultimate stepsize value is \(\alpha = 0.5\) for \(\sigma = 3/4\), and approaches 0 as \(\sigma \) approaches 1), and the linear convergence rate is lower (with the asymptotic ratio 3/4 for \(\sigma = 3/4\), and approaching 1 as \(\sigma \) approaches 1).
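The threshold 3/4 can be traced by a direct computation, sketched here for \(-1< u < 0\), so that \(u+u^2 < 0\) in (51):

$$\begin{aligned} \Phi (u) = 2u^3-(u+u^2)^2 = -u^2-u^4,\quad v = -\frac{\Phi (u)}{\Phi '(u)} = -\frac{u(1+u^2)}{2(1+2u^2)},\quad u+v = \frac{1}{2} u+O(|u|^3), \end{aligned}$$

and therefore

$$\begin{aligned} \frac{|\Phi (u+v)|}{|\Phi (u)|} = \frac{1}{4} +\frac{5}{16} u^2+O(u^4) \end{aligned}$$

as \(u\rightarrow 0-\). Hence, (39) with \(\alpha = 1\) is eventually satisfied when \(1/4 < 1-\sigma \), i.e., when \(\sigma < 3/4\), while for \(\sigma \ge 3/4\) it is violated for all \(u < 0\) close enough to 0. Assuming \(\theta = 1/2\), the first trial value \(\alpha = 1/2\) gives \(u+v/2 = 3u/4+O(|u|^3)\), with the residual ratio tending to \(9/16 < 1-3/8\), which agrees with the stepsize 0.5 and the asymptotic ratio 3/4 reported for \(\sigma = 3/4\).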

The case when \(u^0 > 0\) is not covered by the theory above, and the method ultimately accepts the unit stepsize for sufficiently small values of \(\sigma \) (only for those smaller than some threshold \({\bar{\sigma }} \in (0,\, 3/4)\)), but in such cases the rate of convergence is linear with the asymptotic ratio 2/3. This specific rate is explained by the fact that for \(u > 0\), it holds that \(\Phi (u) = 2u^3\), and the Newton iteration at \(u^k > 0\) produces \(u^{k+1} = 2u^k/3\). This also agrees with the theory developed in [26] for arbitrarily smooth equations and for the basic Newton method, allowing for higher-order regularity when \(\Phi '' ({\bar{u}}) = 0\) (as it essentially happens in this case).
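For this case, the acceptance test (39) can be evaluated in closed form, which gives a candidate value for the threshold (this computation is a sketch based only on the formulas above). For \(u > 0\) we have \(\min \{ 0,\, u+u^2\} = 0\), and hence

$$\begin{aligned} \Phi (u) = 2u^3,\quad u+v = u-\frac{2u^3}{6u^2} = \frac{2}{3} u,\quad \frac{|\Phi (u+v)|}{|\Phi (u)|} = \frac{8}{27}, \end{aligned}$$

so that (39) with \(\alpha = 1\) holds if and only if \(8/27\le 1-\sigma \), i.e., \(\sigma \le 19/27\approx 0.704\), a value lying in \((0,\, 3/4)\).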

Fig. 1 Example 5.2: the Newton method

Example 5.2

(Test problem affknot1 in [42]) Let \(p = 2\), \(F(u) = (u_2-1,\, u_1)\). Then NCP (50) has the solution set \(\{ 0\} \times [1,\, +\infty )\) (thick line in Fig. 1, where thin lines are contours of \(\Vert \Phi (\cdot )\Vert \)), with \({\bar{u}}= (0,\, 1)\) (thick dot in Fig. 1) being the unique critical solution, and \(I_0({\bar{u}}) = \{ 1\} \), \(I_1({\bar{u}}) = \{ 2\} \), \(I_2({\bar{u}}) = \emptyset \), \(F_1'({\bar{u}}) = e^2\), \(F_2'({\bar{u}}) = e^1\). Condition (55) yields \({\bar{v}}_1 = 0\), while the matrix with rows given by (56) takes the form

$$\begin{aligned} \left( \begin{array}{cc} 0&{}-{\bar{v}}_2\\ 1&{}0 \end{array} \right) \end{aligned}$$

when \({\bar{v}}_2 < 0\), and

$$\begin{aligned} \left( \begin{array}{cc} {\bar{v}}_2&{}0\\ 1&{}0 \end{array} \right) \end{aligned}$$

otherwise. Therefore, the key assumption holds with \({\bar{v}}= (0,\, {\bar{v}}_2)\) for any \({\bar{v}}_2< 0\), but does not hold for \({\bar{v}}_2\ge 0\).
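The nonsingularity check for the matrix with rows (56) is easy to automate. The sketch below builds this matrix for given index sets and a direction \({\bar{v}}\), and evaluates its determinant for the data of Example 5.2; the function names are hypothetical, indices are 0-based, and the rows are listed in the order of the index i rather than grouped by index sets, which does not affect nonsingularity.

```python
def matrix_56(dF_bar, v, I0, I1, I2):
    """Matrix with rows (56); dF_bar[i] is the row F_i'(u_bar)."""
    p = len(v)
    rows = []
    for i in range(p):
        e_i = [1.0 if j == i else 0.0 for j in range(p)]
        if i in I0:
            a = sum(dF_bar[i][j] * v[j] for j in range(p))  # <F_i'(u_bar), v>
            hi, lo = max(v[i], -a), min(v[i], -a)
            rows.append([hi * dF_bar[i][j] - lo * e_i[j] for j in range(p)])
        elif i in I1:
            rows.append(list(dF_bar[i]))
        elif i in I2:
            rows.append(e_i)
        else:
            raise ValueError("index sets must partition {0, ..., p-1}")
    return rows


def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


# Data of Example 5.2 at u_bar = (0, 1): F_1' = e^2, F_2' = e^1.
dF_bar = [[0.0, 1.0], [1.0, 0.0]]
I0, I1, I2 = {0}, {1}, set()

for v2 in (-1.0, 1.0):
    M = matrix_56(dF_bar, [0.0, v2], I0, I1, I2)
    print("v_2 =", v2, "matrix =", M, "det =", det2(M))
```

For \({\bar{v}}_2 = -1\) the determinant is nonzero, and for \({\bar{v}}_2 = 1\) it vanishes, matching the two matrices displayed above.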

Figure 1a demonstrates the starting points from which convergence of the basic Newton method to the critical solution \({\bar{u}}\) was detected. In order to obtain this figure, we initialized the method at 10000 random starting points distributed uniformly in the cubic neighborhood of \({\bar{u}}\) with the half-edge equal to 1. The runs were terminated with success when the residual \(\Vert \Phi (u^k)\Vert \) fell below \(10^{-11}\), and out of these cases, convergence to \({\bar{u}}\) was claimed when \(\Vert u^k-{\bar{u}}\Vert \) at successful termination was smaller than \(10^{-3}\). Figures with domains of attraction for other examples below were generated similarly. Note that the tolerance \(10^{-3}\) is a compromise between the tasks of numerically detecting the cases of convergence and non-convergence to the solution of interest.
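The sampling procedure just described can be sketched as follows for Example 5.2. This is an illustration under stated assumptions rather than the code used for the figures: full Newton steps are taken without linesearch, the cap of 50 iterations, the sample size, the random seed, and all function names are choices made here.

```python
import math
import random

U_BAR = (0.0, 1.0)  # critical solution of Example 5.2


def psi(a, b):
    """Complementarity function (51)."""
    m = min(0.0, a + b)
    return 2.0 * a * b - m * m


def phi(u):
    """Phi from (52) with F(u) = (u_2 - 1, u_1)."""
    return (psi(u[0], u[1] - 1.0), psi(u[1], u[0]))


def jac(u):
    """Rows (53) with F_1' = e^2 and F_2' = e^1."""
    m1 = min(0.0, u[0] + u[1] - 1.0)
    m2 = min(0.0, u[0] + u[1])
    return ((2.0 * (u[1] - 1.0) - 2.0 * m1, 2.0 * u[0] - 2.0 * m1),
            (2.0 * u[1] - 2.0 * m2, 2.0 * u[0] - 2.0 * m2))


def newton_run(u0, tol=1e-11, max_iter=50):
    """Basic Newton iterations; returns the last iterate and a success flag."""
    u = (float(u0[0]), float(u0[1]))
    for _ in range(max_iter):
        f = phi(u)
        if math.hypot(f[0], f[1]) < tol:
            return u, True
        (a, b), (c, d) = jac(u)
        det = a * d - b * c
        if det == 0.0 or not math.isfinite(det):
            return u, False               # Newton step is not defined
        v = ((-f[0] * d + f[1] * b) / det, (-f[1] * a + f[0] * c) / det)
        u = (u[0] + v[0], u[1] + v[1])
    f = phi(u)
    return u, math.hypot(f[0], f[1]) < tol


def attraction_fraction(n, seed=0, dist_tol=1e-3):
    """Fraction of random starts in the unit box around U_BAR attracted to it."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        u0 = (U_BAR[0] + rng.uniform(-1.0, 1.0),
              U_BAR[1] + rng.uniform(-1.0, 1.0))
        u, ok = newton_run(u0)
        if ok and math.hypot(u[0] - U_BAR[0], u[1] - U_BAR[1]) < dist_tol:
            hits += 1
    return hits / n


print(newton_run((0.0, 0.5)))
print(attraction_fraction(200))
```

Starting from \((0,\, 0.5)\), i.e., along the direction \({\bar{v}}= (0,\, -1)\) from \({\bar{u}}\), the distance to \({\bar{u}}\) is halved at every iteration of this sketch.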

Figure 1b shows some typical iterative sequences. The observed pattern of convergence to \({\bar{u}}\) agrees with the developed theory, and the full step is ultimately accepted.

Fig. 2 Example 5.3: the Newton method

Fig. 3 Example 5.3: the Newton method

Example 5.3

[2, Example 3.3] Let \(p = 2\), \(F(u) = ((u_1-1)u_2,\, (u_1-1)^2)\). Then NCP (50) has the solution set \((\mathbb {R}_+\times \{ 0\} )\cup (\{ 1\} \times \mathbb {R}_+)\), with \((0,\, 0)\) and \((1,\, 0)\) being the only critical solutions. Figures 2 and 3 provide the same kind of information as Fig. 1 above.

For \({\bar{u}}= (0,\, 0)\) we have \(I_0({\bar{u}}) = \{ 1\} \), \(I_1({\bar{u}}) = \emptyset \), \(I_2({\bar{u}}) = \{ 2\} \), \(F_1'({\bar{u}}) = -e^2\), \(F_2'({\bar{u}}) = -2e^1\). Condition (55) yields \({\bar{v}}_2 = 0\), while the matrix with rows given by (56) takes the form

$$\begin{aligned} \left( \begin{array}{cc} -{\bar{v}}_1&{}0\\ 0&{}1 \end{array} \right) \end{aligned}$$

when \({\bar{v}}_1 < 0\), and

$$\begin{aligned} \left( \begin{array}{cc} 0&{}-{\bar{v}}_1\\ 0&{}1 \end{array} \right) \end{aligned}$$

otherwise. Therefore, the key assumption holds with \({\bar{v}}= ({\bar{v}}_1,\, 0)\) for any \({\bar{v}}_1 < 0\), but does not hold for \({\bar{v}}_1\ge 0\).

Figure 2a demonstrates the starting points from which convergence of the basic Newton method to the critical solution \({\bar{u}}= (0,\, 0)\) was detected, while Fig. 2b shows some typical iterative sequences. The observed pattern of convergence to \({\bar{u}}\) agrees with the developed theory, and the full step is ultimately accepted.

For \({\bar{u}}= (1,\, 0)\) we have \(I_0({\bar{u}}) = \{ 2\} \), \(I_1({\bar{u}}) = \{ 1\} \), \(I_2({\bar{u}}) = \emptyset \), \(F_1'({\bar{u}}) = F_2'({\bar{u}}) = (0,\, 0)\), implying, in particular, that (57) does not hold. Nevertheless, it can be seen that the key assumption holds with any \({\bar{v}}\) such that \({\bar{v}}_2 < 0\).

Figure 3 is intended to emphasize the role of the critical solution \({\bar{u}}= (1,\, 0)\). Observe that \(\varepsilon ({\bar{v}})\rightarrow 0\) as \({\bar{v}}_2\rightarrow 0-\), i.e., as \({\bar{v}}\) approaches nonzero directions in the \(u_1\)-axis, in which 2-regularity is violated. That said, the boundary of the attraction domain in Fig. 3a is tangential to the \(u_1\)-axis at \((1,\, 0)\), and \(\varepsilon ({\bar{v}})\) is positive for every direction \({\bar{v}}\) with \({\bar{v}}_2 < 0\). A similar effect is observed in other examples.

Fig. 4 Example 5.4: the Newton method

Fig. 5 Example 5.4: the Levenberg–Marquardt method

Fig. 6 Example 5.4: the Levenberg–Marquardt method

Example 5.4

[2, Example 3.2] Let \(p = 2\), \(F(u) = (0,\, -u_1+u_2+1)\). Then NCP (50) has the solution set \(([0,\, 1]\times \{ 0\} )\cup \{ (t+1,\, t)\mid t\ge 0\} \), with \((0,\, 0)\) and \((1,\, 0)\) being the only critical solutions. Figures 4, 5 and 6 provide the same kind of information as Figs. 1, 2 and 3 above, though Figs. 5 and 6 are for the Levenberg–Marquardt method rather than the basic Newton method.

For \({\bar{u}}= (0,\, 0)\) as in the previous examples one can check that the key assumption holds with \({\bar{v}}= ({\bar{v}}_1,\, 0)\) for any \({\bar{v}}_1 < 0\), and Figs. 4 and 5 reflect this fact.

Furthermore, one can see that \(\Phi _1'(u) = -2\min \{ 0,\, u_1\} e^1 = 0\) for all \(u\in \mathbb {R}^2\) with \(u_1\ge 0\), implying that \(\Phi '(u)\) is singular for all such u, and in particular, it is singular in a neighborhood of \({\bar{u}}= (1,\, 0)\). Therefore, the key assumption cannot hold at this \({\bar{u}}\), and the Newton method is not well-defined near this solution. At the same time, the Levenberg–Marquardt method behaves nicely near this solution, and does not exhibit any tendency of convergence to it; see Fig. 6. In particular, the sparse set in Fig. 6a is actually a result of using an approximate test on closeness of the iterate at termination to \({\bar{u}}\), with a rather rough tolerance of \(10^{-3}\). Further reducing this tolerance makes the displayed “domain of attraction” sparser and sparser, and eventually eliminates it completely at the level \(10^{-6}\).

Fig. 7 Example 5.5: the Levenberg–Marquardt method

Example 5.5

(Test problem quadknot in [42]) Let \(p = 2\), \(F(u) = (u_2-1,\, u_1^2)\). Then NCP (50) has the solution set \(\{ 0\} \times [1,\, +\infty )\), with \({\bar{u}}= (0,\, 1)\) being the only critical solution. See Fig. 7.

Fig. 8 Example 5.6: the Levenberg–Marquardt method

Example 5.6

[2, Example 3.4] Let \(p = 2\), \(F(u) = ((u_1-1)^2+(u_1-1)u_2,\, (u_1-1)^2)\). Then apart from a strictly complementary solution \((0,\, 0)\), NCP (50) has the solution set \(\{ 1\} \times \mathbb {R}_+\), with \({\bar{u}}= (1,\, 0)\) being the only critical solution. See Fig. 8.

We complete the paper with numerical results for a collection of small NCPs taken from Oberlin and Wright [42], and for some other examples of NCP with solutions violating strict complementarity, taken from various sources. The algorithms being tested were applied to (7) with \(\Phi \) defined according to (51)–(52).

Table 1 Numerical results for NCPs: the Newton method

Table 1 presents the results for Algorithm 4.1 employing the basic Newton method with the subproblem (34), and with \(\sigma = 0.01\) and \(\theta = 1/2\) (abbreviated below as “NM”), as well as for the version of the method supplied with the simplest extrapolation procedure defined according to (38) (abbreviated as “NM-EP”). Successful termination was declared when the Euclidean residual of (7) at the main or extrapolated iterate became less than or equal to \(10^{-11}\) within 50 iterations. The identifiers of test problems with the key assumption satisfied at the singular solution in question are boldfaced. Some of the test problems have two solutions of interest, and then their identifiers have additional attributes 1 or 2. For each test problem, we performed a single run from the “recommended” starting point (when available; abbreviated as “Rec”), and also 1000 runs from randomly generated starting points distributed uniformly in the cubic neighborhood of the solution in question, with a half-edge equal to 1 (abbreviated as “Rand”). For the former case, we report only the iteration count, while for the latter we report the average iteration count over successful runs (rounded up to the nearest integer), and additionally the percentage of successful runs and the average distance to the solution of interest over cases when this distance at successful termination was no greater than \(10^{-3}\) (in parentheses, separated by commas). The cases when there were no successful runs are marked by “–”.

Table 2 reports the same kind of information as Table 1 for Algorithm 4.1 with the same parameter values, but employing the Levenberg–Marquardt method with the subproblem (36) making use of the regularization parameter \(\rho (\cdot ):= \Vert \Phi (\cdot )\Vert ^2\) (abbreviated as LMM, and as LMM-EP for a version supplied with extrapolation).

Table 2 Numerical results for NCPs: the Levenberg–Marquardt method

The asymptotic acceptance of the full step was encountered in all runs of these experiments. Moreover, the full step was accepted almost always, except for some rare cases when it was not accepted on some early iteration (usually once per run, if at all). In addition, for LMM, the iterations where the full step was not accepted were systematically encountered for DIS61 and quarquad, 2, only. These observations confirm the conclusions of Theorem 4.1: despite convergence to singular solutions, the full step is asymptotically accepted.

Furthermore, the results reported in Tables 1 and 2 clearly demonstrate the accelerating effect of the extrapolation procedure for problems satisfying the key assumption, both for the Newton and the Levenberg–Marquardt methods. This can be considered as indirect evidence of the convergence pattern established in Theorem 3.1.

6 Conclusions

We have extended some known results on the behavior of Newton-type methods (including the Levenberg–Marquardt and the LP-Newton methods) near singular (and perhaps nonisolated) solutions of nonlinear equations to the case when the operator of the equation possesses a strongly semismooth derivative, but is not necessarily twice differentiable. Specifically, we have presented results on local linear convergence, and on asymptotic acceptance of the full step by linesearch versions of such algorithms. The results were further applied to nonlinear complementarity problems violating strict complementarity, and a collection of examples was presented demonstrating peculiarities of the smoothness assumptions used in this work.