1 Introduction

Keplerian orbits can be described by two parameters, the semilatus rectum \(p>0\) and the eccentricity \(e\ge 0\), from which all other orbital constants can be derived. According to Kepler’s laws, the coordinates \((x,y)\) at time t of a satellite in the standard reference frame (central body at the origin and x-axis pointing to the periapsis) are given by

Elliptic (\(e<1\)):

\(x=\frac{p}{1-e^2}(\cos (E)-e)\), \(y=\frac{p}{\sqrt{1-e^2}}\sin (E)\), \(E-e\sin (E)=M\), \(M=\sqrt{\frac{\mu (1-e^2)^3}{p^3}}(t-t_0)\)

Parabolic (\(e=1\)):

\(x=\frac{p}{2}(1-D^2)\), \(y=pD\), \(D+\frac{D^3}{3}=M\), \(M=\sqrt{\frac{4\mu }{p^3}}(t-t_0)\)

Hyperbolic (\(e>1\)):

\(x=\frac{p}{1-e^2}(\cosh (H)-e)\), \(y=\frac{p}{\sqrt{e^2-1}}\sinh (H)\), \(e\sinh (H)-H=M\), \(M=\sqrt{\frac{\mu (e^2-1)^3}{p^3}}(t-t_0)\)

where \(\mu \) is the gravitational parameter of the central body and \(t_0\) is the time of passage through periapsis (Montenbruck and Pfleger 1994). The value M, called the mean anomaly, is a linear function of time. E, D and H are the elliptic, parabolic and hyperbolic eccentric anomalies, respectively; they are functions of time related to M through Kepler’s equation (the third equation in each case above). The eccentric anomalies are needed to compute the coordinates x and y, hence the need for a method to solve Kepler’s equation.

In the case of circular orbits (\(e=0\)), the equation for the eccentric anomaly reduces to \(E=M\), and in the case of parabolic trajectories (\(e=1\)), the cubic equation for D can be solved exactly:

$$\begin{aligned} D=\root 3 \of {\frac{3M+\sqrt{9M^2+4}}{2}}+\root 3 \of {\frac{3M-\sqrt{9M^2+4}}{2}}. \end{aligned}$$
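As a sanity check, this closed form is easy to evaluate numerically. The following sketch (plain Python; helper names are ours, not from the paper) verifies that the returned D satisfies Barker's equation \(D+\frac{D^3}{3}=M\); note that the second radicand is negative, so a real cube root is needed:

```python
import math

def cbrt(x):
    # real cube root, valid for negative arguments as well
    return math.copysign(abs(x) ** (1 / 3), x)

def parabolic_anomaly(M):
    """Closed-form solution of D + D^3/3 = M (Barker's equation)."""
    s = math.sqrt(9 * M * M + 4)
    return cbrt((3 * M + s) / 2) + cbrt((3 * M - s) / 2)
```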

Therefore, the real problem lies in the cases \(0<e<1\) and \(e>1\), where the equation involves a combination of algebraic and transcendental functions.

For simplicity, the hyperbolic Kepler’s equation is usually written as \(\sinh (H)-gH=L\), where \(g=\frac{1}{e}\in (0,1)\) and \(L=\frac{M}{e}\in (-\infty ,\infty )\). Since the left side of the equation is an odd function of H, it is enough to consider \(L\in [0,\infty )\). Moreover, assuming that H can be found, the formula for the y coordinate requires \(\sinh (H)\), so it makes more sense to directly find \(S=\sinh (H)\) solving

$$\begin{aligned} f_{g,L}(S)= S-g\;{{\mathrm{arcsinh}}}(S)-L=0. \end{aligned}$$
(1.1)

Once that is done, we can compute y as \(y=\frac{p}{\sqrt{e^2-1}}S\) and, since \(\cosh (H)=\sqrt{1+\sinh ^2(H)}\), x as \(x=\frac{p}{1-e^2}(\sqrt{1+S^2}-e)\).
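For illustration, this final conversion can be sketched as follows (function names are ours); the test below confirms that it agrees with the direct formulas in terms of \(H\), since \(\cosh (H)=\sqrt{1+\sinh ^2(H)}\):

```python
import math

def hyperbolic_xy(p, e, S):
    """Coordinates (x, y) from S = sinh(H) for a hyperbolic orbit (e > 1)."""
    x = p / (1 - e * e) * (math.sqrt(1 + S * S) - e)
    y = p / math.sqrt(e * e - 1) * S
    return x, y
```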

While there are many articles discussing solutions for the elliptic case (see Avendano et al. 2014; Colwell 1993; Danby and Burkardt 1983; Montenbruck and Pfleger 1994; Ng 1979; Odell and Gooding 1986; Taff and Brennan 1989, among others), the hyperbolic case has received less attention. Prussing (1977) and Serafin (1986) gave upper bounds for the actual solution of the hyperbolic Kepler’s equation, which can be used as starters for Newton’s method since \(f_{g,L}''\ge 0\). Gooding and Odell (1988) solved the hyperbolic Kepler’s equation by using Newton’s method starting from a well-tuned formula depending on the parameters g and L. Their approach achieves a relative accuracy of \(10^{-20}\) with only two iterations. Although their starter is a single formula that works on the entire region \((0,1)\times [0,\infty )\), it is too complicated to allow an efficient implementation.

In contrast, Fukushima (1997) focused on the efficiency and simplicity of the starter rather than trying to find a universal formula, and produced a starter that is defined by different formulas, each valid in a stripe-like region. He showed that his starter converges under Newton’s method, but not necessarily at quadratic speed.

More recently, Farnocchia et al. (2013) gave a method to avoid numerical issues that sometimes arise when applying Newton’s method to near-parabolic orbits, using instead a non-singular iterative technique that avoids round-off problems.

Our approach to solve Eq. (1.1) is to use Newton’s method starting from a value \(\widetilde{S}(g,L)\) that is close enough to the actual solution S(gL) to guarantee quadratic convergence speed, i.e. if \(S_n\) denotes the value obtained after n iterations, then \(|S_n-S|\le 0.5^{2^n-1}|\widetilde{S}-S|\). We use a simple criterion, Smale’s \(\alpha \)-test (Smale 1986) with the constant \(\alpha _0\) improved by Wang and Han (1989), to decide whether a starter gives the claimed convergence rate. Values that satisfy the test are called approximate zeros.

Definition 1.1

(Smale’s \(\alpha \)-test) Let \(f:(a,b)\subseteq \mathbb {R}\rightarrow \mathbb {R}\) be an infinitely differentiable function. The value \(z\in (a,b)\) is an approximate zero of f if it satisfies the following condition

$$\begin{aligned} \alpha (f,z)=\beta (f,z) \cdot \gamma (f,z) <\alpha _0, \end{aligned}$$

where

$$\begin{aligned} \beta (f,z)=\left| \frac{f(z)}{f'(z)} \right| ,\quad \gamma (f,z)=\sup _{k \ge 2} \left| \frac{f^{(k)}(z)}{k!f'(z)}\right| ^{\frac{1}{k-1}} \end{aligned}$$

and \(\alpha _0=3-2\sqrt{2}\approx 0.1715728\).
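As a worked illustration of the definition (our example, not from the paper), consider \(f(z)=z^2-c\): all derivatives beyond the second vanish, so the supremum in \(\gamma \) reduces to the single term \(k=2\), namely \(|f''(z)/(2!f'(z))|=1/(2|z|)\):

```python
import math

ALPHA0 = 3 - 2 * math.sqrt(2)  # ~0.1715728, the Wang-Han constant

def alpha_quadratic(z, c):
    """alpha(f, z) for f(z) = z^2 - c; only k = 2 contributes to gamma."""
    beta = abs((z * z - c) / (2 * z))
    gamma = 1 / (2 * abs(z))
    return beta * gamma
```

For instance, \(z=1.5\) passes the test for \(f(z)=z^2-2\) (\(\alpha \approx 0.028\)), so Newton's method started there converges to \(\sqrt{2}\) at the guaranteed quadratic rate, while \(z=0.1\) fails it.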

In an earlier paper (Avendano et al. 2014), we obtained the following simple approximate zero \(\widetilde{E}(e,M)\) for the elliptic case.

Theorem 1.2

The starter

is an approximate zero of \(E-e\sin (E)-M\) for all \(e\in [0,1)\) and \(M\in [0,\pi ]\).

Our main result in this paper, proven in Sect. 2, is a starter for the hyperbolic Kepler’s equation that is a piecewise-defined function in eight stripe-like regions. In seven of them, we use only linear expressions in g and L, while in the last one we need a cube root and a square root. Covering the whole domain with only linear expressions is impractical because, as we approach the corner, the stripes become thinner, so a large number of them would be needed. Moreover, numerical evidence suggests that it is impossible to cover a neighborhood of the corner \(g=1\), \(L=0\) with only linear expressions.

Theorem 1.3

The starter

$$\begin{aligned} \widetilde{S}(g,L)=\left\{ \begin{array}{l@{\quad }l} L+2.30 \, g &{} \text {if }\, 4-1.9 \, g < L \\ L+1.90 \, g &{} \text {if }\, 2.74 - 1.56\,g < L \le 4 - 1.9\,g \\ L+1.56 \, g &{} \text {if }\, 2.01 - 1.33\,g < L \le 2.74 - 1.56\,g \\ L+1.33 \, g &{} \text {if }\, 1.60 - 1.16\,g < L \le 2.01 - 1.33\,g \\ L+1.16 \, g &{} \text {if }\, 1.32 - 1.02\,g < L \le 1.60 - 1.16\,g\\ L+1.02 \, g &{} \text {if }\, 1.12 - 0.91\,g < L \le 1.32 - 1.02\,g \\ L+0.91 \, g &{} \text {if }\, 1-\frac{5}{6} \,g < L \le 1.12 - 0.91\,g \\ \text {the exact solution of } \\ (1-g) \widetilde{S}+g\frac{\widetilde{S}^3}{6}=L &{} \text {if }\, 0 \le L \le 1-\frac{5}{6} \, g\\ \end{array} \right. \end{aligned}$$

is an approximate zero of \(f_{g,L}\) for all \(g\in (0,1)\) and \(L\in [0,\infty )\) (Fig. 1).

Fig. 1

The regions of Theorem 1.3 are represented in different colors. The uppermost region (in dark red) is actually unbounded

Remark 1.4

The exact solution \(\widetilde{S}\) of the cubic equation \((1-g)\widetilde{S}+g\frac{\widetilde{S}^3}{6} =L\) is given by

$$\begin{aligned} \widetilde{S}(g,L) = \root 3 \of {\frac{3L}{g}+\sqrt{\frac{9L^2}{g^2}+\frac{8(1-g)^3}{g^3}}} +\root 3 \of {\frac{3L}{g}-\sqrt{\frac{9L^2}{g^2}+\frac{8(1-g)^3}{g^3}}}, \end{aligned}$$

which can also be written as

$$\begin{aligned} \widetilde{S}(g,L) = A-\frac{2(1-g)}{gA}, \end{aligned}$$

where \(A=\root 3 \of {\frac{3L}{g}+\sqrt{\frac{9L^2}{g^2}+\frac{8(1-g)^3}{g^3}}} \).
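Both pieces can be combined into a short solver. The following sketch (helper names are ours; the formulas are those of Theorem 1.3 and Remark 1.4) computes the starter, using the second form of the cubic root above, and then runs a few Newton iterations on \(f_{g,L}\):

```python
import math

def starter(g, L):
    """Piecewise starter of Theorem 1.3, for 0 < g < 1 and L >= 0."""
    for a, lower in ((2.30, 4 - 1.9 * g),
                     (1.90, 2.74 - 1.56 * g),
                     (1.56, 2.01 - 1.33 * g),
                     (1.33, 1.60 - 1.16 * g),
                     (1.16, 1.32 - 1.02 * g),
                     (1.02, 1.12 - 0.91 * g),
                     (0.91, 1 - 5 * g / 6)):
        if L > lower:
            return L + a * g
    # cubic region: closed form of Remark 1.4 in the shape A - 2(1-g)/(gA)
    t = 3 * L / g
    A = (t + math.sqrt(t * t + 8 * (1 - g) ** 3 / g ** 3)) ** (1 / 3)
    return A - 2 * (1 - g) / (g * A)

def solve_hyperbolic_kepler(g, L, iterations=6):
    """Newton's method on f(S) = S - g*arcsinh(S) - L from the starter."""
    S = starter(g, L)
    for _ in range(iterations):
        S -= (S - g * math.asinh(S) - L) / (1 - g / math.sqrt(1 + S * S))
    return S
```

Since the starter is an approximate zero, a handful of iterations already drives the residual to the round-off level across the whole parameter range.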

For any bounded region of \((0,1) \times [0,\infty )\) that excludes a neighborhood of \(g=1, L=0\), we give in Sect. 3 a construction of another piecewise-defined starter using only constants.

Theorem 1.5

For any \(0<\varepsilon <\frac{1}{4}\) and \(L_{\max }>\varepsilon '= \frac{\sqrt{3}\alpha _0\varepsilon ^\frac{3}{2}}{(1-\varepsilon )^\frac{1}{2}}\), there is a piecewise constant function \(\widetilde{S}\) defined in \(R=(0,1)\times [0,L_{\max }] \setminus (1-\varepsilon ,1) \times [0,\varepsilon ']\) that is an approximate zero of \(f_{g,L}\).

The previous result is important for building look-up table solutions of the hyperbolic Kepler’s equation. On the one hand, the starter obtained in this way is valid only on a bounded region of the parameter space, since L must be bounded above by a predetermined value \(L_{\max }\). On the other hand, once the look-up table has been built, the starter can be retrieved immediately for any point within the region. The starter of Theorem 1.3 is preferable if the region is not known in advance or if we are interested in solving the equation only a few times.

The method of Theorem 1.5 cannot be extended indefinitely to include a neighborhood of the corner, as shown in the following result.

Theorem 1.6

Let \(U \subseteq (0,1) \times [0,\infty )\) be an open set such that \(\bar{U} \supseteq \{ 1 \} \times [0,\varepsilon ]\) for some \(\varepsilon >0\). For any constant starter \(S_0\ge 0\), there exists a point \((g_0,L_0) \in U\) such that \(S_0\) is not an approximate zero of \(f_{g_0,L_0}\).

Finally, in Sect. 4, we include a comparison of the CPU timings and number of Newton’s method iterations of the starter of Theorem 1.3 with the ones given by Gooding and Odell (1988) and Fukushima (1997).

2 Starters for the hyperbolic Kepler’s equation

The aim of this section is to prove Theorem 1.3. To do that, we first prove some necessary technical results. Then we show in Theorem 2.4 that the solution of the cubic equation \((1-g)\widetilde{S}+g\frac{\widetilde{S}^3}{6} =L\) is an approximate zero when \(0\le L\le 1-\frac{5}{6} g\). Finally, we study in Theorem 2.5 the family of starters \(\widetilde{S}=L+ag\), and in Corollary 2.6 those with \(a \in \{0.91, 1.02, 1.16, 1.33, 1.56, 1.90, 2.30 \}\).

Lemma 2.1

For all \(k\ge 1\), we have

$$\begin{aligned} \frac{d^k}{dx^k}({{\mathrm{arcsinh}}}(x))=(1+x^2)^{\frac{1}{2}-k} P_k(x), \end{aligned}$$

where \(P_k (x)\) are polynomials of degree \(k-1\). The monomials of \(P_k(x)\) are either all of even degree or all of odd degree:

$$\begin{aligned} {P_k(x)=c^{(k)}_{k-1}x^{k-1}+c^{(k)}_{k-3} x^{k-3}+c^{(k)}_{k-5}x^{k-5}+\cdots } \end{aligned}$$

The leading coefficient is \((-1)^{k-1} (k-1)!\), the coefficients alternate in sign, and the sum of the absolute values of all the coefficients is \(||P_k (x) ||_1 =1\cdot 3 \cdot 5\cdots (2k-3)=(2k-3)!!\). The constant term satisfies

$$\begin{aligned} |P_k(0)|= {\left\{ \begin{array}{ll} ((k-2)!!)^2 &{}\quad \text { if }k \text { is odd,}\\ 0 &{}\quad \text { if }{ k}\text { is even.}\\ \end{array}\right. } \end{aligned}$$

Proof

We proceed by induction. Note that \({{\mathrm{arcsinh}}}'(x)=(1+x^2)^{-\frac{1}{2}}\), so the Lemma holds for \(k=1\) by setting \(P_1(x)=1\). Assume now that the Lemma is true for some \(k\ge 1\). Differentiating the k-th derivative, we get

$$\begin{aligned} \frac{d^{k+1}}{dx^{k+1}}{{\mathrm{arcsinh}}}(x)=(1+x^2)^{\frac{1}{2}-(k+1)}\left[ (1-2k)xP_k(x)+(1+x^2)P'_k(x)\right] , \end{aligned}$$

so we define \(P_{k+1}(x)=(1-2k)xP_k(x)+(1+x^2)P'_k(x)\). This shows that \(P_{k+1}(x)\) is a polynomial of degree at most k. Moreover, if \(P_k(x)\) has only terms of odd (or even) degree, then \(P_{k+1}(x)\) has only terms of even (or odd) degree, respectively. A more careful study of the leading coefficient of \(P_{k+1}(x)\) from the recurrence gives

$$\begin{aligned} {c_{k}^{(k+1)}=(1-2k)c_{k-1}^{(k)}+(k-1)c_{k-1}^{(k)}=(-k) c_{k-1}^{(k)}=(-k)(-1)^{k-1}(k-1)!=(-1)^kk!,} \end{aligned}$$

showing that the degree of \(P_{k+1}(x)\) is k and that the leading coefficient of \(P_{k+1}(x)\) is \((-1)^kk!\), as we needed for the inductive step. The other coefficients of \(P_{k+1}(x)\) can be also found by using the recurrence:

$$\begin{aligned} { c_{k-2r}^{(k+1)}=(-k-2r)c_{k-2r-1}^{(k)}+(k-2r+1)c_{k-2r+1}^{(k)}} \end{aligned}$$

for \(r=1,\ldots ,\left[ \frac{k}{2}\right] \). Since \(-k-2r<0\), \(k-2r+1>0\), and the signs of \({c_{k-2r-1}^{(k)}}\) and \({c_{k-2r+1}^{(k)}}\) are different, we have \({\mathrm{sgn}(c_{k-2r}^{(k+1)})=\mathrm{sgn}(c_{k-2r+1}^{(k)})}\), which alternate by the inductive hypothesis. Finally, for a polynomial with coefficients of alternating signs and all even (or odd) exponents, we have \(||P_{k+1}(x)||_1=|P_{k+1}(i)|=|(1-2k)iP_k(i)|=(2k-1)||P_k(x)||_1=(2k-1)!!\), where \( i=\sqrt{-1}\) is the imaginary unit.

According to Jeffrey and Dai (2008), the power series expansion of \({{\mathrm{arcsinh}}}(x)\) at \(x=0\) is

$$\begin{aligned} {{\mathrm{arcsinh}}}(x)=\sum _{n\ge 0}\frac{(-1)^n(2n-1)!!}{(2n+1)(2n)!!}x^{2n+1} =\sum _{k\ge 0} \frac{P_k(0)}{k!} x^k, \end{aligned}$$

so \(P_{2n}(0)=0\), and \(P_{2n+1}(0)=\frac{(-1)^n(2n-1)!!(2n+1)!}{(2n+1)(2n)!!}=(-1)^n(2n-1)!!^2\). \(\square \)

Lemma 2.2

For any real number \(x\ne 0\) and any integer \(n\ge 2\), we have

$$\begin{aligned} \sup _{k\ge n} \left| \frac{x}{k} \right| ^{\frac{1}{k-1}} =\max \left\{ 1, \left| \frac{x}{n} \right| ^{\frac{1}{n-1}} \right\} . \end{aligned}$$

Proof

Assume first that \(|x|\le n\). Then \(\max \{ 1,\left| \frac{x}{n} \right| ^{\frac{1}{n-1}} \}=1\). Moreover,

$$\begin{aligned} \sup _{k\ge n} \left| \frac{x}{k} \right| ^{\frac{1}{k-1}} \le \sup _{k\ge n} \left| \frac{x}{n} \right| ^{\frac{1}{k-1}} \le 1 \end{aligned}$$

and

$$\begin{aligned} \sup _{k\ge n} \left| \frac{x}{k} \right| ^{\frac{1}{k-1}} \ge \lim _{k \rightarrow \infty } \left| \frac{x}{k} \right| ^{\frac{1}{k-1}}=e^{\lim \limits _{k \rightarrow \infty } \frac{\log |x|-\log |k|}{k-1} }=e^0=1. \end{aligned}$$

Therefore,

$$\begin{aligned} \sup _{k\ge n} \left| \frac{x}{k} \right| ^{\frac{1}{k-1}}=1=\max \left\{ 1, \left| \frac{x}{n} \right| ^{\frac{1}{n-1}} \right\} . \end{aligned}$$

Now consider the case \(|x| >n\). Then \(\max \left\{ 1, \left| \frac{x}{n} \right| ^{\frac{1}{n-1}} \right\} =\left| \frac{x}{n} \right| ^{\frac{1}{n-1}}\). Moreover,

$$\begin{aligned} \sup _{k\ge n} \left| \frac{x}{k} \right| ^{\frac{1}{k-1}} \ge \left| \frac{x}{n} \right| ^{\frac{1}{n-1}} \end{aligned}$$

since the supremum is greater than or equal to the first term, and

$$\begin{aligned} \sup _{k\ge n} \left| \frac{x}{k} \right| ^{\frac{1}{k-1}} \le \sup _{k\ge n} \left| \frac{x}{n} \right| ^{\frac{1}{k-1}} \le \left| \frac{x}{n} \right| ^{\frac{1}{n-1}} \end{aligned}$$

since the sequence \({\left| \frac{x}{n} \right| ^{\frac{1}{k-1}}}\) is decreasing with respect to k. Therefore,

$$\begin{aligned} \sup _{k\ge n} \left| \frac{x}{k} \right| ^{\frac{1}{k-1}} =\left| \frac{x}{n} \right| ^{\frac{1}{n-1}}= \max \left\{ 1, \left| \frac{x}{n} \right| ^{\frac{1}{n-1}} \right\} , \end{aligned}$$

which concludes the proof. \(\square \)
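The closed form can also be corroborated numerically by truncating the supremum at a large index (an illustrative sketch; for \(|x|\le n\) the truncated maximum only approaches the limit value 1 from below, hence the loose tolerance):

```python
import math

def sup_closed_form(x, n):
    """Right-hand side of Lemma 2.2."""
    return max(1.0, abs(x / n) ** (1 / (n - 1)))

def sup_truncated(x, n, K=5000):
    """max over n <= k <= K of |x/k|^(1/(k-1)), approximating the supremum."""
    return max(abs(x / k) ** (1 / (k - 1)) for k in range(n, K + 1))
```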

The previous two lemmas give us the following upper bound for \(\gamma (f_{g,L},\widetilde{S})\).

Lemma 2.3

For all \(n\ge 2\), and \((g,L) \in (0,1) \times [0,\infty )\),

$$\begin{aligned} \gamma (f_{g,L},\widetilde{S}) \le \frac{1}{1+\widetilde{S}^2} \max \left\{ a_2, \ldots , a_{n-1}, 2\max \{1,|\widetilde{S}|\} \max \left\{ 1, \left( \frac{g}{n\left( \sqrt{1+\widetilde{S}^{2}}-g\right) }\right) ^{\frac{1}{n-1}} \right\} \right\} , \end{aligned}$$

where

$$\begin{aligned} a_k=\left( \frac{g|P_k(\widetilde{S})|}{k! \left( \sqrt{1+\widetilde{S}^2}-g\right) } \right) ^{\frac{1}{k-1}}. \end{aligned}$$

Proof

By Lemma 2.1,

$$\begin{aligned} \gamma (f_{g,L},\widetilde{S})&=\sup _{k \ge 2} \left| \frac{f_{g,L}^{(k)}(\widetilde{S})}{k!f_{g,L}'(\widetilde{S})}\right| ^{\frac{1}{k-1}} =\sup _{k \ge 2} \left| \frac{g P_k (\widetilde{S}) (1+\widetilde{S}^2)^{\frac{1}{2}-k}}{k! \left( 1-\frac{g}{{}\sqrt{{}1+\widetilde{S}^2}}\right) }\right| ^{\frac{1}{k-1}}\\&=\sup _{k \ge 2} \left| \frac{g P_k (\widetilde{S}) (1+\widetilde{S}^2)^{1-k}}{k! \left( \sqrt{1+\widetilde{S}^2}-g\right) } \right| ^{\frac{1}{k-1}} =\frac{1}{1+\widetilde{S}^2}\sup _{k \ge 2} \left| \frac{g P_k (\widetilde{S})}{k! \left( \sqrt{1+\widetilde{S}^2}-g\right) } \right| ^{\frac{1}{k-1}}\\&=\frac{1}{{}1+\widetilde{S}^2} \max \left\{ a_2,\ldots ,a_{n-1}, \sup _{k \ge n} a_k \right\} . \end{aligned}$$

Again by Lemma 2.1,

$$\begin{aligned} a_k&\le \left| \frac{g \max \{1,|\widetilde{S}| \}^{k-1} \cdot (2k-3)!!}{k! \left( \sqrt{1+\widetilde{S}^2}-g\right) } \right| ^{\frac{1}{k-1}} \le \max \{1,|\widetilde{S}| \} \left| \frac{g \cdot (2k-2)!!}{k! \left( \sqrt{1+\widetilde{S}^2}-g\right) } \right| ^{\frac{1}{k-1}}\\&= \max \{1,|\widetilde{S}| \} \left| \frac{g \cdot 2^{k-1}}{k\left( \sqrt{1+\widetilde{S}^2} -g\right) } \right| ^{\frac{1}{k-1}} =2\max \{1,|\widetilde{S}| \} \left| \frac{g }{k\left( \sqrt{1+\widetilde{S}^2} -g\right) } \right| ^{\frac{1}{k-1}}. \end{aligned}$$

By Lemma 2.2,

$$\begin{aligned} \sup _{k\ge n} a_k&\le 2\max \{1,|\widetilde{S}| \} \sup _{k\ge n} \left| \frac{g }{k\left( \sqrt{1+\widetilde{S}^2} -g\right) } \right| ^{\frac{1}{k-1}}\\&=2\max \{1,|\widetilde{S}| \} \max \left\{ 1, \left( \frac{g }{n \left( \sqrt{1+\widetilde{S}^2} -g\right) } \right) ^{\frac{1}{n-1}} \right\} . \end{aligned}$$

\(\square \)

Theorem 2.4

The exact solution \(\widetilde{S}\) of the cubic equation \((1-g)\widetilde{S}+g\frac{\widetilde{S}^3}{6} =L\) is an approximate zero of \(f_{g,L}\) in the region \(R_1\), where

$$\begin{aligned} R_{1} =\left\{ 0 < g < 1, \; 0 \le L \le 1-\frac{5}{6}g \right\} . \end{aligned}$$

Proof

For any point \((g,L) \in R_1\), the solution \(\widetilde{S}(g,L)\) of the cubic equation satisfies \(\widetilde{S} \in [0,1]\). Therefore,

$$\begin{aligned} | f_{g,L}(\widetilde{S})|&= \left| \widetilde{S} -g {{\mathrm{arcsinh}}}(\widetilde{S}) -L \right| \\&=\left| \left( (1-g) \widetilde{S}+\frac{g}{6} \widetilde{S}^3 -L \right) - g \left( {{\mathrm{arcsinh}}}(\widetilde{S}) -\widetilde{S}+\frac{1}{6} \widetilde{S}^3\right) \right| \\&=g \left( {{\mathrm{arcsinh}}}(\widetilde{S}) -\widetilde{S}+\frac{1}{6} \widetilde{S}^3 \right) \le {{\mathrm{arcsinh}}}(\widetilde{S}) -\widetilde{S}+\frac{1}{6} \widetilde{S}^3. \end{aligned}$$

Using the previous inequality in the definition of \(\beta (f_{g,L},\widetilde{S})\) and the fact that \(g<1\), we obtain

$$\begin{aligned} \beta (f_{g,L},\widetilde{S})= \left| \frac{f(\widetilde{S})}{f'(\widetilde{S})} \right| \le \frac{{{\mathrm{arcsinh}}}(\widetilde{S}) -\widetilde{S}+\frac{1}{6} \widetilde{S}^3}{1-\frac{1}{\sqrt{1+\widetilde{S}^2}}}= \frac{\left( {{\mathrm{arcsinh}}}(\widetilde{S}) -\widetilde{S}+\frac{1}{6} \widetilde{S}^3 \right) \sqrt{1+\widetilde{S}^2}}{ \sqrt{1+\widetilde{S}^2} -1}. \end{aligned}$$

On the other hand, it follows from Lemma 2.3 for \(n=3\), \(g<1\) and \(|\widetilde{S}| \le 1\) that

$$\begin{aligned} \gamma (f_{g,L},\widetilde{S})&\le \frac{1}{1+\widetilde{S}^2} \max \left\{ \frac{\widetilde{S}}{2 (\sqrt{1+\widetilde{S}^2}-1 )}, 2 \max \left\{ 1, \left( \frac{1}{3({}\sqrt{{}1+\widetilde{S}^2}-1)}\right) ^{\frac{1}{2}} \right\} \right\} \\&=\frac{1}{1+\widetilde{S}^2} \max \left\{ 2, \frac{2}{\sqrt{3} (\sqrt{1+\widetilde{S}^2}-1)^{\frac{1}{2}} } \right\} , \end{aligned}$$

since \(\frac{\widetilde{S}}{2 \left( \sqrt{1+\widetilde{S}^2}-1 \right) } \le \frac{2}{\sqrt{3} \left( \sqrt{1+\widetilde{S}^2}-1\right) ^{\frac{1}{2}} }\).

Multiplying the inequalities for \(\beta (f_{g,L},\widetilde{S})\) and \(\gamma (f_{g,L},\widetilde{S})\) together, we have that the \(\alpha \)-test is satisfied if

$$\begin{aligned} \frac{\left( {{\mathrm{arcsinh}}}(\widetilde{S}) -\widetilde{S}+\frac{1}{6} \widetilde{S}^3 \right) }{\sqrt{1+\widetilde{S}^2} \left( \sqrt{1+\widetilde{S}^2} -1 \right) } \max \left\{ 2, \frac{2}{\sqrt{3} (\sqrt{1+\widetilde{S}^2}-1)^{\frac{1}{2}} } \right\} < \alpha _0. \end{aligned}$$

A standard analytic study of the previous one-variable function shows that it is bounded above by \(\alpha _0\) in the interval \(\widetilde{S} \in [0,1]\). \(\square \)
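This one-variable study can be corroborated numerically. The sketch below (ours) evaluates the left-hand side of the last inequality on a fine grid of \((0,1]\); the expression tends to 0 as \(\widetilde{S}\rightarrow 0^+\), so the grid starts just above 0, and the maximum (attained near \(\widetilde{S}=1\), roughly 0.164) stays below \(\alpha _0\):

```python
import math

ALPHA0 = 3 - 2 * math.sqrt(2)

def lhs(s):
    """Left-hand side of the final inequality in the proof of Theorem 2.4."""
    r = math.sqrt(1 + s * s)
    num = math.asinh(s) - s + s ** 3 / 6
    return num / (r * (r - 1)) * max(2.0, 2 / (math.sqrt(3) * math.sqrt(r - 1)))

grid_max = max(lhs(k / 10000) for k in range(1, 10001))
```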

Theorem 2.5

The starter \(\widetilde{S}(g,L)=L+a \,g\) is an approximate zero of \(f_{g,L}\) in the stripe

$$\begin{aligned} R_2 (a) =\{0< g < 1, \widetilde{S}_{\min }(a) \le L +a g \le \widetilde{S}_{\max } (a) \} \end{aligned}$$

if \(\widetilde{S}_{\min }(a),\widetilde{S}_{\max }(a)\) are chosen such that

$$\begin{aligned} \frac{ |a-{{\mathrm{arcsinh}}}(x)|}{{}\sqrt{{}1+x^2}({}\sqrt{{}1+x^2}- 1)^2}\max \left\{ 1, x , 2x(\sqrt{1+x^2}-1) \right\} <\alpha _0 \end{aligned}$$

for all \(x\in [\widetilde{S}_{\min }(a),\widetilde{S}_{\max }(a)]\).

Proof

By definition, we have

$$\begin{aligned} \beta (f_{g,L},\widetilde{S})=\left| \frac{f_{g,L}(\widetilde{S})}{f_{g,L}'(\widetilde{S})} \right| =\frac{g |a-{{\mathrm{arcsinh}}}(\widetilde{S})|}{1-\frac{g}{\sqrt{1+\widetilde{S}^2}}} \le \frac{ |a-{{\mathrm{arcsinh}}}(\widetilde{S})|}{1-\frac{1}{\sqrt{1+\widetilde{S}^2}}}. \end{aligned}$$

By Lemma 2.3 with \(n=2\),

$$\begin{aligned} \gamma (f_{g,L},\widetilde{S}) \le \frac{2}{1+\widetilde{S}^2} \max \{1,\widetilde{S} \} \max \left\{ 1, \frac{1}{2(\sqrt{1+\widetilde{S}^2}-1)} \right\} . \end{aligned}$$

Therefore, we obtain that

$$\begin{aligned} \alpha (f_{g,L},\widetilde{S})&=\beta (f_{g,L},\widetilde{S}) \cdot \gamma (f_{g,L},\widetilde{S}) \\&\le \frac{2 |a-{{\mathrm{arcsinh}}}(\widetilde{S})|}{\sqrt{1+\widetilde{S}^2}(\sqrt{1+\widetilde{S}^2}- 1)} \max \{1,\widetilde{S} \} \max \left\{ 1, \frac{1}{2(\sqrt{1+\widetilde{S}^2}-1)} \right\} \\&= \frac{ |a-{{\mathrm{arcsinh}}}(\widetilde{S})|}{\sqrt{1+\widetilde{S}^2}(\sqrt{1+\widetilde{S}^2}- 1)^2} \max \left\{ 1, \widetilde{S} , 2\widetilde{S}(\sqrt{1+\widetilde{S}^2}-1) \right\} , \end{aligned}$$

which is less than \(\alpha _0\) in \(R_2(a)\) when \(\widetilde{S} \in [\widetilde{S}_{\min }(a),\widetilde{S}_{\max }(a)]\). \(\square \)

Corollary 2.6

The starter \(\widetilde{S}(g,L)=L+a \,g\) is an approximate zero of \(f_{g,L}\) in the stripe

$$\begin{aligned} R_2 (a) =\{0< g < 1, \widetilde{S}_{\min }(a) \le L +a g \le \widetilde{S}_{\max } (a) \} \end{aligned}$$

for the values of a, \(\widetilde{S}_{\min }(a)\), \(\widetilde{S}_{\max }(a)\) given in the following table:

a      \(\widetilde{S}_{\min }(a)\)      \(\widetilde{S}_{\max }(a)\)
0.91      0.99      1.12
1.02      1.12      1.32
1.16      1.32      1.60
1.33      1.59      2.01
1.56      2.00      2.74
1.90      2.73      4.00
2.30      4.00      \(\infty \)

Proof (of Theorem 1.3)

By Theorem 2.4, \(\widetilde{S}(g,L)\) is an approximate zero of \(f_{g,L}\) when \(0\le L \le 1-\frac{5}{6} g\). For the remaining cases, we use Corollary 2.6. It is easy to verify that for each \(a \in \{ 0.91, 1.02, 1.16, 1.33, 1.56, 1.90, 2.30\}\), the region where \(\widetilde{S}\) is defined as \(L+a \, g\) is contained in \(R_2(a)\). \(\square \)
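Since all boundaries involved are linear in g, the containments claimed in this proof can be verified by checking only the endpoints \(g=0\) and \(g=1\). A sketch of that check, with the data transcribed from Theorem 1.3 and Corollary 2.6 (variable names are ours):

```python
# Each row is (a, S_min(a), S_max(a), (c0, c1)), where the stripe of the
# starter L + a*g is  c0 + c1*g < L <= c0' + c1'*g  (c0', c1' from next row).
ROWS = [
    (0.91, 0.99, 1.12, (1.0, -5.0 / 6.0)),
    (1.02, 1.12, 1.32, (1.12, -0.91)),
    (1.16, 1.32, 1.60, (1.32, -1.02)),
    (1.33, 1.59, 2.01, (1.60, -1.16)),
    (1.56, 2.00, 2.74, (2.01, -1.33)),
    (1.90, 2.73, 4.00, (2.74, -1.56)),
    (2.30, 4.00, float("inf"), (4.0, -1.9)),
]

def containment_holds():
    """Check S_min(a) <= L + a*g <= S_max(a) on each stripe at g = 0 and g = 1;
    all boundaries are linear in g, so the endpoints suffice."""
    for i, (a, smin, smax, (c0, c1)) in enumerate(ROWS):
        for g in (0.0, 1.0):
            if c0 + (c1 + a) * g < smin:        # lower edge of the stripe
                return False
            if i + 1 < len(ROWS):
                u0, u1 = ROWS[i + 1][3]
                if u0 + (u1 + a) * g > smax:    # upper edge of the stripe
                    return False
    return True
```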

3 An analytical study of piecewise constant starters

Theorem 3.1

The constant starter \(\widetilde{S}(g,L)=S_0 >0\) is an approximate zero of \(f_{g,L}\) in the stripe

$$\begin{aligned} R_3(S_0)=\left\{ 0<g<1, S_0-g{{\mathrm{arcsinh}}}(S_0) -\varDelta (S_0) <L < S_0-g{{\mathrm{arcsinh}}}(S_0) +\varDelta (S_0) \right\} , \end{aligned}$$

where

$$\begin{aligned} \varDelta (S_0)= {\left\{ \begin{array}{ll} \frac{\sqrt{3}\,\alpha _0 S_0^2 \sqrt{1+S_0^2}\left( \sqrt{1+S_0^2}-1\right) ^{\frac{1}{2}}}{2\left( \sqrt{1+S_0^2}+1\right) } &{}\quad \text { if } 0<S_0\le \frac{\sqrt{7}}{3},\\ \frac{\alpha _0 S_0 \min \{1,S_0\} \sqrt{1+S_0^2}}{2\left( \sqrt{1+S_0^2}+1\right) } &{}\quad \text { if } S_0>\frac{\sqrt{7}}{3}.\\ \end{array}\right. } \end{aligned}$$

The constant starter \(\widetilde{S}(g,L)=0\) is an approximate zero of \(f_{g,L}\) in the region

$$\begin{aligned} R_3 (0)=\left\{ 0<g<1, 0 \le L < \min \left\{ \alpha _0 (1-g), \frac{\sqrt{3} \alpha _0 (1-g)^\frac{3}{2} }{g^\frac{1}{2}} \right\} \right\} . \end{aligned}$$

Proof

Consider first the case \(S_0>0\). By Definition 1.1 and the fact that \(g<1\),

$$\begin{aligned} \beta (f_{g,L},\widetilde{S})&=\left| \frac{f_{g,L}(\widetilde{S})}{f_{g,L}'(\widetilde{S})} \right| =\left| \frac{S_0-g{{\mathrm{arcsinh}}}(S_0)-L}{1-\frac{g}{\sqrt{1+S_0^2}}} \right| \\&\le \frac{\left| S_0-g{{\mathrm{arcsinh}}}(S_0)-L\right| }{1-\frac{1}{\sqrt{1+S_0^2}}}\\&=\frac{1}{S_0^2}\left| S_0-g{{\mathrm{arcsinh}}}(S_0)-L\right| \sqrt{1+S_0^2} \left( \sqrt{1+S_0^2}+1 \right) . \end{aligned}$$

On the other hand, it follows from Lemma 2.3 with \(n=3\) that

$$\begin{aligned} \begin{aligned} \gamma (f_{g,L},\widetilde{S})&\le \frac{1}{1+S_0^2} \max \left\{ \frac{ S_0}{2\left( \sqrt{1+S_0^2}-1\right) }, 2\max \{ 1,S_0 \} \max \left\{ 1,\left( \frac{1}{3\left( \sqrt{1+S_0^2}-1\right) }\right) ^{\frac{1}{2}} \right\} \right\} \\&=\frac{2}{1+S_0^2} \max \{ 1,S_0 \} \max \left\{ 1,\left( \frac{1}{3\left( \sqrt{1+S_0^2}-1\right) }\right) ^{\frac{1}{2}} \right\} . \end{aligned} \end{aligned}$$

We will now distinguish between \(S_0\ge 1\) and \(0 < S_0 < 1\). In the first case, \(\gamma (f_{g,L},\widetilde{S}) \le \frac{2S_0}{1+S_0^2}\), so the \(\alpha \)-test follows from

$$\begin{aligned} \frac{2\left( \sqrt{1+S_0^2}+1 \right) \left| S_0-g{{\mathrm{arcsinh}}}(S_0)-L\right| }{S_0\sqrt{1+S_0^2}} <\alpha _0, \end{aligned}$$

which is equivalent to

$$\begin{aligned} \left| S_0-g{{\mathrm{arcsinh}}}(S_0)-L \right|&<\frac{\alpha _0 S_0 \sqrt{1+S_0^2}}{2\left( \sqrt{1+S_0^2}+1 \right) }=\varDelta (S_0). \end{aligned}$$

When \(0 < S_0 < 1\), \(\gamma (f_{g,L},\widetilde{S}) \le \frac{2}{1+S_0^2} \max \left\{ 1,\left( \frac{1}{3(\sqrt{1+S_0^2}-1)}\right) ^{\frac{1}{2}} \right\} \) so the \(\alpha \)-test is satisfied if

$$\begin{aligned} \left| S_0-g{{\mathrm{arcsinh}}}(S_0)-L\right| \frac{2\left( \sqrt{1+S_0^2}+1 \right) }{S_0^2\sqrt{1+S_0^2}} \max \left\{ 1,\left( \frac{1}{3(\sqrt{1+S_0^2}-1)}\right) ^{\frac{1}{2}} \right\} <\alpha _0, \end{aligned}$$

which is equivalent to \(\left| S_0-g{{\mathrm{arcsinh}}}(S_0)-L\right| <\varDelta (S_0)\).

Therefore, the \(\alpha \)-test is satisfied in the whole region \(R_3 (S_0)\).

When \(\widetilde{S}(g,L)=S_0=0\), we have

$$\begin{aligned} \beta (f_{g,L},\widetilde{S})=\left| \frac{f_{g,L}(\widetilde{S})}{f_{g,L}'(\widetilde{S})} \right| =\frac{L}{1-g} \end{aligned}$$

and by Lemmas 2.1 and 2.2

$$\begin{aligned} \begin{aligned} \gamma (f_{g,L},\widetilde{S})&= \sup _{k\ge 2} \left| \frac{f^{(k)}(0)}{k! (1-g)}\right| ^{\frac{1}{k-1}} =\sup _{k\ge 2} \left| \frac{g P_k(0)}{k! (1-g)}\right| ^{\frac{1}{k-1}} =\sup _{\genfrac{}{}{0.0pt}{}{k\ge 3}{\text {odd}}} \left| \frac{g ((k-2)!!)^2}{k! (1-g)}\right| ^{\frac{1}{k-1}} \\&\le \sup _{k\ge 3} \left| \frac{g}{k (1-g)}\right| ^{\frac{1}{k-1}} = \max \left\{ 1, \sqrt{\frac{g}{3(1-g)}} \right\} . \end{aligned} \end{aligned}$$

Therefore, the \(\alpha \)-test is satisfied if

$$\begin{aligned} \frac{L}{1-g} \max \left\{ 1, \sqrt{\frac{g}{3(1-g)}} \right\} < \alpha _0, \end{aligned}$$

which is equivalent to \((g,L) \in R_3(0)\). \(\square \)

Remark 3.2

The region \(R_3(0)\) of Theorem 3.1 can be written as

$$\begin{aligned} R_3(0)= \left\{ 0 < g < \frac{3}{4}, 0\le L < \alpha _0 (1-g) \right\} \cup \left\{ \frac{3}{4} \le g < 1, 0\le L < \frac{\sqrt{3}\alpha _0 (1-g)^\frac{3}{2}}{g^\frac{1}{2}} \right\} . \end{aligned}$$
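A quick numerical check (ours) that the two expressions inside the min of Theorem 3.1 cross exactly at \(g=\frac{3}{4}\), which justifies the split above:

```python
import math

ALPHA0 = 3 - 2 * math.sqrt(2)

def cap_linear(g):
    """alpha_0 * (1 - g), the binding bound for g < 3/4."""
    return ALPHA0 * (1 - g)

def cap_curved(g):
    """sqrt(3) * alpha_0 * (1-g)^(3/2) / g^(1/2), binding for g > 3/4."""
    return math.sqrt(3) * ALPHA0 * (1 - g) ** 1.5 / math.sqrt(g)
```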

Lemma 3.3

The function \(\varDelta (S_0)\) of Theorem 3.1 is strictly increasing for \(S_0>0\) (Fig. 2).

Fig. 2

The region \(R_3(0)\) of Theorem 3.1 and Remark 3.2 is represented in red

Proof

When \(0<S_0 \le \frac{\sqrt{7}}{3}\), then

$$\begin{aligned} \varDelta (S_0)=\frac{\sqrt{3}\,\alpha _0 S_0^2 \sqrt{1+S_0^2}\left( \sqrt{1+S_0^2}-1\right) ^{\frac{1}{2}}}{2\left( \sqrt{1+S_0^2}+1\right) } =\frac{\sqrt{3}\,\alpha _0}{2} S_0^2 \left( \sqrt{1+S_0^2}-1\right) ^{\frac{1}{2}} \left( 1-\frac{1}{\sqrt{1+S_0^2}+1} \right) . \end{aligned}$$

When \(S_0 > \frac{\sqrt{7}}{3}\), then

$$\begin{aligned} \varDelta (S_0)=\frac{\alpha _0 S_0 \min \{1,S_0\} \sqrt{1+S_0^2}}{2\left( \sqrt{1+S_0^2}+1\right) } =\frac{\alpha _0}{2} S_0 \min \{1,S_0\} \left( 1-\frac{1}{\sqrt{1+S_0^2}+1} \right) . \end{aligned}$$

Since both expressions are products of a constant and strictly increasing positive functions, and \(\varDelta (S_0)\) is continuous at \(S_0=\frac{\sqrt{7}}{3}\), it follows that \(\varDelta (S_0)\) is strictly increasing. \(\square \)

Proof (of Theorem 1.5)

We decompose the region of the theorem as

$$\begin{aligned} \underbrace{(0,1-\varepsilon ) \times [0,\varepsilon ']}_{A} \cup \underbrace{(0,1) \times [\varepsilon ',L_{\max }]}_{B}. \end{aligned}$$

From Remark 3.2, we have that \(A \subseteq R_3 (0)\).

Define the sequence \(S_0=\varepsilon '\) and \(S_{i+1}=S_i +2\varDelta (S_i)\), for \(i\ge 0\), where \(\varDelta (S_i)\) is the function of Theorem 3.1. Since \(\varDelta \) is a positive and strictly increasing function by Lemma 3.3, the sequence \(S_i\) is strictly increasing and satisfies \(S_i \ge S_0+2 i \varDelta (S_0) \longrightarrow \infty \) as \(i \rightarrow \infty \).

Since \(\displaystyle {\lim _{x \rightarrow \infty }} (x-{{\mathrm{arcsinh}}}(x))=\infty \), there exists a large enough N for which \(S_N-{{\mathrm{arcsinh}}}(S_N) >L_{\max }\). Once we prove that \(\displaystyle {B \subseteq C=\bigcup _{i=0}^N R_3 (S_i)}\), with \(R_3(S_i)\) as in Theorem 3.1, we will have that \(A \cup B\) is covered by the regions where the starters \(0,S_0,\ldots ,S_N\) are approximate zeros of \(f_{g,L}\). In particular, this provides the piecewise constant starter \(\widetilde{S}\) we were looking for.

It only remains to prove that \(B \subseteq C\). Note first that \(R_3(S_i)\) are stripes whose union is

$$\begin{aligned} C = \left\{ 0<g<1, S_0-\varDelta (S_0)-g{{\mathrm{arcsinh}}}(S_0)<L<S_N+\varDelta (S_N)-g{{\mathrm{arcsinh}}}(S_N)\right\} \end{aligned}$$

since the lower boundary of \(R_3(S_{i+1})\) is a segment that is always below the upper boundary of \(R_3(S_i)\). Indeed, it is enough to see that the endpoints of the first segment, \((0,S_{i+1}-\varDelta (S_{i+1}))\) and \((1,S_{i+1}-\varDelta (S_{i+1})-{{\mathrm{arcsinh}}}(S_{i+1}))\), are below the endpoints of the second segment, \((0,S_i+\varDelta (S_i))\) and \((1,S_i+\varDelta (S_i)-{{\mathrm{arcsinh}}}(S_i))\), respectively:

$$\begin{aligned} \begin{aligned}&S_{i+1}-\varDelta (S_{i+1})<S_i+\varDelta (S_i) \iff \varDelta (S_i)<\varDelta (S_{i+1}),\\&S_{i+1}-\varDelta (S_{i+1})-{{\mathrm{arcsinh}}}(S_{i+1})<S_i+\varDelta (S_i)-{{\mathrm{arcsinh}}}(S_i) \iff \\&\qquad \iff \varDelta (S_i)+{{\mathrm{arcsinh}}}(S_i)<\varDelta (S_{i+1})+{{\mathrm{arcsinh}}}(S_{i+1}), \end{aligned} \end{aligned}$$

which is true because \(\varDelta \) and \({{\mathrm{arcsinh}}}\) are both strictly increasing.

On the other hand, the vertices of the rectangle B are in \(\bar{C}\), which means that \(B\subseteq C\). Indeed, since \({{\mathrm{arcsinh}}}(\varepsilon '),{{\mathrm{arcsinh}}}(S_N),\varDelta (\varepsilon '),\varDelta (S_N)>0\), \(\varepsilon '<L_{\max }\) by hypothesis, and \(S_N-{{\mathrm{arcsinh}}}(S_N)>L_{\max }\) by construction, we have

$$\begin{aligned} \varepsilon '-&\varDelta (\varepsilon ')-{{\mathrm{arcsinh}}}(\varepsilon ')< \varepsilon '-\varDelta (\varepsilon ')< \varepsilon '<L_{\max }<\\&<S_N-{{\mathrm{arcsinh}}}(S_N)<S_N+\varDelta (S_N)-{{\mathrm{arcsinh}}}(S_N)<S_N+\varDelta (S_N). \end{aligned}$$

This implies

$$\begin{aligned}&\varepsilon '-\varDelta (\varepsilon ')<\varepsilon '<S_N+\varDelta (S_N) \Rightarrow (0,\varepsilon ')\in \bar{C}, \\&\varepsilon '-\varDelta (\varepsilon ')-{{\mathrm{arcsinh}}}(\varepsilon ') <\varepsilon '<S_N+\varDelta (S_N)-{{\mathrm{arcsinh}}}(S_N) \Rightarrow (1,\varepsilon ')\in \bar{C},\\&\varepsilon '-\varDelta (\varepsilon ')<L_{\max }<S_N+\varDelta (S_N) \Rightarrow (0,L_{\max })\in \bar{C},\\&\varepsilon '-\varDelta (\varepsilon ')-{{\mathrm{arcsinh}}}(\varepsilon ')< L_{\max }<S_N+\varDelta (S_N)-{{\mathrm{arcsinh}}}(S_N) \Rightarrow (1,L_{\max })\in \bar{C}, \end{aligned}$$

concluding the proof. \(\square \)
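The construction in this proof is easy to sketch in code. For illustration only, the sequence below starts at \(S_0=1\) (rather than at \(\varepsilon '\)) so that the branch of \(\varDelta \) given in the proof of Lemma 3.3 for \(S_0>\frac{\sqrt{7}}{3}\) applies throughout; function names are ours:

```python
import math

ALPHA0 = 3 - 2 * math.sqrt(2)

def delta(s):
    # Delta(S_0) for S_0 > sqrt(7)/3, as in the proof of Lemma 3.3
    r = math.sqrt(1 + s * s)
    return ALPHA0 * s * min(1.0, s) * r / (2 * (r + 1))

def build_centers(L_max, s0=1.0):
    """Centers S_i with S_{i+1} = S_i + 2*Delta(S_i) until the last stripe
    clears L_max, as in the proof of Theorem 1.5."""
    centers = [s0]
    while centers[-1] - math.asinh(centers[-1]) <= L_max:
        centers.append(centers[-1] + 2 * delta(centers[-1]))
    return centers
```

The resulting list of constants is the look-up table: given \((g,L)\), one picks the stripe \(R_3(S_i)\) containing the point and returns \(S_i\).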

Proof (of Theorem 1.6)

We proceed by contradiction, i.e. we assume that \(S_0\) is an approximate zero of \(f_{g,L}\) for all \((g,L)\in U\). By definition, this means that

$$\begin{aligned} \left| \frac{S_0-g{{\mathrm{arcsinh}}}(S_0)-L}{\sqrt{1+S_0^2} \left( \sqrt{1+S_0^2}-g\right) } \right| \, \sup _{k\ge 2}\left| \frac{gP_k(S_0)}{k!\left( \sqrt{1+S_0^2}-g\right) } \right| ^\frac{1}{k-1}<\alpha _0 \end{aligned}$$
(3.1)

for all \((g,L)\in U\). We will derive a contradiction from (3.1) for all \(S_0\ge 0\).

Case \(S_0=0\). Inequality (3.1) is equivalent to

$$\begin{aligned} \frac{L}{1-g}\sup _{k\ge 2}\left| \frac{gP_k(0)}{k!(1-g)} \right| ^\frac{1}{k-1}<\alpha _0. \end{aligned}$$

Using Lemma 2.1, we obtain

$$\begin{aligned} \alpha _0>\frac{L}{1-g}\sup _{\genfrac{}{}{0.0pt}{}{k\ge 3}{\text {odd}}} \left| \frac{g ((k-2)!!)^2}{k!(1-g)}\right| ^\frac{1}{k-1}\ge \frac{L}{1-g}\sup _{\genfrac{}{}{0.0pt}{}{k\ge 3}{\text {odd}}}\left| \frac{g}{k(k-1)(1-g)} \right| ^\frac{1}{k-1}\ge \frac{Lg^\frac{1}{2}}{\sqrt{6}(1-g)^\frac{3}{2}}. \end{aligned}$$

This implies that \(L\le \frac{\sqrt{6}\alpha _0(1-g)^\frac{3}{2}}{g^\frac{1}{2}}\) in \(\bar{U}\). Therefore \(L\le 0\) in \(\bar{U}\cap \{g=1\}=\{1\}\times [0,\varepsilon ]\), which is impossible.

For the remaining cases, we take the limit as \(g \rightarrow 1^-\) and then the limit as \(L \rightarrow 0^+\), obtaining

$$\begin{aligned} \left| \frac{S_0-{{\mathrm{arcsinh}}}(S_0)}{\sqrt{1+S_0^2}\left( \sqrt{1+S_0^2}-1\right) } \right| \, \sup _{k\ge 2}\left| \frac{P_k(S_0)}{k!\left( \sqrt{1+S_0^2}-1\right) } \right| ^\frac{1}{k-1} \le \alpha _0. \end{aligned}$$
(3.2)

Case \(0<S_0<1.598\). Inequality (3.2) implies that

$$\begin{aligned} \alpha _0 \ge \frac{(S_0-{{\mathrm{arcsinh}}}(S_0)) \left( \sqrt{1+S_0^2}+1\right) ^2}{2S_0^3 \sqrt{1+S_0^2}} \ge \frac{2(S_0-{{\mathrm{arcsinh}}}(S_0))}{S_0^3 } \end{aligned}$$

because \((t+1)^2/t \ge 4\) for all \(t>0\). Therefore, \(h(S_0)={{\mathrm{arcsinh}}}(S_0) -S_0+\frac{\alpha _0S_0^3}{2} \ge 0\), which is false in the interval (0, 1.598) because h has only one critical point there, which is a minimum, and \(h(0),h(1.598)\le 0\).
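The sign claim about \(h\) on \((0,1.598)\) can be spot-checked numerically. In the sketch below (illustrative only, not part of the proof), `alpha0 = 0.17` is an assumed stand-in chosen slightly below the bound \(0.1717\) quoted later in the proof; the paper's exact constant \(\alpha_0\) is not restated in this section.

```python
import math

# Spot-check: h(S) = arcsinh(S) - S + (alpha0/2)*S**3 should be
# negative on the open interval (0, 1.598).
alpha0 = 0.17  # assumed illustrative value, slightly below 0.1717

def h(s):
    return math.asinh(s) - s + 0.5 * alpha0 * s ** 3

samples = [0.001 + i * (1.598 - 0.002) / 9999 for i in range(10000)]
assert all(h(s) < 0 for s in samples)
```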

Case \(1.598\le S_0<3.1\). Inequality (3.2) is equivalent to

$$\begin{aligned} \frac{(S_0-{{\mathrm{arcsinh}}}(S_0))\left( \sqrt{1+S_0^2}+1\right) ^\frac{3}{2} \sqrt{2S_0^2-1}}{\sqrt{6} S_0^3 \sqrt{1+S_0^2}} \le \alpha _0. \end{aligned}$$

Since \(\frac{\left( \sqrt{1+S_0^2}+1\right) ^\frac{3}{2} \sqrt{2S_0^2-1}}{S_0 \sqrt{1+S_0^2}}\) is increasing in this interval, we have

$$\begin{aligned} \alpha _0 \ge \frac{S_0-{{\mathrm{arcsinh}}}(S_0)}{\sqrt{6} S_0^2} \cdot \frac{\left( \sqrt{1+1.598^2}+1\right) ^\frac{3}{2} \sqrt{2 \cdot 1.598^2-1}}{1.598 \sqrt{1+1.598^2}}. \end{aligned}$$

Therefore,

$$\begin{aligned} \frac{S_0-{{\mathrm{arcsinh}}}(S_0)}{S_0^2} \le \frac{\sqrt{6}\,\alpha _0 \cdot 1.598 \sqrt{1+1.598^2}}{\left( \sqrt{1+1.598^2}+1\right) ^\frac{3}{2} \sqrt{2 \cdot 1.598^2-1}} <0.13, \end{aligned}$$

and thus \(S_0-{{\mathrm{arcsinh}}}(S_0)-0.13S_0^2<0\). This is false since the function \(h(S_0)=S_0-{{\mathrm{arcsinh}}}(S_0)-0.13S_0^2\) has an absolute minimum at \(S_0=3.1\) and \(h(3.1)>0\).
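The stated properties of \(h\) on \([1.598, 3.1]\) can also be spot-checked numerically; the following sketch (illustrative only, not part of the proof) verifies that \(h\) is positive on the interval and that its minimum over a fine grid is attained at the right endpoint \(3.1\).

```python
import math

# Spot-check: h(S) = S - arcsinh(S) - 0.13*S**2 is positive on
# [1.598, 3.1], with minimum at the right endpoint S = 3.1.
def h(s):
    return s - math.asinh(s) - 0.13 * s ** 2

grid = [1.598 + i * (3.1 - 1.598) / 9999 for i in range(10000)]
vals = [h(s) for s in grid]
assert h(3.1) > 0
assert min(vals) == vals[-1]  # minimum at the right endpoint
assert all(v > 0 for v in vals)
```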

Case \(3.1\le S_0<9.62\). Inequality (3.2) is equivalent to

$$\begin{aligned} \frac{(S_0-{{\mathrm{arcsinh}}}(S_0))\left( \sqrt{1+S_0^2}+1\right) ^\frac{4}{3} (6S_0^2-9)^\frac{1}{3}}{\root 3 \of {24} S_0^\frac{7}{3} \sqrt{1+S_0^2}} \le \alpha _0. \end{aligned}$$
(3.3)

On the one hand,

$$\begin{aligned} \frac{\left( \sqrt{1+S_0^2}+1\right) ^\frac{4}{3} (6S_0^2-9)^\frac{1}{3}}{S_0 \sqrt{1+S_0^2}} \ge \frac{\left( \sqrt{1+9.62^2}+1\right) ^\frac{4}{3} (6\cdot 9.62^2-9)^\frac{1}{3}}{9.62 \sqrt{1+9.62^2}} > 2.064, \end{aligned}$$

since the expression on the left is decreasing. On the other hand, \(\frac{S_0-{{\mathrm{arcsinh}}}(S_0)}{S_0^\frac{4}{3}} > 0.24\) because \(h(S_0)=S_0-{{\mathrm{arcsinh}}}(S_0)-0.24S_0^\frac{4}{3}\) is increasing in the interval, so \(h(S_0)\ge h(3.1)>0\).

Substituting these inequalities in Eq. (3.3), we obtain

$$\begin{aligned} \alpha _0 \ge \frac{(S_0-{{\mathrm{arcsinh}}}(S_0))\left( \sqrt{1+S_0^2}+1\right) ^\frac{4}{3} (6S_0^2-9)^\frac{1}{3}}{\root 3 \of {24} S_0^\frac{7}{3} \sqrt{1+S_0^2}} > \frac{2.064 \cdot 0.24}{\root 3 \of {24}}>0.1717>\alpha _0, \end{aligned}$$

which is a contradiction.
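The two monotonicity claims used in this case (the factor on the left is decreasing, and \(h(S_0)=S_0-{{\mathrm{arcsinh}}}(S_0)-0.24S_0^{4/3}\) is increasing with \(h(3.1)>0\)) can be spot-checked numerically, as in the following sketch (illustrative only, not part of the proof).

```python
import math

# Spot-checks for the case 3.1 <= S0 < 9.62.
def lhs(s):  # the factor claimed to be decreasing on the interval
    u = math.sqrt(1 + s * s)
    return (u + 1) ** (4 / 3) * (6 * s * s - 9) ** (1 / 3) / (s * u)

def h(s):  # claimed increasing, hence positive from h(3.1) > 0 onward
    return s - math.asinh(s) - 0.24 * s ** (4 / 3)

grid = [3.1 + i * (9.62 - 3.1) / 9999 for i in range(10000)]
assert all(lhs(s) > 2.064 for s in grid)
assert all(lhs(a) >= lhs(b) for a, b in zip(grid, grid[1:]))  # decreasing
assert h(3.1) > 0
assert all(h(b) >= h(a) for a, b in zip(grid, grid[1:]))      # increasing
assert all(h(s) > 0 for s in grid)
```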

Case \(S_0\ge 9.62\). Using that \(\frac{S_0-{{\mathrm{arcsinh}}}(S_0)}{\sqrt{1+S_0^2}}\ge \frac{1}{2}\) and that \(\sqrt{1+S_0^2}-1\le S_0\) in the interval, we obtain from Eq. (3.2) that

$$\begin{aligned} \alpha _0 \ge \frac{1}{2\left( \sqrt{1+S_0^2}-1\right) } \sup _{k\ge 2}\left| \frac{P_k(S_0)}{k!\left( \sqrt{1+S_0^2}-1\right) } \right| ^\frac{1}{k-1} \ge \frac{1}{2S_0} \sup _{k\ge 2}\left| \frac{P_k(S_0)}{k!S_0} \right| ^\frac{1}{k-1}. \end{aligned}$$

This implies that, for all \(k\ge 2\),

$$\begin{aligned} |P_k(S_0)|\le (2\alpha _0)^{k-1}k!S_0^k. \end{aligned}$$

By Lemma 2.1, the leading term of \(P_k(S_0)\) is \(\pm (k-1)!S_0^{k-1}\), its coefficients add up to at most \((2k-3)!!\) in absolute value, and it has no term of degree \(k-2\), so

$$\begin{aligned} (2\alpha _0)^{k-1}k!S_0^k\ge |P_k(S_0)|\ge (k-1)!S_0^{k-1}-(2k-3)!!S_0^{k-3}, \text { for all } k\ge 2, \end{aligned}$$

which is equivalent to

$$\begin{aligned} (2\alpha _0)^{k-1}kS_0^3-S_0^2+\frac{(2k-3)!!}{(k-1)!}\ge 0, \text { for all }k\ge 2. \end{aligned}$$

Since \(\frac{(2k-3)!!}{(k-1)!}=\prod _{j=1}^{k-1}\frac{2j-1}{j}\le 2^{k-1}\), this implies that

$$\begin{aligned} h_k(S_0)=(2\alpha _0)^{k-1}kS_0^3-S_0^2+2^{k-1}\ge 0, \text { for all } k\ge 2. \end{aligned}$$

The degree 3 polynomial \(h_k\) has two local extrema: a local maximum at \(S_0=0\) where \(h_k(0)=2^{k-1}>0\) and a local minimum at \(S_0=r_{\min }(k)=\frac{2}{3k(2\alpha _0)^{k-1}}\). Since \(h_k(r_{\min }(k))<0\) if and only if \(k\ge 5\), \(h_k\) has two positive roots when \(k\ge 5\), namely \(r_{\text {left}}(k)\) and \(r_{\text {right}}(k)\). Therefore, \(h_k(S_0)\ge 0\) is equivalent to \(S_0\in [0,r_{\text {left}}(k)]\cup [r_{\text {right}}(k),\infty )\) when \(k\ge 5\).
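The sign of \(h_k\) at its local minimum can be spot-checked numerically. In the sketch below (illustrative only, not part of the proof), `alpha0 = 0.17` is an assumed stand-in slightly below the bound \(0.1717\) quoted above; the paper's exact constant \(\alpha_0\) is not restated in this section.

```python
# Spot-check: h_k(r_min(k)) < 0 exactly when k >= 5, where
# h_k(S) = (2*alpha0)**(k-1) * k * S**3 - S**2 + 2**(k-1) and
# r_min(k) = 2 / (3*k*(2*alpha0)**(k-1)).
alpha0 = 0.17  # assumed illustrative value, slightly below 0.1717

def h(k, s):
    return (2 * alpha0) ** (k - 1) * k * s ** 3 - s ** 2 + 2 ** (k - 1)

def r_min(k):
    return 2 / (3 * k * (2 * alpha0) ** (k - 1))

for k in range(2, 30):
    assert (h(k, r_min(k)) < 0) == (k >= 5)
```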

We will obtain a contradiction by showing that \(S_0\in (r_{\text {left}}(k),r_{\text {right}}(k))\) for some \(k\ge 5\). This is necessarily true since \(\cup _{k\ge 5}(r_{\text {left}}(k),r_{\text {right}}(k))= (r_{\text {left}}(5),\infty ) \supseteq (r_{\min }(5),\infty ) \supseteq [9.62,\infty )\ni S_0\).

It only remains to be proven that \(\cup _{k\ge 5}(r_{\text {left}}(k),r_{\text {right}}(k))=(r_{\text {left}}(5),\infty )\). This follows from \(r_{\text {right}}(k)>r_{\min }(k)=\frac{2}{3k(2\alpha _0)^{k-1}}\longrightarrow \infty \) as \(k\rightarrow \infty \), and the fact that \(r_{\text {left}}(k+1)<r_{\text {right}}(k)\) for all \(k\ge 5\). Indeed, since \(h_{k+1}(r_{\min }(k))<0\) and \(r_{\min }(k)<r_{\text {right}}(k)\), then \(r_{\text {left}}(k+1)<r_{\min }(k)<r_{\text {right}}(k)\).

4 Comparison with existing methods

In this section, we compare the accuracy and computational complexity (CPU time) of our starter given in Theorem 1.3 with the starters proposed by Gooding and Odell (1988) and Fukushima (1997). The more recent method presented in Farnocchia et al. (2013) cannot be directly compared with ours, since their approach is significantly different.

For the comparison, we follow the criterion of Table II of Fukushima (1997). We evaluate the starters in \(10^4\times 10^4\) equally-spaced grid points \((g,\log _{10}L)\) with \(0<g<1\) and \(-3\le \log _{10}L\le 8\). At each point, we record the CPU time and the number of Newton’s iterations needed to reach a solution that satisfies \(|f_{g,L}(S)|<5 \times 10^{-15}(1+L)\). The factor \(1+L\) was introduced by Fukushima to account for both relative and absolute errors. Table 1 shows the average CPU time (in \(10^{-6}\) s) and number of steps needed.

Table 1 Comparison of CPU time (in \(\mu \)s) and number of Newton’s method iterations

Table 1 shows that our proposed starter is faster than the other two. Moreover, the fact that the average number of Newton iterations is about 1.77 shows that our method requires fewer than two iterations in most cases. Although Gooding and Odell’s starter requires fewer steps on average, it is slower than ours due to its complexity.
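The Newton iteration being timed can be sketched as follows. This Python sketch uses the stopping criterion \(|f_{g,L}(S)|<5\times 10^{-15}(1+L)\) described above, but with a naive placeholder starter \(S_0=L\) rather than the starter of Theorem 1.3 (which is not restated in this section), so its iteration counts will differ from those of Table 1.

```python
import math

def solve_kepler_h(g, L, tol_factor=5e-15):
    """Solve S - g*arcsinh(S) = L by Newton's method, 0 < g < 1, L >= 0."""
    S = L  # naive placeholder starter, NOT the starter of Theorem 1.3
    for _ in range(50):
        f = S - g * math.asinh(S) - L
        if abs(f) < tol_factor * (1 + L):
            return S
        # f'(S) = 1 - g/sqrt(1+S^2) > 1 - g > 0, so the step is well defined
        S -= f / (1 - g / math.sqrt(1 + S * S))
    return S

# Example: recover S from (g, L) and check the residual criterion.
g, L = 0.5, 2.0
S = solve_kepler_h(g, L)
assert abs(S - g * math.asinh(S) - L) < 5e-15 * (1 + L)
```

Since \(f_{g,L}\) is increasing and convex for \(S>0\), Newton's method converges from this starter as well, only with a larger iteration count than a starter with guaranteed quadratic convergence from the first step.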

5 Conclusions

We provided in Theorem 1.3 a very simple starter \(\widetilde{S}(g,L)\) for the hyperbolic Kepler’s equation, which uses a single addition and multiplication when \(L>1-\frac{5}{6}g\), and a few arithmetic operations plus a cube root and a square root otherwise. Our starter can be implemented efficiently on modern computers and has guaranteed quadratic convergence from the first iteration, so the relative error becomes negligible after just a few iterations for any value of \(g\in (0,1)\) and \(L\in [0,\infty )\).

Indeed, on a modern computer it takes \(0.33\,\mu \)s on average to solve an instance of the equation (evaluated over a large grid of different possible inputs), and in most cases at most two iterations of Newton’s method are required to reach an error of order \(5\times 10^{-15}\), as defined in Sect. 4. We ran this test on an Intel Core i5-2410M CPU @ 2.30 GHz \(\times \) 4 machine running 64-bit Ubuntu 14.04 LTS.

If an even simpler starter is necessary, we showed in Theorem 1.5 that it is possible to produce a piecewise constant starter in any bounded region of \((0,1)\times [0,\infty )\), excluding a neighborhood of the corner \(g=1,L=0\). This starter also has quadratic convergence rate and its evaluation requires no operations, so it can be efficiently implemented as a look-up table.