1 Introduction

The spatial three-body problem concerns three point masses in space moving according to Newton’s equations of gravitation. The point of this article is to prove that there exist no periodic solutions to this problem which “hang out near infinity.”

The conserved quantities for the problem are the energy H, angular momentum J and linear momentum. As is standard, we may, without loss of generality, assume that the linear momentum is zero and the origin of space coincides with the center of mass of the three bodies. If \(m_i\) denotes the masses and \(q_i\in {\mathbb {R}}^3\) the positions of the bodies, then the standard measure of size is \(\Vert q\Vert =\sqrt{I(q)}\) where \(q=(q_1,q_2,q_3)\) and \(I=\sum m_i |q_i|^2\) is known as the total moment of inertia. Neighborhoods of infinity are regions of the form \(\{ q: I(q)\ge I_0\}\). As \(I_0\rightarrow \infty \) the neighborhood converges to infinity. Our main theorem is:

Theorem 1

For \(H<0\) there exists \(I_0(m_j, H, J)>0\) such that any orbit at these energy and momentum levels beginning in the region \(I>I_0\) enters the region \(I\le I_0\) in forward or backward time.

Motivation The motivation behind our result came from the problem of which syzygy sequences are realized in the zero angular momentum planar three-body problem (see Moeckel and Montgomery 2015; Montgomery 2002, 2007). The term syzygy is from astronomy and refers to when the three bodies are in eclipse, that is collinear. Each syzygy has a ‘type’ 1, 2 or 3, according to the label of the mass in the middle. Then the syzygy sequence of an orbit is this list of syzygy types in temporal order. A first open problem is whether or not the periodic sequence of repeating 1212s is realized by a periodic solution to the zero angular momentum problem. One imagines such a motion as consisting of masses 1 and 2 going around each other in a near circular orbit, very far from mass 3, and the center of mass of the \(m_1\) and \(m_2\) orbit slowly going around mass 3, like the Earth–Moon–Sun system. The action over such solutions decreases as the distance of the Earth–Moon system to the Sun goes to infinity, i.e., minimizing the action forces the solution to slide off into a neighborhood of infinity, see Chenciner and Montgomery (2000). The theorem excludes the existence of such solutions “near infinity,” i.e., in the region \(I \ge I_0 (m_j, H, 0)\).

Remark In the theorem we may either exclude orbits having a binary collision singularity or pass through them using Levi–Civita regularization. Can we prove an analogous result to Theorem 1 for \(N\ge 4\)? The proof here breaks down in Proposition 1 where the neighborhoods of infinity fail to split into connected components characterized by a far body with suitable Jacobi coordinates. The connectedness of the neighborhoods of infinity due to these spread-out clusters of tight binaries is utilized for Jeff Xia’s orbits realizing infinity in finite time singularities where \(N\ge 5\). Can these infinities in finite time orbits provide counterexamples to Theorem 1 for \(N\ge 5\)?

Remark In Meyer (1994) comet-like periodic orbits for the N-body problem are established in a region \(I\ge I_C\) for \(I_C\) large. These orbits do not contradict our theorem because, as \(I_C\rightarrow \infty \), the angular momentum of these orbits satisfies \(|J|\rightarrow \infty \).

2 Related results

The behavior of I(t) has long been studied to gain some qualitative understanding of the N-body problem. Sundman (1912) showed for the three-body problem that nonzero angular momentum implies no orbits suffer triple collision, i.e., \(I>0\) for all orbits. Namely, there exists a positive lower bound, \(I_S\), for orbits at such levels. That is, \(I(t)>I_S>0\) over the solutions with energy H and angular momentum \(J\ne 0\) and with initial conditions at I(0), \(\dot{I}(0)\). Hadamard (1915), p. 259, gave an explicit formula for such an \(I_S\), and Birkhoff (1927), Ch. IX §8, studied escape conditions in the nonzero angular momentum case by showing, for example (p. 282), that I sufficiently small (near zero) at some instant, \(t_0\), implies I becomes infinite as t goes to infinity. One might paraphrase Birkhoff’s result as ‘no hanging out in neighborhoods of triple collision.’

A great deal of analysis on I has followed these two tracks around small I values. See, for example, Laskar and Marchal (1984), Marchal and Yoshida (1984) on the greatest lower bound of I for bounded orbits and Marchal (1974), Marchal and Yoshida (1984), Pollard (1970) for efficient tests of escape in a variety of cases. The book Marchal (1990), Ch. 11, is a detailed reference for the qualitative study of I.

For each orbit let \(I_m\) be the minimum value of I over this orbit. Sharpening Sundman leads one to seek a (greatest) lower bound of \(I_m\) over classes of orbits. An analogous question here is instead to seek a (least) upper bound of \(I_m\) over all orbits.

While most focus in the literature so far appears to be on the greatest lower bound and escape, this upper bound question has not entirely escaped notice. A statement similar to Theorem 1 appears in Marchal (1990), p. 468, where an upper bound is given in a remark about a class of equal mass cases (those with \(H|J|^2=-\frac{4}{3^5}\)) and the least upper bound is conjectured to be attained over the Broucke–Henon orbit (Marchal 1990, p. 469). Here we give a new motivation for this question, namely the existence of the 1212\(\ldots \) solution in the zero angular momentum case, and use a different method than that of Marchal (1990). Additionally, we observe that both methods give upper bounds in a general case rather than just treating an equal mass case (see also Marchal 1990, p. 483). Moreover, the method we use here offers hope of lowering the upper bound if the perturbation step (Propositions 3, 4, 5) is dealt with more effectively. In the Appendix we give some comparison of the two methods.

Fig. 1 For the planar three-body problem the shape space is \({\mathbb {R}}^3\) where I is the distance from the origin. The admissible configurations at fixed \(H<0\) are interior to a pair of pants where each leg of the pants is asymptotic to a cylinder around a binary collision ray. See Moeckel (1988) or Montgomery (2015) for details.

3 Structure of proof

For \(H < 0\), as we let \(I_0 \) increase, eventually the domain \(\{I \ge I_0\}\) splits into three components, each characterized by the selection of one of the three masses. The two remaining masses stay close to each other, while this third selected mass stays relatively far away from either member of this pair (see Fig. 1). We fix attention on one of these regions, supposing, after relabeling, that the close masses are \(m_1\) and \(m_2\). In this region, we use the standard Jacobi coordinates \(\xi _1, \xi _2\). See Fig. 2.

Fig. 2 A tight binary configuration, with \(r:=|\xi _1|\), \(\rho :=|\xi _2|\)

When written in these coordinates, Newton’s differential equations become a perturbation of two uncoupled Kepler problems, one for each Jacobi vector, with the perturbation term getting arbitrarily small as \(I_0 \rightarrow \infty \). We focus attention on the long Jacobi vector, which connects the center of mass of the \(m_1\) and \(m_2\) system to the third mass. When we drop the perturbation term of this perturbed Kepler system, we get an exactly solvable Kepler problem whose solutions we call “the osculating solutions.”

The Kepler parameters (energy, angular momentum, Laplace or Runge–Lenz vector) for the osculating system can be bounded using the fact that H, J and the masses are fixed and that \(I_0\gg 0\). Now here comes the key observation, due to Chenciner. Consider a family of solutions to the Kepler problem having fixed energy and bounded angular momentum. If, along the solutions of this family, the initial distance from the origin tends to infinity, then these orbits become extremely eccentric and thus must come close to the origin. Thus, the osculating orbits cannot “hang out near infinity.” Said slightly differently, since large circular orbits for the Kepler problem have large angular momentum and since our total angular momentum is fixed, large near-circular motions for the osculating system are excluded, and this excludes orbits of the type of our Earth–Moon–Sun cartoon described above.

Here is the strategy of proof then. Show that for sufficiently large \(I_0\) all of the osculating solutions starting in \(\{ I \ge I_0\}\) are extremely eccentric, enough so to enter the region \(\{ I\le I_0\}\) (see Proposition 2). Next show that the real solutions do not vary too much from these osculating solutions, as long as they stay in the region \(I \ge I_0\), and for bounded times (indeed for times of order \(O(\rho ^{3/2})=O(I_0 ^{3/4})\), Proposition 3). It follows that if the osculating orbit enters the region \(I \le I_0\) within the time \(O(I_0^{3/4})\) (which we expect by Kepler’s third law, since the long Jacobi distance satisfies \(\rho \sim I_0^{1/2}\)), then the true orbit must also enter into that region. Finally (Propositions 4 and 5), we verify that there is indeed sufficient time: the timescale over which the approximation of the true motion by the osculating motion is valid is long enough that the true motions must follow their osculating leads into a region \(I\le I_0\).

4 Setup and notation

In the spatial three-body problem, we consider the motion of three point masses \(m_1,m_2,m_3\) under Newton’s gravitational attraction. We will denote the configurations by

$$\begin{aligned} q=(q_1,q_2,q_3)\in ({\mathbb {R}}^3)^3\backslash \{ (x_1,x_2,x_3): x_i=x_j\text { some } i\ne j\}. \end{aligned}$$

As is standard, we may take coordinates in which the center of mass is at the origin (\(\sum m_i q_i=0\)). We now define the Jacobi coordinates, in which the splitting into two perturbed Kepler problems will be clear (see Fig. 2 as well as Pollard 1966, 2.7; Féjoz 2002; or Kaplan et al. 2008):

$$\begin{aligned} \xi _1= & {} q_2-q_1,\\ \xi _2= & {} q_3-(m_1+m_2)^{-1}(m_1q_1+m_2q_2)=\frac{m_1+m_2+m_3}{m_1+m_2} q_3. \end{aligned}$$

We set

$$\begin{aligned} r=|\xi _1| \text { and } \rho =|\xi _2|. \end{aligned}$$

For reference we record here in one place the mass constants that will be used throughout:

Mass constants:

$$\begin{aligned} \mu= & {} m_1+m_2\\ M= & {} m_1+m_2+m_3\\ \alpha _1= & {} m_1m_2\mu ^{-1}\\ \alpha _2= & {} m_3\mu M^{-1}\\ \beta _1= & {} \mu \alpha _1\\ \beta _2= & {} M\alpha _2 \end{aligned}$$

Then in these coordinates we find:

$$\begin{aligned}&\displaystyle I:=\sum m_i |q_i|^2=\alpha _1 r^2+\alpha _2\rho ^2 \end{aligned}$$
(1)
$$\begin{aligned}&\displaystyle J:=\sum m_i (q_i\times \dot{q}_i)=\alpha _1 \xi _1\times \dot{\xi }_1+\alpha _2\xi _2\times \dot{\xi }_2=J_1+J_2 \end{aligned}$$
(2)

for the moment of inertia and angular momentum, respectively. Also the energy splits into

$$\begin{aligned} H=H_{kep}+g, \end{aligned}$$

where

$$\begin{aligned} H_{kep}=\frac{1}{2}\alpha _1 |\dot{\xi }_1|^2-\frac{\beta _1}{r}+\frac{1}{2}\alpha _2 |\dot{\xi }_2|^2-\frac{\beta _2}{\rho }=H_1+H_2 \end{aligned}$$

is an energy for two uncoupled Kepler problems and

$$\begin{aligned} g=\frac{\beta _2}{\rho }-\frac{m_1m_3}{|\xi _2+m_2 \mu ^{-1}\xi _1|}-\frac{m_2m_3}{|\xi _2-m_1\mu ^{-1}\xi _1|} \end{aligned}$$

is a perturbation term with \(g=O(r^2/\rho ^3)\), \(g_{\xi _1}=O(r/\rho ^3)\) and \(g_{\xi _2}=O(r^2/\rho ^4)\).
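The identities (1), (2) and the splitting \(H=H_{kep}+g\) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `to_center_of_mass` and `jacobi_identities` and the random test data are illustrative choices, not part of the original text.

```python
import numpy as np

def to_center_of_mass(m, x):
    """Subtract the mass-weighted mean so that sum(m_i x_i) = 0."""
    m = np.asarray(m, dtype=float)
    return x - (m @ x) / m.sum()

def jacobi_identities(m, q, v):
    """Return (direct, Jacobi-form) pairs for I, J and H.

    q, v are 3x3 arrays of positions and velocities (one body per row),
    assumed to have zero center of mass and zero linear momentum.
    """
    m1, m2, m3 = m
    mu, M = m1 + m2, m1 + m2 + m3
    a1, a2 = m1 * m2 / mu, m3 * mu / M        # alpha_1, alpha_2
    b1, b2 = mu * a1, M * a2                  # beta_1, beta_2
    # Jacobi vectors and their velocities
    xi1, xi2 = q[1] - q[0], q[2] - (m1 * q[0] + m2 * q[1]) / mu
    d1, d2 = v[1] - v[0], v[2] - (m1 * v[0] + m2 * v[1]) / mu
    r, rho = np.linalg.norm(xi1), np.linalg.norm(xi2)
    dist = lambda a, b: np.linalg.norm(a - b)

    # Quantities computed directly from the three bodies
    I = sum(mi * (qi @ qi) for mi, qi in zip(m, q))
    J = sum(mi * np.cross(qi, vi) for mi, qi, vi in zip(m, q, v))
    H = (0.5 * sum(mi * (vi @ vi) for mi, vi in zip(m, v))
         - m1 * m2 / dist(q[0], q[1])
         - m1 * m3 / dist(q[0], q[2])
         - m2 * m3 / dist(q[1], q[2]))

    # The same quantities in Jacobi coordinates
    I_jac = a1 * r**2 + a2 * rho**2
    J_jac = a1 * np.cross(xi1, d1) + a2 * np.cross(xi2, d2)
    H_kep = 0.5 * a1 * (d1 @ d1) - b1 / r + 0.5 * a2 * (d2 @ d2) - b2 / rho
    g = (b2 / rho
         - m1 * m3 / np.linalg.norm(xi2 + m2 / mu * xi1)
         - m2 * m3 / np.linalg.norm(xi2 - m1 / mu * xi1))
    return (I, I_jac), (J, J_jac), (H, H_kep + g)

rng = np.random.default_rng(0)
m = (1.0, 2.0, 3.0)                           # hypothetical masses
q = to_center_of_mass(m, rng.normal(size=(3, 3)))
v = to_center_of_mass(m, rng.normal(size=(3, 3)))
for direct, jac in jacobi_identities(m, q, v):
    print(np.allclose(direct, jac))
```

The check only makes sense after centering both positions and velocities, since the Jacobi forms of I, J and the kinetic energy drop the center-of-mass contribution.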

The equations of motion are then the two perturbed Kepler problems

$$\begin{aligned} \alpha _i\ddot{\xi }_i=-\frac{\beta _i\xi _i}{|\xi _i|^3}-g_{\xi _i}. \end{aligned}$$
(3)

Definition 1

A solution to the unperturbed Kepler problems satisfying the same initial conditions as a solution to these perturbed Kepler problems (Eq. 3) will be called an osculating orbit (see Pollard 1966, 1.16).

5 Proof of main theorem

Fix the masses, angular momentum, negative energy \(H<0\), linear momentum zero and a parameter \(\lambda >0\) and only consider orbits at these energy and momentum levels in appropriate Jacobi coordinates. We will use \(\overline{I}\) for a placeholder constant.

Proposition 1

For \(H<0\), there exists \( I^*(m_i, H, J)>0\) such that the region \(I> I^*\) consists of three connected components \(B_1, B_2, B_3\). Moreover, relabeling if necessary to fix our attention to \(B_3\) (where \(q_3\) is the far body) with appropriate Jacobi coordinates we have the bounds:

$$\begin{aligned} |g|\le & {} c_g(r^2/\rho ^3),~~~|g_{\xi _2}|\le c_{g_2}(r^2/\rho ^4) \end{aligned}$$
(4)
$$\begin{aligned} |J_2|\le & {} \alpha _2c_{J_2} \end{aligned}$$
(5)
$$\begin{aligned} r\le & {} c_r \end{aligned}$$
(6)

on the perturbation term g, angular momentum \(J_2\) and short Jacobi vector r throughout \(B_3\) for some constants \(c_g, c_{g_2}, c_{J_2}, c_r\) depending on masses, energy and angular momentum.

See Moeckel (1988), Féjoz (2002), Kaplan et al. (2008), Marchal (1990) regarding these well-known lunar regions.
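A sketch of where the order \(r^2/\rho ^3\) in Eq. (4) comes from, under the assumption that \(r/\rho \) is small. The angle \(\varphi \) between \(\xi _1\) and \(\xi _2\), the scalar s and the Legendre polynomial \(P_2\) are notation introduced here for illustration only. Applying the standard Legendre expansion with \(s=m_2\mu ^{-1}\) and \(s=-m_1\mu ^{-1}\), the terms linear in \(r/\rho \) cancel and one finds:

$$\begin{aligned} \frac{1}{|\xi _2+s\xi _1|}= & {} \frac{1}{\rho }\left( 1-s\frac{r}{\rho }\cos \varphi +s^2\frac{r^2}{\rho ^2}P_2(\cos \varphi )+O(r^3/\rho ^3)\right) ,\quad P_2(x)=\tfrac{1}{2}(3x^2-1),\\ g= & {} -\alpha _1m_3\frac{r^2}{\rho ^3}P_2(\cos \varphi )+O(r^3/\rho ^4). \end{aligned}$$

Since \(|P_2|\le 1\), the leading term is bounded by \(\alpha _1m_3r^2/\rho ^3\), which is consistent with the form of the bound on \(|g|\).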

Proposition 2

Take \(I^{**}=\max \{ I^*, \alpha _1 c_r^2+\alpha _2 c_{J_2}^4/M^2\}\) where \(I^*, c_r, c_{J_2}\) are from Proposition 1. Then any osculating orbit with initial condition in \(I> I^{**}\) falls in forward or backward time into the region \(I\le I^{**}\). Moreover, the time to fall into the region \(I\le I^{**}\) is less than or equal to the time to reach pericenter.

Proof

By Eqs. (1, 6) in the region \(I>I^{**}\) we have \(\rho ^2> c_{J_2}^4/M^2\).

The ‘\(\rho \)’ component of the osculating orbit of an initial condition in \(I> I^{**}\) is a solution to the Kepler problem

$$\begin{aligned} \ddot{\xi }_{osc}=-M\xi _{osc}/|\xi _{osc}|^3 \end{aligned}$$

with \(\rho _{osc}^2(0)=|\xi _{osc}(0)|^2>c_{J_2}^4/M^2\) and the restriction from Eq. (5)

$$\begin{aligned} |\xi _{osc}\times \dot{\xi }_{osc}|=\alpha _2^{-1}|J_2(0)|\le c_{J_2} \end{aligned}$$

on the angular momentum. Also from Proposition 1, we have the r component satisfying \(r\le c_r\) as long as we remain in the region \(I>I^*\).

We now verify that for all such orbits \(\xi _{osc}\), the pericenter distance \(\rho _{osc}^{pc}\) is bounded.

Case 1 \(J_2\ne 0\).

In polar coordinates, any non-collision osculating orbit is (for some \(e\ge 0\)):

$$\begin{aligned} \rho _{osc}=\frac{\alpha _2^{-2}|J_2(0)|^2}{M(1+e\cos \theta )}, \end{aligned}$$

where \(\theta =0\) corresponds to the pericenter.

Then as \(e\ge 0\) and by Eq. (5),

$$\begin{aligned} \rho _{osc}^{pc}=\frac{\alpha _2^{-2}|J_2(0)|^2}{M(1+e)}\le \frac{c_{J_2}^2}{M}. \end{aligned}$$

Case 2 \(J_2=0\).

Collision! So the pericenter distance in this case is zero.

Now an osculating orbit starting in \(I> I^{**}\) either reaches pericenter or leaves the region \(I>I^*\) before it reaches pericenter. If it reaches pericenter before leaving \(I>I^*\), then we have \(I_{pc}\le \alpha _1c_r^2+\alpha _2 c_{J_2}^4/M^2\le I^{**}\). So in either case we fall into the region \(I\le \max \{I^*, \alpha _1 c_r^2+\alpha _2 c_{J_2}^4/M^2\}= I^{**}\) in forward or backward time, after a time which is no more than \(t_{pc}\), the time to pericenter. \(\square \)
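The pericenter bound used in this proof can be illustrated numerically. The sketch below assumes NumPy and uses the standard Kepler relation \(e^2=1+2hc^2/M^2\) between eccentricity, energy per unit mass h and angular momentum per unit mass c. The function name `kepler_pericenter` and the sample values of M and of the angular momentum bound are hypothetical.

```python
import numpy as np

def kepler_pericenter(xi, xi_dot, M):
    """Pericenter distance of the Kepler orbit xi'' = -M xi/|xi|^3 through (xi, xi_dot).

    Returns (pericenter distance, angular momentum c, eccentricity e).
    """
    xi, xi_dot = np.asarray(xi, float), np.asarray(xi_dot, float)
    c = np.linalg.norm(np.cross(xi, xi_dot))               # |xi x xi_dot|
    h = 0.5 * (xi_dot @ xi_dot) - M / np.linalg.norm(xi)   # energy per unit mass
    e = np.sqrt(max(0.0, 1.0 + 2.0 * h * c**2 / M**2))     # eccentricity
    return c**2 / (M * (1.0 + e)), c, e

M, c_J = 1.0, 0.5          # hypothetical mass constant and angular momentum bound
for rho0 in (10.0, 100.0, 1000.0):
    xi = [rho0, 0.0, 0.0]
    xi_dot = [0.0, c_J / rho0, 0.0]   # tangential speed chosen so that c = c_J
    rho_pc, c, e = kepler_pericenter(xi, xi_dot, M)
    # The starting distance grows, but the pericenter stays below c_J^2 / M.
    print(rho0, rho_pc, e, rho_pc <= c_J**2 / M)
```

As the starting distance grows with c held fixed, the eccentricity approaches 1 while the pericenter distance stays bounded, which is the mechanism behind Proposition 2. By contrast, a circular orbit of radius \(\rho \) needs \(c=\sqrt{M\rho }\), which is unbounded in \(\rho \).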

Proposition 3

Let \(\overline{I}\ge \max \{I^*, \alpha _1c_r^2+\max \{1,(\frac{3c_{J_2}^2}{2M})^2\}\alpha _2 \}=\overline{R}\). Set \(\overline{\rho }=\sqrt{\alpha _2^{-1}(\overline{I}-\alpha _1 c_r^2)}\) and \(\varepsilon =1/\overline{\rho }\). Then any orbit with initial condition in \(I\ge \overline{I}\) satisfies:

$$\begin{aligned} |\rho (t)-\rho _{osc}(t)|<A_1\varepsilon \end{aligned}$$
(7)

for time

$$\begin{aligned} |t|\le B_1\varepsilon ^{-3/2} \end{aligned}$$
(8)

throughout the region \(I\ge \overline{I}\).

Here we may pick the constant \(B_1>0\) and then define \(A_1=\frac{a}{M}(2+e^{\sqrt{2M+3c_{J_2}^2}B_1})\) where \(a=(AB_1)^2+2c_{J_2}AB_1+A\) with \(A=\alpha _2^{-1}c_{g_2}c_r^2\).

Proof

First, from Eqs. (1, 6) any configuration with \(I\ge \overline{I}\) has \(\rho \ge \overline{\rho }\ge \max \{ 1, \frac{3c_{J_2}^2}{2M}\}\ge \max \{ 1, \frac{3\alpha _2^{-2}|J_2|^2}{2M}\}\), where the last inequality follows from Eq. (5). In particular this holds for our initial condition.

We consider our perturbed Kepler problem for the ‘\(\rho \)’ motion:

$$\begin{aligned} \ddot{\xi }_2=-\frac{M\xi _2}{\rho ^3}+F(\xi _2, t), \end{aligned}$$

where the time dependence in the perturbation term \(F=-\alpha _2^{-1}g_{\xi _2}\) is due to the interaction of the motion of masses 1 and 2.

In the region \(I\ge \overline{I}\) , we have \(|F|\le \alpha _2^{-1}c_{g_2}c_{r}^2\rho ^{-4}\le \alpha _2^{-1}c_{g_2}c_{r}^2\varepsilon ^4\). We will set

$$\begin{aligned} A=\alpha _2^{-1}c_{g_2}c_{r}^2. \end{aligned}$$

An estimate for the variation of \(c_t^2:=|\xi _2\times \dot{\xi }_2|^2=\alpha _2^{-2} |J_2(t)|^2\) will be needed. Since \(|\dot{c}|\le |\alpha _2^{-1}\dot{J}_2|=|\xi _2\times F|\le A\rho ^{-3}\), we have

$$\begin{aligned} |\dot{c}|\le A\varepsilon ^3, \end{aligned}$$

so that

$$\begin{aligned} |c_t-c_0|\le A\varepsilon ^3 |t|. \end{aligned}$$

Hence,

$$\begin{aligned} |c_t^2-c_0^2|\le A\varepsilon ^3|t|(A\varepsilon ^3|t|+2c_0)\le A\varepsilon ^3|t|(A\varepsilon ^3|t|+2c_{J_2}), \end{aligned}$$

so that for \(|t|\le B_1\varepsilon ^{-3/2}\) and \(I\ge \overline{I}\) with \(b=(AB_1)^2+2c_{J_2}AB_1\) we have

$$\begin{aligned} |c_t^2-c_0^2|\le b\varepsilon ^{3/2}, \end{aligned}$$
(9)

provided \(\varepsilon \le 1\) which is guaranteed so long as \(\overline{I}\ge \alpha _1 c_r^2+\alpha _2\) as is indeed the case since \(\overline{I}\ge \overline{R}\).

To prove the proposition we will use the Sandwich Lemma (see Montgomery 2007, p. 1942; note that in Montgomery (2007) there is an unneeded assumption requiring that \(F_+<0\)):

Sandwich Lemma: Given \(\ddot{x}_{-}=F_{-}(x_{-})\), \(\ddot{x}=F(x,t)\) and \(\ddot{x}_{+}=F_{+}(x_{+})\) satisfying \(F_{-}(x)\le F(x,t)\le F_{+}(x)\) and \(\frac{\partial F_{\pm }}{\partial x_{\pm }}\ge 0\) over some time interval, then over this same time interval the solutions \(x_{-}, x, x_{+}\) of these three equations satisfying the same initial conditions have:

$$\begin{aligned} x_{-}(t)\le x(t)\le x_{+}(t). \end{aligned}$$
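The Sandwich Lemma can be illustrated on a toy example. The sketch below is not taken from Montgomery (2007): the integrator `rk4_second_order` and the three right-hand sides \(F_-(x)=x-1\), \(F(x,t)=x+\sin t\), \(F_+(x)=x+1\) are hypothetical choices that satisfy the hypotheses (\(F_-\le F\le F_+\) and \(\partial F_\pm /\partial x=1\ge 0\)).

```python
import math

def rk4_second_order(F, x0, v0, t_end, n):
    """Integrate x'' = F(x, t) on [0, t_end] with classical RK4.

    Returns the list of x values at the n + 1 grid times.
    """
    h = t_end / n
    x, v, t = x0, v0, 0.0
    xs = [x]
    for _ in range(n):
        k1x, k1v = v, F(x, t)
        k2x, k2v = v + 0.5 * h * k1v, F(x + 0.5 * h * k1x, t + 0.5 * h)
        k3x, k3v = v + 0.5 * h * k2v, F(x + 0.5 * h * k2x, t + 0.5 * h)
        k4x, k4v = v + h * k3v, F(x + h * k3x, t + h)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        t += h
        xs.append(x)
    return xs

# Toy right-hand sides with F_minus <= F_true <= F_plus and dF_pm/dx = 1 >= 0
F_minus = lambda x, t: x - 1.0
F_true = lambda x, t: x + math.sin(t)
F_plus = lambda x, t: x + 1.0

# Same initial conditions x(0) = 0, x'(0) = 0 for all three problems
lo, mid, hi = (rk4_second_order(F, 0.0, 0.0, 3.0, 3000)
               for F in (F_minus, F_true, F_plus))
ordered = all(a <= b + 1e-9 and b <= c + 1e-9 for a, b, c in zip(lo, mid, hi))
print(ordered)
```

For this toy case the bounding solutions are explicit, \(x_\pm (t)=\pm (\cosh t-1)\), the same comparison function that appears at the end of the proof.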

Now:

$$\begin{aligned} \rho _{osc}\ddot{\rho }_{osc}+\dot{\rho }_{osc}^2= & {} \frac{d}{dt} \rho _{osc}\dot{\rho }_{osc}=\frac{d}{dt} \xi _{osc}\cdot \dot{\xi }_{osc}=-M\rho _{osc}^2\rho _{osc}^{-3}+|\dot{\xi }_{osc}|^2 =-M\rho _{osc}^{-1}\\&+\,\dot{\rho }_{osc}^2+c_0^2\rho _{osc}^{-2}, \end{aligned}$$

so

$$\begin{aligned} \ddot{\rho }_{osc}=c_0^2\rho _{osc}^{-3}-M\rho _{osc}^{-2}. \end{aligned}$$

And likewise:

$$\begin{aligned} \ddot{\rho }= c_t^2\rho ^{-3}-M\rho ^{-2}+f(t), \end{aligned}$$
(10)

where \(|f(t)|=|\rho (t)^{-1}(\xi _2(t)\cdot F(\xi _2(t), t))|\le A\rho (t)^{-4}\).
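The step behind Eq. (10) can be spelled out as follows. This is a sketch using the perturbed equation \(\ddot{\xi }_2=-M\xi _2\rho ^{-3}+F\) and the identity \(|\dot{\xi }_2|^2=\dot{\rho }^2+c_t^2\rho ^{-2}\), which follows from \(|\xi _2|^2|\dot{\xi }_2|^2=(\xi _2\cdot \dot{\xi }_2)^2+|\xi _2\times \dot{\xi }_2|^2\):

$$\begin{aligned} \rho \ddot{\rho }+\dot{\rho }^2= & {} \frac{d}{dt}(\xi _2\cdot \dot{\xi }_2)=\xi _2\cdot \ddot{\xi }_2+|\dot{\xi }_2|^2=-M\rho ^{-1}+\xi _2\cdot F+\dot{\rho }^2+c_t^2\rho ^{-2},\\ \ddot{\rho }= & {} c_t^2\rho ^{-3}-M\rho ^{-2}+\rho ^{-1}\,\xi _2\cdot F. \end{aligned}$$

The last term is f(t), and \(|\rho ^{-1}\,\xi _2\cdot F|\le |F|\le A\rho ^{-4}\) by the Cauchy–Schwarz inequality.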

Take \(v_1(\rho )=c_0^2 \rho ^{-3}-M\rho ^{-2}\) and \(v_2(\rho ,t)=c_t^2 \rho ^{-3}-M\rho ^{-2}+f\). We view f here as f(t) by plugging the true solutions \(\xi _1(t), \xi _2(t)\) into \(F, \rho \).

Now using our \(|c_t^2-c_0^2|\) estimate Eq. (9) and our bound on f we get:

$$\begin{aligned} |v_1-v_2|\le b\varepsilon ^{9/2}+A\varepsilon ^4\le a\varepsilon ^4 \end{aligned}$$

for \(a=b+A\), or

$$\begin{aligned} v_1-a\varepsilon ^4\le v_2\le v_1+a\varepsilon ^4 \end{aligned}$$

for time \(|t|\le B_1\varepsilon ^{-3/2}\) and \(I\ge \overline{I}\).

Now \(\rho \) is a solution to \(\ddot{\rho }=v_2\) and let \(\rho _{\pm }\) be solutions to:

$$\begin{aligned} \ddot{\rho }_\pm =v_1(\rho _{\pm })\pm a\varepsilon ^4=:F_{\pm }(\rho _\pm ) \end{aligned}$$

satisfying the same initial conditions as \(\rho \). Throughout the region \(I\ge \overline{I}\) we have \(\rho _{\pm }\ge \frac{3c_0^2}{2M}\) which implies \(\frac{\partial F_{\pm }}{\partial \rho _{\pm }}\ge 0\), so that we may apply the Sandwich Lemma throughout the region \(I\ge \overline{I}\) yielding:

$$\begin{aligned} \rho _{-}\le \rho \le \rho _{+} \end{aligned}$$

for time \(|t|\le B_1\varepsilon ^{-3/2}\) as long as we remain in the region \(I\ge \overline{I}\).

Likewise since \(v_1-a\varepsilon ^4\le v_1\le v_1+a\varepsilon ^4\), we have for \(|t|\le B_1\varepsilon ^{-3/2}\) and throughout \(I\ge \overline{I}\) that

$$\begin{aligned} \rho _{-}\le \rho _{osc}\le \rho _{+} \end{aligned}$$

holds.

Now we will show that \(\rho _+\) and \(\rho _{-}\) remain close to finish the proof. Set \(\eta =\rho _+-\rho _{-}\ge 0\).

Note that \(v_1\) is Lipschitz in the region \(\rho \ge \overline{\rho }\) with

$$\begin{aligned} |v_1(x)-v_1(y)|\le \omega |x-y| \quad \text {for } x,y\ge \overline{\rho } \text { and } \omega =(2M+3c_0^2)\varepsilon ^3=k\varepsilon ^3. \end{aligned}$$

Then \(\ddot{\eta }=v_1(\rho _+)-v_1(\rho _{-})+2a\varepsilon ^4\Rightarrow |\ddot{\eta }|\le \omega |\eta |+2a\varepsilon ^4=\omega \eta +2a\varepsilon ^4\), so

$$\begin{aligned} |\ddot{\eta }|\le \omega \eta +2a\varepsilon ^4. \end{aligned}$$
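Both the monotonicity condition used for the Sandwich Lemma and the Lipschitz constant \(\omega \) come from one derivative computation, sketched here under the standing assumptions \(\rho \ge \overline{\rho }=\varepsilon ^{-1}\ge 1\):

$$\begin{aligned} \frac{\partial v_1}{\partial \rho }= & {} -3c_0^2\rho ^{-4}+2M\rho ^{-3}=\rho ^{-4}(2M\rho -3c_0^2)\ge 0 \iff \rho \ge \frac{3c_0^2}{2M},\\ \left| \frac{\partial v_1}{\partial \rho }\right|\le & {} 2M\rho ^{-3}+3c_0^2\rho ^{-4}\le (2M+3c_0^2)\varepsilon ^3=\omega . \end{aligned}$$

The Lipschitz estimate then follows from the mean value theorem, and \(\varepsilon \le 1\) is what allows \(\rho ^{-4}\le \varepsilon ^4\) to be replaced by \(\varepsilon ^3\).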

Let \(F=v_1(\rho _+)-v_1(\rho _{-})+2a\varepsilon ^4\) then we have \(0\le F\le \omega \eta +2a\varepsilon ^4\) provided \(\rho _{-}\le \rho _+\) and \(\frac{\partial v_1}{\partial \rho }>0\), which indeed holds throughout the region \(I\ge \overline{I}\) for time \(|t|\le B_1\varepsilon ^{-3/2}\). Now the Sandwich Lemma with \(F_{+}(\eta )=\omega \eta +2a\varepsilon ^4\) and \(F_{-}=0\) gives:

$$\begin{aligned} 0\le \eta (t)\le \frac{2a\varepsilon ^4}{\omega }(\cosh \sqrt{\omega }t-1) \end{aligned}$$

and since \(\omega =k\varepsilon ^3\) where \(2M\le k\le 2M+3c_{J_2}^2\) we have

$$\begin{aligned} |\rho (t)-\rho _{osc}(t)|\le \rho _+(t)-\rho _{-}(t)=\eta (t)\le \frac{2a\varepsilon }{k}(2+e^{\sqrt{\omega }|t|})\le A_1\varepsilon \end{aligned}$$

for time \(|t|\le B_1\varepsilon ^{-3/2}\) as long as we are in the region \(I\ge \overline{I}\) and where we set \(A_1=\frac{a}{M}(2+e^{\sqrt{2M+3c_{J_2}^2} B_1})\). \(\square \)
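The constants of Proposition 3 and the final inequality can be evaluated numerically. This is a minimal sketch: the function names `prop3_constants` and `eta_upper` and all numerical values are hypothetical, chosen only to exercise the formulas \(A=\alpha _2^{-1}c_{g_2}c_r^2\), \(b=(AB_1)^2+2c_{J_2}AB_1\), \(a=b+A\) and \(A_1=\frac{a}{M}(2+e^{\sqrt{2M+3c_{J_2}^2}B_1})\) from the proof.

```python
import math

def prop3_constants(M, alpha2, c_g2, c_r, c_J2, B1):
    """Return (A, b, a, A1) as defined in the proof of Proposition 3."""
    A = c_g2 * c_r**2 / alpha2
    b = (A * B1)**2 + 2.0 * c_J2 * A * B1
    a = b + A
    A1 = (a / M) * (2.0 + math.exp(math.sqrt(2.0 * M + 3.0 * c_J2**2) * B1))
    return A, b, a, A1

def eta_upper(t, eps, a, k):
    """Comparison solution (2 a eps^4 / omega)(cosh(sqrt(omega) t) - 1), omega = k eps^3."""
    omega = k * eps**3
    return (2.0 * a * eps**4 / omega) * (math.cosh(math.sqrt(omega) * t) - 1.0)

# Hypothetical sample constants
M, alpha2, c_g2, c_r, c_J2, B1 = 3.0, 2.0 / 3.0, 1.0, 1.0, 1.0, 1.0
A, b, a, A1 = prop3_constants(M, alpha2, c_g2, c_r, c_J2, B1)
for eps in (1.0, 0.1, 0.01):
    t_max = B1 * eps**-1.5            # end of the time window in Eq. (8)
    worst = eta_upper(t_max, eps, a, 2.0 * M + 3.0 * c_J2**2)
    print(eps, worst, A1 * eps, worst <= A1 * eps)
```

The comparison works because \(\sqrt{\omega }\,|t|\le \sqrt{k}B_1\) on the whole time window, so the exponential factor does not grow as \(\varepsilon \rightarrow 0\).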

Proposition 4

Set \(R=\max \{ \overline{R}, I^{**}, 4\alpha _1c_r^2\}\) where \(\overline{R}\) is from Proposition 3. For \(\overline{I}\ge R\) set

$$\begin{aligned} \overline{I}^+=4(\overline{I}-\alpha _1c_r^2)>\overline{I}. \end{aligned}$$

Then for any orbit with an initial condition in the strip

$$\begin{aligned} \overline{I}\le I\le \overline{I}^+, \end{aligned}$$

we have that Eq. (7) holds with \(B_1=2^{3/2}\pi /\sqrt{M}\) until the osculating orbit enters the region \(I\le \overline{I}\).

Fig. 3 Two equivalent configurations

Proof

First consider orbits with initial condition in \(I\ge \overline{I}\) for some \(\overline{I}\ge \max \{I^{**}, \overline{R}\}\) and with \( \varepsilon =1/\overline{\rho }\) defined as in Proposition 3 and recall that \(I\ge \overline{I}\) implies that \(\rho \ge \overline{\rho }\). For osculating collision orbits with \(J_2(0)=0\), some energy \(H_2\) and \(\rho (0)=\rho _{osc}(0)>\overline{\rho }\) the time to collision in forward time (or time from expulsion in backward time) \(t_c\) satisfies:

$$\begin{aligned} t_c\le \pi (8M)^{-1/2}\rho _{osc}(0)^{3/2}. \end{aligned}$$
(11)
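The constant in Eq. (11) is the free-fall time. As a sketch, assume the extremal case of release from rest at distance \(\rho _0=\rho _{osc}(0)\): the radial orbit is then a degenerate ellipse with semi-major axis \(\rho _0/2\), and the fall to collision takes half a Kepler period,

$$\begin{aligned} t_{ff}=\pi \sqrt{\frac{(\rho _0/2)^3}{M}}=\pi (8M)^{-1/2}\rho _0^{3/2}. \end{aligned}$$

A radial orbit passing through \(\rho _0\) with nonzero radial speed moves faster at every distance than the one released from rest, so its time to collision (in the appropriate time direction) is no larger.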

We will use Lambert’s theorem (see Albouy 2002) to compare time to pericenter for general osculating orbits to these collision times. Lambert’s theorem says that for Kepler orbits, the time of travel between two points \(a_1, a_2\) on the orbit is a function of the energy, the chord length \(d=|a_1-a_2|\) and the sum \(|a_1|+|a_2|=r_1+r_2\) (where the origin is at the focus, see Fig. 3). Namely, for equivalent configurations (those having the same energy, same chord length d, and \(r_1+r_2=s_1+s_2\), where \(s_i=|b_i|\)) the time of travel from \(a_2\) to \(a_1\) is the same as the time of travel from \(b_2\) to \(b_1\). Figure 3 shows how we will choose our equivalent configurations.

For a general osculating orbit \(\rho _{osc}\), take \(r_1=\rho _{osc}^{pc}\), \(r_2=\rho _{osc}(0)=\rho (0)>\overline{\rho }\) and then \(s_1, s_2\) are determined by \(s_2-s_1=d=|a_2-a_1|\) and \(s_1+s_2=r_1+r_2\). By Lambert’s theorem and Eq. (11) we have that the time to pericenter, \(t_{pc}\), satisfies

$$\begin{aligned} t_{pc}\le \pi (8M)^{-1/2} s_2^{3/2}. \end{aligned}$$
(12)

And since \(r_2\ge r_1\) (as we are in \(I\ge I^{**}\)) we have:

$$\begin{aligned} 2s_2-(r_1+r_2)= & {} s_2-s_1=d\le r_1+r_2\Rightarrow \\ s_2\le & {} r_1+r_2\le 2r_2. \end{aligned}$$

So continuing with Eq. (12), we have

$$\begin{aligned} t_{pc}\le \pi M^{-1/2}r_2^{3/2}. \end{aligned}$$

To compare \(t_{pc}\) with our estimates Eq. (8) we want \(t_{pc}\le B_1\varepsilon ^{-3/2}=B_1\overline{\rho }^{3/2}\), which holds when:

$$\begin{aligned} \pi M^{-1/2}r_2^{3/2}\le & {} B_1\overline{\rho }^{3/2}\Rightarrow \\ r_2^{3/2}\le & {} \pi ^{-1}M^{1/2}B_1\overline{\rho }^{3/2}. \end{aligned}$$

Take \(B_1=2^{3/2}\pi /\sqrt{M}\) so that we will be working in the strip:

$$\begin{aligned} \overline{\rho }^{3/2}\le r_2^{3/2}\le 2^{3/2}\overline{\rho }^{3/2} \end{aligned}$$

i.e., (recall that \(r_2=\rho (0)=\rho _{osc}(0)\))

$$\begin{aligned} \overline{\rho }\le \rho \le 2\overline{\rho }. \end{aligned}$$

The condition \(\rho \le 2\overline{\rho }\) is ensured (Eqs. 1, 6) when \(I\le \overline{I}^+:=4\alpha _2\overline{\rho }^2=4(\overline{I}-\alpha _1 c_r^2)\).

Also, we ensure \(\overline{I}<\overline{I}^+=4(\overline{I}-\alpha _1c_r^2)\) provided \(\overline{I}\ge 4 \alpha _1 c_r^2 > \frac{4}{3} \alpha _1 c_r^2\). \(\square \)
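The bound \(t_{pc}\le \pi M^{-1/2}r_2^{3/2}\) and the choice \(B_1=2^{3/2}\pi /\sqrt{M}\) can be sanity-checked on elliptic orbits using Kepler's equation instead of Lambert's theorem. This is a sketch restricted to the elliptic case. The function names and the sample values of M, a, e and the eccentric anomaly E are hypothetical.

```python
import math

def time_to_pericenter(a, e, E, M):
    """Time between pericenter and eccentric anomaly E in [0, pi] (Kepler's equation)."""
    return math.sqrt(a**3 / M) * (E - e * math.sin(E))

def radius(a, e, E):
    """Distance from the focus at eccentric anomaly E."""
    return a * (1.0 - e * math.cos(E))

def pericenter_time_bound(r2, M):
    """Right-hand side of the bound t_pc <= pi M^(-1/2) r2^(3/2)."""
    return math.pi * r2**1.5 / math.sqrt(M)

M = 2.0                                   # hypothetical value of the mass constant
B1 = 2.0**1.5 * math.pi / math.sqrt(M)    # the choice made in Proposition 4
for a, e in ((1.0, 0.0), (5.0, 0.5), (50.0, 0.99)):
    for E in (0.5, 1.5, math.pi):
        r2 = radius(a, e, E)
        t_pc = time_to_pericenter(a, e, E, M)
        rho_bar = r2 / 2.0                # r2 at the top of the strip rho_bar <= r2 <= 2 rho_bar
        print(t_pc <= pericenter_time_bound(r2, M) * (1 + 1e-12),
              pericenter_time_bound(r2, M) <= B1 * rho_bar**1.5 * (1 + 1e-12))
```

The circular orbit at \(E=\pi \) attains equality in the first bound, which is why a small relative tolerance is used in the comparison.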

Proposition 5

(Main Theorem) Fix a parameter \(\lambda >0\). Then there exists \(R_\lambda (m_i, H, J)>0\) such that any orbit with initial condition satisfying \(I(0)\ge R_\lambda \) comes in forward or backward time into the region \(I\le R_\lambda \).

Explicitly, take \(\overline{R}_\lambda =\max \{R, \alpha _1c_r^2+\alpha _2(\frac{\alpha _2 A_1^2}{\lambda }), 2\alpha _2 A_1+4\alpha _1 c_r^2 +\lambda \}\) and \(R_\lambda =\overline{R}_{\lambda }+2\alpha _2A_1+\lambda \) where R is from Proposition 4.

Proof

Take \(\overline{I}\ge \overline{R}_\lambda \) and \(\varepsilon ^{-1}=\overline{\rho }=\sqrt{\alpha _2^{-1}(\overline{I}-\alpha _1c_r^2)}\) and consider an orbit with initial condition in \(\overline{I}^+ \ge I\ge \overline{I}\) as in Proposition 4. By Proposition 2 we can let \(t^*\) be the time the osculating orbit hits \(\alpha _1 c_r^2+\alpha _2\rho _{osc}^2( t^*)=\overline{I}\), i.e., \(\rho _{osc}(t^*)=\overline{\rho }=\varepsilon ^{-1}\).

Along the true motion then at \(t^*\) we have by Proposition 4 (Eqs. 6, 7) that

$$\begin{aligned} I(t^*)\le \alpha _1 c_r^2+\alpha _2\rho (t^*)^2\le \alpha _1c_r^2+\alpha _2(\rho _{osc}(t^*)+A_1\varepsilon )^2=\overline{I}+2\alpha _2 A_1+\alpha _2 A_1^2\varepsilon ^2 \end{aligned}$$

holds. Moreover, due to the condition \(\overline{I}\ge \overline{R}_\lambda \ge \alpha _1 c_r^2+\alpha _2(\frac{\alpha _2A_1^2}{\lambda })\) we have \(\varepsilon ^2\le \frac{\lambda }{\alpha _2 A_1^2}\), so that

$$\begin{aligned} I(t^*)\le \overline{I}+2\alpha _2A_1+\lambda . \end{aligned}$$

Also the condition \(\overline{I}\ge \overline{R}_\lambda \ge 2\alpha _2 A_1+4\alpha _1 c_r^2 +\lambda >\frac{1}{3}(2\alpha _2 A_1+4\alpha _1 c_r^2 +\lambda )\) ensures that \(\overline{I}+2\alpha _2A_1+\lambda <\overline{I}^+.\)

That is, taking any \(\overline{I}\ge \overline{R}_\lambda \) and setting \(\lambda '=2\alpha _2A_1+\lambda \), all orbits with initial condition in the strip

$$\begin{aligned} \overline{I}+\lambda '\le I(0)\le \overline{I}^+ \end{aligned}$$

come in forward or backward time into the region

$$\begin{aligned} I\le \overline{I}+\lambda '. \end{aligned}$$

In particular by setting

$$\begin{aligned} \overline{I}_s=\overline{R}_\lambda +s \end{aligned}$$

for \(s\ge 0\), we may exhaust the region \(I\ge R_\lambda =\overline{R}_\lambda +\lambda '=\overline{I}_0+\lambda '\) with the strips

$$\begin{aligned} \overline{I}_s+\lambda '\le I\le \overline{I}_s^+. \end{aligned}$$

Note that \(s>s'\) implies \(\overline{I}_s^+-\overline{I}_s>\overline{I}_{s'}^+-\overline{I}_{s'}\). Hence, any orbit with initial condition \(I(0)\ge R_\lambda \) will be forced to jump back along the strips (see Fig. 4).

Fig. 4 Jumping back along the strips

Finally in Theorem 1 we can take \(I_0=R_{\lambda }\) for any choice of \(\lambda \) (for instance, \(I_0=\min _{\lambda \in (0,1)} R_\lambda \)). \(\square \)
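The explicit radius of Proposition 5 and the strips used to exhaust the region \(I\ge R_\lambda \) can be computed directly from the formulas above. This is a minimal sketch: the function names and the sample constants (R, \(\alpha _1\), \(\alpha _2\), \(c_r\), \(A_1\), \(\lambda \)) are hypothetical placeholders, not values derived from an actual choice of masses.

```python
def main_theorem_radius(R, alpha1, alpha2, c_r, A1, lam):
    """Return (R_bar_lambda, R_lambda) as defined in Proposition 5."""
    R_bar = max(R,
                alpha1 * c_r**2 + alpha2 * (alpha2 * A1**2 / lam),
                2.0 * alpha2 * A1 + 4.0 * alpha1 * c_r**2 + lam)
    return R_bar, R_bar + 2.0 * alpha2 * A1 + lam

def strip(I_bar, alpha1, alpha2, c_r, A1, lam):
    """Return the strip [I_bar + lambda', I_bar^+] with lambda' = 2 alpha2 A1 + lambda."""
    lower = I_bar + 2.0 * alpha2 * A1 + lam
    upper = 4.0 * (I_bar - alpha1 * c_r**2)
    return lower, upper

# Hypothetical sample constants
R, alpha1, alpha2, c_r, A1, lam = 10.0, 0.5, 1.5, 1.0, 2.0, 0.5
R_bar, R_lam = main_theorem_radius(R, alpha1, alpha2, c_r, A1, lam)
for s in (0.0, 1.0, 10.0, 100.0):
    lower, upper = strip(R_bar + s, alpha1, alpha2, c_r, A1, lam)
    # Each strip is nonempty and its width grows with s.
    print(s, lower, upper, upper - lower)
```

Since the lower end of the strip for \(\overline{I}_s\) is \(R_\lambda +s\), every value \(I\ge R_\lambda \) lies in some strip, and the strip width \(3\overline{I}_s-4\alpha _1c_r^2-\lambda '\) increases with s.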