Abstract
The Kepler orbits form a 3-parameter family of unparametrized plane curves, consisting of all conics sharing a focus at a fixed point. We study the geometry and symmetry properties of this family, as well as natural 2-parameter subfamilies, such as those of fixed energy or angular momentum. Our main result is that the Kepler orbits form a ‘flat’ family, that is, the local diffeomorphisms of the plane preserving this family form a 7-dimensional local group, the maximum dimension possible for the symmetry group of a 3-parameter family of plane curves. These symmetries are different from the well-studied ‘hidden’ symmetries of the Kepler problem, acting on energy levels in the 4-dimensional phase space of the Kepler system. Each 2-parameter subfamily of Kepler orbits with fixed non-zero energy (Kepler ellipses or hyperbolas with fixed length of major axis) admits \(\mathrm { PSL}_2(\mathbb {R})\) as its (local) symmetry group, corresponding to one of the items of a classification due to Tresse (Détermination des invariants ponctuels de l’équation différentielle ordinaire du second ordre \(y^{\prime \prime }= \omega (x, y, y^{\prime })\), vol. 32, S. Hirzel, 1896) of 2-parameter families of plane curves admitting a 3-dimensional local group of symmetries. The 2-parameter subfamilies with zero energy (Kepler parabolas) or fixed non-zero angular momentum are flat (locally diffeomorphic to the family of straight lines). These results can be proved using techniques developed in the nineteenth century by Lie to determine ‘infinitesimal point symmetries’ of ODEs, but our proofs are much simpler, using a projective geometric model for the Kepler orbits (plane sections of a cone in projective 3-space). In this projective model, all symmetry groups act globally. Another advantage of the projective model is a duality between Kepler’s plane and Minkowski’s 3-space parametrizing the space of Kepler orbits. We use this duality to deduce several results on the Kepler system, old and new.
1 Introduction and Statement of Main Results
A Kepler orbit is a plane conic—ellipse, parabola or hyperbola—with a focus at the origin (in case of a hyperbola only the branch bending around the origin is taken). Kepler orbits form a 3-parameter family of plane curves, traced by the motions of a point mass subject to Newton’s inverse square law: the radial attractive force is proportional to the inverse square of the distance to the origin. We exclude ‘collision orbits’ (lines through the origin). See Fig. 1.
1.1 Orbital Symmetries
These are local diffeomorphisms of \(\mathbb {R}^2{\setminus } 0,\) taking (unparametrized) Kepler orbits to Kepler orbits. At the outset, it is not clear that there are any such symmetries, local or global, other than the obvious ones—dilations and rotations about the origin, or reflections about lines through the origin (a 2-dimensional group of symmetries). Nevertheless, as we find out, there are many additional orbital symmetries, both for the full 3-parameter family of Kepler orbits, as well as for some natural 2-parameter subfamilies.
Theorem 1
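The orbital symmetries of the Kepler problem form a 7-dimensional group of local diffeomorphisms of \(\mathbb {R}^2{\setminus } 0\) (aka a ‘pseudo-group’), the maximum dimension possible for a 3-parameter family of plane curves, generated by the following infinitesimal symmetries (vector fields whose flows act by orbital symmetries):
$$\begin{aligned} r\partial _r, \quad \partial _\theta , \quad r\partial _x, \quad r\partial _y, \quad xr\partial _r, \quad yr\partial _r, \quad r^2\partial _r \end{aligned}$$(1)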
(using both Cartesian and polar coordinates).
Note that the first two vector fields generate dilations and rotations, the ‘obvious’ symmetries mentioned above. How about the rest of the symmetries? Where do they come from?
We emphasize that the 7 vector fields of Theorem 1 do not generate an honest 7-dimensional Lie group action on \(\mathbb {R}^2{\setminus } 0\). The first 4 vector fields do generate an action of the connected component of the group \(\mathrm {CO}_{2,1}\) on \(\mathbb {R}^2{\setminus } 0\), but the last three vector fields are in fact incomplete (their integral curves “run to infinity” in finite time). As we explain later, to obtain a global group action, one needs to embed the Kepler plane in a larger surface, a cone in \(\mathbb {R}P^3\), to which the above 7 vector fields extend, generating an action of the 7-dimensional subgroup of \(\mathrm {PGL}_4(\mathbb {R})\) preserving this cone.
Now quite generally, there is a standard method for finding infinitesimal symmetries of n-parameter families of plane curves, going back to Lie in the nineteenth century, consisting of first writing down an nth-order scalar ODE whose graphs of solutions form the curves of the family. Then, one writes down a system of PDEs for the infinitesimal symmetries of this ODE, which with some luck and skill, one can solve explicitly. See Chapter 6 of Olver’s book [38]. This is a straightforward albeit tedious procedure (best left nowadays to computers), producing the infinitesimal symmetries above, but the result remains mysterious.
Instead, our proof of Theorem 1 exploits the peculiar geometry of Kepler’s problem, in particular, its projective geometry, borrowing from Lie’s theory only the upper bound of 7 on the dimension of the symmetry group. This proof, rather than the actual statement of Theorem 1, is the main thrust of this article. See Sect. 1.3 for a sketch of the proof.
1.2 The Space of Kepler Orbits
Every Kepler orbit is the orthogonal projection onto the xy plane (the ‘Kepler plane’) of a conic section, the intersection of the cone \({\mathcal {C}}:=\{x^2+y^2=z^2\}\subset \mathbb {R}^3\) with a plane \(ax+by+cz=1,\ c\ne 0\). See Sect. 3 below for a proof as well as a reminder of some other standard facts about the Kepler problem. Let \(\mathbb {R}^{2,1}\) be the 3-dimensional space with coordinates (a, b, c) equipped with Minkowski’s quadratic form \(\Vert (a,b,c)\Vert ^2:=a^2+b^2-c^2\) (we use this notation even though the expression has negative values!). Note that the planes \(ax+by+cz=1\) and \(ax+by-cz=1\) (the reflection of the former about the xy plane) generate the same Kepler orbit. Thus \(\mathbb {R}^{2,1}_+=\{c>0\}\subset \mathbb {R}^{2,1}\) is identified with the space of Kepler orbits. Furthermore, the cone \(\Vert (a,b,c)\Vert ^2=0\) parametrizes Kepler parabolas, its interior \(\Vert (a,b,c)\Vert ^2<0\) parametrizes Kepler ellipses and its exterior \(\Vert (a,b,c)\Vert ^2>0\) parametrizes Kepler hyperbolas. See Fig. 2.
The orbital symmetries of Theorem 1 clearly act on the space of Kepler orbits and thus on \(\mathbb {R}^{2,1}_+\). Again, this is only a local action (a 7-dimensional Lie algebra of vector fields), but it extends to a global action on all of \(\mathbb {R}^{2,1}\).
Theorem 2
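The local group action of the orbital symmetries of the Kepler problem on \(\mathbb {R}^{2,1}_+\) extends to \(\mathbb {R}^{2,1}\), generating the identity component of the group \(\mathrm {CO}_{2,1}\ltimes \mathbb {R}^{2,1}\) of Minkowski similarities (compositions of Minkowski rotations, dilations and translations). The infinitesimal generators of this action, corresponding to those of Eq. (1), are
$$\begin{aligned} -a\partial _a-b\partial _b-c\partial _c, \quad a\partial _b-b\partial _a, \quad -c\partial _a-a\partial _c, \quad -c\partial _b-b\partial _c, \quad -\partial _a, \quad -\partial _b, \quad -\partial _c. \end{aligned}$$(2)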
The first vector field generates dilations in \(\mathbb {R}^{2,1}\), the next 3 generate Minkowski rotations about the origin and the last 3 generate translations. It follows that orbital symmetries actually ‘mix’ the orbit types (ellipses, parabolas, hyperbolas).
The horizontal plane \(\{c=0\}\subset \mathbb {R}^{2,1}\) corresponds to ‘ideal’ Kepler orbits which are inevitably added upon completing the orbital symmetry action. For \((a,b,0)\ne (0,0,0)\) they are (affine) lines in \(\mathbb {R}^2{\setminus } 0\), obtained by projecting to the xy plane sections of \({\mathcal {C}}\) by vertical affine 2-planes in \(\mathbb {R}^3\). The point \((0,0,0)\in \mathbb {R}^{2,1}\) corresponds to the ‘line at infinity’ in the Kepler plane.
1.3 Sketch of Proof of Theorems 1 and 2
With Fig. 2 in mind, consider the group \(\mathrm {CO}_{2,1}\subset \mathrm {GL}_3(\mathbb {R})\), preserving the quadratic form \(x^2+y^2-z^2\) up to scale. Its identity component acts on \({\mathcal {C}}_+:=\{x^2+y^2=z^2, z>0\}\), preserving its set of plane sections, thus projects to an action on \(\mathbb {R}^2{\setminus } 0\) by orbital symmetries. This accounts for the first 4 vector fields of Eq. (1).
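The first step of the sketch can be checked numerically. The following is a minimal sketch, assuming the Lorentz boost in the (x, z) directions as a sample element of \(\mathrm {CO}_{2,1}\); the helper names (`orbit_point`, `boost_x`, `boosted_parameters`, `max_residual`) are hypothetical and the formula for the image parameters is obtained by direct substitution, not quoted from the text.

```python
import math

def orbit_point(a, b, c, theta):
    """Point of the Kepler orbit a*x + b*y + c*r = 1 at polar angle theta."""
    r = 1.0 / (a * math.cos(theta) + b * math.sin(theta) + c)
    return r * math.cos(theta), r * math.sin(theta)

def boost_x(x, y, t):
    """Lift (x, y) to the cone point (x, y, r), boost in the (x, z) plane, project back."""
    r = math.hypot(x, y)
    return x * math.cosh(t) + r * math.sinh(t), y

def boosted_parameters(a, b, c, t):
    """Parameters of the image orbit (assumption: derived by substitution)."""
    return (a * math.cosh(t) - c * math.sinh(t),
            b,
            c * math.cosh(t) - a * math.sinh(t))

def max_residual(a, b, c, t, samples=50):
    """Largest deviation of boosted orbit points from the predicted image orbit."""
    a2, b2, c2 = boosted_parameters(a, b, c, t)
    worst = 0.0
    for k in range(samples):
        x, y = orbit_point(a, b, c, 2.0 * math.pi * k / samples)
        x2, y2 = boost_x(x, y, t)
        worst = max(worst, abs(a2 * x2 + b2 * y2 + c2 * math.hypot(x2, y2) - 1.0))
    return worst
```

For an ellipse such as (a, b, c) = (0.3, 0.1, 1.0), the residual stays at rounding level, and the Minkowski norm \(a^2+b^2-c^2\) of the parameters is unchanged by the boost, so this particular symmetry preserves the orbit type.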
Next, consider the 3-dimensional projective space \(\mathbb {R}P^3\) with homogeneous coordinates (X : Y : Z : W) and embed \(\mathbb {R}^3\hookrightarrow \mathbb {R}P^3\) as the affine chart \(W\ne 0\), \((x,y,z)\mapsto (x:y:z:1).\) The closure of \({\mathcal {C}}\) in \(\mathbb {R}P^3\), \(\overline{{\mathcal {C}}}=\{(X:Y:Z:W)\, | \,X^2+Y^2=Z^2\}\), is obtained by adding to \({\mathcal {C}}\) the ‘circle at infinity’ \(S^1_\infty =\{X^2+Y^2=Z^2, W=0\}\). See Fig. 3. Now consider the group \(\widetilde{{\mathcal {G}}}\subset \mathrm {GL}_4(\mathbb {R})\), preserving the (degenerate) quadratic form \(X^2+Y^2-Z^2\), up to scale. A simple calculation (see Sect. 5 below) shows that \(\widetilde{{\mathcal {G}}}\) is an 8-dimensional group, thus its image \({\mathcal {G}}=\widetilde{{\mathcal {G}}}/\mathbb {R}^*\subset \mathrm {PGL}_4(\mathbb {R})\) is 7-dimensional, acting effectively on \(\overline{{\mathcal {C}}}\), preserving its set of (projective) plane sections. It leaves invariant the set of sections by planes not passing through the vertex of \(\overline{{\mathcal {C}}}\), parametrized by \(\mathbb {R}^{2,1}\). The action restricts to a local action on \({\mathcal {C}}_+\subset \overline{{\mathcal {C}}}\), then projects to a local action on \(\mathbb {R}^2{\setminus } 0\) by orbital symmetries. Equations (1) and (2) follow easily from this description.
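The dimension count for \(\widetilde{{\mathcal {G}}}\) can be reproduced at the Lie algebra level. The sketch below assumes the standard linearization: a matrix A is in the Lie algebra when \(A^TQ+QA=\lambda Q\) for some scalar \(\lambda \), with Q the diagonal matrix of the quadratic form. It counts the solutions (A, \(\lambda \)) by exact Gaussian elimination; the function names are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix given as a list of rows of Fractions."""
    rows = [row[:] for row in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def symmetry_algebra_dimension(q):
    """Dimension of {A : A^T Q + Q A = lambda Q for some lambda}, Q = diag(q)."""
    n = len(q)
    unknowns = n * n + 1          # entries of A (row-major), then lambda
    equations = []
    for i in range(n):
        for j in range(i, n):     # the condition is symmetric in (i, j)
            row = [Fraction(0)] * unknowns
            row[j * n + i] += q[j]    # (A^T Q)_{ij} = q_j * A_{ji}
            row[i * n + j] += q[i]    # (Q A)_{ij}   = q_i * A_{ij}
            if i == j:
                row[n * n] -= q[i]    # -lambda * Q_{ii}
            equations.append(row)
    return unknowns - rank(equations)
```

With `q = [1, 1, -1, 0]` the count is 8, matching the dimension of \(\widetilde{{\mathcal {G}}}\) stated above, and with `q = [1, 1, -1]` it is 4, the dimension of \(\mathrm {CO}_{2,1}\).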
Finally, we use a basic result of Lie’s theory of symmetries of ODEs (reviewed in the Appendix), according to which the maximum dimension of the group of point symmetries of a 3rd-order ODE is 7, thus the above construction provides the full group of orbital symmetries of the Kepler problem. See Sect. 5 for the full details.
1.4 2-Parameter Subfamilies
The simplest example of a 2-parameter family of plane curves (also called a ‘path geometry’) is the family of straight lines. It admits an 8-dimensional local group of symmetries (the projective group), the maximum dimension possible for a 2-parameter family of plane curves. A 2-parameter family of plane curves locally diffeomorphic to this family is called flat. There are no straight lines among Kepler orbits, but there are flat 2-parameter subfamilies.
Theorem 3
Kepler’s parabolas form a flat 2-parameter family of curves. The map \({{\mathbf {z}}}\mapsto {{\mathbf {z}}}^2\) (in complex notation) is a local diffeomorphism taking straight affine lines to Kepler parabolas.
This theorem is essentially known. The squaring map \({{\mathbf {z}}}\mapsto {{\mathbf {z}}}^2\), in the context of the Kepler problem, is known sometimes as the Levi-Civita or Bohlin map. It can be also used to define a local orbital equivalence between Hooke and Kepler orbits (see, e.g., Appendix 1 of [5]).
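A quick numerical illustration of Theorem 3, as a sketch only: square three points of an affine line (not through the origin), fit the parameters (a, b, c) of the curve \(ax+by+cr=1\) through their images, and test whether \(a^2+b^2=c^2\) and whether further image points lie on the same curve. The helper names are hypothetical.

```python
def solve3(m, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det(m)
    solution = []
    for k in range(3):
        mk = [row[:] for row in m]
        for i in range(3):
            mk[i][k] = rhs[i]
        solution.append(det(mk) / d)
    return solution

def image_of_line(p, d, ts):
    """Images under z -> z^2 of the points p + t*d (p, d complex, t real)."""
    return [(p + t * d) ** 2 for t in ts]

def fit_orbit(points):
    """(a, b, c) with a*x + b*y + c*r = 1 through three complex points."""
    rows = [[w.real, w.imag, abs(w)] for w in points]
    return solve3(rows, [1.0, 1.0, 1.0])
```

For the line through `1 + 0.5j` with direction `0.3 - 1j`, the fitted parameters satisfy the parabola condition up to rounding, and other points of the squared line satisfy the same equation.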
Theorem 4
Kepler’s orbits with fixed angular momentum \(\pm M\ne 0\) form a flat 2-parameter family of curves. The map \({{\mathbf {r}}}\mapsto {{\mathbf {r}}}/(1-r/M^2)\) takes Kepler orbits with angular momentum M to straight lines.
See Sect. 3 for a reminder about the angular momentum (also Fig. 1). One could verify this theorem by a straightforward calculation in polar coordinates (see Sect. 5.4) but the result becomes more transparent using the geometry of the space \(\mathbb {R}^{2,1}\) of Kepler orbits: the family of Kepler orbits with fixed |M| is represented in \(\mathbb {R}^{2,1}\) by a horizontal plane; a vertical translation in this space, which according to Theorem 2 is available as an orbital symmetry, maps this plane to the plane \(c=0\), parametrizing lines in the xy-plane.
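The map of Theorem 4 is easy to test numerically. The sketch below assumes the parametrization \(ax+by+cr=1\) with \(c=1/M^2\) from Theorem 7 and checks that the image points lie on the straight line \(ax+by=1\); function names are hypothetical, and sample angles must avoid the points where \(r=M^2\), which the map sends to infinity.

```python
import math

def orbit_point(a, b, c, theta):
    """Point of the Kepler orbit a*x + b*y + c*r = 1 at polar angle theta."""
    r = 1.0 / (a * math.cos(theta) + b * math.sin(theta) + c)
    return r * math.cos(theta), r * math.sin(theta)

def straighten(x, y, M):
    """The map r -> r / (1 - r/M^2) of Theorem 4, applied to the point (x, y)."""
    r = math.hypot(x, y)
    scale = 1.0 / (1.0 - r / M ** 2)
    return scale * x, scale * y

def line_residual(a, b, M, thetas):
    """Largest deviation of the image points from the line a*x + b*y = 1."""
    c = 1.0 / M ** 2
    worst = 0.0
    for theta in thetas:
        x, y = orbit_point(a, b, c, theta)
        u, v = straighten(x, y, M)
        worst = max(worst, abs(a * u + b * v - 1.0))
    return worst
```

In the space of orbits this is the vertical translation \(c\mapsto c-1/M^2\) described above: the term \(cr\) is absorbed by the rescaling, leaving the equation of a line.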
Next we consider Kepler orbits with fixed energy \(E\ne 0.\) These fill up a plane region \({\mathcal {H}}_E\), the Hill region. For \(E\ge 0\) (Kepler hyperbolas with major axis 1/E or Kepler parabolas), the Hill region is the whole punctured plane, for \(E<0\) (Kepler ellipses with major axis 1/|E|) it is a punctured disk of radius 1/|E|. See Fig. 4.
Theorem 5
(a) For each fixed energy \(E\ne 0\), the 2-parameter family of Kepler orbits with energy E is non-flat but is locally homogeneous: its orbital symmetry group is a 3-dimensional subgroup of the 7-dimensional group of Kepler’s orbital symmetries, isomorphic to \(\mathrm {PSL}_2(\mathbb {R})\) and generated by the infinitesimal symmetries
$$\begin{aligned} \partial _\theta , r(\partial _x + Ex\partial _r), r(\partial _y +E y\partial _r). \end{aligned}$$(3)
(b) For \(E<0\) the action of \(\mathrm {PSL}_2(\mathbb {R})\) on the Hill region \({\mathcal {H}}_E\) is global; for \(E>0\) it is only local.
This theorem is also essentially known, or at least can be deduced by experts on ‘superintegrable metrics’ from known results (see Remark 5.4 below for more details and references).
Our proof of this theorem is quite simple using the geometry of the space of orbits \(\mathbb {R}^{2,1}\): as we explain in Sect. 3, orbits of fixed energy E correspond to one of the sheets of the hyperboloid of two sheets \(a^2+b^2-(c-|E|)^2=-E^2\) (the upper sheet for \(E<0\), the lower one for \(E>0\)). See Fig. 5(iii). The Minkowski metric in \(\mathbb {R}^{2,1}\) restricts to a hyperbolic metric on each of these sheets, and the subgroup of \({\mathcal {G}}\simeq \mathrm {CO}_{2,1}\ltimes \mathbb {R}^{2,1}\) preserving the hyperboloid acts as the full group of isometries of this metric, with generators given by Eq. (3).
Any two Hill regions with the same sign of energy are obviously orbitally equivalent by dilation. For opposite signs of energies, this is still true but less obvious.
Theorem 6
\({\mathcal {H}}_1\) is orbitally embedded in \({\mathcal {H}}_{-1}\) by the map \({{\mathbf {r}}}\mapsto {{\mathbf {r}}}/(1+2r).\) See Fig. 6.
Viewed in \(\mathbb {R}^{2,1}\), where the two Hill regions correspond to the two sheets of a hyperboloid, the map is simply the reflection about a horizontal plane \(c=1\), interchanging the two sheets. See Fig. 5(iii).
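Theorem 6 can be illustrated with a short numerical check. This is a sketch under the assumption that a hyperbola of energy 1 is written as \(ax+by+cr=1\) with its representative \(c>0\) (so that \(a^2+b^2-c^2=2c\)); substituting the map then predicts the image orbit \((a,b,c+2)\). The function names are hypothetical.

```python
import math

def energy(a, b, c):
    """Energy of the orbit a*x + b*y + c*r = 1, using the formula of Theorem 7(c)."""
    return (a * a + b * b - c * c) / (2.0 * abs(c))

def hill_map(x, y):
    """The map r -> r / (1 + 2r) of Theorem 6, applied to the point (x, y)."""
    r = math.hypot(x, y)
    return x / (1.0 + 2.0 * r), y / (1.0 + 2.0 * r)

def image_residual(a, b, c, thetas):
    """Largest deviation of mapped orbit points from the orbit (a, b, c + 2)."""
    worst = 0.0
    for theta in thetas:
        denom = a * math.cos(theta) + b * math.sin(theta) + c
        if denom <= 0.0:
            continue  # this direction does not meet the attractive branch
        x, y = math.cos(theta) / denom, math.sin(theta) / denom
        u, v = hill_map(x, y)
        worst = max(worst, abs(a * u + b * v + (c + 2.0) * math.hypot(u, v) - 1.0))
    return worst
```

With \(c=0.5\) and \(a=\sqrt{1.25}\), \(b=0\), the source orbit has energy 1 and the predicted image orbit has energy \(-1\), in agreement with the theorem.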
1.5 Further Results
1. We establish a dictionary between the Minkowski geometry of the Kepler orbit space \(\mathbb {R}^{2,1}\) and properties of Kepler orbits. For example: a parabolic (or isotropic) plane in \(\mathbb {R}^{2,1}\) corresponds to the family of Kepler orbits passing through a fixed point. See Table 1 of Sect. 4.
2. We give three illustrations of the usage of this dictionary: a new proof of ‘Kepler’s fireworks’ (Proposition 4.13), a Keplerian analogue of the 4 vertex and Tait–Kneser theorems (Theorem 8) and a ‘minor axis version’ of Lambert’s Theorem (Theorem 9).
3. Similar results to Theorems 1–6 hold for orbital symmetries of the Hooke problem, that is, the set of conics sharing a center (trajectories of mass points under central force proportional to the distance to the origin), and the orbits of the corresponding ‘Coulomb’ problems, where the sign of the force is reversed, becoming a repelling force. By central projection, our results extend to Hooke and Kepler orbits on surfaces of constant curvature (sphere and hyperbolic plane). See Table 2.
4. We establish a converse to Theorem 1: among all central forces, Hooke and Kepler force laws are the only ones producing ‘flat’ families of orbits (3-parameter families with a 7-dimensional group of symmetries). See Theorem 10. This is reminiscent of Bertrand’s Theorem (1873), characterizing these two force laws as the only central force laws that admit bound orbits and all of whose bound orbits are closed [9, 6, p. 37].
* * *
Techniques. Other than standard projective and differential geometric constructions, we use some of the work of Lie (1874), [45] and [47] on point symmetries of 2nd and 3rd order ODEs. We do not assume the reader’s familiarity with their work. We summarize in the Appendix the needed tools of this theory.
Figures. The figures here were computer generated using Wolfram’s Mathematica and Apple’s Keynote.
2 Wider Context: ‘Orbital’ Versus ‘Dynamical’ Symmetries
The Kepler problem is centuries old with an enormous literature. It is hard to imagine one can add anything new to this problem in the twenty-first century. Yet, new and interesting works continue to appear. See, for example, [8, 10, 24, 29, 36, 43]. Some facts have been rediscovered several times, centuries apart, especially before the existence of internet search engines. For example, V.I. Arnol’d attributes in his 1990 book [5, Appendix 1] the fact that \({{\mathbf {z}}}\mapsto {{\mathbf {z}}}^2\) maps Hooke orbits to Kepler orbits to Bohlin’s 1911 article [11], then goes on to generalize it to a ‘duality’ between central force power laws. In fact, all this appeared in Maclaurin’s 1742 ‘Treatise of fluxions’ [33, Book II, Chap. V, §875] (we thank S. Tabachnikov for pointing out this reference to us).
One of the most studied aspects of the Kepler problem is its symmetry properties. The most obvious symmetries are diffeomorphisms of the plane, mapping solutions \({{\mathbf {r}}}(t)\) of the underlying ODE, \(\ddot{{\mathbf {r}}}=-{{\mathbf {r}}}/r^3\), to solutions. One can show that these consist only of the rotations about the origin and reflections about lines through the origin, valid for any central force motion.
More interesting symmetries arise when the Kepler problem is considered as a Hamiltonian system, i.e., a flow defined on its phase space \(T^*(\mathbb {R}^2{\setminus } 0)\). The symplectomorphisms of phase space preserving parametrized trajectories of this flow form a larger group of symmetries, associated with the Hamiltonian flows of additional conserved quantities such as components of the Laplace–Runge–Lenz vector. These symmetries generate a (local) \(\mathrm {SO}_3\)-action on the open subset of phase space with negative energy. Apart from the lift of the rotation symmetries above, these oft-called ‘hidden’ symmetries do not descend to an action on the Kepler plane, even locally. The action is rather on phase space, mixing position and momentum variables. A good reference for this type of ‘dynamic’ or ‘phase space’ symmetries of the Kepler problem is the book [29] or Chapters 3 and 4 of [22].
In contrast, the symmetries in this article are ‘orbital’ symmetries, acting on the configuration space of the Kepler problem, \(\mathbb {R}^2{\setminus } 0\), not its phase space. They are closer to the symmetries one can extract from Albouy’s ‘projective dynamics’ papers [1, 2].
So how original are our results? As far as we can tell, after consulting with experts and searching the literature, our results are new. The articles [1, 2, 15] are the nearest in spirit that we found. ‘Hidden symmetries’ of the Kepler problem, i.e., of its phase space, have been studied extensively, and symmetries of 2nd- and 3rd-order ODEs have been studied extensively as well since the mid nineteenth century, but it seems that the symmetries of the 2nd- and 3rd-order ODEs that arise in the Kepler problem have not been studied systematically before, which is the present article’s contribution.
But of course, given the subject’s long and rich history, it is still quite possible that at least some of the theorems announced here have been noted before, in some form or another. If some of the readers of this article are aware of such work we will be grateful if they contact us.
3 A Reminder on the Kepler Problem
Here we review briefly some known facts about the Kepler problem that will be used in the sequel. See also [3, 5, 6, 24].
Kepler orbits are the unparametrized plane curves traced by the solutions of the ODE
$$\begin{aligned} \ddot{{\mathbf {r}}}=-{{{\mathbf {r}}}\over r^3}, \end{aligned}$$(4)
where \( {{\mathbf {r}}}={{\mathbf {r}}}(t)=(x(t), y(t))\in \mathbb {R}^2{\setminus } 0\) and \( r:=\Vert {{\mathbf {r}}}\Vert =\sqrt{x^2+y^2}.\)
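The energy and angular momentum of a solution are
$$\begin{aligned} E={1\over 2}\Vert \dot{{\mathbf {r}}}\Vert ^2-{1\over r}, \quad M=x\dot{y}-y\dot{x}, \end{aligned}$$(5)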
respectively, and can be easily shown to remain constant during the motion.
Note that M is twice the sectorial velocity, the rate at which area is swept by the line segment connecting the origin to \({{\mathbf {r}}}(t)\). It follows that \(M=0\) if and only if the motion is along a line passing through the origin. Our exclusion of ‘collision’ orbits thus amounts to assuming \(M\ne 0.\) Note also that although E and M are defined in Eq. (5) via the time parametrization of the Kepler orbit, they are in fact determined by the shape of the underlying unparametrized curve (except for the sign of M). See Fig. 1.
A conic in a Euclidean plane is the locus of points with constant ratio of distances to a fixed point and a fixed line. The fixed point, line and ratio are called a focus, directrix and eccentricity e (respectively). Conics with \(e>1\), \(e=1\), \(0< e <1\) and \(e=0\) are hyperbolas, parabolas, non-circular ellipses and circles (respectively).
Identify the xy plane with the plane \(z=0\) in \(\mathbb {R}^3\), \((x,y)\mapsto (x,y,0)\). We use the term ‘projection’ to mean the orthogonal projection \(\mathbb {R}^3\rightarrow \mathbb {R}^2,\) \((x,y,z)\mapsto (x,y)\).
Theorem 7
(a) Every Kepler orbit is the projection of a section of the cone \({\mathcal {C}}=\{x^2+y^2=z^2\}\subset \mathbb {R}^3\) by a plane \(ax+by+cz=1\), \(c\ne 0.\) More precisely: if \(c>0\) then the orbit is the projection of the intersection of the plane with the upper cone \({\mathcal {C}}_+:={\mathcal {C}}\cap \{z>0\}\); if \(c<0\) then it is the projection of the intersection of the plane with the lower cone \({\mathcal {C}}_-:={\mathcal {C}}\cap \{z<0\}\).
(b) The projected section is a conic with a focus at the origin and eccentricity
$$\begin{aligned} e={\sqrt{a^2+b^2}\over | c|}. \end{aligned}$$(6)
(c) The angular momentum and energy of the projected Kepler orbit are
$$\begin{aligned} M= \pm {1\over \sqrt{|c|}}, \quad E={a^2+b^2-c^2\over 2|c|}. \end{aligned}$$(7)
Remark 3.1
For positive energy orbits (hyperbolas), the plane section has two components (branches), one in each of \({\mathcal {C}}_\pm \), and one needs to pick carefully the correct branch, as stated in item (a).
Proof
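(a) Let \({{\mathbf {r}}}(t)=(x(t), y(t))\) be a solution of Eq. (4) with \(M\ne 0\). Rewriting Eqs. (4) and (5) in polar coordinates, we have
$$\begin{aligned} \ddot{r}-{M^2\over r^3}=-{1\over r^2}, \quad M=r^2{\dot{\theta }}, \quad E={\dot{r}^2\over 2}+{M^2\over 2r^2}-{1\over r}. \end{aligned}$$(8)
From the first equation it follows that the inhomogeneous linear ODE
$$\begin{aligned} \ddot{f}+{f\over r(t)^3}={M^2\over r(t)^3} \end{aligned}$$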
has two particular solutions: r(t) and the constant solution \(M^2\). Their difference is thus a solution of the homogeneous equation \(\ddot{f}+f/ r(t)^3=0\). But x(t), y(t) are two solutions of this equation, linearly independent for \(M\ne 0\); hence, there are constants A, B such that \(r(t)-M^2=Ax(t)+By(t).\) Rearranging and renaming the constants we obtain \(ax+by+cr=1\), \(r^2=x^2+y^2\), as claimed.
The statement about the precise right half cone to pick is best seen by examining Fig. 2(i) and (ii).
(b) By rotating the secting plane about the z axis and possibly reflecting it about the xy plane, we can assume \(a\ge 0, b=0, c>0\). If \(a=0\) then the secting plane is parallel to the xy plane and the projected conic is a circle (\(e=0\)). Otherwise, \(a>0\), the secting plane is \(ax+cz=1\), its intersection with the xy plane is the line \(ax=1\) and the projected conic is \(ax+cr=1\). The ratio of distances of a point (x, y) on the projected section to the origin and the intersection line is thus \(e=r/|x-1/a|=ar/|cr|=a/c\), a constant, hence the projected section is a conic, the origin is a focus and the intersection line is the corresponding directrix. The formula for e follows from this calculation, since rotation of the plane \(ax+by+cz=1\) about the z axis and reflecting it about the xy plane does not affect the values of e, |c| and \(a^2+b^2\).
(c) The formula for M follows from the proof of item (a). For E, we again assume \(a\ge 0, b=0, c>0.\) The orbit is then \(ax+cr=1\) and at the pericenter (the point nearest the origin) we have \(x=r=1/(a+c)\). Using this in the formula for E in Eq. (8), with \(\dot{r}=0\), \(M^2=1/c\), we get \(E=(a^2-c^2)/(2c).\) For a general secting plane, \(a^2\) is replaced with \(a^2+b^2\) and c with |c|. \(\square \)
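Item (a) can also be tested numerically. The sketch below integrates \(\ddot{{\mathbf {r}}}=-{{\mathbf {r}}}/r^3\) with a classical fourth-order Runge–Kutta step and checks that the trajectory stays on a curve \(ax+by+cr=1\). As a shortcut, and as an assumption not taken from the proof above, the parameters (a, b, c) are computed from the initial state via the Laplace–Runge–Lenz vector, with \(c=1/M^2\). All function names are hypothetical.

```python
import math

def rk4_step(state, h):
    """One RK4 step for the Kepler ODE; state = (x, y, vx, vy)."""
    def f(s):
        x, y, vx, vy = s
        r3 = math.hypot(x, y) ** 3
        return (vx, vy, -x / r3, -y / r3)
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = f(state)
    k2 = f(shifted(state, k1, h / 2))
    k3 = f(shifted(state, k2, h / 2))
    k4 = f(shifted(state, k3, h))
    return tuple(s + h / 6 * (p + 2 * q + 2 * u + w)
                 for s, p, q, u, w in zip(state, k1, k2, k3, k4))

def plane_from_state(x, y, vx, vy):
    """(a, b, c) of the secting plane, from the Laplace-Runge-Lenz vector."""
    r = math.hypot(x, y)
    M = x * vy - y * vx                         # angular momentum
    ex, ey = vy * M - x / r, -vx * M - y / r    # Laplace-Runge-Lenz vector
    return ex / M ** 2, ey / M ** 2, 1.0 / M ** 2

def max_plane_residual(state, h=1e-3, steps=5000):
    """Largest deviation from a*x + b*y + c*r = 1 along the integrated motion."""
    a, b, c = plane_from_state(*state)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, h)
        x, y = state[0], state[1]
        worst = max(worst, abs(a * x + b * y + c * math.hypot(x, y) - 1.0))
    return worst
```

For the initial state `(1.0, 0.0, 0.0, 1.2)` the energy computed from Eq. (5) is \(-0.28\), which agrees with the value \((a^2+b^2-c^2)/(2|c|)\) of Eq. (7) for the plane returned by `plane_from_state`.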
Remark 3.2
The clever argument in the above proof of item (a) is due to Lagrange [31]. Another elegant proof, more geometric, is found in [24, §4]. There are many more. A proof along the lines the subject is usually taught in modern introductory courses and textbooks, such as [28, §3.5], consists of writing E and M in polar coordinates, \(E=(\dot{r}^2+r^2{\dot{\theta }}^2)/2 -1/r,\) \(M=r^2{\dot{\theta }},\) then using these to write a differential equation for \(\rho :=1/r\) as a function of \(\theta \), \(E=(M^2/2)[(\rho ')^2+\rho ^2]-\rho ,\) or \(\rho ''+\rho =1/M^2.\) Solving this ODE gives \(r=M^2/[1+e \cos (\theta -\theta _0)]\), where e is an integration constant. This is the equation in polar coordinates of a conic with eccentricity e and focus at the origin, which proves the first part of item (b). Expanding the cosine in this formula and setting \(x=r\cos \theta , y=r\sin \theta \), one obtains \(ax+by+cr=1\), with \(a=(e/M^2)\cos \theta _0,\) \(b= (e/M^2)\sin \theta _0,\) \(c=1/M^2,\) as claimed in item (a). From all these formulas follow easily the expressions for e, M and E of Eqs. (6) and (7).
Corollary 3.3
The cone \({\mathcal {C}}^*:=\{a^2+b^2=c^2\}\subset \mathbb {R}^{2,1}\) parametrizes Kepler parabolas (\(e=1\)), its interior \(a^2+b^2<c^2\) Kepler ellipses (\(0\le e<1\)) and exterior \(a^2+b^2>c^2\) Kepler hyperbolas (\(e>1\)). See Fig. 2(iii).
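The formulas of Theorem 7 and the classification of Corollary 3.3 translate directly into a small helper. This is an illustrative sketch; the function name and the returned dictionary keys are hypothetical, and the exact comparison with zero is only meaningful for exactly representable inputs.

```python
import math

def kepler_invariants(a, b, c):
    """Eccentricity, |M|, energy and type of the orbit a*x + b*y + c*r = 1."""
    if c == 0:
        raise ValueError("c = 0 corresponds to a straight line, not a Kepler orbit")
    norm2 = a * a + b * b - c * c          # Minkowski norm of (a, b, c)
    if norm2 < 0:
        kind = "ellipse"
    elif norm2 == 0:
        kind = "parabola"
    else:
        kind = "hyperbola"
    return {
        "e": math.hypot(a, b) / abs(c),    # Eq. (6)
        "M_abs": 1.0 / math.sqrt(abs(c)),  # Eq. (7), up to sign
        "E": norm2 / (2.0 * abs(c)),       # Eq. (7)
        "type": kind,
    }
```

For example, `(0, 0, 1)` is the unit circle (e = 0, E = -1/2), `(1, 0, 1)` is a parabola, and `(2, 0, 1)` is a hyperbola with e = 2.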
Corollary 3.4
Kepler orbits with angular momentum \(M\ne 0\) have fixed latus rectum \(2M^2\) and are the projections of sections of \({\mathcal {C}}\) by non-vertical planes passing through \((0,0, M^2)\) or \((0,0, -M^2)\). See Fig. 7a.
This is immediate from Eqs. (6) and (7).
Corollary 3.5
Kepler orbits with energy \(E\ne 0\) are the projections of sections of \({\mathcal {C}}\) by planes tangent to the paraboloid of revolution
$$\begin{aligned} {\mathcal {P}}:=\{2|E|z=E^2(x^2+y^2)+1\}, \end{aligned}$$
inscribed in \({\mathcal {C}}\) and tangent to it along a horizontal circle, dividing \({\mathcal {P}}\) into two components: Kepler ellipses with energy \(-|E|\) are the projections of sections of \({\mathcal {C}}_+\) by planes tangent to the lower component \({\mathcal {P}}_-={\mathcal {P}}\cap \{z<1/|E|\}\); Kepler hyperbolas with energy |E| are the projections of sections of \({\mathcal {C}}_-\) by planes tangent to the upper component \({\mathcal {P}}_+={\mathcal {P}}\cap \{z>1/|E|\}\). See Fig. 5.
Proof
\({\mathcal {P}}\) is given in homogeneous coordinates (X : Y : Z : W) on \(\mathbb {R}P^3\) by \(E^2(X^2+Y^2)-2|E|ZW+W^2=0.\) The dual equation, parametrizing the planes \(AX+BY+CZ+DW=0\) tangent to \({\mathcal {P}}\), is given by inverting the coefficient matrix of the quadratic equation defining \({\mathcal {P}}\), and is \(A^2+B^2-C^2-2|E|CD=0,\) or in affine coordinates, \(a^2+b^2-c^2+2|E|c=0\). At a point \({{\mathbf {p}}}_0=(x_0,y_0,z_0)\in {\mathcal {P}}\) the tangent plane is \(ax+by+cz=1\), where \((a,b,c)=|E|(|E|x_0, |E|y_0, -1)/(|E|z_0-1)\). If \({{\mathbf {p}}}_0\in {\mathcal {P}}_-\) then \(z_0<1/|E|\) hence \(c>0\), so by Eq. (7) the energy of the corresponding orbit is \((a^2+b^2-c^2)/(2c)=-|E|,\) as needed. A similar calculation for the case \({{\mathbf {p}}}_0\in {\mathcal {P}}_+\) completes the proof. \(\square \)
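The corollary can be checked numerically. In the sketch below, \({\mathcal {P}}\) is taken in the affine form \(E^2(x^2+y^2)-2|E|z+1=0\) of the homogeneous equation in the proof, and the tangent plane is computed directly from the gradient of this function rather than from a closed-form expression. Function names are hypothetical.

```python
def tangent_plane(x0, y0, E_abs):
    """(a, b, c, z0): plane a*x + b*y + c*z = 1 tangent to P above (x0, y0)."""
    z0 = (E_abs ** 2 * (x0 * x0 + y0 * y0) + 1.0) / (2.0 * E_abs)
    # Gradient of F = E^2 (x^2 + y^2) - 2|E| z + 1 at (x0, y0, z0).
    gx, gy, gz = 2.0 * E_abs ** 2 * x0, 2.0 * E_abs ** 2 * y0, -2.0 * E_abs
    d = gx * x0 + gy * y0 + gz * z0   # tangent plane: g . p = d
    if d == 0.0:
        raise ValueError("tangent plane passes through the origin")
    return gx / d, gy / d, gz / d, z0

def orbit_energy(a, b, c):
    """Energy of the orbit defined by the plane, from Eq. (7)."""
    return (a * a + b * b - c * c) / (2.0 * abs(c))
```

With \(|E|=0.5\), a tangency point below the height \(1/|E|=2\) gives \(c>0\) and energy \(-0.5\), while a tangency point above that height gives \(c<0\) and energy \(+0.5\).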
Remark 3.6
The last corollary we learned from [24, p. 145], although our proof is quite different.
4 The Geometry of the Space of Kepler Orbits
Recall that \(\mathbb {R}^{2,1}\) is the 3-dimensional space with coordinates a, b, c, equipped with the indefinite quadratic form \(\Vert (a,b,c)\Vert ^2:=a^2+b^2-c^2\) and associated flat Lorentzian metric \( ds^2=da^2+db^2-dc^2\). A line in \(\mathbb {R}^{2,1}\) is spacelike, null or timelike if \(ds^2\) restricts on it to a positive, null or negative metric, respectively. A plane in \(\mathbb {R}^{2,1}\) is elliptic, parabolic or hyperbolic if \(ds^2\) restricted to it is of signature (2, 0), (1, 0), or (1, 1), respectively. The null cone with vertex \({{\mathbf {v}}}\in \mathbb {R}^{2,1}\) is the set of points \({{\mathbf {v}}}'\in \mathbb {R}^{2,1}\) such that \(\Vert {{\mathbf {v}}}-{{\mathbf {v}}}'\Vert ^2=0\); equivalently, the union of null lines through \({{\mathbf {v}}}\).
4.1 Duality
The equations \(ax+by+cz=1, x^2+y^2=z^2\) define a duality between Kepler’s xy plane and Minkowski’s space \(\mathbb {R}^{2,1}\): to each point \((a,b,c)\in \mathbb {R}^{2,1}{\setminus } 0 \) corresponds a curve in the xy plane, a Kepler orbit if \(c\ne 0\) or a straight line if \(c=0\), the projection of the intersection of the plane \(ax+by+cz=1\) with one of the components of \({\mathcal {C}}=\{x^2+y^2=z^2\}\) (see Theorem 7(a)): if \(c>0\) then one projects the intersection with \({\mathcal {C}}_+={\mathcal {C}}\cap \{z>0\}\), if \(c< 0\) the intersection with \({\mathcal {C}}_-={\mathcal {C}}\cap \{z<0\}\) and if \(c=0\) the intersection with either component. Conversely, to each point \((x,y)\in \mathbb {R}^2{\setminus } 0\) corresponds the plane \(ax+by+ cr=1\) in \(\mathbb {R}^{2,1}\), where \(r=\sqrt{x^2+y^2}. \) Table 1 summarizes some instances of this duality.
We shall not dwell on all items of this table, as most reflect statements proven elsewhere in this article or are simple to verify. We sketch here proofs of a few items and leave the rest for the reader to explore.
Proposition 4.1
(Item 4 of Table 1) The set of Kepler orbits sharing a point corresponds to a parabolic plane in \(\mathbb {R}^{2,1}\). Every parabolic plane in \(\mathbb {R}^{2,1}\) arises in this way.
Proof
A plane \(ax+by+cz=1\) in \(\mathbb {R}^{2,1}\) is parabolic if and only if it forms an angle of 45 degrees with a horizontal plane. This angle satisfies \(\tan \alpha =\sqrt{x^2+y^2}/|z|\) and the result follows. \(\square \)
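The signature claim can be made concrete. For a point \((x_0,y_0)\) with \(r_0=\sqrt{x_0^2+y_0^2}\), the orbits through it fill the plane \(ax_0+by_0+cr_0=1\); the sketch below picks two tangent vectors of this plane (a hypothetical choice made for illustration) and computes their Minkowski Gram matrix, which should be diagonal with one positive and one zero entry.

```python
import math

def minkowski(u, v):
    """Minkowski inner product of signature (2, 1)."""
    return u[0] * v[0] + u[1] * v[1] - u[2] * v[2]

def tangent_basis(x0, y0):
    """Two independent vectors tangent to the plane a*x0 + b*y0 + c*r0 = 1."""
    r0 = math.hypot(x0, y0)
    return (-y0, x0, 0.0), (x0, y0, -r0)

def gram_matrix(x0, y0):
    """Gram matrix of the restricted Minkowski form in the basis above."""
    t1, t2 = tangent_basis(x0, y0)
    return [[minkowski(t1, t1), minkowski(t1, t2)],
            [minkowski(t2, t1), minkowski(t2, t2)]]
```

The first entry equals \(r_0^2>0\) and the remaining ones vanish, so the restricted form has signature (1, 0), as required of a parabolic plane.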
Remark 4.2
This last proposition is equivalent to Corollary 3.3 above by projective duality.
Proposition 4.3
(Item 6 of Table 1) The set of Kepler orbits tangent to a given Kepler orbit at one of its points corresponds to a null line in \(\mathbb {R}^{2,1}\). Every null line is obtained in this way. See Fig. 8.
Proof
Let C be the given Kepler orbit and \(P\in C\). Using Kepler’s orbital symmetries (Theorems 1 and 2) we can assume, without loss of generality, that C is the unit circle and \(P=(0,1)\) (see Remark 4.5 below, though). A Kepler orbit \(ax+by+cr=1\) is tangent to C at P if and only if \(a=0, b+c=1\), which is a null line in \(\mathbb {R}^{2,1}\). Every null line is congruent to this line by an orbital symmetry. \(\square \)
Proposition 4.4
(Item 8 of Table 1) The Kepler orbits corresponding to a line in \(\mathbb {R}^{2,1}\) (a ‘pencil’ of Kepler orbits) have concurrent directrices (they all pass through a single point). The orbits of a timelike pencil are nested (same as disjoint).
Proof
The orbits of a Kepler pencil corresponding to a line \(\ell ^*\subset \mathbb {R}^{2,1}\) are obtained by projecting sections of \({\mathcal {C}}\) by planes passing through a fixed line \(\ell \subset \mathbb {R}^3\) (the line dual to \(\ell ^*\)). The directrix of a Kepler orbit is the intersection of the secting plane with the xy plane. Thus all directrices of Kepler orbits in a pencil pass through a fixed point, the intersection of \(\ell \) with the xy plane. The line \(\ell ^*\) is spacelike, null or timelike if and only if \(\ell \) intersects \({\mathcal {C}}\) at 2, 1 or 0 points, respectively. These intersection points project to the intersection points of the orbits of the pencil. Thus the orbits of a timelike pencil are disjoint. See Fig. 9. \(\square \)
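The concurrency of directrices is simple to verify numerically. The sketch below assumes a pencil written as \((a,b,c)=p+td\) in \(\mathbb {R}^{2,1}\) and uses the fact from the proof that the directrix of the orbit \((a,b,c)\) is the line \(ax+by=1\); the function name is hypothetical.

```python
def common_directrix_point(p, d):
    """Common point of the directrices of the pencil (a, b, c) = p + t*d."""
    det = p[0] * d[1] - p[1] * d[0]
    if det == 0:
        raise ValueError("directrices are parallel (they meet at infinity)")
    # Solves p_a*x + p_b*y = 1 together with d_a*x + d_b*y = 0.
    return d[1] / det, -d[0] / det
```

For any value of t, the directrix \((p_a+td_a)x+(p_b+td_b)y=1\) then passes through the returned point.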
Remark 4.5
(Error alert) Strictly speaking, items 5–7 of Table 1, and the last two propositions with their proof, are incorrect. Can you see why before continuing reading?
The exceptions arise with the hyperbolic orbits. By our definition, they only include one branch (the ‘attractive branch’, see Fig. 1). For example, there are spacelike pencils of Kepler hyperbolas which only intersect at one point (the 2nd point of intersection is on the ‘repelling branch’) or even spacelike pencils of disjoint Kepler hyperbolas (the 2 intersection points are on the repelling branch). The same problem occurs with null lines: there are null pencils of disjoint Kepler hyperbolas (the tangency point is again on the repelling branch). The proof of Proposition 4.3 is not correct because applying an orbital symmetry to the circular case may move the tangency point to a repelling branch.
Another problem is that some of the statements are true only when considered in the projective plane. For example, the null line \(a=c, b=0\) corresponds to all Kepler parabolas symmetric about the x-axis. Their common tangency point lies on the line at infinity.
To fix these problems one needs to separate statements and proofs of some items of Table 1 into cases. It is not difficult, and can be even quite entertaining, but we shall not elaborate further on this issue, trusting the reader to make adjustments of the relevant items in the table accordingly.
Corollary 4.6
Each family of Kepler orbits of fixed minor axis, ellipses or hyperbolas, is a non-flat 2-parameter family admitting a 3-dimensional group of symmetries. The elliptic and hyperbolic cases are not orbitally equivalent, although in both cases the orbital symmetry group is isomorphic to \(\mathrm {PSL}_2(\mathbb {R})\).
Proof
The dual surface of such a family is a hyperboloid of either 1 or 2 sheets, the ‘hypersphere’ \(a^2+b^2-c^2=\pm 4/B^2\) (items 12–13 of Table 1). These are the level surfaces of the Minkowski norm and are thus invariant under the Lorentz group \(\mathrm {O}_{2,1}\), a 3-dimensional subgroup of the full 7-dimensional group of orbital symmetries. This shows that every such family admits at least a 3-dimensional group of symmetries. To show that the family is non-flat, and hence its symmetry group is at most 3-dimensional, we turn to the same argument in the proof of Theorem 5, explained in the Appendix (Proposition 5.6).
Note also that in the elliptic case, the said surface (a spacelike hypersphere) is a translate of the surface corresponding to Kepler orbits of fixed non-zero energy (items 10–11). Since translations are generated by orbital symmetries (Theorem 2), the non-flatness follows from Theorem 5.
The elliptic and hyperbolic cases are not orbitally equivalent, even locally, because the two actions of the symmetry group \(\mathrm {PSL}_2(\mathbb {R})\) are non-equivalent: in the elliptic case the isotropy is an elliptic subgroup and in the hyperbolic case it is a hyperbolic subgroup, which are non-conjugate 1-parameter subgroups of \(\mathrm {PSL}_2(\mathbb {R})\). \(\square \)
The ‘curved’ Kepler problem (Item 14 of Table 1). There is an analogue of the Kepler problem on surfaces of constant curvature \(k\ne 0\) (a sphere in \(\mathbb {R}^3\) for \(k>0\) and a spacelike ‘hypersphere’ in \(\mathbb {R}^{2,1}\) for \(k<0\)). It is characterized by the property that its unparametrized orbits centrally project to planar Kepler orbits. See [2] for more details, where the following proposition is proved.
Proposition 4.7
Central projection maps orbits of the ‘curved’ Kepler problem on a surface of constant curvature \(k\ne 0\) to Kepler orbits in \(\mathbb {R}^2\). The energy \(E_k\) of an orbit in the curved space is related to the energy E of its centrally projected orbit by
\[E_k = E + \frac{k}{2}M^2, \tag{8}\]
where M is their common angular momentum value.
Corollary 4.8
Central projections of Kepler orbits with energy \(\pm E_k\) on a surface of constant curvature k are parametrized by the surface \(\{a^2 + b^2 -(c - |E_k|)^2=- E_k^2 - k \}\subset \mathbb {R}^{2,1}\), where \(c>0\) represents orbits with negative energy \(E_k = -|E_k|\) and \(c<0\) orbits with positive energy \(E_k = |E_k|\). They are the projections to the xy-plane of sections of \({\mathcal {C}}\) by planes tangent to the surface \((E_k^2 + k)(x^2 + y^2) =kz^2 + 2|E_k|z - 1\) in \(\mathbb {R}^3\).
The proof is immediate from the last proposition and formulas (7). Let us remark also that Corollary 4.8 gives a pleasant dynamical interpretation of Kepler orbits of fixed minor axis: they are the central projections of zero energy orbits of an appropriate curved Kepler problem.
4.2 A Keplerian Version of the Tait–Kneser and 4 Vertex Theorems
4.2.1 Point-Line Duality
The equation \(ax+by=1\) defines a duality between the xy and ab-planes. Namely, each point (a, b) defines a line in the xy plane and vice versa. Given a curve C in one of these planes, its dual \(C^*\) is a curve in the other plane, whose points correspond to the lines tangent to C. For example, the dual of the circle \(x^2+y^2=R^2\) is the circle \(a^2 + b^2=1/R^2.\) If C is a smooth strictly convex curve, containing the origin in its interior, so is \(C^*\) and \(C^{**}=C.\) This still works if C does not contain the origin in its interior, provided we allow for curves in the projective plane, as we do in the sequel. The tangents to C through the origin then correspond to intersections of \(C^*\) with the ‘line at infinity’.
Proposition 4.9
C is a Kepler orbit if and only if \(C^*\) is a circle. If C is an ellipse then \(C^*\) contains the origin in its interior, if it is a parabola then \(C^*\) passes through the origin and if C is a hyperbola then the origin lies outside \(C^*\). In the latter case, the two tangents to \(C^*\) through the origin divide \(C^*\) into two arcs, corresponding to the two branches of C. The longer arc corresponds to the ‘attractive branch’ of C and the shorter to the ‘repelling branch’. See Fig. 10.
Proof
Let \({{\mathbf {v}}}=(a,b,c)\in \mathbb {R}^{2,1}_+\) be the point corresponding to C. The intersection of the null cone through \({{\mathbf {v}}}\) with the ab plane is a circle of radius c centered at (a, b). See Fig. 8 (right). The points of this circle correspond to the lines tangent to C (a special case of Proposition 4.3), so the circle is \(C^*\). For a parabola, one of its tangents is the line at infinity, whose dual is the origin of the ab plane.
When C is a hyperbola it has two tangents, its asymptotes, whose tangency points with C are two points on the line at infinity of the xy plane. The two asymptotes correspond to two points on \(C^*\) and their intersection points with the line at infinity correspond to the two tangents to \(C^*\) at these points, passing through the origin of the ab plane. The longer arc of \(C^*\) corresponds to the attractive branch of C because the latter is nearer the origin than the repelling branch. \(\square \)
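As a numerical sanity check of Proposition 4.9 (a sketch only, not part of the proof), the following Python snippet samples points on a Kepler orbit \(ax+by+cr=1\), writes the tangent line at each point in the form \(\alpha x+\beta y=1\), and measures the distance from \((\alpha ,\beta )\) to (a, b), which should equal c. The function names and the sample orbit are our own illustrative choices.

```python
import math

def kepler_point(a, b, c, theta):
    """Point of the orbit a*x + b*y + c*r = 1 in the direction theta.

    Assumes a*cos(theta) + b*sin(theta) + c > 0.
    """
    r = 1.0 / (a * math.cos(theta) + b * math.sin(theta) + c)
    return r * math.cos(theta), r * math.sin(theta)

def dual_point(a, b, c, theta):
    """Coefficients (alpha, beta) of the tangent line alpha*x + beta*y = 1."""
    x, y = kepler_point(a, b, c, theta)
    r = math.hypot(x, y)
    # Gradient of F(x, y) = a*x + b*y + c*r at the point of tangency.
    gx, gy = a + c * x / r, b + c * y / r
    # Tangent line gx*(X - x) + gy*(Y - y) = 0, normalised to right-hand side 1.
    s = gx * x + gy * y
    return gx / s, gy / s

def distance_to_center(a, b, c, theta):
    """Distance from the dual point to (a, b); Proposition 4.9 predicts c."""
    alpha, beta = dual_point(a, b, c, theta)
    return math.hypot(alpha - a, beta - b)

# Sample ellipse (a^2 + b^2 < c^2): every distance should be c = 1.
radii = [distance_to_center(0.3, -0.2, 1.0, 2 * math.pi * k / 12) for k in range(12)]
```

For a hyperbola the same check applies on the range of directions where the denominator in `kepler_point` is positive, which traces only the arc of the circle described in the proposition.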
Remark 4.10
The same warning as in Remark 4.5 applies here, although it is simpler to fix: if C is a Kepler hyperbola then \(C^*\) is not a circle, but rather a circular arc, corresponding to the Kepler branch of the ‘full’ hyperbola, as shown in Fig. 10. The complementary arc of the circle corresponds to the ‘repelling branch’.
4.2.2 Osculating Circles
A plane curve with non-vanishing curvature admits at each of its points an osculating circle, tangent to the curve at this point to 2nd order (its curvature coincides with that of the curve at this point). Sometimes the osculating circle is hyperosculating, i.e., tangent to order higher than two. This occurs at the critical points of the curvature and such points are called vertices. For example, a (non-circular) ellipse has 4 vertices, corresponding to two minima and two maxima of the curvature. The 4-vertex theorem states that on any convex simple planar closed curve there are at least 4 vertices. A related theorem is the Tait–Kneser theorem, stating that along any vertex-free curve segment with non-vanishing curvature the osculating circles are pairwise disjoint and nested. Both theorems are over 100 years old and there are many variations [19, 23].
Using Proposition 4.9 above, we shall obtain a Keplerian version of these theorems. To this end, we consider a strictly convex star-shaped closed curve \(\gamma \), that is \(\gamma , \gamma '\) and \(\gamma ', \gamma ''\) are everywhere linearly independent (these are parametrization independent conditions). These conditions imply that one can define at each point along \(\gamma \) its osculating Kepler orbit, tangent to the curve to 2nd order. A point where the osculating Kepler orbit is hyperosculating is a Kepler vertex.
Theorem 8
There are at least 4 Kepler vertices along \(\gamma \). Along any vertex-free segment of \(\gamma \) the osculating Kepler orbits are pairwise disjoint and nested. See Fig. 11.
The proof reduces to the observation that point-line duality preserves order of contact between curves; hence, by Proposition 4.9, it maps the osculating Kepler orbit of \(\gamma \) to the osculating circle of \(\gamma ^*\), and the same for hyperosculating Kepler orbits, so it maps Euclidean vertices to Kepler vertices. It also maps nested Kepler orbits to nested circles, so the theorem is reduced to the Euclidean version. In a recent article, we gave a different proof of this theorem [12].
4.3 A Minor Axis Variant of Lambert’s Theorem
Lambert’s Theorem (1761) is a statement about the elapsed time along a Keplerian arc [4, 42]. Let us recall this theorem. Consider a time parametrized Kepler ellipse, i.e., a solution \({{\mathbf {r}}}(t)\) of \(\ddot{{\mathbf {r}}}=-{{\mathbf {r}}}/r^3\), with major axis A. We fix two moments \(t_1< t_2\), the corresponding points \({{\mathbf {r}}}_1={{\mathbf {r}}}(t_1),{{\mathbf {r}}}_2={{\mathbf {r}}}(t_2)\), the chord distance \(r_{12}=\Vert {{\mathbf {r}}}_1-{{\mathbf {r}}}_2\Vert \), the distances to the origin \(r_i=\Vert {{\mathbf {r}}}_i\Vert \) and the time lapse \(\Delta t=t_2-t_1\). See Fig. 12a.
4.3.1 Lambert’s Theorem
\(\Delta t\) is a function of \(r_{12}, r_1 + r_2\) and A.
Clearly, for elliptical orbits the said function is only well defined modulo the period of the orbit (a function of A). The main thrust of the theorem is that \(\Delta t\) does not depend on the individual values of \(r_1, r_2\). Thus, one can deform the orbit, keeping the three quantities \(r_{12}, r_1 + r_2,A\) fixed, into a linear orbit, for which the time \(\Delta t\) is easy to write as an explicit integral.
Our ‘minor axis variant’ of this theorem involves a different well-known parametrization of Kepler orbits, by the eccentric anomaly u, see Fig. 12b. For simplicity, we shall only deal with Kepler ellipses, although the statement and proof can be easily modified for parabolic and hyperbolic orbits. Consider a Kepler ellipse with minor axis B, two values \(u_1<u_2\), \({{\mathbf {r}}}_1={{\mathbf {r}}}(u_1),{{\mathbf {r}}}_2={{\mathbf {r}}}(u_2)\), \(r_{12}=\Vert {{\mathbf {r}}}_1-{{\mathbf {r}}}_2\Vert \), \(r_i=\Vert {{\mathbf {r}}}_i\Vert \) and \(\Delta u=u_2-u_1\).
Theorem 9
\(\Delta u\) is a function of \(r_{12}, r_1 - r_2\) and B, well defined modulo \(2\pi \). Explicitly,
\[B^2\sin ^2\left( \frac{\Delta u}{2}\right) = r_{12}^2-(r_1-r_2)^2. \tag{9}\]
Proof
We consider an ellipse \({\mathcal {E}}\) with minor axis B, parametrized by u, as in Fig. 12b. We lift \({\mathcal {E}}\) to \({\tilde{{\mathcal {E}}}}\subset {\mathcal {C}}_+\) and \({{\mathbf {r}}}_i\) to \({\tilde{{{\mathbf {r}}}}}_i=({{\mathbf {r}}}_i, r_i)\in {\tilde{{\mathcal {E}}}}.\) The right-hand side of Eq. (9) is then \(\Vert {\tilde{{{\mathbf {r}}}}}_1-{\tilde{{{\mathbf {r}}}}}_2\Vert ^2\) (using Minkowski’s norm), hence is invariant under the Lorentz group \(\mathrm {O}_{2,1}\). We claim that the left-hand side is invariant as well, hence it is enough to check formula (9) in the circular case, for which it is immediate.
To establish the said invariance, we first note that B is \(\mathrm {O}_{2,1}\)-invariant by item 12 of Table 1. The invariance of \(\Delta u\) follows from the next lemma.
Lemma 4.11
1. Restricted to \({\mathcal {C}}\), \(dx^2+dy^2-dz^2=(r d\theta )^2.\)
2. Restricted to \({\mathcal {E}}\), \(rd\theta =(B/2)du.\)
Proof
The 1st statement is a simple calculation, using \(x=r\cos \theta , y=r\sin \theta \) and \(x^2+y^2=z^2\). For the 2nd statement, from Fig. 12 we have \(x = a ( \cos u - e), y = b \sin u, r = a ( 1 - e \cos u )\), where a, b are the major and minor semi axes of \({\mathcal {E}}\) (respectively) and \(e=\sqrt{a^2-b^2}/a\) the eccentricity. From the first two equations follows \(dx^2+dy^2=(a^2(\sin u)^2+b^2(\cos u)^2)du^2\) and from the last follows \(dx^2+dy^2=dr^2+r^2d\theta ^2= a^2e^2(\sin u)^2du^2+r^2d\theta ^2.\) Equating these two expressions for \(dx^2+dy^2\) we obtain \(b^2du^2=r^2d\theta ^2\), as needed. This completes the proof of the lemma and also the theorem. \(\square \)
Remark 4.12
Formula (9) is an elementary geometric statement about ellipses, so one expects to find an elementary proof. Indeed, we give such a proof here and invite the reader to compare it with our proof above. Let \(a=A/2\), \(b=B/2\) (the major and minor semi-axes), \(e=\sqrt{a^2-b^2}/a\) (the eccentricity). Then \( r_j = a( 1 - e\cos u_j) \) and \(r_{12}^2 = a^2(\cos u_1 - \cos u_2)^2 + b^2(\sin u_1 - \sin u_2)^2, \) from which follows \(r_{12}^2 - ( r_1 - r_2)^2 = b^2 \left[ ( \cos u_1 - \cos u_2)^2 + ( \sin u_1 - \sin u_2 )^2 \right] = B^2 \sin ^2 (\Delta u/2).\)
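The identity \(r_{12}^2-(r_1-r_2)^2=B^2\sin ^2(\Delta u/2)\) of Remark 4.12 is easy to test numerically. The sketch below uses the parametrization \(x=a(\cos u-e), y=b\sin u\) from the proof of Lemma 4.11; the function names and sample values are hypothetical and serve only as an illustration.

```python
import math

def ellipse_point(a, b, u):
    """Kepler ellipse with semi-axes a >= b and a focus at the origin,
    parametrized by the eccentric anomaly u."""
    e = math.sqrt(a * a - b * b) / a
    return a * (math.cos(u) - e), b * math.sin(u)

def lambert_gap(a, b, u1, u2):
    """r12^2 - (r1 - r2)^2 - B^2 sin^2((u2 - u1)/2); should vanish."""
    x1, y1 = ellipse_point(a, b, u1)
    x2, y2 = ellipse_point(a, b, u2)
    r1, r2 = math.hypot(x1, y1), math.hypot(x2, y2)
    r12_sq = (x1 - x2) ** 2 + (y1 - y2) ** 2
    B = 2 * b  # minor axis
    return r12_sq - (r1 - r2) ** 2 - (B * math.sin((u2 - u1) / 2)) ** 2
```

Calling `lambert_gap(2.0, 1.2, 0.4, 2.5)` returns a value at the level of floating-point rounding, for any choice of the two anomalies.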
4.4 Kepler Fireworks
The following intriguing result is well known.
Proposition 4.13
Consider the family of Kepler ellipses of fixed (negative) energy, passing through a fixed point. Then there exists a Kepler ellipse, with second focus at the fixed point, tangent to all ellipses of the family (the ‘envelope’ of the family). See Fig. 13c.
There are many proofs available. For example, Richard’s proof [40, p. 839], using only elementary Euclidean geometry, is hard to beat for simplicity and elegance. We shall prove it following a longer path, but will obtain on the way two variations on this result, seemingly new. Let us begin.
Proposition 4.14
Consider the family of Hooke (or central) ellipses of fixed area passing through a fixed point in \(\mathbb {R}^2{\setminus } 0.\) Then these ellipses are all tangent to a pair of parallel lines, symmetric about the line passing through the origin and the fixed point. See Fig. 13a.
Proof
Without loss of generality, let the fixed area be \(\Delta \) and the fixed point (1, 0) (using rotations and dilations about the origin). Any ellipse of area \(\Delta \) passing through (1, 0) can be brought by a ‘shear’ \(S:(X,Y)\mapsto (X+sY, Y)\) to an ellipse of the form \(X^2+(\pi Y/\Delta )^2=1,\) which is clearly tangent to the two lines \(Y=\pm \Delta /\pi \). Since S preserves these lines the original ellipse is also tangent to these lines. \(\square \)
This is our 1st variation on Proposition 4.13 (a rather modest one, admittedly). Before stating the next variation we use another lemma, possibly of some independent interest.
Lemma 4.15
The squaring map \(\mathbb {C}\rightarrow \mathbb {C}\), \({{\mathbf {z}}}\mapsto {{\mathbf {z}}}^2\), takes Hooke ellipses of fixed area to Kepler ellipses of fixed minor axis.
Proof
Let a Hooke ellipse be \((X/a)^2+(Y/b)^2=1\) (without loss of generality). Its area is \(\Delta =\pi ab\) and it is parametrized by \(X=a\cos \theta , Y=b\sin \theta .\) Its square is parametrized by \(x=X^2-Y^2=(a^2-b^2)/2+((a^2+b^2)/2)\cos 2\theta , y=2XY=ab\sin 2\theta .\) This is an ellipse centered at \(((a^2-b^2)/2,0)\) with semi-axes \((a^2+b^2)/2\) and ab, hence with a focus at the origin. It is therefore a Kepler ellipse with minor axis \(2ab=2\Delta /\pi .\) \(\square \)
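Lemma 4.15 can be checked numerically with a few lines of Python. The sketch below squares points of a Hooke ellipse with semi-axes a, b and verifies two things: the sum of distances to the origin and to the point \((a^2-b^2,0)\) is the constant \(a^2+b^2\), so the image is an ellipse with a focus at the origin, and its extent in the y-direction is 2ab. The helper names are ours.

```python
import math

def squared_hooke_point(a, b, theta):
    """Image under z -> z^2 of the point (a cos(theta), b sin(theta))."""
    z = complex(a * math.cos(theta), b * math.sin(theta))
    return z * z

def focal_sum(a, b, theta):
    """Sum of distances from the image point to the origin and to (a^2 - b^2, 0)."""
    w = squared_hooke_point(a, b, theta)
    return abs(w) + abs(w - (a * a - b * b))

def minor_axis(a, b, samples=3600):
    """Twice the largest |y| over sampled image points (the minor axis,
    since both foci lie on the x-axis)."""
    return 2 * max(
        abs(squared_hooke_point(a, b, 2 * math.pi * k / samples).imag)
        for k in range(samples)
    )
```

For instance, with a = 1.5 and b = 0.8 the focal sum is 2.89 at every sampled angle and the minor axis is 2.4, in agreement with \(2ab=2\Delta /\pi \).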
Now for the 2nd variation.
Proposition 4.16
Consider the family of Kepler ellipses with fixed minor axis and passing through a fixed point in \(\mathbb {R}^2{\setminus } 0\). Then there exists a Kepler parabola tangent to all ellipses of the family (the ‘envelope’ of the family). See Fig. 13b.
Proof
By Lemma 4.15, the family of Kepler ellipses with fixed minor axis, passing through a fixed point, is the image under the squaring map of the family of Hooke ellipses of fixed area passing through a fixed point. By Proposition 4.14, the envelope of these Hooke ellipses is a pair of parallel lines, equidistant from the origin. Under the squaring map, the image of these lines is the envelope of the family of Kepler ellipses. Following this recipe for the envelope of the Kepler ellipses with minor axis B going through \((x_1,0)\) we get the Kepler parabola \(y^2=4p(x+p)\), where \(p=B^2/(4x_1).\) \( \square \)
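The recipe in this proof can be tested numerically. In the sketch below (our own illustration, with hypothetical names), a member of the family is produced exactly as in the proofs of Proposition 4.14 and Lemma 4.15: a Hooke ellipse through \((\sqrt{x_1},0)\) with semi-axes \(\sqrt{x_1}\) and \(h=B/(2\sqrt{x_1})\) is sheared by \((X,Y)\mapsto (X+sY,Y)\) and then squared. The function returns \(y^2-4p(x+p)\) at a point of the resulting Kepler ellipse; this quantity should be non-positive everywhere (the ellipse stays on one side of the parabola) and vanish at the images of the tangency points \(Y=\pm h\).

```python
import math

def envelope_gap(x1, B, s, t):
    """y^2 - 4p(x + p) at the point with parameter t of the Kepler ellipse
    with minor axis B through (x1, 0) determined by the shear parameter s."""
    sigma = math.sqrt(x1)          # Hooke ellipse passes through (sigma, 0)
    h = B / (2 * sigma)            # tangent lines of the Hooke family: Y = +-h
    p = B * B / (4 * x1)           # parameter of the claimed envelope parabola
    X = sigma * math.cos(t) + s * h * math.sin(t)
    Y = h * math.sin(t)
    z = complex(X, Y)
    w = z * z                      # squaring map
    return w.imag ** 2 - 4 * p * (w.real + p)
```

Algebraically the returned value equals \(4(X^2+h^2)(Y^2-h^2)\), which makes the sign and the tangency at \(Y=\pm h\) evident.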
Remark 4.17
The last proposition can be also established by passing to the dual statement using Table 1, by considering the parabolic plane in \(\mathbb {R}^{2,1}\) corresponding to the fixed point, then taking its polar with respect to the quadric corresponding to ellipses with a fixed minor axis (hyperboloid of 2 sheets). We leave the details of this alternate proof for the reader to explore.
Now we use duality (Table 1) and translation symmetries in \(\mathbb {R}^{2,1}\) (Theorem 2) to derive Proposition 4.13 from its minor axis variant (Proposition 4.16).
Proof of Proposition 4.13
Kepler ellipses with energy \(E<0\) passing through \((x_0,0)\) correspond to the intersection of \(a^2+b^2-(c+E)^2=-E^2\) with \( x_0(a+c)=1.\) This is mapped by \((a,b,c)\mapsto (a,b,c+E)\) to the intersection of \(a^2+b^2-c^2=-E^2\) with \(x_0(a+c-E)=1.\) The latter are Kepler ellipses with minor axis \(B=-2/E\) passing through \((x_1, 0)\), where \(x_1=x_0/(1+Ex_0)\), with envelope \(y^2=4p(x+p)\), where \(p=B^2/(4x_1)=(1+Ex_0)/(x_0E^2),\) corresponding to \((-1/(2p), 0, 1/(2p))\in \mathbb {R}^{2,1}\). Translating back, the envelope of the original family is given by \((-1/(2p), 0, 1/(2p)-E)\in \mathbb {R}^{2,1}\). Using the value of p and a bit of algebra, this is seen to correspond to a Kepler ellipse with 2nd focus \((x_0, 0)\), as needed. \(\square \)
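The ‘bit of algebra’ at the end of this proof can be spot-checked numerically. The following sketch (hypothetical function name, valid under the assumption \(1+Ex_0>0\)) builds the envelope orbit \((-1/(2p),0,1/(2p)-E)\), finds its two intersections with the x-axis from \(ax+cr=1\), and returns the position of the second focus, using the fact that the two foci are symmetric about the center.

```python
def envelope_second_focus(E, x0):
    """x-coordinate of the second focus of the envelope of the Kepler ellipses
    of energy E < 0 through (x0, 0); assumes 1 + E*x0 > 0."""
    p = (1 + E * x0) / (x0 * E * E)
    a = -1 / (2 * p)        # envelope orbit a*x + c*r = 1 (b = 0 by symmetry)
    c = 1 / (2 * p) - E
    right = 1 / (a + c)     # intersection with the positive x-axis, where r = x
    left = 1 / (a - c)      # intersection with the negative x-axis, where r = -x
    # The center is (right + left)/2 and one focus is the origin,
    # so the other focus sits at right + left.
    return right + left
```

With E = -0.5 and \(x_0=1.2\) the intermediate values are p = 4/3, a = -0.375, c = 0.875, and the function returns 1.2, i.e. the second focus is at \((x_0,0)\).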
Remark 4.18
The positive energy analog of Proposition 4.13, i.e., for hyperbolic orbits, is somewhat disappointing, as the family admits no envelope. There is however a ‘scattering’ version of this proposition, for the repelling inverse square law, see Fig. 14a. A familiar ‘everyday’ version, for constant force, where all orbits as well as the envelope are parabolas, can be observed in fireworks displays and water fountains. See Fig. 14b, c.
5 Proofs of Theorems 1–6
5.1 Proof of Theorem 1
Let \(\mathbb {R}P^3\) be the 3-dimensional projective space with homogeneous coordinates (X : Y : Z : W). We identify \(\mathbb {R}^3\) with the affine chart \(W\ne 0\), \((x,y,z)\mapsto (x:y:z:1).\) The closure of \({\mathcal {C}}=\{x^2+y^2=z^2\}\) in \(\mathbb {R}P^3\) is \({\overline{{\mathcal {C}}}}=\{ X^2+Y^2=Z^2\}\), obtained by adding to \({\mathcal {C}}\) the ‘circle at infinity’ \(S^1_\infty =\{ X^2+Y^2=Z^2, \ W=0\}=\overline{{\mathcal {C}}}{\setminus } {\mathcal {C}}\). See Fig. 3.
Let \(\widetilde{{\mathcal {G}}}\subset \mathrm {GL}_4(\mathbb {R})\) be the subgroup preserving the (degenerate) quadratic form \(X^2+Y^2-Z^2\), up to scale. Its image \({\mathcal {G}}:=\widetilde{{\mathcal {G}}}/\mathbb {R}^*\) in the projective group \(\mathrm {PGL}_4(\mathbb {R})=\mathrm {GL}_4(\mathbb {R})/\mathbb {R}^*\) is the group of projective transformations of \(\mathbb {R}P^3\) preserving \(\overline{{\mathcal {C}}}\).
Lemma 5.1
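\( \widetilde{{\mathcal {G}}}\) consists of elements of the form
\[g=\begin{pmatrix} A & 0\\ {{\mathbf {b}}} & \lambda \end{pmatrix},\qquad A\in \mathrm {CO}_{2,1}=\mathbb {R}^*\times \mathrm {O}_{2,1},\quad {{\mathbf {b}}}\in (\mathbb {R}^3)^*,\quad \lambda \in \mathbb {R}^*. \tag{10}\]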
Proof
\(g\in \widetilde{{\mathcal {G}}}\) if and only if \(g^tJg=cJ\), where \(J=\mathrm {diag}(1,1,-1,0)\) and \(c\in \mathbb {R}\). By a simple calculation g has the claimed form. \(\square \)
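It follows that \(\widetilde{{\mathcal {G}}}\) is an 8-dimensional group and \({\mathcal {G}}=\widetilde{{\mathcal {G}}}/\mathbb {R}^*\) is 7-dimensional. In the affine chart \(\mathbb {R}^3\subset \mathbb {R}P^3\) (column vectors), \({{\mathbf {q}}}\mapsto ({{\mathbf {q}}}:1)\), the action of an element of \(\widetilde{{\mathcal {G}}}\) given by Eq. (10) is
\[{{\mathbf {q}}}\mapsto \frac{A{{\mathbf {q}}}}{\lambda +{{\mathbf {b}}}{{\mathbf {q}}}}, \tag{11}\]
where \(A\) is the upper-left \(3\times 3\) block of g, \({{\mathbf {b}}}\) the first three entries of its bottom row and \(\lambda \) its lower-right entry.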
It restricts to a local action on \({\mathcal {C}}_+\) and projects to a local action on \(\mathbb {R}^2{\setminus } 0\). By the general theory of point symmetries of ODEs (see the Appendix), the maximal dimension of the symmetry group of a 3-parameter family of plane curves is 7, hence this local \({\mathcal {G}}\)-action on \(\mathbb {R}^2{\setminus } 0\) provides the full group of orbital symmetries.
The expressions for the infinitesimal symmetries in Eq. (1) follow from the above by differentiating the action along 1-parameter subgroups of \(\widetilde{{\mathcal {G}}}\). Let \(X\in \mathrm {Lie}(\widetilde{{\mathcal {G}}})\) (the Lie algebra of \(\widetilde{{\mathcal {G}}}\)). Since we are considering the projectivized action, we can assume without loss of generality that \(\mathrm{tr}(X)=0\). It follows from Eq. (10) that such an X has the form
The induced vector field on \(\mathbb {R}^2{\setminus } 0\) is \((x,y)\mapsto \gamma '(0),\) where \(\gamma (t)=\pi (e^{tX}q)\), \(q=(x,y,\sqrt{x^2+y^2},1)^t\) and \(\pi (X,Y,Z,W)=\left( X/W, Y/W\right) .\) The formulas of Eq. (1) follow from this recipe by setting \(x_i=1\) and the rest 0 in Eq. (12), \(i=1, \ldots , 7.\) \(\square \)
5.2 Proof of Theorem 2
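Note first that an element \(g\in \widetilde{{\mathcal {G}}}\), given by Eq. (10), acts on \((\mathbb {R}^4)^*\) (row vectors) by \(p\mapsto pg^{-1}\). In the affine chart \(\mathbb {R}^{2,1}\subset \mathrm {P}((\mathbb {R}^4)^*)\) (row vectors), \({{\mathbf {p}}}\mapsto ({{\mathbf {p}}}:-1)\), the action on \(\mathbb {R}^{2,1}\) by an element of \(\widetilde{{\mathcal {G}}}\), given by Eq. (10), is
\[{{\mathbf {p}}}\mapsto (\lambda {{\mathbf {p}}}+{{\mathbf {b}}})A^{-1} \tag{13}\]
(with \(A\), \({{\mathbf {b}}}\), \(\lambda \) the upper-left \(3\times 3\) block, the bottom row and the lower-right entry of g), a Minkowski similarity.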
It follows that for X given by Eq. (12) the induced vector field on \(\mathbb {R}^{2,1}\) is \({{\mathbf {p}}}\mapsto \gamma '(0),\) where \(\gamma (t)=\pi (p e^{-tX})\), \(p=({{\mathbf {p}}},-1)\) and \(\pi (A,B,C,D)=-\left( A/D, B/D, C/D\right) .\) \(\square \)
5.3 Proof of Theorem 3
Identify \(\mathbb {R}^2=\mathbb {C}\) and consider the squaring map \(B: {{\mathbf {z}}}\mapsto {{\mathbf {z}}}^2.\)
Lemma 5.2
B defines a 2 : 1 cover \(\mathbb {C}{\setminus } 0\rightarrow \mathbb {C}{\setminus } 0\), mapping pairs of parallel symmetric affine lines into Kepler parabolas.
Proof
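Since B is \(\mathbb {C}^*\)-equivariant, \(B(\lambda {{\mathbf {z}}})=\lambda ^2B({{\mathbf {z}}}),\) \(\lambda \in \mathbb {C}^*\), it is enough to consider the pair \(x=\pm 1\). Writing \({{\mathbf {z}}}=\pm 1+it\) we get \({{\mathbf {z}}}^2=(1-t^2)\pm 2it\), so their B-image is the Kepler parabola \(x=1-(y/2)^2.\) \(\square \)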
It follows that the set of Kepler parabolas is a flat 2-parameter family of plane curves. \(\square \)
5.4 Proof of Theorem 4
We offer 3 different proofs, with increasing level of abstraction, and a bonus one in Remark 5.3. Which one is your favorite?
(a) The proposed map, \({{\mathbf {r}}}\mapsto {{\mathbf {r}}}/(1-r/M^2)\), in polar coordinates, is \((r,\theta )\mapsto (R,\theta )\), where \(R=r/(1-r/M^2),\) or \(M^2/r=1+M^2/R .\) A Kepler orbit with angular momentum M is given by \(M^2/r = 1+e\cos (\theta -\theta _0)\) (see Remark 3.2) and is mapped to \(M^2/R= e\cos (\theta -\theta _0)\), or \(R\cos (\theta -\theta _0)=M^2/e.\) This is the equation of a line whose distance to the origin is \(M^2/e\), making an angle \(\theta _0\) with the y-axis.
(b) Kepler orbits with angular momentum M are the projections to the xy plane of sections of \({\mathcal {C}}\) by planes passing through \(P:=(0,0,M^2)\) (Corollary 3.4). Central projection from P maps these sections to straight lines in the xy plane.
(c) Kepler orbits with fixed M are parametrized by the horizontal plane \(\{c=1/M^2\}\subset \mathbb {R}^{2,1}\), see Corollary 3.4 above. We know that \({\mathcal {G}}\) acts on \(\mathbb {R}^{2,1}\) as its full group of Minkowski similarities, so there is an element \(g\in \widetilde{{\mathcal {G}}}\) that translates this plane to the plane \(c=0\), parametrizing straight lines in the xy plane. By Eq. (13), we can take g corresponding to \(A = id, \mathbf{b}= (0,0,-1/M^2)\). The stated formula follows from Eq. (11). \(\square \)
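Proof (a) translates directly into a numerical check. The sketch below, with function names of our own choosing, samples points of a Kepler orbit \(M^2/r=1+e\cos (\theta -\theta _0)\), applies the map \({{\mathbf {r}}}\mapsto {{\mathbf {r}}}/(1-r/M^2)\), and evaluates \(X\cos \theta _0+Y\sin \theta _0-M^2/e\) on the image points. The points are sampled where \(r<M^2\), so the denominator of the map is positive.

```python
import math

def flatten(x, y, M):
    """The map r -> r / (1 - r/M^2), defined here for r < M^2."""
    r = math.hypot(x, y)
    k = 1.0 / (1.0 - r / M**2)
    return k * x, k * y

def orbit_point(M, e, theta0, theta):
    """Point of the Kepler orbit M^2/r = 1 + e cos(theta - theta0)."""
    r = M**2 / (1.0 + e * math.cos(theta - theta0))
    return r * math.cos(theta), r * math.sin(theta)

def line_residual(M, e, theta0, theta):
    """X cos(theta0) + Y sin(theta0) - M^2/e at the image point;
    zero means the image lies on the line R cos(theta - theta0) = M^2/e."""
    x, y = orbit_point(M, e, theta0, theta)
    X, Y = flatten(x, y, M)
    return X * math.cos(theta0) + Y * math.sin(theta0) - M**2 / e
```

Every sampled residual vanishes up to rounding, so the image points are collinear, on the line at distance \(M^2/e\) from the origin.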
Remark 5.3
Yet another proof, which shows flatness, without an explicit formula, consists of writing down a second order linear ODE for the family of Kepler orbits with fixed M and using the fact that second order linear ODEs are flat [7, p. 44]. The said ODE is \(\rho ''(\theta )+\rho (\theta )=1/M^2,\) where \(\rho =1/r\). See the proof of Proposition 5.6 below.
5.5 Proof of Theorem 5
According to the general theory of symmetries of ODEs, flatness of a 2-parameter family of plane curves is equivalent to the vanishing of certain two differential invariants of an associated second order ODE. In the Appendix we carry out a calculation showing that one of these invariants is non-vanishing for the family of Kepler orbits of fixed non-zero energy, thus proving that each such family is non-flat, see Proposition 5.6. Next, according to another basic result of the theory, the dimension of the symmetry group of a non-flat 2-parameter family is at most 3. Thus, for each \(E\ne 0\), it is enough to find a 3-dimensional subgroup of \({\mathcal {G}}\) preserving the set of Kepler orbits with energy E.
As explained in Corollary 3.5, Kepler orbits with energy \(\pm E\ne 0\) are projections of sections of \({\mathcal {C}}\) by planes tangent to the inscribed paraboloid of revolution \({\mathcal {P}}= \{2z=|E|\left( x^2+y^2\right) +1/|E|\}\). Let \({\overline{{\mathcal {P}}}}\) be the closure of \({\mathcal {P}}\) in \(\mathbb {R}P^3\). It is a smooth convex compact surface, given in homogeneous coordinates by the vanishing of the quadratic form \(|E|\left( X^2+Y^2\right) -2ZW+W^2/|E|,\) obtained by adding to \({\mathcal {P}}\) the point (0 : 0 : 1 : 0), the tangency point of \({\overline{{\mathcal {P}}}}\) with the plane \(W=0\) (the white dot in Fig. 15a). Consider the subgroup \(\widetilde{{\mathcal {G}}}_E\subset \widetilde{{\mathcal {G}}}\) preserving this quadratic form up to scale. A short calculation shows that its Lie algebra consists of matrices of the form
The associated vector field in the xy-plane is \((x,y)\mapsto \gamma '(0)\), where \(\gamma (t)=\pi (e^{tX}q)\), \(q=(x,y,\pm \sqrt{x^2+y^2},1) ^t\) and \(\pi (X,Y,Z,W)=\left( X/W, Y/W\right) .\) The sign in q is the opposite sign of E, since for \(E>0\) (the hyperbolic case) we need to project the action from \({\mathcal {C}}_-\) and for \(E<0\) from \({\mathcal {C}}_+\). Setting \(x_i=1\) and the rest 0 in Eq. (14), \(i=2,3,4, \) we obtain from this recipe for \(E<0\) the vector fields
as in Eq. (3). For \(E>0\) we get the vector fields \(v_2,-v_3,-v_4.\) In both cases, \(v_2, v_3, v_4\) are infinitesimal generators of the \({\mathcal {G}}_E\)-action, as stated.
The isomorphism \(\widetilde{{\mathcal {G}}}_E/\mathbb {R}^*\simeq \mathrm {PSL}_2(\mathbb {R})\) is best seen in the dual picture, in \(\mathbb {R}^{2,1}\). See Fig. 15b.
Kepler orbits of energy \(\pm E\ne 0\) are parametrized by the surface \({\mathcal {P}}^*=\{-a^2-b^2+(c-|E|)^2=E^2\}\subset \mathbb {R}^{2,1}\), the quadric surface dual to \({\overline{{\mathcal {P}}}}\) (see Eq. (7) and Fig. 15b). This is a hyperboloid of revolution of two sheets. The lower sheet \({\mathcal {P}}^*_+\) parametrizes planes tangent to \({\mathcal {P}}_+\), which correspond to Kepler hyperbolas with energy |E|. Similarly, the upper sheet \({\mathcal {P}}^*_-\) parametrizes planes tangent to \({\mathcal {P}}_-\), which correspond to Kepler ellipses with energy \(-|E|\). The Lorentzian metric \(da^2+db^2-dc^2\) in \(\mathbb {R}^{2,1}\) restricts to a hyperbolic metric on each of the sheets, on each of which the identity component of \({\mathcal {G}}_E\) acts as the identity component of its isometry group (in the full \({\mathcal {G}}_E\) there is also an element interchanging the two sheets; we will use it in the proof of the next theorem).
It is also clear from Fig. 15a why the orbital symmetry action on \({\mathcal {H}}_E\) for \(E>0\) is only local. This is because \({\overline{{\mathcal {P}}}}_+\) touches the plane \(W=0\) (the ‘plane at infinity’ of the affine chart \(W\ne 0\), intersecting \(\overline{{\mathcal {C}}}\) at \(S^1_\infty \)) at one point, which does not correspond to any point in Kepler’s xy plane. \(\square \)
Remark 5.4
As mentioned in the Introduction, Theorem 5 can be deduced from known results on ‘superintegrable metrics’ (although we did not find it stated explicitly). We sketch the argument. Kepler orbits with fixed energy are the (unparametrized) geodesics of a well-known metric, the Jacobi–Maupertuis metric, see p. 247 of [6]. This metric is known to be superintegrable, admitting 4 quadratic integrals, see §3.1 of [34]. Such metrics admit a 3-dimensional group of ‘projective symmetries’ (the same as our orbital symmetries), see Lemma 2 in §2.2.4 of [14]. One can then use a classification, due to Lie, of local groups of projective transformations to deduce that the said symmetry group is isomorphic to \(\mathrm {SL}_2(\mathbb {R})\), see §2.2.2 and references in footnote 9 on p. 442 of [14]. We are thankful to V. Matveev for patiently pointing out to us this non-trivial chain of ideas and the relevant references.
5.6 Proof of Theorem 6
As in the case of Theorem 4, there are various proofs available. We will only present our favorite one.
Consider in Fig. 15b the reflection about the horizontal plane \(c=|E|\) passing through the vertex of the shown cone, \((a,b,c)\mapsto (a,b, 2|E|-c),\) interchanging the lower and upper sheets \({\mathcal {P}}^*_\pm \) of \({\mathcal {P}}^*\). The corresponding element in \(\widetilde{{\mathcal {G}}}\) is
\[g=\begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & -2|E| & 1 \end{pmatrix}.\]
In Fig. 15a, in the affine chart \(Z\ne 0\) with coordinates \(x=X/Z, y=Y/Z, w=W/Z\), g acts by \((x,y,w)\mapsto (-x,-y,2|E|-w)\), a reflection about the center (0, 0, |E|) of \({\overline{{\mathcal {P}}}}\) (the dark dot), interchanging \({\overline{{\mathcal {P}}}}_\pm \). In Fig. 5, in the affine chart \(W\ne 0\), with coordinates \(x=X/W, y=Y/W, z=Z/W\), g acts by \((x,y,z)\mapsto (x,y,-z)/(1-2|E|z),\) interchanging \({\mathcal {P}}_\pm .\)
To write an explicit orbital embedding \({\mathcal {H}}_E\rightarrow {\mathcal {H}}_{-E}\), note first in Fig. 5 that Kepler hyperbolas are the projections of sections of the lower part \({\mathcal {C}}_-\) with planes tangent to \({\mathcal {P}}_+\), and that Kepler ellipses are the projections of sections of the upper part \({\mathcal {C}}_+\) with planes tangent to \({\mathcal {P}}_-\). The embedding is thus given by the composition \({{\mathbf {r}}}=(x,y)\mapsto ({{\mathbf {r}}},-r)\mapsto ({{\mathbf {r}}},r)/(1+2Er)\mapsto {{\mathbf {r}}}/(1+2Er),\) as needed.
We can also map the ‘repelling branches’ of Kepler hyperbolas with energy E into \({\mathcal {H}}_{-E}\), but these are the projections of sections of the upper part of \({\mathcal {C}}\) with planes tangent to \({\mathcal {P}}_+\), thus the embedding is \({{\mathbf {r}}}=(x,y)\mapsto ({{\mathbf {r}}},r)\mapsto ({{\mathbf {r}}},-r)/(1-2Er)\mapsto {{\mathbf {r}}}/(1-2Er).\) See Fig. 6. \(\square \)
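The embedding \({{\mathbf {r}}}\mapsto {{\mathbf {r}}}/(1+2Er)\) can also be checked numerically. The sketch below is an illustration under a normalization of our own: every orbit (ellipse or attractive hyperbola branch) is written as \(1/r=c+a\cos \theta +b\sin \theta \) with \(c>0\), so that its energy is \((a^2+b^2-c^2)/(2c)\) (units with \(GM=1\)). In these terms the map replaces 1/r by \(1/r+2E\), i.e., sends (a, b, c) to \((a,b,c+2E)\), and the code verifies both that the image points lie on that orbit and that its energy is \(-E\).

```python
import math

def energy(a, b, c):
    """Energy of the orbit 1/r = c + a cos(theta) + b sin(theta), c > 0."""
    return (a * a + b * b - c * c) / (2 * c)

def embed(x, y, E):
    """The map r -> r / (1 + 2 E r)."""
    r = math.hypot(x, y)
    k = 1.0 / (1.0 + 2.0 * E * r)
    return k * x, k * y

def image_residual(a, b, c, theta):
    """Residual of the image point in the equation of the orbit (a, b, c + 2E).

    Assumes c + a cos(theta) + b sin(theta) > 0 (a point of the attractive branch).
    """
    E = energy(a, b, c)
    r = 1.0 / (c + a * math.cos(theta) + b * math.sin(theta))
    X, Y = embed(r * math.cos(theta), r * math.sin(theta), E)
    return a * X + b * Y + (c + 2 * E) * math.hypot(X, Y) - 1.0
```

For the hyperbola (a, b, c) = (1.5, 0, 1) the energy is 0.625, the image orbit is (1.5, 0, 2.25), and its energy is -0.625.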
Notes
Some authors, for example Cartan [16], use the term ‘isotropic’ instead of ‘parabolic’. Elliptic planes are also called ‘spacelike’.
References
Albouy, A.: Projective dynamics and classical gravitation. Regul. Chaot. Dyn. 13, 525–542 (2008)
Albouy, A.: There is a Projective Dynamics. Eur. Math. Soc. Newsl. 89, 37–43 (2013)
Albouy, A.: Lectures on the two-body problem. In: Cabral, H., Diacu, F. (eds.) Classical and Celestial Mechanics: The Recife Lectures. Princeton University Press, Princeton (2002)
Albouy, A.: Lambert’s theorem: geometry or dynamics? Celest. Mech. Dyn. Astron. 131(9), 1–30 (2019)
Arnold, V.I.: Huygens and Barrow, Newton and Hooke: Pioneers in Mathematical Analysis and Catastrophe Theory from Evolvents to Quasicrystals. Birkhäuser, Basel (1990)
Arnold, V.I.: Mathematical Methods of Classical Mechanics, 2nd edn. Springer, Berlin (1989)
Arnold, V.I.: Geometrical Methods in the Theory of Ordinary Differential Equations, vol. 250. Springer, Berlin (2012)
Arnold, V.I.: Newton’s Principia read 300 years later. Not. AMS 36(9), 1148–1154 (1989)
Bertrand, J.: Théorème relatif au mouvement d’un point attiré vers un centre fixe. C. R. Acad. Sci. 77, 849–853 (1873). Available online: https://gallica.bnf.fr/ark:/12148/bpt6k3034n/f849. English translation: arXiv:0704.2396
Blaschke, P.: Pedal coordinates, dark Kepler, and other force problems. J. Math. Phys. 58, 063505 (2017)
Bohlin, K.: Note sur le problème des deux corps et sur une intégration nouvelle dans le problème des trois corps. Bull. Astr. 28, 113–119 (1911)
Bor, G., Jackman, C., Tabachnikov, S.: Variations on the Tait-Kneser theorem. Math. Intelligencer 43, 8–14 (2021)
Bluman, G.W., Kumei, S.: Symmetries and Differential Equations. Springer, New York (1989)
Bryant, R., Manno, G., Matveev, V.: A solution of S. Lie Problem: Normal forms of 2-dim metrics admitting two projective vector fields. Math. Ann. 340(2), 437–463 (2008)
Cariñena, J.F., López, C., del Olmo, M.A., Santander, M.: Conformal geometry of the Kepler orbit space. Celest. Mech. Dyn. Astron. 52(4), 307–343 (1991)
Cartan, É.: La geometría de las ecuaciones diferenciales de tercer orden. Rev. Math. Hispano-Amer. 4, 1–31 (1941). Reprinted in: Œuvres complètes, Partie III 2, 1535–1565. Gauthier-Villars (1952)
Chern, S.-S.: Sur la géométrie d’une équation différentielle du troisième ordre. C. R. Acad. Sci. Paris 204, 1227–1229 (1937)
Chern, S.-S.: The geometry of the differential equations \(y^{\prime \prime \prime } = F(x, y, y^{\prime }, y^{\prime \prime })\). Sci. Rep. Nat. Tsing Hua Univ. 4, 97–111 (1940)
DeTurck, D., Gluck, H., Pomerleano, D., Vick, D.S.: The four vertex theorem and its converse. Not. AMS 54(2), 192–207 (2007)
Duzhin, S.V., Lychagin, V.V.: Symmetries of distributions and quadrature of ordinary differential equations. Acta Appl. Math. 24(1), 29–57 (1991)
Doubrov, B., Komrakov, B.: The geometry of second-order ordinary differential equations. Preprint (2016). arXiv:1602.00913
Frauenfelder, U., Van Koert, O.: The restricted three-body problem and holomorphic curves. Springer, Berlin (2018)
Ghys, E., Tabachnikov, S., Timorin, V.: Osculating curves: around the Tait–Kneser theorem. Math. Intelligencer 35(1), 61–66 (2013)
Givental, A.: Kepler’s laws and conic sections. Arnold Math J. 2, 139–148 (2016)
Godlinski, M.: Geometry of Third-Order Ordinary Differential Equations and Its Applications in General Relativity. PhD thesis, University of Warsaw (2008). arXiv:0810.2234
Godlinski, M., Nurowski, P.: Third-order ODEs and four-dimensional split signature Einstein metrics. J. Geom. Phys. 56(3), 344–357 (2006)
Godlinski, M., Nurowski, P.: Geometry of third-order ODEs. Preprint (2009). arXiv:0902.4129
Goldstein, H., Poole, C., Safko, J.: Classical Mechanics, 3rd edn. Addison-Wesley, New York (2002)
Guillemin, V., Sternberg, S.: Variations on a Theme by Kepler, vol. 42. American Mathematical Society (2006)
Kasner, E.: The trajectories of dynamics. Trans. Am. Math. Soc. 7(3), 401–424 (1906)
Lagrange, J.L.: Recherches sur la théorie des perturbations, Mémoires des Savants étrangers, tome X, 1785. Reproduced in: Oeuvres complètes, tome 6, 419–431. Gauthier-Villars, Paris (1873). https://gallica.bnf.fr/ark:/12148/bpt6k229225j/
Lang, J.: Three Projective Problems on Finsler Surfaces. PhD thesis, Friedrich Schiller Universität Jena (2020). https://www.db-thueringen.de/receive/dbt_mods_00040622
Maclaurin, C.: A Treatise of Fluxions, vol. 2. Ruddimans, Edinburgh (1742)
Miller, W., Post, S., Winternitz, P.: Classical and quantum superintegrability with applications. J. Phys. A: Math. Theor. 46, 423001 (2013). arXiv:1309.2694
Montgomery, R.: Metric cones, N-body collisions, and Marchal’s lemma. Preprint (2018). arXiv:1804.03059
Moser, J.: Regularization of Kepler’s problem and the averaging method on a manifold. Comm. Pure Appl. Math. 23, 609–636 (1970)
Nurowski, P.: Differential equations and conformal structures. J. Geom. Phys. 55(1), 19–49 (2005)
Olver, P.J.: Equivalence, Invariants and Symmetry. Cambridge University Press, Cambridge (1995)
Prince, G., Sherring, J.: Geometric aspects of reduction of order. Trans. AMS 334(1), 433–453 (1992)
Richard, J.-M.: Safe domain and elementary geometry. Eur. J. Phys. 25(6), 835–844 (2004)
Sato, H., Yoshikawa, A.Y.: Third order ordinary differential equations and Legendre connection. J. Math. Soc. Jpn. 50(4), 993–1013 (1998)
Szebehely, V.G.: Adventures in Celestial Mechanics, A First Course in the Theory of Orbits. University of Texas Press (1989)
Souriau, J.M.: Sur la variété de Kepler. Centre de Physique Théorique (1973)
Stephani, H.: Differential Equations: Their Solution Using Symmetries. Cambridge University Press, Cambridge (1989)
Tresse, A.: Détermination des invariants ponctuels de l’équation différentielle ordinaire du second ordre \(y^{\prime \prime }= \omega (x, y, y^{\prime })\). Vol. 32. S. Hirzel (1896)
Tod, K.P.: Einstein–Weyl spaces and third-order differential equations. J. Math. Phys. 41(8), 5572–5581 (2000)
Wünschmann, K.: Über Berührungsbedingungen bei Integralkurven von Differentialgleichungen. Inauguraldissertation, Leipzig, Teubner, 6–13 (1905)
Acknowledgements
The authors thank Richard Montgomery, Sergei Tabachnikov, Alain Albouy and Vladimir Matveev for fruitful correspondence and discussions. GB was supported by CONACYT Grant A1-S-45886.
Appendix A: Symmetries of ODEs
The purpose of this appendix is twofold: first, we fulfill a promise made in the beginning of the proof of Theorem 5, showing that the 2-parameter family of Kepler orbits with fixed non-zero energy is not flat. See Proposition 5.6 below. Second, we fit the results of this article into the general context of the theory of symmetries of ODEs.
1.1 Lie’s Theory of Symmetries of ODEs
An n-parameter family of plane curves is given, locally, under some mild regularity conditions, by the graphs of solutions y(x) of an nth-order ODE \(y^{(n)}=f(x,y,y',\ldots , y^{(n-1)}).\) Local diffeomorphisms of the xy plane preserving the graphs of solutions of the ODE are classically called point symmetries of the ODE. Vector fields in the plane whose flow acts by point symmetries are infinitesimal point symmetries. The subject was developed in the nineteenth century, mostly by Sophus Lie and his students, and later in the twentieth century by É. Cartan and many others, and is still an active area of research. A standard modern reference is P. Olver’s book [38]; see also [13, 20, 39, 44].
1.2 On ‘Local Symmetries’
Point symmetries are local not only in the xy plane but also in the jet spaces over \(\mathbb {R}^2\) to which they are naturally prolonged. An nth-order ODE \(y^{(n)}=f(x,y,y',\ldots , y^{(n-1)})\) defines a hypersurface \(M:=\{p_n=f(x,y,p_1, \ldots , p_{n-1})\}\) in the total space \(J^n\) of the bundle of nth-order jets of curves in \(\mathbb {R}^2\). M is an \((n+1)\)-dimensional manifold, doubly foliated, with leaves of dimensions \(n-1\), 1, the sum of whose tangents span a contact distribution on M. The first foliation is by the fibers of the projection \((x,y,p_1,\ldots , p_n)\mapsto (x,y)\) and the second by the nth jets of the solutions to the ODE. A point symmetry of the ODE is a local diffeomorphism of M preserving both foliations. It projects to a local diffeomorphism of the xy plane. A good introduction to this geometric point of view on ODEs, for \(n=2\), is Arnold’s book [7, Section 1.6]. Of course, our proof is completely different.
1.3 Flat Families
An n-parameter family of plane curves is flat if it is locally diffeomorphic to the family given by \(y^{(n)}=0\) (graphs of polynomial functions of degree \(<n\)). As was shown by Lie, a family is flat if and only if its local symmetry group is \((n+4)\)-dimensional for \(n>2\) and 8-dimensional for \(n=2\), the maximal dimension possible for an n-parameter family of plane curves (Theorems 6.39 and 6.42 of [38]).
The \(n=3\) case, i.e., point symmetries of 3rd order ODEs, was studied in more depth in 1905 by Wünschmann [47], around 1940 by S.-S. Chern [17, 18] and É. Cartan [16], and later on by others [25, 26, 27, 41, 46]. The only result from this theory that we use, in the proof of Theorem 1, is due to Lie: the maximum dimension of the symmetry group of a 3-parameter family of plane curves is 7.
Theorem 1 can thus be interpreted as saying that the 3-parameter family of Kepler orbits is locally diffeomorphic to the solutions of \(y'''=0,\) i.e., vertical parabolas of the form \(y=ax^2+bx+c.\) Let us find such a diffeomorphism. Define a map from the XY-plane to the xy-plane by
\[ (X,Y)\mapsto (x,y)=\left( \frac{X^2-1}{Y},\ \frac{2X}{Y}\right) , \tag{15} \]
so that \(r=\sqrt{x^2+y^2}=(X^2+1)/Y\) for \(Y>0\).
Proposition 5.5
Equation (15) defines a local diffeomorphism from the XY-plane into the xy-plane, mapping each vertical parabola \(Y=AX^2+BX+C,\) \(A,B,C\in \mathbb {R}\), onto the Kepler orbit \(ax+by+cr=1\), where \(a=(A-C)/2, b=B/2, c=(A+C)/2.\)
The proof is by a straightforward verification.
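The verification can also be spot-checked numerically. The sketch below assumes the map has the form \((X,Y)\mapsto ((X^2-1)/Y,\ 2X/Y)\) with \(Y>0\), which is the form forced by the coefficient relations of Proposition 5.5; the function names and the sample parabola are ours and purely illustrative.

```python
import math

def kepler_point(X, Y):
    """Assumed form of the map: (X, Y) -> (x, y) = ((X^2 - 1)/Y, 2X/Y)."""
    return (X * X - 1) / Y, 2 * X / Y

def orbit_coeffs(A, B, C):
    """Coefficients (a, b, c) of the Kepler orbit, as stated in Proposition 5.5."""
    return (A - C) / 2, B / 2, (A + C) / 2

def residual(A, B, C, X):
    """a x + b y + c r - 1 at the image of the parabola point (X, A X^2 + B X + C)."""
    Y = A * X * X + B * X + C
    x, y = kepler_point(X, Y)
    a, b, c = orbit_coeffs(A, B, C)
    return a * x + b * y + c * math.hypot(x, y) - 1

# Sample parabola (hypothetical values); B^2 < 4AC keeps Y > 0 for every X.
A, B, C = 1.0, 0.5, 2.0
worst = max(abs(residual(A, B, C, k / 10)) for k in range(-50, 51))
print(worst)  # should be at round-off level
```

For \(Y<0\) the same formula gives \(r=-(X^2+1)/Y\), so the check only applies on the half-plane where \(Y\) is positive.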
1.4 Path Geometries, Tresse Classification
The \(n=2\) case is the best known and is called a path geometry. If a 2-parameter family is not flat then the maximal possible dimension of the symmetry group drops from 8 to 3. A list of normal forms of 2nd order ODEs admitting a 3-dimensional group of symmetries, over the complex numbers, was derived by Tresse (a French student of Lie) in his 1896 PhD dissertation [45]. The list is divided into 4 ‘types’, according to the symmetry group (all types come with 1 or 2 continuous parameters). Type d), the type that concerns us, deals with \(\mathrm {SL}_2(\mathbb {C})\) invariant 2nd order ODEs, and is given by Tresse as \(y''=(a(y')^3-y')/(6x),\) where a is a (complex) parameter.
Tresse’s classification was extended to the real case [21, 32], but by and large we think that this list has not been sufficiently explored.
Over the reals, Tresse’s type d) breaks first into two subtypes, according to the two real forms of \(\mathrm {SL}_2(\mathbb {C})\): \(\mathrm {SU}_2\) and \(\mathrm {SL}_2(\mathbb {R})\). We are concerned with \(\mathrm {SL}_2(\mathbb {R})\).
Among the \({\mathrm {SL}_2(\mathbb {R})}\)-invariant path geometries, there are two ‘exceptional’ cases (without parameters), corresponding to the two ODEs \(y''= \pm (xy' - y)^3\). What distinguishes these two cases from all other items on Tresse’s list is that these are the only cases of projective path geometries, i.e., the paths are the (unparametrized) geodesics of a torsionless affine connection. In fact, in this case the paths are the geodesics of the well-known Jacobi–Maupertuis metric defined on the Hill region for any mechanical system with fixed energy.
The case that appears here (constant energy Kepler orbits) corresponds to \(y''=(xy'-y)^3\), but it is not so easy to see the equivalence (we will not pursue it here).
A path geometry on a surface S determines a ‘dual’ path geometry on the path space \(S^*\), parametrized by the points of S: to each point of S is assigned a path in \(S^*\), the set of paths in S passing through this point. The dual path geometry of a flat path geometry (straight lines, graphs of solutions to \(y''=0\)) is also flat, but a generic non-flat path geometry is not equivalent to its dual. The flatness of a path geometry, given by a 2nd order ODE \(y''=f(x,y, y')\), is detected by the vanishing of the relative invariants
\[ I_1=f_{pppp},\qquad I_2=D^2f_{pp}-4Df_{yp}-f_pDf_{pp}+4f_pf_{yp}-3f_yf_{pp}+6f_{yy}, \tag{16} \]
where \(p=y'\) and \(D=\partial _x+p\partial _y+f\partial _p.\)
The vanishing of \(I_1\) simply means that f is at most cubic in \(y'\). This is a diffeomorphism invariant property, characterizing projective path geometries. The vanishing of \(I_2\) is equivalent to the projectivity of the dual path geometry. Thus a path geometry is flat if and only if it is projective and its dual path geometry is projective as well.
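To make the two invariants concrete, here is a minimal sketch that evaluates them at a point. It assumes the normalization \(I_1=f_{pppp}\), \(I_2=D^2f_{pp}-4Df_{yp}-f_pDf_{pp}+4f_pf_{yp}-3f_yf_{pp}+6f_{yy}\), and it swaps symbolic differentiation for central differences in exact rational arithmetic (so there is no round-off, only a truncation error of order \(H^2\)). All function names and the step \(H\) are our own choices.

```python
from fractions import Fraction as Fr

H = Fr(1, 10**4)  # finite-difference step; Fractions keep the arithmetic exact

def partial(g, i):
    """Central difference of g(x, y, p) in slot i (0 = x, 1 = y, 2 = p)."""
    def dg(*v):
        hi, lo = list(v), list(v)
        hi[i] += H
        lo[i] -= H
        return (g(*hi) - g(*lo)) / (2 * H)
    return dg

def total_derivative(f):
    """Return the operator D = d/dx + p d/dy + f d/dp on functions g(x, y, p)."""
    def D(g):
        gx, gy, gp = partial(g, 0), partial(g, 1), partial(g, 2)
        return lambda x, y, p: gx(x, y, p) + p * gy(x, y, p) + f(x, y, p) * gp(x, y, p)
    return D

def I1(f):
    """Fourth p-derivative of f (vanishes exactly when f is cubic in p)."""
    return partial(partial(partial(partial(f, 2), 2), 2), 2)

def I2(f):
    """Second relative invariant, in the normalization assumed above."""
    D = total_derivative(f)
    fp, fy = partial(f, 2), partial(f, 1)
    fpp, fyp, fyy = partial(fp, 2), partial(fy, 2), partial(fy, 1)
    Dfpp = D(fpp)
    return lambda x, y, p: (
        D(Dfpp)(x, y, p) - 4 * D(fyp)(x, y, p)
        - fp(x, y, p) * Dfpp(x, y, p) + 4 * fp(x, y, p) * fyp(x, y, p)
        - 3 * fy(x, y, p) * fpp(x, y, p) + 6 * fyy(x, y, p)
    )

flat = lambda x, y, p: Fr(0)              # y'' = 0
cubic = lambda x, y, p: (x * p - y) ** 3  # y'' = (x y' - y)^3

pt = (Fr(1, 2), Fr(1, 3), Fr(2))          # arbitrary sample point (x, y, p)
print(I1(flat)(*pt), I2(flat)(*pt))
print(I1(cubic)(*pt), float(I2(cubic)(*pt)), 72 * (pt[0] * pt[2] - pt[1]))
```

For the second example, a hand computation with the same formula gives \(I_2=72(xy'-y)\), which is what the last printed pair compares: the equation is projective (\(I_1=0\)) but not flat.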
1.5 Kepler Orbits of Fixed Energy
We can now fill the gap left in the proof of Theorem 5.
Proposition 5.6
Kepler orbits of fixed energy \(E\ne 0\) form a non-flat path geometry. In fact, \(I_1=0\) but \(I_2\ne 0\). Thus the maximum dimension of the symmetry group of such a family is 3.
Proof
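We first write down a 2nd order ODE for Kepler orbits of energy E. Dividing the equation \(ax+by+cr=1\) of Theorem 7(a) by r, we get
\[ \rho =a\cos \theta +b\sin \theta +c, \]
where \(x=r\cos \theta , y=r\sin \theta , r=1/ \rho .\) It follows that
\[ c=\rho +\rho '',\qquad a^2+b^2=(\rho -c)^2+(\rho ')^2. \]
Using this in \( 2cE=a^2+b^2-c^2 \) (Eq. (7) with \(c>0\)), we get
\[ \rho ''=\frac{(\rho ')^2-\rho ^2-2E\rho }{2(E+\rho )}. \]
The right-hand side is quadratic in \(\rho '\), so \(I_1=0\). Using Eq. (16) we get \(I_2=9E^2/(E+\rho )^3,\) hence \(I_2\ne 0\) for \(E\ne 0.\) \(\square \)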
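Both steps of the proof can be cross-checked with a short script. The sketch below assumes the fixed-energy ODE in the form \(\rho ''=((\rho ')^2-\rho ^2-2E\rho )/(2(E+\rho ))\), obtained by eliminating a, b, c as above, and the normalization \(I_2=D^2f_{pp}-4Df_{yp}-f_pDf_{pp}+4f_pf_{yp}-3f_yf_{pp}+6f_{yy}\); the partial derivatives are written out by hand, and all names are ours.

```python
import math
from fractions import Fraction as Fr

def f(rho, p, E):
    """Right-hand side of the assumed ODE rho'' = f(rho, rho')."""
    return (p * p - rho * rho - 2 * E * rho) / (2 * (E + rho))

def ode_residual(a, b, c, theta):
    """rho'' - f(rho, rho') along rho = a cos(theta) + b sin(theta) + c."""
    E = (a * a + b * b - c * c) / (2 * c)   # energy relation 2cE = a^2 + b^2 - c^2
    s = a * math.cos(theta) + b * math.sin(theta)
    rho, drho, ddrho = s + c, -a * math.sin(theta) + b * math.cos(theta), -s
    return ddrho - f(rho, drho, E)

def I2_closed_form(rho, p, E):
    """I_2 assembled from hand-computed derivatives of f, with u = E + rho."""
    u = E + rho
    fv = f(rho, p, E)
    f_p, f_pp = p / u, 1 / u
    f_y = -(p * p + E * E) / (2 * u * u) - Fr(1, 2)
    f_yp, f_yy = -p / u**2, (p * p + E * E) / u**3
    D_fpp = -p / u**2                           # D applied to f_pp = 1/u
    D2_fpp = D_fyp = 2 * p * p / u**3 - fv / u**2  # D(-p/u^2), used twice
    return (D2_fpp - 4 * D_fyp - f_p * D_fpp
            + 4 * f_p * f_yp - 3 * f_y * f_pp + 6 * f_yy)

# Hypothetical sample orbit (an ellipse) and sample point in (rho, rho', E).
worst = max(abs(ode_residual(0.3, 0.4, 1.0, k * 0.1)) for k in range(63))
rho, p, E = Fr(1), Fr(2), Fr(3)
print(worst, I2_closed_form(rho, p, E), 9 * E**2 / (E + rho) ** 3)
```

Because the second check uses exact fractions, the two printed values of \(I_2\) can be compared for equality rather than up to a tolerance.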
Remark 5.7
Incidentally, the formula \(I_2=9E^2/(E+\rho )^3\) of the last proof gives another proof of Theorem 3.
1.6 Central Forces with Flat Orbit Space: The Wünschmann Condition
Theorem 1 establishes that Kepler orbits form a flat 3-parameter family of curves, i.e., locally diffeomorphic to the family of vertical parabolas, given by \(y'''=0\). Using the squaring map, \({{\mathbf {z}}}\mapsto {{\mathbf {z}}}^2\), this result extends to Hooke orbits, the family of central conics, trajectories of a mass under Hooke’s force laws, \(\ddot{{\mathbf {r}}}=\pm {{\mathbf {r}}}.\) Are there any other force laws, whose orbits form a flat family of plane curves?
We do not know the answer in general. But for central force laws, i.e., Newton’s equations of the form \(\ddot{{\mathbf {r}}}=f(r) {{\mathbf {r}}}/r,\) the answer is negative. To prove it, we show that in fact the Hooke and Kepler laws are the only central force laws satisfying a condition weaker than flatness, called the Wünschmann condition (1905). Given a 3-parameter family of plane curves, one defines null cones in the parameter space whose rulings consist of the curves that are tangent to a fixed line at a fixed point. In the flat case, such as the space of Kepler orbits, these cones are quadratic and thus define a (flat) conformal structure on the parameter space. However, for a general family, these cones may fail to be quadratic. The families for which the null cones are quadratic, and hence define a conformal Lorentzian metric on the parameter space, are characterized by a complicated PDE on the ODE that defines this family, studied by Wünschmann [47]. For a modern presentation of this deep result see [37].
Theorem 10
The orbits of the system \(\ddot{{\mathbf {r}}}=f(r) {{\mathbf {r}}}/r\) form a flat 3-parameter family of plane curves if and only if f(r) is a constant multiple of r or \(1/r^2\). In fact, these force laws are the only central ones satisfying the Wünschmann condition.
Proof
Following the standard procedure outlined above, we first write a 3rd order ODE whose solutions are the (unparametrized) orbits of the system \(\ddot{{\mathbf {r}}}=f(r) {{\mathbf {r}}}/r\),
where \(\rho =1/r, \) \(\rho =\rho (\theta )\), \(\phi =f'(\rho )/f(\rho )\) (see for example [30]). Next, the Wünschmann condition for \(\rho '''=F(\rho , \rho ', \rho '')\) is
where
See [37, Equation 8]. Applying this condition to the right-hand side of Eq. (17), the resulting equation is
where
Equation (18) is thus equivalent to the system of three ODEs, \(W_1=W_2=W_3=0.\) One can then easily check that the only solutions of this system are constant multiples of \(\rho ^2\) and \(1/\rho .\) \(\square \)
1.6.1 Central Forces and Projective Path Geometries
As mentioned above, in the local classification of path geometries admitting a 3-dimensional group of symmetries there are only 3 projective cases, where the paths arise as the unparametrized geodesics of a torsionless affine connection. In general, a projective path geometry need not be a metric path geometry, i.e., the affine connection may not be the Levi-Civita connection of a pseudo-Riemannian metric, but in our 3 cases they are metric connections. In fact, all 3 cases arise as the orbits of fixed energy of conservative mechanical systems, and thus can be realized as geodesics of the associated Jacobi–Maupertuis metric. Let us list the 3 cases by 2nd-order ODEs defining them:
I. \(y''=0\).
II. \(y''=(xy'-y)^3\).
III. \(y''=-(xy'-y)^3\).
(See, e.g., [21], where our type I is item 4 of Theorem 7 and our types II and III are items \(3d_+\) and \(3d_-\), respectively.)
Type I is the flat path geometry, admitting an 8-dimensional symmetry group, the projective group \(\mathrm {PGL}_3(\mathbb {R})\). Types II and III are non-flat, each admitting \({\mathrm {SL}_2(\mathbb {R})}\) as a local symmetry group. In both types II and III the \({\mathrm {SL}_2(\mathbb {R})}\) action is locally equivalent to the standard linear action on \(\mathbb {R}^2{\setminus } 0\). The dual actions, on the dual path geometries, are non-equivalent: for the dual of type II, \({\mathrm {SL}_2(\mathbb {R})}\) acts by isometries of the hyperbolic plane, and for the dual of type III it acts by isometries of the pseudo-hyperbolic plane (non-flat constant curvature Lorentzian metric). Both actions appear naturally as open orbits of the projectivized adjoint representation of \({\mathrm {SL}_2(\mathbb {R})}\).
In Table 2, we place some 2-parameter families of curves arising naturally in planar mechanical systems with central-force laws, locally realizing the 3 path geometries. In the first two rows we consider central-force power laws, \(\ddot{{\mathbf {r}}}=f(r){{\mathbf {r}}}/r,\) \(f(r)=\pm r^\alpha ,\) where M and E are the (fixed) angular momentum and energy, respectively. In parentheses is the force law (\(\pm r^\alpha \), with ‘–’ for attractive and ‘+’ for repelling). In the following two rows \(E_k\) is the energy, \(M_k\) the angular momentum, for the Kepler problem in a space of constant curvature k, as in [2].
1.6.2 Some Comments on Table 2
1. ‘Hooke’ orbits, attractive or repelling (\(f=\pm r\)), with fixed angular momentum M, were placed in the table by considering the squaring map, \({{\mathbf {z}}}\mapsto {{\mathbf {z}}}^2\). They are thus mapped to Kepler orbits with fixed minor axis. Attractive Hooke orbits (\(f=-r\)) are mapped to Kepler ellipses with fixed minor axis (see item 12 of Table 1 and Lemma 4.15), which are equivalent to ellipses of constant energy (see proof of Corollary 4.6), corresponding to type II path geometry. Repelling Hooke orbits (\(f=r\)) are mapped to Kepler hyperbolas with fixed minor axis (item 13 of Table 1), which is type III path geometry.
2. Zero energy orbits for all central-force power laws, \(f=\pm r^\alpha \), \(\alpha \ne -1\), can be seen to give a flat path geometry (type I) using the Jacobi–Maupertuis metric: by making the change of variable \(r=\rho ^{2/(\alpha +3)}\) for \(\alpha \ne -3\), or \(r=e^\rho \) for \(\alpha =-3\), one shows that such families are equivalent to geodesics on a right circular cone, so are locally equivalent to lines in the plane [35, §4]. More generally, for planar motion \(\ddot{{\mathbf {r}}}=-\nabla U\), with potential satisfying \(\Delta \log U = \lambda U\) for some \(\lambda \in \mathbb {R}\), the orbits at energy zero will also be locally flat.
3. By computing the relative invariants \(I_1, I_2\) of Eq. (16), it can be shown that orbits with fixed non-zero energy are non-flat for all central-force power laws. It also shows that zero energy orbits for \(f=\pm r^\alpha \) are flat if and only if \(\alpha \ne -1\). Furthermore, using additional (relative) invariants [21, §6], one finds that these path geometries admit a 3-dimensional symmetry group only for the Hooke and Kepler laws (\(\alpha =1, -2\)).
4. Using \(I_1, I_2\), it can also be shown that among all central-force power laws, orbits at a fixed non-zero angular momentum are flat only for the Kepler and inverse cubic force laws (\(\alpha =-2, -3\)).
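Comment 1 relies on the classical fact that the squaring map \({{\mathbf {z}}}\mapsto {{\mathbf {z}}}^2\) sends attractive Hooke orbits (ellipses centered at the origin) to Kepler orbits (conics \(ax+by+cr=1\) with a focus at the origin). The following is a minimal numerical sketch of that fact only, not of the statements about fixed angular momentum or minor axis: it fits a, b, c through three image points and checks the remaining ones. The helper names and the sample values of alpha and beta are hypothetical.

```python
import math

def hooke_point(alpha, beta, t):
    """Point on z(t) = alpha cos t + beta sin t, a solution of z'' = -z."""
    return alpha * math.cos(t) + beta * math.sin(t)

def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def fit_kepler(ws):
    """Solve a x + b y + c r = 1 through three points w = x + i y (Cramer's rule)."""
    rows = [[w.real, w.imag, abs(w)] for w in ws]
    d = det3(rows)
    coeffs = []
    for j in range(3):
        m = [row[:] for row in rows]
        for row in m:
            row[j] = 1.0  # replace column j by the right-hand side (1, 1, 1)
        coeffs.append(det3(m) / d)
    return tuple(coeffs)

# Sample Hooke ellipse (hypothetical complex amplitudes, not parallel as vectors).
alpha, beta = 2 + 0.5j, -0.3 + 1j
square = lambda t: hooke_point(alpha, beta, t) ** 2

a, b, c = fit_kepler([square(t) for t in (0.3, 1.1, 2.0)])
worst = max(abs(a * w.real + b * w.imag + c * abs(w) - 1)
            for w in (square(k * 0.05) for k in range(64)))
print(a, b, c, worst)
```

The check works because \(x\), \(y\) and \(r=|{{\mathbf {z}}}|^2\) are all affine functions of \(\cos 2t\) and \(\sin 2t\) along a Hooke orbit, so one linear relation among them holds along the whole image curve.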
Bor, G., Jackman, C. Revisiting Kepler: New Symmetries of an Old Problem. Arnold Math J. 9, 267–299 (2023). https://doi.org/10.1007/s40598-022-00213-2