1 Introduction

In 1988 Croke proved that the length of the shortest closed geodesic on a Riemannian two-sphere can be bounded from above in terms of its area: there exists a positive number C such that the quantity

$$\begin{aligned} \ell _{\min }(g) := \text {length of the shortest non-constant closed geodesic on } (S^2,g) \end{aligned}$$

satisfies

$$\begin{aligned} \ell _{\min }(g)^2 \le C \ \mathrm{Area}(S^2,g), \end{aligned}$$

for every Riemannian metric g (see [13]). In other words, the systolic ratio

$$\begin{aligned} \rho _{\mathrm {sys}}(g) := \frac{\ell _{\min }(g)^2}{\mathrm {Area}(S^2,g)} \end{aligned}$$

is bounded from above on the space of all Riemannian metrics on \(S^2\). The value of the supremum of \(\rho _{\mathrm {sys}}\) is not known, but it was shown to be not larger than 32 by Rotman [23], who improved the previous estimates due to Croke [13], Nabutovsky and Rotman [22], and Sabourau [24].

The naïve conjecture that the round metric \(g_{\mathrm {round}}\) on \(S^2\) maximises \(\rho _{\mathrm {sys}}\) is false. Indeed,

$$\begin{aligned} \rho _{\mathrm {sys}}(g_{\mathrm {round}}) = \pi , \end{aligned}$$

while, by studying suitable positively curved metrics approximating a singular metric constructed by gluing two flat equilateral triangles along their boundaries, one sees that

$$\begin{aligned} \sup \rho _{\mathrm {sys}} \ge 2\sqrt{3} > \pi . \end{aligned}$$

This singular example is known as the Calabi–Croke sphere. Actually, it is conjectured that the supremum of \(\rho _{\mathrm {sys}}\) is \(2\sqrt{3}\) and that it is not attained. See [5, 25] for two different proofs of the fact that the Calabi–Croke sphere can be seen as a local maximiser of \(\rho _{\mathrm {sys}}\).
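As a quick arithmetic check of the two values above, normalise the round sphere to have radius one, so that its area is \(4\pi \) and its closed geodesics are great circles of length \(2\pi \). For the Calabi–Croke sphere, let a denote the side length of the two equilateral triangles (this normalisation is our choice). The value \(2\sqrt{3}\) then corresponds to a systolic length of \(\sqrt{3}\, a\), that is, twice the height of a triangle; we state this identification as an assumption, since it is not spelled out in the text.

$$\begin{aligned} \rho _{\mathrm {sys}}(g_{\mathrm {round}})&= \frac{(2\pi )^2}{4\pi } = \pi \approx 3.142, \\ \mathrm {Area}&= 2\cdot \frac{\sqrt{3}}{4}\, a^2 = \frac{\sqrt{3}}{2}\, a^2, \qquad \frac{(\sqrt{3}\, a)^2}{\frac{\sqrt{3}}{2}\, a^2} = 2\sqrt{3} \approx 3.464. \end{aligned}$$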

In this paper, we are interested in the behaviour of \(\rho _{\mathrm {sys}}\) near the round metric \(g_{\mathrm {round}}\) on \(S^2\). To the authors’ knowledge, this question was first raised by Babenko, and then studied by Balacheff, who in [4] showed that \(g_{\mathrm {round}}\) can be seen as a critical point of \(\rho _{\mathrm {sys}}\). In the same paper, Balacheff conjectured the round metric to be a local maximiser of \(\rho _{\mathrm {sys}}\) and gave some evidence in favour of this conjecture (see also [9, Question 8.7.2], where upon request of Balacheff the conjecture is attributed to Babenko).

Certainly, \(g_{\mathrm {round}}\) is not a strict local maximiser of \(\rho _{\mathrm {sys}}\), even after modding out rescaling: in any neighbourhood of it there are infinitely many non-isometric Zoll metrics, i.e. Riemannian metrics on \(S^2\) all of whose geodesics are closed and have the same length, and \(\rho _{\mathrm {sys}}\) is constantly equal to \(\pi \) on them (see [17, 27] and Appendix B below). Further evidence in favour of the local maximality of the round metric is given in [3], where Álvarez Paiva and Balacheff prove that \(\rho _{\mathrm {sys}}\) strictly decreases under infinitesimal deformations of the round metric which are not tangent with infinite order to the space of Zoll metrics.

The aim of this paper is to give a positive answer to Babenko’s and Balacheff’s conjecture and to complement it with a statement about the length \(\ell _{\max }(g)\) of the longest simple closed geodesic on \((S^2,g)\). The latter number is well defined whenever the Gaussian curvature K of \((S^2,g)\) is non-negative, see [11].

We recall that a Riemannian metric g on \(S^2\) is \(\delta \)-pinched, for some \(\delta \in (0,1]\), if its Gaussian curvature K is positive and satisfies

$$\begin{aligned} \min K \ge \delta \max K. \end{aligned}$$

The main result of this article is the following:

Theorem

Let g be a \(\delta \)-pinched smooth Riemannian metric on \(S^2\), with

$$\begin{aligned} \delta > \frac{4+\sqrt{7}}{8} = 0.8307\ldots \end{aligned}$$

Then

$$\begin{aligned} \ell _{\min }(g)^2 \le \pi \, {\mathrm {Area}}(S^2,g) \le \ell _{\max }(g)^2. \end{aligned}$$

Each of the two inequalities is an equality if and only if g is Zoll.

In particular, the leftmost inequality alone implies that the round metric maximises \(\rho _\mathrm{sys}\) in a somewhat large \(C^2\)-neighbourhood of \(g_{\mathrm {round}}\), which can be described in terms of the Gaussian curvature: if g satisfies the above pinching condition, then

$$\begin{aligned} \rho _{\mathrm {sys}}(g) \le \rho _{\mathrm {sys}}(g_{\mathrm {round}}) = \pi , \end{aligned}$$

with equality if and only if g is Zoll. Although the optimal pinching constant might be smaller than \((4+\sqrt{7})/8\), it must be strictly positive: the Calabi–Croke singular sphere is a limit of smooth positively curved metrics, each of which is \(\delta \)-pinched for some \(\delta >0\), and the systolic ratios of these metrics approach \(2\sqrt{3}>\pi \).
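For illustration, the round metric of radius one has constant curvature \(K\equiv 1\), hence it is 1-pinched and satisfies the hypothesis of the theorem. All its geodesics are great circles of length \(2\pi \), so both inequalities of the theorem are equalities in this case, as expected for a Zoll metric. The numerical value of the threshold uses \(\sqrt{7}\approx 2.6458\).

$$\begin{aligned} \frac{4+\sqrt{7}}{8}&\approx \frac{6.6458}{8} \approx 0.8307 < 1, \\ \ell _{\min }(g_{\mathrm {round}})^2&= (2\pi )^2 = \pi \cdot 4\pi = \pi \, \mathrm {Area}(S^2,g_{\mathrm {round}}) = \ell _{\max }(g_{\mathrm {round}})^2. \end{aligned}$$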

As far as we know, the lower bound for the length \(\ell _{\max }(g)\) of the longest simple closed geodesic stated in the above theorem is also new. Lower bounds for \(\ell _{\max }(g)\) are studied by Calabi and Cao in the already mentioned [11], where the non-sharp bound

$$\begin{aligned} \ell _{\max }(g)^2 \ge \frac{\pi }{2} \, {\mathrm {Area}}(S^2,g) \end{aligned}$$

is proved for any metric g with non-negative curvature. This bound is deduced from the following sharp lower bound in terms of the diameter

$$\begin{aligned} \sup \{ \ell (\gamma ) \mid \gamma \text{ simple } \text{ closed } \text{ geodesic } \text{ on } (S^2,g) \} \ge 2 \, \mathrm {diam} (S^2,g), \end{aligned}$$

which is due to Croke and holds for any metric (when finite, this supremum is a maximum; the supremum is finite in the case \(K\ge 0\)). Unlike for the first inequality, we do not have counterexamples to the second inequality in our main theorem for metrics which are far from the round one. Our theorem also implies that, under the pinching assumption, when all the simple closed geodesics have the same length the metric must be Zoll.

The proof of the above theorem combines arguments from Riemannian geometry and techniques from symplectic geometry. The role of symplectic geometry in the proof should not be surprising: as stressed in [3], the systolic ratio \(\rho _{\mathrm {sys}}\) is a symplectic invariant, meaning that if two metrics give rise to geodesic flows on the cotangent bundle of \(S^2\) which are conjugate by a symplectic diffeomorphism, then their systolic ratios coincide. Another argument in favour of the symplectic nature of our theorem is that Zoll metrics, which produce the extremal cases of both our inequalities, are in general not pairwise isometric, but their geodesic flows are symplectically conjugate, see Appendix B. The presence of a large set of not pairwise isometric local maximisers for \(\rho _{\mathrm {sys}}\) seems to exclude the possibility of a purely Riemannian geometric proof.

We conclude this introduction with an informal description of the proof. We start by looking at a closed geodesic \(\gamma \) on \((S^2,g)\) of minimal length \(L=\ell _{\min }(g)\), parametrised by arc length. When the curvature of \((S^2,g)\) is non-negative, this curve is simple (see [11], or Lemma 3.11 below for a proof under the assumption that g is \(\delta \)-pinched for some \(\delta >1/4\)).

Then we consider a Birkhoff annulus \(\Sigma _{\gamma }^+\) which is associated to \(\gamma \): \(\Sigma _{\gamma }^+\) is the set of all unit tangent vectors to \(S^2\) which are based at points of \(\gamma (\mathbb {R})\) and point in the direction of one of the two disks which compose \(S^2 {\setminus } \gamma (\mathbb {R})\). The set \(\Sigma _{\gamma }^+\) is a closed annulus, and its boundary consists of the unit vectors \(\dot{\gamma }(t)\) and \(-\dot{\gamma }(t)\), for \(t\in \mathbb {R}/L\mathbb {Z}\).

By a famous result of Birkhoff, the positivity of the curvature K guarantees that all orbits of the geodesic flow on the unit tangent bundle \(T^1 S^2\) of \((S^2,g)\), except for the two closed orbits \(\dot{\gamma }\) and \(-\dot{\gamma }\), hit the interior part of \(\Sigma _{\gamma }^+\) infinitely many times in the future and in the past. In other words, \(\Sigma _{\gamma }^+\) is a global Poincaré section for the geodesic flow \(\phi _t : T^1 S^2 \rightarrow T^1 S^2\) induced by g. This allows us to consider the first return time function

$$\begin{aligned} \tau : \mathrm {int}(\Sigma _{\gamma }^+) \rightarrow (0,+\infty ), \qquad \tau (v) := \inf \{ t> 0 \; | \; \phi _t(v)\in \Sigma _{\gamma }^+ \}, \end{aligned}$$

and the first return time map

$$\begin{aligned} \varphi : \mathrm {int}(\Sigma _{\gamma }^+) \rightarrow \mathrm {int}(\Sigma _{\gamma }^+), \qquad \varphi (v) := \phi _{\tau (v)}(v). \end{aligned}$$

The function \(\tau \) and the map \(\varphi \) are smooth and, as we will show, extend smoothly to the boundary of \(\Sigma _{\gamma }^+\).

The map \(\varphi \) preserves the two-form \(d\lambda \), where \(\lambda \) is the restriction to \(\Sigma _{\gamma }^+\) of the standard contact form on \(T^1 S^2\). The two-form \(d\lambda \) is an area-form in the interior of \(\Sigma _{\gamma }^+\), but vanishes on the boundary, due to the fact that the geodesic flow is not transverse to the boundary. Indeed, if we consider the coordinates

$$\begin{aligned} (x,y)\in \mathbb {R}/L \mathbb {Z}\times [0,\pi ] \end{aligned}$$

on \(\Sigma _{\gamma }^+\) given by the arc parameter x on the geodesic \(\gamma \) and the angle y which a unit tangent vector makes with \(\dot{\gamma }\), the one-form \(\lambda \) and its differential have the form

$$\begin{aligned} \lambda = \cos y\, dx, \qquad d\lambda = \sin y\, dx\wedge dy. \end{aligned}$$
(1)
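The second formula in (1) and the total \(d\lambda \)-area of the annulus can be verified directly. The computation also makes the degeneracy at \(y=0\) and \(y=\pi \) visible, since \(\sin y\) vanishes exactly there.

$$\begin{aligned} d\lambda&= d(\cos y)\wedge dx = -\sin y\, dy\wedge dx = \sin y\, dx\wedge dy, \\ \iint _{[0,L]\times [0,\pi ]} d\lambda&= L \int _0^{\pi } \sin y\, dy = 2L. \end{aligned}$$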

By lifting the first return map \(\varphi \) to the strip \(S=\mathbb {R}\times [0,\pi ]\), we obtain a diffeomorphism \(\Phi :S \rightarrow S\) which preserves the two-form \(d\lambda \) given by (1), maps each boundary component into itself, and satisfies

$$\begin{aligned} \Phi (x+L,y) = (L,0) + \Phi (x,y), \qquad \forall (x,y)\in S. \end{aligned}$$

As we shall see, diffeomorphisms of S with these properties have a well defined flux and, when the flux vanishes, a well defined Calabi invariant. The flux of \(\Phi \) is its average horizontal displacement. We shall prove that, if g is \(\delta \)-pinched with \(\delta >1/4\), one can find a lift \(\Phi \) of \(\varphi \) having zero flux. For diffeomorphisms \(\Phi \) with zero flux, the action and the Calabi invariant can be defined in the following way. The action of \(\Phi \) is the unique function

$$\begin{aligned} \sigma : S \rightarrow \mathbb {R}, \end{aligned}$$

such that

$$\begin{aligned} d\sigma =\Phi ^* \lambda - \lambda \qquad \text{ on } \; S, \end{aligned}$$

and whose value at each boundary point \(w\in \partial S\) coincides with the integral of \(\lambda \) on the arc from w to \(\Phi (w)\) along \(\partial S\). The Calabi invariant of \(\Phi \) is the average of the action, that is, the number

$$\begin{aligned} \mathrm {CAL}(\Phi ) = \frac{1}{2L} \iint _{[0,L]\times [0,\pi ]} \sigma \, d\lambda . \end{aligned}$$

We shall prove that, still assuming g to be \(\delta \)-pinched with \(\delta >1/4\), the action and the Calabi invariant of \(\Phi \) are related to the geometric quantities we are interested in by the identities

$$\begin{aligned} \tau \circ p= & {} L + \sigma ,\end{aligned}$$
(2)
$$\begin{aligned} \pi \, \mathrm {Area} (S^2,g)= & {} L^2 + L\ \mathrm {CAL}(\Phi ), \end{aligned}$$
(3)

where

$$\begin{aligned} p: S= \mathbb {R}\times [0,\pi ] \rightarrow \Sigma _{\gamma }^+ = \mathbb {R}/L\mathbb {Z}\times [0,\pi ] \end{aligned}$$

is the standard projection. The \(\delta \)-pinching assumption on g with \(\delta > (4+\sqrt{7})/8\) implies that the map \(\Phi \) is monotone, meaning that, writing

$$\begin{aligned} \Phi (x,y) = (X(x,y),Y(x,y)), \end{aligned}$$

the strict inequality \(D_2 Y>0\) holds on S. This is proved by using an upper bound on the perimeter of convex geodesic polygons which follows from Toponogov’s comparison theorem. This upper bound plays an important role also in the proof of some of the other facts stated above, and we discuss it in Appendix A. The monotonicity of \(\Phi \) allows us to represent it in terms of a generating function. The method of generating functions is absolutely classical: it goes back to the foundational contributions of Poincaré and continues to be a fundamental tool in modern symplectic topology. By using such a generating function, we shall prove the following fixed point theorem (Theorem 2.12): If a monotone map \(\Phi \) with vanishing flux is not the identity and satisfies \(\mathrm {CAL}(\Phi )\le 0\), then \(\Phi \) has an interior fixed point with negative action.

The first inequality in our main theorem is now a consequence of the latter fixed point theorem and of the identities (2) and (3). First one observes that \(\Phi \) is the identity if and only if g is Zoll. Assume that g is not Zoll. If, by contradiction, the inequality

$$\begin{aligned} L^2 = \ell _{\min }(g)^2 \ge \pi \, {\mathrm {Area}}(S^2,g) \end{aligned}$$

holds, (3) implies that \(\mathrm {CAL}(\Phi )\le 0\), so \(\Phi \) has a fixed point \(w\in \mathrm {int}(S)\) with \(\sigma (w)<0\). But then (2) implies that the closed geodesic which is determined by \(p(w)\in \Sigma _{\gamma }^+\) has length \(\tau (p(w)) < L\), which is a contradiction, because L is the minimal length of a closed geodesic. This shows that when g is not Zoll, the strict inequality

$$\begin{aligned} \ell _{\min }(g)^2 < \pi \, {\mathrm {Area}}(S^2,g) \end{aligned}$$

holds. This proves the first inequality. The proof of the second one uses the Birkhoff map associated to a simple closed geodesic of maximal length and is similar.
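As a sanity check of the identities (2) and (3), which is our own illustration and not part of the argument, consider the round sphere of radius one with \(\gamma \) the equator, so that \(L=2\pi \). A great circle leaving the equator into the chosen hemisphere crosses the equator again after time \(\pi \), but pointing into the opposite hemisphere, and it first returns to \(\Sigma _{\gamma }^+\) after time \(2\pi \), at its initial vector. Hence \(\tau \equiv 2\pi \) and \(\varphi \) is the identity. Choosing the lift \(\Phi =\mathrm {id}\), which has zero flux, one gets \(\sigma \equiv 0\) and \(\mathrm {CAL}(\Phi )=0\), and both identities reduce to

$$\begin{aligned} \tau \circ p&= 2\pi = L + 0, \\ \pi \, \mathrm {Area}(S^2,g_{\mathrm {round}})&= \pi \cdot 4\pi = 4\pi ^2 = L^2 + L\cdot 0. \end{aligned}$$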

2 A class of self-diffeomorphisms of the strip preserving a two-form

We denote by S the closed strip

$$\begin{aligned} S := \mathbb {R}\times [0,\pi ], \end{aligned}$$

on which we consider coordinates (x, y), \(x\in \mathbb {R}\), \(y\in [0,\pi ]\). The smooth two-form

$$\begin{aligned} \omega (x,y) := \sin y \, dx\wedge dy \end{aligned}$$

is an area form on the interior of S and vanishes on its boundary. Fix some \(L>0\), and let \(\mathcal {D}_L(S,\omega )\) be the group of all diffeomorphisms \(\Phi : S \rightarrow S\) such that:

  (i) \(\Phi (x+L,y) = (L,0) + \Phi (x,y)\) for every \((x,y)\in S\).

  (ii) \(\Phi \) maps each component of \(\partial S\) into itself.

  (iii) \(\Phi \) preserves the two-form \(\omega \).

The elements of \(\mathcal {D}_L(S,\omega )\) are precisely the maps which are obtained by lifting to the universal cover

$$\begin{aligned} S \rightarrow A:= \mathbb {R}/L\mathbb {Z}\times [0,\pi ] \end{aligned}$$

self-diffeomorphisms of A which preserve the two-form \(\omega \) on A and map each boundary component into itself.
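Two simple families of elements of \(\mathcal {D}_L(S,\omega )\) may help to fix ideas; they are hypothetical illustrations of ours and are not used in the proofs. Given a real number c and a smooth function \(u:[0,\pi ]\rightarrow \mathbb {R}\), consider the horizontal translation \(T_c\) and the shear \(\Phi _u\) defined below. Both are diffeomorphisms of S which leave the y-coordinate unchanged, so they map each boundary component into itself and satisfy the periodicity condition, and the computation for \(\Phi _u\) (which contains \(T_c\) as the case \(u\equiv c\)) shows that \(\omega \) is preserved.

$$\begin{aligned} T_c(x,y)&:= (x+c,y), \qquad \Phi _u(x,y) := (x+u(y),y), \\ \Phi _u(x+L,y)&= (x+L+u(y),y) = (L,0) + \Phi _u(x,y), \\ \Phi _u^*\omega&= \sin y\, \big ( dx + u'(y)\, dy\big )\wedge dy = \sin y\, dx\wedge dy = \omega . \end{aligned}$$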

By conjugating an element \(\Phi \) of \(\mathcal {D}_L(S,\omega )\) by the homeomorphism

$$\begin{aligned} S \rightarrow \mathbb {R}\times [-1,1], \qquad (x,y) \mapsto (x,-\cos y), \end{aligned}$$

one obtains a self-homeomorphism of the strip \(\mathbb {R}\times [-1,1]\) which preserves the standard area form \(dx\wedge dy\). Such a homeomorphism is in general not continuously differentiable up to the boundary. Since we find it more convenient to work in the smooth category, we prefer not to use the above conjugacy and to deal with the non-standard area-form \(\omega \) vanishing on the boundary.

2.1 The flux and the Calabi invariant

In this section, we define the flux on \(\mathcal {D}_L(S,\omega )\) and the Calabi homomorphism on the kernel of the flux. These real valued homomorphisms were introduced by Calabi in [10] for the group of compactly supported symplectic diffeomorphisms of symplectic manifolds of arbitrary dimension. See also [21, Chapter 10]. In this paper we need to extend these definitions to the surface with boundary S. Our presentation is self-contained.

Definition 2.1

The flux of a map \(\Phi \in \mathcal {D}_L(S,\omega )\), \(\Phi (x,y)=(X(x,y),Y(x,y))\), is the real number

$$\begin{aligned} \mathrm {FLUX}(\Phi ) := \frac{1}{2L} \iint _{[0,L]\times [0,\pi ]} (X(x,y)-x)\, \omega (x,y). \end{aligned}$$

In other words, the flux of \(\Phi \) is the average shift in the horizontal direction (notice that 2L is the total area of \([0,L]\times [0,\pi ]\) with respect to the area form \(\omega \)). Using the fact that the elements of \(\mathcal {D}_L(S,\omega )\) preserve \(\omega \), it is easy to show that the function \(\mathrm {FLUX}: \mathcal {D}_L(S,\omega ) \rightarrow \mathbb {R}\) is a homomorphism.
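For instance, let \(u:[0,\pi ]\rightarrow \mathbb {R}\) be a smooth function and consider the shear \(\Phi _u(x,y):=(x+u(y),y)\), which one readily checks to be an element of \(\mathcal {D}_L(S,\omega )\); this example is ours and serves only as an illustration. Since \(X(x,y)-x=u(y)\), Definition 2.1 gives the first line below. The constant function \(u\equiv c\) gives a translation with flux c, while \(u(y)=c\cos y\) gives a map which is not the identity for \(c\ne 0\) but has zero flux.

$$\begin{aligned} \mathrm {FLUX}(\Phi _u)&= \frac{1}{2L} \int _0^L \int _0^{\pi } u(y) \sin y\, dy\, dx = \frac{1}{2} \int _0^{\pi } u(y)\sin y\, dy, \\ \frac{1}{2} \int _0^{\pi } c \sin y\, dy&= c, \qquad \frac{1}{2} \int _0^{\pi } c \cos y \sin y\, dy = \frac{c}{4} \big [ \sin ^2 y \big ]_0^{\pi } = 0. \end{aligned}$$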

Proposition 2.2

Let \(\alpha _0: [0,\pi ] \rightarrow S\) be the path \(\alpha _0(t):=(0,t)\). Then

$$\begin{aligned} \mathrm {FLUX}(\Phi ) = \frac{1}{2} \int _{\Phi (\alpha _0)} x\sin y \, dy, \end{aligned}$$

for every \(\Phi \) in \(\mathcal {D}_L(S,\omega )\).

Proof

Let \(\Theta :S\rightarrow S\) be the covering transformation \((x,y)\mapsto (x+L,y)\), and set \(Q:=[0,L]\times [0,\pi ]\). With its natural orientation, \(Q\subset S\) is the region whose signed boundary is \(\Theta (\alpha _0)-\alpha _0\) plus pieces that lie in \(\partial S\). Since \(\Phi \in \mathcal {D}_L(S,\omega )\) commutes with \(\Theta \), we have

$$\begin{aligned} \Phi (Q)-Q=\Theta (R)-R \end{aligned}$$
(4)

as simplicial 2-chains in S, where \(R \subset S\) is an oriented region whose signed boundary consists of \(\Phi (\alpha _0)-\alpha _0\) plus two additional pieces in \(\partial S\) that we do not need to label. Therefore,

$$\begin{aligned} \mathrm {FLUX}(\Phi ) = \frac{1}{2L} \int _{Q} (X-x)\,\omega = \frac{1}{2L} \int _{Q} ( \Phi ^*(x\, \omega )-x\,\omega ) = \frac{1}{2L} \int _{R} ( \Theta ^*(x\, \omega )-x\, \omega ), \end{aligned}$$

using (4) for the last equality. Since

$$\begin{aligned} \Theta ^*(x\, \omega )-x\, \omega = L\, \omega =L\, d\big (x\sin y\,dy\big ), \end{aligned}$$

by Stokes theorem we conclude that

$$\begin{aligned} \mathrm {FLUX}(\Phi ) = \frac{1}{2} \int _{\partial R} x\sin y\,dy = \frac{1}{2} \int _{\Phi (\alpha _0)-\alpha _0} x\sin y\,dy = \frac{1}{2} \int _{\Phi (\alpha _0)} x\sin y\,dy. \end{aligned}$$

\(\square \)
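The formula of Proposition 2.2 can be tested on a simple example of ours: the shear \((x,y)\mapsto (x+u(y),y)\), where \(u:[0,\pi ]\rightarrow \mathbb {R}\) is a smooth function. It belongs to \(\mathcal {D}_L(S,\omega )\) and maps \(\alpha _0\) to the path \(t\mapsto (u(t),t)\), so the line integral becomes an ordinary integral. The result agrees with the value given by Definition 2.1, where \(X-x=u(y)\).

$$\begin{aligned} \frac{1}{2} \int _{\Phi (\alpha _0)} x \sin y\, dy = \frac{1}{2} \int _0^{\pi } u(t) \sin t\, dt = \frac{1}{2L} \iint _{[0,L]\times [0,\pi ]} u(y)\, \omega (x,y). \end{aligned}$$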

Remark 2.3

More generally, it is not difficult to show that if \(\alpha \) is any smooth path in S with the first end-point in \(\mathbb {R}\times \{0\}\) and the second one in \(\mathbb {R}\times \{\pi \}\), then

$$\begin{aligned} \mathrm {FLUX} (\Phi ) = \frac{1}{2} \int _{\Phi (\alpha )} x\sin y \, dy - \frac{1}{2} \int _{\alpha } x\sin y \, dy, \end{aligned}$$

for every \(\Phi \) in \(\mathcal {D}_L(S,\omega )\).

Now we fix the following primitive of \(\omega \) on S

$$\begin{aligned} \lambda := \cos y\, dx. \end{aligned}$$

Notice that \(\lambda \) is invariant with respect to translations in the x-direction. Let \(\Phi \) be an element of \(\mathcal {D}_L(S,\omega )\). Since \(\Phi \) preserves \(\omega =d\lambda \), the one-form

$$\begin{aligned} \Phi ^* \lambda - \lambda \end{aligned}$$

is closed. Since S is simply connected, there exists a unique smooth function

$$\begin{aligned} \sigma : S \rightarrow \mathbb {R}\end{aligned}$$

such that

$$\begin{aligned} d\sigma = \Phi ^* \lambda - \lambda \qquad \text{ on } \; S, \end{aligned}$$
(5)

and

$$\begin{aligned} \sigma (0,0) = \int _{\gamma _0} \lambda - \mathrm {FLUX}(\Phi ), \end{aligned}$$
(6)

where \(\gamma _0\) is a smooth path in \(\partial S\) going from (0, 0) to \(\Phi (0,0)\). Of course, the value of the integral in (6) does not depend on the choice of \(\gamma _0\), but only on its end-points.

Notice that the function \(\sigma \) is L-periodic in the first variable: This follows from the fact that \(\Phi ^* \lambda -\lambda \) is L-periodic in the first variable and its integral on the path \(\beta _0: [0,L] \rightarrow S\), \(\beta _0(t)=(t,0)\), vanishes:

$$\begin{aligned} \int _{\beta _0} ( \Phi ^* \lambda - \lambda ) = \int _{\Phi (\beta _0)} \lambda - \int _{\beta _0} \lambda = \int _{\Phi (0,0) + \beta _0} \lambda - \int _{\beta _0} \lambda = 0, \end{aligned}$$

thanks to the invariance of \(\lambda \) with respect to horizontal translations (here, the L-periodicity of \(\lambda \) in the first variable would have sufficed).

Notice also that, thanks to (5), the same normalization condition (6) holds for every point in the lower component of the boundary of S: For every x in \(\mathbb {R}\) there holds

$$\begin{aligned} \sigma (x,0) = \int _{\gamma _x} \lambda - \mathrm {FLUX}(\Phi ), \end{aligned}$$
(7)

where \(\gamma _x\) is a smooth path in \(\partial S\) going from (x, 0) to \(\Phi (x,0)\). Indeed, if \(\xi _x\) is a smooth path in \(\partial S\) from (0, 0) to (x, 0), then the paths \(\gamma _0 \# (\Phi \circ \xi _x)\) and \(\xi _x \# \gamma _x\) in \(\partial S\) have the same end-points. Thus,

$$\begin{aligned} \int _{\gamma _0} \lambda + \int _{\xi _x} \Phi ^* \lambda = \int _{\xi _x} \lambda + \int _{\gamma _x} \lambda , \end{aligned}$$

and Eqs. (5) and (6) imply

$$\begin{aligned} \sigma (x,0) = \sigma (0,0) + \int _{\xi _x} d\sigma = \int _{\gamma _0} \lambda - \mathrm {FLUX}(\Phi ) + \int _{\xi _x} ( \Phi ^* \lambda - \lambda ) = \int _{\gamma _x} \lambda - \mathrm {FLUX}(\Phi ). \end{aligned}$$

Therefore, we can give the following definitions.

Definition 2.4

Let \(\Phi \in \mathcal {D}_L(S,\omega )\). The unique smooth function \(\sigma :S \rightarrow \mathbb {R}\) which satisfies (5) and (6) (or, equivalently, (5) and (7)) is called the action of \(\Phi \).

Definition 2.5

Let \(\Phi \in \ker \mathrm {FLUX}\) and let \(\sigma \) be the action of \(\Phi \). The Calabi invariant of \(\Phi \) is the real number

$$\begin{aligned} \mathrm {CAL}(\Phi ) = \frac{1}{2L} \iint _{[0,L]\times [0,\pi ]} \sigma \, \omega . \end{aligned}$$

In other words, the Calabi invariant of \(\Phi \) is its average action. The following remark explains why we define the Calabi invariant only for diffeomorphisms having zero flux.
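To see these definitions at work, consider the following example, which is ours and is meant only as an illustration: the shear \(\Phi _c(x,y):=(x+c\cos y,y)\), with c a real number. It belongs to \(\mathcal {D}_L(S,\omega )\) and has zero flux, because \(\int _0^{\pi }\cos y\sin y\, dy=0\). Since \(\Phi _c(0,0)=(c,0)\) and \(\lambda =dx\) on the lower boundary component, the normalisation (6) reads \(\sigma (0,0)=c\), and (5) can be integrated explicitly.

$$\begin{aligned} \Phi _c^*\lambda - \lambda&= \cos y\, d(x+c\cos y) - \cos y\, dx = -c \sin y\cos y\, dy, \\ \sigma (x,y)&= c - c \int _0^y \sin t\cos t\, dt = c - \frac{c}{2} \sin ^2 y, \\ \mathrm {CAL}(\Phi _c)&= \frac{1}{2} \int _0^{\pi } \Big ( c - \frac{c}{2}\sin ^2 y \Big ) \sin y\, dy = \frac{1}{2} \Big ( 2c - \frac{2c}{3} \Big ) = \frac{2c}{3}. \end{aligned}$$

Here we used \(\int _0^{\pi }\sin y\, dy=2\) and \(\int _0^{\pi }\sin ^3 y\, dy=4/3\).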

Remark 2.6

The action \(\sigma \) depends on the choice of the primitive \(\lambda \) of \(\omega \). Let \(\lambda '\) be another primitive of \(\omega \), still L-periodic in the first variable. Then one can easily show that \(\lambda '=\lambda +df + c\, dx\), where \(f:S\rightarrow \mathbb {R}\) is a smooth function which is L-periodic in the first variable and c is a real number, and that the action \(\sigma '\) of \(\Phi \) with respect to \(\lambda '\) is given by

$$\begin{aligned} \sigma '(x,y) = \sigma (x,y) + f\circ \Phi (x,y) - f(x,y) + c (X(x,y)-x), \end{aligned}$$

where \(\Phi =(X,Y)\). If \(\Phi \) has zero flux, then the integrals of \(\sigma ' \, \omega \) and of \(\sigma \, \omega \) on \([0,L]\times [0,\pi ]\) coincide, so the Calabi invariant of \(\Phi \) does not depend on the choice of the periodic primitive of \(\omega \). Moreover, this formula also shows that the value of the action at a fixed point of \(\Phi \) is independent of the choice of the primitive of \(\omega \). Since \(\Phi ^* \lambda \) is another periodic primitive of \(\omega \), the above facts imply that \(\mathrm {CAL}:\ker \mathrm {FLUX} \rightarrow \mathbb {R}\) is a homomorphism. In this paper, we always work with the chosen primitive \(\lambda \) of \(\omega \) and do not need the homomorphism property of \(\mathrm {CAL}\), so we leave these verifications to the reader. See [14, 15] for interesting equivalent definitions of the Calabi invariant in the case of compactly supported area preserving diffeomorphisms of the plane.

In our definition of the action, we have chosen to normalise \(\sigma \) by looking at the lower component of \(\partial S\). The following result describes what happens on the upper component.

Proposition 2.7

Let \(\Phi \in \mathcal {D}_L(S,\omega )\) and let \(\sigma : S \rightarrow \mathbb {R}\) be its action. Let \(\delta _x\) be a smooth path in \(\partial S\) going from \((x,\pi )\) to \(\Phi (x,\pi )\). Then

$$\begin{aligned} \sigma (x,\pi ) = \int _{\delta _x} \lambda + \mathrm {FLUX}(\Phi ). \end{aligned}$$

Proof

The same argument used in the paragraph above Definition 2.4 shows that it is enough to check the formula for \(x=0\). In this case, by integrating over the path \(\alpha _0:[0,\pi ] \rightarrow S\), \(\alpha _0(t):=(0,t)\), we find by Stokes theorem

$$\begin{aligned} \sigma (0,\pi )= & {} \sigma (0,0) + \int _{\alpha _0} d\sigma = \int _{\gamma _0} \lambda -\mathrm{FLUX}(\Phi ) + \int _{\alpha _0} ( \Phi ^* \lambda - \lambda ) \\= & {} \int _{\gamma _0} \lambda -\mathrm{FLUX}(\Phi ) + \int _{\Phi (\alpha _0)} \lambda + \int _{\alpha _0^{-1}} \lambda = \int _{\delta _0} \lambda -\mathrm{FLUX}(\Phi ) + \iint _R h^*(d\lambda ), \end{aligned}$$

where \(h: R \rightarrow S\) is a smooth map on a closed rectangle R whose restriction to the boundary is given by the concatenation \(\gamma _0 \# (\Phi \circ \alpha _0) \# \delta _0^{-1} \# \alpha _0^{-1}\). By using again Stokes theorem with the primitive \(x\sin y\, dy\) of \(\omega =d\lambda \), we get

$$\begin{aligned} \iint _R h^*(d\lambda ) = \int _{\gamma _0 \# (\Phi \circ \alpha _0) \# \delta _0^{-1} \# \alpha _0^{-1}} x\sin y\, dy = \int _{\Phi (\alpha _0)} x \sin y\, dy. \end{aligned}$$

By Proposition 2.2, the latter quantity coincides with twice the flux of \(\Phi \), and the conclusion follows. \(\square \)
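Proposition 2.7 can be checked by hand on the shear \(\Phi _c(x,y):=(x+c\cos y,y)\), an illustrative example of ours. This map belongs to \(\mathcal {D}_L(S,\omega )\) and has zero flux, and integrating (5) with the normalisation (6) gives \(\sigma (x,y)=c-\frac{c}{2}\sin ^2 y\). On the upper boundary component one has \(\lambda =\cos \pi \, dx=-dx\) and \(\Phi _c(x,\pi )=(x-c,\pi )\), so the path \(\delta _x\) runs from \((x,\pi )\) to \((x-c,\pi )\) and the two sides of the formula agree.

$$\begin{aligned} \sigma (x,\pi ) = c - \frac{c}{2}\sin ^2 \pi = c, \qquad \int _{\delta _x} \lambda + \mathrm {FLUX}(\Phi _c) = -\big ( (x-c) - x \big ) + 0 = c. \end{aligned}$$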

2.2 Generating functions

As is well known, area-preserving self-diffeomorphisms of the strip which satisfy a suitable monotonicity condition can be represented in terms of a generating function. See for instance [21, Chapter 9]. Here we need to review these facts in the case of diffeomorphisms preserving the special two-form \(\omega = \sin y \, dx\wedge dy\).

Definition 2.8

The diffeomorphism \(\Phi =(X,Y)\) in \(\mathcal {D}_L(S,\omega )\) is said to be monotone if \(D_2 Y (x,y) >0\) for every \((x,y)\in S\).

Assume that \(\Phi =(X,Y)\in \mathcal {D}_L(S,\omega )\) is a monotone map. Then for every \(x\in \mathbb {R}\) the map \(y \mapsto Y(x,y)\) is a diffeomorphism of \([0,\pi ]\) onto itself, and hence the map

$$\begin{aligned} \Psi : S \rightarrow S, \qquad \Psi (x,y) = ( x,Y(x,y) ) \end{aligned}$$

is a diffeomorphism. Denoting by y the second component of the inverse of \(\Psi \), we can work with coordinates (x, Y) on S and consider the one-form

$$\begin{aligned} \eta (x,Y) = ( \cos Y - \cos y)\, dx + (X-x) \sin Y\, dY \qquad \text{ on } \; S. \end{aligned}$$

From the fact that \(\Phi \) preserves \(\omega \) we find

$$\begin{aligned} d\eta= & {} \sin Y \, dx \wedge dY - \sin y \, dx\wedge dy + \sin Y\, dX \wedge dY - \sin Y \, dx \wedge dY \\= & {} -\sin y \, dx \wedge dy + \sin Y \, dX \wedge dY = 0, \end{aligned}$$

so \(\eta \) is closed. Let \(W=W(x,Y)\) be a primitive of \(\eta \). Then also \((x,Y) \mapsto W(x+L,Y)\) is a primitive of \(\eta \), and hence

$$\begin{aligned} W(x+L,Y) - W(x,Y) = c, \qquad \forall (x,Y)\in S, \end{aligned}$$

for some real number c. Since the integral of \(\eta \) on any path in \(\partial S\) connecting (0, 0) to (L, 0) vanishes, the constant c must be zero, and hence any primitive W of \(\eta \) is L-periodic. By writing

$$\begin{aligned} dW (x,Y) = D_1 W(x,Y)\, dx + D_2W (x,Y)\, dY, \end{aligned}$$

and using the definition of \(\eta \), we obtain the following:

Proposition 2.9

Assume that \(\Phi \) in \(\mathcal {D}_L(S,\omega )\) is a monotone map. Then there exists a smooth function \(W: S \rightarrow \mathbb {R}\) such that the following holds: \(\Phi (x,y)=(X,Y)\) if and only if

$$\begin{aligned} (X-x) \sin Y= & {} D_2 W(x,Y), \end{aligned}$$
(8)
$$\begin{aligned} \cos Y - \cos y= & {} D_1 W(x,Y). \end{aligned}$$
(9)

The function W is L-periodic in the first variable. It is uniquely defined up to the addition of a real constant.

A function W as above is called a generating function of \(\Phi \). Equation (9) implies that W is constant on each of the two connected components of the boundary of S. The difference between these two constant values coincides with twice the flux of \(\Phi \):

Proposition 2.10

If W is a generating function of the monotone map \(\Phi \in \mathcal {D}_L(S,\omega )\), then

$$\begin{aligned} \mathrm {FLUX}(\Phi ) = \frac{1}{2} ( W|_{\mathbb {R}\times \{\pi \}} - W|_{\mathbb {R}\times \{0\}} ). \end{aligned}$$

Proof

By Proposition 2.2 and (8) we compute

$$\begin{aligned} \mathrm {FLUX}(\Phi )= & {} \frac{1}{2} \int _{\Phi (\alpha _0)} x \sin y\, dy = \frac{1}{2} \int _{\alpha _0} X \sin Y\, dY = \frac{1}{2} \int _{\alpha _0} (X-x) \sin Y\, dY \\= & {} \frac{1}{2} \int _{\alpha _0} D_2 W(x,Y)\, dY = \frac{1}{2} ( W|_{\mathbb {R}\times \{\pi \}} - W|_{\mathbb {R}\times \{0\}} ), \end{aligned}$$

where we have used the fact that \(x=0\) on the path \(\alpha _0\) which is defined in Proposition 2.2. \(\square \)
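As an illustration of ours, the generating function of a shear \(\Phi (x,y)=(x+u(y),y)\), with \(u:[0,\pi ]\rightarrow \mathbb {R}\) smooth, can be written down explicitly. Such a map belongs to \(\mathcal {D}_L(S,\omega )\) and is monotone, since \(Y=y\) and hence \(D_2Y=1\). Equation (9) forces \(D_1W=0\), and (8) determines \(D_2W\), so that W depends only on Y and is given, up to an additive constant C, by the first line below. The second line recovers Proposition 2.10, because \(\mathrm {FLUX}(\Phi )=\frac{1}{2}\int _0^{\pi }u(t)\sin t\, dt\) by Proposition 2.2.

$$\begin{aligned} D_2 W(x,Y)&= u(Y)\sin Y, \qquad W(x,Y) = C + \int _0^Y u(t)\sin t\, dt, \\ \frac{1}{2} \big ( W|_{\mathbb {R}\times \{\pi \}} - W|_{\mathbb {R}\times \{0\}} \big )&= \frac{1}{2} \int _0^{\pi } u(t)\sin t\, dt = \mathrm {FLUX}(\Phi ). \end{aligned}$$

For \(u(y)=c\cos y\) this gives \(W(x,Y)=C+\frac{c}{2}\sin ^2 Y\), which takes the same value on both boundary components, in accordance with the vanishing of the flux.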

By the above proposition, we can choose the free additive constant of the generating function W in such a way that:

$$\begin{aligned} W|_{\mathbb {R}\times \{0\}} = -\mathrm {FLUX}(\Phi ), \qquad W|_{\mathbb {R}\times \{\pi \}} = \mathrm {FLUX}(\Phi ). \end{aligned}$$
(10)

We conclude this section by expressing the action and the Calabi invariant of a monotone element of \(\mathcal {D}_L(S,\omega )\) in terms of its generating function, normalised by the above condition.

Proposition 2.11

Let \(\Phi =(X,Y)\in \mathcal {D}_L(S,\omega )\) be a monotone map, and denote by W the generating function of \(\Phi \) normalised by (10). Then we have:

  (i) The action of \(\Phi \) is the function

    $$\begin{aligned} \sigma (x,y) = W(x,Y(x,y)) + D_2 W(x,Y(x,y)) \cot Y(x,y). \end{aligned}$$

  (ii) If moreover \(\mathrm {FLUX}(\Phi )=0\), then the Calabi invariant of \(\Phi \) is the number

    $$\begin{aligned} \mathrm {CAL}(\Phi )= \frac{1}{2L} \iint _{[0,L]\times [0,\pi ]} ( W(x,y) + W(x,Y(x,y)) )\, \omega (x,y). \end{aligned}$$

The formula for \(\sigma \) in (i) is valid only in the interior of S, because the cotangent function diverges at 0 and \(\pi \). Since \(D_2 W\) vanishes on the boundary of S, thanks to (8), this formula defines a smooth function on S by setting

$$\begin{aligned} \sigma (x,0) = W(x,0) + D_{22} W(x,0), \qquad \sigma (x,\pi ) = W(x,\pi ) + D_{22} W(x,\pi ), \end{aligned}$$

for every \(x\in \mathbb {R}\).

Proof

Let us check that the function \(\sigma \) which is defined in (i) coincides with the action of \(\Phi \). By (8) we have

$$\begin{aligned} \sigma = W + D_2 W \cot Y = W + (X-x) \cos Y \end{aligned}$$
(11)

on \(\mathrm {int}(S)\). By continuity, this formula for \(\sigma \) is valid on the whole S. By differentiating it and using again (8) together with (9), we obtain

$$\begin{aligned} d\sigma= & {} dW - (X-x) \sin Y \, dY + \cos Y (dX-dx) \\= & {} dW - D_2 W\, dY + \cos Y (dX-dx) = D_1 W \, dx + \cos Y (dX-dx)\\= & {} (\cos Y- \cos y)\, dx + \cos Y (dX-dx) = \cos Y\, dX - \cos y\, dx = \Phi ^* \lambda - \lambda . \end{aligned}$$

Therefore, \(\sigma \) satisfies (5). Evaluating (11) in (0, 0) we find

$$\begin{aligned} \sigma (0,0) = W(0,0) + X(0,0) = - \mathrm {FLUX}(\Phi ) + X(0,0) = - \mathrm {FLUX}(\Phi ) + \int _{\gamma _0} \lambda , \end{aligned}$$

where \(\gamma _0\) is a path in \(\partial S\) going from (0, 0) to \(\Phi (0,0)\). We conclude that \(\sigma \) satisfies also (6), and hence coincides with the action of \(\Phi \). This proves (i).

We now use (i) in order to compute the integral of the two form \(\sigma \, \omega \) on \([0,L]\times [0,\pi ]\). We start from the identity

$$\begin{aligned} \iint _{[0,L]\times [0,\pi ]} \sigma \, \omega= & {} \iint _{[0,L]\times [0,\pi ]} W(x,Y(x,y)) \, \omega (x,y) \nonumber \\&+ \iint _{[0,L]\times [0,\pi ]} D_2W(x,Y(x,y)) \cot Y(x,y) \sin y \, dx \wedge dy,\nonumber \\ \end{aligned}$$
(12)

and we manipulate the last integral. By differentiating (9), that is, the identity

$$\begin{aligned} \cos Y(x,y) - \cos y = D_1 W(x,Y(x,y)), \end{aligned}$$

we obtain

$$\begin{aligned} \sin y \, dy = \sin Y \, dY + D_{11} W \, dx + D_{12} W \, dY. \end{aligned}$$

By the above formula, the integrand in the last integral in (12) can be rewritten as

$$\begin{aligned} D_2W \cot Y \sin y \, dx \wedge dy= & {} D_2W \cot Y \, dx \wedge ( \sin Y \, dY + D_{12} W\, dY) \nonumber \\= & {} D_2 W\cos Y \, dx\wedge dY + D_2 W D_{12} W \cot Y\, dx\wedge dY.\nonumber \\ \end{aligned}$$
(13)

We integrate the above two forms separately. By the L-periodicity in x, the integral of the first two-form can be manipulated as follows:

$$\begin{aligned}&\iint _{[0,L]\times [0,\pi ]} D_2 W(x,Y(x,y)) \cos Y(x,y)\, dx\wedge dY(x,y) \nonumber \\&\quad = \iint _{[0,L]\times [0,\pi ]} D_2 W(x,Y) \cos Y\, dx\wedge dY \nonumber \\&\quad = \int _0^L \left( \int _0^\pi D_2 W(x,Y) \cos Y \, dY \right) \, dx \nonumber \\&\quad = \int _0^L \left( [ W(x,Y)\cos Y ]_{Y=0}^{Y=\pi } + \int _0^{\pi } W(x,Y)\sin Y\, dY \right) \, dx \nonumber \\&\quad = - L ( W|_{\mathbb {R}\times \{\pi \}} + W|_{\mathbb {R}\times \{0\}} ) + \iint _{[0,L]\times [0,\pi ]} W(x,Y) \sin Y \, dx\wedge dY \nonumber \\&\quad = - L ( - \mathrm {FLUX}(\Phi ) + \mathrm {FLUX}(\Phi ) ) + \iint _{[0,L]\times [0,\pi ]} W(x,y) \, \sin y \, dx\wedge dy\nonumber \\&\quad = \iint _{[0,L]\times [0,\pi ]} W(x,y) \, \omega (x,y), \end{aligned}$$
(14)

where we have used the normalization condition (10). The integral of the second form in the right-hand side of (13) vanishes, because

$$\begin{aligned}&\iint _{[0,L]\times [0,\pi ]} D_2 W D_{12} W \cot Y \, dx\wedge dY \nonumber \\&\quad = \frac{1}{2} \iint _{[0,L]\times [0,\pi ]} D_1 (D_2 W)^2 \cot Y \, dx \wedge dY \nonumber \\&\quad = \frac{1}{2} \int _0^{\pi } \cot Y \left( \int _0^L D_1 (D_2 W)^2\, dx \right) \, dY = 0, \end{aligned}$$
(15)

by L-periodicity in x. By (12), (13), (14) and (15) we obtain

$$\begin{aligned} \iint _{[0,L]\times [0,\pi ]} \sigma \, \omega = \iint _{[0,L]\times [0,\pi ]} ( W(x,Y(x,y)) + W(x,y) )\, \omega (x,y), \end{aligned}$$

and (ii) follows. \(\square \)
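The integral identity just obtained can be sanity-checked numerically. The following is a minimal sketch, not part of the argument: the generating function `W`, the period `L = 2*pi` and the size `EPS` are hypothetical choices made only for illustration (W is L-periodic in x and vanishes on the boundary, which corresponds to zero flux). The code solves the relation \(\cos Y - \cos y = D_1 W(x,Y)\) for Y by bisection and compares both sides of the identity with a midpoint rule.

```python
import math

EPS = 0.3          # hypothetical size of the perturbation
L = 2 * math.pi    # hypothetical period in x

def W(x, Y):
    # Hypothetical generating function: L-periodic in x, zero at Y = 0 and Y = pi.
    return EPS * (1 + 0.5 * math.sin(x)) * math.sin(Y) ** 2

def D1W(x, Y):
    # Partial derivative of W with respect to x.
    return EPS * 0.5 * math.cos(x) * math.sin(Y) ** 2

def solve_Y(x, y):
    # Solve cos Y - cos y = D1W(x, Y) for Y in [0, pi] by bisection.
    # For this W the function Y -> cos Y - D1W(x, Y) is strictly decreasing.
    lo, hi = 0.0, math.pi
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if math.cos(mid) - D1W(x, mid) - math.cos(y) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sigma(x, Y):
    # sigma = W + D2W * cot Y. For this W, D2W * cot Y equals
    # 2 * EPS * (1 + sin(x)/2) * cos(Y)**2, so no division by sin Y is needed.
    return W(x, Y) + 2 * EPS * (1 + 0.5 * math.sin(x)) * math.cos(Y) ** 2

def both_sides(nx=64, ny=200):
    # Midpoint rule on [0, L] x [0, pi] for both sides of the identity.
    hx, hy = L / nx, math.pi / ny
    lhs = rhs = 0.0
    for i in range(nx):
        x = (i + 0.5) * hx
        for j in range(ny):
            y = (j + 0.5) * hy
            Y = solve_Y(x, y)
            weight = math.sin(y) * hx * hy   # omega = sin y dx ^ dy
            lhs += sigma(x, Y) * weight
            rhs += (W(x, Y) + W(x, y)) * weight
    return lhs, rhs
```

Calling `both_sides()` returns the two integrals, which should agree up to the quadrature error. Whether this particular W satisfies every hypothesis placed on generating functions earlier in the paper is not checked here; only the integral identity is tested.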

2.3 The Calabi invariant and the action at fixed points

We are now in the position to prove the main result of this first part.

Theorem 2.12

Let \(\Phi \) be a monotone element of \(\mathcal {D}_L(S,\omega )\) which is different from the identity and has zero flux. If \(\mathrm {CAL}(\Phi )\le 0\) (resp. \(\mathrm {CAL}(\Phi )\ge 0\)), then \(\Phi \) has an interior fixed point with negative (resp. positive) action.

Proof

Let W be the generating function of \(\Phi \) normalised by the condition (10). Since \(\Phi \) has zero flux, this condition says that W is zero on the boundary of S. Since \(\Phi \) is not the identity, W is not identically zero. Then the condition \(\mathrm {CAL}(\Phi )\le 0\) and the formula of Proposition 2.11 (ii) for \(\mathrm {CAL}(\Phi )\) imply that W is somewhere negative. Being a continuous periodic function, W achieves its minimum at some interior point \((x,Y)\in \mathrm {int}(S)\). Since the differential of W vanishes at \((x,Y)\), Eqs. (8) and (9) imply that \((x,y):=(x,Y)\) is a fixed point of \(\Phi \). By Proposition 2.11 (i),

$$\begin{aligned} \sigma (x,y) = W(x,Y) < 0. \end{aligned}$$

Therefore, \((x,y)\) is an interior fixed point of \(\Phi \) with negative action. The case \(\mathrm {CAL}(\Phi )\ge 0\) is completely analogous. \(\square \)

The conclusion of the above theorem is false if we drop the assumption on the monotonicity of \(\Phi \): there exist non-monotone maps \(\Phi \in \mathcal {D}_L(S,\omega )\) which have zero flux and negative Calabi invariant, but no fixed points with negative action. See Section 2.8 and Remark 2.22 in [1].

3 The geodesic flow on a positively curved two-sphere

Throughout this section, a smooth oriented Riemannian two-sphere \((S^2,g)\) is fixed. The associated unit tangent bundle is

$$\begin{aligned} T^1 S^2:=\{v\in TS^2 \mid g_{\pi (v)}(v,v)=1\}, \end{aligned}$$

where \(\pi :TS^2 \rightarrow S^2\) denotes the bundle projection. For each \(v \in T^1 S^2\), we denote by \(v^\perp \in T_{\pi (v)}S^2\) the unit vector perpendicular to v such that \(\{v,v^\perp \}\) is a positive basis of \(T_{\pi (v)}S^2\).

We shall always deal with Riemannian metrics g having positive Gaussian curvature K and shall often use Klingenberg’s lower bound on the injectivity radius \(\mathrm {inj}(g)\) of the metric g from [18], that is,

$$\begin{aligned} \mathrm {inj}(g) \ge \frac{\pi }{\sqrt{\max K}}, \end{aligned}$$
(16)

see also [19, Theorem 2.6.9].

3.1 Extension and regularity of the Birkhoff map

Let \(\gamma :\mathbb {R}/ L\mathbb {Z}\rightarrow S^2\) be a simple closed geodesic of length L parametrised by arc-length, i.e. satisfying \(g_{\gamma }(\dot{\gamma },\dot{\gamma })\equiv 1\). The smooth unit vector field \(\dot{\gamma }^\perp \) along \(\gamma \) determines the Birkhoff annuli

$$\begin{aligned} \Sigma _\gamma ^+:= & {} \{\cos y \ \dot{\gamma }(x) + \sin y \ \dot{\gamma }^\perp (x)\in T^1S^2 \mid (x,y)\in \mathbb {R}/ L\mathbb {Z}\times [0,\pi ]\}, \nonumber \\ \Sigma _\gamma ^-:= & {} \{\cos y \ \dot{\gamma }(x) + \sin y \ \dot{\gamma }^\perp (x)\in T^1S^2 \mid (x,y)\in \mathbb {R}/ L\mathbb {Z}\times [-\pi ,0]\}. \end{aligned}$$
(17)

These sets are embedded closed annuli and \((x,y)\) are smooth coordinates on them. The annuli \(\Sigma _{\gamma }^+\) and \(\Sigma _{\gamma }^-\) intersect along their boundaries \(\partial \Sigma _{\gamma }^+=\partial \Sigma _{\gamma }^-\). This common boundary has two components, one containing unit vectors \(\dot{\gamma }\) and the other containing unit vectors \(-\dot{\gamma }\). We denote the open annuli by

$$\begin{aligned} \mathrm {int}(\Sigma ^+_\gamma ) := \Sigma ^+_\gamma {\setminus } \partial \Sigma ^+_\gamma , \qquad \mathrm {int}(\Sigma ^-_\gamma ) := \Sigma ^-_\gamma {\setminus } \partial \Sigma ^-_\gamma . \end{aligned}$$

Let \(\phi _t\) be the geodesic flow on \(T^1S^2\). We define the functions

$$\begin{aligned}&\tau _+ : \mathrm {int}(\Sigma _{\gamma }^+) \rightarrow (0,+\infty ], \qquad \tau _+(v) := \inf \{ t>0 \mid \phi _t(v) \in \mathrm {int}( \Sigma ^-_\gamma ) \}, \\&\tau _- : \mathrm {int}(\Sigma _{\gamma }^-) \rightarrow (0,+\infty ], \qquad \tau _-(v) := \inf \{ t>0 \mid \phi _t(v) \in \mathrm {int}( \Sigma ^+_\gamma ) \}, \end{aligned}$$

where the infimum of the empty set is \(+\infty \). The functions \(\tau _+\) and \(\tau _-\) are the transition times to go from the interior of \(\Sigma _{\gamma }^+\) to the interior of \(\Sigma _{\gamma }^-\) and the other way round. The first return time to \(\Sigma _{\gamma }^+\) is instead the function

$$\begin{aligned} \tau : \mathrm {int}(\Sigma _{\gamma }^+) \rightarrow (0,+\infty ], \qquad \tau (v) := \inf \{ t>0 \mid \phi _t(v) \in \mathrm {int}( \Sigma ^+_\gamma ) \}. \end{aligned}$$

Recall the following celebrated theorem due to Birkhoff (see also [6]):

Theorem 3.1

(Birkhoff [7]) If the Gaussian curvature of g is everywhere positive then the functions \(\tau _+\), \(\tau _-\) and \(\tau \) are everywhere finite.

Thanks to the above result, we have the transition maps

$$\begin{aligned}&\varphi _+ : \mathrm {int}(\Sigma _{\gamma }^+) \rightarrow \mathrm {int}(\Sigma _{\gamma }^-), \qquad \varphi _+(v) := \phi _{\tau _+(v)}(v), \\&\varphi _- : \mathrm {int}(\Sigma _{\gamma }^-) \rightarrow \mathrm {int}(\Sigma _{\gamma }^+), \qquad \varphi _-(v) := \phi _{\tau _-(v)}(v), \end{aligned}$$

and the first return map

$$\begin{aligned} \varphi : \mathrm {int}(\Sigma _{\gamma }^+) \rightarrow \mathrm {int}(\Sigma _{\gamma }^+), \qquad \varphi (v) := \phi _{\tau (v)}(v). \end{aligned}$$

By construction,

$$\begin{aligned} \varphi= & {} \varphi _- \circ \varphi _+,\end{aligned}$$
(18)
$$\begin{aligned} \tau= & {} \tau _+ + \tau _-\circ \varphi _+. \end{aligned}$$
(19)

Using the implicit function theorem and the fact that the geodesic flow is transverse to both \(\mathrm {int}(\Sigma ^+_\gamma )\) and \(\mathrm {int}(\Sigma ^-_\gamma )\), one easily proves that the functions \(\tau _+\), \(\tau _-\) and \(\tau \) are smooth. These functions have smooth extensions to the closure of their domains. More precisely, we have the following statement.

Proposition 3.2

Assume that the Gaussian curvature of \((S^2,g)\) is everywhere positive. Then:

(i) The functions \(\tau _+\) and \(\tau _-\) can be smoothly extended to \(\Sigma ^+_\gamma \) and \(\Sigma _{\gamma }^-\), respectively, as follows: \(\tau _+(\dot{\gamma }(x))=\tau _-(\dot{\gamma }(x))\) is the time to the first conjugate point along the geodesic ray \(t \in [0,+\infty ) \mapsto \gamma (x+t)\), and \(\tau _+(-\dot{\gamma }(x))=\tau _-(-\dot{\gamma }(x))\) is the time to the first conjugate point along the geodesic ray \(t\in [0,+\infty ) \mapsto \gamma (x-t)\).

(ii) The function \(\tau \) can be smoothly extended to \(\Sigma ^+_\gamma \) as follows: \(\tau (\dot{\gamma }(x))\) is the time to the second conjugate point along the geodesic ray \(t \in [0,+\infty ) \mapsto \gamma (x+t)\), and \(\tau (-\dot{\gamma }(x))\) is the time to the second conjugate point along the geodesic ray \(t\in [0,+\infty ) \mapsto \gamma (x-t)\).

The smooth extensions of \(\tau _+\), \(\tau _-\) and \(\tau \) are denoted by the same symbols. The above proposition has the following consequence:

Corollary 3.3

Suppose that the Gaussian curvature of \((S^2,g)\) is everywhere positive. Then the formulas

$$\begin{aligned} v\mapsto \phi _{\tau _+(v)}(v), \qquad v\mapsto \phi _{\tau _-(v)}(v) \quad \text{ and } \quad v\mapsto \phi _{\tau (v)}(v) \end{aligned}$$

define smooth extensions of the maps \(\varphi _+\), \(\varphi _-\) and \(\varphi \) to diffeomorphisms

$$\begin{aligned} \varphi _+ : \Sigma _{\gamma }^+ \rightarrow \Sigma _{\gamma }^-, \qquad \varphi _- : \Sigma _{\gamma }^- \rightarrow \Sigma _{\gamma }^+ \quad \text{ and } \quad \varphi : \Sigma _{\gamma }^+ \rightarrow \Sigma _{\gamma }^+, \end{aligned}$$

which still satisfy (18) and (19).

Proof

The smoothness of the geodesic flow \(\phi \) and of the functions \(\tau _+\), \(\tau _-\) and \(\tau \) implies that \(\varphi _+\), \(\varphi _-\) and \(\varphi \) are smooth. Since the inverses of these maps on the interior of their domains have analogous definitions, such as for instance

$$\begin{aligned} \varphi _+^{-1} (v) = \phi _{\hat{\tau }_+(v)}(v), \quad \text{ where }\quad \hat{\tau }_+(v) := \sup \{ t<0 \mid \phi _t(v) \in \mathrm {int}(\Sigma _{\gamma }^+) \}, \end{aligned}$$

the maps \(\varphi ^{-1}_+\), \(\varphi _-^{-1}\) and \(\varphi ^{-1}\) also have smooth extensions to the closure of their domains, and hence \(\varphi _+\), \(\varphi _-\) and \(\varphi \) are diffeomorphisms. \(\square \)

For the sake of completeness, we include a proof of Proposition 3.2. A proof of statement (ii) has recently appeared in [26]. The proof given here is based on a technical lemma about return time functions of a certain class of flows, which we now introduce. Consider coordinates \((x,y,z) \in \mathbb {R}/\mathbb {Z}\times \mathbb {R}^2\) and a smooth tangent vector field X on \(\mathbb {R}/\mathbb {Z}\times \mathbb {R}^2\) satisfying

$$\begin{aligned} X(x,0,0)=(1,0,0), \qquad \forall x\in \mathbb {R}/\mathbb {Z}. \end{aligned}$$
(20)

If we denote by \(\psi _t\) the flow of X then

$$\begin{aligned} \psi _t(x,0,0)=(x+t,0,0), \qquad \forall x\in \mathbb {R}/\mathbb {Z}, \end{aligned}$$

and \(P:=\mathbb {R}/\mathbb {Z}\times \{0\}\) is a 1-manifold invariant under the flow. We assume also that for every \(x\in \mathbb {R}/\mathbb {Z}\) and \(t\in \mathbb {R}\) the subspace \(\{0\}\times \mathbb {R}^2\subset \mathbb {R}^3\) is preserved by the differential of the flow, i.e.

$$\begin{aligned} D\psi _t(x,0,0) [ \{0\} \times \mathbb {R}^2 ]=\{0\} \times \mathbb {R}^2, \qquad \forall x\in \mathbb {R}/\mathbb {Z}, \quad \forall t\in \mathbb {R}. \end{aligned}$$
(21)

For each \(\delta \in (0,\infty ]\) consider the annuli

$$\begin{aligned} A_\delta ^+ := \mathbb {R}/\mathbb {Z}\times [0,\delta ), \qquad A_\delta ^- := \mathbb {R}/\mathbb {Z}\times (-\delta , 0], \end{aligned}$$

both equipped with the coordinates \((x,y)\). To each point \((x,y)\in \mathrm {int}( A_\delta ^+)\) one may try to associate the point \(\varphi _+(x,y)\in \mathrm {int}( A_\delta ^-)\) given by the formula

$$\begin{aligned} \varphi _+(x,y) = \psi _{\tau _+(x,y)}(x,y,0) \end{aligned}$$
(22)

where \(\tau _+(x,y)\) is a tentative “first hitting time of \(A_{\delta }^-\)”, that is,

$$\begin{aligned} \tau _+(x,y) = \inf \ \{t>0 \; \mid \; \psi _t(x,y,0) \in \mathrm {int}(A_\infty ^-)\times \{0\}\}. \end{aligned}$$
(23)

Of course, in general \(\tau _+\) and \(\varphi _+\) may not be well-defined, even for small \(\delta \). Our purpose below is to give a sufficient condition on the vector field X to guarantee that, if \(\delta \) is small enough, \(\tau _+\) and \(\varphi _+\) are well-defined smooth functions on \(\mathrm {int}( A_\delta ^+)\) which extend smoothly to \(A_\delta ^+\). In the following definition and in the proof of the lemma below, we identify \(\mathbb {R}^2\) with \(\mathbb {C}\).

Definition 3.4

Fix some \(x\in \mathbb {R}/\mathbb {Z}\) and \(v\in \mathbb {R}^2{\setminus } \{0\}\). By (21) the image of (0, v) by the differential of \(\psi _t\) at (x, 0, 0) has the form

$$\begin{aligned} D\psi _t(x,0,0)[(0,v)]=(0,\rho (t)e^{i\theta (t)}), \end{aligned}$$

for suitable smooth functions \(\rho >0\) and \(\theta \), where \(\rho \) is unique and \(\theta \) is unique up to the addition of an integer multiple of \(2\pi \). We say that the linearised flow along P has a positive twist if for every choice of \(x\in \mathbb {R}/\mathbb {Z}\) and \(v\in \mathbb {R}^2{\setminus } \{0\}\) the function \(\theta \) which is defined above satisfies \(\theta '(t)>0 \) for all \(t\in \mathbb {R}\).

Lemma 3.5

If the linearised flow along P has a positive twist, then there exists \(\delta _0>0\) such that \(\tau _+\) is a well-defined smooth function on \(\mathrm {int}( A_{\delta _0}^+)\) which extends smoothly as a positive function on \(A_{\delta _0}^+\). Moreover, this extension is described by the formula

$$\begin{aligned} \tau _+(x,0) = \inf \ \{ t>0 \; \mid \; D\psi _t(x,0,0)[\partial _y] \in \mathbb {R}^- \partial _y \}, \end{aligned}$$
(24)

where \(\partial _y := (0,1,0)\).

Proof

Write \(w=y+iz\) and \(Y=X_2+iX_3\), where \((X_1,X_2,X_3)\) are the components of the vector field X. Then

$$\begin{aligned} X(x,w)=(X_1(x,w),Y(x,w)). \end{aligned}$$

By (20) we have \(X_1(x,0)=1\) and \(Y(x,0)=0\). Consider \(W(x,w) \in \mathcal {L}_\mathbb {R}(\mathbb {C})\) defined by

$$\begin{aligned} W(x,w)=\int _0^1 D_2Y(x, s w)\, ds, \end{aligned}$$

where \(D_2Y\) denotes derivative with respect to the second variable. Then

$$\begin{aligned} W(x,0)=D_2Y(x,0), \qquad Y(x,w)=W(x,w)w. \end{aligned}$$

We shall now translate the assumption that the linearised flow along P has a positive twist into properties of W(x, 0). Choose \(v_0\in \mathbb {C}{\setminus }0\). Using (21) we find a smooth non-vanishing complex valued function v such that

$$\begin{aligned} D\psi _t(x,0)[(0,v_0)]=(0,v(t)). \end{aligned}$$

From

$$\begin{aligned} \frac{d}{dt}D\psi _t = (DX\circ \psi _t) D\psi _t, \end{aligned}$$

and from (21) we get the linear ODE

$$\begin{aligned} \dot{v}(t) = D_2Y(x+t,0) v(t) = W(x+t,0)v(t). \end{aligned}$$

Writing \(v(t)=r(t)e^{i\theta (t)}\) with smooth functions \(r>0\) and \(\theta \), we know that

$$\begin{aligned} \theta '= & {} \mathrm {Re}\, \left( \frac{\dot{v}}{iv}\right) = \mathrm {Re}\, \left( \frac{W(x+t,0)v}{iv} \ \frac{\overline{iv}}{\overline{iv}} \right) \nonumber \\= & {} \frac{\langle W(x+t,0)v,iv \rangle }{|v|^2} = \langle W(x+t,0)e^{i\theta },ie^{i\theta } \rangle , \end{aligned}$$
(25)

where \(\langle \cdot ,\cdot \rangle \) denotes the Hermitian product on \(\mathbb {C}\). Since x, t and v(t) can take arbitrary values, we conclude from the above formula and the assumptions of the lemma that

$$\begin{aligned} \langle W(x,0)u,iu \rangle >0, \qquad \forall u\in \mathbb {C}{\setminus } \{0\}, \quad \forall x\in \mathbb {R}/\mathbb {Z}. \end{aligned}$$
(26)

Consider polar coordinates \((r,\theta )\in [0,+\infty )\times \mathbb {R}/2\pi \mathbb {Z}\) in the w-plane given by \(w=y+iz=re^{i\theta }\). The map

$$\begin{aligned} (x,r,\theta ) \mapsto X(x,re^{i\theta }) \end{aligned}$$

is smooth. Using the formulas

$$\begin{aligned} \partial _y=\frac{y}{r}\partial _r-\frac{z}{r^2}\partial _\theta , \qquad \partial _z=\frac{z}{r}\partial _r+\frac{y}{r^2}\partial _\theta , \end{aligned}$$

we obtain that the vector field X pulls back by this change of coordinates to a smooth vector field

$$\begin{aligned} Z=(Z_1,Z_2,Z_3), \end{aligned}$$

which is given by

$$\begin{aligned} \left\{ \begin{array}{ll} Z_1(x,r,\theta ) &{}= X_1(x,re^{i\theta }), \\ Z_2(x,r,\theta ) &{}= \cos \theta \ X_2(x,re^{i\theta }) + \sin \theta \ X_3(x,re^{i\theta }), \\ Z_3(x,r,\theta ) &{}= \frac{1}{r} \left( \cos \theta \ X_3(x,re^{i\theta }) - \sin \theta \ X_2(x,re^{i\theta }) \right) . \end{array} \right. \end{aligned}$$
(27)

Indeed, the smoothness of \(Z_1\) and \(Z_2\) follows immediately from the above formulas, while that of \(Z_3\) needs a little more care. Since \(X_2,X_3\) vanish on \(\mathbb {R}/\mathbb {Z}\times \{0\}\), we can find smooth functions \(X_{2,2},X_{2,3},X_{3,2},X_{3,3}\) such that

$$\begin{aligned} X_2(x,y+iz)= & {} y X_{2,2}(x,y+iz)+z X_{2,3}(x,y+iz), \\ X_3(x,y+iz)= & {} y X_{3,2}(x,y+iz)+z X_{3,3}(x,y+iz), \end{aligned}$$

where

$$\begin{aligned} X_{2,2}(x,0)= & {} D_2 X_2 (x,0,0), \qquad X_{2,3}(x,0)=D_3 X_2 (x,0,0), \\ X_{3,2}(x,0)= & {} D_2 X_3 (x,0,0), \qquad X_{3,3}(x,0)=D_3 X_3(x,0,0), \end{aligned}$$

and

$$\begin{aligned} W(x,w)=\begin{bmatrix} X_{2,2}(x,w)&X_{2,3}(x,w) \\ X_{3,2}(x,w)&X_{3,3}(x,w) \end{bmatrix}. \end{aligned}$$

Substituting \(y=r\cos \theta \), \(z=r\sin \theta \) we find

$$\begin{aligned} Z_3(x,r,\theta ) = \langle W(x,re^{i\theta })e^{i\theta },ie^{i\theta }\rangle . \end{aligned}$$
(28)

Thus \(Z_3\) is a smooth function of \((x,r,\theta )\) and

$$\begin{aligned} Z_3(x,0,\theta )>0, \qquad \forall x\in \mathbb {R}/\mathbb {Z}, \quad \forall \theta \in \mathbb {R}/2\pi \mathbb {Z}, \end{aligned}$$
(29)

thanks to (26).

From now on we lift the variable \(\theta \) from \(\mathbb {R}/2\pi \mathbb {Z}\) to the universal covering \(\mathbb {R}\) and think of the vector field Z as a smooth vector field defined on \(\mathbb {R}/\mathbb {Z}\times [0,+\infty )\times \mathbb {R}\), having components \(2\pi \)-periodic in \(\theta \). Clearly this vector field is tangent to \(\{r=0\}\).

Let \(\zeta _t\) denote the flow of Z. After changing coordinates and lifting, we see that the conclusions of the lemma will follow if we check that

$$\begin{aligned} \tau _+(x,r) = \inf \{t>0 \mid \theta \circ \zeta _t(x,r,0) =\pi \} \end{aligned}$$
(30)

defines a smooth function of \((x,r)\in \mathbb {R}/\mathbb {Z}\times [0,\delta )\) when \(\delta \) is small enough. By (29) we see that if \(\delta _0\) is fixed small enough then \(\tau _+(x,r)\) is a well-defined, uniformly bounded and strictly positive function of \((x,r)\in \mathbb {R}/\mathbb {Z}\times [0,\delta _0)\). Here we used that Z is tangent to \(\{r=0\}\). Perhaps after shrinking \(\delta _0\), we may also assume that

$$\begin{aligned} Z_3(\zeta _t(x,r,0))>0, \qquad \forall (x,r)\in \mathbb {R}/\mathbb {Z}\times [0,\delta _0), \quad \forall t\in [0,\tau _+(x,r)]. \end{aligned}$$
(31)

Continuity and smoothness properties of \(\tau _+\) remain to be checked. This is achieved with the aid of the implicit function theorem. In fact, consider the smooth function

$$\begin{aligned} F: \mathbb {R}\times \mathbb {R}/\mathbb {Z}\times [0,+\infty ) \rightarrow \mathbb {R}, \qquad F(\tau ,x,r) := \theta \circ \zeta _{\tau }(x,r,0). \end{aligned}$$

Since

$$\begin{aligned} D_1F(\tau ,x,r)=d\theta [ Z(\zeta _{\tau }(x,r,0)) ] = Z_3 (\zeta _{\tau }(x,r,0)), \end{aligned}$$

it follows from (31) and from the implicit function theorem that the equation

$$\begin{aligned} F(\tau _+,x,r)=\pi \end{aligned}$$

determines \(\tau _+=\tau _+(x,r)\) as a smooth function of \((x,r)\in \mathbb {R}/\mathbb {Z}\times [0,\delta _0)\).

We now check formula (24) for \(\tau _+(x,0)\). From the above equations one sees that \(\theta (t)=\theta \circ \zeta _t(x,0,0)\) satisfies the differential equation

$$\begin{aligned} \theta '(t)=\langle D_2Y(x+t,0)e^{i\theta },ie^{i\theta }\rangle , \end{aligned}$$

with initial condition \(\theta (0)=0\). Thanks to (25), this is exactly the same initial value problem for the argument \(\hat{\theta }(t)\) of the solution \(v(t)=\rho (t)e^{i\hat{\theta }(t)}\) of the linearised flow starting at the base point (x, 0) applied to the vector \(\partial _y\). \(\square \)
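To illustrate the statement of Lemma 3.5, here is a minimal numerical sketch on a toy vector field that is not taken from the paper: \(X(x,y,z) = (1, -z s, y s)\) with \(s = 1+y^2+z^2\). This field satisfies (20) and (21), and its linearisation along P is a rotation with unit angular speed, so the twist is positive. In polar coordinates one finds \(Z_2 = 0\) and \(Z_3 = 1+r^2\), hence the hitting time should be \(\tau _+(x,r)=\pi /(1+r^2)\), which extends smoothly to \(r=0\) with value \(\pi \). The function names and the step size are illustrative assumptions.

```python
import math

def field(y, z):
    # (y, z)-components of the hypothetical field X = (1, -z*s, y*s), s = 1 + y^2 + z^2.
    s = 1.0 + y * y + z * z
    return -z * s, y * s

def rk4_step(y, z, dt):
    # One classical Runge-Kutta step for the (y, z)-part of the flow.
    k1 = field(y, z)
    k2 = field(y + 0.5 * dt * k1[0], z + 0.5 * dt * k1[1])
    k3 = field(y + 0.5 * dt * k2[0], z + 0.5 * dt * k2[1])
    k4 = field(y + dt * k3[0], z + dt * k3[1])
    return (y + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            z + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

def hitting_time(r, dt=1e-3, t_max=10.0):
    # First t > 0 at which the orbit of (y, z) = (r, 0), r > 0,
    # returns to {z = 0} with y < 0, i.e. hits the annulus A^-.
    y, z, t = r, 0.0, 0.0
    while t < t_max:
        y_new, z_new = rk4_step(y, z, dt)
        if z > 0.0 and z_new <= 0.0 and y_new < 0.0:
            # Linear interpolation of the crossing inside the last step.
            return t + dt * z / (z - z_new)
        y, z, t = y_new, z_new, t + dt
    raise RuntimeError("no crossing found before t_max")
```

For instance, `hitting_time(0.5)` can be compared with `math.pi / 1.25`, and `hitting_time(1e-3)` with `math.pi`, the value predicted by formula (24) for this toy field.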

In order to prove Proposition 3.2, it is enough to show that coordinates can be arranged in such a way that the geodesic flow near a simple closed geodesic \(\gamma \) meets the assumptions of Lemma 3.5 when the Gaussian curvature is positive along \(\gamma \). We will assume for simplicity, and without loss of generality, that \(L=1\). We start by recalling basic facts from Riemannian geometry and fixing some notation.

Given \(v\in TS^2\), let \(\mathcal {V}_v\subset T_vTS^2\) be the vertical subspace, which is defined as \(\mathcal {V}_v:=\ker d\pi (v)\). The isomorphism

$$\begin{aligned} i_{\mathcal {V}_v}:T_{\pi (v)}S^2 \rightarrow \mathcal {V}_v \end{aligned}$$

is defined as

$$\begin{aligned} i_{\mathcal {V}_v}(w):=\frac{d}{dt}(v+tw)\Bigr |_{t=0}, \qquad \forall w\in T_{\pi (v)} S^2. \end{aligned}$$

The Levi-Civita connection of g determines a bundle map \(K:TTS^2\rightarrow TS^2\) satisfying \(\nabla _YX=K(dX \circ Y)\), where X, Y are vector fields on \(S^2\) seen as maps \(S^2\rightarrow TS^2\). The horizontal subspace \(\mathcal {H}_v:=\ker K|_{T_vTS^2}\) satisfies \(T_v TS^2 = \mathcal {V}_v \oplus \mathcal {H}_v\). There is an isomorphism

$$\begin{aligned} i_{\mathcal {H}_v}:T_{\pi (v)}S^2 \rightarrow \mathcal {H}_v, \qquad i_{\mathcal {H}_v}(w) := \frac{d}{dt} V(t)\Bigr |_{t=0}, \qquad \forall w\in T_{\pi (v)} S^2, \end{aligned}$$

where V is the parallel vector field along the geodesic \(\beta (t)\) satisfying \(\dot{\beta }(0)=w\) with initial condition \(V(0)=v\), seen as a curve in \(TS^2\). The isomorphism \(i_{\mathcal {H}_v}\) satisfies

$$\begin{aligned} d\pi (v)\bigl [i_{\mathcal {H}_v}(w)\bigr ]=w, \qquad \forall w\in T_{\pi (v)} S^2. \end{aligned}$$
(32)

For each \(v\in T^1S^2\) we have

$$\begin{aligned} T_vT^1S^2 = \mathrm{span}\{i_{\mathcal {V}_v}(v^\perp ), i_{\mathcal {H}_v}(v^\perp ),i_{\mathcal {H}_v}(v)\}. \end{aligned}$$

The Hilbert form \(\lambda _H\) on \(TS^2\) is given by

$$\begin{aligned} \lambda _H(v) [\zeta ] := g_{\pi (v)}\bigl (v, d\pi (v) [\zeta ]\bigr ), \qquad \forall \zeta \in T_vTS^2, \end{aligned}$$
(33)

and restricts to a contact form \(\alpha \) on \(T^1S^2\). The contact structure \(\xi := \ker \alpha \) is trivial since

$$\begin{aligned} \xi _v = \mathrm{span}\{i_{\mathcal {V}_v}(v^\perp ),i_{\mathcal {H}_v}(v^\perp )\}. \end{aligned}$$

The Reeb vector field \(R_\alpha \) of \(\alpha \) coincides with \(i_{\mathcal {H}_v}(v)\), and \(\{i_{\mathcal {V}_v}(v^\perp ),i_{\mathcal {H}_v}(v^\perp )\}\) forms a symplectic basis for \(d\alpha |_{\xi _v}\), because

$$\begin{aligned} d\alpha (v)[ i_{\mathcal {V}_v}(v^\perp ),i_{\mathcal {H}_v}(v^\perp )]=1. \end{aligned}$$

If \((x,y)\) are the standard coordinates on \(\Sigma _{\gamma }^{\pm }\) given by

$$\begin{aligned} v = \cos y\ \dot{\gamma }(x) + \sin y \ \dot{\gamma }(x)^{\perp }, \end{aligned}$$

then the tangent vectors \(\partial _x\) and \(\partial _y\) in \(T_v \Sigma _{\gamma }^{\pm }\) are

$$\begin{aligned}&\partial _x = i_{\mathcal {H}_v}(\dot{\gamma }(x))= \cos y\ i_{\mathcal {H}_v}(v) - \sin y\ i_{\mathcal {H}_v}(v^\perp ), \nonumber \\&\partial _y = i_{\mathcal {V}_v}(v^\perp ). \end{aligned}$$
(34)

Proof of Proposition 3.2

It is enough to prove statement (i) for the function \(\tau _+\). In fact, the case of \(\tau _-\) follows by inverting the orientation of \(\gamma \), and statement (ii) is then a direct consequence of the identity (19).

By (34) the vector field \(R_\alpha =i_{\mathcal {H}_v}(v)\) is transverse to the interior of \(\Sigma _\gamma ^\pm \). The smooth vector field

$$\begin{aligned} i_{\mathcal {H}_v}(\dot{\gamma }^\perp )= \sin y \ i_{\mathcal {H}_v}(v) + \cos y\ i_{\mathcal {H}_v}(v^\perp ) \end{aligned}$$

along \(\Sigma _\gamma ^+\cup \Sigma _\gamma ^-\) is transverse to it near \(\dot{\gamma }\). To obtain the desired coordinates near \(\dot{\gamma }\) we proceed as follows: let \(\bar{g}\) be the Riemannian metric on \(T^1S^2\) defined by

$$\begin{aligned} \bar{g}_v(\zeta _1,\zeta _2) := \alpha (\zeta _1)\alpha (\zeta _2) + d\alpha (\pi _\xi (\zeta _1), J \pi _\xi ( \zeta _2)), \end{aligned}$$

where \(J:\xi \rightarrow \xi \) is the \(d\alpha \)-compatible complex structure determined by

$$\begin{aligned} J ( i_{\mathcal {V}_v}(v^\perp ) ) =i_{\mathcal {H}_v}(v^\perp ), \end{aligned}$$

\(\pi _\xi :TT^1S^2 \rightarrow \xi \) is the projection along \(R_\alpha \), and \(\zeta _1,\zeta _2 \in T_vT^1S^2\) are arbitrary. Note that \(\xi \) is orthogonal to \(\mathbb {R}R_\alpha \) with respect to \(\bar{g}\) and \(\bar{g}(i_{\mathcal {V}_v}(v^\perp ),i_{\mathcal {H}_v}(v^\perp ))=0\).

Denote by \(\mathrm{Exp}\) the exponential map of \(\bar{g}\). Then for all \(\delta >0\) sufficiently small, the map

$$\begin{aligned}&\mathbb {R}/\mathbb {Z}\times (-\delta ,\delta ) \times (-\delta ,\delta ) \rightarrow \mathcal U \\&(x,y,z) \mapsto \mathrm{Exp}_{v=\cos y \dot{\gamma }(x) + \sin y \dot{\gamma }^\perp (x)} \left( z (\sin y \ i_{\mathcal {H}_v}(v) + \cos y \ i_{\mathcal {H}_v}(v^\perp )) \right) \end{aligned}$$

is a diffeomorphism, where \(\mathcal {U}\subset T^1S^2\) is a small tubular neighborhood of \(\dot{\gamma }\). In coordinates \((x,y,z)\), we have

$$\begin{aligned}&\dot{\gamma }\equiv \mathbb {R}/\mathbb {Z}\times \{(0,0)\} \nonumber \\&\Sigma _\gamma ^+ \equiv \{z=0, y\ge 0\} \nonumber \\&\Sigma _\gamma ^- \equiv \{z=0, y\le 0\} \nonumber \\&R_\alpha |_{\dot{\gamma }} \equiv (1,0,0)|_{\mathbb {R}/\mathbb {Z}\times \{(0,0)\}} \nonumber \\&\xi |_{\dot{\gamma }} \equiv \{0\} \times \mathbb {R}^2|_{\mathbb {R}/\mathbb {Z}\times \{(0,0)\}} \nonumber \\&i_{\mathcal {V}_{\dot{\gamma }}}(\dot{\gamma }^\perp ) \equiv \partial _y|_{\mathbb {R}/\mathbb {Z}\times \{(0,0)\}} \nonumber \\&i_{\mathcal {H}_{\dot{\gamma }}}(\dot{\gamma }^\perp ) \equiv \partial _z|_{\mathbb {R}/\mathbb {Z}\times \{(0,0)\}}. \end{aligned}$$
(35)

Denote by \(X=(X_1,X_2,X_3)\) the Reeb vector field \(R_\alpha \) in these coordinates and by \(\psi _t\) its flow. Then \(X(x,0,0)=(1,0,0)\) and since \(\psi _t\) preserves the contact structure, we have

$$\begin{aligned} D\psi _t(x,0,0)\bigl [ \{0\} \times \mathbb {R}^2 \bigr ] = \{0\} \times \mathbb {R}^2. \end{aligned}$$

A linearised solution \(\zeta (t) = a_1(t) \partial _y + a_2(t) \partial _z\) along \(\psi _t(x,0,0)=(x+t,0,0)\) satisfies

$$\begin{aligned} \left( \begin{array}{c} a_1'(t) \\ a_2'(t) \end{array}\right) = \left( \begin{array}{cc} 0 &{} -K(t) \\ 1 &{} 0 \end{array} \right) \left( \begin{array}{c} a_1(t) \\ a_2(t) \end{array} \right) , \end{aligned}$$

where K(t) is the Gaussian curvature at \(\gamma (x+t)\). Writing in complex polar coordinates \(a_1(t) + i a_2(t) = \rho (t)e^{i \theta (t)}\), for smooth functions \(\rho \ge 0\) and \(\theta \), we can easily check that

$$\begin{aligned} \theta '(t) = \cos ^2 \theta (t) + K(t) \sin ^2 \theta (t),\qquad \forall t\in \mathbb {R}. \end{aligned}$$

Therefore, the positivity of the Gaussian curvature along \(\gamma \) implies the twist condition. We have finished checking that X meets all the assumptions of Lemma 3.5. Proposition 3.2 follows readily from an application of that lemma. \(\square \)
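The link between the twist equation and conjugate points can be illustrated numerically. The sketch below, whose function name and step size are illustrative assumptions, integrates \(\theta ' = \cos ^2\theta + K(t)\sin ^2\theta \) with \(\theta (0)=0\) and returns the first time at which \(\theta =\pi \); this is the quantity described by formula (24) when the linearised flow is applied to \(\partial _y\). For constant curvature K the linearised solution starting at \(\partial _y\) is \(a_1(t)=\cos (\sqrt{K}t)\), \(a_2(t)=\sin (\sqrt{K}t)/\sqrt{K}\), so the expected value is \(\pi /\sqrt{K}\), the time to the first conjugate point.

```python
import math

def first_half_turn(K, dt=1e-3):
    # Integrate theta' = cos^2(theta) + K(t) * sin^2(theta), theta(0) = 0,
    # with classical RK4 and return the first time at which theta = pi.
    # K is a function of t and is assumed to be positive, so theta' > 0.
    def rhs(t, th):
        return math.cos(th) ** 2 + K(t) * math.sin(th) ** 2

    t, th = 0.0, 0.0
    while True:
        k1 = rhs(t, th)
        k2 = rhs(t + 0.5 * dt, th + 0.5 * dt * k1)
        k3 = rhs(t + 0.5 * dt, th + 0.5 * dt * k2)
        k4 = rhs(t + dt, th + dt * k3)
        th_new = th + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        if th_new >= math.pi:
            # Linear interpolation of the crossing inside the last step.
            return t + dt * (math.pi - th) / (th_new - th)
        t, th = t + dt, th_new
```

For example, `first_half_turn(lambda t: 1.0)` can be compared with `math.pi` (round sphere of curvature 1), and `first_half_turn(lambda t: 4.0)` with `math.pi / 2`. A non-constant positive function K can be passed in the same way.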

3.2 The contact volume, the return time and the Riemannian area

As we have seen in the previous section, the Hilbert form \(\lambda _H\) defined in (33) induces by restriction a contact form \(\alpha \) on \(T^1 S^2\). A further restriction produces the one-form \(\lambda \) on the Birkhoff annulus \(\Sigma _{\gamma }^+\). By using the standard smooth coordinates \((x,y) \in \mathbb {R}/L\mathbb {Z}\times [0,\pi ]\) on \(\Sigma _{\gamma }^+\), we express a vector \(v\in \Sigma _{\gamma }^+\) as

$$\begin{aligned} v= \cos y\ \dot{\gamma }(x) + \sin y\ \dot{\gamma }(x)^{\perp }, \end{aligned}$$
(36)

and we find, using (33) and (34), together with (32),

$$\begin{aligned} \lambda (v)[\partial _x]= & {} g_{\pi (v)} ( v, d\pi (v)[\cos y\ i_{\mathcal {H}_v}(v) - \sin y\ i_{\mathcal {H}_v}(v^\perp )]) \\= & {} g_{\pi (v)}(v,\cos y \ v-\sin y \ v^\perp )= \cos y, \\ \lambda (v)[\partial _y]= & {} g_{\pi (v)} ( v, d\pi (v)[i_{\mathcal {V}_v}(v^\perp )]) = g_{\pi (v)}(v,0)=0. \end{aligned}$$

Therefore, the expression of \(\lambda \) in the coordinates \((x,y)\) is

$$\begin{aligned} \lambda =\cos y \, dx, \end{aligned}$$

and its differential reads

$$\begin{aligned} d\lambda = \sin y \, dx\wedge dy. \end{aligned}$$

Thus, the forms \(\lambda \) and \(\omega =d\lambda \) are the ones considered in Section 2 on the universal cover S of \(\mathbb {R}/L\mathbb {Z}\times [0,\pi ]\).

Since the geodesic flow \(\phi _t\) preserves \(\alpha \) for all t, we have for any v in \(\mathrm {int}(\Sigma ^+_\gamma )\) and \(\zeta \) in \(T_v\Sigma ^+_\gamma \)

$$\begin{aligned} (\varphi ^* \lambda )(v)[\zeta ]= & {} \lambda (\varphi (v))[d\varphi (v)[\zeta ]] \\= & {} \lambda (\phi _{\tau (v)}(v) )[ d\phi _{\tau (v)}(v)[\zeta ] + d\tau (v)[\zeta ] R_{\alpha }(\phi _{\tau (v)}(v)) ] \\= & {} \lambda (v)[\zeta ] + d\tau (v)[\zeta ] \end{aligned}$$

on \(\mathrm {int}(\Sigma _{\gamma }^+)\), and hence on its closure \(\Sigma _{\gamma }^+\) since all the objects here are smooth. Here, \(R_{\alpha }\) is the Reeb vector field on the contact manifold \((T^1 S^2,\alpha )\), which coincides with the generator of the geodesic flow. Therefore,

$$\begin{aligned} d\tau = \varphi ^* \lambda - \lambda \qquad \text{ on } \; \Sigma _{\gamma }^+. \end{aligned}$$

Now let

$$\begin{aligned} \Psi :\mathrm {int}( \Sigma _\gamma ^+) \times \mathbb {R}\rightarrow T^1S^2 {\setminus } (\dot{\gamma }(\mathbb {R}) \cup (-\dot{\gamma }(\mathbb {R}))) \end{aligned}$$

be defined as \(\Psi (v,t):=\phi _t(v)\). Then

$$\begin{aligned} \Psi ^* \alpha (v,t)[(\zeta ,s)]= & {} \alpha (\phi _t(v)) [ d\phi _t(v)[\zeta ] + s R_{\alpha }(\phi _t(v))] \\= & {} \alpha (v)[\zeta ] + s = \lambda (v)[\zeta ]+s, \end{aligned}$$

that is,

$$\begin{aligned} \Psi ^* \alpha = \lambda + dt. \end{aligned}$$

Again, we used the preservation of \(\alpha \) by \(\phi _t\). Since \(\lambda \wedge d\lambda =0\), being a three-form on a two-dimensional manifold, we deduce that

$$\begin{aligned} \Psi ^*(\alpha \wedge d\alpha ) = dt \wedge d\lambda . \end{aligned}$$

Denoting by K the subset

$$\begin{aligned} K:= \{(v,t)\in \mathrm {int}( \Sigma _\gamma ^+ ) \times \mathbb {R}\mid v\in \mathrm {int}(\Sigma _\gamma ^+), \ t\in [0, \tau (v)]\}, \end{aligned}$$

we can relate the contact volume \(\mathrm{Vol}(T^1S^2,\alpha )\) with the function \(\tau \) as follows

$$\begin{aligned} \mathrm{Vol}(T^1S^2,\alpha )= & {} \int \!\!\int \!\!\int _{T^1S^2{\setminus } (\dot{\gamma }(\mathbb {R}) \cup (-\dot{\gamma }(\mathbb {R})))} \alpha \wedge d \alpha =\int \!\!\int \!\!\int _K \Psi ^*(\alpha \wedge d\alpha ) \\= & {} \int \!\!\int \!\!\int _K dt \wedge d\lambda = \iint _{\Sigma _\gamma ^+} \left( \int _0 ^{\tau (v)} dt \right) \, d\lambda (v) = \iint _{\Sigma _\gamma ^+} \tau \, d\lambda . \end{aligned}$$

Summarizing, we have proved the following:

Proposition 3.6

The restriction \(\lambda \) of the contact form \(\alpha \) of \(T^1 S^2\) to \(\Sigma _{\gamma }^+\) has the form

$$\begin{aligned} \lambda = \cos y\, dx \end{aligned}$$

in the standard coordinates \((x,y)\in \mathbb {R}/L\mathbb {Z}\times [0,\pi ]\). The first return map \(\varphi : \Sigma _{\gamma }^+ \rightarrow \Sigma _{\gamma }^+\) preserves \(d\lambda \). Moreover, the first return time \(\tau :\Sigma _{\gamma }^+ \rightarrow \mathbb {R}\) satisfies

$$\begin{aligned} d\tau = \varphi ^* \lambda - \lambda \qquad \text{ on } \; \Sigma _{\gamma }^+. \end{aligned}$$

Finally

$$\begin{aligned} \mathrm{Vol}(T^1S^2,\alpha ) = \iint _{\Sigma _\gamma ^+} \tau \ d\lambda . \end{aligned}$$
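As a sanity check of the last formula, consider the round sphere of curvature 1 with \(\gamma \) a great circle, so that \(L=2\pi \). Every geodesic is closed of length \(2\pi \), hence the first return map is the identity and \(\tau \equiv 2\pi \). The formula then gives \(2\pi \cdot 2L = 8\pi ^2\), which is \(2\pi \) times the area \(4\pi \) of the round sphere. The short sketch below evaluates the integral with a midpoint rule; the function name and grid sizes are illustrative assumptions, and any other return time function `tau(x, y)` could be passed instead.

```python
import math

def contact_volume(tau, L, nx=200, ny=200):
    # Midpoint rule for the integral of tau * sin(y) over [0, L] x [0, pi],
    # i.e. the integral of tau with respect to d(lambda) = sin y dx ^ dy.
    hx, hy = L / nx, math.pi / ny
    total = 0.0
    for i in range(nx):
        x = (i + 0.5) * hx
        for j in range(ny):
            y = (j + 0.5) * hy
            total += tau(x, y) * math.sin(y) * hx * hy
    return total

# Round sphere: gamma is a great circle (L = 2*pi) and tau is constant, equal to 2*pi.
vol_round = contact_volume(lambda x, y: 2 * math.pi, 2 * math.pi)
```

The value `vol_round` should be close to `8 * math.pi ** 2`, up to the quadrature error of the midpoint rule.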

For completeness, we state and prove below a well-known fact.

Proposition 3.7

The contact volume of \((T^1 S^2,\alpha )\) and the Riemannian area of \((S^2,g)\) are related by the identity

$$\begin{aligned} \mathrm{Vol}(T^1S^2,\alpha ) = 2\pi \, \mathrm{Area}(S^2,g). \end{aligned}$$

Proof

Take isothermal coordinates \((x,y)\in U \subset \mathbb {R}^2\) on an embedded closed disk \(U' \subset S^2\). In these coordinates, the metric g takes the form

$$\begin{aligned} ds^2 = a(x,y)^2(dx^2 + dy^2), \end{aligned}$$

for a smooth positive function a. Any unit tangent vector \(v \in T^1U' \subset T^1S^2\) can be written as

$$\begin{aligned} v = \frac{\cos \theta }{a} \partial _x + \frac{\sin \theta }{a} \partial _y, \qquad \text{ with } \qquad \theta \in \mathbb {R}/ 2\pi \mathbb {Z}, \end{aligned}$$

where \(a=|\partial _x|_g=|\partial _y|_g\). Thus \((x,y,\theta )\in U \times \mathbb {R}/2\pi \mathbb {Z}\) can be taken as coordinates on \(T^1 U'\), and the bundle projection becomes \(\pi (x,y,\theta )=(x,y)\). With respect to these coordinates, the contact form

$$\begin{aligned} \alpha (v)[\zeta ] = g_{\pi (v)} \bigl (v, d\pi (v)[\zeta ] \bigr ) \end{aligned}$$

has the expression

$$\begin{aligned} \alpha = a (\cos \theta \,dx + \sin \theta \,dy ). \end{aligned}$$

Differentiation yields

$$\begin{aligned} d\alpha = da\wedge (\cos \theta \,dx + \sin \theta \, dy) + a(-\sin \theta \, d\theta \wedge dx+\cos \theta \, d\theta \wedge dy). \end{aligned}$$

Hence

$$\begin{aligned} \alpha \wedge d\alpha= & {} a\, da\wedge (\cos \theta \sin \theta \, dx\wedge dy+\sin \theta \cos \theta \, dy\wedge dx) \\&+ a^2(\cos ^2\theta \, dx\wedge d\theta \wedge dy-\sin ^2 \theta \, dy\wedge d\theta \wedge dx) \\= & {} a^2 \, dx\wedge d\theta \wedge dy = - a^2 \, dx \wedge dy \wedge d\theta . \end{aligned}$$

Therefore, the orientation of \(T^1 U'\) which is induced by \(\alpha \wedge d\alpha \) is opposite to the standard orientation of \(U\times \mathbb {R}/2\pi \mathbb {Z}\), and we get

$$\begin{aligned} \mathrm{Vol}(T^1U',\alpha )= & {} \int \!\!\int \!\!\int _{T^1U'} \alpha \wedge d\alpha = \int \!\!\int \!\!\int _{U \times \mathbb {R}/2\pi \mathbb {Z}} a^2 dx\wedge dy \wedge d\theta \\= & {} \iint _{U} a^2(x,y) \left( \int _0^{2\pi } d\theta \right) dxdy = 2\pi \iint _U a^2(x,y) \ dxdy \\= & {} 2\pi \iint _U \sqrt{\det (g)} \ dxdy = 2\pi \, \mathrm{Area}(U',g). \end{aligned}$$

Taking two embedded disks \(U',U''\subset S^2\) with disjoint interiors and coinciding boundaries, we get

$$\begin{aligned} \mathrm{Vol}(T^1S^2,\alpha )= & {} \mathrm{Vol}(T^1U',\alpha )+\mathrm{Vol}(T^1U'',\alpha ) \\= & {} 2\pi (\mathrm{Area}(U',g) + \mathrm{Area}(U'',g)) \\= & {} 2\pi \,\mathrm{Area}(S^2,g). \end{aligned}$$

\(\square \)
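The sign found in the computation of \(\alpha \wedge d\alpha \) above can be checked numerically. The following Python sketch evaluates the coefficient of \(dx\wedge dy\wedge d\theta \) at one point by finite differences and compares it with \(-a^2\). The conformal factor `a`, the sample point and the step `h` are hypothetical choices made only for illustration; they play no role in the proof.

```python
import math

def a(x, y):
    # Hypothetical smooth positive conformal factor (illustration only).
    return 1.5 + 0.3 * math.sin(x) * math.cos(2.0 * y)

def alpha(p):
    # Components of alpha = a (cos(theta) dx + sin(theta) dy)
    # in the coframe (dx, dy, dtheta).
    x, y, theta = p
    return (a(x, y) * math.cos(theta), a(x, y) * math.sin(theta), 0.0)

def d_alpha(p, h=1e-5):
    # Components (d alpha)_{ij} = d_i alpha_j - d_j alpha_i for i < j,
    # approximated by central differences with step h.
    def partial(i, j):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        return (alpha(plus)[j] - alpha(minus)[j]) / (2.0 * h)
    return {(i, j): partial(i, j) - partial(j, i)
            for (i, j) in [(0, 1), (0, 2), (1, 2)]}

def alpha_wedge_d_alpha(p):
    # Coefficient of dx ^ dy ^ dtheta in alpha ^ d alpha.
    al = alpha(p)
    da = d_alpha(p)
    return al[0] * da[(1, 2)] - al[1] * da[(0, 2)] + al[2] * da[(0, 1)]

point = (0.7, -0.4, 1.1)                 # arbitrary sample point (x, y, theta)
coefficient = alpha_wedge_d_alpha(point)
expected = -a(point[0], point[1]) ** 2   # value predicted by the proof
print(coefficient, expected)
```

The two printed numbers should agree up to the finite-difference error, which is consistent with the statement that \(\alpha \wedge d\alpha \) induces the orientation opposite to the standard one of \(U\times \mathbb {R}/2\pi \mathbb {Z}\).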

3.3 The flux and the Calabi invariant of the Birkhoff return map

By using the standard smooth coordinates \((x,y)\) given by (36), we can identify the Birkhoff annulus \(\Sigma _{\gamma }^+\) with \(\mathbb {R}/L \mathbb {Z}\times [0,\pi ]\). Its universal cover is the natural projection

$$\begin{aligned} p : S \rightarrow \Sigma _{\gamma }^+, \end{aligned}$$

where S is the strip \(\mathbb {R}\times [0,\pi ]\). The first return map \(\varphi : \Sigma _{\gamma }^+ \rightarrow \Sigma _{\gamma }^+\) preserves the two-form \(\omega =d\lambda \) and maps each boundary component into itself. Therefore, \(\varphi \) can be lifted to a diffeomorphism in the group \(\mathcal {D}_L(S,\omega )\) which is considered in part 2. The aim of this section is to prove the following result, which relates the objects of this part with those of part 2.

Theorem 3.8

Assume that the metric g on \(S^2\) is \(\delta \)-pinched with \(\delta > 1/4\). Let \(\gamma \) be a simple closed geodesic of length L on \((S^2,g)\). Then the first return map \(\varphi : \Sigma _{\gamma }^+ \rightarrow \Sigma _{\gamma }^+\) has a lift \(\Phi : S \rightarrow S\) which belongs to \(\mathcal {D}_L(S,\omega )\) and has the following properties:

  (i) \(\Phi \) has zero flux.

  (ii) The first return time \(\tau :\Sigma _{\gamma }^+ \rightarrow \mathbb {R}\) is related to the action \(\sigma : S \rightarrow \mathbb {R}\) of \(\Phi \) by the identity

    $$\begin{aligned} \tau \circ p = L+ \sigma \qquad \text{ on } \; S. \end{aligned}$$

  (iii) The area of \((S^2,g)\) is related to the Calabi invariant of \(\Phi \) by the identity

    $$\begin{aligned} \pi \, \mathrm {Area}(S^2,g) = L^2 + L\ \mathrm {CAL}(\Phi ). \end{aligned}$$

The proof of this theorem requires an auxiliary lemma, which will play an important role also in the next section.

Lemma 3.9

Assume that \((S^2,g)\) is \(\delta \)-pinched for some \(\delta >1/4\). Fix some v in \(\Sigma _{\gamma }^{\pm }\) and denote by \(\alpha \) the geodesic satisfying \(\dot{\alpha }(0)=v\). Then the geodesic arc \(\alpha |_{[0,\tau _{\pm }(v)]}\) is injective.

Proof

We consider the case of \(\Sigma _{\gamma }^+\), the case of \(\Sigma _{\gamma }^-\) being completely analogous. Up to the multiplication of g by a positive number, we may assume that \(1\le K < 4\).

Let \(x^*\in \mathbb {R}\) be such that \(\alpha (0) = \gamma (x^*)\) and let \(y^*\in [0,\pi ]\) be the angle between \(\dot{\gamma }(x^*)\) and \(v=\dot{\alpha }(0)\). Consider the family of unit speed geodesics \(\alpha _y\) with \(\alpha _y(0)=\alpha (0)=\gamma (x^*)\) such that the angle from \(\dot{\gamma }(x^*)\) to \(v_y := \dot{\alpha }_y(0)\) is y, for \(y \in [0,\pi ]\). In particular, \(\alpha _{y^*}=\alpha \) and \(v_{y^*}=v\). By Proposition 3.2 (i),

$$\begin{aligned} \{\alpha _{y}|_{[0,\tau _+(v_y)]}\}_{y\in [0,\pi ]} \end{aligned}$$

is a smooth family of geodesic arcs, parametrised on a family of intervals whose length varies smoothly.

We claim that \(\tau _+(v_0) <L\) and \(\tau _+(v_{\pi })<L\). In order to prove this, first notice that the length L of the closed geodesic \(\gamma \) satisfies

$$\begin{aligned} L \ge \frac{2\pi }{\sqrt{\max K}} > \frac{2\pi }{\sqrt{4}} = \pi , \end{aligned}$$
(37)

thanks to the lower bound (16) on the injectivity radius and to the inequality \(K<4\). Moreover, by Proposition 3.2 (i) the number \(\tau _+(v_0)\) is the first positive zero of the solution u of the Jacobi equation

$$\begin{aligned} u''(t) + K(\gamma (x^*+t)) u(t) = 0, \qquad u(0)=0, \qquad u'(0)=1. \end{aligned}$$

Writing the complex function \(u'+iu\) in polar coordinates as \(u'+iu=r e^{i\theta }\), for smooth real functions \(r>0\) and \(\theta \) satisfying \(r(0)=1\), \(\theta (0)=0\), a standard computation gives

$$\begin{aligned} \theta '(t) = \cos ^2 \theta (t) + K(\gamma (x^*+t)) \sin ^2 \theta (t). \end{aligned}$$

Since \(K\ge 1\), we have \(\theta '\ge 1\) and hence \(\theta (L)\ge L > \pi \). This implies that \(\tau _+(v_0)<L\). The case of \(\tau _+(v_{\pi })\) follows by applying the previous case to the geodesic \(t\mapsto \gamma (-t)\).

Let \(Y_0\) be the subset of \([0,\pi ]\) consisting of those y for which \(\alpha _{y}|_{[0,\tau _+(v_y)]}\) is injective. The set \(Y_0\) is open in \([0,\pi ]\), and by the above claim 0 and \(\pi \) belong to \(Y_0\). Let \(Y_1\) be the subset of \((0,\pi )\) consisting of those y for which \(\alpha _{y}|_{[0,\tau _+(v_y)]}\) has an interior self-intersection: there exist \(0<s<t<\tau _+(v_y)\) such that \(\alpha _{y}(s)=\alpha _{y}(t)\). Such an interior self-intersection must be transverse, so the fact that \(S^2\) is two-dimensional implies that also \(Y_1\) is open in \([0,\pi ]\). It is enough to show that \(Y_0 \cup Y_1 = [0,\pi ]\): indeed, if this is so, the fact that \([0,\pi ]\) is connected implies that only one of the two disjoint open sets \(Y_0\) and \(Y_1\) can be non-empty, and we have already checked that \(Y_0\) contains 0 and \(\pi \). The conclusion is that \([0,\pi ]=Y_0\), and in particular \(\alpha =\alpha _{y^*}\) is injective.

If y belongs to the complement of \(Y_0 \cup Y_1\) in \([0,\pi ]\), then \(y\in (0,\pi )\) and \(\alpha _{y}|_{[0,\tau _+(v_y)]}\) has a self-intersection only at its endpoints: \(\alpha _{y}|_{[0,\tau _+(v_y))}\) is injective and \(\alpha _{y}(\tau _+(v_y)) = \alpha _{y}(0)\). Denote by \(l>0\) the length of the geodesic loop \(\alpha _{y}|_{[0,\tau _+(v_y)]}\). Together with the closed curve \(\gamma \), this geodesic loop forms a two-gon with perimeter equal to \(L+l\). By Theorem A.12 and the inequality \(K\ge 1\), its perimeter \(L+l\) satisfies

$$\begin{aligned} L+l \le \frac{2\pi }{\sqrt{\min K}}\le 2\pi . \end{aligned}$$

By using the bound (37) and the analogous bound \(l>\pi \) for the geodesic loop \(\alpha _{y}|_{[0,\tau _+(v_y)]}\), we obtain

$$\begin{aligned} L+l > 2\pi . \end{aligned}$$

The above two estimates contradict each other, and this shows that the complement of \(Y_0 \cup Y_1\) is empty, concluding the proof. \(\square \)
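The comparison argument for the angle \(\theta \) used in the above proof can be illustrated numerically. The following Python sketch integrates \(\theta ' = \cos ^2\theta + K \sin ^2\theta \) with \(\theta (0)=0\) and locates the first time at which \(\theta \) reaches \(\pi \), that is, the first positive zero of u. The curvature profile `curvature` is a hypothetical function with values in \([1.1,3.9]\subset [1,4)\), and the Runge-Kutta scheme and step count are assumptions of this sketch, not part of the proof.

```python
import math

def curvature(t):
    # Hypothetical curvature along the geodesic, with values in [1.1, 3.9].
    return 2.5 + 1.4 * math.sin(3.0 * t)

def theta_rhs(t, theta):
    # Right-hand side of theta' = cos^2(theta) + K(t) sin^2(theta).
    return math.cos(theta) ** 2 + curvature(t) * math.sin(theta) ** 2

def integrate_theta(t_end, steps=4000):
    # Classical fourth-order Runge-Kutta scheme starting from theta(0) = 0.
    h = t_end / steps
    t, theta = 0.0, 0.0
    first_zero = None  # first time at which theta reaches pi (u vanishes again)
    for _ in range(steps):
        k1 = theta_rhs(t, theta)
        k2 = theta_rhs(t + h / 2, theta + h * k1 / 2)
        k3 = theta_rhs(t + h / 2, theta + h * k2 / 2)
        k4 = theta_rhs(t + h, theta + h * k3)
        theta_next = theta + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        if first_zero is None and theta_next >= math.pi:
            # Linear interpolation inside the step that crosses pi.
            first_zero = t + h * (math.pi - theta) / (theta_next - theta)
        t, theta = t + h, theta_next
    return theta, first_zero

theta_at_pi, first_zero = integrate_theta(math.pi)
print(theta_at_pi, first_zero)
```

Since \(\theta '\ge 1\), the computed value of \(\theta (\pi )\) should exceed \(\pi \), and the first zero of u should occur before time \(\pi \), hence before time L by (37).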

Proof of Theorem 3.8

Given \(v\in T^1 S^2\), we denote by \(\alpha _v\) the geodesic parametrised by arc length such that \(\dot{\alpha }_v(0)=v\). Let \(v\in \Sigma _\gamma ^+\) with \(\pi (v)=\gamma (x)\). Then we know from Lemma 3.9 that the geodesic arc \(\alpha _v|_{[0,\tau _+(v)]}\) is injective. In particular, \(\alpha _v(\tau _+(v))\) is distinct from \(\alpha _v(0)=\gamma (x)\), so there exists a unique number

$$\begin{aligned} \rho _+(v)\in (0,L) \end{aligned}$$

such that

$$\begin{aligned} \alpha _v(\tau _+(v)) = \gamma (x+\rho _+(v)). \end{aligned}$$

By the continuity of the geodesic flow and of the function \(\tau _+\), the function

$$\begin{aligned} \rho _+ : \Sigma _{\gamma }^+ \rightarrow (0,L) \end{aligned}$$

is continuous. The restriction of \(\rho _+\) to the boundary of \(\Sigma _{\gamma }^+\) satisfies

$$\begin{aligned} \rho _+(\dot{\gamma }(x)) = \tau _+(\dot{\gamma }(x)) \quad \text{ and } \quad \rho _+(-\dot{\gamma }(x)) = L -\tau _+(-\dot{\gamma }(x)), \qquad \forall x\in \mathbb {R}. \end{aligned}$$
(38)

Similarly, there exists a unique continuous function

$$\begin{aligned} \rho _-: \Sigma _{\gamma }^- \rightarrow (0,L) \end{aligned}$$

such that, if \(v\in \Sigma _{\gamma }^-\) is based at \(\gamma (x)\), we have

$$\begin{aligned} \alpha _v(\tau _-(v)) = \gamma (x+\rho _-(v)). \end{aligned}$$

As before,

$$\begin{aligned} \rho _-(\dot{\gamma }(x)) = \tau _-(\dot{\gamma }(x)) \quad \text{ and } \quad \rho _-(-\dot{\gamma }(x)) = L -\tau _-(-\dot{\gamma }(x)), \qquad \forall x\in \mathbb {R}. \end{aligned}$$
(39)

Define the function

$$\begin{aligned} \rho : \Sigma _{\gamma }^+ \rightarrow (0,2L) \end{aligned}$$

by

$$\begin{aligned} \rho := \rho _+ + \rho _- \circ \varphi _+. \end{aligned}$$

By construction, we have for every \(v\in \Sigma _{\gamma }^+\) with \(\pi (v)=\gamma (x)\),

$$\begin{aligned} \pi (\varphi (v)) = \gamma (x+\rho (v)), \end{aligned}$$
(40)

and, by (38) and (39), together with (19),

$$\begin{aligned} \rho (\dot{\gamma }(x)) = \tau (\dot{\gamma }(x)) \quad \text{ and } \quad \rho (-\dot{\gamma }(x)) = 2L -\tau (-\dot{\gamma }(x)), \qquad \forall x\in \mathbb {R}. \end{aligned}$$
(41)

Using the standard coordinates \((x,y)\in \mathbb {R}/L\mathbb {Z}\times [0,\pi ]\) on \(\Sigma _\gamma ^+\), we can see \(\rho \) and \(\tau \) as functions on \(\mathbb {R}/L\mathbb {Z}\times [0,\pi ]\) or, equivalently, as functions on \(\mathbb {R}\times [0,\pi ]\) which are L-periodic in the first variable. Thanks to (40) we can fix a lift \(\Phi =(X,Y)\in \mathcal {D}_L(S,\omega )\) of \(\varphi \) by requiring its first component to be given by

$$\begin{aligned} X(x,y) = x + \rho (x,y) - L. \end{aligned}$$
(42)

By (41) we have

$$\begin{aligned} X(x,0) - x= \tau (x,0)-L, \qquad X(x,\pi ) - x = L - \tau (x,\pi ), \qquad \forall x\in \mathbb {R}. \end{aligned}$$
(43)

By definition, the action \(\sigma : S \rightarrow \mathbb {R}\) of \(\Phi \) is uniquely determined by the conditions

$$\begin{aligned} d\sigma= & {} \Phi ^* \lambda - \lambda , \\ \sigma (x,0) + \mathrm {FLUX}(\Phi )= & {} \int _{\gamma _x} \lambda = X(x,0)-x , \qquad \forall x\in \mathbb {R}, \end{aligned}$$

where \(\gamma _x\) is a path in \(\partial S\) connecting (x, 0) to \(\Phi (x,0)=(X(x,0),0)\). By the first identity in (43) we have

$$\begin{aligned} \sigma (x,0) + \mathrm {FLUX}(\Phi ) = \tau (x,0) - L, \qquad \forall x\in \mathbb {R}. \end{aligned}$$

By Proposition 3.6, also the (L, 0)-periodic function \(\tau : S \rightarrow \mathbb {R}\) satisfies \(d\tau = \Phi ^* \lambda - \lambda \), so the above identity implies that

$$\begin{aligned} \sigma (x,y) + \mathrm {FLUX}(\Phi ) = \tau (x,y) - L, \qquad \forall (x,y)\in S. \end{aligned}$$
(44)

By Proposition 2.7 and the second identity in (43) we have

$$\begin{aligned} \sigma (x,\pi ) - \mathrm {FLUX}(\Phi ) = \int _{\delta _x} \lambda = - X(x,\pi ) + x = \tau (x,\pi ) - L, \qquad \forall x\in \mathbb {R}, \end{aligned}$$

where \(\delta _x\) is a path in \(\partial S\) connecting \((x,\pi )\) to \(\Phi (x,\pi )=(X(x,\pi ),\pi )\). Together with (44) this implies that \(\mathrm {FLUX}(\Phi )=0\), thus proving statement (i). Statement (ii) now follows from (44).

By Propositions 3.7 and 3.6, we have

$$\begin{aligned} \pi \, \mathrm {Area} (S^2,g)= & {} \frac{1}{2} \, \mathrm {Vol} (T^1S^2,\alpha ) = \frac{1}{2} \iint _{\mathbb {R}/L\mathbb {Z}\times [0,\pi ]} \tau \, d\lambda = \frac{1}{2} \iint _{[0,L] \times [0,\pi ]} (L+\sigma )\, d\lambda \\= & {} L^2 + \frac{1}{2} \iint _{[0,L] \times [0,\pi ]} \sigma \, d\lambda = L^2 + L \ \mathrm {CAL}(\Phi ), \end{aligned}$$

and (iii) is proved. \(\square \)
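The identities of Propositions 3.6 and 3.7 and of Theorem 3.8 (iii) can be tested on the round unit sphere, where every geodesic is a great circle of length \(2\pi \), so that \(\tau \equiv L = 2\pi \), \(\sigma \equiv 0\) and \(\mathrm {Area}=4\pi \). The following Python sketch approximates \(\iint \tau \, d\lambda \) with \(d\lambda = \sin y \, dx\wedge dy\) by a midpoint rule. The function name `integrate_tau_dlambda` and the grid sizes are hypothetical choices of this sketch.

```python
import math

def integrate_tau_dlambda(tau, length, nx=200, ny=200):
    # Midpoint rule for the integral of tau against d(lambda) = sin(y) dx dy
    # over [0, length] x [0, pi], where lambda = cos(y) dx.
    hx, hy = length / nx, math.pi / ny
    total = 0.0
    for i in range(nx):
        x = (i + 0.5) * hx
        for j in range(ny):
            y = (j + 0.5) * hy
            total += tau(x, y) * math.sin(y) * hx * hy
    return total

L = 2.0 * math.pi       # length of a great circle on the unit round sphere
area = 4.0 * math.pi    # area of the unit round sphere

# Round case: the first return time is constant and equal to L.
volume = integrate_tau_dlambda(lambda x, y: L, L)

print(volume, 2.0 * math.pi * area)   # Propositions 3.6 and 3.7
print(0.5 * volume, L ** 2)           # Theorem 3.8 (iii) with CAL = 0
```

Up to the quadrature error, both printed pairs should coincide, in agreement with \(\mathrm {Vol}(T^1S^2,\alpha ) = 2\pi \, \mathrm {Area}(S^2,g)\) and with \(\pi \, \mathrm {Area}(S^2,g) = L^2\) when \(\mathrm {CAL}(\Phi )=0\).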

3.4 Proof of the monotonicity property

As we have seen, the first return map \(\varphi \) can be lifted to a diffeomorphism \(\Phi \) in the class \(\mathcal {D}_L(S,\omega )\). The aim of this section is to prove that, if the curvature is sufficiently pinched, then this lift is a monotone map, in the sense of Definition 2.8 (notice that the monotonicity does not depend on the choice of the lift).

Proposition 3.10

If g is \(\delta \)-pinched for some \(\delta > (4+\sqrt{7})/8\), then any lift \(\Phi : S \rightarrow S\) of the first return map \(\varphi : \Sigma _{\gamma }^+ \rightarrow \Sigma _{\gamma }^+\) is monotone.

Proof

We may assume that the values of the curvature lie in the interval \([\delta ,1]\), where \(\delta >(4+\sqrt{7})/8\).

Fix some \(x^*\in \mathbb {R}\). In order to simplify the notation in the next computations, we set for every \(y\in [0,\pi ]\)

$$\begin{aligned} l_y:=\tau (x^*,y), \qquad t_y := X(x^*,y), \qquad \tilde{y}(y):= Y(x^*,y), \end{aligned}$$

where \(\tau \) is seen as a (L, 0)-periodic function on S and X and Y are the components of the fixed lift \(\Phi =(X,Y)\) of \(\varphi \). Our aim is to show that the derivative of the function \(\tilde{y}\) is positive on \([0,\pi ]\).

Consider the 1-parameter geodesic variation

$$\begin{aligned} \alpha _y(t) := \exp _{\gamma (x^*)}[t(\cos y\ \dot{\gamma }(x^*)+\sin y\ \dot{\gamma }(x^*)^\bot )], \end{aligned}$$

where \(y \in [0,\pi ]\). For each \(y\in (0,\pi )\), \(l_y\) is the second time \(\alpha _y(t)\) hits \(\gamma (\mathbb {R})\) or, equivalently, the first time \(\dot{\alpha }_y(t)\) hits \(\Sigma ^+_\gamma \). Moreover, \(\alpha _0(t)=\gamma (x^*+t)\), and \(l_0\) is the time to the second conjugate point to \(\alpha _0(0)\) along \(\alpha _0\); analogously, \(\alpha _\pi (t)=\gamma (x^*-t)\), and \(l_\pi \) is the time to the second conjugate point to \(\alpha _{\pi }(0)\) along \(\alpha _{\pi }\). By construction

$$\begin{aligned} \alpha _y(l_y) = \gamma (t_y), \end{aligned}$$

and

$$\begin{aligned} \dot{\alpha }_y(l_y)= & {} \cos \tilde{y} \ \dot{\gamma }(t_y) + \sin \tilde{y} \ \dot{\gamma }(t_y)^\perp , \nonumber \\ \dot{\alpha }_y(l_y)^\perp= & {} - \sin \tilde{y} \ \dot{\gamma }(t_y) + \cos \tilde{y} \ \dot{\gamma }(t_y)^\perp , \end{aligned}$$
(45)

for every \(y\in [0,\pi ]\), where the function \(\tilde{y}\) is evaluated at y. Since \(\gamma \) is a geodesic,

$$\begin{aligned} \frac{D}{dy} \dot{\gamma }\circ t_y = \frac{D}{dt} \dot{\gamma } (t_y) \frac{\partial t_y}{\partial y} = 0, \end{aligned}$$

and since the vector field \(\dot{\gamma }^{\perp }\) along \(\gamma \) is parallel,

$$\begin{aligned} \frac{D}{dy} \dot{\gamma }^{\perp } \circ t_y = \frac{D}{dt} \dot{\gamma } (t_y)^{\perp } \frac{\partial t_y}{\partial y} = 0. \end{aligned}$$

Notice that \(V(y):=\dot{\alpha }_y(l_y)\) is a vector field along the smooth curve \(y\mapsto \gamma (t_y)\). Using that \(\gamma \) is a geodesic we obtain from (45)

$$\begin{aligned} \frac{ DV}{dy}(y)&= -\tilde{y}' \sin \tilde{y} \ \dot{\gamma }(t_y) + \cos \tilde{y} \ \frac{D}{dy} \dot{\gamma }\circ t_y +\tilde{y}'\cos \tilde{y} \ \dot{\gamma }(t_y)^\perp + \sin \tilde{y} \ \frac{D}{dy} \dot{\gamma }^{\perp } \circ t_y\nonumber \\&= -\tilde{y}' \sin \tilde{y} \ \dot{\gamma }(t_y) +\tilde{y}'\cos \tilde{y} \ \dot{\gamma }(t_y)^\perp \nonumber \\&= \tilde{y}'(y) \ \dot{\alpha }_y(l_y)^\perp . \end{aligned}$$
(46)

The geodesic variation \(\{\alpha _y\}\) at \(y=y^*\) corresponds to the Jacobi field J along \(\alpha _{y^*}\) given by

$$\begin{aligned} J(t) := \left. \frac{\partial }{\partial y}\right| _{y=y^*}\alpha _y(t). \end{aligned}$$
(47)

From the initial conditions \(J(0)=0\) and

$$\begin{aligned} \frac{DJ}{dt}(0) = \left. \frac{D}{dy}\right| _{y=y^*}\dot{\alpha }_{y}(0) = \left. \frac{d}{dy}\right| _{y=y^*} \dot{\alpha }_{y}(0) = \dot{\alpha }_{y^*}(0)^\perp , \end{aligned}$$

we find a smooth real function u such that

$$\begin{aligned} J(t)=u(t)\dot{\alpha }_{y^*}(t)^\perp , \qquad \frac{DJ}{dt}(t) = u'(t)\dot{\alpha }_{y^*}(t)^\perp , \qquad \forall t\in \mathbb {R}, \end{aligned}$$

and

$$\begin{aligned} u(0) =0 , \qquad u'(0)=1. \end{aligned}$$
(48)

Moreover

$$\begin{aligned} \left. \frac{D}{dy}\right| _{y=y^*} \dot{\alpha }_y(t) = \frac{D}{dt} J(t) = u'(t)\dot{\alpha }_{y^*}(t)^\perp , \qquad \forall t\in \mathbb {R}. \end{aligned}$$
(49)

Recall that the covariant derivative of a vector field v along a curve \(\delta \) on \(S^2\) is the full derivative of the corresponding curve \((\delta ,v)\) on \(TS^2\) projected back to \(TS^2\) by the connection operator \(K:TTS^2 \rightarrow TS^2\). More precisely, K projects this full derivative \((\delta ,v)'\) onto the vertical subspace \(\mathcal {V}_{(\delta ,v)}\subset T_{(\delta ,v)}TS^2\) along the horizontal subspace \(\mathcal {H}_{(\delta ,v)}\subset T_{(\delta ,v)}TS^2\), and then brings it to \(T_\delta S^2\) via the inverse of the isomorphism \(i_{\mathcal {V}_{v}}\), see the discussion after the proof of Lemma 3.5. In (46) we find the covariant derivative of the vector field \(y\mapsto \dot{\alpha }_y(l_y)\) along the curve \(y\mapsto \alpha _y(l_y)\). In (49) we see the covariant derivative of the vector field \(y \mapsto \dot{\alpha }_y(t)\) along the curve \(y\mapsto \alpha _y(t)\) for fixed t. Since \(\alpha _y\) is a geodesic for all y, by using the above description of the covariant derivative we get from (46) and (49)

$$\begin{aligned} \tilde{y}'(y^*) \dot{\alpha }_{y^*}(l_{y^*})^\perp= & {} \frac{DV}{dy}(y^*) = \left. \frac{D}{dy}\right| _{y=y^*} \dot{\alpha }_y (l_{y^*}) + l_y'(y^*) \left. \frac{D}{dt}\right| _{t=l_{y^*}} \dot{\alpha }_{y^*}(t) \\= & {} \left. \frac{D}{dy}\right| _{y=y^*} \dot{\alpha }_y (l_{y^*}) = u'(l_{y^*})\dot{\alpha }_{y^*}(l_{y^*})^\perp , \end{aligned}$$

for every \(y^*\in [0,\pi ]\), from which we derive the important identity

$$\begin{aligned} \tilde{y}'(y^*) = u'(l_{y^*}), \qquad \forall y^* \in [0,\pi ]. \end{aligned}$$
(50)

Write

$$\begin{aligned} l_{y^*} = l+l' \end{aligned}$$

for \(y^* \in (0,\pi )\), where \(l>0\) is the first time \(\alpha _{y^*}(t)\) hits \(\gamma \), that is,

$$\begin{aligned} l = \tau _+(\dot{\alpha }_{y^*}(0)), \qquad l' = \tau _-( \varphi _+(\dot{\alpha }_{y^*}(0))). \end{aligned}$$

By Lemma 3.9, \(\alpha _{y^*}|_{[0,l]}\) is injective and, in particular, its end-points are distinct points of \(\gamma \), dividing it into two segments \(\gamma _1,\gamma _2\) with lengths \(l_1,l_2>0\), respectively, and \(l_1+l_2=L\). Therefore, \(\alpha _{y^*}|_{[0,l]}\) and \(\gamma _1\) determine a geodesic two-gon. The same holds with \(\alpha _{y^*}|_{[0,l]}\) and \(\gamma _2\). It follows from Theorem A.12 that

$$\begin{aligned} l_1+l\le \frac{2\pi }{\sqrt{\delta }} \quad \text{ and } \quad l_2+l\le \frac{2\pi }{\sqrt{\delta }}. \end{aligned}$$

Theorem A.12 also implies that \(L\le 2\pi /\sqrt{\delta }\). From Klingenberg’s lower bound (16) on the injectivity radius of g, we must have \(l_1+l\ge 2\pi \), \(l_2+l\ge 2\pi \), and \(L\ge 2\pi \). Putting these inequalities together, we obtain

$$\begin{aligned}&2\pi \le l_i+l \le \displaystyle {\frac{2\pi }{\sqrt{\delta }}}, \qquad i=1,2,\end{aligned}$$
(51)
$$\begin{aligned}&2\pi \le L=l_1+l_2 \le \displaystyle {\frac{2\pi }{\sqrt{\delta }}}. \end{aligned}$$
(52)

By adding the inequalities (51), we obtain

$$\begin{aligned} 4\pi \le 2l + L \le \frac{4\pi }{\sqrt{\delta }}. \end{aligned}$$
(53)

Together with (52), the above inequality implies

$$\begin{aligned} 2\pi - \frac{\pi }{\sqrt{\delta }} \le l \le \frac{2\pi }{\sqrt{\delta }} - \pi . \end{aligned}$$

Arguing analogously with the geodesic arc \(\alpha _{y^*}|_{[l,l_{y^*}=l+l']}\), we obtain the similar estimate

$$\begin{aligned} 2\pi - \frac{\pi }{\sqrt{\delta }} \le l' \le \frac{2\pi }{\sqrt{\delta }} - \pi , \end{aligned}$$

concluding that the length \(l_{y^*}\) of \(\alpha _{y^*}\) satisfies

$$\begin{aligned} 4\pi - \frac{2\pi }{\sqrt{\delta }} \le l_{y^*} = l+ l' \le \frac{4\pi }{\sqrt{\delta }} - 2\pi . \end{aligned}$$
(54)

The Jacobi equation for the vector field J along \(\alpha _{y^*}\) which is defined in (47) can be written in terms of the scalar function u as

$$\begin{aligned} u''(t) + K(\alpha _{y^*}(t)) u(t) = 0. \end{aligned}$$

Writing

$$\begin{aligned} u'(t)+ i u(t) = r e^{i \theta } \end{aligned}$$

for smooth real functions \(r>0\) and \(\theta \), we get

$$\begin{aligned} \theta ' = \cos ^2\theta + K(\alpha _{y^*}) \sin ^2 \theta . \end{aligned}$$
(55)

The initial conditions (48) imply that \(r(0)=1\) and \(\theta (0)=0\). From (55) we have \(\delta \le \theta ' \le 1\). Hence, from the estimate for \(l_{y^*}\) given in (54), we find

$$\begin{aligned} \delta \left( 4\pi - \frac{2\pi }{\sqrt{\delta }} \right) \le \theta (l_{y^*})\le \frac{4\pi }{\sqrt{\delta }} - 2\pi . \end{aligned}$$
(56)

From \(\delta >(4+\sqrt{7})/8\) we get

$$\begin{aligned} \delta \left( 4\pi - \frac{2\pi }{\sqrt{\delta }} \right) > \frac{3\pi }{2}, \end{aligned}$$

and since a fortiori \(\delta >64/81\), we have also

$$\begin{aligned} \frac{4\pi }{\sqrt{\delta }} - 2\pi < \frac{5\pi }{2}. \end{aligned}$$

Therefore, (56) implies that \(\cos \theta (l_{y^*})\) is positive. By the identity (50), we conclude that

$$\begin{aligned} \tilde{y}'(y^*) = u'(l_{y^*}) = r(l_{y^*}) \cos \theta (l_{y^*})>0, \end{aligned}$$

as we wished to prove. \(\square \)
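The role of the threshold \((4+\sqrt{7})/8\) in the above proof can be made concrete by evaluating the two bounds in (56). The following Python sketch computes them for a few values of \(\delta \); the function name `theta_bounds` and the sample values of \(\delta \) are hypothetical choices made for illustration.

```python
import math

# Pinching threshold appearing in Proposition 3.10.
DELTA_0 = (4.0 + math.sqrt(7.0)) / 8.0

def theta_bounds(delta):
    # Lower and upper bounds for theta(l_{y*}) taken from (56),
    # with the curvature normalised to take values in [delta, 1].
    lower = delta * (4.0 * math.pi - 2.0 * math.pi / math.sqrt(delta))
    upper = 4.0 * math.pi / math.sqrt(delta) - 2.0 * math.pi
    return lower, upper

# At the threshold the lower bound equals 3*pi/2 (up to rounding).
lower_0, upper_0 = theta_bounds(DELTA_0)
print(lower_0 / math.pi, upper_0 / math.pi)

# Above the threshold both bounds lie strictly between 3*pi/2 and 5*pi/2,
# an interval on which the cosine is positive.
samples = [0.84, 0.9, 0.99, 0.999]
bounds = [theta_bounds(delta) for delta in samples]
for delta, (lower, upper) in zip(samples, bounds):
    print(delta, lower / math.pi, upper / math.pi)
```

The lower bound is the binding one: it reaches \(3\pi /2\) exactly at \(\delta = (4+\sqrt{7})/8\), whereas the upper bound stays below \(5\pi /2\) already for \(\delta > 64/81\).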

3.5 Proof of the main theorem

In [11] Calabi and Cao have proved that any shortest closed geodesic on a two-sphere with non-negative curvature is simple. If one assumes that the curvature is suitably pinched, this fact follows also from the lower bound (16) on the injectivity radius and from Theorem A.12:

Lemma 3.11

Assume that the metric g on \(S^2\) is \(\delta \)-pinched for some \(\delta >1/4\). Then any closed geodesic \(\gamma \) of minimal length on \((S^2,g)\) is a simple curve.

Proof

If a closed geodesic \(\gamma \) of minimal length is not simple, then it contains at least two distinct geodesic loops. By the lower bound (16) on the injectivity radius, each of these two geodesic loops has length at least

$$\begin{aligned} \frac{2\pi }{\sqrt{\max K}}, \end{aligned}$$

and we deduce that

$$\begin{aligned} L \ge \frac{4\pi }{\sqrt{\max K}}. \end{aligned}$$
(57)

A celebrated theorem due to Lusternik and Schnirelmann implies the existence of simple closed geodesics on any Riemannian \(S^2\). By Theorem A.12 any simple closed geodesic has length at most

$$\begin{aligned} \frac{2\pi }{\sqrt{\min K}}. \end{aligned}$$

By the pinching assumption,

$$\begin{aligned} \frac{2\pi }{\sqrt{\min K}} \le \frac{2\pi }{\sqrt{\delta \max K}} < \frac{4\pi }{\sqrt{\max K}}, \end{aligned}$$

so by (57) any simple closed geodesic is shorter than L. This contradicts the fact that L is the minimal length of a closed geodesic and proves that \(\gamma \) must be simple. \(\square \)
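The two length bounds compared in this proof can be evaluated directly. The following Python sketch checks, for sample values of the pinching constant, whether the upper bound for simple closed geodesics lies strictly below the lower bound (57) for non-simple ones. The function names and the sample values are hypothetical and serve only as an illustration.

```python
import math

def non_simple_lower_bound(max_k):
    # Bound (57): a non-simple closed geodesic contains two geodesic loops,
    # each of length at least 2*pi / sqrt(max K).
    return 4.0 * math.pi / math.sqrt(max_k)

def simple_upper_bound(min_k):
    # Bound quoted from Theorem A.12: a simple closed geodesic has length
    # at most 2*pi / sqrt(min K).
    return 2.0 * math.pi / math.sqrt(min_k)

def bounds_separate(delta, max_k=1.0):
    # Worst case allowed by delta-pinching: min K = delta * max K.
    return simple_upper_bound(delta * max_k) < non_simple_lower_bound(max_k)

for delta in (0.2, 0.26, 0.5, 0.9):
    print(delta, bounds_separate(delta))
```

The separation holds for the sample values above \(1/4\) and fails for \(\delta =0.2\), which reflects the fact that the last inequality in the displayed chain is equivalent to \(\delta >1/4\).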

Now let \(\gamma \) be a simple closed geodesic on \((S^2,g)\) of length L. Let \(\varphi :\Sigma _{\gamma }^+ \rightarrow \Sigma _{\gamma }^+\) be the associated Birkhoff first return map and let \(\Phi \in \mathcal {D}_L(S,\omega )\) be the lift of \(\varphi \) with zero flux whose existence is guaranteed by Theorem 3.8. Here is a first consequence of Theorem 3.8:

Lemma 3.12

Assume that the metric g on \(S^2\) is \(\delta \)-pinched for some \(\delta >1/4\). Then g is Zoll if and only if \(\Phi =\mathrm {id}\).

Proof

Assume that \(\Phi =\mathrm {id}\). Then the action \(\sigma \) of \(\Phi \) is identically zero, so by Theorem 3.8 (ii) the first return time function \(\tau \) is identically equal to L. Therefore, all the vectors in the interior of \(\Sigma _{\gamma }^+\) are initial velocities of closed geodesics of length L. Since also the vectors in the boundary of \(\Sigma _{\gamma }^+\) are by construction initial velocities of closed geodesics of length L, we deduce that all the geodesics on \((S^2,g)\) are closed and have length L.

Conversely, assume that \((S^2,g)\) is Zoll. Since \(\gamma \) has length L, all the geodesics on \((S^2,g)\) are closed and have length L. Then every v in \(\mathrm {int}(\Sigma _{\gamma }^+)\) is a periodic point of \(\varphi \), i.e. there is a minimal natural number k(v) such that \(\varphi ^{k(v)}(v)=v\), and the identity

$$\begin{aligned} \sum _{j=0}^{k(v)-1} \tau (\varphi ^j(v)) = L \end{aligned}$$

holds on \(\mathrm {int}(\Sigma _{\gamma }^+)\). Thanks to the continuity of \(\tau \) and \(\varphi \) and to the positivity of \(\tau \), the above identity forces the function k to be constant, \(k\equiv k_0\in \mathbb {N}\). By continuity, the above identity holds also on the boundary of \(\Sigma _{\gamma }^+\), and we have in particular

$$\begin{aligned} \sum _{j=0}^{k_0-1} \tau (\varphi ^j(\dot{\gamma }(t))) = L \qquad \forall t\in \mathbb {R}/L \mathbb {Z}. \end{aligned}$$

By the above identity, there exists \(t_0\in \mathbb {R}/L \mathbb {Z}\) such that

$$\begin{aligned} \tau (\dot{\gamma }(t_0)) \le \frac{L}{k_0}, \end{aligned}$$

that is, the time to the second conjugate point to \(\gamma (t_0)\) along \(\gamma \) is at most \(L/k_0\). Since this time is at least twice the injectivity radius of \((S^2,g)\), we obtain from (16)

$$\begin{aligned} \frac{L}{k_0} \ge \tau (\dot{\gamma }(t_0)) \ge 2 \, \mathrm {inj}(g) \ge \frac{2\pi }{\sqrt{\max K}}. \end{aligned}$$
(58)

On the other hand, by Theorem A.12 and by the pinching assumption, the length L of the simple closed geodesic \(\gamma \) satisfies

$$\begin{aligned} L \le \frac{2\pi }{\sqrt{\min K}} \le \frac{2\pi }{\sqrt{\delta \max K}} < \frac{4\pi }{\sqrt{\max K}}. \end{aligned}$$
(59)

Inequalities (58) and (59) imply that the positive integer \(k_0\) is less than 2, hence \(k_0=1\) and \(\varphi =\mathrm {id}\). Then \(\Phi \) is a translation by an integer multiple of L and, having zero flux, it must be the identity. \(\square \)

The theorem which is stated in the introduction concerns two inequalities, which we treat separately in the following two statements.

Theorem 3.13

If g is \(\delta \)-pinched with \(\delta > (4+\sqrt{7})/8\), then

$$\begin{aligned} \ell _{\min }(g)^2 \le \pi \, \mathrm {Area}(S^2,g), \end{aligned}$$
(60)

and the equality holds if and only if \((S^2,g)\) is Zoll.

Proof

Let \(\gamma \) be a shortest closed geodesic on \((S^2,g)\) and let L be its length. Since in particular \(\delta >1/4\), Lemma 3.11 implies that \(\gamma \) is simple. Let \(\Phi \in \mathcal {D}_L(S,\omega )\) be the lift with zero flux of the Birkhoff first return map which is associated to \(\gamma \).

If \((S^2,g)\) is Zoll, then by Lemma 3.12 \(\Phi =\mathrm {id}\), so \(\mathrm {CAL}(\Phi )=0\), and Theorem 3.8 (iii) implies that

$$\begin{aligned} \pi \, \mathrm {Area} (S^2,g) = L^2. \end{aligned}$$

This shows that if g is Zoll, then the equality holds in (60).

It remains to show that if \((S^2,g)\) is not Zoll, then the strict inequality holds in (60). Assume by contradiction that

$$\begin{aligned} L^2 \ge \pi \, \mathrm {Area}(S^2,g). \end{aligned}$$

Then by Theorem 3.8 (iii) we have

$$\begin{aligned} L\ \mathrm {CAL}(\Phi ) = \pi \, \mathrm {Area}(S^2,g) - L^2 \le 0, \end{aligned}$$

and \(\mathrm {CAL}(\Phi )\) is non-positive. Since \((S^2,g)\) is not Zoll, by Lemma 3.12 the map \(\Phi \) is not the identity. By Proposition 3.10, \(\Phi \) satisfies the hypothesis of Theorem 2.12, which guarantees the existence of a fixed point \((x,y)\in \mathrm {int}(S)\) of \(\Phi \) with action \(\sigma (x,y)<0\). The geodesic which is determined by the corresponding vector in \(\Sigma _{\gamma }^+\) is closed and, by Theorem 3.8 (ii), has length

$$\begin{aligned} \tau (x,y) = L + \sigma (x,y) < L. \end{aligned}$$

This contradicts the fact that L is the minimal length of a closed geodesic. Hence, when \((S^2,g)\) is not Zoll, the strict inequality

$$\begin{aligned} L^2 < \pi \, \mathrm {Area}(S^2,g) \end{aligned}$$

holds. \(\square \)

The proof of the second inequality differs only in a few details:

Theorem 3.14

If g is \(\delta \)-pinched with \(\delta > (4+\sqrt{7})/8\), then

$$\begin{aligned} \ell _{\max }(g)^2 \ge \pi \, \mathrm {Area}(S^2,g), \end{aligned}$$
(61)

and the equality holds if and only if \((S^2,g)\) is Zoll.

Proof

Let \(\gamma \) be a longest simple closed geodesic on \((S^2,g)\) and let L be its length. Let \(\Phi \in \mathcal {D}_L(S,\omega )\) be the lift with zero flux of the Birkhoff first return map which is associated to \(\gamma \).

If \((S^2,g)\) is Zoll, then by Lemma 3.12 \(\Phi =\mathrm {id}\), so \(\mathrm {CAL}(\Phi )=0\), and Theorem 3.8 (iii) implies that

$$\begin{aligned} \pi \, \mathrm {Area} (S^2,g) = L^2. \end{aligned}$$

This shows that if g is Zoll, then the equality holds in (61).

It remains to show that if \((S^2,g)\) is not Zoll, then the strict inequality holds in (61). Assume by contradiction that

$$\begin{aligned} L^2 \le \pi \, \mathrm {Area}(S^2,g). \end{aligned}$$

Then by Theorem 3.8 (iii) we have

$$\begin{aligned} L\ \mathrm {CAL}(\Phi ) = \pi \, \mathrm {Area}(S^2,g) - L^2 \ge 0, \end{aligned}$$

and \(\mathrm {CAL}(\Phi )\) is non-negative. Since \((S^2,g)\) is not Zoll, by Lemma 3.12 the map \(\Phi \) is not the identity. By Proposition 3.10, \(\Phi \) satisfies the hypothesis of Theorem 2.12, which guarantees the existence of a fixed point \((x,y)\in \mathrm {int}(S)\) of \(\Phi \) with action \(\sigma (x,y)>0\). The geodesic which is determined by the corresponding vector in \(\Sigma _{\gamma }^+\) is closed and, by Theorem 3.8 (ii), has length

$$\begin{aligned} \tau (x,y) = L + \sigma (x,y) > L. \end{aligned}$$

Moreover, Lemma 3.9 implies that this closed geodesic is simple. This contradicts the fact that the longest simple closed geodesic has length L and proves that the strict inequality

$$\begin{aligned} L^2 > \pi \, \mathrm {Area}(S^2,g) \end{aligned}$$

holds. The proof is complete. \(\square \)

Remark 3.15

The proof of our main theorem uses the bound \(\delta > (\sqrt{7}+4)/8\) on the pinching constant \(\delta \) only to have the monotonicity of the map \(\Phi \). If the fixed point Theorem 2.12 holds without this assumption, then the conclusion of our main theorem holds under the weaker condition \(\delta >1/4\).