1 Introduction

Let \(M\) be a closed Riemannian manifold and denote by \(\nabla \) the Levi-Civita connection and by \(\mathcal{L }M\) the loop space, that is the space of free loops \(C^\infty (S^1,M)\). For \(x:S^1\rightarrow M\) consider the action functional

$$\begin{aligned} \mathcal{S }_V(x) = \int _0^1 \left( \frac{1}{2}\left|\dot{x}(t)\right|^2 -V(t,x(t)) \right) dt. \end{aligned}$$

Here and throughout we identify \(S^1=\mathbb{R }/\mathbb{Z }\) and think of \(x\in \mathcal{L }M\) as a smooth map \(x:\mathbb{R }\rightarrow M\) which satisfies \(x(t+1)=x(t)\). Smooth means \(C^\infty \) smooth. The potential is a smooth function \(V:S^1\times M\rightarrow \mathbb{R }\) and we set \(V_t(q):=V(t,q)\). The critical points of \(\mathcal{S }_V\) are the 1-periodic solutions of the ODE

$$\begin{aligned} \nabla {}_{t}\dot{x}=-\nabla V_t(x), \end{aligned}$$
(1)

where \(\nabla V_t\) denotes the gradient and \(\nabla {}_{t}\dot{x}\) denotes the covariant derivative, with respect to the Levi-Civita connection, of the vector field \(\dot{x}:=\frac{d}{dt} x\) along the loop \(x\) in direction \(\dot{x}\). By \(\mathcal{P }=\mathcal{P }(V)\) we denote the set of 1-periodic solutions of (1). These solutions are called perturbed closed geodesics, since in the case \(V=0\) these are the closed geodesics.
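
For orientation, here is a brief sketch of why the critical points of \(\mathcal{S }_V\) are the solutions of (1). The notation \(\xi \) for a smooth vector field along \(x\) and \(x_\tau \) for a smooth variation of \(x=x_0\) with \(\frac{d}{d\tau }\big |_{\tau =0}x_\tau =\xi \) is introduced here only for illustration. Since the Levi-Civita connection is metric and torsion free, integration by parts over \(S^1\) gives

$$\begin{aligned} \frac{d}{d\tau }\Big |_{\tau =0}\mathcal{S }_V(x_\tau )&= \int _0^1 \Bigl ( \langle \dot{x},\nabla _t\xi \rangle -\langle \nabla V_t(x),\xi \rangle \Bigr )\,dt \\&= -\int _0^1 \langle \nabla _t\dot{x}+\nabla V_t(x),\xi \rangle \,dt. \end{aligned}$$

This vanishes for every \(\xi \) precisely when (1) holds. The same computation identifies the \(L^2\) gradient of \(\mathcal{S }_V\) at \(x\) as \(-\nabla _t\dot{x}-\nabla V_t(x)\).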

From now on we assume that \(\mathcal{S }_V\) is a Morse function on the loop space, i.e. all critical points are nondegenerate. By [19] the action is Morse for a generic potential \(V_t\) and, furthermore, in this case the set

$$\begin{aligned} \mathcal{P }^a(V):=\{ x\in \mathcal{P }(V)\mid \mathcal{S }_V(x)\le a\} \end{aligned}$$

is finite for every real number \(a\). By \(E_x^u\) we denote the eigenspace corresponding to negative eigenvalues of the Hessian of \(\mathcal{S }_V\) at \(x\in \mathcal{P }(V)\). The dimension of \(E_x^u\) is finite and is called the Morse index of \(x\). Choose an orientation \(\langle x\rangle \) of the vector space \(E_x^u\) for all \(x\in \mathcal{P }(V)\) and denote this set of choices by \(\langle \mathcal{P }\rangle \). Now consider the \(\mathbb{Z }\)-module graded by the Morse index and given by

$$\begin{aligned} \mathrm{CM}^a_*=\mathrm{CM}^a_*(V) :=\bigoplus _{x\in \mathcal{P }^a(V)} \mathbb{Z }\, x. \end{aligned}$$

If \(\mathcal{S }_V\) is even Morse–Smale, then \(\mathrm{CM}^a_*\) carries the following boundary operator \({\partial }_*\). Consider the (negative) \(L^2\) gradient flow lines of \(\mathcal{S }_V\) on the loop space. These are solutions \(u:\mathbb{R }\times S^1\rightarrow M\) of the heat equation

$$\begin{aligned} {\partial }_su-\nabla {}_{t}{\partial }_tu-\nabla V_t(u)=0 \end{aligned}$$
(2)

satisfying

$$\begin{aligned} \lim _{s\rightarrow \pm \infty } u(s,t) =x^\pm (t),\quad \lim _{s\rightarrow \pm \infty } {\partial }_su(s,t) =0, \end{aligned}$$
(3)

where both limits are uniform in the \(t\) variable and \(x^\pm \in \mathcal{P }(V)\). By definition the moduli space \(\mathcal{M }(x^-,x^+;V)\) is the space of solutions of (2) and (3). The action functional \(\mathcal{S }_V\) is called Morse–Smale below level \(a\) if the operator \(\mathcal{D }_u\) obtained by linearizing (2) is onto as a linear operator between appropriate Banach spaces and this is true for all \(u\in \mathcal{M }(x^-,x^+;V)\) and \(x^\pm \in \mathcal{P }^a(V)\). Morse–Smale implies Morse; to see this apply the condition to the constant solution \(u_x:=x\). Under the Morse–Smale hypothesis the space \(\mathcal{M }(x^-,x^+;V)\) is a smooth manifold whose dimension is equal to the difference of the Morse indices of the perturbed closed geodesics \(x^\pm \). In the case of index difference one a compactness result implies that the quotient \(\mathcal{M }(x^-,x^+;V)/\mathbb{R }\) by the (free) time shift action is a finite set. Counting these elements with appropriate signs defines the boundary operator \({\partial }_*\) on \(\mathrm{CM}^a_*\). We call the Morse complex \(\left(\mathrm{CM}^a_*,{\partial }_*\right)\) the heat flow complex and the corresponding homology groups \(\mathrm{HM}_*^a(\mathcal{L }M,\mathcal{S }_V)\) heat flow homology.
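
In formulas, the counting procedure just described can be sketched as follows. The symbols \(\epsilon _{[u]}\in \{\pm 1\}\) (the sign attached to a flow line by the chosen orientations) and \(n(x,y)\) are notation introduced here for illustration only. For \(x\in \mathcal{P }^a(V)\) of Morse index \(k\) one sets

$$\begin{aligned} {\partial }_k x := \sum _{y} n(x,y)\, y, \qquad n(x,y) := \sum _{[u]\in \mathcal{M }(x,y;V)/\mathbb{R }} \epsilon _{[u]}, \end{aligned}$$

where the first sum runs over all \(y\in \mathcal{P }^a(V)\) of Morse index \(k-1\). Both sums are finite: the first because \(\mathcal{P }^a(V)\) is a finite set, the second by the compactness result mentioned above.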

In Sect. 5 we explain how to perturb the Morse function \(\mathcal{S }_V\) by an abstract perturbation \(v\in \mathcal{O }^a_{reg}(V)\) to achieve the Morse–Smale condition without changing the set of critical points. By definition heat flow homology of \(\mathcal{S }_V\) is then equal to heat flow homology of the perturbed functional. It is an open question if \(\mathcal{S }_V\) is Morse–Smale for a generic potential \(V_t\). The class of abstract perturbations for which we can establish transversality is introduced in the following Sect. 1.1. In contrast we call the potentials \(V_t\) geometric perturbations.

Theorem 1

Fix a potential \(V\in C^\infty (S^1\times M)\) such that the action \(\mathcal{S }_V\) is Morse and take a choice of orientations \(\langle \mathcal{P }\rangle \). Assume \(a\in \mathbb{R }\) is a regular value of \(\mathcal{S }_V\) and \(v^a\in \mathcal{O }^a_{reg}(V)\) is a (regular) perturbation. Then \({\partial }_*={\partial }_*(V,\langle \mathcal{P }\rangle ,v^a)\) satisfies \({\partial }^2=0\). Moreover, heat flow homology defined by

$$\begin{aligned} \mathrm{HM}_*^a(\mathcal{L }M,\mathcal{S }_V) :=\frac{\ker \, {\partial }_*}{\mathrm{im\, }\, {\partial }_*} \end{aligned}$$

does not depend on the choice of regular perturbation \(v^a\) and orientations \(\langle \mathcal{P }\rangle \).

The construction of the Morse complex in finite dimensions goes back to Thom [17], Smale [14, 15], and Milnor [9]. It was rediscovered by Witten [23] and extended to infinite dimensions by Floer [5, 6]. We refer to [1] for an extensive historical account.

1.1 Perturbations

We introduce a class of abstract perturbations of equations (2) and (1) for which transversality works. The abstract perturbations take the form of smooth maps \( \mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }. \) For \(x\in \mathcal{L }M\) let \(\mathrm{grad }\mathcal{V }(x)\in {\Omega }^0(S^1,x^*TM)\) denote the \(L^2\)-gradient of \(\mathcal{V }\); it is defined by

$$\begin{aligned} \int _0^1\langle \mathrm{grad }\mathcal{V }(u), {\partial }_su\rangle \,dt = \frac{d}{ds}\mathcal{V }(u) \end{aligned}$$

for every smooth path \(\mathbb{R }\rightarrow \mathcal{L }M:s\mapsto u(s,\cdot )\). The covariant Hessian of \(\mathcal{V }\) at a loop \(x:S^1\rightarrow M\) is the operator \(\mathcal{H }_\mathcal{V }(x)\) on \({\Omega }^0(S^1,x^*TM)\) defined by

$$\begin{aligned} \mathcal{H }_\mathcal{V }(u){\partial }_su := \nabla {}_{s}\mathrm{grad }\mathcal{V }(u) \end{aligned}$$
(4)

for every smooth map \(\mathbb{R }\rightarrow \mathcal{L }M:s\mapsto u(s,\cdot )\). The axiom (V1) below asserts that this Hessian is a zeroth order operator. We impose the following conditions on \(\mathcal{V }\); here \(\mathopen |\cdot \mathclose |\) denotes the pointwise absolute value at \((s,t)\in \mathbb{R }\times S^1\) and \(\mathopen \Vert {\cdot } \mathclose \Vert _{L^p}\) denotes the \(L^p\)-norm over \(S^1\) at time \(s\). Although condition (V1) and the first part of (V2) are special cases of (V3) we state the axioms in the form below, because some of our results don’t require all the conditions to hold.

  1. (V0)

    \(\mathcal{V }\) is continuous with respect to the \(C^0\) topology on \(\mathcal{L }M\). Moreover, there is a constant \(C=C(\mathcal{V })\) such that

    $$\begin{aligned} \sup _{x\in \mathcal{L }M}\left|\mathcal{V }(x)\right| +\sup _{x\in \mathcal{L }M}\left\Vert\mathrm{grad }\mathcal{V }(x)\right\Vert_{L^\infty (S^1)} \le C. \end{aligned}$$
  2. (V1)

    There is a constant \(C=C(\mathcal{V })\) such that

    $$\begin{aligned} \left|\nabla {}_{s}\mathrm{grad }\mathcal{V }(u)\right|&\le C\bigl (\left|{\partial }_su\right|+\left\Vert {{\partial }_su} \right\Vert_{L^1}\bigr ), \\ \left|\nabla {}_{t}\mathrm{grad }\mathcal{V }(u)\right|&\le C\Bigl (1+\left|{\partial }_tu\right|\Bigr ) \end{aligned}$$

    for every smooth map \(\mathbb{R }\rightarrow \mathcal{L }M:s\mapsto u(s,\cdot )\) and every \((s,t)\in \mathbb{R }\times S^1\).

  3. (V2)

    There is a constant \(C=C(\mathcal{V })\) such that

    $$\begin{aligned} \left|\nabla {}_{s}\nabla {}_{s}\mathrm{grad }\mathcal{V }(u)\right|&\le C\Bigl (\left|\nabla {}_{s}{\partial }_su\right| + \left\Vert {\nabla {}_{s}{\partial }_su} \right\Vert_{L^1} + \bigl (\left|{\partial }_su\right| + \left\Vert {{\partial }_su} \right\Vert_{L^2}\bigr )^2 \Bigr ), \\ \left|\nabla {}_{t}\nabla {}_{s}\mathrm{grad }\mathcal{V }(u)\right|&\le C\Bigl ( \left|\nabla {}_{t}{\partial }_su\right| + \bigl (1+\left|{\partial }_tu\right|\bigr ) \bigl (\left|{\partial }_su\right| + \left\Vert {{\partial }_su} \right\Vert_{L^1}\bigr ) \Bigr ), \end{aligned}$$

    and

    $$\begin{aligned} \left|\nabla {}_{s}\nabla {}_{s}\mathrm{grad }\mathcal{V }(u) - \mathcal{H }_\mathcal{V }(u)\nabla {}_{s}{\partial }_su\right| \le C\bigl (\left|{\partial }_su\right| + \left\Vert {{\partial }_su} \right\Vert_{L^2}\bigr )^2 \end{aligned}$$

    for every smooth map \(\mathbb{R }\rightarrow \mathcal{L }M:s\mapsto u(s,\cdot )\) and every \((s,t)\in \mathbb{R }\times S^1\).

  4. (V3)

    For any two integers \(k>0\) and \(\ell \ge 0\) there is a constant \(C=C(k,\ell ,\mathcal{V })\) such that

    $$\begin{aligned} \left|\nabla _t^\ell \nabla _s^k\mathrm{grad }\mathcal{V }(u)\right| \le C\sum _{k_j,\ell _j} \left( \prod _{\overset{j}{\scriptscriptstyle \ell _j>0}} \left|\nabla _t^{\ell _j}\nabla _s^{k_j}u\right| \right) \prod _{\overset{j}{\scriptscriptstyle \ell _j=0}} \Biggl (\left|\nabla _s^{k_j}u\right| +\left\Vert\nabla _s^{k_j}u\right\Vert_{L^{p_j}} \Biggr ) \end{aligned}$$

    for every smooth map \(\mathbb{R }\rightarrow \mathcal{L }M:s\mapsto u(s,\cdot )\) and every \((s,t)\in \mathbb{R }\times S^1\); here \(p_j\ge 1\) and \(\sum _{\ell _j=0}1/p_j=1\); the sum runs over all partitions \(k_1+\cdots +k_m=k\) and \(\ell _1+\cdots +\ell _m\le \ell \) such that \(k_j+\ell _j\ge 1\) for all \(j\). For \(k=0\) the same inequality holds with an additional summand \(C\) on the right.

Remark 1

If \(V\in C^\infty (S^1\times M,\mathbb{R })\) and \(x\in \mathcal{L }M\), then \(\mathcal{V }(x):=\int _0^1 V_t\left(x(t)\right)dt\) satisfies \(\mathrm{grad }\mathcal{V }(x) = \nabla V_t(x)\) and \(\mathcal{H }_\mathcal{V }(x)\xi = \nabla {}_{\xi }\nabla V_t(x)\) for \(\xi \in \Omega ^0(S^1,x^*TM)\).
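
The identities in Remark 1, and the estimates of axiom (V1) for this particular \(\mathcal{V }\), can be checked directly. The following is only a sketch; the constant \(c\) is introduced here for illustration. For a smooth map \(s\mapsto u(s,\cdot )\), differentiating under the integral sign and applying the chain rule gives

$$\begin{aligned} \frac{d}{ds}\mathcal{V }(u(s,\cdot )) = \int _0^1 \langle \nabla V_t(u),{\partial }_su\rangle \,dt, \qquad \nabla _s\nabla V_t(u) = \nabla _{{\partial }_su}\nabla V_t(u), \end{aligned}$$

which yields both formulas by the definition of the \(L^2\)-gradient and by (4). Since \(S^1\times M\) is compact, the number \(c:=\sup _{S^1\times M}\bigl (\left|\nabla \nabla V_t\right|+\left|{\partial }_t\nabla V_t\right|\bigr )\) is finite and therefore

$$\begin{aligned} \left|\nabla _s\mathrm{grad }\mathcal{V }(u)\right|&\le c\left|{\partial }_su\right|, \\ \left|\nabla _t\mathrm{grad }\mathcal{V }(u)\right|&= \left|({\partial }_t\nabla V_t)(u)+\nabla _{{\partial }_tu}\nabla V_t(u)\right| \le c\bigl (1+\left|{\partial }_tu\right|\bigr ). \end{aligned}$$

In this geometric case the nonlocal \(L^1\) term in (V1) is not needed.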

Remark 2

To prove transversality in Sect. 5 we use perturbations

$$\begin{aligned} \mathcal{V }(x):= \rho \left(\left\Vert x-x_0 \right\Vert_{L^2}^2 \right) \int _0^1 V_t(x(t))\,dt, \end{aligned}$$

where \(\rho :\mathbb{R }\rightarrow [0,1]\) is a smooth cutoff function and \(x_0:S^1\rightarrow M\) is a loop. Any such perturbation satisfies (V0)–(V3). Here compactness of \(M\) enters.

1.2 Main results

There are two purposes of this text (which is the main part of the author’s habilitation thesis [20]). One is to construct the Morse chain complex for the action functional on the loop space. The other one is to provide proofs of the results announced and used in [13] to calculate the adiabatic limit of the Floer complex of the cotangent bundle. More precisely, in [13] we proved in joint work with D. Salamon that the connecting orbits of the heat flow are the adiabatic limit of Floer connecting orbits in the cotangent bundle \(T^*M\) with respect to the Hamiltonian given by kinetic plus potential energy. The key idea is to appropriately rescale the Riemannian metric on \(M\). Both purposes are achieved simultaneously by Theorems 2–8.

From now on we replace the potential \(V\) by an abstract perturbation \(\mathcal{V }\) satisfying (V0)–(V3). Then the action is given by

$$\begin{aligned} \mathcal{S }_\mathcal{V }(x) = \frac{1}{2}\int _0^1\left|\dot{x}(t)\right|^2\,dt - \mathcal{V }(x) \end{aligned}$$
(5)

for smooth loops \(x:S^1\rightarrow M\) and the set \(\mathcal{P }(\mathcal{V })\) of critical points of \(\mathcal{S }_\mathcal{V }\) consists of those loops \(x:S^1\rightarrow M\) that solve the ODE

$$\begin{aligned} \nabla {}_{t}\dot{x}=-\mathrm{grad }\mathcal{V }(x). \end{aligned}$$
(6)

The subset \(\mathcal{P }^a(\mathcal{V })\) consists of those with \(\mathcal{S }_\mathcal{V }(x)\le a\). Now the heat equation has the form

$$\begin{aligned} {\partial }_su - \nabla {}_{t}{\partial }_tu - \mathrm{grad }\mathcal{V }(u) = 0 \end{aligned}$$
(7)

for smooth cylinders \(u:\mathbb{R }\times S^1\rightarrow M\). Here \(\mathrm{grad }\mathcal{V }(u)\) denotes the value of \(\mathrm{grad }\mathcal{V }\) on the loop \(u_s:t\mapsto u(s,t)\). Given two nondegenerate critical points \(x^\pm \in \mathcal{P }(\mathcal{V })\) denote by \(\mathcal{M }(x^-,x^+;\mathcal{V })\) the set of all solutions \(u\) of (7) which satisfy the limit condition (3). Such \(u\) are called connecting orbits or connecting trajectories. The energy of a connecting trajectory is given by

$$\begin{aligned} E(u) = \int _{-\infty }^\infty \int _0^1\left|{\partial }_su\right|^2\,dtds = \mathcal{S }_\mathcal{V }(x^-)-\mathcal{S }_\mathcal{V }(x^+). \end{aligned}$$
(8)
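
The second equality in (8) can be sketched as follows, assuming that \(u_s\) converges to \(x^\pm \) strongly enough (say in \(C^1(S^1)\)) for \(\mathcal{S }_\mathcal{V }(u_s)\) to converge to \(\mathcal{S }_\mathcal{V }(x^\pm )\) as \(s\rightarrow \pm \infty \). The first variation of (5) shows that the \(L^2\) gradient of \(\mathcal{S }_\mathcal{V }\) at a loop \(x\) is \(-\nabla _t\dot{x}-\mathrm{grad }\mathcal{V }(x)\), and along a solution of (7) this equals \(-{\partial }_su\). Hence

$$\begin{aligned} \frac{d}{ds}\mathcal{S }_\mathcal{V }(u_s) = \int _0^1 \langle -\nabla _t{\partial }_tu-\mathrm{grad }\mathcal{V }(u),{\partial }_su\rangle \,dt = -\int _0^1 \left|{\partial }_su\right|^2\,dt, \end{aligned}$$

and integrating over \(s\in \mathbb{R }\) gives (8). In particular, the action is nonincreasing along solutions of (7).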

Theorem 2

(Regularity) Fix a constant \(p>2\) and a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V3). Let \(u:\mathbb{R }\times S^1\rightarrow M\) be a continuous function of class \(\mathcal{W }^{1,p}_{loc}\), that is \(u,{\partial }_tu,\nabla {}_{t}{\partial }_tu,{\partial }_su\) are locally \(L^p\) integrable. Assume that \(u\) solves the heat equation (7) almost everywhere. Then \(u\) is smooth.

Remark 3

It seems unlikely that the assumption \(u\in \mathcal{W }^{1,p}_{loc}\) can be weakened to \(u\in W^{1,p}_{loc}\), as announced in [13], unless we also replace \(p>2\) by \(p>3\); see [20, rmk. 2.19]. Fortunately, the stronger assumption \(u\in \mathcal{W }^{1,p}_{loc}\) is satisfied in our applications of Theorem 2. These are [13, proof of lemma 10.2], the Banach bundle setup introduced in Sect. 3, step 1 of the proof of Theorem 7, and the proof of Proposition 9 on surjectivity of the universal section.

Theorem 3

(A priori estimates) Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V1) and a constant \(c_0\). Then there is a positive constant \(C=C(c_0,\mathcal{V })\) such that the following holds. If \(u:\mathbb{R }\times S^1\rightarrow M\) is a solution of (7) such that \(\mathcal{S }_\mathcal{V }(u(s,\cdot ))\le c_0\) for every \(s\in \mathbb{R }\), then

$$\begin{aligned} \left\Vert {{\partial }_tu} \right\Vert_\infty +\left\Vert {\nabla {}_{t}{\partial }_tu} \right\Vert_\infty +\left\Vert {{\partial }_su} \right\Vert_\infty +\left\Vert {\nabla {}_{t}{\partial }_su} \right\Vert_\infty +\left\Vert {\nabla {}_{s}{\partial }_su} \right\Vert_\infty \le C. \end{aligned}$$

The covariant Hessian of \(\mathcal{S }_\mathcal{V }\) at a loop \(x:S^1\rightarrow M\) is the linear operator \(A_x:W^{2,2}(S^1,x^*TM)\rightarrow L^2(S^1,x^*TM)\) given by

$$\begin{aligned} A_x\xi = - \nabla {}_{t}\nabla {}_{t}\xi - R(\xi ,\dot{x})\dot{x} - \mathcal{H }_\mathcal{V }(x)\xi \end{aligned}$$
(9)

where \(R\) denotes the Riemannian curvature tensor and the Hessian \(\mathcal{H }_\mathcal{V }\) is defined by (4). This operator is self-adjoint with respect to the standard \(L^2\) inner product. The number of negative eigenvalues is finite. It is denoted by \(\mathrm{ind}_\mathcal{V }(A_x)\) and called the Morse index of \(A_x\). If \(x\) is a critical point of \(\mathcal{S }_\mathcal{V }\) we define its Morse index by \(\mathrm{ind}_\mathcal{V }(x):=\mathrm{ind}_\mathcal{V }(A_x)\) and we call \(x\) nondegenerate if \(A_x\) is bijective. Linearizing the heat equation (7) gives rise to the linear operator \( \mathcal{D }_u:\mathcal{W }_u^{1,p}\rightarrow \mathcal{L }_u^p \), see [18, app. A.2], which in the notation introduced above is given by

$$\begin{aligned} \mathcal{D }_u\xi = \nabla {}_{s}\xi + A_{u_s}\xi . \end{aligned}$$
(10)

Here \(u_s(t):=u(s,t)\) and the spaces \(\mathcal{L }_u^p\) and \(\mathcal{W }_u^{1,p}\) are defined as the completions of the space of smooth compactly supported sections of the pullback tangent bundle \(u^*TM\rightarrow \mathbb{R }\times S^1\) with respect to the norms

$$\begin{aligned} \begin{aligned} \left\Vert\xi \right\Vert_p&= \left(\,\,\int _{-\infty }^\infty \int _0^1 |\xi |^p\,dtds\right)^{1/p},\\ \left\Vert\xi \right\Vert_{\mathcal{W }^{1,p}}&= \left(\,\,\int _{-\infty }^\infty \int _0^1 |\xi |^p + |\nabla {}_{s}\xi |^p + |\nabla {}_{t}\nabla {}_{t}\xi |^p \,dtds\right)^{1/p}. \end{aligned} \end{aligned}$$
(11)
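
Formula (10) can be recovered by the following formal computation. The variation \(\tau \mapsto u^\tau \) with \(u^0=u\) and \({\partial }_\tau u^\tau |_{\tau =0}=\xi \) is introduced here for illustration, and we assume the sign convention \(R(X,Y)Z=\nabla _X\nabla _YZ-\nabla _Y\nabla _XZ-\nabla _{[X,Y]}Z\) for the curvature tensor. Since the Levi-Civita connection is torsion free, \(\nabla _\tau {\partial }_su=\nabla _s\xi \) and \(\nabla _\tau {\partial }_tu=\nabla _t\xi \) at \(\tau =0\), and therefore

$$\begin{aligned} \nabla _\tau \bigl ({\partial }_su-\nabla _t{\partial }_tu-\mathrm{grad }\mathcal{V }(u)\bigr )\Big |_{\tau =0}&= \nabla _s\xi -\nabla _t\nabla _t\xi -R(\xi ,{\partial }_tu){\partial }_tu -\mathcal{H }_\mathcal{V }(u)\xi \\&= \nabla _s\xi +A_{u_s}\xi . \end{aligned}$$

Here the curvature term arises from commuting \(\nabla _\tau \) with \(\nabla _t\), and the last term on the right of the first line comes from definition (4) applied in the \(\tau \) direction.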

Theorem 4

(Exponential decay) Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V3) and assume \(\mathcal{S }_\mathcal{V }\) is Morse.

  1. (F)

    Let \(u:[0,\infty )\times S^1\rightarrow M\) be a solution of (7). Then there are positive constants \(\rho \) and \(c_0,c_1,c_2,\dots \) such that

    $$\begin{aligned} \left\Vert {{\partial }_su} \right\Vert_{C^k([T,\infty )\times S^1)} \le c_ke^{-\rho T} \end{aligned}$$

    for every \(T\ge 1\). Moreover, there is a periodic orbit \(x\in \mathcal{P }(\mathcal{V })\) such that \(u(s,\cdot )\) converges to \(x\) in \(C^2(S^1)\) as \(s\rightarrow \infty \).

  2. (B)

    Let \(u:(-\infty ,0]\times S^1\rightarrow M\) be a solution of (7) with finite energy. Then there are positive constants \(\rho \) and \(c_0,c_1,c_2,\dots \) such that

    $$\begin{aligned} \left\Vert {{\partial }_su} \right\Vert_{C^k((-\infty ,-T]\times S^1)} \le c_ke^{-\rho T} \end{aligned}$$

    for every \(T\ge 1\). Moreover, there is a periodic orbit \(x\in \mathcal{P }(\mathcal{V })\) such that \(u(s,\cdot )\) converges to \(x\) in \(C^2(S^1)\) as \(s\rightarrow -\infty \).
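
The mechanism behind the exponential rate can be illustrated in a simplified linear model. This is a heuristic under additional assumptions and not the proof given later. Suppose \(A\) is a fixed self-adjoint operator on a Hilbert space \(H\) whose positive spectrum is bounded below by some \(\rho >0\), and suppose \(\xi :[0,\infty )\rightarrow H\) solves \({\partial }_s\xi +A\xi =0\) and takes values in the positive spectral subspace of \(A\). Then

$$\begin{aligned} \frac{d}{ds}\,\frac{1}{2}\left\Vert\xi (s)\right\Vert^2 = -\langle A\xi (s),\xi (s)\rangle \le -\rho \left\Vert\xi (s)\right\Vert^2, \qquad \text{hence}\qquad \left\Vert\xi (s)\right\Vert \le e^{-\rho s}\left\Vert\xi (0)\right\Vert. \end{aligned}$$

In the situation of Theorem 4 one may think of \(\xi ={\partial }_su\) and \(A=A_x\), where nondegeneracy of the limit orbit \(x\) provides the spectral gap; compare (10).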

Theorem 5

(Fredholm) Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V3), a constant \(p>1\), and two nondegenerate critical points \(x^\pm \in \mathcal{P }(\mathcal{V })\). Then for each \(u\in \mathcal{M }(x^-,x^+;\mathcal{V })\) the operator \(\mathcal{D }_u:\mathcal{W }_u^{1,p}\rightarrow \mathcal{L }_u^p\) is Fredholm and

$$\begin{aligned} \mathrm{index}\,\mathcal{D }_u =\mathrm{ind}_\mathcal{V }(x^-)-\mathrm{ind}_\mathcal{V }(x^+). \end{aligned}$$

Moreover, the formal adjoint operator \(\mathcal{D }_u^*=-\nabla {}_{s}+A_{u_s}:\mathcal{W }_u^{1,p}\rightarrow \mathcal{L }_u^p\) is Fredholm with \( \mathrm{index}\,\mathcal{D }_u^* =-\mathrm{index}\,\mathcal{D }_u \).

See [21, thm. 3.13] for the stronger version announced in [13, thm. A.4] which, together with Corollary 1 in Sect. 2.4 on exponential decay, proves Theorem 5.

Theorem 6

(Implicit function theorem) Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V3). Assume \(x^\pm \) are nondegenerate critical points of \(\mathcal{S }_\mathcal{V }\) and \(\mathcal{D }_u\) is onto for every \(u\in \mathcal{M }(x^-,x^+;\mathcal{V })\). Then \(\mathcal{M }(x^-,x^+;\mathcal{V })\) is a smooth manifold of dimension \(\mathrm{ind}_\mathcal{V }(x^-)-\mathrm{ind}_\mathcal{V }(x^+)\).

Proposition 1

(Finite set) Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V3) and assume \(\mathcal{S }_\mathcal{V }\) is Morse–Smale below level \(a\) in the sense that every \(u\in \mathcal{M }(x^-,x^+;\mathcal{V })\) is regular (i.e. the Fredholm operator \(\mathcal{D }_u\) is surjective) for every pair \(x^\pm \in \mathcal{P }^a(\mathcal{V })\). Then the quotient space

$$\begin{aligned} \widehat{\mathcal{M }}(x^-,x^+;\mathcal{V }) :=\mathcal{M }(x^-,x^+;\mathcal{V })/\mathbb{R }\end{aligned}$$

is a finite set for every such pair of Morse index difference one. Here the (free) action of \(\mathbb{R }\) is given by time shift \( (\sigma ,u)\mapsto u(\sigma +\cdot ,\cdot ) \).
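
To see why the time shift action is free in this situation, here is a short argument, sketched for illustration. Suppose \(u(\sigma +\cdot ,\cdot )=u\) for some \(\sigma \ne 0\). Then the function \(s\mapsto \mathcal{S }_\mathcal{V }(u_s)\) is \(\sigma \)-periodic. Since (7) is the negative \(L^2\) gradient flow equation of \(\mathcal{S }_\mathcal{V }\), this function is also nonincreasing, hence constant, so

$$\begin{aligned} 0 = \frac{d}{ds}\mathcal{S }_\mathcal{V }(u_s) = -\int _0^1\left|{\partial }_su\right|^2\,dt \end{aligned}$$

for every \(s\in \mathbb{R }\). Thus \({\partial }_su\equiv 0\) and \(x^-=x^+\), which is impossible if the Morse indices of \(x^-\) and \(x^+\) differ by one.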

Theorem 7

(Refined implicit function theorem) Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V3) and a pair of nondegenerate critical points \(x^\pm \in \mathcal{P }(\mathcal{V })\) with \(\mathcal{S }_\mathcal{V }(x^+)<\mathcal{S }_\mathcal{V }(x^-)\) and Morse index difference one. Then, for every \(p>2\) and every large constant \(c_0>1\), there are positive constants \(\delta _0\) and \(c\) such that the following holds. Assume \(\mathcal{S }_\mathcal{V }\) is Morse–Smale below level \(2c_0^2\). Assume further that \(u:\mathbb{R }\times S^1\rightarrow M\) is a smooth map such that \(u(s,\cdot )\) converges in \(W^{1,2}(S^1)\) to \(x^\pm \), as \(s\rightarrow \pm \infty \), and such that

$$\begin{aligned} \left|{\partial }_su(s,t)\right| \le \frac{c_0}{1+s^2},\quad \left|{\partial }_tu(s,t)\right| \le c_0,\quad \left|\nabla {}_{t}{\partial }_tu(s,t)\right| \le c_0,\quad \end{aligned}$$

for all \((s,t)\in \mathbb{R }\times S^1\) and

$$\begin{aligned} \left\Vert {{\partial }_su-\nabla {}_{t}{\partial }_tu -\mathrm{grad }\mathcal{V }(u)} \right\Vert_p\le \delta _0. \end{aligned}$$

Then there exist \(u_*\in \mathcal{M }(x^-,x^+;\mathcal{V })\) and \(\xi ^*\in \mathrm{im\, }\mathcal{D }_{u_*}^*\cap \mathcal{W }^{1,p}_{u_*}\) which satisfy

$$\begin{aligned} u=\exp _{u_*}(\xi ^*),\quad \left\Vert {\xi ^*} \right\Vert_\mathcal{W }\le c \left\Vert {{\partial }_su-\nabla {}_{t}{\partial }_tu-\mathrm{grad }\mathcal{V }(u)} \right\Vert_p. \end{aligned}$$

In the previous theorem “\(c_0\) large” means that the constant \(c_0\) should be larger than the constant \(C\) in axiom (V0). Recall that a subset of a complete metric space is called residual if it contains a countable intersection of open and dense sets. By Baire’s category theorem a residual subset is dense. Throughout singular homology \(\mathrm{H}_*\) is meant with integer coefficients.

Theorem 8

(Transversality) Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V3) and assume \(\mathcal{S }_{\mathcal{V }}\) is Morse. Then for every regular value \(a\) there is a complete metric space \(\mathcal{O }^a(\mathcal{V })\) of perturbations supported away from \(\mathcal{P }^a(\mathcal{V })\) and satisfying (V0)–(V3) such that the following is true. If \(v\in \mathcal{O }^a(\mathcal{V })\), then

$$\begin{aligned} \mathcal{P }^a(\mathcal{V })=\mathcal{P }^a(\mathcal{V }+v), \quad \mathrm{H}_*\left(\left\{ \mathcal{S }_\mathcal{V }\le a\right\} \right) \cong \mathrm{H}_*\left(\left\{ \mathcal{S }_{\mathcal{V }+v}\le a\right\} \right). \end{aligned}$$

Moreover, there is a residual subset \(\mathcal{O }^a_{reg}(\mathcal{V })\subset \mathcal{O }^a(\mathcal{V })\) such that for each \(v\in \mathcal{O }^a_{reg}(\mathcal{V })\) the perturbed functional \(\mathcal{S }_{\mathcal{V }+v}\) is Morse–Smale below level \(a\).

1.3 Outlook

The next step is to relate heat flow homology \(\mathrm{HM}_*\) to singular homology of the loop space. In our forthcoming paper [22] we establish the following result.

Theorem 9

Assume \(\mathcal{S }_V\) is Morse and \(a\) is a regular value of \(\mathcal{S }_V\). Then there is a natural isomorphism

$$\begin{aligned} \mathrm{HM}_*^a(\mathcal{L }M,\mathcal{S }_V) \cong \mathrm{H}_*(\mathcal{L }^a M), \quad \mathcal{L }^a M:=\{\gamma \in \mathcal{L }M\mid \mathcal{S }_V(\gamma )\le a\}. \end{aligned}$$

If \(M\) is not simply connected, then there is a separate isomorphism for each component of the loop space. For \(a<b\) the isomorphism commutes with the homomorphisms \( \mathrm{HM}_*^a(\mathcal{L }M,\mathcal{S }_V) \rightarrow \mathrm{HM}_*^b(\mathcal{L }M,\mathcal{S }_V) \) and \( \mathrm{H}_*(\mathcal{L }^a M) \rightarrow \mathrm{H}_*(\mathcal{L }^b M) \).

For a \(C^1\) gradient flow on a Banach manifold, where the Morse functional is bounded below and its critical points are of finite Morse index, Abbondandolo and Majer [1] proved the existence of a natural isomorphism between singular homology and Morse homology. The geometric idea is that the unstable manifolds carry the homologically relevant information. A major point is to construct a cellular filtration of \(\mathcal{L }^a M\) by open forward flow invariant subsets \(F_0\subset F_1\subset \ldots \subset F_N\subset \mathcal{L }^a M\) such that \(F_k\) contains all critical points up to Morse index \(k\) and such that relative singular homology \(\mathrm{H}_\ell (F_k,F_{k-1})\) is isomorphic to the free abelian group generated by the critical points of index \(k\) in case \(\ell =k\) and it is trivial otherwise. The idea of their construction is the following. Let \(F_0\) be a union of disjoint, open, and forward flow invariant neighborhoods of the critical points of index zero. Then fix small neighborhoods of the index one critical points and consider the set exhausted by the forward flow (which runs into \(F_0\) by the Morse–Smale condition). Now take the union of this set with \(F_0\) to obtain \(F_1\). Clearly \(F_1\) is forward flow invariant. Moreover, it is open, because the time-\(t\)-map of the flow is an open map. Continue with the index two points.

Unfortunately, the time-\(t\)-map for the semiflow generated by the heat equation does not take open sets to open sets due to the extremely strong regularizing nature of the heat flow. So new ideas are required. In [22] we define and use Conley index pairs for the critical points in the infinite dimensional situation at hand. Recall that solving the forward time Cauchy problem for the heat equation (7) for initial values in the Hilbert manifold \(\Lambda M=W^{1,2}(S^1,M)\) leads to existence of a continuous semiflow

$$\begin{aligned} \varphi :[0,\infty )\times \Lambda ^a M \rightarrow \Lambda ^a M, \end{aligned}$$

see [20]. Now a simple but crucial consequence of continuity of the time-\(T\)-map is that the preimage \({\varphi _T}^{-1}(F_0)\) is an open subset of \(\Lambda ^a M\). Here \(F_0\) is an open set consisting of local (strict) sublevel sets near the index zero critical points. Moreover, for \(T>0\) sufficiently large \(\varphi _T\) maps the exit set \(L_1\) (of the Conley index pair \((N_1,L_1)\) associated to the index one critical points) into \(F_0\). Hence \(F_1:=N_1\cup {\varphi _T}^{-1}(F_0)\) is semiflow invariant (and open, since \(N_1\) is open). Continue with index two.

1.4 Overview

In Appendix A we recall from [20], for the convenience of the reader, the definition of the relevant parabolic spaces \(\mathcal{W }^{k,p}\) and \(\mathcal{C }^{k,p}\) and the parabolic bootstrap Proposition 12. As a side remark, its proof, and hence Theorem 2, relies on the \(L^p\) product estimate [21, le. 4.1] which allows one to deal with the quadratic first order part of the heat equation (7).

In Sect. 2 we study the solutions \(u\) to the heat equation (7). Since \({\partial }_su\) solves the linearized equation the results of [21] are available. In Sect. 2.1 we prove smoothness of \(\mathcal{W }^{1,p}_{loc}\) solutions and a compactness result for sequences with uniformly bounded gradient with respect to appropriate norms. In Sects. 2.2–2.4 boundedness of the action is a crucial assumption. Fix a positive constant \(c_0\). Then all solutions \(u\) of (7) with \( \sup _{s\in \mathbb{R }} \mathcal{S }_\mathcal{V }(u_s) \le c_0 \) admit a uniform a priori estimate for \(\mathopen \Vert {{\partial }_tu} \mathclose \Vert _\infty \) (Theorem 12), uniform energy bounds (Lemma 2), uniform gradient bounds (Theorem 13), and uniform \(L^2\) exponential decay (Theorem 14). In Sect. 2.5 we study compactness of the moduli spaces \(\mathcal{M }(x^-,x^+;\mathcal{V })\) in the case that \(\mathcal{S }_\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) is a Morse function.

Section 3 deals with implicit function theorems. Here, in addition to the Morse condition, the Morse–Smale condition enters: To prove that the moduli spaces are smooth manifolds we not only need nondegeneracy of the asymptotic boundary data (the critical points \(x^\pm \)) but in addition surjectivity of the linearized operators. Under these assumptions Proposition 1 asserts that modulo time shift there are only finitely many heat flow lines from \(x^-\) to \(x^+\) whenever the Morse index difference is one. Here the compactness results of Sect. 2.5 enter. Furthermore, we prove the refined implicit function Theorem 7, a major technical tool in [13]. Here the required quadratic estimates use again the product estimate [21, le. 4.1]. Furthermore, the choice of the sublevel set on which \(\mathcal{S }_\mathcal{V }\) needs to be Morse–Smale requires care. The reason is that one starts out only with an approximate solution \(u\) along which the action is not necessarily decreasing. However, the assumptions guarantee that all loops \(u_s\) are contained in the sublevel set \(\{\mathcal{S }_\mathcal{V }\le 2c_0^2\}\).
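
The last claim can be checked as follows. This is a sketch in which \(C\) denotes the constant from axiom (V0) and \(c_0>\max \{1,C\}\), as required in Theorem 7. The pointwise bound \(\left|{\partial }_tu\right|\le c_0\) together with (5) gives

$$\begin{aligned} \mathcal{S }_\mathcal{V }(u_s) = \frac{1}{2}\int _0^1\left|{\partial }_tu(s,t)\right|^2\,dt -\mathcal{V }(u_s) \le \frac{1}{2}c_0^2+C \le \frac{1}{2}c_0^2+c_0^2 \le 2c_0^2 \end{aligned}$$

for every \(s\in \mathbb{R }\), where we used \(C<c_0<c_0^2\).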

Section 4 deals with unique continuation for the linear and the nonlinear heat equation based on an extension of a result by Agmon and Nirenberg. Backward unique continuation for a forward semiflow may be surprising. Of course, there is an assumption: If the action along the two semi-infinite backward trajectories \(u,v\) which coincide at time \(s=0\) is bounded, then \(u=v\).

In Sect. 5 we construct a separable Banach space \(Y\) of abstract perturbations that satisfy axioms (V0)–(V3). Assume \(\mathcal{S }_\mathcal{V }\) is Morse and \(a\) is a regular value. Then we define a Banach submanifold \(\mathcal{O }^a(\mathcal{V })\) of admissible perturbations \(v\). These have the property that \(\mathcal{S }_\mathcal{V }\) and \(\mathcal{S }_{\mathcal{V }+v}\) do have the same critical points on their respective sublevel sets associated to \(a\) and, moreover, both sublevel sets are homologically equivalent. The proof that there is a residual subset \(\mathcal{O }^a_{reg}(\mathcal{V })\) of regular perturbations for which \(\mathcal{S }_{\mathcal{V }+v}\) is Morse–Smale below level \(a\) requires unique continuation for the linearized heat equation and the fact that the action is strictly decreasing along nonconstant heat flow trajectories.

In Sect. 6 we define Morse homology for the heat flow. In Sect. 6.1 we define the unstable manifold of a critical point \(x\) of the action functional \(\mathcal{S }_\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) as the set of endpoints at time zero of all backward halfcylinders solving the heat equation (7) and emanating from \(x\) at \(-\infty \). The main result is Theorem 18 saying that if the critical point \(x\) is nondegenerate, then this is a contractible submanifold of the loop space and its dimension equals the Morse index of \(x\). Here we use unique continuation for the linear and the nonlinear heat equation. In Sect. 6.2 we put together everything to define the Morse complex for the negative \(L^2\) gradient of the action functional on the loop space.

Note that despite the title of this text the fact that the heat equation generates a forward semiflow is nowhere used. In contrast we study the heat equation in analogy to Floer theory in terms of a boundary value problem for infinite cylinders in \(M\) which are solutions of the (parabolic) PDE (7). However, the semiflow point of view will be useful to construct a natural isomorphism to singular homology of the loop space via Conley theory in our forthcoming paper [22].

Notation

If \(f=f(s,t)\) denotes a map, then \(f_s\) abbreviates the map \(f(s,\cdot ):t\mapsto f(s,t)\). In contrast partial derivatives are denoted by \({\partial }_sf\) and \({\partial }_tf\).

2 Solutions of the nonlinear heat equation

2.1 Regularity and compactness

Throughout Sect. 2.1 embed the compact Riemannian manifold \(M\) isometrically into some Euclidean space \(\mathbb{R }^N\) and view any continuous map \(u:Z=(-T,0]\times S^1\rightarrow M\) as a map into \(\mathbb{R }^N\) taking values in \(M\). We indicate this by the notation \(u:Z\rightarrow M\hookrightarrow \mathbb{R }^N\). Then the heat equation (7) is of the form

$$\begin{aligned} {\partial }_su-{\partial }_t{\partial }_tu =\Gamma (u)\left({\partial }_tu,{\partial }_tu\right) +F. \end{aligned}$$
(12)

Here and throughout this section \(\Gamma \) denotes the second fundamental form associated to the embedding \(M\hookrightarrow \mathbb{R }^N\) and the map \(F:Z\rightarrow \mathbb{R }^N\) is given by

$$\begin{aligned} F(s,t):=(\mathrm{grad }\mathcal{V }(u_s))(t). \end{aligned}$$
(13)

Recall the definition of the \(\mathcal{W }^{k,p}\) and the \(\mathcal{C }^k\) norm in (79) and (80), respectively.

Proposition 2

Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V3), constants \(p>2\) and \(\mu _0>0\), and cylinders

$$\begin{aligned} Z=(-T,0]\times S^1 ,\quad Z^\prime =(-T^\prime ,0]\times S^1 ,\quad T>T^\prime >0. \end{aligned}$$

Then for every integer \(k\ge 1\) there is a constant \(c_k=c_k(p,\mu _0,T,T^\prime ,\mathcal{V })\) such that the following is true. If \(u:Z\rightarrow M\hookrightarrow \mathbb{R }^N\) is a \(\mathcal{W }^{1,p}\) map such that

$$\begin{aligned} \left\Vert {u} \right\Vert_p +\left\Vert {{\partial }_su} \right\Vert_p +\left\Vert {{\partial }_tu} \right\Vert_p +\left\Vert {{\partial }_t {\partial }_tu} \right\Vert_p \le \mu _0 \end{aligned}$$
(14)

and which satisfies the heat equation (12) almost everywhere, then

$$\begin{aligned} \left\Vert {u} \right\Vert_{\mathcal{W }^{k,p}(Z^\prime ,{\mathbb{R }}^N)} \le c_k. \end{aligned}$$

Proposition 2 follows by induction from the bootstrap Proposition 12 and Lemma 1 below. By standard arguments it implies the following two results.

Theorem 10

(Regularity) Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V3) and constants \(p>2\) and \(a<b\). Let \(u\) be a map \((a,b]\times S^1\rightarrow M\hookrightarrow \mathbb{R }^N\) which is of Sobolev class \(\mathcal{W }^{1,p}\) and solves the heat equation (12) almost everywhere. Then \(u\) is smooth.

Theorem 11

(Compactness) Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V3) and constants \(p>2\) and \(a<b\). Let \(u^\nu :(a,b]\times S^1\rightarrow M\hookrightarrow \mathbb{R }^N\) be a sequence of smooth solutions of the heat equation (12) such that

$$\begin{aligned} \sup _\nu \left\Vert {{\partial }_tu^\nu } \right\Vert_\infty +\sup _\nu \left\Vert {{\partial }_su^\nu } \right\Vert_p <\infty . \end{aligned}$$

Then there is a smooth solution \(u:(a,b]\times S^1\rightarrow M\) of (12) and a subsequence, still denoted by \(u^\nu \), such that \(u^\nu \) converges to \(u\), uniformly with all derivatives on every compact subset of \((a,b]\times S^1\).

Proof of Proposition 2

Consider the family \(T_r:=T^\prime +\frac{T-T^\prime }{r}\), \(r\in [1,\infty )\), and the corresponding nested sequence of cylinders

$$\begin{aligned} Z_r:=(-T_r,0]\times S^1 ,\quad Z=Z_1\supset Z_2\supset Z_3\supset \cdots \supset Z^\prime . \end{aligned}$$
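
As a quick check of the nesting (a sketch using only the definition of \(T_r\) above), the parameter \(T_r\) decreases strictly from \(T\) towards \(T^\prime \):

$$\begin{aligned} T_1=T,\qquad T_r-T_{r^\prime }=(T-T^\prime )\left(\frac{1}{r}-\frac{1}{r^\prime }\right)>0 \quad \text{for } 1\le r<r^\prime ,\qquad \lim _{r\rightarrow \infty }T_r=T^\prime . \end{aligned}$$

In particular \(Z_r\supset Z_{r^\prime }\supset Z^\prime \) whenever \(r<r^\prime \), and this also covers the non-integer indices such as \(Z_{4/3}\) and \(Z_{5/3}\) that appear in the proof.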

Denote by \(C_0\) the constant in (V0). More generally, for \(\ell \ge 1\) choose \(C_\ell \) larger than \(C_{\ell -1}\) and larger than all constants \(C(k^\prime ,\ell ^\prime ,\mathcal{V })\) in (V3) for which \(2k^\prime +\ell ^\prime \le \ell \).

Claim

The map \(F\) given by (13) is in \(\mathcal{W }^{\ell ,p}(Z_{\ell +1})\) for every integer \(\ell \ge 1\).

This implies Proposition 2: For any integer \(k\ge 1\) we have \(F\in \mathcal{W }^{k,p}(Z_{k+1})\) by the claim. Furthermore, by the inclusion \(Z_{k+1}\subset Z\) and (14)

$$\begin{aligned} \left\Vert {u} \right\Vert_{\mathcal{W }^{1,p}(Z_{k+1})} \le \left\Vert {u} \right\Vert_{\mathcal{W }^{1,p}(Z)} \le \mu _0. \end{aligned}$$

Hence by Corollary 2 for the pair \(Z_{k+2}\subset Z_{k+1}\) there is a constant \(c_{k+1}\) depending on \(p\), \(\mu _0\), \(Z_{k+2}\), \(Z_{k+1}\), \(\mathopen \Vert {\Gamma } \mathclose \Vert _{C^{2k+2}}\), and \(\mathopen \Vert {F} \mathclose \Vert _{\mathcal{W }^{k,p}(Z_{k+1})}\) such that

$$\begin{aligned} \left\Vert {u} \right\Vert_{\mathcal{W }^{k+1,p}(Z^\prime )} \le \left\Vert {u} \right\Vert_{\mathcal{W }^{k+1,p}(Z_{k+2})} \le c_{k+1}. \end{aligned}$$

It remains to prove the claim. The proof is by induction.

Step \({\varvec{\ell }}=\mathbf{1}\). We need to prove that \(F\), \({\partial }_tF\), \({\partial }_sF\), and \({\partial }_t{\partial }_tF\) are in \(L^p(Z_2)\). The domain of all norms of \(\Gamma \) and its derivatives is the compact manifold \(M\). The domain of all other norms is the cylinder \(Z\) unless indicated differently. By axiom (V0) with constant \(C_0\) it follows (even on the larger domain \(Z\)) that

$$\begin{aligned} \left\Vert {F} \right\Vert_\infty =\sup _{s\in (-T,0]} \left\Vert {\mathrm{grad }\mathcal{V }(u_s)} \right\Vert_{L^\infty (S^1)} \le C_0 \end{aligned}$$
(15)

and therefore \( \mathopen \Vert {F} \mathclose \Vert _p \le \mathopen \Vert {F} \mathclose \Vert _\infty \left(\mathrm{Vol}\, Z\right)^{1/p} \le C_0 T^{1/p} \). Next we use axiom (V1) with constant \(C_1\ge C_0\) to obtain that

$$\begin{aligned} \left\Vert {{\partial }_t F} \right\Vert_p&\le \left\Vert {\nabla {}_{t}\mathrm{grad }\mathcal{V }(u)} \right\Vert_p +\left\Vert {\Gamma (u)\left({\partial }_tu, \mathrm{grad }\mathcal{V }(u)\right)} \right\Vert_p\\&\le C_1\left(1+\left\Vert {{\partial }_tu} \right\Vert_p \right) +\left\Vert {\Gamma } \right\Vert_\infty \left\Vert {{\partial }_tu} \right\Vert_p \left\Vert {F} \right\Vert_\infty \\&\le C_1(1+\mu _0)+\left\Vert {\Gamma } \right\Vert_\infty \mu _0 C_0. \end{aligned}$$

Here we used the assumption (14) in the last step. Now by the bootstrap Proposition 12 (i) for \(k=1\) and the pair \(Z_{4/3}\subset Z\) there is a constant \(a_1\) depending on \(p\), \(\mu _0\), \(Z_{4/3}\), \(Z\), \(\mathopen \Vert {\Gamma } \mathclose \Vert _{C^4}\), and the \(L^p(Z)\) norms of \(F\) and \({\partial }_tF\) such that \( \mathopen \Vert {{\partial }_tu} \mathclose \Vert _{\mathcal{W }^{1,p}(Z_{4/3})} \le a_1 \). Then by the Sobolev embedding \(W^{1,p}\hookrightarrow C^0\) with constant \(c^\prime =c^\prime (p,Z_{5/3})\) it follows that \({\partial }_tu\) is continuous on \(Z_{4/3}\) and

$$\begin{aligned} \left\Vert {{\partial }_tu} \right\Vert_{C^0(Z_{5/3})} \le c^\prime \left\Vert {{\partial }_tu} \right\Vert_{\mathcal{W }^{1,p}(Z_{5/3})} \le a_1 c^\prime . \end{aligned}$$
(16)

Again using axiom (V1) we obtain similarly that

$$\begin{aligned} \left\Vert {{\partial }_s F} \right\Vert_p&\le \left\Vert {\nabla {}_{s}\mathrm{grad }\mathcal{V }(u)} \right\Vert_p +\left\Vert {\Gamma (u)\left({\partial }_su, \mathrm{grad }\mathcal{V }(u)\right)} \right\Vert_p \\&\le 2C_1\left\Vert {{\partial }_su} \right\Vert_p +\left\Vert {\Gamma } \right\Vert_\infty \left\Vert {{\partial }_su} \right\Vert_p \left\Vert {F} \right\Vert_\infty \\&\le \mu _0\left(2C_1+\left\Vert {\Gamma } \right\Vert_\infty C_0\right). \end{aligned}$$

In order to estimate \({\partial }_t{\partial }_tF\) observe first that

$$\begin{aligned} \left\Vert {\nabla {}_{t}{\partial }_tu} \right\Vert_{L^p(Z_{5/3})}&\le \left\Vert {{\partial }_t{\partial }_tu} \right\Vert_{L^p(Z_{5/3})} +\left\Vert {\Gamma } \right\Vert_\infty \left\Vert {\left|{\partial }_tu\right|\cdot \left|{\partial }_tu\right|} \right\Vert_{L^p(Z_{5/3})}\\&\le \mu _0 +\left\Vert {\Gamma } \right\Vert_\infty \left\Vert {{\partial }_tu} \right\Vert_{C^0(Z_{5/3})} \left\Vert {{\partial }_tu} \right\Vert_{L^p(Z_{5/3})}\\&\le \mu _0 +\left\Vert {\Gamma } \right\Vert_\infty a_1c^\prime \mu _0. \end{aligned}$$

Here the last step uses assumption (14) and the \(C^0\) estimate (16) for \({\partial }_tu\) which requires shrinking of the domain. Now by axiom (V3) for \(k=0\) and \(\ell =2\) there is a constant still denoted by \(C_1=C_1(\mathcal{V })\) such that

$$\begin{aligned} \left|\nabla {}_{t}\nabla {}_{t} F\right| \le C_1\Bigl (1+\left|{\partial }_tu\right| +\left|\nabla {}_{t}{\partial }_tu\right| \Bigr ) \end{aligned}$$
(17)

pointwise for every \((s,t)\). Integrate this inequality to the power \(p\) to get that

$$\begin{aligned} \left\Vert {\nabla {}_{t}\nabla {}_{t} F} \right\Vert_{L^p(Z_{5/3})}&\le C_1\left(1+\left\Vert {{\partial }_tu} \right\Vert_{L^p(Z_{5/3})} +\left\Vert {\nabla {}_{t}{\partial }_tu} \right\Vert_{L^p(Z_{5/3})} \right)\\&\le C_1\left(1+2\mu _0 +\left\Vert {\Gamma } \right\Vert_\infty a_1c^\prime \mu _0 \right). \end{aligned}$$

By straightforward calculation we obtain

$$\begin{aligned} \left\Vert {{\partial }_t{\partial }_t F} \right\Vert_{L^p(Z_{5/3})}&\le \left\Vert {\nabla {}_{t}\nabla {}_{t} F} \right\Vert_{L^p} +\left\Vert {d\Gamma } \right\Vert_\infty \left\Vert {{\partial }_tu} \right\Vert_{C^0} \left\Vert {{\partial }_tu} \right\Vert_{L^p}\left\Vert {F} \right\Vert_{C^0}\\&+\left\Vert {\Gamma } \right\Vert_\infty \left\Vert {{\partial }_t{\partial }_tu} \right\Vert_{L^p} \left\Vert {F} \right\Vert_{C^0} +2\left\Vert {\Gamma } \right\Vert_\infty \left\Vert {{\partial }_tu} \right\Vert_{C^0} \left\Vert {{\partial }_tF} \right\Vert_{L^p}\\&+\left\Vert {\Gamma } \right\Vert_\infty ^2 \left\Vert {{\partial }_tu} \right\Vert_{C^0} \left\Vert {{\partial }_tu} \right\Vert_{L^p} \left\Vert {F} \right\Vert_{C^0} \end{aligned}$$

where all \(C^0\) and \(L^p\) norms are on the domain \(Z_{5/3}\). Now the right hand side is bounded by a constant \(c=c(p,\mu _0,c^\prime ,C_1,\mathopen \Vert {\Gamma } \mathclose \Vert _{C^1})\) by assumption (14), the estimates for \(F\) and its derivatives obtained earlier, and (16).

Induction step \({\varvec{\ell }}\Rightarrow {\varvec{\ell }}+\mathbf 1 \). Let \(\ell \ge 1\) and assume that the claim is true for \(\ell \). This means that \(F\) is in \(\mathcal{W }^{\ell ,p}(Z_{\ell +1})\) and therefore \( \alpha _\ell :=\mathopen \Vert {F} \mathclose \Vert _{\mathcal{W }^{\ell ,p}(Z_{\ell +1})}<\infty \). Hence by Corollary 2 for the integer \(\ell \) and the pair of sets \(Z_{\ell +1}\supset Z_{\ell +3/2}\) there is a constant \(c_{\ell }=c_\ell (p,\mu _0,T_{\ell +1},T_{\ell +3/2}, \mathopen \Vert {\Gamma } \mathclose \Vert _{C^{2\ell +2}},\alpha _\ell )\) such that

$$\begin{aligned} \left\Vert {u} \right\Vert_{\mathcal{W }^{\ell +1,p}(Z_{\ell +3/2})} \le c_\ell ,\quad \left\Vert {u} \right\Vert_{\mathcal{C }^\ell (Z_{\ell +3/2})} \le c_\ell . \end{aligned}$$
(18)

The second inequality follows from the first by the Sobolev embedding \(W^{1,p}\hookrightarrow C^0\) applied to each term in the \(\mathcal{C }^\ell \) norm. Then choose \(c_\ell \) larger, if necessary. It remains to prove that the \(\mathcal{W }^{\ell ,p}(Z_{\ell +2})\) norms of \({\partial }_tF\), \({\partial }_sF\), and \({\partial }_t{\partial }_tF\) are finite. Similarly as in step \(\ell =1\) we obtain that

$$\begin{aligned} \left\Vert {{\partial }_t F} \right\Vert_{\mathcal{W }^{\ell ,p}(Z_{\ell +3/2})}&\le \left\Vert {\nabla {}_{t} F} \right\Vert_{\mathcal{W }^{\ell ,p}}+\left\Vert {\Gamma (u)\left({\partial }_tu,F\right)} \right\Vert_{\mathcal{W }^{\ell ,p}}\\&\le C_1\left(\left\Vert {1} \right\Vert_{\mathcal{W }^{\ell ,p}} +\left\Vert {{\partial }_tu} \right\Vert_{\mathcal{W }^{\ell ,p}}\right)\\&+\tilde{c}\left\Vert {\Gamma } \right\Vert_{\mathcal{C }^\ell } \left( \left\Vert {{\partial }_tu} \right\Vert_{\mathcal{W }^{\ell ,p}} \left\Vert {F} \right\Vert_\infty +\left\Vert {u} \right\Vert_{\mathcal{C }^\ell } \left\Vert {F} \right\Vert_{\mathcal{W }^{\ell ,p}} \right)\\&\le C_1\,(T^{1/p}+c_\ell ) +\tilde{c}\left\Vert {\Gamma } \right\Vert_{\mathcal{C }^\ell } \left( c_\ell C_0+c_\ell \alpha _\ell \right). \end{aligned}$$

Here the domain of all norms, except the one of \(\Gamma \), is \(Z_{\ell +3/2}\). The first step is by definition of the covariant derivative and the triangle inequality. Step two uses axiom (V1) and Lemma 1 with constant \(\tilde{c}\). The last step uses the estimates (15), (18), and the definition of \(\alpha _\ell \) in the induction hypothesis. Now by the refined bootstrap Proposition 12 there is a constant \(a_{\ell +1}\) such that

$$\begin{aligned} \left\Vert {{\partial }_tu} \right\Vert_{\mathcal{W }^{\ell +1,p}(Z_{\ell +2})} \le a_{\ell +1},\quad \left\Vert {{\partial }_tu} \right\Vert_{\mathcal{C }^\ell (Z_{\ell +2})} \le a_{\ell +1}. \end{aligned}$$
(19)

Next observe that

$$\begin{aligned}&\left\Vert {{\partial }_s F} \right\Vert_{\mathcal{W }^{\ell ,p}(Z_{\ell +2})}\\&\quad \le \left\Vert {\nabla {}_{s} F} \right\Vert_{\mathcal{W }^{\ell ,p}} +\left\Vert {\Gamma (u)\left({\partial }_su,F\right)} \right\Vert_{\mathcal{W }^{\ell ,p}}\\&\quad \le 2C_1\left\Vert {{\partial }_su} \right\Vert_{\mathcal{W }^{\ell ,p}} +C^\prime \left\Vert {\Gamma } \right\Vert_{\mathcal{C }^\ell } \left( \left\Vert {{\partial }_su} \right\Vert_{\mathcal{W }^{\ell ,p}} \left\Vert {F} \right\Vert_\infty +\left(\left\Vert {u} \right\Vert_{\mathcal{C }^\ell }+\left\Vert {{\partial }_tu} \right\Vert_{\mathcal{C }^\ell }\right) \left\Vert {F} \right\Vert_{\mathcal{W }^{\ell ,p}}\right)\\&\quad \le 2C_1c_\ell +C^\prime \left\Vert {\Gamma } \right\Vert_{\mathcal{C }^\ell } \left( c_\ell C_0 +(c_\ell +a_{\ell +1})\alpha _\ell \right). \end{aligned}$$

Here the domain of all norms, except the one of \(\Gamma \), is \(Z_{\ell +2}\). Again the first step is by definition of the covariant derivative and the triangle inequality. Step two uses axiom (V1) and Lemma 1 with constant \(C^\prime \). The last step uses the estimates (15), (18), (19), and the definition of \(\alpha _\ell \) in the induction hypothesis. Similarly as in step \(\ell =1\) we obtain that

$$\begin{aligned}&\left\Vert {{\partial }_t{\partial }_t F} \right\Vert_{\mathcal{W }^{\ell ,p}(Z_{\ell +2})}\\&\quad \le \left\Vert {\nabla {}_{t}\nabla {}_{t} F} \right\Vert_{\mathcal{W }^{\ell ,p}} +\left\Vert {d\Gamma (u)\left({\partial }_tu,{\partial }_tu,F\right)} \right\Vert_{\mathcal{W }^{\ell ,p}}\\&\qquad +\left\Vert {\Gamma (u)\left({\partial }_t{\partial }_tu,F\right)} \right\Vert_{\mathcal{W }^{\ell ,p}} +2\left\Vert {\Gamma (u)\left({\partial }_tu,{\partial }_tF\right)} \right\Vert_{\mathcal{W }^{\ell ,p}}\\&\qquad +\left\Vert {\Gamma (u)\left({\partial }_tu,\Gamma (u)\left({\partial }_tu,F\right) \right)} \right\Vert_{\mathcal{W }^{\ell ,p}}\\&\quad \le C_1\left(T^{1/p} +\left\Vert {{\partial }_tu} \right\Vert_{\mathcal{W }^{\ell ,p}} +\left\Vert {{\partial }_t{\partial }_tu} \right\Vert_{\mathcal{W }^{\ell ,p}} +\left\Vert {\Gamma } \right\Vert_{\mathcal{C }^\ell }\left\Vert {{\partial }_tu} \right\Vert_{\mathcal{C }^\ell } \left\Vert {{\partial }_tu} \right\Vert_{\mathcal{W }^{\ell ,p}}\right)\\&\qquad +\left\Vert {d\Gamma } \right\Vert_{\mathcal{C }^\ell } \left\Vert {{\partial }_tu} \right\Vert_{\mathcal{C }^\ell }^2\left\Vert {F} \right\Vert_{\mathcal{W }^{\ell ,p}}\\&\qquad +\tilde{c}\left\Vert {\Gamma } \right\Vert_{\mathcal{C }^\ell } \left( \left\Vert {{\partial }_t{\partial }_tu} \right\Vert_{\mathcal{W }^{\ell ,p}}\left\Vert {F} \right\Vert_\infty +\left\Vert {{\partial }_tu} \right\Vert_{\mathcal{C }^\ell }\left\Vert {F} \right\Vert_{\mathcal{W }^{\ell ,p}} \right)\\&\qquad +2\left\Vert {\Gamma } \right\Vert_{\mathcal{C }^\ell } \left\Vert {{\partial }_tu} \right\Vert_{\mathcal{C }^\ell } \left\Vert {{\partial }_tF} \right\Vert_{\mathcal{W }^{\ell ,p}} +\left\Vert {\Gamma } \right\Vert_{\mathcal{C }^\ell }^2 \left\Vert {{\partial }_tu} \right\Vert_{\mathcal{C }^\ell }^2 \left\Vert {F} \right\Vert_{\mathcal{W }^{\ell ,p}}. \end{aligned}$$

Here the domain of all norms, except the one of \(\Gamma \), is \(Z_{\ell +2}\). In the second step we used axiom (V2) with constant \(C_1\) to estimate the term \(\nabla {}_{t}\nabla {}_{t} F\) and we spelled out the covariant derivative arising in \(\nabla {}_{t}{\partial }_tu\). Moreover, crudely pulling out \(\mathcal{C }^\ell \) norms worked for all terms but the third one, the one involving \({\partial }_t{\partial }_tu\); for that term we used Lemma 1 with constant \(\tilde{c}\) for the functions \({\partial }_t{\partial }_tu\) and \(F\). Now all terms appearing on the right hand side have been estimated earlier. This proves the induction step and therefore the claim and Proposition 2. \(\square \)

Lemma 1

([20, le. 2.21, le. 4.4]) Fix a constant \(p>2\) and a bounded open subset \(\Omega \subset \mathbb{R }^2\) with area \(\mathopen |\Omega \mathclose |\). Then for every integer \(k\ge 1\) there is a constant \(c=c(k,\mathopen |\Omega \mathclose |)\) such that

$$\begin{aligned} \left\Vert {{\partial }_tu\cdot v} \right\Vert_{\mathcal{W }^{k,p}}&\le c \left(\left\Vert {{\partial }_tu} \right\Vert_{\mathcal{W }^{k,p}} \left\Vert {v} \right\Vert_\infty +\left\Vert {u} \right\Vert_{\mathcal{C }^k} \left\Vert {v} \right\Vert_{\mathcal{W }^{k,p}}\right)\\ \left\Vert {{\partial }_su\cdot v} \right\Vert_{\mathcal{W }^{k,p}}&\le c \left\Vert {{\partial }_su} \right\Vert_{\mathcal{W }^{k,p}}\left\Vert {v} \right\Vert_\infty +c\left(\left\Vert {u} \right\Vert_{\mathcal{C }^k}+\left\Vert {{\partial }_tu} \right\Vert_{\mathcal{C }^k}\right) \left\Vert {v} \right\Vert_{\mathcal{W }^{k,p}} \end{aligned}$$

for all functions \(u,v\in C^\infty (\overline{\Omega })\).

Proof of Theorem 10

Fix any point \(z\in Z=(a,b]\times S^1\) and a subcylinder \(Z^\prime =(a^\prime ,b]\times S^1\) that contains \(z\) and where \(a^\prime \in (a,b)\). Set \(\mu _0=\mathopen \Vert {u} \mathclose \Vert _{\mathcal{W }^{1,p}(Z,\mathbb{R }^N)}\). Then Proposition 2 for the function \(\tilde{u}(s,t):=u(s+b,t)\) and the constants \(T=b-a\) and \(T^\prime =b-a^\prime \) implies that

$$\begin{aligned} u\in \bigcap _{k\ge 0} \mathcal{W }^{k,p}(Z^\prime ,\mathbb{R }^N) =\bigcap _{k\ge 0} W^{k,p}(Z^\prime ,\mathbb{R }^N) =C^\infty (\overline{Z^\prime },\mathbb{R }^N). \end{aligned}$$

See [8, app. B.1] for the last step. Hence \(u\) is locally smooth. \(\square \)

Proof of Theorem 2

Theorem 10. \(\square \)

Proof of Theorem 11

Shifting the \(s\) variable by \(b\) and setting \(T=b-a\), if necessary, we may assume without loss of generality that the maps \(u^\nu \) are defined on \((-T,0]\) and, furthermore, by composition with the isometric embedding \(M\hookrightarrow \mathbb{R }^N\) that they take values in \(\mathbb{R }^N\). All norms are taken on the domain \((-T,0]\times S^1\), unless indicated otherwise. To apply Proposition 2 we need to verify that the maps \(u^\nu :(-T,0]\times S^1\rightarrow \mathbb{R }^N\) satisfy the four a priori estimates in (14) for some constant \(\mu _0\) independent of \(\nu \). To see this observe that

$$\begin{aligned} \left\Vert {u^\nu } \right\Vert_p \le \left\Vert {u^\nu } \right\Vert_\infty \left(\mathrm{Vol}\, ((-T,0]\times S^1)\right)^{1/p} \le c_1 T^{1/p} \end{aligned}$$

for some constant \(c_1\) depending only on the isometric embedding \(M\hookrightarrow \mathbb{R }^N\) and the diameter of the compact manifold \(M\). By assumption there is a constant \(c_2\) independent of \(\nu \) such that \( \mathopen \Vert {{\partial }_tu^\nu } \mathclose \Vert _p \le \mathopen \Vert {{\partial }_tu^\nu } \mathclose \Vert _\infty T^{1/p} \le c_2 T^{1/p} \) and \( \mathopen \Vert {{\partial }_su^\nu } \mathclose \Vert _p \le c_2 \). Then it follows by the heat equation (12) that

$$\begin{aligned} \left\Vert {\nabla {}_{t}{\partial }_tu^\nu } \right\Vert_p \le \left\Vert {{\partial }_su^\nu } \right\Vert_p +\left\Vert {\mathrm{grad }\mathcal{V }(u^\nu )} \right\Vert_p \le c_2+C_0T^{1/p}. \end{aligned}$$

In the second step we used (V0) to estimate \(\mathrm{grad }\mathcal{V }(u^\nu )\) in \(L^\infty \) from above by a constant \(C_0=C_0(\mathcal{V })\). By definition of the covariant derivative

$$\begin{aligned} \left\Vert {{\partial }_t{\partial }_tu^\nu } \right\Vert_p&\le \left\Vert {\nabla {}_{t}{\partial }_tu^\nu } \right\Vert_p +\left\Vert {\Gamma } \right\Vert_{C^0(M)} \left\Vert {{\partial }_tu^\nu } \right\Vert_\infty \left\Vert {{\partial }_tu^\nu } \right\Vert_p\\&\le c_2+C_0T^{1/p} +c_2^2T^{1/p}\left\Vert {\Gamma } \right\Vert_{C^0(M)}. \end{aligned}$$

Now set \( \mu _0 :=c_2+C_0T^{1/p} +c_2^2T^{1/p}\left\Vert {\Gamma } \right\Vert_{C^0(M)} +(c_1+c_2)T^{1/p} \). Then Proposition 2 asserts that for every constant \(T^\prime \in (0,T)\) and every integer \(k\ge 2\) there is a constant \(c_k=c_k(p,\mu _0,T,T^\prime ,\mathcal{V })\) such that \( \left\Vert {u^\nu } \right\Vert_{\mathcal{W }^{k,p}(Q,\mathbb{R }^N)} \le c_k \) where \(Q=[-T^\prime ,0]\times S^1\). Recall that the inclusion \(W^{k,p}(Q)\hookrightarrow C^{k-1}(Q)\) is compact; see e.g. [8, B.1.11]. Hence there is a subsequence which converges on \(Q\) in the \(C^{k-1}\) topology. We denote the limit by \(u\in C^{k-1}(Q)\). Since this is true for every \(k\ge 2\) there is a subsequence, still denoted by \(u^\nu \), converging on \(Q\) to \(u\), uniformly with all derivatives. Since this is true for every compact subcylinder \(Q\) of \((-T,0]\times S^1\), the theorem follows by choosing a diagonal subsequence associated to an exhausting sequence by such \(Q\)’s. Because, in particular, the convergence is in \(C^0\) and the \(u^\nu \) take values in \(M\), so does the limit \(u\). Since the convergence is in particular in \(C^2\), the limit \(u\) satisfies the heat equation (12). \(\square \)

2.2 An a priori estimate

Theorem 12

Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V1) and a constant \(c_0\). Then there is a constant \(C=C(c_0,\mathcal{V })\) such that the following holds. Assume \(u:\mathbb{R }\times S^1\rightarrow M\) is a solution of the heat equation (7) such that

$$\begin{aligned} \sup _{s\in \mathbb{R }} \mathcal{S }_\mathcal{V }(u(s,\cdot ))\le c_0, \end{aligned}$$

then \( \left\Vert{\partial }_tu\right\Vert_\infty \le C \).

Proof

The idea is to first derive slicewise \(L^2\) bounds, then verify the differential inequality in [13, lemma B.1] and apply the lemma using the slicewise bounds on the right hand side. The slicewise bound for \({\partial }_t u\) follows easily from the assumption \( c_0 \ge \mathcal{S }_\mathcal{V }(u_s) =\frac{1}{2}\mathopen \Vert {{\partial }_t u_s} \mathclose \Vert _{L^2(S^1)}^2 -\mathcal{V }(u_s) \) where \(u_s(t):=u(s,t)\). Let \(C_0\) denote the constant in (V0). Then this implies that

$$\begin{aligned} \mathopen \Vert {{\partial }_t u_s} \mathclose \Vert _{L^2(S^1)}^2 \le 2c_0+2\mathcal{V }(u_s) \le 2c_0 + 2C_0 \end{aligned}$$
(20)

for every \(s\in \mathbb{R }\). Consider the pointwise differential inequality given by

$$\begin{aligned} \left({\partial }_t{\partial }_t-{\partial }_s\right)\left|{\partial }_t u\right|^2&= 2\left|\nabla {}_{t}{\partial }_t u\right|^2 +2\langle (\nabla {}_{t}\nabla {}_{t}-\nabla {}_{s}) {\partial }_t u, {\partial }_t u\rangle \\&= 2\left|\nabla {}_{t}{\partial }_t u\right|^2 -2\langle \nabla {}_{t} \mathrm{grad }\mathcal{V }(u), {\partial }_t u\rangle \\&\ge -2C_1\left(1+\left|{\partial }_t u\right|\right) \left|{\partial }_t u\right| \\&\ge -C_1-3C_1\left|{\partial }_t u\right|^2. \end{aligned}$$

To obtain the second step we replaced \(\nabla {}_{t}{\partial }_t u\) according to the heat equation (7) and used that \(\nabla {}_{t}{\partial }_s u=\nabla {}_{s}{\partial }_t u\). The third step is by condition (V1) with constant \(C_1\). Choose \((s_0,t_0)\in \mathbb{R }\times S^1\) and apply [13, lemma B.1] in the case \(r=1\) and with \( w(s,t):=\frac{1}{3}+\mathopen |{\partial }_t u(s_0+s,t_0+t)\mathclose |^2 \) and \(a=3C_1\) to obtain

$$\begin{aligned} w(0)&\le c_1e^a\int _{-1}^0\int _{-1}^{+1}\left( \frac{1}{3} +\left|{\partial }_tu(s_0+s,t_0+t)\right|^2\right)dtds\\&= c_1e^{3C_1}\left( \frac{2}{3} +2\int _{-1}^0 \left\Vert {{\partial }_tu_{s_0+s}} \right\Vert_{L^2(S^1)}^2 ds \right). \end{aligned}$$

Theorem 12 then follows from the slicewise estimate (20). \(\square \)
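
For orientation, here is a sketch of how the hypothesis and the conclusion of the cited lemma fit together. We assume, as the application above suggests, that [13, lemma B.1] asks for an inequality of the form \(({\partial }_t{\partial }_t-{\partial }_s)w\ge -aw\) for a nonnegative function \(w\); the precise statement should be taken from [13]. The last step of the differential inequality uses \(2\left|{\partial }_tu\right|\le 1+\left|{\partial }_tu\right|^2\). Since the operator \({\partial }_t{\partial }_t-{\partial }_s\) is translation invariant, the shifted function \(w\) then satisfies

$$\begin{aligned} \left({\partial }_t{\partial }_t-{\partial }_s\right)w&\ge -C_1-3C_1\left|{\partial }_tu\right|^2 =-3C_1\left(\frac{1}{3}+\left|{\partial }_tu\right|^2\right) =-aw,\\ \left|{\partial }_tu(s_0,t_0)\right|^2&\le w(0) \le c_1e^{3C_1}\left(\frac{2}{3}+4(c_0+C_0)\right). \end{aligned}$$

The second line inserts the slicewise bound (20) into the integral over \(s\in [-1,0]\). Since \((s_0,t_0)\) was arbitrary, one admissible choice is \(C^2=c_1e^{3C_1}\bigl (\frac{2}{3}+4(c_0+C_0)\bigr )\), where \(c_1\) denotes the constant of the cited lemma.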

Lemma 2

Fix a constant \(c_0\) and a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0) with constant \(C_0\). If \(u:\mathbb{R }\times S^1\rightarrow M\) is a solution of (7), then

$$\begin{aligned} \sup _{s\in \mathbb{R }}\mathcal{S }_\mathcal{V }(u(s,\cdot )) \le c_0 \quad \Rightarrow \quad E(u)\le c_0+C_0 \end{aligned}$$

and \( \mathcal{S }_\mathcal{V }(u_a) -\mathcal{S }_\mathcal{V }(u_b) \le 2E(u) + C_0^2+2C_0 \) for all reals \(a\le b\).

Proof

The first assertion is standard. Using the energy identity (8) and the negative \(L^2\) gradient flow property of the heat equation we obtain that \( E_{[-T,T]}(u) =\mathcal{S }_\mathcal{V }(u_{-T})-\mathcal{S }_\mathcal{V }(u_T) \le \mathcal{S }_\mathcal{V }(u_{-T})+C_0 \) for every \(T>0\). The last step is by (V0). Next by partial integration and (7) we obtain that

$$\begin{aligned} \left\Vert {{\partial }_tu_a} \right\Vert_2^2-\left\Vert {{\partial }_tu_b} \right\Vert_2^2&= -\int _{a}^b\frac{d}{ds} \langle {\partial }_tu_s,{\partial }_tu_s\rangle _{L^2(S^1)} \, ds\\&= 2\int _{a}^b\langle {\partial }_su_s,\nabla {}_{t}{\partial }_tu_s\rangle _{L^2(S^1)} \, ds\\&\le \left\Vert {{\partial }_su} \right\Vert_2^2 +\left\Vert {{\partial }_su-\mathrm{grad }\,\mathcal{V }(u)} \right\Vert_2^2\\&\le 3E(u)+2C_0^2. \end{aligned}$$

The last step is by the energy identity (8) and (V0). Now use that \(\mathopen \Vert {{\partial }_tu_s} \mathclose \Vert _2^2=2\mathcal{S }_\mathcal{V }(u_s)+2\mathcal{V }(u_s)\) by definition (5) of the action. Apply (V0) again. \(\square \)
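
The partial integration step in the proof can be spelled out as follows. This is a sketch which uses only that the Levi-Civita connection is metric and torsion free, so that \(\nabla {}_{s}{\partial }_tu=\nabla {}_{t}{\partial }_su\), and that \(S^1\) has no boundary. For each fixed \(s\)

$$\begin{aligned} -\frac{d}{ds}\langle {\partial }_tu_s,{\partial }_tu_s\rangle _{L^2(S^1)}&= -2\langle \nabla {}_{s}{\partial }_tu_s,{\partial }_tu_s\rangle _{L^2(S^1)} = -2\langle \nabla {}_{t}{\partial }_su_s,{\partial }_tu_s\rangle _{L^2(S^1)}\\&= 2\langle {\partial }_su_s,\nabla {}_{t}{\partial }_tu_s\rangle _{L^2(S^1)} \le \left\Vert {{\partial }_su_s} \right\Vert_{L^2(S^1)}^2 +\left\Vert {\nabla {}_{t}{\partial }_tu_s} \right\Vert_{L^2(S^1)}^2. \end{aligned}$$

The final inequality is \(2\langle \xi ,\eta \rangle \le \left|\xi \right|^2+\left|\eta \right|^2\). Replacing \(\nabla {}_{t}{\partial }_tu\) by \({\partial }_su-\mathrm{grad }\,\mathcal{V }(u)\), as in the proof, and integrating over \(s\) recovers the third line of the estimate.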

2.3 Gradient bounds

Linearizing the heat equation (7) at a solution \(u\) provides the linear heat equation

$$\begin{aligned} D_u\xi :=\nabla {}_{s}\xi -\nabla {}_{t}\nabla {}_{t}\xi -R(\xi ,{\partial }_tu){\partial }_tu -\mathcal{H }_\mathcal{V }(u)\xi =0 \end{aligned}$$
(21)

for smooth vector fields \(\xi \) along \(u\). Note that \(\xi :={\partial }_su\) is a solution. The definition of \(D_u\) makes sense for arbitrary smooth maps \(u:\mathbb{R }\times S^1\rightarrow M\). The formal adjoint operator with respect to the \(L^2\) inner product is given by

$$\begin{aligned} D_u^*\xi =-\nabla {}_{s}\xi -\nabla {}_{t}\nabla {}_{t}\xi -R(\xi ,{\partial }_tu){\partial }_tu -\mathcal{H }_\mathcal{V }(u)\xi . \end{aligned}$$
(22)
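
That (22) is the formal adjoint can be checked term by term. The following is a sketch under two assumptions not stated explicitly here: \(\xi \) and \(\eta \) are smooth vector fields along \(u\) with compact support in \(s\), and the Hessian \(\mathcal{H }_\mathcal{V }(u)\) is symmetric with respect to the \(L^2\) inner product, as one expects of a Hessian. Integration by parts on \(\mathbb{R }\times S^1\) and the symmetries of the Riemann curvature tensor give

$$\begin{aligned} \langle \nabla {}_{s}\xi ,\eta \rangle _{L^2}&= -\langle \xi ,\nabla {}_{s}\eta \rangle _{L^2},\\ \langle \nabla {}_{t}\nabla {}_{t}\xi ,\eta \rangle _{L^2}&= \langle \xi ,\nabla {}_{t}\nabla {}_{t}\eta \rangle _{L^2},\\ \langle R(\xi ,{\partial }_tu){\partial }_tu,\eta \rangle _{L^2}&= \langle \xi ,R(\eta ,{\partial }_tu){\partial }_tu\rangle _{L^2}. \end{aligned}$$

Hence \(\langle D_u\xi ,\eta \rangle _{L^2}=\langle \xi ,D_u^*\eta \rangle _{L^2}\), and the only difference between \(D_u\) and \(D_u^*\) is the sign of the \(\nabla {}_{s}\) term.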

Theorem 13

Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V2) and a constant \(c_0\). Then there is a constant \(C=C(c_0,\mathcal{V })>0\) such that the following holds. If \(u:\mathbb{R }\times S^1\rightarrow M\) is a solution of (7) that satisfies \(\sup _{s\in \mathbb{R }}\mathcal{S }_\mathcal{V }(u(s,\cdot ))\le c_0\), then

$$\begin{aligned} \left|{\partial }_s u(s,t)\right|^2 +\left|\nabla {}_{t}{\partial }_s u(s,t)\right|^2&\le C E_{[s-1,s]}(u)\\ \left|\nabla {}_{s}{\partial }_s u(s,t)\right|^2 +\left|\nabla {}_{t}\nabla {}_{t}{\partial }_s u(s,t)\right|^2&\le C E_{[s-2,s]}(u) \end{aligned}$$

for every \((s,t)\in \mathbb{R }\times S^1\). Here

$$\begin{aligned} E_I(u) =\int _{I\times S^1}\left|{\partial }_s u\right|^2 \end{aligned}$$

denotes the energy of the solution \(u\) over the set \(I\times S^1\).

Proof

By Theorem 12 there is a constant \(C_0=C_0(c_0,\mathcal{V })\) such that \( \left\Vert {{\partial }_tu} \right\Vert_\infty \le C_0 \). Let \(C=C(C_0,\mathcal{V })\) be the constant of [21, thm. 3.3] with this choice of \(C_0\). Since \(\xi :={\partial }_su\) solves the linearized heat equation, the a priori estimate [21, thm. 3.3] shows that

$$\begin{aligned} \left|{\partial }_s u(s,t)\right|^2 \le C^2 E_{[s-1,s]}(u) \le C^2(c_0+c^\prime ) \end{aligned}$$

for every \((s,t)\in \mathbb{R }\times S^1\). Here the last step is by Lemma 2 and axiom (V0) with constant \(c^\prime \). Use that \(u\) solves (7) and that \(\mathcal{V }\) satisfies axiom (V0) to obtain that

$$\begin{aligned} \left\Vert {\nabla {}_{t}{\partial }_tu} \right\Vert_\infty \le \left\Vert {{\partial }_su} \right\Vert_\infty +\left\Vert {\mathrm{grad }\mathcal{V }(u)} \right\Vert_\infty \le C\sqrt{c_0+c^\prime } + c^\prime . \end{aligned}$$

Now choose \(C_0\) larger than \(2C\sqrt{c_0+c^\prime } + c^\prime \) and let \(C=C(C_0,\mathcal{V })\) be the constant of [21, thm. 3.3] with this new choice of \(C_0\). Then [21, thm. 3.3] proves the desired estimate for \(\mathopen |\nabla {}_{t}{\partial }_su\mathclose |\). Hence \(\mathopen \Vert {\nabla {}_{t}{\partial }_su} \mathclose \Vert _\infty \) is bounded by Lemma 2. Then \(\mathopen \Vert {\nabla {}_{t}\nabla {}_{t}{\partial }_tu} \mathclose \Vert _\infty \) is bounded by (7) and axiom (V1). Hence the a priori estimate [21, thm. 3.4] applies with a new choice of \(C_0\) and proves the remaining two estimates of Theorem 13. \(\square \)

Proof of Theorem 3

Theorem 12, Theorem 13 and Lemma 2. Only (V0)–(V1) are used. Use (7) and (V0) to obtain the estimate for \(\nabla {}_{t}{\partial }_tu\). \(\square \)

2.4 Exponential decay

First we prove asymptotic exponential decay for solutions \(u\) of the heat equation (7) assuming only an action bound, say \(a\in \mathbb{R }\), along \(u\). In this case nondegeneracy of all critical points (at least below level \(a\)) is essential.

Subsequently we deal with the case \(u\in \mathcal{M }(x^-;x^+;\mathcal{V })\). Here boundedness of the action is automatic and, in addition, existence of asymptotic boundary conditions \(x^\pm \) is part of the assumption on \(u\). In this case nondegeneracy is only required for \(x^\pm \).

Theorem 14

(Exponential energy decay) Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V2). Suppose \(\mathcal{S }_\mathcal{V }\) is Morse and fix a regular value \(a\in \mathbb{R }\) of \(\mathcal{S }_\mathcal{V }\). Then there are constants \(\delta _0,c,\rho >0\) such that the following holds. If \(u:\mathbb{R }\times S^1\rightarrow M\) is a solution of (7) that satisfies \(\sup _{s\in \mathbb{R }}\mathcal{S }_\mathcal{V }(u(s,\cdot ))\le a\) and

$$\begin{aligned} E_{\mathbb{R }\setminus [-T_0,T_0]}(u) <\delta _0 \end{aligned}$$
(23)

for some \(T_0>0\), then

$$\begin{aligned} E_{\mathbb{R }\setminus [-T,T]}(u) \le ce^{-\rho (T-T_0)} E_{\mathbb{R }\setminus [-T_0,T_0]}(u) \end{aligned}$$

for every \(T\ge T_0+1\).

Lemma 3

(Critical point nearby) Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0), a regular value \(a\in \mathbb{R }\) of \(\mathcal{S }_\mathcal{V }\), and a constant \(\delta >0\). Then there is a constant \({\varepsilon }>0\) such that the following is true. Suppose \(\gamma :S^1\rightarrow M\) is a smooth loop such that

$$\begin{aligned} \mathcal{S }_\mathcal{V }(\gamma )\le a,\quad \left\Vert {\nabla {}_{t}{\partial }_t \gamma +\mathrm{grad }\mathcal{V }(\gamma )} \right\Vert_\infty <{\varepsilon }. \end{aligned}$$

Then there is a critical point \(x\in \mathcal{P }^a(\mathcal{V })\) and a vector field \(\xi \) along \(x\) such that \( \gamma =\exp _{x}(\xi ) \) and \( \left\Vert {\xi } \right\Vert_\infty +\left\Vert {\nabla {}_{t}\xi } \right\Vert_\infty +\left\Vert {\nabla {}_{t}\nabla {}_{t}\xi } \right\Vert_\infty \le \delta \).

Proof

First note that \( \mathopen \Vert {{\partial }_t \gamma } \mathclose \Vert _2^2 =2\mathcal{S }_\mathcal{V }(\gamma )+2\mathcal{V }(\gamma ) \le 2(a+C) \) where \(C\) is the constant in (V0). Now, assuming \({\varepsilon }\le 1\), we obtain the pointwise inequality

$$\begin{aligned} \frac{d}{dt}\left|{\partial }_t \gamma \right|^2&= 2\langle {\partial }_t \gamma , \nabla {}_{t}{\partial }_t \gamma +\mathrm{grad }\mathcal{V }(\gamma )\rangle -2\langle {\partial }_t \gamma , \mathrm{grad }\mathcal{V }(\gamma )\rangle \\&\le 2\left({\varepsilon }+C\right)\left|{\partial }_t \gamma \right| \le \left(1+C\right)^2 + \left|{\partial }_t \gamma \right|^2. \end{aligned}$$

Integrate this inequality to see that \( \mathopen |{\partial }_t \gamma (t_1)\mathclose |^2-\mathopen |{\partial }_t \gamma (t_0)\mathclose |^2 \le \left(1+C\right)^2 + \mathopen \Vert {{\partial }_t \gamma } \mathclose \Vert _2^2 \) for \(t_0,t_1\in [0,1]\). Integrating again over the interval \(0\le t_0\le 1\) gives

$$\begin{aligned} \left\Vert {{\partial }_t \gamma } \right\Vert_\infty \le \sqrt{\left(1+C\right)^2 + 2\left\Vert {{\partial }_t \gamma } \right\Vert_2^2} \le c \end{aligned}$$
(24)

where \(c^2:=\left(1+C\right)^2+4\left(a+C\right)\).
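
The two integrations leading to (24) can be sketched as follows. By 1-periodicity of \(\gamma \) it suffices to consider \(t_1-1\le t_0\le t_1\). Integrating the pointwise inequality over \([t_0,t_1]\), and then integrating the result over \(t_0\in [t_1-1,t_1]\), gives

$$\begin{aligned} \left|{\partial }_t \gamma (t_1)\right|^2-\left|{\partial }_t \gamma (t_0)\right|^2&\le \int _{t_0}^{t_1}\left( \left(1+C\right)^2+\left|{\partial }_t \gamma (t)\right|^2\right) dt \le \left(1+C\right)^2+\left\Vert {{\partial }_t \gamma } \right\Vert_2^2,\\ \left|{\partial }_t \gamma (t_1)\right|^2&\le \left(1+C\right)^2+2\left\Vert {{\partial }_t \gamma } \right\Vert_2^2 \le \left(1+C\right)^2+4\left(a+C\right)=c^2. \end{aligned}$$

The second line uses \(\int _{t_1-1}^{t_1}\left|{\partial }_t \gamma (t_0)\right|^2dt_0=\left\Vert {{\partial }_t \gamma } \right\Vert_2^2\) together with the bound \(\left\Vert {{\partial }_t \gamma } \right\Vert_2^2\le 2(a+C)\) from the beginning of the proof.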

Now suppose that the assertion is wrong. Then there is a constant \(\delta >0\) and a sequence of smooth loops \(\gamma _\nu :S^1\rightarrow M\) satisfying

$$\begin{aligned} \mathcal{S }_\mathcal{V }(\gamma _\nu )\le a,\quad \lim _{\nu \rightarrow \infty }\bigl ( \left\Vert {\nabla {}_{t}{\partial }_t \gamma _\nu +\mathrm{grad }\mathcal{V }(\gamma _\nu )} \right\Vert_\infty \bigr ) = 0, \end{aligned}$$

but not the conclusion of the lemma for the given constant \(\delta \). We know that \(\sup _\nu \left\Vert {\nabla {}_{t}{\partial }_t \gamma _\nu } \right\Vert_\infty <\infty \) by (V0) and \(\sup _\nu \left\Vert {{\partial }_t \gamma _\nu } \right\Vert_\infty <\infty \) by (24). Hence, by the Arzelà–Ascoli theorem, there exists a subsequence, still denoted by \(\gamma _\nu \), that converges in the \(C^1\)-topology. Let \(x\in C^1(S^1,M)\) be the limit. We claim that this subsequence actually converges in the \(C^2\)-topology. In this case \(\nabla {}_{t}{\partial }_t x+\mathrm{grad }\mathcal{V }(x)=0\) and \(\mathcal{S }_\mathcal{V }(x)\le a\), hence \(x\in \mathcal{P }^a(\mathcal{V })\). Moreover, for large \(\nu \) we can write \(\gamma _\nu =\exp _x(\xi _\nu )\) where \(\xi _\nu \) converges to zero in \(C^2\), so \(\gamma _\nu \) satisfies the conclusion of the lemma. This contradicts our assumption on the sequence \(\gamma _\nu \) and proves the lemma.

It remains to prove the claim. For simplicity, we assume that \(M\) is isometrically embedded in Euclidean space \(\mathbb{R }^N\) for some sufficiently large integer \(N\). Since \(\sup _\nu \left\Vert {\nabla {}_{t}{\partial }_t \gamma _\nu } \right\Vert_2<\infty \), the Banach–Alaoglu Theorem asserts existence of a subsequence, still denoted by \(\gamma _\nu \), and an element \(v\in L^2\) such that \(\nabla {}_{t}{\partial }_t \gamma _\nu \) converges to \(v\) weakly in \(L^2\). Note that \(v\) is equal to the weak \(t\)-derivative \(\nabla {}_{t}{\partial }_t x\) of \({\partial }_t x\). Now \(\mathrm{grad }\mathcal{V }(\gamma _\nu )\) converges to \(\mathrm{grad }\mathcal{V }(x)\) in \(L^\infty \) (hence in \(L^2\)) by axiom (V0) and to \(-v\) weakly in \(L^2\). Thus \(v=-\mathrm{grad }\mathcal{V }(x)\) by uniqueness of limits. Hence \(v\in C^0\) and therefore \(\nabla {}_{t}{\partial }_t x\in C^0\). By our assumption on the sequence \(\gamma _\nu \), the difference between \( \nabla {}_{t}{\partial }_t\gamma _\nu \) and \( -\mathrm{grad }\mathcal{V }(\gamma _\nu ) \) converges to zero in \(L^\infty \). It follows that \( \nabla {}_{t}{\partial }_t\gamma _\nu \) converges in \(L^\infty \) to \( -\mathrm{grad }\mathcal{V }(x)=v=\nabla {}_{t}{\partial }_t x \), as \(\nu \rightarrow \infty \), and this proves the claim.

Proof of Theorem 14

Recall that if \(u\) is a solution of the heat equation (7), then \(\xi :={\partial }_su\) solves the linear heat equation (21) and \(E_I(\xi )=\mathopen \Vert {\xi } \mathclose \Vert _{L^2(I\times S^1)}^2\) for each interval \(I\subset \mathbb{R }\). Hence it remains to check that the assumptions of [21, thm. 3.9] and [21, rmk. 3.10] on exponential \(L^2\) decay are satisfied by our given solution \(u\). In particular, we need to show that \(u_s\) converges asymptotically in \(W^{2,2}(S^1)\) to nondegenerate critical points \(x^\pm \). Here Lemma 3 enters.

Given \(a\) and \(\mathcal{V }\), let \(C=C(a,\mathcal{V })\) be the constant in Theorem 13 with this choice. Let \(C_0=C_0(\mathcal{V })\) be the constant in axiom (V0). Then \(E(u)\le a+C_0\) by Lemma 2, hence \(\mathopen \Vert {{\partial }_su} \mathclose \Vert _\infty \le C E(u)\le C(a+C_0)\) by Theorem 13. Note that

$$\begin{aligned} \left\Vert {\xi _s} \right\Vert_2 =\left\Vert {{\partial }_su_s} \right\Vert_2 \le \left\Vert {{\partial }_su_s} \right\Vert_\infty \le \left\Vert {{\partial }_su} \right\Vert_\infty \le C(a+C_0), \end{aligned}$$

for all \(s\in \mathbb{R }\), and that for every \(x\in \mathcal{P }^a(\mathcal{V })\) it follows that

$$\begin{aligned} c_0:=\sqrt{2a+2C_0}+C_0 \quad \Rightarrow \quad \left\Vert {{\partial }_tx} \right\Vert_2+\left\Vert {\nabla {}_{t}{\partial }_tx} \right\Vert_2 \le c_0. \end{aligned}$$

These are already two of the assumptions in [21, thm. 3.9]. Let \(\delta \) and \(\rho \) be the constants in that theorem with this choice of \(c_0(a,\mathcal{V })\). If necessary, choose \(\delta >0\) smaller than one quarter of the minimal \(C^0\) distance \(\kappa =\kappa (a)\) between any two elements of \(\mathcal{P }^a(\mathcal{V })\). Let \({\varepsilon }\) be the constant in Lemma 3 associated to \(a\) and \(\delta \) and set

$$\begin{aligned} \delta _0 :=\min \left\{ {\varepsilon }^2/4C,\delta ^2/4C\right\} . \end{aligned}$$

Note that \(\delta \), \(\rho \), \({\varepsilon }\), and \(\delta _0\) depend only on \(a\) and \(\mathcal{V }\). Now assume (23) holds true for some constant \(T_0=T_0(u)>0\) with this choice of \(\delta _0\). Suppose \(\mathopen |s\mathclose |\ge T_0+1\). Then \(E_{[s-1,s]}(u)\le E_{\mathbb{R }\setminus [-T_0,T_0]}(u)<\delta _0\) by assumption (23). Now Theorem 13 (gradient bound) implies that

$$\begin{aligned} \left\Vert {{\partial }_su_s} \right\Vert_\infty +\left\Vert {\nabla {}_{t}{\partial }_su_s} \right\Vert_\infty \le \sqrt{CE_{[s-1,s]}(u)} \le \sqrt{C\delta _0} <\min \left\{ {\varepsilon },\delta \right\} . \end{aligned}$$
(25)
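The last inequality in (25) can be checked directly from the definition of \(\delta _0\), and together with the heat equation it yields the smallness hypothesis of Lemma 3. The following sketch assumes that (7) reads \({\partial }_su-\nabla {}_{t}{\partial }_tu-\mathrm{grad }\mathcal{V }(u)=0\), which is the form suggested by the definition (35) of \(\mathcal{F }_u\).

$$\begin{aligned} \sqrt{C\delta _0}&= \sqrt{C\min \left\{ {\varepsilon }^2/4C,\delta ^2/4C\right\} } =\tfrac{1}{2}\min \left\{ {\varepsilon },\delta \right\} <\min \left\{ {\varepsilon },\delta \right\} ,\\ \left\Vert {\nabla {}_{t}{\partial }_tu_s+\mathrm{grad }\mathcal{V }(u_s)} \right\Vert_\infty&= \left\Vert {{\partial }_su_s} \right\Vert_\infty <{\varepsilon }. \end{aligned}$$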

Hence by Lemma 3 for \(\gamma :=u_s\) using (25) and (7) there are \(x^\pm \in \mathcal{P }^a(\mathcal{V })\) with

$$\begin{aligned} u_s=\exp _{x^\pm }(\eta ^\pm _s),\quad \mathopen \Vert {\eta ^\pm _s} \mathclose \Vert _{C^2(S^1)}\le \delta , \end{aligned}$$

whenever \(\mathopen |s\mathclose |\ge T_0+1\). Although the critical points \(x^\pm \) a priori depend on \(s\) they are in fact independent, because \(\delta <\kappa /4\) and \(\mathcal{P }^a(\mathcal{V })\) is a finite set by the Morse condition. Moreover, injectivity of the operators \(A_{x^\pm }\) is equivalent to nondegeneracy of the critical points \(x^\pm \) which is true by the Morse condition. Then [21, thm. 3.9 and rmk. 3.10] conclude the proof of Theorem 14. \(\square \)

To prove Theorem 4 it is useful to denote \(\exp _u(\xi )\) by \(E(u,\xi )\) and define linear maps, for \(\xi \in T_uM\) and \(i,j\in \{1,2\}\), by

$$\begin{aligned} E_i (u,\xi ):T_uM\rightarrow T_{\exp _u\xi }M ,\quad E_{ij}(u,\xi ):T_uM\times T_uM\rightarrow T_{\exp _u\xi }M. \end{aligned}$$

If \(u:\mathbb{R }\rightarrow M\) is a smooth curve and \(\xi ,\eta \) are smooth vector fields along \(u\), then the maps \(E_i\) and \(E_{ij}\) are characterized by the identities

$$\begin{aligned} \frac{d}{ds}\exp _u(\xi )&= E_1(u,\xi ){\partial }_su +E_2(u,\xi )\nabla {}_{s}\xi \nonumber \\ \nabla {}_{s}\left( E_1(u,\xi )\eta \right)&= E_{11}(u,\xi )\left(\eta ,{\partial }_s u\right) +E_{12}(u,\xi )\left(\eta ,\nabla {}_{s}\xi \right)+E_1(u,\xi )\nabla {}_{s}\eta \\ \nabla {}_{s}\left( E_2(u,\xi )\eta \right)&= E_{21}(u,\xi )\left(\eta ,{\partial }_s u\right) +E_{22}(u,\xi )\left(\eta ,\nabla {}_{s}\xi \right) +E_2(u,\xi )\nabla {}_{s}\eta .\nonumber \end{aligned}$$
(26)

These maps satisfy the symmetry properties

$$\begin{aligned} E_{12}(u,\xi )\left(\eta ,\eta ^\prime \right) =E_{21}(u,\xi )\left(\eta ^\prime ,\eta \right),\quad E_{22}(u,\xi )\left(\eta ,\eta ^\prime \right) =E_{22}(u,\xi )\left(\eta ^\prime ,\eta \right), \end{aligned}$$
(27)

and the identities

$$\begin{aligned} E_{11}(u,0)=E_{12}(u,0)=E_{22}(u,0)=0,\quad E_1(u,0)=E_2(u,0)={\small 1}\!\!1. \end{aligned}$$
(28)

Proof of Theorem 4

We prove exponential decay in three steps.

  • I. Finite energy. If \(u:[0,\infty )\times S^1\rightarrow M\), then \(E(u)\le \mathcal{S }_\mathcal{V }(u_0)+C_0\) by (the proof of) Lemma 2 where \(C_0\) is the constant in axiom (V0).

  • II. Bounded action along \(u\) and existence of asymptotic limits. Consider the backward case (B). By Lemma 2 it follows that

    $$\begin{aligned} \sup _{s\in (-\infty ,0]}\mathcal{S }_\mathcal{V }(u_s) \le 2E(u)+C_0^2+2C_0+\mathcal{S }_\mathcal{V }(u_0)=:c_0. \end{aligned}$$
    (29)

Now fix a regular value \(a\ge c_0\) of \(\mathcal{S }_\mathcal{V }\). First we prove that \({\partial }_su(s,t)\rightarrow 0\) uniformly in \(t\), as \(s\rightarrow -\infty \). To see this let \(C>0\) be the constant in Theorem 13 (gradient bounds) and let \(s\le 0\). Then

$$\begin{aligned} \mathopen |{\partial }_su(s,t)\mathclose | \le C E_{[s-1,s]}(u) =C\int _{s-1}^s \mathopen \Vert {{\partial }_su_\sigma } \mathclose \Vert ^2_{L^2(S^1)} d\sigma \stackrel{s\rightarrow -\infty }{\longrightarrow } 0 \end{aligned}$$

where the last step follows by finite energy of \(u\). Thus by the heat equation (7) also \(\nabla {}_{t}{\partial }_tu_s+\mathrm{grad }\mathcal{V }(u_s)\) converges to zero in \(L^\infty (S^1)\). Hence it follows from Lemma 3 that there is a critical point \(x^-\in \mathcal{P }^a(\mathcal{V })\) and, for every sufficiently negative \(s\), there is a smooth vector field \(\xi _s\) along \(x^-\) such that

$$\begin{aligned} u_s=\exp _{x^-} (\xi _s), \quad \mathopen \Vert {\xi _s} \mathclose \Vert _\infty +\mathopen \Vert {\nabla {}_{t}\xi _s} \mathclose \Vert _\infty +\mathopen \Vert {\nabla {}_{t}\nabla {}_{t}\xi _s} \mathclose \Vert _\infty \stackrel{s\rightarrow -\infty }{\longrightarrow } 0. \end{aligned}$$

(The set \(\mathcal{P }^a(\mathcal{V })\) is finite, because \(\mathcal{S }_\mathcal{V }\) is Morse.) This and the identities for the maps \(E_{ij}\) in (26) imply that

$$\begin{aligned} \mathopen \Vert {{\partial }_su} \mathclose \Vert _\infty +\mathopen \Vert {{\partial }_tu} \mathclose \Vert _\infty +\mathopen \Vert {\nabla {}_{t}{\partial }_tu} \mathclose \Vert _\infty <\infty . \end{aligned}$$
(30)

In the forward case (F) the action along \(u\) is bounded from above by \(c_0:=\mathcal{S }_\mathcal{V }(u_0)\) due to the negative gradient flow property. The remaining part of the proof goes through unchanged.

  • III. Exponential decay. Consider the forward case (F). We prove by induction that for every \(k\in \mathbb{N }\) there is a constant \(c_k^\prime >0\) such that

    $$\begin{aligned} \left\Vert {{\partial }_su} \right\Vert_{W^{k,2} ([s,\infty )\times S^1)} \le c_k^\prime \left\Vert {{\partial }_su} \right\Vert _{L^2([s-k,\infty )\times S^1)} \end{aligned}$$
    (31)

    for every \(s\ge k\). This estimate, the energy identity (8), and Theorem 14 with constants \(\delta _0,c,\rho \) and \(T_0\) chosen sufficiently large such that (23) holds, show that

    $$\begin{aligned} \left\Vert {{\partial }_su} \right\Vert_{W^{k,2} ([s,\infty )\times S^1)} \le c_k^\prime \sqrt{E_{[s-k,\infty )}(u)} \le c_k^\prime \sqrt{c\delta _0} e^{-\rho (s-k-T_0)/2} \end{aligned}$$

    whenever \(s\ge k+T_0+1\). The Sobolev embedding \(W^{k,2}\hookrightarrow C^{k-2}\), e.g. on the compact set \([s,s+1]\times S^1\), concludes the proof of forward exponential decay (F).

It remains to carry out the induction argument. It is based on the following identity. Linearize the heat equation (7) in the \(s\)-direction to obtain that

$$\begin{aligned} \left(\nabla {}_{s} -\nabla {}_{t}\nabla {}_{t}\right){\partial }_su =R({\partial }_su,{\partial }_tu){\partial }_tu+\mathcal{H }_\mathcal{V }(u){\partial }_su. \end{aligned}$$
(32)

Observe that [13, le. D.2] applies by (30); formally add to \(u\) a smooth half cylinder imposing a uniform limit as \(s\rightarrow -\infty \). Fix \(s_0\ge 1\) and pick a smooth nondecreasing cutoff function \(\beta :\mathbb{R }\rightarrow [0,1]\) equal to zero for \(s\le s_0-1\), to one for \(s\ge s_0\), and whose slope is at most two. Now [13, le. D.2] for \(p=2\) applied to \(\beta \xi \) shows that there is a constant \(c^\prime >0\) such that

$$\begin{aligned}&\left\Vert {\nabla {}_{s}\xi } \right\Vert _{L^2([s_0,\infty )\times S^1)} +\left\Vert {\nabla {}_{t}\xi } \right\Vert _{L^2([s_0,\infty )\times S^1)} +\left\Vert {\nabla {}_{t}\nabla {}_{t}\xi } \right\Vert _{L^2([s_0,\infty )\times S^1)}\nonumber \\&\quad \le c^\prime \left( \left\Vert {\nabla {}_{s}\xi -\nabla {}_{t}\nabla {}_{t}\xi } \right\Vert _{L^2([s_0-1,\infty )\times S^1)} + \left\Vert {\xi } \right\Vert _{L^2([s_0-1,\infty )\times S^1)} \right) \end{aligned}$$
(33)

for every \(\xi \in C^\infty _0([0,\infty )\times S^1,u^*TM)\). We used [13, le. D.4] to include \(\nabla {}_{t}\xi \).

We prove the induction hypothesis (31) for \(k=1\). Let \(s\ge 1\) and denote by \(C_1>0\) the constant in (V1). By (33) with \(\xi ={\partial }_su\) and (32) it follows that

$$\begin{aligned}&\left\Vert {\nabla {}_{s}{\partial }_su} \right\Vert _{L^2([s,\infty )\times S^1)} +\left\Vert {\nabla {}_{t}{\partial }_su} \right\Vert _{L^2([s,\infty )\times S^1)} +\left\Vert {\nabla {}_{t}\nabla {}_{t}{\partial }_su} \right\Vert _{L^2([s,\infty )\times S^1)}\\&\quad \le c^\prime \left( \left\Vert {(\nabla {}_{s}-\nabla {}_{t}\nabla {}_{t}){\partial }_su} \right\Vert _{L^2([s-1,\infty )\times S^1)} + \left\Vert {{\partial }_su} \right\Vert _{L^2([s-1,\infty )\times S^1)} \right)\\&\quad = c^\prime \left( \left\Vert {R({\partial }_su,{\partial }_tu){\partial }_tu+\mathcal{H }_\mathcal{V }(u){\partial }_su} \right\Vert _{L^2([s-1,\infty )\times S^1)} + \left\Vert {{\partial }_su} \right\Vert _{L^2([s-1,\infty )\times S^1)} \right)\\&\quad \le c^\prime \left( \mathopen \Vert {R} \mathclose \Vert _\infty \mathopen \Vert {{\partial }_tu} \mathclose \Vert _\infty ^2 +2C_1+1\right) \left\Vert {{\partial }_su} \right\Vert _{L^2([s-1,\infty )\times S^1)}. \end{aligned}$$
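The last inequality above combines a pointwise curvature bound with a bound on the Hessian term. The sketch below assumes that axiom (V1), which is not restated in this section, yields the operator bound \(\mathopen \Vert {\mathcal{H }_\mathcal{V }(u_s)\xi } \mathclose \Vert _{L^2(S^1)}\le 2C_1\mathopen \Vert {\xi } \mathclose \Vert _{L^2(S^1)}\); the factor \(2C_1\) in the estimate suggests this form, but it is an assumption here.

$$\begin{aligned} \left|R({\partial }_su,{\partial }_tu){\partial }_tu\right|&\le \mathopen \Vert {R} \mathclose \Vert _\infty \mathopen \Vert {{\partial }_tu} \mathclose \Vert _\infty ^2 \left|{\partial }_su\right|,\\ \left\Vert {\mathcal{H }_\mathcal{V }(u){\partial }_su} \right\Vert _{L^2([s-1,\infty )\times S^1)}&\le 2C_1\left\Vert {{\partial }_su} \right\Vert _{L^2([s-1,\infty )\times S^1)}. \end{aligned}$$

Since \(\mathopen \Vert {{\partial }_tu} \mathclose \Vert _\infty \) is finite by (30), this gives (31) for \(k=1\) with a constant \(c_1^\prime \) depending only on \(c^\prime \), \(\mathopen \Vert {R} \mathclose \Vert _\infty \), \(\mathopen \Vert {{\partial }_tu} \mathclose \Vert _\infty \), and \(C_1\).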

Observe that the induction hypothesis (31) for \(k=2\) follows similarly. Assume \(s\ge 2\). Then by (33) with \(\xi =\nabla {}_{s}{\partial }_su\) and (32) it follows that

$$\begin{aligned}&\left\Vert {\nabla {}_{s}\nabla {}_{s}{\partial }_su} \right\Vert _{L^2([s,\infty )\times S^1)} +\left\Vert {\nabla {}_{t}\nabla {}_{s}{\partial }_su} \right\Vert _{L^2([s,\infty )\times S^1)} +\left\Vert {\nabla {}_{t}\nabla {}_{t}\nabla {}_{s}{\partial }_su} \right\Vert _{L^2([s,\infty )\times S^1)}\\&\quad \le c^\prime \Bigl ( \left\Vert {\nabla {}_{s}\left( R({\partial }_su,{\partial }_tu){\partial }_tu +\mathcal{H }_\mathcal{V }(u){\partial }_su\right) +[\nabla {}_{s},\nabla {}_{t}\nabla {}_{t}]{\partial }_su} \right\Vert _{L^2([s-1,\infty )\times S^1)}\\&\qquad + \left\Vert {\nabla {}_{s}{\partial }_su} \right\Vert _{L^2([s-1,\infty )\times S^1)} \Bigr ). \end{aligned}$$

Now use \(s\ge 2\), the a priori estimates (30), axiom (V2), and the case \(k=1\) to bound the right hand side by a constant times \(\mathopen \Vert {{\partial }_su} \mathclose \Vert _{L^2([s-2,\infty )\times S^1)}\). Then the \(L^2\) bound for \(\nabla {}_{t}\nabla {}_{t}{\partial }_su\) obtained earlier in the case \(k=1\) together with the identity \( \nabla {}_{s}\nabla {}_{t}{\partial }_su =\nabla {}_{t}\nabla {}_{s}{\partial }_su -R({\partial }_tu,{\partial }_su){\partial }_su \) imply an \(L^2\) bound for \(\nabla {}_{s}\nabla {}_{t}{\partial }_su\).
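The commutation identity used in the last step is the definition of the curvature tensor applied along \(u\). As a sketch, assuming the sign convention \([\nabla {}_{s},\nabla {}_{t}]\xi =R({\partial }_su,{\partial }_tu)\xi \) for vector fields \(\xi \) along \(u\) (the convention consistent with the identity above), and with all \(L^2\) norms taken over \([s,\infty )\times S^1\):

$$\begin{aligned} \nabla {}_{s}\nabla {}_{t}{\partial }_su&= \nabla {}_{t}\nabla {}_{s}{\partial }_su +R({\partial }_su,{\partial }_tu){\partial }_su =\nabla {}_{t}\nabla {}_{s}{\partial }_su -R({\partial }_tu,{\partial }_su){\partial }_su,\\ \left\Vert {\nabla {}_{s}\nabla {}_{t}{\partial }_su} \right\Vert _{L^2}&\le \left\Vert {\nabla {}_{t}\nabla {}_{s}{\partial }_su} \right\Vert _{L^2} +\mathopen \Vert {R} \mathclose \Vert _\infty \mathopen \Vert {{\partial }_tu} \mathclose \Vert _\infty \mathopen \Vert {{\partial }_su} \mathclose \Vert _\infty \left\Vert {{\partial }_su} \right\Vert _{L^2}. \end{aligned}$$

The sup norms on the right are finite by (30), and the first term is controlled by the estimate for \(\xi =\nabla {}_{s}{\partial }_su\) above.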

Proving the induction hypothesis (31) for \(k=3\) requires the yet unknown fact that \(\mathopen \Vert {\nabla {}_{t}{\partial }_su} \mathclose \Vert _\infty <\infty \). Note that our heat flow solution \(u\) admits an upper action bound, namely \(\mathcal{S }_\mathcal{V }(u(0,\cdot ))\), and this is the essential assumption of Theorem 12 and Theorem 13. Hence corresponding versions recover (30) and prove the desired estimate. The latter is crucial, because (33) with \(\xi =\nabla {}_{s}\nabla {}_{s}{\partial }_su\) and (32) lead to terms of the form

$$\begin{aligned} \mathopen \Vert {R(\nabla {}_{s}{\partial }_su, \nabla {}_{t}{\partial }_su){\partial }_tu} \mathclose \Vert _{L^2([s,\infty )\times S^1)}, \end{aligned}$$

whereas our induction hypothesis in the case \(k=2\) only provides a \(C^0\) bound for \({\partial }_su\). The remaining part of the proof follows the same pattern as in the case \(k=2\). Here we use axiom (V3).

Now fix an integer \(k\ge 3\) and assume the induction hypothesis (31) is true for every \(\ell \in \{1,\dots ,k\}\). In particular, we have \(W^{k,2}\) and \(C^{k-2}\) bounds for \({\partial }_su\) on the appropriate domains. Apply (33) with \(\xi ={\nabla {}_{s}}^k{\partial }_su\) and (32) to obtain \(L^2\) bounds for \({\nabla {}_{s}}^{k+1}{\partial }_su\) and \(\nabla {}_{t}{\nabla {}_{s}}^k{\partial }_su\). Here we use axiom (V3) and the induction hypothesis for \(\ell \in \{1,\dots ,k\}\). A problem of the type encountered in the case \(k=3\) does not arise, since we have \(C^{k-2}\) bounds for \({\partial }_su\) with \(k\ge 3\). To obtain \(L^2\) estimates for the remaining terms of the form \({\nabla {}_{t}}^j{\nabla {}_{s}}^{k-j}{\partial }_su\) with \(j\ge 2\) use (32) to trade any \(\nabla {}_{t}\nabla {}_{t}\) for one \(\nabla {}_{s}\). This reduces the order of the term, hence the induction hypothesis can be applied. This completes the induction step and proves (F). The backward case (B) follows similarly. \(\square \)

Corollary 1

Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V3), two nondegenerate critical points \(x^\pm \in \mathcal{P }(\mathcal{V })\), and an element \(u\in \mathcal{M }(x^-,x^+;\mathcal{V })\). Then there are positive constants \(\rho \) and \(c_0,c_1,c_2,\dots \) such that

$$\begin{aligned} \left\Vert {{\partial }_su} \right\Vert_{C^k((\mathbb{R }\setminus [-T,T])\times S^1)} \le c_ke^{-\rho T} \end{aligned}$$

for every \(T\ge 1\).

Proof

(I) Since \(u\in \mathcal{M }(x^-,x^+;\mathcal{V })\), its energy is finite by (8). (II) Use (29) to see that the action is bounded along \(u\). Existence of asymptotic limits of \(u\) holds by definition. Now (III) in the proof of Theorem 4 applies. \(\square \)

Proof of Theorem 5

By Corollary 1, the heat equation (7), and axioms (V0)–(V1) the assumptions of the Fredholm Theorem [21, thm. 3.13] are satisfied. \(\square \)

2.5 Compactness up to broken trajectories

Proposition 3

(Convergence on compact sets) Assume that the perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) satisfies (V0)–(V3) and \(\mathcal{S }_\mathcal{V }\) is Morse. Fix critical points \(x^\pm \in \mathcal{P }(\mathcal{V })\) and a sequence of connecting trajectories \(u^\nu \in \mathcal{M }(x^-,x^+;\mathcal{V })\). Then there is a pair \(x_0,x_1\in \mathcal{P }(\mathcal{V })\), a connecting trajectory \(u\in \mathcal{M }(x_0,x_1;\mathcal{V })\), and a subsequence, still denoted by \(u^\nu \), such that the following is true.

  (i) The subsequence \(u^\nu \) converges to \(u\), uniformly with all derivatives on every compact subset of \(\mathbb{R }\times S^1\).

  (ii) For all \(s\in \mathbb{R }\) and \(T>0\) it holds that

    $$\begin{aligned} \mathcal{S }_\mathcal{V }\bigl (u(s,\cdot )\bigr ) =\lim _{\nu \rightarrow \infty } \mathcal{S }_\mathcal{V }\bigl (u^\nu (s,\cdot )\bigr ) ,\quad E_{[-T,T]}(u) =\lim _{\nu \rightarrow \infty } E_{[-T,T]}(u^\nu ). \end{aligned}$$

Proof

Since the flow lines \(u^\nu \) connect \(x^-\) to \(x^+\) and the action \(\mathcal{S }_\mathcal{V }\) decreases along flow lines, it follows that \( \sup _{s\in \mathbb{R }}\mathcal{S }_\mathcal{V }(u^\nu (s,\cdot )) =\mathcal{S }_\mathcal{V }(x^-)=:c_0 \). Hence by the a priori estimates Theorem 12 and Theorem 13 there is a constant \(C=C(c_0,\mathcal{V })\) such that \( \mathopen |{\partial }_tu^\nu (s,t)\mathclose |\le C, \) and \( \mathopen |{\partial }_su^\nu (s,t)\mathclose |^2 \le C^2\left(\mathcal{S }_\mathcal{V }(x^-)-\mathcal{S }_\mathcal{V }(x^+)\right), \) for every \((s,t)\in \mathbb{R }\times S^1\). To obtain the second estimate we used the energy identity (8) for connecting orbits. Now fix a constant \(p>2\) and pick an integer \(\ell \ge 2\). Then the assumptions of Theorem 11 are satisfied for the sequence \(u^\nu \) restricted to the cylinder \(Z_\ell =(-\ell ,\ell ]\times S^1\). Hence there is a smooth solution \(u:Z_\ell \rightarrow M\) of the heat equation (7) and a subsequence, still denoted by \(u^\nu \), such that \(u^\nu \) converges to \(u\), uniformly with all derivatives on the compact subset \([-\ell +1,\ell ]\times S^1\) of \(Z_\ell \). Now (i) follows by choosing a diagonal subsequence associated to the exhausting sequence \(Z_2\subset Z_3\subset \dots \) of \(\mathbb{R }\times S^1\).

To prove (ii) note that for every \(T>0\) we obtain that

$$\begin{aligned} E_{[-T,T]}(u) =\lim _{\nu \rightarrow \infty } \int _{Z_T}\left|{\partial }_su^\nu \right|^2 =\lim _{\nu \rightarrow \infty } E_{[-T,T]}(u^\nu ) \le \mathcal{S }_\mathcal{V }(x^-)-\mathcal{S }_\mathcal{V }(x^+) \end{aligned}$$

where the first step uses that by (i) the sequence \({\partial }_su^\nu \) converges to \({\partial }_su\), uniformly on compact sets. The second step is by definition of the energy and the last step is again by the energy identity (8). Hence the limit \(u:\mathbb{R }\times S^1\rightarrow M\) has finite energy and so by Theorem 4 belongs to the moduli space \(\mathcal{M }(x_0,x_1;\mathcal{V })\) for some \(x_0,x_1\in \mathcal{P }(\mathcal{V })\). To prove convergence of the action at time \(s\) note that \( \mathcal{V }\left(u(s,\cdot )\right) =\lim _{\nu \rightarrow \infty }\mathcal{V }\left(u^\nu (s,\cdot )\right) \), because \(\mathcal{V }\) is continuous with respect to the \(C^0\) topology on \(\mathcal{L }M\) by axiom (V0). Convergence of the action at time \(s\) then follows from the fact that \({\partial }_tu^\nu (s,\cdot )\) converges to \({\partial }_tu(s,\cdot )\) in \(L^\infty (S^1)\). \(\square \)
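Spelled out, the convergence of the action at time \(s\) combines the two facts just mentioned. The sketch assumes that the action has the form \(\mathcal{S }_\mathcal{V }(x)=\frac{1}{2}\mathopen \Vert {{\partial }_tx} \mathclose \Vert _2^2-\mathcal{V }(x)\), which is the form used in the first line of the proof of Lemma 3.

$$\begin{aligned} \mathcal{S }_\mathcal{V }\bigl (u^\nu (s,\cdot )\bigr ) =\tfrac{1}{2}\left\Vert {{\partial }_tu^\nu (s,\cdot )} \right\Vert_2^2 -\mathcal{V }\bigl (u^\nu (s,\cdot )\bigr ) \stackrel{\nu \rightarrow \infty }{\longrightarrow } \tfrac{1}{2}\left\Vert {{\partial }_tu(s,\cdot )} \right\Vert_2^2 -\mathcal{V }\bigl (u(s,\cdot )\bigr ) =\mathcal{S }_\mathcal{V }\bigl (u(s,\cdot )\bigr ). \end{aligned}$$

Here uniform convergence of \(\left|{\partial }_tu^\nu (s,\cdot )\right|\) on \(S^1\) implies convergence of the \(L^2\) norms, since \(S^1\) has length one.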

Lemma 4

(Compactness up to broken trajectories) Assume \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) satisfies (V0)–(V3) and \(\mathcal{S }_\mathcal{V }\) is Morse. Fix distinct critical points \(x^\pm \in \mathcal{P }(\mathcal{V })\) and a sequence \(u^\nu \in \mathcal{M }(x^-,x^+;\mathcal{V })\). Then there are a subsequence, still denoted by \(u^\nu \), critical points \(x_0,\ldots ,x_m\) with \(x_0=x^+\) and \(x_m=x^-\), solutions

$$\begin{aligned} u_k\in \mathcal{M }(x_k,x_{k-1};\mathcal{V }), \quad {\partial }_su_k\not \equiv 0, \quad k=1,\ldots ,m, \end{aligned}$$

and sequences \(s_k^\nu \), such that the shifted sequence \(u^\nu (s_k^\nu +s,t)\) converges to \(u_k(s,t)\), uniformly with all derivatives on every compact subset of \(\mathbb{R }\times S^1\). Moreover, these limit solutions satisfy \(\sum _{k=1}^mE(u_k) =\mathcal{S }_\mathcal{V }(x^-)-\mathcal{S }_\mathcal{V }(x^+)\).

Proof

In [13, proof of lemma 10.3] replace [13, lemma 10.2] by Proposition 3. \(\square \)

3 The implicit function theorem

Throughout this section we fix a smooth perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V3) and two nondegenerate critical points \(x^\pm \) of \(\mathcal{S }_\mathcal{V }\). The idea to prove the manifold property and the dimension formula in Theorem 6 is to construct a smooth Banach manifold which contains the moduli space \(\mathcal{M }(x^-,x^+;\mathcal{V })\) and then carry out the proof locally near each element of the moduli space.

Fix a real number \(p>2\) and denote by

$$\begin{aligned} \mathcal{B }^{1,p} =\mathcal{B }^{1,p}(x^-,x^+) \end{aligned}$$
(34)

the space of continuous maps \(u:\mathbb{R }\times S^1\rightarrow M\), which satisfy the first limit condition in (3), are locally of class \(\mathcal{W }^{1,p}\), and satisfy the asymptotic conditions \(\xi ^\pm \in \mathcal{W }^{1,p}(Z_T^\pm )\) for some sufficiently large \(T>0\) where \(Z_T^-=(-\infty ,-T]\times S^1\) and \(Z_T^+=[T,\infty )\times S^1\); this implies the second limit condition in (3). Here \(\xi ^\pm \) are defined pointwise by the identity \(\exp _{x^\pm (t)}\xi ^\pm (s,t)=u(s,t)\). The space \(\mathcal{B }^{1,p}\) carries the structure of a smooth infinite dimensional Banach manifold. The tangent space \(T_u\mathcal{B }^{1,p}\) is given by the Banach space \(\mathcal{W }_u^{1,p}\) whose norm is defined in (11). Around any smooth map \(u\) local coordinates are provided by the inverse of the map \({\varphi _u}^{-1}: V_u\rightarrow \mathcal{B }^{1,p}\) given by \(\xi \mapsto [(s,t)\mapsto \exp _{u(s,t)}\xi (s,t)]\) where \(V_u\subset \mathcal{W }_u^{1,p}\) is a sufficiently small neighborhood of zero. By abuse of notation we shall denote this map again by \(\xi \mapsto \exp _u\xi \). Observe that any \(u\in \mathcal{B }^{1,p}\) which satisfies the heat equation (7) is automatically smooth by Theorem 2 and therefore lies in \(\mathcal{M }(x^-,x^+;\mathcal{V })\).

For \(x\in M\) and \(\xi \in T_xM\) denote parallel transport with respect to the Levi-Civita connection along the geodesic \(\tau \mapsto \exp _x(\tau \xi )\) by

$$\begin{aligned} \Phi (x,\xi ):T_xM\rightarrow T_{\exp _x(\xi )}M. \end{aligned}$$

For \(u\in \mathcal{B }^{1,p}\) the map \(\mathcal{F }_u:\mathcal{W }^{1,p}_u\rightarrow \mathcal{L }^p_u\) is defined by

$$\begin{aligned} \mathcal{F }_u(\xi ) := \Phi (u,\xi )^{-1} \left( {\partial }_s(\exp _u\xi ) -\nabla {}_{t}{\partial }_t(\exp _u\xi ) -\mathrm{grad }\mathcal{V }(\exp _u\xi ) \right). \end{aligned}$$
(35)

It is a smooth map between Banach spaces. Hence the implicit function theorem for Banach spaces applies. The differential \(d\mathcal{F }_u(0):\mathcal{W }^{1,p}_u\rightarrow \mathcal{L }^p_u\) is given by the linear operator \(\mathcal{D }_u\); see [18, app. A.3]. The map \(\xi \mapsto \exp _u\xi \) identifies a neighborhood \(V\) of zero in \({\mathcal{F }_u}^{-1}(0)\) with a neighborhood of \(u\) in \(\mathcal{M }(x^-,x^+;\mathcal{V })\).

Proof of Theorem 6

Fix \(p>2\) and \(u\in \mathcal{M }(x^-,x^+;\mathcal{V })\). Then by Theorem 5 the operator \(d\mathcal{F }_u(0)=\mathcal{D }_u: \mathcal{W }^{1,p}_u\rightarrow \mathcal{L }^p_u\) is Fredholm. It is onto by assumption. Since every surjective Fredholm operator admits a right inverse, the implicit function theorem for Banach spaces, see e.g. [8, thm A.3.3], applies to \(\mathcal{F }_u\) restricted to a small neighborhood \(V\) of zero. It asserts that \({\mathcal{F }_u}^{-1}(0)\cap V\) is a smooth manifold whose tangent space at zero is given by the kernel of \(\mathcal{D }_u\). Since \(\mathcal{D }_u\) is onto, it follows that \(\dim \ker \mathcal{D }_u=\mathrm{index}\, \mathcal{D }_u\) by definition of the Fredholm index. But \(\mathrm{index}\, \mathcal{D }_u=\mathrm{ind}_\mathcal{V }(x^-)-\mathrm{ind}_\mathcal{V }(x^+)\) by Theorem 5. \(\square \)
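The dimension count rests only on the definition of the Fredholm index and surjectivity of \(\mathcal{D }_u\). As a one-line sketch:

$$\begin{aligned} \dim \ker \mathcal{D }_u =\mathrm{index}\, \mathcal{D }_u+\dim \mathrm{coker}\, \mathcal{D }_u =\mathrm{index}\, \mathcal{D }_u =\mathrm{ind}_\mathcal{V }(x^-)-\mathrm{ind}_\mathcal{V }(x^+). \end{aligned}$$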

Proof of Proposition 1

Set \(c_*=\frac{1}{2}(\mathcal{S }_\mathcal{V }(x^-)+\mathcal{S }_\mathcal{V }(x^+))\) and identify

$$\begin{aligned} \widehat{\mathcal{M }}(x^-,x^+;\mathcal{V }) \simeq \mathcal{M }^*:=\{u\in \mathcal{M }(x^-,x^+;\mathcal{V })\mid \mathcal{S }_\mathcal{V }(u(0,\cdot ))=c_*\}. \end{aligned}$$

Here we use that the action \(\mathcal{S }_\mathcal{V }\) strictly decreases along nonconstant heat flow trajectories (use the first variation formula for \(\mathcal{S }_\mathcal{V }\); see e.g. [10, sec. 12]). Note that \(\mathcal{M }^*\) is a manifold of dimension zero, since \(\mathcal{M }(x^-,x^+;\mathcal{V })\) is a manifold of dimension one by Theorem 6 on which \(\mathbb{R }\) acts freely. Now choose a sequence \(u^\nu \) in \(\mathcal{M }^*\). By Lemma 4 there is a subsequence, still denoted by \(u^\nu \), finitely many critical points \(x_0=x^+,x_1,\ldots ,x_m=x^-\), finitely many connecting trajectories \(u_k\in \mathcal{M }(x_k,x_{k-1};\mathcal{V })\) and sequences \(s_k^\nu \) where \(k=1,\ldots ,m\), such that each shifted sequence \(u^\nu (s_k^\nu +s,t)\) converges to \(u_k(s,t)\) in \(C^\infty _{loc}\). By the Morse–Smale assumption Theorem 6 applies to all moduli spaces and shows that

$$\begin{aligned} \mathrm{ind}_\mathcal{V }(x_k)-\mathrm{ind}_\mathcal{V }(x_{k-1}) =\dim \mathcal{M }(x_k,x_{k-1};\mathcal{V }) \ge 1, \quad \forall k\in \{1,\ldots ,m\}, \end{aligned}$$

where the inequality follows from the facts that \({\partial }_su_k\not \equiv 0\) and the heat equation (7) is \(s\)-shift invariant. Hence \(\mathrm{ind}_\mathcal{V }(x^-)-\mathrm{ind}_\mathcal{V }(x^+)\ge m\ge 1\) and so \(m=1\) by assumption on \(x^\pm \). But this means that \(u^\nu \) converges to \(u_1\in \mathcal{M }(x^-,x^+;\mathcal{V })\) in \(C^\infty _{loc}\). In fact \(u_1\in \mathcal{M }^*\) by convergence of the action functional for fixed time \(s=0\); see Proposition 3 (ii). Hence \(\mathcal{M }^*\) is compact in the \(C^\infty _{loc}\) topology. \(\square \)
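The index count in this proof is a telescoping sum. As a sketch, using \(x_0=x^+\) and \(x_m=x^-\), and assuming (as the dimension-one statement for \(\mathcal{M }(x^-,x^+;\mathcal{V })\) above indicates) that the index difference of \(x^\pm \) is one:

$$\begin{aligned} 1 =\mathrm{ind}_\mathcal{V }(x^-)-\mathrm{ind}_\mathcal{V }(x^+) =\sum _{k=1}^m\bigl ( \mathrm{ind}_\mathcal{V }(x_k)-\mathrm{ind}_\mathcal{V }(x_{k-1})\bigr ) \ge \sum _{k=1}^m 1 =m \ge 1. \end{aligned}$$

Both inequalities are therefore equalities, which forces \(m=1\).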

3.1 The refined implicit function theorem

Proposition 4

(The estimate for the right inverse) Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V3) and nondegenerate critical points \(x^\pm \) of \(\mathcal{S }_\mathcal{V }\). Assume \(u\in \mathcal{M }(x^-,x^+;\mathcal{V })\) and \(\mathcal{D }_u\) is onto. Then, for every \(p>1\), there is a positive constant \(c=c(p,u)\) invariant under \(s\)-shifts of \(u\) such that

$$\begin{aligned} \left\Vert {\xi ^*} \right\Vert_{\mathcal{W }_u^{1,p}} \le c\left\Vert {\mathcal{D }_u\xi ^*} \right\Vert_p \end{aligned}$$
(36)

for every \(\xi ^*\in \mathrm{im\, }( \mathcal{D }_u^*:\mathcal{W }^{2,p}_u\rightarrow \mathcal{W }^{1,p}_u)\).

The proof of Proposition 4 is standard; see e.g. [4, lemma 4.5]. Details in the parabolic case at hand are provided by [20, prop. 5.1].

Proposition 5

(Quadratic estimate) Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V1). Let \(\iota >0\) be the injectivity radius of \(M\) and fix constants \(1<p<\infty \) and \(c_0>0\). Then there is a constant \(C=C(p,c_0)>0\) such that the following is true. If \(u:\mathbb{R }\times S^1\rightarrow M\) is a smooth map and \(\xi \) is a compactly supported smooth vector field along \(u\) such that

$$\begin{aligned} \left\Vert {{\partial }_su} \right\Vert_\infty +\left\Vert {{\partial }_tu} \right\Vert_\infty +\left\Vert {\nabla {}_{t}{\partial }_tu} \right\Vert_\infty \le c_0,\quad \left\Vert {\xi } \right\Vert_\infty \le \iota , \end{aligned}$$

then

$$\begin{aligned} \left\Vert {\mathcal{F }_u(\xi )-\mathcal{F }_u(0)-d\mathcal{F }_u(0) \xi } \right\Vert_p \le C \left\Vert {\xi } \right\Vert_\infty \left\Vert {\xi } \right\Vert_{\mathcal{W }^{1,p}_u} \left( 1+ \left\Vert {\xi } \right\Vert_{\mathcal{W }^{1,p}_u} \right). \end{aligned}$$

Proof

Recall the definition (26) of the maps \(E_i\) and \(E_{ij}\) and write

$$\begin{aligned} \mathcal{F }_u(\xi )-\mathcal{F }_u(0) -{\displaystyle \left.\frac{d}{d\tau }\right|}_{\tau =0} \mathcal{F }_u(\tau \xi ) =f(\xi )-g(\xi )-h(\xi ) \end{aligned}$$

where

$$\begin{aligned} f(\xi )&:= \Phi (u,\xi )^{-1}{\partial }_sE(u,\xi )-{\partial }_su -{\displaystyle \left.\frac{d}{d\tau }\right|}_{\tau =0} \Phi (u,\tau \xi )^{-1}{\partial }_su -{\displaystyle \left.\frac{d}{d\tau }\right|}_{\tau =0} {\partial }_s E(u,\tau \xi )\\ g(\xi )&:= \Phi (u,\xi )^{-1}\nabla {}_{t}{\partial }_tE(u,\xi ) -\nabla {}_{t}{\partial }_tu +\left(\nabla {}_{2}\Phi |_{(u,0)}\xi \right) \nabla {}_{t}{\partial }_tu -{\displaystyle \left.\frac{d}{d\tau }\right|}_{0} \nabla {}_{t}{\partial }_t E(u,\tau \xi )\\ h(\xi )&:= \Phi (u,\xi )^{-1}\mathrm{grad }\,\mathcal{V }(E(u,\xi )) -\mathrm{grad }\,\mathcal{V }(u) +\left(\nabla {}_{2}\Phi |_{(u,0)}\xi \right) \mathrm{grad }\,\mathcal{V }(u)\\&-{\displaystyle \left.\frac{d}{d\tau }\right|}_{\tau =0} \mathrm{grad }\,\mathcal{V }(E(u,\tau \xi )). \end{aligned}$$

Here we used that \(\Phi (u,0)={\small 1}\!\!1\). Straightforward calculation using the identities (28) shows that \(f(\xi )=f_1(\xi )\nabla {}_{s}\xi +f_2(\xi ){\partial }_su\) where

$$\begin{aligned} f_1(\xi )\nabla {}_{s}\xi&= \left(\Phi (u,\xi )^{-1}E_2(u,\xi ) -{\small 1}\!\!1\right)\nabla {}_{s}\xi \\ f_2(\xi ){\partial }_su&= \left(\Phi (u,\xi )^{-1}E_1(u,\xi )-{\small 1}\!\!1 +\nabla {}_{2}\Phi (u,0)\xi \right){\partial }_su, \end{aligned}$$

that

$$\begin{aligned} g =g_1\circ \nabla {}_{t}{\partial }_tu +g_2\circ \left({\partial }_tu,{\partial }_tu\right) +g_3\circ \nabla {}_{t}\nabla {}_{t}\xi +g_4\circ \left({\partial }_tu,\nabla {}_{t}\xi \right) +g_5\circ \left(\nabla {}_{t}\xi ,\nabla {}_{t}\xi \right) \end{aligned}$$

where

$$\begin{aligned} g_1(\xi )&= \Phi (u,\xi )^{-1}E_1(u,\xi )-{\small 1}\!\!1 +\nabla {}_{2}\Phi (u,0)\xi \\ g_2(\xi )&= \Phi (u,\xi )^{-1}E_{11}(u,\xi ) -{\displaystyle \left.\frac{d}{d\tau }\right|}_{\tau =0} E_{11}(u,\tau \xi )\\ g_3(\xi )&= \Phi (u,\xi )^{-1}E_2(u,\xi )-{\small 1}\!\!1\\ g_4(\xi )&= 2\Phi (u,\xi )^{-1}E_{12}(u,\xi )\\ g_5(\xi )&= \Phi (u,\xi )^{-1}E_{22}(u,\xi ), \end{aligned}$$

and that

$$\begin{aligned} h(\xi ) =\Phi (u,\xi )^{-1}\mathrm{grad }\,\mathcal{V }(E(u,\xi )) -\left({\small 1}\!\!1 -\left(\nabla {}_{2}\Phi (u,0)\xi \right)\right) \mathrm{grad }\,\mathcal{V }(u) -\mathcal{H }_\mathcal{V }(u)\xi . \end{aligned}$$

Here \(\mathcal{H }_\mathcal{V }\) denotes the covariant Hessian of \(\mathcal{V }\) given by (4). It follows by inspection using the identities (28) that the maps \(f_2,g_1,g_2\), and \(h\) together with their first derivative are zero at \(\xi =0\). Therefore there exists a constant \(c>0\) which depends continuously on \(\mathopen |\xi \mathclose |\) and the constant in (V1) such that

$$\begin{aligned} \left|(f_2+g_1+g_2+h)(\xi )\right| \le c \left|\xi \right|^2\left( \left|{\partial }_su\right|+\left|\nabla {}_{t}{\partial }_tu\right| +\left|{\partial }_tu\right|^2+1 \right) \end{aligned}$$

pointwise at every \((s,t)\). Similarly, it follows that the remaining functions are zero at \(\xi =0\) and therefore

$$\begin{aligned} \left|(f_1+g_3+g_4+g_5)(\xi )\right| \le c \left|\xi \right|\left( \left|\nabla {}_{s}\xi \right| +\left|\nabla {}_{t}\nabla {}_{t}\xi \right| +\left|\nabla {}_{t}\xi \right|\left|{\partial }_tu\right| +\left|\nabla {}_{t}\xi \right|^2 \right). \end{aligned}$$

Take these pointwise estimates to the power \(p\), integrate them over \(\mathbb{R }\times S^1\) and pull out \(L^\infty \) norms of \({\partial }_su,{\partial }_tu\), and \(\nabla {}_{t}{\partial }_tu\) to obtain the conclusion of Proposition 5. The term \(\left|\xi \right|\cdot \left|\nabla {}_{t}\xi \right|^2\) involving a product of first order terms is taken care of by the product estimate [21, le. 4.1] and [21, rmk. 4.2]. Here we use the fact that the (compact) support of \(\xi \) is contained in some set \((a,b]\times S^1\). \(\square \)
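
The passage from "zero together with the first derivative at \(\xi =0\)" to a quadratic bound is the usual Taylor remainder argument. As a sketch, assume \(F\) is a smooth map, defined on vectors of norm less than \(\iota \) and taking values in a fixed normed vector space (a simplification on our part: in the proof above the maps are bundle valued and one would argue in a local trivialization), with \(F(0)=0\) and \(dF(0)=0\). Then

$$\begin{aligned} F(\xi )&=\int _0^1(1-\tau )\, d^2F(\tau \xi )(\xi ,\xi )\, d\tau ,\\ \left|F(\xi )\right|&\le \frac{1}{2}\sup _{\left|\zeta \right|\le \left|\xi \right|} \left\Vert d^2F(\zeta )\right\Vert \left|\xi \right|^2. \end{aligned}$$

If only \(F(0)=0\) is known, the first order formula \(F(\xi )=\int _0^1dF(\tau \xi )\xi \,d\tau \) gives the linear bound \(\left|F(\xi )\right|\le \sup \left\Vert dF\right\Vert \left|\xi \right|\) instead, which is the type of bound used for \(f_1,g_3,g_4\), and \(g_5\).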

3.1.1 Proof of the refined implicit function Theorem 7

Fix \(\mathcal{V }\) and \(x^\pm \) satisfying the assumptions of Theorem 7 and assume, by contradiction, that the conclusion of the theorem is false. Denote the constant in (V0) by \(C_0^\prime >1\). Then there are constants \(p>2\) and \(c_0>C_0^\prime \) and a sequence of smooth maps \(u_\nu :\mathbb{R }\times S^1\rightarrow M\) such that \(u_\nu (s,\cdot )\) converges asymptotically to \(x^\pm \) in \(W^{1,2}(S^1)\) and

$$\begin{aligned} \left|{\partial }_su_\nu (s,t)\right| \le \frac{c_0}{1+s^2},\quad \left\Vert {{\partial }_tu_\nu } \right\Vert_\infty \le c_0,\quad \left\Vert {\nabla {}_{t}{\partial }_tu_\nu } \right\Vert_\infty \le c_0, \end{aligned}$$
(37)

for all \((s,t)\in \mathbb{R }\times S^1\) and

$$\begin{aligned} \left\Vert {{\partial }_su_\nu -\nabla {}_{t}{\partial }_tu_\nu -\mathrm{grad }\, \mathcal{V }(u_\nu )} \right\Vert_p \le \frac{1}{\nu }, \end{aligned}$$
(38)

but which does not satisfy the conclusion of Theorem 7 for \(c=\nu \). This means that for every \(u\in \mathcal{M }(x^-,x^+;\mathcal{V })\) and every \(\xi ^\nu \in \mathrm{im\, }\,\mathcal{D }_{u}^*\cap \mathcal{W }_{u}\) which satisfy \(u_\nu =\exp _{u}(\xi ^\nu )\) it holds that

$$\begin{aligned} \left\Vert {{\partial }_su_\nu -\nabla {}_{t}{\partial }_tu_\nu -\mathrm{grad }\, \mathcal{V }(u_\nu )} \right\Vert_p <\frac{1}{\nu } \left\Vert {\xi ^\nu } \right\Vert_\mathcal{W }. \end{aligned}$$
(39)

The time shift of a smooth map \(u:\mathbb{R }\times S^1\rightarrow M\) by \(\sigma \in \mathbb{R }\) is defined pointwise by

$$\begin{aligned} \left(u*\sigma \right)(s,t) :=u^\sigma (s,t):=u(s+\sigma ,t). \end{aligned}$$

Set \(a_0:=2c_0^2\) and observe that

$$\begin{aligned} \mathcal{S }_\mathcal{V }(x^-) =\lim _{s\rightarrow -\infty }\mathcal{S }_\mathcal{V }(u_\nu (s,\cdot )) =\lim _{s\rightarrow -\infty }\left( \frac{1}{2} \left\Vert {{\partial }_tu_\nu (s,\cdot )} \right\Vert_2^2 -\mathcal{V }(u_\nu (s,\cdot )) \right) \le \frac{1}{2} c_0^2+C_0^\prime \le a_0 \end{aligned}$$

by asymptotic \(W^{1,2}\) convergence, estimate (37), axiom (V0), and \(c_0>C_0^\prime \). Now fix a regular value \(c_*\) of \(\mathcal{S }_\mathcal{V }\) between \(\mathcal{S }_\mathcal{V }(x^+)\) and \(\mathcal{S }_\mathcal{V }(x^-)\); use that the set \(\mathcal{P }^{a_0}(\mathcal{V })\) is finite, because \(\mathcal{S }_\mathcal{V }\) is Morse–Smale below level \(a_0\) by assumption. Applying time shifts, if necessary, we may assume without loss of generality that

$$\begin{aligned} \mathcal{S }_\mathcal{V }\left( u_\nu (0,\cdot )\right) =c_*. \end{aligned}$$
(40)
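
The last two inequalities in the action bound for \(x^-\) above can be checked directly. The following sketch assumes that axiom (V0) bounds \(\left|\mathcal{V }\right|\) by \(C_0^\prime \) on the loop space, which is how the constant is used here, and it uses that \(S^1=\mathbb{R }/\mathbb{Z }\) has length one:

$$\begin{aligned} \frac{1}{2}\left\Vert {\partial }_tu_\nu (s,\cdot )\right\Vert _2^2&\le \frac{1}{2}\left\Vert {\partial }_tu_\nu \right\Vert _\infty ^2 \le \frac{1}{2}c_0^2,\\ \frac{1}{2}c_0^2+C_0^\prime&<\frac{1}{2}c_0^2+c_0 <\frac{1}{2}c_0^2+c_0^2 <2c_0^2=a_0. \end{aligned}$$

The second line uses \(c_0>C_0^\prime >1\), hence \(c_0<c_0^2\).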

Furthermore, choose \(c_0^\prime :=a\) and denote by \(C_0=C_0(a,\mathcal{V })>0\) the constant in Theorem 3 (a priori estimates) with that choice. Hence

$$\begin{aligned} \left\Vert{\partial }_su\right\Vert_\infty +\left\Vert{\partial }_tu\right\Vert_\infty +\left\Vert\nabla {}_{t}{\partial }_tu\right\Vert_\infty \le C_0 \end{aligned}$$
(41)

for all \(u\in \mathcal{M }(x,y;\mathcal{V })\) and \(x,y\in \mathcal{P }^a(\mathcal{V })\).

Claim

There is a subsequence, still denoted by \(u_\nu \), a constant \(C\), a trajectory \(u\in \mathcal{M }(x^-,x^+;\mathcal{V })\), and a sequence of times \(\sigma _\nu \) such that the sequence \(\eta _\nu \) determined by the identity \( u_\nu =\exp _{u^{\sigma _\nu }} (\eta _\nu ) \) satisfies \(\eta _\nu \in \mathrm{im\, }\,\mathcal{D }_{u^{\sigma _\nu }}^* \cap \mathcal{W }_{u^{\sigma _\nu }}\) and

$$\begin{aligned} \lim _{\nu \rightarrow \infty }\left( \left\Vert {\eta _\nu } \right\Vert_\infty +\left\Vert {\eta _\nu } \right\Vert_p\right) =0 ,\quad \left\Vert {\eta _\nu } \right\Vert_\mathcal{W }\le C. \end{aligned}$$
(42)

The claim leads to a contradiction as follows. Consider the time shifted trajectories \(u^{\sigma _\nu }:=u*\sigma _\nu \) and vector fields \(\eta _\nu \) provided by the claim and note that \(u^{\sigma _\nu }\in \mathcal{M }(x^-,x^+;\mathcal{V })\). Note further that the assumptions of the quadratic estimate Proposition 5 are satisfied by (41) and by choosing a further subsequence, if necessary, to achieve that \(\mathopen \Vert {\eta _\nu } \mathclose \Vert _\infty <\iota \). Set \(c_0^\prime :=C_0(a,\mathcal{V })\) and let \(C_2=C_2(p,c_0^\prime )\) be the constant in Proposition 5 with that choice. Furthermore, since \(\mathcal{M }(x^-,x^+;\mathcal{V })/\mathbb{R }\) is a finite set by Proposition 1 and \(\mathcal{P }^a(\mathcal{V })\) is a finite set as well, the estimate for the right inverse Proposition 4 applies with constant \(C_1\) depending only on \(p\), \(a\), and \(\mathcal{V }\). Now definition (35) of the map \(\mathcal{F }_{u}\) and parallel transport being an isometry imply the first step of the estimate

$$\begin{aligned} \left\Vert { {\partial }_su_\nu -\nabla {}_{t}{\partial }_tu_\nu -\mathrm{grad }\mathcal{V }(u_\nu ) } \right\Vert_p&= \left\Vert {\mathcal{F }_{u}(\eta _\nu )} \right\Vert_p\\&\ge \left\Vert {\mathcal{D }_{u} \eta _\nu } \right\Vert_p -\left\Vert {\mathcal{F }_{u}(\eta _\nu ) -\mathcal{F }_{u}(0) -d\mathcal{F }_{u}(0)\eta _\nu } \right\Vert_p\\&\ge \left\Vert {\eta _\nu } \right\Vert_\mathcal{W }\left(\frac{1}{C_1} -C_2\left\Vert {\eta _\nu } \right\Vert_\infty \left(1+\left\Vert {\eta _\nu } \right\Vert_\mathcal{W }\right)\right)\\&\ge \frac{1}{2C_1} \left\Vert {\eta _\nu } \right\Vert_\mathcal{W }. \end{aligned}$$

Step two uses that \(\mathcal{F }_{u}(0)= {\partial }_su-\nabla {}_{t}{\partial }_tu -\mathrm{grad }\mathcal{V }(u)=0\) and \(d\mathcal{F }_{u}(0)=\mathcal{D }_{u}\). Step three is by Proposition 4 and Proposition 5. By (42) the last step holds for sufficiently large \(\nu \). For \(\nu >2C_1\) the estimate contradicts (39) and this proves Theorem 7. It remains to prove the claim and this takes four steps.
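
For the record, here is one way to make "sufficiently large \(\nu \)" and the final contradiction explicit; the threshold below is an illustrative choice and is not taken from the source. By (42) we have \(\left\Vert \eta _\nu \right\Vert _\mathcal{W }\le C\) and \(\left\Vert \eta _\nu \right\Vert _\infty \rightarrow 0\), so for all large \(\nu \)

$$\begin{aligned} C_2\left\Vert \eta _\nu \right\Vert _\infty \left(1+\left\Vert \eta _\nu \right\Vert _\mathcal{W }\right) \le C_2(1+C)\left\Vert \eta _\nu \right\Vert _\infty \le \frac{1}{2C_1}, \end{aligned}$$

which gives the last step of the estimate. Combining the estimate with (39), applied to the trajectory \(u^{\sigma _\nu }\) and the vector field \(\eta _\nu \), yields

$$\begin{aligned} \frac{1}{2C_1}\left\Vert \eta _\nu \right\Vert _\mathcal{W } \le \left\Vert {\partial }_su_\nu -\nabla {}_{t}{\partial }_tu_\nu -\mathrm{grad }\,\mathcal{V }(u_\nu )\right\Vert _p <\frac{1}{\nu }\left\Vert \eta _\nu \right\Vert _\mathcal{W }. \end{aligned}$$

The strict inequality forces \(\left\Vert \eta _\nu \right\Vert _\mathcal{W }>0\) and then \(\nu <2C_1\), which fails for large \(\nu \).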

Step 1

There is a subsequence of \(u_\nu \), still denoted by \(u_\nu \), and a trajectory \(u\in \mathcal{M }(x^-,x^+;\mathcal{V })\) such that

$$\begin{aligned} u_\nu =\exp _u(\xi _\nu ),\quad \lim _{\nu \rightarrow \infty }\left( \left\Vert {\xi _\nu } \right\Vert_\infty +\left\Vert {\xi _\nu } \right\Vert_p\right) =0. \end{aligned}$$
(43)

Proof

We embed the compact Riemannian manifold \(M\) isometrically into some Euclidean space \(\mathbb{R }^N\) and consider \(u_\nu :\mathbb{R }\times S^1\rightarrow M\) as a map to \(\mathbb{R }^N\) thereby conveniently obtaining \(L^p\) and \(L^\infty \) norms for \(u_\nu \). By translation we may assume that \(M\) contains the origin. By compactness of \(M\) and the \(L^\infty \) bounds (37) we obtain on every compact cylindrical domain \(Z_T:=[-T,T]\times S^1\) the estimates

$$\begin{aligned} \left\Vert {u_\nu } \right\Vert_{L^p(Z_T)} \le (2T)^\frac{1}{p}\, \mathrm{diam\,}M, \quad \left\Vert {{\partial }_tu_\nu } \right\Vert_{L^p(Z_T)} +\left\Vert {\nabla {}_{t}{\partial }_tu_\nu } \right\Vert_{L^p(Z_T)} \le 2c_0(2T)^\frac{1}{p}, \end{aligned}$$

and

$$\begin{aligned} \left\Vert {{\partial }_su_\nu } \right\Vert_r \le 4c_0\quad \forall r\in (1,\infty ]. \end{aligned}$$
(44)

The latter follows from \( \int _{-\infty }^\infty (1+s^2)^{-r} \,ds \le 2+2\int _1^\infty s^{-2r} \,ds = 4(2-1/r)^{-1} <4 \) whenever \(r>1\). Hence the sequence \(u_\nu \) is uniformly bounded in \(\mathcal{W }^{1,p}(Z_T)\). Thus by the Arzela–Ascoli and the Banach–Alaoglu theorem a suitable subsequence, still denoted by \(u_\nu \), converges strongly in \(C^0\) and weakly in \(\mathcal{W }^{1,p}\) on every compact cylindrical domain \(Z_T\) to some continuous map \(u:\mathbb{R }\times S^1\rightarrow M\) which is locally of class \(\mathcal{W }^{1,p}\). Hence \({\partial }_su_\nu -\nabla {}_{t}{\partial }_tu_\nu -\mathrm{grad }\mathcal{V }(u_\nu )\) converges weakly in \(L^p\) to \({\partial }_su-\nabla {}_{t}{\partial }_tu-\mathrm{grad }\mathcal{V }(u)\). On the other hand, by (38) it converges to zero in \(L^p\). By uniqueness of limits \(u\) satisfies the heat equation (7) almost everywhere. Thus \(u\) is smooth by Theorem 2.
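
To see how the decay in (37) produces (44), one can trace the constants as follows. This is a routine check; the case \(r=\infty \) is immediate from (37).

$$\begin{aligned} \left\Vert {\partial }_su_\nu \right\Vert _r^r&\le c_0^r\int _{-\infty }^\infty (1+s^2)^{-r}\,ds \le c_0^r\left(2+2\int _1^\infty s^{-2r}\,ds\right) =c_0^r\left(2+\frac{2}{2r-1}\right) =c_0^r\,\frac{4}{2-1/r},\\ \left\Vert {\partial }_su_\nu \right\Vert _r&\le c_0\left(\frac{4}{2-1/r}\right)^{1/r} \le 4^{1/r}c_0 \le 4c_0. \end{aligned}$$

The first inequality integrates the pointwise bound in (37) over \(t\in S^1\), and the last two use \(2-1/r>1\) and \(1/r<1\) for \(r>1\).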

Fix \(s\in \mathbb{R }\) and observe that by (37) there are uniform \(C^1(S^1)\) bounds for the sequence \({\partial }_tu_\nu (s,\cdot )\). Hence by Arzela–Ascoli a suitable subsequence, still denoted by \({\partial }_tu_\nu (s,\cdot )\), converges in \(C^0(S^1)\) to \({\partial }_tu(s,\cdot )\). Thus

$$\begin{aligned} \lim _{\nu \rightarrow \infty } \mathcal{S }_\mathcal{V }(u_\nu (s,\cdot )) =\mathcal{S }_\mathcal{V }(u(s,\cdot )) \end{aligned}$$

and therefore \(\mathcal{S }_\mathcal{V }(u(0,\cdot ))=c_*\) by (40). Recall that \({\partial }_su=\nabla {}_{t}{\partial }_tu+\mathrm{grad }\,\mathcal{V }(u)\). When restricted to \(s=0\) this means that the vector field \({\partial }_su(0,\cdot )\) is equal to the \(L^2\) gradient of \(\mathcal{S }_\mathcal{V }\) at the loop \(u(0,\cdot )\). But \(\mathcal{S }_\mathcal{V }(u(0,\cdot ))=c_*\) and \(c_*\) is a regular value. Hence \({\partial }_su(0,\cdot )\) cannot vanish identically.

On the other hand, by (37) and axiom (V0) it follows exactly as above that

$$\begin{aligned} \sup _\nu \mathcal{S }_\mathcal{V }(u_\nu (s,\cdot )) =\sup _\nu \left( \frac{1}{2}\left\Vert {{\partial }_tu_\nu (s,\cdot )} \right\Vert_2^2 -\mathcal{V }(u_\nu (s,\cdot )) \right) \le a_0. \end{aligned}$$

This shows that all relevant trajectories, including relevant limits over \(s\) or \(\nu \), lie in the sublevel set \(\mathcal{L }^{a_0} M\) on which \(\mathcal{S }_\mathcal{V }\) is Morse–Smale by assumption. In particular, we have that \(\sup _{s\in \mathbb{R }}\mathcal{S }_\mathcal{V }(u(s,\cdot ))\le a_0\) and therefore the energy of \(u\) is finite by Lemma 2. Hence by the exponential decay Theorem 4 there are critical points \(y^\pm \in \mathcal{P }^{a_0}(\mathcal{V })\) such that \(u(s,\cdot )\) converges to \(y^\pm \) in \(C^2(S^1)\), as \(s\rightarrow \pm \infty \). Moreover, the limits \(y^-\) and \(y^+\) are distinct, because the action along a nonconstant trajectory is strictly decreasing and the trajectory is nonconstant, since \({\partial }_su\) is not identically zero as observed above.

More generally, a standard argument shows the following, see e.g. [13, lemma 10.3]. There exist critical points \(x^-=x^0,x^1,\dots ,x^\ell =x^+ \in \mathcal{P }^{a_0}(\mathcal{V })\) and trajectories \(u^k\in \mathcal{M }(x^{k-1},x^k;\mathcal{V })\), \({\partial }_su^k\not \equiv 0\), for \(k\in \{1,\dots ,\ell \}\), a subsequence, still denoted by \(u_\nu \), and sequences \(s_\nu ^k\in \mathbb{R }\), \(k\in \{1,\dots ,\ell \}\), such that the shifted sequence \(u_\nu (s_\nu ^k+s,t)\) converges to \(u^k(s,t)\) in an appropriate topology. The point here is that \({\partial }_su^k\not \equiv 0\) and therefore the Morse index strictly decreases along the sequence \(x^-=x^0,x^1,\dots ,x^\ell =x^+\). Namely, each operator \(\mathcal{D }_{u^k}\) is onto by Morse–Smale and Fredholm by Theorem 5. Hence the Fredholm index is equal to the dimension of the kernel which is strictly positive, because the kernel contains the nonzero element \({\partial }_su^k\). On the other hand, again by Theorem 5, the Fredholm index is given by the difference of Morse indices \(\mathrm{ind}_\mathcal{V }(x^{k-1})-\mathrm{ind}_\mathcal{V }(x^k)\). Hence \(\ell =1\), since the pair \(x^\pm \) has Morse index difference one. Thus \(u\in \mathcal{M }(x^-,x^+;\mathcal{V })\) and this proves the first assertion of step 1.

It remains to prove (43). The key observation is that \(u_\nu (s,\cdot )\) not only asymptotically converges in \(W^{1,2}(S^1)\) to \(x^\pm \), but the rate of convergence is independent of \(\nu \). The fundamental theorem of calculus and uniform decay (37) show that

$$\begin{aligned} \left|x^+(t)-u_\nu (s,t)\right|_{\mathbb{R }^N} =\left|\int _{s}^\infty {\partial }_su_\nu (\sigma ,t)\,d\sigma \right|_{\mathbb{R }^N} \le \int _{s}^\infty \frac{c_0}{\sigma ^2}\, d\sigma =\frac{c_0}{s} \end{aligned}$$
(45)

for all \(t\in S^1\), \(\nu \in \mathbb{N }\), and \(s\ge 1\). Since the restriction of the Euclidean distance in \(\mathbb{R }^N\) to the compact manifold \(M\) and the Riemannian distance \(d\) in \(M\) are locally equivalent, estimate (45) shows the following. Consider the injectivity radius \(\iota >0\) of \(M\) and assume \({\varepsilon }\in (0,\iota /2)\), then

$$\begin{aligned} s>\frac{6c_0}{{\varepsilon }} \quad \Longrightarrow \quad d\left( u_\nu (s,t),x^+(t)\right)<\frac{{\varepsilon }}{6} \end{aligned}$$

for all \(t\in S^1\) and \(\nu \in \mathbb{N }\); similarly for \(x^-\). Now denote by \(Z_{\varepsilon }^+ :=[6c_0/{\varepsilon },\infty )\times S^1\) the positive end of the cylinder \(\mathbb{R }\times S^1\) and by \(Z_{\varepsilon }^-\) the negative end. Observe that the ends \(u_\nu (Z_{\varepsilon }^\pm )\) are contained in the \(({\varepsilon }/6)\)-neighborhood of \(x^\pm (S^1)\), for all \(\nu \). We may assume without loss of generality that this is also true for the ends \(u(Z_{\varepsilon }^\pm )\) of \(u\); otherwise replace \(6c_0\) by a larger constant. Now, since \(u_\nu \) converges to \(u\) uniformly on \(Z({\varepsilon }):=[-6c_0/{\varepsilon },6c_0/{\varepsilon }]\times S^1\), there exists \(\nu _0({\varepsilon })\in \mathbb{N }\) such that \(\mathopen \Vert {\xi _\nu } \mathclose \Vert _{L^\infty (Z({\varepsilon }))}<{\varepsilon }/3\) for every \(\nu \ge \nu _0({\varepsilon })\). Hence

$$\begin{aligned} \left\Vert {\xi _\nu } \right\Vert_\infty&\le \left\Vert {\xi _\nu } \right\Vert_{L^\infty (Z_{\varepsilon }^-)} +\left\Vert {\xi _\nu } \right\Vert_{L^\infty (Z({\varepsilon }))}+\left\Vert {\xi _\nu } \right\Vert_{L^\infty (Z_{\varepsilon }^+)}\nonumber \\&\le \sup _{Z_{\varepsilon }^-}\left( d(u_\nu ,x^-) +d(x^-,u)\right) +\left\Vert {\xi _\nu } \right\Vert_{L^\infty (Z({\varepsilon }))}\nonumber \\&+\sup _{Z_{\varepsilon }^+}\left( d(u_\nu ,x^+) +d(x^+,u)\right)\nonumber \\&\le {\varepsilon }\end{aligned}$$
(46)

for every \(\nu \ge \nu _0({\varepsilon })\). Next pick a sequence \({\varepsilon }_k\rightarrow 0\) and choose a sequence \(\nu _k\rightarrow \infty \) such that \(\nu _k\ge \nu _0({\varepsilon }_k)\). Then, without changing notation, replace \(u_\nu \) by the subsequence \(u_{\nu _k}\) and observe that the corresponding \(L^\infty \) limit in (43) is indeed zero. To prove that the \(L^p\) limit is zero use again the decomposition of \(\mathbb{R }\times S^1\) into the compact part \(Z({\varepsilon })\) and the two ends \(Z_{\varepsilon }^\pm \). Observe that the right hand side of (45) is \(p\)-integrable over the ends \(Z_{\varepsilon }^\pm \). Again the key facts are that the values of both integrals do not depend on \(\nu \) and they converge to zero, as \(\mathopen |s\mathclose |\rightarrow \infty \). In the case of \(u\) use the exponential decay Theorem 4 to obtain a similar asymptotic estimate in terms of an exponentially decaying function. \(\square \)
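
Some of the bookkeeping in the proof of step 1 can be written out. The sketch below uses the abbreviation \(s_0:=6c_0/{\varepsilon }\), which is ours, and it ignores the constant relating the Euclidean and the Riemannian distance on \(M\). The first line adds up the bounds used in the last step of (46), the second line evaluates the \(p\)-th power integral of the right hand side of (45) over the positive end, and the third line is one way to control the compact part, whose area is \(2s_0\):

$$\begin{aligned} \left(\frac{{\varepsilon }}{6}+\frac{{\varepsilon }}{6}\right) +\frac{{\varepsilon }}{3} +\left(\frac{{\varepsilon }}{6}+\frac{{\varepsilon }}{6}\right)&={\varepsilon },\\ \int _{s_0}^\infty \int _0^1\left(\frac{c_0}{s}\right)^p dt\,ds&=\frac{c_0^p}{p-1}\,s_0^{1-p},\\ \left\Vert \xi _\nu \right\Vert _{L^p(Z({\varepsilon }))}&\le \frac{{\varepsilon }}{3}\left(\frac{12c_0}{{\varepsilon }}\right)^{1/p} \quad \text{for } \nu \ge \nu _0({\varepsilon }). \end{aligned}$$

Neither of the last two right hand sides depends on \(\nu \), and both tend to zero as \({\varepsilon }\rightarrow 0\), because \(p>1\).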

Step 2

Consider the constant \(C_0\) in (41) and \(u\) and the sequence \(\xi _\nu \) provided by Step 1. Set \({\varepsilon }_\nu :=\mathopen \Vert {\xi _\nu } \mathclose \Vert _\infty +\mathopen \Vert {\xi _\nu } \mathclose \Vert _p\). Then there is a constant \(\sigma _0>0\) and an integer \(\nu _0\ge 1\) such that \(\eta =\eta (s,t;\sigma ,\nu )\), determined by the identity \(u_\nu =\exp _{u^\sigma }(\eta )\), satisfies \(\mathopen \Vert {\eta } \mathclose \Vert _\infty <\iota /2\) for all \(\sigma \in [-\sigma _0,\sigma _0]\) and \(\nu \ge \nu _0\). Furthermore, there is a constant \(c_2=c_2(a_0,\sigma _0)>0\) such that

$$\begin{aligned} \left\Vert {\eta } \right\Vert_\infty \le {\varepsilon }_\nu +C_0\left|\sigma \right| ,\quad \left\Vert {\eta } \right\Vert_p \le 2{\varepsilon }_\nu +c_2\left|\sigma \right| \end{aligned}$$

and

$$\begin{aligned} \left\Vert {\nabla {}_{s}\eta } \right\Vert_p \le c_2,\quad \left\Vert {\nabla {}_{t}\eta } \right\Vert_\infty \le c_2,\quad \left\Vert {\nabla {}_{t}\nabla {}_{t}\eta } \right\Vert_p \le c_2 \end{aligned}$$

for all \(\sigma \in [-\sigma _0,\sigma _0]\) and \(\nu \ge \nu _0\).

Proof

Existence of \(\sigma _0\) and \(\nu _0\) follows from the fact that \(\eta (\cdot ,\cdot ;0,\nu )=\xi _\nu \), continuity of time shift, and the \(L^\infty \) limit in (43). Now denote by \(L\) the length functional. Then for all \(\sigma \in \mathbb{R }\) and \(\gamma (r):=u(s+r\sigma ,t)\) with \(r\in [0,1]\) we have that

$$\begin{aligned} d\left( u(s,t),u(s+\sigma ,t) \right) \le L(\gamma ) =\left|\sigma \right| \int _0^1\left|{\partial }_su(s+r\sigma ,t)\right|dr \le \left|\sigma \right| \left\Vert {{\partial }_su} \right\Vert_\infty . \end{aligned}$$
(47)

Since \(d\left( u_\nu (s,t),u(s,t)\right) =\left|\xi _\nu (s,t)\right|\le {\varepsilon }_\nu \), the first estimate of step 2 follows from \(\left|\eta (s,t)\right|=d\left( u_\nu (s,t), u(s+\sigma ,t)\right)\), the triangle inequality, and (41). To prove the second estimate note that the triangle inequality also implies that

$$\begin{aligned} \left\Vert {\eta } \right\Vert_p^p \le 2^{p-1}\left\Vert {\xi _\nu } \right\Vert_p^p +2^{p-1}\int _{-\infty }^\infty \int _0^1 d\left( u(s,t),u(s+\sigma ,t) \right)^p\, dtds. \end{aligned}$$

By Theorem 4 on exponential decay there are constants \(\rho ,c_3>2\) such that for all \((\tilde{s},t)\in \mathbb{R }\times S^1\) we have that

$$\begin{aligned} \left|{\partial }_su(\tilde{s},t)\right| \le c_3e^{-\rho \mathopen |\tilde{s}\mathclose |},\quad \left\Vert {{\partial }_su} \right\Vert_r\le c_3\quad \forall r>1. \end{aligned}$$
(48)

Note that the constants \(\rho \) and \(c_3\) depend only on \(a_0\), since the set \(\mathcal{P }^{a_0}(\mathcal{V })\) is finite and there are only finitely many elements of \(\mathcal{M }(x^-,x^+;\mathcal{V })\) which satisfy (40). By the first inequality in (47) and the first estimate in (48) with \(\tilde{s}=s+r\sigma \)

$$\begin{aligned} d\left( u(s,t),u(s+\sigma ,t) \right) \le \left|\sigma \right| \int _0^1 \left|{\partial }_su(s+r\sigma ,t)\right| dr \le \left|\sigma \right| c_3 e^{\rho \sigma _0} e^{-\rho \mathopen |s\mathclose |}. \end{aligned}$$

But the right hand side is \(L^p\) integrable and this concludes the proof of the second estimate of step 2. To prove the next two estimates we differentiate the identity \( \exp _{u^\sigma }\eta =u_\nu \) with respect to \(s\) and \(t\) to obtain that

$$\begin{aligned} E_1(u^\sigma ,\eta ){\partial }_su^\sigma +E_2(u^\sigma ,\eta )\nabla {}_{s}\eta&={\partial }_su_\nu \end{aligned}$$
(49)
$$\begin{aligned} E_1(u^\sigma ,\eta ){\partial }_tu^\sigma +E_2(u^\sigma ,\eta )\nabla {}_{t}\eta&={\partial }_tu_\nu \end{aligned}$$
(50)

where the maps \(E_i\) are defined by (26). Since \(\mathopen \Vert {{\partial }_su^\sigma } \mathclose \Vert _p\le c_3\) by (48) and \(\mathopen \Vert {{\partial }_su_\nu } \mathclose \Vert _p\le 4c_0\) by (44), the \(L^p\) norm of \(\nabla {}_{s}\eta \) is uniformly bounded as well. Similarly, since \(\mathopen \Vert {{\partial }_tu^\sigma } \mathclose \Vert _\infty \le C_0\) by (41) and \(\mathopen \Vert {{\partial }_tu_\nu } \mathclose \Vert _\infty \le c_0\) by (37), the \(L^\infty \) norm of \(\nabla {}_{t}\eta \) is uniformly bounded. To prove the last estimate of step 2 differentiate (50) covariantly with respect to \(t\) and abbreviate \(E_{ij}=E_{ij}(u^\sigma ,\eta )\) to obtain

$$\begin{aligned}&E_{11}(u^\sigma ,\eta ) \left({\partial }_tu^\sigma ,{\partial }_tu^\sigma \right) +E_{12}(u^\sigma ,\eta ) \left({\partial }_tu^\sigma ,\nabla {}_{t}\eta \right) +E_1(u^\sigma ,\eta )\, \nabla {}_{t}{\partial }_tu^\sigma \\&\qquad \!+\!\,E_{21}(u^\sigma ,\eta ) \left(\nabla {}_{t}\eta ,{\partial }_tu^\sigma \right) \!+\! E_{22}(u^\sigma ,\eta ) \left(\nabla {}_{t}\eta ,\nabla {}_{t}\eta \right) \!+\! E_2(u^\sigma ,\eta )\, \nabla {}_{t}\nabla {}_{t}\eta \!+\! \mathrm{grad }\,\mathcal{V }(u_\nu )\!-\!{\partial }_su_\nu \\&\quad =\nabla {}_{t}{\partial }_tu_\nu +\mathrm{grad }\,\mathcal{V }(u_\nu )-{\partial }_su_\nu . \end{aligned}$$

This identity implies a uniform \(L^p\) bound for \(\nabla {}_{t}\nabla {}_{t}\eta \) as follows. The right hand side is bounded in \(L^p\) by \(1/\nu \) and the last term of the left hand side by \(4c_0\) according to (44). Since \(E_{ij}(u^\sigma ,0)=0\) and since we have uniform \(L^\infty \) bounds for each of the two linear terms to which \(E_{ij}(u^\sigma ,\eta )\) is applied, we can estimate the \(L^p\) norm by a constant times \(\mathopen \Vert {\eta } \mathclose \Vert _p\). The only terms left are term three and term seven of the left hand side. By the heat equation (7) their sum equals

$$\begin{aligned} E_1(u^\sigma ,\eta )\, {\partial }_su^\sigma -E_1(u^\sigma ,\eta )\, \mathrm{grad }\,\mathcal{V }(u^\sigma ) +\mathrm{grad }\mathcal{V }(u_\nu ). \end{aligned}$$

Since \(\mathopen \Vert {{\partial }_su^\sigma } \mathclose \Vert _p\le c_3\) by (48), the \(L^p\) norm of the first term is uniformly bounded. Consider the remaining two terms as a function \(f\) of \(\eta \). Then \(f(0)=0\), because \(E_1(u^\sigma ,0)={\small 1}\!\!1\) and \(\eta =0\) means \(u_\nu =u^\sigma \). Hence \(\mathopen \Vert {f} \mathclose \Vert _p\) is uniformly bounded by a constant times \(\mathopen \Vert {\eta } \mathclose \Vert _p\). Here we used axiom (V0). This proves step 2. \(\square \)
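
The second estimate of step 2 can be made explicit as follows. This is a sketch of one admissible contribution to \(c_2\); the constant \(c_2\) in the statement must in addition dominate the bounds on the derivatives of \(\eta \). Integrating the \(p\)-th power of the pointwise bound on \(d\left( u(s,t),u(s+\sigma ,t)\right)\) and inserting the result into the inequality for \(\left\Vert \eta \right\Vert _p^p\) gives

$$\begin{aligned} \int _{-\infty }^\infty \int _0^1 d\left( u(s,t),u(s+\sigma ,t)\right)^p dt\,ds&\le \left(\left|\sigma \right| c_3e^{\rho \sigma _0}\right)^p \int _{-\infty }^\infty e^{-p\rho \left|s\right|}\,ds =\left(\left|\sigma \right| c_3e^{\rho \sigma _0}\right)^p\frac{2}{p\rho },\\ \left\Vert \eta \right\Vert _p&\le 2^{(p-1)/p}\left( \left\Vert \xi _\nu \right\Vert _p +\left|\sigma \right| c_3e^{\rho \sigma _0} \left(\frac{2}{p\rho }\right)^{1/p}\right) \le 2{\varepsilon }_\nu +2c_3e^{\rho \sigma _0} \left(\frac{2}{p\rho }\right)^{1/p}\left|\sigma \right| . \end{aligned}$$

The second line uses \((A^p+B^p)^{1/p}\le A+B\) for \(A,B\ge 0\), the bound \(2^{(p-1)/p}\le 2\), and \(\left\Vert \xi _\nu \right\Vert _p\le {\varepsilon }_\nu \).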

Step 3

For \(\sigma \in [-\sigma _0,\sigma _0]\) and \(\nu \ge \nu _0\) consider the function \( \theta _\nu (\sigma ) =-\langle {\partial }_su^\sigma , \eta \rangle \) where \(\eta =\eta (s,t;\sigma ,\nu )\) is determined by the identity \(u_\nu =\exp _{u^\sigma }(\eta )\), see step 2, and \(\langle \cdot ,\cdot \rangle \) denotes the \(L^2(\mathbb{R }\times S^1)\) inner product. This function satisfies

$$\begin{aligned} \theta _\nu (\sigma )=0 \quad \Longleftrightarrow \quad \eta \in \mathrm{im\, }\mathcal{D }_{u^\sigma }^*. \end{aligned}$$

Moreover, there exist new constants \(\sigma _0>0\) and \(\nu _0\in \mathbb{N }\) such that

$$\begin{aligned} \mathopen |\theta _\nu (0)\mathclose | \le c_3{\varepsilon }_\nu ,\quad \frac{d}{d\sigma } \theta _\nu (\sigma ) \ge \frac{\mu }{2} :=\frac{\mathcal{S }_\mathcal{V }(x^-)-\mathcal{S }_\mathcal{V }(x^+)}{2} >0 \end{aligned}$$

for all \(\sigma \in [-\sigma _0,\sigma _0]\) and \(\nu \ge \nu _0\) where \(c_3=c_3(a_0)\) is the constant in (48).

Proof

‘\(\Leftarrow \)’ follows by definition of the formal adjoint operator using that \({\partial }_su^\sigma \in \ker \mathcal{D }_{u^\sigma }\). We prove ‘\(\Rightarrow \)’. The kernel of \(\mathcal{D }_{u^\sigma }\) is 1-dimensional, because the operator is Fredholm of index one by Theorem 5 and onto by the Morse–Smale condition. The kernel is spanned by the (nonzero) element \({\partial }_su^\sigma \). Now consider \(\mathcal{D }_{u^\sigma }^*\) on the domain \(\mathcal{W }^{2,p}\) and apply [21, prop. 3.18] to obtain that \(\mathcal{W }^{1,p}=\ker \mathcal{D }_{u^\sigma } \oplus \mathrm{im\, }\mathcal{D }_{u^\sigma }^*\). The implication ‘\(\Rightarrow \)’ now follows immediately by contradiction.

By (48) and the definition of the sequence \({\varepsilon }_\nu \rightarrow 0\) in step 2 it follows that

$$\begin{aligned} \mathopen |\theta _\nu (0)\mathclose | =\left|\langle {\partial }_su, \xi _\nu \rangle _{L^2}\right| \le \left\Vert {{\partial }_su} \right\Vert_q \left\Vert {\xi _\nu } \right\Vert_p \le c_3{\varepsilon }_\nu \end{aligned}$$

where \(q\in (1,2)\) is determined by \(1/q+1/p=1\). Abbreviate \(E_i=E_i(u^\sigma ,\eta )\). Then straightforward calculation using the identity (49) for \(\nabla {}_{s}\eta \) shows that

$$\begin{aligned} \frac{d}{d\sigma } \theta _\nu (\sigma )&= -\langle \nabla {}_{s}{\partial }_su^\sigma , \eta \rangle -\langle {\partial }_su^\sigma , -{\partial }_su^\sigma +{\partial }_su^\sigma -E_2^{-1}E_1{\partial }_su^\sigma \rangle \\&\ge -\left\Vert {\nabla {}_{s}{\partial }_su^\sigma } \right\Vert_q \left\Vert {\eta } \right\Vert_p +\left\Vert {{\partial }_su^\sigma } \right\Vert_2^2 -\left\Vert {{\partial }_su^\sigma } \right\Vert_q \left\Vert {{\partial }_su^\sigma } \right\Vert_\infty c_4 \left\Vert {\eta } \right\Vert_p\\&= \left\Vert {{\partial }_su} \right\Vert_2^2 -\left\Vert {\eta } \right\Vert_p\left( \left\Vert {\nabla {}_{s}{\partial }_su} \right\Vert_q +c_4\left\Vert {{\partial }_su} \right\Vert_q \left\Vert {{\partial }_su} \right\Vert_\infty \right)\\&\ge \left\Vert {{\partial }_su} \right\Vert_2^2 -(2{\varepsilon }_\nu +c_2\mathopen |\sigma \mathclose |) (c_5+c_3^2c_4) \end{aligned}$$

for some constant \(c_4=c_4(a_0,\sigma _0)>0\). The last step is by (48) with constant \(c_3\). We also used that \(\mathopen \Vert {\nabla {}_{s}{\partial }_su} \mathclose \Vert _q\le c_5\) for some positive constant \(c_5=c_5(a_0)\), which follows from exponential decay of \(\nabla {}_{s}{\partial }_su\) according to Theorem 4. The energy identity (8) shows that \(\mathopen \Vert {{\partial }_su} \mathclose \Vert _2^2=\mu >0\). Now choose \(\sigma _0>0\) sufficiently small and \(\nu _0\) sufficiently large to conclude the proof of step 3. \(\square \)
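
One explicit way to make the final choice in step 3 is the following sketch. It writes \(K:=c_5+c_3^2c_4\), which is our abbreviation, and it assumes that the constants \(c_2\) and \(c_4\), which were fixed for the original \(\sigma _0\), remain valid after shrinking \(\sigma _0\). Pick \(\sigma _0\le \mu /(4c_2K)\) and \(\nu _0\) so large that \({\varepsilon }_\nu \le \mu /(8K)\) for all \(\nu \ge \nu _0\). Then, using \(\left\Vert {\partial }_su\right\Vert _2^2=\mu \),

$$\begin{aligned} \frac{d}{d\sigma }\theta _\nu (\sigma ) \ge \mu -\left(2{\varepsilon }_\nu +c_2\left|\sigma \right|\right)K \ge \mu -\left(\frac{\mu }{4}+\frac{\mu }{4}\right) =\frac{\mu }{2} \end{aligned}$$

for all \(\sigma \in [-\sigma _0,\sigma _0]\) and \(\nu \ge \nu _0\).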

Step 4

We prove the claim.

Proof

By step 3 there exists, for every sufficiently large \(\nu \), an element \(\sigma _\nu \in [-\sigma _0,\sigma _0]\) such that \( \theta _\nu (\sigma _\nu ) =0 \) and \( \left|\sigma _\nu \right| \le {\varepsilon }_\nu (2c_3/\mu ) \). Then \(\eta _\nu :=\eta (\cdot ,\cdot ;\sigma _\nu ,\nu )\) lies in the image of \(\mathcal{D }_{u^{\sigma _\nu }}^*\) by step 3 and

$$\begin{aligned} \left\Vert {\eta _\nu } \right\Vert_\infty +\left\Vert {\eta _\nu } \right\Vert_p \le {\varepsilon }_\nu \left(3+(c_2+C_0)2c_3/ \mu \right) ,\quad \left\Vert {\eta _\nu } \right\Vert_\mathcal{W }\le C, \end{aligned}$$

by step 2. This proves (42), hence the claim, and therefore Theorem 7. \(\square \)
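
The existence and the size of \(\sigma _\nu \) follow from the two estimates of step 3 by an elementary monotonicity argument, sketched here. Since \(\frac{d}{d\sigma }\theta _\nu \ge \mu /2\) on \([-\sigma _0,\sigma _0]\) and \(\left|\theta _\nu (0)\right|\le c_3{\varepsilon }_\nu \), we have

$$\begin{aligned} \theta _\nu (\sigma )&\ge \theta _\nu (0)+\frac{\mu }{2}\sigma \ge -c_3{\varepsilon }_\nu +\frac{\mu }{2}\sigma \quad \text{for } 0\le \sigma \le \sigma _0,\\ \theta _\nu (\sigma )&\le \theta _\nu (0)+\frac{\mu }{2}\sigma \le c_3{\varepsilon }_\nu +\frac{\mu }{2}\sigma \quad \text{for } -\sigma _0\le \sigma \le 0. \end{aligned}$$

Hence \(\theta _\nu (2c_3{\varepsilon }_\nu /\mu )\ge 0\ge \theta _\nu (-2c_3{\varepsilon }_\nu /\mu )\) as soon as \(2c_3{\varepsilon }_\nu /\mu \le \sigma _0\), which holds for large \(\nu \), and the intermediate value theorem provides a zero \(\sigma _\nu \) with \(\left|\sigma _\nu \right|\le 2c_3{\varepsilon }_\nu /\mu \). Inserting this into the first two estimates of step 2 gives

$$\begin{aligned} \left\Vert \eta _\nu \right\Vert _\infty +\left\Vert \eta _\nu \right\Vert _p \le \left({\varepsilon }_\nu +C_0\left|\sigma _\nu \right|\right) +\left(2{\varepsilon }_\nu +c_2\left|\sigma _\nu \right|\right) \le {\varepsilon }_\nu \left(3+(c_2+C_0)\frac{2c_3}{\mu }\right). \end{aligned}$$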

4 Unique continuation

To prove unique continuation for the nonlinear heat equation (7) we slightly extend a result of Agmon and Nirenberg [2] to the case \(C_1\not =0\). Indeed the heat equation (7) leads to (51) with \(C_1\not =0\); see (56). In contrast, for the linear heat equation (21) the original result (\(C_1=0\)) suffices.

Theorem 15

Let \(H\) be a real Hilbert space and let \(A(s):\mathrm{dom\, }A(s)\rightarrow H\) be a family of symmetric linear operators. Assume that \(\zeta :[0,T]\rightarrow H\) is continuously differentiable in the weak topology such that \(\zeta (s)\in \mathrm{dom\, }A(s)\) and

$$\begin{aligned} \left\Vert {\zeta ^\prime (s)-A(s)\zeta (s)} \right\Vert \le c_1 \left\Vert {\zeta (s)} \right\Vert +C_1 \left|\langle A(s)\zeta (s),\zeta (s)\rangle \right|^{1/2} \end{aligned}$$
(51)

for every \(s\in [0,T]\) and two constants \(c_1,C_1\ge 0\). Here \(\zeta ^\prime (s)\in H\) denotes the derivative of \(\zeta \) with respect to \(s\). Assume further that the function \(s\mapsto \langle \zeta (s),A(s)\zeta (s)\rangle \) is also continuously differentiable and satisfies

$$\begin{aligned} \frac{d}{ds} \langle \zeta ,A\zeta \rangle -2\langle \zeta ^\prime ,A\zeta \rangle \ge -c_2 \left\Vert {A\zeta } \right\Vert\left\Vert {\zeta } \right\Vert -c_3 \left\Vert {\zeta } \right\Vert^2 \end{aligned}$$
(52)

pointwise for each \(s\in [0,T]\) and constants \(c_2,c_3>0\). Then the following holds.

  (1) If \(\zeta (0)=0\) then \(\zeta (s)=0\) for all \(s\in [0,T]\).

  (2) If \(\zeta (0)\not =0\) then \(\zeta (s)\not =0\) for all \(s\in [0,T]\) and, moreover,

    $$\begin{aligned} \log \left\Vert {\zeta (s)} \right\Vert^2 \ge \log \left\Vert {\zeta (0)} \right\Vert^2 -\left(2 \frac{\langle \zeta (0),A(0)\zeta (0)\rangle }{\mathopen \Vert {\zeta (0)} \mathclose \Vert ^2} +\frac{b}{a} \right) \frac{e^{as}-1}{a} -2c_1 s \end{aligned}$$

    where \(a=2{C_1}^2+c_2\) and \(b=4{c_1}^2+{c_2}^2/2+2c_3\).

Proof

A beautiful exposition in the case \(C_1=0\) was given by Salamon in [11, appendix E]. It generalizes easily. A key step is to prove that the function

$$\begin{aligned} \varphi (s) := \log \mathopen \Vert {\zeta (s)} \mathclose \Vert ^2 -\int _0^s \frac{2\langle \zeta (\sigma ), \zeta ^\prime (\sigma )-A(\sigma )\zeta (\sigma )\rangle }{\mathopen \Vert {\zeta (\sigma )} \mathclose \Vert ^2} d\sigma \end{aligned}$$

satisfies the differential inequality

$$\begin{aligned} \varphi ^{\prime \prime } +a\left|\varphi ^\prime \right|+b \ge 0 \end{aligned}$$
(53)

for two constants \(a,b>0\).

In [11] it is shown that assumption (52) implies the inequality

$$\begin{aligned} \varphi ^{\prime \prime } \ge 2\left\Vert {\eta -\langle \eta ,\xi \rangle \xi } \right\Vert^2 -\frac{2\left\Vert {\zeta ^\prime -A\zeta } \right\Vert^2}{\left\Vert {\zeta } \right\Vert^2} -2c_2\left\Vert {\eta } \right\Vert -2c_3 \end{aligned}$$

where \(\xi :=\frac{\zeta }{\left\Vert {\zeta } \right\Vert}\) and \(\eta :=\frac{A\zeta }{\left\Vert {\zeta } \right\Vert}\). Now it follows by assumption (51) that

$$\begin{aligned} \frac{2\left\Vert {\zeta ^\prime -A\zeta } \right\Vert^2}{\left\Vert {\zeta } \right\Vert^2} \le 4{c_1}^2 +4{C_1}^2 \frac{\left|\langle A\zeta ,\zeta \rangle \right|}{\left\Vert {\zeta } \right\Vert^2} =4{c_1}^2 +4{C_1}^2\left|\langle \eta ,\xi \rangle \right| \end{aligned}$$

and therefore

$$\begin{aligned} \varphi ^{\prime \prime } \ge 2\left\Vert {\eta -\langle \eta ,\xi \rangle \xi } \right\Vert^2 -4{c_1}^2 -4{C_1}^2\left|\langle \eta ,\xi \rangle \right| -2c_2\left\Vert {\eta } \right\Vert -2c_3. \end{aligned}$$

To obtain the inequality (53) it remains to prove that

$$\begin{aligned} 2\left\Vert {\eta -\langle \eta ,\xi \rangle \xi } \right\Vert^2 -4{c_1}^2 -4{C_1}^2\left|\langle \eta ,\xi \rangle \right| -2c_2\left\Vert {\eta } \right\Vert -2c_3 \ge -a\left|\varphi ^\prime \right|-b. \end{aligned}$$

Since \(\varphi ^\prime =2\langle \xi ,\eta \rangle \), this is equivalent to

$$\begin{aligned} c_2\left\Vert {\eta } \right\Vert \le \left\Vert {\eta -\langle \eta ,\xi \rangle \xi } \right\Vert^2 +(a-2{C_1}^2)\left|\langle \eta ,\xi \rangle \right| +(b/2-2{c_1}^2-c_3). \end{aligned}$$

Abbreviate \(u:=\mathopen \Vert {\eta -\langle \eta ,\xi \rangle \xi } \mathclose \Vert \) and \(v:=\mathopen |\langle \eta ,\xi \rangle \mathclose |\); then \(\mathopen \Vert {\eta } \mathclose \Vert ^2=u^2+v^2\) and the desired inequality has the form

$$\begin{aligned} c_2\sqrt{u^2+v^2} \le u^2 +(a-2{C_1}^2)v +(b/2-2{c_1}^2-c_3). \end{aligned}$$

Since \( c_2\sqrt{u^2+v^2} \le c_2u+c_2v \le u^2+c_2v+{c_2}^2/4 \), this is satisfied with \(a=2{C_1}^2+c_2\) and \(b=4{c_1}^2+{c_2}^2/2+2c_3\). This proves (53). The remaining part of the proof of Theorem 15 carries over from [11] unchanged. \(\square \)
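
The elementary inequalities behind this proof can be spelled out. For real numbers \(x,y\ge 0\) one has \((x+y)^2\le 2x^2+2y^2\), which applied to (51) gives the first line below; dividing it by \(\left\Vert \zeta \right\Vert ^2/2\) gives the bound on \(2\left\Vert \zeta ^\prime -A\zeta \right\Vert ^2/\left\Vert \zeta \right\Vert ^2\) used above. The remaining two lines, for real numbers \(u,v\ge 0\), justify the chain of inequalities in the last step:

$$\begin{aligned} \left\Vert \zeta ^\prime -A\zeta \right\Vert ^2&\le \left(c_1\left\Vert \zeta \right\Vert +C_1\left|\langle A\zeta ,\zeta \rangle \right|^{1/2}\right)^2 \le 2{c_1}^2\left\Vert \zeta \right\Vert ^2 +2{C_1}^2\left|\langle A\zeta ,\zeta \rangle \right|,\\ \sqrt{u^2+v^2}&\le \sqrt{u^2+2uv+v^2}=u+v,\\ u^2-c_2u+\frac{{c_2}^2}{4}&=\left(u-\frac{c_2}{2}\right)^2\ge 0. \end{aligned}$$

With \(a=2{C_1}^2+c_2\) and \(b=4{c_1}^2+{c_2}^2/2+2c_3\) the coefficient \(a-2{C_1}^2\) equals \(c_2\) and the constant \(b/2-2{c_1}^2-c_3\) equals \({c_2}^2/4\), so the right hand side of the desired inequality is exactly \(u^2+c_2v+{c_2}^2/4\).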

4.1 Linear equation

Unique continuation for the linear heat equation is used to prove transversality of the universal section (Proposition 9) and the unstable manifold Theorem 18.

Proposition 6

Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V2) and two constants \(a<b\). Assume \(u:[a,b]\times S^1\rightarrow M\) is a smooth map and \(\xi \) is a smooth vector field along \(u\) which satisfies \(\mathcal{D }_u\xi =0\) or \(\mathcal{D }_u^*\xi =0\) almost everywhere; see (21) and (22). Denote \(\xi (s,\cdot )\) by \(\xi (s)\). Then the following is true.

  1. (a)

    If \(\xi (s_*)=0\) for some \(s_*\), then \(\xi (s)=0\) for all \(s\in [a,b]\).

  2. (b)

    If \(\xi (s_*)\not =0\) for some \(s_*\), then \(\xi (s)\not =0\) for all \(s\in [a,b]\).

Proof

We represent \(\mathcal{D }_u\) by the Atiyah–Patodi–Singer type operator \( D_{A+C}=\frac{d}{ds}+A(s)+C(s) \) defined in [21, sec. 3.4]. Here the family \(A(s)\) consists of self-adjoint operators on the Hilbert space \(H:=L^2(S^1,\mathbb{R }^n)\) with dense domain \(W\); see (ii) and (iv) in [21, sec. 3.4] where also the space \(W\) is defined. Recall that, if the vector bundle \(u^*TM\rightarrow [a,b]\times S^1\) is trivial, then \(W=W^{2,2}(S^1,\mathbb{R }^n)\), otherwise some boundary condition enters. In either case \(W=:\mathrm{dom\, }A(s)\) is independent of \(s\).

(b) Assume \(\xi \in \ker D_{A+C}\) satisfies \(\xi (s_*)\not =0\). Assume by contradiction that \(\xi (s_0)=0\) for some \(s_0\in [a,b]\). If \(s_0>s_*\), replace \(\xi (s)\) by \(\xi (s+s_*)\) and set \(T=b-s_*\) and \(s_1=s_0-s_*\), otherwise use \(\xi (-s+s_*)\), \(T=-a+s_*\), \(s_1=-s_0+s_*\). Hence we may assume without loss of generality that \(\xi \in \ker D_{A+C}\) maps \([0,T]\) to \(H\) and satisfies \(\xi (0)\not =0\) and \(\xi (s_1)=0\) for some \(s_1\in (0,T]\).

We verify the conditions in Theorem 15. Firstly, the vector field \(\xi \) is smooth by assumption. Secondly, the family \(A(s)\) consists of self-adjoint operators by (ii) in [21, sec. 3.4]. Thirdly, the function \(s\mapsto \langle \xi (s),A(s)\xi (s)\rangle \) is continuously differentiable. Here we use the first condition in axiom (V2), which says that the Hessian \(\mathcal{H }_\mathcal{V }\) is a zeroth order operator, and the fact that by compactness of the domain the vector fields \({\partial }_tu\), \({\partial }_su\), \(\nabla {}_{t}{\partial }_su\), and \(\nabla {}_{t}\nabla {}_{t}{\partial }_su\) are bounded in \(L^\infty ([0,T]\times S^1)\) by a constant \(c_T\). Now (51) is satisfied with \(c_1=c_T^\prime \) and \(C_1=0\), because

$$\begin{aligned} \left\Vert {\xi ^\prime (s)-A(s)\xi (s)} \right\Vert =\left\Vert {C(s)\xi (s)} \right\Vert \le c_T^\prime \left\Vert {\xi (s)} \right\Vert \end{aligned}$$

where the constant \(c_T^\prime =\sup _{[0,T]\times S^1} \mathopen \Vert {C(s,t)} \mathclose \Vert _{\mathcal{L }(\mathbb{R }^n)}\) is finite by compactness of the domain. To verify the inequality (52) note that its left hand side is given by \(\langle \xi (s),A^\prime (s)\xi (s)\rangle \); see [2, Rmk. in sec. 1] and [11, Rmk. F.3]. Now

$$\begin{aligned} \langle \xi (s),A^\prime (s)\xi (s)\rangle&\ge -\left\Vert {\xi (s)} \right\Vert \left\Vert {A^\prime (s)\xi (s)} \right\Vert\\&\ge -\, c_T^{\prime \prime }\left\Vert {\xi (s)} \right\Vert \left( \left\Vert {\xi (s)} \right\Vert +\left\Vert {{\partial }_t\xi (s)} \right\Vert \right) \end{aligned}$$

where the second step is by straightforward calculation of \(A^\prime (s)\). Replacing \(\mathopen \Vert {{\partial }_t\xi (s)} \mathclose \Vert \) according to the elliptic estimate for \(A(s)\) yields (52).

Now the Agmon–Nirenberg Theorem 15 applies. Part (2) says that \(\xi (s)\not =0\) for all \(s\in [0,T]\). This contradicts \(\xi (s_1)=0\) and proves (b) for elements in the kernel of \(\mathcal{D }_u\). The same argument covers the case of the operator \(\mathcal{D }_u^*\) represented by \(-D_{-A-C}\).

(a) Use a time reversing argument (see proof of the Agmon–Nirenberg Theorem in [11]) and apply (b). Alternatively, use a line of argument analogous to the proof of (b) replacing in the final step part (2) of Theorem 15 by part (1). \(\square \)

4.2 Nonlinear equation

Unique continuation for the nonlinear heat equation is used to prove the unstable manifold Theorem 18.

Theorem 16

(Unique continuation for compact cylindrical domains) Fix two constants \(a<b\) and a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0) and (V1). If two smooth solutions \(u,v:[a,b]\times S^1\rightarrow M\) of the heat equation (7) coincide along one loop, then \(u=v\).

Proof

Abbreviate \(u_s=u(s,\cdot )\) and assume \(u_\sigma =v_\sigma :S^1\rightarrow M\) for some \(\sigma \in [a,b]\). If \({\partial }_su\) is identically zero, then \(u\) coincides with a critical point \(x\in \mathcal{P }(\mathcal{V })\) and by \(v_\sigma =u_\sigma =x\) so does \(v\) and we are done; similarly if \({\partial }_sv=0\). Now assume that \({\partial }_su\) is nonzero somewhere and so is \({\partial }_s v\). Hence

$$\begin{aligned} \delta := \frac{\iota }{2+\left\Vert {{\partial }_su} \right\Vert_\infty +\left\Vert {{\partial }_sv} \right\Vert_\infty } \in (0,\iota /2). \end{aligned}$$
(54)

Here \(\iota >0\) denotes the injectivity radius of our compact Riemannian manifold.

The first step is to prove that the restrictions of \(u\) and \(v\) to \([\sigma -\delta ,\sigma +\delta ]\times S^1\) are equal. (In fact we should take the intersection with \([a,b]\times S^1\), but suppress this throughout for simplicity of notation.) The key idea is to show that the difference \(\zeta (s)\) of \(u_s\) and \(v_s\) (with respect to geodesic normal coordinates based at \(u_\sigma \)) and a suitable operator \(A\) satisfy the requirements of Theorem 15 with constants \(c_1,C_1>0\). Then, since \(\zeta (\sigma )=0\), part (1) of the theorem shows that \(\zeta =0\) and therefore \(u=v\) on \([\sigma -\delta ,\sigma +\delta ]\times S^1\). Once this is proved we successively restrict \(u\) and \(v\) to cylinders of the form \([\sigma +(2k-1)\delta , \sigma +(2k+1)\delta ]\times S^1\), where \(k\in \mathbb{Z }\). The argument above shows that \(u=v\) on each of these cylinders. Due to compactness of \([a,b]\times S^1\), firstly, the same constants \(c_1\) and \(C_1\) can be chosen in (51) for all cylinders and, secondly, after finitely many steps the union of these cylinders covers \([a,b]\times S^1\). This proves the theorem.
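To make the finiteness of the covering explicit, here is a rough count. The integer \(N\) is introduced only for this illustration and does not appear elsewhere in the text.

$$\begin{aligned} \bigcup _{\left|k\right|\le N} \bigl [\sigma +(2k-1)\delta ,\sigma +(2k+1)\delta \bigr ] =\bigl [\sigma -(2N+1)\delta ,\sigma +(2N+1)\delta \bigr ] \supset [a,b] \quad \text{whenever } (2N+1)\delta \ge b-a. \end{aligned}$$

The inclusion holds because \(\sigma \in [a,b]\). Hence any integer \(N\ge (b-a)/(2\delta )\) works, and at most \(2N+1\) cylinders are needed.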

It remains to carry out step one. Consider the interval \(I=[\sigma -\delta ,\sigma +\delta ]\) and restrict \(u\) and \(v\) to the cylinder \(Z=I\times S^1=[\sigma -\delta ,\sigma +\delta ]\times S^1\). Observe that the Riemannian distance between \(u(\sigma ,t)\) and \(u(s,t)\) is less than \(\iota /2\) for every \((s,t)\in Z\); similarly for \(v\). Hence the identities

$$\begin{aligned} u(s,t) =\exp _{u(\sigma ,t)}\xi (s,t), \quad v(s,t) =\exp _{u(\sigma ,t)}\eta (s,t), \end{aligned}$$

for every \((s,t)\in Z\) uniquely determine smooth families of vector fields \(\xi \) and \(\eta \) along the loop \(u_\sigma =v_\sigma \). In particular, the difference \(\zeta =\xi -\eta \) is well defined. Moreover, the domain of \(\xi \) and \(\eta \) is \(Z\) and they satisfy

$$\begin{aligned} \left\Vert {\xi } \right\Vert_\infty <\frac{\iota }{2} ,\quad \left\Vert {\eta } \right\Vert_\infty <\frac{\iota }{2} ,\quad \xi _\sigma =0=\eta _\sigma . \end{aligned}$$

Now consider the Hilbert space \(H=L^2(S^1,{u_\sigma }^*TM)\) and the symmetric differential operator \(A=\nabla {}_{t}\nabla {}_{t}\) with domain \(W=W^{2,2}(S^1,{u_\sigma }^*TM)\). Here \(\nabla {}_{t}\) denotes the covariant derivative along the loop \(u_\sigma \). Hence the operator \(A\) is independent of \(s\) and condition (52) in the Agmon–Nirenberg Theorem 15 is vacuous. If we can verify condition (51) as well, then \(\zeta (\sigma )=0\) implies that \(\zeta (s)=0\) for every \(s\in I\) by Theorem 15 (1). Since \(\zeta \) is smooth, this means that on \(Z\) we have \(\xi =\eta \) pointwise and therefore \(u=v\). It remains to verify (51). By (26) we get

$$\begin{aligned} {\partial }_su&= E_2(u_\sigma ,\xi ){\partial }_s\xi \nonumber \\ \nabla {}_{t}{\partial }_tu&= E_{11}(u_\sigma ,\xi )\bigl ({\partial }_tu_\sigma ,{\partial }_tu_\sigma \bigr ) +2E_{12}(u_\sigma ,\xi )\bigl ({\partial }_tu_\sigma ,\nabla {}_{t}\xi \bigr )\\&+E_1(u_\sigma ,\xi )\nabla {}_{t}{\partial }_tu_\sigma +E_{22}(u_\sigma ,\xi ) \bigl (\nabla {}_{t}\xi ,\nabla {}_{t}\xi \bigr ) +E_2(u_\sigma ,\xi )\nabla {}_{t}\nabla {}_{t}\xi \nonumber \end{aligned}$$
(55)

pointwise for \((s,t)\in Z\) and similarly for \(v\) and \(\eta \). In the second identity we used the symmetry property (27) of \(E_{12}\). Now consider the heat equation (7), replace \({\partial }_su\) and \(\nabla {}_{t}{\partial }_tu\) according to (55), then solve for \({\partial }_s\xi -\nabla {}_{t}\nabla {}_{t}\xi \). Do the same for \(v\) and \(\eta \) to obtain a similar expression for \(-{\partial }_s\eta +\nabla {}_{t}\nabla {}_{t}\eta \). Add both expressions to get the pointwise identity

$$\begin{aligned}&\bigl ({\partial }_s-\nabla {}_{t}\nabla {}_{t}\bigr ) \bigl (\xi -\eta \bigr )\\&\quad =\left(E_2(u_\sigma ,\xi )^{-1}E_{11}(u_\sigma ,\xi ) -E_2(u_\sigma ,\eta )^{-1}E_{11}(u_\sigma ,\eta )\right) \bigl ({\partial }_tu_\sigma ,{\partial }_tu_\sigma \bigr )\\&\qquad +\left(E_2(u_\sigma ,\xi )^{-1}E_1(u_\sigma ,\xi ) -E_2(u_\sigma ,\eta )^{-1}E_1(u_\sigma ,\eta )\right) \nabla {}_{t}{\partial }_tu_\sigma \\&\qquad +2\left(E_2(u_\sigma ,\xi )^{-1} E_{21}(u_\sigma ,\xi )\nabla {}_{t}\xi -E_2(u_\sigma ,\eta )^{-1} E_{21}(u_\sigma ,\eta )\nabla {}_{t}\eta \right) {\partial }_tu_\sigma \\&\qquad +E_2(u_\sigma ,\xi )^{-1}\mathrm{grad }\mathcal{V }(\exp _{u_\sigma }\xi ) -E_2(u_\sigma ,\eta )^{-1}\mathrm{grad }\mathcal{V }(\exp _{u_\sigma }\eta )\\&\qquad +E_2(u_\sigma ,\xi )^{-1}E_{22}(u_\sigma ,\xi ) \bigl (\nabla {}_{t}\xi ,\nabla {}_{t}\xi \bigr ) -E_2(u_\sigma ,\eta )^{-1}E_{22}(u_\sigma ,\eta ) \bigl (\nabla {}_{t}\eta ,\nabla {}_{t}\eta \bigr ). \end{aligned}$$

Now by compactness of the domain \(Z\) there is a constant \(C>0\) such that

$$\begin{aligned} \mathopen \Vert {{\partial }_tu_\sigma } \mathclose \Vert _{L^\infty (S^1)} \le \mathopen \Vert {{\partial }_tu} \mathclose \Vert _{L^\infty (Z)}<C,\quad \mathopen \Vert {\nabla {}_{t}{\partial }_tu_\sigma } \mathclose \Vert _{L^\infty (S^1)}<C. \end{aligned}$$

Moreover, since the maps \(E_i\) and \(E_{ij}\) are uniformly continuous on the radius \(\iota /2\) disk tangent bundle \(\mathcal{O }\subset {TM}\), in which \(\xi \) and \(\eta \) take their values, there exists a constant \(c_1>0\) such that

$$\begin{aligned}&\left|{\partial }_s(\xi -\eta ) -\nabla {}_{t}\nabla {}_{t}(\xi -\eta )\right|\\&\quad \le (c_1C^2+c_1C) \left|\xi -\eta \right|\\&\qquad +2C \left|E_2(u_\sigma ,\xi )^{-1}E_{21}(u_\sigma ,\xi )\nabla {}_{t}\xi -E_2(u_\sigma ,\eta )^{-1}E_{21}(u_\sigma ,\eta )\nabla {}_{t}\eta \right|\\&\qquad +\left|E_2(u_\sigma ,\xi )^{-1}\mathrm{grad }\mathcal{V }(\exp _{u_\sigma }\xi ) -E_2(u_\sigma ,\eta )^{-1}\mathrm{grad }\mathcal{V }(\exp _{u_\sigma }\eta )\right|\\&\qquad +\left|E_2(u_\sigma ,\xi )^{-1}E_{22}(u_\sigma ,\xi ) \bigl (\nabla {}_{t}\xi ,\nabla {}_{t}\xi \bigr ) -E_2(u_\sigma ,\eta )^{-1}E_{22}(u_\sigma ,\eta ) \bigl (\nabla {}_{t}\eta ,\nabla {}_{t}\eta \bigr )\right| \end{aligned}$$

pointwise for \((s,t)\in Z\). It remains to estimate the last three terms in the sum.

First we estimate term three. Use linearity and the symmetry property (27) of \(E_{22}\) to obtain the first identity in the pointwise estimate

$$\begin{aligned}&\left|E_2(u_\sigma ,\xi )^{-1}E_{22}(u_\sigma ,\xi ) \bigl (\nabla {}_{t}\xi ,\nabla {}_{t}\xi \bigr ) -E_2(u_\sigma ,\eta )^{-1}E_{22}(u_\sigma ,\eta ) \bigl (\nabla {}_{t}\eta ,\nabla {}_{t}\eta \bigr )\right|\\&\quad =\bigl | E_2(u_\sigma ,\xi )^{-1}E_{22}(u_\sigma ,\xi ) \bigl (\nabla {}_{t}\xi -\nabla {}_{t}\eta , \nabla {}_{t}\xi \bigr )\\&\qquad +E_2(u_\sigma ,\eta )^{-1}E_{22}(u_\sigma ,\eta ) \bigl (\nabla {}_{t}\xi -\nabla {}_{t}\eta , \nabla {}_{t}\eta \bigr )\\&\qquad +\left(E_2(u_\sigma ,\xi )^{-1}E_{22}(u_\sigma ,\xi ) -E_2(u_\sigma ,\eta )^{-1}E_{22}(u_\sigma ,\eta )\right) \bigl (\nabla {}_{t}\xi ,\nabla {}_{t}\eta \bigr ) \bigr | \\&\quad \le \left\Vert {{E_2}^{-1}E_{22}} \right\Vert_{L^\infty (\mathcal{O })} \left(\left\Vert {\nabla {}_{t}\xi } \right\Vert_\infty +\left\Vert {\nabla {}_{t}\eta } \right\Vert_\infty \right) \left|\nabla {}_{t}(\xi -\eta )\right|\\&\qquad +\,c_1 \left\Vert {\nabla {}_{t}\xi } \right\Vert_\infty \left\Vert {\nabla {}_{t}\eta } \right\Vert_\infty \left|\xi -\eta \right|\\&\quad \le \mu _1\left|\nabla {}_{t}(\xi -\eta )\right| +\mu _2\left|\xi -\eta \right| \end{aligned}$$

where \(\mu _1=2{c_2}^2C(1+c_2)\), \(\mu _2=c_1{c_2}^2C^2(1+c_2)^2\), and the constant \(c_2>0\) is chosen sufficiently large such that for \(j=1,2\) we have

$$\begin{aligned} \left\Vert {E_j} \right\Vert_{L^\infty (\mathcal{O })} +\left\Vert {{E_2}^{-1}} \right\Vert_{L^\infty (\mathcal{O })} +\left\Vert {{E_2}^{-1}E_{22}} \right\Vert_{L^\infty (\mathcal{O })} +\left\Vert {{E_2}^{-1}E_{21}} \right\Vert_{L^\infty (\mathcal{O })} \le c_2. \end{aligned}$$

Moreover, we used that by the first identity in (26)

$$\begin{aligned} \nabla {}_{t}\xi =E_2(u_\sigma ,\xi )^{-1}\left( {\partial }_tu -E_1(u_\sigma ,\xi ){\partial }_tu_\sigma \right). \end{aligned}$$

Hence \(\mathopen \Vert {\nabla {}_{t}\xi } \mathclose \Vert _\infty \le c_2C(1+c_2)\) and similarly for \(\nabla {}_{t}\eta \). Next we estimate term one. Replace \(\nabla {}_{t}\xi \) by \(\nabla {}_{t}\xi -\nabla {}_{t}\eta +\nabla {}_{t}\eta \), then similarly as above we obtain that

$$\begin{aligned}&2C \left|E_2(u_\sigma ,\xi )^{-1}E_{21}(u_\sigma ,\xi )\nabla {}_{t}\xi -E_2(u_\sigma ,\eta )^{-1}E_{21}(u_\sigma ,\eta )\nabla {}_{t}\eta \right|\\&\quad \le 2c_2C\left|\nabla {}_{t}(\xi -\eta )\right| +2c_1c_2C^2(1+c_2)\left|\xi -\eta \right| \end{aligned}$$

pointwise for \((s,t)\in Z\). Next rewrite term two setting \(X:=\eta -\xi \) and replacing \(\eta \) accordingly to obtain pointwise at \((s,t)\in Z\) the identity

$$\begin{aligned}&E_2(u_\sigma ,\xi )^{-1} \mathrm{grad }\mathcal{V }(\exp _{u_\sigma }\xi ) -E_2(u_\sigma ,\xi +X)^{-1} \mathrm{grad }\mathcal{V }(\exp _{u_\sigma }(\xi +X))\\&\quad =:f(X)=f(0)+\frac{d}{d\tau } f(\tau X)\\&\quad =-\frac{d}{d\tau }\left( E_2(u_\sigma ,\xi +\tau X)^{-1} \mathrm{grad }\mathcal{V }(\exp _{u_\sigma }(\xi +\tau X)) \right) \end{aligned}$$

for some \(\tau \in [0,1]\). Since \(f(0)=0\), this implies that

$$\begin{aligned} \left|f(X)\right|&\le \left\Vert {{E_2}^{-1}E_{22}} \right\Vert_{L^\infty (\mathcal{O })}\left|X\right|\cdot \left\Vert {{E_2}^{-1}} \right\Vert_{L^\infty (\mathcal{O })} \left|\mathrm{grad }\,\mathcal{V }(\exp _{u_\sigma }(\xi +\tau X))\right|\\&+\left\Vert {{E_2}^{-1}} \right\Vert_{L^\infty (\mathcal{O })} \left|\nabla {}_{\tau } \mathrm{grad }\,\mathcal{V }(\exp _{u_\sigma }(\xi +\tau X))\right|\\&\le c_2^2C_0^\prime \left|X\right| +c_2^2C_1^\prime \left(\left|X\right|+\left\Vert {X_s} \right\Vert_{L^1(S^1)} \right) \end{aligned}$$

pointwise at \((s,t)\in Z\). Here \(C_0^\prime \) and \(C_1^\prime \) denote the constants in axiom (V0) and (V1), respectively. To obtain the final step we applied the first estimate in axiom (V1) to the curve \(\tau \mapsto \exp _{u_\sigma }(\xi _s+\tau X_s)\) in the loop space \(\mathcal{L }M\).

Putting things together we have proved that due to compactness of the domain \(Z\) there is a positive constant \(\mu =\mu (Z,g)\) such that for every \(s\in I\)

$$\begin{aligned} \left\Vert {\zeta ^\prime -A\zeta } \right\Vert \le \mu \left(\left\Vert {\zeta } \right\Vert +\left\Vert {\nabla {}_{t}\zeta } \right\Vert\right) \le \mu \left(\left\Vert {\zeta } \right\Vert +\left|\langle A\zeta ,\zeta \rangle \right|^{1/2}\right). \end{aligned}$$
(56)

Here the norms are in \(L^2(S^1,{u_\sigma }^*TM)\), we abbreviated \(\zeta =\zeta (s)\), and the final step uses that \( \mathopen \Vert {\nabla {}_{t}\zeta } \mathclose \Vert ^2 =\langle \nabla {}_{t}\zeta ,\nabla {}_{t}\zeta \rangle =-\langle A\zeta ,\zeta \rangle \le \left|\langle A\zeta ,\zeta \rangle \right| \). Hence (51) is satisfied and this concludes the proof of Theorem 16. \(\square \)
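The identity \(\Vert \nabla {}_{t}\zeta \Vert ^2=-\langle A\zeta ,\zeta \rangle \) used in the final step is integration by parts on the circle. A short sketch, using only that \(\nabla {}_{t}\) is compatible with the metric and that \(t\mapsto \langle \nabla {}_{t}\zeta ,\zeta \rangle \) is 1-periodic:

$$\begin{aligned} 0 =\int _0^1\frac{d}{dt}\langle \nabla {}_{t}\zeta ,\zeta \rangle \, dt =\int _0^1\langle \nabla {}_{t}\nabla {}_{t}\zeta ,\zeta \rangle \, dt +\int _0^1\left|\nabla {}_{t}\zeta \right|^2 dt =\langle A\zeta ,\zeta \rangle +\left\Vert {\nabla {}_{t}\zeta } \right\Vert^2 . \end{aligned}$$

In particular \(\langle A\zeta ,\zeta \rangle \le 0\) for every \(\zeta \in W\).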

In the proof of the unstable manifold Theorem 18 we use backward unique continuation for the nonlinear heat equation.

Theorem 17

(Forward and backward unique continuation) Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V1).

  1. (F)

    Assume \(u\) and \(v\) are solutions of the heat equation (7) defined on the forward halfcylinder \([0,\infty )\times S^1\). If \(u\) and \(v\) agree along the loop at \(s=0\), then \(u=v\).

  2. (B)

    Assume \(u\) and \(v\) are solutions of the heat equation (7) defined on the backward halfcylinder \((-\infty ,0]\times S^1\). Assume further that

    $$\begin{aligned} \sup _{s\in (-\infty ,0]} \mathcal{S }_\mathcal{V }\bigl (u(s,\cdot )\bigr )\le c_0,\quad \sup _{s\in (-\infty ,0]} \mathcal{S }_\mathcal{V }\bigl (v(s,\cdot )\bigr )\le c_0, \end{aligned}$$

    for some constant \(c_0>0\). Then the following is true. If \(u\) and \(v\) agree along the loop at \(s=0\), then \(u=v\).

Proof

The idea is to decompose, as in the proof of Theorem 16, the halfcylinder into small cylinders of width \(\delta \) and then show \(u=v\) on each piece (by the method developed in the first step of the proof of Theorem 16). The only additional problem is noncompactness of the domain. One way to deal with this is to choose the same width for each piece (in order to arrive at any given time \(s\) in finitely many steps). Here we need uniform bounds for \(\mathopen |{\partial }_su\mathclose |\) and \(\mathopen |{\partial }_sv\mathclose |\). Once we have these we can define \(\delta \) again by (54). Check the proof of Theorem 16 to see that the only further ingredients in proving \(u=v\) on each small cylinder are uniform bounds for the first two \(t\)-derivatives of \(u\) and of \(v\). Hence to complete the proof it remains to show that

$$\begin{aligned} \left\Vert {{\partial }_su} \right\Vert_\infty +\left\Vert {{\partial }_tu} \right\Vert_\infty +\left\Vert {\nabla {}_{t}{\partial }_tu} \right\Vert_\infty +\left\Vert {{\partial }_sv} \right\Vert_\infty +\left\Vert {{\partial }_tv} \right\Vert_\infty +\left\Vert {\nabla {}_{t}{\partial }_tv} \right\Vert_\infty \le C \end{aligned}$$

for some constant \(C>0\).

ad (F) Let \(C_0\) be the constant in axiom (V0) and observe that \(\mathcal{S }_\mathcal{V }\ge -C_0\). Now by Theorem 13 with constant \(C_1\) (more precisely, by checking its proof)

$$\begin{aligned} \left|{\partial }_su(s,t)\right|^2 \le C_1 E_{[s-1,s]}(u) =C_1\left(\mathcal{S }_\mathcal{V }(u_{s-1}) -\mathcal{S }_\mathcal{V }(u_s)\right) \le C_1\left(\mathcal{S }_\mathcal{V }(u_0)+C_0\right) \end{aligned}$$

for \((s,t)\in [1,\infty )\times S^1\). In the second and the last step we used that \(u\) is a negative gradient flow line and the action decreases along \(u\). Note that the proof of Theorem 13 shows that the estimate at a point depends on its past. This is why we get the above estimate only on \([1,\infty )\times S^1\). However, the missing part \([0,1]\times S^1\) is compact and \(u\) is smooth. Hence \(\mathopen \Vert {{\partial }_su} \mathclose \Vert _\infty \le C\) and therefore

$$\begin{aligned} \left\Vert {\nabla {}_{t}{\partial }_tu} \right\Vert_\infty \le \left\Vert {{\partial }_su} \right\Vert_\infty +\left\Vert {\mathrm{grad }\mathcal{V }(u)} \right\Vert_\infty \le C+C_0. \end{aligned}$$

Here we used the heat equation (7) and axiom (V0) with constant \(C_0\). It follows similarly by (checking the proof of) Theorem 12 that \(\mathopen |{\partial }_tu(s,t)\mathclose |\) is uniformly bounded on \([1,\infty )\times S^1\). The corresponding estimates for \(v\) are analogous.

ad (B) The proof of the \(L^\infty \) estimates follows the same steps as in (F). We even get all estimates right away on the whole backward halfcylinder, because this halfcylinder contains the past of each of its points. \(\square \)
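The lower bound \(\mathcal{S }_\mathcal{V }\ge -C_0\) used in (F) can be sketched as follows. The sketch assumes that the action has the form \(\mathcal{S }_\mathcal{V }(x)=\frac{1}{2}\int _0^1\left|\dot{x}(t)\right|^2dt-\mathcal{V }(x)\) and that axiom (V0) includes the bound \(\left|\mathcal{V }(x)\right|\le C_0\); both are consistent with how they are used in this text, but neither is restated in this section.

$$\begin{aligned} \mathcal{S }_\mathcal{V }(x) =\frac{1}{2}\int _0^1\left|\dot{x}(t)\right|^2dt-\mathcal{V }(x) \ge -\mathcal{V }(x) \ge -C_0 . \end{aligned}$$

Since the action decreases along \(u\), this gives \(\mathcal{S }_\mathcal{V }(u_{s-1})-\mathcal{S }_\mathcal{V }(u_s)\le \mathcal{S }_\mathcal{V }(u_0)+C_0\) for every \(s\ge 1\).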

5 Transversality

Throughout this section the action functional is a map

$$\begin{aligned} \mathcal{S }_\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R },\quad \mathcal{L }M:=C^\infty (S^1,M), \end{aligned}$$

defined on the free loop space of \(M\). In Sect. 5.1 we construct a separable Banach space \(Y\) of abstract perturbations satisfying axioms (V0)–(V3). In Sect. 5.2 we fix a perturbation \(\mathcal{V }\) such that (V0)–(V3) hold and \(\mathcal{S }_\mathcal{V }\) is Morse. Choosing a closed \(L^2\) neighborhood \(U\) of the critical points of the function \(\mathcal{S }_\mathcal{V }\) we define the subspace \(Y(\mathcal{V },U)\subset Y\) consisting of those perturbations which are supported away from \(U\). Then, given a regular value \(a\) of \(\mathcal{S }_\mathcal{V }\), we define a separable Banach manifold \(\mathcal{O }^a=\mathcal{O }^a(\mathcal{V },U)\) of admissible perturbations. In fact \(\mathcal{O }^a\) is the open ball about zero in the Banach space \(Y(\mathcal{V },U)\) for some sufficiently small radius \(r^a\). For any admissible perturbation \(v\) it holds that

$$\begin{aligned} \mathcal{P }^a(\mathcal{V })=\mathcal{P }^a(\mathcal{V }+v) \end{aligned}$$

—in particular \(a\) is also a regular value of \(\mathcal{S }_{\mathcal{V }+v}\)—and the sublevel sets \(\{\mathcal{S }_\mathcal{V }\le a\}\) and \(\{\mathcal{S }_{\mathcal{V }+v}\le a\}\) are homologically equivalent. For such a triple \((\mathcal{V },U,a)\) we prove in Sect. 5.3 that there is a residual subset \(\mathcal{O }^a_{reg}\subset \mathcal{O }^a\) of regular perturbations \(v\). These, in addition, have the property that the perturbed functional \(\mathcal{S }_{\mathcal{V }+v}\) is Morse–Smale below level \(a\). The crucial step is to prove surjectivity of the universal section \(\mathcal{F }\) (Proposition 9). Here unique continuation for the linear heat equation enters. A further key ingredient in the ’no return’ part of the proof is the (negative) gradient flow property which implies that the functional is strictly decreasing along nonconstant heat flow solutions.

5.1 The universal Banach space of perturbations

We fix, once and for all, the following data.

  1. (a)

    A dense sequence \(\bigl ( x_i\bigr )_{i\in \mathbb{N }}\) in \(\mathcal{L }M=C^\infty (S^1,M)\).

  2. (b)

    For every \(x_i\) a dense sequence \(\bigl ( \eta ^{ij}\bigr )_{j\in \mathbb{N }}\) in \(C^\infty (S^1,x_i^*TM)\).

  3. (c)

    A smooth cutoff function \(\rho :\mathbb{R }\rightarrow [0,1]\) such that \(\rho =1\) on \([-1,1]\) and \(\rho =0\) outside \([-4,4]\) and such that \(\mathopen \Vert {\rho ^\prime } \mathclose \Vert _\infty <1\). Then set \(\rho _{1/k}(r)=\rho (rk^2)\) for \(k\in \mathbb{N }\); see Fig. 1.

Moreover, recall that \(\iota >0\) denotes the injectivity radius of the closed Riemannian manifold \(M\). Fix a smooth cutoff function \(\beta \) such that \(\beta =1\) on \([-(\iota /2)^2,(\iota /2)^2]\) and \(\beta =0\) outside \([-\iota ^2,\iota ^2]\); see Fig. 2.

Fig. 1 The cutoff function \(\rho _{1/k}\)

Fig. 2 The cutoff function \(\beta \)

Now for any choice of \(i,j,k\in \mathbb{N }\) there is a smooth function on the loop space given by

$$\begin{aligned} \mathcal{V }_\ell (x) =\mathcal{V }_{ijk}(x) =\rho _{1/k}\left(\left\Vert {x-x_i} \right\Vert_{L^2}^2\right) \int _0^1 V^{ij}(t,x(t))\, dt, \end{aligned}$$
(57)

where \(V^{ij}\) is the smooth function on \(S^1\times M\) defined by

$$\begin{aligned} V^{ij}(t,q):= \left\{ \begin{array}{l@{\quad }l} \beta \bigl (\mathopen |\xi _q^i(t)\mathclose |^2\bigr ) \;\big \langle \xi _q^i(t), \eta ^{ij}(t)\big \rangle ,&\mathopen |\xi _q^i(t)\mathclose |<\iota , \\ 0,&\text{ else.} \end{array} \right. \end{aligned}$$

Here the vector \(\xi _q^i(t)\) is determined by the identity \( q=\exp _{x_i(t)} \xi _q^i(t) \) whenever the Riemannian distance between \(q\) and \(x_i(t)\) is less than \(\iota \). To simplify notation we fixed a bijection \(\ell :\mathbb{N }^3\rightarrow \mathbb{N }_0\). Note that the support of \(\mathcal{V }_{ijk}\) is contained in the \(L^2\) ball of radius \(2/k\) about \(x_i\). Each function \(\mathcal{V }_\ell :\mathcal{L }M\rightarrow \mathbb{R }\) is uniformly continuous with respect to the \(C^0\) topology and satisfies (V0)–(V3). This follows by compactness of \(M\), smoothness of \(V^{ij}\), and by the identity

$$\begin{aligned} \left\langle \mathrm{grad }\mathcal{V }(u),{\partial }_su\right\rangle _{L^2}&= \frac{d}{ds} \mathcal{V }(u)\\&= 2\rho ^\prime \left(\left\Vert {u-x_0} \right\Vert_2^2\right) \left(\int _0^1 V_t(u(s,t))\, dt\right) \left\langle u-x_0,{\partial }_su\right\rangle _{L^2}\\&+\rho \left(\left\Vert {u-x_0} \right\Vert_2^2\right) \left\langle \nabla V(u),{\partial }_su\right\rangle _{L^2} \end{aligned}$$

which determines \(\mathrm{grad }\mathcal{V }\). Here \(\mathbb{R }\rightarrow \mathcal{L }M:s\mapsto u(s,\cdot )\) is any smooth map.
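The statement about the support of \(\mathcal{V }_{ijk}\) can be read off from the definition \(\rho _{1/k}(r)=\rho (rk^2)\). A short check:

$$\begin{aligned} \rho _{1/k}\left(\left\Vert {x-x_i} \right\Vert_{L^2}^2\right) =\rho \left(k^2\left\Vert {x-x_i} \right\Vert_{L^2}^2\right) =0 \quad \text{whenever } k^2\left\Vert {x-x_i} \right\Vert_{L^2}^2>4, \text{ that is } \left\Vert {x-x_i} \right\Vert_{L^2}>\frac{2}{k}. \end{aligned}$$

By the same computation the cutoff factor equals \(1\) on the \(L^2\) ball of radius \(1/k\) about \(x_i\), since \(\rho =1\) on \([-1,1]\).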

Given \(\mathcal{V }_\ell \), we fix a constant \(C_\ell ^0\ge 1\) which is greater than its constant of uniform continuity and for which (V0) holds true. Then we fix a constant \(C_\ell ^1\ge C_\ell ^0\) for which both estimates in (V1) hold true and a constant \(C_\ell ^2\ge C_\ell ^1\) to cover the three estimates of (V2). Furthermore, for every integer \(i\ge 3\), we choose a constant \(C_\ell ^i\ge C_\ell ^{i-1}\) that covers all estimates in (V3) with \(k^\prime +\ell ^\prime =i\) (here \(k^\prime \) and \(\ell ^\prime \) denote the integers \(k\) and \(\ell \) that appear in (V3)). To summarize, for each integer \(\ell \ge 0\) we have fixed a sequence of constants

$$\begin{aligned} 1 \le C_\ell ^0 \le C_\ell ^1\le \cdots \le C_\ell ^\ell \le \cdots \quad \forall \ell \in \mathbb{N }_0. \end{aligned}$$
(58)

The universal space of perturbations is the normed linear space

$$\begin{aligned} Y =\left\{ v_\lambda :=\sum _{\ell =0}^\infty \lambda _\ell \mathcal{V }_\ell \left.\frac{}{}\right|\, \lambda =\left(\lambda _\ell \right) \subset \mathbb{R } \text{ and} \left\Vert {v_\lambda } \right\Vert:= \sum _{\ell =0}^\infty \mathopen |\lambda _\ell \mathclose | C_\ell ^\ell <\infty \right\} . \end{aligned}$$
(59)

Proposition 7

The universal space \(Y\) of perturbations is a separable Banach space and every \(v_\lambda \in Y\) satisfies the axioms (V0)–(V3).

Proof

The map \(v_\lambda \mapsto (\lambda _\ell C_\ell ^\ell )_{\ell \in \mathbb{N }_0}\) provides an isomorphism from \(Y\) to the separable Banach space \(\ell ^1\) of absolutely summable real sequences. This proves that \(Y\) is a separable Banach space. That every element \(v_\lambda =\sum \lambda _\ell \mathcal{V }_\ell \) of \(Y\) satisfies (V0)–(V3) follows readily from the corresponding property of the generators \(\mathcal{V }_\ell \). To explain the idea we give the proof of the second estimate in (V2), namely

$$\begin{aligned} \left|\nabla {}_{t}\nabla {}_{s}\mathrm{grad }v_\lambda (u)\right|&\le \sum _{\ell =0}^\infty \left|\lambda _\ell \right|\cdot \left|\nabla {}_{t}\nabla {}_{s}\mathrm{grad }\mathcal{V }_\ell (u)\right|\\&\le \left( \left|\lambda _0\right| C_0^2 +\left|\lambda _1\right| C_1^2 +\sum _{\ell =2}^\infty \left|\lambda _\ell \right| C_\ell ^2 \right) f(u)\\&\le \left( \left|\lambda _0\right| C_0^2 +\left|\lambda _1\right| C_1^2 +\left\Vert {v_\lambda } \right\Vert\right) f(u) \end{aligned}$$

for every smooth map \(\mathbb{R }\rightarrow \mathcal{L }M:s\mapsto u(s,\cdot )\) and every \((s,t)\in \mathbb{R }\times S^1\). We abbreviated \(f(u)=(\mathopen |\nabla {}_{t}{\partial }_su\mathclose | +(1+\mathopen |{\partial }_tu\mathclose |)(\mathopen |{\partial }_su\mathclose | +\mathopen \Vert {{\partial }_su} \mathclose \Vert _{L^1}))\). Step two uses the second estimate in (V2) for each \(\mathcal{V }_\ell \) with constant \(C_\ell ^2\). Step three follows from \(C_\ell ^k\le C_\ell ^\ell \) whenever \(\ell \ge k\), see (58). The remaining estimates in (V0)–(V3) follow by the same argument. Continuity of \(v_\lambda \) with respect to the \(C^0\) topology follows similarly using uniform continuity of the functions \(\mathcal{V }_\ell \).

\(\square \)
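The same splitting works for an estimate governed by the constants \(C_\ell ^k\) with a general index \(k\ge 0\). In the following sketch the symbol \(g(u)\) stands for the right hand side of the corresponding axiom; it is introduced here only for illustration.

$$\begin{aligned} \sum _{\ell =0}^\infty \left|\lambda _\ell \right| C_\ell ^k\, g(u) \le \left( \sum _{\ell =0}^{k-1}\left|\lambda _\ell \right| C_\ell ^k +\sum _{\ell =k}^\infty \left|\lambda _\ell \right| C_\ell ^\ell \right) g(u) \le \left( \sum _{\ell =0}^{k-1}\left|\lambda _\ell \right| C_\ell ^k +\left\Vert {v_\lambda } \right\Vert\right) g(u). \end{aligned}$$

The first sum has only finitely many terms, so the resulting constant is finite; it depends on \(k\) and on \(\lambda \).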

5.2 Admissible perturbations

Throughout we fix a perturbation \(\mathcal{V }\) that satisfies (V0)–(V3) and such that \(\mathcal{S }_{\mathcal{V }}:\mathcal{L }M\rightarrow \mathbb{R }\) is Morse. Denote the critical values \(c_i\) of \(\mathcal{S }_\mathcal{V }\) by

$$\begin{aligned} c_0<c_1<c_2<\cdots <c_k<a<c_{k+1}<\cdots . \end{aligned}$$

Note that there is no accumulation point, because \(\mathcal{S }_\mathcal{V }\) admits only finitely many critical points on each sublevel set. Fix a regular value \(a>c_0\) (otherwise \(\{\mathcal{S }_\mathcal{V }\le a\}=\emptyset \) and we are done) and let \(c_k\) be the largest critical value smaller than \(a\). If there are critical values larger than \(a\) let \(c_{k+1}\) be the smallest such, otherwise set \(c_{k+1}\) at the same distance above \(a\) as \(c_k\) sits below \(a\), that is \(c_{k+1}:=a+(a-c_k)\). The idea to prove the transversality Theorem 8 is to perturb \(\mathcal{S }_{\mathcal{V }}\) outside some \(L^2\) neighborhood \(U\) of its critical points in such a way that no new critical points arise on the sublevel set \(\{\mathcal{S }_\mathcal{V }<c_{k+1}\}\). To achieve this we fix for every critical point \(x\) a closed \(L^2\) neighborhood \(U_x\) such that \(U_x\cap U_y=\emptyset \) whenever \(x\not = y\). This is possible, because on any sublevel set there are only finitely many critical points (\(\mathcal{S }_{\mathcal{V }}\) is Morse and satisfies the Palais–Smale condition; see e.g. [19, app. A]). Set

$$\begin{aligned} U=U(\mathcal{V }) :=\bigcup _{x\in \mathcal{P }(\mathcal{V })}U_x \end{aligned}$$
(60)

and consider the Banach space of perturbations \(Y\) given by (59). We are interested in the subset of those perturbations supported away from \(U\), namely

$$\begin{aligned}Y(\mathcal{V },U) :=\left\{ v_\lambda =\sum _{\ell =0}^\infty \lambda _\ell \mathcal{V }_\ell \in Y \left.\frac{}{}\right|\, \mathrm{supp}\mathcal{V }_\ell \cap U\not = \emptyset \;\;\Rightarrow \;\; \lambda _\ell =0 \right\} . \end{aligned}$$

Lemma 5

\(Y(\mathcal{V },U)\) is a closed subspace of the separable Banach space \(Y\).

Proof

Pick \(\alpha ,\beta \in \mathbb{R }\) and \(v_\lambda ,v_\mu \in Y(\mathcal{V },U)\). By definition of \(Y(\mathcal{V },U)\) the following is true for every \(\ell \in \mathbb{N }_0\). If \(\mathrm{supp}\mathcal{V }_\ell \cap U\not = \emptyset \), then \(\lambda _\ell =0\) and \(\mu _\ell =0\). Hence \(\alpha \lambda _\ell +\beta \mu _\ell =0\) and therefore \(\alpha v_\lambda + \beta v_\mu \in Y(\mathcal{V },U)\). To see that the subspace \(Y(\mathcal{V },U)\) is closed let \(v_\lambda ^i=\sum \lambda _\ell ^i\mathcal{V }_\ell \) be a sequence in \(Y(\mathcal{V },U)\) which converges to some element \(v_\lambda =\sum \lambda _\ell \mathcal{V }_\ell \) of \(Y\). This implies that \(\lambda _\ell ^i\rightarrow \lambda _\ell \), as \(i\rightarrow \infty \), for each \(\ell \). Assume \(\mathrm{supp}\mathcal{V }_\ell \cap U\not = \emptyset \). It follows that \(\lambda _\ell ^i=0\), because \(v_\lambda ^i\in Y(\mathcal{V },U)\), and this is true for all \(i\). Hence the limit \(\lambda _\ell \) is zero and therefore \(v_\lambda \in Y(\mathcal{V },U)\). \(\square \)
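The coordinatewise convergence \(\lambda _\ell ^i\rightarrow \lambda _\ell \) used in the proof follows from definition (59) of the norm together with \(C_\ell ^\ell \ge 1\); see (58). A one-line sketch, with \(m\) a summation index used only here:

$$\begin{aligned} \left|\lambda _\ell ^i-\lambda _\ell \right| \le \left|\lambda _\ell ^i-\lambda _\ell \right| C_\ell ^\ell \le \sum _{m=0}^\infty \left|\lambda _m^i-\lambda _m\right| C_m^m =\left\Vert {v_\lambda ^i-v_\lambda } \right\Vert \longrightarrow 0 \quad \text{as } i\rightarrow \infty . \end{aligned}$$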

For \(c_k<a<c_{k+1}\) as above set

$$\begin{aligned} \delta ^a=\delta ^a(\mathcal{V }) :=\frac{1}{2}\min \{a-c_k,c_{k+1}-a\}>0, \quad a_\pm :=a\pm \delta ^a. \end{aligned}$$
(61)

Hence the distance between any two of \( c_k<a_-<a<a_+<c_{k+1} \) is at least \(\delta ^a\).
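As a quick illustration of definition (61), take the hypothetical values \(c_k=1\), \(a=2\), \(c_{k+1}=5\); these numbers are chosen only for this example.

$$\begin{aligned} \delta ^a=\frac{1}{2}\min \{2-1,5-2\}=\frac{1}{2}, \quad a_-=\frac{3}{2}, \quad a_+=\frac{5}{2}, \end{aligned}$$

so that \(a_--c_k=a-a_-=a_+-a=\frac{1}{2}\) and \(c_{k+1}-a_+=\frac{5}{2}\). All four gaps are at least \(\delta ^a\).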

Lemma 6

Fix a perturbation \(\mathcal{V }\) satisfying (V0)–(V3). Assume \(\mathcal{S }_\mathcal{V }\) is Morse. Define \(U\) by (60), fix a regular value \(a\) of \(\mathcal{S }_\mathcal{V }\), and consider the reals \(c_k\), \(c_{k+1}\), \(a_\pm \), \(\delta ^a\) defined above. If \(v_\lambda \in Y(\mathcal{V },U)\) and \(\mathopen \Vert {v_\lambda } \mathclose \Vert <\delta ^a\), then there are inclusions

$$\begin{aligned} \left\{ \mathcal{S }_\mathcal{V }\le c_k\right\}&\subset \left\{ \mathcal{S }_{\mathcal{V }+v_\lambda }\le a_-\right\} \subset \left\{ \mathcal{S }_\mathcal{V }\le a\right\} \subset \left\{ \mathcal{S }_{\mathcal{V }+v_\lambda }\le a_+\right\} \subset \left\{ \mathcal{S }_\mathcal{V }< c_{k+1}\right\} \\&\left\{ \mathcal{S }_\mathcal{V }\le a_-\right\} \subset \left\{ \mathcal{S }_{\mathcal{V }+v_\lambda }\le a\right\} \subset \left\{ \mathcal{S }_\mathcal{V }\le a_+\right\} . \end{aligned}$$

Proof

Fix \(v_\lambda \in Y(\mathcal{V },U)\) with \(\mathopen \Vert {v_\lambda } \mathclose \Vert <\delta ^a\). Observe that for each \(\gamma \in \mathcal{L }M\)

$$\begin{aligned} \left|v_\lambda (\gamma )\right| \le \sum _{\ell =0}^\infty \left|\lambda _\ell \mathcal{V }_\ell (\gamma )\right| \le \sum _{\ell =0}^\infty \left|\lambda _\ell \right| C_\ell ^0 \le \sum _{\ell =0}^\infty \left|\lambda _\ell \right| C_\ell ^\ell =\left\Vert {v_\lambda } \right\Vert <\delta ^a. \end{aligned}$$

Here we used axiom (V0) with constant \(C_\ell ^0\) for \(\mathcal{V }_\ell \), the fact that \(C_\ell ^0\le C_\ell ^\ell \) by (58), and definition (59) of the norm on \(Y\). Observe further that \( \mathcal{S }_{\mathcal{V }+v_\lambda } =\mathcal{S }_\mathcal{V }-v_\lambda \). The proofs of the asserted inclusions all follow the same pattern. We only provide details for the last two inclusions in the first line of the assertion of the lemma. Assume \(\mathcal{S }_\mathcal{V }(\gamma )\le a\), then \(\mathcal{S }_{\mathcal{V }+v_\lambda }(\gamma ) =\mathcal{S }_\mathcal{V }(\gamma )-v_\lambda (\gamma ) <a+\delta ^a=a_+\) where the last step is by definition of \(a_+\). Now assume \(\mathcal{S }_{\mathcal{V }+v_\lambda }(\gamma )\le a_+\), then \(\mathcal{S }_\mathcal{V }(\gamma ) \le a_++v_\lambda (\gamma )<a+2\delta ^a \le c_{k+1}\) again by definition of \(a_+\). The last step is by definition of \(\delta ^a\). \(\square \)
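As an illustration of the same pattern, here is a sketch of the first inclusion in the first line, which the proof above does not spell out. Assume \(\mathcal{S }_\mathcal{V }(\gamma )\le c_k\) and use \(|v_\lambda (\gamma )|<\delta ^a\) together with \(c_k\le a-2\delta ^a\), which follows from (61). Then

$$\begin{aligned} \mathcal{S }_{\mathcal{V }+v_\lambda }(\gamma ) =\mathcal{S }_\mathcal{V }(\gamma )-v_\lambda (\gamma ) <c_k+\delta ^a \le (a-2\delta ^a)+\delta ^a =a_-. \end{aligned}$$

The inclusions in the second line follow in the same way, with \(a_-\) and \(a\), respectively \(a\) and \(a_+\), playing the roles of the two levels.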

Consider the positive constants given by

$$\begin{aligned} \kappa ^a=\kappa ^a(\mathcal{V },U) :=\inf _{\gamma \in \{\mathcal{S }_\mathcal{V }<c_{k+1}\}\setminus U} \left\Vert {\mathrm{grad }\mathcal{S }_{\mathcal{V }}(\gamma )} \right\Vert_2>0 \end{aligned}$$

and

$$\begin{aligned} r^a=r^a(\mathcal{V },U):=\frac{1}{2} \min \{\delta ^a,\kappa ^a\}>0. \end{aligned}$$
(62)

To prove the strict inequality \(\kappa ^a>0\) assume by contradiction that \(\kappa ^a=0\). Then by Palais–Smale there exists a sequence \((\gamma _i)\subset \{\mathcal{S }_\mathcal{V }<c_{k+1}\}\setminus U\) converging in the \(W^{1,2}\) topology to a critical point \(x\). It follows that \(x\in U\), because \(U\) contains all critical points. Since \(W^{1,2}\) convergence implies \(L^2\) convergence and \(U\) is an \(L^2\) neighborhood of the critical points, we arrive at a contradiction to \(\gamma _i\notin U\) for all \(i\in \mathbb{N }\).

Proposition 8

Fix a perturbation \(\mathcal{V }\) satisfying (V0)–(V3). Assume \(\mathcal{S }_\mathcal{V }\) is Morse and \(a\) is a regular value. If \(v_\lambda \in Y(\mathcal{V },U)\) and \(\mathopen \Vert {v_\lambda } \mathclose \Vert \le r^a\), then

$$\begin{aligned} \mathcal{P }^a(\mathcal{V })=\mathcal{P }^a(\mathcal{V }+v_\lambda ),\quad \mathrm{H}_*\left(\left\{ \mathcal{S }_\mathcal{V }\le a\right\} \right) \cong \mathrm{H}_*\left(\left\{ \mathcal{S }_{\mathcal{V }+v_\lambda }\le a\right\} \right). \end{aligned}$$

Proof

Fix \(v_\lambda \in Y(\mathcal{V },U)\) with \(\mathopen \Vert {v_\lambda } \mathclose \Vert \le \frac{1}{2}\min \{\delta ^a,\kappa ^a\}\). Define \(a_+\) by (61).

I) We prove that \(\mathcal{P }^{a_+}(\mathcal{V })=\mathcal{P }^{a_+}(\mathcal{V }+v_\lambda )\) which immediately implies the first assertion of the proposition. On \(U\) both functionals \(\mathcal{S }_\mathcal{V }\) and \(\mathcal{S }_{\mathcal{V }+v_\lambda }\) coincide, because \(\mathcal{S }_{\mathcal{V }+v_\lambda }=\mathcal{S }_\mathcal{V }-v_\lambda \) and \(v_\lambda \) is not supported on \(U\). Now \(\mathcal{S }_\mathcal{V }\) does not admit any critical point on \(\{\mathcal{S }_{\mathcal{V }+v_\lambda }<c_{k+1}\}\setminus U\) by definition of \(U\). Assume the same holds true for \(\mathcal{S }_{\mathcal{V }+v_\lambda }\). Then, since \(\{\mathcal{S }_{\mathcal{V }+v_\lambda }\le a_+\} \subset \{\mathcal{S }_\mathcal{V }< c_{k+1}\}\) by Lemma 6, it follows that all critical points of \(\mathcal{S }_{\mathcal{V }+v_\lambda }\) below level \(a_+\) are contained in \(U\). But there it coincides with \(\mathcal{S }_\mathcal{V }\). Hence \(\mathcal{P }^{a_+}(\mathcal{V }+v_\lambda )=\mathcal{P }^{a_+}(\mathcal{V })\).

It remains to prove the assumption. Suppose by contradiction that there is a critical point \(x\) of \(\mathcal{S }_{\mathcal{V }+v_\lambda }\) on \(\{\mathcal{S }_{\mathcal{V }+v_\lambda }<c_{k+1}\}\setminus U\). Hence

$$\begin{aligned} 0=\mathrm{grad }\,\mathcal{S }_{\mathcal{V }+v_\lambda }(x) =\mathrm{grad }\,\mathcal{S }_\mathcal{V }(x) -\mathrm{grad }\, v_\lambda (x) \end{aligned}$$

and therefore \(\mathopen \Vert {\mathrm{grad }\, v_\lambda (x)} \mathclose \Vert _2= \mathopen \Vert {\mathrm{grad }\,\mathcal{S }_\mathcal{V }(x)} \mathclose \Vert _2\ge \kappa ^a\) by definition of \(\kappa ^a\). On the other hand, since \(v_\lambda \) is of the form \(\sum \lambda _\ell \mathcal{V }_\ell \) it follows that

$$\begin{aligned} \left\Vert {\mathrm{grad }\, v_\lambda (x)} \right\Vert_2 \le \sum _{\ell =0}^\infty \left|\lambda _\ell \right|\cdot \left\Vert {\mathrm{grad }\,\mathcal{V }_\ell (x)} \right\Vert_\infty \le \sum _{\ell =0}^\infty \left|\lambda _\ell \right| C_\ell ^0 \le \left\Vert {v_\lambda } \right\Vert\le \frac{1}{2} \kappa ^a. \end{aligned}$$

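Here we used axiom (V0) with constant \(C_\ell ^0\) for \(\mathcal{V }_\ell \), and the fact that \(C_\ell ^0\le C_\ell ^\ell \) by (58). The last two steps are by definition (59) of the norm on \(Y\) and the assumption on \(\mathopen \Vert {v_\lambda } \mathclose \Vert \). Combined with the lower bound \(\mathopen \Vert {\mathrm{grad }\, v_\lambda (x)} \mathclose \Vert _2\ge \kappa ^a\) this gives \(\kappa ^a\le \frac{1}{2}\kappa ^a\), which is impossible since \(\kappa ^a>0\). This contradiction proves the assumption.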

II) We prove that \(\mathrm{H}_*\left(\left\{ \mathcal{S }_{\mathcal{V }+v_\lambda }\le a\right\} \right) \cong \mathrm{H}_*\left(\left\{ \mathcal{S }_\mathcal{V }\le a\right\} \right)\). By step I) all elements of the interval \([a_-,a_+]\) are regular values of \(\mathcal{S }_{\mathcal{V }+v_\lambda }\). Hence classical Morse theory for the negative \(W^{1,2}\) gradient flow on the loop space shows that

$$\begin{aligned} \mathrm{H}_*\left(\left\{ \mathcal{S }_{\mathcal{V }+v_\lambda }\le a_-\right\} \right) \cong \mathrm{H}_*\left(\left\{ \mathcal{S }_{\mathcal{V }+v_\lambda }\le a_+\right\} \right). \end{aligned}$$

On the other hand, using the inclusions provided by Lemma 6 this isomorphism factors through the inclusion induced homomorphisms

$$\begin{aligned} \mathrm{H}_*\left(\left\{ \mathcal{S }_{\mathcal{V }+v_\lambda }\le a_-\right\} \right) \rightarrow \mathrm{H}_*\left(\left\{ \mathcal{S }_\mathcal{V }\le a\right\} \right) \rightarrow \mathrm{H}_*\left(\left\{ \mathcal{S }_{\mathcal{V }+v_\lambda }\le a_+\right\} \right). \end{aligned}$$

Therefore the first homomorphism is injective and the second one surjective. Since \(a\) lies in the interval of regular values of \(\mathcal{S }_{\mathcal{V }+v_\lambda }\), the first one leads to an injective homomorphism \( \mathrm{H}_*\left(\left\{ \mathcal{S }_{\mathcal{V }+v_\lambda }\le a\right\} \right) \rightarrow \mathrm{H}_*\left(\left\{ \mathcal{S }_\mathcal{V }\le a\right\} \right) \). By construction the interval \([a_-,a_+]\) consists of regular values of \(\mathcal{S }_\mathcal{V }\). Hence the same argument using again Lemma 6 to obtain the inclusion induced homomorphisms

$$\begin{aligned} \mathrm{H}_*\left(\left\{ \mathcal{S }_\mathcal{V }\le a_-\right\} \right) \rightarrow \mathrm{H}_*\left(\left\{ \mathcal{S }_{\mathcal{V }+v_\lambda }\le a\right\} \right) \rightarrow \mathrm{H}_*\left(\left\{ \mathcal{S }_\mathcal{V }\le a_+\right\} \right) \end{aligned}$$

provides a surjection \( \mathrm{H}_*\left(\left\{ \mathcal{S }_{\mathcal{V }+v_\lambda }\le a\right\} \right) \rightarrow \mathrm{H}_*\left(\left\{ \mathcal{S }_\mathcal{V }\le a\right\} \right) \). \(\square \)
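The algebraic fact behind the injectivity and surjectivity claims in step II is elementary. In the sketch below the symbols \(f:A\rightarrow B\) and \(g:B\rightarrow C\) are introduced only for illustration and denote homomorphisms whose composition \(g\circ f\) is an isomorphism.

$$\begin{aligned} \ker f\subset \ker (g\circ f)=\{0\},\qquad \mathrm{im}\, g\supset \mathrm{im}\,(g\circ f)=C. \end{aligned}$$

Hence \(f\) is injective and \(g\) is surjective. In step II the composition is the isomorphism provided by Morse theory and \(f\), \(g\) are the inclusion induced homomorphisms.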

By definition the set of admissible perturbations is given by the closed ball \(\mathcal{O }^a\) in the Banach space \(Y(\mathcal{V },U)\) of radius \(r^a\) defined by (62), namely

$$\begin{aligned} \mathcal{O }^a=\mathcal{O }^a(\mathcal{V },U) :=\left\{ v_\lambda \in Y(\mathcal{V },U): \left\Vert {v_\lambda } \right\Vert\le r^a \right\} . \end{aligned}$$
(63)

Since \(Y(\mathcal{V },U)\) is a separable Banach space by Lemma 5, the closed subset \(\mathcal{O }^a\) inherits the structure of a complete metric space. Proposition 8 then concludes the proof of the first part of Theorem 8. Namely, if \(v_\lambda \in \mathcal{O }^a\), then \(\mathcal{S }_\mathcal{V }\) and \(\mathcal{S }_{\mathcal{V }+v_\lambda }\) have homologically equivalent sublevel sets with respect to \(a\) and the same critical points when restricted to these sublevel sets.

Remark 4

If \(a<b\) are regular values of \(\mathcal{S }_\mathcal{V }\) and \(v\in \mathcal{O }^b\) satisfies \(\mathopen \Vert {v} \mathclose \Vert \le \delta ^a/2\), then \(v\in \mathcal{O }^a\). To see this note that \(\kappa ^b\le \kappa ^a\) and therefore \(\mathopen \Vert {v} \mathclose \Vert \le r^b\le \kappa ^b/2\le \kappa ^a/2\). Hence \(\mathopen \Vert {v} \mathclose \Vert \le \frac{1}{2}\min \{\delta ^a,\kappa ^a\}=r^a\).

Remark 5

Since we chose to cut off our abstract perturbations in Sect. 1.1 with respect to the \(L^2\) norm, we cannot naturally control the support of \(v\in \mathcal{O }^a\) in terms of sublevel sets of \(\mathcal{S }_\mathcal{V }\). This would be possible if we cut off using the \(W^{1,2}\) norm, because the action functional \(\mathcal{S }_\mathcal{V }\) is continuous in the \(W^{1,2}\) topology.

5.3 Surjectivity

Proof of Theorem 8

Assume that the perturbation \(\mathcal{V }\) satisfies (V0)–(V3) and the function \(\mathcal{S }_{\mathcal{V }}:\mathcal{L }M\rightarrow \mathbb{R }\) is Morse. Consider the neighborhood \(U\) of the critical points of \(\mathcal{S }_{\mathcal{V }}\) defined by (60) and fix a regular value \(a\) of \(\mathcal{S }_{\mathcal{V }}\). For \(\mathcal{O }^a=\mathcal{O }^a(\mathcal{V },U)\) defined by (63) the first assertion of Theorem 8 is true by Proposition 8. To prove the second one fix in addition a constant \(p>2\) and two critical points \(x,y\in \mathcal{P }^a(\mathcal{V })\). We denote by \(\mathcal{B }^{1,p}_{x,y}\) the smooth Banach manifold of cylinders between \(x\) and \(y\) defined by (34) in Sect. 3. This manifold is separable and admits a countable atlas. Now consider the smooth Banach space bundle

$$\begin{aligned} \mathcal{E }^p\rightarrow \mathcal{B }^{1,p}_{x,y}\times \mathcal{O }^a \end{aligned}$$

whose fibre over \((u,v_\lambda )\) consists of the \(L^p\) vector fields along \(u\). The formula

$$\begin{aligned} \mathcal{F }(u,v_\lambda ) ={\partial }_su-\nabla {}_{t}{\partial }_tu -\mathrm{grad }\bigl (\mathcal{V }+v_\lambda \bigr )(u) \end{aligned}$$
(64)

defines a smooth section of this bundle. Note that \(\mathcal{F }(u,v_\lambda )=0\) is equivalent to \(u\in \mathcal{M }(x,y;\mathcal{V }+v_\lambda )\). The zero set

$$\begin{aligned} \mathcal{Z }=\mathcal{Z }(x,y;\mathcal{V },U,a)=\mathcal{F }^{-1}(0) \end{aligned}$$

is called the universal moduli space. It does not depend on \(p>2\), since all solutions of the heat equation (7) are smooth by Theorem 2. The key fact is that the space of perturbations \(\mathcal{O }^a\) is rich enough such that zero is a regular value of \(\mathcal{F }\). By definition the latter means that either there is no zero of \(\mathcal{F }\) at all or \(d\mathcal{F }(u,v_\lambda )\) is onto and \(\ker d\mathcal{F }(u,v_\lambda )\) admits a topological complement whenever \(\mathcal{F }(u,v_\lambda )=0\). In the first case we set \(\mathcal{O }^a_{reg}(x,y):=\mathcal{O }^a\).

The second case decomposes into two classes. First we need to sharpen our notation. By \(\mathcal{D }_{u,\mathcal{V }}\) we denote the operator previously denoted by \(\mathcal{D }_u\). In this notation the linearization of \(\mathcal{F }\) at the zero \((u,v_\lambda )\) is given by

$$\begin{aligned} d\mathcal{F }(u,v_\lambda )\;(\xi ,\hat{\mathcal{V }}) =d\mathcal{F }_{v_\lambda }(u)\;\xi +d\mathcal{F }_u(v_\lambda )\;\hat{\mathcal{V }}=\mathcal{D }_{u,\mathcal{V }+v_\lambda }\xi -\mathrm{grad }\hat{\mathcal{V }}(u) \end{aligned}$$
(65)

where \(\mathcal{F }_{v_\lambda }(u):=\mathcal{F }(u,v_\lambda )=:\mathcal{F }_u(v_\lambda )\) and

$$\begin{aligned} \mathcal{D }\xi :=\mathcal{D }_{u,\mathcal{V }+v_\lambda }\xi :=\nabla {}_{s}\xi -\nabla {}_{t}\nabla {}_{t}\xi -R(\xi ,{\partial }_tu){\partial }_tu-\mathcal{H }_{\mathcal{V }+v_\lambda }(u)\xi . \end{aligned}$$
(66)

I. Automatic transversality of constant trajectories. The first class consists of pairs \((u,v_\lambda )\) where \(v_\lambda \in \mathcal{O }^a\) and \(u\) is a constant heat flow trajectory. The latter means that \(u\) is of the form \(u_x:=x(=y)\). Now for these pairs transversality holds automatically, since \(\mathcal{S }_\mathcal{V }\) is Morse. To see this observe first that the constant trajectory \(u_x\) solves the heat equation (7) for \(\mathcal{V }\) and likewise for \(\mathcal{V }+v_\lambda \), since \(v_\lambda \in \mathcal{O }^a\) is supported away from \(x\). Hence \((u_x,v_\lambda )\) is a zero of \(\mathcal{F }\) to start with. Similarly it follows that \(d\mathcal{F }(u_x,v_\lambda )=\mathcal{D }_{u_x,\mathcal{V }}\). But \(\mathcal{D }_{u_x,\mathcal{V }}\) acts on each time slice by the covariant Hessian \(A_x\) given by (9). Since \(A_x\) is injective by the Morse assumption on \(\mathcal{S }_\mathcal{V }\), it follows that \(\mathcal{D }_{u_x,\mathcal{V }}\) is injective. Now the cokernel of \(\mathcal{D }_{u_x,\mathcal{V }}\) is equal to the kernel of the formal adjoint operator \(\mathcal{D }_{u_x,\mathcal{V }}^*\) by [21, prop. 3.15] and [21, prop. 3.18]. But \(\mathcal{D }_{u_x,\mathcal{V }}^*=\mathcal{D }_{u_x,\mathcal{V }}\) by self-adjointness of \(A_x\). Hence \(\mathcal{D }_{u_x,\mathcal{V }}\) is surjective and we set \(\mathcal{O }^a_{reg}(x,x):=\mathcal{O }^a\).

II. The second class consists of zeroes \((u,v_\lambda )\) of (64) with \({\partial }_su\not =0\). Note that \(\mathcal{S }_{\mathcal{V }+v_\lambda }\) is Morse below level \(a\) by Proposition 8 and since \(v_\lambda \) is supported away from \(x\) and \(y\). Surjectivity of \(d\mathcal{F }(u,v_\lambda )\) is covered by Proposition 9 below. Existence of a topological complement follows, see e.g. [19, prop. 3.3], using surjectivity, boundedness (69), and the fact that \(\mathcal{D }_{u,\mathcal{V }+v_\lambda }:\mathcal{W }^{1,p}_u\rightarrow \mathcal{L }^p_u\) is Fredholm by Theorem 5. Hence zero is a regular value of \(\mathcal{F }\). By the implicit function theorem \(\mathcal{Z }\) is a smooth Banach manifold; see e.g. [8, theorem A.3.3]. Now by Thom-Smale transversality theory the projection onto the second factor

$$\begin{aligned} \pi :\mathcal{Z }\rightarrow \mathcal{O }^a,\quad (u,v_\lambda )\mapsto v_\lambda , \end{aligned}$$

is a smooth Fredholm map whose index at \((u,v_\lambda )\) is given by the Fredholm index of \(\mathcal{D }_{u,\mathcal{V }+v_\lambda }\); see e.g. [8, lemma A.3.6]. This index is equal to the difference of the Morse indices of \(x\) and \(y\) by Theorem 5. Since \(\mathcal{Z }\) is separable and admits a countable atlas, we can apply the Sard-Smale theorem [16] to countably many coordinate representatives of \(\pi \). It follows that the set of regular values of \(\pi \) is residual in \(\mathcal{O }^a\). Denote this set by \(\mathcal{O }^a_{reg}(x,y)\) and observe that

$$\begin{aligned} \mathcal{O }^a_{reg}(x,y) =\{v_\lambda \in \mathcal{O }^a\mid \mathcal{D }_u \,\, \text{ onto}\,\, \forall u\in \mathcal{M }(x,y;\mathcal{V }+v_\lambda ) \} \end{aligned}$$

again by standard transversality theory; see e.g. [19, prop. 3.4].

We define the set of regular perturbations by

$$\begin{aligned} \mathcal{O }^a_{reg} =\mathcal{O }^a_{reg}(\mathcal{V }) :=\bigcap _{x,y\in \mathcal{P }^a(\mathcal{V })} \mathcal{O }^a_{reg}(x,y). \end{aligned}$$
(67)

It is a residual subset of \(\mathcal{O }^a\), since it consists of a finite intersection of residual subsets. This proves Theorem 8 up to Proposition 9. \(\square \)

Proposition 9

(Surjectivity) Fix a perturbation \(\mathcal{V }\) that satisfies (V0)–(V3) and assume \(\mathcal{S }_{\mathcal{V }}\) is Morse. Fix a regular value \(a\), critical points \(x,y\in \mathcal{P }^a(\mathcal{V })\), and a constant \(p>2\). Define \(U\) by (60) and the section \(\mathcal{F }\) by (64). Then

$$\begin{aligned} d\mathcal{F }(u,v_\lambda ):\mathcal{W }^{1,p}_u \times Y(\mathcal{V },U) \rightarrow \mathcal{L }^p_u \end{aligned}$$

is onto at every zero \((u,v_\lambda )\in \mathcal{B }^{1,p}_{x,y}\times \mathcal{O }^a(\mathcal{V },U)\) of \(\mathcal{F }\).

Proof

Fix \((u,v_\lambda )\in \mathcal{F }^{-1}(0)\) such that \({\partial }_su\) does not identically vanish (the case \({\partial }_su=0\) is treated in I. above). Now \(\mathcal{S }_{\mathcal{V }+v_\lambda }\) decreases strictly along \(u\), thus

$$\begin{aligned} c_k\ge \mathcal{S }_\mathcal{V }(x)=\mathcal{S }_{\mathcal{V }+v_\lambda }(x) >\mathcal{S }_{\mathcal{V }+v_\lambda }(u_s) >\mathcal{S }_{\mathcal{V }+v_\lambda }(y)=\mathcal{S }_\mathcal{V }(y) \end{aligned}$$
(68)

where the two identities exploit that \(v_\lambda \) is not supported near \(x\) and \(y\). Hence \(x\not = y\). Define \(1<q<2\) by \(1/p+1/q=1\). Recall that \(u\in \mathcal{M }(x,y;\mathcal{V }+v_\lambda )\) and \(\mathcal{S }_{\mathcal{V }+v_\lambda }\) is Morse below level \(a\) by Proposition 8 and the fact that \(v_\lambda \) is not supported near critical points. Hence \(\mathcal{D }_{u,\mathcal{V }+v_\lambda }\) is Fredholm by Theorem 5. Recall from (65) the linearization of \(\mathcal{F }\) at \((u,v_\lambda )\). Note that the second operator

$$\begin{aligned} Y(\mathcal{V },U)\rightarrow \mathcal{L }^p_u\;:\; \hat{\mathcal{V }}\mapsto -\mathrm{grad }\hat{\mathcal{V }}(u) \end{aligned}$$
(69)

is bounded. To see this observe that, since the support of \(\hat{\mathcal{V }}\) is disjoint from the neighborhood \(U\) of \(x\) and \(y\), there is a constant \(T=T(u)>0\) such that \(\mathrm{grad }\hat{\mathcal{V }}(u_s)=0\) whenever \(\mathopen |s\mathclose |>T\). Now \(\hat{\mathcal{V }}\) is of the form \(\sum _{\ell =0}^\infty \mu _\ell \mathcal{V }_\ell \). Hence

$$\begin{aligned} \bigl \Vert \mathrm{grad }\hat{\mathcal{V }}(u)\bigr \Vert _{L^p(\mathbb{R }\times S^1)}&= \left(\,\,\int _{-T}^T\left\Vert {\mathrm{grad }\hat{\mathcal{V }}(u_s)} \right\Vert_p^p ds \right)^{1/p}\\&\le \left(2T\right)^{1/p} \sum _{\ell =0}^\infty \mathopen |\mu _\ell \mathclose |\cdot \left\Vert {\mathrm{grad }\mathcal{V }_\ell (u_s)} \right\Vert_\infty \\&\le \left(2T\right)^{1/p} \sum _{\ell =0}^\infty \mathopen |\mu _\ell \mathclose | C_\ell ^0\\&\le \left(2T\right)^{1/p} \bigl \Vert \hat{\mathcal{V }}\bigr \Vert \end{aligned}$$

where for each \(\mathcal{V }_\ell \) we used the last condition in (V0) with constant \(C_\ell ^0\le C_\ell ^\ell \). The last step uses the definition (59) of the norm in \(Y\).
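The passage from the first to the second line of the estimate can be unpacked as follows. This is a sketch under the assumption, consistent with \(S^1=\mathbb{R }/\mathbb{Z }\), that the circle has length one, so that the \(L^p(S^1)\) norm is bounded by the sup norm. The symbol \(A\) is introduced here only to name the \(s\)-independent bound.

$$\begin{aligned} \bigl \Vert \mathrm{grad }\hat{\mathcal{V }}(u_s)\bigr \Vert _p&\le \bigl \Vert \mathrm{grad }\hat{\mathcal{V }}(u_s)\bigr \Vert _\infty \le \sum _{\ell =0}^\infty |\mu _\ell |\cdot \bigl \Vert \mathrm{grad }\mathcal{V }_\ell (u_s)\bigr \Vert _\infty \le \sum _{\ell =0}^\infty |\mu _\ell |\, C_\ell ^0=:A,\\ \left(\int _{-T}^T A^p\, ds\right)^{1/p}&=(2T)^{1/p}A. \end{aligned}$$

Note that the resulting operator bound \((2T)^{1/p}\) depends on the trajectory \(u\) through \(T=T(u)\), which is harmless since \(u\) is fixed.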

Now the range of \(d\mathcal{F }(u,v_\lambda )\) is closed by a standard result; see e.g. [19, proposition 3.3]. Hence it suffices to prove that it is dense. But density of the range is equivalent to triviality of its annihilator. By definition this means that, given \(\eta \in \mathcal{L }_u^q\) and setting \(\mathcal{D }:=\mathcal{D }_{u,\mathcal{V }+v_\lambda }\) to simplify notation, then

$$\begin{aligned} \langle \eta ,\mathcal{D }\xi \rangle =0, \quad \forall \xi \in \mathcal{W }^{1,p}_u, \end{aligned}$$
(70)

and

$$\begin{aligned} \langle \eta ,\mathrm{grad }\hat{\mathcal{V }}(u) \rangle =0, \quad \forall \hat{\mathcal{V }}\in Y(\mathcal{V },U), \end{aligned}$$
(71)

imply that \(\eta =0\).

Assume by contradiction that \(\eta \in \mathcal{L }_u^q\) satisfies (70) and \(\eta \not =0\). In five steps we derive a contradiction to (71). Steps 1–3 are preparatory, in step 4 we construct a model perturbation \(\mathcal{V }_{\varepsilon }\) violating (71) and in step 5 we approximate \(\mathcal{V }_{\varepsilon }\) by the fundamental perturbations \(\mathcal{V }_{ijk}\) of the form (57). To start with observe that \(\eta \) is smooth by (70) and the regularity theorem [21, thm. 3.1]. Furthermore, integrating (70) by parts whenever \(\xi \in C^\infty _0(\mathbb{R }\times S^1,u^*TM)\) shows that \(\mathcal{D }^*\eta =0\) pointwise, where the operator \(\mathcal{D }^*\) arises by replacing \(\nabla {}_{s}\) by \(-\nabla {}_{s}\) in (66). Throughout we use the notation \(\eta _s(t)=\eta (s,t)\). Hence \(\eta _s\) is a smooth vector field along the loop \(u_s\).

Step 1

(Unique continuation) \(\eta _s\not =0\) and \({\partial }_su_s\not =0\) for every \(s\in \mathbb{R }\).

Because \(\eta \) is smooth, nonzero, and \(\mathcal{D }^*\eta =0\), Proposition 6 on unique continuation shows that \(\eta _s\not =0\) for every \(s\in \mathbb{R }\). Next observe that \({\partial }_su\) is smooth and \(0=\frac{d}{ds}\mathcal{F }_{v_\lambda }(u)=\mathcal{D }{\partial }_su\). Since \(u\) connects different critical points, the derivative \({\partial }_su\) cannot vanish identically on \(\mathbb{R }\times S^1\). Apply Proposition 6 to \(\xi (s):={\partial }_su_s\).

Step 2

(Slicewise Orthogonal) \(\langle \eta _s,{\partial }_su_s\rangle =0\) for every \(s\in \mathbb{R }\).

Throughout step 2 we denote the \(L^2(S^1)\) inner product by \(\langle \cdot ,\cdot \rangle \). Observe that

$$\begin{aligned} \frac{d}{ds} \langle \eta _s,{\partial }_su_s\rangle&= \langle \nabla {}_{s}\eta _s,{\partial }_su_s\rangle +\langle \eta _s,\nabla {}_{s}{\partial }_su_s\rangle \\&= \langle -\nabla {}_{t}\nabla {}_{t}\eta _s -R(\eta _s,{\partial }_tu_s){\partial }_tu_s -\mathcal{H }_{\mathcal{V }+v_\lambda }(u_s)\eta _s,{\partial }_su_s\rangle \\&+\langle \eta _s, \nabla {}_{t}\nabla {}_{t}{\partial }_su_s +R({\partial }_su_s,{\partial }_tu_s){\partial }_tu_s +\mathcal{H }_{\mathcal{V }+v_\lambda }(u_s){\partial }_su_s\rangle \\&= 0 \end{aligned}$$

by straightforward calculation. In the second equality we replaced \(\nabla {}_{s}\eta _s\) according to the identity \(\mathcal{D }^*\eta =0\) and \(\nabla {}_{s}{\partial }_su_s\) according to \(\mathcal{D }{\partial }_su=0\); see (66). The last step is by integration by parts, symmetry of the Hessian \(\mathcal{H }\), and the first Bianchi identity for the curvature operator \(R\). Thus \(\langle \eta _s,{\partial }_su_s\rangle \) is constant in \(s\). Now this constant, say \(c\), must be zero, because

$$\begin{aligned} \int _{-\infty }^\infty c \; ds =\int _{-\infty }^\infty \langle \eta _s,{\partial }_su_s\rangle \; ds =\langle \eta ,{\partial }_su\rangle \end{aligned}$$

and the right hand side is finite, since \(\eta \in \mathcal{L }_u^q\) and \({\partial }_su\in \mathcal{L }_u^p\) with \(\frac{1}{p}+\frac{1}{q}=1\). This proves step 2.
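For orientation, the cancellation in the displayed calculation rests on three symmetry relations for the \(L^2(S^1)\) inner product. In the sketch below \(\zeta _s:={\partial }_su_s\) is an abbreviation introduced only here, and \(\mathcal{H }\) stands for \(\mathcal{H }_{\mathcal{V }+v_\lambda }(u_s)\).

$$\begin{aligned} \langle \nabla {}_{t}\nabla {}_{t}\eta _s,\zeta _s\rangle&=\langle \eta _s,\nabla {}_{t}\nabla {}_{t}\zeta _s\rangle ,\\ \langle R(\eta _s,{\partial }_tu_s){\partial }_tu_s,\zeta _s\rangle&=\langle \eta _s,R(\zeta _s,{\partial }_tu_s){\partial }_tu_s\rangle ,\\ \langle \mathcal{H }\eta _s,\zeta _s\rangle&=\langle \eta _s,\mathcal{H }\zeta _s\rangle . \end{aligned}$$

The first relation is integration by parts over \(S^1\) applied twice, the second is the pair symmetry of the curvature tensor, and the third is symmetry of the Hessian. Finiteness of the pairing follows from Hölder's inequality,

$$\begin{aligned} \left|\langle \eta ,{\partial }_su\rangle \right| \le \Vert \eta \Vert _q\,\Vert {\partial }_su\Vert _p<\infty , \end{aligned}$$

whereas \(\int _{-\infty }^\infty c\, ds\) is finite only if \(c=0\).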

Note that \(\eta _s\) and \({\partial }_su_s\) are linearly independent for every \(s\in \mathbb{R }\) as a consequence of step 1 and step 2.

Step 3

(No Return) Assume the loop \(u_{s_0}\) is different from the asymptotic limits \(x\) and \(y\). Assume \(\delta >0\). Then there exists \({\varepsilon }>0\) such that for every \(s\in \mathbb{R }\)

$$\begin{aligned} \left\Vert {u_s-u_{s_0}} \right\Vert_2<3{\varepsilon }\quad \Longrightarrow \quad s\in (s_0-\delta ,s_0+\delta ). \end{aligned}$$

In words, once \(s\) leaves a given \(\delta \)-interval about \(s_0\) the loops \(u_s\) cannot return to some \(L^2\) \({\varepsilon }\)-neighborhood of \(u_{s_0}\).

Key ingredients in the proof are smoothness of \(u\), existence of asymptotic limits, and the gradient flow property. Recall the footnote in Remark 2 concerning the difference of loops \(u_s-u_{s_0}\). Now assume by contradiction that there is a sequence of positive reals \({\varepsilon }_i\rightarrow 0\) and a sequence of reals \(s_i\) which satisfy \(\mathopen \Vert {u_{s_i}-u_{s_0}} \mathclose \Vert _2<3{\varepsilon }_i\) and \(s_i\notin (s_0-\delta ,s_0+\delta )\). In particular, this shows that

$$\begin{aligned} u_{s_i} \stackrel{L^2}{\longrightarrow } u_{s_0}\quad \text{ as} \ i \rightarrow \infty . \end{aligned}$$
(72)

Assume first that the sequence \(s_i\) is unbounded. Hence there is a subsequence, still denoted by \(s_i\), which converges to \(+\infty \) or \(-\infty \). In either case \(u_{s_i}\) converges to one of the critical points \(x\) or \(y\) and the convergence is in \(C^2(S^1)\) by Theorem 4. Hence (72) implies that \(u_{s_0}\in \{x,y\}\) contradicting our assumption.

If the sequence \(s_i\) is bounded, there is a subsequence, still denoted by \(s_i\), which converges to some element \(s_1\notin (s_0-\delta ,s_0+\delta )\). On the other hand, the sequence \(u_{s_i}\) converges to \(u_{s_1}\) in \(C^0(S^1)\) by smoothness of \(u\). Thus \(u_{s_1}=u_{s_0}\). But the action strictly decreases along nonconstant negative gradient flow lines. Therefore \(s_1=s_0\) and this contradiction concludes the proof of step 3.

Step 4

There is a time \(s_0\in \mathbb{R }\) such that \(u_{s_0}\) lies outside \(U\). Moreover, there is a constant \({\varepsilon }>0\) and a smooth function \(\mathcal{V }_0 :\mathcal{L }M\rightarrow \mathbb{R }\) supported in the \(L^2\) ball of radius \(2{\varepsilon }\) about \(u_{s_0}\) such that

$$\begin{aligned} \mathcal{V }_0 (u_{s_0})=0,\quad d\mathcal{V }_0 (u_{s_0})\eta _{s_0}= \left\Vert {\eta _{s_0}} \right\Vert_2^2,\quad \langle \mathrm{grad }\mathcal{V }_0 (u), \eta \rangle \not =0 \end{aligned}$$

where the inner product is in \(L^2(\mathbb{R }\times S^1)\).

The first assertion follows from \(x\not = y\) and the fact that the closed sets \(U_z\), where \(z\in \mathcal{P }(\mathcal{V })\), are pairwise disjoint. Clearly the graph \(t\mapsto (t,u_{s_0}(t))\) of the loop \(u_{s_0}\) is embedded in \(S^1\times M\). We define a smooth function \(V\) on \(S^1\times M\) supported near this graph as follows. Denote by \(\iota >0\) the injectivity radius of the closed Riemannian manifold \(M\). Pick a smooth cutoff function \(\beta :\mathbb{R }\rightarrow [0,1]\) such that \(\beta =1\) on \([-(\iota /2)^2,(\iota /2)^2]\) and \(\beta =0\) outside \([-{\iota }^2,{\iota }^2]\); see Fig. 2. Then define

$$\begin{aligned} V_t(q):=V(t,q):= \left\{ \begin{array}{ll} \beta \bigl (\mathopen |\xi _q(t)\mathclose |^2\bigr )\; \bigl \langle \xi _q(t), \eta _{s_0}(t)\bigr \rangle&,\mathopen |\xi _q(t)\mathclose |<\iota ,\\ 0&\text{,} \text{ else,} \end{array} \right. \end{aligned}$$
(73)

where the vector \(\xi _q(t)\) is determined by the identity \( q=\exp _{u(s_0,t)}\xi _q(t) \) whenever the Riemannian distance \(d\) between \(q\) and \(u_{s_0}(t)\) is less than \(\iota \). Note that the function \(V\) vanishes on the graph of the loop \(u_{s_0}\).

Use that all maps involved are smooth to choose a constant \(\delta >0\) sufficiently small such that for every \(s\in (s_0-\delta ,s_0+\delta )\) the following is true

  i) \(d_{C^0}(u_s,u_{s_0}) =\left\Vert {\xi _s} \right\Vert_\infty <\iota /2\) where the vector field \(\xi _s\) along the loop \(u_{s_0}\) is uniquely determined by the pointwise identity \( u_s=\exp _{u_{s_0}}\xi _s \),

  ii) \(\langle E_2(u_{s_0},\xi _s)^{-1} \eta _s,\eta _{s_0} \rangle \ge \frac{1}{2}\mu _0\) where \(\mu _0:=\left\Vert {\eta _{s_0}} \right\Vert_2^2>0\),

  iii) \(\frac{1}{2} \mu _1 \le \frac{\left\Vert {u_s-u_{s_0}} \right\Vert_2}{\mathopen |s-s_0\mathclose |} \le \frac{3}{2} \mu _1\) where \(\mu _1:=\left\Vert {{\partial }_su_{s_0}} \right\Vert_2>0\).

Recall the definition (26) of \(E_2\) and the identities (28). For \(s\in (s_0-\delta ,s_0+\delta )\), we obtain that

$$\begin{aligned} dV_t(u_s)\,\eta _s&= \left.\frac{d}{dr}\right|_{r=0} V_t(\exp _{u_s}r\eta _s)\nonumber \\&= 2\beta ^\prime (\mathopen |\xi _s\mathclose |^2)\; \langle \xi _s, E_2(u_{s_0},\xi _s)^{-1} \eta _s\rangle \cdot \langle \xi _s, \eta _{s_0}\rangle \nonumber \\&+\,\beta (\mathopen |\xi _s\mathclose |^2)\; \langle E_2(u_{s_0},\xi _s)^{-1} \eta _s,\eta _{s_0} \rangle \nonumber \\&= \langle E_2(u_{s_0},\xi _s)^{-1} \eta _s,\eta _{s_0} \rangle \end{aligned}$$
(74)

pointwise for every \(t\in S^1\). The final step uses i) and the definition of \(\beta \). Note that \(dV_t(u_{s_0})\,\eta _{s_0} =\mathopen |\eta _{s_0}\mathclose |^2\) pointwise.

Integrating \(V\) along a loop defines a smooth function on the loop space which vanishes on \(u_{s_0}\). To cut this function off with respect to the \(L^2\) distance fix a smooth cutoff function \(\rho :\mathbb{R }\rightarrow [0,1]\) such that \(\rho =1\) on \([-1,1]\), \(\rho =0\) outside \([-4,4]\), and \(\mathopen \Vert {\rho ^\prime } \mathclose \Vert _\infty <1\). Then, for the constant \(\delta \) fixed above, choose \({\varepsilon }>0\) according to step 3 (No Return) and set \(\rho _{\varepsilon }(r)=\rho (r/{\varepsilon }^2)\); see Fig. 1 for \({\varepsilon }=\frac{1}{k}\). Note that \(\mathopen \Vert {\rho _{\varepsilon }^\prime } \mathclose \Vert _\infty <{\varepsilon }^{-2}\). Observe that we can choose \({\varepsilon }>0\) smaller and the assertion of step 3 remains true. Now define a smooth function on \(\mathcal{L }M\) by

$$\begin{aligned} \mathcal{V }_0 (x) :=\rho _{\varepsilon }\left( \left\Vert {x-u_{s_0}} \right\Vert_2^2\right) \int _0^1 V(t,x(t))\, dt \end{aligned}$$

where \(V\) is given by (73). The function \(\mathcal{V }_0 \) vanishes on the loop \(u_{s_0}\) and satisfies

$$\begin{aligned} d\mathcal{V }_0 (u_s)\,\eta _s&= \left.\frac{d}{dr}\right|_{r=0} \mathcal{V }_0 (\exp _{u_s}r\eta _s)\\&= 2\rho _{\varepsilon }^\prime \bigl ( \left\Vert {u_s-u_{s_0}} \right\Vert_2^2\bigr )\; \langle u_s-u_{s_0}, \eta _s\rangle \int _0^1 V_t(u_s(t))\, dt\\&+\,\rho _{\varepsilon }\bigl ( \left\Vert {u_s-u_{s_0}} \right\Vert_2^2\bigr )\; \int _0^1 dV_t(u_s(t))\,\eta _s(t)\, dt. \end{aligned}$$

Hence \(d\mathcal{V }_0 (u_{s_0})\eta _{s_0} =\mathopen \Vert {\eta _{s_0}} \mathclose \Vert _2^2\) and this proves another assertion of step 4.
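The evaluation at \(s=s_0\) and the size of the support can be made explicit. The following sketch uses only \(\rho _{\varepsilon }(r)=\rho (r/{\varepsilon }^2)\), the properties of \(\rho \) fixed above, and the pointwise identity \(dV_t(u_{s_0})\,\eta _{s_0}=\mathopen |\eta _{s_0}\mathclose |^2\). At \(s=s_0\) the first term vanishes, because \(u_s-u_{s_0}=0\), and \(\rho _{\varepsilon }(0)=\rho (0)=1\), so

$$\begin{aligned} d\mathcal{V }_0(u_{s_0})\,\eta _{s_0} =\rho _{\varepsilon }(0)\int _0^1 dV_t(u_{s_0}(t))\,\eta _{s_0}(t)\, dt =\int _0^1 \left|\eta _{s_0}(t)\right|^2 dt =\Vert \eta _{s_0}\Vert _2^2. \end{aligned}$$

For the support and the derivative bound one has

$$\begin{aligned} \Vert x-u_{s_0}\Vert _2\ge 2{\varepsilon }\;\;\Longrightarrow \;\; \frac{\Vert x-u_{s_0}\Vert _2^2}{{\varepsilon }^2}\ge 4 \;\;\Longrightarrow \;\; \rho _{\varepsilon }\bigl (\Vert x-u_{s_0}\Vert _2^2\bigr )=0, \qquad \rho _{\varepsilon }^\prime (r)=\frac{1}{{\varepsilon }^2}\,\rho ^\prime \Bigl (\frac{r}{{\varepsilon }^2}\Bigr ). \end{aligned}$$

In particular \(\mathcal{V }_0\) is supported in the \(L^2\) ball of radius \(2{\varepsilon }\) about \(u_{s_0}\), and \(\mathcal{V }_0\) coincides with the uncut integral of \(V\) on the ball of radius \({\varepsilon }\).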

To prove the final assertion of step 4 observe that \(s\notin (s_0-\delta ,s_0+\delta )\) implies \(\mathopen \Vert {u_s-u_{s_0}} \mathclose \Vert _2\ge 3{\varepsilon }\) by step 3, hence \(u_s\notin \mathrm{supp}\,\mathcal{V }_0 \). It follows that

$$\begin{aligned} \langle \mathrm{grad }\,\mathcal{V }_0 (u),\eta \rangle&= \int _{s_0-\delta }^{s_0+\delta } d\mathcal{V }_0 (u_s)\eta _s\, ds\nonumber \\&= \int _{s_0-\delta }^{s_0+\delta } 2\rho _{\varepsilon }^\prime \bigl ( \left\Vert {u_s-u_{s_0}} \right\Vert_2^2\bigr ) \langle u_s-u_{s_0}, \eta _s\rangle \langle \xi _s, \eta _{s_0}\rangle \, ds\nonumber \\&+\int _{s_0-\delta }^{s_0+\delta } \rho _{\varepsilon }\bigl ( \left\Vert {u_s-u_{s_0}} \right\Vert_2^2\bigr ) \langle E_2(u_{s_0},\xi _s)^{-1} \eta _s,\eta _{s_0} \rangle \, ds. \end{aligned}$$
(75)

We shall estimate the two terms in the sum separately. Let \(s_2>s_0\) be such that \(\mathopen \Vert {u_{s_2}-u_{s_0}} \mathclose \Vert _2={\varepsilon }\) and \(\mathopen \Vert {u_s-u_{s_0}} \mathclose \Vert _2<{\varepsilon }\) whenever \(s\in (s_0,s_2)\). This means that \(s_2\) is the forward exit time of \(u_s\) with respect to the \(L^2\) ball of radius \({\varepsilon }\) about \(u_{s_0}\). Let \(s_1<s_0\) be the corresponding backward exit time; see Fig. 3. Use ii) and \(\rho _{\varepsilon }\ge 0\) to obtain that

$$\begin{aligned}&\int _{s_0-\delta }^{s_0+\delta } \rho _{\varepsilon }\bigl ( \left\Vert {u_s-u_{s_0}} \right\Vert_2^2\bigr ) \langle E_2(u_{s_0},\xi _s)^{-1} \eta _s,\eta _{s_0} \rangle \, ds\\&\quad \ge \int _{s_1}^{s_2} 1\cdot \frac{\mu _0}{2}\, ds =\frac{\mu _0}{2} \left( s_2-s_0+s_0-s_1\right)\\&\quad \ge \frac{\mu _0}{3\mu _1}\left( \left\Vert {u_{s_2}-u_{s_0}} \right\Vert_2 +\left\Vert {u_{s_0}-u_{s_1}} \right\Vert_2\right) =\frac{2\mu _0}{3\mu _1}\, {\varepsilon }. \end{aligned}$$

Here the second inequality uses iii). To estimate the other term in (75) let \(\sigma _1\) be the time of first entry into the \(L^2\) ball of radius \(2{\varepsilon }\) starting from \(s_0-\delta \) and let \(\sigma _2\) be the corresponding time when time runs backwards and we start from \(s_0+\delta \); see Fig. 3.

Fig. 3 Exit times \(s_1,s_2\) and entry times \(\sigma _1,\sigma _2\)

Then it follows that

$$\begin{aligned}&\int _{s_0-\delta }^{s_0+\delta } 2\rho _{\varepsilon }^\prime \bigl ( \left\Vert {u_s-u_{s_0}} \right\Vert_2^2\bigr ) \langle u_s-u_{s_0}, \eta _s\rangle \langle \xi _s, \eta _{s_0}\rangle \, ds\\&\quad \ge -2\int _{\sigma _1}^{\sigma _2} \left\Vert {\rho _{\varepsilon }^\prime } \right\Vert_\infty \left|\langle u_s-u_{s_0}, \eta _s\rangle \right|\cdot \mathopen |\langle \xi _s, \eta _{s_0}\rangle \mathclose |\, ds\\&\quad \ge -2c_1c_2{\varepsilon }^{-2}\int _{\sigma _1}^{\sigma _2} (s-s_0)^4\, ds\\&\quad =-\frac{2c_1c_2}{5{\varepsilon }^2}\left( (\sigma _2-s_0)^5+(s_0-\sigma _1)^5\right) \ge -\frac{2\cdot 8^5c_1c_2}{5\mu _1^5}{\varepsilon }^3. \end{aligned}$$

It remains to explain the second and the final inequality. In the final one we use that by iii) there is the estimate \(\sigma _2-s_0\le 2\mathopen \Vert {u_{\sigma _2}-u_{s_0}} \mathclose \Vert _2/\mu _1 =4{\varepsilon }/\mu _1\) and similarly for \(s_0-\sigma _1\). The second inequality is based on the geometric fact that \({\partial }_su\) and \(\eta \) are slicewise orthogonal by step 2. Namely, let \(f(s)=\langle u_s-u_{s_0}, \eta _s\rangle \) and \(h(s)=\langle \xi _s, \eta _{s_0}\rangle \), then \(f(s_0)=h(s_0)=0\) and

$$\begin{aligned} f^\prime (s)&= \langle {\partial }_su_s,\eta _s\rangle +\langle u_s-u_{s_0}, \nabla {}_{s}\eta _s\rangle =\langle u_s-u_{s_0}, \nabla {}_{s}\eta _s\rangle \\ h^\prime (s)&= \langle E_2(u_{s_0},\xi _s)^{-1}{\partial }_su_s, \eta _{s_0}\rangle . \end{aligned}$$

Hence \(f^\prime (s_0)=h^\prime (s_0)=0\) and so there exist constants \(c_1=c_1(f)>0\) and \(c_2=c_2(h)>0\) depending continuously on \(\delta \) such that for every \(s\in (s_0-\delta ,s_0+\delta )\)

$$\begin{aligned} \left|f(s)\right| \le c_1(s-s_0)^2,\quad \left|h(s)\right| \le c_2(s-s_0)^2. \end{aligned}$$
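One way to obtain such constants is Taylor's formula with integral remainder at \(s_0\). The following is a sketch under the assumption that \(f\) and \(h\) are of class \(C^2\) on \([s_0-\delta ,s_0+\delta ]\); it uses only \(f(s_0)=f^\prime (s_0)=0\) and the analogous identities for \(h\).

$$\begin{aligned} f(s)=\int _{s_0}^s (s-r)f^{\prime \prime }(r)\, dr ,\qquad \left|f(s)\right| \le \tfrac{1}{2}\sup _{\left|r-s_0\right|\le \delta }\left|f^{\prime \prime }(r)\right|\, (s-s_0)^2. \end{aligned}$$

Thus any \(c_1>0\) with \(c_1\ge \frac{1}{2}\sup |f^{\prime \prime }|\) works, and likewise for \(c_2\) and \(h\). Multiplying the two bounds yields \(|f(s)|\cdot |h(s)|\le c_1c_2(s-s_0)^4\), which is the integrand appearing in the second inequality, provided \(\Vert \rho _{\varepsilon }^\prime \Vert _\infty \le {\varepsilon }^{-2}\). The latter bound is inferred from the displayed estimate; the definition of \(\rho _{\varepsilon }\) is not part of this passage.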

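This proves the second inequality. Now choose \({\varepsilon }>0\) sufficiently small such that \({\varepsilon }^2<5\mu _0\mu _1^4/(3\cdot 8^5c_1c_2)\). Then the positive lower bound for the second term in (75) dominates the negative lower bound for the first term. This implies that \(\langle \mathrm{grad }\,\mathcal{V }_0 (u), \eta \rangle >0\) and proves step 4.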
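The smallness condition on \({\varepsilon }\) can be traced with explicit constants by inserting the two estimates into (75). The following sketch combines only the two displayed lower bounds and assumes nothing beyond them.

$$\begin{aligned} \langle \mathrm{grad }\,\mathcal{V }_0 (u),\eta \rangle \ge \frac{2\mu _0}{3\mu _1}\, {\varepsilon } -\frac{2\cdot 8^5c_1c_2}{5\mu _1^5}\, {\varepsilon }^3 =\frac{2{\varepsilon }}{3\mu _1}\left( \mu _0-\frac{3\cdot 8^5c_1c_2}{5\mu _1^4}\, {\varepsilon }^2\right) . \end{aligned}$$

The right hand side is positive precisely when \({\varepsilon }^2<5\mu _0\mu _1^4/(3\cdot 8^5c_1c_2)\), so any \({\varepsilon }\) below this threshold suffices.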

Now recall that \(u_{s_0}\notin U\). Shrink \({\varepsilon }>0\) further, if necessary, such that the \(L^2\) ball of radius \(3{\varepsilon }\) about \(u_{s_0}\) is disjoint from the \(L^2\) closed set \(U\), such that \(3{\varepsilon }\) is smaller than the injectivity radius \(\iota \) of \(M\), and such that \({\varepsilon }=1/k\) for some positive integer \(k\).

Step 5

Given \(k=1/{\varepsilon }\) as in the paragraph above, there exist integers \(i,j>0\) such that the function \(\hat{\mathcal{V }}:=\mathcal{V }_{ijk}\) given by (57) lies in \(Y(\mathcal{V },U)\) and satisfies

$$\begin{aligned} \langle \mathrm{grad }\,\mathcal{V }_{ijk}(u), \eta \rangle >0. \end{aligned}$$

This contradicts (71) and thereby proves Proposition 9.

Consider the loop \(u_{s_0}\) where \(s_0\) is the time in step 4. In Sect. 5.1 we fixed a dense sequence \(x_i\) in \(C^\infty (S^1,M)\) and for each \(i\) a dense sequence \(\eta ^{ij}\) in \(C^\infty (S^1,x_i^*TM)\). Choose a subsequence, still denoted by \(x_i\), such that

$$\begin{aligned} x_i \rightarrow u_{s_0},\quad \text{ as} \ i\rightarrow \infty . \end{aligned}$$

Now we may assume without loss of generality that every \(x_i\) lies in \(B_{\varepsilon }(u_{s_0})\), the \(L^2\) ball of radius \({\varepsilon }\) about \(u_{s_0}\). Hence \(B_{2{\varepsilon }}(x_i)\subset B_{3{\varepsilon }}(u_{s_0})\) by the triangle inequality. Let \(\xi _{s_0}^i\) be defined by the identity \(u_{s_0}=\exp _{x_i} \xi _{s_0}^i\) pointwise for every \(t\in S^1\). Choose a diagonal subsequence, denoted for simplicity by \(\eta ^{ii}\), such that

$$\begin{aligned} \Phi _{x_i}(\xi _{s_0}^i)\eta ^{ii} \rightarrow \eta _{s_0},\quad \text{ as} \; i\rightarrow \infty . \end{aligned}$$

Here \(\Phi _x(\xi )\) is parallel transport from \(x\) to \(\exp _x \xi \) along \(\tau \mapsto \exp _x\tau \xi \) pointwise for every \(t\in S^1\). Let \((\mathcal{V }_{iik})_{i\in \mathbb{N }}\) be the corresponding sequence of functions where each \(\mathcal{V }_{iik}\) is given by (57). Now observe that

$$\begin{aligned} \mathrm{supp}\mathcal{V }_{iik} \subset B_{2/k}(x_i) =B_{2{\varepsilon }}(x_i) \subset B_{3{\varepsilon }}(u_{s_0}). \end{aligned}$$

But \(B_{3{\varepsilon }}(u_{s_0})\cap U=\emptyset \) by the choice of \({\varepsilon }\) in the paragraph prior to step 5 and so \(\mathcal{V }_{iik}\in Y(\mathcal{V },U)\). Next recall that the constant \(\delta >0\) has been chosen in the proof of step 4 in order to exclude any return of the trajectory \(s\mapsto u_s\) to the ball \(B_{3{\varepsilon }}(u_{s_0})\) once \(s\) has left the interval \((s_0-\delta ,s_0+\delta )\). Since \(\mathrm{supp}\,\mathcal{V }_{iik}\subset B_{3{\varepsilon }}(u_{s_0})\), this shows that \(\mathcal{V }_{iik}(u_s)=0\) whenever \(s\notin (s_0-\delta ,s_0+\delta )\). Hence

$$\begin{aligned} \langle \mathrm{grad }\,\mathcal{V }_{iik}(u),\eta \rangle&= \int _{s_0-\delta }^{s_0+\delta } 2\rho _{1/k}^\prime \bigl ( \left\Vert {u_s-x_i} \right\Vert_2^2\bigr ) \langle u_s-x_i, \eta _s\rangle \langle \xi _s^i, \eta ^{ii}\rangle \, ds\\&+\int _{s_0-\delta }^{s_0+\delta } \rho _{1/k}\bigl ( \left\Vert {u_s-x_i} \right\Vert_2^2\bigr ) \langle E_2(x_i,\xi _s^i)^{-1} \eta _s,\eta ^{ii} \rangle \, ds \end{aligned}$$

where \(\xi _s^i\) is determined by \(u_s=\exp _{x_i}\xi _s^i\). Now the right hand side converges for \(i\rightarrow \infty \) to the right hand side of (75), which equals \(\langle \mathrm{grad }\,\mathcal{V }_0 (u), \eta \rangle >0\). This proves step 5 and Proposition 9.

6 Heat flow homology

In Sect. 6.1 we define the unstable manifold of a critical point \(x\) of the action functional \(\mathcal{S }_\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) as the set of endpoints at time zero of all backward halfcylinders solving the heat equation (7) and emanating from \(x\) at \(-\infty \). The main result is Theorem 18 saying that if \(x\) is nondegenerate, then this is a submanifold of the loop space and its dimension is the Morse index of \(x\).

In Sect. 6.2 we put together everything to construct the Morse complex for the negative \(L^2\) gradient of the action functional on the loop space \(\mathcal{L }M\).

6.1 The unstable manifold theorem

Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V3) and consider the backward halfcylinder \(Z^-=(-\infty ,0]\times S^1\). Given a critical point \(x\) of the action functional \(\mathcal{S }_\mathcal{V }\) the moduli space

$$\begin{aligned} \mathcal{M }^-(x;\mathcal{V }) \end{aligned}$$
(76)

is, by definition, the set of all solutions \(u^-:Z^-\rightarrow M\) of the heat equation (7) which satisfy the asymptotic limit condition (3), as \(s\rightarrow -\infty \). Note that the moduli space is not empty; it contains the stationary solution \(u^-_x(s,\cdot )=x\). The unstable manifold of \(x\) is defined by

$$\begin{aligned} W^u(x;\mathcal{V }) =\{{u^-}(0,\cdot )\mid {u^-}\in \mathcal{M }^-(x;\mathcal{V })\}. \end{aligned}$$

Theorem 18

Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V3). If \(x\) is a nondegenerate critical point of the action functional \(\mathcal{S }_\mathcal{V }\), then the unstable manifold \(W^u(x;\mathcal{V })\) is a smooth contractible embedded submanifold of the loop space and its dimension is equal to the Morse index of \(x\).

The first step in the proof of Theorem 18 is to show that the moduli space \(\mathcal{M }^-(x;\mathcal{V })\) is a smooth manifold of the desired dimension whenever \(x\) is nondegenerate (Proposition 10). A crucial ingredient is Proposition 11 on surjectivity of the operator \(\mathcal{D }_{u^-}:\mathcal{W }^{1,p}\rightarrow \mathcal{L }^p\) whenever \(u^-\in \mathcal{M }^-(x;\mathcal{V })\) and \(p>2\). Here the operator \(\mathcal{D }_{u^-}\) is given by (21) and arises by linearizing the heat equation at the backward trajectory \(u^-\). A further key ingredient in the proof of Theorem 18 is unique continuation for the linear and the nonlinear heat equation, Proposition 6 and Theorem 17. Namely, unique continuation implies that the evaluation map

$$\begin{aligned} ev_0:\mathcal{M }^-(x;\mathcal{V })\rightarrow \mathcal{L }M,\quad {u^-}\mapsto {u^-}(0,\cdot ) \end{aligned}$$

is an injective immersion, hence an embedding by the gradient flow property.

Proposition 10

(Moduli space) Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V3) and assume \(x\) is a nondegenerate critical point of \(\mathcal{S }_\mathcal{V }\). Then the moduli space \(\mathcal{M }^-(x;\mathcal{V })\) is a smooth contractible manifold of dimension \(\mathrm{ind}_\mathcal{V }(x)\). Its tangent space at \({u^-}\) is equal to the vector space \(X^-\) given by (77).

Proposition 11

(Surjectivity) Fix a perturbation \(\mathcal{V }:\mathcal{L }M\rightarrow \mathbb{R }\) that satisfies (V0)–(V3) and a nondegenerate critical point \(x\) of \(\mathcal{S }_\mathcal{V }\). Assume \(p>2\) and \({u^-}\in \mathcal{M }^-(x;\mathcal{V })\). Then the following is true. The operator \( \mathcal{D }_{u^-}:\mathcal{W }^{1,p}\rightarrow \mathcal{L }^p \) is Fredholm, onto, and its kernel is given by

$$\begin{aligned} X^-&:= \Bigl \{ \xi \in C^\infty (Z^-,{u^-}^*TM) \mid \mathcal{D }_{u^-}\xi =0, \exists c,\delta >0\; \forall s\le 0:\nonumber \\&\quad \left\Vert {\xi _s} \right\Vert_\infty +\left\Vert {\nabla {}_{t}\xi _s} \right\Vert_\infty +\left\Vert {\nabla {}_{t}\nabla {}_{t}\xi _s} \right\Vert_\infty +\left\Vert {\nabla {}_{s}\xi _s} \right\Vert_\infty \le ce^{\delta s}\Bigr \}. \end{aligned}$$
(77)

Moreover, the dimension of \(X^-\) is equal to the Morse index of \(x\).

Proposition 11 is in fact a corollary of Theorem 19 below which asserts surjectivity in the special case of a stationary solution \({u^-}(s,t)=x(t)\), where \(x\) is a nondegenerate critical point of \(\mathcal{S }_\mathcal{V }\). The idea is that if a solution \({u^-}\) is close to the stationary solution \(x\) in the \(\mathcal{W }^{1,p}\) topology, then the corresponding linearizations \(\mathcal{D }_{u^-}\) and \(\mathcal{D }_x\) are close in the operator norm topology. But surjectivity is an open condition with respect to the norm topology. The case of a general solution reduces to the nearby case by shifting the \(s\)-variable.
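The openness of surjectivity invoked here can be made quantitative by a standard Neumann series argument. The following is a sketch that is not taken from the source: \(D\) and \(D^\prime \) stand for two bounded operators between the same Banach spaces, and \(Q\) denotes a bounded right inverse of \(D\), which exists whenever \(D\) is Fredholm and onto. These symbols are introduced only for this illustration.

$$\begin{aligned} D^\prime Q=1+(D^\prime -D)Q ,\qquad \Vert (D^\prime -D)Q\Vert \le \Vert D^\prime -D\Vert \cdot \Vert Q\Vert <1 \quad \Longrightarrow \quad D^\prime \, Q\bigl ( 1+(D^\prime -D)Q\bigr )^{-1}=1. \end{aligned}$$

Hence \(D^\prime \) is onto whenever \(\Vert D^\prime -D\Vert <1/\Vert Q\Vert \). In the situation of the text one would take \(D=\mathcal{D }_x\) and \(D^\prime =\mathcal{D }_{u^-}\).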

Remark 6

Abbreviate \(H=L^2(S^1,\mathbb{R }^n)\) and \(W=W^{2,2}(S^1,\mathbb{R }^n)\) and consider the operator

$$\begin{aligned} A_S=-\frac{d^2}{dt^2}-S:H\rightarrow H \end{aligned}$$

with dense domain \(W\). Here we assume that \(S:W\rightarrow H\) is a symmetric and compact linear operator. Under these assumptions it is well known (see (ii) in [21, sec. 3.4]) that \(A_S\) is self-adjoint and that its Morse index \(\mathrm{ind}(A_S)\) is finite.

Theorem 19

Let \(S\) and \(A_S\) be as in Remark 6. Fix \(p\ge 2\) and assume that the linear operator \(S:W^{1,p}(S^1,\mathbb{R }^n)\rightarrow L^p(S^1,\mathbb{R }^n)\) is bounded with bound \(c_S\). Then the following is true. If \(A_S\) is injective, then the operator

$$\begin{aligned} D={\partial }_s-{\partial }_t{\partial }_t-S: \mathcal{W }^{1,p}(Z^-,\mathbb{R }^n)\rightarrow L^p(Z^-,\mathbb{R }^n) \end{aligned}$$

is onto. In the case \(p=2\) the map \(E^-\rightarrow \ker D\), \(v\mapsto e^{-sA_S}v\) is an isomorphism.
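To illustrate the last assertion, here is a sketch of why \(e^{-sA_S}v\) lies in the kernel of \(D\) and decays in backward time. It assumes, since the notation is not defined in this passage, that \(E^-\) denotes the span of the eigenvectors of \(A_S\) with negative eigenvalues. Note that \(D={\partial }_s+A_S\). For an eigenvector \(v\) with \(A_Sv=\lambda v\) and \(\lambda <0\) set \(\xi (s):=e^{-sA_S}v=e^{-\lambda s}v\). Then

$$\begin{aligned} D\xi ={\partial }_s\xi +A_S\xi =-\lambda e^{-\lambda s}v+\lambda e^{-\lambda s}v=0 ,\qquad \left\Vert {\xi (s)} \right\Vert_H =e^{\left|\lambda \right| s}\left\Vert {v} \right\Vert_H \rightarrow 0 ,\quad \text{ as} \ s\rightarrow -\infty . \end{aligned}$$

For an eigenvalue \(\lambda >0\) the same formula grows exponentially as \(s\rightarrow -\infty \), and \(\lambda =0\) is excluded because \(A_S\) is injective. This is consistent with the kernel being modeled on the negative eigenspace only.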

For the details of the proof of Theorem 19 we refer to [20, thm. 8.5]. The proof is rather lengthy, but follows closely the proof of the corresponding result in Floer theory, namely [12, lemma 2.4]. The proof takes four steps. Step 1 is to prove the theorem for \(p=2\). The proof of [12, lemma 2.4 step 1] carries over with minor but important modifications. These are related to the fact that our domain \(Z^-\) does have a boundary. Moreover, the proof uses the theory of semigroups. Step 4 is to generalize surjectivity from \(p=2\) to \(p>2\). This uses an argument due to Donaldson [3]. Here the estimates provided by step 2 and step 3 enter. Here we follow again the presentation in [12, lemma 2.4 steps 2–4] up to minor but subtle modifications. One subtlety is related to the parabolic estimate of step 2 which, in contrast to the elliptic case, requires the domain to be increased only towards the past.

Proof of Proposition 11

The arguments in the proof of [21, prop. 3.15] show that the kernel of \(\mathcal{D }_{u^-}:\mathcal{W }^{1,p}\rightarrow \mathcal{L }^p\) is equal to \(X^-\). But \(X^-\) does not depend on \(p\). On the other hand, for \(p=2\) the dimension of the kernel is equal to the Morse index of \(x\) by Theorem 19. Surjectivity of \(\mathcal{D }_{u^-}\) follows in three stages.

The stationary case. Consider the stationary solution \((s,t)\mapsto x(t)\). Then \(\mathcal{D }_x\) is onto by Theorem 19. To see this represent \(\mathcal{D }_x\) with respect to an orthonormal frame along \(x\); see [21, sec. 3.4].

The nearby case. Surjectivity is preserved under small perturbations with respect to the operator norm. Moreover, the operator family \(\mathcal{D }_{u^-}\) depends continuously on \(u^-\) with respect to the \(\mathcal{W }^{1,p}\) topology (here we use \(p>2\)). Hence, if \(u^-\in \mathcal{M }^-(x;\mathcal{V })\) satisfies \(u^-=\exp _x(\eta )\) and \(\mathopen \Vert {\eta } \mathclose \Vert _{\mathcal{W }^{1,p}}\) is sufficiently small, it follows that \(\mathcal{D }_{u^-}\) is onto.

The general case. Given \(u\in \mathcal{M }^-(x;\mathcal{V })\) and \(\sigma <0\), consider the shifted solution \(u^\sigma (s,t):=u(s+\sigma ,t)\). Then \(\left(\mathcal{D }_{u}\xi \right)^\sigma =\mathcal{D }_{u^\sigma }\xi ^\sigma \) by shift invariance of the linear heat equation. This means that surjectivity of \(\mathcal{D }_{u}\) is equivalent to surjectivity of \(\mathcal{D }_{u^\sigma }\). But the latter is true by the nearby case above, because \(u^\sigma \) converges to \(x\) in the \(\mathcal{W }^{1,p}\) topology, as \(\sigma \rightarrow -\infty \). To see this apply Theorem 14 (B) on exponential decay to \(u\) and note that \(u^\sigma (0,t)=u(\sigma ,t)\). \(\square \)

Proof of Proposition 10

The proof follows the same (standard) pattern as the proof of Theorem 6; see also the introduction to Sect. 3. The first step is the definition of a Banach manifold \(\mathcal{B }=\mathcal{B }^{1,p}_x\) of backward halfcylinders emanating from \(x\) such that \(\mathcal{B }\) contains the moduli space \(\mathcal{M }^-(x;\mathcal{V })\) whenever \(p>2\). The second step is to define a smooth map \(\mathcal{F }_{u^-}\) between Banach spaces as in (35). Its significance lies in the fact that its zeroes correspond precisely to the elements of the moduli space near \(u^-\) and that \(d\mathcal{F }_{u^-}(0)=\mathcal{D }_{u^-}\). By Proposition 11 this operator is Fredholm, surjective, and the dimension of its kernel is equal to the Morse index of \(x\). Hence \(\mathcal{M }^-(x;\mathcal{V })\) is locally near \(u^-\) modeled on \(\ker \mathcal{D }_{u^-}\) by the implicit function theorem for Banach spaces. To see that the moduli space is a contractible manifold observe that backward time shift provides a contraction

$$\begin{aligned} h:\mathcal{M }^-(x;\mathcal{V })\times [0,1]&\rightarrow \mathcal{M }^-(x;\mathcal{V })\\ (u,r)&\mapsto u(\cdot -\sqrt{r/(1-r)},\cdot ) \end{aligned}$$

onto the stationary solution \(x\), that is \(h\) is continuous and satisfies \(h(u,0)=u\) and \(h(u,1)=x\) for every \(u\in \mathcal{M }^-(x;\mathcal{V })\). \(\square \)
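The stated properties of \(h\) can be checked directly. The following is a sketch in which \(T(r):=\sqrt{r/(1-r)}\) is a symbol introduced here for the shift and the value at \(r=1\) is read as the limit \(T\rightarrow \infty \).

$$\begin{aligned} T(0)=0\;\Longrightarrow \; h(u,0)=u ,\qquad T(r)\rightarrow \infty \ \text{ as} \ r\rightarrow 1 ,\qquad h(u,r)(s,\cdot )=u(s-T(r),\cdot )\rightarrow x \ \text{ as} \ r\rightarrow 1 . \end{aligned}$$

The limit holds for each \(s\le 0\) by the asymptotic limit condition (3). By shift invariance of the heat equation in the \(s\)-variable, each shifted map \(u(\cdot -T,\cdot )\) with \(T\ge 0\) is again a solution on \(Z^-\) with backward limit \(x\), so \(h\) takes values in the moduli space. Moreover \(h(x,r)=x\) for every \(r\), since the stationary solution is shift invariant.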

Proof of Theorem 18

We abbreviate \(\mathcal{M }^-=\mathcal{M }^-(x;\mathcal{V })\) and \(W^u=W^u(x;\mathcal{V })\). Recall that the moduli space \(\mathcal{M }^-\) is a smooth manifold of dimension equal to \(\mathrm{ind}_\mathcal{V }(x)\) by Proposition 10 and, furthermore, by definition the unstable manifold \(W^u\) is equal to the image of the evaluation map \(ev_0:\mathcal{M }^-\rightarrow \mathcal{L }M\) given by \(u\mapsto u(0,\cdot )=:u_0(\cdot )\). It remains to prove that \(ev_0\) and its linearization are injective and that \(ev_0\) is a homeomorphism onto \(W^u\).

To prove that \(ev_0\) is injective let \(u,v\in \mathcal{M }^-\) and assume that \(ev_0(u)=ev_0(v)\), that is \(u_0=v_0\). Hence \(u=v\) by Theorem 17 on backward unique continuation.

We prove that the linearization \(d(ev_0)_u\) of \(ev_0\) at \(u\in \mathcal{M }^-\) is injective. Pick \(\xi ,\eta \in T_u\mathcal{M }^-\), then \(\mathcal{D }_u\xi =0=\mathcal{D }_u\eta \) by Proposition 10. Now assume that \(d(ev_0)_u\xi =d(ev_0)_u\eta \), that is \(\xi _0=\eta _0\). Therefore \(\xi =\eta \) by application of Proposition 6 (a) on linear unique continuation to the vector field \(\xi -\eta \).

To prove that \(ev_0:\mathcal{M }^-\rightarrow \mathcal{L }M\) is a homeomorphism onto its image fix \(u\in \mathcal{M }^-\). Since every immersion is locally an embedding, there is an open disk \(D\) in \(\mathcal{M }^-\) containing \(u\) such that \(ev_0|_D:D\rightarrow \mathcal{L }M\) is an embedding. It remains to prove that there is an open neighborhood \(U\) of \(u_0=ev_0(u)\) in \(\mathcal{L }M\) such that

$$\begin{aligned} U\cap W^u=U\cap ev_0(D). \end{aligned}$$
(78)

There are two cases. In case one \(u\) is constant in \(s\), that is \(u\equiv x\). In this case we exploit the fact that the restricted function \(\mathcal{S }_\mathcal{V }|_{W^u}\) takes on its maximum precisely at the critical point \(x\) by the (negative) gradient flow property. Case two is the complementary case in which \(u\) depends on \(s\). To deal with this case we use a convergence argument based on the compactness Theorem 11. \(\square \)

Case 1

(\(u\equiv x\)) Set \(c=\mathcal{S }_\mathcal{V }(x)\), then a set \(U\) having the desired property (78) is given by \( U:=\{c- {\varepsilon }< \mathcal{S }_\mathcal{V }< c+ {\varepsilon }\} \), where \( 2{\varepsilon }:= \min _{u\in \mathrm{cl}D\setminus D} \left(\mathcal{S }_\mathcal{V }(x)-\mathcal{S }_\mathcal{V }(u_0)\right) \). Here the compact set \(\mathrm{cl}D\setminus D\) is the topological boundary of the open disk \(D\). Note that the elements of \(W^u\setminus ev_0(D)\) have action at most \(c-2{\varepsilon }\).
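That this \(U\) satisfies (78) can be verified in one line. The sketch uses only the action bound in the last sentence and the fact that \({\varepsilon }>0\), which holds because the maximum of \(\mathcal{S }_\mathcal{V }|_{W^u}\) is attained only at \(x\). For \(\gamma \in W^u\setminus ev_0(D)\) we have

$$\begin{aligned} \mathcal{S }_\mathcal{V }(\gamma )\le c-2{\varepsilon }<c-{\varepsilon } \quad \Longrightarrow \quad \gamma \notin U . \end{aligned}$$

Hence \(U\cap W^u\subset ev_0(D)\), and therefore \(U\cap W^u=U\cap ev_0(D)\), since \(ev_0(D)\subset W^u\).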

Case 2

(\(u\not \equiv x\)) Assume by contradiction that there is no \(U\) which satisfies (78). Then there is a sequence \(\gamma ^\nu \in W^u\setminus ev_0(D)\) that converges to \(u_0\) in \(\mathcal{L }M\), as \(\nu \rightarrow \infty \). Note that \(\gamma ^\nu =ev_0(u^\nu )\) where \(u^\nu \in \mathcal{M }^-\setminus D\). In particular, each trajectory \(u^\nu \) converges in backward time asymptotically to \(x\). Thus

$$\begin{aligned} \sup _{s\in (-\infty ,0]}\mathcal{S }_\mathcal{V }(u^\nu _s) \le \mathcal{S }_\mathcal{V }(x)=:c \end{aligned}$$

for every \(\nu \). Together with the energy identity this implies that

$$\begin{aligned} E(u^\nu ) =\mathcal{S }_\mathcal{V }(x)-\mathcal{S }_\mathcal{V }(u^\nu _0) =c-\tfrac{1}{2} \left\Vert {{\partial }_tu^\nu _0} \right\Vert_{L^2(S^1)}^2 +\mathcal{V }(u^\nu _0) \le c+C_0 \end{aligned}$$

where \(C_0>1\) is the constant in axiom (V0). Adapting the proofs of the a priori Theorem 12 and the gradient bound Theorem 13 to cover the case of backward halfcylinders it follows that there is a constant \(C=C(c,\mathcal{V })>0\) such that

$$\begin{aligned} \left\Vert {{\partial }_tu^\nu } \right\Vert_\infty \le C ,\quad \left\Vert {{\partial }_su^\nu } \right\Vert_\infty \le C \sqrt{E(u^\nu )} \le C\sqrt{c+C_0}, \end{aligned}$$

for every \(\nu \). Here the norms are taken on the domain \((-\infty ,0]\times S^1\). Adapting also the proof of the compactness Theorem 11 we obtain, in view of the uniform a priori \(L^\infty \) bounds for \({\partial }_tu^\nu \) and \({\partial }_su^\nu \) just derived, the existence of a smooth heat flow solution \(v:(-\infty ,0]\times S^1\rightarrow M\) and a subsequence, still denoted by \(u^\nu \), such that \(u^\nu \) converges to \(v\) in \(C^\infty _{loc}\). In particular, this implies that \(u_0=v_0\) and that \({\partial }_tu^\nu _s\) converges to \({\partial }_tv_s\), as \(\nu \rightarrow \infty \), uniformly with all derivatives on \(S^1\) and for each \(s\). This and our earlier uniform action bound for \(u^\nu _s\) show that

$$\begin{aligned} \mathcal{S }_\mathcal{V }(v_s) =\lim _{\nu \rightarrow \infty }\mathcal{S }_\mathcal{V }(u^\nu _s) \le c \end{aligned}$$

for every \(s\). To summarize, we have two backward flow lines \(u\) and \(v\) defined on \((-\infty ,0]\times S^1\) along which the action is bounded from above by \(c\) and which coincide along the loop \(u_0=v_0\). Hence Theorem 17 (B) on backward unique continuation asserts that \(u=v\). Because \(u^\nu \) converges to \(v=u\) in \(C^\infty _{loc}\), it follows that \(u^\nu \) lies in the open disk \(D\) containing \(u\), whenever \(\nu \) is sufficiently large. For such \(\nu \) we arrive at the contradiction \(\gamma ^\nu =ev_0(u^\nu )\in ev_0(D)\).
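The energy identity used in Case 2 can be spelled out as follows. The sketch assumes the usual definition of the energy of a backward trajectory, \(E(u)=\int _{-\infty }^0\left\Vert {{\partial }_su_s} \right\Vert_2^2\, ds\), and that the heat equation (7) is the negative \(L^2\) gradient equation \({\partial }_su_s=-\mathrm{grad }\,\mathcal{S }_\mathcal{V }(u_s)\). Both are recalled from context rather than quoted from this passage.

$$\begin{aligned} \frac{d}{ds}\mathcal{S }_\mathcal{V }(u_s) =\langle \mathrm{grad }\,\mathcal{S }_\mathcal{V }(u_s),{\partial }_su_s\rangle =-\left\Vert {{\partial }_su_s} \right\Vert_2^2 ,\qquad E(u)=-\int _{-\infty }^0\frac{d}{ds}\mathcal{S }_\mathcal{V }(u_s)\, ds =\mathcal{S }_\mathcal{V }(x)-\mathcal{S }_\mathcal{V }(u_0). \end{aligned}$$

In particular the action is nonincreasing along \(s\mapsto u_s\), which gives the bound \(\mathcal{S }_\mathcal{V }(u^\nu _s)\le \mathcal{S }_\mathcal{V }(x)\) for every \(s\le 0\).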

6.2 The Morse complex

Assume that the action \(\mathcal{S }_V\) is a Morse function on the loop space. This is true for a generic potential \(V\in C^\infty (S^1\times M)\) by [19]. For each critical point \(x\in \mathcal{P }(V)\) fix an orientation \(\langle x\rangle \) of the tangent space at \(x\) to the (finite dimensional) unstable manifold \(W^u(x;V)\). We denote this choice of orientations by \(\langle \mathcal{P }\rangle \). Fix a regular value \(a\) of \(\mathcal{S }_V\). Then the Morse chain groups are the \(\mathbb{Z }\)-modules

$$\begin{aligned} \mathrm{CM}_k^a=\mathrm{CM}^a_k(V) :=\mathop {\bigoplus }\limits _{\mathop {x\in \mathcal{P }^a(V)}\limits _{\mathrm{ind}_V (x)=k}} \mathbb{Z }\, x, \quad k\in \mathbb{Z }. \end{aligned}$$

These modules are finitely generated and graded by the Morse index. We set \(\mathrm{CM}_k^a=\{0\}\) whenever the direct sum is taken over the empty set. We define

$$\begin{aligned} \mathrm{CM}^a_* :=\bigoplus _{k=0}^N \mathrm{CM}^a_k \end{aligned}$$

where \(N\) is the largest Morse index of an element of the finite set \(\mathcal{P }^a(V)\).

Set \(\mathcal{V }_V(x)=\int _0^1 V_t(x(t))\, dt\) and note that \(\mathcal{V }_V\) satisfies (V0)–(V3). Now consider the associated set \(\mathcal{O }^a(V)\) of admissible perturbations of \(\mathcal{V }_V\) defined by (63). Furthermore, consider its dense subset \(\mathcal{O }^a_{reg}(V)\) of regular perturbations provided by Theorem 8; see (67) for the definition. Now for any \(v\in \mathcal{O }^a_{reg}(V)\) we have the following key facts. The functionals \(\mathcal{S }_V\) and \(\mathcal{S }_{V+v}\) coincide near their critical points and have the same sublevel set with respect to \(a\). Moreover, the perturbed functional \(\mathcal{S }_{V+v}\) is Morse–Smale below level \(a\). (Occasionally we denote \(\mathcal{V }+v\) in abuse of notation by \(V+v\) to emphasize that we are actually perturbing a geometric potential.)

To define the Morse boundary operator \({\partial }\) on \(\mathrm{CM}^a_*\) it suffices to define it on the set of generators \(\mathcal{P }^a(V)\) and then extend linearly. Fix a regular perturbation \(v\in \mathcal{O }^a_{reg}(V)\). Note that each chosen orientation \(\langle x\rangle \) not only orients the unstable manifold \(W^u(x;V)\), but also the perturbed one \(W^u(x;V+v)\). This is because the tangent spaces at \(x\) to \(W^u(x;V)\) and \(W^u(x;V+v)\) coincide (\(v\) is not supported near \(x\)) and unstable manifolds are finite dimensional and contractible (Theorem 18), hence orientable. Now given two critical points \(x^\pm \) of action less than \(a\), consider the heat moduli space \(\mathcal{M }(x^-,x^+;V+v)\) of solutions \(u\) of the heat equation (7) with \(\mathcal{V }\) replaced by \(\mathcal{V }+v\) and subject to the boundary condition (3). Recall from [13, ch. 11] that a choice of orientations for all unstable manifolds determines a system of coherent orientations in the sense of Floer–Hofer [7] on the heat moduli spaces.

From now on we assume that \(x^\pm \) are of Morse index difference one. In this case \(\mathcal{M }(x^-,x^+;V+v)\) is a smooth 1-dimensional manifold by Theorem 6 and its quotient \(\mathcal{M }(x^-,x^+;V+v)/\mathbb{R }\) by the (free) time shift action consists of finitely many points by Proposition 1. For \([u]\in \mathcal{M }(x^-,x^+;V+v)/\mathbb{R }\) time shift naturally induces an orientation of the corresponding component of \(\mathcal{M }(x^-,x^+;V+v)\); compare [13] and note that \({\partial }_su\) is a nonzero element of the one-dimensional vector space \(\ker \mathcal{D }_u=\det (\mathcal{D }_u)\). The characteristic sign \(n_u\) of the heat trajectory \(u\) is defined by \(n_u:=+1\), if the time shift orientation coincides with the coherent orientation, and by \(n_u:=-1\) otherwise. The characteristic sign depends on the chosen orientations \(\langle x^-\rangle \) and \(\langle x^+\rangle \). Consider the (finite) sum of characteristic signs corresponding to all heat trajectories from \(x^-\) to \(x^+\), namely

$$\begin{aligned} n_{\langle x^-\rangle ,\langle x^+\rangle } :=\sum _{[u]\in \mathcal{M }(x^-,x^+;V+v)/\mathbb{R }} n_u. \end{aligned}$$

If the sum runs over the empty set, we set \(n_{\langle x^-\rangle ,\langle x^+\rangle }:=0\). For \(x\in \mathcal{P }^a(V)\) define the Morse boundary operator \({\partial }={\partial }(V,v,\langle \mathcal{P }\rangle )\) by the (finite) sum

$$\begin{aligned} {\partial }x :=\mathop {\sum }\limits _{\mathop {y\in \mathcal{P }(V)} \limits _{\mathrm{ind}_V(x)-\mathrm{ind}_V(y)=1}} n_{\langle x\rangle ,\langle y\rangle }\, y \end{aligned}$$

and set \({\partial }x=0\) whenever the sum runs over the empty set.
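As an illustration of the definition, not taken from the source, consider the hypothetical situation of exactly two generators \(x\) and \(y\) with \(\mathrm{ind}_V(x)-\mathrm{ind}_V(y)=1\) which are connected by exactly two heat trajectories modulo time shift, \([u_1]\) and \([u_2]\), with opposite characteristic signs. This is the pattern familiar from the height function on a circle in finite dimensional Morse theory.

$$\begin{aligned} n_{\langle x\rangle ,\langle y\rangle }=n_{u_1}+n_{u_2}=(+1)+(-1)=0 ,\qquad {\partial }x=0\cdot y=0 ,\qquad {\partial }y=0 . \end{aligned}$$

Reversing the orientation \(\langle x\rangle \) changes both signs simultaneously and leaves the sum unchanged in this example.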

Proof of Theorem 1

As mentioned above the heat moduli spaces are oriented coherently. This means that these orientations are compatible with gluing, which implies that \({\partial }\circ {\partial }=0\); see [7, §5].

The fact that heat flow homology is independent of the choice of regular perturbation \(v\in \mathcal{O }^a_{reg}(V)\) and orientations \(\langle \mathcal{P }\rangle \) of the unstable manifolds follows from the continuation argument which is standard in Floer theory; see again e.g. [5, 12]. Here it is crucial to observe that our admissible perturbations \(v\in \mathcal{O }^a\) are supported away from the level set \(\{\mathcal{S }_V=a\}\) on which the \(L^2\) gradient of \(\mathcal{S }_V\) (hence of \(\mathcal{S }_{V+v}\)) is nonvanishing and inward pointing with respect to \(\mathcal{L }^a M\). Alternatively, independence will follow from Theorem 9. \(\square \)