1 Introduction

In this article, we construct and describe the dynamics of solutions with smooth decaying initial data that exhibit gradient blow-up (while the solution itself remains bounded) for a wide class of perturbations of the inviscid Burgers equation

$$\begin{aligned} \partial _{t} u + u \partial _{x} u = 0. \end{aligned}$$
(Burgers)

Examples covered by our result include the fractional KdV equations

$$\begin{aligned} \partial _{t} u +u \partial _{x} u + \vert {D_{x}}\vert ^{\alpha -1} \partial _{x} u = 0 \quad \hbox { for any } 0 \le \alpha < 1, \end{aligned}$$
(fKdV)

Whitham’s equation,

$$\begin{aligned} \partial _{t} u +u \partial _{x} u + \Gamma (D_{x}) \partial _{x} u = 0 \quad \hbox { where } \Gamma (\xi ) = \sqrt{\frac{\tanh \xi }{\xi }}, \end{aligned}$$
(Whitham)

and the fractal Burgers equations

$$\begin{aligned} \partial _{t} u +u \partial _{x} u + \vert {D_{x}}\vert ^{\beta } u = 0 \quad \hbox { for any } 0 \le \beta < 1, \end{aligned}$$
(fBurgers)

where \(\vert {D_{x}}\vert ^{\alpha } = (-\Delta )^{\frac{\alpha }{2}}\). These equations arise as model problems in the theory of water waves [30], which makes singularity formation a natural question for them.

Gradient blow-up (also referred to as shock formation or wave breaking in the contexts of hyperbolic conservation laws or water waves, respectively [38]) for (fKdV) in the range \( 0 \le \alpha < 1\) was conjectured from numerical experiments in [22] and later in [18]. While the existence of gradient-blow-up solutions was known for (fKdV) in the range \(0 \le \alpha < \frac{2}{3}\) and for (Whitham) (see for instance [17, 18, 32, 39]), our result seems to be the first construction of such blow-up solutions in the range \(\frac{2}{3} \le \alpha < 1\). We furthermore give a quantitative description of the blow-up dynamics in a stable blow-up regime in the case \(0< \alpha < \frac{2}{3}\), which seems not to have appeared in the literature (note that in the case \(\alpha =0\), the Burgers–Hilbert equation, a precise description of the same blow-up dynamics already appeared in the recent work of Yang [39], discussed further below). In all cases, we also observe that there exist smooth compactly supported initial data of either sign (everywhere nonnegative or nonpositive) that give rise to the same blow-up behavior, which disproves (yet suggests a refinement of) a conjecture made by Klein–Saut [22, Conjecture 3] for (Whitham); see Remark 1.3 below.

Concerning (fBurgers), gradient blow-up was shown in the papers [1, 14, 20] for all ranges \(0< \beta < 1\). Moreover, in a recent work of Chickering–Moreno-Vasquez–Pandya [7], which we learned of while preparing our article, a quantitative description of a stable blow-up dynamics analogous to [39] in the Burgers–Hilbert case was given in the case \(0< \beta < \frac{2}{3}\). Our work provides an alternative, independent description of the same stable blow-up regime, as well as the precise description of some examples of gradient-blow-up solutions in the case \(\frac{2}{3} \le \beta < 1\), which seems new.

Our proof is based on a systematic study of the stability of self-similar blow-up solutions for (Burgers) under perturbations of the equation. As is well-known, (Burgers) admits a two-parameter family of scaling symmetries (corresponding to separate rescaling of time and space), which results in a one-parameter family of self-similar change of variables

$$\begin{aligned} (t, x, u) \mapsto (s, y, U) = \left( -\log (-t), \tfrac{x}{(-t)^{b}}, \tfrac{(-t)}{(-t)^{b}} u \right) , \end{aligned}$$

parametrized by \(b > 0\).Footnote 1 Among these b’s, there exist countably many choices that lead to smooth self-similar solutions (i.e., s-independent solutions in the self-similar variables), namely \(b = \frac{2k+1}{2k}\) for \(k= 1, 2, \ldots \) [15]. In what follows, we will refer to the self-similar solutions in the case \(k = 1\) as ground states, and those in the case \(k \ge 2\) as excited states.
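For the reader’s convenience, we record the standard computation behind this (a sketch, in the notation above): substituting \(u(t,x) = (-t)^{b-1} U(s, y)\) with \(s = -\log (-t)\), \(y = x/(-t)^{b}\) into (Burgers), one finds

$$\begin{aligned} \partial _{t} u + u \partial _{x} u = (-t)^{b-2} \left( \partial _{s} U - (b-1) U + (by + U) \partial _{y} U \right) , \end{aligned}$$

so that s-independent solutions \({\bar{U}}\) solve the self-similar ODE \(- (b-1) {\bar{U}} + (by + {\bar{U}}) \partial _{y} {\bar{U}} = 0\).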

A key predecessor of this article is the recent work of Yang [39] that, based on the modulation-theoretic approach of Buckmaster–Shkoller–Vicol [4], constructed an open set of initial data giving rise to gradient-blow-up solutions to the Burgers–Hilbert equation (i.e., (fKdV) with \(\alpha = 0\)) with ground state self-similar solutions to (Burgers) as blow-up profiles (see also the very recent work [7] for (fBurgers) with \(0< \beta < \frac{2}{3}\)). In this article, we extend [39] (and [7]) to more general perturbations of the Burgers equation, while simultaneously allowing for the use of excited states as blow-up profiles.Footnote 2

In fact, these two extensions go hand in hand. A somewhat amusing point, made precise in this article, is that higher excited self-similar profiles, which are less stable under perturbations of the initial data (i.e., they are stable only under a set of initial data perturbations of higher codimension), are more stable under perturbations of the equation, in that higher order dispersive and/or dissipative terms are allowed.Footnote 3 An explanation behind this phenomenon is as follows. The key factor that determines the stability of a self-similar profile under perturbations of the equation turns out to be its rate of concentration (i.e., the exponent \(b = \frac{2k+1}{2k}\)), and the slower rates of the excited states lead to larger classes of admissible perturbations of the equation. To see this point heuristically, one may simply compare the “strength” of each term in the equation on the characteristic time and length scales of the k-th Burgers self-similar profile, which are \(\sim (-t)\) and \(\sim (-t)^{\frac{2k+1}{2k}}\), respectively. For instance, for (fKdV), compare

$$\begin{aligned} \partial _{t} u \sim (-t)^{-1} u \quad \hbox { vs. } \quad \vert {D_{x}}\vert ^{\alpha -1} \partial _{x} u \sim (-t)^{-\frac{2k+1}{2k} \alpha } u. \end{aligned}$$

(In self-similar variables for (Burgers), the “strength” of \(u \partial _{x} u\) is the same as that of \(\partial _{t} u\).) The “strength” of the perturbation \(\vert {D_{x}}\vert ^{\alpha -1} \partial _{x} u\) is weaker than that of \(\partial _{t} u\) when \(- \frac{2k+1}{2k} \alpha < -1\), or equivalently, \(\alpha < \frac{2k}{2k+1}\); note that this range improves as k increases. Our main theorem demonstrates that under this condition, the Burgers self-similar profile with \(b = \frac{2k+1}{2k}\) is stable under passage to (fKdV), leading to gradient-blow-up solutions to (fKdV) that asymptote to the same Burgers self-similar profile near the singularity. On the other hand, it will become apparent that the instability of the self-similar profile under initial data perturbations does not affect its stability under perturbations of the equation.
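To make the heuristic concrete: the admissible range improves with k, and one can compute the least excited profile serving a given \(\alpha \). A minimal sketch (the function name `min_k` is ours, not from the paper):

```python
def min_k(alpha: float) -> int:
    """Smallest k >= 1 with alpha < 2k/(2k+1), i.e. the least excited
    Burgers profile whose concentration rate b = (2k+1)/(2k) makes the
    |D_x|^{alpha-1} d_x perturbation subcritical in the heuristic above."""
    k = 1
    while alpha >= 2 * k / (2 * k + 1):
        k += 1
    return k

# As alpha -> 1, the required k grows without bound, reflecting that
# the thresholds 2k/(2k+1) increase to 1 from below.
for alpha in (0.0, 0.5, 2 / 3, 0.9):
    print(alpha, min_k(alpha))
```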

We remark that the preceding points are in parallel with the recent remarkable works of Merle–Raphaël–Rodnianski–Szeftel on singularity formation for the compressible Euler equations, the compressible Navier–Stokes equations and defocusing nonlinear Schrödinger equations (NLS). In [27], smooth self-similar profiles for the polytropic compressible Euler equations with characteristic length scales \((-t)^{\frac{1}{r}}\) are constructed for discrete values of r. Then these profiles are used to demonstrate singularity formation for second-order dissipative (i.e., compressible Navier–Stokes [28]) and dispersive (i.e., energy-supercritical defocusing NLS after Madelung transform [26]) perturbations of the Euler equations, where the admissible values of r with respect to each perturbation may be determined with similar heuristics as above.

Another innovation in this article is the introduction of a robust yet sharp weighted \(L^{2}\)-based method to establish the optimal spatial growth of the solution in the appropriate self-similar variables, in lieu of the method of characteristics employed in [4] and subsequent works. Such information is necessary to establish the sharp Hölder regularity, and in general even the boundedness (due to the lack of the maximum principle), of the solution up to the blow-up time. We refer the reader to Sect. 1.3 below for a short description of this method.

1.1 First Statement of the Main Result and Discussion

We now precisely state the class of equations studied in this article. For \(u: {\mathbb {R}}_{t} \times {\mathbb {R}}_{x} \rightarrow {\mathbb {R}}\), consider

$$\begin{aligned} \partial _{t} u +u \partial _{x} u + \Gamma \partial _{x} u + \Upsilon u = 0. \end{aligned}$$
(1)

Here, \(\Gamma \) and \(\Upsilon \) are Fourier multipliers with symbols \(\Gamma (\xi )\) and \(\Upsilon (\xi )\) satisfying the following properties:

  1. 1.

    \(\Gamma (\xi ), \Upsilon (\xi ) \in C^{\infty }({\mathbb {R}}\setminus \{0\})\) are real-valued and evenFootnote 4;

  2. 2.

    \(\Gamma (\xi ) \xi \) and \(\Upsilon (\xi )\) are symbols of order \(\alpha \) and \(\beta \) with \(0 \le \alpha , \beta < 1\), in the sense that for every multi-index I, there exist constants \(C_{\Gamma , |I|}, C_{\Upsilon , |I|} > 0\) such that for every \(\vert {\xi }\vert \ge 1\),

    $$\begin{aligned} \vert {\partial _{\xi }^{I} (\Gamma (\xi ) \xi )}\vert \le C_{\Gamma , |I|} \vert {\xi }\vert ^{\alpha -\vert {I}\vert }, \quad \vert {\partial _{\xi }^{I} \Upsilon (\xi )}\vert \le C_{\Upsilon , |I|} \vert {\xi }\vert ^{\beta -\vert {I}\vert }. \end{aligned}$$

    On the other hand, we assume that \(\Gamma (\xi ) \xi \) and \(\Upsilon (\xi )\) are bounded on \(\{\xi \in {\mathbb {R}}: \vert {\xi }\vert \le 1\}\).

  3. 3.

    \(\Upsilon (\xi ) \ge 0\) (i.e., \(\Upsilon \) is elliptic).

Clearly, (fKdV), (Whitham) are examples of (1) with \(\Upsilon = 0\), and (fBurgers) is an example of (1) with \(\Gamma = 0\). The order of \(\Gamma \) for (Whitham) is \(\alpha = \frac{1}{2}\).

By the standard energy method, it can be readily seen that the initial value problem for (1) is locally well-posed in \(H^{s}\) for any \(s > \frac{3}{2}\). Our main theorem concerns singularity formation for (1) starting from smooth and well-localized initial data. In simple terms, the statement of our main theorem is as follows.

Theorem 1.1

Let k be a positive integer such that \(\alpha , \beta < \frac{2k}{2k+1}\). Then there exist smooth initial data \(u_0\) for (1) such that the resulting solution of (1) blows up in finite time in \({\mathcal {C}}^{\sigma }\) for every \(\sigma > \frac{1}{2k+1}\), while its \({\mathcal {C}}^{\frac{1}{2k+1}}\) norm stays bounded until the blow-up time. In the case \(k=1\), and for \(\alpha < \frac{2}{3}\) and \(\beta < \frac{2}{3}\), the blow-up behavior is stable in \(H^{5}\). In the case \(k > 1\), these initial data form a “codimension \(2k-2\) subset” of \(H^{2k+3}\).

For more precise statements regarding the description of the initial data and blow-up dynamics, we refer to Theorem 3.1, Lemma 3.4 and the ensuing discussion. Note moreover that it would be possible, by a more refined analysis, to show that the “codimension \(2k-2\) subset” of \(H^{2k+3}\) in the above statement constitutes in reality a suitably regular submanifold of initial data. However, this is not carried out in the present work.
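The Hölder dichotomy in the theorem can be illustrated on the model cusp \(f(x) = |x|^{\frac{1}{2k+1}}\), which captures the local shape of the blow-up (here \(k = 1\)): the \({\mathcal {C}}^{1/3}\) quotient stays bounded while the \({\mathcal {C}}^{\sigma }\) quotient diverges for \(\sigma > 1/3\). A minimal numerical sketch (the names are ours):

```python
def holder_quotient(h: float, sigma: float) -> float:
    """Hölder quotient |f(h) - f(0)| / |h|^sigma for the cusp f(x) = |x|^{1/3}."""
    return abs(h) ** (1 / 3) / abs(h) ** sigma

# At sigma = 1/3 the quotient is identically 1; for any sigma > 1/3 it
# diverges as h -> 0, i.e. the C^sigma seminorm blows up at the cusp.
for h in (1e-2, 1e-5, 1e-8):
    print(h, holder_quotient(h, 1 / 3), holder_quotient(h, 0.4))
```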

Remark 1.2

(Stable blow-up regime) Note that Theorem 1.1 applies to (fKdV) for \(0 \le \alpha < \frac{2}{3}\), (Whitham) and (fBurgers) for \(0 \le \beta < \frac{2}{3}\) with \(k = 1\), and as a result we obtain a blow-up behavior that is stable under initial data perturbations for these equations. On the other hand, in the range \(\frac{2}{3} \le \alpha < 1\), the term \(\vert {D_{x}}\vert ^{\alpha -1} \partial _{x}\) cannot be merely treated as a small perturbation for \(k = 1\), and we must perturb off of an excited Burgers self-similar profile. Description of a stable (under initial data perturbations) blow-up for (fKdV) for \(\frac{2}{3} \le \alpha < 1\) remains an open problem.

Remark 1.3

(The sign of the initial data) In [22, Conjecture 3], based on numerical investigation, the following interesting conjecture concerning the blow-up dynamics for (Whitham) and the sign of the initial data was made:

  • Solutions to the Whitham equation [...] for negative initial data \(u_{0}\) of sufficiently large mass will develop a cusp at \(t^{*} > t_{c}\) of the form \(\vert {x-x^{*}}\vert ^{1/3}\) [...]

  • Solutions to the Whitham equation [...] for positive initial data \(u_{0}\) of sufficiently large mass will develop a cusp at \(t^{*} < t_{c}\) of the form \(\vert {x-x^{*}}\vert ^{1/2}\) [...]

Our construction provides an open set of initial data of each sign whose corresponding solutions all have the same blow-up behavior (i.e., \({\mathcal {C}}^{\frac{1}{3}}\) remains bounded while \({\mathcal {C}}^{\sigma }\) for any \(\sigma > \frac{1}{3}\) blows up), thereby providing a counterexample to this conjecture as stated; see Remark 3.3 below. Nevertheless, it is possible that the blow-up observed in [22] for positive initial data is another stable blow-up regime, whose blow-up profile must have a positive sign. Verification of this revised picture remains an interesting open problem.

1.2 Prior Works

The models we consider in the present paper have been studied many times in the literature. We give a (non-exhaustive) overview of previous results here, divided into four main areas.

  • Water waves Some of the above equations, such as (fKdV) and (Whitham), arise as approximate models in the theory of water waves. In his 1967 paper, Whitham [37] introduced the equation bearing his name as a nonlinear approximation for surface water waves in which the dispersive term satisfies the appropriate dispersion relation. For further discussion of the connections of (fKdV) with the theory of water waves, we refer the reader to the work of Klein–Linares–Pilod–Saut [21]. Many authors since the work of Whitham have focused on the issue of singularity formation for such models. Wave breaking for (Whitham) was first shown, only formally, by Seliger [34], followed by Naumkin–Shishmarev [30], who extended Seliger’s argument to (fKdV) in the case \(0 \le \alpha < \frac{2}{3}\). However, it appears that these arguments were not completely rigorous. In follow-up work, A. Constantin–Escher [11] made these arguments fully rigorous for a model problem very similar to (Whitham), requiring however boundedness of the kernel, which does not cover the case of (Whitham) itself. In Castro–Cordoba–Gancedo [6], the authors proved blow-up for (fKdV) in the full range \(0< \alpha < 1\): their result shows that the solution blows up in \({\mathcal {C}}^{1,\sigma }\); however, it does not imply gradient blow-up. In Klein–Saut [22], the authors performed numerical experiments on (fKdV) in the full range \(0< \alpha < 1\), which led them to conjecture that wave breaking happens in the full range. In Hur–Tao [18], the authors were then able to show wave breaking for (fKdV) in the case \(0< \alpha < \frac{1}{2}\) and for (Whitham). In later work, Hur extended the blow-up construction to the range \(0< \alpha < \frac{2}{3}\); see Hur [17]. 
More recently, Yang [39] extended the shock formation construction to the case \(\alpha = 0\), by means of a modulation-theoretic analysis in self-similar variables similar to [4] (discussed below), which gives a precise description of singularity formation. Finally, Saut–Wang [32] have also proved gradient blow-up for (fKdV) in the case \(0 \le \alpha < \frac{3}{5}\) as well as for (Whitham). Concerning model problems, let us mention the work of Klein–Saut–Wang [23], where the authors consider the modified fKdV equation (which features a cubic nonlinearity) in the range \(\alpha \in (0,2)\). In the weakly dispersive range (\(\alpha \in (0,1)\)), they show the existence of wave breaking solutions. Note also that, by the work of Saut–Wang [33], modified fKdV admits global solutions for small data when \(\alpha \) is in the full range (0, 2), \(\alpha \ne 1\). Let us finally briefly mention the case \(\alpha \ge 1\), where the situation seems to be delicate. Conjecturally, when \(\alpha \in (1, 3/2)\), the picture of “shock formation” is expected not to hold for the fKdV equation (see Klein–Saut [22]). In a recent work, Rimah was able to establish a precise version of this statement for a paralinearized version of the fKdV equation, thereby excluding wave breaking in the case \(\alpha \in (1,2)\) for a paralinearized model problem [31]. For modified fKdV with \(\alpha \in (1,2)\), Klein–Saut–Wang (again in [23]) conjecture that, in the focusing case, the \(L^\infty \) norm of the solution blows up (and no wave breaking occurs). Finally, it is expected that, for \(\alpha > 3/2\), no blow-up occurs for fKdV. In this direction, we cite the work of Linares–Pilod–Saut [24], in which the authors show local well-posedness for fKdV with initial data in \(H^s(\mathbb {R})\), where \(s > \frac{3}{2}-\frac{3\beta }{8}\) and \(0< \beta < \alpha - 1\). 
More recently, Molinet–Pilod–Vento [29] extended the previous result to \(s > \frac{3}{2} -\frac{5\beta }{4}\). Together with conservation of energy this implies global well-posedness when \(\alpha > 1+6/7\). Showing global well-posedness all the way to \(\alpha > 3/2\) remains an outstanding open problem.

  • Weak dissipation. Weakly dissipative models have also attracted significant attention from the fluid dynamics community. For (fBurgers) in the case \(0 \le \beta < 1\), Kiselev–Nazarov–Shterenberg [20] and independently Alibaud–Droniou–Vovelle [1] as well as Dong–Du–Li [14] were able to show gradient blow-up. Note that the approaches in [1, 14, 20] rely heavily on monotonicity properties of the fractional Laplacian (a form of the maximum principle). Unfortunately, in the dispersive case (in particular, in fKdV with \(\alpha \in (2/3,1)\)), the monotonicity properties are lost and this approach breaks down. Very recently, Chickering–Moreno-Vasquez–Pandya [7] used an approach similar to Buckmaster et al. [4] and Yang [39] to give a precise description of stable blow-up dynamics in the case \(0 \le \beta < \frac{2}{3}\). We note that the blow-up solutions to (fBurgers) constructed in this article sharply complement known regularity criteria for (fBurgers). More precisely, regularity results on linear advection-fractional dissipation equations [12, 35, 36] (see also [19] for time-integrated criteria) imply that if u is a solution to (fBurgers) such that \(u \in L^{\infty }_{t}([0, \tau _{+}); {\mathcal {C}}^{1-\beta })\), then \(\partial _{x} u\) is Hölder continuous up to time \(\tau _{+}\), and therefore the solution extends past \([0, \tau _{+})\). On the other hand, for each \(k \ge 1\), Theorem 1.1 demonstrates the existence of a blow-up solution u to (fBurgers) with \(u \in L^{\infty }_{t} ([0, \tau _{+}); {\mathcal {C}}^{\frac{1}{2k+1}})\) for any \(\beta < \frac{2k}{2k+1}\) (or more instructively, \(\frac{1}{2k+1} < 1-\beta \)).

  • Self-similar constructions in fluids Our blow-up construction is based on the method of modulation theory in self-similar variables using smooth self-similar solutions to the Burgers equation as blow-up profiles. This method was first applied in the context of the Burgers equation in a seminal work by Collot–Ghoul–Masmoudi [10], in which the authors construct gradient blow-up for a two-dimensional Burgers equation with transverse viscosity, which is a simplified model for Prandtl’s boundary layer equation. In particular, similarly to the present article, [10] employs weighted \(L^{2}\)-bounds and makes use of all excited states as blow-up profiles via a topological argument. The above method was applied more recently to compressible fluid dynamics with great success in a series of works [3,4,5] by Buckmaster–Shkoller–Vicol. In [3, 4], the authors use a self-similar method to show shock formation for polytropic compressible Euler in two and three space dimensions, giving a precise asymptotic description of shock formation at the point of first singularity, even in the presence of vorticity. They moreover extended their treatment to the non-isentropic case in [5], showing for the first time generation of vorticity at the shock. We also mention the work of Buckmaster–Iyer [2], in which the authors show formation of unstable shocks for two-dimensional polytropic compressible Euler by using (first) excited states as blow-up profiles, albeit via a different argument (Newton iteration) than what is used in this article (topological argument) to control the unstable directions. Concerning self-similar solutions in fluids, we finally mention the groundbreaking recent work of Elgindi on the blow-up of the 3D incompressible Euler equations in the \({\mathcal {C}}^{1,\alpha }\) regularity class [16].

  • Geometric blow-up constructions Finally, we also mention the geometric blow-up constructions pioneered by Christodoulou in [8], where shock formation for the compressible irrotational Euler equations is shown. The work of Christodoulou relies on powerful energy estimates, which allow one not only to construct the point of first singularity, but also to obtain the maximal development of the solution. These ideas later enabled Christodoulou to address the restricted shock development problem [9]. Moreover, Luk–Speck used geometric ideas to show stability of planar shocks under perturbations with nonzero vorticity [25]. In the context of the present work, it would be very interesting to extend this type of reasoning to include weakly dispersive and dissipative effects.

1.3 Strategy of the Proof

In this section, we outline the strategy of the proof. For the purposes of this section, let us restrict to the case of (fKdV). Our argument is based on the underlying analysis of stable (and unstable) blow-up for the Burgers equation. It is well known (see, for instance, [15]) that, for any given \(k \in \mathbb {N}\), \(k \ge 1\), the Burgers equation admits self-similar solutions exhibiting blow-up in \({\mathcal {C}}^{0, \frac{1}{2k+1}+}\), which are each associated to a self-similar blow-up profile and self-similar coordinates.

We start from equation (fKdV), written in the variables \((t, x, u)\), and we rewrite it in the appropriate self-similar variables arising from Burgers, which we call \((s, y, U)\). For the precise definition of these variables, see Sect. 2.2. We expect the unstable behavior to be encoded by the derivatives of U up to and including order 2k at \(y = 0\). In view of this observation, we are going to keep track of the values of \(\partial _y^j U(s,0)\), \(j = 0, \ldots , 2k\), throughout the evolution.

Three of the unstable modes can be controlled naturally by modulation parameters adapted to the symmetries of the equation: time translation, space translation and Galilean transformation. The modulation conditions will therefore be imposed on U(s, 0), \(\partial _y U(s,0)\) and \(\partial _y^{2k}U(s,0)\), and the modulation parameters are going to be called \((\tau , \xi , \kappa )\). In the case \(k=1\), there are only three unstable directions, which allows us to show stable blow-up.

In the case \(k > 1\), the remaining \(2k - 2\) unstable directions will have to be controlled by selecting the initial data appropriately. For the precise definition of the modulation parameters and to see how they arise from the symmetries of the equation, see Sect. 2.2.

With this setup, in the self-similar variables (defined precisely below in (5)), (fKdV) then becomes:

$$\begin{aligned} \begin{aligned}&\partial _{s} U + b y \partial _{y} U + \left( - e^{b s} \xi _{s} + (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa \right) \partial _{y} U - \left( b - 1 \right) U + e^{(b-1)s} \kappa _{s}\\&\qquad + (1+e^{s} \tau _{s}) U \partial _{y} U \\&\quad = (1+e^{s} \tau _{s})e^{-s(1- b\alpha )}\partial _y |D_y|^{\alpha -1} U. \end{aligned} \end{aligned}$$

Here, we have defined \(b = \frac{2k + 1}{2k}\).

For ease of exposition, we are now going to set all the modulation parameters to zero. We obtain the following equation:

$$\begin{aligned} \partial _s U - (b-1) U + (U+by)\partial _y U = e^{-s(1- b\alpha )}\partial _y |D_y|^{\alpha -1} U. \end{aligned}$$
(2)

The key observation is that, as long as \(1 - b \alpha > 0\), we are able to treat the term on the RHS in a perturbative way due to the exponentially decaying prefactor. Since \(b \rightarrow 1\) as \(k \rightarrow \infty \), we are able to treat values of \(\alpha \) arbitrarily close to 1 by choosing k appropriately.
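For completeness, the exponential prefactor on the RHS can be read off from scaling (a sketch in the notation above): each x-derivative contributes a factor \((-t)^{-b}\) under \(y = x/(-t)^{b}\), and \(\vert {D_{x}}\vert ^{\alpha -1} \partial _{x}\) is an operator of order \(\alpha \), so

$$\begin{aligned} \vert {D_{x}}\vert ^{\alpha -1} \partial _{x} u = (-t)^{b-1} (-t)^{-b \alpha } \, \partial _y |D_y|^{\alpha -1} U = (-t)^{b-2} \, e^{-s(1-b\alpha )} \, \partial _y |D_y|^{\alpha -1} U, \end{aligned}$$

where \((-t)^{b-2}\) is the common factor carried by every term on the LHS of (2), and \((-t)^{1-b\alpha } = e^{-s(1-b\alpha )}\) since \(s = -\log (-t)\). The prefactor decays exponentially precisely when \(\alpha < \frac{2k}{2k+1}\).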

We will set up a bootstrap argument and our goal will be to show that Eq. (2) admits a global (in self-similar time s) solution. The starting point is that \(\partial _y U\) can be treated almost independently by a Lagrangian analysis, which yields a uniform bound for \(\partial _y U\) in \(L^\infty \). Through the intermediate step of showing a uniform \(L^2\) bound for \(\partial _y^2 U\), we finally propagate appropriate weighted (in y) bounds for top-order derivatives. The weights are adapted to the Lagrangian flow of the equation, and their purpose is to show that the solution displays the correct asymptotic behavior at the time of blow-up. This is carried out in a weighted \(L^2\) framework, which has a twofold advantage. First, we show blow-up without the need to consider the difference with the exact self-similar blow-up profile, which is an amusing aspect by itself. Second, this part of the argument is entirely \(L^2\) based, which avoids derivative loss at the top order.

The final part of the argument is then devoted to addressing the “unstable” part, i.e. the ODE analysis for the modulation parameters and for the unstable derivatives of U at \(y = 0\). We introduce a “trapping condition” for the unstable coefficients (i.e., a decay condition on derivatives of U at \(y = 0\)) and show, by way of a shooting argument, that initial data can be selected such that the trapping condition holds for all times.

We are now going to describe the strategy in more detail, again focusing on the case of fKdV.

  1. 1.

    Control of \(\Vert \partial _y U \Vert _{L^\infty }\). We differentiate Eq. (2) by \(\partial _y\) and we obtain:

    $$\begin{aligned} \partial _s U' +U' +(U')^2 +(U+by)\partial _y U' =e^{-s(1- b\alpha )}\partial _y |D_y|^{\alpha -1} U'. \end{aligned}$$
    (3)

    Let us for a moment neglect the nonlocal RHS. We rewrite \(U'\) in Lagrangian coordinates (we let \({\tilde{U}}'\) be \(U'\) written in Lagrangian coordinates) and we obtain the following equation for \({\tilde{U}}'\):

    $$\begin{aligned} \partial _s {\tilde{U}}' = -{\tilde{U}}' ({\tilde{U}}' +1). \end{aligned}$$

    We immediately see that the inequality \(-1< {\tilde{U}}' < 1\) is preserved by the above Lagrangian ODE, and moreover this bound carries over to the original Eq. (3). This control is going to be the starting point of our analysis (see Lemma 5.2).

    Building upon this inequality, we then show that, depending on the region considered, \(U'(s,y)\) either satisfies a coarse polynomial bound in terms of |y|, or it decays exponentially in self-similar time (see Lemma 5.2, part 4). We will use this, later in the course of the argument, to show dissipativity of the equation in a region where y is large.
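The preserved bound can also be checked numerically. The following sketch (our own illustration; `rk4_step` and `evolve` are hypothetical helper names) integrates the Lagrangian ODE \(\partial _s {\tilde{U}}' = -{\tilde{U}}' ({\tilde{U}}' +1)\) and verifies that trajectories starting in \((-1, 1)\) stay there:

```python
def rk4_step(f, x, dt):
    """One classical Runge-Kutta step for the autonomous ODE x' = f(x)."""
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def evolve(u0: float, s_max: float = 10.0, dt: float = 0.01) -> float:
    """Integrate u' = -u(u+1); the interval (-1, 1) should be preserved."""
    rhs = lambda u: -u * (u + 1.0)
    u = u0
    for _ in range(int(s_max / dt)):
        u = rk4_step(rhs, u, dt)
        assert -1.0 < u < 1.0, "bound (-1, 1) violated"
    return u

# 0 is the stable equilibrium and -1 the unstable one, so every
# trajectory starting in (-1, 1) relaxes toward 0.
for u0 in (-0.99, -0.5, 0.5, 0.99):
    print(u0, evolve(u0))
```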

  2. 2.

    Control of \(\Vert \partial ^{2k+2}_y U \Vert _{L^2(\mathbb {R})}\) and \(\Vert \partial ^{2k+3}_y U \Vert _{L^2(\mathbb {R})}\). These terms are “top order” in terms of derivatives. In this part, we shall first accomplish the intermediate task of controlling \(\Vert \partial ^2_y U \Vert _{L^2(\mathbb {R})}\). We first show, by a Lagrangian argument, that \(U''\) satisfies a uniform \(L^\infty \) bound in the “close” region \(|y|\le \frac{1}{4}\). We emphasize that, in this case, it is extremely important that the bound, as well as its region of validity, be independent of the bootstrap parameters. The reason is that we then perform a weighted \(L^2\) estimate in the region \(R:= \{ y \in \mathbb {R}: \frac{1}{4}\le |y| \le y_2\}\), and we wish to control the expression

    $$\begin{aligned} \left\| {\exp \left( - \frac{\lambda }{2} y\right) U''}\right\| _{L^2(R)}, \end{aligned}$$

    where \(\lambda \) is a positive real parameter, and \(y_2\) is a positive number which we regard as large. We thus require the parameter \(\lambda \) to depend on the lower bound for |y| when \(y \in R\), and it is therefore crucial that this lower bound be independent of the bootstrap parameters. Finally, we need to show a bound in the “far away” region, where \(|y| \ge y_2\). We use the smallness of \(U'\) to show that the equation for \(U''\) has a dissipative character for \(|y| \ge y_2\). Combining the three regions, we obtain a uniform \(L^2\) bound for \(U''\). This is the content of Lemma 5.3.

    Turning now to the proof of the bounds for \(\Vert \partial ^{2k+2}_y U \Vert _{L^2(\mathbb {R})}\) and \(\Vert \partial ^{2k+3}_y U \Vert _{L^2(\mathbb {R})}\), we recall the familiar observation that, taking \(2k+2\) derivatives of Eq. (2), the linear term on the LHS becomes dissipative everywhere on \(\mathbb {R}\). Combining this fact with interpolation and the control of \(\partial ^2_y U\) in \(L^2\), which was obtained as an intermediate step, allows us to deduce a uniform bound for \(\Vert \partial ^{2k+2}_y U \Vert _{L^2(\mathbb {R})}\). Using this control, it is then straightforward to derive a bound for \(\Vert \partial ^{2k+3}_y U \Vert _{L^2(\mathbb {R})}\) (we require control up to this order due to a technical point: we will need to bound \(\partial _y^{2k+2}U\) at \(y = 0\) by Sobolev embedding, in order to control the evolution of \(\partial _y^{2k+1}U\) at \(y = 0\) in the “unstable” part of the argument). The high order bounds are obtained in Lemma 5.4.

  3. 3.

    Control of weighted \(L^2\) norms. Recall that the exact self-similar profile \({\bar{U}}\) for the Burgers equation satisfies, for large |y|,

    $$\begin{aligned} |y|^{\frac{1}{2k+1} - j}\lesssim |\partial _y^j {\bar{U}}| \lesssim |y|^{\frac{1}{2k+1} - j}, \end{aligned}$$
    (4)

    where \(j \ge 0\), \(j \in \mathbb {N}\).

    In this part of the argument, we wish to propagate an appropriate \(L^2\) version of the above polynomial decay bound, using a weighted \(L^2\) space, at the top order of derivatives (i.e., for \(j = 2k+3\)). This information is needed to show that the blow-up solution lies in the correct Hölder regularity class up to the blow-up time.

    The weights are constructed such that, in a region of bounded x (i.e., a region which corresponds to the image under the Lagrangian flow of a bounded y-interval centered at 0), one obtains the corresponding decay in y. Outside this region, the weight is “tapered”: it is independent of y, and grows exponentially in self-similar time at the correct rate.

    More precisely, given \(n \in \mathbb {N}\) and \(L > 0\), we define the semi-norm

    $$\begin{aligned} \Vert {V}\Vert _{\dot{{\mathcal {H}}}^{n}_{< L}} = \sup _{j \in {\mathbb {Z}}, \, 2^{j}< L} \left( \int _{2^{j-1}< \vert {y}\vert < 2^{j}} (\vert {y}\vert ^{n-\frac{1}{2k+1}} \partial _{y}^{n} V)^{2} \frac{\textrm{d}y}{\vert {y}\vert }\right) ^{\frac{1}{2}} + L^{n-\frac{1}{2k+1}-\frac{1}{2}} \left( \int _{\vert {y}\vert >\frac{L}{2}} (\partial _{y}^{n} V)^{2} \, \textrm{d}y\right) ^{\frac{1}{2}}. \end{aligned}$$

    Note that it consists of two terms: each expression in the first summand scales according to (4), and the second summand is obtained by choosing the weight to be the matching constant outside the region \(|y| \le \frac{L}{2}\). In practice, since the Lagrangian flow away from \(y = 0\) is well approximated by \(y = C e^{bs}\), we are going to set \(L = e^{bs}\).

    Our goal will then be to show that \(\Vert {U}\Vert _{\dot{{\mathcal {H}}}^{2k+3}_{< e^{bs}}}\) is uniformly bounded in s. As a first step, we show a uniform bound on \(\Vert {U}\Vert _{\dot{{\mathcal {H}}}^{1}_{< e^{bs}}}\). To obtain it, we multiply the equation for \(U'\) by a weight approximately adapted to the Lagrangian flow in a region of large y. The growth rate of the weight is also chosen appropriately.

    Using this information, the bound for \(\Vert {U}\Vert _{\dot{{\mathcal {H}}}^{2k+3}_{< e^{bs}}}\) is obtained in a similar fashion. In this case, however, one needs to be careful about a potential loss of derivative, as the nonlocal term does not commute with the weight. To deal with this issue, we show a commutator estimate (see Lemma 4.3). The weighted bounds are proved in Lemma 5.5.

Remark 1.4

Note that, if we set \(L = \infty \), the above semi-norm is scale invariant.

  4.

    Topological argument. Finally, in Sect. 6, we employ a topological procedure relying on the instability of the ODE system satisfied by the Taylor coefficients of U at \(y = 0\) to close the argument. This procedure will moreover select appropriately the initial data in the unstable case. This type of construction is well known in the dispersive community: see, for instance, the paper by Côte, Martel, and Merle [13].

    Recall the trapping condition, i.e. a decay condition for the “unstable” Taylor coefficients of U at \(y = 0\). We want to show that, upon appropriately choosing initial data, it can be arranged that the solution remains trapped globally in time.

    First, in Lemma 6.1 we show that, under the bootstrap assumptions and assuming the trapping condition, the evolution of the modulation parameters is controlled.

    Finally, in Lemma 6.3, we show that the ODE system satisfied by the first \(2k+1\) Taylor coefficients of U at \(y=0\) displays an unstable character, and we use this fact, combined with a Brouwer-type argument, to show that we can select initial data such that the corresponding solution is trapped for all time. This concludes the argument.
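The mechanism behind this Brouwer-type selection can be illustrated on a toy scalar model (ours, not from the paper): for the unstable ODE \(\partial_s w = w + f(s)\) with \(f(s) = e^{-s}\), generic data escape to \(\pm \infty\), and bisection on the sign of the escape selects the unique trapped datum \(w(0) = -\tfrac{1}{2}\), corresponding to the bounded solution \(w(s) = -\tfrac{1}{2} e^{-s}\):

```python
import math

def evolve(w0, ds=1e-3, s_max=15.0):
    """Integrate w' = w + e^{-s} by forward Euler; the solution escapes
    to +/- infinity unless w0 is the trapped value."""
    w, s = w0, 0.0
    while s < s_max:
        w += ds * (w + math.exp(-s))
        s += ds
        if abs(w) > 1e6:          # already escaping; the sign is decided
            break
    return w

# bisection on the escape sign selects the trapped initial value
lo, hi = -2.0, 0.0                # evolve(lo) -> -inf, evolve(hi) -> +inf
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if evolve(mid) > 0:
        hi = mid
    else:
        lo = mid
w0_trapped = 0.5 * (lo + hi)

# the exact trapped datum of the continuous problem is w(0) = -1/2
assert abs(w0_trapped + 0.5) < 1e-2
```

In the paper, the role of the sign of the escape is played by the exit of \(\vec{w}(s)\) through the boundary of the trapping region, and bisection is replaced by a Brouwer fixed-point argument in \(2k-2\) dimensions.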

Remark 1.5

Note that parts (1) and (2) of the above outline rely on showing \(L^\infty \) estimates, which are proved here by means of Lagrangian analysis. This Lagrangian approach seems to be the most efficient way (in terms of degree of technicality) to analyze directly the unknown U (which is what we do in this paper), rather than the difference between U and the exact self-similar profile. However, we believe that, if instead one were to analyze the difference between U and the corresponding self-similar profile, one would be able to carry out the argument without the need for Lagrangian analysis. This approach would make the argument completely \(L^2\) based.

1.4 Organization of the Paper

In Sect. 2, we introduce the relevant equations, the self-similar coordinates, the modulation parameters, and the unstable ODE system for Taylor coefficients at \(y=0\). In Sect. 3, we give a precise statement of the main theorem (Theorem 3.1), and reduce its proof to establishing two key lemmas, Lemma 3.6 (main bootstrap lemma) and Lemma 3.9 (shooting lemma for unstable coefficients when \(k > 1\)). After collecting some useful lemmas for the Fourier multipliers arising in our problem in Sect. 4, the following two sections are devoted to the proof of the two key lemmas. In Sect. 5, we close the bootstrap assumptions on the solution U in appropriate self-similar variables. In Sect. 6, we estimate the ODEs for the modulation parameters and stable coefficients, thereby completing the proof of Lemma 3.6. Moreover, in case \(k > 1\), we analyze the ODEs for the unstable coefficients and establish Lemma 3.9.

2 Preliminaries

2.1 Notation and Conventions

As is usual, we use \(C > 0\) to denote a positive constant that may change from line to line. Dependencies of C are expressed by subscripts. Moreover, we use the standard notation \(A \lesssim B\) for \(\vert {A}\vert \le C B\), and \(A \simeq B\) for \(A \lesssim B\) and \(B \lesssim A\), and dependencies of the implicit constant C are expressed by subscripts.

Given a symbol \(\Gamma (\xi )\), we denote by \(\Gamma (D_{x})\) its quantization in x, i.e., \(\Gamma (D_{x}) V = {\mathcal {F}}_{x}^{-1}[\Gamma (\xi ){\mathcal {F}}_{x}[V](\xi )]\), where \({\mathcal {F}}_{x}\) denotes the Fourier transform in the variable x. For each \(k \in {\mathbb {Z}}\), we define the Littlewood–Paley projection \(P_{\le k}\) to be the Fourier multiplier operator with symbol \(P_{\le k}(\xi )\) defined by \(P_{\le k}(\xi ) = P_{\le 0}(2^{-k} \xi )\), where \(P_{\le 0}\) is a nonnegative smooth function that is supported in \([-2, 2]\) and equals 1 on \([-1, 1]\). We also introduce the symbols \(P_{k}(\xi ) = P_{\le k}(\xi ) - P_{\le k-1}(\xi )\) and \(P_{> k}(\xi ) = 1 - P_{\le k}(\xi )\), as well as the corresponding Fourier multipliers (which are also called Littlewood–Paley projections).
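As a concrete sanity check (a sketch with an illustrative choice of bump; any \(P_{\le 0}\) with the stated properties works), the telescoping identities defining \(P_{k}\) and \(P_{> k}\) can be verified pointwise:

```python
import math

def P_le0(xi):
    """A nonnegative smooth bump: 1 on [-1, 1], 0 outside [-2, 2].
    (Illustrative choice built from the standard C^infty cutoff.)"""
    t = abs(xi)
    if t <= 1.0:
        return 1.0
    if t >= 2.0:
        return 0.0
    def g(x):  # positive on (0, inf), vanishes to infinite order at 0
        return math.exp(-1.0 / x) if x > 0 else 0.0
    return g(2.0 - t) / (g(2.0 - t) + g(t - 1.0))

def P_le(k, xi):
    return P_le0(2.0 ** (-k) * xi)          # P_{<= k}(xi) = P_{<= 0}(2^{-k} xi)

def P_band(k, xi):
    return P_le(k, xi) - P_le(k - 1, xi)    # P_k = P_{<= k} - P_{<= k-1}

def P_gt(k, xi):
    return 1.0 - P_le(k, xi)                # P_{> k} = 1 - P_{<= k}

# the bands resum, by telescoping, to a difference of two cutoffs
for xi in (0.0, 0.3, 1.5, -7.2, 100.0):
    s = sum(P_band(k, xi) for k in range(-10, 11))
    assert abs(s - (P_le(10, xi) - P_le(-11, xi))) < 1e-12
    assert abs(P_gt(3, xi) + P_le(3, xi) - 1.0) < 1e-12
```

Note that \(P_{k}\) is supported in the annulus \(2^{k-1} \le \vert \xi \vert \le 2^{k+1}\), as follows from the supports of \(P_{\le k}\) and \(P_{\le k-1}\).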

2.2 Derivation of the Equations in Self-Similar Variables

Given parameters \(\tau , \xi , \kappa \in {\mathbb {R}}\) and \(\lambda > 0\) (called modulation parameters), consider the change of variables \((t, x, u) \mapsto (s, y, U)\),

$$\begin{aligned} t = \tau - e^{-s}, \quad x = \lambda y + \xi , \quad u(t, x) = \frac{\lambda }{ \tau - t} U + \kappa . \end{aligned}$$
(5)

Note that varying the modulation parameters \(\tau \), \(\xi \), \(\kappa \) and \(\lambda \) corresponds to applying a time translation, a space translation, a Galilean boost (\(u \mapsto u(t, x+\kappa t) - \kappa \)) and a spatial scaling to the solution, all of which are exact symmetries of the (inviscid) Burgers equation. Hence, \((s, y, U)\) are nothing but the rescaled variables for the Burgers equation centered at \((t, x, u) = (\tau , \xi , \kappa )\) at spatial scale \(\lambda \).

We let the modulation parameters depend dynamically on t, i.e., \(\tau = \tau (t)\), \(\xi = \xi (t)\), \(\kappa = \kappa (t)\) and \(\lambda = \lambda (t)\), and consider the same change of variables (5). Note that

$$\begin{aligned} \begin{aligned} \partial _{t} s&= e^{s} (1 + e^{s} \tau _{s})^{-1},\quad \partial _{t} y = - e^{s} (1 + e^{s} \tau _{s})^{-1} \left( \frac{\lambda _{s}}{\lambda } y + \frac{\xi _{s}}{\lambda }\right) , \\ \partial _{x} s&= 0, \quad \partial _{x} y = \frac{1}{\lambda }, \end{aligned} \end{aligned}$$

so that

$$\begin{aligned} \partial _{t} u&= \lambda e^{2 s} (1+e^{s} \tau _{s})^{-1} \left( \partial _{s} U - \frac{\lambda _{s}}{\lambda } y \partial _{y} U - \frac{\xi _{s}}{\lambda } \partial _{y} U \right. \\&\quad \left. + \left( \frac{\lambda _{s}}{\lambda } + 1 \right) U + \frac{e^{-s}}{\lambda } \kappa _{s} \right) , \\ u \partial _{x} u&= \lambda e^{2s} \left( \left( U + \frac{e^{-s}}{\lambda } \kappa \right) \partial _{y} U\right) , \\ \Gamma (D_{x}) \partial _{x} u&= \lambda e^{s} \Gamma (\lambda ^{-1} D_{y}) \lambda ^{-1} \partial _{y} \left( U + \lambda ^{-1} e^{-s} \kappa \right) , \\ \Upsilon (D_{x}) u&= \lambda e^{s} \Upsilon (\lambda ^{-1} D_{y}) \left( U + \lambda ^{-1} e^{-s} \kappa \right) . \end{aligned}$$

Thus, (1) becomes

$$\begin{aligned} \begin{aligned}&\partial _{s} U - \frac{\lambda _{s}}{\lambda } y \partial _{y} U + \left( - \frac{\xi _{s}}{\lambda } + (1+e^{s} \tau _{s}) \frac{e^{-s}}{\lambda } \kappa \right) \partial _{y} U + \left( \frac{\lambda _{s}}{\lambda } + 1 \right) U + \frac{e^{-s}}{\lambda } \kappa _{s} \\&\quad + (1+e^{s} \tau _{s}) \left( U \partial _{y} U + e^{-s} \Gamma (\lambda ^{-1} D_{y}) \lambda ^{-1} \partial _{y} \left( U + \lambda ^{-1} e^{-s} \kappa \right) \right. \\&\quad \left. + e^{-s} \Upsilon (\lambda ^{-1} D_{y}) \left( U + \lambda ^{-1} e^{-s} \kappa \right) \right) = 0. \end{aligned} \end{aligned}$$

In what follows, we shall assume the following self-similarity ansatz for \(\lambda \): Given \(k \in \mathbb {N}\), set \(b: = \frac{2k + 1}{2k}\) and

$$\begin{aligned} \lambda = (\tau - t)^{b} = e^{- b s}. \end{aligned}$$
(6)

Then \(\frac{\lambda _{s}}{\lambda } = - b\), and we arrive at

$$\begin{aligned}{} & {} \partial _{s} U + b y \partial _{y} U + \left( - e^{b s} \xi _{s} + (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa \right) \partial _{y} U - \left( b - 1 \right) U + e^{(b-1)s} \kappa _{s} \nonumber \\{} & {} \quad + (1+e^{s} \tau _{s}) \left( U \partial _{y} U + e^{-s} \Gamma (e^{b s} D_{y}) e^{b s} \partial _{y} \left( U + e^{(b-1)s} \kappa \right) \right. \nonumber \\{} & {} \quad \left. + e^{-s} \Upsilon (e^{b s} D_{y}) \left( U + e^{(b-1)s} \kappa \right) \right) = 0. \end{aligned}$$
(7)

If \(\Gamma \) and \(\Upsilon \) were zero, and \(\tau \), \(\xi \), \(\kappa \) were constant, then (7) would be precisely the self-similar Burgers equation with scales (6). As is well known, the values \(b = \frac{2k+1}{2k}\) with \(k = 1, 2, \ldots \) are distinguished by the property that they admit smooth steady profiles of the self-similar Burgers equation [15, Sect. 11.2]; see also Sect. 2.3 below.

Our intention is to view the linear terms \(e^{-s}\Gamma (e^{bs} D_{y}) e^{bs} \partial _{y} U\) and \(e^{-s}\Upsilon (e^{bs} D_{y}) U\) as perturbations. To motivate the way we will decompose these terms, consider the model cases \(\Gamma (\xi ) = c_{\Gamma } \vert {\xi }\vert ^{\alpha -1}\) and \(\Upsilon (\xi ) = c_{\Upsilon } \vert {\xi }\vert ^{\beta }\) (\(c_{\Gamma }, c_{\Upsilon } \in {\mathbb {R}}\)). Then \(\Gamma (e^{bs} D_{y}) e^{b s} \partial _{y} = c_{\Gamma } e^{b \alpha s} \vert {D_{y}}\vert ^{\alpha -1} \partial _{y}\) and \(\Upsilon (e^{bs} D_{y}) = c_{\Upsilon } e^{b \beta s} \vert {D_{y}}\vert ^{\beta }\), so that

$$\begin{aligned} e^{-s}\Gamma (e^{bs} D_{y}) e^{bs} \partial _{y} U + e^{-s}\Upsilon (e^{bs} D_{y}) U= & {} c_{\Gamma } e^{-(1- b \alpha ) s} \vert {D_{y}}\vert ^{\alpha -1} \partial _{y} U \\{} & {} +\,c_{\Upsilon } e^{-(1- b \beta ) s} \vert {D_{y}}\vert ^{\beta }U. \end{aligned}$$

In the regime in which we perform our construction, \(\vert {D_{y}}\vert ^{\alpha -1} \partial _{y} U\) and \(\vert {D_{y}}\vert ^{\beta }U\) will morally remain bounded in time (see Footnote 5). Therefore, we may regard these terms as perturbative when \(b \alpha < 1\) and \(b \beta < 1\), in which case the factors \(e^{-(1- b \alpha ) s} \) and \(e^{-(1- b \beta ) s} \) decay exponentially.
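The threshold here is elementary arithmetic: \(b \alpha < 1\) if and only if \(\alpha < \frac{2k}{2k+1}\), and likewise for \(\beta \). A quick sketch in exact rational arithmetic (the helper name is ours, for illustration):

```python
from fractions import Fraction

def decay_rates(k, alpha, beta):
    """Exponential decay rates (1 - b*alpha, 1 - b*beta) of the prefactors
    in front of |D_y|^(alpha-1) d_y U and |D_y|^beta U, with b = (2k+1)/(2k)."""
    b = Fraction(2 * k + 1, 2 * k)
    return 1 - b * alpha, 1 - b * beta

# b*alpha < 1 iff alpha < 2k/(2k+1): the borderline case gives rate exactly 0
for k in range(1, 5):
    threshold = Fraction(2 * k, 2 * k + 1)
    r_alpha, r_beta = decay_rates(k, threshold, Fraction(0))
    assert r_alpha == 0 and r_beta == 1
    # strictly below the threshold, both decay rates are positive
    r_alpha, r_beta = decay_rates(k, threshold - Fraction(1, 100), Fraction(1, 2))
    assert min(r_alpha, r_beta, 1) > 0
```

The last quantity, \(\min \{1 - b\alpha , 1 - b\beta , 1\}\), is exactly the rate \(\mu \) introduced just below.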

In view of the above discussion, in what follows, we are going to denote

$$\begin{aligned} \mu = \min \{1 - b \alpha , 1-b\beta , 1\}. \end{aligned}$$

Note that, under our assumptions, \(\mu > 0\). To simplify our notation, we now rewrite the dispersive terms as follows:

$$\begin{aligned} e^{-s} \Gamma (e^{b s} D_{y}) e^{b s} \partial _{y} U + e^{-s} \Upsilon (e^{b s} D_{y}) (U + e^{(b-1)s}\kappa )= & {} - e^{-\mu s} {\mathcal {H}}\left( U + e^{(b-1)s} \kappa \right) \\{} & {} -\, e^{-s} {\mathcal {L}}\left( U + e^{(b-1)s} \kappa \right) , \end{aligned}$$

where

$$\begin{aligned} {\mathcal {H}}(V)&= - P_{> 0}(e^{b s} D_{y}) \left( e^{-\max \{\alpha , \beta , 0\} b s} \Gamma (e^{bs} D_{y}) e^{bs} \partial _{y} V\right. \nonumber \\&\quad \left. + e^{- \max \{\alpha , \beta , 0\} b s} \Upsilon (e^{bs} D_{y}) V \right) , \end{aligned}$$
(8)
$$\begin{aligned} {\mathcal {L}}(V)&= - P_{\le 0}(e^{b s} D_{y}) \left( \Gamma (e^{bs} D_{y}) e^{bs} \partial _{y} V + \Upsilon (e^{bs} D_{y}) V \right) . \end{aligned}$$
(9)

Note that \({\mathcal {H}}\left( U + e^{(b-1)s} \kappa \right) = {\mathcal {H}}(U)\) thanks to the projection \(P_{> 0}(e^{bs} D_{y})\), which annihilates constants. Putting everything together, we finally have

$$\begin{aligned} \boxed { \begin{aligned}&\partial _{s} U + b y \partial _{y} U + \left( - e^{b s} \xi _{s} + (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa \right) \partial _{y} U - \left( b - 1 \right) U + e^{(b-1)s} \kappa _{s}\\&\quad + (1+e^{s} \tau _{s}) U \partial _{y} U \\&\qquad = (1+e^{s} \tau _{s}) \left( e^{-\mu s} {\mathcal {H}}(U) + e^{-s} {\mathcal {L}}\left( U + e^{(b-1)s} \kappa \right) \right) . \end{aligned} } \end{aligned}$$

2.3 Definition of the Profile

We now solve the steady profile equation for the Burgers problem (i.e., \(\Gamma \), \(\Upsilon \) are zero and \(\tau \), \(\xi \) and \(\kappa \) are fixed):

$$\begin{aligned} (1-b) \mathring{U}+(b y + \mathring{U})\partial _y \mathring{U}= 0. \end{aligned}$$

We define \(\mathring{U}\) to be a solution to the above equation (an exact self-similar profile) such that \(\mathring{U}(0) = 0\), \(\partial _{y} \mathring{U}(0) = -1\), \(\partial _{y}^{2k+1} \mathring{U}(0) = (2k)!\) and \(\partial _y^j \mathring{U} (0) = 0\) for \(j = 2, \ldots , 2k\). We can ensure the last condition by simply noticing that the self-similar profile equation is equivalent to

$$\begin{aligned} y = - \mathring{U} - h_1\mathring{U}^{2k + 1}, \end{aligned}$$

where \(h_1 > 0\) is a free parameter. From this implicit definition, we see that the first three non-vanishing Taylor coefficients at \(y=0\) are \(\partial _{y} \mathring{U}(0)\), \(\partial _{y}^{2k+1} \mathring{U}(0)\) and \(\partial _{y}^{4k+1} \mathring{U}(0)\). We fix \(h_1\) so that \(\mathring{U}\) satisfies \(\partial _{y}^{2k+1} \mathring{U}(0) = (2k)!\), and a calculation shows that \(h_1 = 1/(2k+1)\); in what follows, we will suppress the dependence of constants on \(h_{1}\).

By construction, \(\mathring{U}\) has about \(y = 0\) the Taylor expansion

$$\begin{aligned} \mathring{U}(y) = - y + \frac{1}{2k+1} y^{2k+1} + O(y^{4k+1}) \end{aligned}$$
(10)

and, about \(y = \pm \infty \), the expansion

$$\begin{aligned} \mathring{U}(y) = {\mp } h_{1}^{-\frac{1}{2k+1}} \vert {y}\vert ^{\frac{1}{2k+1}} \left( 1 + O(\vert {y}\vert ^{-\frac{2k}{2k+1}}) \right) . \end{aligned}$$
(11)
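Since \(\mathring{U}\) is defined only implicitly, both expansions, as well as the profile equation itself, can be verified numerically. The following sketch (ours; the sampled values of k and y are arbitrary) solves \(y = -\mathring{U} - h_{1} \mathring{U}^{2k+1}\) with \(h_{1} = \frac{1}{2k+1}\) by bisection and uses the implicit derivative \(\partial _{y} \mathring{U} = -1/(1 + \mathring{U}^{2k})\):

```python
def profile(y, k):
    """Solve y = -U - U^(2k+1)/(2k+1) for U by bisection
    (the right-hand side is a strictly decreasing function of U)."""
    lo, hi = -abs(y) - 2.0, abs(y) + 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if -mid - mid ** (2 * k + 1) / (2 * k + 1) > y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def dprofile(U, k):
    """U'(y) = -1/(1 + U^(2k)), by implicit differentiation of
    y = -U - U^(2k+1)/(2k+1)  (here h_1 = 1/(2k+1))."""
    return -1.0 / (1.0 + U ** (2 * k))

def b(k):
    return (2 * k + 1) / (2 * k)

# the steady self-similar equation (1 - b) U + (b y + U) U' = 0 holds
for k in (1, 2):
    for y in (-5.0, -0.7, 0.3, 2.0, 40.0):
        U = profile(y, k)
        assert abs((1 - b(k)) * U + (b(k) * y + U) * dprofile(U, k)) < 1e-9

# leading behavior near y = 0 (Taylor expansion) and as y -> infinity
assert abs(profile(1e-3, 1) - (-1e-3 + (1e-3) ** 3 / 3)) < 1e-12
assert abs(profile(1e6, 1) + (3e6) ** (1 / 3)) < 0.1
```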

We now define our choice of the profile. Consider a function \(\bar{\chi }: \mathbb {R}\rightarrow \mathbb {R}\) which is nonnegative, equal to 1 on the interval \([-1,1]\), equal to 0 outside of the interval \([-8,8]\), and such that \({\bar{\chi }}' \ge - \frac{1}{4}\). We then define the cut-off function \(\chi \) to be transported by the linearized flow generated by \(\mathring{U}\):

$$\begin{aligned} \partial _s \chi + (by + \mathring{U}) \partial _y \chi = 0, \quad \chi (0, y) = {\bar{\chi }}(y). \end{aligned}$$

Some basic properties of the cut-off function \(\chi \) are as follows:

Lemma 2.1

(Support property of \(\chi (s,y)\)) We have \({{\,\textrm{supp}\,}}\chi \subseteq [-C e^{b s}, C e^{b s}]\).

Proof of Lemma 2.1

Define the Lagrangian trajectories \(Y_{\pm }(s)\) as

$$\begin{aligned} \partial _s Y_{\pm }(s) = b Y_{\pm }(s) + \mathring{U}(Y_{\pm }(s)), \quad Y_\pm (0) = \pm 10, \end{aligned}$$
(12)

so that \(\chi (s, y) = 0\) for all y outside \([Y_{-}(s), Y_{+}(s)]\). The conclusion of the lemma will follow if we can show that \(|Y_{\pm }(s)| \le C e^{bs}\). By the form of \(\mathring{U}\), we have that \(|\mathring{U}(Y_{\pm }(s))| \le 1 + |Y_{\pm }(s)|\), and the lemma follows by integrating (12) in s. \(\square \)
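The exponential growth of the trajectories can also be observed numerically (a sketch for \(k = 1\); the step size and horizon are arbitrary choices of ours):

```python
import math

def U_ring(y):
    """k = 1 self-similar Burgers profile: solve y = -U - U^3/3 by bisection
    (the map U -> -U - U^3/3 is strictly decreasing)."""
    lo, hi = -abs(y) - 2.0, abs(y) + 2.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if -mid - mid ** 3 / 3.0 > y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

b = 1.5                                    # b = (2k+1)/(2k) with k = 1

def trajectory(s_max, ds=1e-3):
    """Forward-Euler integration of Y' = b Y + U_ring(Y), Y(0) = 10."""
    Y, s = 10.0, 0.0
    while s < s_max:
        Y += ds * (b * Y + U_ring(Y))
        s += ds
    return Y

# since U_ring(Y) < 0 for Y > 0, the trajectory grows strictly more slowly
# than the free flow 10 e^{bs}, but still at an exponential rate
Y5 = trajectory(5.0)
assert 10.0 < Y5 < 10.0 * math.exp(b * 5.0)
assert trajectory(3.0) > math.exp(b * 3.0)
```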

By Lemma 2.1 and (11), we have

$$\begin{aligned} \sup _{y \in {{\,\textrm{supp}\,}}\chi (s, \cdot )} \vert {\mathring{U}(y)}\vert \lesssim e^{\frac{1}{2k+1} b s} = e^{(b-1) s}. \end{aligned}$$
(13)

We finally define the profile

$$\begin{aligned} {\bar{U}}(s,y) = \chi (s,y) \mathring{U} (y). \end{aligned}$$

This is no longer a time-independent profile. Moreover, the modified profile satisfies the equation

$$\begin{aligned} \partial _s {\bar{U}} - (b-1) {\bar{U}} + (b y + {\bar{U}}) \partial _y {\bar{U}} = - \mathring{U} (1 - \chi ) \partial _{y} \bar{U}. \end{aligned}$$

2.4 Equation for Iterated Derivatives of U

We let \(U^{(j)} = \partial _y^{(j)} U\), with \(j \ge 1\). We derive the equation satisfied by \(U^{(j)}\). For \(j = 1\),

$$\begin{aligned} \boxed { \begin{aligned}&\partial _{s} U' +U'+\left( (1 + e^{s} \tau _{s}) U + b y - e^{b s} \xi _{s} + (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa \right) \partial _{y} U'\\&\qquad + (1 + e^{s} \tau _{s}) (U')^{2} \\&\quad = (1+e^{s} \tau _{s}) \left( e^{-\mu s} {\mathcal {H}}(U') + e^{-s} \partial _{y} {\mathcal {L}}(U + e^{(b-1)s} \kappa ) \right) , \end{aligned} } \end{aligned}$$
(14)

and for \(j \ge 2\),

$$\begin{aligned} \boxed { \begin{aligned}&\partial _{s} U^{(j)} + \left( (1 + e^{s} \tau _{s}) U + b y - e^{b s} \xi _{s} + (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa \right) U^{(j+1)} \\&\qquad + \left( 1+ (j-1) b + (j+1)(1 + e^{s} \tau _{s}) U' \right) U^{(j)} + (1 + e^{s} \tau _{s}) M^{(j)} \\&\quad = (1+e^{s} \tau _{s}) \left( e^{-\mu s} {\mathcal {H}}(U^{(j)}) + e^{-s} \partial _{y}^{j} {\mathcal {L}}(U + e^{(b-1)s} \kappa ) \right) . \end{aligned} } \end{aligned}$$
(15)

Here, \(M^{(j)} = \partial _y^{(j)} (U U') - U U^{(j+1)} - ( j+1) U' U^{(j)}\) for \(j \ge 2\).

2.5 Perturbation Equation and Commutation

We now define the perturbation \(W = U - {\bar{U}}\), and we obtain the following equation for W:

$$\begin{aligned} \boxed { \begin{aligned}&\partial _{s} W + (b y + \overline{U} + W) \partial _{y} W - \left( b - 1 - \partial _{y} \bar{U} \right) W \\&\qquad +\, \left( - e^{b s} \xi _{s} + (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa \right) \partial _{y} (\bar{U} + W) + e^{(b-1)s} \kappa _{s} \\&\qquad +\, \frac{1}{2} e^{s} \tau _{s} \partial _{y} (\bar{U} + W)^{2} \\&\quad = E_{\chi } + (1+e^{s} \tau _{s}) \left( e^{-\mu s} {\mathcal {H}}(U) + e^{-s} {\mathcal {L}}\left( U + e^{(b-1)s} \kappa \right) \right) . \end{aligned}} \end{aligned}$$
(16)

Here, \(E_{\chi } = \mathring{U} (1 - \chi ) \partial _{y} \bar{U}\).

Remark 2.2

Note that the error term \(E_{\chi }\) arising from the cutoff is identically zero near \(y =0\).

Suppose now that \(j \ge 1\). We now commute the above Eq. (16) with \(\partial _y^j\), and obtain:

$$\begin{aligned} \boxed { \begin{aligned}&\partial _{s} W^{(j)} + \left( b y + \bar{U} + W \right) W^{(j+1)} + \left( 1+ (j-1) b + (j+1) \bar{U}' \right) W^{(j)}\\&\qquad + N(j) + L(j) \\&\qquad + \left( - e^{b s} \xi _{s} + (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa \right) (\bar{U}^{(j+1)} + W^{(j+1)}) \\&\qquad + \frac{1}{2} e^{s} \tau _{s} \partial _{y}^{(j+1)}(\bar{U} + W)^{2} \\&\quad = E_{\chi }^{(j)} + (1+e^{s} \tau _{s}) \left( e^{-\mu s} {\mathcal {H}}(U^{(j)}) + e^{-s} \partial _{y}^{j} {\mathcal {L}}(U + e^{(b-1)s} \kappa ) \right) . \end{aligned} } \end{aligned}$$

Here,

$$\begin{aligned} N{(j)}&= \partial _y^{(j)} (W W') - W W^{(j+1)}, \\ L{(j)}&= ({\bar{U}} W)^{(j+1)} - {\bar{U}} W^{(j+1)} - (j+1) {\bar{U}}' W^{(j)}. \end{aligned}$$

2.6 Derivation of the Modulation Equations and Unstable ODE System at \(y = 0\)

We now derive the equations satisfied by the derivatives of W at the origin. For each \(j \ge 0\), we let

$$\begin{aligned} w_{j} := W^{(j)}(s, 0), \quad F^{(j)}(s,0) := e^{-\mu s} {\mathcal {H}}\left( U^{(j)} \right) + e^{-s} \partial _{y}^{j} {\mathcal {L}}\left( U +e^{(b-1)s} \kappa \right) . \end{aligned}$$

For \(j \ge 0\), we have

$$\begin{aligned}{} & {} \partial _{s} w_{j} + \left( (j-1) (b-1) - 1 \right) w_{j} + w_{0} w_{j+1} + \left. N(j) \right| _{y = 0} + \left. L(j) \right| _{y = 0} \nonumber \\{} & {} \qquad + \left( - e^{b s} \xi _{s} + (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa \right) (\bar{U}^{(j+1)}(0) + w_{j+1}) + \delta _{0j} e^{(b-1)s} \kappa _{s}\nonumber \\{} & {} \qquad + \frac{1}{2} e^{s} \tau _{s} \left. \partial _{y}^{(j+1)}(\bar{U} + W)^{2} \right| _{y=0} \nonumber \\{} & {} \quad =(1+e^{s} \tau _{s}) F^{(j)}(s, 0), \end{aligned}$$
(17)

where \(\delta _{0j}\) is the Kronecker delta symbol, which equals 1 when \(j = 0\) and vanishes otherwise. We also used the following properties of the profile at \(y = 0\): \({\bar{U}}(s,0) = \partial ^{j}_y {\bar{U}}(s,0) = 0\) for \(2\le j \le 2k\), \(\partial _y {\bar{U}}(s,0) = -1\) and \(\partial _y^{2k+1} {\bar{U}}(s,0) = (2k)!\).

We first consider the cases \(j = 0, 1\) or 2k:

$$\begin{aligned}&\partial _{s} w_{0} -b w_{0} + w_{0} w_{1} \nonumber \\&\qquad +\, \left( - e^{b s} \xi _{s} + (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa \right) (-1 + w_{1}) + e^{(b-1)s} \kappa _{s} + e^{s} \tau _{s} w_{0} (-1 + w_{1}) \nonumber \\&\quad =(1+e^{s} \tau _{s}) F^{(0)}(s, 0), \end{aligned}$$
(18)
$$\begin{aligned}&\partial _{s} w_{1} - w_1 + w_{0} w_{2} + w_{1}^{2} \nonumber \\&\qquad +\, \left( - e^{b s} \xi _{s} + (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa \right) w_{2} + e^{s} \tau _{s} \left( w_{2} w_{0} + (-1 + w_{1})^{2} \right) \nonumber \\&\quad =(1+e^{s} \tau _{s}) F^{(1)}(s, 0), \end{aligned}$$
(19)
$$\begin{aligned}&\partial _{s} w_{2k} + \left( (2k-1) (b-1) - 1 \right) w_{2k} + w_{0} w_{2k+1} + \left. N(2k) \right| _{y=0} + (2k)! w_{1} \nonumber \\&\qquad +\, \left( - e^{b s} \xi _{s} + (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa \right) ((2k)! + w_{2k+1}) \nonumber \\&\qquad + e^{s} \tau _{s} \left( (2k+1) (-1 + w_{1}) w_{2k} + ((2k)! + w_{2k+1}) w_{0}\right) \nonumber \\&\quad =(1+e^{s} \tau _{s}) F^{(2k)}(s, 0). \end{aligned}$$
(20)

Observe that the coefficients in front of the s-derivatives of the modulation parameters in these three equations are always non-zero. Indeed, in (18), \(\xi _s\) is multiplied by \(e^{bs}\), \(\kappa _s\) is multiplied by \(e^{(b-1)s}\), and \(\tau _s\) is multiplied by \(e^s\) (similarly for (19) and (20)).

For this reason, we shall use these equations to determine the dynamic evolution equations for \(\kappa \), \(\tau \) and \(\xi \) by imposing the conditions

$$\begin{aligned} \boxed {w_{0} = w_{1} = w_{2k} = 0 \quad \hbox { for all } s,} \end{aligned}$$
(21)

which leads to the following equations:

$$\begin{aligned}{} & {} \boxed { e^{(b-1) s} \kappa _{s} + e^{bs} \xi _{s} - (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa = (1+e^{s} \tau _{s}) F^{(0)}(s, 0), } \end{aligned}$$
(22)
$$\begin{aligned}{} & {} \boxed { e^{s} \tau _{s} = (1+e^{s} \tau _{s}) F^{(1)}(s, 0) - w_2(-e^{bs}\xi _s + (1+e^{s}\tau _s)e^{(b-1)s}\kappa ), } \end{aligned}$$
(23)
$$\begin{aligned}{} & {} \boxed { \left( (2k)!+w_{2k+1}\right) \left( e^{bs} \xi _{s} - (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa \right) = \left. N(2k) \right| _{y=0} - (1+e^{s} \tau _{s}) F^{(2k)}(s, 0). }\nonumber \\ \end{aligned}$$
(24)

Remark 2.3

Note also that, in case \(k =1\), the last term in Eq. (23) vanishes.

Conversely, if \(\kappa _{s}\), \(\tau _{s}\) and \(\xi _{s}\) are fixed so that (22)–(24) are satisfied (see Footnote 6) and \(w_{0}\), \(w_{1}\) and \(w_{2k}\) are initially zero, then by (17) in the cases \(j= 0\), 1, and 2k, (21) holds.

When \(k > 1\), the conditions in (21) do not fix all values of \(w_{j}\) for \(j=0, \ldots , 2k\). In such a case, we use (17) to determine the evolution of the remaining coefficients. More precisely, for the remaining indices \(j = 2, \ldots , 2k-1\), the ODE for \(w_{j}\) is

$$\begin{aligned} \boxed { \begin{aligned}&\partial _{s} w_{j} + \left( (j-1) (b-1) - 1 \right) w_{j} + \left. N(j) \right| _{y = 0} \\&\qquad +\, \left( - e^{b s} \xi _{s} + (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa \right) w_{j+1} + e^{s} \tau _{s} \left( - w_{j} + \left. N(j) \right| _{y=0} \right) \\&\quad =(1+e^{s} \tau _{s}) F^{(j)}(s, 0). \end{aligned}} \end{aligned}$$

Here, we used the properties of \(\overline{U}^{(j)}(0)\) and \(w_{0} = w_{1} = w_{2k} = 0\).

We will now rewrite the above system as a system of ODEs. Introduce the vector \(\vec {w}(s) = (w_2(s), \ldots , w_{2k-1}(s))\). Then, \(\vec {w}(s)\) satisfies the following system of ODEs:

$$\begin{aligned} \boxed { \partial _s \vec {w}(s) - D \vec {w}(s) + (1+e^{s} \tau _{s}) {\mathcal {N}}(\vec {w}(s)) = M \vec {w}(s) + \vec {f}(s).} \end{aligned}$$

Here, D and M are \((2k-2)\times (2k-2)\) matrices given by

$$\begin{aligned} \begin{aligned} D&= \textrm{diag}\,\left( \lambda _{2}, \ldots , \lambda _{2k-1} \right) , \quad \lambda _{j} = 1 - (j-1) (b-1) = 1 - \frac{j-1}{2k},\\ M&=e^{s} \tau _{s} I + \left( e^{bs} \xi _{s} - (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa \right) N, \end{aligned} \end{aligned}$$
(25)

where I is the identity matrix and N is the nilpotent matrix such that \(N_{j (j+1)} = 1\) and \(N_{j j'} = 0\) otherwise.

Since \(b = \frac{2k+1}{2k}\), each eigenvalue \(\lambda _{j}\) of D is strictly positive, so the main linear part \((\partial _{s} - D) \vec {w}(s)\) defines an unstable system of ODEs. In addition, \({\mathcal {N}}(\vec {w}(s))\) is a vector with quadratic entries as functions of the entries of \(\vec {w}\), and \(\vec {f}\) is the vector \(((1+e^{s} \tau _{s}) F^{(2)}(s,0), \ldots , (1+e^{s} \tau _{s}) F^{(2k-1)}(s, 0))\).
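The spectrum of D can be checked directly (a quick sketch in exact arithmetic; the helper name is ours): with \(b = \frac{2k+1}{2k}\), the eigenvalues decrease from \(\lambda _{2} = 1 - \frac{1}{2k}\) down to \(\lambda _{2k-1} = \frac{1}{k}\), and in particular stay uniformly positive.

```python
from fractions import Fraction

def unstable_eigenvalues(k):
    """Diagonal of D in (25): lambda_j = 1 - (j-1)/(2k), j = 2, ..., 2k-1."""
    return [1 - Fraction(j - 1, 2 * k) for j in range(2, 2 * k)]

for k in range(2, 7):                     # (w_2, ..., w_{2k-1}) is nonempty for k >= 2
    lams = unstable_eigenvalues(k)
    assert len(lams) == 2 * k - 2         # one eigenvalue per component of w
    assert max(lams) == 1 - Fraction(1, 2 * k)   # attained at j = 2
    assert min(lams) == Fraction(1, k)           # attained at j = 2k - 1
    assert all(l > 0 for l in lams)       # every mode of (d_s - D) is unstable
```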

3 Precise Formulation of the Main Theorem and Reduction to the Main Bootstrap Lemma

3.1 Initial Data in the Original Variables and the Main Theorem

The purpose of this subsection is to give a precise formulation of the main theorem of this paper (Theorem 3.1). We begin by specifying the set of initial data.

We begin by introducing the following co-dimension \(2k+1\) subspace of \(H^{2k+3}\):

$$\begin{aligned} H^{2k+3}_{(2k)} = \{W_{0} \in H^{2k+3}: W_{0}(0) = W_{0}'(0) = \cdots = W_{0}^{(2k)}(0) = 0\}. \end{aligned}$$

We parametrize the initial data in \(H^{2k+3}\) that will lead to the desired gradient blow-up solutions with the help of the map \(\varvec{\Phi }: (0, \infty ) \times {\mathbb {R}}\times {\mathbb {R}}\times {\mathbb {R}}^{2k-2} \times H^{2k+3}_{(2k)} \rightarrow H^{2k+3}\), which is defined by the formula

$$\begin{aligned} (\tau _{0}, \xi _{0}, \kappa _{0}, w_{2, 0}, \ldots , w_{2k-1, 0}, W_{0}) \mapsto u_{0}(x) = \tau _{0}^{b-1} \left. \left( \chi (-\log \tau _{0}, y) \left( \mathring{U}(y) + \tau _{0}^{1-b} \kappa _{0}\right) + \bar{\chi }(y) \sum _{j=2}^{2k-1} \frac{w_{j, 0}}{j!} y^{j} + W_{0}(y) \right) \right| _{y = \tau _{0}^{-b}(x-\xi _{0}) }, \end{aligned}$$
(26)

where \(b = \frac{2k+1}{2k}\), \(\mathring{U}\) is the k-th smooth self-similar profile for the Burgers equation and \(\chi (s, \cdot )\) and \(\bar{\chi }(\cdot )\) are as in Sect. 2.3. When \(k = 1\), the term \( {\bar{\chi }}(y) \sum _{j=2}^{2k-1} \frac{w_{j, 0}}{j!} y^{j}\) is omitted.

Note that (26) maps the point \((\tau _{0}, \xi _{0}, \kappa _{0}, w_{2, 0}, \ldots , w_{2k-1, 0}, W_{0}) = (\tau _{0}, \xi _{0}, \kappa _{0}, 0, \ldots , 0)\) to the translated and rescaled self-similar Burgers profile whose gradient at \(x = \xi _{0}\) is negative and of size \(\tau _{0}^{-1}\), i.e.,

$$\begin{aligned} (\tau _{0}, \xi _{0}, \kappa _{0}, \ldots , 0) \mapsto u_{0}(x) = \chi (-\log \tau _{0}) \left( \tau _{0}^{b-1} \mathring{U}(\tau _{0}^{-b}(x - \xi _{0})) + \kappa _{0} \right) . \end{aligned}$$

When \(k > 1\), \(w_{j,0}\) equals the j-th Taylor coefficient of U(y) at \(y =0\) in the self-similar variables for \(j=2, \ldots , 2k-1\).

Given \(\tau _{0}, \epsilon _{0} > 0\), we consider the following open subset of \(H^{2k+3}_{(2k)}\):

$$\begin{aligned} {\mathcal {O}}_{\tau _{0}, \epsilon _{0}} = \left\{ W_{0} \in H^{2k+3}_{(2k)} : \tau _{0}^{\frac{3}{2} b-1} \left( \Vert {W_{0}}\Vert _{L^{2}} + \tau _{0}^{-b(2k+3)} \Vert {\partial _{y}^{2k+3} W_{0}}\Vert _{L^{2}}\right) < \epsilon _{0} \right\} . \end{aligned}$$

When \(k > 1\), for \(\vec {v}_{0} \in {\mathbb {R}}^{2k-2}\) and \(r > 0\), we also introduce the notation

$$\begin{aligned} B_{\vec {v}_{0}}(r) = \{\vec {v} \in {\mathbb {R}}^{2k-2}: \vert {\vec {v} - \vec {v}_{0}}\vert < r\}. \end{aligned}$$

We are now ready to formulate the main theorem in precise terms.

Theorem 3.1

(Precise formulation of the main result) Let k be a positive integer such that \(\alpha , \beta < \frac{2k}{2k+1}\) and set \(b = \frac{2k+1}{2k}\). Then there exist \(\gamma > 0\) and positive decreasing functions \(\tau _{*}(\cdot )\), \(\epsilon _{*}(\cdot )\) such that the following holds. Let \(\xi _{0} \in {\mathbb {R}}\), \(\kappa _{0} \in {\mathbb {R}}\), \(\tau _{0} < \tau _{*}(\vert {\kappa _{0}}\vert )\), \(\epsilon _{0} < \epsilon _{*}(\vert {\kappa _{0}}\vert )\) and \(W_{0} \in {\mathcal {O}}_{\tau _{0}, \epsilon _{0}}\). When \(k = 1\), the initial data \(u_{0}(x)\) given by (26) gives rise to a (well-posed; see Footnote 7) solution to (1) with initial condition \(u(0, x) = u_{0}(x)\) that blows up in finite time. When \(k \ge 2\), there exists \(\vec {w}_{0} \in B_{0}(\tau _{0}^{\gamma }) \subseteq {\mathbb {R}}^{2k-2}\) such that the initial data \(u_{0}(x)\) given by (26) gives rise to a (well-posed) solution to (1) with initial condition \(u(0, x) = u_{0}(x)\) that blows up in finite time. In both cases, the following statements hold:

  1.

    The blow-up time \(\tau _{+}\) obeys the bound \(\vert {\tau _{+} - \tau _{0}}\vert < C \tau _{0}^{1+\gamma }\).

  2.

    There exist \(\xi _{+}\), \(\kappa _{+}\) such that

    $$\begin{aligned} \vert {\kappa _{+} - \kappa _{0}}\vert \le C \tau _{0}^{b-1+\gamma }, \quad \vert {\xi _{+} - (\xi _{0} + \tau _{+} \kappa _{0})}\vert \le C \tau _{0}^{b+\gamma }, \end{aligned}$$

    and such that

    $$\begin{aligned} \sup _{0 \le t < \tau _{+}} \Vert {u(t, \cdot )}\Vert _{L^{\infty }} + [u(t, \cdot )]_{{\mathcal {C}}^{\frac{1}{2k+1}}} \le C, \end{aligned}$$

    while for every \(\sigma \in (\frac{1}{2k+1}, 1)\),

    $$\begin{aligned} C_{\sigma }^{-1} \vert {t - \tau _{+}}\vert ^{-\frac{2k+1}{2k}(\sigma -\frac{1}{2k+1})} \le [u(t, \cdot )]_{{\mathcal {C}}^{\sigma }}&\le C_{\sigma } \vert {t - \tau _{+}}\vert ^{-\frac{2k+1}{2k}(\sigma -\frac{1}{2k+1})} \end{aligned}$$

    as \(t \rightarrow \tau _{+}\).

Remark 3.2

For the blow-up solutions in Theorem 3.1, we expect \(\mathring{U}\) to be the blow-up profile, in the sense that U(sy) in appropriate self-similar variables converges to \(\mathring{U}\) as \(s \rightarrow \infty \) on compact sets of y. Such a statement would follow from estimates for \(W = U - \chi \mathring{U}\) on top of those proved in this paper, but we have not carried out the details. We refer to Yang [39] for the proof of this statement in the case of Burgers–Hilbert (i.e., (fKdV) with \(\alpha = 0\)).

Remark 3.3

(Sign of the initial data) There exist smooth compactly supported initial data of either sign (i.e., everywhere nonnegative or everywhere nonpositive) that satisfy the hypotheses of Theorem 3.1. Indeed, in (26), note that \(\vert {\mathring{U}}\vert \le C_{0} \tau _{0}^{1-b}\) on the support of \(\chi (-\log \tau _{0}, \cdot )\) (see Lemma 2.1) for some constant \(C_{0} > 0\) independent of \(\tau _{0}\). Therefore, if we choose, say, \(\vert {\kappa _{0}}\vert > 2 C_{0}\), then the initial profile \(\chi (-\log \tau _{0}) ( \tau _{0}^{b-1} \mathring{U}(\tau _{0}^{-b}(x - \xi _{0})) + \kappa _{0} )\) has a definite sign independent of \(\tau _{0} > 0\). Moreover, observe that \(W_{0} \in {\mathcal {O}}_{\tau _{0}, \epsilon _{0}}\) satisfies the pointwise bound \(\vert {W_{0}}\vert \lesssim \tau _{0}^{1-b} \epsilon _{0}\) by the Sobolev embedding. As a consequence, when \(k = 1\), the image of (26) with the above choice of \(\kappa _{0}\) and \(\epsilon _{0} > 0\) sufficiently small yields an open subset of signed initial data in \(H^{5}\) that leads to the blow-up behavior described in Theorem 3.1, as alluded to in Remark 1.3 above. When \(k \ge 2\), by taking \(\epsilon _{0}\) and \(\tau _{0} > 0\) sufficiently small, we may ensure that the initial data constructed by Theorem 3.1 have a definite sign.

All statements in Theorem 1.1 can be read off from Theorem 3.1, with the exception of the stability and the co-dimensionality statements. To formulate these statements, we show that the map (26) is a local homeomorphism.

Lemma 3.4

For each \(\Theta = (\tau _{0}, \xi _{0}, \kappa _{0}, w_{2, 0}, \ldots , w_{2k-1, 0}, W_{0})\) satisfying the hypothesis of Theorem 3.1, the map \(\varvec{\Phi }\) defined by (26) is a homeomorphism from an open neighborhood of \(\Theta \) onto an open neighborhood of \(\varvec{\Phi }(\Theta )\) in \(H^{2k+3}\).

Note that \(\varvec{\Phi }\) does not possess any further regularity in, for instance, \(\tau _{0}\), since \(\tau _{0}\) acts as a scaling parameter.

Proof

Continuity of the map \(\varvec{\Phi }\) is evident. For every \(\mathring{\Theta }\) satisfying the hypothesis of Theorem 3.1, we may directly construct the continuous inverse in a small neighborhood of \(\mathring{u} = \varvec{\Phi }(\mathring{\Theta })\) as follows. Let u be sufficiently close to \(\mathring{u}\) in the \(H^{2k+3}\) topology. Since \(\partial _{x}^{2k+1} \mathring{u}(\mathring{\xi }) = (2k)! \tau _{0}^{-2k-2}\) and \(\partial _{x}^{2k} \mathring{u}(\mathring{\xi }) = 0\), we may ensure that \(\partial _{x}^{2k+1} u(\mathring{\xi })\) is nonzero and \(\partial _{x}^{2k} u(\mathring{\xi })\) is small. Hence, we can find a unique point \(\xi _{0}\) near \(\mathring{\xi }\) such that \(\partial _{x}^{2k} u(\xi _{0}) = 0\). Next, we choose \(\tau _{0} = - (\partial _{x} u(\xi _{0}))^{-1}\), \(\kappa _{0} = u(\xi _{0})\) and \(w_{j, 0} = \tau _{0}^{1-b+bj} \partial _{x}^{j} u(\xi _{0})\) for \(j = 2, \ldots , 2k-1\). Finally, define \(W_{0}\) from u and the parameters using (26). \(\square \)
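The recovery steps in the proof can be illustrated numerically for \(k = 1\) (a toy sketch under simplifying assumptions of ours: the cutoff \(\chi \) and the perturbation \(W_{0}\) are dropped, and the sample parameters are arbitrary):

```python
def U_ring(y):
    """k = 1 profile: solve y = -U - U^3/3 by bisection (decreasing map)."""
    lo, hi = -abs(y) - 2.0, abs(y) + 2.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if -mid - mid ** 3 / 3.0 > y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# synthetic data: u(x) = tau^(b-1) U_ring(tau^(-b)(x - xi)) + kappa, b = 3/2
tau, xi, kappa = 1.0, 0.3, 2.0
u = lambda x: tau ** 0.5 * U_ring(tau ** -1.5 * (x - xi)) + kappa

h = 1e-4
du  = lambda x: (u(x + h) - u(x - h)) / (2 * h)
d2u = lambda x: (u(x + h) - 2 * u(x) + u(x - h)) / h ** 2

# step 1: locate xi_0 as the zero of u'' (bisection; u'' changes sign there)
lo, hi = -0.5, 1.1
for _ in range(50):
    mid = 0.5 * (lo + hi)
    if d2u(mid) < 0:
        lo = mid
    else:
        hi = mid
xi0 = 0.5 * (lo + hi)

# steps 2-3: tau_0 = -1/u'(xi_0), kappa_0 = u(xi_0)
tau0, kappa0 = -1.0 / du(xi0), u(xi0)

assert abs(xi0 - xi) < 1e-3
assert abs(tau0 - tau) < 1e-2
assert abs(kappa0 - kappa) < 1e-4
```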

Lemma 3.4 shows that the set of initial data for which Theorem 3.1 applies in the case \(k = 1\) is an open subset of \(H^{5}\), which is the precise sense in which the blow-up dynamics described in Theorem 3.1 is stable. In the case \(k \ge 2\), it establishes the precise sense in which the set of initial data given by prescribing \(\xi _{0} \in {\mathbb {R}}\), \(\kappa _{0} \in {\mathbb {R}}\), \(\tau _{0} < \tau _{*}(\vert {\kappa _{0}}\vert )\), \(\epsilon _{0} < \epsilon _{*}(\vert {\kappa _{0}}\vert )\) and \(W_{0} \in {\mathcal {O}}_{\tau _{0}, \epsilon _{0}}\) but not specifying \(\vec {w}_{0} \in B_{0}(\tau _{0}^{\gamma }) \subseteq {\mathbb {R}}^{2k-2}\) is “co-dimension \(2k-2\)” in \(H^{2k+3}\), as alluded to in Theorem 1.1.

Remark 3.5

An interesting question, which is not pursued in this article, is the regularity of the co-dimension \(2k-2\) set of initial data in \(H^{2k+3}\) given by Theorem 3.1 and Lemma 3.4 (e.g., does it form a \(C^{1}\) submanifold of \(H^{2k+3}\) modelled by \(H^{2k+3}\)?). Such a result seems to require a careful analysis of the difference of blow-up solutions.

3.2 Initial Data in Self-Similar Variables

In this short subsection, we rephrase our ansatz for the initial data in the self-similar variables (5), in which most of our analysis will take place.

We prescribe the initial data at \(s = \sigma _{0}\), where conditions on \(\sigma _{0}\) will be specified later. In the self-similar variables \((s, y, U)\) given by (5) with \(\tau (\sigma _{0}) = \tau _{0}\), \(\xi (\sigma _{0}) = \xi _{0}\) and \(\kappa (\sigma _{0}) = \kappa _{0}\), the initial data for U is of the form

[Display of the initial data ansatz (D1), rendered as an image in the source.]

where the assumptions on \(W_{0}\) are as follows:

[Displays of the assumptions (D2)–(D3) on \(W_{0}\), rendered as an image in the source.]

When \(k > 1\), the following smallness conditions are assumed for the unstable coefficients:

$$\begin{aligned} |w_{j, 0}| \le e^{-\gamma \sigma _{0}} \quad \text {for } j = 2, \ldots , 2k-1. \end{aligned}$$
(D4)

3.3 Main Bootstrap and Shooting Lemmas

In this section, we state two central ingredients of our proof, namely, the main bootstrap lemma in self-similar coordinates (Lemma 3.6) and a shooting lemma for handling the unstable modes when \(k \ge 2\) (Lemma 3.9).

Recall that \(\mu = \min \{1-b\alpha , 1-b \beta , 1\}\). Let \(\mu _{0}\) be given by

$$\begin{aligned} \mu _{0} = {\left\{ \begin{array}{ll} \min \{\mu , \frac{2k-1}{2k}\} &{} \hbox { when } \max \{\alpha , \beta \} \ne \frac{1}{2k+1},\\ \frac{2k-\frac{3}{2}}{2k} &{} \hbox { when } \max \{\alpha , \beta \} = \frac{1}{2k+1}. \end{array}\right. } \end{aligned}$$

Fix also a number \(\gamma \) satisfying

$$\begin{aligned} 0< \gamma < \mu _{0}. \end{aligned}$$
(27)

To formulate our bootstrap assumptions, we introduce a semi-norm \(\dot{{\mathcal {H}}}^{n}_{< L}\) (n is a nonnegative integer and \(L > 0\)) defined by the formula

$$\begin{aligned} \Vert {V}\Vert _{\dot{{\mathcal {H}}}^{n}_{< L}}= & {} \sup _{j \in {\mathbb {Z}}, \, 2^{j}< L} \left( \int _{2^{j-1}< \vert {y}\vert < 2^{j}} (\vert {y}\vert ^{n-\frac{1}{2k+1}} \partial _{y}^{n} V)^{2} \frac{\textrm{d}y}{\vert {y}\vert }\right) ^{\frac{1}{2}} \\{} & {} +\,L^{n-\frac{1}{2k+1}-\frac{1}{2}} \left( \int _{\vert {y}\vert >\frac{L}{2}} (\partial _{y}^{n} V)^{2} \textrm{d}y\right) ^{\frac{1}{2}}. \end{aligned}$$

A notable feature of this semi-norm is that, in the limit \(L =\infty \), it is invariant under the self-similar transformation \(x = \lambda y\), \(u(x) = \lambda ^{1-\frac{1}{b}} U(y)\) with \(b = \frac{2k+1}{2k}\) for any \(\lambda > 0\).
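For the reader's convenience, here is the short computation behind this invariance (in the limit \(L = \infty \) only the annulus terms remain; we use \(1 - \frac{1}{b} = \frac{1}{2k+1}\) for \(b = \frac{2k+1}{2k}\)):

```latex
% With x = \lambda y and u(x) = \lambda^{1-1/b} U(y), we have
%   \partial_x^n u(x) = \lambda^{1 - 1/b - n} \partial_y^n U(y),
% while dx/|x| = dy/|y| is invariant under the dilation. Hence
\begin{aligned}
\int_{\lambda 2^{j-1} < |x| < \lambda 2^{j}}
  \big( |x|^{n - \frac{1}{2k+1}}\, \partial_x^n u \big)^2 \frac{\mathrm{d}x}{|x|}
= \lambda^{2(1 - \frac{1}{b} - \frac{1}{2k+1})}
  \int_{2^{j-1} < |y| < 2^{j}}
  \big( |y|^{n - \frac{1}{2k+1}}\, \partial_y^n U \big)^2 \frac{\mathrm{d}y}{|y|},
\end{aligned}
% and the exponent vanishes precisely when b = (2k+1)/(2k); taking the
% supremum over all dyadic annuli gives the claimed invariance.
```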

Lemma 3.6

(Main bootstrap lemma) There exist increasing functions \(\epsilon _{*}^{-1}(\cdot )\), \(A(\cdot ), y_{0}^{-1}(\cdot )\) and \(\sigma _{*}(\cdot )\) on \([0, \infty )\), all of which are bounded from below by 1, such that the following holds. Let \(\kappa _{0} \in {\mathbb {R}}\), \(\sigma _{0} \ge \sigma _{*}(\vert {\kappa _{0}}\vert )\) and assume that the initial data conditions (D1)–(D4) are satisfied at \(s = \sigma _{0}\) with \(\epsilon _{0} \le \epsilon _{*}(\vert {\kappa _{0}}\vert )\). Suppose that, for some \(\sigma _{1} > \sigma _{0}\), \(A = A(\vert {\kappa _{0}}\vert )\) and \(y_{0} = y_{0}(\vert {\kappa _{0}}\vert )\), the following estimates are satisfied for \(s \in [\sigma _{0}, \sigma _{1}]\):

[Displays of the bootstrap assumptions (B1)–(B7), rendered as an image in the source.]

Assume also that \(U(s, 0) = U'(s, 0)+1 = U^{(2k)}(s, 0) = 0\) for all \(s \in [\sigma _0, \sigma _1]\). In case \(k > 1\), assume furthermore that \(\vec {w}\) satisfies the trapping condition

$$\begin{aligned}&|\vec {w}(s)| \le e^{-\gamma s} \hbox { for } s \in [\sigma _{0}, \sigma _{1}]. \end{aligned}$$
(T)

Then, stronger estimates actually hold on the interval \(s \in [\sigma _{0}, \sigma _{1}]\), as follows:

[Displays of the improved estimates (IB1), (IB2), \(\ldots \), rendered as an image in the source.]

Remark 3.7

(On dependencies) We would like to clarify the order in which the above functions \(\epsilon _*\), A, \(y_0\), and \(\sigma _*\) are chosen. We start from \(\epsilon _{*}\), which is essentially the size of the initial data. Then we choose A, which is the bootstrap parameter (we will eventually choose it to be very large), and, in order to be able to Taylor expand at \(y = 0\), we choose \(y_0\) to be very small based on A. This then forces us to choose \(\sigma _{*}\) very large depending on \(y_0\) and A.

When \(k = 1\), Lemma 3.6 is already sufficient to set up a bootstrap argument to show the global existence of \(U(s, y)\) for all \(s \ge \sigma _{0}\), which is the key step in the proof of Theorem 3.1 below.

When \(k > 1\), the trapping condition (T) for \(\vec {w}\) is not improved in general, so we need an extra argument to find a global-in-s solution. For this purpose, we introduce the notion of a trapped solution as follows:

Definition 3.8

Let \(k > 1\). For \(\kappa _{0} \in {\mathbb {R}}\), \(\xi _{0} \in {\mathbb {R}}\) and \(W_{0}\) satisfying the initial data conditions (D2)–(D3), let A, \(y_{0}\) and \(\sigma _{0}\) be determined from Lemma 3.6. We say that a solution \(U(s, y)\) with the initial data (D1) induced by \(\sigma _{0}\), \(\kappa _{0}\), \(\xi _{0}\), \(W_{0}\) and \(\vert {\vec {w}_{0}}\vert \le e^{-\gamma \sigma _{0}}\) is trapped on an interval \([\sigma _{0}, \sigma _{1}]\) if it satisfies (B1)–(B7) and (T) on \([\sigma _{0}, \sigma _{1}]\).

By Lemma 3.6, it follows that the only way a trapped solution \(U(s, y)\) on \([\sigma _{0}, \sigma _{1}]\) can fail to be trapped for \(s > \sigma _{1}\) is if (T) is saturated at \(s = \sigma _{1}\), i.e., \(\vert {\vec {w}(\sigma _{1})}\vert = e^{-\gamma \sigma _{1}}\). Combining this property with a topological fact (namely, the nonexistence of a continuous retraction of a closed ball onto its boundary), we shall prove the existence of a globally trapped solution:
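The standard way such a shooting argument uses this topological fact can be outlined as follows (a schematic sketch only; the continuity and boundary properties of the exit map are what is actually verified in Sect. 6):

```latex
% Write B = B_0(1) \subseteq \mathbb{R}^{2k-2} and suppose, for contradiction,
% that no \vec{w}_0 \in \bar{B}_0(e^{-\gamma\sigma_0}) yields a globally trapped
% solution. Then every solution has a finite exit time
%   \sigma_1(\vec{w}_0) = \sup\{\sigma \ge \sigma_0 :
%                              U \hbox{ is trapped on } [\sigma_0, \sigma]\},
% at which (T) is saturated: |\vec{w}(\sigma_1)| = e^{-\gamma\sigma_1}.
% The rescaled exit map
%   \Psi(\vec{w}_0) = e^{\gamma\sigma_1(\vec{w}_0)}\,\vec{w}(\sigma_1(\vec{w}_0))
%                   \in \partial B
% restricts to (a rescaling of) the identity on the sphere
% |\vec{w}_0| = e^{-\gamma\sigma_0}, where \sigma_1 = \sigma_0. If \Psi is
% continuous (this is where the instability, i.e., transversal exit, of the
% modes \vec{w} enters), then \Psi is a continuous retraction of a closed ball
% onto its boundary, which does not exist. Hence some \vec{w}_0 yields a
% globally trapped solution.
```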

Lemma 3.9

(Shooting lemma) Let \(W_0\), \(\kappa _{0}\) and \(\xi _{0}\) be fixed so that the conditions (D2)–(D3) hold, and let A, \(y_{0}\), and \(\sigma _{0}\) be as in Lemma 3.6. Then there is a vector \(\vec {w}_{0}\) with \(\vert {\vec {w}_{0}}\vert < e^{-\gamma \sigma _{0}}\) such that the corresponding solution \(U(s, y)\) with initial data at \(\sigma _{0}\) induced by \(\vec {w}_0\) and \(W_0\) remains trapped for all \(s \ge \sigma _{0}\).

We are going to prove Lemmas 3.6 and 3.9 in Sects. 5 and 6 by breaking the proof into several parts. In the remainder of this section, we show how to establish Theorem 3.1 assuming Lemmas 3.6 and 3.9.

In addition to Lemmas 3.6 and 3.9, we need three more ingredients, which will be useful in the rest of the paper. The first ingredient is the following simple pointwise bound from the weighted \(L^{2}\)-Sobolev norm \(\dot{{\mathcal {H}}}_{<L}^{n}\):

Lemma 3.10

For any \(1 \le \ell \le 2k+2\), we have

$$\begin{aligned} \vert {\partial _{y}^{\ell } V(y)}\vert \lesssim _{\ell , k} \max \{\vert {y}\vert ^{-\ell +\frac{1}{2k+1}}, L^{-\ell +\frac{1}{2k+1}}\} (\Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{1}} + \Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{2k+3}}). \end{aligned}$$

Proof

This lemma follows easily from the Sobolev embedding on the unit interval and scaling; we omit the details. \(\square \)
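For the interested reader, the scaling step can be sketched as follows (under the dyadic annulus convention of the definition of \(\dot{{\mathcal {H}}}^{n}_{<L}\), and using localized interpolation to pass from the two endpoint semi-norms to the intermediate ones):

```latex
% 1D Sobolev embedding on an interval I of unit length:
%   \| f \|_{L^\infty(I)} \lesssim \| f \|_{L^2(I)} + \| f' \|_{L^2(I)} .
% Applying this to the rescaled function f(z) = \partial_y^{\ell} V(2^{j} z)
% and undoing the scaling gives, for each dyadic annulus with 2^{j} < L,
\begin{aligned}
\sup_{2^{j-1} < |y| < 2^{j}} \big| \partial_y^{\ell} V(y) \big|
\;\lesssim\; 2^{-j (\ell - \frac{1}{2k+1})}
\big( \| V \|_{\dot{\mathcal{H}}^{\ell}_{<L}}
    + \| V \|_{\dot{\mathcal{H}}^{\ell+1}_{<L}} \big).
\end{aligned}
% The region |y| > L/2 is treated the same way at scale L, which produces the
% alternative L^{-\ell + \frac{1}{2k+1}}; the intermediate semi-norms with
% 1 \le \ell \le 2k+2 are then controlled by
% \| V \|_{\dot{\mathcal{H}}^{1}_{<L}} + \| V \|_{\dot{\mathcal{H}}^{2k+3}_{<L}}.
```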

The second ingredient is the observation that Eq. (1) admits an \(L^2\) bound for \(u(t, x)\), which readily translates into an \(L^2\) bound for \(U(s, y)\) itself. We record this fact in the following lemma.

Lemma 3.11

Assume that the initial data conditions (D1)–(D4) are satisfied at \(s = \sigma _0\), and \(u(t, x)\), \(U(s, y)\) are as above. Then, there is \(C>0\) such that the following bound holds for \(s \in [\sigma _{0}, \sigma _{1}]\):

$$\begin{aligned} \Vert {U+e^{(b-1)s} \kappa }\Vert _{L^{2}_{y}} \le C e^{(\frac{3}{2} b-1) s} (1+\vert {\kappa _{0}}\vert ). \end{aligned}$$
(28)

Proof

We first express the initial data for u in terms of the initial data for U. Due to (D1), we have

$$\begin{aligned} u(0, x)= & {} e^{(1-b)\sigma _0}\Bigg ( \chi (\sigma _{0},y) \left( \mathring{U}(y) + e^{(b-1) \sigma _{0}} \kappa (\sigma _{0})\right) \\{} & {} + \bar{\chi }(y) \sum _{j=2}^{2k-1} \frac{w_{j, 0}}{j!} y^{j} + W_{0}(y) \Bigg ), \end{aligned}$$

where we remind the reader that \(x = e^{-b \sigma _{0}} y + \xi _{0}\). To bound the first (and dominant) term in the above expression, recall from (13) that \(\vert {\mathring{U}(y)}\vert \lesssim e^{\frac{1}{2k+1} b \sigma _{0}} = e^{(b-1) \sigma _{0}}\) on the support of \(\chi (\sigma _{0}, \cdot )\). Therefore,

$$\begin{aligned}&\int e^{2 (1-b)\sigma _0}\chi (\sigma _{0},y)^{2} \left( \mathring{U}(y) + e^{(b-1) \sigma _{0}} \kappa (\sigma _{0})\right) ^{2} \, \textrm{d}x \\&\quad \lesssim (1+\vert {\kappa _{0}}\vert )^{2} \int \chi (\sigma _{0},e^{b \sigma _{0}} (x - \xi _{0}))^{2} \, \textrm{d}x \lesssim (1+\vert {\kappa _{0}}\vert )^{2}, \end{aligned}$$

where we used Lemma 2.1 again in the last inequality. The contribution of the last term is bounded thanks to the assumptions on \(W_{0}\) (see (D2)–(D3)), while the contribution of the second term decays as \(\sigma _{0} \rightarrow \infty \) by (D4). We eventually obtain

$$\begin{aligned} \Vert {u_{0}}\Vert _{L^{2}_x} \le C (1+\vert {\kappa _{0}}\vert ). \end{aligned}$$

We now use the fact that Eq. (1) satisfies an a-priori \(L^{2}\) bound, since \(\Gamma (D_{x}) \partial _{x}\) is anti-symmetric (dispersive) and \(\Upsilon (D_{x})\) is nonnegative (dissipative). We then calculate, using the fact that \(u =e^{(1-b)s} (U + e^{(b-1) s} \kappa )\),

$$\begin{aligned} \int \vert {u}\vert ^{2} \, \textrm{d}x = \int e^{2(1-b)s} (U + e^{(b-1) s} \kappa )^{2} \, \textrm{d}(e^{-b s} y) = e^{2s - 3bs} \int (U + e^{(b-1) s} \kappa )^{2} \, \textrm{d}y. \end{aligned}$$

This readily implies

$$\begin{aligned} \Vert {U+e^{(b-1)s} \kappa }\Vert _{L^{2}_{y}} = e^{(\frac{3}{2} b - 1) s}\Vert {u(\tau (s) -e^{-s}, x)}\Vert _{L^{2}_{x}} \le C e^{(\frac{3}{2} b - 1) s} (1+\vert {\kappa _{0}}\vert ). \end{aligned}$$

\(\square \)

Finally, the third ingredient concerns some specific bounds for the initial data which follow from the requirements in Sect. 3.2. We record these bounds in the next subsection.

3.4 Consequences of the Initial Data Bounds

We record here some consequences of the initial data bounds from Sect. 3.2 which will be used in the proof of Theorem 3.1. By (D3) and interpolation, we have

$$\begin{aligned} \Vert {\partial _{y} W_{0}}\Vert _{L^{2}} \le C \epsilon _{0} e^{-(1-\frac{1}{2} b) \sigma _{0}}, \quad \Vert {\partial _{y}^{2k+3} W_{0}}\Vert _{L^{2}} \le C \epsilon _{0} e^{-(1+(2k+\frac{3}{2}) b) \sigma _{0}}, \end{aligned}$$

and by the Gagliardo–Nirenberg inequality,

$$\begin{aligned} \vert {\partial _{y}^{2k+1} W_{0}(0)}\vert \le C \epsilon _{0} e^{-(1+2k b) \sigma _{0}}. \end{aligned}$$
(29)

On the other hand, (D3) also implies

$$\begin{aligned} \Vert {W_{0}}\Vert _{\dot{{\mathcal {H}}}_{<e^{b\sigma _{0}}}^{1}} + \Vert {W_{0}}\Vert _{\dot{{\mathcal {H}}}_{<e^{b\sigma _{0}}}^{2k+3}} \le C \epsilon _{0}. \end{aligned}$$
(30)

Noting that \(\Vert {\bar{U}}\Vert _{\dot{{\mathcal {H}}}_{<e^{b \sigma _{0}}}^{n}} \le C_{n}\) for any \(n = 0, 1, \ldots \), we have

$$\begin{aligned} \Vert {U(\sigma _{0}, \cdot )}\Vert _{\dot{{\mathcal {H}}}_{<e^{b\sigma _{0}}}^{1}} + \Vert {U(\sigma _{0}, \cdot )}\Vert _{\dot{{\mathcal {H}}}_{<e^{b\sigma _{0}}}^{2k+3}} \le C. \end{aligned}$$
(31)

By the definition of \(\bar{U}\), (30) and Lemma 3.10, we also obtain the pointwise bound

$$\begin{aligned} \vert {\partial _{y} U(\sigma _{0}, y)}\vert \le C \max \{(1+\vert {y}\vert )^{-\frac{2k}{2k+1}}, e^{-\sigma _{0}} \epsilon _{0}\}. \end{aligned}$$
(32)

3.5 Proof of the Main Theorem

We are now ready to give a proof of Theorem 3.1.

Proof of Theorem 3.1 assuming Lemmas 3.6 and 3.9

Let \(\tau _{*}(\cdot ) = e^{-\sigma _{*}(\cdot )}\) and define \(\sigma _{0}\) by \(\tau _{0} = e^{-\sigma _{0}}\). In case \(k = 1\), by a standard bootstrap argument using Lemma 3.6, there exist \(C^{1}\) functions \(\tau (\cdot )\), \(\kappa (\cdot )\), and \(\xi (\cdot )\) on \([\sigma _{0}, \infty )\) such that, in the self-similar variables \((s, y, U)\) given by (5) with \(\tau (\cdot )\), \(\kappa (\cdot )\) and \(\xi (\cdot )\), \(U(s, y)\) is a globally trapped solution on \([\sigma _{0}, \infty )\) and \(\tau \), \(\kappa \), and \(\xi \) solve (22)–(23) with \(\tau (\sigma _{0}) = \tau _{0}\) (so that \(s = \sigma _{0}\) corresponds to \(t = 0\)), \(\kappa (\sigma _{0}) = \kappa _{0}\) and \(\xi (\sigma _{0}) = \xi _{0}\). In case \(k \ge 2\), by Lemmas 3.6 and 3.9, there exists \(\vec {w}_{0} \in B_{0}(e^{- \gamma \sigma _{0}})\) such that the above conclusion holds.

By integrating the ODEs for \(\tau _{s}\), \(\kappa _{s}\) and \(\xi _{s}\) in (IB6), it follows that \((\tau (s), \kappa (s), \xi (s)) \rightarrow (\tau _{+}, \kappa _{+}, \xi _{+})\) as \(s \rightarrow \infty \), where

$$\begin{aligned}{} & {} \vert {\tau _{+} - \tau }\vert \lesssim e^{-(1+\gamma )s}, \quad \vert {\kappa _{+} - \kappa }\vert \lesssim e^{-(b-1+\gamma )s}, \nonumber \\{} & {} \vert {\xi _{+} - \xi - (\tau _{+} - \tau + e^{-s}) \kappa _{+}}\vert \lesssim e^{-(b+\gamma )s}. \end{aligned}$$
(33)

In particular, by (IB6) and \(\vert {\tau _{+} - \tau }\vert \lesssim e^{-(1+\gamma )s}\), it follows that the change of variables \(s \rightarrow t\) is a well-defined strictly increasing map from \([\sigma _{0}, \infty )\) onto \([0, \tau _{+})\). Since \(\partial _{y} U(s, 0) = -1\) for all s, it follows that \(\partial _{x} u(t, \xi (s(t))) = - (\tau (s(t)) - t)^{-1} \rightarrow -\infty \) as \(t \nearrow \tau _{+}\), so \(\partial _{x} u\) indeed blows up as \(t \nearrow \tau _{+}\). The desired bounds on \(\tau _{+}\), \(\kappa _{+}\) and \(\xi _{+}\) also follow from (33).

To complete the proof, it remains to establish the regularity and blow-up properties of u, which we derive from properties of U and the change of variables (5). To begin with, note that, by (IB4)–(IB5) and Lemma 3.10, we have

$$\begin{aligned} \vert {U'(s, y)}\vert \le C A \max \{\vert {y}\vert ^{-\frac{2k}{2k+1}}, e^{-s}\} \quad \hbox { for } \vert {y}\vert \ge 1. \end{aligned}$$
(34)

On the other hand, \(\vert {U'(s, y)}\vert \le 2\) for \(\vert {y}\vert \le 1\) by (IB1)–(IB2). Using \(U(s, 0) = 0\) and integrating, we arrive at

$$\begin{aligned} \vert {U(s, y)}\vert \le {\left\{ \begin{array}{ll} C \vert {y}\vert &{} \hbox { for } \vert {y}\vert \le 1, \\ C A \max \{\vert {y}\vert ^{\frac{1}{2k+1}}, \vert {y}\vert e^{-s}\} &{} \hbox { for } \vert {y}\vert \ge 1. \end{array}\right. } \end{aligned}$$
(35)

For \(\vert {y}\vert > e^{b s}\), we may eliminate the linear growth \(\vert {y}\vert e^{-s}\) by using the Sobolev inequality based on the \(L^{2}\) bound (28) and

$$\begin{aligned} \Vert {\partial _{y} (U + e^{(b-1) s} \kappa )}\Vert _{L^{2}( \vert {y}\vert> e^{b s} )}= & {} \Vert {\partial _{y} U}\Vert _{L^{2} ( \vert {y}\vert > e^{b s} )} \le e^{( \frac{1}{2} b-1) s} \Vert {U}\Vert _{\dot{{\mathcal {H}}}_{<e^{bs}}^{1}} \\\le & {} e^{( \frac{1}{2} b - 1) s} A. \end{aligned}$$

As a consequence, we obtain

$$\begin{aligned} \vert {U(s, y) + e^{(b-1)s} \kappa }\vert \le C e^{( b-1) s} (1 + \vert {\kappa _{0}}\vert + A) \quad \hbox { for } \vert {y}\vert \ge e^{b s}, \end{aligned}$$

which is an improvement over (35). In particular, it follows that

$$\begin{aligned} \vert {U(s, y)}\vert \le C e^{(b-1) s} (1 + \vert {\kappa _{0}}\vert + A) \quad \hbox { for all } y \in {\mathbb {R}}, \end{aligned}$$

which implies via (5) that \(\Vert {u}\Vert _{L^{\infty }}\) is uniformly bounded up to the blow-up time \(\tau _{+}\).

To prove the upper bounds on the Hölder semi-norms, first observe the simple gradient bound \(\vert {U'}\vert \le C A (\vert {y}\vert ^{-\frac{2k}{2k+1}} + e^{-s})\) from (IB1)–(IB2) and (34). Recalling (5), \(\lambda = e^{-bs}\) and \(b = \frac{2k+1}{2k}\), we have, for each \(t < \tau _{+}\),

$$\begin{aligned} {[}u]_{{\mathcal {C}}^{\frac{1}{2k+1}}}&= \sup _{x \in {\mathbb {R}}, \, \Delta x > 0} \frac{\vert {u(x + \Delta x)-u(x)}\vert }{(\Delta x)^{\frac{1}{2k+1}}} \\&\le \sup _{x \in {\mathbb {R}}, \, \Delta x \in (0, 1)} \frac{\vert {u(x + \Delta x)-u(x)}\vert }{(\Delta x)^{\frac{1}{2k+1}}} + 2 \Vert {u}\Vert _{L^{\infty }} \\&\le \sup _{y \in {\mathbb {R}}, \, \Delta y \in (0, e^{b s})} \frac{e^{-(b-1)s}\vert {U(y + \Delta y)-U(y)}\vert }{e^{-b\frac{1}{2k+1}s}(\Delta y)^{\frac{1}{2k+1}}} + 2 \Vert {u}\Vert _{L^{\infty }} \\&\le C A \sup _{y \in {\mathbb {R}},\, \Delta y \in (0, e^{b s})} (\Delta y)^{-\frac{1}{2k+1}} \int _{y}^{y + \Delta y} (\vert {y'}\vert ^{-\frac{2k}{2k+1}} + e^{-s}) \, \textrm{d}y' \\&\le C A \sup _{\Delta y \in (0, e^{bs})} (\Delta y)^{-\frac{1}{2k+1}} ((\Delta y)^{\frac{1}{2k+1}} + \Delta y e^{-s}) \le C A. \end{aligned}$$

Then interpolating with the trivial upper bound \(\Vert {\partial _{x} u}\Vert _{L^{\infty }} = (\tau _{+} - t)^{-1} \Vert {\partial _{y} U}\Vert _{L^{\infty }} \lesssim (\tau _{+} - t)^{-1}\), the upper bounds when \(\frac{1}{2k+1}< \sigma < 1\) follow.
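The interpolation step is the usual geometric-mean argument; writing \(\sigma = \frac{\theta }{2k+1} + (1-\theta )\) with \(\theta \in (0, 1)\), it reads:

```latex
% For 0 < \Delta x \le 1, each increment obeys both bounds
%   |u(x+\Delta x) - u(x)| \le [u]_{\mathcal{C}^{1/(2k+1)}} (\Delta x)^{\frac{1}{2k+1}},
%   |u(x+\Delta x) - u(x)| \le \|\partial_x u\|_{L^\infty}\, \Delta x .
% Taking the geometric mean with weights \theta and 1 - \theta:
\begin{aligned}
|u(x + \Delta x) - u(x)|
\;\le\; [u]_{\mathcal{C}^{\frac{1}{2k+1}}}^{\theta}\,
        \|\partial_x u\|_{L^\infty}^{1-\theta}\,
        (\Delta x)^{\frac{\theta}{2k+1} + (1-\theta)}
\;=\;   [u]_{\mathcal{C}^{\frac{1}{2k+1}}}^{\theta}\,
        \|\partial_x u\|_{L^\infty}^{1-\theta}\,
        (\Delta x)^{\sigma}.
\end{aligned}
% Increments \Delta x > 1 are absorbed by 2 \| u \|_{L^\infty} as before, so
% [u]_{\mathcal{C}^{\sigma}} \lesssim (CA)^{\theta} (\tau_+ - t)^{-(1-\theta)}.
```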

Finally, to establish the lower bounds on the Hölder semi-norms, note first that \(\inf _{s \ge \sigma _{0}, \, \vert {y}\vert \le c_{0}} \vert {U'(s, y)}\vert > 0\) for some \(c_{0} > 0\) by Taylor expansion. By the mean value theorem,

$$\begin{aligned} \vert {y}\vert ^{-\sigma } \vert {U(s, y) - U(s, 0)}\vert \ge C \vert {y}\vert ^{1-\sigma } \quad \hbox { for } \vert {y}\vert \le c_{0}, \end{aligned}$$

and then by (5), the desired lower bound follows. \(\square \)

4 Lemmas on Fourier Multipliers

In this section, we establish key analytic lemmas concerning the operators \({\mathcal {H}}\) and \({\mathcal {L}}\), whose definitions are recalled here for convenience:

$$\begin{aligned} {\mathcal {H}}(V)&= - P_{> 0}(e^{b s} D_{y}) \left( e^{-\max \{\alpha , \beta , 0\} b s} \Gamma (e^{bs} D_{y}) e^{bs} \partial _{y} V \right. \\&\quad \left. + e^{- \max \{\alpha , \beta , 0\} b s} \Upsilon (e^{bs} D_{y}) V \right) , \\ {\mathcal {L}}(V)&= - P_{\le 0}(e^{b s} D_{y}) \left( \Gamma (e^{bs} D_{y}) e^{bs} \partial _{y} V + \Upsilon (e^{bs} D_{y}) V \right) .^{9} \end{aligned}$$

Footnote 9

Observe that the assumptions on \(\Gamma \) and \(\Upsilon \) remain true under any increase of \(\alpha \) or \(\beta \). In the proofs in this section, we will often assume, without loss of generality, that \(\alpha = \beta \) and \(\alpha \ge 0\), so that \(\max \{\alpha , \beta , 0\} = \alpha \).

We begin with simple \(L^{2}\) and \(L^{\infty }\) estimates for \({\mathcal {H}}\) and \({\mathcal {L}}\).

Lemma 4.1

For any \(\ell \ge 0\), we have

$$\begin{aligned}&\Vert {\partial _{y}^{\ell } {\mathcal {L}}(V)}\Vert _{L^{2}} \lesssim _{\alpha , \beta } e^{-\ell b s} \Vert {V}\Vert _{L^{2}}, \end{aligned}$$
(36)
$$\begin{aligned}&\Vert {\partial _{y}^{\ell } {\mathcal {L}}(V)}\Vert _{L^{\infty }} \lesssim _{\alpha , \beta } e^{-(\frac{1}{2}+\ell ) bs}\Vert {V}\Vert _{L^{2}}. \end{aligned}$$
(37)

For \(\max \{\alpha , \beta \} < 1\), we have

$$\begin{aligned}&\Vert {{\mathcal {H}}(V)}\Vert _{L^{2}} \lesssim _{\alpha , \beta } \Vert {V}\Vert _{L^{2}}^{1-\max \{\alpha , \beta , 0\}} \Vert {\partial _{y} V}\Vert _{L^{2}}^{\max \{\alpha , \beta , 0\}}, \end{aligned}$$
(38)
$$\begin{aligned}&\Vert {{\mathcal {H}}(V)}\Vert _{L^{\infty }} \lesssim _{\alpha , \beta } \Vert {V}\Vert _{L^{\infty }}^{1-\frac{2}{3} \max \{\alpha , \beta , 0\}} \Vert {\partial _{y}^{2} V}\Vert _{L^{2}}^{\frac{2}{3}\max \{\alpha , \beta , 0\}}. \end{aligned}$$
(39)

Proof

The \(L^{2}\) bound (36) for \({\mathcal {L}}\) is simply a consequence of the fact that, thanks to the frequency projection \(P_{\le 0}(e^{b s} D_{y})\) and the assumptions on \(\Gamma \), \(\Upsilon \), \({\mathcal {L}}\) is a Fourier multiplier with bounded symbol. The case \(\ell \ge 1\) then follows, thanks again to the frequency projection \(P_{\le 0}(e^{b s} D_{y})\). Moreover, (37) follows from Bernstein’s inequality.
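The Bernstein step is the standard low-frequency estimate; since \(P_{\le 0}(e^{bs} D_{y})\) restricts to frequencies \(\vert {\xi _{y}}\vert \lesssim e^{-bs}\), it takes the form:

```latex
% For g with \operatorname{supp} \hat{g} \subseteq \{ |\xi_y| \lesssim e^{-bs} \},
% Bernstein's inequality gives
\begin{aligned}
\| g \|_{L^{\infty}} \;\lesssim\; e^{-\frac{1}{2} b s}\, \| g \|_{L^{2}} .
\end{aligned}
% Applying this with g = \partial_y^{\ell} \mathcal{L}(V) and combining with
% the L^2 bound (36) yields (37).
```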

To prove (38), it suffices to prove that, for all \(k \in {\mathbb {Z}}\),

$$\begin{aligned} \Vert {P_{k}(D_{y}) {\mathcal {H}}V}\Vert _{L^{2}}&\lesssim \min \{2^{\alpha k} \Vert {P_{k} (D_{y}) V}\Vert _{L^{2}}, 2^{-(1-\alpha )k} \Vert {\partial _{y} P_{k}(D_{y}) V}\Vert _{L^{2}}\}, \\ \Vert {P_{k}(D_{y}) {\mathcal {H}}V}\Vert _{L^{\infty }}&\lesssim \min \{2^{\alpha k} \Vert {P_{k} (D_{y}) V}\Vert _{L^{\infty }}, 2^{-(\frac{3}{2}-\alpha )k} \Vert {\partial _{y}^{2} P_{k}(D_{y}) V}\Vert _{L^{2}}\}. \end{aligned}$$

To see this (in particular, the \(L^{\infty }\) bound), note that

$$\begin{aligned}{} & {} P_{k}(D_{y}) e^{-b\alpha s}\Gamma (e^{b s} D_{y}) e^{b s} \partial _{y} V = 2^{\alpha k} K_{k} *V(y), \\{} & {} \hbox {where } 2^{\alpha k} K_{k} = {\mathcal {F}}^{-1}_{\xi _{y}}[i P_{k}(\xi _{y}) e^{-b\alpha s}\Gamma (e^{b s} \xi _{y}) e^{b s} \xi _{y}]. \end{aligned}$$

Indeed, by the assumptions on \(\Gamma \), the kernel of \(P_{k'}(D_{x}) \Gamma (D_{x}) \partial _{x}\) is of the form \(2^{\alpha k'} K_{k'}(x)\), where \(\int \vert {K_{k'}(x)}\vert \, \textrm{d}x \lesssim 1\) (independent of \(k'\)). By rescaling \(x = e^{b s} y\), we see that the kernel of \(P_{k}(D_{y}) e^{-b \alpha s} \Gamma (e^{b s} D_{y}) e^{b s} \partial _{y} \) is of the form \(2^{\alpha k} e^{-bs} K_{k-(\log 2)^{-1} b s}(e^{-bs} y)\), where the y-integral of \(e^{-bs} \vert {K_{k-(\log 2)^{-1} b s}(e^{-bs} y)}\vert \) is uniformly bounded in k. The desired bounds for the contribution of \(\Gamma \) in \({\mathcal {H}}\) now follow from Young’s inequality. A similar bound holds for \(\Upsilon \). \(\square \)

Next, we prove a sharp upper bound on the kernel of the operator \({\mathcal {H}}\).

Lemma 4.2

For each s, there exists a function \(K_{s} \in C^{\infty }({\mathbb {R}}{\setminus } \{0\})\) such that

$$\begin{aligned} {\mathcal {H}}V(y) = \int _{-\infty }^{\infty } K_{s}(y - y') \partial _{y} V(y') \, \textrm{d}y', \end{aligned}$$

where

$$\begin{aligned} \vert {K_{s}(y)}\vert + \vert {y}\vert \vert {\partial _{y} K_{s}(y)}\vert \lesssim \vert {y}\vert ^{-\max \{\alpha , \beta , 0\}}. \end{aligned}$$

Proof

Without loss of generality, assume \(\alpha = \beta \ge 0\). By the Fourier inversion formula, we have

$$\begin{aligned} P_{> 0}(D_{x}) \Gamma (D_{x}) \partial _{x} f&= \int _{-\infty }^{\infty } K(x - x') \partial _{x} f(x') \, \textrm{d}x', \\ P_{> 0}(D_{x}) \Upsilon (D_{x}) f&= \int _{-\infty }^{\infty } K'(x - x') \partial _{x} f(x') \, \textrm{d}x', \end{aligned}$$

where

$$\begin{aligned} \vert {K(z)}\vert + \vert {z}\vert \vert {\partial _{z} K(z)}\vert&\lesssim \vert {z}\vert ^{-\alpha }, \\ \vert {K'(z)}\vert + \vert {z}\vert \vert {\partial _{z} K'(z)}\vert&\lesssim \vert {z}\vert ^{-\alpha }. \end{aligned}$$

The desired statement now follows by applying the rescaling \(x = e^{b s} y\). \(\square \)

Finally, we formulate and prove a key commutator estimate for \({\mathcal {H}}\) in the weighted \(L^{2}\)-Sobolev space \(\dot{{\mathcal {H}}}_{<L}^{n}\) introduced earlier. For this purpose, it is instructive to generalize the weight in the semi-norm and introduce

$$\begin{aligned} \Vert {V}\Vert _{\dot{{\mathcal {H}}}_{< L}^{n, \nu }}= & {} \sup _{j \in {\mathbb {Z}}, \, 2^{j}< L} \left( \int _{2^{j-1}< \vert {y}\vert < 2^{j}} (\vert {y}\vert ^{n+\nu } \partial _{y}^{n} V)^{2} \, \textrm{d}y \right) ^{\frac{1}{2}} \\{} & {} + L^{n+\nu } \left( \int _{\vert {y}\vert > \frac{L}{2}} (\partial _{y}^{n} V)^{2} \, \textrm{d}y \right) ^{\frac{1}{2}}. \end{aligned}$$

Lemma 4.3

Let \(-\frac{1}{2}< \nu < \frac{1}{2}\), \(\ell \in \{0, 1, \ldots \}\) and \(L > 1\). Let \(\varpi \) be a smooth function satisfying one of the following assumptions:

  • Case 1. \({{\,\textrm{supp}\,}}\varpi \subseteq \{2^{j_{0}-1-c_{0}}< \vert {y}\vert < 2^{j_{0}+c_{0}}\}\) and \(0 \le \varpi \le C_{0} 2^{(\nu + \ell ) j_{0}}\) for some \(c_{0}, C_{0} > 0\) and \(j_{0} \in {\mathbb {Z}}\) such that \(2^{j_{0}} < L\), or

  • Case 2. \({{\,\textrm{supp}\,}}\varpi \subseteq \{\vert {y}\vert > 2^{j_{0}-1-c_{0}}\}\) and \(0 \le \varpi \le C_{0} 2^{(\nu + \ell ) j_{0}}\) for some \(c_{0}, C_{0} > 0\) and \(j_{0} = \lfloor \log _{2} L \rfloor \).

Then for any \(s \in {\mathbb {R}}\) and \(V \in {\mathcal {H}}_{< L}^{\ell +1, \nu }\), we have

$$\begin{aligned} \Vert {\varpi {\mathcal {H}}\partial _{y}^{\ell } V}\Vert _{L^{2}} \lesssim _{\alpha , \beta , \nu , c_{0}, C_{0}} 2^{-\max \{\alpha , \beta , 0\} j_{0}} (\Vert {V}\Vert _{\dot{{\mathcal {H}}}_{< L}^{0, \nu }} + \Vert {V}\Vert _{\dot{{\mathcal {H}}}_{< L}^{\ell +1, \nu }}), \end{aligned}$$
(40)

where the implicit constant is independent of s and L.

Suppose, in addition, that \(\vert {\varpi '}\vert \le C_{0} 2^{(\nu + \ell - 1)j_{0}}\) and \(\ell \ge 1\). Then for any \(s \in {\mathbb {R}}\) and \(V \in {\mathcal {H}}_{< L}^{\ell , \nu }\), we have

$$\begin{aligned} \Vert {[\varpi , {\mathcal {H}}] \partial _{y}^{\ell } V}\Vert _{L^{2}} \lesssim _{\alpha , \beta , \nu , c_{0}, C_{0}} 2^{-\max \{\alpha , \beta , 0\} j_{0}} (\Vert {V}\Vert _{\dot{{\mathcal {H}}}_{< L}^{0, \nu }} + \Vert {V}\Vert _{\dot{{\mathcal {H}}}_{< L}^{\ell , \nu }}), \end{aligned}$$
(41)

where the implicit constant is independent of s and L.

The fact that the bound \(-\frac{1}{2}< \nu < \frac{1}{2}\) is sharp (at least when \(\ell = 0\) and \(\alpha = 0\)) can be seen by taking \({\mathcal {H}}= P_{>0}(e^{bs} D_{y}) \textrm{H}\), where \(\textrm{H}\) is the Hilbert transform, and considering V localized in an annulus \(A_{j} = \{y \in {\mathbb {R}}: 2^{j-3}< \vert {y}\vert < 2^{j+2}\}\) (as in the proof below) with \(\vert {j-j_0}\vert \) large. In the proof of Lemma 5.5 below, this lemma will be applied with \(V = U'\) and \(\nu = \frac{1}{2} - \frac{1}{2k+1}\); indeed, observe that \(\Vert {U}\Vert _{\dot{{\mathcal {H}}}_{<L}^{n}} = \Vert {U'}\Vert _{\dot{{\mathcal {H}}}_{<L}^{n-1, \frac{1}{2} - \frac{1}{2k+1}}}\) for \(n \ge 1\).

Remark 4.4

We note that while (40) and (41) are sharp in terms of the spatial weights, they are not sharp in terms of regularity, as we are only working with integer regularity indices. Indeed, the orders of the operators \(\varpi {\mathcal {H}}\partial _{y}^{\ell }\) and \([\varpi , {\mathcal {H}}] \partial _{y}^{\ell }\) are \(\alpha + \ell \) and \(\alpha + \ell - 1\), respectively, while we are using \(\ell +1\) and \(\ell \) derivatives on the RHS, respectively (recall that \(0 \le \alpha < 1\)). A crucial point, however, is that the RHS of the commutator estimate (41) involves at most \(\ell \) derivatives, which is important for avoiding any loss of derivatives in Lemma 5.5.

Another important point is that (40) and (41) are independent of s and L. In particular, the only s-independent information we have on the symbol of \({\mathcal {H}}\) consists of the scale-invariant bounds

$$\begin{aligned} \vert {\partial _{\xi _{y}}^{N} {\mathcal {H}}(\xi _{y})}\vert \lesssim _{N} \vert {\xi _{y}}\vert ^{\max \{\alpha , \beta , 0\}-N}, \end{aligned}$$

which are essentially all we use about \({\mathcal {H}}\).

Proof

Without loss of generality, assume \(\alpha = \beta \ge 0\). In what follows, we suppress the dependence of implicit constants on \(\alpha \), \(\nu \), \(c_{0}\) and \(C_{0}\), and we simply write \(P_{k} = P_{k}(D_{y})\) (\(k \in {\mathbb {Z}}\)).

We introduce the following schematic notation: We denote by \(\tilde{P}_{k}\) (resp. \(\tilde{K}_{k}\)) any function, which may vary from expression to expression, that obeys the same support properties and bounds as \(P_{k}\) (at the level of the symbol) (resp. \(K_{k}\)), i.e., \({{\,\textrm{supp}\,}}\tilde{P}_{k} \subseteq \{\xi \in {\mathbb {R}}: 2^{k-5}< \vert {\xi }\vert < 2^{k+5}\}\) and \(\vert {(\xi \partial _{\xi })^{n} \tilde{P}_{k}(\xi )}\vert \lesssim _{n} 1\) (resp. \(\vert {\partial _{y}^{n} \tilde{K}_{k}(y)}\vert \lesssim _{N, n} \frac{2^{(1+n)k}}{\langle {2^{k} y}\rangle ^{n+N}}\)).

With the above conventions, we have the schematic identities \(P_{k} = \tilde{P}_{k}\) and

$$\begin{aligned} {\mathcal {H}}= \sum _{k} {\mathcal {H}}P_{k} = \sum _{k} 2^{\alpha k} \tilde{P}_{k}, \end{aligned}$$

where an important point in the last identity is that the implicit bounds for \(\tilde{P}_{k}\) are independent of s. Note also that any operator of the form \(\tilde{P}_{k}\) has a kernel of the form \(\tilde{P}_{k} V = \tilde{K}_{k} *V\).
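To see why \(2^{-\alpha k} {\mathcal {H}} P_{k}\) qualifies as a \(\tilde{P}_{k}\) (a quick check, using only the scale-invariant symbol bounds recorded in Remark 4.4 and the convention \(\alpha = \beta \ge 0\) of this proof):

```latex
% On \operatorname{supp} P_k we have |\xi_y| \sim 2^k, so the bounds
%   |\partial_{\xi_y}^{N} \mathcal{H}(\xi_y)| \lesssim_N |\xi_y|^{\alpha - N}
% give, for every n,
\begin{aligned}
\big| (\xi_y \partial_{\xi_y})^{n}
  \big( 2^{-\alpha k}\, \mathcal{H}(\xi_y)\, P_{k}(\xi_y) \big) \big|
\;\lesssim_{n}\; 1
\qquad \hbox{on } \operatorname{supp} P_{k},
\end{aligned}
% i.e., the renormalized symbol 2^{-\alpha k} \mathcal{H} P_k satisfies the same
% (s-independent) support and derivative bounds as P_k, which is the defining
% property of \tilde{P}_k.
```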

Next, we introduce a nonnegative smooth partition of unity \(\{\eta _{j}\}_{j \in {\mathbb {Z}}}\) on \({\mathbb {R}}\) subordinate to the open cover \(\{A_{j} = \{y \in {\mathbb {R}}: 2^{j-3}< \vert {y}\vert < 2^{j+2}\}\}_{j \in {\mathbb {Z}}}\). We shall write \(\eta _{\ge j} = \sum _{j' \ge j} \eta _{j'}\). We also introduce the shorthands

$$\begin{aligned} \breve{\eta }_{j_{0}} = 2^{-(\nu + \ell ) j_{0}} \varpi \hbox { in Case~1,} \quad \breve{\eta }_{\ge j_{0}} = 2^{-(\nu + \ell ) j_{0}} \varpi \hbox { in Case 2}. \end{aligned}$$

As the notation suggests, \(\breve{\eta }_{j_{0}}\) and \(\breve{\eta }_{\ge j_{0}}\) have similar support and upper bound properties as \(\eta _{j_{0}}\) and \(\eta _{\ge j_{0}}\), respectively, thanks to the hypothesis on \(\varpi \). However, note that we only have control of up to one derivative of \(\breve{\eta }_{j_{0}}\) and \(\breve{\eta }_{\ge j_{0}}\) since we only assume that \(|\varpi '| \le C_0 2^{(\nu + \ell -1)j_0}\), and we have no assumptions on higher derivatives.

Case 1, Step 1. We will use the following three bounds to treat the “non-local”, “low frequency” and “far-away input” cases, respectively: there exists a positive constant c independent of \(j, j_0\) and k such that, for \(\vert {j - j_{0}}\vert > c_{0}+5\) and \(k \ge - j_{0} - 5\),

$$\begin{aligned} 2^{(\nu + \ell ) j_{0}} 2^{(\alpha + \ell ) k} \Vert {\breve{\eta }_{j_{0}} \tilde{P}_{k} (\eta _{j} V)}\Vert _{L^{2}}&\lesssim 2^{-\alpha j_{0}} 2^{-c \vert {j_{0} + k}\vert } 2^{-c \vert {j - j_{0}}\vert } \Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{0, \nu }}, \end{aligned}$$
(42)

and for \(k < - j_{0} - 5\),

$$\begin{aligned} 2^{(\nu + \ell ) j_{0}} 2^{(\alpha + \ell ) k} \Vert {\breve{\eta }_{j_{0}} \tilde{P}_{k} (\eta _{j} V)}\Vert _{L^{2}}&\lesssim 2^{-\alpha j_{0}} 2^{-c \vert {j_{0} + k}\vert } 2^{-c \vert {j + k}\vert } \Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{0, \nu }}, \end{aligned}$$
(43)

and

$$\begin{aligned} 2^{(\nu + \ell ) j_{0}} 2^{(\alpha + \ell ) k} \Vert {\breve{\eta }_{j_{0}} \tilde{P}_{k} (\eta _{\ge \log _{2} L} V)}\Vert _{L^{2}}&\lesssim 2^{-\alpha j_{0}} 2^{-c \vert {j_{0} + k}\vert } 2^{-c \vert {\log _{2} L - j_{0}}\vert } \Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{0, \nu }}. \end{aligned}$$
(44)

We defer their proofs for a moment and prove (40) and (41) assuming (42)–(44).

Case 1, Step 1.(a). To prove (40), we begin by expanding

$$\begin{aligned} \varpi {\mathcal {H}}\partial _{y}^{\ell } V&= \sum _{j, k : 2^{j} < L} 2^{(\nu + \ell ) j_{0}} 2^{(\alpha + \ell ) k} \breve{\eta }_{j_{0}} \tilde{P}_{k} (\eta _{j} V)\\&\quad + \sum _{k} 2^{(\nu + \ell ) j_{0}} 2^{(\alpha + \ell ) k} \breve{\eta }_{j_{0}} \tilde{P}_{k} (\eta _{\ge \log _{2} L} V) \\&=: \textrm{I}_{near} + \textrm{I}_{far}. \end{aligned}$$

The term \(\textrm{I}_{far}\) can be treated using (44), so it only remains to estimate \(\textrm{I}_{near}\). Unless \(\vert {j - j_{0}}\vert \le c_{0} + 5\) and \(k \ge - j_{0} - 5\), we can apply (42) and (43) to obtain an acceptable bound for the summand. When \(\vert {j - j_{0}}\vert \le c_{0} + 5\) and \(k \ge - j_{0} - 5\), we use the schematic identity \(\tilde{P}_{k} = 2^{-(\ell +1) k} \tilde{P}_{k} \partial _{y}^{\ell +1}\) to simply bound

$$\begin{aligned} 2^{(\nu + \ell ) j_{0}} 2^{(\alpha + \ell ) k} \Vert {\breve{\eta }_{j_{0}} \tilde{P}_{k} (\eta _{j} V)}\Vert _{L^{2}}&\lesssim 2^{- \alpha j_{0}} 2^{- (1-\alpha ) (j_{0}+k)} (\Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{0, \nu }} + \Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{\ell +1, \nu }}), \end{aligned}$$

which can be summed up in \(k \ge -j_{0} - 5\).
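For the reader's convenience, the summation in \(k\) is a convergent geometric series (a routine check, recorded here for concreteness):

$$\begin{aligned} \sum _{k \ge -j_{0}-5} 2^{-(1-\alpha )(j_{0}+k)} = \sum _{m \ge -5} 2^{-(1-\alpha ) m} \lesssim _{\alpha } 1 \quad \hbox {since } 0 \le \alpha < 1, \end{aligned}$$

so the factor \(2^{-\alpha j_{0}}\) survives the summation, and the sum over the finitely many \(j\) with \(\vert {j - j_{0}}\vert \le c_{0} + 5\) only contributes a constant.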

Case 1, Step 1.(b). Now we prove the commutator estimate (41). We begin by making the following decomposition:

$$\begin{aligned} {[}\breve{\eta }_{j_{0}}, {\mathcal {H}}] \partial _{y}^{\ell } V&= \sum _{k \ge -j_{0}-5} 2^{(\nu + \ell ) j_{0}} 2^{\alpha k} [\breve{\eta }_{j_{0}}, \tilde{P}_{k}] \partial _{y}^{\ell } V + \sum _{k< -j_{0}-5} 2^{(\nu + \ell ) j_{0}} 2^{\alpha k} \breve{\eta }_{j_{0}} \tilde{P}_{k} \partial _{y}^{\ell } V \nonumber \\&\quad + \sum _{k < -j_{0}-5} 2^{(\nu + \ell ) j_{0}} 2^{\alpha k} \tilde{P}_{k} ( \breve{\eta }_{j_{0}} \partial _{y}^{\ell } V) \nonumber \\&=: \textrm{I} + \textrm{II} + \textrm{III}. \end{aligned}$$
(45)

We treat each term in (45) as follows. For \(\textrm{I}\), we start by writing

$$\begin{aligned} \begin{aligned} \textrm{I}&= \sum _{j, k: \vert {j - j_{0}}\vert \le c_{0} + 5,\, k \ge -j_{0}-5} 2^{(\nu + \ell ) j_{0}} 2^{\alpha k} [\breve{\eta }_{j_{0}}, \tilde{P}_{k}] \partial _{y}^{\ell } (\eta _{j} V) \\&\quad + \sum _{j, k: \vert {j - j_{0}}\vert > c_{0} + 5,\, k \ge -j_{0}-5} 2^{(\nu + \ell ) j_{0}} 2^{\alpha k} [\breve{\eta }_{j_{0}}, \tilde{P}_{k}] \partial _{y}^{\ell } (\eta _{j} V). \end{aligned} \end{aligned}$$
(46)

To treat the first sum on the RHS of (46), we make use of the commutator structure. We write

$$\begin{aligned} {[}\breve{\eta }_{j_{0}}, \tilde{P}_{k}] \tilde{V}&= \int \tilde{K}_{k}(y-y') (\breve{\eta }_{j_{0}}(y) - \breve{\eta }_{j_{0}}(y')) \tilde{V}(y') \, \textrm{d}y', \end{aligned}$$

where \(\tilde{V} = \partial _{y}^{\ell } (\eta _{j} V)\). Then using the bound for \(\breve{\eta }_{j_{0}}'\) (which comes from that for \(\varpi '\)) and the \(O(2^{k})\)-localization of \(\tilde{K}_{k}\), the kernel on the RHS can be written as \(2^{-k-j_{0}} \breve{K}(y, y')\), where \(\sup _{y} \Vert {\breve{K}(y, \cdot )}\Vert _{L^{1}}\) and \(\sup _{y'} \Vert {\breve{K}(\cdot , y')}\Vert _{L^{1}}\) are bounded by an absolute constant. Hence, by Schur’s test,

$$\begin{aligned} \Vert {2^{(\nu + \ell ) j_{0}} 2^{\alpha k} [\breve{\eta }_{j_{0}}, \tilde{P}_{k}] \partial _{y}^{\ell } (\eta _{j} V)}\Vert _{L^{2}}&\lesssim 2^{(-\alpha + \nu + \ell ) j_{0}} 2^{-(1-\alpha )(j_{0}+ k)} \Vert {\partial _{y}^{\ell }(\eta _{j} V)}\Vert _{L^{2}} \\&\lesssim 2^{-\alpha j_{0}} 2^{-(1-\alpha )(j_{0}+ k)} (\Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{0, \nu }} + \Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{\ell , \nu }}), \end{aligned}$$

which is summable in the range \(\{(j, k): \vert {j - j_{0}}\vert \le c_{0} + 5,\, k \ge -j_{0}-5\}\).

For the second sum on the RHS of (46), we simply note that \([\breve{\eta }_{j_{0}}, \tilde{P}_{k}] \partial _{y}^{\ell } (\eta _{j} V) = \breve{\eta }_{j_{0}} \tilde{P}_{k} \partial _{y}^{\ell } (\eta _{j} V)\) by the support properties of \(\breve{\eta }_{j_{0}}\) and \(\eta _{j}\). Hence, we may apply (42), which is acceptable in the range \(\{(j, k): \vert {j - j_{0}}\vert > c_{0} + 5,\, k \ge -j_{0}-5\}\).

Next, the term \(\textrm{II}\) in (45) is directly bounded using (43).

Finally, we turn to the term \(\textrm{III}\) in (45). We write

$$\begin{aligned} \sum _{k< -j_{0}-5} 2^{(\nu + \ell ) j_{0}} 2^{\alpha k} \tilde{P}_{k} ( \breve{\eta }_{j_{0}} \partial _{y}^{\ell } V) = \sum _{j, k : k < -j_{0}-5} 2^{(\nu + \ell ) j_{0}} 2^{\alpha k} \tilde{P}_{k} ( \breve{\eta }_{j_{0}} \partial _{y}^{\ell } (\eta _{j} V)). \end{aligned}$$

By the support properties of \(\breve{\eta }_{j_{0}}\) and \(\eta _{j}\), the summand vanishes unless \(\vert {j_{0} - j}\vert \le c_{0} + 5\). In that case, we have

$$\begin{aligned} 2^{(\nu + \ell ) j_{0}} 2^{\alpha k} \Vert {\tilde{P}_{k} (\breve{\eta }_{j_{0}} \partial _{y}^{\ell }(\eta _{j} V))}\Vert _{L^{2}}&\lesssim 2^{(\nu + \ell +\frac{1}{2}) j_{0}} 2^{(\alpha +\frac{1}{2}) k} \Vert {\partial _{y}^{\ell }(\eta _{j} V)}\Vert _{L^{2}} \\&\lesssim 2^{\frac{1}{2} j_{0}} 2^{(\alpha +\frac{1}{2}) k} (\Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{0, \nu }}+ \Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{\ell , \nu }}), \end{aligned}$$

where on the first line we used the \(L^{1} \hookrightarrow L^{2}\) Bernstein inequality and \(\Vert {\breve{\eta }_{j_{0}}}\Vert _{L^{2}} \lesssim 2^{\frac{1}{2} j_{0}}\). The above bound is acceptable in the range \(\{(j, k): \vert {j - j_{0}}\vert \le c_{0} + 5,\, k < -j_{0}-5\}\).
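The two ingredients just invoked can be written out as follows (a standard computation; \(g\) denotes an arbitrary \(L^{1}\) function):

$$\begin{aligned} \Vert {\tilde{P}_{k} g}\Vert _{L^{2}} \lesssim 2^{\frac{1}{2} k} \Vert {g}\Vert _{L^{1}}, \qquad \Vert {\breve{\eta }_{j_{0}} \partial _{y}^{\ell }(\eta _{j} V)}\Vert _{L^{1}} \le \Vert {\breve{\eta }_{j_{0}}}\Vert _{L^{2}} \Vert {\partial _{y}^{\ell }(\eta _{j} V)}\Vert _{L^{2}} \lesssim 2^{\frac{1}{2} j_{0}} \Vert {\partial _{y}^{\ell }(\eta _{j} V)}\Vert _{L^{2}}, \end{aligned}$$

where the first bound is the one-dimensional \(L^{1} \hookrightarrow L^{2}\) Bernstein inequality on frequencies of size \(2^{k}\) and the second is the Cauchy–Schwarz inequality.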

Case 1, Step 2. It remains to prove (42), (43), and (44). We start with the following bound for the kernel \(\tilde{K}_{k}\) of \(\tilde{P}_{k}\): for \(\vert {j - j_{0}}\vert > c_{0} + 5\) and any \(N \ge 0\),

$$\begin{aligned} \vert {\breve{\eta }_{j_{0}}(y) \tilde{K}_{k}(y - y') \eta _{j}(y')}\vert \lesssim _{N} 2^{k} 2^{-N(\max \{j, j_{0}\} + k)}. \end{aligned}$$
(47)

Indeed, (47) follows from the bound for \(\tilde{K}_{k}\) and the simple fact that \(\vert {y - y'}\vert \simeq 2^{\max \{j, j_{0}\}}\) if \(\vert {y}\vert \simeq 2^{j_{0}}\), \(\vert {y'}\vert \simeq 2^{j}\) and \(\vert {j - j_{0}}\vert > c_{0} + 5\).

As a result, we have \(\Vert {\breve{\eta }_{j_{0}} \tilde{P}_{k} (\eta _{j} V)}\Vert _{L^{\infty }} \lesssim _{N} 2^{k} 2^{-N(\max \{j, j_{0}\} + k)} \Vert {\eta _{j} V}\Vert _{L^{1}}\). By two applications of Hölder’s inequality, as well as \(\Vert {\eta _{j} V}\Vert _{L^{2}} \lesssim 2^{-\nu j} \Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{0, \nu }}\), we have

$$\begin{aligned}&2^{(\nu + \ell ) j_{0}} 2^{(\alpha + \ell ) k} \Vert {\breve{\eta }_{j_{0}} \tilde{P}_{k}(\eta _{j} V)}\Vert _{L^{2}}\\&\quad \lesssim _{N} 2^{-\alpha j_{0}} 2^{(\nu + \frac{1}{2} + \alpha + \ell ) (j_{0} + k)} 2^{(-\nu +\frac{1}{2}) (j+k)} 2^{-N(\max \{j, j_{0}\} + k)} \Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{0, \nu }}. \end{aligned}$$
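To make the role of the last factor transparent, note the elementary identity (a routine observation, not from the original text):

$$\begin{aligned} \max \{j, j_{0}\} + k = \tfrac{1}{2} (j_{0} + k) + \tfrac{1}{2} (j + k) + \tfrac{1}{2} \vert {j - j_{0}}\vert , \end{aligned}$$

which shows how the single factor \(2^{-N(\max \{j, j_{0}\} + k)}\) can be distributed among the three quantities \(j_{0}+k\), \(j+k\) and \(\vert {j - j_{0}}\vert \) appearing in (42)–(43).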

By choosing N to be appropriately large, (42) and (43) in the case \(j > - k + c_{0}\) follow (note that in the last case, \(j_{0} < - k - 5\), so \(j > j_{0} + c_{0} + 5\)). To treat the remaining cases in (43), namely \(k < - j_{0} -5\) and \(j \le -k + c_{0}\), we simply use the Hölder and Bernstein inequalities to estimate

$$\begin{aligned} 2^{(\nu + \ell ) j_{0}} 2^{(\alpha + \ell ) k} \Vert {\breve{\eta }_{j_{0}} \tilde{P}_{k}(\eta _{j} V)}\Vert _{L^{2}}&\lesssim 2^{(\nu + \frac{1}{2} + \ell ) j_{0}} 2^{(\alpha + \ell ) k} \Vert {\tilde{P}_{k} (\eta _{j} V)}\Vert _{L^{\infty }}\\&\lesssim 2^{(\nu + \frac{1}{2} + \ell ) j_{0}} 2^{(\alpha + 1 + \ell ) k} \Vert {\eta _{j} V}\Vert _{L^{1}} \\&\lesssim 2^{-\alpha j_{0}} 2^{(\nu + \frac{1}{2} + \alpha + \ell ) (j_{0} + k)} 2^{(-\nu +\frac{1}{2}) (j+k)} \Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{0, \nu }}. \end{aligned}$$

Finally, to prove (44), we first split

$$\begin{aligned} \eta _{\ge \log _{2} L} V = \sum _{j: \log _{2} L \le j < \log _{2} L + c_{0}+10} \eta _{j} V + \eta _{\ge \log _{2} L + c_{0} + 10} V. \end{aligned}$$

Observe that the contribution of the first term can be treated using (42) and (43). For the remaining piece, thanks to the spatial separation between the supports of \(\breve{\eta }_{j_{0}}\) and \(\eta _{\ge \log _{2} L + c_{0} + 10}\), (47) implies \(\Vert {\breve{\eta }_{j_{0}} \tilde{P}_{k} (\eta _{\ge \log _{2} L + c_{0} + 10} V)}\Vert _{L^{\infty }} \lesssim _{N} 2^{\frac{1}{2} k} 2^{-\frac{N}{2} (\log _{2} L + k)} \Vert {\eta _{\ge \log _{2} L + c_{0} + 10} V}\Vert _{L^{2}}\). Hence, by Hölder’s inequality,

$$\begin{aligned}&2^{(\nu + \ell ) j_{0}} 2^{(\alpha + \ell ) k} \Vert {\breve{\eta }_{j_{0}} \tilde{P}_{k}(\eta _{\ge \log _{2} L + c_{0} + 10} V)}\Vert _{L^{2}} \\&\quad \lesssim _{N} 2^{-\alpha j_{0}} 2^{(\nu + \frac{1}{2} + \alpha + \ell ) (j_{0} + k)} 2^{-(\frac{N}{2} + \nu )(\log _{2} L + k)} \Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{0, \nu }}. \end{aligned}$$

By choosing N appropriately, (44) follows.

Case 2, Step 1. In this case, \(L \simeq _{c_{0}} 2^{j_{0}}\). We will use the following bound to treat the “nearby input” case: for \(j < \log _{2} L\),

$$\begin{aligned} 2^{(\nu + \ell ) j_{0}} 2^{(\alpha + \ell ) k} \Vert {\breve{\eta }_{\ge j_{0}} \tilde{P}_{k} (\eta _{j} V)}\Vert _{L^{2}}&\lesssim 2^{-\alpha j_{0}} 2^{-c \vert {j + k}\vert } 2^{-c \vert {j_{0} - j}\vert } \Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{0, \nu }}. \end{aligned}$$
(48)

We defer its proof for a moment and prove (40) and (41) assuming (48).

Case 2, Step 1.(a). As before, to prove (40), we expand

$$\begin{aligned} \varpi {\mathcal {H}}\partial _{y}^{\ell } V&= \sum _{j, k: 2^{j} < L} 2^{(\nu + \ell ) j_{0}} 2^{(\alpha + \ell ) k} \breve{\eta }_{\ge j_{0}} \tilde{P}_{k} (\eta _{j} V) \\&\quad + \sum _{k} 2^{(\nu + \ell ) j_{0}} 2^{(\alpha + \ell ) k} \breve{\eta }_{\ge j_{0}} \tilde{P}_{k} (\eta _{\ge \log _{2} L} V) =: \textrm{I}_{near} + \textrm{I}_{far}. \end{aligned}$$

This time, the term \(\textrm{I}_{near}\) can be treated using (48), so it only remains to estimate \(\textrm{I}_{far}\). By almost orthogonality (in case \(\alpha + \ell = 0\)) or interpolation (in case \(\alpha + \ell > 0\)), it is straightforward to prove

$$\begin{aligned}&\left\| \sum _{k} 2^{(\nu + \ell ) j_{0}} 2^{(\alpha + \ell ) k} \breve{\eta }_{\ge j_{0}} \tilde{P}_{k} (\eta _{\ge \log _{2} L} V)\right\| _{L^{2}} \\&\quad \lesssim 2^{(\nu + \ell ) j_{0}} \Vert {\eta _{\ge \log _{2} L} V}\Vert _{L^{2}}^{\frac{1 -\alpha }{\ell +1}} \Vert {\partial _{y}^{\ell +1} (\eta _{\ge \log _{2} L} V)}\Vert _{L^{2}}^{\frac{\ell +\alpha }{\ell +1}}, \end{aligned}$$

which is acceptable.
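In the case \(\alpha + \ell > 0\), the interpolation can be sketched via a high–low splitting at a frequency \(2^{K}\) (a routine computation, included for completeness; we abbreviate \(f = \eta _{\ge \log _{2} L} V\)):

$$\begin{aligned} \sum _{k \le K} 2^{(\alpha + \ell ) k} \Vert {\tilde{P}_{k} f}\Vert _{L^{2}} \lesssim 2^{(\alpha + \ell ) K} \Vert {f}\Vert _{L^{2}}, \qquad \sum _{k > K} 2^{(\alpha + \ell ) k} \Vert {\tilde{P}_{k} f}\Vert _{L^{2}} \lesssim 2^{-(1-\alpha ) K} \Vert {\partial _{y}^{\ell +1} f}\Vert _{L^{2}}, \end{aligned}$$

and choosing \(2^{(\ell +1) K} \simeq \Vert {\partial _{y}^{\ell +1} f}\Vert _{L^{2}} / \Vert {f}\Vert _{L^{2}}\) balances the two contributions, producing the exponents \(\frac{1-\alpha }{\ell +1}\) and \(\frac{\ell +\alpha }{\ell +1}\) in the previous display.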

Case 2, Step 1.(b). To prove (41), we expand

$$\begin{aligned} {[}\varpi , {\mathcal {H}}] \partial _{y}^{\ell } V&= \sum _{k \ge - j_{0} - 5} 2^{(\nu + \ell ) j_{0}} 2^{\alpha k} [\breve{\eta }_{\ge j_{0}}, \tilde{P}_{k}] \partial _{y}^{\ell } (\eta _{\ge \log _{2} L - c_{0} - 10} V) \\&\quad + \sum _{k< - j_{0} -5} 2^{(\nu + \ell ) j_{0}} 2^{\alpha k} [\breve{\eta }_{\ge j_{0}}, \tilde{P}_{k}] \partial _{y}^{\ell } (\eta _{\ge \log _{2} L - c_{0} - 10} V) \\&\quad + \sum _{j, k : j < \log _{2} L - c_{0} - 10} 2^{(\nu + \ell ) j_{0}} 2^{\alpha k} [\breve{\eta }_{\ge j_{0}}, \tilde{P}_{k}] \partial _{y}^{\ell } (\eta _{j} V) =: \textrm{I}' + \textrm{II}' + \textrm{III}'. \end{aligned}$$

For \(\textrm{I}'\), we argue as in term \(\textrm{I}\) in Case 1, Step 1.(b) using the commutator structure and bound

$$\begin{aligned} \Vert {\textrm{I}'}\Vert _{L^{2}}&\lesssim 2^{-\alpha j_{0}} \sum _{k \ge -j_{0} - 5} 2^{(\nu +\ell ) j_{0}} 2^{-(1-\alpha ) (j_{0}+k)} \Vert {\partial _{y}^{\ell } (\eta _{\ge \log _{2} L - c_{0} - 10} V)}\Vert _{L^{2}}\\&\lesssim 2^{-\alpha j_{0}} (\Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{0, \nu }} + \Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{\ell , \nu }}). \end{aligned}$$

For \(\textrm{II}'\), we expand the commutator expression and write

$$\begin{aligned} \textrm{II}'&= \sum _{k< - j_{0} -5} 2^{(\nu + \ell ) j_{0}} 2^{\alpha k} \breve{\eta }_{\ge j_{0}} \tilde{P}_{k} \partial _{y}^{\ell } (\eta _{\ge \log _{2} L - c_{0} - 10} V) \\&\quad - \sum _{k < - j_{0} -5} 2^{(\nu + \ell ) j_{0}} 2^{\alpha k} \tilde{P}_{k}(\breve{\eta }_{\ge j_{0}} \partial _{y}^{\ell } (\eta _{\ge \log _{2} L - c_{0} - 10} V) ). \end{aligned}$$

We simply bound \(2^{\alpha k} \lesssim 2^{- \alpha j_{0}}\), then use almost orthogonality to estimate

$$\begin{aligned} \Vert {\textrm{II}'}\Vert _{L^{2}} \lesssim 2^{- \alpha j_{0}} 2^{(\nu + \ell ) j_{0}} \Vert {\partial _{y}^{\ell } (\eta _{\ge \log _{2} L - c_{0} - 10} V)}\Vert _{L^{2}} \lesssim 2^{- \alpha j_{0}} (\Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{0, \nu }} + \Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{\ell , \nu }}). \end{aligned}$$

For \(\textrm{III}'\), we simply observe that, by the support properties of \(\breve{\eta }_{\ge j_{0}}\) and \(\eta _{j}\),

$$\begin{aligned} \textrm{III}' = \sum _{j, k: j < \log _{2} L - c_{0} - 10} 2^{(\nu + \ell ) j_{0}} 2^{\alpha k} \breve{\eta }_{\ge j_{0}} \tilde{P}_{k} \partial _{y}^{\ell } (\eta _{j} V), \end{aligned}$$

which can be treated using (48).

Case 2, Step 2. It remains to prove (48). Recall that \(L \simeq _{c_{0}} 2^{j_{0}}\) and \(j < \log _{2} L\). We first split

$$\begin{aligned} \breve{\eta }_{\ge j_{0}} = \sum _{j': \log _{2} L \le j' < \log _{2} L + c_{0}+10} \eta _{j'} \breve{\eta }_{\ge j_{0}} + \eta _{\ge \log _{2} L + c_{0} + 10} \breve{\eta }_{\ge j_{0}}. \end{aligned}$$

Each summand in the first sum obeys the same properties as \(\breve{\eta }_{j'}\), so its contribution may be treated by (42) and (43). Let us abbreviate the second term by \(\breve{\eta }_{\ge \log _{2} L + c_{0} + 10}\). Thanks to the spatial separation property, (47) implies \(\Vert {\breve{\eta }_{\ge \log _{2} L + c_{0} + 10} \tilde{P}_{k} (\eta _{j} V)}\Vert _{L^{2}} \lesssim _{N} 2^{\frac{1}{2} k} 2^{-\frac{N}{2} (j_{0} + k)} \Vert {\eta _{j} V}\Vert _{L^{\infty }}\). Hence, by Hölder’s inequality,

$$\begin{aligned}&2^{(\nu + \ell ) j_{0}} 2^{(\alpha + \ell ) k} \Vert {\breve{\eta }_{\ge \log _{2} L + c_{0} + 10} \tilde{P}_{k}(\eta _{j} V)}\Vert _{L^{2}}\\&\quad \lesssim _{N} 2^{-\alpha j_{0}} 2^{(\nu + \alpha + \ell - \frac{N}{2}) (j_{0} + k)} 2^{(-\nu + \frac{1}{2})(j + k)} \Vert {V}\Vert _{\dot{{\mathcal {H}}}_{<L}^{0, \nu }}. \end{aligned}$$

By choosing N appropriately, (48) follows. \(\square \)

5 Estimates on U

In this section, we establish the improved bootstrap bounds in Lemma 3.6 that involve U.

5.1 Non-Top-Order Forcing Term Estimates

To prepare for the ensuing analysis, we establish some bounds for the forcing term involving \({\mathcal {H}}\) and \({\mathcal {L}}\).

Lemma 5.1

Assume the hypotheses of Lemma 3.6. Then the following \(L^{2}\) bounds hold:

$$\begin{aligned}&e^{-\mu s} \Vert {\partial _{y}^{(j)} {\mathcal {H}}(U)(s, \cdot )}\Vert _{L^{2}} \le C A e^{-\mu s} \quad \textrm{for } \,1 \le j \le 2k+2, \end{aligned}$$
(49)
$$\begin{aligned}&e^{- s} \Vert {\partial _{y}^{(j)} {\mathcal {L}}\left( U+e^{(b-1)s} \kappa \right) }\Vert _{L^{2}} \le C (1+\kappa _{0}) e^{-(2+(j-\frac{3}{2})b) s} \quad \textrm{for }\, j \ge 0. \end{aligned}$$
(50)

Moreover, the following pointwise bounds hold:

$$\begin{aligned}&e^{- \mu s} \Vert {{\mathcal {H}}(U)(s, \cdot )}\Vert _{L^{\infty }(-4, 4)} \le C (A+\kappa _{0}) e^{-\mu _{0} s}, \end{aligned}$$
(51)
$$\begin{aligned}&e^{-\mu s} \Vert {\partial _{y}^{(j)} {\mathcal {H}}(U)(s, \cdot )}\Vert _{L^{\infty }} \le C A e^{-\mu s} \quad \textrm{for }\, 1 \le j \le 2k+1, \end{aligned}$$
(52)
$$\begin{aligned}&e^{- s} \Vert {\partial _{y}^{(j)} {\mathcal {L}}\left( U+e^{(b-1)s} \kappa \right) }\Vert _{L^{\infty }} \le C (1+\kappa _{0}) e^{-(2+(j-1)b) s} \quad \textrm{for }\, j \ge 0. \end{aligned}$$
(53)

These bounds will be useful for the proof of essentially all non-top-order estimates (with the sole exception of the weighted \(L^{2}\)-Sobolev bound (IB4)). On the other hand, to estimate \(\partial _{y}^{2k+3} U\), we shall rely instead on the dispersive/dissipative property of \(\Gamma \)/\(\Upsilon \) and appropriate commutator estimates; see Lemmas 5.4 and 5.5 below.

Proof

Bounds (50) and (53) for \(e^{-s} {\mathcal {L}}\) immediately follow by combining (36) and (37), respectively, with Lemma 3.11. On the other hand, to prove (49) and (52), note that, by (B1), we have \(\Vert {U'}\Vert _{L^{\infty }} \le C\). Moreover, by (B2) (for \(\vert {y}\vert \le 1\)) and (B4) (for \(\vert {y}\vert \ge 1\)), we have

$$\begin{aligned} \Vert {U'}\Vert _{L^{2}} \le C A. \end{aligned}$$
(54)

Recall also that \(\Vert {\partial _{y}^{2k+3} U}\Vert _{L^{2}} \le 2 A\) by (B3). Therefore, (49) and (52) follow from (38) and (39), respectively.

To prove the remaining bound (51), we apply Lemma 4.2. By introducing a smooth partition of unity (in the variable \(y'\)) subordinate to \(\{\vert { y' }\vert< 16\} \cup \{8<\vert { y' }\vert< 2 e^{bs}\} \cup \{e^{bs}<\vert { y' }\vert \}\) and using the bounds for \(K_{s}\) and \(\partial _{y} K_{s}\) in Lemma 4.2, we may estimate

$$\begin{aligned} e^{- \mu s} |{\mathcal {H}}(U)(s, y)|&\le e^{-\mu s} \left| \int _{-\infty }^{\infty } K_{s}(y-y') \partial _{y} U(s, y') \, \textrm{d}y' \right| \\&\lesssim e^{-\mu s} \int _{\vert {y'}\vert \le 16} \vert {y-y'}\vert ^{-\max \{\alpha , \beta , 0\}} \vert {\partial _{y} U(s, y')}\vert \, \textrm{d}y'\\&\quad + e^{-\mu s} \int _{8 \le \vert {y'}\vert \le 2 e^{b s}} \vert {y'}\vert ^{-\max \{\alpha , \beta , 0\}-1} \vert {U(s, y')}\vert \, \textrm{d}y' \\&\quad + e^{-\mu s} \int _{\vert {y'}\vert \ge e^{b s}} \vert {y'}\vert ^{-\max \{\alpha , \beta , 0\}-1} \vert {U + e^{(b-1)s} \kappa }\vert (s, y') \, \textrm{d}y' \\&\lesssim e^{-\mu s} \Vert {\partial _{y} U}\Vert _{L^{\infty }} + e^{-\mu s} \Vert {\vert {y}\vert ^{-\frac{1}{2k+1}} U}\Vert _{L^{\infty }(8, 2e^{bs})} \\&\qquad \int _{8 \le \vert {y'}\vert \le 2 e^{b s}} \vert {y'}\vert ^{-\max \{\alpha , \beta , 0\}-1+\frac{1}{2k+1}} \, \textrm{d}y' \\&\quad + e^{-\mu s} e^{-(\frac{1}{2}+\max \{\alpha , \beta , 0\}) b s} \Vert {U+e^{(b-1)s} \kappa }\Vert _{L^{2}} . \end{aligned}$$

In the second inequality, we used integration by parts and the property that \(\vert {y - y'}\vert \simeq \vert {y'}\vert \) for \(\vert {y'}\vert \ge 8\) (since \(\vert {y}\vert \le 4\)). When \(\max \{\alpha , \beta , 0\} \ne \frac{1}{2k+1}\), by (B1), (B2), (B4), (35) and (28),

$$\begin{aligned} e^{- \mu s} \vert {{\mathcal {H}}(U)(s, y)}\vert&\lesssim e^{-\mu s} \left( 1 + \max \{1, e^{(\frac{1}{2k+1}- \max \{\alpha , \beta , 0\}) bs}\} \right. \\&\left. \quad + e^{-(\frac{1}{2}+\max \{\alpha , \beta , 0\}) b s} e^{(\frac{3}{2} b -1) s} \right) \\&\lesssim e^{-\min \{\mu , \frac{2k-1}{2k}\} s}. \end{aligned}$$

Indeed, note that \(\mu = 1 - \max \{\alpha , \beta , 0\} b\). Therefore,

$$\begin{aligned} -\mu + \left( \tfrac{1}{2k+1} - \max \{\alpha , \beta , 0\}\right) b&= - 1 + \max \{\alpha , \beta , 0\} b + \tfrac{1}{2k} \\&\quad - \max \{\alpha , \beta , 0\} b = - \tfrac{2k-1}{2k}, \\ -\mu - \left( \tfrac{1}{2} + \max \{\alpha , \beta , 0\}\right) b + \left( \tfrac{3}{2} b - 1\right)&= - 1 + \max \{\alpha , \beta , 0\} b - \tfrac{1}{2} b \\&\quad + \max \{\alpha , \beta , 0\} b + \tfrac{3}{2} b - 1 \\&= - 2 + b = \tfrac{-4k + 2k + 1}{2k} = - \tfrac{2k-1}{2k}. \end{aligned}$$

Since \(\mu _{0} = \min \{\mu , \frac{2k-1}{2k}\}\) in this case, (51) follows. On the other hand, in case \(\max \{\alpha , \beta , 0\} = \frac{1}{2k+1}\), the same computation applies except that the second term is bounded instead by \(C A b s\); however, such a modification is acceptable since \(0< \mu _{0} < \frac{2k-1}{2k}\). \(\square \)
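The two exponent computations above can be double-checked with exact rational arithmetic. The following standalone snippet (a sanity check, not part of the proof) treats \(m = \max \{\alpha , \beta , 0\}\) as a free rational parameter and uses \(b = \frac{2k+1}{2k}\), equivalent to the identity \(-2 + b = -\frac{2k-1}{2k}\) from the last display:

```python
from fractions import Fraction as F

def exponent_identities_hold(k, m):
    """Verify the two exponent computations in the proof of (51).

    Here b = (2k+1)/(2k) and mu = 1 - m*b, where m stands for
    max{alpha, beta, 0} (kept as a free rational parameter)."""
    b = F(2 * k + 1, 2 * k)            # equivalent to -2 + b = -(2k-1)/(2k)
    mu = 1 - m * b
    target = -F(2 * k - 1, 2 * k)
    first = -mu + (F(1, 2 * k + 1) - m) * b               # first computation
    second = -mu - (F(1, 2) + m) * b + (F(3, 2) * b - 1)  # second computation
    return first == target and second == target

# the identities hold for every k >= 1, independently of the value of m
print(all(exponent_identities_hold(k, F(p, 10))
          for k in range(1, 8) for p in range(0, 10)))  # True
```

Note that \(m\) cancels in both computations, which is why the final rate \(-\frac{2k-1}{2k}\) does not depend on \(\max \{\alpha , \beta , 0\}\).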

5.2 Estimates on \(\partial _y U\)

Next, we proceed to obtain pointwise estimates for the low derivative \(\partial _yU\) using the method of characteristics. It is important that the bounds proved below (in particular, items 3–4) are independent of A. On the other hand, we need not obtain sharp pointwise bounds for \(\partial _{y} U\) at this point, as they would follow from the weighted \(L^{2}\)-Sobolev bounds (IB4) and (IB5) (cf. derivation of (34) in the proof of Theorem 3.1).

Lemma 5.2

There exist \(\epsilon _{0}\), A, \(y_0\), \(\gamma \), \(\sigma _{0}\) such that if the initial data conditions (D1)–(D4) are satisfied and the bootstrap assumptions (B1)–(B7) hold for \(s \in [\sigma _{0}, \sigma _{1}]\), then the following conclusions hold:

  1.

    For all \(s \in [\sigma _{0}, \sigma _{1}]\), and all \(|y|> y_0\), we have \((by + (1+e^s\tau _s)U(s,y)) y > 0\) (the flow is repulsive).

  2.

    (IB1) and (IB2) hold.

  3.

    For \(|y| \ge 4\) and \(s \in [\sigma _{0}, \sigma _{1}]\), we have

    $$\begin{aligned} {-}U'(s,y) \le \frac{5}{6}. \end{aligned}$$
    (55)
  4.

    There exist \(C, r > 0\) independent of \(A, y_0, \gamma \) such that, for all \(y \in \mathbb {R}\), and for all \(s \in [\sigma _{0}, \sigma _{1}]\), we have

    $$\begin{aligned} |\partial _y U(s,y)| \le C \max \left\{ \frac{1}{(1+|y|)^r}, A e^{-r s} \right\} . \end{aligned}$$
    (56)

Proof of Lemma 5.2

We prove each item in order.

Proof of 1. We need to show that there exists a choice of \(y_0\) and \(\sigma _{0}\) (depending on A, see Remark 3.7) such that the following holds in the interval \(s \in [\sigma _{0}, \sigma _{1}]\):

$$\begin{aligned} U(s,y)> - \frac{b}{1+e^s \tau _s} y \quad \text { if } y > y_0, \quad U(s,y)< - \frac{b}{1+e^s \tau _s}y \quad \text { if } y < -y_0. \end{aligned}$$

Let us focus on the first claim, the second being analogous. By the fundamental theorem of calculus and the bootstrap assumptions (B1)–(B2), we have that

$$\begin{aligned} U(s,y) - U(s,0) = \int _0^y U'(s, {\underline{y}}) d {\underline{y}} \ge - y_0(1+2y_0) - (y- y_0) \left( 1-\frac{y^{2k}_0}{4} \right) . \end{aligned}$$

Then, we have, for \(y > y_0\),

$$\begin{aligned}&(1+e^s\tau _s)U(s,y) + by > (b-(1+e^s\tau _s)(1-y_0^{2k}/4))y \\&\quad - y_0(1+e^s\tau _s)(1+2y_0)+y_0(1+e^s\tau _s)(1-y_0^{2k}/4). \end{aligned}$$

The claim follows by choosing \(y_0\) small enough, and then by choosing \(\sigma _{0}\) large enough to control the factors containing the modulation parameter \(\tau \) (recalling that the bootstrap assumptions (B6) hold true).
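Concretely (a routine check under the relation \(b = \frac{2k+1}{2k} > 1\), equivalent to the identity \(-2+b = -\frac{2k-1}{2k}\) used in the proof of Lemma 5.1): taking \(\sigma _0\) so large that \(\vert {e^s \tau _s}\vert \le \frac{1}{4k}\) on \([\sigma _0, \sigma _1]\), which is possible by (B6), the coefficient of \(y\) in the previous display satisfies

$$\begin{aligned} b-(1+e^s\tau _s)(1-y_0^{2k}/4) \ge (b-1) - \vert {e^s\tau _s}\vert \ge \frac{1}{2k} - \frac{1}{4k} = \frac{1}{4k} > 0, \end{aligned}$$

so the linear term dominates the constant terms once \(y_0\) is chosen small.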

Proof of 2. We now show that we can restrict to \(y_0\) small and \(\sigma _{0} \gg 1\) (depending on A, see Remark 3.7) such that the following inequalities hold true:

$$\begin{aligned}&\Big | U(s, \pm y_0) {\pm } y_0 {\mp } \frac{1}{2k+1} y_0^{2k+1} \Big | \le \frac{y_0^{2k+1}}{4{(2k+1)}} \quad \text {for all } s \in [\sigma _{0}, \sigma _{1}] \end{aligned}$$
(57)
$$\begin{aligned}&\Big | U({\sigma _{0}}, y) {+} y {-} \frac{1}{2k+1} y^{2k+1} \Big | \le \frac{|y|^{2k+1}}{4{(2k+1)}} \quad \text {for all } y \in (-1/4,- y_0) \cup (y_0, 1/4), \end{aligned}$$
(58)
$$\begin{aligned}&\Big | U'(s, \pm y_0) {+} 1 {-} y_0^{2k} \Big | \le \frac{y_0^{2k}}{4} \quad \text {for all } s \in [\sigma _{0}, \sigma _{1}] , \end{aligned}$$
(59)
$$\begin{aligned}&\Big | U'({\sigma _{0}}, y) {+} 1{-} y^{2k} \Big | \le \frac{y^{2k}}{4} \quad \text {for all } y \in (-1/4, -y_0) \cup (y_0, 1/4), \end{aligned}$$
(60)
$$\begin{aligned}&{U'(\sigma _{0}, y) \ge -1 + \frac{y_0^{2k}}{2} \quad \text {for all } |y| \ge y_0}. \end{aligned}$$
(61)

Note that the above inequalities imply (on initial data and at \(|y| = y_0\)) bounds which are strict improvements of (IB1) and (IB2). For instance, from (60) it follows that \(0 \ge U'(\sigma _0, y) \ge -1 + \frac{3}{4} y^{2k} \ge -1 + \frac{3}{4} y_0^{2k}\), for all \(y \in (-1/4,- y_0) \cup (y_0, 1/4)\), and similarly for the bounds (59), as well as (61).

Bounds (58), (60) and (61) follow easily from our choice of initial data at \(\sigma _{0}\) and the expression for the profile (with the choice (10) concerning the \((2k+1)\)st derivative at \(y = 0\)), upon choosing \(y_0\) to be small and consequently \(\sigma _0\) to be large.

More precisely, we have, recalling the equation \(\mathring{U} = - y - \frac{1}{2k+1} \mathring{U}^{2k+1}\), and letting \(R:= \mathring{U} + y - \frac{1}{2k+1} y^{2k+1}\),

$$\begin{aligned} R\Big (1+\frac{1}{2k+1}\Sigma \Big ) = \frac{1}{(2k+1)^2} y^{2k+1} \Sigma , \end{aligned}$$

where \(\Sigma = \sum _{h+j = 2k} \mathring{U}^h y^j\). Since, for \(|y| \le 1/4\), \(|\mathring{U}| \le \frac{1}{4}\) we also have that \(|\Sigma | \le \frac{2k}{4^k}\). The claim then follows from the inequality \(\Big (1-\frac{1}{2k+1}|\Sigma |\Big )^{-1} \frac{1}{(2k+1)^2} |\Sigma | \le \frac{1}{5(2k+1)}\), which is valid for all \(k \ge 1\). Thus, for the profile, we have inequality (57) with improved constant \(\frac{1}{5(2k+1)}\) on the RHS. The bound for \(U(\sigma _0, \cdot )\) then follows from this bound for the profile \(\mathring{U}\) and the bootstrap assumptions, after taking \(\sigma _0\) to be large.
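The elementary inequality invoked in the last step can also be verified by exact rational arithmetic. The snippet below (a sanity check, not part of the proof) evaluates the left-hand side at the extreme value \(|\Sigma | = \frac{2k}{4^{k}}\), which suffices since the left-hand side is increasing in \(|\Sigma |\):

```python
from fractions import Fraction as F

def inequality_holds(k):
    """Check (1 - |Sigma|/(2k+1))^{-1} * |Sigma|/(2k+1)^2 <= 1/(5(2k+1))
    at |Sigma| = 2k/4^k (the worst case, since the LHS increases in |Sigma|)."""
    sigma = F(2 * k, 4 ** k)
    n = 2 * k + 1
    lhs = (sigma / n ** 2) / (1 - sigma / n)
    return lhs <= F(1, 5 * n)

print(all(inequality_holds(k) for k in range(1, 51)))  # True
```

For \(k = 1\) the two sides are equal (\(\frac{1}{15}\)), so the constant \(\frac{1}{5(2k+1)}\) is sharp within this family of bounds.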

The proofs of (57) and  (59) are similar, so we will just focus on showing (59). Using Taylor expansion with integral remainder and Sobolev embedding, we have

$$\begin{aligned} \Bigg | U'(s, y_0) +1 - y^{2k}_0 - \sum _{j=1}^{2k+1} W^{(j)}(s, 0) \frac{y_0^{j-1}}{(j-1)!} \Bigg |&\le C y_0^{2k} \int _0^{y_0} |U^{(2k+2)}(s,y) | \, \textrm{d}y \nonumber \\&\le C A y_0^{2k+1}. \end{aligned}$$
(62)

Note that the two principal terms on the LHS of the previous estimate arise from our choice of the profile in display (10). Moreover, to bound the term \(|U^{(2k+2)}(s,y)|\) in display (62), we used the bootstrap assumption (B3) and interpolation. We then notice that the coefficients \( W^{(j)}(s, 0)\), \(j = 1, \ldots , 2k\), decay by (B6) (some of them are identically zero by (21)), and \(|W^{(2k+1)}(s,0)| \le 4\) by the bootstrap assumptions (B7). We then first choose \(y_0\) to be small, and then choose \(\sigma _{0}\) accordingly to be large to conclude that (59) holds true (see Remark 3.7).

Recall the equation for \(U'\) from (14); we arrange the equation as follows:

$$\begin{aligned}&\partial _s U' + U' {+} (U')^2 + (b y +(1+{e^s\tau _s}) U)\partial _y U' \nonumber \\&\quad = \left( e^{bs} \xi _{s} - (1+e^{s} \tau _{s}) e^{(b-1)s} \kappa \right) U'' - { e^{s} \tau _{s} (U')^{2}} \nonumber \\&\qquad +(1+e^{s} \tau _{s}) \left( e^{-\mu s} {\mathcal {H}}(U') + {e^{-s}}\partial _{y} {\mathcal {L}}(U+e^{(b-1)s} \kappa ) \right) =: E^{(1)}. \end{aligned}$$
(63)

By (B1)–(B2) for \(U'\), (B3) and (54) for \(U''\), (B6) for the modulation terms and Lemma 5.1 for the forcing terms, we have

$$\begin{aligned} \Vert {E^{(1)}}\Vert _{L^{\infty }} \le C A e^{-\gamma s}, \end{aligned}$$
(64)

where we take A sufficiently large compared to \(\kappa _{0}\) if necessary.

We will first focus on showing (IB2). We now define Lagrangian coordinates for the flow of Eq. (63). The flow can either start from a point on the half line \(s = \sigma _{0}, y \ge y_0\) or from a point on the half line \(s \ge \sigma _{0}, y= y_0\). Distinguishing between these two cases, we consider Lagrangian maps \(X_1(s, {\tilde{y}})\), \(X_2(s, {\tilde{s}})\) which are defined by solving the following initial value problems (\({\tilde{s}}\) and \({\tilde{y}}\) are the Lagrangian parameters):

$$\begin{aligned}&\partial _s X_1(s,{\tilde{y}}) = b X_1(s,{\tilde{y}}) {+}{(1+e^s\tau _s) }U(s, X_1(s,{\tilde{y}})) , \quad X_1(\sigma _{0},{\tilde{y}}) = {\tilde{y}}, \quad \text {for all } {\tilde{y}} \ge y_0, \end{aligned}$$
(65)
$$\begin{aligned}&\partial _s X_2(s,{\tilde{s}}) = b X_2(s,{\tilde{s}}) {+}{(1+e^s\tau _s)}U(s, X_2(s,{\tilde{s}})) , \quad X_2({\tilde{s}} , {{\tilde{s}}}) = y_0, \quad \text {for all } {\tilde{s}} \ge \sigma _{0}. \end{aligned}$$
(66)

We now rewrite Eq. (63) in Lagrangian coordinates, to obtain, letting \({\tilde{U}}'\) be the composition \({\tilde{U}}' (s, {\tilde{y}}) = U'(s, X_1(s, {\tilde{y}}))\),

$$\begin{aligned} \partial _s {\tilde{U}}' + {\tilde{U}}' + ({\tilde{U}}')^2 = {E^{(1)}(s, X_1(s, {\tilde{y}})).} \end{aligned}$$
(67)

Note that the following bound holds for all \(s \ge \sigma _0\) and all y:

$$\begin{aligned} U'(s, y) \le \frac{1}{2}. \end{aligned}$$
(68)

Indeed, recalling the form of the initial data (D1), as well as the bounds on the initial data from Sect. 3.4, we have that the following bound holds: \(U'(\sigma _0, y) \le \frac{1}{4}\) upon choosing \(\sigma _0\) to be large. We then integrate Eq. (67), using the upper bound on initial data we just obtained, and recalling that \(E^{(1)}\) decays exponentially, to obtain (68).

Let us for a moment assume that there exists \(\check{s}\) such that \( {\tilde{U}}'(\check{s}, {\tilde{y}}) \ge -\frac{1}{2}\). Integrating (67), and using the bound (64), it is then easy to show that, upon choosing \(\sigma _0\) to be large (depending only on \(y_0\) and A), \( |\tilde{U}'( s, {\tilde{y}})| \le \frac{3}{4}\) for all \(s \ge \check{s}\). This implies (IB2) on the half line \(\{y > 0\}\) in this case.

Hence it suffices to show (IB2) under the additional assumption that \({\tilde{U}}'(s, {\tilde{y}}) \le -\frac{1}{2}\) for all \(s \in [\sigma _{0}, \sigma _{1}]\). By the repulsivity of the flow, we always have that \(|X_1(s, \tilde{y})| \ge y_0\) for \(s \ge \sigma _0\). Hence, for \(s \ge \sigma _0\), we can apply the bootstrap assumption (B2) (i.e., \({\tilde{U}}' \ge -1 + \frac{y_0^{2k}}{4}\)), which yields

$$\begin{aligned} - {\tilde{U}}' - ({\tilde{U}}')^2 \ge \left( 1 - \frac{y_0^{2k}}{4}\right) \frac{y_0^{2k}}{4}\ge \frac{y_0^{2k}}{8}, \end{aligned}$$
(69)

by virtue of the fact that \(y_0\) is chosen to be small. We now choose \(\sigma _0\) to be sufficiently large for the following inequality to be valid, in view of (64):

$$\begin{aligned} \int _{\sigma _0}^{\sigma _{1}} |E^{(1)}(s, X_1(s, {\tilde{y}}))| \, \textrm{d}s \le \frac{y_0^{2k}}{8}. \end{aligned}$$
(70)

We then integrate (67), taking into account the bounds (69) and (70), as well as the bound (64). We deduce, recalling (68), that

$$\begin{aligned} \begin{aligned} |U'(s,X_1(s, {\tilde{y}}))| \le 1 -\frac{1}{2} y_0^{2k}, \quad \text {for } \sigma _{0} \le s \le \sigma _{1}, \quad \ |{\tilde{y}}|\ge y_0. \end{aligned} \end{aligned}$$
(71)

Essentially repeating the same argument for the \(X_2\) trajectories, we have

$$\begin{aligned} \begin{aligned} |U'(s,X_2(s, {\tilde{s}}))| \le 1 -\frac{1}{2} y_0^{2k}, \quad \text {for } \sigma _{0} \le {\tilde{s}} \le \sigma _{1}, \quad {\tilde{s}} \le s \le \sigma _{1}. \end{aligned} \end{aligned}$$
(72)

Combining the bounds (71) and (72) yields (IB2) on the half line \(\{y > 0\}\). Arguing in the same manner on \(\{y < 0\}\), we obtain (IB2).

Finally, (IB1) follows from (IB2) and Taylor expansion about \(y = 0\), following a similar reasoning as in inequality (62).

Proof of 3. To prove (55), we first show a quantitative lower bound on the time the Lagrangian trajectories \(X_{1}\), \(X_{2}\) stay in \(y \in [-4, 4]\). Recall the definition of Lagrangian coordinates \(X_1\) and \(X_2\) in (65) and (66). Define now \({\tilde{U}}_i\), for \(i \in \{1,2\}\) as follows:

$$\begin{aligned} {\tilde{U}}_1 (s, {\tilde{y}}) = U(s, X_1(s, {\tilde{y}})), \quad {\tilde{U}}_2 (s, {\tilde{s}}) = U(s, X_2(s, {\tilde{s}})). \end{aligned}$$

We then have the following coupled system for \(({\tilde{U}}_i, X_i)\):

$$\begin{aligned} \begin{aligned}&\partial _s {\tilde{U}}_i = (b-1){\tilde{U}}_i + {E^{(0)}(s, X_i)},\\&\partial _s X_i = b X_i {+} {(1+e^s\tau _s)}{\tilde{U}}_i. \end{aligned} \end{aligned}$$

Here, we recall that

$$\begin{aligned} \begin{aligned} E^{(0)}&= \left( - e^{s} \tau _{s} U + e^{bs} \xi _{s} - (1+e^{s} \tau _{s}) e^{(b-1)s} \kappa \right) U' - e^{(b-1)s} \kappa _{s} \\&\quad +(1+e^{s} \tau _{s}) \left( e^{-\mu s} {\mathcal {H}}(U) + {e^{-s}} {\mathcal {L}}(U+e^{(b-1)s} \kappa ) \right) . \end{aligned} \end{aligned}$$

We let \(A_i = X_i + {\tilde{U}}_i\), which diagonalizes the system up to a perturbative term on the RHS:

$$\begin{aligned}&\partial _s {\tilde{U}}_i - (b-1){\tilde{U}}_i = {E^{(0)}(s, X_i)}, \end{aligned}$$
(73)
$$\begin{aligned}&\partial _s A_i - b A_i = e^s \tau _s {\tilde{U}}_i + {(1+e^s\tau _s)} {E^{(0)}(s, X_i)} . \end{aligned}$$
(74)

Let us now specialize to the case \(i = 1\). The RHS of  (73) and (74) decays exponentially as \(s \rightarrow \infty \) as long as \(|X_1(s, {\tilde{y}})| \le 4\). Indeed, using Lemma 5.1, as well as the fact that \(|U(s,y)| \le 2|y|\) for all \(y \in [-4,4]\), and the bootstrap assumptions (B6), we have

$$\begin{aligned}&e^{-\mu s} \Vert {{\mathcal {H}}(U)}\Vert _{L^{\infty }(-4, 4)} \le C A e^{-\mu _{0} s}, \\&e^{-s} \Vert {{\mathcal {L}}(U + e^{(b-1)s}\kappa )}\Vert _{L^{\infty }} \le C (1+\kappa _{0}) e^{(-2+b)s},\\&e^s |\tau _s {\tilde{U}}_i(s, {\tilde{y}})| \le 8 e^{-\gamma s}. \end{aligned}$$

This implies directly that, as long as \(|X_1(s, {\tilde{y}})| \le 4\),

$$\begin{aligned} |{E^{(0)}(s, X_i)}| + |e^s \tau _s {\tilde{U}}_i + {(1+e^s\tau _s)} {E^{(0)}(s, X_i)}| \le C A e^{- c^\# s}, \end{aligned}$$
(75)

where \(c^\#\) is a positive constant depending on \(\mu _0, \gamma \).

Let us first restrict to the case \(y_0 \le |{\tilde{y}}| \le 1\). We integrate (73)–(74) between \(\sigma _{0}\) and s, and we obtain, taking (75) into account,

$$\begin{aligned}{} & {} \Big | U(s, X_1(s, {\tilde{y}})) e^{-(b-1)(s-\sigma _{0})} - U(\sigma _{0}, {\tilde{y}}) \Big | \le |U(\sigma _{0}, {\tilde{y}})|,\nonumber \\{} & {} \Big | A_1(s, X_1(s, {\tilde{y}}))e^{-b(s-\sigma _{0})} - A_1(\sigma _{0}, {\tilde{y}}) \Big | \le \frac{1}{2}|A_1(\sigma _{0}, {\tilde{y}})|. \end{aligned}$$
(76)

Note that the above inequalities hold upon choosing \(\sigma _0\) large as a function of \(y_0\). Indeed, due to our choice of initial data for U at \(\sigma _{0}\), by possibly choosing \(\sigma _0\) larger, we have that, for all \({\tilde{y}} \in [-1, -y_0] \cup [y_0, 1]\), \(|U(\sigma _{0}, {\tilde{y}})| \ge \frac{1}{2} |U(\sigma _{0}, y_0)| \ge \frac{y_0}{4}\). It then suffices to choose \(\sigma _{0}\) large enough that, for all \({\tilde{y}} \in [-1, -y_0] \cup [y_0, 1]\),

$$\begin{aligned} \int _{\sigma _0}^{\sigma _{1}}e^{-(b-1)(s-\sigma _{0})} |E^{(0)}(s, X_1( s, {\tilde{y}}))| \textrm{d}\, s \le \frac{y_0}{4}. \end{aligned}$$

This shows that the first inequality in (76) holds, up to choosing \(\sigma _0\) based on \(y_0\). A similar reasoning holds for the second inequality in (76).

Display (76) implies, due to our choice of initial data at \(\sigma _{0}\),

$$\begin{aligned} \begin{aligned}&| U(s, X_1(s, {\tilde{y}}))| \le 2 |{\tilde{y}}| e^{(b-1)(s-\sigma _{0})},\\&| A_1(s, X_1(s, {\tilde{y}}))|\le \frac{3}{2} (|{\tilde{y}}|^{2k+1} + {e^s \tau _s}) e^{b(s-\sigma _{0})}. \end{aligned} \end{aligned}$$

We recall the bootstrap assumptions (B6), which in particular imply that \(|e^{s} \tau _{s}| \le A e^{- \gamma s}\). This allows us to deduce that, for all times \(s \in [\sigma _{0}, -2k \log (|{\tilde{y}}|)+ \sigma _{0}]\),

$$\begin{aligned}{} & {} | U(s, X_1(s, {\tilde{y}}))| \le 2,\nonumber \\{} & {} | A_1(s, X_1(s, {\tilde{y}}))|\le 2 - \frac{1}{4} \implies |X_1(s,{\tilde{y}})| \le 4. \end{aligned}$$
(77)

The reasoning for \(X_2\) is completely analogous, and we deduce that, for all times \(s \in [{\tilde{s}}, -2k \log (y_0)+{\tilde{s}}]\),

$$\begin{aligned}{} & {} | U(s, X_2(s, {\tilde{s}}))| \le 2,\nonumber \\{} & {} |X_2(s,{\tilde{s}})| \le 4. \end{aligned}$$
(78)

We are now in a position to prove an \(L^\infty \) estimate for \(U'\) in the near region. Let us recall the relevant equation:

$$\begin{aligned} \partial _s U' + U' +(U')^2 + (b y +{(1+e^s\tau _s)}U)\partial _y U'= {E^{(1)}}. \end{aligned}$$

In Lagrangian coordinates with respect to \(X_1\), letting \( {\tilde{U}}' = U'(s, X_1(s,{\tilde{y}}))\), the above equation reads as

$$\begin{aligned} \partial _s {\tilde{U}}' + {\tilde{U}}' + ({\tilde{U}}')^2 = {E^{(1)}}(s, X_1(s, {\tilde{y}})). \end{aligned}$$
(79)

Let us now suppose, for the sake of contradiction, that, for all \(s \in [\sigma _{0}, \sigma _{0} - 2k \log (|{\tilde{y}}|)]\),

$$\begin{aligned} U'(s, X_1(s,{\tilde{y}})) \le -\frac{4}{5}. \end{aligned}$$
(80)

Combined with the bootstrap assumption (B2), display (80) yields the lower bound \(|{\tilde{U}}' +(\tilde{U}')^2| \ge \frac{y_0^{2k}}{8}\), which in turn implies, since the RHS of (79) decays exponentially,Footnote 15 upon choosing \(\sigma _{0}\) to be larger, and recalling the definition of \(\mu '\) from (64),

$$\begin{aligned} \frac{\partial _s{\tilde{U}}'}{{\tilde{U}}' +({\tilde{U}}')^2 } + 1 \le \frac{{\mu '}}{4} e^{-s\frac{{\mu '}}{2}}. \end{aligned}$$

This implies

$$\begin{aligned} \partial _s \log \Big ( \frac{ -{\tilde{U}}'}{1 + {\tilde{U}}'} \Big )+ 1 \le \frac{{\mu '}}{4} e^{-s\frac{{\mu '}}{2}}. \end{aligned}$$

By integration, since \(\int _{\sigma _{0}}^{\infty } \frac{\mu '}{4} e^{-s \frac{\mu '}{2}} \, \textrm{d}s < \log 2\), and denoting \(Q = 2 \frac{-{\tilde{U}}'(\sigma _{0}, {\tilde{y}})}{1+ {\tilde{U}}'(\sigma _{0}, \tilde{y})} > 0\), we have

$$\begin{aligned} U'(s, X_1(s, {\tilde{y}})) > \frac{-Qe^{-(s-\sigma _{0})}}{1+Q e^{-(s-\sigma _{0})}}. \end{aligned}$$
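For the reader's convenience, we spell out the integration behind the last display: integrating the previous differential inequality from \(\sigma _{0}\) to s and using \(\int _{\sigma _{0}}^{\infty } \frac{\mu '}{4} e^{-\sigma \frac{\mu '}{2}} \, \textrm{d}\sigma < \log 2\) gives

$$\begin{aligned} \log \Big ( \frac{ -{\tilde{U}}'(s)}{1 + {\tilde{U}}'(s)} \Big ) \le \log \Big ( \frac{ -{\tilde{U}}'(\sigma _{0})}{1 + {\tilde{U}}'(\sigma _{0})} \Big ) - (s - \sigma _{0}) + \log 2 = \log \big ( Q e^{-(s-\sigma _{0})} \big ), \end{aligned}$$

and solving \(\frac{-{\tilde{U}}'}{1+{\tilde{U}}'} \le Q e^{-(s-\sigma _{0})}\) for \({\tilde{U}}'\) (which is legitimate since \(1 + {\tilde{U}}' > 0\)) yields the stated lower bound.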

We now calculate this expression at \(s = s_* = \sigma _{0} - 2k \log (|{\tilde{y}}|)\). We notice that, for \(\epsilon _{0}\) sufficiently small, \({\tilde{U}}'(\sigma _{0}, {\tilde{y}}) \ge -1\), and \(1+\tilde{U}'(\sigma _{0}, {\tilde{y}}) \ge \frac{1}{2} {\tilde{y}}^{2k}\), hence \( 0\le Q \le 4{\tilde{y}}^{-2k}\). Since \(e^{-(s_*-\sigma _{0})} = \tilde{y}^{2k}\), it follows that

$$\begin{aligned} U'(s_*, X_1(s_*, {\tilde{y}}))> -\frac{4}{4+1}= - \frac{4}{5}. \end{aligned}$$

This contradicts (80), and yields, for all \({\tilde{y}}\) such that \(y_0 \le |{\tilde{y}}| \le 1\), the existence of \(s^{(1)} \in [\sigma _{0}, s_*]\) such that

$$\begin{aligned} U'( s^{(1)}, X_1(s^{(1)}, {\tilde{y}})) \ge -\frac{4}{5}. \end{aligned}$$
(81)

Moreover, in the case \(|{\tilde{y}}| \ge 1\), the existence of \(s^{(1)}\) as above follows immediately from our choice of initial data. Indeed, by possibly choosing \(\sigma _0\) to be larger and \(\epsilon _{0}\) smaller,

$$\begin{aligned} U'( \sigma _0, X_1(\sigma _0, {\tilde{y}})) \ge -\frac{4}{5} \end{aligned}$$
(82)

for all \({\tilde{y}}\) such that \(|{\tilde{y}}| \ge 1\).

A completely analogous reasoning shows that for all \({\tilde{s}} \le \sigma _{1} + 2k \log (|y_0|)\), there exists \(s^{(2)} \in [{\tilde{s}}, {\tilde{s}} - 2k \log (|y_0|)]\) such that we have

$$\begin{aligned} U'( s^{(2)}, X_2(s^{(2)}, {\tilde{s}})) \ge - \frac{4}{5}. \end{aligned}$$
(83)

We now combine the bounds (81)–(83) with the bounds on the Lagrangian trajectory (77) and (78) to obtain the existence of \(s^{(1)}, s^{(2)}\) (depending resp. on \({\tilde{y}}\) and \(y_0\)), such that

$$\begin{aligned}{} & {} U'(s^{(1)}, X_1(s^{(1)}, {\tilde{y}})) \ge -\frac{4}{5}, \quad |X_1(s^{(1)}, {\tilde{y}})| \le 4, \quad \text {for all } {\tilde{y}}: y_0 \le |{\tilde{y}}| \le 4, \nonumber \\{} & {} U'( s^{(2)}, X_2(s^{(2)}, {\tilde{s}})) \ge -\frac{4}{5}, \quad |X_2(s^{(2)}, {\tilde{s}})| \le 4. \end{aligned}$$
(84)

We now repeat the reasoning in part 2, integrating Eq. (67), with the difference that the starting time of integration is now \(s^{(1)}\) (resp. \(s^{(2)}\)), and we use the bounds (84). Recall that the RHS of (67) is perturbative everywhere by (64). Repeating the same argument on \(\{y < 0\}\), we deduce that (55) holds for \(|y|\ge 4\), \(\sigma _{0} \le s \le \sigma _{1}\).

Proof of 4. We only consider the case of the half-space \(\{y > 0\}\) in detail, as the other case is dealt with similarly. We define Lagrangian trajectories \(X_{1}\) and \(X_{2}\) as in (65) and (66), respectively, but now with \(y_{0}\) replaced by \(y = 4\). By \(U(s, 0) = 0\), (IB1), (IB2) and (55), we may deduce the simple bound \(\vert {U(s, y)}\vert \le y\) by integration. By (B6) and taking \(\sigma _{0}\) sufficiently large, it follows that

$$\begin{aligned} \partial _{s} X_{i} = b X_{i} + (1+e^{s} \tau _{s}) \tilde{U}_{i} \quad \Rightarrow \quad \frac{1}{2} \partial _{s} X_{i}^{2} \le 2 b X_{i}^{2} \quad \Rightarrow \quad \vert {X_{i}}\vert \le e^{2b(s - \tilde{s})} \vert {\tilde{y}}\vert ,\nonumber \\ \end{aligned}$$
(85)

where \((\tilde{s}, \tilde{y}) = (\sigma _{0}, \tilde{y})\) or \((\tilde{s}, 4)\) when \(i = 1\) or 2, respectively.
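The chain of implications in (85) amounts to a standard Grönwall argument, which we record for convenience: using \(\vert {\tilde{U}}_{i} \vert \le \vert X_{i} \vert \) (a consequence of the bound \(\vert {U(s, y)}\vert \le y\) just established) and (B6), with \(\sigma _{0}\) large enough that \(\vert e^{s} \tau _{s} \vert \le b - 1\), we have

$$\begin{aligned} \frac{1}{2} \partial _{s} X_{i}^{2} = b X_{i}^{2} + (1 + e^{s} \tau _{s}) {\tilde{U}}_{i} X_{i} \le \big ( b + 1 + \vert e^{s} \tau _{s} \vert \big ) X_{i}^{2} \le 2 b X_{i}^{2}, \end{aligned}$$

and integrating \(\partial _{s} X_{i}^{2} \le 4 b X_{i}^{2}\) from \(\tilde{s}\) to s gives \(X_{i}^{2}(s) \le e^{4b(s - \tilde{s})} X_{i}^{2}(\tilde{s})\), i.e. \(\vert {X_{i}}\vert \le e^{2b(s - \tilde{s})} \vert {\tilde{y}}\vert \).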

Next, we again recall the equation for \(U'\) in Lagrangian coordinates, which yields, letting \({\tilde{U}}' = U'(s, X_{i}(s, {\tilde{y}}))\),

$$\begin{aligned} \partial _s {\tilde{U}}' + {\tilde{U}}' + ({\tilde{U}}')^2 = E^{(1)}(s, X_{i}(s, {\tilde{y}})). \end{aligned}$$

Multiplying by \({\tilde{U}}'\) and using (55) plus Cauchy–Schwarz, we have (recall that \(0< \gamma < 1\))

$$\begin{aligned} \frac{1}{2} \partial _s ({\tilde{U}}')^2\le - \frac{\gamma }{10} ({\tilde{U}}')^2 + (E^{(1)}(s, X_{i}(s, {\tilde{y}})))^2. \end{aligned}$$

Recalling (64) for \(E^{(1)}\), it follows that

$$\begin{aligned} \partial _{s} \left( e^{\frac{\gamma }{10} s} (\tilde{U}')^{2} \right) \le C A e^{(\frac{\gamma }{10}-\gamma ) s}. \end{aligned}$$

Integrating this equation, we obtain that, for a constant \(C > 0\),

$$\begin{aligned} |{\tilde{U}}'| \le C e^{- \frac{\gamma }{10} (s-\tilde{s})} \vert {U'(\tilde{s}, \tilde{y})}\vert + C A e^{-\frac{\gamma }{10} s}. \end{aligned}$$
(86)

In case \(i = 2\), we have \((\tilde{s}, \tilde{y}) = (\tilde{s}, 4)\), and the desired bound (56) with \(r = \frac{\gamma }{20 b}\) follows from (55), (85) and (86). On the other hand, in case \(i = 1\), the desired bound with \(r = \frac{\gamma }{20 b}\) follows from (32), (85) and (86), where we simply bound \(e^{-\frac{\gamma }{10}(s-\sigma _{0})} \vert {\tilde{y}}\vert ^{-\frac{2k}{2k+1}} \le C \vert {y}\vert ^{-r}\) and \(e^{-\frac{\gamma }{10}(s-\sigma _{0})} e^{-\sigma _{0}} \epsilon _{0} \le C A e^{-r s}\). \(\square \)

5.3 Estimates on \(\partial _y^2 U\)

As a preparation for closing the bootstrap assumption on \(\Vert {\partial _{y}^{2k+3} U}\Vert _{L^{2}}\), we first prove a uniform bound for \(\Vert {\partial _{y}^{2} U}\Vert _{L^{2}}\). The key ingredients are the method of characteristics in the region close to \(y = 0\), as well as the a-priori pointwise bounds for \(U'\) proved in Lemma 5.2.

Lemma 5.3

There exist \(\epsilon _{0}\), A, \(y_0\), \(\gamma \), \(\sigma _{0}\) such that if the initial data conditions (D1)–(D4) are satisfied and the bootstrap assumptions (B1)–(B7) hold for \(s \in [\sigma _{0}, \sigma _1]\), then the following inequality holds true for all \(s \in [\sigma _0, \sigma _1]\):

$$\begin{aligned} \Vert \partial ^{2}_y U(s, \cdot )\Vert _{L^2} \le C. \end{aligned}$$
(87)

Here, \(C > 0\) is a constant independent of A, \(y_0\), and \(\sigma _{0}\).

Proof of Lemma 5.3

We begin by recalling (15) with \(j = 2\) for \(U''\), which we rewrite as follows:

$$\begin{aligned}{} & {} \partial _{s} U^{''} + (1+b + 3U')U'' + (by+{(1+e^s \tau _s)}U) \partial _{y} U'' \nonumber \\{} & {} \quad = -{ 3 e^{s} \tau _{s} U' U''}+ \left( e^{b s} \xi _{s} - (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa \right) U'''\nonumber \\{} & {} \qquad + (1+e^{s} \tau _{s}) \left( e^{-\mu s} {\mathcal {H}}(U'') + {e^{-s}} \partial _{y}^{2} {\mathcal {L}}(U + e^{(b-1)s} \kappa ) \right) =: E^{(2)}. \end{aligned}$$
(88)

Upon restricting to small \(y_0\) and consequently to large \(\sigma _0\), the following properties are a consequence of Taylor expansion about 0, the Taylor coefficients of the profile at \(y =0\), and the hypotheses on initial data at \(s = \sigma _{0}\) from displays (D1)–(D4) [exactly as in the reasoning around (62)]:

$$\begin{aligned}&\left| U''(s, \pm y_0) {\mp } 2k y_0^{2k-1} \right| \le y_0^{2k-1} \quad \text {for all } s \ge \sigma _{0}, \\&\left| U''(s, y) - 2k y^{2k-1} \right| \le |y|^{2k-1} \quad \text {for all } y \in (-1/4, -y_0) \cup (y_0, 1/4). \end{aligned}$$

We remark that, in order to ensure this condition, we first need to choose \(y_0\) to be small, and then, as a function of that, choose \(\sigma _0\) to be large, cf. Remark 3.7.

Let us recall the definition of the Lagrangian trajectories \(X_1\) and \(X_2\) from (65)–(66). Let us first focus on \(X_1\), and define \({\tilde{U}}'':= U''(s, X_1(s, {\tilde{y}}))\). Assume that \({\tilde{y}}\) satisfies \(y_0 \le {\tilde{y}} \le \frac{1}{4}\), the negative \({\tilde{y}}\) case being analogous. From (88) [using moreover (B2)], we easily have that

$$\begin{aligned} \partial _{s} {\tilde{U}}'' + (b -2){\tilde{U}}'' \le E^{(2)}(s, X_1(s, {\tilde{y}})). \end{aligned}$$

We now notice, similarly to (64), that

$$\begin{aligned} \Vert {E^{(2)}(s, y)}\Vert _{L^{\infty }} \le C A e^{-\gamma s}. \end{aligned}$$
(89)

Hence, restricting to \(\sigma _0\) possibly larger (and treating the terms on the RHS as perturbative), we have, for \(s \ge \sigma _{0}\), upon integration

$$\begin{aligned} 0 \le U''(s, X_1(s, {\tilde{y}})) \le (2k+2) e^{(2-b)(s-\sigma _{0} )}{\tilde{y}}^{2k-1}. \end{aligned}$$
(90)

We notice that we have the following easy consequence of (B2) in the region \(y \in (y_0, 2)\), \(s \in [\sigma _0, \sigma _{1}]\) (upon choosing \(\sigma _0\) large as a function of A and \(y_0\)):

$$\begin{aligned} U(s,y) \ge -(b-1) y. \end{aligned}$$

We now go back to the definition of the Lagrangian trajectories (65), and we immediately deduce that, for \({\tilde{y}} \in (y_0, \frac{1}{4})\), as long as \(|X_1(s, {\tilde{y}})| \le 2\),

$$\begin{aligned} X_1(s, {\tilde{y}}) \ge e^{(b-1)(s-\sigma _0)} {\tilde{y}}. \end{aligned}$$

We now have that, for s such that \(s - \sigma _{0}= - 2k \log \tilde{y}\), the following holds:

$$\begin{aligned} X_1(s, {\tilde{y}}) \ge 1. \end{aligned}$$
(91)

Inequality (90) now gives that, for all s such that \(s - \sigma _{0} \le - 2k \log {\tilde{y}}\) and all \({\tilde{y}} \in (y_0, \frac{1}{4})\):

$$\begin{aligned} |U''(s, X_1(s, {\tilde{y}}))| \le (2k+2) e^{-(2-b)2k \log \tilde{y}}{\tilde{y}}^{2k-1} \le 2k +2, \end{aligned}$$
(92)

since \((2-b)2k = 2k -1\).
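We record the elementary arithmetic behind the last step: the identity \((2-b)2k = 2k-1\) is equivalent to \(b = 1 + \frac{1}{2k}\), and it gives

$$\begin{aligned} e^{-(2-b)2k \log {\tilde{y}}}\, {\tilde{y}}^{2k-1} = {\tilde{y}}^{-(2k-1)}\, {\tilde{y}}^{2k-1} = 1. \end{aligned}$$

The same identity, in the equivalent form \(2k(b-1) = 1\), is what produced \(X_1(s, {\tilde{y}}) \ge {\tilde{y}}^{1 - 2k(b-1)} = 1\) in (91).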

Similarly, turning now our attention to \(X_2\), we have, for \(s \ge {\tilde{s}}\):

$$\begin{aligned} 0 \le U''(s, X_2(s, {\tilde{s}})) \le (2k+2) e^{(2-b)(s-{\tilde{s}} )}y^{2k-1}_0. \end{aligned}$$
(93)

We moreover have that, for s such that \(s - {\tilde{s}} = - 2k \log y_0\), the following holds:

$$\begin{aligned} X_2(s, {\tilde{s}}) \ge y_0 e^{-(b-1)2k \log y_0} = 1. \end{aligned}$$
(94)

Combined with inequality (93), this implies that, for all s such that \(s - {\tilde{s}} \le - 2k \log y_0\),

$$\begin{aligned} |U''(s, X_2(s, {\tilde{s}}))| \le 2k +2. \end{aligned}$$
(95)

Combining previous inequalities (91), (92), (94), and (95), we conclude that

$$\begin{aligned} |U''(s,y)| \le 2k+2 \end{aligned}$$
(96)

for \(|y| \le \frac{1}{4}\), and \(s \in [\sigma _0, \sigma _1]\). This concludes the bounds in the “near” region.

We now proceed to show an \(L^2\) bound in the “intermediate” region \(\frac{1}{4} \le |y| \le y_2\), where \(y_2\) is chosen, depending on \(\zeta > 0\), using Lemma 5.2 in such a way that, for all y with \(|y| \ge y_2\) and all \(s \in [\sigma _0, \sigma _1]\),

$$\begin{aligned} |U'(s,y)| \le \zeta . \end{aligned}$$

We first show a weighted \(L^2\) estimate on \(U''\), where the weight is exponentially decaying in y. Although the estimates for this part are carried out on the whole real line, one should think of them as useful only in the “intermediate” region (\(|y| \le y_2\)). In the final part of the proof of this lemma, we deal with the “far” region using the smallness arising from point 3. in Lemma 5.2.

We now multiply Eq. (88) by the weight \(e^{- \lambda |y|}U''\), to obtain a weighted \(L^2\) estimate in the region \(|y| \ge \frac{1}{4}\). We integrate by parts on the set \(S = [- 1/4, 1/4]^c\). We have, denoting by \(\Vert f\Vert _{w}:= \Vert e^{- \frac{\lambda }{2} |y|} f(y) \Vert _{L^2(S)}\),

$$\begin{aligned} \begin{aligned}&\frac{1}{2} \partial _s \Vert U''(s,\cdot )\Vert ^2_w + \int _S \Bigg (1+ \frac{b}{2}+ \frac{5}{2} U'- \frac{1}{2} e^s \tau _s U'(s,y) \Bigg ) (e^{-\frac{\lambda }{2} |y|} U''(s,y))^2 \, \textrm{d}y\\&\qquad +\, \frac{\lambda }{2} \int _S {{\,\textrm{sign}\,}}(y)\big (by+(1+e^s \tau _s) U(s,y)\big ) \left( e^{-\frac{\lambda }{2} |y|} U''(s,y)\right) ^2 \, \textrm{d}y\\&\quad \le 2e^{- \frac{\lambda }{4}} \Big ( \frac{b}{4} + \frac{1}{4}\Big ) (2k+2)^2 + \Vert U''(s,\cdot )\Vert _w \Vert E^{(2)}(s,\cdot )\Vert _w, \end{aligned} \end{aligned}$$

where we used the bootstrap assumptions and the bounds (96) on \(U''\) to control the boundary terms. Now, for \(|y| \ge \frac{1}{4}\), possibly choosing \(\sigma _0\) to be large, \({{\,\textrm{sign}\,}}(y)\big (by+(1+e^s \tau _s) U(s,y)\big ) \ge \frac{b-1}{8}\), and \(|U'(s,y)| \le 1\). Hence, it suffices to choose \(\lambda \) to satisfy \((16)^{-1}\lambda (b-1)-2 \ge 1\) to obtain (after using the inequality \(ab \le \frac{1}{20} a^2 + 5 b^2\) to bound the term \( \Vert U''(s,\cdot )\Vert _w \Vert E^{(2)}(s,\cdot )\Vert _w\) on the RHS, and restricting to \(\sigma _0\) large):

$$\begin{aligned} \frac{1}{2} \partial _s \Vert U''(s,\cdot )\Vert ^2_w + \Vert U''(s,\cdot )\Vert ^2_w \le 2(2k+2)^2 + \Vert E^{(2)}(s,\cdot )\Vert ^2_w. \end{aligned}$$
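The resulting uniform bound follows from the last display by an elementary Grönwall argument, recorded here for convenience: set \(N(s) = \Vert U''(s,\cdot )\Vert ^2_w\) and \(K = 2(2k+2)^2 + \sup _{\sigma \ge \sigma _{0}} \Vert E^{(2)}(\sigma ,\cdot )\Vert ^2_w\) (finite by (89)); then \(\partial _s N + 2N \le 2K\), so

$$\begin{aligned} \partial _{s} \big ( e^{2s} N(s) \big ) \le 2 K e^{2s} \quad \Longrightarrow \quad N(s) \le e^{-2(s - \sigma _{0})} N(\sigma _{0}) + K \quad \hbox {for all } s \ge \sigma _{0}. \end{aligned}$$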

Together with the assumptions on initial data, and the bounds (89), this implies \(\Vert U''(s,\cdot )\Vert ^2_w \le C_k\) for all \(s \ge \sigma _{0}\), where \(C_k\) is a constant depending only on k. Possibly redefining the constant \(C_k\), we also have the unweighted bound

$$\begin{aligned} \Vert U''(s,\cdot ) \Vert _{L^2([-y_2, y_2])} \le C_k. \end{aligned}$$
(97)

(recall that \(y_2\) was defined above via Lemma 5.2).

We finally perform estimates in the “far away” region \(|y| \ge y_2\). Again we consider Eq. (88), we multiply by \(U''\) and integrate by parts. We have, adding a multiple of the bound (97), and letting \(S_2 =\mathbb {R}\setminus [-y_2, y_2]\), possibly redefining the constant \(C_k\),

$$\begin{aligned}{} & {} \frac{1}{2} \partial _s \Vert U''(s,\cdot )\Vert ^2_{L^2(\mathbb {R})} + \int _{S_2} \Bigg (1+ \frac{b}{2}+ \frac{5}{2} U'- \frac{1}{2} e^s \tau _s U'(s,y) \Bigg ) ( U''(s,y))^2 \, \textrm{d}y\nonumber \\{} & {} \quad \le C_k + \Vert U''(s,\cdot )\Vert _{L^2(\mathbb {R})} \Vert E^{(2)}(s,\cdot )\Vert _{L^2(\mathbb {R})}. \end{aligned}$$
(98)

Recall now that, choosing \(\zeta \) appropriately, we have \(|U'(s,y)| \le \frac{1}{4}\) for \(|y| \ge y_2\), so that

$$\begin{aligned} 1+ \frac{b}{2}+ \frac{5}{2} U'- \frac{1}{2} e^s \tau _s U'(s,y) - \frac{1}{4}\ge \frac{1}{4}. \end{aligned}$$

Using this observation, we deduce, from (98),

$$\begin{aligned} \frac{1}{2} \partial _s \Vert U''(s,\cdot )\Vert _{L^2(\mathbb {R})}^2 +\frac{1}{4} \Vert U''(s,\cdot )\Vert _{L^2(\mathbb {R})}^2\le & {} C_k + \frac{1}{4} \Vert U''(s,\cdot )\Vert _{L^2([-y_2, y_2])}^2\nonumber \\{} & {} + \,4\Vert E^{(2)}(s,\cdot )\Vert ^2_{L^2(\mathbb {R})}\nonumber \\\le & {} C_k + 4\Vert E^{(2)}(s,\cdot )\Vert ^2_{L^2(\mathbb {R})}. \end{aligned}$$
(99)

Here, the second inequality is obtained using bound (97), possibly enlarging the constant \(C_k\).

Finally, by (B1)–(B2) for \(U'\), (B3) and (54) for \(U''\) and \(U'''\), (B6) for the modulation terms and Lemma 5.1 for the forcing terms, we have

$$\begin{aligned} \Vert {E^{(2)}}\Vert _{L^{2}} \le C A e^{-\gamma s}. \end{aligned}$$

Combining the above bounds with inequality (99), we obtain (possibly choosing \(\sigma _0\) large as a function of A) (87) on \({\mathbb {R}}\) as desired. \(\square \)

5.4 Top Order \(L^2\) Estimate

We are now ready to close the bootstrap assumption (B3) on the top order \(L^{2}\) norm.

Lemma 5.4

There exist \(\epsilon _{0}\), A, \(y_0\), \(\gamma \), \(\sigma _{0}\) such that if the initial data conditions (D1)–(D4) are satisfied and the bootstrap assumptions (B1)–(B7) hold for \(s \in [\sigma _{0}, \sigma _{1}]\), we have

$$\begin{aligned} \Vert \partial _{y}^{2k+3} U(s, \cdot )\Vert _{L^2} \le C, \end{aligned}$$

where \(C > 0\) is a constant independent of A, \(y_{0}\) and \(\sigma _{0}\). Hence, (IB3) holds.

Proof of Lemma 5.4

Let \(j = 2k+3\) and consider Eq. (15). We multiply this equation by \(\partial _y^{j}U\) and integrate by parts. Let \(\langle \cdot , \cdot \rangle \) denote the standard \(L^2\) inner product on \(\mathbb {R}\). We have [cf. (8)–(9)], for a function \(f \in H^1(\mathbb {R})\),

$$\begin{aligned}&\langle f, e^{-s} \Gamma (e^{bs} D_{y}) e^{bs} \partial _{y} f \rangle = 0,\\&\langle f, e^{-s} \Upsilon (e^{bs} D_{y}) f \rangle \ge 0. \end{aligned}$$

We obtain the following inequality:

$$\begin{aligned}&\frac{1}{2} \partial _s \Vert U^{(j)}\Vert _{L^2(\mathbb {R})}^2 +\int _{\mathbb {R}}\left( (1 -b) + jb+(j+1)(1+e^{s} \tau _{s})U' \right) (U^{(j)})^2 \, \textrm{d}y \\&\qquad -\frac{1}{2}\int _{\mathbb {R}} (b + (1+e^s \tau _s)U')( U^{(j)})^2 \, \textrm{d}y \\&\quad \le (1+e^{s} \tau _{s}) \Vert M^{(j)}\Vert _{L^2(\mathbb {R})} \Vert U^{(j)}\Vert _{L^2(\mathbb {R})}. \end{aligned}$$

This implies

$$\begin{aligned}&\frac{1}{2} \partial _s \Vert U^{(j)}\Vert _{L^2(\mathbb {R})}^2 +\int _{\mathbb {R}}\left( \left( 1 - \frac{3}{2} b\right) + jb+\left( j+\frac{1}{2}\right) (1+e^{s} \tau _{s})U' \right) (U^{(j)})^2 \, \textrm{d}y\\&\quad \le \Vert M^{(j)}\Vert _{L^2(\mathbb {R})} \Vert U^{(j)}\Vert _{L^2(\mathbb {R})}. \end{aligned}$$

Note the following inequalities, valid for \(b > 2\), \(a \ge 2\) (in this display and below, a and b denote derivative orders, not to be confused with the self-similar exponent b):

$$\begin{aligned}&\Vert U^{(b)}\Vert _{L^2(\mathbb {R})} \lesssim \Vert U''\Vert _{L^2(\mathbb {R})}^{1-\theta _1} \Vert U^{(j)}\Vert _{L^2(\mathbb {R})}^{\theta _1},\\&\Vert U^{(a)}\Vert _{L^\infty } \lesssim \Vert U''\Vert _{L^2(\mathbb {R})}^{1-\theta _2} \Vert U^{(j)}\Vert _{L^2(\mathbb {R})}^{\theta _2}, \end{aligned}$$

where \(\theta _1 = (b-2)/(j-2)\) and \(\theta _2 = (a-3/2)/(j-2)\).
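When these inequalities are applied below to a product \(U^{(a)} U^{(b)}\) with \(a + b = j+1\), the exponents combine as follows:

$$\begin{aligned} \theta _1 + \theta _2 = \frac{(b-2) + \big (a - \frac{3}{2}\big )}{j-2} = \frac{j - \frac{5}{2}}{j-2} = 1 - \frac{1}{2(j-2)}, \end{aligned}$$

so that \(1 + \theta _1 + \theta _2 = 2 - \frac{1}{2(j-2)}\); for \(j = 2k+3\) this equals \(2 - \frac{1}{4k+2}\), which is the source of the crucial strictly subquadratic power below.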

The general form of a term appearing in \(M^{(j)}\) is \(U^{(a)} U^{(b)}\), with \(a,b \ge 1\) and \(a + b = j+1\). We estimate, when \(a \le b\),

$$\begin{aligned} \int _{\mathbb {R}} |U^{(a)} U^{(b)} U^{(j)}| dy \le C_j \Vert U''\Vert ^{1-\theta _1-\theta _2}_{L^2(\mathbb {R})} \Vert U^{(j)}\Vert _{L^2(\mathbb {R})}^{1+ \theta _1 + \theta _2} \le C_j \Vert U^{(j)}\Vert _{L^2(\mathbb {R})}^{2- \frac{1}{2(j-2)}}, \end{aligned}$$

where \(C_j\) is a constant which depends only on j (here, we used the bound \(\Vert U ''\Vert _{L^2(\mathbb {R})} \le C\) from Lemma 5.3). Combining the previous inequalities, and substituting \(2k+3\) for j,

$$\begin{aligned} \begin{aligned}&\frac{1}{2} \partial _s \Vert U^{(2k+3)}\Vert _{L^2(\mathbb {R})}^2 + \int _{\mathbb {R}}\left( 2k+\frac{7}{2}\right) e^{s} \tau _{s} U' (U^{(2k+3)})^2 \, \textrm{d}y\\&\quad +\left( \frac{b-1}{2} - C_k \Vert U^{(2k+3)}\Vert _{L^2(\mathbb {R})}^{ -\frac{1}{4k+2}} \right) \Vert U^{(2k+3)}\Vert _{L^2(\mathbb {R})}^2 \le 0. \end{aligned} \end{aligned}$$

Here, \(C_k\) is a constant that depends only on k. It then follows, using the bootstrap assumptions (B6), as well as choosing \(\sigma _0\) sufficiently large, that

$$\begin{aligned} \frac{1}{2} \partial _s \Vert U^{(2k+3)}\Vert _{L^2(\mathbb {R})}^2 +\left( \frac{b-1}{4} - C_k \Vert U^{(2k+3)}\Vert _{L^2(\mathbb {R})}^{ -\frac{1}{4k+2}} \right) \Vert U^{(2k+3)}\Vert _{L^2(\mathbb {R})}^2 \le 0. \end{aligned}$$

From this inequality and the assumptions on initial data it follows that, for all \(s \ge \sigma _{0}\), the following bound is propagated:

$$\begin{aligned} \Vert U^{(2k+3)}\Vert _{L^2(\mathbb {R})} \le 2 \Big (\frac{4C_k}{b-1} \Big )^{4k+2}. \end{aligned}$$

For A large, this proves the improved bound (IB3). \(\square \)

5.5 Weighted \(L^{2}\) Estimates on U

Finally, we improve the bootstrap assumptions (B4) and (B5) concerning weighted \(L^{2}\) estimates on \(\partial _{y} U\) and \(\partial _{y}^{2k+3} U\), respectively.

Lemma 5.5

There exist \(\epsilon _{0}\), A, \(y_0\), \(\gamma \), \(\sigma _{0}\) such that if the initial data conditions (D1)–(D4) are satisfied and the bootstrap assumptions (B1)–(B7) hold for \(s \in [\sigma _{0}, \sigma _{1}]\), the improved inequalities (IB4) and (IB5) hold true.

Proof of Lemma 5.5

We begin the proof with a basic, abstract computation. Consider a first-order operator of the form

$$\begin{aligned} {\mathcal {T}}= \partial _{s} + v \partial _{y} + q. \end{aligned}$$

We decompose \({\mathcal {T}}\) into its anti-symmetric and symmetric parts, i.e., \({\mathcal {T}}= {\mathcal {T}}^{a} + {\mathcal {T}}^{s}\), where

$$\begin{aligned} {\mathcal {T}}^{a} = \tfrac{1}{2} ({\mathcal {T}}- {\mathcal {T}}^{\dagger }) = \partial _{s} + v \partial _{y} + \tfrac{1}{2} (\partial _{y} v), \quad {\mathcal {T}}^{s} = \tfrac{1}{2} ({\mathcal {T}}+ {\mathcal {T}}^{\dagger }) = q - \tfrac{1}{2} (\partial _{y} v). \end{aligned}$$

Let \(\varpi ^{2} = \varpi ^{2}(s, y)\) be a nonnegative weight. If we multiply \({\mathcal {T}}V\) by \(\varpi ^{2} V\) and integrate over \([s_{1}, s_{2}] \times {\mathbb {R}}\), we have

$$\begin{aligned}{} & {} \int _{s_{1}}^{s_{2}} \int ({\mathcal {T}}V) (\varpi ^{2} V) \, \textrm{d}y \textrm{d}s \nonumber \\{} & {} \quad = \frac{1}{2} \left. \int \varpi ^{2} V^{2} \, \textrm{d}y \right| _{s=s_{1}}^{s_{2}}\nonumber \\{} & {} \qquad + \frac{1}{2} \int _{s_{1}}^{s_{2}} \int (\varpi ^{2} {\mathcal {T}}V) V \, \textrm{d}y \textrm{d}s + \frac{1}{2} \int _{s_{1}}^{s_{2}} \int V {\mathcal {T}}^{\dagger } (\varpi ^{2} V) \, \textrm{d}y \textrm{d}s \nonumber \\{} & {} \quad = \frac{1}{2} \left. \int \varpi ^{2} V^{2} \, \textrm{d}y \right| _{s=s_{1}}^{s_{2}} + \frac{1}{2} \int _{s_{1}}^{s_{2}} \int ([\varpi ^{2}, {\mathcal {T}}^{a}] V) V \, \textrm{d}y \textrm{d}s\nonumber \\{} & {} \qquad + \frac{1}{2} \int _{s_{1}}^{s_{2}} \int (\varpi ^{2} {\mathcal {T}}^{s} + {\mathcal {T}}^{s} \varpi ^{2}) V V \, \textrm{d}y \textrm{d}s \nonumber \\{} & {} \quad = \frac{1}{2} \left. \int \varpi ^{2} V^{2} \, \textrm{d}y \right| _{s=s_{1}}^{s_{2}} - \int _{s_{1}}^{s_{2}} \int (\breve{{\mathcal {T}}} \varpi ) \varpi V^{2} \, \textrm{d}y \textrm{d}s. \end{aligned}$$
(100)

where

$$\begin{aligned} \breve{{\mathcal {T}}} \varpi := \partial _{s} \varpi + v \partial _{y} \varpi + \frac{1}{2} (\partial _{y} v) \varpi - q \varpi . \end{aligned}$$
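The last equality in (100) can be checked directly: since \({\mathcal {T}}^{a}\) carries all the derivatives, \([\varpi ^{2}, {\mathcal {T}}^{a}] = -(\partial _{s} \varpi ^{2} + v \partial _{y} \varpi ^{2}) = -2 \varpi (\partial _{s} + v \partial _{y}) \varpi \), while \({\mathcal {T}}^{s}\) is multiplication by \(q - \frac{1}{2}(\partial _{y} v)\). Hence the two space-time integrands combine as

$$\begin{aligned} \tfrac{1}{2} ([\varpi ^{2}, {\mathcal {T}}^{a}] V) V + \tfrac{1}{2} (\varpi ^{2} {\mathcal {T}}^{s} + {\mathcal {T}}^{s} \varpi ^{2}) V V = - \Big ( \partial _{s} \varpi + v \partial _{y} \varpi - \big (q - \tfrac{1}{2} (\partial _{y} v)\big ) \varpi \Big ) \varpi V^{2} = - (\breve{{\mathcal {T}}} \varpi ) \varpi V^{2}, \end{aligned}$$

in agreement with the definition of \(\breve{{\mathcal {T}}} \varpi \).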

As in the proof of Lemma 4.3, we introduce a nonnegative smooth partition of unity \(\{\eta _{j}\}_{j \in {\mathbb {Z}}}\) on \({\mathbb {R}}\) subordinate to the open cover \(\{A_{j} = \{y \in {\mathbb {R}}: 2^{j-3}< \vert {y}\vert < 2^{j+2}\}\}_{j \in {\mathbb {Z}}}\), and also the shorthand \(\eta _{\ge j} = \sum _{j' \ge j} \eta _{j'}\).

Step 1. Our goal is to prove (IB4) concerning \(U'\). In view of (56) in Lemma 5.2, it suffices to bound the expression

$$\begin{aligned}{} & {} \sup _{j \in {\mathbb {Z}}, \, 2^{j}< e^{bs}}\left( \int _{2^{j-1}< \vert {y}\vert < 2^{j}} \eta _{\ge i_{0}}^{2} \big ( \vert {y}\vert ^{\frac{1}{b}-\frac{1}{2}} U'(s, y) \big )^{2}\, \textrm{d}y \right) ^{\frac{1}{2}} \nonumber \\{} & {} \quad + e^{(1-\frac{1}{2} b ) s} \left( \int _{\vert {y}\vert > \frac{e^{bs}}{2}} \eta _{\ge i_{0}}^{2} U'(s, y)^{2} \, \textrm{d}y \right) ^{\frac{1}{2}}, \end{aligned}$$
(101)

where the cutoff parameter \(i_{0} > 0\) is to be determined below. As a preparation for the proof, we introduce

$$\begin{aligned} {\mathcal {T}}_{1} = \partial _{s} + v \partial _{y} + q_{1}, \end{aligned}$$

where

$$\begin{aligned}{} & {} v = \left( b y + (1 + e^{s} \tau _{s}) U - e^{b s} \xi _{s} + (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa \right) , \\{} & {} q_{1} = \left( 1 + (1 + e^{s} \tau _{s}) U' \right) , \end{aligned}$$

and define \(\breve{{\mathcal {T}}}_1:= \partial _{s} + v \partial _{y} + \frac{1}{2} (\partial _{y} v) - q_1\). Recall, from (14), that \(U'\) obeys

$$\begin{aligned} \begin{aligned} {\mathcal {T}}_{1} U'&= (1+e^{s} \tau _{s}) \left( e^{-\mu s} {\mathcal {H}}(U') + e^{-s} \partial _{y} {\mathcal {L}}(U + e^{(b-1)s} \kappa ) \right) . \end{aligned} \end{aligned}$$
(102)

Let \(j_{1}:= \lfloor b \sigma _{0} / \log 2 \rfloor \). For each integer \(j \le j_{1}\), we introduce the weight

$$\begin{aligned}{} & {} \varpi _{j}(s, y) = e^{(1 - \frac{1}{2} b) (s - \sigma _{0})} \varpi _{j, 0}(y e^{-b(s-\sigma _{0})}), \nonumber \\{} & {} \quad \varpi _{j, 0} = {\left\{ \begin{array}{ll} 2^{(\frac{1}{b} - \frac{1}{2}) j} \eta _{j} &{} \hbox { for } j < j_{1}, \\ 2^{(\frac{1}{b} - \frac{1}{2}) j_{1}} \eta _{\ge j_{1}} &{} \hbox { for } j = j_{1}. \end{array}\right. } \end{aligned}$$
(103)

Observe that each \(\varpi _{j}\) solves the equation

$$\begin{aligned} \left( \partial _{s} + by \partial _{y} + \frac{1}{2} b - 1 \right) \varpi _{j} = 0, \end{aligned}$$

where, as we will see, the LHS is a good approximation of \(\breve{{\mathcal {T}}}_{1} \varpi _{j}\).
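That each \(\varpi _{j}\) solves this transport equation is a direct computation, which we record for convenience: writing \(z = y e^{-b(s - \sigma _{0})}\),

$$\begin{aligned} \partial _{s} \varpi _{j} = \Big ( 1 - \frac{b}{2} \Big ) \varpi _{j} - b y e^{-b(s-\sigma _{0})} e^{(1 - \frac{1}{2} b)(s - \sigma _{0})} \varpi _{j, 0}'(z), \qquad b y \partial _{y} \varpi _{j} = b y e^{-b(s-\sigma _{0})} e^{(1 - \frac{1}{2} b)(s - \sigma _{0})} \varpi _{j, 0}'(z), \end{aligned}$$

so the \(\varpi _{j,0}'\) terms cancel and \((\partial _{s} + b y \partial _{y}) \varpi _{j} = (1 - \frac{b}{2}) \varpi _{j}\), which is the stated equation.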

We are now ready to begin the proof of (IB4) in earnest. For each \(j \le j_{1}\), we apply (100) with \(\varpi = \eta _{\ge i_{0}} \varpi _{j}\), which leads to

$$\begin{aligned}&\frac{1}{2} \int \eta _{\ge i_{0}}^{2} \varpi _{j}^{2} U'(s, y)^{2} \, \textrm{d}y\\&\quad = \frac{1}{2} \int \eta _{\ge i_{0}}^{2} \varpi _{j}^{2} U'(\sigma _{0}, y)^{2} \, \textrm{d}y \\&\qquad + \int _{\sigma _{0}}^{s} \int \left( \breve{{\mathcal {T}}}_{1} (\eta _{\ge i_{0}} \varpi _{j}) U' + \eta _{\ge i_{0}} \varpi _{j} {\mathcal {T}}_{1} U' \right) \eta _{\ge i_{0}} \varpi _{j} U' \textrm{d}y \textrm{d}s. \end{aligned}$$

Thanks to (103), if we take the supremum in \(j \le j_{1}\), then the contribution of the LHS is equivalent to (101), whereas that of the first term on the RHS is bounded by a constant C as a consequence of (31). It is then straightforward to derive the estimate

$$\begin{aligned}{} & {} \sup _{s \in [\sigma _{0}, \sigma _{1}]} \sup _{j \le j_{1}} \left( \int \eta _{\ge i_{0}}^{2} \varpi _{j}^{2} U'(s, y)^{2} \, \textrm{d}y \right) ^{\frac{1}{2}} \\{} & {} \quad \le C + C \sup _{j \le j_{1}} \int _{\sigma _{0}}^{\sigma _{1}} \left( \Vert {\eta _{\ge i_{0}} \varpi _{j} {\mathcal {T}}_{1} U'}\Vert _{L^{2}} + \Vert {\breve{{\mathcal {T}}}_{1} (\eta _{\ge i_{0}} \varpi _{j}) U'}\Vert _{L^{2}}\right) \, \textrm{d}s. \end{aligned}$$

We claim that, for some \(c > 0\) and \(\sigma _{0}\) sufficiently large depending on A,

$$\begin{aligned}{} & {} \sup _{j \le j_{1}} \int _{\sigma _{0}}^{\sigma _{1}} \left( \Vert {\eta _{\ge i_{0}} \varpi _{j} {\mathcal {T}}_{1} U'}\Vert _{L^{2}} + \Vert {\breve{{\mathcal {T}}}_{1} (\eta _{\ge i_{0}} \varpi _{j}) U'}\Vert _{L^{2}}\right) \, \textrm{d}s\\{} & {} \quad \lesssim 2^{(\frac{1}{b}+1) i_{0}} + 2^{-c i_{0}} A + e^{-c \sigma _{0}} A. \end{aligned}$$

From this claim, (IB4) would follow by taking \(i_{0}\), A, and \(\sigma _{0}\) large enough (in this order).

To bound the contribution of \(\eta _{\ge i_{0}} \varpi _{j} {\mathcal {T}}_{1} U'\), we use (102). Using (B6) for \(e^{s} \tau _{s}\), (40) in Lemma 4.3 for \(e^{-\mu s} {\mathcal {H}}(U')\) (with \(\varpi = \eta _{\ge i_{0}} \varpi _{j}\), \(\ell = 0\) and \(\nu = \frac{1}{2} - \frac{1}{b}\)) with (B4)–(B5), and (50) for \(e^{-s} \partial _{y} {\mathcal {L}}(U + e^{(b-1) s} \kappa )\), we have

$$\begin{aligned}&\Vert {\eta _{\ge i_{0}} \varpi _{j} {\mathcal {T}}_{1} U'}\Vert _{L^{2}}\\&\quad \lesssim (1 + e^{-\gamma s} A) \left( e^{-\mu s} A + 2^{\left( \frac{1}{b} - \frac{1}{2}\right) j} e^{\left( 1-\frac{1}{2} b\right) (s - \sigma _{0})} (1+\kappa _{0}) e^{-\left( 2-\frac{1}{2} b\right) s} \right) \\&\quad \lesssim (1 + e^{-\gamma s} A) \left( e^{-\mu s} A + (1+\kappa _{0}) e^{-s} \right) , \end{aligned}$$

where on the last line, we used the fact that \(2^{j} \lesssim 2^{j_{1}} \simeq e^{b \sigma _{0}}\). Therefore,

$$\begin{aligned} \sup _{j \le j_{1}} \int _{\sigma _{0}}^{\sigma _{1}} \Vert {\eta _{\ge i_{0}} \varpi _{j} {\mathcal {T}}_{1} U'}\Vert _{L^{2}} \, \textrm{d}s \lesssim e^{-c \sigma _{0}} A, \end{aligned}$$

for some small \(c > 0\), A sufficiently large and \(\sigma _{0}\) sufficiently large (determined in this order).

Next, we bound the contribution of \(\breve{{\mathcal {T}}}_{1} (\eta _{\ge i_{0}} \varpi _{j}) U'\). We compute

$$\begin{aligned} \breve{{\mathcal {T}}}_{1} (\eta _{\ge i_{0}} \varpi _{j}) = v \eta _{\ge i_{0}}' \varpi _{j} + \eta _{\ge i_{0}} (v - by) \varpi _{j}' - \frac{1}{2} \eta _{\ge i_{0}} (1+e^{s} \tau _{s}) U' \varpi _{j}. \end{aligned}$$

To proceed, we note, from its definition (103), that the s-support of \(\Vert {\eta _{\ge i_{0}}' \varpi _{j}}\Vert _{L^{\infty }}\) is at most of length O(1), independent of \(i_{0}\) and j (we remark that this property is also geometrically clear from the transport equation obeyed by \(\varpi _{j}\)). Moreover, by (56) and \(U(s, 0) = 0\),

$$\begin{aligned} \vert {U'(s, y)}\vert + \vert {y}\vert ^{-1} \vert {U(s, y)}\vert \lesssim \max \{ (1+\vert {y}\vert )^{-r}, A e^{-r s} \}. \end{aligned}$$
(104)

Then, using (B6) as well,

$$\begin{aligned} \vert {v - by}\vert= & {} \vert {(1 + e^{s} \tau _{s}) U - e^{b s} \xi _{s} + (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa }\vert \\\lesssim & {} \vert {y}\vert \max \{ (1+\vert {y}\vert )^{-r}, A e^{-r s} \} + e^{-\gamma s} A. \end{aligned}$$

Finally, observe that \(\eta _{\ge i_{0}} \varpi _{j}\) and \(\eta _{\ge i_{0}} \vert {y}\vert \varpi _{j}'\) are supported in \(\{ \vert {y}\vert \gtrsim 2^{j} e^{b(s-\sigma _{0})}\} \cap \{\vert {y}\vert \gtrsim 2^{i_{0}}\}\) and are bounded by \(C 2^{(\frac{1}{b} - \frac{1}{2}) j} e^{(1 - \frac{1}{2} b)(s-\sigma _{0})} {\textbf{1}}_{> 2^{i_{0}-5}}(2^{j} e^{b(s-\sigma _{0})})\), where \({\textbf{1}}_{> 2^{i_{0}-5}}\) is the characteristic function of \(\{\vert {y}\vert > 2^{i_{0} - 5}\}\). Putting all these together, we may arrive at

$$\begin{aligned} \int _{\sigma _{0}}^{\sigma _{1}} \Vert {\breve{{\mathcal {T}}}_{1} (\eta _{\ge i_{0}} \varpi _{j}) U'}\Vert _{L^{2}} \, \textrm{d}s&\lesssim 2^{(\frac{1}{b}+\frac{1}{2}) i_{0}} \Vert {U'}\Vert _{L^{2}({{\,\textrm{supp}\,}}\eta _{\ge i_{0}}')} \\&\quad + \int _{\sigma _{0}}^{\sigma _{1}} \left( 2^{-r j} e^{-rb(s-\sigma _{0})} {\textbf{1}}_{> 2^{i_{0}-5}}(2^{j} e^{b(s-\sigma _{0})}) + A e^{-r s} \right) \\&\qquad \Vert {U}\Vert _{\dot{{\mathcal {H}}}_{<e^{bs}}^{1}} \, \textrm{d}s \\&\lesssim 2^{(\frac{1}{b}+1) i_{0}} + \left( 2^{-r i_{0}} + A e^{-r \sigma _{0}} \right) A, \end{aligned}$$

where we used (IB1), (IB2) and (B4) on the last line. Taking \(0< c < r\) and \(e^{(r-c)\sigma _{0}} > A\), the last line is bounded by the right-hand side of the claim, as desired.

Step 2. Now we turn to the proof of (IB5) concerning \(U^{(2k+3)}\). To simplify the notation, let us write \({\mathcal {U}}= U^{(2k+3)}\). In view of Lemma 5.4, it suffices to bound the expression

$$\begin{aligned}{} & {} \sup _{j \in {\mathbb {Z}}, \, 2^{j}< e^{bs}} \left( \int _{2^{j-1}< \vert {y}\vert < 2^{j}} \eta _{\ge i_{0}}^{2} \big ( \vert {y}\vert ^{\frac{1}{b} + 2k+\frac{3}{2}} {\mathcal {U}}(s, y)\big )^{2} \, \textrm{d}y \right) ^{\frac{1}{2}}\nonumber \\{} & {} \quad + e^{(1+(2k+\frac{3}{2}) b) s} \left( \int _{\vert {y}\vert > \frac{e^{bs}}{2}} \eta _{\ge i_{0}}^{2} {\mathcal {U}}(s, y)^{2} \, \textrm{d}y \right) ^{\frac{1}{2}}, \end{aligned}$$
(105)

where the cutoff parameter \(i_{0} > 0\) is to be determined below. As in Step 1, we introduce

$$\begin{aligned} {\mathcal {T}}_{2k+3} = \partial _{s} + v \partial _{y} + q_{2k+3}, \end{aligned}$$

where v is as before and

$$\begin{aligned} q_{2k+3} = \left( 1 + (2k+2) b + (2k+4) (1 + e^{s} \tau _{s}) U' \right) . \end{aligned}$$
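To see where the coefficient \(q_{2k+3}\) comes from, the following heuristic may be helpful; it assumes, for illustration only, that the undifferentiated equation carries the zeroth-order weight \(q_{0} = 1 - b + (1+e^{s}\tau _{s}) U'\) (the actual coefficients are dictated by (15), which is not restated here).

```latex
% Heuristic (not part of the proof). Differentiating a transport equation
%   \partial_s U + v \partial_y U + q_0 U = F
% a total of j times and keeping only the terms proportional to U^{(j)} gives
%   \partial_s U^{(j)} + v \partial_y U^{(j)} + (q_0 + j \partial_y v) U^{(j)} = \ldots
% From the formula for v - by above, \partial_y v = b + (1 + e^{s}\tau_{s}) U'.
% With the assumed q_0 = 1 - b + (1 + e^{s}\tau_{s}) U', this yields
$$\begin{aligned}
q_{j} = q_{0} + j\, \partial_{y} v = 1 + (j-1) b + (j+1)(1 + e^{s}\tau_{s}) U',
\end{aligned}$$
% which for j = 2k+3 reproduces q_{2k+3} = 1 + (2k+2) b + (2k+4)(1 + e^{s}\tau_{s}) U'.
```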

Recall, from (15), that \({\mathcal {U}}\) obeys

$$\begin{aligned} \begin{aligned} {\mathcal {T}}_{2k+3} {\mathcal {U}}&= - (1+e^{s} \tau _{s}) M^{(2k+3)} + (1+e^{s} \tau _{s}) \\&\quad \left( e^{-\mu s} {\mathcal {H}}({\mathcal {U}}) + e^{-s} \partial _{y}^{2k+3} {\mathcal {L}}(U + e^{(b-1)s} \kappa ) \right) . \end{aligned} \end{aligned}$$

Let \(j_{1}:= \lfloor b \sigma _{0} / \log 2 \rfloor \) (so that \(2^{j_{1}} \simeq e^{b \sigma _{0}}\)) and, for each integer \(j \le j_{1}\), we introduce the weight

$$\begin{aligned}{} & {} \varpi _{j}(s, y) = e^{(1 + (2k+\frac{3}{2}) b) (s - \sigma _{0})} \varpi _{j, 0}(y e^{-b(s-\sigma _{0})}), \\{} & {} \varpi _{j, 0} = {\left\{ \begin{array}{ll} 2^{(\frac{1}{b} + 2k+\frac{3}{2}) j} \eta _{j} &{} \hbox { for } j < j_{1}, \\ 2^{(\frac{1}{b} + 2k+\frac{3}{2}) j_{1}} \eta _{\ge j_{1}} &{} \hbox { for } j = j_{1}. \end{array}\right. } \end{aligned}$$
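As a quick check that \(\varpi _{j}\) implements the weight \(\vert {y}\vert ^{\frac{1}{b}+2k+\frac{3}{2}}\) appearing in (105): on the support of \(\eta _{j}(y e^{-b(s-\sigma _{0})})\) we have \(\vert {y}\vert \simeq 2^{j} e^{b(s-\sigma _{0})}\), and

```latex
$$\begin{aligned}
\varpi_{j}(s, y)
  &\simeq e^{(1 + (2k+\frac{3}{2}) b)(s-\sigma_{0})}\, 2^{(\frac{1}{b}+2k+\frac{3}{2}) j} \\
  &= \big( 2^{j} e^{b(s-\sigma_{0})} \big)^{\frac{1}{b}+2k+\frac{3}{2}}
   \simeq \vert y \vert^{\frac{1}{b}+2k+\frac{3}{2}},
\end{aligned}$$
% using the exponent identity (1/b + 2k + 3/2) \cdot b = 1 + (2k + 3/2) b.
% For j = j_1, since 2^{j_1} \simeq e^{b \sigma_0}, the prefactor becomes
% 2^{(1/b + 2k + 3/2) j_1} e^{(1 + (2k+3/2) b)(s - \sigma_0)} \simeq e^{(1 + (2k+3/2) b) s},
% matching the second term of (105) on { |y| > e^{bs}/2 }.
```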

For each \(j \le j_{1}\), we apply (100) with \(\varpi = \eta _{\ge i_{0}} \varpi _{j}\), which leads to

$$\begin{aligned}&\frac{1}{2} \int \eta _{\ge i_{0}}^{2} \varpi _{j}^{2} {\mathcal {U}}(s, y)^{2} \, \textrm{d}y\\&\quad = \frac{1}{2} \int \eta _{\ge i_{0}}^{2} \varpi _{j}^{2} {\mathcal {U}}(\sigma _{0}, y)^{2} \, \textrm{d}y \\&\qquad + \int _{\sigma _{0}}^{s} \int \left( \breve{{\mathcal {T}}}_{2k+3} (\eta _{\ge i_{0}} \varpi _{j}) {\mathcal {U}}+ \eta _{\ge i_{0}} \varpi _{j} {\mathcal {T}}_{2k+3} {\mathcal {U}}\right) \eta _{\ge i_{0}} \varpi _{j} {\mathcal {U}}\textrm{d}y \textrm{d}s. \end{aligned}$$

To avoid the derivative loss, we further write out the contribution of \(\eta _{\ge i_{0}} \varpi _{j} {\mathcal {T}}_{2k+3} {\mathcal {U}}\) as follows:

$$\begin{aligned} \int \eta _{\ge i_{0}} \varpi _{j} {\mathcal {T}}_{2k+3} {\mathcal {U}}\eta _{\ge i_{0}} \varpi _{j} {\mathcal {U}}\, \textrm{d}y&= \int (1+e^{s} \tau _{s}) e^{-\mu s} \eta _{\ge i_{0}} \varpi _{j} {\mathcal {H}}({\mathcal {U}}) \eta _{\ge i_{0}} \varpi _{j} {\mathcal {U}}\, \textrm{d}y \\&\quad + \int (1+e^{s} \tau _{s}) e^{-s} \eta _{\ge i_{0}} \varpi _{j} \partial _{y}^{2k+3} \\&\quad {\mathcal {L}}(U + e^{(b-1) s} \kappa ) \eta _{\ge i_{0}} \varpi _{j} {\mathcal {U}}\, \textrm{d}y \\&\quad - \int (1+e^{s} \tau _{s}) \eta _{\ge i_{0}} \varpi _{j} M^{(2k+3)} \eta _{\ge i_{0}} \varpi _{j} {\mathcal {U}}\, \textrm{d}y \\&= (1+e^{s} \tau _{s}) e^{-\mu s} \int {\mathcal {H}}(\eta _{\ge i_{0}} \varpi _{j} {\mathcal {U}}) \eta _{\ge i_{0}} \varpi _{j} {\mathcal {U}}\, \textrm{d}y \\&\quad + \int (1+e^{s} \tau _{s}) e^{-\mu s} [\eta _{\ge i_{0}} \varpi _{j}, {\mathcal {H}}] {\mathcal {U}}\eta _{\ge i_{0}} \varpi _{j} {\mathcal {U}}\, \textrm{d}y \\&\quad + \int (1+e^{s} \tau _{s}) e^{-s} \eta _{\ge i_{0}} \varpi _{j} \partial _{y}^{2k+3} \\&\quad {\mathcal {L}}(U + e^{(b-1) s} \kappa ) \eta _{\ge i_{0}} \varpi _{j} {\mathcal {U}}\, \textrm{d}y \\&\quad - \int (1+e^{s} \tau _{s}) \eta _{\ge i_{0}} \varpi _{j} M^{(2k+3)} \eta _{\ge i_{0}} \varpi _{j} {\mathcal {U}}\, \textrm{d}y. \end{aligned}$$

Observe that the first term on the far right-hand side is nonpositive, by the dispersive/dissipative property of \(\Gamma \)/\(\Upsilon \), respectively, and may therefore be dropped. Returning to the weighted energy identity, we take the supremum in \(j \le j_{1}\) and in \(s \in [\sigma _{0}, \sigma _{1}]\), then use (31) to bound

$$\begin{aligned} \left( \int \eta _{\ge i_{0}}^{2} \varpi _{j}^{2} {\mathcal {U}}(\sigma _{0}, y)^{2} \, \textrm{d}y \right) ^{\frac{1}{2}} \le C. \end{aligned}$$

In conclusion, we arrive at

$$\begin{aligned}{} & {} \sup _{s \in [\sigma _{0}, \sigma _{1}]} \sup _{j \le j_{1}} \left( \int \eta _{\ge i_{0}}^{2} \varpi _{j}^{2} {\mathcal {U}}(s, y)^{2} \, \textrm{d}y \right) ^{\frac{1}{2}}\nonumber \\{} & {} \quad \le C + C \sup _{j \le j_{1}} \int _{\sigma _{0}}^{\sigma _{1}} \left( \textrm{I} + \textrm{II} + \textrm{III} + \textrm{IV} \right) \, \textrm{d}s, \end{aligned}$$
(106)

where

$$\begin{aligned} \textrm{I}&= (1+e^{s} \tau _{s}) e^{-\mu s} \Vert {[\eta _{\ge i_{0}} \varpi _{j}, {\mathcal {H}}] {\mathcal {U}}}\Vert _{L^{2}}, \\ \textrm{II}&= (1+e^{s} \tau _{s}) e^{-s} \Vert {\eta _{\ge i_{0}} \varpi _{j} \partial _{y}^{2k+3} {\mathcal {L}}(U + e^{(b-1) s} \kappa )}\Vert _{L^{2}}, \\ \textrm{III}&= (1+e^{s} \tau _{s}) \Vert {\eta _{\ge i_{0}} \varpi _{j} M^{(2k+3)}}\Vert _{L^{2}}, \\ \textrm{IV}&= \Vert {\breve{{\mathcal {T}}}_{2k+3} (\eta _{\ge i_{0}} \varpi _{j}) {\mathcal {U}}}\Vert _{L^{2}}. \end{aligned}$$

We claim that, for some \(c > 0\) and \(\sigma _{0}\) sufficiently large depending on A,

$$\begin{aligned} \sup _{j \le j_{1}} \int _{\sigma _{0}}^{\sigma _{1}} \left( \textrm{I} + \textrm{II} + \textrm{III} + \textrm{IV} \right) \, \textrm{d}s \lesssim 2^{(\frac{1}{b}+2k+3) i_{0}} + 2^{-c i_{0}} A + e^{-c \sigma _{0}} A. \end{aligned}$$

Since the LHS of (106) is equivalent to (105), (IB5) would follow from the claim by taking \(i_{0}\), A, and \(\sigma _{0}\) large enough (in this order).

To bound the contributions of \(\textrm{I}\) and \(\textrm{II}\), we use Lemma 4.3 and (50) (as well as (B4)–(B6)) to estimate

$$\begin{aligned} \sup _{j \le j_{1}} \int _{\sigma _{0}}^{\sigma _{1}} \textrm{I} \, \textrm{d}s \lesssim e^{-\mu \sigma _{0}} A, \quad \sup _{j \le j_{1}} \int _{\sigma _{0}}^{\sigma _{1}} \textrm{II} \, \textrm{d}s \lesssim e^{-\sigma _{0}} (1+\kappa _{0}). \end{aligned}$$

Both right-hand sides are bounded by \(e^{-c \sigma _{0}} A\) if we take \(0< c < \min \{\mu , 1\}\), A sufficiently large and \(\sigma _{0}\) sufficiently large (in this order). To treat \(\textrm{III}\), we begin by noting that

$$\begin{aligned} \begin{aligned}&\Vert {\eta _{\ge i_{0}} \varpi _{j} M^{(2k+3)}}\Vert _{L^{2}} \\&\quad \lesssim \Vert {U'}\Vert _{L^{\infty }(\tilde{A}_{j})} \left( 2^{(2k+\frac{3}{2} + \frac{1}{b}) j} e^{((2k+\frac{3}{2})b - \frac{1}{2}) (s-\sigma _{0})}\Vert {U^{(2k+3)}}\Vert _{L^{2}(\tilde{A}_{j})}\right. \\&\left. \qquad +\, 2^{(\frac{1}{b} - \frac{1}{2}) j} e^{(1 - \frac{1}{2} b) (s-\sigma _{0})} \Vert {U'}\Vert _{L^{2}(\tilde{A}_{j})} \right) , \end{aligned} \end{aligned}$$

where \(\tilde{A}_{j}\) is a slight enlargement of \({{\,\textrm{supp}\,}}\varpi _{j}\). Indeed, this inequality is proved by first writing

$$\begin{aligned} \eta _{\ge i_{0}} \varpi _{j} M^{(2k+3)}&= \eta _{\ge i_{0}} \varpi _{j} \sum _{\ell = 2}^{2k+2} \frac{1}{2}\left( {\begin{array}{c}2k+4\\ \ell \end{array}}\right) U^{(\ell )} U^{(2k+4-\ell )} \\&= \eta _{\ge i_{0}} \varpi _{j} \sum _{\ell = 2}^{2k+2} \frac{1}{2}\left( {\begin{array}{c}2k+4\\ \ell \end{array}}\right) \partial _{y}^{\ell -1} ({\tilde{\eta }}_{j} U') \partial _{y}^{2k+3-\ell }({\tilde{\eta }}_{j} U') \end{aligned}$$

then applying the usual Gagliardo–Nirenberg inequalities to \({\tilde{\eta }}_{j} U'\); here, \({\tilde{\eta }}_{j}\) is a smooth bump function such that \({{\,\textrm{supp}\,}}{\tilde{\eta }}_{j} \subseteq \tilde{A}_{j}\), \(\vert {\partial _{y}^{N} {\tilde{\eta }}_{j}}\vert \lesssim _{N} 2^{-N j}\) (for any \(N \ge 0\)) and \({\tilde{\eta }}_{j} = 1\) on \({{\,\textrm{supp}\,}}\varpi _{j}\). Then using (104), (B4), and (B5) [as well as (B6)], we obtain

$$\begin{aligned} \int _{\sigma _{0}}^{\sigma _{1}} \textrm{III} \, \textrm{d}s \lesssim (2^{-r i_{0}} + A e^{-r \sigma _{0}})A, \end{aligned}$$

which is bounded by \(2^{-c i_{0}}A + e^{-c \sigma _{0}} A\) if we take \(0< c < r\) and \(e^{(r-c)\sigma _{0}} > A\). Finally, the contribution of \(\textrm{IV}\) is handled similarly to the term \(\breve{{\mathcal {T}}}_{1} (\eta _{\ge i_{0}} \varpi _{j}) U'\) in Step 1, except that we use Lemma 5.4 and (B4)–(B5) instead of (IB1)–(IB2) and (B4), respectively. We may show that

$$\begin{aligned} \int _{\sigma _{0}}^{\sigma _{1}} \textrm{IV} \, \textrm{d}s \lesssim 2^{(\frac{1}{b}+2k+3) i_{0}} + (2^{-r i_{0}} + A e^{-r \sigma _{0}}) A, \end{aligned}$$

which is bounded by the right-hand side of the desired claim if \(0< c < r\) and \(e^{(r-c)\sigma _{0}} > A\). This completes the proof. \(\square \)
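For the reader's convenience, the standard Gagliardo–Nirenberg step invoked for \(\textrm{III}\) in the proof above can be recorded explicitly: with \(f = {\tilde{\eta }}_{j} U'\) and \(2 \le \ell \le 2k+2\), Hölder's inequality and one-dimensional interpolation give

```latex
$$\begin{aligned}
\Vert \partial_{y}^{\ell-1} f \, \partial_{y}^{2k+3-\ell} f \Vert_{L^{2}}
 &\le \Vert \partial_{y}^{\ell-1} f \Vert_{L^{p}}
      \Vert \partial_{y}^{2k+3-\ell} f \Vert_{L^{q}},
 \qquad \tfrac{1}{p} + \tfrac{1}{q} = \tfrac{1}{2}, \\
\Vert \partial_{y}^{a} f \Vert_{L^{2(2k+2)/a}}
 &\lesssim \Vert f \Vert_{L^{\infty}}^{1-\frac{a}{2k+2}}
   \Vert \partial_{y}^{2k+2} f \Vert_{L^{2}}^{\frac{a}{2k+2}}
 \qquad (0 < a \le 2k+2).
\end{aligned}$$
```

Taking \(p = \frac{2(2k+2)}{\ell -1}\) and \(q = \frac{2(2k+2)}{2k+3-\ell }\) (note \((\ell -1)+(2k+3-\ell ) = 2k+2\)), the product is bounded by \(\Vert f\Vert _{L^{\infty }} \Vert \partial _{y}^{2k+2} f\Vert _{L^{2}}\); expanding \(\partial _{y}^{2k+2} f = \partial _{y}^{2k+2}({\tilde{\eta }}_{j} U')\) by the Leibniz rule then produces the \(U^{(2k+3)}\) term together with lower-order terms, which may again be interpolated between \(U'\) and \(U^{(2k+3)}\).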

6 Estimates on Modulation Parameters and Unstable Coefficients

In this section, we analyze the ODEs satisfied by the modulation parameters and the coefficients \(W^{(j)}(s, 0)\). We prove the bootstrap assumptions in Lemma 3.6 involving the modulation ODEs and \(W^{(2k+1)}(s, 0)\), as well as Lemma 3.9 in its entirety.

6.1 Control of the Modulation Parameters and \(w_{2k+1}\)

We establish sharp bounds on the ODEs for the modulation parameters, which improve (B6).

Lemma 6.1

(Control of the modulation parameters) Assume the hypotheses of Lemma 3.6. Then we have

$$\begin{aligned}&\vert {e^{s} \tau _{s}}\vert \le C_{A} e^{-\min \{2 \gamma , \mu \} s}, \end{aligned}$$
(107)
$$\begin{aligned}&\vert {e^{bs} \xi _{s} - (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa }\vert \le C_{A} e^{- \min \{2\gamma , \mu \} s}, \end{aligned}$$
(108)
$$\begin{aligned}&\vert {e^{(b-1) s} \kappa _{s}}\vert \le C_{A} e^{-\min \{2\gamma , \mu _{0}\} s}. \end{aligned}$$
(109)

In particular, if \(\sigma _{0}\) is sufficiently large (depending only on A, \(y_{0}\), \(\kappa _{0}\)), (IB6) holds.

Proof

For \(\tau _{s}\), by (23), we have

$$\begin{aligned} \vert {e^{s} \tau _{s}}\vert \le \frac{\vert {F^{(1)}(s, 0)}\vert + \vert {w_2(e^{bs}\xi _s - (1+e^{s}\tau _s)e^{(b-1)s}\kappa )}\vert }{1 - \vert {F^{(1)}(s, 0)}\vert }. \end{aligned}$$

By (B6), (T) (in the case \(k \ge 2\) for \(w_{2}\)), (52) and (53) with \(j = 1\), \(\vert {F^{(1)}(s, 0)}\vert \le C_{A} \left( e^{- \mu s} + e^{- 2\gamma s}\right) \) (since \(0< \mu < 2\)), which implies that

$$\begin{aligned} |e^s \tau _s| \le 2C_A ( e^{- \mu s} + e^{- 2\gamma s} + C e^{- 2\gamma s}) \le C_A e^{- \text {min}\{2\gamma , \mu \} s}, \end{aligned}$$

thereby showing (107).

For \(\xi _{s}\), we use (23) to bound

$$\begin{aligned} \vert {e^{bs} \xi _{s} - (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa }\vert \le \frac{1}{(2k)!-1} \left( \vert {\left. N(2k) \right| _{y=0}}\vert + (1+\vert {e^{s} \tau _{s}}\vert ) \vert {F^{(2k)}(s, 0)}\vert \right) , \end{aligned}$$

where we crucially used (B7) to ensure that \((2k)!+w_{2k+1} \ge (2k)!-1 > 0\) in the denominator. By (T) and (21), we have \(\vert {\left. N(2k) \right| _{y=0}}\vert \lesssim e^{-2 \gamma s}\). On the other hand, by (52), (53) with \(j = 2k\), and (107), we have \(\vert {F^{(2k)}(s, 0)}\vert \le C_{A} e^{- \mu s}\) (since \(0< \mu < 2\)). At this point, (108) follows.

Finally, for \(\kappa _{s}\), we use (22) to bound

$$\begin{aligned} \vert {e^{(b-1) s} \kappa _{s}}\vert \le \vert {e^{bs} \xi _{s} - (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa }\vert + (1+\vert {e^{s} \tau _{s}}\vert ) \vert {F^{(0)}(s, 0)}\vert . \end{aligned}$$

By (51) and (53) with \(j = 0\), we have \(\vert {F^{(0)}(s, 0)}\vert \le C e^{- \min \{\mu _{0}, 2-b\} s} = C e^{-\mu _{0} s}\) (since \(2-b = \frac{2k-1}{2k}\)). Combined with the previous bounds (107) and (108) for \(\vert {e^{s} \tau _{s}}\vert \) and \(\vert {e^{bs} \xi _{s} - (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa }\vert \), respectively, (109) follows.

Finally, since \(\gamma < \min \{2 \gamma , \mu , \mu _{0}\}\) by (27), (IB6) follows from (107)–(109) provided that \(\sigma _{0}\) is sufficiently large. \(\square \)

Next, we study the ODE satisfied by \(W^{(2k+1)}(s, 0)\) and improve (B7).

Lemma 6.2

(Control of the stable coefficient) Assume the hypotheses of Lemma 3.6. Then (IB7) holds true for \(\sigma _{0}\) sufficiently large.

Proof

Applying Eq. (17) with \(j = 2k+1\) and using \({\bar{U}}^{(2k+1)}(0) = (2k)!\) and \({\bar{U}}^{(2k+2)}(0) = 0\), we obtain the equation for \(w_{2k+1} = W^{(2k+1)}(s,0)\):

$$\begin{aligned}{} & {} \partial _{s} w_{2k+1} + (1+e^{s} \tau _{s}) \left. N(2k+1) \right| _{y = 0} \nonumber \\{} & {} \qquad + \left( - e^{b s} \xi _{s} + (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa \right) w_{2k+2} - e^{s} \tau _{s} \left( (2k)! - (2k+2) w_{2k+1} \right) \nonumber \\{} & {} \quad =(1+e^{s} \tau _{s}) F^{(2k+1)}(s, 0). \end{aligned}$$
(110)

We will show that all terms in display (110) other than \(\partial _{s} w_{2k+1}\) decay exponentially as s increases. Let us analyze the terms one by one. First,

$$\begin{aligned} \big \vert (1+e^s\tau _s) \left. N(2k+1) \right| _{y=0} \big \vert \le C_A e^{- \gamma s}, \end{aligned}$$

thanks to the trapping assumption (T), the improved bound (IB6), and the bounds \(|W^{(2k+1)}(s,0)| \le 1\) and \(|W^{(2k+2)}(s,0)| \le C A\) (the latter follows from estimate (88) in Lemma 5.3 and (B3), together with Sobolev embedding).

In addition,

$$\begin{aligned} \Big |\left( - e^{b s} \xi _{s} + (1+e^{s} \tau _{s}) e^{(b-1) s} \kappa \right) w_{2k+2}\Big | \le C_A e^{-\gamma s}, \end{aligned}$$

which follows by (IB6) and (IB3).

Moreover, by (IB6), (52) and (53), we have \(\vert {(1+e^{s} \tau _{s}) F^{(2k+1)}(s, 0)}\vert \lesssim _{A} e^{-\mu s}\). Finally, recall from (29) that the initial data at \(\sigma _{0}\) are such that \(|\partial _y^{2k+1} W(\sigma _{0}, 0) |\le C \epsilon _{0} e^{-(1+2kb)\sigma _{0}}\). Therefore, we obtain an equation of the following type for \(w_{2k+1}\):

$$\begin{aligned} \partial _s w_{2k+1} - e^s \tau _s(2k + 2) w_{2k+1} = h^\#(s), \end{aligned}$$
(111)

where \(|h^\#(s)|\le C_A \exp (-\min \{\gamma ,\mu \}s)\). Integrating (111) in s, we can prove the improved bootstrap bound (IB7) upon choosing \(\sigma _{0}\) sufficiently large, as desired. \(\square \)
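For completeness, the integration of (111) just invoked can be sketched as follows; the integrating factor \(\Phi \) is introduced here only for illustration.

```latex
% Set \Phi(s) = (2k+2) \int_{\sigma_0}^{s} e^{s'} \tau_{s'} \, ds', so that (111)
% reads \partial_s ( e^{-\Phi} w_{2k+1} ) = e^{-\Phi} h^{\#}, and hence
$$\begin{aligned}
w_{2k+1}(s) = e^{\Phi(s)} w_{2k+1}(\sigma_{0})
  + \int_{\sigma_{0}}^{s} e^{\Phi(s) - \Phi(s')} h^{\#}(s') \, \textrm{d}s'.
\end{aligned}$$
% By (107), |\Phi(s)| \le C_A e^{-\min\{2\gamma, \mu\} \sigma_0} \le 1 once
% \sigma_0 is large, so every exponential factor above is bounded by e, and
$$\begin{aligned}
\vert w_{2k+1}(s) \vert
  \le e \, C \epsilon_{0} e^{-(1+2kb)\sigma_{0}}
  + e \, C_{A} \int_{\sigma_{0}}^{\infty} e^{-\min\{\gamma, \mu\} s'} \, \textrm{d}s',
\end{aligned}$$
% which can be made as small as needed by taking \sigma_0 sufficiently large.
```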

6.2 Control of the Unstable Coefficients: Proof of Lemma 3.9

The purpose of this subsection is to prove Lemma 3.9 (shooting lemma), which is relevant when \(k > 1\). We start by establishing the key outgoing property of the unstable ODE near the boundary of the trapped region.

Lemma 6.3

Under the hypotheses of Lemma 3.6, there exists \(c_{0} > 0\) such that, for \(\sigma _{0} \) sufficiently large (depending only on k, \(\mu \), \(\gamma \) and A), the following holds: for any \(s \in [\sigma _{0}, \sigma _{1}]\) such that

$$\begin{aligned} \frac{1}{2} e^{- \gamma s}< \vert {\vec {w}(s)}\vert < e^{- \gamma s}, \end{aligned}$$
(112)

we have

$$\begin{aligned} \partial _{s} \vert {\vec {w}(s)}\vert ^{2} > 2 c_{0} \vert {\vec {w}(s)}\vert ^{2}. \end{aligned}$$

Proof

We recall the vector \(\vec {w}(s) = (w_2, \ldots , w_{2k-1})(s)\), which satisfies the following system of ODEs:

$$\begin{aligned} \partial _s \vec {w}(s) - D \vec {w}(s) + (1+e^{s} \tau _{s}) {\mathcal {N}}(\vec {w}(s)) = M \vec {w}(s) + \vec {f}(s). \end{aligned}$$
(113)

Footnote 16

Here \(D = \textrm{diag}\,\left( \lambda _{2}, \ldots , \lambda _{2k-1} \right) \) with \(\lambda _{j} = 1 - \frac{j-1}{2k}\), so that \(1> \lambda _{2}> \ldots> \lambda _{2k-1} > 0\). We set

$$\begin{aligned} c_{0} = \frac{1}{2} \lambda _{2k-1}. \end{aligned}$$

We now evaluate \(\partial _{s} \vert {\vec {w}(s)}\vert ^{2}\) using (113):

$$\begin{aligned}{} & {} \frac{1}{2} \partial _s |\vec {w}(s)|^2 - D \vec {w}(s) \cdot \vec {w}(s)+(1+e^s \tau _s){\mathcal {N}}(\vec {w}(s))\cdot \vec {w}(s) \\{} & {} \quad = M \vec {w}(s) \cdot \vec {w}(s) + \vec {f}(s) \cdot \vec {w}(s). \end{aligned}$$

The contribution of \(D \vec {w}(s)\) gives the main positive term \(4 c_{0} \vert {\vec {w}}\vert ^{2}\) in \(\partial _{s} \vert {\vec {w}}\vert ^{2}\): since \(D = \textrm{diag}\,\left( \lambda _{2}, \ldots , \lambda _{2k-1} \right) \) with \(\lambda _{j} \ge \lambda _{2k-1} = 2 c_{0}\) for all j, we have \(2 D \vec {w}(s) \cdot \vec {w}(s) \ge 4 c_{0} \vert {\vec {w}}\vert ^{2}\). We claim that the contribution of the remaining terms to \(\partial _{s} \vert {\vec {w}}\vert ^{2}\) is bounded below by \(- 2 c_{0} \vert {\vec {w}}\vert ^{2}\) if \(\sigma _{0}\) is sufficiently large. First, for \({\mathcal {N}}\), we have, by (T),

$$\begin{aligned} |{\mathcal {N}}(\vec {w}(s))| \le C_k |\vec {w}(s)|^2 \le C_{k} e^{-\gamma s} \vert {\vec {w}(s)}\vert \le \frac{c_0}{4} |\vec {w}(s)| \end{aligned}$$

for \(\sigma _0\) sufficiently large.

Next, concerning the term \(M \vec {w}(s)\), we have

$$\begin{aligned} \vert {M}\vert \le C_{k} e^{-\gamma s}, \end{aligned}$$

which follows from (IB6). Finally, by (52), (53) for \(j \ge 2\), (IB6) and (112), we have

$$\begin{aligned} \vert {\vec {f}}\vert \le C_{k, A} e^{-\mu s} \le 2C_{k, A} e^{-(\mu - \gamma ) s} \vert {\vec {w}(s)}\vert . \end{aligned}$$

Recalling from (27) that \(\gamma < \mu _{0} \le \mu \), the factor \(e^{-(\mu - \gamma ) s}\) decays exponentially, so the contribution of \(\vec {f}(s)\) may also be absorbed for \(\sigma _{0}\) sufficiently large, which concludes the proof of the lemma. \(\square \)
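For completeness, the absorption argument above can be summarized in a single display (constants as in the preceding estimates):

```latex
$$\begin{aligned}
\partial_{s} \vert \vec{w}(s) \vert^{2}
 &= 2 D \vec{w} \cdot \vec{w}
   - 2 (1+e^{s}\tau_{s}) {\mathcal {N}}(\vec{w}) \cdot \vec{w}
   + 2 M \vec{w} \cdot \vec{w} + 2 \vec{f} \cdot \vec{w} \\
 &\ge 4 c_{0} \vert \vec{w} \vert^{2}
   - \tfrac{c_{0}}{2} (1 + \vert e^{s}\tau_{s} \vert) \vert \vec{w} \vert^{2}
   - 2 C_{k} e^{-\gamma s} \vert \vec{w} \vert^{2}
   - 4 C_{k, A} e^{-(\mu-\gamma) s} \vert \vec{w} \vert^{2}
  \; > \; 2 c_{0} \vert \vec{w} \vert^{2}
\end{aligned}$$
% for s \ge \sigma_0 with \sigma_0 sufficiently large, since each negative
% term on the second line is then smaller than (2/3) c_0 |\vec{w}|^2.
```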

Finally, we are ready to prove Lemma 3.9.

Proof of Lemma 3.9

For each \(\vert {\vec {w}_{0}}\vert \le e^{- \gamma \sigma _{0}}\), denote by \(U_{\vec {w}_{0}}(s, y)\) the solution with initial data at \(s = \sigma _{0}\) induced by \(\vec {w}_{0}\) and \(W_{0}\), and write \(\vec {w}_{\vec {w}_{0}}(s)\) for the vector \((\partial _{y}^{2} U_{\vec {w}_{0}}(s,0), \ldots , \partial _{y}^{2k-1} U_{\vec {w}_{0}}(s,0))\). For the purpose of contradiction, suppose that for all \(\vec {w}_{0}\) satisfying \(\vert {\vec {w}_{0}}\vert \le e^{- \gamma \sigma _{0}}\), \(U_{\vec {w}_{0}}\) does not remain trapped forever. By Lemma 3.6 and a standard bootstrap argument, there exists a unique \(\sigma _{trap}(\vec {w}_{0}) \ge \sigma _{0}\) such that \(\vert {\vec {w}_{\vec {w}_{0}}(\sigma _{trap}(\vec {w}_{0}))}\vert = e^{- \gamma \sigma _{trap}(\vec {w}_{0})}\) while \(\vert {\vec {w}_{\vec {w}_{0}}(s)}\vert < e^{- \gamma s}\) for all \(\sigma _{0} \le s < \sigma _{trap}(\vec {w}_{0})\) (see the discussion following Definition 3.8). The key step in the proof is to establish the following:

Claim. The map \(H: B_{0}(e^{-\gamma \sigma _{0}}) \rightarrow \partial B_{0}(1)\), \(\vec {w}_{0} \mapsto e^{\gamma \sigma _{trap}(\vec {w}_{0})} \vec {w}_{\vec {w}_{0}}(\sigma _{trap}(\vec {w}_{0}))\) is continuous.

Assuming the claim, we first conclude the proof of the lemma. Note, first, that for \(\vec {w}_{0} \in \partial B_{0}(e^{-\gamma \sigma _{0}})\), we trivially have \(\sigma _{trap} = \sigma _{0}\) and \(\vec {w}_{\vec {w}_{0}}(\sigma _{trap}) = \vec {w}_{0}\); hence H restricts to the identity on the boundary \(\partial B_{0}(e^{-\gamma \sigma _{0}})\). Then, by composing with \(B_{0}(1) \rightarrow B_{0}(e^{-\gamma \sigma _{0}})\), \(\vec {v} \mapsto e^{-\gamma \sigma _{0}} \vec {v}\), we obtain a continuous map from \(B_{0}(1)\) into \(\partial B_{0}(1)\) that equals the identity on \(\partial B_{0}(1)\) (i.e., a continuous retraction \(B_{0}(1) \rightarrow \partial B_{0}(1)\)). As is well known (cf. the proof of Brouwer's fixed point theorem), no such map exists, which is a contradiction.
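For completeness, here is the standard deduction from Brouwer's fixed point theorem that no continuous retraction \(G: \overline{B_{0}(1)} \rightarrow \partial B_{0}(1)\) with \(G|_{\partial B_{0}(1)} = \textrm{id}\) can exist:

```latex
% Suppose such a retraction G exists. Then -G maps the closed ball
% \overline{B_0(1)} continuously into itself (indeed, into \partial B_0(1)),
% so Brouwer's fixed point theorem yields a point v_* with
$$\begin{aligned} v_{*} = - G(v_{*}). \end{aligned}$$
% Since G takes values in \partial B_0(1), we have |v_*| = 1, i.e.
% v_* \in \partial B_0(1), and hence G(v_*) = v_* by the retraction property.
% Combining the two identities gives v_* = -v_*, so v_* = 0,
% contradicting |v_*| = 1.
```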

It remains to establish the claim. Fix \(\vec {w}_{0} \in B_{0}(e^{-\gamma \sigma _{0}})\). By definition, (T) holds for \(s \in [\sigma _{0}, \sigma _{trap}(\vec {w}_{0})]\), so \(U_{\vec {w}_{0}}\) obeys (IB1)–(IB7) on \(s \in [\sigma _{0}, \sigma _{trap}(\vec {w}_{0})]\). By a standard argument involving analysis of the linearized system, it can be shown that \(\vec {v} \mapsto U_{\vec {v}}\) is Lipschitz continuousFootnote 17 near \(\vec {w}_{0}\) in \(C_{s} (I; H^{2k+2})\), where I is a fixed open interval containing \([\sigma _{0}, \sigma _{trap}(\vec {w}_{0})]\). By Sobolev embedding, it follows that \(\vec {w}_{\vec {v}}(s)\) depends continuously on \((s, \vec {v}) \in I \times B_{\vec {w}_{0}}(\delta )\) for some \(\delta > 0\). To establish the continuity of H at \(\vec {w}_{0}\), it therefore only remains to show that \(\vec {v} \mapsto \sigma _{trap}(\vec {v})\) is continuous at \(\vec {v} = \vec {w}_{0}\).

To prove the continuity of \(\sigma _{trap}\), we begin by using the outgoing property near the boundary (Lemma 6.3) to make the following observation: for an arbitrary sufficiently small number \(\epsilon > 0\), if \(U_{\vec {v}}\) is trapped on \([\sigma _{0}, \sigma _{1})\) and \(\vert {\vec {w}_{\vec {v}}(\sigma _{1})}\vert > e^{-\gamma \sigma _{1} - c_{0} \epsilon }\) then \(\vert {\sigma _{trap}(\vec {v}) - \sigma _{1}}\vert < \epsilon \). When \(\vec {w}_{0} \in \partial B_{0}(e^{-\gamma \sigma _{0}})\), the continuity of \(\sigma _{trap}\) at \(\vec {w}_{0}\) follows immediately by applying the preceding statement with \(\sigma _{1} = \sigma _{0}\), since \(\epsilon > 0\) may be arbitrarily small. When \(\vec {w}_{0} \not \in \partial B_{0}(e^{-\gamma \sigma _{0}})\), we have \(\sigma _{trap}(\vec {w}_{0}) > \sigma _{0}\). Clearly, there exists \(\sigma _{1} \in (\sigma _{0}, \sigma _{trap}(\vec {w}_{0}))\) such that \(e^{-\gamma \sigma _{1} - c_{0} \epsilon }< \vert {\vec {w}_{\vec {w}_{0}}(\sigma _{1})}\vert < e^{-\gamma \sigma _{1}}\) and \(\vert {\sigma _{1} - \sigma _{trap}(\vec {w}_{0})}\vert < \epsilon \). By Lemma 3.6 and the strict inequality \(\vert {\vec {w}_{\vec {w}_{0}}(\sigma _{1})}\vert < e^{-\gamma \sigma _{1}}\), for \(\vec {v}\) sufficiently close to \(\vec {w}_{0}\), the corresponding solution \(U_{\vec {v}}\) is trapped on \([\sigma _{0}, \sigma _{1}]\) and obeys \(e^{-\gamma \sigma _{1} - c_{0} \epsilon }< \vert {\vec {w}_{\vec {v}}(\sigma _{1})}\vert < e^{-\gamma \sigma _{1}}\). Hence, \(\vert {\sigma _{trap}(\vec {v}) - \sigma _{trap}(\vec {w}_{0})}\vert \le \vert {\sigma _{trap}(\vec {v}) - \sigma _{1}}\vert + \vert {\sigma _{trap}(\vec {w}_{0}) - \sigma _{1}}\vert < 2 \epsilon \), which implies the desired continuity of \(\sigma _{trap}\). \(\square \)