Introduction

The present paper is concerned with existence, nonuniqueness and results of h-principle type for Hölder continuous, weak solutions to inviscid active scalar equations with a divergence free drift velocity. These equations have the form

$$\begin{aligned} \begin{aligned} \partial _t \theta + \partial _l(\theta u^l)&= 0 \\ u^l&= T^l[\theta ] \\ \partial _l u^l&= 0. \end{aligned} \end{aligned}$$
(1.1)

The operator \(T^l[ \cdot ]\) defining the drift velocity \(u^l\) in (1.1) is represented in frequency space by a multiplier

$$\begin{aligned} {\hat{u}}^l(\xi ) = {\widehat{T}}^l[\theta ](\xi )&= m^l(\xi ) {\hat{\theta }}(\xi ). \end{aligned}$$
(1.2)

We assume that \(m^l(\xi )\) is defined on the whole frequency space as a tempered distribution and is homogeneous of degree 0 so that \(T^l\) is an operator of order 0. The multiplier must satisfy \(m^l(-\xi ) = \overline{m^l(\xi )}\) so that the drift-velocity \(u^l\) is real-valued whenever the scalar \(\theta \) is real-valued, and we assume that \(m^l(\xi )\) is smooth away from the origin. The requirement that \(u^l\) is divergence free corresponds to the requirement that \(m^l(\xi )\) takes values perpendicular to the frequency vector \(\xi \), i.e. \(\xi \cdot m(\xi ) = 0\) for \(\xi \ne 0\).

Active scalar equations arise from the full Navier–Stokes, Euler, or magneto-hydrodynamic equations in a number of physical regimes, such as stratification, rapid rotation, hydrostatic, and geostrophic balance. Physically motivated examples include:

  1.

    The surface quasi-geostrophic (SQG) equation [17, 30]. Here

    $$\begin{aligned} m(\xi ) = i \langle -\xi _2,\xi _1\rangle |\xi |^{-1} \end{aligned}$$

    is an odd symbol, bounded and smooth on the unit sphere. The SQG equation belongs to a general class of active scalar equations (with odd constitutive law T) satisfied by the vorticity of a generalized two-dimensional Euler equation on a Lie algebra (à la Arnold [1]) with a specific inner product [43] (see also [47] for a more recent account).

  2.

    The incompressible porous media (IPM) equation with velocity given by Darcy’s law [5, 20]. Here

    $$\begin{aligned} m(\xi ) = \langle \xi _1 \xi _2, - \xi _1^2 \rangle |\xi |^{-2} \end{aligned}$$

    is an even symbol, bounded and smooth on the unit sphere. Note that the IPM equation has a three-dimensional analogue, with symbol \(m(\xi ) = \langle \xi _1 \xi _3, \xi _2 \xi _3, -\xi _1^2 - \xi _2^2\rangle |\xi |^{-2}\), which is again even. Our proof applies to this three-dimensional case as well, cf. Remark 1 below.

  3.

    The magneto-geostrophic (MG) equation [27, 38, 39]. This is a three-dimensional active scalar equation, with symbol given by

    $$\begin{aligned} m(\xi ) = \Big \langle \xi _2 \xi _3 |\xi |^2 + \xi _1 \xi _2^2 \xi _3, - \xi _1\xi _3 |\xi |^2+\xi _2^3 \xi _3, -\xi _2^2(\xi _1^2+\xi _2^2) \Big \rangle (\xi _3^2 |\xi |^2 + \xi _2^4)^{-1} \end{aligned}$$

    for all \(\xi \in {\mathbb {Z}}^3_*\) with \(\xi _3\ne 0\), and by \(m(\xi _1,\xi _2,0) = 0\). The symbol of the MG equation is even and homogeneous of degree zero, but as opposed to the previous examples, it is not bounded. This unboundedness may be seen by evaluating the symbol on a parabola \(m(\zeta ^2, \zeta ,1)\), and passing \(|\zeta |\rightarrow \infty \). Nonetheless, the proof in our paper still applies to the MG equations as we only require smoothness in a neighborhood of finitely many points, cf. Remark 2 below.
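The structural properties claimed for these three symbols can be checked numerically. The following is a minimal Python sketch under our own conventions: the function names `m_sqg`, `m_ipm`, `m_mg`, `dot` and `norm`, as well as the sample frequencies, are illustrative choices and not notation from the references. It evaluates the divergence free condition \(\xi \cdot m(\xi ) = 0\), the parity of each symbol, and the growth of the MG symbol along the parabola \((\zeta ^2, \zeta , 1)\).

```python
import math


def m_sqg(x1, x2):
    """SQG symbol i<-xi_2, xi_1>/|xi| (purely imaginary, odd)."""
    r = math.hypot(x1, x2)
    return (complex(0.0, -x2 / r), complex(0.0, x1 / r))


def m_ipm(x1, x2):
    """2D IPM symbol <xi_1 xi_2, -xi_1^2>/|xi|^2 (real, even)."""
    r2 = x1 * x1 + x2 * x2
    return (x1 * x2 / r2, -x1 * x1 / r2)


def m_mg(x1, x2, x3):
    """MG symbol as written in the text, set to zero when xi_3 = 0."""
    if x3 == 0:
        return (0.0, 0.0, 0.0)
    n2 = x1 * x1 + x2 * x2 + x3 * x3
    den = x3 * x3 * n2 + x2 ** 4
    return (
        (x2 * x3 * n2 + x1 * x2 * x2 * x3) / den,
        (-x1 * x3 * n2 + x2 ** 3 * x3) / den,
        -x2 * x2 * (x1 * x1 + x2 * x2) / den,
    )


def dot(xi, m):
    """Bilinear pairing xi . m (no complex conjugation)."""
    return sum(a * b for a, b in zip(xi, m))


def norm(m):
    """Euclidean norm of a (possibly complex) vector."""
    return math.sqrt(sum(abs(c) ** 2 for c in m))


if __name__ == "__main__":
    xi2 = (1.0, 2.0)          # arbitrary sample frequency in 2D
    xi3 = (1.0, 2.0, 3.0)     # arbitrary sample frequency in 3D
    print("xi.m  SQG:", abs(dot(xi2, m_sqg(*xi2))))
    print("xi.m  IPM:", abs(dot(xi2, m_ipm(*xi2))))
    print("xi.m  MG :", abs(dot(xi3, m_mg(*xi3))))
    # Growth of |m| along the parabola (zeta^2, zeta, 1).
    for zeta in (10.0, 100.0, 1000.0):
        print("zeta =", zeta, " |m_MG| =", norm(m_mg(zeta * zeta, zeta, 1.0)))
```

With the MG symbol as written above, the first component of \(m(\zeta ^2, \zeta , 1)\) grows like \(\zeta /2\) and the other two like \(-\zeta ^2/2\) as \(|\zeta | \rightarrow \infty \), which is the unboundedness that the sketch displays.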

Remarkably, from the mathematical point of view these scalar equations retain some of the same essential difficulties of the full fluid equations. In particular, the global well-posedness for the 2D SQG and IPM equations remains open, in analogy to the 3D Euler equations. More relevant for this paper, the regularity class in which the conservation of the energy \(\Vert \theta \Vert _{L^2}^2\) may be established for weak solutions of (1.1) is Hölder continuity with exponent 1/3, as for 3D Euler. However, due to their more rigid geometry (e.g. no known analogue for Beltrami flows), their non-local nature, and the presence of infinitely many conservation laws (the \(L^p\) norms of \(\theta \), for any \(p\ge 1\)), the construction of weak solutions that fail to conserve energy appears to be more restrictive than for 3D Euler.

The pair \((\theta , u^l)\) is called a weak solution of (1.1) if the equations (1.1) are satisfied on \({\mathbb {R}}\times {\mathbb {T}}^2\) in the sense of distributions. When \((\theta , u^l)\) are continuous, it is equivalent to require the balance laws

$$\begin{aligned} \frac{d}{dt} \int _\Omega \theta (t,x) dx = \int _{\partial \Omega } \theta ~u(t,x) \cdot n ~ d\sigma , \qquad \int _{\partial \Omega } u(t,x) \cdot n ~d\sigma = 0 \end{aligned}$$

to be satisfied as continuous functions of time for all subdomains \(\Omega \) with smooth boundary and inward unit normal n. The definition of weak solution implies immediately that the integral \(\int _{{\mathbb {T}}^2} \theta (t,x) ~dx\) is a conserved quantity, but this definition does not immediately imply the other conservation laws that hold for classical solutions (see also [3, 4] for comparisons with other notions of non-classical solutions for the Euler equations).

The study of weak solutions in fluid dynamics, including those which fail to conserve energy, is natural in the context of turbulent flows. The power spectrum predicted by Kolmogorov [36] implies that solutions which arise in the inviscid limit of the 3D Navier–Stokes equations have Hölder 1/3 regularity on average, and in particular are not classical. Such flows are expected to exhibit anomalous dissipation of energy, rather than conserving energy. The exponent 1/3 is the same regularity threshold conjectured by Onsager [42] to be critical for energy conservation in the 3D Euler equations (see [2, 25, 44] for recent reviews). For power spectra in active scalar turbulence, we refer to Kraichnan [37] and Constantin [13, 14].

Our first main result, Theorem 1.1, shows that if the symbol of the multiplier \(m^l(\xi )\) is not an odd function of \(\xi \) for \(\xi \ne 0\), there exist nontrivial, space-periodic solutions in two dimensions with compact support in time, having any Hölder regularity \(\theta \in C_{t,x}^\alpha \) with \(\alpha < 1/9\). In contrast, the energy \(\int |\theta |^2(t,x) dx\) is a conserved quantity for solutions with Hölder regularity \(\alpha > 1/3\), and for classical solutions the quantity \(\theta ^2\) obeys a continuity equation with drift velocity \(u^l = T^l[\theta ]\), whereas both these properties clearly fail for our solutions. This result gives the first proof of nonuniqueness of continuous weak solutions for any active scalar equation of this type.

Theorem 1.1

(Weak Solutions to Active Scalar equations) Consider the active scalar equation (1.1) with divergence free drift velocity, and assume that the multiplier \(m^l(\xi )\) defining the operator \(T^l\) is not an odd function of \(\xi \) for \(\xi \ne 0\). Let \(\alpha < 1/9\) and let I be an open interval. Then there exist nontrivial solutions to (1.1) with Hölder regularity \(\theta , u^l \in C_{t,x}^\alpha ({\mathbb {R}}\times {\mathbb {T}}^2)\) which are identically 0 outside of \(I \times {\mathbb {T}}^2\).

Moreover, if \(f : {\mathbb {R}}\times {\mathbb {T}}^2 \rightarrow {\mathbb {R}}\) is a smooth scalar function with compact support on \(I \times {\mathbb {T}}^2\) which satisfies the conservation law \(\frac{d}{dt} \int _{{\mathbb {T}}^2} f(t,x) ~dx = 0\), then there exists a sequence of weak solutions \(\theta _n : {\mathbb {R}}\times {\mathbb {T}}^2 \rightarrow {\mathbb {R}}\) to (1.1) in the above regularity class such that \(\theta _n\) converges to f in the \(L^\infty \) weak-* topology, and each \(\theta _n\) has compact support in \(I \times {\mathbb {T}}^2\).

The above result builds upon the recent works by Córdoba, Faraco, Gancedo [21], Shvydkoy [45], and Székelyhidi [46] which establish the non-uniqueness of \(L^\infty _{t,x}\) weak solutions to the IPM equations and active scalar equations with even symbols m. These previous works are based on a variant of the method of convex integration introduced for the Euler equations in [22] that provides an effective and elegant approach to producing bounded solutions, but which faces a major obstruction to producing continuous solutions. For the Euler equations, this obstruction was overcome in [11, 23, 26] to produce continuous and \(C^\alpha \) solutions on \({\mathbb {T}}^2\) and \({\mathbb {T}}^3\). A crucial idea to overcome this obstruction is a key cancellation coming from the use of special families of stationary, plane wave solutions which allows for the control of interference terms between different waves in the construction. For 3D Euler, these solutions are Beltrami flows (eigenfunctions of the curl operator), while for 2D Euler they are rotated gradients of Laplace eigenfunctions.

There is an obstruction to generalizing these ideas to obtain continuous solutions to active scalar equations, which is that analogous families of stationary, plane wave solutions do not exist in general for active scalar equations. Furthermore, as we explain more precisely in Section 2.1, there is a sense in which no analogous cancellation is ever available under the assumptions of Theorem 1.1. The same difficulty has also prohibited this approach from generalizing to the Euler equations in higher dimensions, even though similar results in principle could be expected to hold in any dimension. (The conservation of energy for regularity above 1/3 holds in any dimension, and the approach of [22] for constructing \(L_{t,x}^\infty \) solutions applies in any dimension.)

The main idea that forms the starting point of our work is a new, more general, mechanism for obtaining the cancellation of interference terms in the construction, which arises without any special Ansatz in the construction. Our observation is that the interference terms which arise when an individual wave interacts with itself must always cancel thanks to the divergence free structure of the equation, even though we lack a general method for controlling the interference between waves which oscillate in different directions. This observation opens the door to a serial iteration scheme based on one-dimensional oscillations, as in the original scheme of Nash [40]. The same observation applies to both the Euler equations and to general active scalar equations regardless of the dimension (cf. Remark 1). Our proof therefore gives a new approach to constructing continuous and \(C^\alpha \) weak solutions to these equations that is independent of the use of Beltrami flows or the analogue.

Although the regularity obtained in Theorem 1.1 is strictly worse than the results which have been obtained for the Euler equations, the exponent 1/9 is the best result we can hope to obtain from our method. For the Euler equations, solutions in the class \(C_{t,x}^{1/5-}\) were constructed in [31], with another proof given by Buckmaster, De Lellis and Székelyhidi [7]. The construction in [7] has recently been refined in [8] to give continuous solutions in the class \(L_t^1 C_x^{1/3 -}\), improving significantly a result of Buckmaster [6]. A main obstruction to higher regularity faced by all of these works and also the present paper is the presence of anomalously sharp time cutoffs. These cutoffs lead to bounds on advective derivatives which are inferior to the bounds that hold for solutions with higher regularity, cf. [32, Sec. 9] and [34, Sec. 1.1.3]. In our case, we face an additional loss of regularity which comes from our inability to eliminate more than one component of the error in a given stage of the iteration. The same obstruction to regularity arises for the isometric embedding equation [18]. For active scalars, we must deal with both obstructions at the same time, and improving on either one seems to be a difficult problem.

Our approach to proving Theorem 1.1 also yields the following result, which shows that our construction can realize arbitrary smooth initial data.

Theorem 1.2

Let \(I = (-T, T)\) be a finite open interval containing the origin, let \(\alpha < 1/9\) and let \((\theta _{(0)}, u_{(0)}^l)\) be a smooth solution to (1.1) on \(I \times {\mathbb {T}}^2\). Then there exists a global, weak solution \((\theta , u^l)\) to (1.1) in the class \((\theta , u^l) \in C_{t,x}^\alpha ({\mathbb {R}}\times {\mathbb {T}}^2)\) which coincides with \((\theta _{(0)}, u_{(0)}^l)\) on the time interval

$$\begin{aligned} \theta (t,x) = \theta _{(0)}(t,x) \qquad (t,x) \in \left( -T/2, T/2\right) \times {\mathbb {T}}^2 \end{aligned}$$

and which coincides with a constant

$$\begin{aligned} \theta (t,x) = \bar{\theta } \end{aligned}$$

for \((t,x) \notin (-4T/5, 4T/5) \times {\mathbb {T}}^2\).

To the best of our knowledge, Theorem 1.2 gives the first proof of global existence of weak solutions for (1.1) with multipliers m which are not odd, from arbitrary smooth initial data [21]. The global existence of weak solutions appears to be only known for odd symbols [9, 43], or for patch-type initial datum in the IPM equations [19]. Thus, in view of the known existence result for odd multipliers, we show that all active scalar equations with smooth constitutive law have global in time weak solutions (see also Remark 2).

Our method of construction demonstrates not only the existence of weak solutions, but also the abundance and flexibility of solutions in the class \(C_{t,x}^{1/9 - \epsilon }\). This point is emphasized by the following result of “h-principle” type, which follows from Theorem 1.1, and completely characterizes the weak-* closure of these solutions in \(L^\infty \). The result illustrates that, within this regularity class, the conservation of the integral is the only source of rigidity for solutions to the equations that is stable in the weak-* topology. We refer to [12, 24] for more on h-principles for fluid equations.

Corollary 1.1

(h-principle for Active Scalar Equations) Consider the 2D active scalar equation (1.1) as in the hypotheses of Theorem 1.1, with multiplier m that is not odd. Then for any \(\alpha < 1/9\) and for any open interval I, the closure in the weak-* topology on \(L^\infty (I \times {\mathbb {T}}^2)\) of the set of \(C_{t,x}^{\alpha }\) solutions to (1.1) with compact support in \(I \times {\mathbb {T}}^2\) is equal to the space of real-valued \(f \in L^\infty (I \times {\mathbb {T}}^2)\) which satisfy the conservation law \(\int _{{\mathbb {T}}^2} f(t,x) dx = 0\) as a distribution in time.

While Theorems 1.1-1.2 and Corollary 1.1 illustrate an utter lack of rigidity for multipliers which are not odd, we find a much more rigid situation for weak solutions in the case of odd multipliers. The following result implies that, when the multiplier is odd, every weak limit of solutions in \(L_{t,x}^\infty \) must also be a solution to the same active scalar equation, in stark contrast to Theorem 1.1 and Corollary 1.1. This theorem generalizes the statement at the end of [24] concerning weak rigidity for SQG, and makes precise the assumptions necessary for this rigidity.

Theorem 1.3

(Weak Rigidity for Active Scalars with Odd Multipliers) Consider the active scalar equation (1.1) in any dimension, with divergence free drift velocity, and assume that the multiplier \(m^l(\xi )\) defining the operator \(T^l\) is an odd function of \(\xi \) for \(\xi \ne 0\). Suppose that \(f = \lim _n \theta _n\) is a weak limit of solutions to (1.1) in \(L^p(I;L^2({\mathbb {T}}^d))\), for some \(p>2\). Then \(f(t,x)\) must be a weak solution to (1.1).

We note that the \(L^p\) time integrability condition on \(\theta _n\) is by no means restrictive. Indeed, due to the incompressible transport nature of (1.1), weak solutions constructed via smooth approximations (e.g. vanishing viscosity) are in fact bounded, or even weakly continuous in time.

The proof of Theorem 1.3 is based on the approach of [43], where global \(L^\infty _t L^2_x\) weak solutions of the SQG equations are constructed. The main idea is that odd multipliers m induce a certain commutator structure in the nonlinear term, which yields the necessary compactness. In fact, the oddness of m implies that the equations are well-posed, even if the operator \(T^l\) is not of degree 0 (see [9]), and in such cases the oddness appears to be necessary [28, 29].

In addition to the weak rigidity of Theorem 1.3, in the following theorem we show that every active scalar equation in 2D with odd symbol has a Hamiltonian that is conserved for solutions in the class \(L_{t,x}^3\).

Theorem 1.4

(Conservation of the Hamiltonian for Active Scalars with Odd Multipliers) Consider the active scalar equation (1.1) in two dimensions with divergence free drift velocity and odd multiplier as in Theorem 1.3. Define the operator

$$\begin{aligned} L = (-\Delta )^{-1} (\nabla \cdot T^\perp ) = (-\Delta )^{-1/2} (R_2 T^1 - R_1 T^2) \end{aligned}$$
(1.3)

where \(R_i\) is the \(i^{th}\) Riesz transform. The fact that m is odd implies that L is self-adjoint. Define the Hamiltonian

$$\begin{aligned} H(t) = \int _{{\mathbb {T}}^2} \theta (t,x) L\theta (t,x) dx. \end{aligned}$$
(1.4)

Then, if \(\theta \) is a solution to (1.1) in the class \(\theta \in L_{t,x}^3\), the function H(t) is constant in time.

We note that, due to the transport structure of (1.1), solutions which are obtained by smooth approximations, such as viscosity approximations, Galerkin truncations, etc., will automatically lie in \(L^\infty _{t,x}\), and thus also in \(L^3_{t,x}\).

Theorem 1.3 precludes any results such as Theorems 1.1-1.2 from holding in the case of the SQG equation, in which case \(L = (-\Delta )^{-1/2}\) and we obtain the conservation of the \(H^{-1/2}\) norm for solutions in \(L_{t,x}^3\). Note however that in general the operator L need not be coercive, as is the case when m vanishes somewhere on the unit sphere. We refer to [43, 47] for an exposition of how the quantity H(t) serves as a Hamiltonian for the equation.

We conclude our introduction by remarking on how our method extends to higher dimensions, and to the case of multipliers which are not smooth.

Remark 1

(Higher Dimensions) Our proof generalizes to active scalar equations in arbitrary dimensions (cf. Section 3.2 for the relevant modifications). In this case, however, there are two further restrictions. First of all, the regularity we obtain becomes worse as the dimension increases. The same type of loss (for essentially the same reason, see Section 2.2.1 below) is also seen in the case of the isometric embedding equations [18]. Second, we cannot obtain our result for all smooth multipliers whose symbols are not odd, and we require a nondegeneracy condition on the even part of the multiplier.

The precise result we obtain is the following:

Theorem 1.5

(Multi-dimensional Case) Consider the active scalar equation (1.1) with divergence free drift velocity on \({\mathbb {T}}^d\). Assume also that the image of the even part of the multiplier contains d vectors

$$\begin{aligned} A_{(i)}&= m(\xi ^{(i)}) + m(-\xi ^{(i)}), \qquad i = 1, 2, \ldots , d \end{aligned}$$
(1.5)

such that the vectors \(A_{(1)}, \ldots , A_{(d)}\) span \({\mathbb {R}}^d\). Then Theorems 1.1-1.2 and Corollary 1.1 hold as stated, but with the condition \(\alpha < \frac{1}{9}\) on the Hölder exponent being replaced by

$$\begin{aligned} \alpha < \frac{1}{1 + 4 d}. \end{aligned}$$

Theorem 1.5 applies in particular to the 3D IPM equation, and in that case yields weak solutions with Hölder regularity \(\alpha < 1/13\). Note also that Theorem 1.5 generalizes the two-dimensional case of Theorem 1.1. Namely, if the even part \(m(\xi ^{(1)}) + m(-\xi ^{(1)})\) is nonzero at a single point \(\xi ^{(1)}\), it follows already from incompressibility (i.e. the condition \(m(\xi ) \cdot \xi = 0\)) that the span of the image of the even part of m has dimension at least 2.
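To make the spanning condition (1.5) concrete, the following Python sketch checks it in exact rational arithmetic for the 2D and 3D IPM symbols and evaluates the threshold \(1/(1+4d)\). The frequencies \(\xi ^{(i)}\) used here, and the helper names `even_part`, `det` and `holder_threshold`, are our own illustrative choices and are not taken from the proof; any other frequencies giving a nonzero determinant would serve equally well.

```python
from fractions import Fraction


def m_ipm2(xi):
    """2D IPM symbol <xi_1 xi_2, -xi_1^2>/|xi|^2 at an integer frequency."""
    x1, x2 = xi
    r2 = x1 * x1 + x2 * x2
    return (Fraction(x1 * x2, r2), Fraction(-x1 * x1, r2))


def m_ipm3(xi):
    """3D IPM symbol <xi_1 xi_3, xi_2 xi_3, -xi_1^2 - xi_2^2>/|xi|^2."""
    x1, x2, x3 = xi
    r2 = x1 * x1 + x2 * x2 + x3 * x3
    return (Fraction(x1 * x3, r2), Fraction(x2 * x3, r2),
            Fraction(-x1 * x1 - x2 * x2, r2))


def even_part(m, xi):
    """A = m(xi) + m(-xi), the vector appearing in (1.5)."""
    neg = tuple(-c for c in xi)
    return tuple(a + b for a, b in zip(m(xi), m(neg)))


def det(rows):
    """Determinant by Laplace expansion along the first row (small d only)."""
    if len(rows) == 1:
        return rows[0][0]
    total = 0
    for j in range(len(rows)):
        minor = [row[:j] + row[j + 1:] for row in rows[1:]]
        total += (-1) ** j * rows[0][j] * det(minor)
    return total


def holder_threshold(d):
    """Upper bound 1/(1 + 4d) on the Holder exponent in Theorem 1.5."""
    return Fraction(1, 1 + 4 * d)


# Hypothetical sample frequencies xi^(i), chosen only for illustration.
FREQS_2D = [(1, 0), (1, 1)]
FREQS_3D = [(1, 0, 1), (0, 1, 1), (1, 0, 0)]

if __name__ == "__main__":
    a2 = [even_part(m_ipm2, xi) for xi in FREQS_2D]
    a3 = [even_part(m_ipm3, xi) for xi in FREQS_3D]
    print("2D IPM: det =", det(a2), " threshold =", holder_threshold(2))
    print("3D IPM: det =", det(a3), " threshold =", holder_threshold(3))
```

A nonzero determinant means that the vectors \(A_{(1)}, \ldots , A_{(d)}\) span \({\mathbb {R}}^d\) for the chosen frequencies, and the thresholds \(1/9\) for \(d = 2\) and \(1/13\) for \(d = 3\) agree with the exponents stated in the text.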

The assumption (1.5) in Theorem 1.5 arises quickly from the proof and turns out to be necessary for the conclusion of Theorem 1.1. That is to say, when the assumption (1.5) fails, there are in general additional constraints on weak limits of solutions besides the conservation of the mean value. In the case where the multiplier is even, such constraints arise from the conservation of the integrals

$$\begin{aligned} \frac{d}{dt} \int _{{\mathbb {T}}^n} \theta (t,x) \Psi (x) dx = 0 \end{aligned}$$

for functions \(\Psi \) whose gradients take values perpendicular to the image of the multiplier. More generally, we have the following theorem which can be applied to every multiplier that fails to satisfy (1.5):

Theorem 1.6

(Constraints on Weak Limits of Degenerate Multipliers) Consider the active scalar equation (1.1) on a torus \({\mathbb {T}}^n\) of any dimension and suppose that the image of the even part of the multiplier lies in a hyperplane perpendicular to some nonzero vector \(\xi _{(0)} \in \widehat{{\mathbb {T}}^n}\) in the dual lattice. Then there exists a smooth function of compact support \(f \in {\mathcal {C}}_0^\infty ({\mathbb {R}}\times {\mathbb {T}}^n)\) which is real-valued and satisfies the conservation law \(\int _{{\mathbb {T}}^n} f(t,x) dx = 0\) such that f cannot be realized as a weak-* limit in \(L^\infty \) of any sequence of bounded weak solutions to (1.1).

The proof of Theorem 1.6 draws on the proof of weak compactness in Theorem 1.3. One can compare condition (1.5) to criteria for having a large \(\Lambda \)-convex hull in the theory of differential inclusions (e.g. [22, 35, 46]).

Remark 2

(Non-smooth Symbols) In view of the example of the MG equation, it is important to remark that our proof applies also to multipliers which are not smooth. In fact, the only regularity condition we require in our proof is that the multiplier should be smooth in a neighborhood of the points \(\xi ^{(1)}, \xi ^{(2)}, \ldots , \xi ^{(d)}\) and \(- \xi ^{(1)}, -\xi ^{(2)}, \ldots , -\xi ^{(d)}\) appearing in (1.5). Thus Theorem 1.5 applies to the MG equation, if we take for example the points \(\xi ^{(1)} = \langle 1, 0, 1 \rangle \), \(\xi ^{(2)} = \langle 0, 1, 1 \rangle \), \(\xi ^{(3)} = \langle 1, 1, 1 \rangle \).

Difficulties and New Ideas

The proof of Theorem 1.5 contains a number of new ideas in the method of convex integration, which we summarize before we begin the proof.

As stated earlier in the Introduction, our main idea is a new mechanism for obtaining cancellations in interference terms between overlapping waves. This allows us to get around the lack of Beltrami flows, or their analogues, as the type of cancellation given by such flows is entirely unavailable in our setting (cf. Section 2.2). This idea gives a new and general approach to constructing continuous weak solutions which generalizes also to Euler. The idea is based on the observation that self-interference terms vanish automatically thanks to the incompressible nature of the equation.
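The cancellation can be seen in the simplest possible setting of exact plane waves with constant amplitudes and linear phases, which is a simplification of the construction and not the construction itself. For \(\Theta = e^{i k \cdot x} + e^{i k' \cdot x}\) and \(U = m(k) e^{i k \cdot x} + m(k') e^{i k' \cdot x}\), the coefficient of \(e^{i (k + k') \cdot x}\) in \(\partial _l(\Theta U^l)\) is \(i (k + k') \cdot (m(k) + m(k'))\). The Python sketch below evaluates this quantity (without the factor i) for the IPM symbol; the function name `interference_coefficient` and the frequencies are hypothetical choices made for illustration.

```python
from fractions import Fraction


def m_ipm(k):
    """2D IPM symbol <k_1 k_2, -k_1^2>/|k|^2 at an integer frequency k."""
    k1, k2 = k
    r2 = k1 * k1 + k2 * k2
    return (Fraction(k1 * k2, r2), Fraction(-k1 * k1, r2))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def interference_coefficient(k, kp):
    """(k + kp) . (m(k) + m(kp)): coefficient, up to the factor i, of
    exp(i (k + kp) . x) in div(Theta U) for the two plane waves k and kp.
    For k == kp the product is counted twice, which does not affect
    whether the coefficient vanishes."""
    total_freq = tuple(a + b for a, b in zip(k, kp))
    total_vel = tuple(a + b for a, b in zip(m_ipm(k), m_ipm(kp)))
    return dot(total_freq, total_vel)


if __name__ == "__main__":
    k, kp = (1, 0), (1, 1)  # illustrative frequencies
    print("self  (k, k)  :", interference_coefficient(k, k))
    print("self  (kp, kp):", interference_coefficient(kp, kp))
    print("cross (k, kp) :", interference_coefficient(k, kp))
```

In this toy computation the self-interaction coefficients vanish because \(m(k) \cdot k = 0\), while the cross term between the two directions does not, which is the distinction between self-interference and interference between different waves drawn in the text.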

The above idea opens the door to a multi-stage iteration scheme based on one-dimensional oscillations, as in the original scheme of Nash for isometric embeddings applied in [18, 40]. This type of scheme had previously appeared unavailable in the setting of the Euler equations (see [26, Section 1.3, Comment 2]). On the other hand, while implementing a scheme exactly of this type now appears to be possible, it also appears to be relatively complicated, requiring the addition of several iterations of waves (each with their own time, length scale and frequency parameters) before the error improves in the \(C^0\) norm. We manage to avoid these complications by defining a space of approximate solutions by a compound scalar stress equation. This concept allows us to obtain a \(C^0\) improvement after only one iteration, which simplifies the iteration and gives estimates which are much closer to the bounds familiar from the case of Euler.

The main new technical difficulty in obtaining continuous solutions to active scalar equations lies in how to deal with the integral operator in the equation which determines the drift velocity \(u^l = T^l[\theta ]\). The whole construction is based on high frequency, plane-wave type corrections of the form \(e^{i \lambda \xi _I(t,x)} \theta _I(t,x)\), and it is necessary to understand very precisely how adding such waves will affect the drift velocity. Furthermore, the convex integration schemes for producing Hölder continuous Euler flows all use heavily \(C^0\) type estimates on all error terms. From this point of view, the failure of \(C^0\) boundedness of \(T^l\) suggests some serious trouble.

Our main technical device for addressing this difficulty is a “Microlocal Lemma” (Lemma 4.1). This lemma makes precise how a convolution operator behaves to leading order like a multiplication operator when given a high-frequency plane wave input, allowing for the use of nonlinear phase functions. In the case of the operator \(T^l\), represented on the Fourier side by the multiplier \(m^l(\xi )\), our lemma gives a statement of the form

$$\begin{aligned} u^l = T^l[e^{i \lambda \xi (x)} \theta (x)] = e^{i \lambda \xi (x)}( \theta m^l\left( \nabla \xi (x)\right) + \delta u^l) \end{aligned}$$

and gives an explicit formula for the error term \(\delta u^l\) (which also allows us to estimate its spatial and advective derivatives). We expect that this technique should be of independent interest for other applications.
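A special case in which the error term can be written down by hand is a linear phase \(\xi (x) = k \cdot x\) and an amplitude that is a trigonometric polynomial \(\theta (x) = \sum _p c_p e^{i p \cdot x}\). Since \(m^l\) is homogeneous of degree 0, one finds \(\delta u^l = \sum _p c_p \left( m^l(\lambda k + p) - m^l(k)\right) e^{i p \cdot x}\), so that \(\Vert \delta u \Vert _{C^0} \le \sum _p |c_p| \, |m(\lambda k + p) - m(k)|\). The Python sketch below evaluates this bound for the IPM symbol. It is only an illustration under these simplifying assumptions (the lemma itself allows nonlinear phase functions), and the name `leading_order_error`, the frequency `K` and the coefficients in `MODES` are hypothetical.

```python
import math


def m_ipm(x1, x2):
    """2D IPM symbol <xi_1 xi_2, -xi_1^2>/|xi|^2."""
    r2 = x1 * x1 + x2 * x2
    return (x1 * x2 / r2, -x1 * x1 / r2)


def leading_order_error(lam, k, modes):
    """Bound sum_p |c_p| |m(lam k + p) - m(k)| on the sup norm of delta u
    for the phase lam * k.x and the amplitude sum_p c_p exp(i p.x)."""
    mk = m_ipm(k[0], k[1])
    total = 0.0
    for p, c in modes.items():
        shifted = m_ipm(lam * k[0] + p[0], lam * k[1] + p[1])
        total += abs(c) * math.hypot(shifted[0] - mk[0], shifted[1] - mk[1])
    return total


# Illustrative data: phase direction k and a few low-frequency modes of theta.
K = (1, 1)
MODES = {(1, 0): 1.0, (0, 1): 0.5, (-1, 2): 0.25}

if __name__ == "__main__":
    for lam in (10, 100, 1000):
        err = leading_order_error(lam, K, MODES)
        print("lambda =", lam, " bound =", err, " lambda * bound =", lam * err)
```

For these choices the product of \(\lambda \) with the bound stays close to a constant as \(\lambda \) grows, consistent with an error of order \(\lambda ^{-1}\) relative to the leading term \(\theta \, m^l(\nabla \xi )\).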

To address the lack of \(C^0\) boundedness of \(T^l\), our proof makes additional use of the frequency localization in the construction, which allows for the effective application of the Microlocal Lemma. A number of other simplifications in the argument arise from the use of frequency localized waves. For instance, many error terms can be estimated in a simpler way than in previous works, and we remove the need for nonstationary phase arguments in solving the relevant elliptic equations.

In connection with our space of approximate solutions, we introduce a family of estimates we call compound frequency energy levels. These estimates generalize to active scalars the frequency energy levels introduced in [31]. These bounds have the key feature that they carry \(C^0\) type estimates for derivatives of the drift velocity along the iteration. Otherwise, the lack of \(C^0\) boundedness of \(T^l\) would prohibit us from deducing these estimates from the bounds on the scalar field.

Outline of the Paper

The overall strategy for the construction is outlined in Section 2. The bulk of the paper then consists of proving the “Main Lemma”, Lemma 3, which is stated in Section 3. After the statement of the Main Lemma, Section 4 is devoted to the proof of a “Microlocal Lemma”, which is one of the main technical tools in the paper. Sections 5-8 are then devoted to proving Lemma 3.

In Section 9, we explain how the Main Lemma implies the results stated in Theorem 1.1 and Corollary 1.1. Section 11 provides an outline of how Theorem 1.2 also follows from the same Lemma. The modifications used to prove Theorem 1.5 regarding higher dimensions are explained in Section 3.2.

Sections 12 and 13 are devoted to the rigidity properties of weak solutions in the case of odd multipliers. In Section 12, we give a proof of Theorem 1.3 on the rigidity of solutions under weak limits when the multiplier is odd. Section 13 is then devoted to the proof of Theorem 1.4 on the conservation of the Hamiltonian for active scalars with odd multipliers in dimension 2.

Section 14 is devoted to proving Theorem 1.6, which shows that the nondegeneracy condition in Theorem 1.5 is necessary in general for the weak limit statement of Theorem 1.1 to apply in higher dimensions. Finally, in Section 15 we give a conclusion to the paper and state some open questions.

Notation

We use the Einstein summation convention of summing over indices which are repeated. We take the convention that vectors are written with upper indices, whereas covectors are written with lower indices; thus, for a vector field \(u^l\) and function \(\xi \), we write \(u \cdot \nabla \xi = u^l \partial _l \xi \) and \(\text{ div } u = \partial _l u^l\).

We use the notation \(X \unlhd Y\) to indicate an inequality \(X \le Y\) which has not yet been proven, but will be proven later on in the course of the argument. We sometimes refer to such inequalities as “goals”.

Basic Technical Outline

In this Section, we give a technical outline of the main ideas of the construction which includes a list of the important error terms and provides a comparison to the cases of the Euler and isometric embedding equations. This section provides the basic ideas to motivate the statement of the Main Lemma of Section 3.

We will perform the construction in a space of approximate solutions to the active scalar equation which we now define.

We say that \((\theta , u^l, R^l)\) satisfy the scalar-stress equation if

$$\begin{aligned} \left\{ \begin{aligned} \partial _t \theta + \partial _l(\theta u^l)&= \partial _l R^l \\ u^l&= T^l(\theta ) \end{aligned} \right. \end{aligned}$$
(2.1)

This system is the analogue for active scalar equations of the Euler–Reynolds system introduced in [26] for the Euler equations. Here \(R^l\) is a vector field on \({\mathbb {T}}^2\) that we call the “stress field” (by analogy with the stress tensor \(R^{jl}\) in the Euler–Reynolds equations) which measures the error by which \(\theta \) fails to solve the active scalar equation.

Recall that the operator

$$\begin{aligned} T^l[\theta ] = \int _{{\mathbb {R}}^2} K^l(h) \theta (x-h) dh \end{aligned}$$

is a convolution operator with a real-valued kernel \(K^l\) which is homogeneous of degree \(-2\) as a distribution. The corresponding Fourier multiplier

$$\begin{aligned} m^l(\xi )&= {\hat{K}}^l(\xi ) \end{aligned}$$
(2.2)

is homogeneous of degree 0, satisfies \(m^l(-\xi ) = \overline{m^l(\xi )}\), and we assume that \(m^l(\xi )\) is smooth on \(|\xi | = 1\) (and therefore smooth away from the origin). To ensure that \(u^l = T^l[\theta ]\) satisfies the divergence free condition \(\partial _l u^l = 0\), we require that

$$\begin{aligned} m(\xi )\cdot \xi&= 0 \end{aligned}$$
(2.3)

At a high level, the basic idea of the convex integration construction is to start with a given solution \((\theta , u^l, R^l)\) to (2.1), and proceed to add a (high-frequency) correction \(\Theta \) to the scalar field \(\theta \), so that the corrected scalar field and drift velocity

$$\begin{aligned} \theta _1 = \theta + \Theta , \qquad u_1^l = u^l + U^l, \qquad U^l = T^l[\Theta ] \end{aligned}$$
(2.4)

satisfy the scalar stress equation (2.1) with a new stress field \(R_1^l\) that is significantly smaller than the original stress field \(R^l\). These corrections are added in an iteration to obtain a sequence of solutions to (2.1)

$$\begin{aligned} (\theta _{(k)}, u_{(k)}^l, R_{(k)}^l) \end{aligned}$$

such that \(R_{(k)}^l \rightarrow 0\) as the number of iterations k tends to infinity. From dimensional analysis and experience with the isometric embedding and Euler equations, we expect an estimate \(\Vert \Theta _{(k)} \Vert _{C^0} \le C \Vert R_{(k)} \Vert _{C^0}^{1/2}\) for the size of the corrections, so that we will obtain continuous solutions in the limit provided \(\Vert R_{(k)} \Vert _{C^0}\) tends to 0 at a reasonable rate. On the other hand, the \(C^1\) norms of the corrections \(\Vert \nabla \Theta _{(k)} \Vert _{C^0}\) will diverge as the frequencies in the iteration grow to infinity, and we prove convergence of the iteration in Hölder spaces by interpolating between the bounds for \(\Vert \Theta _{(k)} \Vert _{C^0}\) and \(\Vert \nabla \Theta _{(k)} \Vert _{C^0}\) after the construction has been optimized to reduce the stress field \(\Vert R_{(k)} \Vert _{C^0}\) at the most efficient rate possible. Although this description explains how the scheme works at a high level, we must study the equation and the scheme in much more detail before it is clear that there is any hope of reducing the stress field \(R^l\) in this manner.

As in [31], we will consider corrections built from rapidly oscillating “plane waves” where we allow for phase functions \(\xi _I\) and amplitudes \(\theta _I\) which depend on space and time

$$\begin{aligned} \Theta&= \sum _I \Theta _I \end{aligned}$$
(2.5)
$$\begin{aligned} \Theta _I&= e^{i \lambda \xi _I} \left( \theta _I + \delta \theta _I\right) \end{aligned}$$
(2.6)

The amplitude \(\theta _I\) and the phase functions \(\xi _I\) are scalar functions of our choice, which vary slowly compared to the frequency parameter \(\lambda \). The term \(\delta \theta _I\) is a small correction term which will be made precise later. Each wave \(\Theta _I\) has a conjugate wave \(\Theta _{\bar{I}} = \overline{\Theta }_I\) with opposite phase function \(\xi _{\bar{I}} = - \xi _I\) and amplitude \(\theta _{\bar{I}} = \bar{\theta }_I\) so that the overall correction is real valued.

We now proceed to calculate the equation satisfied by the corrected scalar field \(\theta _1 = \theta + \Theta \). This requires us to calculate the new drift velocity \(u_1^l = T^l[\theta _1] = u^l + U^l\), where \(U^l = T^l[\Theta ]\). Our main tool for this calculation is a Microlocal Lemma, which in this case guarantees that each wave \(\Theta _I\) gives rise to a velocity field

$$\begin{aligned} U_I^l = T^l\left[ \Theta _I\right]&= e^{i \lambda \xi _I} (u_I^l + \delta u_I^l) \end{aligned}$$
(2.7)
$$\begin{aligned} u_I^l&= m^l\left( \nabla \xi _I\right) \theta _I \end{aligned}$$
(2.8)

with amplitude determined by the Fourier multiplier \(m^l(\xi )\) in the definition of \(T^l\).

The amplitude \(u_I^l\) thus has the size comparable to \(\theta _I\), while the term \(\delta u_I\) is a small correction of the same order as \(\delta \theta _I\). Thus, given a highly oscillatory input such as \(\Theta _I = e^{i \lambda \xi _I} \theta _I\), the operator \(T^l\) behaves to leading order like a multiplication operator on the amplitude. (For our purposes, the simplest way to achieve equation (2.7) will be to use phase functions defined on the whole torus \({\mathbb {T}}^2\), but this will not be a serious restriction.)

From the Ansatz (2.5) and equation (2.1), we see that the corrected scalar field \(\theta _1 = \theta + \Theta \) satisfies the equation

$$\begin{aligned} \partial _t \theta _1 + \partial _l(u_1^l \theta _1)&= \partial _t \Theta + \partial _l( u^l \Theta ) + \partial _l(U^l \theta ) + \partial _l( U^l \Theta + R^l) \end{aligned}$$
(2.9)

We now expand \(\Theta \) and U into individual waves using (2.5) to derive

$$\begin{aligned} \partial _t \theta _1 + \partial _l(u_1^l \theta _1)&= \partial _t \Theta + \partial _l( u^l \Theta ) + \partial _l(U^l \theta ) \end{aligned}$$
(2.10)
$$\begin{aligned}&\quad + \sum _{J \ne \bar{I}} \partial _l(U_J^l \Theta _I) + \partial _l( \sum _I U_I^l \Theta _{\bar{I}} + R^l ) \end{aligned}$$
(2.11)

Our goal is to design the correction \(\Theta \) so that the forcing terms on the right hand side of (2.10)-(2.11) can be represented in divergence form \(\partial _l R_1^l\) for a vector field \(R_1^l\) which is significantly smaller in \(C^0\) than the previous error \(R^l\).

The Stress Term

Our first goal is to cancel out the term \(R^l\) appearing in the rightmost term of (2.11), which is the only term in equations (2.10)-(2.11) that has low frequency. We expand this term using (2.7)-(2.8) as

$$\begin{aligned} \sum _I U_I^l \Theta _{\bar{I}} + R^l&= \frac{1}{2} \sum _I(U_I^l \Theta _{\bar{I}} + U_{\bar{I}}^l \Theta _I) + R^l \nonumber \\&\approx \frac{1}{2} \sum _I ( u_I^l \theta _{\bar{I}} + u_{\bar{I}}^l \theta _I) + R^l \nonumber \\&= \frac{1}{2} \sum _I |\theta _I|^2 (m^l\left( \nabla \xi _I\right) + m^l\left( - \nabla \xi _I\right) ) + R^l \end{aligned}$$
(2.12)
$$\begin{aligned}&= \frac{1}{2} \sum _I |\theta _I|^2 (m^l\left( \nabla \xi _I\right) + \overline{m^l\left( \nabla \xi _I\right) }) + R^l \end{aligned}$$
(2.13)

where the error terms are lower order, involving \(\delta \theta _I\) and \(\delta u_I^l\). Here we can see already why we are restricted to multipliers \(m^l( \cdot )\) which are not odd. Namely, for an odd multiplier \(m^l(-\xi ) = -m^l(\xi )\), the high frequency interactions fail to leave a nontrivial low frequency part. In other words, the obstruction is that we lack a high-low frequency cascade.
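As a quick numerical sanity check of this obstruction, the following Python sketch evaluates the combination \(m^l(\xi ) + m^l(-\xi )\) from (2.12) for the SQG symbol at a few sample frequencies. The sample frequencies and the helper names are our own choices for illustration.

```python
import math


def m_sqg(xi):
    """SQG symbol m(xi) = i * (-xi_2, xi_1) / |xi| (odd, homogeneous of degree 0)."""
    x1, x2 = xi
    r = math.hypot(x1, x2)
    return (1j * (-x2) / r, 1j * x1 / r)


def even_part_sum(m, xi):
    """Return m(xi) + m(-xi), the combination appearing in (2.12)."""
    a = m(xi)
    b = m((-xi[0], -xi[1]))
    return (a[0] + b[0], a[1] + b[1])


def conj_sum(m, xi):
    """Return m(xi) + conj(m(xi)), the equivalent form used in (2.13)."""
    a = m(xi)
    return (a[0] + a[0].conjugate(), a[1] + a[1].conjugate())


# Arbitrary nonzero sample frequencies (illustrative only).
samples = [(1.0, 0.0), (1.0, 1.0), (-2.0, 3.0), (0.5, -4.0)]
sqg_even = [even_part_sum(m_sqg, xi) for xi in samples]
max_sqg_even = max(abs(c) for vec in sqg_even for c in vec)
```

For the SQG symbol both forms vanish at every sample frequency, so the low frequency part of the self-interaction gives nothing that could be used to cancel \(R^l\).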

We therefore assume now that the multiplier \(m^l\) is not odd. Together with the divergence free property \(\xi _l m^l(\xi ) = 0\) and the degree zero homogeneity of the symbol \(m^l(\cdot )\), this condition implies that there are linearly independent vectors in the image of the even part of the multiplier

$$\begin{aligned} A^l = m^l(\xi ^{(1)}) + m^l(-\xi ^{(1)}), \qquad B^l = m^l(\xi ^{(2)}) + m^l(-\xi ^{(2)}) \end{aligned}$$
(2.14)

where \(\xi ^{(1)}, \xi ^{(2)} \in {\mathbb {Z}}^2 = {\widehat{{\mathbb {T}}}}^2\) are nonzero frequencies with integer entries.
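To make (2.14) concrete, the following Python sketch computes \(A^l\) and \(B^l\) for one particular even symbol. The symbol `m_even` and the frequencies \((1,0)\), \((1,1)\) are hypothetical choices made for illustration. They are only assumed to satisfy the standing hypotheses (degree 0 homogeneity, \(m(\xi ) \cdot \xi = 0\), and not odd).

```python
def m_even(xi):
    """Hypothetical even symbol m(xi) = xi_1 * (xi_2, -xi_1) / |xi|^2.

    It is real, even, homogeneous of degree 0, and satisfies m(xi).xi = 0.
    """
    x1, x2 = xi
    r2 = x1 * x1 + x2 * x2
    return (x1 * x2 / r2, -x1 * x1 / r2)


def even_image(m, xi):
    """Return m(xi) + m(-xi), as in (2.14)."""
    a = m(xi)
    b = m((-xi[0], -xi[1]))
    return (a[0] + b[0], a[1] + b[1])


def det2(v, w):
    """Determinant of the 2x2 matrix with columns v and w."""
    return v[0] * w[1] - v[1] * w[0]


xi1 = (1, 0)  # illustrative integer frequency xi^(1)
xi2 = (1, 1)  # illustrative integer frequency xi^(2)
A = even_image(m_even, xi1)
B = even_image(m_even, xi2)
independent = det2(A, B) != 0
```

With these choices one finds \(A = (0,-2)\) and \(B = (1,-1)\), whose determinant is 2, so the two vectors are linearly independent.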

At this point, since we now have two vectors \(A^l\) and \(B^l\) in the image of the even part of \(m^l\) that are linearly independent, there is some hope to get the terms in (2.12) to cancel out. Namely, one should first make sure that the phase gradients \(\nabla \xi _I\) are perturbations of the directions \(\xi ^{(1)}, \xi ^{(2)}\) so that each wave yields a velocity field taking values in the direction \(m^l(\nabla \xi _I) + m^l(- \nabla \xi _I) \approx A^l\) or in the direction \(m^l(\nabla \xi _I) + m^l(- \nabla \xi _I) \approx B^l\). One would then like to choose coefficients \(\theta _I\) so that terms \(|\theta _I|^2 (m^l(\nabla \xi _I) + m^l(- \nabla \xi _I))\) in (2.12) form the appropriate linear combinations of \(A^l\) and \(B^l\) needed to cancel out \(R^l\).

However, there is an immediate difficulty in implementing the above approach. Namely, although we know that \(A^l\) and \(B^l\) are linearly independent, it may not be the case that \(R^l\) can be written as a linear combination of \(A^l\) and \(B^l\) with non-negative coefficients \(|\theta _I|^2\). To get around this difficulty, we take advantage of a degree of freedom which already played an important role in the arguments of [21] and [45]. Namely, observe that we do not need to solve the equation (2.12) exactly, but need only ensure that (2.12) is divergence free. This freedom allows us to subtract from (2.11) any vector field \(e(t) \delta ^l\) which is constant in space and depends only on time. Therefore the equation we actually solve is more similar to

$$\begin{aligned} \frac{1}{2} \sum _I |\theta _I|^2 \big ( m^l\left( \nabla \xi _I\right) + \overline{m^l\left( \nabla \xi _I\right) } \big )&= e(t) \delta ^l - R_\epsilon ^l \end{aligned}$$
(2.15)

where \(R_\epsilon ^l\) is a regularized version of \(R^l\) and \(\delta ^l\) is a constant vector field. If we choose \(\delta ^l = A^l + B^l\) and make sure that e(t) is bounded below by, say, \(e(t) \ge 100 \Vert R_\epsilon \Vert _{C^0}\) on the support of \(R_\epsilon \), then the coefficients \(|\theta _I|^2\) solving (2.15) can be guaranteed to be non-negative. Observe also that the equation (2.15) leads to the bounds \(\Vert \theta _I\Vert _{C^0} \le C \Vert R_\epsilon \Vert _{C^0}^{1/2}\) for the amplitudes.
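The effect of the term \(e(t) \delta ^l\) can be seen in a pointwise toy computation. The Python sketch below solves \(c_A A^l + c_B B^l = e \, \delta ^l - R^l\) with \(\delta ^l = A^l + B^l\), where \(c_A\) and \(c_B\) stand in for the sums of \(|\theta _I|^2\) in each direction. The numerical values of `A`, `B` and `R` are hypothetical and chosen only for illustration.

```python
import math


def solve_coefficients(A, B, R, e):
    """Solve c_A*A + c_B*B = e*(A + B) - R by Cramer's rule."""
    rhs = (e * (A[0] + B[0]) - R[0], e * (A[1] + B[1]) - R[1])
    det = A[0] * B[1] - A[1] * B[0]
    if det == 0:
        raise ValueError("A and B must be linearly independent")
    c_A = (rhs[0] * B[1] - rhs[1] * B[0]) / det
    c_B = (A[0] * rhs[1] - A[1] * rhs[0]) / det
    return c_A, c_B


# Hypothetical data: two independent vectors and one value of the stress.
A = (0.0, -2.0)
B = (1.0, -1.0)
R = (0.3, -0.4)

e_large = 100.0 * math.hypot(*R)                    # e >= 100 * |R|
with_shift = solve_coefficients(A, B, R, e_large)   # both coefficients positive
without_shift = solve_coefficients(A, B, R, 0.0)    # both coefficients negative here
```

Without the shift the required coefficients are negative in this example, so they cannot be of the form \(|\theta _I|^2\); with \(e\) large compared to \(|R|\) both coefficients are positive and of size comparable to \(e\).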

The role played by the function \(e(t) \delta ^l\) is the same as the role played by the low frequency part of the pressure correction in the scheme for Euler [31, Section 7.3]. This device in some way appears to limit our proof to the periodic setting.

The High Frequency Interference Terms

Controlling the interference terms between high frequency waves is a fundamental difficulty in convex integration. In our case, the interference terms require solving the elliptic equation

$$\begin{aligned} \partial _l R_H^l&= \sum _{J \ne \bar{I}} \partial _l(U_J^l \Theta _I) = \sum _{J \ne \bar{I}} U_J^l \partial _l\Theta _I \end{aligned}$$
(2.16)
$$\begin{aligned}&= \frac{1}{2} \sum _{J \ne \bar{I}} (U_J^l \partial _l \Theta _I + U_I^l \partial _l \Theta _J) \end{aligned}$$
(2.17)

To leading order, these terms have the form

$$\begin{aligned} \partial _l R_H^l&= \frac{1}{2} (i \lambda ) \sum _{J \ne \bar{I}} e^{i \lambda \left( \xi _I + \xi _J\right) }(u_J^l \partial _l \xi _I \theta _I + u_I^l \partial _l \xi _J \theta _J) + \ldots \end{aligned}$$
(2.18)
$$\begin{aligned}&= \frac{1}{2} (i \lambda ) \sum _{J \ne \bar{I}} e^{i \lambda \left( \xi _I + \xi _J\right) }\theta _I \theta _J (m^l\left( \nabla \xi _J\right) \partial _l \xi _I + m^l\left( \nabla \xi _I\right) \partial _l \xi _J) + \ldots \end{aligned}$$
(2.19)

We expect to gain a factor of \(\lambda ^{-1}\) while inverting the divergence in (2.18); however, solving (2.18) leads in principle to a solution \(R_H^l\) of size \(\Vert R_H \Vert _{C^0} \le \Vert \sum _I |\theta _I|^2 \Vert _{C^0} \le \Vert R \Vert _{C^0}\), which is not even an improvement on the size of the previous error \(R^l\). These terms therefore seem to already prohibit the construction of continuous solutions by convex integration. The same difficulty also arises for the Euler equations.

For the Euler equations, the key idea introduced in [26] which made it possible to handle high frequency interference terms similar to (2.18) was to construct the high frequency building blocks using a family of stationary solutions to the Euler equations known as Beltrami flows. Specifically, the basic building blocks in the construction [26] are constructed using vector fields of the form \(B^l e^{i k \cdot x}\) where \(B^l\) is a constant vector amplitude, \(k \cdot x\) is a linear phase function, and we have \((i k )\times B^l = |k| B^l\) so that the expression \(B^l e^{i k \cdot x}\) is an eigenfunction of curl and hence a stationary solution to Euler. The idea of using Beltrami flows was adapted in [31] to building blocks \(V_I = e^{i \lambda \xi _I} v_I\) with nonlinear phase functions \(\xi _I\) by imposing a “microlocal Beltrami flow” condition that \((i \nabla \xi _I) \times v_I = |\nabla \xi _I| v_I\) pointwise. Viewed from this latter approach, the role of the Beltrami flow condition is to ensure that the leading term in (2.19) cancels out.

For the active scalar equations we consider here, such a family of stationary solutions is not available, and moreover we do not have any method to control interference terms between waves which oscillate in distinct directions. For instance, suppose that the multiplier \(m^l(\xi )\) is even, and suppose that \(\xi _1, \xi _2 \in {\hat{{\mathbb {R}}}}^2\) are linearly independent frequencies for which the terms in (2.19) cancel

$$\begin{aligned} m(\xi _1) \cdot \xi _2 \pm m(\xi _2) \cdot \xi _1 = 0 \end{aligned}$$

It then follows from the conditions \(m(\xi _1) \cdot \xi _1 = 0, m(\xi _2) \cdot \xi _2 = 0\) that both \(m(\xi _1)\) and \(m(\xi _2)\) must be equal to 0. More generally, one can show that the even part of the multiplier must vanish when applied to both frequencies

$$\begin{aligned} m^l\left( \xi _1\right) + m^l\left( -\xi _1\right) = m^l\left( \xi _2\right) + m^l\left( -\xi _2\right) = 0 \end{aligned}$$

if we assume that all of the interference terms in (2.19) cancel. This vanishing of the even part would prohibit any nontrivial contribution to (2.12). In contrast, in the case of the surface quasi-geostrophic equation where the drift velocity is given by \(u = \nabla ^\perp (- \Delta )^{-1/2} \theta \), the set of Laplace eigenfunctions provides a large family of high frequency, stationary solutions. However, in this case the multiplier \(m(\xi ) = i \langle -\xi _2,\xi _1\rangle |\xi |^{-1}\) is odd and we have already seen that such multipliers are out of reach of our method.

Our main observation which allows us to handle these terms is the fact that the interference terms which arise when an individual wave interacts with itself always vanish to leading order from the structure of the equations. Namely, if we look at a single index \(J = I\), then from the divergence free condition for the symbol \(m(\xi ) \cdot \xi = 0\) we see that the leading term in (2.19) gives no contribution

$$\begin{aligned} (m^l\left( \nabla \xi _I\right) \partial _l \xi _I + m^l\left( \nabla \xi _I\right) \partial _l \xi _I) = 0 \end{aligned}$$

Therefore, while we lack a method to control interference terms between waves which oscillate in different directions, we can still pursue an approach where in each step of the iteration we use corrections \(\Theta \) containing waves which oscillate in only a single direction and thus do not interfere with each other.
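The cancellation of self-interference terms, and its failure for waves in distinct directions, can be checked numerically. The Python sketch below evaluates the leading coefficient \(m^l(\nabla \xi _J) \partial _l \xi _I + m^l(\nabla \xi _I) \partial _l \xi _J\) from (2.19) with constant phase gradients. The symbol `m_even` and the sample frequencies are hypothetical choices for illustration.

```python
def m_even(xi):
    """Hypothetical even symbol m(xi) = xi_1 * (xi_2, -xi_1) / |xi|^2 (illustration only)."""
    x1, x2 = xi
    r2 = x1 * x1 + x2 * x2
    return (x1 * x2 / r2, -x1 * x1 / r2)


def interference(m, xi_I, xi_J):
    """Leading coefficient in (2.19): m(xi_J).xi_I + m(xi_I).xi_J."""
    m_I, m_J = m(xi_I), m(xi_J)
    return (m_J[0] * xi_I[0] + m_J[1] * xi_I[1]) + (m_I[0] * xi_J[0] + m_I[1] * xi_J[1])


self_term = interference(m_even, (2.0, 3.0), (2.0, 3.0))        # J = I
parallel_term = interference(m_even, (2.0, 3.0), (20.0, 30.0))  # parallel directions
cross_term = interference(m_even, (1.0, 0.0), (1.0, 1.0))       # distinct directions
```

The self term and the parallel term vanish up to rounding, by \(m(\xi ) \cdot \xi = 0\) and degree 0 homogeneity, while the cross term equals \(-1/2\) for this symbol and does not cancel.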

Comparison with the Euler and Isometric Embedding Equations

In this Section, we remark on how our observation also gives a new approach to building weak solutions to the Euler equations which is independent of Beltrami flows, and explain why we expect a loss of regularity by comparing to analogous considerations in the case of the isometric embedding equations.

Our observation of vanishing self-interference terms applies in the case of the Euler equations as well. For the Euler equations, an individual wave is a velocity field which takes the form \(V_I = e^{i \lambda \xi _I}( v_I + \delta v_I )\), and we require that the amplitude takes values in \(v_I \in \langle \nabla \xi _I \rangle ^\perp \) in order to ensure the divergence free condition for \(V_I\). In this case, the high frequency interference terms between an individual wave and itself have the form

$$\begin{aligned} V_I^j \partial _j V_I^l&= (i \lambda ) e^{2 i \lambda \xi _I} v_I^j \partial _j \xi _I v_I^l + \text{ lower } \text{ order } \text{ terms } \end{aligned}$$
(2.20)

Observe that the requirement \(v_I \cdot \nabla \xi _I = 0\) forces the main contribution to cancel. Thus, the method we apply here in principle generalizes to give a new approach to producing Hölder continuous weak solutions to the Euler equations which entirely avoids the use of Beltrami flows and applies in arbitrary dimensions. Our observation appears to be quite natural in that the key cancellation we exploit comes immediately from the structure of the equations themselves without imposing any particular Ansatz in the construction. On the other hand, in contrast to the use of Beltrami flows for Euler, we are restricted here to removing one component of the error at a time during the iteration, which ultimately results in a loss of regularity in the solutions obtained from the construction.

The reason we expect to lose regularity from the restriction of removing one component of the error each stage comes from experience with the isometric embedding equations from the work of Conti, De Lellis and Székelyhidi [18]. For these equations, there is currently no method available for controlling the relevant interference terms between high frequency waves for embeddings of codimension 1, and this obstruction leads to a loss of regularity for the solutions obtained through convex integration. Namely, without a method to control interference terms between distinct waves, it is only possible to eliminate a single, rank one component of the metric error in each step of the iteration from the addition of a single wave. Consequently, it is necessary to increase the frequencies of the waves multiple times before any \(C^0\) improvement in the metric error can be realized, which leads to a loss of regularity. In contrast, the use of Beltrami flows for the Euler equations allows for the addition of waves which oscillate at the same frequency level in several different directions, and the stress error can be made smaller in \(C^0\) after only one step of the iteration. Since our scheme suffers from the same deficiency as in the case of isometric embeddings (that is, we cannot use waves at equal frequency levels which oscillate in multiple directions), it turns out that our scheme is limited to a Hölder exponent which is inferior to the exponent 1 / 5 achieved for the Euler equations.

The restriction to eliminating a single component of the error in each step of the iteration also threatens to make our proof considerably more complicated than the scheme used for Euler. While we are unable to avoid the loss of regularity, we are at least able to keep the overall complexity of the argument to be essentially no more complicated than the scheme used for Euler. This simplification is accomplished by introducing a new technique, which we explain in the following Section.

Reducing the Steps in the Iteration

From the discussion in Section 2.2.1, we can now consider a serial convex integration scheme wherein we cannot reduce the size of the error term \(R^l\) until we have added a series of two corrections

$$\begin{aligned} \theta _1&= \theta + \Theta _{(1)} + \Theta _{(2)} \end{aligned}$$
(2.21)

Following the original scheme of Nash [40] in the isometric embedding problem, we should first decompose \(R^l\) into components as

$$\begin{aligned} R^l = c_A A^l + c_B B^l \end{aligned}$$

where \(A^l\) and \(B^l\) are linearly independent vectors in the image of \(m^l(\xi ) + m^l(-\xi )\) defined in (2.14). The first correction \(\Theta _{(1)}\) to \(\theta \) should oscillate in the \(\xi ^{(1)}\) direction in order to eliminate the \(A^l\) component of the error \(R^l\) by the method described in Section 2.1. Then, the second correction \(\Theta _{(2)}\) should have an even larger frequency than \(\Theta _{(1)}\), but the same amplitude \(| \Theta _{(1)}| \sim |\Theta _{(2)}| \sim |R|^{1/2}\), since its purpose is to eliminate the \(B^l\) component of the error \(R^l\). Thus, one stage of the convex integration is completed after two steps, where each step involves eliminating one component of the error, and the error \(R^l\) is smaller in \(C^0\) only at the end of the stage.

It appears that such a serial convex integration scheme should be possible for active scalar equations and should lead to the same Hölder exponent 1 / 9 that we achieve here. On the other hand, such a serial proof seems to be somewhat complicated compared to the “one-step” scheme used for Euler or to the case of the isometric embedding equations. In our case, a serial proof would involve treating a larger number of error terms having unfamiliar estimates, and optimizing a larger number of time, frequency and length scale parameters. We avoid these additional complexities by making a simple observation that allows us to reduce the \(C^0\) norm of the error in a single step of the iteration rather than several. It turns out that this idea also causes most of the terms in the construction to obey estimates which are familiar from experience with the Euler equations, amounting to an overall more transparent proof.

Our observation which allows us to reduce the error in every stage of the iteration and thereby simplify our proof is the following. First, note that the addition of the first correction \(\Theta _{(1)}\) results in a remaining error \(R_{(1)}^l\) of the form

$$\begin{aligned} R_{(1)}^l&= c_B B^l + R_{E}^l \end{aligned}$$
(2.22)

where \(R_{E}^l\) is much smaller than the original error \(R^l\), whereas the term \(|c_B B^l| \sim |R^l|\) has the same size. Rather than using the second correction \(\Theta _{(2)}\) to eliminate the term \(c_B B^l\) as discussed previously, we observe that we can simultaneously get rid of the \(B^l\) component of the small term \(R_{E}^l\), thus leaving an error of the form

$$\begin{aligned} R_{1}^l&= c_A A^l + R_J^l \end{aligned}$$
(2.23)

where \(c_A A^l\) is the remaining \(A^l\) component of \(R_E^l\), and the term \(R_J^l\) is an even smaller error term. For our next correction, we can repeat the same idea and eliminate the \(A^l\) component of (2.23), leaving an error of the form (2.22). Continuing in this way, we see that each correction now causes an improvement in the size of the error in the \(C^0\) topology, just as in the situation for Euler.

The above discussion has been based on the hope that we can really eliminate the \(A^l\) and \(B^l\) components of the error, which is not entirely justified at this point. In fact, there are some further difficulties which stand in our way before this task can be accomplished which will become more clear as we specify the construction. One such difficulty is the appearance of low frequency interference terms.

Low Frequency Interference Terms

It turns out that the most straightforward approach to the construction based on the ideas of Section 2.2 gives rise to certain interference terms of low to intermediate frequency which apparently prohibit the success of our scheme. Thus, while the idea introduced in Section 2.2 allows us to control the high frequency interference terms in a satisfactory manner, we must incorporate one additional idea into the construction before our scheme can handle every type of error term which arises.

The ideas in Section 2.2 suggest that a natural approach to the construction is to use waves of the form \(\Theta _I = e^{i \lambda \xi _I}(\theta _I + \delta \theta _I)\) where the phase functions \(\xi _I\) oscillate in the direction \(\pm \xi ^{(1)}\) (or \(\pm \xi ^{(2)}\)) in the sense that the gradients remain close to their common initial values

$$\begin{aligned} \nabla \xi _I \approx \pm \xi ^{(1)} \end{aligned}$$
(2.24)

For an index I, let us write \(f(I) \in \{ \pm \}\) to denote the sign appearing in (2.24).

According to Section 2.2, we have a method to ensure that high frequency nonlinear interference terms obey good bounds. Thus, every interaction term of the form

$$\begin{aligned} \partial _l( \Theta _I U_J^l + \Theta _J U_I^l) \end{aligned}$$
(2.25)

which arises between waves of the same sign \(f(I) = f(J) \in \{ \pm \}\) can be handled by our method, as these terms are all of high frequency.

A new difficulty arises when we consider interference terms between waves of opposite signs \(f(I) = - f(J)\), which we call “Low-Frequency Interference Terms”. In this case, the terms of the form \(\Theta _I U_J^l + \Theta _J U_I^l\) as in (2.25) can be expressed to leading order as

$$\begin{aligned} \Theta _I U_J^l + \Theta _J U_I^l&\approx e^{i \lambda \left( \xi _I + \xi _J\right) } ( \theta _I u_J^l + \theta _J u_I^l) \end{aligned}$$
(2.26)

When we consider indices with opposite signs \(f(I) = - f(J)\), the term (2.26) cannot be viewed as a high frequency error term. In the worst case it may even be true that \(\nabla (\xi _I + \xi _J) = 0\) thanks to the initial conditions satisfying (2.24).

It turns out that having low frequency interference terms of the form (2.26) prevents us from solving the quadratic equation to determine the amplitudes \(\theta _I\). To see this difficulty, note that the left hand side of the equation analogous to (2.15), which includes all low frequency interactions, would have to include terms of the form

$$\begin{aligned} \sum _{\begin{array}{c} I,J \\ f(I) = +, ~f(J) = - \end{array}} \Theta _I U_J^l + \Theta _J U_I^l&= \sum _{\begin{array}{c} I, J \\ f(I) = f(J) = + \end{array}} e^{i \lambda \left( \xi _I - \xi _J\right) } \left( \theta _I \bar{\theta }_J + \theta _J \bar{\theta }_I\right) A^l + \ldots \end{aligned}$$
(2.27)

Remarkably, the right hand side of (2.27) appears to obey all the estimates we would require for obtaining solutions with Hölder regularity \(1/9-\), despite the appearance of the parameter \(\lambda \). The problem is that the right hand side of (2.27) must remain bounded away from 0 in order to solve the quadratic equation for the amplitudes. On the other hand, there is no way to preclude the possibility that the series (2.27) cancels completely at points \((t,x)\) on which the amplitudes \(\theta _I(t,x)\) and \(\theta _J(t,x)\) have essentially the same size, due to the presence of the oscillating factors \(e^{i \lambda ( \xi _I - \xi _J) }\) in the cross terms arising from distinct indices \(J \ne I\).

At first sight, this difficulty would seem to completely prevent us even from achieving continuous solutions, as we are left with no way to obtain a \(C^0\) improvement in the size of the error on the regions where distinct indices interact. We overcome this obstruction by making one more adjustment to the construction. Roughly speaking, our idea is to allow the condition (2.24) to be satisfied by “half” the waves in our construction, whereas the other “half” of the waves in the construction involve phase functions with initial data satisfying

$$\begin{aligned} \nabla \xi _I \approx \pm 10 \xi ^{(1)} \end{aligned}$$
(2.28)

Furthermore, we ensure that every nonlinear interaction which takes place between nonconjugate waves involves one wave satisfying (2.24), and a second wave satisfying (2.28). In this way, every interference term of the form (2.26) is actually a high frequency error term. Moreover, every wave oscillates in a direction essentially parallel to \(\xi ^{(1)}\), so that the idea of Section 2.2 still applies to treat these high frequency interference terms.
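The frequency bookkeeping behind this adjustment can be summarized in a few lines of Python. The sketch below idealizes each phase gradient as an exact integer multiple of \(\xi ^{(1)}\) (an assumption made for illustration; in the construction the gradients are only close to these values) and lists the multiples that can occur for \(\nabla (\xi _I + \xi _J)\).

```python
from itertools import product

family_one = [+1, -1]    # multiples of xi^(1) allowed by (2.24)
family_ten = [+10, -10]  # multiples of xi^(1) allowed by (2.28)


def sum_multiples(fam_a, fam_b):
    """Multiples of xi^(1) that can occur as grad(xi_I + xi_J)."""
    return sorted({a + b for a, b in product(fam_a, fam_b)})


single_family = sum_multiples(family_one, family_one)  # interactions within (2.24)
mixed = sum_multiples(family_one, family_ten)          # one wave from each family
smallest_mixed = min(abs(s) for s in mixed)
```

Within a single family the multiple 0 occurs, which is the low frequency interaction described above, whereas every mixed interaction has frequency at least \(9 |\xi ^{(1)}|\) in this idealization.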

With these ideas in hand, we are now ready to proceed with the formal construction in detail, beginning with the statement of the Main Lemma.

The Main Lemma

In order to state the main lemma, let us recall that we have fixed once and for all a choice of linearly independent vectors

$$\begin{aligned} A^l = m^l(\xi ^{(1)}) + m^l(-\xi ^{(1)}), \qquad B^l = m^l(\xi ^{(2)}) + m^l(-\xi ^{(2)}) \end{aligned}$$
(3.1)

where \(\xi ^{(1)}, \xi ^{(2)} \in {\mathbb {Z}}^2 = {\widehat{{\mathbb {T}}}}^2\) are nonzero (integral) frequencies. The existence of these vectors is guaranteed by the condition that \(m^l(\xi )\) is not odd, and the orthogonality condition \(\xi _l m^l(\xi ) = 0\).

Definition 3.1

For a constant vector \(A^l\), we say that \((\theta , u^l, c_A, R_J^l)\) satisfy the Compound Scalar-Stress equation (with vector \(A^l\)) if

$$\begin{aligned} \left\{ \begin{aligned} \partial _t \theta + \partial _l(\theta u^l)&= \partial _l( c_A A^l + R_J^l) \\ u^l&= T^l\left( \theta \right) \end{aligned} \right. \end{aligned}$$
(3.2)

In this case, we will refer to the tuple \((\theta , u^l, c_A, R_J^l)\) as a compound scalar-stress field.

For a solution to the compound scalar-stress equation (3.2), we define compound frequency-energy levels to be the following

Definition 3.2

Let \(L \ge 1\) be a fixed integer. Let \(\Xi \ge 2\), and let \(e_v\), \(e_R\) and \(e_J\) be positive numbers with \(e_J \le e_R \le e_v\). We say that \((\theta , u^l, c_A, R^l_J)\) have frequency and energy levels below \((\Xi , e_v, e_R, e_J)\) to order L in \(C^0\) if \((\theta , u^l, c_A, R^l_J)\) solve the system (3.2) and satisfy the bounds

$$\begin{aligned} || \nabla ^k u ||_{C^0} + \Vert \nabla ^k \theta \Vert _{C^0}&\le \Xi ^k e_v^{1/2}&\qquad \quad k = 1, \ldots , L \end{aligned}$$
(3.3)
$$\begin{aligned} \Vert \nabla ^k \left( \partial _t + u \cdot \nabla \right) u \Vert _{C^0}&\le \Xi ^{k+1} e_v&\qquad \quad k = 0, \ldots , L-1 \end{aligned}$$
(3.4)
$$\begin{aligned} || \nabla ^k c_A ||_{C^0}&\le \Xi ^k e_R&\qquad \quad k = 0, \ldots , L \end{aligned}$$
(3.5)
$$\begin{aligned} || \nabla ^k \left( \partial _t + u \cdot \nabla \right) c_A ||_{C^0}&\le \Xi ^{k+1} e_v^{1/2} e_R&\qquad \quad k = 0, \ldots , L - 1 \end{aligned}$$
(3.6)
$$\begin{aligned} || \nabla ^k R_J ||_{C^0}&\le \Xi ^k e_J&\qquad \quad k = 0, \ldots , L \end{aligned}$$
(3.7)
$$\begin{aligned} || \nabla ^k \left( \partial _t + u \cdot \nabla \right) R_J ||_{C^0}&\le \Xi ^{k+1} e_v^{1/2} e_J&\qquad \quad k = 0, \ldots , L - 1 \end{aligned}$$
(3.8)

Here \(\nabla \) refers only to derivatives in the spatial variables.

Note that we assume bounds (3.3)-(3.4) on the drift velocity \(u^l\) which do not in general follow from the corresponding bounds on \((\theta , c_A, R^l_J)\) and the transport equation (3.2). We assume these bounds on \(u^l\) in order to avoid logarithmic losses in our estimates which would arise otherwise from the lack of \(C^0\) boundedness of the operator \(u^l = T^l \theta \) defining the velocity.

We now state the Main Lemma of the paper, which summarizes the result of one step of the convex integration procedure. The statement of this lemma involves two constants: \(K_0 \ge 1\) (specified in Line (5.30) of the construction) and \(K_1 \ge 1\) (determined in Line (5.25) of the construction, see also Section 8.1). These constants \(K_0\) and \(K_1\) depend only on the operator \(T^l\) in the statement of the Main Theorem.

Lemma 3.1

(The Main Lemma) Suppose that \(L \ge 2\) and let \(K, M \ge 4\) be non-negative numbers such that \(K \ge K_0\). There is a constant C depending only on L, K, M and the operator \(T^l\) such that the following holds:

Let \((\theta ,u^l, c_A, R_J^l)\) be any solution of the compound scalar-stress system whose compound frequency and energy levels are below \((\Xi , e_v, e_R, e_J)\) to order L in \(C^0\), and let \(I \subseteq {\mathbb {R}}\) be a nonempty closed interval such that

$$\begin{aligned} {\mathrm {supp}}\,R_J \cup {\mathrm {supp}}\,c_A&\subseteq I \times {\mathbb {T}}^2 \end{aligned}$$
(3.9)

Define the time-scale \({\hat{\tau }} = \Xi ^{-1} e_v^{-1/2}\), and let

$$\begin{aligned} e(t) : {\mathbb {R}}\rightarrow {\mathbb {R}}_{\ge 0} \end{aligned}$$

be any non-negative function for which the lower bound

$$\begin{aligned} e(t) \ge K e_R \quad \quad \text{ for } \text{ all } t \in I \pm {\hat{\tau }} \end{aligned}$$
(3.10)

is satisfied in a \({\hat{\tau }}\)-neighborhood of the interval I, and whose square root satisfies the estimates

$$\begin{aligned} || \frac{d^r}{dt^r} e^{1/2} ||_{C^0}&\le M \left( \Xi e_v^{1/2}\right) ^r e_R^{1/2} ,\qquad 0 \le r \le 2 \end{aligned}$$
(3.11)

Now let N be any positive number obeying the bound

$$\begin{aligned} N&\ge \left( \frac{e_v}{e_R} \right) ^{3/2} \end{aligned}$$
(3.12)

and define the dimensionless parameter \(\mathbf{b} = \left( \frac{e_v^{1/2}}{e_R^{1/2}N} \right) ^{1/2}\).

Then there exists a solution \((\theta _1,u^l_1, c_B, R_1^l)\) of the form \(\theta _1 = \theta + \Theta \), \(u_1 = u + U\) to the Compound Scalar-Stress Equation (3.2) with vector \(B^l\) whose frequency and energy levels are below

$$\begin{aligned} \begin{aligned} (\Xi ', e_{v}', e_{R}', e_J')&= \left( C N \Xi , e_R, K_1 e_J, \frac{e_v^{1/4} e_R^{3/4}}{N^{1/2}} \right) \\&= \left( C N \Xi , e_R, K_1 e_J, \left( \frac{e_v^{1/2}}{e_R^{1/2}N} \right) ^{1/2} e_R \right) \\&= \left( C N \Xi , e_R, K_1 e_J, \mathbf{b}^{-1} \frac{e_v^{1/2} e_R^{1/2}}{N}\right) \end{aligned} \end{aligned}$$
(3.13)

to order L in \(C^0\), and whose stress fields \(R_1\) and \(c_B\) are supported in

$$\begin{aligned} {\mathrm {supp}}\,c_B \cup {\mathrm {supp}}\,R_1 \subseteq {\mathrm {supp}}\,e \times {\mathbb {T}}^2 \end{aligned}$$
(3.14)

The correction \(\Theta = \theta _1 - \theta \) is of the form \(\Theta = \nabla \cdot W\). This correction and the correction to the velocity field \(U^l = T^l[\Theta ]\) can be guaranteed to obey the bounds

$$\begin{aligned} ||\Theta ||_{C^0} + \Vert U^l \Vert _{C^0}&\le C e_R^{1/2} \end{aligned}$$
(3.15)
$$\begin{aligned} \Vert \nabla \Theta \Vert _{C^0} + \Vert \nabla U \Vert _{C^0}&\le C N \Xi e_R^{1/2} \end{aligned}$$
(3.16)
$$\begin{aligned} \Vert (\partial _t + u^j \partial _j) \Theta \Vert _{C^0} + \Vert (\partial _t + u^j \partial _j) U \Vert _{C^0}&\le C \mathbf{b}^{-1} \Xi e_v^{1/2} e_R^{1/2} \end{aligned}$$
(3.17)
$$\begin{aligned} ||W||_{C^0}&\le C \Xi ^{-1} N^{-1} e_R^{1/2} \end{aligned}$$
(3.18)
$$\begin{aligned} \Vert \nabla W \Vert _{C^0}&\le C e_R^{1/2} \end{aligned}$$
(3.19)
$$\begin{aligned} ||(\partial _t + u^j \partial _j) W ||_{C^0}&\le C \mathbf{b}^{-1} N^{-1} e_v^{1/2} e_R^{1/2} \end{aligned}$$
(3.20)

The energy increment from the correction is prescribed up to errors bounded by

$$\begin{aligned} \left| \int _{{\mathbb {T}}^2} \frac{|\Theta |^2}{2}(t,x) dx - \int _{{\mathbb {T}}^2} e(t) dx \right|&\le \frac{1}{2} \int _{{\mathbb {T}}^2} e(t) dx + \frac{e_R}{N} \end{aligned}$$
(3.21)

and the incremental energy variation satisfies an estimate

$$\begin{aligned} \left| \frac{d}{dt} \int _{{\mathbb {T}}^2} |\Theta |^2(t,x) dx \right| \le C \mathbf{b}^{-1} \Xi e_v^{1/2} e_R \end{aligned}$$
(3.22)

uniformly in time. Finally, the space-time support of the correction \(\Theta \) is contained in \({\mathrm {supp}}\,e \times {\mathbb {T}}^2\).

Remarks About the Main Lemma

The overall structure of Lemma 3.1 is based on the Main Lemma of [31, Lemma 10.1]. The most important difference in our Lemma lies in the definition of the compound frequency energy levels. The bounds implicit in (3.13), which state the rate at which we are able to reduce the stress error, are the most essential point of the Main Lemma and dictate the regularity of the solutions we obtain. Another noticeable difference between Lemma 3.1 and the Lemmas [31, Lemma 10.1] and [34, Lemma 4.1] is that the estimate (3.21) gives us worse control over the increment of energy. In those Lemmas, the term \(\frac{1}{2} \int _{{\mathbb {T}}^2} e(t) dx\) is not present, and the error in prescribing the energy increment is of size \(O(N^{-1})\).

This weaker estimate on the energy increment is still sufficient for the applications considered in those papers. In [31] and [34], the same estimate is applied to prove the nontriviality of solutions, by proving that the energy strictly increases during the iteration at each fixed time slice on which the corrections are nontrivial. The same statement can be obtained here, although in our case the nontriviality of solutions follows already from the weak-* approximation statement in Theorem 1.1. In [34], it was shown that a localized version of the estimate (3.21) can be combined with the bounds (3.18)-(3.19) to prove that the construction necessarily results in solutions which fail to have any kind of improved \(C_x^{1/5 + \epsilon }(B)\) local regularity (or even local \(W^{1/5+\epsilon , 1}\) regularity) on every open ball B and every time slice contained in the support of the iteration (see [34, Theorem 1.2] for a precise statement). This lack of higher regularity is an automatic consequence of the construction, as the same proof shows the failure of local regularity above \(C_x^{1/5 - \epsilon }\) for the earlier constructions of \(C_{t,x}^{1/5 - \epsilon }\) solutions in [23, 31]. The same result applies in our setting by the same proof, using the estimates (3.18)-(3.19) and the localized version of (3.21). Namely, our solutions in dimension 2 fail to belong to \(C_x^{1/9+\epsilon }(B)\) on every open ball B and every time slice contained in the support of the iteration, and in dimension d fail to have any local regularity \(C_x^{1/(1 + 4 d)+\epsilon }(B)\) in a similar way.

Modifications for the Higher Dimensional Case

In this subsection we make some remarks about how to modify our proof to apply in higher dimensions.

In order to prove Theorem 1.5 regarding the case of higher dimensions, the relevant Main Lemma has a slightly different formulation, as one must modify the definitions of the compound scalar stress equation and the compound frequency energy levels. In the case of dimension d, we assume given a linearly independent set of vectors \(A_{(1)}, \ldots , A_{(d)}\) in the image of the even part of the multiplier. A typical solution to the Compound Scalar Stress equation will then be a solution to the equation

$$\begin{aligned} \begin{aligned} \partial _t \theta + \partial _l(\theta u^l)&= \partial _l( c_{A,(1)} A_{(1)}^l + \ldots + c_{A,(d-1)} A_{(d-1)}^l + R_J^l) \\ u^l&= T^l(\theta ) \end{aligned} \end{aligned}$$
(3.23)

A single step of the iteration will remove the \(A_{(1)}\) component of the error, giving a solution \(\theta _1\) and a new error of the form

$$\begin{aligned} \partial _t \theta _1 + \partial _l(\theta _1 u_1^l)&= \partial _l( c_{A,(2)} A_{(2)}^l + \ldots + c_{A,(d-1)} A_{(d-1)}^l + c_{A,(d)} A_{(d)}^l + R_{J,1}^l) \end{aligned}$$
(3.24)
$$\begin{aligned} u_1^l&= T^l(\theta _1) \end{aligned}$$
(3.25)

At the step above (or even earlier when writing (3.23)) we can absorb the \(A_{(2)}, \ldots , A_{(d-1)}\) components of \(R_{J,1}\) into the other terms. (To say it in a slightly different way, one can assume from the start in writing (3.23) that \(R_J\) is a multiple of \(A_{(d)}\) by absorbing the other components of \(R_J\) into the other terms.)

The Definition 3.2 of compound frequency energy levels now should include \(d + 1\) different energy levels \(e_v \ge e_{R,[1]} \ge \ldots \ge e_{R,[d-1]} \ge e_J\). The Main Lemma then takes as an input a compound scalar stress field with given frequency energy levels and outputs another scalar stress field with compound frequency energy levels

$$\begin{aligned}&\left( \Xi , e_v, e_{R,[1]}, \ldots , e_{R,[d-1]}, e_J\right) \nonumber \\&\quad \mapsto \left( C N \Xi , e_{R,[1]}, K_1 e_{R,[2]}, \ldots , K_1 e_{R,[d-1]}, K_1 e_J, e_J'\right) \end{aligned}$$
(3.26)
$$\begin{aligned} e_J'&= \left( \frac{e_v^{1/2}}{e_{R,[1]}^{1/2}N} \right) ^{1/2} e_{R,[1]} \end{aligned}$$
(3.27)

as in (3.13). All the bounds of the Main Lemma then hold with \(e_R\) replaced by \(e_{R,[1]}\), since we are eliminating the first and largest component of the error, and leaving the other terms for the next stages.

The proof of the Main Lemma is then performed similarly as below, but naturally involves more terms and notation. The Main Lemma is applied to prove Theorem 1.5 in a similar way as is done in Section 9 below, where one maintains a constant ratio of the consecutive energy levels with size bounded by \(\frac{e_v}{e_{R,[1]}} , \frac{e_{R,[i]}}{e_{R,[i+1]}} \le \frac{K_1}{Z}\). The difference in the iteration then is the choice of \(N_{(k)} \sim Z^{(4 d + 1)/2}\) instead of (9.14) at later stages k. Comparing the growth of frequencies \(\Xi _{(k)} \sim Z^{(4 d + 1)k/2}\) to the decay in energy levels \(e_{R,[1],(k)}^{1/2} \sim Z^{-k/2}\) as in (9.26), we obtain Hölder regularity up to \(\frac{1}{(4d + 1)}\) as stated in Theorem 1.5.
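The exponent bookkeeping in the preceding paragraph can be checked mechanically. The following is a minimal sketch and not part of the argument: it ignores the constants C and \(K_1\), assumes that all consecutive energy levels differ by the same factor Z, and assumes \(N \sim Z^p\); the function names are hypothetical. Requiring \(e_J'\) in (3.27) to sit d factors of Z below the new top level \(e_{R,[1]}\) determines p, and comparing \(\Xi _{(k)} \sim Z^{pk}\) with \(e_{R,[1],(k)}^{1/2} \sim Z^{-k/2}\) gives the Hölder exponent.

```python
from fractions import Fraction


def frequency_growth_exponent(d: int) -> Fraction:
    """Exponent p with N ~ Z**p in dimension d (constants ignored)."""
    # Work with logarithms base Z, normalised so that e_{R,[1]} = Z**0.
    log_e_v = Fraction(1)    # assumption: e_v / e_{R,[1]} ~ Z
    log_e_r1 = Fraction(0)
    # The new lowest level e_J' must sit d factors of Z below e_v' = e_{R,[1]}.
    target = log_e_r1 - d
    # (3.27) in logarithmic form:
    #   log e_J' = (1/2) * (log_e_v / 2 - log_e_r1 / 2 - p) + log_e_r1
    # Setting this equal to `target` and solving for p gives:
    return (log_e_v - log_e_r1) / 2 - 2 * (target - log_e_r1)


def holder_exponent(d: int) -> Fraction:
    """Ratio of the decay rate of e^{1/2} to the growth rate of Xi."""
    return Fraction(1, 2) / frequency_growth_exponent(d)
```

Under these assumptions, `frequency_growth_exponent(d)` reproduces the exponent \((4d+1)/2\) and `holder_exponent(d)` reproduces \(1/(4d+1)\), which for d = 2 is the threshold 1/9 appearing in the remarks on local regularity above.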

In the next Sections 4-8, we give the proof of the Main Lemma. In the following Sections 9-11, we then explain how the Main Lemma can be used to deduce Theorems 1.1-1.2.

The Microlocal Lemma

The following Lemma will be used heavily in the construction in order to control the output of a convolution operator applied to a highly oscillatory input. The Lemma allows us to show that, to leading order, a convolution operator simply behaves like a multiplication operator when it is applied to a high frequency input with a nonlinear phase function.

In all of our applications, the kernel K(h) below will be a Schwartz function essentially supported on length scales of order \(|h| \sim \lambda ^{-1}\) for large \(\lambda \). We normalize the Fourier transform of a function \(K : {\mathbb {R}}^2 \rightarrow {\mathbb {C}}\) to be

$$\begin{aligned} \hat{K}(\xi ) = \int _{{\mathbb {R}}^2} e^{- i \xi \cdot h} K(h) dh \end{aligned}$$

Lemma 4.1

(Microlocal Lemma) Suppose that

$$\begin{aligned} T[\Theta ](x) = \int _{{\mathbb {R}}^2} \Theta (x - h) K(h) dh \end{aligned}$$

is a convolution operator acting on functions \(\Theta : {\mathbb {T}}^2 \rightarrow {\mathbb {C}}\), with a kernel \(K : {\mathbb {R}}^2 \rightarrow {\mathbb {C}}\) in the Schwartz class. Let \(\xi : {\mathbb {T}}^2 \rightarrow {\mathbb {R}}\) and \(\theta : {\mathbb {T}}^2 \rightarrow {\mathbb {C}}\) be smooth, periodic functions and \(\lambda \in {\mathbb {Z}}\) be an integer. Then for any input of the form

$$\begin{aligned} \Theta = e^{i \lambda \xi (x)} \theta (x) \end{aligned}$$

we have the formula

$$\begin{aligned} T[\Theta ](x)&= e^{i \lambda \xi (x)} \left( \theta (x) {\hat{K}}(\lambda \nabla \xi (x)) + \delta [T\Theta ](x) \right) \end{aligned}$$
(4.1)

where the error in the amplitude term has the explicit form

$$\begin{aligned} \begin{aligned} \delta [T\Theta ](x)&= \int _0^1 dr \frac{d}{dr} \int _{{\mathbb {R}}^2} e^{- i \lambda \nabla \xi (x) \cdot h} e^{i Z(r,x, h)} \theta (x - rh) K(h) dh \\ Z(r,x,h)&= r \lambda \int _0^1 h^a h^b \partial _a \partial _b \xi (x - s h) (1-s) ds \end{aligned} \end{aligned}$$
(4.2)

Proof

Observe that

$$\begin{aligned} e^{- i \lambda \xi (x)} T[\Theta ](x)&= \int _{{\mathbb {R}}^2} e^{i \lambda (\xi (x-h) - \xi (x))} \theta (x-h) K(h) dh \end{aligned}$$
(4.3)

By Taylor expanding, we express

$$\begin{aligned} \xi (x-h) - \xi (x)&= - \nabla \xi (x) \cdot h + \int _0^1 h^a h^b \partial _a \partial _b \xi (x - s h) (1-s) ds \end{aligned}$$
(4.4)

In our applications, the kernel K is localized to small values of \(|h| \sim \lambda ^{-1}\) for large \(\lambda \), so we view the second term in (4.4) as a small error. Similarly, we think of \(\theta (x - h)\) as a perturbation of \(\theta (x)\), which motivates us to express the right hand side of (4.3) as

$$\begin{aligned} e^{- i \lambda \xi (x)} T[\Theta ](x)&= \int _{{\mathbb {R}}^2} e^{- i \lambda \nabla \xi (x) \cdot h} \theta (x) K(h) dh + \delta [T \Theta ](x), \end{aligned}$$
(4.5)

where \(\delta [T \Theta ](x)\) is expressed in (4.2). The proof concludes by recognizing that \(\theta (x)\) can be factored out of the integral in (4.5), which gives formula (4.1). \(\square \)
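As a numerical illustration of formula (4.1), the following sketch evaluates both sides in a one-dimensional analogue of the Lemma. Everything specific here is an assumption made for illustration: the Gaussian kernel \(K(h) = \lambda (2\pi )^{-1/2} e^{-\lambda ^2 h^2/2}\), for which \(\hat{K}(\eta ) = e^{-\eta ^2/(2\lambda ^2)}\), the particular phase and amplitude, and all function names. It is not the kernel used in the construction. The function returns the size of the amplitude error \(\delta [T\Theta ](x)\), which one expects to decrease as \(\lambda \) grows.

```python
import cmath
import math


def phase(x):       # hypothetical nonlinear phase function xi(x)
    return x + 0.1 * math.sin(x)


def dphase(x):      # its derivative xi'(x)
    return 1.0 + 0.1 * math.cos(x)


def theta_amp(x):   # hypothetical smooth amplitude theta(x)
    return 1.0 + 0.5 * math.cos(x)


def amplitude_error(lam, x, half_width=10.0, steps=4000):
    """Return |delta[T Theta](x)| for the Gaussian kernel described above."""
    ds = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for n in range(steps + 1):
        s = -half_width + n * ds      # rescaled variable s = lam * h
        h = s / lam
        # K(h) dh written in the variable s
        weight = math.exp(-0.5 * s * s) / math.sqrt(2.0 * math.pi) * ds
        # integrand of (4.3): e^{i lam (xi(x-h) - xi(x))} theta(x-h) K(h)
        osc = cmath.exp(1j * lam * (phase(x - h) - phase(x)))
        total += osc * theta_amp(x - h) * weight
    # main term of (4.1): theta(x) * K_hat(lam * xi'(x))
    main = theta_amp(x) * math.exp(-0.5 * dphase(x) ** 2)
    return abs(total - main)
```

Comparing, say, `amplitude_error(50, 0.3)` with `amplitude_error(200, 0.3)` gives a quick check that the multiplication operator \(\theta (x) {\hat{K}}(\lambda \nabla \xi (x))\) captures the leading behaviour in this toy setting.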

Remark 3

We remark that the same method applied here to prove Lemma 4.1 can also be iterated to obtain a higher order expansion of \(T[\Theta ](x)\) involving only the functions \(\theta (x)\), \(\nabla \xi (x)\) and their derivatives evaluated at the point x:

$$\begin{aligned} \delta [T\Theta ](x) = -i ~\partial _a \theta (x) \partial ^a \hat{K}(\lambda \nabla \xi (x)) - \frac{1}{2} i \lambda ~ \theta (x) \partial _a \partial _b \xi (x) \partial ^a \partial ^b \hat{K} (\lambda \nabla \xi (x)) + \ldots \end{aligned}$$
(4.6)

To obtain this further expansion, one modifies the function Z defined in (4.2) to have an additional factor of r in the argument of the phase function

$$\begin{aligned} Z(r,x,h) = r \lambda \int _0^1 h^a h^b \partial _a \partial _b \xi (x - r s h) (1-s) ds \end{aligned}$$

The expansion (4.6) is then obtained by Taylor expansion in the variable r via integration by parts. We do not take this approach here because it does not improve our estimates, and results in some more complicated formulas.
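For orientation, the two terms displayed in (4.6) can be recovered formally as follows. This is only a sketch: we write \(\eta = \lambda \nabla \xi (x)\) and let \(\partial ^a\) denote the derivative with respect to the frequency variable \(\eta _a\) (our reading of the notation), and we keep only the leading Taylor terms of \(\theta (x - h)\) and \(e^{iZ}\) at \(h = 0\), using \(\int _0^1 (1-s) ds = \frac{1}{2}\). With the normalization of the Fourier transform fixed above,

$$\begin{aligned} \int _{{\mathbb {R}}^2} e^{- i \eta \cdot h} h^a K(h) dh&= i \partial ^a \hat{K}(\eta ), \qquad \int _{{\mathbb {R}}^2} e^{- i \eta \cdot h} h^a h^b K(h) dh = - \partial ^a \partial ^b \hat{K}(\eta ) \\ \theta (x - h)&= \theta (x) - h^a \partial _a \theta (x) + \ldots , \qquad e^{i Z} = 1 + \frac{i \lambda }{2} h^a h^b \partial _a \partial _b \xi (x) + \ldots \end{aligned}$$

Inserting the second line into the integral on the right hand side of (4.3) and applying the first line, the term linear in h contributes \(- i \partial _a \theta (x) \partial ^a \hat{K}(\eta )\) and the quadratic term coming from the phase contributes \(- \frac{1}{2} i \lambda \theta (x) \partial _a \partial _b \xi (x) \partial ^a \partial ^b \hat{K}(\eta )\).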

The Construction

We now give a detailed description of the construction. We start by obtaining a complete list of the error terms.

Suppose that we are in the setting of Lemma 3.1. Thus, we have a solution \((\theta , u, c_A, R_J)\) to the compound scalar-stress equation with vector \(A^l = m^l(\xi ^{(1)}) + m^l(-\xi ^{(1)})\) as in (2.14)

$$\begin{aligned} \left\{ \begin{aligned} \partial _t \theta + \partial _l\left( \theta u^l\right)&= \partial _l\left( c_A A^l + R_J^l\right) \\ u^l&= T^l[\theta ] \end{aligned} \right. \end{aligned}$$
(5.1)

whose frequency-energy levels are below \((\Xi , e_v, e_R, e_J)\). After adding a correction \(\Theta \) to the scalar field, the corrected scalar \(\theta _1 = \theta + \Theta \) and drift velocity \(u_1^l = u^l + U^l\), \(U^l = T^l[\Theta ]\) satisfy the system

$$\begin{aligned} \partial _t \theta _1 + \partial _l(\theta _1 u_1^l)&= \partial _t \Theta + \partial _l( u^l \Theta ) + \partial _l(\theta U^l) + \partial _l( \Theta U^l + c_A A^l + R_J^l) \end{aligned}$$
(5.2)
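The intermediate step behind (5.2), which is not written out above, is the expansion of the product \(\theta _1 u_1^l\); we record it here as a sketch for the reader's convenience:

$$\begin{aligned} \theta _1 u_1^l&= \theta u^l + \Theta u^l + \theta U^l + \Theta U^l \\ \partial _t \theta _1 + \partial _l(\theta _1 u_1^l)&= \left[ \partial _t \theta + \partial _l(\theta u^l) \right] + \partial _t \Theta + \partial _l( u^l \Theta ) + \partial _l(\theta U^l) + \partial _l( \Theta U^l) \end{aligned}$$

The bracketed term equals \(\partial _l( c_A A^l + R_J^l)\) by (5.1), which gives (5.2).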

As a preliminary step, it is necessary to define suitable regularizations \((\theta _\epsilon , u_\epsilon , \tilde{c}_A, R_\epsilon )\) of \((\theta , u, c_A, R_J)\). The purpose of these regularizations is to ensure that only the “low frequency parts” of the given \((\theta , u, c_A, R_J)\) will influence the building blocks of the construction. These mollifications give rise to an error term

$$\begin{aligned} R_M^l&= (u^l - u_\epsilon ^l) \Theta + \left( \theta - \theta _\epsilon \right) U^l + \left( c_A - \tilde{c}_A\right) A^l + (R_J^l - R_\epsilon ^l) \end{aligned}$$
(5.3)

Our goal is to design a correction \(\Theta \) for the scalar field \(\theta \) so that the corrected scalar \(\theta _1 = \theta + \Theta \) and drift velocity \(u_1^l = u^l + U^l\) satisfy the compound scalar-stress equation with vector \(B^l = m^l(\xi ^{(2)}) + m^l(-\xi ^{(2)})\) as in (2.14)

$$\begin{aligned} \left\{ \begin{aligned} \partial _t \theta _1 + \partial _l(\theta _1 u_1^l)&= \partial _l( c_B B^l + R_1^l) \\ u_1^l&= T^l[\theta _1] \end{aligned} \right. \end{aligned}$$
(5.4)

whose compound frequency energy levels are bounded as in Lemma 3.1.

The Shape of the Corrections

Our correction is a sum of individual waves

$$\begin{aligned} \Theta&= \sum _I \Theta _I \end{aligned}$$
(5.5)
$$\begin{aligned} \Theta _I&= e^{i \lambda \xi _I}\left( \theta _I + \delta \theta _I\right) \end{aligned}$$
(5.6)

where we are free to specify the amplitudes \(\theta _I\) and the phase function \(\xi _I\). The parameter \(\lambda \) is a large frequency parameter of the form

$$\begin{aligned} \lambda&= B_\lambda N \Xi \end{aligned}$$
(5.7)

where \(B_\lambda \) is a very large constant associated to \(\lambda \) which is chosen at the end of the argument. (For technical reasons, we will require that \(\lambda \in {\mathbb {Z}}_+\) is a positive integer, so \(B_\lambda \) will really have some dependence on \(N \Xi \), but will nonetheless be bounded, and should be thought of as a constant.) The term \(\delta \theta _I\) in (5.6) is a small correction term which is present to ensure that the wave \(\Theta _I\) has compact support in frequency space. We will specify \(\delta \theta _I\) later, but it is important to remark that

$$\begin{aligned} \Vert \delta \theta _I \Vert _{C^0} \rightarrow 0, \text{ as } \lambda \rightarrow \infty \end{aligned}$$

Each wave \(\Theta _I\) has a conjugate wave \(\Theta _{\bar{I}} = \overline{\Theta }_I\) with an opposite phase function \(\xi _{\bar{I}} = - \xi _I\) and amplitude \(\theta _{\bar{I}} = \bar{\theta }_I\) so that the overall correction is real-valued. We will choose the amplitudes \(\theta _I = \bar{\theta }_I\) to be real-valued as well.

The index I for the wave \(\Theta _I\) consists of two parts \(I = (k,f) \in {\mathbb {Z}}\times \{ \pm \}\). The discrete index \(k \in {\mathbb {Z}}\) specifies the support of the wave \(\Theta _I = \Theta _{(k,f)}\) in time. Specifically, the support of \(\Theta _{(k,f)}\) will be contained in the time interval \([(k - \frac{2}{3}) \tau , (k + \frac{2}{3})\tau ]\) where \(\tau \) is a time scale parameter that will be chosen during the iteration. The index \(f \in \{ \pm \}\) is a sign which specifies the direction of oscillation of the wave \(\Theta _{(k,f)}\).

The phase functions \(\xi _I\) are solutions to the transport equation

$$\begin{aligned} \begin{aligned} (\partial _t + u_\epsilon ^l \partial _l) \xi _I&= 0 \\ \xi _I\left( t(I), x\right)&= \hat{\xi }_I(x) \end{aligned} \end{aligned}$$
(5.8)

The amplitudes \(\theta _I\) will be supported on a small time interval during which the phase functions remain close to their initial data. The initial data \(\hat{\xi }_I\) for the phase function \(\xi _I = \xi _{(k,f)}\) is chosen at the time \(t(I) = k \tau \) depending on the index \(I = (k,f)\)

$$\begin{aligned} \hat{\xi }_I(x) = \hat{\xi }_{(k, \pm )}(x)&= \pm 10^{[k]} \xi ^{(1)} \cdot x \end{aligned}$$
(5.9)

where \([k] \in \{ 0, 1 \}\) is equal to 0 when k is even and is equal to 1 when k is odd. In particular, we have

$$\begin{aligned} \nabla \hat{\xi }_I = \pm 10^{[k]} \xi ^{(1)}, \qquad [k] \in \{ 0, 1 \} \end{aligned}$$

Our individual waves are localized in frequency and take the form

$$\begin{aligned} \Theta _I&= P_{\approx \lambda }^I [ e^{i \lambda \xi _I} \theta _I] \end{aligned}$$
(5.10)

The operators \(P_{\approx \lambda }^I\) in (5.10) restrict to frequencies of order \(\lambda \) in a neighborhood of \(\lambda \nabla \hat{\xi }_I\). To be explicit, let \({\hat{\eta }}_{\approx 1}(\xi )\) be a bump function supported on frequencies

$$\begin{aligned} {\hat{\eta }}_{\approx 1}(\xi ) \in C_c^\infty \left( B_{|\xi ^{(1)}|/2}(\xi ^{(1)}) \right) \end{aligned}$$

which has the property that

$$\begin{aligned} {\hat{\eta }}_{\approx 1}(\xi ) = 1, \qquad ~\text{ if } |\xi - \xi ^{(1)}| \le \frac{1}{4} |\xi ^{(1)}| \end{aligned}$$

We then define a frequency cutoff supported on high frequencies of order \(\lambda \) by rescaling and reflection

$$\begin{aligned} {\hat{\eta }}^{I}_{\approx \lambda }(\xi ) = {\hat{\eta }}_{\approx 1}(\pm 10^{-[k]} \lambda ^{-1} \xi ). \end{aligned}$$

Then \(P_{\approx \lambda }^{I}\) is given explicitly by a Fourier multiplier

$$\begin{aligned} \widehat{P_{\approx \lambda }^{I} F}(\xi ) = {\hat{\eta }}_{\approx \lambda }^I(\xi ) {\hat{F}}(\xi ). \end{aligned}$$

Including this “projection operator” \(P_{\approx \lambda }^{I}\) guarantees that all the corrections (5.10) have frequency support in the ball \(| \xi - ( \lambda \nabla \hat{\xi }_I ) | \le \lambda \frac{|\nabla \hat{\xi }_I|}{2} \), and in particular have integral 0. Having compact support in frequency space will allow us to easily control the resulting increment to the velocity field, which is obtained by applying another Fourier multiplier.
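The support statement can be read off from the definitions; the following short computation is included only as a sketch of that step. Since \(\nabla \hat{\xi }_I = \pm 10^{[k]} \xi ^{(1)}\), the cutoff \({\hat{\eta }}^{I}_{\approx \lambda }(\xi )\) can be nonzero only if

$$\begin{aligned} \left| \pm 10^{-[k]} \lambda ^{-1} \xi - \xi ^{(1)} \right| \le \frac{|\xi ^{(1)}|}{2} \quad \Longleftrightarrow \quad \left| \xi - \lambda \nabla \hat{\xi }_I \right| \le \lambda \frac{|\nabla \hat{\xi }_I|}{2} \end{aligned}$$

By the triangle inequality, every such frequency satisfies \(\frac{\lambda }{2} |\nabla \hat{\xi }_I| \le |\xi | \le \frac{3 \lambda }{2} |\nabla \hat{\xi }_I|\). In particular the frequency \(\xi = 0\) is excluded, so the zeroth Fourier coefficient of \(\Theta _I\), which is proportional to its integral over \({\mathbb {T}}^2\), vanishes.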

By the Microlocal Lemma 4.1, it is possible to write the wave (5.10) in the form (5.6) with an explicit remainder \(\delta \theta _I\), since we have

$$\begin{aligned} \Theta _I = P_{\approx \lambda }^I [ e^{i \lambda \xi _I} \theta _I ]&= e^{i \lambda \xi _I}( \theta _I {\hat{\eta }}_{\approx \lambda }^I\left( \lambda \nabla \xi _I\right) + \delta \theta _I) \nonumber \\&= e^{i \lambda \xi _I}\left( \theta _I + \delta \theta _I\right) \end{aligned}$$
(5.11)

provided that the phase gradient is sufficiently close to its initial value

$$\begin{aligned} | \nabla \xi _I - \nabla \hat{\xi }_I |&\unlhd \frac{|\nabla \hat{\xi }_I|}{4} \end{aligned}$$
(5.12)

We will verify that inequality (5.12) is satisfied when the lifespan parameter \(\tau \) is chosen.
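To make explicit why (5.12) allows the cutoff to be dropped in (5.11), here is a short verification, given as a sketch that uses only the definitions above. By the definition of \({\hat{\eta }}^{I}_{\approx \lambda }\) and the identity \(\pm 10^{-[k]} \nabla \hat{\xi }_I = \xi ^{(1)}\), once (5.12) holds we have

$$\begin{aligned} {\hat{\eta }}^{I}_{\approx \lambda }(\lambda \nabla \xi _I)&= {\hat{\eta }}_{\approx 1}(\pm 10^{-[k]} \nabla \xi _I) \\ \left| \pm 10^{-[k]} \nabla \xi _I - \xi ^{(1)} \right|&= 10^{-[k]} \left| \nabla \xi _I - \nabla \hat{\xi }_I \right| \le 10^{-[k]} \frac{|\nabla \hat{\xi }_I|}{4} = \frac{|\xi ^{(1)}|}{4} \end{aligned}$$

so the argument of \({\hat{\eta }}_{\approx 1}\) lies in the region where this function is equal to 1, and hence \({\hat{\eta }}^{I}_{\approx \lambda }(\lambda \nabla \xi _I) = 1\).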

Applying the Microlocal Lemma 4.1 again, we can also calculate the resulting correction to the drift velocity.

$$\begin{aligned} U_I^l&:= T^l\left[ \Theta _I \right] \end{aligned}$$
(5.13)
$$\begin{aligned} T^l P_{\approx \lambda }^I [ e^{i \lambda \xi _I} \theta _I ]&= e^{i \lambda \xi _I}(\theta _I {\hat{K}}^l\left( \lambda \nabla \xi _I\right) {\hat{\eta }}_{\approx \lambda }^I\left( \lambda \nabla \xi _I\right) + \delta u_I^l ) \end{aligned}$$
(5.14)
$$\begin{aligned} U_I^l&= e^{i \lambda \xi _I}( \theta _I m^l\left( \nabla \xi _I\right) + \delta u_I^l) \end{aligned}$$
(5.15)

Therefore, once we have verified (5.12), we have

$$\begin{aligned} U_I^l&= e^{i \lambda \xi _I}( u_I^l + \delta u_I^l) \end{aligned}$$
(5.16)
$$\begin{aligned} u_I^l&= \theta _I m^l\left( \nabla \xi _I\right) \end{aligned}$$
(5.17)

with an explicit error term \(\delta u_I^l\) given by Lemma 4.1.
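The passage from (5.14) to (5.15) can be summarized in one line. The following sketch reads \({\hat{K}}^l\) as the symbol \(m^l\) of \(T^l\) from (1.2), and uses that \(\nabla \xi _I \ne 0\) under (5.12), that \(m^l\) is homogeneous of degree 0 and that \(\lambda > 0\):

$$\begin{aligned} {\hat{K}}^l\left( \lambda \nabla \xi _I\right) {\hat{\eta }}_{\approx \lambda }^I\left( \lambda \nabla \xi _I\right) = m^l\left( \lambda \nabla \xi _I\right) \cdot 1 = m^l\left( \nabla \xi _I\right) \end{aligned}$$

Here the cutoff is equal to 1 at the frequency \(\lambda \nabla \xi _I\) whenever (5.12) holds.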

Choosing the Amplitudes

According to Section 5.1, we can now decompose the remaining error terms in Equation (5.2) as follows

$$\begin{aligned} \partial _t \theta _1 + \partial _l\left( \theta _1 u_1^l\right)&= \partial _t \Theta + \partial _l( u_\epsilon ^l \Theta ) + \partial _l(\theta _\epsilon U^l) \end{aligned}$$
(5.18)
$$\begin{aligned}&\quad + \partial _l[ \sum _I \Theta _I U_{\bar{I}}^l + \tilde{c}_A A^l + R_\epsilon ^l ] \end{aligned}$$
(5.19)
$$\begin{aligned}&\quad + \sum _{J \ne \bar{I}} \partial _l(\Theta _I U_J^l) \end{aligned}$$
(5.20)
$$\begin{aligned}&\quad + \partial _l R_M^l \end{aligned}$$
(5.21)

The term \(R_M^l\) comes from the regularizations in Equation (5.3).

The first objective of the correction is to eliminate the term (5.19), which is the only low frequency term that arises. However, since we consider oscillations in essentially only one direction \(\nabla \xi _I \approx \pm 10^{[k]} \xi ^{(1)}\), we will only be able to eliminate the \(A^l\) component of (5.19).

We begin by expanding the low frequency part of the interactions in line (5.19) as

$$\begin{aligned} \sum _I (\Theta _I U_{\bar{I}}^l)&= \frac{1}{2} \sum _I (\Theta _I U_{\bar{I}}^l + \Theta _{\bar{I}} U_I^l) \nonumber \\&= \sum _{I \in {\mathcal {I}}_+} \theta _I \bar{\theta }_I \big ( m^l\big (- \nabla \xi _I\big ) + m^l\big (\nabla \xi _I\big )\big ) + \text{ Lower } \text{ Order } \text{ Terms } \end{aligned}$$
(5.22)
$$\begin{aligned}&= \sum _{I \in {\mathcal {I}}_+} |\theta _I|^2 (m^l\big (- \nabla \hat{\xi }_I\big ) + m^l\big (\nabla \hat{\xi }_I\big )) + \text{ Lower } \text{ Order } \text{ Terms } \end{aligned}$$
(5.23)
$$\begin{aligned}&= \sum _{I \in {\mathcal {I}}_+} |\theta _I|^2 A^l + \text{ Lower } \text{ Order } \text{ Terms } \end{aligned}$$
(5.24)

We will give a complete list of the lower order terms below after we have chosen the amplitudes \(\theta _I\).

We wish to choose the amplitudes \(\theta _I\) so that the main term in (5.24) cancels with the \(A^l\) component of the other terms in line (5.19). We achieve this cancellation in two steps. First, we decompose \(R_\epsilon \) into components

$$\begin{aligned} R_\epsilon ^l&= c_J A^l + c_B B^l \end{aligned}$$
(5.25)

We also subtract a constant vector field \(\partial _l( e(t) A^l) = 0\) from line (5.19), which leads us to impose the equation

$$\begin{aligned} \sum _{I \in {\mathcal {I}}_+} |\theta _I|^2 A^l&= e(t) A^l +\tilde{c}_A A^l + c_J A^l \end{aligned}$$
(5.26)
$$\begin{aligned}&= e(t)(1 + \varepsilon ) A^l \end{aligned}$$
(5.27)
$$\begin{aligned} \varepsilon&= \frac{\tilde{c}_A + c_J}{e(t)} \end{aligned}$$
(5.28)

for the amplitudes \(\theta _I\). In this way, the amplitudes \(\theta _I\) are chosen to eliminate the \(A^l\) component of the low frequency part of the stress \(R_\epsilon ^l\).

It will be important for our construction that the term \(\varepsilon \) is smaller than the constant 1 in the \((1 + \varepsilon )\) term in (5.27). From the lower bound \(e(t) \ge K e_R\) assumed in (3.10), we can obtain an upper bound

$$\begin{aligned} \Vert \varepsilon \Vert _{C^0}&\le \frac{Z}{K} \end{aligned}$$
(5.29)

on the size of the term (5.28), where Z is a constant depending only on the vectors \(A^l\) and \(B^l\). Now, provided \(K \ge K_0 = 2 Z\), we have

$$\begin{aligned} \Vert \varepsilon \Vert _{C^0}&\le \frac{1}{2} \end{aligned}$$
(5.30)

A subtle point here is that the bound (5.29) does not follow immediately from (3.10). Namely, we must also check that the same lower bound remains true on the set

$$\begin{aligned} e(t) \ge K e_R \text{ for } \text{ all } (t,x) \in {\mathrm {supp}}\,\left( \tilde{c}_A + c_J \right) , \end{aligned}$$
(5.31)

which is slightly larger than the supports of the given \(R_J\) and \(c_A\) due to a regularization in time in the definitions of \(\tilde{c}_A\), \(c_J\). Thus, the estimates (5.29)-(5.30) are guaranteed only after (5.31) has been verified, which is accomplished in Line (6.17) below when we choose the mollifying parameters. We now assume that (5.29)-(5.30) hold in order to finish defining the construction.

From Equation (5.27), we are led to choose amplitudes of the form

$$\begin{aligned} \theta _I&= e^{1/2}(t) \eta _k(t) \gamma , \qquad I = (k,f) \end{aligned}$$
(5.32)
$$\begin{aligned} \gamma&= (1 + \varepsilon )^{1/2} \end{aligned}$$
(5.33)

The functions

$$\begin{aligned} \eta _k(t) = \eta \left( \frac{t - k \tau }{\tau } \right) \end{aligned}$$

are elements of a rescaled partition of unity in time

$$\begin{aligned} \sum _{u \in {\mathbb {Z}}} \eta ^2(t - u) = 1 \end{aligned}$$

which we use to patch together local solutions of Equation (5.27). Our choice of \(\eta _k\) ensures that each amplitude \(\theta _{(k,f)}\) has support in a time interval \([k \tau - \frac{2 \tau }{3},k \tau + \frac{2 \tau }{3}]\) of duration \(\frac{4 \tau }{3}\). The coefficient \(\gamma \) ensures that (5.27) is satisfied, and \(\gamma \) is assured to be well-defined by the bound (5.30).
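The way the cutoffs \(\eta _k\) and the coefficient \(\gamma \) combine to give (5.27) can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the construction itself: the particular cutoff `eta` is one hypothetical choice supported in \([-\frac{2}{3}, \frac{2}{3}]\) whose squares form a partition of unity, the values `e` and `eps` stand for \(e(t)\) and \(\varepsilon \) sampled at a single point, and all function names are ours.

```python
import math


def _g(x):
    # smooth function vanishing to infinite order at x = 0
    return math.exp(-1.0 / x) if x > 0.0 else 0.0


def smooth_step(t):
    """Equal to 0 for t <= 1/3 and 1 for t >= 2/3; step(t) + step(1 - t) = 1."""
    a, b = _g(t - 1.0 / 3.0), _g(2.0 / 3.0 - t)
    return a / (a + b)


def eta(t):
    """Hypothetical cutoff with support in [-2/3, 2/3] and sum_u eta(t - u)^2 = 1."""
    return math.sqrt(1.0 - smooth_step(abs(t)))


def amplitude(t, k, tau, e, eps):
    """theta_(k,f) as in (5.32)-(5.33), for scalar samples e = e(t), eps = varepsilon."""
    return math.sqrt(e) * eta((t - k * tau) / tau) * math.sqrt(1.0 + eps)


def sum_of_squares(t, tau, e, eps):
    """Sum of |theta_(k,+)|^2 over the indices k that are active near time t."""
    k0 = round(t / tau)
    return sum(amplitude(t, k, tau, e, eps) ** 2 for k in range(k0 - 2, k0 + 3))
```

For any time t, `sum_of_squares(t, tau, e, eps)` should agree with `e * (1 + eps)` up to rounding, which is the scalar form of (5.27); at most two values of k contribute at a given time because of the support of `eta`.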

To express the remaining error terms in a compact way, let us introduce the notation

$$\begin{aligned} \tilde{\theta }_I = \theta _I + \delta \theta _I, \quad \tilde{u}_I^l = u_I^l + \delta u_I^l \end{aligned}$$

Thus, \(\Theta _I = e^{i \lambda \xi _I} \tilde{\theta }_I\) and \(U_I^l = e^{i \lambda \xi _I} \tilde{u}_I^l\).

Having chosen \(\theta _I\), we can now expand the error term in (5.19) as follows

$$\begin{aligned} \sum _I \Theta _I U_{\bar{I}}^l&+ \tilde{c}_A A^l + R_\epsilon ^l = c_B B^l + R_S^l \end{aligned}$$
(5.34)
$$\begin{aligned} R_S^l&= \sum _I (\Theta _I U_{\bar{I}}^l) - \sum _{I \in {\mathcal {I}}_+} |\theta _I|^2 A^l \end{aligned}$$
(5.35)

We now expand

$$\begin{aligned} (\Theta _I U_{\bar{I}}^l + \Theta _{\bar{I}} U_I^l)&= \tilde{\theta }_I \tilde{u}_{\bar{I}}^l + \tilde{\theta }_{\bar{I}} \tilde{u}_I^l \\&= |\theta _I|^2 (m^l\big (-\nabla \hat{\xi }_I\big ) + m^l(\nabla \hat{\xi }_I)) + R_{SI,1}^l + R_{SI, 2}^l \\ R_{SI,1}^l&= |\theta _I|^2 [ \big (m^l(-\nabla \xi _I) - m^l\big (-\nabla \hat{\xi }_I\big )\big ) + ( m^l(\nabla \xi _I) - m^l\big (\nabla \hat{\xi }_I\big ))] \\ R_{SI,2}^l&= \delta \theta _I \tilde{u}_{\bar{I}}^l + \tilde{\theta }_I \delta u_{\bar{I}}^l - \delta \theta _I \delta u_{\bar{I}}^l + \delta \theta _{\bar{I}} \tilde{u}_{I}^l + \tilde{\theta }_{\bar{I}} \delta u_{I}^l - \delta \theta _{\bar{I}} \delta u_{I}^l \end{aligned}$$

which gives

$$\begin{aligned} R_S^l&= \sum _I (R_{SI,1}^l + R_{SI,2}^l) \end{aligned}$$
(5.36)

Note that, at any given time t, at most four indices I contribute to the sum in (5.36).

The Remaining Error Terms

In Sections 5.1-5.2 we have defined the construction up to the specification of a few parameters. Our result is that the corrected field \(\theta _1 = \theta + \Theta \) and drift velocity \(u_1^l = u^l + U^l\) satisfy Equation (5.4) with \(c_B\) defined in line (5.25), and

$$\begin{aligned} R_1&= R_T + R_L + R_H + R_M + R_S \end{aligned}$$
(5.37)

The terms \(R_M\) and \(R_S\) are defined in (5.3) and (5.36). We now rewrite the remaining terms in Equations (5.18)-(5.20) using the fact that the velocity fields appearing in these equations are divergence free.

The transport term \(R_T\) is obtained by solving

$$\begin{aligned} \partial _l R_T^l&= \left( \partial _t + u_\epsilon ^l \partial _l\right) \Theta \end{aligned}$$
(5.38)
$$\begin{aligned}&= \sum _I e^{i \lambda \xi _I} \left( \partial _t + u_\epsilon ^l \partial _l\right) {\tilde{\theta }}_I \end{aligned}$$
(5.39)

Here the term where the derivative hits the phase functions vanishes according to equation (5.8). Formula (5.39) suggests that the transport term has frequency \(\lambda \), so we expect to gain a factor \(\lambda ^{-1}\) in solving equation (5.38). In fact, we will choose our mollification \(u_\epsilon \) to be a frequency-localized version of u so that together with (5.10), the term (5.38) is literally supported on frequencies of size \(\frac{\lambda }{3} \le |\xi | \le 20 \lambda \). Hence, there is a frequency localizing operator \(P_{\approx \lambda }\) satisfying

$$\begin{aligned} \left( \partial _t + u_\epsilon ^l \partial _l\right) \Theta = P_{\approx \lambda }\left[ \left( \partial _t + u_\epsilon \cdot \nabla \right) \Theta \right] \end{aligned}$$

This frequency localization property allows us to simply define

$$\begin{aligned} R_T^l&= \partial ^l \Delta ^{-1} P_{\approx \lambda } \left[ \left( \partial _t + u_\epsilon \cdot \nabla \right) \Theta \right] \end{aligned}$$
(5.40)

In particular, we obtain the bound

$$\begin{aligned} \Vert R_T \Vert _{C^0}&\le \lambda ^{-1} \Vert \left( \partial _t + u_\epsilon \cdot \nabla \right) \Theta \Vert _{C^0} \end{aligned}$$
(5.41)

The terms remaining from (5.18) and (5.20) are the High-Low term

$$\begin{aligned} \partial _l R_L^l&= U^l \partial _l \theta _\epsilon \end{aligned}$$
(5.42)
$$\begin{aligned}&= \sum _I e^{i \lambda \xi _I} {\tilde{u}}_I^j \partial _j \theta _\epsilon \end{aligned}$$
(5.43)

and the high frequency interference terms

$$\begin{aligned} \partial _l R_H^l&= \sum _{J \ne \bar{I}} U_J^l \partial _l \Theta _I \end{aligned}$$
(5.44)

The frequency cutoffs in our definitions of \(\theta _\epsilon , U_I\) and \(\Theta _I\) ensure both of these terms have Fourier support in frequencies \(\frac{\lambda }{3} \le |\xi | \le 40 \lambda \). Here it is important that we have localized the frequency support of each \(\Theta _I\) and \(U_I^l\) to a limited range of angles. As a consequence,

$$\begin{aligned} U^l \partial _l \theta _\epsilon= & {} P_{\approx \lambda } [ U^l \partial _l \theta _\epsilon ]\\ U_J^l \partial _l \Theta _I= & {} P_{\approx \lambda } [U_J^l \partial _l \Theta _I] \end{aligned}$$

for some frequency projection operator \(P_{\approx \lambda }\), and we can define

$$\begin{aligned} R_L^l&= \partial ^l \Delta ^{-1} P_{\approx \lambda } [ U^j \partial _j \theta _\epsilon ] \end{aligned}$$
(5.45)
$$\begin{aligned} R_H^l&= \sum _{J \ne \bar{I}} \partial ^l \Delta ^{-1} P_{\approx \lambda } [ U_J^j \partial _j \Theta _I ] \end{aligned}$$
(5.46)

Now that we have written down the error terms (5.37), we must observe that each of these terms can be made small. For the transport term \(R_T^l\), the estimate (5.41) ensures that \(R_T\) is small once \(\lambda \) is chosen sufficiently large, and the same type of estimate can be used to control the High-Low term \(R_L\). The high-frequency interference terms require a more careful treatment.

Let us focus on an individual term in the sum.

$$\begin{aligned} U_J^l \partial _l \Theta _I&= e^{i \lambda \left( \xi _I + \xi _J\right) } ( ( i \lambda ) \tilde{u}_J^l \partial _l \xi _I \tilde{\theta }_I + \tilde{u}_J^l \partial _l {\tilde{\theta }}_I) \end{aligned}$$
(5.47)

We expand this term as

$$\begin{aligned} U_J^l \partial _l \Theta _I&= \left( i \lambda \right) e^{i \lambda \left( \xi _I + \xi _J\right) } (\theta _J m^l\left( \nabla \xi _J\right) + \delta u_J^l) \partial _l \xi _I {\tilde{\theta }}_I \end{aligned}$$
(5.48)

If we regard the phase gradients \(\nabla \xi _I \approx \nabla \hat{\xi }_I\) as perturbations of their initial values, the main term in (5.48) vanishes

$$\begin{aligned} m^l\left( \nabla \hat{\xi }_J\right) \partial _l \hat{\xi }_I = m^l(\pm \nabla \hat{\xi }_I) \partial _l \hat{\xi }_I = 0 \end{aligned}$$

from the degree zero homogeneity of \(m(\xi )\) and the identity \(m(\xi ) \cdot \xi = 0\).
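In more detail, the following short verification (a sketch using only (5.9) and the properties of m stated in the introduction) shows why the sign and the factor \(10^{[k]}\) play no role. By (5.9) we have \(\nabla \hat{\xi }_J = c \nabla \hat{\xi }_I\) for some constant \(c \in \{ \pm 1, \pm 10, \pm \frac{1}{10} \}\), and therefore

$$\begin{aligned} m^l\big (\nabla \hat{\xi }_J\big ) \partial _l \hat{\xi }_I = c^{-1} m^l\big (c \nabla \hat{\xi }_I\big ) \big ( c \, \partial _l \hat{\xi }_I \big ) = 0 \end{aligned}$$

by the identity \(m(\xi ) \cdot \xi = 0\) applied at the nonzero frequency \(\xi = c \nabla \hat{\xi }_I\). The degree zero homogeneity gives in addition \(m^l(c \nabla \hat{\xi }_I) = m^l(\pm \nabla \hat{\xi }_I)\), with the sign of c.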

The terms which remain are all lower order

$$\begin{aligned} \frac{1}{\left( i \lambda \right) } U_J^l \partial _l \Theta _I&= e^{i \lambda \left( \xi _I + \xi _J\right) } \theta _J {\tilde{\theta }}_I ( m^l\left( \nabla \xi _J\right) - m^l\big (\nabla \hat{\xi }_J\big )) \partial _l \xi _I \end{aligned}$$
(5.49)
$$\begin{aligned}&\quad + e^{i \lambda \left( \xi _I + \xi _J\right) } \theta _J {\tilde{\theta }}_I m^l(\nabla \hat{\xi }_J)( \partial _l \xi _I - \partial _l \hat{\xi }_I) \end{aligned}$$
(5.50)
$$\begin{aligned}&\quad + e^{i \lambda \left( \xi _I + \xi _J\right) } \delta u_J^l \partial _l \xi _I {\tilde{\theta }}_I \end{aligned}$$
(5.51)

The terms (5.49), (5.50) are made small by choosing the lifespan parameter \(\tau \) to be small, while the term (5.51) is made small once \(\lambda \) is chosen sufficiently large (see Section 6.1). The high-frequency term \(R_H\) itself is then controlled by the estimate

$$\begin{aligned} \Vert R_H \Vert _{C^0} \le \frac{C}{\lambda } \Vert \sum _{J \ne \bar{I}} U_J^l \partial _l \Theta _I \Vert _{C^0} \end{aligned}$$

from the formula (5.46). This calculation concludes our list of the error terms (5.37). What remains is to specify the parameters in the construction, prove estimates for the elements of the construction and finally to check that the estimates stated in Lemma 3.1 are satisfied.

Specifying Parameters and the Mollification Term

To initialize the argument, we must specify how we regularize the given solution \((\theta , u, c_A, R^l)\) to the compound scalar-stress equation. In this section, we specify how these regularizations are defined. Because the flow map of the regularized velocity is used to define the regularizations of \(c_A\) and \(R^l\), it is necessary to start with the definition of the regularized velocity. After the regularizations of \(c_A\) and \(R^l\) are defined, we are able to verify the lower bound (5.31) which had been assumed previously to guarantee a well-defined construction.

To obtain the regularized scalar field \(\theta _\epsilon \) and drift velocity \(u_\epsilon \), we take low frequency projections in the spatial variables with length scale parameters \(\epsilon _\theta \) and \(\epsilon _u\)

$$\begin{aligned} \theta _\epsilon&= P_{\le q}^2 \theta , \qquad \text{ where }\qquad 2^{-q} = \epsilon _\theta \end{aligned}$$
(6.1)
$$\begin{aligned} u_\epsilon&= P_{\le q}^2 u, \qquad \text{ where } \qquad 2^{-q} = \epsilon _u \end{aligned}$$
(6.2)

The reason for the double mollification in equation (6.2) will become apparent during the commutator estimates of Section 7. The operator \(P_{\le \chi }\) is given by rescaling a Fourier multiplier

$$\begin{aligned} \widehat{P_{\le \chi } F}(\xi ) = \hat{\eta }\left( \frac{\xi }{2^\chi }\right) \hat{F}(\xi ) \end{aligned}$$

where \(\hat{\eta }(\xi )\) is a smooth function with compact support in \(|\xi | \le 2\) that is equal to 1 on \(|\xi | \le 1\).

By well-known estimates for convolutions with mollifiers satisfying vanishing moment conditions (see [31] Section 14), we have

$$\begin{aligned} \Vert \theta - \theta _\epsilon \Vert _{C^0}&\le C_L \epsilon _\theta ^a \Vert \nabla ^a \theta \Vert _{C^0} \end{aligned}$$
(6.3)
$$\begin{aligned} \Vert u - u_\epsilon \Vert _{C^0}&\le C_L \epsilon _u^a \Vert \nabla ^a u \Vert _{C^0} \end{aligned}$$
(6.4)

We want to choose the length scales \(\epsilon _\theta \) and \(\epsilon _u\) as large as possible while ensuring that the mollification term \(R_M^l\) in (5.3) is acceptably small. The main terms in (5.3) where these mollification errors appear are

$$\begin{aligned} R_{M,\theta }^l&= \sum _I \left( \theta - \theta _\epsilon \right) e^{i \lambda \xi _I} u_I^l \end{aligned}$$
(6.5)
$$\begin{aligned} R_{M,u}^l&= \sum _I e^{i \lambda \xi _I} \theta _I (u^l - u_\epsilon ^l) \end{aligned}$$
(6.6)

Logically, the terms (6.5) and (6.6) are not well-defined until we have specified how to define \(u_\epsilon \). However, from the expressions (5.32) and (5.17) and the bound (3.11) we have an a priori estimate

$$\begin{aligned} \Vert R_{M,\theta } \Vert _{C^0} + \Vert R_{M,u} \Vert _{C^0}&\le A e_R^{1/2} \left( \Vert \theta - \theta _\epsilon \Vert _{C^0} + \Vert u - u_\epsilon \Vert _{C^0}\right) \end{aligned}$$
(6.7)

for A a constant depending only on the parameter M in Lemma 3.1.

Using (6.3)-(6.4) for \(a = L\) and the bound (3.3), we can choose parameters of the form

$$\begin{aligned} \epsilon _\theta&= \epsilon _u = \frac{1}{B} N^{-1/L} \Xi ^{-1} \end{aligned}$$
(6.8)

Here B is some large constant depending on A in (6.7) chosen to assure that

$$\begin{aligned} \Vert R_{M,\theta } \Vert _{C^0} + \Vert R_{M,u} \Vert _{C^0}&\le \frac{e_v^{1/2}e_R^{1/2}}{1000 N} \end{aligned}$$
(6.9)
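As a check on the choice (6.8), the following sketch traces how the constant B enters. It assumes that the bound (3.3) is used in the form \(\Vert \nabla ^L \theta \Vert _{C^0} + \Vert \nabla ^L u \Vert _{C^0} \le 2 \Xi ^L e_v^{1/2}\); the factor 2 is an assumption made here for illustration, and only the scaling in \(N\), \(\Xi \), \(e_v\) and \(e_R\) matters. Combining (6.7) with (6.3)-(6.4) for \(a = L\) and \(\epsilon _\theta = \epsilon _u\) gives

$$\begin{aligned} \Vert R_{M,\theta } \Vert _{C^0} + \Vert R_{M,u} \Vert _{C^0}&\le A C_L e_R^{1/2} \epsilon _\theta ^L \left( \Vert \nabla ^L \theta \Vert _{C^0} + \Vert \nabla ^L u \Vert _{C^0} \right) \\&\le 2 A C_L e_R^{1/2} \left( B^{-1} N^{-1/L} \Xi ^{-1} \right) ^L \Xi ^L e_v^{1/2} \\&= 2 A C_L B^{-L} \frac{e_v^{1/2} e_R^{1/2}}{N} \end{aligned}$$

Under this assumption, (6.9) holds as soon as \(B^L \ge 2000 A C_L\).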

The estimate (6.9) is stronger than what we require for Lemma 3.1. Rather, estimate (6.9) is the type of bound one requires to obtain solutions with regularity \(1/3 -\) (see [31] Section 13).

Observe that the parameter choice (6.8) is exactly the choice of parameter taken in [31] Section 15 up to a constant. We will therefore in many cases be able to refer to the estimates of [31] without repeating the proofs.

Having defined \(\theta _\epsilon \) and \(u_\epsilon \), we can now define our regularizations \(\tilde{c}_A\) and \(R_\epsilon ^l\) of \(c_A\) and \(R_J^l\).

Following [31], we define these regularizations using the coarse scale flow \(\Phi _s\) associated to \(\partial _t + u_\epsilon \cdot \nabla \), whose definition we now recall.

Definition 6.1

We define the coarse scale flow \(\Phi _s(t,x) : {\mathbb {R}}\times {\mathbb {R}}\times {\mathbb {T}}^2 \rightarrow {\mathbb {R}}\times {\mathbb {T}}^2\) to be the unique solution to the ODE

$$\begin{aligned} \frac{d}{ds} \Phi _s^0(t,x)&= 1 \\ \frac{d}{ds} \Phi _s^j(t,x)&= u_\epsilon ^j(\Phi _s(t,x)), \qquad j = 1, 2 \\ \Phi _0(t,x)&= (t,x) \end{aligned}$$

We can now define our regularizations for \(c_A\) and \(R_J\). First, we mollify both \(c_A\) and \(R_J^l\) in space to define

$$\begin{aligned} c_{A, \epsilon _x}&= \eta _{\epsilon _x} *c_A \end{aligned}$$
(6.10)
$$\begin{aligned} R_{\epsilon _x}&= \eta _{\epsilon _x} *R_J \end{aligned}$$
(6.11)

We then use the coarse scale flow \(\Phi _s\) and a smooth function \(\eta _{\epsilon _t}(s)\) supported in \(|s| \le \epsilon _t\) with integral \(\int \eta _{\epsilon _t}(s) ds = 1\) to average in time and form

$$\begin{aligned} \tilde{c}_A(t,x)&= \int c_{A, \epsilon _x}(\Phi _s(t,x) ) \eta _{\epsilon _t}(s) ds \end{aligned}$$
(6.12)
$$\begin{aligned} R_\epsilon ^l(t,x)&= \int R_{\epsilon _x}(\Phi _s(t,x)) \eta _{\epsilon _t}(s) ds \end{aligned}$$
(6.13)

In this way, the values of \(\tilde{c}_A(t,x)\) and \(R_\epsilon (t,x)\) are obtained by averaging \(c_A\) and \(R_J^l\) over an \(\epsilon _x\)-neighborhood of a time \(2\epsilon _t\) flow line through \((t,x)\).

To estimate \(\tilde{c}_A\) and \(R_\epsilon ^l\), we recall that both \(c_A\) and \(R_J\) satisfy the estimates

$$\begin{aligned} || \nabla ^k c_A ||_{C^0} + \left( \frac{e_R}{e_J} \right) \Vert \nabla ^k R_J \Vert _{C^0}&\le 2 \Xi ^k e_R&\,&k = 0, \ldots , L \end{aligned}$$
(6.14)
$$\begin{aligned} || \nabla ^k (\partial _t + u \cdot \nabla ) c_A ||_{C^0} + \left( \frac{e_R}{e_J} \right)&\,&\,&\nonumber \\ || \nabla ^k (\partial _t + u \cdot \nabla ) R_J ||_{C^0}&\le 2 \Xi ^{k+1} e_v^{1/2} e_R&\,&k = 0, \ldots , L - 1 \end{aligned}$$
(6.15)

coming from the compound frequency-energy levels of Definition 3.2.

Since the bounds (6.14)-(6.15) coincide with the bounds for the tensor \(R^{jl}\) in [31], we can draw from the results of that paper to control \(R_\epsilon ^l\) and \(\tilde{c}_A\).

Following Section 18.3 of [31], we choose length and time scales of the form

$$\begin{aligned} \epsilon _x = \frac{1}{B} N^{-1/L} \Xi ^{-1}, \quad \epsilon _t = \frac{1}{B} (N \Xi )^{-1} e_R^{-1/2} \end{aligned}$$
(6.16)

We choose \(B \ge 1\) large enough to bound the terms

$$\begin{aligned} \Vert (\tilde{c}_A - c_A) A^l \Vert _{C^0} + \Vert R_J^l - R_\epsilon ^l \Vert _{C^0}&\le \frac{e_v^{1/2}e_R^{1/2}}{100 N} \end{aligned}$$
(6.17)

which appear in the list of error terms from mollification of Equation (5.3).

Note that the choice of parameters (6.16) is the same as the choice made in [31, Section 18.3], and therefore leads to the same bounds

$$\begin{aligned} \begin{aligned} \Vert&\nabla ^k \left( \frac{\bar{D}}{\partial t}\right) ^r \tilde{c}_A \Vert _{C^0}\\&\quad \le C_k \Xi ^k e_R (\Xi e_v^{1/2})^{(r\ge 1)}(N \Xi e_R^{1/2})^{(r\ge 2)} N^{(k+1-L)_+/L} \\&\Vert \nabla ^k \left( \frac{\bar{D}}{\partial t}\right) ^r R_J \Vert _{C^0} + \Vert \nabla ^k \left( \frac{\bar{D}}{\partial t}\right) ^r c_J \Vert _{C^0}\\&\quad \le C_k \Xi ^k e_J \left( \Xi e_v^{1/2} \right) ^{(r\ge 1)} \left( N \Xi e_R^{1/2}\right) ^{(r\ge 2)} N^{(k+1-L)_+/L} \end{aligned} \end{aligned}$$
(6.18)

where we use the notation \((r\ge m) = \chi _{[m,\infty )}(r)\) and we restrict to \(0 \le r \le 2\). The fact we are using here is that \(c_A\) obeys the same estimates as the stress \(R^{jl}\) in [31], and the terms \(R_J\) and \(c_J\) satisfy even better bounds. The details of the proof are carried out in [31, Section 18].

A crucial point here is that (6.18) contains estimates on second order advective derivatives, even though our assumed bounds (3.6), (3.8) on \(c_A\) and \(R_J \) contain only information regarding first order advective derivatives. The ability to obtain this estimate comes from the fact that the advective derivative \(\frac{\bar{D}}{\partial t}\) commutes with its own flow \(\Phi _s\), and thus commutes with the averaging along the flow. This observation allows us to integrate by parts in

$$\begin{aligned} \frac{\bar{D}}{\partial t}R_\epsilon ^l(t,x)&= \int \frac{\bar{D}}{\partial t}R_{\epsilon _x}(\Phi _s(t,x)) \eta _{\epsilon _t}(s) ds \end{aligned}$$
(6.19)
$$\begin{aligned}&= - \int R_{\epsilon _x}(\Phi _s(t,x)) \eta _{\epsilon _t}'(s) ds \end{aligned}$$
(6.20)
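The step from (6.19) to (6.20) can be spelled out as follows. This is a sketch that reads the integrand of (6.19) as \(\big (\frac{\bar{D}}{\partial t}R_{\epsilon _x}\big )(\Phi _s(t,x))\); the normalization \(\eta _{\epsilon _t}(s) = \epsilon _t^{-1} \eta _1(s/\epsilon _t)\) for a fixed profile \(\eta _1\) is an assumption made here for illustration. By Definition 6.1 and the chain rule, the advective derivative evaluated along the flow is a derivative in \(s\), and integrating by parts produces no boundary terms since \(\eta _{\epsilon _t}\) has compact support:

$$\begin{aligned} \frac{d}{ds} \left[ R_{\epsilon _x}(\Phi _s(t,x)) \right]&= \left( \partial _t R_{\epsilon _x} + u_\epsilon ^j \partial _j R_{\epsilon _x} \right) (\Phi _s(t,x)) \\ \int \frac{d}{ds} \left[ R_{\epsilon _x}(\Phi _s(t,x)) \right] \eta _{\epsilon _t}(s) ds&= - \int R_{\epsilon _x}(\Phi _s(t,x)) \eta _{\epsilon _t}'(s) ds \\ \Big \Vert \frac{\bar{D}}{\partial t}R_\epsilon \Big \Vert _{C^0}&\le \Vert R_{\epsilon _x} \Vert _{C^0} \Vert \eta _{\epsilon _t}' \Vert _{L^1} \le C \epsilon _t^{-1} \Vert R_J \Vert _{C^0} \end{aligned}$$

In other words, one advective derivative can always be traded for a factor \(\epsilon _t^{-1}\), without using any bound on advective derivatives of \(R_J\) itself.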

This computation explains why the cost of the second advective derivative in (6.18) is exactly a factor of \(\epsilon _t^{-1}\) for the choice of parameter (6.16). We refer to [31, Section 18.6.1] and to [33, Section 12.1] for two different proofs of identity (6.19).

Having defined \(\tilde{c}_A\) and \(R_\epsilon \), we are now able to verify the lower bound (5.31) on the energy profile, which had been assumed previously in many of the formulas in our construction. From the assumption (3.9) that \({\mathrm {supp}}\,c_A \cup {\mathrm {supp}}\,R_J \subseteq I \times {\mathbb {T}}^2\), we have by construction that

$$\begin{aligned} {\mathrm {supp}}\,\tilde{c}_A \cup {\mathrm {supp}}\,R_\epsilon \cup {\mathrm {supp}}\,c_J \subseteq I \pm \epsilon _t \times {\mathbb {T}}^2 \end{aligned}$$

Since we assumed the lower bound (3.10) for e(t) on the interval \(I \pm \Xi ^{-1} e_v^{-1/2}\), it suffices to check that \(\epsilon _t < \Xi ^{-1} e_v^{-1/2}\). This inequality follows from the definition (6.16) of \(\epsilon _t\) and the inequality \(N \ge \left( \frac{e_v}{e_R} \right) ^{1/2}\), which follows from condition (3.12).
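For the record, the inequality \(\epsilon _t < \Xi ^{-1} e_v^{-1/2}\) is a one-line computation. The sketch below uses only (6.16) and the inequality \(N \ge (e_v/e_R)^{1/2}\); strictness of the final inequality assumes \(B > 1\), which is permitted since B is taken large.

$$\begin{aligned} \epsilon _t = \frac{1}{B N \Xi e_R^{1/2}} \le \frac{1}{B} \left( \frac{e_R}{e_v} \right) ^{1/2} \frac{1}{\Xi e_R^{1/2}} = \frac{1}{B} \Xi ^{-1} e_v^{-1/2} < \Xi ^{-1} e_v^{-1/2} \end{aligned}$$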

At this point, the only term that remains to be estimated in the mollification term

$$\begin{aligned} R_M^l&= R_{M,\theta }^l + R_{M,u}^l + \left( c_A - \tilde{c}_A\right) A^l + (R_J^l - R_\epsilon ^l) + R_{M'}^l \end{aligned}$$
(6.21)

is given by

$$\begin{aligned} R_{M'}^l&= \sum _I e^{i \lambda \xi _I} (u^l - u_\epsilon ^l) \delta \theta _I + \sum _I e^{i \lambda \xi _I} \left( \theta - \theta _\epsilon \right) \delta u_I^l \end{aligned}$$
(6.22)

This term will be estimated when we choose the parameter \(\lambda \) at the end of the argument.

The Choice of Lifespan Parameter and the Limiting Error Terms

The present Section is devoted to choosing the lifespan parameter \(\tau \). Here we motivate the choice of \(\tau \) by comparing the estimates that will be satisfied by the main error terms and optimizing. However, we warn the reader that the estimates stated in this Section have not yet been established, but will follow from the bounds of Section 7 below.

The lifespan parameter \(\tau \) determines the length of time during which an amplitude

$$\begin{aligned} \theta _I&= e^{1/2}(t) \eta \left( \frac{t - t(I)}{\tau } \right) \gamma \end{aligned}$$
(6.23)

is allowed to remain nonzero. The parameter \(\tau \) is chosen to be small so that the gradients of the phase functions, which satisfy the transport equation

$$\begin{aligned} (\partial _t + u_\epsilon ^j \partial _j) \partial ^l \xi _I&= - \partial ^l u_\epsilon ^j \partial _j \xi _I, \end{aligned}$$
(6.24)

remain close to their initial values \(\nabla \xi _I \approx \nabla \hat{\xi }_I\). More precisely, equation (6.24) with initial data \(\xi _I(t(I), x) = \hat{\xi }_I(x)\) leads to a bound of the form

$$\begin{aligned} |\nabla \xi _I\left( \Phi _s(t(I),x)\right) - \nabla {\hat{\xi }}_I(x)|&\le A e^{A \Xi e_v^{1/2} \tau } (\Xi e_v^{1/2}) \tau , \qquad |s|\le \tau \end{aligned}$$
(6.25)

where \(\Xi e_v^{1/2}\) is an upper bound for \(\Vert \nabla u_\epsilon \Vert _{C^0}\), cf. Lemma 7.3 below.
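The bound (6.25) is a Grönwall estimate along the flow. As a sketch, write \(g(s) = \nabla \xi _I(\Phi _s(t(I),x))\), so that \(g(0) = \nabla \hat{\xi }_I(x)\), and let \(\Lambda = \Vert \nabla u_\epsilon \Vert _{C^0}\); the names \(g\) and \(\Lambda \) are introduced here only for this computation. Equation (6.24) evaluated along the flow gives \(|g'(s)| \le \Lambda |g(s)|\), and hence

$$\begin{aligned} |g(s)|&\le |g(0)| e^{\Lambda |s|} \\ |g(s) - g(0)|&\le \int _0^{|s|} \Lambda |g(0)| e^{\Lambda \sigma } d\sigma = |g(0)| \left( e^{\Lambda |s|} - 1 \right) \le |g(0)| \, \Lambda |s| \, e^{\Lambda |s|} \end{aligned}$$

For \(|s| \le \tau \) this is of the form (6.25), provided the constant \(A\) is taken to dominate both \(\sup |\nabla \hat{\xi }_I|\) and the constant in the bound for \(\Lambda \) in terms of \(\Xi e_v^{1/2}\).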

In our case, we require that \(\tau \le \Xi ^{-1} e_v^{-1/2}\), so that the estimate (6.25) becomes

$$\begin{aligned} \Vert \nabla \xi _I - \nabla \hat{\xi }_I \Vert _{C^0}&\le A (\Xi e_v^{1/2}) \tau \end{aligned}$$
(6.26)

Here, there are two main error terms which require the choice of a sharp time cutoff in order to control. The first such term, which is familiar from the case of the Euler equations, is the set of high-frequency interference terms in (5.46)

$$\begin{aligned} R_H^l&= \sum _{J \ne \bar{I}} \partial ^l \Delta ^{-1} P_{\approx \lambda } [U_J^j \partial _j \Theta _I] \end{aligned}$$
(6.27)

Recall from (5.49)-(5.51) that each term in the series (6.27) can be expressed to leading order as

$$\begin{aligned} \frac{1}{\left( i \lambda \right) } U_J^l \partial _l \Theta _I&= e^{i \lambda \left( \xi _I + \xi _J\right) } \theta _J {\tilde{\theta }}_I ( m^l\left( \nabla \xi _J\right) - m^l\big (\nabla \hat{\xi }_J\big ) ) \partial _l \xi _I \end{aligned}$$
(6.28)
$$\begin{aligned}&\quad + e^{i \lambda \left( \xi _I + \xi _J\right) } \theta _J {\tilde{\theta }}_I m^l(\nabla \hat{\xi }_J)( \partial _l \xi _I - \partial _l \hat{\xi }_I) \end{aligned}$$
(6.29)
$$\begin{aligned}&\quad + \text{ lower } \text{ order } \text{ terms } \end{aligned}$$
(6.30)

Formula (6.27) leads to the bound

$$\begin{aligned} \Vert R_H \Vert _{C^0}&\le \frac{A}{\lambda } \Vert \sum _{J \ne \bar{I}} U_J^j \partial _j \Theta _I \Vert _{C^0} \nonumber \\&\le A \max _I \Vert \theta _I \Vert _{C^0}^2 (\Vert m^l\left( \nabla \xi _I\right) - m^l(\nabla \hat{\xi }_I) \Vert _{C^0} + \Vert \nabla \xi _I - \nabla \hat{\xi }_I \Vert _{C^0}) \nonumber \\&\quad + \text{ Lower } \text{ order } \text{ terms } \nonumber \\&\le A e_R \max _I \Vert \nabla \xi _I - \nabla \hat{\xi }_I \Vert _{C^0} + \text{ Lower } \text{ order } \text{ terms } \end{aligned}$$
(6.31)
$$\begin{aligned}&\le A e_R (\Xi e_v^{1/2} \tau ) + \text{ Lower } \text{ order } \text{ terms } \end{aligned}$$
(6.32)

where the constant A changes from line to line. The error term (6.32) is made small by choosing the lifespan parameter \(\tau \) to be small compared to the natural time scale \(\Xi ^{-1} e_v^{-1/2}\) of the coarse scale velocity \(u_\epsilon \). The other terms in (6.32) are lower order in the sense that they can be made small by a suitable choice of \(\lambda \).

The price we pay by introducing sharp cutoffs is a worse bound on the transport term.

$$\begin{aligned} R_T^l&= \partial ^l \Delta ^{-1} P_{\approx \lambda } [ (\partial _t + u_\epsilon \cdot \nabla ) \Theta ] \end{aligned}$$
(6.33)
$$\begin{aligned} (\partial _t + u_\epsilon \cdot \nabla ) \Theta&= \sum _I e^{i \lambda \xi _I} (\partial _t + u_\epsilon \cdot \nabla ) \theta _I + \text{ Lower } \text{ order } \text{ terms } \end{aligned}$$
(6.34)

The time cutoffs appear in the formula (6.23) for the amplitude, and give rise to a term of size

$$\begin{aligned} \Vert (\partial _t + u_\epsilon \cdot \nabla ) \Theta \Vert _{C^0}&= A \tau ^{-1} e_R^{1/2} + \text{ Lower } \text{ order } \text{ terms } \end{aligned}$$
(6.35)

which leads in turn, from the definition (5.7) of \(\lambda \), to a bound on the transport term

$$\begin{aligned} \Vert R_T \Vert _{C^0}&\le A \lambda ^{-1} \Vert (\partial _t + u_\epsilon \cdot \nabla ) \Theta \Vert _{C^0} \end{aligned}$$
(6.36)
$$\begin{aligned}&\le A B_\lambda ^{-1} (N \Xi )^{-1} \tau ^{-1}e_R^{1/2} + \text{ Lower } \text{ order } \text{ terms } \end{aligned}$$
(6.37)

We therefore choose

$$\begin{aligned} \tau = B_\lambda ^{-1/2} \left( \frac{e_v^{1/2}}{e_R^{1/2} N} \right) ^{1/2} \Xi ^{-1} e_v^{-1/2} \end{aligned}$$
(6.38)

in order to optimize between the estimates for the leading term in (6.32) and (6.37). This choice leads to the \(C^0\) estimate

$$\begin{aligned} \Vert R_1 \Vert _{C^0}&\unlhd \left( \frac{e_v^{1/2}}{e_R^{1/2} N} \right) ^{1/2} e_R \end{aligned}$$
(6.39)

stated in Lemma 3.1, and ultimately to the regularity \(1/9-\).
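The optimization behind (6.38) can be made explicit. The following sketch treats the constants \(A\) in (6.32) and (6.37) as equal, which is a simplification made here for illustration, and balances the two leading terms:

$$\begin{aligned} e_R \, \Xi e_v^{1/2} \tau&= B_\lambda ^{-1} (N \Xi )^{-1} \tau ^{-1} e_R^{1/2} \\ \tau ^2&= B_\lambda ^{-1} N^{-1} \Xi ^{-2} e_v^{-1/2} e_R^{-1/2} \\ \tau&= B_\lambda ^{-1/2} \left( \frac{e_v^{1/2}}{e_R^{1/2} N} \right) ^{1/2} \Xi ^{-1} e_v^{-1/2} \end{aligned}$$

With this value of \(\tau \), both leading terms are equal to \(A B_\lambda ^{-1/2} \left( \frac{e_v^{1/2}}{e_R^{1/2} N} \right) ^{1/2} e_R\).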

Unlike the case of the Euler equations, there is also a second error term which requires sharp time cutoffs to make small in our present scheme, namely the Stress term \(R_S\) appearing in (5.36). It turns out that this term also satisfies the same estimate (6.32), and consequently will be among the largest error terms, having size (6.39) after the above choice of \(\tau \). The reason that we see this extra term compared to the case of Euler is that the method we have used here to solve the quadratic equation (5.26) requires the phase gradients \(\nabla \xi _I\) to remain very close to their initial values \(\nabla \xi _I \approx \nabla \hat{\xi }_I\) to within an error much smaller than O(1). In the case of Euler, the equation analogous to (2.15) can be solved using nonlinear phase functions in a way which allows for the phase gradients to depart from their initial values by an error of size \(\Vert \nabla \xi _I - \nabla \hat{\xi }_I \Vert _{C^0} = O(1)\) (see [31, Section 7.3]). Ideally, one would hope to solve equation (2.15) in a similar manner to avoid generating error terms such as these which require sharp time cutoffs to treat as above.

We now turn our attention to obtaining estimates for the terms in the construction. In particular, we need to establish the estimates (6.32) and (6.37) precisely, and also to estimate the other error terms. The proof is concluded by choosing the constant \(B_\lambda \) in (5.7) to be sufficiently large so that the inequality (6.39) holds as stated, without any implicit constant factor.

Basic Estimates for the Construction

Lemma 7.1

(Coarse Scale Flow Estimates) Let \(L\ge 2\) be an integer as in Lemma 3.1. The mollified velocity field \(u_\epsilon \) defined in (6.2) obeys the estimates

$$\begin{aligned} \Vert \nabla ^k u_{\epsilon } \Vert _{C^0}&\le C_k \Xi ^k e_v^{1/2} N^{(k-L)_+/L}, \quad k \ge 1, \end{aligned}$$
(7.1)
$$\begin{aligned} \Vert \nabla ^k (\partial _t + u_\epsilon \cdot \nabla ) u_{\epsilon } \Vert _{C^0}&\le C_k \Xi ^{k+1} e_v N^{(k+1-L)_+/L}, \quad k \ge 0 \end{aligned}$$
(7.2)

for some universal positive constants \(C_k\).

Proof

For \(k \le L\), we see that (7.1) holds in view of the iterative assumption (3.3). For \(k>L\), there is an additional cost of \(\epsilon _u^{-(k-L)_+} = (B N^{1/L} \Xi )^{(k-L)_+}\), where we have used the choice of \(\epsilon _u\) in (6.8).

In order to prove (7.2), we recall that \(u_\epsilon ^j = P_{\le q}^2 u^j\), where \(2^{-q} = \epsilon _u=B^{-1} N^{-1/L} \Xi ^{-1}\). We then have

$$\begin{aligned} P_{\le q}^2 \left( \partial _t u + u \cdot \nabla u\right) = \left( \partial _t u_{\epsilon } + u_{\epsilon } \cdot \nabla u_{\epsilon }\right) - Q_{\epsilon }(u,u), \end{aligned}$$
(7.3)

where

$$\begin{aligned} Q_{\epsilon }^j(u,u)&= u_\epsilon ^i \partial _i u_\epsilon ^j - P_{\le q}^2(u^i \partial _i u^j) \end{aligned}$$
(7.4)
$$\begin{aligned}&= [P_{\le q}^2 u^i \partial _i, P_{\le q}] (P_{\le q} u^j) + P_{\le q} ( [u^i \partial _i, P_{\le q}] u^j ) \nonumber \\&\quad - P_{\le q} ( (u^i - P_{\le q}^2u^i) \partial _i (P_{\le q} u^j) ). \end{aligned}$$
(7.5)
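The identity (7.5) can be checked by expanding the two commutators. The following verification uses only the definition (7.4), the linearity of \(P_{\le q}\), and the notation \(u_\epsilon = P_{\le q}^2 u\):

$$\begin{aligned} {[}P_{\le q}^2 u^i \partial _i, P_{\le q}] (P_{\le q} u^j)&= u_\epsilon ^i \partial _i u_\epsilon ^j - P_{\le q} ( u_\epsilon ^i \partial _i P_{\le q} u^j ) \\ P_{\le q} ( [u^i \partial _i, P_{\le q}] u^j )&= P_{\le q} ( u^i \partial _i P_{\le q} u^j ) - P_{\le q}^2 ( u^i \partial _i u^j ) \\ - P_{\le q} ( (u^i - P_{\le q}^2 u^i) \partial _i (P_{\le q} u^j) )&= - P_{\le q} ( u^i \partial _i P_{\le q} u^j ) + P_{\le q} ( u_\epsilon ^i \partial _i P_{\le q} u^j ) \end{aligned}$$

Summing the three lines, the four intermediate terms cancel in pairs, and what remains is \(u_\epsilon ^i \partial _i u_\epsilon ^j - P_{\le q}^2(u^i \partial _i u^j)\), which is the right-hand side of (7.4).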

The estimate

$$\begin{aligned} \Vert Q_{\epsilon }(u,u) \Vert _{C^0} \le C \epsilon _u \Xi ^2 e_v \le C N^{-1/L} \Xi e_v \end{aligned}$$
(7.6)

follows from (7.5) precisely as in [31, Section 16], by appealing to (3.3). The decomposition (7.5) of the quadratic commutator term is convenient since it allows one to estimate without additional complications the higher order derivatives \(\nabla ^k Q_{\epsilon }(u,u)\). Derivatives up to order \(L-1\) each introduce a factor of \(\Xi \), while past that order the derivatives fall on the mollifier \(P_{\le q}\) and the cost per derivative is a constant multiple of \(\Xi N^{1/L}\). Combining with

$$\begin{aligned} \Vert \nabla ^k P_{\le q}^2 \left( \partial _t u^j + u^i \partial _i u^j\right) \Vert _{C^0} \le C_k \Xi ^{k+1} e_v N^{(k+1-L)_+/L} \end{aligned}$$
(7.7)

which follows from the definition of q and (3.4), the proof of (7.2) is completed. \(\square \)

Lemma 7.2

(Commutator Estimates) Let \(D \ge 1\) and let Q be a convolution operator

$$\begin{aligned} Q f(x) = \int _{{\mathbb {R}}^2} f(x-h) q(h) dh \end{aligned}$$

whose kernel q satisfies the estimates

$$\begin{aligned} \Vert ~ |h|^{a} ~| \nabla ^{b} q|(h)~ \Vert _{L^1({\mathbb {R}}^2)} \le \lambda ^{b-a} \end{aligned}$$
(7.8)

for some \(\lambda \ge N \Xi \), and all \(0 \le a \le b \le D\). Then the commutator \(\left[ \frac{\bar{D}}{\partial t}, Q \right] = \left[ \partial _t + u_\epsilon \cdot \nabla ,Q\right] \) satisfies the estimates

$$\begin{aligned} \left\| \nabla ^k \left[ \frac{\bar{D}}{\partial t}, Q\right] \right\|&\le C_k \Xi e_v^{1/2} \lambda ^k, \qquad 0 \le k \le D - 1 \end{aligned}$$

as a bounded operator on \(C^0({\mathbb {R}}\times {\mathbb {T}}^2)\).

Proof

For \(f \in C^0\) we have that

$$\begin{aligned} \left| \left[ \frac{\bar{D}}{\partial t}, Q \right] f(x) \right|&= \left| \int _{{\mathbb {R}}^2} \left( u_\epsilon ^j(x) - u_\epsilon ^j(x-h) \right) \partial _j f(x-h) q(h) dh \right| \nonumber \\&\le \left| \int _{{\mathbb {R}}^2} \int _0^1 \partial _a u_\epsilon ^j(x - s h) ds f(x-h) h^a \partial _j q(h) dh \right| \nonumber \\&\le \Vert f\Vert _{C^0} \Vert \nabla u_\epsilon \Vert _{C^0} \Vert |h| |\nabla q|\Vert _{L^1} \le C \Vert f\Vert _{C^0} \Xi e_v^{1/2} \end{aligned}$$
(7.9)

by appealing to (7.1) with \(k=1\), and (7.8) with \(a = b = 1\). This gives the proof of the Lemma when \(k=0\). For \(1 \le k \le D-1\), we appeal to the Leibniz rule, the assumption (7.8) on q, the bound (7.1) on \(u_\epsilon \), and the condition that \(\lambda \ge N \Xi \). For instance,

$$\begin{aligned} \left| \nabla \left[ \frac{\bar{D}}{\partial t}, Q \right] f(x) \right|&\le C \Vert f\Vert _{C^0} \left( \Vert \nabla ^2 u_\epsilon \Vert _{C^0} \Vert |h| |\nabla q|\Vert _{L^1} + \Vert \nabla u_\epsilon \Vert _{C^0} \Vert |h| |\nabla ^2 q| \Vert _{L^1} \right) \nonumber \\&\le C \Vert f\Vert _{C^0} \left( \Xi ^2 e_v^{1/2} + \Xi e_v^{1/2} \lambda \right) \le C \Vert f\Vert _{C^0} \Xi e_v^{1/2} \lambda \end{aligned}$$
(7.10)

is the desired bound when \(k =1\). The remaining cases \(2 \le k \le D-1\) are treated similarly. \(\square \)

In fact, the above lemma will only be applied to operators Q for which \(\lambda \) is given as in (5.7).
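The passage from the first to the second line of (7.9) rests on an integration by parts in \(h\) together with the divergence free condition. As a sketch, assume first that \(f\) is smooth (the general case then follows by density) and that \(q\) decays fast enough to discard boundary terms. Using \(\partial _j f(x-h) = - \frac{\partial }{\partial h^j} [ f(x-h) ]\), we have

$$\begin{aligned} \int _{{\mathbb {R}}^2} \left( u_\epsilon ^j(x) - u_\epsilon ^j(x-h) \right) \partial _j f(x-h) q(h) dh&= \int _{{\mathbb {R}}^2} f(x-h) \frac{\partial }{\partial h^j} \left[ \left( u_\epsilon ^j(x) - u_\epsilon ^j(x-h) \right) q(h) \right] dh \\&= \int _{{\mathbb {R}}^2} f(x-h) (\partial _j u_\epsilon ^j)(x-h) q(h) dh \\&\quad + \int _{{\mathbb {R}}^2} f(x-h) \left( u_\epsilon ^j(x) - u_\epsilon ^j(x-h) \right) \partial _j q(h) dh \end{aligned}$$

The first integral on the right vanishes because \(\partial _j u_\epsilon ^j = 0\), and the second takes the form appearing in (7.9) after writing \(u_\epsilon ^j(x) - u_\epsilon ^j(x-h) = h^a \int _0^1 \partial _a u_\epsilon ^j(x - sh) ds\).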

Lemma 7.3

(Transport Estimates) Let \(L\ge 2\), and denote by \(\frac{\bar{D}}{\partial t}\) the convective derivative associated to the flow \(u_\epsilon \). The phase gradients \(\nabla \xi _I\) obey the bound

$$\begin{aligned} \Vert \nabla ^k \left( \frac{\bar{D}}{\partial t}\right) ^{r} \nabla \xi _I\Vert _{C^0} \le C_k \Xi ^k (\Xi e_v^{1/2})^r N^{(k+(r-1)_+ + 1 - L)_+/L} \end{aligned}$$
(7.11)

for all \(k \ge 0\) and \(r \in \{0,1,2\}\). Moreover, the same bound holds if \(\nabla ^k (\frac{\bar{D}}{\partial t})^r\) is replaced by \(D^{(k,r)}\), where the latter is defined by

$$\begin{aligned} D^{(k,r)} = \nabla ^{k_1} \left( \frac{\bar{D}}{\partial t}\right) ^{r_1} \nabla ^{k_2} \left( \frac{\bar{D}}{\partial t}\right) ^{r_2} \nabla ^{k_3}, \end{aligned}$$
(7.12)

with \(k_1+k_2+k_3=k\), \(k_i \ge 0\), \(r_1+r_2=r\), and \(r_i \ge 0\).

We also have the estimate

$$\begin{aligned} | \nabla \xi _I(&\Phi _s(t,x)) - \nabla \hat{\xi }_I(x) | \le C b, \qquad |s| \le \tau \nonumber \\ b&= B_\lambda ^{-1/2} \left( \frac{e_v^{1/2}}{e_R^{1/2} N} \right) ^{1/2} = \tau \Xi e_v^{1/2} \end{aligned}$$
(7.13)

where \(\Phi _s\) is the coarse scale flow defined in Definition 6.1, and \(\tau \) is specified as in line (6.38).

Proof

In order to establish (7.11) for \(r=0\), one appeals to (6.24), and obtains

$$\begin{aligned} (\partial _t + u_\epsilon ^j \partial _j) \nabla ^k \partial ^l \xi _I&= - \nabla ^k(\partial ^l u_\epsilon ^j \partial _j \xi _I) + [u_\epsilon ^j \partial _j, \nabla ^k] \partial ^l \xi _I. \end{aligned}$$
(7.14)

The bound for \(r=0\) then follows by applying the Grönwall inequality to the above identity, together with estimate (7.1) and the choice for \(\tau \) in (6.38). Similarly, from

$$\begin{aligned} \nabla ^k (\partial _t + u_\epsilon ^j \partial _j) \partial ^l \xi _I&= - \nabla ^k(\partial ^l u_\epsilon ^j \partial _j \xi _I) \end{aligned}$$
(7.15)

and (3.3) we obtain the estimate (7.11) with \(r=1\).

Lastly, in order to obtain the desired estimate when \(r=2\) we note that

$$\begin{aligned} \left( \frac{\bar{D}}{\partial t}\right) ^2 \partial ^l \xi _I&= (\partial _t + u_\epsilon ^i \partial _i)^2 \partial ^l \xi _I = - (\partial _t + u_\epsilon ^i \partial _i) (\partial ^l u_\epsilon ^j \; \partial _j \xi _I) \end{aligned}$$
(7.16)
$$\begin{aligned}&= - \partial _j \xi _I (\partial _t + u_\epsilon ^i \partial _i) (\partial ^l u_\epsilon ^j) - \partial ^l u_\epsilon ^j (\partial _t + u_\epsilon ^i \partial _i) (\partial _j \xi _I) \end{aligned}$$
(7.17)
$$\begin{aligned}&= - \partial _j \xi _I \partial ^l (\partial _t + u_\epsilon ^i \partial _i) u_\epsilon ^j +2 \partial ^l u_\epsilon ^j \; \partial _j u_\epsilon ^i \partial _i \xi _I. \end{aligned}$$
(7.18)
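The step from (7.17) to (7.18) uses two identities, which we record as a sketch. The first is the commutator of the advective derivative with \(\partial ^l\), and the second is the transport equation (6.24) with the index lowered:

$$\begin{aligned} (\partial _t + u_\epsilon ^i \partial _i) \partial ^l u_\epsilon ^j&= \partial ^l (\partial _t + u_\epsilon ^i \partial _i) u_\epsilon ^j - \partial ^l u_\epsilon ^i \, \partial _i u_\epsilon ^j \\ (\partial _t + u_\epsilon ^i \partial _i) \partial _j \xi _I&= - \partial _j u_\epsilon ^i \, \partial _i \xi _I \end{aligned}$$

Substituting both into (7.17) gives

$$\begin{aligned} - \partial _j \xi _I \left( \partial ^l (\partial _t + u_\epsilon ^i \partial _i) u_\epsilon ^j - \partial ^l u_\epsilon ^i \, \partial _i u_\epsilon ^j \right) + \partial ^l u_\epsilon ^j \, \partial _j u_\epsilon ^i \, \partial _i \xi _I \end{aligned}$$

and relabeling \(i \leftrightarrow j\) in the term \(\partial ^l u_\epsilon ^i \, \partial _i u_\epsilon ^j \, \partial _j \xi _I\) shows that it coincides with the last term, which accounts for the factor 2 in (7.18).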

In particular, it is important that the second convective derivative of \(\nabla \xi _I\) only depends on a single convective derivative of \(u_\epsilon \). By appealing to Lemma 7.1, from (7.18) we obtain that

$$\begin{aligned} \Vert \left( \frac{\bar{D}}{\partial t}\right) ^2 \nabla \xi _I \Vert _{C^0} \le C \Xi ^2 e_v. \end{aligned}$$
(7.19)

The bound (7.11) with \(r=2\) and \(k \ge 1\), similarly follows from (7.18), the Leibniz rule, Lemma 7.1, and estimate (7.11) with \(r=0\).

The estimate (7.13) follows from the bound (7.11) with \(k = 0\) and \(r = 1\), from the calculation

$$\begin{aligned} | \nabla \xi _I( \Phi _s(t,x)) - \nabla \hat{\xi }_I(x) |&\le \int _0^s \Big | \frac{\bar{D}}{\partial t}\nabla \xi _I( \Phi _\sigma (t,x) ) \Big | d\sigma \\&\le C |s| \Xi e_v^{1/2} \le C b, \qquad \text{ if } |s| \le \tau \end{aligned}$$

\(\square \)

Lemma 7.4

(Principal Amplitude estimates) Let \(L\ge 2\) and \(\tau \) be defined in (6.38). Then the principal parts of the scalar amplitude \(\theta _I\), and the velocity amplitude \(u_I\), obey the bounds

$$\begin{aligned} \Vert D^{(k,r)} \theta _I \Vert _{C^0} + \Vert D^{(k,r)} u_I \Vert _{C^0}&\le C_k \Xi ^k e_R^{1/2} \tau ^{-r} N^{(k+1-L)_+/L} \end{aligned}$$
(7.20)

for all \(k\ge 0\) and \(r\in \{0,1,2\}\), for some suitable universal constants \(C_k >0\).

Proof

First, we note that in view of (5.17) we have \( u_I^l = \theta _I m^l(\nabla \xi _I). \) Since the multiplier m is smooth outside the origin and in view of Lemma 7.3 we have bounds for the derivatives of \(\nabla \xi _I\), the bound on \(u_I\) follows from that on \(\theta _I\), up to possibly increasing the constant \(C_k\) by a constant factor.

From (5.28) and (5.32) we recall that

$$\begin{aligned} \theta _I = \eta \left( \frac{t-k\tau }{\tau } \right) e(t)^{1/2} \gamma = \eta \left( \frac{t-k\tau }{\tau } \right) e(t)^{1/2} \left( 1 + \varepsilon \right) ^{1/2} \end{aligned}$$
(7.21)

where \(\varepsilon = (\tilde{c}_A + c_J)/e(t)\). Using (6.18), the lower bound \(e(t) \ge K_0 e_R\) and (5.30), we obtain the following estimates for \(\varepsilon \) and \(\gamma = (1 + \varepsilon )^{1/2}\)

$$\begin{aligned}&\Vert \nabla ^k \left( \frac{\bar{D}}{\partial t}\right) ^r \varepsilon \Vert _{C^0} + \Vert \nabla ^k \left( \frac{\bar{D}}{\partial t}\right) ^r \gamma \Vert _{C^0}\nonumber \\&\quad \le C_k \Xi ^k \left( \Xi e_v^{1/2} \right) ^{(r\ge 1)} \left( N \Xi e_R^{1/2}\right) ^{(r\ge 2)} N^{(k+1-L)_+/L}. \end{aligned}$$
(7.22)

The bounds for spatial derivatives of \(\theta _I\) now follow from (7.22) since the other terms \(\eta \left( \frac{t-k\tau }{\tau } \right) \) and \(e^{1/2}(t)\) do not depend on x. Lemma 7.4 also requires us to show that each advective derivative up to order 2 costs at most \(C \tau ^{-1}\) per derivative. For the time cutoff and the function \(e^{1/2}(t)\) in (7.21), the cost of \(\tau ^{-1}\) follows by definition for the time cutoff and by (3.11) for \(e^{1/2}(t)\) using the inequality \(\Xi e_v^{1/2} \le \tau ^{-1}\) from the choice of \(\tau \) in Section 6.1. For the other terms, the estimate (7.22) tells us that the first advective derivative costs \(\Xi e_v^{1/2} \le \tau ^{-1}\), and taking two advective derivatives costs

$$\begin{aligned} \left| \frac{\bar{D}^2}{\partial t^2}\right| \le C (\Xi e_v^{1/2})(N \Xi e_R^{1/2}) = C N \Xi ^2 e_v^{1/2} e_R^{1/2} \le C \tau ^{-2} \end{aligned}$$

from the choice of \(\tau \) in (6.38). The bounds for the spatial derivatives then follow from the pattern in (7.22). \(\square \)
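The two inequalities relating \(\tau \) to the other parameters in the proof above can be read off from (6.38). The sketch below assumes \(B_\lambda \ge 1\) and \(N \ge (e_v/e_R)^{1/2}\); both are consistent with the choices described in Section 6, but they are stated here as assumptions.

$$\begin{aligned} \tau \, \Xi e_v^{1/2}&= B_\lambda ^{-1/2} \left( \frac{e_v^{1/2}}{e_R^{1/2} N} \right) ^{1/2} \le 1 \\ \tau ^{-2}&= B_\lambda N \Xi ^2 e_v^{1/2} e_R^{1/2} \ge N \Xi ^2 e_v^{1/2} e_R^{1/2} \end{aligned}$$

The first line gives \(\Xi e_v^{1/2} \le \tau ^{-1}\), and the second shows that two advective derivatives cost at most \(C \tau ^{-2}\).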

Lemma 7.5

(Amplitude correction estimates) Under the hypotheses of Lemma 7.4, the corrections \(\delta \theta _I\) and \(\delta u_I^l\) to the scalar amplitude and the velocity amplitude obey the bounds

$$\begin{aligned} \Vert D^{(k,r)} \delta \theta _I \Vert _{C^0} + \Vert D^{(k,r)} \delta u_I \Vert _{C^0}&\le C_k B_\lambda ^{-1} N^{-1} \Xi ^k e_R^{1/2} \tau ^{-r} N^{(k+2-L)_+/L} \end{aligned}$$
(7.23)

for all \(k\ge 0\) and \(r\in \{0,1,2\}\), for some suitable universal constants \(C_k >0\).

Proof

These estimates are obtained by explicitly differentiating the formulas for \(\delta \theta _I\) and \(\delta u_I^l\) coming from the Microlocal Lemma, Lemma 4.1. Here we carry out the calculation for the case of \(\delta \theta _I\), since the term \(\delta u_I^l\) can be treated in the same way. Recall that

$$\begin{aligned} \Theta _I = P_{\approx \lambda }^I( e^{i \lambda \xi _I} \theta _I) = e^{i \lambda \xi _I}\left( \theta _I + \delta \theta _I \right) \end{aligned}$$

Applying Lemma 4.1 with \(K(h) = \eta _{\approx \lambda }^I(h)\), we have the following formula for \(\delta \theta _I\)

$$\begin{aligned} \delta \theta _I&= \delta \theta _{I,1} - \delta \theta _{I,2} \nonumber \\ \delta \theta _{I, 1}&= \int _0^1 dr \int e^{-i \lambda \nabla \xi _I(x) \cdot h} e^{i Z(r,x,h)} (i \lambda )\nonumber \\&\quad \times \left[ \int _0^1 h^a h^b \partial _a\partial _b \xi _I(x-sh) (1-s) ds\right] \theta _I(x - rh) \eta _{\approx \lambda }^I(h) dh \end{aligned}$$
(7.24)
$$\begin{aligned} \delta \theta _{I,2}&= \int _0^1 dr \int e^{-i \lambda \nabla \xi _I(x) \cdot h} e^{i Z(r,x,h)} \partial _a \theta _I(x - rh) h^a \eta _{\approx \lambda }^I(h) dh \end{aligned}$$
(7.25)

with \(Z(r,x,h) = r\lambda \int _0^1 h^a h^b \partial _a\partial _b \xi _I(x-sh) (1-s) ds\) and where \(\eta _{\approx \lambda }^I\) is defined after line (5.10). In particular, recall that the kernel \(\eta _{\approx \lambda }^I(h) = 10^{2[k]}\lambda ^{2} \eta _{\approx 1}(\pm 10^{[k]} \lambda h)\) is constructed by rescaling a Schwartz kernel by a factor \(\lambda \), and therefore satisfies the estimates

$$\begin{aligned} \Vert |h|^m \eta _{\approx \lambda }^I \Vert _{L^1_h}&\le C_m \lambda ^{-m} \end{aligned}$$
(7.26)
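The estimate (7.26) is a change of variables. As a sketch, write \(\mu = 10^{[k]} \lambda \) for the rescaling factor (the symbol \(\mu \) is introduced here only for this computation) and substitute \(y = \pm \mu h\) in \({\mathbb {R}}^2\), so that \(dh = \mu ^{-2} dy\):

$$\begin{aligned} \Vert |h|^m \eta _{\approx \lambda }^I \Vert _{L^1_h} = \int _{{\mathbb {R}}^2} |h|^m \mu ^2 | \eta _{\approx 1}(\pm \mu h) | dh = \mu ^{-m} \int _{{\mathbb {R}}^2} |y|^m | \eta _{\approx 1}(y) | dy \le C_m \lambda ^{-m} \end{aligned}$$

The integral in \(y\) is finite because \(\eta _{\approx 1}\) is a Schwartz function, and the last inequality assumes that \(10^{[k]} \ge 1\), so that \(\mu \ge \lambda \).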

Combining the estimate (7.26) with the bounds of Lemma 7.4 and Lemma 7.3 gives the \(C^0\) estimate for \(\nabla ^k \delta \theta _I\).

Proving estimates for advective derivatives of \(\delta \theta _I\) is tedious, but straightforward. To ease notation let us write \(Z(r,x,h) = r \lambda h^a h^b Z_{ab}\) where

$$\begin{aligned} Z_{ab} = Z_{ab}(t,x,h) = \int _0^1 \partial _a\partial _b \xi _I(x-sh) (1-s) ds \end{aligned}$$

We will sketch one example and estimate the advective derivative of the term in (7.24).

$$\begin{aligned}&(\partial _t + u_\epsilon \cdot \nabla ) \delta \theta _{I,1}(t,x) = - i T_{(1)} + i T_{(2)} + T_{(3)} + T_{(4)} \end{aligned}$$
(7.27)
$$\begin{aligned}&T_{(1)} = \int _0^1 dr \int e^{-i \lambda \nabla \xi _I(x) \cdot h} e^{i Z} (i \lambda ) h^a h^b Z_{ab} \theta _I(x - rh) \frac{\bar{D}}{\partial t}\partial _c \xi _I(x) \lambda h^c \eta _{\approx \lambda }^I(h) dh \end{aligned}$$
(7.28)
$$\begin{aligned} T_{(2)}&= \int _0^1 dr \int e^{-i \lambda \nabla \xi _I(x) \cdot h} e^{i Z} (i \lambda ) h^a h^b Z_{ab} \theta _I(x - rh) r \left( \partial _t + u_\epsilon ^i(x) \frac{\partial }{\partial x^i}\right) \nonumber \\&\quad \times Z_{ab} \lambda h^a h^b \eta _{\approx \lambda }^I(h) dh \end{aligned}$$
(7.29)
$$\begin{aligned}&T_{(3)} = \int _0^1 dr \int e^{-i \lambda \nabla \xi _I(x) \cdot h} e^{i Z} (i \lambda ) h^a h^b \left( \partial _t + u_\epsilon ^i(x) \frac{\partial }{\partial x^i}\right) Z_{ab} \theta _I(x - rh) \eta _{\approx \lambda }^I(h) dh \end{aligned}$$
(7.30)
$$\begin{aligned}&T_{(4)} = \int _0^1 dr \int e^{-i \lambda \nabla \xi _I(x) \cdot h} e^{i Z} (i \lambda ) h^a h^b Z_{ab} \left( \partial _t + u_\epsilon ^i(x) \frac{\partial }{\partial x^i}\right) \theta _I(x - rh) \eta _{\approx \lambda }^I(h) dh \end{aligned}$$
(7.31)

The pattern we observe in (7.28)-(7.31) is that the cost of the first advective derivative is given by \(\Xi e_v^{1/2}\) for every term. This cost is most clear for the term (7.28). The advective derivative brings down one term of size

$$\begin{aligned} \Vert \frac{\bar{D}}{\partial t}\nabla \xi _I \Vert _{C^0} \le C \Xi e_v^{1/2} \end{aligned}$$

and also introduces the factor \(\lambda h^c\). The \(\lambda \) and the h cancel out in terms of the estimate, since we gain a \(\lambda ^{-1}\) when we apply the bound

$$\begin{aligned} \Vert h^a h^b h^c \eta _{\approx \lambda }^I(h) \Vert _{L^1_h} \le C \lambda ^{-3} \end{aligned}$$

for the kernel, which comes from scaling.
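To make the scaling argument explicit, here is a short sketch under the assumption (not stated in this section) that the kernel is a rescaling \(\eta _{\approx \lambda }^I(h) = \lambda ^2 \eta ^I(\lambda h)\) of a fixed Schwartz function \(\eta ^I\) on \({\mathbb {R}}^2\); the symbol \(\eta ^I\) and the variable \(y = \lambda h\) are introduced only for this illustration. Substituting \(y = \lambda h\), so that \(dh = \lambda ^{-2} dy\), one finds

$$\begin{aligned} \Vert h^a h^b h^c \eta _{\approx \lambda }^I(h) \Vert _{L^1_h} \le \int |h|^3 \lambda ^2 |\eta ^I(\lambda h)| dh = \lambda ^{-3} \int |y|^3 |\eta ^I(y)| dy = C \lambda ^{-3} \end{aligned}$$

Each factor of h therefore contributes one power of \(\lambda ^{-1}\), so that the extra factor \(\lambda h^c\) in (7.28) is of size comparable to 1 in the estimate.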

The terms (7.29)-(7.31) require one more trick, which is to approximate the value of \(u_\epsilon ^i(x)\) with the nearby point in the integral. For example, for the term \( \left( \partial _t + u_\epsilon ^i(x) \frac{\partial }{\partial x^i}\right) \theta _I(x - rh)\) in (7.31) we write

$$\begin{aligned} \left( \partial _t + u_\epsilon ^i(x) \frac{\partial }{\partial x^i}\right) \theta _I(x - rh)&= \frac{\bar{D}}{\partial t}\theta _I(x - rh) + ( u_\epsilon ^i(x) - u_\epsilon ^i(x - r h) ) \partial _i \theta _I(x - r h) \end{aligned}$$
(7.32)

The cost of \(\Xi e_v^{1/2}\) for the advective derivative on \(\theta _I\) follows from Lemma 7.4. For the latter term, we write

$$\begin{aligned} ( u_\epsilon ^i(x) - u_\epsilon ^i(x - r h) ) \partial _i \theta _I(x - r h)&= r \int _0^1 \partial _c u_\epsilon ^i( x - \sigma r h) d\sigma ~ \partial _i \theta _I(x - r h) h^c \end{aligned}$$
(7.33)

The term where \(\partial _c u_\epsilon ^i\) appears accounts for the cost of \(\Vert \nabla u_\epsilon \Vert _{C^0} \le \Xi e_v^{1/2}\). The derivative hitting \(\theta _I\) costs a factor of \(\Xi \), but this factor is regained by the factor \(h^c\) that has appeared, which gains a \(\lambda ^{-1}\) when combined with the kernel as usual. Repeating this observation many times for each one of (7.29)-(7.31), one obtains the first advective derivative bound in (7.23). We omit the details.
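As a rough count of the factors in (7.33), we may argue as follows. This is only a sketch: it assumes the amplitude bounds \(\Vert \theta _I \Vert _{C^0} \le C e_R^{1/2}\) and \(\Vert \nabla \theta _I \Vert _{C^0} \le C \Xi e_R^{1/2}\), the latter being our reading of the statement that a derivative on \(\theta _I\) costs a factor of \(\Xi \). Using \(0 \le r \le 1\), one has pointwise

$$\begin{aligned} | ( u_\epsilon ^i(x) - u_\epsilon ^i(x - r h) ) \partial _i \theta _I(x - r h) | \le |h| \Vert \nabla u_\epsilon \Vert _{C^0} \Vert \nabla \theta _I \Vert _{C^0} \le C |h| \Xi \left( \Xi e_v^{1/2}\right) e_R^{1/2} \end{aligned}$$

After integrating against the kernel, the factor |h| contributes \(\lambda ^{-1} = (B_\lambda N \Xi )^{-1}\), so the product \(|h| \Xi \) is bounded by a constant, and what remains is the advective cost \(\Xi e_v^{1/2}\) times the amplitude size \(e_R^{1/2}\).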

One also has to take a second advective derivative in order to prove (7.23), giving rise to another long series of terms which obey the correct bounds. We omit the proof of this estimate also, but we remark that one can avoid using these bounds during the course of the proof. The only applications of these bounds are in Section 8.3 for a lower order part of the advective derivative of the transport term, and in this case one can substitute second order commutator estimates as in Lemma 7.2, which are somewhat less tedious to write down. \(\square \)

Corollary 7.1

The bounds (7.20) of Lemma 7.4 hold also for \(\tilde{\theta }_I = \theta _I + \delta \theta _I\) and for \(\tilde{u}_I^l = u_I^l + \delta u_I^l\).

Estimates for the Corrections to the Scalar Field and Drift Velocity

In this Subsection, we obtain estimates for the corrections \(\Theta \) and \(U^l = T^l[\Theta ]\) to the scalar field and drift velocity. These bounds confirm that the estimates (3.15)-(3.20) of Lemma 3.1 are satisfied. As with the previous Lemmas 7.1-7.5 and our choices of parameters, the results we obtain in this section are familiar from [31, Section 22.1]. In our setting, these estimates turn out to be a bit easier to check thanks to our use of frequency localizing projections.

Proposition 7.1

Under the hypotheses of Lemma 7.4, the corrections \(\Theta _I\) and \(U_I^l\) to the scalar field and the drift velocity satisfy

$$\begin{aligned} \Vert D^{(k,r)} \Theta _I \Vert _{C^0} + \Vert D^{(k,r)} U_I \Vert _{C^0}&\le C_k (B_\lambda N \Xi )^k \tau ^{-r} e_R^{1/2} \end{aligned}$$
(7.34)

for \(0 \le r \le 2\).

Proof

We outline the proof of (7.34) for \(\Theta _I\), as the velocity field \(U_I\) can be treated in the same way. Here we recall again that

$$\begin{aligned} \Theta _I = P_{\approx \lambda }^I [ e^{i \lambda \xi _I} \theta _I ] = e^{i \lambda \xi _I} \tilde{\theta }_I \end{aligned}$$

For \(r = 0\), the estimates for \(\nabla ^k \Theta _I\) follow from the bound \(\Vert \theta _I \Vert _{C^0} \le C e_R^{1/2}\), and the definition of \(\lambda \). To estimate the advective derivatives, we write

$$\begin{aligned} \frac{\bar{D}}{\partial t}\Theta _I&= e^{i \lambda \xi _I} \frac{\bar{D}}{\partial t}\tilde{\theta }_I \end{aligned}$$
(7.35)
$$\begin{aligned} \frac{\bar{D}^2}{\partial t^2}\Theta _I&= e^{i \lambda \xi _I} \frac{\bar{D}^2}{\partial t^2}\tilde{\theta }_I \end{aligned}$$
(7.36)

The bounds (7.34) now follow from the bounds of Lemma 7.3 and Corollary 7.1. The main terms in the estimates for spatial derivatives arise in every case when the derivatives fall on \(e^{i \lambda \xi _I}\). Alternatively, one can obtain the same bounds using commutator estimates such as those of Lemma 7.2 extended to second order commutators. Note that this latter approach avoids using the second advective derivative estimates proven in Lemma 7.5. \(\square \)

Lemma 3.1 also requires bounds on a vector field \(W^l\) satisfying \({\text {div}}\,W = \Theta \). To define \(W^l\), first recall that the corrections

$$\begin{aligned} \Theta _I = P_{\approx \lambda }^I ( e^{i \lambda \xi _I} \theta _I ) \end{aligned}$$

are frequency localized, which allows us to invert the divergence using the standard Helmholtz solution

$$\begin{aligned} W_I^l&= \partial ^l \Delta ^{-1} P_{\approx \lambda }^I(e^{i \lambda \xi _I} \theta _I) \end{aligned}$$
(7.37)

With this definition, we have \(\Theta = {\text {div}}\,W\) for \(W^l = \sum _I W_I^l\). The bounds (3.18)-(3.20) of Lemma 3.1 now follow from Lemma 7.4 and a suitable rescaling of Lemma 7.2 by writing

$$\begin{aligned} \left( \partial _t + u_\epsilon \cdot \nabla \right) W_I&= \left[ \frac{\bar{D}}{\partial t}, \partial ^l \Delta ^{-1} P_{\approx \lambda }^I\right] (e^{i \lambda \xi _I} \theta _I) + \partial ^l \Delta ^{-1} P_{\approx \lambda }^I (e^{i \lambda \xi _I} \frac{\bar{D}}{\partial t}\theta _I) \end{aligned}$$
(7.38)

and differentiating in space.

Prescribing the Energy Increment

We conclude this Section by verifying the estimates (3.21) and (3.22) for prescribing the energy increment. To obtain the estimate (3.21), let \(t \in {\mathbb {R}}\) and write

$$\begin{aligned} \int _{{\mathbb {T}}^2} |\Theta |^2(t,x) dx&= \sum _{I, J} \int \Theta _I \cdot \Theta _J(t,x) dx \end{aligned}$$
(7.39)

For indices \(J \ne \bar{I}\) which are not conjugate to each other, the product \(\Theta _I \cdot \Theta _J\) is localized at frequency \(\approx \lambda \), and in particular has integral 0. The only remaining terms are

$$\begin{aligned} \int _{{\mathbb {T}}^2} |\Theta |^2(t,x) dx&= \sum _{I} \int |\Theta _I|^2(t,x) dx \\ |\Theta _I|^2&= | \theta _I + \delta \theta _I |^2 = |\theta _I|^2 + 2 \delta \theta _I \theta _I + \delta \theta _I^2 \end{aligned}$$

The terms involving \(\delta \theta _I\) can all be estimated using Lemma 7.5 and Lemma 7.4.

$$\begin{aligned} \sum _I \Big | \int _{{\mathbb {T}}^2} 2 \theta _I ~ \delta \theta _I + (\delta \theta _I)^2 dx \Big |&\le C \frac{e_R}{B_\lambda N} \end{aligned}$$

The main terms are then given by

$$\begin{aligned} \sum _I \int _{{\mathbb {T}}^2} |\theta _I|^2(t,x) dx&= \sum _I \int \eta _{k}^2(t) e(t) \gamma ^2 dx \\&= 2 \int e(t) \gamma ^2 dx \\&= 2 \int _{{\mathbb {T}}^2} e(t) (1 + \varepsilon ) dx \end{aligned}$$

The bound (3.21) now follows from (5.30) provided \(B_\lambda \) is sufficiently large.
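The vanishing of the cross terms used above rests on an elementary fact about Fourier series on the torus, which we record as a sketch. Here we normalize \({\mathbb {T}}^2 = ({\mathbb {R}}/2\pi {\mathbb {Z}})^2\), which is an assumption about conventions, and the symbols F and \(\hat{F}(k)\) are introduced only for this illustration. If \(F = \sum _k \hat{F}(k) e^{i k \cdot x}\) satisfies \(\hat{F}(0) = 0\), as is the case for any function localized at frequencies \(\approx \lambda \), then

$$\begin{aligned} \int _{{\mathbb {T}}^2} F(x) dx = \sum _{k} \hat{F}(k) \int _{{\mathbb {T}}^2} e^{i k \cdot x} dx = (2\pi )^2 \hat{F}(0) = 0 \end{aligned}$$

since every exponential with \(k \ne 0\) integrates to zero over a full period.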

In order to obtain the estimate (3.22), we differentiate (7.39) with respect to t, and use the fact that the coarse scale velocity field \(u_\epsilon \) is divergence free

$$\begin{aligned} \frac{d}{dt} \int _{{\mathbb {T}}^2} |\Theta |^2(t,x) dx&= \sum _{I, J} \int _{{\mathbb {T}}^2} (\partial _t + u_\epsilon \cdot \nabla ) \Theta _I \cdot \Theta _J(t,x) dx \end{aligned}$$

At this point, we again observe that the terms \((\partial _t + u_\epsilon \cdot \nabla ) \Theta _I \cdot \Theta _J\) are localized in frequency space at frequencies of order \(\lambda \) for all nonconjugate indices \(J \ne \bar{I}\). These terms therefore integrate to 0 and we are left with

$$\begin{aligned} \frac{d}{dt} \int _{{\mathbb {T}}^2} |\Theta |^2(t,x) dx&= \sum _{I} \int _{{\mathbb {T}}^2} (\partial _t + u_\epsilon \cdot \nabla ) |\Theta _I|^2(t,x) dx \\&= \sum _I \int _{{\mathbb {T}}^2} (\partial _t + u_\epsilon \cdot \nabla ) | \tilde{\theta }_I |^2 dx \end{aligned}$$

The bound (3.22) now follows from Corollary 7.1.

Checking Frequency Energy Levels for the Scalar Field and Drift Velocity

The statement (3.13) in Lemma 3.1 requires us to prove that the new scalar field and drift velocity \(\theta _1 = \theta + \Theta \), \(u_1^l = u^l + U^l\) satisfy the bounds (3.3)-(3.4) for the new compound frequency energy levels \((\Xi ', e_v', e_R', e_J') = (C N \Xi , e_R, K_1 e_J, e_J')\) with

$$\begin{aligned} e_J' = \left( \frac{e_v^{1/2}}{e_R^{1/2}N} \right) ^{1/2} e_R \end{aligned}$$

The bounds in (3.3) already follow from the arguments in [31, Section 22], as the scalar field \(\theta \) and drift velocity \(u^l\) both share the same estimates as the coarse scale velocity \(v^l\) in that paper, and the corrections \(\Theta \) and \(U^l\) both share the same estimates as the corrections \(V^l\) in that paper. The only new point here is how we establish the bound

$$\begin{aligned} \Vert (\partial _t + u_1 \cdot \nabla ) u_1 \Vert _{C^0}&\unlhd \left( \Xi ' e_v'\right) = C N \Xi e_R \end{aligned}$$
(7.40)

This estimate, which is quadratic in the velocity, is analogous to the bound for the pressure gradient in the case of Euler.

The idea is to use the assumed bound (3.4) and write

$$\begin{aligned} (\partial _t + u_1 \cdot \nabla ) u_1&= (\partial _t u + u \cdot \nabla u) + U \cdot \nabla u + (\partial _t + u \cdot \nabla ) U + U \cdot \nabla U \end{aligned}$$
(7.41)

In the case of Euler, the first term \((\partial _t u + u \cdot \nabla u)\) can be bounded using the Euler–Reynolds equations. In our case, though, the bound (7.40) on the advective derivative cannot be obtained from commuting the operator \(T^l\) with the compound scalar stress equation due to the lack of \(C^0\) boundedness of \(T^l\), and arguments involving frequency truncations still give logarithmic losses.

The idea is that we have already assumed the bound \(\Vert (\partial _t u + u \cdot \nabla u) \Vert _{C^0} \le \Xi e_v\), so that (7.40) follows from the condition \(N \ge \left( \frac{e_v}{e_R}\right) \). Also, further advective derivatives can be estimated at a cost smaller than \(N \Xi \) per derivative up to order \(L - 1\), giving (3.4) for this term. The proof of (3.4) for the other two terms is the same as in [31, Section 22]. The main idea is to write \((\partial _t + u \cdot \nabla ) = (\partial _t + u_\epsilon \cdot \nabla ) + (u - u_\epsilon ) \cdot \nabla \), and then to apply the relevant bounds established earlier on in Sections 6-7.
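For orientation, the \(C^0\) size of three of the four terms in (7.41) can be checked directly. The following is a sketch, assuming the bounds \(\Vert \nabla u \Vert _{C^0} \le C \Xi e_v^{1/2}\), \(\Vert U \Vert _{C^0} \le C e_R^{1/2}\) and \(\Vert \nabla U \Vert _{C^0} \le C B_\lambda N \Xi e_R^{1/2}\) (the last two being the cases \(k = 0, 1\), \(r = 0\) of (7.34), summed over the finitely many active indices), together with \(e_R \le e_v\) and \(N \ge e_v / e_R\):

$$\begin{aligned} \Vert \partial _t u + u \cdot \nabla u \Vert _{C^0}&\le \Xi e_v \le N \Xi e_R \\ \Vert U \cdot \nabla u \Vert _{C^0}&\le C e_R^{1/2} \, \Xi e_v^{1/2} \le C \Xi e_v \le C N \Xi e_R \\ \Vert U \cdot \nabla U \Vert _{C^0}&\le C e_R^{1/2} \, B_\lambda N \Xi e_R^{1/2} = C B_\lambda N \Xi e_R \end{aligned}$$

Each of these is of the form required in (7.40), up to a constant. The remaining term \((\partial _t + u \cdot \nabla ) U\) is the one that requires the decomposition of the advective derivative described above.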

Estimates for the New Stress

In this Section, we conclude the proof of Lemma 3.1 by establishing estimates for the error terms contributing to the new stress field which were derived in Section 5.3. Recall from that section that the new scalar field \(\theta _1 = \theta + \Theta \) and the new drift velocity \(u_1^l = u^l + U^l\) satisfy the compound scalar stress equation

$$\begin{aligned} \left\{ \begin{aligned} \partial _t \theta _1 + \partial _l(\theta _1 u_1^l)&= \partial _l( c_B B^l + R_1^l) \\ u_1^l&= T^l\left[ \theta _1\right] \end{aligned} \right. \end{aligned}$$
(8.1)

The function \(c_B\) is defined in (5.25), and the new stress field has the form

$$\begin{aligned} R_1&= R_T + R_L + R_H + R_M + R_S \end{aligned}$$
(8.2)

as in (5.37). For these error terms, the Main Lemma requires us to show that the bounds of Definition 3.2 are satisfied for the compound frequency energy levels \((\Xi ', e_v', e_R', e_J' )\) specified in (3.13). Our starting point will be to prove the \(C^0\) estimates

$$\begin{aligned} \Vert c_B \Vert _{C^0}&\unlhd K_1 e_J \end{aligned}$$
(8.3)
$$\begin{aligned} \Vert R_1 \Vert _{C^0}&\unlhd e_J' \end{aligned}$$
(8.4)
$$\begin{aligned} e_J'&= \left( \frac{e_v^{1/2}}{e_R^{1/2}N} \right) ^{1/2} e_R \end{aligned}$$
(8.5)

We will obtain these bounds in Section 8.1, at which point we will finally specify the large constant \(B_\lambda \) appearing in line (5.7) where \(\lambda \) is defined.

Once the \(C^0\) estimates are established and \(B_\lambda \) is chosen, the bounds on spatial derivatives

$$\begin{aligned} \Vert \nabla ^k c_B \Vert _{C^0}&\unlhd C (N \Xi )^k K_1 e_J, \qquad k = 0, \ldots , L \end{aligned}$$
(8.6)
$$\begin{aligned} \Vert \nabla ^k R_1 \Vert _{C^0}&\unlhd C (N \Xi )^k e_J', \qquad k = 0, \ldots , L \end{aligned}$$
(8.7)

will be clear, and we will also need to verify the estimates for the advective derivatives

$$\begin{aligned} \Vert \nabla ^k \left( \partial _t + u_1 \cdot \nabla \right) c_B \Vert _{C^0}&\unlhd C \left( N \Xi \right) ^k \left( N \Xi e_R^{1/2}\right) K_1 e_J \end{aligned}$$
(8.8)
$$\begin{aligned} \Vert \nabla ^k \left( \partial _t + u_1 \cdot \nabla \right) R_1 \Vert _{C^0}&\unlhd C \left( N \Xi \right) ^k \left( N \Xi e_R^{1/2}\right) e_J' \nonumber \\ k&= 0, \ldots , L-1 \end{aligned}$$
(8.9)

These bounds will be checked in Sections 8.2 and 8.3, which will conclude the proof of Lemma 3.1.

The \(C^0\) Bounds

In this Section, we establish the \(C^0\) bounds (8.3)-(8.4). The bound (8.4) will be obtained only after the constant \(B_\lambda \) of line (5.7) is chosen sufficiently large.

First, observe that the estimate (8.3) for \(c_B\) follows immediately from line (5.25) where \(c_B\) is defined, and the bound \(\Vert R_\epsilon \Vert _{C^0} \le \Vert R_J \Vert _{C^0}\). Note that the constant \(K_1\) depends only on the operator \(T^l\).

It now remains to estimate the stress terms appearing in (8.2). We estimate each of these in turn.

The Mollification Term \(R_M^l\). We recall from (5.3) and (6.21) that

$$\begin{aligned} R_M^l&= (u^l - u_\epsilon ^l) \Theta + \left( \theta - \theta _\epsilon \right) U^l + \left( c_A - \tilde{c}_A\right) A^l + (R_J^l - R_\epsilon ^l) \end{aligned}$$
(8.10)
$$\begin{aligned}&= R_{M,\theta }^l + R_{M,u}^l + \left( c_A - \tilde{c}_A\right) A^l + (R_J^l - R_\epsilon ^l) + R_{M'}^l. \end{aligned}$$
(8.11)

Note that by the choice of B, from (6.9) and (6.17) we have that

$$\begin{aligned}&\Vert R_{M,\theta } \Vert _{C^0} + \Vert R_{M,u} \Vert _{C^0} + \Vert \left( c_A - \tilde{c}_A\right) A^l \Vert _{C^0} + \Vert (R_J^l - R_\epsilon ^l) \Vert _{C^0} \le \frac{e_v^{1/2} e_R^{1/2}}{50 N} \nonumber \\&\quad = \frac{1}{50} ( \frac{e_v^{1/2}}{e_R^{1/2} N}) e_R \end{aligned}$$
(8.12)

The factor \(\left( \frac{e_v^{1/2}}{e_R^{1/2} N} \right) \) is less than 1, so this estimate is more than enough to achieve the bound (8.4).
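The comparison with (8.4) can be spelled out in one line. Writing x as a shorthand (introduced only here) for the factor \(\frac{e_v^{1/2}}{e_R^{1/2} N}\), the inequality \(x \le x^{1/2}\) for \(0< x < 1\) gives

$$\begin{aligned} \frac{1}{50} \left( \frac{e_v^{1/2}}{e_R^{1/2} N} \right) e_R \le \frac{1}{50} \left( \frac{e_v^{1/2}}{e_R^{1/2} N} \right) ^{1/2} e_R = \frac{1}{50} e_J' \end{aligned}$$

with \(e_J'\) as in (8.5).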

To estimate the remaining term

$$\begin{aligned} R_{M'}^l&= \sum _I e^{i \lambda \xi _I} (u^l - u_\epsilon ^l) \delta \theta _I + \sum _I e^{i \lambda \xi _I} \left( \theta - \theta _\epsilon \right) \delta u_I^l \end{aligned}$$
(8.13)

recall the estimates

$$\begin{aligned} \Vert (u^l - u_\epsilon ^l) \Vert _{C^0} + \Vert \left( \theta - \theta _\epsilon \right) \Vert _{C^0}&\le \frac{e_v^{1/2}}{N} \le e_v^{1/2} \end{aligned}$$
(8.14)
$$\begin{aligned} \Vert \delta \theta _I \Vert _{C^0} + \Vert \delta u_I^l \Vert _{C^0}&\le C \frac{e_R^{1/2}}{B_\lambda N} \end{aligned}$$
(8.15)

from Section 6 and Lemma 7.5 (in fact the estimates for the terms \((\theta - \theta _\epsilon )\) and \((u - u_\epsilon )\) are even better). Note also that, at any given time t, at most four indices I contribute to the sum in (8.13).

For sufficiently large values of \(B_\lambda \), we therefore obtain

$$\begin{aligned} \Vert R_{M'}^l \Vert _{C^0} \le \frac{1}{50} \frac{e_v^{1/2} e_R^{1/2}}{N} \end{aligned}$$

which is sufficient for (8.4).
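One way to see how large \(B_\lambda \) must be for this term is the following sketch, in which C denotes the constant of (8.15) and the numerical threshold 200C is merely what this particular count produces. Since at most four indices contribute to each sum in (8.13), the bounds (8.14) and (8.15) give

$$\begin{aligned} \Vert R_{M'}^l \Vert _{C^0} \le 4 \left( \Vert u - u_\epsilon \Vert _{C^0} + \Vert \theta - \theta _\epsilon \Vert _{C^0} \right) \max _I \left( \Vert \delta \theta _I \Vert _{C^0} + \Vert \delta u_I \Vert _{C^0} \right) \le \frac{4 C}{B_\lambda } \frac{e_v^{1/2} e_R^{1/2}}{N} \end{aligned}$$

so that any choice \(B_\lambda \ge 200 C\) suffices for the stated bound.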

The Stress Term \(R_S\). To estimate \(R_S\), let us recall from (5.36) that we can express

$$\begin{aligned} R_S^l&= \sum _I (R_{SI,1}^l + R_{SI,2}^l ) \end{aligned}$$
(8.16)
$$\begin{aligned} R_{SI,1}^l&= |\theta _I|^2 [ (m^l(-\nabla \xi _I) - m^l(-\nabla \hat{\xi }_I)) + ( m^l\left( \nabla \xi _I\right) - m^l\left( \nabla \hat{\xi }_I\right) )] \end{aligned}$$
(8.17)
$$\begin{aligned} R_{SI,2}^l&= \delta \theta _I \tilde{u}_{\bar{I}}^l + \tilde{\theta }_I \delta u_{\bar{I}}^l - \delta \theta _I \delta u_{\bar{I}}^l + \delta \theta _{\bar{I}} \tilde{u}_{I}^l + \tilde{\theta }_{\bar{I}} \delta u_{I}^l - \delta \theta _{\bar{I}} \delta u_{I}^l \end{aligned}$$
(8.18)

The estimates of Lemma 7.4, Lemma 7.5 and Corollary 7.1 give

$$\begin{aligned} \Vert R_{SI, 2} \Vert _{C^0}&\le \frac{C}{B_\lambda N} e_R \end{aligned}$$

At any given time t, at most 4 terms of the form \(R_{SI, 2}\) are nonzero, which allows us to obtain the estimate

$$\begin{aligned} \Vert \sum _I | R_{SI,2} | \Vert _{C^0}&\le \frac{e_v^{1/2} e_R^{1/2}}{500 N} \end{aligned}$$

which is sufficient for (8.4), by taking the value of \(B_\lambda \) sufficiently large.

We estimate the terms in (8.17) using (7.13) and Lemma 7.4, in order to obtain the bound

$$\begin{aligned} \Vert R_{SI,1}^l \Vert _{C^0}&\le \frac{C}{B_\lambda ^{1/2}} \left( \frac{e_v^{1/2}}{e_R^{1/2}N} \right) ^{1/2} e_R \end{aligned}$$

By choosing the constant \(B_\lambda \) sufficiently large, we obtain the bound

$$\begin{aligned} \Vert \sum _I | R_{SI,1}^l | \Vert _{C^0}&\le \frac{1}{1000} e_J' \end{aligned}$$
(8.19)

where \(e_J'\), as defined in (8.5), is our goal for the size of the new stress term \(R_1^l\).

For the next stress terms, \(R_L\) and \(R_T\), we use that they are frequency localized between two constant multiples of \(\lambda \), and thus we can appeal to the estimate

$$\begin{aligned} \Vert \nabla \Delta ^{-1} P_{\approx \lambda } \Vert _{C^0 \rightarrow C^0} \le C \lambda ^{-1} = \frac{C}{B_\lambda N \Xi }. \end{aligned}$$
(8.20)

The High-Low Term \(R_L\). We recall from (5.45) that

$$\begin{aligned} R_L^l = \partial ^l \Delta ^{-1} P_{\approx \lambda } [ U^j \partial _j \theta _\epsilon ] \end{aligned}$$

and thus

$$\begin{aligned} \Vert R_L \Vert _{C^0} \le \Vert \nabla \Delta ^{-1} P_{\approx \lambda } \Vert _{C^0 \rightarrow C^0} \Vert U^j \Vert _{C^0} \Vert \partial _j \theta _\epsilon \Vert _{C^0} \le \frac{C}{N B_\lambda } e_R^{1/2} e_v^{1/2} \end{aligned}$$

holds, upon appealing to (8.20). Choosing \(B_\lambda \) sufficiently large, we see that

$$\begin{aligned} \Vert R_L \Vert _{C^0} \le \frac{1}{1000} \frac{e_v^{1/2} e_R^{1/2}}{N} \end{aligned}$$

which is sufficient for (8.4) to be satisfied.

The Transport Term \(R_T\). We use (5.40) to recall that

$$\begin{aligned} R_T = \partial ^l \Delta ^{-1} P_{\approx \lambda }\left[ \frac{\bar{D}}{\partial t}\Theta \right] . \end{aligned}$$

Thus, from (8.20) and (7.34) we obtain

$$\begin{aligned} \Vert R_T \Vert _{C^0}&\le \frac{C}{\lambda } \tau ^{-1} e_R^{1/2} \\&=\frac{C}{B_\lambda N \Xi } B_\lambda ^{1/2} \left( \frac{e_v^{1/2}}{e_R^{1/2} N} \right) ^{-1/2} \Xi e_v^{1/2} e_R^{1/2}\\&= \frac{C}{B_\lambda ^{1/2}} \left( \frac{e_v^{1/2}}{e_R^{1/2}N} \right) ^{1/2} e_R \end{aligned}$$

in view of the choice of \(\tau \) in (6.38). Choosing \(B_\lambda \) sufficiently large immediately shows that

$$\begin{aligned} \Vert R_T \Vert _{C^0} \le \frac{1}{1000} e_J' \end{aligned}$$

holds.

The High-Frequency Interference Term \(R_H\). To conclude the \(C^0\) stress estimates we recall from (5.46) and (5.49)-(5.51) that

$$\begin{aligned} R_H&= \sum _{J \ne \bar{I}} i\lambda \partial ^l \Delta ^{-1} P_{\approx \lambda } \\&\quad \Big [e^{i \lambda \left( \xi _I + \xi _J\right) } ( \theta _J {\tilde{\theta }}_I ( m^l(\nabla \xi _J) - m^l(\nabla \hat{\xi }_J)) \partial _l \xi _I\\&\quad + \theta _I {\tilde{\theta }}_J m^l\left( \nabla \hat{\xi }_J\right) (\partial _l \xi _I - \partial _l \hat{\xi }_I)+ \delta u_J^l \partial _l \xi _I {\tilde{\theta }}_I ) \Big ]. \end{aligned}$$

From (8.20) we thus obtain

$$\begin{aligned} \Vert R_H \Vert _{C^0}&\le C A e_R \left( \Xi e_v^{1/2} \tau \right) + \frac{C}{B_\lambda N } e_R \\&\le \frac{C A }{B_\lambda ^{1/2}} \left( \frac{e_v^{1/2}}{e_R^{1/2}N} \right) ^{1/2} e_R + \frac{C}{B_\lambda N } e_R. \end{aligned}$$

For all sufficiently large choices of \(B_\lambda \), we finally have the estimate

$$\begin{aligned} \Vert R_H \Vert _{C^0} \le \frac{1}{1000} e_J' \end{aligned}$$

This error term is the last one, so the estimate (8.4) will finally be satisfied for any sufficiently large choice of \(B_\lambda \ge \overline{B_\lambda }\). The only restriction now is that \(\lambda = B_\lambda N \Xi \) in (5.7) must be a positive integer. Since we assume \(\Xi \ge 2\) in Definition 3.2 and \(N \ge 1\), an appropriate choice of \(B_\lambda \) exists in the interval \(B_\lambda \in [\overline{B_\lambda }, 2 \overline{B_\lambda }]\). Our construction is now fully specified once such a value is chosen.
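The existence of such a value of \(B_\lambda \) is a simple counting statement, sketched here under the harmless additional assumption \(\overline{B_\lambda } \ge 1\). As \(B_\lambda \) ranges over \([\overline{B_\lambda }, 2 \overline{B_\lambda }]\), the product \(B_\lambda N \Xi \) ranges over an interval of length

$$\begin{aligned} 2 \overline{B_\lambda } N \Xi - \overline{B_\lambda } N \Xi = \overline{B_\lambda } N \Xi \ge 2 \end{aligned}$$

which therefore contains at least one positive integer \(\lambda \). One then takes \(B_\lambda = \lambda / (N \Xi )\).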

Spatial Derivative Bounds

First we claim that

$$\begin{aligned} \Vert \nabla ^k c_B \Vert _{C^0} \le C_k (N \Xi )^k K_1 e_J \end{aligned}$$

For this purpose, recall the definition (5.25) and the bound (6.18). The above estimate holds since we have already verified the \(C^0\) estimate (8.3), and each spatial derivative costs no more than a factor

$$\begin{aligned} |\nabla | \le C N \Xi . \end{aligned}$$

The stress terms \(R_T, R_L\), and \(R_H\) each contain a frequency localizing operator \(P_{\approx \lambda }\), so that again each spatial derivative costs at most \(CN \Xi \), since the constant \(B_\lambda \) has now been fixed in the previous subsection.

The term \(R_M^l\) is treated in the same fashion as the mollified stress term in [31, Section 25.1]. The main idea is that, comparing the bound

$$\begin{aligned} \Vert u - u_\epsilon \Vert _{C^0} \le \frac{e_v^{1/2}}{N} \end{aligned}$$

which had been used to establish (8.12) in Section 6, to the estimate

$$\begin{aligned} \Vert \nabla u \Vert _{C^0} + \Vert \nabla u_\epsilon \Vert _{C^0} \le C \Xi e_v^{1/2} \end{aligned}$$

we notice the cost is at most \(C N \Xi \) upon taking a spatial derivative.
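Concretely, the cost of a derivative is read off as the ratio of the bound on the differentiated quantity to the bound on the undifferentiated one. In this case

$$\begin{aligned} \frac{C \Xi e_v^{1/2}}{e_v^{1/2} / N} = C N \Xi \end{aligned}$$

which is the threshold allowed in (8.7).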

The \(R_S^l\) stress is treated by writing \(R_S^l = \sum _I (R_{SI,1}^l + R_{SI,2}^l)\). The estimate for \(R_{SI,2}\) follows from the bounds established in Lemmas 7.4-7.5. To treat \(R_{SI,1}\) we need to observe that the spatial derivative costs at most \(N \Xi \) when it is applied to the difference \(\nabla \xi _I - \nabla \hat{\xi }_I\). Comparing the bounds of Lemma 7.3 and (7.13)

$$\begin{aligned} \Vert \nabla \xi _I - \nabla \hat{\xi }_I \Vert _{C^0}&\le \left( \frac{e_v^{1/2} }{e_R^{1/2} N} \right) ^{1/2} \\ \Vert \nabla ^2 \xi _I \Vert _{C^0}&\le C \Xi \end{aligned}$$

gives a cost of \(|\nabla | \le C N^{1/2} \Xi \), which is smaller than the threshold \(N \Xi \). All further derivatives of this term cost at most \(C N^{1/2} \Xi \) according to Lemma 7.3.
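The arithmetic behind this cost is the following ratio of the two displayed bounds, where we use the inequality \(e_R \le e_v\) maintained throughout the iteration:

$$\begin{aligned} C \Xi \left( \frac{e_v^{1/2}}{e_R^{1/2} N} \right) ^{-1/2} = C N^{1/2} \Xi \left( \frac{e_R}{e_v} \right) ^{1/4} \le C N^{1/2} \Xi \end{aligned}$$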

Advective Derivative Bounds

We now proceed to establish the advective derivative bounds (8.8)-(8.9) for the new frequency energy levels, which is more subtle than the spatial derivative estimates due to the improved regularity of the advective derivative. As observed in [31, Proposition 24.1 and Proposition 24.2], note that it suffices to check the bounds for the coarse scale advective derivative \(\frac{\bar{D}}{\partial t}= \partial _t + u_\epsilon \cdot \nabla \) after we write

$$\begin{aligned} \partial _t + u_1 \cdot \nabla = (\partial _t + u_\epsilon \cdot \nabla ) + (u - u_\epsilon ) \cdot \nabla + U \cdot \nabla . \end{aligned}$$

Having established spatial derivative estimates on all our error terms, the bounds for the two remaining terms, which involve only spatial derivatives, follow from the results of Section 8.2, the already established estimates for \(\Vert \nabla ^k ( u - u_\epsilon ) \Vert _{C^0}\) which follow from (6.4), and the bounds on \(\Vert \nabla ^k U \Vert _{C^0}\), which follow from Proposition 7.1.

Since each term has been estimated already in \(C^0\) by the energy level \(e_J'\), our goal at this point is to check that the advective derivative never costs any more than

$$\begin{aligned} \Big | \frac{\bar{D}}{\partial t}\Big |&\unlhd C N \Xi e_R^{1/2} \end{aligned}$$
(8.21)

compared to the estimates that were used to obtain the \(C^0\) bound.

For most terms, the advective derivative costs \(\tau ^{-1}\), so it is useful to observe that our goal is also implied by a bound of the type

$$\begin{aligned} \Big | \frac{\bar{D}}{\partial t}\Big | \le C \tau ^{-1} \end{aligned}$$
(8.22)

from the fact that \(\tau ^{-1} \le N \Xi e_R^{1/2}\). For terms involving the difference between the phase gradients and their initial values, the following Lemma stating the cost of differentiating \(\nabla \xi _I - \nabla \hat{\xi }_I\) is helpful:

Lemma 8.1

For \(k \ge 0\) and \(0 \le r \le 2\), we have the following bounds

$$\begin{aligned} \Vert \nabla ^k \left( \frac{\bar{D}}{\partial t}\right) ^r ( \nabla \xi _I - \nabla \hat{\xi }_I) \Vert _{C^0}&\le C_k (N\Xi )^k \tau ^{-r} b \end{aligned}$$
(8.23)
$$\begin{aligned} b&= B_\lambda ^{-1/2} \left( \frac{e_v^{1/2}}{e_R^{1/2}N} \right) ^{1/2} \end{aligned}$$
(8.24)

Lemma 8.1 follows from Lemma 7.3 after checking the relationships of the parameters using the condition \(N \ge \frac{e_v}{e_R}\).
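As an illustration of this kind of parameter check, consider the inequality \(\tau ^{-1} \le N \Xi e_R^{1/2}\) used before the Lemma. The sketch below reads off the value of \(\tau ^{-1}\) from the computation for \(R_T\) in Section 8.1 rather than from the definition (6.38), which is not reproduced here, and the inequality is understood up to the constant \(B_\lambda ^{1/2}\), which has already been fixed:

$$\begin{aligned} \tau ^{-1}&= B_\lambda ^{1/2} \left( \frac{e_v^{1/2}}{e_R^{1/2} N} \right) ^{-1/2} \Xi e_v^{1/2} = B_\lambda ^{1/2} N^{1/2} \Xi e_v^{1/4} e_R^{1/4} \\&= B_\lambda ^{1/2} \left( \frac{e_v}{N^2 e_R} \right) ^{1/4} N \Xi e_R^{1/2} \le B_\lambda ^{1/2} N \Xi e_R^{1/2} \end{aligned}$$

The last step uses \(N \ge \frac{e_v}{e_R}\) and \(N \ge 1\), which together give \(\frac{e_v}{N^2 e_R} \le \frac{1}{N} \le 1\).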

Corollary 8.1

The bounds in Lemma 8.1 hold also for

$$\begin{aligned} m^l\left( \nabla \xi _I\right) - m^l\left( \nabla \hat{\xi }_I\right) = \left( \nabla \xi _I - \nabla \hat{\xi }_I\right) \int _0^1 \partial _a m^l\Big ( (1-\sigma )\nabla \hat{\xi }_I + \sigma \nabla \xi _I \Big ) d\sigma \end{aligned}$$

With these bounds in hand, we can now quickly verify (8.21).

The Term \(c_B\). The term \(c_B\) inherits the estimates for \(R_\epsilon \) from its definition in (5.25). These bounds are no worse than the bounds stated for \(R_J\) in (6.18) as long as one takes no more than 2 advective derivatives and no more than L total spatial or advective derivatives (see [31, Section 18]). As a consequence, we obtain (8.8).

The Mollification Term \(R_M^l\). The mollification term (8.11) is handled in the same way as in [31, Sections 25.1, 25.2]. Among these estimates, the most subtle are the terms

$$\begin{aligned} (u - u_\epsilon ) \Theta + (\theta - \theta _\epsilon ) U \end{aligned}$$

For the purposes of proving our result and the main theorem in [31], these terms can be estimated separately as

$$\begin{aligned} \left\Vert \left( \frac{\bar{D}}{\partial t}u - \frac{\bar{D}}{\partial t}u_\epsilon \right) \Theta \right\Vert_{C^0} \le C \left( \left\Vert \frac{\bar{D}}{\partial t}u\right\Vert_{C^0} + \left\Vert \frac{\bar{D}}{\partial t}u_\epsilon \right\Vert_{C^0}\right) \Vert \Theta \Vert _{C^0} \end{aligned}$$

at the cost of requiring the condition \(N \ge \left( \frac{e_v}{e_R} \right) ^{3/2}\). However, as discussed in [31, Section 25.1], it appears that a scheme aimed at proving 1/3 regularity might require this term to be estimated more delicately. A more delicate commutator estimate would allow us to require instead that \(N \ge \left( \frac{e_v}{e_R} \right) \).
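The origin of the exponent 3/2 can be seen from the following count, which is a sketch under the assumptions \(\Vert \frac{\bar{D}}{\partial t}u \Vert _{C^0} + \Vert \frac{\bar{D}}{\partial t}u_\epsilon \Vert _{C^0} \le C \Xi e_v\) and \(\Vert \Theta \Vert _{C^0} \le C e_R^{1/2}\). The target for the advective derivative of a stress term is the right hand side of (8.9) with \(k = 0\), namely

$$\begin{aligned} \left( N \Xi e_R^{1/2}\right) e_J' = N \Xi e_R^{1/2} \left( \frac{e_v^{1/2}}{e_R^{1/2}N} \right) ^{1/2} e_R = N^{1/2} \Xi e_v^{1/4} e_R^{5/4} \end{aligned}$$

whereas the crude product bound is of size \(C \Xi e_v e_R^{1/2}\). The ratio of the two is

$$\begin{aligned} \frac{\Xi e_v e_R^{1/2}}{N^{1/2} \Xi e_v^{1/4} e_R^{5/4}} = \left( \frac{1}{N} \left( \frac{e_v}{e_R} \right) ^{3/2} \right) ^{1/2} \end{aligned}$$

which is at most 1 exactly when \(N \ge \left( \frac{e_v}{e_R} \right) ^{3/2}\).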

The Stress Term \(R_S\). For the term \(R_S\), the cost (8.21) is obtained for every term appearing in (8.16) using the estimates of Lemmas 7.4-7.5 and Corollary 7.1 for the amplitudes, and using Lemma 8.1 and Corollary 8.1 for the terms involving differences of phase gradients.

The Terms \(R_T\), \(R_L\) and \(R_H\). The commutator estimates of Lemma 7.2 and the use of frequency localized waves make it especially simple to estimate the terms obtained by solving the divergence equation. We list these terms here.

$$\begin{aligned} \frac{\bar{D}}{\partial t}R_T^l&= \left[ \frac{\bar{D}}{\partial t},\partial ^l \Delta ^{-1} P_{\approx \lambda }\right] \left[ \frac{\bar{D}}{\partial t}\Theta \right] + \partial ^l \Delta ^{-1} P_{\approx \lambda } \left[ \frac{\bar{D}^2}{\partial t^2}\Theta \right] \end{aligned}$$
(8.25)
$$\begin{aligned} \frac{\bar{D}}{\partial t}R_L^l&= \left[ \frac{\bar{D}}{\partial t},\partial ^l \Delta ^{-1} P_{\approx \lambda }\right] \left[ U^j \partial _j \theta _\epsilon \right] + \partial ^l \Delta ^{-1} P_{\approx \lambda } \frac{\bar{D}}{\partial t}\left[ U^j \partial _j \theta _\epsilon \right] \end{aligned}$$
(8.26)
$$\begin{aligned} \frac{\bar{D}}{\partial t}R_H^l&= \sum _{J \ne \bar{I}} \left[ \frac{\bar{D}}{\partial t},\partial ^l \Delta ^{-1} P_{\approx \lambda }\right] r_{H, IJ} + \partial ^l \Delta ^{-1} P_{\approx \lambda } \frac{\bar{D}}{\partial t}r_{H,IJ} \end{aligned}$$
(8.27)
$$\begin{aligned} r_{H,IJ}&= (i\lambda ) e^{i \lambda (\xi _I + \xi _J)} \left( \theta _J {\tilde{\theta }}_I ( m^l(\nabla \xi _J) - m^l(\nabla \hat{\xi }_J)) \partial _l \xi _I \right) \end{aligned}$$
(8.28)
$$\begin{aligned}&\quad + \left( i \lambda \right) e^{i \lambda \left( \xi _I + \xi _J\right) } \left( \theta _J {\tilde{\theta }}_I m^l(\nabla \hat{\xi }_J)( \partial _l \xi _I - \partial _l \hat{\xi }_I)+ \delta u_J^l \partial _l \xi _I {\tilde{\theta }}_I \right) . \end{aligned}$$
(8.29)

Combining Lemma 7.2 with Corollary 8.1 and all the bounds of Section 7, we obtain a cost of (8.22) (and therefore (8.21)) for the advective derivative. Further spatial derivatives cost at most \(C N \Xi \) as all the terms are in fact localized to frequencies of order \(\lambda \).

This estimate concludes the proof of the Main Lemma.

Proof of the Main Theorem

In this Section, we explain how Theorem 1.1 can be deduced from the Main Lemma, Lemma 3.1. More specifically, the Theorem we establish directly is the following:

Theorem 9.1

As in the hypotheses of Theorem 1.1, let \(\alpha < 1/9\), let \(\epsilon > 0\) be given, and let \(f : {\mathbb {R}}\times {\mathbb {T}}^2 \rightarrow {\mathbb {R}}\) be any smooth function of compact support

$$\begin{aligned} {\mathrm {supp}}\,f \subseteq I \times {\mathbb {T}}^2 \end{aligned}$$

for which the integral

$$\begin{aligned} \int _{{\mathbb {T}}^2} f(t,x) dx&= 0, \qquad t \in {\mathbb {R}} \end{aligned}$$
(9.1)

remains constant in time. Then there exists a function \(\theta : {\mathbb {R}}\times {\mathbb {T}}^2 \rightarrow {\mathbb {R}}\) with the following properties:

  1. 1.

    \(\theta \) satisfies the Active Scalar Equation (1.1) in the sense of distributions.

  2. 2.

    The scalar field \(\theta \) and the drift velocity \(u^l = T^l[\theta ]\) both belong to the Hölder class \(\theta , u^l \in C_{t,x}^\alpha \)

  3. 3.

    \(\theta \) is supported in the time interval

    $$\begin{aligned} {\mathrm {supp}}\,\theta \subseteq I_\epsilon \times {\mathbb {T}}^2, \end{aligned}$$
    (9.2)

    where \(I_\epsilon = [a_0 - \epsilon , b_0 + \epsilon ]\) is an \(\epsilon \)-neighborhood of the interval \(I = [a_0, b_0]\)

  4. 4.

    \(\theta \) satisfies a uniform estimate

    $$\begin{aligned} \Vert \theta \Vert _{C^0}&\le C \end{aligned}$$
    (9.3)

    with C depending only on f.

  5. 5.

    For any smooth function \(\phi : {\mathbb {R}}\times {\mathbb {T}}^2 \rightarrow {\mathbb {C}}\), we have

    $$\begin{aligned} \left| \int _{{\mathbb {R}}\times {\mathbb {T}}^2} (\theta - f) \phi ~dt dx \right|&\le C \epsilon \Vert \nabla \phi \Vert _{L^1_{t,x}(I_\epsilon \times {\mathbb {T}}^2)} \end{aligned}$$
    (9.4)

    for some constant C depending on f.

Theorem 1.1 follows from Theorem 9.1 by a straightforward argument that is implicit in Section 10 below.

Our starting point is the observation that the function f can be viewed as a solution to the scalar-stress equation

$$\begin{aligned} \partial _t f + \partial _l\left( f u^l\right)&= \partial _l R^l \nonumber \\ u^l&= T^l[f] \end{aligned}$$
(9.5)
$$\begin{aligned} R^j&= \partial ^j \Delta ^{-1} \left[ \partial _t f + \partial _l\left( f u^l\right) \right] \end{aligned}$$
(9.6)

thanks to the condition

$$\begin{aligned} \int _{{\mathbb {T}}^2} \partial _t f dx = \frac{d}{dt} \int _{{\mathbb {T}}^2} f(t,x) dx = 0. \end{aligned}$$

Furthermore, the functions \((f, u^l, R^l)\) in (9.5) are all smooth functions on \({\mathbb {R}}\times {\mathbb {T}}^2\) with support contained in a finite time interval \(I \times {\mathbb {T}}^2\). In particular, the scalar function \(\theta _{(0)} = f\) can be viewed as part of a smooth, compactly supported solution \((f, u^l, c_A, R^l)\) to the compound scalar stress equation (3.2) with \(c_A = 0\).
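To see why the mean-zero condition is exactly what is needed for (9.6) to solve (9.5), write g as a shorthand (introduced only here) for \(\partial _t f + \partial _l(f u^l)\). The operator \(\Delta ^{-1}\) on \({\mathbb {T}}^2\) is defined on functions of integral zero, and g has this property because

$$\begin{aligned} \int _{{\mathbb {T}}^2} g \, dx = \frac{d}{dt} \int _{{\mathbb {T}}^2} f(t,x) dx + \int _{{\mathbb {T}}^2} \partial _l\left( f u^l\right) dx = 0 + 0 \end{aligned}$$

where the second integral vanishes as the integral of a divergence over the torus. Consequently

$$\begin{aligned} \partial _j R^j = \partial _j \partial ^j \Delta ^{-1} g = \Delta \Delta ^{-1} g = g \end{aligned}$$

which is the first equation in (9.5).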

Our proof of Theorem 9.1 will be completed once we prove the following Claim.

Claim 9.1

Under the assumptions of Theorem 9.1, there exists a sequence of scalar-stress fields \((\theta _{(k)}, u_{(k)}^l , c_{A, (k)}, R_{J,(k)}^l )\) satisfying the following properties.

  1. 1.

    For even indices \(k = 0, 2, 4, \ldots \), \((\theta _{(k)}, u_{(k)}^l , c_{A, (k)}, R_{J,(k)}^l )\) solves the Compound Scalar-Stress Equation (3.2) with vector \(A^l\), whereas for odd indices \(k = 1, 3, 5, \ldots \), \((\theta _{(k)}, u_{(k)}^l , c_{A, (k)}, R_{J,(k)}^l )\) solves the Compound Scalar-Stress Equation (3.2) with vector \(B^l\). Here \(A^l\) and \(B^l\) are defined as in (3.1).

  2. 2.

    We have \(\Vert c_{A,(k)} \Vert _{C^0} + \Vert R_{J,(k)} \Vert _{C^0} \rightarrow 0\) as \(k \rightarrow \infty \).

  3. 3.

    The sequences \(\theta _{(k)}, u_{(k)}^l\) are Cauchy in \(C_{t,x}^0\) with uniform bounds on \(\Vert \theta _{(k)} \Vert _{C^0}, \Vert u_{(k)} \Vert _{C^0}\) depending only on f.

  4. 4.

    The sequences \(\theta _{(k)}, u_{(k)}^l\) are also Cauchy in \(C_{t,x}^\alpha \).

  5. 5.

    We have \({\mathrm {supp}}\,\theta _{(k)} \subseteq I_\epsilon \times {\mathbb {T}}^2\) for all k.

  6. 6.

    The estimate (9.4) holds for \(\theta _{(k)}\) uniformly in k.

The scalar stress fields described in Claim 9.1 will be constructed by iteration of Lemma 3.1.

The Base Case

To initialize the construction, we set \(\theta _{(0)} = f\), \(u_{(0)}^l = T^l[f]\), \(c_{A,(0)} = 0\), and \(R_{J,(0)}\) as in (9.6). We define \(I_{(0)}\) to be the smallest closed interval such that \({\mathrm {supp}}\,f \subseteq I_{(0)} \times {\mathbb {T}}^2\). We set

$$\begin{aligned} e_{J,(0)} = \Vert R_{J,(0)} \Vert _{C^0}. \end{aligned}$$

In order to be consistent with the iteration rules (9.8)-(9.11) below and to maintain the inequality \(e_v \ge e_R \ge e_J\) during the iteration, we take

$$\begin{aligned} e_{v,(0)} = e_{R,(0)} = K_1 e_{J,(0)} \end{aligned}$$

where \(K_1\) is the constant in Lemma 3.1. Now let \(\overline{\Xi }\) be a sufficiently large constant such that the bounds of Definition 3.2 hold with \(L = 2\) for the frequency energy levels \((\Xi , e_v, e_R, e_J) = (\overline{\Xi }, K_1e_{J,(0)}, K_1e_{J,(0)}, e_{J,(0)})\).

We will choose our initial frequency level \(\Xi _{(0)}\) to be even larger than the parameter \(\overline{\Xi }\). More specifically, \(\Xi _{(0)}\) will take the form

$$\begin{aligned} \Xi _{(0)}&= Y \overline{\Xi } \end{aligned}$$
(9.7)

Here \(Y \ge 1\) is a large parameter whose purpose will ultimately be to make sure that the time interval containing the support of the solution will be small without disturbing the required \(C^0\) estimate. In terms of the construction, choosing the parameter Y to be large will imply that we perform the iteration with a large frequency parameter \(\lambda \) and a small lifespan parameter \(\tau \) when we iterate the Main Lemma.

Choice of Parameters for \(k \ge 1\)

We will proceed with the proof by iteration of the Main Lemma, which requires us to specify a sequence of frequency energy levels \((\Xi _{(k)}, e_{v,(k)}, e_{R,(k)}, e_{J,(k)})\), a sequence of functions \(e_{(k)}(t)\) prescribing the energy increment, a sequence of intervals \(I_{(k)}\) containing the support of the compound scalar-stress fields, and a sequence of frequency growth factors \(N_{(k)} \ge 2\). The present section is devoted to choosing these parameters, and studying how these parameters grow or decay during the iteration.

We will choose our frequency energy levels so that the following iteration rules hold for all \(k \ge 0\):

$$\begin{aligned} e_{v,(k+1)}&= \; e_{R,(k)} \end{aligned}$$
(9.8)
$$\begin{aligned} e_{R,(k+1)}&= K_1 e_{J,(k)} \end{aligned}$$
(9.9)
$$\begin{aligned} e_{J,(k+1)}&= \frac{ e_{J,(k)}}{Z} \end{aligned}$$
(9.10)
$$\begin{aligned} \Xi _{(k+1)}&= C_0 N_{(k)} \Xi _{(k)} \end{aligned}$$
(9.11)

The parameter Z will be chosen in the proof to be a large constant satisfying \(Z\ge K_1 \ge 1\). From (9.8)-(9.10) and the choices of Section 9.1, the energy levels decay exponentially according to the following pattern:

$$\begin{aligned} \begin{aligned} (e_v, e_R, e_J)_{(0)}&= ( K_1 e_{J,(0)}, K_1 e_{J,(0)}, e_{J,(0)} ) \\ (e_v, e_R, e_J)_{(1)}&= \left( K_1 e_{J,(0)}, K_1 e_{J,(0)}, \frac{1}{Z} e_{J,(0)} \right) \\ (e_v, e_R, e_J)_{(k)}&= \left( \frac{K_1}{Z^{k-2}} e_{J,(0)}, \frac{K_1}{Z^{k-1}} e_{J,(0)}, \frac{1}{Z^k} e_{J,(0)} \right) , \qquad k \ge 2 \end{aligned} \end{aligned}$$
(9.12)
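As a consistency check, the pattern (9.12) can be recovered by unwinding the rules (9.8)-(9.10) one step at a time. The following computation is only a sketch and uses nothing beyond those rules and the choices made in the base case:

$$\begin{aligned} e_{J,(k)}&= Z^{-k} e_{J,(0)}, \qquad k \ge 0, \\ e_{R,(k)}&= K_1 e_{J,(k-1)} = K_1 Z^{-(k-1)} e_{J,(0)}, \qquad k \ge 1, \\ e_{v,(k)}&= e_{R,(k-1)} = K_1 Z^{-(k-2)} e_{J,(0)}, \qquad k \ge 2. \end{aligned}$$

The remaining entries \(e_{R,(0)}\), \(e_{v,(0)}\) and \(e_{v,(1)} = e_{R,(0)}\) all equal \(K_1 e_{J,(0)}\) by the base case. In particular \(e_{v,(k)}/e_{R,(k)} \in \{1, Z\}\) and \(e_{R,(k)}/e_{J,(k)} \in \{K_1, K_1 Z\}\), so the ordering \(e_v \ge e_R \ge e_J\) persists whenever \(Z \ge 1\) and \(K_1 \ge 1\).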

The constant \(C_0\) in (9.11) is the constant C appearing in line (3.13) of the Main Lemma. Thus, \(C_0\) will depend on how we construct our energy increment functions \(e_{(k)}(t)\), which will be specified momentarily. According to the bound of line (3.13), we have that

$$\begin{aligned} e_{J,(k+1)}&= \left( \frac{e_{v,(k)}^{1/2}}{e_{R,(k)}^{1/2} N_{(k)}} \right) ^{1/2} e_{R,(k)} \end{aligned}$$
(9.13)

The iteration rules (9.8)-(9.10) are therefore achieved by taking

$$\begin{aligned} N_{(k)}&= \left( \frac{e_{v,(k)}}{e_{R,(k)}} \right) ^{1/2} \left( \frac{e_{R,(k)}}{e_{J,(k)}} \right) ^2 Z^2 \end{aligned}$$
(9.14)

More specifically, recalling (9.12),

$$\begin{aligned} \begin{aligned} N_{(0)} = K_1^2 Z^2, \quad N_{(1)} = K_1^2 Z^4, \quad N_{(k)} = K_1^2 Z^{9/2}, \qquad k \ge 2. \end{aligned} \end{aligned}$$
(9.15)
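The values in (9.15) can be checked by inserting the energy levels (9.12) into (9.14). As a worked sketch based only on (9.12)-(9.14), the two ratios entering (9.14) give

$$\begin{aligned} N_{(0)}&= (1)^{1/2} \, (K_1)^2 \, Z^2 = K_1^2 Z^2, \\ N_{(1)}&= (1)^{1/2} \, (K_1 Z)^2 \, Z^2 = K_1^2 Z^4, \\ N_{(k)}&= (Z)^{1/2} \, (K_1 Z)^2 \, Z^2 = K_1^2 Z^{9/2}, \qquad k \ge 2. \end{aligned}$$

Conversely, substituting (9.14) into the right-hand side of (9.13) recovers the rule (9.10):

$$\begin{aligned} \left( \frac{e_{v,(k)}^{1/2}}{e_{R,(k)}^{1/2} N_{(k)}} \right) ^{1/2} e_{R,(k)}&= \left( \frac{e_{J,(k)}^2}{e_{R,(k)}^2 Z^2} \right) ^{1/2} e_{R,(k)} = \frac{e_{J,(k)}}{Z}. \end{aligned}$$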

As we always have \(\left( \frac{e_{v,(k)}}{e_{R,(k)}} \right) ^{3/2} \le Z^{3/2} \le N_{(k)}\), the assumption of line (3.12) is always satisfied, so this choice of \(N_{(k)}\) is admissible. With this choice, iteration of (9.11) results in exponential growth of the frequency levels

$$\begin{aligned} Z^{2k} \Xi _{(0)} \le \Xi _{(k)}&\le C_0^{k} K_1^{2k} Z^{(9/2) k} \Xi _{(0)} \end{aligned}$$
(9.16)

for all \(k \ge 0\).
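The two-sided bound (9.16) can be read off by unrolling (9.11). The sketch below assumes \(C_0 \ge 1\) in addition to \(Z \ge 1\) and \(K_1 \ge 1\); the assumption on \(C_0\) is made here only for the purpose of this check:

$$\begin{aligned} \Xi _{(k)}&= C_0^{k} \Big ( \prod _{j = 0}^{k-1} N_{(j)} \Big ) \Xi _{(0)}, \qquad Z^2 \le K_1^2 Z^2 \le N_{(j)} \le K_1^2 Z^{9/2} \quad \text{ for } \text{ all } j \ge 0. \end{aligned}$$

By (9.15), the smallest growth factor is \(N_{(0)} = K_1^2 Z^2\) and the largest is \(K_1^2 Z^{9/2}\), so the product lies between \(Z^{2k}\) and \(K_1^{2k} Z^{(9/2)k}\), which gives both inequalities in (9.16).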

We will now specify how our sequence of energy functions \(e_{(k)}(t)\) and time intervals \(I_{(k)}\) will be chosen, beginning with stage \(k = 0\). Define

$$\begin{aligned} \hat{\tau }_{(k)} = \Xi _{(k)}^{-1} e_{v,(k)}^{-1/2} \end{aligned}$$
(9.17)

to be the natural time scale associated to these frequency energy levels. Let \(I_{(0)}\) be the time interval containing the support of the initial scalar-stress field. Let \(\eta _\epsilon (t)\) be a standard, non-negative mollifying kernel in one variable, with support in \(|t| \le \epsilon \). The initial energy function \(e_{(0)}(t)\) is required to satisfy the lower bound \(e_{(0)}(t) \ge K_0 e_{R,(0)}\) on the time interval \(I_{(0)} \pm \hat{\tau }_{(0)}\) according to (3.10), and must have a square root \(e_{(0)}^{1/2}(t)\) which satisfies bounds of the form (3.11). We construct \(e_{(0)}(t)\) by mollifying the characteristic function of \(I_{(0)}\) according to the formula

$$\begin{aligned} e_{(0)}^{1/2}(t)&= (2 K_0)^{1/2} e_{R,(0)}^{1/2} ~ \eta _{\hat{\tau }} *\chi _{I_{(0)} \pm 3 \hat{\tau }}(t) \end{aligned}$$
(9.18)

With this choice, the lower bound (3.10) and the bounds (3.11) are satisfied with \(K = K_0\) and with M being some absolute constant which arises from differentiating the mollifier. Having constructed \(e_{(0)}(t)\), we can apply Lemma 3.1 to obtain a solution \((\theta _{(1)}, u_{(1)}^l, c_{A,(1)}, R_{J,(1)}^l)\) to the Compound Scalar-Stress equation with vector \(B^l\) with support in the interval \(I_{(1)} \times {\mathbb {T}}^2\), \(I_{(1)} = I_{(0)} \pm 4 \hat{\tau }_{(0)}\).
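The lower bound claimed for \(e_{(0)}(t)\) can be verified directly from (9.18). The following sketch assumes that the mollifier is normalized so that \(\int \eta _{\hat{\tau }}(s) ds = 1\), which is the usual convention for a standard mollifying kernel, and writes \(\hat{\tau } = \hat{\tau }_{(0)}\). For \(t \in I_{(0)} \pm \hat{\tau }\) and \(|s| \le \hat{\tau }\) we have \(t - s \in I_{(0)} \pm 2 \hat{\tau } \subseteq I_{(0)} \pm 3 \hat{\tau }\), hence

$$\begin{aligned} \eta _{\hat{\tau }} *\chi _{I_{(0)} \pm 3 \hat{\tau }}(t)&= \int \eta _{\hat{\tau }}(s) \, \chi _{I_{(0)} \pm 3 \hat{\tau }}(t - s) \, ds = \int \eta _{\hat{\tau }}(s) \, ds = 1, \\ e_{(0)}(t)&= 2 K_0 \, e_{R,(0)} \ge K_0 \, e_{R,(0)}. \end{aligned}$$

By the same reasoning the convolution vanishes when t lies outside \(I_{(0)} \pm 4 \hat{\tau }\), which is consistent with the enlarged interval \(I_{(1)} = I_{(0)} \pm 4 \hat{\tau }_{(0)}\).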

We now iterate this procedure to form a sequence of scalar stress fields \((\theta _{(k)}, u_{(k)}^l, c_{A,(k)}, R_{J,(k)}^l)\) whose compound frequency energy levels obey the rules (9.8)-(9.11) by choosing \(N_{(k)}\) according to (9.15). We define

$$\begin{aligned} e_{(k)}^{1/2}(t) = (2 K_0)^{1/2} e_{J,(k)}^{1/2} ~ \eta _{\hat{\tau }} *\chi _{I_{(k)} \pm 3 \hat{\tau }}(t) \end{aligned}$$

so that the bounds on \(e_{(k)}^{1/2}\) are consistent with the bounds in (9.18), and Lemma 3.1 applies with the same constant M. According to Lemma 3.1, the time intervals containing the support of the scalar stress fields grow according to the rule

$$\begin{aligned} I_{(k+1)}&= I_{(k)} \pm \left( 4 \hat{\tau }_{(k)}\right) \end{aligned}$$
(9.19)

In (9.19) and below, we use the notation \(I \pm \delta \) to denote the \(\delta \)-neighborhood of an interval I. In other words, \(I \pm \delta = [a-\delta , b+\delta ]\) if \(I = [a,b]\). During this iteration, the vector in the scalar-stress equation alternates between \(A^l\) and \(B^l\) as in Property 1 of Claim 9.1.

We have now defined our iteration up to the choice of the parameters Y and Z. We will choose these parameters in the following Subsection to ensure that the properties listed in Claim 9.1 are all satisfied.

Verifying Claim 9.1

We now verify that the properties in Claim 9.1 hold for sufficiently large values of Y and Z.

Property 1 This property follows immediately from the construction.

Property 2 To check that the error terms converge uniformly to 0, we observe that

$$\begin{aligned} \Vert R_{J,(k)} \Vert _{C^0}&\le e_{J,(k)} = Z^{-k} e_{J,(0)} \end{aligned}$$

from (9.12), and the same type of estimate holds for \(\Vert c_{A,(k+1)} \Vert _{C^0}\). Thus, both terms composing the stress error converge uniformly to 0.

Property 3 Here we verify that the sequence \(\theta _{(n)}, u_{(n)}^l\) is Cauchy in \(C^0\). Recall that, for \(n \ge 1\) we have

$$\begin{aligned} \theta _{(n)} = \theta _{(0)} + \sum _{k = 0}^{n-1} \Theta _{(k)}, \quad u_{(n)}^l = u_{(0)}^l + \sum _{k=0}^{n-1} U_{(k)}^l \end{aligned}$$
(9.20)

where the properties of \(\Theta _{(k)}\) and \(U_{(k)}^l\) are as described in Lemma 3.1. The functions \(\theta _{(0)}\) and \(u_{(0)}^l\) are smooth with compact support, and are therefore uniformly bounded. Our estimates for the corrections have the form

$$\begin{aligned} \Vert \Theta _{(k)} \Vert _{C^0} + \Vert U_{(k)} \Vert _{C^0}&\le C e_{R,(k)}^{1/2} \end{aligned}$$
(9.21)

From (9.12), the bound \(e_{R,(k)}\) decays exponentially in k for any choice of \(Z \ge 2\), and both series in (9.20) therefore converge uniformly. In particular, as \(\theta _{(0)} = f\), we have

$$\begin{aligned} \Vert \theta _{(k)} \Vert _{C^0} + \Vert u_{(k)} \Vert _{C^0}&\le \Vert f \Vert _{C^0} + C e_{R,(0)}^{1/2}, \qquad k \ge 0 \end{aligned}$$
(9.22)

where C is proportional to the constant in Lemma 3.1. In particular, the bound (9.22) depends only on f, and does not depend on our choices of parameters Y and Z.

Property 4 We now verify that the series (9.20) also converges in \(C_{t,x}^\alpha \) once Z is chosen large enough. We claim that the bounds from Lemma 3.1 give

$$\begin{aligned} \Vert \nabla \Theta _{(k)} \Vert _{C^0} + \Vert \nabla U_{(k)} \Vert _{C^0} + \Vert \partial _t \Theta _{(k)} \Vert _{C^0} + \Vert \partial _t U_{(k)} \Vert _{C^0}&\le C N_{(k)} \Xi _{(k)} e_{R,(k)}^{1/2} \end{aligned}$$
(9.23)

The bounds on \(\Vert \nabla \Theta _{(k)} \Vert _{C^0} + \Vert \nabla U_{(k)} \Vert _{C^0} \) follow directly from Lemma 3.1. We obtain the same bound for the time derivatives by writing

$$\begin{aligned} \partial _t \Theta _{(k)}&= - u_{(k)} \cdot \nabla \Theta _{(k)} + (\partial _t + u_{(k)} \cdot \nabla )\Theta _{(k)} \end{aligned}$$
(9.24)

and similarly for \(U_{(k)}\). As we have shown in (9.22) that the sequence \(\Vert u_{(k)} \Vert _{C^0}\) is uniformly bounded by some constant, we have that the terms \(- u_{(k)} \cdot \nabla \Theta _{(k)}\) and \(- u_{(k)} \cdot \nabla U_{(k)}\) both obey the estimate (9.23). Lemma 3.1 also supplies the following bound on the advective derivative:

$$\begin{aligned} \Vert \left( \partial _t + u_{\left( k\right) } \cdot \nabla \right) \Theta _{\left( k\right) } \Vert _{C^0} + \Vert \left( \partial _t + u_{\left( k\right) } \cdot \nabla \right) U_{\left( k\right) } \Vert _{C^0}&\le C \mathbf{b}_{\left( k\right) }^{-1/2} \Xi _{\left( k\right) } e_{v,\left( k\right) }^{1/2} e_{R,\left( k\right) }^{1/2} \\ \mathbf{b}_{\mathbf{\left( k\right) }}&= N_{\left( k\right) }^{-1} (e_{v,\left( k\right) }^{1/2}/e_{R,\left( k\right) }^{1/2}) \end{aligned}$$

Note that the parameter \(N_{(k)} = K_1^2 Z^{9/2}\) and the ratio \(\frac{e_{v,(k)}^{1/2}}{e_{R,(k)}^{1/2}} = Z^{1/2}\) are both independent of k once \(k \ge 2\), while \(e_{v,(k)} = K_1 Z^{-(k-2)} e_{J,(0)}\) decays to 0 exponentially by (9.12). Thus, the estimate for the advective derivative is even better than the bound (9.23). From (9.24) we now conclude that (9.23) holds for the time derivative as well.
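To make the comparison with (9.23) explicit, one can factor the advective derivative bound as follows. This is a sketch for \(k \ge 2\), using only the definition of \(\mathbf{b}_{(k)}\) above together with (9.12) and (9.15):

$$\begin{aligned} \mathbf{b}_{(k)}^{-1/2}&= N_{(k)}^{1/2} \left( \frac{e_{v,(k)}}{e_{R,(k)}} \right) ^{-1/4} = N_{(k)}^{1/2} Z^{-1/4}, \\ \mathbf{b}_{(k)}^{-1/2} \, \Xi _{(k)} e_{v,(k)}^{1/2} e_{R,(k)}^{1/2}&= \Big ( N_{(k)}^{-1/2} Z^{-1/4} e_{v,(k)}^{1/2} \Big ) \, N_{(k)} \Xi _{(k)} e_{R,(k)}^{1/2}. \end{aligned}$$

The prefactor in parentheses is bounded uniformly in k, because \(N_{(k)}\) is constant and \(e_{v,(k)}\) is non-increasing for \(k \ge 2\), and it tends to 0 as \(k \rightarrow \infty \).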

Interpolation of (9.23) and (9.21) gives

$$\begin{aligned} \Vert \Theta _{(k)} \Vert _{C_{t,x}^\alpha } + \Vert U_{(k)} \Vert _{C_{t,x}^\alpha }&\le C [N_{(k)} \Xi _{(k)}]^\alpha e_{R,(k)}^{1/2} \end{aligned}$$
(9.25)
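The interpolation step rests on an elementary inequality, sketched here for a generic \(C^1\) function F of (t, x); the symbols F, p, q and \(\nabla _{t,x}\) (the full space-time gradient) are introduced only for this illustration. For any two points p, q and any \(0< \alpha < 1\), the bound \(\min (a,b) \le a^{1-\alpha } b^{\alpha }\) gives

$$\begin{aligned} |F(p) - F(q)|&\le \min \big ( 2 \Vert F \Vert _{C^0}, \; \Vert \nabla _{t,x} F \Vert _{C^0} |p - q| \big ) \le \big ( 2 \Vert F \Vert _{C^0} \big )^{1-\alpha } \Vert \nabla _{t,x} F \Vert _{C^0}^{\alpha } \, |p-q|^{\alpha }. \end{aligned}$$

Taking \(F = \Theta _{(k)}\) or \(F = U_{(k)}\) and inserting (9.21) and (9.23) yields a Hölder seminorm of size at most \(2 C \, [N_{(k)} \Xi _{(k)}]^{\alpha } e_{R,(k)}^{1/2}\), while the \(C^0\) part of the norm is controlled by (9.21) since \(N_{(k)} \Xi _{(k)} \ge 1\) for the large frequencies considered here.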

Applying (9.16) and (9.15), we have

$$\begin{aligned} \Vert \Theta _{(k)} \Vert _{C_{t,x}^\alpha } + \Vert U_{(k)} \Vert _{C_{t,x}^\alpha }&\le C_{Z, K_1, C_0} \Big (C_0^\alpha K_1^{2 \alpha } Z^{\left( \frac{9}{2} \alpha - \frac{1}{2}\right) } \Big )^k ~ \Big ( \Xi _{(0)}^\alpha e_{R,(0)}^{1/2} \Big ) \end{aligned}$$
(9.26)

As we have assumed \(\alpha < 1/9\), we can take Z large enough depending on \(\alpha \), \(K_1\) and \(C_0\) so that

$$\begin{aligned} Z^{\left( \frac{9}{2} \alpha - \frac{1}{2}\right) }&< (K_1^{2\alpha } C_0^\alpha )^{-1} \end{aligned}$$
(9.27)

Under this assumption, the right hand side of (9.26) tends to 0 exponentially fast as \(k \rightarrow \infty \), and it follows that the series (9.20) converges in \(C_{t,x}^\alpha \).
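For concreteness, condition (9.27) can be solved for Z. Since \(1 - 9 \alpha > 0\), the following is an equivalent reformulation of (9.27) rather than an additional requirement:

$$\begin{aligned} Z^{\frac{1 - 9 \alpha }{2}}&> K_1^{2\alpha } C_0^{\alpha } \qquad \Longleftrightarrow \qquad Z > \big ( K_1^{2} C_0 \big )^{\frac{2 \alpha }{1 - 9 \alpha }}. \end{aligned}$$

The exponent \(\frac{2\alpha }{1 - 9\alpha }\) blows up as \(\alpha \) approaches 1/9, which reflects the fact that the admissible values of Z are pushed to infinity at the endpoint regularity.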

Property 5 To bound the support of \(\theta _{(k)}\), recall that \({\mathrm {supp}}\,\theta _{(k)} \subseteq I_{(k)} \times {\mathbb {T}}^2\), where \(I_{(0)}\) is the smallest time interval containing the support of f, and the intervals \(I_{(k)}\) grow according to the rule (9.19). As a consequence, we have (in terms of the notation introduced in (9.19))

$$\begin{aligned} I_{(k)} \subseteq I_{(0)} \pm T , \qquad k = 0, 1, 2, \ldots \\ T = 4 \sum _{k = 0}^\infty \hat{\tau }_{(k)} = 4 \sum _{k = 0}^\infty \Xi _{(k)}^{-1} e_{v,(k)}^{-1/2} \end{aligned}$$

We recall now that \(e_{v,(0)} = e_{v,(1)} = e_{v,(2)} = K_1 e_{J,(0)}\) while \(e_{v,(k)}\) decays exponentially as in (9.12) for \(k \ge 2\). We also recall the lower bound in (9.16) to obtain

$$\begin{aligned} T&\le 4 \Xi _{(0)}^{-1} e_{J,(0)}^{-1/2} \left( 2 + \frac{1}{1 - (C_0 Z^{5/2} )^{-1}} \right) \end{aligned}$$
(9.28)

Recalling the definition (9.7) of \(\Xi _{(0)}\), and noting that \(C_0 \ge 2\) and \(Z \ge 1\), we have

$$\begin{aligned} T&\le 16 Y^{-1} \overline{\Xi }^{-1} e_{J,(0)}^{-1/2} \end{aligned}$$
(9.29)

Property 5 is satisfied for the \(\epsilon > 0\) given in (5) once Y is chosen sufficiently large to ensure \(T < \epsilon \).

Property 6 For a smooth test function \(\phi \) with compact support, we have

$$\begin{aligned} \int _{{\mathbb {R}}\times {\mathbb {T}}^2} ( \theta - f ) \phi (t,x) dt dx&= \int _{{\mathbb {R}}\times {\mathbb {T}}^2} ( \theta - \theta _{(0)} ) \phi (t,x) dt dx \end{aligned}$$
(9.30)
$$\begin{aligned}&= \sum _{k = 0}^\infty \int \Theta _{(k)} \phi (t,x) dt dx \end{aligned}$$
(9.31)

According to Lemma 3.1, we can write the corrections in divergence form \(\Theta _{(k)} = \partial _l W_{(k)}^l \) for some vector fields \(W_{(k)}^l\) obeying the estimates (3.18)-(3.20). Integrating by parts, we have

$$\begin{aligned} \sum _{k = 0}^\infty \int \Theta _{(k)} \phi (t,x) dt dx&= - \sum _{k = 0}^\infty \int W_{(k)}^l \partial _l \phi (t,x) dt dx \end{aligned}$$
(9.32)

Recalling (9.12), (9.16) and the definition (9.7) of \(\Xi _{(0)}\), we obtain

$$\begin{aligned} \Big |\int _{{\mathbb {R}}\times {\mathbb {T}}^2} \left( \theta - f \right) \phi \left( t,x\right) dt dx \Big |&\le \sum _{k = 0}^\infty \Vert W_{\left( k\right) } \Vert _{C^0} \Vert \nabla \phi \Vert _{L^1_{t,x}\left( I_\epsilon \times {\mathbb {T}}^2\right) } \end{aligned}$$
(9.33)
$$\begin{aligned}&\le C \left( \sum _{k = 0}^\infty \frac{1}{N_{\left( k\right) }\Xi _{\left( k\right) }} e_{R,\left( k\right) }^{1/2} \right) \Vert \nabla \phi \Vert _{L^1_{t,x}\left( I_\epsilon \times {\mathbb {T}}^2\right) } \end{aligned}$$
(9.34)
$$\begin{aligned}&= C \left( \sum _{k = 0}^\infty \frac{C_0}{\Xi _{\left( k+1\right) }} e_{R,\left( k\right) }^{1/2} \right) \Vert \nabla \phi \Vert _{L^1_{t,x}\left( I_\epsilon \times {\mathbb {T}}^2\right) } \end{aligned}$$
(9.35)
$$\begin{aligned}&\le C \frac{C_0}{\Xi _{\left( 1\right) }} e_{R,\left( 0\right) }^{1/2} \Vert \nabla \phi \Vert _{L^1_{t,x}\left( I_\epsilon \times {\mathbb {T}}^2\right) } \end{aligned}$$
(9.36)
$$\begin{aligned}&\le C \frac{1}{Z^2} \Xi _{\left( 0\right) }^{-1} e_{J,\left( 0\right) }^{1/2} \Vert \nabla \phi \Vert _{L^1_{t,x}\left( I_\epsilon \times {\mathbb {T}}^2\right) } \end{aligned}$$
(9.37)
$$\begin{aligned}&\le C \frac{1}{Z^2 Y} \overline{\Xi }^{-1} e_{J,\left( 0\right) }^{1/2} \Vert \nabla \phi \Vert _{L^1_{t,x}\left( I_\epsilon \times {\mathbb {T}}^2\right) } \end{aligned}$$
(9.38)

where \(C_0\) above denotes the constant in the Main Lemma. Taking Y (or alternatively Z) to be large enough depending on C, \(\overline{\Xi }\) and \(e_{J,(0)}\), we obtain (9.4). This choice concludes the proof of Theorem 9.1.
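The passage from (9.35) to (9.37) is a geometric series estimate, which can be sketched as follows under the assumptions \(C_0 \ge 1\), \(K_1 \ge 1\) and \(Z \ge 2\) (these are stated here only to fix the constants). By (9.11) and (9.15) we have \(\Xi _{(k+1)} \ge Z^{2k} \Xi _{(1)}\), and by (9.12) we have \(e_{R,(k)} \le e_{R,(0)}\), so

$$\begin{aligned} \sum _{k = 0}^\infty \frac{C_0}{\Xi _{(k+1)}} e_{R,(k)}^{1/2}&\le \frac{C_0}{\Xi _{(1)}} e_{R,(0)}^{1/2} \sum _{k = 0}^\infty Z^{-2k} = \frac{C_0}{\Xi _{(1)}} e_{R,(0)}^{1/2} \, \frac{1}{1 - Z^{-2}} \le \frac{4}{3} \, \frac{C_0}{\Xi _{(1)}} e_{R,(0)}^{1/2}, \\ \frac{C_0}{\Xi _{(1)}} e_{R,(0)}^{1/2}&= \frac{K_1^{1/2} e_{J,(0)}^{1/2}}{N_{(0)} \Xi _{(0)}} = \frac{e_{J,(0)}^{1/2}}{K_1^{3/2} Z^2 \, \Xi _{(0)}} \le \frac{1}{Z^2} \Xi _{(0)}^{-1} e_{J,(0)}^{1/2}. \end{aligned}$$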

Proof of Corollary 1.1

In this Section, we explain how Theorem 1.1 (or alternatively Theorem 9.1) can be applied to prove Corollary 1.1, which characterizes the closure of compactly supported solutions to the active scalar equations in the space \(L^\infty \) endowed with the weak-* topology.

Proof of Corollary 1.1

As in the statement of Theorem 1.1, consider an Active Scalar Equation (1.1) with a smooth multiplier that is not odd. Let \(I \subseteq {\mathbb {R}}\) be a nonempty, finite, open interval. Let \(\alpha < 1/9\) and let \(S \subseteq L^\infty \) denote the set of all weak solutions \((\theta , u^l)\) to the Active Scalar equation (1.1) which have compact support in \(I \times {\mathbb {T}}^2\), and which belong to the Hölder class \((\theta , u^l) \in C_{t,x}^\alpha \). Let \(\overline{S}\) denote the closure of S in \(L^\infty \) with respect to the weak-* topology. Corollary 1.1 asserts that \(\overline{S}\) is equal to the space of \(f \in L^\infty (I \times {\mathbb {T}}^2)\) which satisfy the conservation law \(\int f(t,x) dx = 0\) as a distribution in the variable t. In other words, we assume that for every smooth test function \(\eta (t) : I \rightarrow {\mathbb {R}}\) with compact support, we have

$$\begin{aligned} \int _{I \times {\mathbb {T}}^2} \eta (t) f(t,x) dt dx&= 0 \end{aligned}$$
(10.1)

First, observe that any \(f \in L^\infty \) which belongs to \(\overline{S}\) must satisfy (10.1) for all such \(\eta (t)\), since the integration against \(\eta (t)\) is continuous with respect to the weak-* topology, and because equality (10.1) is satisfied by all of the elements \((\theta , u^l) \in S\). This conservation law is proven for each \((\theta , u^l) \in S\) by writing the test function in (10.1) as \(\eta = \tilde{\eta } + ( \eta - \tilde{\eta })\), where \(\tilde{\eta }\) is a smooth function whose support is disjoint from that of \((\theta , u^l)\) and which satisfies

$$\begin{aligned} \int _I \tilde{\eta }(t) dt = \int _I \eta (t) dt \end{aligned}$$

This condition allows us to write \(\eta - \tilde{\eta } = \frac{d}{dt} h(t)\), where h(t) is smooth and compactly supported in I. The definition of weak solution for (1.1) then implies

$$\begin{aligned} \int _{I \times {\mathbb {T}}^2} \eta (t) \theta (t,x) dt dx&= \int _{I \times {\mathbb {T}}^2} \tilde{\eta }(t) \theta (t,x) dt dx + \int _{I \times {\mathbb {T}}^2} (\eta - \tilde{\eta })(t) \theta (t,x) dt dx \\&= \int \frac{\partial }{\partial t}h(t) \theta (t,x) dt dx \\&= - \int u^l \frac{\partial }{\partial x^l} h(t) ~ \theta (t,x) dt dx = 0 \end{aligned}$$
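The antiderivative h used in this computation can be written down explicitly. As a sketch, write \(I = (a,b)\) (notation introduced here for illustration) and assume, as the argument implicitly requires, that both \(\eta \) and \(\tilde{\eta }\) are compactly supported in I. Then one may take

$$\begin{aligned} h(t)&= \int _a^t \big ( \eta (s) - \tilde{\eta }(s) \big ) \, ds, \qquad h'(t) = \eta (t) - \tilde{\eta }(t). \end{aligned}$$

The function h vanishes near a because the integrand does, and it vanishes near b because \(\int _I (\eta - \tilde{\eta }) \, dt = 0\). Since h depends only on t, we have \(\partial _l h = 0\), which is why the final integral in the display above is zero, while the term involving \(\tilde{\eta }\) vanishes because of the disjoint supports.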

We now show conversely that every \(f \in L^\infty \) satisfying (10.1) belongs to \(\overline{S}\). Let us assume by contradiction that \(f \notin \overline{S}\). By definition of the weak-* topology, there exists a finite collection \(\{ \eta _1, \ldots , \eta _m \} \subseteq L^1(I \times {\mathbb {T}}^2)\) and \(\delta > 0\) such that for all \(\theta \in S\) the lower bound

$$\begin{aligned} \Big | \int (f(t,x) - \theta (t,x)) \eta _j(t,x) dt dx \Big |&\ge \delta \end{aligned}$$
(10.2)

holds for at least one \(\eta _j \in \{ \eta _1, \ldots , \eta _m \}\).

Let \(\tilde{f} \in L^\infty (I \times {\mathbb {T}}^2)\) be a smooth function of compact support with \(\Vert \tilde{f} \Vert _{L^\infty } \le \Vert f\Vert _{L^\infty }\) such that property (10.1) holds for \(\tilde{f}\) and such that, for each \(\eta _j\), we have the bound

$$\begin{aligned} \Big | \int (f(t,x) - \tilde{f}(t,x)) \eta _j(t,x) dt dx \Big |&\le \delta /4 \end{aligned}$$
(10.3)

Such a function \(\tilde{f}\) can be constructed by first applying a smooth cutoff in time to restrict to a compact subset of \(I \times {\mathbb {T}}^2\), and then convolving with a mollifier in time and space, noting that both operations preserve the property (10.1) without enlarging the \(L^\infty \) norm. Inequality (10.3) is established by duality, as the adjoint cutoff and mollifier operations converge strongly in \(L^1\) when applied to each \(\eta _j\). We choose the mollification in such a way that the support of \(\tilde{f}\) remains inside a time interval \(\tilde{I}\) strictly smaller than I with \(\tilde{I} \pm \tau \subseteq I\) for some \(\tau > 0\).

Now apply Theorem 9.1 for the function \(\tilde{f}\) with \(\epsilon = 1/n\) to obtain a sequence \((\theta _n, u_n^l) \in S\) such that the bound \(\Vert \theta _n \Vert _{L^\infty } \le A\) holds uniformly, and (9.4) holds for \(\tilde{f}\) with \(\epsilon = 1/n\). We assume here that \(n \ge \tau ^{-1}\) is large enough to ensure \((\theta _n, u_n^l)\) have compact support in \(I \times {\mathbb {T}}^2\) thanks to the compact support of \(\tilde{f}\) and (9.2). Now let \(\tilde{\eta }_j\) be smooth functions of compact support in \(I \times {\mathbb {T}}^2\) with \(\Vert \eta _j- \tilde{\eta }_j \Vert _{L^1(I \times {\mathbb {T}}^2)} \le \frac{\delta }{5(\Vert f\Vert _{L^\infty } + A)}\). We obtain an upper bound on the left hand side of (10.2)

$$\begin{aligned} \Big | \int (f(t,x) - \theta _n(t,x)) \eta _j(t,x) dt dx \Big |&\le \frac{\delta }{4} + \Big | \int ( \tilde{f} - \theta _n) \eta _j dt dx \Big | \\&\le \frac{\delta }{4} + \frac{\delta }{5} + \Big | \int ( \tilde{f} - \theta _n) \tilde{\eta }_j dt dx \Big | \\&\le \frac{\delta }{4} + \frac{\delta }{5} + \frac{1}{n} \Vert \nabla \tilde{\eta }_j \Vert _{L^1_{t,x}} \end{aligned}$$

Taking n large enough contradicts inequality (10.2) and concludes the proof. \(\square \)

Proof of Theorem 1.2

In this section, we outline how Lemma 3.1 can also be applied to yield Theorem 1.2. The proof follows an idea of [31, Section 12]. The same argument below also shows that one can glue any two solutions which have the same integral.

Let \(\theta \) be a smooth solution of (1.1) on \((-T, T) \times {\mathbb {T}}^2\), with multiplier \(m^l\) which is not odd. Let \(\bar{\theta } = \frac{1}{|{\mathbb {T}}^2|} \int _{{\mathbb {T}}^2} \theta (0,x) dx\) be the average value of \(\theta \), which is conserved by \(\theta \) along the flow. Let \(\psi (t)\) be a smooth cutoff function, equal to 1 on \(|t| \le \frac{5T}{8}\) and equal to 0 for \(|t| \ge \frac{6T}{8} = \frac{3T}{4}\).

Now consider the scalar field \(\theta _{(0)}(t,x) = \psi (t) \theta (t,x) + (1-\psi (t) ) \bar{\theta }\). Then \(\theta _{(0)}\) is an integral-conserving scalar field (i.e. \(\int _{{\mathbb {T}}^2} \partial _t \theta _{(0)} dx = \frac{d}{dt} \int _{{\mathbb {T}}^2} \theta _{(0)} dx = 0\)), and therefore solves the scalar stress equation

$$\begin{aligned} \partial _t \theta _{(0)} + \partial _l( \theta _{(0)} T^l[\theta _{(0)}])&= \partial _l R^l \end{aligned}$$
(11.1)
$$\begin{aligned} R^j&= \partial ^j \Delta ^{-1} [ \partial _t \theta _{(0)} + \partial _l( \theta _{(0)} T^l[\theta _{(0)}])] \end{aligned}$$
(11.2)

Note also that, because both \(\theta \) and \(\bar{\theta }\) are solutions to (1.1), the support of \(R^l\) is contained in the support of \(\psi '(t)\), namely

$$\begin{aligned} {\mathrm {supp}}\,R^l(t,x) \subseteq \{ \frac{5T}{8} \le |t| \le \frac{6T}{8}\} \times {\mathbb {T}}^2 \end{aligned}$$

Repeating the argument of Sections 9.1-9.3, we can now iterate Lemma 3.1 to obtain a sequence of solutions \(\theta _{(k)}\) to the compound scalar stress equation, such that

$$\begin{aligned} {\mathrm {supp}}\,\left( \theta _{(k)} - \theta _{(0)}\right) \subseteq \{ \frac{T}{2} \le |t| \le \frac{4T}{5}\} \times {\mathbb {T}}^2 \end{aligned}$$

for all indices \(k \ge 0\), and such that \(\theta _{(k)} \rightarrow \tilde{\theta }\) in \(C_{t,x}^\alpha \), where \(\tilde{\theta }\) is a solution of (1.1). At this point, the main difference in the argument is that we choose energy functions \(e_{(k)}(t)\) which are supported within pairs of intervals containing a small neighborhood of \(\{ \frac{5T}{8} \le |t| \le \frac{6T}{8} \}\). (In fact, the argument is simpler at this point because we do not need to achieve a weak approximation, and hence there is no need to introduce the parameter Y.) As we can take these intervals of support to form an arbitrarily small neighborhood of \(\{ \frac{5T}{8} \le |t| \le \frac{6T}{8} \}\), we can keep the support of the iteration contained within \(\{ \frac{T}{2} \le |t| \le \frac{4T}{5} \}\), and thereby obtain Theorem 1.2.

Proof of Weak Rigidity for Odd Active Scalars

In this section we give the proof of Theorem 1.3. Let \(\theta \in \{\theta ^n \}_{n \ge 0}\) be a weak solution to (1.1), with associated velocity field \(u^l = T^l[\theta ]\). Also let \(\phi \in {\mathcal {D}}(I \times {\mathbb {T}}^d)\) be a fixed test function. The proof of the theorem is based on the following computation. For each fixed time t, let

$$\begin{aligned} N_t[\theta ,\phi ] = \int _{{\mathbb {T}}^d} \theta (t,x) \; u^l(t,x) \;\partial _l \phi (t,x) \;dx \end{aligned}$$
(12.1)

denote the nonlinear term integrated over the time t slice. Since \(T^l\) is given by a Fourier multiplier, it commutes with differentiation, and upon integrating by parts several times we obtain

$$\begin{aligned} N_t[\theta ,\phi ]&= \int _{{\mathbb {T}}^d} \partial ^k \Delta ^{-1} \partial _k \theta \; \partial ^j T^l[ \Delta ^{-1} \partial _j\theta ] \; \partial _l \phi \; dx \nonumber \\&= - \int _{{\mathbb {T}}^d} \Delta ^{-1} \partial _k \theta \; \partial ^k \partial ^j T^l[ \Delta ^{-1} \partial _j \theta ] \; \partial _l \phi \; dx\nonumber \\&\quad - \int _{{\mathbb {T}}^d} \Delta ^{-1} \partial _k \theta \; \partial ^j T^l[ \Delta ^{-1}\partial _j \theta ] \; \partial ^k\partial _l \phi \; dx\nonumber \\&= \int _{{\mathbb {T}}^d} \partial _k \Delta ^{-1} \partial ^j\theta \; \partial ^k T^l[ \Delta ^{-1} \partial _j \theta ] \; \partial _l \phi \; dx \nonumber \\&\quad + \int _{{\mathbb {T}}^d} \Delta ^{-1} \partial _k \theta \; \partial ^k T^l[ \Delta ^{-1} \partial _j \theta ] \; \partial ^j \partial _l \phi \; dx\nonumber \\&\quad - \int _{{\mathbb {T}}^d} \Delta ^{-1} \partial _k \theta \; \partial ^j T^l[ \Delta ^{-1}\partial _j \theta ] \; \partial ^k\partial _l \phi \; dx\nonumber \\&= - \int _{{\mathbb {T}}^d} \Delta ^{-1} \partial ^j\theta \; \partial _j T^l[ \theta ] \; \partial _l \phi \; dx\nonumber \\&\quad - \int _{{\mathbb {T}}^d} \Delta ^{-1} \partial ^j\theta \; \partial ^k T^l[ \Delta ^{-1} \partial _j \theta ] \; \partial _k \partial _l \phi \; dx\nonumber \\&\quad + \int _{{\mathbb {T}}^d} \Delta ^{-1} \partial _k \theta \; \partial ^k T^l[ \Delta ^{-1} \partial _j \theta ] \; \partial ^j \partial _l \phi \; dx\nonumber \\&\quad - \int _{{\mathbb {T}}^d} \Delta ^{-1} \partial _k \theta \; \partial ^j T^l[ \Delta ^{-1}\partial _j \theta ] \; \partial ^k\partial _l \phi \; dx. \end{aligned}$$
(12.2)

At this stage we use that the Fourier multiplier \(m^l(\xi )\) is odd in \(\xi \), which implies that \(\partial _j T^l\), given by the Fourier multiplier \(i \xi _j m^l(\xi )\) which is even in \(\xi \), is self-adjoint in \(L^2({\mathbb {T}}^d)\). We may thus write

$$\begin{aligned}&\int _{{\mathbb {T}}^d} \Delta ^{-1} \partial ^j\theta \; \partial _j T^l[ \theta ] \; \partial _l \phi \; dx \nonumber \\&\quad = \int _{{\mathbb {T}}^d} \theta \; \partial _j T^l[ \Delta ^{-1} \partial ^j\theta \; \partial _l \phi ] \; dx \nonumber \\&\quad = \int _{{\mathbb {T}}^d} \theta \; \partial _j T^l[ \Delta ^{-1} \partial ^j\theta ] \; \partial _l \phi \; dx \nonumber \\&\qquad + \int _{{\mathbb {T}}^d} \theta ( \partial _j T^l[ \Delta ^{-1} \partial ^j\theta \; \partial _l \phi ] - \partial _j T^l[ \Delta ^{-1} \partial ^j\theta ] \; \partial _l \phi ) dx \nonumber \\&\quad = N_t[\theta ,\phi ] + \int _{{\mathbb {T}}^d} \theta \; [\partial _j T^l, \partial _l \phi ] \Delta ^{-1} \partial ^j \theta \; dx. \end{aligned}$$
(12.3)
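The self-adjointness used in the first equality of (12.3) can be verified on the Fourier side. The following is a sketch for functions g, h on \({\mathbb {T}}^d\), where the symbols g, h, \(\sigma \) and the normalizing constant \(c_d\) in Parseval's identity are introduced only for this illustration. Writing \(\sigma (\xi ) = i \xi _j m^l(\xi )\) for the multiplier of \(\partial _j T^l\), we have

$$\begin{aligned} \int _{{\mathbb {T}}^d} \partial _j T^l[g] \; h \; dx&= c_d \sum _{\xi } \sigma (\xi ) \hat{g}(\xi ) \hat{h}(-\xi ) = c_d \sum _{\xi } \sigma (-\xi ) \hat{g}(-\xi ) \hat{h}(\xi ) \\&= c_d \sum _{\xi } \sigma (\xi ) \hat{h}(\xi ) \hat{g}(-\xi ) = \int _{{\mathbb {T}}^d} g \; \partial _j T^l[h] \; dx. \end{aligned}$$

The second equality is the change of variables \(\xi \mapsto -\xi \), and the third uses \(\sigma (-\xi ) = \sigma (\xi )\), which holds because both \(\xi _j\) and \(m^l(\xi )\) are odd. For a multiplier \(m^l\) that is not odd this symmetry fails, which is where the present argument breaks down.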

Combining (12.2) and (12.3) we arrive at

$$\begin{aligned} 2 N_t[\theta ,\phi ]&= - \int _{{\mathbb {T}}^d} \theta \; [\partial _j T^l, \partial _l \phi ] \Delta ^{-1} \partial ^j \theta \; dx \nonumber \\&\quad - \int _{{\mathbb {T}}^d} \Delta ^{-1} \partial ^j\theta \; \partial ^k T^l[ \Delta ^{-1} \partial _j \theta ] \; \partial _k \partial _l \phi \; dx\nonumber \\&\quad + \int _{{\mathbb {T}}^d} \Delta ^{-1} \partial _k \theta \; \partial ^k T^l[ \Delta ^{-1} \partial _j \theta ] \; \partial ^j \partial _l \phi \; dx \nonumber \\&\quad - \int _{{\mathbb {T}}^d} \Delta ^{-1} \partial _k \theta \; \partial ^j T^l[ \Delta ^{-1}\partial _j \theta ] \; \partial ^k\partial _l \phi \; dx . \end{aligned}$$
(12.4)

From the Hölder inequality, and the bounds

$$\begin{aligned} \Vert \nabla T \Delta ^{-1} \nabla \eta \Vert _{L^2}&\le C \Vert \eta \Vert _{L^2},&\text{ for } \eta \in L^2({\mathbb {T}}^d) \end{aligned}$$
(12.5)
$$\begin{aligned} \Vert \big [ \nabla T^l, \partial _l \phi \big ] \eta \Vert _{L^2}&\le C \Vert \eta \Vert _{L^2} \Vert \phi \Vert _{H^{d/2 + 2 +\epsilon }},&\text{ for } \eta \in \dot{H}^1({\mathbb {T}}^d) \end{aligned}$$
(12.6)

we thus obtain from (12.4) that

$$\begin{aligned} |N_t[\theta ,\phi ]| \le C \Vert \theta (t, \cdot )\Vert _{L^2} \Vert \Delta ^{-1} \nabla \theta (t,\cdot )\Vert _{L^2} \Vert \phi (t,\cdot ) \Vert _{H^{d/2+2+\epsilon }} \end{aligned}$$
(12.7)

for any \(\epsilon >0\). The above estimate is a manifestation of the compactness inherent in \(N_t\) in the spatial variables.
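For the reader's convenience, here is a sketch of how the four terms of (12.4) lead to (12.7). Indices are suppressed, and the Sobolev embedding \(H^{d/2 + \epsilon }({\mathbb {T}}^d) \subset L^\infty ({\mathbb {T}}^d)\) is used to bound \(\Vert \nabla ^2 \phi \Vert _{L^\infty } \le C \Vert \phi \Vert _{H^{d/2+2+\epsilon }}\):

$$\begin{aligned} \Big | \int _{{\mathbb {T}}^d} \theta \; [\partial _j T^l, \partial _l \phi ] \Delta ^{-1} \partial ^j \theta \; dx \Big |&\le \Vert \theta \Vert _{L^2} \, \Vert [ \nabla T^l, \partial _l \phi ] \Delta ^{-1} \nabla \theta \Vert _{L^2} \le C \Vert \theta \Vert _{L^2} \Vert \Delta ^{-1} \nabla \theta \Vert _{L^2} \Vert \phi \Vert _{H^{d/2+2+\epsilon }}, \\ \Big | \int _{{\mathbb {T}}^d} \Delta ^{-1} \nabla \theta \; \nabla T[ \Delta ^{-1} \nabla \theta ] \; \nabla ^2 \phi \; dx \Big |&\le \Vert \Delta ^{-1} \nabla \theta \Vert _{L^2} \, \Vert \nabla T \Delta ^{-1} \nabla \theta \Vert _{L^2} \, \Vert \nabla ^2 \phi \Vert _{L^\infty } \le C \Vert \Delta ^{-1} \nabla \theta \Vert _{L^2} \Vert \theta \Vert _{L^2} \Vert \phi \Vert _{H^{d/2+2+\epsilon }}. \end{aligned}$$

The first line applies the Cauchy-Schwarz inequality and (12.6) with \(\eta = \Delta ^{-1} \nabla \theta \); the second line, which covers the remaining three terms of (12.4), applies the Hölder inequality and (12.5) with \(\eta = \theta \).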

Since we have only assumed \(\theta \in L^p(I; L^2({\mathbb {T}}^d))\), compactness in the time variable must come from the active scalar equation. Below we give two essentially equivalent approaches to obtaining this compactness. The first proof is based on a variant of the Arzelà-Ascoli principle due to Aubin-Lions. The second proof is a more direct argument in the spirit of [32], using Littlewood-Paley theory to extract regularity in time.

Time Compactness via Aubin–Lions Compactness Lemma. At this stage we notice that for any weak solution \(\theta \in L^p(I; L^2({\mathbb {T}}^d))\) of (1.1), and any index j, we have that

$$\begin{aligned} \partial _t (\Delta ^{-1} \partial _j \theta ) = \Delta ^{-1} \partial _j \partial _l (\theta \; T^l[\theta ]) \end{aligned}$$
(12.8)

holds in the sense of distributions, and thus for any \(s > d/2\), we have

$$\begin{aligned} \Vert \partial _t (\Delta ^{-1} \partial _j \theta )\Vert _{L^{p/2}(I;H^{-s}({\mathbb {T}}^d))}&= \Vert \Delta ^{-1} \partial _j \partial _l (\theta \; T^l[\theta ]) \Vert _{ L^{p/2}(I;H^{-s}({\mathbb {T}}^d))} \nonumber \\&\le C \Vert \Delta ^{-1} \partial _j \partial _l (\theta \; T^l[\theta ]) \Vert _{L^{p/2}(I;L^1({\mathbb {T}}^d))} \nonumber \\&\le C \Vert \theta \Vert _{L^p(I;L^2({\mathbb {T}}^d))}^2 \end{aligned}$$
(12.9)

in view of the compact embedding of \(W^{s,1}({\mathbb {T}}^d) \subset L^2({\mathbb {T}}^d)\), for functions of zero mean on \({\mathbb {T}}^d\).
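The quadratic dependence on \(\Vert \theta \Vert _{L^p(I;L^2)}\) and the time exponent p/2 in (12.9) come from the product structure of the nonlinearity. As a sketch, assuming that \(T^l\) is bounded on \(L^2({\mathbb {T}}^d)\) (which holds because the multiplier \(m^l\) is homogeneous of degree 0 and smooth away from the origin, hence bounded), the Cauchy-Schwarz inequality in space gives

$$\begin{aligned} \Vert \theta \; T^l[\theta ] \Vert _{L^{p/2}(I; L^1({\mathbb {T}}^d))}&\le \Big \Vert \, \Vert \theta (t,\cdot ) \Vert _{L^2} \Vert T^l[\theta ](t,\cdot ) \Vert _{L^2} \Big \Vert _{L^{p/2}(I)} \le C \Big \Vert \, \Vert \theta (t,\cdot ) \Vert _{L^2}^2 \Big \Vert _{L^{p/2}(I)} = C \Vert \theta \Vert _{L^p(I; L^2({\mathbb {T}}^d))}^2. \end{aligned}$$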

Now assume that

$$\begin{aligned} \theta ^n \rightharpoonup f \in L^p\big (I; L^2({\mathbb {T}}^d)\big ) \end{aligned}$$
(12.10)

for some \(p>2\). The convergence of the mean

$$\begin{aligned} \int _{{\mathbb {T}}^d} \theta ^n \; dx \rightarrow \int _{{\mathbb {T}}^d} f \; dx \qquad \text{ in } {\mathcal {D}}'(I) \end{aligned}$$

is immediate. In view of the Sobolev embedding and (12.9), by (12.10) we have that

$$\begin{aligned}&\Delta ^{-1} \nabla \theta ^n \text{ is } \text{ uniformly } \text{ bounded } \text{ in } L^p\big (I; H^1({\mathbb {T}}^d)\big )\end{aligned}$$
(12.11)
$$\begin{aligned}&\partial _t \big (\Delta ^{-1} \nabla \theta ^n\big ) \text{ is } \text{ uniformly } \text{ bounded } \text{ in } L^{p/2}\big (I;H^{-s}({\mathbb {T}}^d)\big ), \end{aligned}$$
(12.12)

where \(s> d/2\). Therefore, applying the Aubin-Lions compactness lemma (see e.g. [15, Lemma 8.4]), we obtain that there is a subsequence \(\{\theta ^{n_j}\}\) such that

$$\begin{aligned} \Delta ^{-1} \nabla \theta ^{n_j} \rightarrow \Delta ^{-1} \nabla f \in L^p\big (I; L^2({\mathbb {T}}^d)\big ), \end{aligned}$$
(12.13)

i.e. the convergence is strong. To conclude, we integrate (12.4) in time, use (12.7) and (12.13), and obtain that

$$\begin{aligned} \int _I \int _{{\mathbb {T}}^d} \theta ^{n_j} \; T^l\left[ \theta ^{n_j}\right] \; \partial _l \phi dx dt \rightarrow \int _I \int _{{\mathbb {T}}^d} f\; T^l[f]\; \partial _l \phi dx dt \end{aligned}$$

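for any test function \(\phi \), since the product of a strongly convergent sequence and a weakly convergent sequence converges weakly to the product of the limits. Since the limit does not depend on the choice of subsequence, the same argument applied to an arbitrary subsequence shows that the convergence also holds along the original sequence.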

Time Compactness via Littlewood–Paley Theory. We now give a more direct proof which illustrates the usefulness of Littlewood-Paley theory in extracting time regularity.

Let \(P_{\le I} \theta \), \(P_I \theta \) and \(P_{[a,b]} \theta \) denote the standard Littlewood-Paley projection operators. Thus,

$$\begin{aligned} \widehat{P_{\le I} \theta }(\xi ) = \eta ( 2^{-I} \xi ) \hat{\theta }(\xi ), \quad I = 0, 1, 2, \ldots \end{aligned}$$

is a truncation of \(\hat{\theta }\) to the frequencies \(\{ |\xi | \le 2^{I+1} \}\), where \(\eta \) is a smooth cutoff supported in \(|\xi | \le 2\) with \(\eta (\xi ) = 1\) for \(|\xi | \le 5/4\). We let \(P_I = P_{\le I} - P_{\le I-1} \) denote the Littlewood-Paley piece, which occupies the frequencies \({\mathrm {supp}}\,\widehat{P_I \theta } \subseteq \{ 2^{I-1} \le |\xi | \le 2^{I+1} \}\), and we write \(P_{[a,b]} = \sum _{a \le I \le b} P_I\).

Now let \(\phi \in C_0^\infty (I \times {\mathbb {T}}^d)\) be a smooth test function, and let \(\theta ^n\) be a sequence of solutions to (1.1) converging weakly, \(\theta ^n \rightharpoonup f\), in \(L^p(I; L^2({\mathbb {T}}^d))\) for some \(p > 2\) as in (12.10). Let \(N[\theta , \phi ] = \int _{\mathbb {R}}N_t[\theta , \phi ] dt = \int _{\mathbb {R}}\int _{{\mathbb {T}}^d} \theta u^l \partial _l \phi ~dx dt\) denote the full nonlinear term.

We claim that \(N[\theta ^n, \phi ] \rightarrow N[f, \phi ]\). By a standard approximation argument, we may assume that \(\hat{\phi }\) has compact support, with \({\mathrm {supp}}\,\hat{\phi } \subseteq \{ |\xi | \le 2^{r-1} \}\) for some \(r \ge 0\). In this case, for all \(\theta \in \{ \theta ^n \}_{n \ge 0}\), we decompose the nonlinear term (12.1) into dyadic frequency increments

$$\begin{aligned} N[\theta , \phi ]&= N[P_{\le r} \theta , \phi ] + \sum _{I = r}^\infty \delta N_I[ \theta , \phi ] \end{aligned}$$
(12.14)
$$\begin{aligned} \delta N_I[ \theta , \phi ]&= N[P_{\le I+1} \theta , \phi ] - N[P_{\le I} \theta , \phi ] \end{aligned}$$
(12.15)
$$\begin{aligned}&= \int _{\mathbb {R}}\int _{{\mathbb {T}}^d} P_{I+1} \theta P_{\le I+1} u^l \partial _l \phi dx dt\nonumber \\&\quad + \int _{\mathbb {R}}\int _{{\mathbb {T}}^d} P_{\le I} \theta P_{I+1} u^l \partial _l \phi dx dt \end{aligned}$$
(12.16)
$$\begin{aligned}&=\int _{\mathbb {R}}\int _{{\mathbb {T}}^d} P_{I+1} \theta P_{[ I - r, I + r]} u^l \partial _l \phi dx dt \nonumber \\&\quad + \int _{\mathbb {R}}\int _{{\mathbb {T}}^d} P_{[I - r, I + r]} \theta P_{I+1} u^l \partial _l \phi dx dt. \end{aligned}$$
(12.17)

In the last line we took advantage of the compact support of \(\hat{\phi }\) for convenience. Using the commutator formulation (12.4), each \(\delta N_I\) decomposes into several terms of the type

$$\begin{aligned} \delta N_I\left[ \theta ,\phi \right]&= \int _{\mathbb {R}}\int _{{\mathbb {T}}^d} \Delta ^{-1} \partial _k P_{I+1} \theta ~\partial ^k T^l\left[ \Delta ^{-1} \partial _j P_{\left[ I-r, I+r\right] } \theta \right] \partial ^j \partial _l \phi ~dx dt \\&\quad + \text{ other } \text{ similar } \text{ terms } \end{aligned}$$

From (12.6) and \(\Vert \Delta ^{-1} \nabla P_I \theta \Vert _{L_x^2} \le C 2^{-I} \Vert \theta \Vert _{L_x^2}\), each \(\delta N_I\) is bounded by

$$\begin{aligned} | \delta N_I[\theta , \phi ] |&\le C_\phi 2^{-I} \Vert \theta \Vert _{L_{t,x}^2}^2 \end{aligned}$$
(12.18)

for some constant \(C_\phi \) depending on \(\phi \).

We now show that the weak convergence (12.10) can in fact be upgraded to uniform convergence for each dyadic piece \(P_I \theta ^n \rightarrow P_I f\), which implies the convergence of each term \(\delta N_I[\theta ^n , \phi ] \rightarrow \delta N_I[f, \phi ]\). The uniform convergence is obtained by compactness. We start with the bounds

$$\begin{aligned} \Vert P_I \theta ^n \Vert _{L_t^p L_x^\infty } + \Vert \nabla P_I \theta ^n \Vert _{L_t^p L_x^\infty } \le C_I \Vert \theta ^n \Vert _{L_t^p L_x^2} \end{aligned}$$

Applying \(P_I\) to (1.1), the equation \(\partial _t P_I \theta = - \partial _l P_I[ \theta u^l]\) gives regularity in time

$$\begin{aligned} \Vert \partial _t P_I \theta ^n \Vert _{L_t^{p/2} L_x^\infty } \le C_I \Vert \theta ^n \Vert _{L_t^p L_x^2}^2 \end{aligned}$$

As we have assumed \(p > 2\) and have bounds on \(\Vert \theta ^n \Vert _{L_t^p L_x^2}^2\) that are uniform in n from (12.10), it follows by Sobolev embedding that, for each I, the sequence \(P_I \theta ^n\) is Hölder continuous in time and space, uniformly in n. By Arzelà-Ascoli, there exists a uniformly convergent subsequence \(P_I \theta ^{n_j}\) for each I. By the weak convergence (12.10), the limit of any such subsequence must be \(P_I f\). Since every subsequence therefore has a further subsequence converging uniformly to \(P_I f\), the original sequence \(P_I \theta ^n \rightarrow P_I f\) converges uniformly.
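The Hölder regularity in time can be made explicit. The following is a minimal sketch, writing \(g = P_I \theta ^n\) (our notation) and taking two times \(s < t\) in I; it uses only the fundamental theorem of calculus and Hölder's inequality in time with exponents \(p/2\) and \(p/(p-2)\).

$$\begin{aligned} | g(t,x) - g(s,x) |&\le \int _s^t \Vert \partial _\tau g(\tau , \cdot ) \Vert _{L_x^\infty } d\tau \le |t - s|^{1 - 2/p} \, \Vert \partial _t g \Vert _{L_t^{p/2} L_x^\infty }. \end{aligned}$$

The exponent \(1 - 2/p\) is positive precisely when \(p > 2\), which is why this assumption is needed for the Arzelà-Ascoli argument.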

It now follows that \(\delta N_I[\theta ^n , \phi ] \rightarrow \delta N_I[f, \phi ]\) for each index I and that \(N[P_{\le r} \theta ^n, \phi ] \rightarrow N[P_{\le r} f, \phi ]\). We also have the estimate (12.18), so the convergence of \(N[\theta ^n, \phi ] \rightarrow N[f, \phi ]\) follows from the dominated convergence theorem applied to (12.14).
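The dominated convergence step can be written as a tail estimate. This is a sketch under stated assumptions: we set \(K = \sup _n \Vert \theta ^n \Vert _{L_{t,x}^2}\) on the compact time support of \(\phi \) (finite by (12.10), since \(p > 2\)), we assume that (12.18) also applies to f with the same K, and M is a truncation index of our choosing.

$$\begin{aligned} | N[\theta ^n, \phi ] - N[f, \phi ] |&\le | N[P_{\le r} \theta ^n, \phi ] - N[P_{\le r} f, \phi ] | + \sum _{I \le M} | \delta N_I[\theta ^n, \phi ] - \delta N_I[f, \phi ] | \\&\quad + 2 C_\phi K^2 \sum _{I > M} 2^{-I}. \end{aligned}$$

For fixed M, the first two contributions tend to 0 as \(n \rightarrow \infty \) because only finitely many terms are involved, while the last equals \(2 C_\phi K^2 2^{-M}\) and can be made arbitrarily small by taking M large.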

We remark that the same two arguments can be upgraded to prove compactness of solutions when we only assume weak convergence in \(L_t^{p}H_x^s\) for some \(p > 2\) and \(s > -1/2\). The main difference involves using the commutator formulation (12.4) to obtain an estimate for the time derivative from the lower regularity in space.

Conservation of the Hamiltonian for Odd Active Scalars

In this Section we give the proof of Theorem 1.4. Recall that the symbol of the Fourier multiplier L defined in (1.3), which defines the Hamiltonian, is given by

$$\begin{aligned} \hat{L}(\xi ) = |\xi |^{-2} \left( i \xi _2 m_1(\xi ) - i \xi _1 m_2(\xi ) \right) \end{aligned}$$
(13.1)

with the convention that \(\hat{L}(0)=0\). Since we are in two spatial dimensions and \(\xi \cdot m(\xi ) = 0\) for all nonzero vectors \(\xi \), automatically we must have that

$$\begin{aligned} m(\xi ) = i \xi ^\perp |\xi |^{-1} \ell (\xi ) \end{aligned}$$
(13.2)

for some even, zero-order homogeneous, smooth on the unit sphere, real-valued scalar function \(\ell (\xi )\). The fact that \(\ell (\xi ) \in {\mathbb {R}}\) follows from the fact that \(\overline{\ell (\xi )} = \ell (-\xi ) = \ell (\xi )\). In the case of the SQG equation, \(\ell (\xi ) =1\).
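For the reader's convenience, here is the computation obtained by substituting (13.2) into (13.1). It assumes the convention \(\xi ^\perp = \langle -\xi _2, \xi _1 \rangle \), which is the one appearing in the SQG symbol of the introduction, so that \(m_1 = -i \xi _2 |\xi |^{-1} \ell \) and \(m_2 = i \xi _1 |\xi |^{-1} \ell \).

$$\begin{aligned} i \xi _2 m_1(\xi ) - i \xi _1 m_2(\xi )&= i \xi _2 \left( - i \xi _2 |\xi |^{-1} \ell (\xi ) \right) - i \xi _1 \left( i \xi _1 |\xi |^{-1} \ell (\xi ) \right) \\&= \left( \xi _2^2 + \xi _1^2 \right) |\xi |^{-1} \ell (\xi ) = |\xi | \, \ell (\xi ). \end{aligned}$$

Multiplying by \(|\xi |^{-2}\) gives \(\hat{L}(\xi ) = |\xi |^{-1} \ell (\xi )\).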

In summary, we have that

$$\begin{aligned} \hat{L}(\xi ) = |\xi |^{-1} \ell (\xi ) \end{aligned}$$
(13.3)

which reiterates that L is a self-adjoint operator, which is smoothing of degree \(-1\) when \(\ell (\xi )\) is nonvanishing on the unit sphere. The Hamiltonian then is

$$\begin{aligned} H(t) = \int _{{\mathbb {T}}^2} \theta (t,x) L\theta (t,x) dx \end{aligned}$$
(13.4)

or equivalently, in view of Plancherel’s theorem,

$$\begin{aligned} H(t) = \sum _{k \in {\mathbb {Z}}^2_*} |\hat{\theta }(t,k)|^2 |k|^{-1} \ell (k). \end{aligned}$$
(13.5)

Let \(\phi _\epsilon (x) = \epsilon ^{-2} \phi (x/\epsilon )\) be a standard mollifier on \({\mathbb {T}}^2\), and denote

$$\begin{aligned} \cdot _\epsilon = \cdot *\phi _\epsilon , \qquad \cdot _{\epsilon ,\epsilon } = \cdot *\phi _\epsilon *\phi _\epsilon . \end{aligned}$$

The conservation of the Hamiltonian H for solutions of (1.1) is implied by establishing that \(\frac{d}{dt} H(t) = 0\) as a distribution in t. Namely, we show that

$$\begin{aligned} \lim _{\epsilon \rightarrow 0}\int _{{\mathbb {R}}} \int _{{\mathbb {T}}^2} \eta '(t) \theta _\epsilon (t,x) L \theta _\epsilon (t,x) dx dt = 0 \end{aligned}$$
(13.6)

holds for every smooth function \(\eta (t)\) which is supported in I. Note that since mollification \(*\phi _\epsilon \) is given by a Fourier multiplier, it commutes with L.

Considering the test function \(\eta (t) L \theta _{\epsilon ,\epsilon }\) in the weak formulation of (1.1), we arrive at

$$\begin{aligned} \int _{{\mathbb {R}}} \int _{{\mathbb {T}}^2} \theta \partial _t (\eta L\theta _{\epsilon ,\epsilon }) dx dt + \int _{{\mathbb {R}}}\int _{{\mathbb {T}}^2} \theta u^l \partial _l (\eta L\theta _{\epsilon ,\epsilon }) dx dt = 0, \end{aligned}$$
(13.7)

for every \(\epsilon >0\). Strictly speaking, the above test function is not smooth in time, but this restriction can be ignored after a time mollification argument, as in the proof of [33, Theorem 2.2]. The first term in (13.7) may now be rewritten as

$$\begin{aligned} \int _{{\mathbb {R}}} \int _{{\mathbb {T}}^2} \theta \partial _t (\eta L\theta _{\epsilon ,\epsilon }) dx dt&= \int _{{\mathbb {R}}} \int _{{\mathbb {T}}^2} \theta _{\epsilon } \partial _t \left( \eta L\theta _{\epsilon }\right) dx dt \nonumber \\&= \int _{{\mathbb {R}}} \int _{{\mathbb {T}}^2} \theta _{\epsilon } \eta ' L\theta _{\epsilon } dx dt + \int _{{\mathbb {R}}} \int _{{\mathbb {T}}^2} \theta _{\epsilon } \eta L\partial _t \theta _{\epsilon } dx dt \nonumber \\&= \int _{{\mathbb {R}}} \int _{{\mathbb {T}}^2} \theta _{\epsilon } \eta ' L\theta _{\epsilon } dx dt + \int _{{\mathbb {R}}} \int _{{\mathbb {T}}^2} L \theta _{\epsilon } \eta \partial _t\theta _{\epsilon } dx dt \nonumber \\&= \int _{{\mathbb {R}}} \int _{{\mathbb {T}}^2} \theta _{\epsilon } \eta ' L\theta _{\epsilon } dx dt - \int _{{\mathbb {R}}} \int _{{\mathbb {T}}^2}\partial _t \left( L \theta _{\epsilon ,\epsilon } \eta \right) \theta dx dt. \end{aligned}$$
(13.8)

Combining the above with (13.7) we see that establishing (13.6) is equivalent to establishing

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} \int _{{\mathbb {R}}} \eta \int _{{\mathbb {T}}^2} \left( \theta u^l\right) _{\epsilon } \partial _l L\theta _{\epsilon } dx dt = 0. \end{aligned}$$
(13.9)

for \(u^l = T^l[\theta ]\).
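The equivalence can be spelled out as follows. This is a sketch that assumes, as is implicitly used in (13.8), that the mollifier \(\phi _\epsilon \) is even, so that one factor of \(*\phi _\epsilon \) can be moved from one side of the pairing to the other. Since the last term in (13.8) is the negative of the left hand side, (13.8) shows that the left hand side of (13.7) equals half of the integral in (13.6), and therefore

$$\begin{aligned} \int _{{\mathbb {R}}} \int _{{\mathbb {T}}^2} \eta ' \theta _\epsilon L \theta _\epsilon dx dt&= 2 \int _{{\mathbb {R}}} \int _{{\mathbb {T}}^2} \theta \partial _t (\eta L\theta _{\epsilon ,\epsilon }) dx dt \\&= - 2 \int _{{\mathbb {R}}}\int _{{\mathbb {T}}^2} \theta u^l \partial _l (\eta L\theta _{\epsilon ,\epsilon }) dx dt \\&= - 2 \int _{{\mathbb {R}}} \eta \int _{{\mathbb {T}}^2} \left( \theta u^l\right) _{\epsilon } \partial _l L\theta _{\epsilon } dx dt. \end{aligned}$$

The second equality is (13.7), and the third uses that \(\eta \) depends only on t.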

Up to this point, we have presented the proof of conservation of H(t) analogously to the proof of energy conservation for Euler in the Onsager critical Besov space \(L_t^3 B_{1/3, c({\mathbb {N}})}^3\) of [10], but the remaining analysis turns out to be less subtle. In particular, there is no need for a quadratic commutator estimate as in [16] (and the mollification above could also be simpler).

To proceed, we view the cubic term on the left hand side of (13.9) as the diagonal part of a family of trilinear operators

$$\begin{aligned} Q_\epsilon [\theta _{(1)}, \theta _{(2)}, \theta _{(3)}]&=\int _{{\mathbb {R}}} \eta \int _{{\mathbb {T}}^2} (\theta _{(1)} T^l[\theta _{(2)}])_{\epsilon } (\partial _l L\theta _{(3)})_{\epsilon } dx dt \end{aligned}$$
(13.10)

In this notation, equation (13.9) asks to show \(\lim _{\epsilon \rightarrow 0} Q_\epsilon [\theta , \theta ,\theta ] = 0\) for all \(\theta \in L_{t,x}^3\). Observe first that the operators \(Q_\epsilon \) satisfy the bound

$$\begin{aligned} |Q_\epsilon [ \theta _{(1)}, \theta _{(2)}, \theta _{(3)} ]|&\le \Vert \theta _{(1)} \Vert _{L_{t,x}^3} \cdot \Vert T[\theta _{(2)}] \Vert _{L_{t,x}^3} \cdot \Vert \nabla L [\theta _{(3)}] \Vert _{L_{t,x}^3} \nonumber \\&\le C \Vert \theta _{(1)} \Vert _{L_{t,x}^3} \cdot \Vert \theta _{(2)} \Vert _{L_{t,x}^3} \cdot \Vert \theta _{(3)} \Vert _{L_{t,x}^3} \end{aligned}$$
(13.11)

This bound follows from the fact that both operators T and

$$\begin{aligned} \nabla L = \nabla (-\Delta )^{-1/2} \big (R_2 T^1 - R_1 T^2\big ) \end{aligned}$$
(13.12)

are bounded as operators from \(L^3_{x}\) to itself, thanks to the smoothness and degree 0 homogeneity of m.

Because the operators \(Q_\epsilon \) are trilinear, and the bound (13.11) they satisfy is uniform in \(\epsilon \), it suffices to prove (13.9) under the additional assumption that \(\theta \) is smooth with compact support by the density of such functions in \(L_{t,x}^3\). Assuming now that \(\theta \) is smooth, we may pass \(\epsilon \) to 0 in (13.9), and it remains to show that

$$\begin{aligned} \int _{{\mathbb {R}}} \eta \int _{{\mathbb {T}}^2} \theta T^l[\theta ] \partial _l L\theta dx dt = 0. \end{aligned}$$

At this stage we recall that

$$\begin{aligned} u^l = T^l [\theta ] = \left( \partial ^l\right) ^\perp L\theta , \end{aligned}$$
(13.13)

which may be seen on the Fourier side from (13.2) and (13.3). As a result, we have

$$\begin{aligned} \int _{{\mathbb {T}}^2} \theta u^l \; \partial _l L\theta dx = \int _{{\mathbb {T}}^2} \theta \left( \partial ^l\right) ^\perp L \theta \; \partial _l L\theta dx = 0 \end{aligned}$$
(13.14)

which concludes the proof.
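The vanishing in (13.14) is in fact pointwise. Writing \(\psi = L\theta \) (our notation) and using the convention \(\nabla ^\perp = \langle -\partial _2, \partial _1 \rangle \) that matches \(\xi ^\perp = \langle -\xi _2, \xi _1 \rangle \), one has

$$\begin{aligned} \left( \partial ^l\right) ^\perp \psi \; \partial _l \psi&= (-\partial _2 \psi ) (\partial _1 \psi ) + (\partial _1 \psi ) (\partial _2 \psi ) = 0, \end{aligned}$$

so the integrand in (13.14) is identically zero for smooth \(\theta \), whatever the value of \(\theta \) itself.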

Constraints on Weak Limits of Degenerate Active Scalars in Higher Dimensions

In this Section, we give a proof of Theorem 1.6, which shows that the nondegeneracy condition in Theorem 1.5 is necessary for the weak limit statement of Theorem 1.1.

Throughout, we assume that there is a nonzero frequency \(\xi _{(0)} \in \widehat{{\mathbb {T}}^n} \setminus \{ 0\} = {\mathbb {Z}}_*^n\) in the dual lattice such that the image of the even part of the multiplier is contained in

$$\begin{aligned} \left\{ m(\xi ) + m(-\xi ) ~|~ \xi \in \widehat{{\mathbb {R}}^n} \right\} \subseteq \langle \xi _{(0)} \rangle ^{\perp } \end{aligned}$$
(14.1)

In this case, we have the following restriction on weak limits of solutions to the active scalar equation, which bears resemblance to a new conservation law.

Lemma 14.1

Consider the active scalar equation (1.1) on \(I \times {\mathbb {T}}^n\) and suppose that the image of the even part of the multiplier is contained in the hyperplane (14.1). Let \(T_{0}^l\) denote the Fourier multiplier with symbol

$$\begin{aligned} \widehat{T_0^l[\theta ]}(\xi ) = \frac{1}{2}\left( m^l(\xi ) - m^l\left( -\xi \right) \right) \hat{\theta }(\xi ) \end{aligned}$$

Suppose that \(\phi \in C_0^\infty (I \times {\mathbb {T}}^n)\) has the property that its spatial gradient takes values in the direction \(\xi _{(0)}\)

$$\begin{aligned} \nabla \phi (t,x)&\in \langle \xi _{(0)} \rangle \end{aligned}$$
(14.2)

Suppose that \(f \in L^\infty (I \times {\mathbb {T}}^n)\) can be realized as a weak-* limit \(\theta _{(k)} \rightharpoonup f\) in \(L^\infty \) of some sequence of solutions \(\theta _{(k)}\) to (1.1). Then

$$\begin{aligned} \int _{I \times {\mathbb {T}}^n} f \partial _t \phi + f T_0^l[f] \partial _l \phi dx dt&= 0 \end{aligned}$$
(14.3)

Proof

Consider the sequence of solutions \(\theta _{(k)}\) to (1.1) converging to f in the \(L^\infty \) weak-* topology. Decompose the operator \(T^l\) as \(T^l = T_0^l + T_e^l\), where the term \(T_e^l\) of the operator is the Fourier multiplier with symbol

$$\begin{aligned} \widehat{T_e^l[\theta ]}(\xi )&= \frac{1}{2}\big ( m^l(\xi ) + m^l(-\xi ) \big ) \hat{\theta }(\xi ) \end{aligned}$$
(14.4)

By equation (1.1), we have for all indices k that

$$\begin{aligned} \int _{I \times {\mathbb {T}}^n} ( \theta _{(k)} \partial _t \phi + \theta _{(k)} T_0^l[\theta _{(k)}] \partial _l \phi ) dx dt = - \int _{I \times {\mathbb {T}}^n} \theta _{(k)} T_e^l[\theta _{(k)}] \partial _l \phi dx dt = 0 \end{aligned}$$

by the condition (14.1). By the proof of Theorem 1.3, the nonlinear term is continuous with respect to weak-* limits in \(L^\infty \) when restricted to active scalar fields, giving (14.3). To make this conclusion, it is important to note that, in the proof of compactness for the nonlinear term, it was not important that the operator in the nonlinear term was identical to the operator appearing in the active scalar equation. The proof used only the oddness of the multiplier in the nonlinear term, and certain time regularity estimates from the active scalar equation coming from the boundedness properties of the operator in the equation. \(\square \)
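The vanishing of the even term in the proof above can be checked as follows. This is a sketch in which we write \(\nabla \phi = a \, \xi _{(0)}\) for a scalar function \(a(t,x)\) (the name a is ours), which is possible by (14.2). Then \(T_e^l[\theta ] \partial _l \phi = a \, \xi _{(0), l} T_e^l[\theta ]\), and the Fourier coefficients of the second factor at each nonzero frequency k are

$$\begin{aligned} \xi _{(0), l} \, \widehat{T_e^l[\theta ]}(k)&= \frac{1}{2} \, \xi _{(0)} \cdot \big ( m(k) + m(-k) \big ) \hat{\theta }(k) = 0 \end{aligned}$$

by (14.1). Hence \(\theta \, T_e^l[\theta ] \partial _l \phi \) vanishes identically, not merely after integration.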

Assuming that the normal \(\xi _{(0)}\) to the hyperplane containing the image of the even part of m lies in the dual lattice \(\widehat{{\mathbb {T}}^n}\), it is now not hard to design a test function \(\phi \) obeying (14.2) and an integral-conserving function f which fails to satisfy (14.3). As a first attempt, we can let \(\zeta (t)\) be a smooth cutoff in time, and take

$$\begin{aligned} \phi (t,x)&= \zeta (t) \cos \left( \xi _{(0)} \cdot x\right) \end{aligned}$$
(14.5)
$$\begin{aligned} f(t,x)&= \zeta '(t) \cos \left( \xi _{(0)} \cdot x\right) \end{aligned}$$
(14.6)

Then (14.2) is satisfied, and the linear term

$$\begin{aligned} \int f(t,x) \partial _t \phi (t,x) dx dt&= \int \left( \zeta '(t)\right) ^2 \cos ^2\left( \xi _{(0)} \cdot x\right) dx dt \end{aligned}$$
(14.7)

is strictly positive.

The positivity of (14.7) does not necessarily imply the failure of (14.3). However, if the equality (14.3) holds for this function f, then (14.3) cannot hold for the function 2f, because the linear term (which is positive by (14.7)) scales linearly, whereas the quadratic term scales quadratically. Thus, at least one of f or 2f fails to satisfy (14.3), and we have Theorem 1.6.
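The scaling argument can be written out explicitly. In the following sketch, the abbreviations A and B are ours: \(A = \int f \partial _t \phi \, dx dt\) denotes the linear term, which is positive by (14.7), and \(B = \int f T_0^l[f] \partial _l \phi \, dx dt\) denotes the quadratic term, so that (14.3) for f reads \(A + B = 0\). Replacing f by 2f and assuming \(A + B = 0\) gives

$$\begin{aligned} \int _{I \times {\mathbb {T}}^n} (2f) \partial _t \phi + (2f) T_0^l[2f] \partial _l \phi \, dx dt&= 2A + 4B = 2A - 4A = -2A < 0, \end{aligned}$$

so (14.3) fails for 2f whenever it holds for f.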

Concluding Discussion

Active scalar equations arise naturally in fluid dynamics in several asymptotic regimes, and as model equations for the full fluid systems. The problem of constructing active scalar fields for which the energy \(\Vert \theta \Vert _{L^2_x}\) fails to be conserved is a natural generalization of Onsager’s conjecture for the Euler equations. This problem, however, encounters several additional difficulties when compared to Euler. Most importantly, a suitable analogue of Beltrami flows, which provide an essential ingredient for obtaining regularity up to 1/5 in the case of Euler, is unavailable.

For active scalars with multipliers that are not odd, we obtain nonuniqueness of weak solutions and even h-principles among integral-conserving functions for weak solutions with Hölder regularity up to 1/9 (Theorem 1.1, Theorem 1.2, and Corollary 1.1). Our proof is based on the observation that the interference terms which arise due to self-interactions between individual waves must vanish to leading order. This observation allows for an approach in the spirit of the isometric embedding equations, where we eliminate one component of the error in each stage of the iteration using one-dimensional oscillations. Our observation is general, and applies in arbitrary dimensions even to the case of the Euler equations, giving a new approach to solutions in that case as well. However, our inability to remove more than one component of the error leads to further losses in regularity.

These results, however, should not be expected for multipliers which are odd. For odd symbols, the Hamiltonian is conserved at the level of \(\theta \in L^3_{t,x}\) (Theorem 1.4), and the nonlinearity exhibits a weak rigidity which makes it impossible to obtain an h-principle type result (Theorem 1.3). In higher dimensions, the presence of conservation laws and other rigidity properties of weak solutions can even be sensitive to more subtle algebraic properties of the multiplier, and our method applies in a generality which is essentially optimal (Theorems 1.5 and 1.6).

Several related questions remain open. Part of our proof does not apply to the nonperiodic setting and some new idea could be required to produce nonperiodic solutions (currently even \(L^\infty _{t,x}\) solutions have not been constructed in this case). Other significant questions include

  1. In the case of SQG, exhibit a weak solution \(\theta \in L^p_t L^2_x\), that does not conserve energy.

  2. In the case of SQG, exhibit a weak solution \(\theta \in C^0_{x,t}\) that does not conserve energy, but does conserve the Hamiltonian.

  3. In the case of IPM, or more generally for not odd symbols, exhibit weak solutions \(\theta \in C^{\alpha }_{t,x}\), with \(\alpha \in (1/9,1/3)\) that do not conserve energy.

We believe that answering these questions may shed some light on the field of two dimensional turbulence.

Finally, further sharpening of approaches which do not rely on the use of Beltrami flows may prove useful in resolving Onsager’s conjecture. The current approaches introduce anomalous time scales in the construction which are incompatible with the time regularity bounds held by more regular solutions. Although our construction shares this deficiency, the cancellation of self-interference terms that lies at the heart of our proof is a general observation that arises from the structure of the equations and remains available even at longer time scales. It is important to investigate whether further, more dynamical methods of construction can be developed.