1 Introduction

For a fixed, finite time horizon \(T > 0\) and a velocity field \(b: [0,T] \times \mathbb {R}^d \rightarrow \mathbb {R}^d\), we study the linear transport equation

$$\begin{aligned} \partial _t u + b(t,x) \cdot \nabla u = 0 \quad \text {in } [0,T] \times \mathbb {R}^d, \end{aligned}$$
(1.1)

along with the dual, continuity equation

$$\begin{aligned} \partial _t f + {\text {div}}( b(t,x) f) = 0 \quad \text {in } [0,T] \times \mathbb {R}^d, \end{aligned}$$
(1.2)

and the associated ordinary differential equation (ODE) flow

$$\begin{aligned} \partial _t \phi _{t,s}(x) = b(t,\phi _{t,s}(x)), \quad (s,t,x) \in [0,T] \times [0,T] \times \mathbb {R}^d, \quad \phi _{s,s} = {\text {Id}}. \end{aligned}$$
(1.3)

The goal of the paper is to analyze the three problems, and the relations between them, for vector fields b satisfying the one-sided Lipschitz condition

$$\begin{aligned} \left\{ \begin{aligned}&(b(t,x) - b(t,y)) \cdot (x-y)\\&\qquad \qquad \ge -C(t)|x-y|^2 \quad \text {for a.e. } (t,x,y) \in [0,T] \times \mathbb {R}^d \times \mathbb {R}^d\\&\text {for some nonnegative }C \in L^1([0,T]). \end{aligned} \right. \end{aligned}$$
(1.4)

When b is Lipschitz continuous in the space variable, the ODE flow (1.3) admits a unique global solution, and, through the method of characteristics, (1.1) and (1.2) are uniquely solved for any given smooth initial or terminal data. Moreover, the flow is a diffeomorphism, and therefore the solution operators for either the initial value problem (IVP) or terminal value problem (TVP) for (1.1) and (1.2) are continuous on \(L^p_\textrm{loc}\) for any \(p \in [1,\infty ]\).

Under the assumption (1.4), the time direction plays a nontrivial role, and there is a fundamental difference between the solvability of the flow (1.3) forward versus backward in time. Indeed, b need not even be continuous, and (1.4) is equivalent to

$$\begin{aligned} \frac{\nabla b(t,\cdot ) + \nabla b(t,\cdot )^T}{2} \ge -C(t){\text {Id}}\quad \text {in the sense of distributions.} \end{aligned}$$

In particular, the distribution \({\text {div}}b\) is a signed measure that is bounded from below, but not in general absolutely continuous with respect to Lebesgue measure. Thus, when \(t < s\), the flow (1.3) is expected to concentrate on sets of Lebesgue measure zero, while vacuum may form when \(t > s\).
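A one-dimensional instance, revisited as a running example later in the paper, makes the singular divergence concrete:

```latex
% b(x) = sgn(x) satisfies (1.4) with C = 0, since (sgn x - sgn y)(x - y) >= 0,
% while its distributional derivative is a singular measure:
\[
  b(x) = \operatorname{sgn} x
  \qquad\Longrightarrow\qquad
  \operatorname{div} b = b' = 2\delta_0 \ge 0,
\]
% a nonnegative measure with an atom at the origin, hence not absolutely
% continuous with respect to Lebesgue measure; the backward flow for this
% field concentrates an entire interval onto the origin.
```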

A general study of transport equations and ODEs with irregular velocity fields, motivated by nonlinear problems in fluid dynamics, was initiated by DiPerna and the first author [40], who introduced the notion of renormalized solutions to prove the well-posedness for (1.1) and (1.2) and the almost-everywhere solvability of the flow (1.3) for b with Sobolev regularity. The DiPerna–Lions theory was extended to equations where only \(\textrm{Sym}(\nabla b) \in L^1\) [28], to Vlasov equations with \(BV_\textrm{loc}\) velocity fields [20], and to two-dimensional problems with a Hamiltonian structure [2,3,4, 21, 45]. Using deep results from geometric measure theory, the renormalization property was extended to the very general case where \(b \in BV_\textrm{loc}\) and \({\text {div}}b \in L^1\) by Ambrosio [5], who also provided a new, measure-theoretic viewpoint on the relationship between uniqueness of nonnegative solutions of (1.2) and the unique solvability of the flow (1.3) through the idea of superposition. Further developments include equations with velocity fields having a particular structure allowing for less regularity [31, 50] and velocity fields belonging to SBD (i.e. \(\textrm{Sym}(\nabla b)\) is a signed measure with no singular Cantor-like part) [12]. Fine regularity properties of DiPerna-Lions flows were established in [13, 36], and the study of so-called “nearly incompressible flows” [14] led to the resolution by Bianchini and Bonicatto [19] of Bressan’s compactness conjecture [26, 27]; see also [47] for related results. For many more details and references, we refer the reader to the surveys [6,7,8, 10].

In the majority of these works, the divergence \({\text {div}}b\) is assumed to be bounded, or at least absolutely continuous with respect to Lebesgue measure. This is not the case in general for velocity fields satisfying (1.4), and so the equations (1.1) and (1.2) cannot even be interpreted in the sense of distributions, because the products \(({\text {div}}b) u\) and bf are ill-defined for general \(u \in L^1_\textrm{loc}\) or measures f. The DiPerna-Lions theory does not, therefore, cover this situation. Moreover, the choice of an appropriate function space of solutions is very sensitive to whether the equations are posed as initial or terminal value problems.

The problems (1.1)–(1.3) for velocity fields with a one-sided Lipschitz condition have been approached with a variety of methods [22, 25, 29, 33, 59,60,61], a primary motivation being the study of pressureless gases and scalar conservation laws, which, when posed as nonlinear transport equations, involve velocity fields whose divergence is not absolutely continuous [23, 24, 43, 44]. Our main purpose is to complement these works, and in particular the theory of Bouchut, James, and Mancini [25], by providing complete characterizations of the stable solutions to all three problems in both the compressive and expansive regimes. We also provide some results on the corresponding parabolic equations with a degenerate, second-order term, as well as the SDE analogue of (1.3) for both the velocity field b and \(-b\).

1.1 Main Results

We relegate a full description of the results, discussions, and examples to the body of the paper. Here, we briefly outline the different sections and the types of results proved within them, and we compare them to the existing literature.

1.1.1 The Compressive Regime

In Sect. 2, we record properties of the backward Filippov flow for (1.3), as well as for its Jacobian \(J_{t,s}(x) {:}{=} \det (\nabla \phi _{t,s}(x))\), which is well-defined in \(L^\infty \) for a.e. \(t \le s\) and \(x \in \mathbb {R}^d\). We employ measure-theoretic arguments to make sense of the right-inverse of the flow in an almost-everywhere sense, as a preliminary step to understanding the forward, regular Lagrangian flow, and prove several properties, the most important of which is its almost-everywhere continuity.

In Sect. 3, we turn to the study of the nonconservative equation

$$\begin{aligned} \partial _t u - b(t,x) \cdot \nabla u = 0 \quad \text {in } (0,T) \times \mathbb {R}^d, \quad u(T,\cdot ) = u_T, \end{aligned}$$
(1.5)

for which the uniqueness of continuous distributional solutions fails in general. We introduce a new PDE characterization of the “good” (stable) solution of (1.5) as the unique viscosity solution, in the sense of Crandall, Ishii, and the first author [35]. This is done by proving a comparison principle for sub and supersolutions. The viscosity solution characterization coincides with the selection of “good” solutions by other authors in particular settings [22, 25, 33, 59,60,61], allows for robust stability statements, and, moreover, generalizes to the setting of degenerate parabolic problems (see the discussion below).

The “usual” viscosity solution theory must be modified due to the lack of global continuity of b. In view of the evolution nature of the equations, the \(L^1\)-dependence in time does not present a problem, and the equations can be treated with the methods of [46, 55, 57, 58]. To deal with the discontinuity of b in space, sub and supersolutions must be defined with appropriate semicontinuous envelopes of b in the space variable. The direction of the one-sided Lipschitz assumption (1.4) accounts for the beneficial inequalities in the proof of the comparison principle.

The nonuniqueness of distributional solutions is explored through examples of the form \(b(x) = {\text {sgn}}\, x\, |x|^\alpha \). We also introduce further conditions on the velocity field b and terminal data \(u_T\) that ensure uniqueness within the class of continuous distributional solutions. In particular, the interplay between the regularity of b and \(u_T\) plays an important role: if \(b \in C^\alpha \) and \(u_T \in C^\beta \), then distributional solutions are unique if \(\alpha + \beta > 1\), while uniqueness may fail in general if \(\alpha + \beta \le 1\), as can be seen from our counterexamples.

The latter half of Sect. 3 deals with the study of the dual problem to (1.5), namely,

$$\begin{aligned} \partial _t f - {\text {div}}( b(t,x) f) = 0 \quad \text {in }(0,T) \times \mathbb {R}^d, \quad f(0,\cdot ) = f_0. \end{aligned}$$
(1.6)

Even if \(f_0 \in L^1_\textrm{loc}\), the concentrative nature of the flow causes the measure \(f(t,\cdot )\) to develop a singular part, and therefore we are led to seek measure-valued solutions. This prevents the duality solution of (1.6) from being understood in the distributional sense, due to the lack of continuity of b. Nevertheless, we prove that, if b is continuous, or if it happens that \(f(t,\cdot )\) is absolutely continuous with respect to Lebesgue measure on the time interval [0, T], then the notions of duality and distributional solutions are equivalent.

An important feature of the continuity equation (1.6) is the failure of renormalization; that is, if f is a duality solution, the measure |f| may fail to be a distributional solution, and may even violate conservation of mass. We once again study examples of the form \(b(x) = {\text {sgn}}\, x\, |x|^\alpha \), \(0 \le \alpha < 1\). Note that, for this example, when \(0< \alpha < 1\), b has the Sobolev regularity \(b \in W^{1,p}_\textrm{loc}\) for \(p < \frac{1}{1-\alpha }\), and so our counterexample is constructed to ensure that the duality solution f satisfies \(f \in L^q\) only for q outside the range for which the DiPerna-Lions commutator lemma holds. This contrast with the DiPerna-Lions theory is a direct consequence of the compressive nature of the backward flow, which can lead to cancellation of the positive and negative parts of f. A related phenomenon is the nonuniqueness of distributional solutions of the continuity equation (1.2) with the reverse sign (see below).

1.1.2 The Expansive Regime

In Sect. 4, we reverse the sign on the velocity field, and study the corresponding problems

$$\begin{aligned} \partial _t u + b(t,x) \cdot \nabla u = 0 \quad \text {in } (0,T) \times \mathbb {R}^d, \quad u(T,\cdot ) = u_T \end{aligned}$$
(1.7)

and

$$\begin{aligned} \partial _t f + {\text {div}}(b(t,x) f) = 0 \quad \text {in }(0,T) \times \mathbb {R}^d, \quad f(0,\cdot ) = f_0. \end{aligned}$$
(1.8)

In view of the lower bound on the divergence of b, we are motivated to seek an \(L^p\)-based theory for both equations, based on a priori estimates, or equivalently, on the fact that the characteristic flow (the forward ODE (1.3)) does not concentrate on sets of measure zero.

The initial value problem for the continuity equation (1.8) was studied in [22, 25], where a large part of the analysis is based on the fact that locally integrable distributional solutions are not unique in general. The same setting is studied in [29], where the existence and uniqueness of the forward Filippov flow for (1.3) is established for a.e. \(x \in \mathbb {R}^d\).

In the first part of Sect. 4, we identify a unique “good” distributional solution, and prove that the resulting solution operator is continuous on \(L^p_\textrm{loc}\) for all \(p \in [1,\infty ]\), and stable with respect to regularizations. This coincides with the notion of reversible solution in [22, 25].

We then obtain strong stability results for the Bouchut–James–Mancini duality solutions of the nonconservative problem (1.7) in all \(L^p\)-spaces, which allow us to prove the renormalization property. Moreover, we introduce a PDE characterization of this duality solution in terms of regularization by \(\mathrm{ess\,inf}\)- and \(\mathrm{ess\,sup}\)-convolution. An important ingredient in establishing this characterization is the propagation of almost-everywhere continuity, which, in turn, follows from the renormalization property and the almost-everywhere continuity of the forward flow proved in Sect. 2.

As a consequence of this new characterization, we give a PDE-based proof of the fact that nonnegative distributional \(L^p\)-solutions of (1.8) are unique, which was established in [29] using the superposition principle. This result, along with the renormalization property for (1.7), allows us to establish the existence, uniqueness, and stability of the forward regular Lagrangian flow for the ODE (1.3) identified in [29]. As a byproduct, this also provides a full characterization of the Bouchut-James-Mancini notion of “good” (reversible) solution as the pushforward of \(f_0\) by the forward flow. Moreover, a distributional solution f is a reversible solution if and only if |f| is also a distributional solution (cf. [25, Proposition 3.12], which operates under the criterion that f be a so-called “Jacobian” solution).

1.1.3 SDEs and Second Order Equations

This paper also contains various results regarding second order versions of (1.1) and (1.2), as well as stochastic differential equation (SDE) flows. SDEs and degenerate second-order Fokker-Planck equations have been studied from many perspectives, using both the DiPerna-Lions theory and adaptations of the superposition principle, by many authors, including Le Bris and Lions [51], Figalli [41], Trevisan [64], and Champagnat and Jabin [32]; see also the book [52]. Just as in the first-order setting, the fact that the measure \({\text {div}}b\) may contain a singular part prevents the application of these theories to the present situation.

In the compressive regime, we extend the viscosity solution theory of Sect. 3 to the second order equation

$$\begin{aligned} -\partial _t u + b(t,x) \cdot \nabla u - {\text {tr}}[ a(t,x) \nabla ^2 u] = 0 \quad \text {in }(0,T) \times \mathbb {R}^d, \quad u(T,\cdot ) = u_T, \end{aligned}$$
(1.9)

where b satisfies the one-sided Lipschitz condition (1.4) and a is a regular, but possibly degenerate, symmetric matrix. This equation, as well as the dual problem

$$\begin{aligned} \partial _t f - {\text {div}}(b(t,x) f) - \nabla ^2 \cdot (a(t,x) f) = 0 \quad \text {in }(0,T) \times \mathbb {R}^d, \quad f(0,\cdot ) = f_0, \end{aligned}$$
(1.10)

can be related to the SDE

$$\begin{aligned} d_t \Phi _{t,s}(x) = -b(t,\Phi _{t,s}(x))dt + \sigma (t, \Phi _{t,s}(x))dW_t, \quad t > s, \quad \Phi _{s,s}(x) = x, \end{aligned}$$
(1.11)

which is the SDE analogue of the backward flow for (1.3). Here W is a given Brownian motion and \(a =\frac{1}{2} \sigma \sigma ^T\). We establish the existence and uniqueness, for every \(x \in \mathbb {R}^d\), of a strong solution in the Filippov sense, and we show that, with probability one, \(\Phi _{t,s}\) is Hölder continuous for any exponent less than 1.

The situation is more complicated in the expansive regime, namely, for the equations

$$\begin{aligned} -\partial _t u - b(t,x) \cdot \nabla u - {\text {tr}}[ a(t,x) \nabla ^2 u] = 0 \quad \text {in }(0,T) \times \mathbb {R}^d, \quad u(T,\cdot ) = u_T \end{aligned}$$
(1.12)

and

$$\begin{aligned} \partial _t f + {\text {div}}(b(t,x) f) - \nabla ^2 \cdot (a(t,x) f) = 0 \quad \text {in }(0,T) \times \mathbb {R}^d, \quad f(0,\cdot ) = f_0. \end{aligned}$$
(1.13)

In the first-order setting, the characterization of the “good” distributional solution of the continuity equation (1.8) relies on the Lipschitz continuity of the backward ODE flow. Adapting similar methods to the second order equation (1.13) involves establishing the Lipschitz continuity of a stochastic flow like (1.11) with certain time-reversed coefficients (see (4.30) below). While it is well-known that flows of the form (1.11) are Hölder continuous for any exponent less than 1, even in more general contexts (see [62]), it is an open question whether they are Lipschitz continuous with probability one. We relegate a general study of (1.12) and (1.13), and of the stochastic regular Lagrangian flow for

$$\begin{aligned} d_t \Phi _{t,s}(x) = b(t, \Phi _{t,s}(x))dt + \sigma (t, \Phi _{t,s}(x))dW_t, \quad t >s, \quad \Phi _{s,s}(x) = x, \end{aligned}$$
(1.14)

to future work. The exception is when \(\sigma \) is constant in the \(\mathbb {R}^d\)-variable. In this case, we prove that a suitable stochastic flow of the form (1.11) can be inverted, leading, as in the deterministic case, to the existence and uniqueness of a strong solution to (1.14) for a.e. \(x \in \mathbb {R}^d\), and a corresponding solution theory for the PDEs (1.12) and (1.13).

1.2 Applications and Further Study

While interesting in their own right, linear transport equations and ODEs with nonregular velocity fields arise naturally in several equations in fluid dynamics, in which the velocity fields depend nonlinearly on various other physical quantities that are coupled with the transported quantity. Since these equations must be posed a priori in a weak sense, this leads to velocity fields with limited regularity. The DiPerna-Lions and Ambrosio theories have been successfully applied to a number of such situations; see [9, 11, 15, 16, 37, 48, 53, 67]. As mentioned above, the one-dimensional Bouchut-James theory of reversible solutions for transport equations with semi-Lipschitz velocity fields has been successfully applied to conservation laws and pressureless gases; see [23, 24, 43, 44].

Nonlinear transport equations also arise in certain models for large population dynamics, specifically mean field games (MFG). In [49], the first author and Lasry introduced a forward-backward system of PDEs modeling a large population of agents in a state of Nash equilibrium. The evolution of the density f of players is described by a continuity equation (1.8) (or Fokker-Planck equation (1.13)), where the velocity field b is given by

$$\begin{aligned} b(t,x) = -\nabla _p H(t, x,\nabla u(t,x)). \end{aligned}$$
(1.15)

Here, H is a convex Hamiltonian, and u is the solution of the terminal value problem

$$\begin{aligned} & -\partial _t u - {\text {tr}}[ a(t,x) \nabla ^2 u] + H(t,x, \nabla u(t,x)) = F[f(t,\cdot )] \quad \text {in }(0,T) \times \mathbb {R}^d,\quad \nonumber \\ & u(T,\cdot ) = G[f(T,\cdot )], \end{aligned}$$
(1.16)

which is a Hamilton–Jacobi–Bellman equation encoding the optimization problem for a typical agent, the influence of the population of agents being described by the coupling functions F and G. The velocity field (1.15) is the consensus optimal feedback policy of the population of agents at a Nash equilibrium.

When a is degenerate, or even zero, the function u has limited regularity, and is no better than semiconcave in the spatial variable in general. Therefore, even if H is smooth, the velocity field (1.15) may satisfy at most

$$\begin{aligned} b \in BV_\textrm{loc}\quad \text {and} \quad ({\text {div}}b)_- \in L^\infty . \end{aligned}$$
(1.17)

This falls just outside the DiPerna-Lions-Ambrosio regime, since the measure \(({\text {div}}b)_+\) may still fail to be absolutely continuous in general. In fact, the well-posedness of a suitable notion of solution for the transport and ODE problems under the general assumptions (1.17) remains an open problem.

Many simple but useful MFG models involve a linear-quadratic Hamiltonian of the form

$$\begin{aligned} H(t,x,p) = A(t,x) |p|^2 + B(t,x) \cdot p + C(t,x) \end{aligned}$$

for smooth, real-valued A, B, C with \(A > 0\). In this case, it is easy to see that (1.15) satisfies the half-Lipschitz condition (1.4). This situation was studied by Cardaliaguet and Souganidis [29] for first-order, stochastic mean field games systems with common noise. In particular, it is proved there that the uniqueness of probability density solutions of (1.8) gives rise, through the superposition principle, to the uniqueness of optimal trajectories for the probabilistic formulation of the MFG problem, and, moreover, the solution of the stochastic forward-backward system can be used to construct approximate Nash equilibria for the N-player game. Our analysis of the Fokker-Planck equation (1.13) may therefore be expected to yield similar results for stochastic MFG systems with common noise and degenerate, spatially homogeneous, idiosyncratic noise, a special case of the equations considered by Cardaliaguet, Souganidis, and the second author in [30].

The second application of nonlinear transport equations in mean field games concerns the master equation for a MFG with a finite state space. These equations generally take the form

$$\begin{aligned} \partial _t u + b(t,x,u) \cdot \nabla u = c(t,x,u) \quad \text {in } (0,T) \times \mathbb {R}^d, \end{aligned}$$
(1.18)

where u, b, and c all take values in \(\mathbb {R}^d\); coordinate-by-coordinate, (1.18) is written as

$$\begin{aligned} \partial _t u^i + b^j(t,x,u) \partial _{x_j} u^i = c^i(t,x,u), \quad i = 1,2,\ldots , d. \end{aligned}$$

Therefore, (1.18) is a nonconservative hyperbolic system, whose well-posedness is a difficult question in general; note that, when \(d = 1\), (1.18) becomes a scalar conservation law.

We do not discuss (1.18) here, but, in the paper [56], we study a particular regime of equations taking the form (1.18), using a new theory for linear transport equations with velocity fields b that are increasing coordinate by coordinate, that is, \(\partial _{x_j} b^i \ge 0\) for \(i \ne j\).

The extension to infinite dimensions, of both the linear problems (1.1)–(1.2), as well as the nonlinear equation (1.18), remains an interesting question, with numerous applications, including the study of mean field game master equations on the Hilbert space of square-integrable random variables. We aim to study these situations in future work.

1.3 Notation

Given a function space \(X(\mathbb {R}^d)\), or \(X(\Omega )\) for an appropriate subdomain of \(\mathbb {R}^d\), \(X_\textrm{loc}\) denotes the space of functions (or distributions) f such that \(\phi f \in X\) for all \(\phi \in C^\infty _c(\mathbb {R}^d)\). If X is a normed space, the same is not necessarily true for \(X_\textrm{loc}\), but it inherits the topology of local X-convergence. For example, \(\lim _{n \rightarrow \infty }f_n = f\) in \(L^p_\textrm{loc}(\mathbb {R}^d)\) means that \(\lim _{n \rightarrow \infty } \left\| f_n - f \right\| _{L^p(B_R)} = 0\) for all \(R > 0\). We denote by \(L^p_+([0,T])\) the subset of \(L^p([0,T])\) consisting of nonnegative functions.

Unless otherwise specified, Banach or Fréchet spaces of functions are endowed with the strong topology. For a function space X, the subscripts in \(X_\textrm{w}\) and \(X_{\textrm{w}\star }\) indicate the weak (resp. weak-\(\star \)) topology.

For \(1 \le p < \infty \), \(\mathcal P_p\) is the space of Borel probability measures \(\mu \), with \(\int |x|^p \mu (dx) < \infty \), which becomes a complete metric space for the p-Wasserstein distance \(\mathcal W_p\) defined for \(\mu , \nu \in \mathcal P_p\) by

$$\begin{aligned} \mathcal W_p(\mu ,\nu ) = \left( \inf _{\gamma \in \Gamma (\mu ,\nu )} \iint _{\mathbb {R}^d \times \mathbb {R}^d} |x-y|^p d\gamma (x,y) \right) ^{1/p}, \end{aligned}$$

where \(\Gamma \) is the set of couplings of \(\mu \) and \(\nu \), that is, measures \(\gamma \) on the product space \(\mathbb {R}^d \times \mathbb {R}^d\) such that \(\gamma (A \times \mathbb {R}^d) = \mu (A)\) and \(\gamma (\mathbb {R}^d \times A) = \nu (A)\) for all Borel measurable \(A \subset \mathbb {R}^d\).
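For intuition, in one dimension and for empirical measures with the same number of equally weighted atoms, the infimum is attained by the monotone (sorted) coupling, so \(\mathcal W_p\) reduces to quantile matching. The following sketch is our own illustration of the definition (the helper `wasserstein_p` is not from the paper):

```python
import numpy as np

def wasserstein_p(xs, ys, p):
    """W_p between two empirical measures on R with equally weighted atoms:
    the optimal coupling pairs the sorted samples (monotone rearrangement)."""
    xs, ys = np.sort(xs), np.sort(ys)
    return np.mean(np.abs(xs - ys) ** p) ** (1.0 / p)

# two two-atom measures, each atom with mass 1/2, translated by 0.5
mu = np.array([0.0, 1.0])
nu = np.array([0.5, 1.5])
assert abs(wasserstein_p(mu, nu, 1) - 0.5) < 1e-12  # W_1 = 0.5
assert abs(wasserstein_p(mu, nu, 2) - 0.5) < 1e-12  # W_2 = 0.5 for a pure translation
```

Note that the \(1/p\)-th root is what makes \(\mathcal W_p\) a metric on \(\mathcal P_p\).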

The transpose of a matrix \(\sigma \) is denoted by \(\sigma ^T\), and, if \(\sigma \) is a square matrix, its symmetric part is denoted by \(\textrm{Sym}(\sigma ) {:}{=} \frac{1}{2}(\sigma + \sigma ^T)\). The symbol \({\text {Id}}\) stands for either the identity map or the identity matrix, the precise meaning being clear from context.

2 The ODE Flow

This section is focused on the solvability and properties of the flow associated to a velocity field b satisfying

$$\begin{aligned} \left\{ \begin{aligned}&\text {for some } C_0,C_1 \in L^1_+([0,T]) \text { and for all }t \in [0,T] \text { and } x,y \in \mathbb {R}^d,\\&|b(t,x)| \le C_0(t)(1 + |x|) \quad \text {and} \\&(b(t,x) - b(t,y)) \cdot (x-y) \ge -C_1(t)|x-y|^2. \end{aligned} \right. \end{aligned}$$
(2.1)

Because \(b(t,\cdot )\) is not necessarily continuous, the ODE must be interpreted in the Filippov sense [42]; that is, abusing notation, we denote by \(b(t,x)\) the convex hull of all limit points of \(b(t,y)\) as \(y \rightarrow x\). For \(s \in [0,T]\), we seek absolutely continuous solutions \(t \mapsto \phi _{t,s}(x)\) of the problem

$$\begin{aligned} \left\{ \begin{array}{ll} \partial _t \phi _{t,s}(x) \in b(t,\phi _{t,s}(x)), & t \in [0,T],\\ \phi _{s,s}(x) = x. & \end{array}\right. \end{aligned}$$
(2.2)

Remark 2.1

If \(\dot{X}(t) \in b(t,X(t))\) and we set

$$\begin{aligned} {\tilde{X}}(t)&{:}{=} \exp \left( \int _0^t C_1(s)ds\right) X(t) \quad \text {and}\\ {\tilde{b}}(t,x)&{:}{=} C_1(t)x + \exp \left( \int _0^t C_1(s)ds\right) b\left( t, \exp \left( - \int _0^t C_1(s)ds \right) x \right) , \end{aligned}$$

then \(\dot{{\tilde{X}}}(t) \in {\tilde{b}}(t, {\tilde{X}}(t))\), and \({\tilde{b}}\) satisfies (2.1) with \(C_1 \equiv 0\) and a possibly different \(C_0\). In other words, after a change of variables, one may assume without loss of generality that b is monotone.
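The computation behind the remark is elementary; we record it as a sketch, under the convention that \(\tilde b\) carries the drift term \(C_1(t)x\) that the product rule generates:

```latex
% Write E(t) := \exp(\int_0^t C_1(s)\,ds), so that \tilde X = E X. Then
\[
  \dot{\tilde{X}}(t)
  = C_1(t) E(t) X(t) + E(t) \dot{X}(t)
  \in C_1(t) \tilde{X}(t) + E(t)\, b\bigl(t, E(t)^{-1} \tilde{X}(t)\bigr)
  = \tilde{b}\bigl(t, \tilde{X}(t)\bigr),
\]
% and the one-sided Lipschitz condition (2.1) for b gives monotonicity:
\[
  \bigl(\tilde{b}(t,x) - \tilde{b}(t,y)\bigr)\cdot(x-y)
  \ge C_1(t)|x-y|^2 - C_1(t)\, E(t)^2 \bigl|E(t)^{-1}(x-y)\bigr|^2 = 0 .
\]
```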

We will use the following characterization and properties of half-Lipschitz maps; see [25, Lemma 2.2 and Remark 2.4].

Lemma 2.1

A vector field \(B: \mathbb {R}^d \rightarrow \mathbb {R}^d\) satisfies

$$\begin{aligned} (B(x) - B(y)) \cdot (x-y) \ge -C|x-y|^2 \quad \text {for some } C \ge 0 \text { and all } x,y \in \mathbb {R}^d \end{aligned}$$

if and only if \(\textrm{Sym}(\nabla B)\ge -C {\text {Id}}\) in the sense of distributions. We then also have \(B \in BV_\textrm{loc}(\mathbb {R}^d)\), and

$$\begin{aligned} \textrm{Sym}(\nabla B) - ({\text {tr}}\nabla B) {\text {Id}}\le (d-1)C {\text {Id}}. \end{aligned}$$

The fact that B belongs not only to the space \(BD_\textrm{loc}(\mathbb {R}^d)\) of bounded deformations (the space of vector fields \(B: \mathbb {R}^d \rightarrow \mathbb {R}^d\) such that the symmetric part of the distribution \(\nabla B\) is a locally bounded Radon measure [63]), but more particularly to \(BV_\textrm{loc}(\mathbb {R}^d)\), is a consequence of the analysis in [1], where we refer the reader for many more fine geometric properties of (semi-)monotone functions.

We fix a family of regularizations such that

$$\begin{aligned} \left\{ \begin{aligned}&(b^\varepsilon )_{\varepsilon> 0} \subset L^1([0,T], C^{0,1}(\mathbb {R}^d)), \quad \lim _{\varepsilon \rightarrow 0} b^\varepsilon = b \text { a.e. in } [0,T] \times \mathbb {R}^d, \text { and}\\&b^\varepsilon \text { satisfies }(2.1) \text { uniformly in }\varepsilon > 0. \end{aligned} \right. \end{aligned}$$
(2.3)

For example, we may take \(b^\varepsilon (t,\cdot ) = b(t,\cdot ) * \rho _\varepsilon \) for \(\rho _\varepsilon = \varepsilon ^{-d} \rho (\cdot /\varepsilon )\), with \(\rho \in C^\infty _+(\mathbb {R}^d)\), \({\text {supp}}\,\rho \subset B_1\), and \(\int \rho = 1\).
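As a quick numerical sanity check (our own sketch, not from the paper), one can mollify the model field \(b(x) = {\text {sgn}}\, x\) on a grid and verify that the smoothed field is still monotone, i.e. that the second condition in (2.1) holds with \(C_1 \equiv 0\) uniformly in \(\varepsilon \):

```python
import numpy as np

h = 1e-3        # grid spacing
eps = 0.1       # mollification parameter
x = np.arange(-2.0, 2.0 + h, h)
b = np.sign(x)  # the model velocity field b(x) = sgn(x)

# smooth bump rho_eps supported in (-eps, eps), normalized to unit mass
y = np.arange(-eps + h, eps, h)
rho = np.exp(-1.0 / (1.0 - (y / eps) ** 2))
rho /= rho.sum() * h

# b * rho_eps, keeping only the fully overlapped part of the convolution
b_eps = np.convolve(b, rho, mode="valid") * h

assert np.all(np.diff(b_eps) >= -1e-12)  # b_eps is still monotone
assert abs(b_eps[0] + 1.0) < 1e-9        # agrees with b away from the jump
assert abs(b_eps[-1] - 1.0) < 1e-9
```

Since \(\rho \ge 0\) and b is nondecreasing, the discrete convolution preserves monotonicity, mirroring the fact that mollification preserves the one-sided Lipschitz bound.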

2.1 The Backward Flow

We begin the analysis with the backward flow, that is, (2.2) for \(t < s\). This is the time-direction for which the one-sided Lipschitz condition (2.1) yields a unique, Lipschitz flow. We record its properties here and refer to [33, 42, 59, 60] for the proofs; see also the work of Dafermos [38] for the connection to generalized characteristics of conservation laws.

Lemma 2.2

For every \((s,x) \in [0,T] \times \mathbb {R}^d\), there exists a unique solution \(\phi _{t,s}(x)\) of (2.2) defined for \((t,x) \in [0,s] \times \mathbb {R}^d\), satisfying the Lipschitz bound

$$\begin{aligned} |\phi _{t,s}(x) - \phi _{t,s}(y)| \le \exp \left( \int _t^s C_1(r)dr \right) |x-y| \quad \text {for all } 0 \le t \le s \le T \text { and } x,y \in \mathbb {R}^d. \end{aligned}$$
(2.4)

Moreover, there exists a constant \(C > 0\) depending only on T and \(C_0\) from (2.1) such that

$$\begin{aligned} |\phi _{t,s}(x)| \le C(|x| + 1) \quad \text {for all } 0 \le t \le s \le T \text { and } x \in \mathbb {R}^d, \end{aligned}$$
(2.5)

and

$$\begin{aligned} \left\{ \begin{aligned}&|\phi _{t_1,s}(x) - \phi _{t_2,s}(x)| \le C (1 + |x|)|t_1 - t_2| \quad \text {and} \\&|\phi _{t,s_1}(x) - \phi _{t,s_2}(x)| \le C (1 + |x|)|s_1 - s_2|\\&\text {for all } t_1,t_2 \in [0, s], \; s_1,s_2 \in [t, T], \text { and } x \in \mathbb {R}^d. \end{aligned} \right. \end{aligned}$$
(2.6)

For all \(0 \le r \le s \le t \le T\), \(\phi _{r,s} \circ \phi _{s,t} = \phi _{r,t}\). If \((b^\varepsilon )_{\varepsilon > 0}\) are regularizations satisfying (2.3), then the corresponding backward flows \(\phi ^\varepsilon \) converge locally uniformly as \(\varepsilon \rightarrow 0\) to \(\phi \).

Remark 2.2

The a priori local boundedness and time-regularity estimates (2.5) and (2.6), depending only on \(C_0\) and not \(C_1\), do not require the half-Lipschitz assumption on \(b(t,\cdot )\), and are therefore satisfied for any limiting solutions of the ODE when b satisfies the first condition in (2.1). On the other hand, the half-Lipschitz assumption is crucial for the Lipschitz continuity of the flow (2.4), as well as the uniqueness of the solution.

Remark 2.3

Consider the backward flow in \(\mathbb {R}\) corresponding to \(b(t,x) = b(x) = {\text {sgn}}\, x\), which is given, for \(x \in \mathbb {R}\) and \(s < t\), by

$$\begin{aligned} \phi _{s,t}(x) = \left\{ \begin{array}{ll} x+(t-s) & \text {if } x<-(t-s),\\ 0 & \text {if } |x| \le t-s, \text { and}\\ x-(t-s) & \text {if } x > t-s. \end{array}\right. \end{aligned}$$
(2.7)

This demonstrates that, in general, the trajectories of the backward flow may concentrate on sets of measure zero, in particular, where b has jump discontinuities.

We will often consider the example \(b(x) = {\text {sgn}}\, x\) in subsequent parts of the paper in order to illustrate certain general phenomena and to present counterexamples. Note that, by Remark 2.1, one can consider similar examples for arbitrary \(C_1 \in L_+^1([0,T])\).
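The closed form (2.7) can be checked numerically (our own sketch): replace \({\text {sgn}}\) by the assumed smooth stand-in \(b^\varepsilon (x) = \tanh (x/\varepsilon )\) and integrate the backward flow with an explicit Euler scheme; the trajectories approach the formula above, with the whole interval \([-(t-s), t-s]\) collapsing to the origin:

```python
import numpy as np

def backward_flow_exact(x, s, t):
    """Closed form (2.7) for b(x) = sgn(x): the backward flow phi_{s,t}(x), s < t."""
    d = t - s
    if x > d:
        return x - d
    if x < -d:
        return x + d
    return 0.0

def backward_flow_euler(x, s, t, eps=1e-3, n_steps=20000):
    """Running time backward from t to s amounts to solving y' = -tanh(y/eps)
    forward over a duration t - s; tanh(./eps) is a smooth stand-in for sgn."""
    y, dt = x, (t - s) / n_steps
    for _ in range(n_steps):
        y -= np.tanh(y / eps) * dt
    return y

# points outside [-(t-s), t-s] translate toward 0; points inside collapse to 0
for x0 in (-2.0, -0.3, 0.0, 0.4, 1.5):
    assert abs(backward_flow_exact(x0, 0.0, 1.0) - backward_flow_euler(x0, 0.0, 1.0)) < 1e-2
```

The scheme is stable as long as the time step is small relative to \(\varepsilon \); the concentration onto the origin is visible in the fact that three distinct initial points land within \(10^{-2}\) of 0.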

2.2 The Jacobian for the Backward Flow

In view of the Lipschitz regularity (2.4), \(\nabla _x \phi _{t,s} \in L^\infty \) for \(t \le s\), and so we can define the Jacobian

$$\begin{aligned} J_{t,s}(x) {:}{=} \det ( \nabla _x \phi _{t,s}(x)) \quad \text {for } 0 \le t \le s \le T \quad \text { and a.e. } x \in \mathbb {R}^d. \end{aligned}$$
(2.8)

Lemma 2.3

Let J be defined as in (2.8). Then \(J \ge 0\),

$$\begin{aligned} & \left\{ \begin{aligned}&J_{\cdot , s} \in L^\infty ([0,s] \times \mathbb {R}^d) \cap C([0,s], L^1_\textrm{loc}(\mathbb {R}^d)) \quad \forall s \in [0,T] \quad \text {and} \\&J_{t,\cdot } \in L^\infty ([t,T] \times \mathbb {R}^d) \cap C([t,T], L^1_\textrm{loc}(\mathbb {R}^d)) \quad \forall t \in [0,T], \end{aligned} \right. \end{aligned}$$
(2.9)
$$\begin{aligned} & \left\| J_{t,s} \right\| _{L^\infty } \le \exp \left( d\int _t^s C_1(r)dr \right) \quad \text {for all } 0 \le t \le s \le T, \end{aligned}$$
(2.10)

and, for all \(R > 0\), there exists a modulus of continuity \(\omega _R\), which depends on b only through the constants \(C_0\) and \(C_1\) in (2.1), such that

$$\begin{aligned} \left\{ \begin{aligned}&\left\| J_{t_1,s} - J_{t_2,s} \right\| _{L^1(B_R)} \le \omega _R(|t_1 - t_2|) \quad \text {for all } t_1,t_2 \in [0,s] \quad \text {and} \\&\left\| J_{t,s_1} - J_{t,s_2} \right\| _{L^1(B_R)} \le \omega _R(|s_1 - s_2|) \quad \text {for all } s_1,s_2 \in [t,T]. \end{aligned} \right. \end{aligned}$$
(2.11)

If \((b^\varepsilon )_{\varepsilon > 0}\) are as in (2.3), \((\phi ^\varepsilon )_{\varepsilon > 0}\) are the corresponding solutions of (2.2), and, for \(\varepsilon > 0\), \(J^\varepsilon = \det (\nabla _x \phi ^\varepsilon )\), then

$$\begin{aligned}&\lim _{\varepsilon \rightarrow 0} J^\varepsilon _{\cdot ,s} = J_{\cdot ,s} \quad \text {weak-}\star \text { in } L^\infty ([0,s] \times \mathbb {R}^d) \quad \text {and} \nonumber \\&\lim _{\varepsilon \rightarrow 0} J^\varepsilon _{t,\cdot } = J_{t,\cdot } \quad \text {weak-}\star \text { in } L^\infty ([t,T] \times \mathbb {R}^d). \end{aligned}$$
(2.12)

Proof

It suffices to prove all statements about \(J_{\cdot ,s}\) on [0, s]. The arguments are exactly the same for the other halves using the fact that \(s \mapsto \phi _{t,s}\) is the forward flow corresponding to the velocity \(-b\).

The convergence (2.12) follows from compensated compactness arguments for determinants; see the Appendix of [25]. The nonnegativity of J then follows, because \(J^\varepsilon \ge 0\) for all \(\varepsilon \).

For fixed \(\varepsilon > 0\) and \((s,x) \in [0,T] \times \mathbb {R}^d\), we have

$$\begin{aligned} \partial _t J^\varepsilon _{t,s}(x) = {\text {div}}_x b^\varepsilon (t, \phi ^\varepsilon _{t,s}(x)) J^\varepsilon _{t,s}(x) \quad \text {for } t \in [0,s]. \end{aligned}$$

Then (2.3) implies \(\partial _t J^\varepsilon _{t,s}(x) \ge -dC_1(t) J^\varepsilon _{t,s}(x)\), and so

$$\begin{aligned} \frac{\partial }{\partial t} \left( J^\varepsilon _{t,s}(x) e^{- d\int _t^s C_1(r)dr} \right) \ge 0. \end{aligned}$$
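Since \(J^\varepsilon _{s,s} = \det ({\text {Id}}) = 1\), this monotonicity already yields the uniform bound (2.10): evaluating the monotone quantity at t and at s gives

```latex
J^\varepsilon_{t,s}(x)\, e^{-d\int_t^s C_1(r)\,dr}
  \le J^\varepsilon_{s,s}(x) = 1,
\qquad \text{so} \qquad
J^\varepsilon_{t,s}(x) \le \exp\Big( d\int_t^s C_1(r)\,dr \Big),
```

and the same bound is inherited by J in the weak-\(\star \) limit.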

In particular, for \(t_1 < t_2 \le s\) and \(R > 0\),

$$\begin{aligned} \int _{B_R} |J^\varepsilon _{t_2,s} - J^\varepsilon _{t_1,s}|\le & e^{d\int _{t_1}^{t_2} C_1(r)dr} \int _{B_R} J^\varepsilon _{t_2,s} - \int _{B_R} J^\varepsilon _{t_1,s}\\ & +\left( e^{d\int _{t_1}^{t_2} C_1(r)dr} - 1 \right) \int _{B_R} J^\varepsilon _{t_2,s}. \end{aligned}$$

Identifying the modulus of continuity \(\omega _R\) in the statement of the Lemma then reduces to proving the uniform-in-\(\varepsilon \) continuity of

$$\begin{aligned} [0,s] \ni t \mapsto \int _{B_R} J^\varepsilon _{t,s}(x)dx; \end{aligned}$$

note that \(\int _{B_R} J^\varepsilon _{s,s}(x)dx = |B_R|\), so this will also imply that \(\int _{B_R} J^\varepsilon _{t,s}(x)dx\) is bounded uniformly in \(\varepsilon \).

In view of the uniform-in-\(\varepsilon \) \(L^\infty \)-boundedness of \(J^\varepsilon \), it suffices to prove the uniform-in-\(\varepsilon \) continuity in t of \(\int f(x) J^\varepsilon _{t,s}(x)dx\) for any \(f \in C_c(\mathbb {R}^d)\). The change of variables formula gives

$$\begin{aligned} \int f(x) J^\varepsilon _{t,s}(x)dx = \int f(\phi ^\varepsilon _{s,t}(x))dx. \end{aligned}$$

Note that \(\partial _t \phi ^\varepsilon _{s,t}(x) = -b^\varepsilon (t,\phi ^\varepsilon _{s,t}(x))\), and the Lipschitz constant in t of \(\phi ^\varepsilon _{s,t}(x)\) depends only on an upper bound for |x| and the constant \(C_0\) in (2.1), and, therefore, is independent of \(\varepsilon \). \(\square \)

When \(d = 1\), the \(L^\infty \)-weak-\(\star \) convergence of \(J^\varepsilon = \partial _x \phi ^\varepsilon \) to J can be strengthened via an Aubin-Lions type compactness result.

Proposition 2.1

Assume \(d = 1\), and let \(J^\varepsilon \) and J be as in Lemma 2.3. Then

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} J^\varepsilon _{\cdot ,s} = J_{\cdot ,s} \quad \text {strongly in }L^1_\textrm{loc}([0,s] \times \mathbb {R}) \end{aligned}$$

and

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} J^\varepsilon _{t,\cdot } = J_{t,\cdot } \quad \text {strongly in }L^1_\textrm{loc}([t,T] \times \mathbb {R}). \end{aligned}$$

Proof

Fix \(t \in [0,T]\) and \(R > 0\). Lemma 2.2 implies that there exists M independent of \(\varepsilon \) such that \(|\phi ^\varepsilon _{t,s}(x)| \le M\) for all \(s \in [t,T]\) and \(x \in [-R,R]\). Upon redefining b outside of \([0,T] \times [-2M,2M]\), we find that \(\phi _{t,s}(x)\), and therefore \(J_{t,s}(x)\), is unchanged, and therefore, in order to prove the \(L^1\)-convergence in \([t,T] \times [-R,R]\), we may assume without loss of generality that b is uniformly bounded. Applying the transformation \({\tilde{\phi }}_{t,s}(x) = \phi _{t,s}(x) - \int _t^s C(r)dr\) for an appropriate \(C \in L^1_+([0,T])\) depending on \(C_0\) from (2.1), we may also assume \(b \ge 1\).
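To spell out the last reduction (a routine computation; \(\tilde b\) is our notation for the shifted velocity): since \(\partial _s \phi _{t,s}(x) = -b(s,\phi _{t,s}(x))\) by (2.22), the shifted flow satisfies

```latex
\partial_s \tilde\phi_{t,s}(x)
  = -\,b\big(s,\phi_{t,s}(x)\big) - C(s)
  = -\Big[\, b\Big(s,\ \tilde\phi_{t,s}(x) + \int_t^s C(r)\,dr\Big) + C(s) \Big]
  =: -\,\tilde b\big(s, \tilde\phi_{t,s}(x)\big),
```

so \(\tilde \phi \) is the backward flow associated with \(\tilde b(s,y) {:}{=} b(s, y + \int _t^s C(r)dr) + C(s)\); once b is bounded, choosing \(C(s) \ge 1 + \Vert b(s,\cdot )\Vert _{L^\infty }\) gives \(\tilde b \ge 1\).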

For \((s,x) \in [t,T] \times \mathbb {R}\), set \(f^\varepsilon (s,x) = J^\varepsilon _{t,s}(x)\). Then \(f^\varepsilon \) solves the continuity equation

$$\begin{aligned} \partial _s f^\varepsilon + \partial _x \left( b^\varepsilon (s,x) f^\varepsilon \right) = 0 \quad \text {in } [t,T] \times \mathbb {R}\quad \text {and} \quad f^\varepsilon (t,\cdot ) = 1. \end{aligned}$$

For a standard mollifier \(\rho \in C^\infty _c([-1,1])\), let \(\rho _n = n \rho (n \, \cdot )\) and \(f^{\varepsilon ,n} = \rho _n *_t f^\varepsilon \) be the mollification of \(f^\varepsilon \) in the time variable only. We then have

$$\begin{aligned} \partial _s f^{\varepsilon ,n} + \partial _x \left[ \rho _n *_t (b^\varepsilon f^\varepsilon ) \right] = 0 \quad \text {in } \left[ t + \frac{1}{n}, T \right] \times \mathbb {R}\end{aligned}$$

and, for any \(R > 0\),

$$\begin{aligned}&\sup _{s \in [t+1/n,T]} \left\| \partial _x\left[ \rho _n *_t (b^\varepsilon f^\varepsilon ) \right] (s,\cdot ) \right\| _{L^1([-R,R])} \\&\quad \le \sup _{s \in [t+1/n,T]} \left\| \partial _s f^{\varepsilon ,n}(s,\cdot ) \right\| _{L^1([-R,R])} \le n \left\| \rho ' \right\| _{L^1(\mathbb {R})} \omega _R\left( \frac{1}{n} \right) , \end{aligned}$$

where \(\omega _R\) is as in (2.11); the last inequality follows from \(\int \rho _n' = 0\), which allows us to write \(\partial _s f^{\varepsilon ,n}(s,x) = \int \rho _n'(r)\left[ f^\varepsilon (s-r,x) - f^\varepsilon (s,x)\right] dr\), and from the time-regularity estimate in (2.11). It follows that, for fixed \(n \in \mathbb {N}\), \((\rho _n *_t (b^\varepsilon f^\varepsilon ))_{\varepsilon > 0}\) is precompact in \(L^1([t,T] \times [-R,R])\), and so, because

$$\begin{aligned} \lim _{n \rightarrow \infty } \rho _n *_t (b^\varepsilon f^\varepsilon ) = bf \end{aligned}$$

in \(L^1([t,T] \times [-R,R])\), uniformly in \(\varepsilon \), we conclude that \((b^\varepsilon f^\varepsilon )_{\varepsilon > 0}\) is precompact in \(L^1([t,T] \times [-R,R])\). This implies that, as \(\varepsilon \rightarrow 0\), \(b^\varepsilon f^\varepsilon \) converges strongly in \(L^1([t,T] \times [-R,R])\) to bf.

Fix any subsequence \((\varepsilon _n)_{n \ge 0}\) approaching zero as \(n \rightarrow \infty \). Then there exists a further subsequence such that \(f^{\varepsilon _{n_k}} b^{\varepsilon _{n_k}} \xrightarrow {k \rightarrow \infty } fb\) almost everywhere, and therefore \(f^{\varepsilon _{n_k}} \xrightarrow {k \rightarrow \infty } f\) a.e. in \([t,T] \times [-R,R]\) because \(b \ge 1\) and

$$\begin{aligned} f^\varepsilon (s,x) - f(s,x)&= \frac{b(s,x)f^\varepsilon (s,x) - b(s,x) f(s,x)}{b(s,x)} \\&= \frac{b^\varepsilon (s,x)f^\varepsilon (s,x) - b(s,x) f(s,x)}{b(s,x)}\\&\quad + \frac{\left( b(s,x) - b^\varepsilon (s,x)\right) f^\varepsilon (s,x)}{b(s,x)}. \end{aligned}$$

The convergence of \(f^{\varepsilon _{n_k}}\) to f in \(L^1([t,T] \times [-R,R])\), and therefore the convergence of the full family \((f^\varepsilon )_{\varepsilon > 0}\) to f, is a consequence of the Lebesgue dominated convergence theorem. \(\square \)

Remark 2.4

The one-dimensional structure is important in the proof of Proposition 2.1, in particular, in deducing from the equicontinuity of \(J^\varepsilon \) in time that \((b^\varepsilon J^\varepsilon )_{\varepsilon > 0}\) belongs to a precompact subset of \(L^1\). It is not immediately clear whether this argument can be extended to multiple dimensions.

2.3 The Forward Flow as the Right-Inverse of the Backward Flow

We next investigate the solvability of (2.2) forward in time. This is done by analyzing the Jacobian J from the previous subsection in order to invert the backward flow. Similar methods are used in [29], and, by including the Jacobian in the analysis, we obtain additionally the almost-everywhere continuity of the inverse.

We will revisit this topic in Sect. 4 when we analyze the forward flow, which will arise from the theory of renormalized solutions of the appropriate transport equation.

Proposition 2.2

For \(t \le s\), there exists a set \(A_{t,s} \subset \mathbb {R}^d\) of full measure such that, for all \(y \in A_{t,s}\), \(\phi _{t,s}^{-1}(\{y\})\) is a singleton, which we denote by \(\left\{ \phi _{s,t}(y) \right\} \). Moreover, there exists a version of the map \(\phi _{s,t}:\mathbb {R}^d \rightarrow \mathbb {R}^d\) such that \(\phi _{s,t}\) is continuous a.e.

As an intermediate step, we first prove the following:

Lemma 2.4

Assume \(0 \le t \le s \le T\) and \(K \subset \mathbb {R}^d\) is nonempty, compact, and connected. Then \(\phi _{t,s}^{-1}(K)\) is nonempty, compact, and connected.

Proof

For \(r > 0\), define \(K_r {:}{=} \bigcup _{y \in K} B_r(y)\). Fix a sequence \((b^n)_{n \in \mathbb {N}}\) satisfying (2.3), and let \(\phi ^n_{t,s}\) denote the corresponding backward flow from the previous subsections.

We first show that

$$\begin{aligned} \phi _{t,s}^{-1}(K) = \bigcap _{r > 0} \bigcup _{n \in \mathbb {N}} \bigcap _{k \ge n} (\phi _{t,s}^k)^{-1}(K_r). \end{aligned}$$
(2.13)

Suppose \(x \in \phi _{t,s}^{-1}(K)\). Then \(y = \phi _{t,s}(x) \in K\). Setting \(y_n {:}{=} \phi ^n_{t,s}(x)\), we have \(\lim _{n \rightarrow \infty } y_n = y\) by Lemma 2.2, which means that, for all \(r > 0\), there exists \(n \in \mathbb {N}\) such that, for all \(k \ge n\), \(\phi ^k_{t,s}(x) \in B_r(y) \subset K_r\). This proves the \(\subset \) direction of (2.13).

Now suppose x belongs to the right-hand side of (2.13). Then, for all \(r > 0\), there exists \(n \in \mathbb {N}\) such that \(x \in (\phi _{t,s}^k)^{-1}(K_r)\) for all \(k \ge n\). Set \(y_k {:}{=} \phi _{t,s}^k(x)\), so that \(y_k \in K_r\) for all \(k \ge n\). We have \(y {:}{=} \lim _{k \rightarrow \infty } y_k = \lim _{k \rightarrow \infty } \phi _{t,s}^k(x) = \phi _{t,s}(x)\) by Lemma 2.2. On the other hand, we also have \(y \in \overline{K_r}\) for every \(r > 0\), and so

$$\begin{aligned} \phi _{t,s}(x) \in \bigcap _{r > 0} \overline{K_r} = K. \end{aligned}$$

Thus, the \(\supset \) direction of (2.13) is established.

The continuity of \(\phi _{t,s}\) and the compactness of K imply that \(\phi _{t,s}^{-1}(K)\) is closed. We note also that \((\phi ^k_{t,s})^{-1} = \phi ^k_{s,t}\) satisfies (2.5) uniformly in k, because the bound only depends on the constant \(C_0\) in the linear growth bound of (2.1), which is also satisfied by \(-b^k\). This along with (2.13) implies that \(\phi _{t,s}^{-1}(K)\) is bounded, and thus compact.

We now show that \(\phi _{t,s}\) is surjective. Fix \(y \in \mathbb {R}^d\). Using again the bound (2.5) satisfied uniformly in k for \(\phi ^k_{s,t}\), we set \(x_n {:}{=} (\phi ^n_{t,s})^{-1}(y) = \phi ^n_{s,t}(y)\) and note that \((x_n)_{n \in \mathbb {N}}\) is bounded. Passing to a subsequence, we have \(\lim _{k \rightarrow \infty } x_{n_k} = x\) for some \(x \in \mathbb {R}^d\), and then \(y = \phi ^{n_k}_{t,s}(x_{n_k})\), so that \(y = \lim _{k \rightarrow \infty } \phi ^{n_k}_{t,s}(x_{n_k}) = \phi _{t,s}(x)\).

Finally, we show \(\phi _{t,s}^{-1}(K)\) is connected. For each \(k \in \mathbb {N}\), \((\phi ^k_{t,s})^{-1}(K_r)\) is connected, and therefore so is the intersection \(\bigcap _{k \ge n} (\phi ^k_{t,s})^{-1}(K_r)\) for each n. These sets are nested in n, so taking the union in \(n \in \mathbb {N}\) yields a connected set. Taking the intersection over \(r > 0\) gives the connectedness of \(\phi ^{-1}_{t,s}(K)\). \(\square \)

Remark 2.5

The fact that the approximate backward flows converge uniformly to \(\phi _{t,s}\) is used in the second-to-last paragraph of the proof, in order to show that \(\phi _{t,s}\) is surjective.

Proof of Proposition 2.2

We identify the set by

$$\begin{aligned} A_{t,s}&= \Big \{ y \in \mathbb {R}^d: \text {there exists } x \in \phi _{t,s}^{-1}(\{y\}) \text { such that} \\&\qquad \phi _{t,s}\text { is differentiable at }x\text { and } J_{t,s}(x) \ne 0 \Big \}. \end{aligned}$$

We first check that \(A_{t,s}\) has full measure. Its complement consists of

$$\begin{aligned} \mathbb {R}^d \backslash A_{t,s} =&\left\{ y \in \mathbb {R}^d: J_{t,s} = 0 \text { at the points of differentiability of }\phi _{t,s}\text { on } \phi _{t,s}^{-1}(\{y\}) \right\} \\&\cup \left\{ y \in \mathbb {R}^d: \phi _{t,s} \text { is not differentiable anywhere in } \phi _{t,s}^{-1}(\{y\}) \right\} . \end{aligned}$$

The fact that \(\phi _{t,s}\) is differentiable a.e. and the change of variables formula then give

$$\begin{aligned} |\mathbb {R}^d \backslash A_{t,s} | = \int _{\mathbb {R}^d} \textbf{1}\{ \phi _{t,s}(x) \in \mathbb {R}^d \backslash A_{t,s} \} J_{t,s}(x)dx = 0. \end{aligned}$$

It remains to show that \(\phi _{t,s}^{-1}(\{y\})\) is a singleton for all \(y \in A_{t,s}\). By Lemma 2.4, \(\phi _{t,s}^{-1}(\{y\})\) is nonempty, compact, and connected. Suppose \(x, {\tilde{x}} \in \phi _{t,s}^{-1}(\{y\})\) are such that \(J_{t,s}(x) \ne 0\). A Taylor expansion gives

$$\begin{aligned} y= & \phi _{t,s}({\tilde{x}}) = \phi _{t,s}(x) + \nabla _x \phi _{t,s}(x) \cdot ({\tilde{x}} - x) + o(|{\tilde{x}} - x|) \\= & y + \nabla _x \phi _{t,s}(x)\cdot ({\tilde{x}} - x) + o(|{\tilde{x}} - x|). \end{aligned}$$

The invertibility of \(\nabla _x \phi _{t,s}(x)\) then implies that, if \(|{\tilde{x}} - x|\) is sufficiently small, then \({\tilde{x}} = x\); in other words, x is an isolated point of \(\phi _{t,s}^{-1}(\{y\})\). But then the connected set \(\phi _{t,s}^{-1}(\{y\})\) must be equal to \(\{x\}\), and we set \(\phi _{s,t}(y) {:}{=} x\).

For \(y \in A_{t,s}\), we then have \(( \phi _{t,s} \circ \phi _{s,t})(y) = y\). Since \(\phi _{t,s}^{-1}(\{y\})\) is nonempty for any \(y \in \mathbb {R}^d\), we may define a version of \(\phi _{s,t}\) on all of \(\mathbb {R}^d\) by imposing that \(\phi _{s,t}(y) \in \phi _{t,s}^{-1}(\{y\})\) for every \(y \in \mathbb {R}^d\). For this version, we have \(\phi _{t,s} \circ \phi _{s,t} = {\text {Id}}\) everywhere on \(\mathbb {R}^d\). Suppose now that \(y \in A_{t,s}\) and \(\lim _{n \rightarrow \infty } y_n = y\) for some sequence \((y_n)_{n \in \mathbb {N}} \subset \mathbb {R}^d\). Then

$$\begin{aligned} \lim _{n \rightarrow \infty } (\phi _{t,s} \circ \phi _{s,t})(y_n) = (\phi _{t,s} \circ \phi _{s,t})(y). \end{aligned}$$

We have

$$\begin{aligned} (\phi _{s,t}(y_n))_{n \in \mathbb {N}} \subset \bigcup _{n \in \mathbb {N}} (\phi _{t,s})^{-1}(\{y_n\}), \end{aligned}$$

which implies by Lemma 2.4 that \((\phi _{s,t}(y_n))_{n \in \mathbb {N}}\) is bounded. Letting z be any limit point of this sequence, we have, by continuity of the backward flow, that \(y = \lim _{n \rightarrow \infty } y_n = \phi _{t,s}(z)\), and therefore \(z = \phi _{s,t}(y)\), because \(y \in A_{t,s}\). As every limit point coincides with \(\phi _{s,t}(y)\), it follows that \(\lim _{n \rightarrow \infty } \phi _{s,t}(y_n) = \phi _{s,t}(y)\), which gives the continuity of \(\phi _{s,t}\) at y. \(\square \)

Remark 2.6

We shall see in Sect. 4 that the forward flow is always BV in space. Therefore, the “forward Jacobian” \(J_{t,s}\) for \(t > s\) can only be understood as a measure. Indeed, returning to the example \(b(t,x) = {\text {sgn}}\, x\) on \(\mathbb {R}\), the right inverse \(\phi _{t,s}\) of \(\phi _{s,t}\) given by (2.7) is \(\phi _{t,s}(x) = x + ({\text {sgn}}\, x)(t-s)\) for \(s \le t\), which is discontinuous only at 0. The backward Jacobian is given by \(J_{s,t}(x) = \textbf{1}\left\{ |x| \ge t-s \right\} \), and the forward one is \(J_{t,s} = 1 + 2(t-s) \delta _0\).
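The formulas in this remark are easy to check numerically; the following sketch (our own illustration, with hypothetical function names) implements the forward and backward flows for \(b(t,x) = {\text {sgn}}\, x\) and verifies that the backward flow is a left-inverse of the forward one, while the composition in the other order collapses the interval \([-(t-s), t-s]\) to 0:

```python
def sgn(x):
    # sgn(0) = 0, matching the convention in the text
    return (x > 0) - (x < 0)

def phi_forward(x, s, t):
    # right inverse phi_{t,s} for s <= t: x + sgn(x)(t - s)
    return x + sgn(x) * (t - s)

def phi_backward(y, s, t):
    # backward (compressive) flow phi_{s,t} for s <= t:
    # trajectories move toward 0 and stick there
    return sgn(y) * max(abs(y) - (t - s), 0.0)

s, t = 0.0, 1.0
# phi_{s,t} o phi_{t,s} = Id holds everywhere
# (dyadic sample points keep the floating-point arithmetic exact)
assert all(phi_backward(phi_forward(x, s, t), s, t) == x
           for x in (-2.0, -0.5, 0.0, 0.25, 1.75))
# ... while phi_{t,s} o phi_{s,t} collapses {|y| <= t - s} to 0
assert phi_forward(phi_backward(0.5, s, t), s, t) == 0.0
# backward Jacobian 1{|x| >= t - s}, via difference quotients
h = 1e-6
dq = lambda y: (phi_backward(y + h, s, t) - phi_backward(y - h, s, t)) / (2 * h)
assert abs(dq(2.0) - 1.0) < 1e-6 and abs(dq(0.5)) < 1e-6
```

The forward Jacobian measure \(1 + 2(t-s)\delta _0\) is visible here as the jump of size \(2(t-s)\) of `phi_forward` at the origin.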

Remark 2.7

The formula \(\phi _{s,t} \circ \phi _{t,s} = {\text {Id}}\) makes sense a.e. if \(s < t\), because \(\phi _{s,t}\) is Lipschitz and \(\phi _{t,s}\) is measurable. On the other hand, \(\phi _{t,s}\) is not also a left-inverse, since the composition \(\phi _{t,s} \circ \phi _{s,t}\) does not make sense. In the above example, \(\phi _{s,t}(x)\) is equal to 0 for \(|x| \le t-s\), and 0 is a point of discontinuity for \(\phi _{t,s}\). In general, the concentration of \(\phi _{s,t}\) on sets of measure zero prevents the use of \(\phi _{t,s}\) as a left-inverse.

2.4 Compressive Stochastic Flows

We now fix a matrix-valued map

$$\begin{aligned} \Sigma \in L^2([0,T], C^{0,1}(\mathbb {R}^d; \mathbb {R}^{d \times m})), \end{aligned}$$
(2.14)

and assume that

$$\begin{aligned} W: \Omega \times [0,T] \rightarrow \mathbb {R}^m \quad&\text {is a standard Brownian motion} \nonumber \\&\text {on a given probability space } (\Omega , \mathcal F, \mathbb P, \mathbb E). \end{aligned}$$
(2.15)

In order to extend the results in the preceding subsections, and, in particular, to bypass the difficulties of the backward time direction, we consider forward SDEs with drift satisfying the opposite of (2.1), that is,

$$\begin{aligned} B: [0,T] \times \mathbb {R}^d \rightarrow \mathbb {R}^d, \quad -B \text { satisfies } (2.1), \end{aligned}$$
(2.16)

and consider the flow

$$\begin{aligned} \left\{ \begin{array}{ll} d_s \Phi _{s,t}(x) = B(s,\Phi _{s,t}(x))ds + \Sigma (s,\Phi _{s,t}(x)) dW_s, & s \in [t,T],\\ \Phi _{t,t}(x) = x. \end{array}\right. \end{aligned}$$
(2.17)

Once again, (2.17) must be understood in the Filippov sense, which means, for \(s \in [t,T]\),

$$\begin{aligned} \Phi _{s,t}(x) = x + \int _t^s \alpha _r dr + \int _t^s \Sigma (r,\Phi _{r,t}(x))dW_r, \quad \alpha _r \in B(r, \Phi _{r,t}(x)) \text { for a.e. } r \in [t,s], \end{aligned}$$
(2.18)

and we remark that our assumptions will allow us to always consider probabilistically strong solutions; that is, we solve (2.18) path by path for almost every continuous W with respect to the Wiener measure. Depending on the context in later sections (in particular, the time direction of solvability for the transport and continuity equations), we consider different examples for B and \(\Sigma \) for which these assumptions are satisfied.

Lemma 2.5

For every \((t,x) \in [0,T] \times \mathbb {R}^d\) and \(\mathbb {P}\)-almost surely, there exists a unique strong solution \(\Phi _{s,t}(x)\) of (2.17) defined on \([t,T] \times \mathbb {R}^d\). Moreover, for all \(p \in [2,\infty )\), there exists a constant \(C = C_{p} > 0\) depending only on the assumptions (2.1) and (2.14) such that

$$\begin{aligned} \mathbb E |\Phi _{s,t}(x) - \Phi _{s,t}(y)|^p\le & C |x-y|^p \quad \text {for all } 0 \le t \le s \le T \text { and } x,y \in \mathbb {R}^d,\nonumber \\ \end{aligned}$$
(2.19)
$$\begin{aligned} \mathbb {E}|\Phi _{s,t}(x)|^p\le & C(|x|^p + 1) \quad \text {for all } 0 \le t \le s \le T \text { and } x \in \mathbb {R}^d,\nonumber \\ \end{aligned}$$
(2.20)

and

$$\begin{aligned} \left\{ \begin{aligned}&\mathbb {E}|\Phi _{s_1,t}(x) - \Phi _{s_2,t}(x)|^p \le C (1 + |x|^p)|s_1 - s_2|^{p/2}\\&\text {for all }t \in [0,T], s_1,s_2 \in [t,T],\text { and } x \in \mathbb {R}^d. \end{aligned} \right. \end{aligned}$$
(2.21)

With probability one, for all \(0 \le r \le s \le t \le T\), \(\Phi _{t,s} \circ \Phi _{s,r}= \Phi _{t,r}\). If \((b^\varepsilon )_{\varepsilon > 0}\) are regularizations satisfying (2.3), then, with probability one, the corresponding stochastic flows \(\Phi ^\varepsilon \) converge locally uniformly as \(\varepsilon \rightarrow 0\) to \(\Phi \).

Proof

For \(\varepsilon > 0\), let \(B^\varepsilon \) be the convolution of B in space by a standard mollifier (so that \(b^\varepsilon {:}{=} -B^\varepsilon \) satisfies (2.3)), and let \(\Phi ^\varepsilon _{t,s}\) denote the corresponding stochastic flow. Itô’s formula, the one-sided Lipschitz assumption, and the Lipschitz continuity of \(\Sigma \) yield, for any \(p \ge 2\) and some \(C \in L^1_+([0,T])\),

$$\begin{aligned} \frac{\partial }{\partial t} \mathbb {E}|\Phi ^\varepsilon _{t,s}(x) - \Phi ^\varepsilon _{t,s}(y)|^p \le C(t) \mathbb {E}|\Phi ^\varepsilon _{t,s}(x) - \Phi ^\varepsilon _{t,s}(y)|^p, \end{aligned}$$

which, along with Grönwall’s inequality, leads to the first statement. The other two estimates are proved similarly, with constants independent of \(\varepsilon > 0\).

In view of (2.19) and (2.21), the Kolmogorov continuity criterion then yields, for any \(R > 0\), \(p \ge 2\) and \(\delta \in (0,1)\), a constant \(C = C_{R,p,\delta } > 0\) such that, for all \(s \in [0,T]\), \(\lambda \ge 1\) and \(\varepsilon > 0\),

$$\begin{aligned} \mathbb {P}\left( \sup _{x,y \in B_R} \sup _{t,r \in [s,T]} \frac{ |\Phi ^\varepsilon _{t,s}(x) - \Phi ^\varepsilon _{r,s}(y) |}{|x-y|^{1-\delta } + |t-r|^{\frac{1}{2}(1-\delta )} } > \lambda \right) \le \frac{C}{\lambda ^p}. \end{aligned}$$

It follows that the probability measures on \(C([s,T] \times \mathbb {R}^d; \mathbb {R}^d)\) induced by the random variables \((\Phi ^\varepsilon _{\cdot ,s})_{\varepsilon > 0}\) are tight with respect to the topology of locally uniform convergence, and therefore converge weakly along a subsequence as \(\varepsilon \rightarrow 0\) to a probability measure that gives rise to a weak (in the probabilistic sense) solution of (2.17), for which the estimates in the statement of the lemma continue to hold.

A similar computation to the one above reveals that, for a fixed probability space and almost every Brownian path W, the solution of (2.17) is unique. The pathwise uniqueness then implies, by a standard argument due to Yamada and Watanabe [66], that there is a unique strong solution for every \(x \in \mathbb {R}^d\). \(\square \)

Remark 2.8

It is an open question whether \(\Phi _{t,s}\) is Lipschitz continuous, even if B is Lipschitz. When B is Lipschitz and \(\Sigma \in C^{1,\alpha }\) for some \(\alpha \in (0,1]\), it turns out that the flow \(\Phi _{t,s}\) is \(C^{1,\alpha '}\) for any \(\alpha ' \in (0,\alpha )\), but it is not clear how to extend this to the case where \(-B\) satisfies only the one-sided Lipschitz bound from below.

As a consequence, an understanding of the Jacobian \(\det (\nabla _x\Phi _{t,s}(x))\), or of the stability with respect to regularizations of B, is considerably more complicated in the stochastic case. The results of Sect. 4, where we discuss the expansive regime, are therefore constrained to the first-order case, and we relegate the second-order analysis to future work. One exception is when \(\Sigma \) is independent of the spatial variable, in which case a change of variables relates the SDE to an ODE of the form (2.2) with a random b.

2.5 Small Noise Approximations

We return to the backward flow \(\phi _{t,s}\), \(0 \le t \le s \le T\), from Lemma 2.2. Recall that the backward flow also corresponds to the forward flow for \(-b\); that is,

$$\begin{aligned} \frac{\partial }{\partial s} \phi _{t,s}(x) = - b(s, \phi _{t,s}(x)), \quad s \ge t, \quad \phi _{t,t}(x) = x. \end{aligned}$$
(2.22)

For \(\varepsilon > 0\), let \(\phi ^\varepsilon _{t,s}(x)\) denote the following stochastic flow

$$\begin{aligned} d_s \phi ^\varepsilon _{t,s}(x) = - b(s,\phi ^\varepsilon _{t,s}(x))ds + \varepsilon dW_s \quad s \ge t, \quad \phi ^\varepsilon _{t,t}(x) = x, \end{aligned}$$
(2.23)

where W is now a d-dimensional Brownian motion. We note that (2.23) is of the type in (2.17) and thus falls under the assumptions of Lemma 2.5, but, in fact, (2.23) admits a unique strong solution as soon as b is merely locally bounded [39, 65]. In general, the limiting solutions as \(\varepsilon \rightarrow 0\) are not unique; however, we immediately have the following as a consequence of Lemma 2.5.

Proposition 2.3

For every \(\varepsilon > 0\), there exists a unique strong solution of (2.23). Moreover, as \(\varepsilon \rightarrow 0\), \(\phi ^\varepsilon \) converges locally uniformly to \(\phi \).

If \(J^\varepsilon = \det (\nabla _x \phi ^\varepsilon )\), then, as \(\varepsilon \rightarrow 0\), \(J^\varepsilon _{t,\cdot }\) converges weak-\(\star \) in \(L^\infty ([t,T] \times \mathbb {R}^d)\) and weakly in \(C([t,T], L^1_{\textrm{loc}}(\mathbb {R}^d))\) to \(J_{t,\cdot }\).

2.6 Some Bibliographical Remarks

We conclude this section by placing the above results in the context of the existing literature. As has been mentioned, the well-posedness and properties of the backward flow in Sect. 2.1 are well studied, and date back at least to the work of Filippov [42]. For the particular properties of the Jacobian stated in Sect. 2.2, especially the weak-\(\star \) convergence in \(L^\infty \), we expand on arguments from the Appendix of [25]. Our result on strong convergence when \(d = 1\) relies on arguments as for Aubin-Lions compactness lemmas [18, 54].

Meanwhile, the forward flow in Sect. 2.3 is comparatively less studied. Our approach to uniquely identifying the forward flow as the right-inverse of the backward flow is similar to that in the appendix of [29], with the argument expanded so as to prove the almost-everywhere continuity, which was not previously known. As mentioned, we further expand on the properties of the forward flow in Sect. 4 below.

As in the ODE case, the theory for compressive SDE flows such as those considered in Sect. 2.4 is well understood. For instance, in [62], this situation is studied in a still more general setting (the constant \(C_1\) in (2.1) is allowed to depend additionally on x and y, with some integrability assumptions), and, similarly as in Sect. 3.2 below, this allows for a theory of measure-valued solutions of the Fokker–Planck equation (3.4). By contrast, degenerate stochastic regular Lagrangian flows and degenerate Fokker–Planck equations with \(({\text {div}}\, b)_- \in L^\infty \) and \({\text {div}}\, b \notin L^1\) are far less studied, and, as far as we know, our results in Sect. 4.5 below (which rely on the properties of compressive flows established in this section) are the first in this direction.

3 The Compressive Regime

In this section, we consider the transport and continuity equations in the so-called compressive regime. That is, for a velocity field b satisfying (2.1), we study the TVP for the nonconservative equation

$$\begin{aligned} -\frac{\partial u}{\partial t} + b(t,x) \cdot \nabla u = 0 \quad \text {in } (0,T) \times \mathbb {R}^d \quad \text {and} \quad u(T,\cdot ) = u_T \quad \text {in } \mathbb {R}^d, \end{aligned}$$
(3.1)

and the IVP for the conservative equation

$$\begin{aligned} \frac{\partial f}{\partial t} - {\text {div}}(b(t,x) f) = 0 \quad \text {in } (0,T) \times \mathbb {R}^d \quad \text {and} \quad f(0,\cdot ) = f_0. \end{aligned}$$
(3.2)

We recall that \({\text {div}}b\) is bounded from below, and therefore, the direction of time for (3.1) and (3.2) does not allow for a solution theory in Lebesgue spaces, due to the concentrative nature of the backward flow analyzed in the previous section. The TVP (3.1) will be solved in the space of continuous functions, while (3.2) is solved in the dual space of Radon measures.

We also obtain analogous results for the second-order equations

$$\begin{aligned} -\frac{\partial u}{\partial t} - {\text {tr}}[a(t,x)\nabla ^2 u] + b(t,x) \cdot \nabla u = 0 \quad \text {in } (0,T) \times \mathbb {R}^d \quad \text {and} \quad u(T,\cdot ) = u_T \end{aligned}$$
(3.3)

and

$$\begin{aligned} \frac{\partial f}{\partial t} - {\text {div}}\big [ {\text {div}}(a(t,x)f) - b(t,x) f \big ] = 0 \quad \text {in } (0,T) \times \mathbb {R}^d \quad \text {and} \quad f(0,\cdot ) = f_0, \end{aligned}$$
(3.4)

where \(a = \frac{1}{2} \sigma \sigma ^T\) for \(\sigma : [0,T] \times \mathbb {R}^d \rightarrow \mathbb {R}^{d \times m}\), satisfying that

$$\begin{aligned} \sup _{x \in \mathbb {R}^d} \frac{ |\sigma (\cdot ,x)|}{1 + |x|} + \sup _{y,z \in \mathbb {R}^d} \frac{|\sigma (\cdot ,y) - \sigma (\cdot ,z)|}{|y-z|} \in L^2([0,T]). \end{aligned}$$
(3.5)

3.1 The Nonconservative Equation

3.1.1 Representation Formula

When interpreting (3.1) in the distributional sense, we are constrained to seek solutions that are continuous. Indeed, the distribution

$$\begin{aligned} b \cdot \nabla u = {\text {div}}( bu) - ({\text {div}}\, b) u \end{aligned}$$

pairs the solution u with \({\text {div}}\, b\), which is a measure in general. The other motivating factor is the formal representation formula for the solution of the TVP (3.1), which is given in terms of the backward flow:

$$\begin{aligned} u(t,x) = u_T( \phi _{t,T}(x)) \quad \text {for } (t,x) \in [0,T] \times \mathbb {R}^d. \end{aligned}$$
(3.6)

This formula and the Lipschitz continuity of \(\phi _{t,T}\) given in Lemma 2.2 suggest that the solution operator for (3.1) should preserve continuity. In fact, the formula (3.6) defines a distributional solution, which is uniquely obtained from limits of natural regularizations of the equation.
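When b is smooth, one can check directly that (3.6) defines a solution: by the semigroup property \(\phi _{t,T} = \phi _{s,T} \circ \phi _{t,s}\), we have \(u(t,x) = u(s, \phi _{t,s}(x))\) for \(t \le s\), and differentiating in s along the backward flow (2.22) gives

```latex
0 = \frac{d}{ds}\, u\big(s, \phi_{t,s}(x)\big)
  = \partial_t u\big(s,\phi_{t,s}(x)\big)
    - b\big(s, \phi_{t,s}(x)\big) \cdot \nabla u\big(s, \phi_{t,s}(x)\big),
```

which, after multiplying by \(-1\), is exactly (3.1) evaluated along the characteristic.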

Theorem 3.1

If \(u_T \in C(\mathbb {R}^d)\), then the function u in (3.6) is a distributional solution of (3.1). Moreover, if \((b^\varepsilon )_{\varepsilon > 0}\) satisfy (2.3) and \(u^\varepsilon \) is the corresponding solution of (3.1) with velocity field \(b^\varepsilon \), then, as \(\varepsilon \rightarrow 0\), \(u^\varepsilon \) converges locally uniformly to u.

Proof

The unique solution \(u^\varepsilon \) for the regularized velocity field is given by \(u^\varepsilon (t,\cdot ) = u_T \circ \phi ^\varepsilon _{t,T}\), where \(\phi ^\varepsilon \) is the flow corresponding to \(b^\varepsilon \). By Lemma 2.2, as \(\varepsilon \rightarrow 0\), \(\phi ^\varepsilon \) converges locally uniformly to \(\phi \), and so the local uniform convergence of \(u^\varepsilon \) to u follows from the continuity of \(u_T\).

Multiplying the equation for \(u^\varepsilon \) by some \(\psi \in C^1_c((0,T) \times \mathbb {R}^d)\) and integrating by parts gives

$$\begin{aligned} \int _0^T \int _{\mathbb {R}^d} u^\varepsilon (t,x)\left( \partial _t \psi (t,x) - b^\varepsilon (t,x) \cdot \nabla \psi (t,x) - ({\text {div}}\, b^\varepsilon (t,x)) \psi (t,x) \right) dxdt = 0. \end{aligned}$$

As \(\varepsilon \rightarrow 0\), \(b^\varepsilon \rightarrow b\) almost everywhere and \({\text {div}}b^\varepsilon \rightharpoonup {\text {div}}b\) weakly in the sense of measures, and so the fact that u is a distributional solution follows. \(\square \)

Turning next to the second-order equation (3.3), we identify a solution candidate with the appropriate stochastic flow. We do so by changing the time direction in b and \(\sigma \) and considering the SDE

$$\begin{aligned} \left\{ \begin{array}{ll} d_s \Phi _{s,t}(x) = -b(s, \Phi _{s,t}(x))ds + \sigma (s, \Phi _{s,t}(x))dW_s, & s \in [t,T], \\ \Phi _{t,t} = {\text {Id}}, \end{array}\right. \end{aligned}$$
(3.7)

where W is as in (2.15). Note that (3.7) is of the type in (2.17) and thus falls within the assumptions of Lemma 2.5. In particular, if \(u_T\) is continuous, then, in view of (2.19)-(2.21), the formula

$$\begin{aligned} u(t,x) = \mathbb {E}[u_T(\Phi _{T,t}(x))] \end{aligned}$$
(3.8)

defines a continuous function. Moreover, if \(u_T\) is Lipschitz, then \(u(t,\cdot )\) is Lipschitz for all \(t \in [0,T]\), and 1/2-Hölder continuous in time. Note that, in this case, the distribution \({\text {tr}}[a\nabla ^2 u] = {\text {div}}(a \nabla u) - ({\text {div}}\, a) \cdot \nabla u\) makes sense, because \(\nabla u\) and \({\text {div}}\, a\) both belong to \(L^\infty \).

The following is proved exactly as for Theorem 3.1, with the use of the estimates in Lemma 2.5.

Theorem 3.2

Let \(u_T \in C(\mathbb {R}^d)\) be uniformly continuous and define u by (3.8). If \((b^\varepsilon )_{\varepsilon > 0}\) satisfy (2.3) and \(u^\varepsilon \) is the corresponding solution of (3.3) with velocity \(b^\varepsilon \), then, as \(\varepsilon \rightarrow 0\), \(u^\varepsilon \) converges locally uniformly to u. Moreover, if \(u_T \in C^{0,1}\), then

$$\begin{aligned} \sup _{(t,x,y) \in [0,T] \times \mathbb {R}^d \times \mathbb {R}^d} \frac{|u(t,x) - u(t,y)|}{|x-y|} + \sup _{(r,s,z) \in [0,T] \times [0,T] \times \mathbb {R}^d} \frac{|u(r,z) - u(s,z)|}{|r-s|^{1/2}(1 + |z|) } < \infty , \end{aligned}$$

and u is a distributional solution of (3.3).

As a special case, we consider, for \(\varepsilon > 0\), the “viscous” version of (3.1), that is

$$\begin{aligned} -\partial _t u^\varepsilon - \frac{\varepsilon ^2}{2} \Delta u^\varepsilon + b(t,x) \cdot \nabla u^\varepsilon = 0 \quad \text {in }(0,T) \times \mathbb {R}^d, \quad u^\varepsilon (T,\cdot ) = u_T. \end{aligned}$$
(3.9)

This uniformly parabolic equation has a unique classical solution for any uniformly continuous \(u_T:\mathbb {R}^d \rightarrow \mathbb {R}\), which, moreover, is given by \(u^\varepsilon (t,x) = \mathbb {E}[ u_T(\phi ^\varepsilon _{t,T}(x))]\), where now \(\phi ^\varepsilon \) denotes the solution of the SDE (2.23) from the previous section. Arguing just as in Theorem 3.1 and invoking Proposition 2.3 immediately gives the following:

Theorem 3.3

As \(\varepsilon \rightarrow 0\), the solution \(u^\varepsilon \) converges locally uniformly to the function u given by (3.6).

3.1.2 Viscosity Solutions

Although (3.6) and (3.8) are the distributional solutions that arise uniquely through regularization (either of b or through vanishing viscosity limits), it turns out that distributional solutions are not unique in general (see Sect. 3.1.3 below). It is then a natural question as to whether the “good” solutions can be characterized other than as limits of regularizations, or by the explicit formulae. For example, this is done for the one-dimensional problem in [59] by introducing a sort of entropy condition.

We give a different characterization here using the theory of viscosity solutions [35], which covers both the first- and second-order problems. We present the results here only in the second-order case, which includes the first-order equations when \(a = 0\).

We define, for \((t,x,p) \in [0,T] \times \mathbb {R}^d \times \mathbb {R}^d\),

$$\begin{aligned} \underline{b}(t,x,p) = \liminf _{z \rightarrow x} b(t,z) \cdot p \quad \text {and} \quad \overline{b}(t,x,p) = \limsup _{z \rightarrow x} b(t,z) \cdot p. \end{aligned}$$

For fixed \((t,x) \in [0,T] \times \mathbb {R}^d\), \(\underline{b}(t,x,\cdot )\) and \(\overline{b}(t,x,\cdot )\) are Lipschitz continuous on \(\mathbb {R}^d\), and, for fixed \((t,p) \in [0,T] \times \mathbb {R}^d\), \(\underline{b}(t,\cdot ,p)\) and \(\overline{b}(t,\cdot ,p)\) are respectively lower and upper semicontinuous.
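For illustration (our computation, not from the text), consider the discontinuous field \(b(t,x) = {\text{ sgn } }x\) from Sect. 3.1.3 below. For \(x \ne 0\), both envelopes equal \({\text{ sgn } }(x)\, p\), while at the discontinuity,

$$\begin{aligned} \underline{b}(t,0,p) = \liminf _{z \rightarrow 0} {\text{ sgn } }(z)\, p = -|p| \quad \text {and} \quad \overline{b}(t,0,p) = \limsup _{z \rightarrow 0} {\text{ sgn } }(z)\, p = |p|, \end{aligned}$$

so that \(\underline{b}(t,\cdot ,p)\) jumps down and \(\overline{b}(t,\cdot ,p)\) jumps up at \(x = 0\), consistent with the semicontinuity just described, and both envelopes are Lipschitz continuous in p with constant 1.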

The following definition of viscosity (sub, super) solutions closely resembles the one in [55]:

Definition 3.1

An upper-semicontinuous (resp. lower-semicontinuous) function u is called a subsolution (resp. supersolution) of (3.3) if, for all \(\psi : [0,T] \times \mathbb {R}^d \rightarrow \mathbb {R}\) that are \(C^1\) in t and \(C^2\) in x, it holds that

$$\begin{aligned}&-\frac{d}{dt} \max _{x \in \mathbb {R}^d} \left\{ u(t,x) - \psi (t,x) \right\} \\&\quad \le \inf \left\{ {\text {tr}}[ a(t,y) \nabla ^2 \psi (t,y)] - \underline{b}(t,y, \nabla \psi (t,y)) : y \in \mathop {\mathrm {arg\,max}}\limits \{ u(t,\cdot ) - \psi (t,\cdot ) \} \right\} \end{aligned}$$

(resp.

$$\begin{aligned}&-\frac{d}{dt} \min _{x \in \mathbb {R}^d} \left\{ u(t,x) - \psi (t,x) \right\} \\&\quad \ge \sup \left\{ {\text {tr}}[ a(t,y) \nabla ^2 \psi (t,y)] - \overline{b}(t,y, \nabla \psi (t,y)) : y \in \mathop {\mathrm {arg\,min}}\limits \{ u(t,\cdot ) - \psi (t,\cdot ) \} \right\} \Big ). \end{aligned}$$

If \(u \in C([0,T] \times \mathbb {R}^d)\) is both a sub and supersolution, we say u is a solution.

The comparison principle is proved by doubling the space variable. In particular, we have the following lemma, which follows exactly from the methods in [34, 57, 58]. For \((t,x,y) \in [0,T] \times \mathbb {R}^d \times \mathbb {R}^d\), we define the nonnegative definite matrix

$$\begin{aligned} A(t,x,y) {:}{=} \begin{pmatrix} \sigma (t,x) \\ \sigma (t,y) \end{pmatrix} \begin{pmatrix} \sigma (t,x)^T&\sigma (t,y)^T \end{pmatrix}. \end{aligned}$$

Lemma 3.1

Assume u and v are respectively a sub and supersolution of (3.3). Then \(w(t,x,y) = u(t,x) - v(t,y)\) is a subsolution of

$$\begin{aligned} -\partial _t w - {\text {tr}}[ A(t,x,y) \nabla ^2_{(x,y)}w] + \underline{b}(t,x, \nabla _x w) - \overline{b}(t,y, -\nabla _y w) \le 0. \end{aligned}$$

We may now state and prove the comparison principle.

Theorem 3.4

If u and v are respectively a sub and supersolution of (3.3) such that

$$\begin{aligned} \sup _{(t,x) \in [0,T] \times \mathbb {R}^d} \frac{ u(t,x)}{1 + |x|} + \sup _{(s,y) \in [0,T] \times \mathbb {R}^d} \frac{ - v(s,y)}{1 + |y|} < \infty , \end{aligned}$$

then \(t \mapsto \sup _{x \in \mathbb {R}^d} \left\{ u(t,x) - v(t,x) \right\} \) is nondecreasing.

Proof

Define \(w(t,x,y) {:}{=} u(t,x) - v(t,y)\), fix \(\delta ,\varepsilon > 0\), and define \(\Phi _{\delta ,\varepsilon }(x,y) = \frac{1}{2 \delta } |x-y|^2 + \frac{\varepsilon }{2}(|x|^2 + |y|^2)\). In view of the growth of u and v in x, for all \(t \in [0,T]\), the map \((x,y) \mapsto w(t,x,y) - \Phi _{\delta ,\varepsilon }(x,y)\) attains a maximum on \(\mathbb {R}^d \times \mathbb {R}^d\). Moreover, standard arguments from the theory of viscosity solutions (see for instance [35, Lemma 3.1]) imply that there exist \(\rho _\delta > 0\) and \(\lambda _\varepsilon > 0\) such that \(\lim _{\delta \rightarrow 0} \rho _\delta ^2/\delta = \lim _{\varepsilon \rightarrow 0} \varepsilon \lambda _\varepsilon ^2 = 0\), and

$$\begin{aligned} |x - y| \le \rho _\delta \quad \text {and} \quad |x| + |y| \le \lambda _\varepsilon \quad \text {for all } (x,y) \in \mathop {\mathrm {arg\,max}}\limits \left\{ w(t,\cdot ,\cdot ) - \Phi _{\delta ,\varepsilon } \right\} , \quad t \in [0,T]. \end{aligned}$$

Therefore, if \(t \in [0,T]\) and \((x,y) \in \mathop {\mathrm {arg\,max}}\limits \left\{ w(t,\cdot ,\cdot ) - \Phi _{\delta ,\varepsilon } \right\} \), we have, for some \(C \in L^1_+([0,T])\),

$$\begin{aligned}&{\text {tr}}[ A(t,x,y) \nabla ^2_{(x,y)} \Phi _{\delta ,\varepsilon }(x,y) ]\\&\quad = {\text {tr}}\left[ \left( \frac{1}{\delta } \begin{pmatrix} {\text {Id}}& -{\text {Id}}\\ - {\text {Id}}& {\text {Id}}\end{pmatrix} + \varepsilon \begin{pmatrix} {\text {Id}}& 0 \\ 0 & {\text {Id}}\end{pmatrix} \right) \begin{pmatrix} \sigma (t,x) \\ \sigma (t,y) \end{pmatrix} \begin{pmatrix} \sigma (t,x)^T&\sigma (t,y)^T \end{pmatrix} \right] \\&\quad \le C(t)\left( \frac{\rho _\delta ^2}{\delta } + \varepsilon \lambda _\varepsilon ^2 \right) \end{aligned}$$

and

$$\begin{aligned}&- \underline{b}\left( t, x, \nabla _x \Phi _{\delta ,\varepsilon }(x,y) \right) + \overline{b}\left( t, y, - \nabla _y \Phi _{\delta ,\varepsilon }(x,y) \right) \\&\quad = \limsup _{(z,w) \rightarrow ( x, y)} \left\{ - b( t, z) \cdot \left( \frac{ x - y}{\delta } + \varepsilon x \right) + b( t, w) \cdot \left( \frac{ x - y}{\delta } - \varepsilon y \right) \right\} \\&\quad = \limsup _{(z,w) \rightarrow ( x, y)} \left\{ - ( b( t, z) - b( t, w) ) \cdot \frac{z - w}{\delta } - \varepsilon \, b( t, z) \cdot z + \varepsilon \, b( t, w) \cdot w \right\} \\&\quad \le C(t)\left( \frac{\rho _\delta ^2}{\delta } + \varepsilon \lambda _\varepsilon ^2 + \varepsilon \lambda _\varepsilon \right) . \end{aligned}$$

It now follows from Definition 3.1 and Lemma 3.1 that, for some \(C_{\delta ,\varepsilon } \in L^1_+([0,T])\) satisfying \(\lim _{(\delta ,\varepsilon ) \rightarrow (0,0)} C_{\delta ,\varepsilon } = 0\) in \(L^1([0,T])\),

$$\begin{aligned} t \mapsto \sup _{(x,y) \in \mathbb {R}^d \times \mathbb {R}^d} \left\{ w(t,x,y) - \Phi _{\delta ,\varepsilon }(x,y) \right\} - \int _t^T C_{\delta ,\varepsilon }(s)ds \end{aligned}$$

is nondecreasing. The result follows upon sending \(\delta \) and \(\varepsilon \) to 0. \(\square \)

As a consequence of the comparison theorem, the “good” distributional solution of (3.3) can be uniquely characterized.

Theorem 3.5

Assume \(u_T: \mathbb {R}^d \rightarrow \mathbb {R}\) is uniformly continuous and \(u_T \cdot (1 + |\cdot |)^{-1} \in L^\infty \). Then (3.8) is the unique viscosity solution of (3.3).

Proof

The fact that (3.8) defines a viscosity solution is due to Theorem 3.2 and the stability properties of viscosity solutions. In view of Lemma 2.5 and the growth of \(u_T\), we may appeal to Theorem 3.4 to conclude that (3.8) is the only viscosity solution of the terminal value problem (3.3). \(\square \)

3.1.3 (Non)equivalence of Distributional and Viscosity Solutions

For \(x \in \mathbb {R}\), set \(b(t,x) = {\text{ sgn } }x\) and \(u_T(x) = |x|\). Using the formula (2.7) for the backward flow, the solution (3.6) becomes

$$\begin{aligned} u(t,x) = (|x| - (T-t))_+. \end{aligned}$$
(3.10)

However, the Lipschitz function

$$\begin{aligned} v(t,x) = |x| - (T-t) \end{aligned}$$
(3.11)

is another distributional solution (and in fact satisfies the equation a.e.). It can also be checked directly that (3.11) does not give a viscosity solution of (3.1). Indeed, for each \(t \in [0,T]\), \(v(t,\cdot )\) attains its global minimum at \(x = 0\), with \(v(t,0) = t - T\). Applying the supersolution half of Definition 3.1 with \(\psi \equiv 0\), the left-hand side is \(-\frac{d}{dt} \min _{x \in \mathbb {R}} v(t,x) = -1\), while the right-hand side equals \(-\overline{b}(t,0,0) = 0\), yielding the contradiction \(-1 \ge 0\).
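A finite-difference check (ours, using the convention \(-\partial _t u + b \cdot \nabla u = 0\) of (3.14)) makes concrete the claim that both formulas satisfy the equation wherever they are differentiable, and hence that a.e. computations cannot distinguish the two solutions; the difference only appears at the kink \(x = 0\), where the viscosity test applies.

```python
import numpy as np

T = 1.0
u = lambda t, x: max(abs(x) - (T - t), 0.0)   # the "good" solution (3.10)
v = lambda t, x: abs(x) - (T - t)             # the spurious solution (3.11)

def residual(w, t, x, h=1e-5):
    # central-difference residual of -d_t w + sgn(x) d_x w (valid away from kinks)
    wt = (w(t + h, x) - w(t - h, x)) / (2 * h)
    wx = (w(t, x + h) - w(t, x - h)) / (2 * h)
    return -wt + np.sign(x) * wx
```

At \((t,x) = (0.3, 0.2)\), which lies inside the cone \(\{|x| \le T - t\}\), both residuals vanish, even though only u is the viscosity solution.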

The uniqueness of distributional solutions fails even if b is continuous. Indeed, if \(0< \alpha < 1\), \(b(t,x) = {\text{ sgn } }x |x|^{\alpha }\), and \(u_T(x) = |x|^{1-\alpha }\), then, arguing as in the above example,

$$\begin{aligned} u(t,x) = \left( |x|^{1-\alpha } - (1-\alpha )(T-t) \right) _+ \end{aligned}$$
(3.12)

and

$$\begin{aligned} v(t,x) = |x|^{1-\alpha } - (1-\alpha )(T-t) \end{aligned}$$
(3.13)

are two distributional solutions, and (3.12) is the one corresponding to (3.6). Once again, (3.13) can directly be seen to fail the viscosity supersolution property.

In the first example above, \(u_T\) is Lipschitz while b is discontinuous, and, while b is continuous in the second example, we take \(u_T\) to be non-Lipschitz. This should be compared with the following sufficient criterion for equivalence.

Theorem 3.6

If \(b \in C([0,T] \times \mathbb {R}^d)\) satisfies (2.1) and \(u_T \in C^{0,1}(\mathbb {R}^d)\), then there exists a unique distributional solution \(u \in C([0,T] , C^{0,1}(\mathbb {R}^d))\), and it is given by (3.6).

Proof

Let \(\rho \in C_c^\infty \) be a standard mollifier and, for \(\varepsilon > 0\), set \(\rho _\varepsilon (x) = \varepsilon ^{-d} \rho (\varepsilon ^{-1} x)\). Let \(u \in C([0,T],C^{0,1}(\mathbb {R}^d))\) be a distributional solution of (3.1) and define \(u_\varepsilon = u * \rho _\varepsilon \). Then

$$\begin{aligned} -\partial _t u_\varepsilon + b \cdot \nabla u_\varepsilon = r_\varepsilon \quad \text {in } (0,T) \times \mathbb {R}^d, \end{aligned}$$
(3.14)

where

$$\begin{aligned} r_\varepsilon (t,x) = \int _{\mathbb {R}^d} \left( b(t,y) - b(t,x) \right) \cdot \nabla u(t,y) \rho _\varepsilon (x-y)dy. \end{aligned}$$

Note that \(r_\varepsilon \in C([0,T] \times \mathbb {R}^d)\), and \(u_\varepsilon \) solves (3.14) in the sense of viscosity solutions. Moreover, the continuity of b and boundedness of \(\nabla u\) imply that \(r_\varepsilon \xrightarrow {\varepsilon \rightarrow 0} 0\) locally uniformly. Standard stability results from the theory of viscosity solutions then imply that the limit u of \(u_\varepsilon \) is the unique viscosity solution of (3.1). \(\square \)

The above result can be extended by studying the interplay between regularity of b and u.

Theorem 3.7

Suppose that \(\alpha ,\beta \in (0,1]\) satisfy \(\alpha + \beta > 1\), b satisfies (2.1) and \(\sup _{t \in [0,T]} [b(t,\cdot )]_{C^\alpha } < \infty \), and u is a distributional solution of (3.1) such that \(\sup _{t \in [0,T]} [u(t,\cdot )]_{C^\beta } < \infty \). Then u is the unique viscosity solution of (3.1).

Remark 3.1

The condition on \(\alpha + \beta \), and, in particular, the strict inequality, is sharp, as the example above with \(b(x) = {\text{ sgn } }x |x|^{\alpha }\) and \(u_T(x) = |x|^{1-\alpha }\) shows.

Proof of Theorem 3.7

Arguing similarly as for Theorem 3.6, it suffices to prove that

$$\begin{aligned} r_\varepsilon = (b \cdot \nabla u) * \rho _\varepsilon - b \cdot \nabla (u * \rho _\varepsilon ) \xrightarrow {\varepsilon \rightarrow 0} 0 \quad \text {locally uniformly}, \end{aligned}$$

where \(\rho _\varepsilon \) is a standard mollifier. We note that \(r_\varepsilon = M_\varepsilon [ b(t,\cdot ), u(t,\cdot )]\), where the bilinear operator \(M_\varepsilon \) is defined, for sufficiently regular \((B,U): \mathbb {R}^d \rightarrow \mathbb {R}^d \times \mathbb {R}\), by

$$\begin{aligned} M_\varepsilon [ B,U] = \int _{\mathbb {R}^d} \left( B(y) - B(x) \right) \cdot \nabla U(y) \rho _\varepsilon (x-y)dy. \end{aligned}$$

Standard interpolation arguments give, for some \(C >0\) depending on \(\alpha \) and \(\beta \), for all \((B,U) \in C^\alpha \times C^\beta \),

$$\begin{aligned} \left| M_\varepsilon [B,U] \right| \le C\varepsilon ^{\alpha + \beta - 1} [B]_{C^\alpha } [U]_{C^\beta }. \end{aligned}$$

Therefore \(|r_\varepsilon (t,x)| \le C [b(t,\cdot )]_{C^\alpha } [u(t,\cdot )]_{C^\beta } \varepsilon ^{\alpha + \beta - 1}\), and we conclude upon sending \(\varepsilon \rightarrow 0\). \(\square \)
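The sharpness of the borderline \(\alpha + \beta = 1\) can also be observed numerically (our illustration): for \(B(y) = {\text{ sgn } }y |y|^\alpha \) and \(U(y) = |y|^{1-\alpha }\), the commutator \(M_\varepsilon [B,U](0)\) equals \(1-\alpha \) for every \(\varepsilon \), so it does not vanish, while for a supercritical pair it decays as \(\varepsilon \rightarrow 0\). The triangular mollifier and the quadrature below are our choices, for illustration only.

```python
import numpy as np

alpha = 0.5
B  = lambda y: np.sign(y) * np.abs(y) ** alpha                   # [B]_{C^alpha} < oo
dU = lambda y: (1 - alpha) * np.sign(y) * np.abs(y) ** (-alpha)  # U'(y) for U = |y|^{1-alpha}

def M(eps, dV, x=0.0, n=200000):
    # quadrature for M_eps[B,U](x) with mollifier rho(z) = (1-|z|)_+
    y = np.linspace(x - eps, x + eps, n)                         # n even, so y = 0 is avoided
    rho = np.maximum(1.0 - np.abs((x - y) / eps), 0.0) / eps
    f = (B(y) - B(x)) * dV(y) * rho
    return float(np.sum(0.5 * (f[:-1] + f[1:]) * np.diff(y)))    # trapezoid rule

# supercritical comparison: U2 = |y|^{0.7}, so alpha + beta = 1.2 > 1
dU2 = lambda y: 0.7 * np.sign(y) * np.abs(y) ** (-0.3)
```

Here `M(eps, dU)` stays at \(1 - \alpha = 0.5\) for every \(\varepsilon \), whereas `M(eps, dU2)` shrinks as \(\varepsilon \rightarrow 0\), consistent with the bound \(C \varepsilon ^{\alpha + \beta - 1}\).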

3.2 The Conservative Equation

3.2.1 Duality Solutions

For either of the two conservative equations (3.2) and (3.4), the tendency of the backward flow to concentrate on sets of Lebesgue measure zero implies that, even if \(f_0\) is absolutely continuous with respect to the Lebesgue measure, \(f(t,\cdot )\) may develop a singular part for \(t > 0\).

This presents an obstacle in defining solutions in the sense of distributions, since the product of the discontinuous vector field b with a singular measure f may not be well-defined. This same issue arose in the works [22, 25], and also in studying the nonconservative equation (4.2) in Lebesgue spaces (see Sect. 4 below). The approach in these works was to define solutions through duality with the dual equation, for which particular distributional solutions could be defined in a stable, unique way. In this compressive regime, we do the same for (3.2), and directly define solutions in duality with the nonconservative equation.

Definition 3.2

A map \(f \in C([0,T], \mathcal M_{\textrm{loc},\textrm{w}})\) is called a solution of (3.2) if, for all \(t \in [0,T]\) and \(g \in C_c(\mathbb {R}^d)\),

$$\begin{aligned} \int g(x) f(t, dx) = \int g(\phi _{0,t}(x)) f_0(dx). \end{aligned}$$

Remark 3.2

For \(g \in C_c(\mathbb {R}^d)\) and \(t \in [0,T]\), \((s,x) \mapsto g(\phi _{s,t}(x))\) is the solution of the transport equation (3.1) in \([0,t] \times \mathbb {R}^d\) with terminal value g at time t, and, hence, f is called the duality solution of (3.2). Equivalently, \(f(t,\cdot )\) is the pushforward by \(\phi _{0,t}\) of the measure \(f_0\). When \(f_0\) is a probability measure, this means that \(f(t,\cdot )\) is the law at time t of the stochastic process \(\phi _{0,t}(X_0)\), where \(X_0\) is a random variable with law \(f_0\).

Remark 3.3

The notion of duality solution can be equivalently formulated in relation to nonconservative equations with a right-hand side, that is, for \(g \in L^1([0,T], C(\mathbb {R}^d))\),

$$\begin{aligned} -\partial _t u + b(t,x) \cdot \nabla u = g(t,x) \quad \text {in } (0,T) \times \mathbb {R}^d. \end{aligned}$$
(3.15)

With this perspective, although the object \({\text {div}}(bf)\) does not make sense as a classical distribution, the equation can still be applied to particular singular test functions, namely, solutions of equations like (3.15). Then the pairing

$$\begin{aligned}&\int _{\mathbb {R}^d} u(T,x)f(T,dx) - \int _{\mathbb {R}^d} u(0,x)f_0(dx) \nonumber \\&\quad + \int _0^T \int _{\mathbb {R}^d} \underbrace{ \left[ - \partial _t u(t,x) + b(t,x) \cdot \nabla u(t,x) \right] }_{= g(t,x)} f(t,dx) = 0 \end{aligned}$$
(3.16)

makes sense, because the singular terms collapse into a continuous function, which may be paired with \(f(t,\cdot )\).

Remark 3.4

When \(d = 1\), the theory for (3.2) can be connected to that for the nonconservative equation (4.2) in Sect. 4 below, in that (4.2) is (up to a time change of b) a primitive of (3.2). Using this relationship, Bouchut and James [22, Theorem 4.3.4] are able to give meaning to the distributional product bf (or \(b\partial _x u\) in the next section). More precisely, it is shown that the duality solution in either setting has the following reformulation: there exists \({\hat{b}}: [0,T] \times \mathbb {R}\rightarrow \mathbb {R}\) such that \({\hat{b}} = b\) a.e., and f (resp. u) is a distributional solution of (3.2) (resp. (4.2)) with b replaced by \({\hat{b}}\). Extending this concept to multiple dimensions, in which case the direct relationship between (3.2) and (4.2) is not present, seems to be rather difficult.

Theorem 3.8

There exists a unique duality solution f of (3.2). If, for \(\varepsilon > 0\), \(f^\varepsilon \) is the solution corresponding to \(b^\varepsilon \) as in (2.3), then, as \(\varepsilon \rightarrow 0\), \(f^\varepsilon \) converges weakly in the sense of measures to f. If \(1 \le p < \infty \), \(f_0, g_0 \in \mathcal P_p\), and f and g are the corresponding duality solutions, then, for some \(C > 0\) depending on p and the constants in (2.1), \(\mathcal W_p(f_t, g_t) \le C \mathcal W_p(f_0,g_0)\).

Proof

The existence and uniqueness of duality solutions is a direct consequence of the definition. Moreover, the duality solution identity implies that, for any \(R > 0\) and for some \(C > 0\) depending on the constants in (2.1), \( \left\| f(t,\cdot ) \right\| _{TV(B_R)} \le \left\| f_0 \right\| _{TV(B_{R+C})}\). For \(0 \le s < t \le T\) and \(g \in C_c(\mathbb {R}^d)\), we apply the duality formula with the test function \(g \circ \phi _{s,t}\) and obtain the identity

$$\begin{aligned} \int _{\mathbb {R}^d}g(x) f(t,dx) = \int _{\mathbb {R}^d} g(\phi _{s,t}(x)) f(s,dx). \end{aligned}$$

Then, by Lemma 2.2, for some modulus of continuity \(\omega \) depending on the modulus of continuity for g,

$$\begin{aligned} \left| \int _{\mathbb {R}^d} g(x)\left[ f(t,dx) - f(s,dx) \right] \right| \le \omega (|t-s|) \left\| f_0 \right\| _{TV(B_{R+C})} \quad \text {whenever } {\text {supp }}g \subseteq B_R, \end{aligned}$$

and we conclude that \(f \in C([0,T], \mathcal M_{\textrm{loc}, \textrm{w}})\).

For \(R > 0\), define \(f_{0,R} {:}{=} f_0 \textbf{1}_{B_R}\), and denote by \(f_R\) and \(f^\varepsilon _R\) the duality solutions of (3.2) with velocity respectively b and \(b^\varepsilon \) and initial condition \(f_{0,R}\). It then suffices to prove that, for fixed \(R > 0\), \(f^\varepsilon _R \rightharpoonup f_R\) in the sense of measures as \(\varepsilon \rightarrow 0\). Indeed, in view of Lemma 2.2, for any \(t \in [0,T]\) and \(g \in C_c(\mathbb {R}^d)\), if R is sufficiently large depending on \({\text {supp }}g\),

$$\begin{aligned} \int _{\mathbb {R}^d} g(x) f_R(t,dx)= & \int _{B_R} g(\phi _{0,t}(x)) f_0(dx) = \int _{\mathbb {R}^d} g(\phi _{0,t}(x)) f_0(dx)\\= & \int _{\mathbb {R}^d} g(x) f(t,dx), \end{aligned}$$

and similarly for \(f^\varepsilon \).

Let then \(g \in C_c(\mathbb {R}^d)\) and \(t \in (0,T]\) be fixed, and assume without loss of generality that \(f_0\) has compact support in \(B_R\) for some \(R > 0\). Then, for \(\varepsilon > 0\),

$$\begin{aligned} \int _{\mathbb {R}^d} g(x) f^\varepsilon (t,dx) = \int g(\phi ^\varepsilon _{0,t}(x))f_0(dx), \end{aligned}$$

so that \( \left\| f^\varepsilon \right\| _{TV} \le \left\| f_0 \right\| _{TV}\). Moreover, if \({\text {supp }}g \subset \mathbb {R}^d \backslash B_{R+C}\) for some \(C > 0\) sufficiently large and independent of \(\varepsilon > 0\), again by Lemma 2.2,

$$\begin{aligned} \int _{\mathbb {R}^d} g(x) f^\varepsilon (t,dx) = 0. \end{aligned}$$

We may then take a weakly convergent subsequence of \(f^\varepsilon \), with limit point \(F \in L^\infty ([0,T], \mathcal M)\), and, sending \(\varepsilon \rightarrow 0\), we obtain that F satisfies the duality solution identity, and therefore \(F = f\).

Choose \(h_1,h_2 \in C_c(\mathbb {R}^d)\) such that, for all \(x,y \in \mathbb {R}^d\), \(h_1(x) + h_2(y) \le |x-y|^p\). Then, if \(\gamma \) is any coupling between \(f_0\) and \(g_0\), we compute, using the duality identity and Lemma 2.2,

$$\begin{aligned} \int h_1(x) f(t,dx) + \int h_2(y) g(t,dy)&= \iint \left( h_1(\phi _{0,t}(x)) + h_2(\phi _{0,t}(y)) \right) \gamma (dx,dy)\\&\le C \iint |x-y|^p \gamma (dx,dy). \end{aligned}$$

Taking the infimum over such \(\gamma \) and supremum over such \(h_1,h_2\), and using the dual formulation of the p-Wasserstein distance, we arrive at the estimate for the Wasserstein distances. \(\square \)

Remark 3.5

The final estimate can also be proved using the characterization of f and g as laws of certain stochastic processes (see Remark 3.2) and the characterization of the Wasserstein metric in terms of random variables.

We may repeat the above analysis for the second-order conservative equation (3.4), the only difference being the lack of a finite speed of propagation. Therefore, all measures are taken to have finite mass over \(\mathbb {R}^d\). Below, \(\Phi _{t,0}\) is the stochastic flow satisfying (3.7).

Definition 3.3

A map \(f \in C([0,T], \mathcal M_{\textrm{w}})\) is called a solution of (3.4) if, for all \(t \in [0,T]\) and \(g \in C_b(\mathbb {R}^d)\),

$$\begin{aligned} \int g(x) f(t, dx) = \int \mathbb {E}[g(\Phi _{t,0}(x))] f_0(dx). \end{aligned}$$

Remark 3.6

Once again, such solutions are called duality solutions because \(\mathbb {E}[ g \circ \Phi _{t,0}]\) is the solution of (3.3) with terminal value g at time t. If \(f_0\) is a probability measure, then \(f(t,\cdot )\) is the law of the stochastic process \(\Phi _{t,0}(X_0)\), where \(X_0\) is a random variable with law \(f_0\), independent of the Wiener process W.

The following may be proved exactly as for Theorem 3.8, now invoking the properties of the stochastic flow described by Lemma 2.5.

Theorem 3.9

There exists a unique duality solution f of (3.4). If, for \(\varepsilon > 0\), \(f^\varepsilon \) is the solution corresponding to \(b^\varepsilon \) as in (2.3), then, as \(\varepsilon \rightarrow 0\), \(f^\varepsilon \) converges weakly in the sense of measures to f. If \(1 \le p \le \infty \), \(f_0, g_0 \in \mathcal P_p\), and f and g are the corresponding duality solutions, then, for some \(C > 0\) depending on p and the constants in (2.1), \(\mathcal W_p(f_t, g_t) \le C \mathcal W_p(f_0,g_0)\).

3.2.2 On the Failure of Renormalization

In view of the formula (3.6), it is immediate that (viscosity) solutions of (3.1) satisfy the renormalization property, that is, if u is a viscosity solution and \(\beta : \mathbb {R}\rightarrow \mathbb {R}\) is smooth, then \(\beta \circ u\) is also a solution. This is related to the existence and uniqueness of the Lipschitz backward flow; indeed, note that, coordinate by coordinate, \(\phi _{t,T}(x)\) is the unique viscosity solution of (3.1) with terminal value x at time T.

We contrast this with the renormalization property for the forward, conservative problem (3.2). If b is smooth, then classical computations show that f is a solution if and only if |f|, \(f_+\), and \(f_-\) are all solutions. Because \(f(t,\cdot )\) is the pushforward of \(f_0\) by the flow \(\phi _{0,t}\), this can be viewed as a generalized form of injectivity for the flow. For b satisfying (2.1), the backward flow is not guaranteed to be injective, and may in fact concentrate at null sets. We therefore cannot expect renormalization to hold in general.

As a concrete example, take again \(b(x) = {\text{ sgn } }x\) on \(\mathbb {R}\), and \(f_0 = \frac{1}{2} \delta _{1} - \frac{1}{2} \delta _{-1}\). Then, for \(t > 0\), \(f(t,\cdot ) = \frac{1}{2} \delta _{(1-t)_+} - \frac{1}{2} \delta _{-(1-t)_+}\), which means that \(f(t,\cdot ) \equiv 0\) for \(t \ge 1\). However, the solution F of (3.2) with \(F_0 = |f_0| = \frac{1}{2}\delta _1 + \frac{1}{2} \delta _{-1}\) is equal to \(F(t,\cdot ) = \frac{1}{2} \delta _{(1-t)_+} + \frac{1}{2} \delta _{-(1-t)_+}\), so that \(F(t,\cdot ) = \delta _0\) for \(t \ge 1\). Thus \(F_t \ne |f_t|\) for \(t \ge 1\); indeed, \(|f_t|\) does not even conserve mass.
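This cancellation-versus-concentration dichotomy can be sketched in a few lines (our illustration), pairing the duality solution of Definition 3.2 against test functions when \(f_0\) is atomic, using the backward flow \(\phi _{0,t}(x) = {\text{ sgn } }(x)(|x|-t)_+\) for \(b(x) = {\text{ sgn } }x\):

```python
import numpy as np

# backward flow for b(x) = sgn(x): phi_{0,t}(x) = sgn(x)(|x| - t)_+
phi = lambda t, x: np.sign(x) * max(abs(x) - t, 0.0)

def pair(g, t, atoms, weights):
    # <f(t,.), g> = sum_i w_i g(phi_{0,t}(x_i))  (Definition 3.2 for atomic f_0)
    return sum(w * g(phi(t, x)) for x, w in zip(atoms, weights))

one = lambda x: 1.0  # the test function g = 1 measures the total (signed) mass
```

For \(t \ge 1\), `pair(one, t, [1.0, -1.0], [0.5, -0.5])` returns 0 (the signed atoms of \(f_0\) cancel at the origin), while `pair(one, t, [1.0, -1.0], [0.5, 0.5])` returns 1 (the mass of \(F_0 = |f_0|\) concentrates at 0 and is conserved), so \(F_t \ne |f_t|\).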

The failure of renormalization holds even if we impose \(f_0 \in L^1 \cap L^\infty \). For such \(f_0\) and for \(b(x) = {\text{ sgn } }x\), we have

$$\begin{aligned} f(t,dx) = \left[ f_0(x+t) \textbf{1}\left\{ x > 0\right\} + f_0(x-t) \textbf{1}\left\{ x < 0 \right\} \right] dx + \left( \int _{[-t,t]} f_0 \right) \delta _0(dx). \end{aligned}$$

Therefore, renormalization fails whenever \(f_0\) is nonzero and odd.

We present one more counterexample to renormalization in which \(b \in C\) and \(f \in L^1\) (as the previous example shows, even if \(f_0 \in L^1\), \(f(t,\cdot )\) may not be absolutely continuous with respect to Lebesgue measure due to the concentration of the flow). Take \(b(t,x) = 2{\text{ sgn } }x |x|^{1/2}\). The backward flow is given by \(\phi _{0,t}(x) = {\text{ sgn } }x (|x|^{1/2} - t)_+^2\) for \((t,x) \in [0,T] \times \mathbb {R}\). For \(f_0 \in L^1\), the duality solution is given by

$$\begin{aligned} f(t,dx) = \left( \int _{[-t^2,t^2]}f_0 \right) \delta _0(dx) + f_0\left( {\text{ sgn } }x(|x|^{1/2} + t)^2 \right) \frac{|x|^{1/2} + t}{|x|^{1/2}} dx. \end{aligned}$$

We then take the odd density \(f_0(x) = {\text{ sgn } }x |x|^{1/2} {\textbf {1}}_{[-1,1]}(x)\), and the duality solution takes values in \(L^1\):

$$\begin{aligned} f(t,x) = {\text{ sgn } }x \frac{ (|x|^{1/2} + t)^2}{|x|^{1/2}} {\textbf {1}}_{[-(1-t)_+^2, (1-t)_+^2]}(x). \end{aligned}$$
(3.17)

On the other hand, |f| is not the duality solution, or even a distributional solution, since mass is not conserved. The unique duality solution with initial density \(|f_0(x)| = |x|^{1/2} \textbf{1}_{[-1,1]}(x)\) in this case is given by

$$\begin{aligned} F(t,dx) = \frac{4t^3}{3} \delta _0(dx) + \frac{ (|x|^{1/2} + t)^2}{|x|^{1/2}} \textbf{1}_{[-(1-t)_+^2, (1-t)_+^2]}(x)dx. \end{aligned}$$

Remark 3.7

One consequence of the commutator lemma of DiPerna and Lions [40, Lemma II.1] is that, if \(f \in L^p\) and \(b \in W^{1,q}\) with \(\frac{1}{p} + \frac{1}{q} \le 1\), then the renormalization property is satisfied. The previous example therefore indicates that these conditions cannot be weakened in general. Indeed, even though \(f_0 \in L^1 \cap L^\infty \), the solution \(f(t,\cdot )\) given by (3.17) belongs to \(L^p\) only for \(p \in [1,2)\) when \(t > 0\), and the same is true for \(\partial _x b\).

3.2.3 Equivalence of Duality and Distributional Solutions

We finish this section by studying the setting where bf can be understood as a distribution, and, therefore, distributional solutions of (3.2) can be considered.

Theorem 3.10

Assume either that b is continuous, or that \(f(t,\cdot ) \in L^1_\textrm{loc}\) for all \(t \in [0,T]\). Then f is a distributional solution of (3.2) if and only if f is the unique duality solution.

Proof

Suppose f is the unique duality solution. Let \((b^\varepsilon )_{\varepsilon > 0}\) be as in (2.3) and let \(f^\varepsilon \) be the corresponding solution of (3.2). For \(\phi \in C^1_c((0,T) \times \mathbb {R}^d)\), integrating by parts yields

$$\begin{aligned} \iint _{(0,T) \times \mathbb {R}^d} f^\varepsilon (t,x)\left( -\partial _t \phi (t,x) + b^\varepsilon (t,x) \cdot \nabla \phi (t,x) \right) dt dx = 0. \end{aligned}$$

In the case that \(b \in C\), we may choose regularizations \(b^\varepsilon \) that converge locally uniformly to b. By Theorem 3.8, as \(\varepsilon \rightarrow 0\), \(f^\varepsilon \) converges weakly in the sense of measures to f, and so we may take \(\varepsilon \rightarrow 0\) above to obtain

$$\begin{aligned} \iint _{(0,T) \times \mathbb {R}^d} \left( -\partial _t \phi (t,x) + b(t,x) \cdot \nabla \phi (t,x) \right) f(t,dx)\, dt = 0. \end{aligned}$$

Otherwise, if \(f \in L^1_\textrm{loc}\), it follows that \(f^\varepsilon \) converges weakly in \(L^1_\textrm{loc}\), and therefore the same is true for \(b^\varepsilon f^\varepsilon \) by the dominated convergence theorem. We may then take \(\varepsilon \rightarrow 0\) in this case as well.

Assume now that f is an arbitrary distributional solution. We aim to show the duality equality in Definition 3.2, and, by a density argument, it suffices to do so for \(g \in C_c(\mathbb {R}^d) \cap C^{0,1}(\mathbb {R}^d)\). Let \(\rho _\varepsilon \) be a standard mollifier as before and set \(f_\varepsilon = f * \rho _\varepsilon \). Then \(f_\varepsilon \) satisfies

$$\begin{aligned} \partial _t f_\varepsilon - {\text{ div }}(b f_\varepsilon ) = {\text{ div } }r_\varepsilon , \end{aligned}$$

where \(r_\varepsilon = (bf) * \rho _\varepsilon - b f_\varepsilon \). For \(t \in (0,T]\), let u be the unique Lipschitz viscosity solution of the terminal value problem

$$\begin{aligned} -\partial _s u + b \cdot \nabla u = 0 \quad \text {in }(0,t) \times \mathbb {R}^d, \quad u(t,\cdot ) = g. \end{aligned}$$

By the theory in Sect. 3.1, \(u(s,x) = g(\phi _{s,t}(x))\), which is Lipschitz continuous and compactly supported in the space variable. We then compute

$$\begin{aligned} \partial _s \int f_\varepsilon (s,x) u(s,x)dx = - \int r_\varepsilon (s,x) \cdot \nabla u(s,x)dx, \end{aligned}$$

so that

$$\begin{aligned} & \int f_\varepsilon (t,x) g(x)dx - \int (f_0 * \rho _\varepsilon )(x) g(\phi _{0,t}(x))dx\\ & \qquad = - \int _0^t \int _{\mathbb {R}^d} r_\varepsilon (s,x) \cdot \nabla u(s,x)dxds. \end{aligned}$$

We may then conclude by proving that \(r_\varepsilon \xrightarrow {\varepsilon \rightarrow 0} 0\) in \(L^1_\textrm{loc}\).

If \(f \in L^1_\textrm{loc}\), this is immediate because, as \(\varepsilon \rightarrow 0\), both \((bf) * \rho _\varepsilon \) and \(b f_\varepsilon \) converge in \(L^1_\textrm{loc}\) to bf. If \(b \in C\), then, as \(\varepsilon \rightarrow 0\), both \((bf) * \rho _\varepsilon \) and \(b f_\varepsilon \) converge locally in total variation to bf, and hence \(r_\varepsilon \) converges locally in total variation to 0; since each \(r_\varepsilon \) is absolutely continuous with respect to Lebesgue measure, this yields the convergence in \(L^1_\textrm{loc}\). \(\square \)

Remark 3.8

Even in the context of Theorem 3.10, the renormalization property can fail. Indeed, this is the case for the final example in the previous section, where both \(b \in C\) and \(f \in L^1\).

4 The Expansive Regime

We continue our analysis of transport and continuity equations with vector fields b satisfying (2.1), and in this section we study the expansive regime. Reversing the sign appearing in front of the velocity field b, the initial value problem for the continuity equation becomes

$$\begin{aligned} \partial _t f + {\text {div}}(b(t,x)f) = 0 \quad \text {in } (0,T) \times \mathbb {R}^d \quad \text {and} \quad f(0,\cdot ) = f_0, \end{aligned}$$
(4.1)

and the corresponding dual terminal value problem for the non-conservative transport equation is

$$\begin{aligned} \partial _t u + b(t,x) \cdot \nabla u = 0 \quad \text {in } (0,T) \times \mathbb {R}^d \quad \text {and} \quad u(T,\cdot ) = u_T. \end{aligned}$$
(4.2)

Equivalently, we are studying the time-reversed versions of (3.1) and (3.2) (in this case, b is replaced with \(b(T-t,\cdot )\)). As such, the relevant direction of the flow (2.2) changes in this context: whereas in the previous section the compressive, backward flow gave rise to the dual solution spaces C and \(\mathcal M\), here the expansive, forward flow allows us to develop a theory for both (4.1) and (4.2) in Lebesgue spaces. This can also be seen from formal a priori \(L^p\) estimates for (4.1) and (4.2), which follow immediately from the lower bound on \({\text {div}}b\).

The regime for these equations matches the one studied by Bouchut et al. [25], in which emphasis is placed on the fact that distributional solutions of (4.1) are not unique in general. Our approach to these equations is similar, in that we use a particular solution of (4.1) to study, by duality, the transport equation (4.2) and the forward ODE flow for (2.2). We extend the results of [25] by identifying a “good” solution (a reversible solution, in the terminology of [25]) of (4.1) for any \(f_0 \in L^p_\textrm{loc}\), whose solution operator is continuous on \(L^p_\textrm{loc}\) and stable under regularizations with respect to the weak topology of \(C([0,T],L^p_\textrm{loc}(\mathbb {R}^d))\).

The terminal value problem (4.2) is then understood both in the dual sense and through the lens of renormalization theory. It is this theory that allows us, as in [40], to make sense of the forward ODE flow (2.17) as the right-inverse of the backward flow, completing the program initiated in Sect. 2. As a consequence, we then also obtain the uniqueness of nonnegative distributional solutions of (4.1), and, by extension, a characterization of the reversible solution of [25].

We finish the section by making some remarks about the second-order analogues of (4.1) and (4.2). Unlike in the previous section, we do not have a full solution theory for general second-order equations, unless the ellipticity matrix is uniformly positive (the case which has already been covered by Figalli in [41]) or is degenerate but independent of the space variable.

4.1 The Conservative Equation

The starting point for the study of the conservative equation (4.1) is that solutions in the sense of distributions are not unique (see also [22], [25, Section 6]). We revisit the example, when \(d = 1\), \(b(t,x) = {\text{ sgn } }x\). Then \(f(t,x) {:}{=} {\text{ sgn } }x {\textbf {1}}_{|x| \le t}\) is a nontrivial distributional solution of (4.1) belonging to \(L^1 \cap L^\infty \) with \(f(0,\cdot ) = 0\). This failure of uniqueness can be seen as a consequence of the compressive nature of the backward flow (2.2), which allows positive and negative mass to be “cancelled” at time 0, only to reappear immediately for \(t > 0\). The same phenomenon leads to the failure of renormalization in the compressive regime for the continuity equation in Sect. 3.2. In either case, we remark that this particular b belongs to \(BV(\mathbb {R})\), while \(\partial _x b\) is not absolutely continuous with respect to Lebesgue measure, and so the condition in the work of Ambrosio [5] that \({\text{ div } }b \in L^1_\text {loc}\) indeed cannot be weakened in general, if one is to hope for renormalization or uniqueness for the continuity equation.
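For completeness, here is the distributional verification (our computation): since \(b f = {\text{ sgn } }(x)^2 {\textbf {1}}_{|x| \le t} = {\textbf {1}}_{[-t,t]}\) a.e., for any \(\varphi \in C_c^\infty (\mathbb {R})\),

$$\begin{aligned} \frac{d}{dt} \int _{\mathbb {R}} f(t,x) \varphi (x)dx = \frac{d}{dt} \left( \int _0^t \varphi - \int _{-t}^0 \varphi \right) = \varphi (t) - \varphi (-t), \end{aligned}$$

while \(\langle \partial _x (bf), \varphi \rangle = -\int _{-t}^t \varphi '(x)dx = \varphi (-t) - \varphi (t)\). The two terms cancel, so \(\partial _t f + \partial _x (bf) = 0\) in the sense of distributions, with \(f(0,\cdot ) = 0\).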

One strategy is to define solutions of (4.1) by duality with the transport equation (3.1) from the compressive setting. With the theory of Sect. 3, for \(g \in C^{0,1}_c(\mathbb {R}^d)\), we may define a Lipschitz viscosity solution of the initial value problem

$$\begin{aligned} \partial _t v + b(t,x) \cdot \nabla v = 0 \quad \text {in } (0,T) \times \mathbb {R}^d, \quad v(0,\cdot ) = g \end{aligned}$$

(because \({\tilde{v}}(t,x) {:}{=} v(T-t,x)\) solves the corresponding terminal value problem (3.1) with velocity \({\tilde{b}}(t,x) = b(T-t,x)\)), and then, formally, for \(t > 0\), \(\int f(t,x) v(t,x)dx = \int f_0(x) g(x)dx\).

The main problem with this approach is that duality does not define unique solutions, again due to the concentration effect of the backward flow. Taking once more \(b(t,x) = {\text{ sgn } }x\), we have, by (3.6),

$$\begin{aligned} v(t,x) = \left\{ \begin{array}{ll} g(x - ({\text{ sgn } }x )t), & |x| \ge t, \\ g(0), & |x| \le t. \end{array}\right. \end{aligned}$$

Therefore, the duality equality fails to give sufficient information to identify f in the cone \(\{|x| \le t\}\), on which v is constant, regardless of the initial data g. Indeed, the two distributional solutions \(f \equiv 0\) and \(f(t,x) = {\text {sgn}}(x) {\textbf {1}}_{|x| \le t}\) differ exactly in this cone, in which the Jacobian of the backward flow vanishes. It is exactly this observation that led to the notion of “exceptional” solutions of (3.1) and the exceptional set in [25].

Inspired by the work of [25], we instead identify a “good” solution operator acting on all \(f_0 \in L^p_\textrm{loc}\), \(1 \le p \le \infty \), by extending the solution formula from the smooth case, which depends on the backward flow studied in Sect. 2, as well as the corresponding Jacobian. In particular, the “good solution” is distinguished by vanishing whenever the Jacobian does. Though our notion of solution turns out to be equivalent to the reversible solutions of [25], our approach differs slightly from theirs, which relies on a general class of “transport flows” that generalize the backward ODE flow. One advantage of our analysis is that we can directly appeal to the various topological properties of the backward flow proved in Sect. 2. Let us also draw an analogy with the approach of Crippa and De Lellis [36], in which the regular Lagrangian flow and its properties are studied directly, leading to information about the associated PDEs, although in our setting, we still require PDE techniques to glean further properties of the ODE flow.

4.1.1 Representation Formula

If b is Lipschitz, then the solution of (4.1) is given by

$$\begin{aligned} f(t,x) = f_0(\phi _{0,t}(x)) J_{0,t}(x), \end{aligned}$$
(4.3)

where \(\phi _{0,t}(x)\) is the backward flow defined in Sect. 2 and \(J_{0,t}(x) = \det (\nabla _x \phi _{0,t}(x))\) is the corresponding Jacobian. One way to derive this formula is through the Feynman–Kac formula for the time-reversed equation

$$\begin{aligned} & -\partial _t {\tilde{f}} + b(T-t,x) \cdot \nabla {\tilde{f}} + ({\text {div}}\, b)(T-t,x)\, {\tilde{f}} = 0 \quad \text {in }(0,T) \times \mathbb {R}^d, \quad \\ & {\tilde{f}}(T,\cdot ) = f_0, \end{aligned}$$

which gives

$$\begin{aligned} f(t,x) = {\tilde{f}}(T-t,x) = f_0(\phi _{0,t}(x)) \exp \left( -\int _0^t {\text {div}}\, b(s, \phi _{0,s}(x))\,ds \right) , \end{aligned}$$
(4.4)

and then \(J_{0,t}(x) = \exp \left( -\int _0^t {\text {div}}\, b(s, \phi _{0,s}(x))\,ds \right) \).

In the general case where b satisfies (2.1), the formula (4.3) makes sense for arbitrary \(f_0 \in L^p_\textrm{loc}\), \(1 \le p \le \infty \). We may then use the various results in Sect. 2 to analyze the stability properties of the solution operator defined by the formula (4.3). We remark in particular that the stability results of Lemma 2.3 depend on the determinant structure of the Jacobian, which is somewhat disguised by the exponential expression in (4.4).
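To illustrate the formula in the example \(d = 1\), \(b(t,x) = {\text {sgn}}\,x\) discussed above, the backward flow and its Jacobian can be computed explicitly:

$$\begin{aligned} \phi _{0,t}(x) = {\text {sgn}}(x)\,(|x| - t)_+, \qquad J_{0,t}(x) = \partial _x \phi _{0,t}(x) = {\textbf {1}}_{|x| > t}, \end{aligned}$$

so that (4.3) gives \(f(t,x) = f_0({\text {sgn}}(x)(|x|-t))\, {\textbf {1}}_{|x|>t}\). This solution vanishes in the cone \(\{|x| \le t\}\), exactly where the Jacobian does, and, when \(f_0 = 0\), it selects \(f \equiv 0\) rather than the spurious distributional solution \({\text {sgn}}(x) {\textbf {1}}_{|x| \le t}\).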

Theorem 4.1

Let \(1 \le p \le \infty \), assume that \(f_0 \in L^p_\textrm{loc}(\mathbb {R}^d)\), and define f by (4.3). Then f is a distributional solution of (4.1). If \(1 \le p < \infty \), then \(f \in C([0,T],L^p(\mathbb {R}^d))\), and, if \(p = \infty \), then \(f \in C([0,T],L^q_\textrm{loc}(\mathbb {R}^d))\) for all \(q < \infty \). There exists a constant \(C > 0\) depending only on the assumptions in (2.1) such that, for all \(R > 0\),

$$\begin{aligned} \left\| f(t,\cdot ) \right\| _{L^p(B_R)} \le C \left\| f_0 \right\| _{L^p(B_{R+C})}. \end{aligned}$$
(4.5)

If \((b_\varepsilon )_{\varepsilon > 0}\) are as in (2.3) and \((f_\varepsilon )_{\varepsilon > 0}\) are the corresponding solutions of (4.1), then, as \(\varepsilon \rightarrow 0\), \(f_\varepsilon \) converges to f weakly in \(C([0,T], L^p_\textrm{loc}(\mathbb {R}^d))\) if \(1 \le p < \infty \), and weak-\(\star \) in \(L^\infty \) if \(p = \infty \).

Proof

When \(p = \infty \), the bound (4.5) follows from the \(L^\infty \) bounds for the flow and Jacobian in Lemmas 2.2 and 2.3. We prove the bound when \(p < \infty \) for the solutions \(f_\varepsilon \) of the equation with \(b_\varepsilon \) as in (2.3), for a constant independent of \(\varepsilon \), and then the estimate for f follows after proving the weak convergence result.

For a constant \(C > 0\) independent of \(\varepsilon \), by Lemmas 2.2 and 2.3, we have \(|J^\varepsilon _{0,t}|\le C\) and \(|\phi ^\varepsilon _{0,t}(x)| \le R + C\) for \(|x| \le R\). Then

$$\begin{aligned} \int _{B_R} |f^\varepsilon (t,x)|^pdx&= \int _{B_R} |f_0(\phi ^\varepsilon _{0,t}(x))|^p J^\varepsilon _{0,t}(x)^pdx \\&\le \left\| J^\varepsilon _{0,t} \right\| _\infty ^{p-1} \int _{B_R} |f_0(\phi ^\varepsilon _{0,t}(x))|^p J^\varepsilon _{0,t}(x)dx\\&\le C \int _{B_{R+C}} |f_0(x)|^pdx. \end{aligned}$$

It suffices to prove the weak convergence of \(f^\varepsilon \) when \(p < \infty \) for \(f_0 \in C_c\). In the general case, if \({\tilde{f}}_0\) is continuous with compact support and we let \({\tilde{f}}^\varepsilon \) be the solution with \(b^\varepsilon \) and \({\tilde{f}}_0\), we have

$$\begin{aligned} \left\| f^\varepsilon - {\tilde{f}}^\varepsilon \right\| _{C([0,T],L^p(B_R))} \le C \left\| f_0 - {\tilde{f}}_0 \right\| _{L^p(B_{R+C})}, \end{aligned}$$

and we may then choose \({\tilde{f}}_0\) arbitrarily close to \(f_0\) in \(L^p_\textrm{loc}\).

By Lemma 2.2, as \(\varepsilon \rightarrow 0\), \(\phi ^\varepsilon \rightarrow \phi \) uniformly in \([0,T] \times \mathbb {R}^d\), and therefore \(f_0 \circ \phi ^\varepsilon _{0,t}\) converges uniformly to \(f_0 \circ \phi _{0,t}\). In view of Lemma 2.3, \(f^\varepsilon \) converges weakly in the sense of distributions (and therefore in the sense of locally bounded Borel measures) to f. Since \(f^\varepsilon \) is bounded in \(L^\infty ([0,T], L^p_\textrm{loc}(\mathbb {R}^d))\), the convergence is actually weak in \(L^\infty ([0,T], L^p_\textrm{loc}(\mathbb {R}^d))\).

If \(p = \infty \), then, in particular, \(f^\varepsilon \) is bounded in \(C([0,T], L^q_\textrm{loc}(\mathbb {R}^d))\) for every \(q < \infty \), uniformly in \(\varepsilon \), and we have the convergence as \(\varepsilon \rightarrow 0\) in the sense of distributions to f. In this case, \(f \in L^\infty _\textrm{loc}([0,T] \times \mathbb {R}^d)\), and so the convergence is weak-\(\star \) in \(L^\infty _\textrm{loc}\).

Given \(g \in C^1_c((0,T) \times \mathbb {R}^d)\), integrating by parts gives

$$\begin{aligned} \iint _{[0,T] \times \mathbb {R}^d} f^\varepsilon (t,x)\left[ \partial _t g(t,x) + b^\varepsilon (t,x) \cdot \nabla g(t,x) \right] dxdt = 0. \end{aligned}$$

As \(\varepsilon \rightarrow 0\), the bracketed expression converges a.e. to \(\partial _t g(t,x) + b(t,x) \cdot \nabla g(t,x)\), and therefore converges strongly in \(L^q\) for all \(1 \le q < \infty \) by the dominated convergence theorem. We may therefore send \(\varepsilon \rightarrow 0\), using the weak convergence of \(f^\varepsilon \), to deduce that f is a distributional solution. This implies in particular that \(f \in C([0,T], L^p_\textrm{w}(\mathbb {R}^d))\), or, if \(p = \infty \), that \(t \mapsto f(t,\cdot )\) is continuous with respect to the weak-\(\star \) topology of \(L^\infty \).

To show that \(f \in C([0,T], L^p(\mathbb {R}^d))\) when \(p < \infty \), we may again consider \(f_0 \in C_c(\mathbb {R}^d)\) without loss of generality. Then \(f_0 \circ \phi _{0,\cdot } \in C([0,T] \times \mathbb {R}^d)\), while \(J_{0,\cdot } \in C([0,T], L^1_\textrm{loc}(\mathbb {R}^d))\) by Lemma 2.3, and the result follows. \(\square \)

Remark 4.1

In view of the stability results of Theorem 4.1 above, this “good” solution coincides with the notion of reversible solutions in [22, 25]. We refer to it in the sequel as the BJM solution.

The following is immediately obtained from the formula (4.3):

Corollary 4.1

If f is a BJM solution of (4.1), then so is |f|.

Corollary 4.1 is in direct contrast to the continuity equation in the compressive setting of the previous section, where renormalization fails. Its proof depends on the formula for the BJM solution; indeed, despite the weak stability result in Theorem 4.1, this renormalization property cannot be proved by regularization, since we only have the weak convergence as \(\varepsilon \rightarrow 0\) of \(f_\varepsilon \) to f. At present, we do not know whether the convergence is strong in \(L^p\). This turns out to be equivalent to the strong convergence in \(L^1_\textrm{loc}\) of the Jacobians, and therefore, in view of Proposition 2.1, we have the following when \(d = 1\).

Theorem 4.2

Assume \(d = 1\), \(f_0 \in L^p_\textrm{loc}(\mathbb {R})\) for \(p < \infty \), \((b^\varepsilon )_{\varepsilon > 0}\) is as in (2.3), and \(f^\varepsilon \) is the corresponding solution of (4.1). Then, as \(\varepsilon \rightarrow 0\), \(f^\varepsilon \) converges strongly in \(C([0,T] , L^p_\textrm{loc}(\mathbb {R}))\) to f.

Proof

Just as in the proof of Theorem 4.1, we may assume without loss of generality that \(f_0 \in C_c(\mathbb {R})\). In that case, \(f^\varepsilon \) is bounded in \(L^1\) and \(L^\infty \), and so the strong \(L^p\) convergence reduces to the strong convergence of \(J^\varepsilon _{0,\cdot }\) to \(J_{0,\cdot }\) in \(L^1_\textrm{loc}([0,T] \times \mathbb {R})\) from Proposition 2.1. \(\square \)

4.1.2 Vanishing Viscosity Approximation

The BJM solution above also arises from vanishing viscosity limits, that is, the limit as \(\varepsilon \rightarrow 0\) of solutions of

$$\begin{aligned} \partial _t f^\varepsilon - \frac{\varepsilon ^2}{2} \Delta f^\varepsilon + {\text {div}}(b(t,x) f^\varepsilon ) = 0 \quad \text {in } [0,T] \times \mathbb {R}^d \quad \text {and} \quad f^\varepsilon (0,\cdot ) = f_0, \end{aligned}$$
(4.6)

which has as its unique solution

$$\begin{aligned} f^\varepsilon (t,x) {:}{=} \mathbb {E}[ f_0(\phi ^\varepsilon _{0,t}(x))J^\varepsilon _{0,t}(x) ], \end{aligned}$$
(4.7)

where now \(\phi ^\varepsilon \) and \(J^\varepsilon \) denote respectively the stochastic flow and Jacobian from (2.23), corresponding to Proposition 2.3.

The following result is a consequence of Proposition 2.3 and is proved almost exactly as Theorem 4.1.

Theorem 4.3

The function \(f^\varepsilon \) defined by (4.7) belongs to \(C([0,T],L^p_\textrm{loc}(\mathbb {R}^d))\) if \(1 \le p < \infty \), and to \(C([0,T],L^q_\textrm{loc}(\mathbb {R}^d))\) for every \(q < \infty \) if \(p = \infty \), and, as \(\varepsilon \rightarrow 0\), \(f^\varepsilon \) converges weakly in those spaces to f.

4.2 The Nonconservative Equation

The next step is the study of the terminal value problem (4.2). Unlike the transport equation (3.1) with velocity \(-b\), which was solved in the space of continuous functions, we cannot define \(L^p\) solutions in the distributional sense, as the product \(b \cdot \nabla u = {\text{ div }}(bu) - ({\text{ div } }b) u\) does not make sense when \({\text{ div } }b\) is merely a measure. Instead, we initially characterize solutions by duality with (4.1), which can be seen as a way of restricting the class of test functions to deal with the singularities in b (see Remark 3.3).

4.2.1 \(L^p\) and BV Estimates

We will first prove a priori \(L^p\) and BV estimates for the solution of (4.2), assuming all the data and solutions are smooth. The BV estimates in particular are crucial for establishing the strong convergence in \(L^p\) of regularized solutions to a unique limit, which will be the duality solution, adjoint to the equation (4.1). The BV estimate appears already in [25, Lemma 4.4]. We present an alternative proof here, which is similar to the one we give later for second-order equations.

Lemma 4.1

Assume b is smooth and satisfies (2.1), and let u be a smooth solution of (4.2). Then, for all \(1 \le p \le \infty \), there exist \(C = C_{p,R} \in L^1_+([0,T])\) and \(C_R > 0\) depending only on the bounds in (2.1) such that, for all \(0 \le t \le T\),

$$\begin{aligned} \left\| u(t,\cdot ) \right\| _{L^p(B_R)} \le \exp \left( \int _0^t C(s)ds \right) \left\| u_T \right\| _{L^p(B_{R+C})} \end{aligned}$$

and

$$\begin{aligned} \left\| u(t,\cdot ) \right\| _{BV(B_R)} \le \exp \left( \int _0^t C(s)ds \right) \left\| u_T \right\| _{BV(B_{R+C})}. \end{aligned}$$

Proof

We assume that \(u_T\) has compact support, and, therefore, in view of the finite speed of propagation property, so does u. The general result for \(L^p_\textrm{loc}\) and \(BV_\textrm{loc}\) is proved similarly.

The \(L^\infty \) bound is a consequence of the maximum principle. For \(p < \infty \), we compute

$$\begin{aligned} \frac{\partial }{\partial t} \int _{\mathbb {R}^d} |u(t,x)|^pdx = \int _{\mathbb {R}^d} {\text{ div } }b(t,x) |u(t,x)|^p dx \ge -C_0(t)d \int _{\mathbb {R}^d} |u(t,x)|^p dx, \end{aligned}$$

and the \(L^p\) bound follows from Grönwall’s inequality.
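To spell out the Grönwall step: since (4.2) is a terminal value problem, the differential inequality is integrated backward from time T. Writing \(X(t) {:}{=} \int _{\mathbb {R}^d} |u(t,x)|^pdx\), the display above reads \(X'(t) \ge -d\, C_0(t) X(t)\), and hence

$$\begin{aligned} X(t) \le \exp \left( d \int _t^T C_0(s)ds \right) X(T), \qquad 0 \le t \le T, \end{aligned}$$

which is the claimed \(L^p\) bound, with the terminal data appearing on the right-hand side.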

Now, for \(t \le T\) and \(x,z \in \mathbb {R}^d\), set \(w(t,x,z) = \nabla u(t,x) \cdot z\). Then w satisfies

$$\begin{aligned} \partial _t w + b \cdot \nabla _x w + (z \cdot \nabla )b \cdot \nabla _z w = 0. \end{aligned}$$

Since b and w are smooth, the renormalization property holds for this transport equation, and so a simple regularization argument shows, in the sense of distributions,

$$\begin{aligned} \partial _t |w| + b \cdot \nabla _x |w| + (z \cdot \nabla )b \cdot \nabla _z |w| = 0. \end{aligned}$$

Define \(\phi (z) = e^{-|z|^2}\). Then

$$\begin{aligned} & \iint _{\mathbb {R}^d \times \mathbb {R}^d} \phi (z) b(t,x) \cdot \nabla _x |w(t,x,z)|dxdz \\ & \qquad = - \iint _{\mathbb {R}^d \times \mathbb {R}^d} \phi (z) {\text{ div } }b(t,x) |w(t,x,z)|dxdz \end{aligned}$$

and

$$\begin{aligned}&\iint _{\mathbb {R}^d \times \mathbb {R}^d} \phi (z)(z \cdot \nabla )b(t,x) \cdot \nabla _z|w(t,x,z)|dxdz\\ &\quad = - \iint _{\mathbb {R}^d \times \mathbb {R}^d} \left[ \nabla \phi (z) \cdot (z \cdot \nabla b(t,x)) + \phi (z) {\text{ div } }b(t,x) \right] |w(t,x,z)|dxdz. \end{aligned}$$

Therefore, by Lemma 2.1,

$$\begin{aligned}&\partial _t \iint _{\mathbb {R}^d \times \mathbb {R}^d} |w(t,x,z)|\phi (z)dxdz\\ &\quad = \iint _{\mathbb {R}^d \times \mathbb {R}^d} \left[ \nabla \phi (z) \cdot (z \cdot \nabla b(t,x)) + 2 \phi (z) {\text {div}}\, b(t,x) \right] |w(t,x,z)|dxdz \\ &\quad =\iint _{\mathbb {R}^d \times \mathbb {R}^d} 2e^{-|z|^2} \left[ {\text {div}}\, b(t,x) - \nabla b(t,x)z \cdot z\right] |w(t,x,z)|dxdz\\ &\quad \ge -2(d-1)C_0(t) \iint _{\mathbb {R}^d \times \mathbb {R}^d} e^{-|z|^2} |w(t,x,z)|dxdz. \end{aligned}$$

The result follows from Grönwall’s lemma and the fact that

$$\begin{aligned} \iint _{\mathbb {R}^d \times \mathbb {R}^d} e^{-|z|^2} |w(t,x,z)|dxdz = c_0 \int _{\mathbb {R}^d} |\nabla u(t,x)|dx, \end{aligned}$$

where the constant \(c_0 = \int _{\mathbb {R}^d} e^{-|z|^2} |\nu \cdot z|dz\) is independent of the choice of \(|\nu | = 1\) by rotational invariance. \(\square \)
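For concreteness, the constant \(c_0\) can be evaluated by taking \(\nu = e_1\) and factoring the Gaussian integral:

$$\begin{aligned} c_0 = \int _{\mathbb {R}^d} e^{-|z|^2} |z_1| dz = \left( \int _{\mathbb {R}} e^{-s^2} |s| ds \right) \left( \int _{\mathbb {R}} e^{-s^2} ds \right) ^{d-1} = \pi ^{(d-1)/2}, \end{aligned}$$

using \(\int _{\mathbb {R}} e^{-s^2}|s|ds = 2\int _0^\infty s e^{-s^2}ds = 1\).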

4.2.2 Duality Solutions

Proceeding by duality with the conservative forward equation, and using the BV-estimates above, then gives the following.

Theorem 4.4

Assume \(1 \le p \le \infty \) and \(u_T \in L^p_\textrm{loc}\). Then there exists a unique function \(u \in C([0,T], L^p_\textrm{loc}(\mathbb {R}^d))\) (or \(u \in C([0,T], L^q_\textrm{loc}(\mathbb {R}^d))\) for all \(q < \infty \) if \(p = \infty \)) such that, if \((b^\varepsilon )_{\varepsilon > 0}\) is as in (2.3) and \(u^\varepsilon \) denotes the corresponding solution of (4.2), then, as \(\varepsilon \rightarrow 0\), \(u^\varepsilon \) converges to u strongly in \(C([0,T], L^p_\textrm{loc}(\mathbb {R}^d))\) if \(p < \infty \) and weak-\(\star \) in \(L^\infty \) if \(p = \infty \). Moreover, the solution map \(u_T \mapsto u\) is linear, order-preserving, and continuous in \(L^p_\textrm{loc}(\mathbb {R}^d)\). If \(s \in [0,T)\), \(f_s \in L^{p'}(\mathbb {R}^d)\), and \(f \in C([s,T], L^{p'}(\mathbb {R}^d))\) (or \(f \in C([s,T], L^q_\textrm{loc}(\mathbb {R}^d))\) for all \(q < \infty \) if \(p = 1\)) is the BJM solution of (4.1) with initial data \(f(s,\cdot ) = f_s\), then

$$\begin{aligned} \int _{\mathbb {R}^d} u(s,x)f_s(x)dx = \int _{\mathbb {R}^d} u_T(x) f(T,x)dx. \end{aligned}$$

Remark 4.2

The function u coincides with the notion of duality solution presented in [25] whenever \(u_T\) (and therefore \(u(t,\cdot )\) for \(t < T\)) belongs to \(BV_\textrm{loc}\).

Proof

By Lemma 4.1, \((u^\varepsilon )_{\varepsilon > 0}\) is bounded uniformly in \(C([0,T], L^p_\textrm{loc}(\mathbb {R}^d))\), and so, along a subsequence, converges weakly as \(\varepsilon \rightarrow 0\) to some u satisfying the same bounds.

In order to see that the convergence is strong, note that it suffices, by the \(L^p\)-boundedness of the solution operator implied by Lemma 4.1, to assume that \(u_T\in C_c(\mathbb {R}^d)\). Then \(u^\varepsilon \) is bounded in \(L^\infty ([0,T], BV(\mathbb {R}^d))\) independently of \(\varepsilon \). The identity \(\partial _t u^\varepsilon = - b^\varepsilon \cdot \nabla u^\varepsilon \) then implies that, for any \(t_1 < t_2 \le T\) and \(R > 0\),

$$\begin{aligned} \left\| u^\varepsilon (t_1,\cdot ) - u^\varepsilon (t_2,\cdot ) \right\| _{L^1(B_R)} \le \left\| b \right\| _{L^\infty (B_R)} \sup _{t \in [0,T]} \left\| \nabla u^\varepsilon \right\| _{L^1(B_R)} |t_1 - t_2|. \end{aligned}$$

This, along with the uniform BV estimates, implies that \((u^\varepsilon )_{\varepsilon > 0}\) is precompact in \(C([0,T], L^1_\textrm{loc}(\mathbb {R}^d))\), and, because of the uniform \(L^\infty \)-bound, precompact in \(C([0,T], L^p_\textrm{loc}(\mathbb {R}^d))\) for any \(p \in [1,\infty )\). It therefore follows that any weakly convergent subsequence actually converges strongly.

If \(f^\varepsilon \) is the solution of (4.1) with \(f^\varepsilon (s,\cdot ) = f_s\), then classical computations involving integration by parts give

$$\begin{aligned} \int _{\mathbb {R}^d} u^\varepsilon (s,x)f_s(x)dx = \int _{\mathbb {R}^d} u_T(x) f^\varepsilon (T,x)dx. \end{aligned}$$

Sending \(\varepsilon \rightarrow 0\) along a subsequence and using the weak convergence of \(f^\varepsilon \) and strong convergence of \(u^\varepsilon \) shows that any limit point u must satisfy the duality identity with f, and is therefore unique. We conclude that the full sequence converges strongly. As before, when \(p = \infty \), we obtain the same result since then also \(u \in C([0,T], L^p_\textrm{loc}(\mathbb {R}^d))\) for any \(p < \infty \). \(\square \)

Remark 4.3

If \(u_T \in BV_\textrm{loc}\), then the duality solution u of (4.2) satisfies \(\nabla u \in L^\infty ([0,T], \mathcal M_\textrm{loc}(\mathbb {R}^d))\). Note, however, that this is still not enough to make sense of u as a distributional solution, unless b is continuous; see also Remark 3.4.

4.2.3 Renormalization

In Sect. 3, the renormalization property for solutions of the transport equation (3.1) followed from the formula (3.6). We prove a similar renormalization property for the transport equation (4.2) in the expansive regime. Here, it depends on the strong convergence in \(L^p\) of regularizations.

Theorem 4.5

Let \(1 \le p \le \infty \) and \(u_T \in L^p_\textrm{loc}(\mathbb {R}^d)\), and let \(u \in C([0,T], L^p_\textrm{loc}(\mathbb {R}^d))\) be the duality solution of (4.2). Assume \(\beta : \mathbb {R}\rightarrow \mathbb {R}\) is smooth and satisfies \(|\beta (r)| \le C (1 + |r|^\alpha )\) for some \(C,\alpha > 0\). Then \(\beta \circ u \in C([0,T],L^{p/\alpha }_\textrm{loc}(\mathbb {R}^d))\) is the duality solution of (4.2) with terminal value \(\beta (u(T,\cdot )) = \beta \circ u_T\).

Proof

The proof is an easy consequence of regularization of b as in (2.3), and the passage to the limit follows from the strong convergence of \(u^\varepsilon \) to u. \(\square \)

4.3 The Forward ODE Flow

We finally return to the study of the flow (2.2), in particular for the forward direction. A candidate for the object \(\phi _{t,s}(x)\), \(t > s\), a.e. x was already identified in Proposition 2.2 as the right inverse of the backward flow—note that the full measure set of \(x \in \mathbb {R}^d\) depends on s and t. We now connect this right-inverse with the transport equation (4.2), and exploit the renormalization property to identify \(\phi _{t,s}(x)\) as a regular Lagrangian flow, that is, for a.e. \(x \in \mathbb {R}^d\), an absolutely continuous solution of the integral equation for (2.2) with control on the compressibility.

4.3.1 Properties of the Right Inverse

We first record more properties of the right-inverse of the backward flow identified in Proposition 2.2. From now on, for \(0 \le s \le t \le T\), we always denote by \(\phi _{t,s}\) the version of the right-inverse of \(\phi _{s,t}\) which is continuous almost everywhere (such a version is guaranteed to exist by Proposition 2.2).

Theorem 4.6

For any \(t \in (0,T]\), \([0,t] \times \mathbb {R}^d \ni (s,x) \mapsto \phi _{t,s}(x)\) is (coordinate-by-coordinate) the duality solution of (4.2) with terminal value x at time t. For all \(1 \le p < \infty \),

$$\begin{aligned} \phi _{\cdot , s} \in C([s,T], L^p_\textrm{loc}(\mathbb {R}^d)) \quad \text {and} \quad \phi _{t,\cdot } \in C([0,t] , L^p_\textrm{loc}(\mathbb {R}^d)), \end{aligned}$$

and there exists a constant \(C > 0\) such that, for all \(0 \le s \le t \le T\) and \(x \in \mathbb {R}^d\),

$$\begin{aligned} |\phi _{t,s}(x)| \le C(1 + |x|) \quad \text {and} \quad \left\| \phi _{t,s} \right\| _{BV_\textrm{loc}} \le C. \end{aligned}$$

Finally, if \((b^\varepsilon )_{\varepsilon > 0}\) is as in (2.3) and \(\phi ^\varepsilon _{t,s}\) is the corresponding forward flow, then, for all \(1 \le p < \infty \),

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \phi ^\varepsilon _{\cdot ,s} = \phi _{\cdot ,s} \quad \text {strongly in } C([s,T], L^p_\textrm{loc}(\mathbb {R}^d)) \end{aligned}$$

and

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \phi ^\varepsilon _{t,\cdot } = \phi _{t,\cdot } \quad \text {strongly in } C([0,t], L^p_\textrm{loc}(\mathbb {R}^d)), \end{aligned}$$

and the convergence also holds in the weak-\(\star \) sense in \(L^\infty _\textrm{loc}\).

Proof

For \(\varepsilon > 0\) and \((b^\varepsilon )_{\varepsilon > 0}\) as in (2.3), it is standard that, for \(t \in (0,T]\), the vector-valued solution of

$$\begin{aligned} \frac{\partial u^\varepsilon }{\partial s} + b^\varepsilon \cdot \nabla u^\varepsilon = 0 \quad \text {in } (0,t) \times \mathbb {R}^d, \quad u^\varepsilon (t,x) = x \end{aligned}$$

is given by \(u^\varepsilon (s,x) = \phi ^\varepsilon _{t,s}(x)\) for \(s \in [0,t]\), where \(\phi ^\varepsilon \) is the flow corresponding to \(b^\varepsilon \). By Theorem 4.4, we have the given convergence statements, as \(\varepsilon \rightarrow 0\), of \(\phi ^\varepsilon \) to the vector valued duality solution u of (4.2) in \([0,t] \times \mathbb {R}^d\) with terminal value \(u(t,\cdot ) = x\).

The flow property for smooth \(b^\varepsilon \) yields, for \(0 \le s \le t \le T\) and \(x \in \mathbb {R}^d\), \(\phi ^\varepsilon _{s,t}(\phi ^\varepsilon _{t,s}(x)) = x\). By Lemma 2.2 and the above strong \(L^p\)-convergence statement, we may take \(\varepsilon \rightarrow 0\) to obtain \(\phi _{s,t}(u(s,x)) = x\), and then, by Proposition 2.2, we must have \(u(s,x) = \phi _{t,s}(x)\). The other statements now follow immediately in view of Theorem 4.4. Note that we are using that, for \(s \in [0,T)\), the map \([s,T] \times \mathbb {R}^d \ni (t,x) \mapsto \phi _{t,s}(x)\) is the duality solution of the initial value problem

$$\begin{aligned} \frac{\partial {\tilde{u}}}{\partial t} - b(t,x) \cdot \nabla {\tilde{u}} = 0 \quad \text {in } [s,T] \times \mathbb {R}^d, \quad {\tilde{u}}(s,x) = x, \end{aligned}$$

whose theory can be treated exactly as for (4.2). \(\square \)

4.3.2 The Regular Lagrange Property

We now observe that there is a representation formula for the duality solution of the transport equation (4.2).

Theorem 4.7

Let \(1 \le p \le \infty \). Then there exists a constant \(C > 0\) depending only on p and the constant in (2.1) such that, for all \(F \in L^p_\textrm{loc}\cap C\), \(R > 0\), and \(0 \le s \le t \le T\),

$$\begin{aligned} \left\| F \circ \phi _{t,s} \right\| _{L^p(B_R)} \le C \left\| F \right\| _{L^p(B_{R+C})}. \end{aligned}$$
(4.8)

In particular, for any \(A \subset \mathbb {R}^d\) with finite Lebesgue measure,

$$\begin{aligned} \left| \left\{ x : \phi _{t,s}(x) \in A \right\} \right| \le C |A|. \end{aligned}$$
(4.9)

If \(u_T \in L^p_\textrm{loc}(\mathbb {R}^d)\), then the duality solution of (4.2) is given by

$$\begin{aligned} u(t,x) = u_T(\phi _{T,t}(x)). \end{aligned}$$
(4.10)

If \(u_T\) has a version which is continuous almost everywhere, then, for \(t < T\), \(u(t,\cdot )\) also has a version that is continuous almost everywhere.

Remark 4.4

When \(u_T\) is not continuous, then (4.10) must be interpreted as the continuous extension of the operator \(u_T \mapsto u_T \circ \phi _{T,t}\) to \(u_T \in L^p_\textrm{loc}\), which is well-defined in view of the estimate (4.8).

Remark 4.5

The estimate (4.9) is called the regular Lagrange property. It reinforces the fact that \(\phi _{t,s}\) does not concentrate in sets of measure zero.

Remark 4.6

The propagation of almost-everywhere continuity is a consequence of the same property for the forward flow (Proposition 2.2). Note that it is not true in general that a function \(u \in BV_\textrm{loc}(\mathbb {R}^d)\) is continuous almost everywhere, unless \(d = 1\).
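As an illustration of the representation formula and of Remark 4.6, consider again \(d = 1\) and \(b(t,x) = {\text {sgn}}\,x\). The forward flow is \(\phi _{T,t}(x) = {\text {sgn}}(x)(|x| + T - t)\) for \(x \ne 0\), and therefore (4.10) gives

$$\begin{aligned} u(t,x) = u_T\left( {\text {sgn}}(x)(|x| + T - t) \right) , \qquad x \ne 0, \end{aligned}$$

which, even for smooth \(u_T\), is in general discontinuous across \(x = 0\) for \(t < T\), but is continuous almost everywhere and belongs to \(BV_\textrm{loc}\), consistent with Theorem 4.6.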

Proof of Theorem 4.7

For continuous \(u_T\), the representation formula is an immediate consequence of the renormalization property (Theorem 4.5) and of Theorem 4.6. The estimate (4.8) then follows from Theorem 4.4, and (4.9) is obtained by taking \(p = 1\) and \(F = \textbf{1}_A\).

For the claim about almost everywhere continuity, define

$$\begin{aligned} A {:}{=} \left\{ y \in \mathbb {R}^d: u_T \text { is not continuous at }y\right\} . \end{aligned}$$

Then \(|A| = 0\), and then (4.9) gives, for \(0 \le t < T\),

$$\begin{aligned} \left| \left\{ x \in \mathbb {R}^d: u_T \text { is not continuous at } \phi _{T,t}(x) \right\} \right| = 0. \end{aligned}$$

It follows that \(u_T\) is continuous at \(\phi _{T,t}(x)\) for a.e. x. By Proposition 2.2, \(\phi _{T,t}\) is continuous almost everywhere, and the result follows. \(\square \)

Recalling the duality relationship between (4.1) and (4.2) from Theorem 4.4, we then have

Corollary 4.2

For any \(1 \le p \le \infty \) and \(f_0 \in L^p_\textrm{loc}(\mathbb {R}^d)\), the BJM reversible solution f of (4.1) is given at time \(t > 0\) by \(\phi _{t,0}^\# f_0\).

Remark 4.7

The regular Lagrange property says that the measure \(\phi _{t,0}^\# f_0\) is well-defined and absolutely continuous with respect to Lebesgue measure, with a density in \(L^p_\textrm{loc}\). If \(f_0\) is the density for a probability measure, that is, \(f_0 \in L^1_+(\mathbb {R}^d)\) and \(\int f_0 = 1\), then \(f(t,\cdot )\) is the law at time t of the stochastic process \(\phi _{t,0}(X)\), where X is a random variable with density \(f_0\).

A consequence of renormalization and the regular Lagrange property is the fact that the forward flow \(\phi _{t,s}\) solves the ODE (2.2) for a.e. initial \(x \in \mathbb {R}^d\). A first step is the following lemma.

Lemma 4.2

For all \(p \in [1,\infty )\) and \(s \in [0,T)\), \(\{(t,x) \mapsto b(t,\phi _{t,s}(x))\} \in L^1([s,T], L^p_\textrm{loc}(\mathbb {R}^d))\). If \((b^\varepsilon )_{\varepsilon > 0}\) is as in (2.3) and \((\phi ^\varepsilon )_{\varepsilon > 0}\) is the corresponding flow, then, for all \(R > 0\),

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0}\int _s^T \left\| b^\varepsilon (t, \phi ^\varepsilon _{t,s}) - b(t,\phi _{t,s}) \right\| _{L^p(B_R)} dt= 0. \end{aligned}$$

Proof

The first claim follows from (4.8): there exists \(C > 0\) independent of s and R such that, for all \(t \in [0,T]\), \( \left\| b(t,\phi _{t,s}) \right\| _{L^p(B_R)} \le C \left\| b(t,\cdot ) \right\| _{L^p(B_{R+C})}\).

For \(\delta > 0\) and \(0 \le s \le t \le T\), we write

$$\begin{aligned}&\left\| b^\varepsilon (t, \phi ^\varepsilon _{t,s}) - b(t,\phi _{t,s}) \right\| _{L^p(B_R)} \le \left\| b^\varepsilon (t, \phi ^\varepsilon _{t,s}) - b^\delta (t,\phi ^\varepsilon _{t,s}) \right\| _{L^p(B_R)}\\&\quad + \left\| b^\delta (t, \phi ^\varepsilon _{t,s}) - b^\delta (t,\phi _{t,s}) \right\| _{L^p(B_R)} + \left\| b^\delta (t, \phi _{t,s}) - b(t,\phi _{t,s}) \right\| _{L^p(B_R)}. \end{aligned}$$

By (4.8), for some \(C > 0\) independent of \(\delta \), \(\varepsilon \), s, and t,

$$\begin{aligned} \left\| b^\varepsilon (t, \phi ^\varepsilon _{t,s}) - b^\delta (t,\phi ^\varepsilon _{t,s}) \right\| _{L^p(B_R)} \le C \left\| b^\varepsilon (t,\cdot ) - b^\delta (t,\cdot ) \right\| _{L^p(B_{R+C})} \end{aligned}$$

and

$$\begin{aligned} \left\| b^\delta (t, \phi _{t,s}) - b(t,\phi _{t,s}) \right\| _{L^p(B_R)} \le C \left\| b^\delta (t,\cdot ) - b(t,\cdot ) \right\| _{L^p(B_{R+C})}. \end{aligned}$$

The smoothness of \(b^\delta \) implies that, for all \(t \in [s,T]\), as \(\varepsilon \rightarrow 0\), \(b^\delta (t,\phi ^\varepsilon _{t,s})\) converges a.e. to \(b^\delta (t,\phi _{t,s})\). Sending \(\varepsilon \rightarrow 0\) and using dominated convergence, we thus have

$$\begin{aligned} & \limsup _{\varepsilon \rightarrow 0} \int _s^T \left\| b^\varepsilon (t, \phi ^\varepsilon _{t,s}) - b(t,\phi _{t,s}) \right\| _{L^p(B_R)}dt \\ & \qquad \quad \le C \int _s^T \left\| b^\delta (t,\cdot ) - b(t,\cdot ) \right\| _{L^p(B_{R+C})}dt. \end{aligned}$$

The proof of the claim is finished upon sending \(\delta \rightarrow 0\) and again using dominated convergence. \(\square \)

Theorem 4.8

Fix \(1 \le p < \infty \) and \(s \in [0,T)\). Then

$$\begin{aligned} \left\{ (t,x) \mapsto \phi _{t,s}(x) \right\} \in L^p_\textrm{loc}(\mathbb {R}^d, W^{1,1}([s,T]) ), \end{aligned}$$

and, for a.e. \(x \in \mathbb {R}^d\), \([s,T] \ni t \mapsto \phi _{t,s}(x)\) is an absolutely continuous solution of

$$\begin{aligned} \phi _{t,s}(x) = x + \int _s^t b(r, \phi _{r,s}(x))dr. \end{aligned}$$

If \((b^\varepsilon )_{\varepsilon > 0}\) satisfy (2.3) and \(\phi ^\varepsilon \) is the corresponding flow, then, for all \(R > 0\),

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \left\| \phi ^\varepsilon _{\cdot ,s} - \phi _{\cdot ,s} \right\| _{L^p(B_R, W^{1,1}([s,T]))} = 0. \end{aligned}$$

For all \(0 \le r \le s \le t \le T\), \(\phi _{t,r} = \phi _{t,s} \circ \phi _{s,r}\) a.e.

Remark 4.8

The fact that \(\partial _t \phi _{t,\cdot } \in L^1\) is due to the fact that we are assuming the weakest possible integrability of b in the time variable. If \(b \in L^q\) for some \(q >1\), then the forward flow belongs to \(W^{1,p}\) for any \(p \le q\).

Remark 4.9

The composition \(\phi _{t,s} \circ \phi _{s,r}\) makes sense in view of (4.8) and the fact that the forward flow takes values in \(L^p_\textrm{loc}(\mathbb {R}^d)\).

Proof of Theorem 4.8

For \(\varepsilon > 0\), we have \(\partial _t \phi ^\varepsilon _{t,s}(x) = b^\varepsilon (t, \phi ^\varepsilon _{t,s}(x))\). By Lemma 4.2, sending \(\varepsilon \rightarrow 0\), we see that \(\partial _t \phi _{t,s}(x) = b(t,\phi _{t,s}(x))\) in the distributional sense, and therefore, for all \(R > 0\),

$$\begin{aligned} \left\| \int _s^T |\partial _t \phi _{t,s}|dt \right\| _{L^p(B_R)} \le \int _s^T \left\| \partial _t \phi _{t,s} \right\| _{L^p(B_R)}dt < \infty . \end{aligned}$$

The convergence claim and the solvability of the ODE follow immediately in view of the fact that \(\phi ^\varepsilon _{s,s}(x) = \phi _{s,s}(x) = x\) for all \(\varepsilon > 0\) and \(x \in \mathbb {R}^d\).

To prove the last claim, we note that the equality \(\phi _{r,t} \circ \phi _{t,r} = {\text {Id}}\) holds as functions in \(L^p_\textrm{loc}\), and, in view of the flow property of the backward flow,

$$\begin{aligned} \phi _{r,t} \circ ( \phi _{t,s} \circ \phi _{s,r}) = \phi _{r,s} \circ \phi _{s,t} \circ \phi _{t,s} \circ \phi _{s,r} = \phi _{r,s} \circ \phi _{s,r} = {\text {Id}}. \end{aligned}$$

It follows from Proposition 2.2 that \(\phi _{t,r} = \phi _{t,s} \circ \phi _{s,r}\) a.e., as desired. \(\square \)
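The discrete analogue of the flow property \(\phi _{t,r} = \phi _{t,s} \circ \phi _{s,r}\) can be checked directly: for the explicit Euler scheme with aligned time grids, composing the discrete flows over \([r,s]\) and \([s,t]\) reproduces the discrete flow over \([r,t]\) exactly. A minimal sketch, with a hypothetical monotone (hence one-sided Lipschitz with \(C = 0\)) velocity field that is not Lipschitz at the origin:

```python
import math

def b(t, x):
    # hypothetical velocity field: increasing in x, hence one-sided Lipschitz
    # with C = 0, but not Lipschitz at the origin
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def euler_flow(x, s, t, n):
    # explicit Euler approximation of phi_{t,s}(x) with n uniform steps
    h = (t - s) / n
    for k in range(n):
        x = x + h * b(s + k * h, x)
    return x

# composing the Euler flows over [0, 0.5] and [0.5, 1] on aligned grids
# reproduces the Euler flow over [0, 1] step by step
direct = euler_flow(2.0, 0.0, 1.0, 1000)
composed = euler_flow(euler_flow(2.0, 0.0, 0.5, 500), 0.5, 1.0, 500)
assert direct == composed
```

Since the intermediate time lies on the grid and the scheme is deterministic, the composed and direct trajectories pass through the same states, so the equality is exact; for non-aligned grids it would hold only up to the discretization error.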

We recall that Proposition 2.2 implies that any right-inverse of the backward flow is determined uniquely almost everywhere. We remark here that this property actually follows from the duality between the transport and continuity equations.

Theorem 4.9

Assume \(\psi \in C([0,t], L^p_\textrm{loc}(\mathbb {R}^d))\) satisfies \(\phi _{s,t}(\psi _s(x)) = x\) for all \(s \in [0,t]\), for a.e. \(x \in \mathbb {R}^d\). Then \(\psi = \phi _{t,\cdot }\).

Proof

It suffices to show that \(u(s,x) = \psi _s(x)\) is the unique (vector-valued) duality solution of (4.2) with terminal data equal to x at time t.

Fix \(s \in [0,t]\) and \(g \in C_c(\mathbb {R}^d)\). For a.e. \(x \in \mathbb {R}^d\), if \(y = \psi _s(x)\), we have \(\phi _{s,t}(y) = x\) by assumption. Therefore, the change of variables formula yields

$$\begin{aligned} \int _{\mathbb {R}^d} g(x) \psi _s(x)dx = \int _{\mathbb {R}^d} g(\phi _{s,t}(y)) y J_{s,t}(y)dy = \int _{\mathbb {R}^d} f(t,y) y dy, \end{aligned}$$

where f is the BJM solution of the forward continuity equation with initial condition g at time s. Since g and s were arbitrary, this is exactly the duality identity characterizing \(u(s,\cdot ) = \psi _s\). \(\square \)

Remark 4.10

A corresponding result characterizing \(\phi _{\cdot , s}\) on \([s,T]\) follows in exactly the same way, by considering the duality between the IVP and TVP for, respectively, an appropriate transport and continuity equation.

Remark 4.11

The uniqueness result above demonstrates that the right-inverse property characterizes the forward flow. In other words, it alone implies that \(\phi _{t,s}\) solves the ODE, that it solves the transport PDE in the duality sense, and that it has the regularity properties laid out in Theorems 4.7 and 4.8.

4.4 Characterizations

We now present alternative ways to characterize the solutions of the forward continuity and backward transport equations identified above. Although the PDE (4.2) does not make sense as a distribution, we nevertheless can characterize solutions in a PDE sense through the use of \(\sup \)- and \(\inf \)-convolutions. The propagation of almost-everywhere continuity proved in Theorem 4.7 is a crucial ingredient.

By using this characterization in duality with the conservative equation, we then show that nonnegative distributional solutions of (4.1) are unique, and therefore equal to the solution identified by the formula (4.3). As a consequence, we finally conclude with the uniqueness of regular Lagrangian flows, forward in time, of the ODE (2.2).

4.4.1 The Nonconservative Equation: \(\sup \) and \(\inf \) Convolutions

We now identify those regularizations that will lead to a PDE characterization for solutions of the equation (4.2). Given \(\delta > 0\) and \(u \in L^\infty ( \mathbb {R}^d)\), we define the \(\sup \)- and \(\inf \)-convolutions

$$\begin{aligned} u^\delta (x) {:}{=} \mathop {\mathrm {ess\,sup}}\limits _{y \in \mathbb {R}^d} \left\{ u(y) - \frac{1}{2\delta }|x-y|^2 \right\} \end{aligned}$$

and

$$\begin{aligned} u_\delta (x) {:}{=} \mathop {\mathrm {ess\,inf}}\limits _{y \in \mathbb {R}^d} \left\{ u(y) + \frac{1}{2\delta }|x-y|^2 \right\} . \end{aligned}$$

These regularizations are common in the theory of viscosity solutions, or generally for equations satisfying a maximum principle in spaces of continuous functions. The supremum and infimum must be essential, because u is only defined almost everywhere.
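On a grid, with the maximum over grid points standing in for the essential supremum, these regularizations are easy to compute. The following sketch (an illustrative discretization; the step function is an assumption made for the example) checks the ordering \(u_\delta \le u \le u^\delta \), the monotonicity in \(\delta \), and a Lipschitz bound of order \((\delta ^{-1}\mathop {\mathrm {osc}} u)^{1/2}\), as in Lemma 4.3 below:

```python
import math

# grid and a discontinuous bounded function (a step function in L^infty)
N, L = 401, 2.0
xs = [-L + 4.0 * i / (N - 1) for i in range(N)]
u = [0.0 if x < 0.0 else 1.0 for x in xs]

def sup_conv(u, xs, delta):
    # u^delta(x) = max_y { u(y) - |x - y|^2 / (2 delta) }
    return [max(u[j] - (x - xs[j]) ** 2 / (2 * delta) for j in range(len(xs)))
            for x in xs]

def inf_conv(u, xs, delta):
    # u_delta(x) = min_y { u(y) + |x - y|^2 / (2 delta) }
    return [min(u[j] + (x - xs[j]) ** 2 / (2 * delta) for j in range(len(xs)))
            for x in xs]

delta = 0.1
ud, u_d = sup_conv(u, xs, delta), inf_conv(u, xs, delta)

# ordering u_delta <= u <= u^delta, and monotonicity in delta
assert all(a <= b <= c for a, b, c in zip(u_d, u, ud))
assert all(a <= b for a, b in zip(sup_conv(u, xs, delta / 2), ud))

# the sup-convolution is Lipschitz with constant of order (osc(u)/delta)^{1/2}
h = xs[1] - xs[0]
lip = max(abs(ud[i + 1] - ud[i]) for i in range(N - 1)) / h
assert lip <= math.sqrt(2.0 / delta) + 1e-9
```

Here the sup-convolution of the unit step is \(\max (0, 1 - x^2/(2\delta ))\) for \(x < 0\), so the largest slope, attained where the parabola meets zero, is of size \((2/\delta )^{1/2}\), which the last assertion reflects.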

Lemma 4.3

Assume that \(u \in L^\infty (\mathbb {R}^d)\) is continuous almost everywhere. Then, for all \(\delta > 0\), \(u_\delta , u^\delta \) are globally Lipschitz with constant

$$\begin{aligned} \left( 2(\mathop {\mathrm {ess\,sup}}\limits u - \mathop {\mathrm {ess\,inf}}\limits u) \right) ^{1/2} \delta ^{-1/2}, \end{aligned}$$

and

$$\begin{aligned} u_\delta \le u \le u^\delta \quad \text {a.e.} \end{aligned}$$

As \(\delta \rightarrow 0\), \(u^\delta \) decreases to u and \(u_\delta \) increases to u a.e. Finally, the \(\mathop {\mathrm {ess\,sup}}\limits \) and \(\mathop {\mathrm {ess\,inf}}\limits \) in the definitions of \(u^\delta \) and \(u_\delta \) can be restricted to respectively \(y \in B_{R^\delta (x)}(x)\) and \(B_{R_\delta (x)}(x)\), where

$$\begin{aligned} R^\delta (x) = 2(u^{2\delta }(x) - u^\delta (x))^{1/2} \delta ^{1/2} \end{aligned}$$

and

$$\begin{aligned} R_\delta (x) = 2(u_\delta (x) - u_{2\delta }(x))^{1/2} \delta ^{1/2}. \end{aligned}$$

Proof

Fix \(x \in \mathbb {R}^d\) and \(r > 0\). By the definition of \(u^\delta \), we have

$$\begin{aligned} u^\delta (x) \ge \mathop {\mathrm {ess\,sup}}\limits _{y \in B_r(x)} u(y) - \frac{r^2}{2\delta }. \end{aligned}$$

Sending \(r \rightarrow 0\), we see that \(u^\delta (x) \ge u(x)\) whenever u is continuous at x, and therefore \(u^\delta \ge u\) a.e. Similarly, \(u_\delta \le u\) a.e.

We now observe that, if \(R > \left( 2(\mathop {\mathrm {ess\,sup}}\limits u - \mathop {\mathrm {ess\,inf}}\limits u)\right) ^{1/2}\), then, for a.e. \(y \notin B_{R\delta ^{1/2}}(x)\),

$$\begin{aligned} u(y) - \frac{|x-y|^2}{2\delta } \le \mathop {\mathrm {ess\,sup}}\limits u - \frac{R^2}{2} < \mathop {\mathrm {ess\,inf}}\limits u \le u^\delta (x). \end{aligned}$$

By also using a similar argument for \(u_\delta \), we see that

$$\begin{aligned} u^\delta (x) = \mathop {\mathrm {ess\,sup}}\limits _{|y-x| \le R\delta ^{1/2}} \left\{ u(y) - \frac{1}{2\delta }|x-y|^2 \right\} \end{aligned}$$

and

$$\begin{aligned} u_\delta (x) = \mathop {\mathrm {ess\,inf}}\limits _{|y-x| \le R\delta ^{1/2}} \left\{ u(y) + \frac{1}{2\delta }|x-y|^2 \right\} . \end{aligned}$$

It is then straightforward to see that, as \(\delta \) decreases to 0, \(u^\delta \) and \(u_\delta \) are respectively decreasing and increasing pointwise, and that they converge to u(x) whenever u is continuous at x (and thus a.e.).

For fixed \(x \in \mathbb {R}^d\), \(\delta > 0\), and \(\eta > 0\), define

$$\begin{aligned} A_{\delta ,\eta }(x) {:}{=} \left\{ y \in B_{R \delta ^{1/2}}(x) : u(y) - \frac{|x-y|^2}{2\delta } > u^\delta (x) - \eta \right\} . \end{aligned}$$

Then, by the definition of the essential supremum, \(A_{\delta ,\eta }(x)\) has positive Lebesgue measure; in particular, it is nonempty. Therefore, for any \(x' \in \mathbb {R}^d\) and \(y \in A_{\delta ,\eta }(x)\), we have

$$\begin{aligned} u^\delta (x) - u^\delta (x') \le \frac{|x'-y|^2}{2\delta } - \frac{|x-y|^2}{2\delta } + \eta \le \frac{R}{\delta ^{1/2}}|x'-x| + \frac{|x'-x|^2}{\delta } + \eta . \end{aligned}$$

Switching the roles of x and \(x'\) and using the fact that \(\eta \) was arbitrary, we see that, for all \(x \in \mathbb {R}^d\),

$$\begin{aligned} \limsup _{x' \rightarrow x} \frac{|u^\delta (x') - u^\delta (x)|}{|x'-x|} \le \frac{R}{\delta ^{1/2}}. \end{aligned}$$

We may then let R decrease down to \(\left( 2(\mathop {\mathrm {ess\,sup}}\limits u - \mathop {\mathrm {ess\,inf}}\limits u)\right) ^{1/2}\), and the same proof applies to \(u_\delta \).

For any \(\eta > 0\) and a.e. \(y \in A_{\delta ,\eta }(x)\),

$$\begin{aligned} u^{2\delta }(x) \ge u(y) - \frac{|x-y|^2}{4\delta } > u^\delta (x) + \frac{|x-y|^2}{4\delta } - \eta , \end{aligned}$$

and so

$$\begin{aligned} |y-x| \le 2( u^{2\delta }(x) - u^\delta (x) + \eta )^{1/2} \delta ^{1/2}. \end{aligned}$$

Therefore, for a.e. y such that \(|y-x| > R^\delta (x)\), we must have \(u(y) - \frac{|x-y|^2}{2\delta } < u^\delta (x)\), and the statement about restricting the \(\mathop {\mathrm {ess\,sup}}\limits \) follows. The corresponding result for \(u_\delta \) is proved in the same way. \(\square \)

A formal calculation using the one-sided Lipschitz condition on b suggests that, if u solves (4.2), then the \(\sup \)- and \(\inf \)-convolutions of u in the spatial variable are approximate sub- and supersolutions of (4.2). The following result not only establishes this property rigorously, but also proves that it in fact characterizes the unique duality solution of (4.2). The result is proved by using the duality property in relation to a nonnegative distributional solution of (4.1), and we use exactly the same methods to prove the uniqueness of nonnegative distributional solutions in Theorem 4.11 below.

Theorem 4.10

Assume \(u \in C([0,T], L^1_\textrm{loc}(\mathbb {R}^d)) \cap L^\infty ([0,T] \times \mathbb {R}^d)\) is continuous almost everywhere and \(u(T,\cdot ) = u_T \in L^\infty (\mathbb {R}^d)\). Then u is the duality solution of (4.2) if and only if there exist \(r^\delta , r_\delta \in L^1_\textrm{loc}([0,T] \times \mathbb {R}^d)\) such that \(\lim _{\delta \rightarrow 0} r^\delta = \lim _{\delta \rightarrow 0} r_\delta = 0\) in \(L^1_\textrm{loc}\), and the \(\sup \)- and \(\inf \)-convolutions

$$\begin{aligned} u^\delta (t,x) {:}{=} \mathop {\mathrm {ess\,sup}}\limits _{y \in \mathbb {R}^d} \left\{ u(t,y) - \frac{1}{2\delta }|x-y|^2 \right\} \end{aligned}$$

and

$$\begin{aligned} u_\delta (t,x) {:}{=} \mathop {\mathrm {ess\,inf}}\limits _{y \in \mathbb {R}^d} \left\{ u(t,y) + \frac{1}{2\delta }|x-y|^2 \right\} \end{aligned}$$

satisfy in the sense of distributions on \([0,T] \times \mathbb {R}^d\) the inequalities

$$\begin{aligned} \frac{\partial u^\delta }{\partial t} + b(t,x) \cdot \nabla u^\delta \le r^\delta (t,x) \quad \text {and} \quad \frac{\partial u_\delta }{\partial t} + b(t,x) \cdot \nabla u_\delta \ge -r_\delta (t,x). \end{aligned}$$

Proof

Assume first that the \(\sup \)- and \(\inf \)-convolutions have the stated properties. For standard mollifiers \((\rho _\eta )_{\eta > 0}\) on \(\mathbb {R}\), define \(u^\delta _\eta (t,x) = (u^\delta (\cdot , x) *_t \rho _\eta )(t)\) and \(u_{\delta ,\eta }(t,x) = (u_\delta (\cdot , x) *_t \rho _\eta )(t)\). Then, by Lemma 4.3, \(u^\delta _\eta \) and \(u_{\delta ,\eta }\) are Lipschitz continuous on \([0,T] \times \mathbb {R}^d\), and satisfy a.e. in \([0,T] \times \mathbb {R}^d\)

$$\begin{aligned} \frac{\partial u^\delta _\eta }{\partial t} + b(t,x) \cdot \nabla u^\delta _\eta \le r^\delta _\eta (t,x) \quad \text {and} \quad \frac{\partial u_{\delta ,\eta }}{\partial t} + b(t,x) \cdot \nabla u_{\delta ,\eta } \ge -r_{\delta ,\eta }(t,x), \end{aligned}$$

where

$$\begin{aligned} r^\delta _\eta (t,x) = (r^\delta (\cdot ,x) *_t \rho _\eta )(t) + \int _\mathbb {R}(b(t,x) - b(s,x)) \cdot \nabla u^\delta (s,x) \rho _\eta (s-t)ds \end{aligned}$$

and

$$\begin{aligned} r_{\delta ,\eta }(t,x) = (r_\delta (\cdot ,x) *_t \rho _\eta )(t) + \int _\mathbb {R}(b(t,x) - b(s,x)) \cdot \nabla u_\delta (s,x) \rho _\eta (s-t)ds. \end{aligned}$$

The (local) boundedness of b, \(\nabla u_\delta \), and \(\nabla u^\delta \) then allows us to invoke the dominated convergence theorem to say that, for fixed \(\delta \), \(\lim _{\eta \rightarrow 0} r^\delta _\eta = r^\delta \) and \(\lim _{\eta \rightarrow 0} r_{\delta ,\eta } = r_\delta \) in \(L^1_\textrm{loc}\).

Now let \(f_0 \in C_c(\mathbb {R}^d)\) be nonnegative and let f be the BJM solution of (4.1). In view of the nonnegativity of J, f given by (4.3) is nonnegative on \([0,T] \times \mathbb {R}^d\), and the bounds for the backward flow in Lemma 2.2 imply that f has compact support in \([0,T] \times \mathbb {R}^d\). By Theorem 4.1, f is a distributional solution, and therefore

$$\begin{aligned}&\int _{\mathbb {R}^d} f(T,x) u^\delta _\eta (T,x)dx - \int _{\mathbb {R}^d} f_0(x) u^\delta _\eta (0,x)dx\\&\quad = \int _0^T \int _{\mathbb {R}^d}f(t,x) \left[ \partial _t u^\delta _\eta (t,x) + b(t,x) \cdot \nabla u^\delta _\eta (t,x) \right] dxdt \\&\quad \le \int _0^T \int _{\mathbb {R}^d} f(t,x) r^\delta _\eta (t,x)dxdt. \end{aligned}$$

Sending first \(\eta \rightarrow 0\) and then \(\delta \rightarrow 0\), using Lemma 4.3 and the dominated convergence theorem, we conclude that

$$\begin{aligned} \int _{\mathbb {R}^d} f(T,x) u_T(x)dx \le \int _{\mathbb {R}^d} f_0(x) u(0,x)dx. \end{aligned}$$

Arguing similarly with \(u_{\delta ,\eta }\) as a test function, we achieve the opposite inequality. By linearity, the duality identity holds for any \(f_0 \in L^\infty \) with bounded support, and we conclude that u is the unique duality solution.

Assume now conversely that u is the duality solution. Let \((b^\varepsilon )_{\varepsilon > 0}\) be as in (2.3), let \(u^\varepsilon \) be the corresponding solution, and define

$$\begin{aligned} u^{\varepsilon ,\delta }(t,x) {:}{=} \sup _{y \in \mathbb {R}^d} \left\{ u^\varepsilon (t,y) - \frac{1}{2\delta }|x-y|^2 \right\} \end{aligned}$$

and

$$\begin{aligned} u^\varepsilon _\delta (t,x) {:}{=} \inf _{y \in \mathbb {R}^d} \left\{ u^\varepsilon (t,y) + \frac{1}{2\delta }|x-y|^2 \right\} . \end{aligned}$$

By Lemma 4.3, for fixed \(\delta > 0\), \(u^{\varepsilon ,\delta }\) and \(u^\varepsilon _\delta \) are Lipschitz continuous in the space variable, uniformly over \([0,T] \times \mathbb {R}^d\) and \(\varepsilon > 0\). Moreover, the \(\sup \) and \(\inf \) are actually a \(\max \) and \(\min \), and may be restricted to

$$\begin{aligned} |y-x| \le \left( 2(\max u_T - \min u_T) \right) ^{1/2}\delta ^{1/2} \end{aligned}$$

(note that we have used the maximum principle for the transport equation to control the maximum and minimum of \(u^\varepsilon \) by those of \(u_T\)). We may alternatively restrict the y for which the maximum in the definition of \(u^{\varepsilon ,\delta }(t,x)\) is attained to satisfy

$$\begin{aligned} |y-x| \le 2(u^{\varepsilon ,2\delta }(t,x) - u^{\varepsilon ,\delta }(t,x))^{1/2} \delta ^{1/2}, \end{aligned}$$
(4.11)

and the minimum in the definition of \(u^\varepsilon _\delta \) is attained by y satisfying

$$\begin{aligned} |y-x| \le 2 (u^\varepsilon _\delta (t,x) - u^\varepsilon _{2\delta }(t,x))^{1/2} \delta ^{1/2}. \end{aligned}$$
(4.12)

Standard properties of envelopes then give the identities, for any \((t,x) \in [0,T] \times \mathbb {R}^d\),

$$\begin{aligned} \frac{\partial u^{\varepsilon ,\delta }}{\partial t}(t,x) = \frac{\partial u^\varepsilon }{\partial t}(t,y) \quad \text {and} \quad \nabla u^{\varepsilon ,\delta }(t,x) = \nabla u^\varepsilon (t,y) = \frac{y-x}{\delta } \end{aligned}$$

for some y satisfying (4.11). Therefore

$$\begin{aligned} \partial _t u^{\varepsilon ,\delta }(t,x) = - b^\varepsilon (t,y) \cdot \nabla u^{\varepsilon ,\delta }(t,x), \end{aligned}$$

from which we deduce that \(u^{\varepsilon ,\delta }\) is uniformly Lipschitz continuous in the time variable over \([0,T] \times B_R\) for any \(R > 0\), independently of \(\varepsilon \). Further developing the equality gives

$$\begin{aligned} \frac{\partial u^{\varepsilon ,\delta }}{\partial t}(t,x) + b^\varepsilon (t,x) \cdot \nabla u^{\varepsilon ,\delta }(t,x)&= \frac{\partial u^{\varepsilon }}{\partial t}(t,y) + b^\varepsilon (t,x) \cdot \nabla u^{\varepsilon }(t,y)\nonumber \\&= -(b^\varepsilon (t,x) - b^\varepsilon (t,y)) \cdot \frac{x-y}{\delta } \nonumber \\&\le C_0(t) \frac{|x-y|^2}{\delta } \nonumber \\&\le 4C_0(t)(u^{\varepsilon ,2\delta }(t,x) - u^{\varepsilon ,\delta }(t,x)). \end{aligned}$$
(4.13)

We similarly have that \(u^\varepsilon _\delta \) is Lipschitz continuous in the time variable, locally in space, uniformly over \(\varepsilon > 0\), and

$$\begin{aligned} \frac{\partial u^{\varepsilon }_\delta }{\partial t}(t,x) + b^\varepsilon (t,x) \cdot \nabla u^{\varepsilon }_\delta (t,x) \ge -4C_0(t) ( u^\varepsilon _\delta (t,x) - u^\varepsilon _{2\delta }(t,x)). \end{aligned}$$
(4.14)

We now claim that, as \(\varepsilon \rightarrow 0\), \(u^{\varepsilon ,\delta }\) and \(u^\varepsilon _\delta \) converge pointwise to respectively \(u^\delta \) and \(u_\delta \); by the uniform-in-\(\varepsilon \) Lipschitz regularity, the convergence is then locally uniform. To see this, fix \(x \in \mathbb {R}^d\) and \(\eta > 0\), and let \(A \subset \mathbb {R}^d\) be a set of positive measure such that, for all \(y \in A\),

$$\begin{aligned} u^\delta (t,x) \le u(t,y) - \frac{|x-y|^2}{2\delta } + \eta . \end{aligned}$$

We then have, for all \(y \in A\),

$$\begin{aligned} u^{\varepsilon ,\delta }(t,x) \ge u^\varepsilon (t,y) - \frac{|x-y|^2}{2\delta }. \end{aligned}$$

For at least one such y (recall that A has positive measure and \(u^\varepsilon (t,\cdot ) \rightarrow u(t,\cdot )\) a.e.), we then have \(u^\varepsilon (t,y) \xrightarrow {\varepsilon \rightarrow 0} u(t,y)\), and we thus have

$$\begin{aligned} \limsup _{\varepsilon \rightarrow 0} \left( u^\delta (t,x) - u^{\varepsilon ,\delta }(t,x) \right) \le \eta . \end{aligned}$$

Since \(\eta \) was arbitrary, it follows that \(\limsup _{\varepsilon \rightarrow 0} \left( u^\delta (t,x) - u^{\varepsilon ,\delta }(t,x) \right) \le 0\).

Now, there exists a full measure set \(B \subset \mathbb {R}^d\) such that, for all \(y \in B\),

$$\begin{aligned} u^\delta (t,x) \ge u(t,y) - \frac{|x-y|^2}{2\delta } \quad \text {and} \quad \lim _{\varepsilon \rightarrow 0} u^\varepsilon (t,y) = u(t,y). \end{aligned}$$

In view of the continuity of \(u^\varepsilon (t,\cdot )\), there exists a bounded (independently of \(\varepsilon \)) sequence \((y_n)_{n \in \mathbb {N}} \subset B\) such that

$$\begin{aligned} \rho _n {:}{=} u^{\varepsilon ,\delta }(t,x) - \left\{ u^\varepsilon (t,y_n) - \frac{|x-y_n|^2}{2\delta } \right\} \end{aligned}$$

satisfies \(\lim _{n\rightarrow \infty } \rho _n = 0\). Therefore, for all n,

$$\begin{aligned} u^{\varepsilon ,\delta }(t,x) - u^\delta (t,x) \le u^\varepsilon (t,y_n) - u(t,y_n) + \rho _n. \end{aligned}$$

Sending \(\varepsilon \rightarrow 0\) gives \(\limsup _{\varepsilon \rightarrow 0} (u^{\varepsilon ,\delta }(t,x) - u^\delta (t,x)) \le \rho _n\), and the proof of pointwise convergence is finished upon sending \(n \rightarrow \infty \). The exact same argument can be used for the pointwise convergence of \(u^\varepsilon _\delta \) to \(u_\delta \).

It then follows that, for fixed \(\delta \), as \(\varepsilon \rightarrow 0\), \(\nabla u^{\varepsilon ,\delta }\) and \(\nabla u^{\varepsilon }_\delta \) converge weak-\(\star \) in \(L^\infty \) to \(\nabla u^\delta \) and \(\nabla u_\delta \) respectively, while \(b^\varepsilon \) converges in \(L^1_\textrm{loc}\) to b. We may then take \(\varepsilon \rightarrow 0\) in (4.13) and (4.14) to obtain the distributional inequalities

$$\begin{aligned} \frac{\partial u^{\delta }}{\partial t}(t,x) + b(t,x) \cdot \nabla u^{\delta }(t,x) \le 4C_0(t)(u^{2\delta }(t,x) - u^{\delta }(t,x)) {=}{:} r^\delta (t,x) \end{aligned}$$

and

$$\begin{aligned} \frac{\partial u_\delta }{\partial t}(t,x) + b(t,x) \cdot \nabla u_\delta (t,x) \ge -4C_0(t) ( u_\delta (t,x) - u_{2\delta }(t,x)) {=}{:} -r_\delta (t,x). \end{aligned}$$

By Lemma 4.3 and the almost-everywhere continuity of u, the right-hand sides of both inequalities converge a.e. to 0 as \(\delta \rightarrow 0\), and, by the uniform boundedness in \(\delta \) of \(u^\delta \) and \(u_\delta \) and the dominated convergence theorem, \(r^\delta \) and \(r_\delta \) both converge in \(L^1_\textrm{loc}\) to 0 as \(\delta \rightarrow 0\). \(\square \)

4.4.2 The Conservative Equation: Uniqueness of Nonnegative Solutions

We observe that, in the first implication in the proof of Theorem 4.10, it was proved that u was a duality solution by proving the duality identity relative to a “good” nonnegative solution, i.e. the reversible BJM solution we have been working with above. However, it was only explicitly used that f was a distributional solution. Therefore, after having proved the equivalence in Theorem 4.10, we arrive at the following:

Theorem 4.11

Suppose that \(f \in C([0,T], L^p_\textrm{loc}(\mathbb {R}^d))\) is a distributional solution of (4.1) and \(f \ge 0\). Then \(f(t,x) = f(0,\phi _{0,t}(x))J_{0,t}(x)\).

Proof

Fix \(t > 0\) and \(v \in C_c(\mathbb {R}^d)\), and let \(u \in C([0,t], L^1_\textrm{loc}(\mathbb {R}^d)) \cap L^\infty ([0,t] \times \mathbb {R}^d)\) be the duality solution of (4.2) with terminal data v at time t. Then, by Theorem 4.7, u is continuous almost everywhere in \([0,t] \times \mathbb {R}^d\). Arguing exactly as in the first part of the proof of Theorem 4.10, using the nonnegativity of f, we arrive at the equality

$$\begin{aligned} \int _{\mathbb {R}^d} f(t,x) v(x)dx = \int _{\mathbb {R}^d} f(0,x)u(0,x)dx. \end{aligned}$$

Since v was arbitrary, it follows from the definition of duality solutions that \(f(t,\cdot )\) must be given by (4.3). \(\square \)

We then have the following corollary, which characterizes the BJM solution even when f is signed:

Corollary 4.3

A function \(f \in C([0,T], L^p_\textrm{loc}(\mathbb {R}^d))\) is the unique reversible solution of (4.1) in the sense of [25] if and only if f and |f| are both solutions in the sense of distributions.

Proof

That this property is satisfied by the good solution was already pointed out (Corollary 4.1). Suppose now that f and |f| are both distributional solutions. It follows that \(f_+ = \frac{1}{2} (f + |f|)\) and \(f_- = \frac{1}{2} (|f| - f)\) are distributional solutions, and, since \(f_+ \ge 0\) and \(f_- \ge 0\), they are both BJM reversible solutions. Therefore \(f = f_+ - f_-\) is a reversible solution by linearity. \(\square \)

4.4.3 Uniqueness of Regular Lagrangian Flows

We can finally establish the uniqueness of forward flows of the ODE (2.2).

Theorem 4.12

For every \(s \in [0,T]\) and almost every \(x \in \mathbb {R}^d\), \(\phi _{t,s}(x)\) is the unique absolutely continuous solution of (2.2).

Proof

This is a consequence of Theorem 4.11 and the superposition principle of Ambrosio [6, Theorem 3.1]. \(\square \)
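The restriction to almost every x in Theorem 4.12 cannot be removed. A classical example (not taken from the paper) is the monotone, hence one-sided Lipschitz, field \(b(x) = |x|^{1/3} \mathop {\mathrm {sign}}(x)\): the ODE through \(x = 0\) admits both the zero solution and \(x(t) = (2t/3)^{3/2}\). The sketch below checks numerically that both curves solve the ODE:

```python
import math

# classical non-uniqueness example (not from the paper): b(x) = |x|^{1/3} sign(x)
def b(x):
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def residual(traj, dt):
    # max deviation of the difference quotient from b evaluated at the midpoint
    return max(abs((traj[k + 1] - traj[k]) / dt - b(0.5 * (traj[k] + traj[k + 1])))
               for k in range(len(traj) - 1))

dt, n = 1e-4, 10000
zero = [0.0] * (n + 1)                                       # the constant solution
power = [(2.0 * k * dt / 3.0) ** 1.5 for k in range(n + 1)]  # x(t) = (2t/3)^{3/2}

# both curves start at x = 0 and solve x' = b(x) up to discretization error
assert residual(zero, dt) == 0.0
assert residual(power, dt) < 1e-2
```

The initial point \(x = 0\) where uniqueness fails is a Lebesgue-null set, consistent with the "almost every x" in the statement.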

4.5 Some Remarks for Second Order Equations

We next investigate the second-order analogues of (4.1) and (4.2). As mentioned earlier, we are not able to treat the most general case in which \(\sigma \) is a regular function of x. This is due to the fact that Lemma 2.5 only gives regularity of the backward stochastic flow in \(C^{0,1-\varepsilon }\) for \(0< \varepsilon < 1\). As a consequence, defining the Jacobian and using it to analyze the right-inverse of the flow is not possible at present with our methods. Our results in this case are limited to stochastic flows for which the coefficient \(\sigma \) in front of the Wiener process is constant in the space variable. The generalization to regular but nonconstant \(\sigma \) will be the subject of future work.

4.5.1 The Expansive Stochastic Flow with Constant Noise Coefficient

The stochastic analogue of the forward flow (2.2) is

$$\begin{aligned} d_t \Phi _{t,s}(x) = b(t, \Phi _{t,s}(x))dt + \sigma (t, \Phi _{t,s}(x))dW_t, \quad t \in [s,T], \quad \Phi _{s,s}(x) = x, \end{aligned}$$
(4.15)

where \(\sigma : [0,T] \times \mathbb {R}^d \rightarrow \mathbb {R}^{d \times m}\) is some matrix-valued map. As noted above, this general setting is out of reach at the moment, and we thus assume

$$\begin{aligned} \sigma \in L^2([0,T], \mathbb {R}^{d \times m}) \end{aligned}$$
(4.16)

is constant in the space variable. We then consider the forward stochastic flow

$$\begin{aligned} d\Phi _{t,s}(x) = b(t,\Phi _{t,s}(x))dt + \sigma _t dW_t, \quad t \in [s,T], \quad \Phi _{s,s}(x) = x. \end{aligned}$$
(4.17)

Formally defining

$$\begin{aligned} {\tilde{\Phi }}_{t,s}(x) {:}{=} \Phi _{t,s}(x) - \underbrace{\int _s^t \sigma _r dW_r}_{{:}{=} M_t - M_s} \end{aligned}$$

leads to the random ODE

$$\begin{aligned} \partial _t {\tilde{\Phi }}_{t,s}(x) = b\left( t, {\tilde{\Phi }}_{t,s}(x) + M_t - M_s \right) , \quad t \in [s,T], \quad {\tilde{\Phi }}_{s,s}(x) = x. \end{aligned}$$
(4.18)
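The reduction from (4.17) to (4.18) is exact at the level of the Euler–Maruyama scheme: by induction over the time steps, the discrete SDE flow equals the discrete random-ODE flow shifted by \(M_t - M_s\). A minimal sketch, with an illustrative monotone drift and a constant scalar \(\sigma \) (both assumptions made for the example):

```python
import math
import random

def b(t, x):
    # illustrative monotone drift (one-sided Lipschitz with C = 0),
    # not Lipschitz at the origin
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def simulate(x0, T, n, sigma, seed=0):
    # run Euler-Maruyama for (4.17) and Euler for the random ODE (4.18)
    # with the same Brownian increments and constant scalar sigma
    rng = random.Random(seed)
    dt = T / n
    phi, tilde, M = x0, x0, 0.0
    for k in range(n):
        dW = rng.gauss(0.0, math.sqrt(dt))
        phi = phi + b(k * dt, phi) * dt + sigma * dW    # SDE flow
        tilde = tilde + b(k * dt, tilde + M) * dt       # random ODE
        M = M + sigma * dW                              # increment of M
    return phi, tilde + M

# at the discrete level the identity Phi = tilde-Phi + M holds by induction,
# up to floating-point rounding
phi, recovered = simulate(1.0, 1.0, 1000, sigma=0.5)
assert abs(phi - recovered) < 1e-9
```

With the same Brownian path, the two discretizations agree up to rounding; this is the discrete counterpart of the equivalence between (4.17) and (4.18).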

We now invoke the theory of the previous subsections to obtain the following:

Theorem 4.13

For every \(s \in [0,T)\), with probability one, there exists a unique \(\Phi _{\cdot ,s} \in C([s,T], L^p_\textrm{loc}(\mathbb {R}^d)) \cap L^p_\textrm{loc}(\mathbb {R}^d, C([s,T]))\) such that, for a.e. \(x \in \mathbb {R}^d\),

$$\begin{aligned} \Phi _{t,s}(x) = x + \int _s^t b(r, \Phi _{r,s}(x))dr + \int _s^t \sigma _r dW_r. \end{aligned}$$

If \((b^\varepsilon )_{\varepsilon > 0}\) are as in (2.3) and \(\Phi ^\varepsilon \) is the unique stochastic flow solving (4.17) with drift \(b^\varepsilon \), then, with probability one, as \(\varepsilon \rightarrow 0\), \(\Phi ^\varepsilon \) converges in \(C([s,T], L^p_\textrm{loc}(\mathbb {R}^d))\) and in \(L^p_\textrm{loc}(\mathbb {R}^d, C([s,T]))\) to \(\Phi \).

Proof

This follows upon applying the results of Theorems 4.8 and 4.12 to the random ODE (4.18). \(\square \)

4.5.2 A Priori Estimates for the Second-Order Nonconservative Equation

We next relate the forward stochastic flow from the previous subsection to the terminal value problem for a certain second-order, nonconservative equation. This will be done with the use of a priori \(L^p\) and BV estimates, which lead to useful compactness results, just as for the first order case.

We begin with the more general problem

$$\begin{aligned} -\partial _t u - {\text {tr}}[ a(t,x) \nabla ^2 u] + b(t,x) \cdot \nabla u = 0 \quad \text {in } (0,T) \times \mathbb {R}^d, \quad u(T,\cdot ) = u_T, \end{aligned}$$
(4.19)

where

$$\begin{aligned} a(t,x) = \frac{1}{2} \sigma (t,x) \sigma (t,x)^T, \quad \sigma \in L^2([0,T], C^{1,1}(\mathbb {R}^d, \mathbb {R}^{d \times m}) ); \end{aligned}$$
(4.20)

notice that, although we allow \(\sigma \) to be nonconstant here, we require more regularity for \(\sigma \) than in Sect. 3.

Lemma 4.4

There exists \(C \in L^1_+([0,T])\) depending only on the \(C^{1,1}\) norm of \(\sigma \) such that, if u is a smooth solution of

$$\begin{aligned} -\partial _t u - {\text {tr}}[ a(t,x) \nabla ^2 u] = 0 \quad \text {in }(0,T) \times \mathbb {R}^d, \quad u(T,\cdot ) = u_T, \end{aligned}$$

then

$$\begin{aligned} \left\| u(t,\cdot ) \right\| _{BV(\mathbb {R}^d)} \le \exp \left( \int _t^T C(s)ds \right) \left\| u_T \right\| _{BV(\mathbb {R}^d)}. \end{aligned}$$

Proof

For \((t,x,z) \in [0,T] \times \mathbb {R}^d \times \mathbb {R}^d\), set \(w(t,x,z) = \nabla u(t,x) \cdot z\). Then w solves the parabolic PDE

$$\begin{aligned} -\frac{\partial w}{\partial t} - {\text {tr}}[A(t,x,z)\nabla ^2_{(x,z)}w] = 0 \quad \text {in }(0,T) \times \mathbb {R}^{2d}, \end{aligned}$$

where

$$\begin{aligned} A(t,x,z) = \frac{1}{2} \begin{pmatrix} \sigma (t,x) \\ z \cdot \nabla \sigma (t,x) \end{pmatrix} \begin{pmatrix} \sigma (t,x)^T&z \cdot \nabla \sigma (t,x)^T \end{pmatrix}. \end{aligned}$$

After a routine regularization argument, using the convexity of \(w \mapsto |w|\), we find that

$$\begin{aligned} -\frac{\partial |w|}{\partial t} - {\text {tr}}[A(t,x,z) \nabla ^2_{(x,z)}|w| ] \le 0 \quad \text {in }(0,T) \times \mathbb {R}^d \times \mathbb {R}^d. \end{aligned}$$
(4.21)

For some \(m > d +1\), let \(\phi \in C^\infty _+([0,\infty ))\) be such that, for some universal \(C > 0\),

$$\begin{aligned} \phi (r) = \frac{1}{r^m} \quad \text {for } r \ge 1 \quad \text {and} \quad r|\phi '(r)| + r^2 |\phi ''(r)| \le C \phi (r) \quad \text {for all } r \ge 0. \end{aligned}$$
(4.22)

We multiply (4.21) by \(\phi (|z|)\) and integrate in \((x,z) \in \mathbb {R}^d \times \mathbb {R}^d\). Then (4.20) and (4.22) imply that for some \(C \in L^1_+([0,T])\),

$$\begin{aligned} -\frac{d}{dt} \iint _{\mathbb {R}^d \times \mathbb {R}^d} |w(t,x,z)| \phi (|z|)dxdz \le C(t) \iint _{\mathbb {R}^d \times \mathbb {R}^d} |w(t,x,z)| \phi (|z|)dxdz. \end{aligned}$$

The proof is then finished by Grönwall’s lemma and the fact that

$$\begin{aligned} \iint _{\mathbb {R}^d \times \mathbb {R}^d} |w(t,x,z)| \phi (|z|)dxdz = c_0 \int _{\mathbb {R}^d} |\nabla u(t,x)| dx, \end{aligned}$$

where \(c_0 {:}{=} \int _{\mathbb {R}^d} |\nu \cdot z| \phi (|z|) dz\) is finite and independent of the unit vector \(\nu \). \(\square \)

We have already proved an exponential propagation of the BV bounds when \(a = 0\) in Lemma 4.1. It is a classical fact for evolution PDEs that, upon using a splitting scheme, these estimates can be combined, and we immediately have the following:
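The splitting heuristic can be illustrated numerically: the transport step propagates BV norms with at most an exponential factor, while, for constant coefficients, a stable explicit heat step writes every new value as a convex combination of old ones and therefore cannot increase the total variation. A sketch of the diffusion half-step (the scheme and data are illustrative assumptions):

```python
def tv(v):
    # discrete total variation
    return sum(abs(v[i + 1] - v[i]) for i in range(len(v) - 1))

def heat_step(v, lam):
    # explicit finite-difference heat step, stable for 0 <= lam <= 1/2;
    # each interior value becomes a convex combination of its neighbors
    return ([v[0]] +
            [v[i] + lam * (v[i + 1] - 2 * v[i] + v[i - 1])
             for i in range(1, len(v) - 1)] +
            [v[-1]])

v = [0.0] * 20 + [1.0] * 10 + [0.3] * 20
for _ in range(50):
    w = heat_step(v, 0.25)
    assert tv(w) <= tv(v) + 1e-12   # the diffusion step is TV-diminishing
    v = w
```

The differences of the updated values are convex combinations of the old differences, which is why the total variation cannot grow; the transport half-step then contributes the exponential factor, as in Lemma 4.1.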

Lemma 4.5

There exists a constant \(C \in L^1_+([0,T])\) depending only on the constants in (2.1) and (4.20) such that, if u is a smooth solution of (4.19), then

$$\begin{aligned} \left\| u(t,\cdot ) \right\| _{L^p} \le \exp \left( \int _t^T C(s)ds \right) \left\| u_T \right\| _{L^p} \end{aligned}$$

and

$$\begin{aligned} \left\| u(t,\cdot ) \right\| _{BV} \le \exp \left( \int _t^T C(s)ds \right) \left\| u_T \right\| _{BV}. \end{aligned}$$

Just as in the first-order case, it is not possible to define \(L^p\)-distributional solutions of (4.19), and the utility of Lemma 4.5 is that it allows us to obtain strongly convergent subsequences in \(C([0,T], L^p(\mathbb {R}^d))\) after regularizing the velocity field b.

The main question is whether such limiting solutions are unique. This uniqueness was achieved in the first-order case through duality with the conservative equation, and the solution was further characterized with a formula involving the forward flow. In the second-order case, we are constrained to work with constant noise coefficients:

$$\begin{aligned} -\partial _t u - {\text {tr}}[ a(t)\nabla ^2 u] + b(t,x) \cdot \nabla u = 0 \quad \text {in }(0,T) \times \mathbb {R}^d, \quad u(T,\cdot ) = u_T, \end{aligned}$$
(4.23)

where \(a = \frac{1}{2} \sigma \sigma ^T\) as before.

Theorem 4.14

For \(1< p < \infty \) and \(t \in [0,T]\), the map

$$\begin{aligned} C_c(\mathbb {R}^d) \ni u_T \mapsto \mathbb {E}[ u_T \circ \Phi _{T,t}] \end{aligned}$$

extends to a continuous, linear, order-preserving map on \(L^p(\mathbb {R}^d)\), and the function

$$\begin{aligned} u(t,x) {:}{=} \mathbb {E}[ u_T(\Phi _{T,t}(x))] \quad (t,x) \in [0,T] \times \mathbb {R}^d \end{aligned}$$
(4.24)

belongs to \(C([0,T], L^p(\mathbb {R}^d))\), and, if \(u_T \in BV(\mathbb {R}^d)\), then \(u \in L^\infty ([0,T], BV(\mathbb {R}^d))\).

If \((b^\varepsilon )_{\varepsilon > 0}\) is as in (2.3) and \(u^\varepsilon \) is the corresponding solution of (4.23), then, as \(\varepsilon \rightarrow 0\), \(u^\varepsilon \) converges strongly to u in \(C([0,T], L^p(\mathbb {R}^d))\).

Proof

Assume that \(u_T \in C^2(\mathbb {R}^d) \cap C_c(\mathbb {R}^d)\). For \(b^\varepsilon \) and \(u^\varepsilon \) as in the statement of the theorem, we have the standard representation formula \(u^\varepsilon (t,x) = \mathbb {E}[ u_T(\Phi ^\varepsilon _{T,t}(x))]\), where \(\Phi ^\varepsilon \) corresponds to the flow (4.17) with drift \(b^\varepsilon \). By Theorem 4.13, for any \(t \in [0,T]\), with probability one, \(u_T \circ \Phi ^\varepsilon _{T,t} \rightarrow u_T \circ \Phi _{T,t}\) a.e. in \(\mathbb {R}^d\). On the other hand, by Lemma 4.5, \((u^\varepsilon )_{\varepsilon > 0}\) is precompact in \(C([0,T], L^p(\mathbb {R}^d))\), and therefore the full sequence converges to u given by (4.24). The \(L^p\)-bounds and the extension to \(u_T \in L^p(\mathbb {R}^d)\) now follow from the \(L^p\) a priori estimates in Lemma 4.5. \(\square \)

4.5.3 Representation Formula for the Fokker–Planck Equation

We turn next to the Fokker–Planck equation

$$\begin{aligned} \partial _t f - \nabla ^2 \cdot (a(t,x) f) + {\text {div}}(b(t,x) f) = 0 \quad \text {in } (0,T) \times \mathbb {R}^d, \quad f(0,\cdot ) = f_0, \end{aligned}$$
(4.25)

where once again \(a = \frac{1}{2} \sigma \sigma ^T\) with \(\sigma \) as in (4.20).

The existence of solutions in \(C([0,T], L^p(\mathbb {R}^d))\) is straightforward; we include the proof for convenience.

Theorem 4.15

For any \(f_0 \in L^p(\mathbb {R}^d)\), \(1 \le p \le \infty \), there exists a distributional solution \(f \in C([0,T], L^p_\textrm{w}(\mathbb {R}^d))\) if \(1 \le p < \infty \), or \(f \in L^\infty \) if \(p = \infty \). Moreover, there exists \(C \in L^1_+([0,T])\) depending only on p, \(C_0(t)\) from (2.1), and the \(L^2([0,T], C^{1,1}(\mathbb {R}^d))\) norm of a such that

$$\begin{aligned} \left\| f(t,\cdot ) \right\| _{L^p} \le \exp \left( \int _0^t C(s)ds \right) \left\| f_0 \right\| _{L^p}. \end{aligned}$$

Proof

We argue with a priori estimates, assuming that all of the data are smooth. The computations can be made rigorous by regularizing b, adding a small ellipticity to a, and extracting weakly convergent subsequences.

We then compute

$$\begin{aligned} & \partial _t |f|^p -\nabla ^2 \cdot ( a(t,x) |f|^p) + {\text {div}}(b(t,x) |f|^p)\\ & \qquad \le (p-1) \left( \nabla ^2 \cdot a(t,x) - {\text {div}}b(t,x) \right) |f|^p, \end{aligned}$$

and so \(\partial _t \int |f(t,\cdot )|^p \le C(t) \int |f(t,\cdot )|^p\) for some C as in the statement of the theorem. The result now follows from Grönwall’s lemma. \(\square \)
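For the reader’s convenience, here is a formal sketch of how the pointwise inequality above arises for smooth f and \(1< p < \infty \) (the endpoint cases follow by approximation). Using the equation and the chain rule,

$$\begin{aligned} \partial _t |f|^p + {\text {div}}(b(t,x)|f|^p) = p|f|^{p-2}f\, \nabla ^2 \cdot (a(t,x) f) - (p-1)|f|^p {\text {div}}b(t,x), \end{aligned}$$

while, expanding \(\nabla ^2 \cdot (a(t,x)|f|^p)\),

$$\begin{aligned} p|f|^{p-2}f\, \nabla ^2 \cdot (a(t,x) f) = \nabla ^2 \cdot (a(t,x)|f|^p) + (p-1)\left( \nabla ^2 \cdot a(t,x)\right) |f|^p - p(p-1)|f|^{p-2}\, a \nabla f \cdot \nabla f, \end{aligned}$$

and the last term is nonpositive because \(a = \frac{1}{2}\sigma \sigma ^T \ge 0\).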

We now explore the possibility of obtaining a formula for the solution, similar to (4.3) for the first order equation (4.1). To do so, it is convenient to reverse time and consider, for fixed \(t \in (0,T]\), the equation satisfied by \(g^{(t)}(s,x) {:}{=} f(t-s,x)\):

$$\begin{aligned} & -\partial _s g^{(t)} -\nabla ^2 \cdot (a(t-s,x)g^{(t)}) + {\text {div}}(b(t-s,x)g^{(t)}) = 0 \quad \text {in } (0,t) \times \mathbb {R}^d, \quad \\ & g^{(t)}(t,\cdot ) = f_0. \end{aligned}$$

For \((s,x,\xi ) \in [0,t] \times \mathbb {R}^d \times \mathbb {R}\), define \(G^{(t)}(s,x,\xi ) = g^{(t)}(s,x) \xi \). Then

$$\begin{aligned} \left\{ \begin{aligned}&-\partial _s G^{(t)} - {\text {tr}}[ A^{(t)}(s,x,\xi ) \nabla ^2_{x,\xi }G^{(t)}] \\&\qquad - B^{(t)}(s,x) \cdot \nabla G^{(t)} - C^{(t)}(s,x) \xi \partial _\xi G^{(t)}= 0 \quad \text {in } (0,t) \times \mathbb {R}^{d+1}, \\&G^{(t)}(t,x,\xi ) = f_0(x)\xi , \end{aligned} \right. \end{aligned}$$
(4.26)

where

$$\begin{aligned} \left\{ \begin{aligned} A^{(t)}(s,x,\xi )&= \frac{1}{2}\Sigma ^{(t)}(s,x,\xi ) \Sigma ^{(t)}(s,x,\xi )^T, \quad \Sigma ^{(t)}(s,x,\xi ) = \begin{pmatrix} \sigma \\ \xi {\text{ div } }\sigma \end{pmatrix},\\ B^{(t)}(s,x)&= -b + (\sigma \cdot \nabla )\sigma ^T, \quad \text{ and }\\ C^{(t)}(s,x)&= -{\text{ div }}\left( b- {\text{ div } }a \right) \\ &= -{\text{ div } }b + {\text{ tr }}[ (\sigma \cdot \nabla )(\nabla \cdot \sigma ) ] + \frac{1}{2} |{\text{ div } }\sigma |^2 + \frac{1}{2} {\text{ tr }}[ \nabla \sigma \nabla \sigma ^T ]; \end{aligned} \right. \end{aligned}$$
(4.27)

for brevity, we have suppressed the arguments for a, \(\sigma \), and b, which are all \((t-s,x)\).

For an m-dimensional Wiener process W on [0, t] and a fixed \(s \in [0,t]\), we are led to consider the SDE, for \(r \in [s,t]\),

$$\begin{aligned} \left\{ \begin{aligned}&d_r \begin{pmatrix} \Phi ^{(t)}_{r,s}(x,\xi ) \\ \Xi ^{(t)}_{r,s}(x,\xi ) \end{pmatrix} = \begin{pmatrix} B^{(t)}(r, \Phi ^{(t)}_{r,s}(x,\xi )) \\ C^{(t)}(r, \Phi ^{(t)}_{r,s}(x,\xi )) \Xi ^{(t)}_{r,s}(x,\xi ) \end{pmatrix} dr\\&\qquad + \Sigma ^{(t)}(r, \Phi ^{(t)}_{r,s}(x,\xi ), \Xi ^{(t)}_{r,s}(x,\xi )) d W_r, \\&\begin{pmatrix} \Phi ^{(t)}_{s,s}(x,\xi ) \\ \Xi ^{(t)}_{s,s}(x,\xi ) \end{pmatrix} = \begin{pmatrix} x \\ \xi \end{pmatrix}. \end{aligned} \right. \end{aligned}$$
(4.28)

Itô’s formula, (4.26), and (4.28) then yield that, for any \((s,x,\xi ) \in [0,t) \times \mathbb {R}^d \times \mathbb {R}\),

$$\begin{aligned} r \mapsto G^{(t)}(r, \Phi ^{(t)}_{r,s}(x,\xi ), \Xi ^{(t)}_{r,s}(x,\xi )) \end{aligned}$$

is a martingale on \([s,t]\) with respect to the filtration \((\mathcal F_r)_{r\in [0,t]}\) generated by the Wiener process W, and so, for all \(r \in [s,t]\),

$$\begin{aligned} \mathbb {E}\left[ G^{(t)}(r, \Phi ^{(t)}_{r,s}(x,\xi ), \Xi ^{(t)}_{r,s}(x,\xi )) \mid \mathcal F_s \right] = G^{(t)}(s, x,\xi ). \end{aligned}$$
(4.29)

Observe that \(\Phi ^{(t)}_{r,s}\) is independent of \(\xi \), while \(\Xi ^{(t)}_{r,s}\) can be written as \(\Xi ^{(t)}_{r,s}(x,\xi ) = J^{(t)}_{r,s}(x) \xi \) for some scalar quantity \(J^{(t)}_{r,s}(x)\), and so (4.28) reduces to the two SDEs

$$\begin{aligned} \left\{ \begin{array}{ll} d_r \Phi ^{(t)}_{r,s}(x) = -\left[ b(t-r, \Phi ^{(t)}_{r,s}(x)) - (\sigma \cdot \nabla )\sigma ^T(t-r, \Phi ^{(t)}_{r,s}(x)) \right] dr\\ \quad \qquad \qquad \qquad + \sigma (t-r, \Phi ^{(t)}_{r,s}(x)) dW_r, \quad r \in [s,t], \\ \Phi ^{(t)}_{s,s}(x) = x \end{array}\right. \end{aligned}$$
(4.30)

and

$$\begin{aligned} \left\{ \begin{aligned}&d_r J^{(t)}_{r,s}(x) = \Bigg [ - {\text{ div } }b + {\text{ tr }}[(\sigma \cdot \nabla )(\nabla \cdot \sigma )] + \frac{1}{2} |{\text{ div } }\sigma |^2 \\ &\qquad \qquad \qquad \quad + \frac{1}{2} {\text{ tr }}[ \nabla \sigma \nabla \sigma ^T] \Bigg ](t-r, \Phi ^{(t)}_{r,s}(x)) J^{(t)}_{r,s}(x) dr \\ &\qquad \qquad \qquad \quad + {\text{ div }}\sigma (t-r,\Phi ^{(t)}_{r,s}(x)) J^{(t)}_{r,s}(x) d W_r, \quad r \in [s,t], \\ &J^{(t)}_{s,s}(x) = 1. \end{aligned} \right. \end{aligned}$$
(4.31)

Standard but tedious computations involving Itô’s formula reveal that \(J^{(t)}_{r,s}(x) = \det \nabla _x \Phi ^{(t)}_{r,s}(x)\).
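As a consistency check, when \(\sigma \equiv 0\) this identity reduces to the classical Liouville formula: the flow then solves \(\partial _r \Phi ^{(t)}_{r,s}(x) = -b(t-r,\Phi ^{(t)}_{r,s}(x))\), and therefore

$$\begin{aligned} \partial _r \det \nabla _x \Phi ^{(t)}_{r,s}(x) = -{\text {div}}b(t-r,\Phi ^{(t)}_{r,s}(x)) \det \nabla _x \Phi ^{(t)}_{r,s}(x), \quad \det \nabla _x \Phi ^{(t)}_{s,s}(x) = 1, \end{aligned}$$

which is exactly the equation (4.31) satisfied by \(J^{(t)}_{r,s}(x)\) when \(\sigma \equiv 0\).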

Taking \(r = t\) and \(\xi = 1\) in (4.29), we thus arrive at

$$\begin{aligned} \mathbb {E}\left[ f_0(\Phi ^{(t)}_{t,s}(x)) J^{(t)}_{t,s}(x) \mid \mathcal F_s \right] = g^{(t)}(s, x), \end{aligned}$$

and so, because \(g^{(t)}(0,x) = f(t,x)\), we obtain the representation formula for solutions of (4.25):

$$\begin{aligned} f(t,x) = \mathbb {E}\left[ f_0(\Phi ^{(t)}_{t,0}(x)) J^{(t)}_{t,0}(x) \right] . \end{aligned}$$
(4.32)

Let us note that \(\Phi ^{(t)}_{t,0}\) has the same law as \((\Phi _{t,0})^{-1}\), where \(\Phi _{t,s}\) is the stochastic flow from (4.15). We can see this by duality with the nonconservative equation. Indeed, if u is the solution of (4.19) with \(u(t,\cdot ) = g\) for some given g, then

$$\begin{aligned} \int f_0(x) u(0,x)dx = \int f(t,x) g(x)dx. \end{aligned}$$

On the other hand, by (4.24) and (4.32),

$$\begin{aligned} \int f_0(x) u(0,x)dx = \mathbb {E}\int f_0(x) g( \Phi _{t,0}(x))dx \end{aligned}$$

and

$$\begin{aligned} \int f(t,x) g(x)dx = \mathbb {E}\int f_0(\Phi ^{(t)}_{t,0}(x)) g(x) J^{(t)}_{t,0}(x) dx, \end{aligned}$$

so, using the change of variables formula and the fact that \(f_0\) is arbitrary, we have \(\mathbb {E}[ g( \Phi _{t,0}(x)) ] = \mathbb {E}[ g( [ \Phi ^{(t)}_{t,0} ]^{-1}(x))]\) for all bounded, measurable \(g: \mathbb {R}^d \rightarrow \mathbb {R}\) and \(x \in \mathbb {R}^d\).

We now note that the SDE (4.30) falls under the assumptions of Lemma 2.5, and therefore, for every \(0 \le s < t \le T\), there exists a unique solution \(\Phi ^{(t)}_{\cdot ,s}\) with the properties laid out by that result. However, the main difficulty is that we do not know whether \(\Phi ^{(t)}_{t,0}\) is Lipschitz continuous on \(\mathbb {R}^d\) (see Remark 2.8). This prevents us from bounding \(J^{(t)}_{t,0}\) uniformly in \(L^\infty \) and passing to weak distributional limits. This is a major obstacle in using the formula (4.32) to identify the unique limiting distributional solution of (4.25), as we did for the first order equation (4.1).

The exception is when \(\sigma \) is independent of x. In that case, (4.30) and (4.31) become

$$\begin{aligned} d_r \Phi ^{(t)}_{r,s}(x) = - b(t-r, \Phi ^{(t)}_{r,s}(x)) dr + \sigma (t-r) dW_r, \quad r \in [s,t], \quad \Phi ^{(t)}_{s,s}(x) = x \end{aligned}$$
(4.33)

and

$$\begin{aligned} \partial _r J^{(t)}_{r,s}(x) = -{\text {div}}b(t-r, \Phi ^{(t)}_{r,s}(x))J^{(t)}_{r,s}(x), \quad r \in [s,t], \quad J^{(t)}_{s,s}(x) = 1. \end{aligned}$$
(4.34)

The SDE (4.34) is in fact an ODE with random coefficients. In particular, \(J^{(t)}_{\cdot ,s}\) has a deterministic bound.
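Indeed, solving (4.34) pathwise, and recalling that the one-sided Lipschitz condition (1.4) implies \({\text {div}}b(t,\cdot ) \ge -dC(t)\) in the sense of distributions, we find (formally, or for the regularizations \(b^\varepsilon \))

$$\begin{aligned} J^{(t)}_{r,s}(x) = \exp \left( -\int _s^r {\text {div}}b(t-\rho , \Phi ^{(t)}_{\rho ,s}(x)) d\rho \right) \le \exp \left( d \int _{t-r}^{t-s} C(\tau ) d\tau \right) . \end{aligned}$$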

We can then uniquely characterize the limiting distributional solution of

$$\begin{aligned} \partial _t f - \nabla ^2 \cdot (a(t)f) + {\text {div}}( b(t,x) f) = 0 \quad \text {in } (0,T) \times \mathbb {R}^d, \quad f(0,\cdot ) = f_0. \end{aligned}$$
(4.35)

Theorem 4.16

For \(1 \le p < \infty \), the formula (4.32), where \(\Phi ^{(t)}_{\cdot ,s}\) and \(J^{(t)}_{\cdot ,s}\) are specified by (4.33) and (4.34) respectively, extends continuously to any \(f_0 \in L^p(\mathbb {R}^d)\). If \(f_0 \in L^p(\mathbb {R}^d)\), \((b^\varepsilon )_{\varepsilon > 0}\) is as in (2.3), and \(f^\varepsilon \) is the corresponding solution of (4.35), then, as \(\varepsilon \rightarrow 0\), \(f^\varepsilon \) converges weakly to f in \(C([0,T], L^p_\textrm{w}(\mathbb {R}^d))\). If \(f_0 \ge 0\), then there exists a unique nonnegative distributional solution of (4.35), which is given by (4.32).

Proof

Let \((b^\varepsilon )_{\varepsilon > 0}\) and \(f^\varepsilon \) be as in the statement of the theorem, and assume \(f_0 \in C^2_c(\mathbb {R}^d)\). Let \(u^\varepsilon \) be the solution of (4.23) with velocity \(b^\varepsilon \) and with terminal data \(u^\varepsilon (t,\cdot ) = g \in C^2_c(\mathbb {R}^d)\) for some fixed \(t \in [0,T]\). Then integration by parts yields

$$\begin{aligned} \int f^\varepsilon (t,x)g(x)dx = \int f_0(x) u^\varepsilon (0,x)dx. \end{aligned}$$

By Theorem 4.14, as \(\varepsilon \rightarrow 0\), \(u^\varepsilon \) converges strongly in \(L^{p'}(\mathbb {R}^d)\) to the function u defined uniquely by \(u(s,x) = \mathbb {E}[ g(\Phi _{t,s}(x))]\). Therefore, any \(C([0,T], L^p_\textrm{w}(\mathbb {R}^d))\)-weak limit f of \(f^\varepsilon \) as \(\varepsilon \rightarrow 0\) must satisfy

$$\begin{aligned} \int f(t,x) g(x) dx = \int f_0(x) u(0,x)dx, \end{aligned}$$

and it follows that there is a unique such limiting function f.

On the other hand, for \(\varepsilon > 0\),

$$\begin{aligned} f^\varepsilon (t,x) = \mathbb {E}\left[ f_0\left( \Phi ^{(t),\varepsilon }_{t,0}(x) \right) J^{(t),\varepsilon }_{t,0}(x) \right] , \end{aligned}$$

where \(\Phi ^{(t),\varepsilon }_{\cdot ,s}\) and \(J^{(t),\varepsilon }_{\cdot ,s}\) are as in respectively (4.33) and (4.34) with b replaced everywhere by \(b^\varepsilon \). For fixed \(t \in [0,T]\), uniformly in \(\varepsilon \), \(\Phi ^{(t),\varepsilon }_{t,0}\) is Lipschitz continuous on \(\mathbb {R}^d\), and so \(J^{(t),\varepsilon }_{t,0} = \det \nabla _x \Phi ^{(t),\varepsilon }_{t,0}\) is bounded in \(L^\infty \). By exactly the same arguments as in Lemma 2.3 and Theorem 4.1, we see that, as \(\varepsilon \rightarrow 0\), \(\mathbb {E}f_0\left( \Phi ^{(t),\varepsilon }_{t,0} \right) J^{(t),\varepsilon }_{t,0}\) converges weakly in \(L^p\) to \( \mathbb {E}f_0\left( \Phi ^{(t)}_{t,0} \right) J^{(t)}_{t,0} \). It follows that f must be given by (4.32). The fact that the formula extends to arbitrary \(f_0 \in L^p(\mathbb {R}^d)\) now follows from the a priori \(L^p\) bounds in Theorem 4.15.

The uniqueness of nonnegative distributional solutions is then a consequence of the uniqueness of the forward flow established in Theorem 4.13, as well as the generalization of superposition to second-order Fokker–Planck equations (see Figalli [41, Lemma 2.3]). \(\square \)
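To illustrate the representation formula (4.32) in the constant-\(\sigma \) setting of Theorem 4.16, here is a minimal numerical sketch (not part of the analysis above, with all parameters chosen purely for illustration): we take \(d = m = 1\), \(\sigma \equiv 1\), and \(b(t,x) = -x\), so that (4.35) is the Ornstein–Uhlenbeck Fokker–Planck equation with an explicit Gaussian solution. The SDE (4.33) is simulated by Euler–Maruyama, and (4.34) reduces to the deterministic formula \(J^{(t)}_{r,0} = e^r\), since \({\text {div}}b \equiv -1\).

```python
import numpy as np

def fp_density_mc(x, t, f0, n_paths=200_000, n_steps=500, seed=0):
    """Monte Carlo evaluation of the representation (4.32),
        f(t,x) = E[ f0(Phi_{t,0}(x)) J_{t,0}(x) ],
    in the illustrative case b(t,x) = -x and sigma = 1, for which
    div b = -1 and hence J_{r,0} = e^r deterministically.
    The time-reversed SDE (4.33) reads d Phi = Phi dr + dW."""
    rng = np.random.default_rng(seed)
    dr = t / n_steps
    phi = np.full(n_paths, x, dtype=float)
    for _ in range(n_steps):  # Euler-Maruyama step for d Phi = Phi dr + dW
        phi += phi * dr + rng.normal(0.0, np.sqrt(dr), n_paths)
    jac = np.exp(t)  # J solves J' = -(div b) J = J, J_0 = 1
    return jac * f0(phi).mean()

# Initial data: centered Gaussian density with variance v0.
v0 = 0.25
f0 = lambda y: np.exp(-y**2 / (2 * v0)) / np.sqrt(2 * np.pi * v0)

# Exact OU solution: f(t,.) is N(0, V_t), V_t = v0 e^{-2t} + (1 - e^{-2t})/2.
t, x = 0.5, 0.3
V = v0 * np.exp(-2 * t) + 0.5 * (1 - np.exp(-2 * t))
exact = np.exp(-x**2 / (2 * V)) / np.sqrt(2 * np.pi * V)

approx = fp_density_mc(x, t, f0)
```

For a Gaussian initial density, the exact solution remains Gaussian, and the Monte Carlo value of (4.32) agrees with the exact density up to sampling and discretization error.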