Abstract
We consider a one-dimensional kinetic model of granular media in the case where the interaction potential is quadratic. Taking advantage of a simple first integral, we can use a reformulation (equivalent to the initial kinetic model for classical solutions) which allows measure solutions. This reformulation has a Wasserstein gradient flow structure (on a possibly infinite product of spaces of measures) for a convex energy which enables us to prove global in time well-posedness.
1 Introduction
Kinetic models for granular media were initiated in the work of Benedetto et al. [4, 5] who considered the following PDE
where \(f_0\) is an integrable nonnegative function on the phase space and W is a certain convex and radially symmetric potential capturing the (inelastic) collision rule between particles, and the convolution is in velocity only \((\nabla W\star _v f_t)(x,v)=\int _{\mathbb {R}^d} \nabla W(v-u)f_t(x,u) \text{ d } u\) (so that there is no regularizing effect in the spatial variable). At least formally, (1.1) captures the limit as the number N of particles tends to \(+\infty \) of the second-order ODE system:
which describes the motion of N particles of mass \(\frac{1}{N}\) moving freely until collisions occur, and at collision times, there is some velocity exchange with a loss of kinetic energy depending on the form of the potential W.
Surprisingly there are very few results on well-posedness for such equations. This is in contrast with the spatially homogeneous case (i.e. f depending on t and v only) associated with (1.1) that has been very much studied (see [4, 6, 11–13, 17] and the references therein) and for which existence, uniqueness and long-time behavior are well understood. In fact, the spatially homogeneous version of (1.1) can be seen as the Wasserstein gradient flow of the interaction energy associated to W, and then well-posedness results can be viewed as a consequence of the powerful theory of Wasserstein gradient flows (see [3]). For the full kinetic equation (1.1), local existence and uniqueness of a classical solution was proved in one dimension in [4] for the potential \(W(v)=|v|^3/3\,\) (as observed in [2], the arguments of [4] extend to dimension d and \(W(v)=|v|^p/p\) provided \(p>3-d\)) when the initial datum \(f_0\) is a non-negative \(C^1\cap W^{1,\infty }(\mathbb {R}\times \mathbb {R})\) integrable function with compact support. Under an additional smallness assumption, the authors of [4] also proved a global existence result. In [1], the first author has extended the local existence result of [4] to more general interaction potentials W and to any dimension, \(d\ge 1\). The proof of [1] is based on a splitting of the kinetic equation (1.1) into a free transport equation in x, and a collision equation in v that is interpreted as the gradient flow of a convex interaction energy with respect to the quadratic Wasserstein distance. In [2], various a priori estimates are obtained, in particular a global entropy bound (which thus rules out concentration in finite time) in dimension 1 when \(W''\) is subquadratic near zero.
Understanding under which conditions one can hope for global existence or on the contrary expect explosion in finite time is mainly an open question. Let us remark that the weak formulation of (1.1) means that for any \(T>0\) and any \(\phi \in C_c^{\infty }([0,T]\times \mathbb {R}^d\times \mathbb {R}^d)\) one has
and for the right hand side to make sense, it is necessary to have a control on nonlinear quantities like
which actually makes it difficult to define measure solutions (this also explains why in [4] or [1], the authors look for \(L^1\cap L^\infty \) solutions). Observing that (1.1) can be written in conservative form as
we see that, at least for smooth solutions, (1.1) can be integrated using the method of characteristics:
where \(S_t\) is the flow of the vector-field F(f) i.e.
and \(f_t={S_t}_\# f_0\) means that
In the present work, we investigate the one-dimensional case with the quadratic kernel \(W(v)=\frac{1}{2} \vert v\vert ^2\) which is neither covered by the analysis of [4] nor by the entropy estimate of [2] (actually the entropy cannot be globally bounded in this case, see [2]). In this case the convolution takes the form
where
so that the kinetic equation (1.1) rewrites
and we supplement (1.4) with the initial condition
where \(f_0\) is a compactly supported probability density:
and
for some positive constants \(R_x\) and \(R_v\). We shall see later on how to treat more general measures as initial conditions. Our first contribution is the observation that, thanks to a special first integral of motion for the characteristic system associated with (1.4), one may define weak solutions not at the level of measures on the phase space but on a (possibly infinite) product of measures on the physical space. Our second contribution is to show that this reformulation has a gradient flow structure for an energy functional with good properties, which will enable us to prove global well-posedness. To the best of our knowledge, even if the situation we are dealing with is very particular, this is the first global result of this type for kinetic models of granular media. As pointed out to us by Yann Brenier, our analysis has some similarities with (but is different from) some models of sticky particles for pressureless flows (see [8, 9]) and Brenier’s formulation of the Darcy–Boussinesq system [7].
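For orientation, the quadratic-kernel computation described above can be written out explicitly. The display below is a sketch reconstructed from the definitions in this section; it assumes that (1.1) has the usual form \(\partial _t f+v\cdot \nabla _x f=\mathrm {div}_v\big (f\,(\nabla W\star _v f)\big )\) and is not a quotation of the numbered displays.

$$\begin{aligned} (W'\star _v f_t)(x,v)&=\int _{\mathbb {R}} (v-u) f_t(x,u) \text{ d } u=\rho _t(x)\, v-m_t(x),\\ \rho _t(x)&:=\int _{\mathbb {R}} f_t(x,u) \text{ d } u, \qquad m_t(x):=\int _{\mathbb {R}} u f_t(x,u) \text{ d } u,\\ \partial _t f+v\,\partial _x f&=\partial _v\big (f\,(\rho \, v-m)\big ). \end{aligned}$$

The first line uses only \(W'(v)=v\); the last line is what the kinetic equation becomes once the convolution is replaced by \(\rho v-m\).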
The article is organized as follows. In Sect. 2, we show how a certain first integral of motion can be used to give a reformulation of (1.4) which allows for measure solutions. Section 3 investigates the gradient flow structure of this reformulation. Section 4 proves global existence thanks to the celebrated Jordan–Kinderlehrer–Otto (henceforth JKO) implicit Euler scheme of [16] for a certain energy functional. In Sect. 5, we prove uniqueness and stability and give some concluding remarks.
2 A first integral and measure solutions
2.1 A first integral for classical solutions
Let us consider a \(C^1\) compactly supported initial condition \(f_0\) and a classical solution f, that is a \(C^1\) function which solves (1.4) in a pointwise sense on \(\mathbb {R}_+\times \mathbb {R}\times \mathbb {R}\). It is then easy to show (see [2]) that f remains compactly supported locally in time; more precisely (1.7) and (1.4) imply that
The characteristics of (1.4) are given by the flow map of the second-order ODE
in the sense that
where \((X_0(x,v), V_0(x,v))=(x,v)\) and
with \(\rho \) and m being, respectively, the spatial marginal and the momentum associated with f, as defined by (1.3). Integrating (1.4) with respect to v first gives:
so that there is a stream potential G such that
and since \(\rho \) is a probability measure, it is natural to choose the integration constant in such a way that G is the cumulative distribution function of \(\rho \):
Replacing (2.6) in (2.2) then gives
so that \(\dot{X}+G_t(X)\) is constant along the characteristics. Since \(G_0\) can be deduced from the initial condition \(f_0\) by
we have the following explicit first integral of motion for (2.3):
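The computation behind this first integral can be sketched in two lines. The display below assumes that the characteristic system reads \(\dot{X}=V\), \(\dot{V}=-(\rho _t(X)V-m_t(X))\) and that the stream potential satisfies \(\partial _x G=\rho \), \(\partial _t G=-m\); these forms are inferred from the surrounding text rather than quoted from it.

$$\begin{aligned} \frac{\text{ d }}{\text{ d }t}\big (\dot{X}_t+G_t(X_t)\big )&=\dot{V}_t+\partial _t G_t(X_t)+\partial _x G_t(X_t)\,\dot{X}_t\\&=-\big (\rho _t(X_t)V_t-m_t(X_t)\big )-m_t(X_t)+\rho _t(X_t)V_t=0. \end{aligned}$$

Evaluating the conserved quantity at \(t=0\) for the characteristic starting from (x, v) gives the value \(v+G_0(x)\), which is the label a used in the next subsection.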
2.2 Reformulation and equivalence for classical solutions
In view of the first integral (2.7), it is natural to perform a change of variables on the initial conditions:
so that for every \(\phi \in C(\mathbb {R}\times \mathbb {R})\) one has
and then to rewrite the characteristics as a family of first-order ODEs parametrized by the label a:
The flow (2.3) may then be rewritten as:
Hence setting
the relation \(f_t =(X_t, V_t)_\# f_0\) can be re-expressed as:
for every \(t\ge 0\) and every test-function \(\phi \in C(\mathbb {R}^2)\). This implies in particular that
and then also
On the other hand, using (2.8), we deduce that for each \(a\in \mathbb {R}\), \(\nu ^a\) satisfies the continuity equation:
Note that \(\nu _t^a\) is a nonnegative measure but not necessarily a probability measure, its total mass being that of \(\nu _0^a\) i.e. \(h(a):=\int _\mathbb {R}f_0(x, a-G_0(x)) \text{ d } x\).
The previous considerations show that any classical solution of (1.4) is related to a solution of the system of continuity equations (2.11) and (2.12) with initial condition \(f_0\) via the relation (2.10). The converse is also true: if \(\nu ^a\) is a family of classical solutions of (2.12) with \(G^a\) and G given by (2.11), then the time-dependent family of probability measures \(f_t\) on \(\mathbb {R}^2\) defined by (2.10) actually solves (1.4). Indeed, by construction the spatial marginal \(\rho \) of f is \(\partial _x G\); as for the momentum, we have
Then, thanks to (2.12) and Fubini’s theorem, we have
Then let us take a test-function \(\phi \in C_c^1(\mathbb {R}^2)\), differentiating (2.10) with respect to time, using \(\partial _x G=\rho \), \(\partial _t G=-m\), (2.10) and an integration by parts and (2.12), we have
This proves that, for classical solutions, the kinetic equation (1.4) is actually equivalent to the system of PDEs (2.12)–(2.11) indexed by the label a.
2.3 Measure solutions
We now take the system (2.11) and (2.12) as a starting point to define measure solutions. We have to suitably relax the system so as to take into account:
- The fact that shocks may occur, i.e. atoms of \(\rho \) may appear in finite time, so that the cumulative distribution G may become discontinuous (in which case it will be convenient to view G, which is monotone, as a set-valued map),
- The fact that when shocks occur, the velocity may depend on the label a,
- More general initial conditions.
Let us treat first the case of more general initial conditions. What really matters is to be able to perform the change of variables \((x,v)\mapsto (x,a):=(x, v+G_0(x))\), which can be done as soon as \(\rho _0\) is atomless i.e. does not charge points. We shall therefore assume that \(f_0\) is a probability measure on \(\mathbb {R}^2\) with compact support and having an atomless spatial marginal:
Defining the spatial marginal \(\rho _0\) of \(f_0\) by
as well as its cumulative distribution function
\(G_0\) is continuous and \(\rho _0\) is supported on \([-R_x, R_x]\). Since \(G_0\) takes values in [0, 1], then \(a(x,v):=v+G_0(x)\in [-R_v, R_v+1]\) for \((x,v)\in \mathrm {Supp}(f_0)\). We then define the probability measure \(\eta _0\) as the push-forward of \(f_0\) through \((x,v)\mapsto (x, a(x,v))\) i.e.
We then fix a \(\sigma \)-finite measure \(\mu \) such that the second marginal of \(\eta _0\) is absolutely continuous with respect to \(\mu \); for instance it could be the second marginal of \(\eta _0\), but we allow \(\mu \) to be a more general measure (not necessarily a probability measure; for instance it was the Lebesgue measure in the previous Sect. 2.2, and in the discrete example of Sect. 2.4 below, \(\mu \) will be a discrete measure). Then we can disintegrate \(\eta _0\) as \(\eta _0=\nu _0^a \otimes \mu \) which means that for every \(\phi \in C(\mathbb {R}^2)\) we have
Note that \(\nu _0^a\) is supported on \([-R_x, R_x]\) and it is not necessarily a probability measure. We denote by h(a) its total mass i.e. the Radon–Nikodym density of the second marginal of \(\eta _0\) with respect to \(\mu \):
so that \(h\in L^1(\mu )\), \(\int _{\mathbb {R}} h(a) \text{ d } \mu (a)=1\) and \(h=0\) outside of the interval \([-R_v, R_v+1]\).
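The change of variables \((x,v)\mapsto (x, v+G_0(x))\) and the masses h(a) can be illustrated on a small weighted sample. This is a minimal sketch under stated assumptions: the uniform density on \([-1,1]\) is a hypothetical choice of \(\rho _0\), the four sample points stand in for a quadrature of \(f_0\), and the function names are illustrative only.

```python
def cdf_uniform(x, r_x=1.0):
    """G_0 for the uniform density on [-r_x, r_x] (hypothetical choice of rho_0)."""
    return min(1.0, max(0.0, (x + r_x) / (2.0 * r_x)))


def labels(points, g0):
    """Map each phase-space point (x, v) to (x, a) with a = v + G_0(x)."""
    return [(x, v + g0(x)) for x, v in points]


def label_masses(points, weights, g0, digits=12):
    """Total mass carried by each distinct label a (discrete analogue of h)."""
    masses = {}
    for (x, v), w in zip(points, weights):
        a = round(v + g0(x), digits)
        masses[a] = masses.get(a, 0.0) + w
    return masses


# Sample with |v| <= R_v = 0.5, chosen so that v + G_0(x) is the same everywhere.
R_V = 0.5
points = [(-1.0, 0.5), (0.0, 0.0), (0.5, -0.25), (1.0, -0.5)]
weights = [0.25, 0.25, 0.25, 0.25]

relabelled = labels(points, cdf_uniform)
masses = label_masses(points, weights, cdf_uniform)
```

In this sample all four points share the label a = 0.5, so the whole unit mass sits on a single label, and every label lies in \([-R_v, R_v+1]\) as stated above.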
The rest of the paper will be devoted to study the structure and well-posedness of the following system which relaxes to a measure-valued setting the system (2.11) and (2.12):
subject to the constraint that
where
Note that when \(\mu \) is the Lebesgue measure and there are no shocks i.e. when \(G_t\) is continuous, we recover the system (2.11) and (2.12) of Sect. 2.2. Denoting by \(\mathcal{P}_2(\mathbb {R})\) the set of Borel probability measures on \(\mathbb {R}\) with finite second moment, solutions of (2.16)–(2.18) are then formally defined by:
Definition 2.1
Fix a time \(T>0\); a measure solution of the system (2.16)–(2.18) on \([0,T]\times \mathbb {R}\) is a family of measures \((t,a) \in [0,T]\times [-R_v, R_v+1] \mapsto \nu _t^a \in h(a) \mathcal{P}_2(\mathbb {R})\) which
1. Is measurable in the sense that for every Borel bounded function \(\phi \) on \([0,T]\times \mathbb {R}\times \mathbb {R}\), the map \((t,a)\mapsto \int _\mathbb {R}\phi (t,a,x)\text{ d } \nu _t^a(x)\) is \( \text{ d }t \otimes \mu \) measurable,
2. Satisfies the continuity equation (2.16) in the sense of distributions for \(h\mu \)-a.e. a, with a \(\nu _t^a\otimes \mu \otimes \text{ d }t\)-measurable velocity field \(v_t^a\) which satisfies (2.17), \(\nu _t^a\otimes \mu \otimes \text{ d }t\) a.e., and with \(G_t\) and \(G_t^{-}\) defined by (2.18).
Note that since \(v_t^a\) constrained by (2.17) is bounded, \(t\mapsto \nu _t^a\) is actually continuous for the weak convergence of measures for \(h\mu \)-a.e. a. Note also that the fact that \(t\mapsto \nu _t^a\) satisfies the continuity equation (2.16) in the sense of distributions is equivalent to the condition that for every \(\psi \in C( [-R_v, R_v+1])\) and \(\phi \in C_c^1([0,T]\times \mathbb {R})\) one has:
2.4 A discrete example and a system of Burgers equations
The aim of this paragraph, which is somewhat independent of the rest of the paper, is to show, on a discrete example, that one cannot take for granted that the stream \(G_t\) remains continuous, which justifies the need to relax the condition \(v_t^a (x)=a-G_t(x)\) to (2.17). Consider indeed the special case
where \(\rho _0\) is a smooth compactly supported probability density and \(a_1<\cdots <a_N\) are the finitely many values that the label a may take. In this case, we take \(\mu \) as the counting measure and then
Even though \(G_0\) is smooth, we have to expect that shocks may appear in finite time. Let us relabel the measures \(\nu ^i:=\nu ^{a_i}\) and the corresponding cumulative distributions \(G^i:=G^{a_i}\), \(G:=\sum _{j=1}^N G^j\). If G were continuous, then all the nondecreasing functions \(G^i\) would also be continuous (no shocks), and then the system (2.16)–(2.18) would become
Integrating with respect to the spatial variable between \(-\infty \) and x would then give a system of Burgers-like equations:
We can at least formally rewrite each of these equations in the more familiar form
where each function \(\psi ^i\) is implicitly defined in terms of the pseudo inverse \(H^i_t\) of \(G^i_t\):
Note that \(\psi _t^i\) is decreasing for every t and actually \((\psi _t^i)'\le -1\). In the absence of shocks, \(H_t^i\) simply solves \(\partial _t H^i=\psi ^i_t\). Let us then take \(x_1<x_2\) belonging to a certain interval on which \(\rho _0\ge \nu \) with \(\nu >0\) and define \(y_1:=\frac{1}{N} G_0(x_1)\), \(y_2:=\frac{1}{N} G_0(x_2)\), we then have \(y_2-y_1=\frac{1}{N} \int _{x_1}^{x_2} \rho _0\ge \frac{\nu }{N}(x_2-x_1)\). Integrating \(\partial _t H^i=\psi ^i_t\) and using the fact that \((\psi ^i)'\le -1\), we get
This means that \(H^i_t\) becomes noninjective before a time
In other words, discontinuities of \(G^i\) i.e. shocks appear in finite time of order O(N) for any finite N.
3 A gradient flow structure
In this section, assuming (2.13) we will see how to obtain solutions to the system (2.16)–(2.18) by a gradient flow approach. Existence of such gradient flows using the JKO implicit scheme for Wasserstein gradient flows will be detailed in Sect. 4. We denote by \(\mathcal{M}(\mathbb {R}^d)\) the set of Borel measures on \(\mathbb {R}^d\) and \(\mathcal{P}(\mathbb {R}^d)\) the set of Borel probability measures on \(\mathbb {R}^d\). Given two nonnegative Borel measures on \(\mathbb {R}^d\) with common finite total mass h (not necessarily 1) and finite p-moments, \(\nu \) and \(\theta \), recall that for \(p\in [1,+\infty )\), the p-Wasserstein distance between \(\nu \) and \(\theta \) is by definition:
where \( \Pi (\nu , \theta )\) is the set of transport plans between \(\nu \) and \(\theta \) i.e. the set of Borel measures on \(\mathbb {R}^d\times \mathbb {R}^d\) having \(\nu \) and \(\theta \) as marginals (we refer to the textbooks of Villani [18, 19] for a detailed exposition of optimal transport theory). Wasserstein distances are usually defined between probability measures such as \(h^{-1} \nu \) and \(h^{-1} \theta \) , but of course they extend to measures with the same total mass and \(W_p^p(\nu , \theta )=h W_p^p(h^{-1} \nu , h^{-1}\theta )\). We shall mainly use the 2-Wasserstein distance but the 1-Wasserstein distance will be useful as well in the sequel. We also recall that the 1-Wasserstein distance can also be defined through the Kantorovich duality formula (see for instance [18, 19]):
We will see in Sect. 4 that one may obtain solutions to the system (2.16)–(2.18) by a minimizing scheme for an energy defined on an infinite product of spaces of measures parametrized by the label a. Wasserstein gradient flows on finite products have recently been investigated in [10, 15]. To our knowledge the case of an infinite product is new in the literature.
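The Wasserstein distances recalled above are easy to compute on the real line, which is the only case needed in the sequel. The sketch below assumes two empirical measures with the same number n of atoms, each atom carrying mass `mass / n`; it relies on the standard fact that, in one dimension and for \(p\ge 1\), the monotone (sorted) coupling is optimal. The function name is hypothetical.

```python
def wasserstein_p_pow(xs, ys, p=2.0, mass=1.0):
    """W_p^p between two empirical measures on the line.

    Both measures have len(xs) == len(ys) atoms of equal mass mass / n.
    In 1D the sorted coupling is optimal for every p >= 1.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty samples of the same size")
    n = len(xs)
    return mass / n * sum(abs(x - y) ** p for x, y in zip(sorted(xs), sorted(ys)))


# Two atoms each; the sorted coupling sends 0 -> 1 and 1 -> 2.
w2_unit = wasserstein_p_pow([0.0, 1.0], [2.0, 1.0], p=2.0)
w1_unit = wasserstein_p_pow([0.0, 1.0], [2.0, 1.0], p=1.0)
# Same atoms with common total mass h = 0.3 instead of 1.
w2_scaled = wasserstein_p_pow([0.0, 1.0], [2.0, 1.0], p=2.0, mass=0.3)
```

The last line illustrates the scaling identity \(W_p^p(\nu , \theta )=h W_p^p(h^{-1} \nu , h^{-1}\theta )\) for measures of common total mass h.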
3.1 Functional setting
As in Sect. 2.3, starting from \(f_0\) satisfying (2.13), let us define \(A:=[-R_v, R_v+1]\), fix a \(\sigma \)-finite measure \(\mu \) on \(\mathbb {R}\) and a measurable family of finite Borel measures \(a\in A\mapsto \nu _0^a\) such that, for every \(\phi \in C(\mathbb {R}^2)\):
As already pointed out, neither \(\mu \) nor \(\nu _0^a\) need to be probability measures, we thus define
so that \(h\in L^1(\mu )\), \(\int _{\mathbb {R}} h(a) \text{ d } \mu (a)=\int _A h(a) \text{ d } \mu (a)=1\). Let us then denote by X the set consisting of all \({\varvec{\nu }}:=(\nu ^a)_{a\in A}\), \(\mu \)-measurable families of measures such that
Given \(R>0\) [the precise choice of R will be made later on, see (4.2) below], let us denote by \(X_R\) the subset of X defined by
For \({\varvec{\nu }}\in X_R\), let us define the probability [because \(\int _\mathbb {R}h(a) \text{ d } \mu (a)=1\)] measure
and the energy
Note that J is unbounded from below on the whole of X but it is bounded on each \(X_R\). Note also that the interaction term can be rewritten as:
We equip \(X_R\) with the distance d given by:
It will also be convenient to work with the weak topology on \(X_R\) that is the one defined by the family of semi-norms
where \({\varvec{\nu }}\otimes \mu \) is the probability measure defined by
and
so that convergence for the weak topology is nothing but weak-\(*\) convergence of \({\varvec{\nu }}\otimes \mu \). Since for all \({\varvec{\nu }}\in X_R\), \({\varvec{\nu }}\otimes \mu \) is a probability measure on the compact set \(A\times [-R, R]\), \(X_R\) is compact for the weak topology. Note also that since the weak-\(*\) topology is metrizable by the Wasserstein distance (see [18, 19]) on the set of probability measures on a compact set of \(\mathbb {R}^2\), the weak topology is metrizable by the distance \(d_w\):
so that \((X_R, d_w)\) is a compact metric space. We summarize the basic properties of J, d and \(d_w\) in the following.
Lemma 3.1
Let \(X_R\), J, d and \(d_w\) be defined as above then we have:
1. J is Lipschitz continuous for \(d_w\),
2. \(d_w\le d\),
3. d is lower semicontinuous for \(d_w\): if \(({\varvec{\nu }}_n)_n\) is a sequence in \(X_R\), \(({\varvec{\nu }}, {\varvec{\theta }})\in X_R\times X_R\) and \(\lim _n d_w({\varvec{\nu }}_n, {\varvec{\nu }})=0\) then \(\liminf _n d^2({\varvec{\nu }}_n, {\varvec{\theta }})\ge d^2({\varvec{\nu }}, {\varvec{\theta }})\).
Proof
Let us recall that if \(\theta \) and \(\nu \) are (compactly supported say) probability measures on \(\mathbb {R}^d\) then by Cauchy–Schwarz inequality,
and, it follows from (3.1) that, if f is M-Lipschitz then
Moreover,
1. Let us rewrite J as
$$\begin{aligned} J({\varvec{\nu }})=\frac{1}{4} J_0({\varvec{\nu }})+J_1({\varvec{\nu }}), \end{aligned}$$
with
$$\begin{aligned} J_0({\varvec{\nu }}):=\int _{K^2} \vert x-y\vert \text{ d } ({\varvec{\nu }}\otimes \mu )(a,x) \text{ d } ({\varvec{\nu }}\otimes \mu )(b,y), \end{aligned}$$(3.10)
and
$$\begin{aligned} J_1({\varvec{\nu }}):=\int _K \Big (\frac{1}{2}-a\Big )x \text{ d } ({\varvec{\nu }}\otimes \mu )(a,x). \end{aligned}$$(3.11)
The fact that \(J_1\) is Lipschitz for \(d_w\) directly follows from (3.7), (3.8) and the fact that the integrand in \(J_1\) is uniformly Lipschitz in x. As for \(J_0\), using also (3.9) and the fact that the distance is 1-Lipschitz, we have
$$\begin{aligned} \begin{array}{ll} J_0({\varvec{\nu }})-J_0({\varvec{\theta }})\le W_1(({\varvec{\nu }}\otimes \mu ) \otimes ({\varvec{\nu }}\otimes \mu ), ({\varvec{\theta }}\otimes \mu ) \otimes ({\varvec{\theta }}\otimes \mu ))\\ \quad \le 2 W_2({\varvec{\nu }}\otimes \mu , {\varvec{\theta }}\otimes \mu )=2d_w({\varvec{\nu }}, {\varvec{\theta }}). \end{array} \end{aligned}$$
2. Let \({\varvec{\nu }}=(\nu ^a)_{a\in A}\) and \({\varvec{\theta }}=(\theta ^a)_{a\in A}\) be two elements of \(X_R\) and let \(\gamma ^a\) be an optimal plan between \(\nu ^a\) and \(\theta ^a\) (which can be chosen in a \(\mu \)-measurable way, thanks to standard measurable selection arguments, see [14]). Let us then define the probability measure \(\alpha \) on \(K^2\) by
$$\begin{aligned}&\int _{K\times K} \phi \big ((a,x), (b, y)\big ) \text{ d } \alpha (a,x, b,y)\\&\quad :=\int _A \Big (\int _{[-R,R]^2} \phi \big ((a,x), (a,y)\big ) \text{ d } \gamma ^a(x,y) \Big ) \text{ d } \mu (a) \end{aligned}$$
for all \(\phi \in C(K\times K)\). Observing that \(\alpha \in \Pi ({\varvec{\nu }}\otimes \mu , {\varvec{\theta }}\otimes \mu )\), we get
$$\begin{aligned} d^2_w({\varvec{\nu }}, {\varvec{\theta }})&\le \int _{K\times K} \vert x-y\vert ^2 \text{ d } \alpha (a,x, b,y) =\int _A \Big ( \int _{[-R,R]^2} \vert x-y\vert ^2 \text{ d } \gamma ^a (x,y) \Big )\text{ d } \mu (a)\\&= \int _A W_2^2 (\nu ^a, \theta ^a) \text{ d } \mu (a)=d^2({\varvec{\nu }}, {\varvec{\theta }}). \end{aligned}$$
3. Let \(\gamma _n^a\) be an optimal plan (\(\mu \)-measurable with respect to a) between \(\nu _n^a\) and \(\theta ^a\). Passing to a subsequence if necessary, we may assume that \(\gamma _n^a\otimes \mu \) weakly-\(*\) converges to some measure of the form \(\gamma ^a \otimes \mu \). Using test-functions of the form \(\psi (a)(\alpha (x)+\beta (y))\) we deduce easily that for \(\mu \)-almost every a, \(\gamma ^a\in \Pi (\nu ^a, \theta ^a)\) and then
$$\begin{aligned}&\liminf _n d^2({\varvec{\nu }}_n, {\varvec{\theta }}) =\liminf _n \int _{A} \int _{[-R,R]^2} \vert x-y \vert ^2 \text{ d } \gamma _n^a (x,y) \text{ d }\mu ( a)\\&\quad = \int _A\int _{[-R,R]^2} \vert x-y \vert ^2 \text{ d } \gamma ^a (x,y) \text{ d }\mu ( a) \ge d^2({\varvec{\nu }}, {\varvec{\theta }}). \end{aligned}$$
3.2 Subdifferential of the energy and gradient flows as measure solutions
Let us start with some convexity properties of J. Let \({\varvec{\nu }}=(\nu ^a)_{a\in A}\) and \({\varvec{\theta }}\) belong to \(X_R\) and let \({\varvec{\gamma }}:=(\gamma ^a)_{a\in A}\) be a measurable family of transport plans between \(\nu ^a\) and \(\theta ^a\) [which we shall simply denote by \({\varvec{\gamma }}\in \Pi ({\varvec{\nu }}, {\varvec{\theta }})\)]. For \(\varepsilon \in [0,1]\), then define
where \(\pi _1\) and \(\pi _2\) are the canonical projections \(\pi _1(x,y)=x\), \(\pi _2(x,y)=y\). Then \(\varepsilon \in [0,1]\mapsto {\varvec{\nu }}_\varepsilon \) is a curve which interpolates between \({\varvec{\nu }}\) and \({\varvec{\theta }}\). Similarly if we take transport plans \(\gamma ^a\) induced by maps of the form \({{\mathrm{id}}}+\xi ^a\) with \({\varvec{\xi }}=(\xi ^a)_{a\in A} \in L^{\infty } ({\varvec{\nu }}\otimes \mu )\) i.e. \(\theta ^a=({{\mathrm{id}}}+\xi ^a)_\#\nu ^a\) then \(\nu _\varepsilon ^a=({{\mathrm{id}}}+\varepsilon \xi ^a)_\#\nu ^a\) and in this case, we shall simply denote \({\varvec{\xi }}:=(\xi ^a)_{a\in A}\) and \({\varvec{\nu }}_\varepsilon \) as
Lemma 3.2
Let \({\varvec{\nu }}\) and \({\varvec{\theta }}\) be in \(X_R\), \({\varvec{\gamma }}\in \Pi ({\varvec{\nu }}, {\varvec{\theta }})\) and \({\varvec{\nu }}_\varepsilon \) be given by (3.12). Then
In particular, the same inequality holds if \({\varvec{\nu }}_\varepsilon =({\varvec{{{\mathrm{id}}}}}+\varepsilon {\varvec{\xi }})_\#{\varvec{\nu }}\) with \({\varvec{\xi }}\in L^{\infty } ({\varvec{\nu }}\otimes \mu )\).
Proof
This immediately follows from the construction of \({\varvec{\nu }}_\varepsilon \), the convexity of the absolute value in \(J_0\) defined by (3.10) and the linearity in x of the integrand in \(J_1\) defined by (3.11). \(\square \)
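The convexity statement can be checked numerically on a finite family of atoms. The sketch below assumes that the inequality of Lemma 3.2 reads \(J({\varvec{\nu }}_\varepsilon )\le (1-\varepsilon )J({\varvec{\nu }})+\varepsilon J({\varvec{\theta }})\), takes \({\varvec{\nu }}\otimes \mu \) to be a finite sum of atoms (a, x, m), and uses the decomposition \(J=\frac{1}{4}J_0+J_1\) from (3.10) and (3.11). The names `energy` and `interpolate` are hypothetical.

```python
def energy(atoms):
    """J = (1/4) J_0 + J_1 for a finite list of atoms (label a, position x, mass m)."""
    j0 = sum(mi * mj * abs(xi - xj)
             for (_, xi, mi) in atoms for (_, xj, mj) in atoms)
    j1 = sum(m * (0.5 - a) * x for (a, x, m) in atoms)
    return 0.25 * j0 + j1


def interpolate(atoms, targets, eps):
    """Move each atom a fraction eps of the way to its target, keeping label and mass."""
    return [(a, (1.0 - eps) * x + eps * y, m)
            for (a, x, m), y in zip(atoms, targets)]


# Two atoms of mass 1/2 that swap positions, so they meet at eps = 1/2.
atoms = [(0.2, -1.0, 0.5), (0.9, 1.0, 0.5)]
targets = [1.0, -1.0]

j_start = energy(atoms)
j_end = energy(interpolate(atoms, targets, 1.0))
gaps = [(1.0 - e) * j_start + e * j_end - energy(interpolate(atoms, targets, e))
        for e in (0.0, 0.25, 0.5, 0.75, 1.0)]
```

Each entry of `gaps` is the amount by which the chord lies above the energy along the interpolation. All entries are nonnegative in this example, and the gap is largest at \(\varepsilon =1/2\), where the two atoms coincide and the interaction term vanishes.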
Remark 3.3
The convexity Lemma 3.2 holds along the interpolation \({\varvec{\nu }}_\varepsilon \) given by any transportation plan \(\gamma ^a\) between \(\nu ^a\) and \(\theta ^a\). It is in particular true when, in addition, \(\gamma ^a\) is required to be an optimal plan; in such a case, it is easy to see that the interpolation \(\varepsilon \in [0,1]\mapsto {\varvec{\nu }}_\varepsilon \) is a geodesic between \({\varvec{\nu }}\) and \({\varvec{\theta }}\). In other words, J is convex along geodesics (but does not satisfy any strong convexity property along geodesics).
Definition 3.4
Let \({\varvec{\nu }}\in X_R\), the subdifferential of J at \({\varvec{\nu }}\), denoted \(\partial J({\varvec{\nu }})\), consists of all \({\varvec{w}}:=(w^a)_{a\in A} \in L^1({\varvec{\nu }}\otimes \mu )\) such that for every \(R'>0\), every \({\varvec{\theta }}\in X_{R'}\) and every \({\varvec{\gamma }}=(\gamma ^a)_{a\in A} \in \Pi ({\varvec{\nu }}, {\varvec{\theta }})\), one has
Remark 3.5
An equivalent way to define \(\partial J({\varvec{\nu }})\) (which will turn out to be more convenient in the sequel to prove stability properties, see Lemma 4.4) is in terms of transition kernels rather than of transport plans. More precisely, given \({\varvec{\nu }}\in X_R\), we define the set \(T({\varvec{\nu }})\) of \({\varvec{\nu }}\otimes \mu \) measurable maps \({\varvec{\eta }}\): \((a,y)\in K\mapsto \eta ^{a,y}\in \mathcal{P}(\mathbb {R})\) such that there exists an \(R'>0\) such that \(\eta ^{a,y}\) is supported by \([-R', R']\) for \({\varvec{\nu }}\otimes \mu \) almost every \((a,y)\in K\). We then define \({\varvec{\nu }}_{{\varvec{\eta }}}=(\nu ^a_{{\varvec{\eta }}})_{a\in A}\) by
By construction, \({\varvec{\gamma }}=(\gamma ^a)_{a\in A}\) with \(\gamma ^a =\nu ^a \otimes \eta ^{a,y}\) defined by
belongs to \(\Pi ({\varvec{\nu }}, {\varvec{\nu }}_{{\varvec{\eta }}})\) and thanks to the disintegration Theorem, it is then easy to check that \({\varvec{w}}\in \partial J({\varvec{\nu }})\) if and only if, for every \({\varvec{\eta }}\in T({\varvec{\nu }})\), one has
Remark 3.6
If we restrict ourselves to transport maps [i.e. take \(\eta ^{a,y}=\delta _{\xi ^a(y)}\) in (3.13)], we obtain a condition which is weaker than Definition 3.4 but somewhat easier to handle. If \({\varvec{w}}:=(w^a)_{a\in A} \in L^1({\varvec{\nu }}\otimes \mu )\) belongs to \(\partial J({\varvec{\nu }})\), then for every \({\varvec{\xi }}=(\xi ^a)_{a\in A} \in L^{\infty } ({\varvec{\nu }}\otimes \mu )\), one has
Remark 3.7
The subdifferential \(\partial J\) obviously has the following monotonicity property (which will be crucial for uniqueness, see Sect. 5): if \({\varvec{\nu }}_1\) and \({\varvec{\nu }}_2\) belong to \(X_R\) and \({\varvec{w}}_1\in \partial J({\varvec{\nu }}_1)\) and \({\varvec{w}}_2\in \partial J({\varvec{\nu }}_2)\), then for every \({\varvec{\gamma }}\in \Pi ({\varvec{\nu }}_1, {\varvec{\nu }}_2)\), one has
The connection between the subdifferential [in fact the weak condition (3.14)] of the energy J given by (3.3) and the condition (2.17) is clarified by the following:
Proposition 3.8
Let \({\varvec{\nu }}\in X_R\), if \({\varvec{w}}\in \partial J({\varvec{\nu }})\) then, defining the x-marginal of \({\varvec{\nu }}\otimes \mu \) by
and its cumulative distribution function by
we have
In particular \({\varvec{w}}\in L^{\infty } ({\varvec{\nu }}\otimes \mu )\) with
Proof
Let \({\varvec{\xi }}\in L^{\infty } ({\varvec{\nu }}\otimes \mu )\) and define \({\varvec{\nu }}_\varepsilon :=({\varvec{{{\mathrm{id}}}}}+\varepsilon {\varvec{\xi }})_\# {\varvec{\nu }}\) for \(\varepsilon \in [0,1]\). Since \({\varvec{w}}\in \partial J({\varvec{\nu }})\) we have in particular
Defining \(J_0\) and \(J_1\) as in (3.10) and (3.11) and \(K:=A\times [-R, R]\) , first we have
We then write
with
Observing that \(\eta _\varepsilon \) is bounded by \(2 \Vert {\varvec{\xi }}\Vert _{L^{\infty } ({\varvec{\nu }}\otimes \mu )}\) and that
by Lebesgue’s dominated convergence theorem, we get
with \(I_0\) given by (3.19), and
and
To compute \(I_1\) we observe that thanks to Fubini’s theorem
Treating similarly the integral on \(\{x<y\}\) we thus get
As for \(I_2\), we have
then we use Fubini’s theorem to get
Note that in the previous integral, the integration with respect to x is actually a discrete sum, because the set of atoms where \(G>G^{-}\) is at most countable since G is nondecreasing; let us denote this set by
where I is at most countable. Similarly for the second term in the right hand side of (3.27) observing that \(\vert \xi ^b(x)\vert \int _A \nu ^b(\{x\}) \text{ d } \mu (b) \le \Vert {\varvec{\xi }}\Vert _{L^{\infty } ({\varvec{\nu }}\otimes \mu )} (G(x)-G^{-}(x))\), we only have to integrate in x over S which gives
so that
Putting together (3.18), (3.19), (3.23), (3.26) and (3.28) we arrive at the inequality
which holds for any \({\varvec{\xi }}\in L^{\infty } ({\varvec{\nu }}\otimes \mu )\) and (3.16) obviously follows. \(\square \)
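The content of Proposition 3.8 can be illustrated on finitely many atoms at distinct positions, where J is differentiable in each position. The sketch below assumes that (3.16) reads \(G^{-}(x)-a\le w^a(x)\le G(x)-a\); it compares a finite-difference slope of J (per unit mass) with the midpoint value \(\frac{1}{2}(G+G^{-})-a\). The atoms, the step size and the function names are hypothetical.

```python
def energy(atoms):
    """J = (1/4) J_0 + J_1 for a finite list of atoms (label a, position x, mass m)."""
    j0 = sum(mi * mj * abs(xi - xj)
             for (_, xi, mi) in atoms for (_, xj, mj) in atoms)
    j1 = sum(m * (0.5 - a) * x for (a, x, m) in atoms)
    return 0.25 * j0 + j1


def cdf_values(atoms, x):
    """Left limit G^-(x) and value G(x) of the cumulative distribution of the atoms."""
    g_minus = sum(m for (_, y, m) in atoms if y < x)
    g = sum(m for (_, y, m) in atoms if y <= x)
    return g_minus, g


def slope(atoms, i, step=1e-6):
    """Central finite difference of J in the position of atom i, divided by its mass."""
    a, x, m = atoms[i]
    up = atoms[:i] + [(a, x + step, m)] + atoms[i + 1:]
    down = atoms[:i] + [(a, x - step, m)] + atoms[i + 1:]
    return (energy(up) - energy(down)) / (2.0 * step * m)


atoms = [(0.1, -1.0, 0.25), (0.7, 0.0, 0.5), (1.2, 2.0, 0.25)]
report = []
for i, (a, x, _) in enumerate(atoms):
    g_minus, g = cdf_values(atoms, x)
    report.append((slope(atoms, i), 0.5 * (g + g_minus) - a, g_minus - a, g - a))
```

Each tuple in `report` holds the numerical slope, the midpoint value and the two bounds. In this example the slope agrees with the midpoint value and therefore lies between the bounds, so the corresponding velocity \(-w\) lies between \(a-G\) and \(a-G^{-}\).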
Definition 3.9
A gradient flow of J on the time interval [0, T] starting from \({\varvec{\nu }}_0\) is a Lipschitz continuous (for d) curve \(t\in [0,T]\mapsto {\varvec{\nu }}(t)=(\nu (t)^a)_{a\in A} \in X_R\) together with a measurable map \(t\in [0,T]\mapsto {\varvec{v}}(t) \in L^1({\varvec{\nu }}\otimes \mu )\) such that \({\varvec{v}}(t) \in -\partial J({\varvec{\nu }}(t))\) for almost every \(t\in [0,T]\), and for \(\mu \)-almost every \(a\in A\), \(t\mapsto \nu (t)^a\) is a solution in the sense of distributions of the continuity equation (2.16).
It then follows from Proposition 3.8 that gradient flows starting from \({\varvec{\nu }}_0\) are measure solutions of the system (2.16)–(2.18). Note also that thanks to the bound (3.17), gradient flows are not only absolutely continuous but automatically Lipschitz for d and even more is true: for \(\mu \)-almost every a, the curve \(t\mapsto \nu _t^a\) is Lipschitz for \(W_2\), more precisely
4 Existence by the JKO scheme
We will prove existence of a gradient flow curve on the time interval [0, T] starting from \({\varvec{\nu }}_0=(\nu _0^a)_{a\in A}\) by considering the JKO scheme. Given a time step \(\tau >0\), starting from \({\varvec{\nu }}_0\), we construct inductively a sequence \({\varvec{\nu }}_k\) by
for \(k=0, \cdots , N\) with \(N:=[\frac{T}{\tau }]\).
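One step of such a scheme can be sketched in a particle setting, under strong simplifying assumptions that are not taken from the paper: \(\mu \) is the counting measure on n labels, each \(\nu ^{a_i}\) is a single atom of mass 1/n at position \(x_i\), and the step minimizes \(\frac{1}{2\tau }d^2(\cdot , {\varvec{\nu }}_k)+J\) with J as in (3.10) and (3.11). In that setting the minimization reduces to an isotonic regression, solved here by the pool-adjacent-violators algorithm; this reduction is a technique borrowed from convex clustering, not the authors' argument, and all function names are hypothetical.

```python
def pool_adjacent_violators(values):
    """Nondecreasing least-squares fit (isotonic regression) with equal weights."""
    blocks = []  # each block is [sum of values, number of values]
    for val in values:
        blocks.append([val, 1])
        # Merge while the previous block mean exceeds the last block mean.
        while len(blocks) > 1 and (blocks[-2][0] / blocks[-2][1]
                                   > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted


def jko_objective(y, x, a, tau):
    """(1/(2 tau)) d^2 + J for one atom of mass 1/n per label."""
    n = len(x)
    h = 1.0 / n
    kinetic = sum(h * (yi - xi) ** 2 for yi, xi in zip(y, x)) / (2.0 * tau)
    interaction = 0.25 * h * h * sum(abs(yi - yj) for yi in y for yj in y)
    linear = h * sum((0.5 - ai) * yi for ai, yi in zip(a, y))
    return kinetic + interaction + linear


def jko_step(x, a, tau):
    """Exact minimizer of jko_objective(., x, a, tau)."""
    n = len(x)
    # Free drift coming from the linear term of J.
    z = [xi + tau * (ai - 0.5) for xi, ai in zip(x, a)]
    order = sorted(range(n), key=lambda i: z[i])
    lam = tau / (2.0 * n)
    # In sorted order the interaction term is linear with slopes 2r - n - 1.
    shifted = [z[i] - lam * (2 * (r + 1) - n - 1) for r, i in enumerate(order)]
    fitted = pool_adjacent_violators(shifted)
    y = [0.0] * n
    for r, i in enumerate(order):
        y[i] = fitted[r]
    return y


x = [0.0, 1.0]
a = [1.2, 0.0]
tau = 1.0
y = jko_step(x, a, tau)
```

In this example the faster atom starts behind the slower one and the step pools them at a common position, which is the discrete counterpart of an atom forming in \(\rho \). Since the objective is convex, minimality can be checked by comparing with nearby configurations.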
4.1 Estimates
The first step in proving that this scheme is well-defined consists in showing that one can a priori bound the support. This is based on the following basic result (which we state in any dimension d even though, in the sequel, we will only apply it when \(d=1\)):
Lemma 4.1
Let \(R_0\), R and \(\tau \) be positive constants, \(\nu _0\) be a probability measure on \(\mathbb {R}^d\) with support in \(B_{R_0}\) and \(\nu \in \mathcal{P}_2(\mathbb {R}^d)\). Let P be the projection onto \(B_{R_0+\tau R}\) and define \({\hat{\nu }}:=P_\# \nu \). Then, for every \(a\in B_R\), one has
Proof
Fix an optimal transport plan between \(\nu _0\) and \(\nu \) i.e. a \(\gamma \in \Pi (\nu _0, \nu )\) such that \(W_2^2(\nu , \nu _0)=\int _{\mathbb {R}^d\times \mathbb {R}^d} \vert x-y \vert ^2 \text{ d } \gamma (x,y)\). Since the map \((x,y)\mapsto (x, P(y))\) pushes forward \(\gamma \) to a plan having \(\nu _0\) and \({\hat{\nu }}\) as marginals, we have
and then
But since \(\gamma \)-a.e. \(x+\tau a \in B_{R_0+\tau R}\), we get that the integrand in the right-hand side is nonpositive by the well-known characterization of the projection onto \(B_{R_0+\tau R}\). \(\square \)
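The effect of the projection can be checked numerically in dimension one. The sketch below assumes that the inequality of Lemma 4.1 reads \(\frac{1}{2\tau }W_2^2(\nu _0, {\hat{\nu }})-a\int x \text{ d } {\hat{\nu }}\le \frac{1}{2\tau }W_2^2(\nu _0, \nu )-a\int x \text{ d } \nu \), which is an inference from the proof rather than a quotation, and it uses empirical measures with equally weighted atoms. The sample values and function names are hypothetical.

```python
def w2_squared(xs, ys):
    """W_2^2 between two equally weighted empirical measures on the line."""
    n = len(xs)
    return sum((x - y) ** 2 for x, y in zip(sorted(xs), sorted(ys))) / n


def project(points, radius):
    """Push the points forward by the projection onto [-radius, radius]."""
    return [max(-radius, min(radius, p)) for p in points]


def penalized(nu0, nu, a, tau):
    """(1/(2 tau)) W_2^2(nu0, nu) - a * mean(nu), the quantity compared in the check."""
    return w2_squared(nu0, nu) / (2.0 * tau) - a * sum(nu) / len(nu)


R0, R, TAU = 1.0, 2.0, 0.5
nu0 = [-1.0, 0.0, 1.0]        # supported in [-R0, R0]
nu = [-5.0, 0.5, 7.0]         # spreads far outside the ball
nu_hat = project(nu, R0 + TAU * R)

# Gain obtained by projecting, for several labels a with |a| <= R.
gains = [penalized(nu0, nu, a, TAU) - penalized(nu0, nu_hat, a, TAU)
         for a in (-2.0, -1.0, 0.0, 1.0, 2.0)]
```

Every entry of `gains` is nonnegative here, in line with the lemma: replacing \(\nu \) by its projection onto the ball of radius \(R_0+\tau R\) never increases the penalized quantity when \(\vert a\vert \le R\).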
Now consider the first step of the JKO scheme. Since \(\nu _0^a\) is supported by \([-R_x, R_x]\), for every \(a\in A\) and \(a\in A\Rightarrow \vert a \vert \le R_v +1\), the previous lemma implies that if one replaces \({\varvec{\nu }}=(\nu ^a)_{a\in A}\in X\) by \(\hat{{\varvec{\nu }}}=(\hat{\nu }^a)_{a\in A}\) defined for every a by \(\hat{\nu }^a=P_\#\nu ^a\) where P is the projection on \([-R_x-\tau (R_v+3/2), R_x+\tau (R_v+3/2)]\), one has
As for the interaction term, it is also improved by replacing \({\varvec{\nu }}\) by \(\hat{{\varvec{\nu }}}\); this is obvious from the expression (3.4) and the fact that P is 1-Lipschitz. In the first step of the JKO scheme, we may therefore impose the constraint that \({\varvec{\nu }}\in X_{R_x+\tau (R_v+3/2)}\). After k steps, we may similarly impose that the minimization is performed on \(X_{R_x+k \tau (R_v+3/2)}\), so simply setting
we may replace (4.1) with a bound on the support:
By a direct application of Lemma 3.1 and the compactness of \((X_R, d_w)\), we then see that the minimizing scheme (4.3) is well-defined and actually defines a sequence \({\varvec{\nu }}_k\), \(k=0,\ldots , N+1\). We also extend this sequence by piecewise constant in time interpolation:
In the following basic estimates, C will denote a constant (possibly depending on T) which may vary from one line to the other. By construction, for all \(k=0,\ldots , N\), we have
Summing and using the fact that every \({\varvec{\nu }}_k\) belongs to \(X_R\) and that J is bounded from below on \(X_R\) we get:
From (4.6), the Cauchy–Schwarz inequality and Lemma 3.1 we classically get a uniform Hölder estimate:
Since \((X_R, d_w)\) is a compact metric space, it follows from a refined variant of the Ascoli–Arzelà theorem (see [3]) that there exists a limit curve
and a vanishing sequence of time-steps \(\tau _n \rightarrow 0\) as \(n\rightarrow +\infty \) such that
4.2 Discrete Euler–Lagrange equation
Let \({\varvec{\gamma }}_{k+1}=(\gamma _{k+1}^a)_{a\in A} \in \Pi ({\varvec{\nu }}_{k}, {\varvec{\nu }}_{k+1})\) be such that \(\gamma ^a_{k+1}\) is an optimal plan for \(\mu \)-almost every a and let \(v_{k+1}^a\) be defined by
for all \(\xi \in C([-R, R])\), or equivalently, disintegrating \(\gamma _{k+1}^a\) with respect to its second marginal \(\nu _{k+1}^a\) as \(\text{ d } \gamma _{k+1}^a(x,y)= \text{ d } \gamma _{k+1}^{a,y}(x) \otimes \text{ d } \nu _{k+1}^a(y)\):
The Euler–Lagrange equation for (4.1) can then be written as
Lemma 4.2
Let \({\varvec{\nu }}_{k+1}\) be a solution of (4.1), \({\varvec{\gamma }}_{k+1} \in \Pi ({\varvec{\nu }}_{k}, {\varvec{\nu }}_{k+1})\) and \({\varvec{v}}_{k+1}\) be constructed as above, then:
Proof
Let \(R'>0\), \({\varvec{\theta }}\in X_{R'}\) and \({\varvec{\gamma }}\in \Pi ({\varvec{\nu }}_{k+1}, {\varvec{\theta }})\), and define for \(\varepsilon \in [0,1]\)
Then by optimality of \({\varvec{\nu }}_{k+1}\) and using Lemma 3.2, we have
We have already disintegrated the optimal plan \(\gamma _{k+1}^a \) between \(\nu _k^a\) and \(\nu _{k+1}^a\) as
Let us also disintegrate the (arbitrary) plan \(\gamma ^a \) between \(\nu _{k+1}^a\) and \(\theta ^a\) as:
Define then the 3-plan \(\beta ^a\) by \(\beta ^a =( \gamma _{k+1}^{a,y} \otimes \gamma ^{a,y} )\otimes \nu _{k+1}^a\) i.e.
for every \(\phi \in C(\mathbb {R}^3)\). Setting
we have by construction, \({\pi _{12}}_\# \beta ^a =\gamma _{k+1}^a\), \({\pi _{23}}_\#\beta ^a=\gamma ^a\). By the very definition of \(\nu _\varepsilon ^a\), we also have \((\pi _1, (1-\varepsilon )\pi _2+\varepsilon \pi _3)_\#\beta ^a \in \Pi (\nu _k^a, \nu _\varepsilon ^a)\) so that
and
Using Lebesgue’s dominated convergence Theorem and recalling the definition of \(\beta ^a\) and \(v_{k+1}^a\) we then get
This yields
i.e. \({\varvec{v}}_{k+1}\in -\partial J ({\varvec{\nu }}_{k+1})\). \(\square \)
Let us also extend \(v_{k+1}\) by piecewise constant interpolation
so that, thanks to the previous Lemma, we have
Thanks to Proposition 3.8, note that \(\sup _{t\in [0,T]} \Vert {\varvec{v}}_{\tau }(t)\Vert _{L^{\infty }({\varvec{\nu }}_\tau (t)\otimes \mu )} \le C\); we can then define the time-dependent family of signed measures
Denoting by \(\lambda \) the one-dimensional Lebesgue measure on [0, T], we may assume, taking a subsequence if necessary, that the bounded family of measures \({\varvec{q}}_{\tau _n} \otimes \mu \otimes \lambda \) converges weakly \(*\) to some bounded signed measure on \([-R,R]\times A\times [0,T]\) which is necessarily of the form \({\varvec{q}}\otimes \mu \otimes \lambda \) because marginals (with respect to the a and t variables) are stable under weak limits. Since \(\vert {\varvec{q}}_{\tau _n}\vert \otimes \mu \otimes \lambda \le C {\varvec{\nu }}_{\tau _n} \otimes \mu \otimes \lambda \) and \({\varvec{\nu }}_{\tau _n}\otimes \mu \) converges weakly \(*\) to \({\varvec{\nu }}\otimes \mu \), we have \(\vert {\varvec{q}}\vert \otimes \mu \otimes \lambda \le C {\varvec{\nu }}\otimes \mu \otimes \lambda \). Hence, for \(\mu \otimes \lambda \)-a.e. (a, t), the limit satisfies \(\vert q(t)^a \vert \le C \nu (t)^a\) and therefore can be written in the form \(\text{ d } q(t)^a=v(t)^a \text{ d } \nu ^a(t)\) (\({\varvec{q}}= {\varvec{v}}{\varvec{\nu }}\) for short) with \(\Vert {\varvec{v}}(t)\Vert _{L^{\infty }({\varvec{\nu }}(t)\otimes \mu )} \le C\) for \(\lambda \)-a.e. \(t\in [0,T]\). We thus have
In other words, for every \(\phi \in C([0,T]\times A\times [-R, R])\) we have
4.3 Existence by passing to the limit
Our task now consists in showing that the limit curve \(t\mapsto {\varvec{\nu }}(t)\) is a gradient flow solution associated to the velocity \(t\mapsto {\varvec{v}}(t)\) constructed above. Let us first check that it satisfies the system of continuity equations (2.16). To do so, take test functions \(\psi \in C(A)\) and \(\phi \in C^2([0,T]\times [-R,R])\) and let us consider
Then, we rewrite
Using the optimal plans \(\gamma ^a_{k+1}\) as in Lemma 4.2, we then rewrite
A Taylor expansion gives
Integrating and using the optimality of \(\gamma ^a_{k+1}\) gives
and then, recalling (4.6) we have
Recalling the definition of the discrete velocity \(v_{k+1}\) from Lemma 4.2, we can rewrite
hence by definition of \({\varvec{\nu }}_\tau \) and \({\varvec{v}}_\tau \)
Now thanks to (4.8), we have
and
where we use in the above limits that \(\nu _{N}^a=\nu ^a_{\tau _n}(N\tau _n)\) and \(\nu ^a_1=\nu ^a_{\tau _n}(\tau _n)\). Putting the previous computations together, summing and using (4.15), (4.14), (4.16), we thus obtain
where \(\varepsilon _{\tau _n}\) goes to 0 as \(n\rightarrow +\infty \). Taking \(\tau =\tau _n\), using (4.8) and (4.13) and letting \(n\rightarrow +\infty \) in the previous identity we get
In other words, we have proved the following:
Lemma 4.3
For \(\mu \)-almost every a, the limit curve \(t\mapsto \nu (t)^a\) solves the continuity equation (2.16) associated to the limit velocity \(t\mapsto v(t)^a\).
It remains to check that
Lemma 4.4
For a.e. \(t\in [0,T]\), we have \({\varvec{v}}(t)\in -\partial J({\varvec{\nu }}(t))\).
Proof
By construction of the curves \({\varvec{v}}_\tau \) and \({\varvec{\nu }}_\tau \) and thanks to Lemma 4.2, we have seen in (4.12) that
which means that for every \(\tau >0\), every \(t\in [0,T]\) and every \({\varvec{\eta }}\in T({\varvec{\nu }}_\tau (t))\) (as defined in Remark 3.5), we have
We wish to prove that there exists \(S\subset [0,T]\), \(\lambda \)-negligible, such that for every \(t\in [0,T]\setminus S\) and every \(\eta \in T({\varvec{\nu }}(t))\), one has
To pass to the limit \(\tau =\tau _n\), \(n\rightarrow \infty \) in (4.17) to obtain (4.18), we shall proceed in several steps. Let us remark that it is enough to prove (4.18) when \(\eta ^{a,y}\) is supported by a fixed compact interval \([-R', R']\) (and then to take an exhaustive sequence of such compact intervals). Let us also recall that, thanks to Lemma 3.1 and (4.8), \(J({\varvec{\nu }}_{\tau _n}(t))\) converges to \(J({\varvec{\nu }}(t))\) as \(n\rightarrow \infty \) uniformly on [0, T].
Step 1: Let us first consider the case where \({\varvec{\eta }}\) is continuous in the sense that \((a,y)\in K \mapsto \int _{[-R', R']} \varphi (z) \text{ d } \eta ^{a,y}(z)\) is continuous for every \(\varphi \in C(\mathbb {R})\). Let \(\phi \in C(A\times \mathbb {R})\). Since \(\varphi _{\varvec{\eta }}\) defined by \(\varphi _{\varvec{\eta }}(a,y):=\int \phi (a,z) \text{ d } \eta ^{a,y}(z)\) belongs to C(K), using the fact that
and (4.8), we deduce that \(\lim _n d_w({\varvec{\nu }}_{\tau _n}(t)_{\varvec{\eta }}, {\varvec{\nu }}(t)_{\varvec{\eta }})=0\) for every \(t\in [0,T]\). Hence, thanks to Lemma 3.1, we have
Let \(\varphi \in C([0,T])\), \(\varphi \ge 0\). Using (4.17) gives
where
belongs to C(K). We then deduce from (4.13), (4.19) and Lebesgue’s dominated convergence that
This implies that there exists a negligible subset \(S_{\varvec{\eta }}\) of [0, T] outside which (4.18) holds.
Step 2: For every \(N\in {\mathbb N}^*\), let \(\Delta _N:=\{(\alpha _0, \cdots , \alpha _{2N-1}) \in \mathbb {R}_+^{2N} \; : \; \sum _{k=0}^{2N-1} \alpha _k=1\}\), \(F_N\) be a countable and dense family in \(C(K, \Delta _N)\), and consider
where for \(k=0, \ldots , 2N-1\), \(z_k^N\) denotes the midpoint of the interval \([-R'+kR'/N, -R'+(k+1)R'/N]\). Since D is countable and its elements belong to \(C(K, (\mathcal{P}([-R', R']), W_2))\), it follows from Step 1, that (4.18) holds for every \({\varvec{\eta }}\in D\) and every \(t\in [0,T]\setminus S\) where S is the \(\lambda \)-negligible set
Step 3: Let \(t\in [0,T]\setminus S\), and \({\varvec{\eta }}\in T({\varvec{\nu }})\) having its support in \([-R', R']\). Note that now we are working with a fixed t so that we just have to suitably approximate \({\varvec{\eta }}\) by a sequence in D. For \(N\in {\mathbb N}^*\), first define for every \((a,y)\in K\) the discrete measure
where \(I_k^N\) is the interval \([-R'+kR'/N, -R'+(k+1)R'/N)\) if \(k=0, \ldots , 2N-2\) and \(I_{2N-1}^N:=[R'(1-1/N), R']\). We then have
The function \((f_k^N)_{k=0, \ldots , 2N-1}\) is not continuous but belongs to \(L^1({\varvec{\nu }}(t)\otimes \mu , \Delta _N)\). Since \(C(K, \Delta _N)\) is dense in \(L^1({\varvec{\nu }}(t)\otimes \mu , \Delta _N)\), there exist \((g_0^N, \ldots , g_{2N-1}^N)\in C(K, \Delta _N)\) such that
Since we have chosen \(F_N\) dense in \(C(K, \Delta _N)\), there exist \(\alpha =(\alpha _0^N, \ldots , \alpha _{2N-1}^N)\in F_N\) such that
We then define \(\eta _N\in D\) by
Thanks to Kantorovich duality formula (3.1), it is easy to see that for every \(\alpha \) and \(\beta \) in \(\Delta _N\), \(W_1(\sum _k \alpha _k \delta _{z_k^N}, \sum _k \beta _k \delta _{z_k^N}) \le R' \sum _k \vert \alpha _k-\beta _k \vert \). In particular, thanks to (4.23), we have
Similarly, (4.24) implies that
We know, from Step 2 that for every \(N\in {\mathbb N}^*\):
Thanks to (4.22), (4.25) and (4.26) and the triangle inequality, we have
Recalling that \({\varvec{v}}(t)\in L^{\infty }({\varvec{\nu }}(t)\otimes \mu )\) and using (3.8), we have
so that the right-hand side of (4.27) converges to
as \(N\rightarrow \infty \). As for the convergence of the right-hand side of (4.27), we have to show that \(\lim _N W_1({\varvec{\nu }}_{{\varvec{\eta }}_N}\otimes \mu , {\varvec{\nu }}_{{\varvec{\eta }}}\otimes \mu )=0\). For this, we shall use the Kantorovich-duality formula (3.1) and observe that if \(\phi \in C(K)\) is 1-Lipschitz then
which tends to 0 as \(N\rightarrow \infty \) thanks to (4.28). Using Lemma 3.1 we then have \(\lim _{N\rightarrow \infty } J({{\varvec{\nu }}(t)}_{{\varvec{\eta }}_N})= J({{\varvec{\nu }}(t)}_{{\varvec{\eta }}})\). Passing to the limit \(N\rightarrow \infty \) in (4.27) gives the desired inequality (4.18). This shows that \({\varvec{v}}(t)\in -\partial J({\varvec{\nu }}(t))\) for every \( t\in [0,T]{\setminus } S\). \(\square \)
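The elementary bound \(W_1(\sum _k \alpha _k \delta _{z_k^N}, \sum _k \beta _k \delta _{z_k^N}) \le R' \sum _k \vert \alpha _k-\beta _k \vert \) used in Step 3 can be checked numerically. The sketch below is only an illustration: it relies on the one-dimensional formula expressing \(W_1\) as the integral of the absolute difference of the cumulative distribution functions (rather than on the duality formula used in the text), and the function names are hypothetical.

```python
def midpoints(r_prime, n):
    """Midpoints z_k of the 2n intervals of length r_prime/n covering [-r_prime, r_prime]."""
    return [-r_prime + (k + 0.5) * r_prime / n for k in range(2 * n)]


def w1_same_atoms(z, alpha, beta):
    """W_1 between sum_k alpha_k delta_{z_k} and sum_k beta_k delta_{z_k}.

    Assumes z is increasing and alpha, beta are probability vectors.
    In one dimension, W_1 is the integral of |F - G| where F and G are the
    cumulative distribution functions; both are piecewise constant here.
    """
    total = 0.0
    cdf_gap = 0.0
    for k in range(len(z) - 1):
        cdf_gap += alpha[k] - beta[k]
        total += abs(cdf_gap) * (z[k + 1] - z[k])
    return total


def l1_bound(r_prime, alpha, beta):
    """Right-hand side R' * sum_k |alpha_k - beta_k| of the bound."""
    return r_prime * sum(abs(a - b) for a, b in zip(alpha, beta))
```

The bound follows since the gap between the two cumulative distribution functions never exceeds \(\frac{1}{2}\sum _k \vert \alpha _k-\beta _k \vert \) and the atoms lie in an interval of length at most \(2R'\).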
We deduce from Lemmas 4.3 and 4.4 the following existence result:
Theorem 4.5
If (2.13) holds, then for any \(T>0\), there exists a gradient flow of J starting from \({\varvec{\nu }}_0\) on the time interval [0, T]. In particular, there exist measure solutions to the system (2.16)–(2.18).
5 Uniqueness and concluding remarks
5.1 Uniqueness and stability
Thanks to (3.15), we easily deduce uniqueness and stability:
Theorem 5.1
Let \({\varvec{\nu }}_0\) and \({\varvec{\theta }}_0\) be in \(X_R\). If \(t\mapsto {\varvec{\nu }}(t)\) and \(t\mapsto {\varvec{\theta }}(t)\) are gradient flows of J starting respectively from \({\varvec{\nu }}_0\) and \({\varvec{\theta }}_0\), then
In particular there is a unique gradient flow of J starting from \({\varvec{\nu }}_0\).
Proof
By definition there exist velocity fields \({\varvec{v}}\) and \({\varvec{w}}\) such that for a.e. t, \({\varvec{v}}(t)=(v(t)^a)_{a\in A}\in -\partial J({\varvec{\nu }}(t))\) and \({\varvec{w}}(t)=(w(t)^a)_{a\in A}\in -\partial J({\varvec{\theta }}(t))\) and for \(\mu \)-almost every a, one has
Since \(v^a\) and \(w^a\) are bounded in \(L^{\infty }(\nu ^a)\) and \(L^{\infty }(\theta ^a)\) respectively, it follows from well-known arguments (see [3], in particular Theorem 8.4.7 and Lemma 4.3.4) that \(t\mapsto W_2^2(\nu _t^a, \theta _t^a)\) is a Lipschitz function and that for any family of optimal plans \(\gamma _s^a\) between \(\nu _s^a\) and \(\theta _s^a\) for \(t_1\le t_2\) one has:
Integrating the previous inequality gives
But since \({\varvec{v}}(s)\in -\partial J({\varvec{\nu }}(s))\) and \({\varvec{w}}(s)\in -\partial J({\varvec{\theta }}(s))\) for a.e. s, the monotonicity relation (3.15) gives
We then obtain the desired contraction estimate. \(\square \)
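The contraction mechanism can be observed on a toy example. The sketch below is not the functional J of this paper: it assumes a single species, n equally weighted particles on the line and the convex toy energy equal to half the variance, whose implicit Euler (minimizing movement) step has the closed form used in `implicit_step`. All names are hypothetical. In one dimension the \(W_2\) distance between two such empirical measures is obtained by pairing sorted particles.

```python
def w2_sorted(x, y):
    """W_2 between two empirical measures with the same number of equal-weight atoms on the line."""
    xs, ys = sorted(x), sorted(y)
    return (sum((a - b) ** 2 for a, b in zip(xs, ys)) / len(xs)) ** 0.5


def implicit_step(x, tau):
    """One implicit Euler step for the toy energy (half the variance).

    Solves (x_new_i - x_i)/tau = -(x_new_i - mean); the mean is preserved.
    """
    m = sum(x) / len(x)
    return [(xi + tau * m) / (1.0 + tau) for xi in x]


def distances_along_flow(x0, y0, tau, n_steps):
    """W_2 distance between two discrete flows started from x0 and y0."""
    x, y = list(x0), list(y0)
    dists = [w2_sorted(x, y)]
    for _ in range(n_steps):
        x, y = implicit_step(x, tau), implicit_step(y, tau)
        dists.append(w2_sorted(x, y))
    return dists
```

For this toy flow the recorded distances are nonincreasing in the number of steps, in line with the kind of contraction estimate obtained above for gradient flows of a convex energy.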
5.2 Concluding remarks
5.3 Back to classical solutions, more general initial conditions
Starting from a one-dimensional kinetic model of granular media, we have defined generalized (measure) solutions thanks to a special first-integral and have proven that measure solutions exist globally in time thanks to a gradient flow approach. For classical solutions, as explained in section 2.2 there is an equivalence between the initial kinetic formulation and the system of PDEs (2.12) and (2.11) which enabled us to define weak solutions through (2.16)–(2.18). We also gave an example in section 2.4 which shows that one cannot expect that the spatial cumulative function \(G_t\) remains continuous globally in time even if \(G_0\) is very smooth, but in this example the initial condition is very singular in the velocity variable. If one starts with a more regular initial condition \(f_0\) in the phase space, it is not clear to us whether measure solutions of (2.16)–(2.18) are such that \(G_t\) remains absolutely continuous globally in time [a necessary condition to give a meaning to (1.4)]. In other words, we have defined a notion of generalized solutions to (1.4) and proved a global existence result for the latter but have a priori no guarantee that these generalized solutions have enough regularity to be solutions of (1.4).
We would also like to mention here that in our main results of existence and uniqueness of a gradient flow for J, the assumption that \(\rho _0\) is atomless plays no significant role. Actually, our results hold for any compactly supported initial condition \({\varvec{\nu }}_0\) (we did not investigate the extension to the case where this assumption is relaxed to a second moment bound, but this is probably doable). The assumption that \(\rho _0\) is atomless was used only to select unambiguously the Cauchy datum \(\nu _0^a\). We suspect that in the case where \(\rho _0\) is a discrete measure, there might be an interesting connection between gradient flow solutions (which typically select elements of the subgradient with minimal norm) and some solutions of the initial ODE system (1.2) but a more precise investigation is left for the future.
5.4 Higher dimensions, more general functionals
The motivation for the present work comes from kinetic models of granular media. Since the first integral trick of Sect. 2 is very specific to the quadratic interaction kernel case in dimension one, all our subsequent analysis has been performed in dimension one only. However, it is obvious (but we are not aware of any practical examples in kinetic theory) that our arguments can be used also to study systems of continuity equations in \(\mathbb {R}^d\) for infinitely many species (labeled by a parameter a) such as
which (taking for instance W symmetric \(W(a,b,x,y)=W(b,a,y,x)\)), can be seen as the gradient flow of
References
Agueh, M.: Local existence of weak solutions to kinetic models of granular media. Arch. Ration. Mech. Anal. (2016) (in press)
Agueh, M., Carlier, G., Illner, R.: Remarks on kinetic models of granular media: asymptotics and entropy bounds. Kinet. Relat. Models 8(2), 201–214 (2015)
Ambrosio, L., Gigli, N., Savaré, G.: Gradient Flows in Metric Spaces and in the Space of Probability Measures. Lectures in Mathematics. Birkhäuser, Basel (2005)
Benedetto, D., Caglioti, E., Pulvirenti, M.: A kinetic equation for granular media. RAIRO Model. Math. Anal. Numer. 31(5), 615–641 (1997)
Benedetto, D., Caglioti, E., Pulvirenti, M.: Erratum: A kinetic equation for granular media. M2AN Math. Model. Numer. Anal. 33, 439–441 (1999)
Bertozzi, A.L., Laurent, T., Rosado, J.: \(L^p\) theory for multidimensional aggregation model. Commun. Pure Appl. Math. 64, 45–83 (2011)
Brenier, Y.: On the Darcy and hydrostatic limits of the convective Navier–Stokes equations. Chin. Ann. Math. 30, 1–14 (2009)
Brenier, Y., Gangbo, W., Savaré, G., Westdickenberg, M.: Sticky particle dynamics with interactions. J. Math. Pures Appl. 99(9), no. 5, 577–617 (2013)
Brenier, Y., Grenier, E.: Sticky particles and scalar conservation laws. SIAM J. Numer. Anal. 35(6), 2317–2328 (1998)
Carlier, G., Laborde, M.: On systems of continuity equations with nonlinear diffusion and nonlocal drifts (2015) (preprint)
Carrillo, J.A., McCann, R.J., Villani, C.: Kinetic equilibration rates for granular media and related equations: entropy dissipation and mass transportation estimates. Rev. Mat. Iberoam. 19, 1–48 (2003)
Carrillo, J.A., McCann, R.J., Villani, C.: Contractions in the 2-Wasserstein length space and thermalization of granular media. Arch. Ration. Mech. Anal. 179, 217–263 (2006)
Carrillo, J.A., DiFrancesco, M., Figalli, A., Laurent, T., Slepcev, D.: Global-in-time weak measure solutions and finite-time aggregation for nonlocal interaction equations. Duke Math. J. 156(2), 229–271 (2011)
Castaing, C., Valadier, M.: Convex Analysis and Measurable Multifunctions. Lecture Notes in Mathematics, vol. 580. Springer, Berlin (1977)
Di Francesco, M., Fagioli, S.: Measure solutions for nonlocal interaction PDEs with two species. Nonlinearity 26, 2777–2808 (2013)
Jordan, R., Kinderlehrer, D., Otto, F.: The variational formulation of the Fokker–Planck equation. SIAM J. Math. Anal. 29, 1–17 (1998)
Laurent, T.: Local and global existence for an aggregation equation. Commun. Part. Diff. Eq. 32, 1941–1964 (2007)
Villani, C.: Topics in Optimal Transportation, Graduate Studies in Mathematics, vol. 58. American Mathematical Society, Providence (2003)
Villani, C.: Optimal Transport: Old and New. Grundlehren der mathematischen Wissenschaften. Springer, Heidelberg (2009)
Acknowledgments
The authors are grateful to Yann Brenier, Reinhard Illner and Maxime Laborde for fruitful discussions about this work. M.A. acknowledges the support of NSERC through a Discovery Grant. G.C. gratefully acknowledges the hospitality of the Mathematics and Statistics Department at UVIC (Victoria, Canada), and the support from the CNRS, from the ANR, through the project ISOTACE (ANR-12-MONU-0013) and from INRIA through the action exploratoire MOKAPLAN.
Additional information
Communicated by L. Ambrosio.
Cite this article
Agueh, M., Carlier, G. Generalized solutions of a kinetic granular media equation by a gradient flow approach. Calc. Var. 55, 37 (2016). https://doi.org/10.1007/s00526-016-0978-7