Abstract
A new notion of displacement convexity on a matrix level is developed for density flows arising from mean-field games, compressible Euler equations, entropic interpolation, and semi-classical limits of non-linear Schrödinger equations. Matrix displacement convexity is stronger than the classical notions of displacement convexity, and its verification (formal and rigorous) relies on matrix differential inequalities along the density flows. The matrical nature of these differential inequalities upgrades dimensional functional inequalities to their intrinsic dimensional counterparts, thus improving on many classical results. Applications include turnpike properties, evolution variational inequalities, and entropy growth bounds, which capture the behavior of the density flows along different directions in space.
1 Introduction
The optimal decisions of agents in large populations, the lazy gas experiment of Schrödinger, and the flow of slender jets can all be modeled by systems of coupled partial differential equations of the form
The first equation in (1.1) is the continuity equation of \(\rho _t\ge 0\), interpreted as a density over a domain \(\Omega \subset {\mathbb {R}}^{n}\), driven by a gradient vector field \(\nabla \theta _t\). The second equation in (1.1) describes the evolution of the vector field itself via an equation for \(\theta _t\), which in turn can depend on the density \(\rho _t\). The boundary conditions of (1.1) will vary based on the model and will usually be a specification of \((\rho _0,\rho _{\tau })\), or \((\rho _0,\nabla \theta _0)\), or \((\rho _0,\theta _{\tau })\), and so on. The scope of the flows (1.1) will be recalled in Section 1.2; they include planning problems (optimal transport, entropic interpolation, regularization of planning problems), mean-field games, barotropic fluids, and semi-classical limits of non-linear Schrödinger equations. The majority of this work will focus on flows \((\rho _t,\theta _t)\) satisfying (1.1) under the assumptions that \(\sigma \) is real, \(U_t\) is convex, \(W\) is concave, and \(f\) is non-decreasing.
1.1 Matrix Displacement Convexity and Intrinsic Dimensional Functional Inequalities
The discovery by McCann [34] of the notion of displacement convexity has had a significant impact on probability, analysis, and geometry. More specifically, it was shown in [34] that certain functionals are convex along the optimal transport flow. A central example of such a functional is the differential entropy
which was shown by McCann [34] to be convex (i.e., \(t\mapsto E(t)\) is convex) when \((\rho _t)\) is the optimal transport flow. It was later realized that displacement convexity also holds along other density flows. For example, Léonard [32] showed that \(E(t)\) is convex when \((\rho _t)\) is the entropic interpolation flow, and Gomes and Seneci [22] showed that \(E(t)\) is convex when \((\rho _t)\) is a first-order mean-field game flow. In a certain sense, these results generalize McCann’s result (as well as the classical convexity of entropy along heat flows) as will become clear in Section 1.2.
One important application of displacement convexity is its usage in the definition of Ricci curvature for metric measure spaces as developed by Lott-Villani [33] and Sturm [41, 42]. Roughly, a metric-measure space is defined to have a non-negative Ricci curvature if the entropy is convex along optimal transport flows over this space. This notion of Ricci curvature coincides with the classical notion when the space is a Riemannian manifold.
There is a stronger curvature condition (going beyond non-negative Ricci curvature) which incorporates the effect of the dimension. Restricting to the flat case, this is the \(\text {CD}(0,n)\) curvature-dimension condition of Bakry-Émery [1]. Analogous to the relation between non-negative Ricci curvature and displacement convexity of the entropy, Erbar-Kuwada-Sturm [15] showed that the \(\text {CD}(0,n)\) curvature-dimension condition is equivalent (under sufficient regularity) to the concavity of the map
along the optimal transport flow (which implies the convexity of entropy along the flow). Due to the role of dimension in this notion of convexity it will be dubbed here dimensional displacement convexity. The natural question of whether \(e^{-\frac{E(t)}{n}}\) is concave along the entropic interpolation flow was settled by Ripani [40]; this recovered both the result of Erbar-Kuwada-Sturm on flat space and the result of Costa [13], who showed that \(e^{-\frac{E(t)}{n}}\) is concave along the heat flow.
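To see why dimensional displacement convexity strengthens the classical notion, note the elementary calculus equivalence (valid for any twice-differentiable \(E(t)\), independently of its precise definition):

```latex
\frac{d^2}{dt^2}\, e^{-\frac{E(t)}{n}}
  = \frac{1}{n}\, e^{-\frac{E(t)}{n}} \left( \frac{(E'(t))^2}{n} - E''(t) \right) \le 0
\quad\Longleftrightarrow\quad
E''(t) \ \ge\ \frac{(E'(t))^2}{n} \ \ge\ 0,
```

so concavity of \(e^{-\frac{E(t)}{n}}\) implies convexity of \(E(t)\), with the extra term \((E'(t))^2/n\) quantifying the dimensional gain.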
1.1.1 Matrix Displacement Convexity
The main purpose of this work is to develop and prove a new notion of matrix displacement convexity which is stronger than dimensional displacement convexity (and thus stronger than classical displacement convexity) along density flows of the form (1.1). To keep the discussion concrete, at this point matrix displacement convexity will be defined just for the entropy (but the extension is clear). Recall that the entropy production \(S(t)\) associated to a density flow \((\rho _t)\) is defined as
In the setting of this work (and many others), there is a natural entropy production matrix \({\mathcal {S}}(t)\) which can be defined so that
Indeed, a simple calculation (cf. Lemma 3.1) shows that when \((\rho _t,\theta _t)\) satisfies the continuity equation, the entropy production matrix is given by
where \(\otimes _S\) is the symmetric tensor product. The entropy matrix is defined as
so that
Remark 1.1
When the flow \((\rho _t,\theta _t)\) is the optimal transport flow (cf. Example 1.4), one can check that
where \(\Phi _t\) is the optimal transport map (which satisfies \(\nabla \Phi _t \succeq 0\)) from \(\rho _0\) to \(\rho _t\).
Definition 1.2
The entropy matrix \({\mathcal {E}}(t)\) is matrix displacement convex along a flow \((\rho _t,\theta _t)\) if, for any \(w\in S^{n-1}\), the function
is concave.
Note that if \({\mathcal {E}}(t)\) is matrix displacement convex then \(E(t)\) is dimensional displacement convex, and hence displacement convex (cf. Section 1.1.2).
There are two main inter-related motivations behind Definition 1.2. The first motivation comes from the notion of intrinsic dimensional functional inequalities. Consider a flow \((\rho _t)\) which is (approximately) trivial along certain directions in space, that is, its evolution (approximately) takes place on a subspace of low dimension \(\ll n\). In such settings, the explicit dependence on the ambient dimension \(n\) in the notion of dimensional displacement convexity of the entropy, formulated as the concavity of \(e^{-\frac{E(t)}{n}}\), renders this notion oblivious to the intrinsic dimension of the flow \((\rho _t)\). Consequently, functional inequalities which are derived from dimensional displacement convexity are dimensional functional inequalities, in the sense that the ambient dimension \(n\) appears explicitly in the inequalities. This dimensional feature is undesirable in high-dimensional settings. On the other hand, in many practical settings, there is a lower-dimensional manifold inside the high-dimensional ambient space to which the objects of interest (approximately) belong (e.g., the manifold hypothesis). In order to capture this phenomenon one needs intrinsic dimensional functional inequalities where the ambient dimension is absent and which scale like the dimension of the object at hand. Indeed, it will be shown in this work that matrix displacement convexity makes it possible to derive such intrinsic dimensional functional inequalities, which improve on their classical dimensional counterparts by capturing more refined structures of the flow—see Section 1.1.3. This is because controlling the matrix entropy, rather than just its trace, facilitates the analysis of the flow \((\rho _t)\) along different directions in space.
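As a hypothetical illustration (not an example taken from this paper): suppose the flow factorizes as \(\rho _t(x,y)=\mu _t(x)\nu (y)\), where \(\mu _t\) evolves on \({\mathbb {R}}^{k}\) and \(\nu \) is a fixed density on \({\mathbb {R}}^{n-k}\). Then the entropy splits, and one expects the entropy matrix to split block-diagonally,

```latex
E(t) = E_{\mu}(t) + E_{\nu},
\qquad
\mathcal{E}(t) = \begin{pmatrix} \mathcal{E}_{\mu}(t) & 0 \\ 0 & \mathcal{E}_{\nu} \end{pmatrix},
```

so that directional quantities \(\langle {\mathcal {E}}(t)w,w\rangle \) with \(w\) supported on the first \(k\) coordinates see only the \(k\)-dimensional evolution, whereas the scalar quantity \(e^{-\frac{E(t)}{n}}\) dilutes this evolution over all \(n\) ambient directions.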
The second motivation behind Definition 1.2 comes back to the discussion of curvature notions. For flat spaces, the \(\text {CD}(0,n)\) curvature-dimension condition does not capture the full curvature structure of the space. Indeed, the \(\text {CD}(0,n)\) condition implies a zero lower bound on the Ricci tensor, but in flat space one knows that the full Riemann tensor vanishes. More generally, there are important classes of manifolds where information beyond lower bounds on the Ricci tensor is given. One such prominent class in differential geometry is the class of Einstein manifolds with lower bounds on the sectional curvature (which includes the sphere and hyperbolic space). What is the correct notion of displacement convexity that captures this type of curvature information? This question was taken up in [28, 30], but Definition 1.2 seems to provide an alternative route as will be further explained in Section 1.1.2.
To conclude this section the first result of this paper is stated informally.
Theorem
(Theorem 4.9) Suppose \((\rho _t,\theta _t)\) is a nice flow satisfying (1.1) and assume that \(\sigma \) is real, \(U_t\) is convex, \(W\) is concave, and \(f\) is non-decreasing. Then, \({\mathcal {E}}(t)\) is matrix displacement convex.
1.1.2 Matrix Differential Inequalities
One classical way to deduce convexity is via differential inequalities. The most basic example is expressing the displacement convexity of \(E(t)\) along some flow via the differential inequality
The dimensional displacement convexity of \(E(t)\) is equivalent to the differential inequality for the entropy production,
In particular, comparing (1.2) and (1.3) shows that dimensional displacement convexity is stronger than displacement convexity. It will be shown in this work that the matrix displacement convexity of \({\mathcal {E}}(t)\) is equivalent to the matrix differential inequality for the entropy production matrix,
The inequality (1.4) is stronger than (1.3) by the Cauchy-Schwarz inequality. More importantly, the ambient dimension \(n\) is absent from (1.4), and having an inequality for the full matrix (rather than just the trace as in (1.3)) makes it possible to control each direction of space separately.
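The Cauchy-Schwarz step invoked here is the trace inequality for symmetric matrices (a sketch, with the precise constants of (1.3)-(1.4) suppressed): if \(\mathcal{A}\) is symmetric with eigenvalues \(\lambda _1,\ldots ,\lambda _n\), then

```latex
(\operatorname{Tr}[\mathcal{A}])^2
  = \Big( \sum_{i=1}^{n} \lambda_i \Big)^2
  \le n \sum_{i=1}^{n} \lambda_i^2
  = n \operatorname{Tr}[\mathcal{A}^2],
```

so taking traces in a matrix inequality of the schematic form \({\mathcal {S}}'(t)\succeq {\mathcal {S}}(t)^2\) yields \(S'(t)\ge \operatorname{Tr}[{\mathcal {S}}(t)^2]\ge S(t)^2/n\), with the ambient dimension \(n\) reappearing only at the last step.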
The proof of the matrix displacement convexity of \({\mathcal {E}}(t)\) will follow by establishing (1.4). In fact, more powerful differential matrix inequalities will be established, which in turn imply new intrinsic dimensional functional inequalities. To state these differential inequalities define the (positive semidefinite) Fisher information matrix associated to a flow \((\rho _t)\) as
and let
Theorem
(Theorem 4.8, Theorem 4.9) Suppose \((\rho _t,\theta _t)\) is a nice flow satisfying (1.1) and assume that \(\sigma \) is real, \(U_t\) is convex, \(W\) is concave, and \(f\) is non-decreasing. Then,
Consequently,
The proofs of the matrix differential inequalities will rely on integration by parts, which provides a more analytic approach to the study of (1.1) than relying on probabilistic representations. The proofs of the matrix differential inequalities also shed light on the question posed at the end of Section 1.1.1. Crucial to these integration-by-parts arguments is the exchange of derivatives, which is permitted in the flat case treated in this work. However, in the manifold setting, such an exchange of derivatives causes curvature terms to appear. Since the requisite differential inequalities are for matrices, it is sectional curvature terms which appear, rather than just Ricci terms. Hence, the verification of matrix differential inequalities, and hence of matrix displacement convexity and intrinsic dimensional functional inequalities, is intimately tied to curvature information that goes beyond the classical curvature-dimension conditions. A concrete manifestation of this phenomenon can be found in [16], where Eskenazis and the author proved matrix differential inequalities (using different techniques) for heat flows over spaces of constant curvature. These matrix differential inequalities led to Hamilton-type inequalities (which operate on the matrix level and require assumptions on the full Riemann tensor) which improve on Li-Yau inequalities (which operate on the trace level and only require information on the Ricci tensor). The reader is referred to [16] for further discussion.
1.1.3 Intrinsic Dimensional Functional Inequalities
Once the matrix differential inequalities (1.5) and (1.6) are in place one can use known techniques to deduce functional inequalities. As explained above, the matrical nature of the differential inequalities leads to the replacement of the (ambient) dimensional functional inequalities by more refined inequalities which capture the intrinsic dimension of the flow \((\rho _t)\), and thus improve on many classical results. These functional inequalities do not apply in the full generality of the flows discussed above, but they do apply in many cases of interest. Since the focus of this work is on matrix displacement convexity and the associated matrix differential inequalities, only some intrinsic dimensional functional inequalities will be proven, to demonstrate the power of the method. The reader is referred to the appropriate references for background on the significance of these functional inequalities.
Section 5 contains the intrinsic dimensional functional inequalities which are summarized as follows:
- Theorem 5.2. Intrinsic dimensional lower and upper bounds on the growth of the entropy \(E(t)\) along flows satisfying (1.1).
- Theorem 5.4. Intrinsic dimensional turnpike properties via dissipation of Fisher information along viscous flows satisfying (1.1).
- Theorem 5.6. Intrinsic dimensional lower and upper bounds on certain costs associated to the flow (1.1) when \(U_t\) is independent of t, \(W=0\), and \(\sigma \ne 0\). These cost inequalities can also be seen as a generalization of the intrinsic dimensional local logarithmic Sobolev inequalities (Remark 5.7).
- Theorem 5.11. Intrinsic dimensional long time asymptotics for cost and energy along entropic interpolation flows.
- Theorem 5.12. Intrinsic dimensional evolution variational inequalities along entropic interpolation flows.
- Theorem 5.13. Intrinsic dimensional contraction of entropic cost along entropic interpolation flows.
Remark 1.3
This work focuses on the development of matrix displacement convexity, and consequently of intrinsic dimensional inequalities, for certain functionals (e.g., entropy) along flows of the form (1.1) in flat spaces. The natural next step is to find other functionals which are matrix displacement convex (analogous to [34]), and to investigate the extension of the results of this paper to curved spaces.
1.2 Examples of Density Flows
To conclude the introduction this section demonstrates the scope of density flows of the form (1.1) via a number of important examples. The first step is to note that the equations in (1.1) have a variational characterization as the Euler-Lagrange equations of the functional
Here, the Lagrangian \(L\) is given by
where \(U_t:\Omega \rightarrow {\mathbb {R}}\) is a potential term. The term \(W*\rho _t\) stands for the convolution of the density \(\rho _t\) with a symmetric interaction potential \(W:\Omega \rightarrow {\mathbb {R}}\), and \(F:{\mathbb {R}}_{\ge 0}\rightarrow {\mathbb {R}}\) is such that \(f(r)=F(r)+rF'(r)\) where \(f:{\mathbb {R}}_{\ge 0}\rightarrow {\mathbb {R}}\).
Example 1.4
(Planning problems) Consider the boundary conditions in (1.1) specifying \(\rho _0\) and \(\rho _{\tau }\). The planning problem seeks to find the optimal density flow going from \(\rho _0\) to \(\rho _{\tau }\), subject to the minimization of the cost given by (1.7). The optimal flow \((\rho _t,\theta _t)\) is given by the equations (1.1).
Optimal transport [43, §5.4]. Taking \(U_t=f=W=0\) and \(\sigma =0\) leads to \((\rho _t,\theta _t)\) being the optimal flow minimizing (1.7),
among all flows satisfying the continuity equation \(\partial _t\rho _t+\nabla {\cdot }(\rho _tv_t)=0\) with boundary conditions \(\rho _0\) and \(\rho _{\tau }\). The function \(\theta _t\) evolves according to
In this setting the flow \((\rho _t)\) is a geodesic in the Wasserstein space between \(\rho _0\) and \(\rho _{\tau }\), which is the fluid mechanics formulation by Benamou-Brenier [2] of the optimal transport problem between \(\rho _0\) and \(\rho _{\tau }\).
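For orientation, the cost being minimized and the evolution of \(\theta _t\) referred to here are presumably the standard Benamou-Brenier action and Hamilton-Jacobi equation (a reconstruction from the classical formulation, not a display copied from this paper):

```latex
\inf \int_0^{\tau}\!\!\int_{\Omega} \tfrac{1}{2}\,|v_t|^2\,\rho_t \,\mathrm{d}x\,\mathrm{d}t,
\qquad
\partial_t \theta_t + \tfrac{1}{2}\,|\nabla \theta_t|^2 = 0,
```

with the optimal velocity field given by \(v_t=\nabla \theta _t\).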
Heat flow. Taking \(U_t=f=W=0\) and \(\sigma \rightarrow \infty \) leads to the flow \((\rho _t)\) corresponding to the heat equation
(with boundary term \(\rho _{\tau }\) adjusted appropriately).
Entropic interpolation [7, §4.5]. Taking \(U_t=f=W=0\) leads to \((\rho _t,\theta _t)\) being the optimal flow minimizing (1.7),
among all flows satisfying the continuity equation \(\partial _t\rho _t+\nabla {\cdot }(\rho _tv_t)=0\) with boundary conditions \(\rho _0\) and \(\rho _{\tau }\). The function \(\theta _t\) evolves according to
The flow \((\rho _t)\) is the entropic interpolation between \(\rho _0\) and \(\rho _{\tau }\), which is the same flow as in the Schrödinger bridge problem and is the dynamic formulation of entropic optimal transport. The above fluid dynamics formulation (or stochastic control formulation) of entropic interpolation is due to Chen-Georgiou-Pavon [6] and Gentil-Léonard-Ripani [19]. The entropic interpolation flow encapsulates both the optimal transport flow and the heat flow as the limits \(\sigma \rightarrow 0\) and \(\sigma \rightarrow \infty \), respectively.
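In this case the cost being minimized is presumably the Benamou-Brenier-type formula for entropic optimal transport from [6, 19], which augments the kinetic action by a Fisher-information term (a reconstruction; the constant in front of the second term depends on the normalization of \(\sigma \)):

```latex
\inf \int_0^{\tau}\!\!\int_{\Omega}
  \Big( \tfrac{1}{2}\,|v_t|^2 + \tfrac{\sigma^2}{8}\,|\nabla \log \rho_t|^2 \Big)\,
  \rho_t \,\mathrm{d}x\,\mathrm{d}t,
```

which formally recovers the Benamou-Brenier action as \(\sigma \rightarrow 0\).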
Regularization of planning problems [23]. Taking \(U_t=W=0\) and \(\sigma =0\) leads to \((\rho _t,\theta _t)\) being the optimal flow minimizing (1.7),
among all flows satisfying the continuity equation \(\partial _t\rho _t+\nabla {\cdot }(\rho _tv_t)=0\) with boundary conditions \(\rho _0\) and \(\rho _{\tau }\). The function \(f\) is seen as a regularization term of optimal transport and it is often assumed to be non-decreasing (an assumption under which the results of this work apply). The choice
leads to the entropic regularization of optimal transport as investigated by Porretta [39].
Example 1.5
(Mean-field games) The theory of mean-field games was developed by Huang-Malhamé-Caines [25] and Lasry-Lions [31] to describe Nash equilibrium type concepts for games with large populations of agents. To describe this setup define the Hamiltonian \(H\) to be the Legendre transform of the Lagrangian \(L\) (in the \(w\) variable), so that
and let
for a density \(\rho \) over \(\Omega \). Let \((\rho _t,\theta _t)\) be a flow satisfying (1.1) and let \(u_t:=\theta _t-\frac{\sigma }{2}\log \rho _t\). Then, the system (1.1) is the combination of a Fokker-Planck equation and a Hamilton-Jacobi equation,
which describes the following stochastic optimal control problem. Consider an infinite population of agents where each agent evolves its state \(x_t\) according to the stochastic differential equation
where \(v_t\) is the control chosen by the agent and \((B_t)\) is a standard Brownian motion in \({\mathbb {R}}^{n}\). The agent’s goal is to minimize
where \(\rho _t(x)\) is the density describing the fraction of agents at state x at time t, and \(u_{\tau }\) stands for the cost at the final state \(x_{\tau }\). The first term in (1.19) stands for the energy spent by the control \(v_t\), and the second term in (1.19) accounts for the effect of the rest of the population of agents. For example, the common assumption in mean-field games (and in this work) that \(f\) is non-decreasing models the agent’s aversion to overcrowding. At equilibrium, each agent chooses its control optimally and the resulting density is \(\rho _t\) which satisfies the first equation in (1.17). Letting \(u_t(x)\) stand for the expected cost that will be incurred by an agent playing optimally, starting at time t at state x, one can show that \(u_t\) solves the second equation in (1.17). From a different perspective, \((\rho _t,\nabla u_t)\) can be derived as the optimal solution to the problem of minimizing
among all flows \((\rho _t,v_t)\) satisfying the continuity equation \(\partial _t\rho _t+\nabla {\cdot }(\rho _tv_t)=0\) with boundary conditions \(\rho _0\) and \(u_{\tau }\), where
so that \(\textrm{V}'\) is the functional derivative of \(\textrm{V}\) (recall \(f(r)=F(r)+rF'(r)\)). The functional (1.20) is exactly (1.7) and indeed (1.17) is exactly (1.1). Equations (1.17) constitute a second-order mean-field game system when \(\sigma >0\) and a first-order mean-field game system when \(\sigma =0\). Note that in contrast to the planning problem of Example 1.4, where the boundary conditions were \((\rho _0,\rho _{\tau })\), in the mean-field game setting the boundary conditions are \((\rho _0,u_{\tau })\).
Example 1.6
(Barotropic fluids) Let \(e:{\mathbb {R}}_{\ge 0}\rightarrow {\mathbb {R}}\) be the internal energy of a fluid and let \(p: {\mathbb {R}}\rightarrow {\mathbb {R}}\) be the pressure function given by \(p(r):=e'(r)r^2\). Taking \(U_t=W=0, ~\sigma =0\), and setting \(e=-F\), turns (1.1) (after spatial differentiation of the second equation) into the system of equations
The system (1.22) describes the compressible Euler equations where the pressure depends only on the fluid density \(\rho _t\), which renders the fluid barotropic [29, §4.3]. Normally, the pressure should be a non-decreasing function of the density, which translates to \(f'\le 0\) as \(-f'(r)=\frac{p'(r)}{r}\). Most of the results of this work only apply to the setting \(f'\ge 0\) (which is the relevant setting for the mean-field games of Example 1.5, and in principle can also be used in the planning problem of Example 1.4). However, there are in fact systems of fluid equations where \(f'\ge 0\), namely the zero-viscosity limit of the slender jet equation where \(U_t(x)=gx\) (with \(g>0\) standing for gravity) and \(f(r)=-\gamma r^{-\frac{1}{2}}\) (with \(\gamma >0\) standing for the surface tension coefficient) [11, 12]. Note that in contrast to Example 1.4 and Example 1.5, the boundary conditions here are usually \((\rho _0,\nabla \theta _0)\).
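The sign relation quoted above follows directly from the stated definitions \(p(r)=e'(r)r^2\), \(e=-F\), and \(f(r)=F(r)+rF'(r)\):

```latex
p'(r) = \frac{d}{dr}\big[ -F'(r)\,r^2 \big]
      = -\big( 2F'(r)\,r + F''(r)\,r^2 \big)
      = -r\big( 2F'(r) + rF''(r) \big)
      = -r\,f'(r),
```

so \(-f'(r)=p'(r)/r\), and for the slender jet choice \(f(r)=-\gamma r^{-1/2}\) one checks \(f'(r)=\frac{\gamma }{2}r^{-3/2}>0\).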
Example 1.7
(Semi-classical limits of non-linear Schrödinger equations) Consider the equation
where \(\Psi \) is a complex-valued wave function, i is the imaginary unit, m is the mass, and \(\hbar \) is the reduced Planck constant. When \(f=W=0\), Equation (1.23) is the standard linear Schrödinger equation with potential \(U_t\) (often independent of t). The interaction potential \(W(x)\) is often a power law, an inverse power law, or a logarithm in the norm |x|, and the non-linearity \(f(r)\) is often a polynomial or a logarithm in r. The connection between (1.23) and (1.1) is via the Madelung transform [44]: Using the representation
and assuming \(|\Psi _t(x)|^2>0\) for every \(x\in \Omega \) and \(t\in [0,\tau ]\), the flow \((\rho _t,\theta _t)\) satisfies
where the units are chosen so that \(m=1\). Equations (1.24) are exactly the same as equations (1.1) with the choice \(\sigma =i\hbar \). The term \(\frac{\Delta \rho _t^{1/2}}{\rho _t^{1/2}}\) is known as the (non-local) quantum pressure or Bohm potential. Most of the results in this paper only apply to the case where \(\sigma \) is real so they cannot apply as is to (1.24). However, when taking \(\hbar \rightarrow 0\), i.e., taking the semi-classical limit of (1.23), equations (1.24) formally reduce to equations (1.1) with \(\sigma =0\). In particular, the results of this work apply (at least formally) whenever \(U_t\) is convex, \(W\) is concave, and \(f'\ge 0\). These assumptions cover a number of semi-classical limits of interest:
Semi-classical limit of the linear Schrödinger equation with convex potential. Take \(f=W=0\) and \(U_t\) to be convex.
Semi-classical limit of \(\varvec{focusing}\) non-linear Schrödinger equations. Take \(U_t=W=0\) and \(f'\ge 0\). Prominent examples are
the semi-classical limit of the focusing \(\varvec{cubic}\) non-linear Schrödinger equation, and
the semi-classical limit of the focusing \(\varvec{logarithmic}\) non-linear Schrödinger equation (cf. entropic regularization of optimal transport in Example 1.4).
For further information on semi-classical limits of non-linear focusing Schrödinger equations see [4, 8, 24, 27, 37] and [3, 5, 14, 18].
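For reference, the Madelung system obtained from (1.23) takes, in the standard convention with \(m=1\) and the phase normalization \(\Psi _t=\rho _t^{1/2}e^{i\theta _t/\hbar }\) (an assumption fixing units), the form (a reconstruction from the classical references, not a display copied from this paper):

```latex
\partial_t \rho_t + \nabla\cdot(\rho_t \nabla \theta_t) = 0,
\qquad
\partial_t \theta_t + \tfrac{1}{2}\,|\nabla \theta_t|^2 + U_t + W*\rho_t + f(\rho_t)
  = \frac{\hbar^2}{2}\,\frac{\Delta \rho_t^{1/2}}{\rho_t^{1/2}},
```

in which the right-hand side is the Bohm potential term mentioned above; it vanishes formally as \(\hbar \rightarrow 0\), leaving the \(\sigma =0\) case of (1.1).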
The results of this work apply to all of the above examples and, in addition, to the generalizations afforded by considering general flows of the form (1.1).
Remark 1.8
(Quantum drift-diffusion) The quantum drift-diffusion model [21] (which for \(n=1\) corresponds to the Derrida-Lebowitz-Speer-Spohn equation) is defined by
It was shown by Gianazza-Savaré-Toscani [21] that the flow \((\rho _t,\theta _t)\) satisfying (1.25) is the gradient flow in Wasserstein space of the Fisher information functional, and an important part of the solution theory of (1.25) is the monotonicity of the entropy \(E(t)\) and the Fisher information \({\text {Tr}}[{\mathcal {I}}(t)]\). In Remark 4.10 it is observed that, in addition to the known monotonicity of \({\text {Tr}}[{\mathcal {I}}(t)]\), there is also monotonicity (in the positive semidefinite sense) of the Fisher information matrix \({\mathcal {I}}(t)\). This observation is in line with the theme of this work, but the flow (1.25) does not fall under the framework of (1.1), and the monotonicity of the Fisher information matrix can in any case be easily deduced from [21], so this observation will not be elaborated beyond Remark 4.10.
1.3 Organization of Paper
Section 2 establishes the assumptions, notation, and definitions used in this work. Section 3 contains the derivation of the formulas for the time derivatives of various quantities along the density flows. Section 4 contains the main results of this work where the differential matrix inequalities and matrix displacement convexity are derived. Finally, Section 5 contains the intrinsic dimensional functional inequalities.
2 Preliminaries
This section collects the assumptions, notation, and definitions used in this work.
2.1 Assumptions
The existence and regularity theory of density flows of the form (1.1) is highly dependent on the precise form of the partial differential equations and the boundary conditions. Such questions are orthogonal to the topic of this work and so will not be addressed here. Instead, sufficient regularity will be assumed to justify the computations. In certain settings, the results of this work are completely rigorous, provided sufficient regularity of the boundary conditions is assumed, while in other settings the computations are formal. In order to avoid distracting from the main point of this work this distinction will not be emphasized.
Definition 2.1
(Nice flows)
(1) The domain \(\Omega \) is assumed to be a convex subset of \({\mathbb {R}}^{n}\) with smooth boundary (if the domain is bounded).
(2) The functions \(\rho _t(x)\) and \(\theta _t(x)\) are classical solutions of (1.1): differentiable in t, twice-differentiable in x, and finite.
(3) Integration by parts without boundary terms is justified. This entails either fast-enough decay of the flow and its derivatives at infinity (when \(\Omega ={\mathbb {R}}^{n}\)), or appropriate boundary conditions when \(\Omega \) is bounded. A good example to keep in mind is when \(\Omega \) is a flat torus in \({\mathbb {R}}^{n}\).
(4) Each density \(\rho _t\) is smooth with respect to the Lebesgue measure on \({\mathbb {R}}^{n}\), strictly positive, and integrates to 1, \(\int _{\Omega }\,\textrm{d}\rho _t(x)=1\) for all \(t\in [0,\tau ]\).
(5) The exchange of derivatives and integration is permitted.
2.2 Notation
An absolutely continuous probability measure \(\nu \) will often be identified with its density with respect to the Lebesgue measure so that \(\,\textrm{d}\nu =\nu \,\textrm{d}x\). To alleviate the notation the domain of spatial integrals will be omitted, \(\int :=\int _{\Omega }\), and the Lebesgue measure will be omitted as well, \(\int := \int \,\textrm{d}x\). Often, the x argument of various functions will be omitted, e.g., \(\int \nu (x)=\int \nu \), while the time dependence will be kept, e.g., \(\int v_t(x)\,\textrm{d}x=\int v_t\). The metric on \({\mathbb {R}}^{n}\) is taken to be the standard Euclidean metric, denoted by \(\langle \cdot ,\cdot \rangle \), with the associated norm \(|\cdot |\). The coordinates of a vector \(w\in {\mathbb {R}}^{n}\) are denoted by superscripts, \(w= (w^1,\ldots ,w^{n})\). The unit sphere in \({\mathbb {R}}^{n}\) is denoted by \(S^{n-1}\). The symmetric tensor product \(\otimes _S\) is given by \(w\otimes _S w':=\frac{1}{2}[w\otimes w' + w'\otimes w]\) for \(w,w'\in {\mathbb {R}}^{n}\) where \(w\otimes w'\) is the standard tensor product.
Matrix quantities will be denoted by calligraphic fonts, e.g., \({\mathcal {M}}\), and their traces (scalar) will be denoted by regular fonts, e.g., \(M={\text {Tr}}[{\mathcal {M}}]\). The (i, j)th entry of \({\mathcal {M}}\) is denoted \({\mathcal {M}}_{ij}\). The transpose of a matrix \({\mathcal {M}}\) is given by \({\mathcal {M}}^{\textsf{T}}\). The identity matrix on \({\mathbb {R}}^{n}\) is denoted by \({\text {Id}}\). The symbols \(\succeq \) and \(\preceq \) will stand for the semi-definite order and will be applied only to symmetric matrices.
Time derivatives will be denoted as \(\partial _t\) and spatial derivatives will be denoted as \(\partial _i:=\partial _{x_i}\) and \(\partial _{ij}^2:=\partial _{x_ix_j}^2\), etc. The spatial gradient and Hessian are denoted \(\nabla ,\nabla ^2\), respectively, and \(\nabla {\cdot }\) stands for the divergence of vector fields. Given a vector field \(v\) over \({\mathbb {R}}^{n}\) denote by \(\nabla v\) the matrix defined by \((\nabla v)_{ij}=\partial _iv^j\) with \(v=(v^1,\ldots ,v^{n})\), and write \(\partial _kv:=(\partial _kv^1,\ldots ,\partial _kv^{n})\) for \(k\in \{1,\ldots , n\}\). The first and second derivative of a function \(\eta \) over an interval are denoted by \(\eta ',\eta ''\).
The summation \(\sum _{k=1}^{n}\) will often be written as \(\sum _k\).
2.3 Definitions
In the following definitions it is implicitly assumed that the expressions are well-defined. Throughout \(\nu \) is a density (non-negative function with finite integral) over \(\Omega \).
The differential entropy of \(\nu \) is defined as
The Fisher information matrix of \(\nu \) is the symmetric matrix defined as
with the equality holding by integration by parts. The Fisher information of \(\nu \) is
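The two standard expressions presumably intended for the Fisher information matrix above (the second obtained from the first by integration by parts) are:

```latex
\mathcal{I}(\nu)
  = \int_{\Omega} \nabla\log\nu \otimes \nabla\log\nu \;\nu\,\mathrm{d}x
  = -\int_{\Omega} \nabla^2 \log\nu \;\nu\,\mathrm{d}x,
```

since \(\int \partial _i\nu \,\partial _j\log \nu =-\int \nu \,\partial ^2_{ij}\log \nu \) when boundary terms vanish; the (scalar) Fisher information is then the trace \({\text {Tr}}[{\mathcal {I}}(\nu )]\).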
Let \((\rho _t)_{t\in [0,\tau ]}\) be a density flow which evolves according to a continuity equation:
For \(t\in [0,\tau ]\) denote the entropy, Fisher information matrix, and Fisher information, respectively, of \(\rho _t\) as
The entropy production matrix is defined as
and the entropy production is its trace
The matrix entropy is defined by
so that
Another matrix which will play an important role comes from the driving vector field,
with its trace
Note that \({\mathcal {V}}(t)\) is symmetric. Finally, the following combinations of matrices will be crucial: Given \(\sigma \ge 0\) let
The interpretation of \({\mathcal {T}}_{\pm }(t)\) will become clearer in the subsequent sections.
3 Density Flows
This section derives the evolution equations of key quantities (entropy, entropy production matrix, etc.) along a density flow \((\rho _t)_{t\in [0,\tau ]}\) satisfying the continuity equation
Lemma 3.1 and Lemma 3.2 describe the first derivative along the flow of the entropy \(E(t)\) and Fisher information matrix \({\mathcal {I}}(t)\), respectively, for general flows satisfying (3.1). While these results hold for general flows satisfying a continuity equation, the focus of this paper is on density flows of the form (1.1). Using the identity
the flow (1.1) can be written as
with
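The identity used here is presumably the standard rewriting of the Laplacian as a transport term,

```latex
\Delta \rho_t = \nabla\cdot\big( \rho_t \nabla \log \rho_t \big),
```

which converts a diffusive term \(\frac{\sigma }{2}\Delta \rho _t\) in (1.1) into transport along the velocity field \(\frac{\sigma }{2}\nabla \log \rho _t\), so that (3.3) becomes a pure continuity equation with an adjusted drift.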
For the class of flows \((\rho _t,\theta _t)\) satisfying (3.3)-(3.4), Lemma 3.3 provides a formula for the first derivative along the flow of the entropy production matrix \({\mathcal {S}}(t)\) and, consequently, deduces in Corollary 3.4 a formula for the second derivative of the entropy along the flow. In addition, Lemma 3.5 describes the evolution of \({\mathcal {V}}(t)\) along the flow, and the evolution of its trace \(V(t)\) is given in Corollary 3.6.
The first result describes the time evolution of the entropy of \(\rho _t\).
Lemma 3.1
(1st derivative of entropy \(E(t)\)) Suppose \((\rho _t,v_t)_{t\in [0,\tau ]}\) is a nice flow satisfying (3.1). Then,
Proof
From the continuity equation (3.1) and integration by parts,
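Under the convention \(E(t)=\int \rho _t\log \rho _t\,\mathrm{d}x\) (the sign convention is an assumption) and assuming boundary terms vanish, the computation can be sketched as:

```latex
\partial_t E(t)
= \int \partial_t \rho_t \,(\log \rho_t + 1)\,\mathrm{d}x
= -\int \nabla \cdot (\rho_t v_t)\,(\log \rho_t + 1)\,\mathrm{d}x
= \int \langle v_t, \nabla \log \rho_t \rangle \,\mathrm{d}\rho_t.
```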
\(\square \)
Next, the time evolution of the Fisher information matrix of \(\rho _t\) is derived.
Lemma 3.2
(1st derivative of Fisher information matrix \({\mathcal {I}}(t)\)) Suppose \((\rho _t,v_t)_{t\in [0,\tau ]}\) is a nice flow satisfying (3.1). Then,
Proof
Recall that
so that, by exchanging derivatives,
For a vector field \(w{=}(w^1,\ldots ,w^{n})\) and \(k{=}1,\ldots ,n\) let \(\partial _kw:=(\partial _kw^1,\ldots ,\partial _kw^{n})\). Using the continuity equation (3.1) and exchanging derivatives gives
Hence, by integration by parts and exchanging derivatives,
\(\square \)
The remainder of the section focuses on flows \((\rho _t,\theta _t)_{t\in [0,\tau ]}\) satisfying (3.3)-(3.4). Note that under the evolution (3.3)-(3.4), the entropy production matrix \({\mathcal {S}}(t)\) can also be expressed, by integration by parts, as
Lemma 3.3
(1st derivative of entropy production matrix \({\mathcal {S}}(t)\)) Suppose \((\rho _t,\theta _t)_{t\in [0,\tau ]}\) is a nice flow satisfying (3.3)-(3.4). Then,
Proof
By definition
where
To compute \(A_{ij}\) note that by (3.3),
Hence, by integration by parts,
To compute \(B_{ij}\) note that by (3.4),
Hence, by integration by parts,
It follows that
An analogous argument applies to \(A_{ji}+B_{ji}\). Finally, by integration by parts,
so
which completes the proof. \(\square \)
Combining Lemma 3.1 and Lemma 3.3 yields:
Corollary 3.4
(2nd derivative of entropy \(E(t)\)) Suppose \((\rho _t,\theta _t)_{t\in [0,\tau ]}\) is a nice flow satisfying (3.3)-(3.4). Then,
Lemma 3.5
(1st derivative of \({\mathcal {V}}(t)\)) Suppose \((\rho _t,\theta _t)_{t\in [0,\tau ]}\) satisfies (3.3)-(3.4). Then,
Proof
Recall that
so that
where
To compute \(A_{ij}\) note that by (3.4),
Hence, by integration by parts,
Analogously,
To compute \(B_{ij}\) note that by (3.3) and integration by parts,
It follows that
The proof is complete by Lemma 3.2. \(\square \)
Combining Lemma 3.5 and (3.3) yields:
Corollary 3.6
(1st derivative of \(V(t)\)) Suppose \((\rho _t,\theta _t)_{t\in [0,\tau ]}\) is a nice flow satisfying (3.3)-(3.4). Then,
4 Matrix Differential Inequalities and Matrix Displacement Convexity
In this section the main matrix differential inequalities of this work are derived. The main result is Theorem 4.1, which provides matrix differential inequalities for \([0,\tau ]\ni t\mapsto {\mathcal {T}}_{\pm }(t)\), for any flow satisfying (3.3)-(3.4), provided that \(\sigma \in {\mathbb {R}}_{\ge 0}\). From Theorem 4.1 it is possible to deduce a matrix differential inequality for \({\mathcal {S}}(t)\), which is the content of Theorem 4.2. In Section 4.1, a few technical results are collected which show how to obtain bounds on matrices and deduce matrix displacement convexity from matrix differential inequalities. Finally, Section 4.2 applies Theorem 4.1 and Theorem 4.2, together with the results of Section 4.1, to flows of the form (3.3)-(3.4) under convexity constraints.
The following theorem is the main result of this section and is based on the formulas of Section 3.
Theorem 4.1
(Matrix differential inequalities for \({\mathcal {T}}_{\pm }(t)\)) Suppose \((\rho _t,\theta _t)_{t\in [0,\tau ]}\) is a nice flow satisfying (3.3)-(3.4) with \(\sigma \in {\mathbb {R}}_{\ge 0}\). Then,
Proof
Fix \(\sigma \in {\mathbb {R}}_{\ge 0}\). By Lemma 3.2 and Lemma 3.3,
where the inequality holds by Jensen’s inequality. \(\square \)
By combining the differential inequalities of Theorem 4.1 the following result is deduced.
Theorem 4.2
(Matrix differential inequalities for entropy production matrix \({\mathcal {S}}(t)\)) Suppose \((\rho _t,\theta _t)_{t\in [0,\tau ]}\) is a nice flow satisfying (3.3)-(3.4) with \(\sigma \in {\mathbb {R}}_{\ge 0}\). Then,
Proof
Since
it follows from Theorem 4.1 that
The result follows as
\(\square \)
Both Theorem 4.1 and Theorem 4.2 provide differential inequalities of the form
In Section 4.2 it will be shown that in many flows of interest the remainder term is nonnegative (in a semidefinite sense), which means that (4.1) implies differential inequalities of the form
The following section shows how to take differential inequalities of the form (4.2) and deduce bounds on M(t) as well as obtain matrix displacement convexity.
4.1 Matrix Differential Inequalities and Displacement Convexity
Suppose for the rest of this section that \([0,\tau ]\ni t\mapsto M(t)\) is a differentiable function taking values in the set of \(n\times n\) symmetric matrices. The first result shows how the differential inequality (4.2) implies bounds on M(t) in terms of M(0).
Remark 4.3
Note that in the following results, the existence time \(\tau \) of the flow \((M(t))_{t\in [0,\tau ]}\) will depend on the value of M(0).
Lemma 4.4
If
then, for any \(w\in S^{n-1}\),
Proof
Fix \(w\in {\mathbb {R}}^{n}\) and let \(\eta (t):=\langle w, M(t)w\rangle \). Then, by the Cauchy-Schwarz inequality,
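Assuming, as in the statement, that the hypothesis is the matrix differential inequality \(\partial _tM(t)\succeq M^2(t)\) (cf. Lemma 4.6), the displayed step for a unit vector \(w\) is:

```latex
\partial_t \eta(t) = \langle w, \partial_t M(t)\, w \rangle
\ge \langle w, M^2(t)\, w \rangle
= |M(t)w|^2 \, |w|^2
\ge \langle w, M(t)\, w \rangle^2 = \eta(t)^2,
```

the final inequality being Cauchy-Schwarz.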
The solution of the ordinary differential equation
is \(\xi (t):=\frac{\eta (0)}{1-t\eta (0)}\). Standard comparison [35] shows that \(\eta (t)\ge \xi (t)\) for all \(t\in [0,\tau ]\), which establishes (4.3). \(\square \)
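The comparison step can be checked numerically. The sketch below (an illustration only, not part of the proof; all names are ours) integrates \(\partial _t\eta =\eta ^2+r(t)\) with a nonnegative remainder \(r\) by the explicit Euler method and confirms that \(\eta (T)\ge \xi (T)=\eta (0)/(1-T\eta (0))\):

```python
# Comparison principle for d/dt eta = eta^2 + r(t), r >= 0:
# eta(t) >= xi(t) = eta0 / (1 - t * eta0) wherever xi is finite.

def simulate(eta0, r, T, steps=200_000):
    """Explicit Euler integration of d/dt eta = eta^2 + r(t)."""
    dt = T / steps
    eta = eta0
    for k in range(steps):
        eta += dt * (eta * eta + r(k * dt))
    return eta

eta0, T = 0.5, 1.0                      # T < 1/eta0, so xi is finite on [0, T]
xi_T = eta0 / (1 - T * eta0)            # comparison solution at time T
eta_T = simulate(eta0, lambda t: 0.3 * t, T)  # any nonnegative remainder r(t)
assert eta_T >= xi_T
```

Letting \(T\) approach \(1/\eta (0)\) exhibits the blow-up of \(\xi \) that limits the existence time \(\tau \) (cf. Remark 4.3).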
Corollary 4.5
Suppose that
and let \(\{\lambda _i\}_{i=1}^{n}\) be the eigenvalues of M(0). Then,
Proof
Let \(\{w_i\}_{i=1}^{n}\) be the normalized eigenvectors of M(0) corresponding to \(\{\lambda _i\}_{i=1}^{n}\). By Lemma 4.4,
\(\square \)
Next it is shown how the differential inequality (4.2) implies matrix displacement convexity.
Lemma 4.6
If
then, \(\int _0^tM(s)\,\textrm{d}s\) is matrix displacement convex, that is, for any \(w\in S^{n-1}\), the function \(\textrm{c}_{w}:[0,\tau ]\rightarrow {\mathbb {R}}\) given by
is concave. Consequently,
Proof
To show the concavity of \(\textrm{c}_{w}\) it suffices to show that \(\partial _{tt}^2\textrm{c}_{w}(t)\le 0\) for every \(t\in [0,\tau ]\). The first derivative is
and the second derivative is nonpositive as
where the first inequality holds by the assumption \(\partial _t M(t)\succeq M^2(t)\), and the second inequality holds by the Cauchy-Schwarz inequality.
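Concretely, using \(\partial _t\textrm{c}_{w}(t)=-\textrm{c}_{w}(t)\langle w,M(t)w\rangle \) and \(|w|=1\), the chain of inequalities can be reconstructed as:

```latex
\partial_{tt}^2 \mathrm{c}_w(t)
= \mathrm{c}_w(t)\left[\langle w, M(t)w\rangle^2 - \langle w, \partial_t M(t)\, w\rangle\right]
\le \mathrm{c}_w(t)\left[\langle w, M(t)w\rangle^2 - |M(t)w|^2 |w|^2\right]
\le 0.
```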
To establish (4.4) follow the argument of [9, §3.3.1] and note that the concavity of \(\textrm{c}_{w}\) implies
Since \(\partial _t\textrm{c}_{w}(t)=-\textrm{c}_{w}(t)\langle w,M(t)w\rangle \), and \(\textrm{c}_{w}(t)\ge 0\),
is equivalent to
The bound \(\langle w,M(t)w\rangle \le \frac{1}{\tau -t}\) follows analogously by using \(\frac{\textrm{c}_{w}(\tau )-\textrm{c}_{w}(t)}{\tau -t}\le \partial _t\textrm{c}_{w}(t)\).
\(\square \)
Corollary 4.7
Suppose
and let \(\{\lambda _i\}_{i=1}^{n}\) be the eigenvalues of M(0). Then,
Proof
Taking \(t=0\) in (4.4) gives \(\lambda _i\le \frac{1}{\tau }\). Hence, \(\lambda _i\le \frac{1}{t}\) for any \(t\in [0,\tau ]\) which implies \(0\le 1-t\lambda _i\) for any \(t\in [0,\tau ]\). In fact, these inequalities are strict since otherwise the left-hand side in Corollary 4.5 is infinite (but by assumption it is finite, cf. Remark 4.3). The result follows by integrating the bound in Corollary 4.5 from \(t=0\) to \(t=\tau \). \(\square \)
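The elementary integral behind the last step, valid whenever \(1-\tau \lambda _i>0\) (with the integrand assumed of the form \(\lambda _i/(1-t\lambda _i)\), as in the bound of Lemma 4.4), is:

```latex
\int_0^{\tau} \frac{\lambda_i}{1 - t\lambda_i}\,\mathrm{d}t
= \Big[-\log(1 - t\lambda_i)\Big]_0^{\tau}
= -\log(1 - \tau\lambda_i).
```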
4.2 Matrix Differential Inequalities and Displacement Convexity Along Density Flows
This section shows that there are a number of important density flows where the matrix differential inequalities of Theorems 4.1 and 4.2 are of the form
Hence, Lemma 4.4, Corollary 4.5, Lemma 4.6, and Corollary 4.7 are applicable. The reader should keep in mind Remark 4.3.
Theorem 4.8
(Differential inequalities and matrix displacement convexity for \({\mathcal {T}}_{\pm }(t)\)) Suppose \((\rho _t,\theta _t)_{t\in [0,\tau ]}\) is a nice flow satisfying (3.3–3.4) with \(\sigma \in {\mathbb {R}}_{\ge 0}\), \(f'(r)\ge 0\) for every \(r\in {\mathbb {R}}_{\ge 0}\), and \(\int \{\nabla ^2U_t-\nabla ^2W*\rho _t\}\,\textrm{d}\rho _t \succeq 0\) for every \(t\in [0,\tau ]\). Then,
Consequently, for any \(w\in S^{n-1}\),
and
Furthermore, the matrix \(\int _0^{t}{\mathcal {T}}_{\pm }(s)\,\textrm{d}s\) is matrix displacement convex, that is, for \(w\in S^{n-1}\), the function \(\textrm{c}_{w}:[0,\tau ]\rightarrow {\mathbb {R}}\) given by
Consequently, for every \(t\in [0,\tau ]\) and \(w\in S^{n-1}\),
and
The implications of Theorem 4.8 to intrinsic dimensional functional inequalities will be derived in Section 5.
The next result is analogous to Theorem 4.8 but applies to \({\mathcal {S}}(t)\) rather than \({\mathcal {T}}_{\pm }(t)\). Its implications to intrinsic dimensional functional inequalities will also be derived in Section 5.
Theorem 4.9
(Differential inequalities and matrix displacement convexity for \({\mathcal {S}}(t)\)) Suppose \((\rho _t,\theta _t)_{t\in [0,\tau ]}\) is a nice flow satisfying (3.3)-(3.4) with \(\sigma \in {\mathbb {R}}_{\ge 0}\), \(f'(r)\ge 0\) for every \(r\in {\mathbb {R}}_{\ge 0}\), and \(\int \{\nabla ^2U_t-\nabla ^2W*\rho _t\}\,\textrm{d}\rho _t \succeq 0\) for every \(t\in [0,\tau ]\). Then,
Consequently, for any \(w\in S^{n-1}\),
and
Furthermore, the matrix \({\mathcal {E}}(t)\) is matrix displacement convex, that is, for any \(w\in S^{n-1}\), the function \(\textrm{c}_{w}:[0,\tau ]\rightarrow {\mathbb {R}}\) given by
Consequently, for every \(t\in [0,\tau ]\) and \(w\in S^{n-1}\),
and
Remark 4.10
(Quantum drift-diffusion) As mentioned in Remark 1.8, the quantum drift-diffusion model is given by the \((\rho _t,\theta _t)_{t\in [0,\tau ]}\) satisfying
In order to compute the derivatives of \(E(t)\) it turns out to be convenient to use the identity
Then, the continuity equation implies (analogous to the proof of Lemma 3.1) that entropy decreases along the flow \((\rho _t)\) since
For the computation of \(\partial _t{\mathcal {I}}(t)\) apply Lemma 3.2 to write
and use
to get
Integration by parts shows that
and
It follows that
and hence
which establishes the monotonicity of the Fisher information matrix along the quantum drift-diffusion flow.
5 Intrinsic Dimensional Functional Inequalities
In this section Theorems 4.8 and 4.9 will be used to derive intrinsic dimensional functional inequalities. When the boundary conditions of (1.1) correspond to the planning problem, i.e., \((\rho _0,\rho _{\tau })=(\mu _a,\mu _z)\) for densities \(\mu _a,\mu _z\) over \(\Omega \), the time symmetry of the problem can be used:
Remark 5.1
(Time symmetry) The variational problem of (1.7) with the boundary conditions \((\mu _a,\mu _z)\) is time-symmetric. Consequently, if \((\rho _t,\theta _t)\) is the optimal flow with boundary conditions \((\mu _a,\mu _z)\), then the optimal flow with boundary conditions \((\mu _z,\mu _a)\) is \(({\tilde{\rho }}_t,{\tilde{\theta }}_t)\) where \({\tilde{\rho }}_t:=\rho _{\tau -t}\) and \({\tilde{\theta }}_t:=-\theta _{\tau -t}\). Hence, the matrices \({\tilde{{\mathcal {T}}}}_{\pm },{\tilde{{\mathcal {S}}}},{\tilde{{\mathcal {I}}}}\) associated with \(({\tilde{\rho }}_t,{\tilde{\theta }}_t)\) satisfy
which implies
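Explicitly, since \({\tilde{\rho }}_t=\rho _{\tau -t}\) while \(\nabla {\tilde{\theta }}_t=-\nabla \theta _{\tau -t}\), these relations take the form (a reconstruction consistent with the identities \({\tilde{{\mathcal {S}}}}(0)=-{\mathcal {S}}(\tau )\) and \({\tilde{{\mathcal {T}}}}_{\mp }(0)=-{\mathcal {T}}_{\pm }(\tau )\) used later in the proofs):

```latex
\tilde{\mathcal{I}}(t) = \mathcal{I}(\tau - t), \qquad
\tilde{\mathcal{S}}(t) = -\mathcal{S}(\tau - t), \qquad
\tilde{\mathcal{T}}_{\pm}(t) = -\mathcal{T}_{\mp}(\tau - t).
```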
The first intrinsic dimensional functional inequality describes the growth of the entropy along the flow.
Theorem 5.2
(Entropy growth) Suppose \((\rho _t,\theta _t)_{t\in [0,\tau ]}\) is a nice flow satisfying (3.3–3.4) with \(\sigma \in {\mathbb {R}}_{\ge 0}\), \(f'(r)\ge 0\) for every \(r\in {\mathbb {R}}_{\ge 0}\), and \(\int \{\nabla ^2U_t-\nabla ^2W*\rho _t\}\,\textrm{d}\rho _t \succeq 0\) for every \(t\in [0,\tau ]\). Then,
where \(\{\lambda _i(t)\}_{i=1}^{n}\) are the eigenvalues of \({\mathcal {S}}(t)\). Furthermore, under the planning problem boundary conditions \((\mu _a,\mu _z)\),
Proof
Inequality (5.1) and the left-hand side of inequality (5.2) are simply (4.17). To get the right-hand side of inequality (5.2), Remark 5.1 is used as follows. By (4.14),
where \(\{{\tilde{\lambda }}_i(0)\}_{i=1}^{n}\) are the eigenvalues of \({\tilde{{\mathcal {S}}}}(0)=-{\mathcal {S}}(\tau )\). By (4.16), \({\tilde{\lambda }}_i(0)\le \frac{1}{\tau }\le \frac{1}{t}\), which implies \(1+t\lambda _i(\tau )=1-t{\tilde{\lambda }}_i(0)\ge 0\). Hence, the integral over \(t\in [0,\tau ]\) on the left-hand side of (5.3) is equal to \(-\sum _{i=1}^{n}\log \left( 1+\tau \lambda _i(\tau )\right) \). The proof is complete by noting that \(\int _0^{\tau }{\tilde{S}}(t)\,\textrm{d}t=-\int _0^{\tau }S(t)\,\textrm{d}t\). \(\square \)
5.1 Viscous Flows
In this section the flow is assumed to be viscous, that is, \(\sigma \ne 0\). The first result pertains to the turnpike property of a viscous flow \((\rho _t,\theta _t)_{t\in [0,\tau ]}\) satisfying (3.3)-(3.4). The reader is referred to [9, 17, 20] for a discussion of the turnpike property; in this context it suffices to state the formulation of Clerc-Conforti-Gentil [9, Theorem 4.9]. They showed that when the flow \((\rho _t,\theta _t)\) is the entropic interpolation flow,
The next result improves on (5.4) by replacing the scalar inequality for the Fisher information by a matrix inequality for the Fisher information matrix, thus disposing of the ambient dimension \(n\). In addition, the result applies to settings beyond entropic interpolation.
Theorem 5.3
(Turnpike properties via dissipation of Fisher information) Suppose \((\rho _t,\theta _t)_{t\in [0,\tau ]}\) is a nice flow satisfying (3.3–3.4) with \(\sigma \in {\mathbb {R}}_{\ge 0}\), \(f'(r)\ge 0\) for every \(r\in {\mathbb {R}}_{\ge 0}\), and \(\int \{\nabla ^2U_t-\nabla ^2W*\rho _t\}\,\textrm{d}\rho _t \succeq 0\) for every \(t\in [0,\tau ]\). Then,
Proof
The proof is analogous to the proof of [9, Theorem 4.9]. By (4.10),
so
\(\square \)
The remaining results of this section are restricted to flows of the form
so that the potential is assumed to be independent of time, i.e.,
and the interaction term \(W\) is assumed to vanish. Under these assumptions an energy can be defined which is constant along the flow. Begin by defining
where the Hamiltonian \( H\) is given by
and where \(F\) satisfies
Lemma 5.4
(Preservation of energy) Suppose \((\rho _t,\theta _t)_{t\in [0,\tau ]}\) is a nice flow satisfying (5.5). Then, the energy \(O(t)\) is constant along \(t\in [0,\tau ]\).
Proof
First note that
By Corollary 3.6,
while, on the other hand,
It follows that \(\partial _tO(t)=0\). \(\square \)
In light of Lemma 5.4 define
Next define the cost
where the Lagrangian \(L\) is given by
The relation between the cost \(C_{\tau }\), the entropy \(E\), the energy \(O_{\tau }\), and the matrix \({\mathcal {T}}_{\pm }\) is captured by the following lemma:
Lemma 5.5
Suppose \((\rho _t,\theta _t)_{t\in [0,\tau ]}\) is a nice flow satisfying (5.5). Then,
Proof
By definition
so
\(\square \)
With Lemma 5.5 in hand the following intrinsic dimensional functional inequality for the combination of cost, entropy, and energy can be proved.
Theorem 5.6
(Cost inequalities) Suppose \((\rho _t,\theta _t)_{t\in [0,\tau ]}\) is a nice flow satisfying (5.5) with \(\sigma >0\), \(f'(r)\ge 0\) for every \(r\in {\mathbb {R}}_{\ge 0}\), and \(\int \nabla ^2U\,\textrm{d}\rho _t \succeq 0\) for every \(t\in [0,\tau ]\). Then,
where \(\{\lambda _i(t)\}_{i=1}^{n}\) are the eigenvalues of \({\mathcal {T}}_{\pm }(t)\). Furthermore, under the planning problem boundary conditions \((\mu _a,\mu _z)\),
where \(\{\lambda _i(t)\}_{i=1}^{n}\) are the eigenvalues of \({\mathcal {T}}_{\pm }(t)\).
Proof
Inequality (5.9) and the left-hand side of inequality (5.10) follow from (4.11) and Lemma 5.5. For the right-hand side of inequality (5.10), use (4.8) and Remark 5.1 to get
where \(\{\lambda _i(0)\}_{i=1}^{n}\) are the eigenvalues of \({\mathcal {T}}_{\pm }(0)\) and \(\{{\tilde{\lambda }}_i(0)\}_{i=1}^{n}\) are the eigenvalues of \({\tilde{{\mathcal {T}}}}_{\mp }(0)=-{\mathcal {T}}_{\pm }(\tau )\). Hence,
Using
and Lemma 5.5, gives
The next step is to note that by (4.10), \(-\frac{1}{t}\le -\frac{1}{\tau }\le \lambda _i(\tau )\), which implies that \(0\le 1+ t\lambda _i(\tau )\). Hence, the integral over \(t\in [0,\tau ]\) of the right-hand side of (5.11) is equal to \(\frac{\sigma }{2}\sum _{i=1}^{n}\log (1+\tau \lambda _i(\tau ))\). Similarly, (4.10) gives \(\lambda _i(0)\le \frac{1}{\tau }\le \frac{1}{t}\), which implies \(1-t\lambda _i(0)\ge 0\). Hence, the integral over \(t\in [0,\tau ]\) of the left-hand side of (5.11) is equal to \(-\frac{\sigma }{2}\sum _{i=1}^{n}\log (1-\tau \lambda _i(0))\). \(\square \)
Remark 5.7
(Intrinsic dimensional local logarithmic Sobolev inequalities) The inequalities of Theorem 5.6 can be viewed as a generalization of the intrinsic dimensional local logarithmic Sobolev inequalities for the Euclidean heat semigroup [16, Equations (29) and (30)]. In particular, consider the entropic interpolation setting with \(U_t=W=f=0\) and \(\sigma =1\). Fix \(x\in {\mathbb {R}}^{n}\) and take \(\mu _a:=\delta _x\) and \(\,\textrm{d}\mu _z(y):=h(y)p_{\tau }(x,y)\) where \(h\ge 0\) and \(p_{\tau }\) is the heat kernel associated with the Euclidean heat semigroup \(\textrm{P}_{\tau }\). Then, using the explicit expression for \((\rho _t)\) in [10, Remark 4.1], one can formally derive the inequalities of [16, Equations (29) and (30)] for the function h evaluated at x:
In [10], this entropic interpolation flow was used to prove the dimensional local log-Sobolev inequalities which are weaker than the intrinsic dimensional local log-Sobolev inequalities (5.12)-(5.13).
5.2 Entropic Interpolation Flows
In this section the flow will be assumed to be the entropic interpolation flow,
over the domain \(\Omega ={\mathbb {R}}^{n}\). In contrast to Section 5.1 where the energy preserved along the flow is a scalar quantity, in the entropic interpolation setting a matrix quantity is preserved as well. Define the matrix energy as
Lemma 5.8
(Preservation of matrix energy) Suppose that \((\rho _t,\theta _t)\) is a nice flow satisfying (5.14). Then, the energy matrix \({\mathcal {O}}(t)\) is constant along \(t\in [0,\tau ]\).
Proof
The proof is immediate by Lemma 3.5 which states
\(\square \)
In light of Lemma 5.8 set
In the setting of entropic interpolation a matrix cost can be defined as
with its trace
The relation between the matrix cost and the matrix energy is captured by the following lemma, but first a remark is in order.
Remark 5.9
(Time scaling) Define
where the minimum is over flows satisfying the continuity equations with boundary conditions:
Then, the Euler-Lagrange equations of (5.19) are
and it is easy to see that \(({\tilde{\rho }}_t,{\tilde{\theta }}_t)=(\rho _{\tau t},\tau \theta _{\tau t})\) where \((\rho _t,\theta _t)\) satisfy (5.14).
Lemma 5.10
Suppose that \((\rho _t,\theta _t)\) is a nice flow satisfying (5.14). Assume that \(\tau \mapsto {\mathcal {C}}_{\tau }(\mu _a,\mu _z)\) is differentiable. Then,
Proof
By the envelope theorem [36] and Remark 5.9,
where \(({\tilde{\rho }}_t)_{t\in [0,1]}\) is the optimal flow in \({\mathcal {A}}_{\tau }(\mu _a,\mu _z)\) given by \({\tilde{\rho }}_t=\rho _{\tau t}\). Hence, by the change of variables \(t\mapsto \frac{t}{\tau }\),
On the other hand, changing variables \(t\mapsto \tau t\) shows that
so it follows that
which implies the result. \(\square \)
To set up the first main result of this section recall the definition of the matrix entropy (2.8) and define
so that
The following result is the intrinsic dimensional improvement of [9, Theorem 4.6] by Clerc-Conforti-Gentil, and is proved similarly.
Theorem 5.11
(Large time asymptotics for cost and energy) Suppose that \((\rho _t,\theta _t)\) is a nice flow satisfying (5.14). Then,
and, consequently,
Moreover,
Proof
To prove (5.23) note that
which implies
Adding \(\frac{\sigma ^2}{4}{\mathcal {I}}(t)\) to both sides of (5.27) gives
which is equivalent to
Taking \(t=0\) gives
where the last inequality holds by (4.10). This establishes (5.23).
To prove (5.24) integrate (5.23) over t from 0 to \(\tau \) and use Lemma 5.10 to get
Finally, to prove (5.25) note that by (5.26), Lemma 3.5, and Theorem 4.1,
which shows that \([0,\tau ]\ni s\mapsto \int \left[ \nabla \theta _s+\frac{\sigma }{2}\nabla \log \rho _s\right] ^{\otimes 2}\,\textrm{d}\rho _s\) is non-decreasing. Hence,
Since
the bound (5.25) follows by applying (5.24). \(\square \)
The next result is the intrinsic dimensional improvement of the evolution variational inequality for the entropic cost of Ripani [40, Corollary 11], and is proved similarly.
Theorem 5.12
(Evolution variational inequality) Fix \(\tau =1, ~\sigma =\sqrt{2}\), and suppose that \((\rho _t,\theta _t)\) is a nice flow satisfying (5.14). Let \((\textrm{P}_t)\) be the heat semigroup in \({\mathbb {R}}^{n}\) and suppose that \(t\mapsto C_1(\mu _a,\textrm{P}_t\mu _z)\) is differentiable. Then, for any normalized basis \(\{w_i\}_{i=1}^{n}\) of \({\mathbb {R}}^{n}\) and fixed \(t\in [0,1]\),
Proof
Let \(\{w_i\}_{i=1}^{n}\) be any normalized basis of \({\mathbb {R}}^{n}\) and fix \(t\in [0,1]\). Consider the entropic interpolation flow between \(\mu _a\) and \(\textrm{P}_t\mu _z\) so, by (4.15), \(\textrm{c}_{w_i}(t):=e^{-\langle w_i,{\mathcal {E}}(t)w_i\rangle }\) is concave, and hence,
Inequality (5.28) is equivalent to
which upon rearrangement gives
Using the definition of \(\textrm{c}_{w_i}(1)\), and summing over i in (5.29), gives
By [40, Theorem 9],
so (5.30) implies
By the semigroup property, inequality (5.31) can be applied at any t to yield the result. \(\square \)
Finally, the last result is the intrinsic dimensional improvement of the entropic cost contraction of Ripani [40, Corollary 13], and is proved similarly.
Theorem 5.13
(Contraction of entropy cost) Fix \(\tau =1, ~\sigma =\sqrt{2}\), and suppose that \((\rho _t,\theta _t)\) is a nice flow satisfying (5.14). Let \((\textrm{P}_t)\) be the heat semigroup in \({\mathbb {R}}^{n}\) and suppose that \(t\mapsto C_1(\mu _a,\textrm{P}_t\mu _z)\) is differentiable. Then,
Proof
Fix \(s\in [0,1]\) and let \(\{w_i\}_{i=1}^{n}\) be any normalized basis of \({\mathbb {R}}^{n}\). Applying Theorem 5.12 with \(\mu _a\mapsto \textrm{P}_s\mu _a\) gives
On the other hand, by time symmetry (Remark 5.1),
and switching the roles of \(\mu _a\) and \(\mu _z\) thus gives
Taking \(\mu _z\mapsto \textrm{P}_s \mu _z\) yields
and adding (5.32) and (5.33) shows that
Taking \(s=t\) yields
where time symmetry (Remark 5.1) was used to write \({\tilde{S}}(t)=-S(1-t)\) and hence \({\mathcal {E}}_{1}(\textrm{P}_t \mu _z,\textrm{P}_t \mu _a)=-{\mathcal {E}}_{1}(\textrm{P}_t \mu _a,\textrm{P}_t \mu _z)\). Integrating over t from 0 to 1, and using \(\cosh r=\frac{e^r+ e^{-r}}{2}\), gives
Finally, since \(\sinh ^2 r=\frac{\cosh (2r)-1}{2}\),
\(\square \)
Notes
Some of these assumptions can in fact be relaxed, cf. Section 4.2.
The incompressible Euler equations have the additional constraint \(\Delta \theta _t=0\).
References
Bakry, D., Gentil, I., Ledoux, M.: Analysis and geometry of Markov diffusion operators, Grundlehren der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 348. Springer, Cham (2014)
Benamou, J.-D., Brenier, Y.: A computational fluid mechanics solution to the Monge-Kantorovich mass transfer problem. Numer. Math. 84, 375–393, 2000
Bialynicki-Birula, I., Mycielski, J.: Nonlinear wave mechanics. Ann. Phys. 100, 62–93, 1976
Carles, R.: On the semi-classical limit for the nonlinear Schrödinger equation, Stationary and time dependent Gross-Pitaevskii equations, Contemp. Math., vol. 473, Amer. Math. Soc., Providence, RI, pp. 105–127, (2008)
Carles, R.: Logarithmic Schrödinger equation and isothermal fluids. EMS Surv. Math. Sci. 9, 99–134, 2022
Chen, Y., Georgiou, T.T., Pavon, M.: On the relation between optimal transport and Schrödinger bridges: a stochastic control viewpoint. J. Optim. Theory Appl. 169, 671–691, 2016
Chen, Y., Georgiou, T.T., Pavon, M.: Stochastic control liaisons: Richard Sinkhorn meets Gaspard Monge on a Schrödinger bridge. SIAM Rev. 63, 249–313, 2021
Clarke, S.R., Miller, P.D.: On the semi-classical limit for the focusing nonlinear Schrödinger equation: sensitivity to analytic properties of the initial data. R. Soc. Lond. Proc. Ser. A Math. Phys. Eng. Sci. 458, 135–156, 2002
Clerc, G., Conforti, G., Gentil, I.: Long-time Behaviour of Entropic Interpolations. Potential Anal. 59, 65–95, 2023
Clerc, G., Conforti, G., Gentil, I.: On the variational interpretation of local logarithmic Sobolev inequalities. Annales de la Faculté des Sciences de Toulouse (to appear)
Constantin, P., Drivas, T.D., Nguyen, H.Q., Pasqualotto, F.: Compressible fluids and active potentials. Ann. Inst. H. Poincaré C Anal. Non Linéaire 37, 145–180, 2020
Constantin, P., Drivas, T.D., Shvydkoy, R.: Entropy hierarchies for equations of compressible fluids and self-organized dynamics. SIAM J. Math. Anal. 52, 3073–3092, 2020
Costa, M.H.M.: A new entropy power inequality. IEEE Trans. Inform. Theory 31, 751–760, 1985
d’Avenia, P., Montefusco, E., Squassina, M.: On the logarithmic Schrödinger equation. Commun. Contemp. Math. 16, 1350032, 2014
Erbar, M., Kuwada, K., Sturm, K.-T.: On the equivalence of the entropic curvature-dimension condition and Bochner’s inequality on metric measure spaces. Invent. Math. 201, 993–1071, 2015
Eskenazis, A., Shenfeld, Y.: Intrinsic dimensional functional inequalities on model spaces, arXiv preprint arXiv:2303.00784 (2023).
Faulwasser, T., Grüne, L.: Turnpike properties in optimal control: An overview of discrete-time and continuous-time results. Handb. Numer. Anal. 23, 367–400, 2022
Ferriere, G.: The focusing logarithmic Schrödinger equation: analysis of breathers and nonlinear superposition. Discrete Contin. Dyn. Syst. 40, 6247–6274, 2020
Gentil, I., Léonard, C., Ripani, L.: About the analogy between optimal transport and minimal entropy. Ann. Fac. Sci. Toulouse Math. 26, 569–601, 2017
Geshkovski, B., Zuazua, E.: Turnpike in optimal control of PDEs, ResNets, and beyond. Acta Numer 31, 135–263, 2022
Gianazza, U., Savaré, G., Toscani, G.: The Wasserstein gradient flow of the Fisher information and the quantum drift-diffusion equation. Arch. Ration. Mech. Anal. 194, 133–220, 2009
Gomes, D., Seneci, T.: Displacement convexity for first-order mean-field games, arXiv preprint arXiv:1807.07090 (2018).
Graber, P.J., Mészáros, A.R., Silva, F.J., Tonon, D.: The planning problem in mean field games as regularized mass transport. Calc. Var. Partial Differential Equations 58, Paper No. 115, 28 pp. (2019)
Grenier, E.: Semiclassical limit of the nonlinear Schrödinger equation in small time. Proc. Amer. Math. Soc. 126, 523–530, 1998
Huang, M., Malhamé, R.P., Caines, P.E.: Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle. Commun. Inf. Syst. 6, 221–251, 2006
Jordan, R., Kinderlehrer, D., Otto, F.: The variational formulation of the Fokker-Planck equation. SIAM J. Math. Anal. 29, 1–17, 1998
Kamvissis, S., McLaughlin, K.D. T.-R., Miller, P.D.: Semiclassical soliton ensembles for the focusing nonlinear Schrödinger equation, Annals of Mathematics Studies, vol. 154, Princeton University Press, Princeton, NJ, (2003).
Ketterer, C., Mondino, A.: Sectional and intermediate Ricci curvature lower bounds via optimal transport. Adv. Math. 329, 781–818, 2018
Khesin, B., Misiołek, G., Modin, K.: Geometric hydrodynamics and infinite-dimensional Newton’s equations. Bull. Amer. Math. Soc. (N.S.) 58, 377–442, 2021
Kim, Y.-H., Pass, B.: Nonpositive curvature, the variance functional, and the Wasserstein barycenter. Proc. Amer. Math. Soc. 148, 1745–1756, 2020
Lasry, J.-M., Lions, P.-L.: Mean field games. Jpn. J. Math. 2, 229–260, 2007
Léonard, C.: On the convexity of the entropy along entropic interpolations, Measure theory in non-smooth spaces. Partial Differ. Equ. Meas. Theory, De Gruyter Open, Warsaw, pp. 194–242 (2017)
Lott, J., Villani, C.: Ricci curvature for metric-measure spaces via optimal transport. Ann. Math. 169, 903–991, 2009
McCann, R.J.: A convexity principle for interacting gases. Adv. Math. 128, 153–179, 1997
Petrovitch, M.: Sur une manière d’étendre le théorème de la moyenne aux équations différentielles du premier ordre. Math. Ann. 54, 417–436, 1901
Milgrom, P., Segal, I.: Envelope theorems for arbitrary choice sets. Econometrica 70, 583–601, 2002
Miller, P.D., Kamvissis, S.: On the semiclassical limit of the focusing nonlinear Schrödinger equation. Phys. Lett. A 247, 75–86, 1998
Otto, F.: Dynamics of labyrinthine pattern formation in magnetic fluids: A mean-field theory. Arch. Ration. Mech. Anal. 141, 63–103, 1998
Porretta, A.: Regularizing effects of the entropy functional in optimal transport and planning problems. J. Funct. Anal. 284, 109759, 2023
Ripani, L.: Convexity and regularity properties for entropic interpolations. J. Funct. Anal. 277, 368–391, 2019
Sturm, K.-T.: On the geometry of metric measure spaces. I. Acta Math. 196, 65–131, 2006
Sturm, K.-T.: On the geometry of metric measure spaces. II. Acta Math. 196, 133–177, 2006
Villani, C.: Topics in Optimal Transportation, Graduate Studies in Mathematics, vol. 58. American Mathematical Society, Providence, RI (2003)
von Renesse, M.-K.: An optimal transport view of Schrödinger’s equation. Canad. Math. Bull. 55, 858–869, 2012
Acknowledgements
Many thanks to Giovanni Conforti for helpful remarks on the manuscript. Thanks also to Alexandros Eskenazis, Matthew Rosenzweig, and Gigliola Staffilani for their comments and suggestions on this work. I am very grateful to the two anonymous referees for their careful remarks that improved this manuscript. This material is based upon work supported by the National Science Foundation under Award Numbers 2002022 and DMS-2331920. No data is associated with this manuscript. The author has no relevant financial or non-financial interests to disclose.
Communicated by A. Figalli.
Shenfeld, Y. Matrix Displacement Convexity Along Density Flows. Arch Rational Mech Anal 248, 74 (2024). https://doi.org/10.1007/s00205-024-02021-8