Abstract
We study a class of Hamilton–Jacobi partial differential equations in the space of probability measures. In the first part of this paper, we prove comparison principles (implying uniqueness) for this class. In the second part, we establish the existence of a solution and give a representation using a family of partial differential equations with control. A large part of our analysis exploits special structures of the Hamiltonian, which might look mysterious at first sight. However, we show that this Hamiltonian structure arises naturally as the limit of Hamiltonians of microscopic models. Indeed, in the third part of this paper, we informally derive the Hamiltonian studied before, in the context of fluctuation theory on the hydrodynamic scale. The analysis is carried out for a specific model of stochastic interacting particles in gas kinetics, namely a version of the Carleman model. We use a two-scale averaging method on Hamiltonians defined in the space of probability measures to derive the limiting Hamiltonian.
1 Introduction
1.1 An overview
We develop a variational approach to derive macroscopic hydrodynamic equations from particle models. Within this broad context, this article studies a new class of Hamilton–Jacobi equations in the space of probability measures. Specifically, we study the convergence of Hamiltonians and establish well-posedness for a limiting Hamilton–Jacobi equation using a model problem in statistical physics. However, our main interest is to develop a new scale-bridging methodology, rather than to advance the understanding of the specific model.
The theory of hydrodynamic limits can be divided into deterministic and stochastic theories. The goal of the deterministic approach is to derive continuum-level conservation laws as a scaling limit of particle motions on the microscopic level, which satisfy Hamiltonian ordinary differential equations. At a key step, this requires the use of ergodic theory for Hamiltonian dynamical systems. Such a theory, at a level at which it can be applied successfully, is not readily available for a wide range of problems. Consequently, the program of rigorously deriving hydrodynamic limits from deterministic models remains a huge challenge [40, Chapter 1]. (See, however, the work of Dolgopyat and Liverani [11] on weakly interacting geodesic flows on manifolds of negative curvature, which has made progress in this direction.) This topic has a long history; indeed, the passage from atomistic to continuum models is mentioned by Hilbert in his explanation of his sixth problem (see [32] for a recent review). Stochastic hydrodynamics, on the other hand, relies on probabilistic interacting particle models, and there the program has been more successful. Conceptually, one often thinks of these stochastic models as regularizations of underlying deterministic models. Usually, the interpretation is that, at an appropriate intermediate scale, a class of particles develops contact with a fast-oscillating environment, which is modeled by stochastic noise. With randomness injected into the particle motions, we can invoke probabilistic ergodic theorems. Probabilistic ergodic theory is much tamer than its counterpart for deterministic Hamiltonian dynamics. Hence the program can be carried out with rigor for a wide variety of problems. In many cases in stochastic hydrodynamics, convergence towards a macroscopic equation can be seen as a sophisticated version of the law of large numbers.
Large deviation theory describes fluctuations around this macroscopic limit equation and offers finer information in the form of a variational structure through a so-called rate function. In this context, the rate function automatically describes the hydrodynamic (macroscopic) limit equation as its minimizer. A Hamiltonian formulation of large deviation theory for Markov processes has been developed by Feng and Kurtz [24]; we will follow this method in Sect. 4 of this paper. The literature on the large deviation approach to stochastic hydrodynamics is so extensive that we do not attempt to review it (e.g., Spohn [40], Kipnis and Landim [34]). Instead, we only mention the seminal work of Guo, Papanicolaou and Varadhan [33], which introduced a major novel technique, known as block averaging (the replacement lemma), to handle multi-scale convergence. With refinements and variants, this block-averaging method remains the standard of the subject to the present day.
Despite the power and beauty of block averaging and the replacement lemma, this method relies critically on probabilistic ergodic theories, and is thus largely restricted to stochastic models. In Sect. 4, we introduce a functional-analytic approach for scale-bridging which is different from the one of Guo, Papanicolaou and Varadhan [33]. We will still consider a stochastic model in this paper, but our method is developed with a view to being applicable in the deterministic program as well. Our approach takes inspiration from the Aubry–Mather theory for deterministic Hamiltonian dynamical systems, and, more directly, from a deeply linked topic known as weak KAM (Kolmogorov–Arnold–Moser) theory (e.g., Fathi [23]). We will derive and study an infinite particle version of a specific weak KAM problem. By infinite particle version, we mean a Hamiltonian describing the motion of infinitely many particles. Hence it is defined in the space of probability measures. To avoid raising false hopes, we stress again that the model we use still incorporates randomness. However, in principle, the Hamilton–Jacobi part of our program is not fundamentally tied to probabilistic ergodic theory. We choose a stochastic model problem to test ideas, as the infinite particle version of weak KAM theory in this case becomes simple and directly solvable. At the present time, a general infinite particle version of weak KAM theory does not exist. This is in sharp contrast with the very well-developed theories for finite particle versions of deterministic Hamiltonian dynamics defined in finite-dimensional compact domains. Therefore, it is useful for us to focus on a more modest goal in this article—we are content with developing a method of studying problems where other (probabilistic) approaches may apply in principle, though we are not aware of such results for Carleman-type models.
To put things in perspective, we hope the study of this paper will reveal the relevance and importance of studying Hamilton–Jacobi partial differential equations in the space of probability measures, and open up many possibilities for this type of problem in the future.
In light of the preceding discussion, we study the convergence of Hamiltonians arising from the large deviation setting in Sect. 4 using a Hamilton–Jacobi theory developed by Feng and Kurtz [24]. This approach is different from the usual approach to large deviations, which can be viewed as a Lagrangian technique.
To understand the convergence of Hamilton–Jacobi equations, following a topological compactness-uniqueness strategy, we need to resolve two issues: one is the multi-scale convergence of particle-level Hamiltonians to the continuum-level Hamiltonian; the other is the uniqueness for a class of abstract Hamilton–Jacobi equations which includes the limiting continuum Hamiltonian. We settle the first issue in Sect. 4 in a semi-rigorous manner, and the second issue rigorously in Sects. 2 and 3.
We first sketch how the second problem—existence and uniqueness for a class of Hamilton–Jacobi equations—is addressed in this paper; this is a question of independent interest, and corresponds to a hard issue for the traditional Lagrangian approach to hydrodynamic limits and large deviations, namely matching the large deviation upper and lower bounds. Essentially, we may not know how regular the paths need to be in order to approximate the Lagrangian action accurately for all paths. In the Hamiltonian setting advocated here, this problem can be solved rigorously for the model problem considered in this article. Indeed, we develop a method to establish a strong form of uniqueness (the comparison principle) for the macroscopic Hamilton–Jacobi equation. The analysis uses viscosity solution techniques in the space of probability measures developed by Feng and Kurtz [24] and Feng and Katsoulakis [25]. This is a relatively new topic and a step forward compared to earlier studies initiated by Crandall and Lions [4,5,6,7,8,9,10] on Hamilton–Jacobi equations in infinite dimensions, focusing on Hilbert spaces. Our Hamiltonian has a structure which is closer in spirit to the one studied by Crandall and Lions [8] (but with a nonlinear operator and other subtle differences), to Example 9.35 in Chapter 9 and Section 13.3.3 in Chapter 13 of [24], and to Example 3 of [25], or Feng and Swiech [26]. It is different in structure from those studied by Gangbo, Nguyen and Tudorascu [28] and Gangbo and Tudorascu [30] in Wasserstein space, and from those studied with metric analysis techniques by Giga, Hamamuki and Nakayasu [31], Ambrosio and Feng [1], Gangbo and Swiech [29]. The difference is that we have to deal with an unbounded drift term given by a nonlinear operator, which we explain next. In (1.1) below, we first informally define the Hamiltonian as \(H=H(\rho ,\varphi )\) for a probability measure \(\rho \) and a smooth test function \(\varphi \).
This definition contains a nonlinear term \(\partial _{xx}^2 \log \rho \). A priori, \(\rho \) is just a probability measure, so even the definition of this expression is a problem. If the probability measure \(\rho \) is zero on a set of positive Lebesgue measure, \(\log \rho \) cannot even be defined in a distributional sense. In addition, our notion of a solution requires more than the B-continuity studied in [8]: for the large deviation theory, we need the solution to be continuous in the metric topology on the space (and it in fact is). This can be established through an a posteriori estimate technique which we introduce in Lemma 2.8. Regarding the possible singularity of \(\partial _{xx}^2 \log \rho \), in the rigorous treatment we will use non-smooth test functions \(\varphi \) in the Hamiltonian to compensate for the possible loss of distributional derivatives of the term \(\log \rho \). This is essentially a renormalization idea, in the sense that we rewrite the equation in appropriately chosen new coordinates to “tame” the singularities. Section 2 explores a hidden controlled gradient flow structure in the Hamiltonian. Using a theorem by Feng and Katsoulakis [25] and the regularization technique in Lemma 2.8, we establish a comparison principle for the Hamilton–Jacobi equation with the Hamiltonian given by (1.1). For the existence of a solution, we argue in the Lagrangian picture. Here, the problem translates into a nonlinear parabolic problem, which again is quite singular. We establish in Sect. 3 an existence theory using the theory of optimal control and Nisio semigroups, and by deriving some non-trivial estimates.
We now describe the approach for dealing with the first problem, the multi-scale convergence. This is discussed at the end of the paper (Sect. 4) and involves some semi-rigorous arguments. As this part involves a broad spectrum of techniques coming from different areas of mathematics, a rigorous justification and a detailed description would be long; we postpone them to future studies. The stochastic model we use here is known as the stochastic Carleman model studied by Caprino, De Masi, Presutti and Pulvirenti [3]. This is a fictitious system of interacting stochastic particles describing a two-velocity gas, and leads to the Carleman equation as the kinetic description. At a different (coarser) level, a hydrodynamic limit theorem has been derived by Kurtz [35] and then by McKean [38]. It yields the nonlinear diffusion equation studied in Sect. 3. Ideas of Lions and Toscani [37] to study this equation in terms of the density and the flux turn out to be useful. Here, following the Hamilton–Jacobi method of [24], we study large deviations of the stochastic Carleman equations. We give three heuristic derivations to identify the limit Hamiltonian, which is the one given in (1.1) and studied in the earlier parts of the paper. We now sketch these three limit identifications. The first is based on a formal weak KAM theory in an infinite-dimensional setting. The second approach is based on a finite-dimensional weak KAM theory, and the key is a reduction due to propagation of chaos. The third derivation uses semiclassical approximations. We remark that our overall aim is to provide new functional-analytic methods for deriving the limiting continuum-level Hamiltonian in this hydrodynamic large deviation setting. We turn the multi-scale issue into one of studying a small-cell Hamiltonian averaging problem in the space of probability measures.
In the present case, this can be solved at least formally by the weak KAM theory for Hamiltonian dynamical systems, which is a deterministic method.
Our program combines tools from a variety of sources, notably viscosity solutions in the space of probability measures, optimal transport, parabolic estimates, optimal control, Markov processes, and Hamiltonian dynamics. We use weak KAM type arguments to replace stronger versions of ergodic theory in the derivation of the limiting Hamiltonian.
1.2 The setting
Let \({\mathcal {O}}\) be the one-dimensional circle, i.e. the unit interval [0, 1] with periodic boundary conditions, obtained by identifying 0 and 1. We denote by \({\mathcal {P}}({\mathcal {O}})\) the space of probability measures on \({\mathcal {O}}\) and formally define a Hamiltonian function on \({\mathcal {P}}({\mathcal {O}}) \times C^\infty ({\mathcal {O}})\):
We use the word formal because even for probability measures admitting a Lebesgue density \(\rho (dx)=\rho (x) dx \in \mathcal P({\mathcal {O}})\), as long as \(\rho (x)=0\) on a set of positive Lebesgue measure in \({\mathcal {O}}\), we have \(-\log \rho (x) = +\infty \) on this set. In such cases, \(\partial _{xx}^2 \log \rho \) cannot be defined as a distribution. Therefore, we will explore special choices of the test functions \(\varphi \), which are \(\rho \)-dependent and possibly non-smooth, to compensate for the loss of distributional derivatives of the \(\log \rho \) term.
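To make the degeneracy concrete, the following minimal numerical sketch (an illustration added here, not part of the paper's analysis) floors a density that vanishes on part of the circle at a level \(\epsilon >0\) and evaluates a centered second difference of \(\log \rho_\epsilon \); the discrete version of \(\partial _{xx}^2 \log \rho_\epsilon \) blows up as \(\epsilon \downarrow 0\), reflecting that the limit expression is not even distributionally defined.

```python
import numpy as np

def d2_log(rho, h):
    """Centered second difference of log(rho) on a periodic grid of step h."""
    lr = np.log(rho)
    return (np.roll(lr, -1) - 2 * lr + np.roll(lr, 1)) / h**2

N = 200
h = 1.0 / N
x = np.arange(N) * h
# A smooth bump vanishing on half of the circle, floored at eps > 0.
bump = np.where((x > 0.25) & (x < 0.75),
                np.sin(2 * np.pi * (x - 0.25)) ** 2, 0.0)
for eps in (1e-2, 1e-4, 1e-6):
    rho = np.maximum(bump, eps)
    rho /= rho.sum() * h              # normalize to a probability density
    print(eps, np.abs(d2_log(rho, h)).max())  # grows as eps shrinks
```

The blow-up is concentrated at the boundary of the support of the bump, exactly where \(\log \rho \) transitions to \(-\infty \) in the limit.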
We will introduce a number of notations and definitions in Sect. 1.3. In particular, we denote \({{\mathsf {X}}}:=\mathcal P({\mathcal {O}})\) and define a homogeneous negative order Sobolev space \(H_{-1}({\mathcal {O}})\) according to (1.14). In Sect. 1.3, we will show that \({{\mathsf {X}}}\) can be identified as a closed subset of this \(H_{-1}({\mathcal {O}})\). Hence it is a metric space as well. With the formal Hamiltonian function (1.1), we can now proceed to the second step to introduce a formally defined operator
where the test functions f are only chosen to be very smooth,
In the first part of this paper (Sect. 2), we prove a comparison principle (Theorem 2.1) for a Hamilton–Jacobi equation in the space of probability measures. This equation is formally written as
In this equation, the function h and the constant \(\alpha >0\) are given, and f is a solution. However, making rigorous sense of (1.3) is very subtle. Motivated by a priori estimates, we make sense of the operator H by introducing two more operators \(H_0\) and \(H_1\), and interpret equation (1.3) as two families of inequalities (1.28) and (1.29), which define viscosity sub- and super-solutions (Definition 1.1). The comparison principle in Theorem 2.1 compares the sub- and super-solutions of these two families of inequalities. This result implies in particular that there is at most one function f which is both a sub- and a super-solution.
In the second part of the paper (Sect. 3), we construct solutions by studying the Lagrangian dynamics associated with the Hamiltonian H in (1.1). A Legendre dual transform of the formal Hamiltonian gives a Lagrangian function
(the norm is defined in (1.13) below). We define an action on \({\mathcal {P}}({\mathcal {O}})\)-valued curves by
One can consider variational problems with this action defined in the space of curves \(\rho (\cdot )\), or equivalently, consider a nonlinear partial differential equation with control,
with \(\rho (r,\cdot )\) being the state variable, \(\eta (t,\cdot )\) (or equivalently \(\partial _t \rho \)) being a control, and \(A_T\) being a running cost. We adopt the control interpretation and define the class of admissible controls as those satisfying
We also define a value function for the above optimal control problem,
Then, assuming \(h \in C_b({{\mathsf {X}}})\), we show that
is both a sub-solution to (1.28) and a super-solution to (1.29) (see Lemma 3.15). This gives us an existence result for the Hamilton–Jacobi PDE (1.3) in the setting we introduced. Hence, by the comparison principle proved earlier, it is the only solution.
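The uncontrolled dynamics underlying this control problem is the nonlinear diffusion \(\partial _t \rho = \partial _{xx}^2 \Phi (\rho )\) with \(\Phi (r)= \frac{1}{2} \log r\) (the equation discussed in Sect. 3). As a hedged numerical sketch, not the construction used in Sect. 3, the following explicit finite-difference scheme evolves a strictly positive periodic density under \(\partial _t \rho = \frac{1}{2}\partial _{xx}^2 \log \rho \); mass is conserved, the free energy \(S(\rho )=\int \rho \log \rho \,dx\) decreases, and the density flattens toward the uniform measure.

```python
import numpy as np

N, T_steps, dt = 100, 2000, 1e-5
h = 1.0 / N
x = np.arange(N) * h
rho = 1.0 + 0.5 * np.sin(2 * np.pi * x)   # strictly positive initial density

def step(rho):
    """One explicit Euler step for  d_t rho = (1/2) d_xx log(rho),  periodic BC."""
    lr = np.log(rho)
    lap = (np.roll(lr, -1) - 2 * lr + np.roll(lr, 1)) / h**2
    return rho + 0.5 * dt * lap

entropy = lambda r: np.sum(r * np.log(r)) * h   # discrete free energy S
S0, mass0 = entropy(rho), rho.sum() * h
for _ in range(T_steps):
    rho = step(rho)
# After evolving: mass unchanged, entropy(rho) < S0, max|rho - 1| shrinks.
```

The time step respects the explicit-scheme stability constraint for the effective diffusivity \(1/(2\rho )\); for densities degenerating toward zero this scheme would fail, which is one numerical echo of the singularity of \(\Phi \) at \(r=0\) discussed in Sect. 3.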
The formal basis for the existence results above comes from an observation that
We emphasize again that \(\log \rho \) may not be a distribution, hence the above variational representation is not rigorous. However, it suggests at least formally that H is a Nisio semigroup generator associated with the family of nonlinear diffusion equations with control (1.5). We also comment that the value function \(R_\alpha h:{{\mathsf {X}}}\mapsto {\bar{{{\mathbb {R}}}}}: = {{\mathbb {R}}}\cup \{ \pm \infty \}\) introduced before is well defined for all \(h:{{\mathsf {X}}}\mapsto {\bar{{{\mathbb {R}}}}}\) satisfying
This includes in particular the class of measurable \(h:{{\mathsf {X}}}\mapsto {\bar{{{\mathbb {R}}}}}\) which are bounded from above, \(\sup _{{\mathsf {X}}}h<+\infty \). Additionally, the precise meaning of the control equation (1.5) is given in Definition 3.1. We establish existence and some regularity properties of solutions in Lemmas 3.2 and 3.5. Finally, in Sect. 3 we use that the partial differential equation (1.5) can also be written as a system in the density-flux variables \((\rho ,j)\)
This “change of coordinates” turns out to be very useful when we justify the derivation of the Hamiltonian H from microscopic models in the last part of the paper.
The third part of this paper, Sect. 4, is, unlike the other parts of the paper, non-rigorous. The purpose of this section is to place the results of the first two parts in the context of a bigger program by explaining the significance of studying the equation (1.3). Specifically, in Sect. 4, we will informally derive the Hamiltonian H given in (1.1) in the context of Hamiltonian convergence using generalized multi-scale averaging techniques (for operators on functions in the space of probability measures). Our starting point is a stochastic model of a microscopically defined particle system in gas kinetics. By a two-scale hydrodynamic rescaling and by taking the number of particles to infinity, the (random) empirical measure of particle number density \(\rho _\epsilon \) follows an asymptotic expression
where \(A_T\) is precisely the action given by (1.4) and \(P_0\) is some ambient background reference measure. We justify the above probabilistic limit theorem (known as large deviations) through a Hamilton–Jacobi approach. For a full exposition of this approach in a rigorous and general context, see Feng and Kurtz [24]. In this general theory, H is derived from a sequence of Hamiltonians which describe Markov processes given by stochastic interacting particle systems. The rigorous application of the general theory developed in [24] requires establishing a comparison principle for the limiting Hamilton–Jacobi equation given by H. In addition, if we have an optimal control representation of H (such as the identity (1.9)), then we can explicitly identify the right-hand side of (1.12) using the action. This is the reason we studied these problems in Sects. 2 (comparison principle) and 3 (optimal control problem) in this paper.
To summarize, we derive the Hamiltonian convergence in Sect. 4 in a non-rigorous manner; we rigorously prove the comparison principle in Sect. 2; and we rigorously construct the solution and relate it to an optimal control problem in Sect. 3.
1.3 Notations and definitions
Let \({\mathcal {P}}(A)\) denote the collection of all probability measures on a set A. On a metric space (E, r), we use B(E) to denote the collection of bounded functions on E. Further, \(C_b(E)\) denotes bounded continuous functions, UC(E) denotes uniformly continuous functions, and \(UC_b(E):= UC(E) \cap B(E)\). Finally, LSC(E) (respectively USC(E)) denotes lower-semicontinuous (respectively upper-semicontinuous) functions, which are possibly unbounded. For a function \(h \in UC(E)\), we denote by \(\omega _h\) the (minimal) modulus of continuity of h with respect to the metric r on E:
We write \(C^\infty ({\mathcal {O}})\) for the collection of infinitely differentiable functions on \({\mathcal {O}}\). For a Schwartz distribution \(m \in {\mathcal {D}}^\prime ({\mathcal {O}})\), we define
We denote the homogeneous Sobolev space of negative order
The associated norm has the property that
Hence \(H_{-1}({\mathcal {O}})\) is a subset of the distributions that annihilate constants, \(\langle m, 1\rangle =0\). In fact, the following representation holds: for every \(m \in H_{-1}(\mathcal O)\), we have
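Since the display defining (1.14) is given above only in the text, the following sketch assumes the standard Fourier characterization of the homogeneous negative-order norm on the circle, \(\Vert m\Vert _{-1}^2 = \sum _{k\ne 0} |{\hat{m}}(k)|^2/(2\pi k)^2\) for \(\langle m,1\rangle =0\) (an assumption for illustration; the paper's precise definition is (1.14)). It computes the norm of the mean-zero signed density \(m(x)=\sin (2\pi x)\), whose analytic value under this convention is \(1/(2\sqrt{2}\,\pi )\).

```python
import numpy as np

def h_minus_1_norm(density_values):
    """Assumed Fourier form of the homogeneous H_{-1} norm on the circle:
    ||m||_{-1}^2 = sum_{k != 0} |m_hat(k)|^2 / (2 pi k)^2,  with <m, 1> = 0."""
    N = len(density_values)
    m_hat = np.fft.fft(density_values) / N      # Fourier coefficients of m
    k = np.fft.fftfreq(N, d=1.0 / N)            # integer frequencies 0, 1, ..., -1
    mask = k != 0
    return np.sqrt(np.sum(np.abs(m_hat[mask]) ** 2
                          / (2 * np.pi * k[mask]) ** 2))

N = 1024
x = np.arange(N) / N
m = np.sin(2 * np.pi * x)   # mean-zero: only the k = +-1 modes contribute
# analytic value under the assumed convention: 1 / (2 * sqrt(2) * pi)
```

Only finitely many Fourier modes are active here, so the discrete computation matches the analytic value to machine precision.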
Regarding the one dimensional torus \({\mathcal {O}}:= {{\mathbb {R}}}/ {{\mathbb {Z}}}\) as a quotient metric space, we consider a metric r defined by
Let \(\rho , \gamma \in {\mathcal {P}}({\mathcal {O}})\). We write
For \(p \in (1, \infty )\), let \(W_p\) be the Wasserstein order p-metric on \({\mathcal {P}}({\mathcal {O}})\):
See Chapter 7 in Ambrosio, Gigli and Savaré [2] or Chapter 7 of Villani [44] for properties of this metric. Next, we claim that
To see this, we note that on one hand, by the Kantorovich–Rubinstein theorem (e.g., Theorem 1.14 of [44]), for every \(\rho , \gamma \in {\mathcal {P}}({\mathcal {O}})\), we have
On the other hand, by an adaptation of Lemma 4.1 of Mischler and Mouhot [39] to the torus case (see Lemma A.1 in Appendix A), there exists a universal constant \(C>0\) such that
Therefore the topology induced by the metric \(\Vert \cdot \Vert _{-1}\) is identical to the usual topology of weak convergence of probability measures on \({\mathcal {P}}({\mathcal {O}})\). Since any sequence of elements in \({{\mathsf {X}}}:={\mathcal {P}}({\mathcal {O}})\) is tight, we conclude that \(({{\mathsf {X}}}, {{\mathsf {d}}})\) is a compact metric space with \({{\mathsf {d}}}(\rho ,\gamma ) :=\Vert \rho -\gamma \Vert _{-1}\). In particular, this argument establishes that (1.17) holds.
We define a free energy functional \(S:{\mathcal {P}}({\mathcal {O}}) \mapsto [0,+\infty ]\),
We use the convention that \(0\log 0 :=0\). Since this is the relative entropy between \(\rho \) and the uniform probability measure 1 on \({\mathcal {O}}\), we have \(S(\rho ) \ge S(1)=0\). By the variational representation
we have \(S \in LSC\big ({\mathcal {P}}({\mathcal {O}})\big )\).
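A small numerical check of the two facts just stated (an illustration, using the discrete analogue of \(S(\rho )=\int _{\mathcal {O}} \rho \log \rho \,dx\) with the convention \(0\log 0=0\)): S vanishes at the uniform density and is strictly positive at any non-uniform density, by Jensen's inequality.

```python
import numpy as np

def free_energy(rho, h):
    """Discrete S(rho) = sum rho log(rho) * h, with the convention 0 log 0 = 0."""
    r = np.asarray(rho, dtype=float)
    safe = np.where(r > 0, r, 1.0)          # placeholder where r == 0
    return np.sum(np.where(r > 0, r * np.log(safe), 0.0)) * h

N = 1000
h = 1.0 / N
x = np.arange(N) * h
uniform = np.ones(N)                         # the uniform probability density
tilted = 1.0 + 0.9 * np.sin(2 * np.pi * x)   # another probability density
print(free_energy(uniform, h))   # 0: S attains its minimum at the uniform measure
print(free_energy(tilted, h))    # strictly positive, by Jensen's inequality
```

The lower semicontinuity claimed in the text is not visible in a single evaluation, but the minimization property \(S(\rho )\ge S(1)=0\) is.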
We make two formal observations. For the Hamiltonian H in (1.1), we have
We introduce an analog of Fisher information in this context, extending the usual definition in optimal mass transport theory, by defining
We claim that \(I \in LSC({\mathcal {P}}({\mathcal {O}}); {\bar{{{\mathbb {R}}}}})\). This claim can be verified by the following observations. Let \(\rho _n\) be a sequence such that \(\sup _n I(\rho _n) <\infty \). First, by one-dimensional Sobolev inequalities, \(\sup _n \Vert \log \rho _n \Vert _{L^\infty } <\infty \) and \(\rho _n \in C({\mathcal {O}})\). In fact, \(\{\rho _n\}_n\) has a uniform modulus of continuity, and is hence relatively compact in \(C({\mathcal {O}})\). This implies relative compactness of \(\{ \log \rho _n \}_n\) in \(C({\mathcal {O}})\). Secondly, by a variational formula, for all \(\rho \) such that \(\log \rho \) is bounded:
We note that \(\varphi \mapsto H(\rho , \varphi )\) is convex in the usual sense.
Next, extending the existing theory of viscosity solution for Hamilton–Jacobi equations in abstract metric spaces, we define a notion of solutions that will be used in this paper.
Definition 1.1
Let (E, r) be an arbitrary compact metric space. A function \(\overline{f} :E \mapsto {{\mathbb {R}}}\) is a sub-solution to (1.3) if for every \(f_0 \in D(H)\), there exists \(x_0 \in E\) such that
we have
Similarly, a function \({\underline{f}} :E \mapsto {{\mathbb {R}}}\) is a super-solution to (1.3) if for every \(f_1 \in D(H)\), there exists \(y_0 \in E\) such that
we have
Note that this definition differs from the usual one; indeed, if in Definition 1.1 “there exists” is replaced by “for every”, then \({\bar{f}}\) and \({\underline{f}}\) are strong sub- and super-solutions. A function f is a (strong) solution if it is both a (strong) sub-solution and a (strong) super-solution.
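For orientation, in the Feng–Kurtz framework [24] conditions of this type for a resolvent equation typically take the following shape (a hedged sketch of the generic form only; the paper's precise displays accompany Definition 1.1):

```latex
% sub-solution: for every f_0 \in D(H) there exists x_0 \in E with
(\overline{f} - f_0)(x_0) = \sup_E (\overline{f} - f_0)
\quad\text{and}\quad
\overline{f}(x_0) - \alpha H f_0(x_0) \le h(x_0);
% super-solution: for every f_1 \in D(H) there exists y_0 \in E with
(\underline{f} - f_1)(y_0) = \inf_E (\underline{f} - f_1)
\quad\text{and}\quad
\underline{f}(y_0) - \alpha H f_1(y_0) \ge h(y_0).
```

The weakening from “for every” touching point to “there exists” is what distinguishes this notion from the strong one discussed above.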
1.4 Towards a well-posedness theory: two more Hamiltonians
We now establish some useful properties of the Hamiltonian H in (1.1). In particular, we give a heuristic argument why a comparison principle can be expected formally for (1.3). Since we will use non-smooth test functions which do not fall inside the domain of very smooth functions D(H) given by (1.2), it is not at all trivial to make this result rigorous; to this end we introduce two more Hamiltonians related to the one in (1.1), as motivated through the formal calculations we now give. Throughout the paper, we denote \({{\mathsf {d}}}(\rho ,\gamma ) := \Vert \rho - \gamma \Vert _{-1}\).
Let \(k \in {{\mathbb {R}}}_+\). At least formally, for \(\rho (dx) = \rho (x) dx\) and \(\gamma (dx)= \gamma (x) dx\), we have
The last inequality holds since, by Jensen’s inequality,
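Since the calculation is formal, we record the presumed Jensen step explicitly (a sketch in the paper's notation, assuming densities \(\rho , \gamma > 0\)): it is the standard nonnegativity of relative entropy, obtained by applying Jensen's inequality to the concave function \(\log \),

```latex
\int_{\mathcal{O}} \log\!\Big(\frac{\rho}{\gamma}\Big)\, \gamma \, dx
  \;\le\; \log \int_{\mathcal{O}} \frac{\rho}{\gamma}\, \gamma \, dx
  \;=\; \log \int_{\mathcal{O}} \rho \, dx \;=\; 0,
```

equivalently, the relative entropy \(\int_{\mathcal{O}} \gamma \log (\gamma /\rho )\, dx\) is nonnegative.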
Similarly, we have
In particular,
Experts in viscosity solution theory may immediately recognize that, at a formal level, this inequality implies the comparison principle for (1.3) (see, for instance, Theorems 3 and 5 of Feng and Katsoulakis [25]). To make this rigorous, we need to face the possibility of cancellations of the kind \(\infty - \infty \) when dealing with \(S(\rho ) - S(\gamma )\), or more generally, \(\infty - \infty -\infty +\infty \) when dealing with the left-hand side of (1.23).
To establish this result rigorously, we introduce two more Hamiltonian operators and formulate Theorem 1.2, which establishes not only the comparison principle, but also the existence of super- and sub-solutions for the Hamiltonians we now introduce. Let us mention that, although every operator in this paper is single-valued, for notational convenience, we may still identify an operator with its graph.
We now define two operators \(H_0 \subset C({{\mathsf {X}}}) \times M({{\mathsf {X}}}; {\bar{{{\mathbb {R}}}}})\) and \(H_1 \subset C({{\mathsf {X}}}) \times M({{\mathsf {X}}}; {\bar{{{\mathbb {R}}}}})\). Let
and
This definition is motivated by the formal calculation (1.21). Analogously, motivated by (1.22), let
and
Instead of working with (1.3) with H given in (1.1), we consider the equation for \(H_0\) and seek sub-solutions, and analogously super-solutions for \(H_1\). We establish existence and show that these solutions coincide for a common right-hand side h. Namely, let \(h_0, h_1 \in UC_b({{\mathsf {X}}})\) and \(\alpha >0\). We consider a viscosity sub-solution \(\overline{f}\) for
and a viscosity super-solution \({\underline{f}}\) for
We will prove the following well-posedness result in Sects. 2 and 3.
Theorem 1.2
Let \(h \in C_b({{\mathsf {X}}})\) and \(\alpha >0\). We consider viscosity sub-solutions to (1.28) and super-solutions to (1.29) in the case where \(h_0= h_1=h\). There exists a unique \(f \in C_b({{\mathsf {X}}})\) such that it is both a sub-solution to (1.28) and a super-solution to (1.29). Moreover, this solution is given by
where \(R_\alpha \) is given by (1.7).
2 The Comparison Principle
In this section, we establish the following comparison principle.
Theorem 2.1
Let \(h_0, h_1 \in UC_b({{\mathsf {X}}})\) and \(\alpha >0\). Suppose that \(\overline{f} \in USC({{\mathsf {X}}}) \cap B({{\mathsf {X}}})\) is a sub-solution to (1.28) and \({\underline{f}} \in LSC({{\mathsf {X}}}) \cap B({{\mathsf {X}}})\) is a super-solution to (1.29). Then
We divide the proof into two parts.
2.1 A set of extended Hamiltonians and a comparison principle
We define a new set of operators \({{{\bar{H}}}}_0\) and \({{{\bar{H}}}}_1\) which extend \(H_0\) and \(H_1\) by allowing a wider class of test functions. The test functions for these operators are generally discontinuous and can take the values \(\pm \infty \). The operators \({{{\bar{H}}}}_0\) and \({{{\bar{H}}}}_1\) satisfy a structural assumption (Condition 1) used in Feng and Katsoulakis [25], hence the comparison principle for the associated Hamilton–Jacobi equations follows from [25, Theorem 3]; the same technique for establishing the comparison principle is also presented in Chapter 9.4 of Feng and Kurtz [24] (Condition 9.26 and Theorem 9.28). In Sect. 2.2, we will link viscosity solutions for \({\bar{H}}_0\) and \({\bar{H}}_1\) with those for \(H_0\) and \(H_1\).
Let \(\epsilon \in (0,1)\), \(\delta \in [0,1]\) and \(\gamma \text { be such that } S(\gamma )<\infty \). We define
and
This definition of \({{{\bar{H}}}}_0\) is motivated by convexity considerations: formally,
Similarly, let \(\epsilon \in (0,1)\), \(\delta \in [0,1]\) and \(\rho \text { be such that } S(\rho )<\infty \). We define
and
The definition of \({{{\bar{H}}}}_1\) is motivated in a similar way to that of \({{{\bar{H}}}}_0\), but is a bit more involved. We first observe that
By convexity considerations applied formally to H, we have
That is, \({\bar{H}}_1\) is defined such that
We now give an auxiliary statement to establish the existence of strong viscosity solutions.
Lemma 2.2
Take \(f_0,f_1\) as in (2.1) and (2.3); then \( {{{\bar{H}}}}_0 f_0 \in USC({{\mathsf {X}}}, {\bar{{{\mathbb {R}}}}})\) and \({{{\bar{H}}}}_1 f_1 \in LSC({{\mathsf {X}}}, {\bar{{{\mathbb {R}}}}})\). Moreover, for \(\rho , \gamma \in {{\mathsf {X}}}\) such that \(S(\rho )+S(\gamma )<\infty \), we have
Proof
The semi-continuity properties follow from the lower semi-continuity of \(\rho \mapsto S(\rho )\) and \(\rho \mapsto I(\rho )\) in \(({{\mathsf {X}}}, {{\mathsf {d}}})\). The estimate (2.5) follows from direct verification. \(\quad \square \)
Below, we establish the first comparison result in this paper for strong viscosity solutions (see the note following Definition 1.1).
Lemma 2.3
Suppose that \(\overline{f} \in USC({{\mathsf {X}}}, {{\mathbb {R}}}) \cap B({{\mathsf {X}}})\) and \({\underline{f}} \in LSC({{\mathsf {X}}}, {{\mathbb {R}}}) \cap B({{\mathsf {X}}})\) are respectively viscosity strong sub-solution and strong super-solution to
Then
Moreover, if \(\overline{f}, {\underline{f}} \in C_b({{\mathsf {X}}})\), then
Proof
The estimates in Lemma 2.2 imply that Theorem 3 in Feng and Katsoulakis [25] applies, hence the conclusions follow. \(\quad \square \)
2.2 Viscosity extensions from \(H_0\) and \(H_1\) to \({\bar{H}}_0\) and \({\bar{H}}_1\) and the comparison theorem
Throughout this section, we assume that \(\overline{f} \in USC({{\mathsf {X}}}) \cap B({{\mathsf {X}}})\) is a viscosity sub-solution to (1.28) and \({\underline{f}} \in LSC({{\mathsf {X}}}) \cap B({{\mathsf {X}}})\) is a viscosity super-solution to (1.29). The following regularizations \(\overline{f}_t\) and \({\underline{f}}_t\) and Lemma 2.4 are analogues of Lemma 13.34 of Feng and Kurtz [24].
For each \(t \in (0,1)\), we define
It follows that
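One standard construction with the Lipschitz property asserted in Lemma 2.4 is a sup-/inf-convolution of Moreau–Yosida type; whether this is precisely the paper's \(\overline{f}_t\) depends on the elided display, so the following Python sketch is only an assumption-labeled illustration of the mechanism on a finite sample of a compact metric space: \(f_t(x) := \sup _y [f(y) - r(x,y)^2/(2t)]\) is Lipschitz with constant at most \(\mathrm{diam}(E)/t\), even when f itself is discontinuous.

```python
import numpy as np

def sup_convolution(f_vals, dist, t):
    """f_t(x) = sup_y [ f(y) - d(x, y)^2 / (2 t) ]  on a finite metric space."""
    return np.max(f_vals[None, :] - dist ** 2 / (2 * t), axis=1)

# Finite sample of the circle R/Z with its quotient (arc-length) metric.
n = 400
pts = np.arange(n) / n
diff = np.abs(pts[:, None] - pts[None, :])
dist = np.minimum(diff, 1.0 - diff)          # diam = 1/2

f = (pts > 0.5).astype(float)                 # bounded but discontinuous
t = 0.05
ft = sup_convolution(f, dist, t)
# ft >= f pointwise (take y = x), and ft is Lipschitz with constant
# at most diam/t = 0.5/0.05 = 10, despite the jump in f.
```

The elementary bound behind the comment is \(|d(x,y)^2 - d(x',y)^2| \le 2\,\mathrm{diam}(E)\, d(x,x')\), which survives the supremum over y.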
Lemma 2.4
\(\overline{f}_t, {\underline{f}}_t \in \mathrm{Lip}({{\mathsf {X}}})\).
Next we establish a few a priori estimates. To get some intuition of what we will derive, we now explain the heuristic ideas. Let \(f_0\) be as in (2.1) taking the form
where \({\hat{\gamma }} \in {{\mathsf {X}}}\) satisfies \(S({\hat{\gamma }})<\infty \). Then, formally,
If \(\log \rho \in L^\infty ({\mathcal {O}})\), then the expression above is an element in \(L^2({\mathcal {O}})\). Let \(\rho _0, \gamma _0 \in {{\mathsf {X}}}\) be such that \(S(\rho _0) <\infty \). We assume that
Then, by taking directional derivatives along paths \(t \mapsto \rho :=\rho (t)\) with \(\rho (0) =\rho _0\), the above will imply the following comparison of Hamiltonians
We now rigorously justify these formal comparisons. We divide the justification into three steps. First, we make sense of the following statement in a rigorous way:
Lemma 2.5
Let \(f_0\) be given by (2.10) with \(S({\hat{\gamma }})<\infty \). Let \(\rho _0,\gamma _0 \in {{\mathsf {X}}}\) be such that \(S(\rho _0)<\infty \) and that (2.12) holds. Then
Note that if we assume \(S(\gamma _0)<\infty \), then this estimate immediately implies an a posteriori estimate
Proof
We claim that there exists a curve \(\rho \in C([0,\infty ); {{\mathsf {X}}})\) such that the following partial differential equation
is satisfied in the sense of Definition 3.1 below. Moreover,
and
A rigorous justification of the claim above can be found in Lemma 3.2. Results of this type are well known for \(\partial _t \rho = \partial _{xx} \Phi (\rho )\) with a regular \(\Phi \) (e.g., Theorem 5.5 of Vázquez [43]). However, this existing theory does not directly apply in our case, as our \(\Phi (r)= \frac{1}{2} \log r\) is singular at \(r=0\). Although the main ideas for establishing these estimates remain the same, additional subtleties need to be taken care of. Of course, the proofs in Sect. 3.1 are independent of the results of this section on the comparison principle, hence we can invoke them here without creating a circular argument.
In summary, by (2.14) and (2.15), the curve satisfies
We also have (note that the following inequality holds trivially if \(S(\gamma _0)=+\infty \))
We plug the two lines above into (2.12) and note that S, I are both lower semicontinuous. The inequality (2.13) follows. \(\quad \square \)
Lemma 2.6
Let \(f_0, \rho _0,\gamma _0, {\hat{\gamma }}\) be as in the previous Lemma 2.5, with the additional assumption that \(S(\gamma _0)<\infty \) (hence \(I(\rho _0)<\infty \) by the previous lemma). Then the term \(\frac{\delta f_0}{\delta \rho _0}\) in (2.11) is well defined and
which implies
Proof
Let \(\gamma \in C^\infty ({\mathcal {O}}) \cap {{\mathsf {X}}}\) with \(\inf _{\mathcal O} \gamma >0\). We define
From \(I(\rho _0)<\infty \), we have \(\rho _0, j \in C({\mathcal {O}})\) and \(\inf _{{\mathcal {O}}} \rho _0 >0\). Therefore, \(\rho (s) \in C({\mathcal {O}}) \cap {{\mathsf {X}}}\) and \(\inf _{{\mathcal {O}}} \rho (s)>0\) for all \(s \in [0,1]\). With these regularities, \((-\partial _{xx}^2)^{-1}(\rho (s) - \gamma ) \in C({\mathcal {O}})\). Hence, if we define
then this expression is well defined and
Therefore
and
In view of (2.12) and the regularities \(\rho (r), j \in C({\mathcal {O}})\), we have
As j is arbitrary, the claim follows. \(\quad \square \)
Lemma 2.7
Let \(f_0, \rho _0,\gamma _0, {\hat{\gamma }}\) be as in Lemma 2.5, with \(S(\gamma _0)<\infty \). We assume that (2.12) holds. Then
Proof
We have shown in Lemma 2.5 that \(I(\rho _0)<\infty \). Note that by definition
and
By Lemma 2.6 and then (2.11) and the convexity of quadratic functions, we have
Combined with the estimate (2.13) in Lemma 2.5, the conclusion follows. \(\quad \square \)
We now state the first existence result for viscosity solutions, in a suitably regularized setting. The proof of Theorem 2.1 will follow easily from this statement.
Lemma 2.8
Let us consider \(h_0 \in UC_b({{\mathsf {X}}})\) with a nondecreasing modulus of continuity denoted as \(\omega _0:=\omega _{h_0}\). Let
Then the \(\overline{f}_t \in C_b({{\mathsf {X}}})\) is a strong viscosity sub-solution to the Hamilton–Jacobi equation (2.6) with \(h_0\) being replaced by \(h_{0,t}\).
Similarly, suppose that \(h_1 \in UC_b({{\mathsf {X}}})\) with a nondecreasing modulus of continuity \(\omega _1:=\omega _{h_1}\). Let
Then \({\underline{f}}_t \in C_b({{\mathsf {X}}})\) is a strong viscosity super-solution to the Hamilton–Jacobi equation (2.7) with \(h_1\) being replaced by \(h_{1,t}\).
Proof
We only prove the sub-solution case; the super-solution case is similar. Let \(f_0\) be as in (2.10). We assume that \(\rho _0 \in {{\mathsf {X}}}\) is such that
Then \(S(\rho _0)<\infty \). The existence of such a \(\rho _0\) is guaranteed by the lower semicontinuity \(f_0 \in LSC({{\mathsf {X}}};{\bar{{{\mathbb {R}}}}})\) and the compactness of \({{\mathsf {X}}}\). We have
Since \(\overline{f}\) is a viscosity sub-solution to (1.28), by compactness of \({{\mathsf {X}}}\), there exists \(\gamma _0 \in {{\mathsf {X}}}\) such that
with
From the upper boundedness of \(h_0 - \overline{f}\), we arrive at the estimate that \(S(\gamma _0)<\infty \). Thus (2.16) reduces to
The above implies (2.12), hence we can apply Lemma 2.7 to (2.18), which results in
From (2.19), we obtain a rough estimate
Let \(\omega _h\) denote a nondecreasing modulus of continuity of h; then
We note that, from (2.17),
The claim is established. \(\quad \square \)
Finally, we are in a position to prove Theorem 2.1.
Proof
By Lemmas 2.4 and 2.8, we know that the functions \(\overline{f}_t\) and \({\underline{f}}_t\) satisfy the conditions of Lemma 2.3, for each \(t>0\). Hence by the comparison principle in Lemma 2.3, we have
Since
the conclusion of Theorem 2.1 follows by taking \(t \rightarrow 0^+\). \(\quad \square \)
3 Existence of Solutions for the Hamilton–Jacobi Equation Through Optimal Control of Nonlinear Diffusion Equations and Related Nisio Semigroups
We recall that \(({{\mathsf {X}}}, {{\mathsf {d}}})\) is a compact metric space, hence \(C({{\mathsf {X}}}) =C_b({{\mathsf {X}}})= UC({{\mathsf {X}}})\). Theorem 2.1 establishes that for each \(h \in C_b({{\mathsf {X}}})\) and \(\alpha >0\), there exists at most one function f that is both a sub-solution to (1.28) and a super-solution to (1.29). In this section, we show that such a solution exists; moreover, it can always be represented as the value function \(f= R_\alpha h\) of the family of nonlinear diffusion equations with control introduced in the introduction (see (1.7) for the definition of the operator \(R_\alpha \)):
with
3.1 A set of nonlinear diffusion equations with control
Throughout this section, we always assume that \(\eta \) satisfies (3.2). We use the convention \(0 \log 0 :=0\).
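This convention is consistent with taking the limiting value at \(r=0\), which can be checked by l'Hôpital's rule:

```latex
\lim_{r \to 0^+} r \log r
  = \lim_{r \to 0^+} \frac{\log r}{r^{-1}}
  = \lim_{r \to 0^+} \frac{r^{-1}}{-r^{-2}}
  = \lim_{r \to 0^+} (-r) = 0 .
```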
Definition 3.1
We say that \((\rho , \eta )\) is a weak solution to (3.1) in the time interval [0, T] if the following holds:
-
(1)
\(\rho (\cdot ) \in C([0,T];{{\mathsf {X}}})\).
-
(2)
\(\rho (t, dx) = \rho (t,x) dx\) holds for \(t>0\), for some measurable function \((t,x) \mapsto \rho (t,x)\).
-
(3)
The following estimates hold:
$$\begin{aligned} \int _0^T \int _{{\mathcal {O}}} \rho (t,x) \log \rho (t,x) dx dt <\infty , \end{aligned}$$(3.3)and
$$\begin{aligned} \int _s^T \int _{{\mathcal {O}}} | \log \rho (t,x) | dx dt <\infty , \quad \forall s>0, \end{aligned}$$(3.4)and
$$\begin{aligned} \int _s^T \int _{{\mathcal {O}}} |\partial _x \log \rho (t,x)|^2 dx dt <\infty , \quad \forall s>0. \end{aligned}$$(3.5) -
(4)
For every \(\varphi \in C^\infty ({\mathcal {O}})\) and \(0<s<t \le T\), we have
$$\begin{aligned} \langle \varphi , \rho (t) \rangle - \langle \varphi , \rho (s)\rangle = \int _s^t \big ( \langle \frac{1}{2} \partial ^2_{xx} \varphi , \log \rho (r) \rangle - \langle \partial _x \varphi , \eta (r) \rangle \big )dr. \end{aligned}$$(3.6)
In the above, note that \([0,\infty ) \ni r \mapsto r \log r\) is bounded from below (with the convention \(0\log 0 =0\)), hence \(\int _{{\mathcal {O}}} \rho (x) \log \rho (x) dx \in {{\mathbb {R}}}\cup \{+\infty \}\) is well defined.
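For completeness, the elementary lower bound used here reads:

```latex
\frac{d}{dr}\big( r \log r \big) = \log r + 1 = 0
  \iff r = e^{-1},
\qquad
\min_{r \ge 0}\, r \log r = -e^{-1},
```

so that \(\int _{{\mathcal {O}}} \rho (x) \log \rho (x)\, dx \ge -e^{-1} |{\mathcal {O}}| > -\infty \).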
We now describe the technical difficulties we need to overcome in this section. Let \(\Phi (r):= \frac{1}{2} \log r\), for \(r>0\). Then (3.1) can be written as
where
Equations of this type have been studied by Vázquez [43], with the control variable \(\partial _x \eta \) denoted there by f. However, it is assumed there that \(\Phi \) is at least continuous. In contrast, our \(\Phi \) has \(\Phi (0) =-\infty \) and is thus singular. In addition, we need to ensure that the solution is non-negative. In [43], an approach based on the maximum principle is developed to establish positivity of a solution. This works well in the absence of control, \(f=0\), or when \(f\ge 0\). However, the positivity of a solution in our case, for this special f, seems to be of a different origin: the singularity \(\Phi (0)=-\infty \) plays a key role. Therefore, we present a detailed justification using energy estimates. A further, but very minor, issue is that [43] focuses on Dirichlet or Neumann boundary conditions, whereas we have periodic boundary conditions. However, the boundary conditions only appear after integration by parts, and the argument simplifies in the periodic case. Hence, we do not provide details for this last issue and only address the first two issues below, by studying a sequence of approximate equations.
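To see why the singularity of \(\Phi \) can be expected to enforce positivity, note that for smooth, strictly positive \(\rho \) the diffusion term can be rewritten in divergence form (this is a heuristic computation, not part of the rigorous argument):

```latex
\partial_{xx}^2 \Phi(\rho)
  = \partial_x \big( \Phi'(\rho)\, \partial_x \rho \big)
  = \partial_x \Big( \frac{1}{2\rho}\, \partial_x \rho \Big) ,
```

so the effective diffusion coefficient \(\frac{1}{2\rho }\) blows up as \(\rho \rightarrow 0^+\), which heuristically drives the solution away from zero.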
The main purpose of this subsection is to establish the following existence result. We recall that the definition of the entropy function S is given in (1.19).
Lemma 3.2
For every \(\eta \) satisfying (3.2) and every \(\rho (0) =\rho _0 \in {{\mathsf {X}}}\subset H_{-1}({\mathcal {O}})\), there exists a \(\rho (\cdot ) \in C([0,T]; {{\mathsf {X}}})\) such that \((\rho ,\eta )\) solves (3.1)–(3.2) in the weak sense of Definition 3.1. This solution is unique. Moreover, such a pair \((\rho ,\eta )\) satisfies the following properties.
-
(1)
For every \(\gamma _0 \in {{\mathsf {X}}}\) such that \(S(\gamma _0)<\infty \), and for every \(0 \le s<t <T\), the following variational inequalities hold:
$$\begin{aligned}&\frac{1}{2} \Vert \rho (t) -\gamma _0 \Vert _{-1}^2 + \int _s^t \Big ( \frac{1}{2} \big ( S(\rho (r)) - S(\gamma _0) \big ) \nonumber \\&\qquad + \int _{{\mathcal {O}}} \eta (r,x)\big ( \partial _x (-\partial _{xx}^2)^{-1} (\rho (r)-\gamma _0)(x) \big )dx \Big ) dr \nonumber \\&\quad \le \frac{1}{2} \Vert \rho (s) -\gamma _0 \Vert _{-1}^2. \end{aligned}$$(3.7) -
(2)
It holds that \(S(\rho (t)) <\infty \) for every \(t >0\) and \(\int _0^T S(\rho (r)) dr <\infty \) (this implies in particular that \(\rho (t,dx) = \rho (t,x) dx\) for \(t>0\)).
-
(3)
For every \(0<s<T<\infty \), it holds that
$$\begin{aligned} \int _s^T \int _{{\mathcal {O}}} \big ( - \log \rho \big )^+ dx dr <\infty . \end{aligned}$$ -
(4)
For every \(0 \le s \le t\), allowing the possibility of \(S(\rho (0))=+\infty \), the following holds
$$\begin{aligned} S(\rho (t)) +\int _s^t \int _{{\mathcal {O}}} \big ( \frac{1}{2} |\partial _x \log \rho (r,x)|^2 + \eta (r,x) \partial _x \log \rho (r,x)\big ) dx dr \le S(\rho (s)). \end{aligned}$$(3.8)
We divide the proof into several parts.
3.1.1 Approximate equations
Let \(\eta \in L^2((0,T) \times {{\mathcal {O}}})\) and \(\rho _0 \in {{\mathsf {X}}}\). We extend \(\eta \) to an element of \(L^2({{\mathbb {R}}}\times {{\mathcal {O}}})\) by setting \(\eta (t,x):=0\) whenever \(t \le 0\) or \(t\ge T\). Let \(J \in C^\infty ({\mathcal {O}})\) be a standard spatial mollifier and \(G \in C^\infty _c({{\mathbb {R}}})\) a standard mollifier in the time variable. We define the mollification of (possibly signed) measures and of functions on \({\mathcal {O}}\) in the usual way. In particular, \(\rho _{\epsilon ,0}: = J_\epsilon * \rho _0 \in C^\infty ({\mathcal {O}})\). We write
We approximate the singular function \(\Phi \) by a smooth function \(\Phi _\epsilon \) as follows:
Note that \(\Phi ^\prime (r) =\Phi _\epsilon ^\prime (r)\) for \(r > \epsilon \). We choose the constant \(C_\epsilon := -\frac{3}{2}\log \epsilon \) so that \(\Phi _\epsilon (\epsilon ) = - \log \epsilon >0 = \Phi _\epsilon (0)\). This feature allows us to pick a smooth function \(\theta _\epsilon \) with \(\theta ^\prime _\epsilon >0\) such that
so that
We denote the primitive \(\Theta _\epsilon (t) =\int _0^t \theta _\epsilon (r) dr\), and note that (\(\theta _\epsilon ^\prime >0\) ensures that \(\theta _\epsilon \) is an increasing function)
This construction ensures that \(\Phi _\epsilon ^\prime \in C({{\mathbb {R}}})\). Now, we consider
By Theorem 5.7 in [43], there exists a unique weak solution \(\rho _\epsilon (\cdot )\) in the sense of Definition 5.4 of [43]. Hence for every \(\varphi \in C^\infty ({\mathcal {O}})\), it holds that
In fact, in the regularized situation considered at the moment, standard quasilinear theory applies (e.g., the method of proof of Theorem 6.1 in Chapter V of Ladyženskaja, Solonnikov and Ural’ceva [36]), hence \(\rho _\epsilon \in C^{1,2}((0,T) \times \bar{{\mathcal {O}}})\) is a classical solution. Note that the first part of condition (6.9) in Chapter V of [36] requires \(\Phi ^\prime \) to be uniformly bounded away from zero. The \(\Phi _\epsilon \) constructed above does not satisfy this requirement. However, this is not a problem in the current context, because \(\rho _\epsilon \) is bounded. We explain this in detail. Let \(M>0\) be a large parameter; we modify the definition of \(\Phi _\epsilon (r)\) into \(\Phi _\epsilon ^M(r)\) for \(r > M\) and keep \(\Phi _\epsilon ^M(r)= \Phi _\epsilon (r)\) for \(r \le M\). We make this modification so that \(\Phi _\epsilon ^M\) satisfies the conditions of [36]. Then there exists a unique classical \(C^{1,2}\)-solution \(\rho _\epsilon ^M\) for
By the maximum principle,
Consequently, when \(M > M_0\), \(\rho _\epsilon :=\rho _\epsilon ^M\) solves (3.9) in the classical sense.
We note that, for each \(t>0\) and \(\epsilon >0\), we cannot rule out the possibility that \(\rho _\epsilon (t,x)<0\) for some x. However, we will show that this possibility disappears in the limit \(\epsilon \rightarrow 0^+\), by means of asymptotic estimates that we now establish.
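As a consistency check on the construction of \(\Phi _\epsilon \) above: assuming, as the requirement \(\Phi _\epsilon ^\prime = \Phi ^\prime \) on \((\epsilon , \infty )\) together with continuity suggests, that \(\Phi _\epsilon (r) = \Phi (r) + C_\epsilon \) for \(r \ge \epsilon \), the stated value at \(r=\epsilon \) follows directly:

```latex
\Phi_\epsilon(\epsilon)
  = \frac{1}{2}\log \epsilon + C_\epsilon
  = \frac{1}{2}\log \epsilon - \frac{3}{2}\log \epsilon
  = -\log \epsilon > 0
  \quad \text{for } \epsilon \in (0,1).
```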
There are three important regularity properties of \(\rho _\epsilon \) that we will exploit. First, let
Then we have the energy inequality (5.20) in Theorem 5.7 of [43]:
Note that \(\rho _\epsilon (0) \in L^\infty ({\mathcal {O}})\), hence \(\int | \Psi _\epsilon (\rho _\epsilon (0,x)) | dx <\infty \). Also, by Jensen’s inequality,
Second, we have inequalities of dissipation type: for all \(\gamma _0 \in H_{-1}({\mathcal {O}})\) such that \(\int _{{\mathcal {O}}} \Psi _\epsilon (\gamma _0)dx<\infty \) and \(0\le s<t\)
This estimate can be verified through integration by parts. Note that \(\Psi _\epsilon ^{\prime \prime } = \Phi _\epsilon ^\prime >0\). The convexity of the \(\Psi _\epsilon \) implies that \((v-u) \Phi _\epsilon (u) \le \Psi _\epsilon (v) - \Psi _\epsilon (u)\). Therefore the last inequality also leads to
By direct computation,
From \((t \theta _\epsilon )^\prime = t \theta _\epsilon ^\prime + \theta _\epsilon \ge \theta _\epsilon \) for \(t\ge 0\), we obtain the estimate \(\Theta _\epsilon (t) \le t \theta _\epsilon (t)\). This implies in particular that
Integrating (3.9) over \({\mathcal {O}}\), we also arrive at the conservation property \(\langle 1, \rho _\epsilon (t)\rangle = \langle 1, \rho _{\epsilon ,0}\rangle =1\). We decompose \(\rho _\epsilon \) into its positive and negative parts,
Then, when the \(\gamma _0 \in {{\mathsf {X}}}\) is a probability measure satisfying \(S(\gamma _0)<\infty \), we have
We note that
Therefore, the above estimates combined with (3.14) give a useful control on the amount of negative mass of \(\rho _\epsilon \):
Third, we show the following property.
Lemma 3.3
The family \(\{ \rho _\epsilon \}_\epsilon \) is relatively compact in \(C\big ([0,\infty ); H_{-1}({\mathcal {O}})\big )\). Hence, selecting a subsequence if necessary, there exists a limiting curve \(\rho (\cdot ) \in C\big ([0,\infty ); H_{-1}({\mathcal {O}})\big )\) such that
Proof
We verify the relative compactness of \(\{ \rho _\epsilon (\cdot ) \}_{\epsilon >0}\) through the Arzelà–Ascoli lemma. The proof would be easier if \(\rho _\epsilon (t) \in {{\mathsf {X}}}\), since \({{\mathsf {X}}}\) is a compact space. However, for each fixed \(\epsilon >0\), our construction allows the possibility of negative mass in \(\rho _\epsilon (t)\), even though \(\rho _\epsilon (0) \in {{\mathsf {X}}}\). The negative mass only vanishes in the limit \(\epsilon \rightarrow 0\).
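We recall the form of the Arzelà–Ascoli lemma used here: a family \(\{\rho _\epsilon (\cdot )\}_{\epsilon >0} \subset C([0,T]; H_{-1}({\mathcal {O}}))\) is relatively compact provided

```latex
\text{(i)}\quad \rho_\epsilon(t) \in K_1
  \;\; \forall\, t \in [0,T],\ \epsilon>0,
  \text{ for some compact } K_1 \subset\subset H_{-1}(\mathcal{O});
\qquad
\text{(ii)}\quad \sup_{\epsilon>0}
  \Vert \rho_\epsilon(t) - \rho_\epsilon(s) \Vert_{-1}
  \le \omega(|t-s|)
  \text{ for some modulus } \omega .
```

These are precisely the two properties verified in the two parts of the proof below.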
First, we verify the existence of a compact subset \(K_1 \subset \subset H_{-1}({\mathcal {O}})\) such that
We start with a compact set \(K_0 := \{ \rho _0, J_\epsilon * \rho _0 : \epsilon >0\} \subset \subset H_{-1}({\mathcal {O}})\). Then for every \(\delta >0\), there exists a finite positive integer \(N:=N(\delta ) \in {{\mathbb {N}}}\) and \(\rho _{1,0}, \ldots , \rho _{N,0} \in C^\infty ({\mathcal {O}}) \cap {{\mathsf {X}}}\) such that \(K_0 \subset \cup _{k=1}^N B(\rho _{k, 0}; \delta )\). Let \(\rho _{\epsilon , k}(t)\) be the solution to
By a contraction estimate in Chapter 6.7.2 of [43] (see also part (iii) of Theorem 6.17 there),
By (3.12), noting \(\rho _{k,0} \in {\mathcal {P}}({\mathcal {O}})\), for every \(t \in [0,T]\), we have
In view of the explicit form of \(\Psi _\epsilon \) in (3.15), the set
is relatively compact in \(H_{-1}({\mathcal {O}})\) for every finite \(l \in {{\mathbb {R}}}_+\). Denote \(K_{1,\delta }:=K_{1,\delta }(L_\delta )\), then
Let \(K_{1,\delta }^\delta \) denote the \(\delta \)-thickening of \(K_{1,\delta }\). Then by (3.19),
Taking \(K_1:= \overline{\cap _{\delta >0} K_{1,\delta }^\delta }\) (which is complete and totally bounded), we arrive at (3.18).
Second, through the variational inequality (3.14), we obtain a locally uniform modulus-of-continuity estimate: \(\sup _{\epsilon >0}\Vert \rho _\epsilon (t) - \rho _\epsilon (s) \Vert _{-1} \le \omega (|t-s|)\) for all \(t, s \in [0,T]\) with \(|t-s|\le 1\), for some modulus \(\omega \). It suffices to verify that, for every \(\delta \in (0,1)\), there exist a constant \(C_\delta >0\) and an exponent \(\alpha \in (0,1)\) such that
Then we conclude by taking \(\omega (r):= \inf _{\delta \in (0,1)} \{ \delta + C_\delta r^\alpha \}\). Let \(\delta \in (0,1)\) be given. For every \(s,t \in [0,T]\) with \(|s-t|\le 1\) and \(\epsilon >0\), by (3.20), there exists \(\gamma := \gamma (\delta ,\epsilon ,s) \in K_{1,\delta }(L_\delta )\) such that \(\Vert \rho _\epsilon (s) - \gamma \Vert _{-1} <\delta \). By (3.14),
Consequently
Since \(C_{2,\delta }\) depends only on \(\delta \), and not on \(\epsilon \), s or t, we conclude. \(\quad \square \)
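For completeness, we check that \(\omega (r) := \inf _{\delta \in (0,1)} \{ \delta + C_\delta r^\alpha \}\) is indeed a modulus: it is nonnegative and nondecreasing, and for every fixed \(\delta \in (0,1)\),

```latex
0 \le \limsup_{r \to 0^+} \omega(r)
  \le \limsup_{r \to 0^+} \big( \delta + C_\delta r^\alpha \big)
  = \delta ,
```

so letting \(\delta \rightarrow 0^+\) gives \(\omega (0^+) = 0\).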
3.1.2 A priori regularities for the PDE with control (3.1)
Lemma 3.4
Let \(\rho _0 \in {{\mathsf {X}}}\) and \(\rho (\cdot ) \in C([0,\infty ); H_{-1}({\mathcal {O}}))\) be the limit as obtained from (3.17). It then satisfies the following properties.
-
(1)
It holds that \(\rho (r) \in {{\mathsf {X}}}\) for every \(r \ge 0\). Indeed, \(\rho (r,dx)=\rho (r, x) dx\) for \(r>0\) a.e., and
$$\begin{aligned} \rho (r,x) \ge 0, \text { a.e. } (r,x) \in (0,\infty ) \times {{\mathcal {O}}}. \end{aligned}$$(3.21) -
(2)
The variational inequality (3.7) holds.
-
(3)
\(\int _0^T \int _{{\mathcal {O}}} \rho (t,x) \log \rho (t,x) dx dt <\infty \).
-
(4)
\(\rho (\cdot ) \in C([0,\infty ); {{\mathsf {X}}})\).
Proof
Taking the limit \(\epsilon \rightarrow 0\) in (3.17), by the approximate variational inequality estimates (3.14), the negative mass estimate (3.16), and lower semicontinuity arguments, we conclude that \(\rho (r,dx) = \rho (r,x) dx\) for \(r >0\) a.e., that (3.21) holds (hence \(\rho (r) \in {{\mathsf {X}}}\) for all \(r\ge 0\)), and that (3.7) holds.
The estimate \(\int _0^T S(\rho (t)) dt < \infty \) follows from (3.7). \(\quad \square \)
We remark that the variational inequalities (3.7) alone (for a family of \(\gamma _0 \in {{\mathsf {X}}}\) with \(S(\gamma _0)<\infty \)) can be used as a definition of a solution for (3.1). This definition would suffice to establish a uniqueness result, as we now show.
Lemma 3.5
Let \((\rho _i, \eta _i)\), \(i=1,2\) solve (3.1)–(3.2) in the sense that both pairs satisfy the variational inequalities (3.7). In addition, we assume that \(\rho _i(\cdot ) \in C([0,T];{{\mathsf {X}}})\) for every \(T>0\) and \(i=1,2\). Then
Hence, given a fixed initial condition and the same control \(\eta =\eta _1=\eta _2\), it follows that \(\rho _1=\rho _2\).
Proof
Let \(0<\alpha<\beta <T\) and \(0<s<t<T\). From (3.7),
Similarly,
Adding these two inequalities, we obtain
We define
Then (3.23) becomes
If we write \(G(h) := F(t+h, s+h; t+h, s+h) \in C^1({{\mathbb {R}}}_+)\), then the last inequality becomes
That is,
We multiply by \((t-s)^{-2}\) on both sides and then take the limit \(t \rightarrow s^+\) to find
Here we used the fact that \(\rho _i(\cdot ) \in C({{\mathbb {R}}}_+;H_{-1}({\mathcal {O}}))\) for each \(i=1,2\). By further mollification-approximation estimates, we find that the above inequality is equivalent to (3.22).
\(\square \)
Definition 3.1 gives a notion of weak solution for the partial differential equation (3.1). It requires an a priori estimate that \(\log \rho (t,x)\) is locally integrable, so that this quantity can be viewed as a distribution (see (3.6)). Next, we establish this local integrability estimate for the limit \(\rho \) obtained from (3.17). We note that from the estimates in Lemma 3.4, we already know that \(\int _0^T \int _{{\mathcal {O}}} \rho (r,x) \log \rho (r,x) dx dr <\infty \), which implies in particular that
Therefore, we need to focus on the case where \(\rho (r,x) <1\).
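The reduction can be made explicit: \(\log r \le r \log r\) for \(r \ge 1\), while \(-r\log r \le e^{-1}\) for \(r \in [0,1]\), so

```latex
\int_0^T \!\! \int_{\mathcal{O}} \big( \log \rho(r,x) \big)^+ \, dx\, dr
  \;\le\; \int_0^T \!\! \int_{\mathcal{O}} \rho(r,x) \log \rho(r,x) \, dx\, dr
    + e^{-1}\, |\mathcal{O}|\, T
  \;<\; \infty ,
```

which is why only the region \(\{\rho < 1\}\) requires further work.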
Lemma 3.6
Let \(\rho (\cdot ) \in C([0,\infty ); H_{-1}({\mathcal {O}}))\) be the limit as obtained from (3.17). For every \(0\le s\le T<\infty \), allowing the possibility of \(S(\rho (0))=+\infty \), we have that
Furthermore, in view of (3.24) and Lemma 3.4,
Proof
Noting that \(\int _{{\mathcal {O}}} \rho _\epsilon (t, x) dx =1\) for all \(t > 0\), we have \(\max _x \rho _\epsilon (t,x) >1/2\). Since \((t,x) \mapsto \rho _\epsilon (t,x)\) is continuous, we can select a family of points \(\{ x_\epsilon (t) \in {\mathcal {O}} : t>0\}\) such that \(\rho _\epsilon (t, x_\epsilon (t)) \ge 1/2\). We also observe that
Next, we estimate the left hand side of the above in three situations, namely
We note that \(\Phi _\epsilon (\rho _\epsilon (t,x_\epsilon (t))) \ge -\frac{1}{2} \log 2 + C_\epsilon \). Therefore
In addition,
which implies
Therefore, when \(\epsilon >0\) is small enough, (3.26) gives (using the convention that \(\sup \emptyset = -\infty \)),
Combined with (3.12), this yields
Using \( \int _{{\mathcal {O}}} \big ( - \log \big )(\rho _\epsilon \vee \epsilon ) dx \le \big ( - \log \big )(\inf _{{\mathcal {O}}} \rho _\epsilon \vee \epsilon )\), we conclude
Now we pass \(\epsilon \rightarrow 0\) in the above inequality to conclude (3.25). The details are given in the following steps. First, we note that
Hence by the convergence in (3.17) and by the estimate (3.27),
The observation
leads to the variational formula
Consequently
Second, by Jensen’s inequality, \(\Vert \eta _\epsilon (r) \Vert _{L^2} \le \Vert \eta (r)\Vert _{L^2}\). Finally, to get (3.25) from (3.28) by taking \(\epsilon \rightarrow 0^+\), noting the identification of \(\Psi _\epsilon \) in (3.15), all we need is to justify the inequality
If \(s=0\), then this follows directly by convexity/Jensen inequality arguments,
The case \(0<s<T<\infty \) is more subtle. We divide the justification into three steps. In step one, we construct the solutions \(\{\rho _\epsilon (r) : 0\le r < s\}\) as before with \(\rho _\epsilon (0) : =J_\epsilon * \rho _0\) and take their limit \(\{\rho (r): 0\le r < s\}\). Then we construct \(\{ {\hat{\rho }}_\epsilon (r) : s \le r \le T\}\) as the solution to (3.9) with initial data \({\hat{\rho }}_\epsilon (s) := J_\epsilon * \rho (s)\), and concatenate \(\rho _\epsilon \) with \({\hat{\rho }}_\epsilon \) to arrive at a new \({{\mathsf {X}}}\)-valued curve \(\{ {\tilde{\rho }}_\epsilon (r) : 0 \le r \le T\}\). This new curve is defined for \(r \in [0,T]\), but may have a discontinuity at time \(s>0\). In the second step, we note that all arguments and estimates before (3.31) in the proof of this lemma still hold if we replace \(\rho _\epsilon \) by \({\tilde{\rho }}_\epsilon \); hence (3.31) holds for the concatenated curve. In the last step, we note that \(\{ {\tilde{\rho }}_\epsilon :\epsilon >0\}\) and \(\{ \rho _\epsilon : \epsilon >0\}\) have the same limit \(\rho \), by the stability-uniqueness result in Lemma 3.5. Therefore, (3.31), now verified along the curves \({\tilde{\rho }}_\epsilon \), passes to the limit \(\rho \).
We conclude that (3.25) holds for the limit \(\rho \). \(\quad \square \)
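The Jensen-inequality step \(\Vert \eta _\epsilon (r)\Vert _{L^2} \le \Vert \eta (r)\Vert _{L^2}\) used above is the standard mollifier estimate. Writing, schematically, \(\eta _\epsilon = J_\epsilon * \eta \) in the spatial variable (the same computation applies to mollification in time), and using that \(J_\epsilon \ge 0\) with \(\int J_\epsilon = 1\), Jensen's inequality applied to the probability measure \(J_\epsilon (y)\,dy\) gives

```latex
\big| (J_\epsilon * \eta)(x) \big|^2
  = \Big| \int J_\epsilon(y)\, \eta(x-y)\, dy \Big|^2
  \le \int J_\epsilon(y)\, |\eta(x-y)|^2\, dy ,
```

and integrating in x together with Fubini's theorem yields \(\Vert J_\epsilon * \eta \Vert _{L^2}^2 \le \Vert \eta \Vert _{L^2}^2\).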
Lemma 3.7
The energy estimate (3.8) holds.
Proof
Our strategy is to derive (3.8) by passing to the limit \(\epsilon \rightarrow 0^+\) in (3.12).
Let \(\rho \) be the limit of the sequence of functions \(\rho _\epsilon \) as in (3.17). From (3.12), the following holds for every \(0<s <T\):
Suppose that we can find a set \({\mathcal {N}} \subset [s,T]\) of Lebesgue measure zero and functions \([s,T] \ni r \mapsto k_\epsilon (r) \in {{\mathbb {R}}}\) such that for every \(r \in [s,T]\setminus {{\mathcal {N}}}\), there exists a subsequence (still labeled by \(\epsilon :=\epsilon (r)\)) with
Then
hence we conclude.
We establish (3.33) next. Let
By (3.32), we can find a set \({\mathcal {N}} \subset [s,T]\) of Lebesgue measure zero such that
Therefore for \(r \in [s,T]\setminus {{\mathcal {N}}}\), there exists a subsequence \(\epsilon :=\epsilon (r)\), and there exist constants \(k_\epsilon :=k_\epsilon (r)\) such that \(\{ \Phi _\epsilon (\rho _\epsilon (r,\cdot )) + k_\epsilon \}_{\epsilon >0}\) is relatively compact in \(C({\mathcal {O}})\). Let
where the convergence (along the selected subsequence) is uniform in \({\mathcal {O}}\). We claim that \(\{x : \rho _\epsilon (r,x) <0\} = \emptyset \) when \(\epsilon \) is small enough. Suppose this is not true. Then, by continuity of \(x \mapsto \rho _\epsilon (r,x)\), we can find \({\tilde{x}}_\epsilon (r)\) such that \(\rho _\epsilon (r, {\tilde{x}}_\epsilon (r)) =0\). We also recall that, in the proof of Lemma 3.6, we found \(x_\epsilon (r)\) such that \(\rho _\epsilon (r, x_\epsilon (r)) \ge 1/2\). Hence
But on the other hand, the estimate (3.34) implies that
This contradiction allows us to conclude that
Therefore \(u(r,x) = \lim _{\epsilon \rightarrow 0^+}\big ( \frac{1}{2} \log (\rho _\epsilon (r,x)) + (k_\epsilon + C_\epsilon )\big )\), where the convergence is uniform in x. That is, along this subsequence of \(\epsilon =\epsilon (r)\),
In view of the weak convergence in (3.17), \(u(r,x) = \frac{1}{2} \log \rho (r,x) + C_0\) for some constant \(C_0:=C_0(r)\). Hence we verified (3.33). \(\quad \square \)
3.2 A posteriori estimates for the PDE with control
We saw in Lemma 3.5 that the notion of solution in terms of the variational inequalities (3.7) implies uniqueness and stability. Next, we prove that every weak solution in the sense of Definition 3.1 satisfies these variational inequalities, and hence that uniqueness and stability follow.
Lemma 3.8
Every weak solution \((\rho ,\eta )\) to (3.1)–(3.2) also satisfies (3.7).
Proof
From (3.6), simple approximations show that the following holds for all \(0<s<t\):
for all smooth \(\varphi :=\varphi (r,x)\), which includes in particular the choice
here, \(G_\epsilon :=J_\epsilon * J_\epsilon \) where \(J_\epsilon \) is a smooth mollifier and the \(*\) denotes convolution in the spatial variable only.
Therefore
We note that, by Jensen’s inequality,
With the a priori regularity estimates (3.3)–(3.5), passing to the limit \(\epsilon \rightarrow 0\), we obtain (3.7) for \(s>0\). Letting \(s \rightarrow 0^+\), the case \(s=0\) follows. \(\quad \square \)
Lemma 3.9
Suppose that \((\rho , \eta )\) is the weak solution to (3.1)–(3.2) in the sense of Definition 3.1. Let
then
Proof
Following the ideas of Theorem 24.16 of Villani [45], developed in a similar setting, we combine (3.7) and (3.8) to obtain the desired estimate. \(\quad \square \)
We define a set of regular points in the state space \({{\mathsf {X}}}\),
where the last equality follows from one-dimensional Sobolev inequalities. The estimates in Lemma 3.2 imply that, under finite control cost (3.2), the weak solution of (3.1) spends zero Lebesgue time outside the set \(\text {Reg}\). That is,
3.3 A Nisio semigroup
We recall that
For every \((\rho , \eta )\) satisfying (3.1)–(3.2) in the sense of Definition 3.1, by the regularity results established in Lemma 3.2, the function \(\log \rho (t,x)\) defines a distribution in \(\mathcal D^\prime ((0,T) \times {\mathcal {O}})\), and
Let \(f:{{\mathsf {X}}}\mapsto {{\mathbb {R}}}\cup \{+\infty \}\) be such that \(\sup _{{\mathsf {X}}}f <\infty \). We define
It follows that \(\sup _{{\mathsf {X}}}V(t) f <\infty \). Moreover, \(V(t) C = C\) for any \(C \in {{\mathbb {R}}}\).
We define an action functional on curves \(\rho (\cdot ) \in C([0,\infty ); {{\mathsf {X}}})\), thus giving a precise meaning to (1.4):
and
We recall that \(R_\alpha \) is defined in (1.7). We now give regularity results for V(t) and \(R_\alpha \).
Lemma 3.10
For \(h \in C_b({{\mathsf {X}}})\), \(V(t) h\in C_b({{\mathsf {X}}})\) for all \(t \ge 0\) and \(R_\alpha h \in C_b({{\mathsf {X}}})\) for every \(\alpha >0\).
Proof
These claims are consequences of the stability result in Lemma 3.5. The proofs resemble those for standard finite-dimensional control problems. Hence we only prove the claim that \(V(t)h \in C_b({{\mathsf {X}}})\).
Let \(\rho _0, \gamma _0 \in {{\mathsf {X}}}\). For any \(\epsilon >0\), there exist pairs \((\rho (\cdot ), \eta (\cdot )):=(\rho _\epsilon (\cdot ), \eta _\epsilon (\cdot ))\) and \((\gamma (\cdot ), \eta (\cdot )):= (\gamma _\epsilon (\cdot ), \eta _\epsilon (\cdot ))\) satisfying (3.1)–(3.2) with \(\rho (0) =\rho _0\) and \(\gamma (0)=\gamma _0\), such that the contraction estimate (3.22) holds. Consequently,
Since \(\epsilon >0\) is arbitrary, it follows that \(V(t) h \in C_b({{\mathsf {X}}})\). \(\quad \square \)
Lemma 3.11
(Nisio semigroup). The family of operators \(\{ V(t): t \ge 0\}\) has the following properties:
-
(1)
It forms a nonlinear semigroup on \(C_b({{\mathsf {X}}})\),
$$\begin{aligned} V(t)V(s) f = V(t+s) f, \quad \forall t, s \ge 0, f \in C_b({{\mathsf {X}}}). \end{aligned}$$ -
(2)
The semigroup is a contraction on \(C_b({{\mathsf {X}}})\): for every \(f,g \in C_b({{\mathsf {X}}})\), we have
$$\begin{aligned} \Vert V(t) f - V(t) g \Vert _{L^\infty ({{\mathsf {X}}})} \le \Vert f - g \Vert _{L^\infty ({{\mathsf {X}}})}. \end{aligned}$$In fact,
$$\begin{aligned} \sup _{{\mathsf {X}}}\big ( V(t) f - V(t) g \big ) \le \sup _{{\mathsf {X}}}\big (f - g \big ). \end{aligned}$$ -
(3)
The resolvent is a contraction on \(C_b({{\mathsf {X}}})\): for every \(h_1, h_2 \in C_b({{\mathsf {X}}})\), we have
$$\begin{aligned} \Vert R_\alpha h_1 - R_\alpha h_2 \Vert _{L^\infty ({{\mathsf {X}}})} \le \Vert h_1 - h_2\Vert _{L^\infty ({{\mathsf {X}}})}. \end{aligned}$$Moreover, if \(h_1\) is bounded from above and \(h_2\) satisfies (1.10) and is bounded from below, then
$$\begin{aligned} \sup _{{\mathsf {X}}}( R_\alpha h_1 - R_\alpha h_2 ) \le \sup _{{\mathsf {X}}}(h_1 - h_2). \end{aligned}$$(3.41)If \(h_1\) is bounded and \(h_2\) is bounded from above, then
$$\begin{aligned} \inf _{{\mathsf {X}}}(R_\alpha h_1 - R_\alpha h_2) \ge \inf _{{\mathsf {X}}}(h_1 - h_2). \end{aligned}$$
Proof
The semigroup property follows from standard reasoning in dynamic programming. The contraction properties follow from an \(\epsilon \)-optimal control argument applied to the definition of V(t) in (3.38), similar to the proof of the previous lemma. \(\quad \square \)
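To sketch the \(\epsilon \)-optimal control argument for the contraction property, under the reading of (3.38) as a supremum of terminal reward minus action (the action notation \({\mathbf {A}}_{[0,t]}\) below is schematic for the cost functional in (3.38)): given \(\epsilon >0\), pick a pair \((\rho ^*, \eta ^*)\) with \(\rho ^*(0)=\rho _0\) that is \(\epsilon \)-optimal for \(V(t)f(\rho _0)\); then

```latex
V(t)f(\rho_0)
  \le f(\rho^*(t)) - \mathbf{A}_{[0,t]}[\rho^*,\eta^*] + \epsilon
  \le g(\rho^*(t)) - \mathbf{A}_{[0,t]}[\rho^*,\eta^*]
      + \sup_{\mathsf{X}} (f-g) + \epsilon
  \le V(t)g(\rho_0) + \sup_{\mathsf{X}} (f-g) + \epsilon .
```

Letting \(\epsilon \rightarrow 0^+\) and taking the supremum over \(\rho _0\) yields \(\sup _{{\mathsf {X}}}\big ( V(t) f - V(t) g \big ) \le \sup _{{\mathsf {X}}}\big (f - g \big )\).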
Lemma 3.12
Let \(\rho _0 \in {{\mathsf {X}}}\). We have
If in addition \(f \in C_b({{\mathsf {X}}})\), then there exists \(\rho ^t(\cdot ) \in C([0,t]; {{\mathsf {X}}})\), or equivalently a pair \((\rho ^t(\cdot ), \eta (\cdot ))\) satisfying (3.1)–(3.2) in the sense of Definition 3.1 with \(\rho ^t(0) =\rho _0\), also satisfying the variational inequality (3.7), such that
Proof
By the definition of V(t)f in (3.38), we can find a sequence of pairs \((\rho _n(\cdot ), \eta _n(\cdot ))\) satisfying (3.1) in the weak sense of Definition 3.1 with \(\rho _n(0) =\rho _0\), such that
We note that \(V(t) 0 = 0\). The contraction property in Lemma 3.11 gives \(\Vert V(t) f \Vert _{L^\infty ({{\mathsf {X}}})} \le \Vert f \Vert _{L^\infty ({{\mathsf {X}}})}\), which in turn implies that
Therefore there exists an \(\eta \) such that \(\eta _n \rightharpoonup \eta \) weakly in \(L^2((0,T) \times {\mathcal {O}})\). This implies in particular
Therefore, it suffices to show that \(\{ \rho _n(\cdot ) : n =1,2,\ldots \}\) is relatively compact in \(C([0,\infty ); {{\mathsf {X}}})\), and that any limit point \(\rho _\infty (\cdot )\) of a convergent subsequence satisfies the regularity estimates (3.3)–(3.5).
Since \(({{\mathsf {X}}}, {{\mathsf {d}}})\) is a compact metric space, the relative compactness of the curves \(\{ \rho _n(\cdot ) : n=1,2,\ldots \}\) follows from a uniform modulus of continuity, which we now establish. First, for each n, (3.7) implies that
Second, by (3.9), this implies that
where \(M_1>0\) is a finite constant. Using this, we obtain from (3.7) the estimate
Hence there exists a modulus \(\omega _1 : [0, \infty ) \mapsto [0,\infty )\) with \(\omega _1 (0+)=0\) such that
Third, we derive a short-time estimate of a uniform modulus of continuity. From the weak solution property coupled with the a priori estimates in the definition of a solution in Definition 3.1, we can derive a variant of (3.7) by approximation arguments:
In the last two lines above, the first inequality follows from the fact that \(\log \) is a monotonically increasing function. The previous estimate implies that
Consequently, for any \(\epsilon >0\), by a density argument we can find a \(\gamma _{0,\epsilon } \in {{\mathsf {X}}}\) such that \(S(\gamma _{0,\epsilon })<\infty \) and \(\Vert \gamma _{0,\epsilon } - \rho _0 \Vert _{-1}<\epsilon \). Taking \(\gamma _0:= \gamma _{0,\epsilon }\) and \(s=0\), we have
The constant C is the one from (3.43). Fourth, by the existence of \(\omega _i, i=1,2\), we conclude the existence of a uniform modulus \(\omega \) such that
The entropy estimate (3.3) for \(\rho (\cdot )\) follows from the fact that each \(\rho _n(\cdot )\) satisfies (3.7), and that \(\rho \mapsto S(\rho )\) is lower semicontinuous in \(({{\mathsf {X}}}, {{\mathsf {d}}})\). Similarly, (3.4) holds for \(\rho (\cdot )\) since (3.25) holds for each \(\rho _n\) and \(\rho \mapsto \int _{{\mathcal {O}}} (-\log \rho ) \, dx\) is lower semicontinuous in \(({{\mathsf {X}}}, {{\mathsf {d}}})\) (see the proof of (3.29)). Finally, (3.5) holds for \(\rho (\cdot )\) because (3.8) holds for each \(\rho _n(\cdot )\). \(\quad \square \)
Lemma 3.13
For every \(\alpha , \beta >0\), we have
Proof
The idea of the proof of Lemma 8.20 in [24] applies in this context. Because of the special structure of the problem here, the use of relaxed controls there is not necessary.
\(\square \)
Let \(f_0\) and \(H_0f_0\) be as in (1.24)–(1.25). Note that the estimate in (3.7) holds, that \(f_0 - \alpha H_0 f_0\) is lower semicontinuous and bounded from below, and moreover that it satisfies for \(\alpha >0\)
for every \((\rho (\cdot ), \eta (\cdot ))\) solving the controlled PDE (3.1) with (3.2). Therefore
is a well-defined function and is bounded from below.
Similarly, let \(f_1\) and \(H_1 f_1\) be defined as in (1.26)–(1.27). Since \({{\mathsf {X}}}\) is compact, \(f_1 - \alpha H_1 f_1\) is bounded from above; it is also upper semicontinuous. Hence
is a well-defined function and is bounded from above.
With these estimates, we prove a variant of Lemma 8.19 in Feng and Kurtz [24].
Lemma 3.14
For every \(\alpha >0\),
Proof
We note that, for every \(\eta \in L^2({\mathcal {O}})\),
By the a priori estimate (3.7), for every \((\rho (\cdot ), \eta (\cdot ))\) solving the PDE (3.1) in the sense of Definition 3.1 with (3.2) and the initial condition \(\rho (0) =\rho _0\), we have
Moreover,
In view of (1.7), we conclude (3.45).
Next, we show (3.46). For any \(\gamma \in {{\mathsf {X}}}\) and \(\rho \in {{\mathsf {X}}}\) in the definition of \(f_1:=f_1(\gamma )\) as in (1.26), we define \(\eta := - k \partial _x (-\partial _{xx}^2)^{-1} (\rho - \gamma )\). Then
We consider the unique solution \(\gamma := \gamma (t)\) to
where \(\rho \) is such that \(S(\rho )<\infty \). Then, in view of the estimate (3.7) (with the roles of \(\rho \) and \(\gamma \) swapped),
Consequently
Sending \(t \rightarrow \infty \), we conclude the proof of (3.46) and have thus established the lemma. \(\quad \square \)
Lemma 3.15
Let \(\alpha >0\) and \(h \in C_b({{\mathsf {X}}})\). We denote \(f:=R_\alpha h \in C_b({{\mathsf {X}}})\) (Lemma 3.10). Then f is a viscosity sub-solution to (1.28) with \(h_0\) replaced by h; it is also a viscosity super-solution to (1.29) with \(h_1\) replaced by h.
Proof
The proof follows from the proof of part (a) of Theorem 8.27 in Feng and Kurtz [24], using Lemma 3.13. We only give details for the sub-solution case. The conditions on \(H_0 f_0\) are different from those imposed on \(\mathbf{H}_{\dagger }\) in [24]. However, in view of the improved contraction estimate (3.41), and because \(f_0 - \beta H_0 f_0\) satisfies (1.10) (see the a priori estimate (3.7)), the proof can be repeated almost verbatim.
Let \(f_0, H_0 f_0\) be defined as in (1.24), (1.25). Then \(f_0\) is bounded from below and for every \(\beta >0\)
In this estimate, the first inequality follows from Lemmas 3.13 and 3.14; the second follows from (3.41). Since \(\beta >0\) is arbitrary, the sub-solution property follows from Lemma 7.8 of [24]. Note that \(f \in C_b({{\mathsf {X}}})\), \(f_0 \in LSC({{\mathsf {X}}}; {{\mathbb {R}}}\cup \{+\infty \})\) and \(H_0f_0 \in USC({{\mathsf {X}}}; {{\mathbb {R}}}\cup \{-\infty \})\). \(\quad \square \)
In view of the comparison principle we established in Theorem 2.1, the above result allows us to conclude that Theorem 1.2 holds.
Finally, we link the operator \(R_\alpha \) with the semigroup V by a product formula.
Lemma 3.16
Let \(h \in C_b({{\mathsf {X}}})\), then
Proof
The proof of Lemma 8.18 in [24] applies here. The use of a relaxed control argument is not required in the current context. \(\quad \square \)
4 An Informal Derivation of the Hamiltonian H from Stochastic Particles
In this section, we outline a non-rigorous derivation of the Hamiltonian (1.1). A rigorous version of the theory requires significant additional work and will be presented elsewhere. Here, we restrict ourselves to establishing a detailed but heuristic picture.
To explain our program in a nutshell, we start with a system of interacting stochastic particles which has been used as a simplified toy model for gas dynamics. This model leads to the Carleman equation as its kinetic limit. Kurtz [35] obtained the hydrodynamic limit of this model, a nonlinear diffusion equation, in 1973, followed by work of McKean [38]. We will also use ideas of Lions and Toscani [37] from their work on this model. While these references are concerned with the hydrodynamic limit, we are interested more broadly in fluctuations around this limit (which include the hydrodynamic limit as minimiser of the rate functional).
The system is a high-dimensional Markov process. In the hydrodynamic limit, the macroscopic particle density is described by a probability measure on \({\mathcal {O}}\) satisfying a nonlinear diffusion equation. We aim to characterize both the limit and the fluctuations around it through an effective action minimization theory formulated as a path integral. Probabilistic large deviation theory gives us a mathematical framework for making this rigorous.
Following a method developed by Feng and Kurtz [24], we establish the large deviations by studying the convergence of a sequence of Hamiltonians derived from the underlying Markov processes. A critical step in the program is to prove comparison principles for the limiting Hamiltonian. This is the motivation for the results presented in earlier sections of this paper. Another critical step is the derivation of the limit Hamiltonian, which we present now informally. The main technique involved is a singular perturbation method generalized to a setting of nonlinear PDEs in the space of probability measures.
4.1 Carleman equations, mean-field version
We now describe the particle model studied by Kurtz, McKean, and Lions and Toscani [35, 37, 38]. On the unit circle \({\mathcal {O}}\), we are given a fictitious gas consisting of particles with two velocities. The first particle type moves in the positive x-direction and the second in the negative direction, both with the same speed \(c>0\). Let \(w_1(t,x)\) be the density of the first particle type at time t and location x, and \(w_2(t,x)\) the density of the second type. When particles collide, reactions occur if the types are the same; otherwise, particles move freely as if nothing happened. The reaction happens at a rate \(k>0\), and the mechanism is simple: both particles switch to the opposite type. At a mean-field level, we can express the above description as a system of PDEs known as the Carleman equation
Following Lions and Toscani [37], we introduce the total mass density variable \(\rho \) and the flux variable j:
Then
We consider a hydrodynamic rescaling of the system by setting
Then
The flux variable j quickly equilibrates as \(\epsilon \rightarrow 0\) to an invariant set indexed by the slow variable \(\rho \):
This very explicit density-flux relation enables us to close the description using the \(\rho \)-variable only, giving a nonlinear diffusion equation
The first rigorous derivation of (4.3) as a limit of (4.1) was given by Kurtz [35] in 1973 under suitable assumptions on the initial data. McKean [38] improved the result by giving a different and more elementary proof. The change of coordinates to the pair \((\rho , j)\) in Lions and Toscani [37] appeared later but makes the two-scale nature of the problem much more transparent.
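For orientation, the two-scale structure behind the displays above can be sketched as follows; this is a reconstruction in the spirit of Lions and Toscani [37], and the sign and scaling conventions are our assumptions rather than a verbatim restatement of the (omitted) displays.

```latex
% Adding and subtracting the two Carleman equations in (4.1), with
% \rho = w_1 + w_2 and j = c (w_1 - w_2), gives the closed system
\begin{aligned}
  \partial_t \rho + \partial_x j &= 0, \\
  \partial_t j + c^2 \partial_x \rho &= - 2 k \rho \, j .
\end{aligned}
% Under the hydrodynamic scaling c = \epsilon^{-1}, k = \epsilon^{-2},
% the second equation forces, as \epsilon \to 0, the equilibration
\begin{aligned}
  j \;=\; - \frac{c^2}{2 k \rho}\, \partial_x \rho
    \;=\; - \frac{\partial_x \rho}{2 \rho},
\end{aligned}
% and substituting this into the conservation law yields a
% nonlinear diffusion equation of the form
\begin{aligned}
  \partial_t \rho
  = \partial_x \Big( \frac{\partial_x \rho}{2 \rho} \Big)
  = \tfrac{1}{2}\, \partial_{xx}^2 \log \rho .
\end{aligned}
```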
4.2 A microscopically defined stochastic Carleman particle model
The Carleman equation (4.1) is a mean-field model without any fluctuation. We go beyond the mean-field model by adding more detail. One way of doing so would be to introduce explicitly a Lagrangian action, so that the Carleman dynamics (4.1) appears as a critical point or minimizer in the space of curves. We will, however, pursue a different, implicit approach by introducing the action probabilistically through underlying stochastic particle dynamics. There is more than one possible choice for such a model. However, all choices should have the following properties: first, such a model should give the Carleman equation in the large particle number limit; second, the action should appear implicitly as the likelihood of seeing a curve in the space of curves; that is, the higher the action, the less likely the curve is to be observed. This action can be defined through a limit theorem as several parameters are rescaled (particle number, transport speed and reaction speed), as in (1.12). The precise language to be used here is that of large deviations. Caprino, De Masi, Presutti and Pulvirenti [3] considered such a stochastic particle model and studied its law of large numbers limit. We now study the large deviations using a slight variation of their model.
We denote the phase space variable of an N-particle system
and define an operator \(\Phi _{ij}\) in the phase space
For \(f:=f(\mathbf{x,v})\) and \(i \ne j\), with a slight abuse of notation, we denote
To model nearest neighbor interaction, we introduce a standard non-negative symmetric mollifier \({\hat{J}} \in C^\infty ((-1,1);{{\mathbb {R}}}_+)\) with \(\int _{x \in (-1,1)} {\hat{J}}(x) dx =1\), \({\hat{J}}(0)>0\). We denote
and
Let \(\theta := \theta _N \rightarrow 0\) slowly, with \(N \theta _N \rightarrow \infty \), and let \(\tau := \tau _N \rightarrow 0\). We now describe our modification of the model studied by Caprino, De Masi, Presutti and Pulvirenti [3]. We consider a Markov process on the state space \(\big ( {{\mathcal {O}}} \times \{-1, +1\}\big )^N\) given by the generator
From a formal point of view, the parameter \(\tau \) is unnecessary. However, it is essential for obtaining useful a priori estimates which allow an analysis of the limit passage. It was introduced in [3] to avoid a paradoxical feature observed by Uchiyama [42] in the case of the Broadwell equations: particles at the same location cannot be separated by the dynamics. Hence, without the \(\tau \)-term, the kinetic limit \(N \rightarrow \infty \) of the stochastic model does not converge to the Carleman equation, contrary to what formal computations suggest. We refer the reader to page 628 and Section 4 of [3] for more information on this point.
Let \((\mathbf{X, V}):= \big ( (X_1, V_1), \ldots , (X_N, V_N)\big )\) be the Markov process defined by the generator \(B_N\). Moreover, we denote the one-particle-marginal density
Exploiting propagation of chaos through the BBGKY hierarchy, the authors of [3] proved that, as \(N \rightarrow \infty \), \(\mu _N\) has a (kinetic) limit \(\mu := \mu (dx,v; t):= \mu (x,v;t) dx\) satisfying
This is the Carleman system (4.1) if we take \(w_1(t,x)=\mu (x,+1;t)\) and \(w_2(t,x):=\mu (x,-1;t)\).
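The collision mechanism described above can be sketched in a few lines of code. The following toy simulation is a discrete-time caricature of the dynamics generated by \(B_N\): the explicit Euler time-stepping, the hard-cutoff interaction kernel, and the omission of the \(\tau \)-term are all simplifying assumptions, not the paper's exact construction.

```python
import math, random

# Toy sketch of the stochastic Carleman particle dynamics.
# N particles on the circle O = R/Z carry velocities v_i in {-1, +1}.
# Transport: x_i moves with speed c * v_i.  Collision: a pair (i, j) with
# v_i == v_j and circle-distance < theta flips both velocities at rate ~ k.

random.seed(0)
N, c, k, theta, dt = 40, 1.0, 5.0, 0.1, 0.01

def circ_dist(a, b):
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)

x = [random.random() for _ in range(N)]
v = [random.choice([-1, 1]) for _ in range(N)]

for _ in range(200):
    x = [(x[i] + c * v[i] * dt) % 1.0 for i in range(N)]   # free transport
    for i in range(N):
        for j in range(i + 1, N):
            if v[i] == v[j] and circ_dist(x[i], x[j]) < theta:
                if random.random() < k * dt:               # collision event
                    v[i], v[j] = -v[i], -v[j]              # both switch type

assert len(x) == N and all(w in (-1, 1) for w in v)        # N and types preserved
assert all(0.0 <= xi < 1.0 for xi in x)
```

Note that each collision flips a same-velocity pair, so the total particle number is conserved while the counts of the two types change by \(\pm 2\), mirroring the reaction mechanism of (4.1).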
In order to understand the large deviation behavior, following [24], we compute the following nonlinear operator
We define the empirical probability measure
and choose a class of test functions which are symmetric under particle permutations,
The test function f can be abstractly thought of as a function in the space of probability measures with a typical element denoted as \(\mu \), hence the notation \(f(\mu )\). In the following, we use the traditional notation of a functional derivative,
For any test function \(\varphi =\varphi (x,v)\), we define a collision operator which maps a function \(\varphi \) of two variables (x, v) into a function \(C \varphi \) of four variables \((x,v, x_*, v_*)\), as follows:
For a measure \(\nu \) on \({\mathcal {O}}\) and \(\theta >0\), we define its mollification
Then, direct computation leads to the estimate
In the last line above, we invoked the condition \(N \theta _N \rightarrow +\infty \) to ensure that the diagonal terms (those with \(i=j\)) have a negligible effect on the overall convergence. Assuming \(\mu _N \rightarrow \mu \) in the narrow topology, where \(\mu (dx;v) = \mu (x,v) dx\), we then have
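A concrete sketch of the mollified empirical measure \(J_\theta * \nu \) used above follows; the particular bump mollifier and the quadrature are illustrative assumptions (the paper only requires \({\hat{J}}\) to be smooth, symmetric, nonnegative, of unit mass, with \({\hat{J}}(0)>0\)).

```python
import math

# Mollified empirical measure nu_N = (1/N) sum_i delta_{x_i}, smoothed by
# J_theta(x) = theta^{-1} J(x / theta) on the circle O = R/Z.

def J(x):                       # standard smooth bump supported on (-1, 1)
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

m = 2001                        # normalize J to unit integral (midpoint rule)
Z = sum(J(-1 + 2 * (i + 0.5) / m) for i in range(m)) * (2 / m)

def J_theta(x, theta):
    return J(x / theta) / (Z * theta)

xs = [0.11, 0.35, 0.36, 0.74]   # particle positions on the circle
theta = 0.05

def mollified_density(y):
    # periodize via the circle distance to the nearest representative
    return sum(J_theta(min(abs(y - xi) % 1.0, 1.0 - abs(y - xi) % 1.0), theta)
               for xi in xs) / len(xs)

grid = 4000
total = sum(mollified_density((i + 0.5) / grid) for i in range(grid)) / grid
assert abs(total - 1.0) < 1e-3  # mollification preserves total mass
```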
4.3 Large deviation from the hydrodynamic limit
We now consider the hydrodynamic scaling by taking \(c:=\epsilon ^{-1}\) and \(k:= \epsilon ^{-2}\), together with \(N:=N(\epsilon ) \rightarrow \infty \).
To emphasize the two-scale nature of the problem, we switch to the density-flux coordinates:
The calculations in the coming paragraphs will heavily rely upon the simple relations:
and
Let
Then
We consider
and then
More generally, we can consider
The identity (4.9) relating derivatives of \(\mu \) to those of \(\rho \) and j still holds:
From this point on, we write \(H_\epsilon := H_{N(\epsilon )}\) to emphasize the dependence of \(\epsilon \). Then
Following the abstract theorems in Feng and Kurtz [24], if we can derive a limit of the operator \(H_\epsilon \) (which we claim to be the H in (1.1)) and if we can prove the associated comparison principle, then
where the action functional \(A_T\) is defined as in (1.4); here \(B(\rho (\cdot );\delta )\) is a ball of radius \(\delta \) around \(\rho (\cdot )\) in \(C([0,T]; {{\mathsf {X}}})\) and \({{\mathsf {X}}}={\mathcal {P}}({\mathcal {O}})\) is the metric space specified in the introduction. This will then rigorously justify the formal statement (1.12). We reiterate that this paper rigorously establishes one challenging part, the comparison principle (Sect. 2); we now give heuristic arguments for the other part, the convergence.
4.4 Convergence of Hamiltonian operators, in a singularly perturbed sense
We show that the Hamiltonian H in (1.1) is a formal limit of the sequence of operators \(H_\epsilon \). The identification of H is related to an infinite-dimensional version of the ground state energy problem in (4.15). We now describe three different possible approaches.
We consider a class of perturbed test functions
It then follows that
where
with
and
We would like the limit in (4.12) to be independent of the j-variable, and thus want the j-variable to disappear asymptotically. We can choose the test function \(f_1\) suitably to achieve this.
We introduce perturbed Hamiltonians in the j-variable
Then we seek a solution to a stationary Hamilton–Jacobi equation in the j-variable
where H is a constant in j but may depend on \(\rho \) through \({{{\mathsf {H}}}}(\cdot , \cdot ;\rho )\) and on \(\varphi \) through \({{\mathsf {V}}}(\cdot ;\varphi )\). We denote this dependence as
Suppose that we can solve (4.15); then
and
Hence we can conclude our program. Next, we identify the Hamiltonian H as the one defined in (1.1) and show that the associated Hamilton–Jacobi equation (1.3) (in the interpretation of Sects. 2 and 3) is solvable.
We now present three different approaches to identify H. We comment that, although we work with a specific model (the Carleman particles) in this paper, our goal is more ambitious. We would like to explore the scope of applicability of the Hamiltonian operator convergence method in the context of hydrodynamic limits. As this ambition is very general, we aim to present as many ways of verifying the required conditions as possible.
4.5 First approach to identify H—formal weak KAM method in infinite dimensions
In finite dimensions, equations of the type (4.15) have been studied in the weak KAM (Kolmogorov–Arnold–Moser) theory for Hamiltonian dynamical systems. See Fathi [18,19,20], E [12], Evans [13,14,15], Evans and Gomes [16, 17], Fathi and Siconolfi [21, 22] and others; there is an unpublished book of Fathi [23]. The existing literature focuses on finite-dimensional systems, mostly with compactness assumptions on the physical space. Our setting is necessarily very different, as we have an infinite-dimensional non-locally compact state space. In the following, we (formally) apply conclusions of the existing weak KAM theory to arrive at the representation
The representation (4.16) can be made more explicit due to a hidden controlled gradient flow structure in \({{\mathsf {H}}}\) (see (4.17)). To present the ideas as clearly as possible, we introduce yet another set of coordinates by considering
Then
This motivates us to introduce new test functions \(\phi := \rho \xi \). Under the new coordinates, we have
We define a free energy function
so
and
In particular, for any \(\theta \in [0,2]\),
This inequality will play an important role in the rigorous justification of the derivation of H in (1.1).
Putting everything together, (4.16) gives
This is the Hamiltonian we gave in (1.1).
4.6 A decomposition of the \({{\mathsf {H}}}\) into a family of microscopic ones \(\big \{ {{\mathfrak {h}}}(\cdot ; \alpha ,\beta ) : \alpha , \beta \in {{\mathbb {R}}}\big \}\)
The second and third approaches to identify H involve a subtle argument, which we explain first.
For the kind of problem we consider, we intuitively expect propagation of chaos to hold. We expect this even at the large deviation/hydrodynamic limit scale. Therefore, the infinite-dimensional Hamiltonian \({{\mathsf {H}}}\) is expected to be representable as a superposition of a family of one-particle-level Hamiltonians indexed by hydrodynamic parameters in statistical local equilibrium. This intuition leads to the following arguments.
We define a family of Hamiltonians indexed by \((\alpha ,\beta )\) at the one-particle level,
We observe that
and that
At least formally, if we take
and denote \(\partial _1 \psi (\upsilon ;y):= \partial _\upsilon \psi (\upsilon ;y)\), then
and
Therefore, in order to solve (4.15), it suffices to solve a family (indexed by \(\alpha \) and \(\beta \)) of finite-dimensional “small cell” problems
Here, E is constant in the variable \(\upsilon \). Moreover, if we can solve this finite-dimensional PDE problem, then the H term for the infinite-dimensional problem (4.15) has a solution
These considerations lead to two more ways of identifying the effective Hamiltonian \(H=H(\rho ,\varphi )\). (We remark that we could present at least one further approach, which exploits the special one-dimensional nature of (4.20) by invoking Maupertuis' principle. We choose not to present this approach, since we are interested in general methodologies that work even when the velocity field u(x) takes values in several dimensions and \(\upsilon \) in (4.20) lives in several dimensions.)
4.7 Second approach to identify H—finite-dimensional weak KAM and the method of equilibrium points
We introduce a microscopic (one-particle level) free energy function
The connection with the free energy introduced earlier is that
It is not surprising that the microscopic Hamiltonians \({\mathfrak h}\) also have controlled gradient flow structures:
if we introduce a family of isotropic Hamiltonians
Solving (4.20) is equivalent to solving
with \(\Psi = (\psi -{{\mathfrak {f}}})\). We note that \({\mathfrak {h}}_\mathrm{iso}\) is isotropic in the sense that the dependence on the generalized momentum variable p is only through its length |p|, i.e. \({{\mathfrak {h}}}_{\mathrm{iso}}(\upsilon , p) = {{\mathfrak {h}}}_{\mathrm{iso}} \big (\upsilon , |p|\big )\). It also holds that \({{\mathbb {R}}}_+ \ni r \mapsto {{\mathfrak {h}}}_{\mathrm{iso}}(\upsilon , r)\) is convex, monotonically nondecreasing and super-linear. In particular,
For this kind of Hamiltonian, it is known that (e.g. Fathi [23])
Consequently,
4.8 Third approach to identify H—semiclassical approximations
Finally, we abandon methods based on weak KAM. Instead, we introduce a method for identifying \(E[P;\alpha ,\beta ]\) directly using probability theory and ideas from semiclassical limits.
Our point of departure is to approximate equation (4.20) by introducing an extra viscosity parameter \(\kappa >0\). For readers familiar with the Hamiltonian convergence approach to large deviations as described in Feng and Kurtz [24], \({\mathfrak {h}}\) is (see (4.23) below) the limiting Hamiltonian for small noise large deviations (\(\kappa \rightarrow 0^+\)) of the stochastic differential equations
The solution \(\upsilon (t)\) is an \({{\mathbb {R}}}\)-valued Markov process with infinitesimal generator
Following [24], we define a sequence of nonlinear second order differential operators
then
We also consider a second-order stationary Hamilton–Jacobi equation with constant \(E_\kappa \):
This can be viewed as a regularized approximation to the first-order equation (4.20).
A simple transformation turns the nonlinear PDE (4.24) into a linear eigenfunction–eigenvalue equation
where
This is the equation defining the ground state \(\Psi _\kappa \) with ground state energy \(E_\kappa \) of the rescaled Schrödinger operator \(\kappa L_\kappa \). There is a theory giving uniqueness of the constant E in (4.20). By well-known stability results for viscosity solutions of the Hamilton–Jacobi equations (4.24), we can prove that
The ground state energy \(E_\kappa \) is given by the Rayleigh-Ritz formula, which has been extensively studied in probability theory in the context of large deviations for occupation measures by Donsker and Varadhan. We denote by
the invariant probability measure for the Markov process \(\upsilon (t)\), and introduce a family of related probability measures indexed by \(\Phi \in C_b({{\mathbb {R}}})\):
We identify the (pre-)Dirichlet form associated with \(\kappa L_\kappa \) by
Then, by the arguments on pages 112 and 113 of Stroock [41] (alternatively, one can also follow Example B.14 in Feng and Kurtz [24]), we have
A change of variable \({\hat{\Phi }} \mapsto \kappa \Phi \) gives
where
We can further lift the \(m_{\kappa , \Phi }\) probability measure to
giving
We see that as \(\kappa \rightarrow 0\), by the Laplace principle, the limit points of \(\{ {\mathfrak {m}}_{\kappa , \Phi }: \kappa >0 \}\) form a family of probability measures as follows:
where \(\upsilon _k\) solves the algebraic equation
That is,
Then it follows that
Hence, again, we are led to the
and again recover the Hamiltonian (1.1).
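The semiclassical picture behind (4.24)–(4.25) can be illustrated numerically. The sketch below discretizes a Schrödinger-type operator \(-(\kappa^2/2)\, d^2/d\upsilon^2 + W(\upsilon)\) by finite differences; the quadratic well W, the Dirichlet truncation to a finite interval, and the grid are illustrative assumptions, not the paper's operator \(\kappa L_\kappa \). The Rayleigh–Ritz formula gives the ground state energy as the smallest eigenvalue, and as \(\kappa \rightarrow 0\) the energy concentrates near \(\min W\), in line with the Laplace-principle limit of \({\mathfrak {m}}_{\kappa , \Phi }\).

```python
import numpy as np

# Ground state energy of H_kappa = -(kappa^2/2) d^2/dv^2 + W(v), discretized
# on [-L, L] with Dirichlet boundary conditions (an illustrative toy model).
def ground_energy(kappa, n=400, L=4.0):
    v = np.linspace(-L, L, n)
    h = v[1] - v[0]
    W = 0.5 * (v - 1.0) ** 2                   # potential with min W = 0 at v = 1
    main = kappa**2 / h**2 + W                 # diagonal of kinetic part + W
    off = -kappa**2 / (2 * h**2) * np.ones(n - 1)
    Hmat = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(Hmat)[0]         # Rayleigh-Ritz: smallest eigenvalue

E_small, E_big = ground_energy(0.05), ground_energy(0.5)
assert 0 <= E_small < E_big                    # energy shrinks with kappa
assert E_small < 0.05                          # consistent with E_kappa -> min W = 0
```

For the harmonic well above, the continuum ground energy is \(\kappa/2\), so the computed values decay linearly in \(\kappa \), a simple instance of the semiclassical concentration used in the argument.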
References
Ambrosio, L., Feng, J.: On a class of first order Hamilton–Jacobi equations in metric spaces. J. Differ. Equ. 256(7), 2194–2245 (2014). https://doi.org/10.1016/j.jde.2013.12.018
Ambrosio, L., Gigli, N., Savaré, G.: Gradient flows in metric spaces and in the space of probability measures. In: Lectures in Mathematics ETH Zürich, 2, x+334. Birkhäuser, Basel (2008)
Caprino, S., De Masi, A., Presutti, E., Pulvirenti, M.: A stochastic particle system modeling the Carleman equation. J. Stat. Phys. 55(3–4), 625–638 (1989)
Crandall, M.G., Lions, P.-L.: Hamilton–Jacobi equations in infinite dimensions. I. Uniqueness of viscosity solutions. J. Funct. Anal. 62(3), 379–396 (1985). https://doi.org/10.1016/0022-1236(85)90011-4
Crandall, M.G., Lions, P.-L.: Hamilton–Jacobi equations in infinite dimensions. II. Existence of viscosity solutions. J. Funct. Anal. 65(3), 368–405 (1986). https://doi.org/10.1016/0022-1236(86)90026-1
Crandall, M.G., Lions, P.-L.: Hamilton–Jacobi equations in infinite dimensions. III. J. Funct. Anal. 68(2), 214–247 (1986). https://doi.org/10.1016/0022-1236(86)90005-4
Crandall, M.G., Lions, P.-L.: Viscosity solutions of Hamilton–Jacobi equations in infinite dimensions. IV. Hamiltonians with unbounded linear terms. J. Funct. Anal. 90(2), 237–283 (1990). https://doi.org/10.1016/0022-1236(90)90084-X
Crandall, M.G., Lions, P.-L.: Viscosity solutions of Hamilton–Jacobi equations in infinite dimensions. V. Unbounded linear terms and \(B\)-continuous solutions. J. Funct. Anal. 97(2), 417–465 (1991). https://doi.org/10.1016/0022-1236(91)90010-3
Crandall, M.G., Lions, P.-L.: Hamilton–Jacobi equations in infinite dimensions. VI. Nonlinear \(A\) and Tataru’s method refined. In: Evolution Equations, Control Theory, and Biomathematics (Han sur Lesse, 1991), Lecture Notes in Pure and Applied Mathematics, vol. 155, pp. 51–89. Dekker, New York (1994)
Crandall, M.G., Lions, P.-L.: Viscosity solutions of Hamilton–Jacobi equations in infinite dimensions. VII. The HJB equation is not always satisfied. J. Funct. Anal. 125(1), 111–148 (1994). https://doi.org/10.1006/jfan.1994.1119
Dolgopyat, D., Liverani, C.: Energy transfer in a fast–slow Hamiltonian system. Commun. Math. Phys. 308(1), 201–225 (2011). https://doi.org/10.1007/s00220-011-1317-7
E, W.: Aubry–Mather theory and periodic solutions of the forced Burgers equation. Commun. Pure Appl. Math. 52(7), 811–828 (1999)
Evans, L.C.: Effective Hamiltonians and quantum states, Séminaire: Équations aux Dérivées Partielles, 2000–2001, Sémin. Équ. Dériv. Partielles, École Polytech., Palaiseau, Exp. No. XXII, 13 (2001)
Evans, L.C.: Towards a quantum analog of weak KAM theory. Commun. Math. Phys. 244(2), 311–334 (2004)
Evans, L.C.: A survey of partial differential equations methods in weak KAM theory. Commun. Pure Appl. Math. 57(4), 445–480 (2004)
Evans, L.C., Gomes, D.: Effective Hamiltonians and averaging for Hamiltonian dynamics. I. Arch. Ration. Mech. Anal. 157(1), 1–33 (2001)
Evans, L.C., Gomes, D.: Effective Hamiltonians and averaging for Hamiltonian dynamics. II. Arch. Ration. Mech. Anal. 161(4), 271–305 (2002)
Fathi, A.: Théorème KAM faible et théorie de Mather sur les systèmes lagrangiens (French, with English and French summaries). C. R. Acad. Sci. Paris Sér. I Math. 324(9), 1043–1046 (1997)
Fathi, A.: Solutions KAM faibles conjuguées et barrières de Peierls (French, with English and French summaries). C. R. Acad. Sci. Paris Sér. I Math. 325(6), 649–652 (1997)
Fathi, A.: Orbites hétéroclines et ensemble de Peierls (French, with English and French summaries). C. R. Acad. Sci. Paris Sér. I Math. 326(10), 1213–1216 (1998)
Fathi, A., Siconolfi, A.: Existence of \(C^1\) critical subsolutions of the Hamilton–Jacobi equation. Invent. Math. 155(2), 363–388 (2004)
Fathi, A., Siconolfi, A.: PDE aspects of Aubry–Mather theory for quasiconvex Hamiltonians. Calc. Var. Partial Differ. Equ. 22(2), 185–228 (2005)
Fathi, A.: Weak KAM Theorem in Lagrangian Dynamics. Cambridge Studies in Advanced Mathematics, vol. 88. Cambridge University Press (unpublished)
Feng, J., Kurtz, T.G.: Large Deviations for Stochastic Processes. Mathematical Surveys and Monographs, vol. 131. American Mathematical Society, Providence (2006)
Feng, J., Katsoulakis, M.: A comparison principle for Hamilton–Jacobi equations related to controlled gradient flows in infinite dimensions. Arch. Ration. Mech. Anal. 192(2), 275–310 (2009)
Feng, J., Święch, A.: Optimal control for a mixed flow of Hamiltonian and gradient type in space of probability measures, With an appendix by Atanas Stefanov. Trans. Am. Math. Soc. 365(8), 3987–4039 (2013). https://doi.org/10.1090/S0002-9947-2013-05634-6
Galaz-García, F., Kell, M., Mondino, A., Sosa, G.: On quotients of spaces with Ricci curvature bounded below. J. Funct. Anal. 275(6), 1368–1446 (2018). https://doi.org/10.1016/j.jfa.2018.06.002
Gangbo, W., Nguyen, T., Tudorascu, A.: Hamilton–Jacobi equations in the Wasserstein space. Methods Appl. Anal. 15(2), 155–183 (2008). https://doi.org/10.4310/MAA.2008.v15.n2.a4
Gangbo, W., Święch, A.: Metric viscosity solutions of Hamilton–Jacobi equations depending on local slopes. Calc. Var. Partial Differ. Equ. 54(1), 1183–1218 (2015). https://doi.org/10.1007/s00526-015-0822-5
Gangbo, W., Tudorascu, A.: Lagrangian dynamics on an infinite-dimensional torus; a weak KAM theorem. Adv. Math. 224(1), 260–292 (2010). https://doi.org/10.1016/j.aim.2009.11.005
Giga, Y., Hamamuki, N., Nakayasu, A.: Eikonal equations in metric spaces. Trans. Am. Math. Soc. 367(1), 49–66 (2015). https://doi.org/10.1090/S0002-9947-2014-05893-5
Gorban, A.N.: Hilbert’s sixth problem: the endless road to rigour. Philos. Trans. R. Soc. A 376(2118), 20170238 (2018). https://doi.org/10.1098/rsta.2017.0238
Guo, M.Z., Papanicolaou, G.C., Varadhan, S.R.S.: Nonlinear diffusion limit for a system with nearest neighbor interactions. Commun. Math. Phys. 118(1), 31–59 (1988)
Kipnis, C., Landim, C.: Scaling Limits of Interacting Particle Systems, Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 320. Springer, Berlin (1999). https://doi.org/10.1007/978-3-662-03752-2
Kurtz, T.G.: Convergence of sequences of semigroups of nonlinear operators with an application to gas kinetics. Trans. Am. Math. Soc. 186(1973), 259–272 (1974)
Ladyženskaja, O.A., Solonnikov, V.A., Ural’ceva, N.N.: Linear and Quasilinear Equations of Parabolic Type. Translated from the Russian by S. Smith. Translations of Mathematical Monographs, vol. 23. American Mathematical Society, Providence, xi+648 pp. (1968)
Lions, P.L., Toscani, G.: Diffusive limit for finite velocity Boltzmann kinetic models. Rev. Mat. Iberoam. 13(3), 473–513 (1997)
McKean, H.P.: The central limit theorem for Carleman’s equation. Isr. J. Math. 21(1), 54–92 (1975). https://doi.org/10.1007/BF02757134
Mischler, S., Mouhot, C.: Kac’s program in kinetic theory. Invent. Math. 193(1), 1–147 (2013). https://doi.org/10.1007/s00222-012-0422-3
Spohn, H.: Large Scale Dynamics of Interacting Particles. Springer, Berlin (1991)
Stroock, D.W.: An Introduction to the Theory of Large Deviations, Universitext. Springer, New York (1984). https://doi.org/10.1007/978-1-4613-8514-1
Uchiyama, K.: On the Boltzmann–Grad limit for the Broadwell model of the Boltzmann equation. J. Stat. Phys. 52(1–2), 331–355 (1988). https://doi.org/10.1007/BF01016418
Vázquez, J.L.: The Porous Medium Equation: Mathematical Theory. Oxford Mathematical Monographs. The Clarendon Press, Oxford (2007)
Villani, C.: Topics in Optimal Transportation. Graduate Studies in Mathematics, vol. 58. American Mathematical Society, Providence (2003). https://doi.org/10.1007/b12016
Villani, C.: Optimal Transport. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 338. Springer, Berlin (2009). https://doi.org/10.1007/978-3-540-71050-9
Communicated by M. Hairer.
Jin Feng’s work was supported in part by US NSF Grant No. DMS-1440140 while he was in residence at MSRI - Berkeley, California, during fall 2018; and in part by a Simons Visiting Professorship to Mathematisches Forschungsinstitut Oberwolfach, Germany and to Technical University of Eindhoven, the Netherlands in November 2017, and in part by LABEX MILYON (ANR-10-LABX-0070) of Université de Lyon, within the program “Investissements d’Avenir” (ANR-11-IDEX-0007) operated by French National Research Agency (ANR). He thanks Albert Fathi for helpful discussions on weak KAM theory on numerous occasions. Toshio Mikami’s work was supported by JSPS KAKENHI Grant Numbers JP26400136 and JP16H03948 from Japan. Johannes Zimmer’s work was supported through a Royal Society Wolfson Research Merit Award.
Appendix A. Some Properties of \({\mathcal {P}}({\mathcal {O}})\)
A.1 Quotients and Coverings
A.1.1 Projections and lifts
As defined in Sect. 1.3, we view \({\mathcal {O}}:= {{\mathbb {R}}}/{{\mathbb {Z}}}\) as a quotient space with corresponding quotient metric r. We define a projection \({{\mathsf {p}}}:{{\mathbb {R}}}\mapsto {\mathcal {O}}\) by
\[ {{\mathsf {p}}}({\hat{x}}) := {\hat{x}} + {{\mathbb {Z}}}, \qquad {\hat{x}} \in {{\mathbb {R}}}. \]
Let \({\mathcal {P}}_2({{\mathbb {R}}})\) be the space of Borel probability measures on \({{\mathbb {R}}}\) with finite second moment, equipped with the order-2 Wasserstein metric [2]. For every \({\hat{\mu }} \in {\mathcal {P}}_2({{\mathbb {R}}})\), we define the push forward of \({\hat{\mu }}\) by \(\mu := {{\mathsf {p}}}_\# {\hat{\mu }} \in {\mathcal {P}}({\mathcal {O}})\). That is, we project \({\hat{\mu }}\) to \(\mu \) in the following way:
\[ \mu (A) := {\hat{\mu }}\bigl({{\mathsf {p}}}^{-1}(A)\bigr) = \sum _{k \in {{\mathbb {Z}}}} {\hat{\mu }}(A+k), \qquad A \in {{\mathcal {B}}}({\mathcal {O}}). \]
Here, identifying \(A \subset {\mathcal {O}}\) with a subset of \([0,1)\), we use the set \(A+k:= \{ x +k : x \in A\}\), and we write \({{\mathcal {B}}}({\mathcal {O}})\) to denote the collection of Borel sets in \({\mathcal {O}}\).
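As an informal numerical illustration (not part of the analysis, and with function names of our choosing), the projection \({{\mathsf {p}}}\) and the induced pushforward can be computed for a discrete measure \({\hat{\mu }} = \sum _i w_i \delta _{x_i}\) on \({{\mathbb {R}}}\); atoms differing by an integer merge on \({\mathcal {O}}\):

```python
import numpy as np

def project(x_hat):
    """Quotient map p : R -> O = R/Z, identifying O with [0, 1)."""
    return x_hat - np.floor(x_hat)

def pushforward(atoms, weights):
    """Push a discrete measure on R forward to O: each atom lands at
    p(x_i), and weights of atoms differing by an integer are summed."""
    support = {}
    for x, w in zip(project(atoms), weights):
        key = round(float(x), 12)
        support[key] = support.get(key, 0.0) + w
    return support

atoms = np.array([-0.75, 0.25, 1.25, 0.6])   # three atoms agree mod 1
weights = np.array([0.1, 0.2, 0.3, 0.4])
mu = pushforward(atoms, weights)
# p(-0.75) = p(0.25) = p(1.25) = 0.25, so the masses 0.1 + 0.2 + 0.3 merge.
```

The merged atom carries mass 0.6, and the total mass 1 is preserved, in accordance with \(\mu (A) = \sum _k {\hat{\mu }}(A+k)\).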
There are many ways to lift a probability measure \(\mu \in \mathcal P({\mathcal {O}})\) to a probability measure \({\hat{\mu }} \in \mathcal P_2({{\mathbb {R}}})\) such that \({{\mathsf {p}}}_\# {\hat{\mu }} = \mu \). Following Galaz-García, Kell, Mondino and Sosa [27], we now describe one class of such lifts. Given a family of weights
\[ \alpha := \{ \alpha _k(x) : k \in {{\mathbb {Z}}},\ x \in {\mathcal {O}} \}, \qquad \alpha _k(x) \ge 0, \quad \sum _{k \in {{\mathbb {Z}}}} \alpha _k(x) = 1, \]
we introduce a probability measure on \({{\mathbb {R}}}\) by
\[ \nu _x := \sum _{k \in {{\mathbb {Z}}}} \alpha _k(x)\, \delta _{{\hat{x}} + k}, \qquad x \in {\mathcal {O}}, \]
where \({\hat{x}} \in [0,1)\) denotes the representative of x.
Using the family of measures \(\{\nu _x \}_{x \in \mathcal O}\), we define a lift operator \(\Lambda : = \Lambda ^\alpha : \mathcal P({\mathcal {O}}) \mapsto {\mathcal {P}}_2({{\mathbb {R}}})\) as follows:
\[ (\Lambda \mu )(B) := \int _{{\mathcal {O}}} \nu _x(B)\, \mu (dx), \qquad B \in {{\mathcal {B}}}({{\mathbb {R}}}), \]
and we write \({\hat{\mu }} := \Lambda \mu \).
We note that \({{\mathsf {p}}}_\# {\hat{\mu }} = \mu \).
Let \(C_{\mathrm{per}}({{\mathbb {R}}})\) be the collection of continuous functions on \({{\mathbb {R}}}\) which are 1-periodic. Similarly, we define \(C_{\mathrm{per}}^p({{\mathbb {R}}})\) for \(p =1,2,\ldots , \infty \). For each \({\hat{\varphi }} \in C_{\mathrm{per}}^p({{\mathbb {R}}})\), we have the translation invariance
\[ {\hat{\varphi }}({\hat{x}} + k) = {\hat{\varphi }}({\hat{x}}), \qquad {\hat{x}} \in {{\mathbb {R}}},\ k \in {{\mathbb {Z}}}. \]
Hence such a function \({\hat{\varphi }}\) projects to an element \(\varphi \in C^p({\mathcal {O}})\) as follows:
\[ \varphi ({{\mathsf {p}}}({\hat{x}})) := {\hat{\varphi }}({\hat{x}}), \qquad {\hat{x}} \in {{\mathbb {R}}}. \]
On the other hand, each \(\varphi \in C^p({\mathcal {O}})\) has a lift to \({\hat{\varphi }} \in C^p_{\mathrm{per}}({{\mathbb {R}}})\) defined by
\[ {\hat{\varphi }} := \varphi \circ {{\mathsf {p}}}. \]
Such a lift is translation invariant in \({{\mathbb {Z}}}\) and its projection (as defined in (A.1)) gives \(\varphi \).
Given \(\rho , \gamma \in {\mathcal {P}}({\mathcal {O}})\) and \(\varphi \in C^p({\mathcal {O}})\), for a fixed \(\alpha \), let \({\hat{\rho }}, {\hat{\gamma }} \in {\mathcal {P}}_2({{\mathbb {R}}})\) and \({\hat{\varphi }} \in C^p_{\mathrm{per}}({{\mathbb {R}}})\) be lifts as just defined. Then
\[ \int _{{\mathbb {R}}} {\hat{\varphi }}\, d{\hat{\rho }} = \int _{{\mathcal {O}}} \varphi \, d\rho , \qquad \int _{{\mathbb {R}}} {\hat{\varphi }}\, d{\hat{\gamma }} = \int _{{\mathcal {O}}} \varphi \, d\gamma . \]
In particular, this implies that
\[ \int _{{\mathcal {O}}} \varphi \, d(\rho - \gamma ) = \int _{{\mathbb {R}}} {\hat{\varphi }}\, d({\hat{\rho }} - {\hat{\gamma }}). \]
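The lift construction can be checked numerically. The following sketch (illustrative only; the weight choice and all names are ours) lifts a discrete measure on \({\mathcal {O}}\) using constant weights \(\alpha _k\), verifies that projecting back recovers the original measure, and verifies the integral identity for a 1-periodic test function:

```python
import numpy as np

def lift(atoms, weights, alpha):
    """Lift mu = sum_i w_i delta_{x_i} on O ~ [0,1) to a measure on R.
    alpha: dict mapping k in Z to a weight function of x, summing to 1."""
    lifted_atoms, lifted_weights = [], []
    for x, w in zip(atoms, weights):
        for k, a in alpha.items():
            lifted_atoms.append(x + k)       # atom spread to translate x + k
            lifted_weights.append(w * a(x))  # with mass w * alpha_k(x)
    return np.array(lifted_atoms), np.array(lifted_weights)

# Example weights, supported on k in {-1, 0, 1} and independent of x:
alpha = {0: lambda x: 0.5, 1: lambda x: 0.25, -1: lambda x: 0.25}
atoms, weights = np.array([0.2, 0.7]), np.array([0.6, 0.4])
ax, aw = lift(atoms, weights, alpha)

# The lift is a probability measure, and projecting back (mod 1) recovers mu:
assert abs(aw.sum() - 1.0) < 1e-12
mass_at_02 = aw[np.isclose(ax - np.floor(ax), 0.2)].sum()
assert abs(mass_at_02 - 0.6) < 1e-12

# A 1-periodic test function integrates identically against mu and its lift:
phi = lambda x: np.cos(2 * np.pi * x)
assert abs((phi(ax) * aw).sum() - (phi(atoms) * weights).sum()) < 1e-9
```

The last assertion is the discrete instance of \(\int _{{\mathbb {R}}} {\hat{\varphi }}\, d{\hat{\rho }} = \int _{{\mathcal {O}}} \varphi \, d\rho \).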
A.1.2 A random variable description
The projection and lift constructions of Sect. A.1.1 can be described in the language of random variables; in certain situations, this description is more intuitive.
Let \((\Omega , {\mathcal {F}}, {\mathbb {P}})\) be a probability space and let \((X,K):\Omega \mapsto {\mathcal {O}} \times {{\mathbb {Z}}}\) be a pair of random variables. Identifying X with its representative in \([0,1)\), we define the \({{\mathbb {R}}}\)-valued random variable
\[ {\hat{X}} := X + K. \]
Then
\[ {{\mathsf {p}}}({\hat{X}}) = X. \]
If \({\hat{X}}\) has the probability law \({\hat{\rho }}\), then X has the probability law \(\rho := {{\mathsf {p}}}_\# {\hat{\rho }}\). On the other hand, if X has the probability law \(\rho \), then, choosing the conditional probability law of K to be
\[ {{\mathbb {P}}}(K = k \mid X = x) = \alpha _k(x), \qquad k \in {{\mathbb {Z}}},\ x \in {\mathcal {O}}, \]
or equivalently, choosing the conditional law of \({\hat{X}}\) to be
\[ {{\mathbb {P}}}({\hat{X}} = x + k \mid X = x) = \alpha _k(x), \]
the lift defined in (A.4) becomes the probability law of \({\hat{X}}\), that is, \(\Lambda ^\alpha \rho = {{\mathbb {P}}}({\hat{X}} \in \cdot \,)\).
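The random-variable picture can be simulated directly. In the sketch below (illustrative only; the distributions and names are ours, and the integer weights are taken independent of X for simplicity), X takes values in \({\mathcal {O}} \simeq [0,1)\), K is a random integer, and \({\hat{X}} = X + K\) is an \({{\mathbb {R}}}\)-valued lift whose projection recovers X exactly:

```python
import numpy as np

rng = np.random.default_rng(0)

# X: O-valued, identified with [0, 1); law rho chosen uniform for illustration.
X = rng.uniform(0.0, 1.0, size=1000)

# K: integer-valued, with weights alpha_k independent of x in this sketch.
K = rng.choice([-1, 0, 1], p=[0.25, 0.5, 0.25], size=1000)

# The R-valued lift X_hat := X + K.
X_hat = X + K

# Projecting X_hat recovers X: p(X + K) = X, since K is an integer.
assert np.allclose(X_hat - np.floor(X_hat), X)
```

The law of \({\hat{X}}\) is then the lift \(\Lambda ^\alpha \) of the law of X, in the sense just described.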
A.2 Equivalence of metric topologies
We recall inequality (1.18)
Next, we establish a converse of sorts. The proof below is an adaptation of Lemma 4.1 in Mischler–Mouhot [39].
Lemma A.1
For every \(\rho , \gamma \in {\mathcal {P}}({\mathcal {O}})\), we have
Proof
We can construct a probability space \((\Omega , {\mathcal {F}}, {\mathbb P})\) with two pairs of \({\mathcal {O}} \times {{\mathbb {Z}}}\)-valued random variables \((X,K_1), (Y,K_2)\) such that
We introduce
and denote \({\hat{\rho }}(d{\hat{x}}) := {{\mathbb {P}}}({\hat{X}} \in d {\hat{x}})\) and \({\hat{\gamma }}(d{\hat{y}}):={{\mathbb {P}}} ({\hat{Y}} \in d {\hat{y}})\). We use the Fourier transform
and
Then
where \(K:=K_1-K_2\).
On the other hand, it holds that
Therefore
Next, we note that
Therefore, with the quotient metric r defined in (1.15), we have
(see (1.16) for the definition of \(\Pi (\rho ,\gamma )\)). This leads to
\(\square \)
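For concreteness, the quotient metric r appearing in the proof can be computed elementarily: for representatives \(x, y \in [0,1)\), \(r(x,y) = \min _{k \in {{\mathbb {Z}}}} |x - y + k|\) reduces to comparing the direct and the wrap-around distance. The following sketch (names ours, for illustration only) makes this explicit:

```python
def r(x, y):
    """Quotient metric on O = R/Z: min over integers k of |x - y + k|.
    The mod-1 reduction leaves only the direct distance d and its
    wrap-around complement 1 - d to compare."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

# 0.1 and 0.9 are close on the circle even though |0.1 - 0.9| = 0.8:
assert abs(r(0.1, 0.9) - 0.2) < 1e-12
```

This wrap-around effect is why couplings over \(\Pi (\rho ,\gamma )\) on the quotient can be strictly cheaper than couplings of lifted measures on \({{\mathbb {R}}}\).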
Feng, J., Mikami, T. & Zimmer, J. A Hamilton–Jacobi PDE Associated with Hydrodynamic Fluctuations from a Nonlinear Diffusion Equation. Commun. Math. Phys. 385, 1–54 (2021). https://doi.org/10.1007/s00220-021-04110-1