1 Introduction

In this note we report on some recent progress on the derivation of the fluctuation theory from the fundamental laws of motion. For states far from equilibrium, the macroscopic fluctuation theory has been investigated intensively, but microscopic derivations have mainly focused on stochastic lattice gases (see e.g. [3, 12, 32]). We study here classical deterministic particles in a rarefied gas. In its rigorous version, the issue is then connected with the problem of the mathematical validity of the Boltzmann equation, in the limit introduced by Grad [16]. This limit procedure states that in a Hamiltonian system of N particles, strongly interacting at distance \({\varepsilon }\), the particle density approximates the solution to the Boltzmann equation when \(N \rightarrow \infty \), \({\varepsilon }\rightarrow 0\), in such a way that the collision frequency (proportional to \(N{\varepsilon }^{d-1}\) in dimension \(d=2,3\)) remains bounded; the volume density scales like \({\varepsilon }\), and both collisions and transport have a finite effect in the limit.

The Boltzmann gas is a simple case featuring nonlinear dynamics and a rich structure for the fluctuations. At the macroscopic scale, typical particles behave as i.i.d. variables. Small fluctuations already admit an interesting theory. In particular, they exhibit spatial correlations and noise originating from (deterministic) collisions [13, 31]. Moreover, rare fluctuations satisfy a large deviation principle. We refer to the companion work by Bouchet [7], where the large deviation theory for the Boltzmann equation was first discussed.

On the mathematical side, the only result we are aware of in this direction is the convergence of the second moment of the small fluctuations, proved by Spohn in [30]. Similar results are available for linear regimes close to equilibrium, both for short [2] and for large times [4]. As suggested in [31], the fluctuation theory should not be merely a phenomenological theory, but a rigorous consequence of the laws of mechanics. Our aim is to support this assertion, providing a robust mathematical framework.

We shall state several theorems (Theorems 1, 3, 4) describing the behaviour of the empirical density

$$\begin{aligned} \pi ^{{\varepsilon }}_t:=\frac{1}{\mu _{\varepsilon }}\sum _{i=1}^N \delta _{\mathbf{z}^{\varepsilon }_i(t)} \,,\ \ \ \ \ \mu _{\varepsilon }={\varepsilon }^{-(d-1)} \end{aligned}$$
(1.1)

for a Newtonian evolution of N particles with positions and velocities

$$\begin{aligned} \mathbf{z}^{\varepsilon }_i(t) = \left( \mathbf{x}^{\varepsilon }_i(t), \mathbf{v}^{\varepsilon }_i(t)\right) \;,\ \ \ \ \ i=1,\dots ,N\;. \end{aligned}$$

We assume that the particles are approximately Poisson-distributed at time \(t=0\), with (random) total number of particles \(\mathcal {N}\) and regular phase-space density \(f^0=f^0(x,v)\). Probability and expectation with respect to this initial measure are denoted by \({\mathbb {P}}_{\varepsilon }\) and \({\mathbb {E}}_{\varepsilon }\). Then, in the Boltzmann–Grad limit \({\varepsilon }\rightarrow 0\), \({\mathbb {E}}_{\varepsilon }\left( \mathcal {N}\right) /\mu _{\varepsilon }\rightarrow 1\), the following properties hold.

  1.

    Law of large numbers:

    $$\begin{aligned} \pi ^{{\varepsilon }}_t \rightarrow f_t \,,\qquad t \in [0,T^\star ] \end{aligned}$$
    (1.2)

    weakly (in probability) for some \(T^\star >0\), where \(f_t\) is the solution of

    $$\begin{aligned} {\partial }_t f +v \cdot \nabla _x f = C(f,f) \end{aligned}$$
    (1.3)

    with initial datum \(f^0\), where C is Boltzmann’s collision operator [21]; in particular, the chaos property of the initial measure propagates in time (rescaled correlation functions converge to a tensor product).

  2.

    Central limit theorem: the fluctuation field

    $$\begin{aligned} \zeta ^\varepsilon _t := \sqrt{\mu _{\varepsilon }} \left( \pi ^{\varepsilon }_t - {\mathbb {E}}_{\varepsilon }\left( \pi ^{\varepsilon }_t \right) \right) \end{aligned}$$
    (1.4)

    describing the small deviations of the empirical density from its average, converges in law on \([0,T^\star ]\) to the Gaussian process \(\zeta _t\) governed by the fluctuating Boltzmann equation

    $$\begin{aligned} d \zeta _t = \mathcal {L}_t \,\zeta _t\, dt + d\eta _t \,, \end{aligned}$$
    (1.5)

    where \(\mathcal {L}_t\) is Boltzmann’s operator linearized around \(f_t\), and \(d\eta _t\) is Gaussian noise (with covariance defined in (5.2)), as predicted in [30].

  3.

    Large deviations are exponentially small in \(\mu _{\varepsilon }\) and characterized, at least in a regime of strong regularity, by the same large deviation functional as heuristically derived in [7] (and previously obtained, rigorously, in [28] from a one-dimensional stochastic process). That is, the probability of observing a path \( \varphi _t= \varphi (t,x,v)\) satisfies

    $$\begin{aligned} {\mathbb {P}}_{{\varepsilon }}\left( \pi ^{\varepsilon }_t \approx \varphi _t , \ t \in [0, T^\star ] \right) \asymp \exp \left( -\mu _{\varepsilon }\, \mathcal {F}(T^\star ,\varphi )\right) , \end{aligned}$$
    (1.6)

    where \(\mathcal {F}\) is defined as the Legendre transform of a functional \(\mathcal {J}= \mathcal {J}( T^\star ,\varphi )\), solution of a Hamilton–Jacobi equation.

The Boltzmann equation is naturally suited to a probabilistic interpretation, and its mathematical validity can be based on the construction of a stochastic particle system mimicking the microscopic collisions. The basic example is the Kac model [17], from which the spatially homogeneous Boltzmann equation can indeed be recovered. Fluctuations in this type of process can also be analysed (see e.g. [18, 19, 23]), including large deviations ([22]) and spatially inhomogeneous variants ([27, 28]). Our results show that the analogy between the deterministic hard-sphere dynamics and the stochastic model goes far beyond the typical behavior, as it remains valid for extremely rare events. The statistical behavior of the hard-sphere gas described above is in fact the same as the one derived in [27, 28]. For physical observables such as the empirical measure, the deterministic dynamics and the stochastic approximation (as often used in simulations) cannot be distinguished, even at the level of fluctuations.

Our main restriction is the smallness of the time \(T^\star \). This time (depending only on \(f^0\)) is actually a fraction of the time of validity of the Boltzmann equation obtained by Lanford in [21]. We will further restrict to a gas of hard spheres, though we believe that the results could be proved for smooth and compactly supported interactions, adopting known techniques [14, 20, 25].

The Hamilton–Jacobi equation determining \(\mathcal {J}\) (Theorem 2) is our ultimate point of arrival in the derivation from a microscopic mechanical model. A stationary solution of this equation is given by the (dual) Boltzmann’s H functional, which describes large deviations of the equilibrium state. Moreover, \(\mathcal {F}\) has an invariance encoding the microscopic reversibility, a symmetry inherited from the equality between the probability of a path and the probability of the time-reversed path. This indicates how much of the information “lost” in Lanford’s Theorem, which proves the transition from a reversible to a dissipative model, is recovered.

Our method is far from standard approaches to a large deviation problem. For stochastic dynamics, large deviations can be evaluated by modifying the underlying stochastic process in a time dependent way in order to produce an atypical trajectory. The optimal cost for inducing such a bias on the stochastic dynamics is precisely the large deviation rate. For hard-sphere systems, there is no underlying stochastic dynamics as all the randomness lies in the initial data. It seems exceedingly hard to figure out a way to bias the initial probability measure in order to produce a given admissible path \(\varphi _t\). Indeed, the deterministic dynamics is responsible for an intricate relation between the path and the initial distribution of spheres.

We therefore turn back to the more modest problem of analysing the error in (1.2). It is already evident in Lanford’s proof that the dynamical information lives on precise little regions of the \(j\)-particle phase space, converging to measure-zero sets as \({\varepsilon }\rightarrow 0\), for any finite j. In little regions of the same size, correlations are generated by the collision events, which break the propagation of chaos (see e.g. [5]). These correlation sets do not encode the most probable future dynamics and they can be neglected when proving (1.2). However, we can extract much more information, by looking for mathematically tractable quantities which are concentrated exactly on these sets, and retaining the information which is lost in (1.2).

A natural candidate is provided by cumulants, which can be obtained by the series expansion of the generating function

$$\begin{aligned} \Lambda ^{\varepsilon }_{t} (h) := \frac{1}{\mu _{\varepsilon }}\log {\mathbb {E}}_{\varepsilon }\left( \exp \left( \sum _{i =1}^{\mathcal {N}} h \left( \mathbf{z}^{\varepsilon }_i(t)\right) \right) \right) \end{aligned}$$
(1.7)

where h is any test function. The order n in this expansion is given by a function \(f_n^{{\varepsilon }}= f_n^{{\varepsilon }}(t)\) on the \(n\)-particle phase space (formula (4.6)) describing a cluster of particles mutually correlated by a chain of interactions. It has been noted in [13] that the hierarchy of cumulants determines all the properties of the fluctuations in a gas, and that their exact computation furnishes a theory of fluctuations at the same time. In order to prove rigorous results and reach large deviations, we will construct the limit of the exponential moment (1.7), and link it to the function \(\mathcal {J}\).
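
To see why the normalisation by \(\mu _{\varepsilon }\) in (1.7) is natural, it may help to look at an idealised case that is not the model of this paper: an ideal gas with no exclusion, where the number of particles is exactly Poisson with mean \(\mu \) and the particles are i.i.d. In that case (1.7) reduces to \(\int f^0 (e^{h}-1)\), independently of \(\mu \). The sketch below checks this numerically by summing the Poisson series; the function name `poisson_log_exp_moment` and the scalar `m` (standing for \(\int f^0 e^{h}\)) are our own illustrative choices.

```python
import math

def poisson_log_exp_moment(mu, m, n_max=400):
    """log E[exp(sum_i h(z_i))] for an ideal Poisson gas (assumption: no exclusion).

    N ~ Poisson(mu), particles i.i.d., and m = E[exp(h(z))] for one particle.
    The expectation equals sum_N e^{-mu} (mu*m)^N / N!, summed here in log scale.
    """
    logs = [-mu + N * math.log(mu * m) - math.lgamma(N + 1) for N in range(n_max)]
    top = max(logs)
    return top + math.log(sum(math.exp(l - top) for l in logs))

def generating_function(mu, m):
    """Analogue of (1.7) for the ideal Poisson gas: (1/mu) * log E[...]."""
    return poisson_log_exp_moment(mu, m) / mu

if __name__ == "__main__":
    for mu in (10.0, 50.0):
        print(mu, generating_function(mu, 1.3), 1.3 - 1.0)
```

For the hard-sphere gas the exclusion constraint and the dynamics generate higher-order corrections, which is precisely what the cumulants are meant to capture.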

The expansion of (1.7) leads to a combinatorial problem, which can be dealt with by the cluster expansion method [29]. Indeed this method fits very well with the dynamics at low density, when combined with geometrical estimates on hard-sphere trajectories.

We organize the paper as follows. Section 2 is a brief introduction to our strategy. Section 3 presents the model and the fundamental result leading to (1.2), and explains the basic dynamical formula expressing the main quantities of interest in terms of the initial data. In Sect. 4 we state our main results on the dynamical correlations and their limiting structure, and derive the Hamilton–Jacobi equation. Finally, the last two sections are devoted to the fluctuating Boltzmann equation and the large deviations respectively. In this paper we shall only sketch the proof of our results, the complete version of which will be provided in a longer publication [6].

2 Strategy

Lanford’s method [21] is based on the BBGKY hierarchy governing the evolution of the family of (properly rescaled) correlation functions \(\left( F^{{\varepsilon }}_n \right) _{n \ge 1}\). This hierarchy is completely equivalent to the Liouville equation describing interacting transport of N hard spheres. In the Boltzmann–Grad limit, the probability densities concentrate on an infinite-dimensional space, and the BBGKY hierarchy is convenient to capture the relevant information. One thus introduces the (rescaled) correlation functions \( F_n^{{\varepsilon }} (t,Z_n )\) such that

$$\begin{aligned} {\mathbb {E}}_{\varepsilon }\Big ( \sum _{\begin{array}{c} i_1, \dots , i_n \\ i_j \ne i_k, j \ne k \end{array}} h_n \big ( \mathbf{z}_{i_1}^{{\varepsilon }}(t), \dots , \mathbf{z}_{i_n}^{{\varepsilon }} (t)\big ) \Big ) = \mu _{\varepsilon }^{n} \int _{\mathbb {D}^n} dZ_n \, F_n^{{\varepsilon }} (t,Z_n )\, h_n( Z_n) \,, \end{aligned}$$

for any test function \(h_n\), where \(\mathbb {D}\) is the 1-particle phase space. The family \(\left( F^{{\varepsilon }}_n \right) _{n \ge 1}\) is suited to the description of typical events: in the limit \({\varepsilon }\rightarrow 0\), \(F^{{\varepsilon }}_n\rightarrow f^{\otimes n}\) so that everything is coded in f (solution of (1.3)), no matter how large n.

We need to go beyond the BBGKY hierarchy and turn to a more powerful representation of the dynamics. We shall replace the family \(\left( F^{{\varepsilon }}_n \right) _{n \ge 1}\) with an equivalent family of (rescaled) truncated correlation functions \(\left( f^{{\varepsilon }}_n \right) _{n \ge 1}\), called cumulants. Their role is to grasp information on the dynamics on finer and finer scales. Loosely speaking, \(f^{{\varepsilon }}_n \) will collect events where n particles are “completely connected” by a chain of interactions. We shall say that the n particles form a connected cluster. Since a collision between two given particles is typically of order \(\mu _{\varepsilon }^{-1}\) (the size of the “collision tube” spanned by one particle in time 1), a complete connection would account for events of probability of order \(\mu _{\varepsilon }^{-(n-1)}\). We therefore end up with a hierarchy of rare events, which we would like to control at arbitrary order. At variance with \(\left( F^{{\varepsilon }}_n \right) _{n \ge 1}\), even after the limit \(\mu _{\varepsilon }\rightarrow \infty \) is taken, the cumulant \(f^{{\varepsilon }}_n\) cannot be trivially obtained from the cumulant \(f^{{\varepsilon }}_{n-1}\). Each step entails extra information, and events of increasing complexity, and decreasing probability.

Unfortunately, the equations for \(\left( f^{{\varepsilon }}_n \right) _{n \ge 1}\) are difficult to handle. But the moment-to-cumulant relation \( \left( F^{{\varepsilon }}_n\right) _{n \ge 1} \rightarrow \left( f^{{\varepsilon }}_n \right) _{n \ge 1}\) is a bijection and, in order to construct \( f^{{\varepsilon }}_n(t)\), we can still resort to the same solution representation of [21] for the correlation functions \(\left( F^{{\varepsilon }}_n (t)\right) _{n \ge 1}\). This formula is an expansion over collision trees, meaning that it has a geometrical representation as a sum over binary tree graphs, with vertices accounting for collisions (see Sect. 3). Two particles are correlated if their generated trees are connected by a “recollision”, which is an event of weight \(\mu _{\varepsilon }^{-1}\) (see Sect. 3.3.3 for a precise notion of recollision).

In Proposition 2 we will state the main technical advance of this paper: the cumulant (rescaled by the factor \(\mu _{\varepsilon }^{n-1}\)) grows as \( n^{n-2}\) in \(L^1\)-norm. This estimate is intuitively simple. We have at our disposal a geometric notion of correlation as a link between two collision trees. Based on this notion, we can draw a random graph on n vertices telling us which particles are correlated and which particles are not (each collision tree being one vertex of the graph). Since the cumulant \(f^{\varepsilon }_n\) corresponds to n completely correlated particles, there will be at least \(n-1\) edges, each one of small ‘volume’ \(\mu _{\varepsilon }^{-1}\). Of course there could be more than \(n-1\) connections (the random graph has cycles), but these are hopefully unlikely as they produce extra smallness in \({\varepsilon }\). If we ignore all of them, we are left with minimally connected graphs, whose total number is \(n^{n-2}\) by Cayley’s formula.
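
As a sanity check on the counting step only, the following sketch enumerates by brute force the minimally connected graphs (labelled trees) on n vertices and compares their number with Cayley's formula \(n^{n-2}\). The names `count_labelled_trees` and `_find` are ours and purely illustrative; the code says nothing about the analytic estimates behind Proposition 2.

```python
from itertools import combinations

def _find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def count_labelled_trees(n):
    """Count graphs on n labelled vertices with n-1 edges and no cycle."""
    all_edges = list(combinations(range(n), 2))
    count = 0
    for edges in combinations(all_edges, n - 1):
        parent = list(range(n))
        acyclic = True
        for a, b in edges:
            ra, rb = _find(parent, a), _find(parent, b)
            if ra == rb:          # this edge would close a cycle
                acyclic = False
                break
            parent[ra] = rb       # merge the two components
        if acyclic:               # n-1 acyclic edges on n vertices: a tree
            count += 1
    return count

if __name__ == "__main__":
    for n in range(2, 7):
        print(n, count_labelled_trees(n), n ** (n - 2))
```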

The limiting equations for the family \(\left( f^{{\varepsilon }}_n \right) _{n \ge 1}\) form a Boltzmann cumulant hierarchy, displaying a remarkable structure [10, 13]. The first equation (\(n=1\)) is just the Boltzmann equation. The second equation (\(n=2\)) is driven by a linearized Boltzmann operator \(\mathcal {L}_t\), plus a singular “recollision operator”, acting on \(f_1\) only, generating the “connection” (correlation) between two particles and suited to be interpreted as noise source [30]. The higher order equations (\(n>2\)) have an increasingly complex structure, combining the action of the two operators (of standard linearized type, and of connecting type) on n different particles, in all possible ways. But the good n-dependence of the uniform bounds allows us to sum up the cumulants into an analytic series. This finally translates the cumulant hierarchy into the Hamilton–Jacobi equation, which stands as a compact, nonlinear representation of the correlation dynamics.

3 Collision Trees

In this section we introduce the geometrical representation of the hard-sphere dynamics with random initial data, which will be our basic tool.

3.1 Hard-Sphere Model

Fig. 1. Transport and collisions in a hard-sphere gas

The microscopic model consists of N identical hard spheres of unit mass and of diameter \({\varepsilon }\). Their motion is governed by a system of ordinary differential equations, which are set in  \(\mathbb {D}^N:=( \mathbb {T}^d\times \mathbb {R}^d)^{N }\) where \(\mathbb T^d\) is the unit d-dimensional periodic box:

$$\begin{aligned} {d\mathbf{x}^{{\varepsilon }}_i\over dt} = \mathbf{v}^{{\varepsilon }}_i\,,\quad {d\mathbf{v}^{{\varepsilon }}_i\over dt} =0 \quad \hbox { as long as } \ |\mathbf{x}^{{\varepsilon }}_i(t)-\mathbf{x}^{{\varepsilon }}_j(t)|>{\varepsilon }\quad \hbox {for } \ 1 \le i \ne j \le N \, , \end{aligned}$$
(3.1)

with specular reflection at collisions:

$$\begin{aligned} \begin{aligned} \left. \begin{aligned} \left( \mathbf{v}^{{\varepsilon }}_i\right) '&:= \mathbf{v}^{{\varepsilon }}_i - \frac{1}{{\varepsilon }^2} (\mathbf{v}^{{\varepsilon }}_i-\mathbf{v}^{{\varepsilon }}_j)\cdot (\mathbf{x}^{{\varepsilon }}_i-\mathbf{x}^{{\varepsilon }}_j) \, (\mathbf{x}^{{\varepsilon }}_i-\mathbf{x}^{{\varepsilon }}_j) \\ \left( \mathbf{v}^{{\varepsilon }}_j\right) '&:= \mathbf{v}^{{\varepsilon }}_j + \frac{1}{{\varepsilon }^2} (\mathbf{v}^{{\varepsilon }}_i-\mathbf{v}^{{\varepsilon }}_j)\cdot (\mathbf{x}^{{\varepsilon }}_i-\mathbf{x}^{{\varepsilon }}_j) \, (\mathbf{x}^{{\varepsilon }}_i-\mathbf{x}^{{\varepsilon }}_j) \end{aligned}\right\} \quad \hbox { if } |\mathbf{x}^{{\varepsilon }}_i(t)-\mathbf{x}^{{\varepsilon }}_j(t)|={\varepsilon }\,. \end{aligned} \end{aligned}$$
(3.2)

The sign of the scalar product \((\mathbf{v}^{{\varepsilon }}_i-\mathbf{v}^{{\varepsilon }}_j)\cdot (\mathbf{x}^{{\varepsilon }}_i-\mathbf{x}^{{\varepsilon }}_j)\) identifies post-collisional (+) and pre-collisional (−) configurations. This flow does not cover all possible situations, as multiple collisions are excluded. But one can show (see [1]) that, for almost every initial configuration \((\mathbf{x}^{{\varepsilon }0}_i, \mathbf{v}^{{\varepsilon }0}_i)_{1\le i \le N}\), there are neither multiple collisions, nor accumulations of collision times, so that the dynamics is globally well defined (Fig. 1).
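
The collision rule (3.2) can be checked on a single pair of spheres. The sketch below is a direct transcription of (3.2) in plain Python, under the assumption that the two spheres are exactly at distance \({\varepsilon }\); the function names `dot` and `hard_sphere_collision` are ours. It illustrates that momentum and kinetic energy are conserved and that the scalar product \((\mathbf{v}_i-\mathbf{v}_j)\cdot (\mathbf{x}_i-\mathbf{x}_j)\) changes sign, so a pre-collisional configuration becomes post-collisional.

```python
def dot(a, b):
    """Euclidean scalar product of two vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))

def hard_sphere_collision(xi, vi, xj, vj, eps):
    """Apply the reflection rule (3.2); assumes |xi - xj| == eps."""
    n = [a - b for a, b in zip(xi, xj)]                     # x_i - x_j
    s = dot([a - b for a, b in zip(vi, vj)], n) / eps ** 2  # (v_i - v_j).(x_i - x_j) / eps^2
    vi_new = [v - s * c for v, c in zip(vi, n)]
    vj_new = [v + s * c for v, c in zip(vj, n)]
    return vi_new, vj_new

if __name__ == "__main__":
    eps = 0.5
    xi, xj = [0.0, 0.0, 0.0], [0.5, 0.0, 0.0]
    vi, vj = [1.0, 2.0, 3.0], [-1.0, 0.5, 0.0]
    print(hard_sphere_collision(xi, vi, xj, vj, eps))
```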

Below, we shall denote collections of positions and velocities respectively by \(X_N := (x_1,\dots ,x_N) \in \mathbb {T}^{dN}\) and \(V_N := (v_1,\dots ,v_N) \in \mathbb {R}^{d N}\), and we set \(Z_N:= (X_N,V_N) \in ( \mathbb {T}^d\times \mathbb {R}^d)^{N }\), \(Z_N =(z_1,\dots ,z_N)\).

Let \(f^0\) be a probability density on \(\mathbb {D}\) with Gaussian decay in velocity

$$\begin{aligned} |f^0(x,v)| + | \nabla _x f^0(x,v)|\ \le C_0 \; \exp \left( - \frac{\beta _0}{2} |v|^2 \right) , \end{aligned}$$
(3.3)

where \(C_0, \beta _0 > 0\). Because of the condition of hard-sphere exclusion, the positions of the particles cannot be independent of each other. To better focus on the dynamical issue, we shall choose, as initial measure, the N-particle distribution with minimal correlations. In particular, to avoid spurious correlations due to a given total number of particles, we shall consider a grand canonical state. The initial probability density of finding N particles in \(Z_N\) is given by

$$\begin{aligned} \frac{1}{N!}W^{{\varepsilon }0}_{N}(Z_N) := \frac{1}{\mathcal {Z}^ {\varepsilon }} \,\frac{\mu _{\varepsilon }^N}{N!} \, \mathbf{1}_{ \mathcal {D}^{{\varepsilon }}_{N}} \, \prod _{i=1}^N f^0 (z_i) \end{aligned}$$
(3.4)

where the domain encodes the exclusion:

$$\begin{aligned} \mathcal {D}^{{\varepsilon }}_{N} := \big \{ Z_N \in \mathbb {D}^N \, \big | \, \quad \forall i \ne j, \quad \, |x_i - x_j| >{\varepsilon }\big \}\,, \end{aligned}$$

and the normalization constant \(\mathcal {Z}^{\varepsilon }\) is given by

$$\begin{aligned} \mathcal {Z}^{\varepsilon }:= 1 + \sum _{N\ge 1}\frac{\mu _{\varepsilon }^N}{N!} \int _{\mathbb {D}^N} dZ_N\, \mathbf{1}_{ \mathcal {D}^{{\varepsilon }}_{N}} \prod _{i=1}^N f^0 (z_i) \, . \end{aligned}$$

With this definition, if

$$\begin{aligned} \mu _{\varepsilon }{\varepsilon }^{d-1} = 1\;, \end{aligned}$$

then the average number of particles satisfies

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \mathbb {E}_{\varepsilon }\left( \mathcal {N}\right) {\varepsilon }^{ d-1 } = 1 \end{aligned}$$

(Boltzmann–Grad scaling).
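
To make the orders of magnitude concrete, the following sketch tabulates the scaling for a few values of \({\varepsilon }\). It only evaluates \(\mu _{\varepsilon }={\varepsilon }^{-(d-1)}\) and two derived quantities: \(\mu _{\varepsilon }{\varepsilon }^{d-1}\), which stays equal to 1, and \(\mu _{\varepsilon }{\varepsilon }^{d}\), a proxy for the volume fraction up to a dimensional constant, which vanishes like \({\varepsilon }\). The function name `boltzmann_grad_scaling` is ours.

```python
def boltzmann_grad_scaling(eps, d):
    """Return (mu, mu * eps^(d-1), mu * eps^d) for mu = eps^(-(d-1))."""
    mu = eps ** (-(d - 1))
    collision_rate_proxy = mu * eps ** (d - 1)   # stays of order 1
    volume_fraction_proxy = mu * eps ** d        # of order eps (constants dropped)
    return mu, collision_rate_proxy, volume_fraction_proxy

if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        mu, rate, vol = boltzmann_grad_scaling(eps, d=3)
        print(f"eps={eps:g}  mu={mu:g}  mu*eps^(d-1)={rate:g}  mu*eps^d={vol:g}")
```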

The rescaled n-particle correlation function is defined by

$$\begin{aligned} F_n^{{\varepsilon }0} (Z_n) := \mu _{\varepsilon }^{-n} \, \sum _{p=0}^{\infty } \,\frac{1}{p!}\, \int dz_{n+1}\dots dz_{n+p} \, W_{n+p}^{{\varepsilon }0} (Z_{n+p})\;. \end{aligned}$$
(3.5)

For any symmetric test function \(h_n : \mathbb {D}^n\rightarrow {\mathbb {R}}\), one can check that

$$\begin{aligned} {\mathbb {E}}_{\varepsilon }\Big ( \sum _{\begin{array}{c} i_1, \dots , i_n \\ i_j \ne i_k, j \ne k \end{array}} h_n \big ( \mathbf{z}_{i_1}^{{\varepsilon }0}, \dots , \mathbf{z}_{i_n}^{{\varepsilon }0} \big ) \Big ) = \mu _{\varepsilon }^{n} \int _{\mathbb {D}^n} dZ_n \, F_n^{{\varepsilon }0} (Z_n )\, h_n( Z_n) \,. \end{aligned}$$
(3.6)

Moreover one can prove that, in the Boltzmann–Grad limit,

$$\begin{aligned} \forall n \ge 1 \, , \quad F_n^{{\varepsilon }0} (Z_n) \longrightarrow \prod _{i=1}^n f^0 (z_i) \hbox { as } {\varepsilon }\rightarrow 0 \end{aligned}$$

on the set \(\{ x_i \ne x_j\;,\ \forall i \ne j\}\). That is, at leading order, the initial distribution is chaotic.

Starting from the dynamical equations (3.1) we get that, for each fixed N, the probability density at time \(t>0\) is determined by the Liouville equation

$$\begin{aligned} {\partial }_t W^{{\varepsilon }}_N +V_N \cdot \nabla _{X_N} W^{{\varepsilon }}_N =0 \,\,\,\,\,\,\,\,\, \hbox {on } \,\,\,\mathcal {D}^{{\varepsilon }}_{N}\, , \end{aligned}$$
(3.7)

with specular reflection (3.2) on the boundary \(|x_i - x_j|= {\varepsilon }\).

By integration of the Liouville equation for fixed \({\varepsilon }\), we get that the one-particle correlation function \(F^{\varepsilon }_1\) satisfies an equation

$$\begin{aligned} \partial _t F^{\varepsilon }_1 + v \cdot \nabla _{x} F^{\varepsilon }_1 = C_{1,2}^{{\varepsilon }} F^{\varepsilon }_{2} \end{aligned}$$
(3.8)

where the collision operator comes from the boundary terms in Green’s formula (using the reflection condition to rewrite the gain part in terms of pre-collisional velocities):

$$\begin{aligned} (C_{1,2} ^{\varepsilon }F^{\varepsilon }_2 )(x,v)&:= \int F^{\varepsilon }_{2} (x,v', x+{\varepsilon }\omega ,w') \big ( (w- v)\cdot \omega \big )_+ \, d \omega dw\\&\quad - \int F^{\varepsilon }_{2} (x,v, x+{\varepsilon }\omega ,w) \big ( ( w - v )\cdot \omega \big )_- \, d \omega dw\, , \end{aligned}$$

with

$$\begin{aligned} v' = v - (v-w) \cdot \omega \, \omega , \quad w' = w +(v-w) \cdot \omega \, \omega \,. \end{aligned}$$
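
Rewriting the gain term with pre-collisional velocities relies on elementary properties of the scattering map \((v,w)\mapsto (v',w')\) at fixed \(\omega \): it is an involution, it preserves momentum and kinetic energy, and it reverses the sign of \((w-v)\cdot \omega \). The sketch below checks these properties numerically on one example; the function names `dot` and `scatter` are ours and the chosen vectors are arbitrary.

```python
def dot(a, b):
    """Euclidean scalar product of two vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))

def scatter(v, w, omega):
    """v' = v - ((v-w).omega) omega,  w' = w + ((v-w).omega) omega, with |omega| = 1."""
    s = dot([a - b for a, b in zip(v, w)], omega)
    v_new = [a - s * o for a, o in zip(v, omega)]
    w_new = [b + s * o for b, o in zip(w, omega)]
    return v_new, w_new

if __name__ == "__main__":
    v, w, omega = [1.0, 2.0, 3.0], [-1.0, 0.5, 0.0], [0.6, 0.8, 0.0]
    v1, w1 = scatter(v, w, omega)
    v2, w2 = scatter(v1, w1, omega)   # applying the map twice returns (v, w)
    print(v1, w1)
    print(v2, w2)
```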

As in (3.6), \(F_1^{{\varepsilon }}(t)\) describes the average behavior of (identical) particles at time t:

$$\begin{aligned} {\mathbb {E}}_{\varepsilon }\left( \frac{1}{\mu _{\varepsilon }} \sum _{i=1}^{\mathcal {N}} h \left( \mathbf{z}^{\varepsilon }_i(t)\right) \right) =\int F^{{\varepsilon }}_1(t,z)\, h(z)\, dz\,, \end{aligned}$$

for any test function \(h : \mathbb {D}\rightarrow {\mathbb {R}}\). Similarly for any test function \(h_2 : \mathbb {D}^2\rightarrow {\mathbb {R}}\), the two-particle correlation function satisfies

$$\begin{aligned} {\mathbb {E}}_{\varepsilon }\left( \frac{1}{\mu _{\varepsilon }^2} \sum _{i\ne j } h_2\left( \mathbf{z}^{\varepsilon }_i(t),\mathbf{z}^{\varepsilon }_j(t)\right) \right) =\int F^{{\varepsilon }}_2(t,Z_2)\, h_2(Z_2)\, dZ_2\, . \end{aligned}$$

3.2 Law of Large Numbers

The issue with equation (3.8) is that it is not closed: it involves \(F^{\varepsilon }_2\). At the level of (3.8), Boltzmann’s main assumption would correspond to the replacement

$$\begin{aligned} \begin{aligned} F^{\varepsilon }_2 (t,x_1,v_1, x_2,v_2) \sim F^{\varepsilon }_1 (t,x_1,v_1)F^{\varepsilon }_1 (t,x_2,v_2)\, , \quad \text{ as } \, \, {\varepsilon }\rightarrow 0\,, \\ \text{ when } \quad |x_1-x_2|={\varepsilon }\;, \quad (x_1-x_2)\cdot (v_1-v_2)<0\, . \end{aligned} \end{aligned}$$

In other words, particles are assumed to be statistically independent, at least in pre-collisional configurations. This very strong chaos property (which we assumed at time 0) is supposed to be valid for all times.

In the Boltzmann–Grad limit \({\varepsilon }\rightarrow 0\), we then expect \(F^{\varepsilon }_1\) to be well approximated by the solution to the Boltzmann equation

$$\begin{aligned} {\partial }_t f +v \cdot \nabla _x f = C(f,f) \end{aligned}$$
(3.9)

with

$$\begin{aligned}&C(f,f) (t,x,v)\\&\quad := \int _{\mathbb {R}^d}\int _{\mathbb {S}^{d-1}} \Big ( f(t,x,w') f(t,x,v') - f(t,x,w) f(t,x,v)\Big ) ((v-w)\cdot \omega )_+ \, d\omega \,dw \,. \end{aligned}$$

The claim by Boltzmann that the particle density is well approximated by Eq. (3.9) has been made rigorous by Lanford for short times.

Theorem 1

(Lanford, [21]) Consider a gas of hard spheres initially distributed according to (3.4). Then, in the Boltzmann–Grad limit \(\mu _{\varepsilon }\rightarrow \infty \) with \(\mu _{\varepsilon }{\varepsilon }^{d-1}=1\), the 1-particle distribution \(F_1^{\varepsilon }\) converges, uniformly on compact sets, towards the solution f of the Boltzmann equation (3.9) on a short time interval \([0, T^\star ]\) (where \(T^\star \) depends on the initial distribution \(f^0\) through \(C_0,\beta _0\) in (3.3)).

Furthermore for each n, the n-particle correlation function \(F^{\varepsilon }_n(t)\) converges almost everywhere to \(f^{\otimes n}(t)\) on the same time interval.

The propagation of chaos obtained in Lanford’s theorem implies in particular that the empirical measure \(\pi _t^{\varepsilon }\), defined by (1.1), concentrates on the solution to the Boltzmann equation. Indeed computing the variance we get, for any test function h, that

$$\begin{aligned}&{\mathbb {E}}_{\varepsilon }\Big ( \big (\pi ^{\varepsilon }_t(h)- \int F^{\varepsilon }_1(t,z)\, h(z)\, dz \big )^2 \Big )\nonumber \\&\quad = {\mathbb {E}}_{\varepsilon }\Big ( \frac{1}{\mu _{\varepsilon }^2} \sum _{i=1}^{\mathcal {N}} h^2 \big ( \mathbf{z}^{{\varepsilon }}_i(t)\big ) + \frac{1}{\mu _{\varepsilon }^2} \sum _{i \not = j} h \big ( \mathbf{z}^{{\varepsilon }}_i(t)\big ) h \big ( \mathbf{z}^{{\varepsilon }}_j(t)\big ) \Big ) - \Big (\int F^{\varepsilon }_1(t,z)\, h (z)\, dz \Big )^2 \nonumber \\&\quad = \frac{1}{\mu _{\varepsilon }} \int F^{\varepsilon }_1 \,h^2\, d z_1 + \int F^{\varepsilon }_2\, h^{\otimes 2}\, d Z_2 - \Big (\int F^{\varepsilon }_1\, h\, dz \Big )^2 \end{aligned}$$
(3.10)

which converges to 0 as \({\varepsilon }\rightarrow 0\) since \(F^{\varepsilon }_2\) converges to \(f^{\otimes 2}\) and \(F^{\varepsilon }_1\) to f. This computation can be interpreted as a law of large numbers.

Theorem 1 entails a drastic loss of information, which (as becomes clear from the proof) is retained in particular in “recollision sets” of measure zero. Some of the microscopic time-reversible structure can be recovered by looking at correlations on finer scales. This is the role played by the rescaled dynamical cumulants, defined by

$$\begin{aligned} f^{\varepsilon }_n(t, Z_n):= \mu _{\varepsilon }^{n-1} \sum _{s= 1}^n \sum _{\sigma \in \mathcal {P}^s_n } (-1) ^{s-1} (s-1) ! \, \prod _{i=1} ^s F^{\varepsilon }_{|\sigma _i|}(t,Z_{\sigma _i}) \,. \end{aligned}$$
(3.11)

Here we denoted by \(\mathcal {P}^s_n\) the set of partitions of \(\{1,\dots , n \}\) in s parts, \(\mathcal {P}^s_n\ni \sigma = \left( \sigma _1,\dots ,\sigma _s\right) \), by \(|\sigma _i|\) the cardinality of the set \(\sigma _i\) and by \(Z_{\sigma _i} = (z_j)_{j \in \sigma _i}\). This formula is cooked up to extract the effect of recollisions and therefore to obtain the detailed correlation structure at arbitrarily small scales. Note that, for fixed \({\varepsilon }>0\), \(\left( F^{{\varepsilon }}_n \right) _{n \ge 1}\) and \(\left( f^{{\varepsilon }}_n \right) _{n \ge 1}\) provide the same amount of information, as shown by the inversion formula:

$$\begin{aligned} F^{{\varepsilon }}_n(t,Z_n) = \sum _{s = 1}^{n} \sum _{\sigma \in \mathcal {P}^s_n} \mu _{\varepsilon }^{ - (n-s)} \prod _{i=1}^s f^{{\varepsilon }}_{|\sigma _i|}(t,Z_{\sigma _i})\,. \end{aligned}$$
(3.12)
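
The combinatorics of (3.11) and (3.12) can be tested in a simplified setting. In the sketch below we replace each function \(F^{\varepsilon }_k\), \(f^{\varepsilon }_k\) by a number depending only on k; this is an assumption of ours, made to isolate the sum over partitions, whereas the true objects are functions of \(Z_{\sigma _i}\). Starting from arbitrary values of the \(f_k\), we build the \(F_n\) through (3.12) and recover the \(f_n\) through (3.11), using exact rational arithmetic. The function names are illustrative.

```python
from fractions import Fraction
from math import factorial

def set_partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # insert `first` into each existing block in turn
        for k in range(len(partition)):
            yield partition[:k] + [[first] + partition[k]] + partition[k + 1:]
        # or let `first` form a block of its own
        yield [[first]] + partition

def moment_from_cumulants(f, n, mu):
    """Scalar version of (3.12): F_n = sum_sigma mu^{-(n-s)} prod_i f_{|sigma_i|}."""
    total = Fraction(0)
    for sigma in set_partitions(list(range(n))):
        term = Fraction(1) / mu ** (n - len(sigma))
        for block in sigma:
            term *= f[len(block)]
        total += term
    return total

def cumulant_from_moments(F, n, mu):
    """Scalar version of (3.11): f_n = mu^{n-1} sum_sigma (-1)^{s-1} (s-1)! prod_i F_{|sigma_i|}."""
    total = Fraction(0)
    for sigma in set_partitions(list(range(n))):
        s = len(sigma)
        term = Fraction((-1) ** (s - 1) * factorial(s - 1))
        for block in sigma:
            term *= F[len(block)]
        total += term
    return mu ** (n - 1) * total

if __name__ == "__main__":
    mu = Fraction(10)
    f = {1: Fraction(2), 2: Fraction(-3), 3: Fraction(5), 4: Fraction(1, 7)}
    F = {n: moment_from_cumulants(f, n, mu) for n in f}
    print({n: cumulant_from_moments(F, n, mu) == f[n] for n in f})
```

For \(n=2\) the two formulas read \(f_2=\mu _{\varepsilon }(F_2-F_1F_1)\) and \(F_2=\mu _{\varepsilon }^{-1}f_2+f_1f_1\), which is the simplest instance of the check.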

Before passing to the investigation of (3.11), we need to recall the main features of the proof of Theorem 1 (see [9, 14, 32] for more details).

3.3 Hierarchy and Pseudotrajectories

The starting point is the equation (3.8) for the 1-particle correlation function \(F^{\varepsilon }_1\). In order to get a closed system, we write similar equations for all correlation functions \(F_n^{\varepsilon }\)

$$\begin{aligned} \partial _t F^{\varepsilon }_n + V_n \cdot \nabla _{X_n} F^{\varepsilon }_n = C_{n,n+1}^{\varepsilon }F^{\varepsilon }_{n+1} \quad \text{ on } \quad \mathcal {D}^{\varepsilon }_{n }\;, \end{aligned}$$
(3.13)

with specular boundary reflection as in (3.7) [8]. As \(C^{\varepsilon }_{1, 2} \) above, \(C^{\varepsilon }_{n, n+1}\) describes collisions between one “fresh” particle (labelled \(n+1\)) and one given particle \(i\in \{1,\dots , n\}\).

We denote by \(S^{\varepsilon }_n\) the group associated with free transport in \(\mathcal {D}^{\varepsilon }_n\) (with specular reflection at collisions). Iterating Duhamel’s formula, we can express the solution as a sum of operators acting on the initial data:

$$\begin{aligned} F^{\varepsilon }_n (t) =\sum _{m\ge 0} Q^{\varepsilon }_{n,n+m}(t) F_{n+m}^{{\varepsilon }0} \, , \end{aligned}$$
(3.14)

where we have defined for \(t>0\)

$$\begin{aligned} Q^{\varepsilon }_{n,n+m}(t) F_{n+m}^{{\varepsilon }0 } := \int _0^t \int _0^{t_{1}}\dots \int _0^{t_{m-1}} S^{\varepsilon }_n(t-t_{ 1}) C^{\varepsilon }_{n,n+1} S^{\varepsilon }_{n+1}(t_{1}-t_{2}) C^{\varepsilon }_{n+1,n+2} \\ \dots S^{\varepsilon }_{n+m}(t_{m}) F_{n+m}^{{\varepsilon }0} \, dt_{m} \dots d t_{1} \end{aligned}$$

and \(Q^{\varepsilon }_{n,n}(t)F^{{\varepsilon }0}_{n} := S^{\varepsilon }_n(t)F_{n}^{{\varepsilon }0}\), \(Q^{\varepsilon }_{n,n+m}(0)F^{{\varepsilon }0} _{n+m} := \delta _{m,0}F^{{\varepsilon }0}_{n+m}\).

In the following, we shall label \(1^*, \dots ,n^*\) the n particles with configuration \(Z_n\) at time t, and \(1, \dots , m\) the m “fresh” particles which are added by the collision operators. The configuration of the particle labeled \(i^*\) will be denoted indifferently \(z_i^*=(x_i^*,v_i^*)\) or \(z_{i*}=(x_{i*},v_{i*})\).

3.3.1 The Tree Structure

Each term of the series expansion (3.14) (after inserting the explicit definition of the collision operators) can be represented by a collision tree \(a = (a_i) _{i = 1, \dots , m}\), which records the combinatorics of collisions: the colliding particles at time \(t_i\) are i and \(a_i \in \{1^*, \dots ,n^*\} \cup \{1, \dots , i-1\}\). We define the set \(\mathcal {A}_{n, m}\) of all possible such trees. Note that \(|\mathcal {A}_{n, m}| = n(n+1) \dots (n+m-1)\). Note also that, graphically, \(a \in \mathcal {A}_{n, m}\) is represented by n binary tree graphs (below, we will call collision tree both \(a \in \mathcal {A}_{n, m}\) and each of its n components).
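
The cardinality of \(\mathcal {A}_{n, m}\) can be confirmed by direct enumeration: the i-th fresh particle may attach to any of the n root particles or to any of the \(i-1\) fresh particles created before it, giving \(n+i-1\) choices. In the sketch below the root particles are encoded as strings such as `"1*"` and the fresh particles as integers; this encoding and the function name `collision_trees` are our own conventions.

```python
from itertools import product

def collision_trees(n, m):
    """Enumerate all a = (a_1, ..., a_m) with a_i in {1*, ..., n*} or {1, ..., i-1}."""
    roots = [f"{k}*" for k in range(1, n + 1)]
    # admissible labels for a_i: the n roots plus the fresh particles 1, ..., i-1
    choices = [roots + list(range(1, i)) for i in range(1, m + 1)]
    return list(product(*choices))

def expected_count(n, m):
    """n (n+1) ... (n+m-1), with the convention that the empty product is 1."""
    result = 1
    for i in range(m):
        result *= n + i
    return result

if __name__ == "__main__":
    for n, m in [(1, 4), (2, 3), (3, 0)]:
        print(n, m, len(collision_trees(n, m)), expected_count(n, m))
```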

For all collision trees \(a \in \mathcal {A}_{n,m}\) and all parameters \((t_i, \omega _i, v_i)_{i=1,\cdots ,m}\) with \(t_1>t_2> \dots > t_m\), one constructs pseudo-trajectories on \([0, t]\)

$$\begin{aligned} \varPsi ^{{\varepsilon }}_{n,m} = \varPsi ^{{\varepsilon }}_{n,m} \Big (Z_n^*, (a_i, t_i, \omega _i, v_i)_{i=1,\dots ,m}\Big ) \end{aligned}$$

iteratively on \(i=1,2,\dots ,m\) (denoting by \(Z_{n,m}(\tau ) =\big (Z_{n}^*(\tau ) ,Z_m(\tau )\big )\) the coordinates of particles at time \(\tau \le t_m\)):

  • starting from \((Z_n^*)\) at time \(t =: t_0\),

  • transporting all existing particles backward on \([t_{i}, t_{i-1}]\) (on \(\mathcal {D}^{\varepsilon }_{n +i-1}\) with specular reflection at collisions),

  • adding a new particle i at time \(t_i\), with position \(x _{a_i} (t_i) +{\varepsilon }\omega _i\) and velocity \(v_i\),

  • and applying the scattering rule (3.2) if \(\big ( x _{a_i} (t_i), v_{a_i}(t_i^+), x_{a_i} (t_i)+{\varepsilon }\omega _i, v_i\big ) \) is a post-collisional configuration.

We discard non admissible parameters for which this procedure is ill-defined; in particular we exclude values of \(\omega _i\) corresponding to an overlap of particles (two spheres at distance strictly smaller than \({\varepsilon }\)). In the following we denote by \(\mathcal {G}^{\varepsilon }_m(a, Z_n^* )\) the set of admissible parameters.

Figure 2 is an example of such a flow (for \(n=1, m=4\)).

Fig. 2 The tree structure of collisions

With these notations, one gets the following geometric representation of the correlation function \(F^{\varepsilon }_n\) :

$$\begin{aligned} \begin{aligned} F^{\varepsilon }_n (t, Z_n^*) = \sum _{m \ge 0} \sum _{a \in \mathcal {A}_{n, m} }\int _{\mathcal {G}_{ m}^{{\varepsilon }}(a, Z_n^* )} dT_m d\varOmega _{ m} dV_{ m}\left( \prod _{i= 1}^{ m} \big ( v_i -v_{ a_i} (t_i^+)\big ) \cdot \omega _i \right) F_{n+m}^{{\varepsilon }0} \big (\varPsi ^{{\varepsilon }0}_{n,m}\big )\;, \end{aligned} \end{aligned}$$

where \((T_m, \varOmega _{ m}, V_{ m}) := (t_i, \omega _i, v_i)_{ 1\le i\le m}\), and \(\varPsi ^{{\varepsilon }0}_{n,m}\) is the \((n+m)\)-particle configuration of the pseudo-trajectory at time zero. Or, in short,

$$\begin{aligned} F_n^{\varepsilon }(t,Z_n^*)&= \int \mu (d \varPsi _{n}^{\varepsilon }) \, \mathcal {C}\big ( \varPsi _{n}^{\varepsilon }\big ) \mathbf{1}_{\mathcal {G}^{\varepsilon }} ( \varPsi _{n}^{\varepsilon }\big ) F^{{\varepsilon }0} \big ( \varPsi _{n}^{{\varepsilon }0}\big ) \nonumber \\ \hbox { with } \quad \mu (d \varPsi _{n}^{\varepsilon })&:=\sum _m\sum _{a \in \mathcal {A}_{n,m} } dT_{m} d\varOmega _{m} dV_{m},\quad \mathcal {C}( \varPsi _{n}^{\varepsilon }) := \prod _{i=1}^{m} \big ( v_i -v_{a_i} (t_i^+)\big ) \cdot \omega _i \end{aligned}$$
(3.15)

where \(\mathbf{1}_{\mathcal {G}^{{\varepsilon }}} \big ( \varPsi ^{\varepsilon }_n \big ):=\mathbf{1}_{\mathcal {G}_{ m}^{{\varepsilon }}(a, Z^*_n)}\), and \(F^{{\varepsilon }0} \big (\varPsi _n^{{\varepsilon }0}\big )\) is the initial correlation function evaluated on the configuration at time 0 of the pseudo-trajectory (including \(n+m\) particles). From now on, we denote by \(\varPsi ^{\varepsilon }_n\) a generic pseudo-trajectory with n particles at time \(t=t_0\).

3.3.2 A Short Time Estimate

Each elementary integral corresponding to a collision tree with m branching points involves a simplex in time (\(t_1> t_2> \dots > t_m\)). Thus, if we replace, for simplicity, the cross-section factors \(\mathcal {C}( \varPsi _{1}^{\varepsilon })\) by a bounded function (cutting off high energies), we immediately get that the integrals for \(n=1\) are bounded, for each fixed tree \(a \in \mathcal {A}_{1,m}\), by

$$\begin{aligned} \left| \int dT_m d\varOmega _{ m} dV_{m} \, \mathcal {C}( \varPsi _{1}^{\varepsilon }) \, \mathbf{1}_{\mathcal {G}^{\varepsilon }} ( \varPsi _{1}^{\varepsilon }\big ) \, F^{{\varepsilon }0} \big ( \varPsi _{1}^{{\varepsilon }0} \big )\right| \le {(C'_0 t)^m \over m!}\,, \end{aligned}$$

where \(C'_0>0\) depends only on \(C_0,{\beta }_0\) of (3.3). Since \(|\mathcal {A}_{1,m}| = m ! \), the series expansion is therefore absolutely convergent for short times, uniformly in \({\varepsilon }\). A similar estimate holds for \(n > 1\). Moreover, in the presence of the true factors \(\mathcal {C}( \varPsi _{n}^{\varepsilon })\), the result remains valid (with a slightly different value of the convergence radius), though the proof requires some extra care [20].
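For \(n=1\) and bounded cross-section factors, the two facts just quoted combine into a geometric series. The following worked bound only spells out this step; the threshold \(1/C'_0\) is the one produced by this crude estimate and is not claimed to be optimal:

$$\begin{aligned} \sum _{m\ge 0} \sum _{a \in \mathcal {A}_{1,m}} \left| \int dT_m d\varOmega _{ m} dV_{m} \, \mathcal {C}( \varPsi _{1}^{\varepsilon }) \, \mathbf{1}_{\mathcal {G}^{\varepsilon }} \big ( \varPsi _{1}^{\varepsilon }\big ) \, F^{{\varepsilon }0} \big ( \varPsi _{1}^{{\varepsilon }0} \big )\right| \le \sum _{m\ge 0} m! \, {(C'_0 t)^m \over m!} = \sum _{m\ge 0} (C'_0 t)^m = \frac{1}{1- C'_0 t}\,, \qquad t < \frac{1}{C'_0}\,. \end{aligned}$$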

Hence it is enough to study the convergence of each elementary term in the Boltzmann–Grad limit \({\varepsilon }\rightarrow 0\).

3.3.3 Removing Recollisions

When the size \({\varepsilon }\) of the particles goes to 0, we expect the pseudo-trajectory \(\varPsi _{1}^{\varepsilon }\) to converge to a limiting \(\varPsi _{1}\), defined iteratively on \(i=1,2,\cdots ,m\):

  • starting from \(z_1^*\) at time \(t = t_0\),

  • transporting all existing particles backward on \([t_{i}, t_{i-1}]\) (by free transport),

  • adding a new particle i at time \(t_i\), exactly at position \(x_{a_i} (t_i) \) and with velocity \(v_i\),

  • and applying the scattering rule (3.2) if \(\big ( v_i -v_{a_i} (t_i^+)\big ) \cdot \omega _i > 0\) (post-collisional configuration).
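The four steps above can be sketched in a few lines of code for \(n=1\). This is only an illustration under stated assumptions: the scattering rule (3.2) is taken to be the usual hard-sphere rule \(v' = v - ((v-v_*)\cdot \omega )\,\omega \), \(v'_* = v_* + ((v-v_*)\cdot \omega )\,\omega \), the root particle \(1^*\) is encoded by the index 0 and the fresh particle k by the index k, and all function names are hypothetical.

```python
def dot(u, w):
    """Euclidean scalar product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, w))


def scatter(v_a, v_i, omega):
    """Hard-sphere scattering (assumed form of rule (3.2))."""
    s = dot([a - b for a, b in zip(v_a, v_i)], omega)
    new_a = [a - s * o for a, o in zip(v_a, omega)]
    new_i = [b + s * o for b, o in zip(v_i, omega)]
    return new_a, new_i


def limiting_pseudo_trajectory(x_star, v_star, t, tree, times, omegas, velocities):
    """Return positions and velocities at time 0 of the 1 + m particles.

    tree[i-1] is a_i (0 for the root 1*, k for fresh particle k), and
    times must be decreasing: t > t_1 > ... > t_m > 0.
    """
    xs = [[float(c) for c in x_star]]
    vs = [[float(c) for c in v_star]]
    t_prev = t
    for a, t_i, omega, v_i in zip(tree, times, omegas, velocities):
        # Backward free transport of all existing particles on [t_i, t_prev].
        dt = t_prev - t_i
        xs = [[xc - dt * vc for xc, vc in zip(x, v)] for x, v in zip(xs, vs)]
        # Add the fresh particle exactly at the position of particle a_i.
        v_new = [float(c) for c in v_i]
        xs.append(list(xs[a]))
        # Scatter only if the configuration at t_i^+ is post-collisional.
        rel = [p - q for p, q in zip(v_new, vs[a])]
        if dot(rel, omega) > 0:
            vs[a], v_new = scatter(vs[a], v_new, omega)
        vs.append(v_new)
        t_prev = t_i
    # Final backward transport from t_m down to time 0.
    xs = [[xc - t_prev * vc for xc, vc in zip(x, v)] for x, v in zip(xs, vs)]
    return xs, vs
```

The same loop with the fresh particle placed at \(x_{a_i}(t_i) + {\varepsilon }\omega _i\), and with transport in \(\mathcal {D}^{\varepsilon }_{n+i-1}\) instead of free transport, would give \(\varPsi _{1}^{\varepsilon }\); the sketch deliberately omits recollisions.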

The main obstacles to the convergence \(\varPsi _{1}^{\varepsilon }\rightarrow \varPsi _{1}\) are the so-called recollisions. In the language of pseudo-trajectories, a recollision is a collision between pre-existing particles (see Fig. 3), namely a collision which does not correspond to the addition of a fresh particle. It is easy to see that, in the absence of recollisions, \(\varPsi _{1}^{\varepsilon }\) and \( \varPsi _{1}\) differ only by small shifts in the positions.

Fig. 3 An example of recollision

A careful geometric analysis of recollisions shows that they can happen only for a small set of parameters, which is negligible in the limit \(\varepsilon \rightarrow 0\). Roughly, if particles p and q are at positions \(x_p, x_q\) with \(x_p \ne x_q\) at time \(\tau > 0\), then a recollision between these particles implies that there is a time \(t_\mathrm{rec} < \tau \) such that \(x_p - x_q - (v_p -v_q) (\tau - t_\mathrm{rec}) = O({\varepsilon })\). As a consequence, \(v_p - v_q\) is constrained to be in a small cone of opening \({\varepsilon }\), and the integration parameters in (3.15) lie in a small set. Thanks to the uniform bounds, one concludes that pseudo-trajectories involving recollisions give an overall vanishing contribution to \(F^{\varepsilon }_1\).
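The geometric condition for a recollision between two freely moving particles can be made concrete as follows. This is a rough sketch, assuming free backward transport of both particles on \([0,\tau ]\) and ignoring any intermediate collision; the function names are hypothetical.

```python
import math


def backward_min_distance(x_p, x_q, v_p, v_q, tau):
    """Smallest distance between p and q under free backward transport on [0, tau].

    At time tau - u (0 <= u <= tau) the relative position is
    (x_p - x_q) - u (v_p - v_q), as in the text.
    """
    dx = [a - b for a, b in zip(x_p, x_q)]
    dv = [a - b for a, b in zip(v_p, v_q)]
    dv2 = sum(c * c for c in dv)
    if dv2 == 0.0:
        # Equal velocities: the relative position never changes.
        return math.sqrt(sum(c * c for c in dx))
    # Unconstrained minimiser of |dx - u dv|, clamped to [0, tau].
    u = sum(a * b for a, b in zip(dx, dv)) / dv2
    u = min(max(u, 0.0), tau)
    rel = [a - u * b for a, b in zip(dx, dv)]
    return math.sqrt(sum(c * c for c in rel))


def may_recollide(x_p, x_q, v_p, v_q, tau, eps):
    """True if the two particles come within distance eps in the backward flow."""
    return backward_min_distance(x_p, x_q, v_p, v_q, tau) <= eps
```

For particles at distance of order one, the test returns `True` only when the relative velocity points almost exactly along \(x_p - x_q\), which is the small cone mentioned above.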

A similar analysis can be performed to study higher order correlation functions. However, in this case the convergence is slightly more subtle. Notice that, for the n-particle correlation function, the convergence will fail on some sets of parameters of volume \(\mu _{\varepsilon }^{-1}\), which correspond to particles of different trees colliding in the backward dynamics (see e.g. Fig. 4). These “external recollisions” are apparently innocent, as they correspond again to small volume sets which do not contribute to the limit. On the other hand, it is precisely this small failure of convergence of the \(F_n^{\varepsilon }\) which prevents Lanford’s theorem from being “reversible”, i.e. from being applicable to the state at time \(t>0\) with reversed velocities [2, 5]. This suggests that the relevant information to go backwards is hidden in singular directions where different trees merge. In Sect. 4, we will show that the dynamical cumulants \(f_n^{\varepsilon }\) defined by (3.11) “live” in these singular directions, which allows us to investigate the n-particle correlations more precisely.

3.3.4 Averaging Over Trajectories

We conclude this section with a generalization of the previous discussion, which will be important in the following.

So far we discussed correlations in phase space, at a given time t. But clearly, spatio-temporal correlations are also of interest. We therefore need to study trajectories of particles, and not only their distribution at a given time. Pseudo-trajectories provide a geometric representation of the iterated Duhamel series, but they are not physical trajectories of the particle system. Nevertheless, the probability of trajectories of n particles can be represented as above, by conditioning the Duhamel series.

Proposition 1

[6] Let \(H_n\) be a bounded measurable function on the Skorokhod space of trajectories over \(\mathbb {D}^n\) in [0, t]. Define

$$\begin{aligned} F^{\varepsilon }_{n, [0,t]} ( H_n) := \int dZ_n^* \int \mu (d \varPsi _{n}^{\varepsilon }) \, \mathcal {C}\big ( \varPsi _{n}^{\varepsilon }\big ) \mathbf{1}_{\mathcal {G}^{\varepsilon }} ( \varPsi _{n}^{\varepsilon }\big ) H_n \big (Z_n^*([0,t]) \big ) F^{{\varepsilon }0} \big ( \varPsi _{n}^{{\varepsilon }0}\big )\,, \end{aligned}$$
(3.16)

where \(Z_n^*( [0,t] )\) are the trajectories of the n particles labeled \(1^*, \dots ,n^*\) in the pseudo-trajectory \(\varPsi _{n}^{\varepsilon }\). Then,

$$\begin{aligned} {\mathbb {E}}_{\varepsilon }\Big ( \sum _{\begin{array}{c} i_1, \dots , i_n \\ i_j \ne i_k, j \ne k \end{array}} H_n \big ( \mathbf {z}^{\varepsilon }_{i_1}([0,t] ), \dots , \mathbf {z}^{\varepsilon }_{i_n}([0,t] ) \big ) \Big ) = \mu _{\varepsilon }^{n} F^{{\varepsilon }}_{n, [0,t]} (H_n) \,, \end{aligned}$$

where \(\mathbf {z}^{\varepsilon }_{i_1}([0,t] ), \dots , \mathbf {z}^{\varepsilon }_{i_n}([0,t] )\) is the sample path of n hard spheres labeled \(i_1, \dots , i_n\), among the \( \mathcal {N} \) hard spheres randomly distributed at time zero.

This generalizes (3.6) and the representation (3.15), in the sense that, for \(H_n (Z_n^*([0,t] )) = h_n(Z_n^*(t))\), we obtain

$$\begin{aligned} F^{\varepsilon }_{n, [0,t]} ( H_n) = \int F^{\varepsilon }_n (t,Z_n^*) h_n(Z_n^*) dZ_n^*\,. \end{aligned}$$

4 Dynamical Correlations

4.1 Recollisions and Overlaps

We start from the representation (3.16) of \( F_{n, [0,t]} ^{\varepsilon }(H_n )\) in terms of collision trees and pseudo-trajectories. We assume that

$$\begin{aligned} H_n = H^{\otimes n} \end{aligned}$$

with H a measurable function on the Skorokhod space of trajectories D([0, t]) in \(\mathbb {D}\), and we abbreviate

$$\begin{aligned} \mathcal {H}\big ( \varPsi _{n}^{\varepsilon }\big ) := H^{\otimes n} \big ( Z_n^*([0,t])\big ) \,. \end{aligned}$$

Recall that there are two types of interactions between particles:

  • a collision corresponds to the addition of a new particle;

  • a recollision occurs when two pre-existing particles collide.

The elementary integrals in the series expansion of \( F_{n, [0,t]} ^{\varepsilon }(H_n )\) can be decomposed depending on whether collision trees are correlated or not by recollisions (see Fig. 4). We then have a partition of \(\{1,\dots , n\}\) into a certain number (say \(\ell \)) of forests \((\lambda _i)_{i= 1, \dots ,\ell }\), and to each forest \(\lambda _i\) we associate its characteristic function, which equals 1 if and only if any two elements of \(\lambda _i\) are connected (through their collision trees) by a chain of recollisions. We say that this characteristic function is supported on clusters of size \(|\lambda _i|\), formed by \(|\lambda _i|\) “recolliding” collision trees. We will further indicate the decomposition into forests by \(\lambda = (\lambda _i)_{i= 1, \dots ,\ell }\).

Fig. 4 Recollisions connect trees into forests

Formula (3.16) can then be rewritten as a partially factorized expression:

$$\begin{aligned} F_{n, [0,t] }^{\varepsilon }(H^{\otimes n} )= \int dZ_n^* \sum _{\ell =1}^n \sum _{\lambda \in \mathcal {P}_n^\ell } \int \mathcal {K}_\lambda \left( \varPsi _{\lambda }^{\varepsilon }\right) \, \varPhi _\ell \, F^{{\varepsilon }0} \big ( \varPsi ^{{\varepsilon }0}_n\big )\, \end{aligned}$$
(4.1)

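where \(\mathcal {K}_\lambda \) collects the factors of (3.16) attached to the forests of \(\lambda \) (the integration measure, the cross-section factors, the test functional \(\mathcal {H}\) and the characteristic functions of the forests), and \(\varPhi _\ell = \varPhi _\ell \big ( \lambda _1, \dots , \lambda _\ell \big )\) is the indicator function that particles belonging to different forests keep mutual distance larger than \({\varepsilon }\). Here and below, we indicate by \(\varPsi ^{\varepsilon }_{\alpha }\) the pseudo-trajectory constructed starting from \(Z_{\alpha }^*\), for any subset \({\alpha }\) of \(\{ 1, \dots , n\}\).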

Although there cannot be any recollision between particles of different forests \(\lambda _i\), such particles are not yet independent, as the parameters of the pseudo-trajectories are constrained precisely by the fact that no recollision should occur. The characteristic function \(\varPhi _\ell = \varPhi _\ell \big ( \lambda _1, \dots , \lambda _\ell \big )\) expresses this no-recollision condition. Next, we write its cumulant expansion (the analogue of (3.12)):

$$\begin{aligned}&\varPhi _\ell = \sum _{r = 1}^{\ell } \sum _{\rho \in \mathcal {P}^r_\ell }\varphi _\rho \,.\\&\varphi _\rho = \prod _{i=1}^r \varphi _{\rho _i}.\nonumber \end{aligned}$$
(4.2)

This formula reorganizes the \(\ell \) forests into r jungles \(\rho = \left( \rho _i\right) _{i = 1,\dots , r}\).

By construction, particles belonging to different forests will never collide among themselves. However they are allowed to “overlap”. We say that two different forests \(\lambda _i\) and \(\lambda _j\) overlap if two particles, belonging to the pseudo-trajectories \(\varPsi ^{\varepsilon }_{\lambda _i}\) and \(\varPsi ^{\varepsilon }_{\lambda _j}\) respectively, touch each other (without colliding) and cross each other freely. Standard combinatorial arguments then show that the cumulant \(\varphi _s\) of order s is supported on clusters of size s, formed by s overlapping forests (namely any two forests are connected by a chain of overlaps).
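The support property of \(\varphi _s\) can be verified on a toy model. In the sketch below, forests are abstract labels and the overlaps are given as a set of unordered pairs (both are assumptions made for illustration); the indicator \(\varPhi \) of a family of forests equals 1 when no two of them overlap, and its cumulants are obtained by the standard Möbius inversion over set partitions, which inverts (4.2).

```python
from itertools import combinations
from math import factorial


def set_partitions(items):
    """Yield all partitions of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into one of the existing blocks ...
        for k in range(len(partition)):
            yield partition[:k] + [[first] + partition[k]] + partition[k + 1:]
        # ... or open a new block for it.
        yield [[first]] + partition


def phi_indicator(block, overlaps):
    """Indicator that no two forests of `block` overlap."""
    return int(all(frozenset(pair) not in overlaps
                   for pair in combinations(block, 2)))


def cumulant(block, overlaps):
    """Cumulant of the indicator on `block` (Moebius inversion of (4.2))."""
    total = 0
    for partition in set_partitions(list(block)):
        k = len(partition)
        term = (-1) ** (k - 1) * factorial(k - 1)
        for b in partition:
            term *= phi_indicator(b, overlaps)
        total += term
    return total


def reconstruct(block, overlaps):
    """Right-hand side of (4.2): sum over partitions of products of cumulants."""
    total = 0
    for partition in set_partitions(list(block)):
        term = 1
        for b in partition:
            term *= cumulant(b, overlaps)
        total += term
    return total
```

With three forests and the single overlap \(\{1,2\}\), the cumulant of \(\{1,2,3\}\) vanishes because forest 3 is not linked to the others, whereas adding the overlap \(\{2,3\}\) makes the overlap graph connected and the cumulant non-zero.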

The last source of correlation in (3.16) comes from the initial data. For each given \(\rho \), we introduce a cumulant expansion of the initial data associated with \(\rho \):

$$\begin{aligned} F^{{\varepsilon }0 }\big ( \varPsi ^{{\varepsilon }0}_n\big ) = \sum _{s =1}^r \sum _{\sigma \in \mathcal {P}_r^s} f^{{\varepsilon }0}_{\sigma }\,, \quad f^{{\varepsilon }0}_{\sigma } =\prod _{i=1} ^s f^{{\varepsilon }0}_{\sigma _i} \, , \quad f^{{\varepsilon }0} _{\sigma _i} =f^{{\varepsilon }0} \left( \varPsi ^{{\varepsilon }0}_{\sigma _i}\right) \;. \end{aligned}$$
(4.3)

Here and below, by abuse of notation, the partitions \(\sigma , \rho \) are also interpreted as a partition of \(\{1,\dots , n\} \), coarser than the partition \(\lambda \); the relative coarseness will be denoted by \(\lambda \hookrightarrow \rho \hookrightarrow \sigma \, .\) Therefore \(f^{{\varepsilon }0} \big (\varPsi _{\sigma _i}^{{\varepsilon }0}\big )\) is the time-zero block cumulant evaluated on the configuration (at time 0) of the pseudo-trajectory starting from \(Z_{\sigma _i}^*\).

We end up with a cluster structure on collision trees, of the form depicted in Fig. 5.

Fig. 5 Clustering structure due to recollisions, overlaps and initial correlations

Substituting (4.2) and (4.3) into (4.1), we arrive at the following decomposition of correlation functions:

$$\begin{aligned} F_{n, [0,t] }^{\varepsilon }(H^{\otimes n} ) = \int dZ_n^*\, \sum _{\begin{array}{c} \lambda , \rho , \sigma \\ \lambda \hookrightarrow \rho \hookrightarrow \sigma \end{array}} \int \mathcal {K}_\lambda \, \varphi _{ \rho } \, f^{{\varepsilon }0}_{\sigma }\;, \end{aligned}$$
(4.4)

where \(\lambda \) is the partition of \(\{1, \dots , n\}\) into \(\ell \) forests of recolliding trees, \(\rho \) is the partition of \(\{1, \dots , \ell \}\) into r jungles of overlapping forests, and \(\sigma \) is the partition of \(\{1, \dots , r\}\) into initially correlated clusters.

4.2 Cumulants and Clusters

Comparing formula (4.4) with (3.12), we finally identify the rescaled dynamical cumulants (averaged over trajectories):

$$\begin{aligned} {f^{\varepsilon }_{n,[0,t]} (H ^{\otimes n } )}= \mu _{\varepsilon }^{n-1} \int dZ_n^*\sum _{\ell =1}^n \sum _{\lambda \in \mathcal {P}_n^\ell } \sum _{r =1}^\ell \sum _{\rho \in \mathcal {P}_\ell ^r} \int \mathcal {K}_\lambda \, \varphi _{ \rho } \, f^{{\varepsilon }0}_{\{1,\dots ,r\}} (\varPsi _{\rho _1}^{{\varepsilon }0}, \dots , \varPsi _{\rho _r}^{{\varepsilon }0}) \, . \end{aligned}$$
(4.5)

This result shows that the cumulant of order n is geometrically represented by connected clusters of size n : \(f_{n, [0,t] }^{\varepsilon }\) corresponds to pseudo-trajectories where the n collision trees are connected by recollisions, overlaps, or initial correlations. This graphical representation of cumulants leads to the following result.

Proposition 2

(Convergence of dynamical cumulants, [6]) Consider a gas of hard spheres initially distributed according to (3.4). Let H be a bounded continuous functional on \(D([0,T^\star ])\). Define the rescaled cumulant \( {f^{\varepsilon }_{n,[0,t]} (H ^{\otimes n } )}\) by (4.5). Then,

  • there exists a positive constant C such that the following uniform a priori bound holds

    $$\begin{aligned} | {f^{\varepsilon }_{n,[0,t]} (H ^{\otimes n } )}| \le C^n \Vert H\Vert _\infty ^n (t+{\varepsilon }) ^{n-1} n! \end{aligned}$$

    uniformly in \({\varepsilon }\) and n, for any \(t \le T^\star \);

  • when \({\varepsilon }\rightarrow 0\), in the same time interval, \( {f^{\varepsilon }_{n,[0,t]} (H ^{\otimes n } )}\) converges to a limiting \({f_{n,[0,t]} (H^{\otimes n} )}\), which is represented by a sum over minimally connected graphs, and by pseudo-trajectories with exactly \(n-1\) pointwise recollisions or overlaps.

The key point to obtain the right scaling of cumulants is to identify “independent” clustering constraints : for fixed \(\lambda , \rho \), collision parameters a and \((T_m, \varOmega _m, V_m)\) and initial velocities \(V^*_n\) :

  • we extract a sequence of \(|\lambda _i | - 1\) clustering recollisions in each forest \(\lambda _i\), a sequence of \(|\rho _j| - 1\) clustering overlaps in each jungle \(\rho _j\), and a sequence of \(r- 1\) clustering initial correlations, and prove that the factor n! accounts for the combinatorics of these clustering constraints;

  • we then show that the clustering constraints can be expressed as \(n-1\) conditions on the positions at time t of the particles of the pseudo-trajectory \((x_j^*)_{j=1,\dots , n}\), which are satisfied on a set of volume \(O(\mu _{\varepsilon }^{-(n-1)})\).

These estimates being essentially uniform with respect to the collision parameters \((a, T_k, \varOmega _k, V_k)\), we can sum/integrate to get the \(L^1\)-bound.

There is a subtle point here: a brute-force expansion of the overlap constraint \(\varphi _s\) defined by (4.2) leads to \(2^{s^2}\) terms, and cancellations need to be exploited to show that the effective number is bounded by s!. This can be done by cluster expansion techniques (see e.g. [15, 24, 29]). In fact, \(\varphi _s\) can be regarded as an Ursell function ([29]) by writing formally “\(\varPhi _\ell (\lambda _1,\cdots ,\lambda _\ell ) = \exp \left( - U_\ell (\lambda _1,\cdots ,\lambda _\ell )\right) \)” and interpreting U as a hard core interaction on dynamical collision trees.

The proof of the second statement of Proposition 2 is very similar to Lanford’s proof. We first discard the contribution of initial correlations (which are of order \(O({\varepsilon }^d)\) instead of \(O({\varepsilon }^{d-1})\)). We then prove (as discussed in Sect. 3.3.3) that any recollision which is not of clustering type will create some extra smallness, giving a vanishing contribution to the limit.

4.3 Cumulant Generating Function

The cumulants allow us to characterize exponential moments of the empirical measure, as shown by the following identity:

$$\begin{aligned} \Lambda ^{\varepsilon }_{[0,t]} (H) :=\frac{1}{\mu _{\varepsilon }} \log {\mathbb {E}}_{\varepsilon }\left( \exp \Big ( \sum _{i =1}^{\mathcal {N}} H \big ( \mathbf{z}^{\varepsilon }_i ([0,t]) \big ) \Big ) \right) =\sum _{n = 1}^\infty {1\over n!} f^{\varepsilon }_{n, [0,t]} \big (( e^H - 1)^{\otimes n} \big )\,, \end{aligned}$$
(4.6)

valid for functionals \(H : D([0,t]) \rightarrow {\mathbb {R}}\) such that the series is absolutely convergent. In order to describe the asymptotic behavior of these exponential moments, we need to obtain dynamical equations for the limiting cumulant generating function

$$\begin{aligned} \sum _{n = 1}^\infty {1\over n!} f_{n, [0,t]}\big (( e^H - 1)^{\otimes n} \big ) \,, \end{aligned}$$
(4.7)

which is well defined (as a corollary of Proposition 2) for \(t\in [0,T^\star ]\) provided that H is a continuous functional satisfying a suitable bound:

$$\begin{aligned} \Big | \Big (e^{H \big (z([0,t]) \big )}-1\Big )^{\otimes n} \Big | \le \exp \Big ( \alpha _0 n +\frac{\beta _0}{4}\sup _{s\in [0,t] } |V_n(s)|^2\Big )\,, \end{aligned}$$

for some \(\alpha _0\) (related to the constant \(C_0\) in (3.3) and to \(T^\star \)).

We shall not write here the hierarchical equations for the family of cumulants at equal times \(\left( f_{n}(t)\right) _{n \ge 1}\), obtained by choosing \(H (z([0,t] )) = h(z(t))\) in (4.6). This hierarchy (mentioned in Sect. 2 as “Boltzmann cumulant hierarchy”) is derived and analysed in [13]. Our purpose is to focus directly on the full series (4.7), which we study for a class of regular test functionals.

For \(t\in [0, T^\star ]\), denote by \( \mathcal {J}(t, \varphi , \gamma )\) the limiting cumulant generating function (4.7) associated with

$$\begin{aligned} e^{ H \big (z([0,t]) \big )} = \gamma \big (z(t)\big ) \exp \Big ( - \int _0^t \varphi \big ( s,z(s) \big ) ds \Big )\, , \end{aligned}$$

where \((\varphi ,\gamma )\) belong to

$$\begin{aligned} \begin{aligned} \mathcal {B} :=&\Big \{(\varphi ,\gamma ) \in C^1([0,t] \times \mathbb {D};\mathbb {C}) \times C^1(\mathbb {D};\mathbb {C} ) \, \ \big |\ \, \\&| \gamma (z)| \le e^{\frac{1}{2}( \alpha _0 +\frac{\beta _0}{4} |v|^2)} ,\quad \sup _{s\in [0,T^\star ]}| \varphi (s,z) | \le {1\over 2T^\star } \left( \alpha _0 +\frac{\beta _0}{4} |v|^2\right) \Big \}\, . \end{aligned} \end{aligned}$$

We shall be interested in functions of the form

$$\begin{aligned} \varphi = D_sh \equiv ({\partial }_s+ v\cdot \nabla _x) h\quad \hbox { and } \quad \gamma = \exp (h(t)) \,, \end{aligned}$$

therefore we simplify notation by setting

$$\begin{aligned} \mathcal {J}(t,h) := \mathcal {J}(t, D_s h, \gamma )_{| \gamma = \exp (h(t))} \;. \end{aligned}$$

We set \({\mathbb {B}}:= \left\{ \, h \,\ \big |\ \, (D_t h, \exp ( h(T^\star )) ) \in \mathcal {B}\,\right\} \,.\) With these notations, the following result holds.

Theorem 2

(Hamilton–Jacobi equations, [6]) The functional \(\mathcal {J}\) is analytic with respect to \({\gamma }\) on \({\mathbb {B}}\), and it satisfies on \([0,T^\star ]\) the Hamilton–Jacobi equation

$$\begin{aligned} {\partial }_t \mathcal {J}(t,h)&= \frac{1}{2} \int {{\partial }\mathcal {J}(t,h) \over {\partial }\gamma } (z_1) {{\partial }\mathcal {J}(t,h) \over {\partial }\gamma } (z_2) \nonumber \\&\quad \times \Big (e^{h(t,z'_1) +h(t,z'_2)} - e^{h(t,z_1) +h(t,z_2)}\Big ) d\mu (z_1, z_2, \omega )\, , \end{aligned}$$
(4.8)

where

$$\begin{aligned} d\mu (z_1, z_2, \omega ):= \delta ({x_1 - x_2})\, ((v_1-v_2)\cdot \omega )_+ d\omega dv_1 dv_2 dx_1 dx_2\,. \end{aligned}$$
(4.9)

The local existence and uniqueness of a solution for this Hamilton–Jacobi equation relies on a Cauchy–Kowalewski argument in a functional space, encoding the loss continuity estimates due to the divergence of the collision cross section (4.9) at large velocities.

5 The Fluctuating Boltzmann Equation

Describing the fluctuations around the Boltzmann equation is a first way to capture part of the information which has been lost in the limit (1.2). As in the standard central limit theorem, we expect these fluctuations to be of order \(1/\sqrt{\mu _{\varepsilon }}\). We therefore define the fluctuation field \(\zeta ^\varepsilon \) by (see (1.4))

$$\begin{aligned} \zeta ^\varepsilon _t \big ( h \big ) := { \sqrt{\mu _{\varepsilon }}} \left( \pi ^{\varepsilon }_t(h) - \int \, F^{\varepsilon }_1(t,z) \, h \big ( z \big )\, dz \right) \, , \end{aligned}$$

for any test function \(h: \mathbb {D}\rightarrow \mathbb {R}\).

It is easy to check that, under our assumptions, the empirical measure starts close to the density profile \(f^0\) and that \(\zeta _0^{\varepsilon }\) converges to a Gaussian white noise \(\zeta _0\) with covariance

$$\begin{aligned} {\mathbb {E}}\left( \zeta _0(h_1)\, \zeta _0(h_2) \right) = \int h_1(z) \,h_2(z)\, f^0(z) \,dz\, . \end{aligned}$$

Moreover, it follows from Proposition 2 that \(\zeta ^{\varepsilon }\) converges to a solution of the fluctuating Boltzmann equation

$$\begin{aligned} d \zeta _t = \mathcal {L}_t \,\zeta _t\, dt + d\eta _t\,, \end{aligned}$$
(5.1)

where \(\mathcal {L}_t \) is the linearized Boltzmann operator around the solution f of the Boltzmann equation (3.9)

$$\begin{aligned} \mathcal {L}_t \,h(x,v) :=&- v \cdot \nabla _x h(x,v)+\int _{\mathbb {R}^d}\int _{\mathbb {S}^{d-1}} \,d \omega \, d w \left( (v - w) \cdot \omega \right) _+ \big ( f (t,x,w') h(x,v')\\&+ f (t,x,v') h(x,w') - f (t,x,v) h(x,w) - f (t,x,w) h(x,v) \big ), \end{aligned}$$

and \(d \eta _t(x,v)\) is a Gaussian noise with zero mean and covariance

$$\begin{aligned}&{\mathbb {E}}\left( \int dt_1\, dz_1 \, h_1 (z_1) \,\eta _{t_1} (z_1) \, \int dt_2 \, dz_2 \, h_2 (z_2) \,\eta _{t_2}(z_2) \right) \nonumber \\&\quad = \frac{1}{2} \int dt \int f (t, z_1)\, f (t, z_2) \varDelta h_1 \, \varDelta h_2 \; d\mu (z_1, z_2, \omega ) \end{aligned}$$
(5.2)

with notation (4.9) and

$$\begin{aligned} \varDelta h (z_1, z_2, \omega ) := h(x_1,v'_1) + h(x_2,v'_2) - h(x_1,v_1) - h(x_2,v_2) \,. \end{aligned}$$
(5.3)

Our main result is then the following.

Theorem 3

(Fluctuating Boltzmann equation, [6]) Consider a system of hard spheres initially distributed according to (3.4). Then, in the Boltzmann–Grad limit \(\mu _{\varepsilon }\rightarrow \infty \), the fluctuation field \(\left( \zeta ^{\varepsilon }_t\right) _{t \ge 0}\) converges in law on \([0,T^\star ]\) to the solution \(\left( \zeta _t\right) _{t \ge 0}\) of the fluctuating Boltzmann equation (5.1).

The convergence towards the limiting process (5.1) was conjectured by Spohn in [31] and the non-equilibrium covariance of the process at two different times was obtained in [30]. The noise emerges after averaging the deterministic microscopic dynamics. It is white in time and space, but correlated in velocities so that momentum and energy are conserved.
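The conservation property of the noise can be illustrated numerically: the covariance (5.2) depends on the test functions only through \(\varDelta h\) defined in (5.3), and \(\varDelta h\) vanishes when h is a collision invariant. The sketch below assumes the usual hard-sphere scattering rule for \((v'_1, v'_2)\) and evaluates \(\varDelta h\) at a common position \(x_1 = x_2\), as imposed by the delta function in (4.9); the function names are hypothetical.

```python
def scatter(v1, v2, omega):
    """Hard-sphere scattering with unit vector omega (assumed standard rule)."""
    s = sum((a - b) * o for a, b, o in zip(v1, v2, omega))
    v1p = [a - s * o for a, o in zip(v1, omega)]
    v2p = [b + s * o for b, o in zip(v2, omega)]
    return v1p, v2p


def delta_h(h, x, v1, v2, omega):
    """Delta h of (5.3) for two particles sitting at the same position x."""
    v1p, v2p = scatter(v1, v2, omega)
    return h(x, v1p) + h(x, v2p) - h(x, v1) - h(x, v2)


def mass(x, v):
    return 1.0


def momentum_x(x, v):
    return v[0]


def energy(x, v):
    return 0.5 * sum(c * c for c in v)


def not_invariant(x, v):
    # Square of one velocity component: not conserved by collisions.
    return v[0] * v[0]
```

For the invariants `mass`, `momentum_x` and `energy` the quantity `delta_h` is zero up to rounding, so these observables receive no noise, while a generic observable such as `not_invariant` does.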

We further recall a few properties (referring to [13, 30, 31] for details).

  • In the equilibrium case (\(f^0(x,v)= M_{\beta }(v)\) where \(M_{\beta }\) is a Maxwellian with inverse temperature \({\beta }\)) the noise term compensates the dissipation induced by the (stationary) linearized collision operator \(\mathcal {L}\), and the covariance of the noise can be predicted heuristically by using the invariant measure.

  • Out of equilibrium, on the one hand, the noise covariance (5.2) can be simply understood as a generalization of the covariance at equilibrium, based on the assumption (which can be proved for short times [32]) that the system is locally Poisson distributed; i.e. on a small cube around x at time t we see a uniform ideal gas with density \(\int f(t, x,v) dv\) and velocity distribution \(f (t,x,v) / \int f(t, x,v_*) dv_*\). The noise being delta-correlated in space and time, its structure is obtained from the equilibrium case after the replacement \(M_{\beta }(v) \rightarrow f(t,x,v)\).

  • On the other hand, the covariance of the fluctuation field out of equilibrium has a subtle microscopic structure originating from recollisions in the Newtonian dynamics. To see this, it is enough to compute the covariance of the fluctuation field at time t by using (3.10) :

    $$\begin{aligned} {\mathbb {E}}_{\varepsilon }\Big ( \zeta ^\varepsilon _t \big ( h \big )^2 \Big ) =&\int F^{\varepsilon }_1(t,z_1)\, h^2(z_1)\, d z_1 \\&\quad + \int \mu _{\varepsilon }\Big ( F^{\varepsilon }_2(t,Z_2) - F^{\varepsilon }_1(t,z_1) F^{\varepsilon }_1(t,z_2) \Big )\, h(z_1) h(z_2)\, d Z_2 \\ =&\int f^{\varepsilon }_1(t,z_1)\, h^2(z_1)\, d z_1 + \int f^{\varepsilon }_2(t,Z_2) \, h(z_1) h(z_2)\, d Z_2 \end{aligned}$$

    where, in the second equality, we used the first two cumulants as defined by (3.12). The last term is zero at equilibrium, while out of equilibrium it describes correlations visible at macroscopic distance in space. But, as made apparent from the geometrical representation (4.5), \(f^{\varepsilon }_2 \) records the effect of one recollision/overlap; meaning precisely that the pseudo-trajectories contributing to \(f^{\varepsilon }_2 \) have the form in Fig. 6. Contrary to the typical behavior of the hard sphere gas for which recollisions can be neglected, the covariance of the limiting Gaussian process encodes exactly the effect of a single recollision.

Fig. 6 Trajectories contributing to the equal time covariance at two different points in space (case of \(m=3\) collisions with fresh particles): clustering recollisions or clustering overlaps

The uniform bounds on the cumulants discussed in the previous section are considerably better than what is required to obtain Theorem 3. The proof indeed amounts to looking at a characteristic function living on larger scales. A more technical part concerns the tightness of the process. This can be achieved by adapting the Garsia–Rodemich–Rumsey inequality on the modulus of continuity to the case of a discontinuous process. We omit the details, and focus on the characteristic function only.

Consider the function H defined by

$$\begin{aligned} H (z([0,t])) = \sum _{p=1} ^P h_p \big (z (\theta _p)\big ) \end{aligned}$$
(5.4)

for a finite sequence of times \((\theta _p )_{1\le p \le P} \) and weights \((h_p )_{1\le p \le P} \). The characteristic function can be rewritten in terms of the empirical measure

$$\begin{aligned} \log {\mathbb {E}}_{\varepsilon }\left( \exp \Big ( \sum _{p= 1} ^P \zeta ^{\varepsilon }_{\theta _p} (h_p) \Big ) \right)&= \mu _{\varepsilon }\sum _{ n=1}^\infty \frac{1}{n !} f_{n, [0,t] }^{\varepsilon }\left( \big ( e^{ {H\over \sqrt{\mu _{\varepsilon }}}} - 1\big )^{\otimes n} \right) \\&\quad - \sqrt{\mu _{\varepsilon }} \sum _{p= 1}^P \int F^{\varepsilon }_1(\theta _p,z) h_p(z)\,dz \, . \end{aligned}$$

At leading order, only the terms \(n=1\) and \(n=2\) will be relevant in the limit since

$$\begin{aligned} \Big | f_{n, [0,t] }^{\varepsilon }\left( \big ( e^{ H\over \sqrt{\mu _{\varepsilon }}} - 1\big )^{\otimes n} \right) \Big | \le C^n \left\| e^{ H\over \sqrt{\mu _{\varepsilon }}} - 1\right\| _\infty ^n n! \, . \end{aligned}$$

Expanding the exponential in powers of \(\mu _{\varepsilon }^{-1/2}\), we also notice that the term of order \(\sqrt{\mu _{\varepsilon }}\) cancels and we find

$$\begin{aligned} \log {\mathbb {E}}_{\varepsilon }\left( \exp \Big ( \sum _{p= 1} ^P \zeta ^{\varepsilon }_{\theta _p} (h_p) \Big ) \right) = \frac{1}{2} f_{1, [0,t]} ^{\varepsilon }\left( H^2 \right) + \frac{1}{2} f_{2, [0,t]}^{\varepsilon }\left( H^{\otimes 2} \right) + O \left( \frac{1}{\sqrt{ \mu _{\varepsilon }} } \right) \,. \end{aligned}$$

Then the characteristic function \({\mathbb {E}}_{\varepsilon }\left( \exp \big ( \sum _{p= 1} ^P \zeta ^{\varepsilon }_{\theta _p} (h_p) \big ) \right) \) converges to the characteristic function of a Gaussian process.
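For the reader's convenience, the expansion can be checked order by order. The following is a sketch for bounded weights \(h_p\), using the linearity of \(f^{\varepsilon }_{1,[0,t]}\), the identity \(f^{\varepsilon }_1 = F^{\varepsilon }_1\) and the uniform bound of Proposition 2; constants in the remainders are not tracked:

$$\begin{aligned} \mu _{\varepsilon }\, f_{1, [0,t] }^{\varepsilon }\left( e^{ {H\over \sqrt{\mu _{\varepsilon }}}} - 1 \right)&= \sqrt{\mu _{\varepsilon }}\, f_{1, [0,t] }^{\varepsilon }(H) + \frac{1}{2} f_{1, [0,t] }^{\varepsilon }\left( H^2\right) + O \left( \frac{1}{\sqrt{ \mu _{\varepsilon }} } \right) \,, \\ \frac{\mu _{\varepsilon }}{2}\, f_{2, [0,t] }^{\varepsilon }\left( \big ( e^{ {H\over \sqrt{\mu _{\varepsilon }}}} - 1\big )^{\otimes 2} \right)&= \frac{1}{2} f_{2, [0,t] }^{\varepsilon }\left( H^{\otimes 2}\right) + O \left( \frac{1}{\sqrt{ \mu _{\varepsilon }} } \right) \,, \\ \frac{\mu _{\varepsilon }}{n!} \Big | f_{n, [0,t] }^{\varepsilon }\left( \big ( e^{ {H\over \sqrt{\mu _{\varepsilon }}}} - 1\big )^{\otimes n} \right) \Big |&\le \mu _{\varepsilon }\, C^n \left\| e^{ H\over \sqrt{\mu _{\varepsilon }}} - 1\right\| _\infty ^n = O \left( \mu _{\varepsilon }^{1 - n/2} \right) \quad \hbox {for } n \ge 3\,. \end{aligned}$$

Since \(f_{1, [0,t] }^{\varepsilon }(H) = \sum _{p=1}^P \int F^{\varepsilon }_1(\theta _p,z)\, h_p(z)\, dz\) for H of the form (5.4), the term \(\sqrt{\mu _{\varepsilon }}\, f_{1, [0,t] }^{\varepsilon }(H)\) is exactly compensated by the centering of the fluctuation field.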

From the equations on \(f_1\) and \( f_2\), we deduce that the limiting covariance \(\mathcal {C}= \mathcal {C}(s,t,\varphi ,\psi )\) satisfies the following dynamical equations, for test functions \(\varphi ,\psi \) on \( \mathbb {D}\) :

$$\begin{aligned} {\left\{ \begin{array}{ll} {\partial }_t \mathcal {C}(s,t,\varphi ,\psi ) = \mathcal {C}(s,t,\varphi , \mathcal {L}_t^*\psi )\,, \\ {\partial }_t \mathcal {C}(t,t,\varphi ,\psi ) = \mathcal {C}(t,t,\varphi , \mathcal {L}_t^*\psi ) + \mathcal {C}(t,t, \mathcal {L}_t^* \varphi , \psi ) + \mathbf{Cov}_t ( \psi , \varphi )\,, \\ \mathcal {C}(0,0,\varphi ,\psi ) =\displaystyle \int \varphi (z) \psi (z) f^0(z) dz\,, \end{array}\right. } \end{aligned}$$

where

$$\begin{aligned} \mathbf{Cov}_t ( \varphi , \psi ) := \frac{1}{2} \int d\mu (z_1, z_2, \omega ) \, f (t, z_1)\, f (t, z_2) \varDelta \psi \varDelta \varphi \end{aligned}$$

with notation (4.9) and (5.3), and \( \mathcal {L}_t^* := v\cdot \nabla _x + \mathbf{L}_t^*\) with

$$\begin{aligned} \mathbf{L}_t^* \,\varphi (v) := \int _{\mathbb {R}^d}\int _{\mathbb {S}^{d-1}} \,d \omega \, d w \left( (v - w) \cdot \omega \right) _+ f (t,w) \varDelta \varphi \, . \end{aligned}$$

The covariance of the fluctuating Boltzmann equation (5.1) satisfies the same equations, and we conclude by a uniqueness argument that both processes coincide.
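Two elementary properties of the noise covariance can be read off its definition. The sketch below assumes that the notation \(\varDelta \varphi \) in (4.9) stands for the usual collisional increment \(\varphi (z_1') + \varphi (z_2') - \varphi (z_1) - \varphi (z_2)\), and that \(d\mu \) is a nonnegative measure, as the positive part in the collision kernel suggests; both are assumptions made for this illustration. Then

$$\begin{aligned} \mathbf{Cov}_t ( \varphi , \varphi ) = \frac{1}{2} \int d\mu (z_1, z_2, \omega ) \, f (t, z_1)\, f (t, z_2) \, ( \varDelta \varphi )^2 \ge 0 \, , \qquad \mathbf{Cov}_t ( \varphi , \psi ) = 0 \quad \text {whenever } \varDelta \varphi = 0 \, . \end{aligned}$$

In particular, under this convention the collision invariants \(\varphi (v) = 1, \, v, \, |v|^2\) satisfy \(\varDelta \varphi = 0\), so the noise term in the equation for \(\mathcal {C}(t,t,\varphi ,\psi )\) would not act on the conserved quantities.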

6 Large Deviations

While typical fluctuations are of order \(O(\mu _{\varepsilon }^{-1/2})\), larger fluctuations may occasionally occur, leading to an evolution which differs from the typical one given by the Boltzmann equation. A classical problem is to evaluate the probability of such atypical events, namely that the empirical measure \(\pi ^{\varepsilon }_t\), defined in (1.1), remains close to a probability density \(\varphi _t\) during the time interval \([0,T^\star ]\).

In the Gärtner–Ellis theory of large deviations [11], the large deviation functional is given as the Legendre transform of the limiting cumulant generating function. The cumulant analysis established the existence of the limiting exponential moment \(\mathcal {J}(t,h)\) and its characterization via the Hamilton–Jacobi equation in Theorem 2. For any \(t \le T^\star \), we then define the large deviation functional on the time interval \([0,t]\) as

$$\begin{aligned} \mathcal {F}(t, \varphi )&:= \sup _{ h \in {\mathbb {B}}} \Big \{ - \int _0^t \int _\mathbb {D} \varphi (s, z) D_s h (s, z) dz ds \nonumber \\&\qquad \qquad + \int _\mathbb {D} \varphi (t,z) h(t,z) dz - \mathcal {J}(t,h) \Big \} \, . \end{aligned}$$
(6.1)

Since the supremum is restricted, for technical reasons, to the test functions in \({\mathbb {B}}\), we do not expect \(\mathcal {F}\) to be the correct large deviation functional. However, the following theorem shows that the functional \(\mathcal {F}\) fully describes the large deviation behavior for densities \(\varphi \) such that the supremum in (6.1) is reached for some \(h \in {\mathbb {B}}\). This restricted set of densities \(\varphi \) will be called \(\mathcal {R}\).

A different, explicit formula for the large deviation functional was obtained by Rezakhanlou [28] in the case of a one-dimensional stochastic dynamics mimicking the hard-sphere dynamics, and then conjectured for the three-dimensional, deterministic hard-sphere dynamics by Bouchet [7]:

$$\begin{aligned} \widehat{\mathcal {F}}(t,\varphi ) :=&\widehat{\mathcal {F}} (0,\varphi _0) \nonumber \\&+ \sup _p \left\{ \int _0^{t} ds \left[ \int _{{\mathbb {T}}^d} dx \, \int _{{\mathbb {R}}^d} dv \, p(s,x,v) \, D_s\varphi (s,x,v) - \mathcal {H}\big ( \varphi (s) ,p(s) \big ) \right] \right\} , \end{aligned}$$
(6.2)

where the supremum is taken over bounded measurable functions p growing at most quadratically in v, the Hamiltonian is given by

$$\begin{aligned} \mathcal {H}(\varphi ,p) := \frac{1}{2} \int d\mu (z_1, z_2, \omega ) \varphi (z_1) \varphi (z_2) \big ( \exp \big ( \varDelta p \big ) -1 \big ) \end{aligned}$$

and \(\widehat{\mathcal {F}}(0,\cdot )\) stands for the large deviation functional on the initial data

$$\begin{aligned} \widehat{\mathcal {F}} (0,\varphi _0) = \int dz \left( \varphi _0 \log \left( \frac{\varphi _0}{f^0} \right) - \varphi _0 + f^0 \right) . \end{aligned}$$
(6.3)
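Formula (6.3) has the relative-entropy form one expects from a Legendre transform, in the spirit of the Gärtner–Ellis approach recalled above. As an illustration, assume (this is a hypothesis made for the sketch, not a statement taken from the text) that the limiting exponential moment at time 0 is the Poisson-type functional \(\int f^0 (e^{h} - 1) \, dz\). For \(\varphi _0, f^0 > 0\), the supremum over \(h\) can then be computed pointwise in \(z\): the map \(h \mapsto \varphi _0 h - f^0 (e^{h}-1)\) is strictly concave with derivative \(\varphi _0 - f^0 e^{h}\), so the maximum is attained at \(h = \log (\varphi _0 / f^0)\), and

$$\begin{aligned} \sup _{h} \left\{ \int \varphi _0 \, h \, dz - \int f^0 \big ( e^{h} - 1 \big ) dz \right\} = \int dz \left( \varphi _0 \log \left( \frac{\varphi _0}{f^0} \right) - \varphi _0 + f^0 \right) . \end{aligned}$$

The integrand is nonnegative and vanishes only where \(\varphi _0 = f^0\), so that, under this assumption, the initial cost is zero exactly for the typical profile.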

Let \(\hat{\mathcal {R}}\) denote the set of densities \(\varphi \) such that the supremum in (6.2) is reached for some \(p \in {\mathbb {B}}\).

Let \(\mathcal {M}({\mathbb {D}})\) be the set of probability measures on \({\mathbb {D}}\).

Our main result is then the following.

Theorem 4

(Large deviations, [6]) Consider a system of hard spheres initially distributed according to (3.4). In the Boltzmann–Grad limit \(\mu _{\varepsilon }\rightarrow \infty \), the empirical measure \(\pi ^{\varepsilon }\) satisfies the following large deviation estimates for any \(t \in [0,T^\star ]\).

  • For any compact set \(\mathbf{F}\) of the Skorokhod space \(D([0,T^\star ], \mathcal {M})\),

    $$\begin{aligned} \limsup _{\mu _{\varepsilon }\rightarrow \infty } \frac{1}{\mu _{\varepsilon }} \log {\mathbb {P}}_{\varepsilon }\left( \pi ^\varepsilon \in \mathbf{F} \right) \le - \inf _{\varphi \in \mathbf{F}} \mathcal {F}(T^\star ,\varphi )\, . \end{aligned}$$
    (6.4)
  • For any open set \(\mathbf{O}\) of the Skorokhod space \(D([0,T^\star ], \mathcal {M})\),

    $$\begin{aligned} \liminf _{\mu _{\varepsilon }\rightarrow \infty } \frac{1}{\mu _{\varepsilon }} \log {\mathbb {P}}_{\varepsilon }\left( \pi ^\varepsilon \in \mathbf{O} \right) \ge - \inf _{\varphi \in \mathbf{O} \cap \mathcal {R}} \mathcal {F}(T^\star ,\varphi )\, . \end{aligned}$$
    (6.5)

Moreover, for any \(\varphi \in \mathcal {R}\cap \hat{\mathcal {R}} \) and t sufficiently small, one has that \(\mathcal {F}(t,\varphi ) = \widehat{\mathcal {F}}(t,\varphi )\).

Given our precise control of the exponential moments, the large deviation proof is standard. Note that, in the absence of global convexity, we are unable to prove a full large deviation principle. However, restricting to a class of regular profiles, the variational problem defining the dual of \(\widehat{\mathcal {F}}\) can be uniquely solved and identified with the solution of the Hamilton–Jacobi equation (4.8). The result then follows from a uniqueness property of (4.8).
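A formal expansion for small \(p\) illustrates how the Hamiltonian in (6.2) relates to the Gaussian theory of the previous section. This is only a sketch: it assumes that \(\varDelta p\) is bounded and that the integrals below are finite. Expanding \(\exp ( \varDelta p ) - 1 = \varDelta p + \frac{1}{2} ( \varDelta p )^2 + O ( | \varDelta p |^3 )\), one would get

$$\begin{aligned} \mathcal {H}(\varphi ,p) = \frac{1}{2} \int d\mu (z_1, z_2, \omega ) \, \varphi (z_1) \varphi (z_2) \, \varDelta p + \frac{1}{4} \int d\mu (z_1, z_2, \omega ) \, \varphi (z_1) \varphi (z_2) \, ( \varDelta p )^2 + \dots \end{aligned}$$

The first term is linear in \(p\) and, with the usual convention for \(\varDelta p\), corresponds to the weak form of the Boltzmann collision term tested against \(p\). The second term equals \(\frac{1}{2} \mathbf{Cov}_t (p,p)\) with \(f(t)\) replaced by \(\varphi \), that is, the covariance of the noise appearing in the fluctuation equations above.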