1 Introduction and Main Result

1.1 Thermostated Kac Particle System

We are interested in Kac’s 1D particle system, subjected to interactions against particles taken from an ideal external thermostat, as studied for instance in [4, 22]. It can be described as follows: consider N particles characterized by their one-dimensional velocities, subjected to two types of random interactions:

  • Kac collisions: at rate \(\lambda N\), randomly select two particles in the system and update their velocities \(v,v_*\in \mathbb {R}\) according to the rule

    $$\begin{aligned} \left( \begin{array}{c} v \\ v_* \end{array} \right) \mapsto \left( \begin{array}{c} v' \\ v_*' \end{array} \right) {:}{=} \left( \begin{array}{c} v\cos \theta - v_*\sin \theta \\ v\sin \theta + v_*\cos \theta \end{array} \right) , \end{aligned}$$
    (1)

    where \(\theta \in [0,2\pi )\) is selected uniformly at random. This rule preserves the energy: \(v^2 + v_*^2 = (v')^2 + (v_*')^2\).

  • Thermostat interactions: at rate \(\mu N\), randomly select a particle in the system and update its velocity \(v\in \mathbb {R}\) according to the rule

    $$\begin{aligned} v \mapsto v\cos \theta - w\sin \theta , \end{aligned}$$
    (2)

    where \(\theta \) is again selected uniformly at random on \([0,2\pi )\), and w is sampled with the Gaussian density \(\gamma (w) = (2\pi T)^{-1/2} e^{-w^2/2T}\). This can be seen as an interaction against a particle taken from an ideal thermostat, that is, from an infinite reservoir at thermal equilibrium with temperature \(T>0\).

Here \(\lambda>0, \mu >0\) are given fixed constants representing the rate of Kac and thermostat collisions, respectively. The initial velocities of the N particles are chosen according to some prescribed symmetric distribution \(f_0^N\) on \(\mathbb {R}^N\), and all previous random choices are made independently. These rules unambiguously specify the law of the particle system as an \(\mathbb {R}^N\)-valued pure-jump continuous-time Markov process, whose state at time \(t\ge 0\) is denoted \(\mathbf {V}_t = (V_t^1,\ldots ,V_t^N)\), and we also write \(f_t^N = \text {Law}(\mathbf {V}_t)\) for its symmetric distribution. For simplicity, in our notation we omit the dependence on N in the particle system \(\mathbf {V}_t\).
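To make the dynamics concrete, the following minimal simulation sketch (ours, purely for illustration; the function and parameter names are not part of the model) implements the two jump mechanisms exactly as described: the total jump rate is \((\lambda +\mu )N\), and each jump is a Kac collision with probability \(\lambda /(\lambda +\mu )\) and a thermostat interaction otherwise.

```python
import numpy as np

def simulate_thermostated_kac(N, lam, mu, T, t_final, seed=0):
    """Minimal sketch of the thermostated Kac N-particle system.

    Kac collisions occur at total rate lam*N and thermostat interactions at
    total rate mu*N, so jumps arrive at rate (lam+mu)*N and their type is
    chosen proportionally to the rates.  Illustrative code, not from the paper.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(N)          # example initial condition (unit energy per particle)
    t = 0.0
    while True:
        t += rng.exponential(1.0 / ((lam + mu) * N))   # waiting time until the next jump
        if t > t_final:
            break
        theta = rng.uniform(0.0, 2.0 * np.pi)
        if rng.random() < lam / (lam + mu):
            # Kac collision, rule (1): rotate the velocities of two distinct particles
            i, j = rng.choice(N, size=2, replace=False)
            vi, vj = v[i], v[j]
            v[i] = vi * np.cos(theta) - vj * np.sin(theta)
            v[j] = vi * np.sin(theta) + vj * np.cos(theta)
        else:
            # thermostat interaction, rule (2): collide one particle with w ~ N(0, T)
            i = rng.integers(N)
            w = np.sqrt(T) * rng.standard_normal()
            v[i] = v[i] * np.cos(theta) - w * np.sin(theta)
    return v

# Example: the average energy (1/N) sum_i v_i^2 relaxes towards the thermostat temperature T.
# v = simulate_thermostated_kac(N=2000, lam=1.0, mu=1.0, T=2.0, t_final=5.0)
# print(np.mean(v**2))
```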

Kac’s original model [17], corresponding to the case \(\mu =0\), represents the evolution of a large number of indistinguishable particles that exchange energies via random collisions in a one-dimensional caricature of a gas, as a simplification of the more realistic spatially homogeneous Boltzmann equation. The form of the collision rule (1) implies that the average energy \(\frac{1}{N}\sum _i (V_t^i)^2\) is preserved a.s., and one typically assumes that the initial average energy is a.s. equal to 1. Thus, \(f_t^N\) is supported on the sphere \(S^N = \{\mathbf {v}\in \mathbb {R}^N : \sum _i (v^i)^2 = N\}\) for all \(t\ge 0\), and the dynamics has \(\sigma ^N\), the uniform measure on \(S^N\), as the unique stationary distribution. Kac worked with initial conditions \(f_0^N\) having a density in \(L^2(S^N,\sigma ^N)\), for which we now know that \(f_t^N\) equilibrates exponentially fast in the \(L^2\) norm, with rates uniform in N, see [5, 16]. However, the \(L^2\) norm is a crude upper bound for the \(L^1\) norm; moreover, the \(L^2\) norm of typical initial distributions \(f_0^N\) with a near-product structure (specifically, chaotic sequences, see below) grows exponentially with N, which means that one has to wait a time proportional to N in order for the \(L^2\) bound to start providing evidence of convergence.

Thus, one looks for alternative ways to quantify equilibration, such as convergence in relative entropy. The relative entropy of near-product measures grows linearly (and not exponentially) with N, which is a crucial advantage over the \(L^2\) norm. The usual approach is to control the entropy production, in order to obtain an exponential rate of equilibration in relative entropy. Unfortunately, there exist sequences of initial distributions for which the entropy production degenerates as \(N\rightarrow \infty \), as shown in [12]. It is worth noting, however, that the sequence constructed in [12] is physically unlikely, in the sense that \(f_0^N\) gives half the total energy of the system to a small fraction of the particles. This raises the question of whether there is a smaller, but still rich, class of initial conditions for which one can have good control on the entropy production. We refer the reader to [7] for more details about the Kac model and equilibration in relative entropy.

Picking up the challenge of choosing good (physical) initial conditions, and in order to avoid the badly behaved initial distributions for which entropy production degenerates, Bonetto et al. [4] introduced the model (1)-(2), called the thermostated Kac particle system, to describe a system in which all but a few particles are at equilibrium. This thermostated particle system no longer preserves the energy, so \(f_t^N\) is supported on the whole space \(\mathbb {R}^N\), and the equilibrium distribution is the N-dimensional Gaussian with density \(\gamma ^{\otimes N}(\mathbf {v}) = \prod _i \gamma (v^i)\). In this case, the system approaches equilibrium in relative entropy exponentially fast, with rates uniform in the number of particles, see [4, Theorem 3]. Later, in [3], the use of the ideal thermostat (2) was justified by approximating it with a finite but large reservoir of particles at equilibrium in a quantitative way.

1.2 Propagation of Chaos

Besides the long-time behaviour of the particle system, one can also study convergence of \(f_t^N\) as \(N\rightarrow \infty \). Notice however that this is not an easy task, because even if we consider particles whose velocities are independent at \(t=0\), the collisions amongst them will destroy this independence for later times. Nevertheless, for the thermostated Kac system, one expects the correlations between particles to become weaker as N grows. The following concept formalizes this idea of asymptotic independence:

Definition 1

(chaos) For each \(N\in \mathbb {N}\), let \(f^N\) be a symmetric probability measure on \(\mathbb {R}^N\). The collection \((f^N)_{N\in \mathbb {N}}\) is said to be chaotic with respect to some given probability measure f on \(\mathbb {R}\), if for all \(k\in \mathbb {N}\), the marginal distribution of \(f^N\) on the first k variables converges in distribution, as \(N\rightarrow \infty \), to the tensor product measure \(f^{\otimes k}\). That is: for every \(k\in \mathbb {N}\) and every bounded and continuous function \(\phi :\mathbb {R}^k \rightarrow \mathbb {R}\), it holds

$$\begin{aligned} \lim _{N\rightarrow \infty } \int _{\mathbb {R}^N} \phi (v^1,\ldots ,v^k) f^N(d\mathbf {v}) = \int _{\mathbb {R}^k} \phi (v^1,\ldots ,v^k) f(d v^1) \cdots f(dv^k). \end{aligned}$$

For Kac’s model, that is, when \(\mu =0\), we know that if the sequence \(( f_0^N )_{N\in \mathbb {N}}\) is chaotic with respect to some probability measure \(f_0\) on \(\mathbb {R}\), then for all \(t\ge 0\) the sequence \(( f_t^N )_{N\in \mathbb {N}}\) will also be chaotic with respect to some \(f_t\); this property is known as propagation of chaos. The limit \(f_t\) is the solution to the so-called Boltzmann–Kac equation, which reads

$$\begin{aligned} \frac{d f_t}{dt}(v) = 2\lambda \int _\mathbb {R}\int _0^{2\pi } [f_t(v')f_t(v_*') - f_t(v)f_t(v_*)] \frac{d\theta }{2\pi } dv_*, \end{aligned}$$
(3)

in the case where \(f_0\), and thus every \(f_t\), has a density. This was first shown by Kac [17] in the special case where \(f_0^N\) has a density in \(L^2(S^N,\sigma ^N)\). The solution to (3) also preserves the initial energy, i.e., \(\int v^2 f_t(dv) = \int v^2 f_0(dv) = 1\) for all \(t\ge 0\). It is straightforward to verify that the Gaussian density with energy 1 is a stationary distribution of the equation, and it is known that the solution converges to it, see for instance [15, 17].

When we introduce the thermostat to Kac’s original model, propagation of chaos still holds, as shown in [4, Theorem 5], and the limit density satisfies

$$\begin{aligned} \begin{aligned} \frac{d f_t}{dt}(v)&= 2\lambda \int _\mathbb {R}\int _0^{2\pi } [f_t(v')f_t(v_*') - f_t(v)f_t(v_*)] \frac{d\theta }{2\pi } dv_* \\&\quad +\mu \int _\mathbb {R}\int _0^{2\pi } [f_t(v') \gamma (v_*') - f_t(v) \gamma (v_*)] \frac{d\theta }{2\pi } dv_*, \end{aligned} \end{aligned}$$
(4)

which we refer to as the thermostated Boltzmann–Kac equation, or simply the kinetic equation. As with the particle system, the solution to (4) does not preserve the initial energy, and its equilibrium distribution is \(\gamma \), the Gaussian density with energy T. When the initial condition \(f_0\) has a density with finite relative entropy, it follows from [4, Proposition 15] that there is exponential convergence to equilibrium in relative entropy. In Definition 3 we will provide a notion of weak solution for (4), which will allow us to work with probability measures instead of densities. Using this notion, we give an existence and uniqueness result in Theorem 5.

We mention that, starting from Kac’s work [17], propagation of chaos has been studied for other related kinetic models, most notably the spatially homogeneous Boltzmann equation, see for instance [11, 14, 18, 19] and the references therein. Propagation of chaos has also been studied for similar models involving thermostats; for example, the authors in [2, 6] consider the Gaussian isokinetic thermostat, which is used to keep the total energy of the system fixed.

1.3 Main Result

Chaoticity, and thus propagation of chaos, can be made quantitative. For Kac’s model this was done in [9] using Wasserstein distances, defined below, and providing explicit convergence rates in N which are uniform in time. Similar quantitative results for the spatially homogeneous Boltzmann equation can be found for instance in [11, 18].

The goal of the present article is to strengthen the propagation of chaos result for the thermostated Kac model in [4], by making it quantitative in N with rates that are uniform in time. To quantify chaos we will use the following metric: given two probability measures f, g on \(\mathbb {R}^k\), their 2-Wasserstein distance is given by

$$\begin{aligned} W_2(f,g) = \left( \inf _{{\mathbf {X}},{\mathbf {Y}}} \mathbb {E}\left[ \frac{1}{k} \sum _{i=1}^k (X^i - Y^i)^2 \right] \right) ^{1/2}, \end{aligned}$$

where the infimum is taken over all pairs of random vectors \({\mathbf {X}} = (X^1,\ldots ,X^k)\) and \({\mathbf {Y}} = (Y^1,\ldots ,Y^k)\) such that \(\text {Law}({\mathbf {X}}) = f\) and \(\text {Law}({\mathbf {Y}}) = g\). This defines a distance on the space of probability measures on \(\mathbb {R}^k\) with finite second moment. The infimum is always achieved by some \(({\mathbf {X}},{\mathbf {Y}})\), and such a pair is called an optimal coupling; see [23] for details.
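Since all the measures in this article live on \(\mathbb {R}\), or are empirical measures of real-valued samples, it is worth recalling that in dimension one the optimal coupling is the monotone (sorted) rearrangement. The following sketch (ours, for illustration only) computes the squared distance between two empirical measures with the same number of atoms; the distance to a continuous measure such as \(f_t\) can then be approximated by sampling from it.

```python
import numpy as np

def w2_squared_empirical(x, y):
    """Squared 2-Wasserstein distance between the empirical measures of two
    1D samples of equal size.  In one dimension the optimal coupling is the
    monotone pairing, so the infimum over couplings reduces to sorting.
    Illustrative helper, not code from the paper."""
    x, y = np.sort(np.asarray(x, dtype=float)), np.sort(np.asarray(y, dtype=float))
    assert x.shape == y.shape, "samples must have the same number of atoms"
    return float(np.mean((x - y) ** 2))

# Example: two Gaussian samples with different variances
# rng = np.random.default_rng(0)
# print(w2_squared_empirical(rng.standard_normal(10_000), 2.0 * rng.standard_normal(10_000)))
```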

We will use the following characterization of chaoticity, see for instance [19]: a sequence \((f^N)_{N\in \mathbb {N}}\) is f-chaotic if and only if, for random vectors \({\mathbf {X}}\) on \(\mathbb {R}^N\) with \(\text {Law}({\mathbf {X}}) = f^N\), the sequence of random empirical measures

$$\begin{aligned} \bar{{\mathbf {X}}} {:}{=} \frac{1}{N} \sum _{i=1}^N \delta _{X^i} \end{aligned}$$

converges in distribution, as \(N\rightarrow \infty \), to the constant probability measure f. We can now state our main result.

Theorem 2

(uniform propagation of chaos) Assume that \(\int |v|^r f_0(dv) < \infty \) for some \(r>4\). Let \((\mathbf {V}_t)_{t\ge 0}\) be the thermostated Kac N-particle system described by (1)-(2), and let \((f_t)_{t\ge 0}\) be the unique weak solution of (4). Then there exists a constant C depending only on \(\lambda \), \(\mu \), T, r, and \(\int |v|^r f_0(dv)\), such that for all \(t\ge 0\) we have:

$$\begin{aligned} \mathbb {E}[W_2^2({\bar{\mathbf {V}}}_t, f_t)] \le 4 e^{-\frac{\mu }{2} t} W_2^2(f_0^N, f_0^{\otimes N}) + \frac{C}{N^{1/3}}. \end{aligned}$$
(5)

We remark that in this result \((f_0^N)_{N\in \mathbb {N}}\) can be any family of symmetric initial distributions; thus, Theorem 2 provides a uniform-in-time propagation of chaos rate of order \(N^{-1/3}\) whenever \(W_2^2(f_0^N,f_0^{\otimes N})\) converges to 0 at the same rate or faster. For instance, one can simply take \(f_0^N = f_0^{\otimes N}\), so the first term in (5) vanishes; another common choice for \(f_0^N\) is the one described in [7], where the authors construct a chaotic sequence by conditioning \(f_0^{\otimes N}\) to the Kac sphere \(S^N\); quantitative rates of chaoticity in \(W_2\) for this kind of construction can be found in [8].

The rate \(N^{-1/3}\) is not so far from the optimal rate \(N^{-1/2}\), valid for the convergence of the empirical measure of an N-tuple of i.i.d. variables towards their common law, with the same metric as in (5); see [13, Theorem 1]. We remark that if one only assumes \(\int |v|^r f_0(dv) < \infty \) for some \(2<r<4\), we can still deduce (5), but with a slower chaos rate of order \(N^{-\eta (r)}\) for some \(0<\eta (r)<1/3\). We also note that the value \(\mu /2\), corresponding to the rate of decay of the initial condition term in (5) (see also the contraction estimates given in Lemmas 7 and 9 below), coincides with the spectral gap of the generator of the particle system, and with the bound on the entropy production obtained in [4]; see [22] for the optimality of this bound.

The proof of Theorem 2 is based on a coupling argument developed in [10] and later used in [9] to prove uniform propagation of chaos for Kac’s original model. This argument makes use of a probabilistic object called the Boltzmann process, which is a stochastic process \((Z_t)_{t\ge 0}\) satisfying \(\text {Law}(Z_t) = f_t\) for all \(t\ge 0\). More specifically, we will construct our particle system \(\mathbf {V}_t = (V_t^1,\ldots ,V_t^N)\) using a Poisson point measure, and couple it with a collection \(\mathbf {Z}_t = (Z_t^1,\ldots ,Z_t^N)\) of Boltzmann processes, in a way that the two remain close on expectation. Some adaptations are required in order to use this technique. For instance, we will need to introduce an additional Poisson point measure to represent thermostat interactions in the particle system. Also, whereas Kac’s original model is known to have useful properties like well-posedness, propagation of moments, and convergence to equilibrium (which the argument of [9, 10] requires), for the thermostated Kac model of the present paper we will need to prove these properties, or adapt previously known results. For instance, see Theorem 5 for well-posedness, Lemma 8 for propagation of moments, and Lemmas 7 and 9 for convergence to equilibrium in the \(W_2\)-metric.

The structure of the article is as follows. In Sect. 2 we provide a notion of weak solution for the thermostated Boltzmann–Kac equation (4), valid for collections \((f_t)_{t\ge 0}\) of probability measures, and we then prove a well-posedness result for this notion. In Sect. 3 we specify the coupling construction mentioned above and we prove Theorem 2. Along the way, we will use this construction to prove some interesting results, such as the equilibration in \(W_2\) for the particle system in Lemma 7, and an analogous result for the kinetic equation in Lemma 9. Some final comments are given in Sect. 4.

2 Well-Posedness for the Kinetic Equation

In this section, we define a notion of weak solution to (4), and prove its well-posedness. We will not require each \(f_t\) to have a density; instead, it will be an element of the space \(\mathcal {M}\) of bounded non-negative Borel measures on \(\mathbb {R}\) metrized by total variation \(\Vert \cdot \Vert \). We will see that, if \(f_0\) is a probability measure, then \(f_t\) will also be a probability measure for all \(t>0\). Similarly, if \(f_0\) has a density, so will \(f_t\).

For convenience, let us introduce the mapping \(B:\mathcal {M}\times \mathcal {M}\rightarrow \mathcal {M}\), given by

$$\begin{aligned} \int _\mathbb {R}\phi (x) B[\nu _1, \nu _2](dx) = \int _\mathbb {R}\int _\mathbb {R}\nu _1(dx) \nu _2(dy) \int _0^{2\pi } \phi (x \cos \theta + y \sin \theta ) \frac{d\theta }{2\pi } \end{aligned}$$

for all bounded and continuous functions \(\phi \). Notice that when \(\nu _1\) and \(\nu _2\) have densities \(g_1\) and \(g_2\) with respect to the Lebesgue measure, then \(B[\nu _1, \nu _2]\), also denoted by \(B[g_1,g_2]\), satisfies

$$\begin{aligned} B[g_1,g_2](v) = \int _0^{2\pi }\int _\mathbb {R}g_1(v') g_2(v_*') dv_* \frac{d\theta }{2\pi }. \end{aligned}$$
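As a quick consistency check (not spelled out above), note that \(\gamma \) is a fixed point of B: if X and Y are independent with common density \(\gamma \), then for every fixed \(\theta \) the combination \(X\cos \theta + Y\sin \theta \) is again a centered Gaussian, with variance

$$\begin{aligned} T\cos ^2\theta + T\sin ^2\theta = T, \end{aligned}$$

so \(B[\gamma ,\gamma ] = \gamma \). In particular, both brackets in (4) vanish at \(f_t = \gamma \), in agreement with \(\gamma \) being the equilibrium density mentioned in Sect. 1.2.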

We note that (4) is equivalent to

$$\begin{aligned} \frac{d f_t}{dt} = 2\lambda (B[f_t,f_t] - f_t) + \mu (B[f_t,\gamma ] - f_t). \end{aligned}$$

This motivates the following notion of weak solution:

Definition 3

A function \(f \in C([0,\infty ), \mathcal {M})\) is a weak solution to (4) with initial condition \(f_0\) if, for all \(t\ge 0\), we have

$$\begin{aligned} f_t = f_0 + \int _0^t \{ 2\lambda (B[f_s, f_s] -f_s) + \mu (B[f_s, \gamma ] - f_s) \} ds. \end{aligned}$$
(6)

We summarize some of the useful properties of the mapping B in the following lemma, which we state without proof.

Lemma 4

  1. (i)

    Monotonicity: If \(\nu _1, \nu _2, \pi _1\), and \(\pi _2\) in \(\mathcal {M}\) are such that

$$\begin{aligned} \nu _1(A) \ge \pi _1(A) \text { and } \nu _2(A) \ge \pi _2(A) \quad \forall A \text { measurable}, \end{aligned}$$

    then

$$\begin{aligned} B[\nu _1, \nu _2](A) \ge B[\pi _1, \pi _2](A), \quad \forall A \text { measurable}. \end{aligned}$$
  2. (ii)

Norm: for all \(\nu _1, \nu _2 \in \mathcal {M}\), it holds

    $$\begin{aligned} \Vert B[\nu _1, \nu _2] \Vert = \Vert \nu _1 \Vert \Vert \nu _2 \Vert . \end{aligned}$$

    If \(\nu _1\) and \(\nu _2\) are bounded, signed, Borel measures, then

    $$\begin{aligned} \Vert B[\nu _1, \nu _2] \Vert \le \Vert \nu _1 \Vert \Vert \nu _2 \Vert . \end{aligned}$$
  3. (iii)

    Second moments and arbitrary moments: If \(\nu _1\) and \(\nu _2\) in \(\mathcal {M}\) have finite second moments \(e_1\) and \(e_2\) respectively, then

$$\begin{aligned} \int _\mathbb {R}x^2 B[\nu _1, \nu _2] (dx) = \frac{ e_1 + e_2 }{2}. \end{aligned}$$

    If \(\nu _1\) and \(\nu _2\) have finite \(r^{\text {th}}\) moments \(n_r\) and \(m_r\) for some \(r > 0\), then

    $$\begin{aligned} \int _\mathbb {R}\vert x\vert ^r B[\nu _1, \nu _2](dx) \le 2^{\max \{\frac{r}{2},1\}} \frac{n_r + m_r}{2} \int _0^{2\pi } | \cos \theta |^r \frac{d\theta }{2\pi }. \end{aligned}$$

We are now ready to state and prove our well-posedness result:

Theorem 5

(well-posedness) For every probability measure \(f_0 \in \mathcal {M}\), there is a unique weak solution f to (6), and \(f_t\) is a probability measure for every t. If \(f_0\) has a density or a finite \(r^{\text {th}}\) moment for some \(r\ge 2\), then so does \(f_t\) for all t.

Proof

We will use the following equivalent form of (6):

$$\begin{aligned} f_t = e^{-(2\lambda +\mu ) t} f_0 + \int _0^t e^{-(2\lambda +\mu )(t-s)}\left( 2\lambda B[f_s, f_s] + \mu B[f_s, \gamma ] \right) ds. \end{aligned}$$
(7)

We use the iterative construction in [20]. Let \(f_0\) be a Borel probability measure on \(\mathbb {R}\). Define the sub-probability measures \(( u^n_t)_{n=0}^\infty \) inductively by

$$\begin{aligned} u^0_t= & {} e^{-(2\lambda +\mu )t} f_0, \nonumber \\ u^{n+1}_t= & {} e^{-(2\lambda +\mu ) t} f_0 + \int _0^t e^{-(2\lambda +\mu )(t-s)} \left( \mu B[ u^n_s, \gamma ] + 2\lambda B[u^n_s, u^n_s] \right) ds \end{aligned}$$
(8)

Using Lemma 4 we see that \(u^n_t\) is continuous in t for each n, that \(u^n_t(\mathbb {R}) \le 1\), and that \((u^n_t)_n\) is increasing in n. Hence, for each t, \((u^n_t)_n\) converges to some element \(u_t\) in \(\mathcal {M}\) and \(u_t(\mathbb {R}) \le 1\). Note that \(u_t-u_t^n\) is a non-negative measure for each t, thus we have convergence in total variation, since

$$\begin{aligned} \lim _{n\rightarrow \infty } \Vert u^n_t - u_t \Vert = \lim _{n\rightarrow \infty } u_t(\mathbb {R}) - u^n_t(\mathbb {R}) = 0. \end{aligned}$$

This, together with Lemma 4, implies that

$$\begin{aligned} \lim _{n\rightarrow \infty } \Vert B[u^n_t, \gamma ] - B[u_t, \gamma ] \Vert \le \lim _{n\rightarrow \infty } \Vert u^n_t - u_t \Vert = 0, \end{aligned}$$

and

$$\begin{aligned} \lim _{n\rightarrow \infty } \Vert B[u^n_t, u^n_t] - B[u_t, u_t] \Vert \le \lim _{n\rightarrow \infty } \Vert u^n_t - u_t \Vert (u^n_t(\mathbb {R}) + u_t(\mathbb {R})) = 0. \end{aligned}$$

Thus we can take the limit \(n\rightarrow \infty \) in (8) and establish that \(u_t\) solves (7). Being an increasing limit of continuous functions, \(u: [0,\infty ) \rightarrow \mathcal {M}\) is lower semi-continuous, and thus measurable. Since \(u_t(\mathbb {R}) \le 1\) for all t, u belongs to \(L^\infty ( [0,\infty ), \mathcal {M})\). To show that \(u_t\) is continuous in t, we note that it equals

$$\begin{aligned} e^{-(2\lambda +\mu ) t} f_0 + e^{-(2\lambda +\mu ) t} \int _0^t e^{(2\lambda +\mu )s} \left( \mu B[ u_s, \gamma ] + 2\lambda B[u_s, u_s] \right) ds \end{aligned}$$

and the integrand above is in the Bochner space \(L^1([0,\tau ], \mathcal {M})\) for all \(\tau \). This makes \(u_t\) continuous. A consequence of this continuity is that \(h(t)= u_t(\mathbb {R})\) is differentiable and satisfies the differential equation

$$\begin{aligned} h'(t) = -(2\lambda +\mu ) h(t) + \mu h(t) + 2\lambda h(t)^2 \end{aligned}$$

Since the constant function 1 solves this equation and \(h(0)=1\), uniqueness for this ODE gives \(h(t) \equiv 1\). Hence, \(u_t\) is a probability measure for all t. To show the uniqueness of \(u_t\), let \(g_t \in C([0,\infty ), \mathcal {M})\) satisfy (7). On one hand, \(g_t\ge u^0_t\) by definition, and thus, by induction, \(g_t \ge u^n_t\) for all n and a.e. t. By the monotone convergence theorem, we have

$$\begin{aligned} g_t(A) \ge u_t(A) \end{aligned}$$

for every measurable set A. On the other hand, using Lemma 4, for each t we obtain

$$\begin{aligned} \int _0^t e^{-(2\lambda +\mu )(t-s)} \Vert \mu B[g_s,\gamma ] + 2\lambda B[g_s, g_s] \Vert ds \le \mu \sqrt{t} \left( \int _0^t \Vert g_s \Vert ^2 ds\right) ^{\frac{1}{2}} + 2\lambda \int _0^t \Vert g_s\Vert ^2 ds \end{aligned}$$

which shows that \(g_t\) is continuous in t, and just like \(u_t\), must be a probability measure for all t. Thus, \(\Vert g_t - u_t \Vert = g_t(\mathbb {R})- u_t(\mathbb {R}) = 1 - 1 =0\).

To prove the last statement of the theorem, we note that if \(f_0 \in L^1(\mathbb {R})\), then \(u_t^n \in L^1(\mathbb {R})\) for all n and t, and we use the completeness of \(L^1\) under the total variation norm. If \(f_0\) has a finite \(r^{\text {th}}\) moment for some \(r>0\), then by Lemma 4 and induction, we see that, for each t, \((\int _\mathbb {R}u_t^n(dv) \vert v\vert ^r)_n\) is finite, monotone increasing, and bounded above by the solution R(t) to the following integral equation:

$$\begin{aligned} R(t) = e^{-(2\lambda + \mu ) t} \int _\mathbb {R}\vert v\vert ^r f_0(dv) + C_r\int _0^t e^{-(2\lambda +\mu )(t-s)} (2\lambda + \frac{\mu }{2})R(s) ds + C_r \int _\mathbb {R}\vert w\vert ^r \gamma (dw). \end{aligned}$$

Here \(C_r = 2^{\max \{\frac{r}{2},1\}}\int _0^{2\pi } \vert \cos \theta \vert ^r \frac{d\theta }{2\pi }\) is as in Lemma 4. R(t) is finite due to Gronwall’s inequality. The monotone convergence theorem implies that R(t) controls the \(r^{\text {th}}\) moment of \(f_t\). \(\square \)

It is straightforward to verify that, if \(\int _\mathbb {R}v^2 f_0(dv)<\infty \), then we have

$$\begin{aligned} \int _\mathbb {R}v^2 f_t(dv) = \left( \int _\mathbb {R}v^2 f_0(dv) \right) e^{-\frac{1}{2} \mu t } + T (1-e^{-\frac{1}{2} \mu t }). \end{aligned}$$
(9)
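For the reader's convenience, here is the short computation behind (9): testing (6) with \(\phi (v)=v^2\), writing \(e(t) = \int _\mathbb {R}v^2 f_t(dv)\), and using Lemma 4 (iii) (which gives \(\int v^2 B[f_t,f_t](dv) = e(t)\) and \(\int v^2 B[f_t,\gamma ](dv) = \frac{e(t)+T}{2}\)), one finds

$$\begin{aligned} e'(t) = 2\lambda \big ( e(t) - e(t) \big ) + \mu \Big ( \frac{e(t)+T}{2} - e(t) \Big ) = -\frac{\mu }{2} \big ( e(t) - T \big ), \end{aligned}$$

and the unique solution with \(e(0) = \int _\mathbb {R}v^2 f_0(dv)\) is exactly (9).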

Remark 6

The uniqueness of the solution to (6) holds in the larger space \(L^2_\text {loc}([0, \infty ), \mathcal {M})\), provided we identify functions \(f_t\) that agree for almost every t.

3 Coupling Construction

3.1 Particle System

We provide an explicit construction of the particle system using an SDE, following [10]. To this end, for fixed \(N\in \mathbb {N}\), let \(\mathcal {R}(dt,d\theta ,d\xi ,d\zeta )\) be a Poisson point measure on \([0,\infty )\times [0,2\pi )\times [0,N)^2\) with intensity

$$\begin{aligned} N\lambda dt \frac{d\theta }{2\pi } \frac{d\xi d\zeta \mathbf {1}_{\{\mathbf {i}(\xi ) \ne \mathbf {i}(\zeta )\}}}{N(N-1)} = \frac{\lambda dt d\theta d\xi d\zeta \mathbf {1}_{\{\mathbf {i}(\xi ) \ne \mathbf {i}(\zeta )\}}}{2\pi (N-1)}, \end{aligned}$$

where \(\mathbf {i}\) is the function that associates to a variable \(\xi \in [0,N)\) the discrete index \(\mathbf {i}(\xi ) = \lfloor \xi \rfloor +1 \in \{1,\ldots ,N\}\). In words: at rate \(N\lambda \), the measure \(\mathcal {R}\) selects collision times \(t\ge 0\), and for each such time, it independently samples a parameter \(\theta \) uniformly at random on \([0,2\pi )\), and a pair \((\xi ,\zeta ) \in [0,N)^2\) such that \(\mathbf {i}(\xi ) \ne \mathbf {i}(\zeta )\), also uniformly. The pair \((\mathbf {i}(\xi ),\mathbf {i}(\zeta ))\) provides the indices of the particles involved in Kac-type collisions. The fact that we use continuous variables \(\xi ,\zeta \in [0,N)\), instead of discrete indices in \(\{1,\ldots ,N\}\), will be crucial to define our coupling with a collection of Boltzmann processes.

Let \(\mathcal {Q}_1(dt, d\theta , dw),\ldots ,\mathcal {Q}_N(dt, d\theta , dw)\) be a collection of independent Poisson point measures on \([0,\infty )\times [0,2\pi ) \times \mathbb {R}\), also independent of \(\mathcal {R}\), each having intensity \(\mu dt \frac{d\theta }{2\pi } \gamma (dw)\). Finally, let \(\mathbf {V}_0 = (V_0^1,\ldots ,V_0^N)\) be an exchangeable collection of random variables with \(\text {Law}(\mathbf {V}_0) = f_0^N\), independent of everything else.

The particle system \(\mathbf {V}_t = (V_t^1,\ldots ,V_t^N)\) is defined as the unique jump-by-jump solution of the SDE

$$\begin{aligned} \begin{aligned} d\mathbf {V}_t&= \int _0^{2\pi } \int _{[0,N)^2} \sum _{i,j=1, i\ne j}^N [ {\mathbf {a}}_{ij}(\mathbf {V}_{t^{-}}, \theta ) - \mathbf {V}_{t^{-}} ] \mathbf {1}_{\{ \mathbf {i}(\xi )=i,\mathbf {i}(\zeta )=j\}} \mathcal {R}(dt,d\theta ,d\xi ,d\zeta ) \\&\quad + \sum _{i=1}^N \int _0^{2\pi } \int _\mathbb {R}[{\mathbf {b}}_i(\mathbf {V}_{t^{-}},\theta , w) - \mathbf {V}_{t^{-}}] \mathcal {Q}_i(dt, d\theta , dw) \end{aligned} \end{aligned}$$
(10)

that starts at \(\mathbf {V}_0\). Here, for \(\mathbf {v}\in \mathbb {R}^N\), the vectors \({\mathbf {a}}_{ij}(\mathbf {v},\theta )\in \mathbb {R}^N\) and \({\mathbf {b}}_i(\mathbf {v},\theta ,w)\in \mathbb {R}^N\) are defined as

$$\begin{aligned} {{\mathbf {a}}}_{ij}(\mathbf {v},\theta )^k&= {\left\{ \begin{array}{ll} v^i \cos \theta - v^j \sin \theta &{} \text {if } k=i, \\ v^i \sin \theta + v^j \cos \theta &{} \text {if } k=j, \\ v^k &{} \text {otherwise}, \end{array}\right. } \\ {{\mathbf {b}}}_i(\mathbf {v},\theta ,w)^k&= {\left\{ \begin{array}{ll} v^i\cos \theta - w \sin \theta &{} \text {if } k=i, \\ v^k &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

For any \(i=1,\ldots ,N\), from (10) it follows that particle \(V_t^i\) satisfies the SDE

$$\begin{aligned} \begin{aligned} dV_t^i&= \int _0^{2\pi } \int _0^N [V_{t^{-}}^i \cos \theta - V_{t^{-}}^{\mathbf {i}(\xi )}\sin \theta - V_{t^{-}}^i] \mathcal {P}_i(dt,d\theta ,d\xi ) \\&\quad + \int _0^{2\pi } \int _\mathbb {R}[V_{t^{-}}^i \cos \theta - w\sin \theta - V_{t^{-}}^i] \mathcal {Q}_i(dt, d\theta ,dw), \end{aligned} \end{aligned}$$
(11)

where \(\mathcal {P}_i\) is defined as

$$\begin{aligned} \mathcal {P}_i(dt,d\theta ,d\xi ) = \mathcal {R}(dt,d\theta ,[i-1,i), d\xi ) + \mathcal {R}(dt,-d\theta , d\xi , [i-1,i)), \end{aligned}$$

and where we use \(-d\theta \) to transform \(\sin \theta \) into \(-\sin \theta \). Clearly, \(\mathcal {P}_i\) is a Poisson point measure on \([0,\infty )\times [0,2\pi )\times [0,N)\) with intensity

$$\begin{aligned} 2 \lambda dt \frac{d\theta }{2\pi } \frac{d\xi \mathbf {1}_{\{\mathbf {i}(\xi )\ne i\}}}{N-1}. \end{aligned}$$

As mentioned earlier, \(f_t^N = \text {Law}(\mathbf {V}_t)\) converges exponentially fast to the Gaussian density \(\gamma ^{\otimes N}\) in relative entropy, as shown in [4, Theorem 3]. Similarly, the following result provides equilibration in \(W_2\), which does not require \(f_t^N\) to have a density:

Lemma 7

(contraction and equilibration for the particle system) Let \(f_t^N\) and \(\tilde{f}_t^N\) be the laws of the thermostated Kac N-particle systems starting from (possibly different) symmetric initial distributions \(f_0^N\) and \(\tilde{f}_0^N\), respectively. Then

$$\begin{aligned} W_2^2(f_t^N , \tilde{f}_t^N) \le e^{-\frac{\mu }{2} t} W_2^2(f_0^N , \tilde{f}_0^N). \end{aligned}$$

Consequently, taking \(\tilde{f}_0^N\) to be the stationary distribution \(\gamma ^{\otimes N}\) gives

$$\begin{aligned} W_2^2(f_t^N , \gamma ^{\otimes N}) \le e^{-\frac{\mu }{2} t} W_2^2(f_0^N , \gamma ^{\otimes N}). \end{aligned}$$

Proof

Let \((\mathbf {V}_t)_{t\ge 0}\) and \(({\tilde{\mathbf {V}}}_t)_{t\ge 0}\) be the solutions to the SDE (10) with respect to the same Poisson point measures \(\mathcal {R}\), \(\mathcal {Q}_1,\ldots ,\mathcal {Q}_N\), but starting from initial conditions \((\mathbf {V}_0,{\tilde{\mathbf {V}}}_0)\) which we take as an optimal coupling between \(f_0^N\) and \(\tilde{f}_0^N\). Set \(h(t) = \mathbb {E}[(V_t^1 - \tilde{V}_t^1)^2]\); then \(W_2^2(f_t^N,\tilde{f}_t^N) \le \mathbb {E}[\frac{1}{N} \sum _i (V_t^i - \tilde{V}_t^i)^2 ] = h(t)\) by exchangeability, with equality at \(t=0\). Thus, it suffices to study h(t). Since both \(V_t^1\) and \(\tilde{V}_t^1\) satisfy (11) with \(i=1\), when computing the increments of \((V_t^1-\tilde{V}_t^1)^2\) the terms \(w \sin \theta \) cancel, and we obtain

$$\begin{aligned} \nonumber h'(t)&= 2\lambda \mathbb {E}\int _0^{2\pi } \int _1^N \left[ (V_t^1 \cos \theta - V_t^{\mathbf {i}(\xi )}\sin \theta - \tilde{V}_t^1 \cos \theta + \tilde{V}_t^{\mathbf {i}(\xi )} \sin \theta )^2 \right. \nonumber \\&\qquad \left. - (V_t^1 - \tilde{V}_t^1)^2 \right] \frac{d\theta d\xi }{2\pi (N-1)} \nonumber \\&\qquad + \mu \mathbb {E}\int _0^{2\pi } \int _\mathbb {R}\left[ (V_t^1 \cos \theta - \tilde{V}_t^1 \cos \theta )^2 - (V_t^1 - \tilde{V}_t^1)^2 \right] \frac{d\theta \gamma (dw)}{2\pi } \nonumber \\&= 2\lambda \mathbb {E}\int _1^N \left[ -\frac{1}{2}(V_t^1 - \tilde{V}_t^1)^2 + \frac{1}{2}(V_t^{\mathbf {i}(\xi )} - \tilde{V}_t^{\mathbf {i}(\xi )})^2 \right] \frac{d\xi }{ N-1} - \frac{\mu }{2} h(t), \end{aligned}$$
(12)

where we used that \(\int _0^{2\pi } \cos ^2\theta \frac{d\theta }{2\pi } = \frac{1}{2} = \int _0^{2\pi } \sin ^2\theta \frac{d\theta }{2\pi }\) and \(\int _0^{2\pi } \cos \theta \sin \theta \frac{d\theta }{2\pi } = 0\). Notice that

$$\begin{aligned} \mathbb {E}\int _1^N (V_t^{\mathbf {i}(\xi )} - \tilde{V}_t^{\mathbf {i}(\xi )})^2 \frac{d\xi }{N-1} = \mathbb {E}\frac{1}{N-1} \sum _{i=2}^N (V_t^i - \tilde{V}_t^i)^2 = h(t), \end{aligned}$$

thus the first term in (12) vanishes, which then gives \(h'(t) = -\frac{\mu }{2} h(t)\). The desired bound follows. \(\square \)

3.2 Coupling with Boltzmann Processes

For a given probability measure \(f_0\), let \((f_t)_{t\ge 0}\) be the unique weak solution of (4) given by Theorem 5. We will now construct a stochastic process \((Z_t)_{t\ge 0}\), called the Boltzmann process, such that \(\text {Law}(Z_t) = f_t\) for all \(t\ge 0\). This process is the probabilistic counterpart of (4), and it represents the trajectory of a single particle immersed in the infinite population. It was first introduced by Tanaka [21] in the context of the Boltzmann equation for Maxwell molecules.

Consider a Poisson point measure \(\mathcal {P}(dt,d\theta ,dz)\) on \([0,\infty ) \times [0,2\pi ) \times \mathbb {R}\) with intensity \(2\lambda dt \frac{d\theta }{2\pi } f_t(dz)\), and an independent Poisson point measure \(\mathcal {Q}(dt,d\theta ,dw)\) on \([0,\infty ) \times [0,2\pi ) \times \mathbb {R}\) with intensity \(\mu dt \frac{d\theta }{2\pi } \gamma (dw)\). Consider also a random variable \(Z_0\) with law \(f_0\), independent of \(\mathcal {P}\) and \(\mathcal {Q}\). The process \(Z_t\) is defined as the unique solution, starting from \(Z_0\), to the stochastic differential equation

$$\begin{aligned} \begin{aligned} dZ_t&= \int _0^{2\pi } \int _\mathbb {R}[Z_{t^{-}} \cos \theta - z\sin \theta - Z_{t^{-}}] \mathcal {P}(dt,d\theta ,dz) \\&\quad {} + \int _0^{2\pi } \int _\mathbb {R}[Z_{t^{-}}\cos \theta - w\sin \theta - Z_{t^{-}}] \mathcal {Q}(dt,d\theta ,dw). \end{aligned} \end{aligned}$$
(13)

Strong existence and uniqueness of solutions for this SDE is straightforward, since the rates of \(\mathcal {P}\) and \(\mathcal {Q}\) are finite on bounded time intervals. To show that \(\text {Law}(Z_t) = f_t\), the argument is classical: one first shows that \(\ell _t {:}{=} \text {Law}(Z_t)\) solves

$$\begin{aligned} \ell _t = f_0 + \int _0^t \{ 2 \lambda (B[\ell _s, f_s] - \ell _s) + \mu ( B[\ell _s, \gamma ] - \ell _s)\} ds, \end{aligned}$$

which is a linearized version of (6). This equation has a unique solution in the space \(C([0, \infty ), \mathcal {M})\) because the mapping \(\nu \mapsto B[\nu ,f_t]\) is non-expanding in total variation for all t. Since \(f_t\) is a solution of this linearized version, we must have that \(\ell _t = f_t\).

Since \(\text {Law}(Z_t) = f_t\), we can thus use the Boltzmann process as a tool to prove properties of the solution of the thermostated Boltzmann–Kac equation (4). For instance, we have the following lemma, which will be needed later to prove our uniform-in-time propagation of chaos result.

Lemma 8

(propagation of moments) Let \((f_t)_{t\ge 0}\) be the weak solution to (4). Let \(r\ge 2\), and assume that \(\int _\mathbb {R}\vert v \vert ^r f_0(dv)< \infty \). Then \(\sup _{t \ge 0} \int _\mathbb {R}\vert v\vert ^r f_t(dv)<\infty \).

Proof

The case \(r=2\) follows from (9), so we assume \(r>2\). Let \((Z_t)_{t\ge 0}\) be the Boltzmann process, i.e., the solution to (13). Let \(h(t) = \mathbb {E}\vert Z_t \vert ^r = \int _\mathbb {R}\vert v\vert ^r f_t(dv)\). We know from Theorem 5 that \(h(t)<\infty \) for all t. Then h(t) satisfies

$$\begin{aligned} h'(t)&= 2\lambda \mathbb {E}\int _{0}^{2\pi } \frac{d\theta }{2\pi } \int _{\mathbb {R}} f_t(dz) \left( \vert Z_{t^-} \cos \theta - z \sin \theta \vert ^r - \vert Z_{t}\vert ^r \right) \\&\quad + \mu \mathbb {E}\int _{0}^{2\pi } \frac{d\theta }{2\pi } \int _\mathbb {R}\gamma (dw)\left( \vert Z_{t^-} \cos \theta - w \sin \theta \vert ^r - \vert Z_{t}\vert ^r \right) . \end{aligned}$$

Note that \(\mathbb {E}|Z_t|^{r-1} \le h(t)^{1-1/r}\) and \(\mathbb {E}|Z_t| \le \max \{T, \int _\mathbb {R}v^2 f_0(dv)\}^{1/2}\), thanks to (9) and Jensen’s inequality. Using the inequality \((a+b)^r \le a^r + b^r + 2^{r-1} (a b^{r-1} +a^{r-1} b)\) valid for \(a,b \ge 0\), we thus obtain

$$\begin{aligned} h'(t) \le -C_1 h(t) + C_2 + C_3 h(t)^{1-1/r}, \end{aligned}$$
(14)

where

$$\begin{aligned} C_1 = 2\lambda \left( 1 - 2\int _0^{2\pi } \vert \cos \theta \vert ^r \frac{d\theta }{2\pi }\right) + \mu \left( 1 - \int _0^{2\pi } \vert \cos \theta \vert ^r \frac{d\theta }{2\pi } \right) >0 , \end{aligned}$$

and \(C_2, C_3>0\) are constants depending on \(\lambda \), \(\mu \), r, T, \(\int _\mathbb {R}v^2 f_0(dv)\), and some moments of \(\gamma \) of order at most r. The statement follows from (14). \(\square \)

The Boltzmann process (13) is particularly useful in coupling arguments, as the next result shows. It provides contraction for the thermostated Boltzmann–Kac equation in \(W_2\)-distance:

Lemma 9

(contraction and equilibration for the thermostated Boltzmann–Kac equation) Let \(f_t\), \(\tilde{f}_t\) be the weak solutions to (4) starting from some possibly different probability measures \(f_0\), \(\tilde{f}_0\). Then

$$\begin{aligned} W_2^2(f_t,\tilde{f}_t) \le e^{-\frac{\mu }{2}t} W_2^2(f_0,\tilde{f}_0). \end{aligned}$$

Consequently, taking \(\tilde{f}_0 = \gamma \) gives

$$\begin{aligned} W_2^2(f_t,\gamma ) \le e^{-\frac{\mu }{2}t} W_2^2(f_0, \gamma ). \end{aligned}$$

Proof

For all \(t\ge 0\), let \(\Pi _t\) be an optimal coupling between \(f_t\) and \(\tilde{f}_t\), that is, \(\Pi _t\) is a probability measure on \(\mathbb {R}\times \mathbb {R}\) such that \(\int (z-\tilde{z})^2 \Pi _t(dz,d\tilde{z}) = W_2^2(f_t,\tilde{f}_t)\). Let \({\mathcal {S}}(dt,d\theta ,dz,d\tilde{z})\) be a Poisson point measure on \([0,\infty )\times [0,2\pi )\times \mathbb {R}\times \mathbb {R}\) with intensity \(2\lambda dt \frac{d\theta }{2\pi }\Pi _t(dz,d\tilde{z})\), and define \(\mathcal {P}(dt,d\theta ,dz) = {\mathcal {S}}(dt,d\theta ,dz,\mathbb {R})\) and \({\tilde{\mathcal {P}}}(dt,d\theta ,d\tilde{z}) = {\mathcal {S}}(dt,d\theta ,\mathbb {R},d\tilde{z})\). In words, \(\mathcal {P}\) and \({\tilde{\mathcal {P}}}\) are Poisson point measures, with intensities \(2\lambda dt \frac{d\theta }{2\pi } f_t(dz)\) and \(2\lambda dt \frac{d\theta }{2\pi } \tilde{f}_t(d\tilde{z})\) respectively, which have the same atoms in the t and \(\theta \) variables, and with optimally-coupled realizations of \(f_t\) and \(\tilde{f}_t\) on the z and \(\tilde{z}\) variables. Also, let \(\mathcal {Q}(dt,d\theta ,dw)\) be a Poisson point measure with intensity \(\mu dt \frac{d\theta }{2\pi } \gamma (dw)\) that is independent of \({\mathcal {S}}\), and set \({\tilde{\mathcal {Q}}} = \mathcal {Q}\). Let also \((Z_0, \tilde{Z}_0)\) be a realization of \(\Pi _0\), independent of everything else; in particular we have \(\mathbb {E}[(Z_0-\tilde{Z}_0)^2] = W_2^2(f_0,\tilde{f}_0)\).

Let \(Z_t\) and \(\tilde{Z}_t\) be the solutions to the SDE (13) with respect to \((\mathcal {P},\mathcal {Q})\) and \(({\tilde{\mathcal {P}}},{\tilde{\mathcal {Q}}})\), respectively, thus \(\text {Law}(Z_t) = f_t\) and \(\text {Law}(\tilde{Z}_t) = \tilde{f}_t\). Consequently, we have \(W_2^2(f_t,\tilde{f}_t) \le \mathbb {E}[(Z_t - \tilde{Z}_t)^2] =: h(t)\). Using Itô calculus, we have:

$$\begin{aligned} h'(t)&= 2\lambda \mathbb {E}\int _0^{2\pi } \int _{\mathbb {R}\times \mathbb {R}} [(Z_t\cos \theta - z\sin \theta - \tilde{Z}_t\cos \theta + \tilde{z}\sin \theta )^2 - (Z_t-\tilde{Z}_t)^2] \Pi _t(dz,d\tilde{z}) \frac{d\theta }{2\pi } \\&\quad {} + \mu \mathbb {E}\int _0^{2\pi } \int _\mathbb {R}[(Z_t \cos \theta - w\sin \theta - \tilde{Z}_t \cos \theta + w \sin \theta )^2 - (Z_t-\tilde{Z}_t)^2] \gamma (dw) \frac{d\theta }{2\pi } \\&= 2\lambda \mathbb {E}\int _0^{2\pi } \int _{\mathbb {R}\times \mathbb {R}} [(\cos ^2\theta - 1)(Z_t-\tilde{Z}_t)^2 + (z-\tilde{z})^2\sin ^2\theta ] \frac{d\theta }{2\pi } \Pi _t(dz,d\tilde{z}) - \frac{\mu }{2} h(t), \end{aligned}$$

where in the last step the cross term vanished because \(\int _0^{2\pi } \cos \theta \sin \theta d\theta = 0\). Since \(\int (z-\tilde{z})^2 \Pi _t(dz,d\tilde{z}) = W_2^2(f_t,\tilde{f}_t) \le h(t)\), the integral in the last line is bounded above by 0. We thus obtain \(h'(t) \le - \frac{\mu }{2} h(t)\), which yields the result. \(\square \)

We now specify the coupling construction that will allow us to prove our main result. We closely follow [10], see also [9]. The key idea is to define a system \(\mathbf {Z}_t = (Z_t^1,\ldots ,Z_t^N)\) of Boltzmann processes such that, for each \(i=1,\ldots ,N\), the process \(Z_t^i\) mimics as closely as possible the dynamics of particle \(V_t^i\). Comparing (11) and (13), we see that a way of achieving this is to define \(Z_t^i\) as the solution of (11), but replacing \(V_{t^{-}}^{\mathbf {i}(\xi )}\), which is a \(\xi \)-realization of the (random) empirical measure \(\frac{1}{N-1} \sum _{j\ne i} \delta _{V_{t^{-}}^j}\), with a \(\xi \)-realization of \(f_t\). Moreover, we will do this in an optimal way.

Specifically: we define \(Z_t^i\) as the unique jump-by-jump solution to

$$\begin{aligned} \begin{aligned} dZ_t^i&= \int _0^{2\pi } \int _0^N [Z_{t^{-}}^i \cos \theta - F_t^i(\mathbf {Z}_{t^{-}},\xi )\sin \theta - Z_{t^{-}}^i] \mathcal {P}_i(dt,d\theta ,d\xi ) \\&\quad {} + \int _0^{2\pi } \int _\mathbb {R}[Z_{t^{-}}^i \cos \theta - w\sin \theta - Z_{t^{-}}^i] \mathcal {Q}_i(dt, d\theta ,dw), \end{aligned} \end{aligned}$$
(15)

where we have used the same Poisson point measures \(\mathcal {P}_i\) and \(\mathcal {Q}_i\) as in (11). Here, \(F^i\) is a measurable function \([0,\infty ) \times \mathbb {R}^N \times [0,N) \ni (t,\mathbf {z},\xi ) \mapsto F_t^i(\mathbf {z},\xi ) \in \mathbb {R}\) with the following property: for any \(t\ge 0\), \(\mathbf {z}\in \mathbb {R}^N\), and any random variable U uniformly distributed on the set \([0,N) \backslash [i-1,i)\), the pair \((z^{\mathbf {i}(U)}, F_t^i(\mathbf {z}, U))\) is an optimal coupling between the empirical measure \({\bar{\mathbf {z}}}^i {:}{=}\frac{1}{N-1} \sum _{j\ne i} \delta _{z^j}\) and \(f_t\). In other words,

$$\begin{aligned} \int _0^N \left( z^{\mathbf {i}(\xi )} - F_t^i(\mathbf {z},\xi ) \right) ^2 \frac{d\xi \mathbf {1}_{\{\mathbf {i}(\xi )\ne i\}}}{N-1} = W_2^2({\bar{\mathbf {z}}}^i, f_t). \end{aligned}$$
(16)

(The values of \(F_t^i(\mathbf {z},\xi )\) for \(\xi \in [i-1,i)\) are irrelevant.) We refer the reader to [10, Lemma 3] for a proof of existence of such a function. The same result also ensures that \(F_t^i\) satisfies the following: for any exchangeable random vector \({\mathbf {X}}\) in \(\mathbb {R}^N\), and any measurable function \(\phi \), one has for \(j\ne i\)

$$\begin{aligned} \mathbb {E}\int _{j-1}^j \phi (F_t^i({\mathbf {X}},\xi )) d\xi = \int _\mathbb {R}\phi (v) f_t(dv). \end{aligned}$$
(17)
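Since all the measures here are one-dimensional, a concrete choice of \(F_t^i\) satisfying (16) is the comonotone (quantile) coupling: the portion of \([0,N)\setminus [i-1,i)\) that selects the particle of rank \(\ell \) among \((z^j)_{j\ne i}\) is sent to the \(\ell \)-th quantile block of \(f_t\). The following sketch (ours, purely for illustration) implements this choice under the simplifying assumption that the quantile function of \(f_t\) is available, which is not required by the general construction of [10, Lemma 3].

```python
import numpy as np

def F_quantile(z_other, xi, quantile_f):
    """A concrete 1D realization of the coupling map F: xi is uniform on [0, m)
    (the coordinates z^j, j != i, reindexed by 0,...,m-1 with m = N-1), and the
    selected particle z_other[floor(xi)] is paired with the matching quantile
    block of f_t, so that the pair is comonotone and hence W_2-optimal.

    Sketch under the assumption that quantile_f (the inverse CDF of f_t) is
    given; names and interface are ours, not from the paper."""
    z_other = np.asarray(z_other, dtype=float)
    m = len(z_other)
    j = int(np.floor(xi))                              # selected particle, 0-based
    rank = int(np.argsort(np.argsort(z_other))[j])     # rank of z_other[j] among the m particles
    u = (rank + (xi - j)) / m                          # uniform position inside the rank-th block
    return quantile_f(u)

# Example with f_t = N(0, 1), using the Gaussian quantile function from scipy:
# from scipy.stats import norm
# rng = np.random.default_rng(0)
# z = rng.standard_normal(99)            # the coordinates z^j, j != i, for N = 100
# xi = rng.uniform(0, 99)
# coupled_pair = (z[int(xi)], F_quantile(z, xi, norm.ppf))
```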

We take an initial condition \(\mathbf {Z}_0 = (Z_0^1,\ldots ,Z_0^N)\) with distribution \(f_0^{\otimes N}\) and optimally coupled to \(\mathbf {V}_0\), thus

$$\begin{aligned} \mathbb {E}[ (V_0^1 - Z_0^1)^2 ] = \mathbb {E}\left[ \frac{1}{N} \sum _{i=1}^N (V_0^i - Z_0^i)^2 \right] = W_2^2( f_0^N, f_0^{\otimes N}), \end{aligned}$$
(18)

by exchangeability. We have thus defined a collection \(\mathbf {Z}_t = (Z_t^1,\ldots ,Z_t^N)\), where each \(Z_t^i\) is a Boltzmann process by construction; in particular, we have \(\text {Law}(Z_t^i) = f_t\). However, notice that \(Z_t^i\) and \(Z_t^j\) have a simultaneous jump whenever \(V_t^i\) and \(V_t^j\) undergo a Kac collision, which implies that \(Z_t^i\) and \(Z_t^j\) are not independent. In order for this construction to be useful, one needs to prove that these Boltzmann processes become asymptotically independent as \(N\rightarrow \infty \), as is done in [9, 10]. This is the content of the following lemma, which moreover provides explicit rates in N, uniformly in time:

Lemma 10

(decoupling of Boltzmann processes) There exists a constant \(C<\infty \) depending only on \(\lambda \), \(\mu \), T, and \(\int v^2 f_0(dv)\), such that for all fixed \(k\in \mathbb {N}\) we have for all \(t\ge 0\):

$$\begin{aligned} W_2^2\left( \text {Law}(Z_t^1,\ldots ,Z_t^k), f_t^{\otimes k} \right) \le \frac{Ck}{N}. \end{aligned}$$

Proof

The argument is the same as in [10, Lemma 6] and [9, Lemma 3], so we only provide the main steps of the proof here. The idea is to again use a coupling argument: for fixed \(k \le N\), we will define k independent Boltzmann processes \(\tilde{Z}_t^1,\ldots ,\tilde{Z}_t^k\) that remain close to \(Z_t^1,\ldots ,Z_t^k\) in expectation. To achieve this, each \(\tilde{Z}_t^i\) will use the same randomness that defines \(Z_t^i\) (i.e., the SDE (15)), except when \(Z_t^i\) has a simultaneous jump with \(Z_t^j\) for some \(j\in \{1,\ldots ,k\}\), in which case either \(\tilde{Z}_t^i\) or \(\tilde{Z}_t^j\) will not jump. To compensate for the missing jumps, we will use an additional independent source of randomness to define new jumps. Since, in expectation, this concerns only a proportion k/N of the jumps of the collection \(Z_t^1,\ldots ,Z_t^k\), this construction will give the desired estimate.

To this end, let \({\tilde{\mathcal {R}}}\) be an independent copy of the Poisson point measure \(\mathcal {R}\) introduced at the beginning of Sect. 3.1, and for \(i=1,\ldots ,k\), define

$$\begin{aligned} {\tilde{\mathcal {P}}}_i(dt,d\theta ,d\xi )&= \mathcal {R}(dt,d\theta ,[i-1,i), d\xi ) \\&\qquad {} + \mathcal {R}(dt,-d\theta , d\xi , [i-1,i)) \mathbf {1}_{[k,N)}(\xi ) \\&\qquad {} + {\tilde{\mathcal {R}}}(dt, -d\theta , d\xi , [i-1,i)) \mathbf {1}_{[0,k)}(\xi ), \end{aligned}$$

which is a Poisson point measure with intensity \(2\lambda dt d\theta d\xi \mathbf {1}_{\{\mathbf {i}(\xi )\ne i\}}/ [2\pi (N-1)]\), just as \(\mathcal {P}_i\). Note that the Poisson measures \({\tilde{\mathcal {P}}}_1,\ldots ,{\tilde{\mathcal {P}}}_k\) are independent by construction. Mimicking (15), we define \(\tilde{Z}_t^i\) as the solution, starting from \(\tilde{Z}_0^i = Z_0^i\), to the SDE

$$\begin{aligned} \begin{aligned} d\tilde{Z}_t^i&= \int _0^{2\pi } \int _0^N [\tilde{Z}_{t^{-}}^i \cos \theta - F_t^i({\mathbf {Z}}_{t^{-}},\xi )\sin \theta - \tilde{Z}_{t^{-}}^i] {\tilde{\mathcal {P}}}_i(dt,d\theta ,d\xi ) \\&\quad {} + \int _0^{2\pi } \int _\mathbb {R}[\tilde{Z}_{t^{-}}^i \cos \theta - w\sin \theta - \tilde{Z}_{t^{-}}^i] \mathcal {Q}_i(dt, d\theta ,dw). \end{aligned} \end{aligned}$$
(19)

It is clear that \(\tilde{Z}_t^1,\ldots ,\tilde{Z}_t^k\) is an exchangeable collection of Boltzmann processes. Moreover, using the independence of \({\tilde{\mathcal {P}}}_1,\ldots ,{\tilde{\mathcal {P}}}_k\) and the fact that \(F_t^i({\mathbf {z}},\xi )\) has distribution \(f_t\) for any \(\mathbf {z}\in \mathbb {R}^N\) and any \(\xi \) uniformly distributed on \([0,N)\backslash [i-1,i)\), one can prove that the processes \(\tilde{Z}_t^1,\ldots ,\tilde{Z}_t^k\) are independent. For a full proof of this fact in a very similar setting, we refer the reader to [10, Lemma 6].

Call \(h(t) {:}{=} \mathbb {E}[(Z_t^1 - \tilde{Z}_t^1)^2]\). By exchangeability, we have

$$\begin{aligned} W_2^2\left( \text {Law}(Z_t^1,\ldots ,Z_t^k) , f_t^{\otimes k} \right) \le \mathbb {E}\left[ \frac{1}{k} \sum _{i=1}^k (Z_t^i - \tilde{Z}_t^i)^2\right] = h(t), \end{aligned}$$

thus it suffices to obtain the desired estimate for h(t). From (15) and (19), using Itô calculus, we obtain:

$$\begin{aligned} \begin{aligned} h'(t)&= \mathbb {E}\int _0^{2\pi } \int _0^N \Delta _1 \left[ \mathcal {R}(dt,d\theta ,[0,1), d\xi ) + \mathcal {R}(dt,-d\theta ,d\xi ,[0,1))\mathbf {1}_{[k,N)}(\xi )\right] \\&\qquad + \mathbb {E}\int _0^{2\pi } \int _0^N \Delta _2 \mathcal {R}(dt,-d\theta , d\xi ,[0,1))\mathbf {1}_{[0,k)}(\xi ) \\&\qquad + \mathbb {E}\int _0^{2\pi } \int _0^N \Delta _3 {\tilde{\mathcal {R}}}(dt, -d\theta ,d\xi ,[0,1))\mathbf {1}_{[0,k)}(\xi ) \\&\qquad + \mathbb {E}\int _0^{2\pi } \int _\mathbb {R}\Delta _4 \mathcal {Q}_1(dt,d\theta ,dw), \end{aligned} \end{aligned}$$
(20)

where \(\Delta _1\) corresponds to the increment of \((Z_t^1 - \tilde{Z}_t^1)^2\) when \(Z_t^1\) and \(\tilde{Z}_t^1\) have a simultaneous Kac-type jump, \(\Delta _2\) is the increment when only \(Z_t^1\) jumps, \(\Delta _3\) is the increment when only \(\tilde{Z}_t^1\) jumps, and \(\Delta _4\) is the increment when there is a thermostat interaction.

Thanks to the indicator \(\mathbf {1}_{[0,k)}(\xi )\), to the fact that \(\Delta _2\) and \(\Delta _3\) involve only second-order products of \(f_t\)-distributed variables (see (17)), and to the bound \(\int v^2 f_t(dv) \le \max \left\{ \int v^2 f_0(dv) , T \right\} \) provided by (9), we deduce that the second and third terms in (20) are bounded above by \(\frac{Ck}{N}\). On the other hand, since the term \(F_t^i({\mathbf {Z}}_{t^{-}},\xi )\) appears in both (15) and (19), it cancels out in \(\Delta _1\); more specifically, we have

$$\begin{aligned} \Delta _1&= \left[ Z_{t^{-}}^1 \cos \theta - F_t^1({\mathbf {Z}}_{t^{-}},\xi ) \sin \theta -\tilde{Z}_{t^{-}}^1 \cos \theta + F_t^1({\mathbf {Z}}_{t^{-}},\xi ) \sin \theta \right] ^2 - (Z_{t^{-}}^1 - \tilde{Z}_{t^{-}}^1)^2 \\&= - (1-\cos ^2\theta ) (Z_{t^{-}}^1 - \tilde{Z}_{t^{-}}^1)^2 \le 0. \end{aligned}$$

Similarly, it is easily seen that \(\Delta _4 = -(1-\cos ^2\theta ) (Z_{t^{-}}^1-\tilde{Z}_{t^{-}}^1)^2\), so the last term in (20) equals \(-\frac{\mu }{2} h(t)\). Thus, noting that the first line of (20) differs from the integral of \(\Delta _1 \le 0\) against the full intensity of \(\mathcal {P}_1\) only by a term carrying the indicator \(\mathbf {1}_{[0,k)}(\xi )\), whose expectation is again of order \(\frac{Ck}{N}\) (the second moments involved being uniformly bounded), we deduce that

$$\begin{aligned} h'(t)&\le -\mathbb {E}\int _0^{2\pi } \int _1^N (1-\cos ^2\theta ) (Z_t^1 - \tilde{Z}_t^1)^2 \frac{2 \lambda d\theta d\xi }{2\pi (N-1)} + \frac{Ck}{N} - \frac{\mu }{2} h(t) \\&= -(\lambda + \mu /2) h(t) + \frac{Ck}{N}. \end{aligned}$$

Thus \(h'(t) + (\lambda + \frac{\mu }{2}) h(t) \le \frac{Ck}{N}\). Since \(h(0) = 0\), the desired bound follows from the last inequality by multiplying by \(e^{(\lambda + \frac{\mu }{2})t}\) and integrating. \(\square \)

We now want to obtain an estimate for the decoupling property of the system of Boltzmann processes in terms of \(\mathbb {E}[W_2^2({\bar{\mathbf {Z}}}_t, f_t)]\); this is the content of Lemma 11 below. To this end, we will need to recall two results.

For a probability measure \(\nu \) on \(\mathbb {R}\) and for any \(k\in \mathbb {N}\), we will let \(\varepsilon _k(\nu )\) be given by

$$\begin{aligned} \varepsilon _k(\nu ) = \mathbb {E}[ W_2^2(\bar{{\mathbf {X}}}, \nu )] \end{aligned}$$

where \({\mathbf {X}} = (X_1, \dots , X_k)\) is a collection of i.i.d. variables with law \(\nu \). The first result, see [13, Theorem 1], provides rates of convergence for \(\varepsilon _k(\nu )\): if \(\nu \) has a finite \(r^{\text {th}}\) moment for some \(r>4\), then there is a constant \(C_r\) that depends only on r such that

$$\begin{aligned} \varepsilon _k(\nu ) \le \frac{C_r \left( \int \vert x \vert ^r \nu (dx) \right) ^{2/r}}{k^{1/2}}. \end{aligned}$$
(21)

The second result, which is a special case of [10, Lemma 7], states that if \({\mathbf {X}}\) is any exchangeable random vector on \(\mathbb {R}^N\) and \(\nu \) is any probability measure on \(\mathbb {R}\), then there is a constant C depending only on the second moments of \(X^1\) and \(\nu \) such that for any \(k \le N\) we have:

$$\begin{aligned} \frac{1}{2} \mathbb {E}[ W_2^2(\bar{{\mathbf {X}}}, \nu )] \le W_2^2( \text {Law}(X^1, \dots , X^k), \nu ^{\otimes k}) + \varepsilon _k(\nu ) + C \frac{k}{N}. \end{aligned}$$
(22)
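As an illustration of the first of these bounds, the following sketch (ours, not part of the argument) estimates \(\varepsilon _k(\nu )\) by Monte Carlo for \(\nu \) the standard Gaussian, using the one-dimensional quantile representation \(W_2^2(\bar{{\mathbf {X}}},\nu ) = \int _0^1 (F_{\bar{{\mathbf {X}}}}^{-1}(u) - F_\nu ^{-1}(u))^2 du\); by (21) the estimates decay at least like \(k^{-1/2}\) (for a measure as nice as the Gaussian the observed decay is in fact faster).

```python
import numpy as np
from scipy.stats import norm

def w2_sq_vs_gaussian(x, grid=20_000):
    """Squared W_2 distance between the empirical measure of a 1D sample x and
    the standard Gaussian, via the quantile representation, approximated on a
    uniform grid of the interval (0,1).  Illustrative helper only."""
    x = np.sort(np.asarray(x, dtype=float))
    k = len(x)
    u = (np.arange(grid) + 0.5) / grid
    emp_quantile = x[np.minimum((u * k).astype(int), k - 1)]
    return float(np.mean((emp_quantile - norm.ppf(u)) ** 2))

def eps_k(k, reps=200, seed=0):
    """Monte Carlo estimate of eps_k(nu) = E[W_2^2(bar X, nu)] for nu = N(0,1)."""
    rng = np.random.default_rng(seed)
    return float(np.mean([w2_sq_vs_gaussian(rng.standard_normal(k)) for _ in range(reps)]))

# for k in (100, 400, 1600):
#     print(k, eps_k(k))    # decreasing in k; (21) guarantees decay at least of order k**-0.5
```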

We are now ready to state and prove:

Lemma 11

Assume that \(\int _\mathbb {R}f_0(dv) \vert v \vert ^r < \infty \) for some \(r>4\). Then there is a constant C depending only on \(\lambda \), \(\mu \), T, r, and \(\int _\mathbb {R}f_0(dv) \vert v\vert ^r\), such that for all \(t\ge 0\) we have

$$\begin{aligned} \mathbb {E}[ W_2^2( \bar{{\mathbf {Z}}}_t, f_t) ] \le \frac{C}{N^{1/3}}. \end{aligned}$$

Moreover, this bound also holds if we replace \(\bar{{\mathbf {Z}}}_t\) by \(\bar{{\mathbf {Z}}}_t^i = \frac{1}{N-1} \sum _{j\ne i} \delta _{Z_t^j}\).

Proof

For \(k\le N\), (22) applied to \(\nu =f_t\) and \({\mathbf {X}} = {\mathbf {Z}}_t\) gives:

$$\begin{aligned} \frac{1}{2} \mathbb {E}[W_2^2(\bar{{\mathbf {Z}}}_t, f_t)]&\le W_2^2(\text {Law}(Z_t^1,\ldots ,Z_t^k), f_t^{\otimes k}) + \varepsilon _k(f_t) + C\frac{k}{N} \\&\le C\frac{k}{N} + \varepsilon _k(f_t) + C\frac{k}{N}, \end{aligned}$$

where in the last step we used Lemma 10. The finite initial \(r^{\text {th}}\) moment hypothesis, together with Lemma 8, implies that

$$\begin{aligned} \sup _{t\ge 0}\int _\mathbb {R}\vert v\vert ^r f_t(dv) < \infty . \end{aligned}$$

Thus, from (21), we obtain \(\varepsilon _k(f_t) \le C/k^{1/2}\) for all \(t\ge 0\) (since \(r>4\)). Taking \(k \sim N^{2/3}\) gives the result. The estimate for \(\bar{{\mathbf {Z}}}_t^i\) is deduced similarly, taking \({\mathbf {X}} = (Z_t^j)_{j\ne i}\) in (22). \(\square \)
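The exponent 1/3 comes from balancing the two error terms in the last estimate: with \(k = \lceil N^{2/3} \rceil \),

$$\begin{aligned} \frac{k}{N} \approx N^{-1/3} \qquad \text {and} \qquad \frac{1}{k^{1/2}} \approx N^{-1/3}, \end{aligned}$$

so both contributions are of the same order \(N^{-1/3}\).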

We now prove Theorem 2.

Proof

Call \(h(t) = \mathbb {E}[(V_t^1-Z_t^1)^2]\). Using Lemma 11 and exchangeability, we obtain

$$\begin{aligned} \mathbb {E}[W_2^2(\bar{\mathbf {V}}_t, f_t)]&\le 2 \mathbb {E}[W_2^2(\bar{\mathbf {V}}_t, \bar{{\mathbf {Z}}}_t)] + 2 \mathbb {E}[W_2^2(\bar{{\mathbf {Z}}}_t, f_t)] \\&\le 2 \mathbb {E}\left[ \frac{1}{N}\sum _{i=1}^N (V_t^i - Z_t^i)^2 \right] + \frac{C}{N^{1/3}} \\&= 2 h(t) + \frac{C}{N^{1/3}}. \end{aligned}$$

Thus, it suffices to prove that \(h(t) \le 2 e^{-\frac{\mu }{2} t} h(0) + C N^{-1/3}\), because \(h(0) = W_2^2(f_0^N, f_0^{\otimes N})\) thanks to (18).

We thus study the evolution of h(t). We have

$$\begin{aligned} h'(t) = S_t^K + S_t^T. \end{aligned}$$

Here \(S_t^K\) corresponds to the Kac interactions coming from the \(\mathcal {P}_i\) terms in (11) and (15), and \(S_t^T\) corresponds to the thermostat interactions coming from the \(\mathcal {Q}_i\) terms. For brevity, let us call \(V_t^\mathbf {i}= V_t^{\mathbf {i}(\xi )}\), \(Z_t^\mathbf {i}= Z_t^{\mathbf {i}(\xi )}\), and \(F_t^1 = F_t^1({\mathbf {Z}}_{t^{-}},\xi )\). We now study each of \(S_t^K\) and \(S_t^T\). For the Kac term \(S_t^K\), we recall that the intensity of \(\mathcal {P}_1(dt,d\theta ,d\xi )\) is \(\frac{2 \lambda dt d\theta d\xi \mathbf {1}_{\{\mathbf {i}(\xi )\ne 1\}}}{2\pi (N-1)}\). Thus from (11) and (15), using Itô calculus, for \(S_t^K\) we obtain:

$$\begin{aligned} \nonumber S_t^K&= \mathbb {E}\int _0^{2\pi } \int _1^N \left[ \left( V_t^1 \cos \theta - V_t^{\mathbf {i}} \sin \theta - Z_t^1 \cos \theta + F_t^1 \sin \theta \right) ^2 - (V_t^1 - Z_t^1)^2 \right] \frac{2\lambda d\theta d\xi }{2\pi (N-1)} \nonumber \\&= \mathbb {E}\int _0^{2\pi } \int _1^N \left[ (V_t^1 - Z_t^1)^2 (\cos ^2\theta -1) + (V_t^\mathbf {i}- F_t^1)^2 \sin ^2\theta \right] \frac{2\lambda d\theta d\xi }{2\pi (N-1)} \nonumber \\&= 2\lambda \left[ -\frac{1}{2}h(t) + \frac{1}{2} \mathbb {E}\int _1^N (V_t^\mathbf {i}- F_t^1)^2 \frac{d\xi }{N-1} \right] , \end{aligned}$$
(23)

where in the second equality the cross-term vanished since \(\int _0^{2\pi } \cos \theta \sin \theta d\theta = 0\). We now control the positive term in (23) by subtracting and then adding \(Z_t^\mathbf {i}\) inside the square. Set a(t) to be \(\mathbb {E}\int _1^N (Z_t^\mathbf {i}- F_t^1)^2 \frac{d\xi }{N-1}\), thus \(a(t) = \mathbb {E}[W_2^2(\bar{{\mathbf {Z}}}_t^1, f_t)]\) thanks to (16). Also note that \(\mathbb {E}\int _1^N (V_t^\mathbf {i}- Z_t^\mathbf {i})^2 \frac{d\xi }{N-1} = \frac{1}{N-1}\sum _{j=2}^N \mathbb {E}(V_t^j- Z_t^j)^2\), which equals h(t) by exchangeability. Therefore, we have

$$\begin{aligned} \mathbb {E}\int _1^N (V_t^\mathbf {i}- F_t^1)^2 \frac{d\xi }{N-1}&= h(t) + a(t) + 2\mathbb {E}\int _1^N (V_t^\mathbf {i}- Z_t^\mathbf {i})(Z_t^\mathbf {i}- F_t^1) \frac{d\xi }{N-1} \\&\le h(t) + a(t) + 2 h(t)^{1/2} a(t)^{1/2}, \end{aligned}$$

where we have used the Cauchy–Schwarz inequality. Plugging this into (23) gives

$$\begin{aligned} S_t^K \le \lambda a(t) + 2\lambda h(t)^{1/2} a(t)^{1/2}. \end{aligned}$$

Next, for the thermostat term \(S_t^T\), we recall that the intensity of \(\mathcal {Q}_1(dt,d\theta , dw)\) is \(\mu dt \frac{d\theta }{2\pi }\gamma (dw)\). Thus, again from (11) and (15), we have for \(S_t^T\):

$$\begin{aligned} S_t^T&= \mu \mathbb {E}\int _\mathbb {R}\int _0^{2\pi } \left[ (V_t^1 \cos \theta - w \sin \theta - Z_t^1 \cos \theta + w\sin \theta )^2 - (V_t^1-Z_t^1)^2 \right] \frac{d\theta }{2\pi } \gamma (dw) \\&= -\frac{\mu }{2} h(t). \end{aligned}$$

Joining the bounds for \(S_t^K\) and \(S_t^T\), we see that

$$\begin{aligned} h'(t) \le -\frac{\mu }{2}h(t) + \lambda a(t) + 2\lambda h(t)^{1/2} a(t)^{1/2}. \end{aligned}$$
(24)

Lemma 11 showed that \(a(t) \le C/N^{1/3}\). Thus, the Theorem follows from (24) by a Gronwall-type inequality (see for example [1, Lemma 4.1.8]). \(\square \)

4 Conclusion

In this work we showed that the thermostated Kac N-particle system propagates chaos uniformly in time, at a polynomial rate of order \(N^{-1/3}\) in terms of the 2-Wasserstein metric squared, improving the propagation of chaos result in [4]. This illustrates that the coupling method in [10] can be adapted to include thermostats. We also used coupling arguments to deduce equilibration estimates for both the particle system and the kinetic equation.

We plan to develop this coupling method further for a Kac-type model where, in addition to the particle collisions (1) and the thermostat interactions (2), the system has an energy-restoring mechanism that pushes the total energy of the system back to its initial value after each interaction with the thermostat. This is the subject of future research.