1 Introduction and Main Results

In this paper we study Kac’s particle system, introduced in [1] and later studied for instance in [2,3,4,5]. It can be described as follows: consider N objects or “particles” characterized by their one-dimensional velocities, subjected to the following binary random “collisions”: when particles with velocities v and \(v_*\) collide, they acquire new velocities \(v'\) and \(v_*'\) given by the rule

$$\begin{aligned} (v,v_*) \mapsto (v',v_*') = (v \cos \theta - v_*\sin \theta , v_* \cos \theta + v \sin \theta ), \end{aligned}$$
(1)

where \(\theta \in [0,2\pi )\) is chosen uniformly at random. This can be seen as a rotation by the angle \(\theta \) of the pair \((v,v_*)\in \mathbb {R}^2\) and, as such, it preserves the energy, i.e., \(v^2 + v_*^2 = v'^2 + v_*'^2\). The system evolves continuously with time \(t\ge 0\); the times between collisions follow an exponential law with parameter \(N/2\) and the two particles that collide are chosen uniformly at random among all possible pairs, so each particle collides once per unit of time on average. The system starts at \(t=0\) with some fixed symmetric distribution, and all the previous random choices are made independently. This description unambiguously determines (the law of) the particle system, which we denote \(\mathbf {V}_t = (V_{1,t},\ldots ,V_{N,t})\).
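The dynamics just described can be simulated directly. The following is a minimal sketch, not part of the original construction: the function name `simulate_kac`, the Gaussian initial condition and the seed are illustrative assumptions. It draws exponential waiting times with parameter \(N/2\), picks a uniformly random pair, applies rule (1), and lets one check numerically that the total energy is preserved.

```python
import math
import random

def simulate_kac(v0, t_max, rng):
    """Run Kac's particle system with rule (1) up to time t_max.

    v0 is the list of initial velocities; returns the velocities at t_max.
    (Hypothetical helper, written only to illustrate the description above.)
    """
    v = list(v0)
    n = len(v)
    t = rng.expovariate(n / 2.0)            # first collision time, rate N/2
    while t <= t_max:
        i, j = rng.sample(range(n), 2)      # uniformly chosen pair of distinct particles
        theta = rng.uniform(0.0, 2.0 * math.pi)
        c, s = math.cos(theta), math.sin(theta)
        # Rule (1): rotate the pair (v_i, v_j) by the angle theta.
        v[i], v[j] = v[i] * c - v[j] * s, v[j] * c + v[i] * s
        t += rng.expovariate(n / 2.0)       # next collision time
    return v

rng = random.Random(0)
v0 = [rng.gauss(0.0, 1.0) for _ in range(100)]   # assumed initial condition
vt = simulate_kac(v0, t_max=5.0, rng=rng)
energy_before = sum(x * x for x in v0)
energy_after = sum(x * x for x in vt)
```

Up to floating-point error, `energy_before` and `energy_after` agree, reflecting the identity \(v^2 + v_*^2 = v'^2 + v_*'^2\) at each collision.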

In the pioneering work [1], Kac proved that for all \(t\ge 0\), as \(N \rightarrow \infty \), the empirical measure of the system \(\frac{1}{N} \sum _i \delta _{V_{i,t}}\) converges weakly to \(f_t\) (provided that the convergence holds for \(t=0\)), where \((f_t)_{t\ge 0}\) is the collection of probability measures on \(\mathbb {R}\) solving the so-called Boltzmann–Kac equation:

$$\begin{aligned} \partial _t f_t(v) = \int _0^{2\pi } \int _\mathbb {R}[f_t(v')f_t(v_*') - f_t(v)f_t(v_*)] dv_* \frac{d\theta }{2\pi }. \end{aligned}$$
(2)

This convergence is now termed propagation of chaos, and it has been extensively studied during the last decades for this and other, more general kinetic models (especially the Boltzmann equation), see for instance [6, 7] and the references therein.

Another interesting feature of this model is its behaviour as \(t\rightarrow \infty \). For instance, assuming normalized initial energy, i.e., \(\sum _i V_{i,0}^2 = N\) a.s., it is known that the law of the system converges exponentially in \(L^2\) to its equilibrium, namely, the uniform distribution on the Kac sphere \(\{\mathbf {x}\in \mathbb {R}^N: \sum _i x_i^2 = N\}\), see [3] and the references therein. As an alternative approach, one can couple two copies of the particle system using the same collision times and the same angle \(\theta \) (i.e., “parallel coupling”), but with different initial conditions, to show that the 2-Wasserstein distance between their laws is non-increasing in time. However, a simple and better coupling was recently introduced in [8]: note first that the post-collisional velocities in (1) can be written as \((v',v_*') = \sqrt{v^2 + v_*^2} (\cos (\alpha +\theta ),\sin (\alpha +\theta ))\), where \(\alpha \in (-\pi ,\pi ]\) is the angle defined by \((v,v_*) = \sqrt{v^2 + v_*^2} (\cos \alpha ,\sin \alpha )\), with the convention that all sums of angles are modulo \(2\pi \); next, note that, since \(\theta \) is uniformly chosen in \([0,2\pi )\), so is \(\alpha +\theta \), and then the interaction rule

$$\begin{aligned} (v,v_*) \mapsto (v',v_*') = \sqrt{v^2 + v_*^2} (\cos (\theta ),\sin (\theta )) \end{aligned}$$
(3)

generates a system that has the same law as the one described by (1). Using this new parametrization of the collision, one can define a coupling that leads to contraction results in some Wasserstein metrics, see [8] for details.
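For the reader's convenience, the polar form of (1) used above follows from the angle-addition formulas. Writing \(r = \sqrt{v^2 + v_*^2}\) (a shorthand introduced only for this remark) and \((v,v_*) = r(\cos \alpha ,\sin \alpha )\), rule (1) gives

$$\begin{aligned} v'&= r(\cos \alpha \cos \theta - \sin \alpha \sin \theta ) = r\cos (\alpha +\theta ), \\ v_*'&= r(\sin \alpha \cos \theta + \cos \alpha \sin \theta ) = r\sin (\alpha +\theta ). \end{aligned}$$

Since \(\alpha \) depends only on the pre-collisional velocities and \(\theta \) is independent of them and uniform, \(\alpha +\theta \) (modulo \(2\pi \)) is again uniform on \([0,2\pi )\), which is what allows one to replace \(\alpha +\theta \) by \(\theta \) in (3).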

Our goal in this paper is to use the parametrization (3) in a propagation of chaos context, in order to obtain explicit (in N) and uniform-in-time rates of convergence, as \(N\rightarrow \infty \), for the law of the particles towards the solution of (2). We will quantify this convergence using the p-Wasserstein distance: given two probability measures \(\mu \) and \(\nu \) on \(\mathbb {R}^k\), it is defined as

$$\begin{aligned} \mathcal {W}_p(\mu ,\nu ) = \left( \inf \mathbb {E}\frac{1}{k}\sum _{i=1}^k |X_i-Y_i|^p \right) ^{1/p}, \end{aligned}$$

where the infimum is taken over all random vectors \(\mathbf {X} = (X_1,\ldots ,X_k)\) and \(\mathbf {Y} = (Y_1,\ldots ,Y_k)\) such that \(\mathcal {L}(\mathbf {X}) = \mu \) and \(\mathcal {L}(\mathbf {Y}) = \nu \) (we do not specify the dependence on k in our notation). We use the normalized distance \(|\mathbf {x}-\mathbf {y}|_k^p = \frac{1}{k}\sum _i |x_i-y_i|^p\) on \(\mathbb {R}^k\), which is natural when one cares about the dependence on the dimension. A pair \((\mathbf {X},\mathbf {Y})\) attaining the infimum is called an optimal coupling and it can be shown that it always exists. See for instance [9] for background on optimal coupling and Wasserstein distances.
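On the real line, the distance between two empirical measures with the same number of atoms can be computed explicitly: for \(p\ge 1\), the monotone (sorted) pairing of the atoms is an optimal coupling. The sketch below uses this standard fact. The function name `wasserstein_p_pow` is an assumption made for illustration, and the factor \(1/n\) comes from the uniform weights of the empirical measures, not from the normalization in \(k\) (here \(k=1\)).

```python
def wasserstein_p_pow(xs, ys, p):
    """Return W_p^p between the empirical measures of two equal-size real samples.

    Uses the fact that, in dimension one and for p >= 1, matching the sorted
    samples is an optimal coupling. (Hypothetical helper, for illustration only.)
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    return sum(abs(x - y) ** p for x, y in zip(xs_sorted, ys_sorted)) / len(xs)

# Two atoms each: the sorted pairing matches 0 with 1 and 2 with 3.
example = wasserstein_p_pow([0.0, 2.0], [1.0, 3.0], 2)
```

Applied to the squared velocities of two samples, the same function gives the quantity \(\mathcal {W}_2^2\) between the corresponding empirical measures of energies.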

Let us fix some notation. We denote \(E_N = \frac{1}{N} \sum _i V_{i,0}^2\) the (random) mean initial energy, which is preserved, i.e., \(\frac{1}{N} \sum _i V_{i,t}^2 = E_N\) for all \(t\ge 0\), a.s. We also denote \(\mathcal {E} = \int _\mathbb {R}v^2 f_0(dv)\), which itself is preserved by the flow \((f_t)_{t\ge 0}\). For a vector \(\mathbf {x} = (x_1,\ldots ,x_N) \in \mathbb {R}^N\) we denote by \(\mathbf {x}^{(2)} = (x_1^2,\ldots ,x_N^2)\) the vector of squares of \(\mathbf {x}\), and we define the (empirical) probability measures \(\bar{\mathbf {x}} = \frac{1}{N}\sum _{j} \delta _{x_j}\) and \(\bar{\mathbf {x}}_i = \frac{1}{N-1}\sum _{j\ne i} \delta _{x_j}\). Also, for a probability measure \(\mu \) on \(\mathbb {R}\), we denote by \(\mu ^{(2)}\) the measure on \(\mathbb {R}_+\) defined by \(\int \phi (v) \mu ^{(2)}(dv) = \int \phi (v^2) \mu (dv)\).

Theorem 1

Assume that \(\int _\mathbb {R}|v|^p f_0(dv) < \infty \) for some \(p>4\), \(p\ne 8\). Let \(\gamma = \min (\frac{1}{3},\frac{p-4}{2p-4})\) and \(\lambda _N = \frac{1}{4}\frac{N+2}{N-1}\). Then, there exists a constant C depending only on p and \(\int _\mathbb {R}|v|^p f_0(dv)\), such that for all \(t\ge 0\),

$$\begin{aligned} \mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {V}}_t^{(2)}, f_t^{(2)})&\le \frac{C}{N^\gamma } + C \mathbb {E}(E_N-\mathcal {E})^2 \\&\quad \, +\, Ce^{-\lambda _N t} \mathcal {W}_2^2\big (\mathcal {L}\big (\mathbf {V}_0^{(2)}\big ),\big (f_0^{(2)}\big )^{\otimes N}\big ). \end{aligned}$$

This yields a uniform-in-time propagation of chaos in \(\mathcal {W}_2^2\) for the energy of the particles. For instance, assuming that \(\int |v|^p f_0(dv) <\infty \) for some \(p>8\), the result gives a rate of order \(N^{-1/3}\), provided that \(\mathbb {E}(E_N-\mathcal {E})^2\) and \(\mathcal {W}_2^2(\mathcal {L}(\mathbf {V}_0^{(2)}),(f_0^{(2)})^{\otimes N})\) converge to 0 at the same rate or faster. Notice also that \(\lambda _N\) coincides with the spectral gap in \(L^2\) of the associated generator of the particle system, which was computed in [3] (although with a factor 2 due to a different rate of the collision times). The restriction \(p\ne 8\) comes from the fact that the proof of Theorem 1 makes use of a general chaocity result for i.i.d. sequences found in [10, Theorem 1]; including the case \(p=8\) would produce additional logarithmic terms in the rate, see (15) below.

As in [8, Corollary 3], this \(\mathcal {W}_2^2\) propagation of chaos result for the energy implies the following \(\mathcal {W}_4^4\) result for the non-squared system:

Corollary 1

Let \(\mathbf {U}_0 = (U_{1,0},\ldots ,U_{N,0})\) be any vector of i.i.d. and \(f_0\)-distributed random variables, and let \(\tilde{\gamma } = \frac{p-4}{2p}\mathbf {1}_{p<8} + \frac{p-4}{3p-8}\mathbf {1}_{p> 8}\). Under the same assumptions as in Theorem 1, we have for all \(t\ge 0\),

$$\begin{aligned} \mathbb {E}\mathcal {W}_4^4(\bar{\mathbf {V}}_t, f_t)\le & {} \frac{C}{N^{\tilde{\gamma }}} + C \mathbb {E}(E_N-\mathcal {E})^2 + C e^{-\lambda _N t} \mathbb {E}\left[ \frac{1}{N} \sum _{i=1}^N \big (V_{i,0}^2-U_{i,0}^2\big )^2\right] \\&+\, C e^{-t} \mathbb {E}\left[ \frac{1}{N} \sum _{i=1}^N (V_{i,0}-U_{i,0})^4\right] . \end{aligned}$$

Notice that \(\tilde{\gamma } < \gamma \) for all \(p>4\), thus the rate obtained is slower than the one of Theorem 1 (although we can easily deduce a rate \(N^{-\gamma }\) in \(\mathcal {W}_4^4\) for the law of one particle). For instance, if \(f_0\) has finite moment of order \(p>8\), Corollary 1 gives a chaos rate of \(N^{-1/4}\) in \(\mathcal {W}_4^4\); but if \(f_0\) has finite moments of all orders, it yields a rate of almost \(N^{-1/3}\).
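To get a feeling for the exponents appearing in Theorem 1 and Corollary 1, one can tabulate them with exact arithmetic. The helper names `gamma`, `gamma_tilde` and `lambda_n` below are illustrative assumptions; the formulas are exactly those stated in the two results, and `p` is assumed to be an integer or a `Fraction`.

```python
from fractions import Fraction

def gamma(p):
    """Exponent of Theorem 1: min(1/3, (p-4)/(2p-4)), for p > 4."""
    if p <= 4:
        raise ValueError("p must be greater than 4")
    return min(Fraction(1, 3), Fraction(p - 4, 2 * p - 4))

def gamma_tilde(p):
    """Exponent of Corollary 1, for p > 4 and p != 8."""
    if p <= 4 or p == 8:
        raise ValueError("p must satisfy p > 4 and p != 8")
    if p < 8:
        return Fraction(p - 4, 2 * p)
    return Fraction(p - 4, 3 * p - 8)

def lambda_n(n):
    """Rate lambda_N = (1/4) (N+2)/(N-1) of Theorem 1, for N >= 2."""
    return Fraction(n + 2, 4 * (n - 1))

# Exponents for a few moment orders p (p = 8 is excluded by assumption).
table = {p: (gamma(p), gamma_tilde(p)) for p in (5, 6, 7, 9, 12, 100)}
```

For instance, `table[12]` is `(Fraction(1, 3), Fraction(2, 7))`, consistent with \(\tilde{\gamma } < \gamma \) and with \(\tilde{\gamma }\) approaching \(1/3\) as \(p\) grows.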

Note that when p is close to 4, the chaos rates provided by these results are very slow. The following theorem provides a good rate assuming only that \(f_0\) has finite moment of order \(4+\epsilon \):

Theorem 2

Assume that \(\int _\mathbb {R}|v|^p f_0(dv) < \infty \) for some \(p>4\), and that \(\sup _N \mathbb {E}V_{1,0}^4 < \infty \). Then, there exists a constant C depending only on p, on \(\int _\mathbb {R}|v|^p f_0(dv)\) and on \(\sup _N \mathbb {E}V_{1,0}^4\), such that for all \(t\ge 0\),

$$\begin{aligned} \mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {V}}_t,f_t) \le \frac{C \log ^2 N}{N^{1/3}} + C \mathcal {W}_2^2\big (\mathcal {L}(\mathbf {V}_0),f_0^{\otimes N}\big ). \end{aligned}$$

To the best of our knowledge, these are the first uniform propagation of chaos results for Kac’s 1D particle system; they will be proven in Sect. 3. Similar results for the law of k particles can also be stated. The rates are explicit and of order \(N^{-1/3}\) (almost, in Corollary 1; Theorem 2), assuming enough moments of \(f_0\). This is quite reasonable, given that in general the optimal rate of chaocity for an i.i.d. sequence is \(N^{-1/2}\), see [10, Theorem 1]. Notice that in these results, the initial condition \(\mathbf {V}_0\) is not restricted to have fixed (non-random) mean energy, and can thus be chosen at convenience. For instance, it can have distribution \(f_0^{\otimes N}\), thus the term \(\mathbb {E}(E_N-\mathcal {E})^2\) in Theorem 1 and Corollary 1 is easily seen to be of order \(1/N\), while the terms \(\mathcal {W}_2^2(\mathcal {L}(\mathbf {V}_0^{(2)}),(f_0^{(2)})^{\otimes N})\), \(\sum _i (V_{i,0}^2-U_{i,0}^2)^2\), \(\sum _i (V_{i,0}-U_{i,0})^4\) and \(\mathcal {W}_2^2(\mathcal {L}(\mathbf {V}_0),f_0^{\otimes N})\) all vanish. Or one can assume normalized energy (i.e., \(E_N = \mathcal {E}\) a.s.), provided that one can control the remaining terms.

We remark that, although one could use the general functional techniques of [6] in the present context, the rates obtained with these techniques are likely to be much slower than the ones presented here.

The proof of our results mainly relies on the parametrization (3) introduced in [8], and on a coupling argument developed in [11] to relate the behaviour of the particle system and the limit jump process (the nonlinear process). We remark however that, while the proof of Theorem 1 makes use of the techniques of [8] and [11], the proof of Theorem 2 directly combines the results found in these references.

2 Construction

We now give a specific construction of the particle system and couple it with a suitable system of nonlinear processes, following [11]. Consider a Poisson point measure \(\mathcal {N}(dt,d\theta ,d\xi ,d\zeta )\) on \(\mathbb {R}_+\times [0,2\pi )\times [0,N) \times [0,N)\) with intensity \(\frac{dt d\theta d\xi d\zeta \mathbf {1}_\mathcal {G}(\xi ,\zeta )}{4 \pi (N-1)}\), where \(\mathcal {G}:= \{(\xi ,\zeta ) \in [0,N)^2: {\mathbf {i}}(\xi ) \ne \mathbf {i}(\zeta )\}\) and \(\mathbf {i}(\xi ):= \lfloor \xi \rfloor + 1\). In words, the measure \(\mathcal {N}\) picks collision times \(t\in \mathbb {R}_+\) at rate \(N/2\), and for each such t, it also independently samples an angle \(\theta \) uniformly at random from \([0,2\pi )\) and a pair \((\xi ,\zeta )\) uniformly from the set \(\mathcal {G}\) (note that the area of \(\mathcal {G}\) is \(N(N-1)\)). The pair \((\mathbf {i}(\xi ),\mathbf {i}(\zeta ))\) gives the indices of the particles that jump at each collision. Using the parametrization (3), we define the particle system \(\mathbf {V}_t = (V_{1,t},\ldots ,V_{N,t})\) as the solution to

$$\begin{aligned} dV_{i,t} = \int _0^{2\pi } \int _{A_i} \left[ \sqrt{V_{i,t^-}^2 + V_{\mathbf {i}(\xi ),t^-}^2 } \cos \theta - V_{i,t^-} \right] \mathcal {N}_i(dt,d\theta ,d\xi ), \end{aligned}$$
(4)

for all \(i\in \{1,\ldots ,N\}\), where \(A_i := [0,N) \setminus [i-1,i)\), and \(\mathcal {N}_i\) is the point measure defined as

$$\begin{aligned} \mathcal {N}_i(dt,d\theta ,d\xi ) = \mathcal {N}(dt,d\theta ,[i-1,i), d\xi ) + \mathcal {N}(dt,d\theta - \pi /2,d\xi ,[i-1,i)), \end{aligned}$$
(5)

where the shift by \(-\pi /2\) transforms the sine into a cosine. Clearly, \(\mathcal {N}_i\) is a Poisson point measure on \(\mathbb {R}_+ \times [0,2\pi )\times A_i\) with intensity \(\frac{dt d\theta d\xi }{2\pi (N-1)}\). The initial condition \(\mathbf {V}_0 = (V_{1,0},\ldots ,V_{N,0})\) is some random vector with exchangeable components, independent of \(\mathcal {N}\).
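The construction above can be mimicked numerically. The following is a sketch under stated assumptions, not the construction of [11] itself: the names `particle_index`, `poisson_atoms` and `apply_atom` are hypothetical, indices are zero-based (so `particle_index` returns \(\mathbf {i}(\xi )-1\)), the pair \((\xi ,\zeta )\) is drawn on \(\mathcal {G}\) by rejection, and we read (4) and (5) as assigning \(\sqrt{V_i^2+V_j^2}\cos \theta \) to particle \(\mathbf {i}(\xi )\) and \(\sqrt{V_i^2+V_j^2}\sin \theta \) to particle \(\mathbf {i}(\zeta )\), in agreement with (3).

```python
import math
import random

def particle_index(x, n):
    """Zero-based version of i(x) = floor(x) + 1, clamped against rounding at x = n."""
    return min(int(x), n - 1)

def poisson_atoms(n, t_max, rng):
    """Yield the atoms (t, theta, xi, zeta) of the point measure on [0, t_max], in time order."""
    t = rng.expovariate(n / 2.0)               # total rate N/2
    while t <= t_max:
        theta = rng.uniform(0.0, 2.0 * math.pi)
        while True:                            # rejection step: uniform on the set G
            xi, zeta = n * rng.random(), n * rng.random()
            if particle_index(xi, n) != particle_index(zeta, n):
                break
        yield t, theta, xi, zeta
        t += rng.expovariate(n / 2.0)

def apply_atom(v, theta, xi, zeta):
    """Update v in place with parametrization (3) for the pair (i(xi), i(zeta))."""
    n = len(v)
    i, j = particle_index(xi, n), particle_index(zeta, n)
    r = math.hypot(v[i], v[j])                 # sqrt(v_i^2 + v_j^2)
    v[i], v[j] = r * math.cos(theta), r * math.sin(theta)

rng = random.Random(1)
v = [rng.gauss(0.0, 1.0) for _ in range(50)]   # assumed exchangeable initial condition
energy0 = sum(x * x for x in v)
n_atoms = 0
for _, theta, xi, zeta in poisson_atoms(len(v), 4.0, rng):
    apply_atom(v, theta, xi, zeta)
    n_atoms += 1
energy1 = sum(x * x for x in v)
```

Because the rate is finite, the system is just a recursion over the time-ordered atoms, and the energy is preserved at each of them.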

The nonlinear process (introduced by Tanaka [12] in the context of the Boltzmann equation for Maxwell molecules) is a stochastic jump process having marginal laws \((f_t)_{t\ge 0}\), and it is the probabilistic counterpart of (2). It represents the trajectory of a fixed particle immersed in the infinite population, and it is obtained, for instance, as the solution to (4) when one replaces \(V_{\mathbf {i}(\xi ),t^-}\) (which is a \(\xi \)-realization of the (random) measure \(\bar{\mathbf {V}}_{i,t^-} = \frac{1}{N-1} \sum _{j\ne i} \delta _{V_{j,t^-}}\)) with a realization of \(f_t\).

The key idea, introduced in [11], is to define, for each \(i\in \{1,\ldots ,N\}\), a nonlinear process \(U_{i,t}\) that mimics as closely as possible the dynamics of \(V_{i,t}\), which is achieved using a suitable realization of \(f_t\) at each collision. More specifically: the collection \(\mathbf {U}_t = (U_{1,t},\ldots ,U_{N,t})\) is defined as the solution to

$$\begin{aligned} dU_{i,t} = \int _0^{2\pi } \int _{A_i} \left[ \sqrt{U_{i,t^-}^2 + F_{i,t}^2(\mathbf {U}_{t^-},\xi ) } \cos \theta - U_{i,t^-} \right] \mathcal {N}_i(dt,d\theta ,d\xi ), \end{aligned}$$
(6)

for all \(i\in \{1,\ldots ,N\}\). Here, \(F_{i}\) is a measurable mapping \(\mathbb {R}_+ \times \mathbb {R}^N \times A_i \ni (t,\mathbf {x},\xi ) \mapsto F_{i,t}(\mathbf {x},\xi )\) such that for all \((t,\mathbf {x})\) and any random variable \(\xi \) which is uniformly distributed on \(A_i\), the pair \((x_{\mathbf {i}(\xi )},F_{i,t}(\mathbf {x},\xi ))\) is an optimal coupling between \(\bar{\mathbf {x}}_i = \frac{1}{N-1} \sum _{j\ne i} \delta _{x_j}\) and \(f_t\) with respect to the cost function \(c(x,y) = (x^2-y^2)^2\). Thus,

$$\begin{aligned} \int _{A_i} (x_{\mathbf {i}(\xi )}^2 - F_{i,t}^2(\mathbf {x},\xi ))^2 \frac{d\xi }{N-1} = \mathcal {W}_2^2(\bar{\mathbf {x}}_i^{(2)},f_t^{(2)}). \end{aligned}$$
(7)

We refer to [11, Lemma 3] for a proof of existence of such a mapping (here we use a different cost, but our proof works for any cost that is continuous and bounded from below, in order to use a measurable selection result of optimal transference plans, such as [9, Corollary 5.22]). That lemma also shows that for any \(i\ne j\in \{1,\ldots ,N\}\), any random vector \(\mathbf {X} \in \mathbb {R}^N\) with exchangeable components and any bounded and Borel measurable \(\phi :\mathbb {R}\rightarrow \mathbb {R}\), we have

$$\begin{aligned} \mathbb {E}\int _{j-1}^j \phi (F_{i,t}(\mathbf {X},\xi )) d\xi = \int _\mathbb {R}\phi (v)f_t(dv). \end{aligned}$$
(8)

The initial conditions \(U_{1,0},\ldots ,U_{N,0}\) are taken independently and with law \(f_0\). For instance, they can be chosen such that the pair \((\mathbf {V}_0,\mathbf {U}_0)\) is an optimal coupling between \(\mathcal {L}(\mathbf {V}_0)\) and \(f_0^{\otimes N}\) with respect to the cost function \((x^2-y^2)^2\), so that \(\mathbb {E}\frac{1}{N} \sum _i (V_{i,0}^2-U_{i,0}^2)^2 = \mathcal {W}_2^2(\mathcal {L}(\mathbf {V}_0^{(2)}), (f_0^{(2)})^{\otimes N})\) (this is done in the proof of Theorem 1, but, in general, \(\mathbf {U}_0\) can be any random vector with law \(f_0^{\otimes N}\)).

Strong existence and uniqueness of solutions \(\mathbf {V}_t = (V_{1,t},\ldots ,V_{N,t})\) and \(\mathbf {U}_t = (U_{1,t},\ldots ,U_{N,t})\) for (4) and (6) are straightforward: since the total rate of \(\mathcal {N}\) is finite over finite time intervals, those equations are nothing but recursions for the values of the processes at the (time-ordered) jump times. Also, the collection of pairs \((V_1,U_1),\ldots ,(V_N,U_N)\) is clearly exchangeable.

Every \(U_{i,t}\) is a nonlinear process, thus \(\mathcal {L}(U_{i,t}) = f_t\) for all t. Note however that \(U_{i,t}\) and \(U_{j,t}\) have simultaneous jumps, and consequently they are not independent. As in [11], in order to obtain the desired results, we will need to show that they become asymptotically independent as \(N\rightarrow \infty \), which is achieved using a second coupling, see Lemma 3 below.

3 Proofs

We will need the following propagation of moments result.

Lemma 1

Assume that \(\int _\mathbb {R}|v|^{p} f_0(dv) < \infty \) for some \(p\ge 2\). Then there exists \(C>0\) depending only on p and \(\int _\mathbb {R}|v|^p f_0(dv)\) such that \(\int _\mathbb {R}|v|^p f_t(dv) < C\) for all \(t\ge 0\).

Proof

See the proof of [11, Lemma 5]. \(\square \)

Lemma 2

Assume that \(\int _\mathbb {R}v^4 f_0(dv) < \infty \). Then, there exists a constant C depending only on \(\int _\mathbb {R}v^4 f_0(dv)\), such that for any \(i\ne j\),

$$\begin{aligned} | cov (U_{i,t}^2,U_{j,t}^2)| \le (1-e^{-t})\frac{C}{N}. \end{aligned}$$

Proof

We will estimate \(h_t := \mathbb {E}(U_{i,t}^2 U_{j,t}^2)\). From (6) we have

$$\begin{aligned} dh_t= & {} \mathbb {E}\int _0^{2\pi } \int _{[0,N)^2} \left[ \mathbf {1}_{\{\mathbf {i}(\xi ) = i, \mathbf {i}(\zeta ) = j\}} \Delta _1 + \mathbf {1}_{\{\mathbf {i}(\xi ) = j, \mathbf {i}(\zeta ) = i\}} \Delta _2 \right. \nonumber \\&+ \,\mathbf {1}_{\{\mathbf {i}(\xi ) = i, \mathbf {i}(\zeta ) \ne j\}} \Delta _3 + \mathbf {1}_{\{\mathbf {i}(\xi ) \ne j, \mathbf {i}(\zeta ) = i\}} \Delta _4 \nonumber \\&+\left. \mathbf {1}_{\{\mathbf {i}(\xi ) \ne i, \mathbf {i}(\zeta ) = j\}} \Delta _5 + \mathbf {1}_{\{\mathbf {i}(\xi ) = j, \mathbf {i}(\zeta ) \ne i\}} \Delta _6 \right] \mathcal {N}(dt,d\theta ,d\xi ,d\zeta ), \end{aligned}$$
(9)

where \(\Delta _1\) and \(\Delta _2\) are the increments of \(U_{i,t}^2 U_{j,t}^2\) when \(U_{i,t}\) and \(U_{j,t}\) have a simultaneous jump, and \(\Delta _3,\ldots ,\Delta _6\) are the increments when only one of them jumps. For instance,

$$\begin{aligned} \Delta _1&= (U_{i,t^-}^2 + F_{i,t}^2(\mathbf {U}_{t^-},\zeta ))\cos ^2\theta (U_{j,t^-}^2 + F_{j,t}^2(\mathbf {U}_{t^-},\xi ))\sin ^2\theta - U_{i,t^-}^2 U_{j,t^-}^2, \\ \Delta _3&= (U_{i,t^-}^2 + F_{i,t}^2(\mathbf {U}_{t^-},\zeta ))\cos ^2\theta U_{j,t^-}^2 - U_{i,t^-}^2 U_{j,t^-}^2. \end{aligned}$$

For the latter, we have:

$$\begin{aligned}&\mathbb {E}\int _0^{2\pi } \int _{[0,N)^2} \mathbf {1}_{\{\mathbf {i}(\xi ) = i, \mathbf {i}(\zeta ) \ne j\}} \Delta _3 \mathcal {N}(dt,d\theta ,d\xi ,d\zeta ) \nonumber \\&\quad = \mathbb {E}\int _0^{2\pi } \int _{A_i} \left[ -(1-\cos ^2\theta ) U_{i,t}^2 U_{j,t}^2 + \cos ^2\theta F_{i,t}^2(\mathbf {U}_t,\zeta ) U_{j,t}^2\right] \frac{dt d\theta d\zeta }{4\pi (N-1)} \nonumber \\&\quad = \left[ -\frac{1}{4} h_t + \frac{1}{4}\mathcal {E}^2\right] dt, \end{aligned}$$
(10)

where we have used that \(U_{j,t} \sim f_t\) under \(\mathbb {P}\) and \(F_{i,t}(\mathbf {U}_t,\zeta ) \sim f_t\) under \(\frac{d\zeta \mathbf {1}_{A_i}(\zeta )}{N-1}\). The same identity holds for \(\Delta _4, \Delta _5\) and \(\Delta _6\). On the other hand for \(\Delta _1\) we can simply use the Cauchy–Schwarz inequality and the fact that \(\mathbb {E}\int _{j-1}^j F_{i,t}^4(\mathbf {U}_t, \zeta ) d\zeta = \int v^4 f_t(dv) \le C\) [thanks to (8) and Lemma 1], thus obtaining

$$\begin{aligned} -\frac{C}{N} dt \le \mathbb {E}\int _0^{2\pi } \int _{[0,N)^2} \mathbf {1}_{\{\mathbf {i}(\xi ) = i, \mathbf {i}(\zeta ) = j\}} \Delta _1 \mathcal {N}(dt,d\theta ,d\xi ,d\zeta ) \le \frac{C}{N}dt. \end{aligned}$$

The same estimate holds true for \(\Delta _2\). Using this and (10) in (9), we deduce that \(-h_t + \mathcal {E}^2 - \frac{C}{N} \le \partial _t h_t \le - h_t + \mathcal {E}^2 + \frac{C}{N}\), and multiplying by \(e^t\) and integrating yields \( (e^t-1)(\mathcal {E}^2-\frac{C}{N}) \le e^t h_t - h_0 \le (e^t-1)(\mathcal {E}^2+\frac{C}{N}) \). But \(U_{i,0}\) and \(U_{j,0}\) are independent, thus \(h_0 = \mathcal {E}^2\), and then \(\mathcal {E}^2 - (1-e^{-t})\frac{C}{N} \le h_t \le \mathcal {E}^2 + (1-e^{-t})\frac{C}{N}\). Since \( cov (U_{i,t}^2,U_{j,t}^2) = h_t - \mathcal {E}^2\), the conclusion follows. \(\square \)
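The integration step in the proof above can be spelled out as follows for the upper bound (the lower bound is identical, with \(-\frac{C}{N}\) in place of \(+\frac{C}{N}\)). Multiplying \(\partial _t h_t \le -h_t + \mathcal {E}^2 + \frac{C}{N}\) by the integrating factor \(e^t\) gives

$$\begin{aligned} \partial _t \big (e^t h_t\big )&= e^t\big (\partial _t h_t + h_t\big ) \le e^t \Big (\mathcal {E}^2 + \frac{C}{N}\Big ), \\ e^t h_t - h_0&\le (e^t-1)\Big (\mathcal {E}^2 + \frac{C}{N}\Big ), \\ h_t&\le e^{-t}\mathcal {E}^2 + (1-e^{-t})\Big (\mathcal {E}^2 + \frac{C}{N}\Big ) = \mathcal {E}^2 + (1-e^{-t})\frac{C}{N}, \end{aligned}$$

where the last line uses \(h_0 = \mathcal {E}^2\).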

For a given exchangeable random vector \(\mathbf {X}\) on \(\mathbb {R}^N\), denote by \(\mathcal {L}^n(\mathbf {X})\) the joint law of its first n components. The following lemma provides a decoupling property for the system of nonlinear processes \(\mathbf {U}_t\).

Lemma 3

Assume \(\int _\mathbb {R}v^4 f_0(dv) < \infty \). Then there exists a constant \(C>0\), depending only on \(\int _\mathbb {R}v^4 f_0(dv)\), such that for all \(n\le N\) and \(t\ge 0\),

$$\begin{aligned} \mathcal {W}_2^2\big (\mathcal {L}^n(\mathbf {U}_t^{(2)}), (f_t^{(2)})^{\otimes n} \big ) \le C \frac{n}{N}. \end{aligned}$$

Also, if \(\int _\mathbb {R}|v|^p f_0(dv) < \infty \) for some \(p>4\), then there exists a constant \(C>0\), depending only on p and \(\int _\mathbb {R}|v|^p f_0(dv)\), such that for all \(n\le N\) and \(t\ge 0\),

$$\begin{aligned} \mathcal {W}_4^4\big (\mathcal {L}^n(\mathbf {U}_t), f_t^{\otimes n} \big ) \le C \left( \frac{n}{N}\right) ^{\frac{p-4}{p}}. \end{aligned}$$

Proof

The argument uses a coupling construction, as in the proof of [11, Lemma 6]. We repeat the important steps here. First, for all \(n\in \{2,\ldots ,N\}\), the idea is to construct n independent nonlinear processes \(\tilde{U}_{1,t},\ldots ,\tilde{U}_{n,t}\) such that \(\tilde{U}_{i,t}\) remains close to \(U_{i,t}\) on average. To achieve this, let \(\mathcal {M}\) be an independent copy of the Poisson point measure \(\mathcal {N}\), and define for all \(i\in \{1,\ldots ,n\}\)

$$\begin{aligned} \mathcal {M}_i(dt,d\theta ,d\xi )= & {} \mathcal {N}(dt,d\theta ,[i-1,i), d\xi ) \nonumber \\&+ \mathcal {N}(dt,d\theta - \pi /2,d\xi ,[i-1,i)) \mathbf {1}_{[n,N)}(\xi ) \nonumber \\&+ \mathcal {M}(dt,d\theta - \pi /2,d\xi ,[i-1,i)) \mathbf {1}_{[0,n)}(\xi ), \end{aligned}$$
(11)

which is a Poisson point measure on \(\mathbb {R}_+ \times [0,2\pi )\times A_i\) with intensity \(\frac{dt d\theta d\xi }{2\pi (N-1)}\), just as \(\mathcal {N}_i\). We then define \(\tilde{U}_{i,t}\) starting with \(\tilde{U}_{i,0} = U_{i,0}\) and solving an equation similar to (6), but using \(\mathcal {M}_i\) in place of \(\mathcal {N}_i\):

$$\begin{aligned} d\tilde{U}_{i,t} = \int _0^{2\pi } \int _{A_i} \left[ \sqrt{\tilde{U}_{i,t^-}^2 + F_{i,t}^2(\mathbf {U}_{t^-},\xi ) } \cos \theta - \tilde{U}_{i,t^-} \right] \mathcal {M}_i(dt,d\theta ,d\xi ). \end{aligned}$$
(12)

In words, the processes \(\tilde{U}_{1,t},\ldots ,\tilde{U}_{n,t}\) use the same atoms of \(\mathcal {N}\) that \(U_{1,t},\ldots ,U_{n,t}\) use, except for those that produce a joint jump of \(U_{i,t}\) and \(U_{j,t}\) for some \(i,j\in \{1,\ldots ,n\}\), in which case either \(\tilde{U}_{i,t}\) or \(\tilde{U}_{j,t}\) does not jump at that instant. To compensate for the missing jumps, additional independent atoms, drawn from \(\mathcal {M}\), are added to \(\mathcal {M}_i\).

It is clear that \(\mathcal {M}_1,\ldots ,\mathcal {M}_n\) are independent Poisson point measures. Using this and the fact that \(F_{i,t}(\mathbf {x},\xi )\) has distribution \(f_t\) when \(\xi \) is uniformly distributed on \(A_i\), one can show that \(\tilde{U}_{1,t},\ldots ,\tilde{U}_{n,t}\) are independent nonlinear processes; see the details in the proof of [11, Lemma 6].

Thus, \(\mathcal {W}_2^2(\mathcal {L}^n(\mathbf {U}_t^{(2)}), (f_t^{(2)})^{\otimes n} ) \le \mathbb {E}\frac{1}{n} \sum _{i=1}^n (U_{i,t}^2 - \tilde{U}_{i,t}^2)^2\), and then, to deduce the first bound, it suffices to estimate \(h_t := \mathbb {E}(U_{i,t}^2 - \tilde{U}_{i,t}^2)^2\) for any fixed \(i\in \{1,\ldots ,n\}\). From (6) and (12) we have

$$\begin{aligned} dh_t= & {} \mathbb {E}\int _0^{2\pi } \int _{A_i} \Delta _1 [ \mathcal {N}(dt,d\theta ,[i-1,i),d\xi ) \nonumber \\&+\, \mathcal {N}(dt,d\theta - \pi /2, d\xi , [i-1,i)) \mathbf {1}_{[n,N)}(\xi )] \nonumber \\&+ \, \mathbb {E}\int _0^{2\pi } \int _{A_i} \Delta _2 \mathcal {N}(dt,d\theta - \pi /2, d\xi , [i-1,i) ) \mathbf {1}_{[0,n)}(\xi ) \nonumber \\&+ \, \mathbb {E}\int _0^{2\pi } \int _{A_i} \Delta _3 \mathcal {M}(dt,d\theta - \pi /2, d\xi , [i-1,i) ) \mathbf {1}_{[0,n)}(\xi ), \end{aligned}$$
(13)

where \(\Delta _1\) is the increment of \((U_{i,t}^2 - \tilde{U}_{i,t}^2)^2\) when \(U_{i,t}\) and \(\tilde{U}_{i,t}\) have a simultaneous jump, \(\Delta _2\) is the increment when only \(U_{i,t}\) jumps, and \(\Delta _3\) is the increment when only \(\tilde{U}_{i,t}\) jumps. Thanks to the indicator \(\mathbf {1}_{[0,n)}(\xi )\) and Lemma 1, the second and third terms in (13) are easily seen to be of order \(C\frac{n}{N}\). For the first term, we have

$$\begin{aligned} \Delta _1&= \left( (U_{i,t^-}^2 + F_{i,t}^2(\mathbf {U}_{t^-},\xi ))\cos ^2\theta - (\tilde{U}_{i,t^-}^2 + F_{i,t}^2(\mathbf {U}_{t^-},\xi ))\cos ^2\theta \right) ^2 \\&\quad - (U_{i,t^-}^2 - \tilde{U}_{i,t^-}^2)^2 \\&= -(1-\cos ^4\theta ) (U_{i,t^-}^2 - \tilde{U}_{i,t^-}^2)^2. \end{aligned}$$

Since \(\int _0^{2\pi } (1-\cos ^4\theta ) \frac{d\theta }{2\pi } = \frac{5}{8}\), from (13) we obtain \(\partial _t h_t \le -\frac{5}{8} h_t + C\frac{n}{N}\) [we have simply discarded the negative term with the indicator \(\mathbf {1}_{[n,N)}(\xi )\) in (13)], and since \(h_0 = 0\), the estimate for \(\mathcal {W}_2^2\) follows from Gronwall’s lemma:

$$\begin{aligned} h_t \le C(1-e^{-5t/8}) \frac{n}{N} \le C\frac{n}{N}. \end{aligned}$$
(14)

The estimate for \(\mathcal {W}_4^4\) can be reduced to the previous one using an argument similar to the proof of [8, Corollary 3]: for \(i\in \{1,\ldots ,n\}\), call \(S_{i,t}\) the event in which \(U_{i,t}\) and \(\tilde{U}_{i,t}\) have the same sign. On \(S_{i,t}\) we have

$$\begin{aligned} (U_{i,t}-\tilde{U}_{i,t})^4 \le (U_{i,t}-\tilde{U}_{i,t})^2 (U_{i,t}+\tilde{U}_{i,t})^2 = (U_{i,t}^2-\tilde{U}_{i,t}^2)^2, \end{aligned}$$

and then, using Hölder’s inequality with \(a=\frac{p}{p-4}\) and \(b=p/4\), we obtain

$$\begin{aligned} \mathbb {E}(U_{i,t}-\tilde{U}_{i,t})^4&\le \mathbb {E}\mathbf {1}_{S_{i,t}} (U_{i,t}^2-\tilde{U}_{i,t}^2)^2 + \mathbb {E}\mathbf {1}_{S_{i,t}^c} (U_{i,t}-\tilde{U}_{i,t})^4 \\&\le \mathbb {E}(U_{i,t}^2-\tilde{U}_{i,t}^2)^2 + \mathbb {P}(S_{i,t}^c)^{1/a} [\mathbb {E}(U_{i,t}-\tilde{U}_{i,t})^{4b}]^{1/b}. \end{aligned}$$

The first term in the r.h.s. of this inequality is bounded by \(Cn/N\) thanks to (14), while the expectation in the second term is bounded uniformly in t thanks to Lemma 1. Also, we have \(\mathbb {P}(S_{i,t}^c) \le n/(2N)\): from (6) and (12) we see that when the processes \(U_{i,t}\) and \(\tilde{U}_{i,t}\) have a joint jump, they acquire the same sign [the one of \(\cos \theta \)], and from (5) and (11), it is easy to see that this occurs in a proportion \(1-n/(2N)\) of the jumps on average. With all these, we get

$$\begin{aligned} \mathcal {W}_4^4(\mathcal {L}^n(\mathbf {U}_t), f_t^{\otimes n}) \le \mathbb {E}\frac{1}{n} \sum _{i=1}^n (U_{i,t}-\tilde{U}_{i,t})^4 \le C \left( \frac{n}{N}\right) ^{1/a}, \end{aligned}$$

which proves the estimate for \(\mathcal {W}_4^4\). \(\square \)

To prove the following lemma, we will need some preliminaries. For a probability measure \(\mu \) on \(\mathbb {R}\), for any \(q \ge 1\) and any \(n\in \mathbb {N}\), define \(\varepsilon _{q,n}(\mu ):= \mathbb {E}\mathcal {W}_q^q(\bar{\mathbf {Z}},\mu )\), where \(\mathbf {Z} = (Z_1,\ldots ,Z_n)\) is an i.i.d. and \(\mu \)-distributed tuple. The best available estimates for \(\varepsilon _{q,n}(\mu )\) can be found in [10, Theorem 1]: if \(\mu \) has finite r-moment for some \(r>q\), \(r\ne 2q\), then there exists a constant C depending only on q and r such that for \(\eta = \min (1/2, 1-q/r)\), it holds

$$\begin{aligned} \varepsilon _{q,n}(\mu ) \le C \frac{\left( \int |x|^r \mu (dx) \right) ^{q/r}}{n^\eta }. \end{aligned}$$
(15)

We will also need the following bound, which is a consequence of [11, Lemma 7]: given an exchangeable random vector \(\mathbf {X}\in \mathbb {R}^N\) and a probability measure \(\mu \) on \(\mathbb {R}\), there exists a constant C, depending only on the q-moments of \(\mu \) and \(X_1\), such that for all \(n\le N\),

$$\begin{aligned} \frac{1}{2^{q-1}} \mathbb {E}\mathcal {W}_q^q(\bar{\mathbf {X}}, \mu ) \le \mathcal {W}_q^q(\mathcal {L}^n(\mathbf {X}), \mu ^{\otimes n}) + \varepsilon _{q,n}(\mu ) + C \frac{n}{N}. \end{aligned}$$
(16)

As a consequence of these estimates and Lemma 3, we have:

Lemma 4

Assume that \(\int _\mathbb {R}|v|^{p} f_0(dv) < \infty \) for some \(p>4\), \(p\ne 8\). Then there exists a constant C depending only on p and \(\int _\mathbb {R}|v|^p f_0(dv)\) such that for \(\gamma = \min (\frac{1}{3},\frac{p-4}{2p-4})\) and for all \(t\ge 0\),

$$\begin{aligned} \mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {U}}_t^{(2)}, f_t^{(2)}) \le \frac{C}{N^\gamma }, \end{aligned}$$

and for \(\tilde{\gamma } = \frac{p-4}{2p}\mathbf {1}_{p<8} + \frac{p-4}{3p-8}\mathbf {1}_{p> 8}\),

$$\begin{aligned} \mathbb {E}\mathcal {W}_4^4(\bar{\mathbf {U}}_t, f_t) \le \frac{C}{N^{\tilde{\gamma }}}. \end{aligned}$$

Moreover, the same bounds hold with \(\bar{\mathbf {U}}_{i,t}^{(2)}\) in place of \(\bar{\mathbf {U}}_t^{(2)}\) and with \(\bar{\mathbf {U}}_{i,t}\) in place of \(\bar{\mathbf {U}}_t\), respectively.

Proof

Using the first part of Lemma 3 and (15) and (16) with \(\mu = f_t^{(2)}\), \(q=2\) and \(r = p/2\), we obtain \(\mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {U}}_t^{(2)},f_t^{(2)}) \le C[n^{-\eta } + n/N]\) for \(\eta = \min (1/2, 1-4/p)\) (C depends on the \(p/2\) moments of \(f_t^{(2)}\), which are controlled uniformly in t thanks to Lemma 1). Taking \(n = \lfloor N^{1/(1+\eta )} \rfloor \) gives the estimate for \(\mathcal {W}_2^2\). The estimate for \(\mathcal {W}_4^4\) follows similarly: using the second part of Lemma 3 and (15) and (16) with \(\mu = f_t\), \(q=4\) and \(r = p\), we obtain \(\mathbb {E}\mathcal {W}_4^4(\bar{\mathbf {U}}_t,f_t) \le C[n^{-\eta } + (n/N)^{1/a}]\), for \(a = \frac{p}{p-4}\) and the same \(\eta = \min (1/2, 1-4/p)\). Taking \( n= \lfloor N^{1/(1+a \eta )} \rfloor \) gives the desired bound.

The estimates for \(\bar{\mathbf {U}}_{i,t}^{(2)}\) and \(\bar{\mathbf {U}}_{i,t}\) are obtained similarly. \(\square \)
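The choice of n in the proof above balances the two terms of each bound. Ignoring integer parts, which only affect the constants, the computation of the exponents can be sketched as follows:

$$\begin{aligned} n^{-\eta } = \frac{n}{N}&\iff n = N^{\frac{1}{1+\eta }}, \quad \text {giving the rate } N^{-\frac{\eta }{1+\eta }} = {\left\{ \begin{array}{ll} N^{-\frac{p-4}{2p-4}}, &{} \eta = \frac{p-4}{p}, \\ N^{-\frac{1}{3}}, &{} \eta = \frac{1}{2}, \end{array}\right. } \\ n^{-\eta } = \Big (\frac{n}{N}\Big )^{1/a}&\iff n = N^{\frac{1}{1+a\eta }}, \quad \text {giving the rate } N^{-\frac{\eta }{1+a\eta }} = {\left\{ \begin{array}{ll} N^{-\frac{p-4}{2p}}, &{} \eta = \frac{p-4}{p}, \\ N^{-\frac{p-4}{3p-8}}, &{} \eta = \frac{1}{2}. \end{array}\right. } \end{aligned}$$

Here \(\eta = \frac{p-4}{p}\) corresponds to \(p<8\) and \(\eta = \frac{1}{2}\) to \(p>8\); in the second line we used \(a\eta = 1\) in the first case and \(1 + a\eta = \frac{3p-8}{2(p-4)}\) in the second one. These are the exponents \(\gamma \) and \(\tilde{\gamma }\) of the statement.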

We can now prove Theorem 1:

Proof of Theorem 1

For some \(i\in \{1,\ldots ,N\}\) fixed, we will estimate the quantity \(h_t := \mathbb {E}(V_{i,t}^2-U_{i,t}^2)^2\). Let us first shorten notation: call \(V = V_{i,t^-}\), \(V_* = V_{\mathbf {i}(\xi ),t^-}\), \(U = U_{i,t^-}\), \(F = F_{i,t}(\mathbf {U}_{t^-},\xi )\), and \(U_* = U_{\mathbf {i}(\xi ),t^-}\). From (4) and (6), we have

$$\begin{aligned} d h_t&= \mathbb {E}\int _0^{2\pi } \int _{A_i} \left[ \big ( V^2 + V_*^2 - U^2 - F^2\big )^2 \cos ^4\theta - \big (V^2 - U^2\big )^2 \right] \mathcal {N}_i(dt,d\theta ,d\xi ) \nonumber \\&= \mathbb {E}\int _0^{2\pi } \int _{A_i} \left[ \big (\cos ^4\theta -1\big )\big (V^2-U^2\big )^2 \right. \nonumber \\&\quad +\cos ^4\theta \big (V_*^2 - U_*^2\big )^2 +\cos ^4\theta \big (U_*^2 - F^2\big )^2 \nonumber \\&\quad + 2 \cos ^4\theta \big (V^2-U^2+V_*^2-U_*^2\big )\big (U_*^2-F^2\big ) \nonumber \\&\quad \left. + 2 \cos ^4\theta \big (V^2-U^2\big )\big (V_*^2-U_*^2\big ) \right] \frac{dt d\theta d\xi }{2\pi (N-1)}. \end{aligned}$$
(17)
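
The second equality in (17) can be checked by hand. As a sketch, write \(a = V^2-U^2\), \(b = V_*^2-U_*^2\) and \(c = U_*^2-F^2\) (shorthand introduced only for this remark), so that \(V^2+V_*^2-U^2-F^2 = a+b+c\). Then

$$\begin{aligned} (a+b+c)^2\cos ^4\theta - a^2&= (\cos ^4\theta -1)\,a^2 + \cos ^4\theta \, b^2 + \cos ^4\theta \, c^2 \\&\quad + 2\cos ^4\theta \,(a+b)\,c + 2\cos ^4\theta \, a\,b, \end{aligned}$$

which lists the five terms of (17) in the order in which they are referred to below.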

Clearly \(\mathbb {E}\int _{A_i} (V_*^2 - U_*^2)^2 \frac{d\xi }{N-1} = h_t\), by exchangeability. Thus, the first and second terms in the integral of (17) yield \(- h_t dt \int _0^{2\pi } (1-2\cos ^4\theta ) \frac{d\theta }{2\pi } = -\frac{1}{4} h_t dt\). From (7), we have \(\mathbb {E}\int _{A_i}(U_*^2-F^2)^2 \frac{d\xi }{N-1} = \mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {U}}_{i,t}^{(2)},f_t^{(2)})\le CN^{-\gamma }\), thanks to Lemma 4. Using the Cauchy-Schwarz inequality, the third and fourth terms in the integral of (17) are thus bounded above by \([C N^{-\gamma } + Ch_t^{1/2}N^{-\gamma /2}]dt\). For the remaining term, since \(\frac{1}{N}\sum _j V_{j,t}^2 = E_N\) for all \(t\ge 0\) a.s., we have

$$\begin{aligned}&\mathbb {E}(V_{i,t}^2 - U_{i,t}^2) \int _{A_i} (V_{\mathbf {i}(\xi ),t}^2 - U_{\mathbf {i}(\xi ),t}^2) d\xi \\&\quad = \mathbb {E}(V_{i,t}^2-U_{i,t}^2) \left( -V_{i,t}^2 + U_{i,t}^2 + NE_N - \sum _{j=1}^N U_{j,t}^2 \right) \\&\quad \le -h_t + h_t^{1/2} \left[ \mathbb {E}\left( \sum _{j=1}^N ( U_{j,t}^2 - \mathcal {E})\right) ^2 \right] ^{1/2} + N h_t^{1/2} \left[ \mathbb {E}(E_N-\mathcal {E})^2\right] ^{1/2} \\&\quad = -h_t + h_t^{1/2} \left[ N\text {var}(U_{i,t}^2) + N(N-1) cov (U_{i,t}^2,U_{j,t}^2)\right] ^{1/2} + N h_t^{1/2} B_N^{1/2}, \end{aligned}$$

where in the last line \(j\ne i\) is any fixed index, and \(B_N := \mathbb {E}(E_N-\mathcal {E})^2\). Thanks to Lemmas 1 and 2, the last line of the display is bounded by \(-h_t + Ch_t^{1/2}N^{1/2} + N h_t^{1/2} B_N^{1/2}\); thus, the fifth term of (17) is controlled by \(-\frac{3}{4(N-1)}h_t\,dt + Ch_t^{1/2}[N^{-1/2} + B_N^{1/2}]\, dt\). Gathering all these estimates, we get from (17)

$$\begin{aligned} \partial _t h_t&\le -\left( \frac{1}{4} + \frac{3}{4(N-1)} \right) h_t + Ch_t^{1/2}\big [N^{-\gamma /2} + N^{-1/2}+ B_N^{1/2}\big ] + CN^{-\gamma } \\&\le -\lambda _N h_t + Ch_t^{1/2}[N^{-\gamma /2} + B_N^{1/2}] + CN^{-\gamma }. \end{aligned}$$
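
The numerical constants in the estimates above all come from one elementary integral. As a worked check (assuming \(N\ge 2\), so that \(N/(N-1)\le 2\)):

$$\begin{aligned} \int _0^{2\pi } \cos ^4\theta \, \frac{d\theta }{2\pi }&= \int _0^{2\pi } \left( \frac{3}{8} + \frac{1}{2}\cos 2\theta + \frac{1}{8}\cos 4\theta \right) \frac{d\theta }{2\pi } = \frac{3}{8}, \\ \int _0^{2\pi } (1-2\cos ^4\theta )\, \frac{d\theta }{2\pi }&= \frac{1}{4}, \qquad \int _0^{2\pi } 2\cos ^4\theta \, \frac{d\theta }{2\pi } = \frac{3}{4}, \\ \frac{1}{4} + \frac{3}{4(N-1)}&= \frac{N+2}{4(N-1)}, \qquad \frac{N^{1/2}}{N-1} \le 2N^{-1/2}, \qquad \frac{N}{N-1} \le 2. \end{aligned}$$

The second line gives the coefficients \(-\frac{1}{4}\) and \(\frac{3}{4}\) of the first two terms and of the fifth term of (17); the last line shows how \(Ch_t^{1/2}N^{1/2} + Nh_t^{1/2}B_N^{1/2}\), once divided by \(N-1\), becomes \(Ch_t^{1/2}[N^{-1/2} + B_N^{1/2}]\).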

Using a version of Gronwall’s lemma (see for instance [13, Lemma 4.1.8]), we obtain

$$\begin{aligned} h_t \le Ce^{-\lambda _N t} h_0 + C N^{-\gamma } + CB_N. \end{aligned}$$
(18)
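
The cited lemma is not reproduced here. As a heuristic for why the exponential rate survives in (18), consider the simplified inequality \(\partial _t h_t \le -\lambda h_t + a h_t^{1/2}\) with constants \(\lambda , a>0\); dropping the additive term is an assumption made only for this illustration, and the computation is formal (it is carried out where \(h_t>0\)). Setting \(u_t = h_t^{1/2}\),

$$\begin{aligned} 2u_t\, \partial _t u_t \le -\lambda u_t^2 + a u_t \quad&\Longrightarrow \quad \partial _t u_t \le -\frac{\lambda }{2} u_t + \frac{a}{2} \\&\Longrightarrow \quad u_t \le e^{-\lambda t/2} u_0 + \frac{a}{\lambda } \\&\Longrightarrow \quad h_t \le 2e^{-\lambda t} h_0 + \frac{2a^2}{\lambda ^2}, \end{aligned}$$

where the last step uses \((x+y)^2\le 2x^2+2y^2\). With \(a = C[N^{-\gamma /2} + B_N^{1/2}]\) one has \(a^2 \le 2C^2[N^{-\gamma } + B_N]\), which has the form of (18) provided \(\lambda \) is bounded below uniformly in \(N\).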

Finally, note that \(\mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {V}}_t^{(2)},f_t^{(2)}) \le 2 \mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {V}}_t^{(2)},\bar{\mathbf {U}}_t^{(2)}) +2 \mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {U}}_t^{(2)},f_t^{(2)})\). Moreover, \(\mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {V}}_t^{(2)},\bar{\mathbf {U}}_t^{(2)}) \le \mathbb {E}\frac{1}{N} \sum _{j} (V_{j,t}^2 - U_{j,t}^2)^2 = h_t\) by exchangeability. The conclusion then follows from (18) and the first part of Lemma 4, choosing \((\mathbf {V}_0,\mathbf {U}_0)\) as an optimal coupling with respect to the cost \((x^2-y^2)^2\), so that \(h_0 = \mathcal {W}_2^2(\mathcal {L}(\mathbf {V}_0^{(2)}), (f_0^{(2)})^{\otimes N})\). \(\square \)

Proof of Corollary 1

The argument is the same as in the proof of [8, Corollary 3]; we repeat it here for the reader's convenience. From (4) and (6), it is clear that \(V_{i,t}\) and \(U_{i,t}\) have the same sign (the one of \(\cos \theta \)) after the first jump. If they have the same sign, then

$$\begin{aligned} (V_{i,t} - U_{i,t})^4 \le (V_{i,t} - U_{i,t})^2 (V_{i,t} + U_{i,t})^2 = (V_{i,t}^2 - U_{i,t}^2)^2. \end{aligned}$$
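
This step only uses that \(|V_{i,t}-U_{i,t}| \le |V_{i,t}+U_{i,t}|\) for numbers of the same sign. Writing \(x = V_{i,t}\) and \(y = U_{i,t}\) (shorthand for this remark only) with \(xy\ge 0\):

$$\begin{aligned} (x+y)^2 - (x-y)^2 = 4xy \ge 0, \qquad \text {so} \qquad (x-y)^4 \le (x-y)^2(x+y)^2 = (x^2-y^2)^2. \end{aligned}$$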

Call \(\tau _i\) the time of the first jump of \(V_{i,t}\). Then

$$\begin{aligned} \mathbb {E}(V_{i,t} - U_{i,t})^4&\le \mathbb {E}\mathbf {1}_{\{\tau _i \le t\}} \big (V_{i,t}^2 - U_{i,t}^2\big )^2 +\mathbb {E}\mathbf {1}_{\{\tau _i> t\}} (V_{i,t} - U_{i,t})^4 \\&\le \mathbb {E}\big (V_{i,t}^2 - U_{i,t}^2\big )^2 +\mathbb {E}\mathbf {1}_{\{\tau _i > t\}} (V_{i,0} - U_{i,0})^4. \end{aligned}$$

For the second term we use the fact that \(\tau _i\) is independent of \((V_{i,0},U_{i,0})\) and has exponential distribution with parameter 1, which gives \(e^{-t} \mathbb {E}(V_{i,0} - U_{i,0})^4\). For the first term we simply use (18). This yields

$$\begin{aligned} \mathbb {E}\frac{1}{N} \sum _i (V_{i,t}-U_{i,t})^4&\le CN^{-\gamma } + C e^{-\lambda _N t} \mathbb {E}\frac{1}{N} \sum _i (V_{i,0}^2-U_{i,0}^2)^2 \\&\quad + C \mathbb {E}(E_N - \mathcal {E})^2 + C e^{- t} \mathbb {E}\frac{1}{N} \sum _i (V_{i,0}-U_{i,0})^4. \end{aligned}$$

Finally, we have \(\mathbb {E}\mathcal {W}_4^4(\bar{\mathbf {V}}_t,f_t) \le C\mathbb {E}\mathcal {W}_4^4(\bar{\mathbf {V}}_t,\bar{\mathbf {U}}_t) + C\mathbb {E}\mathcal {W}_4^4(\bar{\mathbf {U}}_t,f_t)\), and the result follows since the first term is bounded above by \(C\mathbb {E}\frac{1}{N} \sum _i (V_{i,t}-U_{i,t})^4\) and using the second part of Lemma 4 on the second term (recall that \(\tilde{\gamma } < \gamma \)). \(\square \)

To prove Theorem 2, we will need the results of [8]. They provide exponential contraction rates in \(\mathcal {W}_4^4\) for both the particle system and the nonlinear process, which in turn imply contraction in \(\mathcal {W}_2^2\). More specifically: assuming \(\sup _N \mathbb {E}V_{1,0}^4 < \infty \) and \(\int _\mathbb {R}v^4 f_0(dv)<\infty \), one has for some \(\alpha >0\)

$$\begin{aligned} \mathcal {W}_2^2(\mathcal {L}(\mathbf {V}_t), \mathcal {U}_N) \le Ce^{-\alpha t} \quad \text { and } \quad \mathcal {W}_2^2(f_t, f_\infty ) \le Ce^{-\alpha t}, \end{aligned}$$
(19)

where \(\mathcal {U}_N\) and \(f_\infty \) are the stationary distributions for the particle system and nonlinear process, respectively. Namely, \(\mathcal {U}_N\) is the uniform distribution on the sphere \(\{\mathbf {x}\in \mathbb {R}^N: \frac{1}{N}\sum _i x_i^2 = r^2\}\) with \(r^2\) chosen randomly with the same law as \(E_N = \frac{1}{N}\sum _i V_{i,0}^2\), and \(f_\infty \) is the Gaussian distribution with mean 0 and variance \(\mathcal {E} = \int v^2 f_0(dv)\) (note that, although the results of [8] are stated in the case \(E_N = 1\) a.s., it is easy to generalize them to the case of particle systems starting a.s. with the same random energy).

Also, it is easy to verify that

$$\begin{aligned} \mathcal {W}_2^2\big ( \mathcal {U}_N, f_\infty ^{\otimes N}\big ) \le CN^{-1/2} + C\mathcal {W}_2^2\big (\mathcal {L}(\mathbf {V}_0), f_0^{\otimes N}\big ). \end{aligned}$$
(20)

Indeed, given a random vector \(\mathbf {Z} = (Z_1,\ldots ,Z_N)\) with law \(f_\infty ^{\otimes N}\) independent of \(\mathbf {V}_0\), call \(Q^2 = \frac{1}{N} \sum _{i=1}^N Z_i^2\) and define \(Y_i = E_N^{1/2}Z_i / Q\), so that \(\mathbf {Y} = (Y_1,\ldots ,Y_N)\) has distribution \(\mathcal {U}_N\) thanks to the fact that \(f_\infty ^{\otimes N}\) is rotation invariant. A straightforward computation shows that \(\frac{1}{N} \sum _i (Z_i-Y_i)^2 = (Q-E_N^{1/2})^2 \le 2(Q-\mathcal {E}^{1/2})^2 + 2(E_N^{1/2}-\mathcal {E}^{1/2})^2\), which is bounded above by \(2\mathcal {W}_2^2(\bar{\mathbf {Z}},f_\infty ) + 2\mathcal {W}_2^2(\bar{\mathbf {V}}_0,f_0)\), since \(\int v^2 f_\infty (dv) = \int v^2 f_0(dv) = \mathcal {E}\) (in general, for measures \(\mu \) and \(\nu \) on \(\mathbb {R}\) with \(Q_\mu ^2 = \int x^2 \mu (dx)\), one has for any \(X \sim \mu \) and \(\tilde{X}\sim \nu \): \(\mathbb {E}(X-\tilde{X})^2 \ge Q_\mu ^2 + Q_\nu ^2 - 2 Q_\mu Q_\nu = (Q_\mu -Q_\nu )^2\)). This coupling gives \(\mathcal {W}_2^2( \mathcal {U}_N, f_\infty ^{\otimes N}) \le \mathbb {E}\frac{1}{N} \sum _i (Z_i-Y_i)^2 \le 2 \mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {Z}},f_\infty ) + 4\mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {V}}_0,\bar{\mathbf {U}}_0) +4\mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {U}}_0,f_0)\), where the first and third terms are controlled by \(CN^{-1/2}\) thanks to (15), and the second term is controlled by \(4\mathbb {E}\frac{1}{N} \sum _i (V_{i,0} - U_{i,0})^2 = 4\mathcal {W}_2^2(\mathcal {L}(\mathbf {V}_0),f_0^{\otimes N})\), this time choosing the initial conditions \((\mathbf {V}_0,\mathbf {U}_0)\) as an optimal coupling with respect to the usual quadratic cost \((x-y)^2\).
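
The two computations used in this argument can be written out as follows (a sketch; \(X\) and \(\tilde{X}\) are as in the parenthetical remark above). For the coupling, since \(Y_i = (E_N^{1/2}/Q)\, Z_i\),

$$\begin{aligned} \frac{1}{N} \sum _{i=1}^N (Z_i-Y_i)^2 = \left( 1-\frac{E_N^{1/2}}{Q}\right) ^2 \frac{1}{N}\sum _{i=1}^N Z_i^2 = \left( 1-\frac{E_N^{1/2}}{Q}\right) ^2 Q^2 = \big (Q-E_N^{1/2}\big )^2. \end{aligned}$$

For the general inequality, the Cauchy-Schwarz inequality gives \(\mathbb {E}X\tilde{X} \le Q_\mu Q_\nu \), hence

$$\begin{aligned} \mathbb {E}(X-\tilde{X})^2 = Q_\mu ^2 + Q_\nu ^2 - 2\mathbb {E}X\tilde{X} \ge Q_\mu ^2 + Q_\nu ^2 - 2Q_\mu Q_\nu = (Q_\mu -Q_\nu )^2. \end{aligned}$$

Taking the infimum over all couplings yields \((Q_\mu -Q_\nu )^2 \le \mathcal {W}_2^2(\mu ,\nu )\). Applied to \(\mu = \bar{\mathbf {Z}}\) (whose second moment is \(Q^2\)) and \(\nu = f_\infty \), and then to \(\mu = \bar{\mathbf {V}}_0\) (whose second moment is \(E_N\)) and \(\nu = f_0\), this gives the bound by \(2\mathcal {W}_2^2(\bar{\mathbf {Z}},f_\infty ) + 2\mathcal {W}_2^2(\bar{\mathbf {V}}_0,f_0)\).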

We are now ready to prove Theorem 2:

Proof of Theorem 2

The argument combines the contraction results of [8] and the propagation of chaos results of [11]. Clearly,

$$\begin{aligned} \mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {V}}_t , f_t) \le C \mathbb {E}[ \mathcal {W}_2^2(\bar{\mathbf {V}}_t, \bar{\mathbf {V}}_\infty ) + \mathcal {W}_2^2(\bar{\mathbf {V}}_\infty ,\bar{\mathbf {Z}}_\infty ) + \mathcal {W}_2^2(\bar{\mathbf {Z}}_\infty ,f_\infty ) + \mathcal {W}_2^2(f_\infty ,f_t)]. \end{aligned}$$
(21)

Here \(\mathbf {V}_\infty \) is a random vector on \(\mathbb {R}^N\) with law \(\mathcal {U}_N\), optimally coupled to \(\mathbf {V}_t\) with respect to the quadratic cost, so \(\mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {V}}_t, \bar{\mathbf {V}}_\infty ) \le \mathbb {E}\frac{1}{N}\sum _i (V_{i,t} - V_{i,\infty })^2 = \mathcal {W}_2^2(\mathcal {L}(\mathbf {V}_t), \mathcal {L}(\mathbf {V}_\infty ))\). Thus, the first and fourth terms are bounded by \(Ce^{-\alpha t}\), thanks to (19). Similarly, \(\mathbf {Z}_\infty \) is chosen with law \(f_\infty ^{\otimes N}\) and optimally coupled to \(\mathbf {V}_\infty \), so for the second term of (21) we have \(\mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {V}}_\infty ,\bar{\mathbf {Z}}_\infty ) \le \mathcal {W}_2^2(\mathcal {U}_N,f_\infty ^{\otimes N})\), which is controlled using (20). The third term is controlled by \(CN^{-1/2}\), thanks to (15). With all these estimates, we obtain from (21):

$$\begin{aligned} \mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {V}}_t,f_t) \le Ce^{-\alpha t} + C\mathcal {W}_2^2\big (\mathcal {L}(\mathbf {V}_0),f_0^{\otimes N}\big ) + CN^{-1/2} \end{aligned}$$
(22)

for some \(\alpha >0\). On the other hand, from [11, Theorem 1] we have

$$\begin{aligned} \mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {V}}_t,f_t) \le C\mathcal {W}_2^2\big (\mathcal {L}(\mathbf {V}_0),f_0^{\otimes N}\big ) + C(1+t)^2N^{-1/3}. \end{aligned}$$
(23)

(In [11] the initial distribution of the particle system was chosen as \(f_0^{\otimes N}\), but the extension to any exchangeable initial condition is straightforward.) Finally, the result is obtained from (22) and (23) by splitting according to the size of \(t\): take \(t_* = \frac{\log N}{3\alpha }\), so (22) yields \(\mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {V}}_t,f_t) \le C\mathcal {W}_2^2(\mathcal {L}(\mathbf {V}_0),f_0^{\otimes N}) + CN^{-1/3}\) for \(t\ge t_*\), whereas (23) gives \(\mathbb {E}\mathcal {W}_2^2(\bar{\mathbf {V}}_t,f_t) \le C\mathcal {W}_2^2(\mathcal {L}(\mathbf {V}_0),f_0^{\otimes N}) + C N^{-1/3}\log ^2 N\) for \(t\le t_*\). The result follows. \(\square \)
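
The choice of \(t_*\) in the proof above can be checked directly. As a sketch, assuming \(N\ge 2\) so that \(\log N \ge \log 2 > 0\):

$$\begin{aligned} t\ge t_*:&\quad e^{-\alpha t} \le e^{-\alpha t_*} = e^{-\frac{1}{3}\log N} = N^{-1/3}, \\ t\le t_*:&\quad (1+t)^2 \le \left( 1 + \frac{\log N}{3\alpha }\right) ^2 \le \left( \frac{1}{\log 2} + \frac{1}{3\alpha }\right) ^2 \log ^2 N. \end{aligned}$$

The threshold \(t_*\) is thus the time at which the exponentially decaying term of (22) reaches the order \(N^{-1/3}\) of (23), and the polynomial growth in \(t\) of (23) costs only a factor \(\log ^2 N\) up to that time.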