1 Introduction

In 1955, Mark Kac [14] introduced a simple model to study the evolution of a dilute gas of N particles with unit mass undergoing pairwise collisions. Instead of following the deterministic evolution of the particles until a collision takes place, he considered particles that collide at random times, with each particle undergoing, on average, a given number of collisions per unit time. Moreover, when a collision takes place, the energy of the two particles is randomly redistributed between them. In such a situation, one can neglect the position of the particles and focus on their velocities. To obtain a model as simple as possible, he considered particles that move in one spatial dimension. This leads to an evolution governed by a master equation for the probability distribution \(f(\underline{v}_N)\), where \(\underline{v}_N\in {\mathbb {R}}^N\) are the velocities of the particles. Since collisions preserve the kinetic energy of the system, to obtain ergodicity one has to restrict the evolution to \(\underline{v}_N\in {\mathbb {S}}^{N-1}(\sqrt{2eN})\), that is, on the surface of constant kinetic energy, with e the kinetic energy per particle. To further simplify the model, he neglected the dependence of a particle's collision rate on its speed, a situation sometimes referred to as Maxwellian particles. In this setting, the dynamical properties of the evolution do not depend on e and it is thus natural to set \(e=1/2\), see [14,15,16] for more details.

The study of the Kac master equation has been very useful to clarify and investigate notions and conjectures arising from the kinetic theory of dilute gases. We refer the reader to Kac’s original works [14] and [15] for extensive discussion.

Kac’s master equation also provides a natural setting to study approach to equilibrium. In the case of the standard Kac model [14], equilibrium is represented by the uniform distribution on the surface of given kinetic energy. Uniform convergence in the sense of the \(L^2\) gap was conjectured by Kac and it was established in [13] while the gap was explicitly computed in [5].

A more natural way to define approach to equilibrium is via the relative entropy. This provides a better setting since the relative entropy, in general, grows only linearly with the number of particles. For the original Kac model, there is no result establishing exponential decay of the relative entropy at a rate uniform in N. Moreover, estimates of the entropy production rate seem to point to a slow decay, at least for short times, see [7, 19].

In [4], the authors studied the evolution of a dilute gas of N particles brought to equilibrium via a Maxwellian thermostat, i.e. an infinite heat reservoir at fixed temperature \(T=\beta ^{-1}\). The velocities of the particles in the system evolve according to the standard Kac collision process described above. On top of this, particles in the system collide with particles in the thermostat at randomly distributed times. In this way, the system and the reservoir exchange energy, but there is no exchange of particles. In particular, the kinetic energy of the system is no longer preserved. They proved that the system admits the Canonical Ensemble as its unique steady state, i.e. in the steady state the probability distribution \(f(\underline{v}_N)\) is the Maxwellian distribution at temperature T. Moreover, the steady state is approached exponentially fast and uniformly in N, both in the sense of the spectral gap, in a suitable \(L^2\) space, and in the sense of the relative entropy. In both cases, the rate of approach is determined by the interaction with the thermostat while the rate of collision between particles in the system appears only in the second spectral gap. They also adapted McKean’s proof [16] of propagation of chaos and obtained a Boltzmann-Kac type effective equation for the evolution of the one particle marginal in the limit \(N\rightarrow \infty \).

In the present work, we study a different way to bring the system to equilibrium. As in [4], we study a system of N particles evolving through pair collisions and interacting with an infinite reservoir at given temperature T; however, the system and the reservoir are allowed to exchange particles. The evolution of the velocities of the particles in the system is again described by a standard Kac collision process. On top of this, at random times a particle in the system can leave it while, still at random times, a particle can enter the system from the reservoir with its velocity distributed according to the Maxwellian at temperature T. Since the reservoir is infinite, no particle can enter or leave the system more than once. Clearly, in this new setting, energy and number of particles are not preserved. We show that this new evolution admits the Grand Canonical Ensemble as its unique steady state. This means that, in the steady state, the probability that the system contains N particles is given by a Poisson distribution while the probability distribution on the velocities, given the number of particles, is the Maxwellian at temperature T.

We also study the approach to equilibrium in a suitable \(L^2\) space and in relative entropy. In both cases, we show that the rate of approach is uniform in the average number of particles. As in [4], the approach to equilibrium, both in \(L^2\) and in relative entropy, is driven by the thermostat alone while the second spectral gap depends on the rate of binary particle collisions. Finally, we look at the emergence of an effective evolution for the particle density in the limit of a large system, that is, when the average number of particles goes to infinity. This requires some adaptation of the concept of propagation of chaos since the number of particles in the system is not constant. Adapting the proof in [16], we show that the relative particle density, defined in (19) and (22) below, satisfies a Boltzmann-Kac type equation.

The rest of the paper is organized as follows. In Sect. 2, we present the model and state our main results. Section 3 contains the proofs of our main results, while in Sect. 4 we report some open problems and present possible areas of future work. Finally the appendix contains the proofs of some technical Lemmas used in Sect. 3.

2 Model and Results

Since we want to describe a dilute gas with uniform density exchanging particles with an infinite reservoir, it is natural to assume that, in a given time, each particle in the system has the same probability of leaving it, independently of the total number N of particles in the system. This implies that the flow of particles from the system to the reservoir is proportional to N. On the other hand, the probability that a particle enters the system from the reservoir depends only on the characteristics of the reservoir, and not on N, so that the flow of particles into the system is independent of N. Finally, since the gas is dilute, given two particles in the system, their probability of colliding in a given time does not depend on the total number of particles in the system. Thus we expect the number of binary collisions in the system, in a given time, to be proportional to \(\left( {\begin{array}{c}N\\ 2\end{array}}\right) \). These are the main heuristic considerations that lead to the formulation of our model, to be introduced formally below.

We consider a system of particles in one space dimension interacting with an infinite reservoir with which it exchanges particles. Since the number of particles in the system is not constant, the phase space is given by \({\mathcal {R}}=\bigcup _{N=0}^\infty {\mathbb {R}}^N\), where \({\mathbb {R}}^0=\{\emptyset \}\) represents the state where no particle is in the system.

The evolution of the system is governed by three separate random processes. First, at exponentially distributed times a particle is added to the system with a velocity randomly chosen from a Maxwellian distribution at temperature T. To simplify notation we choose \(T^{-1}=2\pi \). Second, also at exponentially distributed times, a particle is chosen at random to exit the system and disappear forever with no chance of reentry. Finally, a pair of particles in the system is selected at random to undergo a standard Kac collision.

More precisely, let \(L^1_s({{\mathcal {R}}})=\bigoplus _{N=0}^\infty L^1_s({\mathbb {R}}^N)\) be the Banach space of all states \(\mathbf {f}=(f_N)_{N=0}^\infty \), with \(f_N(\underline{v}_N)\) symmetric under permutation of the \(v_i\), defined by the norm \(\Vert \mathbf {f}\Vert _1:=\sum _N\Vert f_N\Vert _{1,N}\), where \(\Vert f_N\Vert _{1,N}=\int d\underline{v}_N|f_N(\underline{v}_N)|\). We say that \(\mathbf {f}\) is positive if \(f_N(\underline{v}_N)\ge 0\) for every N and almost every \(\underline{v}_N\). If \(\mathbf {f}\) is positive and \(\Vert \mathbf {f}\Vert _1=1\) then \(\mathbf {f}\) is a probability distribution on \({\mathcal {R}}\). In this case, for \(N>0\), \(f_N(\underline{v}_N)\) represents the probability of finding N particles in the system with velocities \({\underline{v}}_N=(v_1,\dots ,v_N)\) while \(f_0\in {\mathbb {R}}\) is the probability that the system contains no particles.

The master equation for the evolution is given by

$$\begin{aligned} \frac{d}{dt}\mathbf {f}={\mathcal {L}} [\mathbf {f}]:=\mu ({\mathcal {I}}[\mathbf {f}]-\mathbf {f}) +\rho ({\mathcal {O}}[\mathbf {f}]- {\mathcal {N}}[\mathbf {f}])+\tilde{\lambda } {\mathcal {K}}[\mathbf {f}] \end{aligned}$$
(1)

where \({\mathcal {I}}\) is the in operator that represents the effect of introducing a particle into the system and, after symmetrization, is given by

$$\begin{aligned} ({\mathcal {I}}\mathbf {f})_{N}(\underline{v})=\frac{1}{N}\sum _{i=1}^N e^{-\pi v_i^2}f_{N-1}(v_1,\dots ,v_{i-1},v_{i+1},\ldots , v_N) \end{aligned}$$
(2)

while \({\mathcal {O}}\) is the out operator that represents the effect of a random particle leaving the system

$$\begin{aligned} ({\mathcal {O}}\mathbf {f})_{N}(\underline{v})=\sum _{i=1}^{N+1}\int dw f_{N+1}(v_1,\ldots , v_{i-1},w,v_i,\ldots ,v_N) \end{aligned}$$
(3)

and

$$\begin{aligned} ({\mathcal {N}}\mathbf {f})_{N}(\underline{v})=Nf_N(v_1,\ldots , v_N)\, . \end{aligned}$$

Observe that, due to the symmetry of \(f_{N+1}\), we can write

$$\begin{aligned} ({\mathcal {O}}\mathbf {f})_{N}(\underline{v}_N)=(N+1)\int dv_{N+1} f_{N+1}(\underline{v}_{N+1})\, . \end{aligned}$$

We also define the thermostat operator \({\mathcal {T}}\) as

$$\begin{aligned} {\mathcal {T}}:=\mu ({\mathcal {I}}-\mathrm{Id}) +\rho ({\mathcal {O}}- {\mathcal {N}})\,. \end{aligned}$$
(4)

These definitions imply that, in every time interval dt, there is a probability \(\mu dt\) of a particle being added to the system. This probability is independent of the number of particles already in the system. In the same time interval, every particle in the system has a probability \(\rho dt\) of leaving the system, which is, again, independent of the number of particles in the system. Thus, as discussed at the beginning of this section, the outflow of particles is proportional to N while the inflow does not depend on N.

Finally \({\mathcal {K}}\) represents the effect of the collisions among particles. It acts independently on each N-particle subspace, that is, \(({\mathcal {K}}\mathbf {f})_N=K_N f_N\) with

$$\begin{aligned} K_N f_N:=\sum _{1\le i<j\le N}(R_{i,j}-\mathrm{Id})f_N:=Q_Nf_N-\left( {\begin{array}{c}N\\ 2\end{array}}\right) f_N \end{aligned}$$
(5)

where \(R_{i,j}\) represents the effect of a collision between particles i and j:

$$\begin{aligned} (R_{i,j}f_N)(\underline{v}_N)=\frac{1}{2\pi }\int f_N(\dots ,v_i\cos \theta -v_j\sin \theta ,\dots ,v_i\sin \theta +v_j\cos \theta , \dots )d\theta \, , \end{aligned}$$
(6)

that is, \(R_{i,j}f_N\) is the average of \(f_N\) over all rotations in the plane \((v_i,v_j)\). In this way, the probability that two given particles suffer a collision in an interval dt is proportional to \(\tilde{\lambda }\) and does not depend on the number of particles in the system.
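As a concrete illustration of this rate structure, the whole process generated by (1) can be simulated with a Gillespie-type algorithm. The Python sketch below is not part of the paper's analysis; the function name `simulate` and all parameter values are arbitrary choices made for illustration.

```python
import math
import random

def simulate(mu=20.0, rho=1.0, lam=1.0, T=400.0, burn=50.0, seed=1):
    """Gillespie-style simulation of the master equation (1): particles
    enter at rate mu with Maxwellian velocities (variance 1/(2*pi)),
    each particle leaves at rate rho, and each pair undergoes a Kac
    collision (6) at rate lam.  Returns the time-averaged particle
    number and the time-averaged value of v^2 per particle."""
    rng = random.Random(seed)
    sigma = math.sqrt(1.0 / (2.0 * math.pi))  # Maxwellian with beta = 2*pi
    v, t = [], 0.0
    tN = tE = tot = 0.0
    while t < T:
        N = len(v)
        r_in, r_out, r_col = mu, rho * N, lam * N * (N - 1) / 2.0
        R = r_in + r_out + r_col
        dt = rng.expovariate(R)
        if t > burn:  # accumulate time averages after a burn-in period
            tN += dt * N
            tE += dt * sum(x * x for x in v)
            tot += dt
        t += dt
        u = rng.random() * R
        if u < r_in:                 # a particle enters from the reservoir
            v.append(rng.gauss(0.0, sigma))
        elif u < r_in + r_out:       # a uniformly chosen particle leaves
            v.pop(rng.randrange(N))
        else:                        # a Kac collision of a random pair
            i, j = rng.sample(range(N), 2)
            th = rng.uniform(0.0, 2.0 * math.pi)
            v[i], v[j] = (v[i] * math.cos(th) - v[j] * math.sin(th),
                          v[i] * math.sin(th) + v[j] * math.cos(th))
    return tN / tot, tE / tN
```

With these rates one expects the particle number to fluctuate around \(\mu /\rho \) and the per-particle second moment around \(1/2\pi \), consistent with the Grand Canonical steady state derived below.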

Since \({\mathcal {L}}\) is a sum of unbounded operators that do not commute, we first need to show that (1) defines an evolution on \(L^1_s({\mathcal {R}})\) and that such an evolution preserves probability distributions. Observe that, although \({\mathcal {L}}\) is unbounded, the operator \({\mathcal {L}}_N\), defined by \({\mathcal {L}}_N\mathbf {f}:=({\mathcal {L}}\mathbf {f})_N\), is bounded as an operator from \(L_s^1({\mathcal {R}})\) to \(L^1_s({\mathbb {R}}^N)\) with \(\Vert {\mathcal {L}}_N\Vert _{1,N}\le 2\mu +(2N+1)\rho +\tilde{\lambda } N^2\). Thus we will take \(D^1=\{\mathbf {f}\;|\,\sum _N N^2\Vert f_N\Vert _{1,N}<\infty \}\) as the domain of \({\mathcal {L}}\). It is easy to see that \(D^1\) is dense in \(L_s^1({\mathcal {R}})\).

In Sect. 3.1 we will build a semigroup of continuous operators \(e^{t{\mathcal {L}}}\) that solves (1) for initial data \(\mathbf {f}\in D^1\) and show that \(e^{t{\mathcal {L}}}\) preserves probability distributions.

Lemma 1

There exists a semigroup of continuous operators \(e^{t{\mathcal {L}}}\) such that if \(\mathbf {f}\in D^1\) then \(\mathbf {f}(t)=e^{t{\mathcal {L}}}\mathbf {f}\) solves (1). For every \(\mathbf {f}\in L^1_s({\mathcal {R}})\) we have

$$\begin{aligned} \Vert e^{t{\mathcal {L}}}\mathbf {f}\Vert _1\le \Vert \mathbf {f}\Vert _1\, . \end{aligned}$$

Moreover, if  \(\mathbf {f}\) is positive then so is \(e^{t{\mathcal {L}}}\mathbf {f}\) and \(\Vert e^{t{\mathcal {L}}}\mathbf {f}\Vert _1=\Vert \mathbf {f}\Vert _1\). Thus (1) generates an evolution that preserves probability distributions.

Proof

See Sect. 3.1. \(\square \)

It is not hard to see that the evolution generated by (1) admits the steady state \(\varvec{\varGamma }\) given by

$$\begin{aligned} (\varvec{\varGamma })_N(\underline{v}_N)=\left( \frac{\mu }{\rho }\right) ^N \frac{e^{-\frac{\mu }{\rho }}}{N!}e^{-\pi |\underline{v}_N|^2}:=a_N\gamma _N(\underline{v}_N) \end{aligned}$$
(7)

where \(\gamma _N(\underline{v}_N)=\prod _{i=1}^N\gamma (v_i)\), with \(\gamma (v)=e^{-\pi v^2}\), is the Maxwellian distribution with \(\beta =2\pi \) in dimension N while \(a_N=\left( \frac{\mu }{\rho }\right) ^N\frac{e^{-\frac{\mu }{\rho }}}{N!}\) is a Poisson distribution on \({\mathbb {N}}\). We observe that \(\varvec{\varGamma }\) is a Grand Canonical Ensemble with temperature \(T=\beta ^{-1}=1/2\pi \), chemical potential \(\chi =(2\pi )^{-1} \log (\rho /\mu )\), and average number of particles \(\langle {\mathcal {N}}\varvec{\varGamma }\rangle =\mu /\rho \) where

$$\begin{aligned} \langle {\mathcal {N}}\mathbf {f}\rangle :=\sum _{N=0}^\infty N \int f_N(\underline{v}_N)d\underline{v}_N\, . \end{aligned}$$
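Stationarity of \(\varvec{\varGamma }\) can be checked componentwise: from (2) and (3) one finds \(({\mathcal {I}}\varvec{\varGamma })_N=a_{N-1}\gamma _N\) and \(({\mathcal {O}}\varvec{\varGamma })_N=(N+1)a_{N+1}\gamma _N\), while \(K_N\gamma _N=0\) by rotation invariance of the Gaussian. The remaining balance between the Poisson weights can be verified numerically; the Python sketch below uses arbitrary parameter values and hypothetical helper names.

```python
import math

def a(N, mu, rho):
    """Poisson weight a_N = (mu/rho)^N e^{-mu/rho} / N! from (7)."""
    if N < 0:
        return 0.0
    r = mu / rho
    return r ** N * math.exp(-r) / math.factorial(N)

def thermostat_balance(N, mu, rho):
    """The N-th component of T[Gamma] divided by gamma_N: using
    (I Gamma)_N = a_{N-1} gamma_N and (O Gamma)_N = (N+1) a_{N+1} gamma_N,
    stationarity reduces to the vanishing of this combination
    (the collision term drops out since K_N gamma_N = 0)."""
    return (mu * (a(N - 1, mu, rho) - a(N, mu, rho))
            + rho * ((N + 1) * a(N + 1, mu, rho) - N * a(N, mu, rho)))
```

The cancellation is exactly the detailed-balance relation \(\mu a_{N-1}=\rho N a_N\) satisfied by the Poisson distribution.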

In Sect. 3.1 we show that \(\varvec{\varGamma }\) is the unique steady state of the evolution generated by (1). Finally, from a physical point of view, it is natural to consider only initial states with finite average number of particles and average kinetic energy, that is probability distributions \(\mathbf {f}\) such that

$$\begin{aligned} \langle {\mathcal {N}}\mathbf {f}\rangle<\infty ,\quad \hbox {and}\qquad \langle {\mathcal {E}}\mathbf {f}\rangle := \sum _{N=0}^\infty \frac{1}{2}\int \bigl (\sum _i v_i^2\bigr ) f_N(\underline{v}_N)d\underline{v}_N<\infty \, . \end{aligned}$$
(8)

Since the Kac collision operator \({\mathcal {K}}\) preserves energy and number of particles, we can derive autonomous equations for the evolutions of \(N(t)=\langle {\mathcal {N}}\mathbf {f}(t)\rangle \) and \(E(t)=\langle {\mathcal {E}} \mathbf {f}(t)\rangle \). Indeed, if \(\mathbf {f}\) is a probability distribution, we obtain

$$\begin{aligned} \frac{d}{dt}N(t)&=\mu -\rho N(t)\nonumber \\ \frac{d}{dt}E(t)&=\frac{\mu }{2\pi }-\rho E(t) \end{aligned}$$
(9)

so that, if (8) holds at time \(t=0\) it holds for every time \(t>0\). See Sect. 3.1 for a derivation of these equations. Letting \(e(t)=E(t)/N(t)\), we get

$$\begin{aligned} \frac{d}{dt}e(t)=\frac{\mu }{N(t)}\left( \frac{1}{2\pi }- e(t)\right) \,. \end{aligned}$$
(10)

Equation (10) resembles Newton's law of cooling for a system like ours. Nevertheless, e(t) is not the natural definition of temperature: it is the ratio of the average kinetic energy to the average particle number, not the average kinetic energy per particle. A more interesting quantity is \(\tilde{e}(t)=\langle v_1^2\mathbf {f}\rangle \), but we were not able to obtain a closed form expression for its evolution.
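Since (9) is a pair of decoupled linear ODEs, \(N(t)\) and \(E(t)\) relax exponentially at rate \(\rho \) to the steady-state values \(\mu /\rho \) and \(\mu /(2\pi \rho )\). The following sketch cross-checks these closed forms against a direct Euler integration; the initial data and parameter values are arbitrary.

```python
import math

MU, RHO = 5.0, 2.0  # illustrative values of mu and rho

def exact_moments(t, N0=12.0, E0=7.0):
    """Closed-form solutions of the linear ODEs (9): exponential
    relaxation, at rate rho, to mu/rho and mu/(2*pi*rho)."""
    Ninf = MU / RHO
    Einf = MU / (2.0 * math.pi * RHO)
    return (Ninf + (N0 - Ninf) * math.exp(-RHO * t),
            Einf + (E0 - Einf) * math.exp(-RHO * t))

def euler_moments(t_end, N0=12.0, E0=7.0, dt=1e-5):
    """Explicit Euler integration of (9), as a cross-check."""
    N, E, t = N0, E0, 0.0
    while t < t_end:
        N += dt * (MU - RHO * N)
        E += dt * (MU / (2.0 * math.pi) - RHO * E)
        t += dt
    return N, E, t
```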

As discussed in the introduction, we are interested in properties that are uniform in the average number of particles in the steady state \(\langle {\mathcal {N}}\varvec{\varGamma }\rangle =\mu /\rho \) and eventually we want to consider the situation where the average number of particles goes to infinity, that is \(\mu /\rho \rightarrow \infty \). A classical way to take such a limit is to require that the collision rate between particles decreases as the average number of particles increases, in such a way that the average number of collisions a given particle suffers in a given time is independent of \(\mu /\rho \), at least when \(\mu /\rho \) is large. This is achieved by setting

$$\begin{aligned} \tilde{\lambda }=\lambda \frac{\rho }{\mu }\, . \end{aligned}$$

Observe that in this way, the scaling in N of \(K_N\) in (5) differs from the scaling in the standard Kac model. Notwithstanding this, they can both be thought of as implementations of the Grad-Boltzmann limit in the two different situations, see [11].

One way to study the approach of an initial state \(\mathbf {f}\) toward \(\varvec{\varGamma }\) is by computing the spectral gap of \({\mathcal {L}}\). Since \({\mathcal {L}}\) is not self-adjoint on \(L^2_s({\mathcal {R}})\) we perform a ground state transformation, setting

$$\begin{aligned} f_N:=a_N\gamma _Nh_N\,. \end{aligned}$$
(11)

We will express (11) as \(\mathbf {f}=\varvec{\varGamma } \mathbf {h}\). Inserting the above definition in (1) we get

$$\begin{aligned} \frac{d}{dt}\mathbf {h}=\widetilde{{\mathcal {L}}} \mathbf {h} :=\rho ({\mathcal {P}}^+\mathbf {h}-{\mathcal {N}}\mathbf {h})+\mu ({\mathcal {P}}^-\mathbf {h}-\mathbf {h})+\tilde{\lambda }{\mathcal {K}}\mathbf {h}\end{aligned}$$

where we have set

$$\begin{aligned}&({\mathcal {P}}^+\mathbf {h})_{N}=\sum _{i=1}^Nh_{N-1}(v_1,\dots ,v_{i-1},v_{i+1},\dots ,v_N)\\&({\mathcal {P}}^-\mathbf {h})_{N}=\frac{1}{N+1}\sum _{i=1}^{N+1}\int dw\,e^{-\pi w^2}h_{N+1}(v_1,\dots ,v_{i-1},w,v_i,\dots ,v_N)\,. \end{aligned}$$

In this representation, the steady state is given by the vector \(\mathbf {e}^0\) such that \((\mathbf {e}^0)_N\equiv 1\) for every N. Thus \(\widetilde{{\mathcal {L}}}\) is an unbounded operator on the Hilbert space

$$\begin{aligned} L^2_s({\mathcal {R}},\varvec{\varGamma })=\bigoplus _{N=0}^\infty L^2_s({\mathbb {R}}^N,a_N\gamma _N(\underline{v}_N)) \end{aligned}$$

of all states \(\mathbf {h}=(h_0,h_1,h_2,\ldots )\) with \(h_N(\underline{v}_N)\) symmetric under permutations of the \(v_i\) and defined by the scalar product

$$\begin{aligned} (\mathbf {h}_1,\mathbf {h}_2):=\sum _{N=0}^\infty a_N (h_{1,N},h_{2,N})_N :=\sum _{N=0}^\infty a_N\int h_{1,N}(\underline{v}_N)h_{2,N}(\underline{v}_N) \gamma _N(\underline{v}_N)d\underline{v}_N\, . \end{aligned}$$

As for \({\mathcal {L}}\), defining \(\widetilde{{\mathcal {L}}}_N\mathbf {h}=(\widetilde{{\mathcal {L}}}\mathbf {h})_N\) we get a bounded operator from \(L^2_s({\mathcal {R}},\varvec{\varGamma })\) to \(L^2_s({\mathbb {R}}^N,\gamma _N(\underline{v}_N))\) so that, calling \(\Vert h_N\Vert _{2,N}=(h_{N},h_{N})_N\), we can take

$$\begin{aligned} D^2=\bigl \{\mathbf {h}\,\big |\,\sum _{N=0}^\infty a_N\Vert (\widetilde{{\mathcal {L}}}\mathbf {h})_N\Vert _{2,N}<\infty \bigr \} \end{aligned}$$

as the domain of \(\widetilde{{\mathcal {L}}}\). The following Theorem shows that \(\widetilde{{\mathcal {L}}}\) defines an evolution on \(L^2_s({\mathcal {R}},\varvec{\varGamma })\).

Theorem 2

The generator \(\widetilde{{\mathcal {L}}}\) is self-adjoint and non-positive definite on \(L^2_s({\mathcal {R}},\varvec{\varGamma })\). Furthermore, if we define

$$\begin{aligned} \varDelta =\sup \{(\mathbf {h},\widetilde{{\mathcal {L}}}\mathbf {h})\,|\, \mathbf {h}\in D^2, \Vert \mathbf {h}\Vert _2=1, \mathbf {h}\perp \mathbf {E}_0\} \end{aligned}$$

where \(\Vert \mathbf {h}\Vert _2=(\mathbf {h},\mathbf {h})\) and \(\mathbf {E}_0=\mathrm {span}\{\mathbf {e}^0\}\), we get

$$\begin{aligned} \varDelta =-\rho \, . \end{aligned}$$

Moreover \(\varDelta \) is an eigenvalue and the associated eigenspace is \(\mathbf {E}_1=\mathrm {span}\{\mathbf {e}_1,\mathbf {e}_{(0,0,1)}\}\) with \(\mathbf {e}_1=\sqrt{\frac{\rho }{\mu }}{\mathcal {P}}^+\mathbf {e}^0-\sqrt{\frac{\mu }{\rho }}\mathbf {e}^0\) while

$$\begin{aligned} (\mathbf {e}_{(0,0,1)})_N(\underline{v}_N)= \sqrt{\frac{\rho }{2\mu }} \sum _{i=1}^N(2\pi v_i^2-1) \,. \end{aligned}$$

Proof

See Sect. 3.2. \(\square \)

Due to the invariance of even, second degree polynomials under the Kac collision operator \({\mathcal {K}}\), Theorem 2 shows that the spectral gap of the generator \(\widetilde{{\mathcal {L}}}\) is completely determined by the presence of the reservoir. This is not surprising since all states \(\mathbf {h}\) such that \(h_N\) is rotationally invariant for every N are in the null space of \({\mathcal {K}}\).

As in [4], to see the effect of the Kac collision operator \({\mathcal {K}}\), we have to look at the second gap, defined as

$$\begin{aligned} \varDelta _2=\sup \{(\mathbf {h},\widetilde{{\mathcal {L}}}\mathbf {h})\,|\, \mathbf {h}\in D^2, \Vert \mathbf {h}\Vert _2=1, \mathbf {h}\perp \mathbf {E}_0\oplus \mathbf {E}_1\}\, . \end{aligned}$$
(12)

Theorem 3

If

$$\begin{aligned} \rho>\frac{\lambda }{4}+2\lambda \sqrt{\frac{\rho }{\mu }} \quad \mathrm {and}\quad \frac{\mu }{\rho }>256 \end{aligned}$$
(13)

we have

$$\begin{aligned} -\rho -\frac{\lambda }{4}\le \varDelta _2<-\rho -\frac{\lambda }{4} +2\lambda \sqrt{\frac{\rho }{\mu }}\, . \end{aligned}$$

Moreover \(\varDelta _2\) is an eigenvalue and the associated eigenspace is contained in the space of all states \(\mathbf {h}\) such that \(h_N\) is an even, fourth degree polynomial.

Proof

See Sect. 3.3. \(\square \)

Since \(\mu /\rho \) is the average number of particles in the steady state, the conditions in (13) are not too restrictive.

It is possible to see that, as in the case of the standard Kac evolution, the \(L^2\) norm discussed above does not scale well with the average number of particles in the system and thus it is not a good measure of distance from the steady state if \(\mu /\rho \) is large. A better measure is the entropy of a probability distribution \(\mathbf {f}\) relative to the steady state \(\varvec{\varGamma }\) defined as

$$\begin{aligned} {\mathcal {S}}(\mathbf {f}\,|\,\varvec{\varGamma })=\sum _Na_N\int d\underline{v}_N\, h_N (\underline{v}_N)\log h_N(\underline{v}_N)\gamma _N(\underline{v}_N) \end{aligned}$$

where, as before, \(\mathbf {f}=\varvec{\varGamma }\mathbf {h}\) as in (11), and \(a_N\) and \(\gamma _N\) are defined in (7).

As usual, it is easy to show using convexity that \({\mathcal {S}}(\mathbf {f}\,|\,\varvec{\varGamma })\ge 0\), with equality if and only if \(\mathbf {f}=\varvec{\varGamma }\). Moreover, from Lemma 1 and convexity, it follows that \({\mathcal {S}}(\mathbf {f}(t)\,|\,\varvec{\varGamma })\le {\mathcal {S}}(\mathbf {f}\,|\,\varvec{\varGamma })\) where \(\mathbf {f}(t)=e^{t{\mathcal {L}}}\mathbf {f}\). In Sect. 3.4, we show that, thanks to the presence of the reservoir, the entropy production rate is strictly negative. More precisely, assuming that \(\mathbf {f}=\varvec{\varGamma } \mathbf {h}\in D^1\) and \(\varvec{\varGamma } \mathbf {h}\log \mathbf {h}\in D^1\), we essentially obtain that

$$\begin{aligned} \frac{d}{dt}{\mathcal {S}}(\mathbf {f}(t)\,|\,\varvec{\varGamma }) \le -\rho {\mathcal {S}}(\mathbf {f}(t)\,|\,\varvec{\varGamma })\, . \end{aligned}$$
(14)

See Lemmas 19 and 20 in Sect. 3.4 below for a precise statement. From (14) we obtain the following Theorem.

Theorem 4

If \(\mathbf {f}=\varvec{\varGamma }\mathbf {h}\in D^1\) is a probability distribution such that \(\varvec{\varGamma }\mathbf {h}\log \mathbf {h}\in D^1\) then

$$\begin{aligned} {\mathcal {S}}(\mathbf {f}(t)\,|\,\varvec{\varGamma })\le e^{-\rho t} {\mathcal {S}}(\mathbf {f}(0)\,|\,\varvec{\varGamma })\,. \end{aligned}$$
(15)

Proof

See Sect. 3.4. \(\square \)

As in the case of Theorem 2, convergence to equilibrium in entropy is completely dominated by the presence of the thermostat; that is, Theorem 4 remains valid in the case \(\tilde{\lambda }=0\), where there are no collisions among the particles.
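This \(\tilde{\lambda }=0\) case can be probed numerically. For a state whose velocities are already Maxwellian, the relative entropy reduces to that of the particle-number distribution, which evolves under the thermostat part of (1) alone. The sketch below, with an arbitrary truncation level and arbitrary parameters, integrates this birth-death master equation and compares the entropy with the bound of Theorem 4.

```python
import math

MU, RHO, NMAX = 2.0, 1.0, 60  # illustrative parameters; NMAX truncates N
A = [(MU / RHO) ** N * math.exp(-MU / RHO) / math.factorial(N)
     for N in range(NMAX)]  # Poisson steady-state weights a_N from (7)

def rel_entropy(p):
    """Relative entropy of a number distribution w.r.t. the Poisson a_N."""
    return sum(p[N] * math.log(p[N] / A[N]) for N in range(NMAX) if p[N] > 0)

def evolve(p, t_end, dt=2e-4):
    """Euler integration of the thermostat part of (1) acting on the
    number distribution alone (lambda-tilde = 0, Maxwellian velocities)."""
    t = 0.0
    while t < t_end:
        q = p[:]
        for N in range(NMAX):
            gain = MU * (p[N - 1] if N > 0 else 0.0)
            if N + 1 < NMAX:
                gain += RHO * (N + 1) * p[N + 1]
            q[N] += dt * (gain - (MU + RHO * N) * p[N])
        p, t = q, t + dt
    return p, t
```

Starting from a deterministic particle number, the computed entropy stays below \(e^{-\rho t}\) times its initial value, as (15) predicts.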

We can now discuss the validity of a Boltzmann-Kac type equation when the average number of particles in the system goes to infinity. To follow the standard analysis in [16], we first have to define what a chaotic sequence is in the present situation. It is natural to call \(\mathbf {f}=(f_0,f_1,f_2,\ldots )\) a product state if it has the form

$$\begin{aligned} f_N(\underline{v}_N)=e^{-\eta }\frac{\eta ^N}{N!}\prod _{i=1}^Ng(v_i) \end{aligned}$$
(16)

where g(v) is a probability density on \({\mathbb {R}}\) and \(\eta >0\) is the average number of particles. We observe that for the state \(\mathbf {f}\) in (16), we have

$$\begin{aligned} \left( e^{t{\mathcal {T}}}\mathbf {f}\right) _N=e^{-\eta (t)}\frac{\eta (t)^N}{N!} \prod _{i=1}^Ng(v_i,t) \end{aligned}$$
(17)

where \({\mathcal {T}}\) is defined in (4) and, calling \(l(v,t)=\frac{\rho }{\mu }\eta (t)g(v,t)\), we get

$$\begin{aligned} \eta (t)&=e^{-\rho t}\eta +(1-e^{-\rho t})\frac{\mu }{\rho }\nonumber \\ l(v,t)&=e^{-\rho t}l(v)+(1-e^{-\rho t})\gamma (v) \end{aligned}$$
(18)

This implies that the thermostat preserves the product structure exactly. See Sect. 3.5 for a derivation of (17) and (18).
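The closed forms (18) can be cross-checked by observing that they satisfy the relaxation equations \(\dot{\eta }=\mu -\rho \eta \) and \(\partial _t l=\rho (\gamma -l)\), which is one way to read the action of the thermostat on product states. The finite-difference check below is a sketch with arbitrary parameters and an arbitrary initial profile \(l_0\).

```python
import math

MU, RHO = 4.0, 1.5  # illustrative values of mu and rho

def gamma(v):
    """The Maxwellian gamma(v) = exp(-pi v^2)."""
    return math.exp(-math.pi * v * v)

def eta(t, eta0=10.0):
    """eta(t) from (18)."""
    return math.exp(-RHO * t) * eta0 + (1.0 - math.exp(-RHO * t)) * MU / RHO

def l(v, t, l0=lambda v: 0.5 * math.exp(-abs(v))):
    """l(v,t) from (18), for an arbitrary initial profile l0."""
    return math.exp(-RHO * t) * l0(v) + (1.0 - math.exp(-RHO * t)) * gamma(v)
```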

Thus we call a sequence of states \(\mathbf {f}_n=(f_{n,0},f_{n,1},f_{n,2},\ldots )\) chaotic if it approaches the structure (16) while the average number of particles \(\langle {\mathcal {N}}\mathbf {f}_n\rangle \) goes to infinity. More precisely, let \(\mu _n\) be a sequence such that \(\lim _{n\rightarrow \infty }\mu _n=\infty \) and define

$$\begin{aligned} F_n^{(k)}(\underline{v}_k)=\left( \frac{\rho }{\mu _n}\right) ^{k}\sum _{N\ge k} \frac{N!}{(N-k) !}\int f_{n,N}(\underline{v}_k,\underline{v}_{N-k})d\underline{v}_{N-k} \end{aligned}$$
(19)

where the factor \(\frac{N!}{(N-k)!}\) accounts for the possible ways to choose the k particles with velocities \(\underline{v}_k\). We also define

$$\begin{aligned} \Vert \mathbf {f}\Vert _1^{(k)}=\sum _{N\ge k}\frac{N!}{(N-k)!}\Vert f_N\Vert _{1,N} \end{aligned}$$
(20)

so that \(\Vert F_n^{(k)}\Vert _{1,k}\le \left( \frac{\rho }{\mu _n}\right) ^k\Vert \mathbf {f}_n\Vert _1^{(k)}\).

Observe that, if \(\mathbf {f}_n\) is a product state of the form (16) with average number of particles \(\eta _n\), that is if

$$\begin{aligned} f_{n,N}(\underline{v}_N)=e^{-\eta _n}\frac{\eta _n^N}{N!}\prod _{i=1}^Ng(v_i) \end{aligned}$$

we get

$$\begin{aligned} F_n^{(k)}(\underline{v}_k)=\left( \frac{\eta _n\rho }{\mu _n}\right) ^{k} \prod _{i=1}^kg(v_i)\, . \end{aligned}$$

Thus the factor \(\left( \frac{\rho }{\mu _n}\right) ^{k}\) in (19) ensures that, at least in this case, if \(\lim _{n\rightarrow \infty }\eta _n/\mu _n\) exists then \(\lim _{n\rightarrow \infty } F_n^{(k)}\) also exists.
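The computation behind this last formula reduces to the \(k\)-th factorial moment of a Poisson random variable, \(\sum _{N\ge k}\frac{N!}{(N-k)!}\,e^{-\eta }\frac{\eta ^N}{N!}=\eta ^k\). A quick numerical confirmation (the helper name is hypothetical; the pmf is advanced iteratively to avoid overflow):

```python
import math

def marginal_prefactor(eta, k, terms=200):
    """The series produced when the marginal (19) is applied to the
    product state (16): the sum over N >= k of N!/(N-k)! times the
    Poisson(eta) probability of N, i.e. the k-th factorial moment."""
    total, pmf = 0.0, math.exp(-eta)  # pmf = P(N = 0)
    for N in range(terms):
        if N >= k:
            total += pmf * math.prod(range(N - k + 1, N + 1))  # N!/(N-k)!
        pmf *= eta / (N + 1)          # advance to P(N = N + 1)
    return total
```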

To generalize these observations, we say that \(F_n^{(k)}\) converges weakly to \(F^{(k)}\) if, for any continuous and bounded test function \(\phi _k:{\mathbb {R}}^k\rightarrow {\mathbb {R}}\), we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\int _{{\mathbb {R}}^k}F_n^{(k)}(\underline{v}_k)\phi _k(\underline{v}_k)d\underline{v}_k= \int _{{\mathbb {R}}^k}F^{(k)}(\underline{v}_k)\phi _k(\underline{v}_k)d\underline{v}_k \end{aligned}$$

and we write \(\text {w-lim}_{n\rightarrow \infty }F_n^{(k)}=F^{(k)}\). Given a sequence \(\mathbf {f}_n\) of probability distributions such that

$$\begin{aligned} \Vert \mathbf {f}_n\Vert _1^{(r)}\le M^r\left( \frac{\mu _n}{\rho }\right) ^r \end{aligned}$$
(21)

for some \(M>0\) and every n and r, we say that \(\mathbf {f}_n\) is chaotic (w.r.t. \(\mu _n\)) if, for some F

$$\begin{aligned} \mathop {\text {w-lim}}\limits _{n\rightarrow \infty }F^{(1)}_n=F \end{aligned}$$
(22)

while for every \(k>1\) we have

$$\begin{aligned} \mathop {\text {w-lim}}\limits _{n\rightarrow \infty }F_n^{(k)}=F^{\otimes k} \end{aligned}$$
(23)

where \(F^{\otimes k}(\underline{v}_k)=\prod _{i=1}^k F(v_i)\). Observe that

$$\begin{aligned} \int F(v)dv=\lim _{n\rightarrow \infty }\frac{\langle {\mathcal {N}}\mathbf {f}_n\rangle \rho }{\mu _n} \end{aligned}$$
(24)

so that we can see F(v) as the relative particle density.

In [14, 16] a sequence of probability distributions \(f_n:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) is said to be chaotic if, calling

$$\begin{aligned} \widetilde{F}_n^{(k)}(\underline{v}_k)=\int f_{n}(\underline{v}_k,\underline{v}_{n-k})d\underline{v}_{n-k}\, , \end{aligned}$$

we have

$$\begin{aligned} \mathop {\text {w-lim}}\limits _{n\rightarrow \infty }\widetilde{F}^{(1)}_n =\widetilde{F}\qquad \hbox {and}\qquad \mathop {\text {w-lim}}\limits _{n\rightarrow \infty }\widetilde{F}_n^{(k)} =\widetilde{F}^{\otimes k}\, . \end{aligned}$$

If we consider the sequence of states \(\mathbf {f}_n\) defined as

$$\begin{aligned} (\mathbf {f}_n)_N={\left\{ \begin{array}{ll} f_n &{} n=N\\ 0 &{} n\not = N \end{array}\right. } \end{aligned}$$

with the natural choice \(\mu _n=n\rho \), since the number of particles in \(\mathbf {f}_n\) is exactly n, from (19) we get \(F=\widetilde{F}\) and thus \(F^{(k)}=\widetilde{F}^{(k)}\). In this sense, (19) and (23) can be considered a generalization of the classical definition in [14].

Let now

$$\begin{aligned} \mathbf {f}_n(t)=e^{{\mathcal {L}}_n t}\mathbf {f}_n(0) \end{aligned}$$

where \({\mathcal {L}}_n\) is given by (1) with \(\mu =\mu _n\) and

$$\begin{aligned} \tilde{\lambda }=\tilde{\lambda }_n=\lambda \frac{\rho }{\mu _n}\, . \end{aligned}$$
(25)

In Sect. 3.6, we prove that \(e^{{\mathcal {L}}_n t}\) propagates chaos in the sense that, if \(\mathbf {f}_n(0)\) forms a chaotic sequence, then \(\mathbf {f}_n(t)\) also forms a chaotic sequence for every t. This gives the following theorem.

Theorem 5

If \(\mathbf {f}_n(0)\) forms a chaotic sequence w.r.t. \(\mu _n\), with \(\lim _{n\rightarrow \infty }\mu _n=\infty \), then also \(\mathbf {f}_n(t)\) forms a chaotic sequence for every \(t\ge 0\). Moreover the relative particle density

$$\begin{aligned} F(v,t)=\mathop {\text {w-lim}}\limits _{n\rightarrow \infty } \frac{\rho }{\mu _n}\sum _{N=1}^\infty N \int f_{n,N}(v,\underline{v}_{N-1},t)d\underline{v}_{N-1} \end{aligned}$$

satisfies the Boltzmann-Kac type equation

$$\begin{aligned} \frac{d}{dt}F(v,t)&= -\rho (F(v,t)-\gamma (v))\nonumber \\&\quad +\lambda \int _{\mathbb {R}}dw\int \frac{d\theta }{2\pi }[F(v\cos \theta +w\sin \theta ,t)\nonumber \\ {}&\quad \times F(-v\sin \theta +w\cos \theta ,t)-F(w,t)F(v,t)]\, . \end{aligned}$$
(26)

Proof

See Sect. 3.6. \(\square \)
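A direct consequence of (26) worth recording is that the Maxwellian \(\gamma \) is a steady state: the thermostat term vanishes at \(F=\gamma \), and the collision integrand vanishes pointwise because the rotation preserves \(v^2+w^2\). The Python sketch below checks this pointwise cancellation at randomly sampled points.

```python
import math
import random

def gamma(v):
    """The Maxwellian gamma(v) = exp(-pi v^2)."""
    return math.exp(-math.pi * v * v)

def collision_integrand(v, w, theta, F):
    """The bracket in the collision term of (26)."""
    vp = v * math.cos(theta) + w * math.sin(theta)
    wp = -v * math.sin(theta) + w * math.cos(theta)
    return F(vp) * F(wp) - F(w) * F(v)
```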

3 Proofs

3.1 Proof of Lemma 1

The results in this section are based on two observations. The first is that the collision operator \({\mathcal {K}}\) acts independently on each \(L^1_s({\mathbb {R}}^N)\) and thus preserves positivity and probability. The second is that, due to the different scaling in N of the in and out operators, see (2) and (3), for large N the outflow of particles dominates the inflow. Thus, even if the initial probability of having a number of particles much larger than the steady state average \(\mu /\rho \) is high, this probability will rapidly decrease toward its steady state value, see (34) and (46) below. In particular, this prevents probability from “leaking out at infinity”.

We will now construct a solution of (1) in three steps, starting from \({\mathcal {K}}\) alone, using a partial power series expansion, see (27) below, and then adding the out operator \({\mathcal {O}}\) and finally the in operator \({\mathcal {I}}\), using a Duhamel style expansion, see (33) and (40) below. These expansions are strongly inspired by the stochastic nature of the evolution studied, see Remark 8 below for more details.

It is natural to define \(\left( e^{t\tilde{\lambda } {\mathcal {K}}}\mathbf {f}\right) _N=e^{t\tilde{\lambda } K_N}f_N\) where we can write

$$\begin{aligned} e^{t\tilde{\lambda } K_N}f_N=e^{-\tilde{\lambda } t\left( {\begin{array}{c}N\\ 2\end{array}}\right) } \sum _{n=0}^\infty \frac{\tilde{\lambda }^nt^n Q_N^n}{n!}f_N\, . \end{aligned}$$
(27)

Observing that

$$\begin{aligned} \Vert e^{t\tilde{\lambda } K_N}f_N-f_N\Vert _{1,N}&\le \left( 1-e^{-\tilde{\lambda } t\left( {\begin{array}{c}N\\ 2\end{array}}\right) }\right) \Vert f_N\Vert _{1,N}+\Bigl \Vert e^{-\tilde{\lambda } t\left( {\begin{array}{c}N\\ 2\end{array}}\right) } \sum _{n=1}^\infty \frac{\tilde{\lambda }^nt^n Q_N^n}{n!} f_N\Bigr \Vert _{1,N}\nonumber \\&\le 2\left( 1-e^{-\tilde{\lambda } t\left( {\begin{array}{c}N\\ 2\end{array}}\right) }\right) \Vert f_N\Vert _{1,N} \end{aligned}$$
(28)

and using Dominated Convergence to get

$$\begin{aligned} \lim _{t\rightarrow 0^+}\sum _{N=0}^\infty \left( 1-e^{-\tilde{\lambda } t \left( {\begin{array}{c}N\\ 2\end{array}}\right) }\right) \Vert f_N\Vert _{1,N}=0 \end{aligned}$$

we obtain that \(\lim _{t\rightarrow 0^+}e^{t\tilde{\lambda } {\mathcal {K}}}\mathbf {f}=\mathbf {f}\). Similarly, we get

$$\begin{aligned}&\frac{1}{t}\Vert e^{t\tilde{\lambda } K_N}f_N-f_N -\tilde{\lambda } t K_Nf_N\Vert _{1,N}\\&\quad \le \frac{1}{t}\left( e^{-\tilde{\lambda } t\left( {\begin{array}{c}N\\ 2\end{array}}\right) } -1+\tilde{\lambda } t\left( {\begin{array}{c}N\\ 2\end{array}}\right) \right) \Vert f_N\Vert _{1,N} +\left( 1- e^{-\tilde{\lambda } t\left( {\begin{array}{c}N\\ 2\end{array}}\right) }\right) \Vert \tilde{\lambda } Q_Nf_N\Vert _{1,N}\\&\qquad +\frac{1}{t}\Bigl \Vert e^{-\tilde{\lambda } t\left( {\begin{array}{c}N\\ 2\end{array}}\right) } \sum _{n=2}^\infty \frac{\tilde{\lambda }^nt^n Q_N^n}{n!}f_N\Bigr \Vert _{1,N}\\&\quad \le \frac{2}{t}\left( e^{-\tilde{\lambda } t\left( {\begin{array}{c}N\\ 2\end{array}}\right) }-1 +\tilde{\lambda } t\left( {\begin{array}{c}N\\ 2\end{array}}\right) \right) \Vert f_N\Vert _{1,N}+\tilde{\lambda } \left( {\begin{array}{c}N\\ 2\end{array}}\right) \left( 1-e^{-\tilde{\lambda } t\left( {\begin{array}{c}N\\ 2\end{array}}\right) }\right) \Vert f_N\Vert _{1,N} \end{aligned}$$

so that, if \(\mathbf {f}\in D^1\), then \(\lim _{t\rightarrow 0^+}\left( e^{t\tilde{\lambda } {\mathcal {K}}}\mathbf {f}-\mathbf {f}\right) /t =\tilde{\lambda }{\mathcal {K}}\mathbf {f}\). Since \(\Vert e^{t\tilde{\lambda } K_N}f_N\Vert _{1,N}\le \Vert f_N\Vert _{1,N}\) we get \(\Vert e^{t\tilde{\lambda }{\mathcal {K}}}\mathbf {f}\Vert _1\le \Vert \mathbf {f}\Vert _1\). Moreover, if \(f_N\) is positive, then also \(e^{t\tilde{\lambda } K_N}f_N\) is positive and \(\Vert e^{t\tilde{\lambda } K_N}f_N\Vert _{1,N}= \Vert f_N\Vert _{1,N}\). Thus if \(\mathbf {f}\) is positive then \(e^{t\tilde{\lambda }{\mathcal {K}}}\mathbf {f}\) is positive and \(\Vert e^{t\tilde{\lambda }{\mathcal {K}}}\mathbf {f}\Vert _1= \Vert \mathbf {f}\Vert _1\).

Let now \(\mathbf {f}(t)\) be a solution of

$$\begin{aligned} \frac{d}{dt} \mathbf {f}(t)=\tilde{\lambda } {\mathcal {K}}\mathbf {f}(t)+\rho ({\mathcal {O}}-{\mathcal {N}})\mathbf {f}(t) \end{aligned}$$
(29)

with \(\mathbf {f}(0)=\mathbf {f}\in D^1\). If such a solution exists, it satisfies the Duhamel formula

$$\begin{aligned} f_N(t)=e^{(\tilde{\lambda } K_N-\rho N)t}f_N+\rho \int _0^t e^{(\tilde{\lambda } K_N-\rho N)(t-s)} \left( {\mathcal {O}}\mathbf {f}(s)\right) _N\,ds \end{aligned}$$
(30)

where the construction of \(e^{(\tilde{\lambda } {\mathcal {K}}-\rho {\mathcal {N}})t}\) is analogous to that of \(e^{\tilde{\lambda } {\mathcal {K}}t}\). From (30) we get

$$\begin{aligned} \left\| f_N(t)\right\| _{1,N}\le e^{-\rho Nt}\Vert f_N\Vert _{1,N} +\int _0^t e^{-\rho N(t-s)} \rho (N+1)\left\| f_{N+1}(s)\right\| _{1,N+1}ds \end{aligned}$$
(31)

where we have used that

$$\begin{aligned} \Vert ({\mathcal {O}}\mathbf {f})_N\Vert _{1,N}=(N+1)\int \left| \int f_{N+1}(\underline{v}_{N+1})dv_{N+1}\right| d\underline{v}_N\le (N+1)\Vert f_{N+1}\Vert _{1,N+1}\,. \end{aligned}$$
(32)

Observe that, in (32), equality holds if and only if \(f_{N+1}\) is everywhere positive or everywhere negative. To construct a solution of (29) we iterate (30) to define

$$\begin{aligned} {\mathcal {Q}}(t)\mathbf {f}&=e^{(\tilde{\lambda } {\mathcal {K}}-\rho {\mathcal {N}})t}\mathbf {f}+\sum _{n=1}^\infty \int \limits _{0<t_1<\ldots<t_n<t} e^{(\tilde{\lambda } {\mathcal {K}}-\rho {\mathcal {N}})(t-t_n)} \rho {\mathcal {O}}e^{(\tilde{\lambda } {\mathcal {K}}-\rho {\mathcal {N}})(t_n-t_{n-1})} \nonumber \\&\quad \quad \cdots \rho {\mathcal {O}}e^{(\tilde{\lambda } {\mathcal {K}}-\rho {\mathcal {N}})t_1} \mathbf {f}\,dt_1\cdots dt_n \end{aligned}$$
(33)

and then show that \({\mathcal {Q}}(t)\) is a semigroup of bounded operators and that \(\mathbf {f}(t)={\mathcal {Q}}(t)\mathbf {f}\) solves (29) if \(\mathbf {f}\in D^1\). Using (31) iteratively we get

$$\begin{aligned} \Bigl \Vert \left( {\mathcal {Q}}(t)\mathbf {f}\right) _N\Bigr \Vert _{1,N}&\le \sum _{n\ge 0}e^{-\rho N t}\frac{(N+n)!}{N!} \int _{0<t_1<\cdots<t_n<t}\nonumber \\&\quad \prod _{i=1}^{n}e^{\rho (N+n-i)t_{i}}\rho e^{-\rho (N+n-i+1)t_{i}}dt_1\cdots dt_n\Vert f_{N+n}\Vert _{1,N+n}\nonumber \\&=e^{-\rho N t}\sum _{n\ge 0}\left( {\begin{array}{c}N+n\\ N\end{array}}\right) \left( 1-e^{-\rho t}\right) ^{n}\Vert f_{N+n}\Vert _{1,N+n} \end{aligned}$$
(34)

where, in the last identity, we have used that

$$\begin{aligned} \rho ^n\int _{0\le t_1\le \cdots \le t_n\le t} \prod _{i=1}^n e^{-\rho t_{i}}dt_1\cdots dt_n =\frac{1}{n!}(1-e^{-\rho t})^n\, . \end{aligned}$$
(35)
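Identity (35) can be checked numerically through the recursion \(J_n(t)=\int _0^t e^{-\rho s}J_{n-1}(s)\,ds\), \(J_0\equiv 1\), which integrates the ordered variables one at a time. A hedged sketch (the values of \(\rho \), t, n and the grid size are arbitrary):

```python
import math

def ordered_exponential_integral(rho, t, n, steps=20000):
    # rho^n * int_{0 <= t1 <= ... <= tn <= t} prod_i exp(-rho t_i) dt_1...dt_n,
    # computed through the recursion J_k(t) = int_0^t exp(-rho s) J_{k-1}(s) ds, J_0 = 1.
    h = t / steps
    J = [1.0] * (steps + 1)                      # J_0 on the grid
    for _ in range(n):
        nxt = [0.0] * (steps + 1)
        for i in range(1, steps + 1):
            a = math.exp(-rho * (i - 1) * h) * J[i - 1]
            b = math.exp(-rho * i * h) * J[i]
            nxt[i] = nxt[i - 1] + 0.5 * h * (a + b)   # trapezoid rule
        J = nxt
    return rho ** n * J[-1]

rho, t, n = 1.3, 2.0, 3
lhs = ordered_exponential_integral(rho, t, n)
rhs = (1 - math.exp(-rho * t)) ** n / math.factorial(n)
```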

After summing over N we get

$$\begin{aligned} \Vert {\mathcal {Q}}(t)\mathbf {f}\Vert _1\le \sum _{N\ge 0} \sum _{n\ge 0} \left( {\begin{array}{c}N+n\\ N\end{array}}\right) e^{-\rho N t}\left( 1-e^{-\rho t}\right) ^{n}\Vert f_{N+n}\Vert _{1,N+n} =\Vert \mathbf {f}\Vert _1\, . \end{aligned}$$
(36)

Thus \(\Vert {\mathcal {Q}}(t)\Vert _1\le 1\). Observe also that, if \(\mathbf {f}\) is positive, then \({\mathcal {Q}}(t)\mathbf {f}\) is positive and \(\Vert {\mathcal {Q}}(t)\mathbf {f}\Vert _1=\Vert \mathbf {f}\Vert _{1}\), see the comment below (32). On the other hand, if for some N, \(f_N\) takes both positive and negative values, then \(\Vert {\mathcal {Q}}(t)\mathbf {f}\Vert _1<\Vert \mathbf {f}\Vert _{1}\).
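The equality in (36) is the change of index \(M=N+n\) followed by the binomial theorem with \(p=e^{-\rho t}\): the total coefficient of \(\Vert f_M\Vert _{1,M}\) is \(\sum _{N=0}^M\left( {\begin{array}{c}M\\ N\end{array}}\right) p^N(1-p)^{M-N}=1\). A quick numerical confirmation (parameter values arbitrary):

```python
import math

def total_mass_coefficient(M, rho, t):
    # Coefficient of ||f_M||_{1,M} in (36) after the change of index M = N + n
    p = math.exp(-rho * t)
    return sum(math.comb(M, N) * p ** N * (1 - p) ** (M - N) for N in range(M + 1))

coeffs = [total_mass_coefficient(M, 0.7, 1.5) for M in range(12)]
# Every coefficient equals 1, so the total mass is preserved.
```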

From (33), we see that \({\mathcal {Q}}(t_1){\mathcal {Q}}(t_2)= {\mathcal {Q}}(t_1+t_2)\) while, using (34) and (36), and the fact that

$$\begin{aligned} N(N-1)\left( {\begin{array}{c}M\\ N\end{array}}\right) =M(M-1)\left( {\begin{array}{c}M-2\\ N-2\end{array}}\right) \end{aligned}$$

we get

$$\begin{aligned} \sum _{N=1}^\infty N^2 \left\| \left( {\mathcal {Q}}(t)\mathbf {f}\right) _N \right\| _{1,N}\le e^{-\rho t}\sum _{N=1}^\infty N^2\Vert f_N\Vert _{1,N} \end{aligned}$$

so that \({\mathcal {Q}}(t)\mathbf {f}\in D^1\) if \(\mathbf {f}\in D^1\). Moreover observe that

$$\begin{aligned} \Vert {\mathcal {Q}}(t)\mathbf {f}-\mathbf {f}\Vert _1&\le \sum _{N\ge 0} \sum _{n\ge 1} \left( {\begin{array}{c}N+n\\ N\end{array}}\right) e^{-\rho N t}\left( 1-e^{-\rho t}\right) ^{n}\Vert f_{N+n}\Vert _{1,N+n}\nonumber \\&\quad +\Bigl \Vert e^{(\tilde{\lambda }{\mathcal {K}}-\rho {\mathcal {N}})t}\mathbf {f}-\mathbf {f}\Bigr \Vert _1\nonumber \\&=\sum _{N\ge 0}\left( 1-e^{-\rho N t}\right) \Vert f_N\Vert _{1,N}+ \Bigl \Vert e^{(\tilde{\lambda }{\mathcal {K}}-\rho {\mathcal {N}})t}\mathbf {f}-\mathbf {f}\Bigr \Vert _1 \end{aligned}$$
(37)

so that \(\lim _{t\rightarrow 0^+}{\mathcal {Q}}(t)\mathbf {f}=\mathbf {f}\). Similarly we have

$$\begin{aligned}&\frac{1}{t}\Vert {\mathcal {Q}}(t)\mathbf {f}-\mathbf {f}-t(\tilde{\lambda }{\mathcal {K}}+\rho ({\mathcal {O}}-{\mathcal {N}}))\mathbf {f}\Vert _1\nonumber \\&\quad \le \frac{1}{t}\sum _{N\ge 0} \sum _{n\ge 2} \left( {\begin{array}{c}N+n\\ N\end{array}}\right) e^{-\rho N t}\left( 1-e^{-\rho t}\right) ^{n}\Vert f_{N+n}\Vert _{1,N+n}\nonumber \\&\qquad +\frac{1}{t}\left\| e^{(\tilde{\lambda }{\mathcal {K}}-\rho {\mathcal {N}})t}\mathbf {f} -\mathbf {f}-t(\tilde{\lambda }{\mathcal {K}}-\rho {\mathcal {N}})\mathbf {f}\right\| _1\nonumber \\&\qquad +\frac{\rho }{t}\left\| \int _0^t e^{(\tilde{\lambda }{\mathcal {K}}-\rho {\mathcal {N}})(t-s)}{\mathcal {O}}e^{(\tilde{\lambda }{\mathcal {K}}-\rho {\mathcal {N}})s} \mathbf {f}-t{\mathcal {O}}\mathbf {f}\right\| _1\, . \end{aligned}$$
(38)

If \(\mathbf {f}\in D^1\), proceeding as in (37) we see that the first two terms on the right hand side of (38) vanish as \(t\rightarrow 0^+\), while writing

$$\begin{aligned} \int _0^t e^{(\tilde{\lambda }{\mathcal {K}}-\rho {\mathcal {N}})(t-s)}{\mathcal {O}}e^{(\tilde{\lambda }{\mathcal {K}}-\rho {\mathcal {N}})s}\mathbf {f}-t{\mathcal {O}}\mathbf {f}&= \int _0^t e^{(\tilde{\lambda }{\mathcal {K}}-\rho {\mathcal {N}})(t-s)}{\mathcal {O}}\left( e^{(\tilde{\lambda }{\mathcal {K}}-\rho {\mathcal {N}})s}\mathbf {f}-\mathbf {f}\right) \nonumber \\&\quad +\int _0^t \left( e^{(\tilde{\lambda }{\mathcal {K}}-\rho {\mathcal {N}}) (t-s)}{\mathcal {O}}-{\mathcal {O}}\right) \mathbf {f}\end{aligned}$$
(39)

and using (28) we see that the last term of (38) also vanishes as \(t\rightarrow 0^+\). This implies that, for \(\mathbf {f}\in D^1\), we have \(\lim _{t\rightarrow 0^+} \left( {\mathcal {Q}}(t)\mathbf {f}-\mathbf {f}\right) /t =\tilde{\lambda }{\mathcal {K}}\mathbf {f}+\rho ({\mathcal {O}}-{\mathcal {N}})\mathbf {f}\) and we can write \({\mathcal {Q}}(t)=e^{t(\tilde{\lambda } {\mathcal {K}}+\rho ({\mathcal {O}}-{\mathcal {N}}))}\).

We can now use a Duhamel style expansion once more to obtain

$$\begin{aligned} e^{t{{\mathcal {L}}}}\mathbf {f}&=e^{(\tilde{\lambda } {\mathcal {K}}+\rho ({\mathcal {O}}-{\mathcal {N}})-\mu \mathrm {Id})t}\mathbf {f}\nonumber \\&\quad +\sum _{n=1}^\infty \mu ^n \int \limits _{0<t_1<\ldots<t_n<t} e^{(\tilde{\lambda } {\mathcal {K}}+\rho ({\mathcal {O}}-{\mathcal {N}})-\mu \mathrm {Id})(t-t_n)} {\mathcal {I}}e^{(\tilde{\lambda } {\mathcal {K}}+\rho ({\mathcal {O}}-{\mathcal {N}})-\mu \mathrm {Id})(t_n-t_{n-1})} \nonumber \\&\qquad \cdots {\mathcal {I}}e^{(\tilde{\lambda } {\mathcal {K}}+\rho ({\mathcal {O}}-{\mathcal {N}})-\mu \mathrm {Id})t_1}\mathbf {f}\,dt_1\cdots dt_n \end{aligned}$$
(40)

that, thanks to the fact that \({\mathcal {I}}\) is bounded, converges for every \(\mathbf {f}\in L^1({\mathcal {R}})\) to a solution of \(\frac{d}{dt}\mathbf {f}(t)={\mathcal {L}}\mathbf {f}(t)\). Lemma 1 follows easily by observing that \(\Vert {\mathcal {I}}\mathbf {f}\Vert _1= \Vert \mathbf {f}\Vert _1\). \(\square \)

Remark 6

The proof of Lemma 1 above also shows that given \(\mathbf {f}\in L^1({\mathcal {R}})\), if for some N, \(f_N\) takes both positive and negative values, then \(\Vert e^{t{\mathcal {L}}}\mathbf {f}\Vert _1<\Vert \mathbf {f}\Vert _{1}\).

Remark 7

From (30) it is not hard to see that, if \(\mathbf {f}_i(t)\in D^1\), \(i=1,2\), are two solutions of (1) with \(\mathbf {f}_1(0)=\mathbf {f}_2(0)\) then \(\mathbf {f}_1(t)=\mathbf {f}_2(t)\).

Remark 8

Observe that (1) is the master equation of a jump process where jumps occur when two particles collide, a particle enters the system, or a particle leaves it. Moreover, these jumps arrive according to a Poisson process. The expansions (27), (33) and (40) combined can be seen as a representation of the evolution of \(\mathbf {f}\) as an integral over all possible realizations of the jump process, sometimes called jump or collision histories. A similar representation was used in [2] to study the interaction of a Kac system with a large reservoir. Clearly, such a representation is much more complex in the present situation than for the model studied in [2]. Here the arrival rate for the jumps depends on the state of the system via the number of particles N and goes to infinity as N increases.
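The jump process of Remark 8, restricted to the particle number alone, can be simulated directly: a particle enters at rate \(\mu \) and each of the N present particles leaves at rate \(\rho \), while collisions leave N unchanged. A Gillespie-style sketch (illustrative only; parameters and sample size are arbitrary) whose empirical mean can be compared with the first of (9):

```python
import math
import random

def sample_particle_number(rho, mu, t_final, rng):
    # Birth-death part of the jump process: a particle enters at rate mu,
    # each of the N present particles leaves at rate rho (total out-rate rho * N).
    # Collisions do not change N and are therefore omitted.
    t, N = 0.0, 0
    while True:
        rate = mu + rho * N
        t += rng.expovariate(rate)
        if t >= t_final:
            return N
        if rng.random() < mu / rate:
            N += 1
        else:
            N -= 1

rng = random.Random(0)
rho, mu, t_final = 1.0, 2.0, 5.0
samples = [sample_particle_number(rho, mu, t_final, rng) for _ in range(4000)]
mean_N = sum(samples) / len(samples)
predicted = (mu / rho) * (1 - math.exp(-rho * t_final))   # first of (9) with N(0) = 0
```

The empirical mean agrees with the predicted value up to Monte Carlo error.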

Given a state \(\mathbf {f}=(f_0,f_1,\ldots )\) we set \(\bar{f}_N=\int f_N(\underline{v}_N)d\underline{v}_N\). It is easy to see that

$$\begin{aligned} \int ({\mathcal {O}}\mathbf {f})_N(\underline{v}_N)d\underline{v}_N=(N+1)\bar{f}_{N+1}\, ,\qquad \int ({\mathcal {I}}\mathbf {f})_N(\underline{v}_N)d\underline{v}_N=\bar{f}_{N-1} \end{aligned}$$
(41)

while

$$\begin{aligned} \int ({\mathcal {K}}\mathbf {f})_N(\underline{v}_N)d\underline{v}_N=0\, , \end{aligned}$$

so that we get

$$\begin{aligned} \overline{({{\mathcal {L}}}\mathbf {f})_0}&=-\mu \bar{f}_0+\rho \bar{f}_1\nonumber \\ \overline{({{\mathcal {L}}}\mathbf {f})_N}&=-(N\rho +\mu )\bar{f}_N +\mu \bar{f}_{N-1}+\rho (N+1) \bar{f}_{N+1}\quad N>0\, . \end{aligned}$$
(42)

If \(\varvec{\varGamma }\) is a steady state, writing

$$\begin{aligned} \overline{\varGamma }_N=c_N \left( \frac{\mu }{\rho }\right) ^N\frac{1}{N!} \end{aligned}$$

we see from (42) that \(c_N=c_0\) for every N. Since \(\sum _N\overline{\varGamma }_N=1\) we get \(\overline{\varGamma }_N=a_N\), see (7). This implies that if \(\varvec{\varGamma }\) and \(\varvec{\varGamma }'\) are two steady states then

$$\begin{aligned} \int (\varGamma _N(\underline{v}_N)-\varGamma '_N(\underline{v}_N))d\underline{v}_N=0 \end{aligned}$$

for every N. From Remark 6 it follows that, if \(\varvec{\varGamma }\not =\varvec{\varGamma }'\) then \(\Vert e^{t{\mathcal {L}}}(\varvec{\varGamma }-\varvec{\varGamma }')\Vert _1 <\Vert \varvec{\varGamma }-\varvec{\varGamma }'\Vert _{1}\). Uniqueness of the steady state follows immediately.
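The computation above identifies \(\overline{\varGamma }_N\) with the Poisson weights \(a_N=e^{-\mu /\rho }(\mu /\rho )^N/N!\). One can confirm numerically that these weights annihilate the right hand side of (42); the values of \(\rho \) and \(\mu \) below are arbitrary:

```python
import math

rho, mu = 1.5, 3.0
kappa = mu / rho

def a(N):
    # Poisson weights a_N = exp(-mu/rho) (mu/rho)^N / N!, cf. (7)
    return math.exp(-kappa) * kappa ** N / math.factorial(N)

def rhs_42(N):
    # Right-hand side of (42) evaluated on the weights a_N
    if N == 0:
        return -mu * a(0) + rho * a(1)
    return -(N * rho + mu) * a(N) + mu * a(N - 1) + rho * (N + 1) * a(N + 1)

residual = max(abs(rhs_42(N)) for N in range(25))
# Detailed balance rho N a_N = mu a_{N-1}, cf. (53), makes every term vanish.
```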

We now prove a more general version of (9). For \(r\ge 0\) we define

$$\begin{aligned} N_{r}(\mathbf {f})=\sum _{N=r}^\infty \frac{N!}{(N-r)!}\bar{f}_N\, . \end{aligned}$$
(43)

Using (42), we get

$$\begin{aligned} \frac{d}{dt} N_{r}(\mathbf {f})&=\sum _{N=r}^\infty \frac{N!}{(N-r)!} \left( -(N\rho +\mu )\bar{f}_N+ \mu \bar{f}_{N-1}+\rho (N+1) \bar{f}_{N+1}\right) \nonumber \\&=-\rho r N_r(\mathbf {f})+\mu rN_{r-1}(\mathbf {f}) \end{aligned}$$
(44)

that, for \(r=1\), would imply the first of (9) since for a probability distribution we have \(N_0(\mathbf {f})=1\). This argument is suggestive but only formal since we need to show that we can exchange the sum with the derivative in the above derivation. Notwithstanding this, it shows that for \(r=0\), if \(\mathbf {f}\in D^1\) then

$$\begin{aligned} \sum _{N=0}^\infty \overline{(\mathcal L\mathbf {f})_N}=0\, . \end{aligned}$$
(45)
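The recursion (44) is consistent with the steady state: the factorial moments of a Poisson distribution with mean \(\kappa \) are \(N_r=\kappa ^r\), so that \(-\rho rN_r+\mu rN_{r-1}=0\) precisely when \(\kappa =\mu /\rho \). A quick truncated-sum check (the value of \(\kappa \) is arbitrary):

```python
import math

kappa = 2.5   # plays the role of mu/rho

def pmf(N):
    # Poisson(kappa) probability of N particles
    return math.exp(-kappa) * kappa ** N / math.factorial(N)

def factorial_moment(r, cutoff=80):
    # N_r = sum_{N >= r} N!/(N-r)! pmf(N), cf. (43)
    return sum(math.factorial(N) // math.factorial(N - r) * pmf(N)
               for N in range(r, cutoff))

moments = [factorial_moment(r) for r in range(6)]
```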

To prove (9) we proceed more directly using the expansions derived previously. Indeed from (34) and (36) we get

$$\begin{aligned} N_r \left( e^{t(\tilde{\lambda }{\mathcal {K}}-\rho {\mathcal {N}}+\rho {\mathcal {O}})}\mathbf {f}\right)&=\sum _{N\ge r} \sum _{n\ge 0}\frac{N!}{(N-r)!}\left( {\begin{array}{c}N+n\\ N\end{array}}\right) e^{-\rho N t} \left( 1-e^{-\rho t}\right) ^{n}\bar{f}_{N+n}\nonumber \\&=e^{-\rho r t}N_r(\mathbf {f})\, . \end{aligned}$$
(46)

Furthermore, using that \(N_r({\mathcal {I}}\mathbf {f})= N_r(\mathbf {f})+r N_{r-1}(\mathbf {f})\), we get

$$\begin{aligned} N_r\left( \mathbf {f}(t)\right)&=N_r\left( e^{t(\tilde{\lambda } {\mathcal {K}}-\rho {\mathcal {N}}+\rho {\mathcal {O}}-\mu \mathrm {Id})}\mathbf {f}(0)+\mu \int _0^te^{(t-s)(\tilde{\lambda } {\mathcal {K}}-\rho {\mathcal {N}}+\rho {\mathcal {O}}-\mu \mathrm {Id})}{\mathcal {I}}\mathbf {f}(s)ds\right) \nonumber \\&= e^{-(\rho r+\mu ) t}N_r(\mathbf {f}(0))+\mu \int _0^t e^{-(\rho r+\mu ) (t-s)}N_r(\mathbf {f}(s))ds\nonumber \\&\quad +r\mu \int _0^t e^{-(\rho r+\mu ) (t-s)}N_{r-1}(\mathbf {f}(s))ds \end{aligned}$$
(47)

that gives

$$\begin{aligned} N_r\left( \mathbf {f}(t)\right) =e^{-\rho r t}N_r(\mathbf {f}(0))+r\mu \int _0^t e^{-\rho r (t-s)}N_{r-1}(\mathbf {f}(s)) ds\, . \end{aligned}$$
(48)
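As a consistency check on (48), its \(r=1\) specialization with \(N_0\equiv 1\), namely \(N_1(t)=e^{-\rho t}N_1(0)+(1-e^{-\rho t})\mu /\rho \), satisfies \(\frac{d}{dt}N_1=-\rho N_1+\mu \), in agreement with (44). A numerical sketch via finite differences (parameter values arbitrary):

```python
import math

rho, mu, N1_0 = 1.2, 3.0, 7.0

def N1(t):
    # r = 1 specialization of (48) with N_0 = 1
    return math.exp(-rho * t) * N1_0 + (1 - math.exp(-rho * t)) * mu / rho

h = 1e-6
# Compare the forward difference quotient with the moment equation -rho N_1 + mu
residuals = [abs((N1(t + h) - N1(t)) / h - (mu - rho * N1(t)))
             for t in (0.0, 0.3, 1.0, 4.0)]
```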

For \(r=1\), if \(\mathbf {f}(0)\) is a probability distribution, we get

$$\begin{aligned} N(t)=e^{-\rho t}N(0)+(1-e^{-\rho t})\frac{\mu }{\rho } \end{aligned}$$

that proves the first of (9). We will need the following corollary in Sect. 3.6 below.

Corollary 9

Given a probability distribution \(\mathbf {f}\), assume that there exists M such that \(|N_r(\mathbf {f}(0))|\le M^r\) for every r. Then we have

$$\begin{aligned} |N_r(\mathbf {f}(t))|\le \max \left\{ M,\frac{\mu }{\rho }\right\} ^r \end{aligned}$$
(49)

for every \(t\ge 0\).

Proof

Clearly (49) holds for \(r=0\) since \(N_0(\mathbf {f}(t))=1\) for every \(t\ge 0\). Calling \(M_1=\max \left\{ M,\frac{\mu }{\rho }\right\} \), assume that \(|N_{r-1}(\mathbf {f}(t))|\le M_1^{r-1}\) for every \(t\ge 0\). From (48) we get

$$\begin{aligned} |N_r(\mathbf {f}(t))|&\le e^{-\rho r t}M^r+r\mu \int _0^t e^{-\rho r (t-s)}M_1^{r-1} ds\\&=e^{-\rho r t}M^r+\frac{\mu }{\rho }(1-e^{-\rho r t})M_1^{r-1}\le \max \left\{ M^r,\frac{\mu }{\rho }M_1^{r-1}\right\} \, . \end{aligned}$$

The corollary follows by induction on r. \(\square \)
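The last inequality in the display holds because a convex combination is bounded by the larger of its two values, and both \(M^r\) and \(\frac{\mu }{\rho }M_1^{r-1}\) are bounded by \(M_1^r\). A small numerical illustration (parameter values arbitrary):

```python
import math

rho, mu, M, r = 1.0, 2.0, 1.3, 4
M1 = max(M, mu / rho)

def convex_bound(t):
    # Middle expression of the display in the proof of Corollary 9
    p = math.exp(-rho * r * t)
    return p * M ** r + (mu / rho) * (1 - p) * M1 ** (r - 1)

# The bound stays below M1^r uniformly on a grid of times
worst = max(convex_bound(0.1 * k) for k in range(100))
```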

Let now

$$\begin{aligned} \tilde{f}_N=\sum _{i=1}^N\int v_i^2 f_N(\underline{v}_N)d\underline{v}_N \end{aligned}$$

so that \(E(t)=\sum _{N=1}^\infty \tilde{f}_N\) and observe that

$$\begin{aligned}&\sum _{i=1}^N\int v_i^2({\mathcal {O}}\mathbf {f})_N(\underline{v}_N)d\underline{v}_N=N\tilde{f}_{N+1}\, ,\\&\sum _{i=1}^N\int v_i^2({\mathcal {I}}\mathbf {f})_N(\underline{v}_N)d\underline{v}_N=\tilde{f}_{N-1} +\frac{1}{2\pi }\bar{f}_{N-1} \end{aligned}$$

while

$$\begin{aligned} \sum _{i=1}^N\int v_i^2({\mathcal {K}}\mathbf {f})_N(\underline{v}_N)d\underline{v}_N=0\, . \end{aligned}$$

Again proceeding formally we get

$$\begin{aligned} \frac{d}{dt}\sum _{N=1}^\infty \tilde{f}_N&=\sum _{N=1}^\infty \left( -(N\rho +\mu )\tilde{f}_N+ \mu \tilde{f}_{N-1} +\frac{\mu }{2\pi }\bar{f}_{N-1} + \rho N \tilde{f}_{N+1}\right) \\&=\frac{\mu }{2\pi }\sum _{N=0}^\infty \bar{f}_N-\rho \sum _{N=1}^\infty \tilde{f}_N\,. \end{aligned}$$

It is not hard to adapt this argument, together with (46) and (47), to prove the second of (9).
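The formal cancellation above can be tested on a convenient family of states, say \(\bar{f}_N\) Poisson and \(\tilde{f}_N=Ne\bar{f}_N\) for a fixed energy per particle e; truncating the sums, both sides agree up to negligible tail terms. An illustrative sketch (the test family and all parameter values are arbitrary choices, not taken from the text):

```python
import math

rho, mu = 1.0, 2.0
kappa, energy = 4.0, 0.8   # test distribution: Poisson(kappa), energy e per particle

def fbar(N):
    return math.exp(-kappa) * kappa ** N / math.factorial(N)

def ftilde(N):
    # Energy content of the N-particle sector for the test family
    return N * energy * fbar(N)

K = 60   # truncation; Poisson(4) tails beyond N = 60 are negligible
lhs = sum(-(N * rho + mu) * ftilde(N) + mu * ftilde(N - 1)
          + mu / (2 * math.pi) * fbar(N - 1) + rho * N * ftilde(N + 1)
          for N in range(1, K))
rhs = mu / (2 * math.pi) * sum(fbar(N) for N in range(K)) \
      - rho * sum(ftilde(N) for N in range(1, K))
```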

3.2 Proof of Theorem 2

To prove Theorems 2 and 3, we will construct a basis of eigenvectors for the generator

$$\begin{aligned} {\mathcal {G}}=\rho ({\mathcal {P}}^+ - {\mathcal {N}}) + \mu ({\mathcal {P}}^--\mathrm {Id}) \end{aligned}$$

of the evolution due to the thermostat on \(L^2_s({\mathcal {R}},\varvec{\varGamma })\). We start by defining

$$\begin{aligned} ({\mathcal {P}}^+(g)\mathbf {h})_{N}(\underline{v}_N)&=\sum _{i=1}^N h_{N-1}(v_1,\ldots ,v_{i-1},v_{i+1},\ldots , v_N)g(v_i)\nonumber \\ ({\mathcal {P}}^-(g)\mathbf {h})_{N}(\underline{v}_N)&=\frac{1}{N+1}\sum _{i=1}^{N+1} \int dw e^{-\pi w^2}g(w)h_{N+1}(\underline{v}_{N,i}(w)) \end{aligned}$$
(50)

with \(\underline{v}_{N,i}(w)=(v_1,\ldots ,v_{i-1},w,v_i,\ldots , v_N)\) and \(g\in L^2({\mathbb {R}},\gamma )\). Moreover, we use the convention that the sum over an empty set is 0 so that \(({\mathcal {P}}^+(g)\mathbf {h})_{0}=0\) for every \(\mathbf {h}\). With this notation, \({\mathcal {P}}^+\) and \({\mathcal {P}}^-\) from the introduction are \({\mathcal {P}}^+(1)\) and \({\mathcal {P}}^-(1)\), respectively.

Lemma 10

We have

$$\begin{aligned} \rho {\mathcal {P}}^+(g)^*=\mu {\mathcal {P}}^-(g) \end{aligned}$$
(51)

so that \({\mathcal {G}}\) is self-adjoint.

Proof

Proceeding as in the definition of \(D^2\), we take as domain of \({\mathcal {P}}^\pm (g)\) the subspaces

$$\begin{aligned} D^\pm =\bigl \{\mathbf {h}\,\Big |\, \sum _{N=0}^\infty a_N\Vert ({\mathcal {P}}^\pm (g)\mathbf {h})_N\Vert ^2_{2,N}<\infty \bigr \}\,. \end{aligned}$$

It is easy to see that \(D^\pm \) are dense in \(L^2({\mathcal {R}},\varvec{\varGamma })\).

Calling \(\underline{v}_N^i=(v_1,\dots ,v_{i-1},v_{i+1},\dots ,v_N)\) we get

$$\begin{aligned}&(h_N,({\mathcal {P}}^+(g)\mathbf {j})_N)_N = \sum _{i=1}^N\int d\underline{v}_N\gamma _N(\underline{v}_N) h_N(\underline{v}_N)j_{N-1}(\underline{v}_N^i)g(v_i)\nonumber \\&\quad =\sum _{i=1}^N\int d\underline{v}_N^i\gamma _{N-1} (\underline{v}_N^i)\left( \int dv_ie^{-\pi v_i^2}g(v_i) h_N(\underline{v}_N)\right) j_{N-1}(\underline{v}_N^i)\nonumber \\&\quad =N(({\mathcal {P}}^-(g)\mathbf {h})_{N-1},j_{N-1})_{N-1}\, . \end{aligned}$$
(52)

Assume now that \(\mathbf {h}\) is in the domain of \({\mathcal {P}}^+(g)^*\). This means that for every \(\mathbf {j}\) in \(D^+\) we have

$$\begin{aligned} ({\mathcal {P}}^+(g)^*\mathbf {h},\mathbf {j})=(\mathbf {h},{\mathcal {P}}^+(g)\mathbf {j})\, . \end{aligned}$$

Given M, choose \(\mathbf {j}\) such that \(j_N\equiv 0\) if \(N\not =M\). For such a \(\mathbf {j}\) we have \(\mathbf {j}\in D^+\) and

$$\begin{aligned} \rho a_M(({\mathcal {P}}^+(g)^*\mathbf {h})_M,j_M)_M&=\rho ({\mathcal {P}}^+(g)^*\mathbf {h},\mathbf {j})=\rho (\mathbf {h},{\mathcal {P}}^+(g)\mathbf {j})\\&=\rho a_{M+1}(h_{M+1},({\mathcal {P}}^+(g)\mathbf {j})_{M+ 1 })_{M+1}= \mu a_M(({\mathcal {P}}^-(g)\mathbf {h})_M,j_M)_M \end{aligned}$$

where the last equality follows from (52) and the fact that

$$\begin{aligned} \rho N a_N=\mu a_{N-1}\, . \end{aligned}$$
(53)

This implies that \(\rho ({\mathcal {P}}^+(g)^*\mathbf {h})_M=\mu ({\mathcal {P}}^-(g)\mathbf {h})_M\) for every M thus proving (51). This also implies that \({\mathcal {G}}\) is self-adjoint. \(\square \)

To obtain convergence toward \(\mathbf {e}^0\), we first need to show that \({\mathcal {G}}\) is non positive. This is the content of the following Lemma.

Lemma 11

\({{\mathcal {G}}}\) is non positive and \({{\mathcal {G}}}\mathbf {h}=0\) if and only if \(\mathbf {h}=c\mathbf {e}^0\), where \(\mathbf {e}^0\) is given by \(e_N^0(\underline{v}_N)=1\) for every N and \(\underline{v}_N\).

Proof

From (51), we get \(\rho (\mathbf {h},{\mathcal {P}}^+\mathbf {h})= \mu ({\mathcal {P}}^-\mathbf {h},\mathbf {h})\) so that

$$\begin{aligned} (\mathbf {h},{\mathcal {G}} \mathbf {h})=2\rho (\mathbf {h},{\mathcal {P}}^+\mathbf {h}) -(\mathbf {h},(\rho {\mathcal {N}} +\mu )\mathbf {h}) \end{aligned}$$
(54)

Moreover we have

$$\begin{aligned} \rho (\mathbf {h},{\mathcal {P}}^+\mathbf {h})&=\rho \sum _{N=1}^\infty a_N\int d\underline{v}_N\gamma _N (\underline{v}_N)h_N(\underline{v}_N)\left( \sum _{i=1}^ Nh_{N-1}(\underline{v}_N^i)\right) \nonumber \\&=\sum _{N=1}^\infty \left[ \sum _{i=1}^N\int d\underline{v}_N\gamma _N(\underline{v}_N)\left( \sqrt{\rho a_N}h_N(\underline{v}_N)\right) \left( \sqrt{\frac{\mu }{N} a_{N-1}}h_{N-1}(\underline{v}_N^i)\right) \right] \nonumber \\&\le \sum _{N=1}^\infty \sum _{i=1}^N\left[ \frac{1}{2}\rho a_N\int d\underline{v}_N\gamma _N(\underline{v}_N)h_N(\underline{v}_N)^2 +\frac{1}{2} \frac{\mu }{N}a_{N-1}\int d\underline{v}_N\gamma _N(\underline{v}_N) h_{N-1}(\underline{v}_N^i) ^2\right] \nonumber \\&=\sum _{N=0}^\infty \left[ \frac{1}{2}N\rho a_N \int d\underline{v}_N\gamma _N(\underline{v}_N)h_N (\underline{v}_N)^2+\frac{1}{2}\mu a_N\int d \underline{v}_N\gamma _N(\underline{v}_N)h_N (\underline{v}_N)^2\right] \nonumber \\&=\frac{1}{2}(\mathbf {h},(\rho {\mathcal {N}} +\mu )\mathbf {h}) \end{aligned}$$
(55)

where we have used (53) to obtain the second line and that \(ab\le (a^2+b^2)/2\) in going from the second to the third line of (55). Non positivity follows immediately from (54) and (55). Furthermore, we see that the inequality at the end of the second line of (55) becomes an equality if and only if:

$$\begin{aligned} \sqrt{\rho a_N}h_N(\underline{v}_N) =\sqrt{\frac{\mu }{N}a_{N-1}}h_{N-1}(\underline{v}_N^i) \end{aligned}$$

that is, \(h_N(\underline{v}_N)=h_{N-1}(\underline{v}_N^i)\) for every i and N, which implies that \(h_N\equiv h_0\). \(\square \)

Our construction of the eigenvalues and eigenvectors of \({\mathcal {G}}\) is inspired by the construction of the Fock space for a bosonic quantum field theory, see for example Chap. 6 of [17]. The main observation is that the operators \({\mathcal {P}}^\pm (g)\) defined in (50) have the form of creation and annihilation operators. Since the “ground state” of \({\mathcal {G}}\) is \(\mathbf {e}^0\), as opposed to the state with no particles \(\mathbf {n}\), see (64) below, we will introduce the operators \({\mathcal {R}}^\pm (g)\), see (57) below, that can be thought of as quasi-particle operators, that is, operators that create and destroy excitations above the ground state, see for example [1]. The proofs of the Lemmas in the remaining part of this section should be familiar to readers with a background in QFT.

We start with the commutation relations of the operators \({\mathcal {P}}^\pm (g)\) and \({\mathcal {N}}\). Setting \(\{{\mathcal {A}},{\mathcal {B}}\}={\mathcal {A}} {\mathcal {B}}-{\mathcal {B}}{\mathcal {A}}\) for the commutator, we obtain the following Lemma.

Lemma 12

We have

$$\begin{aligned} \{{{\mathcal {P}}}^+(g_1),{{\mathcal {P}}}^-(g_2)\}&=-(g_1,g_2)\mathrm {Id}\\ \{{\mathcal {P}}^+(g_1),{\mathcal {P}}^+(g_2)\}&=\{{\mathcal {P}}^-(g_1),{\mathcal {P}}^-(g_2)\}=0\\ \{{\mathcal {N}} ,{\mathcal {P}}^\pm (g)\}&=\pm {\mathcal {P}}^\pm (g) \end{aligned}$$

where

$$\begin{aligned} (g_1,g_2)=\int _{{\mathbb {R}}}g_1(w)g_2(w)e^{-\pi w^2}dw\, . \end{aligned}$$

Proof

We first observe that, due to the symmetry of \(h_N\), we have

$$\begin{aligned} ({\mathcal {P}}^-(g)\mathbf {h})_N(\underline{v}_{N})=\int \gamma (v_{N+1})g(v_{N+1})h_{N+1} (\underline{v}_{N+1})dv_{N+1}:= (P^-_N(g)h_{N+1})(\underline{v}_{N}) \end{aligned}$$

while

$$\begin{aligned} ({\mathcal {P}}^+(g)\mathbf {h})_N(\underline{v}_{N})= \sum _{i=1}^{N}(P^+_{N,i}(g) h_{N-1})(\underline{v}_{N}) \end{aligned}$$

where

$$\begin{aligned} (P^+_{N,i} (g)h_{N-1})(\underline{v}_{N})=h_{N-1}(v_1,\ldots ,v_{i-1}, v_{i+1},\ldots ,v_{N})g(v_i)\, . \end{aligned}$$

Thus we get

$$\begin{aligned} ({\mathcal {P}}^-(g_1){\mathcal {P}}^-(g_2)\mathbf {h})_N(\underline{v}_{N})&=(P^-_N(g_1) P^-_{N+1}(g_2) h_{N+2})(\underline{v}_{N})\\&=\int \gamma (v_{N+1})\gamma (v_{N+2}) g_1(v_{N+1})g_2(v_{N+2})\\ {}&\quad \times h_{N+2}(\underline{v}_{N+2})dv_{N+1}dv_{N+2} \end{aligned}$$

Using again that \(h_N\) is symmetric we get \(\{{\mathcal {P}}^-(g_1),{\mathcal {P}}^-(g_2)\}=0\). Moreover, we have

$$\begin{aligned}&P^+_{N,i}(g_1) P^+_{N-1,j}(g_2) h_{N-2}(\underline{v}_{N})\\&\quad ={\left\{ \begin{array}{ll} h_{N-2}(v_1,\ldots ,v_{j-1},v_{j+1},\ldots , v_{i-1},v_{i+1},\ldots ,v_{N})g_1(v_i) g_2(v_j) &{} i>j\\ h_{N-2}(v_1,\ldots ,v_{i-1},v_{i+1},\ldots ,v_{j},v_{j+2}, \ldots ,v_{N})g_1(v_i) g_2(v_{j+1}) &{} i\le j \end{array}\right. } \end{aligned}$$

so that

$$\begin{aligned} {\left\{ \begin{array}{ll} P^+_{N,i}(g_1) P^+_{N-1,j}(g_2) h_{N-2}=P^+_{N,j}(g_2) P^+_{N-1,i-1}(g_1) h_{N-2} &{} i>j\\ P^+_{N,i}(g_1) P^+_{N-1,j}(g_2) h_{N-2}=P^+_{N,j+1}(g_2) P^+_{N-1,i}(g_1) h_{N-2} &{} i\le j\, . \end{array}\right. } \end{aligned}$$

Summing over i and j it follows that \(\{{\mathcal {P}}^+(g_1),{\mathcal {P}}^+(g_2)\}=0\).

Similarly we have

$$\begin{aligned} (P^-_N(g_1)P^+_{{N+1},N+1}(g_2)h_N)(\underline{v}_N)=h_N(\underline{v}_N) \int g_1(v_{N+1})g_2(v_{N+1})\gamma (v_{N+1})dv_{N+1} \end{aligned}$$

while for \(i\le N\) we get

$$\begin{aligned}&(P^-_N(g_1)P^+_{{N+1},i}(g_2)h_N)(\underline{v}_N)\\&\quad = g_2(v_i)\int h_N(v_1,\ldots ,v_{i-1},v_{i+1}, \ldots ,v_{N+1})g_1(v_{N+1})\gamma (v_{N+1})dv_{N+1}\\&\quad = (P^+_{N,i}(g_2)P^-_{N-1}(g_1)h_N)(\underline{v}_N)\, . \end{aligned}$$

Summing over i we get \(\{{{\mathcal {P}}}^+(g_1),{{\mathcal {P}}}^-(g_2)\} =-(g_1,g_2)\mathrm {Id}\).

Finally we observe that

$$\begin{aligned} ({\mathcal {P}}^-(g){\mathcal {N}}\mathbf {h})_N=P^-_{N}(g)({\mathcal {N}}\mathbf {h})_{N+1} =(N+1)({\mathcal {P}}^-(g)\mathbf {h})_N=(({\mathcal {N}}+\mathrm {Id}){\mathcal {P}}^-(g)\mathbf {h})_N \end{aligned}$$

so that \(\{{\mathcal {N}} ,{\mathcal {P}}^-(g)\}=-{\mathcal {P}}^-(g)\). The commutation relation for \({\mathcal {P}}^+\) follows by taking the adjoint. \(\square \)

Observe that \({\mathcal {P}}^-(g)\mathbf {e}^0=(g,1)\mathbf {e}^0\) while from Lemma 12 it follows that

$$\begin{aligned} \{{\mathcal {G}},{\mathcal {P}}^+(g)\}&=\rho \{{\mathcal {P}}^+(1),{\mathcal {P}}^+(g)\} - \rho \{{\mathcal {N}},{\mathcal {P}}^+(g)\}+\mu \{{\mathcal {P}}^-(1),{\mathcal {P}}^+(g)\}\nonumber \\&=-\rho {\mathcal {P}}^+(g)+\mu (1,g)\mathrm {Id} \end{aligned}$$
(56)

that makes it natural to define the new creation and annihilation operators

$$\begin{aligned}&{{\mathcal {R}}}^+(g)=\sqrt{\frac{\rho }{\mu }}{{\mathcal {P}}}^+(g) -\sqrt{\frac{\mu }{\rho }}(g,1)\,\mathrm {Id}\nonumber \\&{{\mathcal {R}}}^-(g)=\sqrt{\frac{\mu }{\rho }}{{\mathcal {P}}}^-(g) -\sqrt{\frac{\mu }{\rho }}(g,1)\,\mathrm {Id}\, . \end{aligned}$$
(57)

The following Corollary collects the relevant properties of \({{\mathcal {R}}}^\pm (g)\).

Corollary 13

We have \({\mathcal {R}}^+(g)^*={\mathcal {R}}^-(g)\), \({\mathcal {R}} ^-(g)\mathbf {e}^0=0\), and

$$\begin{aligned} \{{{\mathcal {R}}}^+(g_1),{{\mathcal {R}}}^-(g_2)\}&=-(g_1,g_2)\mathrm {Id}\\ \{{\mathcal {R}}^+(g_1),{\mathcal {R}}^+(g_2)\}&=\{{\mathcal {R}}^-(g_1),{\mathcal {R}} ^-(g_2)\}=0\\ \{{\mathcal {N}},{\mathcal {R}}^\pm (g)\}&=\pm \left( {\mathcal {R}}^\pm (g) +\sqrt{\frac{\mu }{\rho }}(g,1)\mathrm {Id}\right) \end{aligned}$$

Moreover we also have

$$\begin{aligned} \{{\mathcal {G}},{\mathcal {R}}^+(g)\}=-\rho {\mathcal {R}}^+(g) \,,\qquad \{{\mathcal {G}},{\mathcal {R}}^-(g)\}=\rho {\mathcal {R}}^-(g)\,. \end{aligned}$$
(58)

Proof

It is easy to verify that \({\mathcal {R}}^-(g)\mathbf {e}^0=0\). Moreover we only need to prove (58) since the other relations are immediate consequences of Lemma 12. From (56) we get

$$\begin{aligned} \{{\mathcal {G}},{\mathcal {R}}^+(g)\}=\sqrt{\frac{\rho }{\mu }}\{{\mathcal {G}},{\mathcal {P}}^+(g)\} =-\rho \sqrt{\frac{\rho }{\mu }}{\mathcal {P}}^+(g)+\sqrt{\mu \rho }(g,1)\mathrm {Id} =-\rho {\mathcal {R}}^+(g) \end{aligned}$$

The second equation of (58) follows by taking the adjoint of the first. \(\square \)

Since \(K_N\) preserves the space of polynomials of a given degree, see [4], we choose as an orthonormal basis for \(L^2({\mathbb {R}},\gamma (v))\) the polynomials

$$\begin{aligned} L_n(v)=\frac{1}{\sqrt{n!}}H_n(\sqrt{2\pi }v) \end{aligned}$$
(59)

where

$$\begin{aligned} H_n(v)=(-1)^n e^{\frac{v^2}{2}}\frac{d^n}{dv^n}e^{\frac{-v^2}{2}} \end{aligned}$$

are the standard Hermite polynomials. For every sequence \(\underline{\alpha }=(\alpha _0,\alpha _1,\alpha _2,\ldots )\) such that \(\alpha _i\in {\mathbb {N}}\) and \(\lambda (\underline{\alpha }) :=\sum _{i=0}^\infty \alpha _i<\infty \), we define

$$\begin{aligned} \mathbf {e}_{\underline{\alpha }}=\prod _{i=0}^\infty \frac{({\mathcal {R}}^+_i)^{\alpha _i}}{\sqrt{\alpha _i!}}\mathbf {e}^0 \end{aligned}$$
(60)

where \({\mathcal {R}}^\pm _n={\mathcal {R}}^\pm (L_n)\).
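The normalization in (59) can be verified numerically: with \(H_n\) generated by the three-term recurrence \(H_{n+1}(x)=xH_n(x)-nH_{n-1}(x)\), the \(L_n\) are orthonormal with respect to the weight \(\gamma (v)=e^{-\pi v^2}\). A hedged sketch using a midpoint quadrature (the grid parameters are arbitrary):

```python
import math

def hermite(n, x):
    # Probabilists' Hermite polynomials via H_{n+1}(x) = x H_n(x) - n H_{n-1}(x)
    a, b = 1.0, x
    if n == 0:
        return a
    for k in range(1, n):
        a, b = b, x * b - k * a
    return b

def L(n, v):
    # L_n(v) = H_n(sqrt(2 pi) v) / sqrt(n!), cf. (59)
    return hermite(n, math.sqrt(2 * math.pi) * v) / math.sqrt(math.factorial(n))

def inner(m, n, half_width=8.0, steps=40000):
    # Midpoint rule for int L_m(v) L_n(v) exp(-pi v^2) dv
    h = 2 * half_width / steps
    s = 0.0
    for i in range(steps):
        v = -half_width + (i + 0.5) * h
        s += L(m, v) * L(n, v) * math.exp(-math.pi * v * v)
    return s * h

gram = [[inner(m, n) for n in range(4)] for m in range(4)]
# The Gram matrix of the first few L_n is the identity up to quadrature error.
```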

Lemma 14

The vectors \(\mathbf {e}_{\underline{\alpha }}\) form an orthonormal basis in \(L^2_s({\mathcal {R}},\varvec{\varGamma })\). Moreover, we have

$$\begin{aligned} {\mathcal {G}}\mathbf {e}_{\underline{\alpha }}=-\rho \lambda (\underline{\alpha })\mathbf {e}_{\underline{\alpha }}\, . \end{aligned}$$
(61)

Finally we have \(\Vert {\mathcal {K}}\mathbf {e}_{\underline{\alpha }}\Vert _2<\infty \), so that \(\mathbf {e}_{\underline{\alpha }}\in D^2\), for every \(\underline{\alpha }\).

Proof

If \(n_1\not =n_2\) and \(\alpha _1\alpha _2\not =0\), using Corollary 13 we get

$$\begin{aligned} (({\mathcal {R}}^+_{n_1})^{\alpha _1}\mathbf {e}^0,({\mathcal {R}}^+_{n_2})^{\alpha _2}\mathbf {e}^0) =(\mathbf {e}^0,({\mathcal {R}}^+_{n_2})^{\alpha _2}({\mathcal {R}}^-_{n_1})^{\alpha _1}\mathbf {e}^0)=0 \end{aligned}$$

while

$$\begin{aligned} (({\mathcal {R}}^+_n)^{\alpha _1}\mathbf {e}^0,({\mathcal {R}}^+_n)^{\alpha _2}\mathbf {e}^0)&=(({\mathcal {R}}^+_n)^{\alpha _1- 1}\mathbf {e}^0,{\mathcal {R}}^-_n({\mathcal {R}}^+_n)^{\alpha _2}\mathbf {e}^0)\\&=(({\mathcal {R}}^+_n)^{\alpha _1-1}\mathbf {e}^0,{\mathcal {R}}^+_n{\mathcal {R}}^-_n({\mathcal {R}}^+_n)^{\alpha _2-1}\mathbf {e}^0)\\&\quad +(({\mathcal {R}}^+_n)^{\alpha _1-1}\mathbf {e}^0,({\mathcal {R}}^+_n)^{\alpha _2-1}\mathbf {e}^0)\\&\quad \vdots \\&=(({\mathcal {R}}^+_n)^{\alpha _1-1}\mathbf {e}^0,({\mathcal {R}}^+_n)^{\alpha _2}{\mathcal {R}}^-_n\mathbf {e}^0)\\&\quad +\alpha _2(({\mathcal {R}}^+_n)^{\alpha _1-1}\mathbf {e}^0,({\mathcal {R}}^+_n)^{\alpha _2-1}\mathbf {e}^0)\\&=\alpha _2(({\mathcal {R}}^+_n)^{\alpha _1- 1}\mathbf {e}^0,({\mathcal {R}}^+_n)^{\alpha _2-1}\mathbf {e}^0)\,. \end{aligned}$$

Assuming \(\alpha _1\ge \alpha _2\) we get

$$\begin{aligned} (({\mathcal {R}}^+_n)^{\alpha _1}\mathbf {e}^0,({\mathcal {R}}^+_n)^{\alpha _2}\mathbf {e}^0) =\alpha _2!(({\mathcal {R}}^+_n)^{\alpha _1-\alpha _2}\mathbf {e}^0,\mathbf {e}^0) \end{aligned}$$
(62)

so that

$$\begin{aligned} (({\mathcal {R}}^+_n)^{\alpha _1}\mathbf {e}^0,({\mathcal {R}}^+_n)^{\alpha _2}\mathbf {e}^0) =\alpha _1!\delta _{\alpha _1,\alpha _2} \end{aligned}$$

from which orthonormality follows easily. Observe now that

$$\begin{aligned} (({\mathcal {P}}^+(1))^n\mathbf {e}^0)_N={\left\{ \begin{array}{ll} 0 &{} N<n\\ \frac{N!}{(N-n)!} &{} N\ge n \end{array}\right. } \end{aligned}$$
(63)

so that we can write

$$\begin{aligned} \mathbf {n}=\sum _{n=0}^\infty \frac{(-1)^n}{n!}({\mathcal {P}}^+(1))^n\mathbf {e}^0 \end{aligned}$$
(64)

where \(\mathbf {n}=(1,0,0,\ldots )\). Since \({\mathcal {P}}^+(1) =\sqrt{\frac{\mu }{\rho }}{\mathcal {R}}_0^++\frac{\mu }{\rho }\mathrm{Id}\) we see that \(\mathbf {n}\) is in the closure of the span of the \(\mathbf {e}_{\underline{\alpha }}\). Calling \({\mathcal {P}}^+_i={\mathcal {P}}^+(L_i)\), we observe that \(({\mathcal {P}}^+_i\mathbf {n})_N=0\) for \(N\not =1\) while \(({\mathcal {P}}^+_i\mathbf {n})_1=L_i\). Since the \(L_i\) form a basis for \(L^2({\mathbb {R}},\gamma _1)\) we see that the closure of the span of \(\{\mathbf {n}; {\mathcal {P}}^+_i\mathbf {n}, i\ge 0\}\) contains a basis for \(L^2_s({\mathbb {R}}^0,a_0)\oplus L^2_s({\mathbb {R}},a_1\gamma _1)\). Observe now that \({\mathcal {P}}^+_i=\sqrt{\frac{\mu }{\rho }}{\mathcal {R}}^+_i+\delta _{i,0} \frac{\mu }{\rho }\mathrm {Id}\) and that \({\mathcal {R}}^+_i\mathbf {e}_{\underline{\alpha }} =\sqrt{\alpha _i+1}\mathbf {e}_{\underline{\alpha }'}\), where \(\alpha '_j=\alpha _j\) for \(j\not =i\) while \(\alpha '_i=\alpha _i+1\). Combining this with (64) we get that the closure of the span of the \(\mathbf {e}_{\underline{\alpha }}\) contains \({\mathcal {P}}^+_i\mathbf {n}\) and thus it contains a basis for \(L^2_s({\mathbb {R}}^0,a_0)\oplus L^2_s({\mathbb {R}},a_1\gamma _1)\). Iterating this construction we obtain completeness. Equation (61) follows easily from (58).
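The expansion (64) can be checked componentwise: by (63), the \(N\)-th component of the right-hand side is \(\sum _{n=0}^N\frac{(-1)^n}{n!}\frac{N!}{(N-n)!}=\sum _{n=0}^N(-1)^n\binom{N}{n}=\delta _{N,0}\), which is exactly \(\mathbf {n}=(1,0,0,\ldots )\). A short numerical sketch of this binomial cancellation (the helper name `n_component` is ours):

```python
from math import comb

def n_component(N):
    # N-th component of sum_n (-1)^n/n! (P^+(1))^n e^0:
    # by (63) the n-th term equals (-1)^n * N!/(n!(N-n)!) = (-1)^n C(N,n)
    return sum((-1) ** n * comb(N, n) for n in range(N + 1))

# the series reproduces n = (1, 0, 0, ...)
assert n_component(0) == 1
assert all(n_component(N) == 0 for N in range(1, 40))
```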

Finally, since \((h_N,R_{i,j}h_N)_N\le \Vert h_N\Vert _{2,N}^2\), from (5) we get

$$\begin{aligned} \Vert {\mathcal {K}}\mathbf {h}\Vert _2^2\le \sum _{N=0}^\infty a_N N^4\Vert h_N\Vert _{2,N}^2 =\Vert {\mathcal {N}}^2\mathbf {h}\Vert _2^2\,. \end{aligned}$$

Using the commutation relations in Corollary 13 as in the derivation of (62) we get

$$\begin{aligned} {\mathcal {N}}\left( {\mathcal {R}}^+_n\right) ^\alpha =\left( {\mathcal {R}}^+_n\right) ^\alpha {\mathcal {N}}+\alpha \left( {\mathcal {R}}^+_n \right) ^\alpha +\delta _{n,0}\alpha \sqrt{\frac{\mu }{\rho }}\left( {\mathcal {R}}^+_n\right) ^{\alpha -1} \end{aligned}$$

that, together with \({\mathcal {N}}\mathbf {e}^0=\sqrt{\frac{\mu }{\rho }} {\mathcal {R}}^+_0\mathbf {e}^0+\frac{\mu }{\rho }\mathbf {e}^0\), gives

$$\begin{aligned} {\mathcal {N}}\mathbf {e}_{\underline{\alpha }}=\left( \lambda (\underline{\alpha }) +\frac{\mu }{\rho }\right) \mathbf {e}_{\underline{\alpha }} +\sqrt{\frac{\mu }{\rho }}(\sqrt{\alpha _0}\mathbf {e}_{\underline{\alpha }^-} +\sqrt{\alpha _0+1}\mathbf {e}_{\underline{\alpha }^+}) \end{aligned}$$

where \(\alpha ^\pm _i=\alpha _i\), for \(i>0\), while \(\alpha _0^\pm =\alpha _0\pm 1\). Thus we have \(\Vert {\mathcal {N}}^2\mathbf {e}_{\underline{\alpha }}\Vert _2<\infty \) and the proof is complete. \(\square \)
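The ladder computations in the proof use only the relations \({\mathcal {R}}^-_n\mathbf {e}^0=0\) and the commutation rule \([{\mathcal {R}}^-_n,{\mathcal {R}}^+_n]=\mathrm {Id}\) from Corollary 13. Under that assumption they can be checked in a truncated single-mode matrix representation; a minimal numerical sketch (the names `Rp`, `Rm`, `power_vec` are ours):

```python
import numpy as np
from math import factorial

M = 40                         # truncation dimension of the single mode
Rp = np.zeros((M, M))          # raising operator R^+
for k in range(M - 1):
    Rp[k + 1, k] = np.sqrt(k + 1)
Rm = Rp.T                      # lowering operator R^-, with Rm @ e0 = 0
e0 = np.zeros(M); e0[0] = 1.0

# truncated commutation relation [R^-, R^+] = Id (exact away from the cutoff)
comm = Rm @ Rp - Rp @ Rm
assert np.allclose(comm[:M - 1, :M - 1], np.eye(M - 1))

def power_vec(a):
    """(R^+)^a e^0 in the truncated representation."""
    v = e0.copy()
    for _ in range(a):
        v = Rp @ v
    return v

# ((R^+)^a e^0, (R^+)^b e^0) = a! delta_{a,b}, i.e. (62) and orthonormality
for a in range(7):
    for b in range(7):
        expected = float(factorial(a)) if a == b else 0.0
        assert abs(power_vec(a) @ power_vec(b) - expected) < 1e-9
```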

In Sect. 3.3 we will need a more explicit representation of the \(\mathbf {e}_{\underline{\alpha }}\). To this end observe that, if \(n\not =0\), \(({\mathcal {R}}^+_n\mathbf {e}^0)_N(\underline{v}_N)=\sqrt{\frac{\rho }{\mu }} \sum _{i=1}^N L_n(v_i)\) while for \(n_1,n_2\not =0\) and \(N\ge 2\) we can write

$$\begin{aligned} ({\mathcal {R}}^+_{n_1}{\mathcal {R}}^+_{n_2}\mathbf {e}^0)_N(\underline{v}_N)=\frac{\rho }{\mu }\sum _{i\not =j} L_{n_1}(v_i)L_{n_2}(v_j)=\frac{1}{(N-2)!}\frac{\rho }{\mu } \sum _{\pi \in \mathrm{Sym}(N)}L_{n_1}(v_{\pi (1)})L_{n_2}(v_{\pi (2)}) \end{aligned}$$

where \(\mathrm{Sym}(N)\) is the group of permutations on \(\{1,\ldots ,N\}\). More generally, given \(n_i\not =0\), \(i=1,\ldots ,M\), we get, for \(N\ge M\),

$$\begin{aligned} \left( \prod _{i=1}^M{\mathcal {R}}^+_{n_i}\mathbf {e}^0\right) _N(\underline{v}_N)=\frac{1}{(N-M)!} \left( \frac{\rho }{\mu }\right) ^{\frac{M}{2}} \sum _{\pi \in \mathrm{Sym}(N)}\prod _{i=1}^M L_{n_i}(v_{\pi (i)})\, , \end{aligned}$$
(65)

while \(\left( \prod _{i=1}^M{\mathcal {R}}^+_{n_i}\mathbf {e}^0\right) _N\equiv 0\) for \(N<M\). Given \(\underline{\alpha }\) with \(\lambda (\underline{\alpha })<\infty \), define

$$\begin{aligned} L_{\underline{\alpha }}=\bigotimes _{i=1}^\infty L_{i}^{\otimes \alpha _i} \end{aligned}$$

where \(L_{i}^{\otimes 0}=1\) and observe that \(L_{\underline{\alpha }}\) is a polynomial in \(\lambda _0(\underline{\alpha }):=\sum _{i=1}^\infty \alpha _i\) variables with degree \(d(\underline{\alpha }) :=\sum _{i=1}^\infty i\alpha _i\). Also for \(\pi \in \mathrm{Sym}(N)\), define \(\pi (\underline{v}_N)=(v_{\pi (1)},v_{\pi (2)},\ldots ,v_{\pi (N)})\). Using these definitions, together with (60) and the fact that \({\mathcal {R}}_0^+=\sqrt{\frac{\rho }{\mu }}{\mathcal {P}}^+(1)-\sqrt{\frac{\mu }{\rho }} \mathrm {Id}\), we can write, for \(N\ge \lambda _0(\underline{\alpha })\),

$$\begin{aligned} (\mathbf {e}_{\underline{\alpha }})_N(\underline{v}_N)=c_{\underline{\alpha },N} \sum _{\pi \in \mathrm{Sym}(N)} L_{\underline{\alpha }}(\pi (\underline{v}_N))\, , \end{aligned}$$
(66)

for suitable coefficients \(c_{\underline{\alpha },N}\), while \((\mathbf {e}_{\underline{\alpha }})_N(\underline{v}_N)=0\) for \(N<\lambda _0(\underline{\alpha })\).

We now come back to the full operator \(\widetilde{{\mathcal {L}}}\).

Corollary 15

The operator \(\widetilde{{\mathcal {L}}}\) is self-adjoint, non positive and \(\widetilde{{\mathcal {L}}}\mathbf {h}=0\) if and only if \(\mathbf {h}=c\mathbf {e}^0\).

Proof

We can proceed exactly as in the proof of Lemma 10. Assume that \(\mathbf {h}\) is in the domain of \(\widetilde{{\mathcal {L}}}^*\). This means that for every \(\mathbf {j}\) in \(D^2\) we have

$$\begin{aligned} (\widetilde{{\mathcal {L}}}^*\mathbf {h},\mathbf {j}) =(\mathbf {h},\widetilde{{\mathcal {L}}}\mathbf {j})\, . \end{aligned}$$

Given M, choose \(\mathbf {j}\) such that \(j_N\equiv 0\) if \(N\not =M\). Clearly \(\mathbf {j}\in D^2\) because \((\widetilde{{\mathcal {L}}}\mathbf {j})_N \not =0\) only for \(N=M-1\), M, and \(M+1\). Moreover \(({\mathcal {K}}\mathbf {h},\mathbf {j})=a_M(K_Mh_M,j_M)_M\) is well defined for every \(\mathbf {h}\in L_s^2({\mathcal {R}},\varvec{\varGamma })\). Finally, we know that \(K_M\) is non positive and self-adjoint for every M. Thus we get

$$\begin{aligned} a_M((\widetilde{{\mathcal {L}}}^*\mathbf {h})_M,j_M)_M&= ((\widetilde{{\mathcal {L}}}^*\mathbf {h},\mathbf {j}) = (\mathbf {h},\widetilde{{\mathcal {L}}}\mathbf {j}) = (\mathbf {h},{\mathcal {G}}\mathbf {j})+\tilde{\lambda } a_M(h_M,K_Mj_M)_M\\&=a_M(({\mathcal {G}}\mathbf {h})_M,j_M)+\tilde{\lambda } a_M(K_Mh_M,j_M)_M =a_M((\widetilde{{\mathcal {L}}}\mathbf {h})_M,j_M)_M\, . \end{aligned}$$

This implies that \((\widetilde{{\mathcal {L}}}^*\mathbf {h})_M =(\widetilde{{\mathcal {L}}}\mathbf {h})_M\) for every M. This proves that \(\widetilde{{\mathcal {L}}}\) is self-adjoint. Observe also that \({\mathcal {G}}\mathbf {h}=0\) if and only if \(\mathbf {h}=c\mathbf {e}^0\), see Lemma 11, while \({\mathcal {K}}\) is non positive and \({\mathcal {K}}\mathbf {e}^0=0\). This completes the proof. \(\square \)

Let \(\mathbf {W}_1=\mathrm {span}\{\mathbf {e}_{\underline{\alpha }}\,|\, \lambda (\underline{\alpha })=1\}=\mathrm {span}\{{\mathcal {R}}^+_n\mathbf {e}^0\,|\,n\ge 0\}\). Observe that \({\mathcal {G}}\mathbf {h}=-\rho \mathbf {h}\) if \(\mathbf {h}\in \mathbf {W}_1\) while \((\mathbf {h},{\mathcal {G}}\mathbf {h})<-\rho (\mathbf {h},\mathbf {h})\) if \(\mathbf {h}\in D^2\), \(\mathbf {h}\perp \mathbf {e}^0\) but \(\mathbf {h}\not \in \mathbf {W}_1\). Thus we get

$$\begin{aligned} \varDelta \le -\rho +\tilde{\lambda }\sup \{(\mathbf {h},{\mathcal {K}} \mathbf {h})\,|\, \mathbf {h}\in D^2, \Vert \mathbf {h}\Vert _2=1, \mathbf {h}\perp \mathbf{E}_0\}\le -\rho \, . \end{aligned}$$

From [4] we know that \((f_N,K_Nf_N)\le 0\) for every \(f_N\) while \((f_N,K_Nf_N)= 0\) if and only if \(f_N\) is rotationally invariant. Since \(({\mathcal {R}}^+_n \mathbf {e}^0)_N=\sqrt{\rho /\mu }\sum _{i=1}^N L_{n}(v_i)\), for \(n>0\), while \(({\mathcal {R}}^+_0 \mathbf {e}^0)_N=\sqrt{\rho /\mu }N-\sqrt{\mu /\rho }\) we have that \({\mathcal {R}}^+_n\mathbf {e}^0\) is rotationally invariant if and only if \(n=0\) or \(n=2\). This implies that \((\mathbf {h},\widetilde{{\mathcal {L}}}\mathbf {h})=-\rho \Vert \mathbf {h}\Vert _2^2\) if and only if \(\mathbf {h}\in \mathrm {span}\{{\mathcal {R}}_0^+\mathbf {e}^0,{\mathcal {R}}_2^+\mathbf {e}^0\}\). Since \({\mathcal {R}}_0^+\mathbf {e}^0=\mathbf {e}_{(1,0,\ldots )}\) and \({\mathcal {R}}_2^+\mathbf {e}^0=\mathbf {e}_{(0,0,1,0,\ldots )}\), this completes the proof of Theorem 2. \(\square \)

3.3 Proof of Theorem 3

To prove Theorem 3, we need more information on the action of \({\mathcal {K}}\) on the basis vectors \(\mathbf {e}_{\underline{\alpha }}\).

As a basic step, we compute the action of \(R_{1,2}\), see (6), on the product of two Hermite polynomials in \(v_1\) and \(v_2\). A simple calculation, see e.g. [4], shows that \((R_{1,2}F)(v_1,v_2)=0\) for every F odd in \(v_1\) or \(v_2\). Thus, calling \(H_{(m_1,m_2)}(v_1,v_2)=H_{m_1}(v_1)H_{m_2}(v_2)\), it follows that \(R_{1,2}H_{(m_1,m_2)}\not =0\) if and only if \(m_1\) and \(m_2\) are both even while \(R_{1,2}H_{(2n_1,2n_2)}\) is a rotationally invariant polynomial of degree \(2(n_1+n_2)\) in \(v_1\) and \(v_2\). Moreover, if \(m_1+m_2<2n_1+2n_2\), we get

$$\begin{aligned}&\int H_{(m_1,m_2)}(v_1,v_2)\bigl (R_{1,2}H_{(2n_1,2n_2)}\bigr )(v_1,v_2) \gamma (v_1)\gamma (v_2)dv_1dv_2 \\&\quad =\int \bigl (R_{1,2}H_{(m_1,m_2)}\bigr )(v_1,v_2)H_{(2n_1,2n_2)} (v_1,v_2)\gamma (v_1)\gamma (v_2)dv_1dv_2 =0 \end{aligned}$$

where we have used that \(H_{(2n_1,2n_2)}\) is orthogonal to any polynomial of degree less than \(2(n_1+n_2)\). Thus we have \(R_{1,2}H_{(2n_1,2n_2)}\in \mathrm {span}\{H_{(p_1,p_2)}\,|\, p_1+p_2=2n_1+2n_2\}\) and, since \(H_n\) is a monic polynomial of degree n, we can write

$$\begin{aligned} R_{1,2}H_{(2n_1,2n_2)}=\sum _{k=0}^{n_1+n_2} a_{k,n_1,n_2} H_{(2k,2(n_1+n_2-k))}=\sum _{k=0}^{n_1+n_2} a_{k,n_1,n_2}v_1^{2k}v_2^{2(n_1+n_2-k)}+Q \end{aligned}$$

for suitable coefficients \(a_{k,n_1,n_2}\) and a polynomial \(Q(v_1,v_2)\) of degree strictly less than \(2(n_1+n_2)\). This, together with rotational invariance, implies that

$$\begin{aligned} R_{1,2}H_{(2n_1,2n_2)}=\tilde{\tau }_{n_1,n_2}\sum _{k=0}^{n_1+n_2} \left( {\begin{array}{c}n_1+n_2\\ k\end{array}}\right) H_{(2k,2(n_1+n_2-k))} \end{aligned}$$
(67)

for suitable coefficients \(\tilde{\tau }_{n_1,n_2}\). Using (67), together with (66), it is possible to give an explicit representation of \({\mathcal {K}}\) on the basis of the \(\mathbf {e}_{\underline{\alpha }}\). For the purpose of this paper, we will only need some particular cases discussed in detail below.

Let now \(\mathbf {V}_m=\mathrm {span}\{\mathbf {e}_{\underline{\alpha }}| \sum _{i=1}^\infty i\alpha _i=m\}=\mathrm {span}\{\prod _i ({\mathcal {R}}_i^+)^{\alpha _i}\mathbf {e}^0|\sum _{i=1}^\infty i\alpha _i=m\}\), that is \(\mathbf {V}_m\) is the subspace of all states \(\mathbf {h}\) such that \(h_N\) is a polynomial of degree m orthogonal to all polynomials of degree less than m. From the above considerations and (66) it follows that \({\mathcal {K}}\mathbf {V}_m\subset \mathbf {V}_m\) so that defining

$$\begin{aligned} \delta _{m}= \inf _{\begin{array}{c} \mathbf {h}\in \mathbf {V}_{m}\cap D^2\\ \Vert \mathbf {h}\Vert _2=1,\, \mathbf {h}\perp \mathbf {E}_1\oplus \mathbf {E}_0 \end{array}} (\mathbf {h},-\widetilde{{\mathcal {L}}}\mathbf {h}) \end{aligned}$$
(68)

and observing that \(L^2_s({\mathcal {R}},\varvec{\varGamma }) =\bigoplus _{m=0}^{\infty } \mathbf {V}_m\), we get \(\varDelta _2=-\inf _m\delta _m\).

Since \(\mathbf {E}_1=\mathrm {span}\{{\mathcal {R}}_0^+\mathbf {e}^0,{\mathcal {R}}_2^+\mathbf {e}^0\}\), we get

$$\begin{aligned} \mathbf {V}_0\cap (\mathbf {E}_1\oplus \mathbf {E}_0)^\perp&=\mathrm{span}\{({\mathcal {R}}^+_0)^n \mathbf {e}^0, n\ge 2\}\\ \mathbf {V}_2\cap (\mathbf {E}_1\oplus \mathbf {E}_0)^\perp&=\mathrm{span}\{({\mathcal {R}}^+_0)^n {\mathcal {R}}_2^+ \mathbf {e}^0, n\ge 1; ({\mathcal {R}}^+_0)^m ({\mathcal {R}}^+_1)^2\mathbf {e}^0, m\ge 0\}\, . \end{aligned}$$

Observing that \({\mathcal {K}}({\mathcal {R}}^+_0)^n\mathbf {e}^0={\mathcal {K}}({\mathcal {R}}^+_0)^n {\mathcal {R}}_2^+ \mathbf {e}^0=0\), due to rotational invariance, while \({\mathcal {K}}({\mathcal {R}}^+_0)^m ({\mathcal {R}}^+_1)^2\mathbf {e}^0=0\), due to parity, we obtain \(\delta _0=\delta _2=2\rho \). Moreover we have that, for \(m\not =0,2\), \(\mathbf {V}_m\perp \mathbf {E}_1 \oplus \mathbf {E}_0\). Thus we need a lower bound on \(\delta _m\) for m odd and for m even and greater than 2.

Observe that \(({\mathcal {R}}^+_{m}\mathbf {e}^0,{\mathcal {G}}{\mathcal {R}}^+_{m}\mathbf {e}^0)=-\rho \) while \((\mathbf {h},{\mathcal {G}}\mathbf {h})\le -2\rho (\mathbf {h},\mathbf {h})\) if \(\mathbf {h}\in \mathbf {V}_m\) and \(\mathbf {h}\perp {\mathcal {R}}^+_{m}\mathbf {e}^0\). Thus, if \(\lambda \) is not too big, it is natural to search for the infimum of \((\mathbf {h},-\widetilde{{\mathcal {L}}}\mathbf {h})\) on \(\mathbf {V}_m\) looking at states \(\mathbf {h}\) close to \({\mathcal {R}}^+_{m}\mathbf {e}^0\). To do this, we need the representation of \({\mathcal {K}}{\mathcal {R}}^+_m\mathbf {e}^0\) on the basis formed by the \(\mathbf {e}_{\underline{\alpha }}\). If \(m=2n\), using (67) for \(n_2=0\) we get

$$\begin{aligned} R_{1,2}H_{(2n,0)}=\tau _{n}\sum _{k=0}^{n} \left( {\begin{array}{c}n\\ k\end{array}}\right) H_{(2k,2(n-k))} \end{aligned}$$
(69)

where \(\tau _n=\tilde{\tau }_{n,0}\). To compute \(\tau _n\) we compare the coefficients of \(v_1^{2n}\) on the left and right hand side of (69). On the left hand side the only contribution comes from \(R_{1,2}v_1^{2n}\) since \(R_{1,2}\) preserve the degree. On the right hand side only the term with \(k=n\) contains the monomial \(v_1^{2n}\). Since the \(H_n\) are monic and

$$\begin{aligned} R_{1,2}v_1^{2n}=\int _0^{2\pi }(v_1\cos \theta - v_2\sin \theta )^{2n} \frac{d\theta }{2\pi }= (v_1^2 + v_2^2 )^{n}\int _0^{2\pi }\cos ^{2n} \theta \frac{d\theta }{2\pi }\, , \end{aligned}$$

we obtain

$$\begin{aligned} \tau _n=\int _0^{2\pi } \cos ^{2n}\theta \frac{d\theta }{2\pi } =\frac{1}{4^n}\left( {\begin{array}{c}2n\\ n\end{array}}\right) \, . \end{aligned}$$

Combining with (59) we get

$$\begin{aligned} R_{i,j}L_{2n}(v_i)=\tau _n\sum _{k=0}^n\left( {\begin{array}{c}n\\ k\end{array}}\right) \frac{\sqrt{(2k)![2(n-k)]!}}{\sqrt{(2n)!}}L_{2k}(v_i)L_{2(n-k)}(v_j)\, . \end{aligned}$$

Since for \(n>0\) we have \(({\mathcal {R}}^+_{2n}\mathbf {e}^0)_N =\sqrt{\rho /\mu }\sum _{i=1}^N L_{2n}(v_i)\), a direct computation shows that

$$\begin{aligned} ({\mathcal {K}}{\mathcal {R}}^+_{2n}\mathbf {e}^0)_N&=\sqrt{\frac{\rho }{\mu }}(N-1)(2\tau _n-1) \sum _{i=1}^NL_{2n}(v_i)\\&\quad +\sqrt{\frac{\rho }{\mu }}\sum _{k=1}^{n-1}\sum _{i\ne j} \sigma _{n,k}L_{2k}(v_i)L_{2(n-k)}(v_j) \end{aligned}$$

where

$$\begin{aligned} \sigma _{n,k}=\tau _n\frac{\left( {\begin{array}{c}n\\ k\end{array}}\right) }{\sqrt{\left( {\begin{array}{c}2n\\ 2k\end{array}}\right) }} =\sqrt{\tau _n\tau _k\tau _{n-k}}\,. \end{aligned}$$
(70)

This gives us

$$\begin{aligned} {\mathcal {K}}{\mathcal {R}}^+_{2n}\mathbf {e}^0&=(2\tau _n-1){\mathcal {R}}^+_{2n}{\mathcal {N}}\mathbf {e}^0 +\sqrt{\frac{\mu }{\rho }}\sum _{k=1}^{n-1} \sigma _{n,k}{\mathcal {R}}^+_{2k}{\mathcal {R}}^+_{2(n-k)}\mathbf {e}^0\nonumber \\&=\frac{\mu }{\rho }(2\tau _n-1){\mathcal {R}}^+_{2n}\mathbf {e}^0 +\sqrt{\frac{\mu }{\rho }}(2\tau _n-1) {\mathcal {R}}^+_0{\mathcal {R}}^+_{2n}\mathbf {e}^0\nonumber \\&\quad +\sqrt{\frac{\mu }{\rho }}\sum _{k=1}^{n-1} \sigma _{n,k}{\mathcal {R}}^+_{2k}{\mathcal {R}}^+_{2(n-k)}\mathbf {e}^0 \end{aligned}$$
(71)

where we have used that \({\mathcal {N}}\mathbf {e}^0=\sqrt{\frac{\mu }{\rho }} {\mathcal {R}}^+_0\mathbf {e}^0+\frac{\mu }{\rho }\mathbf {e}^0\).
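The closed form (70) for \(\sigma _{n,k}\) is an exact identity between binomial coefficients: squaring both sides, it reduces to \(\binom{2n}{n}\binom{n}{k}^2=\binom{2n}{2k}\binom{2k}{k}\binom{2(n-k)}{n-k}\), which can be verified in exact rational arithmetic (a sketch; names ours):

```python
from fractions import Fraction
from math import comb

def tau(n):
    # exact rational value of tau_n = binom(2n, n) / 4^n
    return Fraction(comb(2 * n, n), 4 ** n)

for n in range(1, 15):
    for k in range(1, n):
        # square of sigma_{n,k} computed from the two expressions in (70)
        lhs_sq = tau(n) ** 2 * comb(n, k) ** 2 / Fraction(comb(2 * n, 2 * k))
        rhs_sq = tau(n) * tau(k) * tau(n - k)
        assert lhs_sq == rhs_sq   # both sides are positive, so (70) holds
```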

If \(m=2n+1\), the identity \(R_{1,2}H_{2n+1}(v_1)=0\) gives

$$\begin{aligned} {\mathcal {K}}{\mathcal {R}}^+_{2n+1}\mathbf {e}^0=-\frac{\mu }{\rho }{\mathcal {R}}^+_{2n+1}\mathbf {e}^0 -\sqrt{\frac{\mu }{\rho }}{\mathcal {R}}^+_0 {\mathcal {R}}^+_{2n+1}\mathbf {e}^0\, . \end{aligned}$$
(72)

From (71) and (72) we get

$$\begin{aligned} \tilde{\lambda }({\mathcal {R}}^+_{2n}\mathbf {e}^0,{\mathcal {K}}{\mathcal {R}}^+_{2n}\mathbf {e}^0) =-\lambda (1-2\tau _n)\, , \quad \tilde{\lambda } ({\mathcal {R}}^+_{2n+1}\mathbf {e}^0,{\mathcal {K}}{\mathcal {R}}^+_{2n+1}\mathbf {e}^0)=-\lambda \end{aligned}$$

so that \(\delta _{2n}\le \rho +\lambda (1-2\tau _n)\) and \(\delta _{2n+1}\le \rho +\lambda \).

The following Lemma shows that, if the average number of particles in the steady state is large enough and \(\lambda \) is not too large, one can find a lower bound for \(\delta _m\) close to the upper bound derived above.

Lemma 16

For \(m=2n+1\) we have

$$\begin{aligned} \delta _{2n+1}\ge \min \left\{ \rho +\lambda -\lambda \sqrt{\frac{\rho }{\mu }}\,,\; 2\rho -\lambda \sqrt{\frac{\rho }{\mu }}\right\} \end{aligned}$$
(73)

while for \(m=2n\), \(n>1\), we have

$$\begin{aligned} \delta _{2n}\ge \min \left\{ \rho +(1-2\tau _n)\lambda -2\lambda \sqrt{\frac{\rho }{\mu }}\,,\;2\rho -2\lambda \sqrt{\frac{\rho }{\mu }}\right\} \,. \end{aligned}$$
(74)

Proof

See Appendix A.1. \(\square \)

Since \(\tau _2=3/8\) and \(({\mathcal {R}}_4^+\mathbf {e}^0, -\widetilde{{\mathcal {L}}}{\mathcal {R}}_4^+\mathbf {e}^0)=\rho +\lambda /4\), we get

$$\begin{aligned} \rho +\frac{\lambda }{4}-2\lambda \sqrt{\frac{\rho }{\mu }}<\delta _4\le \rho +\lambda /4. \end{aligned}$$

Moreover, thanks to (13),

$$\begin{aligned} 2\rho -\lambda \sqrt{\frac{\rho }{\mu }}>\rho +\frac{\lambda }{4}\,, \qquad \rho +\lambda -\lambda \sqrt{\frac{\rho }{\mu }}>\rho +\frac{\lambda }{4} \end{aligned}$$

so that \(\delta _{2n+1}>\delta _4\) for every n. Finally we observe that \(\tau _{n+1}<\tau _n\) and \(\tau _3=5/16\). Using (13) again it follows that, for \(n\ge 3\),

$$\begin{aligned} \delta _{2n}\ge \min \left\{ 2\rho -2\lambda \sqrt{\frac{\rho }{\mu }},(1-2\tau _3)\lambda +\rho -2\lambda \sqrt{\frac{\rho }{\mu }}\right\} >\rho +\frac{\lambda }{4}\ge \delta _4 \end{aligned}$$

so that \(\varDelta _2=-\delta _4\).

To show that \(\varDelta _2\) is an eigenvalue, we need to construct an eigenstate, that is we need to find \(\hat{\mathbf {h}}\in \mathbf {V}_4\) such that \(\widetilde{{\mathcal {L}}}\hat{\mathbf {h}}=-\delta _4 \hat{\mathbf {h}}\). To this end, it is enough to show that there exists \(\hat{\mathbf {h}}\in \mathbf {V}_4\) such that \((\hat{\mathbf {h}},\widetilde{{\mathcal {L}}} \hat{\mathbf {h}})=-\delta _4 (\hat{\mathbf {h}},\hat{\mathbf {h}})\). Observe that if \(\mathbf {h}\in \mathbf {V}_4\) then \({\mathcal {K}}\mathbf {h}\) is even. We thus restrict our search to \(\hat{\mathbf {h}}\in \mathbf {V}^e_4=\mathrm {span}\{({\mathcal {R}}^+_0)^k{\mathcal {R}}^+_4\mathbf {e}^0,\, ({\mathcal {R}}^+_0)^k({\mathcal {R}}^+_2)^2\mathbf {e}^0;\,k\ge 0 \}\).

Consider a sequence \(\mathbf {h}_n\in \mathbf {V}^e_4\) such that \(\Vert \mathbf {h}_n\Vert _2=1\) and \(\lim _{n\rightarrow \infty }(\mathbf {h}_n, -\widetilde{{\mathcal {L}}}\mathbf {h}_n)=\delta _4\). Calling \(\mathbf {V}^e_{4,k}=\mathrm {span}\{({\mathcal {R}}^+_0)^{k-1}{\mathcal {R}}^+_4\mathbf {e}^0, ({\mathcal {R}}^+_0)^{k-2}({\mathcal {R}}^+_2)^2\mathbf {e}^0 \}\) for \(k\ge 2\), while \(\mathbf {V}^e_{4,1}=\mathrm {span}\{{\mathcal {R}}^+_4\mathbf {e}^0\}\), we can write \(\mathbf {h}_n=\sum _{k=1}^\infty \mathbf {h}_{n,k}\) with \(\mathbf {h}_{n,k}\in \mathbf {V}^e_{4,k}\) and we can find a subsequence \(\mathbf {h}^1_n\) of \(\mathbf {h}_n\) such that \(\lim _{n\rightarrow \infty }\mathbf {h}^1_{n,1}=\hat{\mathbf {h}}_1\). Similarly we can find a new subsequence \(\mathbf {h}^2_n\) of \(\mathbf {h}^1_n\) such that \(\lim _{n\rightarrow \infty }\mathbf {h}^2_{n,2}=\hat{\mathbf {h}}_2\). Proceeding like this we find a sequence \(\mathbf {h}_n^\infty \) such that \(\lim _{n\rightarrow \infty }\mathbf {h}^\infty _{n,k}=\hat{\mathbf {h}}_k\), for every k. Analogously, since \(h^\infty _{n,N}\) is an even polynomial of degree 4 in \(\underline{v}_N\) we can assume, possibly at the cost of further extracting a subsequence, that \(\lim _{n\rightarrow \infty } h^\infty _{n,N}=\hat{h}_N\) for every N. From Fatou’s Lemma we get that \(\lim _{n\rightarrow \infty }\mathbf {h}^\infty _n=\hat{\mathbf {h}}\) with \(\Vert \hat{\mathbf {h}}\Vert _2\le 1\) while

$$\begin{aligned} (\hat{\mathbf {h}},-{\mathcal {G}}\hat{\mathbf {h}})=\rho \sum _{k=1}^\infty k\Vert \hat{\mathbf {h}}_k\Vert ^2 \le \liminf _{n\rightarrow \infty }\rho \sum _{k=1}^\infty k\Vert \mathbf {h}^\infty _{n,k}\Vert ^2 =\liminf _{n\rightarrow \infty }(\mathbf {h}^\infty _n,-{\mathcal {G}} \mathbf {h}^\infty _n) \end{aligned}$$

and analogously, since \(K_N\) is non positive,

$$\begin{aligned} (\hat{\mathbf {h}},-{\mathcal {K}}\hat{\mathbf {h}})&=\sum _{N=0}^\infty (\hat{h}_N,-K_N \hat{h}_N)_N \le \liminf _{n\rightarrow \infty }\sum _{N=0}^\infty (h^\infty _{n,N}, -K_N h^\infty _{n,N})_N \\&\le \liminf _{n\rightarrow \infty }(\mathbf {h}^\infty _n,-{\mathcal {K}}\mathbf {h}^\infty _n) \end{aligned}$$

so that

$$\begin{aligned} (\hat{\mathbf {h}},-\widetilde{{\mathcal {L}}}\hat{\mathbf {h}}) \le \liminf _{n\rightarrow \infty }(\mathbf {h}^\infty _n,-\widetilde{{\mathcal {L}}} \mathbf {h}^\infty _n)=\delta _4 \end{aligned}$$

while \((\hat{\mathbf {h}},-\widetilde{{\mathcal {L}}}\hat{\mathbf {h}})\ge \delta _4 \Vert \hat{\mathbf {h}}\Vert _2^2\) since \(\hat{\mathbf {h}}\in \mathbf {V}_4^e\). Thus we need to show that \(\Vert \hat{\mathbf {h}}\Vert _2=1\).

To this end observe that for every \(M>0\) we have

$$\begin{aligned} \rho M\sum _{k=M+1}^\infty \Vert \mathbf {h}_{n,k}\Vert _2^2 \le \rho \sum _{k=1}^\infty k\Vert \mathbf {h}_{n,k}\Vert _2^2 = (\mathbf {h}_n,-{\mathcal {G}}\mathbf {h}_n)\le (\mathbf {h}_n,-\widetilde{{\mathcal {L}}}\mathbf {h}_n)\le 2\delta _4 \end{aligned}$$

eventually in n. Thus, for every \(\epsilon \) there exists M such that \(\sum _{k=1}^M\Vert \mathbf {h}_{n,k}\Vert _2^2\ge 1-\epsilon \) eventually in n. Taking the limit, this implies that for every \(\epsilon \) there exists M such that \(\sum _{k=1}^M\Vert \hat{\mathbf {h}}_{k}\Vert _2^2 \ge 1-\epsilon \) and thus we get \(\Vert \hat{\mathbf {h}}\Vert _2=1\). This concludes the proof of Theorem 3. \(\square \)

3.4 Proof of Theorem 4

To simplify notation, given \(\mathbf {f}=\mathbf {h}\varvec{\varGamma }\), we set \(S(\mathbf {h})={{\mathcal {S}}}(\mathbf {f}\,|\,\varvec{\varGamma }) \) and we define

$$\begin{aligned} \varPsi (\mathbf {h})&=\sum _{N=0}^\infty a_{N}\int d\underline{v}_{N+1}(h_{N+1}-h_{N}) (\log h_{N+1}-\log h_{N})\gamma _{N+1}(\underline{v}_{N+1})\\ E(\mathbf {h})&=\sum _{N=0}^\infty a_{N}\int d\underline{v}_{N}h_N(\underline{v}_{N}) \gamma _N(\underline{v}_{N}). \end{aligned}$$
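Note that every summand of \(\varPsi (\mathbf {h})\) has an integrand of the form \((a-b)(\log a-\log b)\) with \(a,b>0\), which is non negative because \(\log \) is increasing; hence \(\varPsi (\mathbf {h})\ge 0\) and Lemma 19 below is a genuine dissipation bound. An elementary numerical check of this sign:

```python
import random
from math import log

random.seed(0)
for _ in range(10_000):
    a = random.uniform(1e-9, 100.0)
    b = random.uniform(1e-9, 100.0)
    # (a - b) and (log a - log b) always share their sign
    assert (a - b) * (log(a) - log(b)) >= 0.0
```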

Finally we observe that if \(\mathbf {f}\in L^1_s({\mathcal {R}})\) then \(\mathbf {h}\in L^1_s({\mathcal {R}},\varvec{\varGamma })\) and \(e^{{\mathcal {L}}t}\mathbf {f}=(e^{\widetilde{{\mathcal {L}}}t}\mathbf {h})\varvec{\varGamma }\) with \(\widetilde{{\mathcal {L}}}={\mathcal {G}}+\tilde{\lambda }{\mathcal {K}}\) defined in Sect. 3.2 but now considered as an operator on \(L^1_s({\mathcal {R}},\varvec{\varGamma })\).

To obtain an explicit expression for \(\frac{d}{dt}S(\mathbf {h}(t))\), where \(\mathbf {h}(t)=e^{\widetilde{{\mathcal {L}}}t}\mathbf {h}\), we need to interchange the derivative in t with the sum over N and the integral over \(\underline{v}_N\). To do this we will use the following two Lemmas, which will allow us to use Fatou’s Lemma to exchange derivatives and integrals.

Lemma 17

Given \(\mathbf {f}\in L^1({\mathcal {R}})\) we have

$$\begin{aligned}&\lim _{t\rightarrow 0^+}\left( \left( e^{{\mathcal {L}} t}\mathbf {f}\right) _N (\underline{v}_N)-f_N(\underline{v}_N)\right) =0\\&\lim _{t\rightarrow 0^+} \frac{1}{t}\left( \left( e^{{\mathcal {L}} t} \mathbf {f}\right) _N(\underline{v}_N)-f_N(\underline{v}_N)\right) =\left( {\mathcal {L}} \mathbf {f}\right) _N(\underline{v}_N) \end{aligned}$$

for every N and almost every \(\underline{v}_N\).

Proof

See Appendix A.2. \(\square \)

Lemma 18

If \(\mathbf {h}\varvec{\varGamma }\in L^1_s({\mathcal {R}})\) then

$$\begin{aligned} h_N(t)\log (h_N(t))\le \left( e^{\widetilde{{\mathcal {L}}}t} (\mathbf {h}\log \mathbf {h})\right) _N\, . \end{aligned}$$

Proof

See Appendix A.3. \(\square \)

After setting

$$\begin{aligned} \frac{d_+}{dt}S(\mathbf {h}(t)):=\limsup _{s\rightarrow 0^+} \frac{1}{s}(S(\mathbf {h}(t+s))-S(\mathbf {h}(t)))\,, \end{aligned}$$

we are ready to estimate the variation in time of \(S(\mathbf {h})\).

Lemma 19

Let \(\mathbf {h}\) be such that \(\mathbf {h}\varvec{\varGamma }\in D^1\) and \(\mathbf {h}\log \mathbf {h}\varvec{\varGamma }\in D^1\). Then we have

$$\begin{aligned} \frac{d_+}{dt}S(\mathbf {h}(t))\le -\mu \varPsi (\mathbf {h}(t))\, . \end{aligned}$$

Proof

From Lemma 18 we get

$$\begin{aligned}&\frac{1}{t}\bigl (h_N(\underline{v}_N,t)\log (h_N(\underline{v}_N,t)) -h_N(\underline{v}_N)\log (h_N(\underline{v}_N))\bigr )\\&\quad -\frac{1}{t}\left( \left( e^{\widetilde{{\mathcal {L}}}t} (\mathbf {h}\log \mathbf {h})\right) _N(\underline{v}_N)-h_N(\underline{v}_N)\log (h_N(\underline{v}_N))\right) \le 0\, . \end{aligned}$$

Since \(\mathbf {h}\log \mathbf {h}\varvec{\varGamma }\in L^1({\mathcal {R}})\), conservation of probability gives

$$\begin{aligned} \sum _{N=0}^\infty a_N\int _{{\mathbb {R}}^N} \left( \left( e^{\widetilde{{\mathcal {L}}}t} (\mathbf {h}\log \mathbf {h})\right) _N(\underline{v}_N)\gamma _N(\underline{v}_N) -h_N(\underline{v}_N)\log (h_N(\underline{v}_N))\gamma _N(\underline{v}_N)\right) d\underline{v}_N=0 \end{aligned}$$

so that by Fatou’s Lemma

$$\begin{aligned}&\limsup _{t\rightarrow 0^+} \frac{1}{t}(S(\mathbf {h}(t))-S(\mathbf {h}))\\&\quad \le \sum _{N=0}^\infty a_N\int _{{\mathbb {R}}^N}\limsup _{t\rightarrow 0^+} \frac{1}{t}\bigl (h_N(\underline{v}_N,t)\log (h_N(\underline{v}_N,t))-h_N(\underline{v}_N) \log (h_N(\underline{v}_N))\bigr )\gamma _N(\underline{v}_N)d\underline{v}_N\\&\qquad -\sum _{N=0}^\infty a_N\int _{{\mathbb {R}}^N}\limsup _{t\rightarrow 0^+} \frac{1}{t}\left( \left( e^{\widetilde{{\mathcal {L}}}t} (\mathbf {h}\log \mathbf {h})\right) _N(\underline{v}_N)-h_N(\underline{v}_N)\log (h_N(\underline{v}_N))\right) \gamma _N(\underline{v}_N)d\underline{v}_N \end{aligned}$$

and, using Lemma 17, we get

$$\begin{aligned} \limsup _{t\rightarrow 0^+} \frac{1}{t}(S(\mathbf {h}(t))-S(\mathbf {h}))&\le \sum _{N=0}^\infty a_N \int _{{\mathbb {R}}^N} (\widetilde{{\mathcal {L}}}\mathbf {h})_N (\underline{v}_N)(\log (h_N(\underline{v}_N))+1)\gamma _N(\underline{v}_N)d\underline{v}_N\\&\quad -\sum _{N=0}^\infty a_N \int _{{\mathbb {R}}^N} \left( \widetilde{{\mathcal {L}}}(\mathbf {h}\log \mathbf {h})\right) _N (\underline{v}_N)\gamma _N(\underline{v}_N)d\underline{v}_N \, . \end{aligned}$$

Since \(\varvec{\varGamma } \mathbf {h}\in D^1\) and \(\varvec{\varGamma } \mathbf {h}\log \mathbf {h}\in D^1\), (45) gives

$$\begin{aligned} \frac{d_+}{dt}S(\mathbf {h}(t))\bigr |_{t=0}&\le \sum _{N=0}^\infty a_N \int d\underline{v}_N\gamma _N (\widetilde{{\mathcal {L}}}\mathbf {h})_N\log (h_N) \\&=\sum _{N=0}^\infty a_N\int d\underline{v}_N\gamma _N \left( \rho ({\mathcal {P}}^+\mathbf {h})_{N}+\mu ({\mathcal {P}}^-\mathbf {h})_{N} \right. \\ {}&\quad \left. -(\mu +\rho N)h_N+\tilde{\lambda } K_Nh_N\right) \log h_N \\&\le \sum _{N=0}^\infty a_N\int d\underline{v}_N\gamma _N \left( \rho ({\mathcal {P}}^+\mathbf {h})_{N}+\mu ({\mathcal {P}}^-\mathbf {h})_{N} -(\mu +\rho N)h_N\right) \log h_N \end{aligned}$$

where we have used that \(\int d\underline{v}_N\gamma _N (K_Nh_N)\log h_N\le 0\). Observe finally that

$$\begin{aligned} \int d\underline{v}_N\gamma _N\rho ({\mathcal {P}}^+\mathbf {h})_N\log h_N&=\int d\underline{v}_N \gamma _NN\rho h_{N-1}\log h_N\\ \int d\underline{v}_N\gamma _N\mu ({\mathcal {P}}^-\mathbf {h})_N\log h_N&=\int d\underline{v}_{N+1} \gamma _{N+1}\mu h_{N+1}\log h_N \end{aligned}$$

from which we get

$$\begin{aligned} \frac{d_+}{dt}S(\mathbf {h}(t))\bigr |_{t=0}&\le \sum _{N=1}^\infty a_N \int d\underline{v}_N\gamma _NN\rho (h_{N-1}-h_N)\log h_N\\&\quad +\mu \sum _{N=0}^\infty a_N\int d\underline{v}_{N+1}\gamma _{N+1} (h_{N+1}-h_N)\log h_N\, . \end{aligned}$$

The claim follows by reindexing the first sum and using (53). \(\square \)

Thus to show that \(S(\mathbf {h}(t))\) decays exponentially we need a lower bound for \(\varPsi (\mathbf {h})\) in terms of \(S(\mathbf {h})\). This is the content of the following Lemma that is the main result of this section.

Lemma 20

If \(\varvec{\varGamma }\mathbf {h}\in L^1_s({\mathcal {R}})\) with \(S(\mathbf {h})<\infty \), then

$$\begin{aligned} S(\mathbf {h})\le E(\mathbf {h})\log E(\mathbf {h}) +\frac{\mu }{\rho }\varPsi (\mathbf {h})\,. \end{aligned}$$
(75)

Remark 21

The idea behind the proof of (75) is to think of the entry and exit processes defined by the thermostat as a continuous family of independent entry processes, one for each possible velocity v, with entry rates \(\mu \gamma (v)dv\), while each particle in the system leaves with rate \(\rho \) independent of its velocity. Clearly such a description makes little mathematical sense and, as a first step, one may think of approximating the original process by restricting the velocity of each particle to assume only a finite number of values \(\bar{v}_k\), \(k=1,\ldots ,K\), characterized by suitable entry rates \(\omega _k\). After this, using convexity, we reduce the proof of (75) to the case with \(K=1\), essentially equivalent to the case in which all particles in the thermostat have the same velocity. In this situation, we further approximate the infinite reservoir by a large finite reservoir containing M particles that enter and leave the system, independently from each other, at a suitable rate. Convexity will allow us to reduce this situation to that of a single particle jumping from the system to the reservoir and back. The final step is thus Lemma 25 below that deals with this situation. This argument is inspired by the proof of the Logarithmic Sobolev Inequality in [12].

Remark 22

In the proof of Lemma 19 we required that \(\mathbf {h}\varvec{\varGamma }\in D^1\) and \(\mathbf {h}\log \mathbf {h}\varvec{\varGamma }\in D^1\) only to differentiate \(e^{t\widetilde{{\mathcal {L}}}}\mathbf {h}\) and show that \(\sum _{N=0}^\infty a_N\int (\widetilde{{\mathcal {L}}}\mathbf {h})_N \gamma _Nd\underline{v}_N=0\), and similarly for \(\mathbf {h}\log \mathbf {h}\). We believe it is possible to apply the strategy outlined in Remark 21, and developed in the proof below, directly to \(S(\mathbf {h})\) thanks to the representation of the evolution described in Remark 8. This would eliminate the need for conditions on \(\mathbf {h}\) but it would make the proof below unnecessarily involved.

Proof of Lemma 20

A way to make the first step of the discussion in Remark 21 rigorous is to coarse grain, that is to approximate each \(h_N\) by a simple function obtained by averaging it over the elements of a partition of \({\mathbb {R}}^N\) made of rectangles obtained as the Cartesian product of a finite number of measurable sets of \({\mathbb {R}}\).

More precisely, we call \({\mathcal {B}}=\{B_k\}_{k=1}^K\) a (measurable) partition of \({\mathbb {R}}^N\) if the \(B_k\subset {\mathbb {R}}^N\) are measurable, \(\bigcup _k B_k={\mathbb {R}}^N\) and \(B_k\cap B_{k'}=\emptyset \) if \(k\not =k'\). Given a measurable partition \({\mathcal {B}}\) let \(I_k(\underline{v}_N,\underline{w}_N)\) be the indicator function of \(B_k\times B_k\subset {\mathbb {R}}^{2N}\) and define the coarse graining kernel:

$$\begin{aligned} C_{\mathscr {B}}(\underline{v}_N,\underline{w}_N)=\sum _{k=1}^K \frac{1}{\omega _{k}}I_k(\underline{v}_N,\underline{w}_N) \quad \mathrm {with} \quad \omega _{k} =\int _{B_k}\gamma _N(\underline{v}_N)d\underline{v}_N\,. \end{aligned}$$

Clearly, for every \(\underline{w}_N\) we have

$$\begin{aligned} \int _{{\mathbb {R}}^N}C_{\mathcal {B}}(\underline{v}_N,\underline{w}_N) \gamma (\underline{v}_N)d\underline{v}_N=1 \end{aligned}$$

while \(C_{\mathcal {B}}(\underline{v}_N,\underline{w}_N)=C_{\mathcal {B}} (\underline{w}_N,\underline{v}_N)\). Given a function \(h_N\) in \(L^1({\mathbb {R}}^N,\gamma _N)\), we can define its coarse grained version as

$$\begin{aligned} h_{N,{\mathcal {B}}}(\underline{v}_N)=\int _{{\mathbb {R}}^N}C_{\mathcal {B}} (\underline{v}_N,\underline{w}_N)h_N(\underline{w}_N)\gamma (\underline{w}_N)d\underline{w}_N\, . \end{aligned}$$

Observe that, if \(\underline{v}_N\in B_k\) then

$$\begin{aligned} h_{N,{\mathcal {B}}}(\underline{v}_N)=\frac{1}{\omega _{k}}\int _{B_k} \gamma (\underline{w}_N)h_N(\underline{w}_N)d\underline{w}_N\, . \end{aligned}$$

This means that \(h_{N,{\mathcal {B}}}(\underline{v}_N)\) is a simple function that assumes only K possible values. Finally we have \(\int _{{\mathbb {R}}^N}h_{N,{\mathcal {B}}}(\underline{v}_N)\gamma (\underline{v}_N)d\underline{v}_N =\int _{{\mathbb {R}}^N}h_N(\underline{v}_N) \gamma (\underline{v}_N) d\underline{v}_N\).
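These two properties, that \(h_{N,{\mathcal {B}}}\) is simple with at most K values and that it preserves the \(\gamma \)-mean of \(h_N\), can be checked on a one-dimensional grid. The following sketch is purely illustrative and not from the paper: \(\gamma \) is taken to be a standard Gaussian and the partition consists of 12 intervals of \([-6,6]\).

```python
import numpy as np

# Minimal grid sketch of the coarse-graining map h -> h_B: gamma is
# assumed Gaussian (illustration only); h_B is the gamma-conditional
# average of h on each cell, and the gamma-mean of h is preserved.
v = np.linspace(-6, 6, 2001)
dv = v[1] - v[0]
gamma = np.exp(-v**2 / 2) / np.sqrt(2 * np.pi)
h = 1.0 + 0.5 * np.sin(v)                 # some positive "state" h

edges = np.linspace(-6, 6, 13)            # partition B of [-6,6], K = 12
idx = np.clip(np.digitize(v, edges) - 1, 0, 11)
h_B = np.empty_like(h)
for k in range(12):
    m = idx == k
    h_B[m] = np.sum((h * gamma)[m]) / np.sum(gamma[m])   # dv cancels

mean_h = np.sum(h * gamma) * dv           # int h gamma dv
mean_hB = np.sum(h_B * gamma) * dv        # int h_B gamma dv
```

By construction `h_B` takes at most 12 distinct values and its \(\gamma \)-mean agrees with that of `h` up to rounding.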

Given measurable partitions \({\mathcal {B}}=\{B_k\}_{k=1}^K\) and \({\mathcal {B}}'=\{B'_j\}_{j=1}^J\)of \({\mathbb {R}}^N\) and \({\mathbb {R}}^M\) respectively, we can define the product partition \({\mathcal {B}}\times {\mathcal {B}}'=\{B_k\times B'_j\,|k=1,\ldots ,K\,\,\, j=1,\ldots ,J\}\) of \({\mathbb {R}}^{N+M}\). Observe that the coarse graining kernel of \({\mathcal {B}}\times {\mathcal {B}}'\) satisfies

$$\begin{aligned} C_{{\mathcal {B}}\times {\mathcal {B}}'}(\underline{v}_N,\underline{v}'_M, \underline{w}_N,\underline{w}'_M)= C_{{\mathcal {B}}} (\underline{v}_N,\underline{w}_N)C_{{\mathcal {B}}'}(\underline{v}'_M,\underline{w}'_M)\, . \end{aligned}$$

Finally, given a partition \({\mathcal {B}}=\{B_k\}_{k=1}^K\) of \({\mathbb {R}}\), and \(\underline{k}=(k_1,\ldots ,k_N)\in \{1,\ldots ,K\}^N\) we consider the set \(B_{\underline{k}}=\times _i B_{k_i}\subset {\mathbb {R}}^N\). Clearly the \(B_{\underline{k}}\) form a measurable partition of \({\mathbb {R}}^N\) that we will denote as \({\mathcal {B}}^N\). As before, we can define the coarse graining kernel for \({\mathcal {B}}^N\) as

$$\begin{aligned} C_{{\mathcal {B}}^N}(\underline{v}_N,\underline{w}_N) =\sum _{\underline{k}\in \{1,\ldots , K\}^N} \frac{1}{\omega _{\underline{k}}} I_{\underline{k}}(\underline{v}_N,\underline{w}_N) \end{aligned}$$

where \(\omega _{\underline{k}}=\prod _{i=1}^{N}\omega _{k_i}\) and \(I_{\underline{k}}(\underline{v}_N,\underline{w}_N)\) is the characteristic function of \(B_{\underline{k}}\times B_{\underline{k}}\subset {\mathbb {R}}^{2N}\). Moreover the coarse grained version of \(h_N\in L^1({\mathbb {R}}^N,\gamma _N)\) is

$$\begin{aligned} h_{N,{\mathcal {B}}^N}(\underline{v}_N)=\int _{{\mathbb {R}}^N}\gamma (\underline{w}_N) C_{{\mathcal {B}}^N}(\underline{v}_N,\underline{w}_N)h_N(\underline{w}_N) d\underline{w}_N\, . \end{aligned}$$

Again, if \(\underline{v}_N\in B_{\underline{k}}\) we have

$$\begin{aligned} h_{N,{\mathcal {B}}^N}(\underline{v}_N)=\frac{1}{\omega _{\underline{k}}} \int _{B_{\underline{k}}}h_N(\underline{w}_N)\gamma _{N}(\underline{w}_{N}) d \underline{w}_{N}:=\bar{h}_{N,{\mathcal {B}}^N}(\underline{k}) \end{aligned}$$

and \(h_{N,{\mathcal {B}}^N}(\underline{v}_N)\) assumes only the \(K^N\) possible values \(\bar{h}_{N,{\mathcal {B}}^N}(\underline{k})\). Observe finally that, since

$$\begin{aligned} C_{{\mathcal {B}}^N}(\underline{v}_N,\underline{w}_N)=\prod _{i=1}^N C_{{\mathcal {B}}}(v_i,w_i)\,, \end{aligned}$$

we can write

$$\begin{aligned} h_{N-1,{\mathcal {B}}^{N-1}}(\underline{v}_{N-1})=\int _{{\mathbb {R}}^N} \gamma (\underline{w}_N)C_{{\mathcal {B}}^N}(\underline{v}_N,\underline{w}_N) h_{N-1}(\underline{w}_{N-1})d\underline{w}_N\, . \end{aligned}$$
(76)

Given a state \(\mathbf {h}\) and a partition \({\mathcal {B}}\) of \({\mathbb {R}}\), we define the coarse grained version \(\mathbf {h}_{{\mathcal {B}}}\) of \(\mathbf {h}\) over \({\mathcal {B}}\) by setting \(h_{{\mathcal {B}},N}=h_{N,{\mathcal {B}}^N}\). Since \(x\log (x)\) is convex in x and \((x-y)(\log (x)-\log (y))\) is jointly convex in (x, y), for every partition \({\mathcal {B}}\) of \({\mathbb {R}}\), we get

$$\begin{aligned} S(\mathbf {h}_{{\mathcal {B}}})\le S(\mathbf {h}),\qquad \varPsi (\mathbf {h}_{{\mathcal {B}}}) \le \varPsi (\mathbf {h}),\qquad E(\mathbf {h}_{{\mathcal {B}}})=E(\mathbf {h}) \end{aligned}$$
(77)

where in the inequality for \(\varPsi \) we used (76). On the other hand, we have the following Lemma.

Lemma 23

Given \(\mathbf {h}\), for every \(\epsilon >0\) we can find a finite measurable partition \({\mathcal {B}}\) of \({\mathbb {R}}\) such that

$$\begin{aligned} S(\mathbf {h})-S(\mathbf {h}_{{\mathcal {B}}})\le \epsilon \end{aligned}$$

Proof

See Appendix A.4. \(\square \)
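The entropy inequality in (77) and the approximation property of Lemma 23 can both be observed numerically. The following sketch is illustrative only and not from the paper: \(\gamma \) is taken to be a standard Gaussian on a grid, and two partitions of different fineness are compared.

```python
import numpy as np

# Illustration of (77) and Lemma 23: coarse graining decreases
# S(h) = int h log h gamma dv (Jensen, x log x convex), and refining
# the partition makes the entropy loss small. Gamma is assumed
# Gaussian for the sketch.
v = np.linspace(-6, 6, 4001)
dv = v[1] - v[0]
gamma = np.exp(-v**2 / 2) / np.sqrt(2 * np.pi)
h = np.exp(np.cos(v))                     # a positive test state

def entropy(u):
    return np.sum(u * np.log(u) * gamma) * dv

def coarse(u, K):
    # gamma-conditional average of u on K equal intervals of [-6, 6]
    edges = np.linspace(-6, 6, K + 1)
    idx = np.clip(np.digitize(v, edges) - 1, 0, K - 1)
    out = np.empty_like(u)
    for k in range(K):
        m = idx == k
        out[m] = np.sum((u * gamma)[m]) / np.sum(gamma[m])
    return out

S = entropy(h)
S_coarse = entropy(coarse(h, 12))
S_fine = entropy(coarse(h, 400))
```

For this smooth `h` the entropy loss with 400 cells is already far smaller than with 12, in the spirit of Lemma 23.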

We thus claim that to prove Lemma 20 we just need to show that, for every finite partition \({\mathcal {B}}\) of \({\mathbb {R}}\) and every state \(\mathbf {h}\) we have

$$\begin{aligned} S(\mathbf {h}_{{\mathcal {B}}})\le E(\mathbf {h}_{{\mathcal {B}}})\log E (\mathbf {h}_{{\mathcal {B}}})+\frac{\mu }{\rho }\varPsi (\mathbf {h}_{{\mathcal {B}}})\, . \end{aligned}$$
(78)

To see this observe that Lemma 23, together with (77) and (78), implies that for every \(\epsilon \) we can find a partition \({\mathcal {B}}\) such that

$$\begin{aligned} S(\mathbf {h})\le S(\mathbf {h}_{{\mathcal {B}}})+\epsilon&\le E(\mathbf {h}_{{\mathcal {B}}})\log E(\mathbf {h}_{{\mathcal {B}}})+\frac{\mu }{\rho }\varPsi (\mathbf {h}_{{\mathcal {B}}})+\epsilon \\&\le E(\mathbf {h})\log E(\mathbf {h})+\frac{\mu }{\rho }\varPsi (\mathbf {h})+\epsilon \, . \end{aligned}$$

Thus we consider a given finite partition \({\mathcal {B}}=\{B_k\}_{k=1}^K\) and a given state \(\mathbf {h}\). Since \(h_{{\mathcal {B}}, N}\) takes only finitely many values, we can transform the integrals defining \(E(\mathbf {h}_{{\mathcal {B}}})\), \(S(\mathbf {h}_{{\mathcal {B}}})\) and \(\varPsi (\mathbf {h}_{{\mathcal {B}}})\) into summations. To do this, given \(\underline{k}\in \{1,\ldots , K\}^N\), we define the occupation numbers \(\underline{n}(\underline{k})=(n_1(\underline{k}),\dots ,n_K(\underline{k}))\in {\mathbb {N}}^K\) as

$$\begin{aligned} n_q(\underline{k})=\sum _i \delta _{q,k_i}\,. \end{aligned}$$

That is, \(n_q(\underline{k})\) is the number of indices i such that \(k_i=q\). In other words, if \(\underline{v}_N\in B_{\underline{k}}\) then there are \(n_q(\underline{k})\) particles with velocity in \(B_q\).

The fact that \(h_N\) is invariant under permutation of its arguments implies that \(\bar{h}_{N,{\mathcal {B}}^N}(\underline{k})\) depends only on \(\underline{n}(\underline{k})\) or, more precisely, if \(\underline{n}(\underline{k})=\underline{n}(\underline{k}')\) then \(\bar{h}_{N,{\mathcal {B}}^N}(\underline{k})=\bar{h}_{N,{\mathcal {B}}^N}(\underline{k}')\). This allows us to define the function \(F:{\mathbb {N}}^K\rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} F(\underline{n})=\bar{h}_{N,{\mathcal {B}}^N}(\underline{k})\quad \mathrm {if}\quad \underline{n}=\underline{n}(\underline{k}), \;\;\mathrm { and }\;\; N=\sum _{k=1}^K n_k:=|\underline{n}|\, . \end{aligned}$$

Using this definition and the fact that \(\sum _{k=1}^K\omega _k=1\), we can now write

$$\begin{aligned} E(\mathbf {h}_{{\mathcal {B}}})&=\sum _N a_N \sum _{\underline{k}\in \{1,\ldots ,K\}^N}\bar{h}_{N,{\mathcal {B}}^N}(\underline{k})\omega _{\underline{k}}\nonumber \\&=\sum _N \frac{e^{-\frac{\mu }{\rho }}}{N!}\left( \frac{\mu }{\rho }\right) ^N \sum _{|\underline{n}|=N}\left( {\begin{array}{c}N\\ n_1,\ldots ,n_K\end{array}}\right) F(\underline{n})\prod _{k=1}^K \omega _k^{n_k}\nonumber \\&=\sum _{\underline{n}\in {\mathbb {N}}^K}F(\underline{n})\prod _{k=1}^K \pi _{\alpha _k}(n_k) :=\widetilde{E}_{\underline{\alpha }_K}(F) \end{aligned}$$
(79)

where \(\underline{\alpha }_K=(\alpha _1,\ldots ,\alpha _K)\) with \(\alpha _k=\mu \omega _k/\rho \) and

$$\begin{aligned} \pi _{\alpha }(n)=e^{-\alpha }\frac{\alpha ^n}{n!}\,, \end{aligned}$$

that is, \(\pi _{\alpha }\) is the Poisson distribution with expected value \(\alpha \). Similarly we have

$$\begin{aligned} S(\mathbf {h}_{{\mathcal {B}}})&=\sum _N a_N\sum _{\underline{k}\in \{1,\ldots ,K\}^N} \bar{h}_{N,{\mathcal {B}}^N}(\underline{k})\log (\bar{h}_{N,{\mathcal {B}}^N}(\underline{k}))\omega _{\underline{k}}\nonumber \\&=\sum _{\underline{n}\in {\mathbb {N}}^K} F(\underline{n})\log (F(\underline{n}))\prod _{k=1}^K \pi _{\alpha _k}(n_k):=\widetilde{S}_{\underline{\alpha }_K}(F) \end{aligned}$$
(80)

Finally setting \(\underline{n}^{q}=(n_1,\ldots ,n_q+1,\ldots ,n_K)\) we get

$$\begin{aligned} \varPsi (\mathbf {h}_{{\mathcal {B}}})&=\sum _N a_{N}\sum _{\underline{k}\in \{1,\ldots ,K \}^{N}} \sum _{q=1}^K (\bar{h}_{N+1, {\mathcal {B}}^{N+1}}(\underline{k},q) -\bar{h}_{N,{\mathcal {B}}^N}(\underline{k}))\cdot \nonumber \\&\quad \times (\log \bar{h}_{N+1,{\mathcal {B}}^{N+1}}(\underline{k},q) -\log \bar{h}_{N,{\mathcal {B}}^N}(\underline{k}))\omega _{\underline{k}}\omega _q\nonumber \\&=\frac{\rho }{\mu }\sum _{q=1}^K\alpha _q\sum _{\underline{n}\in {\mathbb {N}}^K} \left( F(\underline{n}^q)-F(\underline{n})\right) \left( \log F(\underline{n}^q)-\log F(\underline{n})\right) \prod _{k=1}^K \pi _{\alpha _k}(n_k)\nonumber \\&:=\frac{\rho }{\mu }\widetilde{\varPsi }_{\underline{\alpha }_K}(F)\, . \end{aligned}$$
(81)

Thus, to prove (78), we need to show that for every K, every \(\underline{\alpha }_K\in {\mathbb {R}}_+^K\) and every \(F:{\mathbb {N}}^K\rightarrow {\mathbb {R}}_+\) with \(\widetilde{S}_{\underline{\alpha }_K}(F)<\infty \), we have

$$\begin{aligned} \widetilde{S}_{\underline{\alpha }_K}(F) \le \widetilde{\varPsi }_{\underline{\alpha }_K}(F) +\widetilde{E}_{\underline{\alpha }_K}(F)\log \widetilde{E}_{\underline{\alpha }_K}(F)\, . \end{aligned}$$
(82)
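The computation in (79) rests on the classical Poissonization identity: distributing a Poisson number of particles over K boxes with probabilities \(\omega _k\) produces independent Poisson counts with means \(\alpha _k=\mu \omega _k/\rho \). A small exact check of this identity, with arbitrary illustrative parameters and K = 2, not from the paper:

```python
import math

# Check that sum_N a_N * multinomial(N; n) * prod omega^{n_k} * F(n)
# equals sum_n F(n) * prod_k Poisson(alpha_k)(n_k) with
# alpha_k = lam * omega_k (the identity behind (79)).
lam = 1.7                       # plays the role of mu/rho
omega = (0.3, 0.7)
alpha = tuple(lam * w for w in omega)
F = lambda n1, n2: 1.0 / (1 + n1 + 2 * n2)   # arbitrary test function

M = 40                          # truncation of the sums
lhs = 0.0
for n1 in range(M):
    for n2 in range(M):
        N = n1 + n2
        a_N = math.exp(-lam) * lam**N / math.factorial(N)
        mult = math.factorial(N) // (math.factorial(n1) * math.factorial(n2))
        lhs += a_N * mult * omega[0]**n1 * omega[1]**n2 * F(n1, n2)

poisson = lambda a, n: math.exp(-a) * a**n / math.factorial(n)
rhs = sum(poisson(alpha[0], n1) * poisson(alpha[1], n2) * F(n1, n2)
          for n1 in range(M) for n2 in range(M))
```

The two truncated sums agree term by term, since \(e^{-\lambda }\frac{\lambda ^N}{N!}\binom{N}{n_1\,n_2}\omega _1^{n_1}\omega _2^{n_2}=\pi _{\alpha _1}(n_1)\pi _{\alpha _2}(n_2)\).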

We will prove (82) by induction over K. Assume that (82) is valid for every index less than K for some \(K>1\) and write

$$\begin{aligned} \widetilde{S}_{\underline{\alpha }_{K-1}}(F(\cdot ,n_K)) =\sum _{\underline{n}'\in {\mathbb {N}}^{K-1}}F(\underline{n}',n_K)\log F(\underline{n}',n_K) \prod _{k=1}^{K-1} \pi _{\alpha _k}(n_k) \end{aligned}$$

and similar expressions for \(\widetilde{E}_{\underline{\alpha }_{K-1}}(F(\cdot ,n_K))\) and \(\widetilde{\varPsi }_{\underline{\alpha }_{K-1}}(F(\cdot ,n_K))\).

Using the inductive hypothesis we obtain

$$\begin{aligned} \widetilde{S}_{\underline{\alpha }_{K}}(F)&=\sum _{n_K=0}^\infty \widetilde{S}_{\underline{\alpha }_{K-1}} (F(\cdot ,n_K))\pi _{\alpha _K}(n_K) \le \sum _{n_K=0}^\infty \widetilde{\varPsi }_{\underline{\alpha }_{K-1}} (F(\cdot ,n_K))\pi _{\alpha _K}(n_K)\\&\quad + \sum _{n_K=0}^\infty \widetilde{E}_{\underline{\alpha }_{K-1}} (F(\cdot ,n_K))\log \widetilde{E}_{\underline{\alpha }_{K-1}} (F(\cdot ,n_K))\pi _{\alpha _K}(n_K)\, . \end{aligned}$$

Calling \(F_1(n_K)=\widetilde{E}_{\underline{\alpha }_{K-1}} (F(\cdot ,n_K))\) and using the inductive hypothesis again we get

$$\begin{aligned} \sum _{n_K=0}^\infty \widetilde{E}_{\underline{\alpha }_{K-1}} (F(\cdot ,n_K))\log \widetilde{E}_{\underline{\alpha }_{K-1}} (F(\cdot ,n_K))\pi _{\alpha _K}(n_K)&=\widetilde{S}_{\alpha _K}(F_1)\\&\le \widetilde{\varPsi }_{\alpha _K}(F_1) +\widetilde{E}_{\alpha _K}(F_1)\log \widetilde{E}_{\alpha _K}(F_1) \end{aligned}$$

so that

$$\begin{aligned} \widetilde{S}_{\underline{\alpha }_{K}}(F) \le \sum _{n_K=0}^\infty \widetilde{\varPsi }_{{\underline{\alpha }_{K-1}}}(F(\cdot ,n_K)) \pi _{\alpha _K}(n_K)+ \widetilde{\varPsi }_{\alpha _K}(F_1) +\widetilde{E}_{\alpha _K}(F_1)\log \widetilde{E}_{\alpha _K}(F_1)\, . \end{aligned}$$
(83)

Observing that \(\widetilde{E}_{\alpha _K}(F_1) =\widetilde{E}_{\underline{\alpha }_{K}}(F)\) and that, by convexity,

$$\begin{aligned} \widetilde{\varPsi }_{\alpha _K}(F_1)&=\alpha _K\sum _{n=0}^\infty (F_1(n+1)-F_1(n))(\log F_1(n+1)-\log F_1(n))\pi _{\alpha _K}(n)\\&\le \alpha _K\sum _{\underline{n}\in {\mathbb {N}}^K} \left( F(\underline{n}^K)-F(\underline{n})\right) \left( \log F(\underline{n}^K)-\log F(\underline{n})\right) \prod _{k=1}^K \pi _{\alpha _k}(n_k) \end{aligned}$$

we get (82) for K. Thus, by induction, to prove (82) for every K we just need to prove it for \(K=1\). This is the content of the following Lemma.

Lemma 24

Let \(\pi _\alpha \) be the Poisson distribution on \({\mathbb {N}}\) with expected value \(\alpha >0\) and \(f:{\mathbb {N}}\rightarrow {\mathbb {R}}^+\) be such that

$$\begin{aligned} \sum _{n=0}^\infty f(n)\log f(n)\pi _\alpha (n)<\infty \,, \end{aligned}$$

then we have

$$\begin{aligned} \sum _{n=0}^\infty f(n)\log f(n) \pi _\alpha (n)&\le \left( \sum _{n=0}^\infty f(n) \pi _\alpha (n)\right) \log \left( \sum _{n=0}^\infty f(n) \pi _\alpha (n)\right) \nonumber \\&\quad +\alpha \sum _{n=0}^\infty \left( f(n+1)-f(n)\right) \left( \log f(n+1)-\log f(n)\right) \pi _\alpha (n)\, . \end{aligned}$$
(84)

Proof

Observe first that since \(\alpha \pi _\alpha (n)=(n+1)\pi _\alpha (n+1)\) we get

$$\begin{aligned}&\alpha \sum _{n=0}^\infty \left( f(n+1)-f(n)\right) \left( \log f(n+1) -\log f(n)\right) \pi _\alpha (n)\\&\quad =\sum _{n=1}^\infty n\left( f(n)-f(n-1)\right) \left( \log f(n) -\log f(n-1)\right) \pi _\alpha (n)\, . \end{aligned}$$

Let now \(\pi _{\alpha ,N}(n)\) be the binomial distribution with parameters N and \(\alpha /N\), that is

$$\begin{aligned} \pi _{\alpha ,N}(n)=\left( {\begin{array}{c}N\\ n\end{array}}\right) \left( \frac{\alpha }{N}\right) ^n \left( 1-\frac{\alpha }{N}\right) ^{N-n}\, . \end{aligned}$$

We will prove by induction that for every N and every \(\alpha \le N\) we have

$$\begin{aligned} \sum _{n=0}^N f(n)\log f(n) \pi _{\alpha ,N}(n)&\le \left( \sum _{n=0}^N f(n) \pi _{\alpha ,N}(n)\right) \log \left( \sum _{n=0}^N f(n) \pi _{\alpha ,N}(n)\right) \nonumber \\&\quad +\sum _{n=1}^N n\left( f(n)-f(n-1)\right) \left( \log f(n)-\log f(n-1)\right) \pi _{\alpha ,N}(n)\, \end{aligned}$$
(85)

so that, taking the limit for \(N\rightarrow \infty \), we will obtain (84). The base case \(N=1\) is covered by the following Lemma. \(\square \)

Lemma 25

Let \(\mu _x\ge 0\), \(x\in \{0,1\}\), be such that \(\mu _0+\mu _1=1\) then for every function \(f:\{0,1\}\rightarrow {\mathbb {R}}^+\) we have

$$\begin{aligned} \sum _{x=0,1} f(x)\log f(x) \mu _x&\le \left( \sum _{x=0,1} f(x) \mu _x\right) \log \left( \sum _ {x=0,1}f(x)\mu _x\right) \nonumber \\&\quad +\mu _0\mu _1\left( f(1)-f(0)\right) \left( \log f(1)-\log f(0)\right) \, . \end{aligned}$$
(86)

Proof

Calling \(h(0)=f(0)/(\mu _0f(0)+\mu _1 f(1))\) and \(h(1)=f(1)/(\mu _0f(0)+\mu _1 f(1))\), (86) becomes

$$\begin{aligned} \sum _{x=0,1} h(x)\log h(x) \mu _x\le \mu _0\mu _1\left( h(1)-h(0)\right) \left( \log h(1)-\log h(0)\right) \, . \end{aligned}$$

Since \(\mu _0 h(0)+\mu _1 h(1)=1\) we can write \(h(0)=1+\delta \mu _1\) and \(h(1)=1-\delta \mu _0\) for some \(\delta \in {\mathbb {R}}\), and we get

$$\begin{aligned}&\sum _{x=0,1} h(x)\log h(x) \mu _x\\&\quad =\mu _0\mu _1\delta (\log (1+\delta \mu _1) -\log (1-\delta \mu _0))+\mu _0\log (1+\delta \mu _1) +\mu _1\log (1-\delta \mu _0)\\&\quad \le \mu _0\mu _1\delta (\log (1+\delta \mu _1) -\log (1-\delta \mu _0))\\&\quad =\mu _0\mu _1\left( h(1)-h(0)\right) \left( \log h(1)-\log h(0)\right) \end{aligned}$$

where we have used concavity of the logarithm. \(\square \)
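Since Lemma 25 is the elementary building block of the whole argument, it is easy to sanity-check it numerically; the following sketch, illustrative only and not part of the proof, tests (86) on random weights and random positive values of f.

```python
import math
import random

# Numerical check of the two-point inequality (86): the entropy of f
# with respect to (mu_0, mu_1) is dominated by the mean term plus the
# Dirichlet-form term mu_0 mu_1 (f(1)-f(0))(log f(1)-log f(0)).
random.seed(0)
worst = -1.0
for _ in range(10000):
    mu1 = random.uniform(0.01, 0.99)
    mu0 = 1.0 - mu1
    f0 = random.uniform(0.1, 10.0)
    f1 = random.uniform(0.1, 10.0)
    mean = mu0 * f0 + mu1 * f1
    lhs = mu0 * f0 * math.log(f0) + mu1 * f1 * math.log(f1)
    rhs = mean * math.log(mean) \
        + mu0 * mu1 * (f1 - f0) * (math.log(f1) - math.log(f0))
    worst = max(worst, lhs - rhs)   # should never be positive
```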

Assume now that (85) holds for every index less than N. Given \(\alpha \le N\) call \(\beta =(N-1)\alpha /N\) so that \(\beta \le N-1\). Define also \(\mu _0=1-\alpha /N\), \(\mu _1=\alpha /N\), and observe that, for every \(J:{\mathbb {N}}\rightarrow {\mathbb {R}}\),

$$\begin{aligned} \sum _{n=0}^N J(n)\pi _{\alpha ,N}(n)=\sum _{x=0,1}\sum _{n=0}^{N-1} J(n+x)\pi _{\beta ,N-1}(n)\mu _x\, . \end{aligned}$$
(87)
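Identity (87) is Pascal's rule in disguise: a Binomial\((N,\alpha /N)\) variable is the sum of an independent Binomial\((N-1,\beta /(N-1))\) and a Bernoulli\((\alpha /N)\), since \(\beta =(N-1)\alpha /N\) gives both the same success probability \(\alpha /N\). An exact numerical check, with arbitrary illustrative parameters:

```python
import math

# Verify (87): sum_n J(n) pi_{alpha,N}(n)
#            = sum_x sum_n J(n+x) pi_{beta,N-1}(n) mu_x
# for an arbitrary test function J, with mu_1 = alpha/N = beta/(N-1).
def binom_pmf(n, N, p):
    return math.comb(N, n) * p**n * (1.0 - p)**(N - n)

N, alpha = 12, 3.5
p = alpha / N                      # = beta/(N-1) by the choice of beta
J = lambda n: math.sin(n) + n * n  # arbitrary test function
lhs = sum(J(n) * binom_pmf(n, N, p) for n in range(N + 1))
rhs = sum(J(n + x) * binom_pmf(n, N - 1, p) * (p if x else 1.0 - p)
          for x in (0, 1) for n in range(N))
```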

Calling

$$\begin{aligned} \bar{f}(x)=\sum _{n=0}^{N-1} f(n+x)\pi _{\beta ,N-1}(n) \end{aligned}$$

and using (87) and the inductive hypothesis for index \(N-1\), we get

$$\begin{aligned} \sum _{n=0}^N f(n)\log f(n) \pi _{\alpha ,N}(n)&=\sum _{x=0,1}\sum _{n=0}^{N-1} f(n+x)\log f(n+x) \pi _{\beta ,N-1}(n)\mu _x\\&\le \sum _{x=0,1}\bar{f}(x)\log \bar{f}(x)\mu _x \\&\quad +\sum _{x=0,1}\sum _{n=1}^{N-1} n\left( f(n+x)-f(n-1+x)\right) \\&\quad \cdot \left( \log f(n+x)-\log f(n-1+x)\right) \pi _{\beta ,N-1}(n)\mu _x \end{aligned}$$

while using Lemma 25 for the first term in the second line delivers

$$\begin{aligned} \sum _{n=0}^N f(n)\log f(n) \pi _{\alpha ,N}(n)&\le \left( \sum _{n=0}^N f(n)\pi _{\alpha ,N}(n)\right) \log \left( \sum _{n=0}^N f(n)\pi _{\alpha ,N}(n)\right) \nonumber \\&\quad +\mu _0\mu _1(\bar{f}(1)-\bar{f}(0)) (\log \bar{f}(1)-\log \bar{f}(0))\nonumber \\&\quad +\sum _{x=0,1}\sum _{n=1}^{N-1} n \left( f(n+x)-f(n-1+x)\right) \nonumber \\&\quad \cdot \left( \log f(n+x)-\log f(n-1+x)\right) \pi _{\beta ,N-1}(n)\mu _x \end{aligned}$$
(88)

Finally, using the joint convexity in (x, y) of the function \((x-y)(\log x - \log y)\) and the fact that \(\mu _0<1\), we can write

$$\begin{aligned}&\mu _0\mu _1(\bar{f}(1)-\bar{f}(0))(\log \bar{f}(1)-\log \bar{f}(0))\\&\quad \le \mu _1\sum _{n=0}^{N-1} (f(n+1)-f(n))(\log f(n+1)-\log f(n))\pi _{\beta ,N-1}(n) \end{aligned}$$

that inserted in (88) gives

$$\begin{aligned} \sum _{n=0}^N f(n)\log f(n) \pi _{\alpha ,N}(n)&\le \left( \sum _{n=0}^N f(n)\pi _{\alpha ,N}(n)\right) \log \left( \sum _{n=0}^N f(n)\pi _{\alpha ,N}(n)\right) \\&\quad +\sum _{x=0,1}\sum _{n=1-x}^{N-1} (n+x)\left( f(n+x)-f(n-1+x)\right) \\&\quad \cdot \left( \log f(n+x)-\log f(n-1+x)\right) \pi _{\beta ,N-1}(n)\mu _x\, . \end{aligned}$$

Changing summation variables from \((x,n)\) to \((x,n+x)\) and using (87) we obtain (85) for index N. Thus (85) is valid for every \(N\ge 1\) and every \(\alpha \le N\).

To complete the proof of Lemma 24 we need to show that we can take the limit for \(N\rightarrow \infty \) in (85). To this end observe that given \(\alpha \), for N large enough we have \(0<\left( 1-\frac{\alpha }{N}\right) ^{N}\le 2e^{-\alpha }\). Thus for large N and \(\alpha <n\le N\) we get

$$\begin{aligned} \pi _{\alpha ,N}(n)&\le 2e^{-\alpha }\frac{(\alpha )^n}{n!} \left( 1-\frac{\alpha }{N}\right) ^{-n} \prod _{i=1}^n\left( 1-\frac{i}{N} \right) \nonumber \\&\le 2e^{-\alpha }\frac{(\alpha )^n}{n!} \left( 1-\frac{\alpha }{N}\right) ^{-\lfloor \alpha \rfloor } \prod _{i=1}^{\lfloor \alpha \rfloor } \left( 1-\frac{i}{N} \right) \le 4\pi _\alpha (n)\, . \end{aligned}$$
(89)

Using Dominated Convergence, (89) implies that, if f(n) is bounded below and \(\sum _{n=0}^\infty f(n)\pi _{\alpha }(n)< \infty \) then

$$\begin{aligned} \lim _{N\rightarrow \infty } \sum _{n=0}^N f(n)\pi _{\alpha ,N}(n) =\sum _{n=0}^\infty f(n)\pi _{\alpha }(n)\, . \end{aligned}$$

We can now let \(N\rightarrow \infty \) in (85) to obtain (84). This concludes the proof of Lemma 24. \(\square \)
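Inequality (84) itself can be tested numerically. The sketch below is illustrative only: it takes a random f on \(\{0,\ldots ,9\}\), frozen beyond \(n=9\) so that all sums are effectively finite, and truncates the Poisson sums at a point where the tail is negligible.

```python
import math
import random

# Numerical sanity check of (84) for the Poisson measure pi_alpha:
# S <= E log E + Psi, with Psi the alpha-weighted Dirichlet form.
random.seed(1)
alpha, M = 2.0, 60
pois = [math.exp(-alpha) * alpha**n / math.factorial(n) for n in range(M)]
vals = [random.uniform(0.2, 5.0) for _ in range(10)]
f = lambda n: vals[min(n, 9)]      # constant beyond n = 9

S = sum(f(n) * math.log(f(n)) * pois[n] for n in range(M))
E = sum(f(n) * pois[n] for n in range(M))
Psi = alpha * sum((f(n + 1) - f(n)) * (math.log(f(n + 1)) - math.log(f(n)))
                  * pois[n] for n in range(M - 1))
```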

To sum up, the validity of (84) together with the inductive argument in (83) shows that (82) is valid for every K and \(\alpha _k\), \(k=1,\ldots , K\). This in turn, together with (79), (80) and (81), establishes the validity of (78) for every state \(\mathbf {h}\) and every partition \({\mathcal {B}}\) of \({\mathbb {R}}\). This, together with Lemma 24 completes the proof of Lemma 20. \(\square \)

Observe now that if \(\mathbf {h}\) represents a probability distribution then \(E(\mathbf {h})=1\), so that Lemma 20, together with Lemma 19, gives

$$\begin{aligned} \frac{d_+}{dt} S(\mathbf {h}(t))\le -\rho S(\mathbf {h}(t)) \end{aligned}$$
(90)

To complete the proof of Theorem 4 we have to show that (90) implies (15). To this end, take \(\rho '<\rho \), assume by contradiction that there exists t such that \(S(\mathbf {h}(t))> e^{-\rho ' t} S(\mathbf {h}(0))\) and let

$$\begin{aligned} T=\inf \{t\ge 0\,|\, S(\mathbf {h}(t))> e^{-\rho ' t} S(\mathbf {h}(0))\}\, . \end{aligned}$$

By continuity we get \(S(\mathbf {h}(T))= e^{-\rho ' T} S(\mathbf {h}(0))\). From (90), for every \(\epsilon >0\) we can find \(\delta >0\) such that

$$\begin{aligned} S(\mathbf {h}(T+h))\le (1-\rho h)e^{-T\rho '}S(\mathbf {h}(0))+h\epsilon \end{aligned}$$

for every \(0<h\le \delta \). Choosing \(\epsilon =(\rho -\rho ')e^{-T\rho '}S(\mathbf {h}(0))\) we get

$$\begin{aligned} S(\mathbf {h}(T+h))\le e^{-(T+h)\rho '}S(\mathbf {h}(0)) \end{aligned}$$

for every \(0<h\le \delta \), contradicting the definition of T. Thus \(S(\mathbf {h}(t))\le e^{-t\rho '}S(\mathbf {h}(0))\) for every \(t\ge 0\) and every \(\rho '<\rho \), which gives (15). \(\square \)
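The closing step is a one-sided Grönwall-type argument. A minimal numerical sketch, not part of the proof: integrating the extremal case \(\dot{S}=-\rho S\) by forward Euler, the trajectory stays below \(e^{-\rho ' t}S(0)\) for any \(\rho '<\rho \) (parameter values are arbitrary).

```python
import math

# Extremal trajectory of dS/dt <= -rho * S, integrated by forward
# Euler; it remains below e^{-rho' t} S(0) for rho' < rho.
rho, rho_p, dt = 1.0, 0.9, 1e-3
S, t, ok = 1.0, 0.0, True
for _ in range(20000):          # integrate up to t = 20
    S += dt * (-rho * S)
    t += dt
    ok = ok and S <= math.exp(-rho_p * t)
```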

3.5 Derivation of (17)

To prove (17), we observe that \(\eta (t)\) and \(g(v,t)\) in (18) satisfy the equations

$$\begin{aligned} \dot{\eta }(t)&=\mu -\rho \eta (t)\\ \dot{g}(v,t)&=\frac{\mu }{\eta (t)}(\gamma (v)-g(v,t))\, . \end{aligned}$$

Setting \(\mathbf {f}(t)=(f_0(t),f_1(t),f_2(t),\ldots )\) with

$$\begin{aligned} f_N(\underline{v}_N,t)=e^{-\eta (t)}\frac{\eta (t)^N}{N!}\prod _{i=1}^Ng(v_i,t) \end{aligned}$$

we get

$$\begin{aligned} \frac{d}{dt}f_N(\underline{v}_N,t)&=(\mu -\rho \eta (t)) e^{-\eta (t)}\frac{\eta (t)^{N-1}}{(N-1)!} \left( 1-\frac{\eta (t)}{N}\right) \prod _{i=1}^Ng(v_i,t)\\&\quad +\mu e^{-\eta (t)}\frac{\eta (t)^{N-1}}{N!} \sum _i\left( (\gamma (v_i)-g(v_i,t))\prod _{j\not =i}g(v_j,t)\right) \\&=\rho e^{-\eta (t)}\frac{\eta (t)^{N+1}}{N!}\prod _{i=1}^Ng(v_i,t) -\rho e^{-\eta (t)}\frac{\eta (t)^N}{(N-1)!}\prod _{i=1}^Ng(v_i,t)\\&\quad +\mu e^{-\eta (t)}\frac{\eta (t)^{N-1}}{N!}\sum _i \gamma (v_i)\prod _{j\not =i}g(v_j,t)- \mu e^{-\eta (t)}\frac{\eta (t)^N}{N!}\prod _{i=1}^Ng(v_i,t)\\&=\rho (({\mathcal {O}}\mathbf {f}(t))_{N}(\underline{v})-Nf_N(\underline{v}_N,t))+\mu (({\mathcal {I}}\mathbf {f}(t))_{N}(\underline{v})-f_N(\underline{v}_N,t))\,. \end{aligned}$$

Thus \(\mathbf {f}(t)\) solves (1) with \(\tilde{\lambda }=0\). Clearly \(\mathbf {f}(t)\in D^1\) for every \(t\ge 0\) so that, by Remark 7, \(\mathbf {f}(t)=e^{t{\mathcal {T}}}\mathbf {f}(0)\).
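The ODE for \(\eta (t)\) has the explicit solution \(\eta (t)=\mu /\rho +(\eta (0)-\mu /\rho )e^{-\rho t}\), and for any value of \(\eta \) the Poisson weights \(e^{-\eta }\eta ^N/N!\) entering \(f_N\) sum to one. A quick numerical sketch, with arbitrary illustrative parameters:

```python
import math

# Forward-Euler check that the explicit formula solves
# eta' = mu - rho * eta, and that the Poisson weights e^{-eta} eta^N / N!
# (the N-dependence of f_N(t)) stay normalized for any eta.
mu, rho, eta0 = 2.0, 0.5, 3.0
dt, T = 1e-4, 4.0
eta = eta0
for _ in range(int(T / dt)):
    eta += dt * (mu - rho * eta)
exact = mu / rho + (eta0 - mu / rho) * math.exp(-rho * T)
norm = sum(math.exp(-eta) * eta**N / math.factorial(N) for N in range(100))
```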

3.6 Proof of Theorem 5

Given a continuous and bounded test function \(\phi _k:{\mathbb {R}}^k\rightarrow {\mathbb {R}}\), symmetric with respect to the permutation of its variables, we define

$$\begin{aligned} (\mathbf {f}_n,\phi _k)_{k,n}=\left( \frac{\rho }{\mu _n}\right) ^{k} \sum _{N\ge k}\frac{N!}{(N - k )!}\int _{{\mathbb {R}}^N} f_{n,N}(\underline{v}_N)\phi _k(\underline{v}_k)d\underline{v}_N\, . \end{aligned}$$

What we need to show is that, if \(\mathbf {f}_n\) forms a chaotic sequence and \(\phi :{\mathbb {R}}\rightarrow {\mathbb {R}}\) is a test function then

$$\begin{aligned} \lim _{n\rightarrow \infty }(e^{{\mathcal {L}}_n t}\mathbf {f}_n,\phi ^{\otimes k})_{k,n} =\left( \lim _{n\rightarrow \infty }(e^{ {\mathcal {L}}_n t}\mathbf {f}_n,\phi )_{1,n}\right) ^k \end{aligned}$$

which implies propagation of chaos.

The argument to prove propagation of chaos introduced in [16] is based on the power series expansion of \(e^{\lambda K_N t}\), which converges since \(K_N\) is a bounded operator. After this, one can exploit a cancellation between \(Q_N\) and \(\left( {\begin{array}{c}N\\ 2\end{array}}\right) \mathrm {Id}\), see (5), when they act on a function \(\phi _k\) depending only on \(k<N\) variables, see Sect. 3 of [16]. In the present case the analogue of such an argument formally works, but it cannot be applied directly: since \({\mathcal {K}}\) is unbounded, the power series expansion of \(e^{\tilde{\lambda }_n {\mathcal {K}}t}\) does not converge. To avoid this problem, one may try to use the convergent expansion (27) introduced in Sect. 3.1, but the different treatment of \(Q_N\) and \(\left( {\begin{array}{c}N\\ 2\end{array}}\right) \mathrm {Id}\) in (27) would make it very hard to see the needed cancellation.

Thus we will introduce a partial expansion of \(e^{\tilde{\lambda }_n {\mathcal {K}}t}\) and combine it with (33) and (40). The idea is to expand this exponential in the least possible way to exploit the central cancellations of McKean’s argument. We first decompose \(K_N\) as

$$\begin{aligned} K_N=K_k+\widetilde{K}_{N-k}+(N-k)G_k \end{aligned}$$

with

$$\begin{aligned} \widetilde{K}_{N-k}&=\sum _{k+1\le i<j\le N}(R_{i,j}-\mathrm{Id})\\ G_k&=\frac{1}{N-k}\sum _{i=1}^k\sum _{j=k+1}^N (R_{i,j}-\mathrm{Id}) \end{aligned}$$

and obtain

$$\begin{aligned} e^{\tilde{\lambda }_n K_N t}\phi _k=e^{\tilde{\lambda }_n K_k t}\phi _k+(N-k)\tilde{\lambda }_n\int _0^t e^{\tilde{\lambda }_n K_N(t-s)}G_ke^{\tilde{\lambda }_n K_k s}\phi _k ds \end{aligned}$$
(91)

where we used that \(K_N\) is a bounded operator on \(C^0({\mathbb {R}}^N)\) and that \(\widetilde{K}_{N-k}\phi _k=0\). Since we are interested in integrating (91) against a symmetric function \(f_N\) we can write

$$\begin{aligned} G_k[\phi _k](\underline{v}_{k+1})= \sum _{i=1}^k\int \frac{d\theta }{2\pi } [ \phi ( v_1 ,\ldots ,v_{i-1},v_i\cos \theta +v_{k+1}\sin \theta ,v_{i+1},\ldots ,v_k)-\phi (\underline{v}_k)]\, . \end{aligned}$$
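The operators \(R_{i,j}\) average over rotations of the pair \((v_i,v_j)\), which is why \(\Vert R_{i,j}\Vert _\infty \le 1\). A small illustrative check, not from the paper: for \(\phi (v_1,v_2)=v_1^2\) the rotation average is \((v_1^2+v_2^2)/2\).

```python
import math

# Uniform average over the rotation angle of (v1, v2); for
# phi(v1, v2) = v1^2 this returns (v1^2 + v2^2)/2, and in general
# |R phi| <= sup |phi| since R is an average.
def R(phi, v1, v2, m=4096):
    tot = 0.0
    for j in range(m):
        t = 2.0 * math.pi * j / m
        tot += phi(v1 * math.cos(t) + v2 * math.sin(t),
                   -v1 * math.sin(t) + v2 * math.cos(t))
    return tot / m

v1, v2 = 1.3, -0.7
avg = R(lambda a, b: a * a, v1, v2)
```

The equispaced angular average is exact here because the integrand is a trigonometric polynomial of low degree.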

To iterate we need to apply (91) to the factor \(e^{\tilde{\lambda }_n K_N(t-s)}\) inside the integral in (91) itself. Since \(G_ke^{\tilde{\lambda }_n K_k s}\phi _k\) is a function of \(k+1\) variables we now have to write

$$\begin{aligned} K_N=K_{k+1}+\widetilde{K}_{N-k-1}+(N-k-1)G_{k+1}\, . \end{aligned}$$

Iterating this procedure we get

$$\begin{aligned} e^{\tilde{\lambda }_n K_N t}\phi _k&=e^{\tilde{\lambda }_n K_{k} t}\phi _k\\&\quad +\sum _{p=1}^{N-k}\frac{\tilde{\lambda }_n^p (N-k)!}{(N-k-p)!}\int _{0<t_1< \cdots< t_p < t } e^{\tilde{\lambda }_n K_{k+p} (t-t_{p})}G_{k+p-1} e^{\tilde{\lambda }_n K_{k+p-1} (t_p-t_{p-1})}\\&\qquad \cdots e^{(t_2-t_1)\tilde{\lambda }_n K_{k+1}}G_ke^{\tilde{\lambda }_n K_k t_1}\phi _k\,dt_p\cdots dt_1 \end{aligned}$$

so that

$$\begin{aligned} (e^{\tilde{\lambda }_n{\mathcal {K}}t}\mathbf {f}_n,\phi _k)_{k,n}&=(\mathbf {f}_n,e^{\tilde{\lambda }_n K_{k}t}\phi _k)_{k,n}\nonumber \\&\quad +\sum _{p=1}^\infty \lambda ^p\int _{0<t_1<\cdots< t_p < t } \Bigl (\mathbf {f}_n,e^{ \tilde{\lambda }_n K_{k+p} (t-t_{p})}G_{k+p-1}e^{\tilde{\lambda }_n K_{k+p-1} (t_p-t_{p-1})}\nonumber \\&\qquad \cdots e^{(t_2-t_1)\tilde{\lambda }_n K_{k+1}}G_ke^{\tilde{\lambda }_n K_k t_1}\phi _k\bigr )_{k+p,n} \,dt_p\cdots dt_1 \end{aligned}$$
(92)

where the factor \(\lambda ^p\) in the second line of (92) comes from (25) and (19).

Observe now that the \(R_{i,j}\) are averaging operators so that \(\Vert R_{i,j}\Vert _\infty \le 1\) which gives

$$\begin{aligned} \bigl \Vert e^{t\tilde{\lambda }_n K_{N}}\bigr \Vert _\infty =e^{-t\tilde{\lambda }_n \left( {\begin{array}{c}N\\ 2\end{array}}\right) }\bigl \Vert e^{t\tilde{\lambda }_n \sum _{1\le i<j\le N}R_{i,j}}\bigr \Vert _\infty \le 1\, . \end{aligned}$$

For the same reason we have

$$\begin{aligned} \Vert G_k\Vert _\infty \le \frac{1}{N-k}\sum _{i=1}^k\sum _{j=k+1}^N (\Vert R_{i,j}\Vert _\infty +1)\le 2k\, . \end{aligned}$$

Using (21) we get

$$\begin{aligned} \left| (e^{\tilde{\lambda }_n{\mathcal {K}}t}\mathbf {f}_n,\phi _k)_{k,n}\right|&\le \left( \frac{\rho }{\mu _n}\right) ^k\Vert \mathbf {f}_n\Vert _1^{(k)} \Vert \phi _k\Vert _{\infty }\\&\quad +\sum _{p=1}^\infty \frac{\lambda ^pt^p}{p!} \prod _{i=k}^{k+p-1}\Vert G_i\Vert _\infty \left( \frac{\rho }{\mu _n}\right) ^{k+p} \Vert \mathbf {f}_n\Vert _1^{(k+p)}\Vert \phi _k\Vert _{\infty }\\&\le \Vert \phi _k\Vert _{\infty }K^k\sum _{p=0}^\infty 2^p \lambda ^pt^pK^p\left( {\begin{array}{c}k+p-1\\ p\end{array}}\right) \,. \end{aligned}$$

Observe that the series in the last line converges for \(\lambda K t<1/2\). On the other hand, since \(\lim _{n\rightarrow \infty }\tilde{\lambda }_n=0\), for every t we have

$$\begin{aligned} \lim _{n\rightarrow \infty }(\mathbf {f}_n,e^{\tilde{\lambda }_n K_{k} t}\phi _k)_{k,n}=\lim _{n\rightarrow \infty }(\mathbf {f}_n,\phi _k)_{k,n} \end{aligned}$$

and similarly, calling \(G_k^{*p}=G_{k+p-1}\cdots G_{k+1}G_k\),

$$\begin{aligned}&\lim _{n\rightarrow \infty } \int _{0<t_1<\cdots< t_p < t } \Bigl (\mathbf {f}_n,e^{ \tilde{\lambda }_n K_{k+p}(t-t_{p})} G_{k+p-1}e^{\tilde{\lambda }_n K_{k+p-1} (t_p-t_{p-1})}\\&\quad \cdots e^{(t_2-t_1)\tilde{\lambda }_n K_{k+1}}G_ke^{\tilde{\lambda }_n K_k t_1}\phi _k\bigr )_{k+p,n} \,dt_p\cdots dt_1 = \lim _{n\rightarrow \infty }\frac{t^p}{p!}\left( \mathbf {f}_n, G_{k}^{*p}\phi _k\right) _{k+p,n} \end{aligned}$$

so that we finally get

$$\begin{aligned} \lim _{n\rightarrow \infty }(e^{\tilde{\lambda }_n{\mathcal {K}}t} \mathbf {f}_n,\phi _k)_{k,n}=\lim _{n\rightarrow \infty }\sum _{p=0}^\infty \frac{\lambda ^pt^p}{p!}\left( \mathbf {f}_n, G_{k}^{*p}\phi _k\right) _{k+p,n}\, . \end{aligned}$$
(93)

Observe now that \(G_k\) acts as a derivation in the sense of [16], that is, for every \(\phi _{k_1}\) and \(\psi _{k_2}\) with \(k_1+k_2=k\), we have

$$\begin{aligned} G_{k}(\phi _{k_1}\otimes \psi _{k_2})=(G_{k_1}\phi _{k_1}) \otimes \psi _{k_2}+\phi _{k_1}\otimes (G_{k_2}\psi _{k_2})\, . \end{aligned}$$

This implies that

$$\begin{aligned} \frac{1}{p!}G_k^{*p}(\phi _{k_1}\otimes \psi _{k_2}) = \sum _{p_1+p_2=p}\frac{1}{p_1!}\frac{1}{p_2!} (G^{*p_1}_{k_1}\phi _{k_1})\otimes (G^{*p_2}_{k_2}\psi _{k_2})\, . \end{aligned}$$
(94)
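Formula (94) is the Leibniz/binomial formula that any derivation satisfies. The same algebraic identity can be illustrated, outside the setting of the paper, with the model derivation \(D=d/dx\) acting on polynomials represented as coefficient lists:

```python
import math

# For a derivation D: D^p(fg)/p! = sum_{p1+p2=p} (D^{p1}f/p1!)(D^{p2}g/p2!).
# Model case D = d/dx on polynomials (coefficients, lowest degree first).
def deriv(c):
    return [i * c[i] for i in range(1, len(c))] or [0.0]

def mult(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def padd(a, b):
    n = max(len(a), len(b))
    a, b = a + [0.0] * (n - len(a)), b + [0.0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def dpow(c, p):
    for _ in range(p):
        c = deriv(c)
    return c

def scale(c, s):
    return [s * x for x in c]

f = [1.0, 2.0, 0.0, 1.0]        # 1 + 2x + x^3
g = [0.0, 1.0, 3.0]             # x + 3x^2
p = 3
lhs = scale(dpow(mult(f, g), p), 1.0 / math.factorial(p))
rhs = [0.0]
for p1 in range(p + 1):
    rhs = padd(rhs, mult(scale(dpow(f, p1), 1.0 / math.factorial(p1)),
                         scale(dpow(g, p - p1), 1.0 / math.factorial(p - p1))))
```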

Observing that if \(\mathbf {f}_n\) forms a chaotic sequence then

$$\begin{aligned} \lim _{n\rightarrow \infty } (\mathbf {f}_n,\phi _{k_1}\otimes \psi _{k_2})_{k,n} =\lim _{n\rightarrow \infty }(\mathbf {f}_n,\phi _{k_1})_{k_1,n}\lim _{n\rightarrow \infty } (\mathbf {f}_n,\psi _{k_2})_{k_2,n} \end{aligned}$$
(95)

we get

$$\begin{aligned}&\lim _{n\rightarrow \infty }\sum _{p=0}^\infty \frac{\lambda ^pt^p}{p!} (\mathbf {f}_n,G_{k}^{*p}\phi _{k_1}\otimes \psi _{k_2})_{k+p,n}\\&\quad =\lim _{n\rightarrow \infty }\sum _{p_1=0}^\infty \frac{\lambda ^{p_1}t^{p_1}}{p_1!}\left( \mathbf {f}_n, G_{k_1}^{*p_1}\phi _{k_1}\right) _{k_1+p_1,n} \lim _{n\rightarrow \infty }\sum _{p_2=0}^\infty \frac{\lambda ^{p_2}t^{p_2}}{p_2!}\left( \mathbf {f}_n, G_{k_2}^{*p_2}\psi _{k_2}\right) _{k_2+p_2,n}\nonumber \end{aligned}$$
(96)

which implies that \(e^{\tilde{\lambda }_n {\mathcal {K}}t}\) propagates chaos, at least for \(t\le t_0=\frac{1}{2\lambda K}\). Finally we need to verify that (21) still holds. Since the \(\mathbf {f}_n\) are positive, \(\Vert \mathbf {f}_n\Vert _1^{(r)}=N_r(\mathbf {f}_n)\), see (43). Thus Corollary 9 implies that for every \(t\ge 0\) we have \(\Vert \mathbf {f}_n(t)\Vert _1^{(r)}\le K_1^r\left( \frac{\mu _n}{\rho }\right) ^r\) with \(K_1=\max \{K,1\}\). Thus \(\mathbf {f}_n(t_0)=e^{\tilde{\lambda }_n {\mathcal {K}}t_0}\mathbf {f}_n\) forms a chaotic sequence that satisfies (21) with \(K_1\) in place of K. Using \(\mathbf {f}_n(t_0)\) as initial condition we get that propagation of chaos holds up to time \(t_1=\frac{1}{2\lambda K}+\frac{1}{2\lambda K_1}\). Iterating this argument we see that \(e^{\tilde{\lambda }_n {\mathcal {K}}t}\) propagates chaos for every \(t\ge 0\).

To add the out operator \({\mathcal {O}}\), we observe that from (93) we get

$$\begin{aligned}&\lim _{n\rightarrow \infty }(e^{(\tilde{\lambda }_n{\mathcal {K}}-\rho {\mathcal {N}}) t} \mathbf {f}_n,\phi _k)_{k,n}\nonumber \\&\quad =\lim _{n\rightarrow \infty }\sum _{p=0}^\infty \frac{\lambda ^pt^p}{p!}\left( \frac{\rho }{\mu _n}\right) ^{k+p} \sum _{N\ge k+p} \frac{N!}{(N-k-p)!}e^{-\rho Nt}\nonumber \\&\qquad \times \quad \int f_{n,N}(\underline{v}_{N}) (G_{k}^{*p}\phi _k)(\underline{v}_{k+p}) d\underline{v}_{N}\, . \end{aligned}$$
(97)

Inserting (97) into (33), after some long algebra that we report in Appendix A.5, we obtain

$$\begin{aligned} \lim _{n\rightarrow \infty }\Bigl ( e^{(\tilde{\lambda }_n {\mathcal {K}}+\rho ({\mathcal {O}}-{\mathcal {N}}))t}&\mathbf {f}_n,\phi _k\Bigr )_{k,n} =\lim _{n\rightarrow \infty }\sum _{p=0}^\infty \frac{t^p\lambda ^p}{p!} \left( \mathbf {f}_n,e^{-\rho (k+p) t}G_k^{*p}\phi _k \right) _{k+p,n}\, . \end{aligned}$$
(98)

It is not hard to see that (98) implies that \(e^{(\tilde{\lambda }_n {\mathcal {K}}+\rho ({\mathcal {O}}-{\mathcal {N}}))t}\) propagates chaos.

Finally we consider the in operator \({\mathcal {I}}\). Observe that

$$\begin{aligned} \mu _n({\mathcal {I}}\mathbf {f}_n,\phi _k)_{k,n}&=\mu _n\left( \frac{\rho }{\mu _n}\right) ^{k} \sum _{N\ge k} \frac{(N-1)!}{(N-k)!}\int \sum _{i=1}^N f_{n,N-1}(\underline{v}_{N-1}^i)\gamma (v_i)\phi _k(\underline{v}_k)d\underline{v}_N\\&=\mu _n\left( \frac{\rho }{\mu _n}\right) ^{k}\sum _{N\ge k} \frac{(N-1)!}{(N-k)!}\int \bigl ((N-k) f_{n,N-1}(\underline{v}_{N-1})\gamma (v_N)\phi _k(\underline{v}_k)d\underline{v}_N\\&\quad +kf_{n,N-1}(\underline{v}_{N-1})\phi _k(\underline{v}_{k-1},v_N) \gamma (v_N)d\underline{v}_N\bigr )\\&=\mu _n\left( \frac{\rho }{\mu _n}\right) ^{k}\sum _{N> k} \frac{(N-1)!}{(N-1-k)!}\int f_{n,N-1}(\underline{v}_{N-1})\phi _k(\underline{v}_k)\gamma (v_N)d\underline{v}_{N}\\&\quad + k\rho \left( \frac{\rho }{\mu _n}\right) ^{k-1}\sum _{N\ge k-1} \frac{N!}{(N-(k-1))!}\\ {}&\quad \times \int f_{n,N}(\underline{v}_{N}) \phi _k(\underline{v}_{k-1},w)\gamma (w)d\underline{v}_Ndw \end{aligned}$$

so that

$$\begin{aligned} \mu _n ({\mathcal {I}}\mathbf {f}_n-\mathbf {f}_n,\phi _k)_{k,n}=(\mathbf {f}_n,I_k\phi _k)_{k-1,n} \end{aligned}$$
(99)

where

$$\begin{aligned} I_k[\phi _k](\underline{v}_{k-1}):=\rho k \int _{{\mathbb {R}}} \phi _k(\underline{v}_{k-1},w)e^{-\pi w^2}dw\, , \end{aligned}$$

which clearly acts as a derivative in the sense of [16]. We can now use an expansion similar to (34):

$$\begin{aligned} e^{t{{\mathcal {L}}}_n}\mathbf {f}_n&=e^{(\tilde{\lambda }_n {\mathcal {K}}+\rho ({\mathcal {O}}-{\mathcal {N}}))t}\mathbf {f}_n\nonumber \\&\quad +\sum _{q=1}^\infty \mu _n^q\int \limits _{0<t_1<\ldots<t_q<t} e^{(\tilde{\lambda }_n {\mathcal {K}}+\rho ({\mathcal {O}}-{\mathcal {N}}))(t-t_q)} ({\mathcal {I}}-\mathrm {Id}) e^{(\tilde{\lambda }_n {\mathcal {K}}+\rho ({\mathcal {O}}-{\mathcal {N}}))(t_q-t_{q-1})} \nonumber \\&\qquad \cdots ({\mathcal {I}}-\mathrm {Id}) e^{(\tilde{\lambda }_n {\mathcal {K}}+\rho ({\mathcal {O}}-{\mathcal {N}}))t_1}\mathbf {f}_n\,dt_1\cdots dt_q \end{aligned}$$
(100)

which, combined with (99) and (98), gives

$$\begin{aligned} \lim _{n\rightarrow \infty }(e^{{\mathcal {L}}_n t}\mathbf {f}_n,\phi _k)_{k,n}&=\lim _{n\rightarrow \infty }\sum _{q\ge 0}\sum _{p_0, p_1 , \ldots ,p_{q} \ge 0} \rho ^q\lambda ^{|p|}e^{-\rho k t}\, \nonumber \\&\quad \cdot \int _{0\le t_q\le \cdots \le t_1\le t} \prod _{i=0}^q e^{-\rho (t_{i}-t_{i+1})(|p|_i-i)} \frac{(t_{i}-t_{i+1})^{p_i}}{p_i!}\, dt_1 \cdots dt_q\cdot \nonumber \\&\qquad \left( \mathbf {f}_n,G_{k+|p|_q-q}^{*p_{q}}I_{k+|p|_q-q+1}\cdots G_{k+p_0-1}^{*p_1}I_{k+p_0}G_k^{*p_0}\phi _k\right) _{k+|p|-q,n} \end{aligned}$$
(101)

where \(|p|_{i}=\sum _{j=0}^{i-1} p_j\), \(|p|=|p|_{q+1}\), \(t_0=t\), and \(t_{q+1}=0\); the order of the \(t_i\) in the integral is inverted because taking adjoints reverses the order of the operators. From (101) it follows, after further lengthy algebra reported in Appendix A.5, that if \(k_1+k_2=k\), then

$$\begin{aligned} \lim _{n\rightarrow \infty }(e^{{\mathcal {L}}_n t}\mathbf {f}_n,\phi _{k_1} \otimes \psi _{k_2})_{k,n}= \lim _{n\rightarrow \infty }(e^{{\mathcal {L}}_n t} \mathbf {f}_n,\phi _{k_1})_{k_1,n}\lim _{n\rightarrow \infty }(e^{{\mathcal {L}}_n t} \mathbf {f}_n,\psi _{k_2})_{k_2,n} \end{aligned}$$
(102)

that is, \(e^{{\mathcal {L}}_n t}\) propagates chaos. The validity of the Boltzmann-Kac type equation (26) follows exactly as in [16]. \(\square \)
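As an aside, the structure of the Duhamel-type expansion (100), a perturbation series for \(e^{t(A+B)}\) built on the unperturbed semigroup \(e^{tA}\), can be checked numerically in a finite-dimensional toy setting. The sketch below uses random matrices as stand-ins (they are assumptions for illustration, not the operators of the paper) and iterates the Duhamel identity until it reproduces \(e^{t(A+B)}\).

```python
import numpy as np
from scipy.linalg import expm

# Toy check of the Duhamel identity underlying (100):
#   e^{t(A+B)} = e^{tA} + \int_0^t e^{(t-s)A} B e^{s(A+B)} ds.
# A and B are random 3x3 stand-ins, NOT the actual operators of the paper.
rng = np.random.default_rng(0)
A = 0.4 * rng.normal(size=(3, 3))
B = 0.4 * rng.normal(size=(3, 3))

t, n = 1.0, 800
s = np.linspace(0.0, t, n + 1)
ds = t / n
I3 = np.eye(3)

# Since e^{(t_k - s)A} = e^{t_k A} e^{-sA}, one Picard/Duhamel pass
#   S(t_k) = e^{t_k A} (I + \int_0^{t_k} e^{-sA} B S(s) ds)
# reduces to a single cumulative trapezoidal integral.
P = np.array([expm(sk * A) for sk in s])    # e^{s_k A}
Pm = np.array([expm(-sk * A) for sk in s])  # e^{-s_k A}

S = P.copy()                    # zeroth iterate: drop the perturbation B
for _ in range(20):             # each pass adds one order in B
    Q = Pm @ (B @ S)            # integrand e^{-sA} B S(s), batched over s
    incr = 0.5 * ds * (Q[1:] + Q[:-1])
    C = np.concatenate([np.zeros((1, 3, 3)), np.cumsum(incr, axis=0)])
    S = P @ (I3 + C)

err = np.abs(S[-1] - expm(t * (A + B))).max()
print(err)  # small: the iterated Duhamel series matches e^{t(A+B)}
```

The same bookkeeping, with \({\mathcal {I}}-\mathrm {Id}\) in the role of \(B\), is what produces the time-ordered simplex integrals in (100).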

4 Conclusions

The central aim of this work is to extend the analysis in [4], where a thermostat idealizes the interaction with a large reservoir of particles kept at constant temperature and chemical potential. While in [4] the reservoir and the system could not exchange particles, here the main interaction is precisely this continuous exchange of particles between the two.

The same work that we set out to extend also suggests directions in which the present analysis could itself be extended. In the case of the standard Kac model, approach to equilibrium in the sense of the GTW metric \(d_2\) was shown in [18], while for a Kac system interacting with one or more Maxwellian thermostats it was shown in [8]. In the present situation, though, it is not clear how to define an analogue of the GTW metric, since the components \(f_N\) of a state \(\mathbf {f}\) are not, in general, probability distributions on \({\mathbb {R}}^N\).

Furthermore, in [3] the authors show that, in a strong and uniform sense, the evolution of the Kac system with a Maxwellian thermostat can be thought of as an idealization of the interaction with a large heat reservoir, itself described as a Kac system. We think it is possible to replicate such an analysis in the present context and hope to come back to this issue in a forthcoming paper.

We based our proof of propagation of chaos on the work in [16]; as in [16], it is therefore neither quantitative nor uniform in time. Recently, a quantitative and uniform-in-time result was obtained for the Kac system with a Maxwellian thermostat [6]. It is unclear to us whether the methods of [6] extend to the present model.

Finally, the assumption that the rates \(\rho \) and \(\mu \) are independent of the number of particles is clearly unrealistic, since it allows an unbounded number of particles in the system. However, in the steady state (and in a chaotic state) the probability of finding a number of particles much larger than the average is extremely small, so we do not consider this a serious problem. In any case, it would be interesting to investigate what happens if one assumes a maximum number of particles allowed inside the system.
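The smallness of such tails can be illustrated quantitatively. Purely as an illustration (the Poisson law here is an assumption made for the sketch, not a result derived in this paper), if the particle number \(N\) were Poisson with mean \(\nu \), the classical Chernoff bound \(P(N\ge 2\nu )\le e^{-\nu (2\ln 2-1)}\) would already make the tail beyond twice the average exponentially small in \(\nu \):

```python
import math
from scipy.stats import poisson

# Illustration only: ASSUME the steady-state particle number is Poisson(nu).
# Compare the exact upper tail P(N >= 2*nu) with the Chernoff bound
# exp(-nu * (2*ln 2 - 1)), valid for Poisson tails above the mean.
nu = 50
tail = poisson.sf(2 * nu - 1, nu)              # P(N >= 100) for mean 50
bound = math.exp(-nu * (2 * math.log(2) - 1))  # ~ exp(-0.386 * nu)

print(tail, bound)  # the exact tail is tiny and sits below the bound
```

For \(\nu =50\) both numbers are already far below \(10^{-6}\), which is the sense in which a cap on the particle number should change very little.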