Abstract
We study a one-dimensional chain of N weakly anharmonic classical oscillators coupled at its ends to heat baths at different temperatures. Each oscillator is subject to a pinning potential and also interacts with its nearest neighbors. In our setup both potentials are homogeneous and are bounded perturbations (with N-dependent bounds) of the harmonic ones. We show how a generalised version of Bakry–Emery theory, inspired by Baudoin (J Funct Anal 273(7):2275-2291, 2017), can be adapted to this case of a hypoelliptic generator. In this way we prove exponential convergence to the non-equilibrium steady state in Wasserstein–Kantorovich distance and in relative entropy, with quantitative rates. We estimate the constants in the rate by solving a Lyapunov-type matrix equation and we obtain that the exponential rate, for the homogeneous chain, is at least of order \(N^{-3}\). For the purely harmonic chain the order of the rate is in \( [N^{-3},N^{-1}]\). This shows that, in this setup, the spectral gap decays at most polynomially with N.
1 Introduction
1.1 Description of the Model
We consider a model for heat conduction consisting of a one-dimensional chain of N coupled oscillators. The evolution is a Hamiltonian dynamics with Hamiltonian
where (p, q) belongs to the phase space \( {\mathbb {R}}^{2N}\) and \(q_0,q_{N+1}\) describe the boundaries, which here are considered to be fixed: \(q_0=q_{N+1}=0\). We denote by \(q=(q_1,\ldots ,q_N) \in {\mathbb {R}}^N\) the displacements of the atoms from their equilibrium positions and by \(p=(p_1,\ldots ,p_N) \in {\mathbb {R}}^N\) the momenta. Each particle has its own pinning potential \(U_{\text {pin}}\) and also interacts with its nearest neighbors through an interaction potential \(U_{\text {int}}\). Notice that here all the masses are equal and we set \(m_i=1\). So we consider a homogeneous chain, where both the masses and the potentials that act on each oscillator are the same. The classical Hamiltonian dynamics is perturbed by noise and friction in the following way: the two ends of the chain are in contact with Langevin heat baths at two different temperatures \(T_L\), \(T_R >0 \). So our dynamics is described by the following system of SDEs:
where \(\gamma _i\) are the friction constants, \(T_i\) are the two temperatures and \(W_1,W_N\) are two independent normalised Wiener processes.
The dynamics (1.1) is equivalently described by the following Liouville equation on the law of the process
where \({\mathcal {L}}\) is the second order differential operator
which is the generator of the semigroup \(P_t\) acting on the space \(C_b^2({\mathbb {R}}^{2N})\) of bounded real-valued, \(C^2\) functions on the phase space. We denote by \({\mathcal {L}}^*\) the generator of the dual semigroup that acts on probability measures.
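For concreteness, a standard way to write the objects introduced above is sketched here. The unit masses follow the text; the indexing of the boundary terms in the interaction sum and the placement of the friction constants are assumptions chosen to match the usual Langevin convention, so this should be read as an illustration of the Hamiltonian, of the system (1.1) and of the generator rather than as their authoritative form.

$$\begin{aligned} H(p,q)&= \sum _{i=1}^{N} \Big ( \frac{p_i^2}{2} + U_{\text {pin}}(q_i) \Big ) + \sum _{i=0}^{N} U_{\text {int}}(q_{i+1}-q_i), \\ \text {d}q_i&= p_i \,\text {d}t, \quad i=1,\ldots ,N, \qquad \text {d}p_i = -\partial _{q_i}H \,\text {d}t, \quad i=2,\ldots ,N-1, \\ \text {d}p_1&= \big (-\partial _{q_1}H - \gamma _1 p_1 \big )\,\text {d}t + \sqrt{2 \gamma _1 T_L}\, \text {d}W_1, \qquad \text {d}p_N = \big (-\partial _{q_N}H - \gamma _N p_N \big )\,\text {d}t + \sqrt{2 \gamma _N T_R}\, \text {d}W_N, \\ {\mathcal {L}}&= p \cdot \nabla _q - \nabla _q H \cdot \nabla _p - \gamma _1 p_1 \partial _{p_1} - \gamma _N p_N \partial _{p_N} + \gamma _1 T_L \partial _{p_1}^2 + \gamma _N T_R \partial _{p_N}^2 . \end{aligned}$$

With this normalisation the second-order part of \({\mathcal {L}}\) acts only on \(p_1\) and \(p_N\), which is the degeneracy of the generator discussed in the next subsection.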
1.1.1 State of the Art
The model described by the SDEs (1.1) was first used to describe heat diffusion and to derive Fourier’s law rigorously (for an overview see [8, 14, 27] and [17]). Since then, it has been the subject of many studies, both from a numerical and from a theoretical perspective. First, the purely harmonic case with several idealised reservoirs at different temperatures was solved explicitly in [35]. In that paper the authors found exactly what the non-equilibrium stationary state looks like: it is Gaussian in the positions and momenta of the system. For the anharmonic chain there are no explicit results in general. However, it has been studied numerically for many different potentials and many kinds of heat baths, including the Langevin heat baths that we consider here. See for instance [2, 20, 28] and references therein.
There are two facts in this model that make its rigorous study very challenging: first of all, we do not know explicitly the form of the invariant measure of (1.1) and also our generator is highly degenerate, having the dissipation and noise acting only on two variables of momenta at the end of the chain. It is not difficult to see, though, that in the equilibrium case, i.e. when the two temperatures are equal \(T_L=T_R=T= \beta ^{-1}\), the stationary measure is the Gibbs–Boltzmann measure \(\text{ d }\mu (p,q)=\exp (-\beta H(p,q)) \text{ d }p\text{ d }q\): after explicit calculations we have \({\mathcal {L}}^* e^{-\beta H(p,q)}=0\).
Since we are interested in the theoretical aspects of the model, we refer to [15, 16], which is the first rigorous study of the anharmonic case. The existence of a steady state has only been obtained in some cases where the potentials act like polynomials near infinity. In particular under the following assumptions on the potentials:
for constants \(a_k>0\), where \(k \ge 2\) for the interaction and \(k \ge 1\) for the pinning (the exponent k for the pinning was improved in [9]), and assuming that the interaction potential is at least as strong as the pinning, the existence and uniqueness of an invariant measure was first proved in [15] using functional analytic methods. In particular it was proved that the resolvent of the generator of (1.1) is compact in a suitable weighted \(L^2\) space. Later it was proved in [34], using probabilistic tools, that the rate of convergence to the steady state is exponential. Note that in the above-mentioned papers the coupling of the chain with the heat baths is slightly different from, and a bit more complicated than, Langevin thermostats, and it has a physical interpretation: the reservoirs are modelled by a classical field theory given by linear wave equations with initial conditions distributed according to appropriate Gibbs measures at different temperatures, see also [33, Sect. 2]. The difference from Langevin heat baths is that the dissipation and the noise act on the momenta only indirectly, through some auxiliary variables. Later, an adaptation of a very similar probabilistic proof was provided in [9] for Langevin thermostats. Finally let us mention that the relaxation rates have been studied for short chains of rotors with Langevin thermostats in [11, 13].
Regarding the existence, uniqueness of a non-equilibrium stationary state and exponential convergence towards it in more complicated networks of oscillators (multi-dimensional cases) see [12]. The proofs there are inspired by the above-mentioned works in the 1-dimensional chains.
There are also cases where there is no convergence to equilibrium, for instance when \(l > k\), i.e. when the pinning is stronger than the coupling potential, see for example [21, 22]. In [22] it is shown that, under some scenarios included in \(l > k\), the resolvent of the generator fails to be compact and/or there is no spectral gap. In particular, when the interaction is harmonic, 0 belongs to the essential spectrum of the generator as soon as the pinning potential is of the form \(| q |^{k}\) for \(k>3\). The conjecture is that this is true as soon as \(k>\frac{2n}{2n-1}\) if n is the center of the chain.
1.2 Notation
We write \(\{e_i\}_{i=1}^n\) for the elements of the canonical basis of \({\mathbb {R}}^n\) and \(| \cdot |\) for the Euclidean norm on \({\mathbb {R}}^n\) induced by the usual inner product \(\langle \cdot , \cdot \rangle \). For a square matrix \(A = (a_{ij})_{1 \le i,j \le n} \in {\mathbb {R}}^{n \times n}\), we write \(\Vert A\Vert _2\) for the operator (spectral) norm induced by the Euclidean norm for vectors:
We also write \(A^{1/2}\) for the square root of a positive definite matrix A, i.e. the positive definite matrix such that \(A^{1/2} A^{1/2}=A\). Moreover, by \(C_b^{\infty }({\mathbb {R}}^n)\) we denote the space of smooth and bounded functions, and by \(\nabla _z\) the gradient in the z-variables with respect to the Euclidean metric. We write \({\mathcal {P}}_2({\mathbb {R}}^n)\) for the space of probability measures on \({\mathbb {R}}^n\) with finite second moment, i.e.
[N] denotes the set \(\{1,2,\ldots , N \}\) and we use the notation \(g(x) \lesssim {\mathcal {O}}\big (f(x) \big )\) to indicate that there is a dimensionless constant \(C>0\) so that \(|g(x)| \le C |f(x)|\).
1.3 Set Up and Main Results
Let us state two assumptions: one on the boundary conditions of the chain and one on the potentials.
- (H1) Regarding the boundary conditions, we consider the chain of oscillators with rigidly fixed edges: the left boundary of the chain is an oscillator labelled 0 and the right boundary is an oscillator labelled \(N+1\), under the hypothesis that \(q_0=q_{N+1}=0\). The first and the last particle are pinned with additional harmonic forces, corresponding to their attachment to a wall. Note that these boundary conditions, together with heat baths modelled by two Ornstein–Uhlenbeck processes at both ends as explained above, give the same model as in [35]; it is known as the Casher–Lebowitz model, since it is also one of the models considered in [10].
- (H2) The chain is weakly anharmonic: both pinning and interaction potentials differ from the quadratic ones by perturbing potentials \( U_{\text {pin}}^N, U_{\text {int}}^N \in {\mathcal {C}}^2({\mathbb {R}})\) with bounded Hessians in the following sense:
$$\begin{aligned} \sup _{\begin{array}{c} q_i \in {\mathbb {R}},\\ i=1,\ldots ,N \end{array} } \Vert \text {Hess}\ U_{\text {pin}}^N(q_i) \Vert _{2} \le C_{pin}^N \quad \text {and}\quad \sup _{\begin{array}{c} r_i \in {\mathbb {R}},\\ i=1,\ldots ,N \end{array}} \Vert \text {Hess}\ U_{\text {int}}^N (r_i)\Vert _{2} \le C_{int}^N \end{aligned}$$ (1.4)
where \(r_i:= q_{i+1}-q_i,\ i=1,\ldots ,N\). The positive constants \(C_{pin}^N\), \(C_{int}^N\) scale with the dimension like
$$\begin{aligned} C_{pin}^N + C_{int}^N \le C_0 N^{-9/2} \end{aligned}$$ (1.5)
and \(C_0\) is a dimensionless constant.
Under Assumptions (H1) and (H2) for \(a \ge 0, c>0\), the Hamiltonian takes the form
and denoting by \({\mathcal {L}}\) the infinitesimal generator, we look at the Liouville equation \( \partial _tf = {\mathcal {L}}^* f\), where the generator of the dynamics now is
where we take all the friction constants equal, \(\gamma _1=\gamma _N=\gamma \), and we assume that the two temperatures satisfy \(T_L=T+\Delta T\), \(T_R=T- \Delta T\) for some temperature difference \(\Delta T >0\). Also, B is the symmetric tridiagonal (Jacobi) matrix
It is convenient to see the above form of the generator in the following block-matrix form:
where \(z=(p,q)^T \in {\mathbb {R}}^{2N}\), \(\Phi (q)\) corresponds to the perturbing potentials so that
the matrix \({\mathcal {F}}\) is the friction matrix
the matrix \(\Theta \) is the temperature matrix
and M in blocks is the following
where I is the identity matrix, so that it corresponds to the transport part of the operator, while B and \(\Gamma \) correspond to the harmonic part of the potentials and the drift from both ends, respectively.
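The following sketch records one reading of the matrices described above and checks that it reproduces the harmonic drift. It assumes that the harmonic part of the Hamiltonian is \(\frac{a}{2}\sum _i q_i^2 + \frac{c}{2}\sum _{i=0}^{N}(q_{i+1}-q_i)^2\), that \(\Gamma \) coincides with the friction matrix \({\mathcal {F}}\), that \(\Theta \) carries the two temperatures on the same diagonal entries, and that \(z^TM\nabla _z\) stands for \(\langle z, M \nabla _z \rangle \). These identifications are ours and should be compared with (1.8) and (1.9).

$$\begin{aligned} B&= \begin{pmatrix} a+2c & -c & & \\ -c & a+2c & \ddots & \\ & \ddots & \ddots & -c \\ & & -c & a+2c \end{pmatrix}, \qquad \Gamma = {\mathcal {F}} = {\text {diag}}(\gamma ,0,\ldots ,0,\gamma ), \qquad \Theta = {\text {diag}}(T_L,0,\ldots ,0,T_R), \\ M&= \begin{pmatrix} \Gamma & -I \\ B & 0 \end{pmatrix}, \qquad -z^T M \nabla _z = p \cdot \nabla _q - Bq \cdot \nabla _p - \Gamma p \cdot \nabla _p . \end{aligned}$$

The last identity uses only the symmetry of \(\Gamma \) and B: the block \(-I\) produces the transport term \(p \cdot \nabla _q\), the block B the harmonic force, and the block \(\Gamma \) the friction at the two ends.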
Motivation. This study is motivated by a discussion opened in C. Villani’s memoir on hypocoercivity, see Sect. 9.2 in [40], concerning open questions on the heat conduction model as defined above, and how to approach them by hypocoercive techniques. This chain of coupled oscillators corresponds to a hypocoercive situation, where the diffusion only at the ends of the chain leads to a convergence to the stationary distribution exponentially fast, under the following assumptions on the potentials: strict convexity on the interaction potential (being stronger than the pinning one) and bounded Hessians for both potentials. In particular, he points out that it might be possible to recover the previous results of exponential convergence in the weighted \(H^1(\mu )\)-norm for this different class of potentials (than the potentials assumed in [16] for instance) by applying a generalised version of Theorem 24 in [40]. For that, one needs to know some properties of the, non-explicit, non-equilibrium steady state \(\mu \): for instance, if it satisfies a Poincaré inequality or if the Hessian of the logarithm of its density is bounded.
Finally we note that entropic hypocoercivity has been applied in [29] in order to develop estimates and to get quantitative convergence results to the limit equation, for anharmonic chains but with thermostats in contact with all the particles along the chain.
Main results. Here we consider a perturbation of the harmonic chain (homogeneous case) and follow instead an approach that combines hypocoercivity techniques with the Bakry–Émery theory of \(\Gamma \) calculus and curvature conditions as in [4]. We prove the validity of the Bakry–Émery criterion in a modified setting. This is explained in more detail and implemented in Sect. 3. The whole idea was inspired by Baudoin in [6]: using this combination, Baudoin proved exponential convergence to equilibrium for the kinetic Fokker–Planck equation in \(H^1\)-norm and in Kantorovich–Wasserstein distance.
Thus we show, for the dynamics (1.1) as well, exponential convergence to the stationary state in Kantorovich–Wasserstein distance and in relative entropy, and we get quantitative rates of convergence in these distances, i.e. we obtain information on the N-dependence of the rate. In particular our estimates show that the convergence rate of the harmonic chain approaches 0 as N tends to infinity at a polynomial rate, of order between \(C_1 /N^{3}\) and \(C_2/N\), and that the rate is at least \(C_3 N^{-3}\) in the weakly anharmonic chain.
In order to quantify the above rates, we estimate \(\Vert b_N\Vert _2\), where \(b_N\) is a block matrix defined in Sect. 3 as the solution of the matrix equation (1.10). Since \(\Vert b_N\Vert _2\) appears in the rates in Theorems 1.4 and 1.6 and in Proposition 1.2, we start by stating this result:
Proposition 1.1
Let \(\Pi _{N} = {\text {diag}}(2T_L, 1, \ldots ,1,2T_R, 1, 1, \ldots , 1,1) \in {\mathbb {R}}^{2N \times 2N}\) and \(M \in {\mathbb {R}}^{2N \times 2N}\) given by (1.9), with pinning and interaction coefficients \(a \ge 0, c>0\). For all \( N \in {\mathbb {N}}\), there exists a unique symmetric positive definite block matrix \(b_{N} \in {\mathbb {R}}^{2N \times 2N}\) such that
Moreover there exists \(C_{a,c} >0\), that depends only on the coefficients a, c, such that for all \(N \in {\mathbb {N}}\), \( \Vert b_{N} \Vert _2 \le C_{a,c} N^3 \) and \( \Vert b_N^{-1}\Vert _2 \le C_{a,c}\).
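A small numerical experiment can make the statement concrete. The sketch below is only an illustration under stated assumptions: it takes the matrix equation to be \(b_N M + M^T b_N = \Pi _N\), takes \(M\) in the block form \(\begin{pmatrix} \Gamma & -I \\ B & 0 \end{pmatrix}\) with \(\Gamma = {\text {diag}}(\gamma ,0,\ldots ,0,\gamma )\) and \(B\) tridiagonal with diagonal \(a+2c\) and off-diagonals \(-c\), and uses hypothetical helper names. It solves the equation as a plain linear system in the entries of \(b_N\) and checks the residual and positive definiteness; it does not verify the \(N^3\) bound.

```python
import math

def chain_matrix(N, a=1.0, c=1.0, gamma=1.0):
    """Assumed block form M = [[Gamma, -I], [B, 0]] acting on z = (p, q)."""
    n = 2 * N
    M = [[0.0] * n for _ in range(n)]
    M[0][0] = gamma                    # friction on p_1
    M[N - 1][N - 1] = gamma            # friction on p_N
    for i in range(N):
        M[i][N + i] = -1.0             # top-right block: -I
        M[N + i][i] = a + 2.0 * c      # diagonal of B
        if i + 1 < N:
            M[N + i][i + 1] = -c       # off-diagonals of B
            M[N + i + 1][i] = -c
    return M

def pi_matrix(N, T_L, T_R):
    """Pi_N = diag(2 T_L, 1, ..., 1, 2 T_R, 1, ..., 1) of size 2N (N >= 2)."""
    n = 2 * N
    Pi = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    Pi[0][0] = 2.0 * T_L
    Pi[N - 1][N - 1] = 2.0 * T_R
    return Pi

def solve_linear(A, rhs):
    """Gaussian elimination with partial pivoting on a copy of (A | rhs)."""
    m = len(rhs)
    aug = [row[:] + [rhs[k]] for k, row in enumerate(A)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, m + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * m
    for r in reversed(range(m)):
        s = aug[r][m] - sum(aug[r][k] * x[k] for k in range(r + 1, m))
        x[r] = s / aug[r][r]
    return x

def solve_lyapunov(M, Pi):
    """Solve b M + M^T b = Pi for b, with unknowns ordered as b[i][j] -> i*n + j."""
    n = len(M)
    K = [[0.0] * (n * n) for _ in range(n * n)]
    for i in range(n):
        for j in range(n):
            row = K[i * n + j]
            for k in range(n):
                row[i * n + k] += M[k][j]   # contribution of (b M)_{ij}
                row[k * n + j] += M[k][i]   # contribution of (M^T b)_{ij}
    rhs = [Pi[i][j] for i in range(n) for j in range(n)]
    x = solve_linear(K, rhs)
    return [x[i * n:(i + 1) * n] for i in range(n)]

def lyapunov_residual(b, M, Pi):
    """Largest entry of |b M + M^T b - Pi|."""
    n = len(M)
    return max(
        abs(sum(b[i][k] * M[k][j] + M[k][i] * b[k][j] for k in range(n)) - Pi[i][j])
        for i in range(n) for j in range(n)
    )

def is_positive_definite(S):
    """Try a Cholesky factorisation of the symmetric matrix S."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

N = 3
M = chain_matrix(N)
Pi = pi_matrix(N, T_L=2.0, T_R=1.0)
b = solve_lyapunov(M, Pi)
print("residual:", lyapunov_residual(b, M, Pi))
print("positive definite:", is_positive_definite(b))
```

The dense linear solve has \((2N)^2\) unknowns, so this is meant for small N only; for larger chains a dedicated Lyapunov solver would be the natural choice.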
Second, we state the following Proposition, that is restricted to the harmonic chain, and provides us with a lower bound on the spectral gap (given the estimates on \(\Vert b_N\Vert _2\) by Proposition 1.1):
Proposition 1.2
(Lower bound on the spectral gap of the harmonic chain) For the spectral gap \(\rho \) of the chain described by the generator (1.8) without the perturbing potentials (the harmonic chain), which is given by the relation
we have the following property: there exists \(\kappa >0\) such that for all \(N \in {\mathbb {N}}\),
This lower bound is in fact the optimal rate in the case of the harmonic homogeneous chain. In the work [7, Proposition 9.1] an upper bound is provided as well and thus the scaling of \(\rho \) is exactly \(N^{-3}\). This is done by exploiting the form of the matrix M, (1.9), and more specifically using information on the spectrum of the discrete Laplacian. In [7] we also study the case of disordered chains by considering different pinning coefficients for each oscillator. In contrast with the homogeneous case treated in this paper, where the decay is polynomial, in a disordered chain the spectral gap decays at an exponential rate in terms of N. Regarding the adaptation of the generalised Bakry–Emery theory presented in this paper to a non-homogeneous scenario, we can prove existence of a spectral gap for the weakly anharmonic chain as soon as the matrix M has a spectral gap (and this is the case as soon as all the interaction coefficients \(c_i \ne 0\)). The difficulty in a non-homogeneous scenario will be the second part (as described in Sect. 2): solving the high-dimensional matrix equation (1.10) in order to estimate the spectral norm.
Remark 1.3
We expect the bound on \(\Vert b_N\Vert _2\) from Proposition 1.1 to be optimal, since the proof of Proposition 1.2 combined with [7, Proposition 9.1] shows that there exists \(c_1>0\) such that
In the following, we consider \(b_N\) as given by Proposition 1.1. Before we state the first main Theorem, we recall the definition of the Kantorovich–Rubinstein–Wasserstein \(L^2\)-distance \(W_2(\mu , \nu )\) between two probability measures \(\mu , \nu \):
where the infimum is taken over the set of all the couplings, i.e. the joint measures \(\pi \) on \( {\mathbb {R}}^N \times {\mathbb {R}}^N\) with left and right marginals \(\mu \) and \(\nu \) respectively.
It is easy to see that \(W_2\) is indeed a metric. We restrict ourselves to the subspace \({\mathcal {P}}_2({\mathbb {R}}^{2N})\), where \(\mu \) and \(\nu \) have finite second moments, so that their distance \(W_2(\mu ,\nu )\) is finite. For more information on this distance we refer the reader for instance to [41] and references therein.
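To get a feeling for the infimum over couplings, it may help to look at the simplest situation, which is not the one of this paper: two empirical measures on the real line with the same number of equally weighted atoms. There the monotone (sorted) pairing is an optimal coupling, so \(W_2\) can be computed by sorting. The function name below is hypothetical and the example is only a one-dimensional illustration of the definition.

```python
import math

def w2_empirical_1d(xs, ys):
    """W_2 between the uniform empirical measures on xs and ys (equal sizes, real line)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty samples of equal size")
    # On the real line the monotone coupling (sorted pairing) is optimal for the quadratic cost.
    pairs = zip(sorted(xs), sorted(ys))
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(xs))

# Sorted pairs are (0, 1) and (1, 2), each at distance 1.
print(w2_empirical_1d([0.0, 1.0], [2.0, 1.0]))  # prints 1.0
```

In higher dimensions, such as the phase space \({\mathbb {R}}^{2N}\) considered here, no such sorting shortcut is available and the optimal coupling has to be found by solving a transport problem.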
Theorem 1.4
We consider a chain of coupled oscillators whose dynamics are described by the system (1.1) under Assumptions (H1) and (H2). For a fixed number of particles N, there is a unique stationary state \(f_{\infty }\), in particular, for initial data \(f_0^1,f_0^2\) of the evolution equation, we have the following contraction property:
for \(C_{a,c}, \lambda _0\) dimensionless constants.
Moreover, in the set up of Theorem 1.4, we get some qualitative information about the non-equilibrium steady distribution, like the validity of a Poincaré inequality and even better, a Log–Sobolev inequality:
Proposition 1.5
(Log–Sobolev inequality) Let \({\mathcal {T}}\) be the quadratic form
Under Assumption (H2), the unique invariant measure \(\mu =f_{\infty } \) from the Theorem 1.4 satisfies a Log–Sobolev inequality \((LSI(C_N) )\) :
where
where \(\gamma , T_L, C_{a,c}, \lambda _0:= \lambda _0(C_0)\) are all dimensionless constants, and the prefactor \(C_0\) in (1.5) satisfies \(C_0 < {\text {min}}(1,2T_R)C_{a,c}^{-2}. \)
Consequently we have convergence to the non-equilibrium steady state in entropy. Let us first define the following information-theoretical functionals. For two probability measures \(\mu \) and \(\nu \) on \({\mathbb {R}}^{2N}\) with \(\nu \ll \mu \), we define the Boltzmann H functional
and the relative Fisher information
We have entropic convergence in the following sense, as in [40, Sect. 6]:
Theorem 1.6
We consider a chain of coupled oscillators whose dynamics are described by the system (1.1) under Assumptions (H1) and (H2). For a fixed number of particles N, assuming that (i) \(\mu \) is the invariant measure for \(P_t\) and (ii) that it satisfies a Log–Sobolev inequality with constant \(C_N>0\), for all \(f>0\) with
we have a convergence to the non-equilibrium steady state in the following sense:
for dimensionless constants \(\lambda _{a,c}, \lambda _0\).
From Theorem 1.4 we get an exponential rate of order bigger than \(N^{-3}\) for the weakly anharmonic chain. In the purely harmonic case, we have that the convergence rate is between \(C_1 N^{-3}\) and \(C_2 N^{-1}\) for some constants \(C_1, C_2\) that are independent of N.
Remark 1.7
Note that a generalised version of \(\Gamma \) calculus has been applied for a toy model of the dynamics (1.1) by Monmarché, [31]: working with the unpinned, non-kinetic version, with convex interaction and given that the center of the mass is fixed, he proves the same kind of convergences and ends up with explicit and optimal N-dependent rates, of order \({\mathcal {O}}(N^{-2})\), for the overdamped dynamics.
1.4 Plan of the Paper
Sections 2, 3, 4 and 5 concern the proofs of the convergence to the steady state by hypocoercive arguments (applying the generalised Bakry–Emery criterion) while Sect. 6 is devoted to estimating the spectral norm of \(b_N\), which is crucial in the final estimate for the scaling of the spectral gap. In particular, Sect. 2 contains an introduction to Bakry–Emery theory and an explanation of the method that is used. In Sect. 3 we obtain the estimates that lead to the proof of Proposition 1.5. In Sects. 4 and 5 we give the proof of Theorems 1.6 and 1.4 respectively. Finally in Sect. 6 we prove Propositions 1.1 and 1.2.
2 Carré du Champ Operators and Curvature Condition
2.1 Introduction to Carré du Champ Operators
Consider a Markov semigroup \(P_t\) with at least one invariant measure \(\mu \) and infinitesimal generator \(L: D(L) \subset L^2(\mu ) \rightarrow L^2(\mu )\). Here we restrict ourselves to the case of the diffusion operators and we associate with the operator L, a bilinear quadratic differential form \(\Gamma \), the so-called Carré du Champ operator, which is defined as follows: for every pair of functions (f, g) in \( C^{\infty }\times C^{\infty }\)
In other words, \(\Gamma \) measures the failure of L to satisfy the Leibniz rule. Then we define its iteration \(\Gamma _2\), where instead of the multiplication we use the action of \(\Gamma \):
From the theory of \(\Gamma \)-calculus we have that a curvature condition of the form
for all f in a suitable algebra \({\mathcal {A}}\) dense in the \(L^2(\mu )\)-domain of L and \(\lambda >0\) is equivalent to the following gradient estimate
where \(P_t\) is the semigroup generated by L. The uniqueness of the invariant measure then follows from the contraction property in \(W_2\) distance (which is equivalent to the gradient estimate above thanks to Kuwada’s duality, see [26] or Theorem 4.1 later on). This also implies a Log–Sobolev inequality (and thus a Poincaré inequality), see [4] or [3, Sect. 3].
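For reference, in the usual normalisation of \(\Gamma \)-calculus (the constants \(\tfrac{1}{2}\) are the standard convention and are assumed here), the objects entering the statements above read as follows.

$$\begin{aligned} \Gamma (f,g)&= \tfrac{1}{2}\big ( L(fg) - f\,Lg - g\,Lf \big ), \\ \Gamma _2(f,g)&= \tfrac{1}{2}\big ( L\Gamma (f,g) - \Gamma (f,Lg) - \Gamma (Lf,g) \big ), \\ \Gamma _2(f,f) \ge \lambda \, \Gamma (f,f) \quad&\Longleftrightarrow \quad \Gamma (P_tf,P_tf) \le e^{-2\lambda t}\, P_t \Gamma (f,f), \quad t \ge 0. \end{aligned}$$

A simple example where the criterion does hold is the Ornstein–Uhlenbeck generator \(L = \Delta - x \cdot \nabla \) on \({\mathbb {R}}^n\): there \(\Gamma (f,f) = |\nabla f|^2\) and \(\Gamma _2(f,f) = \Vert \nabla ^2 f \Vert _{HS}^2 + |\nabla f|^2 \ge \Gamma (f,f)\), so the curvature condition is satisfied with \(\lambda = 1\).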
Attempt to apply the classical \(\Gamma \) theory to the generator \({\mathcal {L}}\) given by (1.8): For the generator of the dynamics (1.1), given by (1.8), we cannot bound \(\Gamma _2 \) from below by \( \Gamma \). Explicit calculations give
while
Since we cannot control the terms \( \partial _{p_i}f \partial _{q_i}f, \) we cannot bound \(\Gamma _2\) from below by \(\Gamma \). In cases like this, we say that the particle system has \( -\infty \) Bakry–Emery curvature.
2.2 Description of the Method
In order to overcome this problem, we proceed as follows:
(1) First we modify the classical \(\Gamma \) theory: we define a new quadratic form, different, but equivalent, to the \(| \nabla _z f|^2\) that will play the role of the \(\Gamma \) functional. This will spread the noise from \(p_1\) and \(p_N \) to all the other degrees of freedom as well. The general idea comes from Baudoin [6]. We make a suitable choice of a positive definite matrix, \(b_N \in {\mathbb {R}}^{2N \times 2N}\), to define a new quadratic form that will replace the \(\Gamma \) functional, so that we obtain a ’twisted’ curvature condition: an estimate of the form (2.3). This implies also a modified gradient estimate, and thus a Poincaré and Log–Sobolev inequality. We choose this matrix to be the unique solution of a Lyapunov equation with positive definite r.h.s.:
In general, in order to deal with a hypocoercive situation in the \(H^1\) setting, one can perturb the norm to an equivalent norm, so that exponential convergence results can be deduced in this new norm. The idea is originally due to Talay [38] and it was later generalised by Villani in [40]. Then one obtains convergence in the usual norm thanks to their equivalence. Here, instead of the norm, we modify the gradient and thus the Carré du Champ \(\Gamma \), and work with a generalised \(\Gamma \) theory.
The idea of working with the matrix that solves the above-mentioned Lyapunov equation came from the fact that (i) we need to control from below the quantity \(b_NM + M^Tb_N\) and (ii) in the linear chain, the covariance matrix \(b_0 \in {\mathbb {R}}^{2N \times 2N}\) solves
and determines the stationary solution of the corresponding Liouville equation. Therefore, tackling the hypoellipticity problem, i.e. spreading the dissipation to all the degrees of freedom, corresponds to working with a Lyapunov equation with positive definite r.h.s. A way to think of it is as a sequence of Lyapunov equations:
so that in each step we add a positive entry in the diagonal of the r.h.s. from both sides. This corresponds to spreading the noise and dissipation to the next oscillator from both ends until the center of the chain, like the commutators would do in a classical hypoelliptic setting, see also Fig. 1. So in the last step we have \(\Pi _N>0\) which corresponds to having spread the noise everywhere in the space. This allows us to prove the validity of the generalised Bakry–Emery criterion (3.4), which is the key estimate in order to have exponential convergence to the non-equilibrium steady state.
(2) In order to make our estimates quantitative, we estimate the spectral norm of the matrix \(b_N\) and its inverse. Regarding the bound on the norm of \(b_N\), we estimate its entries using that it solves the Lyapunov equation, while for the norm of \(b_N^{-1}\), we compare it to the norm of \(b_0^{-1}\) which is uniformly bounded in N. This corresponds to the proof of Proposition 1.1 which is the subject of Sect. 6.
For those familiar with Hörmander’s method, we briefly describe here the similarity with the mechanism of spreading the dissipation: in Hörmander’s theory the smoothing mechanism is transferred through the interacting particles inductively by the use of commutators: the generator has the form
where
Then \([\partial _{p_1},X_0]=-\partial _{p_1}+ \partial _{q_1} \). Now commuting \(\partial _{q_1}\) with the first order terms of the generator: \([\partial _{q_1},X_0]= \partial _{q_1q_1}H \partial _{p_1}-\partial _{q_1q_2}H \partial _{p_2}\). Given that \(\partial _{q_1q_2}H\) is non-vanishing we have ’spread the smoothing mechanism’ to \(p_2\). Continuing like that, commuting the ’new’ variable with the first order terms of \({\mathcal {L}}\), inductively we cover all the particles of the chain.
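For the linear (harmonic) part of the dynamics, the inductive commutator argument is equivalent to the Kalman rank condition for the drift matrix and the noise directions, a standard fact for linear diffusions. The sketch below checks this condition in exact rational arithmetic. The drift \(\dot{p} = -\Gamma p - Bq\), \(\dot{q} = p\), with B tridiagonal with diagonal \(a+2c\) and off-diagonals \(-c\), and all function names are assumptions made for illustration.

```python
from fractions import Fraction

def drift_matrix(N, a, c, gamma):
    """A with d(p, q)/dt = A (p, q) for the harmonic chain (assumed form)."""
    n = 2 * N
    A = [[Fraction(0)] * n for _ in range(n)]
    A[0][0] -= gamma                         # friction on p_1
    A[N - 1][N - 1] -= gamma                 # friction on p_N
    for i in range(N):
        A[N + i][i] = Fraction(1)            # dq_i/dt = p_i
        A[i][N + i] = -Fraction(a + 2 * c)   # -B_ii q_i
        if i + 1 < N:
            A[i][N + i + 1] = Fraction(c)    # -B_{i,i+1} q_{i+1}
            A[i + 1][N + i] = Fraction(c)    # -B_{i+1,i} q_i
    return A

def mat_vec(A, v):
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]

def rank(rows):
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [r[:] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        piv = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for r in range(rk + 1, len(rows)):
            f = rows[r][col] / rows[rk][col]
            if f != 0:
                rows[r] = [x - f * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def kalman_rank(N, a=1, c=1, gamma=1):
    """Rank of [G, AG, ..., A^(2N-1) G], where G spans the p_1 and p_N directions."""
    n = 2 * N
    A = drift_matrix(N, a, c, gamma)
    vectors = []
    for start in (0, N - 1):                 # noise enters through p_1 and p_N
        v = [Fraction(int(k == start)) for k in range(n)]
        for _ in range(n):
            vectors.append(v)
            v = mat_vec(A, v)
    return rank(vectors)

print(kalman_rank(3))        # coupled chain: 6, i.e. full rank 2N
print(kalman_rank(3, c=0))   # decoupled oscillators: 4, the middle particle is never reached
```

The second call illustrates why a non-vanishing coupling is needed: with \(c=0\) the noise at the ends never reaches the interior oscillators, which mirrors the requirement that \(\partial _{q_1q_2}H\) be non-vanishing in the commutator computation.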
3 Functional Inequalities in the Modified Setting
In order to apply a ’twisted’ Bakry–Emery machinery, introduced by Baudoin in Sect. 2.6 of [6], we work with the positive definite matrix \(b_N\) chosen to be the solution of the Lyapunov equation (1.10). The following Proposition gives us existence of such a solution.
Proposition 3.1
There exists a positive solution to (1.10) if and only if its r.h.s. is positive definite and all the eigenvalues of M have positive real parts.
Proof
It is a matrix reformulation of a well known and classical result of Lyapunov that can be found for instance in [18, p. 224] or [30, Sect. 20]. \(\square \)
The eigenvalues of M have strictly positive real part ([25, Lemma 5.1]) and the right hand side of (1.10) is positive definite. Therefore there exists a positive solution of (1.10). Also, we can easily see that the solution is given by the formula
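Assuming that (1.10) reads \(b_N M + M^T b_N = \Pi _N\), which is the form suggested by the discussion in Sect. 2.2, the classical integral representation of the solution and its verification can be sketched as follows. The convergence of the integral and the vanishing of the boundary term at infinity use that all eigenvalues of M have strictly positive real part.

$$\begin{aligned} b_N&= \int _0^{\infty } e^{-tM^T}\, \Pi _N\, e^{-tM}\, \text {d}t, \\ b_N M + M^T b_N&= -\int _0^{\infty } \frac{\text {d}}{\text {d}t}\Big ( e^{-tM^T}\, \Pi _N\, e^{-tM} \Big )\, \text {d}t = \Pi _N - \lim _{t \rightarrow \infty } e^{-tM^T}\, \Pi _N\, e^{-tM} = \Pi _N . \end{aligned}$$

The same formula shows that \(b_N\) is symmetric and, since \(\Pi _N\) is positive definite, that \(\langle x, b_N x \rangle = \int _0^{\infty } | \Pi _N^{1/2} e^{-tM} x |^2\, \text {d}t > 0\) for every \(x \ne 0\).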
We define the following quadratic quantity for \(f,g \in C^{\infty }({\mathbb {R}}^{2N})\),
so that
Then we consider the functional
Here \({\mathcal {T}}(f,f)\) is always positive since \(b_N \ge 0\) (and in fact positive definite since \(b_N>0\): this is proven in the last part of the proof of Proposition 1.1). In contrast with the original operator \(\Gamma \), our modified quadratic form \({\mathcal {T}}\) is related to \({\mathcal {L}}\) only indirectly through the different steps of commutators.
We have an equivalence of the following form between \( {\mathcal {T}}\) and \(| \nabla _z|^2\):
Combining this with the conclusion of Proposition 1.1, we write
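Assuming that the quadratic form is \({\mathcal {T}}(f,f) = \langle \nabla _z f, b_N \nabla _z f \rangle \) (our reading of the definition (3.1)), both statements follow from the Rayleigh quotient bounds for the symmetric positive definite matrix \(b_N\), whose smallest eigenvalue equals \(\Vert b_N^{-1}\Vert _2^{-1}\):

$$\begin{aligned} \Vert b_N^{-1}\Vert _2^{-1}\, |\nabla _z f|^2 \le {\mathcal {T}}(f,f) \le \Vert b_N\Vert _2\, |\nabla _z f|^2, \qquad \text {hence}\qquad C_{a,c}^{-1}\, |\nabla _z f|^2 \le {\mathcal {T}}(f,f) \le C_{a,c} N^3\, |\nabla _z f|^2 . \end{aligned}$$

The second pair of inequalities only inserts the bounds \(\Vert b_N\Vert _2 \le C_{a,c}N^3\) and \(\Vert b_N^{-1}\Vert _2 \le C_{a,c}\) from Proposition 1.1.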
Proposition 3.2
With the above notation, under Assumption (H2), for all \(N \in {\mathbb {N}}\) there exists constant
such that for \(f \in C^{\infty }({\mathbb {R}}^{2N})\),
Proof
We use the form of the generator \({\mathcal {L}}\) as in (1.8):
where \(\Phi \) is the function that corresponds to the perturbing potentials. We write
For the \((-z^TM \nabla _z)\)-part of \({\mathcal {L}}\), the last equation of the above formula gives
Similarly, for the \((-\nabla _q \Phi (q) \cdot \nabla _p)\)-part of \({\mathcal {L}}\) we get
and finally regarding the second order terms of the generator we end up with
We eventually write
where for the second inequality we used that the terms \({\mathcal {T}}( \partial _{p_i} f,\partial _{p_i} f)\) for \(i=1,N\), are positive. We write the second and third term of the last equation as
and then from the boundedness assumption on the operator norms of the Hessians for both perturbing potentials and the Lyapunov equation (1.10), we get the following
We conclude by gathering the terms. \(\square \)
Assumption (H2) combined with the conclusion of Proposition 1.1 ensures that \(\lambda _N\) is positive for a suitable choice of the pre-factors, as we do in the proofs of the main Theorems 1.4 and 1.6. We now state the following lemma, which gives the ‘twisted’ gradient bound.
Lemma 3.3
(Gradient bound) Under Assumption (H2), for all \(N \in {\mathbb {N}}\), \(t \ge 0\), \((p,q) \in {\mathbb {R}}^{2N}\) and \(f \in C_c^{\infty }({\mathbb {R}}^{2N})\), we have the following twisted gradient estimate
for \(\lambda _N\) given by Proposition 3.2.
Proof
We shall first present a formal derivation of the estimate (3.5). If \({\mathcal {T}}(P_t f, P_tf)\) is compactly supported we consider the functional, for fixed \(t>0, (p,q) \in {\mathbb {R}}^{2N}\),
for \(f \in C_c^{\infty }({\mathbb {R}}^{2N})\). Since from the semigroup property we have
by differentiating and using the above inequality we get
and since \(\Psi (0)={\mathcal {T}}(P_tf,P_tf),\ \Psi (t)=P_t({\mathcal {T}}(f,f))\), by Grönwall’s lemma we get the desired inequality for every smooth and bounded function f.
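The formal computation can be written out as follows. We assume here that the functional is \(\Psi (s) = P_s\big ({\mathcal {T}}(P_{t-s}f,P_{t-s}f)\big )\) for \(s \in [0,t]\), which matches the endpoint values stated above, and that the curvature bound of Proposition 3.2 has the form \({\mathcal {T}}_2(g,g) \ge \lambda _N {\mathcal {T}}(g,g)\) with \({\mathcal {T}}_2(g,g) = \tfrac{1}{2} {\mathcal {L}} {\mathcal {T}}(g,g) - {\mathcal {T}}(g, {\mathcal {L}}g)\); the symbol \({\mathcal {T}}_2\) is our notation for this functional.

$$\begin{aligned} \Psi '(s)&= P_s\Big ( {\mathcal {L}}\, {\mathcal {T}}(P_{t-s}f,P_{t-s}f) - 2\, {\mathcal {T}}(P_{t-s}f, {\mathcal {L}} P_{t-s}f) \Big ) = 2 P_s\big ( {\mathcal {T}}_2(P_{t-s}f,P_{t-s}f) \big ) \ge 2 \lambda _N \Psi (s), \\ \Psi (t)&\ge e^{2\lambda _N t}\, \Psi (0) \quad \Longleftrightarrow \quad {\mathcal {T}}(P_tf,P_tf) \le e^{-2\lambda _N t}\, P_t\big ( {\mathcal {T}}(f,f) \big ). \end{aligned}$$

The inequality in the first line uses that \(P_s\) preserves positivity, and the second line is Grönwall’s lemma applied on \([0,t]\).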
In general we need \({\mathcal {T}}(P_t f,P_tf)\) to belong to \(L^{\infty }({\mathbb {R}}^{2N})\), because then we know that \(P_s\big ({\mathcal {T}}(P_{t-s}f,P_{t-s}f)\big )\) is well defined. So we proceed as follows:
First we take \(W(p,q)= 1+ |p|^2 + |q|^2\) as a Lyapunov function that satisfies the following conditions: \(W \ge 1\), \( {\mathcal {L}} W \le C W\), the sets \( \{ W \le m \}\) are compact for each m, and \({\mathcal {T}}(W) \le C W^2\). This W satisfies these conditions thanks to the bounded-Hessians assumption, which makes \( \nabla (U_{int}^N+ U_{pin}^N) \) Lipschitz. In particular, for the inequality \( {\mathcal {L}} W \le C W \), using Cauchy–Schwarz and Young’s inequalities, we write
while the inequality \( {\mathcal {T}}(W) \le C_2 W^2\) obviously holds. So we end up with the same constant by choosing \(C:= \max \{C_1,C_2\}.\)
Now using the function W combined with a localization argument as in the work by F.Y. Wang [42, Lemma 2.1] or [5, Theorem 2.2] we prove the boundedness of \( {\mathcal {T}} (P_tf,P_tf).\) For this we approximate the generator \({\mathcal {L}}\) by truncated operators \({\mathcal {L}}_n\), so that the approximating diffusion processes remain in compact sets. Consider \(h \in C_c^{\infty }([0,\infty ))\) decreasing such that \(h\vert _{[0,1]}=1\) and \( h\vert _{[2,\infty )}=0\) and define
Then \({\mathcal {L}}_n\) has compact support in \(K_n:= \{W \le 2n\}\), in the sense that it is 0 outside of it, due to the definition of \(h_n\). Let \(P_t^n\) be the semigroup generated by \({\mathcal {L}}_n\), which is given as the unique bounded solution of
Then we also have that for every bounded \(f \in L^{\infty }({\mathbb {R}}^{2N})\), pointwise
We repeat the interpolation semigroup argument as before for \({\mathcal {L}}_n\) and for \(f \in C_c^{\infty }({\mathbb {R}}^{2N})\) supported in \(\{W\le n\}\). Define
for fixed \(t>0\), \(n \ge 1\) applied to a fixed point (p, q) in the support inside the set \(\{W \le n \}\).
It is true, due to the properties of W, that \({\mathcal {T}}(P_t^n f,P_t^n f) \le C_{f,t} \) with \(C_{f,t}\) independent of n and so we have a bound on \({\mathcal {T}}(P_t^nf,P_t^nf )\) uniformly on the set \(\{W\le n\}\). Indeed
with \(C_1\) constant independent of n. About the last term:
with C independent of n. Now calculate
with \(C_2>0\) some constant again independent of n (from the assumptions on the Lyapunov functional W). Therefore
Combining this last estimate with the above bounds we end up with the differential inequality
where \(C_3=C_3(f,t) \) is again independent of n. Multiplying both sides by \(e^{(2| \lambda _N | + 2)s}\), the above inequality implies
or equivalently, after integrating both sides in time from 0 to t, that
which gives the boundedness of \( {\mathcal {T}}(P_t^nf,P_t^nf) = \Psi _n(0) \) uniformly in n, on the set \(\{ W \le n\}\).
Now if \(d'\) is the intrinsic distance induced by \({\mathcal {T}}\)
from the above bound we have that
for n large enough with \(x,y \in \{W \le n\}\) and \(f \in C_c^{\infty }({\mathbb {R}}^{2N})\) with support in \(\{W \le n\}\). This comes from the formula
Now C does not depend on n (from before), so passing to the limit we have
and so \({\mathcal {T}}(P_tf, P_tf) \) is also bounded. Now we can repeat the standard Bakry–Emery calculations as in the beginning of the proof. \(\square \)
Remark 3.4
Note that using the equivalence of \( {\mathcal {T}}\) and \(| \nabla _z|^2\):
we get the following \(L^2\)- gradient estimate
Once we have a curvature condition of the form (3.4) we are also able to show that the stationary measure satisfies a Poincaré inequality.
Proposition 3.5
Let \({\mathcal {L}} \) be the generator of the dynamics described by the SDEs (1.1) and \({\mathcal {T}}\) the perturbed quadratic form defined in (3.1). Under Assumption (H2), for all \(N \in {\mathbb {N}}\) and all \(f \in C^{\infty }({\mathbb {R}}^{2N})\), the invariant measure \(\mu \) satisfies the Poincaré inequality
where \(C_N = \frac{ \gamma T_L \Vert b_N^{-1} \Vert _{2}}{\lambda _N}\), with \(\lambda _N\) defined in Proposition 3.2.
Proof
For \(f \in C^{\infty }({\mathbb {R}}^{2N})\), we consider the functional
We denote by \(\Gamma \) the Carré du Champ operator defined in (2.1). By differentiating we have
Now by integrating from 0 to t
where in the first inequality we used that
for the second we used the gradient bound from Lemma 3.3 and, right after that, the semigroup property. The last line can be rewritten as
Letting \(t \rightarrow \infty \) and using ergodicity, we obtain the desired inequality. \(\square \)
In fact it is possible to show a stronger pointwise gradient bound, that we exploit for the proof of a Log–Sobolev inequality for the invariant measure of the dynamics.
Proposition 3.6
(Strong gradient bound) For \(f \in C_c^{\infty }({\mathbb {R}}^{2N})\), \(\forall \ t\ge 0\) and \((p,q) \in {\mathbb {R}}^{2N}\)
Remark 3.7
This estimate is stronger than (3.5) in Lemma 3.3: the latter follows from it by the Cauchy–Schwarz inequality.
Proof
The rigorous justification of the following formal calculations (i.e. the boundedness of \( \sqrt{{\mathcal {T}}(P_{t-s}f,P_{t-s}f)}\)) is exactly as in the proof of Lemma 3.3.
Here for \(f \in C_c^{\infty }({\mathbb {R}}^{2N}) \), and for fixed \(t \ge 0, (p,q) \in {\mathbb {R}}^{2N}\), instead we define
Writing \(g=P_{t-s}f \), differentiating and performing the standard calculations, we have
where in the first equality we used that
In the first inequality we used the formula
from the proof of Proposition 3.2, that
where \(\Gamma \) is the Carré du Champ operator defined in (2.1), and that \({\mathcal {T}}\) and \(\partial _{p_1}\) obviously commute. Now from Grönwall’s lemma we get
\(\square \)
This pointwise, strong gradient bound implies a Log–Sobolev inequality.
Proof of Proposition 1.5
For \(f \in C_c^{\infty }({\mathbb {R}}^{2N}) \), we introduce the functional
for fixed \(s \in [0,t]\) evaluated at a fixed point in the phase space. We denote by \(\Gamma \) the Carré du Champ operator defined in (2.1) and following again Bakry’s recipes, we get
where for the second inequality we used the bound from Proposition 3.6, while for the last inequality we applied Jensen’s inequality and the fact that the function \((x,y) \mapsto y^2/x\) is jointly convex for x, y positive. Integrating from 0 to t, we get
Letting \(t \rightarrow \infty \) and using the ergodicity of the semigroup, we get the LSI with constant \( \frac{\gamma T_L \Vert b_N^{-1} \Vert _{2} \Vert b_N \Vert _{2} }{2\lambda _N } \), which is the constant for the non-perturbed Fisher information. Therefore, applying the estimates from Proposition 1.1, we have
where \(C_0\) is the constant in (1.5) which we choose small enough, i.e. to satisfy
so that \(\lambda _0 >0\). \(\square \)
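The convexity invoked in the last inequality of the proof above can be checked directly. The following is a short sketch in generic notation (the symbols \(\varphi \), u, v are ours and do not appear in the text). For \(\varphi (x,y)=y^2/x\) on \(\{x>0\}\),
$$\begin{aligned} \nabla ^2 \varphi (x,y) = \frac{2}{x^3} \begin{pmatrix} y^2 &{} -xy \\ -xy &{} x^2 \end{pmatrix}, \qquad \det \nabla ^2 \varphi = 0, \qquad {\text {Tr}}\, \nabla ^2 \varphi = \frac{2(x^2+y^2)}{x^3} > 0, \end{aligned}$$
so the Hessian is positive semi-definite and \(\varphi \) is jointly convex. Since \(P_s\) is a Markov operator, the two-dimensional Jensen inequality then gives, for \(u>0\) and v such that both sides are finite,
$$\begin{aligned} \frac{(P_s v)^2}{P_s u} \le P_s \Big ( \frac{v^2}{u} \Big ). \end{aligned}$$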
4 Convergence to Equilibrium in Kantorovich–Wasserstein Distance
We use that the gradient estimate (3.6) is equivalent to an estimate in Wasserstein distance (Kuwada’s duality [26]). More specifically, we have the following Theorem, here stated only in the Euclidean space with the Lebesgue measure \( ({\mathbb {R}}^{2N}, |\cdot |, \lambda )\) and only for the Wasserstein-2 distance:
Theorem 4.1
(Theorem 2.2 of [26]) Let P be a Markov semigroup on \({\mathbb {R}}^{2N}\) that has a continuous density with respect to the Lebesgue measure. For \(c>0\), the following are equivalent:
(i) For all probability measures \(\mu , \nu \) we have
$$\begin{aligned} W_2 (P_t^* \mu , P_t^* \nu ) \le c W_2 (\mu ,\nu ). \end{aligned}$$
(ii) For all bounded and Lipschitz functions f and \( z \in {\mathbb {R}}^{2N}\),
$$\begin{aligned} |\nabla P_t f | (z) \le c P_t \big ( | \nabla f|^2\big )(z)^{1/2}, \end{aligned}$$
where this estimate is associated with the Lipschitz norm defined just above.
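As an illustration of statement (i), here is a minimal sketch for a toy case that is not the chain studied here: the one-dimensional Ornstein–Uhlenbeck semigroup associated with \(dX = -X\,dt + \sqrt{2}\,dB\). For it the gradient estimate (ii) holds with \(c = e^{-t}\), and Gaussian laws stay Gaussian, so both sides of (i) are available in closed form. The function names below are ours and purely illustrative.

```python
import math

def ou_push(mean, std, t):
    """Law at time t of dX = -X dt + sqrt(2) dB started from N(mean, std^2).
    It is again Gaussian; return its mean and standard deviation."""
    decay = math.exp(-t)
    new_var = decay ** 2 * std ** 2 + (1.0 - decay ** 2)
    return decay * mean, math.sqrt(new_var)

def w2_gauss(m1, s1, m2, s2):
    """Closed-form Wasserstein-2 distance between N(m1, s1^2) and N(m2, s2^2) on R."""
    return math.hypot(m1 - m2, s1 - s2)

def contraction_ratio(m1, s1, m2, s2, t):
    """W_2 after time t divided by the initial W_2; (i) predicts a value <= exp(-t)."""
    after = w2_gauss(*ou_push(m1, s1, t), *ou_push(m2, s2, t))
    return after / w2_gauss(m1, s1, m2, s2)
```

For two Gaussians with equal variance the ratio equals \(e^{-t}\), so in this toy case the constant c cannot be improved.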
Now we are ready to prove Theorem 1.4.
Proof of Theorem 1.4
The convergence follows if we apply Kuwada’s duality from Theorem 4.1 since we have the estimate (3.6) with \(c= \Vert b_N^{-1} \Vert _{2}^{1/2} \Vert b_N \Vert _{2}^{1/2}.\) Therefore the contraction reads
Since \(\lambda _N\), as defined in (3.3), is:
by exploiting the estimates on \(\Vert b_N\Vert _2\) and \(\Vert b_N^{-1}\Vert _2\) from the Proposition 1.1 we quantify the rate:
Choosing \(C_0 < {\text {min}}(1,2T_R) C_{a,c}^{-2}\) gives us \(\lambda _N >0\) for all N.
This gives us the statement of the Theorem:
Finally, for the uniqueness of the stationary solution \(f_{\infty }\), we see that all the solutions \(f_t\) will converge towards it if we make the choice \(f_0^2= f_{\infty }\). \(\square \)
5 Entropic Convergence to Equilibrium
If \(\mu \) is the invariant measure of the system, we prove here convergence to the stationary state in Entropy as stated in Theorem 1.6: first with respect to the functional
and then using the equivalence of \({\mathcal {T}}(f,f)\) with \(|\nabla f |^2\).
Proof of Theorem 1.6
We consider the functional
and, differentiating and repeating the steps from Propositions 3.6 and 1.5, we end up with
where we have used that for the second inequality
and in the last inequality we used the bound (3.4). We introduce a constant \(\eta \), over which we optimise later, integrate against the invariant measure \(\mu \) and apply the Log–Sobolev inequality from Proposition 1.5:
since \(\int _{{\mathbb {R}}^{2N}} P_s \Big ( P_{t-s}f \log P_{t-s}f \Big ) d\mu =\int _{{\mathbb {R}}^{2N}} P_s \Big ( P_{t-s}f \log P_{t-s}f - P_{t-s}f +1 \Big ) d\mu \) which is nonnegative. For \(\eta := \frac{C_N}{1+C_N}\) we have
Finally, from Grönwall’s inequality we have
or equivalently the desired convergence, thanks to the invariance of the measure. Since \(\lim _{N \rightarrow \infty } \lambda _N \frac{C_N}{1+C_N} = \lim _{N \rightarrow \infty } \lambda _N \), we have that the exponential rate is indeed of order \(\lambda _N\) (as in the convergence in Theorem 1.4):
Since \( {\mathcal {T}}\) and \( | \nabla _z|^2\) are equivalent, see (3.2), we get the above convergence in the non-perturbed setting with equivalence-constant \( {\text {max}}\left( 1,\Vert b_N^{-1}\Vert _2 \right) \Vert b_N \Vert _{2}. \)
In particular, both the Boltzmann entropy \(H_{\mu }(P_tf \mu )\), given by (1.13), and the Fisher information \(I_{\mu }(P_tf \mu )\), given by (1.14), decay:
Thus, combining with the conclusion of Proposition 1.1, the denominator is of order 1 with the dimension, and, as in the proof of Theorem 1.4, \(\lambda _N \ge \lambda _0 N^{-3}\) and we conclude. \(\square \)
Remark 5.1
(i) The rate of convergence to the stationary state, \(\lambda _N\), does not depend on the temperature difference \(\Delta T\): under assumption (H2) we get the existence of a spectral gap for all \(\Delta T\), since the twisted curvature condition from Proposition 3.2 sees only the first order terms of the generator. The scaling of \(\lambda _N\) relies on the result of Proposition 1.1, and we can see through its proof that it is not affected by \(\Delta T\). Therefore, the same scaling holds in the equilibrium case \(\Delta T=0\) as well.
(ii) Regarding the boundary conditions: Assumption (H1) is not necessary in order to obtain the existence of a spectral gap with a lower bound of order \(N^{-3}\). In fact, we have a spectral gap as soon as there is a solution to the matrix equation (1.10), which is the case when \(a\ge 0, c>0\) (see Proposition 3.1). Therefore, the proof of Proposition 1.1 still holds, with minor differences, when we consider the following (in a sense free) b.c. as well: \(q_0=q_1\), \(q_N=q_{N+1}\). Note that for the harmonic chain, this is also suggested by numerical simulations similar to Fig. 2. We work under assumption (H1) here in order to keep the presentation of Sect. 6 as simple as possible. This is because (H1) corresponds to the Discrete Laplacian B with Dirichlet b.c. (i.e. B is constant along the diagonal), giving us more symmetries, whereas the above-mentioned free b.c. correspond to the Discrete Laplacian B with Neumann b.c.
(iii) A comment on the choice of \(\Pi _N\): The curvature condition from Proposition 3.2 holds for any positive definite r.h.s. of (1.10). We choose \(\Pi _N\) specifically because we can then compare \(b_N\) to \(b_0\), the solution of (2.4) (\(b_0\) is the covariance matrix for the harmonic chain), and thereby bound \(\Vert b_N^{-1}\Vert _2\); see the end of the proof of Proposition 1.1.
(iv) Convergence to equilibrium in total variation norm for a similar small perturbation of the harmonic oscillator chain has been shown recently in [32]. There, a version of Harris’ ergodic Theorem was applied, making it possible to treat more general cases of the oscillator chain, including different kinds of noise. However, this is a non-quantitative version of Harris’ Theorem, which provides no information on the dependence of the convergence rate on N.
6 Estimates on the Spectral Norm of \(b_N\)
First, let us state the following Proposition on the optimal exponential rate of convergence for the purely harmonic chain.
Proposition 6.1
(Proposition 7.1 and 7.2 (3) in [7]) We write \(\lambda _N^H\) for the spectral gap of the dynamics whose evolution is described by the generator (1.8) without the perturbing potentials, i.e. the dynamics of the linear chain, and \(\rho :=\inf \{\text {Re}(\lambda ) : \lambda \in \sigma (M) \}\). We have
Moreover the spectral gap approaches 0 as N goes to infinity as follows:
for some constant C independent of N.
Proof
We exploit the results by Arnold and Erb in [1] or by Monmarché in [31, Proposition 13]: working with an operator of the form
if (i) no non-trivial subspace of \(\text {Ker} ({\mathcal {F}} \Theta )\) is invariant under M and (ii) the matrix M is positively stable, i.e. all its eigenvalues have strictly positive real part, then the associated semigroup has a unique invariant measure and, if \(\rho >0\), the exponential rate \(\lambda _N^H\) of the above Ornstein–Uhlenbeck process satisfies
for every \(\epsilon \in (0, \rho )\). Fix such an \(\epsilon >0\) and conclude the first statement of the Proposition. In particular, when m is the maximal dimension of the Jordan block of M corresponding to the eigenvalue \(\lambda \) such that \(\text {Re}(\lambda ) = \rho \), the quantity \((1+t^{2(m-1)})e^{-2\rho t}\) is the optimal one regarding the long time behaviour, [31]. This implies that the spectral gap of the generator is \(\rho -\epsilon \), whereas the constant in front of the exponential is
The harmonic chain satisfies the conditions (i) and (ii): the first condition is equivalent to the hypoellipticity of the operator L, [23, Sect. 1], and our generator (1.8) is indeed hypoelliptic: it is proven, [16, Sect. 3, p. 667] and [9, Sect. 3], for more general classes of potentials than the quadratic ones, that the generator satisfies the rank condition of Hörmander’s hypoellipticity Theorem, [24, Theorem 22.2.1]. Also the matrix M is stable for every N, i.e. \(\text {Re}(\lambda ) > 0\) for all the eigenvalues \(\lambda \), see [25, Lemma 5.1].
For the second conclusion of the Proposition, we recall that the matrix M is given by (1.9) and we write,
In the r.h.s. we have a sum of 2N (counting multiplicity) positive terms, since \(\inf \{\text {Re}(\lambda ) \} \) is strictly positive, [25, Lemma 5.1(2)]. Now note that \(\text {Tr}({\mathcal {F}})\) does not depend on the number of oscillators, so the r.h.s. of the above displayed equation is uniformly bounded in N. Since
we have that \(2N\inf \{\text {Re}(\lambda ): \lambda \in \sigma (M)\}\) is bounded asymptotically with N, which implies the second part of the statement. \(\square \)
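In compact form, the trace argument of the second part can be sketched as follows, under the assumption (ours, read off from the text) that \({\text {Tr}}(M)\) is controlled by \({\text {Tr}}({\mathcal {F}})\) and hence bounded by a constant C independent of N. Writing \(\lambda _1,\ldots ,\lambda _{2N}\) for the eigenvalues of the real matrix M, counted with multiplicity,
$$\begin{aligned} 2N \rho \le \sum _{j=1}^{2N} \text {Re}(\lambda _j) = {\text {Tr}}(M) \le C, \qquad \text {so that} \qquad \rho \le \frac{C}{2N}. \end{aligned}$$
The equality uses that the non-real eigenvalues of a real matrix come in conjugate pairs, so their imaginary parts cancel in the trace.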
Remark 6.2
B can be seen as the Schrödinger operator : \(B=-c\ \Delta ^N + \sum _{i=1}^N a \delta _i\) where \(c>0\), \( \Delta ^N \) is the Dirichlet Laplacian on \( l^2( \{1,\ldots ,N \})\) and \(\delta _i\) the projection on the i-th coordinate. We give the following definition for the (discrete) Laplacian on \( l^2( \{1,\ldots ,N \})\) with Dirichlet boundary conditions:
where \(L^{i,i+1}\) are uniquely determined by the quadratic form
We will use this information in the last part of the proof of Proposition 1.1, to bound the spectral norm of the inverse, \(\Vert b_N^{-1} \Vert _2\).
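The spectrum of such a B is explicit, which is what makes bounds on \(\Vert B\Vert _2\) and \(\Vert B^{-1}\Vert _2\) accessible. The sketch below assumes that B is the tridiagonal matrix with \(a+2c\) on the diagonal and \(-c\) on the two neighbouring diagonals (our reading of \(B=-c\,\Delta ^N + aI\) with Dirichlet b.c.; the definition (1.7) is not reproduced in this section), and checks the classical eigenpairs \(\lambda _k = a + 2c\,(1-\cos \frac{k\pi }{N+1})\), \(v_k(j) = \sin \frac{jk\pi }{N+1}\). All function names are illustrative.

```python
import math

def build_B(N, a, c):
    """Tridiagonal N x N matrix: a + 2c on the diagonal, -c next to it
    (assumed form of B = -c * Laplacian + a * I with Dirichlet b.c.)."""
    B = [[0.0] * N for _ in range(N)]
    for i in range(N):
        B[i][i] = a + 2.0 * c
        if i + 1 < N:
            B[i][i + 1] = B[i + 1][i] = -c
    return B

def dirichlet_eigenpair(N, a, c, k):
    """k-th eigenvalue and (unnormalised) eigenvector, for k = 1..N."""
    theta = k * math.pi / (N + 1)
    lam = a + 2.0 * c * (1.0 - math.cos(theta))
    vec = [math.sin(j * theta) for j in range(1, N + 1)]
    return lam, vec

def residual(B, lam, vec):
    """Largest entry of |B vec - lam vec|; zero up to rounding for an eigenpair."""
    N = len(vec)
    return max(
        abs(sum(B[i][j] * vec[j] for j in range(N)) - lam * vec[i])
        for i in range(N)
    )
```

Under this assumed form all eigenvalues lie in \((a, a+4c)\), so for \(a>0\) one gets \(\Vert B\Vert _2 \le a+4c\) and \(\Vert B^{-1}\Vert _2 \le 1/a\) uniformly in N.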
The rest of this section is devoted to the study of the solution of the matrix equation (1.10). Note that [35, 36] are two other works where a Lyapunov equation is solved explicitly in order to study thermal transport in harmonic chains of atoms. The right hand side of the equation in those two cases is much simpler, though, so it is easier to provide an analytical formula for the unique solution, as in [36].
Here we split the \(2N \times 2N\) dimensional problem into 4 equal-sized blocks of dimension \(N\times N\). Then we exploit all the information we get about each block from the following Lemma 6.3. In order to ease the readability of the proof we split it into several lemmas until the end of the section.
6.1 Matrix Equations on Lyapunov Equation
Lemma 6.3
For \(0 \le m \le N\), we have the following equations for the blocks \(x_m,y_m\) and \(z_m\) of the matrix \(b_m\):
Here \({\widetilde{J}}_m = {\text {diag}}(1,1,\ldots ,1,0,\ldots ,0, 1,1,\ldots ,1)\) where the 0’s start at \((m+1,m+1)\)-entry and stop at \((N-(m+1),N-(m+1))\)-entry, and
\(J_m^{(\Delta T)} = {\text {diag}}(2T_L,1,\ldots ,1,0,\ldots ,0,1,\ldots ,1,2T_R) \) where the 0’s start at \((m+2,m+2)\)-entry and stop at \((N-(m+2),N-(m+2))\)-entry.
Proof
We consider m s.t. \(0 \le m \le N\), where \(b_m\) solves
and where
From (6.7) and considering that \(x_m\) and \(y_m\) are symmetric matrices, we get
From that we get (6.2) and (6.3) directly, and also that:
and by applying (6.2) to (6.8) we get (6.4).
Also, using that \(x_m\) and \(y_m\) are required to be symmetric matrices, from the transposed version of (6.3), we get the equation
which, combined with (6.3), gives (6.5) for \(m\ge 1\) and (6.6) for \(m=0\). \(\square \)
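For small sizes, block identities of this kind can be cross-checked numerically by solving the full Lyapunov equation as one linear system. The following is a minimal sketch, assuming the equation has the generic form \(bM + M^Tb = \Pi \) (the precise form of (6.7) and of M is not reproduced here, so the convention is an assumption) and using a hypothetical \(2\times 2\) toy matrix rather than the chain's M.

```python
def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def solve_lyapunov(M, Pi):
    """Solve b M + M^T b = Pi for b, with b flattened row by row."""
    n = len(M)
    A = [[0.0] * (n * n) for _ in range(n * n)]
    rhs = [Pi[i][j] for i in range(n) for j in range(n)]
    for i in range(n):
        for j in range(n):
            row = i * n + j
            for k in range(n):
                A[row][i * n + k] += M[k][j]  # (b M)_{ij} = sum_k b_{ik} M_{kj}
                A[row][k * n + j] += M[k][i]  # (M^T b)_{ij} = sum_k M_{ki} b_{kj}
    x = solve_linear(A, rhs)
    return [[x[i * n + j] for j in range(n)] for i in range(n)]

# Hypothetical toy data: eigenvalues of M_toy have real part 1/2 > 0.
M_toy = [[1.0, -1.0], [1.0, 0.0]]
Pi_toy = [[1.0, 0.0], [0.0, 1.0]]
b_toy = solve_lyapunov(M_toy, Pi_toy)
```

For this toy example the solution can be found by hand, \(b = \begin{pmatrix} 1 &{} -1/2 \\ -1/2 &{} 3/2 \end{pmatrix}\), which is symmetric and positive definite. The linear system is uniquely solvable whenever no two eigenvalues of M sum to zero, in particular when M is positively stable.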
From now on, we perform all the calculations when the dimension of the block matrices, N, is odd. The same calculations with minor differences hold when N is even as well.
6.2 Calculations for \(m=0,1,2\)
Before we start analysing the form of the block \(z_N\), we first present how each unit in the right hand side of the Lyapunov equation (6.7) for \(0 \le m \le N\) (that corresponds to the spread of noise on the system), affects the \(z_m\) block of the solution \(b_m\).
This subsection is only meant to make it easier for the reader to follow how perturbing the r.h.s. of the Lyapunov equation affects the solution at each successive step. In the next subsection we analyse the \(z_N\) block (\(m=N\)), which is what we are interested in. Thus, the reader who is interested only in the proofs, and not in the motivation behind them, may skip this subsection.
For \(m=0\): The unique solution \(b_0\) of
has been computed in [35], where the entries of \(z_0 := (z_{ij}^{(0)})_{1\le i,j \le N}\) are found explicitly for \(a=0,c=1\) to be
for a constant \(\alpha \) such that \( {\text {cosh}} (\alpha ) = 1+\frac{1}{2\gamma }\). (The computation follows [43, Sect. 11], which treats the case \(\Delta T =0\).) Here we describe the steps briefly: first we notice that \(z_0\) is antisymmetric since in (6.2) \(J_0^{(0)}=0\), and second, by (6.4) we get that it has a Toeplitz-form
Indeed note that the r.h.s of (6.4) forms a bordered matrix
i.e. only the bordered elements are non zero and so the l.h.s of (6.4) should also have this bordered form. Due to the tridiagonal form of B we get a Toeplitz matrix: in particular using that \(B= -c \Delta ^N + a I\), the l.h.s of (6.4) is
and equating the non-boundary entries, due to the symmetry of \(\Delta ^N\) and the antisymmetry of \(z_0\), we have that the elements of \(z_0\) will be constant along the diagonals: indeed, for \(1<i<N\), for the diagonal’s entries of the Eq. (6.11) we have
For the superdiagonal’s entries of the Eq. (6.11)
We repeat these calculations through all the non-boundary entries of the matrix and, using the information we get from each calculation, we end up with the Toeplitz form of \(z_0\) in (6.10).
We can now see that a solution to (6.6) is a symmetric Hankel matrix which is antisymmetric about the cross diagonal and such that \((y_{1,j}^{(0)})_{j=1}^{N-1}= z_{1,j+1}^{(0)}.\) Then we apply (6.3) to get a formula for the entries of \(x_0\) and from the bordered entries of \(x_0\) from (6.4), we end up with the linear equation
Here \(\underline{z_0},\ e_1 \in {\mathbb {C}}^{N-1}\) are the vectors \(\underline{z_0}=(z_{1,1}^{(0)}, \ldots , z_{1,N-1}^{(0)})^T\), \(e_1=(1,0,\ldots ,0)^T\) and \(K_0\) is an \((N-1) \times (N-1)\) symmetric Jacobi matrix whose entries depend on the (dimensionless) friction constant \(\gamma \) and interaction constant c:
We solve the above equation using, for example, Cramer’s rule and find an explicit formula for the \(z_{1,j}^{(0)}\)’s: the recurrence satisfied by the determinant of \(K_0\) is the same as that of the Chebyshev polynomials of the second kind, so using properties of these polynomials and imposing appropriate initial conditions we end up with the form (6.9).
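The link between tridiagonal determinants and Chebyshev polynomials can be seen on the simplest case. The sketch below does not use \(K_0\) itself (its entries are not reproduced here and may differ at the corners) but the plain tridiagonal Toeplitz matrix with 2x on the diagonal and 1 on both off-diagonals: expanding along the last row gives \(D_n = 2x D_{n-1} - D_{n-2}\), the recurrence of \(U_n\), and for \(x = \cosh \alpha \) one has \(U_n(\cosh \alpha ) = \sinh ((n+1)\alpha )/\sinh \alpha \). The function names are ours.

```python
import math

def tridiag_det(n, x):
    """Determinant of the n x n tridiagonal Toeplitz matrix with 2x on the
    diagonal and 1 on both off-diagonals, via D_k = 2x D_{k-1} - D_{k-2}."""
    prev, cur = 1.0, 2.0 * x  # D_0 and D_1
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, 2.0 * x * cur - prev
    return cur

def chebyshev_u_cosh(n, alpha):
    """U_n(cosh(alpha)) = sinh((n + 1) alpha) / sinh(alpha), for alpha > 0."""
    return math.sinh((n + 1) * alpha) / math.sinh(alpha)
```

With \(\gamma = 1\), the relation \(\cosh (\alpha ) = 1 + \frac{1}{2\gamma }\) from the text gives \(x = 1.5\), for which the two functions agree up to rounding; this is how hyperbolic functions of \(\alpha \) enter formulas of the type (6.9).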
For \(m\ge 1\) we use again the Eq. (6.4). In the first step we get that:
For \(m=1\), i.e. for the form of the \(z_1\)-block in \(b_1\), the elements \(z_{1,1}^{(1)}, z_{N,N}^{(1)}\) in the main diagonal are \(-1/2\). The difference with the \(m=0\) step is that \(z_1\) is not antisymmetric anymore, since 1/2 is added in the first entry of the diagonal (due to the form of \({\widetilde{J}}_1\)). So from (6.2) we write
But we still have the bordered form in the r.h.s. of (6.4), so we still have a Toeplitz-form for \(z_1\).
In the next Lemma we give the form of the \(z_2\) block of \(b_2\).
Lemma 6.4
(For \(m=2\), form of \(z_2\)) For the \(z_2\)-block of \(b_2\), there exists an antisymmetric matrix \(z_2^{anti}\) such that \( z_2 = z_2^{anti} - {\widetilde{J}}_2 \) and
The last property is that the Toeplitz form is not perturbed in more than 2 diagonals away from the centre.
So we denote by \(\mu _{a,c}:= \frac{1+a+2c}{4c}\) and we write:
Proof of Lemma 6.4
\(z_2 \) is not antisymmetric but from (6.2) we immediately have that \( z_2=z_2^{anti}-{\widetilde{J}}_2\), where \(z_2^{anti}\) is antisymmetric. So we work with \(z_2^{anti}\) and, due to the antisymmetry, we look only at the upper triangular part of the matrix.
Here, besides that \(z_2\) is not antisymmetric, the r.h.s of (6.4) is not a bordered matrix anymore and also the matrix \(B {\widetilde{J}}_2\) affects non boundary entries as well, in particular it adds the \((3 \times 2)\) top-left and bottom-right submatrices of B to the \((3 \times 2) \) respective submatrices of \(z_2\):
Equating the entries that correspond to the zero-submatrix as drawn above we will have the same calculations as in the step \(m=0\).
From (6.2) we have \(z_{1,1}^{(2)}=z_{2,2}^{(2)}=z_{N,N}^{(2)}=z_{N-1,N-1}^{(2)}=-1/2\) and \(z_{i,i}^{(2)}=0\) for \(N-1>i>2\).
Looking at the (2, 2)-entry and the (2, 3)-entry of the equation (6.12) we have respectively
and since \(z_{i,j}^{(2)}=-z_{j,i}^{(2)}\) for \(j \ne i\) from (6.2), and also \(z_{2,2}^{(2)}=-1/2, z_{3,3}^{(2)}=0\), we get
Now looking at the entries (i, i) for \(3 \le i\le N-2\) of equation (6.12), we write (as in the 0-step):
which gives
In particular
where the second equations in both lines are proved by looking at the reversed direction (bottom-right to top-left side of the matrix). Also for \(k \ge 2\) and \( 1 \le i \le N-k\), look at \((i,i+k)\) entry of the Eq. (6.12) and get
This corresponds to the Toeplitz property that holds for all the diagonals apart from the 5 central ones. Remember that for \(m=0\) we end up with a Toeplitz matrix. \(\square \)
In the m-th step of the sequence of these matrix equations, for the \(z_m\)- block of \(b_m\), the central \((4m-3)\) diagonals have a perturbed Toeplitz form: the elements across these diagonals on each line are changed by constants that depend on the coefficients a, c.
The resulting matrix \(z_m\) is described in the following way, where \(\mu _{a,c}:= \frac{1+a+2c}{4c}\):
The explanation is the same as in the step \(m=2\) but this holds for an arbitrary \(m\le N\).
6.3 Preliminaries: Compute the Blocks \(z_N\), \(y_N, x_N\) of \(b_N\)
Lemma 6.5
(Form of \(z_N\) block) The matrix \(z_N:= (z_{i,j}^{(N)})_{1 \le i,j \le N}\) is a real \(N \times N\) matrix of the form
where \(z_N^{anti}= [z_{i,j}^{(N),anti}]\) is an antisymmetric matrix. We denote by \(\mu _{a,c} := \frac{1+a+2c}{2c}\). \(z_N\) has the following perturbed Toeplitz form: for \(2 \le i\le N-k\) and \(1 \le k \le N-2\),
and for the second and second-to-last line respectively:
Regarding the ’cross-diagonal’ we have, for \(1 \le k \le N-2\),
In particular,
This corresponds to the relation of the first row with the last row of the matrix.
From the above Lemma we conclude that \(z_N\) can be written in the general form
where we write \({\overline{J}}\) for the square matrix with 1’s in the superdiagonal and \({\underline{J}}\) for the matrix with 1’s in the subdiagonal.
Also \({\overline{\iota }}_k \) for the matrix with 1 in the \((k,k+1)\)- entry and \({\underline{\iota }}_{-k}\) for the matrix with \(-1\) in the \((k+1,k)\)-entry. So for example
For a visualisation:
Proof of Lemma 6.5
From (6.2) we have
where \(z_N^{anti}\) is an antisymmetric matrix. So in order to find the form of \(z_N\) we only need to study \(z_N^{anti}\) and, due to its antisymmetry, only its upper triangular part.
We look at the non-bordered entries of the upper triangular part of (6.4). That is the equation
Looking at the diagonal’s entries (i, i) for \(1<i<N\) of the above Eq. (6.18), we write
and using the antisymmetry of the elements of \(z_N^{anti}\), it gives
Therefore, inductively we get
At the same time, looking from bottom-right to top-left, we can write
Then, looking at the super-diagonal’s entries, i.e. the \((i,i+1)\)-entry, for \(1< i < N-1\), of Eq. (6.18), we write
and that gives
and at the same time (reversed direction, i.e. from bottom right to top left)
Similarly, looking at the entries \((i,i+2)\) for \(1< i < N-2\):
Apply (6.19) twice: \( z_{i+1,i+2}^{(N),anti} = z_{1,2}^{(N),anti} - i\mu _{a,c}\) and \( -z_{i,i+1}^{(N),anti} = -z_{1,2}^{(N),anti}+ (i-1)\mu _{a,c}\) and get
So inductively,
Also, from the reversed direction we get inductively
For the general case, as stated in the Lemma, we argue by induction on k. For \(k=1,2,3\) the claim holds by the above calculations. We treat k odd. Assuming the claim holds for \(k-2\), we look at the \((i,i+k-1)\)-entry of Eq. (6.18): for \(1<i<N-(k-1)\),
Then from the induction hypothesis we end up with (6.13). The case k even follows similarly.
Now generalise the previous induction formulas for k odd for example and write:
and from the reversed direction
From these two equations we have the specific case (6.16). k even is proven similarly. For (6.15) we write for k odd:
where in the last line we applied (6.16). The case k even is proven in the same way. \(\square \)
The above discussion shows that in order to understand the entries of \(z_N\), we need only to understand the vector \(\underline{z_N} = (z_{1,2}^{(N)}, z_{1,3}^{(N)},\ldots , z_{1,N}^{(N)})\).
We state now a Lemma that shows the relation between the elements of \(\underline{z_N}\) and the entries of the first row and the last column of \(x_N=[x_{i,j}^{(N)}]\), concluding a relation between \(x_{1,j}^{(N)}\) and \(x_{i,N}^{(N)}\) about the ’cross diagonal’.
Lemma 6.6
For \(3 \le k \le N\),
and \(z_{1,2}^{(N),anti} = \frac{\gamma }{c}x_{1,1}^{(N)} - \frac{T_L+a+2c}{2c}\) and so for \(3 \le k \le N\)
Also \(x_{1,N}^{(N)} = \frac{c}{2\gamma }\mu _{a,c}\), where \(\mu _{a,c} := \frac{1+a+2c}{2c}\).
Proof
We look at the bordered entries of Eq. (6.4). Let us first look at (N, j)-entry for j even:
Using Lemma 6.5 we write
and after the obvious cancellations we have for j even
Similarly for j odd we have
Moreover, with exactly the same calculations, but looking at the (1, j)-entry of Eq. (6.4) we get, for \(2 \le j \le N-1\),
Now set \(k:=N-j+2\), so that \(3 \le k \le N\). Since N is odd, whenever j is odd, k is even, and vice versa. Solving the Eqs. (6.24) and (6.23) for \(z_{1,k}^{(N),anti}\), we get the second equations in (6.21), whereas solving (6.25) with \(\lambda := j+1\) for \(z_{1,\lambda }^{(N),anti}\), we get the first equations in (6.21) as well. We conclude with (6.22) just by combining the above relations in both cases.
Finally to get this specific value for \(x_{1,N}^{(N)}\) we look at the (1, N)-entry of Eq. (6.4) and perform the same calculations as above. \(\square \)
Considering the above Lemma we can write the matrix \(z_N\) also as follows:
where \(\kappa _L := \frac{T_L+a+2c}{2c}\) and \(\kappa _R := \frac{T_R+a+2c}{2c}\).
In the following we state a Lemma about the symmetries that hold in \(y_N\)-block of \(b_N\), concluding that all the entries of \(y_N\) can be written in terms of the vectors \(\underline{y_N}:= (y_{1,N}^{(N)},y_{1,N-1}^{(N)}, \ldots , y_{1,1}^{(N)})\) and \(\underline{z_N}\).
Lemma 6.7
For \(2 \le i \le N-(k+1)\) and \(1 \le k \le N-3\),
Proof
By the symmetry of \(y_N\) it is enough to look at the upper triangular part. We look at the entries \((i,i+k)\) of Eq. (6.5). For \(k=1\) we have
which is the Eq. (6.26). For \(1<k< N-1\) we prove it by induction in k, like in the proof of Lemma 6.5. Let us now look at the (1, N)- entry of (6.5):
which gives \(y_{2,N}^{(N)} = y_{1,N-1}^{(N)} + \frac{2\gamma }{c}z_{1,N}^{(N)}.\) For (6.27) we look at (1, k)- entry:
which is
and this is the desired equation. For (6.28), we look at \((k-1,N)\)- entry of (6.5) for \(k\ge 3\). Performing the same calculations as above we get
Then using the relations (6.26) and (6.27) for each of the terms above, we get the stated relation. \(\square \)
With the result of the following Lemma we relate the entries of \(\underline{y_N}\) with the entries of \(\underline{z_N}\).
Lemma 6.8
Let B be the matrix (1.7). We have
where \(\underline{{\tilde{z}}_N}\) is the vector
where \(\mu _{a,c} := \frac{1+a+2c}{2c}\). In particular:
Proof
We combine the information on the \(x_{1i}\)’s that we get from two equations: first from (6.3), which we recall is
and second from the bordered entries of (6.4), which is
We look at the element \(x_{1,N}^{(N)}\) and we write:
and
which give
Moreover
and from the proof of Lemma 6.6, see relation (6.25), we have
Both of them give
In general using again Lemma 6.7 and relation (6.25), we have
For \(x_{1,1}^{(N)}\) we use that
from Lemma 6.6, and from (6.3),
Putting the above relations in a more compact form we have
We end up with (6.30) considering that \(\Vert B^{-1}\Vert _2\) is uniformly (in N) bounded, since B has bounded spectral gap. \(\square \)
The following Lemma shows, through its proof, that there is a unique solution to the Lyapunov matrix equation (since one can explicitly find the entries of \(\underline{z_N}\), which determine all the rest) and gives the scaling in N of the entries of \(\underline{z_N}\). For \(1 \le k \le N-2\), using all the information we have from the block equations in Lemma 6.3, we write all the \(z_{1,N-k}^{(N),anti}\) in terms of \(z_{1,N}^{(N),anti}\), which we then calculate explicitly.
Lemma 6.9
For \(1 \le k \le N-2\), the order of the entries of \(\underline{z_N}\) is given by
and \(z_{1,N}^{(N),anti} = {\mathcal {O}}\left( R^{1-N} \left( \frac{ \kappa _R-\kappa _L}{2\gamma } \right) \right) \), where \(R:= \frac{c}{\gamma ^2} + \frac{a+2c}{c}\) and \(\mu _{a,c}:= \frac{1+a+2c}{2c}\). Therefore
where \(\Delta T\) is the temperature difference at the ends of the chain.
Proof
We look at the equations around \(x_{k,N}^{(N)}\) for \(2 \le k \le N\). First we look at \(x_{2,N}^{(N)}\) and from (6.23) we have
while from the (2, N)-entry of (6.3) we have
Combine them and get
Then we look at \(x_{3,N}^{(N)}\): from (6.24) we have
while from the (3, N)-entry of (6.3) we have similarly
Combine them and get
Then considering (6.32) as well, we have
In the same manner, but looking around \(x_{4,N}^{(N)}\) and \(x_{5,N}^{(N)}\), we get
respectively. Inductively, we have a way to write all the elements of \(\underline{z_N}\) in terms of \(z_{1,N}^{(N),anti}\), and looking at the leading order in terms of N we have the general formula (6.31) for \(1 \le k \le N-2\). In particular, for \(k=N-3\) (which is even, since N is odd) and \(k=N-2\) (odd):
respectively. Moreover, by looking at \(x_{N,N}^{(N)}\) combining (6.3) and (6.4) we have
Plugging in the above equation the relations from (6.35), we write
We conclude the last statement by combining the above estimate on \(z_{1,N}^{(N),anti}\) with (6.31). \(\square \)
Now we estimate the entries \(\underline{y_N}\): from (6.30) and Lemma 6.9,
This gives that
and then also, since \( y_{k,N}^{(N)}= \frac{\gamma }{c}(z_{k-1,N}^{(N)} + z_{1,N-(k-2)}^{(N)})+ y_{1,N-(k-1)}^{(N)} \),
Lemma 6.10
(Estimate on the spectral norm of \(y_N\)) For the spectral norm of \(y_N\) we have that
Proof
Let \(v=(v_1,v_2,\ldots ,v_N) \in {\mathbb {C}}^{N}\). We write \(L_i\) for the i-th row of the matrix \(y_N\) and then calculate
We estimate the terms due to the first half of the matrix, i.e. the terms until \( L_{\lfloor \frac{N}{2}\rfloor +1} \cdot v \): from Lemma 6.7 we write all the \(y_{i,j}^{(N)}\)’s in terms of the entries of \(\underline{y_N}\) and \(\underline{z_N}\) that, due to the observations above, scale at most like N. In particular for the second line
and more generally
Then, from (6.36):
So the highest order is due to \(\left| L_{\lfloor \frac{N}{2}\rfloor +1} \cdot v \right| ^2\) for which we estimate
The factors \((2i-1)\) in the sum above count the entries of \(\underline{y_N}, \underline{z_N}\) in terms of which each \(y_{i,j}^{(N)}\) is expressed.
Regarding the terms due to the second half of the matrix, we use again Lemma 6.7, Eq. (6.26). This way we write the elements \(y_{i,j}^{(N)}\)’s in terms of \(y_{N,j}^{(N)}\)’s and then from relation (6.28), we have all the \(y_{i,j}^{(N)}\)’s in terms of the entries of \(\underline{y_N}\) and \(\underline{z_N}\), that scale at most like N. So in the end we have
Then
Before we finish the proof, we give more details on the estimates (6.38) above:
For the first inequality we apply iteratively Lemma 6.7. Regarding the row \(L_2\):
So \(y_{2,2}^{(N)}\) is given by the sum of 3 terms whose absolute value is of order not more than \({\mathcal {O}}(N)\). The same holds (from Lemma 6.7) for each \(y_{2,j}^{(N)}\) for \(j \le N-2\), i.e. until we reach the ’cross-diagonal’. After the ’cross-diagonal’: \(y_{2,N}^{(N)}= y_{1,N-1}^{(N)} + \frac{2\gamma }{c}z_{1,N}^{(N)}\), and \(|y_{1,N-1}^{(N)}|, |z_{1,N}^{(N)}|\) have order less than N.
Regarding the row \(L_3\):
is given by the sum of 3 terms whose absolute value has order less than N, while for \(y_{3,3}^{(N)}\), by applying Lemma 6.7 twice, i.e. until we end up only with elements of \(\underline{y_N}\) and \(\underline{z_N}\), we get
So \(y_{3,3}^{(N)}\) is given by the sum of 5 terms whose absolute value has order less than N. For \(y_{3,j}^{(N)}\), \(j \le N-2\) (until the ’cross-diagonal’), apply Lemma 6.7 twice: the value of \(y_{3,j}^{(N)}\) is given by the sum of 5 such terms, while for \(N-1 \le j \le N\),
and so they are given by 3 terms with absolute value of order at most N.
In general, the same holds for the row \(L_i\), \(i \le \lfloor \frac{N}{2}\rfloor +1\) from applications of Lemma 6.7 inductively. For all \(y_{i,j}^{(N)}\) we apply Lemma 6.7 until we have written each \(y_{i,j}^{(N)}\) only in terms of entries of \(\underline{y_N}\) and \(\underline{z_N}\).
For \(j \le i \), i.e. until the main diagonal, \(y_{i,j}^{(N)} \) is given by the sum of \(\nu \) terms, whose order is less than N, and
For that we apply Lemma 6.7 and write
This formula gives that \(y_{i,j}^{(N)}\) is the sum of \((2j-1)\) terms whose absolute value has order less than \({\mathcal {O}}(N)\).
The same holds for \(j > N-(i-1)\), i.e. after the ’cross-diagonal’, considering also (6.37). As for the remaining terms in \(L_i\), for \(i \le j\le N-(i-1)\): \(y_{i,j}^{(N)}\) is given by the sum of \((2i-1)\) terms whose order is less than \({\mathcal {O}}(N)\). \(\square \)
Now, from (6.3) we can see that the entries of \(x_N\) can be written in terms of entries of \(z_N\) as well:
where \(\beta _{ij}\) are the elements of the matrix B, (1.7), and the entries of \(y_N\) are split into two sums regarding their position about the cross diagonal.
We write
Proof of Proposition 1.1
We are ready now to bound from above \(\Vert b_N\Vert _2\). We write for some positive constant \(C_{a,c}^1\)
where for the first inequality: since \(b_N\) is positive definite, decomposing \(b_N\) in its square root matrices:
And since \(X^{*}X\) and \(XX^{*}\) are unitarily congruent and the same holds for \(Y^{*}Y\) and \(YY^{*}\) (from polar decomposition for example), there are unitary matrices U, \(V \in {\mathbb {C}}^{N \times N}\) so that:
Then it is clear that for the spectral norm (which is unitarily invariant):
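The first inequality can be checked numerically on a small example. The following is a minimal sketch, assuming (as the argument above suggests) that the inequality in question is the standard bound \(\Vert b\Vert _2 \le \Vert x\Vert _2 + \Vert y\Vert _2\) for a positive semidefinite block matrix b with diagonal blocks x and y. The 4×4 test matrix and all function names are illustrative and not taken from the paper.

```python
import math

def sym2_norm(m):
    """Spectral norm of a symmetric positive semidefinite 2x2 matrix (closed form)."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return 0.5 * (tr + math.sqrt(max(tr * tr - 4.0 * det, 0.0)))

def psd_norm(m, iters=2000):
    """Largest eigenvalue of a symmetric PSD matrix, estimated by power iteration.

    The returned Rayleigh quotient never exceeds the true largest eigenvalue.
    """
    n = len(m)
    v = [1.0 + 0.1 * i for i in range(n)]  # generic positive start vector
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(t * t for t in w))
        v = [t / norm for t in w]
    mv = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * mv[i] for i in range(n))

# Hypothetical test data: b = G^T G is positive definite by construction,
# since G is upper triangular with a nonzero diagonal.
G = [[2.0, 1.0, 0.5, 0.0],
     [0.0, 1.5, 1.0, 0.3],
     [0.0, 0.0, 1.0, 0.7],
     [0.0, 0.0, 0.0, 0.5]]
b = [[sum(G[k][i] * G[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

x = [row[:2] for row in b[:2]]   # upper-left diagonal block
y = [row[2:] for row in b[2:]]   # lower-right diagonal block

norm_b = psd_norm(b)
bound = sym2_norm(x) + sym2_norm(y)
print("||b||_2 ~", norm_b, " ||x||_2 + ||y||_2 =", bound)
```

The off-diagonal block never enters the bound, which is why only the norms of the diagonal blocks need to be estimated in the proof.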
Regarding the last part of the statement that \(\Vert b_N^{-1}\Vert _2\) is bounded from above: Let us first state some facts about the spectrum of the matrix \(b_0\) that solves
It is known that \(b_0\) is the covariance matrix that determines the stationary solution of the Liouville equation in the harmonic chain (and it has been found explicitly in [35], see a description of their approach in the beginning of the proof of Lemma 6.5). From [25, Lemma 5.1], we know that \(b_0\) is bounded below and above:
Thus \(\Vert b_0\Vert _2 \) and \(\Vert b_0^{-1} \Vert _2\) are uniformly bounded in terms of N: from Remark 6.2 we write \(B=-c\ \Delta ^N + \sum _{i=1}^N \alpha \delta _i\). Even though here we will only use that \(\Vert b_0^{-1} \Vert _2\) is finite, in fact when \(a>0\), B possesses a spectral gap uniformly in N. Moreover, \(b_N \ge b_0\): since \( \Pi _N > {\tilde{\Theta }}\), for every \(t>0\),
and since \(-M\) is stable (all the characteristic roots have negative real part) we have
So \(b_N^{-1} \le b_0^{-1}\), and hence \( \Vert b_{ N}^{-1} \Vert _2 \le \Vert b_0^{-1} \Vert _2 \), which is bounded by a finite constant (because of the spectrum of the discrete Laplacian). Therefore there exists a positive and finite constant \(C_{a,c}^2\) such that \(\Vert b_N^{-1}\Vert _2 \le C_{a,c}^2\). We conclude the Proposition by taking \(C_{a,c}:= \text {max}(C_{a,c}^1, C_{a,c}^2)\), so that both upper bounds hold with the same constant. \(\square \)
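The comparison \(b_N \ge b_0\) used in the proof above can be spelled out as a short worked computation. This is a sketch under an assumption that is not restated in this part of the text: both matrices solve Lyapunov equations of the form \(bM + M^{T}b = \Pi \), with right-hand sides \(\Pi _N\) and \({\tilde{\Theta }}\) respectively. If the convention has M and \(M^{T}\) interchanged, the roles of \(e^{-tM}\) and \(e^{-tM^{T}}\) swap accordingly.

```latex
% Assumed form of the Lyapunov equations (convention not restated here):
%   b_N M + M^{T} b_N = \Pi_N , \qquad b_0 M + M^{T} b_0 = \tilde{\Theta}.
% Since -M is stable, each solution has the integral representation
%   b = \int_0^\infty e^{-tM^{T}} \, \Pi \, e^{-tM} \, dt ,
% which follows by integrating d/dt ( e^{-tM^{T}} \Pi e^{-tM} ) over (0, \infty).
\begin{align*}
b_N - b_0
  &= \int_0^\infty e^{-tM^{T}} \bigl(\Pi_N - \tilde{\Theta}\bigr) e^{-tM} \, dt \;\ge\; 0, \\
0 < b_0 \le b_N
  &\;\Longrightarrow\; b_N^{-1} \le b_0^{-1}
  \;\Longrightarrow\; \Vert b_N^{-1}\Vert_2 \le \Vert b_0^{-1}\Vert_2 .
\end{align*}
```

The second line uses that matrix inversion reverses the order on positive definite matrices, and that the spectral norm of a positive definite matrix is its largest eigenvalue.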
To sum up: for the homogeneous weakly anharmonic chain, the method described in Sect. 3 with the modified Bakry–Emery criterion gives a lower bound on the spectral gap that is of order \(N^{-3}\) (see the exponential rate in the main Theorems). For the purely harmonic chain, since we know from Proposition 6.1 that the spectral gap always decays with N, this lower bound shows that it cannot decay at an exponential rate in N: the decay is at most polynomial.
In the next Proposition, exploiting the estimates on \(\Vert b_N\Vert _2\) from the above matrix analysis, we get alternatively the lower bound on the spectral gap of the harmonic chain.
Proof of Proposition 1.2
We recall that \(\Vert b_{ N} \Vert _2 \le C_{a,c} N^3 \) by Proposition 1.1 and that the spectral gap divided by \(\inf \{\text {Re}(\mu ) : \mu \in \sigma (M) \}\) is bounded below and above in terms of N, by Proposition 6.1. From [19, 39, Inequality (13)], we have an estimate for the decay of \(e^{-Mt}\):
So, letting u be the (normalised) eigenvector corresponding to an eigenvalue of M, \(\mu >0\), we write
and therefore we write \( -2 \text {Re}(\mu ) \le - \frac{1}{\Vert b_{N}\Vert } \) which means
Taking the infimum over the real parts of the eigenvalues of M, we conclude that
\(\square \)
Altogether, the analysis in this note shows that the spectral gap of the homogeneous harmonic chain scales between \(N^{-3}\) and \(N^{-1}\). In [7, Proposition 9.1] it is proven that this lower bound is the sharp one, i.e. an upper bound of order \(N^{-3}\) is provided.
A simple numerical simulation in Matlab of the spectral gap of the matrix M confirms that the true order is indeed \(N^{-3}\). In particular, calculating the smallest real part among the eigenvalues of the matrix M and multiplying the result by \(N^3\), we get the behaviour shown in Fig. 2: the rescaled spectral gap converges for large N.
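A numerical experiment of this kind can be sketched in a few lines. The version below is not the Matlab script behind Fig. 2 but an illustrative, dependency-free Python variant under stated assumptions: M is taken in the block form \([[\gamma F, -I],[B, 0]]\) with \(F=\text {diag}(1,0,\ldots ,0,1)\) and B tridiagonal, and the parameter values \(a=c=\gamma =1\) are hypothetical. Instead of calling an eigenvalue solver, it estimates \(\min \text {Re}(\mu )\) from the decay rate of \(\Vert e^{-tM}\Vert \) at a large time t, obtained by repeated squaring. All function names are made up for this sketch.

```python
import math

def matmul(x, y):
    """Product of two square matrices stored as lists of rows."""
    cols = list(zip(*y))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in x]

def drift_matrix(n, a=1.0, c=1.0, gamma=1.0):
    """Assumed block form M = [[gamma*F, -I], [B, 0]] with F = diag(1,0,...,0,1)
    and B tridiagonal (a + 2c on the diagonal, -c off the diagonal); needs n >= 2."""
    size = 2 * n
    m = [[0.0] * size for _ in range(size)]
    m[0][0] = gamma              # friction on the first particle
    m[n - 1][n - 1] = gamma      # friction on the last particle
    for i in range(n):
        m[i][n + i] = -1.0
        m[n + i][i] = a + 2.0 * c
        if i > 0:
            m[n + i][i - 1] = -c
        if i < n - 1:
            m[n + i][i + 1] = -c
    return m

def exp_neg(m, h, terms=30):
    """Taylor approximation of exp(-h*m); accurate when h*||m|| is below 1."""
    size = len(m)
    step = [[-h * v for v in row] for row in m]
    total = [[float(i == j) for j in range(size)] for i in range(size)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, step)]
        total = [[s + t for s, t in zip(rs, rt)] for rs, rt in zip(total, term)]
    return total

def spectral_gap(n, a=1.0, c=1.0, gamma=1.0, h=0.0625, squarings=24):
    """Estimate min Re(mu) over the spectrum of M as -log||exp(-tM)||_F / t, t large."""
    e = exp_neg(drift_matrix(n, a, c, gamma), h)
    log_scale, t = 0.0, h
    for _ in range(squarings):
        e = matmul(e, e)                         # exp(-tM) -> exp(-2tM), up to the stored scale
        log_scale, t = 2.0 * log_scale, 2.0 * t
        s = math.sqrt(sum(v * v for row in e for v in row))
        e = [[v / s for v in row] for row in e]  # renormalise to avoid underflow
        log_scale += math.log(s)
    return -log_scale / t

if __name__ == "__main__":
    for n in (4, 8, 16):
        gap = spectral_gap(n)
        print(n, gap, gap * n ** 3)
```

The estimate carries an error of order 1/t, which is negligible here since t is about \(10^6\). With a linear algebra library available, one would more naturally compute the eigenvalues of M directly and take the smallest real part, as described in the text.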
Notes
The other one considered for studying the N-dependence of the energy flux was first introduced by Rubin-Greer [37], where the heat baths are semi-infinite chains distributed according to Gibbs equilibrium measures of temperatures \(T_L, T_R\) (free boundaries). In both [10] and [37] the purpose was to study the heat flux behaviour in disordered harmonic chains.
References
Arnold, A., Erb, J.: Sharp entropy decay for hypocoercive and non-symmetric Fokker–Planck equations with linear drift. arXiv:1409.5425
Aoki, K., Lukkarinen, J., Spohn, H.: Energy transport in weakly anharmonic chains. J. Stat. Phys. 124(5), 1105–1129 (2006)
Bakry, D.: Functional inequalities for Markov semigroups. In: Probability Measures on Groups: Recent Directions and Trends, pp. 91–147. Tata Inst. Fund. Res., Mumbai (2006)
Bakry, D., Émery, M.: Diffusions hypercontractives. In: Séminaire de probabilités, XIX, 1983/84, volume 1123 of Lecture Notes in Math., pp. 177–206. Springer, Berlin (1985)
Baudoin, F.: Wasserstein contraction properties for hypoelliptic diffusions. arXiv:1602.04177
Baudoin, F.: Bakry-Émery meet Villani. J. Funct. Anal. 273(7), 2275–2291 (2017)
Becker, S., Menegaki, A.: Spectral gap in O(n)-model and chain of oscillators using Schrödinger operators. arXiv:1909.12241
Bonetto, F., Lebowitz, J.L., Rey-Bellet, L.: Fourier’s law: a challenge to theorists. In: Mathematical Physics 2000, pp. 128–150. Imp. Coll. Press, London (2000)
Carmona, P.: Existence and uniqueness of an invariant measure for a chain of oscillators in contact with two heat baths. Stoch. Process. Appl. 117(8), 1076–1092 (2007)
Casher, A., Lebowitz, J.L.: Heat flow in regular and disordered harmonic chains. J. Math. Phys. 12, 1701 (1971)
Cuneo, N., Eckmann, J.-P., Poquet, C.: Non-equilibrium steady state and subgeometric ergodicity for a chain of three coupled rotors. Nonlinearity 28(7), 2397–2421 (2015)
Cuneo, N., Eckmann, J.-P., Hairer, M., Rey-Bellet, L.: Non-equilibrium steady states for networks of oscillators. Electron. J. Probab. 23, 28 (2018)
Cuneo, N., Poquet, C.: On the relaxation rate of short chains of rotors interacting with Langevin thermostats. Electron. Commun. Probab. 22, 35 (2017)
Dhar, A.: Heat transport in low-dimensional systems. Adv. Phys. 57, 08 (2008)
Eckmann, J.-P., Pillet, C.-A., Rey-Bellet, L.: Entropy production in nonlinear, thermally driven Hamiltonian systems. J. Stat. Phys. 95(1–2), 305–331 (1999)
Eckmann, J.-P., Pillet, C.-A., Rey-Bellet, L.: Non-equilibrium statistical mechanics of anharmonic chains coupled to two heat baths at different temperatures. Commun. Math. Phys. 201(3), 657–697 (1999)
Flandrin, P., Bernardin, C. (eds.): Fourier and the Science of Today/Fourier et la Science d’aujourd’hui, vol. 20, Issue 5. Comptes Rendus Physique (2019)
Gantmacher, F.R.: Applications of the theory of matrices, translated by J.L. Brenner, with the assistance of D.W. Bushaw and S. Evanusa. Interscience Publishers, Inc., New York; Interscience Publishers Ltd., London (1959)
Godunov, S.V., Kiriljuk, O.P., Kostin, I.V.: Spectral Portraits of Matrices (Russian). AN SSSR Siber. Otd, Novosibirsk (1990)
Giardinà, C., Livi, R., Politi, A., Vassalli, M.: Finite thermal conductivity in 1D lattices. Phys. Rev. Lett. 84, 2144–2147 (2000)
Hairer, M.: How hot can a heat bath get? Commun. Math. Phys. 292(1), 131–177 (2009)
Hairer, M., Mattingly, J.C.: Slow energy dissipation in anharmonic oscillator chains. Commun. Pure Appl. Math. 62(8), 999–1032 (2009)
Hörmander, L.: Hypoelliptic second order differential equations. Acta Math. 119, 147–171 (1967)
Hörmander, L.: The analysis of linear partial differential operators. III. Classics in Mathematics. Springer, Berlin (2007). Pseudo-differential operators, Reprint of the 1994 edition
Jakšić, V., Pillet, C.-A., Shirikyan, A.: Entropic fluctuations in thermally driven harmonic networks. J. Stat. Phys. 166(3–4), 926–1015 (2017)
Kuwada, K.: Duality on gradient estimates and Wasserstein controls. J. Funct. Anal. 258(11), 3758–3774 (2010)
Lepri, S. (ed.): Thermal Transport in Low Dimensions: From Statistical Physics to Nanoscale Heat Transfer, vol. 921 of Lecture Notes in Physics. Springer, Cham (2016)
Lepri, S., Livi, R., Politi, A.: Thermal conduction in classical low-dimensional lattices. Phys. Rep. 377(1), 1–80 (2003)
Letizia, V., Olla, S.: Nonequilibrium isothermal transformations in a temperature gradient from a microscopic dynamics. Ann. Probab. 45(6A), 3987–4018 (2017)
Liapounoff, A.: Problème Général de la Stabilité du Mouvement. Annals of Mathematics Studies, no. 17. Princeton University Press, Princeton, N.J.; Oxford University Press, London (1947)
Monmarché, P.: Generalized \(\Gamma \) calculus and application to interacting particles on a graph. Potential Anal. 50(3), 439–466 (2019)
Raquépas, R.: A note on Harris’ ergodic theorem, controllability and perturbations of harmonic networks. Ann. Henri Poincaré 20(2), 605–629 (2019)
Rey-Bellet, L.: Open Classical Systems. In: Open quantum systems. II, volume 1881 of Lecture Notes in Math., pp. 41–78. Springer, Berlin (2006)
Rey-Bellet, L., Thomas, L.E.: Exponential convergence to non-equilibrium stationary states in classical statistical mechanics. Commun. Math. Phys. 225(2), 305–329 (2002)
Rieder, Z., Lebowitz, J.L., Lieb, E.: Properties of a harmonic crystal in a stationary nonequilibrium state. J. Math. Phys. 8(5), 1073–1078 (1967)
Roussel, J., Stoltz, G.: A perturbative approach to control variates in molecular dynamics. Multiscale Model. Simul. 17(1), 552–591 (2019)
Rubin, R.J., Greer, W.: Abnormal lattice thermal conductivity of a one-dimensional, harmonic, isotopically disordered crystal. J. Math. Phys. 12, 1686–1701 (1971)
Talay, D.: Stochastic Hamiltonian systems: exponential convergence to the invariant measure, and discretization by the implicit Euler scheme, vol. 8, pp. 163–198 (2002). Inhomogeneous random systems (Cergy-Pontoise, 2001)
Veselić, K.: Bounds for exponentially stable semigroups. Linear Algebra Appl. 358, 309–333 (2003). Special issue on accurate solution of eigenvalue problems (Hagen, 2000)
Villani, C.: Hypocoercivity. Mem. Am. Math. Soc. 202(950), iv+141 (2009)
Villani, C.: Optimal transport, volume 338 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin (2009). Old and new
Wang, F.-Y.: Generalized curvature condition for subelliptic diffusion processes. arXiv:1202.0778 (2012)
Wang, M.C., Uhlenbeck, G.E.: On the theory of the Brownian motion II. Rev. Mod. Phys. 17, 323–342 (1945)
Acknowledgements
I thank my advisor, C. Mouhot, for useful conversations, suggestions and encouragement. I would also like to thank S. Becker for pointing out a mistake in an earlier version of this draft and suggesting the reference [39] to me, and J. Evans for letting me know about the paper [1]. Finally, the valuable suggestions and detailed comments from three anonymous referees, which greatly improved the presentation, are gratefully acknowledged. This work was supported by the EPSRC Grant EP/L016516/1 for the University of Cambridge CDT, the CCA.
Additional information
Communicated by Stefano Olla.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Menegaki, A. Quantitative Rates of Convergence to Non-equilibrium Steady State for a Weakly Anharmonic Chain of Oscillators. J Stat Phys 181, 53–94 (2020). https://doi.org/10.1007/s10955-020-02565-5
DOI: https://doi.org/10.1007/s10955-020-02565-5