Abstract
Coulomb and log-gases are exchangeable singular Boltzmann–Gibbs measures appearing in mathematical physics at many places, in particular in random matrix theory. We explore experimentally an efficient numerical method for simulating such gases. It is an instance of the Hybrid or Hamiltonian Monte Carlo algorithm, in other words a Metropolis–Hastings algorithm with proposals produced by a kinetic or underdamped Langevin dynamics. This algorithm has excellent numerical behavior despite the singular interaction, in particular when the number of particles gets large. It is more efficient than the well known overdamped version previously used for such problems, and allows new numerical explorations. It suggests for instance to conjecture a universality of the Gumbel fluctuation at the edge of beta Ginibre ensembles for all beta.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
We explore the numerical simulation of Coulomb gases and log-gases by mean of Hybrid or Hamiltonian Monte Carlo algorithms (HMC) [19, 36]. Such algorithms consist basically in using discretized kinetic (underdamped) Langevin dynamics to produce proposals for Metropolis–Hastings algorithms. This can be viewed as a way to add momentum to a Monte Carlo interacting particle system. The basic outcome of this exploratory work is that HMC algorithms have remarkably good numerical behavior for such gases despite the singularity of the interactions. Such algorithms scale well with the dimension of the system, see [4, 8]. They are therefore more efficient than the tamed overdamped version already explored in the literature for instance in [55]. In this paper, we benchmark the capability of the algorithm to reproduce known results efficiently, and we make it ready to explore new conjectures.
Another advantage of this approach is that it could be adapted to take into account a sub-manifold constraint [51]. For instance, this could be used for simulating random matrices with prescribed trace or determinant, which is difficult to achieve by direct sampling of matrices.
For the sake of completeness, we should mention that there are remarkable alternative simulation algorithms which are not based on a diffusion process, such as the ones based on piecewise deterministic Markov processes (PDMP), see for instance [41] and [72].
1 Boltzmann–Gibbs Measures
We are interested in interacting particle systems subject to an external field and experiencing singular pair interactions. In order to encompass Coulomb gases as well as log-gases from random theory, we introduce a vector subspace S of dimension d of \({\mathbb {R}}^n\), with \(n\ge 2\) and \(n\ge d\ge 1\). The particles belong to S, and \({\mathbb {R}}^n\) is understood as a physical ambient space. We equip S with the trace of the Lebesgue measure of \({\mathbb {R}}^n\), denoted by \(\mathrm {d}x\). The external field and the pair interaction are respectively denoted by \(V:S\mapsto {\mathbb {R}}\) and \(W:S\mapsto (-\infty ,+\infty ]\), and belong to \(\mathcal {C}^2\) functions, with \(W(x)<\infty \) for all \(x\ne 0\). For any \(N\ge 2\), we consider the probability measure \(P_N\) on \(S^N=S\times \cdots \times S\) defined by
where \(\beta _N>0\) is a parameter,
is the normalizing factor, and
is usually called energy or Hamiltonian of the system. We assume that \(\beta _N\), V, and W are chosen in such a way that \(Z_N<\infty \) for any N. The law \(P_N\) is invariant by permutation of the coordinates \(x_1,\ldots ,x_N\) (exchangeable), and \(H_N\) depends only on the empirical measure
Therefore \(P_N\) is also the law of a random empirical measure encoding a cloud of indistinguishable particles \(x_1,\ldots ,x_N\). We emphasize that the particles live on the space \(S^N=S\times \cdots \times S\) of dimension dN. The parameter n serves as the physical dimension of the ambient space, for the Coulomb gas setting described next.
For any \(m\ge 1\) and \(x\in \mathbb {R}^m\), we denote by \({{\left| x \right| }}=\sqrt{x_1^2+\cdots +x_m^2}\) the Euclidean norm of x. This matches the absolute value when \(m=1\) and the modulus when \(m=2\), \(\mathbb {R}^2\equiv \mathbb {C}\).
1.1 Coulomb Gases
The notion of Coulomb gas is based on elementary electrostatics. Here the vector subspace S is interpreted as a conductor. It corresponds to taking \(W=g\) where g is the Coulomb kernel or Green function in the physical space \(\mathbb {R}^n\). More precisely, recall that the Green function g in \({\mathbb {R}}^n\), \(n\ge 2\), is defined for all \(x\in {\mathbb {R}}^n\), \(x\ne 0\), by
This function is the fundamental solution of the Poisson equation, namely, denoting by \(\Delta \) the Laplace operator in \(\mathbb {R}^n\) and by \(\delta _0\) the Dirac mass at 0, we have, in the sense of distributions,
The physical interpretation in terms of electrostatics is as follows: \(H_N(x_1,\ldots ,x_N)\) is the electrostatic energy of a configuration of N electrons in \(\mathbb {R}^n\) lying on S at positions \(x_1,\ldots ,x_N\), in an external field given by the potential V. The Green function or Coulomb kernel g expresses the Coulomb repulsion which is a two body singular interaction. The probability measure \(P_N\) can be seen as a Boltzmann–Gibbs measure, \(\beta _N\) playing the role of an inverse temperature. The probability measure \(P_N\) is known as a Coulomb gas or as a one-component plasma, see for instance [68] and references therein.
1.2 Log-Gases
A log-gas corresponds to choosing \(d=n\) and a logarithmic interaction W whatever the value of n is, namely
Coulomb gases and log-gases coincide when \(d=n=2\). In dimension \(d=n\ge 3\), log-gases are natural and classical objects of approximation theory and can be seen as limiting Riesz potentials, namely \(\lim _{\alpha \rightarrow 0}\frac{1}{\alpha }(|x|^{-\alpha }-1)\), see for instance [68,69,70].
1.3 Static Energy and Equilibrium Measures
Under natural assumptions over V and W, typically when \(\beta _N\gg N\) and V beats W at infinity, it is well known, see for instance [14, 67] and references therein, that \(P_N\) almost surely, the empirical measure
tends as \(N\rightarrow \infty \) to a non random probability measure, the equilibrium measure
the unique minimizer of the strictly convex lower semi-continuous “energy” \(\mathcal {E}\) defined by
When \(W=g\) is the Coulomb kernel, the quantity \(\mathcal {E}(\mu )\) is the electrostatic energy of the distribution of charges \(\mu \), formed by the sum of the electrostatic potential coming from the external electric field V with the Coulomb self repulsion by mean of the Coulomb kernel g. Note that \(\mathcal {E}(\mu )=\infty \) if \(\mu \) has a Dirac mass due to the singularity of g. An Euler–Lagrange variational analysis reveals that when \(S=\mathbb {R}^d\) and V is smooth, convex, and grows faster than g at infinity then the equilibrium probability measure \(\mu _*\) is compactly supported and has density proportional to \(\Delta V\), see [14] and references therein. Table 1 gives examples of equilibrium measures in this Coulomb setting. We refer to [33, 44, 65, 67, 68] for old and new potential theory from this analytic point of view. Moreover, quite a few equilibrium measures are known for log-gases beyond Coulomb gases, see for instance [16].
Actually it can be shown that essentially if \(\beta _N\gg N\) and V beats g at infinity then under \({(P_N)}_N\) the sequence of random empirical measures \({(\mu _N)}_N\) satisfies a large deviation principle with speed \(\beta _N\) and good rate function \(\mathcal {E}\), see [3, 14, 30]. Concentration of measure inequalities are also available, see [12] and references therein.
1.4 Two Remarkable Gases from Random Matrix Theory
Let us give a couple of famous gases from random matrix theory that will serve as benchmark for our algorithm. They correspond to \(n=2\) because the Lebesgue measure on a matrix translates via the Jacobian of the change of variable to a Vandermonde determinant on the eigenvalues, giving rise to the two-dimensional Coulomb kernel inside the exponential via the identity
Hence the name “log-gases”. A good reference on this subject is [28] and we refer to [21, 24, 28, 29, 39] for more examples of Coulomb gases related to random matrix models. Coulomb gases remain interesting in any dimension n beyond random matrices, see [67, 68].
Beta-Hermite model This model corresponds to
This means that the particles evolve on the line \(\mathbb {R}\) with Coulomb interactions given by the Coulomb kernel in \(\mathbb {R}^2\). For \(\beta =2\), it becomes the famous Gaussian Unitary Ensemble (GUE), which is the distribution of the eigenvalues of random \(N\times N\) Hermitian matrices distributed according to the Gaussian probability measure with density proportional to \(H\mapsto \mathrm {e}^{-N\mathrm {Tr}(H^2)}\). Beyond the case \(\beta =2\), the cases \(\beta =1\) and \(\beta =4\) correspond respectively to Gaussian random matrices with real and quaternionic entries. Following [21], for all \(\beta \in (0,\infty )\), the measure \(P_N\) is also the distribution of the eigenvalues of special random \(N\times N\) Hermitian tridiagonal matrices with independent but non identically distributed entries. Back to the case \(\beta =2\), the law \(P_N\) writes
In this case, the Coulomb gas \(P_N\) has a determinantal structure, making it integrable or exactly solvable for any \(N \ge 2\), see [28, 57]. This provides in particular a formula for the density of the mean empirical spectral distribution \(\mathbb {E}\mu _N\) under \(P_N\), namely
where \({(H_\ell )}_{\ell \ge 0}\) are the Hermite polynomials which are the orthonormal polynomials for the standard Gaussian distribution \(\mathcal {N}(0,1)\). The equilibrium measure \(\mu _*\) in this case is the Wigner semicircle distribution with the following density with respect to the Lebesgue measure:
A plot of \(\mu _*\) and \(\mathbb {E}\mu _N\) is provided in Fig. 1, together with our simulations. We refer to [46] for a direct proof of convergence of (1.3)–(1.4) as \(N\rightarrow \infty \). Beyond the case \(\beta =2\), the equilibrium measure \(\mu _*\) is still a Wigner semicircle distribution, scaled by \(\beta \), supported by the interval \([-\beta ,\beta ]\), but up to our knowledge we do not have a formula for the mean empirical spectral distribution \(\mathbb {E}\mu _N\), except when \(\beta \) is an even integer, see [21].
Beta-Ginibre model This model corresponds to
In this case, the particles move in \(\mathbb {R}^2\) with a Coulomb repulsion of dimension 2—it is therefore a Coulomb gas. As for the GUE, the law \(P_N\) can be written as
When \(\beta =m\) for an even integer \(m\in \{2,4,\ldots \}\), the law of this gas matches the Laughlin wavefunction modeling the fractional quantum Hall effect (FQHE), see for instance [26].
For \(\beta =2\), this gas, known as the complex Ginibre Ensemble, matches the distribution of the eigenvalues of random \(N\times N\) complex matrices distributed according to the Gaussian probability measure with density proportional to \(M\mapsto \mathrm {e}^{-N\mathrm {Tr}(MM^*)}\) where \(M^*=\overline{M}^\top \). In this case \(P_N\) has a determinantal structure, see [28, 57]. This provides a formula for the density of the mean empirical spectral distribution \(\mathbb {E}\mu _N\) under \(P_N\), namely
which is the analogue of (1.3) for the Gaussian Unitary Ensemble. Moreover, if \(Y_1,\ldots ,Y_N\) are independent and identically distributed Poisson random variables of mean \(|x|^2\) for some \(x\in \mathbb {R}^2\), then (1.6) writes
As \(N\rightarrow \infty \), by the law of large numbers, it converges to \(1/\pi \) if \(|x|<1\) and to 0 if \(|x|>1\), while by the central limit theorem it converges to \(1/(2\pi )\) if \(|x|=1\). It follows that \(\mathbb {E}\mu _N\) converges weakly as \(N\rightarrow \infty \) to the uniform distribution on the disk, with density
which is the equilibrium measure \(\mu _*\). When N is finite, the numerical evaluation of (1.6) is better done by mean of the Gamma law. Namely, by induction and integration by parts, (1.6) writes
where \(\Gamma \) is the normalized incomplete Gamma function and where we used the identity
Note that \(t\mapsto 1-\Gamma (N,t)\) is the cumulative distribution function of the Gamma distribution with shape parameter N and scale parameter 1. Figure 4 illustrates the difference between the limiting distribution (1.7) and the mean empirical spectral distribution (1.6) for a finite N. Beyond the case \(\beta =2\), we no longer have a formula for the density of \(\mathbb {E}\mu _N\), but a simple scaling argument reveals that the equilibrium measure \(\mu _*\) is in this case the uniform distribution on the centered disk of radius \(\sqrt{\beta /2}\).
2 Simulating Log-Gases and Coulomb Gases
Regarding simulation of log-gases or Coulomb gases such as (1.1), it is natural to use the random matrix models when they are available. There exist also methods specific to determinantal processes which cover the log-gases of random matrix theory with \(\beta =2\), see [2, 18, 32, 37, 45, 59, 66]. Beyond these specially structured cases, a great variety of methods are available for simulating Boltzmann–Gibbs measures, such as overdamped Langevin diffusion algorithm, Metropolis–Hastings algorithm, Metropolis adjusted Langevin algorithm (MALA), and kinetic versions called Hybrid or Hamiltonian Monte Carlo (HMC) which are based on a kinetic (or underdamped) Langevin diffusion, see for instance [10, 52]. Other possibilities exist, such as Nosé-Hoover dynamics [40] or piecewise deterministic Markov processes [9].
Two difficulties arise when sampling measures as (1.1). First, the Hamiltonian \(H_N\) involves all couples, so the computation of forces and energy scales quadratically with the number of particles. A natural way to circumvent this numerical problem is to use clusterization procedures such as the “fast multipole methods”, see for instance [35]. A second difficult feature of such a Hamiltonian is the singularity of the interacting function W, which typically results in numerical instability. A standard stabilization procedure is to «tame» the dynamics [11, 38], which is the strategy adopted in [55]. However, this smoothing of the force induces a supplementary bias in the invariant measure, as shown in [11] for regular Hamiltonians. This requires using small time steps, hence long computations. In the present note, we explore for the first time the usage of HMC for general Coulomb gases in the context of random matrices, in the spirit of [71], the difficulty being the singularity of the interaction. This method has the advantage of sampling the exact invariant measure (1.1), while allowing to choose large time steps, which reduces the overall computational cost [27].
In Sect. 2.1, we review standard methods for sampling measures of the form \(\mathrm {e}^{-\beta _N H_N}\), before presenting in detail the HMC algorithm in Sect. 2.2.
2.1 Standard Sampling Methods
To simplify and from now on, we suppose the support set S in (1.1) to be \(\mathbb {R}^d\). We introduce the methods based on the overdamped Langevin dynamics. To sample approximately (1.1), the idea is to exploit the fact that \(P_N\) in (1.1) is the reversible invariant probability measure of the Markov diffusion process \({(X_t)}_{t\ge 0}\) solution to the stochastic differential equation:
or in other words
where \({(B_t)}_{t\ge 0}\) is a standard Brownian motion on \(S^N\) and \(\alpha _N>0\) is an arbitrary time scaling parameter (for instance \(\alpha _N=1\) or \(\alpha _N=\beta _N\)). The infinitesimal generator associated with (2.1) is
The difficulty in solving (2.1) lies in the fact that the energy \(H_N\) involves a singular interaction W, which may lead the process to explode. Actually, under certain conditions on \(\beta _N\) and V, the Eq. (2.1) is well posed, the process \({(X_t)}_{t\ge 0}\) is well defined, and
for all non-degenerate initial condition \(X_0\). See for instance [1, 13, 25] for the case of Beta-Hermite case known as the Dyson Ornstein–Uhlenbeck process, and [6] for the Beta-Ginibre case. We do not discuss these delicate aspects in this note. A convergence in Cesáro mean is provided by the ergodic theorem for additive functionals,
almost surely or, for any test function \(f\in L^1(P_N)\),
almost surely. It is also possible to accelerate the convergence by adding a divergence free term in the dynamics (2.1), see for instance [22, 49] and references therein. This modification keeps the same invariant distribution but produces a non-reversible dynamics.
This method of simulation is referred to as an “unadjusted Langevin algorithm”, a terminology which will be clarified later on. In practice, one cannot simulate the continuous stochastic process \({(X_t)}_{t\ge 0}\) solution to (2.1), and resorts to a numerical integration with a finite time step \(\Delta t\). A typical choice is the Euler–Maruyama scheme [42, 58], which reads
where \((G_k)\) is a family of independent and identically distributed standard Gaussian variables, and \(x_k\) is an approximation of \(X_{k\Delta t}\). Note that \(\alpha _N\) and \(\Delta t\) play the same role here. However, because of the singularity of \(H_N\), this sampling scheme leads to important biases in practice, and (2.2) may even lack an invariant measure [56, Sect. 6]. One way to stabilize the dynamics is to use a tamed version of (2.2), which typically takes the following form:
This strategy is used in [55] but, as noted by the authors, the number of time steps needed to run a trajectory of fixed time T scales as \(\Delta t\sim N^{-2}\), which makes the study of large systems difficult.
Another strategy is to add a selection step at each iteration. This is the idea of the Metropolis Adjusted (overdamped) Langevin Algorithm (MALA) [63], which prevents irrelevant moves with a Metropolis step. One can also view the MALA algorithm as a Metropolis algorithm in which the proposal is produces by using a one step discretization of the Langevin dynamics (2.1). Let us make this precise; more details can be found e.g. in [61, 63].
Algorithm 2.1
(Metropolis Adjusted (overdamped) Langevin Algorithm—MALA) Let K be the Gaussian transition kernel associated to the Markov chain of the Euler discretization (2.2) of the dynamics (2.1). For each step k,
-
draw a proposal \({\tilde{x}}_{k+1}\) according to the kernel \(K(x_k,\cdot )\),
-
compute the probability
$$\begin{aligned} p_k=1\wedge \frac{K({\tilde{x}}_{k+1},x_k)\mathrm {e}^{-\beta _N H_N({\tilde{x}}_{k+1})}}{K(x_{k},{\tilde{x}}_{k+1})\mathrm {e}^{-\beta _N H_N(x_{k})} }, \end{aligned}$$(2.4) -
set
$$\begin{aligned} x_{k+1} ={\left\{ \begin{array}{ll} {\tilde{x}}_{k+1} &{} \text {with probability }p_k;\\ x_k &{} \text {with probability }1-p_k. \end{array}\right. } \end{aligned}$$
Note that the “reversed” kernel \(K(\cdot ,x)\) is Gaussian only if \(H_N\) is a quadratic form. Note also that if the proposal kernel K is symmetric in the sense that \(K(x,y)=K(y,x)\) for all x, y then it disappears in (2.4), and it turns out that this is the case for the Hybrid Monte Carlo algorithm described next (up to momentum reversal)!
A natural issue with these algorithms is the choice of \(\Delta t\): if it is too large, an important fraction of the proposed moves will be rejected, hence poor convergence properties; conversely, if \(\Delta t\) is too small, many steps will be accepted but the physical ellapsed time will be small, hence a large variance for a fixed number of iterations. This algorithm actually has a nice scaling of the optimal time step \(\Delta t\) with the dimension of the system. Indeed, it can be shown that it scales as \(\Delta t\sim N^{-\frac{1}{3}}\), at least for product measures (see [62] and references therein). Although this algorithm is already efficient, we propose to use a kinetic version with further advantages.
2.2 Hybrid Monte Carlo Algorithm
Hybrid Monte Carlo is built on Algorithm 2.1, but using a kinetic version of (2.1). For this, a momentum variable is introduced so as to improve the exploration of the space. Namely, set \(E=\mathbb {R}^{dN}\), and let \(U_N:E\rightarrow \mathbb {R}\) be smooth and such that \(\mathrm {e}^{-\beta _N U_N}\) is Lebesgue integrable. Let \({(X_t,Y_t)}_{t\ge 0}\) be the diffusion process on \(E\times E\) solution to the stochastic differential equation
where \({(B_t)}_{t\ge 0}\) is a standard Brownian motion on E, and \(\gamma _N>0\) is an arbitrary parameter which plays the role of a friction, and which may depend a priori on N and \((X_t)_{t\ge 0}\), even if we do not use this possibility here. In addition, \(H_N\) and \(\beta _N\) are as in (1.1), while \(U_N\) plays the role of a generalized kinetic energy [71]. This dynamics admits the following generator:
where \(L_1\) is known as the Hamiltonian part while \(L_2\) is called the fluctuation-dissipation part. The dynamics leaves invariant the product Boltzmann–Gibbs measure
see for instance [71]. In other words
As for the overdamped dynamics, the ergodic theorem for additive functionals gives
Remark 2.2
(Terms: Hamiltonian, Langevin, overdamped, underdamped, kinetic) The dynamics (2.5) is called “Hamiltonian” when we turn off the noise by taking \(\gamma _N=0\). On the other hand, when \(\gamma _N\rightarrow \infty \) and \(\alpha _N\rightarrow 0\) with \(\alpha _N\gamma _N=1\), we recover (2.1) from (2.5) with \(Y_t\) and \(U_N\) instead of \(X_t\) and \(H_N\). Both (2.1) and (2.5) are known as Langevin dynamics. To be more precise, (2.1) is generally called overdamped while (2.5) is referred to as kinetic or underdamped.
When \(U_N(y)=\frac{1}{2}|y|^2\) then \(Y_t=\mathrm {d}X_t/\mathrm {d}t\), and in this case \(X_t\) and \(Y_t\) can be interpreted respectively as the position and the velocity of a system of N points in S at time t. In this case we say that \(U_N\) is the kinetic energy. For simplicity, we specialize in what follows to this “physical” or “kinetic” case and refer to [71] for more possibilities.
As before, to simulate \({(X_t,Y_t)}_{t\ge 0}\), one can discretize (2.5) and sample from a trajectory. This will provide a proposal for the HMC scheme as the Euler discretization (2.2) did for Algorithm 2.1. A good way of doing this is a splitting procedure. First, one integrates the Hamiltonian part i.e. the operator \(L_1\) in (2.6), which amounts to a standard Hamiltonian dynamics, before integrating the fluctuation-dissipation part i.e. the operator \(L_2\) in (2.6). For discretizing the Hamiltonian dynamics over a time step, a standard approach is the Verlet integrator [31, 50], which we describe now. For a time step \(\Delta t>0\), this scheme reads, starting from a state \((x_k,y_k)\) at time k:
This corresponds to updating the velocity over half a time step, then the positions over a time step, and again the velocity over half a time-step. Given that this scheme only corresponds to the Hamiltonian part, it remains to integrate the fluctuation-dissipation part, corresponding to \(L_2\) in (2.6). For quadratic energies, it is a simple Ornstein–Uhlenbeck process whose variance can be computed explicitly. Therefore, we add to the previous scheme the following velocity update which comes from the Mehler formulaFootnote 1:
where \(G_k\) is a standard Gaussian random variable. Like the numerical scheme (2.2), because of the singularity of the interactions, this integrator may not have an invariant measure [56], or its invariant measure may be a poor approximation of \(R_N\) depending on the time step [48]. Note that, here again, \(\alpha _N\) and \(\Delta t\) play the same role.
Hybrid or Hamiltonian Monte Carlo (HMC) methods, built on the later integration, appeared in theoretical physics in lattice quantum chromodynamics with [19], see also [64], and are still actively studied in applied mathematics, see for instance [4, 8, 17, 23, 34, 50, 71] and references therein. The HMC algorithm can be thought of in a sense as a special Metropolis Adjusted (underdamped) Langevin Algorithm. Indeed, inspired by the MALA Algorithm 2.1, a way to avoid the stability problem of the discretization of the kinetic Langevin dynamics mentioned above is to add an acceptance-rejection step. A surprising advantage of this approach is that the Verlet integration scheme is time reversible up to momenta reversal [50, Sect. 2.1.3 and Eq. (2.11)], hence when computing the acceptance probability as in (2.4), the transition kernel does not appear. Note that the Verlet algorithm has been widely used for years by statistical physicists, and goes back to the historical works of Verlet [73] and Levesque and Verlet [53, 54]. Let us now describe the algorithm.
Algorithm 2.3
[HMC] Start from a configuration \((x_0,y_0)\) and perform the following steps for each time \(k\ge 0\):
-
(1)
update the velocities with
$$\begin{aligned} {\tilde{y}}_{k}=\eta y_{k}+\sqrt{\frac{1-\eta ^2}{\beta _N}} G_k, \quad \eta = \mathrm {e}^{-\gamma _N\alpha _N\Delta t}; \end{aligned}$$ -
(2)
run one step of the Verlet scheme:
$$\begin{aligned} \left\{ \begin{aligned} {\tilde{y}}_{k+\frac{1}{2}}&={\tilde{y}}_k -\nabla H_N(x_k)\alpha _N\frac{\Delta t}{2};\\ {\tilde{x}}_{k+1}&=x_k +{\tilde{y}}_{k+\frac{1}{2}}\alpha _N\Delta t;\\ {\tilde{y}}_{k+1}&={\tilde{y}}_{k+\frac{1}{2}}-\nabla H_N(x_{k+1})\alpha _N\frac{\Delta t}{2};\\ \end{aligned} \right. \end{aligned}$$(2.8) -
(3)
compute the probability ratio
$$\begin{aligned} p_k = 1\wedge \exp \left[ -\beta _N\Big (H_N({\tilde{x}}_{k+1}) + \frac{{\tilde{y}}_{k+1}^2}{2} - H_N(x_{k}) - \frac{{\tilde{y}}_{k}^2}{2} \Big )\right] ; \end{aligned}$$ -
(4)
set
$$\begin{aligned} (x_{k+1},y_{k+1}) = {\left\{ \begin{array}{ll} ({\tilde{x}}_{k+1},{\tilde{y}}_{k+1}) &{} \text {with probability }p_k;\\ (x_{k},-{\tilde{y}}_{k}) &{} \text {with probability }1-p_k. \end{array}\right. } \end{aligned}$$
As noted in the various references above, the Metropolis step acts as a corrector on the energy conservation of the Hamiltonian step. In this, it helps avoiding irrelevant moves, while enhancing the exploration capacities of the dynamics through the speed variable. A more precise argument in favor of this algorithm is the scaling of the time step \(\Delta t\) with respect to the system size N. Indeed, as shown in [4] for product measures, the optimal scaling is as \(\Delta t\sim N^{-\frac{1}{4}}\), which makes the algorithm appealing for large systems. Since the Hamiltonian computational cost scales as \(N^2\), we see that the cost of the algorithm for a fixed time T and \(N={\lceil }[T/\Delta t]{\rceil }\) is in \(\mathcal {O}(N^\frac{9}{4})\), which has to be compared to the \(\mathcal {O}(N^4)\) cost reached in [55]. Finally, the parameter \(\gamma _N\) can also be tuned in order to optimize the speed of convergence – we leave this point here and stick to \(\gamma _N=1\).
The control of the error or rate of convergence for the HMC algorithm is the subject of active research, see for instance [47] and [7, 23] for some results under structural assumptions.
From a practical point of view, the algorithm can be tested in the following way. First, when only the Hamiltonian part of the dynamics is integrated with the Verlet scheme (2.8), it can be checked that the energy variation over one time step scales as \(\Delta t^3\) as \(\Delta t\rightarrow 0\). Then, if the selection step is added, the rejection rate should also scale as \(\Delta t^3\). When the momentum resampling is added, this rejection rate scaling should not change. For completeness, we illustrate some of these facts in Sect. 3.
3 Numerical Experiments on Remarkable Models
In this section, we start testing Algorithm 2.3 for the two cases described in Sect. 1.4. Since the equilibrium measures are known for any \(N\ge 2\), we will be able to compare accurately our results with the expected one. We will also consider models for which the empirical spectral distribution and the equilibrium distribution are not known. We remind that when \(S=\mathbb {R}^d\) with \(d\ge 1\) we have the following formulas that hold in any dimension:
3.1 Case Study: 1D
We test the numerical method by looking at the mean empirical distribution in the case of the Gaussian Unitary Ensemble (1.2) with \(\beta =2\), \(N=8\), for which the exact expression of \(\mathbb {E}\mu _N\) under \(P_N\) is provided by (1.3). The results in Fig. 1 show a very good agreement between the exact result and the algorithm. For completeness, we study the rejection rate of the algorithm as \(\Delta t\) goes to zero, as mentioned at the end of Sect. 2.2. More precisely, we compute over a trajectory the rate of rejected moves in the Step 4 of Algorithm 2.3. The logarithmic plot in Fig. 2 shows a linear fit with a slope of about 3.1, which confirms the expected scaling in \(\Delta t^3\).
We also study the quartic confinement potential \(V(x)=x^4/4\), as in [55]. In this case, the empirical spectral distribution is not known, but the equilibrium distribution has density with respect to the Lebesgue measure given by
The results of the numerical simulations, see Fig. 3, show a good agreement with the equilibrium measure when N is large. Note that a tridiagonal random matrix model is known but it does not have independent entries, see [43, Prop. 2.1].
3.2 Case Study: 2D
We next consider in Fig. 4 the mean empirical distribution in the case of the Complex Ginibre Ensemble (1.5) with \(\beta =2\), \(N=8\). In this case, we also know a theoretical formula for \(\mathbb {E}\mu _N\) under \(P_N\), given by (1.6). For completeness, we investigate the scaling of the relative energy difference in the Step 3 of Algorithm 2.3 (by turning off the selection procedure of Step 4). The logarithmic plot in Fig. 5 shows a slope of about 2.9, which confirms the expected scaling in \(\Delta t^3\) that corresponds to the error of energy conservation, over one time step, of the Verlet integrator (2.8).
We explore next in Fig. 6 the Gumbel fluctuation at the edge, which is proved for \(\beta =2\) and conjectured for \(\beta \ne 2\), see [15, 20, 60] (note that in this case we have a formula for \(\mu _*\) but not for \(\mathbb {E}\mu _N\) under \(P_N\)). One could also explore the crystallization phenomenon, see [5] and references therein.
3.3 Case Study: 3D
In Fig. 7, we finally turn to the Coulomb gas which corresponds to \(S=\mathbb {R}^3\), \(d=n=3\), \(V={{\left| \cdot \right| }}^2/\beta \), \(W=1/{{\left| \cdot \right| }}\) and to the log-gas for which \(W=-\log {{\left| \cdot \right| }}\). In the first case the equilibrium measure \(\mu _*\) is uniform on the centered ball of \(\mathbb {R}^d\) of radius \((\beta (d-2)/2)^{1/d}\), see for instance [14, Cor. 1.3], while in the second case the equilibrium measure is not know yet, see however [16]. In both cases we do not have a formula for \(\mathbb {E}\mu _N\) under \(P_N\). One could study the fluctuation at the edge, which is conjectured to be Gumbel, just like for the complex Ginibre ensemble in 2D.
Notes
The Mehler formula states that the Ornstein–Uhlenbeck process \({(Z_t)}_{t\ge 0}\) in \(\mathbb {R}^n\) solution of the stochastic differential equation \(\mathrm {d}Z_t=\sqrt{2\sigma ^2}\mathrm {d}B_t-\rho Z_t\mathrm {d}t\) satisfies \(\mathrm {Law}(Z_{t+s}\mid Z_s=z)=\mathcal {N}(z\mathrm {e}^{-\rho t},\frac{1-\mathrm {e}^{-2\rho t}}{\rho }\sigma ^2I_n)\).
References
Anderson, G.W., Guionnet, A., Zeitouni, O.: An Introduction to Random Matrices. Cambridge Studies in Advanced Mathematics, vol. 118. Cambridge University Press, Cambridge (2010)
Bardenet, R., Hardy, A.: Monte Carlo with determinantal point processes. arXiv:1605.00361v1 (2016)
Berman, R.J.: On large deviations for Gibbs measures, mean energy and Gamma-convergence. arXiv:1610.08219v1 (2016)
Beskos, A., Pillai, N., Roberts, G., Sanz-Serna, J.-M., Stuart, A.: Optimal tuning of the hybrid Monte Carlo algorithm. Bernoulli 19(5A), 1501–1534 (2013)
Blanc, X., Lewin, M.: The crystallization conjecture: a review. EMS Surv. Math. Sci. 2(2), 225–306 (2015)
Bolley, F., Chafï, D., Fontbona, J.: Dynamics of a planar Coulomb gas. Ann. Appl. Probab. arXiv:1706.08776v3 (2017)
Bou-Rabee, N., Eberle, A., Zimmer, R.: Coupling and convergence for Hamiltonian Monte Carlo. arXiv:1805.00452v1 (2018)
Bou-Rabee, N., Sanz-Serna, J.M.: Geometric integrators and the Hamiltonian Monte Carlo method. arXiv:1711.05337v1 (2017)
Bouchard-Côté, A., Vollmer, S.J., Doucet, A.: The bouncy particle sampler: a nonreversible rejection-free Markov chain Monte Carlo method. J. Am. Stat. Assoc. 113(522), 855–867 (2018)
Brooks, S., Gelman, A., Jones, G.L., Meng, X.L. (eds.): Handbook of Markov Chain Monte Carlo. Chapman Hall/CRC Handbooks of Modern Statistical Methods. CRC Press, Boca Raton (2011)
Brosse, N., Durmus, A., Moulines, É., Sabanis , S.: The tamed unadjusted Langevin algorithm arXiv:1710.05559v2 (2017)
Chafaï, D., Hardy, A., Maïda, M.: (2018) Concentration for Coulomb gases and Coulomb transport inequalities. J. Funct. Anal. arXiv:1610.00980v3
Chafaï, D., Lehec, J.: On Poincaré and logarithmic Sobolev inequalities for a class of singular Gibbs measures. arXiv:1805.00708v2 (2018)
Chafaï, D., Gozlan, N., Zitt, P.-A.: First-order global asymptotics for confined particles with singular pair repulsion. Ann. Appl. Probab. 24(6), 2371–2413 (2014)
Chafaï, D., Péché, S.: A note on the second order universality at the edge of Coulomb gases on the plane. J. Stat. Phys. 156(2), 368–383 (2014)
Chafaï, D., Saff, E.: Aspects of an Euclidean log-gas. Work in progress (2018)
Dalalyan, A., Riou-Durand, L.: On sampling from a log-concave density using kinetic Langevin diffusions. arXiv:1807.09382v1 (2018)
Decreusefond, L., Flint, I., Vergne, A.: Vergne: a note on the simulation of the Ginibre point process. J. Appl. Probab. 52(4), 1003–1012 (2015)
Duane, S., Kennedy, A., Pendleton, B.J., Roweth, D.: Hybrid Monte Carlo. Phys. Lett. B 195(2), 216–222 (1987)
Dubach, G.: Powers of Ginibre Eigenvalues. arXiv:1711.03151v2 (2017)
Dumitriu, I., Edelman, A.: Matrix models for beta ensembles. J. Math. Phys. 43(11), 5830–5847 (2002)
Duncan, A.B., Lelièvre, T., Pavliotis, G.A.: Variance reduction using nonreversible Langevin samplers. J. Stat. Phys. 163, 457–491 (2016)
Durmus, A., Moulines, E., Saksman, E.: On the convergence of Hamiltonian Monte Carlo. arXiv:1705.00166v1 (2017)
Edelman, A., Rao, N.R.: Random matrix theory. Acta Numer. 14, 233–297 (2005)
Erdős, L., Yau, H.-T.: A dynamical approach to random matrix theory. Courant Lecture Notes in Mathematics, vol. 28. American Mathematical Society, Providence, RI (2017)
Ezawa, Z.E.: Quantum Hall Effects. Field Theoretical Approach and Related Topics, 2nd edn. World Scientific Publishing Co. Pt. Ltd., Hackensack, NJ (2008)
Fathi, M., Homman, A.-A., Stoltz, G.: Error analysis of the transport properties of Metropolized schemes. ESAIM Proc. Surv. 48, 341–363 (2015)
Forrester, P .J.: Log-Gases and Random Matrices. London Mathematical Society Monographs Series, vol. 34. Princeton University Press, Princeton, NJ (2010)
Forrester, P.J.: Analogies between random matrix ensembles and the one-component plasma in two-dimensions. Nuclear Phys. B 904, 253–281 (2016)
García-Zelada, D.: A large deviation principle for empirical measures on Polish spaces: application to singular Gibbs measures on manifolds. arXiv:1703.02680v2 (2017)
Hairer, E., Lubich, C., Wanner, G.: Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations, Springer Series in Computational Mathematics, vol. 31. Springer Science Business Media, Berlin (2006)
Hardy, A.: Polynomial ensembles and recurrence coefficients. arXiv:1709.01287v1 (2017)
Helms, L.L.: Potential Theory, 2nd edn. Springer, London (2014)
Hoffman, M.D., Gelman, A.: The no-U-turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. J. Mach. Learn. Res. 15, 1593–1623 (2014)
Höft, T.A., Alpert, B.K.: Fast updating multipole Coulombic potential calculation. SIAM J. Sci. Comput. 39(3), A1038–A1061 (2017)
Horowitz, A.M.: A generalized guided Monte Carlo algorithm. Phys. Lett. B 268, 247–252 (1991)
Hough, J.B., Krishnapur, M., Peres, Y., Virág, B.: Determinantal processes and independence. Probab. Surv. 3, 206–229 (2006)
Hutzenthaler, M., Jentzen, A., Kloeden, P.E.: Strong convergence of an explicit numerical method for SDEs with nonglobally Lipschitz continuous coefficients. Ann. Appl. Probab. 22(4), 1611–1641 (2012)
Jiang, T., Qi, Y.: Spectral radii of large non-Hermitian random matrices. J. Theoret. Probab. 30(1), 326–364 (2017)
Jones, A., Leimkuhler, B.: Adaptive stochastic methods for sampling driven molecular systems. J. Chem. Phys. 135(8), 084125 (2011)
Kapfer, S.C., Krauth, W.: Cell-veto Monte Carlo algorithm for long-range systems. Phys. Rev. E 94, 031302 (2016)
Kloeden, P.E., Platen, E., Schurz, H.: Numerical Solution of SDE Through Computer Experiments. Universitext. Springer, Berlin (1994)
Krishnapur, M., Rider, B., Virág, B.: Universality of the stochastic Airy operator. Commun. Pure Appl. Math. 69(1), 145–199 (2016)
Landkof, N.S.: Foundations of Modern Potential Theory. Springer, New York (1972) (Translated from the Russian by A, p. 180. P. Doohovskoy, Die Grundlehren der mathematischen Wissenschaften, Band)
Lavancier, F., Møller, J., Rubak, E.: Determinantal point process models and statistical inference. J. R. Stat. Soc. Ser. B. Stat. Methodol 77(4), 853–877 (2015)
Ledoux, M.: Differential operators and spectral distributions of invariant ensembles from the classical orthogonal polynomials. The continuous case. Electron. J. Probab 9(7), 177–208 (2004)
Lee, Y.T., Vempala, S.S.: Convergence rate of Riemannian Hamiltonian Monte Carlo and faster polytope volume computation. arXiv:1710.06261v1 (2017)
Leimkuhler, B., Matthews, C., Stoltz, G.: The computation of averages from equilibrium and nonequilibrium Langevin molecular dynamics. IMA J. Numer. Anal. 36(1), 13–79 (2015)
Lelièvre, T., Nier, F., Pavliotis, G.A.: Optimal non-reversible linear drift for the convergence to equilibrium of a diffusion. J. Stat. Phys. 152, 237–274 (2013)
Lelièvre, T., Rousset, M., Stoltz, G.: Free Energy Computations. A Mathematical Perspective. Imperial College Press, London (2010)
Lelièvre, T., Rousset, M., Stoltz, G.: Langevin dynamics with constraints and computation of free energy differences. Math. Comput. 81(280), 2071–2125 (2012)
Lelièvre, T., Stoltz, G.: Partial differential equations and stochastic methods in molecular dynamics. Acta Numer. 25, 681–880 (2016)
Levesque, D., Verlet, L.: On the theory of classical fluids II. Physica 28(11), 1124–1142 (1962)
Levesque, D., Verlet, L.: Computer experiments on classical fluids. III. Time-dependent self-correlation functions. Phys. Rev. A 2, 2514 (1970)
Li, X.H., Menon, G.: Numerical solution of Dyson Brownian motion and a sampling scheme for invariant matrix ensembles. J. Stat. Phys. 153(5), 801–812 (2013)
Mattingly, J., Stuart, A., Higham, D.: Ergodicity for SDEs and approximations: locally Lipschitz vector fields and degenerate noise. Stoch. Proc. Appl. 101(2), 185–232 (2002)
Mehta, M.L.: Random Matrices, 3rd edn. Pure and Applied Mathematics (Amsterdam), vol. 142. Elsevier/Academic Press, Amsterdam (2004)
Milstein, G .N., Tretyakov, M .V.: Stochastic Numerics for Mathematical Physics. Springer Science Business Media, Berlin (2013)
Olver, S., Nadakuditi, R .R., Trogdon, T.: Sampling unitary ensembles. Random Matrices Theory Appl. 4(1), 1550002–22 (2015)
Rider, B.: A limit theorem at the edge of a non-Hermitian random matrix ensemble. Random matrix theory. J. Phys. A 36(12), 3401–3409 (2003)
Robert, C.P., Casella, G.: Monte Carlo Statistical Methods. Springer Texts in Statistics, 2nd edn. Springer, New York (2004)
Roberts, G.O., Rosenthal, J.S.: Optimal scaling for various Metropolis-Hastings algorithms. Stat. Sci. 16(4), 351–367 (2001)
Roberts, G.O., Tweedie, R.L., et al.: Exponential convergence of Langevin distributions and their discrete approximations. Bernoulli 2(4), 341–363 (1996)
Rossky, P .H., Doll, J .D., Friedman, H .L.: Brownian dynamics as smart Monte Carlo simulation. J. Chem. Phys. 69, 4628 (1978)
Saff, E.B, Totik, V.: Logarithmic Potentials with External Fields. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 316, Springer, Berlin (1997) (Appendix B by Thomas Bloom)
Scardicchio, A., Zachary, C .E., Torquato, S.: Statistical properties of determinantal point processes in high-dimensional Euclidean spaces. Phys. Rev. E (3) 79(4), 041108 (2009). 19
Serfaty, S.: Coulomb Gases and Ginzburg-Landau Vortices. Zurich Lectures in Advanced Mathematics. Euro. Math. Soc. (EMS), Zürich (2015)
Serfaty, S.: Systems of points with Coulomb interactions. arXiv:1712.04095v1 (2017)
Smale, S.: Mathematical problems for the next century. Math. Intell. 20(2), 7–15 (1998)
Smale, S.: Mathematical problems for the next century. In: Arnold, V., Atiyah, M., Lax, P., Mazur, B. (eds.) Mathematics: Frontiers and Perspectives, pp. 271–294. Am. Math. Soc., Providence, RI (2000)
Stoltz, G., Trstanova, Z.: Stable and accurate schemes for Langevin dynamics with general kinetic energies. arXiv:1609.02891v1 (2016)
Vanetti, P., Bouchard-Côté, A., Deligiannidis, G., Doucet, A.: Piecewise-deterministic Markov Chain Monte Carlo. arXiv:1707.05296v1 (2018)
Verlet, L.: Computer experiments on classical fluids. I. Thermodynamical properties of Lennard-Jones molecules. Phys. Rev. 159(98), 9 (1967)
Acknowledgements
We warmly thank Gabriel Stoltz for his encouragements and for very useful discussions on the theoretical and numerical sides of this work. We are also grateful to Thomas Leblé and Laure Dumaz for their comments on the first version.
Author information
Authors and Affiliations
Corresponding author
Appendix A: Julia Code
Appendix A: Julia Code
Here is a program written in the Julia languageFootnote 2 illustrating our method. It allows to exploit the multiple cores of modern processors and works in parallel on clusters. Beware that this code is not fully optimized, for instance the energy and its gradient could be computed simultaneously for better performance.
Rights and permissions
About this article
Cite this article
Chafaï, D., Ferré, G. Simulating Coulomb and Log-Gases with Hybrid Monte Carlo Algorithms. J Stat Phys 174, 692–714 (2019). https://doi.org/10.1007/s10955-018-2195-6
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10955-018-2195-6
Keywords
- Numerical simulation
- Random number generator
- Singular Stochastic differential equation
- Coulomb gas
- Monte Carlo adjusted Langevin
- Hybrid Monte Carlo
- Markov chain Monte Carlo
- Langevin dynamics
- Kinetic equation