1 Introduction and Results

1.1 Introduction

In these notes, we consider a log-gas, also known as \(\beta \)-ensemble or one-component plasma, in dimension 1 or 2 at inverse temperature \(\beta =2\). Let \({\mathfrak {X}}= {\mathbb {R}}\) equipped with the Lebesgue measure \(d\mu = dx\) or \({\mathbb {C}}\) equipped with the area measure \(d\mu = d\mathrm {A}= \frac{r dr d\theta }{\pi }\). Let also \(V\in C^2({\mathfrak {X}})\) be a real-valued function such that for some \(\nu >0\),

$$\begin{aligned} V(z) \ge (1+\nu ) \log |z| \text { as } |z| \rightarrow \infty . \end{aligned}$$
(1.1)

We consider the probability measure on \({\mathfrak {X}}^N\) with density \(G_N(x)=e^{-\beta {\mathscr {H}}^N_V(x)}/Z^N_V\) where the Hamiltonian is

$$\begin{aligned} {\mathscr {H}}^N_V(x)=\sum _{1\le i<j \le N} \log | x_i -x_j|^{-1} + N \sum _{j=1}^N V(x_j) . \end{aligned}$$
(1.2)

Regardless of the dimension \(\mathrm {d}\), the condition \(\beta =2\) implies that if a configuration \((\lambda _1,\dots , \lambda _N)\) is sampled from \(G_N\), then the point process \(\Xi := \sum _{k=1}^N \delta _{\lambda _k}\) is determinantal with a correlation kernel

$$\begin{aligned} K^N_V(z,w) = \sum _{k=0}^{N-1} \varphi _k(z)\overline{\varphi _k(w)} , \end{aligned}$$
(1.3)

with respect to \(\mu \). Moreover, for all \(k\ge 0\),

$$\begin{aligned} \varphi _k(x) = P_k(x) e^{-N V(x)} , \end{aligned}$$
(1.4)

where \(\{P_k \}_{k=0}^\infty \) is the sequence of orthonormal polynomialsFootnote 1 with respect to the weight \(e^{- 2N V(x)}\) on \(L^2(\mu )\). It turns out that for \(\beta =2\), the density \(G_N\) also corresponds to the joint law of the eigenvalues of the ensemble of Hermitian (or normal) matrices with weight \(e^{-2N {\text {Tr}}V(M)}\) on \({\mathfrak {X}}={\mathbb {R}}\) (or \({\mathfrak {X}}={\mathbb {C}}\)). In particular, when \(V(z)=|z|^2\), these correspond to the well-know Gaussian Unitary (GUE) and Ginibre ensembles respectively. It is well known that if the condition (1.1) holds, the thermodynamical limit of the log-gas is described by an equilibrium measure which has compact support. Moreover, if the potential \(V\in C^2({\mathfrak {X}})\), then the equilibrium measure is absolutely continuous and we let \(\varrho _V\) be its density. This implies that for any bounded test function \(f \in C({\mathfrak {X}})\), as \(N\rightarrow +\infty \),

$$\begin{aligned} \frac{1}{N}{\mathbb {E}}\big [\Xi (f)\big ] = \int f(x) u^N_V(x) d\mu (x) \rightarrow \int f(x) \varrho _V(x) d\mu (x) , \end{aligned}$$
(1.5)

where the expected density of states is given by \(u^N_V(x) = N^{-1}K^N_V(x,x)\). The asymptotics (1.5) follows either from potential theory for general \(\beta >0\) or from the asymptotics of the correlation kernel (1.3) when \(\beta =2\). We refer to [4, Section 2.6] for a proof of the large deviation principle in dimension 1 and to [18] for analogous results for Coulomb gases in higher dimension and further references.
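
To make this concrete, consider the Ginibre case \(V(z)=|z|^2\): a direct computation gives the orthonormal polynomials \(P_k(z)= \sqrt{(2N)^{k+1}/k!}\, z^k\) for the weight \(e^{-2N|z|^2}\), the droplet is the disk of radius \(1/\sqrt{2}\) and \(\varrho _V \equiv 2\) there. The following minimal Python sketch (numpy and scipy assumed; all parameters are only illustrative) evaluates the expected density of states \(u^N_V\) of (1.5) directly from the kernel (1.3):

```python
import numpy as np
from scipy.special import gammaln

def u_N(r, N):
    # u^N_V(z) = K^N_V(z,z)/N at |z| = r for V(z) = |z|^2, from (1.3)-(1.4):
    # K^N_V(z,z) = e^{-2N|z|^2} sum_{k<N} (2N)^{k+1} |z|^{2k} / k!
    k = np.arange(N)
    log_terms = (k + 1) * np.log(2 * N) + 2 * k * np.log(r) - gammaln(k + 1)
    return np.exp(log_terms - 2 * N * r ** 2).sum() / N

N = 200
for r in [0.05, 0.3, 0.6, 0.8, 1.0]:
    print(r, u_N(r, N))   # ~ 2 inside the droplet |z| < 1/sqrt(2), ~ 0 outside

# mass check: the expected density of states integrates to 1 against dA
rs = np.linspace(1e-3, 1.5, 600)
print(np.sum([u_N(r, N) * 2 * r for r in rs]) * (rs[1] - rs[0]))   # ~ 1
```

The printed values illustrate (1.5): \(u^N_V \approx \varrho _V\) in the bulk, with a smooth drop of width of order \(N^{-1/2}\) at the edge of the droplet.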

In the following, we consider the problem of describing the fluctuations of the so-called thinned log-gases in dimension \(\mathrm {d}=1,2\). In general, a thinned or incomplete point process is defined by performing a Bernoulli percolation on the configuration of the original process. That is, the incomplete log-gas, denoted by \({\widehat{\Xi }}\), is obtained by deleting independently each particle with probability \(q_N \in (0,1)\) or by keeping it with probability \(p_N=1-q_N\). It turns out that the incomplete process \({\widehat{\Xi }}\) is also determinantal with correlation kernel \({\widehat{K}}^N_V(z,w) = p_NK^N_V(z,w)\); see Appendix A for a short proof. In the context of random matrix theory, this procedure was first considered by Bohigas and Pato [8, 9] who showed that it gives rise to a crossover to Poisson statistics, and the problem of rigorously analyzing this transition in the context of Coulomb gases was popularized by Deift in [23, Problem 2]. Indeed, these types of transitions are supposed to arise in many different contexts in statistical physics, such as the localization/delocalization phenomena, the crossover from chaotic to regular dynamics in the theory of quantum billiards, or in the spectrum of certain band random matrices, see [27, 28, 50] and references therein. Although such transitions are believed to be non-universal, the model of Bohigas and Pato is arguably one of the most tractable to study this phenomenon because it is determinantal. In a different context, the effect of thinning a determinantal process on statistical inference has been recently discussed in [40] and it should be emphasized that the general strategy explained in Sect. 2 applies to more general determinantal processes, see Theorem 2.2. For instance, our method applies to the Sine and the \(\infty \)-Ginibre processes which describe the local limits of the log-gases in dimensions 1 and 2 respectively. In fact, this paper is motivated by an analogous result obtained recently by Berggren and Duits for smooth linear statistics of the incomplete Sine and CUE processes [6]. Based on the fact that these processes come from integrable operators, they fully characterized the transition for a large class of mesoscopic linear statistics and suggested that it should be universal for thinned point processes coming from random matrix theory. There are also results for the gap probabilities of the critical thinned ensembles. In [14, 15], for the Sine process, Deift et al. computed very detailed asymptotics for the crossover from the Wigner surmise to the exponential distribution, making rigorous a prediction of Dyson [26], and Charlier–Claeys obtained an analogous result for the CUE [19]. The contribution of this paper is to elaborate on universality for smooth linear statistics of \(\beta \)-ensembles in dimension 1 or 2 when \(\beta =2\). Although our proof relies on the determinantal structure of these models, instead of the connection with Riemann–Hilbert problems used in the previous works, we apply the cumulants’ method which appears to be very robust to study the asymptotic fluctuations of smooth linear statistics.

Let us point out that based on the theory of [30], an alternative correlation kernel for the incomplete process is

$$\begin{aligned} {\widehat{K}}^N_V(z,w) = \sum _{k=0}^\infty J_k^N \varphi _k(z)\overline{\varphi _k(w)} , \end{aligned}$$
(1.6)

where \((J_k^N)_{k=0}^\infty \) is a sequence of independent Bernoulli random variables with expected values \({\mathbb {E}}[J_k^N]= p_N \mathbf {1}_{k<N} \). This shows that removing particles builds up randomness in the system and when the disorder becomes sufficiently strong, it will behave like a Poisson process rather than according to random matrix theory.
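
At the level of samples, the percolation description is trivial to implement. A small sketch (numpy assumed; the matrix size and \(q_N\) are illustrative) which samples the Ginibre ensemble, normalized so that the spectrum matches the weight \(e^{-2N|z|^2}\) and fills the disk of radius \(1/\sqrt{2}\), and then performs the Bernoulli thinning:

```python
import numpy as np

rng = np.random.default_rng(1)
N, q_N = 500, 0.02            # illustrative size and deletion probability
p_N = 1 - q_N

# Ginibre ensemble for the weight e^{-2N|z|^2}: i.i.d. complex Gaussian
# entries of variance 1/(2N), so the spectrum fills the disk of radius 1/sqrt(2)
G = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(4 * N)
eigs = np.linalg.eigvals(G)

# Bernoulli percolation: keep each particle independently with probability p_N
kept = eigs[rng.random(N) < p_N]

f = lambda z: np.exp(-4 * np.abs(z) ** 2)   # a smooth test function
print(len(kept), f(kept).sum())             # number of kept points and the statistic
```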

To keep the analysis as simple as possible, we will restrict ourselves to real-analytic V, although the results should be valid for more general potentials as well (especially in dimension 1 where the asymptotics of the correlation kernels have been studied in great generality). We keep track of the transition by looking at linear statistics \(\widehat{\Xi }(f) = \sum f(\lambda )\) for smooth test functions, where the sum is over the configuration of the incomplete log-gas. The random matrix regime is characterized by the property that the fluctuations of \(\widehat{\Xi }(f)\) are of order 1 and described by a universal Gaussian noise as the number of particles tends to infinity. On the other hand, in the Poisson regime, the variance of any non-trivial statistic diverges and, once properly renormalized, the point process converges in distribution to a white noise. In the remainder of this introduction, we first formulate our assumptions and main results for the fluctuations of the incomplete 2-dimensional log-gases. Then, we will present analogous results in dimension 1.
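
This dichotomy can be anticipated from an elementary conditional variance computation: since the deletions are performed independently of the configuration, conditioning on \(\Xi \) gives

$$\begin{aligned} {\text {Var}}\big [ \widehat{\Xi }(f) \big ] = p_N^2\, {\text {Var}}\big [ \Xi (f) \big ] + p_N q_N\, {\mathbb {E}}\big [ \Xi (f^2) \big ] , \end{aligned}$$

and, since \({\mathbb {E}}\big [ \Xi (f^2) \big ]\) is of order N by (1.5), the Poissonian contribution to the variance is of order \(Nq_N\). This already singles out \(T_N = Nq_N\) as the relevant parameter for the transition.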

In what follows, we let \({\mathcal {C}}^k_c({\mathscr {S}})\) be the set of functions with k continuous derivatives and compact support in \({\mathscr {S}}\subset {\mathfrak {X}}\) and we use the notation:

$$\begin{aligned} \partial = (\partial _x - i \partial _y)/2 \ , \, \overline{\partial }= (\partial _x + i \partial _y)/2 \,\text {and } \Delta = \partial \overline{\partial }. \end{aligned}$$

If \(\varrho _V\) is the equilibrium density function, we also denote

$$\begin{aligned} {\mathscr {S}}_V:= \big \{ x\in {\mathfrak {X}}: \varrho _V(x) >0 \big \} . \end{aligned}$$
(1.7)

1.2 Main Results for 2-Dimensional Coulomb Gases

If the potential V is real-analytic and satisfies the condition (1.1), then the log-gas lives on the compact set \(\overline{{\mathscr {S}}_V} \subset {\mathbb {C}}\) which is called the droplet and the equilibrium density is given by \(\varrho _V = 2\Delta V \mathbf {1}_{{\mathscr {S}}_V}\). It is also well known that the bulk fluctuations of a two-dimensional log-gas around its equilibrium configuration are described by a centered Gaussian process \(\mathrm {X}\) with correlation structure:

$$\begin{aligned} {\mathbb {E}}\big [\mathrm {X}(f) \mathrm {X}(g) \big ] = \frac{1}{4} \int _{\mathbb {C}}\nabla f(z) \cdot \nabla g(z)\ d\mathrm {A}(z) = \int _{\mathbb {C}}\partial f(z) \overline{\partial }g(z) d\mathrm {A}(z) , \end{aligned}$$
(1.8)

for any (real-valued) smooth functions f and g. Modulo constants, the RHS of formula (1.8) defines a Hilbert space, denoted \( H^{1}({\mathbb {C}})\), with norm:

$$\begin{aligned} \Vert f \Vert _{H^1({\mathbb {C}})}^2 = \int _{\mathbb {C}}\left| \partial f(z) \right| ^2 d\mathrm {A}(z) . \end{aligned}$$
(1.9)

Therefore the stochastic process \(\mathrm {X}\) is called a \(H^1\)-Gaussian noise. The central limit theorem (CLT) was first established for the Ginibre process by Rider and Virág [45] for \({\mathcal {C}}^1\) test functions with at most exponential growth at infinity. For general real-analytic potentials, it was proved in [3] that for any smooth function f with compact support, one has as \(N\rightarrow \infty \),

$$\begin{aligned} \Xi (f) - {\mathbb {E}}\big [\Xi (f) \big ] \Rightarrow \mathrm {X}(f^\dagger ) \end{aligned}$$
(1.10)

where \(f^\dagger \) is the (unique) continuous and bounded function on \({\mathbb {C}}\) such that \(f^\dagger = f\) on the droplet \(\overline{{\mathscr {S}}_V}\) and \(\Delta f^\dagger =0\) on \({\mathbb {C}}\backslash \overline{{\mathscr {S}}_V}\). Actually, when \({\text {supp}}(f) \subset {\mathscr {S}}_V \), we have \(f^\dagger = f\) on \({\mathbb {C}}\) and the CLT was obtained previously in the paper [2], which inspired part of our method. We also refer to [10, 41] for more recent proofs which hold for general \(\beta >0\). By convention, in (1.10) and below, \(\Rightarrow \) means that the convergence holds in distribution and that all moments of the random variable converge.

In order to describe the crossover from the \(H^1\)-Gaussian noise to white noise, let \(\Lambda _\eta \) be a mean-zero Poisson process with intensity \(\eta \in L^\infty ({\mathfrak {X}})\). This process is characterized by the fact that for any function \(f\in {\mathcal {C}}_c({\mathfrak {X}})\), the Laplace transform of the random variable \(\Lambda _\eta (f)\) is well-defined and given by

$$\begin{aligned} \log {\mathbb {E}}\big [ \exp \Lambda _{\eta }(f) \big ] = \int _{{\mathfrak {X}}} \big ( e^{f(z)} - 1 - f(z) \big ) \eta (z) d\mu (z) . \end{aligned}$$
(1.11)

Theorem 1.1

Let \(\mathrm {X}\) be a \(H^1\)-Gaussian noise and \(\Lambda _{\tau \varrho _V}\) be an independent Poisson process with intensity \(\tau \varrho _V\), \(\tau >0\), defined on the same probability space. Let \(f \in {\mathcal {C}}^3_c({\mathscr {S}}_V)\), \(p_N = 1-q_N\), and let \(T_N = N q_N\). As \(N\rightarrow \infty \) and \(q_N \rightarrow 0\), we have

$$\begin{aligned} \widehat{\Xi }(f) - {\mathbb {E}}\big [\widehat{\Xi }(f) \big ]&\Rightarrow \mathrm {X}(f)&\text {if }\ T_N \rightarrow 0 , \end{aligned}$$
(1.12)
$$\begin{aligned} \frac{\widehat{\Xi }(f) - {\mathbb {E}}\big [\widehat{\Xi }(f) \big ]}{\sqrt{T_N} }&\Rightarrow {\mathcal {N}}\left( 0, \int f(z)^2 \varrho _V(z) d\mathrm {A}(z) \right)&\text {if }\ T_N \rightarrow \infty , \end{aligned}$$
(1.13)
$$\begin{aligned} \widehat{\Xi }(f) - {\mathbb {E}}\big [\widehat{\Xi }(f) \big ]&\Rightarrow \mathrm {X}(f) + \Lambda _{\tau \varrho _V}(-f)&\text {if }\ T_N \rightarrow \tau . \end{aligned}$$
(1.14)

The proof of Theorem 1.1 is based on the cumulants’ method and it is explained in detail in Sect. 2. In particular, we formulate a result—Theorem 2.2—valid for general determinantal point processes which might be of independent interest. The details of the proof of Theorem 1.1 are given in Sect. 3.2. Our method relies on the approximations of the correlation kernel \(K^N_V\) from [2]—see Lemma 3.1 below—and it restricts us to work with test functions which are supported inside the bulk. However, the result should be true for general functions if we replace f by \(f^\dagger \) on the RHS of (1.12) and (1.14).

Theorem 1.1 can be interpreted as follows. In the regime \(Nq_N \rightarrow 0\), virtually no particles are deleted and linear statistics behave according to random matrix theory. On the other hand, in the regime \(Nq_N \rightarrow \infty \), the variance of a linear statistic diverges. So, if we renormalize the random variable \(\widehat{\Xi }(f)\), we obtain a classical CLT and (1.13) shows that the limit is described by a white noise supported on \({\mathscr {S}}_V\) whose intensity is the equilibrium measure \(\varrho _V\). In the critical regime, when the expected number of deleted particles converges to \(\tau >0\), the limiting process is the superposition of a \(H^1\)-correlated Gaussian noise and an independent mean-zero Poisson process applied to \(-f\). Finally, by using formula (1.11), it is not difficult to check that as \(\tau \rightarrow \infty \), the random variable

$$\begin{aligned} \frac{\Lambda _{\tau \varrho _V}(-f)}{\sqrt{\tau }} \Rightarrow {\mathcal {N}}\left( 0, \int f(z)^2 \varrho _V(z) d\mathrm {A}(z) \right) , \end{aligned}$$

so that the critical regime clearly interpolates between (1.12) and (1.13).
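
Indeed, expanding the exponential in (1.11) shows that, for \(n\ge 2\), the \(n{\mathrm{th}}\) cumulant of \(\Lambda _{\eta }(g)\) equals \(\int g^n \eta \, d\mu \), so that

$$\begin{aligned} {\text {C}}^n\left[ \frac{\Lambda _{\tau \varrho _V}(-f)}{\sqrt{\tau }} \right] = (-1)^n\, \tau ^{1-n/2} \int f(z)^n \varrho _V(z) d\mathrm {A}(z) , \end{aligned}$$

which converges to \(\int f^2 \varrho _V d\mathrm {A}\) for \(n=2\) and to 0 for all \(n\ge 3\) as \(\tau \rightarrow \infty \).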

In fact, the crossover is more interesting at mesoscopic scales. Namely, the density of a log-gas is of order N and one can also investigate fluctuations at small scales by zooming inside the bulk of the process. If \(L_N \nearrow \infty \), \(x_0 \in {\mathscr {S}}_V\), and \(f\in {\mathcal {C}}_c({\mathbb {C}})\), we consider the test function

$$\begin{aligned} f_N(z) = f\big (L_N(z-x_0) \big ) . \end{aligned}$$
(1.15)

The regime \(L_N=N^{1/\mathrm {d}}\) is called microscopic and it was shown in [2, Proposition 7.5.1] that when \(\mathrm {d}=2\),

$$\begin{aligned} \Xi (f_N) \Rightarrow \Xi ^\infty _{\varrho _V(x_0)}(f) , \end{aligned}$$

where the process \( \Xi ^\infty _{\rho }\) is called the \(\infty \)-Ginibre process with density \(\rho >0\). It is a determinantal process on \({\mathbb {C}}\) with correlation kernel

$$\begin{aligned} K^\infty _\rho (z,w) = \rho e^{\rho (2z{\overline{w}} - |z|^2 -|w|^2 )/2} . \end{aligned}$$
(1.16)

Based on the argument from [2], it is straightforward to verify that the incomplete process has a local limit as well:

$$\begin{aligned} \widehat{\Xi } (f_N) \Rightarrow \widehat{\Xi }^{\infty }_{\varrho _V(x_0) ; p}(f) \text { as }p_N\rightarrow p \text { and } N\rightarrow \infty . \end{aligned}$$
(1.17)

For any \(0<p\le 1\), \( \widehat{\Xi }^{\infty }_{\rho ; p}\) is a (translation invariant) determinantal process on \({\mathbb {C}}\) with correlation kernel \(p K^\infty _\rho (z,w) \). This process is constructed by running an independent Bernoulli percolation with parameter p on the point configuration of the \(\infty \)-Ginibre process with density \(\rho >0\). In particular, (1.17) shows that one needs to delete a non-vanishing fraction of the N particles of the gas in order to get a local limit which is different from random matrix theory. It was proved in [44] that, as the density \(\rho \rightarrow \infty \), the fluctuations of the \(\infty \)-Ginibre process are of order 1 and described by the \(H^1\)-Gaussian noise:

$$\begin{aligned} \Xi ^\infty _\rho (f) - \rho \int _{{\mathbb {C}}} f(z) d\mathrm {A}(z)\ \Rightarrow \mathrm {X}(f) \end{aligned}$$

for any \(f \in H^1\cap L^1({\mathbb {C}})\). Therefore it is expected that, in the mesoscopic regime, \(\text {i.e. }L_N = o(\sqrt{N})\), the asymptotic fluctuations of the linear statistic \(\Xi (f_N)\) are universal and described by \(\mathrm {X}(f)\). However, to the best of our knowledge, a proof was missing from the literature and, in Sect. 3, we show that this fact follows quite simply by combining the ideas from [44] and [2].
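
For a determinantal process the variance is explicit: since \(\big | K^\infty _\rho (z,w)\big |^2 = \rho ^2 e^{-\rho |z-w|^2}\), one has \({\text {Var}}\big [ \Xi ^\infty _\rho (f) \big ] = \rho \int f^2 d\mathrm {A}- \rho ^2 \iint f(z)f(w) e^{-\rho |z-w|^2} d\mathrm {A}(z) d\mathrm {A}(w)\), and its convergence towards \(\Vert f\Vert ^2_{H^1({\mathbb {C}})}\) can be checked numerically. A brute-force quadrature sketch (numpy assumed; for \(f(z)=e^{-|z|^2}\) the variance equals \(\rho /(2(1+2\rho ))\) and the limit is \(1/4\)):

```python
import numpy as np

rho = 10.0
L, M = 3.5, 70
xs = np.linspace(-L, L, M)
h = xs[1] - xs[0]
X, Y = np.meshgrid(xs, xs)
Z = (X + 1j * Y).ravel()
F = np.exp(-np.abs(Z) ** 2)       # the test function f(z) = e^{-|z|^2}
w = h * h / np.pi                 # quadrature weight for dA = (Lebesgue measure)/pi

term1 = rho * np.sum(F ** 2) * w
term2 = 0.0
for i in range(Z.size):           # row by row to keep the memory footprint small
    term2 += np.sum(F[i] * F * np.exp(-rho * np.abs(Z[i] - Z) ** 2))
term2 *= rho ** 2 * w * w

print(term1 - term2)              # ~ rho/(2(1+2rho)) = 0.2381 for rho = 10
print(0.25)                       # ||f||_{H^1}^2, the rho -> infinity limit
```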

Theorem 1.2

Let \(x_0 \in {\mathscr {S}}_V\), \(f \in {\mathcal {C}}_c^3({\mathbb {C}})\), \(\alpha \in (0,1/2)\), and let \(f_N\) be given by formula (1.15) with \(L_N =N^\alpha \). Then, we have as \(N \rightarrow \infty \),

$$\begin{aligned} \Xi (f_N) - {\mathbb {E}}\big [ \Xi (f_N)\big ] \Rightarrow \mathrm {X}(f) . \end{aligned}$$

Using the same method, we can also describe the fluctuations of smooth mesoscopic linear statistics of an incomplete Coulomb gas.

Theorem 1.3

Let \(\mathrm {X}\) be a \(H^1\)-Gaussian noise and \(\Lambda _{\tau }\) be an independent Poisson process with constant intensity \(\tau >0\) on \({\mathbb {C}}\). Let \(x_0 \in {\mathscr {S}}_V\), \(f \in {\mathcal {C}}^3_c({\mathscr {S}}_V)\), \(\alpha \in (0,1/2)\), and let \(f_N\) be the mesoscopic test function given by formula (1.15) with \(L_N =N^\alpha \). We also let \(p_N = 1-q_N\) and \(T_N= N q_N L_N^{-2} \varrho _V(x_0)\). We have as \(N\rightarrow \infty \) and \(q_N\rightarrow 0\),

$$\begin{aligned} \widehat{\Xi }(f_N) - {\mathbb {E}}\big [\widehat{\Xi }(f_N) \big ]&\Rightarrow \mathrm {X}(f)&\text {if }\ T_N \rightarrow 0 ,\\ \frac{\widehat{\Xi }(f_N) - {\mathbb {E}}\big [\widehat{\Xi }(f_N) \big ]}{\sqrt{T_N} }&\Rightarrow {\mathcal {N}}\left( 0, \int f(z)^2 d\mathrm {A}(z) \right)&\text {if }\ T_N \rightarrow \infty ,\\ \widehat{\Xi }(f_N) - {\mathbb {E}}\big [\widehat{\Xi }(f_N) \big ]&\Rightarrow \mathrm {X}(f) + \Lambda _{\tau }(-f)&\text {if }\ T_N \rightarrow \tau . \end{aligned}$$

The proof of Theorem 1.3 follows the same strategy as that of Theorem 1.1 and the technical differences are explained in Sect. 3.2. This result shows that, at mesoscopic scales, the transition occurs when the mesoscopic density of deleted particles, which is given by the parameter \(T_N>0\), converges to a positive constant \(\tau \). In contrast to previous results, this transition appears to be non-Gaussian and it is somewhat surprising that it can also be described in an elementary way. In dimension 1, one can obtain a crossover from GUE eigenvalues to a Poisson process by letting independent points evolve according to Dyson’s Brownian motion. This leads to a determinantal process sometimes called the deformed GUE whose kernel depends on the diffusion time, see [32]. This model also exhibits a transition which has been analyzed for mesoscopic linear statistics in [25] and it was proved that the critical fluctuations are Gaussian. One can also consider non-intersecting Brownian motions on a cylinder. It turns out that this point process describes the positions of free fermions confined in a harmonic trap at a temperature \(\tau >0\). It was established in [34], see also [21], that the corresponding grand canonical ensemble is determinantal with a correlation kernel of the form (1.6) where, for \(k\ge 0\),

$$\begin{aligned} {\mathbb {E}}[J_k^N] = \frac{1}{1+ \exp \left( \frac{k-N}{\tau }\right) } . \end{aligned}$$

For sufficiently small temperature, this system behaves like its ground state, the GUE, while it behaves like a Poisson process at larger temperature. It was proved in [35] that this leads to yet another crossover where non-Gaussian fluctuations are observed at the critical temperature. However, to the author’s knowledge, in contrast to the incomplete ensembles considered here, the critical processes discovered in [35] cannot be described in simple terms like Theorem 1.5 below.

1.3 Main Results for Eigenvalues of Unitary Invariant Hermitian Random Matrices

For 1-dimensional log-gases, for general \(\beta >0\) and for a large class of potentials, Johansson established in [31] the existence of the equilibrium measure and also managed to describe the fluctuations around the equilibrium configuration. To state the result in a universal way, note that one can make an affine rescaling of the potential and assume that \({\mathscr {I}}_V \subset [-1,1] \), where \({\mathscr {I}}_V\) denotes the set (1.7) in dimension 1. If V is a polynomial and \({\mathscr {I}}_V =(-1,1)\), Johansson proved that linear statistics of the process \(\Xi \) satisfy a central limit theorem:

$$\begin{aligned} \Xi (f) - N \int _{\mathbb {R}}f(x) \varrho _V(x) dx \Rightarrow \mathrm {Y}(f) \,\text {as }N\rightarrow \infty , \end{aligned}$$
(1.18)

for any \(f \in {\mathcal {C}}^2({\mathbb {R}})\) such that \(f'(x)\) grows at most polynomially as \(|x|\rightarrow \infty \). The process \(\mathrm {Y}\) is a centered Gaussian noise defined on \([-1,1]\) with covariance structure:

$$\begin{aligned} {\mathbb {E}}\big [ \mathrm {Y}(f)\mathrm {Y}(g) \big ] = \frac{1}{4} \sum _{k=1}^\infty k \mathrm {c}_k(f) \mathrm {c}_k(g) . \end{aligned}$$
(1.19)

In (1.19), \(\mathrm {c}_k(f)\) denote the Fourier–Chebyshev coefficients of the function f:

$$\begin{aligned} \mathrm {c}_k(f)=\frac{2}{\pi } \int _{-1}^1 f(x) T_k(x) \frac{dx}{\sqrt{1-x^2}} , \end{aligned}$$
(1.20)

where \((T_k)_{k=0}^\infty \) are the Chebyshev polynomials of the first kind. The CLT (1.18) holds for more general potentials and for other orthogonal polynomial ensembles as well, see [43, Section 11.3] or [12, 17, 39], and it is known that the one-cut condition, i.e. the assumption that the support of the equilibrium measure is connected, is necessary. Otherwise, the asymptotic fluctuations of a generic linear statistic \(\Xi (f)\) are still of order 1 but are not Gaussian, see [13, 42, 46]. In fact, the one-cut condition is closely related to the fact that the recurrence coefficients [see formula (4.1)] which define the orthogonal polynomials \((P_k)_{k\ge 0}\) appearing in the correlation kernel (1.3) satisfy for any \(j\in {\mathbb {Z}}\),

$$\begin{aligned} \lim _{N\rightarrow \infty } a^N_{N+j} = 1/2 \,\text {and}\, \lim _{N\rightarrow \infty }b^N_{N+j} = 0; \end{aligned}$$
(1.21)

see Remark 4.2 below. As for 2-dimensional Coulomb gases, we obtain analogous transitions for the eigenvalues of random unitary invariant Hermitian matrices.
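
For explicit test functions, the limiting variance in (1.18)–(1.19) is easy to evaluate since the coefficients (1.20) are computed exactly by Gauss–Chebyshev quadrature, with nodes \(x_j = \cos \big ( \tfrac{(2j+1)\pi }{2n} \big )\) and equal weights \(\pi /n\). A short Python sketch (numpy assumed); for \(f(x)=x^2 = \tfrac{1}{2}(T_0+T_2)\) it returns \(\tfrac{1}{8}\), and for \(f(x)=x^3=\tfrac{1}{4}(3T_1+T_3)\) it returns \(\tfrac{3}{16}\):

```python
import numpy as np

def cheb_coeffs(f, kmax, n=4000):
    # c_k(f) = (2/pi) int_{-1}^1 f(x) T_k(x) dx / sqrt(1-x^2), formula (1.20),
    # by Gauss-Chebyshev quadrature with T_k(cos t) = cos(k t)
    theta = (2 * np.arange(n) + 1) * np.pi / (2 * n)
    fx = f(np.cos(theta))
    return np.array([(2.0 / n) * np.sum(fx * np.cos(k * theta))
                     for k in range(1, kmax + 1)])

def var_Y(f, kmax=50):
    # formula (1.19) with g = f
    c = cheb_coeffs(f, kmax)
    return 0.25 * np.sum(np.arange(1, kmax + 1) * c ** 2)

print(var_Y(lambda x: x ** 2))    # 0.125  = 1/8
print(var_Y(lambda x: x ** 3))    # 0.1875 = 3/16
```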

Theorem 1.4

Let \(p_N = 1-q_N\), \(T_N = N q_N\), and suppose that the recurrence coefficients of the orthogonal polynomials \(\{P_k \}_{k=0}^\infty \) satisfy the conditions (1.21). Then, for any polynomial Q, we obtain as \(N\rightarrow \infty \) and \(q_N \rightarrow 0\),

$$\begin{aligned} \widehat{\Xi }(Q) - {\mathbb {E}}\big [\widehat{\Xi }(Q) \big ]&\Rightarrow \mathrm {Y}(Q)&\text {if }\ T_N \rightarrow 0 , \\ \frac{\widehat{\Xi }(Q) - {\mathbb {E}}\big [\widehat{\Xi }(Q) \big ]}{\sqrt{T_N} }&\Rightarrow {\mathcal {N}}\left( 0, \int _{{\mathbb {R}}} Q(x)^2 \varrho _V(x) dx \right)&\text {if }\ T_N \rightarrow \infty ,\\ \widehat{\Xi }(Q) - {\mathbb {E}}\big [\widehat{\Xi }(Q) \big ]&\Rightarrow \mathrm {Y}(Q) + \Lambda _{\tau \varrho _V}(-Q)&\text {if }\ T_N \rightarrow \tau , \end{aligned}$$

where the Poisson process \( \Lambda _{\tau \varrho _V}\) is independent from the Gaussian process \(\mathrm {Y}\) and both are defined on \({\mathscr {I}}_V =(-1,1)\).

The proof of Theorem 1.4 is also based on the cumulants’ method and on Theorem 2.2. However, the technical details, which are explained in Sects. 4.1 and 4.2, rely on the formulation from [39] and are very different from those of the proof of Theorem 1.1.

We also obtain the counterpart of Theorem 1.4 for mesoscopic linear statistics. For any function \(f\in L^1({\mathbb {R}})\), we define its Fourier transform:

$$\begin{aligned} {\hat{f}}(u) = \int _{\mathbb {R}}f(x) e^{-2\pi i x u} dx . \end{aligned}$$

We let \(\mathrm {Z}\) be a mean-zero Gaussian process on \({\mathbb {R}}\) with correlation structure:

$$\begin{aligned} {\mathbb {E}}\big [ \mathrm {Z}(h)\mathrm {Z}(g) \big ] = \int _{0}^\infty u {\hat{h}}(u) \overline{{\hat{g}}(u)} du . \end{aligned}$$
(1.22)

Since \({\text {Var}}\big [ \mathrm {Z}(f)\big ] = \Vert f\Vert _{H^{1/2}({\mathbb {R}})}^2\), the process \(\mathrm {Z}\) is usually called the \(H^{1/2}\)-Gaussian noise. It describes the mesoscopic fluctuations of the eigenvalues of Hermitian random matrices, see [16, 29, 38], as well as the mesoscopic fluctuations of the log-gases for general \(\beta >0\), [5], and of certain random band matrices in the appropriate regime [27, 28].
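
The variance (1.22) is equally simple to evaluate with a discrete Fourier transform. A sketch (numpy assumed; with the convention above, \(f(x)=e^{-\pi x^2}\) satisfies \({\hat{f}}(u)=e^{-\pi u^2}\), so that \({\text {Var}}\big [ \mathrm {Z}(f) \big ] = \int _0^\infty u e^{-2\pi u^2} du = \tfrac{1}{4\pi }\)):

```python
import numpy as np

L, n = 40.0, 2 ** 14
dx = L / n
x = np.arange(n) * dx - L / 2          # uniform grid on [-L/2, L/2)
f = np.exp(-np.pi * x ** 2)

u = np.fft.fftfreq(n, d=dx)            # frequencies matching e^{-2 pi i x u}
fhat_abs = dx * np.abs(np.fft.fft(f))  # |fhat(u)|; the grid offset only affects the phase

mask = u > 0
var = np.sum(u[mask] * fhat_abs[mask] ** 2) * (1 / L)   # frequency spacing du = 1/L
print(var, 1 / (4 * np.pi))            # both ~ 0.0796
```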

Theorem 1.5

We let \(x_0\in {\mathscr {I}}_V\), \(f\in {\mathcal {C}}^2_c({\mathbb {R}})\), \(\alpha \in (0,1)\), and \(f_N(x) = f\big (N^\alpha (x-x_0) \big )\). We also let \(p_N = 1-q_N\) and \(T_N = q_NN^{1-\alpha } \varrho _V(x_0)\). We obtain as \(N\rightarrow \infty \) and \(q_N\rightarrow 0\),

$$\begin{aligned} \widehat{\Xi }(f_N) - {\mathbb {E}}\big [\widehat{\Xi }(f_N) \big ]&\Rightarrow \mathrm {Z}(f)&\text {if }\ T_N \rightarrow 0 , \end{aligned}$$
(1.23)
$$\begin{aligned} \frac{\widehat{\Xi }(f_N) - {\mathbb {E}}\big [\widehat{\Xi }(f_N) \big ]}{\sqrt{T_N} }&\Rightarrow {\mathcal {N}}\left( 0, \int _{{\mathbb {R}}} f(x)^2 dx \right)&\text {if }\ T_N \rightarrow \infty , \end{aligned}$$
(1.24)
$$\begin{aligned} \widehat{\Xi }(f_N) - {\mathbb {E}}\big [\widehat{\Xi }(f_N) \big ]&\Rightarrow \mathrm {Z}(f) + \Lambda _{\tau }(-f)&\text {if }\ T_N \rightarrow \tau , \end{aligned}$$
(1.25)

where the Poisson process \(\Lambda _{\tau }\) has constant intensity \(\tau >0\) on \({\mathbb {R}}\) and is independent from the \(H^{1/2}\)-Gaussian noise \(\mathrm {Z}\).

The proof of Theorem 1.5 is quite similar to that of Theorem 1.3. It follows the strategy explained in Sect. 2 and is based on the asymptotics for the correlation kernel \(K^N_V\) in terms of the sine-kernel, see [37]. The details rely on the method from [38] and are given in Sect. 4.3.

1.4 Overview of the Rest of the Paper

In Sect. 2, we present the strategy of the proofs of the results from Sects. 1.2 and 1.3. We begin by reviewing Soshnikov’s cumulants’ method. Then, we explain how to apply it to the incomplete ensemble \(\widehat{\Xi }\) and we obtain a general result—Theorem 2.2—which characterizes the transition from Gaussian to Poisson statistics for general determinantal point processes. The rest of the paper consists in verifying the assumptions of Theorem 2.2 for determinantal log-gases in dimensions 1 and 2. In Sect. 3, we prove Theorems 1.1–1.3 for the 2-dimensional log-gases. The proof relies on estimates for the correlation kernel (1.3) which come from the paper [2] and are collected in Appendix B. In Sect. 4, we provide the details of the proofs of Theorems 1.4 and 1.5 by relying on the method from [39] and [38] respectively.

In the following, \(C>0\) denotes a numerical constant which changes from line to line. For any \(n\in {\mathbb {N}}\), we let for all \(\mathrm {x} \in {\mathfrak {X}}^n\),

$$\begin{aligned} d\mu ^n(\mathrm {x}) = d\mu (x_1) \cdots d\mu (x_n) . \end{aligned}$$

If \((\mathrm {u}_N)\) and \((\mathrm {v}_N)\) are two sequences, we use the notation:

$$\begin{aligned} \mathrm {u}_N \simeq \mathrm {v}_N \,\text {if} \, \lim _{N\rightarrow \infty }( \mathrm {u}_N - \mathrm {v}_N) =0 . \end{aligned}$$

2 Outline of the Proof

In this section, we consider a general state space \({\mathfrak {X}}\) which is a complete separable metric space equipped with a Radon measure \(\mu \) as in [30, 47] and let \(\Xi \) be a sequence of determinantal point processes on \({\mathfrak {X}}\) with correlation kernels \(K^N\) which are reproducing:

$$\begin{aligned} \int _{\mathfrak {X}}K^N(z, x) K^N(x, w) d\mu (x) = K^N(z,w) . \end{aligned}$$
(2.1)

One may think of the parameter \(N \in {\mathbb {N}}\) as the density of particles. Since this is generally the case in the context of random matrix theory, we shall also assume that the kernels \(K^N\) are continuous on \({\mathfrak {X}}\times {\mathfrak {X}}\), Hermitian symmetric, and that they define locally trace-class integral operators acting on \(L^2({\mathfrak {X}},\mu )\).

The cumulants’ method to analyze the asymptotic distribution of linear statistics of determinantal processes goes back to the work of Costin and Lebowitz [20] for count statistics of the Sine process. The general theory was developed by Soshnikov in [47,48,49] and subsequently applied to many different ensembles coming from random matrix theory, see for instance [2, 16, 17, 35, 38, 39, 44, 45].

In this section, we show how to implement the cumulants’ method to describe the asymptotic law of linear statistics of the incomplete ensemble \(\widehat{\Xi }\) with correlation kernel \(p_N K^N(z,w)\) when the retention probability \(0<p_N<1\) converges to 1 in the large N limit.

Let

$$\begin{aligned} \mho = \bigcup _{l=1}^\infty \big \{ \mathbf {k}=(k_1,\dots , k_l) \in {\mathbb {N}}^l \big \} \end{aligned}$$

and let \(\ell (\mathbf {k})= l\) denote the length of the tuple \(\mathbf {k}\). Let us denote the set of compositions of the integer \(n>0\) by

$$\begin{aligned} \big \{ \mathbf {k}\vdash n \big \} = \big \{ \mathbf {k}\in \mho : k_1+ \cdots + k_l = n \} . \end{aligned}$$

We also denote by \(n\in \mho \) the trivial composition. For any map \(\Upsilon : \mho \rightarrow {\mathbb {R}}\), for any function \(f: {\mathfrak {X}}\rightarrow {\mathbb {R}}\), and for any \(n\in {\mathbb {N}}\), we define for all \(x\in {\mathfrak {X}}^n\),

$$\begin{aligned} \Upsilon ^n[f](x) = \sum _{\mathbf {k}\vdash n} \Upsilon (\mathbf {k})\, \prod _{1 \le j\le \ell (\mathbf {k})} \,f(x_j)^{k_j} . \end{aligned}$$
(2.2)

If \(\mathbf {k}\vdash n\), we let \(\displaystyle \mathrm {M}(\mathbf {k}) =\frac{n!}{k_1!\cdots k_l!}\) be the corresponding multinomial coefficient and for all integers \(n\ge 1\) and \( m \in \{0, \dots , n\}\), we define the coefficients

$$\begin{aligned} \gamma ^n_m = \sum _{ \mathbf {k}\vdash n}\frac{(-1)^{\ell (\mathbf {k})}}{\ell (\mathbf {k})} {\ell (\mathbf {k}) \atopwithdelims ()m} \mathrm {M}(\mathbf {k}) . \end{aligned}$$
(2.3)

We will also use the notation: \(\displaystyle \delta _{k}(n) = {\left\{ \begin{array}{ll} 1 &{}\text {if}\ n=k \\ 0 &{}\text {else} \end{array}\right. }\) for any \(k\in {\mathbb {Z}}\).

Lemma 2.1

For all \(n\in {\mathbb {N}}\), we have \(\gamma ^n_0 = -\delta _{1}(n)\) and \( \gamma ^n_1 = (-1)^n\).

Proof

The coefficients (2.3) have the generating function:

$$\begin{aligned} \sum _{n=1}^\infty \sum _{m=0}^n \gamma ^n_m \frac{x^n q^m}{n!} = - \log \big (1+(1+ q)(e^x-1)\big ) . \end{aligned}$$

In particular, setting \(q=0\), the RHS becomes \(-\log (e^x) = -x\), so that \( \gamma ^n_0 = -\delta _{1}(n) \). Moreover, since

$$\begin{aligned} \sum _{n=1}^\infty \gamma ^n_1 \frac{x^n}{n!} =-\left. \frac{d}{dq}\log \big (1+(1+ q)(e^x-1)\big ) \right| _{q=0} =e^{-x}-1 , \end{aligned}$$

we also see that \( \gamma ^n_1 = (-1)^n\) . \(\square \)
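
The identities of Lemma 2.1 can also be verified by brute force from the definition (2.3). A small Python sketch enumerating the compositions:

```python
from math import comb, factorial

def compositions(n):
    # all tuples of positive integers summing to n
    if n == 0:
        yield ()
        return
    for k in range(1, n + 1):
        for rest in compositions(n - k):
            yield (k,) + rest

def gamma(n, m):
    # the coefficients (2.3)
    total = 0.0
    for kk in compositions(n):
        l = len(kk)
        M = factorial(n)
        for kj in kk:
            M //= factorial(kj)
        total += (-1) ** l * comb(l, m) * M / l
    return total

for n in range(1, 8):
    assert round(gamma(n, 0)) == (-1 if n == 1 else 0)   # gamma^n_0 = -delta_1(n)
    assert round(gamma(n, 1)) == (-1) ** n               # gamma^n_1 = (-1)^n
print("Lemma 2.1 verified for n < 8")
```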

Given a test function \(f: {\mathfrak {X}}\rightarrow {\mathbb {R}}\), say locally integrable with compact support, the cumulant generating function of the random variable \(\Xi (f)\) is

$$\begin{aligned} \log {\mathbb {E}}\big [ \exp ( \lambda \Xi (f)) \big ] = \sum _{n=1}^\infty \frac{\lambda ^n}{n!} {\text {C}}^n_{K^N}[f] . \end{aligned}$$

It was proved by Soshnikov that under our general assumptions, the cumulants \({\text {C}}^n_{K}[f] \) characterize the law of the linear statistics \(\Xi (f)\) and that for any \(n\in {\mathbb {N}}\),

$$\begin{aligned} {\text {C}}^n_{K^N}[f] = - \sum _{l=1}^n \frac{(-1)^l}{l} \sum _{\begin{array}{c} \mathbf {k}\vdash n \\ \ell (\mathbf {k}) =l \end{array}} \mathrm {M}(\mathbf {k}) \underset{x_{0} =x_l}{\int _{{\mathfrak {X}}^l}} f(x_1)^{k_1} \cdots f(x_l)^{k_l} \prod _{1\le j\le l} K^N(x_j, x_{j-1}) d\mu ^l(x) . \end{aligned}$$
(2.4)

Under stronger assumptions, for instance if the kernel \(K^N\) has finite rank, this formula makes sense also for test functions which are not necessarily compactly supported. We use the convention that the variables \(x_0\) and \(x_l\) are identified in the previous integral. Since we assume that the correlation kernel \(K^N\) is reproducing, we can rewrite this formula:

$$\begin{aligned} {\text {C}}^n_{K^N}[f]&= - \sum _{\mathbf {k}\vdash n} \frac{(-1)^{\ell (\mathbf {k})}}{\ell (\mathbf {k})} \mathrm {M}(\mathbf {k}) \underset{x_{0} =x_n}{\int _{{\mathfrak {X}}^n}} f(x_1)^{k_1} \cdots f(x_l)^{k_l} \prod _{1\le j\le n} K^N(x_j, x_{j-1}) d\mu ^n(x) \nonumber \\&= -\underset{x_{0} =x_n}{\int _{{\mathfrak {X}}^n}}\Upsilon ^n_0[f](x) \prod _{1\le j\le n} K^N(x_j, x_{j-1}) d\mu ^n(x) , \end{aligned}$$
(2.5)

where for any \(\mathbf {k}\in \mho \),

$$\begin{aligned} \Upsilon _0(\mathbf {k}) = \frac{(-1)^{\ell (\mathbf {k})}}{\ell (\mathbf {k})} \mathrm {M}(\mathbf {k}) . \end{aligned}$$
(2.6)

A simple observation which turns out to be very important when it comes to asymptotics is that, by Lemma 2.1, for all \(n\ge 2\),

$$\begin{aligned} \sum _{\mathbf {k}\vdash n} \Upsilon _0(\mathbf {k}) =\gamma _0^n =0 . \end{aligned}$$
(2.7)
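
On a finite state space the integrals in (2.4)–(2.5) become traces of matrix products, and the formula can be tested against the exact generating function \(\log {\mathbb {E}}\big [ e^{t \Xi (f)} \big ] = \log \det \big ( \mathrm {I}+ (e^{tf}-1)K^N \big )\) of a determinantal process. A small sketch (numpy assumed; the rank-4 projection kernel on 12 points is an arbitrary illustrative choice):

```python
import numpy as np
from math import factorial

def compositions(n):
    if n == 0:
        yield ()
        return
    for k in range(1, n + 1):
        for rest in compositions(n - k):
            yield (k,) + rest

rng = np.random.default_rng(0)
M, rank = 12, 4
Q, _ = np.linalg.qr(rng.standard_normal((M, rank)))
K = Q @ Q.T                              # reproducing (projection) kernel: K^2 = K
f = rng.standard_normal(M)
D = np.diag(f)

def cumulant(n):
    # formula (2.4): on a finite space the cyclic integrals are traces
    total = 0.0
    for kk in compositions(n):
        l = len(kk)
        Mk = factorial(n)
        for kj in kk:
            Mk //= factorial(kj)
        P = np.eye(M)
        for kj in kk:
            P = P @ np.linalg.matrix_power(D, kj) @ K
        total -= (-1) ** l / l * Mk * np.trace(P)
    return total

# exact cumulants, extracted from log det(I + (e^{tf} - 1)K) by a polynomial fit
ts = np.linspace(-0.2, 0.2, 11)
g = [np.linalg.slogdet(np.eye(M) + np.diag(np.expm1(t * f)) @ K)[1] for t in ts]
coef = np.polyfit(ts, g, 8)
for n in (1, 2, 3):
    print(n, cumulant(n), coef[-(n + 1)] * factorial(n))   # the two values agree
```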

For any \(m\in {\mathbb {N}}\), we define

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle \Upsilon _m(\mathbf {k}) = \frac{(-1)^{\ell (\mathbf {k})}}{\ell (\mathbf {k})} {\ell (\mathbf {k}) \atopwithdelims ()m} \mathrm {M}(\mathbf {k}) &{} \text {if } \ell (\mathbf {k}) \ge 2 \\ \displaystyle \Upsilon _m(\mathbf {k}) =- \delta _{1}(m)-\gamma ^n_m &{} \text {if } \mathbf {k}=n \end{array}\right. } . \end{aligned}$$
(2.8)

These functions are constructed so that we have for all \(n,m\in {\mathbb {N}}\),

$$\begin{aligned} \sum _{\mathbf {k}\vdash n} \Upsilon _m(\mathbf {k}) = 0 . \end{aligned}$$
(2.9)

According to formula (2.4), the cumulants of the process \(\widehat{\Xi }\) with correlation kernel \({\widehat{K}}^N = p_N K^N\) are given by

$$\begin{aligned} {\text {C}}^n_{{\widehat{K}}^N}[f] = - \sum _{l=1}^n \frac{(-1)^l}{l} p_N^l \sum _{\begin{array}{c} \mathbf {k}\vdash n \\ \ell (\mathbf {k}) =l \end{array}} \mathrm {M}(\mathbf {k}) \underset{x_{0} =x_l}{\int _{{\mathfrak {X}}^l}} f(x_1)^{k_1} \cdots f(x_l)^{k_l} \prod _{1\le j\le l} K^N(x_j, x_{j-1}) d\mu ^l(x) . \end{aligned}$$

Since the kernel \(K^N\) is reproducing, if we set \(q_N=1-p_N\) and expand \(p_N^l = (1-q_N)^l = \sum _{m=0}^{l} {l \atopwithdelims ()m} (-q_N)^m\), we obtain

$$\begin{aligned} \begin{aligned} {\text {C}}^n_{{\widehat{K}}^N}[f] = {\text {C}}^n_{K^N}[f]&- \sum _{m=1}^n (-q_N)^m \gamma ^n_m \int _{\mathfrak {X}}f(x)^{n} K^N(x,x) d\mu (x) \\&- \sum _{m=1}^n (-q_N)^m \underset{x_{0} =x_n}{\int _{{\mathfrak {X}}^n}} \Upsilon ^n_m[f](x) \prod _{1\le j\le n} K^N(x_j, x_{j-1}) d\mu ^n(x) . \end{aligned} \end{aligned}$$
(2.10)

We are now ready to state our general result from which Theorems 1.1, 1.3, 1.4 and 1.5 in the introduction follow.

Theorem 2.2

Let \(0<q_N<1\) be a sequence which converges to 0 as \(N\rightarrow +\infty \). Under our general assumptions above, let \(f_N\) be a sequence of functions for which the cumulants \({\text {C}}^n_{K^N}[f_N]\) are well-defined for all \(n, N \in {\mathbb {N}}\) and the following conditions hold:

  1.

    There exists a (Radon) measure \(\eta \) on \({\mathfrak {X}}\), a function \(f\in L^p(\eta )\) for any \(p\ge 2\), and a sequence \(M_N \nearrow +\infty \) as \(N\rightarrow +\infty \) such that for all \(n\ge 1\),

    $$\begin{aligned} \frac{1}{M_N} \int _{\mathfrak {X}}f_N(x)^n K^N(x,x) d\mu (x) \simeq \int _{\mathfrak {X}}f(x)^n d\eta (x). \end{aligned}$$
    (2.11)
  2.

    For all \(n\ge 2\) and all \(m\ge 1\), as \(N\rightarrow +\infty \)

    $$\begin{aligned} \left| \underset{x_{0} =x_n}{\int _{{\mathfrak {X}}^n}} \Upsilon ^n_m[f_N](x) \prod _{1\le j\le n} K^N(x_j, x_{j-1}) d\mu ^n(x) \right| = o\left( q_N^{-1} \vee M_N \right) . \end{aligned}$$
    (2.12)
  3.

    There exists \(\sigma >0\) such that for all \(n\in {\mathbb {N}}\),

    $$\begin{aligned} \lim _{N\rightarrow \infty } {\text {C}}^n_{K^N}[f_N] = {\left\{ \begin{array}{ll} \sigma ^2 &{}\text {if } n=2 \\ 0 &{}\text {if } n>2 \end{array}\right. }. \end{aligned}$$
    (2.13)

Then, depending on the parameter \(T_N = q_N M_N >0\), we distinguish three different asymptotic regimes for the linear statistic \(\widehat{\Xi }(f_N)\) of the thinned point process with density \(p_N=1-q_N\):

  (i)

    If \(T_N \rightarrow 0\) as \(N\rightarrow +\infty \),

    $$\begin{aligned} \widehat{\Xi }(f_N) - {\mathbb {E}}\big [\widehat{\Xi }(f_N) \big ] \Rightarrow {\mathcal {N}}(0,\sigma ^2) . \end{aligned}$$
  (ii)

    If \(T_N \rightarrow \tau \) with \(\tau >0\) as \(N\rightarrow +\infty \),

    $$\begin{aligned} \widehat{\Xi }(f_N) - {\mathbb {E}}\big [\widehat{\Xi }(f_N) \big ] \Rightarrow \mathrm {X} + \Lambda _{\tau \eta }(-f), \end{aligned}$$

    where \(\mathrm {X} \sim {\mathcal {N}}(0,\sigma ^2)\) and \(\Lambda _{\tau \eta }\) is a Poisson process on \({\mathfrak {X}}\) with intensity \(\tau \eta \) independent from \(\mathrm {X}\).

  (iii)

    If \(T_N \rightarrow +\infty \) as \(N\rightarrow +\infty \),

    $$\begin{aligned} \frac{\widehat{\Xi }(f_N) - {\mathbb {E}}\big [\widehat{\Xi }(f_N) \big ]}{\sqrt{T_N}} \Rightarrow {\mathcal {N}}\Big (0, \int f(x)^2 d\eta (x) \Big ) . \end{aligned}$$

Before giving our proof of Theorem 2.2, let us make two remarks about the Assumptions (2.11)–(2.13).

Remark 2.1

For any function \(g:{\mathfrak {X}}\rightarrow {\mathbb {R}}_+\), we have \(\displaystyle {\mathbb {E}}[\Xi (g)] = \int _{\mathfrak {X}}g(x) K^N(x,x) d\mu (x)\), so that one can interpret (2.11) as a condition about the mean of the point process \(\Xi \). In contrast, (2.12) can be seen as a condition about the fluctuations of the incomplete point process. We also implicitly assume that for \(n=2\), the RHS of (2.11) is positive, so that the measure \(\eta \) is non-trivial and puts mass on the support of f. Then the random variables \({\mathcal {N}}\big (0, \int f(x)^2 d\eta (x) \big )\) and \(\Lambda _{\tau \eta }(f)\) are non-zero for any \(\tau >0\). In order to handle mesoscopic linear statistics, we have allowed our test functions to depend on the parameter N. However, for simplicity, one can think of the case where \(f_N = f\) is a smooth and compactly supported test function.

Remark 2.2

Instead of (2.13), we could assume that the cumulants of the linear statistics \(\Xi (f_N)\) converge to those of a random variable \(\mathrm {X}\) which is not necessarily Gaussian. Then, the conclusion (ii) of Theorem 2.2 remains true and we obtain a crossover from a non-Gaussian process to a Poisson process. For instance, this more general situation arises when considering linear statistics of 1-dimensional log-gases in the multi-cut regime.

Proof

Observe that it follows from formula (2.10) and the condition (2.12) that the cumulants of the linear statistic \(\widehat{\Xi }(f_N)\) satisfy for all \(n\ge 2\) as \(N\rightarrow +\infty \),

$$\begin{aligned} {\text {C}}^n_{{\widehat{K}}^N}[f_N] = {\text {C}}^n_{K^N}[f_N] - \sum _{m=1}^n (-q_N)^m \gamma ^n_m \int _{\mathfrak {X}}f_N(x)^{n} K^N(x,x) d\mu (x) + o(1 \vee T_N) , \end{aligned}$$

where \(T_N = q_N M_N\). Then, using the condition (2.11), we obtain for any \(n\ge 2\), as \(N\rightarrow +\infty \),

$$\begin{aligned} {\text {C}}^n_{{\widehat{K}}^N}[f_N] = {\text {C}}^n_{K^N}[f_N] + T_N \sum _{m=0}^{n-1} (-q_N)^m \gamma ^n_{m+1} \left( \int _{\mathfrak {X}}f(x)^{n} d\eta (x) + o(1) \right) + o(1 \vee T_N). \end{aligned}$$
(2.14)

Let us observe that in the previous sum, regardless of the regime we consider, since \(q_N\rightarrow 0\) as \(N\rightarrow +\infty \), only the term \(m=0\) is asymptotically relevant. For instance, if we assume that \(T_N = \tau +o(1)\) with \(\tau \ge 0\), this implies that

$$\begin{aligned} {\text {C}}^n_{{\widehat{K}}^N}[f_N] = {\text {C}}^n_{K^N}[f_N] + \tau \gamma ^n_{1} \int _{\mathfrak {X}}f(x)^{n} d\eta (x)+ o(1) . \end{aligned}$$

On the one hand, since \(\gamma ^n_1 = (-1)^n\) by Lemma 2.1 and using the condition (2.13), we obtain for any \(n\ge 2\)

$$\begin{aligned} \lim _{N\rightarrow +\infty } {\text {C}}^n_{{\widehat{K}}^N}[f_N] = \sigma ^2 \mathbf {1}_{n = 2} + (-1)^n \tau \int _{\mathfrak {X}}f(x)^{n} d\eta (x) . \end{aligned}$$
(2.15)

In the regime (i)—which corresponds to \(\tau =0\)—this shows that the linear statistic \(\widehat{\Xi }(f_N)\), once centered, converges in distribution (as well as in the sense of moments) to a Gaussian random variable with variance \(\sigma ^2\). Let us observe that by formula (1.11), if \(\tau >0\), the second term on the RHS of (2.15) corresponds to the \(n{\mathrm{th}}\) cumulant of the random variable \(\Lambda _{\tau \eta }(-f)\). This proves the claim in the regime (ii).

On the other hand, in the regime (iii) where \(T_N\rightarrow +\infty \), we see from (2.14) that the cumulants \({\text {C}}^n_{{\widehat{K}}^N}[f_N]\) of order \(n\ge 2\) diverge as \(N\rightarrow +\infty \). Thus, in order to have a non-trivial limit, we need to renormalize the linear statistic \(\widehat{\Xi }(f_N)\). Namely, we consider instead the test function \(g_N = f_N /\sqrt{T_N}\) and it follows from (2.14) that for all \(n\ge 2\), as \(N\rightarrow +\infty \)

$$\begin{aligned} {\text {C}}^n_{{\widehat{K}}^N}[g_N] = \mathbf {1}_{n=2} \int _{\mathfrak {X}}f(x)^{2} d\eta (x) + o(1). \end{aligned}$$

These asymptotics show that in the regime (iii), \(\frac{\widehat{\Xi }(f_N) - {\mathbb {E}}\big [\widehat{\Xi }(f_N) \big ]}{\sqrt{T_N}}\) converges in distribution (as well as in the sense of moments) to a centered Gaussian random variable with variance \(\int _{\mathfrak {X}}f(x)^{2} d\eta (x)\). \(\square \)

3 Transition for Coulomb Gases in Two Dimensions

3.1 Asymptotics of the Correlation Kernel

In this section, we begin by reviewing the basics of the theory of eigenvalues of random normal matrices developed by Ameur, Hedenmalm and Makarov. In particular, we are interested in the properties of the correlation kernel (1.3) in the bulk of the gas. It has been established in [2] that, if the potential V is real-analytic, the equilibrium measure is \(\varrho _V = 2\Delta V \mathbf {1}_{{\mathscr {S}}_V}\) and the droplet \(\overline{{\mathscr {S}}_V}\) is a compact set with a nice boundary. Moreover, in order to compute the asymptotics of the cumulants of a smooth linear statistic, instead of working with the correlation kernel \(K^N_V\), one can use the so-called approximate Bergman kernel:

$$\begin{aligned} B^N_V(z,w) = \big ( N b_0(z, {\overline{w}}) + b_1(z, {\overline{w}}) \big ) e^{N \{ 2 \Phi (z, {\overline{w}}) - V(z) - V(w) \} } . \end{aligned}$$
(3.1)

The functions \(b_0(z,w)\), \(b_1(z,w)\) and \(\Phi (z,w)\) are the (unique) holomorphic functions of two variables defined in a neighborhood in \({\mathbb {C}}^2\) of the set \(\big \{ (z ,{\overline{z}}) : z\in {\mathscr {S}}_V \big \}\) such that \(b_0(z,{\overline{z}}) = 2 \Delta V(z)\), \(b_1(z,{\overline{z}}) = \frac{1}{2} \Delta \log ( \Delta V)(z)\), and \(\Phi (z,{\overline{z}}) = V(z)\).

Lemma 3.1

(Lemma 1.2 in [2], proved in [1, 7]) For any \(x_0 \in {\mathscr {S}}_V \), there exist \(\epsilon _0>0\) and \(C_0>0\) so that when the dimension N is sufficiently large, we have for all \(z, w \in {\mathbb {D}}(x_0 , \epsilon _0)\),

$$\begin{aligned} \left| K^N_V(z, w) - B^N_V(z,w) \right| \le C_0 N^{-1} . \end{aligned}$$
(3.2)

Moreover, at sufficiently small mesoscopic scale, up to a gauge transform, the asymptotics of the approximate Bergman kernel \(B^N_V\) is universal.

Lemma 3.2

Let \(\kappa >0\) and \(\epsilon _N= \kappa N^{-1/2}\log N \) for all \(N\in {\mathbb {N}}\). For any \(x_0 \in {\mathscr {S}}_V\), there exists \(\varepsilon _0>0\) and a function \({\mathfrak {h}}: {\mathbb {D}}(0,\varepsilon _0) \rightarrow {\mathbb {R}}\) such that if the parameter N is sufficiently large, the function

$$\begin{aligned} {\widetilde{B}}^N_{V, x_0}( u ,v) =\frac{B^N_V(x_0 + u , x_0 + v)e^{i N {\mathfrak {h}}(u)}}{e^{i N {\mathfrak {h}}(v)}} \end{aligned}$$
(3.3)

satisfies

$$\begin{aligned} {\widetilde{B}}^N_{V, x_0}( u , v) = K^\infty _{N \varrho _V(x_0)}(u, v)\left\{ 1 + \underset{N\rightarrow \infty }{O}\big ( (\log N)^{2}\epsilon _N \big ) \right\} \end{aligned}$$
(3.4)

uniformly for all \(u, v \in {\mathbb {D}}(0, \epsilon _N)\), where \(K^\infty _{N \varrho _V(x_0)}\) is the \(\infty \)-Ginibre kernel with density \(N \varrho _V(x_0)= 2N \Delta V(x_0)\).
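
For the Ginibre potential \(V(z)=|z|^2\), one has \(\Delta V \equiv 1\), hence \(b_1 \equiv 0\) and \(\Phi (z,{\overline{w}}) = z {\overline{w}}\), so the approximate Bergman kernel (3.1) is exactly \(K^\infty _{2N}\) and no gauge factor is needed at \(x_0 = 0\); the finite-N kernel \(K^N_V(z,w) = 2N e^{-N(|z|^2+|w|^2)} \sum _{k<N} (2Nz{\overline{w}})^k/k!\) then differs from it only by the truncation of the exponential series. A quick numerical sketch (numpy assumed; N and the test points are illustrative):

```python
import numpy as np

def K_N(z, w, N):
    # K^N_V(z,w) = 2N e^{-N(|z|^2+|w|^2)} sum_{k<N} (2N z conj(w))^k / k!
    a = 2 * N * z * np.conj(w)
    term, s = 1.0 + 0j, 1.0 + 0j
    for k in range(1, N):
        term *= a / k
        s += term
    return 2 * N * np.exp(-N * (abs(z) ** 2 + abs(w) ** 2)) * s

def K_inf(z, w, rho):
    # the infinity-Ginibre kernel (1.16)
    return rho * np.exp(rho * (2 * z * np.conj(w) - abs(z) ** 2 - abs(w) ** 2) / 2)

N = 400
eps = np.log(N) / np.sqrt(N)           # the mesoscopic scale of Lemma 3.2
for z, w in [(0.0, eps), (eps * (1 + 1j) / 2, -1j * eps), (eps, eps)]:
    print(abs(K_N(z, w, N) / K_inf(z, w, 2 * N) - 1))   # tiny relative errors
```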

A key ingredient in the paper [2], as well as [44], is to reduce the domain of integration in formula (2.5), using the exponential off-diagonal decay of the correlation kernels \(K_V^N\), to a set where we can use the asymptotics (3.4)—see the next lemma. For completeness, the proofs of Lemmas 3.2 and 3.3 are given in Appendix B.

Lemma 3.3

Let \(n\in {\mathbb {N}}\) and \(\epsilon _N = \kappa N^{-1/2} \log N\) for some constant \(\kappa >0\) which is sufficiently large compared to n. We let

$$\begin{aligned} {\mathscr {A}}(z_0; \epsilon ) = \big \{ \mathrm {z}\in {\mathbb {C}}^n : |z_j - z_{j-1}| \le \epsilon \text { for all } j=1, \dots , n \big \} . \end{aligned}$$
(3.5)

Let \({\mathscr {S}}\) be a compact subset of \({\mathscr {S}}_V\), \(N_0\in {\mathbb {N}}\), and \(F_N : {\mathbb {C}}^{n+1} \rightarrow {\mathbb {R}}\) be a sequence of continuous functions such that

$$\begin{aligned} \sup \big \{ |F_N(z_0, \mathrm {z})| : \mathrm {z}\in {\mathbb {C}}^{n} , N \ge N_0 \big \} \le C \mathbf {1}_{z_0 \in {\mathscr {S}}} . \end{aligned}$$
(3.6)

We have

$$\begin{aligned}&\underset{z_0 = z_{n+1}}{\int _{{\mathbb {C}}^{n+1}}}\,F_N(z_0, \mathrm {z}) \prod _{j=0}^n K^N_V(z_{j}, z_{j+1}) d\mathrm {A}(z_0) d\mathrm {A}^{n}(z) \nonumber \\&\quad = \int _{\mathscr {S}}d\mathrm {A}(z_{0}) \underset{z_{n+1} = z_{0}}{\int _{{\mathscr {A}}(z_{0}; \epsilon _N)}} F_N(z_0,\mathrm {z}) \prod _{j=0}^n K^N_V(z_{j}, z_{j+1}) d\mathrm {A}^{n}(\mathrm {z})\ + \underset{N\rightarrow \infty }{O}(N^{-1}) . \end{aligned}$$
(3.7)

Remark 3.1

Recall that \(K^\infty _\rho \), (1.16), denotes the correlation kernel of the \(\infty \)-Ginibre process with density \(\rho >0\). Using the fact that for all \(z, w\in {\mathbb {C}}\),

$$\begin{aligned} \big | K^\infty _\rho (z,w)\big | = \rho e^{-\rho |z-w|^2/2} , \end{aligned}$$
(3.8)

it is easy to obtain the counterpart of Lemma 3.3 for the \(\infty \)-Ginibre kernel, see Lemma 4.3.

3.2 Proof of Theorem 1.1

In this section, we show how to apply Theorem 2.2 for Coulomb gases in the global regime by relying on the asymptotics from Sect. 3.1 for the correlation kernel \(K^N_V\). First, observe that with \(f_N = f\) and \(M_N = N\), Assumption (2.11) follows immediately from the law of large numbers (1.5). Then the measure \(d\eta = \varrho _V d\mu \) is absolutely continuous with respect to \(\mu \) with compact support. Moreover, if \(f\in {\mathcal {C}}_c^3({\mathscr {S}}_V)\), then the conditions (2.13) are well-known from [2, Theorem 4.4] with \(\sigma ^2 = {\mathbb {E}}\big [\mathrm {X}(f)^2\big ]\) according to formula (1.8). So, our main technical challenge is to obtain the estimates (2.12) for a large class of test functions. The first step of the proof is the following approximation.

Proposition 3.4

Let \({\mathcal {K}}\subset {\mathbb {C}}\) be a compact set, \(f\in {\mathcal {C}}^3_c({\mathcal {K}})\) and let \(\Upsilon : \mho \rightarrow {\mathbb {R}}\) be any map such that \(\sum _{\mathbf {k}\vdash n} \Upsilon (\mathbf {k}) = 0\) for all \(n\ge 2\). Let \(L_N \) be an increasing sequence such that \(L_N^{-1} (\log N)^4 = o(1)\) and \(L_N = o \big (\sqrt{N}/(\log N)^{3}\big )\) as \(N\rightarrow +\infty \). We also denote by \(H^n(\lambda ; \mathrm {w})\) the second order Taylor polynomial at 0 of the function \(\mathrm {w}\in {\mathbb {C}}^n \mapsto \Upsilon ^{n+1}[f](\lambda , \lambda + \mathrm {w})\). Fix \(x_0 \in {\mathscr {S}}_V\) and let \(f_N(z)=f(L_N(z- x_0))\) as in (1.15). Then, we have for any \(n\ge 1\), as \(N\rightarrow +\infty \),

$$\begin{aligned} \begin{aligned}&\underset{z_0 = z_{n+1}}{\int _{{\mathbb {C}}^{n+1}}}\, \Upsilon ^{n+1}[f_N](z_0, \mathrm {z}) \prod _{j=0}^n K^N_V(z_{j}, z_{j+1}) d\mathrm {A}(z_0) d\mathrm {A}^{n}(\mathrm {z}) \\&\qquad \qquad \simeq \int _{{\mathcal {K}}} d\mathrm {A}(\lambda ) \,\underset{w_0=w_{n+1}=0}{\int _{{\mathbb {C}}^n}}\, H^n(\lambda ; \mathrm {w}) \prod _{j=0}^{n} K^\infty _{\eta _N(\lambda )}(w_{j}, w_{j+1}) d\mathrm {A}^n(\mathrm {w}) , \end{aligned} \end{aligned}$$
(3.9)

where the density is given by \(\eta _N(\lambda ) = N L_N^{-2} \varrho _V(x_0+ \lambda /L_N)\).

Remark 3.2

Observe that under the assumptions of Proposition 3.4, we have \(\eta _N(\lambda ) \rightarrow +\infty \) as \(N\rightarrow +\infty \) uniformly for all \(\lambda \in {\mathcal {K}}\). Moreover, it follows from the proof below that in the global regime where \(L_N =1\) and \(x_0 = 0\), provided that \({\mathcal {K}}\subset {\mathscr {S}}_V\), the estimates (3.9) remain valid with an extra error. Namely, we obtain for all \(n\ge 1\),

$$\begin{aligned} \begin{aligned}&\underset{z_0 = z_{n+1}}{\int _{{\mathbb {C}}^{n+1}}}\, \Upsilon ^{n+1}[f_N](z_0, \mathrm {z}) \prod _{j=0}^n K^N_V(z_{j}, z_{j+1}) d\mathrm {A}(z_0) d\mathrm {A}^{n}(\mathrm {z}) \\&\quad = \int _{{\mathcal {K}}} d\mathrm {A}(\lambda ) \,\underset{w_0=w_{n+1}=0}{\int _{{\mathbb {C}}^n}}\, H^n(\lambda ; \mathrm {w}) \prod _{j=0}^{n} K^\infty _{ N \varrho _V(\lambda )}(w_{j}, w_{j+1}) \ d\mathrm {A}^n(\mathrm {w})\\&\quad \, +\underset{N\rightarrow \infty }{O}\left( (\log N)^4 \right) . \end{aligned} \end{aligned}$$
(3.10)

Proof

We let \(F_N = \Upsilon ^{n+1}[f_N]\) and

$$\begin{aligned} {\mathscr {J}}^{n}_N : = \underset{z_{n+1}=z_0}{\int _{{\mathbb {C}}^{n+1}}}\, F_N(z_0, \mathrm {z}) \prod _{j=0}^n K^N_V(z_{j}, z_{j+1}) d\mathrm {A}(z_0) d\mathrm {A}^{n}(\mathrm {z}) . \end{aligned}$$

Since \(x_0\in {\mathscr {S}}_V\), there exists a compact set \({\mathscr {S}}\subset {\mathscr {S}}_V\) so that \({\text {supp}}(f_N) \subseteq {\mathscr {S}}\) when the parameter N is sufficiently large. Then, according to formula (2.2), the function \(F_N\) satisfies the Assumption (3.6). Thus, by Lemma 3.3, we obtain as \(N\rightarrow +\infty \)

$$\begin{aligned} {\mathscr {J}}^{n}_N \simeq \int _{{\mathscr {S}}} d\mathrm {A}(z_0) \underset{z_{n+1} = z_0}{\int _{{\mathscr {A}}(z_0; \epsilon _N)}}\, F_N(z_0,\mathrm {z}) \prod _{j=0}^n K^N_V(z_{j}, z_{j+1}) d\mathrm {A}^{n}(\mathrm {z}) . \end{aligned}$$
(3.11)

By (3.5), the set

$$\begin{aligned} {\mathscr {A}}(z_0; \epsilon _N) \subset \big \{ \mathrm {z}\in {\mathbb {C}}^n : z_1, \dots , z_n \in {\mathbb {D}}(z_0,n \epsilon _N) \big \} \end{aligned}$$

and we can apply Lemma 3.1 to replace the kernels \(K^N_V(z_j,z_{j+1})\) in formula (3.11). Namely, if \(z_{n+1} = z_0 \) and \(\mathrm {z}\in {\mathscr {A}}(z_0; \epsilon _N)\), then

$$\begin{aligned} \Bigg | \prod _{j=0}^n K^N_V(z_{j}, z_{j+1}) - \prod _{j=0}^{n} B^N_V(z_{j}, z_{j+1}) \Bigg | \le C \sum _{k=1}^{n+1} N^{-k} S_N^{n+1-k} , \end{aligned}$$

where \(S_N = \sup \big \{ |B^N_V(z,w)| : z, w \in {\mathbb {D}}(z_0, n \epsilon _N) , z_0 \in {\mathscr {S}}\big \} \). By Lemma 3.2, we have for all \(u,v \in {\mathbb {D}}(0, n\epsilon _N) \),

$$\begin{aligned} \big | B^N_V(z_0 + u , z_0 + v) \big | \le C \big | K^\infty _{N\varrho _V(z_0)}(u,v) \big | . \end{aligned}$$

and, by formula (3.8), this implies that \(S_N \le C N\). If we combine these estimates with formula (3.11), since the functions \(F_N\) are uniformly bounded, we obtain

$$\begin{aligned} {\mathscr {J}}^{n}_N&= \int _{{\mathscr {S}}} d\mathrm {A}(z_0) \underset{z_{n+1} = z_0}{\int _{{\mathscr {A}}(z_0; \epsilon _N)}} F_N(z_0,\mathrm {z}) \prod _{j=0}^n B^N_V(z_{j}, z_{j+1}) d\mathrm {A}^{n}(\mathrm {z})\\&+\underset{N\rightarrow \infty }{O}\left( N^{n-1} \int _{{\mathscr {S}}} d\mathrm {A}(z_0) \big |{\mathscr {A}}(z_0; \epsilon _N)\big | \right) , \end{aligned}$$

where \(|{\mathscr {A}}|\) denotes the Lebesgue measure of the set \({\mathscr {A}}\). By definition, \(\epsilon _N = \kappa N^{-1/2} \log N\) so that \( \big | {\mathscr {A}}(z_0;\epsilon _N) \big | \le C N^{-n} (\log N)^{2n}\) for all \(z_0\in {\mathbb {C}}\). Thus, the previous error term converges to 0 like \((\log N)^{2n}/ N\). Hence, if we make the change of variables \(\mathrm {z}= z_0 +\mathrm {u}\) and the appropriate gauge transform in the previous integral, according to formula (3.3), we obtain

$$\begin{aligned} {\mathscr {J}}^{n}_N \simeq \int _{{\mathscr {S}}} d\mathrm {A}(z_0) \underset{u_{n+1} = u_0=0}{\int _{{\mathscr {A}}(0; \epsilon _N)}}\, F_N(z_0 +\mathrm {u}) \prod _{j=0}^n {\widetilde{B}}^N_{V, z_0}(u_{j}, u_{j+1}) d\mathrm {A}^{n}(\mathrm {u}) . \end{aligned}$$
(3.12)

Note that in formula (3.12), the integral is over a small subset of the surface \(\{ \mathrm {u}\in {\mathbb {C}}^{n+2} : u_0 = u_{n+1} =0\} \) and we denote \(F_N(z_0 +\mathrm {u}) = F_N(z_0, z_0+u_1,\dots , z_0+u_n)\). Then, we can apply Lemma 3.2 to replace the kernel \( {\widetilde{B}}^N_{V, z_0}\) by \(K^\infty _{N \varrho _V(z_0)}\) in formula (3.12), and we obtain

$$\begin{aligned} {\mathscr {J}}^{n}_N \simeq \int _{{\mathscr {S}}} d\mathrm {A}(z_0) \underset{u_{n+1} = u_0=0}{\int _{{\mathscr {A}}(0; \epsilon _N)}}\,F_N(z_0 +\mathrm {u})\chi _N(z_0,\mathrm {u}) \prod _{j=0}^n K^\infty _{N \varrho _V(z_0)}(u_{j}, u_{j+1}) d\mathrm {A}^{n}(\mathrm {u}) , \end{aligned}$$

where \(\displaystyle \chi _N(z_0,\mathrm {u}) = 1 + \underset{N\rightarrow \infty }{O}\big ( (\log N)^{2}\epsilon _N \big )\) uniformly for all \(z_0\in {\mathscr {S}}\) and all \(\mathrm {u}\in {\mathscr {A}}(0; \epsilon _N)\).

Let \(F=\Upsilon ^{n+1}[f]\), \(\delta _N = \epsilon _N L_N\) and \(\eta _N(\lambda ) =N L_N^{-2} \varrho _V(x_0 + \lambda /L_N) \). By definition, \(F_N(z_0 +\mathrm {u}) = F\big (L_N(z_0- x_0 +\mathrm {u})\big ) \) and we can make the change of variables \(\lambda = L_N(z_0-x_0)\) and \(\mathrm {w}= L_N \mathrm {u}\) to get rid of the scale \(L_N\) and \(x_0\) in the previous integral. Using the obvious scaling property of the \(\infty \)-Ginibre kernel, (1.16), we obtain

$$\begin{aligned} {\mathscr {J}}^{n}_N \simeq \int _{{\mathcal {K}}} d\mathrm {A}(\lambda ) \underset{w_{n+1} = w_0=0}{\int _{{\mathscr {A}}(0; \delta _N)}}\, F(\lambda +\mathrm {w}){\widetilde{\chi }}_N(\lambda ,\mathrm {w}) \prod _{j=0}^n K^\infty _{\eta _N(\lambda )}(w_{j}, w_{j+1}) d\mathrm {A}^{n}(\mathrm {w}) , \end{aligned}$$
(3.13)

where \(\displaystyle {\widetilde{\chi }}_N(\lambda ,\mathrm {w})= 1 + \underset{N\rightarrow \infty }{O}\big ( (\log N)^{2}\epsilon _N \big )\) uniformly for all \(\lambda \in {\mathcal {K}}\) and for all \(\mathrm {w}\in {\mathscr {A}}(0; \delta _N)\). Here we used that the test function f is supported in the set \({\mathcal {K}}\). The condition \(\sum _{\mathbf {k}\vdash n+1} \Upsilon (\mathbf {k}) = 0\) implies that \(F(\lambda +0)=0\) for all \(\lambda \in {\mathbb {C}}\) so that for all \(\mathrm {w}\in {\mathscr {A}}(0; \delta _N)\),

$$\begin{aligned} \big | F(\lambda +\mathrm {w}) \big | \le C \delta _N . \end{aligned}$$

Moreover, by formula (3.8), we have for any \(n \in {\mathbb {N}}\),

$$\begin{aligned} \prod _{j=0}^{n} \big |K^\infty _{\rho }(w_{j}, w_{j+1})\big | \le \rho ^{n+1} \prod _{j=1}^{n} e^{- \rho | v_j |^2 /2 } \end{aligned}$$
(3.14)

where \(v_j = w_j - w_{j-1}\) for all \(j=1,\dots , n\). Hence, we see that

$$\begin{aligned} \bigg | \underset{w_{n+1} = w_0=0}{\int _{{\mathscr {A}}(0; \delta _N)}}\, F(\lambda +\mathrm {w}) \prod _{j=0}^n K^\infty _{\eta _N(\lambda )}(w_{j}, w_{j+1}) d\mathrm {A}^{n}(\mathrm {w}) \bigg | \le C \delta _N \eta _N(\lambda ) \end{aligned}$$

and, since \(\eta _N(\lambda ) \le C N L_N^{-2}\) for all \(\lambda \in {\mathcal {K}}\), we deduce from formula (3.13) that

$$\begin{aligned} {\mathscr {J}}^{n}_N&= \int _{{\mathcal {K}}} d\mathrm {A}(\lambda ) \underset{w_{n+1} = w_0=0}{\int _{{\mathscr {A}}(0; \delta _N)}}\, F(\lambda +\mathrm {w}) \prod _{j=0}^n K^\infty _{\eta _N(\lambda )}(w_{j}, w_{j+1}) d\mathrm {A}^{n}(\mathrm {w}) \nonumber \\&\quad +\underset{N\rightarrow \infty }{O}\big ( N L_N^{-2} \delta _N \epsilon _N (\log N)^2 \big ) . \end{aligned}$$
(3.15)

Recall that \(\delta _N= L_N \epsilon _N\) and \(\epsilon _N=\kappa N^{-1/2}\log N\), so that the error term in (3.15) is of order \((\log N)^4 L_N^{-1}\). Moreover, if \(L_N = o\big ( \sqrt{N}/\log N \big )\), a Taylor approximation shows that for any \(\mathrm {w}\in {\mathscr {A}}(0; \delta _N) \),

$$\begin{aligned} F(\lambda ,\lambda +w_1,\dots , \lambda +w_n) =H^n( \lambda ; \mathrm {w}) + \underset{N\rightarrow \infty }{O}(\delta _N^3) . \end{aligned}$$

Using the estimate (3.14) once more, we deduce from formula (3.15) that

$$\begin{aligned} {\mathscr {J}}^{n}_N&= \int _{{\mathcal {K}}} d\mathrm {A}(\lambda ) \underset{w_{n+1} = w_0=0}{\int _{{\mathscr {A}}(0; \delta _N)}}\, H^n( \lambda ; \mathrm {w}) \prod _{j=0}^n K^\infty _{\eta _N(\lambda )}(w_{j}, w_{j+1}) d\mathrm {A}^{n}(\mathrm {w}) \nonumber \\&\quad +\underset{N\rightarrow \infty }{O}\big ( N L_N^{-2} \delta _N^3 \vee (\log N)^4 L_N^{-1} \big ) . \end{aligned}$$
(3.16)

By Lemma 4.3, the leading term in formula (3.16) has the same limit (up to an arbitrarily small error term) as

$$\begin{aligned} \int _{{\mathcal {K}}} d\mathrm {A}(\lambda ) \underset{w_{n+1} = w_0=0}{\int _{{\mathbb {C}}^n}}\, H^n( \lambda ; \mathrm {w}) \prod _{j=0}^n K^\infty _{\eta _N(\lambda )}(w_{j}, w_{j+1}) d\mathrm {A}^{n}(\mathrm {w}) \end{aligned}$$

and, since \(N L_N^{-2} \delta _N^3 \rightarrow 0\) when \(L_N = o \big ( \sqrt{N}/(\log N)^3 \big )\), this completes the proof. \(\square \)

Since the function \(\mathrm {w}\mapsto H^n(\lambda ; \mathrm {w})\) is a multivariate polynomial of degree 2, the leading term in the asymptotics (3.9) can be computed explicitly using the reproducing property of the \(\infty \)-Ginibre kernel; see for instance [44]. For any \(\rho >0\), the function \((z,w) \mapsto e^{\rho z {\overline{w}}}\) is the reproducing kernel of the Bergman (Fock) space of entire functions which are square-integrable for the weight \(\rho e^{-\rho |z|^2}\) on \({\mathbb {C}}\). This implies that for any \(w_1, w_3 \in {\mathbb {C}}\) and for all integers \(k\ge 0\),

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle \int _{\mathbb {C}}K^\infty _{\rho }(w_1,w_2) w_2^k K^\infty _{\rho }(w_2,w_3) d\mathrm {A}(w_2) = w_1^kK^\infty _{\rho }(w_1,w_3) \\ \displaystyle \int _{\mathbb {C}}K^\infty _{\rho }(w_1,w_2) \overline{w_2}^k K^\infty _{\rho }(w_2,w_3) d\mathrm {A}(w_2) = \overline{w_3}^kK^\infty _{\rho }(w_1,w_3) \end{array}\right. }. \end{aligned}$$
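To verify the first identity (the second follows by taking complex conjugates), write, as above, \(K^\infty _{\rho }(z,w) = \rho \, e^{\rho z {\overline{w}} - \rho (|z|^2+|w|^2)/2}\) and apply the reproducing property to the entire function \(g(z) = z^k e^{\rho z \overline{w_3}}\):

$$\begin{aligned} \int _{\mathbb {C}}K^\infty _{\rho }(w_1,w_2)\, w_2^k\, K^\infty _{\rho }(w_2,w_3)\, d\mathrm {A}(w_2)&= \rho \, e^{-\rho (|w_1|^2+|w_3|^2)/2} \int _{\mathbb {C}}g(w_2)\, e^{\rho w_1 \overline{w_2}}\, \rho e^{-\rho |w_2|^2}\, d\mathrm {A}(w_2) \\&= \rho \, e^{-\rho (|w_1|^2+|w_3|^2)/2}\, g(w_1) = w_1^k\, K^\infty _{\rho }(w_1,w_3) . \end{aligned}$$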

As a basic application of these identities, we obtain the following Lemma.

Lemma 3.5

Let \(n \ge 1\) and \(\rho >0\). For any polynomial \(H(\mathrm {w})\) of degree at most 2 in the variables \(w_1,\dots , w_n, \overline{w_1}, \dots , \overline{w_n}\) such that \(H(0)=0\), we have

$$\begin{aligned} \underset{w_{n+1} = w_0=0}{\int _{{\mathbb {C}}^n}}\, H( \mathrm {w}) \prod _{j=0}^nK^\infty _{\rho }(w_{j}, w_{j+1}) \ d\mathrm {A}^n(\mathrm {w}) = \sum _{1 \le r \le s\le n} \partial _s \overline{\partial }_r H |_{\mathrm {w}=0} . \end{aligned}$$
(3.17)
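For instance, when \(n=1\), any such polynomial reads \(H(w_1) = \alpha w_1 + \beta \overline{w_1} + \gamma w_1^2 + \delta \overline{w_1}^2 + \varepsilon w_1 \overline{w_1}\) and, since \(K^\infty _{\rho }(0,w_1) K^\infty _{\rho }(w_1,0) = \rho ^2 e^{-\rho |w_1|^2}\), only the radial term survives the angular integration (recall that \(d\mathrm {A}= \frac{r dr d\theta }{\pi }\)):

$$\begin{aligned} \int _{{\mathbb {C}}} H(w_1)\, \rho ^2 e^{-\rho |w_1|^2}\, d\mathrm {A}(w_1) = \varepsilon \int _0^\infty \rho ^2\, s\, e^{-\rho s}\, ds = \varepsilon = \partial _1 \overline{\partial }_1 H |_{w_1=0} , \end{aligned}$$

in agreement with (3.17).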

Under the assumptions of Proposition 3.4, since \(\eta _N(\lambda ) \rightarrow +\infty \) as \(N\rightarrow +\infty \) uniformly for all \(\lambda \in {\mathcal {K}}\), we deduce from Lemma 3.5 that for any test function \(f\in {\mathcal {C}}^3_c({\mathcal {K}})\), we have for all integers \(n\ge 1\) and \(m\ge 0\), as \(N\rightarrow +\infty \),

$$\begin{aligned}&\underset{z_0 = z_{n+1}}{\int _{{\mathbb {C}}^{n+1}}}\, \Upsilon ^{n+1}_m[f_N](z_0, \mathrm {z}) \prod _{j=0}^n K^N_V(z_{j}, z_{j+1}) d\mathrm {A}(z_0) d\mathrm {A}^{n}(\mathrm {z}) \nonumber \\&\quad \simeq \, \sum _{2 \le r \le s \le n+1}\int _{{\mathcal {K}}} \partial _s \overline{\partial }_r \Upsilon ^{n+1}_m[f] (\lambda ,\dots , \lambda ) d\mathrm {A}(\lambda ) . \end{aligned}$$
(3.18)

Here we used that according to formulae (2.7) and (2.9), we have for any \(m \ge 0\) and \(n\ge 1\),

$$\begin{aligned} \Upsilon ^{n+1}_m[f](\lambda , \dots , \lambda )= f(\lambda )^{n+1} \sum _{\mathbf {k}\vdash n+1} \Upsilon _m(\mathbf {k}) = 0 . \end{aligned}$$

In the macroscopic regime (\(L_N=1\), \(x_0=0\) and \({\mathcal {K}}= {\text {supp}}(f) \subset {\mathscr {S}}_V\)), by Remark 3.2, this also shows that for any \(n,m \ge 1\),

$$\begin{aligned} \underset{z_0 = z_{n+1}}{\int _{{\mathbb {C}}^{n+1}}}\, \Upsilon ^{n+1}_m[f](\mathrm {z}) \prod _{j=0}^n K^N_V(z_{j}, z_{j+1}) d\mathrm {A}^{n+1}(\mathrm {z}) = \underset{N\rightarrow \infty }{O}\left( (\log N)^4 \right) . \end{aligned}$$

This shows that the estimate (2.12) with \(M_N = N\) holds for any sequence \(q_N \searrow 0\) as \(N\rightarrow +\infty \). By Theorem 2.2, this completes the proof of Theorem 1.1.

3.3 Mesoscopic Fluctuations for 2-Dimensional Coulomb Gases and the Proofs of Theorems 1.2 and 1.3

In the mesoscopic regime, we claim that the asymptotics (3.18) with \(m=0\) implies the Central Limit Theorem 1.2. Indeed, the fact that the cumulants of order \(n\ge 3\) vanish in the large N limit comes from the following combinatorial Lemma.

Lemma 3.6

([44], Lemma 9) For any \(n\ge 1\), let

$$\begin{aligned} {\mathscr {Y}}_n= -\sum _{\mathbf {k}\vdash n} \Upsilon _0(\mathbf {k}) \left\{ \sum _{2 \le r < s\le \ell (\mathbf {k})} k_r k_s + \sum _{r=2}^{\ell (\mathbf {k})} k_r (k_r-n) \right\} . \end{aligned}$$

We have \(\displaystyle {\mathscr {Y}}_n = {\left\{ \begin{array}{ll} 1 &{}\text {if } n=2 \\ 0 &{}\text {else} \end{array}\right. }. \)

Proof of Theorem 1.2

Let \(\lambda \in {\mathbb {C}}\) and \(\varvec{\lambda }=(\lambda ,\dots , \lambda )\in {\mathbb {C}}^{n+1}\). According to formula (2.2), an elementary computation shows that for any \(2 \le r < s\le n+1\),

$$\begin{aligned} \partial _s \overline{\partial }_r \Upsilon _0^{n+1}[f](\varvec{\lambda }) = \partial f(\lambda ) \overline{\partial }f(\lambda ) f(\lambda )^{n-1} \sum _{\mathbf {k}\vdash n+1} \Upsilon _0(\mathbf {k}) k_r k_s \mathbf {1}_{s \le \ell (\mathbf {k})} \end{aligned}$$
(3.19)

and

$$\begin{aligned} \partial _r \overline{\partial }_r \Upsilon _0^{n+1}[f](\varvec{\lambda })&= \partial f(\lambda ) \overline{\partial }f(\lambda ) f(\lambda )^{n-1} \sum _{\mathbf {k}\vdash n+1} \Upsilon _0(\mathbf {k}) k_r(k_r-1) \mathbf {1}_{r \le \ell (\mathbf {k})}\nonumber \\&\quad + \Delta f(\lambda ) f(\lambda )^{n}\sum _{\mathbf {k}\vdash n+1} \Upsilon _0(\mathbf {k}) k_r \mathbf {1}_{r \le \ell (\mathbf {k})} . \end{aligned}$$
(3.20)

Since, by integration by parts,

$$\begin{aligned} \int _{\mathbb {C}}\Delta f(\lambda ) f(\lambda )^{n} d\mathrm {A}(\lambda ) = - n \int _{\mathbb {C}}\partial f(\lambda ) \overline{\partial }f(\lambda ) f(\lambda )^{n-1} d\mathrm {A}(\lambda ) , \end{aligned}$$

we deduce from formulae (3.19) and (3.20) that

$$\begin{aligned} \sum _{2 \le r \le s \le n+1}\int _{{\mathbb {C}}} \partial _s \overline{\partial }_r \Upsilon ^{n+1}_0[f] (\varvec{\lambda }) d\mathrm {A}(\lambda ) = {\mathscr {Y}}_{n+1} \int _{{\mathbb {C}}} \partial f(\lambda ) \overline{\partial }f(\lambda ) f(\lambda )^{n-1} d\mathrm {A}(\lambda ) . \end{aligned}$$

When \( L_N =N^\alpha \) and \(0<\alpha <1/2\), formulae (2.5) and (3.18) with \(m=0\) imply that for any \(n\ge 1\),

$$\begin{aligned} \lim _{N\rightarrow \infty }{\text {C}}^{n+1}_{K^N_V}[f_N] = {\mathscr {Y}}_{n+1} \int _{{\mathbb {C}}} \partial f(\lambda ) \overline{\partial }f(\lambda ) f(\lambda )^{n-1} d\mathrm {A}(\lambda ) . \end{aligned}$$
(3.21)

By Lemma 3.6, this proves that for any test function \(f\in {\mathcal {C}}^3_0({\mathbb {C}})\) and any \(n\ge 2\),

$$\begin{aligned} \lim _{N\rightarrow \infty }{\text {C}}^{n}_{K^N_V}[f_N] = {\left\{ \begin{array}{ll} \Vert f \Vert _{H^1({\mathbb {C}})}^2 &{}\text {if }n=2 \\ 0 &{}\text {else} \end{array}\right. }. \end{aligned}$$
(3.22)

This shows that the centered mesoscopic linear statistics \(\Xi (f_N) -{\mathbb {E}}\big [\Xi (f_N) \big ] \) converges in distribution as \(N\rightarrow \infty \) to the mean-zero Gaussian random variable \(\mathrm {X}(f)\). \(\square \)

We are now ready to finish the proof of Theorem 1.3. By Lemma 3.1, for any bounded function f with compact support, we have for any \(n\ge 1\),

$$\begin{aligned} \int _{\mathbb {C}}f_N(z)^n K^N_V(z,z) d\mathrm {A}(z)&= N \int _{\mathbb {C}}f_N(z)^n 2 \Delta V(z) d\mathrm {A}(z) + \underset{N\rightarrow \infty }{O}(1) \\&= N L_N^{-2} \varrho _V(x_0) \int _{\mathbb {C}}f(z)^n d\mathrm {A}(z) + \underset{N\rightarrow \infty }{O}(N L_N^{-3} ) . \end{aligned}$$

Here we used that the potential V is smooth and that \(\varrho _V = 2 \Delta V >0\) on a small neighborhood of the point \(x_0 \in {\mathscr {S}}_V\). This implies the Assumption (2.11) with \(M_N = N L_N^{-2} \varrho _V(x_0) = N^{1-2\alpha } \varrho _V(x_0)\); since the parameter \(\alpha <1/2\), \(M_N \nearrow +\infty \) as \(N\rightarrow +\infty \). As we already pointed out, the asymptotics (3.18) yield the Assumption (2.12) with an error which is O(1). Finally, the Assumption (2.13) was proved just above, see (3.22). So, by Theorem 2.2, this completes the proof of Theorem 1.3.

4 Transition for 1-Dimensional Log-Gases

4.1 Asymptotics of Orthogonal Polynomials

In this section, we begin by reviewing basic facts about the asymptotics of orthogonal polynomials which are required for the proofs of Theorems 1.4 and 1.5. A comprehensive reference for the results discussed in this section is the book of Deift [22]. We assume that the potential \(V \in C^2({\mathbb {R}})\) is a function which satisfies the condition (1.1) and we let \(\Xi \) and \(\widehat{\Xi }\) be the determinantal processes with correlation kernels \(K^N_V\) and \({\widehat{K}}^N_V = p_N K^N_V\) respectively.

The proof of Theorem 1.4 relies on a combinatorial method introduced in [17] which consists in using the three-term recurrence relation of the orthogonal polynomials \(\{P_k \}_{k=0}^\infty \) with respect to the measure \(d\mu _N = e^{- 2N V(x)}dx\) to compute the cumulants of polynomial linear statistics. For any \(N\in {\mathbb {N}}\), there exist two sequences \(a^N_k >0\) and \(b^N_k \in {\mathbb {R}}\) such that the orthogonal polynomials \(P_k\) in (1.4) satisfy

$$\begin{aligned} xP_k(x) = a^N_k P_{k+1}(x) + b^N_k P_k(x) + a^N_{k-1} P_{k-1}(x) . \end{aligned}$$
(4.1)
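For instance, for the Gaussian potential \(V(x)=x^2\) (the GUE case), the \(P_k\) are rescaled Hermite polynomials and a classical computation gives

$$\begin{aligned} a^N_k = \sqrt{\frac{k+1}{4N}} , \qquad b^N_k = 0 , \end{aligned}$$

so that \(a^N_{N+j} \rightarrow 1/2\) and \(b^N_{N+j} \rightarrow 0\) for any fixed j; this is the prototype of the right-limit behaviour discussed below.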

In particular, the completion \({\mathscr {L}}_N\) of the space of polynomials in \(L^2({\mathbb {R}},\mu _N)\) is isomorphic to \(L^2({\mathbb {N}}_0)\) and formula (4.1) implies that the multiplication by x on \({\mathscr {L}}_N\) is unitarily equivalent to applying the Jacobi matrix

$$\begin{aligned} \mathbf {J} := \begin{bmatrix} b_0^N & a_0^N & 0 & 0 & \cdots \\ a_0^N & b_1^N & a_1^N & 0 & \\ 0 & a_1^N & b_2^N & a_2^N & \\ 0 & 0 & a_2^N & b_3^N & \ddots \\ \vdots & & & \ddots & \ddots \end{bmatrix} . \end{aligned}$$
(4.2)

We also let \(\varvec{\Pi }_N\) be the orthogonal projection onto \({\text {span}}\{e_0, \dots , e_{N-1}\}\) acting on \(L^2({\mathbb {N}}_0)\). The connection with eigenvalue statistics comes from the fact that for any polynomial Q and for any composition \(\mathbf {k}\vdash n\) with \(\ell (\mathbf {k}) = l\), one has

$$\begin{aligned}&\underset{x_{0} =x_l}{\int _{{\mathfrak {X}}^l}} Q(x_1)^{k_1} \cdots Q(x_l)^{k_l} \prod _{1\le j\le l} K^N_V(x_j, x_{j-1}) dx_1 \cdots dx_l \\&\quad = {\text {Tr}}\big [Q(\mathbf {J})^{k_1}\varvec{\Pi }_N\cdots Q(\mathbf {J})^{k_l}\varvec{\Pi }_N \big ] \\&\quad = \sum _{m=0}^{N-1} \sum _{\pi \in \Gamma _{m}^n } \prod _{1\le j < l} \mathbf {1}_{\pi (k_1+\cdots + k_j) < N} \prod _{i=0}^{n-1} Q(\mathbf {J})_{\pi (i)\pi (i+1)} , \end{aligned}$$

where \({\mathcal {G}}\) denotes the adjacency graph of the matrix \(Q(\mathbf {J})\) and

$$\begin{aligned} \Gamma _{m}^n = \big \{ \text {paths }\pi \text { on the graph }{\mathcal {G}}\text { of length }n \text { such that }\pi (0)=\pi (n)=m \big \} . \end{aligned}$$
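For instance, if \(Q(x)=x\), then \(Q(\mathbf {J})=\mathbf {J}\) and \({\mathcal {G}}\) is the nearest-neighbour graph on \({\mathbb {N}}_0\), with a loop at each vertex m where \(b^N_m \ne 0\). In this case, with the convention \(a^N_{-1}=0\),

$$\begin{aligned} \sum _{\pi \in \Gamma _{m}^2} \prod _{i=0}^{1} \mathbf {J}_{\pi (i)\pi (i+1)} = (b^N_m)^2 + (a^N_m)^2 + (a^N_{m-1})^2 , \end{aligned}$$

the three terms corresponding to the closed paths \(m\rightarrow m\rightarrow m\), \(m \rightarrow m+1 \rightarrow m\) and \(m \rightarrow m-1 \rightarrow m\).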

Given a path \(\pi \) of length n and a composition \(\mathbf {k}\vdash n\), we let

$$\begin{aligned} \Phi _{\pi }^N(\mathbf {k}) := \mathbf {1}_{\displaystyle \max _{1\le j <\ell (\mathbf {k})}\pi (k_1+\cdots + k_j) \ge N} . \end{aligned}$$
(4.3)

Observe that

$$\begin{aligned} \prod _{1\le j<\ell (\mathbf {k})} \mathbf {1}_{\pi (k_1+\cdots + k_j) < N} = 1- \Phi _\pi ^N(\mathbf {k}) , \end{aligned}$$

so that by formula (2.5), the cumulants of a polynomial linear statistic are given by

$$\begin{aligned} {\text {C}}^n_{K^N_V}[Q] = - \sum _{m =0}^{N-1} \sum _{\pi \in \Gamma _{m}^n } \prod _{i=0}^{n-1} Q(\mathbf {J})_{\pi (i)\pi (i+1)} \sum _{\mathbf {k}\vdash n}\Upsilon _0(\mathbf {k})\big ( 1-\Phi _\pi ^N(\mathbf {k})\big ) . \end{aligned}$$
(4.4)

By definition, there exists a constant \(M>0\), which depends only on the degree of Q and on n (one may take \(M = n \, {\text {deg}}\, Q\), since a path on \({\mathcal {G}}\) moves by at most \({\text {deg}}\, Q\) at each step), such that \(\Phi _\pi ^N=0 \) for any path \(\pi \in \Gamma ^n_m\) as long as \(m < N - M\). Since \( \sum _{\mathbf {k}\vdash n}\Upsilon _0(\mathbf {k}) = 0\) for all \(n \ge 2\), formula (4.4) implies that

$$\begin{aligned} {\text {C}}^n_{K^N_V}[Q] = \sum _{m =N -M}^{N-1} \sum _{\pi \in \Gamma _{m}^n } \prod _{i=0}^{n-1} Q(\mathbf {J})_{\pi (i)\pi (i+1)} \sum _{\mathbf {k}\vdash n}\Upsilon _0(\mathbf {k})\Phi _\pi ^N(\mathbf {k}) . \end{aligned}$$

In particular, if the Jacobi matrix has a right-limit, i.e. there exists an (infinite) matrix \(\mathbf {L}\) such that for all \(i,j \in {\mathbb {Z}}\),

$$\begin{aligned} \lim _{N\rightarrow \infty } \mathbf {J}_{N+i, N+j} = \mathbf {L}_{i,j} \end{aligned}$$

then

$$\begin{aligned} \lim _{N\rightarrow \infty }{\text {C}}^n_{K^N_V}[Q] = \sum _{m =1}^{M} \sum _{\pi \in {\widetilde{\Gamma }}_{m}^n } \prod _{i=0}^{n-1} Q(\mathbf {L})_{\pi (i)\pi (i+1)} \sum _{\mathbf {k}\vdash n}\Upsilon _0(\mathbf {k})\Phi _\pi ^0(\mathbf {k}) , \end{aligned}$$
(4.5)

where \(\widetilde{{\mathcal {G}}}\) denotes the adjacency graph of the matrix \(Q(\mathbf {L})\) and

$$\begin{aligned} {\widetilde{\Gamma }}_{m}^n = \big \{ \text {paths }\pi \text { on the graph }\widetilde{{\mathcal {G}}}\text { of length }n \text { such that }\pi (0)=\pi (n)=-m \big \} . \end{aligned}$$

The condition (1.21) implies that the right-limit of the Jacobi matrix is a tridiagonal matrix \(\mathbf {L}\) such that \(\mathbf {L}_{jj}= 0\) and \(\mathbf {L}_{j, j\pm 1} =1/2\) for all \(j \in {\mathbb {Z}}\) and it was proved in [39], see also [17], that in this case:

$$\begin{aligned} \lim _{N\rightarrow \infty } {\text {C}}^n_{K^N_V}[Q] = {\left\{ \begin{array}{ll} \displaystyle \sum _{k=1}^{{\text {deg}} Q} k \bigg ( \int _{-1}^1 Q(x) T_k(x) \frac{dx}{\pi \sqrt{1-x^2}} \bigg )^2 &{}\text {if } n=2 \\ 0 &{}\text {else} \end{array}\right. } . \end{aligned}$$
(4.6)
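As a numerical sanity check of (4.6), consider the Gaussian case \(V(x)=x^2\) with the recurrence coefficients recorded after (4.1). For \(\beta =2\), the variance of \(\Xi (Q)\) can be written, by the standard identity for projection kernels, as \({\text {Tr}}\big [\varvec{\Pi }_N Q(\mathbf {J})(1-\varvec{\Pi }_N) Q(\mathbf {J})\varvec{\Pi }_N\big ]\), the squared Frobenius norm of the block of \(Q(\mathbf {J})\) connecting \(\{0,\dots ,N-1\}\) to \(\{N, N+1,\dots \}\). For \(Q(x)=x^2\), since \(x^2 = \frac{1}{2}\big (T_0(x)+T_2(x)\big )\), the right-hand side of (4.6) equals \(2 \cdot (1/4)^2 = 1/8\). The following sketch (the helper names are ours, not the authors') reproduces this value:

```python
import numpy as np

def jacobi_gue(N, extra=10):
    # Truncated Jacobi matrix for the weight exp(-2N x^2):
    # a_k^N = sqrt((k+1)/(4N)), b_k^N = 0 (rescaled Hermite polynomials).
    a = np.sqrt(np.arange(1, N + extra) / (4.0 * N))
    return np.diag(a, 1) + np.diag(a, -1)

def variance_x2(N, extra=10):
    # C^2[Q] = Tr[Pi Q(J)(1 - Pi) Q(J) Pi] with Pi the projection onto the
    # first N basis vectors; equivalently, the squared Frobenius norm of
    # the off-corner block of Q(J). Here Q(x) = x^2, so Q(J) = J @ J.
    J = jacobi_gue(N, extra)
    QJ = J @ J
    return np.sum(QJ[:N, N:] ** 2)

for N in (50, 200, 800):
    print(N, variance_x2(N))  # matches the Chebyshev prediction 1/8
```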

The combinatorial method described above is well-suited to investigate the global fluctuations of 1-dimensional log-gases, but it is difficult to implement in the mesoscopic regime, since we can no longer use polynomials as test functions. So, to describe the transition for mesoscopic linear statistics and to prove Theorem 1.5, we rely on the asymptotics of the correlation kernel \(K^N_V\) from [24] and on the method from [38], which we review below. Recall that \(\varrho _V\) is the equilibrium density of the gas; define the integrated density of states:

$$\begin{aligned} F_V(x) = \int _0^x \varrho _V(s) ds. \end{aligned}$$
(4.7)

Let us fix \(x_0\in {\mathscr {I}}_V\), \(0<\alpha <1\), and set

$$\begin{aligned} {\widetilde{K}}^N_{V,x_0}(x, y) = \frac{1}{N^\alpha } K_{V}^N\left( x_0+\frac{x}{N^{\alpha }} ,x_0 +\frac{y}{N^{\alpha }}\right) . \end{aligned}$$
(4.8)

Based on the results of [24], we have for any \(\alpha \in (0,1]\),

$$\begin{aligned} {\widetilde{K}}^N_{V,x_0}(x, y) = \frac{\sin \left[ \pi N \big ( F_V(x_0+ x N^{-\alpha })-F_V(x_0+ y N^{-\alpha })\big )\right] }{\pi (x-y)}+ \underset{N\rightarrow \infty }{O}\left( N^{-\alpha }\right) ,\quad \end{aligned}$$
(4.9)

uniformly for all x, y in compact subsets of \({\mathbb {R}}\); cf. [38, Proposition 3.5]. The main idea of the method of [38] is to compare the kernel (4.9) to the sine-kernel and to use the results of Soshnikov [48] for the cumulants of linear statistics of the Sine process. We define the sine-kernel with density \(\rho >0\) on \({\mathbb {R}}\) by

$$\begin{aligned} K^{\sin }_\rho (x,y) = \frac{\sin [\pi \rho (x-y)]}{\pi (x-y)} . \end{aligned}$$
(4.10)

We see by taking \(\alpha =1\) in formula (4.9) that the Sine process with correlation kernel (4.10) describes the local limit in the bulk of the 1-dimensional log-gases. In the mesoscopic regime, it was proved in [38] that, up to a change of variable, it is possible to replace the kernel \({\widetilde{K}}^N_{V,x_0}\) by an appropriate sine-kernel using the asymptotics (4.9) in the cumulant formulae. Namely, for any \(n\ge 2\),

$$\begin{aligned} {\text {C}}^n_{K^N_V}[f_N] \simeq {\text {C}}^n_{K^{\sin }_{\eta _N(x_0) }}[ f\circ \zeta _N] \end{aligned}$$
(4.11)

where

$$\begin{aligned} \zeta _N(x) = N^\alpha \left\{ G_V\left( F_V(x_0) + \varrho _V(x_0) \frac{x}{N^\alpha } \right) -x_0\right\} . \end{aligned}$$
(4.12)

Here, the function \(G_V\) denotes the inverse of the integrated density of states \(F_V\), (4.7). By the inverse function theorem, \(G_V\) is well-defined in a neighborhood of any point \(F_V(x_0)\) when \(x_0\in {\mathscr {I}}_V\), and the map \(\zeta _N\) is well-defined on any compact subset of \({\mathbb {R}}\) as long as the parameter N is sufficiently large; the mechanism behind the change of variable (4.12) is spelled out after (4.13) below. Then, using Soshnikov's main combinatorial Lemma, it was proved in [38] that

$$\begin{aligned} \lim _{N\rightarrow \infty } {\text {C}}^n_{K^N_V}[f_N] = {\left\{ \begin{array}{ll} \displaystyle \int _{0}^\infty u \big | {\hat{f}}(u) \big |^2 du &{}\text {if } n=2 \\ 0 &{}\text {if } n \ge 3 \end{array}\right. }. \end{aligned}$$
(4.13)
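To see where the change of variable (4.12) comes from, note that, by definition of \(G_V\), we have \(F_V\big (x_0 + \zeta _N(x) N^{-\alpha }\big ) = F_V(x_0) + \varrho _V(x_0)\, x N^{-\alpha }\) for all x in a compact set, once the parameter N is sufficiently large. Hence, the oscillating phase in (4.9) becomes linear in the new variables:

$$\begin{aligned} N \Big ( F_V\big (x_0 + \zeta _N(x) N^{-\alpha }\big ) - F_V\big (x_0 + \zeta _N(y) N^{-\alpha }\big ) \Big ) = N^{1-\alpha } \varrho _V(x_0) (x-y) , \end{aligned}$$

which is the phase of a sine-kernel with density \(N^{1-\alpha }\varrho _V(x_0)\), the quantity \(\eta _N(x_0)\) appearing in (4.11).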

4.2 Proof of Theorem 1.4: The Global Regime

In this section, we modify the strategy described above in order to deduce Theorem 1.4 from our general Theorem 2.2. The first step is to verify the Assumption (2.11). By [43, Theorem 11.1.2], the expected density of states satisfies for all \(x\in {\mathbb {R}}\),

$$\begin{aligned} u^N_V(x) \le e^{- 2N \{ V(x) - \log (1+|x|) - C\} } , \end{aligned}$$

where C is a constant which depends only on the potential V. This implies that the law of large numbers (1.5) can be extended to all continuous functions with polynomial growth. Moreover, by (4.6), we obtain the asymptotics (2.13) for the cumulants \( {\text {C}}^n_{K_V^N}[Q]\) of the linear statistic \(\Xi (Q)\). Then, it only remains to verify that the estimates (2.12) hold for any polynomial test function. By (2.9), since \(\sum _{\mathbf {k}\vdash n} \Upsilon _m(\mathbf {k}) = 0\), the very same computation leading to (4.5) shows that for any integers \(n,m\ge 1\)

$$\begin{aligned}&\lim _{N\rightarrow \infty } \underset{x_{0} =x_n}{\int _{{\mathfrak {X}}^n}} \Upsilon ^n_m[Q](x) \prod _{1\le j\le n} K^N_V(x_j, x_{j-1}) d^nx\\&\quad = \sum _{s =1}^{M} \sum _{\pi \in {\widetilde{\Gamma }}_{s}^n } \prod _{i=0}^{n-1} Q(\mathbf {L})_{\pi (i)\pi (i+1)} \sum _{\mathbf {k}\vdash n}\Upsilon _m(\mathbf {k})\Phi _\pi ^0(\mathbf {k}) . \end{aligned}$$

Since the (infinite) matrix \(\mathbf {L}\) is bounded (it is tridiagonal with entries bounded by 1/2, so that \(\Vert \mathbf {L}\Vert \le 1\)), we obtain the estimates (2.12) with an error which is O(1). This completes the proof of Theorem 1.4.

Remark 4.1

(Generalizations of Theorem 1.4) Note that we have formulated Theorem 1.4 for a log-gas at inverse temperature \(\beta =2\), but the previous proof can be generalized to other one-dimensional biorthogonal ensembles with a correlation kernel of the form:

$$\begin{aligned} K^N(z,w) = \sum _{k=0}^{N-1} \varphi _k^N(z) \varpi _k^N(w) . \end{aligned}$$

The appropriate assumptions are that there exists an equilibrium density and a law of large numbers holds for all polynomials, that the family \(\{\varphi _k^N \}_{k=0}^\infty \) satisfies a q-term recurrence relation for all \(N\in {\mathbb {N}}\), and that the corresponding recurrence matrix \(\mathbf {J}\) has a right-limit \(\mathbf {L}\) as \(N\rightarrow \infty \). This applies to other orthogonal polynomial ensembles, such as the discrete point processes coming from lozenge tilings of hexagons, as well as to some non-symmetric biorthogonal ensembles such as the Muttalib-Borodin ensembles, squared singular values of products of complex Ginibre matrices, or certain two-matrix models; see [17, 39] for more details. Moreover, we only require that the right-limit \(\mathbf {L}\) exists; it need not be a Toeplitz matrix. Then, we obtain a crossover from a non-Gaussian process (described by \(\mathbf {L}\) in the regime where \(T_N\rightarrow 0\)) to a Poisson process (when \(T_N\rightarrow \infty \)). For instance, such a transition arises when considering linear statistics of log-gases in the multi-cut regime [39].

Remark 4.2

It was proved in [31, Section 5] that when V is a convex polynomial, then \({\mathscr {S}}_V=(-1,1)\) and the conditions (1.21) are satisfied. In fact, Johansson’s argument shows that these conditions are also necessary to have a CLT for polynomial test functions. Therefore, it is an interesting question to know whether (1.21) and the one-cut condition \({\mathscr {S}}_V=(-1,1)\) are equivalent.

4.3 Proof of Theorem 1.5: The mesoscopic regime

Let us fix \(x_0\in {\mathscr {I}}_V\), \(0<\alpha <1\) and let \(f_N= f\big (N^\alpha ( \cdot -x_0) \big )\) where \(f\in {\mathcal {C}}^2_c({\mathbb {R}})\). First, observe that by a change of variable, we have for all \(n\ge 1\),

$$\begin{aligned} \int _{\mathbb {R}}f_N(x)^n K_V^N(x,x) dx = \int _{\mathbb {R}}f(x)^n {\widetilde{K}}^N_{V,x_0}(x,x) dx , \end{aligned}$$

where \({\widetilde{K}}^N_{V,x_0}\) is given by (4.8). Using the asymptotics (4.9), we have that

$$\begin{aligned} {\widetilde{K}}^N_{V,x_0}(x, x) = N^{1-\alpha } \varrho _V(x_0) + \underset{N\rightarrow \infty }{O}\left( N^{-\alpha }\right) \end{aligned}$$

uniformly for all \(x\in {\text {supp}}(f)\). If \(M_N = N^{1-\alpha } \varrho _V(x_0) \), this implies that for all \(n\in {\mathbb {N}}\),

$$\begin{aligned} \frac{1}{M_N} \int _{\mathbb {R}}f_N(x)^n K_V^N(x,x) dx \simeq \int _{\mathbb {R}}f(x)^{n} dx . \end{aligned}$$

Thus, we obtain the condition (2.11) with \(d\eta =dx\). Moreover, the Assumption (2.13) is given by (4.13). Then it just remains to prove the estimates (2.12). Let us observe that by a change of variables, we also have for all \(n, m \ge 1\),

$$\begin{aligned}&\underset{u_{0} =u_n}{\int _{{\mathbb {R}}^n}} \Upsilon ^n_m[f_N](u) \prod _{1\le j\le n} K_V^N(u_j, u_{j-1}) d\mu ^n(u)\nonumber \\&\quad = \underset{u_{0} =u_n}{\int _{{\mathbb {R}}^n}} \Upsilon ^n_m[f](u) \prod _{1\le j\le n} {\widetilde{K}}^N_{V,x_0}(u_j, u_{j-1}) d\mu ^n(u) . \end{aligned}$$
(4.14)

Exactly like the proof of (4.11)—which corresponds to the case \(m=0\) by formula (2.5)—we deduce from the proof of [38, Proposition 2.2] that for any \(m \ge 1\),

$$\begin{aligned} \underset{x_{0} =x_n}{\int _{{\mathbb {R}}^n}} \Upsilon ^n_m[f](x) \prod _{1\le j\le n} {\widetilde{K}}^N_{V,x_0}(x_j, x_{j-1}) d^nx \ \simeq \underset{x_{0} =x_n}{\int _{{\mathbb {R}}^n}} \, \Upsilon ^n_m[h_N](x) \prod _{1\le j\le n} K^{\sin }_{\eta _N}(x_j, x_{j-1}) d^nx\nonumber \\ \end{aligned}$$
(4.15)

where \(h_N = f\circ \zeta _N\) and \(\zeta _N\) is given by (4.12). To finish the proof, we need the following estimates.

Proposition 4.1

Suppose that \({\text {supp}}(f) \subset (-L, L)\). There exists \(N_0 >0\) such that for all \(N \ge N_0\), the functions \(h_N= f\circ \zeta _N\) are well-defined on \({\mathbb {R}}\), \( h_N\in C^2_c([-L,L])\) and for all \(u\in {\mathbb {R}}\),

$$\begin{aligned} \big | \widehat{h_N}(u) \big | \le \Vert f \Vert _{{\mathcal {C}}^2({\mathbb {R}})} \frac{C }{1+ |u|^2} . \end{aligned}$$
(4.16)

Proof

When the potential V is analytic, the bulk \({\mathscr {I}}_V\) consists of finitely many open intervals and the equilibrium density \(\varrho _V\) is smooth on \({\mathscr {I}}_V\). Since \(x_0\in {\mathscr {I}}_V\), by formula (4.12), the function \(\zeta _N\) is increasing and smooth on the interval \([-L,L]\) with

$$\begin{aligned} \zeta _N''(x) = \varrho _V(x_0)^2 G_V''\big (F_V(x_0) + \varrho _V(x_0) xN^{-\alpha }\big ) N^{-\alpha }. \end{aligned}$$

Moreover, since \(\zeta _N(0)=0\) and \(\zeta '_N(0) =G_V'\big (F_V(x_0)\big ) \varrho _V(x_0) =1\), this implies that

$$\begin{aligned} \zeta _N(x) = x + \underset{N\rightarrow \infty }{O}(N^{-\alpha }) \end{aligned}$$

uniformly for all \(x\in [-L,L]\). Since the open interval \((-L, L)\) contains the support of the test function f, this estimate shows that when the parameter N is large, we can define \(h_N(x) = f\big ( \zeta _N(x) \big )\) for all \(x\in [-L, L]\) and extend it by 0 on \({\mathbb {R}}\backslash [-L,L]\). Then \(h_N \in C^2_0({\mathbb {R}})\) and

$$\begin{aligned} h_N''(x) = \zeta _N''(x) f'(\zeta _N(x)) + \zeta _N'(x)^2f''(\zeta _N(x)) \end{aligned}$$
(4.17)

for all \(x\in [-L,L]\). Moreover, we can use the estimate

$$\begin{aligned} \big | \widehat{h_N}(u) \big | \le C \frac{\Vert h_N \Vert _\infty + \Vert h_N''\Vert _\infty }{1+ |u|^2} \end{aligned}$$
(4.18)

to get the upper-bound (4.16). Indeed, since \(h_N\) is supported in \([-L,L]\), we have \(|\widehat{h_N}(u)| \le 2L \Vert h_N \Vert _\infty \) and \(4\pi ^2 |u|^2\, |\widehat{h_N}(u)| = \big |\widehat{h_N''}(u)\big | \le 2L \Vert h_N''\Vert _\infty \), which gives (4.18). Plainly \(\Vert h_N \Vert _\infty \le \Vert f\Vert _\infty \) and it is easy to deduce from formula (4.17) that

$$\begin{aligned} \Vert h_N'' \Vert _\infty \le \Vert f' \Vert _\infty + C \Vert f''\Vert _\infty . \end{aligned}$$

\(\square \)

To compute the limit of the RHS of (4.15), we also need the following asymptotics which come from the proof of Lemma 1 in Soshnikov’s paper [48] on linear statistics of the CUE and Sine process.

Lemma 4.2

Let \(n\ge 2\) and let \(\eta _N>0\) be such that \(\eta _N\nearrow \infty \) as \(N \rightarrow +\infty \). Suppose that \((h_N)\) is a sequence of integrable functions such that

$$\begin{aligned} \lim _{N\rightarrow \infty }\, \underset{{\begin{array}{c} u_1+\cdots +u_{n}=0 \\ |u_1|+\cdots +|u_n| > \eta _N \end{array}}}{\int _{{\mathbb {R}}^{n-1}}} \, \big | \widehat{h_N}(u_1)\cdots \widehat{h_N}(u_n) \big | |u_1| d^{n-1}\mathrm {u} =0 . \end{aligned}$$
(4.19)

Then, for any map \(\Upsilon : \mho \rightarrow {\mathbb {R}}\) such that \(\sum _{\mathbf {k}\vdash n} \Upsilon (\mathbf {k}) =0\), we have

$$\begin{aligned}&\underset{x_{0} =x_n}{\int _{{\mathfrak {X}}^n}} \Upsilon ^n[h_N](x) \prod _{1\le j\le n} K^{\sin }_{\eta _N}(x_j, x_{j-1}) d^nx\ \nonumber \\&\quad \simeq - \underset{u_1+\cdots +u_n=0}{\int _{{\mathbb {R}}^{n-1}}} \mathfrak {R}\bigg \{ \prod _{j=1}^n \widehat{h_N}(u_j) \bigg \}\sum _{\mathbf {k}\vdash n}\Upsilon (\mathbf {k})\Psi _u(\mathbf {k}) d^{n-1}u , \end{aligned}$$
(4.20)

where for any \(u \in {\mathbb {R}}^n\) and for any composition \(\mathbf {k}\vdash n\),

$$\begin{aligned} \Psi _u(\mathbf {k}) = 2\max _{1\le j <\ell (\mathbf {k})}\{0, u_1+\cdots + u_{k_1+\cdots +k_j} \} . \end{aligned}$$

Proof

Based on the formula

$$\begin{aligned} K^{\sin }_{\eta _N}(x,y) = \int _{\mathbb {R}}\mathbf {1}_{\{ |u|< \eta _N/2\} } e^{2\pi i u (x-y)} du , \end{aligned}$$

we obtain, after integrating out the variables x (a single free frequency variable remains, and the factor \(\max \big \{0, \eta _N - \Psi _u(\mathbf {k})/2 - \Psi _{-u}(\mathbf {k})/2 \big \}\) below is the Lebesgue measure of its admissible range),

$$\begin{aligned} {\mathscr {T}}_N&:=\underset{x_{0} =x_n}{\int _{{\mathfrak {X}}^n}} \Upsilon ^n[h_N](x) \prod _{1\le j\le n} K^{\sin }_{\eta _N}(x_j, x_{j-1}) d^nx \\&= \underset{u_1+\cdots +u_n=0}{\int _{{\mathbb {R}}^{n-1}}} \prod _{j=1}^n \widehat{h_N}(u_j) \sum _{\mathbf {k}\vdash n}\Upsilon (\mathbf {k}) \max \big \{0, \eta _N - \Psi _u(\mathbf {k})/2 - \Psi _{-u}(\mathbf {k})/2 \big \} d^{n-1}u . \end{aligned}$$

Then, the condition \(\sum _{\mathbf {k}\vdash n} \Upsilon (\mathbf {k}) =0\) implies that

$$\begin{aligned}&\bigg | {\mathscr {T}}_N + \underset{u_1+\cdots +u_n=0}{\int _{{\mathbb {R}}^{n-1}}} \prod _{j=1}^n \widehat{h_N}(u_j) \sum _{\mathbf {k}\vdash n}\Upsilon (\mathbf {k}) \frac{ \Psi _u(\mathbf {k}) + \Psi _{-u}(\mathbf {k})}{2} d^{n-1}u \bigg | \\&\quad \le \underset{u_1+\cdots +u_n=0 }{\int _{{\mathbb {R}}^{n-1}}} \bigg | \prod _{j=1}^n \widehat{h_N}(u_j) \bigg | \sum _{\mathbf {k}\vdash n} |\Upsilon (\mathbf {k}) | \Psi _u(\mathbf {k}) \mathbf {1}_{ \{\Psi _u(\mathbf {k}) + \Psi _{-u}(\mathbf {k}) \ge 2 \eta _N \} }d^{n-1}u \end{aligned}$$

Since \( \big | \Psi _u(\mathbf {k})/2 \big | \le |u_1|+\cdots +|u_n|\) for any \(\mathbf {k}\vdash n\), the condition (4.19) is sufficient to obtain the asymptotics (4.20). \(\square \)

We are now ready to complete the proof of Theorem 1.5. Using the estimate (4.16), we see that when the parameter N is sufficiently large, there exists a constant \(C>0\) so that

$$\begin{aligned} \begin{aligned}&\Bigg | \underset{u_1+\cdots +u_n=0}{\int _{{\mathbb {R}}^{n-1}}} \, \mathfrak {R}\bigg \{ \prod _{j=1}^n \widehat{h_N}(u_j) \bigg \}\sum _{\mathbf {k}\vdash n}\Upsilon _m(\mathbf {k})\Psi _u(\mathbf {k}) d^{n-1}u \Bigg | \\&\quad \le C \underset{u_1+\cdots +u_n=0}{\int _{{\mathbb {R}}^{n-1}}} \frac{|u_1|+\cdots +|u_n|}{(1+ |u_1|^2) \cdots (1+|u_n|^2)} d^{n-1}u . \end{aligned} \end{aligned}$$
(4.21)

A similar upper-bound shows that the sequence \(h_N = f\circ \zeta _N\) satisfies the condition (4.19) of Lemma 4.2. Hence, combining the asymptotics (4.15) and (4.20), we obtain

$$\begin{aligned}&\underset{x_{0} =x_n}{\int _{{\mathbb {R}}^n}} \Upsilon ^n_m[f](x) \prod _{1\le j\le n} {\widetilde{K}}^N_{V,x_0}(x_j, x_{j-1}) d^nx \simeq \nonumber \\&\quad - \underset{u_1+\cdots +u_n=0}{\int _{{\mathbb {R}}^{n-1}}} \mathfrak {R}\bigg \{ \prod _{j=1}^n \widehat{h_N}(u_j) \bigg \}\sum _{\mathbf {k}\vdash n}\Upsilon _m(\mathbf {k})\Psi _u(\mathbf {k}) d^{n-1}u . \end{aligned}$$

By (4.21), we see that the previous integral is uniformly bounded by a constant which depends only on the test function f and \(n,m \in {\mathbb {N}}\). Hence, by (4.14), we obtain the estimates (2.12) with an error which is O(1) and we can apply Theorem 2.2. This completes the proof of Theorem 1.5.