1 Introduction

The Sherrington–Kirkpatrick (SK) and spherical Sherrington–Kirkpatrick (SSK) models devised in the 1970s are two classical examples of mean-field spin models in which the magnetic behavior of N particles, encoded in a spin vector \(\varvec{\sigma }\), is governed by their identically distributed random pairwise interactions. The SK model has Ising spins \(\varvec{\sigma }\in \{-1,1\}^N\), and SSK is the continuous analog with \(\varvec{\sigma }\in \{\textbf{u}\in \mathbb {R}^N:\Vert \textbf{u}\Vert ^2=N\}\). For a detailed exposition on these models, we refer readers to the book by Panchenko [44]. One limitation of these models is their mean-field structure, meaning that all pairs of particles interact according to the same rule. With the aim of reflecting inhomogeneities and community structures (e.g., in theoretical biology, social and neural networks), scholars have developed various extensions beyond mean-field models.

One extension is the multi-species model, in which the set of N spins is partitioned into a fixed number of disjoint subsets or “species” [15]. The random interactions between spins are not identically distributed as in the SK and SSK models, but rather have variances depending on the species structure. For a k-species model, the covariance structure can be encoded in a \(k\times k\) matrix \(\Delta ^2\), where \(\Delta ^2_{s,t}\) denotes the variance of the random interaction between a spin in species s and a spin in species t. In bipartite models, \(k=2\), with \(\Delta ^2_{s,s}=0\) for each species s and \(\Delta ^2_{s,t}>0\) for \(s\ne t\), meaning that interactions occur only between spins of different species. Bipartite models have important applications in biology and neural networks [1, 17, 19]. Another multi-species model (with applications in artificial intelligence) is the deep Boltzmann machine, where the species or “layers” are ordered, and interactions are only between spins in adjacent layers [4,5,6, 32, 49].

Another direction of generalizing the SK and SSK models is to allow interactions, not only between pairs, but among groups of spins. A p-spin model has interactions among groups of p spins. Likewise, a (p, q)-spin bipartite model has interactions between a group of p spins from one species and a group of q spins from the other species. The case of spherical spins for this model was studied by Auffinger and Chen [9], who obtained a minimization formula for the limiting free energy at sufficiently high temperature.

The current paper focuses on the bipartite (1, 1)-spin SSK model. The setup for this model is as follows. Given two positive integers n, m, we define spin variables

$$\begin{aligned} \varvec{\sigma }=(\sigma _1,\sigma _2,...,\sigma _n)\in S_{n-1},\quad \varvec{\tau }=(\tau _1,\tau _2,...,\tau _m)\in S_{m-1}, \end{aligned}$$

where

$$\begin{aligned} S_{n-1}=\{\textbf{u}\in \mathbb {R}^n: \Vert \textbf{u}\Vert ^2=n\}. \end{aligned}$$

The Hamiltonian for the model is given by

$$\begin{aligned} H(\varvec{\sigma },\varvec{\tau })=\frac{1}{\sqrt{n+m}}\sum _{i=1}^n\sum _{j=1}^m J_{ij}\sigma _i\tau _j \end{aligned}$$

where \(J_{ij}\) are independent, standard Gaussian random variables. The Gibbs measure and the free energy for this model at inverse temperature \(\beta >0\) are

$$\begin{aligned} p(\varvec{\sigma },\varvec{\tau })=\frac{1}{Z_{n,m}}e^{\beta H(\varvec{\sigma },\varvec{\tau })},\quad F_{n,m}(\beta )=\frac{1}{n+m}\log Z_{n,m}, \end{aligned}$$
(1.1)

respectively, where \(Z_{n,m}\) is a normalization factor (i.e., the partition function),

$$\begin{aligned} Z_{n,m}=\int _{S_{m-1}}\int _{S_{n-1}}e^{\beta H(\varvec{\sigma },\varvec{\tau })}\textrm{d}\omega _{n}(\varvec{\sigma })\textrm{d}\omega _{m}(\varvec{\tau }), \end{aligned}$$
(1.2)

and \(d\omega _{n}\) is the uniform probability measure on \(S_{n-1}\).
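As a sanity check on the definitions (1.1)–(1.2), the partition function can be estimated by plain Monte Carlo for small n, m. The Python sketch below is purely illustrative (the sizes, inverse temperature, and sample count are arbitrary choices, not taken from the analysis):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_sphere(n, size, rng):
    # Uniform samples on S_{n-1} = {u in R^n : ||u||^2 = n}
    g = rng.standard_normal((size, n))
    return np.sqrt(n) * g / np.linalg.norm(g, axis=1, keepdims=True)

def free_energy_mc(n, m, beta, n_samples=20000, rng=rng):
    J = rng.standard_normal((n, m))           # disorder J_ij ~ N(0,1)
    sig = sample_sphere(n, n_samples, rng)    # sigma in S_{n-1}
    tau = sample_sphere(m, n_samples, rng)    # tau in S_{m-1}
    # H = (n+m)^{-1/2} sigma^T J tau, evaluated for each sampled pair
    H = np.einsum('ki,ij,kj->k', sig, J, tau) / np.sqrt(n + m)
    Z = np.mean(np.exp(beta * H))             # MC estimate of (1.2)
    return np.log(Z) / (n + m)                # free energy (1.1)

F = free_energy_mc(8, 10, 0.5)
```

For such small sizes the estimate is of course far from the \(n,m\rightarrow \infty \) limit; the point is only to make the surface-integral definition of \(Z_{n,m}\) concrete.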

1.1 Background and Related Literature

The free energy of SK and SSK has been well studied, although more is known in the spherical setting. The limiting free energy was first conjectured by Parisi for SK [47] and by Crisanti and Sommers for SSK [26]. Both conjectures were rigorously proved by Talagrand [54, 55]. The fluctuations of the SK model are only known at high temperature [2, 14, 25, 31], but more is known for the spherical model, where additional analytic techniques are available. In 2016, Baik and Lee analyzed the fluctuations of the SSK free energy at non-critical temperature and found that the fluctuations at high temperature are asymptotically Gaussian while those at low temperature are asymptotically Tracy–Widom [12]. The fluctuations at the critical temperature were left open.

The fluctuations at the critical temperature of the SSK free energy were studied by Landon [39] and by Johnstone, Klochkov, Onatski, and Pavlyshyn [36], independently. Both papers showed that the critical scaling for the inverse temperature is \(\beta =\beta _c+bn^{-1/3}\sqrt{\log n}\). Landon proved that, for fixed \(b\le 0\) and for \(b\rightarrow 0\), the fluctuations are Gaussian while, for \(b\rightarrow +\infty \) at any rate, the fluctuations are Tracy–Widom. For fixed \(b>0\), Landon showed tightness but did not obtain the limiting distribution. On the other hand, Johnstone et al. were able to compute fluctuations for all fixed b. Their result for \(b\le 0\) agrees with that of Landon and, for \(b>0\), they showed that the fluctuations are a sum of independent Gaussian and Tracy–Widom random variables.

Departure from the mean-field structure generally leads to a more challenging analysis. While the problem of the limiting free energy is solved for general one-species mixed p-spin SK and SSK models [22, 45, 47, 54, 55], limiting results remain incomplete for the multi-species and (p, q)-spin bipartite models. For the multi-species SK model, the limiting free energy has only been verified under the assumption of positive definite \(\Delta ^2\) (Barra et al. [15] proposed a Parisi-type formula and proved an upper bound, and Panchenko [46] proved a matching lower bound). For general \(\Delta ^2\), we only have a lower bound [46]. The bipartite model, one of the most natural multi-species examples, belongs to the indefinite \(\Delta ^2\) case and is still open in the case of Ising spins (a conjecture on the limiting free energy was made in [16, 18]). As for fluctuations, a central limit theorem (CLT) for the free energy of the two-species SK model with general \(\Delta ^2\) was obtained at high temperature in [41].

For the bipartite SSK model, more is known. Baik and Lee [13] obtained both the limit and the asymptotic fluctuations of the free energy, at all non-critical temperatures. More specifically, assuming \(n,m\rightarrow \infty \) with \(n/m=\lambda +O(n^{-1-\delta })\) for some \(\lambda ,\delta >0\), they provided explicit formulas for the first two terms in the asymptotic expansion of the free energy for \(\beta \ne \beta _c\), where the critical inverse temperature \(\beta _c\) is equal to \(\sqrt{1+\lambda }/\lambda ^{1/4}\). The formulas imply that the fluctuations are Gaussian with order \(n^{-1}\) for \(\beta <\beta _c\) (high temperature) and GOE Tracy–Widom with order \(n^{-2/3}\) for \(\beta >\beta _c\) (low temperature).

See [9, 20, 27, 52, 53] for high-temperature results for more general Ising or spherical spin models.

1.2 Main Theorem

The goal of this paper is to compute the fluctuations of the free energy in a transitional window around the critical temperature for the bipartite (1, 1)-spin SSK model. In particular, this includes detailed knowledge of the free energy at the critical temperature, providing another critical-temperature result among spin glass models, alongside the independent results of Landon [39] and of Johnstone et al. [36].

We state our main result in the following theorem.

Theorem 1.1

Let \(F_{n,m}(\beta )\) denote the free energy of a bipartite SSK spin glass, given by (1.1), where the species sizes n, m satisfy \(n/m=\lambda +O(n^{-1})\), for some constant \(\lambda \in (0,1]\), as \(n,m\rightarrow \infty \). When the inverse temperature is at the critical scaling, namely \(\beta =\beta _c+bn^{-1/3}\sqrt{\log n}\) for fixed b and \(\beta _c:=\sqrt{1+\lambda }/\lambda ^{1/4}\), the free energy satisfies the following convergence in distribution.

$$\begin{aligned} \frac{n+m}{\sqrt{\frac{1}{6} \log n}}\left( F_{n,m}(\beta )-F(\beta )+\frac{1}{12}\frac{\log n}{n+m}\right) \rightarrow \mathcal {N}(0,1)+\frac{\sqrt{6}(1+\lambda )^{\frac{1}{2}}b_+}{\lambda ^{\frac{3}{4}}(1+\sqrt{\lambda })^{\frac{2}{3}}}{{\,\textrm{TW}\,}}_1\nonumber \\ \end{aligned}$$
(1.3)

where \({{\,\textrm{TW}\,}}_1\) denotes the real Tracy–Widom distribution that is independent from the standard normal \(\mathcal {N}(0,1)\) and \(b_+\) denotes the positive part of b. The limiting free energy is given by

$$\begin{aligned} F(\beta )={\left\{ \begin{array}{ll} \frac{\beta ^2}{2\beta _c^4}&{} \quad \text {for }\beta <\beta _c\\ f_\lambda +\frac{\lambda }{1+\lambda }A\left( (1+\sqrt{\lambda })^{2},\frac{\beta }{\sqrt{\lambda (1+\lambda )}}\right) -\frac{1}{2}\log \beta -\frac{\lambda }{2(1+\lambda )}C_\lambda &{} \quad \text {for }\beta \ge \beta _c \end{array}\right. }\nonumber \\ \end{aligned}$$
(1.4)

where

$$\begin{aligned} \begin{aligned} f_\lambda&=-\frac{1}{2}+\frac{\lambda -1}{2(\lambda +1)}\log 2+\frac{\lambda -1}{4(\lambda +1)}\log \lambda +\frac{1}{4}\log (1+\lambda ),\\ A(x,B)&=\sqrt{\alpha ^2+xB^2}-\alpha \log \left( \frac{\alpha +\sqrt{\alpha ^2+xB^2}}{2B}\right) ,\\ C_\lambda&=(1-\lambda ^{-1})\log (1+\lambda ^{1/2})+\log (\lambda ^{1/2})+\lambda ^{-1/2}. \end{aligned}\end{aligned}$$
(1.5)

1.3 Overview of the Proof Methods

One valuable tool in the analysis of the free energy for SSK and bipartite SSK models is a contour integral representation for the partition function (\(Z_{n,m}\) in our model). A priori, the partition function of SSK is a surface integral on a high-dimensional sphere (or two spheres in the bipartite case). However, this can be rewritten in terms of contour integrals in the complex plane, which are significantly easier to analyze. The contour integral representation for the SSK partition function was first observed by Kosterlitz, Thouless, and Jones [37]. The analogous representation for the spherical bipartite model, which we use in the current paper, was derived by Baik and Lee [13].

Armed with this contour integral representation, our analysis can be broken into two broad stages: (1) use steepest descent analysis to obtain an asymptotic expansion for the free energy and (2) analyze the limiting fluctuations using tools from random matrix theory. This general procedure has been followed in several recent papers on spherical spin glasses, including [36, 39] in their analysis of SSK at critical temperature. While much of our analysis is inspired by the methods in these two papers, the bipartite setting introduces certain technical challenges beyond those that arise for unipartite SSK.

One challenge in the bipartite setting is that the representation for \(Z_{n,m}\) is a double contour integral, rather than the single integral that arises for SSK. This makes the process of contour deformation and steepest descent analysis more delicate, particularly on the low-temperature side of the critical threshold, where the contour passes very close to the (random) singularities of the integrand. Another challenge in the bipartite setting is that the underlying random matrix is a Laguerre Orthogonal Ensemble (LOE) rather than the Gaussian Orthogonal Ensemble (GOE) that appears for SSK (more background on random matrices is in Sect. 2). While these ensembles have many similarities, certain analyses are more complicated for LOE.

From the steepest descent analysis, we obtain an asymptotic expansion for the free energy near the critical temperature, which depends on a sum of the form \(\sum _{i=1}^n\log (\gamma -\mu _i)\). This is a logarithmic linear statistic of the eigenvalues \(\{\mu _i\}_{i=1}^n\) of LOE. The CLT for this quantity is well known in random matrix theory in the case where \(\gamma -d_+>c\) for some constant \(c>0\), with \(d_+\) denoting the upper edge of the matrix spectrum (see, e.g., [10, 11, 42]). However, this standard CLT for linear eigenvalue statistics does not address the case where \(\gamma \) approaches \(d_+\) as \(n\rightarrow \infty \), which is precisely the scenario that arises when analyzing the free energy at critical temperature. Thus, we need an “edge CLT” to treat the case where \(\gamma \rightarrow d_+\). A similar challenge arises for the SSK model at critical temperature, where the log linear statistic depends on eigenvalues of GOE. The edge CLT for this statistic in the GOE case can be found in [35, 38], and these works provide a necessary ingredient for the analysis of SSK free energy at critical temperature.

When we began the current project, an analogous edge CLT for LOE did not exist in the literature. To fill this gap, we proved the following theorem in a separate paper [24].

Theorem 1.2

(Collins-Woodfin, Le [24]) Let \(M_{n,m}\) be an LOE matrix with \(n,m,\lambda ,C_\lambda ,d_+\) as above. Let \(\gamma =d_++\sigma _n n^{-2/3}\) with \(-\tau < \sigma _n\ll (\log n)^2\) for some \(\tau >0\). Then,

$$\begin{aligned} \frac{\sum _{i=1}^n\log |\gamma -\mu _i|-C_\lambda n - \frac{1}{\lambda ^{1/2}(1+\lambda ^{1/2})}\sigma _n n^{1/3} +\frac{2}{3\lambda ^{3/4}(1+\lambda ^{1/2})^2}\sigma _n^{3/2} +\frac{1}{6}\log n}{\sqrt{\frac{2}{3} \log n}}\rightarrow \mathcal {N}(0,1). \end{aligned}$$
(1.6)

The above result is essential in proving Theorem 1.1 as it is the source of the Gaussian term in the limiting distribution.

The last step of our proof is to show the asymptotic independence of the Gaussian and Tracy–Widom terms in the limiting distribution. This involves a recurrence on the entries of the tridiagonal representation of LOE. In the course of this analysis, we prove a result that may be of independent interest, namely that the largest eigenvalue of an \(n\times n\) LOE matrix depends (asymptotically) on a minor of size \(n^{1/3}\log ^3n\). This result is well known numerically (e.g., [29]), but we have not found an explicit proof of it in the literature.

1.4 Organization

In Sect. 2, we provide a more detailed setup of the problem along with various probability, spin glass, and random matrix theory results that will be used throughout the paper. Sections 3 and 4 contain our analysis of the free energy for \(\beta =\beta _c+bn^{-1/3}\sqrt{\log n}\) in the cases of \(b<0\) (high critical temperature) and \(b>0\) (low critical temperature), respectively. The case of \(b=0\) is also addressed in Sect. 4. Finally, in Sect. 5, we prove the asymptotic independence of the Gaussian and Tracy–Widom terms in the main theorem. Appendices A and B provide proofs of some technical lemmas from Sects. 2 and 5, respectively.

2 Setup and Preliminaries

2.1 Preliminaries for Bipartite SSK Model

2.1.1 Double contour integral representation of free energy

One of the key tools that enable us to precisely calculate the free energy and its fluctuations is a contour integral representation of the partition function. A priori, \(Z_{n,m}\) is given by the surface integral in (1.2). The contour integral representation of \(Z_{n,m}\) was derived by Baik and Lee [13]. For the bipartite model, we assume, without loss of generality, that \(n\le m\). We use \(S^{n-1}\) to denote the unit sphere in \(\mathbb {R}^n\) (as opposed to \(S_{n-1}\), which denotes the sphere of radius \(\sqrt{n}\)). Then the partition function can be written as [13]

$$\begin{aligned} Z_{n,m}(\beta )=\frac{2^n}{|S^{m-1}||S^{n-1}|}\left( \frac{\pi ^2(n+m)}{m^2n\beta ^2}\right) ^{\frac{n+m-4}{4}}Q(n,\alpha _n,B_n) \end{aligned}$$
(2.1)

where

$$\begin{aligned} Q_n:=Q(n,\alpha _n,B_n)=-\int _{\gamma _1-\textrm{i}\infty }^{\gamma _1+\textrm{i}\infty }\int _{\gamma _2-\textrm{i}\infty }^{\gamma _2+\textrm{i}\infty }e^{nG(z_1,z_2)}\textrm{d}z_2\textrm{d}z_1 \end{aligned}$$
(2.2)

and \(G(z_1,z_2)\) is a random function depending on the eigenvalues \(\mu _1\ge \mu _2\ge \cdots \ge \mu _n\) of \(\frac{1}{m} JJ^T\). The parameters \(\gamma _1,\gamma _2\) can be any positive real numbers satisfying \(4\gamma _1\gamma _2>\mu _1\). The function G is defined as

$$\begin{aligned} G(z_1,z_2):=B_n(z_1+z_2)-\frac{1}{2n}\sum _{i=1}^n\log (4z_1z_2-\mu _i)-\alpha _n\log z_1 \end{aligned}$$
(2.3)

where

$$\begin{aligned} \alpha _n:=\frac{m-n}{2n}, \quad B_n:=\frac{m}{\sqrt{n(n+m)}}\beta . \end{aligned}$$
(2.4)

Using this contour integral representation of \(Z_{n,m}(\beta )\), the free energy of the bipartite SSK is

$$\begin{aligned} F_{n,m}(\beta )&= \frac{1}{n+m}\log Q(n,\alpha _n, B_n) \\&\quad + \frac{1}{n+m}\log \left( \frac{2^n}{|S^{n-1}||S^{m-1}|}\left( \frac{\pi ^2(n+m)}{m^2n\beta ^2}\right) ^{\frac{n+m}{4}-1}\right) . \end{aligned}$$
(2.5)

By direct computation, the second term of the right-hand side is \(f_\lambda -\frac{1}{2}\log \beta +\frac{\lambda }{1+\lambda }\frac{\log n}{n}+O(n^{-1})\) as \(n\rightarrow \infty \), where \(f_\lambda \) is as defined in (1.5). We obtain

$$\begin{aligned} F_{n,m}(\beta ) = \frac{1}{n+m}\log Q(n,\alpha _n, B_n) + f_\lambda -\frac{1}{2}\log \beta +\frac{\lambda }{1+\lambda }\frac{\log n}{n}+O(n^{-1}) \nonumber \\ \end{aligned}$$
(2.6)

so the computation of the free energy boils down to computing the integral \(Q_n\). In order to compute this integral via steepest descent analysis, one needs to find a critical point of \(G(z_1,z_2)\). Baik and Lee show that there exists a critical point \((z_1,z_2)\) such that both coordinates are positive real and \(4z_1z_2>\mu _1\). We can choose the contours of the double integral to pass through this critical point, which has coordinates

$$\begin{aligned} (\gamma _1,\gamma _2)=\left( \frac{\alpha _n+\sqrt{\alpha _n^2+\gamma B_n^2}}{2B_n}, \frac{-\alpha _n+\sqrt{\alpha _n^2+\gamma B_n^2}}{2B_n} \right) \end{aligned}$$
(2.7)

where \(\gamma \) is the unique real number greater than \(\mu _1\) satisfying

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^n\frac{1}{\gamma -\mu _i}=\frac{B_n^2}{\alpha _n+\sqrt{\alpha _n^2+\gamma B_n^2}}. \end{aligned}$$
(2.8)

We see that \(\gamma \) is implicitly a function of the eigenvalues of \(\frac{1}{m} JJ^T\), which is a normalized Laguerre Orthogonal Ensemble (i.e., real Wishart matrix). Later in this section we recount some important properties of this matrix ensemble that will be used throughout the paper.
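The critical point \(\gamma \) in (2.8) is easy to locate numerically: on \((\mu _1,\infty )\) the left-hand side decreases from \(+\infty \) to zero faster than the right-hand side decays, so a simple bisection applies. The sketch below is illustrative only (the sizes and \(\beta \) are arbitrary choices):

```python
import numpy as np

def solve_gamma(mu, alpha_n, B_n, tol=1e-12):
    # Solve (2.8): mean(1/(g - mu_i)) = B_n^2/(alpha_n + sqrt(alpha_n^2 + g*B_n^2))
    # for the unique root g > mu_1, by bisection
    mu1 = mu.max()
    f = lambda g: np.mean(1.0/(g - mu)) - B_n**2/(alpha_n + np.sqrt(alpha_n**2 + g*B_n**2))
    lo, hi = mu1 + 1e-9, mu1 + 1.0
    while f(hi) > 0:                  # expand until a sign change brackets the root
        hi = mu1 + 2*(hi - mu1)
    while hi - lo > tol:
        mid = 0.5*(lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5*(lo + hi)

rng = np.random.default_rng(1)
n, m, beta = 50, 100, 0.8                     # a high-temperature example
J = rng.standard_normal((n, m))
mu = np.linalg.eigvalsh(J @ J.T / m)          # eigenvalues of (1/m) J J^T
alpha_n = (m - n) / (2*n)                     # (2.4)
B_n = m * beta / np.sqrt(n*(n + m))           # (2.4)
gamma = solve_gamma(mu, alpha_n, B_n)
```

At this \(\beta <\beta _c\), the root lands well to the right of \(\mu _1\), consistent with the deterministic approximation discussed in the next subsection.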

2.1.2 Critical inverse temperature \(\beta _c\) and critical window

As stated above, the critical inverse temperature of the bipartite SSK model is \(\beta _c=\sqrt{1+\lambda }/\lambda ^{1/4}\). At this value of \(\beta \), one sees a transition in the behavior of the critical point \(\gamma \). We give a brief, heuristic description of the transition here and provide more details in the next two sections.

Equation (2.8), which is random and n-dependent, can be approximated by its deterministic, n-independent analog

$$\begin{aligned} \int _\mathbb {R}\frac{1}{z-x}p_{{{\,\textrm{MP}\,}}}(x)\textrm{d}x=\frac{B^2}{\alpha +\sqrt{\alpha ^2+z B^2}} \end{aligned}$$
(2.9)

where \(p_{{{\,\textrm{MP}\,}}}\) denotes the Marčenko–Pastur measure (see definition in Eq. (2.12)) and \(\alpha ,B\) are given by

$$\begin{aligned} \alpha := \frac{1-\lambda }{2\lambda },\quad B:=\frac{\beta }{\sqrt{\lambda (1+\lambda )}}. \end{aligned}$$
(2.10)

If Eq. (2.9) is to be of any use, then it should provide a solution \(z\in (d_+,\infty )\) that is close to the solution \(\gamma \) of (2.8) (with high probability and for all sufficiently large n). Labeling the left and right sides of (2.9) as \(L_\infty (z)\) and \(R_\infty (z)\), respectively, Baik and Lee [13] observe that \(\frac{L_\infty (z)}{R_\infty (z)}\) is a decreasing function of \(z\in (d_+,\infty )\) with

$$\begin{aligned} \lim _{z\rightarrow \infty }\frac{L_\infty (z)}{R_\infty (z)}=0,\quad \lim _{z\downarrow d_+}\frac{L_\infty (z)}{R_\infty (z)}=\frac{L_\infty (d_+)}{R_\infty (d_+)}. \end{aligned}$$
(2.11)

Hence, (2.9) has a solution \(z\in (d_+,\infty )\) if and only if \(L_\infty (d_+)>R_\infty (d_+)\). We call this solution \(\tilde{\gamma }\). By setting \(L_\infty (d_+)=R_\infty (d_+)\) and solving for \(\beta \), one obtains the critical inverse temperature. The implication is that, for \(\beta <\beta _c\) (high temperature), \(\gamma \) can be approximated by \(\tilde{\gamma }\), and this deterministic approximation turns out to be very accurate. However, for \(\beta >\beta _c\) (low temperature), (2.9) cannot be used to approximate \(\gamma \), since it has no solution in \((d_+,\infty )\). Intuitively, this is because, at low temperature, \(\gamma \) is very close to the eigenvalue \(\mu _1\) and may be above or below \(d_+\), depending on the value of \(\mu _1\). A detailed analysis of \(\gamma \) in these two cases is provided in Sects. 3 and 4.
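The criticality condition \(L_\infty (d_+)=R_\infty (d_+)\) can also be checked numerically against the formula \(\beta _c=\sqrt{1+\lambda }/\lambda ^{1/4}\). In the sketch below (an arbitrary choice of \(\lambda \); illustrative only), \(L_\infty (d_+)\) is computed by quadrature of the Marčenko–Pastur density from (2.12):

```python
import numpy as np
from scipy.integrate import quad

lam = 0.6
dp, dm = (1 + np.sqrt(lam))**2, (1 - np.sqrt(lam))**2
p_mp = lambda x: np.sqrt((dp - x)*(x - dm)) / (2*np.pi*lam*x)   # MP density (2.12)

# L_inf(d_+): the 1/sqrt(d_+ - x) edge singularity is integrable
L, _ = quad(lambda x: p_mp(x)/(dp - x), dm, dp, limit=200)

beta_c = np.sqrt(1 + lam) / lam**0.25
alpha = (1 - lam) / (2*lam)                   # (2.10)
B = beta_c / np.sqrt(lam*(1 + lam))           # (2.10) at beta = beta_c
R = B**2 / (alpha + np.sqrt(alpha**2 + dp*B**2))
```

Up to quadrature error, L and R agree at \(\beta =\beta _c\); taking \(\beta <\beta _c\) decreases B and hence R, producing the strict inequality \(L_\infty (d_+)>R_\infty (d_+)\) and a solution \(\tilde{\gamma }\in (d_+,\infty )\).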

Finally, we comment on the scaling of the critical temperature window, \(\beta =\beta _c+O(n^{-1/3}\sqrt{\log n})\). One can conjecture this critical scaling from the theorem of Baik and Lee by matching the order of the variance of the free energy at high and low temperature. For fixed \(\beta <\beta _c\), the free energy has variance of order \(n^{-2}\log (\beta _c-\beta )\) while, for fixed \(\beta >\beta _c\), the free energy has variance of order \(n^{-4/3}(\beta -\beta _c)^2\). By formally equating these, we find that their order matches when \(\beta -\beta _c=\Theta (n^{-1/3}\sqrt{\log n})\) and we conjecture that the variance of the free energy in this critical scaling should be of order \(n^{-2}\log n\). This conjecture turns out to be correct, as we will see in the subsequent sections.
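The matching just described can be written out in one line. Writing \(\Delta =|\beta -\beta _c|\) and equating the two variance orders (up to constants, with \(|\cdot |\) on the logarithm),

$$\begin{aligned} n^{-2}|\log \Delta |\asymp n^{-4/3}\Delta ^2 \quad \Longleftrightarrow \quad \Delta ^2\asymp n^{-2/3}|\log \Delta |, \end{aligned}$$

and substituting \(\Delta =n^{-1/3}\sqrt{c\log n}\) makes both sides of order \(n^{-2/3}\log n\), since \(|\log \Delta |=\frac{1}{3}\log n\,(1+o(1))\). The common variance order is then \(n^{-2}|\log \Delta |=\Theta (n^{-2}\log n)\), as conjectured.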

2.2 Probability and Random Matrix Preliminaries

2.2.1 Notational conventions (probability and asymptotics)

Below are several asymptotic notations that we use along with the definitions that we follow. For any sequence \(\{a_n\}\) and positive sequence \(\{b_n\}\), we write

  • \(a_n=O(b_n)\) if there exists some constant C such that \(|a_n|\le Cb_n\) for all n,

  • \(a_n=\Omega (b_n)\) if there exists some constant C such that \(|a_n|\ge Cb_n\) for all n,

  • \(a_n=\Theta (b_n)\) if there exist constants \(C_1,C_2\) such that \(C_1b_n\le |a_n|\le C_2b_n\) for all n (or, equivalently, \(a_n=O(b_n)\) and \(a_n=\Omega (b_n)\)),

  • \(a_n\ll b_n\) if \(\lim _{n\rightarrow \infty } a_n/b_n=0\),

  • \(a_n\gg b_n\) if \(\lim _{n\rightarrow \infty } b_n/a_n=0\).

In addition, we sometimes need to make asymptotic statements about the probability of events in a sequence \(\{E_n\}\). We say that \(E_n\) occurs “asymptotically almost surely” if \(\mathbb {P}(E_n)\rightarrow 1\) as \(n\rightarrow \infty \). We say \(E_n\) occurs “with overwhelming probability” if, for all \(D>0\), there exists \(n_0\) such that \(\mathbb {P}(E_n)>1-n^{-D}\) for all \(n>n_0\).

2.2.2 Laguerre Orthogonal Ensemble and Marčenko–Pastur measure

As we saw in the previous subsection, the eigenvalues of the matrix \(\frac{1}{m} JJ^T\) will play an important role in our analysis. This is a normalized Laguerre Orthogonal Ensemble, and we provide an overview of some of its key properties here. Marčenko and Pastur [43] showed that the empirical spectral measure of LOE has the following convergence, as \(n,m\rightarrow \infty \) with \(n/m\rightarrow \lambda \le 1\),

$$\begin{aligned} \frac{1}{n} \sum _{i=1}^n\delta _{\mu _i}(x)\rightarrow p_{{{\,\textrm{MP}\,}}}(x)\textrm{d}x:= \frac{\sqrt{(d_+-x)(x-d_-)}}{2\pi \lambda x}\mathbbm {1}_{[d_-,d_+]}(x)\textrm{d}x. \end{aligned}$$
(2.12)

The convergence holds weakly in distribution, \(d_\pm =(1\pm \lambda ^{1/2})^2\), and \(p_{{{\,\textrm{MP}\,}}}(x)\) is referred to as the Marčenko–Pastur measure. In working with \(p_{{{\,\textrm{MP}\,}}}\), we sometimes need to use its Stieltjes transform

$$\begin{aligned} s_{{{\,\textrm{MP}\,}}}(z):=\int _\mathbb {R}\frac{1}{z-x}p_{{{\,\textrm{MP}\,}}}(x)\textrm{d}x. \end{aligned}$$
(2.13)

We note that it is common to define the Stieltjes transform as the negative of what we use here. However, our definition is consistent with that of [13] and is more logical in this context, since it results in a positive value of \(s_{{{\,\textrm{MP}\,}}}\) for our setting.
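The convergence (2.12) is straightforward to observe in simulation. The following sketch (illustrative sizes and seed) compares the empirical spectral distribution of \(\frac{1}{m}JJ^T\) with the Marčenko–Pastur law at a bulk point:

```python
import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(2)
n, m = 300, 600
lam = n / m
J = rng.standard_normal((n, m))
mu = np.linalg.eigvalsh(J @ J.T / m)          # eigenvalues of (1/m) J J^T

dp, dm = (1 + np.sqrt(lam))**2, (1 - np.sqrt(lam))**2
p_mp = lambda x: np.sqrt((dp - x)*(x - dm)) / (2*np.pi*lam*x)

# Compare empirical and limiting CDFs at the bulk midpoint
x0 = 0.5*(dm + dp)
emp = np.mean(mu <= x0)
mp, _ = quad(p_mp, dm, x0, limit=200)
```

Already at these moderate sizes, the empirical counting function agrees with the limiting law to within a few percent, a weak foreshadowing of the much stronger rigidity estimates discussed below.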

2.2.3 Tracy–Widom distribution

The location of the largest eigenvalue is particularly important in our analysis. The following result is well known in random matrix theory. See, for example, [34, 50] and Corollary 1.2 of [48].

Lemma 2.1

Let \(\mu _1\) be the largest eigenvalue of \(\frac{1}{m} M_{n,m}\), where \(M_{n,m}\) is an \(n\times n\) matrix from the Laguerre Orthogonal Ensemble. Then the following convergence in distribution holds.

$$\begin{aligned} \frac{m\mu _1-(\sqrt{n}+\sqrt{m})^2}{(\sqrt{n}+\sqrt{m})\left( (1/\sqrt{n})+(1/\sqrt{m})\right) ^{1/3}}\rightarrow {{\,\textrm{TW}\,}}_1. \end{aligned}$$

Under the condition \(n/m\rightarrow \lambda \in (0,1]\), the following form of Lemma 2.1 is useful in our paper.

$$\begin{aligned} \frac{n^{\frac{2}{3}}(\mu _1-d_+)}{\lambda ^{\frac{1}{2}}(1+\lambda ^{\frac{1}{2}})^{\frac{4}{3}}}\rightarrow {{\,\textrm{TW}\,}}_1. \end{aligned}$$
(2.14)
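For the reader's convenience, we note how (2.14) follows from Lemma 2.1. Using \(n/m\rightarrow \lambda \) and \(d_+=(1+\lambda ^{1/2})^2\),

$$\begin{aligned} m\mu _1-(\sqrt{n}+\sqrt{m})^2&=m\left( \mu _1-\bigl (1+\sqrt{n/m}\bigr )^2\right) \approx \frac{n}{\lambda }\,(\mu _1-d_+),\\ (\sqrt{n}+\sqrt{m})\left( \tfrac{1}{\sqrt{n}}+\tfrac{1}{\sqrt{m}}\right) ^{1/3}&=m^{1/3}\bigl (1+\sqrt{n/m}\bigr )\bigl (1+\sqrt{m/n}\bigr )^{1/3}\approx n^{1/3}\lambda ^{-1/2}(1+\lambda ^{1/2})^{4/3}, \end{aligned}$$

so the ratio in Lemma 2.1 becomes \(n^{2/3}(\mu _1-d_+)/\bigl (\lambda ^{1/2}(1+\lambda ^{1/2})^{4/3}\bigr )\), up to \(1+o(1)\) factors.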

2.2.4 Classical eigenvalue locations and rigidity

A key tool in our analysis is to approximate the eigenvalues by their “classical locations” (i.e., the quantiles of the Marčenko–Pastur measure). The classical locations \(\{g_i\}\) are defined by the relation

$$\begin{aligned} \frac{i}{n}=\int _{g_i}^{d_+}p_{{{\,\textrm{MP}\,}}}(x)\textrm{d}x. \end{aligned}$$
(2.15)

Using this definition, one can show that

$$\begin{aligned} g_i=d_+-\left( \frac{3\pi \lambda ^{3/4}d_+i}{2n}\right) ^{2/3}+O\left( \frac{i^{4/3}}{n^{4/3}}\right) ,\quad i\le n/2. \end{aligned}$$
(2.16)

Thus, we expect that, for \(i\ll n\), we will have \(\mu _i\approx d_+-\left( \frac{3\pi \lambda ^{3/4}d_+i}{2n}\right) ^{2/3}\). The concept of “eigenvalue rigidity” means that eigenvalues are close to their classical locations with high probability. More precisely, we define eigenvalue rigidity to be the event

$$\begin{aligned} \bigcap _{1\le i\le n}\left\{ |\mu _i-g_i|\le \frac{n^\delta }{n^{2/3}\min \{i^{1/3},(n+1-i)^{1/3}\}} \right\} , \end{aligned}$$

which holds with overwhelming probability. This is proved in [48, Theorem 3.3] in the case \(\lambda \in (0,1)\). For \(\lambda =1\), the result follows from Corollary 1.3 of [3] and the relation \(p_{{{\,\textrm{MP}\,}}}(x)=p_{\textrm{SC}}(\sqrt{x})\) between the Marčenko–Pastur and semicircle distributions.
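The classical locations defined by (2.15) can be computed by numerically inverting the edge mass of \(p_{{{\,\textrm{MP}\,}}}\), which also serves as a check on the expansion (2.16). The sketch below uses arbitrary illustrative values of \(\lambda \), n, and i:

```python
import numpy as np
from scipy.integrate import quad

lam = 0.5
dp, dm = (1 + np.sqrt(lam))**2, (1 - np.sqrt(lam))**2
p_mp = lambda x: np.sqrt((dp - x)*(x - dm)) / (2*np.pi*lam*x)

def classical_location(i, n):
    # Solve i/n = int_g^{d_+} p_MP(x) dx for g by bisection, as in (2.15)
    target = i / n
    lo, hi = dm, dp
    for _ in range(80):
        mid = 0.5*(lo + hi)
        mass, _ = quad(p_mp, mid, dp, limit=200)
        if mass > target:
            lo = mid        # too much mass above mid: g_i lies to the right
        else:
            hi = mid
    return 0.5*(lo + hi)

n, i = 10**6, 50
g = classical_location(i, n)
g_asym = dp - (3*np.pi*lam**0.75*dp*i/(2*n))**(2/3)   # leading term of (2.16)
```

For \(i\ll n\) the exact quantile and the leading term of (2.16) agree to the stated \(O(i^{4/3}/n^{4/3})\) accuracy.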

In addition to eigenvalue rigidity, we sometimes need more precise control of the larger eigenvalues. For this purpose, we introduce the following lemma, which is proved in Appendix A. This lemma is inspired by a similar one proved in [40] for GOE matrices and used by Landon in his analysis of SSK at critical temperature [39].

Lemma 2.2

Let \(\{\mu _j\}_{j=1}^n\) be the eigenvalues of \(\frac{1}{m} M_{n,m}\). For each j, define

$$\begin{aligned} A_j=\left( \frac{3\pi \lambda ^{3/4}d_+}{2}j\right) ^{2/3}-n^{2/3}(d_+-\mu _j). \end{aligned}$$
(2.17)

Given \(\varepsilon >0\), there exists K such that for sufficiently large n,

$$\begin{aligned} \mathbb {P}\left( \bigcap _{K\le j \le n^{2/5}}\left\{ \left| A_j\right| \le \lambda j^{2/3}\right\} \right) \ge 1-\varepsilon . \end{aligned}$$
(2.18)

Furthermore, there exist \(C,c>0\) such that

$$\begin{aligned} \mathbb {E}\left[ \mathbbm {1}_{\{n^{2/3}(\mu _j-d_+)\le -C\}}\left| A_j\right| \right] \le \frac{c\log j}{j^{1/3}}, \quad \text {for } K\le j\le n^{2/5}. \end{aligned}$$
(2.19)

2.2.5 Tridiagonal representation of LOE

In Sect. 5, when proving the asymptotic independence of the Gaussian and Tracy–Widom variables, we will need the tridiagonal representation of LOE. Dumitriu and Edelman [28] show that the eigenvalue distribution of the unnormalized LOE matrix \(M_{n,m}\) is the same as that of the \(n\times n\) matrix \(T_n=BB^T\) where B is a bi-diagonal matrix of dimension \(n\times n\). In particular,

$$\begin{aligned} B= \begin{bmatrix} a_1 & & & & \\ b_1 & a_2 & & & \\ & b_2 & a_3 & & \\ & & \ddots & \ddots & \\ & & & b_{n-1} & a_n \end{bmatrix}\quad \text {so}\quad BB^T= \begin{bmatrix} a_1^2 & a_1b_1 & & & \\ a_1b_1 & a_2^2+b_1^2 & a_2b_2 & & \\ & a_2b_2 & a_3^2+b_2^2 & \ddots & \\ & & \ddots & \ddots & a_{n-1}b_{n-1}\\ & & & a_{n-1}b_{n-1} & a_n^2+b_{n-1}^2 \end{bmatrix} \end{aligned}$$
(2.20)

where \(\{a_i\},\{b_i\}\) are all independent random variables with distributions satisfying

$$\begin{aligned} a_i^2\sim \chi ^2 (m-n+i),\qquad b_i^2\sim \chi ^2( i). \end{aligned}$$
(2.21)
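The bidiagonal model (2.20)–(2.21) also gives a cheap way to sample LOE spectra, using \(O(n)\) scalar random variables instead of an \(n\times m\) Gaussian matrix. The sketch below (illustrative sizes and seed) draws both representations:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 200, 400
lam = n / m
dp = (1 + np.sqrt(lam))**2

# Dense LOE: M_{n,m} = J J^T with J an n x m Gaussian matrix
J = rng.standard_normal((n, m))
mu_dense = np.linalg.eigvalsh(J @ J.T)

# Dumitriu-Edelman bidiagonal model (2.20)-(2.21): same eigenvalue law
a = np.sqrt(rng.chisquare(m - n + np.arange(1, n + 1)))   # a_i^2 ~ chi^2(m-n+i)
b = np.sqrt(rng.chisquare(np.arange(1, n)))               # b_i^2 ~ chi^2(i)
B = np.diag(a) + np.diag(b, -1)
mu_tri = np.linalg.eigvalsh(B @ B.T)
```

After normalizing by m, both spectra approximate the Marčenko–Pastur law on \([d_-,d_+]\), and in both cases the largest eigenvalue sits near \(m\,d_+\).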

2.3 Defining the Event on Which Our Results Hold

Our arguments throughout this paper rely upon certain conditions on the eigenvalues, which hold with probability close to 1. To streamline the later proofs, we collect in this section various events involving the eigenvalues \(\{\mu _i\}\) and provide probability bounds for each event. Finally, we define \(\mathcal {E}_\varepsilon \) to be the intersection of these events, which holds with probability at least \(1-\varepsilon \) for an arbitrarily small choice of \(\varepsilon \).

Definition 2.3

Let \(\delta , s,t, r, R\) be positive numbers where \(s<t\), \(r<R\), and let K be a positive integer. We define the events \(\mathcal {F}^{(1)}_\delta ,\mathcal {F}^{(2)}_K,\mathcal {F}^{(3)}_{s,t},\mathcal {F}^{(4)}_{r,R}\) as follows.

$$\begin{aligned} \mathcal {F}^{(1)}_\delta&=\bigcap _{1\le i\le n}\left\{ |\mu _i-g_i|\le \frac{n^\delta }{n^{2/3}\min \{i^{1/3},(n+1-i)^{1/3}\}} \right\} , \end{aligned}$$
(2.22)
$$\begin{aligned} \mathcal {F}^{(2)}_K&=\bigcap _{K\le j\le n^{2/5}}\left\{ \left| n^{2/3}(\mu _j-d_+)+\left( \frac{3\pi \lambda ^{3/4}d_+}{2}j\right) ^{2/3}\right| \le \frac{j^{2/3}}{10} \right\} ,\end{aligned}$$
(2.23)
$$\begin{aligned} \mathcal {F}^{(3)}_{s,t}&=\left\{ n^{2/3}|d_+-\mu _1|\in [s,t] \right\} ,\qquad 0<s<t,\end{aligned}$$
(2.24)
$$\begin{aligned} \mathcal {F}^{(4)}_{r,R}&=\left\{ r<n^{2/3}(\mu _1-\mu _2)<R\right\} . \end{aligned}$$
(2.25)

Remark 2.4

The event \(\mathcal {F}^{(1)}_\delta \) is the eigenvalue rigidity condition with respect to the “classical location,” and \(\mathcal {F}^{(2)}_K\) is inspired by a similar event used in the context of Gaussian ensembles by Landon and Sosoe [40].

Lemma 2.5

(Event probability bounds) The following statements hold.

  • For any fixed \(\delta >0\), the event \(\mathcal {F}^{(1)}_\delta \) holds with overwhelming probability.

  • For any \(\varepsilon >0\), there exist positive constants \(K,s,t,r,R\) depending on \(\varepsilon \) but not on n such that, for sufficiently large n,

    $$\begin{aligned} \mathbb {P}[\mathcal {F}^{(2)}_K]\ge 1-\tfrac{\varepsilon }{4},\qquad \mathbb {P}[\mathcal {F}^{(3)}_{s,t}]\ge 1-\tfrac{\varepsilon }{4},\qquad \mathbb {P}[\mathcal {F}^{(4)}_{r,R}]\ge 1-\tfrac{\varepsilon }{4}. \end{aligned}$$

Proof

The bounds for the first three events are straightforward. The eigenvalue rigidity condition \(\mathcal {F}^{(1)}_\delta \) holds with overwhelming probability (see the explanation in Sect. 2.2). The bound on \(\mathcal {F}^{(2)}_K\) follows directly from Lemma 2.2, where we can take a larger value of K to replace \(\varepsilon \) in the bound by \(\varepsilon /4\). The result for \(\mathcal {F}^{(3)}_{s,t}\) is a consequence of the Tracy–Widom convergence in Lemma 2.1.

Finally, we consider \(\mathcal {F}^{(4)}_{r,R}\). The upper bound \(n^{2/3}(\mu _1-\mu _2)\le R\) holds with probability \(1-\varepsilon /8\) for some \(R>0\) via a union bound (where \(|\mu _1-d_+|\) is controlled using \(\mathcal {F}_{s,t}^{(3)}\) and \(|\mu _2-d_+|\) is bounded similarly using Tracy–Widom convergence of \(\mu _2\)). For the lower bound on \(n^{2/3}(\mu _1-\mu _2)\), note that the joint distribution of \(\mu _1\) and \(\mu _2\) (each rescaled as in (2.14)) converges to the joint Tracy–Widom law (see, for example, [48, 50]). This law describes the joint distribution of the largest two eigenvalues of an operator \(\textbf{H}_1\) whose spectrum is simple with probability one (see, for example, (4.5.9) and Theorem 4.5.42 of [7]), implying that there exists \(r>0\) such that \(\mathbb {P}(n^{2/3}(\mu _1-\mu _2)>r)\ge 1-\varepsilon /8\) for sufficiently large n.

Definition 2.6

Given \(\varepsilon >0\), we define \(\mathcal {E}_\varepsilon \) to be the event

$$\begin{aligned} \mathcal {E}_\varepsilon :=\mathcal {F}^{(1)}_\delta \cap \mathcal {F}^{(2)}_K\cap \mathcal {F}^{(3)}_{s,t}\cap \mathcal {F}^{(4)}_{r,R}\end{aligned}$$

where the parameters \(\delta ,K,s,t,r,R\) are chosen to satisfy the probability bounds in Lemma 2.5. Note that \(K,s,t,r,R\) depend on \(\varepsilon \), but \(\delta \) does not. The choice of these constants is not unique; however, for any given \(\varepsilon >0\), we fix these values and define \(\mathcal {E}_\varepsilon \) accordingly.

The following corollary follows directly from the above definition and Lemma 2.5.

Corollary 2.7

For any \(\varepsilon >0\), \(\mathbb {P}[\mathcal {E}_\varepsilon ]\ge 1-\varepsilon \).

Computing the free energy in both the high- and low-temperature regimes involves analyzing linear statistics of eigenvalues of the form \(\sum _{i=1}^n\frac{1}{(z-\mu _i)^k}\), on the event defined above. The key lemma that we use for handling these sums is the following.

Lemma 2.8

Let \(z\in \mathbb {C}\) with \(\mathop {\textrm{Re}}\limits (z)\ge d_+\). Let \(\{\mu _i\}\) be the eigenvalues of \(\frac{1}{m} M_{m,n}\). Then, for any \(\varepsilon >0\) and any positive integer l,

$$\begin{aligned}{} & {} \mathbb {E}\left[ \mathbbm {1}_{\mathcal {E}_\varepsilon } \left| \frac{1}{n}\sum _{j=K}^n\frac{1}{(z-\mu _j)^l}-\int _{d_-}^{g_K} \frac{1}{(z-y)^l}p_{{{\,\textrm{MP}\,}}}(y)dy\right| \right] \nonumber \\{} & {} \quad = O\left( n^{\frac{2}{3} l-1}\cdot \min \left\{ \left| \frac{\log (n^{2/3}|z-d_+|)}{(n^{2/3}|z-d_+|)^l}\right| ,\;1\right\} \right) . \end{aligned}$$
(2.26)

Here, K is the constant depending on \(\varepsilon \) in \(\mathcal {F}^{(2)}_K\) and \(\mathcal {E}_\varepsilon \).

A proof of this lemma is included in Appendix A. The general approach is inspired by the method that Landon and Sosoe used in [40] to bound similar eigenvalue statistics for Gaussian orthogonal ensembles. We prove a series of supporting lemmas, first for the LUE, which allows us to exploit its determinantal structure; we then extend the final result to the LOE via the relationship between the eigenvalues of the unitary and orthogonal ensembles provided in [30].
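As a numerical illustration of such linear statistics (a sketch under the normalization \(d_\pm =(1\pm \sqrt{\lambda })^2\) with \(\lambda =n/m\); the matrix sizes and evaluation point are illustrative choices, and the closed-form Marčenko–Pastur Stieltjes transform is the standard one, not taken from the paper):

```python
import numpy as np

# Sample covariance matrix (1/m) X^T X with aspect ratio lam = n/m; its spectrum
# approximately follows the Marcenko-Pastur law on [d_-, d_+], d_pm = (1 +- sqrt(lam))^2.
rng = np.random.default_rng(0)
n, m = 500, 1000
lam = n / m
X = rng.standard_normal((m, n))
mu = np.linalg.eigvalsh(X.T @ X / m)

d_plus = (1 + np.sqrt(lam)) ** 2
d_minus = (1 - np.sqrt(lam)) ** 2

def L_inf(x):
    # Closed-form Stieltjes transform of the MP law: int p_MP(y)/(x - y) dy for x > d_+.
    return (x - 1 + lam - np.sqrt((x - d_plus) * (x - d_minus))) / (2 * lam * x)

z = d_plus + 0.5
empirical = np.mean(1.0 / (z - mu))   # (1/n) sum_i 1/(z - mu_i)
limit = L_inf(z)
err = abs(empirical - limit)
```

For z at macroscopic distance from the edge, the empirical sum and the limiting integral agree up to a small error; the lemma quantifies how this degrades as z approaches \(d_+\) at scale \(n^{-2/3}\).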

3 High Temperature

As mentioned in the previous section, the computation of the free energy reduces to the computation of the integral

$$\begin{aligned} Q_n=-\int _{\gamma _1-i\infty }^{\gamma _1+i\infty }\int _{\gamma _2-i\infty }^{\gamma _2+i\infty }e^{nG(z_1,z_2)}\textrm{d}z_2\textrm{d}z_1 \end{aligned}$$
(3.1)

where \(G(z_1,z_2)\) is defined in (2.3). The general idea is that we should be able to compute this integral via steepest descent analysis by deforming the contours such that they pass through the critical point \((\gamma _1,\gamma _2)\), which is a function of \(\gamma \) as defined in (2.7)–(2.8). Baik and Lee [13] show that at fixed high temperature (i.e., constant \(\beta <\beta _c\)), the random variable \(\gamma \) is well approximated by \(\tilde{\gamma }\), the solution to (2.9). Furthermore, \(|\gamma -\tilde{\gamma }|\) is small enough that the integral computations can be carried out with \(\tilde{\gamma }\) and the error remains sufficiently small.

In the high-temperature side of the critical window, we do not have fixed \(\beta <\beta _c\) as in [13], but rather \(\beta =\beta _c+bn^{-1/3}\sqrt{\log n}\) for \(b<0\). The first task of this section is to show that, even in this scaling, \(\tilde{\gamma }\) remains a good approximation of \(\gamma \). Namely, we need to compute the asymptotics of \(\tilde{\gamma }\) and obtain an upper bound on \(|\gamma -\tilde{\gamma }|\).

3.1 Bounds on G, Its Derivatives, and Its Critical Point

We begin with an asymptotic expansion for \(\tilde{\gamma }\).

Lemma 3.1

For fixed \(b<0\), the solution \(\tilde{\gamma }\) to (2.9) satisfies

$$\begin{aligned} \tilde{\gamma }=d_++\frac{4\lambda b^2}{1+\lambda }n^{-2/3}\log n+O(n^{-1}(\log n)^{3/2}). \end{aligned}$$

Proof

From [13] (see (6.17)), we obtain the closed-form expression

$$\begin{aligned} \tilde{\gamma }=(1+\lambda )\beta ^{-2}+1+\lambda +\frac{\lambda }{1+\lambda }\beta ^2. \end{aligned}$$
(3.2)

Observe that the right-hand side, as a function of \(\beta \), is equal to \(d_+\) at \( \beta _c\). Thus, by expanding the function around \(\beta =\beta _c+bn^{-1/3}\sqrt{\log n}\), we obtain

$$\begin{aligned} \tilde{\gamma }-d_+ =\frac{4\lambda }{1+\lambda }(\beta -\beta _c)^2-\frac{4\lambda ^{5/4}}{(1+\lambda )^{3/2}}(\beta -\beta _c)^3+O((\beta -\beta _c)^4), \end{aligned}$$
(3.3)

and the lemma follows. \(\square \)
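Lemma 3.1 can be checked numerically from the closed form (3.2) (a sketch; the values of \(\lambda \), b, and n are illustrative choices, not from the paper):

```python
import math

# Check: tilde_gamma = d_+ + (4*lam*b^2/(1+lam)) * n^{-2/3} log n + O(n^{-1} (log n)^{3/2}),
# with tilde_gamma given by the closed form (3.2).
lam = 0.5
b = -1.0
beta_c = math.sqrt(1 + lam) / lam ** 0.25
d_plus = (1 + math.sqrt(lam)) ** 2

def tilde_gamma(beta):
    # closed form (3.2)
    return (1 + lam) / beta ** 2 + 1 + lam + lam * beta ** 2 / (1 + lam)

n = 10 ** 6
beta = beta_c + b * n ** (-1 / 3) * math.sqrt(math.log(n))
leading = d_plus + (4 * lam * b ** 2 / (1 + lam)) * n ** (-2 / 3) * math.log(n)
residual = abs(tilde_gamma(beta) - leading)
```

The residual is dominated by the cubic term of (3.3) and is indeed of order \(n^{-1}(\log n)^{3/2}\), an order of magnitude smaller than the leading correction \(\tilde{\gamma }-d_+\).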

In order to obtain a sufficiently tight bound for \(|\gamma -\tilde{\gamma }|\), we need bounds on various eigenvalue statistics and, in particular, we need to bound differences of the form

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^n\frac{1}{(z-\mu _i)^k}-\int \frac{p_{{{\,\textrm{MP}\,}}}(y)\textrm{d}y}{(z-y)^k},\quad k\ge 1 \end{aligned}$$
(3.4)

when z is close to \(\mu _1\). Given the precision needed for computations in the critical window, the bound obtained using eigenvalue rigidity is not tight enough. Instead, we make use of the following lemma.

Lemma 3.2

Let \(z\in \mathbb {C}\) with \(\mathop {\textrm{Re}}\limits (z)\ge d_+\) and \(|z-d_+|>cn^{-2/3}\log n\) for some \(c>0\). Let \(\{\mu _i\}\) be the eigenvalues of \(\frac{1}{m} M_{m,n}\). Then, for any \(\varepsilon >0\) and any positive integer l,

$$\begin{aligned} \mathbb {E}\left[ \mathbbm {1}_{\mathcal {E}_\varepsilon }\left| \frac{1}{n}\sum _{j=1}^n\frac{1}{(z-\mu _j)^l}-\int \frac{p_{{{\,\textrm{MP}\,}}}(y)\textrm{d}y}{(z-y)^l}\right| \right] = O\left( n^{\frac{2}{3} l-1}\frac{\log (n^{2/3}|z-d_+|)}{(n^{2/3}|z-d_+|)^l}\right) .\nonumber \\ \end{aligned}$$
(3.5)

Proof of Lemma 3.2

Given \(\varepsilon >0\), let K be the integer in the events \(\mathcal {F}^{(2)}_K\) and \(\mathcal {E}_\varepsilon \). Recall the classical locations \(g_i\), \(i=0,\dots , n\) of the Marčenko–Pastur measure. We start by writing \(\frac{1}{n}\sum _{i=1}^n\frac{1}{(z-\mu _i)^l}-\int \frac{1}{(z-y)^l}p_{{{\,\textrm{MP}\,}}}(y)dy\) as the sum

$$\begin{aligned} S_1+S_2= & {} \left( \frac{1}{n}\sum _{i=1}^K\frac{1}{(z-\mu _i)^l}-\int _{g_K}^{d_+} \frac{p_{{{\,\textrm{MP}\,}}}(y)\textrm{d}y}{(z-y)^l}\right) \nonumber \\{} & {} +\left( \frac{1}{n}\sum _{i=K+1}^n\frac{1}{(z-\mu _j)^l}-\int _{d_-}^{g_K} \frac{p_{{{\,\textrm{MP}\,}}}(y)\textrm{d}y}{(z-y)^l}\right) . \end{aligned}$$
(3.6)

For \(i\le K\), we observe that:

  • On the event \(\mathcal {E}_\varepsilon \), \(n^{2/3}(d_+-\mu _i)\) is uniformly bounded in i. Thus, \(|z-\mu _i|\ge |z-d_+|-|d_+-\mu _i|>\frac{1}{2}|z-d_+|\) by the assumption on z.

  • As \(\mathop {\textrm{Re}}\limits (z)>d_+\), we have \(|z-y|\ge |z-d_+|\) for all real \(y<d_+\).

Therefore,

$$\begin{aligned} \mathbbm {1}_{\mathcal {E}_\varepsilon }|S_1|\le \frac{1}{n}\sum _{i=1}^K\frac{1}{|z-\mu _i|^l}+\int _{g_K}^{d_+} \frac{1}{|z-y|^l}p_{{{\,\textrm{MP}\,}}}(y)\textrm{d}y\le \frac{3K}{n|z-d_+|^l}. \end{aligned}$$
(3.7)

We then bound \(\mathbbm {1}_{\mathcal {E}_\varepsilon }|S_2|\) using Lemma 2.8 to complete the proof of Lemma 3.2. \(\square \)

We obtain an upper bound for \(\gamma -\tilde{\gamma }\) in the following lemma. Together with Lemma 3.1, it verifies that the order of \(\gamma -\tilde{\gamma }\) is strictly less than that of \(\tilde{\gamma }-d_+\).

Lemma 3.3

If \(b<0\), then, on the event \(\mathcal {E}_\varepsilon \) for any given \(\varepsilon >0\),

$$\begin{aligned} |\gamma -\tilde{\gamma }|=O\left( \frac{(\log \log n)^2}{n^{2/3}\sqrt{\log n}}\right) . \end{aligned}$$

Proof

Recall that \(\gamma \) and \(\tilde{\gamma }\) are solutions to the equations \(L(x)=R(x)\) and \(L_\infty (x)=R_\infty (x)\), respectively, where

$$\begin{aligned} L(x)=\frac{1}{n}\sum _{i=1}^n\frac{1}{x-\mu _i(n)},\qquad R(x)=\frac{B_n^2}{\alpha _n+\sqrt{\alpha _n^2+xB_n^2}} \end{aligned}$$

and

$$\begin{aligned} L_\infty (x)=\int _\mathbb {R}\frac{p_{{{\,\textrm{MP}\,}}}(y)\textrm{d}y}{x-y},\qquad R_\infty (x)=\frac{B^2}{\alpha +\sqrt{\alpha ^2+xB^2}}. \end{aligned}$$

Define \(F(x)=R(x)/L(x)\) and let \(F_\infty (x)\) be given similarly. Setting \(\varepsilon _n=\frac{(\log \log n)^2}{n^{2/3}\sqrt{\log n}}\), we follow the method in [13] to prove \(|\gamma -\tilde{\gamma }|=O(\varepsilon _n)\) by showing \(F(\tilde{\gamma }-\varepsilon _n)<1<F(\tilde{\gamma }+\varepsilon _n).\) Since \(F_\infty (\tilde{\gamma })=1\) and \(F_\infty (\tilde{\gamma }-\varepsilon _n)<1<F_\infty (\tilde{\gamma }+\varepsilon _n)\), it suffices to show

$$\begin{aligned} |F(x)-F_\infty (x)|\ll |F_\infty '(\tilde{\gamma })|\varepsilon _n, \quad \text {for } x\in [\tilde{\gamma }-\varepsilon _n,\tilde{\gamma }+\varepsilon _n]. \end{aligned}$$
(3.8)

Thus, we need a lower bound for \(|F_\infty '(\tilde{\gamma })|\) and an upper bound for \(|F(x)-F_\infty (x)|\). For the lower bound, begin with

$$\begin{aligned} F_\infty '(\tilde{\gamma })=\frac{R_\infty '(\tilde{\gamma })L_\infty (\tilde{\gamma })-L_\infty '(\tilde{\gamma })R_\infty (\tilde{\gamma })}{(L_\infty (\tilde{\gamma }))^2}. \end{aligned}$$

Note that \(L_\infty (\tilde{\gamma })\) and \(R_\infty (\tilde{\gamma })\) are of order 1, and \(R_\infty '(\tilde{\gamma })=O(1)\) using the fact that \(\alpha ,B,\tilde{\gamma }\) are all of order 1.

We now show \(|L_\infty '(\tilde{\gamma })|\) is of order at least \(n^{1/3}(\log n)^{-1/2}\), which implies \(|F_\infty '(\tilde{\gamma })|\) is as well.

Since we are interested in \(L'_\infty (x)\) at \(\tilde{\gamma }\), where \(\tilde{\gamma }-d_+=\Theta (n^{-2/3}\log n)\) by Lemma 3.1, we consider \(L_\infty (d_++s)\) as a function of s and its derivative, and later set s to a value of order \(n^{-2/3}\log n\). We have

$$\begin{aligned} L_\infty (d_++s)= & {} C\int _{d_-}^{d_+}\frac{\sqrt{(d_+-y)(y-d_-)}}{(d_++s-y)y}\textrm{d}y\\= & {} \int _0^{d_+-d_-}\frac{C\sqrt{z(d_+-d_--z)}}{(z+s)(d_+-z)}\textrm{d}z, \end{aligned}$$

where \(C=\frac{2}{\pi }(\sqrt{d_+}-\sqrt{d_-})^{-2}\) and \(z=d_+-y\). We then write

$$\begin{aligned} \begin{aligned} L_\infty '(d_++s)&=\frac{\textrm{d}}{\textrm{d}s}\int _0^{\frac{d_+-d_-}{2}}\frac{C\sqrt{z(d_+-d_--z)}}{(z+s)(d_+-z)}\textrm{d}z \\ {}&\quad +\frac{\textrm{d}}{\textrm{d}s}\int _{\frac{d_+-d_-}{2}}^{d_+-d_-}\frac{C\sqrt{z(d_+-d_--z)}}{(z+s)(d_+-z)}\textrm{d}z. \end{aligned}\end{aligned}$$
(3.9)

First, we consider the derivative of a simplified version of the first integral:

$$\begin{aligned} \begin{aligned} \frac{\textrm{d}}{\textrm{d}s}\int _0^{\frac{d_+-d_-}{2}}\frac{\sqrt{z}}{z+s}\textrm{d}z&=-s^{-1/2}\arctan \sqrt{\tfrac{(d_+-d_-)/2}{s}}\\&\quad +\sqrt{s}\frac{\sqrt{(d_+-d_-)/2}}{s^{3/2}}\frac{1}{1+\frac{(d_+-d_-)/2}{s}}\\&=-\tfrac{\pi }{2}s^{-1/2}+O(1). \end{aligned}\nonumber \\ \end{aligned}$$
(3.10)

Now that we have the derivative of this simplified integral, recall that the actual integrand is \(\frac{C\sqrt{z(d_+-d_--z)}}{(z+s)(d_+-z)}\) and make the following observations:

  • For \(z\in [0,\frac{d_+-d_-}{2}]\), there exist positive constants \(C_1,C_2\) such that \(C_1<\frac{C\sqrt{d_+-d_--z}}{d_+-z}<C_2\).

  • For any \(z>0\), the quantity \(\frac{\sqrt{z}}{z+s}\) is a decreasing function of s when \(s>0\).

From these two facts and the above computation, we conclude that, for small s,

$$\begin{aligned} -C_2s^{-1/2}\le \frac{\textrm{d}}{\textrm{d}s}\int _0^{\frac{d_+-d_-}{2}}\frac{C\sqrt{z(d_+-d_--z)}}{(z+s)(d_+-z)}\textrm{d}z \le -C_1s^{-1/2}. \end{aligned}$$
(3.11)

Finally, the second bullet point implies that the second term on the right-hand side of (3.9), being the s-derivative of an integral whose integrand is decreasing in s, is negative. Thus, \(L_\infty '(d_++s)<-C_1s^{-1/2}\), which implies that \(|L_\infty '(\tilde{\gamma })|\) is of order at least \(n^{1/3}(\log n)^{-1/2}\). We obtain the lower bound

$$\begin{aligned} |F_\infty '(\tilde{\gamma })|=\Omega (n^{1/3}(\log n)^{-1/2}). \end{aligned}$$
(3.12)

We now establish an upper bound on \(|F(x)-F_\infty (x)|\) for \(x\in [\tilde{\gamma }-\varepsilon _n,\tilde{\gamma }+\varepsilon _n]\). For such x,

$$\begin{aligned} F(x)-F_\infty (x)=\frac{(R(x)-R_\infty (x))L_\infty (x)+(L_\infty (x)-L(x))R_\infty (x)}{L(x)L_\infty (x)} \end{aligned}$$

has denominator \(L(x)L_\infty (x)\) of order 1, and \(L_\infty (x)\) and \(R_\infty (x)\) are also of order 1. Thus, it remains to bound the terms \(R(x)-R_\infty (x)\) and \(L_\infty (x)-L(x)\). As \(\alpha _n-\alpha =O(n^{-1-\delta })\) and \(B_n-B=O(n^{-1-\delta })\), we have

$$\begin{aligned} R(x)-R_\infty (x)=\frac{\sqrt{\alpha _n^2+xB_n^2}-\alpha _n}{x}-\frac{\sqrt{\alpha ^2+xB^2}-\alpha }{x} =O(n^{-1-\delta }). \end{aligned}$$

Lastly, Lemma 3.2 yields that

$$\begin{aligned} L(x)-L_\infty (x)=\frac{1}{n} \sum _{i=1}^n\frac{1}{x-\mu _i}-\int \frac{p_{{{\,\textrm{MP}\,}}}(y)\textrm{d}y}{x-y}=O(n^{-1/3}(\log \log n)(\log n)^{-1}). \end{aligned}$$

Thus, we have shown that for \(x\in [\tilde{\gamma }-\varepsilon _n,\tilde{\gamma }+\varepsilon _n]\),

$$\begin{aligned}{} & {} |F(x)-F_\infty (x)|=O\left( \frac{\log \log n}{n^{1/3}\log n}\right) ,\nonumber \\{} & {} |F'_\infty (\tilde{\gamma })|=\Omega \left( n^{1/3}(\log n)^{-1/2}\right) . \end{aligned}$$
(3.13)

This verifies the inequality (3.8), and the lemma follows. \(\square \)

We now introduce a deterministic approximation \(G_\infty \) of the function G, given by

$$\begin{aligned} G_\infty (z_1,z_2)=B(z_1+z_2)-\alpha \log z_1-\frac{1}{2} \int \log (4z_1z_2-x)p_{{{\,\textrm{MP}\,}}}(x)\textrm{d}x.\nonumber \\ \end{aligned}$$
(3.14)

We observe that \((\tilde{\gamma }_1,\tilde{\gamma }_2)\) is the unique critical point of \(G_\infty \) satisfying \(4\tilde{\gamma }_1\tilde{\gamma }_2 \in (d_+, \infty )\). This follows from reasoning similar to that used for \((\gamma _1,\gamma _2)\). We obtain the following asymptotic expressions for the functions G and \(G_\infty \) and their partial derivatives.

Lemma 3.4

Let \((z_1,z_2)\) satisfy \(\mathop {\textrm{Re}}\limits (4z_1z_2)\ge d_+\) and \(|4z_1z_2-d_+|\ge cn^{-2/3}\log n\) for some fixed \(c>0\). Then, on the event \(\mathcal {E}_\varepsilon \), the following hold and are uniform in any compact region satisfying the constraints on \((z_1,z_2)\):

  1. (i)

    For every multi-index \(k=(k_1,k_2)\) (with \(|k|:=k_1+k_2\ge 1\)),

    $$\begin{aligned} \partial ^kG(z_1,z_2)-\partial ^kG_\infty (z_1,z_2)=O\left( n^{\frac{2}{3} |k|-1}\frac{\log \log n}{(\log n)^{|k|}} \right) . \end{aligned}$$
    (3.15)
  2. (ii)

    For every multi-index k with \(|k|\ge 1\),

    $$\begin{aligned} \begin{aligned} \partial ^kG_\infty (z_1,z_2)&=O(n^{\frac{2}{3} |k|-1}(\log n)^{-|k|+\frac{3}{2}})\\ \partial ^kG(z_1,z_2)&=O(n^{\frac{2}{3} |k|-1}(\log n)^{-|k|+\frac{3}{2}}). \end{aligned}\end{aligned}$$
    (3.16)

Proof

We recall

$$\begin{aligned} \begin{aligned} G(z_1,z_2)&=B_n(z_1+z_2)-\alpha _n \log z_1-\frac{1}{2n}\sum _{j=1}^n\log (4z_1z_2-\mu _j),\\ G_\infty (z_1,z_2)&=B(z_1+z_2)-\alpha \log z_1-\frac{1}{2} \int \log (4z_1z_2-x)p_{{{\,\textrm{MP}\,}}}(x)\textrm{d}x.\\ \end{aligned} \end{aligned}$$

Observe that over any fixed compact region of \(\mathbb {C}^2\), for every \(|k|\ge 1\),

  • \(\partial ^kG_\infty (z_1,z_2)=O\left( \int (4z_1z_2-x)^{-k}p_{{{\,\textrm{MP}\,}}}(x)\textrm{d}x\right) \), and

  • the differences in the partials of G and \(G_\infty \) satisfy

    $$\begin{aligned} \partial ^kG(z_1,z_2)-\partial ^kG_\infty (z_1,z_2)=O\left( \frac{1}{n}\sum _{i=1}^n\frac{1}{(4z_1z_2-\mu _i)^{|k|}}-\int \frac{p_{{{\,\textrm{MP}\,}}}(x)\textrm{d}x}{(4z_1z_2-x)^{|k|}} \right) .\nonumber \\ \end{aligned}$$
    (3.17)

Applying Lemma 3.2 to (3.17) gives us part (i) of the lemma. For part (ii), we first obtain the bound for \(\partial ^kG_\infty \) by noting that

$$\begin{aligned} \begin{aligned}&\left| \int (4z_1z_2-x)^{-|k|}p_{{{\,\textrm{MP}\,}}}(x)\textrm{d}x\right| \\&\quad \le \int \frac{1}{\max \{|4z_1z_2-x|,\;d_+-x\}^{|k|}}p_{{{\,\textrm{MP}\,}}}(x)\textrm{d}x\\&\quad =O\left( \int _{n^{-2/3}\log n}^\infty \frac{\sqrt{y-n^{-2/3}\log n}}{y^{|k|}}\textrm{d}y\right) \\&\quad =O\left( \int _{n^{-2/3}\log n}^\infty y^{-|k|+\frac{1}{2}}\textrm{d}y\right) =O\left( (n^{-2/3}\log n)^{-|k|+3/2}\right) . \end{aligned} \end{aligned}$$
(3.18)

Then, the bound for \(\partial ^kG\) as in (ii) follows by part (i) of the lemma and the bound obtained for \(\partial ^kG_\infty \). \(\square \)

We prove some further properties of G and \(G_\infty \) in the following lemma.

Lemma 3.5

For the critical points \((\gamma _1,\gamma _2)\) and \((\tilde{\gamma }_1,\tilde{\gamma }_2)\) of G and \(G_\infty \), respectively, the following hold on event \(\mathcal {E}_\varepsilon \).

  1. (i)

    We have

    $$\begin{aligned}{} & {} |\gamma _1-\tilde{\gamma }_1|=O(n^{-2/3}(\log \log n)^2(\log n)^{-1/2}),\\{} & {} |\gamma _2-\tilde{\gamma }_2|=O(n^{-2/3}(\log \log n)^2(\log n)^{-1/2}). \end{aligned}$$
  2. (ii)

    There is a positive constant c, independent of n, such that

    $$\begin{aligned} 4\gamma _1\gamma _2-\mu _1>cn^{-2/3}\log n\quad \text {and}\quad 4\gamma _1\gamma _2-d_+>cn^{-2/3}\log n. \end{aligned}$$
  3. (iii)

    We have

    $$\begin{aligned} G(\gamma _1,\gamma _2)=G(\tilde{\gamma }_1,\tilde{\gamma }_2)+O(n^{-1}(\log n)^{-3/2}(\log \log n)^4) \end{aligned}$$

    and for any multi-index \(k=(k_1,k_2)\) satisfying \(|k|>0\),

    $$\begin{aligned} \partial ^kG(\gamma _1,\gamma _2)=\partial ^kG(\tilde{\gamma }_1,\tilde{\gamma }_2)+O\left( n^{\frac{2}{3} |k|-1}(\log n)^{-|k|}(\log \log n)^2 \right) . \end{aligned}$$

Proof

Part (i) follows from the equations for \(\gamma _1,\gamma _2,\tilde{\gamma }_1,\tilde{\gamma }_2\) along with the bound on \(|\gamma -\tilde{\gamma }|\).

Part (ii) follows from part (i) along with the computation of \(\tilde{\gamma }-d_+\) and the fact that \(|d_+-\mu _1|=O(n^{-2/3})\).

For Part (iii), using the bounds from Lemma 3.4(ii) and Lemma 3.5(i), we get the Taylor expansion

$$\begin{aligned}\begin{aligned} G(\tilde{\gamma }_1,\tilde{\gamma }_2)&=G(\gamma _1,\gamma _2)+\partial _1G(\gamma _1,\gamma _2)(\tilde{\gamma }_1-\gamma _1) +\partial _2G(\gamma _1,\gamma _2)(\tilde{\gamma }_2-\gamma _2)\\&\quad +O\left( \tfrac{n^{1/3}}{(\log n)^{1/2}}\cdot (\tfrac{(\log \log n)^2}{n^{2/3}(\log n)^{1/2}})^2\right) \\&=G(\gamma _1,\gamma _2)+O(n^{-1}(\log n)^{-3/2}(\log \log n)^4), \end{aligned}\end{aligned}$$

since the first-order partial derivatives of G vanish at the critical point \((\gamma _1,\gamma _2)\).

Similarly, for the partials, we get

$$\begin{aligned}\begin{aligned} \partial ^kG(\tilde{\gamma }_1,\tilde{\gamma }_2)&=\partial ^kG(\gamma _1,\gamma _2)+O\left( n^{\frac{2}{3} (|k|+1)-1}(\log n)^{-(|k|+1)+\frac{3}{2} }\cdot \tfrac{(\log \log n)^2}{n^{2/3}(\log n)^{1/2}}\right) \\&=\partial ^kG(\gamma _1,\gamma _2)+O\left( n^{\frac{2}{3} |k|-1}(\log n)^{-|k|}(\log \log n)^2 \right) . \end{aligned}\end{aligned}$$

\(\square \)

3.2 Steepest Descent Analysis

We now perform a steepest descent analysis to compute the contour integral in the high-temperature case. The method relies on the observation that the dominant contribution to the integral comes from within a small radius around the critical point of G. In this case, the radius is \(r=n^{-2/3}(\log n)^{\frac{1}{4}+\varepsilon }\) for some \(\varepsilon >0\).

The intuition behind this choice of truncation radius is as follows: Consider a Taylor expansion of \(G_\infty \) where \(z_1=\tilde{\gamma }_1+ir_nt_1\) and \(z_2=\tilde{\gamma }_2+ir_nt_2\) with \(r_n\) to be determined. Let m denote a multi-index for the derivative, and let |m| denote the length of the multi-index. We want to choose \(r_n\) such that

$$\begin{aligned} \partial ^{(m)}G_\infty (\tilde{\gamma }_1,\tilde{\gamma }_2)\cdot r_n^{|m|}={\left\{ \begin{array}{ll}\Theta ( \frac{1}{n})&{}|m|=2\\ o(\frac{1}{n})&{}|m|\ge 3. \end{array}\right. } \end{aligned}$$
(3.19)

Using the previous lemmas, this is satisfied exactly when \(r_n=\Theta ( n^{-2/3}(\log n)^{1/4})\).

Lemma 3.6

Let \(\gamma _1=\gamma _1(n)\) and \(\gamma _2=\gamma _2(n)\) be such that \((\gamma _1,\gamma _2)\) is the critical point of \(G(z_1,z_2)\) satisfying \(\gamma =4\gamma _1\gamma _2>\mu _1(n)\). Then, for any \(0<\varepsilon <1/4\) and any \(\Omega \subset \{(y_1,y_2)\in \mathbb {R}^2:y_1^2+y_2^2\ge n^{-4/3}(\log n)^{1/2+2\varepsilon } \}\), on the event \(\mathcal {E}_\varepsilon \), there exists some \(C>0\) such that

$$\begin{aligned} \int _{\Omega }\exp \left[ n\mathop {\textrm{Re}}\limits (G(\gamma _1+\textrm{i}y_1,\gamma _2+\textrm{i}y_2)-G(\gamma _1,\gamma _2)) \right] \textrm{d}y_2\textrm{d}y_1 =O(e^{-C(\log n)^\varepsilon }). \end{aligned}$$

Proof

Since \(\gamma -\mu _n\) is bounded in n, Lemma 3.9 of [13] implies that with high probability, the portion of the above integral over \(\Omega \cap \{(y_1,y_2)\in \mathbb {R}^2:y_1^2+y_2^2\ge n^{-1+2\varepsilon }\}\) is \(O(e^{-n^\varepsilon })\). Thus, it remains to consider the subset of \(\Omega \) where \(y_1^2+y_2^2\) is between \(n^{-4/3}(\log n)^{1/2+2\varepsilon }\) and \(n^{-1+2\varepsilon }\). We denote this subset by \(\widetilde{\Omega }\).

The proof of Lemma 3.9 of [13] also shows that, for some constant \(c_0>0\) and for any integer \(K\ge 1\),

$$\begin{aligned}{} & {} \mathop {\textrm{Re}}\limits (G(\gamma _1+ \textrm{i}y_1,\gamma _2+\textrm{i}y_2)-G(\gamma _1,\gamma _2))\nonumber \\{} & {} \quad \le -\frac{1}{4n}\sum _{j=K}^{n}\log \left( 1+\frac{c_0}{(\gamma -\mu _j)^2}(y_1^2+y_2^2)\right) , \end{aligned}$$
(3.20)

for all \(y_1,y_2\in \mathbb {R}\). By Lemma 2.2, for every \(\varepsilon >0\), there exists \(c,K>0\) such that, with probability at least \(1-\varepsilon \),

$$\begin{aligned} \gamma -d_+\le cn^{-2/3}\log n \quad \text {and}\quad d_+-\mu _j\le {\left\{ \begin{array}{ll} ci^{2/3}n^{-2/3}, &{}\quad K\le j\le n^{2/5}, \\ c, &{}\quad j>n^{2/5}. \end{array}\right. } \end{aligned}$$

Thus, with probability at least \(1-\varepsilon \),

$$\begin{aligned} \gamma -\mu _j \le {\left\{ \begin{array}{ll} cn^{-2/3}\log n, &{}\quad K \le j \le (\log n)^{3/2},\\ cj^{2/3}n^{-2/3}, &{}\quad (\log n)^{3/2}\le j\le n^{2/5},\\ c, &{}\quad j>n^{2/5}. \end{array}\right. } \end{aligned}$$

Writing \(r^2=y_1^2+y_2^2\) in polar coordinates, for \(r\in [n^{-2/3}(\log n)^{1/4+\varepsilon }, n^{-1/2+\varepsilon }]\) and the above choice of K, the right-hand side of (3.20) is bounded above by

$$\begin{aligned} -\frac{1}{4n}\left[ (\log n)^{3/2}\log \left( 1+\frac{c'n^{4/3}}{\log ^2 n}r^2\right) +\sum _{j=(\log n)^{3/2} }^{n^{2/5}}\log \left( 1+\frac{c'r^2}{(j/n)^{4/3}}\right) +\frac{n}{2}\log (1+c'r^2) \right] . \end{aligned}$$
(3.21)

We then use \(r\ge n^{-2/3}(\log n)^{1/4+\varepsilon }\) for the first and last terms inside the brackets, and the fact \(\log (1+x)\ge x/2\) for small x to obtain a new bound

$$\begin{aligned} -\frac{c'}{4n}\left[ (\log n)^{2\varepsilon }+(\log n)^{-3/2+2\varepsilon } \sum _{j=\log ^{3/2}n}^{n^{2/5}}j^{-4/3}+\frac{n}{4} r^2\right]&\le -\frac{c'r^2}{16}-\frac{c'(\log n)^{2\varepsilon }}{8n}, \end{aligned}$$
(3.22)

noting that the sum over j is \(O((\log n)^{-1/2})\). Therefore, the integral over \(\widetilde{\Omega }\) is bounded by

$$\begin{aligned} e^{-\frac{c'}{8}(\log n) ^{2\varepsilon }} \int _{ n^{-2/3}\log ^{1/4+\varepsilon }n}^{n^{-1/2+\varepsilon }}e^{-\frac{c'}{16}r^2}r\textrm{d}r = O(e^{-C(\log n)^{2\varepsilon }}), \end{aligned}$$
(3.23)

for some \(C>0\). This completes our proof. \(\square \)
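The bound on the middle sum above uses the integral comparison \(\sum _{j\ge J}j^{-4/3}\approx \int _J^\infty y^{-4/3}\textrm{d}y=3J^{-1/3}\), which with \(J=(\log n)^{3/2}\) gives the stated \(O((\log n)^{-1/2})\). A quick numerical check of this comparison (the values of J and the cutoff N are illustrative):

```python
import numpy as np

# Compare sum_{j=J}^{N-1} j^{-4/3} with the integral 3*(J^{-1/3} - N^{-1/3});
# since j^{-4/3} is decreasing, the sum exceeds the integral by at most J^{-4/3}.
J, N = 1000, 10 ** 6
j = np.arange(J, N, dtype=np.float64)
tail_sum = np.sum(j ** (-4.0 / 3.0))
integral = 3 * (J ** (-1.0 / 3.0) - N ** (-1.0 / 3.0))
gap = abs(tail_sum - integral)
```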

Lemma 3.7

If \(\beta =\beta _c+bn^{-1/3}\sqrt{\log n}\) for fixed \(b<0\), then the integral \(Q_n\) in (3.1) satisfies

$$\begin{aligned} Q_n=e^{nG(\gamma _1,\gamma _2)}\frac{\pi }{n\sqrt{D(\gamma _1,\gamma _2)}}\left( 1+O((\log n)^{-\frac{3}{2}+6\varepsilon })\right) , \end{aligned}$$

where \(\varepsilon >0\) is arbitrarily small and \(D(\gamma _1,\gamma _2)\) is the discriminant

$$\begin{aligned} D(\gamma _1,\gamma _2):=\partial _1^2G(\gamma _1,\gamma _2)\cdot \partial _2^2G(\gamma _1,\gamma _2)-(\partial _1\partial _2G(\gamma _1,\gamma _2))^2. \end{aligned}$$
(3.24)

Proof

We make the change of variables

$$\begin{aligned} z_1=\gamma _1+\textrm{i}r_nt_1,\quad z_2=\gamma _2+\textrm{i}r_nt_2, \end{aligned}$$
(3.25)

where the scaling \(r_n:=n^{-2/3}(\log n)^{1/4}\) is chosen such that the quadratic term in the Taylor expansion of G near \((\gamma _1,\gamma _2)\) is of order 1. With this change of variables, we have

$$\begin{aligned} Q_n{=}r_n^2e^{nG(\gamma _1,\gamma _2)}\int _{{-}\infty }^\infty \int _{-\infty }^\infty \exp \left( n\Big (G(\gamma _1{+}\textrm{i}r_nt_1,\;\gamma _2+\textrm{i}r_nt_2)-G(\gamma _1,\gamma _2)\Big )\right) \textrm{d}t_2\textrm{d}t_1. \nonumber \\ \end{aligned}$$
(3.26)

Fix \(0<\varepsilon <1/4\). We have shown in Lemma 3.6 that this integral outside a region of radius \((\log n)^\varepsilon \) around the critical point is \(O(e^{-c(\log n)^\varepsilon })\) for some constant \(c>0\). We now consider the region where \(|t_1|,|t_2|\le (\log n)^\varepsilon \). In this region,

$$\begin{aligned} \begin{aligned} G&(\gamma _1+\textrm{i}r_nt_1,\gamma _2+\textrm{i}r_nt_2)-G(\gamma _1,\gamma _2)\\&=-\tfrac{1}{2} r_n^2\left( \partial _1^2G(\gamma _1,\gamma _2)t_1^2 +2\partial _1\partial _2G(\gamma _1,\gamma _2)t_1t_2+\partial _2^2G(\gamma _1,\gamma _2)t_2^2\right) \\&\quad -\tfrac{\textrm{i}}{6}r_n^3\left( \partial _1^3G(\gamma _1,\gamma _2)t_1^3 +3\partial _1^2\partial _2G(\gamma _1,\gamma _2)t_1^2t_2 +3\partial _1\partial _2^2G(\gamma _1,\gamma _2)t_1t_2^2 \right. \\&\left. \quad +\partial _2^3G(\gamma _1,\gamma _2)t_2^3\right) +O(\text {Taylor remainder})\\&=:-r_n^2X_2(t_1,t_2)-\textrm{i}r_n^3X_3(t_1,t_2)+O(n^{-1}(\log n)^{-\frac{3}{2}+4\varepsilon }). \end{aligned}\nonumber \\ \end{aligned}$$
(3.27)

Thus, the integral on the central region becomes

$$\begin{aligned}{} & {} \int _{-(\log n)^\varepsilon }^{(\log n)^\varepsilon }\int _{-(\log n)^\varepsilon }^{(\log n)^\varepsilon }\exp \left( n\Big (G(\gamma _1+\textrm{i}r_nt_1,\;\gamma _2+\textrm{i}r_nt_2)-G(\gamma _1,\gamma _2)\Big )\right) \textrm{d}t_2\textrm{d}t_1\nonumber \\{} & {} \quad =\int \int e^{-nr_n^2X_2(t_1,t_2)}\textrm{d}t_2\textrm{d}t_1 -\textrm{i}\int \int nr_n^3X_3(t_1,t_2)e^{-nr_n^2X_2(t_1,t_2)}\textrm{d}t_2\textrm{d}t_1\nonumber \\{} & {} \quad +O\left( (\log n)^{-\frac{3}{2}+6\varepsilon }\right) , \end{aligned}$$
(3.28)

where the second integral vanishes due to the fact that

$$\begin{aligned} X_3(-t_1,-t_2)e^{-nr_n^2X_2(-t_1,-t_2)}=-X_3(t_1,t_2)e^{-nr_n^2X_2(t_1,t_2)}. \end{aligned}$$

It remains to compute \(\int _{-(\log n)^\varepsilon }^{(\log n)^\varepsilon }\int _{-(\log n)^\varepsilon }^{(\log n)^\varepsilon }e^{-nr_n^2X_2(t_1,t_2)}\textrm{d}t_2\textrm{d}t_1\), which we replace by the integral over \(\mathbb {R}^2\), incurring an error on the order of

$$\begin{aligned} \int _{(\log n)^\varepsilon }^\infty e^{-x^2}\textrm{d}x< e^{-(\log n)^{2\varepsilon }}\ll (\log n)^{-3/2}. \end{aligned}$$
(3.29)

Finally, applying Gaussian integration, we obtain the lemma. \(\square \)
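The Gaussian integration in the last step rests on the standard two-dimensional identity \(\int _{\mathbb {R}^2}e^{-(at_1^2+2bt_1t_2+ct_2^2)}\textrm{d}t_1\textrm{d}t_2=\pi /\sqrt{ac-b^2}\) for a positive-definite quadratic form, which is the source of the factor \(\pi /(n\sqrt{D(\gamma _1,\gamma _2)})\) after rescaling. A numerical sanity check with illustrative coefficients (a sketch, not tied to the model parameters):

```python
import numpy as np

# Verify: integral over R^2 of exp(-(a t1^2 + 2 b t1 t2 + c t2^2)) equals pi / sqrt(ac - b^2)
# for a positive-definite quadratic form (a, b, c are illustrative).
a, b, c = 2.0, 0.5, 1.5
t = np.arange(-8.0, 8.0, 0.01)
t1, t2 = np.meshgrid(t, t)
quad_form = a * t1 ** 2 + 2 * b * t1 * t2 + c * t2 ** 2
numeric = np.sum(np.exp(-quad_form)) * 0.01 ** 2   # Riemann sum on [-8, 8]^2
exact = np.pi / np.sqrt(a * c - b ** 2)
```

Truncating the domain at radius 8 is harmless here, mirroring the negligible \(e^{-(\log n)^{2\varepsilon }}\) truncation error in (3.29).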

We observe from the lemma above that the integral \(Q_n\) depends on \(G(\gamma _1,\gamma _2)\) and \(D(\gamma _1,\gamma _2)\), which we compute in the following lemma.

Lemma 3.8

If \(\beta =\beta _c+bn^{-1/3}\sqrt{\log n}\) for some fixed \(b<0\), then

$$\begin{aligned} G(\gamma _1,\gamma _2)&=A(\tilde{\gamma },B)-\frac{1}{2n}\sum _{i = 1}^n\log (\tilde{\gamma }-\mu _i)+O(n^{-1})\\ D(\gamma _1,\gamma _2)&=-\frac{\beta _c}{\lambda ^2b}n^{1/3}(\log n)^{-1/2}\left( 1+O\left( (\log \log n)^2(\log n)^{-3/2}\right) \right) \end{aligned}$$

where

$$\begin{aligned} A(x,B):=\sqrt{\alpha ^2+xB^2}-\alpha \log \left( \frac{\alpha +\sqrt{\alpha ^2+xB^2}}{2B}\right) . \end{aligned}$$
(3.30)

Proof

The computation of \(G(\gamma _1,\gamma _2)\) relies upon \(G_\infty (\tilde{\gamma }_1,\tilde{\gamma }_2)\), which we write as

$$\begin{aligned} G_\infty (\tilde{\gamma }_1,\tilde{\gamma }_2)=A(\tilde{\gamma },B)-\frac{1}{2}H_{{{\,\textrm{MP}\,}}}(\tilde{\gamma }), \quad \quad H_{{{\,\textrm{MP}\,}}}(z):=\int _{\mathbb {R}}\log (z-x)p_{{{\,\textrm{MP}\,}}}(x)\textrm{d}x.\nonumber \\ \end{aligned}$$
(3.31)

Then, by Lemma 3.5(iii),

$$\begin{aligned} \begin{aligned} G(\gamma _1,\gamma _2)&= G_\infty (\tilde{\gamma }_1,\tilde{\gamma }_2) + \left[ G(\tilde{\gamma }_1,\tilde{\gamma }_2) - G_\infty (\tilde{\gamma }_1,\tilde{\gamma }_2)\right] + O\left( n^{-1}\frac{(\log \log n)^4}{(\log n)^{3/2}}\right) \\&=G_\infty (\tilde{\gamma }_1,\tilde{\gamma }_2) -\frac{1}{2n}\left[ \sum _{i=1}^n\log (\tilde{\gamma }-\mu _i)-nH_{{{\,\textrm{MP}\,}}}(\tilde{\gamma })\right] + O(n^{-1})\\&=A(\tilde{\gamma },B)-\frac{1}{2n}\sum _{i = 1}^n\log (\tilde{\gamma }-\mu _i)+O(n^{-1}). \end{aligned}\nonumber \\ \end{aligned}$$
(3.32)

The same lemma and Lemma 3.4(ii) together yield

$$\begin{aligned} D(\gamma _1,\gamma _2) =D_\infty (\tilde{\gamma }_1,\tilde{\gamma }_2)+O\left( n^{1/3}\frac{(\log \log n)^2}{(\log n)^2}\right) . \end{aligned}$$

Recall from (3.2) that \(\tilde{\gamma } =\frac{1+\beta ^2+\beta _c^{-4}\beta ^4}{(1+\lambda )^{-1}\beta ^2}\), and \(\beta _c=\lambda ^{-\frac{1}{4}}(1+\lambda )^{1/2}\). We arrive at

$$\begin{aligned} \begin{aligned} D_\infty (\tilde{\gamma }_1,\tilde{\gamma }_2)&:=\partial _1^2G_\infty (\tilde{\gamma }_1,\tilde{\gamma }_2)\cdot \partial _2^2G_\infty (\tilde{\gamma }_1,\tilde{\gamma }_2)-(\partial _1\partial _2G_\infty (\tilde{\gamma }_1,\tilde{\gamma }_2))^2\\&=\frac{4\beta ^4}{\lambda ^2(\beta _c^4-\beta ^4)}. \end{aligned} \end{aligned}$$
(3.33)

Applying this to the expression for \(D(\gamma _1,\gamma _2)\) and performing a Taylor expansion around \(\beta _c\), we obtain the lemma. \(\square \)
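For the order of \(D(\gamma _1,\gamma _2)\), expanding (3.33) with \(\beta _c^4-\beta ^4=4\beta _c^3(\beta _c-\beta )(1+O(\beta _c-\beta ))\) gives the leading order \(D_\infty \approx \beta _c/(\lambda ^2(\beta _c-\beta ))\), which is positive for \(\beta <\beta _c\) and of order \(n^{1/3}(\log n)^{-1/2}\). A numerical sketch (the values of \(\lambda \), b, and n are illustrative):

```python
import math

# D_inf from (3.33) versus its leading order beta_c / (lam^2 * (beta_c - beta))
# at beta = beta_c + b n^{-1/3} sqrt(log n) with b < 0.
lam = 0.5
b = -1.0
beta_c = math.sqrt(1 + lam) / lam ** 0.25

def rel_err(n):
    beta = beta_c + b * n ** (-1 / 3) * math.sqrt(math.log(n))
    d_exact = 4 * beta ** 4 / (lam ** 2 * (beta_c ** 4 - beta ** 4))   # (3.33)
    d_leading = beta_c / (lam ** 2 * (beta_c - beta))
    return abs(d_exact - d_leading) / d_leading

err6, err9 = rel_err(10 ** 6), rel_err(10 ** 9)
```

The relative error is of order \(\beta _c-\beta =O(n^{-1/3}\sqrt{\log n})\) and shrinks as n grows.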

3.3 High-Temperature Free Energy

Finally, using the contour integral computations from the previous section, we obtain the following lemma for the limiting fluctuations of the free energy on the high-temperature side of the critical temperature window.

Lemma 3.9

Suppose \(\beta =\beta _c+bn^{-1/3}\sqrt{\log n}\) for some fixed \(b<0\). We define \(F(\beta )=\frac{\beta ^2}{2\beta _c^4}\). Then the free energy satisfies

$$\begin{aligned} \frac{m+n}{\sqrt{\frac{1}{6}\log n}}\left( F_{n,m}(\beta )-F(\beta )+\frac{1}{12}\frac{\log n}{n+m}\right) \rightarrow \mathcal {N}(0,1). \end{aligned}$$
(3.34)

Proof

We will show that

$$\begin{aligned} F_{n,m}(\beta )-\frac{\beta ^2}{2\beta _c^4}+\frac{1}{12}\frac{\log n}{n+m}-\frac{\sqrt{\frac{1}{6} \log n}}{m+n}T_{0n}=O\left( \frac{\log \log n}{n}\right) , \end{aligned}$$
(3.35)

where

$$\begin{aligned} -T_{0n}:=\frac{\sum _{i = 1}^n\log (\tilde{\gamma }-\mu _i)-C_\lambda n - \frac{1}{\sqrt{\lambda }(1+\sqrt{\lambda })} n(\tilde{\gamma }-d_+) +\frac{2}{3\lambda ^{3/4}(1+\sqrt{\lambda })^2}n(\tilde{\gamma }-d_+)^{3/2} +\frac{1}{6}\log n}{\sqrt{\frac{2}{3} \log n}}\nonumber \\ \end{aligned}$$
(3.36)

with \(C_\lambda :=(1-\lambda ^{-1})\log (1+\lambda ^{1/2})+\log (\lambda ^{1/2})+\lambda ^{-1/2}\) and, by [24], \(T_{0n}\) converges in distribution to a standard normal. We now compute the left-hand side of (3.35) in terms of the parameters \(\beta \) and \(\lambda \). From (2.6), we start by computing

$$\begin{aligned} \frac{1}{n+m}\log Q(n,\alpha _n, B_n){} & {} = \frac{n}{n+m} G(\gamma _1,\gamma _2)+\frac{1}{2(n+m)}\log \left( \frac{\pi ^2}{D(\gamma _1,\gamma _2)}\right) \nonumber \\{} & {} \quad -\frac{\log n}{n+m}+o(n^{-1}), \end{aligned}$$
(3.37)

using Lemma 3.7. By Lemma 3.8, the second term satisfies

$$\begin{aligned} \frac{1}{2(n+m)}\log \left( \frac{\pi ^2}{D(\gamma _1,\gamma _2)}\right) =-\frac{1}{6}\frac{\log n}{n+m}+O\left( \frac{\log \log n}{n}\right) . \end{aligned}$$
(3.38)

Thus, using the computation of \(G(\gamma _1,\gamma _2)\) from (3.32), (3.37) simplifies to

$$\begin{aligned} \frac{1}{n+m}\log Q(n,\alpha _n, B_n)= & {} \frac{n}{n+m}A(\tilde{\gamma },B)-\frac{1}{2(n+m)}\sum _{i = 1}^n\log (\tilde{\gamma }-\mu _i)\nonumber \\{} & {} -\frac{7}{6}\frac{\log n}{n+m}+O\left( \frac{\log \log n}{n}\right) . \end{aligned}$$
(3.39)

Recall that \(\alpha =\frac{1}{2}(\lambda ^{-1}-1)\) and \(B=\frac{1}{\sqrt{\lambda (1+\lambda )}}\beta \) for the bipartite SSK model, and \(\tilde{\gamma }\) is given in (3.2). This implies \(\sqrt{\alpha ^2+\tilde{\gamma }B^2}=\frac{\lambda +1}{2\lambda }+\frac{\beta ^2}{1+\lambda }\), and

$$\begin{aligned} \frac{n}{n+m}A(\tilde{\gamma },B)=\frac{1}{2}+\frac{\lambda \beta ^2}{(1+\lambda )^2}+\frac{1-\lambda }{2(1+\lambda )}\log \left( \frac{2\beta \sqrt{\lambda (1+\lambda )}}{1+\lambda +\beta ^2\lambda }\right) +O(n^{-1}).\nonumber \\ \end{aligned}$$
(3.40)

Combining (2.6), (3.39), and (3.40), we have

$$\begin{aligned} \begin{aligned} F_{n,m}(\beta ) =&-\frac{1}{2(n+m)}\sum _{i = 1}^n\log (\tilde{\gamma }-\mu _i) +\frac{\lambda \beta ^2}{(1+\lambda )^2}-\frac{1-\lambda }{2(1+\lambda )}\log (1+\lambda +\beta ^2\lambda )\\&{-}\frac{\lambda }{1{+}\lambda }\log \beta +\frac{1}{2(\lambda +1)}\log (1{+}\lambda ) -\frac{1}{6}\frac{\log n}{n{+}m}{+}O\left( \frac{\log \log n}{n}\right) . \end{aligned}\nonumber \\ \end{aligned}$$
(3.41)

In order to prove Eq. (3.35), we need to express each \(\beta \)-dependent term as a Taylor expansion around \(\beta _c\). More specifically, we define

$$\begin{aligned} \Delta _\beta :=\beta _c-\beta =O(n^{-1/3}\sqrt{\log n}). \end{aligned}$$
(3.42)

Using this and the fact that \(\beta _c=\frac{\sqrt{1+\lambda }}{\lambda ^{1/4}}\), we get

$$\begin{aligned} \begin{aligned} \beta ^2=&\frac{1+\lambda }{\sqrt{\lambda }}-2\beta _c\Delta _\beta +\Delta _\beta ^2\\ \log \beta =&\frac{1}{2}\log (1+\lambda )-\frac{1}{4}\log \lambda -\frac{1}{\beta _c}\Delta _\beta -\frac{1}{2\beta _c^2}\Delta _\beta ^2\\ {}&-\frac{1}{3\beta _c^3}\Delta _\beta ^3+O(\Delta _\beta ^4)\\ \log (1+\lambda +\beta ^2\lambda )=&\log ((1+\lambda )(1+\sqrt{\lambda }))-\frac{2\beta _c\lambda }{(1+\lambda )(1+\sqrt{\lambda })}\Delta _\beta \\&+\frac{\lambda (1+\lambda -\beta _c^2\lambda )}{(1+\lambda )^2(1+\sqrt{\lambda })^2}\Delta _\beta ^2\\&+\frac{2\beta _c\lambda ^2(1+\lambda -\frac{1}{3}\beta _c^2\lambda )}{(1+\lambda )^3(1+\sqrt{\lambda })^3}\Delta _\beta ^3+O(\Delta _\beta ^4). \end{aligned}\nonumber \\ \end{aligned}$$
(3.43)
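A numerical sanity check of these expansions, with a placeholder value of \(\lambda \); the \(\Delta _\beta ^2\) coefficient of the last expansion is taken as \(\lambda (1+\lambda -\beta _c^2\lambda )/((1+\lambda )^2(1+\sqrt{\lambda })^2)\), which is what direct differentiation of \(\log (1+\lambda +\beta ^2\lambda )\) at \(\beta _c\) yields.

```python
import math

lam = 0.6                                    # placeholder aspect ratio
beta_c = math.sqrt(1 + lam) / lam ** 0.25
g = (1 + lam) * (1 + math.sqrt(lam))         # = 1 + lam + beta_c^2 * lam

def exact(delta):
    beta = beta_c - delta
    return math.log(1 + lam + beta ** 2 * lam)

def cubic_series(delta):
    # coefficients of the expansion in Delta_beta = beta_c - beta
    c1 = 2 * beta_c * lam / g
    c2 = lam * (1 + lam - beta_c ** 2 * lam) / g ** 2
    c3 = 2 * beta_c * lam ** 2 * (1 + lam - beta_c ** 2 * lam / 3) / g ** 3
    return math.log(g) - c1 * delta + c2 * delta ** 2 + c3 * delta ** 3

# the truncation error should shrink like Delta_beta^4
for delta in (1e-1, 1e-2):
    assert abs(exact(delta) - cubic_series(delta)) < 5 * delta ** 4
```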

Furthermore, using Eq. (3.3) we have

$$\begin{aligned} \tilde{\gamma }-d_+=\frac{4(1+\lambda )}{\beta _c^4}\Delta _\beta ^2+\frac{4(1+\lambda )}{\beta _c^5}\Delta _\beta ^3+O(\Delta _\beta ^4). \end{aligned}$$
(3.44)

Plugging these asymptotics into Eqs. (3.36) and (3.41), we verify (3.35), and the lemma follows. \(\square \)

4 Low Temperature

We now determine the asymptotics of the random double integral \(Q_n=-\int _{\gamma _1-\textrm{i}\infty }^{\gamma _1+\textrm{i}\infty }\int _{\gamma _2-\textrm{i}\infty }^{\gamma _2+\textrm{i}\infty }e^{nG(z_1,z_2)}\textrm{d}z_2\textrm{d}z_1\) when \(\beta =\beta _c+bn^{-\frac{1}{3}}\sqrt{\log n}\) for fixed \(b\ge 0\).

Recall that in the regime \(\beta <\beta _c\), both for fixed \( \beta \) as in [13] and for \(\beta \) in Sect. 3, the critical point \((\gamma _1,\gamma _2)\) of the function G is approximated by \((\tilde{\gamma }_1,\tilde{\gamma }_2)\), the critical point satisfying \(4\tilde{\gamma }_1\tilde{\gamma }_2>d_+\) of a deterministic approximation \(G_\infty \) of G. In the case \(\beta >\beta _c\), a critical point of \(G_\infty \) satisfying this inequality does not exist, and we cannot approximate the product \(\gamma =4\gamma _1\gamma _2\) by a deterministic number. In fact, the product \(\gamma \) gets close to the branch point \(\mu _1\) from above, which requires more delicate analysis.

We address this issue by focusing on G near the point \((\mu _1^{(1)},\mu _1^{(2)})\), given by

$$\begin{aligned} \mu _1^{(1)}=\frac{\alpha _n+\sqrt{\alpha _n^2+\mu _1B_n^2}}{2B_n}, \quad \mu _1^{(2)}=\frac{-\alpha _n+\sqrt{\alpha _n^2+\mu _1B_n^2}}{2B_n}, \end{aligned}$$
(4.1)

instead of \((\gamma _1,\gamma _2)\). We see that \(4\mu _1^{(1)}\mu _1^{(2)}=\mu _1\), so \(G(z_1,z_2)\) is undefined at \((\mu _1^{(1)},\mu _1^{(2)})\) due to the term \(\frac{1}{n}\log (4z_1z_2-\mu _1)\). However, the non-singular part, given below, will play an important role:

$$\begin{aligned} \widehat{G}:=B_n(\mu _1^{(1)}+\mu _1^{(2)})-\alpha _n\log \mu _1^{(1)}-\frac{1}{2n}\sum _{j=2}^n\log (\mu _1-\mu _j) \end{aligned}$$
(4.2)
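As a quick numerical check of the identities \(4\mu _1^{(1)}\mu _1^{(2)}=\mu _1\) and \(B_n(\mu _1^{(1)}+\mu _1^{(2)})=\sqrt{\alpha _n^2+\mu _1B_n^2}\) (the latter is used when \(\widehat{G}\) is evaluated in (4.11)), here with placeholder numerical values:

```python
import math

alpha, B, mu1 = 0.8, 1.1, 2.3                # placeholder values
s = math.sqrt(alpha ** 2 + mu1 * B ** 2)
m1 = (alpha + s) / (2 * B)                   # mu_1^(1)
m2 = (-alpha + s) / (2 * B)                  # mu_1^(2)
assert abs(4 * m1 * m2 - mu1) < 1e-12        # 4 mu_1^(1) mu_1^(2) = mu_1
assert abs(B * (m1 + m2) - s) < 1e-12        # B(m1 + m2) = sqrt(alpha^2 + mu1 B^2)
```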

In our computation of \(\widehat{G}\) as well as the contour integral, we need to work with sums of the form \(\frac{1}{n} \sum _{i=2}^n\frac{1}{(\mu _1-\mu _i)^l}\) for \(l\ge 1\). More specifically, we need the following lemma.

Lemma 4.1

For LOE eigenvalues, on the event \(\mathcal {E}_\varepsilon \), we have

$$\begin{aligned}{} & {} \frac{1}{n}\sum _{i=2}^n\frac{1}{\mu _1-\mu _i}-\frac{1}{\lambda ^{1/2}(1+\lambda ^{1/2})}=O(n^{-1/3}) \quad \text {and} \\{} & {} \frac{1}{n}\sum _{i=2}^n\frac{1}{(\mu _1-\mu _i)^l}=O(n^{\frac{2}{3} l-1}), \quad \text {for }l\ge 2. \end{aligned}$$

Proof

It suffices to prove the following statements:

  1. (i)

    For any \(l\ge 1\), on the event \(\mathcal {E}_\varepsilon \),

    $$\begin{aligned} \left| \frac{1}{n}\sum _{i=k}^n\frac{1}{(\mu _1-\mu _i)^l}-\int _{d_-}^{g_k}\frac{p_{{{\,\textrm{MP}\,}}}(x)}{(d_+-x)^l}\textrm{d}x \right| =O(n^{\frac{2}{3} l-1}). \end{aligned}$$
    (4.3)
  2. (ii)

    For any \(l\ge 1\) and any fixed k, on the event \(\mathcal {E}_\varepsilon \),

    $$\begin{aligned} \frac{1}{n}\sum _{i=2}^k\frac{1}{(\mu _1-\mu _i)^l}=O(n^{\frac{2}{3} l-1}). \end{aligned}$$
    (4.4)
  3. (iii)

    For the \(l=1\) case,

    $$\begin{aligned} \int _{g_k}^{d_+}\frac{p_{{{\,\textrm{MP}\,}}}(x)}{d_+-x}\textrm{d}x=O(n^{-\frac{1}{3}}). \end{aligned}$$
    (4.5)
  4. (iv)

    For the \(l\ge 2\) case,

    $$\begin{aligned} \int _{d_-}^{g_k}\frac{p_{{{\,\textrm{MP}\,}}}(x)}{(d_+-x)^l}\textrm{d}x=O(n^{\frac{2}{3} l-1}) \end{aligned}$$
    (4.6)

Verifying (ii) is straightforward after imposing the assumption \(\mu _1-\mu _i>cn^{-2/3}\) for some \(c>0\), which holds on the event \(\mathcal {F}^{(4)}_{r,R}\). Statements (iii) and (iv) follow from the definitions of \(p_{{{\,\textrm{MP}\,}}}\) and \(g_k\).

We now turn to (i). It follows from Lemma 2.8 that, on the event \(\mathcal {F}^{(2)}_K\),

$$\begin{aligned} \frac{1}{n}\sum _{i=K}^n\frac{1}{(d_+-\mu _i)^l}-\int _{d_-}^{g_K}\frac{p_{{{\,\textrm{MP}\,}}}(x)}{(d_+-x)^l}\textrm{d}x=O(n^{\frac{2}{3} l-1}). \end{aligned}$$

Thus, it remains only to show that

$$\begin{aligned} \frac{1}{n}\sum _{i=K}^n\left( \frac{1}{(\mu _1-\mu _i)^l}-\frac{1}{(d_+-\mu _i)^l}\right) =O(n^{\frac{2}{3} l-1}). \end{aligned}$$
(4.7)

This bound holds on the event \(\mathcal {F}^{(2)}_K\cap \mathcal {F}^{(3)}_{s,t}\), which can be seen by observing that

$$\begin{aligned} \begin{aligned} \left| \frac{1}{(\mu _1-\mu _i)^l}-\frac{1}{(d_+-\mu _i)^l}\right|&=\left| \frac{(d_+-\mu _1)\sum _{j=0}^{l-1}(d_+-\mu _i)^j(\mu _1-\mu _i)^{l-j-1}}{(\mu _1-\mu _i)^l(d_+-\mu _i)^l}\right| \\&\le \frac{l|d_+-\mu _1|}{\min \{|d_+-\mu _i|,\;|\mu _1-\mu _i|\}^{l+1}}, \end{aligned}\end{aligned}$$

and thus

$$\begin{aligned}\begin{aligned} \left| \frac{1}{n}\sum _{i=K}^n\left( \frac{1}{(\mu _1-\mu _i)^l}-\frac{1}{(d_+-\mu _i)^l}\right) \right|&=O\left( \frac{1}{n}\sum _{i=K}^n\frac{l\cdot n^{-2/3}}{(d_+-\mu _i)^{l+1}}\right) \\&=O\left( n^{-5/3}\int _K^n l\left( \frac{x}{n}\right) ^{-\frac{2}{3}(l+1)}\textrm{d}x\right) \\ {}&=O(n^{\frac{2}{3} l-1}). \end{aligned}\end{aligned}$$

\(\square \)
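The elementary bound \(\bigl |\frac{1}{(\mu _1-\mu _i)^l}-\frac{1}{(d_+-\mu _i)^l}\bigr |\le \frac{l|d_+-\mu _1|}{\min \{|d_+-\mu _i|,\;|\mu _1-\mu _i|\}^{l+1}}\) used above can be checked numerically; the sketch below samples positive values playing the roles of \(\mu _1-\mu _i\) and \(d_+-\mu _i\):

```python
import random

random.seed(0)
for l in (1, 2, 3):
    for _ in range(1000):
        a = random.uniform(1e-3, 1.0)        # plays the role of mu_1 - mu_i
        b = random.uniform(1e-3, 1.0)        # plays the role of d_+  - mu_i
        lhs = abs(1 / a ** l - 1 / b ** l)
        rhs = l * abs(b - a) / min(a, b) ** (l + 1)
        # small multiplicative slack guards against floating-point rounding
        assert lhs <= rhs * (1 + 1e-9)
```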

4.1 Computation of \(\widehat{G}(\mu _1^{(1)},\mu _1^{(2)})\)

Lemma 4.2

$$\begin{aligned} \widehat{G}{} & {} =A(d_+,B)-\frac{\log n}{3n}-\frac{1}{2n}\sum _{i=1}^n\log |d_+-\mu _i| \\{} & {} \quad +\frac{bn^{-\frac{1}{3}}\sqrt{\log n}}{\lambda ^{\frac{1}{4}}(1+\lambda )^{\frac{1}{2}}d_+}(\mu _1-d_+)+O(n^{-1}), \end{aligned}$$

where

$$\begin{aligned} A(x,B):=\sqrt{\alpha ^2+xB^2}-\alpha \log \left( \frac{\alpha +\sqrt{\alpha ^2+xB^2}}{B}\right) . \end{aligned}$$
(4.8)

Remark 4.3

The expression of \(\widehat{G}\) given by Lemma 4.2 contains two distinct random variables, \(\sum _{i=1}^{n}\log |d_+-\mu _i|\) and \(\mu _1-d_+\). Under appropriate translation and scaling, they are the quantities that give rise to the Gaussian and Tracy–Widom terms, respectively, in the convergence of free energy as stated in Theorem 1.1. The translation and scaling needed for these two random variables are, respectively, \(T_{1n}\) and \(T_{2n}\), given by

$$\begin{aligned} T_{1n}=\frac{C_\lambda n-\frac{1}{6}\log n-\sum _{i=1}^{n}\log |d_+-\mu _i|}{\sqrt{\frac{2}{3}\log n}}, \quad T_{2n}=\frac{n^{2/3}(\mu _1-d_+)}{\sqrt{\lambda }(1+\sqrt{\lambda })^{4/3}}, \end{aligned}$$
(4.9)

where \(C_\lambda \) is as in (1.5). The expression of \(\widehat{G}\) then reads

$$\begin{aligned} \widehat{G}{} & {} =A(d_+,B)-\frac{1}{2}C_\lambda -\frac{\log n}{4n}+\left( \frac{1}{\sqrt{6}}T_{1n}+\frac{\lambda ^{\frac{1}{4}}b}{(1+\lambda ^{\frac{1}{2}})^{\frac{2}{3}}(1+\lambda )^{\frac{1}{2}}}T_{2n}\right) \frac{\sqrt{\log n}}{n}\nonumber \\{} & {} \quad +O(n^{-1}). \end{aligned}$$
(4.10)
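The bookkeeping behind (4.10) can be verified numerically (with a placeholder \(\lambda \) and \(d_+=(1+\sqrt{\lambda })^2\)): substituting (4.9) into Lemma 4.2 reproduces the stated coefficients of \(T_{2n}\) and \(T_{1n}\) and the \(-\frac{\log n}{4n}\) term.

```python
import math

lam = 0.37                                   # placeholder aspect ratio
d_plus = (1 + math.sqrt(lam)) ** 2
# coefficient of T_2n obtained by inserting (4.9) into Lemma 4.2
from_lemma = math.sqrt(lam) * (1 + math.sqrt(lam)) ** (4 / 3) \
    / (lam ** 0.25 * math.sqrt(1 + lam) * d_plus)
stated = lam ** 0.25 / ((1 + math.sqrt(lam)) ** (2 / 3) * math.sqrt(1 + lam))
assert abs(from_lemma - stated) < 1e-12
# coefficient of T_1n: (1/2) * sqrt(2/3) = 1/sqrt(6)
assert abs(0.5 * math.sqrt(2 / 3) - 1 / math.sqrt(6)) < 1e-15
# log n bookkeeping: -1/3 (Lemma 4.2) + (1/2)*(1/6) (recentring of the log sum) = -1/4
assert abs(-1 / 3 + 1 / 12 + 1 / 4) < 1e-15
```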

Proof of Lemma 4.2

By definition,

$$\begin{aligned} \begin{aligned} \widehat{G}&=B_n(\mu _1^{(1)}+\mu _1^{(2)})-\alpha _n\log (\mu _1^{(1)})-\frac{1}{2n}\sum _{i=2}^n\log (\mu _1-\mu _i)\\&=\sqrt{\alpha _n^2+\mu _1B_n^2}-\alpha _n\log \left( \frac{\alpha _n+\sqrt{\alpha _n^2+\mu _1B_n^2}}{2B_n} \right) -\frac{1}{2n}\sum _{i=2}^n\log (\mu _1-\mu _i). \end{aligned} \nonumber \\ \end{aligned}$$
(4.11)

Replacing \(\alpha _n,B_n\) by \(\alpha ,B\), respectively (incurring an error of order \(n^{-1-\delta }\)), and applying a Taylor expansion with respect to \(\mu _1\) near \(d_+\), we obtain

$$\begin{aligned} \begin{aligned}&\sqrt{\alpha _n^2+\mu _1B_n^2}-\alpha _n\log \left( \frac{\alpha _n+\sqrt{\alpha _n^2+\mu _1B_n^2}}{2B_n} \right) \\ {}&\qquad =A(d_+,B) +\frac{B^2(\mu _1-d_+)}{2(\alpha +\sqrt{\alpha ^2+d_+B^2})} +O(n^{-1-\delta }). \end{aligned} \nonumber \\ \end{aligned}$$
(4.12)

Note that we have dropped the quadratic term in the Taylor expansion, which is \(O(n^{-4/3})\) since \(\mu _1-d_+=O(n^{-2/3})\) on the event under consideration. It remains to compute the summation in (4.11), which can be rewritten as

$$\begin{aligned} \sum _{i=2}^n\log (\mu _1-\mu _i)=\sum _{i=2}^n\log |d_+-\mu _i|-\frac{n(d_+-\mu _1)}{\lambda ^{\frac{1}{2}}(1+\lambda ^{\frac{1}{2}})}+E_1+E_2 \end{aligned}$$
(4.13)

where we define

$$\begin{aligned}{} & {} E_1=n(d_+-\mu _1)\left( \frac{1}{\lambda ^{\frac{1}{2}}(1+\lambda ^{\frac{1}{2}})}-\frac{1}{n}\sum _{i=2}^n\frac{1}{\mu _1-\mu _i} \right) ,\nonumber \\{} & {} E_2=\sum _{i=2}^n\left( \frac{d_+-\mu _1}{\mu _1-\mu _i}-\log \left| 1+\frac{d_+-\mu _1}{\mu _1-\mu _i}\right| \right) . \end{aligned}$$
(4.14)

We now show \(E_1+E_2=O(1)\), following an argument similar to that of Johnstone et al. in [36]. The bound \(E_1=O(1)\) follows from Lemma 4.1. To bound \(E_2\), observe that, on the event we are considering, there exist \(k\) and \(C\) such that \(\mu _1\le d_++Cn^{-2/3}\) and \(\mu _k\le d_+-Cn^{-2/3}\). For any fixed i, we also have \(d_+-\mu _1=O(n^{-2/3})\) and \(\mu _1-\mu _i=\Theta (n^{-2/3})\). This implies

$$\begin{aligned} \sum _{i=2}^{k-1}\left( \frac{d_+-\mu _1}{\mu _1-\mu _i}-\log \left| 1+\frac{d_+-\mu _1}{\mu _1-\mu _i}\right| \right) =O(1). \end{aligned}$$

To bound the sum over the indices above k, we observe that, for \(i\ge k\), we have \(\frac{d_+-\mu _1}{\mu _1-\mu _i}\ge -\frac{1}{2}\) and, for any \(x\ge -\frac{1}{2}\), there is \(C_1\) such that \(|\log (1+x)-x|\le C_1x^2\). This gives us

$$\begin{aligned} \sum _{i=k}^n\left( \frac{d_+-\mu _1}{\mu _1-\mu _i}-\log \left| 1+\frac{d_+-\mu _1}{\mu _1-\mu _i}\right| \right) =O(1). \end{aligned}$$

Finally, combining the results above and observing that \(\frac{1}{2n}\log |d_+-\mu _1|=-\frac{\log n}{3n}+O(n^{-1})\), we get

$$\begin{aligned} \widehat{G}(\mu _1^{(1)},\mu _1^{(2)}){} & {} =A(d_+,B)-\frac{\log n}{3n}-\frac{1}{2n}\sum _{i=1}^n\log |d_+-\mu _i|+c_2(B)(\mu _1-d_+)\nonumber \\{} & {} +O(n^{-1}),\end{aligned}$$
(4.15)

where

$$\begin{aligned} c_2(B)=\frac{B^2}{2(\alpha +\sqrt{\alpha ^2+d_+B^2})}-\frac{1}{2\lambda ^{1/2}(1+\lambda ^{1/2})}. \end{aligned}$$
(4.16)

Recall that \(B_c\) is defined to be the quantity satisfying

$$\begin{aligned} \frac{\sqrt{\alpha ^2+d_+B_c^2}-\alpha }{d_+}=\int \frac{p_{{{\,\textrm{MP}\,}}}(x)}{d_+-x}\textrm{d}x=\frac{1}{\lambda ^{\frac{1}{2}}(1+\lambda ^{\frac{1}{2}})}. \end{aligned}$$
(4.17)
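Both equalities in (4.17) can be checked numerically. The sketch below (placeholder \(\lambda \)) integrates the Marchenko–Pastur density, assuming the standard normalization \(p_{{{\,\textrm{MP}\,}}}(x)=\frac{\sqrt{(d_+-x)(x-d_-)}}{2\pi \lambda x}\) on \([d_-,d_+]\) with \(d_\pm =(1\pm \sqrt{\lambda })^2\), after the substitution \(x=d_-+(d_+-d_-)\sin ^2 t\) that removes the edge singularities, and verifies that \(B_c=\lambda ^{-3/4}\) satisfies the first equality.

```python
import math

lam = 0.45                                   # placeholder aspect ratio in (0, 1]
d_minus, d_plus = (1 - math.sqrt(lam)) ** 2, (1 + math.sqrt(lam)) ** 2

def s_mp_at_edge(steps=200000):
    # ∫ p_MP(x)/(d_+ - x) dx via x = d_- + (d_+ - d_-) sin^2(t);
    # the transformed integrand (d_+ - d_-) sin^2(t) / (pi * lam * x) is smooth
    total, h = 0.0, (math.pi / 2) / steps
    for k in range(steps):
        t = (k + 0.5) * h
        x = d_minus + (d_plus - d_minus) * math.sin(t) ** 2
        total += (d_plus - d_minus) * math.sin(t) ** 2 / (math.pi * lam * x) * h
    return total

closed = 1 / (math.sqrt(lam) * (1 + math.sqrt(lam)))
assert abs(s_mp_at_edge() - closed) < 1e-6
# B_c = lam^{-3/4} satisfies the first equality in (4.17)
B_c, alpha = lam ** -0.75, 0.5 * (1 / lam - 1)
lhs = (math.sqrt(alpha ** 2 + d_plus * B_c ** 2) - alpha) / d_plus
assert abs(lhs - closed) < 1e-12
```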

Using this definition along with a Taylor expansion of \(c_2\) near \(B=B_c=\lambda ^{-\frac{3}{4}}\), we get

$$\begin{aligned}\begin{aligned} c_2(B)&=\frac{B_c}{2\sqrt{\alpha ^2+d_+B_c^2}}(B-B_c)+O((B-B_c)^2)\\&=\frac{bn^{-\frac{1}{3}}\sqrt{\log n}}{\lambda ^{\frac{1}{4}}(1+\lambda )^{\frac{1}{2}}d_+}+O(n^{-2/3}\log n). \end{aligned} \end{aligned}$$

Applying this to (4.15), we obtain the lemma.\(\square \)
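The Taylor step above can be sanity-checked numerically: with a placeholder \(\lambda \), the function \(c_2\) of (4.16) vanishes at \(B_c=\lambda ^{-3/4}\) by (4.17), and a central difference reproduces the claimed derivative \(B_c/(2\sqrt{\alpha ^2+d_+B_c^2})\).

```python
import math

lam = 0.45                                   # placeholder aspect ratio
alpha = 0.5 * (1 / lam - 1)
d_plus = (1 + math.sqrt(lam)) ** 2
B_c = lam ** -0.75

def c2(B):
    # the function c_2(B) from (4.16)
    return B ** 2 / (2 * (alpha + math.sqrt(alpha ** 2 + d_plus * B ** 2))) \
        - 1 / (2 * math.sqrt(lam) * (1 + math.sqrt(lam)))

assert abs(c2(B_c)) < 1e-12                  # c_2 vanishes at B_c by (4.17)
h = 1e-6
num_deriv = (c2(B_c + h) - c2(B_c - h)) / (2 * h)
exact = B_c / (2 * math.sqrt(alpha ** 2 + d_plus * B_c ** 2))
assert abs(num_deriv - exact) < 1e-6
```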

4.2 Contour Integral Analysis

We now derive the asymptotics of the rescaled double integral

$$\begin{aligned} S_n:=\exp (-n\widehat{G})Q_n=\int _{-\infty }^\infty \int _{-\infty }^\infty \exp [n(G(\gamma _1+\textrm{i}y_1,\gamma _2+\textrm{i}y_2)-\widehat{G})]\textrm{d}y_2\textrm{d}y_1.\nonumber \\ \end{aligned}$$
(4.18)

The analysis holds on the following event \(\mathcal {F}_\varepsilon \), defined for arbitrarily small \(\varepsilon >0\).

Lemma 4.4

For each \(\varepsilon >0\), there exist positive numbers rst, and C, depending on \(\varepsilon \), such that the event \(\mathcal {F}_\varepsilon \) given by

$$\begin{aligned} \mathcal {F}_\varepsilon{} & {} =\left\{ \left| \sum _{j=2}^n\frac{1}{n^{\frac{2}{3}}(\mu _1-\mu _j)}-s_{{{\,\textrm{MP}\,}}}(d_+)\right| \le C\right\} \\ {}{} & {} \quad \cap \left\{ \sum _{j=2}^n\frac{1}{n^{\frac{4}{3}}(\mu _1-\mu _j)^2}\le C\right\} \cap \mathcal {F}^{(3)}_{s,t}\cap \mathcal {F}^{(4)}_{r,R}\end{aligned}$$

satisfies \(\mathbb {P}(\mathcal {F}_\varepsilon )>1-\varepsilon \).

We note that the definition of \(\mathcal {F}_\varepsilon \) is not unique as it depends on the choice of \(s,t,r,R\), and C. For any given \(\varepsilon >0\), we fix the values \(s,t,r,R,C\) and define \(\mathcal {F}_\varepsilon \) accordingly.

Proof

First, for some \(C>0\), each of the two events that involve \(\frac{1}{n^{2/3}(\mu _1-\mu _j)}\), with this C as upper bound, holds with probability at least \(1-\varepsilon /4\) by Lemma 4.1. Meanwhile, by Lemma 2.5, we can find \(0<s<t\) and \(0<r<R\) such that each of the events \(\mathcal {F}^{(3)}_{s,t}\) and \(\mathcal {F}^{(4)}_{r,R}\) holds with probability at least \(1-\varepsilon /4\). A union bound over the four events then gives \(\mathbb {P}(\mathcal {F}_\varepsilon )>1-\varepsilon \). \(\square \)

Since the integral representation of the partition function only requires \(\gamma _1,\gamma _2>0\) such that \(4\gamma _1\gamma _2>\mu _1\), we set \(\gamma _1=\mu _1^{(1)}\) and \(\gamma _2=\mu _1^{(2)}+n^{-1}\) in the low-temperature case. The shift \(n^{-1}\) in \(\gamma _2\) is due to the deformation \(\hat{\gamma }_2\), given in (4.20), that we later apply to the integral in the \(y_2\) variable. The order \(n^{-1}\) is needed to cancel out a term of order n of the function in the exponent (see, for example, (4.23)). Thus,

$$\begin{aligned} S_n=\int _{-\infty }^\infty \int _{-\infty }^\infty \exp [n(G(\mu _1^{(1)}+\textrm{i}y_1,\mu _1^{(2)}+n^{-1}+\textrm{i}y_2)-\widehat{G})]\textrm{d}y_2\textrm{d}y_1. \end{aligned}$$
(4.19)

In the remainder of the subsection, we prove the following lemma, for fixed \(\varepsilon >0\) sufficiently small (e.g., \(0<\varepsilon <\frac{1}{100}\)).

Lemma 4.5

On the event \(\mathcal {F}_\varepsilon \),

$$\begin{aligned} S_n= {\left\{ \begin{array}{ll} e^{O(1)}n^{-\frac{5}{6}}\left( b\sqrt{\log n}\right) ^{-\frac{1}{2}}, &{}\quad b>0,\\ e^{O(\log \log n)}n^{-\frac{5}{6}}, &{}\quad b=0. \end{array}\right. } \end{aligned}$$

By Lemma 3.9 of [13], the part of the double integral \(S_n\) with \(|y_1|>n^{-\frac{1}{2}+\varepsilon }\) is \(O(e^{-n^\varepsilon })\) with high probability. For \(|y_1|<n^{-\frac{1}{2}+\varepsilon }\), we modify the \(z_2\)-integral by replacing the vertical contour \(z_2=\gamma _2+\textrm{i}y_2\), \(y_2\in \mathbb {R}\) with the contour \(z_2=\hat{\gamma }_2+\textrm{i}y_2\), \(y_2\in \mathbb {R}\), where \(\hat{\gamma }_2\) is defined for each \(y_1\) by

$$\begin{aligned} \hat{\gamma }_2(y_1)=\frac{\mu _1^{(1)}(\mu _1^{(2)}+n^{-1})}{\mu _1^{(1)}+\textrm{i}y_1}. \end{aligned}$$
(4.20)

The new contour is a modification of the one introduced by Baik and Lee in [13]. As in [13], we observe that the change in \(G(z_1,z_2)\) for \((z_1,z_2)\) near \((\mu _1^{(1)},\mu _1^{(2)})\) is driven mainly by the change in the product \(z_1z_2\), rather than by changes in \(z_1\), \(z_2\) individually with \(z_1z_2\) held fixed, since the main contribution comes from the singular term involving \(4z_1z_2-\mu _1\). This suggests that the behavior of \(G(\mu _1^{(1)}+\textrm{i}y_1,\hat{\gamma }_2)\) should be similar to that of \(G(\mu _1^{(1)},\mu _1^{(2)}+n^{-1})\) for the current range of \(y_1\).

Note that this deformation for each \(z_1=\mu _1^{(1)}+\textrm{i}y_1\) is valid. Indeed, if \((z_1,z_2)\) is a point on the branch cut of the logarithmic function in G, then \(4z_1z_2-\mu _1\) is real and non-positive. That is, for some \(r\ge 0\),

$$\begin{aligned} \mathop {\textrm{Re}}\limits z_2=\mathop {\textrm{Re}}\limits \left( \frac{\mu _1-r}{4(\mu _1^{(1)}+\textrm{i}y_1)}\right) =\mathop {\textrm{Re}}\limits \left( \frac{\mu _1-r}{4\mu _1^{(1)}(\mu _1^{(2)}+n^{-1})}\hat{\gamma }_2\right) <\mathop {\textrm{Re}}\limits \hat{\gamma }_2. \end{aligned}$$

This implies that the deformed contour does not cross the branch cut. Thus, the part of \(S_n\) with \(|y_1|<n^{-1/2+\varepsilon }\) is equal to

$$\begin{aligned} \int _{-n^{-1/2+\varepsilon }}^{n^{-1/2+\varepsilon }}\int _{-\infty }^\infty \exp [n(G(\mu _1^{(1)}+\textrm{i}y_1,\hat{\gamma }_2+\textrm{i}y_2)-\widehat{G})]\textrm{d}y_2\textrm{d}y_1. \end{aligned}$$

We now carry out the analysis of this double integral, first truncating the \(y_2\)-integral. For given \(y_1,y_2\in \mathbb {R}\),

$$\begin{aligned} \begin{aligned}&G(\mu _1^{(1)}+\textrm{i}y_1,\hat{\gamma }_2+\textrm{i}y_2)-\widehat{G}\\&\quad = B_n\left( \textrm{i}(y_1+y_2)+\frac{\mu _1^{(2)}+n^{-1}}{1+\textrm{i}\frac{y_1}{\mu _1^{(1)}}}-\mu _1^{(2)}\right) -\alpha _n\log \left( 1+\frac{\textrm{i}y_1}{\mu _1^{(1)}}\right) \\&\qquad -\frac{1}{2n}\sum _{j=2}^n\log \left( 1+\frac{4\mu _1^{(1)}n^{-1}-4y_1y_2}{\mu _1-\mu _j}+\textrm{i}\frac{4\mu _1^{(1)}y_2}{\mu _1-\mu _j}\right) \\&\qquad -\frac{1}{2n}\log (4\mu _1^{(1)}n^{-1}-4y_1y_2+\textrm{i}4\mu _1^{(1)}y_2). \end{aligned} \nonumber \\ \end{aligned}$$
(4.21)

Our truncation procedure, which relies on bounding \(|G(\mu _1^{(1)}+\textrm{i}y_1,\hat{\gamma }_2+\textrm{i}y_2)-\widehat{G}|\), aligns rather closely with the arguments in [13], where the focus is instead the difference \(|G(\gamma _1+\textrm{i}y_1,\hat{\gamma }_2+\textrm{i}y_2)-G(\gamma _1,\gamma _2)|\). After truncating in the \(y_1\) variable, the contribution from the part \(|y_2|>n^{-\frac{1}{2}+\varepsilon }\) is as follows.

Lemma 4.6

The following bound holds for the truncated integral.

$$\begin{aligned}{} & {} \int _{|y_1|\le n^{-\frac{1}{2}+\varepsilon }}\int _{|y_2|>n^{-\frac{1}{2}+\varepsilon }}\exp [n(G(\mu _1^{(1)}+\textrm{i}y_1,\hat{\gamma }_2+\textrm{i}y_2)-\widehat{G})]\textrm{d}y_2\textrm{d}y_1 = O(n^{-1}).\nonumber \\ \end{aligned}$$
(4.22)

Proof

From (4.21),

$$\begin{aligned} \begin{aligned}&\mathop {\textrm{Re}}\limits \left[ n\left( G(\mu _1^{(1)}+\textrm{i}y_1,\hat{\gamma }_2+\textrm{i}y_2)-\widehat{G}\right) \right] \\&\quad =\frac{B_n(\mu _1^{(1)})^2-nB_n\mu _1^{(2)}y_1^2}{(\mu _1^{(1)})^2+y_1^2}-\frac{\alpha _n n}{2}\log \left( 1+\left( \frac{y_1}{\mu _1^{(1)}}\right) ^2\right) \\&\qquad -\frac{1}{4}\log \left( \left( \frac{4\mu _1^{(1)}}{n}-4y_1 y_2\right) ^2+(4\mu _1^{(1)}y_2)^2\right) \\&\qquad -\frac{1}{4}\sum _{j=2}^n\log \left( \left( 1+\frac{4\mu _1^{(1)}n^{-1}-4y_1 y_2}{\mu _1-\mu _j}\right) ^2+\frac{(4\mu _1^{(1)}y_2)^2}{(\mu _1-\mu _j)^2}\right) . \end{aligned} \nonumber \\ \end{aligned}$$
(4.23)

Applying a Taylor expansion in \(y_1\) around 0 to the first two terms on the right-hand side of (4.23), and bounding the third term by \(-\frac{1}{2}\log \left( 4\mu _1^{(1)}|y_2|\right) \), we find that, for some \(c>0\), the first three terms have upper bound

$$\begin{aligned} c_0-cny_1^2-\frac{1}{2}\log \left( 4\mu _1^{(1)}|y_2|\right) , \quad \text {uniformly in } |y_1|\le n^{-\frac{1}{2}+\varepsilon }. \end{aligned}$$

For the sum of logarithms, by considering the cases \(y_1y_2>0\) and \(y_1y_2<0\) as in [13], there exists \(c'>0\) such that for all \(j\in \{2,3,\dots , n\}\), for all \(|y_1|<n^{-\frac{1}{2}+\varepsilon }\) and \(|y_2|>n^{-\frac{1}{2}+\varepsilon }\),

$$\begin{aligned} \left( 1+\frac{4\mu _1^{(1)}n^{-1}-4y_1 y_2}{\mu _1-\mu _j}\right) ^2+\frac{(4\mu _1^{(1)}y_2)^2}{(\mu _1-\mu _j)^2}\ge 1+c'y_2^2. \end{aligned}$$

Therefore,

$$\begin{aligned} \mathop {\textrm{Re}}\limits \left[ n\left( G(\mu _1^{(1)}+\textrm{i}y_1,\hat{\gamma }_2+\textrm{i}y_2)-\widehat{G}\right) \right]{} & {} \le c_0-cny_1^2-\frac{1}{2}\log \left( 4\mu _1^{(1)}|y_2|\right) \\ {}{} & {} \quad -\frac{n}{4}\log (1+c'y_2^2), \end{aligned}$$

and the left-hand side of (4.22) has upper bound

$$\begin{aligned} \int _{|y_1|\le n^{-\frac{1}{2}+\varepsilon }}\int _{|y_2|>n^{-\frac{1}{2}+\varepsilon }}e^{c_0-cny_1^2}e^{-\frac{n}{4}\log (1+c'y_2^2)}(4\mu _1^{(1)}|y_2|)^{-\frac{1}{2}}\textrm{d}y_2\textrm{d}y_1, \end{aligned}$$

which is a product of a \(y_1\)-integral and a \(y_2\)-integral. Each individual integral is \(O(n^{-\frac{1}{2}})\), so we obtain the lemma. \(\square \)

The computation of \(S_n\) is now reduced to that of the same integral, over the subset \(|y_1|\le n^{-\frac{1}{2}+\varepsilon }\) and \(|y_2|<n^{-\frac{1}{2}+\varepsilon }\). However, we need to truncate the \(y_2\)-integral further.

Lemma 4.7

For this further truncation, we have the following bound.

$$\begin{aligned}{} & {} \int _{|y_1|\le n^{-\frac{1}{2}+\varepsilon }}\int _{n^{-\frac{2}{3}+2\varepsilon }<|y_2|<n^{-\frac{1}{2}+\varepsilon }}\exp [n(G(\mu _1^{(1)}+\textrm{i}y_1,\hat{\gamma }_2+\textrm{i}y_2)-\widehat{G})]\textrm{d}y_2\textrm{d}y_1 \\ {}{} & {} = O(e^{-n^{4\varepsilon }}). \end{aligned}$$

Proof

Computations similar to the proof of Lemma 4.6 give

$$\begin{aligned} \begin{aligned}&\mathop {\textrm{Re}}\limits \left[ n\left( G(\mu _1^{(1)}+\textrm{i}y_1,\hat{\gamma }_2+\textrm{i}y_2)-\widehat{G}\right) \right] \\&\quad \le c_0 -\frac{1}{4}\sum _{j=2}^n\log \left( \left( 1+\frac{4\mu _1^{(1)}n^{-1}-4y_1 y_2}{\mu _1-\mu _j}\right) ^2+\frac{(4\mu _1^{(1)}y_2)^2}{(\mu _1-\mu _j)^2}\right) . \end{aligned}\nonumber \\ \end{aligned}$$
(4.24)

Observe that \(n^{-\frac{2}{3}}\ll \mu _1-\mu _{n^{4\varepsilon }}\ll n^{-\frac{2}{3}+2\varepsilon }\). Thus, for \(2\le j\le n^{4\varepsilon }\), \(\left( \frac{4\mu _1^{(1)}y_2}{\mu _1-\mu _j}\right) ^2 \ge (4\mu _1^{(1)})^2\), and we obtain

$$\begin{aligned} -\frac{1}{4}\sum _{j=2}^{n^{4\varepsilon }}\log \left( \left( 1+\frac{4\mu _1^{(1)}n^{-1}-4y_1 y_2}{\mu _1-\mu _j}\right) ^2+\frac{(4\mu _1^{(1)}y_2)^2}{(\mu _1-\mu _j)^2}\right) \le -\frac{1}{2}\log (4\mu _1^{(1)})n^{4\varepsilon }. \end{aligned}$$

For \(j>n^{4\varepsilon }\), \(\mu _1-\mu _j\ge \mu _1-\mu _{n^{4\varepsilon }}\gg n^{-\frac{2}{3}}\). Since \(|y_1|\), \(|y_2|\le n^{-\frac{1}{2}+\varepsilon }\), we have \(\mu _1-\mu _j \gg |4\mu _1^{(1)}n^{-1}-4y_1y_2|\). Using \(\log (1-x)\ge -2x\) for \(x\in (0,\frac{1}{2}]\), then for some constants \(C,C'>0\), the sum with indices \(j>n^{4\varepsilon }\) on the right-hand side of (4.24) has upper bound

$$\begin{aligned} \begin{aligned} -\frac{1}{2}\sum _{j=n^{4\varepsilon }+1}^{n}\log \left( 1+\frac{4\mu _1^{(1)}n^{-1}-4y_1 y_2}{\mu _1-\mu _j}\right)&\le 4|\mu _1^{(1)}n^{-1}-y_1y_2|\sum _{j=n^{4\varepsilon }+1}^{n}\frac{1}{\mu _1-\mu _j}\\&\le Cn|\mu _1^{(1)}n^{-1}-y_1y_2|\le C'n^{2\varepsilon }. \end{aligned} \end{aligned}$$

Here, the second inequality holds with probability at least \(1-\varepsilon \) by Lemma 4.1. Thus, we obtain the uniform bound

$$\begin{aligned} \mathop {\textrm{Re}}\limits \left[ n\left( G(\mu _1^{(1)}+\textrm{i}y_1,\hat{\gamma }_2+\textrm{i}y_2)-\widehat{G}\right) \right] \le c_0-Cn^{4\varepsilon } \end{aligned}$$

for some constant \(C>0\). This implies the lemma. \(\square \)

Therefore, we have shown that

$$\begin{aligned} S_n{} & {} = \int _{ -n^{-\frac{1}{2}+\varepsilon }}^{n^{-\frac{1}{2}+\varepsilon }}\int _{-n^{-\frac{2}{3}+2\varepsilon }}^{n^{-\frac{2}{3}+2\varepsilon }}\exp [n(G(\mu _1^{(1)}+\textrm{i}y_1,\hat{\gamma }_2+\textrm{i}y_2)-\widehat{G})]\textrm{d}y_2\textrm{d}y_1 + O(n^{-1}). \nonumber \\ \end{aligned}$$
(4.25)

We proceed to compute the double integral in (4.25). For \(|y_1|\le n^{-\frac{1}{2}+\varepsilon }\) and \(|y_2|<n^{-\frac{2}{3}+2\varepsilon }\), by Taylor series and the definitions of \(\mu _1^{(1)}\) and \(\mu _1^{(2)}\) in (4.1), the second line of (4.21) for \(G(\mu _1^{(1)}+\textrm{i}y_1,\hat{\gamma }_2+\textrm{i}y_2)-\widehat{G}\) is

$$\begin{aligned} B_n(n^{-1}+\textrm{i}y_2)-\textrm{i}\frac{B_n n^{-1}}{\mu _1^{(1)}}y_1-\frac{B_n(\frac{\mu _1^{(1)}+\mu _1^{(2)}}{2}+n^{-1})}{(\mu _1^{(1)})^2}y_1^2+O(y_1^3), \end{aligned}$$

while the last line, after factorizing the arguments of logarithm functions, becomes

$$\begin{aligned}{} & {} -\frac{1}{2n}\sum _{j=2}^n\log \left( 1+\frac{4\mu _1^{(1)}n^{-1}+4\textrm{i}\mu _1^{(1)}y_2}{\mu _1-\mu _j}\right) -\frac{1}{2n}\log (4\mu _1^{(1)}n^{-1}+4\textrm{i}\mu _1^{(1)}y_2)\nonumber \\{} & {} \quad -\frac{1}{2n}\sum _{j=2}^n\log \left( 1-\frac{4y_1y_2}{\mu _1-\mu _j+4\mu _1^{(1)}n^{-1}+4\textrm{i}\mu _1^{(1)}y_2}\right) \nonumber \\{} & {} \quad -\frac{1}{2n}\log \left( 1-\frac{4y_1y_2}{4\mu _1^{(1)}n^{-1}+4\textrm{i}\mu _1^{(1)}y_2}\right) . \end{aligned}$$
(4.26)

Combining the above two displays, we obtain

$$\begin{aligned} \begin{aligned}&\exp [n(G(\mu _1^{(1)}+\textrm{i}y_1,\hat{\gamma }_2+\textrm{i}y_2)-\widehat{G})]\\&\quad =\exp \left[ -\frac{\textrm{i}B_n }{\mu _1^{(1)}}y_1-\frac{B_n(\frac{\mu _1^{(1)}+\mu _1^{(2)}}{2}+n^{-1})}{(\mu _1^{(1)})^2}ny_1^2\right] \\&\qquad \cdot \exp \left[ B_nn(n^{-1}+\textrm{i}y_2)-\frac{1}{2}\log (4\mu _1^{(1)}n^{-1}+4\textrm{i}\mu _1^{(1)}y_2)\right. \\&\qquad \left. -\frac{1}{2}\sum _{j=2}^n\log (1+\frac{4\mu _1^{(1)}n^{-1}+4\textrm{i}\mu _1^{(1)}y_2}{\mu _1-\mu _j})\right] \\&\qquad \cdot \exp \left[ -\frac{1}{2}\sum _{j=1}^n\log \left( 1-\frac{4y_1y_2}{\mu _1-\mu _j+4\mu _1^{(1)}n^{-1}+4\textrm{i}\mu _1^{(1)}y_2}\right) +O(ny_1^3)\right] . \end{aligned} \nonumber \\ \end{aligned}$$
(4.27)

Let \(H(y_1,y_2)\) denote the product of the first two exponential factors on the right-hand side of (4.27) and \(L(y_1,y_2)\) be the last factor. That is,

$$\begin{aligned} \exp [n(G(\mu _1^{(1)}+\textrm{i}y_1,\hat{\gamma }_2+\textrm{i}y_2)-\widehat{G})]=H(y_1,y_2)L(y_1,y_2). \end{aligned}$$

There is a constant \(c>0\) such that \(\frac{B_n(\frac{\mu _1^{(1)}+\mu _1^{(2)}}{2}+n^{-1})}{(\mu _1^{(1)})^2}>c\), so

$$\begin{aligned} \begin{aligned} |H(y_1,y_2)|&\le \exp \left[ B_n-\frac{B_n(\frac{\mu _1^{(1)}+\mu _1^{(2)}}{2}+n^{-1})}{(\mu _1^{(1)})^2}ny_1^2\right. \\&\quad \left. -\frac{1}{2}\mathop {\textrm{Re}}\limits \sum _{j=2}^n\log (1+\frac{4\mu _1^{(1)}n^{-1}+4\textrm{i}\mu _1^{(1)}y_2}{\mu _1-\mu _j})\right] \\&\le \exp \left[ B_n-cny_1^2-\frac{1}{2}\log \left( \frac{4\mu _1^{(1)}|y_2|}{\mu _1-\mu _2}\right) \right] \\&\le C(\mu _1-\mu _2)^{\frac{1}{2}}|y_2|^{-\frac{1}{2}} e^{-cny_1^2}, \end{aligned} \nonumber \\ \end{aligned}$$
(4.28)

for some constant \(C>0\). On the other hand, by Lemma 4.1, there exists a constant \(C>0\) such that

$$\begin{aligned} \sum _{j=2}^{n}\frac{1}{|\mu _1-\mu _j+4\mu _1^{(1)}n^{-1}+4i\mu _1^{(1)}y_2|^\ell }\le \sum _{j=2}^{n}\frac{1}{(\mu _1-\mu _j)^\ell }\nonumber \\ \le {\left\{ \begin{array}{ll} Cn^{1+\varepsilon }, &{}\quad \ell =1,\\ Cn^{\frac{2\ell }{3}+\varepsilon }, &{}\quad \ell =2,3,\dots \end{array}\right. } \end{aligned}$$
(4.29)

At the same time,

$$\begin{aligned} \left| \frac{4y_1y_2}{4\mu _1^{(1)}n^{-1}+4\textrm{i}\mu _1^{(1)}y_2}\right| \le \frac{|y_1|}{\mu _1^{(1)}}=O(n^{-\frac{1}{2}+\varepsilon }). \end{aligned}$$

Thus, applying Taylor series, we have

$$\begin{aligned} \begin{aligned} L(y_1,y_2)=1&+\sum _{j=1}^{n}\frac{2y_1y_2}{\mu _1-\mu _j+4\mu _1^{(1)}n^{-1}+4\textrm{i}\mu _1^{(1)}y_2}+ O(n^{-\frac{1}{2}+3\varepsilon }). \end{aligned} \end{aligned}$$
(4.30)

Observe that

$$\begin{aligned} |\mu _1-\mu _j+4\mu _1^{(1)}n^{-1}+4\textrm{i}\mu _1^{(1)}y_2|\ge {\left\{ \begin{array}{ll} \mu _1-\mu _j, &{}\quad j=2,3,\dots , n,\\ 4\mu _1^{(1)}|y_2|, &{}\quad j=1. \end{array}\right. } \end{aligned}$$

Applying (4.29) with \(\ell =1\), we obtain

$$\begin{aligned} |L(y_1,y_2)-1|\le Cn|y_1y_2| +C'n^{-\frac{1}{2}+3\varepsilon }. \end{aligned}$$
(4.31)

We now write

$$\begin{aligned}{} & {} \int _{ -n^{-\frac{1}{2}+\varepsilon }}^{n^{-\frac{1}{2}+\varepsilon }}\int _{-n^{-\frac{2}{3}+2\varepsilon }}^{n^{-\frac{2}{3}+2\varepsilon }}\exp [n(G(\mu _1^{(1)}+\textrm{i}y_1,\hat{\gamma }_2+\textrm{i}y_2)-\widehat{G})]\textrm{d}y_2\textrm{d}y_1 =I_1+I_2,\nonumber \\ \end{aligned}$$
(4.32)

where \(I_2\) is given by

$$\begin{aligned} I_2=\int _{ -n^{-\frac{1}{2}+\varepsilon }}^{n^{-\frac{1}{2}+\varepsilon }}\int _{-n^{-\frac{2}{3}+2\varepsilon }}^{n^{-\frac{2}{3}+2\varepsilon }} H(y_1,y_2)(L(y_1,y_2)-1) \textrm{d}y_2\textrm{d}y_1. \end{aligned}$$
(4.33)

By (4.28) and (4.31), there are constants \(C_j>0\), \(j=1,2,3\), such that

$$\begin{aligned} \begin{aligned} |I_2|&\le C_1\int _{ -n^{-\frac{1}{2}+\varepsilon }}^{n^{-\frac{1}{2}+\varepsilon }}\int _{-n^{-\frac{2}{3}+2\varepsilon }}^{n^{-\frac{2}{3}+2\varepsilon }} |H(y_1,y_2)|\left( n|y_1y_2|+n^{-\frac{1}{2}+3\varepsilon }\right) \textrm{d}y_2\textrm{d}y_1\\&\le C_2n(\mu _1{-}\mu _2)^{\frac{1}{2}} \int _{ {-}n^{{-}\frac{1}{2}+\varepsilon }}^{n^{-\frac{1}{2}+\varepsilon }}\int _{{-}n^{-\frac{2}{3}{+}2\varepsilon }}^{n^{-\frac{2}{3}+2\varepsilon }} e^{-cny_1^2}\left( |y_1||y_2|^{\frac{1}{2}}+n^{-\frac{3}{2}+3\varepsilon }|y_2|^{-\frac{1}{2}}\right) \textrm{d}y_2\textrm{d}y_1\\&\le C_3 n^{-1+4\varepsilon }(\mu _1-\mu _2)^{\frac{1}{2}}. \end{aligned}\nonumber \\ \end{aligned}$$
(4.34)

Together with (4.25) and (4.32), this implies that on the event \(\mu _1-\mu _2\le n^{-2/3+\varepsilon }\),

$$\begin{aligned} S_n=I_1+O(n^{-1}). \end{aligned}$$

Note that

$$\begin{aligned} I_1=\int _{ -n^{-\frac{1}{2}+\varepsilon }}^{n^{-\frac{1}{2}+\varepsilon }}\int _{-n^{-\frac{2}{3}+2\varepsilon }}^{n^{-\frac{2}{3}+2\varepsilon }} H(y_1,y_2)\textrm{d}y_2\textrm{d}y_1 \end{aligned}$$
(4.35)

is equal to the product of two single integrals \(I_{11}\) and \(I_{12}\) as follows. First,

$$\begin{aligned} \begin{aligned} I_{11}&=\int _{ -n^{-\frac{1}{2}+\varepsilon }}^{n^{-\frac{1}{2}+\varepsilon }}\exp \left[ -\frac{\textrm{i}B_n }{\mu _1^{(1)}}y_1-\frac{B_n(\frac{\mu _1^{(1)}+\mu _1^{(2)}}{2}+n^{-1})}{(\mu _1^{(1)})^2}ny_1^2\right] \textrm{d}y_1\\&= n^{-\frac{1}{2}}\int _{ -n^{\varepsilon }}^{n^{\varepsilon }}e^{-c_1x^2}\cos \left( \frac{c_2}{\sqrt{n}}x\right) \textrm{d}x, \\ (c_1,c_2)&:=\left( \frac{B_n(\frac{\mu _1^{(1)}+\mu _1^{(2)}}{2}+n^{-1})}{(\mu _1^{(1)})^2},\frac{B_n}{\mu _1^{(1)}}\right) . \end{aligned} \end{aligned}$$

Using the Taylor series of cosine, we obtain that for some \(C>0\),

$$\begin{aligned} I_{11}=Cn^{-\frac{1}{2}}\left( 1+O(n^{-1+2\varepsilon })\right) . \end{aligned}$$
(4.36)
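The constant \(C\) can be traced to the classical Gaussian integral \(\int _{\mathbb {R}} e^{-c_1x^2}\cos (bx)\,\textrm{d}x=\sqrt{\pi /c_1}\,e^{-b^2/(4c_1)}\); a numerical sketch with placeholder values for the constants \(c_1,c_2\):

```python
import math

def gauss_cos(c1, b, T=8.0, steps=200000):
    # midpoint rule for ∫ e^{-c1 x^2} cos(b x) dx on [-T, T];
    # the tail beyond T is negligible for c1 of order 1
    h = 2 * T / steps
    return h * sum(math.exp(-c1 * x * x) * math.cos(b * x)
                   for x in ((-T + (k + 0.5) * h) for k in range(steps)))

n = 10 ** 4
c1, c2 = 1.3, 0.7                            # placeholder values of the constants
approx = gauss_cos(c1, c2 / math.sqrt(n))
closed = math.sqrt(math.pi / c1) * math.exp(-c2 ** 2 / (4 * c1 * n))
assert abs(approx - closed) < 1e-7
# the cosine factor perturbs the pure Gaussian integral only at order n^{-1}
assert abs(closed - math.sqrt(math.pi / c1)) \
    < c2 ** 2 / (4 * c1 * n) * math.sqrt(math.pi / c1) * 1.001
```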

Second, we have

$$\begin{aligned} I_{12} =\int _{-n^{-\frac{2}{3}+2\varepsilon }}^{n^{-\frac{2}{3}+2\varepsilon }}\exp \left[ n(G(\mu _1^{(1)},\mu _1^{(2)}+n^{-1}+\textrm{i}y)-\widehat{G})\right] \textrm{d}y. \end{aligned}$$
(4.37)

We first check that \(I_{12}\) is close to the integral over the whole real line

$$\begin{aligned} K_n:=\int _{-\infty }^{\infty }\exp \left[ n(G(\mu _1^{(1)},\mu _1^{(2)}+n^{-1}+\textrm{i}y)-\widehat{G})\right] \textrm{d}y. \end{aligned}$$
(4.38)

By (4.27), for all \(y\in \mathbb {R}\),

$$\begin{aligned} \begin{aligned}&\mathop {\textrm{Re}}\limits \left[ n\left( G(\mu _1^{(1)},\mu _1^{(2)}+\frac{1}{n}+\textrm{i}y)-\widehat{G}\right) \right] \\ {}&\quad \le c_0- \frac{1}{4}\log \left( (4\mu _1^{(1)}n^{-1})^2+(4\mu _1^{(1)}y)^2\right) \\&\qquad -\frac{1}{4}\sum _{j=2}^n\log \left( \left( 1+\frac{4\mu _1^{(1)}/n }{\mu _1-\mu _j}\right) ^2+\left( \frac{4\mu _1^{(1)}y}{\mu _1-\mu _j}\right) ^2\right) . \end{aligned} \end{aligned}$$

In the case \(n^{-\frac{2}{3}+2\varepsilon }<|y|<n\), we use \(- \frac{1}{4}\log \left( (4\mu _1^{(1)}n^{-1})^2+(4\mu _1^{(1)}y)^2\right) \le \frac{1}{2}\log n\) and bound

$$\begin{aligned} \sum _{j=2}^n\log \left( \left( 1+\frac{4\mu _1^{(1)}/n}{\mu _1-\mu _j}\right) ^2+\left( \frac{4\mu _1^{(1)}y}{\mu _1-\mu _j}\right) ^2\right) \ge 2\sum _{j=2}^{n^{2\varepsilon }}\log \left( \frac{4\mu _1^{(1)}|y|}{\mu _1-\mu _j}\right) \ge Cn^{2\varepsilon } \end{aligned}$$

using the fact that \(\mu _1-\mu _{n^{2\varepsilon }}\ll n^{-\frac{2}{3}+2\varepsilon }\) with high probability. For \(|y|>n\), we drop the negative term \(- \frac{1}{4}\log \left( (4\mu _1^{(1)}n^{-1})^2+(4\mu _1^{(1)}y)^2\right) \), while, for some \(c>0\),

$$\begin{aligned} \sum _{j=2}^n\log \left( \left( 1+\frac{4\mu _1^{(1)}/n}{\mu _1-\mu _j}\right) ^2+\left( \frac{4\mu _1^{(1)}y}{\mu _1-\mu _j}\right) ^2\right) \ge n \log (1+cy^2)\ge n\log (c|y|). \end{aligned}$$

Therefore, for some \(C',C''>0\), it holds with high probability that

$$\begin{aligned} |K_n-I_{12}| \le C'\left( n^{\frac{3}{2}}e^{-Cn^{2\varepsilon }}+\int _n^\infty (cy)^{-\frac{n}{4}}\textrm{d}y\right) \le C''e^{-c'n^{2\varepsilon }}. \end{aligned}$$
(4.39)

We determine in Sects. 4.2.1 and 4.2.2 that, on the event \(\mathcal {F}_\varepsilon \),

$$\begin{aligned} K_n={\left\{ \begin{array}{ll} e^{O(1)}n^{-\frac{1}{3}}\left( b\sqrt{\log n} \right) ^{-\frac{1}{2}}, &{}\quad b>0,\\ e^{O(\log \log n)}n^{-\frac{1}{3}},&{}\quad b=0. \end{array}\right. } \end{aligned}$$
(4.40)

Assuming (4.40), and using (4.39) together with the fact that \(S_n=I_{11}\cdot I_{12}+O(n^{-1})\), we obtain Lemma 4.5.

4.2.1 Proof of (4.40) When \(b>0\)

For brevity, we introduce the following notation, used throughout this subsection:

$$\begin{aligned} a_+ =\frac{\mu _1^{(2)}+\mu _2^{(2)}}{2}= \frac{\mu _1}{8\mu _1^{(1)}}+\frac{\mu _2}{8\mu _2^{(1)}}, \end{aligned}$$
(4.41)

where \(\mu _2^{(1)}:=\frac{\alpha _n+\sqrt{\alpha _n^2+\mu _2B_n^2}}{2B_n}\) and \(\mu _2^{(2)}:=\frac{-\alpha _n+\sqrt{\alpha _n^2+\mu _2B_n^2}}{2B_n}\).

We now show that the integral \(K_n\), on the event \(\mathcal {F}_\varepsilon \), satisfies (4.40), first under the assumption \(b>0\). By Cauchy’s theorem, for every \(r\in (0,n^{-1}]\),

$$\begin{aligned} \textrm{i}K_n = \int _\Gamma \exp \left[ n(G(\mu _1^{(1)},z)-\widehat{G})\right] \textrm{d}z, \end{aligned}$$

where \(\Gamma =\Gamma _1\cup \Gamma _{2}^{\pm }\cup \Gamma _{3}^{\pm }\) is the vertical keyhole-like contour in Fig. 1. In particular, given \(\phi _r\in [0,\pi ]\) with \(\phi _r\rightarrow 0\) as \(r\downarrow 0\), we let \(\Gamma _1\) be the arc \(\{\mu _1^{(2)}+re^{\textrm{i}\theta }: \theta \in [-\pi +\phi _r, \pi -\phi _r]\}\), \(\Gamma _{2}^{\pm }=\{x\pm \textrm{i}r\sin \phi _r: x\in [a_+,\mu _1^{(2)}-r\cos \phi _r]\}\), and \(\Gamma _{3}^{\pm }\) be the rays \(\{a_+\pm \textrm{i}y: y\in [r\sin \phi _r, \infty )\}\). Then, for fixed n,

$$\begin{aligned} \textrm{i}K_n = \lim \limits _{r\downarrow 0} \int _\Gamma \exp \left[ n(G(\mu _1^{(1)},z)-\widehat{G})\right] \textrm{d}z. \end{aligned}$$
(4.42)
Fig. 1: Keyhole-like contour of integration \(\Gamma \)

For \(\Gamma _1\), using the fact that \(\log (x+\textrm{i}t) \rightarrow \log |x|+\textrm{i}\pi \) as \(t\downarrow 0\) for \(x<0\), and \(\textrm{d}z=\textrm{i}re^{\textrm{i}\theta }\textrm{d}\theta \) where \(\theta \) takes values in \([-\pi +\phi _r,\pi -\phi _r]\) as described above, one can verify using Fubini’s theorem that for each fixed n, the integral over \(\Gamma _1\) converges to 0 as \(r\downarrow 0\).

We show in Lemma 4.8 that, in the limit \(r\downarrow 0\), the contribution from the \(\Gamma _2^+\cup \Gamma _2^-\) part of the contour satisfies the asymptotics in (4.40) in both cases \(b>0\) and \(b=0\). In Lemma 4.9, we confirm that for any keyhole radius \(r\in (0,1/n]\), with probability arbitrarily close to 1, the contribution from \(\Gamma _3^+\cup \Gamma _3^-\) is of smaller order than that of \(\Gamma _2^+\cup \Gamma _2^-\) when \(b>0\). Together, these lemmas establish (4.40) when \(b>0\).

Lemma 4.8

On the event \(\mathcal {F}_\varepsilon \), it holds that

$$\begin{aligned} \lim \limits _{r\downarrow 0}\int _{\Gamma _2^+\cup \Gamma _2^-}\exp \left[ n(G(\mu _1^{(1)},z)-\widehat{G})\right] \textrm{d}z = {\left\{ \begin{array}{ll} \textrm{i}e^{O(1)}n^{{-}\frac{1}{3}}\left( b\sqrt{\log n}\right) ^{{-}\frac{1}{2}}, &{}\quad b>0,\\ \textrm{i}e^{O(1)}n^{{-}\frac{1}{3}},&{}\quad b=0. \end{array}\right. }\nonumber \\ \end{aligned}$$
(4.43)

Proof

Recall that, if \(z \in \Gamma _2^{\pm }\), then \(z=x\pm \textrm{i}r\sin \phi _r\) where \(x\in [a_+,\mu _1^{(2)}-r\cos \phi _r]\). Setting \(s=\mu _1^{(2)}-x\), we have

$$\begin{aligned} \begin{aligned} n(G(\mu _1^{(1)},z)-\widehat{G})&=-nB_n(\mu _1^{(2)}-x)\pm \textrm{i}n B_n r\sin \phi _r\\&\quad -\frac{1}{2}\sum _{j=2}^n\log \left( 1-\frac{4\mu _1^{(1)}(\mu _1^{(2)}-x)\mp \textrm{i}4\mu _1^{(1)}r\sin \phi _r}{\mu _1-\mu _j}\right) \\&\quad -\frac{1}{2}\log \left( -4\mu _1^{(1)}(\mu _1^{(2)}-x)\pm \textrm{i}4\mu _1^{(1)}r\sin \phi _r\right) \\&{\mathop {\rightarrow }\limits ^{r\downarrow 0}} -B_nns-\frac{1}{2}\sum _{j=2}^n\log \left( 1-\frac{4\mu _1^{(1)}(\mu _1^{(2)}-x)}{\mu _1-\mu _j}\right) \\&\quad \qquad -\frac{1}{2}\log (4\mu _1^{(1)}(\mu _1^{(2)}-x))\mp \textrm{i}\frac{\pi }{2}. \end{aligned} \end{aligned}$$

Let A be the left-hand side of (4.43). We then obtain

$$\begin{aligned} A=\frac{2\textrm{i}}{\sqrt{4\mu _1^{(1)}}}\int _0^{\mu _1^{(2)}-a_+}\exp \left( -B_nns-\frac{1}{2}\sum _{j=2}^n\log \left( 1-\frac{4\mu _1^{(1)}s}{\mu _1-\mu _j}\right) \right) \frac{\textrm{d}s}{\sqrt{s}}.\nonumber \\ \end{aligned}$$
(4.44)

Observe that \(\frac{4\mu _1^{(1)}s}{\mu _1-\mu _j} \in [0,\frac{1}{2}]\) for all \(s\in [0,\mu _1^{(2)}-a_+]\) and all j. As \(0\le -\log (1-x)-x\le x^2\) for \(x\in [0,\frac{1}{2}]\), there exists \(\zeta \in [0,1]\) such that

$$\begin{aligned} \begin{aligned}&-B_nns-\frac{1}{2}\sum _{j=2}^n\log \left( 1-\frac{4\mu _1^{(1)}s}{\mu _1-\mu _j}\right) \\&\quad =-B_nns+2\mu _1^{(1)}s\sum _{j=2}^n\frac{1}{\mu _1-\mu _j}+\frac{\zeta (4\mu _1^{(1)}s)^2}{2}\sum _{j=2}^n\frac{1}{(\mu _1-\mu _j)^2}. \end{aligned} \end{aligned}$$
(4.45)

Define \(y:=n^{2/3}s \in [0,n^{2/3}(\mu _1^{(2)}-a_+)]\), and let \(\omega _{1n}\), \(\omega _{2n}\) be random variables given by

$$\begin{aligned} \sum _{j=2}^n\frac{1}{n^{\frac{2}{3}}(\mu _1-\mu _j)} = s_{{{\,\textrm{MP}\,}}}(d_+)n^{\frac{1}{3}}+\omega _{1n}, \quad \sum _{j=2}^n\frac{1}{\left( n^{\frac{2}{3}}(\mu _1-\mu _j)\right) ^2} = \omega _{2n}.\nonumber \\ \end{aligned}$$
(4.46)

Then, (4.45) simplifies to

$$\begin{aligned}{} & {} -B_nns-\frac{1}{2}\sum _{j=2}^n\log \left( 1-\frac{4\mu _1^{(1)}s}{\mu _1-\mu _j}\right) \nonumber \\{} & {} \quad =n^{\frac{1}{3}}y\left( -B_n + 2\mu _1^{(1)}s_{{{\,\textrm{MP}\,}}}(d_+)\right) +\left[ 2\mu _1^{(1)}\omega _{1n}y+8\zeta (\mu _1^{(1)})^2\omega _{2n} y^2\right] ,\nonumber \\ \end{aligned}$$
(4.47)

where the term inside the square brackets is O(1), uniformly for \(y\in [0,n^{2/3} (\mu _1^{(2)}-a_+)]\). Observe also

$$\begin{aligned} -B_n + 2\mu _1^{(1)}s_{{{\,\textrm{MP}\,}}}(d_+)= -B_n+\frac{\alpha _n+\sqrt{\alpha _n^2+\mu _1B_n^2}}{B_n}s_{{{\,\textrm{MP}\,}}}(d_+), \end{aligned}$$

where \(B_n-B_c=\Theta (\beta -\beta _c)\) and \(B_c\) satisfies \(\sqrt{\alpha ^2+d_+B_c^2}=\alpha +d_+s_{{{\,\textrm{MP}\,}}}(d_+)\). Therefore, applying a Taylor expansion to the above expression in \(B_n\) near \(B_c\) and in \(\mu _1\) near \(d_+\), and using \(\mu _1-d_+=O(n^{-2/3})\) on the event \(\mathcal {F}_\varepsilon \), we obtain

$$\begin{aligned} \begin{aligned} -B_n + 2\mu _1^{(1)}s_{{{\,\textrm{MP}\,}}}(d_+)&=-\frac{2s_{{{\,\textrm{MP}\,}}}(d_+)\lambda ^{\frac{1}{2}}}{\sqrt{1+\lambda }}(\beta -\beta _c)+O\left( (\beta -\beta _c)^2\right) . \end{aligned} \end{aligned}$$
(4.48)

Thus, on the event \(\mathcal {F}_\varepsilon \),

$$\begin{aligned} -B_nns-\frac{1}{2}\sum _{j=2}^n\log \left( 1-\frac{4\mu _1^{(1)}s}{\mu _1-\mu _j}\right) =-\frac{2s_{{{\,\textrm{MP}\,}}}(d_+)\lambda ^{\frac{1}{2}}b\sqrt{\log n}}{\sqrt{1+\lambda }}y +O(1),\nonumber \\ \end{aligned}$$
(4.49)

and we arrive at

$$\begin{aligned} A{} & {} =\frac{\textrm{i}e^{O(1)}}{n^{\frac{1}{3}}}\int _0^{n^{\frac{2}{3}}(\mu _1^{(2)}-a_+)}\exp \left( -\frac{2s_{{{\,\textrm{MP}\,}}}(d_+)\lambda ^{\frac{1}{2}}b\sqrt{\log n}}{\sqrt{1+\lambda }}y\right) \frac{\textrm{d}y}{\sqrt{y}}\nonumber \\ {}{} & {} ={\left\{ \begin{array}{ll} \textrm{i}e^{O(1)}n^{-\frac{1}{3}}b^{-\frac{1}{2}}(\log n)^{-\frac{1}{4}}, &{}\quad b>0,\\ \textrm{i}e^{O(1)}n^{-\frac{1}{3}}, &{}\quad b=0. \end{array}\right. } \end{aligned}$$

This completes the proof of the lemma. \(\square \)
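The final step above rests on the Laplace-type evaluation \(\int _0^\infty e^{-ay}y^{-1/2}\textrm{d}y=\sqrt{\pi /a}\); with \(a\propto b\sqrt{\log n}\), this is the source of the \(b^{-1/2}(\log n)^{-1/4}\) factor. A minimal numerical check, outside the proof and with toy values of a only, using the substitution \(y=u^2\) to remove the endpoint singularity:

```python
import math

def sqrt_kernel_laplace(a, half=12.0, steps=200001):
    """Integral of exp(-a y) y^{-1/2} over (0, inf) via y = u^2: 2 * int exp(-a u^2) du."""
    b = half / math.sqrt(a)
    h = b / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        total += math.exp(-a * u * u)
    return 2.0 * total * h

# closed form sqrt(pi/a); with a proportional to b*sqrt(log n) this produces
# the b^{-1/2} (log n)^{-1/4} factor in the b > 0 case of (4.43)
for a in (0.3, 1.0, 7.5):
    assert abs(sqrt_kernel_laplace(a) - math.sqrt(math.pi / a)) < 1e-7
```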

Lemma 4.9

Let \(\theta _n=\frac{n^{2/3}(\mu _1^{(2)}-\mu _2^{(2)})}{2}\). For \(b\ge 0\) and for every \(0<r<n^{-1}\), on the event \(\mathcal {F}_\varepsilon \),

$$\begin{aligned}{} & {} \left| \int _{\Gamma _3^+\cup \Gamma _3^-}\exp \left[ n(G(\mu _1^{(1)},z)-\widehat{G})\right] \textrm{d}z\right| \\ {}{} & {} \le n^{-\frac{1}{3}}\exp \left( -\frac{2s_{{{\,\textrm{MP}\,}}}(d_+)\sqrt{\lambda }\theta _n}{\sqrt{1+\lambda }}b\sqrt{\log n}+O(1)\right) . \end{aligned}$$

Proof

Since \(G(\mu _1^{(1)},\overline{z})=\overline{G(\mu _1^{(1)}, z)}\) for all \(z\in \mathbb {C}\), it suffices to bound the integral over \(\Gamma _3^+\). We define

$$\begin{aligned} G_+(\mu _1^{(1)},a_+)=\lim \limits _{t\downarrow 0}G(\mu _1^{(1)}, a_++\textrm{i}t), \quad \widetilde{G}(t)=G(\mu _1^{(1)}, a_++\textrm{i}t)-G_+(\mu _1^{(1)},a_+). \end{aligned}$$

Then, for \(z\in \Gamma _3^+\),

$$\begin{aligned} n(G(\mu _1^{(1)},z)-\widehat{G}) = n(G_+(\mu _1^{(1)},a_+)-\widehat{G}))+ n\widetilde{G}(t), \end{aligned}$$

and we have

$$\begin{aligned} \begin{aligned} \left| \int _{\Gamma _3^+}\exp \left[ n(G(\mu _1^{(1)},z)-\widehat{G})\right] \textrm{d}z\right|&\le \left| e^{n(G_+(\mu _1^{(1)},a_+)-\widehat{G})}\right| \int _0^\infty e^{n\mathop {\textrm{Re}}\limits \widetilde{G}(t)} \textrm{d}t. \end{aligned}\nonumber \\ \end{aligned}$$
(4.50)

For a fixed integer \(k>2\),

$$\begin{aligned} n\mathop {\textrm{Re}}\limits \widetilde{G}(t)= & {} -\frac{1}{4}\sum _{j=1}^{n}\log \left( 1+\left( \frac{4\mu _1^{(1)}t}{4\mu _1^{(1)}a_+-\mu _j}\right) ^2\right) \\{} & {} \le -\frac{1}{4}\sum _{j=2}^{n}\log \left( 1+\left( \frac{4\mu _1^{(1)}t}{\mu _1-\mu _j}\right) ^2\right) \le -\frac{k}{4}\log \left( 1+\xi ^{-2}n^{\frac{4}{3}}t^2\right) , \end{aligned}$$

where \(\xi :=\frac{n^{2/3}}{4\mu _1^{(1)}}|\mu _1-\mu _{k+1}|\) is O(1) on the event \(\mathcal {F}_\varepsilon \). Thus,

$$\begin{aligned} \int _0^\infty e^{n\mathop {\textrm{Re}}\limits \widetilde{G}(t)}\textrm{d}t\le \int _0^\infty (1+\xi ^{-2}n^{\frac{4}{3}}t^2)^{-\frac{k}{4}}\textrm{d}t=\exp \left( -\frac{2}{3} \log n+O(1)\right) .\nonumber \\ \end{aligned}$$
(4.51)
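The scaling in (4.51) can be seen by substituting \(t=\xi n^{-2/3}u\), which gives \(\xi n^{-2/3}\int _0^\infty (1+u^2)^{-k/4}\textrm{d}u\), finite for \(k>2\). A numerical sketch with toy values of \(\xi \) and n (for \(k=4\) the u-integral is \(\pi /2\), so the truncated t-integral is within its tail of \(\xi n^{-2/3}\pi /2\)):

```python
import math

def k_over_4_tail_integral(xi, n, k, upper=0.34, steps=400000):
    """Midpoint rule for the integral of (1 + xi^{-2} n^{4/3} t^2)^{-k/4} over [0, upper];
    upper is chosen large enough for the toy values below."""
    c = n ** (4 / 3) / xi ** 2
    h = upper / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * h
        total += (1 + c * t * t) ** (-k / 4)
    return total * h

xi, n = 1.7, 10**6
# for k = 4: value is (xi / n^{2/3}) * arctan(n^{2/3} t / xi) -> (xi / n^{2/3}) * pi/2,
# i.e. exp(-(2/3) log n + O(1)), matching (4.51)
assert abs(k_over_4_tail_integral(xi, n, 4) - xi * n ** (-2 / 3) * math.pi / 2) < 1e-6
```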

At the same time, on the event \(\mathcal {F}_\varepsilon \), \( \theta _n=\Theta (n^{2/3}(\mu _1-\mu _2))=\Theta (1)\). Thus, similar to the proof of Lemma 4.8, we obtain that

$$\begin{aligned} n(G_+(\mu _1^{(1)},a_+)-\widehat{G}){} & {} =-B_nn^{\frac{1}{3}}\theta _n-\frac{1}{2}\sum _{j=2}^n\log \left( 1-\frac{4\mu _1^{(1)}\theta _n}{n^{\frac{2}{3}}(\mu _1-\mu _j)}\right) \nonumber \\{} & {} \quad -\frac{1}{2}\log (4\mu _1^{(1)}n^{-2/3}\theta _n)-\frac{\textrm{i}\pi }{2} \nonumber \\{} & {} =-n^{\frac{1}{3}}\theta _n\left( B_n-2\mu _1^{(1)}s_{{{\,\textrm{MP}\,}}}(d_+)-\omega _{1n}n^{-\frac{1}{3}}\right) \nonumber \\{} & {} \quad +(\zeta \theta _n)^2\omega _{2n}+\frac{\log n}{3}+O(1)\nonumber \\{} & {} =-\frac{2\sqrt{\lambda }s_{{{\,\textrm{MP}\,}}}(d_+)\theta _n}{\sqrt{1+\lambda }}b\sqrt{\log n}+\frac{\log n}{3}+O(1), \nonumber \\ \end{aligned}$$
(4.52)

on the event \(\mathcal {F}_\varepsilon \). Applying the above two displays to (4.50), we obtain the lemma. \(\square \)

4.2.2 Proof of (4.40) When \(b=0\)

Observe that when \(b=0\), Lemmas 4.8 and 4.9, based on the keyhole contour, show that, with probability \(1-\varepsilon \) for arbitrarily small \(\varepsilon >0\), the contributions from the vertical and horizontal parts of the contour are both \(n^{-\frac{1}{3}}e^{O(1)}\). This provides the upper bound for \(K_n\). As some cancelation between the two contributions can occur, further analysis is required for the lower bound. In this section, we use the steepest descent contour of \(G(\mu _1^{(1)}, z)\) crossing the real line above \(\mu _1^{(2)}\) to obtain the needed lower bound

$$\begin{aligned} K_n \ge n^{-\frac{1}{3}}e^{O(\log \log n)}. \end{aligned}$$

The argument is inspired by the one provided by Johnstone et al. in [36].

Lemma 4.10

There exists a unique saddle point of \(G(\mu _1^{(1)},z)\) for \(z\in (\mu _1^{(2)}, \infty )\).

Proof

Observe that

$$\begin{aligned} \partial _2 G(\mu _1^{(1)},z)=B_n-\frac{1}{2n}\sum _{j=1}^{n}\frac{4\mu _1^{(1)}}{4\mu _1^{(1)}z-\mu _j} \end{aligned}$$

is an increasing function of z on the interval \((\mu _1^{(2)}, \infty )\) and that

$$\begin{aligned} \lim \limits _{z\downarrow \mu _1^{(2)}}\partial _2G(\mu _1^{(1)},z)=-\infty , \quad \lim \limits _{z\rightarrow \infty }\partial _2G(\mu _1^{(1)}, z)=B_n>0. \end{aligned}$$

Thus, there is a unique solution \(z_c\in (\mu _1^{(2)}, \infty )\) to the equation \(\partial _2G(\mu _1^{(1)},z)=0\). Moreover, \(\partial _2^2G(\mu _1^{(1)},z)>0\) for all \(z>\mu _1^{(2)}\). Thus, \(z_c\) is a saddle point of \(\mathop {\textrm{Re}}\limits [G(\mu _1^{(1)}, z)]\). \(\square \)
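The monotonicity argument in the proof translates directly into a bisection scheme for \(z_c\). The following sketch, outside the proof, uses generic toy values for \(B_n\), \(\mu _1^{(1)}\), and the eigenvalues \(\mu _j\) (not the actual LOE quantities):

```python
import random

def d2G(z, B, m1, mus):
    """Toy version of the derivative in the proof: B - (1/(2n)) sum_j 4 m1 / (4 m1 z - mu_j)."""
    return B - sum(4 * m1 / (4 * m1 * z - mu) for mu in mus) / (2 * len(mus))

random.seed(0)
mus = sorted((random.uniform(0.0, 4.0) for _ in range(50)), reverse=True)
B, m1 = 1.0, 0.8
lo = mus[0] / (4 * m1) + 1e-12   # just to the right of mu_1^(2) = mu_1 / (4 mu_1^(1))
hi = lo + 100.0
assert d2G(lo, B, m1, mus) < 0 < d2G(hi, B, m1, mus)
for _ in range(200):             # bisection works because d2G is increasing on this interval
    mid = 0.5 * (lo + hi)
    if d2G(mid, B, m1, mus) < 0:
        lo = mid
    else:
        hi = mid
z_c = 0.5 * (lo + hi)
assert abs(d2G(z_c, B, m1, mus)) < 1e-9
```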

Let \(\Gamma _s\) be the steepest descent contour of \(G(\mu _1^{(1)}, z)\) crossing \(z_c\). For \(z=x+\textrm{i}y\in \Gamma _s\),

$$\begin{aligned} 0=\mathop {\textrm{Im}}\limits [G(\mu _1^{(1)},z)]=B_ny-\frac{1}{2n}\sum _{j=1}^n\arg (4\mu _1^{(1)}x-\mu _j+\textrm{i}4\mu _1^{(1)}y), \end{aligned}$$

which implies \(\Gamma _s\) is symmetric with respect to the x-axis. Moreover, for fixed \(y>0\), \(\arg (4\mu _1^{(1)}x-\mu _j+\textrm{i}4\mu _1^{(1)}y)\) is strictly decreasing in x. This implies there is at most one solution x to \(\mathop {\textrm{Im}}\limits [G(\mu _1^{(1)}, x+\textrm{i}y)]=0\) for any \(y>0\). The same applies to \(y<0\) by symmetry. We then parameterize \(\Gamma _s=\{\Gamma _s(t):0<t<1\}\) so that \(\mathop {\textrm{Im}}\limits \Gamma _s(t)\) is increasing in t.

As \(B_n|y|\uparrow \frac{\pi }{2}\), we have \(x\rightarrow -\infty \), so \(\Gamma _s(0^+)=-\infty -\textrm{i}\frac{\pi }{2B_n}\) and \(\Gamma _s(1^-)=-\infty +\textrm{i}\frac{\pi }{2B_n}\). It follows that \(\mathop {\textrm{Re}}\limits \Gamma _s(t)\) is bounded above, and \(K_n\) as in (4.42) satisfies

$$\begin{aligned} \textrm{i}K_n=\int _{\Gamma _s}\exp \left[ n(G(\mu _1^{(1)}, z)-\widehat{G})\right] \textrm{d}z. \end{aligned}$$

We now consider points on the contour \(\Gamma _s\) with real part \(\mu _1^{(2)}\).

Lemma 4.11

The function

$$\begin{aligned} f(y):=\mathop {\textrm{Im}}\limits [G(\mu _1^{(1)}, \mu _1^{(2)}+\textrm{i}y)]=B_ny-\frac{\pi }{4n}-\frac{1}{2n}\sum _{j=2}^n\arctan \left( \frac{4\mu _1^{(1)}y}{\mu _1-\mu _j}\right) \end{aligned}$$

has a unique positive root \(y_0\). Furthermore, for any sequence \(a_n\rightarrow \infty \), \(a_n=O(n^\delta )\) for any \(\delta >0\),

$$\begin{aligned} n^{-2/3}a_n^{-1}\le y_0\le n^{-2/3}a_n, \quad \text {asymptotically almost surely.} \end{aligned}$$
(4.53)

Proof

Existence and uniqueness of \(y_0>0\) follow from the fact that f(y) is a continuous, convex function on \([0,\infty )\) with \(f(0)=-\frac{\pi }{4n}\) and \(\lim \limits _{y\rightarrow \infty }f(y)=\infty \).

Let \(y_-\), \(y_+\) denote the bounds \(a_n^{-1}n^{-2/3}\) and \(a_nn^{-2/3}\), respectively. We now verify (4.53) by showing that a.a.s., \(f(y_-)<0<f(y_+)\). First, using \(\arctan (x)\ge x-x^2/4\) for \(x\ge 0\) and Lemma 4.1, with probability at least \(1-\varepsilon \) for arbitrary \(\varepsilon >0\),

$$\begin{aligned} f(y_-)&\le y_-\left( B_n-\frac{2\mu _1^{(1)}}{n}\sum _{j=2}^n\frac{1}{\mu _1-\mu _j}\right) -\frac{\pi }{4n}+\frac{(4\mu _1^{(1)}y_-)^2}{8n}\sum _{j=2}^n\frac{1}{(\mu _1-\mu _j)^2}\\&=y_-\left( B_n-2\mu _1^{(1)}s_{{{\,\textrm{MP}\,}}}(d_+)+O(n^{-1/3})\right) -\frac{\pi }{4n}+y_-^2\cdot O(n^{1/3})\\&=-\frac{\pi }{4n}+o(n^{-1})<0. \end{aligned}$$

In the last equality, \(B_n-2\mu _1^{(1)}s_{{{\,\textrm{MP}\,}}}(d_+)=O(n^{-1-\tau })\) due to rigidity of \(\mu _1\) and the fact that \(B_n=B_c+O(n^{-1-\tau })\) for any \(\tau >0\). The second part of the proof relies on the following statistics regarding the eigenvalues of a matrix from the Laguerre Orthogonal Ensemble. Let

$$\begin{aligned} j_0&=\#\bigg \{j:\mu _j>d_+-\frac{1}{3}a_n n^{-2/3}\bigg \}, \\ j^*&=\#\bigg \{j:\mu _j>\mu _1-\left( 1+\frac{\pi }{2}\right) ^{-1}a_nn^{-2/3}\bigg \}. \end{aligned}$$

By Chebyshev’s inequality and (A.2), for some \(c>0\), it holds a.a.s. that \(j_0\ge ca_n^{3/2}\). Combined with the observation that a.a.s., \(\mu _1-d_+=\Theta (n^{-2/3})\ll a_nn^{-2/3}\), we obtain

$$\begin{aligned} j^*\ge j_0\ge ca_n^{3/2} \quad \text {a.a.s.} \end{aligned}$$
(4.54)

Since \(\arctan (x)\le x-1\) for \(x>1+\frac{\pi }{2}\) and \(j^*=\max \{j:\frac{y_+}{\mu _1-\mu _j}>1+\frac{\pi }{2}\}\), we have

$$\begin{aligned} \arctan \left( \frac{4\mu _1^{(1)}y_+}{\mu _1-\mu _j}\right) \le \frac{4\mu _1^{(1)}y_+}{\mu _1-\mu _j}-\mathbbm {1}_{\{j\le j^*\}}. \end{aligned}$$

Lemma 4.1 and the above display imply that a.a.s.,

$$\begin{aligned} f(y_+)&=B_ny_+-\frac{1}{2n}\sum _{j=2}^n\arctan \left( \frac{4\mu _1^{(1)}y_+}{\mu _1-\mu _j}\right) -\frac{\pi }{4n}\\&\ge B_ny_+-\frac{1}{2n}\sum _{j=2}^n\frac{4\mu _1^{(1)}y_+}{\mu _1-\mu _j}+\frac{j^*}{2n}-\frac{\pi }{4n}\\&\ge y_+\cdot O(n^{-1/3})+\frac{ca_n^{3/2}}{2n}-\frac{\pi }{4n}, \end{aligned}$$

which is strictly positive as \(y_+=a_nn^{-2/3}\). We obtain the lemma. \(\square \)

Let \(z_0=\mu _1^{(2)}+\textrm{i}y_0\), and consider the subset

$$\begin{aligned} \Gamma _0=\{z\in \Gamma _s: |\mathop {\textrm{Im}}\limits z|\le y_0\}, \end{aligned}$$

which is a connected curve with endpoints \(z_0, \overline{z_0}\) by the parameterization. We have now obtained the needed tools to bound \(K_n\) as follows.

Observe that \(G(\mu _1^{(1)}, z)-\widehat{G}\) is real on \(\Gamma _s\) and is monotone decreasing as z moves away from the point \(z_c\) along \(\Gamma _s\). Also, \(\frac{\textrm{d}y}{\textrm{d}t}>0\) from the parameterization. Therefore,

$$\begin{aligned} \begin{aligned} K_n =&\frac{1}{\textrm{i}}\int _{\Gamma _s}\exp \left[ n(G(\mu _1^{(1)}, z)-\widehat{G})\right] \textrm{d}z\\&\ge \int _{-y_0}^{y_0}\exp \left[ n\mathop {\textrm{Re}}\limits (G(\mu _1^{(1)}, z(y))-\widehat{G})\right] \textrm{d}y\\&\ge 2y_0\exp \left[ n\mathop {\textrm{Re}}\limits (G(\mu _1^{(1)}, z_0)-\widehat{G})\right] . \end{aligned} \nonumber \\ \end{aligned}$$
(4.55)

Here,

$$\begin{aligned} \begin{aligned} \log y_0+n\mathop {\textrm{Re}}\limits (G(\mu _1^{(1)}, z_0)-\widehat{G})&= \log y_0-\frac{1}{2}\log (4\mu _1^{(1)}y_0)\\ {}&\quad -\frac{1}{4}\sum _{j=2}^n\log \left( 1+\frac{(4\mu _1^{(1)}y_0)^2}{(\mu _1-\mu _j)^2}\right) \\&\ge \frac{1}{2}\log y_0-\frac{(4\mu _1^{(1)}y_0)^2}{4}\sum _{j=2}^n\frac{1}{(\mu _1-\mu _j)^2}\\&\ge -\frac{1}{3}\log n +O(\log \log n). \end{aligned} \nonumber \\ \end{aligned}$$
(4.56)

The last inequality holds a.a.s., using Lemma 4.11 with \(a_n^2=\log \log n\) and the fact that \(\sum _{j=2}^n\frac{1}{(\mu _1-\mu _j)^2}\) is \(O(n^{4/3})\) on the event \(\mathcal {F}_\varepsilon \). This completes the proof of the lower bound of \(K_n\).

4.3 Low-Temperature Free Energy

Finally, using the contour integral computations from the previous section, we obtain the following lemma for the limiting fluctuations of the free energy on the low-temperature side of the critical temperature window.

Lemma 4.12

If \(\beta =\beta _c+bn^{-1/3}\sqrt{\log n}\) for some fixed \(b\ge 0\), then the free energy satisfies

$$\begin{aligned} \frac{m{+}n}{\sqrt{\frac{1}{6}\log n}}\left( F_{n,m}(\beta ){-}F(\beta ){+}\frac{1}{12}\frac{\log n}{n{+}m}\right) \rightarrow \mathcal {N}(0,1){+}\frac{\sqrt{6}\lambda ^{\frac{1}{4}}b}{(1+\lambda )^{\frac{1}{2}}(1+\lambda ^{\frac{1}{2}})^{\frac{2}{3}}}{{\,\textrm{TW}\,}}_1, \end{aligned}$$

where

$$\begin{aligned} F(\beta )=f_\lambda +\frac{\lambda }{1+\lambda }A(d_+,B)-\frac{1}{2}\log \beta -\frac{\lambda }{2(1+\lambda )}C_\lambda . \end{aligned}$$
(4.57)

Proof

By (4.18),

$$\begin{aligned} \frac{1}{n+m}\log Q_n= \frac{n}{n+m}\widehat{G}+\frac{1}{n+m}\log S_n. \end{aligned}$$

Note that \(\frac{1}{n+m}\log S_n =-\frac{5}{6}\frac{\log n}{n+m}+O(n^{-1}\log \log n)\) by Lemma 4.5, while the quantity \(\widehat{G}\) is computed in Lemma 4.2. Combining them, we get

$$\begin{aligned} \frac{1}{n+m}\log Q_n{} & {} =\frac{\lambda }{1+\lambda }A(d_+,B) -\frac{7}{6}\frac{\log n}{n+m}\nonumber \\{} & {} \quad -\frac{1}{2(n+m)}\sum _{i = 1}^n\log |d_+-\mu _i|+\frac{\lambda ^{\frac{3}{4}}bn^{-\frac{1}{3}}\sqrt{\log n}}{(1+\lambda )^{\frac{3}{2}}d_+}(\mu _1-d_+)\nonumber \\{} & {} \quad +O(\tfrac{\log \log n}{n}). \end{aligned}$$
(4.58)

Applying this to (2.6), we obtain

$$\begin{aligned} \begin{aligned} F_{m,n}(\beta )&=f_\lambda +\frac{\lambda }{1+\lambda }A(d_+,B)-\frac{1}{2}\log \beta -\frac{1}{6}\frac{\log n}{n+m} \\ {}&\quad -\frac{1}{2(n+m)}\sum _{i = 1}^n\log |d_+-\mu _i|+\frac{\lambda ^{\frac{3}{4}}bn^{-\frac{1}{3}}\sqrt{\log n}}{(1+\lambda )^{\frac{3}{2}}d_+}(\mu _1-d_+)\\&\quad + O(\tfrac{\log \log n}{n}). \end{aligned} \end{aligned}$$

In terms of variables \(T_{1n}\) and \(T_{2n}\) as in (4.9), we get

$$\begin{aligned} \begin{aligned} F_{m,n}(\beta )&=f_\lambda +\frac{\lambda }{1+\lambda }A(d_+,B)-\frac{1}{2}\log \beta -\frac{\lambda }{2(1+\lambda )}C_\lambda -\frac{1}{12}\frac{\log n}{n+m}\\&\quad +\frac{\sqrt{\frac{1}{6}\log n}}{n+m}\left( T_{1n}+\frac{\sqrt{6}\lambda ^{\frac{1}{4}}b}{(1+\lambda )^{\frac{1}{2}}(1+\lambda ^{\frac{1}{2}})^{\frac{2}{3}}}T_{2n} \right) +O(\tfrac{\log \log n}{n}). \end{aligned}\nonumber \\ \end{aligned}$$
(4.59)

The lemma then follows since \(T_{1n}{\mathop {\rightarrow }\limits ^{d}}\mathcal {N}(0,1)\) by Theorem 1.2, and \(T_{2n}{\mathop {\rightarrow }\limits ^{d}}{{\,\textrm{TW}\,}}_1\) by Lemma 2.1. \(\square \)

The fact that the Gaussian and Tracy–Widom limits are independent is shown in the next section.

5 Independence of Gaussian and Tracy–Widom Variables (Low Temperature)

Recall the quantities

$$\begin{aligned} \begin{aligned}&T_{1n}:=\frac{C_\lambda n-\frac{1}{6} \log n-\sum _{i=1}^n\log |d_+-\mu _i|}{\sqrt{\frac{2}{3}\log n}},\qquad \quad T_{2n}:=\frac{n^{2/3}(\mu _1-d_+)}{\sqrt{\lambda }(1+\sqrt{\lambda })^{4/3}},\\&C_\lambda =(1-\lambda ^{-1})\log (1+\lambda ^{\frac{1}{2}})+\log (\lambda ^{\frac{1}{2}})+\lambda ^{-\frac{1}{2}}. \end{aligned}\end{aligned}$$
(5.1)

The goal of this section is to show that, given an LOE matrix \(M_{n,m}\) (which we assume without loss of generality to be in tridiagonal form), with probability arbitrarily close to one,

  • \(T_{1n}=\frac{Z_n}{\sqrt{\frac{2}{3} \log n}}+o(1)\) for \(Z_n\) depending only on the upper left minor of size \(n-2n^{1/3}(\log n)^3\) of the matrix \(M_{n,m}\), and

  • \(T_{2n}=Y_n+o(1)\) for \(Y_n\) depending only on the lower right minor of size \(2n^{1/3}(\log n)^3\) of the matrix.

Our proofs draw on ideas from the paper [36], which proves a similar result in the case of Wigner ensembles. We also make use of results from [24], which studies the asymptotics of the quantity \(\sum _{i=1}^n\log |\gamma -\mu _i|\) for \(\gamma \ge d_+\) by analyzing a recurrence on the determinants of the minors of \(M_{n,m}\). In order to demonstrate the asymptotic independence of \(T_{1n}\) and \(T_{2n}\), we need not only the main theorem of [24], but also many of the intermediate lemmas which involve recurrences on the matrix entries. For this purpose, we briefly summarize the setup from that paper along with the key notations that are used.

Recall from (2.20) that the tridiagonal representation of \(M_{n,m}\) depends on \(\chi \)-squared random variables \(\{a_i^2\}\), \(\{b_i^2\}\). Paper [24] works with centered and rescaled versions of these, denoted by \(\alpha _i\) and \(\beta _i\), respectively, which are defined as

$$\begin{aligned} \alpha _i=\frac{a_i^2-(m-n+i)}{|\rho _i^+|}, \qquad \beta _i=\frac{b_{i-1}^2-(i-1)}{|\rho _i^+|}. \end{aligned}$$
(5.2)

Here, the scaling factor \(\rho _i^+\) is one of the characteristic roots of the recurrence on determinants of the minors of \(M_{n,m}\). This turns out to be a convenient rescaling since it prevents the iterates from blowing up. More precisely,

$$\begin{aligned} \rho _i^\pm{} & {} :=-\frac{1}{2}\left( \gamma m-(m-n+2i-1)\right. \nonumber \\ {}{} & {} \left. \pm \sqrt{(\gamma m-(m-n+2i-1))^2-4(m-n+i-1)(i-1)} \right) . \end{aligned}$$
(5.3)
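By Vieta’s formulas, the expressions in (5.3) are precisely the two roots of \(x^2+(\gamma m-(m-n+2i-1))x+(m-n+i-1)(i-1)=0\), the characteristic equation of the determinant recurrence mentioned above. A quick consistency check with toy values of \(\gamma , m, n, i\):

```python
import math

def rho_pm(gamma, m, n, i):
    """The two characteristic roots in (5.3) (evaluated at toy parameter values)."""
    b = gamma * m - (m - n + 2 * i - 1)
    s = math.sqrt(b * b - 4 * (m - n + i - 1) * (i - 1))
    return -0.5 * (b + s), -0.5 * (b - s)

m, n, gamma = 2000, 1000, 4.1
for i in (2, 100, 999):
    rp, rm = rho_pm(gamma, m, n, i)
    b = gamma * m - (m - n + 2 * i - 1)
    # Vieta: the two roots solve x^2 + b x + (m - n + i - 1)(i - 1) = 0
    assert abs((rp + rm) + b) < 1e-6
    assert abs(rp * rm - (m - n + i - 1) * (i - 1)) < 1e-6 * max(1.0, abs(rp * rm))
```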

Throughout the proofs, we will also use the notations

$$\begin{aligned} \tau _i=\frac{m-n+i}{|\rho _i^+|}, \qquad \delta _i=\frac{i-1}{|\rho _i^+|}. \end{aligned}$$
(5.4)

5.1 Proof for \(T_{1n}\)

Lemma 5.1

There exists a random variable \(Z_n\), depending only on the upper left minor of size \(n-2n^{1/3}(\log n)^3\) of the matrix \(M_{n,m}\), such that

$$\begin{aligned} T_{1n}=\frac{Z_n}{\sqrt{\frac{2}{3}\log n}}+o(1). \end{aligned}$$

Proof

We begin our analysis of \(T_{1n}\) by remarking that it is tricky to analyze the distribution of \(\sum _{i=1}^n\log |d_+-\mu _i|\) directly because of how close \(d_+\) is to the eigenvalues \(\{\mu _i\}\). For this reason, [24] uses the technique of first analyzing the sum \(\sum _{i=1}^n\log |\gamma -\mu _i|\) for

$$\begin{aligned} \gamma =d_++\sigma _n n^{-2/3}, \end{aligned}$$
(5.5)

then analyzing the original sum by comparison to the shifted one. We employ a similar technique here. More precisely, we take

$$\begin{aligned} \sigma _n=\bar{\sigma }_n:=\left( \log \log n\right) ^3. \end{aligned}$$
(5.6)

From line (7.3) of [24], we have

$$\begin{aligned} \sum _{i=1}^n\log |d_+-\mu _i|{} & {} =\sum _{i=1}^n\log |d_++\bar{\sigma }_nn^{-2/3}-\mu _i| -C_1\bar{\sigma }_n n^{1/3}+C_2\bar{\sigma }_n^{3/2}+o(\sqrt{\log n}), \nonumber \\ \end{aligned}$$
(5.7)

where

$$\begin{aligned} C_1=\frac{1}{\lambda ^{1/2}(1+\lambda ^{1/2})},\qquad C_2=\frac{2}{3\lambda ^{3/4}(1+\lambda ^{1/2})^2}. \end{aligned}$$
(5.8)

Furthermore, from Lemma 3.1 and Section 4 of [24], we can rewrite the sum on the right-hand side of (5.7) as

$$\begin{aligned} \sum _{i=1}^n\log |d_+ +\bar{\sigma }_nn^{-2/3}-\mu _i|{} & {} =C_\lambda n-\sum _{i=3}^n L_i -\frac{1}{6}\log n+C_1\bar{\sigma }_n n^{1/3}-C_2\bar{\sigma }_n^{3/2}\nonumber \\{} & {} \quad +o\left( \sqrt{\log n}\right) \end{aligned}$$
(5.9)

where \(C_1,C_2\) are the same constants from (5.7) and \(L_i\) is given by the recursive formula

$$\begin{aligned} L_i:=\xi _i+\omega _i L_{i-1} \text { for }i\ge 4,\qquad L_3:=\xi _3. \end{aligned}$$
(5.10)

with

$$\begin{aligned} \xi _i:=\alpha _i+\beta _i(1+\tau _{i-1})+\alpha _{i-1}\delta _i, \qquad \omega _i:=\tau _{i-1}\delta _i. \end{aligned}$$
(5.11)

Thus, combining (5.7) and (5.9) with the definition of \(T_{1n}\), we get

$$\begin{aligned} T_{1n}=\frac{\sum _{i=3}^nL_i}{\sqrt{\frac{2}{3}\log n}}+o(1). \end{aligned}$$
(5.12)

It remains to show that \(\sum _{i=3}^n L_i=Z_n+o(\sqrt{\log n})\) for some \(Z_n\) depending only on the upper left minor of \(M_{n,m}\) of size \(n-2n^{1/3}(\log n)^3\). From the recursive definition of \(L_i\), we have

$$\begin{aligned} \sum _{i=3}^nL_i=\sum _{i=3}^{n}\left( \xi _i+\omega _i\xi _{i-1}+\cdots +\omega _i\cdots \omega _4\xi _3\right) =\sum _{i=3}^{n}g_{i+1}\xi _i \end{aligned}$$

where \(g_i = 1+\omega _i+\omega _i\omega _{i+1}+\dots +\omega _i\dots \omega _n\) for \(3\le i \le n\), with the convention \(g_{n+1}=1\). We now compare this sum to a similar sum truncated at index \(i=n-2n^{1/3}(\log n)^3\) and show that their difference is small with probability arbitrarily close to 1. As this involves computing the variance of the difference between the two sums, we first eliminate the dependence between consecutive terms by rewriting

$$\begin{aligned} \sum _{i = 3}^n L_i = \sum _{i = 3}^n g_{i+1}X_i + \sum _{i = 3}^n \alpha _i - g_3\alpha _2, \end{aligned}$$

where

$$\begin{aligned} X_i = (1 + \tau _{i-1})(\delta _i \alpha _{i-1} + \beta _i), \quad 3 \le i \le n. \end{aligned}$$
(5.13)
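The unrolling \(\sum _{i=3}^nL_i=\sum _{i=3}^ng_{i+1}\xi _i\) used above (with \(g_{n+1}=1\), equivalently the backward recursion \(g_i=1+\omega _ig_{i+1}\)) holds for arbitrary inputs and can be checked numerically; a sketch with random \(\xi _i,\omega _i\):

```python
import random

random.seed(1)
n = 40
xi = {i: random.gauss(0, 1) for i in range(3, n + 1)}
om = {i: random.uniform(0.0, 0.9) for i in range(4, n + 1)}

# forward recursion (5.10): L_3 = xi_3, L_i = xi_i + om_i L_{i-1}
L = xi[3]
total = L
for i in range(4, n + 1):
    L = xi[i] + om[i] * L
    total += L

# backward recursion g_i = 1 + om_i g_{i+1}, g_{n+1} = 1, which reproduces
# g_i = 1 + om_i + om_i om_{i+1} + ... + om_i ... om_n
g = {n + 1: 1.0}
for i in range(n, 3, -1):
    g[i] = 1.0 + om[i] * g[i + 1]

unrolled = sum(g[i + 1] * xi[i] for i in range(3, n + 1))
assert abs(total - unrolled) < 1e-10
```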

Now we define

$$\begin{aligned} Z_n=\sum _{i=3}^{\lfloor n-2n^{1/3}(\log n)^3\rfloor }g_{i+1}X_i. \end{aligned}$$
(5.14)

This gives us

$$\begin{aligned} \sum _{i=3}^nL_i-Z_n=\sum _{i=\lceil n-2n^{1/3}(\log n)^3\rceil }^ng_{i+1}X_i+\sum _{i=3}^n\alpha _i-g_3\alpha _2. \end{aligned}$$
(5.15)

It follows from line (5.21) of [24] that \(\sum _{i=3}^n\alpha _i-g_3\alpha _2=o(\sqrt{\log n})\) with probability at least \(1-n^{-1/2}\). Finally, we bound the variance of the remaining sum on the right-hand side of (5.15). Since the \(\{X_i\}\) are pairwise independent and the \(\{g_i\}\) are deterministic, we have

$$\begin{aligned} \mathbb {E}\left[ \left( \sum _{i=\lceil n-2n^{1/3}(\log n)^3\rceil }^ng_{i+1}X_i\right) ^2\right] = \sum _{i=\lceil n-2n^{1/3}(\log n)^3\rceil }^ng_{i+1}^2\mathbb {E}X_i^2 \end{aligned}$$

From (4.42) of [24], we have \(\mathbb {E}X_i^2=O(n^{-1})\) uniformly in i. Combining Lemma 5.1 and Corollary 2.9 of [24], we have

$$\begin{aligned} g_i={\left\{ \begin{array}{ll}O(n^{1/2}(n-i)^{-1/2}) &{} i\le n-n^{1/3}\sigma _n\\ O(n^{1/3}\sigma _n^{-1/2}) &{} i\ge n-n^{1/3}\sigma _n. \end{array}\right. } \end{aligned}$$
(5.16)

Thus, we can bound the sum as follows:

$$\begin{aligned}\begin{aligned} \sum _{i=\lceil n-2n^{1/3}(\log n)^3\rceil }^ng_{i+1}^2\mathbb {E}X_i^2&\le \sum _{i=\lceil n-2n^{1/3}(\log n)^3\rceil }^{\lfloor n-n^{1/3}\sigma _n\rfloor }\frac{n}{n-i}\cdot \frac{C}{n}\\&\quad +\sum _{i=\lceil n-n^{1/3}\sigma _n\rceil }^n\frac{n^{2/3}}{\sigma _n}\cdot \frac{C}{n}\\&=O(\log \log n)+O(1). \end{aligned}\end{aligned}$$

This completes the proof of the lemma concerning \(T_{1n}\). \(\square \)

5.2 Proof for \(T_{2n}\)

We now verify that \(T_{2n}=Y_n+o(1)\) for some random variable \(Y_n\) depending only on the bottom-right minor of size \(2n^{\frac{1}{3}}(\log n)^3\) of the matrix \(M_{n,m}\) (in fact, we obtain a much stronger bound than o(1)). Recall that \(T_{2n}\) is a shifted rescaling of the largest eigenvalue \(\mu _1\), and it converges to the Tracy–Widom distribution. Thus, \(Y_n\), if it exists, must converge to the same limit, while only depending on the bottom corner of \(M_{n,m}\). The following lemma shows that the largest eigenvalue of the minor described above, under the same transformation as in \(T_{2n}\), is a valid choice for \(Y_n\).

Lemma 5.2

Let \(\widetilde{\mu }_1\) be the largest eigenvalue of the bottom-right minor of \(M_{n,m}\) of size \(p>2n^{\frac{1}{3}}(\log n)^3\). Then, for any \(D>0\) and \(\varepsilon >0\), with probability at least \(1-\varepsilon \),

$$\begin{aligned} |\mu _1-\widetilde{\mu }_1| = O(n^{-D}). \end{aligned}$$

Furthermore, by setting \(Y_n=\frac{n^{2/3}(\widetilde{\mu }_1-d_+)}{\sqrt{\lambda }(1+\sqrt{\lambda })^{4/3}}\) and taking \(D>\frac{2}{3}\) arbitrarily large, we have

$$\begin{aligned} T_{2n}=Y_n+O(n^{-D+2/3}). \end{aligned}$$

The key ingredient to bounding the difference \(\mu _1-\widetilde{\mu }_1\) lies in controlling the first \(n-2n^{1/3}(\log n)^3\) components of an eigenvector corresponding to \(\mu _1\). In particular, we need the following result.

Lemma 5.3

If \(\textbf{v}=(v_1,\dots ,v_n)^T\) is a principal eigenvector of \(M_{n,m}\), then for any \(\varepsilon >0\) and \(d>0\), with probability at least \(1-\varepsilon \), we have

$$\begin{aligned} \max _{j\le n-2n^{\frac{1}{3}}(\log n)^3}\frac{|v_j|}{\Vert \textbf{v}\Vert } < n^{-d}. \end{aligned}$$

Lemma 5.3 itself relies on the following two auxiliary Lemmas 5.4 and 5.5, both of which depend on the random entries in the tridiagonal matrix form. We include their proofs in Appendix B.

Lemma 5.4

Let \(\mu _1\) be the largest eigenvalue of \(M_{n,m}\). Let \(\{F_j\}_{j=1}^{n-1}\) be the sequence given by

$$\begin{aligned} F_1= & {} -1+\frac{\mu _1 m-a_1^2}{|\rho _1^+|}, \quad F_j=-1+ \frac{\mu _1m-(a_j^2+b_{j-1}^2)}{|\rho _j^+|}\\{} & {} +\frac{(a_{j-1}b_{j-1})^2}{|\rho _j^+||\rho _{j-1}^+|}\cdot \frac{1}{1+F_{j-1}} \text { for } j=2,\dots , n-1. \end{aligned}$$

Here, \(\rho _j^+\) is given by (5.3) with \(\gamma =d_+\). Then, for every \(\varepsilon >0\), with probability at least \(1-\varepsilon \),

$$\begin{aligned} \max _{j\le n-n^{\frac{1}{3}}(\log n)^3}|F_j| = o(n^{-\frac{1}{3}}). \end{aligned}$$
(5.17)

Lemma 5.5

Given \(\varepsilon >0\), for sufficiently large n and with \(a_i,b_i\) as defined in (2.21), we have

$$\begin{aligned} \mathbb {P}\left( \left| \max _{j\le n-n^{1/3}(\log n)^3}a_jb_j-\sqrt{mn}\right| \le (e\log n)^2n^{1/2}\right) \ge 1-\varepsilon . \end{aligned}$$
(5.18)

Proof of Lemma 5.3

From the tridiagonal representation (2.20) and the notations presented at the beginning of Sect. 5, we obtain the system of linear equations

$$\begin{aligned} {\left\{ \begin{array}{ll} \left( \frac{a_1^2}{m}-\mu _1\right) v_1+\frac{a_1b_1}{m}v_2=0,&\\ \frac{a_{j-1}b_{j-1}}{m}v_{j-1}+\left( \frac{a^2_j+b^2_{j-1}}{m}-\mu _1\right) v_j+\frac{a_jb_j}{m}v_{j+1}=0, &\quad j=2,\dots , n-1. \end{array}\right. } \end{aligned}$$

With probability 1, \(a_j>0\) and \(b_j>0\) for \(j=1,\dots , n-1\). This implies \(v_1\ne 0\) (otherwise \(\textbf{v}\) would be the zero vector). In fact, since each \(v_j\) is a function of the positive, continuous random variables \(a_1, \dots , a_{j-1},b_1,\dots , b_{j-1}\), it holds with probability 1 that \(v_j\ne 0\) for each j. Thus, we rescale \(\textbf{v}\) so that \(v_1=1\) and obtain

$$\begin{aligned} v_2=\frac{\mu _1m-a_1^2}{a_1b_1}, \quad v_{j+1}=\frac{\mu _1 m-(a_j^2+b_{j-1}^2)}{a_jb_j}v_j-\frac{a_{j-1}b_{j-1}}{a_jb_j}v_{j-1},\quad j=2,\dots , n-1. \end{aligned}$$
(5.19)

We introduce the following quantity

$$\begin{aligned} F_j=\frac{v_{j+1}}{v_j}\cdot \frac{a_jb_j}{|\rho _j^+|}-1, \quad \text {for } j=1, \dots , n-1. \end{aligned}$$
(5.20)

Here, \(\rho _j^+\) is given in (5.3) with \(\gamma =d_+\). Set \(k = \lceil n^{\frac{1}{3}} \rceil \), and let \(j\le n-2n^{1/3}(\log n)^3\). Observe that

$$\begin{aligned} \frac{|v_j|}{\Vert \textbf{v}\Vert }\le \left| \frac{v_j}{v_{j+k}}\right| =\prod _{l=j}^{j+k-1}(1+F_l)^{-1}\prod _{l=j}^{j+k-1}\frac{(a_lb_l)/m}{|\rho _l^+|/m}. \end{aligned}$$
(5.21)
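The equality in (5.21) is a telescoping of the definition (5.20): inverting \(F_l=\frac{v_{l+1}}{v_l}\cdot \frac{a_lb_l}{|\rho _l^+|}-1\) gives, whenever each \(1+F_l>0\) (as holds on the event of Lemma 5.4),

$$\begin{aligned} \frac{v_l}{v_{l+1}}=(1+F_l)^{-1}\frac{a_lb_l}{|\rho _l^+|}, \qquad \text {so}\qquad \frac{v_j}{v_{j+k}}=\prod _{l=j}^{j+k-1}\frac{v_l}{v_{l+1}}=\prod _{l=j}^{j+k-1}(1+F_l)^{-1}\prod _{l=j}^{j+k-1}\frac{(a_lb_l)/m}{|\rho _l^+|/m}, \end{aligned}$$

while the first inequality in (5.21) holds simply because \(\Vert \textbf{v}\Vert \ge |v_{j+k}|\).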

Since \(\{F_l\}_{l=1}^{n-1}\) satisfies the hypothesis of Lemma 5.4 and every \(l\in [j,j+k-1]\) satisfies \(l\le n -n^{\frac{1}{3}}(\log n)^3\), it follows that, with probability at least \(1-\varepsilon /2\), we have \(\prod _{l=j}^{j+k-1}(1+F_l)^{-1} = 1+o(1)\), uniformly in j.
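To spell out this estimate: since \(k=\lceil n^{\frac{1}{3}}\rceil \) and, on the event of Lemma 5.4, \(\max _{l\le n-n^{1/3}(\log n)^3}|F_l|=o(n^{-\frac{1}{3}})\),

$$\begin{aligned} \left| \log \prod _{l=j}^{j+k-1}(1+F_l)^{-1}\right| \le \sum _{l=j}^{j+k-1}\left| \log (1+F_l)\right| =O\left( k\max _{l}|F_l|\right) =o\left( n^{\frac{1}{3}}\cdot n^{-\frac{1}{3}}\right) =o(1), \end{aligned}$$

so the product is indeed \(1+o(1)\), uniformly over the admissible j.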

We then consider the product \(\prod _{l=j}^{j+k-1}\frac{(a_lb_l)/m}{|\rho _l^+|/m}\). As \(|\rho _l^+|\) is decreasing in l by (5.3),

$$\begin{aligned} \frac{|\rho _l^+|}{m}&\ge \frac{\left| \rho _{n-n^{1/3}(\log n)^3}^+\right| }{m}\\&=\frac{d_+ m-(m+n-2n^{1/3}(\log n)^3-1)}{2m}\\&\quad \times \left( 1+\sqrt{1-\frac{4(m-n^{1/3}(\log n)^3-1)(n-n^{1/3}(\log n)^3-1)}{(d_+ m-(m+n-2n^{1/3}(\log n)^3-1))^2}}\right) . \end{aligned}$$
(5.22)

Using \(d_+ m= m+n+2\sqrt{mn}\), the first factor on the right-hand side of (5.22) is \(\sqrt{\lambda }(1+O(n^{-\frac{2}{3}}(\log n)^3))\), while the expression under the square root is \(\Theta (n^{-\frac{2}{3}}(\log n)^3)\). Therefore, there is a constant \(c>0\) such that

$$\begin{aligned} \frac{|\rho _l^+|}{m} \ge \sqrt{\lambda }+cn^{-\frac{1}{3}}(\log n)^{\frac{3}{2}} \quad \text { for all } l\le n-n^{\frac{1}{3}}(\log n)^3. \end{aligned}$$
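In more detail, this bound combines the two factors of (5.22): the square root contributes a term of order \(n^{-\frac{1}{3}}(\log n)^{\frac{3}{2}}\), which dominates the \(O(n^{-\frac{2}{3}}(\log n)^3)\) correction in the first factor, so

$$\begin{aligned} \frac{|\rho _l^+|}{m}\ge \sqrt{\lambda }\left( 1+O(n^{-\frac{2}{3}}(\log n)^3)\right) \left( 1+\Theta (n^{-\frac{1}{3}}(\log n)^{\frac{3}{2}})\right) =\sqrt{\lambda }+\Theta (n^{-\frac{1}{3}}(\log n)^{\frac{3}{2}}). \end{aligned}$$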

Combining this with Lemma 5.5, we obtain that, for some \(c'>0\), with probability at least \(1-\varepsilon /2\),

$$\begin{aligned} \prod _{l=j}^{j+k-1}\frac{(a_lb_l)/m}{|\rho _l^+|/m}\le (1-c'n^{-\frac{1}{3}}(\log n)^{\frac{3}{2}}+o(n^{-\frac{1}{3}}))^k. \end{aligned}$$
(5.23)
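To sketch how (5.23) arises: Lemma 5.5 bounds the numerator uniformly for \(l\le n-n^{1/3}(\log n)^3\), and since \(m=\Theta (n)\), the error \((e\log n)^2n^{1/2}/m=O(n^{-\frac{1}{2}}(\log n)^2)\) is \(o(n^{-\frac{1}{3}})\); identifying \(\sqrt{mn}/m\) with \(\sqrt{\lambda }\) up to an error absorbed in the same term (as in the treatment of (5.22)), each factor satisfies

$$\begin{aligned} \frac{(a_lb_l)/m}{|\rho _l^+|/m}\le \frac{\sqrt{\lambda }+o(n^{-\frac{1}{3}})}{\sqrt{\lambda }+cn^{-\frac{1}{3}}(\log n)^{\frac{3}{2}}}\le 1-c'n^{-\frac{1}{3}}(\log n)^{\frac{3}{2}}+o(n^{-\frac{1}{3}}). \end{aligned}$$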

Therefore, with probability at least \(1-\varepsilon \),

$$\begin{aligned} \max _{j\le n-2n^{\frac{1}{3}}(\log n)^3}\frac{|v_j|}{\Vert \textbf{v}\Vert }\le \max _{j\le n-2n^{\frac{1}{3}}(\log n)^3}\left| \frac{v_j}{v_{j+k}}\right| =\exp \left( -c'(\log n)^{3/2}+o(1)\right) . \end{aligned}$$

The above quantity equals \(e^{o(1)}\,n^{-c'(\log n)^{1/2}}\), which is smaller than any \(n^{-d}\) for sufficiently large n. This completes the proof of Lemma 5.3. \(\square \)

We now have the necessary tools to prove Lemma 5.2 and conclude our argument for asymptotic independence.

Proof of Lemma 5.2

We observe that \(\widetilde{\mu }_1\) is equal to the largest eigenvalue of \(M^{(p)}_{n,m}\) where

$$\begin{aligned} mM^{(p)}_{n,m}= \begin{bmatrix} 0 & 0 & & & & & \\ 0 & \ddots & \ddots & & & & \\ & \ddots & 0 & 0 & & & \\ & & 0 & a^2_{n-p+1}+b^2_{n-p} & a_{n-p+1}b_{n-p+1} & & \\ & & & a_{n-p+1}b_{n-p+1} & \ddots & \ddots & \\ & & & & \ddots & \ddots & a_{n-1}b_{n-1}\\ & & & & & a_{n-1}b_{n-1} & a^2_{n}+b^2_{n-1}\\ \end{bmatrix}. \end{aligned}$$
(5.24)

This implies \(\mu _1\ge \widetilde{\mu }_1\). We now verify the upper bound on \(\mu _1-\widetilde{\mu }_1\).

Set \(\textbf{v}=(v_1,\dots ,v_n)^T\) to be a normalized principal eigenvector, i.e., \(\textbf{v}\) is a unit vector satisfying \(\textbf{v}^TM_{n,m}\textbf{v}=\mu _1\). Since \(\widetilde{\mu }_1\ge \textbf{v}^TM^{(p)}_{n,m}\textbf{v}\), it follows that, for \(\textbf{v}_{:n-p}:=(v_1,\dots ,v_{n-p})^T\) and

$$\begin{aligned} (m-p)M_{n-p,m-p}= \begin{bmatrix} a_1^2 & a_1b_1 & & & \\ a_1b_1 & a_2^2+b_1^2 & a_2b_2 & & \\ & a_2b_2 & a_3^2+b_2^2 & \ddots & \\ & & \ddots & \ddots & a_{n-p-1}b_{n-p-1}\\ & & & a_{n-p-1}b_{n-p-1} & a_{n-p}^2+b_{n-p-1}^2 \end{bmatrix}, \end{aligned}$$
(5.25)

we have

$$\begin{aligned} \mu _1-\widetilde{\mu }_1&\le \textbf{v}^T\left( M_{n,m}-M^{(p)}_{n,m}\right) \textbf{v}\\&=\frac{m-p}{m}\textbf{v}_{:n-p}^TM_{n-p,m-p}\textbf{v}_{:n-p}+\frac{2a_{n-p}b_{n-p}}{m}v_{n-p+1}v_{n-p}. \end{aligned}$$
(5.26)

As \(p>2n^{\frac{1}{3}}(\log n)^3\), Lemma 5.3 implies that for any \(d>0\) and \(\varepsilon >0\), with probability at least \(1-\varepsilon /3\), we have \(\Vert \textbf{v}_{:n-p}\Vert ^2=O(n^{-2d+1})\) and \(\max \{|v_{n-p+1}|, |v_{n-p}|\}=O(n^{-d})\). Furthermore, \(\frac{2}{m}a_{n-p}b_{n-p}=O(1)\) by Lemma 5.5, and \(\Vert M_{n-p,m-p}\Vert =O(1)\) (since it is a rescaled LOE matrix); each of these O(1) bounds holds with probability at least \(1-\varepsilon /3\). Therefore, (5.26) implies \(\mu _1-\widetilde{\mu }_1 = O(n^{-2d+1})\) with probability at least \(1-\varepsilon \). Setting \(d=\frac{1}{2}(D+1)\), we obtain the first statement of Lemma 5.2. The second statement then follows immediately from the observation that \(T_{2n}-Y_n=\Theta (n^{2/3}(\mu _1-\widetilde{\mu }_1))\). \(\square \)
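For concreteness, the exponent bookkeeping in this last step reads as follows (using that \(T_{2n}\) and \(Y_n\) share the shift \(d_+\) and the normalization \(\sqrt{\lambda }(1+\sqrt{\lambda })^{4/3}\)):

$$\begin{aligned} \mu _1-\widetilde{\mu }_1=O(n^{-2d+1})=O(n^{-D})\ \text { for } d=\tfrac{1}{2}(D+1), \qquad T_{2n}-Y_n=\frac{n^{2/3}(\mu _1-\widetilde{\mu }_1)}{\sqrt{\lambda }(1+\sqrt{\lambda })^{4/3}}=O(n^{-D+\frac{2}{3}}). \end{aligned}$$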