1 Introduction

Given a graph on N vertices, say, \(\{1,\ldots , N\}\), let \(A_N\) denote the adjacency matrix of the graph, whose (i, j)th entry is 1 if there is an edge between vertices i and j and 0 otherwise. The eigenvalues and eigenvectors of \(A_N\) are important statistics which encode crucial information about the graph. The present article considers a generalization of the most studied random graph, namely the Erdős–Rényi random graph (ERRG). It is a graph on N vertices where each edge is present independently with probability \(\varepsilon _N\). The adjacency matrix of the ERRG is a symmetric matrix with zero diagonal entries, and the entries above the diagonal are independent and identically distributed Bernoulli random variables with parameter \(\varepsilon _N\). We consider an inhomogeneous extension of the ERRG where the presence of an edge between vertices i and j is given by a Bernoulli random variable with parameter \(p_{i,j}\), and these \(\{p_{i,j}:\, 1\le i< j\le N\}\) need not be the same. When the \(p_{i,j}\) are the same for all pairs of vertices i and j, the model will be referred to as the (homogeneous) ERRG.

The mathematical study of the inhomogeneous ERRG, where the connection probabilities \(p_{i,j}\) come from a discretization of a symmetric, non-negative function f on \([0,1]^2\), was initiated in [9]. That article considered edge probabilities given by

$$\begin{aligned} p_{i,j}= \frac{1}{N} f\left( \frac{i}{N},\, \frac{j}{N}\right) . \end{aligned}$$

In that case the average degree is bounded, and the phase transition for the size of the largest cluster was studied in the same article (see also [8, 32] for results on the inhomogeneous ERRG). The present article considers a similar set-up where the average degree is unbounded and studies the properties of the eigenvalues of the adjacency matrix. The connection probabilities are given by

$$\begin{aligned} p_{i,j}=\varepsilon _N f\left( \frac{i}{N}, \frac{j}{N}\right) \end{aligned}$$

with the assumption that

$$\begin{aligned} N\varepsilon _N\rightarrow \infty . \end{aligned}$$
(1.1)

Let \(\lambda _1(A_N)\ge \cdots \ge \lambda _N(A_N)\) be the eigenvalues of \(A_N\). It was shown in [13] (see also [35] for a graphon approach) that the empirical spectral distribution of the centered adjacency matrix converges, after scaling by \(\sqrt{N\varepsilon _N}\), to a compactly supported measure \(\mu _f\). When \(f\equiv 1\), the limiting law \(\mu _f\) turns out to be the semicircle law. Note that \(f\equiv 1\) corresponds to the (homogeneous) ERRG (see also [16, 31] for the homogeneous case). Quantitative estimates on the largest eigenvalue in the homogeneous case (when \(N\varepsilon _N\gg (\log N)^4\)) were obtained in [20, 34], and it follows from their work that the smallest and the second largest eigenvalues converge to the edges of the support of the semicircle law. These results were improved recently in [7], where the sparsity condition was relaxed to \(N\varepsilon _N\gg \log N\) (which is also the connectivity threshold), and it was shown there that the inhomogeneous ERRG exhibits a similar behaviour. The largest eigenvalue of the inhomogeneous ERRG when \(N\varepsilon _N\ll \log N\) was treated in [6]. Under the assumption that \(N^{\xi }\ll N\varepsilon _N\) for some \(\xi \in (2/3, 1]\), it was proved in [17, Theorem 2.7] that the second largest eigenvalue of the (homogeneous) ERRG, after appropriate centering and scaling, converges in distribution to the Tracy–Widom law. These results were recently improved in [27]. The properties of the largest eigenvector in the homogeneous case were studied in [1, 17, 22, 27, 31].

The scaling limit of the maximum eigenvalue of the inhomogeneous ERRG also turns out to be interesting. The fluctuations of the maximum eigenvalue in the homogeneous case were studied in [18]. It was proved that

$$\begin{aligned} (\varepsilon _N(1-\varepsilon _N))^{-1/2}\left( \lambda _1( A_N)- \mathrm{E}[\lambda _1(A_N)]\right) \Rightarrow N(0, 2). \end{aligned}$$

The above result was shown under the assumption that

$$\begin{aligned} (\log N)^{\xi } \ll N\varepsilon _N \end{aligned}$$
(1.2)

for some \(\xi >8\), which is a stronger assumption than (1.1).

It is well known that in the classical case of a (standard) Wigner matrix, the largest eigenvalue converges to the Tracy–Widom law after appropriate centering and scaling. We note that the edge and the bulk of the spectrum scale differently in the ERRG. As pointed out before, the bulk is of the order \((N\varepsilon _N)^{1/2}\), while the largest eigenvalue is of the order \(N\varepsilon _N\). Letting

$$\begin{aligned} W_N=A_N-\mathrm{E}(A_N), \end{aligned}$$
(1.3)

where \(\mathrm{E}(A_N)\) is the entrywise expectation of \(A_N\), it is easy to see that

$$\begin{aligned} A_N=\varepsilon _N\mathbf{1}\mathbf{1}^\prime +W_N, \end{aligned}$$

where \(\mathbf{1}\) is the \(N\times 1\) vector with each entry 1. Since the empirical spectral distribution of \((N\varepsilon _N)^{-1/2}W_N\) converges to the semicircle law, the largest eigenvalue of the latter matrix converges to 2 almost surely. As \(\mathrm{E}[A_N]\) is a rank-one matrix, it turns out that the largest eigenvalue of \(A_N\) scales like \(N\varepsilon _N\), which is different from the bulk scaling.

The above behaviour can be treated as a special case of the perturbation of a Wigner matrix. When \(W_N\) is a symmetric random matrix with independent and identically distributed entries with mean zero and finite variance \(\sigma ^2\) and the deformation is of the form

$$\begin{aligned} M_N= \frac{W_N}{\sqrt{N}}+ P_N, \end{aligned}$$

the largest eigenvalue is well-studied. Motivated by the study of the adjacency matrix of the homogeneous ERRG, [20] studied the above deformation with \(P_N= m N^{-1/2} \mathbf{1}\mathbf{1}^\prime \), \(m\ne 0\). They showed that

$$\begin{aligned} N^{-1/2}\left( \lambda _1(M_N)- Nm-\frac{\sigma ^2}{m}\right) \Rightarrow N(0, 2\sigma ^2). \end{aligned}$$

Since the bulk of \(N^{-1/2} M_N\) lies within \([-2\sigma , \, 2\sigma ]\), the largest eigenvalue is detached from the bulk. In general, when the largest eigenvalue of the perturbation has the same order as the maximum eigenvalue of \(W_N\), the problem is more challenging. One of the seminal results in this direction was obtained in [3]. They exhibited a phase transition in the behaviour of the largest eigenvalue for complex Wishart matrices, which is now referred to as the BBP (Baik–Ben Arous–Péché) phase transition. It is roughly as follows. Suppose \(P_N\) is a deterministic matrix of rank k with non-trivial eigenvalues \(\theta _1\ge \theta _2\ge \cdots \ge \theta _k>0\). If \(\theta _i\le \sigma \), then \(\lambda _i(M_N)\rightarrow 2\sigma \) almost surely, and if \(\theta _i> \sigma \) then

$$\begin{aligned} \lambda _i\rightarrow \theta _i+\frac{\sigma ^2}{\theta _i},\;\;\text {almost surely}. \end{aligned}$$

See [2, 19] for further extensions. It is clear that when \(\theta _i>\sigma \) the corresponding eigenvalue lies outside the bulk of the spectrum. The phase transition is also present at the level of fluctuations around \(2\sigma \) or \(\theta _i+\sigma ^2/\theta _i\). It is known that under some moment conditions on the entries of \(W_N\) (see [11, 24, 25]), when \(\theta _i\le \sigma \), the fluctuations are governed by the Tracy–Widom law, and when \(\theta _i>\sigma \), the limiting distribution is given by the eigenvalues of a random matrix of order k. This limiting random matrix depends on the eigenvectors of \(P_N\) and also on the entries of \(W_N\). The non-universal nature was pointed out in [11]. For example, when \(W_N\) is a matrix from the Gaussian orthogonal ensemble and \(P_N=\theta _1 \mathbf{1}\mathbf{1}^{\prime }\), the limit is Gaussian, and if the entries of \(W_N\) are not Gaussian, then the limit is a convolution of a Gaussian law and the distribution from which the entries of \(W_N\) are sampled. One can find further details in [3,4,5, 11, 12, 24, 25] and the survey [29]. The case where the rank k depends on N was considered in [10, 23, 26]. Various applications of these results on outliers can be found in the literature, for example, [14, 15, 30].

The adjacency matrix of the inhomogeneous ERRG does not fall directly into the purview of the above results, since \(W_N\), as in (1.3), is a symmetric matrix with independent entries above the diagonal, but the entries have a variance profile, which moreover depends on the size of the graph. The inhomogeneity does not allow the use of local laws suitable for the semicircle law in an obvious way. The present article aims at extending the results obtained in [18] for the case that f is a constant to the case that f is a non-negative, symmetric, bounded, Riemann integrable function on \([0,1]^2\) which induces an integral operator of finite rank k, under the assumption that (1.2) holds. The case \(k\ge 2\) turns out to be substantially more difficult than the case \(k=1\) for the following reason. If \(k=1\), that is,

$$\begin{aligned} \mathrm{E}(A_N)=u_Nu_N^\prime , \end{aligned}$$

for some \(N\times 1\) deterministic column vector \(u_N\), then with high probability it holds that

$$\begin{aligned} u_N^\prime \left( \lambda I-W_N\right) ^{-1}u_N=1, \end{aligned}$$

where \(\lambda \) is the largest eigenvalue of \(A_N\). The above equation facilitates the asymptotic study of \(\lambda \). However, when \(k\ge 2\), the above equation takes a complicated form. The observation which provides a way out of this is that \(\lambda \) is also an eigenvalue of a \(k\times k\) matrix with high probability; the same is recorded in Lemma 5.2 of Sect. 5. Besides, working with the eigenvalues of a \(k\times k\) matrix needs more linear algebraic work when \(k\ge 2\). For example, the proof of Lemma 5.8, which is one of the major steps in the proof of a main result, becomes a tautology when \(k=1\).

The following results are obtained in the current paper. If the largest eigenvalue of the integral operator has multiplicity 1, then the largest eigenvalue of the adjacency matrix has a Gaussian fluctuation. More generally, it is shown that the eigenvalues of the adjacency matrix corresponding to the isolated eigenvalues (to be defined later) of the induced integral operator jointly converge to a multivariate Gaussian law. Under the assumption that the function f is Lipschitz continuous, the leading order term in the expansion of the expected value of the isolated eigenvalues is obtained. Furthermore, under an additional assumption, the inner product of the corresponding eigenvector with the discretized eigenfunctions of the integral operator corresponding to the other eigenvalues is shown to have a Gaussian fluctuation. Some important examples of such f include the rank-one case and the stochastic block models. It remains an open question whether the \((k+1)\)th eigenvalue follows a Tracy–Widom type scaling.

The mathematical set-up and the main results of the paper are stated in Sect. 2. Theorem 2.3 shows that, of the k largest eigenvalues, the isolated ones, centred by their means and appropriately scaled, converge to a multivariate normal distribution. Theorem 2.4 studies the first and second order terms in the expectation of the top k isolated eigenvalues. Theorems 2.5 and 2.6 study the behaviour of the eigenvectors corresponding to the top k isolated eigenvalues. Section 3 contains the special case when f is of rank one and the example of stochastic block models. A few preparatory estimates are noted in Sect. 4, which are used later in the proofs of the main results, given in Sect. 5. The estimates in Sect. 4 are proved in the Appendix.

2 The Set-Up And The Results

Let \(f:[0,1]\times [0,1]\rightarrow [0,\infty )\) be a function which is symmetric, bounded, and Riemann integrable, that is,

$$\begin{aligned} f(x,y)=f(y,x), \quad 0\le x,y\le 1, \end{aligned}$$
(2.1)

and the set of discontinuities of f in \([0,1]\times [0,1]\) has Lebesgue measure zero.

The integral operator \(I_f\) with kernel f is defined from \(L^2[0,1]\) to itself by

$$\begin{aligned} \bigl (I_f(g)\bigr )(x)=\int _0^1f(x,y)g(y)\,dy, \quad 0\le x\le 1. \end{aligned}$$

Besides the above, we assume that \(I_f\) is a non-negative definite operator and the range of \(I_f\) has a finite dimension.

Under the above assumptions \(I_f\) turns out to be a compact self-adjoint operator, and from spectral theory one obtains \(\theta _1\ge \theta _2\ge \cdots \ge \theta _k>0\) as the non-zero eigenvalues of \(I_f\) (where k is the dimension of the range of \(I_f\)), with eigenfunctions \(r_i\) corresponding to \(\theta _i\). Therefore, \(\{r_1,\ldots ,r_k\}\) is an orthonormal set in \(L^2[0,1]\), and by our assumptions, each \(r_i\) is Riemann integrable (see Lemma 6.1 in the Appendix). Also, for any \(g\in L^2[0,1]\) one has

$$\begin{aligned} I_f(g) =\sum _{i=1}^k \theta _i \langle \, r_i, g\rangle _{L^2[0,1]} r_i. \end{aligned}$$

Note that this gives

$$\begin{aligned} \int _0^1 \left( \sum _{i=1}^k \theta _i r_i(x) r_i(y) g(y)\right) \, dy= \int _0^1 f(x,y) g(y)\, dy \; \;\text {for almost all } x\in [0,1]. \end{aligned}$$

Since g is an arbitrary function in \(L^2[0,1]\) this immediately gives

$$\begin{aligned} f(x,y)=\sum _{i=1}^k\theta _ir_i(x)r_i(y), \quad \text {for almost all }(x,y)\in [0,1]\times [0,1]. \end{aligned}$$

Since the functions on both sides of the above equation are Riemann integrable, the corresponding Riemann sums are approximately equal, and hence there is no loss of generality in assuming that the above equality holds for every x and y.

That is, we now assume that

$$\begin{aligned} f(x,y)=\sum _{i=1}^k\theta _ir_i(x)r_i(y)\ge 0, \quad \text {for all}\;(x,y)\in [0,1]\times [0,1], \end{aligned}$$
(2.2)

where \(\theta _1\ge \cdots \ge \theta _k>0\) and \(\{r_1,\ldots ,r_k\}\) is an orthonormal set in \(L^2[0,1]\). The assumptions on \(r_1,\ldots ,r_k\) are listed below for easy reference.

Assumption F1. The functions \(r_1,\ldots ,r_k\) from [0, 1] to \({{\mathbb {R}}}\) are bounded and Riemann integrable.

Assumption F2. For each \(i=1,\ldots ,k\), \(r_i\) is Lipschitz, that is,

$$\begin{aligned} |r_i(x)-r_i(y)|\le K_i|x-y|, \end{aligned}$$

for some fixed \(K_i<\infty \). This is clearly stronger than Assumption F1, and will be needed in a few results. A consequence of this assumption is that there exists K such that

$$\begin{aligned} \left| f(x,y)-f(x^\prime ,y^\prime )\right| \le K\left( |x-x^\prime |+|y-y^\prime |\right) , \quad 0\le x,x^\prime ,y,y^\prime \le 1. \end{aligned}$$
(2.3)

Let \((\varepsilon _N:N\ge 1)\) be a real sequence satisfying

$$\begin{aligned} 0<\varepsilon _N\le \left[ \sup _{0\le x,y\le 1}f(x,y)\right] ^{-1}, \quad N\ge 1. \end{aligned}$$

The following will be the bare minimum assumption for all the results.

Assumption E1. For some \(\xi >8\), fixed once and for all,

$$\begin{aligned} \lim _{N\rightarrow \infty }\frac{1}{N\varepsilon _N}(\log N)^\xi =0, \end{aligned}$$

that is, (1.2) holds. Furthermore,

$$\begin{aligned} \lim _{N\rightarrow \infty }\varepsilon _N=\varepsilon _\infty , \end{aligned}$$
(2.4)

for some \(\varepsilon _\infty \ge 0\). It is worth emphasizing that we do not assume that \(\varepsilon _N\) necessarily goes to zero, although that may be the case.

For one result, we shall have to make a stronger assumption on \((\varepsilon _N)\) which is the following.

Assumption E2. As \(N\rightarrow \infty \),

$$\begin{aligned} N^{-2/3}\ll \varepsilon _N\ll 1. \end{aligned}$$
(2.5)

For \(N\ge 1\), let \({{\mathbb {G}}}_N\) be an inhomogeneous Erdős–Rényi graph where an edge is placed between vertices i and j with probability \(\varepsilon _N f(i/N,j/N)\), for \(i\le j\), the choice being made independently for each pair in \(\{(i,j):1\le i\le j\le N\}\). Note that we allow self-loops. Let \(A_N\) be the adjacency matrix of \({{\mathbb {G}}}_N\). In other words, \(A_N\) is an \(N\times N\) symmetric matrix, where \(\{A_N(i,j):1\le i\le j\le N\}\) is a collection of independent random variables, and

$$\begin{aligned} A_N(i,j)\sim \text {Bernoulli}\left( \varepsilon _N f\left( \frac{i}{N},\frac{j}{N}\right) \right) , \quad 1\le i\le j\le N. \end{aligned}$$
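
For readers who wish to experiment with this model, the following is a minimal simulation sketch in Python/NumPy of how \(A_N\) can be sampled for a given kernel f and sparsity parameter \(\varepsilon _N\). The helper name sample_adjacency and the rank-two kernel used at the end are illustrative choices, not part of the model specification.

```python
import numpy as np

def sample_adjacency(N, eps, f, seed=None):
    """Sample A_N with A_N(i,j) ~ Bernoulli(eps * f(i/N, j/N)) independently for i <= j
    (self-loops allowed), and symmetrize."""
    rng = np.random.default_rng(seed)
    x = np.arange(1, N + 1) / N
    P = eps * f(x[:, None], x[None, :])                      # matrix of edge probabilities
    assert P.max() <= 1.0, "eps must not exceed 1 / sup f"
    A = np.triu((rng.random((N, N)) <= P).astype(float))     # independent choices for i <= j
    return A + A.T - np.diag(np.diag(A))                     # reflect the upper triangle

# illustrative rank-two kernel f(x,y) = 1 + 0.3*r2(x)*r2(y) with r2(x) = sqrt(3)(2x-1);
# its integral operator has eigenvalues 1 and 0.3, and f >= 0.1 > 0 on [0,1]^2
f = lambda x, y: 1.0 + 0.3 * 3.0 * (2 * x - 1) * (2 * y - 1)
A = sample_adjacency(N=2000, eps=0.05, f=f, seed=0)
print(A.shape, int(A.sum()) // 2)                            # size and (roughly) the number of edges
```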

A few more notations are needed for stating the main results. For a moment, set \(\theta _0=\infty \) and \(\theta _{k+1}=-\infty \), and define the set of indices i for which \(\theta _i\) is isolated as follows:

$$\begin{aligned} {{\mathcal {I}}}=\{1\le i\le k: \theta _{i-1}>\theta _i>\theta _{i+1}\}. \end{aligned}$$

For an \(N\times N\) real symmetric matrix M, let \(\lambda _1(M)\ge \cdots \ge \lambda _N(M)\) denote its eigenvalues, as mentioned in Sect. 1. Finally, after the following definition, the main results will be stated.

Definition

A sequence of events \(E_N\) occurs with high probability, abbreviated as w.h.p., if

$$\begin{aligned} P(E_N^c)=O\left( e^{-(\log N)^\eta }\right) , \end{aligned}$$

for some \(\eta >1\). For random variables \(Y_N,Z_N\),

$$\begin{aligned} Y_N=O_{hp}(Z_N), \end{aligned}$$

means there exists a deterministic finite constant C such that

$$\begin{aligned} |Y_N|\le C|Z_N|\;\text {w.h.p.}, \end{aligned}$$

and

$$\begin{aligned} Y_N=o_{hp}(Z_N), \end{aligned}$$

means that for all \(\delta >0\),

$$\begin{aligned} |Y_N|\le \delta |Z_N|\;\text {w.h.p.} \end{aligned}$$

We shall say

$$\begin{aligned} Y_N=O_p(Z_N), \end{aligned}$$

to mean that

$$\begin{aligned} \lim _{x\rightarrow \infty }\sup _{N\ge 1}P(|Y_N|>x|Z_N|)=0, \end{aligned}$$

and

$$\begin{aligned} Y_N=o_p(Z_N), \end{aligned}$$

to mean that for all \(\delta >0\),

$$\begin{aligned} \lim _{N\rightarrow \infty }P(|Y_N|>\delta |Z_N|)=0. \end{aligned}$$

The reader may note that if \(Z_N\ne 0\) a.s., then “\(Y_N=O_{p}(Z_N)\)” and “\(Y_N=o_{p}(Z_N)\)” are equivalent to “\((Z_N^{-1}Y_N:N\ge 1)\) is stochastically tight” and “\(Z_N^{-1}Y_N{\mathop {\longrightarrow }\limits ^{P}}0\)”, respectively. Besides, “\(Y_N=O_{hp}(Z_N)\)” is a much stronger statement than “\(Y_N=O_p(Z_N)\)”, and similarly “\(Y_N=o_{hp}(Z_N)\)” is much stronger than “\(Y_N=o_{p}(Z_N)\)”.

In the rest of the paper, the subscript ‘N’ is dropped from notations like \(A_N\), \(W_N\), \(\varepsilon _N\), etc., as well as from notations that will be introduced later. The first result is about the first order behaviour of \(\lambda _i(A)\).

Theorem 2.1

Under Assumptions E1 and F1, for every \(1\le i\le k\),

$$\begin{aligned} \lambda _i(A)=N\varepsilon \theta _i\left( 1+o_{hp}(1)\right) . \end{aligned}$$

An immediate consequence of the above is that for all \(1\le i\le k\), \(\lambda _i(A)\) is non-zero w.h.p., and hence dividing by it is permissible, as is done in the next result. Define

$$\begin{aligned} e_i=\left[ \begin{matrix} N^{-1/2}r_i(1/N)\\ N^{-1/2}r_i(2/N)\\ \vdots \\ N^{-1/2}r_i(1)\\ \end{matrix} \right] , \quad 1\le i\le k. \end{aligned}$$
(2.6)

The second main result studies the asymptotic behaviour of \(\lambda _i(A)\), for \(i\in {\mathcal {I}}\), after appropriate centering and scaling.

Theorem 2.2

Under Assumptions E1 and F1, for every \(i\in {\mathcal {I}}\), as \(N\rightarrow \infty \),

$$\begin{aligned} \lambda _i(A)=\mathrm{E}\left( \lambda _i(A)\right) +\frac{N\theta _i\varepsilon }{\lambda _i(A)}e_i^\prime We_i+o_p(\sqrt{\varepsilon }), \end{aligned}$$

where W is as defined in (1.3).

The next result is a corollary of the previous two.

Theorem 2.3

Under Assumptions E1 and F1, if \({\mathcal {I}}\) is a non-empty set, then as \(N\rightarrow \infty \),

$$\begin{aligned} \left( \varepsilon ^{-1/2}\left( \lambda _i(A)-\mathrm{E}[\lambda _i(A)]\right) :i\in {\mathcal I}\right) \Rightarrow \left( G_i:i\in {{\mathcal {I}}}\right) , \end{aligned}$$
(2.7)

where the right hand side is a multivariate normal random vector in \({{\mathbb {R}}}^{|{{\mathcal {I}}}|}\), with mean zero and

$$\begin{aligned} \mathrm{Cov}(G_i,G_j)=2\int _0^1\int _0^1 r_i(x)r_i(y)r_j(x)r_j(y)f(x,y)\left[ 1-\varepsilon _\infty f(x,y)\right] \,dx\,dy, \end{aligned}$$
(2.8)

for all \(i,j\in {\mathcal {I}}\).

For \(i,j\in {\mathcal {I}}\), since W is a symmetric matrix whose upper triangular entries are independent with zero mean, it follows that as \(N\rightarrow \infty \),

$$\begin{aligned}&Cov\left( e_i^\prime We_i,e_j^\prime We_j\right) \nonumber \\&\sim 4\sum _{1\le k\le l\le N}\mathrm{Cov}\left( e_i(k)W(k,l)e_i(l),e_j(k)W(k,l)e_j(l)\right) \nonumber \\&=4N^{-2}\sum _{1\le k\le l\le N}r_i\left( \frac{k}{N}\right) r_i\left( \frac{l}{N}\right) r_j\left( \frac{k}{N}\right) r_j\left( \frac{l}{N}\right) \varepsilon f\left( \frac{k}{N},\frac{l}{N}\right) \left[ 1-\varepsilon f\left( \frac{k}{N},\frac{l}{N}\right) \right] \nonumber \\&\sim 4\varepsilon \int _0^1\,dx\,\int _x^1\,dy\,r_i(x)r_i(y)r_j(x)r_j(y)f(x,y)\left[ 1-\varepsilon _\infty f(x,y)\right] \nonumber \\&=2\varepsilon \int _0^1\int _0^1r_i(x)r_i(y)r_j(x)r_j(y)f(x,y)\left[ 1-\varepsilon _\infty f(x,y)\right] \,dx\,dy. \end{aligned}$$
(2.9)

With the help of the above, it may be checked that the Lindeberg central limit theorem for triangular arrays of independent summands implies that as \(N\rightarrow \infty \),

$$\begin{aligned} \left( \varepsilon ^{-1/2}e_i^\prime We_i:i\in \mathcal I\right) \Rightarrow (G_i:i\in {\mathcal {I}}), \end{aligned}$$
(2.10)

where the right hand side is a zero mean Gaussian vector with covariance given by (2.8). Therefore, Theorem 2.3 would follow from Theorems 2.1 and 2.2.
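
As a numerical sanity check on (2.8)–(2.10), the short Monte Carlo sketch below estimates the covariance matrix of \((\varepsilon ^{-1/2}e_i^\prime We_i)\) by simulation and compares it with a Riemann-sum evaluation of (2.8). The rank-two kernel, the parameters, and the substitution of \(\varepsilon _N\) for \(\varepsilon _\infty \) at finite N are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(1)
N, eps, reps = 500, 0.1, 500

x = np.arange(1, N + 1) / N
r1, r2 = np.ones(N), np.sqrt(3.0) * (2 * x - 1)             # orthonormal eigenfunctions of I_f
theta1, theta2 = 1.0, 0.3
F = theta1 * np.outer(r1, r1) + theta2 * np.outer(r2, r2)   # values f(k/N, l/N); here f >= 0.1
P = eps * F
e1, e2 = r1 / np.sqrt(N), r2 / np.sqrt(N)                   # e_i as in (2.6)

s = np.empty((2, reps))
for t in range(reps):
    A = np.triu((rng.random((N, N)) <= P).astype(float))
    A = A + A.T - np.diag(np.diag(A))
    W = A - P                                               # centred adjacency matrix, as in (1.3)
    s[0, t] = e1 @ W @ e1 / np.sqrt(eps)
    s[1, t] = e2 @ W @ e2 / np.sqrt(eps)

print("Monte Carlo covariance:\n", np.cov(s))
# Riemann-sum evaluation of (2.8), with eps in place of eps_infinity at finite N
C = F * (1 - eps * F) / N**2
for (ri, rj) in [(r1, r1), (r1, r2), (r2, r2)]:
    print(2 * np.einsum('i,j,i,j,ij->', ri, ri, rj, rj, C))
```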

Remark 2.1

If \(f>0\) a.e. on \([0,1]\times [0,1]\), then the Krein–Rutman theorem (see Lemma 6.2) implies that \(1\in {\mathcal {I}}\), and that \(r_1>0\) a.e. Thus, in this case, if \(\varepsilon _\infty =0\), then

$$\begin{aligned} \mathrm{Var}(G_1)=2\int _0^1\int _0^1r_1(x)^2r_1(y)^2f(x,y)\,dx\,dy>0. \end{aligned}$$

Remark 2.2

For a fixed \(\theta >1\), define

$$\begin{aligned} f(x,y)=\theta \mathbf{1}\left( x\vee y<\frac{1}{2}\right) +\mathbf{1}\left( x\wedge y>\frac{1}{2}\right) , \quad 0\le x,y\le 1. \end{aligned}$$

In this case, the integral operator associated with f has exactly two non-zero eigenvalues, which are \(\theta /2\) and 1/2, with corresponding normalized eigenfunctions \(r_1(x)=\sqrt{2}\mathbf{1}(x<1/2)\) and \(r_2(x)=\sqrt{2}\mathbf{1}(x\ge 1/2)\), respectively. Let \((\varepsilon _N)\) satisfy Assumption E1 and suppose that \(\varepsilon _\infty =0\). Theorem 2.3 implies that as \(N\rightarrow \infty \),

$$\begin{aligned} \left( \varepsilon ^{-1/2}\left( \lambda _1(A)-\mathrm{E}[\lambda _1(A)]\right) ,\varepsilon ^{-1/2}\left( \lambda _2(A)-\mathrm{E}[\lambda _2(A)]\right) \right) \Rightarrow \left( G_1,G_2\right) , \end{aligned}$$

where the right hand side has a bivariate normal distribution with mean zero. Furthermore, since \(r_1r_2\) is identically zero, it follows that \(G_1\) and \(G_2\) are uncorrelated and hence independent.
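
As a quick sanity check on this remark, one can discretize the kernel and compute the eigenvalues of \(N^{-1}\bigl (f(i/N,j/N)\bigr )_{i,j}\), which approximate those of \(I_f\). The sketch below (with an arbitrary choice of \(\theta \) and of the discretization size) recovers \(\theta /2\) and 1/2 numerically.

```python
import numpy as np

theta, N = 3.0, 2000
x = np.arange(1, N + 1) / N
X, Y = np.meshgrid(x, x, indexing="ij")

# kernel of Remark 2.2: f(x,y) = theta*1(max(x,y) < 1/2) + 1(min(x,y) > 1/2)
F = theta * (np.maximum(X, Y) < 0.5) + 1.0 * (np.minimum(X, Y) > 0.5)

# the eigenvalues of N^{-1} F approximate those of the integral operator I_f
vals = np.linalg.eigvalsh(F / N)
print(np.round(vals[-3:], 3))   # expect approximately [0, 1/2, theta/2] = [0, 0.5, 1.5]
```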

Remark 2.3

That the claim of Theorem 2.3 may not hold if \(i\notin \mathcal I\) is evident from the following example. As in Remark 2.2, suppose that Assumption E1 holds and that \(\varepsilon _\infty =0\). Let

$$\begin{aligned} f(x,y)=\mathbf{1}\left( x\vee y<\frac{1}{2}\right) +\mathbf{1}\left( x\wedge y>\frac{1}{2}\right) , \quad 0\le x,y\le 1. \end{aligned}$$

In this case, the integral operator associated with f has exactly one non-zero eigenvalue, namely 1/2, which has multiplicity 2, with eigenfunctions \(r_1\) and \(r_2\) as in Remark 2.2. In other words, \(I_f\) does not have any simple non-zero eigenvalue.

Theorem 2.3 itself (applied to each of the two diagonal blocks of A, which are adjacency matrices of independent homogeneous ERRGs) implies that there exists \(\beta _N\in {{\mathbb {R}}}\) such that

$$\begin{aligned} \varepsilon ^{-1/2}\left( \lambda _1(A)-\beta \right) \Rightarrow G_1\vee G_2, \end{aligned}$$

where \(G_1\) and \(G_2\) are independent zero mean normal random variables. Furthermore,

$$\begin{aligned} \mathrm{Var}(G_1)=2\int _0^1\int _0^1 r_1(x)^2r_1(y)^2f(x,y)\,dx\,dy=8\int _0^1\int _0^1\mathbf{1}\left( x\vee y\le \frac{1}{2}\right) \,dx\,dy=2. \end{aligned}$$

That is, \(G_1\) and \(G_2\) are i.i.d. N(0, 2). Hence, there does not exist a centering and a scaling by which \(\lambda _1(A)\) converges weakly to a non-degenerate normal distribution.

The next main result of the paper studies asymptotics of \(\mathrm{E}(\lambda _i(A))\) for \(i\in {\mathcal {I}}\).

Theorem 2.4

Under Assumptions E1 and F2, it holds for all \(i\in {\mathcal {I}}\),

$$\begin{aligned} \mathrm{E}\left[ \lambda _i(A)\right] =\lambda _i(B)+O\left( \sqrt{\varepsilon }+(N\varepsilon )^{-1}\right) , \end{aligned}$$

where B is a \(k\times k\) symmetric deterministic matrix, depending on N, defined by

$$\begin{aligned} B(j,l)=\sqrt{\theta _j\theta _l}N\varepsilon e_j^\prime e_l+\theta _i^{-2}\sqrt{\theta _j\theta _l}(N\varepsilon )^{-1}\mathrm{E}\left( e_j^\prime W^2 e_l\right) , \quad 1\le j,l\le k, \end{aligned}$$

and \(e_j\) and W are as defined in (2.6) and (1.3), respectively.

The next result studies the asymptotic behaviour of the normalized eigenvector corresponding to \(\lambda _i(A)\), again for isolated indices \(i\in {\mathcal {I}}\). It is shown that this eigenvector is asymptotically aligned with \(e_i\), and hence it is asymptotically orthogonal to \(e_j\) for \(j\ne i\). Upper bounds on the rates of convergence are obtained.

Theorem 2.5

As in Theorem 2.4, let Assumptions E1 and F2 hold. Then, for a fixed \(i\in {\mathcal {I}}\),

$$\begin{aligned} \lim _{N\rightarrow \infty }P\left( \lambda _i(A)\quad \text {is an eigenvalue of multiplicity }1\right) =1. \end{aligned}$$
(2.11)

If v is the eigenvector, with \(L^2\)-norm 1, of A corresponding to \(\lambda _i(A)\), then

$$\begin{aligned} e_i^\prime v=1+O_p\left( (N\varepsilon )^{-1}\right) , \end{aligned}$$
(2.12)

that is, \(N\varepsilon (1-e_i^\prime v)\) is stochastically tight. When \(k\ge 2\), it holds that

$$\begin{aligned} e_j^\prime v=O_p\left( (N\varepsilon )^{-1}\right) , \quad j\in \{1,\ldots ,k\}\setminus \{i\}. \end{aligned}$$
(2.13)

The last main result of this paper studies finer fluctuations of (2.13) under an additional condition.

Theorem 2.6

Let \(k\ge 2\), \(i\in {\mathcal {I}}\) and Assumptions E2 and F2 hold. If v is as in Theorem 2.5, then, for all \(j\in \{1,\ldots ,k\}\setminus \{i\}\),

$$\begin{aligned} e_j^\prime v=\frac{1}{\theta _i-\theta _j}\left[ \theta _i\frac{1}{\lambda _i(A)}e_i^\prime We_j+(N\varepsilon )^{-2}\frac{1}{\theta _i}\mathrm{E}\left( e_i^\prime W^2 e_j\right) \right] +o_p\left( \frac{1}{N\sqrt{\varepsilon }}\right) . \end{aligned}$$

Remark 2.4

An immediate consequence of Theorem 2.6 is that under Assumption E2, there exists a deterministic sequence \((z_N:N\ge 1)\) given by

$$\begin{aligned} z=\frac{1}{(N\varepsilon )^2\theta _i(\theta _i-\theta _j)}\mathrm{E}\left( e_i^\prime W^2e_j\right) , \end{aligned}$$

such that as \(N\rightarrow \infty \),

$$\begin{aligned} Z_{ij}=N\sqrt{\varepsilon }\left( e_j^\prime v-z\right) \end{aligned}$$
(2.14)

converges weakly to a normal distribution with mean zero, for all \(i\in {\mathcal {I}}\) and \(j\in \{1,\ldots ,k\}\setminus \{i\}\). Furthermore, the convergence holds jointly for all i and j satisfying the above. This, along with (2.7), implies that the collection

$$\begin{aligned} \left( Z_{ij}:i\in {\mathcal {I}}, j\in \{1,\ldots ,k\}\setminus \{i\}\right) \cup \left( \varepsilon ^{-1/2}\left( \lambda _i(A)-\mathrm{E}\left[ \lambda _i(A)\right] \right) :i\in \mathcal I\right) \end{aligned}$$

converges weakly, as \(N\rightarrow \infty \), to

$$\begin{aligned} \left( G_{ij}:i\in {\mathcal {I}}, j\in \{1,\ldots ,k\}\setminus \{i\}\right) \cup \left( G_i:i\in \mathcal I\right) \end{aligned}$$

which is a zero mean Gaussian vector in \({{\mathbb {R}}}^{k|{{\mathcal {I}}}|}\). The covariance matrix of \((G_i)\) is as in (2.8), and \(\mathrm{Cov}(G_{ij},G_{i^\prime \,j^\prime })\) and \(\mathrm{Cov}(G_{ij},G_{i^\prime })\) are not hard to calculate by proceeding as in (2.9).

3 Examples and Special Cases

3.1 The Rank One Case

Let us consider the special case of \(k=1\), that is,

$$\begin{aligned} f(x,y)=\theta r(x)r(y), \end{aligned}$$

for some \(\theta >0\), and a bounded Riemann integrable \(r:[0,1]\rightarrow [0,\infty )\) satisfying

$$\begin{aligned} \int _0^1r(x)^2\,dx=1. \end{aligned}$$

In this case, Theorem 2.3 implies that

$$\begin{aligned} \varepsilon ^{-1/2}\left( \lambda _1(A)-\mathrm{E}\left( \lambda _1(A)\right) \right) \Rightarrow G_1, \end{aligned}$$

as \(N\rightarrow \infty \), where

$$\begin{aligned} G_1\sim N\left( 0,\sigma ^2\right) , \end{aligned}$$

with

$$\begin{aligned} \sigma ^2=2\theta \left( \int _0^1r(x)^3\,dx\right) ^2-2\theta ^2\varepsilon _\infty \left( \int _0^1r(x)^4\,dx\right) ^2. \end{aligned}$$

If r is Lipschitz and \(\varepsilon _\infty =0\), then the claim of Theorem 2.4 boils down to

$$\begin{aligned} \mathrm{E}\left[ \lambda _1(A)\right] =\theta N\varepsilon e_1^\prime e_1+(N\varepsilon \theta )^{-1}\mathrm{E}\left( e_1^\prime W^2 e_1\right) +O\left( \sqrt{\varepsilon }+(N\varepsilon )^{-1}\right) , \end{aligned}$$
(3.1)

where

$$\begin{aligned} e_1=N^{-1/2}\left[ r(1/N)\,r(2/N)\,\ldots r(1)\right] ^\prime . \end{aligned}$$

Lipschitz continuity of r implies that

$$\begin{aligned} e_1^\prime e_1=1+O\left( N^{-1}\right) , \end{aligned}$$

and hence (3.1) becomes

$$\begin{aligned} \mathrm{E}\left[ \lambda _1(A)\right] =\theta N\varepsilon +(N\varepsilon \theta )^{-1}\mathrm{E}\left( e_1^\prime W^2 e_1\right) +O\left( \sqrt{\varepsilon }+(N\varepsilon )^{-1}\right) . \end{aligned}$$
(3.2)

Clearly,

$$\begin{aligned}&\mathrm{E}\left( e_1^\prime W^2 e_1\right) \\&\quad =\frac{1}{N}\sum _{i=1}^Nr\left( \frac{i}{N}\right) ^2\mathrm{E}\left[ W^2(i,i)\right] \\&\quad =\frac{1}{N}\sum _{i=1}^Nr\left( \frac{i}{N}\right) ^2\sum _{1\le j\le N,\,j\ne i}\varepsilon f\left( \frac{i}{N},\frac{j}{N}\right) \left( 1-\varepsilon f\left( \frac{i}{N},\frac{j}{N}\right) \right) \\&\quad =\theta \varepsilon N^{-1}\sum _{1\le i\ne j\le N}r\left( \frac{i}{N}\right) ^3r\left( \frac{j}{N}\right) +O\left( N^{-1}\varepsilon ^2\right) \\&\quad =N\theta \varepsilon \int _0^1r(x)^3\,dx\int _0^1r(y)\,dy+O(\varepsilon ). \end{aligned}$$

In conjunction with (3.2) this yields

$$\begin{aligned} \mathrm{E}\left[ \lambda _1(A)\right] =\theta N\varepsilon +\int _0^1r(x)^3\,dx\int _0^1r(y)\,dy+O\left( \sqrt{\varepsilon }+(N\varepsilon )^{-1}\right) . \end{aligned}$$
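
The rank-one expansion above, together with the Gaussian fluctuation from Theorem 2.3, can be probed by a short simulation. The sketch below (with an arbitrary Lipschitz choice of r, small N, and \(\varepsilon _N\) substituted for \(\varepsilon _\infty \)) compares the empirical mean and variance of \(\lambda _1(A)\) with \(\theta N\varepsilon +\int r^3\int r\) and \(\varepsilon \sigma ^2\); the comparison is only indicative, since the Monte Carlo error and the \(O(\sqrt{\varepsilon }+(N\varepsilon )^{-1})\) term are not negligible at these sizes.

```python
import numpy as np

rng = np.random.default_rng(2)
N, eps, theta, reps = 600, 0.1, 1.0, 300

x = np.arange(1, N + 1) / N
r = np.sqrt(3.0 / 7.0) * (1 + x)          # Lipschitz, with int_0^1 r(x)^2 dx = 1
P = eps * theta * np.outer(r, r)          # edge probabilities eps * f(i/N, j/N)
assert P.max() <= 1

lam1 = np.empty(reps)
for t in range(reps):
    A = np.triu((rng.random((N, N)) <= P).astype(float))
    A = A + A.T - np.diag(np.diag(A))
    lam1[t] = np.linalg.eigvalsh(A)[-1]

m1, m3, m4 = r.mean(), (r**3).mean(), (r**4).mean()          # Riemann sums for int r, int r^3, int r^4
print("mean:", lam1.mean(), "vs", theta * N * eps + m3 * m1)
sigma2 = 2 * theta * m3**2 - 2 * theta**2 * eps * m4**2      # sigma^2 with eps in place of eps_infinity
print("var :", lam1.var(), "vs", eps * sigma2)
```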

3.2 Stochastic Block Model

Another important example is the stochastic block model, defined as follows. Suppose that

$$\begin{aligned} f(x,y)=\sum _{i,j=1}^kp(i,j)\mathbf{1}_{B_i}(x)\mathbf{1}_{B_j}(y), \quad 0\le x,y\le 1, \end{aligned}$$

where p is a \(k\times k\) symmetric positive definite matrix, and \(B_1,\ldots ,B_k\) are disjoint Borel subsets of [0, 1] whose boundaries are sets of measure zero, that is, their indicators are Riemann integrable. We show below how to compute the eigenvalues and eigenfunctions of \(I_f\), the integral operator associated with f.

Let \(\beta _i\) denote the Lebesgue measure of \(B_i\), which we assume without loss of generality to be strictly positive. Rewrite

$$\begin{aligned} f(x,y)=\sum _{i,j=1}^k{\tilde{p}}(i,j)s_i(x)s_j(y), \end{aligned}$$

where

$$\begin{aligned} {\tilde{p}}(i,j)=p(i,j)\sqrt{\beta _i\beta _j}, \quad 1\le i,j\le k, \end{aligned}$$

and

$$\begin{aligned} s_i=\beta _i^{-1/2}\mathbf{1}_{B_i}, \quad 1\le i\le k. \end{aligned}$$

Thus, \(\{s_1,\ldots ,s_k\}\) is an orthonormal set in \(L^2[0,1]\). Let

$$\begin{aligned} {\tilde{p}}=U^\prime DU, \end{aligned}$$

be a spectral decomposition of \({\tilde{p}}\), where U is a \(k\times k\) orthogonal matrix, and

$$\begin{aligned} D=\mathrm{Diag}(\theta _1,\ldots ,\theta _k), \end{aligned}$$

for some \(\theta _1\ge \cdots \ge \theta _k>0\).

Define functions \(r_1,\ldots ,r_k\) by

$$\begin{aligned} \left[ \begin{matrix} r_1(x)\\ \vdots \\ r_k(x) \end{matrix} \right] =U \left[ \begin{matrix} s_1(x)\\ \vdots \\ s_k(x) \end{matrix} \right] ,\,x\in [0,1]. \end{aligned}$$

It is easy to see that \(r_1,\ldots ,r_k\) are orthonormal in \(L^2[0,1]\), and for \(0\le x,y\le 1,\)

$$\begin{aligned} f(x,y)&=\left[ s_1(x)\,\ldots \,s_k(x)\right] {\tilde{p}}\left[ s_1(y)\,\ldots \,s_k(y)\right] ^\prime \\&=\left[ r_1(x)\,\ldots \,r_k(x)\right] U{\tilde{p}} U^\prime \left[ r_1(y)\,\ldots \,r_k(y)\right] ^\prime \\&=\sum _{i=1}^k\theta _ir_i(x)r_i(y). \end{aligned}$$

Thus, \(\theta _1,\ldots ,\theta _k\) are the eigenvalues of \(I_f\), and \(r_1,\ldots ,r_k\) are the corresponding eigenfunctions.
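
In practice, the recipe above reduces the spectral analysis of \(I_f\) for a stochastic block model to an eigendecomposition of the \(k\times k\) matrix \({\tilde{p}}\). The following sketch carries this out for an arbitrary illustrative choice of p and of the block measures \(\beta _i\) (the helper r below, which evaluates the eigenfunctions, is likewise only for illustration).

```python
import numpy as np

# illustrative 3-block model: p symmetric positive definite, beta_i > 0 with sum 1
p = np.array([[3.0, 1.0, 0.5],
              [1.0, 2.0, 0.4],
              [0.5, 0.4, 1.5]])
beta = np.array([0.5, 0.3, 0.2])                 # Lebesgue measures of B_1, B_2, B_3
blocks = [(0.0, 0.5), (0.5, 0.8), (0.8, 1.0)]    # B_i = [a_i, b_i)

p_tilde = p * np.sqrt(np.outer(beta, beta))      # p_tilde(i,j) = p(i,j) sqrt(beta_i beta_j)
theta, Q = np.linalg.eigh(p_tilde)               # columns of Q are orthonormal eigenvectors
order = np.argsort(theta)[::-1]
theta, Q = theta[order], Q[:, order]             # theta_1 >= ... >= theta_k > 0
print("eigenvalues of I_f:", theta)

# eigenfunctions: r_m(x) = sum_i U(m,i) s_i(x) with s_i = beta_i^{-1/2} 1_{B_i};
# in the decomposition p_tilde = U' D U of the text, U = Q', so U(m,i) = Q[i,m]
def r(m, x):
    x = np.asarray(x, dtype=float)
    s = np.array([(x >= a) & (x < b) for (a, b) in blocks]) / np.sqrt(beta)[:, None]
    return Q[:, m] @ s

xs = np.array([0.1, 0.3, 0.6, 0.9])
print("r_1 at a few points:", r(0, xs))
```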

4 Estimates

In this section, we record a few estimates that will subsequently be used in the proofs. Since their proofs are routine, they are postponed to the Appendix. Let W be as defined in (1.3). Throughout this section, Assumptions E1 and F1 will be in force.

Lemma 4.1

There exist constants \(C_1\), \(C_2 > 0\) such that

$$\begin{aligned} P \left( \Vert W \Vert \ge 2 \sqrt{MN\varepsilon } + C_1 (N \varepsilon )^{1/4} ( \log N)^{\xi /4}\right) \le e^{-C_2(\log N)^{\xi /4}}, \end{aligned}$$
(4.1)

where \(M=\sup _{0\le x,y\le 1}f(x,y)\). Consequently,

$$\begin{aligned} \Vert W\Vert =O_{hp}\left( \sqrt{N\varepsilon }\right) . \end{aligned}$$
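
The bound of Lemma 4.1 is easy to visualize in simulation: across a range of sparsity levels, \(\Vert W\Vert /\sqrt{N\varepsilon }\) remains bounded, and in fact stays below \(2\sqrt{M}\) plus a lower order term. A rough sketch, with an arbitrary rank-two kernel and arbitrary parameters:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 1200
x = np.arange(1, N + 1) / N
F = 1.0 + 0.9 * np.outer(2 * x - 1, 2 * x - 1)   # rank-two kernel with M = sup f = 1.9

for eps in [0.02, 0.05, 0.1]:
    A = np.triu((rng.random((N, N)) <= eps * F).astype(float))
    A = A + A.T - np.diag(np.diag(A))
    W = A - eps * F                                           # centred adjacency matrix
    norm_W = np.abs(np.linalg.eigvalsh(W))[[0, -1]].max()     # spectral norm of W
    print(f"eps={eps}: ||W||/sqrt(N eps) = {norm_W / np.sqrt(N * eps):.3f}, 2*sqrt(M) = {2 * np.sqrt(1.9):.3f}")
```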

The notations \(e_1\) and \(e_2\) introduced in the next lemma and used in the subsequent lemmas should not be confused with \(e_j\) defined in (2.6). Continuing to suppress ‘N’ in the subscript, let

$$\begin{aligned} L&=[\log N], \end{aligned}$$

where [x] is the largest integer less than or equal to x.

Lemma 4.2

There exists \(0<C_1<\infty \) such that if \(e_1\) and \(e_2\) are \(N\times 1\) vectors with each entry in \([-1/\sqrt{N},1/\sqrt{N}]\), then

$$\begin{aligned} \left| \mathrm{E}\left( e_1^\prime W^ne_2\right) \right| \le (C_1N\varepsilon )^{n/2}, \quad 2\le n\le L. \end{aligned}$$

Lemma 4.3

There exists \(\eta _1>1\) such that for \(e_1,e_2\) as in Lemma 4.2, it holds that

$$\begin{aligned}&\max _{2\le n\le L}P\left( \left| e_1^\prime W^ne_2-\mathrm{E}\left( e_1^\prime W^ne_2\right) \right| >N^{(n-1)/2}\varepsilon ^{n/2}(\log N)^{n\xi /4}\right) \nonumber \\&\quad =O\left( e^{-(\log N)^{\eta _1}}\right) , \end{aligned}$$
(4.2)

where \(\xi \) is as in (1.2). In addition,

$$\begin{aligned} e_1^\prime We_2=o_{hp}\left( N\varepsilon \right) . \end{aligned}$$
(4.3)

Lemma 4.4

If \(e_1,e_2\) are as in Lemma 4.2, then

$$\begin{aligned} \mathrm{Var}\left( e_1^\prime We_2\right) =O(\varepsilon ), \end{aligned}$$
(4.4)

and

$$\begin{aligned} \mathrm{E}\left( e_1^\prime W^3e_2\right) =O(N\varepsilon ). \end{aligned}$$
(4.5)

5 Proof of the Main Results

This section is devoted to the proofs of the main results. The section is split into several subsections, each containing the proof of one main result, for ease of reading. Unless mentioned otherwise, Assumptions E1 and F1 are made.

5.1 Proof of Theorem 2.1

We start by showing that Theorem 2.1 is a corollary of Lemma 4.1. At this point, it should be clarified that throughout this section, \(e_j\) will always be as defined in (2.6).

Proof of Theorem 2.1

For a fixed \(i\in \{1,\ldots ,k\}\), it follows that

$$\begin{aligned} \left| \lambda _i(A)-\lambda _i\left( \mathrm{E}(A)\right) \right| \le \Vert W\Vert =O_{hp}\left( (N\varepsilon )^{1/2}\right) , \end{aligned}$$

by Lemma 4.1. In order to complete the proof, it suffices to show that

$$\begin{aligned} \lim _{N\rightarrow \infty }(N\varepsilon )^{-1}\lambda _i\left( \mathrm{E}(A)\right) =\theta _i, \end{aligned}$$

which however follows from the observation that (2.2) implies that

$$\begin{aligned} \mathrm{E}(A)=N\varepsilon \sum _{j=1}^k\theta _je_je_j^\prime . \end{aligned}$$
(5.1)

This completes the proof. \(\square \)

5.2 Proof of Theorem 2.2

Proceeding towards the proof of Theorem 2.2, let us fix \(i\in {\mathcal {I}}\), once and for all, denote

$$\begin{aligned} \mu =\lambda _i(A), \end{aligned}$$

and let V be a \(k\times k\) real symmetric matrix, depending on N (which is suppressed in the notation), defined by

$$\begin{aligned} V(j,l)= {\left\{ \begin{array}{ll} N\varepsilon \sqrt{\theta _j\theta _l}\,e_j^\prime \left( I-\frac{1}{\mu }W\right) ^{-1}e_l,&{}\text {if }\Vert W\Vert <\mu ,\\ 0,&{}\text {else}, \end{array}\right. } \end{aligned}$$

for all \(1\le j,l\le k\). It should be noted that if \(\Vert W\Vert <\mu \), then \(I-W/\mu \) is invertible.

The proof of Theorem 2.2, which is the main content of this paper, involves several steps, and hence we sketch its outline below for the reader’s convenience.

(1) The first major step of the proof is to show that w.h.p., \(\mu \) exactly equals \(\lambda _i(V)\). This is done in Lemma 5.2. When \(i=k=1\), the matrix V is a scalar, and in that case the equation boils down to \(\mu =V\), which is a consequence of the resolvent equation. For higher values of k, the Gershgorin circle theorem is employed to obtain the desired claim. This step is one of the novelties of the present proof.

(2) The next step in the proof is to write \(\mu \) as the solution of an equation of the form

$$\begin{aligned} \mu =\lambda _i\left( \sum _{n=0}^L\mu ^{-n}Y_n\right) +Error, \end{aligned}$$

for suitable matrices \(Y_0,Y_1,\ldots ,Y_L\). This is done in Lemma 5.4.

(3) The third step is to replace \(Y_n\) by \(\mathrm{E}(Y_n)\) in the equation obtained in the previous step, for \(n\ge 2\). This is done in Lemma 5.5.

(4) Arguably the most important step in the proof is to obtain an equation of the form

$$\begin{aligned} \mu ={\bar{\mu }}+\mu ^{-1}\zeta +Error, \end{aligned}$$

for some deterministic \({\bar{\mu }}\) depending on N and a random \(\zeta \). Once again, this is achieved from the previous step with the help of the Gershgorin circle theorem and other linear algebraic tools. This is done in Lemma 5.8.

(5) The final step of the proof is to show that \({\bar{\mu }}\) of the above step can be replaced by \(\mathrm{E}(\mu )\).

Now let us proceed towards executing the above steps for proving Theorem 2.2. As the zeroth step, we show that \(V/N\varepsilon \) converges to \(\mathrm{Diag}(\theta _1,\ldots ,\theta _k)\), that is, the \(k\times k\) diagonal matrix with diagonal entries \(\theta _1,\ldots ,\theta _k\), w.h.p.

Lemma 5.1

As \(N\rightarrow \infty \),

$$\begin{aligned} V(j,l)=N\varepsilon \theta _j\left( \mathbf{1}(j=l)+o_{hp}(1)\right) ,1\le j,l\le k. \end{aligned}$$

Proof

For fixed \(1\le j,l\le k\), writing

$$\begin{aligned} \left( I-\frac{1}{\mu }W\right) ^{-1}=I+O_{hp}\left( \mu ^{-1}\Vert W\Vert \right) , \end{aligned}$$

we get that

$$\begin{aligned} V(j,l)=N\varepsilon \sqrt{\theta _j\theta _l}\left( e_j^\prime e_l+\frac{1}{\mu }O_{hp}(\Vert W\Vert )\right) . \end{aligned}$$

Since

$$\begin{aligned} \lim _{N\rightarrow \infty }e_j^\prime e_l=\mathbf{1}(j=l), \end{aligned}$$
(5.2)

and

$$\begin{aligned} \Vert W\Vert =o_{hp}(\mu ) \end{aligned}$$

by Lemma 4.1 and Theorem 2.1, the proof follows. \(\square \)

The next step, which is one of the main steps in the proof of Theorem 2.2, shows that the ith eigenvalues of A and V are exactly equal w.h.p.

Lemma 5.2

With high probability,

$$\begin{aligned} \mu =\lambda _i(V). \end{aligned}$$

The proof of the above lemma is based on the following fact, which is a direct consequence of the Gershgorin circle theorem; see Theorem 1.6, p. 8 of [33].

Fact 5.1

Suppose that U is an \(n\times n\) real symmetric matrix. Define

$$\begin{aligned} R_l=\sum _{1\le j\le n,\,j\ne l}|U(j,l)|,\,1\le l\le n. \end{aligned}$$

If for some \(1\le m\le n\) it holds that

$$\begin{aligned} U(m,m)+R_m<U(l,l)-R_l\quad \text {for all }1\le l\le m-1, \end{aligned}$$
(5.3)

and

$$\begin{aligned} U(m,m)-R_m>U(l,l)+R_l,\quad \text {for all }m+1\le l\le n, \end{aligned}$$
(5.4)

then

$$\begin{aligned} \bigl \{\lambda _1(U),\ldots ,\lambda _n(U)\bigr \}\setminus \left( \bigcup _{1\le l\le n,\,l\ne m}\left[ U(l,l)-R_l,U(l,l)+R_l\right] \right) =\bigl \{\lambda _m(U)\bigr \}. \end{aligned}$$

Remark 5.1

The assumptions (5.3) and (5.4) of Fact 5.1 mean that the Gershgorin disk containing the mth largest eigenvalue is disjoint from any other Gershgorin disk.
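
To make Fact 5.1 and Remark 5.1 concrete, the toy sketch below (with an arbitrary \(3\times 3\) symmetric matrix) computes the Gershgorin radii and verifies that each disjoint disk contains exactly one eigenvalue, in the order asserted by Fact 5.1.

```python
import numpy as np

# a symmetric matrix whose diagonal entries are well separated relative to the off-diagonal sums
U = np.array([[10.0, 0.3, 0.2],
              [ 0.3, 6.0, 0.4],
              [ 0.2, 0.4, 2.0]])

R = np.abs(U).sum(axis=1) - np.abs(np.diag(U))       # Gershgorin radii R_l
eig = np.sort(np.linalg.eigvalsh(U))[::-1]           # lambda_1(U) >= lambda_2(U) >= lambda_3(U)

for m in range(U.shape[0]):
    lo, hi = U[m, m] - R[m], U[m, m] + R[m]
    inside = [lam for lam in eig if lo <= lam <= hi]
    print(f"disk around U({m},{m}): [{lo:.2f}, {hi:.2f}] contains {inside}")
# since the disks are pairwise disjoint, the disk around the m-th largest diagonal entry
# contains exactly the m-th largest eigenvalue, as in Fact 5.1
```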

Proof of Lemma 5.2

The first step is to show that

$$\begin{aligned} \mu \in \bigl \{\lambda _1(V),\ldots ,\lambda _k(V)\bigr \}\text { w.h.p.} \end{aligned}$$
(5.5)

To that end, fix \(N\ge 1\) and a sample point for which \(\Vert W\Vert <\mu \). The following calculations are done for that fixed sample point.

Let v be an eigenvector of A, with norm 1, corresponding to \(\lambda _i(A)\). That is,

$$\begin{aligned} \mu v=Av=Wv+N\varepsilon \sum _{l=1}^k \theta _l(e_l^\prime v)e_l, \end{aligned}$$
(5.6)

by (5.1). Since \(\mu >\Vert W\Vert \), \(\mu I-W\) is invertible, and hence

$$\begin{aligned} v=N\varepsilon \sum _{l=1}^k\theta _l(e_l^\prime v)\left( \mu I-W\right) ^{-1}e_l. \end{aligned}$$
(5.7)

Fixing \(j\in \{1,\ldots ,k\}\) and premultiplying the above by \(\sqrt{\theta _j}\mu e_j^\prime \) yields

$$\begin{aligned} \mu \sqrt{\theta _j}(e_j^\prime v)=N\varepsilon \sum _{l=1}^k\sqrt{\theta _j}\theta _l(e_l^\prime v)e_j^\prime \left( I-\frac{1}{\mu }W\right) ^{-1}e_l=\sum _{l=1}^kV(j,l)\sqrt{\theta _l}(e_l^\prime v). \end{aligned}$$

As the above holds for all \(1\le j\le k\), this means that if

$$\begin{aligned} u=\left[ \sqrt{\theta _1}(e_1^\prime v)\ldots \sqrt{\theta _k}(e_k^\prime v)\right] ^\prime , \end{aligned}$$
(5.8)

then

$$\begin{aligned} Vu=\mu u. \end{aligned}$$
(5.9)

Recalling that in the above calculations a sample point is fixed such that \(\Vert W\Vert <\mu \), what we have shown, in other words, is that a vector u satisfying the above exists w.h.p.

In order to complete the proof of (5.5), it suffices to show that u is a non-null vector w.h.p. To that end, premultiply (5.6) by \(v^\prime \) to obtain that

$$\begin{aligned} \mu =v^\prime Wv+N\varepsilon \Vert u\Vert ^2. \end{aligned}$$

Dividing both sides by \(N\varepsilon \) and using Lemma 4.1 along with Theorem 2.1 implies that

$$\begin{aligned} \Vert u\Vert ^2=\theta _i+o_{hp}(1). \end{aligned}$$

Thus, u is a non-null vector w.h.p. From this and (5.9), (5.5) follows.

Lemma 5.1 shows that for all \(l\in \{1,\ldots ,i-1\}\),

$$\begin{aligned}&\left[ V(i,i)+\sum _{1\le j\le k,\,j\ne i}|V(i,j)|\right] -\left[ V(l,l)-\sum _{1\le j\le k,\,j\ne l}|V(l,j)|\right] \\&\quad =N\varepsilon \left( \theta _i-\theta _l\right) (1+o_{hp}(1)), \end{aligned}$$

as \(N\rightarrow \infty \). Since \(i\in {\mathcal {I}}\), \(\theta _i-\theta _l<0\), and hence

$$\begin{aligned} V(i,i)+\sum _{1\le j\le k,\,j\ne i}|V(i,j)|<V(l,l)-\sum _{1\le j\le k,\,j\ne l}|V(l,j)|\;\text {w.h.p.} \end{aligned}$$

A similar calculation shows that for \(l\in \{i+1,\ldots ,k\}\),

$$\begin{aligned} V(i,i)-\sum _{1\le j\le k,\,j\ne i}|V(i,j)|>V(l,l)+\sum _{1\le j\le k,\,j\ne l}|V(l,j)|\;\text {w.h.p.} \end{aligned}$$

In view of (5.5) and Fact 5.1, the proof would follow once it can be shown that for all \(l\in \{1,\ldots ,k\}\setminus \{i\}\),

$$\begin{aligned} \left| \mu -V(l,l)\right| >\sum _{1\le j\le k,\,j\ne l}|V(l,j)|\text { w.h.p.} \end{aligned}$$

This follows, once again, by dividing both sides by \(N\varepsilon \) and using Theorem 2.1 and Lemma 5.1. This completes the proof. \(\square \)
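
The conclusion of Lemma 5.2 can also be probed numerically: simulate A, set \(\mu =\lambda _1(A)\), form the \(k\times k\) matrix V, and check that \(\mu \) reappears among the eigenvalues of V. The sketch below does this for an arbitrary rank-two kernel (so \(k=2\), \(i=1\)); it is only an illustration of the statement, not part of the proof.

```python
import numpy as np

rng = np.random.default_rng(3)
N, eps = 1000, 0.05

x = np.arange(1, N + 1) / N
r1, r2 = np.ones(N), np.sqrt(3.0) * (2 * x - 1)
theta = np.array([1.0, 0.3])
E = np.vstack([r1, r2]).T / np.sqrt(N)           # columns are e_1, e_2 as in (2.6)
F = theta[0] * np.outer(r1, r1) + theta[1] * np.outer(r2, r2)
P = eps * F

A = np.triu((rng.random((N, N)) <= P).astype(float))
A = A + A.T - np.diag(np.diag(A))
W = A - P

mu = np.linalg.eigvalsh(A)[-1]                   # mu = lambda_1(A)
assert np.abs(np.linalg.eigvalsh(W)).max() < mu  # ||W|| < mu, so I - W/mu is invertible
res = np.linalg.solve(mu * np.eye(N) - W, E)     # columns are (mu I - W)^{-1} e_l
V = N * eps * np.sqrt(np.outer(theta, theta)) * mu * (E.T @ res)   # since (I - W/mu)^{-1} = mu (mu I - W)^{-1}
V = (V + V.T) / 2                                # symmetrize (it is symmetric up to rounding)
print("mu              =", mu)
print("eigenvalues of V:", np.linalg.eigvalsh(V))   # the largest should coincide with mu
```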

The next step is to write

$$\begin{aligned} \left( I-\frac{1}{\mu }W\right) ^{-1}=\sum _{n=0}^\infty \mu ^{-n}W^n, \end{aligned}$$
(5.10)

which is possible because \(\Vert W\Vert <\mu \). Denote

$$\begin{aligned} Z_{j,l,n}=e_j^\prime W^n e_l,\quad 1\le j,\;l\le k,\;n\ge 0, \end{aligned}$$

which should not be confused with \(Z_{ij}\) defined in (2.14), and for \(n\ge 0\), let \(Y_n\) be a \(k\times k\) matrix with

$$\begin{aligned} Y_n(j,l)=\sqrt{\theta _j\theta _l}N\varepsilon Z_{j,l,n},\quad 1\le j,\;l\le k. \end{aligned}$$

The following bounds will be used several times.

Lemma 5.3

It holds that

$$\begin{aligned} \mathrm{E}\left( \Vert Y_1\Vert \right) =O\left( N\varepsilon ^{3/2}\right) , \end{aligned}$$

and

$$\begin{aligned} \Vert Y_1\Vert =o_{hp}\left( (N\varepsilon )^2\right) . \end{aligned}$$

Proof

Lemma 4.4 implies that

$$\begin{aligned} \mathrm{Var}\left( Z_{j,l,1}\right) =O(\varepsilon ),\quad 1\le j,l\le k. \end{aligned}$$

Hence,

$$\begin{aligned} \mathrm{E}\Vert Y_1\Vert =O\left( N\varepsilon \sum _{j,l=1}^k\mathrm{E}|Z_{j,l,1}|\right) =O\left( N\varepsilon \sum _{j,l=1}^k\sqrt{\mathrm{Var}(Z_{j,l,1})}\right) =O\left( N\varepsilon ^{3/2}\right) , \end{aligned}$$

the second equality using the fact that \(Z_{j,l,1}\) has mean 0, so that \(\mathrm{E}|Z_{j,l,1}|\le \sqrt{\mathrm{Var}(Z_{j,l,1})}\). This proves the first claim. The second claim follows from (4.3) of Lemma 4.3. \(\square \)

The next step is to truncate the infinite sum in (5.10) to level L, where \(L=[\log N]\) as defined before.

Lemma 5.4

It holds that

$$\begin{aligned} \mu =\lambda _i\left( \sum _{n=0}^L\mu ^{-n}Y_n\right) +o_{hp}\left( \sqrt{\varepsilon }\right) . \end{aligned}$$

Proof

From the definition of V, it is immediate that for \(1\le j,l\le k\),

$$\begin{aligned} V(j,l)=N\varepsilon \sqrt{\theta _j\theta _l}\sum _{n=0}^\infty \mu ^{-n}e_j^\prime W^ne_l\,\mathbf{1}(\Vert W\Vert <\mu ), \end{aligned}$$

and hence

$$\begin{aligned} V=\mathbf{1}(\Vert W\Vert <\mu )\sum _{n=0}^\infty \mu ^{-n}Y_n. \end{aligned}$$

For the sake of notational simplicity, let us suppress \(\mathbf{1}(\Vert W\Vert <\mu )\). Therefore, with the implicit understanding that the sum is set as zero if \(\Vert W\Vert \ge \mu \), for the proof it suffices to check that

$$\begin{aligned} \left\| \sum _{n=L+1}^\infty \mu ^{-n}Y_n\right\| =o_{hp}(\sqrt{\varepsilon }). \end{aligned}$$
(5.11)

To that end, Theorem 2.1 and Lemma 4.1 imply that

$$\begin{aligned} \left\| \sum _{n=L+1}^\infty \mu ^{-n}Y_n\right\| \le \sum _{n=L+1}^\infty |\mu |^{-n}\Vert Y_n\Vert =O_{hp}\left( (N\varepsilon )^{-(L-1)/2}\right) . \end{aligned}$$

In order to prove (5.11), it suffices to show that as \(N\rightarrow \infty \),

$$\begin{aligned} -\log \varepsilon =o\left( (L-1)\log (N\varepsilon )\right) . \end{aligned}$$
(5.12)

To that end, recall (1.2) to argue that

$$\begin{aligned} N^{-1}=o(\varepsilon ) \end{aligned}$$
(5.13)

and

$$\begin{aligned} \log \log N=O(\log (N\varepsilon )). \end{aligned}$$
(5.14)

By (5.13), it follows that

$$\begin{aligned} -\log \varepsilon&=O\left( \log N\right) \\&=o\left( \log N\log \log N\right) \\&=o\left( (L-1)\log (N\varepsilon )\right) , \end{aligned}$$

the last line using (5.14). Therefore, (5.12) follows, which ensures (5.11), which in turn completes the proof. \(\square \)

In the next step, \(Y_n\) is replaced by its expectation for \(n\ge 2\).

Lemma 5.5

It holds that

$$\begin{aligned} \mu =\lambda _i\left( Y_0+\mu ^{-1}Y_1+\sum _{n=2}^L\mu ^{-n}\mathrm{E}(Y_n)\right) +o_{hp}\left( \sqrt{\varepsilon }\right) . \end{aligned}$$

Proof

In view of Theorem 2.1 and Lemma 5.4, all that has to be checked is

$$\begin{aligned} \sum _{n=2}^L(N\varepsilon )^{-n}\Vert Y_n-\mathrm{E}(Y_n)\Vert =o_{hp}(\sqrt{\varepsilon }). \end{aligned}$$
(5.15)

For that, invoke Lemma 4.3 to claim that

$$\begin{aligned}&\max _{2\le n\le L,\,1\le j,l\le k}P\left( \left| Z_{j,l,n}-\mathrm{E}(Z_{j,l,n})\right| >N^{(n-1)/2}\varepsilon ^{n/2}(\log N)^{n\xi /4}\right) \nonumber \\&\qquad =O\left( e^{-(\log N)^{\eta _1}}\right) , \end{aligned}$$
(5.16)

where \(\xi \) is as in (1.2).

Our next claim is that there exists \(C_2>0\) such that for N large,

$$\begin{aligned} \bigcap _{2\le n\le L,1\le j,l\le k}\left[ \left| Z_{j,l,n}-\mathrm{E}(Z_{j,l,n})\right| \le N^{(n-1)/2}\varepsilon ^{n/2}(\log N)^{n\xi /4}\right] \end{aligned}$$
(5.17)
$$\begin{aligned} \subset \left[ \sum _{n=2}^L(N\varepsilon )^{-n}\Vert Y_n-\mathrm{E}(Y_n)\Vert \le C_2\sqrt{\varepsilon }\left( (N\varepsilon )^{-1}(\log N)^\xi \right) ^{1/2}\right] . \end{aligned}$$

To see this, suppose that the event on the left hand side holds. Then, for fixed \(1\le j,l\le k\), and large N,

$$\begin{aligned}&\sum _{n=2}^L(N\varepsilon )^{-n}\left\| Y_n(j,l)-\mathrm{E}\left[ Y_n(j,l)\right] \right\| \\&\quad \le \theta _1 N\varepsilon \sum _{n=2}^L(N\varepsilon )^{-n}\left| Z_{j,l,n}-\mathrm{E}\left( Z_{j,l,n}\right) \right| \\&\quad \le \theta _1\sum _{n=2}^\infty (N\varepsilon )^{-(n-1)}N^{(n-1)/2}\varepsilon ^{n/2}(\log N)^{n\xi /4}\\&\quad =\left[ 1-(N\varepsilon )^{-1/2}(\log N)^{\xi /4}\right] ^{-1}\theta _1\sqrt{\varepsilon }(N\varepsilon )^{-1/2}(\log N)^{\xi /2}. \end{aligned}$$

Thus, (5.17) holds for some \(C_2>0\).

Combining (5.16) and (5.17), it follows that

$$\begin{aligned}&P\left( \sum _{n=2}^L(N\varepsilon )^{-n}\Vert Y_n-\mathrm{E}(Y_n)\Vert >C_2\sqrt{\varepsilon }\left( (N\varepsilon )^{-1}(\log N)^\xi \right) ^{1/2}\right) \\&\quad =O\left( \log N e^{-(\log N)^{\eta _1}}\right) \\&\quad =o\left( e^{-(\log N)^{(1+\eta _1)/2}}\right) . \end{aligned}$$

This, with the help of (1.2), establishes (5.15) from which the proof follows. \(\square \)

The goal of the next two lemmas is replacing \(\mu \) by a deterministic quantity in

$$\begin{aligned} \sum _{n=2}^L\mu ^{-n}\mathrm{E}(Y_n). \end{aligned}$$

Lemma 5.6

For N large, the deterministic equation

$$\begin{aligned} x=\lambda _i\left( \sum _{n=0}^Lx^{-n}\mathrm{E}(Y_n)\right) ,\,x>0, \end{aligned}$$
(5.18)

has a solution \({\tilde{\mu }}\) such that

$$\begin{aligned} 0<\liminf _{N\rightarrow \infty }(N\varepsilon )^{-1}{\tilde{\mu }}\le \limsup _{N\rightarrow \infty }(N\varepsilon )^{-1}{\tilde{\mu }}<\infty . \end{aligned}$$
(5.19)

Proof

Define a function

$$\begin{aligned} h:(0,\infty )\rightarrow {{\mathbb {R}}}, \end{aligned}$$

by

$$\begin{aligned} h(x)=\lambda _i\left( \sum _{n=0}^Lx^{-n}\mathrm{E}(Y_n)\right) . \end{aligned}$$

Our first claim is that for any fixed \(x>0\),

$$\begin{aligned} \lim _{N\rightarrow \infty }(N\varepsilon )^{-1}h\left( xN\varepsilon \right) =\theta _i. \end{aligned}$$
(5.20)

To that end, observe that since \(\mathrm{E}(Y_1)=0\),

$$\begin{aligned} h\left( xN\varepsilon \right) =\lambda _i\left( \mathrm{E}(Y_0)+\sum _{n=2}^L(xN\varepsilon )^{-n}\mathrm{E}(Y_n)\right) . \end{aligned}$$

Recalling that

$$\begin{aligned} Y_0(j,l)=N\varepsilon \sqrt{\theta _j\theta _l}\,e_j^\prime e_l,\quad 1\le j,\;l\le k, \end{aligned}$$

it follows by (5.2) that

$$\begin{aligned} \lim _{N\rightarrow \infty }(N\varepsilon )^{-1}\mathrm{E}(Y_0)=\mathrm{Diag}(\theta _1,\ldots ,\theta _k). \end{aligned}$$
(5.21)

Lemma 4.2 implies that

$$\begin{aligned} \mathrm{E}(Z_{j,l,n})\le \left( O(N\varepsilon )\right) ^{n/2}, \end{aligned}$$

uniformly for \(2\le n\le L\), and hence there exists \(0<C_3<\infty \) with

$$\begin{aligned} \Vert \mathrm{E}(Y_n)\Vert \le (C_3N\varepsilon )^{n/2+1}, \quad 2 \le n\le L. \end{aligned}$$
(5.22)

Therefore,

$$\begin{aligned} \left\| \sum _{n=2}^L(xN\varepsilon )^{-n}\mathrm{E}(Y_n)\right\| \le \sum _{n=2} ^\infty (xN\varepsilon )^{-n}(C_3N\varepsilon )^{n/2+1}\rightarrow C_3^2x^{-2}, \end{aligned}$$

as \(N\rightarrow \infty \). With the help of (5.21), this implies that

$$\begin{aligned} \lim _{N\rightarrow \infty }(N\varepsilon )^{-1}\left( \sum _{n=0}^L(xN\varepsilon )^{-n}\mathrm{E}(Y_n)\right) =\mathrm{Diag}(\theta _1,\ldots ,\theta _k), \end{aligned}$$

and hence (5.20) follows. Consequently, for a fixed \(0<\delta <\theta _i\),

$$\begin{aligned} \lim _{N\rightarrow \infty }(N\varepsilon )^{-1}\left[ N\varepsilon (\theta _i+\delta )-h\left( (\theta _i+\delta )N\varepsilon \right) \right] =\delta , \end{aligned}$$

and thus, for large N,

$$\begin{aligned} N\varepsilon (\theta _i+\delta )>h\left( (\theta _i+\delta )N\varepsilon \right) . \end{aligned}$$

Similarly, again for large N,

$$\begin{aligned} N\varepsilon (\theta _i-\delta )<h\left( (\theta _i-\delta )N\varepsilon \right) . \end{aligned}$$

Hence, for N large, (5.18) has a solution \({\tilde{\mu }}\) in \([(N\varepsilon )(\theta _i-\delta ),(N\varepsilon )(\theta _i+\delta )]\), which trivially satisfies (5.19). Hence the proof. \(\square \)

Lemma 5.7

If \({\tilde{\mu }}\) is as in Lemma 5.6, then

$$\begin{aligned} \mu -{\tilde{\mu }}=O_{hp}\left( (N\varepsilon )^{-1}\Vert Y_1\Vert +\sqrt{\varepsilon }\right) . \end{aligned}$$

Proof

Lemmas 5.5 and 5.6 imply that

$$\begin{aligned}&|\mu -{\tilde{\mu }}|\\&\quad =\left| \lambda _i\left( Y_0+\mu ^{-1}Y_1+\sum _{n=2}^L\mu ^{-n}\mathrm{E}(Y_n)\right) -\lambda _i\left( \sum _{n=0}^L{\tilde{\mu }}^{-n}\mathrm{E}(Y_n)\right) \right| +o_{hp}(\sqrt{\varepsilon })\\&\quad \le \Vert \mu ^{-1}Y_1\Vert +|\mu -{\tilde{\mu }}|\sum _{n=2}^L\mu ^{-n}{\tilde{\mu }}^{-n}\Vert E(Y_n)\Vert \sum _{j=0}^{n-1}\mu ^j{\tilde{\mu }}^{n-1-j}+o_{hp}(\sqrt{\varepsilon })\\&\quad =|\mu -{\tilde{\mu }}|\sum _{n=2}^L\mu ^{-n}{\tilde{\mu }}^{-n}\Vert E(Y_n)\Vert \sum _{j=0}^{n-1}\mu ^j{\tilde{\mu }}^{n-1-j}+O_{hp}\left( (N\varepsilon )^{-1}\Vert Y_1\Vert +\sqrt{\varepsilon }\right) . \end{aligned}$$

Thus,

$$\begin{aligned} |\mu -{\tilde{\mu }}|\left[ 1-\sum _{n=2}^L\mu ^{-n}{\tilde{\mu }}^{-n}\Vert E(Y_n)\Vert \sum _{j=0}^{n-1}\mu ^j{\tilde{\mu }}^{n-1-j}\right] \le O_{hp}\left( (N\varepsilon )^{-1}\Vert Y_1\Vert +\sqrt{\varepsilon }\right) . \end{aligned}$$
(5.23)

Equations (5.19) and (5.22) imply that

$$\begin{aligned} \left| \sum _{n=2}^L\mu ^{-n}{\tilde{\mu }}^{-n}\Vert E(Y_n)\Vert \sum _{j=0}^{n-1}\mu ^j{\tilde{\mu }}^{n-1-j}\right|&=O_{hp}\left( \sum _{n=2}^\infty n(N\varepsilon )^{-(n+1)}(C_3N\varepsilon )^{n/2+1}\right) \nonumber \\&=O_{hp}\left( (N\varepsilon )^{-1}\right) \nonumber \\&=o_{hp}(1),\,N\rightarrow \infty . \end{aligned}$$
(5.24)

This completes the proof with the help of (5.23). \(\square \)

The next lemma is arguably the most important step in the proof of Theorem 2.2, the other major step being Lemma 5.2.

Lemma 5.8

There exists a deterministic \({\bar{\mu }}\), which depends on N, such that

$$\begin{aligned} \mu ={\bar{\mu }}+\mu ^{-1}Y_1(i,i)+o_{hp}\left( (N\varepsilon )^{-1}\Vert Y_1\Vert +\sqrt{\varepsilon }\right) . \end{aligned}$$

Proof

Define a \(k\times k\) deterministic matrix

$$\begin{aligned} X=\sum _{n=0}^L{\tilde{\mu }}^{-n}\mathrm{E}(Y_n), \end{aligned}$$

which, as usual, depends on N. Lemma 5.7 and (5.24) imply that

$$\begin{aligned} \left\| X-\sum _{n=0}^L\mu ^{-n}\mathrm{E}(Y_n)\right\|&\le |\mu -{\tilde{\mu }}|\sum _{n=2}^L\mu ^{-n}{\tilde{\mu }}^{-n}\Vert \mathrm{E}(Y_n)\Vert \sum _{j=0}^{n-1}\mu ^j{\tilde{\mu }}^{n-1-j}\\&=o_{hp}\left( |\mu -{\tilde{\mu }}|\right) \\&=o_{hp}\left( (N\varepsilon )^{-1}\Vert Y_1\Vert +\sqrt{\varepsilon }\right) . \end{aligned}$$

By Lemma 5.5 it follows that

$$\begin{aligned} \mu =\lambda _i\left( \mu ^{-1}Y_1+X\right) +o_{hp}\left( (N\varepsilon )^{-1}\Vert Y_1\Vert +\sqrt{\varepsilon }\right) . \end{aligned}$$
(5.25)

Let

$$\begin{aligned} H&=X+\mu ^{-1}Y_1-\left( X(i,i)+\mu ^{-1}Y_1(i,i)\right) I, \\ M&=X-X(i,i)I, \end{aligned}$$

and

$$\begin{aligned} {\bar{\mu }}=\lambda _i(X). \end{aligned}$$

Clearly,

$$\begin{aligned} \lambda _i\left( \mu ^{-1}Y_1+X\right) =X(i,i)+\mu ^{-1}Y_1(i,i)+\lambda _i(H)={\bar{\mu }}-\lambda _i(M)+\mu ^{-1}Y_1(i,i)+\lambda _i(H). \end{aligned}$$

Thus, the proof would follow with the aid of (5.25) if it can be shown that

$$\begin{aligned} \lambda _i(H)-\lambda _i(M)=o_{hp}\left( (N\varepsilon )^{-1}\Vert Y_1\Vert \right) . \end{aligned}$$
(5.26)

If \(k=1\), then \(i=1\) and hence \(H=M=0\). Thus, the above is a tautology in that case. Therefore, assume without loss of generality that \(k\ge 2\).

Proceeding towards proving (5.26) when \(k\ge 2\), set

$$\begin{aligned} U_1=(N\varepsilon )^{-1}M, \end{aligned}$$
(5.27)

and

$$\begin{aligned} U_2=(N\varepsilon )^{-1}H. \end{aligned}$$
(5.28)

The main idea in the proof of (5.26) is to observe that the eigenvector of \(U_1\) corresponding to \(\lambda _i(U_1)\) is the same as that of M corresponding to \(\lambda _i(M)\), and likewise for \(U_2\) and H. Hence, the first step is to use this to get a bound on the difference between the eigenvectors in terms of \(\Vert U_1-U_2\Vert \).

An important observation that will be used later is that

$$\begin{aligned} \Vert U_1-U_2\Vert =O_{hp}\left( (N\varepsilon )^{-2}\Vert Y_1\Vert \right) . \end{aligned}$$
(5.29)

The second claim of Lemma 5.3 implies that the right hand side above is \(o_{hp}(1)\). The same implies that for \(m=1,2\) and \(1\le j,l\le k\),

$$\begin{aligned} U_m(j,l)=(\theta _j-\theta _i)\mathbf{1}(j=l)+o_{hp}(1), \quad N\rightarrow \infty . \end{aligned}$$
(5.30)

In other words, as \(N\rightarrow \infty \), \(U_1\) and \(U_2\) converge to \(\mathrm{Diag}(\theta _1-\theta _i,\ldots ,\theta _k-\theta _i)\) w.h.p. Therefore,

$$\begin{aligned} \lambda _i(U_m)=o_{hp}(1),\quad m=1,2. \end{aligned}$$
(5.31)

Let \({\tilde{U}}_m\), for \(m=1,2,\) be the \((k-1)\times (k-1)\) matrix (recall that \(k\ge 2\)) obtained by deleting the ith row and the ith column of \(U_m\), and let \({\tilde{u}}_m\) be the \((k-1)\times 1\) vector obtained from the ith column of \(U_m\) by deleting its ith entry. It is worth recording, for possible future use, that

$$\begin{aligned} \Vert {\tilde{u}}_m\Vert =o_{hp}(1),\quad m=1,2, \end{aligned}$$
(5.32)

which follows from (5.30), and that

$$\begin{aligned} \Vert {\tilde{u}}_1-\tilde{u}_2\Vert =O_{hp}\left( (N\varepsilon )^{-2}\Vert Y_1\Vert \right) , \end{aligned}$$
(5.33)

which follows from (5.29).

Equations (5.30) and (5.31) imply that \({\tilde{U}}_m-\lambda _i(U_m)I_{k-1}\) converges w.h.p. to

$$\begin{aligned} \mathrm{Diag}(\theta _1-\theta _i,\ldots ,\theta _{i-1}-\theta _i,\theta _{i+1}-\theta _i,\ldots ,\theta _k-\theta _i). \end{aligned}$$

Since \(i\in {\mathcal {I}}\), the above matrix is invertible. Fix \(\delta >0\) such that every matrix in the closed \(\delta \)-neighborhood \(B_\delta \), in the sense of operator norm, of the above matrix is invertible. Let

$$\begin{aligned} C_4=\sup _{E\in B_\delta }\Vert E^{-1}\Vert . \end{aligned}$$
(5.34)

Then, \(C_4<\infty \). Besides, there exists \(C_5<\infty \) satisfying

$$\begin{aligned} \left\| E_1^{-1}-E_2^{-1}\right\| \le C_5\Vert E_1-E_2\Vert , \quad E_1,E_2\in B_\delta . \end{aligned}$$
(5.35)

Fix \(N\ge 1\) and a sample point such that \(\tilde{U}_m-\lambda _i(U_m)I_{k-1}\) belongs to \(B_\delta \). Then, it is invertible. Define a \((k-1)\times 1\) vector

$$\begin{aligned} {\tilde{v}}_m=-\left[ {\tilde{U}}_m-\lambda _i(U_m)I_{k-1}\right] ^{-1}\tilde{u}_m, \quad m=1,2, \end{aligned}$$

and a \(k\times 1\) vector

$$\begin{aligned} v_m=\left[ {\tilde{v}}_m(1),\ldots ,{\tilde{v}}_m(i-1),\,1,\,\tilde{v}_m(i),\ldots ,{\tilde{v}}_m(k-1)\right] ^\prime , \quad m=1,2. \end{aligned}$$

It is immediate that

$$\begin{aligned} \Vert {\tilde{v}}_m\Vert \le C_4\Vert {\tilde{u}}_m\Vert , \quad m=1,2. \end{aligned}$$
(5.36)

Our next claim is that

$$\begin{aligned} U_mv_m=\lambda _i(U_m)v_m, \quad m=1,2. \end{aligned}$$
(5.37)

This claim is equivalent to

$$\begin{aligned} \left[ U_m-\lambda _i(U_m)I_k\right] v_m=0. \end{aligned}$$
(5.38)

Let \({\bar{U}}_m\) be the \((k-1)\times k\) matrix obtained by deleting the ith row of \(U_m-\lambda _i(U_m)I_k\). Since the latter matrix is singular, and \({\tilde{U}}_m-\lambda _i(U_m)I_{k-1}\) is invertible, it follows that the ith row of \(U_m-\lambda _i(U_m)I_k\) lies in the row space of \({\bar{U}}_m\); indeed, both matrices have rank \(k-1\), and the row space of \({\bar{U}}_m\) is contained in that of \(U_m-\lambda _i(U_m)I_k\). In other words, the row spaces of \(U_m-\lambda _i(U_m)I_k\) and \({\bar{U}}_m\) are the same, and hence so are their null spaces. Thus, (5.38) is equivalent to

$$\begin{aligned} {\bar{U}}_mv_m=0. \end{aligned}$$

To see the above, observe that the ith column of \({\bar{U}}_m\) is \({\tilde{u}}_m\), and hence we can partition

$$\begin{aligned} {\bar{U}}_m=\left[ {\bar{U}}_{m1}\,{\tilde{u}}_m\,{\bar{U}}_{m2}\right] , \quad \end{aligned}$$

where \({\bar{U}}_{m1}\) and \({\bar{U}}_{m2}\) are of order \((k-1)\times (i-1)\) and \((k-1)\times (k-i)\), respectively. Furthermore,

$$\begin{aligned} \left[ {\bar{U}}_{m1}\,{\bar{U}}_{m2}\right] =\tilde{U}_m-\lambda _i(U_m)I_{k-1}. \end{aligned}$$

Therefore,

$$\begin{aligned} {\bar{U}}_mv_m={\tilde{u}}_m+\left[ {\bar{U}}_{m1}\,{\bar{U}}_{m2}\right] \tilde{v}_m={\tilde{u}}_m+\left( {\tilde{U}}_m-\lambda _i(U_m)I_{k-1}\right) \tilde{v}_m=0. \end{aligned}$$

Hence, (5.38) follows, which proves (5.37).

Next, we note

$$\begin{aligned} \Vert v_1-v_2\Vert&=\Vert {\tilde{v}}_1-{\tilde{v}}_2\Vert \le \Vert ({\tilde{U}}_1-\lambda _i(U_1)I_{k-1})^{-1}\Vert \Vert {\tilde{u}}_1-{\tilde{u}}_2\Vert \\&\quad +\Vert ({\tilde{U}}_1-\lambda _i(U_1)I_{k-1})^{-1}-({\tilde{U}}_2-\lambda _i(U_2)I_{k-1})^{-1}\Vert \Vert {\tilde{u}}_2\Vert \\&\quad \le C_4\Vert {\tilde{u}}_1-{\tilde{u}}_2\Vert +C_5\Vert (\tilde{U}_1-\lambda _i(U_1)I_{k-1})-(\tilde{U}_2-\lambda _i(U_2)I_{k-1})\Vert \Vert {\tilde{u}}_2\Vert , \end{aligned}$$

\(C_4\) and \(C_5\) being as in (5.34) and (5.35), respectively. Recalling that the above calculation was done on an event of high probability, what we have proven, with the help of (5.29) and (5.33), is that

$$\begin{aligned} \Vert v_1-v_2\Vert =O_{hp}\left( (N\varepsilon )^{-2}\Vert Y_1\Vert \right) . \end{aligned}$$

Furthermore, (5.32) and (5.36) imply that

$$\begin{aligned} \Vert {\tilde{v}}_m\Vert =o_{hp}(1). \end{aligned}$$

Finally, noting that

$$\begin{aligned} U_m(i,i)=0, \quad m=1,2, \end{aligned}$$

and that

$$\begin{aligned} v_m(i)=1, \quad m=1,2, \end{aligned}$$

it follows, by reading off the ith coordinate of the eigenvalue equation (5.37) for \(m=1,2\) and subtracting, that

$$\begin{aligned} \left| \lambda _i(U_1)-\lambda _i(U_2)\right|&=\left| \sum _{1\le j\le k,\,j\ne i}U_1(i,j)v_1(j)-\sum _{1\le j\le k,\,j\ne i}U_2(i,j)v_2(j)\right| \\&\le \sum _{1\le j\le k,\,j\ne i}|U_1(i,j)||v_1(j)-v_2(j)| \\&\quad +\sum _{1\le j\le k,\,j\ne i}|U_1(i,j)-U_2(i,j)|\,|v_2(j)|\\&=O_{hp}\left( \Vert {\tilde{u}}_1\Vert \Vert v_1-v_2\Vert +\Vert U_1-U_2\Vert \Vert {\tilde{v}}_2\Vert \right) \\&=o_{hp}\left( (N\varepsilon )^{-2}\Vert Y_1\Vert \right) . \end{aligned}$$

Recalling (5.27) and (5.28), (5.26) follows, which completes the proof in conjunction with (5.25). \(\square \)

Now, we are in a position to prove Theorem 2.2.

Proof of Theorem 2.2

Recalling that

$$\begin{aligned} Y_1(i,i)=\theta _iN\varepsilon \,e_i^\prime We_i, \end{aligned}$$

it suffices to show that

$$\begin{aligned} \mu -\mathrm{E}(\mu )=\mu ^{-1}Y_1(i,i)+o_p(\sqrt{\varepsilon }). \end{aligned}$$
(5.39)

Lemma 5.8 implies that

$$\begin{aligned} \mu -{\bar{\mu }}=\mu ^{-1}Y_1(i,i)+o_{hp}\left( (N\varepsilon )^{-1}\Vert Y_1\Vert +\sqrt{\varepsilon }\right) =O_{hp}\left( (N\varepsilon )^{-1}\Vert Y_1\Vert +\sqrt{\varepsilon }\right) , \end{aligned}$$
(5.40)

a consequence of which, combined with Lemma 5.3, is that

$$\begin{aligned} \lim _{N\rightarrow \infty }(N\varepsilon )^{-1}{\bar{\mu }}=\theta _i. \end{aligned}$$
(5.41)

Thus,

$$\begin{aligned} \left| \frac{1}{{\bar{\mu }}}Y_1(i,i)-\frac{1}{\mu }Y_1(i,i)\right|&=O_{hp}\left( (N\varepsilon )^{-2}|\mu -{\bar{\mu }}|\Vert Y_1\Vert \right) \nonumber \\&=o_{hp}\left( |\mu -{\bar{\mu }}|\right) \nonumber \\&=o_{hp}\left( (N\varepsilon )^{-1}\Vert Y_1\Vert +\sqrt{\varepsilon }\right) \nonumber \\&=o_p(\sqrt{\varepsilon }), \end{aligned}$$
(5.42)

where Lemma 5.3 yields the second line, the third line follows from (5.40), and the last line uses the fact that

$$\begin{aligned} \Vert Y_1\Vert =O_p(N\varepsilon ^{3/2}), \end{aligned}$$
(5.43)

which is also a consequence of the same lemma. Using Lemma 5.8 once again, we get that

$$\begin{aligned} \mu ={\bar{\mu }}+\frac{1}{{\bar{\mu }}}Y_1(i,i)+o_{hp}\left( (N\varepsilon )^{-1}\Vert Y_1\Vert +\sqrt{\varepsilon }\right) . \end{aligned}$$
(5.44)

Let

$$\begin{aligned} R=\mu -{\bar{\mu }}-\frac{1}{{\bar{\mu }}}Y_1(i,i). \end{aligned}$$

Clearly,

$$\begin{aligned} \mathrm{E}(R)=\mathrm{E}(\mu )-{\bar{\mu }}, \end{aligned}$$

and (5.44) implies that for every \(\delta >0\) there exists \(\eta >1\) with

$$\begin{aligned} \mathrm{E}|R| \le \delta (\sqrt{\varepsilon }+(N\varepsilon )^{-1}\mathrm{E}\Vert Y_1\Vert )+\mathrm{E}^{1/2}\left( \mu -{\bar{\mu }}-\frac{1}{{\bar{\mu }}}Y_1(i,i)\right) ^2O(e^{-(\log N)^\eta }). \end{aligned}$$
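
Indeed, if G denotes the high-probability event on which the \(o_{hp}\) term in (5.44) is at most \(\delta \left( \sqrt{\varepsilon }+(N\varepsilon )^{-1}\Vert Y_1\Vert \right) \), then the Cauchy–Schwarz inequality gives, schematically,

$$\begin{aligned} \mathrm{E}|R|\le \mathrm{E}\left( |R|\mathbf{1}_G\right) +\mathrm{E}^{1/2}\left( R^2\right) P(G^c)^{1/2}\le \delta \left( \sqrt{\varepsilon }+(N\varepsilon )^{-1}\mathrm{E}\Vert Y_1\Vert \right) +\mathrm{E}^{1/2}\left( R^2\right) P(G^c)^{1/2}, \end{aligned}$$

the tail bound on \(P(G^c)\) that produces the \(O(e^{-(\log N)^\eta })\) factor being supplied by the definition of \(o_{hp}\).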

Lemma 5.3 implies that

$$\begin{aligned} \mathrm{E}|R|\le o(\sqrt{\varepsilon })+\mathrm{E}^{1/2}\left( \mu -{\bar{\mu }}-\frac{1}{{\bar{\mu }}}Y_1(i,i)\right) ^2O(e^{-(\log N)^\eta }). \end{aligned}$$

Next, (5.41) and the fact that \(|\mu |\le N^2\) a.s. imply that

$$\begin{aligned} \mathrm{E}^{1/2}\left( \mu -{\bar{\mu }}-\frac{1}{{\bar{\mu }}}Y_1(i,i)\right) ^2 =O(N^2)=o(\varepsilon ^{1/2}N^3)=o(\varepsilon ^{1/2}e^{(\log N)^\eta }). \end{aligned}$$

Thus,

$$\begin{aligned} \mathrm{E}|R|=o(\sqrt{\varepsilon }), \end{aligned}$$

and hence

$$\begin{aligned} \mathrm{E}(\mu )={\bar{\mu }}+o(\sqrt{\varepsilon }). \end{aligned}$$

This, in view of (5.44), implies that

$$\begin{aligned} \mu =\mathrm{E}(\mu )+\frac{1}{{\bar{\mu }}}Y_1(i,i)+o_p\left( (N\varepsilon )^{-1}\Vert Y_1\Vert +\sqrt{\varepsilon }\right) =\mathrm{E}(\mu )+\frac{1}{{\bar{\mu }}}Y_1(i,i)+o_p\left( \sqrt{\varepsilon }\right) , \end{aligned}$$

the second equality following from (5.43). This establishes (5.39) with the help of (5.42), and hence completes the proof. \(\square \)

5.3 Proof of Theorem 2.3

Theorems 2.1 and 2.2 establish Theorem 2.3 with the help of (2.10).

5.4 Proof of Theorem 2.4

Now we shall proceed toward proving Theorem 2.4. For the rest of this section, that is, this subsection and the subsequent two, Assumption F2 holds. In other words, \(r_1,\ldots ,r_k\) are assumed to be Lipschitz continuous and hence so is f.

The following lemma essentially proves Theorem 2.4.

Lemma 5.9

Under Assumptions E1 and F2,

$$\begin{aligned} \mu =\lambda _i\left( Y_0+(N\varepsilon \theta _i)^{-2}\mathrm{E}(Y_2)\right) +O_p\left( \sqrt{\varepsilon }+(N\varepsilon )^{-1}\right) . \end{aligned}$$

Proof

Lemma 5.5 implies that

$$\begin{aligned} \mu =\lambda _i\left( \sum _{n=0}^3\mu ^{-n}\mathrm{E}(Y_n)\right) +O_p\left( \mu ^{-1}\Vert Y_1\Vert +\sum _{n=4}^L\mu ^{-n}\Vert \mathrm{E}(Y_n)\Vert \right) +o_p(\sqrt{\varepsilon }). \end{aligned}$$

Equation (5.43) implies that

$$\begin{aligned} \mu =\lambda _i\left( \sum _{n=0}^3\mu ^{-n}\mathrm{E}(Y_n)\right) +O_p\left( \sqrt{\varepsilon }+\sum _{n=4}^L\mu ^{-n}\Vert \mathrm{E}(Y_n)\Vert \right) . \end{aligned}$$

From (5.22), it follows that

$$\begin{aligned} \sum _{n=4}^L\mu ^{-n}\Vert \mathrm{E}(Y_n)\Vert =O_p\left( (N\varepsilon )^{-1}\right) , \end{aligned}$$

and hence

$$\begin{aligned} \mu =\lambda _i\left( \sum _{n=0}^3\mu ^{-n}\mathrm{E}(Y_n)\right) +O_p\left( \sqrt{\varepsilon }+(N\varepsilon )^{-1}\right) . \end{aligned}$$
(5.45)

Lemma 4.4, in particular (4.5) therein, implies that

$$\begin{aligned} \Vert \mathrm{E}(Y_3)\Vert =O\left( (N\varepsilon )^2\right) , \end{aligned}$$

and hence

$$\begin{aligned} \mu ^{-3}\Vert \mathrm{E}(Y_3)\Vert =O_p\left( (N\varepsilon )^{-1}\right) . \end{aligned}$$

This, in conjunction with (5.45), implies that

$$\begin{aligned} \mu =\lambda _i\left( Y_0+\mu ^{-2}\mathrm{E}(Y_2)\right) +O_p\left( \sqrt{\varepsilon }+(N\varepsilon )^{-1}\right) . \end{aligned}$$
(5.46)

An immediate consequence of the above and (5.22) is that

$$\begin{aligned} \mu =\lambda _i(Y_0)+O_p(1). \end{aligned}$$
(5.47)

Applying Fact 5.1 as in the proof of Lemma 5.2, it can be shown that

$$\begin{aligned} \left| \lambda _i(Y_0)-Y_0(i,i)\right| \le \sum _{1\le j\le k,\,j\ne i}|Y_0(i,j)|. \end{aligned}$$
(5.48)

Since \(r_i\) and \(r_j\) are Lipschitz functions, it holds that

$$\begin{aligned} e_i^\prime e_j=\mathbf{1}(i=j)+O\left( N^{-1}\right) . \end{aligned}$$
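
Indeed, if \(e_j\) denotes the unit vector proportional to \(\left( r_j(l/N)\right) _{l=1}^N\) and \(r_1,\ldots ,r_k\) are orthonormal in \(L^2[0,1]\), then the Lipschitz property of the \(r_j\) gives the Riemann-sum estimate

$$\begin{aligned} \frac{1}{N}\sum _{l=1}^Nr_i\left( \frac{l}{N}\right) r_j\left( \frac{l}{N}\right) =\int _0^1r_i(x)r_j(x)\,dx+O\left( N^{-1}\right) =\mathbf{1}(i=j)+O\left( N^{-1}\right) , \end{aligned}$$

and passing to the normalized vectors only introduces a further multiplicative \(1+O(N^{-1})\) factor.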

Hence, it follows that

$$\begin{aligned} Y_0(i,i)=N\varepsilon \left( \theta _i+O(N^{-1})\right) =N\varepsilon \theta _i+O(\varepsilon ), \end{aligned}$$

and similarly,

$$\begin{aligned} Y_0(i,j)=O(\varepsilon ), \quad j\ne i. \end{aligned}$$

Combining these findings with (5.48) yields that

$$\begin{aligned} \lambda _i(Y_0)=N\varepsilon \theta _i+O(\varepsilon ). \end{aligned}$$
(5.49)

Equations (5.47) and (5.49) together imply that

$$\begin{aligned} \mu =N\varepsilon \theta _i+O_p(1). \end{aligned}$$
(5.50)

Therefore,

$$\begin{aligned}&\left\| \mu ^{-2}\mathrm{E}(Y_2)-(N\varepsilon \theta _i)^{-2}\mathrm{E}(Y_2)\right\| \\&\quad =O_p\left( (N\varepsilon )^{-3}\Vert \mathrm{E}(Y_2)\Vert \right) \\&\quad =O_p\left( (N\varepsilon )^{-1}\right) . \end{aligned}$$
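
The first equality above can be seen from (5.50): since \(|\mu -N\varepsilon \theta _i|=O_p(1)\) and \(|\mu |\) is of exact order \(N\varepsilon \) w.h.p.,

$$\begin{aligned} \left| \mu ^{-2}-(N\varepsilon \theta _i)^{-2}\right| =\frac{\left| \mu -N\varepsilon \theta _i\right| \left| \mu +N\varepsilon \theta _i\right| }{\mu ^2(N\varepsilon \theta _i)^2}=O_p\left( \frac{N\varepsilon }{(N\varepsilon )^4}\right) =O_p\left( (N\varepsilon )^{-3}\right) . \end{aligned}$$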

This in conjunction with (5.46) completes the proof. \(\square \)

Theorem 2.4 is a simple corollary of the above lemma, as shown below.

Proof of Theorem 2.4

A consequence of Theorem 2.2 is that

$$\begin{aligned} \mu -\mathrm{E}(\mu )=O_p(\sqrt{\varepsilon }). \end{aligned}$$

The claim of Lemma 5.9 is equivalent to

$$\begin{aligned} \lambda _i(B)-\mu =O_p\left( \sqrt{\varepsilon }+(N\varepsilon )^{-1}\right) . \end{aligned}$$

The proof follows by adding the two equations, and noting that B is a deterministic matrix. \(\square \)

5.5 Proof of Theorem 2.5

Next we proceed towards the proof of Theorem 2.5, for which the following lemma will be useful.

Lemma 5.10

Under Assumptions E1 and F2, as \(N\rightarrow \infty \),

$$\begin{aligned} e_j^\prime \left( I-\mu ^{-1}W\right) ^{-n}e_l=\mathbf{1}(j=l)+O_p\left( (N\varepsilon )^{-1}\right) , \quad 1\le j,l\le k,\;n=1,2. \end{aligned}$$

Proof

For a fixed \(n=1,2,\) expand

$$\begin{aligned} \left( I-\mu ^{-1}W\right) ^{-n}= I+n\mu ^{-1}W+O_p\left( \mu ^{-2}\Vert W\Vert ^2\right) . \end{aligned}$$
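
Entrywise, this expansion combines with the Lipschitz estimate \(e_j^\prime e_l=\mathbf{1}(j=l)+O(N^{-1})\) from the proof of Lemma 5.9, with (5.50), and with the bounds \(e_j^\prime We_l=O_p(1)\) and \(\Vert W\Vert ^2=O_p(N\varepsilon )\) underlying the expansions in this section, to give, schematically,

$$\begin{aligned} e_j^\prime \left( I-\mu ^{-1}W\right) ^{-n}e_l=e_j^\prime e_l+n\mu ^{-1}e_j^\prime We_l+O_p\left( \mu ^{-2}\Vert W\Vert ^2\right) =\mathbf{1}(j=l)+O_p\left( (N\varepsilon )^{-1}\right) . \end{aligned}$$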

The required bounds on the individual terms are obtained by proceeding along similar lines as in the proof of Lemma 5.9. \(\square \)

Now we are in a position to prove Theorem 2.5.

Proof of Theorem 2.5

Theorem 2.1 implies that (2.11) holds for any \(i\in {\mathcal {I}}\). Fix such an i, denote

$$\begin{aligned} \mu =\lambda _i(A), \end{aligned}$$

and let v be the eigenvector of A, having norm 1, corresponding to \(\mu \), which is uniquely defined with probability close to 1.

Assume \(k\ge 2\), and fix \(j\in \{1,\ldots ,k\}\setminus \{i\}\). Premultiplying (5.7) by \(e_j^\prime \) yields that

$$\begin{aligned} e_j^\prime v=N\varepsilon \sum _{l=1}^k\theta _l(e_l^\prime v)e_j^\prime \left( \mu I-W\right) ^{-1}e_l, \quad \text {w.h.p.} \end{aligned}$$
(5.51)

Therefore,

$$\begin{aligned}&e_j^\prime v\left( 1-\theta _j\frac{N\varepsilon }{\mu }e_j^\prime \left( I-\mu ^{-1}W\right) ^{-1}e_j\right) \nonumber \\&\quad =\frac{N\varepsilon }{\mu }\sum _{1\le l\le k,\,l\ne j}\theta _l(e_l^\prime v)e_j^\prime \left( I-\mu ^{-1}W\right) ^{-1}e_l, \quad \text {w.h.p.} \end{aligned}$$

Lemma 5.10 implies that as \(N\rightarrow \infty \),

$$\begin{aligned} 1-\theta _j\frac{N\varepsilon }{\mu }e_j^\prime \left( I-\mu ^{-1}W\right) ^{-1}e_j{\mathop {\longrightarrow }\limits ^{P}}1-\frac{\theta _j}{\theta _i}\ne 0. \end{aligned}$$
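
Indeed, Lemma 5.10 together with (5.50) gives

$$\begin{aligned} \theta _j\frac{N\varepsilon }{\mu }e_j^\prime \left( I-\mu ^{-1}W\right) ^{-1}e_j=\theta _j\frac{N\varepsilon }{\mu }\left( 1+O_p\left( (N\varepsilon )^{-1}\right) \right) {\mathop {\longrightarrow }\limits ^{P}}\frac{\theta _j}{\theta _i}, \end{aligned}$$

and \(\theta _j\ne \theta _i\) because \(i\in {\mathcal {I}}\).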

Therefore,

$$\begin{aligned} e_j^\prime v&=O_p\left( \frac{N\varepsilon }{\mu }\sum _{1\le l\le k,\,l\ne j}\theta _l(e_l^\prime v)e_j^\prime \left( I-\mu ^{-1}W\right) ^{-1}e_l\right) \\&=O_p\left( \sum _{1\le l\le k,\,l\ne j}\left| e_j^\prime \left( I-\mu ^{-1}W\right) ^{-1}e_l\right| \right) \\&=O_p\left( (N\varepsilon )^{-1}\right) , \end{aligned}$$

the last line being another consequence of Lemma 5.10. Thus, (2.13) holds.

Using (5.7) once again, and taking squared norms on both sides while recalling that v is a unit vector, we get that

$$\begin{aligned} 1=(N\varepsilon )^2\sum _{l,m=1}^k\theta _l\theta _m(e_l^\prime v)(e_m^\prime v)e_l^\prime \left( \mu I-W\right) ^{-2}e_m, \end{aligned}$$

that is,

$$\begin{aligned}&\theta _i^2(e_i^\prime v)^2e_i^\prime \left( I-\mu ^{-1}W\right) ^{-2}e_i\nonumber \\&\quad =(N\varepsilon )^{-2}\mu ^2-\sum _{(l,m)\in \{1,\ldots ,k\}^2\setminus \{(i,i)\}}\theta _l\theta _m(e_l^\prime v)(e_m^\prime v)e_l^\prime \left( I-\mu ^{-1} W\right) ^{-2}e_m. \end{aligned}$$
(5.52)

Using Lemma 5.10 once again, it follows that

$$\begin{aligned} e_i^\prime \left( I-\mu ^{-1}W\right) ^{-2}e_i=1+O_p\left( (N\varepsilon )^{-1}\right) . \end{aligned}$$

Thus, (2.12) would follow once it is shown that

$$\begin{aligned} (N\varepsilon )^{-2}\mu ^2=\theta _i^2+O_p\left( (N\varepsilon )^{-1}\right) , \end{aligned}$$
(5.53)

and that for all \((l,m)\in \{1,\ldots ,k\}^2\setminus \{(i,i)\}\),

$$\begin{aligned} (e_l^\prime v)(e_m^\prime v)e_l^\prime \left( I-\mu ^{-1} W\right) ^{-2}e_m=O_p\left( (N\varepsilon )^{-1}\right) . \end{aligned}$$
(5.54)

Equation (5.53) is a trivial consequence of (5.50): dividing (5.50) by \(N\varepsilon \) gives \((N\varepsilon )^{-1}\mu =\theta _i+O_p\left( (N\varepsilon )^{-1}\right) \), and squaring yields (5.53). For (5.54), assuming without loss of generality that \(l\ne i\), (2.13) implies that

$$\begin{aligned} \left| (e_l^\prime v)(e_m^\prime v)e_l^\prime \left( I-\mu ^{-1} W\right) ^{-2}e_m\right|&=\left| (e_m^\prime v)e_l^\prime \left( I-\mu ^{-1} W\right) ^{-2}e_m\right| O_p\left( (N\varepsilon )^{-1}\right) \\&\le \left| e_l^\prime \left( I-\mu ^{-1} W\right) ^{-2}e_m\right| O_p\left( (N\varepsilon )^{-1}\right) \\&=O_p\left( (N\varepsilon )^{-1}\right) , \end{aligned}$$

the last line following from Lemma 5.10. Thus, (5.54) follows, which in conjunction with (5.53) establishes (2.12). This completes the proof. \(\square \)

5.6 Proof of Theorem 2.6

Finally, Theorem 2.6 is proved below under Assumptions E2 and F2.

Proof of Theorem 2.6

Fix \(i\in {\mathcal {I}}\). Recall (5.8) and (5.9), and let u be as defined in the former. Let \({\tilde{u}}\) be the column vector obtained by deleting the ith entry of u, \({\tilde{V}}_i\) be the column vector obtained by deleting the ith entry of the ith column of V, and \({\tilde{V}}\) be the \((k-1)\times (k-1)\) matrix obtained by deleting the ith row and ith column of V. Then, (5.9) implies that

$$\begin{aligned} \mu {\tilde{u}}={\tilde{V}}{\tilde{u}}+u(i)\tilde{V}_i, \quad \text {w.h.p.} \end{aligned}$$
(5.55)

Lemma 5.1 implies that

$$\begin{aligned} \left\| I_{k}-\mu ^{-1}V-\mathrm{Diag}\left( 1-\frac{\theta _1}{\theta _i},\ldots ,1-\frac{\theta _k}{\theta _i}\right) \right\| =o_{hp}(1), \end{aligned}$$

and hence, since \(1-\theta _j/\theta _i\ne 0\) for every \(j\ne i\), the matrix \(I_{k-1}-\mu ^{-1}{\tilde{V}}\) is non-singular w.h.p. Thus, (5.55) implies that

$$\begin{aligned} {\tilde{u}}=u(i)\mu ^{-1}\left( I_{k-1}-\mu ^{-1}\tilde{V}\right) ^{-1}{\tilde{V}}_i, \quad \text {w.h.p.} \end{aligned}$$
(5.56)

The next step is to show that

$$\begin{aligned} \left\| \mu ^{-1}V-\mathrm{Diag}\left( \frac{\theta _1}{\theta _i},\ldots ,\frac{\theta _k}{\theta _i}\right) \right\| =o_{p}\left( \sqrt{\varepsilon }\right) . \end{aligned}$$
(5.57)

To see this, use the fact that f is Lipschitz to write, for fixed \(1\le j,l\le k\),

$$\begin{aligned} V(j,l)&=N\varepsilon \sqrt{\theta _j\theta _l}\left( e_j^\prime e_l+\mu ^{-1}e_j^\prime We_l+O_p\left( \mu ^{-2}\Vert W\Vert ^2\right) \right) \nonumber \\&=N\varepsilon \sqrt{\theta _j\theta _l}\left( e_j^\prime e_l+O_p\left( (N\varepsilon )^{-1}\right) \right) \nonumber \\&=N\varepsilon \theta _j\left( \mathbf{1}(j=l)+O_p\left( (N\varepsilon )^{-1}\right) \right) \nonumber \\&=N\varepsilon \theta _j\left( \mathbf{1}(j=l)+o_p\left( \sqrt{\varepsilon }\right) \right) , \end{aligned}$$
(5.58)

the last line following from the fact that

$$\begin{aligned} (N\varepsilon )^{-1}=o\left( \sqrt{\varepsilon }\right) , \end{aligned}$$
(5.59)

which is a consequence of (2.5). This along with (5.50) implies that

$$\begin{aligned} (N\varepsilon \theta _i)^{-1}\mu =1+o_p\left( \sqrt{\varepsilon }\right) . \end{aligned}$$
(5.60)

Combining this with (5.58) yields that

$$\begin{aligned} \mu ^{-1}V(j,l)=\theta _i^{-1}\theta _j\mathbf{1}(j=l)+o_p\left( \sqrt{\varepsilon }\right) . \end{aligned}$$

Thus, (5.57) follows, an immediate consequence of which is that

$$\begin{aligned} \left\| \left( I_{k-1}-\mu ^{-1}\tilde{V}\right) ^{-1}-{\tilde{D}}\right\| =o_p\left( \sqrt{\varepsilon }\right) , \end{aligned}$$
(5.61)

where

$$\begin{aligned} \tilde{D}=\left[ \mathrm{Diag}\left( 1-\frac{\theta _1}{\theta _i},\ldots ,1-\frac{\theta _{i-1}}{\theta _i},1-\frac{\theta _{i+1}}{\theta _i},\ldots ,1-\frac{\theta _k}{\theta _i}\right) \right] ^{-1}. \end{aligned}$$

Next, fix \(j\in \{1,\ldots ,k\}\setminus \{i\}\). By similar arguments as above, it follows that

$$\begin{aligned} V(i,j)&=N\varepsilon \sqrt{\theta _i\theta _j}\left( \sum _{n=0}^3\mu ^{-n}e_i^\prime W^ne_j+O_p\left( \mu ^{-4}\Vert W\Vert ^4\right) \right) \\&=N\varepsilon \sqrt{\theta _i\theta _j}\sum _{n=0}^3\mu ^{-n}e_i^\prime W^ne_j+O_p\left( (N\varepsilon )^{-1}\right) \\&=N\varepsilon \sqrt{\theta _i\theta _j}\sum _{n=1}^2\mu ^{-n}e_i^\prime W^ne_j+o_p\left( \sqrt{\varepsilon }\right) , \end{aligned}$$

using (5.59) once again, because

$$\begin{aligned} N\varepsilon e_i^\prime e_j=O(\varepsilon )=o\left( \sqrt{\varepsilon }\right) , \end{aligned}$$

and

$$\begin{aligned} N\varepsilon \mu ^{-3} e_i^\prime W^3e_j=O_p\left( (N\varepsilon )^{-2}\mathrm{E}(e_i^\prime W^3e_j)\right) =o_p\left( \sqrt{\varepsilon }\right) , \end{aligned}$$

by (4.5). Thus,

$$\begin{aligned} V(i,j)-N\varepsilon \sqrt{\theta _i\theta _j}\mu ^{-1}e_i^\prime We_j&=N\varepsilon \sqrt{\theta _i\theta _j}\mu ^{-2}e_i^\prime W^2e_j+o_p\left( \sqrt{\varepsilon }\right) \\&=N\varepsilon \sqrt{\theta _i\theta _j}\mu ^{-2}\mathrm{E}\left( e_i^\prime W^2e_j\right) +o_p\left( \sqrt{\varepsilon }\right) \\&=(N\varepsilon )^{-1}\theta _j^{1/2}\theta _i^{-3/2}\mathrm{E}\left( e_i^\prime W^2e_j\right) +o_p\left( \sqrt{\varepsilon }\right) , \end{aligned}$$

the second line following from Lemma 4.3, and the last line from (5.59), (5.60) and Lemma 4.2. In particular,

$$\begin{aligned} V(i,j)=O_p(1). \end{aligned}$$

The above in conjunction with (5.61) implies that

$$\begin{aligned}&\left[ \left( I_{k-1}-\mu ^{-1}{\tilde{V}}\right) ^{-1}{\tilde{V}}_i\right] (j)\\&\quad =\left( 1-\frac{\theta _j}{\theta _i}\right) ^{-1}\sqrt{\theta _i\theta _j}\left[ (N\varepsilon )^{-1}\theta _i^{-2}\mathrm{E}\left( e_i^\prime W^2e_j\right) +N\varepsilon \mu ^{-1}e_i^\prime We_j\right] +o_p(\sqrt{\varepsilon }). \end{aligned}$$

In light of (5.56), the above means that

$$\begin{aligned}&e_j^\prime v\\&\quad =(e_i^\prime v)\mu ^{-1}\left( 1-\frac{\theta _j}{\theta _i}\right) ^{-1}\left[ (N\varepsilon )^{-1}\theta _i^{-1}\mathrm{E}\left( e_i^\prime W^2e_j\right) +N\varepsilon \theta _i\mu ^{-1}e_i^\prime We_j+o_p(\sqrt{\varepsilon })\right] \\&\quad =\mu ^{-1}\left( 1-\frac{\theta _j}{\theta _i}\right) ^{-1}\left[ (N\varepsilon )^{-1}\theta _i^{-1}\mathrm{E}\left( e_i^\prime W^2e_j\right) +N\varepsilon \theta _i\mu ^{-1}e_i^\prime We_j+o_p(\sqrt{\varepsilon })\right] , \end{aligned}$$

the last line following from (2.12) and (5.59). Using (5.60) once again yields that

$$\begin{aligned} N\varepsilon (e_j^\prime v)=\frac{1}{\theta _i-\theta _j}\left[ (N\varepsilon )^{-1}\theta _i^{-1}\mathrm{E}\left( e_i^\prime W^2e_j\right) +N\varepsilon \theta _i\mu ^{-1}e_i^\prime We_j\right] +o_p(\sqrt{\varepsilon }). \end{aligned}$$

This completes the proof. \(\square \)