1 Introduction

Random regular graphs have long been one of the most important models in the theory of random graphs. The model is a graph sampled uniformly at random from the set of all regular graphs of a given degree on a given vertex set. Thanks to many nice properties, such as being good expanders with a large spectral gap, which makes the associated random walks mix quickly, these graphs are widely used in computer science. From the perspective of spectral graph theory, the eigenvalues and eigenvectors of the adjacency matrices of random graphs reveal important information about the graphs themselves. The spectra of random regular graphs have been studied both for local properties, such as the spectral gap [5], and for global properties, such as the limiting distribution of the eigenvalues [9, 10]. It was shown that if the degree grows sufficiently fast as a function of the number of vertices, then random regular graphs behave like Erdös–Rényi random graphs, which in turn behave like the Gaussian orthogonal ensemble, on both global and local scales [3, 4, 10]. This phenomenon is evidence for the universality conjecture in modern random matrix theory, which roughly states that the spectrum of a random matrix depends less on the distribution of the entries than on the algebraic structure of the matrix, so that structurally similar matrices with different entry distributions have similar asymptotic spectral properties.

In this paper, we study the model of random regular bipartite graphs, a bipartite analogue of the popular random regular graph model. We are mainly concerned with the asymptotic behavior of the spectra of the adjacency matrices of these graphs as the number of vertices goes to infinity.

In [2], Dumitriu and Johnson studied the convergence of the empirical spectral distribution of random regular bipartite graphs. Their results show that as the degree grows to infinity, there is a strong connection between the adjacency matrix of a random regular bipartite graph and a Wishart random matrix. This is an interesting analogue of the connection between random regular graphs and Wigner random matrices. Due to a limitation of their method, their results only hold if the degree grows more slowly than any power of the number of vertices.

The goal of this paper is to prove an extension of Dumitriu and Johnson's result which allows the degree to grow at faster rates, up to \(o(\sqrt{n})\). Our method is very different and is largely based on the comparison method of [10], which deals with the analogous problem for random regular graphs.

2 Preliminaries and Main Results

A \((d_L,d_R)\)-regular bipartite graph is a bipartite graph on two vertex sets L and R such that every vertex of L has degree \(d_L\) and every vertex of R has degree \(d_R\). The model \(G_{m,n,d_L,d_R}\) is a random graph sampled uniformly from the set of all \((d_L,d_R)\)-regular bipartite graphs on two vertex sets L and R with \(|L|=m\) and \(|R|=n\) (note that necessarily \(md_L=nd_R\)). We assume throughout that \(m\ge n\). The adjacency matrix of \(G_{m,n,d_L,d_R}\) is a random matrix A of the following form (under a proper labeling of the vertices)

$$\begin{aligned} A=\begin{pmatrix} 0&{}\quad X\\ X^T&{}\quad 0 \end{pmatrix}, \end{aligned}$$
(2.1)

where X is an \(m\times n\) (0, 1) random matrix. It is easy to show that the nonzero eigenvalues of A come in pairs \((-\lambda ,\lambda )\), where \(\lambda ^2\) is an eigenvalue of \(X^TX\), and that A has at least \(m-n\) zero eigenvalues; a short verification is given below. We also assume that m and n tend to infinity in such a way that

$$\begin{aligned} \frac{d_R}{d_L}=\frac{m}{n} \longrightarrow \alpha \ge 1. \end{aligned}$$
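For completeness, here is a short verification of the eigenvalue pairing claimed above (a standard argument, spelled out for the reader's convenience). Writing an eigenvector of A as \(v=(u,w)\) with \(u\in {\mathbf{R}}^m\) and \(w\in {\mathbf{R}}^n\), the equation \(Av=\lambda v\) is equivalent to

$$\begin{aligned} Xw=\lambda u,\qquad X^Tu=\lambda w, \end{aligned}$$

so \(X^TXw=\lambda ^2w\) and \((u,-w)\) is an eigenvector of A with eigenvalue \(-\lambda \). Moreover, \(\mathrm{rank}(A)=2\,\mathrm{rank}(X)\le 2n\), so A has at least \((m+n)-2n=m-n\) zero eigenvalues.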

We will compare \(G_{m,n,d_L,d_R}\) with the Erdös–Rényi bipartite random graph model \(G(m,n,p)\), defined on two vertex sets L and R with \(|L|=m\) and \(|R|=n\), in which each potential edge between a vertex in L and a vertex in R is included independently with probability p. Under a proper labeling of the vertices, the adjacency matrix B of \(G(m,n,p)\) has the form

$$\begin{aligned} B=\begin{pmatrix} 0&{}\quad Y\\ Y^T&{}\quad 0 \end{pmatrix}, \end{aligned}$$
(2.2)

where Y is an \(m\times n\) random matrix with iid entries, each equal to 1 with probability p and to 0 with probability \(1-p\).

For an \(n\times n\) Hermitian matrix M with real eigenvalues \(\lambda _1\le \lambda _2\le \dots \le \lambda _n\), the empirical spectral distribution (ESD) is the probability measure \(\mu _n(M)\) defined as

$$\begin{aligned} \mu _n(M)=\frac{1}{n}\sum _{i=1}^{n}\delta _{\lambda _i}, \end{aligned}$$

where \(\delta _\lambda \) is the Dirac measure at \(\lambda \). Note that if M is a random matrix, then \(\mu _n(M)\) is a random probability measure.

Let \(\{\mu _n\}_{n=1}^\infty \) be a sequence of random probability measures on the real line. We say that \(\mu _n\) converges weakly in probability to a deterministic probability measure \(\mu \) if for every bounded continuous function \(f:{\mathbf{R}}\rightarrow {\mathbf{R}}\) and every \(\epsilon >0\) we have

$$\begin{aligned} \lim _{n\rightarrow \infty } P\left( \Big |\int f \hbox {d}\mu _n - \int f\hbox {d}\mu \Big | <\epsilon \right) =1. \end{aligned}$$

It is well known that if M is an \(m\times n\) random matrix whose entries are iid copies of a random variable with mean zero and variance one, and \(m/n\) converges to a finite limit \(\alpha \), then the ESD of \(\frac{1}{n}M^TM\) converges to the Marchenko–Pastur distribution \(\mu _{MP}\) of ratio \(1/\alpha \). Thus, it is natural to expect that the ESD of \(\frac{1}{d_L}X^TX\), with X from the adjacency matrix of \(G_{m,n,d_L,d_R}\), also converges to the Marchenko–Pastur distribution when \(d_L\) grows to infinity.

Recall that the Marchenko–Pastur distribution with ratio \(1/\alpha \) is supported on \([a^2,b^2]\) and given by the density function

$$\begin{aligned} p(x)=\frac{\alpha }{2\pi x}\sqrt{(b^2-x)(x-a^2)}, \end{aligned}$$

where \(a=1-\alpha ^{-1/2}\) and \(b=1+\alpha ^{-1/2}\). Notice that the eigenvalues of \(d_L^{-1/2}A\) are the square roots (taken with both signs) of the eigenvalues of \({d_L^{-1}}X^TX\), together with \(m-n\) additional zero eigenvalues. If the limiting ESD of \({d_L^{-1}}X^TX\) is the Marchenko–Pastur distribution, then the ESD of \(d_L^{-1/2}A\) will have the limit measure \(\mu \), which is supported on \([-b,-a]\cup \{0\}\cup [a,b]\) with density function

$$\begin{aligned} q(x)=\frac{2|x|}{1+\alpha }p(x^2)=\frac{\alpha }{(1+\alpha )\pi |x|}\sqrt{(b^2-x^2)(x^2-a^2)}, \end{aligned}$$
(2.3)

and a point mass of \(\frac{\alpha -1}{\alpha +1}\) at 0. Indeed, the \(m-n\) zero eigenvalues give the point mass of \(\frac{m-n}{m+n}\rightarrow \frac{\alpha -1}{\alpha +1}\) at 0, while the other 2n eigenvalues are described by applying to p(x) the change of variables \(x\mapsto \pm \sqrt{x}\) and rescaling the resulting symmetric density by the factor \(\frac{2n}{m+n}\rightarrow \frac{2}{1+\alpha }\).
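To make the limiting measure \(\mu \) concrete, here is a minimal numerical sketch (ours, not part of the paper), restricted for simplicity to the square case \(m=n\), i.e., \(\alpha =1\), where the point mass vanishes and q reduces to the semicircle density \(\frac{1}{2\pi }\sqrt{4-x^2}\). We use an iid Gaussian block in place of the centered, rescaled adjacency matrix, and assume numpy and scipy are available; all names in the snippet are illustrative.

```python
# Numerical sketch (illustrative only): for m = n, the limiting law q in (2.3) is the
# semicircle density sqrt(4 - x^2) / (2*pi) on [-2, 2].  We build the symmetric block
# matrix (0 M; M^T 0) with iid N(0,1) entries as a stand-in for the normalized adjacency
# matrix and compare eigenvalue counts on a few intervals with the mass predicted by q.
import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(0)
n = 1500
M = rng.standard_normal((n, n))            # plays the role of the m x n block X
A = np.block([[np.zeros((n, n)), M], [M.T, np.zeros((n, n))]])
eigs = np.linalg.eigvalsh(A / np.sqrt(n))  # eigenvalues come in pairs (-lambda, lambda)

def q(x):
    # limiting density for alpha = 1 (semicircle law)
    return np.sqrt(max(4.0 - x * x, 0.0)) / (2.0 * np.pi)

for a, b in [(0.5, 1.0), (1.0, 1.5), (-2.0, -1.0)]:
    empirical = np.mean((eigs >= a) & (eigs <= b))  # fraction of the 2n eigenvalues in [a, b]
    predicted, _ = quad(q, a, b)
    print(f"I = [{a}, {b}]: empirical mass {empirical:.4f}, predicted mass {predicted:.4f}")
```

For n of this size the empirical and predicted masses should agree closely, illustrating the kind of convergence stated in Corollary 2.2 below.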

In [2], Dumitriu and Johnson proved that if \(d_L=o(n^{\epsilon })\) for every fixed \(\epsilon >0\), then the ESD of \(\frac{1}{d_L}X^TX\) converges to the Marchenko–Pastur law. They also proved a local law of the following form. Assume that \(d_L=\exp (o(\sqrt{\log n}))\). For any interval I whose length is at least \(\max (2\eta ,\eta /(-\delta \log \delta ))\) and which does not contain \([-\epsilon ,\epsilon ]\), there exists a constant \(C_\epsilon \) so that for all \(\delta >0\) and n large enough, the following holds with probability \(1-o(1/n)\):

$$\begin{aligned} |\mu _n(I)-\mu (I)|<\delta C_\epsilon |I|. \end{aligned}$$

Here \(\mu _n\) is the ESD of \(d_L^{-1/2}A\) and \(\mu \) is the limiting distribution defined by (2.3). Since \(\eta \) is roughly \(1/d_L\), the length of I is allowed to shrink as \(\eta \) does.

We are going to prove a similar local law for the range \(\omega (\log n)\le d_L\le o(\sqrt{n})\), which extends the result of Dumitriu and Johnson. Let R be the normalized adjacency matrix of \(G_{m,n,d_L,d_R}\):

$$\begin{aligned} R=\frac{1}{\sqrt{\frac{d_L}{n}\left( 1-\frac{d_L}{n}\right) }}\left[ A-\frac{d_L}{n}\begin{pmatrix}0&{}J\\ J^T&{}0\end{pmatrix}\right] , \end{aligned}$$
(2.4)

where J is the \(m\times n\) all-ones matrix. Since \(\begin{pmatrix}0&{}J\\ J^T&{}0\end{pmatrix}\) has rank 2, R is (up to the scalar normalization) a rank-2 perturbation of A; by Weyl's eigenvalue interlacing for low-rank perturbations, the eigenvalue counting functions of the two matrices differ by at most a bounded constant on any interval, so the ESD of R has the same global behavior (after proper scaling) as that of A. Our main result is

Theorem 2.1

Suppose \(\omega (\log n)\le d_L\le o(\sqrt{n})\) as n tends to infinity. Let \(\delta >0\) and let \(N_I\) be the number of eigenvalues of \(n^{-1/2}R\) in an interval I avoiding \(\{0\}\) whose length is at least \(\big (\frac{\log d_L}{\delta ^3d_L^{1/2}}\big )^{1/4}\). Then

$$\begin{aligned} |N_I-n\mu (I)|<\delta n\mu (I) \end{aligned}$$

with probability at least \(1-O(\exp (-cn\sqrt{d_L}\log (d_L)))\) for some constant \(c>0\).

If \(m=n\), then I does not need to avoid \(\{0\}\).

The convergence of the ESD of \(d_L^{-1/2}A\) is a direct consequence of Theorem 2.1.

Corollary 2.2

The ESD of \(d_L^{-1/2}A\) converges weakly in probability to the limiting measure \(\mu \) [as defined by (2.3)].

The rest of the paper is organized as follows. In Sect. 3, we present the proof of Theorem 2.1, pending two important lemmas. In Sect. 4, we prove the first lemma, a lower bound on the probability that an Erdös–Rényi random bipartite graph is regular; this provides the tool for comparing the Erdös–Rényi model with the regular model. In Sect. 5, we prove the second lemma, a general concentration result for Wishart-type random matrices.

3 Proof of Theorem 2.1

We use the comparison method. Our proof relies on two important results. The first one is a lower bound on the probability that an Erdös–Rényi random bipartite graph is regular. Recall that the Erdös–Rényi bipartite graph model \(G(m,n,p)\) consists of two vertex sets L and R of sizes m and n, respectively, and each potential edge between a vertex of L and a vertex of R is included independently with probability p.

Lemma 3.1

If \(p=o(1/\sqrt{n})\) and \(m/n\rightarrow \alpha <\infty \) as \(n\rightarrow \infty \), then \(G(m,n,p)\) is \((np,mp)\)-regular with probability at least \(\exp (-O(n(np)^{1/2}))\).

Another key ingredient of the proof is the following concentration lemma, which may be of independent interest.

Lemma 3.2

Let M be an \((m+n)\times (m+n)\) Hermitian random matrix of the form

$$\begin{aligned} M= \begin{pmatrix} 0&{}\quad X\\ X^{T}&{}\quad 0 \end{pmatrix}, \end{aligned}$$

where X is an \(m\times n\) random matrix whose entries \(\xi _{ij}\) are independent random variables with mean zero, variance 1 and \(|\xi _{ij}|< K\) for some common constant K (K may depend on n). Suppose \(m/n\) converges to \(1< \alpha <\infty \) as m and n tend to infinity. Fix \(\delta >0\) and assume that the eighth moment \(\gamma _8:=\sup _{i,j}{\mathbf{E}}(|\xi _{ij}|^8)\) is finite. Then for any interval \(I\subset {\mathbf{R}}\) avoiding \(\{0\}\) whose length is at least \(\varOmega (\delta ^{-1/2}\gamma _8^{1/8}n^{-1/4})\), there is a constant \(c>0\) such that the number \(N_I\) of eigenvalues of \(\frac{1}{\sqrt{n}}M\) belonging to I satisfies the following concentration inequality

$$\begin{aligned} {\mathbf{P}}(|N_I-n\mu (I)|>\delta n\mu (I))\le 4\exp \left( -c\frac{\delta ^3n^2|I|^4}{K^2}\right) , \end{aligned}$$

where \(\mu \) is the limiting distribution defined by (2.3).

In the case \(\alpha =1\), the minimal length of I increases to \(\varOmega (\delta ^{-1/2}n^{-1/8})\).

Applying Lemma 3.2 to the normalized adjacency matrix of \(G(m,n,p)\),

$$\begin{aligned} M=\frac{1}{\sqrt{p(1-p)}}\left[ B-p\begin{pmatrix}0&{}J\\ J^T&{}0\end{pmatrix}\right] \end{aligned}$$

with \(K=1/\sqrt{p}\), we obtain

Corollary 3.3

Let \(\delta >0\) and let \(N_I\) be the number of eigenvalues of \(\frac{1}{\sqrt{n}}M\) inside an interval I avoiding \(\{0\}\) whose length is at least \(\big (\frac{\log (np)}{\delta ^3(np)^{1/2}}\big )^{1/4}\). Then there is a constant \(c>0\) so that

$$\begin{aligned} |N_I-n\mu (I)|\ge \delta n\mu (I) \end{aligned}$$

with probability at most \(\exp (-cn(np)^{1/2}\log (np))\).
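For the reader's convenience, here is the arithmetic behind Corollary 3.3 (our expansion of this step). The entries of the normalized matrix M are \((Y_{ij}-p)/\sqrt{p(1-p)}\) with \(Y_{ij}\) Bernoulli(p), so they have mean zero, variance one and are bounded in absolute value by \(\sqrt{(1-p)/p}\le 1/\sqrt{p}=K\). Plugging \(K^2=1/p\) and \(|I|^4\ge \frac{\log (np)}{\delta ^3(np)^{1/2}}\) into the bound of Lemma 3.2 gives

$$\begin{aligned} c\frac{\delta ^3n^2|I|^4}{K^2}\ \ge \ c\,\delta ^3n^2\cdot \frac{\log (np)}{\delta ^3(np)^{1/2}}\cdot p\ =\ c\,n(np)^{1/2}\log (np), \end{aligned}$$

which is the exponent appearing in Corollary 3.3.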

By Corollary 3.3 and Lemma 3.1, the probability that \(N_I\) fails to be close to its expected value in the model \(G(m,n,p)\) is much smaller than the probability that \(G(m,n,p)\) is \((np,mp)\)-regular. Since the conditional distribution of \(G(m,n,p)\), given that \(G(m,n,p)\) is \((np,mp)\)-regular, is the same as the distribution of \(G_{m,n,d_L,d_R}\) with \(d_L=np\), \(d_R=mp\), the probability that \(N_I\) fails to be close to its expected value in the model \(G_{m,n,d_L,d_R}\) is at most the ratio of these two probabilities, which is \(O(\exp (-cn\sqrt{np}\log (np)))\) for some small positive constant c. Thus, Theorem 2.1 is proved, pending Lemmas 3.1 and 3.2, which we prove in the next two sections.
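Spelling out this comparison (again our expansion of the step): let E denote the event \(|N_I-n\mu (I)|\ge \delta n\mu (I)\) and let \(\mathcal {R}\) denote the event that \(G(m,n,p)\) is \((np,mp)\)-regular. Then

$$\begin{aligned} {\mathbf{P}}(E\mid \mathcal {R})=\frac{{\mathbf{P}}(E\cap \mathcal {R})}{{\mathbf{P}}(\mathcal {R})}\le \frac{{\mathbf{P}}(E)}{{\mathbf{P}}(\mathcal {R})}\le \frac{\exp (-cn(np)^{1/2}\log (np))}{\exp (-O(n(np)^{1/2}))}\le \exp (-c'n(np)^{1/2}\log (np)) \end{aligned}$$

for some constant \(c'>0\) and all n large enough (here we use that \(np=d_L\rightarrow \infty \), so \(\log (np)\rightarrow \infty \)), and the left-hand side is exactly the failure probability in the model \(G_{m,n,d_L,d_R}\).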

4 Proof of Lemma 3.1

We will use a result of McKay and Wang [8] on the asymptotic number of regular bipartite graphs. Let \(N(m,n,s,t)\) be the number of bipartite graphs on two vertex sets L and R with \(|L|=m\), \(|R|=n\), in which every vertex of L has degree s and every vertex of R has degree t.

Theorem 4.1

([8], Theorem 2) Let \(S=nt=ms\) and \(\rho =n/s=m/t\). Suppose that \(S\rightarrow \infty \) and \(1\le st=o(\rho ^2)\). Then \(N(m,n,s,t)\) is given by

$$\begin{aligned} \frac{S!}{(s!)^m(t!)^n}\exp \left( -\frac{(s-1)(t-1)}{2}-\frac{(s-1)(t-1)(2st-s-t+2)}{12S}+O\left( \frac{s^3t^3}{S^2}\right) \right) . \end{aligned}$$

Proof of Lemma 3.1

In our setting, \(s=np\), \(t=mp=\alpha np\), \(S=mnp=\alpha n^2p\) and \(\rho =1/p\). Since \(p=o(1/\sqrt{n})\), we have \(st=\alpha n^2p^2=o(1/p^2)=o(\rho ^2)\), so Theorem 4.1 applies. Then

$$\begin{aligned}&P(G(m,n,p)\text { is }(np,mp)\text {-regular})\\&\quad = N(m,n,np,mp)\times p^{mnp}(1-p)^{mn-mnp}\\&\quad =\frac{(\alpha n^2p)!}{((np)!)^{\alpha n}((\alpha np)!)^n}\,p^{\alpha n^2p}(1-p)^{\alpha n^2(1-p)}\exp (-O(n^2p^2))\\&\quad =(2\pi \alpha n^2p)^{0.5}(2\pi np)^{-0.5\alpha n}(2\pi \alpha np)^{-0.5n}e^{\alpha n^2p}(1-p)^{\alpha n^2(1-p)}\exp (-O(n^2p^2))\\&\quad =\exp \big (0.5\ln (2\pi \alpha n^2p)-0.5\alpha n\ln (2\pi np)-0.5n\ln (2\pi \alpha np)\\&\qquad +\alpha n^2p+\alpha n^2(1-p)\ln (1-p)-O(n^2p^2)\big )\\&\quad \ge \exp (-O(n\ln (np)+n^2p^2))\ge \exp (-O(n\sqrt{np})). \end{aligned}$$
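Here the factor \(\exp (-O(n^2p^2))\) absorbs the correction terms in the exponent of Theorem 4.1; with \(s=np\), \(t=\alpha np\) and \(S=\alpha n^2p\) one has (our expansion of this step)

$$\begin{aligned} \frac{(s-1)(t-1)}{2}=O(n^2p^2),\quad \frac{(s-1)(t-1)(2st-s-t+2)}{12S}=O(n^2p^3),\quad \frac{s^3t^3}{S^2}=O(n^2p^4). \end{aligned}$$

Moreover \(n^2p^2=o(n\sqrt{np})\) because \(p=o(1/\sqrt{n})\), so this factor is harmless in the final bound.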

5 Proof of Lemma 3.2

Assume \(I=[a,b]\) where \(a<b<0\) or \(0<a<b\).

We will use the approach of Guionnet and Zeitouni in [7]. Consider a random Hermitian matrix \(W_n\) with independent entries \((W_n)_{ij}=A_{ij}w_{ij}\) where

  • \(A=(A_{ij})\) is a deterministic matrix of the form

    $$\begin{aligned}A= \begin{pmatrix} 0&{}\quad J\\ J^{T}&{}\quad 0 \end{pmatrix} \end{aligned}$$

    with J the \(m\times n\) all-ones matrix.

  • \(w_{ij}\)’s are iid copies of a random variable w with mean zero and variance one, whose support is contained in a compact set; in particular, |w| is bounded by a constant K.

  • \(A_{ij}\)’s and \(w_{ij}\)’s are independent.

Let f be a real convex L-Lipschitz function and define

$$\begin{aligned} Z:=\sum _{i=1}^nf(\lambda _i), \end{aligned}$$

where the \(\lambda _i\)’s are the eigenvalues of \(\frac{1}{\sqrt{n}}W_n\). We are going to view Z as a function of the variables \(w_{ij}\). For our application, we need the \(w_{ij}\) to be independent random variables with mean zero and variance 1, whose absolute values are bounded by a common constant K (K may depend on n).

The following concentration inequality is a version of Theorem 1.1 in [7].

Lemma 5.1

Let \(W_n, f, Z\) be as above. Then there is a constant \(c>0\) such that for any \(T>0\)

$$\begin{aligned} {\mathbf{P}}(|Z-{\mathbf{E}}(Z)|\ge T)\le 4\exp \left( -c\frac{T^2}{K^2L^2}\right) . \end{aligned}$$

In order to apply Lemma 5.1 to \(N_I\) and M, it is natural to consider

$$\begin{aligned} Z:=N_I=\sum _{i=1}^n \chi _I(\lambda _i), \end{aligned}$$

where \(\chi _I\) is the indicator function of I and the \(\lambda _i\) are the eigenvalues of \(\frac{1}{\sqrt{n}}M\). However, this function is neither convex nor Lipschitz. As suggested in [7], one can overcome this problem by a proper approximation. Define \(I_l=[a-\frac{|I|}{C},a]\), \(I_r=[b,b+\frac{|I|}{C}]\), where C is a constant to be chosen later, and construct two real functions \(f_1\) and \(f_2\) as follows (see Fig. 1):

$$\begin{aligned} f_1(x)= & {} \left\{ \begin{array}{l@{\quad }l} -\frac{C}{|I|}(x-a)-1&{} \text {if }x\in \left( -\infty , a-\frac{|I|}{C}\right) \\ 0&{}\text {if }x\in I\cup I_l\cup I_r\\ \frac{C}{|I|}(x-b)-1&{} \text {if }x\in \left( b+\frac{|I|}{C},\infty \right) \end{array}\right. \\ f_2(x)= & {} \left\{ \begin{array}{l@{\quad }l} -\frac{C}{|I|}(x-a)-1&{} \text {if }x\in (-\infty , a)\\ -1&{}\text {if }x\in I\\ \frac{C}{|I|}(x-b)-1&{} \text {if }x\in (b,\infty ). \end{array}\right. \end{aligned}$$
Fig. 1  Auxiliary functions used in the proof

Note that \(f_j\)’s are convex and \(\frac{C}{|I|}\)-Lipschitz. Define

$$\begin{aligned} X_1=\sum _{i=1}^n f_1(\lambda _i),\ X_2=\sum _{i=1}^n f_2(\lambda _i) \end{aligned}$$

and apply Lemma 5.1 with \(T=\frac{\delta }{8} n\mu (I)\) and Lipschitz constant \(L=\frac{C}{|I|}\) to \(X_1\) and \(X_2\). Thus, we have

$$\begin{aligned} {\mathbf{P}}(|X_j-{\mathbf{E}}(X_j)|\ge \frac{\delta }{8} n\mu (I))&\le 4\exp \left( -c\frac{\delta ^2n^2|I|^2(\mu (I))^2}{K^2C^2}\right) . \end{aligned}$$

Direct calculation shows that for an interval I contained in the support of \(\mu \) one has \(\mu (I)\ge c_0|I|^2\) for some constant \(c_0>0\) depending only on \(\alpha \) (in the bulk the density of \(\mu \) is bounded away from zero, and at the edges it vanishes only like a square root). Thus, we have for \(j=1,2\)

$$\begin{aligned} {\mathbf{P}}(|X_j-{\mathbf{E}}(X_j)|\ge \frac{\delta }{8} n\mu (I))\le 4\exp \left( -c_1\frac{\delta ^2n^2|I|^4}{K^2C^2}\right) . \end{aligned}$$

Let \(X=X_1-X_2\), then

$$\begin{aligned} {\mathbf{P}}(|X-{\mathbf{E}}(X)|\ge \frac{\delta }{4} n\mu (I))\le O\left( \exp \left( -c_1\frac{\delta ^2n^2|I|^4}{K^2C^2}\right) \right) . \end{aligned}$$
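Before comparing X with Z, note the pointwise sandwich that makes X a good proxy for \(N_I\) (immediate from the definitions of \(f_1\) and \(f_2\)):

$$\begin{aligned} \chi _I\le f_1-f_2\le \chi _{I\cup I_l\cup I_r},\qquad \text {hence}\qquad Z=N_I\le X\le N_I+N_{I_l}+N_{I_r}. \end{aligned}$$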

Now we compare X to Z using the following result of Götze and Tikhomirov on the rate of convergence to the Marchenko–Pastur law.

Lemma 5.2

([6], Theorem 1.1) Let \(W_n=(\omega _{ij})\) be an \(m\times n\) random matrix whose entries are independent with mean zero and variance one, and \(\gamma _8=\sup _{i,j}{\mathbf{E}}(|\omega _{ij}|^8)<\infty \). Suppose that \(m/n\) converges to \(1<\alpha <\infty \). Then for any \(I\subset {\mathbf{R}}\) the number \(N'_I\) of eigenvalues of \(\frac{1}{n}W_n^{T}W_n\) inside I satisfies

$$\begin{aligned} |{\mathbf{E}}(N'_I)-n\mu _{MP}(I)|<\beta ' n^{1/2}\gamma _8^{1/4}, \end{aligned}$$

where \(\beta '\) is an absolute constant.

Since \(\mu \) is obtained from \(\mu _{MP}\) by the change of variables described in Sect. 2, the same convergence rate (with a different constant) holds in our setting:

$$\begin{aligned} |{\mathbf{E}}(N_I)-n\mu (I)|<\beta \gamma _8^{1/4}n^{1/2}. \end{aligned}$$

We have \({\mathbf{E}}(X-Z)\le {\mathbf{E}}(N_{I_l}+N_{I_r})\). Thus, by Lemma 5.2

$$\begin{aligned} {\mathbf{E}}(X)\le {\mathbf{E}}(Z)+n(\mu (I_l)+\mu (I_r))+\beta \gamma _8^{1/4}n^{1/2}. \end{aligned}$$

Choose \(C=(4/\delta )^{1/2}\). Because \(|I|\ge \varOmega (\delta ^{-1/2}\gamma _8^{1/8}n^{-1/4})\), we have

$$\begin{aligned} n(\mu (I_l)+\mu (I_r))=\Theta \left( n\left( \frac{|I|}{C}\right) ^{2}\right) >\varOmega \left( \gamma _8^{1/4}n^{1/2}\right) \end{aligned}$$

and

$$\begin{aligned} n(\mu (I_l)+\mu (I_r))+\beta \gamma _8^{1/4}n^{1/2}=\Theta \left( n\left( \frac{|I|}{C}\right) ^{2}\right) =\Theta \left( \frac{\delta }{4}n\mu (I)\right) . \end{aligned}$$

Therefore, with probability at least \(1-O(\exp (-c_1\frac{\delta ^3n^2|I|^4}{K^2}))\), we have

$$\begin{aligned} Z\le X\le {\mathbf{E}}(X)+ \frac{\delta }{4}n\mu (I) < {\mathbf{E}}(Z)+\frac{\delta }{2} n\mu (I). \end{aligned}$$

Lemma 5.2 again gives

$$\begin{aligned} {\mathbf{E}}(N_I)<n\mu (I)+ \beta \gamma _8^{1/4}n^{1/2}<\left( 1+\frac{\delta }{2}\right) n\mu (I); \end{aligned}$$

hence, with probability at least \(1-O(\exp (-c_1\frac{\delta ^3n^2|I|^4}{K^2}))\)

$$\begin{aligned} N_I<(1+\delta ) n\mu (I), \end{aligned}$$

which is the desired upper bound.

In the case \(\alpha =1\), the rate of convergence is slower; as proved in [1],

$$\begin{aligned} |{\mathbf{E}}(N_I)-n\mu (I)|<O(n^{3/4}). \end{aligned}$$

Using the assumption that \(|I|\ge \varOmega (\delta ^{-1/2}n^{-1/8})\) and repeating the argument from the case where \(\alpha \) is bounded away from 1, we reach the same conclusion.

The lower bound is proved using a similar argument. Let \(I'=[a+\frac{|I|}{C}, b-\frac{|I|}{C}]\), \(I'_l=[a,a+\frac{|I|}{C}]\), \(I'_r=[b-\frac{|I|}{C},b]\) where C is to be chosen later and define two functions \(g_1\) and \(g_2\) as follows (see Fig. 1):

$$\begin{aligned} g_1(x)= & {} \left\{ \begin{array}{l@{\quad }l} -\frac{C}{|I|}(x-a)&{} \text {if }x\in (-\infty , a)\\ 0&{}\text {if }x\in I'\cup I'_l\cup I'_r\\ \frac{C}{|I|}(x-b)&{} \text {if }x\in (b,\infty ) \end{array}\right. \\ g_2(x)= & {} \left\{ \begin{array}{l@{\quad }l} -\frac{C}{|I|}(x-a)&{} \text {if }x\in \left( -\infty , a+\frac{|I|}{C}\right) \\ -1&{}\text {if }x\in I'\\ \frac{C}{|I|}(x-b)&{} \text {if }x\in \left( b-\frac{|I|}{C},\infty \right) . \end{array}\right. \end{aligned}$$

Define

$$\begin{aligned} Y_1=\sum _{i=1}^ng_1(\lambda _i),\ Y_2=\sum _{i=1}^ng_2(\lambda _i). \end{aligned}$$
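The analogous sandwich here (again immediate from the definitions) is

$$\begin{aligned} \chi _{I'}\le g_1-g_2\le \chi _I,\qquad \text {hence}\qquad N_{I'}\le Y_1-Y_2\le N_I, \end{aligned}$$

and \(\mu (I\setminus I')=\mu (I'_l)+\mu (I'_r)\) is controlled exactly as \(\mu (I_l)+\mu (I_r)\) was above.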

A similar argument using Lemmas 5.1 and 5.2 with \(Y_1\), \(Y_2\) in place of \(X_1\), \(X_2\) shows that with probability at least \(1-O(\exp (-c_2\frac{\delta ^3n^2|I|^4}{K^2C^2}))\)

$$\begin{aligned} N_I>(1-\delta )n\mu (I). \end{aligned}$$

Thus, Lemma 3.2 is proved.