Abstract
In this work, we study the spectrum of the normalized Laplacian and its regularized version for random geometric graphs (RGGs) in various scaling regimes. Two scaling regimes are of special interest, the connectivity and the thermodynamic regime. In the connectivity regime, the average vertex degree grows logarithmically in the graph size or faster. In the thermodynamic regime, the average vertex degree is a constant. We introduce a deterministic geometric graph (DGG) with nodes in a grid and provide an upper bound to the probability that the Hilbert–Schmidt norm of the difference between the normalized Laplacian matrices of the RGG and DGG is greater than a certain threshold in both the connectivity and thermodynamic regime. Using this result, we show that the RGG and DGG normalized Laplacian matrices are asymptotically equivalent with high probability (w.h.p.) in the full range of the connectivity regime. The equivalence is even stronger and holds almost surely when the average vertex degree \(a_n\) satisfies the inequality \(a_n > 24 \log (n).\) Therefore, we use the regular structure of the DGG to show that the limiting eigenvalue distribution of the RGG normalized Laplacian matrix converges to a distribution with a Dirac atomic measure at zero. In the thermodynamic regime, we approximate the eigenvalues of the regularized normalized Laplacian matrix of the RGG by the eigenvalues of the DGG regularized normalized Laplacian and we provide an error bound which is valid w.h.p. and depends upon the average vertex degree.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
The spectra of random matrices and random graphs have been extensively studied in the literature [1,2,3,4,5]. Spectral graph methods have become a fundamental tool in the analysis of large complex networks, and related disciplines, with a broad range of applications in telecommunication, machine learning, data mining, and web search. A first natural random graph model of complex networks is the Erdős-Rényi (ER) random graph [6] where edges between nodes appear with equal probabilities. This model has many appealing analytical properties but does not model important features of many real complex networks. In particular, the ER graph fails in describing clustering properties of graphs in which the geographical distance is a critical factor, as for example wireless ad-hoc network [7], sensor network [8], and the study of the dynamics of a viral spreading in a specific network of interactions [9, 10]. To properly model such networks, we consider a special class of graphs known as random geometric graphs (RGGs) [11]. Another very important motivation for the study of RGGs is their applications to statistics and learning. Many clustering techniques such as the nearest-neighbor technique in statistics and machine learning are based on the spatial structure of RGGs [12].
Let us analytically define the RGG studied in this work. We consider a finite set \(\mathcal {X}_{n}\) of n nodes,Footnote 1\(x_{1},...,x_{n},\) distributed uniformly and independently on the d-dimensional torus \(\mathbb {T}^d \equiv [0, 1]^d\). Taking a torus \(\mathbb {T}^d\) instead of a cube exempts us from considering boundary effects. Given a threshold \(r_{n} >0 \), we form a graph by connecting two nodes \(x_{i}, x_{j} \in \mathcal {X}_{n}\) if the \(\ell _{p}\)-distance between them is at most \(r_{n}\), i.e., \(\Vert x_{i}- x_{j} \Vert _{p} \le r_{n}\) with \( p \in [1,\infty ]\) (that is, either \(p \in [1,\infty )\) or \(p=\infty \), see Figure 1(a). Here \(\Vert . \Vert _{p}\) is the \(\ell _{p}\)-distance on \(\mathbb {R}^d\) defined as
For \(p=2,\) the \(\ell _{2}\)-distance defines the standard Euclidean distance between two points in \(\mathbb {R}^d\). When \(p=\infty \), we obtain the Chebyshev distance, i.e., the maximum of the differences between the coordinates in any dimension of the two points. Such graphs, denoted by \(G(\mathcal {X}_{n}, r_{n})\), are called RGGs and are extensively discussed in [11]. Typically, the function \(r_{n}\) is chosen such that \(r_{n}\rightarrow 0\) when \(n \rightarrow \infty \). Unlike ER graphs, the RGG is an inherently harder model to work with since the nature of the graph induces dependencies between edges.
The degree of a vertex in \(G(\mathcal {X}_{n}, r_{n})\) is the number of edges incident to it. Let \(\theta ^{(d)}\) denote the volume of the d-dimensional unit hypersphere in \(\mathbb {T}^d\). Then, the average vertex degree in \(G(\mathcal {X}_{n}, r_{n})\) is equal to \(a_{n}=\theta ^{(d)} nr_{n}^d\) [11]. The properties of RGGs, including spectral properties, often depend on the average vertex degree \(a_{n}\). Two different scaling regimes for \(a_{n}\) are of interest in this work. The first one is the connectivity regime in which the average vertex degree \(a_{n}\) grows logarithmically in n or faster, i.e., \(a_n=\varOmega (\log (n))\)Footnote 2 [11]. The second scaling regime of interest is the thermodynamic regime, in which the average vertex degree is a constant \(\gamma \), i.e., \(a_{n} \equiv \gamma \) [11]. This regime models real-world complex networks that consist of disconnected components, e.g., social interaction networks [13] and web graphs [14].
This work focuses on the analysis of the spectra of the normalized Laplacian matrix of \(G(\mathcal {X}_{n}, r_{n})\) in the connectivity and thermodynamic regimes under any \(\ell _{p}\)-distance as d remains fixed and \(n \rightarrow \infty \). To this end, we introduce the set \(\mathcal {D}_{n}\) of the n nodes that are at the intersections of all parallel hyperplanes with separation \(n^{-1/d}\), see Fig. 1(b), and define a deterministic graph \(G(\mathcal {D}_{n}, r_{n})\) by connecting any two nodes \(x'_{i}\), \(x'_{j}\) \(\in \mathcal {D}_{n}\) if \(\parallel x'_{i}- x'_{j} \parallel _{p} \le r_{n}\) for \(p \in [1,\infty ].\) For both \(G(\mathcal {X}_{n}, r_{n})\) and \(G(\mathcal {D}_{n}, r_{n})\), we assume that there is at most one edge between any pair of nodes. There is no edge from a vertex to itself. Moreover, we assume that the edges are not directed.
In the following, we define the normalized Laplacian matrix for \(G(\mathcal {X}_{n}, r_{n})\) and \(G(\mathcal {D}_{n}, r_{n})\).
Let \(\mathcal {N}(x_{i})\) be the set of neighbors of vertex \(x_{i}\) in \(G(\mathcal {X}_{n}, r_{n})\) and \(\mathcal {N}(x'_{i})\) be the set of neighbors of vertex \(x'_{i}\) in \(G(\mathcal {D}_{n}, r_{n})\). The normalized Laplacian matrices \(\mathcal {L}(\mathcal {X}_{n})\) and \(\mathcal {L}(\mathcal {\mathcal {D}}_{n})\) for \(G(\mathcal {X}_{n}, r_{n})\) and \(G(\mathcal {D}_{n}, r_{n})\), respectively, have entries defined by
where \(\mathbf {N}(x_{i})\) and \(\mathbf {N}(x'_{i})\) are the sizes of the two sets \(\mathcal {N}(x_{i})\) and \(\mathcal {N}(x'_{i})\), respectively, and \(\delta _{\mathrm{ij}}\) is the Kronecker delta function. The term \(\chi [x_{i}\thicksim x_{j}]\) takes unit value when there is an edge between nodes \(x_{i}\) and \(x_{j}\) in \(G(\mathcal {X}_{n}, r_{n})\) and zero otherwise, i.e.,
A similar definition holds for \(\chi [x'_{i}\thicksim x'_{j}]\) defined over the nodes in \(G(\mathcal {D}_{n}, r_{n})\). Recall that \(a_{n}\) denotes the average vertex degree in \(G(\mathcal {X}_{n}, r_{n})\) and, is constant and independent of n in the thermodynamic regime, i.e., \(a_{n}\equiv \gamma >0\). In \(G(\mathcal {D}_{n}, r_{n})\), the number of neighbors of each vertex is the same and we denote it by \(a'_{n}=\mathbf {N}(x'_{i})\). In particular, in the thermodynamic regime \(\mathbf {N}(x'_{i})= \gamma '\). Additionally, \(\mathbf {N}(x_{i}, x'_{i})= \sum \limits _{\begin{array}{c} j \end{array}}^{} \chi [x_{i} \sim x_{j}]\chi [x'_{i} \sim x'_{j}] \le a'_{n} \ \forall \ i, j\).
Note that the above formal definition of the normalized Laplacian in (1) requires \(G(\mathcal {X}_{n}, r_{n})\) and \(G(\mathcal {D}_{n}, r_{n})\) not to have isolated vertices. To overcome the problem of singularities due to isolated vertices in the thermodynamic regime, we follow the scheme proposed in [15]. It corresponds to the normalized Laplacian matrix on a modified graph constructed by adding auxiliary edges among all the nodes with weight \(\frac{\alpha }{n}> 0\). Specifically, the entries of the normalized Laplacian matrices are modified as
The corresponding matrices are referred to as regularized normalized Laplacian matrices [15]. Observe that for \( \alpha =0,\) (2) reduces to (1). The matrices \( \hat{\mathcal {L}}({\mathcal {X}_{n}})\) and \( \hat{\mathcal {L}}({\mathcal {D}_{n}})\) are symmetric, and consequently, their spectra consist of real eigenvalues. We denote by \(\lbrace \hat{\mu }_{i}, i=1,..,n \rbrace \) and \(\lbrace \hat{\lambda }_{i}, i=1,..,n \rbrace \) the sets of all real eigenvalues of the real symmetric square matrices \( \hat{\mathcal {L}}({\mathcal {X}_{n}})\) and \( \hat{\mathcal {L}}({\mathcal {D}_{n}})\) of order n, respectively. Then, the empirical spectral distribution functions of \( \hat{\mathcal {L}}({\mathcal {X}_{n}})\) and \( \hat{\mathcal {L}}({\mathcal {D}_{n}})\) are defined as
where \(\mathrm {I}\lbrace \mathrm {B} \rbrace \) denotes the indicator of an event \(\mathrm {B}\). The goal of this work is to show that \(F^{ \hat{\mathcal {L}}({\mathcal {D}_{n}})} \) is a good approximation for \(F^{ \hat{\mathcal {L}}({\mathcal {X}_{n}})}\) when n is large in both the connectivity and thermodynamic regimes. For that, we use a methodology that relates the asymptotic behavior of two sequences of matrices.
The rest of this paper is organized as follows. In Sect. 2, we provide an overview of related work. In Sect. 3, we present the main results of this work. In Sect. 4, we show how the results presented in Sect. 3 on the relation between the spectrum of RGGs and DGGs are obtained, and how to use the DGG structure to provide analytical approximations of the RGG eigenvalues. Numerical results are given in Sect. 5 to validate the theoretical results by comparing the limiting eigenvalue distribution (LED) obtained analytically and by simulation. Finally, conclusions and implications are drawn in Sect. 6.
2 Related Work
The spectral properties of random matrices are fundamental tools to predict and analyze complex network behavior. The work in [16] investigates the combinatorial Laplacian spectra of RGGs and shows that the spectrum consists of both discrete and continuous parts. The discrete part of the spectrum is a collection of Dirac atomic measures at integer values. The work in [17] shows that the Dirac atomic measure appear mainly due to the existence of symmetric motifsFootnote 3 that occur abundantly in RGGs compared to ER graphs. Other works also analyzed the symmetric motifs in RGGs and studied the probabilities of their appearance (see e.g., [18]).
Several works analyzed the spectra of Euclidean random matrices given by \(H_{n}=f(|| x_{i}-x_{j}||_{2} )\) when n and d grows large and function \(f(\cdot )\) belongs to a given class [19, 20]. However, the obtained results cannot be applied to our problem as we assume that the dimension d stays fixed. The work in [20] also studies the LED of Euclidean random matrices \(H_{n}\) when the dimension d remains fixed. It shows that under some conditions the LED of \(H_{n}\) converges almost surely to a Dirac atomic measure at zero. However, the results in [20] require the continuity of function f and they cannot be applied to the step function considered in this work.
Regarding spectral properties of the adjacency matrix of (RGGs), in [21] and [22], the authors show that the spectral distribution of the adjacency matrix has a limit in the thermodynamic regime as \(n\rightarrow \infty \). Due to the difficulty to compute exactly this spectral measure, in [21], the author proposes an approximation for it as \(\gamma \rightarrow \infty \). In [9], a closed form expression for the asymptotic spectral moments of the adjacency matrix of \(G(\mathcal {X}_{n}, r_{n})\) is derived in the connectivity regime. Then, an analytical upper bound for the spectral radius is derived in order to study the behavior of the viral infection in an RGG. Furthermore, in [23], the author shows that in the connectivity regime, the spectral measures of the transition probability matrix of the random walk in an RGG and in a DGG with nodes in a grid converge to the same limit as \(n \rightarrow \infty \). However, the author in [23] does not study the full range of the connectivity regime and in the proof of this result, a condition is enforced on the average vertex degree \(a_{n}\) using the concept of the minimum bottleneck matching distance. The condition enforced on \(a_{n}\) in [23] implies that for \(\epsilon >0\), the result holds only for RGGs with an average vertex degree \(a_{n}\) that scales as \(\varOmega \left( \log ^{\epsilon }(n)\sqrt{n} \right) \), \(\varOmega \left( \log ^{\frac{3}{2}+\epsilon }(n)\right) \), \(\varOmega \left( \log ^{1+\epsilon }(n)\right) \) for \(d=1\), \(d=2\) and \(d\ge 3\), respectively.
To the best of our knowledge, explicit expressions for the LED or the eigenvalues of the combinatorial Laplacian and normalized Laplacian for RGGs are still not known in the full range of the scaling laws for the average vertex degree \(a_{n}\) in the connectivity regime, nor in the thermodynamic regime. Compared to [23], in this work we study the spectrum of \(\mathcal {L}(\mathcal {X}_{n})\) in the connectivity regime for a wider range of scaling laws of the average vertex degree \(a_{n}\), or equivalently a wider range of scaling laws of the radius \(r_{n}^{d}\) and under any \(\ell _{p}\)-distance, \(p \in [1,\infty ]\). More specifically, for \(d\ge 1\), we show that the LEDs of the normalized Laplacian \(\mathcal {L}(\mathcal {X}_{n})\) for RGGs and for DGGs converge to the same limit when \(a_{n}=\varOmega (\log (n) )\). Moreover, we study the spectrum of \(\hat{\mathcal {L}}(\mathcal {X}_{n})\) in the thermodynamic regime. To overcome the problem of singularities due to isolated nodes in the thermodynamic regime, we investigate the normalized Laplacian on a graph modified by adding auxiliary edges among all the nodes with a certain weight. The corresponding normalized Laplacian is known as the regularized normalized Laplacian [15].
3 Main Results
In this section we present the main results of our work for both the connectivity and thermodynamic regimes. To this end, we introduce the Hilbert–Schmidt norm, also known as week norm, and defined as
The following Theorem 1 summarizes our main result in the connectivity regime. It shows that the spectrum of the RGG normalized Laplacian matrix converges in Hilbert–Schmidt norm almost surely (a.s.) to the spectrum of the DGG normalized Laplacian matrix in the connectivity regime as \(a_n > 24 \log (n)\), i.e.,
whereas the convergence holds w.h.p. for \(a_n=c \log (n) \) and \(c\le 24.\)
Theorem 1
In the connectivity regime, i.e., for \(a_{n}=\varOmega (\log (n))\), the spectrum of the RGG normalized Laplacian matrix \(\mathcal {L}(\mathcal {X}_{n})\) converges to the spectrum of the DGG normalized Laplacian matrix \(\mathcal {L}(\mathcal {D}_{n})\) with convergence rate depending on \(a_n\) as follows : Footnote 4
Proof
See Appendix D. \(\square \)
Theorem 1 shows that when \(a_{n} \ge \log ^{1+\epsilon }(n)\) for \(\epsilon >0\), the rate of convergence of \(\mathbb {P} \left\{ \Vert \mathcal {L}(\mathcal {X}_{n})- \mathcal {L}(\mathcal {D}_{n}) \Vert _{\mathrm {HS}}^{2} > t \right\} \) is \(\mathcal {O}\left( 1/n^{\left( a_{n}/12\log (n)-1\right) }\right) \). In particular, it shows that, when \(a_{n}=c\log (n),\) for \(c> 24\), the convergence rate is \(\mathcal {O}\left( 1/n^{c/12-1}\right) \), and when \(c\le 24\), a slower convergence rate holds which scales as \(\mathcal {O}\left( 1/n\right) \). When the graph is dense, i.e., \(a_{n}=\varOmega (n)\), the convergence rate is \(\mathcal {O}\left( n e^{-n/12}\right) .\) Notice that the almost sure convergence in (3) follows from Theorem 1 and the Borel-Cantelli Lemma, e.g., [24], for \(a_n > 24 \log (n)\).
The following Theorem 2 presents our result for the thermodynamic regime. The spectrum of the regularized normalized Laplacian of an RGG can be asymptotically approximated in Hilbert–Schmidt norm by the spectrum of the regularized normalized Laplacian of a DGG. In particular, we show that the Hilbert–Schmidt norm of the error is bounded with high probability by an explicit positive constant that depends on the average vertex degree \(a_{n}=\gamma \), i.e.,
where \(t> 0\) is an explicit positive constant.
Theorem 2
In the thermodynamic regime, with \(\gamma \ge \frac{2d^{1+1/p}}{2d-1}\), \(d \ge 1\) and \(t > \frac{8\left( n+2\alpha \right) \gamma +4\alpha ^2}{n(\gamma +\alpha )^2}\), the following bound holds
where \(\vartheta =\theta ^{(d)} \left[ 1 + 2 \gamma -\frac{4 \gamma }{n}\right] \).
Proof
See Appendix C. \(\square \)
This result implies that for \(n \rightarrow \infty \) and for every \(t > \frac{8 \gamma }{(\gamma +\alpha )^2}\), the following limit holds
Informally, this entails that for any chosen positive t satisfying the above inequality, as \(n \rightarrow \infty \), the Hilbert–Schmidt norm \(\Vert \hat{\mathcal {L}}(\mathcal {X}_{n})- \hat{\mathcal {L}}(\mathcal {D}_{n}) \Vert _{\mathrm {HS}}^{2}\) is not greater than t with high probability. Therefore, in the thermodynamic regime, we can approximate \(F^{ \hat{\mathcal {L}}(\mathcal {X}_{n})}\) by \(F^{ \hat{\mathcal {L}}(\mathcal {D}_{n})}\) with an error bound of \( \frac{8}{\gamma } \) that holds with high probability when \(n \rightarrow \infty \) and \(\alpha \rightarrow 0.\) It is worth noting that this error bound, valid with high probability, is small when \(a_n=\gamma \) is large.
Based on Theorem 1, we can use the LED of \(\mathcal {L}(\mathcal {D}_n),\) the normalized Laplacian matrix for a DGG defined by the Chebyshev distance to determine the LED of the normalized Laplacian in the connectivity regime as \(n \rightarrow \infty .\) Corollary 1 provides the expression of the eigenvalues of \(\mathcal {L}(\mathcal {D}_n).\) Similarly, Theorem 2 indicates that the LED of \(\hat{\mathcal {L}}(\mathcal {D}_n)\) provides an analytical approximation of the LED of \(\hat{\mathcal {L}}(\mathcal {X}_n)\) in the thermodynamic regime as n goes to infinity. The expression of the eigenvalues of \(\hat{\mathcal {L}}(\mathcal {D}_n)\) are derived in Corollary 2.
Corollary 1
In the connectivity regime, i.e., \(a_{n} = \varOmega ( \log (n))\), for RGGs defined by the Chebyshev distance, \(a_{n} \ge \frac{2d}{2d-1}\), \(d \ge 1\), and letting \(\alpha \rightarrow 0\), the eigenvalues of \(\mathcal {L}(\mathcal {D}_{n})\) are given by
with \(m_{1},...,m_{d}\) \(\in \) \(\lbrace 0,...\mathrm {N}-1 \rbrace \). In (4), \(n=\mathrm {N}^{d}\), \(a'_{n}=(2k_{n}+1)^d-1\) and \( k_{n}= \left\lfloor \mathrm {N}r_{n} \right\rfloor \). Then, in particular, as \(n\rightarrow \infty ,\) the LED of \(\mathcal {L}(\mathcal {D}_{n})\) converges almost surely to the Dirac measure at one.
Proof
See Appendix E. \(\square \)
Corollary 2
In the thermodynamic regime, for RGGs defined by the Chebyshev distance, \(\gamma \ge \frac{2d}{2d-1}\), \(d \ge 1\), as \(n \rightarrow \infty \), the eigenvalues of \( \mathcal {L}(\mathcal {D}_{n})\) are approximated by
where \(s \in \lbrace 1,...,d\rbrace \), \(m_{s} \in \lbrace 0,...,\mathrm {N}\rbrace \) and \(f_{s}=\frac{m_{s}}{\mathrm {N}}\) in \(\mathbb {Q} \cap [0, 1]\) with \(\mathbb {Q}\) denotes the set of rational numbers. Also, \(\gamma '=(2\left\lfloor \gamma ^{1/d}\right\rfloor +1)^d-1\) and \(\delta _{f_{1},...,f_{d}}=1\) when \(f_{1},...,f_{d}=0\) otherwise \(\delta _{f_{1},...,f_{d}}=0\).
Proof
See Appendix E. \(\square \)
4 On the Relation Between the Spectra of Laplacians for RGGs and DGGs
In this section, we analyze the asymptotic behavior of the RGG normalized Laplacian matrix in the connectivity and thermodynamic regimes. In particular, we provide all intermediate results that led to our main results presented in Sect. 3. To this end, first we present a well-known methodology to reduce the asymptotic spectral analysis of RGG Laplacians to the asymptotic spectral analysis of DGG Laplacians by introducing the concept of asymptotic equivalence of sequences of matrices. Then, we derive a fundamental bound on \(\Vert \hat{\mathcal {L}}(\mathcal {X}_n) -\hat{\mathcal {L}}(\mathcal {D}_n) \Vert ^2_{\mathrm {HS}} \) to establish the asymptotic equivalence between the two matrix sequences \(\{\hat{\mathcal {L}}(\mathcal {X}_n)\}\) and \(\{ \hat{\mathcal {L}}(\mathcal {D}_n) \}. \)
In the following we introduce the definition of asymptotic equivalent sequences of matrices and its implications on their eigenvalue spectra.
Definition 1
[25] Two sequences of \(n \times n \) matrices \(\lbrace \mathbf {A}_{n} \rbrace \) and \(\lbrace \mathbf {B}_{n}\rbrace \) are said to be asymptotically equivalent if the following conditions are satisfied.
-
1.
\(\mathbf {A}_{n}\) and \(\mathbf {B}_{n}\) are uniformly bounded in strong (and hence in weak) norm:
$$\begin{aligned} \Vert \mathbf {A}_{n} \Vert , \Vert \mathbf {B}_{n} \Vert \le M <\infty , \ \ n=1, 2,... \end{aligned}$$ -
2.
\(\mathbf {A}_{n} - \mathbf {B}_{n} = \mathbf {D}_{n}\) goes to zero in weak norm as \(n \rightarrow \infty \)
$$\begin{aligned} \lim _{n \rightarrow \infty } \Vert \mathbf {A}_{n} -\mathbf {B}_{n}\Vert _{\mathrm {HS}}= \lim _{n \rightarrow \infty } \Vert \mathbf {D}_{n}\Vert _{\mathrm {HS}}=0. \end{aligned}$$
The asymptotic equivalence between two sequences of matrices implies also the convergence to zero of both the Levy distance between the distributions of the eigenvalues of the two sequences of matrices and the mean squared Euclidean distance or error between pairs of ordered eigenvalues of the two matrices as follows from the next lemma.
Lemma 1
Let \(\mathbf{A} \) and \(\mathbf{B} \) be two n \(\times \) n symmetric matrices with ordered eigenvalues \(\lambda _{1},...,\lambda _{n}\) and \(\mu _{1},...,\mu _{n}\), respectively. Then, the following inequalities hold:
-
1.
Wielandt–Hoffman Inequality [26]
$$\begin{aligned} \dfrac{1}{n}\sum _{k=1}^{n} \vert \lambda _{k}-\mu _{k} \vert ^2 \leqslant \Vert \mathbf {A} -\mathbf {B} \Vert _{\mathrm {HS}}^2, \end{aligned}$$ -
2.
Inequality on the Levy distance (Theorem A.38, [1])
$$\begin{aligned} L^{3}(F_{n}^\mathbf{A} , F_{n}^\mathbf{B} ) \leqslant \Vert \mathbf {A} -\mathbf {B} \Vert _{\mathrm {HS}}^2, \end{aligned}$$where \(L(F_{n}^\mathbf{A }, F_{n}^\mathbf{B })\) denotes the Levy distance between the empirical distribution functions \(F_{n}^\mathbf{A }\) and \(F_{n}^\mathbf{B }\) of the eigenvalues of \(\mathbf{A} \) and \(\mathbf{B} \), respectively.
We observe that both \(\mathcal {L}(\mathcal {X}_n)\) and \(\mathcal {L}(\mathcal {D}_n)\) are uniformly bounded in strong norm since the spectral radius of any normalized (and also regularized) Laplacian matrix is bounded by 2. Therefore, to prove the asymptotic equivalence of the two sequences \(\{\mathcal {L}(\mathcal {X}_n)\}\) and \(\{\mathcal {L}(\mathcal {D}_n)\},\) in the following Lemma 2, we provide an upper bound on \(\Vert \hat{\mathcal {L}}(\mathcal {X}_{n})- \hat{\mathcal {L}}(\mathcal {D}_{n}) \Vert _{\mathrm {HS}}^{2} \) which holds with high probability for any \(a_{n}\) in the connectivity and thermodynamic regimes.
Lemma 2
For \(d \geqslant 1\), the Hilbert–Schmidt norm of the difference between \( \hat{\mathcal {L}}(\mathcal {X}_{n})\) and \( \hat{\mathcal {L}}(\mathcal {D}_{n})\) is upper bounded as follows:
where \(b= n a'_{n}+\alpha ^2+2 \alpha a'_{n}\).
Proof
See Appendix A. \(\square \)
In the thermodynamic regime, for \(d \geqslant 1\), an upper bound for \( \Vert \hat{\mathcal {L}}(\mathcal {X}_{n})- \hat{\mathcal {L}}(\mathcal {D}_{n})\Vert ^2_{\mathrm {HS}}\) is obtained by setting \(a'_{n} \equiv \gamma ',\) where \(\gamma '\) is a positive constant. When the graph is connected as in the connectivity regime, it is not necessary to utilize the regularized normalized Laplacian and it is sufficient to consider the normalized Laplacian which is obtained from the regularized normalized Laplacian by setting \(\alpha =0\) in (1). Then, in the connectivity regime, the upper bound in (6) reduces to the following:
We provide an upper bound for the probability that the Hilbert–Schmidt norm of the difference between the DGG and RGG regularized normalized Laplacian matrices is greater than a certain value t, for any \(a_{n}\).
Proposition 1
For \(d \ge 1\), \(a_{n} \ge \frac{2d^{1+1/p}}{2d-1}\) and \( t > \frac{8\left( n+2\alpha \right) a_{n}+4\alpha ^2}{n(a_{n}+\alpha )^2}\), the following inequality holds:
Proof
See Appendix B. \(\square \)
In the following, we show that the RGG and DGG normalized Laplacian matrices are asymptotically equivalent with high probability in the connectivity regime, i.e., \(a_{n}=\varOmega (\log (n))\), as \(n \rightarrow \infty .\) For this purpose, it is sufficient to prove that \(\Vert \mathcal {L}(\mathcal {X}_{n})- \mathcal {L}(\mathcal {D}_{n}) \Vert _{\mathrm {HS}}^{2} \rightarrow 0 \) as \(n \rightarrow +\infty \) with high probability, since the normalized Laplacian matrices \(\mathcal {L}(\mathcal {X}_{n})\) and \(\mathcal {L}(\mathcal {D}_{n})\) are uniformly bounded in strong norm.
Proposition 2
In the connectivity regime, i.e., \(a_{n}=\varOmega (\log (n))\), for \(d \ge 1\), \(a_{n} \ge \frac{2d^{1+1/p}}{2d-1}\), \(p \in [1, \infty ]\) and \(t > \frac{8}{a_{n}}\), we have
where \(\vartheta =\theta ^{(d)} \left[ 1 + 2 a_{n}-\frac{4 a_n}{n}\right] \).
Proof
See Appendix C. \(\square \)
Observe that the right hand side (r.h.s.) in (7) tends to zero as \(n \rightarrow +\infty \) for any \(a_n\) in the connectivity regime. Then, for any fixed t, it is possible to choose \(n>n_0\) to obtain a probability
arbitrarily small. Consequently, as \(n \rightarrow \infty \), the value \(\Vert \mathcal {L}(\mathcal {X}_{n})- \mathcal {L}(\mathcal {D}_{n}) \Vert _{\mathrm {HS}}^{2}\) is not greater than a fixed \(t>0\) with high probability. Hence, the difference of the RGG and DGG normalized Laplacian matrices goes to zero in weak norm with high probability and the two matrices are asymptotically equivalent. As a result, using Proposition 2, we obtain our main result in Theorem 1.
For the thermodynamic regime, we obtain the main result in Theorem 2 by applying directly Proposition 1 and setting \(a_n=\gamma \) and \(\alpha >0\).
Since the LEDs of the normalized Laplacian for \(G(\mathcal {X}_{n}, r_{n})\) and \(G(\mathcal {D}_{n}, r_{n})\) converge to the same limit as \(n \rightarrow \infty \) for any chosen \(\ell _{p}\)-distance and the convergence holds in the full range of the connectivity regime, i.e, \(a_{n}=\varOmega (\log (n))\), then, we use the structure of the DGG to approximate the eigenvalues of the regularized normalized Laplacian matrix for \(G(\mathcal {X}_{n}, r_{n})\) in the connectivity regimes. Theorem 2 enables a similar but looser approximation in the thermodynamic regime.
In the following, we derive the eigenvalues of \(G(\mathcal {D}_{n}, r_{n}),\) when the geometric graphs are defined based on the Chebyshev distance. Let us consider a d-dimensional DGG with \(n= \mathrm {N}^d\) nodes defined by the Chebyshev distance. Then, the degree of a vertex in \(G(\mathcal {D}_{n}, r_{n})\) is given as [16]
where \( \left\lfloor x \right\rfloor \) is the integer part, i.e., the greatest integer lower than or equal to x. Note that when \(d=1\), the Chebyshev distance and the Euclidean distance coincide.
In the following Lemma 3, by using the expression of the eigenvalues of the adjacency matrix for a DGG under the Chebyshev distance [16], we approximate the eigenvalues of the regularized normalized Laplacian for \(G(\mathcal {X}_{n}, r_{n})\) when the number of nodes n is fixed and for any \(a_{n}\).
Lemma 3
When using the Chebyshev distance and \(d \ge 1\), the eigenvalues of \(\hat{\mathcal {L}}(\mathcal {D}_{n})\) are given by
with \(m_{1},...,m_{d}\) \(\in \) \(\lbrace 0,...\mathrm {N}-1 \rbrace \) and \(\delta _{m_{1},...,m_{d}}=1\) when \(m_{1},...,m_{d}=0\) otherwise \(\delta _{m_{1},...,m_{d}}=0\). In (8), \(n=\mathrm {N}^{d}\), \(a'_{n}=(2k_{n}+1)^d-1\) and \( k_{n}= \left\lfloor \mathrm {N}r_{n} \right\rfloor \).
Proof
See Appendix E. \(\square \)
Lemma 3 is utilized along with the almost sure asymptotic equivalence of the two matrix sequences \(\{\mathcal {L}(\mathcal {X}_n)\}\) and \(\{\mathcal {L}(\mathcal {D}_n)\}\) to obtain the LED of \(\{\mathcal {L}(\mathcal {X}_n)\}\) in the connectivity regime in Corollary 1. Additionally, it is exploited to determine an approximation of the LED of \(\{\hat{\mathcal {L}}(\mathcal {X}_n)\}\) with controlled quadratic error in the thermodynamic regime.
5 Numerical Results
In this section, we validate our analytical results obtained in Sect. 3 by numerical computations. More specifically, we compare the simulated and the analytical spectra of the regularized normalized Laplacian matrix of RGGs in the connectivity and thermodynamic regimes.
Figure 2(a) illustrates the empirical spectral distribution in the thermodynamic regime of a realization for an RGG with \(n=30000\) vertices, \(\alpha =0.001\) and the corresponding DGG. The theoretical distribution is obtained from Corollary 2. We notice that the gap that appears between the eigenvalue distributions of the RGG and the DGG is upper bounded as in Theorem 2.
Here, we provide an additional example in the thermodynamic regime to quantify the error between \( F^{ \hat{\mathcal {L}}(\mathcal {X}_{n})}\) and \( F^{ \hat{\mathcal {L}}(\mathcal {D}_{n})}\) for different values of \(\gamma \) when the random graphs are defined by the Chebyshev distance.
-
(1)
When \(\gamma = 100\) and \(\alpha = 10^{-3}\), \(d=1\) then as \(n \rightarrow \infty \)
$$\begin{aligned} \mathbb {P} \left\{ \Vert \hat{\mathcal {L}}(\mathcal {X}_{n})- \hat{\mathcal {L}}(\mathcal {D}_{n}) \Vert _{\mathrm {HS}}^{2} > \mathbf {0.019} \right\} \rightarrow 0. \end{aligned}$$ -
(2)
When \(\gamma = 120\) and \(\alpha = 10^{-3}\), \(d=1\) then as \(n \rightarrow \infty \)
$$\begin{aligned} \mathbb {P} \left\{ \Vert \hat{\mathcal {L}}(\mathcal {X}_{n})- \hat{\mathcal {L}}(\mathcal {D}_{n}) \Vert _{\mathrm {HS}}^{2} > \mathbf {0.015} \right\} \rightarrow 0. \end{aligned}$$
From these examples, we notice that for \(\gamma =100\), the LED of the regularized normalized Laplacian in the RGG can be approximated by the LED of a DGG with an error bound of 0.019 when \(\alpha =10^{-3}\). Then, as we increase the average vertex degree \(\gamma \) from \(\gamma =100\) to \(\gamma =120\), the error bound decreases. Therefore, the larger the average vertex degree \(\gamma \) is, the tighter the approximation becomes.
In Fig. 2(b) we compare the spectral distribution of a DGG (continuous lines) with the one for an RGG for an increasing number of nodes n (dashed line for \(n=500\) and star markers for \(n=30000\)) in the connectivity regime. We notice that the curves corresponding to the RGG and the DGG match very well when n is large confirming the asymptotic equivalence provided in Proposition 2. Additionally, it becomes apparent that by increasing n, the eigenvalue distribution converges almost surely to the Dirac measure at one. This behaviour confirms the result obtained in Corollary 1.
6 Conclusion
In this work, we studied the spectrum of RGGs in both the connectivity and thermodynamic regime. We first study the asymptotic equivalence of the sequences of the regularized normalized Laplacian matrix of a DGG and RGG and obtained an upper bound on the Hilbert–Schmidt norm of the difference between the regularized normalized Laplacian matrix of RGG and DGG. In the connectivity regime, as \(n \rightarrow \infty \), we showed that the DGG and RGG regularized normalized Laplacian matrices are asymptotically equivalent with high probability. Then, using the structure of the DGG, we found that the LED of the RGG normalized Lapalcian matrix converges to the Dirac measure at one in the full range of the connectivity regime. In the thermodynamic regime, we showed that the LED of the RGG regularized normalized Laplacian matrix can be approximated in Hilbert–Schmidt norm by the LED of the DGG regularized normalized Laplacian matrix with an error bound which is valid with high probability. Then, with the regular structure of the DGG, we provided an analytical approximation for the eigenvalues of the RGG regularized normalized Lapalcian matrix.
Data Availability
The authors declare that the data supporting the findings of this study are available within the article. Specifically, data for figures are provided with the paper.
Notes
Throughout this work, the nodes of a graph are denoted by their spatial coordinates.
The notation \(f(n) =\varOmega (g(n))\) indicates that f(n) is bounded below by g(n) asymptotically, i.e., \(\exists K>0\) and \( n_{o} \in \mathbb {N}\) such that \(\forall n > n_{0}\) \(f(n) \ge K g(n)\).
Symmetric motifs are subsets of nodes which have the same adjacencies.
The notation \(f(x)=\mathcal {O}(g(x))\) as \(x\rightarrow \infty \) indicates that there exists a positive real number K and a real number \(x_0\) such that \(|f(x)| \le K g(x)\) for all \(x \ge x_0.\) Additionally, \( f(x)=o(g(x)) \) as \(x\rightarrow \infty \) indicates that for every positive constant \(\epsilon \) there exists a constant \(x_0\) such that \(|f(x)|\le \epsilon g(x)\) for all \(x\ge x_0.\)
References
Bai, Z., Silverstein, J.W.: Spectral Analysis of Large Dimensional Random Matrices. Springer, New York (2010)
Bai, Z.D.: Methodologies in spectral analysis of large dimensional random matrices, a review. Stat. Sinica 9, 611–677 (1999)
Farkas, I.J., Derényi, I., Barabási, A.-L., Vicsek, T.: Spectra of “real-world" graphs: beyond the semicircle law. Phys. Rev. E 64(2), 026704 (2001)
Van Mieghem, P.: Graph Spectra for Complex Networks. Cambridge University Press, Cambridge (2010)
Couillet, R., Debbah, M.: Random Matrix Methods for Wireless Communications. Cambridge University Press, Cambridge (2011)
Erdos, P.: On random graphs. Publicationes mathematicae 6, 290–297 (1959)
Haas, Z. J., Deng, J., Liang, B., Papadimitratos, P., Sajama, S.: “Wireless ad hoc networks,” Encyclopedia of Telecommunications, (2002)
Yick, J., Mukherjee, B., Ghosal, D.: Wireless sensor network survey. Comput. Netw. 52(12), 2292–2330 (2008)
Preciado, V. M., Jadbabaie, A.: “Spectral analysis of virus spreading in random geometric networks,” IEEE Conference on Decision and Control, (2009)
Ganesh, A., Massoulié, L., Towsley, D.: “The effect of network topology on the spread of epidemics,” in Proc. of IEEE Conference on Computer Communications (INFOCOM), (2005)
Penrose, M.: Random Geometric Graphs. Oxford University Press, Oxford (2003)
Maier, M., Hein, M., von Luxburg, U.: Optimal construction of k-nearest-neighbor graphs for identifying noisy clusters. Theor. Comput. Sci. 410(19), 1749–1764 (2009)
Sadri, A. M., Hasan, S., Ukkusuri, S. V., Lopez, J. E. S.: “Analyzing social interaction networks from twitter for planned special events,” arXiv preprintarXiv:1704.02489, (2017)
Page, L., Brin, S., Motwani, R., Winograd, T.: “The pagerank citation ranking: Bringing order to the web.” Stanford InfoLab, Tech. Rep., (1999)
Avrachenkov, K., Ribeiro, B., Towsley, D.: “Improving random walk estimation accuracy with uniform restarts,” in International Workshop on Algorithms and Models for the Web-Graph. Springer, pp. 98–109 (2010)
Nyberg, A.: “The Laplacian spectra of random geometric graphs,” Ph.D. dissertation, University of Houston, (2014)
Nyberg, A., Gross, T., Bassler, K.E.: Mesoscopic structures and the Laplacian spectra of random geometric graphs. J. Complex Netw. 3(4), 543–551 (2015)
Dettmann, C.P., Knight, G.: Symmetric motifs in random geometric graphs. J. Complex Netw. 6(1), 95–105 (2017)
El Karoui, N.: The spectrum of kernel random matrices. Ann. Stat. 38(1), 1–50 (2010)
Jiang, T.: Distributions of eigenvalues of large Euclidean matrices generated from lp balls and spheres. Linear Algebra Appl. 473, 14–36 (2015)
Bordenave, C.: Eigenvalues of Euclidean random matrices. Random Struct. Algorithms 33(4), 515–532 (2008)
Blackwell, P., Edmondson-Jones, M., Jordan, J.: Spectra of adjacency matrices of random geometric graphs. Unpublished, (2007)
Rai, S.: The spectrum of a random geometric graph is concentrated. J. Theor. Probab. 20(2), 119–132 (2007)
Billingsley, P.: Probability and Measure. John Wiley & Sons, Hoboken (2008)
Gray, R.M.: “Toeplitz and circulant matrices: a review,’’ Foundations and Trends®. Commun. Inf. Theory 2(3), 155–239 (2006)
Hoffman, A.J., Wielandt, H.: The variation of the spectrum of a normal matrix. Duke Math. J. 20, 37–39 (1953)
Müller, T.: Two-point concentration in random geometric graphs. Combinatorica 28(5), 529 (2008)
Janson, S., Luczak, T., Rucinski, A.: Random Graphs. John Wiley & Sons, Hoboken (2011)
Acknowledgements
This research was funded by the French Government through the Investments for the Future Program with Reference: Labex UCN@Sophia-UDCBWN.
Open Access
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendices
Proof of Lemma 2
To upper bound the Hilbert–Schmidt norm of the difference between the regularized normalized Laplacian matrices \( \hat{\mathcal {L}}(\mathcal {X}_{n})\) and \(\hat{\mathcal {L}}(\mathcal {D}_{n})\), we first give the following lemma.
Lemma 4
If \(a_{i} \ge 0\) and \(b_{i} > 0\) for all i, and there exists an \(a_{i} >0\), then
By using Lemma 4, we upper bound the Hilbert–Schmidt norm of the difference between \(\hat{\mathcal {L}}(\mathcal {X}_{n})\) and \(\hat{\mathcal {L}}(\mathcal {X}_{n})\) as
We notice from the last equation that \(\sum \limits _{\begin{array}{c} j \end{array}}^{}{ \chi [x_{i}\sim x_{j}]} = \mathbf {N}(x_{i})\) corresponds to the number of neighbors of vertex \(x_{i}\) in the RGG, and similarly \(\sum \limits _{\begin{array}{c} j \end{array}}^{}{\chi [x'_{i}\sim x'_{j}]} = a'_{n}\) corresponds to the number of neighbors of vertex \(x'_{i}\) in the DGG. Let \(b= n a'_{n}+2 \alpha a'_{n}+\alpha ^2\) and \(\mathbf {N}(x_{i}, x'_{i})=\sum \limits _{\begin{array}{c} j \end{array}}^{}{\chi [x_{i}\thicksim x_{j}]\chi [x'_{i}\thicksim x'_{j}]}\). Then, by applying Lemma 4 we obtain the following upper bound.
Using the triangle inequality, the result follows.
\(\square \)
Proof of Proposition 1
In this appendix, we provide an upper bound for the probability that the Hilbert–Schmidt norm of the difference between the matrices \(\hat{\mathcal {L}}(\mathcal {X}_{n})\) and \( \hat{\mathcal {L}}(\mathcal {D}_{n})\) is greater than than \(t >\frac{8\left( n+2\alpha \right) a_{n}+4\alpha ^2}{n(a_{n}+\alpha )^2}\). To prove this result, we first present the following Lemma 5 and Lemma 6.
Lemma 5
For any chosen \(\ell _{p}\)-distance with \(p \in [1,\infty ]\), \(d\ge 1\) and \(a_{n}\ge \frac{2d^{1+1/p}}{2d-1},\) the following inequality holds
where \(a_{n}\) is the average vertex degree of \(G(\mathcal {X}_{n}, r_{n})\) and \(a'_{n}\) is the degree of each vertex in \(G(\mathcal {D}_{n}, r_{n})\).
Proof
First, we show that (9) holds under the Chebyshev distance. Let an RGG and a DGG be obtained by connecting two nodes if the Chebyshev distance between them is at most \(r_{n} >0\). Recall that the Chebyshev distance corresponds to the metric given by the \(\ell _{\infty }\)-norm. Then, the degree of a d-dimensional DGG with n nodes formed by using the Chebyshev distance is given by [16]
where \( \left\lfloor x \right\rfloor \) is the integer part, i.e., the greatest integer less than or equal to x.
For \(p=\infty \), we have
Notice that \(\lfloor a_{n}^{1/d} \rfloor \ge (a_{n}^{1/d}-1)\), then it is sufficient to show that
By taking the log in both sides of the last inequality, yields
Consequently, under the Chebyshev distance, (9) holds for \(a_{n}\ge \frac{2d}{2d-1}.\)
Next, we show that (9) holds under any \(\ell _{p}\)-distance, \(p \in [1, \infty ]\). Let \(b_{n}\) and \(b'_{n}\) be the degrees of an RGG and a DGG formed by connecting each two nodes when \(d^{1/p} \Vert x_{i}-x_{j} \Vert _{\infty }\) \(\le r_{n}\). This simply means that the graphs are obtained using the Chebyshev distance with a radius equal to \(\dfrac{r_{n}}{d^{1/p}}\). Then, the degree of the DGG can be written as
When \(p=\infty \), we have that (9) holds. Therefore, we deduce that for \(b_{n} \ge \dfrac{2d}{2d-1},\) we get
Note that for any \(\ell _{p}\)-distance with \(p \in [1, \infty )\) in \(\mathbb {R}^{d}\), we have
Then the number of nodes \(a'_{n}\) in the DGG that falls in the ball of radius \(r_{n}\) is greater or equal than \(b'_{n}\), i.e., \(b'_{n} \le a'_{n}\). Therefore,
Hence, for \(a_{n}\ge \dfrac{2d^{1+1/p}}{2d-1}\), we get
\(\square \)
Lemma 6
For any chosen \(\ell _{p}\)-distance with \(p \in [1,\infty ]\), \(d\ge 1\), \(a_{n}\ge \frac{2d^{1+1/p}}{2d-1}\), \(a'_{n} > \dfrac{a_n}{1 d^{1+\frac{1}{p}}}\), we have
Proof
For \(\alpha \) sufficiently small, we let \(c = \dfrac{\alpha ^2}{a_n}\simeq \dfrac{\alpha ^2}{a'_n} \), then we have,
Therefore, if \(\dfrac{a_n}{2 d^{1+\frac{1}{p}}}< a_n< a'_n\), then
If \(a_n > a'_n\), and \(a'_n > \dfrac{a_n}{2 d ^{1+\frac{1}{p}}}\), then
\(\square \)
Now, we prove the main result presented in the proposition.
Define,
and
In the following, we upper bound \(\mathrm {P}\left\{ A >\dfrac{t}{2}\right\} \) and \(\mathrm {P}\left\{ B >\dfrac{t}{2}\right\} \).
First, we write \(\mathrm {P}\left\{ A >\dfrac{t}{2}\right\} =\)
Note that \(\sum \limits _{\begin{array}{c} i \end{array}}^{} \mathbf {N}(x_{i}, x'_{i}) \le n a'_{n}\) and \(\mathbf {N}(x_{i})\le n\). Then, in step (a) for n sufficiently large, we remove the absolute value because
Notice from the last equality that, \(\dfrac{t n(a'_{n}+\alpha )^2}{4}> b \Leftrightarrow t > \dfrac{4n a'_{n}+4\alpha ^2+8\alpha a'_{n}}{n(a'_{n}+\alpha )^2}.\) Then,
We continue further by bounding \(\mathrm {P}\left\{ B>\dfrac{t}{2}\right\} = \)
Let
and
Now, we upper bound the two probabilities \(\mathrm {P}\left\{ B_{1} >\dfrac{t}{4}\right\} \) and \(\mathrm {P}\left\{ B_{2} >\dfrac{t}{4}\right\} \).
First, for \(t > \frac{4n a'_{n}+4\alpha ^2+8\alpha a'_{n}}{n(a'_{n}+\alpha )^2}\), we write \(\mathrm {P}\left\{ B_{1} >\dfrac{t}{4}\right\} =\)
Step (a) follows from \(\sum \limits _{\begin{array}{c} i \end{array}}^{} \sum \limits _{\begin{array}{c} j \end{array}}^{} \left( \chi [x_{i}\thicksim x_{j}]+\frac{\alpha }{n}\right) ^{2}= \left( 1+\frac{2\alpha }{n}\right) \sum \limits _{\begin{array}{c} i \end{array}}^{} \mathbf {N}(x_{i})+\alpha ^2\) and substituting the value of b.
We continue further by upper bounding the term \(\mathrm {P}\left\{ B_{2}>\dfrac{t}{4}\right\} \)=
Define
In the following, we upper bound \( \mathrm {P}\left\{ C_{1} > \dfrac{\mathrm{tn}}{4} \right\} \) and \( \mathrm {P}\left\{ C_{2} > \dfrac{\mathrm{tn}}{4} \right\} .\)
We start first with \( \mathrm {P}\left\{ C_{1} > \dfrac{\mathrm{tn}}{4} \right\} = \)
Step (a) follows from Lemma 4 and step (b) from \( \left( \frac{1}{\alpha }+\frac{2}{n}\right) \sum \limits _{\begin{array}{c} i \end{array}}^{}{}\mathbf {N}(x_{i})+\alpha > \frac{1}{n} \sum \limits _{\begin{array}{c} i \end{array}}^{}{}\mathbf {N}(x_{i})+\alpha .\)
Let
and
Step (a) follows from the inequality \(\left( n+2\alpha \right) na_{n}>\dfrac{\alpha (a_{n}+\alpha )^2}{\sum \limits _{\begin{array}{c} j \end{array}}^{}{\mathbf {N}(x_{j})+n\alpha }}\). Notice that \(8\left( n+2\alpha \right) na_{n}+4n\alpha ^2 < tn^2(a_{n}+\alpha )^2 \Leftrightarrow t > \dfrac{8\left( n+2\alpha \right) a_{n}+4\alpha ^2}{n(a_{n}+\alpha )^2}.\) Then,
Finally, we upper bound the remaining probability
Step (a) follows from applying Cauchy-Schwarz inequality. Then, \( \mathrm {P}\left\{ C_{2}>\dfrac{\mathrm{tn}}{4}\right\} \le \)
Finally, \( \mathrm {P}\left\{ C_{1} > \frac{\mathrm{tn}}{4} \right\} \) is upper bounded by the sum of (12) and (13). We Use this result combined with the upper bound of \( \mathrm {P}\left\{ C_{2} > \frac{\mathrm{tn}}{4} \right\} \) given in (15) to upper bound the term \( \mathrm {P}\left\{ B_{2} > \frac{t}{4} \right\} \). Then, apply the new upper bound with (10) and (11) to upper bound (1) and with Lemma 6, we get Proposition 1 follows. \(\square \)
Proof of Theorem 2 and Proposition 2
In this Appendix, we show that the LED of the regularized normalized Laplacian for a DGG is a good approximation for the LED of the regularized normalized Laplacian for an RGG in both the connectivity and thermodynamic regimes.
To upper bound the terms obtained in Proposition 1, we use the Chebyshev inequality. Notice that \( \sum \limits _{\begin{array}{c} i \end{array}}^{} \mathbf {N}(x_{i}) / 2\) that appears in Proposition 1 counts the number of edges in \(G(\mathcal {X}_{n}, r_{n})\). For convenience, we denote \( \sum \limits _{\begin{array}{c} i \end{array}}^{} \mathbf {N}(x_{i}) / 2\) as \(\xi _{n}\). In order to apply the Chebyshev inequality, we determine the variance of the number of edges, i.e.,\(\mathrm {Var}(\xi _{n})\) in the following lemma.
Lemma 7
(Variance of \(\xi _{n}\)) When \(x_{1},...,x_{n}\) are i.i.d. uniformly distributed in the d-dimensional unit torus \(\mathbb {T}^{d}=[0, 1]\)
Proof
The proof follows along the same lines of Proposition A.1 in [27] when extended to a unit torus and applied to i.i.d. and uniformly distributed nodes. \(\square \)
Let \(\vartheta = \theta ^{(d)} \left[ 1 + 2 a_{n}-\frac{4 a_n}{n}\right] \). In the following, we upper bound the probabilities given in Proposition 1 using Lemma 7 and the Chebyshev inequality. We start by upper bounding the first term as follows:
and
Finally, we upper bound the last term as
Then,
Corollary 2 for the thermodynamic regime is obtained from upper bounding the terms in the r.h.s of (1) in Proposition 1 obtained in (16), (17) and (18). In addition, by letting \(\alpha \rightarrow 0\), the provided upper bounds hold in the connectivity case.
In the following, we propose a tighter upper bound for the Levy distance between \(F^{ \mathcal {L}(\mathcal {D}_{n})}\) and \(F^{ \mathcal {L}(\mathcal {X}_{n})}\) in the connectivity regime. More precisely, the following upper bounds are tighter when the average vertex degree scales as \(\varOmega (\log ^{1+\epsilon }(n))\) or for \(a_{n}=c \log (n)\) when \(c>24\).
First, observe that the number of nodes that fall in an arbitrary interval of radius \(r_{n}\) follows a binomial distribution. Then in order to derive the distribution of \(\mathbf {N}(x_{i}),\) we need to derive the distribution of the nodes that fall in a ball centered in \(x_{i}\). To derive the distribution of \(\mathbf {N}(x_{i})\) in the connectivity regime, we throw at random a node which will be in a random position and we are left with \(n-1\) nodes. Then, we take a ball of size \(2r_{n}\), centered around the thrown node. If we throw randomly the remaining \(n-1\) nodes, then, \(\mathbf {N}(x_{i})\) will be a random variable binomially distributed with parameters \((n-1, \theta ^{(d)} r_{n})\), i.e.,
To upper bound the terms in the r.h.s of (1) given in Proposition 1, we introduce upper bounds for a binomially distributed random variable X appropriate for large deviations. These results play a key role to establish the relation between \(F^{\mathcal {L}(\mathcal {X}_{n})}\) and \(F^{\mathcal {L}(\mathcal {D}_{n})}\).
Lemma 8
( [28], Theorem 2.1, page 26) Let \(X \sim Bin(n, p)\), \(\mathbb {E}X=np\), then
Lemma 9
( [28], corollary 2.3, page 27) Let \(X \sim Bin(n, p)\), \(\mathbb {E}X=np\) and \(0<t \le 3/2\), then
We upper bound the first term in the r.h.s of (1) in Proposition 1 by using Lemmas 8 and 9 as follows. For \(t > \dfrac{8}{a_{n}}\), we have
Step (a) follows from \(\left| \sum \limits _{\begin{array}{c} i \end{array}}^{} z_{i} \right| \le \sum \limits _{\begin{array}{c} i \end{array}}^{} \left| z_{i}\right| \) and step (b) from \(\sum \limits _{\begin{array}{c} i \end{array}}^{} \left| \mathbf {N}(x_{i})- a_{n} \right| \le n \left| \mathbf {N}(x_{i})- a_{n} \right| \).
Now, instead of upper bounding the two last probabilities given in Proposition 1, we go back to Appendix B and upper bound (14) by letting \(\alpha \rightarrow 0\).
Then, applying Lemma 8 and 9, we have for \(t > \dfrac{8}{a_{n}}\), \(\mathrm {P}\left\{ C_{2}>\dfrac{tn}{4}\right\} \le \)
Finally, taking the upper bounds found by using the Chebyshev inequality in (16), (17) and (18) combined with the upper bounds found by using Lemmas 8 and 9, i.e., (19), (20) all together, then by applying Lemma 5 and letting \(\alpha \rightarrow 0\), Proposition 2 follows. \(\square \)
Proof of Theorem 1
Since Proposition 2 holds under the conditions that \(\vartheta < \theta ^{(d)} a_{n} \left( \frac{1}{a_{n}}+2 \right) \) and \(t>\frac{8d^{1+1/p}}{a_{n}},\) we apply these conditions on (7). Then,
and
Consequently,
When \(a_{n}=c\log {(n)}\) and \(c>24\), then \(B <A\) and the rate of convergence is \(\mathcal {O}\left( 1/n^{c/12-1}\right) \), whereas, when \(c\le 24,\) the rate of convergence is \(\mathcal {O}\left( 1/n\right) \). For \(\epsilon >0\) and \(a_{n} \ge \log ^{1+\epsilon }{(n)}\), the rate of convergence is \(\mathcal {O}\left( 1/n^{(a_{n}/12 \log {(n)})-1} \right) \). When the graph is dense, i.e., \(a_{n}\) scales as \(\varOmega (n)\), the LED of the normalized Laplacian matrix of the RGG converges to the Dirac measure at one with rate of convergence \(\mathcal {O}\left( n e^{-n/12}\right) \). Hence, from the result in Proposition 2 follows that the LEDs of the normalized Laplacian matrix for \(G(\mathcal {X}_{n}, r_{n})\) and \(G(\mathcal {D}_{n}, r_{n})\) converge to the same limit as \(n \rightarrow \infty \) under any chosen \(\ell _{p}\)-distance in the definition of the graphs, and the convergence holds in the full range of the connectivity regime, i.e, \(a_{n}=\varOmega (\log (n))\).
\(\square \)
Proof of Lemma 3
In this appendix, we provide the eigenvalues of the regularized normalized Laplacian matrix for a DGG using the Chebyshev distance. Then, the degree of a vertex in \(G(\mathcal {D}_{n}, r_{n})\) is given as [16]
where \( \left\lfloor x \right\rfloor \) is the integer part, i.e., the greatest integer less than or equal to x. The regularized normalized Laplacian can be written as
where \(\mathbf {I}\) is the identity matrix, \(\mathbf {1}^\top =[1,...,1]^\top \) is the vector of all ones and \(\mathbf {A}\) is the adjacency matrix defined as
When \(d=1\), the adjacency matrix \(\mathbf {A}\) of a DGG in \(\mathbb {T}^1\) with n nodes is a circulant matrix. A well known result appearing in [25], states that the eigenvalues of a circulant matrix are given by the (DFT) of the first row of the matrix. When \(d>1\), the adjacency matrix of a DGG is no longer circulant but it is block circulant with \(\mathrm {N}^{d-1}\times \mathrm {N}^{d-1}\) circulant blocks, each of size \(\mathrm {N} \times \mathrm {N}\). The author in [16], pages 85–87, utilizes the result in [25], and shows that the eigenvalues of the adjacency matrix in \(\mathbb {T}^{d}\) are found by taking the d-dimensional DFT of an \(\mathrm {N}^ {d}\) tensor of rank d obtained from the first block row of (A)
where m and h are vectors of elements \(m_{i}\) and \(h_{i}\), respectively, with \(m_{1},...,m_{d} \in \lbrace 0, 1,...,\mathrm {N}-1 \rbrace \) and \(c_{h_{1},...,h_{d}}\) defined as [16]
The eigenvalues of the block circulant matrix \(\mathbf {A}\) follow the spectral decomposition [16], page 86,
where \({\varvec{\Lambda }}\) is a diagonal matrix whose entries are the eigenvalues of \(\mathbf {A}\), and \(\mathbf {F}\) is the d-dimensional DFT matrix. It is well known that when \(d=1\), the DFT of an \(n\times n\) matrix is the matrix of the same size with entries
When \(d>1\), the block circulant matrix \(\mathbf {A}\) is diagonalized by the d-dimensional DFT matrix \(\mathbf {F}= \mathbf {F}_{\mathrm {N}_{1}} \bigotimes ... \bigotimes \mathbf {F}_{\mathrm {N}_{d}}\), i.e., tensor product, where \(\mathbf {F}_{\mathrm {N}_{d}}\) is the \(\mathrm {N}_{d}\)-point DFT matrix.
Notice that all the matrices in (A) have a common eigenvector that is \(\left( \frac{1}{\sqrt{n}},...,\frac{1}{\sqrt{n}}\right) \) and this eigenvector coincides with the first row and column of \(\mathbf {F}\). Then, \(\left( \frac{1}{\sqrt{n}},...,\frac{1}{\sqrt{n}}\right) \) is also an eigenvector of \(\hat{\mathcal {L}}(\mathcal {D}_{n})\).
The regularized normalized Laplacian can be expressed as
where \(\mathbf {e}_{1}=(1,0...0)\) and \({\varvec{\Lambda }}_{1}= \left( \mathbf {I}-\frac{1}{(a'_{n}+\alpha )}{\varvec{\Lambda }}- \frac{n \alpha }{n(a'_{n}+\alpha )} \mathbf {e}^\top _{1}\mathbf {e}_{1} \right) \) is a diagonal matrix whose diagonal elements are the eigenvalues of \(\hat{\mathcal {L}}(\mathcal {D}_{n})\). Then, from (23), the derivation of the eignevalues of \(\hat{\mathcal {L}}(\mathcal {D}_{n})\) reduces to the derivation of the eigenvalues of the normalized adjacency matrix \(\mathbf {A'}\).
The normalized adjacency matrix \(\mathbf {A'}\) is defined as
By using (21) and (22), the eigenvalues of \(\mathbf {A'}\) for a DGG in \(\mathbb {T}^d\) are given as
Then, we conclude that the eigenvalues of \(\hat{\mathcal {L}}(\mathcal {D}_{n})\) for n finite are given by
with \(m_{1},...,m_{d}\) \(\in \) \(\lbrace 0,...\mathrm {N}-1 \rbrace \) and \(\delta _{m_{1},...,m_{d}}=1\) when \(m_{1},...,m_{d}=0\) otherwise \(\delta _{m_{1},...,m_{d}}=0\).
In particular, as \(\alpha \rightarrow 0\), the eigenvalues of \(\mathcal {L}(\mathcal {D}_{n})\) in the connectivity regime are given by
with \(m_{1},...,m_{d} \in \lbrace 0,..., \mathrm {N}-1\rbrace \).
In the thermodynamic regime, for \(s \in \lbrace 0,...,d\rbrace \) we let \(f_{s}=\frac{m_{s}}{\mathrm {N}}\) then as \(n\rightarrow \infty \), \(f_{s}\) \(\in \) \(\mathbb {Q} \cap [0, 1]\) where \(\mathbb {Q}\) denotes the set of rational numbers. Therefore, for \(\gamma \ge 1,\) the eigenvalues of \(\hat{\mathcal {L}}(\mathcal {X}_{n})\) can be approximated by the eigenvalues of \( \hat{\mathcal {L}}(\mathcal {D}_{n})\) given as
where \(\gamma '=(2\left\lfloor \gamma ^{1/d}\right\rfloor +1)^d-1\) and \(\delta _{f_{1},...,f_{d}}=1\) when \(f_{1},...,f_{d}=0\), otherwise \(\delta _{f_{1},...,f_{d}}=0\).
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Hamidouche, M., Cottatellucci, L. & Avrachenkov, K. On the Normalized Laplacian Spectra of Random Geometric Graphs. J Theor Probab 36, 46–77 (2023). https://doi.org/10.1007/s10959-022-01158-0
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10959-022-01158-0
Keywords
- Random geometric graph
- Normalized Laplacian
- Limiting eigenvalue distribution
- Connectivity regime
- Thermodynamic regime