
1 Introduction

The Fréchet mean graph has become a standard tool for the analysis of graph-valued data (e.g., [5, 6, 8, 10, 13, 14]). In this work, we derive the expression for the population Fréchet mean of inhomogeneous Erdős-Rényi random graphs [2]. We prove that the sample Fréchet mean is consistent and can be estimated using a simple thresholding rule. This novel theoretical result implies that the sample Fréchet mean computed from a training set of graphs that display specific topological features of interest will, in general, not inherit the desired topological structure from the training set.

We consider the set \(\mathcal {G}\) formed by all undirected unweighted simple labeled graphs with vertex set \(\left\{ 1, \ldots ,n\right\} \). We denote by \(\mathcal {S}\) the set of \(n \times n\) adjacency matrices of graphs in \(\mathcal {G}\),

$$\begin{aligned} \mathcal {S}= \left\{ \boldsymbol{A}\in \{0,1\}^{n \times n}; \; a_{ij} = a_{ji} \text { and } a_{ii} = 0; \; 1 \le i < j \le n \right\} . \end{aligned}$$
(1)

We denote by \(\mathcal {G}\big (n,\boldsymbol{P}\big )\) the probability space formed by the inhomogeneous Erdős-Rényi random graphs [2], defined on \(\{1,\ldots ,n \}\), where a graph G with adjacency matrix \(\boldsymbol{A}\) has probability,

$$\begin{aligned} \mathbb {P}\left( \boldsymbol{A} \right) = \prod _{1 \le i < j \le n} \left[ p_{ij} \right] ^{a_{ij}} \left[ 1- p_{ij} \right] ^{1 - a_{ij}}. \end{aligned}$$
(2)

The \(n \times n\) matrix \(\boldsymbol{P}= \left[ p_{ij}\right] \) determines the edge probabilities \(0 \le p_{ij} \le 1\), with \(p_{ii} = 0\). We identify \(\mathcal {G}\big (n,\boldsymbol{P}\big )\) with the probability space \((\mathcal {S}, \mathbb {P})\), where \(\mathbb {P}\) is defined by (2). The prominence of \(\mathcal {G}\big (n,\boldsymbol{P}\big )\) stems from its ability to provide tractable models of random graphs that can capture many of the structures of real networks (e.g., stochastic block models, which have great practical importance). We equip \(\mathcal {G}\) with the Hamming distance defined as follows.
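
For concreteness, sampling from the model (2) reduces to \(n(n-1)/2\) independent Bernoulli draws. The following sketch (in Python with NumPy; the function name is ours) draws one adjacency matrix from \(\mathcal {G}\big (n,\boldsymbol{P}\big )\).

```python
import numpy as np

def sample_adjacency(P, rng=None):
    """Draw one adjacency matrix from G(n, P): each pair (i, j) with i < j
    is connected independently with probability P[i, j]."""
    rng = np.random.default_rng(rng)
    n = P.shape[0]
    U = rng.random((n, n))                  # i.i.d. Uniform(0, 1) entries
    A = np.triu((U < P).astype(int), k=1)   # sample the strict upper triangle
    return A + A.T                          # symmetrize; the diagonal stays 0
```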

Definition 1

The Hamming distance between G and \(G^\prime \) in \(\mathcal {G}\), with adjacency matrices \(\boldsymbol{A}\) and \(\boldsymbol{A}^\prime \) respectively, is given by

$$\begin{aligned} d_H(G,G^\prime ) = \sum _{1 \le i< j \le n} |a_{ij} - a^\prime _{ij}|. \end{aligned}$$
(3)
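
In code, the distance (3) is a sum over the strict upper triangle; a minimal sketch, reusing the NumPy conventions above:

```python
def hamming_distance(A, B):
    """Hamming distance (3): the number of unordered pairs (i, j), i < j,
    whose edge indicators differ between the two graphs."""
    iu = np.triu_indices(A.shape[0], k=1)   # indices of the strict upper triangle
    return int(np.abs(A - B)[iu].sum())
```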

We characterize the mean of the probability measure \(\mathbb {P}\) with the Fréchet mean graph [7].

Definition 2

The Fréchet mean of the probability measure \(\mathbb {P}\) is the set formed by the solutions to

$$\begin{aligned} \boldsymbol{\mu } \big [ \mathbb {P} \big ] = \underset{G\in \mathcal {G}}{\mathrm {argmin}} \sum _{G^\prime \in \mathcal {G}} d_H^2(G,G^\prime ) \mathbb {P}\left( G^\prime \right) \end{aligned}$$
(4)

where \(d_H\) is the Hamming distance (3).

By replacing \(\mathbb {P}\) with the empirical measure, the concept of Fréchet mean graph can be extended to a sample of graphs defined on the same vertex set \(\left\{ 1, \ldots ,n\right\} \).

Definition 3

Let \(\left\{ G^{(k)}\right\} _{1\le k \le N}\) be independent random graphs, sampled from \(\mathbb {P}\). The sample Fréchet mean is the set composed of the solutions to

$$\begin{aligned} \widehat{\boldsymbol{\mu }}_N \big [\mathbb {P} \big ] = \underset{G\in \mathcal {G}}{\mathrm {argmin}} \frac{1}{N}\sum _{k=1}^N d_H^2(G,G^{(k)}). \end{aligned}$$
(5)

We note that solutions to the minimization problems (4) and (5) always exist, but need not be unique. Because all the results in this paper hold for any graph in the set of solutions to (4) or (5), we assume, without loss of generality, that \(\boldsymbol{\mu } \big [ \mathbb {P} \big ]\) and \(\widehat{\boldsymbol{\mu }}_N \big [\mathbb {P} \big ]\) each contain a single element.
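
To make Definition 3 concrete, the minimization (5) can be carried out by exhaustive search when n is tiny; the sketch below (our own illustration, not a practical algorithm; as noted in Sect. 2.2, this computation is NP-hard in general) enumerates all graphs on [n].

```python
from itertools import product

def sample_frechet_mean_brute(As):
    """Exact sample Frechet mean (5) by exhaustive search over all
    2^(n(n-1)/2) graphs; exponential cost, so only usable for tiny n."""
    n = As.shape[1]
    iu = np.triu_indices(n, k=1)
    samples = As[:, iu[0], iu[1]]            # N x (n choose 2) edge indicators
    best, best_val = None, np.inf
    for bits in product([0, 1], repeat=len(iu[0])):
        b = np.asarray(bits)
        # Frechet function (5): average of the squared Hamming distances
        val = np.mean(np.abs(samples - b).sum(axis=1) ** 2)
        if val < best_val:
            best, best_val = b, val
    B = np.zeros((n, n), dtype=int)
    B[iu] = best
    return B + B.T
```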

This notion of centrality is well adapted to metric spaces (e.g., [4, 10, 13]). The vital role played by the Fréchet mean as a location parameter is exemplified in the works of [1, 14], who have created novel families of random graphs by generating random perturbations around a given Fréchet mean.

Because the focus of this work is not the computation of the Fréchet mean graph, but rather a theoretical analysis of the properties that the Fréchet mean graph inherits from the probability measure \(\mathbb {P}\) defined in (2), we can assume that all the graphs are defined on the same vertex set.

1.1 Our Main Contributions

The prominence of the inhomogeneous Erdős-Rényi random graph model [2] prompts the following critical question: does the Fréchet mean of \(\mathbb {P}\) inherit from the probability space \(\mathcal {G}\big (n,\boldsymbol{P}\big )\) any of the edge connectivity information encoded by \(\boldsymbol{P}\)?

In this paper, we answer this question. We show in Theorem 1 that the population Fréchet mean graph \(\boldsymbol{\mu } \big [ \mathbb {P} \big ]\) can be obtained by thresholding the mean adjacency matrix \(\mathbb {E} \left[ \boldsymbol{A} \right] =\boldsymbol{P}\); an edge exists between the vertices i and j in \(\boldsymbol{\mu } \big [ \boldsymbol{A} \big ]\) if and only if \(\mathbb {E} \left[ \boldsymbol{A} \right] _{ij} > 1/2\). We prove in Theorem 2 that this result also holds for the sample Fréchet mean graph, \(\widehat{\boldsymbol{\mu }}_N \big [\boldsymbol{A} \big ]\), when \(\mathbb {E} \left[ \boldsymbol{A} \right] \) is replaced with the sample mean adjacency matrix, \(\widehat{\mathbb {E}}_N\left[ \boldsymbol{A} \right] \).

2 Main Results

Let \(\boldsymbol{P}= \left[ p_{ij}\right] \) be an \(n \times n\) symmetric matrix with entries \(0 \le p_{ij} \le 1\). In the following two theorems, we determine the population and the sample Fréchet mean graphs for \(\mathcal {G}\big (n,\boldsymbol{P}\big )\). Throughout, we denote by [n] the set \(\{1,\ldots ,n\}\).

2.1 The Population Fréchet Mean Graph of \(\mathcal {G}(n,\mathbf {P})\)

Theorem 1

The Fréchet mean graph \(\boldsymbol{\mu } \big [ \mathbb {P} \big ]\) of the probability measure (2), is given by

$$\begin{aligned} \forall i,j \in [n], \quad \boldsymbol{\mu } \big [ \mathbb {P} \big ]_{ij} = {\left\{ \begin{array}{ll} 1 &{} \text {if} \quad \mathbb {E} \left[ \boldsymbol{A} \right] _{ij} =p_{ij}> 1/2,\\ 0 &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$
(6)

Proof

The proof is given in Sect. 3.2.
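
In code, Theorem 1 amounts to a single thresholding operation; a minimal sketch, following (6) (function name ours):

```python
def population_frechet_mean(P):
    """Adjacency matrix of the population Frechet mean graph (Theorem 1,
    Eq. (6)): threshold the expected adjacency matrix E[A] = P at 1/2."""
    M = (P > 0.5).astype(int)
    np.fill_diagonal(M, 0)   # no self-loops
    return M
```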

2.2 The Sample Fréchet Mean Graph of a Graph Sample in \(\mathcal {G}(n,\mathbf {P})\)

We now turn our attention to the sample Fréchet mean graph, which has recently been used for the statistical analysis of graph-valued data (e.g., [5, 8, 14, 17]). The computation of the sample Fréchet mean graph using the Hamming distance is NP-hard [3]. For this reason, several alternatives have been proposed (e.g., [6, 8]).

Before presenting the second result, we take a short detour through the sample Fréchet median of the probability measure \(\mathbb {P}\) [9, 11, 16], the minimizer of

$$\begin{aligned} \widehat{\boldsymbol{m}}_N\left[ \mathbb {P} \right] = \underset{G\in \mathcal {G}}{\mathrm {argmin}} \frac{1}{N}\sum _{k=1}^N d_H(G,G^{(k)}), \end{aligned}$$
(7)

and which can be computed using the majority rule [1].

Lemma 1

The adjacency matrix \(\widehat{\boldsymbol{m}}_N\left[ \boldsymbol{A} \right] \) of \(\widehat{\boldsymbol{m}}_N\left[ \mathbb {P} \right] \) is given by

$$\begin{aligned} \forall i,j \in [n],\quad \widehat{\boldsymbol{m}}_N\left[ \boldsymbol{A} \right] _{ij} = {\left\{ \begin{array}{ll} 1 &{} \text {if} \; \sum \nolimits _{k=1}^N a^{(k)}_{ij}\ge N/2,\\ 0 &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$
(8)
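
The majority rule (8) is equally simple to implement; a minimal sketch (function name ours):

```python
def sample_frechet_median(As):
    """Majority rule of Lemma 1, Eq. (8): keep the edge (i, j) when at
    least half of the N sampled graphs contain it."""
    M = (np.mean(As, axis=0) >= 0.5).astype(int)
    np.fill_diagonal(M, 0)
    return M
```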

We now come back to the second main contribution: we prove that, with high probability, the sample Fréchet mean graph of N independent random graphs sampled from \(\mathcal {G}\big (n,\boldsymbol{P}\big )\) coincides with the sample Fréchet median graph for all sufficiently large sample sizes.

Theorem 2

\(\forall \delta \in (0,1), \exists N_\delta , \forall N \ge N_\delta \), \(\widehat{\boldsymbol{m}}_N\left[ \boldsymbol{A} \right] \) and \(\widehat{\boldsymbol{\mu }}_N \big [\boldsymbol{A} \big ]\) are given by

$$\begin{aligned} \forall i,j \in \left[ n\right] , \quad \widehat{\boldsymbol{\mu }}_N \big [\boldsymbol{A} \big ]_{ij} = \widehat{\boldsymbol{m}}_N\left[ \boldsymbol{A} \right] _{ij} = {\left\{ \begin{array}{ll} 1 &{} \text {if} \quad \mathbb {E} \left[ \boldsymbol{A} \right] _{ij} =p_{ij}> 1/2,\\ 0 &{} \text {otherwise,} \end{array}\right. } \end{aligned}$$
(9)

with probability \( 1- \delta \) over the realizations of the graphs, \(\left\{ G^{(1)}, \ldots , G^{(N)}\right\} \) in \(\mathcal {G}\big (n,\boldsymbol{P}\big )\).

Proof

The proof is given in Sect. 3.5.

The practical impact of Theorem 2 is given by the following corollary, which is an elementary consequence of Theorem 2 and Lemma 1.

Corollary 1

\(\forall \delta \in (0,1), \exists N_\delta , \forall N\ge N_\delta \), \(\widehat{\boldsymbol{\mu }}_N \big [\boldsymbol{A} \big ]\) is given by the majority rule,

$$\begin{aligned} \forall i,j \in \left[ n\right] , \quad \widehat{\boldsymbol{\mu }}_N \big [\boldsymbol{A} \big ]_{ij} = {\left\{ \begin{array}{ll} 1 &{} \text {if} \quad \sum \nolimits _{k=1}^N a^{(k)}_{ij}> N/2,\\ 0 &{} \text {otherwise,} \end{array}\right. } \end{aligned}$$
(10)

with probability \( 1- \delta \) over the realizations of the graphs, \(\left\{ G^{(1)}, \ldots , G^{(N)}\right\} \), in \(\mathcal {G}\big (n,\boldsymbol{P}\big )\).
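
In code, (10) differs from the median rule (8) only in the strict inequality; a sketch (function name ours), which replaces the NP-hard minimization of (5) for large N:

```python
def sample_frechet_mean_majority(As):
    """Majority rule of Corollary 1, Eq. (10); identical to Eq. (8) except
    for the strict inequality, and valid with high probability for large N."""
    M = (np.mean(As, axis=0) > 0.5).astype(int)
    np.fill_diagonal(M, 0)
    return M
```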

3 Proofs of the Main Results

We give in the following the proofs of Theorems 1 and 2. In the process, we prove several technical lemmata.

3.1 The Population and Sample Fréchet Functions

Let \(\boldsymbol{A}\) and \(\boldsymbol{B}\) be two adjacency matrices in \(\mathcal {S}\). We provide below an expression for the squared Hamming distance, \(d^2_H(\boldsymbol{A}, \boldsymbol{B})\), where the computation is split between the entries of \(\boldsymbol{A}\) along the edges of \(\boldsymbol{B}\), \(\mathcal {E}(\boldsymbol{B})\), and the entries of \(\boldsymbol{A}\) along the “nonedges” of \(\boldsymbol{B}\), \(\overline{\mathcal {E}}(\boldsymbol{B})\). We denote by \(\left|\mathcal {E}(\boldsymbol{B})\right|\) the number of edges of \(\boldsymbol{B}\).

Lemma 2

Let \(\boldsymbol{A}\) and \(\boldsymbol{B}\) be two matrices in \(\mathcal {S}\). Then,

$$\begin{aligned} d^2_H(\boldsymbol{A},\boldsymbol{B}) =&\bigg [\sum _{1 \le i < j \le n}a_{ij}\bigg ]^2 + \left|\mathcal {E}(\boldsymbol{B})\right|^2 + 2 \left|\mathcal {E}(\boldsymbol{B})\right|\bigg [ \sum _{(i,j) \in \overline{\mathcal {E}}(\boldsymbol{B})}a_{ij}- \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})} a_{ij}\bigg ] \nonumber \\&- 4 \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})}\sum _{(i^\prime ,j^\prime ) \in \overline{\mathcal {E}}(\boldsymbol{B})} a_{ij}a_{i^\prime j^\prime } \end{aligned}$$
(11)
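
The identity (11) is easy to verify numerically; in the sketch below (ours), the double sum over \(\mathcal {E}(\boldsymbol{B}) \times \overline{\mathcal {E}}(\boldsymbol{B})\) factorizes into a product of two sums because the summand is the product \(a_{ij}a_{i^\prime j^\prime }\):

```python
def squared_hamming_lemma2(A, B):
    """d_H^2(A, B) via the decomposition (11), which splits the entries
    of A over the edges and the nonedges of B."""
    iu = np.triu_indices(A.shape[0], k=1)
    a, b = A[iu], B[iu]
    e = int(b.sum())                             # |E(B)|
    on, off = a[b == 1].sum(), a[b == 0].sum()   # entries of A on E(B) and on its complement
    return int(a.sum()**2 + e**2 + 2*e*(off - on) - 4*on*off)

# Sanity check against Definition 1, for any A, B in S:
#   assert squared_hamming_lemma2(A, B) == hamming_distance(A, B) ** 2
```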

We now define the (population) Fréchet function associated with (4).

Definition 4

We denote by \(F_2\) the Fréchet function associated with the Fréchet mean

$$\begin{aligned} F_2 (\boldsymbol{B}) = \sum _{\boldsymbol{A}\in \mathcal {S}} d^2_H(\boldsymbol{A}, \boldsymbol{B}) \mathbb {P}\left( \boldsymbol{A} \right) . \end{aligned}$$
(12)

As the following lemma shows, the Fréchet function \(F_2(\boldsymbol{B})\) depends on \(\boldsymbol{B}\) only through the entries of the probability matrix \(\boldsymbol{P}\) along the edges of \(\boldsymbol{B}\).

Lemma 3

Let \(\boldsymbol{B}\in \mathcal {S}\), let \(\mathcal {E}(\boldsymbol{B})\) be the set of edges of the graph associated to \(\boldsymbol{B}\). Then

$$\begin{aligned} F_2(\boldsymbol{B}) = \bigg [ \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})} \left( 1 - 2p_{ij}\right) + \sum _{1 \le i< j \le n} p_{ij}\bigg ]^2 + \sum _{1 \le i < j \le n} p_{ij}\left( 1 - p_{ij}\right) . \end{aligned}$$
(13)

Proof

The proof of (13) relies on the expression (11) for the squared Hamming distance. The proof is omitted; instead, we will prove Lemma 4, whose proof is very similar to that of Lemma 3, albeit more technical.
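
Since the proof is omitted, a numerical sanity check may be reassuring: the sketch below (ours, feasible only for tiny n) evaluates \(F_2(\boldsymbol{B})\) both from the closed form (13) and by direct summation over \(\mathcal {S}\), per (12); the two values should agree.

```python
from itertools import product

def frechet_function_closed_form(P, B):
    """Population Frechet function F_2(B) via the closed form (13)."""
    iu = np.triu_indices(P.shape[0], k=1)
    p, b = P[iu], B[iu]
    bracket = np.sum(1 - 2 * p[b == 1]) + p.sum()
    return bracket**2 + np.sum(p * (1 - p))

def frechet_function_brute_force(P, B):
    """F_2(B) by direct summation over all of S, per (12); the sum has
    2^(n(n-1)/2) terms, so keep n very small."""
    iu = np.triu_indices(P.shape[0], k=1)
    p, b = P[iu], B[iu]
    F = 0.0
    for bits in product([0, 1], repeat=len(p)):
        a = np.asarray(bits)
        prob = np.prod(np.where(a == 1, p, 1 - p))   # probability (2) of A
        F += prob * np.abs(a - b).sum() ** 2
    return F
```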

3.2 Proof of Theorem 1

We are now in position to prove the first theorem. By Lemma 3, we seek the matrix \(\boldsymbol{B}\), with edge set \(\mathcal {E}(\boldsymbol{B})\), that minimizes the Fréchet function defined by (13). Let us denote

$$\begin{aligned} x = \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})} \left( 1 - 2 p_{ij}\right) . \end{aligned}$$
(14)

Since \(0 \le p_{ij} \le 1\), x is confined to the following interval,

$$\begin{aligned} - \sum _{1 \le i < j \le n} p_{ij} \le - \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})} p_{ij} \le x \le \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})} 1 \le n(n-1)/2. \end{aligned}$$
(15)

In fact, \(x = -\sum _{1 \le i< j \le n} p_{ij}\) only if \(p_{ij}=1\) along the edges of \(\boldsymbol{B}\) and \(p_{ij}=0\) along its nonedges; in particular, when \(p_{ij}=1\) for all \(i,j \in [n]\), the graph associated with \(\boldsymbol{B}\) must be the complete graph. This degenerate case is of no interest to us, and thus we can assume that \(\boldsymbol{P}\) is always chosen such that

$$\begin{aligned} -\sum _{1 \le i< j \le n} p_{ij} < x. \end{aligned}$$
(16)

We define

$$\begin{aligned} f(x) = \Big ( x + \sum _{1 \le i < j \le n} p_{ij}\Big )^2. \end{aligned}$$

We have

$$\begin{aligned} f(x) = F_2(\boldsymbol{B}) - \sum _{1 \le i < j \le n} p_{ij}(1 - p_{ij}). \end{aligned}$$

Minimizing \(F_2\) is therefore equivalent to minimizing f. Clearly, f is convex, attains its global minimum at \(x_{\min {}} = -\sum _{1\le i< j \le n} p_{ij}\), and is increasing for \(x \ge x_{\min {}}\). We seek the value \(x^*\) that minimizes f over the interval to which x is confined,

$$\begin{aligned} \Big (-\sum _{1\le i< j \le n} p_{ij}, \; n(n-1)/2 \Big ]. \end{aligned}$$

We note that because of (16), \(x_{\min {}} < x^*\). Also, \(x^*\) cannot be positive: the empty graph yields \(x = 0\), so if \(x^* > 0\) we would have \(f(x^*) > f(0)\), contradicting the minimality of \(f(x^*)\). The optimal value \(x^*\) is obtained by minimizing the distance from \(x^*\) to \(x_{\min {}} = -\sum _{1\le i< j \le n} p_{ij}\),

$$\begin{aligned} x^*- (- \sum _{1\le i< j \le n} p_{ij}) = \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})} (1 - 2 p_{ij}) + \sum _{1\le i< j \le n} p_{ij} \ge \sum _{(i,j); 1 -2p_{ij}< 0} \left( 1 - 2 p_{ij} \right) + \sum _{1\le i< j \le n} p_{ij} . \end{aligned}$$
(17)

The lower bound (17) is independent of \(\boldsymbol{B}\), and it is attained by choosing

$$\begin{aligned} \boldsymbol{\mu } \big [ \mathbb {P} \big ]_{ij} = {\left\{ \begin{array}{ll} 1 &{} \text {if}\quad p_{ij} > 1/2,\\ 0 &{} \text {otherwise,} \end{array}\right. } \end{aligned}$$
(18)

as advertised in the theorem.    \(\square \)

3.3 The Sample Fréchet Function for the Hamming Distance

We now consider N independent random graphs, \(\left\{ G^{(k)}\right\} _{1\le k \le N}\), sampled from \(\mathcal {G}\big (n,\boldsymbol{P}\big )\), with adjacency matrices \(\boldsymbol{A}^{(k)}\). The sample Fréchet function \(\widehat{F}_2{(\boldsymbol{B})}\) associated with the sample Fréchet mean graph is defined as follows.

Definition 5

We denote by \(\widehat{F}_2\) the sample Fréchet function

$$\begin{aligned} \widehat{F}_2{(\boldsymbol{B})} = \frac{1}{N} \sum _{k=1}^N d^2_H(\boldsymbol{A}^{(k)}, \boldsymbol{B}). \end{aligned}$$
(19)

We have the following expression for \(\widehat{F}_2{(\boldsymbol{B})}\), which is similar to (13).

Lemma 4

Let \(\boldsymbol{B}\in \mathcal {S}\), let \(\mathcal {E}(\boldsymbol{B})\) be the set of edges of the graph associated to \(\boldsymbol{B}\). Then

$$\begin{aligned} \widehat{F}_2{(\boldsymbol{B})} =&\Bigg [ \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})} \big [1 - 2\widehat{\mathbb {E}}_N\left[ a_{ij} \right] \big ] + \sum _{1 \le i< j \le n} \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \Bigg ]^2 + \sum _{1 \le i < j \le n} \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \left( 1 - \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \right) \end{aligned}$$
(20)
$$\begin{aligned}&- \sum _{1 \le i< j \le n} \sum _{{\mathop {(i,j) \ne (i^\prime ,j^\prime )}\limits ^{1 \le i^\prime < j^\prime \le n }}} \left( \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \widehat{\mathbb {E}}_N\left[ a_{i^\prime j^\prime } \right] - \widehat{\mathbb {E}}_N\left[ \rho _{ij, i^\prime j^\prime } \right] \right) \nonumber \\&+ 4 \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})} \sum _{(i^\prime ,j^\prime ) \in \overline{\mathcal {E}}(\boldsymbol{B})} \left( \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \widehat{\mathbb {E}}_N\left[ a_{i^\prime j^\prime } \right] - \widehat{\mathbb {E}}_N\left[ \rho _{ij, i^\prime j^\prime } \right] \right) \end{aligned}$$
(21)

where the sample mean and sample correlation are defined by

$$\begin{aligned} \widehat{\mathbb {E}}_N\left[ a_{ij} \right] = \frac{1}{N} \sum _{k=1}^N a^{(k)}_{ij}\quad \text {and} \quad \widehat{\mathbb {E}}_N\left[ \rho _{ij, i^\prime j^\prime } \right] = \frac{1}{N} \sum _{k=1}^N a^{(k)}_{ij}a^{(k)}_{i^\prime j^\prime } \end{aligned}$$
(22)

Proof

The proof is similar to the proof of Lemma 3. For each graph \(G^{(k)}\), we apply Eq. (11), sum over all the graphs in the sample, and divide by N,

$$\begin{aligned} \widehat{F}_2{(\boldsymbol{B})}&= \left|\mathcal {E}(\boldsymbol{B})\right|^2 + 2 \left|\mathcal {E}(\boldsymbol{B})\right|\bigg [ \sum _{(i,j) \in \overline{\mathcal {E}}(\boldsymbol{B})} \frac{1}{N} \sum _{k=1}^N a^{(k)}_{ij}- \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})} \frac{1}{N} \sum _{k=1}^N a^{(k)}_{ij}\bigg ] \nonumber \\&+ \frac{1}{N} \sum _{k=1}^N \Big [ \sum _{1 \le i < j \le n}a^{(k)}_{ij}\Big ]^2 - 4 \sum _{(i,j) \in \overline{\mathcal {E}}(\boldsymbol{B})}\sum _{(i^\prime ,j^\prime ) \in \mathcal {E}(\boldsymbol{B})} \bigg [\frac{1}{N} \sum _{k=1}^N a^{(k)}_{ij}a^{(k)}_{i^\prime j^\prime }\bigg ]. \end{aligned}$$
(23)

Using the expressions for the sample mean and correlation, in (22), we get

$$\begin{aligned} \widehat{F}_2{(\boldsymbol{B})}&= \left|\mathcal {E}(\boldsymbol{B})\right|^2 + 2 \left|\mathcal {E}(\boldsymbol{B})\right|\Big [ \sum _{(i,j) \in \overline{\mathcal {E}}(\boldsymbol{B})} \widehat{\mathbb {E}}_N\left[ a_{ij} \right] - \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})} \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \Big ] \nonumber \\&+ \frac{1}{N} \sum _{k=1}^N \Big [ \sum _{1 \le i < j \le n}a^{(k)}_{ij}\Big ]^2 - 4 \sum _{(i,j) \in \overline{\mathcal {E}}(\boldsymbol{B})}\sum _{(i^\prime ,j^\prime ) \in \mathcal {E}(\boldsymbol{B})} \widehat{\mathbb {E}}_N\left[ \rho _{ij, i^\prime j^\prime } \right] \end{aligned}$$
(24)

We note that

$$\begin{aligned} \frac{1}{N} \sum _{k=1}^N \Big [\sum _{1 \le i< j \le n}a^{(k)}_{ij}\Big ]^2 = \sum _{1 \le i< j \le n}\; \sum _{1 \le i^\prime <j^\prime \le n} \widehat{\mathbb {E}}_N\left[ \rho _{ij, i^\prime j^\prime } \right] . \end{aligned}$$
(25)

Also, we have

$$\begin{aligned} \left|\mathcal {E}(\boldsymbol{B})\right|^2&+ 2 \left|\mathcal {E}(\boldsymbol{B})\right|\Big [\sum _{(i,j) \in \overline{\mathcal {E}}(\boldsymbol{B})} \widehat{\mathbb {E}}_N\left[ a_{ij} \right] - \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})} \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \Big ] \nonumber \\ =&\bigg [ \left|\mathcal {E}(\boldsymbol{B})\right|- 2 \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})} \widehat{\mathbb {E}}_N\left[ a_{ij} \right] + \sum _{1 \le i< j \le n} \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \bigg ]^2 - \sum _{1 \le i< j \le n}\; \sum _{1 \le i^\prime <j^\prime \le n} \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \widehat{\mathbb {E}}_N\left[ a_{i^\prime j^\prime } \right] \nonumber \\&+ 4 \sum _{(i,j) \in \overline{\mathcal {E}}(\boldsymbol{B})}\sum _{(i^\prime ,j^\prime ) \in \mathcal {E}(\boldsymbol{B})} \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \widehat{\mathbb {E}}_N\left[ a_{i^\prime j^\prime } \right] \end{aligned}$$
(26)

We can then substitute (25) and (26) into (24), and we get

$$\begin{aligned} \widehat{F}_2{(\boldsymbol{B})}&= \bigg [ \left|\mathcal {E}(\boldsymbol{B})\right|- 2 \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})} \widehat{\mathbb {E}}_N\left[ a_{ij} \right] + \sum _{1 \le i< j \le n} \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \bigg ]^2 \nonumber \\&- \sum _{1 \le i< j \le n}\; \sum _{1 \le i^\prime <j^\prime \le n} \Big [\widehat{\mathbb {E}}_N\left[ a_{ij} \right] \widehat{\mathbb {E}}_N\left[ a_{i^\prime j^\prime } \right] - \widehat{\mathbb {E}}_N\left[ \rho _{ij, i^\prime j^\prime } \right] \Big ]\nonumber \\&+ 4 \sum _{(i,j) \in \overline{\mathcal {E}}(\boldsymbol{B})}\sum _{(i^\prime ,j^\prime ) \in \mathcal {E}(\boldsymbol{B})} \Big [ \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \widehat{\mathbb {E}}_N\left[ a_{i^\prime j^\prime } \right] - \widehat{\mathbb {E}}_N\left[ \rho _{ij, i^\prime j^\prime } \right] \Big ] \end{aligned}$$
(27)

Finally, we extract from the second line of (27) the terms that correspond to \((i,j) = (i^\prime ,j^\prime )\), and we get

$$\begin{aligned} \sum _{1 \le i< j \le n} \sum _{1 \le i^\prime<j^\prime \le n}&\left[ \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \widehat{\mathbb {E}}_N\left[ a_{i^\prime j^\prime } \right] - \widehat{\mathbb {E}}_N\left[ \rho _{ij, i^\prime j^\prime } \right] \right] \nonumber \\ =&\sum _{1 \le i< j \le n}\; \sum _{{\mathop {(i^\prime ,j^\prime ) \ne (i,j)}\limits ^{1 \le i^\prime<j^\prime \le n}}} \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \widehat{\mathbb {E}}_N\left[ a_{i^\prime j^\prime } \right] - \widehat{\mathbb {E}}_N\left[ \rho _{ij, i^\prime j^\prime } \right] \nonumber \\&+ \sum _{1 \le i < j \le n} \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \left( \widehat{\mathbb {E}}_N\left[ a_{ij} \right] -1 \right) . \end{aligned}$$
(28)

Substituting (28) into the second line of (27), we obtain (21) as announced in Lemma 4.    \(\square \)
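
The algebra above can also be checked numerically; the following sketch (ours) evaluates \(\widehat{F}_2{(\boldsymbol{B})}\) both from the definition (19) and from the closed form (20)-(21), and the two results should agree to machine precision.

```python
def sample_frechet_function_direct(As, B):
    """Sample Frechet function from the definition (19)."""
    iu = np.triu_indices(B.shape[0], k=1)
    b = B[iu]
    return np.mean([np.abs(A[iu] - b).sum() ** 2 for A in As])

def sample_frechet_function_lemma4(As, B):
    """Sample Frechet function from the closed form (20)-(21)."""
    iu = np.triu_indices(B.shape[0], k=1)
    S = As[:, iu[0], iu[1]].astype(float)   # N x m matrix of edge indicators
    b = B[iu]
    m1 = S.mean(axis=0)                     # sample means E_N[a_ij]
    R = (S.T @ S) / S.shape[0]              # sample correlations E_N[rho]
    C = np.outer(m1, m1) - R                # E_N[a] E_N[a'] - E_N[rho]
    np.fill_diagonal(C, 0.0)                # keep only (i, j) != (i', j')
    bracket = np.sum(1 - 2 * m1[b == 1]) + m1.sum()
    return (bracket**2 + np.sum(m1 * (1 - m1))
            - C.sum() + 4 * C[np.ix_(b == 1, b == 0)].sum())
```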

3.4 Concentration of the Sample Fréchet Function

In the following lemma, we show that for large sample size N, the sample Fréchet function \(\widehat{F}_2{(\boldsymbol{B})}\) concentrates around its population counterpart, \(F_2(\boldsymbol{B})\).

Lemma 5

\(\forall \delta \in (0,1), \exists N_\delta , \forall N\ge N_\delta , \forall \boldsymbol{B}\in \mathcal {S}\),

$$\begin{aligned} \widehat{F}_2{(\boldsymbol{B})} = \bigg [ \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})} \left( 1 - 2p_{ij}\right) + \sum _{1 \le i< j \le n} p_{ij}\bigg ]^2 + \sum _{1 \le i < j \le n} p_{ij}\left( 1 - p_{ij}\right) + {\mathcal {O}}\left( \frac{1}{\sqrt{N}}\right) , \end{aligned}$$
(29)

with probability \(1-\delta \) over the realization of the sample \(\left\{ G^{(k)}\right\} _{1\le k \le N}\) .

Proof

The sample mean \(\widehat{\mathbb {E}}_N\left[ a_{ij} \right] \), defined in (22), is an average of independent Bernoulli random variables, and it concentrates around its mean \(p_{ij}\). We use Hoeffding's inequality to bound the deviation of \(\widehat{\mathbb {E}}_N\left[ a_{ij} \right] \) from \(p_{ij}\). For each \( 1 \le i< j \le n\), we have,

$$\begin{aligned} \mathbb {P}\left( \boldsymbol{A}^{(k)}\sim \mathcal {G}\big (n,\boldsymbol{P}\big ); \left|\widehat{\mathbb {E}}_N\left[ a_{ij} \right] -p_{ij} \right|\ge \varepsilon \right) \le 2\exp {\displaystyle \left( -2N \varepsilon ^2\right) }. \end{aligned}$$
(30)

To control \(\widehat{\mathbb {E}}_N\left[ a_{ij} \right] \) simultaneously for all \(1 \le i< j \le n\), we use a union bound, and we get,

$$\begin{aligned} \forall \, 1 \le i< j \le n,\quad \left|\widehat{\mathbb {E}}_N\left[ a_{ij} \right] -p_{ij}\right|\le \frac{\alpha }{\sqrt{N}}, \end{aligned}$$
(31)

with probability \(1- \delta /8\), and where \(\alpha = \sqrt{\log \left( 2n/\sqrt{\delta }\right) }\).

We now study the concentration of the sample correlation,

$$\begin{aligned} \widehat{\mathbb {E}}_N\left[ \rho _{ij, i^\prime j^\prime } \right] = \frac{1}{N} \sum _{k=1}^N a^{(k)}_{ij}a^{(k)}_{i^\prime j^\prime }, \end{aligned}$$
(32)

when the pairs (i, j) and \((i^\prime , j^\prime )\) are distinct. Because \((i,j) \ne (i^\prime , j^\prime )\), the entries \(a^{(k)}_{ij}\) and \(a^{(k)}_{i^\prime j^\prime }\) are independent, and the product \(a^{(k)}_{ij}a^{(k)}_{i^\prime j^\prime }\) is a Bernoulli random variable with parameter \(p_{ij}p_{i^\prime j^\prime }\). We conclude that \(\widehat{\mathbb {E}}_N\left[ \rho _{ij, i^\prime j^\prime } \right] \) is an average of independent Bernoulli random variables, and concentrates around its mean.

We use Hoeffding's inequality to bound the deviation of \(\widehat{\mathbb {E}}_N\left[ \rho _{ij, i^\prime j^\prime } \right] \) from its mean. Replicating, mutatis mutandis, the argument used for \(\widehat{\mathbb {E}}_N\left[ a_{ij} \right] \) yields

$$\begin{aligned} \forall \; 1 \le i< j \le n, \forall \; 1 \le i^\prime < j^\prime \le n,\; \left|\widehat{\mathbb {E}}_N\left[ \rho _{ij, i^\prime j^\prime } \right] -p_{ij}p_{i^\prime j^\prime }\right|\le \frac{\beta }{\sqrt{N}}, \end{aligned}$$
(33)

with probability \(1 - \delta /8\), where \(\beta = \sqrt{\log {(n^2/\sqrt{\delta /2})}}\). In summary, we have

$$\begin{aligned} \forall \; 1 \le i< j \le n, \quad \forall \; 1 \le i^\prime < j^\prime \le n,&\quad \text {with} \quad (i,j) \ne (i^\prime , j^\prime ),\nonumber \\ \widehat{\mathbb {E}}_N\left[ a_{ij} \right] = p_{ij}+ {\mathcal {O}}\left( \frac{1}{\sqrt{N}}\right) , \;&\text {and} \; \widehat{\mathbb {E}}_N\left[ \rho _{ij, i^\prime j^\prime } \right] = p_{ij}p_{i^\prime j^\prime }+ {\mathcal {O}}\left( \frac{1}{\sqrt{N}}\right) , \end{aligned}$$
(34)

with probability \(1 - \delta /4\). We are now in a position to substitute the expressions given by (34) for \(\widehat{\mathbb {E}}_N\left[ a_{ij} \right] \) and \(\widehat{\mathbb {E}}_N\left[ \rho _{ij, i^\prime j^\prime } \right] \) into \(\widehat{F}_2{(\boldsymbol{B})}\), given by (20)-(21) in Lemma 4. Using (34), the first term in (20) becomes

$$\begin{aligned} \Big [ \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})} \left( 1 - 2\widehat{\mathbb {E}}_N\left[ a_{ij} \right] \right) + \sum _{1 \le i< j \le n} \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \Big ]^2&\nonumber \\ = \Big [ \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})} \left( 1 - 2p_{ij}\right) + \sum _{1 \le i < j \le n} p_{ij}\Big ]^2&+ {\mathcal {O}}\left( \frac{1}{\sqrt{N}}\right) , \end{aligned}$$
(35)

with probability \(1 - \delta /4\). Also, we have

$$\begin{aligned} \sum _{1 \le i< j \le n} \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \left( 1 - \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \right) = \sum _{1 \le i < j \le n} p_{ij}\left( 1 - p_{ij}\right) + {\mathcal {O}}\left( \frac{1}{\sqrt{N}}\right) , \end{aligned}$$
(36)

with probability \(1 - \delta /4\). The last two terms in (21) can be neglected since,

$$\begin{aligned} \sum _{1 \le i< j \le n}&\sum _{{\mathop {(i,j) \ne (i^\prime ,j^\prime )}\limits ^{1 \le i^\prime< j^\prime \le n }}} \Big [ \widehat{\mathbb {E}}_N\left[ a_{ij} \right] \widehat{\mathbb {E}}_N\left[ a_{i^\prime j^\prime } \right] - \widehat{\mathbb {E}}_N\left[ \rho _{ij, i^\prime j^\prime } \right] \Big ] \nonumber \\&= \sum _{1 \le i< j \le n} \sum _{{\mathop {(i,j) \ne (i^\prime ,j^\prime )}\limits ^{1 \le i^\prime < j^\prime \le n }}} \Big [p_{ij}p_{i^\prime j^\prime }- p_{ij}p_{i^\prime j^\prime }\Big ] + {\mathcal {O}}\left( \frac{1}{\sqrt{N}}\right) = {\mathcal {O}}\left( \frac{1}{\sqrt{N}}\right) , \end{aligned}$$
(37)

with probability \(1 - \delta /4\). Similarly

$$\begin{aligned} \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})} \sum _{(i^\prime ,j^\prime ) \in \overline{\mathcal {E}}(\boldsymbol{B})} \Big [\widehat{\mathbb {E}}_N\left[ a_{ij} \right] \widehat{\mathbb {E}}_N\left[ a_{i^\prime j^\prime } \right] - \widehat{\mathbb {E}}_N\left[ \rho _{ij, i^\prime j^\prime } \right] \Big ] = {\mathcal {O}}\left( \frac{1}{\sqrt{N}}\right) , \end{aligned}$$
(38)

with probability \(1 - \delta /4\). Substituting (35), (36), (37), and (38) into (20)-(21) yields the following estimate

$$\begin{aligned} \widehat{F}_2{(\boldsymbol{B})} = \Big [ \sum _{(i,j) \in \mathcal {E}(\boldsymbol{B})} \left( 1 - 2p_{ij}\right) + \sum _{1 \le i< j \le n} p_{ij}\Big ]^2 + \sum _{1 \le i < j \le n} p_{ij}\left( 1 - p_{ij}\right) + {\mathcal {O}}\left( \frac{1}{\sqrt{N}}\right) , \end{aligned}$$

which holds with probability \(1 - \delta \).    \(\square \)

3.5 Proof of Theorem 2

We prove (9), in Theorem 2, for the sample Fréchet mean. The proof for the sample Fréchet median is entirely analogous (it also uses a concentration of measure argument for the Fréchet function defined in (7)), and is therefore omitted.

Lemma 5, namely (29), implies that

$$\begin{aligned} \forall \delta \in (0,1), \exists N_\delta , \forall N\ge N_\delta , \forall \boldsymbol{B}\in \mathcal {S}, \widehat{F}_2{(\boldsymbol{B})} = F_2(\boldsymbol{B}) + {\mathcal {O}}\left( \frac{1}{\sqrt{N}}\right) , \end{aligned}$$

with probability \(1-\delta \) over the realization of the sample \(\left\{ G^{(k)}\right\} _{1\le k \le N}\). The leading term is \(F_2(\boldsymbol{B})\), the Fréchet function for the population mean, given by (13) in Lemma 3. Because \(\mathcal {S}\) is finite, the gap between the smallest and the second smallest values of \(F_2\) is strictly positive; hence, for N large enough, the \({\mathcal {O}}\left( 1/\sqrt{N}\right) \) term cannot change the minimizer. The minimum of \(\widehat{F}_2{(\boldsymbol{B})}\) is thus achieved at the adjacency matrix of the population Fréchet mean, \(\boldsymbol{\mu } \big [ \mathbb {P} \big ]\), defined by (6), as advertised in (9), in Theorem 2.    \(\square \)

4 Simulation Studies

We compare our theoretical analysis to finite-sample estimates computed using numerical simulations. The software used to conduct the experiments is publicly available [15].

All graphs were generated using the \(\mathcal {G}\big (n,\boldsymbol{P}\big )\) model (2). We varied the edge probability matrix \(\boldsymbol{P}\): for each simulation, \(\boldsymbol{P}\) was chosen randomly using independent (up to symmetry) beta random variables, \(p_{ij} \sim \text {beta} (2,10)\). The sample Fréchet mean was computed using the approximation provided by (10). All graphs had \(n=512\) vertices, and the sample size varied over \(N \in [10, 7079]\). For each sample size N, we first generated a probability matrix \(\boldsymbol{P}\) from the beta distribution, and we then sampled N independent random graphs \(G^{(1)},\ldots , G^{(N)}\) from \(\mathcal {G}\big (n,\boldsymbol{P}\big )\).

We illustrate the concentration of the sample Fréchet function for large N, described by Lemma 5. Figure 1 displays the mean error between the population Fréchet function \(F_2(\boldsymbol{B})\), given by (13), and the sample Fréchet function \(\widehat{F}_2{(\boldsymbol{B})}\), given by (20)-(21), as a function of the sample size N. The average error between \(F_2(\boldsymbol{B})\) and \(\widehat{F}_2{(\boldsymbol{B})}\) is computed using a sample of \(N_B=16\) independent random graphs \(\boldsymbol{B}_1,\ldots , \boldsymbol{B}_{N_B}\), sampled from \(\mathcal {G}\big (n,\boldsymbol{P}\big )\),

$$\begin{aligned} \widehat{\mathbb {E}}_{N_B}\big [F_2(\boldsymbol{B}) - \widehat{F}_2{(\boldsymbol{B})} \big ] = \frac{1}{N_B} \sum _{i=1}^{N_B} \Big |F_2(\boldsymbol{B}_i) - \widehat{F}_2{(\boldsymbol{B}_i)} \Big |. \end{aligned}$$
(39)

For each N, the sample average error \(\widehat{\mathbb {E}}_{N_B}\big [F_2(\boldsymbol{B}) - \widehat{F}_2{(\boldsymbol{B})} \big ]\) corresponds to a point in Fig. 1-left. We repeated this simulation 64 times to create 64 different values of the error (39). A linear regression was computed and is displayed (in blue) in Fig. 1-left. The slope of the error was found to be \(-0.5028\), confirming the \(1/\sqrt{N}\) decay of the error predicted by Lemma 5.
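
A reduced version of the Fig. 1-left experiment can be sketched as follows (our reimplementation with a smaller n, not the published code [15]; it reuses sample_adjacency and frechet_function_closed_form from the sketches above):

```python
rng = np.random.default_rng(0)
n = 64
P = np.triu(rng.beta(2, 10, size=(n, n)), k=1)
P = P + P.T                                    # symmetric beta(2, 10) entries

iu = np.triu_indices(n, k=1)
for N in [10, 100, 1000]:
    As = np.stack([sample_adjacency(P, rng) for _ in range(N)])
    B = sample_adjacency(P, rng)               # a test graph B ~ G(n, P)
    F2_hat = np.mean([np.abs(A[iu] - B[iu]).sum() ** 2 for A in As])
    F2 = frechet_function_closed_form(P, B)
    print(N, abs(F2 - F2_hat))                 # error should decay like 1/sqrt(N)
```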

Fig. 1.

Left: mean error \(\widehat{\mathbb {E}}_{N_B}\big [F_2(\boldsymbol{B}) - \widehat{F}_2{(\boldsymbol{B})} \big ]\) between the population and the sample Fréchet functions, as a function of the sample size N. Right: \(d_H(\boldsymbol{\mu } \big [ \boldsymbol{A} \big ],\widehat{\boldsymbol{\mu }}_N \big [\boldsymbol{A} \big ])\), the Hamming distance between the population Fréchet mean graph and the sample Fréchet mean graph.

To estimate the deviation of the sample Fréchet mean graph from the population Fréchet mean graph, we computed the Hamming distance between the population Fréchet mean graph (6) and the sample Fréchet mean graph (9). Figure 1-right displays \(d_H(\boldsymbol{\mu } \big [ \boldsymbol{A} \big ],\widehat{\boldsymbol{\mu }}_N \big [\boldsymbol{A} \big ])\). A linear regression was computed and is displayed (in blue) in Fig. 1-right. The slope was found to be \(-0.6707\), suggesting that the sample Fréchet mean converges toward the population Fréchet mean at a rate \(1/N^{2/3}\), which is faster than the rate predicted by Lemma 5.

5 Discussion and Conclusion

Our answer to the question of the authors in [14], “what is the “mean” network (rather than how do we estimate the success-probabilities of an inhomogeneous random graph), and do we want the “mean” itself to be a network?”, is therefore disappointing in the context of the probability space \(\mathcal {G}\big (n,\boldsymbol{P}\big )\). While the Fréchet mean is indeed a graph in \(\mathcal {G}\), it only provides a simplistic sketch of the probability space. Consider, for instance, sparse graphs where \(\max _{i<j} p_{ij} < 1/2\) (e.g., graphs with \(o\left( n^2\right) \) but \(\omega (n)\) edges): the sample Fréchet mean is then the empty graph, which is uninformative.

On a more positive note, our analysis provides a theoretical justification for several algorithms designed to recover a graph from noisy measurements of its adjacency matrix. For instance, the authors in [12] compute the sample mean of the noisy adjacency matrices, and threshold the sample mean to recover an unweighted graph. Our results offer a theoretical justification of the approach of [12]: Theorem 2 proves that the algorithm described in [12] recovers the sample Fréchet mean graph.