Keywords

1 Introduction

A large number of real-world systems can be modelled and analysed by looking at the structure that emerges from the interaction between their components [5]. The resulting graph is called a complex network, and provides a powerful way to study the static and dynamic aspects of the underlying system. Typical examples of systems that are studied in network science include metabolic pathways [9], protein interactions [8], brain regions interactions [20] and scientific collaborations [13]. Complex networks often display non-trivial structural properties that distinguish them from Erdös-Rényi random graphs [4], such as small-worldness and a power-law distribution of vertex degrees [5].

In these large networks one of the key problems is that of identifying the set of most relevant nodes, also called central nodes. A number of centrality measure have been introduced in the literature [3, 5,6,7, 12, 17], each of them capturing different but equally significant aspects of vertex importance. Commonly encountered examples include the degree centrality [7], the closeness centrality [21], and the betweenness centrality [7]. Let G be a graph with n nodes. In the degree centrality [7] the normalised (degree) is taken as the centrality value of a vertex, i.e.,

$$DC(v) = \frac{d_v}{\sum _{u=1}^n d_u},$$

where \(d_v\) denotes the degree of the vertex v. In other words, the number of edges incident on a vertex is interpreted as a measure of its “popularity”, or, alternatively, as the risk of a node being infected in an epidemiological scenario. The closeness centrality [21] links the importance of a vertex to its proximity to the remaining vertices of the graph. More specifically, the closeness centrality is defined as the inverse of the sum of the distance of a vertex v to the remaining nodes of the graph, i.e.,

$$CC(v) = \frac{n-1}{\sum _{u=1}^n sp(u,v)},$$

where sp(uv) denotes the shortest path distance between nodes u and v. Finally, the betweenness centrality [7] measures the extent to which a given vertex lies on the (shortest) paths between the remaining vertices, i.e.,

$$BC(v) = \sum _{s,t \in V} \frac{\sigma (s,t|v)}{\sigma (s,t)},$$

where V is the set of nodes, \(\sigma (s,t)\) and \(\sigma (s,t|v)\) denote the number of shortest paths between s and t and the number of shortest paths between s and t that pass through v.

Recently, there has been an increasing interest in using concepts from quantum mechanics and quantum information theory to probe the structure of graphs [11, 18, 19]. In [11], Lockhart et al. introduced an edge centrality index based on quantum information theory, where the importance of an edge is measured in terms of its contribution to the Von Neumann entropy of the network [16]. This in turn relies on decomposing the edge set of a graph as follows. Given an edge u, the original graph is decomposed into two graphs over the same vertex set, but with different number of edges: (1) a graph where only the edge e is present, and (2) a graph where all the original edges except e are present. With this decomposition to hand, the centrality of e is measured as the Holevo quantity of the associated decomposition [11].

In this paper, we show that a similar approach can be taken to measure the centrality of a vertex. Given a vertex v, we propose to decompose the graph into two graphs over the same vertex set but with edge sets as follows: (1) one graph where only the edges incident to v are present, and (2) one graph where all the original edges except those incident to v are present. Then, the centrality of v is the Holevo quantity associated to the resulting graph ensemble. We show that the centrality of a vertex v can be broken down in two parts: (1) one part that is negatively correlated with the degree centrality of v, and (2) one part that depends on the emergence of non-trivial structures in the graph when v is disconnected from the rest of the graph by removing all edges incident to it. Finally, we perform a series of experiments to evaluate the proposed edge centrality measure on real-world as well as synthetic graphs, and we compare it against a number of commonly used alternative measures.

The remainder of this paper is organised as follows: Sect. 2 reviews the necessary quantum mechanical background and the quantum information theoretical concepts that underpin our approach. Section 3 introduces the proposed vertex centrality measure, which is then analysed and compared to alternative measures in Sect. 4. Finally, Sect. 5 concludes the paper.

2 Quantum Information Theoretical Background

2.1 Quantum States and von Neumann Entropy

In quantum mechanics, a system can be either in a pure state or a mixed state. Using the Dirac notation, a pure state is represented as a column vector . A mixed state, on the other hand, is an ensemble of pure quantum states , each with probability \(p_i\). The density operator of such a system is a positive unit-trace matrix defined as

(1)

The von Neumann entropy [14] S of a mixed state is defined in terms of the trace and logarithm of the density operator \(\rho \)

$$\begin{aligned} S(\rho ) = -{{\mathrm{Tr}}}(\rho \ln \rho )=-\sum _i \lambda _i \ln (\lambda _i) \end{aligned}$$
(2)

where \(\lambda _1,\ldots ,\lambda _n\) are the eigenvalues of \(\rho \). If , i.e., the quantum system is a pure state with probability \(p_i=1\), then the Von Neumann entropy \(S(\rho ) = -{{\mathrm{Tr}}}(\rho \ln {\rho })\) is zero. On other hand, a mixed state always has a non-zero Von Neumann entropy associated with it.

2.2 A Mixed State from the Graph Laplacian

Let \(G=(V,E)\) be a simple graph with n vertices and m edges. We assign the vertices of G to the elements of the standard basis of an Hilbert space \(\mathcal {H}_{G}\), . Here denotes a column vector where 1 is at the i-th position. The graph Laplacian of G is the matrix \(L=D-A\), where A is the adjacency matrix of G and D is the diagonal matrix with elements \(d(u) = \sum _{v=1}^n A(u,v)\). For each edge \(e_{i,j}\), we define a pure state

(3)

Then we can define the mixed state with density matrix

(4)

Let us define the Hilbert spaces \(\mathcal {H}_{V}\cong \mathbb {C}^{V}\), with orthonormal basis \(\mathbf {a}_{v}\), where \(v\in V\), and \(\mathcal {H}_{E}\cong \mathbb {C}^{E}\), with orthonormal basis \(\mathbf {b}_{u,v}\), where \(\{u,v\}\in E\). It can be shown that the graph Laplacian corresponds to the partial trace of a rank-1 operator on \(\mathcal {H}_{V}\otimes \mathcal {H}_{E}\) which is determined by the graph structure [2]. As a consequence, the Von Neumann entropy of \(\rho (G)\) can be interpreted as a measure of the amount of entanglement between a system corresponding to the vertices and a system corresponding to the edges of the graph [2].

2.3 Holevo Quantity of a Graph Decomposition

Given a graph G, we can define an ensemble in terms of its subgraphs. Recall that a decomposition of a graph G is a set of subgraphs \(H_1,H_2,...,H_k\) that partition the edges of G, i.e., for all ij, \(\bigcup _{i=1}^k H_i=G\) and \(E(H_i)\cap E(H_j)=\emptyset \), where E(G) denotes the edge set of G. Notice that isolated vertices do not contribute to a decomposition, so each \(H_i\) can always be seen a subgraph that contains all the vertices. If we let \(\rho (H_1),\rho (H_2),...,\rho (H_k)\) be the mixed states of the subgraphs, the probability of \(H_i\) in the mixture \(\rho (G)\) is given by \(|E(H_i)|/|E(G)|\). Thus, we can generalise Eq. 4 and write

$$\begin{aligned} \rho (G)=\sum _{i=1}^k \frac{|E(H_{i})|}{|E(G)|}\rho (H_i). \end{aligned}$$
(5)

Consider a graph G and its decomposition \(H_1,H_2,...,H_k\) with corresponding states \(\rho (H_1),\rho (H_2),...,\rho (H_k)\). Let us assign \(\rho (H_1),\rho (H_2),...,\rho (H_k)\) to the elements of an alphabet \(\{a_1,a_2,...,a_k\}\). In quantum information theory, the classical concepts of uncertainty and entropy are extended to deal with quantum states, where uncertainty about the state of a quantum system can be expressed using the density matrix formalism. Assume a source emits letters from the alphabet and that the letter \(a_i\) is emitted with probability \(p_i = |E(H_i)|/|E(G)|\). An upper bound to the accessible information is given by the Holevo quantity of the ensemble \(\{p_i,\rho (H_i)\}\):

$$\begin{aligned} \chi (\left\{ p_i,\rho (H_i)\right\} ) = S\left( \sum \limits _{i=1}^k p_i\rho (H_i)\right) -\sum \limits _{i=1}^k p_i S(\rho (H_i)) \end{aligned}$$
(6)

3 The Holevo Vertex Centrality

Given a graph \(G=(V,E)\) and a vertex \(v \in V\), we propose to measure the centrality of v as follows. Let \(G_v = (V,E_v)\) denote the subgraph with vertex set V and edge set \(E_v = \{ (u,v) \in E | u \in V \}\), and \(G_{\overline{v}} = (V,E_{\overline{v}})\) be the subgraph with vertex set V and edge set \(E_v = \{ (u,v) \in E | (u,v) \not \in E_v\}\). In other words, \(E = E_{v} \cup E_{\overline{v}}\) and \(E_{v} \cap E_{\overline{v}} = \emptyset \). Hence, from Eq. 5, we can show that

$$\begin{aligned} \frac{|E_{\overline{v}}|}{|E|} \rho (G_{\overline{v}})+ \frac{|E_v|}{|E|}\rho (G_v) = \rho (G). \end{aligned}$$
(7)

With this decomposition to hand, we define the Holevo vertex centrality of v as

$$\begin{aligned} HC(v) =&~\chi \left( \left\{ \left( \frac{|E_{\overline{v}}|}{|E|},G_{\overline{v}}\right) ,\left( \frac{|E_v|}{|E|},G_v\right) \right\} \right) \nonumber \\ =&~S\left( \rho (G)\right) -\left( \frac{|E_{\overline{v}}|}{|E|}S\left( \rho (G_{\overline{v}})\right) + \frac{|E_v|}{|E|}S\left( \rho (G_v)\right) \right) . \end{aligned}$$
(8)

Given a graph \(G=(V,E)\) and a vertex \(v \in V\), the first term in Eq. 8 (i.e., \(S\left( \rho (G)\right) \)) does not depend on the choice of v, and thus can be ignored when ranking the nodes of G according to their Holevo centrality. Moreover, note that we only need to compute the spectrum of \(\rho (G_{\overline{v}})\), as the spectrum of \(\rho (G_v)\) can be easily determined analytically. Recall that the star graph on n vertices \(K_{1,n-1}\) has Laplacian spectrum

$$\{n^{[1]},1^{[n-2]},0^{[1]}\},$$

i.e., it has three eigenvalues n, 1, and 0 with multiplicity 1, \(n-2\), and 1, respectively. This in turn implies that the spectrum of the density matrix \(\rho (K_{1,n-1})\) is

$$\{\frac{n}{2n-2}^{[1]},\frac{1}{2n-2}^{[n-2]},0^{[1]}\},$$

as shown in [10]. Since adding disconnected vertices to a graph does not change its Von Neumann entropy [16], we have that the entropy of \(\rho (G_v)\) is

$$\begin{aligned} S\left( \rho (G_v)\right) =&~-\frac{d_v+1}{2d_v}\log \left( \frac{d_v+1}{2d_v}\right) -\frac{d_v-1}{2d_v}\log \left( \frac{1}{2d_v}\right) \nonumber \\ =&~\frac{d_v+1}{2d_v}\log \left( \frac{2d_v}{d_v+1}\right) +\frac{d_v-1}{2d_v}\log \left( 2d_v\right) \nonumber \\ =&~\frac{1}{2d_v}\left( 2d_v\log \left( 2d_v\right) -(d_v+1)\log ({d_v+1})\right) \end{aligned}$$
(9)

where \(d_v\) denotes the degree of v. In other words, the entropy of \(\rho (G_v)\) is completely determined by the degree of v. As a result, the computational complexity of computing the Holevo centrality of v is dominated by the cost of computing the eigendecomposition of \(\rho (G_{\overline{v}})\).

Fig. 1.
figure 1

Holevo centrality for the nodes of the Wheel graph on 6 (a) and 15 (b) nodes. The radius of each node is proportional to their Holevo centrality. In (c) we show a line plot of the Holevo centrality for increasing graph size. Here 0 denotes the hub node.

Finally note that the Von Neumann entropy of a star graph is 0 when \(d_v = 1\), and it grows logarithmically as a function of \(d_v\). This in turn suggests that the Holevo centrality given by Eq. 8 could be negatively correlated with the degree centrality, however proving this would require finding general analytical form of the spectrum of \(\rho (G_{\overline{v}})\). Moreover, it should be noted that the Von Neumann entropy of \(\rho (G_{\overline{v}})\) depends on the presence of several non-trivial structural patterns, including paths, cliques, and connected components. Therefore the Holevo vertex centrality measures the importance of a vertex as a combination of its degree as well as the structural patterns that emerge after its removal.

4 Experimental Evaluation

We perform our experiments on two well known real-world networks, the Florentine families graph [15] and the Karate club network [22], as well as a number of synthetic graphs. We compare the proposed similarity measure the three commonly used alternative measures: (1) the degree centrality [7], (2) the closeness centrality [21], and (3) the betweenness centrality [7].

4.1 Synthetic Networks

Wheel Graph. The Wheel graph \(W_n\) on n nodes is the graph obtained by taking a cycle \(C_{n-1}\) on \(n-1\) nodes and connecting each of the nodes of \(C_{n-1}\) to another node, i.e., the hub. Figure 1 shows 3 wheel graphs of increasing number of nodes n and the corresponding value of the Holevo centrality. Note that for small values of n the hub is the least central node. However as n grows the centrality of the hub remains constant while the centrality of the other nodes decreases, until the hub becomes the most central node.

Indeed, our centrality measure seems to capture the increasing redundancy of the nodes along the cycle \(C_{n-1}\) as n grows. Note that this implies that our measure is negatively correlated with the degree centrality for small values of n, but positively correlated for large values of n. While this may seem surprising given the negative correlation highlighted in Eq. 9, the observed behaviour is likely due to the other component in Eq. 8, i.e., \(S(\rho (G_{\overline{v}}))\), as well as their respective weights.

Fig. 2.
figure 2

Holevo centrality for the nodes of the (mn)-lollipop graph for (a) \(m=4,n=2\), (b) \(m=6,n=2\), and (c) \(m=8,n=2\). The radius of each node is proportional to their Holevo centrality.

Lollipop Graph. The (mn)-lollipop graph on \(m+n\) nodes is the graph obtained by joining the clique \(K_m\) on m nodes with the path graph \(P_n\) on n nodes. Figure 2 shows the value of the Holevo centrality for increasing size of the clique, while keeping the size of the path fixed. For small clique sizes, the most central node is the central node on the path. However, as the size of the clique increases, the centrality of the path node decreases, while the clique nodes become increasingly important. We observe a similar behaviour when the length of the path is increased, with the nodes belonging to it being the most central ones for small values of n. However, as we increase n, the most central node in the graph becomes the node the shared node between the clique and the path.

Similarly to the Wheel graph, the Holevo centrality measure seems to capture the importance of the nodes belonging to the path when the total number of nodes in the graph is small. However, as \(m+n\) grows, our measure places most of the importance on the tightly connected nodes of the clique, while the centrality of the path is “diluted” as its length increases.

Fig. 3.
figure 3

Holevo centrality for the nodes of the Barbell graph joining two cliques \(K_m\) through a path \(P_n\), for (a) \(m=2,n=3\), (b) \(m=3,n=3\), and (c) \(m=4,n=3\). The radius of each node is proportional to their Holevo centrality.

Barbell Graph. The Barbell graph is the graph obtained by joining two cliques \(K_m\) through a path \(P_n\) (i.e., a bridge between the two cliques). Note that when \(m=2\) the corresponding Barbell graph is the path graph \(P_{n+2m}\). Figure 3 shows three Barbell graphs with n constant (i.e., the length of the bridge is 3 in all the graphs) and m equal to 2, 3, and 4, respectively. When \(m=2\) the graph is a path over 7 nodes. In this case, our centrality measures assigns the largest weight to the two nodes that connect the two ends of the path to the rest of the nodes. As the m increases, the weight of the cliques shifts the importance from the bridge to the cliques, with the node connecting the cliques to the path becoming then most central ones. If we increase the path length, we observe that the junctions between the cliques and the path remain the most central nodes. However we start to discriminate between the nodes along the bridge, with the nodes closer to the center of the bridge being assigned a higher centrality. Note that this is contrast with the degree centrality, which would assign the same weight to all the nodes on the bridge (since they all have the same degree).

Fig. 4.
figure 4

Average correlation with the degree centrality for different realisation of the Barabási-Albert preferential attachment model [1]. Here we varied the number of nodes of the generated graph, as well as the number of edges k that are created from a new node to existing nodes.

Scale-Free Graph. Finally, we consider a set of scale-free graphs generated by the Barabási-Albert preferential attachment model [1]. Starting from an empty graph, we iteratively add nodes to it until a user-defined size n is reached. At each iteration, the new node is connected to at most m nodes chosen according to their degree, i.e., nodes with a higher degree are more likely to be selected. We let m and n vary between 1 and 3, and 10 and 20, respectively, and for each pair of parameters we generated 100 graphs.

We are interested in measuring the correlation between the Holevo vertex centrality and the degree centrality. In the previous subsections we have observed that the correlation with the degree centrality seems to increase as the size of the graph increases. Figure 4 confirms that this is the case. The figure also shows that the correlation increases as we increase the number of connections added per iteration. Note that when \(m=1\) the resulting graph is guaranteed to be a tree. Indeed, Fig. 4 seems to suggest that the correlation is particularly high on trees, however further investigation is needed to understand if the observed effect is instead due to the particular degree distribution of scale-free graphs or to the presence of high degree nodes.

Fig. 5.
figure 5

The Karate network (a) and the Florentine families network (b). (c) and (d) show the correlation matrices between the Holevo (HC), degree (DC), closeness (CC), and betweenness (BC) centrality measures on the Karate and Florentine families networks, respectively. Each node of (b) is labeled with the index associated to the corresponding family in Table 1.

4.2 Real-World Networks

We conclude our experimental evaluation by measuring the centrality of two well-known networks, the Florentine families graph [15] and the Karate club network [22]. Figures 5(a) and (b) show the two networks, where the radius of the nodes is proportional to their Holevo centrality. We also compute the degree (DC), closeness (CC), and betweenness (BC) centralities over these networks and we show the corresponding correlation matrices in Figs. 5(c) and (d).

Table 1. The Holevo and the degree (in bold) centrality of the families of Florentine families network. The number next to the name of each family is the index of the corresponding node in Fig. 5(b).

In these networks, the Holevo centrality measure shows a large positive correlation with all the other measure, in particular the degree centrality. However, if we look at the ranking induced by our measure there are some important differences. Table 1 shows the ranking given by the Holevo centrality, as well as the degree of each family in the network. Clearly, the degree centrality cannot distinguish between those families that have the same degree, while the Holevo centrality allows to define a more fine-grained ranking. For example, the Salviati family (node 10) is ranked higher than the Barbadori family (node 14), although both families have degree two. However the Salviati family is the only node connecting the Pazzi family (node 6) to the Medici family (node 2) and the rest of the graph, and therefore its importance is higher.

5 Conclusion

In this paper we have proposed a novel vertex centrality measure based on the quantum information theoretical notion of Holevo quantity. The idea underpinning our approach is that the importance of a vertex is proportional to the variation in the information content of the network before and after its removal. We have shown that the centrality of a vertex v can be broken down in two parts, one which is negatively correlated with the degree centrality of v, and one which depends on the emergence of non-trivial structures in the graph when v is disconnected from the rest of the graph. Finally, we have evaluated the Holevo centrality measure on a number of synthetic as well as real-world networks, and we have compared it against commonly used alternative measures. Future work will be aimed at investigating further the structural pattern that influence this centrality measure.