1 Introduction

Due to their importance, topological properties of complex networks attract a lot of research efforts. The goal is to exploit any hidden properties to increase the efficiency of existing algorithms, as well as to propose new algorithms that are more natural to the structure that a graph exhibits. Topological properties are either global such as the graph’s diameter or local such as the structure of the neighborhood of a vertex. A property that has been investigated recently is the δ-hyperbolicity (negative curvature) of a graph since it has a major impact on its underlying topology [24]. The δ-hyperbolicity (or simply hyperbolicity) measures how close the metric structure of the graph is to the metric structure of the tree [18]. Generally, the smaller the hyperbolicity the closer the graph is to a tree and, as a result, the hyperbolicity property is more evident. Many real-world networks show a tree-like structure with respect to their hyperbolicity [1, 3, 5, 7, 23, 24] . Trees and cliques are 0-hyperbolic, and accordingly are considered hyperbolic graphs. On the other hand, a cycle with n vertices is approximately n/4-hyperbolic and an n × n-grid is (n − 1)-hyperbolic.

In hyperbolic graphs, it was observed that the traffic heavily concentrates on a small set of vertices (the core of a graph) [24]. The core is defined using multiple measures such as the betweenness centrality, the eccentricity centrality, the closeness centrality, or any combination of these measures.

Global and local properties of a graph can be very different. For example, many networks that do not show a tree-like structure globally (i.e., using global analysis tools such as the δ-hyperbolicity) turned out to exhibit a tree-like structure when they are analyzed locally. This phenomenon was explained by the presence of a core-periphery structure [3, 24].

Let G = (V, E) be a connected graph with distance function d defined as the number of edges on the shortest path between a pair of vertices. Formally, the δ-hyperbolicity can be defined using the four-point condition definition [18].

Definition 1

Given a graph G = (V, E) and four vertices x, y, u, and v ∈ V  with d(x, y) + d(u, v) ≥ d(x, u) + d(y, v) ≥ d(x, v) + d(y, u), the hyperbolicity of the quadruple x, y, u, v denoted as hb(x, y, u, v) is defined as

$$\displaystyle \begin{aligned} hb(x,y,u,v) = \frac{d(x, y) + d(u, v) - (d(x, u) + d(y, v))}{2}. \end{aligned}$$

and the δ-hyperbolicity of the graph is

$$\displaystyle \begin{aligned} hb(G)= \max_{x,y,u,v \in V} hb(x,y,u,v). \end{aligned}$$

Finding the value of the δ-hyperbolicity is computationally very expensive even when distributed computing techniques are employed [2]. From the four-point definition, it is clear that the obvious algorithm requires O(n 4) time, where n is the number of vertices. The limitation of this algorithm is twofold. First, for large networks, this algorithm is impractical and almost unachievable. Second, calculating the hyperbolicity in dynamic networks, in which vertices constantly join and leave, is costly even for small to medium size networks.

The hyperbolicity of a graph is highly affected by its topology. Any modification of the graph may dramatically change its topology and accordingly its hyperbolicity. For example, consider the removal of one edge in a cycle graph G (with hb(G) = n/4). Upon this modification, the new hyperbolicity becomes hb(G) = 0. The best known exact algorithm for calculating the hyperbolicity requires O(n 3.69) time using the (max,min) matrix multiplication [16]. Multiple algorithms were proposed to reduce the size of the input graph. In [14], the authors propose exact and approximation algorithms that restrict the number of considered quadruples to those ones that may maximize the δ-hyperbolicity value. Moreover, they show that the hyperbolicity of a graph equals the maximum hyperbolicity over its bi-connected components.

In [6], we propose a method that reduces the size of the input graph to only a subset that is responsible for maximizing its hyperbolicity by analyzing the local dominance relationship between vertices. Furthermore, we show that the hyperbolicity of a graph can be found in a set of quadruples that are in close proximity. In this work, we empirically show that this set concentrates in the core of the graph. We adopt two core definitions each of which represents a different notion of vertex coreness [21]. The minimum-cover-set core, which can be identified as a transport-based core-periphery structure, and the k-core, which can be identified as a density-based core-periphery structure. Our observations have crucial implications on computing the δ-hyperbolicity of large graphs. We apply our ideas to a set of real-world and artificial networks, and we show their suitability to compute the δ-hyperbolicity value with only a fraction of the original calculations.

This paper is organized as follows. First, some basic notations that are used in this work and the basic concept of δ-hyperbolicity are introduced. Section 1.3 describes the network datasets used in this paper and presents a summary of their parameters. In Sect. 2, we present two methods that can reduce the number of vertices and quadruples needed to compute the δ-hyperbolicity of graphs: the dominance relationship and the p-δ-hyperbolicity. Then in Sect. 3, we show that the δ-hyperbolicity of a graph concentrates in its core. The conclusions and future work are discussed in Sect. 5.

1.1 Preliminaries

All graphs in this work are connected, finite, unweighted, and undirected. For a graph G = (V, E), the distance function d between every pair of vertices x and y ∈ V , denoted as d(x, y), is defined as the number of edges in the shortest (x, y)-path between them. The interval I(x, y) between a pair of vertices x and y includes all vertices on the shortest paths between x and y, that is, I(x, y) = {u ∈ V : d(x, u) + d(u, y) = d(x, y)}.

The eccentricity ecc(x) of a vertex x is the distance between x and a farthest vertex y. The minimum and the maximum values of the eccentricity represent the graph’s radius rad(G) and diameter diam(G), respectively. The center C(G) of a graph G is formed by the set of vertices with minimum eccentricity, that is, C(G) = {x ∈ V : ecc(x) = rad(G)}. The neighborhood of a vertex x is defined as N(x) = {y ∈ V : xy ∈ E}, and the degree of a vertex x is degree(x) = |N(x)|.

A subgraph G X  = (X, E X ), where X ⊆ V  and E X  = {xy ∈ E : x, y ∈ X}, is called the subgraph of G induced by X. An induced subgraph G X of a graph G is isometric if the distance between any pair of vertices in G X is the same as that in G.

1.2 δ-Hyperbolicity

Gromov [18] introduced the notion of hyperbolicity of metric spaces through several definitions (the definitions were shown to be equal up to a constant factor [13]). In this work, we use the four-point condition definition (Definition 1). A simple unweighted graph G = (V, E) naturally defines a metric space (V, d) on its vertex set V . In graphs, δ-hyperbolicity measures how close metrically the structure of a graph is to the structure of a tree.

The small δ-hyperbolicity property has been found in many real-world networks [5, 7, 23, 24]. It has been shown that the diameter of a graph represents an upper bound for its δ-hyperbolicity value.

Lemma 1 ([15, 25])

For any graph G with diameter diam(G) and hyperbolicity hb(G), hb(G) ≤ diam(G)/2.

In many real-world networks, it was observed that a small number of quadruples achieve the maximum hyperbolicity; therefore, the value of the average δ-hyperbolicity is also important [5, 7, 23]. The average δ-hyperbolicity, denoted as hb avg (G), is defined as

$$\displaystyle \begin{aligned} hb_{avg}(G) = \sum_{x,y,u,v \in V} hb(x,y,u,v) /\dbinom{|V|}{4}. \end{aligned}$$

1.3 Network Datasets

Throughout this work, we analyze a set of real-world networks belonging to various domains. Because computing the hyperbolicity is computationally expensive, we include a set of relatively small-sized networks. We also analyze several synthetic networks with some known structures of roughly same sizes as the real-world networks. All networks are unweighted and we ignore the directions of the edges. We analyze the largest connected component of each network. See Table 1 for a summary.

Table 1 Statistics of the analyzed networks: |V |: number of vertices, |E|: number of edges; |V C |: number of vertices in the largest component; |E C |: number of edges in the largest component; rad(G): graph’s radius; diam(G): graph’s diameter; \(\bar d\): average degree

Social Networks

We have examined the following four social networks: The Email network [29] that represents the e-mail interchanges between members of the university of Rovira i Virgili, Tarragona. The DUTCH-ELITE network [8] which is a network data on the administrative elite in the Netherland. In the DUTCH-ELITE network, vertices represent persons and organizations that are most important to the Dutch government (2-mode network). An edge connects two vertices if the person vertex belongs to the organization vertex. The Facebook network [32] represents the ego networks (the network of friendship between a user’s friends) of 10 people. Two vertices (users) are connected if they are Facebook friends. The EVA network [8] presents corporate ownership information as a social network. Two vertices are connected with an edge if one is the owner of the other.

Internet Networks

Each of those graphs represents the Autonomous Systems (AS) topology of the Internet. In each graph, a vertex represents an autonomous system, and two vertices are connected if the two autonomous systems share at least one physical connection. In this work, we examine three AS graphs: AS-Graph-97, AS-Graph-99-April, and AS-Graph-99-July [31] for which the data was collected during November 1997, April 1999, and July 1999, respectively.

Erdős Rényi Random Graphs

In an Erdős Rényi graph with n vertices, denoted by Erdős-Rényi(p), every two vertices are independently connected with a fixed probability p. Smaller values for p (\(1/n < p < \log (n)/n\) ) result in very sparse graphs. In contrast, larger p values yield dense graphs with very small diameters. Sparser Erdős Rényi graphs exhibit a clear core-periphery structure compared to dense Erdős Rényi graphs [3].

Since we are looking for graphs with large diameters to clearly see the potential of our method in calculating the hyperbolicity of a graph, we choose very small values for p. In our datasets, we include three Erdős Rényi graphs with equal number of vertices (n = 2500) and with p of 1.6/n, 2/n, and 8/n, respectively.

Power-Law Random Graphs

In a power-law graph, the degrees of the vertices follow (or approximate) a power-law distribution. In this work, we use a set of power-law graphs generated based on a variation of the Aiello-Chung-Lu model [4, 10]. This model produces a power-law random graph whose degree sequence is determined by a power-law with exponent β, where β is the power parameter. Smaller β values (β < 2) generate power-law graphs with cores that are denser and have smaller diameters compared to power-law graphs with higher β values [22].

Each power-law graph in the network datasets Power-Law(β) has 2500 vertices and a value β ∈{1.8, 1.9, 2, 2.7}.

Finally, we analyze multiple graphs that are expected to have different hyperbolic properties: the US-Airways transportation network [8] and the Power-Grid network [28], which represents the western United States power grid. Also we analyze two planar grid graphs: Planar-Grid(50 × 50) and Planar-Grid(1250 × 2).

In Table 2, we show the δ-hyperbolicity and the average δ-hyperbolicity of each network in the datasets. Table 2 shows that most real-world and artificial networks have small δ-hyperbolicity values. Note that the absolute value of the δ-hyperbolicity becomes meaningful when it is compared with other parameters of the graph such as its diameter [7]. Recall that half the diameter represents an upper bound for the δ-hyperbolicity.

Table 2 Hyperbolicity of each network in the network datasets: hb(G): hyperbolicity; hb avg (G): average hyperbolicity

2 δ-Hyperbolicity in Graphs

According to the definition of the δ-hyperbolicity of a quadruple, its value is not dependent on the distances among the vertex pairs; rather, it is affected by the topology present among the vertices. Even though the set of quadruples responsible for maximizing the value of the hyperbolicity has not been characterized, in this section, we present methods that can be used to eliminate vertices (and accordingly quadruples) that do not actively participate in increasing the δ-hyperbolicity of a graph [6].

2.1 δ-Hyperbolicity and Dominated Vertices

There are a few existing methods that aim at reducing the size of the graph without affecting its hyperbolicity. Some of those methods are suggested by the following lemmas:

Lemma 2 ([15])

Given a graph G = (V, E) and a vertex x  V  with degree(x) = 1, hb(G) = hb(G −{x}).

Lemma 3

Let G = (V, E) be a graph, x, y, w be a triangle in G, and let x be a vertex with degree(x) = 2. Then hb(G) = hb(G −{x}).

Proof

The proof is formally analogous to the proof of Lemma 8 in [14]. □

These cases can be generalized using the dominance relationship among vertices.

Definition 2

Given a graph G = (V, E) and a vertex x ∈ V , x is said to be dominated by a neighboring vertex y if N(x) ⊆ N(y).

Note that a vertex with degree 1 is also dominated by its only neighbor.

Lemma 4 ([12])

Let x  V  be a dominated vertex in a graph G = (V, E). The subgraph G Vx is isometric.

Proof

Let G = (V, E) be a graph and let x ∈ V  be a vertex that is dominated by a neighboring vertex y. Consider the shortest path ρ(u, v) between a pair of vertices u and v such that x ∈ ρ(u, v). That is, d(u, v) = d(u, x) + d(x, v). Let x′∈ N(x) be the vertex closest to u, then d(u, x) = d(u, x′) + 1. Since N(x) ⊆ N(y), then d(u, y) = d(u, x′) + 1 = d(u, x). Similarly, d(y, v) = d(x, v). Therefore, d(u, v) = d(u, y) + d(y, v) for any pair u and v. This shows that the distance d(u, v) is not affected by the removal of x. That is, d(u, v) in G equals that in G Vx. □

Next we analyze the effect of removing dominated vertices on the upper bound of the hyperbolicity (the diameter of the graph) and the value of the hyperbolicity.

Lemma 5

Let G = (V, E) be a graph, and let x  V  be a vertex dominated by a neighbor vertex y. Then either diam(G Vx) = diam(G) or diam(G Vx) = diam(G) − 1

Proof

Let G = (V, E) be a graph and let x ∈ V  be a vertex that is dominated by a neighboring vertex y. If x is not a part of any diametral pair, then diam(G Vx) = diam(G). Now assume that the pair (x, x′) is a diametral pair in G, that is, d(x, x′) = diam(G). Let ρ(x, x′) = x 1, x 2, …, x k where x 1 = x, x k  = x′, and k = d(x, x′) + 1. x 2 ∈ N(x) and d(x 2, x′) = d(x, x′) − 1 = diam(G) − 1. □

Lemma 6

Let G = (V, E) be a graph, and let x  V  be a vertex dominated by a neighbor vertex y. Then \(hb(G) \leq \max \{1,hb(G_{V - x})+ \frac {1}{2}\}\).

Proof

Let x and y be two vertices defined as above and let G X be the subgraph induced by the set X = {x}∪ N(x). Consider a vertex z ∈ G X , and three vertices u, v, wG X . We show that \(hb(G) \leq \max \{1,hb(G_{V-x})+ \frac {1}{2}\}\) holds for any quadruple that involves vertex x. We consider the cases when all the other three vertices in a quadruple belong to G X , when all the other three vertices do not belong to G X , when a quadruple consists of x, y, and any two vertices ∉G X , and when a quadruple consists of x, y, a vertex in G X , and a vertex ∉G X .

First, hb(G X ) ≤ 1 since diam(G X ) ≤ 2 (Lemma 1).

Second, \(hb(x,u,v,w) \leq hb(y,u,v,w) + \frac {1}{2}\) for any three vertices u, v, wG X . Assume 2hb(y, u, v, w) = d(y, u) + d(v, w) − d(y, v) − d(u, w). Let A = d(x, u) + d(v, w), B = d(x, v) + d(u, w), and C = d(x, w) + d(u, v). When A ≥ B ≥ C, we have 2hb(x, u, v, w) = d(x, u) + d(v, w) − d(x, v) − d(u, w). Since d(y, u) ≤ d(x, u) ≤ d(y, u) + 1 and d(y, v) ≤ d(x, v) ≤ d(y, v) + 1, then \(hb(x,u,v,w) \leq (d(y,u) + 1 + d(v,w) - d(y,v) - d(u,w))/2 \leq hb(y,u,v,w) + \frac {1}{2}\). When B ≥ A ≥ C, 2hb(x, u, v, w) = d(x, v) + d(u, w) − d(x, u) − d(v, w). Also d(y, u) ≤ d(x, u) ≤ d(y, u) + 1 and d(y, v) ≤ d(x, v) ≤ d(y, v) + 1, and by triangle inequality, d(u, w) ≤ d(y, u) + d(y, w) and d(v, w) ≤ d(y, v) + d(y, w). Then 2hb(x, u, v, w) = d(y, v) + 1 + d(y, u) + d(y, w) −d(y, u) −d(y, v) −d(y, w), and we get \(hb(x,,u,v,w) \leq \frac {1}{2}\). Finally, when C ≥ A ≥ B, 2hb(x, u, v, w) = d(x, w) + d(u, v) − d(x, u) − d(v, w). By triangle inequality, we get \(hb(x,u,v,w) \geq (1 + d(v,w) - d(v,w))/2 = \frac {1}{2}\).

Third, \(hb(x,y,u,v) \leq \frac {1}{2}\) for any two vertices u, vG X . Consider the following three distance sums for the quadruple (x, y, u, v): A = d(x, y) + d(u, v), B = d(x, u) + d(y, v), and C = d(x, v) + d(y, u). When A ≥ B ≥ C, we have 2hb(x, y, u, v) = d(x, y)+d(u, v)−d(x, u)−d(y, v) ≤ 1+d(y, u)+d(y, v)−d(y, u)−d(y, v) since d(u, v) ≤ d(y, u) ≤ d(y, v). Therefore, \(hb(x,y,u,v) \leq \frac {1}{2}\). When B ≥ A ≥ C, we have 2hb(x, y, u, v) = d(x, u)+d(y, v)−d(x, y)−d(u, v) ≤ 1+d(y, u)+d(y, v)−1−d(u, v) = 0. Finally, when C ≥ A ≥ B, we have 2hb(x, y, u, v) = d(x, v)+d(y, u)−d(x, y)−d(u, v) ≤ 1+d(y, v)+d(y, u)−1−d(u, v) = 0.

Fourth, we obtain similarly that \(hb(x,y,z,u) \leq \frac {1}{2}\) for any vertex z ∈ G X and any vertex uG X . □

To be able to obtain all quadruples responsible for maximizing the hyperbolicity, we do not consider cases in which vertices become dominated after other vertices have been removed. For example, in Fig. 1, vertex v is dominated by vertex u, which is not dominated by any other vertex. The hyperbolicity of the original graph G is one, and the hyperbolicity of the graph G Vv is also one. However, after removing vertex v, vertices u, x, and y become dominated by vertex w and the hyperbolicity of G V −{u,x,y} is zero.

Fig. 1
figure 1

A graph with dominated vertices. Vertex v is dominated by vertex u

For each graph in the datasets, we report the percent of the dominated vertices. We also differentiate between dominated vertices of degree 1, degree 2, and degree > 2. The results are listed in Table 3. In almost all networks, the dominated vertices have degrees at most two. This suggests that finding those vertices is computationally easier than what is implied by Definition 2. Also, in all networks, the hyperbolicity was preserved after removing all dominated vertices. This result is even better than what is suggested in Lemma 3.

Table 3 Statistics of dominated vertices

2.2 δ-Hyperbolicity and Restricted Path Lengths

Hyperbolicity in some sense is related to the uniqueness of shortest paths. In trees, which are 0-hyperbolic, there is a single shortest path among every vertex pair. While this property is mostly absent in general graphs, the core-periphery property, which has been recognized in many networks, suggests that when two vertices are relatively far from one another (with respect to their distance), all shortest paths that connect them pass the core of the graph. Let x and y be two vertices that are sufficiently far from one another. To some extent, the shortest path between them can be considered unique even though multiple shortest paths may exist between any pair of intermediate vertices u, v ∈ I(x, y). Applying this idea on sufficiently far vertices in a quadruple, we observe the following (see Fig. 2).

Fig. 2
figure 2

Illustration of Lemma 7

Lemma 7

Let G = (V, E) be a graph and x, y, u, v  V  be four distinct vertices. Consider four vertices x′, y′, u′, v′ such that x  I(x′, y) ∩ I(x′, u) ∩ I(x′, v), y  I(y′, x) ∩ I(y′, u) ∩ I(y′, v), u  I(u′, x) ∩ I(u′, y) ∩ I(u′, v), and v  I(v′, x) ∩ I(v′, y) ∩ I(v′, u). Then we have hb(x′, y′, u′, v′) = hb(x, y, u, v).

Proof

Assume that 2hb(x, y, u, v) = d(x, y) + d(u, v) − (d(x, u) + d(y, v)). Accordingly, 2hb(x′, y′, u′, v′) = d(x′, y′) + d(u′, v′) − (d(x′, u′) + d(y′, v′)). By the assumption above, we obtain 2hb(x′, y′, u′, v′) = d(x, y)+d(x, x′)+d(y, y′)+d(u, v)+d(u, u′)+d(v, v′)−d(x, u)−d(x, x′)−d(u, u′)−d(y, v)−d(y, y′)−d(v, v′) = 2hb(x, y, u, v). □

Remark 1

Using Lemma 7, we conclude that hb(x′, y, u, v) = hb(x, y, u, v), hb(x, y′, u, v) = hb(x, y, u, v), hb(x, y, u′, v) = hb(x, y, u, v), and hb(x, y, u, v′) = hb(x, y, u, v).

From the lemma and the remark above, it follows that the δ-hyperbolicity of a quadruple may be increased only because some intermediate quadruple has a higher δ-hyperbolicity (this was also observed experimentally in [3]). Accordingly, the δ-hyperbolicity of some graphs (especially the ones with clear core-periphery dichotomy) may be found in quadruples that are in close proximity, and it is sufficient to consider those quadruples when computing the graph’s hyperbolicity. Thus, we consider a variation of the definition of the δ-hyperbolicity that restricts the set of considered quadruples to those that are in close proximity.

Definition 3

Let G = (V, E) be an undirected and unweighted graph, diam(G) be its diameter, and x, y, u, v be vertices in V  with d(x, y) ≤ p, d(x, u) ≤ p, and d(x, v) ≤ p, where 0 ≤ p ≤ diam(G). Also let d(x, y) + d(u, v) ≥ d(x, u) + d(y, v) ≥ d(x, v) + d(y, u) be the three distance sums defined over the four vertices x, y, u, v in a nonincreasing order. The p-δ-hyperbolicity of the quadruple x, y, u, v denoted as hb p (x, y, u, v) is defined as

$$\displaystyle \begin{aligned} hb_p(x,y,u,v) = (d(x,y) + d(u,v) - (d(x,u) + d(y,v)))/2. \end{aligned}$$

and the p-δ-hyperbolicity of the graph is

$$\displaystyle \begin{aligned} hb_p(G) = \max_{x,y,u,v \in V} hb_p(x,y,u,v). \end{aligned}$$

The choice of distance p is critical. When p = 0, hb p (G) = 0 since we get a set of singletons. This value can be very far from the value of the hyperbolicity of the graph. When p = diam(G), hb p (G) = hb(G) since we include all possible quadruples. Generally, when 0 < p < diam(G), hb p (G) ≤ hb(G). For some graph types such as an n × n grid, the value of the hyperbolicity equals the hyperbolicity of the quadruple with vertices at maximum pair-wise distance. Thus restricting the distances among vertex pairs to any p < diam(G) results in a hb p (G) < hb(G). In contrast, in a 2 × n grid, hb p (G) = hb(G) when p = 2. Examples of both cases are provided in the datasets.

Table 4 and Fig. 3 show the p-δ-hyperbolicity of each graph in the datasets. The table lists hb p (G) for p = ⌈rad(G)/2⌉ and p = p max, where p max is the maximum distance p that achieved hb p (G) = hb(G). Table 4 also shows the decrease in the number of quadruples (compared to the total number of quadruples used to compute hb(G)). In almost all graphs, not only p max is smaller than the diameter of each network but also p max ≤ rad(G). The distance p max needed in the network Erdős-Rényi(8) is 6 = rad(G) + 1. This is probably due to the lack of a core in this type of graphs (denser random Erdős Rényi graphs) [3]. It is also interesting to observe that p max = 2δ in almost all networks. Figure 3 shows that the hyperbolicity increases with distance until a certain point (p max) and then remains the same.

Fig. 3
figure 3

p-δ-Hyperbolicity. (a) Real-world networks. (b) Synthetic networks

Table 4 p-δ-Hyperbolicity

To exploit the p-δ-hyperbolicity, it is sufficient to consider quadruples within the graph’s core, which may not be unique. In [7], it was observed that the shortest path (or paths) between distant vertices tends to include vertices in the center of the graph C(G) (C(G) = u ∈ V : ecc(u) = rad(G)).

Proposition 1 ([7])

Let G be a δ-hyperbolic graph and x, y be arbitrary vertices of G. If d(x, y) > 4hb(G) + 1, then on any shortest (x, y)-path there is a vertex w with \(ecc(w)<\max \ \{ecc(x),ecc(y)\}\).

Even though a distance of 4hb(G) + 1 may exceed the diameter of the graph in most networks (because of the small-world property), it was shown experimentally in [7] that even pairs with small distances include a vertex in the center (or close to the center).

Here we compute the p-δ-hyperbolicity considering only vertices within the center for some of our networks and with a distance p that is equal to p max (see Table 4). The results are listed in Table 5. The table shows that even though for some networks the p-δ-hyperbolicity is not equal to the hyperbolicity of the graph hb(G), it achieves a value that is very close.

Table 5 p-δ-hyperbolicity and the center of the graph

3 δ-Hyperbolicity and the Core-Periphery Structure

It was observed in [3] throughout a set of real-world and artificial networks that a tree-like structure becomes less evident below a certain size scale; specifically, within the core of the network. That is, quadruples whose vertices belong to the core part of the network have high hyperbolicity values while quadruples with vertices that belong to the peripheral part do not actively participate in increasing the hyperbolicity value (they affect hb avg (G) but not hb(G)). This confirms that quadruples like the ones described in Lemma 7 and Remark 1 exist in many networks due to the core-periphery structure in those networks. In this section, we exploit this observation for computing the value of the δ-hyperbolicity by considering only quadruples in the core of a graph.

Recently, two core-periphery structure notions have been discussed in the literature. The transport-based core-periphery structure, which was developed based on intuition from transportation networks, and the density-based core-periphery structure, which was developed based on intuition from social networks [21]. A transport-based core is central to the network (in terms of its betweenness), while a density-based core is densely connected and connected to a sparse periphery.

In this paper, we use two core definitions: the minimum-cover-set core, which can be classified as a transport-based core, and the k-core, which can be classified as a density-based core.

Let core be the set of core vertices in a graph G. The core of G, denoted by G core , is the subgraph of G induced by the set core. We denote the minimum-cover-set core by \(G_{core}^m\) and the k-core by \(G_{core}^k\). We compute the hyperbolicity of the core of each network in the datasets and compare it to the hyperbolicity of the graph. Note that we exclude the two planar grid networks from the analysis in this section because of their lack of a meaningful core.

3.1 The Minimum-Cover-Set Core

In [7], the authors show that the traffic tends to concentrate on vertices with small eccentricity (vertices in and close to the graph’s center). Accordingly, they introduce a core identification model (named the minimum-cover-set model) based on the eccentricity and the betweenness of vertices.

The minimum-cover-set core of a graph G, denoted by \(G_{core}^{m}\), is the smallest set of vertices that is sufficient to circulate the traffic between distant vertices in a graph [7]. This set includes vertices that have small eccentricities, are close to the graph’s center, and have high betweenness. The betweenness of a vertex x is the number of vertex pairs that have x on the shortest path between them.

First, in a priority list T, vertices are ranked according to three parameters: the eccentricity, the distance to the center of the graph, and the betweenness. Second, a vertex at the top of T will be added to the core if it is in the shortest path between some vertex pair x, y. In this case, pair x, y is covered by the core and will not be considered again.

Table 6 lists basic statistics about the minimum-cover-set core for each network in the datasets. Table 6 shows that in most real-world networks, the core size (number of vertices) does not exceed 35% of the number of vertices in the original graph. The only exception is networks Email and Power-Grid, which is not expected to present a concise core. In the three Erdős Rényi graphs, the conciseness of the cores seems to correlate with the sparsity of the network. The network Erdős Rényi(8), which has the highest density, does not have a well-defined core-periphery structure.

Table 6 δ-Hyperbolicity and the graph’s minimum-cover-set core \(G_{core}^{m}\)

It is clear from Table 6 that while the diameter of each minimum-cover-set core is slightly smaller than the diameter of the network, its hyperbolicity (\(hb(G_{core}^m)\)) is equal to the hyperbolicity of the original network (hb(G)). The exception is networks Erdős Rényi(8) and Power-Law(2). Table 6 also shows the decrease in the number of considered quadruples (compared to the number of quadruples needed to compute hb(G)) and the decrease in the running time (compared to the running time needed to compute hb(G)). For example, in the Facebook network, there is a 99.9 decrease in the number of considered quadruples. The running time needed to compute the hyperbolicity for the original Facebook network was about 18 h, but it took only few seconds to compute it for the minimum-cover-set core \(hb(G_{core}^m)\).Footnote 1 In the network Power-Grid, there is a 79.4 decrease in the number of quadruples (the time needed to compute hb(G) and \(hb(G_{core}^m)\) is about 31 and 15 h, respectively).

3.2 k-Core

The k-core decomposition [26] provides a way to decompose a graph that allows the identification of interesting structural properties that are not captured by other simple structural measures. Unlike the δ-hyperbolicity, the k-core decomposition is not intended to be a tree-like measure, yet in [3], the authors find the k-core of a graph to be an important part of its hyperbolic structure.

The k-core of a given graph G = (V, E), denoted by \(G_{core}^k\), is a maximal connected subgraph G core so that degree(x) is at least k for all x ∈ G core . The core of maximum order (k max) is the main core. A vertex x has core number k if it belongs to the k-core, but not the k + 1-core. All vertices with core number k form the k-shell. Parameter k refers to the depth of the core (higher k values represent deeper cores). The resulting cores are nested, and each core is not necessarily a connected subgraph. The k-core decomposition can be implemented in linear time which makes it applicable to very large graphs [9].

In Table 7, we list two different core numbers (depths): k max which is the main core and k δ which is the maximum k such that the core subgraph \(G_{core}^k\) achieves a hyperbolicity value that is equal to hb(G). We also compute k min (not listed in the table) which is the maximum k such that \(G_{core}^k = G\). Table 7 also lists the size and the diameter of the k max-shell and the size, the diameter, and the hyperbolicity of the k δ -shell of each network.

Table 7 δ-Hyperbolicity and the graph’s k-core \(G_{core}^{k}\)

Table 7 shows that the \(G_{core}^{k_{\delta }}\) has smaller diameter and equal hyperbolicity compared to each original network. In all networks, k min = 1, and k δ is always greater than k min which suggests that the quadruples responsible for increasing the value of the δ-hyperbolicity concentrate in a deeper core in the network. Note that in some networks, all vertices belong to the same core (k min ≈ k max). Some networks such as the US-Airways and the Email have k δ  ≈ k max which indicates a tree-like structure that concentrates within the deep core of the network.

4 Case Studies

In this section, we apply the idea of calculating the δ-hyperbolicity of the core to two larger real-world networks for which calculating the exact value of the δ-hyperbolicity is computationally expensive on a personal computer and for which the values of the δ-hyperbolicity are known [1]. The first network is a biological network. The second network is an Internet graph.

4.1 Biological Network

This biological network represents the protein and genetic interactions in human genus [27]. It consists of 16,711 vertices (proteins and genes) and 115,406 edges linking each pair of interacting proteins or genes. We focus on the largest connected component in the network which includes 16,635 vertices and 115,364 edges. The network has diameter 10, radius 5, and average path length 2.87. The δ-hyperbolicity of this network is 2 [1].

The minimum-cover-set core of the network has 6546 vertices (39% of the number of vertices of the original graph), 84,889 edges, and diameter 8. The size of the core subgraph allows us to calculate its δ-hyperbolicity and compare it with that of the original network. The hyperbolicity of the minimum-cover-set core is 2, which was calculated with 97.6% less quadruples.

For the k-core of the network, k min = 1, k max = 45, and k δ  = 14. The k-core (which corresponds to the k δ -shell) consists of 3053 vertices, which represents only 18% of the number of vertices in the original graph, and 64,085 edges with a diameter of 4. The computation of the δ-hyperbolicity for the k-core requires 99.8% less quadruples compared to the number of quadruples required to calculate the δ-hyperbolicity for the original network.

4.2 AS-Graph

This network depicts the Internet Autonomous Systems (AS) relationships collected by the Cooperative Association for the Internet Data Analysis (CAIDA) [11] during June 2012. The data was derived from BGP table snapshots taken at 8-h intervals over a period of 5 days.

The network includes 41,203 vertices and 121,309 edges (the average degree is 5.9). The diameter and radius of the network are 10 and 5, respectively. Also, the δ-hyperbolicity of the network is 2 [1]. Because of the size of the network, we remove all dominated vertices before calculating the minimum-cover-set core. About 57% of the vertices in the original network are dominated vertices, and 36% of which has a degree of one.

After removing all dominated vertices, the new network has 17,760 vertices, 157,152 edges, and the diameter is 9. We extract the minimum-cover-set core of this network which consists of 6576 vertices and 45,092 edges with a diameter of 8. Compared to the original network, the size of the core is only 16%, yet it has a δ-hyperbolicity of 2.

We also compute the k max-core and the k δ -core for this network. The k max-shell has 55 vertices and a diameter of 2, which is too small to achieve the hyperbolicity of the original network. The k δ -shell (with a hyperbolicity equal to the hypberbolicity of the original network) has 3873 vertices, 56,054 edges, and a diameter of 5.

5 Conclusions and Future Work

This paper describes a method of identifying quadruples that maximize the hyperbolicity using the dominance relationship between vertices. Also it demonstrates an interesting property of the δ-hyperbolicity in networks, which is its realization in quadruples with vertices that are within relatively close proximity and that are close to the graph’s core.

Restricting the calculation of the δ-hyperbolicity to some core of the network enables the computation of its value for large networks. Even though the hyperbolicity of the core may not resemble the exact value of the hyperbolicity of the graph, it provides a reasonable approximation. A key issue that needs to be considered when applying the idea of calculating the hyperbolicity within the core is the type of the network. Restricting hyperbolicity calculation within the core offers a tremendous gain in calculation time for networks with clear-cut core-periphery structures (more concise cores) including social and biological networks. This may not be the case for networks that lack a well-defined core such as some transportation networks and peer-to-peer networks. Moreover, it would be interesting to compare the values of the hyperbolicity within other core definitions. For example, the core that results from including vertices with the highest closeness centrality and/or betweenness centrality.

An interesting focus of subsequent research is the development of a local algorithm that calculates the p-δ-hyperbolicity of very large graphs, and the estimation of a p value that guarantees a p-δ-hyperbolicity that is close (if not equal) to the hyperbolicity of the original graph.