1 Introduction

With the rapid development of the Internet, social networks have become an essential part of everyday life. Due to the continuous evolution of the network and the rapid growth in the number of nodes, most researchers focus on the identification of critical nodes [1,2,3] and consider nodes with a high degree of importance as critical nodes. The earliest method used to identify the importance of nodes was node degree centrality [4], which is simple and easy to determine, but less accurate as only local information is considered. \(k\)-shell decomposition theory [5] was proposed by Kitsak et al. and was able to determine the core position of nodes in the network, but only a coarse-grained result was obtained. The importance of a node can also be measured based on paths, such as closeness centrality [6] and betweenness centrality [7], where closeness centrality considers the node that is closest on average to the rest of the nodes in the network to be the important node, while betweenness centrality considers the importance of a node to depend on how often the node appears on the shortest path of a node pair that does not contain that node. When addressing the problem of maximizing the influence of seed nodes through a node-based attack strategy, Wang et al. [8] employed various indicators for evaluating node importance, as previously discussed. This underscores the utility of identifying critical nodes in tackling real-world issues.

The rapid growth in the number of nodes has led to a large increase in the number of edges. In comparison, there has been less research on critical edges. The earliest identification method was proposed by Granovetter in 1973, who argued that weakly connected edges might be more important than strongly connected edges, attracting the attention of many researchers [9,10,11,12]. Girvan and Newman [13] proposed the edge betweenness centrality, based on betweenness centrality, to represent the proportion of the shortest paths in the network that pass through an edge, with larger values indicating that the edge is more important in the network; however, this method requires calculating the number of shortest paths for each pair of nodes, which is time-consuming. Barrat et al. [14] argued that the importance of an edge is related to the importance of the nodes at both of its ends, so the importance of an edge is expressed by the product of the degrees of its two end nodes, but the method is strongly influenced by the node degree values. In order to study the importance of edges through the densities of subgraphs, researchers have proposed many models such as \(k\)-truss [15], \(k\)-core [15], clique [16], and so on. A clique is a complete subgraph induced by a subset of the vertices, i.e., there is an edge between any two vertices inside the subgraph. Cliques impose stringent requirements on subgraphs, whereas \(k\)-core and \(k\)-truss are relaxed clique structures with simpler requirements, so the latter two are widely used in practical applications. In comparison, \(k\)-truss is an extension of \(k\)-core based on triangles, which considers the strength of edges by introducing support [15] and defines edges more strictly, whereas \(k\)-core is just a simple edge connection relation that takes node degree as the only factor affecting the dense subgraph, without emphasizing whether the relationship between any two nodes is tight or not.
Kanwar et al. [17] introduced an edge centrality metric, BC_DCN, which is derived from a synthesis of betweenness centrality, degree, and common neighborhood. However, this metric may inadvertently undervalue the significance of intra-community edges. Wang et al. [18] posited that the diffusion capability of an edge can be quantified by the influence exerted by the nodes at its termini. Consequently, they introduced an index, denoted as Inf, to measure edge importance based on node influence. Nevertheless, this index is disproportionately swayed by the influence of the nodes at both ends. Edge-based attacks have been proven to be disruptive to the process of information diffusion [19]. Therefore, Wang et al. [20] applied edge-based attack metrics in the process of measuring the influence of seed nodes to maximize influence. This illustrates that the detection of critical edges is of paramount importance.

The above researchers studied critical nodes and critical edges independently, ignoring the mutual influence between nodes and edges in the network. Since there are often correlations between critical nodes and critical edges in real networks, this paper introduces the \(k\)-sup structure and \(k\)-sup-based critical subgraph to identify both critical nodes and critical edges. Furthermore, the paper proposes an importance indicator termed \(\text {supEI}\) based on \(k\)-sup structure and substantiates the rationality and effectiveness of the indicator through empirical verification.

This paper is organized as follows. In Sect. 2, we introduce the basic concepts of social networks. The \(k\)-sup structure and the definition of critical subgraphs are presented in Sect. 3. In Sect. 4, we propose a critical edge detection method based on critical subgraphs. In Sect. 5, we evaluate the effectiveness of the proposed method through experiments. Finally, a summary and some prospects are given in Sect. 6.

2 Basic concepts of social networks

First, we introduce a set of fundamental concepts that will form the basis of our analysis.

Definition 1

A subgraph of a graph \(G = (V,E)\) is defined as a graph \(G'= \left( V',E' \right)\), satisfying \(E' \subseteq E\) and \(V' \subseteq V\).

Triangular relationship structures are often found in social networks: a triangle indicates that two connected nodes share a common neighbor, and the number of such triangles reflects the strength of the relationship between the two nodes.

Definition 2

([15]) The support of an edge \(e \in E\) in a graph \(G = (V,E)\), denoted by \(sup(e,G)\) (or simply written as \(sup(e)\)), is defined as the number of triangles containing \(e\) in \(G\).

In a social network, the support of an edge measures the strength of the relationship between two persons by emphasizing the number of friends they have in common.

Example 1

A social network \(G\) is shown in Fig. 1.

Fig. 1 A social network \(G\)

The support of each edge can be computed as follows.

\(sup\left( e_{4,7} \right) = sup\left( e_{7,8} \right) = sup\left( e_{4,8} \right) = sup\left( e_{6,9} \right) = sup\left( e_{2,9} \right) = sup\left( e_{5,6} \right) = sup\left( e_{2,3} \right) = sup\left( e_{3,5} \right) = 1,sup\left( e_{2,6} \right) = sup\left( e_{2,5} \right) = 2\)
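The support computation in Example 1 can be sketched in a few lines of Python. The edge list below is an assumption: it contains only the ten edges whose supports are listed above, because Fig. 1 is not reproduced in the text, and the helper names `build_adjacency` and `edge_support` are illustrative.

```python
# Hypothetical edge list: the ten edges whose supports are listed in Example 1.
# Any further edges that Fig. 1 may contain are not modeled here.
EDGES = [(4, 7), (7, 8), (4, 8), (6, 9), (2, 9),
         (5, 6), (2, 3), (3, 5), (2, 6), (2, 5)]


def build_adjacency(edges):
    """Return a dict mapping each node to the set of its neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def edge_support(edges):
    """sup(e): number of triangles containing e, which equals the number
    of common neighbors of the two endpoints of e."""
    adj = build_adjacency(edges)
    return {(min(u, v), max(u, v)): len(adj[u] & adj[v]) for u, v in edges}


support = edge_support(EDGES)
```

Under this assumed edge list, `support[(2, 6)]` and `support[(2, 5)]` are 2 and every other entry is 1, in agreement with the values listed in Example 1.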

Finding the \(k\)-truss structure in a social network helps to find the cohesive groups in that network.

Definition 3

([15]) Given an unweighted undirected graph \(G\), \(T_{k}\) is called the \(k\)-truss of \(G\) (\(k \ge 2\)), denoted by \(T_{k} = \left( V_{T_k},E_{T_k} \right)\), if \(T_{k}\) satisfies the following conditions:

  1. \(sup(e,T_{k}) \ge (k - 2)\) for every edge \(e \in E_{T_k}\).

  2. \(T_{k}\) is the maximal subgraph that satisfies condition 1, i.e., any supergraph \(T' \supset T_{k}\) is not a \(k\)-truss.

  3. No isolated node can be found in \(T_{k}\).

The \(k\)-truss subgraph with the largest value of \(k\) in \(G\) is denoted as \(k_{\max }\)-truss.

The \(k\)-truss subgraph ensures that the relationship between any two nodes in this subgraph reaches a certain strength.

Example 2

Figures 2 and 3 show the \(k\)-truss subgraphs for all \(k\) values of the social network \(G\) in Fig. 1.

Fig. 2 2-truss

Fig. 3 3-truss

As shown in Fig. 3, when \(k = 3\), all the edges satisfy the condition of \(k\)-truss, so these edges are retained. When \(k = 4\), there is no edge satisfying the condition 1 of \(k\)-truss, so \(T_{4}\) does not exist. Therefore, the \(k_{\max }\)-truss of Fig. 1 is 3-truss, and in this case, \(k_{\max } = 3\).
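A minimal sketch of how Definition 3 can be evaluated is given below. It uses the usual peeling idea (repeatedly dropping edges whose support inside the current subgraph is too small), which is one common way to obtain a \(k\)-truss and is not claimed to be the procedure of [15]. The edge list is again the hypothetical reconstruction from the supports listed in Example 1, and the function names are illustrative.

```python
# Hypothetical edge list reconstructed from the supports listed in Example 1.
EDGES = [(4, 7), (7, 8), (4, 8), (6, 9), (2, 9),
         (5, 6), (2, 3), (3, 5), (2, 6), (2, 5)]


def k_truss(edges, k):
    """Edge set of the k-truss: repeatedly drop edges whose support, measured
    inside the current subgraph, is below k - 2. Nodes that lose all their
    edges disappear with them, so no isolated node remains."""
    remaining = {frozenset(e) for e in edges}
    while True:
        adj = {}
        for e in remaining:
            u, v = tuple(e)
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
        weak = set()
        for e in remaining:
            u, v = tuple(e)
            if len(adj[u] & adj[v]) < k - 2:
                weak.add(e)
        if not weak:
            return remaining
        remaining -= weak


def k_max_truss(edges):
    """Largest k for which the k-truss is non-empty (2 if no triangle exists)."""
    k = 2
    while k_truss(edges, k + 1):
        k += 1
    return k
```

On the assumed edge list, `k_truss(EDGES, 3)` keeps all ten edges and `k_truss(EDGES, 4)` is empty, so `k_max_truss(EDGES)` is 3, which agrees with Example 2. For \(k = 4\), the two edges with support 2 in \(G\) are also removed, because their triangles vanish once the weaker edges are dropped.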

3 k-sup and critical subgraph

In Example 2, edges \(e_{2,6}\) and \(e_{2,5}\) are significantly more important in graph \(G\) in terms of support than the other edges, and therefore their shared endpoint, node 2, is also more important than the other nodes in \(G\). However, the 3-truss does not reflect the importance of node 2 and edges \(e_{2,6}\) and \(e_{2,5}\). Thus, the \(k\)-truss structure fails to capture certain important nodes and edges, because the \(k\)-truss structure diminishes the importance of edges in the original graph.

In fact, it is easy to verify that the supports of the edges in the original graph and subgraph have the following connection.

Property 1

Given an unweighted undirected graph \(G\) and a subgraph \(T\) of \(G\), we have \(sup(e,G) \ge sup(e,T)\) for any edge \(e\) in \(T\).

According to Property 1, the support of an edge in a subgraph is always less than or equal to its support in the original graph; therefore, the \(k\)-truss diminishes the importance of the edge in the original graph. For this reason, we propose the \(k\)-sup structure.

Definition 4

Given an unweighted undirected graph \(G\), \(S_k\) is the \(k\)-sup of \(G\) (\(k \ge 2\)), denoted by \(S_k = \left( V_{S_k},E_{S_k} \right)\), if \(S_k\) satisfies the following conditions.

  1. \(sup(e,G) \ge (k - 2)\) for every edge \(e \in E_{S_k}\).

  2. \(S_k\) is the maximal subgraph that satisfies condition 1, i.e., any supergraph \(S' \supset S_k\) is not a \(k\)-sup.

  3. No isolated node can be found in \(S_k\).

The \(k\)-sup structure with the largest value of \(k\) in \(G\) is denoted as \(k_{\max }\)-sup.

By Definition 4, we can see the \(k\)-sup structure only modifies the condition 1 of \(k\)-truss, i.e., the supports are calculated in the original graph instead of the subgraphs, ensuring that the supports do not change with the subgraphs and maintain the importance of edges in the original graph. In social networks, \(k\)-sup can better discover the critical edges that have strong relationships and possess the potential to form groups.

Example 3

The corresponding \(k\)-sup structures of Fig. 1 are shown in Figs. 4, 5, and 6.

Fig. 4 2-sup

Fig. 5 3-sup

Fig. 6 4-sup

As shown in Fig. 6, edges \(e_{2,5}\) and \(e_{2,6}\) satisfy condition 1 of Definition 4 when \(k = 4\), i.e., \(sup\left( e_{2,6},G \right) = sup\left( e_{2,5},G \right) = 2\), so they are retained. When \(k = 5\), \(S_{5}\) does not exist because there is no edge in \(G\) with support greater than or equal to 3. Therefore, the \(k_{\max }\)-sup is \(4\)-sup, where \(k_{\max } = 4\).
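Because the supports in Definition 4 are always measured in the original graph, the \(k\)-sup can be obtained with a single filtering pass, without the iterative peeling that a \(k\)-truss needs. The sketch below illustrates this on the hypothetical edge list reconstructed from Example 1; the function name `k_sup` is illustrative.

```python
# Hypothetical edge list reconstructed from the supports listed in Example 1.
EDGES = [(4, 7), (7, 8), (4, 8), (6, 9), (2, 9),
         (5, 6), (2, 3), (3, 5), (2, 6), (2, 5)]


def k_sup(edges, k):
    """Edge set of the k-sup: keep every edge whose support in the ORIGINAL
    graph is at least k - 2. Nodes without a kept edge are simply absent,
    so no isolated node remains."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return {frozenset((u, v)) for u, v in edges
            if len(adj[u] & adj[v]) >= k - 2}


four_sup = k_sup(EDGES, 4)
```

Under the assumed edge list, `four_sup` contains exactly the edges {2, 5} and {2, 6}, and `k_sup(EDGES, 5)` is empty, which matches Example 3.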

Combining Figs. 1 and 6, it can be seen that edge \(e_{2,6}\) forms part of two cliques {2,6,9} and {2,5,6}, i.e., 2 and 6 share two friends 5 and 9, and thus \(e_{2,6}\) is stronger and less likely to be broken compared to the other relationships. In fact, both relationships retained in Fig. 6, between 2 and 5 and between 2 and 6, have this characteristic.

As can be seen from Example 3, \(k\)-sup solves the problem of \(k\)-truss without losing any information on critical edges and identifies the strongest relationships in the social network. Therefore, the \(k\)-sup structure quantifies the strength of relationships between nodes by considering the support of edges in the graph. The support refers to the number of triangles that the given edge participates in, representing the number of common neighbors of the two endpoints of the edge. Thus, edges with higher support indicate stronger connections between nodes. By preserving the largest subgraph that satisfies the support condition, the \(k\)-sup method helps identify the strongest relationships without losing critical information, thereby enabling the analysis of graphs with different relationship strengths while retaining the relationships between nodes in the original network.

Similar to \(k\)-truss, \(k\)-sup has the following properties.

Property 2

The \(k\)-sup of \(G\) is a subgraph of the (\(k-1\))-sup.

Proof

Based on the definition of \(k\)-sup, for any edge \(e\) in the \(k\)-sup, we have \(sup(e,G) \ge (k - 2) \ge (k - 1) - 2\). Thus, by condition 2 of \(k\)-sup, any edge \(e\) must also be an edge of the (\(k-1\))-sup, i.e., the \(k\)-sup is a subgraph of the (\(k-1\))-sup. \(\square\)

Property 2 indicates that the larger \(k\), the fewer edges in \(k\)-sup and the stronger the relationship.

Example 4

Comparing Figs. 4, 5, and 6, it can be seen that 3-sup is a subgraph of 2-sup, and 4-sup is a subgraph of 3-sup.

Property 3 shows the relationship between \(k\)-truss and \(k\)-sup.

Property 3

The \(k\)-truss of \(G\) is a subgraph of the \(k\)-sup.

Proof

According to Definition 3, any edge \(e\) in the \(k\)-truss satisfies \(sup(e,T) \ge (k - 2)\). By Property 1, we have \(sup(e,G) \ge sup(e,T) \ge (k - 2)\). Thus, the edge \(e\) satisfies condition 1 of the \(k\)-sup. From condition 2 of the \(k\)-sup, it follows that \(e\) must be an edge of the \(k\)-sup. Consequently, the \(k\)-truss is a subgraph of the \(k\)-sup. \(\square\)

Property 3 illustrates that \(k\)-sup is able to retain more information than \(k\)-truss. In social network analysis, the stability of a social network refers to the ability of the social network to resist attacks. At present, most network attack strategies concentrate on node and edge attacks [8, 18, 20,21,22,23,24,25], but with the diversification and sophistication of online network attacks, network attack strategies also include various other strategies, such as hybrid network attacks. Hybrid network attacks have become the biggest security risk faced in social network analysis. A hybrid attack ranks all edges by importance in descending order and attacks the highest-ranked edge, completely destroying the connectivity between the two endpoints of that edge and preventing them from reconnecting through other paths. In order to identify the key edges in social networks and determine the key groups, the \(k_{\delta }\) critical subgraph is defined as follows.

Definition 5

Given a threshold \(\delta \in [0,1]\), the \(k_{\delta }\) critical subgraph of a graph \(G = (V,E)\) is defined as the \(k_{\delta }\)-sup of \(G\), where \(k_{\delta }\) represents the smallest integer \(k\) that satisfies \(\delta \le k/|V|\).

Example 5

As shown in Example 3, a hybrid attack on edge \(e_{4,7}\) or other edges may have less impact on the network than an attack on edge \(e_{2,5}\) or \(e_{2,6}\). When \(\delta \in \left( \frac{1}{4},\frac{3}{8} \right]\), we have \(k_{\delta } = 3\) and the corresponding \(k_{\delta }\) critical subgraph is shown in Fig. 5; when \(\delta \in \left( \frac{3}{8},\frac{1}{2} \right]\), we have \(k_{\delta } = 4\) and the corresponding \(k_{\delta }\) critical subgraph is shown in Fig. 6.
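The smallest integer \(k\) with \(\delta \le k/|V|\) in Definition 5 is \(\lceil \delta |V| \rceil\). The following sketch computes it with exact rational arithmetic to avoid floating-point rounding at interval boundaries; the function name `k_delta` is illustrative. Note that for very small \(\delta\) the result can fall below 2, the lower bound on \(k\) in Definition 4, and the sketch does not adjust for this case.

```python
from fractions import Fraction
from math import ceil


def k_delta(delta, num_nodes):
    """Smallest integer k with delta <= k / |V|, i.e. ceil(delta * |V|).
    Fraction keeps the product exact for rational thresholds."""
    return ceil(Fraction(delta) * num_nodes)
```

For the eight-node network of Example 5, `k_delta(Fraction(3, 8), 8)` gives 3 and `k_delta(Fraction(1, 2), 8)` gives 4, consistent with the intervals stated in the example.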

4 Critical edge detection based on critical subgraph

4.1 Edge importance and critical edge detection

To verify the rationality of critical subgraphs, a critical edge detection method based on critical subgraph is proposed in this section. Firstly, Zachary’s Karate Club network, shown in Fig. 7, is used as an example [16] for a case analysis.

Fig. 7 Zachary’s Karate Club network

As shown in Fig. 7, Zachary’s Karate Club comprises 34 nodes and 78 edges. The coach and the founder are nodes 1 and 34, respectively, and interestingly there is no direct edge linking the two. However, two groups eventually emerge around these two individuals, which unfortunately leads to the collapse of the club.

By utilizing \(k_{\delta }\) critical subgraphs on Zachary’s Karate Club, the \(k_{\delta }\) can be calculated for different values of \(\delta\). Figure 8 displays how the number of edges in \(k_{\delta }\) changes as \(\delta\) varies.

Fig. 8 The number of edges in \(k_{\delta }\)

It can be seen from Fig. 8 that as \(\delta\) increases, the number of edges in the \(k_{\delta }\) critical subgraph gradually decreases, which means that as the requirement for support increases, some members are gradually excluded from the \(k_{\delta }\) critical subgraphs due to the insufficient strength of their relationships with other members. In particular, if there is a sudden drop in the number of edges (as in Fig. 8), a large number of members are excluded because their relationships are too weak, and the remaining ones tend to stabilize. Since the excluded members have weak relationships with others, they can be considered peripheral members, whereas the members with stable relationships can be considered core members. Intuitively, core members, rather than peripheral members, can provide a deeper understanding of community changes. In our case, when \(\delta \in \left( \frac{3}{34},\frac{4}{34} \right]\) and \(k_{\delta } = 4\), the number of edges drops significantly, resulting in the extraction of a critical subgraph, as shown in Fig. 9a. In addition, Fig. 9b shows the 4-truss structure of Karate.

Fig. 9 4-sup versus 4-truss

As depicted in Fig. 9a, the support for each edge within the 4-sup precisely matches the support in the original graph, ensuring that the informational content of every edge is fully retained. In contrast, Fig. 9b reveals that the 4-truss structure, acting as a subgraph of the 4-sup, entails the removal of certain edges during its extraction, consequently discarding their significance. Moreover, the very density of the 4-truss structure tends to obscure the edges, creating an illusion of uniformity where the importance of each edge appears identical when viewed solely through the lens of connectivity. This perceived uniformity can significantly impede our analytical process and may even prove to be an irrelevant consideration for our current research focus. In light of these considerations, we have made a deliberate choice to employ the \(k\)-sup structure for this study, favoring its ability to maintain the distinct importance of each edge and provide a more nuanced framework for our analysis.

Figure 9a verifies the fact that when the edge between nodes 3 and 9 is broken, the two groups are no longer connected and the club is formally disintegrated.

In order to identify such critical edges as the edge between nodes 3 and 9 based on k-sup, we will discover the distinguishing characteristics of the critical edges. One distinguishing characteristic of the edge between nodes 3 and 9 is its vital role in preserving the integrity of the social network in Fig. 9. Bridgeness [26], an indicator measuring the importance of edges, is very effective in maintaining network connectivity. The bridgeness of edge \(e_{i,j}\) is defined as [26]

$$\begin{aligned} bridgeness(e_{i,j}) = \frac{\sqrt{C_{i}C_{j}}}{C_{e_{i,j}}} \end{aligned}$$
(1)

where \(C_{i}\) denotes the size of the maximum clique containing node \(v_{i}\) and \(C_{e_{i,j}}\) denotes the size of the maximum clique containing \(e_{i,j}\). The larger the \(bridgeness\left( e_{i,j} \right)\) value, the greater the influence of \(e_{i,j}\) on network connectivity.
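As a hedged illustration of Eq. (1), the sketch below computes bridgeness on a small toy graph, not on the Karate network. It enumerates maximal cliques with the basic Bron-Kerbosch recursion, which is exponential in the worst case and is only meant for small graphs; the toy edge list and all function names are assumptions made for this example.

```python
from math import sqrt

# Hypothetical toy graph: triangles {1,2,3} and {4,5,6} joined by edge (3, 4).
TOY_EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj):
    """Enumerate all maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(clique)
            return
        for v in list(candidates):
            expand(clique | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return cliques


def bridgeness(edges):
    """bridgeness(e_ij) = sqrt(C_i * C_j) / C_e, with C_i and C_e the sizes of
    the largest cliques containing node v_i and edge e_ij, respectively."""
    cliques = maximal_cliques(build_adjacency(edges))

    def largest_clique_with(nodes):
        return max(len(c) for c in cliques if nodes <= c)

    return {(u, v): sqrt(largest_clique_with({u}) * largest_clique_with({v}))
            / largest_clique_with({u, v}) for u, v in edges}


scores = bridgeness(TOY_EDGES)
```

In this toy graph the connecting edge (3, 4) obtains \(\sqrt{3 \cdot 3}/2 = 1.5\), while every edge inside a triangle obtains \(3/3 = 1\), so the edge joining the two triangles is ranked first.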

The bridgeness of each edge in Fig. 9 is shown in Table 1.

Table 1 Bridgeness

From Fig. 9, the removal of edges in indexes 1-4 in Table 1 would disrupt the connectivity of the network. However, deleting edge \(e_{3,9}\) would result in the network being divided into two disconnected components, while removing edge \(e_{1,5}\) would only isolate node 5 as an individual node. It is thus crucial to consider communities when discerning between \(e_{3,9}\) and the edges in indexes 2-4.

4.2 Proposed method

Communities can help us understand the structure and function of social networks, as well as the ways in which information spreads within them. Clearly, the importance of edges within a community and those outside a community is not the same due to the different roles they play in the network. Therefore, this paper distinguishes between edges internal and external to communities, determining the communities to which the endpoints of each edge in the network belong, thereby better differentiating the importance of edges. Thus, let’s review the following definitions.

Modularity [27] is an important metric used to measure the effectiveness of the community discovery algorithm. The larger the modularity, the better the community partition. The Louvain community discovery algorithm [28, 29] is a heuristic algorithm based on modularity maximization, and its community classification outperforms other basic algorithms. The algorithm can be primarily divided into two steps. In the first step, each node is considered as an independent community. For each node \(i\), the algorithm calculates the modularity gain \(\Delta Q_i\) if it were to join the community where its neighboring nodes are currently located. The algorithm keeps track of the community that yields the largest \(\Delta Q_i\), denoted as \(\Delta Q_i^{\max }\). If \(\Delta Q_i^{\max } > 0\), the algorithm records the node as belonging to the community of its neighboring nodes. This process continues until all nodes no longer change their community assignments. In the second step of the algorithm, the graph is compressed, meaning that nodes belonging to the same community are merged into a new single node. This compression reduces the complexity of the graph by collapsing multiple nodes into a single representative node for each community. After the compression, the first step of the algorithm is repeated on the compressed graph. Each compressed node represents a community, and the algorithm calculates the modularity gain for each compressed node joining the community of its neighboring compressed nodes. The process continues until the modularity of the entire graph no longer changes.
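The quantity that the Louvain algorithm tries to increase can be illustrated with a short sketch. The paper does not state the modularity formula, so the code below assumes the standard Newman definition for unweighted undirected graphs, \(Q = \sum _c \left[ L_c/m - \left( d_c/2m \right)^2 \right]\), where \(m\) is the number of edges, \(L_c\) the number of edges inside community \(c\), and \(d_c\) the sum of degrees in \(c\). It evaluates a given partition only and does not implement the Louvain moves themselves; the toy graph and names are hypothetical.

```python
# Hypothetical toy graph: triangles {1,2,3} and {4,5,6} joined by edge (3, 4).
TOY_EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]


def modularity(edges, community):
    """Newman modularity of a partition given as a node -> community label map."""
    m = len(edges)
    internal = {}    # L_c: number of edges with both endpoints in community c
    degree_sum = {}  # d_c: sum of node degrees in community c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


split = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
merged = {n: "a" for n in range(1, 7)}
q_split = modularity(TOY_EDGES, split)
q_merged = modularity(TOY_EDGES, merged)
```

For this toy graph, placing each triangle in its own community gives \(Q = 5/14\), whereas putting all nodes in one community gives \(Q = 0\), so a modularity-maximizing method would prefer the two-community partition.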

According to the Louvain algorithm, all the nodes in Fig. 9 were classified, and the results are shown in Fig. 10.

Fig. 10 The findings of the Louvain Community Discovery

From Fig. 10, it becomes apparent that among the four bridges, i.e., \(e_{3,9}\), \(e_{1,5}\), \(e_{1,11}\), and \(e_{32,34}\), only the two end nodes of \(e_{3,9}\) (nodes 3 and 9) are assigned to different communities, while the two end nodes of each of the remaining three bridges (\(e_{1,5}\), \(e_{1,11}\), and \(e_{32,34}\)) belong to the same community. After the edges between different communities are removed, the dissemination of information between communities may be blocked, so an importance indicator should reflect connectivity across communities. For this purpose, the importance of edge \(e_{i,j}\) should be reduced when its two end nodes \(v_{i}\) and \(v_{j}\) are in the same community, and the penalty term is established as follows.

$$\begin{aligned} p(e_{i,j}) = \left\{ \begin{aligned} -\frac{1}{E_{i,j}}\frac{{C}_{i} + {C}_{j}}{V}\frac{\min ( {C}_{i},{C}_{j})}{\max ( {C}_{i},{C}_{j})},&\quad v_{i} \text { and } {v}_{j} \text { are in different communities} \\ 1,&\quad v_{i} \text { and } {v}_{j} \text { are in the same community} \end{aligned} \right. \end{aligned}$$
(2)

where \({C}_{i}\) denotes the number of nodes in the community where node \(v_{i}\) is located; \(E_{i,j}\) denotes the number of edges connecting the communities where nodes \(v_{i}\) and \(v_{j}\) are located, which together maintain the communication between the two communities, so the contribution is shared equally among them through the factor \(\frac{1}{E_{i,j}}\); \(({C}_{i} + {C}_{j})/{V}\) is the proportion of nodes in the network affected when the two communities are disconnected, with \(V\) denoting the total number of nodes in the network; \(\min ({C}_{i},{C}_{j})/\max ({C}_{i},{C}_{j})\) is the ratio of the sizes of the two communities, reflecting how balanced the two community sizes are. If \(v_{i}\) and \(v_{j}\) are in the same community, the two community sizes coincide, and we let \(p\left( e_{i,j} \right) = 1\).

Combined with the overall connectivity of social networks, the \(k\)-sup structure-based importance indicator for \(e_{i,j}\), called \(\text {supEI}\) (k-sup-based Edge Importance indicator), is defined as follows.

$$\begin{aligned} \text {supEI}(e_{i,j}) = bridgeness( e_{i,j} ) - p( e_{i,j}) \end{aligned}$$
(3)

Clearly, the larger the value of \(\text {supEI}(e_{i,j})\), the greater the impact caused by \(e_{i,j}\) on network connectivity.
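Equations (2) and (3) can be transcribed directly. The sketch below takes the community sizes, the number of inter-community edges, and the bridgeness value as inputs instead of computing them from a graph; the function and parameter names are illustrative, and the numbers in the usage lines are made up for the example and are not taken from Table 2.

```python
def penalty(c_i, c_j, e_ij, num_nodes, same_community):
    """Penalty term p(e_ij) of Eq. (2).

    c_i, c_j       -- sizes of the communities containing v_i and v_j
    e_ij           -- number of edges between the two communities
    num_nodes      -- total number of nodes in the network
    same_community -- True if v_i and v_j are in the same community
    """
    if same_community:
        return 1.0
    return -(1 / e_ij) * ((c_i + c_j) / num_nodes) * (min(c_i, c_j) / max(c_i, c_j))


def sup_ei(bridgeness_value, penalty_value):
    """supEI(e_ij) = bridgeness(e_ij) - p(e_ij), as in Eq. (3)."""
    return bridgeness_value - penalty_value


# Made-up inputs: an inter-community edge and an intra-community edge.
inter = sup_ei(1.5, penalty(3, 3, 1, 6, same_community=False))
intra = sup_ei(1.0, penalty(3, 3, 1, 6, same_community=True))
```

Because the penalty is negative for an inter-community edge and subtracted in Eq. (3), such an edge is raised above its bridgeness value, while an intra-community edge is lowered by 1. With the made-up inputs above, `inter` is 2.5 and `intra` is 0.0.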

The \(\text {supEI}(e_{i,j})\) values for each edge in the Karate club’s 4-sup (Fig. 9) are shown in Table 2.

Table 2 \(\text {supEI}\)

An interesting result can be seen in Table 2: the edges after index 2 all have lower values of \(\text {supEI}\) because they are intra-community edges. It can also be seen that the values of \(p(e)\) for the edges in indexes 1 and 2 are negative, indicating that these edges connect different communities and that their deletion has a greater impact on the dissemination of information in the network, while the values for the remaining edges are all positive, indicating that they are intra-community edges whose deletion has a smaller impact on the dissemination of information in the network. Combining the results of \(bridgeness(e)\) and \(\text {supEI}(e)\), the importance of the edge in index 1 increases greatly because the nodes at the ends of the edge are in different communities.

If the edge attack is conducted based on the importance indicated in Table 2, when edge \(e_{3,9}\) is removed from the network, the network is divided into two disconnected communities. This division blocks the flow of information between these two communities. Furthermore, if the network continues to be attacked and all the edges in index 2 are removed, the network will be divided into three disconnected communities. Consequently, there is no longer any information dissemination among the communities.

In summary, the workflow for detecting critical edges using the critical subgraph-based Algorithm 1 is outlined as follows.

Algorithm 1 The detection of critical edges based on \(\text {supEI}\)

In Algorithm 1, one needs to determine the proper \(k_{\delta }\) by computing the maximal drop in the number of edges, \(\max _k(|E_{S_{k-1}}|-|E_{S_k}|)\) (by Property 2, \(|E_{S_{k-1}}| \ge |E_{S_k}|\)), and then compute \(\text {supEI}(e)\) for all the edges in \(E_{S_{k_\delta }}\). Algorithm 1 returns the ranked edges according to \(\text {supEI}(e)\).
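Since the listing of Algorithm 1 is not reproduced in the text, only the selection step described above is sketched here, under the reading that the chosen \(k\) is the one at which the number of edges drops the most relative to the \((k-1)\)-sup. The support dictionary is the hypothetical one implied by Example 1, and the function names are illustrative.

```python
# Hypothetical supports taken from the values listed in Example 1.
SUPPORT = {(4, 7): 1, (7, 8): 1, (4, 8): 1, (6, 9): 1, (2, 9): 1,
           (5, 6): 1, (2, 3): 1, (3, 5): 1, (2, 6): 2, (2, 5): 2}


def edge_counts_by_k(support):
    """|E_{S_k}| for k = 2 .. k_max, where k_max = (maximum support) + 2.
    An edge belongs to the k-sup iff its support in G is at least k - 2."""
    k_max = max(support.values()) + 2
    return {k: sum(1 for s in support.values() if s >= k - 2)
            for k in range(2, k_max + 1)}


def select_k(support):
    """Return the k with the largest drop |E_{S_{k-1}}| - |E_{S_k}|.
    Falls back to k = 2 when only the 2-sup exists."""
    counts = edge_counts_by_k(support)
    drops = {k: counts[k - 1] - counts[k] for k in counts if k - 1 in counts}
    if not drops:
        return 2
    return max(drops, key=drops.get)


chosen_k = select_k(SUPPORT)
```

For the assumed supports, the edge counts are 10, 10, and 2 for \(k = 2, 3, 4\), so the largest drop occurs at \(k = 4\) and `chosen_k` is 4. Ranking the edges of the selected \(k\)-sup by \(\text {supEI}\) would then follow Eqs. (1) to (3).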

4.3 Case analysis and comparison

In this section, to verify the reasonableness of the critical subgraphs obtained from the \(k\)-sup structure, we compare \(\text {supEI}\) with several other indicators of edge significance on Zachary’s Karate Club network, namely bridgeness (BR) [26] (see Sect. 4.1), reachability (RE) [30], the edge betweenness centrality (BC) [13], degree product (ED) [14], the Jaccard coefficient (JC) [31], the hybrid edge centrality (BC_DCN) [17], and the significance of edges in the diffusion process (Inf) [18].

The reachability of an edge \(e(v_{i},v_{j})\) is defined as [30]

$$\begin{aligned} R_{e(v_{i},v_{j})} = \frac{1}{|V|}\sum _{s \in V}^{}|R( s;G_{e(v_{i},v_{j})})| \end{aligned}$$

where \(|V|\) is the total number of nodes in the network; \(G_{e(v_{i},v_{j})}\) represents the subnetwork obtained by removing \(e(v_{i},v_{j})\) from the original network G; \(R\left( s;G_{e(v_{i},v_{j})} \right)\) represents the number of nodes that can be reached from node \(s\) within the modified network \(G_{e(v_{i},v_{j})}\).
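The reachability definition can be illustrated with a breadth-first search from every node after the edge is removed. One point is an assumption of this sketch: the source node \(s\) is counted as reachable from itself, which the definition above does not state explicitly. The toy graph and function names are hypothetical.

```python
from collections import deque

# Hypothetical toy graph: triangles {1,2,3} and {4,5,6} joined by edge (3, 4).
TOY_EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]


def reachable_count(adj, source):
    """Number of nodes reachable from source (source itself included)."""
    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen)


def reachability(edges, removed):
    """Average, over all nodes s, of |R(s; G_e)| after deleting edge `removed`."""
    nodes = {n for e in edges for n in e}
    adj = {n: set() for n in nodes}
    for u, v in edges:
        if {u, v} != set(removed):
            adj[u].add(v)
            adj[v].add(u)
    return sum(reachable_count(adj, s) for s in nodes) / len(nodes)
```

In the toy graph, `reachability(TOY_EDGES, (3, 4))` is 3.0 because each node can only reach its own triangle, whereas `reachability(TOY_EDGES, (1, 2))` stays at 6.0 because the graph remains connected.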

The edge betweenness centrality of an edge \(e(v_{i},v_{j})\) is defined as [13]

$$\begin{aligned}C_{B} = \sum _{v_{i} \ne v_{j} \in V}^{}\frac{\sigma (e(v_{i},v_{j}))}{\sigma (v_{i},v_{j})}\end{aligned}$$

where \(\sigma \left( v_{i},v_{j} \right)\) is the number of shortest paths from node \(v_{i}\) to \(v_{j}\), and \(\sigma (e(v_{i},v_{j}))\) represents the number of shortest paths from node \(v_{i}\) to \(v_{j}\) that pass through edge \(e(v_{i},v_{j})\).

The degree product of an edge \(e(v_{i},v_{j})\) is defined as [14]

$$\begin{aligned} ED = w_{ij} = d_{v_{i}}d_{v_{j}} \end{aligned}$$

where \(w_{ij}\) represents the edge weight between nodes \(v_{i}\) and \(v_{j}\), \(d_{v_{i}}\) and \(d_{v_{j}}\) are the degrees of nodes \(v_{i}\) and \(v_{j}\) respectively.

The Jaccard coefficient of an edge \(e(v_{i},v_{j})\) is defined as [31]

$$\begin{aligned} J_{e(v_{i},v_{j})} = \frac{| \Gamma _{i} \cap \Gamma _{j}|}{|\Gamma _{i} \cup \Gamma _{j}|} \end{aligned}$$

where \(\Gamma _{i}\) represents the set of neighbors of node \(v_{i}\), i.e., the set of nodes directly connected to \(v_{i}\).
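The degree product and the Jaccard coefficient defined above only need neighbor sets, as the following sketch on a hypothetical toy graph shows. The function names are illustrative, and returning 0 for an empty union is a convention chosen here, not one stated in the text.

```python
# Hypothetical toy graph: triangles {1,2,3} and {4,5,6} joined by edge (3, 4).
TOY_EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_product(adj, u, v):
    """ED = d_u * d_v."""
    return len(adj[u]) * len(adj[v])


def jaccard(adj, u, v):
    """|N(u) & N(v)| / |N(u) | N(v)|; defined as 0 when the union is empty."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


adj = build_adjacency(TOY_EDGES)
```

In the toy graph, the connecting edge (3, 4) has the largest degree product (9) but a Jaccard coefficient of 0, because its endpoints share no neighbor, while an edge inside a triangle such as (1, 2) has degree product 4 and Jaccard coefficient 1/3. The two indicators can therefore rank the same edge very differently.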

The hybrid edge centrality of an edge \(e(v_{i},v_{j})\) is defined as [17]

$$\begin{aligned} BC\_DCN_{e(v_{i},v_{j})} = \frac{C_{B}*ED}{|\Gamma _{i} \cap \Gamma _{j}|} \end{aligned}$$

where \({|\Gamma _{i} \cap \Gamma _{j}|}\) represents the number of common neighbors of nodes \(v_{i}\) and \(v_{j}\).

The significance of edges in the diffusion process of an edge \(e(v_{i},v_{j})\) is defined as [18]

$$\begin{aligned} Inf_{e(v_{i},v_{j})} =\sqrt{Inf_i*Inf_j} \end{aligned}$$

where \(Inf_i\) represents the influence [32] of node \(v_{i}\).

The importance ranking of the edges in the 4-sup (Fig. 9) based on the above indicators is shown in Table 3, where the importance values are given in brackets.

Table 3 Ranking of edge importance

In Table 3, the fewer the edges in the same index, the wider the range of edge importance and the higher the resolution. Among the compared indicators, the edge betweenness centrality, degree product, Jaccard coefficient and hybrid edge centrality show a higher resolution. The degree product, Jaccard coefficient and hybrid edge centrality differ more from each other, as the Jaccard coefficient considers the similarity of both end nodes, and the more similar both end nodes, the higher the edge’s importance. For instance, since the nodes at both ends of \(e_{3,9}\) have a similarity of zero, this edge is ranked at the bottom.

The degree product ranks \(e_{3,9}\) in index 6 because it is more influenced by the degree values of the end nodes, which leads to a greater difference from the other methods. Since the hybrid edge centrality BC_DCN integrates the degree product, betweenness centrality, and common neighbors, the resultant rankings of BC_DCN are fundamentally aligned with those derived from the degree product and betweenness centrality. This congruence also elucidates why BC_DCN’s outcomes are markedly distinct from those of alternative methodologies.

Although the edge betweenness centrality ranks \(e_{3,9}\) at index 1, all the edges in indexes 3-11 are intra-community edges; since a community has strong internal connections, the probability that breaking one of these edges blocks information within the community is small, and such a break does not cause much impact on the network.

Reachability, bridgeness, Inf, and \(\text {supEI}\) identify \(e_{3,9}\) as the most significant edge, as supported by Fig. 9. The edges in index 2 for reachability are the same as those identified in indexes 2-3 for bridgeness. This is because both indicators consider these edges to be essential for maintaining the connectivity of the network. However, this connectivity is primarily significant within a community, and its impact on the entire network is relatively minor. Similarly, in the identification results of Inf, only the edge with index 2 spans across communities, whereas the remaining edges are confined within a single community. The deletion of these intra-community edges is anticipated to exert minimal impact on the overall network integrity.

The \(\text {supEI}\) emphasizes that the connectivity between communities is more important than the connectivity within each individual community. Thus, the deletion of the edges in indexes 1-2 will disconnect all the communities in the network, blocking communication between them.

Therefore, from the analysis above, it is evident that \(\text {supEI}\), the proposed edge importance indicator, is reasonable.

5 Experiments and analysis

5.1 Experimental datasets

To assess the effectiveness of the proposed methodology presented in this paper, experiments are conducted on eight publicly available real-world network datasets and three randomly generated synthetic networks. Among the real-world network datasets considered in this study, we have included the following:

  • Karate: A social network containing 34 members of a karate club in the USA during the 1970s [16];

  • Contiguous_USA: An infrastructure network among US states [33];

  • Football: A social network representing the American football games played between various colleges during the fall season of 2000 in the USA [34];

  • PDZBase: A metabolic network of protein-protein interactions from PDZBase [33];

  • Netscience: A collaboration network among researchers in the fields of network theory and experimental science [34];

  • Jazz: A collaboration network among jazz musicians [34];

  • Euroroads: An international E-road network that connects various cities across Europe [33];

  • Yeast: A metabolic network of protein-protein interactions [33].

The basic statistical information for the eight datasets is presented in Table 4, where \(|V|\) and \(|E|\) are the total number of nodes and edges, respectively; \(\left\langle k_{s} \right\rangle\) denotes the average degree of support; \(k_{s\_\max }\) represents the maximum degree of support; \(\left\langle k \right\rangle\) represents the average node degree; \(k_{\max }\) denotes the maximum degree, and \(c\) represents the average clustering coefficient.

Table 4 Basic statistical information of eight networks

5.2 Evaluation strategy

The stability of a network reflects its resistance to various types of attacks. This paper primarily discusses two commonly used evaluation criteria for network resilience: the maximum connectivity coefficient [35] and the decline rate of network efficiency [35].

The maximum connectivity coefficient \(\sigma\) can be calculated as follows [35]

$$\begin{aligned} \sigma = R/|V| \end{aligned}$$

where \(R\) represents the number of nodes in the maximum connected component after attack and \(|V|\) is the total number of nodes in the network. The faster \(\sigma\) falls, the greater the change in network stability and the more efficient the attack strategy.
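As a minimal sketch of the formula above, the following pure-Python snippet computes \(\sigma = R/|V|\) with a breadth-first search. The function names and the toy graph (two triangles joined by one bridge edge) are illustrative assumptions and are not taken from the paper's experiments.

```python
from collections import deque


def largest_component_size(nodes, edges):
    """Return R, the number of nodes in the largest connected component."""
    adjacency = {v: set() for v in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best


def max_connectivity_coefficient(nodes, edges):
    """sigma = R / |V|, where |V| is the node count of the original network."""
    return largest_component_size(nodes, edges) / len(nodes)


# Hypothetical toy graph: two triangles joined by the bridge (2, 3).
nodes = [0, 1, 2, 3, 4, 5]
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]

sigma_before = max_connectivity_coefficient(nodes, edges)
sigma_after = max_connectivity_coefficient(
    nodes, [e for e in edges if e != (2, 3)]
)
print(sigma_before, sigma_after)
```

In this toy graph, removing the single inter-community edge halves \(\sigma\), which is the kind of sharp drop the evaluation looks for.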

The decline rate of network efficiency can be calculated as follows [35]

$$\begin{aligned}\mu = 1 - \frac{1}{\eta _{0}}\frac{1}{|V|( |V| - 1)}\sum _{i \ne j}^{}\frac{1}{d_{ij}}\end{aligned}$$

where \(|V|\) is the total number of nodes in the network, \(\eta _{0}\) is the efficiency of the original network, and \(d_{ij}\) denotes the shortest distance between nodes \(v_{i}\) and \(v_{j}\). The larger the value of \(\mu\), the more pronounced the decrease in network efficiency, highlighting the increasing importance of the corresponding edge.
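The decline rate can be sketched in the same way. The snippet below assumes the usual convention that unreachable node pairs contribute \(1/d_{ij} = 0\) to the sum; the function names and the three-node path graph are hypothetical and serve only to illustrate the formula.

```python
from collections import deque


def network_efficiency(nodes, edges):
    """eta = 1 / (|V| (|V| - 1)) * sum over ordered pairs i != j of 1 / d_ij."""
    adjacency = {v: set() for v in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0.0
    for source in nodes:
        # Breadth-first search gives shortest distances in an unweighted graph.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        # Unreachable nodes are absent from `distance` and contribute zero.
        total += sum(1.0 / d for d in distance.values() if d > 0)
    return total / (n * (n - 1))


def efficiency_decline_rate(nodes, original_edges, remaining_edges):
    """mu = 1 - eta / eta_0, with eta_0 the efficiency of the original network."""
    eta_0 = network_efficiency(nodes, original_edges)
    eta = network_efficiency(nodes, remaining_edges)
    return 1.0 - eta / eta_0


# Hypothetical toy graph: the path 0 - 1 - 2.
nodes = [0, 1, 2]
original_edges = [(0, 1), (1, 2)]

mu_unchanged = efficiency_decline_rate(nodes, original_edges, original_edges)
mu_after_cut = efficiency_decline_rate(nodes, original_edges, [(0, 1)])
print(mu_unchanged, mu_after_cut)
```

Here \(\eta _{0} = 5/6\) for the intact path and \(\eta = 1/3\) after removing the edge between nodes 1 and 2, so \(\mu = 0.6\).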

5.3 Results analysis

To validate the effectiveness of the proposed method in this study, the importance of edges is evaluated using seven distinct indicators detailed in Sect. 4.3, in comparison with the method proposed in this paper. Then, edge attacks are performed on eight real-world networks and three randomly generated synthetic networks by removing these critical edges in turn. The stability changes of the network after an edge is removed will reflect the importance of the edge; the greater the stability change of the network, the more important the removed edges are. As observed in Table 4, the maximum support is notably small for some networks, thus all experiments are conducted with \(k_{\delta }=2\).
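The edge-attack procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the ranking is computed once on the original network and is not updated during the attack, and the degree product is used as a stand-in ranking because the computation of \(\text {supEI}\) is not reproduced here. All function names and the toy graph are hypothetical.

```python
from collections import deque


def largest_component_fraction(nodes, edges):
    """Fraction of nodes in the largest connected component (sigma = R / |V|)."""
    adjacency = {v: set() for v in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best / len(nodes)


def degree_product_ranking(nodes, edges):
    """Stand-in ranking (not supEI): sort edges by the product of end-node degrees."""
    degree = {v: 0 for v in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(edges, key=lambda e: degree[e[0]] * degree[e[1]], reverse=True)


def attack_curve(nodes, edges, ranked_edges):
    """Remove edges in ranked order and record (p, sigma) after each removal."""
    remaining = set(edges)
    curve = [(0.0, largest_component_fraction(nodes, remaining))]
    for count, edge in enumerate(ranked_edges, start=1):
        remaining.discard(edge)
        p = count / len(edges)  # proportion of removed edges
        curve.append((p, largest_component_fraction(nodes, remaining)))
    return curve


# Hypothetical toy graph: two triangles joined by the bridge (2, 3).
nodes = [0, 1, 2, 3, 4, 5]
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]

ranking = degree_product_ranking(nodes, edges)
curve = attack_curve(nodes, edges, ranking)
for p, sigma in curve:
    print(f"p={p:.2f}  sigma={sigma:.2f}")
```

Plotting the recorded pairs for each indicator gives curves of the kind compared in the following subsections; swapping `degree_product_ranking` for another ranking function changes only the attack order.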

5.3.1 Analysis of experimental results on the maximum connectivity coefficient

Figure 11 illustrates the changes in the maximum connectivity coefficient when removing edges, where the horizontal axis \(p\) represents the proportion of removed edges and the vertical axis represents the maximum connectivity coefficient. The faster the maximum connectivity coefficient drops, the greater the impact on network connectivity, and the higher the identification accuracy of the corresponding method.

Fig. 11 The maximum connectivity coefficients for different indicators on eight real networks

As evident from Fig. 11, the proposed \(\text {supEI}\) exhibits the most rapid decline in the maximum connectivity coefficient compared to other indicators on all eight real networks. In the case of Netscience (Fig. 11e), the maximum connectivity coefficient declines rapidly as soon as edge removal begins. When the removal ratio reaches 0.045, all communication between communities is effectively blocked. If the attack persists, it starts to affect the internal structure of the communities. Notably, for removal ratios between 0.045 and 0.1, the proposed method shows the fastest drop and the lowest value of the maximum connectivity coefficient. Similarly, for Contiguous_USA (Fig. 11b), PDZBase (Fig. 11d), Euroroads (Fig. 11g), and Yeast (Fig. 11h), there is a marked decrease in the maximum connectivity coefficient when the proportion of removed edges is relatively low. In the other networks, the decline in the maximum connectivity coefficient begins at different proportions of removed edges. For instance, in Fig. 11a, the decline begins at a removal ratio of 0.13, whereas in Figs. 11c and f, it starts at 0.14 and 0.08, respectively. This discrepancy is due to the larger number of edges between communities in these networks.

Overall, the proposed indicator in this paper demonstrates a rapid decline in the maximum connectivity coefficient with the removal of a smaller proportion of edges. In contrast, the corresponding curves of other methods exhibit a slower decline. This observation validates the effectiveness of the proposed method in capturing the impact of edge removal.

5.3.2 Analysis of experimental results on the decline rate of network efficiency

Figure 12 illustrates the decline rate of network efficiency as edges are removed. The horizontal axis represents the proportion of edges removed (\(p\)), while the vertical axis indicates the decline rate of network efficiency (\(\mu\)). A higher decline rate indicates a more pronounced decrease in network efficiency, which corresponds to a higher identification accuracy of the corresponding method.

Fig. 12 The decline rates of network efficiency for different indicators on eight real networks

As observed in Fig. 12, the method proposed in this paper consistently exhibits the highest rate of decrease in network efficiency. This result suggests that the proposed method is capable of identifying edges that have the most significant impact on network efficiency. As illustrated in Figs. 12b, d, e, g, and h, when the proportion of removed edges is minimal, the network damage reaches its maximum; in the case of the other networks depicted in Figs. 12a, c, and f, the rate of decline in network efficiency starts to increase significantly only when the proportion of removed edges reaches 0.1, 0.2, and 0.2, respectively. This discrepancy is likely due to the presence of a higher number of connections between communities in these networks. Moreover, the other methods exhibit minimal impact on the network when removing a certain percentage of connected edges. This is because these connected edges may exist within communities, and their removal diminishes internal connectivity but has a limited effect on the network as a whole.

Whether comparing the experimental results of the maximum connectivity coefficient or the decline rate of network efficiency, we found that the newly added networks are more prone to collapse than the original social networks (Karate, Football, and Jazz). This may be because transportation networks and biological networks have more pronounced community structures and generally stronger modular characteristics, whereas the community structure in social networks may be looser and more diverse.

5.3.3 Analysis of experimental results on the synthetic networks

In addition to the real-world network datasets, the experimental analysis uses artificial datasets, specifically the Erdős-Rényi (ER) model [36], Scale-Free (SF) network [37], and Small-World (SW) network [38]. Each synthetic network has 400 nodes. The corresponding statistical information is presented in Table 5.

Table 5 Basic statistical information of three synthetic networks

The experimental results on the synthetic networks are shown in Figs. 13 and 14, which depict the maximum connectivity coefficient and the decline rate of network efficiency upon the removal of edges, respectively.

Fig. 13 The maximum connectivity coefficients for different indicators on three synthetic networks

As depicted in Fig. 13, the \(\text {supEI}\) generally outperforms other existing indicators in terms of overall effectiveness. However, in the initial phase of edge removal, as presented in Figs. 13a and b, the performance of other indicators surpasses that of the \(\text {supEI}\). Nevertheless, when the edge deletion ratio hits 0.4, the supEI can induce a total collapse of the network. In contrast, other algorithms exhibit a maximum connectivity coefficient that remains around 0.8, with a decline that is progressively moderating. This comparative analysis demonstrates the effectiveness of the method proposed in this study.

Fig. 14 The decline rates of network efficiency for different indicators on three synthetic networks

Figure 14 illustrates the changes in the decline rates of network efficiency when removing edges. Figure 14a demonstrates that when the ratio of removed edges is below 0.4, the method introduced in this paper exhibits the lowest rate of network efficiency decline, indicating that the network’s efficiency suffers minimal degradation. As shown in Fig. 14b, for edge removal ratios less than 0.4, the performance of our method is comparable to that of other methods. However, it is Fig. 14c that highlights the distinctive advantage of our approach; even at very low ratios of removed edges, our method effectively disrupts the network, showcasing its ability to identify edges that significantly impact network integrity.

Since the connections between nodes in a random network are random and the community structure is not obvious, our method has the worst effect in the initial stage of edge deletion, as demonstrated in Figs. 13a and 14a. This reflects the limitations of the method presented in this article.

Experiments conducted on both real-world and synthetic networks have revealed limitations associated with the \(\text {supEI}\) indicator. As the design of supEI relies on community detection algorithms, the effectiveness of supEI is contingent upon the choice of community detection algorithm. If connections within a community are very strong, the community’s internal connectivity remains high, and the removal of inter-community connections has a minimal impact on the overall network connectivity because internal connections can uphold the integrity of the community. Additionally, if there are numerous edges between communities, subtle changes in connectivity may occur during the initial stages of edge removal. This is because edges between communities are considered more important than edges within a community. Nevertheless, it has been observed that supEI can significantly disrupt networks by removing a smaller proportion of edges in many cases, indicating the high practical value of the algorithm proposed in this paper.

5.3.4 Robustness analysis in real-world and synthetic networks

The robustness of a network is defined as its resistance to destruction. In edge-based attacks, the most destructive attack is the one that destroys the most “important” edges in the network [39]. Therefore, it is necessary to compare the edge importance metric \(\text {supEI}\), as proposed in this paper, with other metrics to observe the robustness of the network.

The link-robustness index can be calculated as follows [8]

$$\begin{aligned} R_l=\frac{1}{|E|}\sum \limits _{p=1}^{|E|}s(p) \end{aligned}$$

where \(|E|\) is the total number of edges in the network and \(s(p)\) is the fraction of nodes in the current largest connected component after \(p\) edges have been removed. Clearly, if a network is robust against edge attacks, its \(R_l\) should be relatively large.

That is to say, if the \(R_l\) value is comparatively low, it indicates that the network is susceptible to attacks, implying that its capacity to withstand attacks on its edges is relatively weak. Table 6 shows the \(R_l\) value under different edge attack strategies.
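The link-robustness index can be sketched directly from its definition. The snippet below assumes that the attack order lists every edge of the network exactly once and that \(s(p)\) is the fraction of nodes in the largest connected component; the function names and the four-node path graph are illustrative assumptions.

```python
from collections import deque


def largest_component_fraction(nodes, edges):
    """s(p): fraction of nodes in the current largest connected component."""
    adjacency = {v: set() for v in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best / len(nodes)


def link_robustness(nodes, attack_order):
    """R_l = (1 / |E|) * sum_{p=1}^{|E|} s(p) for a given edge removal order."""
    remaining = set(attack_order)
    accumulated = 0.0
    for edge in attack_order:
        remaining.discard(edge)
        accumulated += largest_component_fraction(nodes, remaining)
    return accumulated / len(attack_order)


# Hypothetical toy graph: the path 0 - 1 - 2 - 3.
nodes = [0, 1, 2, 3]
middle_first = [(1, 2), (0, 1), (2, 3)]  # cut the central edge first
ends_first = [(0, 1), (2, 3), (1, 2)]    # cut the peripheral edges first

r_middle = link_robustness(nodes, middle_first)
r_ends = link_robustness(nodes, ends_first)
print(r_middle, r_ends)
```

Cutting the central edge first gives the smaller \(R_l\) on this path, in line with the reading that a lower value corresponds to a more damaging attack order.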

Table 6 The obtained \(R_l\) values of different edge importance measures on networks

As depicted in Table 6, our method is associated with the lowest \(R_l\) values across both real-world and synthetic networks. This result implies that our method is highly effective at compromising network integrity. Consequently, it is a valuable consideration for the development of network defense strategies.

6 Summary and prospects

Analyzing the relationships between nodes in social networks is a crucial basis for understanding network structures. In this paper, we introduce the concept of the \(k\)-sup structure, which takes into account the strength of relationships between nodes, and investigate the critical subgraphs based on the \(k\)-sup structure. Building upon this, a novel importance indicator called \(\text {supEI}\), based on the \(k\)-sup, is proposed. This indicator not only distinguishes between the importance of internal and external community edges but also provides a fresh perspective on how to maintain network connectivity amidst the complex interactions inherent in social networks. By integrating two critical factors, the bridgeness and the community affiliation of nodes, the \(\text {supEI}\) presents a comprehensive framework that enhances our ability to understand and analyze the structural integrity and information flow within networks.

Experiments are conducted on eight real-world network datasets and three synthetic network datasets to evaluate the performance of our method. The experimental results demonstrate that \(\text {supEI}\) effectively identifies the importance of edges. It exhibits a remarkable ability to identify critical edges, causing substantial disruption to the network within a limited number of attacks. Furthermore, in terms of network connectivity, the \(\text {supEI}\) indicator outperforms other existing methods by demonstrating a heightened sensitivity to edge attacks. In the majority of networks, a minimal number of targeted edge disruptions can significantly impair network connectivity. Even in scenarios where our method may not immediately excel during the initial phase of such attacks, it becomes increasingly effective as the extent of edge removal escalates, ultimately leading to the most substantial damage to the network’s integrity.

Additionally, the method holds significant implications in real-life scenarios. For instance, from a defensive perspective, it can be utilized to identify critical edges and subsequently enhance network resilience by protecting those crucial connections. From an offensive standpoint, the method can aid in identifying the key edges of adversary networks, allowing for targeted destruction with minimal cost and maximum gain. Overall, the method offers practical and strategic insights for both defensive and offensive operations, contributing to the advancement of network security and optimization in various real-world contexts.

To further improve the effectiveness of our method, it would be valuable to extend it and apply it to other social network problems, such as link prediction and community detection. Considering the relationship between nodes and edges, we hope to further extend supEI in the future to make it applicable to node attack strategies, with the expectation of achieving better attack effects. Additionally, the analysis in this paper is limited to unweighted undirected networks, and the method cannot yet be applied to other types of networks. Further research will be conducted to address these aspects.