Abstract
The critical node detection problem describes a class of graph problems that involve identifying sets of nodes that influence a given graph metric. One variant of this problem is to find the nodes that, when removed from the graph, maximize the number of connected components in the remaining graph. This is a practical problem with applications in epidemic control, immunization strategies, social networks, biology, etc. This paper proposes the use of a simple genetic algorithm (GA) to identify the set of critical nodes without designing special problem-specific variation operators. Problem-specific information is used only in the fitness function and the constraint handling technique. We show that this simple approach performs as well as state-of-the-art methods.
This work was supported by a grant of the Romanian National Authority for Scientific Research and Innovation, CNCS - UEFISCDI, project number PN-III-P1-1.1-TE-2019-1633.
1 Introduction
Nodes in a network can have different importance with respect to different network measures and behavior. Finding the most important ones, called critical nodes, is an essential computational task. Critical nodes can also be approached from the perspective of the general node deletion problem [14], a large class of problems that includes the vertex separator problem, the minimum vertex cover problem, the critical node detection problem, etc. Recently, the critical node detection problem (CNDP) has gained attention due to its wide applicability. An important variant of the CNDP is to identify a set of nodes, of a given maximal size, whose removal from the graph maximizes the number of connected components. Applications of this problem can be found in epidemic control and immunization strategies, social networks, biology, telecommunications, etc.
In general, the critical node detection problem consists in finding a set of nodes in a given graph \(G=(V,E)\) whose deletion maximally degrades the graph according to a given measure \(\sigma \). The CNDP is a central problem in network analysis with applications in several research fields, such as biology [2], network vulnerability [6], social network analysis [3], etc. Regarding the measure \(\sigma \), several studies focus on network centrality measures, such as betweenness centrality, closeness centrality, and PageRank [11, 16].
Although several variants of the CNDP exist, only a few studies deal with computational methods for the variant that consists of removing k nodes in order to maximize the number of remaining components. The main goal of this paper is to approach this problem using a genetic algorithm with minimal problem-specific adaptations. The choice of a genetic algorithm was first motivated by the natural binary encoding of an individual, but this is not the only reason: we believe that it is important to explore different methods and not to assume that a method will not work on a certain problem merely because it has not been tested on it. This also relates to the choice of operators: if there is no need for specific operators that use domain knowledge, we should not use them, and should keep the approach as general and as flexible as possible.
The rest of the paper is organized as follows: the next section presents the problem and reviews some existing approaches. The third section describes the proposed genetic algorithm. In the fourth section, numerical experiments on synthetic and real-world networks are used to compare our results with existing ones. The article ends with conclusions and further work.
2 Related Work
Many variants of the critical node detection problem are studied in the literature, among which we mention: minimizing the pairwise connectivity by deleting k nodes (the most studied variant), minimizing the largest component size by deleting k nodes, bounding the pairwise connectivity by a given threshold while deleting a minimal set of nodes, etc. A recent survey of the problem can be found in [13].
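To make the two connectivity measures named above concrete, the following minimal sketch computes them for a small graph. It assumes an undirected graph stored as an adjacency dictionary and takes pairwise connectivity to be the number of node pairs still joined by a path, i.e. the sum of \(|C|(|C|-1)/2\) over the remaining components C. The function names are hypothetical and are not taken from the cited works.

```python
from collections import deque

def components_after_removal(adj, removed):
    """Return the connected components (as sets) of the graph without `removed`."""
    removed = set(removed)
    seen = set()
    components = []
    for start in adj:
        if start in removed or start in seen:
            continue
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:  # breadth-first search restricted to surviving nodes
            u = queue.popleft()
            for v in adj[u]:
                if v not in removed and v not in seen:
                    seen.add(v)
                    component.add(v)
                    queue.append(v)
        components.append(component)
    return components

def pairwise_connectivity(adj, removed):
    """Number of node pairs that remain connected after deleting `removed`."""
    comps = components_after_removal(adj, removed)
    return sum(len(c) * (len(c) - 1) // 2 for c in comps)

def largest_component_size(adj, removed):
    """Size of the largest remaining component (0 if no node survives)."""
    comps = components_after_removal(adj, removed)
    return max((len(c) for c in comps), default=0)

# Path graph 0-1-2-3-4: deleting the middle node splits it into two edges.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(pairwise_connectivity(path, {2}), largest_component_size(path, {2}))
```

Both measures are derived from the same component decomposition, which is why many CNDP variants differ only in how the components are aggregated.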
There are several ways to classify the critical node detection problem (CNDP). In [21] a classification into two types is adopted: CNDP type 1 problems aim to minimize the network connectivity while keeping the number of removed nodes under a given threshold, and CNDP type 2 problems aim to minimize the number of removed nodes such that the network connectivity reaches a given threshold. The connectivity measure used depends on the envisaged application, effect, or type of network. Applications are numerous, as the CNDP is related to network sustainability and vulnerability [21]. Many practical approaches have been devised for wireless sensor networks [7, 8, 18].
In [21] an exact algorithm for the problem considering the largest connected component is proposed. The k-vertex cut problem, which consists in finding the minimum weight subset of nodes whose removal disconnects the graph into at least k components, is studied in [9]. The Component-Cardinality-Constrained Critical Node Problem (3C-CNP) is approached in [12]. A bi-objective design is presented in [25]. As for the type of network, weighted networks are studied, for example, in [5] and directed graphs in [19].
In [1] the two types of CNDP problems are studied in three versions, among them kMaxComp, the problem of removing a set of at most k nodes to maximize the number of connected components in the remaining graph. This is one of the less studied CNDP variants, proven to be NP-hard [24]. In [24] a mixed integer linear programming approach is presented, and [27] presents a general integer programming framework. For special classes of graphs (trees and series-parallel graphs) a dynamic programming approach is presented in [23]. In [1] a genetic algorithm is designed to solve the problem. That genetic algorithm incorporates in the fitness function a penalization of solutions that are too close to the best solutions, combines a greedy strategy with variation operators, and employs a local search mechanism at the end in order to refine solutions.
In this paper we focus on the problem \(CDNP^3_a\), denoted here as kMaxComp, introduced in [23, 24]. The \(CDNP^3_a\) is by itself an interesting problem to study, with many possible applications. It has received less attention because it does not impose any conditions on the connected components. The problem consists in removing at most k nodes such that the number of remaining components is maximal. Formally, if S denotes the set of deleted nodes and \(\mathcal {H}(G[V\setminus S])\) denotes the set of maximal connected components of the graph G without the nodes in S, the optimization problem consists in
\[ \max_{S \subseteq V} \; |\mathcal {H}(G[V\setminus S])| \quad \text{subject to} \quad |S| \le k, \qquad (1) \]
where |A| denotes the cardinality of set A.
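As an illustration of the kMaxComp objective, the sketch below counts the components left after deleting a set S and, for very small graphs only, enumerates every S with at most k nodes. The adjacency-dictionary representation and the names `count_components` and `brute_force_kmaxcomp` are assumptions made for this example; exhaustive enumeration is exponential and is shown only to make the definition concrete, not as a method used in the paper.

```python
from itertools import combinations

def count_components(adj, removed):
    """Number of connected components of the graph after deleting `removed`."""
    removed = set(removed)
    seen = set()
    count = 0
    for start in adj:
        if start in removed or start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:  # depth-first search over surviving nodes
            u = stack.pop()
            for v in adj[u]:
                if v not in removed and v not in seen:
                    seen.add(v)
                    stack.append(v)
    return count

def brute_force_kmaxcomp(adj, k):
    """Try every deletion set of size at most k; return (best set, components)."""
    best_set, best_value = frozenset(), count_components(adj, ())
    nodes = list(adj)
    for size in range(1, k + 1):
        for subset in combinations(nodes, size):
            value = count_components(adj, subset)
            if value > best_value:
                best_set, best_value = frozenset(subset), value
    return best_set, best_value

# Star graph: deleting the hub leaves every leaf as its own component.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
best_set, best_value = brute_force_kmaxcomp(star, k=1)
print(sorted(best_set), best_value)
```

All sizes up to k are enumerated because deleting more nodes is not always better: removing an isolated node, for instance, lowers the component count.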
3 Maximum Components GA (MaxC-GA)
The goal of this work is to solve the kMaxComp problem using a minimal amount of problem-specific information during the search. Because we search for a set of nodes from a network, of which some will be included in the critical set S and some not, a binary encoding of an individual of length \(N=|V|\) is natural, making a genetic algorithm a first choice for approaching this problem. We call this algorithm Maximum Components GA (MaxC-GA); it is outlined in Algorithm 1. MaxC-GA is a simple approach for the \(CDNP^3_a\) problem that combines a standard GA with a constraint handling method based on the marginal contribution of a node to the fitness of an individual. This concept is borrowed from game theory, where marginal contributions are used to evaluate the contribution of a player to the value of a coalition when computing the Shapley value [22].
Encoding. An individual has length N equal to the number of nodes in the network. The value 1 on position i indicates that node i is included in S.
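A minimal sketch of this encoding, assuming nodes are labelled 0 to N-1 (the helper names `decode` and `encode` are hypothetical):

```python
def decode(individual):
    """Map a binary individual to the critical set S it encodes."""
    return {i for i, bit in enumerate(individual) if bit == 1}

def encode(critical_set, n):
    """Map a critical set S over nodes 0..n-1 to a binary individual."""
    return [1 if i in critical_set else 0 for i in range(n)]

individual = [0, 1, 0, 1, 1]
critical_set = decode(individual)
print(sorted(critical_set))
```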
Variation Operators. Two-point crossover and flip-bit mutation are used.
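The two operators can be sketched as follows. This is a textbook formulation under stated assumptions (cut points drawn uniformly without repetition, each bit flipped independently with probability `bit_prob`); the paper does not give these details, and the function names are hypothetical.

```python
import random

def two_point_crossover(parent_a, parent_b, rng):
    """Swap the segment between two distinct cut points of the parents."""
    n = len(parent_a)
    i, j = sorted(rng.sample(range(n + 1), 2))
    child_a = parent_a[:i] + parent_b[i:j] + parent_a[j:]
    child_b = parent_b[:i] + parent_a[i:j] + parent_b[j:]
    return child_a, child_b

def flip_bit_mutation(individual, bit_prob, rng):
    """Flip each bit independently with probability `bit_prob`."""
    return [1 - bit if rng.random() < bit_prob else bit for bit in individual]

rng = random.Random(42)  # seeded generator, for reproducibility only
child_a, child_b = two_point_crossover([0] * 8, [1] * 8, rng)
mutated = flip_bit_mutation(child_a, 0.05, rng)
```

Neither operator looks at the graph, so offspring may encode more than k nodes; this is why a separate constraint handling step is needed before evaluation.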
Selection. Tournament selection is used to select individuals for recombination and mutation.
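A minimal sketch of tournament selection, assuming contenders are drawn without replacement and the fittest contender wins (the paper does not specify either detail; `tournament_select` is a hypothetical name):

```python
import random

def tournament_select(population, fitnesses, tournament_size, rng):
    """Pick `tournament_size` random contenders and return the fittest one."""
    contenders = rng.sample(range(len(population)), tournament_size)
    winner = max(contenders, key=lambda idx: fitnesses[idx])
    return population[winner]

rng = random.Random(0)
population = [[0, 0], [0, 1], [1, 0], [1, 1]]
fitnesses = [1, 3, 2, 0]
parent = tournament_select(population, fitnesses, 2, rng)
```

The tournament size controls selection pressure: a size of 1 is uniform random selection, while a size equal to the population size always returns the best individual.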
Fitness Function. The fitness of an individual is computed as the number of connected components that the removal of its nodes with value 1 yields. Thus, if individual x encodes the critical set \(S_x\), then the fitness f(x) of x is computed as
\[ f(x) = |\mathcal {H}(G[V\setminus S_x])|. \qquad (2) \]
Constraint Handling. In order to ensure that the size of the corresponding set S does not exceed k, before evaluation each individual is constrained to have at most k nodes with value 1, by removing from S the nodes with the lowest marginal contribution to the fitness of the individual. The marginal contribution of a node to the fitness of the individual is computed as the difference between the fitness of the individual and the fitness of the individual with the node removed from its corresponding set S of critical nodes. For a node i with value 1 in individual x with corresponding critical set \(S_x\), the marginal contribution of node i to the fitness of x, denoted by \(u_i(x)\), is:
\[ u_i(x) = f(x) - f(x_{-i}), \qquad (3) \]
where f(x) is the fitness defined in Eq. (2) and \(x_{-i}\) denotes the individual obtained from x by setting position i to 0, so that its critical set is \(S_x \setminus \{i\}\).
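The repair step can be sketched as below. Two points are assumptions of this example rather than statements from the paper: marginal contributions are recomputed after every removal, and ties are broken in favour of the lowest node index. The graph is an adjacency dictionary over nodes 0 to N-1, and all function names are hypothetical.

```python
def count_components(adj, removed):
    """Number of connected components after deleting the nodes in `removed`."""
    seen = set(removed)  # deleted nodes are treated as already visited
    count = 0
    for start in adj:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            for v in adj[stack.pop()]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return count

def fitness(adj, individual):
    """Components left after removing every node whose bit is 1."""
    removed = {i for i, bit in enumerate(individual) if bit == 1}
    return count_components(adj, removed)

def marginal_contribution(adj, individual, i):
    """u_i(x): fitness of x minus fitness of x with node i put back."""
    without_i = list(individual)
    without_i[i] = 0
    return fitness(adj, individual) - fitness(adj, without_i)

def repair(adj, individual, k):
    """Drop lowest-contribution nodes until at most k bits are set."""
    repaired = list(individual)
    while sum(repaired) > k:
        selected = [i for i, bit in enumerate(repaired) if bit == 1]
        worst = min(selected,
                    key=lambda i: marginal_contribution(adj, repaired, i))
        repaired[worst] = 0
    return repaired

# Star graph with hub 0: the hub has the largest marginal contribution.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
repaired = repair(star, [1, 1, 1, 0, 0], k=1)
print(repaired)
```

In this example the two leaves have negative marginal contributions (putting either back would add a component), so they are dropped first and the hub is kept.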
Parameters. MaxC-GA is a standard GA, and uses typical GA parameters: maximum number of generations, crossover and mutation probabilities, probability to mutate a bit, and tournament size. The effect of these parameters on the search results of a GA has been widely documented [15].
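Algorithm 1 is not reproduced in this text, so the following is only one plausible way to assemble the components described in this section into a generational GA. Generational replacement, keeping track of the best individual found, the default parameter values, and every function name are assumptions of this sketch.

```python
import random

def count_components(adj, removed):
    seen, count = set(removed), 0  # deleted nodes count as visited
    for start in adj:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            for v in adj[stack.pop()]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return count

def fitness(adj, x):
    return count_components(adj, {i for i, bit in enumerate(x) if bit})

def repair(adj, x, k):
    """Unset the bit with the lowest marginal contribution until |S| <= k."""
    x = list(x)
    while sum(x) > k:
        base = fitness(adj, x)
        def contribution(i):
            return base - fitness(adj, x[:i] + [0] + x[i + 1:])
        worst = min((i for i, bit in enumerate(x) if bit), key=contribution)
        x[worst] = 0
    return x

def tournament(pop, fits, size, rng):
    idx = max(rng.sample(range(len(pop)), size), key=lambda i: fits[i])
    return pop[idx]

def maxc_ga_sketch(adj, k, pop_size=20, generations=50, p_cross=0.8,
                   p_mut=1.0, p_bit=0.05, tour_size=2, seed=0):
    rng = random.Random(seed)
    n = len(adj)  # nodes are assumed to be labelled 0..n-1
    pop = [repair(adj, [rng.randint(0, 1) for _ in range(n)], k)
           for _ in range(pop_size)]
    fits = [fitness(adj, x) for x in pop]
    best = max(zip(fits, pop))
    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            a = tournament(pop, fits, tour_size, rng)
            b = tournament(pop, fits, tour_size, rng)
            if rng.random() < p_cross:  # two-point crossover
                i, j = sorted(rng.sample(range(n + 1), 2))
                a, b = a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]
            for child in (a, b):
                if rng.random() < p_mut:  # flip-bit mutation
                    child = [1 - g if rng.random() < p_bit else g
                             for g in child]
                offspring.append(repair(adj, child, k))
        pop = offspring[:pop_size]
        fits = [fitness(adj, x) for x in pop]
        best = max(best, max(zip(fits, pop)))
    return best  # (fitness, individual) of the best solution seen

star = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
best_fit, best_x = maxc_ga_sketch(star, k=1, generations=10)
```

The parameters of the sketch mirror the ones listed above: number of generations, crossover and mutation probabilities, per-bit mutation probability, and tournament size.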
4 Numerical Experiments
The behavior of MaxC-GA is illustrated on several benchmarks by comparing its results with the best known results found in the literature for this problem.
Benchmarks. A set of synthetic benchmarks (see Note 1) was proposed in [26]. The benchmark set contains four different types of graphs: Barabási-Albert (BA), Erdős-Rényi (ER), Forest-fire (FF), and Watts–Strogatz (WS) graphs. BA graphs are scale-free networks, ER graphs are random networks, FF graphs simulate how fire spreads through a forest, and WS graphs are small-world graphs with a dense structure.
Table 1 presents some basic measures of the benchmarks used for numerical experiments here: number of nodes (|V|), number of edges (|E|), average degree (\(\langle d \rangle \)), density of the graph (\(\rho \)), and average path length (\(l_G\)). In a similar manner, real networks are described in Table 2 with a reference added for each network.
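For reference, the measures listed above can be computed as in the following sketch. It assumes a simple undirected graph given as an adjacency dictionary and uses the usual definitions, \(\langle d \rangle = 2|E|/|V|\) and \(\rho = 2|E|/(|V|(|V|-1))\), with the average path length taken over connected node pairs; whether the tables were produced with exactly these definitions is not stated in the text.

```python
from collections import deque

def basic_measures(adj):
    """Return |V|, |E|, average degree, density, and average path length."""
    n = len(adj)
    m = sum(len(neighbors) for neighbors in adj.values()) // 2
    total_distance, reachable_pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search gives shortest hop counts
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total_distance += sum(dist.values())
        reachable_pairs += len(dist) - 1
    return {
        "nodes": n,
        "edges": m,
        "avg_degree": 2 * m / n,
        "density": 2 * m / (n * (n - 1)),
        "avg_path_length": (total_distance / reachable_pairs
                            if reachable_pairs else 0.0),
    }

# Path graph 0-1-2-3 as a small worked example.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
measures = basic_measures(path)
print(measures)
```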
Parameter Settings. Several parameter settings are tested: population size set to 25 and 50, maximum number of generations 500, crossover probability 0, 0.5, 0.8, and 1, and mutation rate 0, 0.01, 0.02, 0.03, 0.04, and 0.05.
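If every combination of the listed values is tested (a full factorial design, which is an assumption here since the text does not say so explicitly), the settings can be enumerated as follows; the dictionary keys are hypothetical names.

```python
from itertools import product

population_sizes = [25, 50]
max_generations = [500]
crossover_probabilities = [0, 0.5, 0.8, 1]
mutation_rates = [0, 0.01, 0.02, 0.03, 0.04, 0.05]

# One dictionary per parameter setting (full factorial grid).
settings = [
    {"pop_size": p, "generations": g, "p_cross": c, "mutation_rate": m}
    for p, g, c, m in product(population_sizes, max_generations,
                              crossover_probabilities, mutation_rates)
]
print(len(settings))
```

Under this assumption the grid contains 2 x 1 x 4 x 6 = 48 settings per benchmark.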
Results and Discussion. MaxC-GA is compared with three algorithms described in [1]: two greedy algorithms, the first one, \(G_1\), based on node deletion from the candidate critical node set, and the second one, \(G_2\), based on node addition to the candidate critical node set, and a genetic algorithm from an evolutionary algorithm framework using greedy rules (denoted by GA). That genetic algorithm uses a specific fitness function that combines the number of connected components determined by the individual with previous search information, problem-specific variation operators, and a specifically designed local search technique. Since the problem has been less addressed, we only have one approach based on GAs to compare with, and those results represent only one run. The results presented in this paper are preliminary and promising, supporting the idea that this approach may be extended to larger data sets.
As the results presented in [1] include only the maximum number of connected components in one run, statistical comparisons with the results reported there are not possible. Table 3 includes these results as well as the best results reported by MaxC-GA. Results reported by MaxC-GA using different parameter settings are illustrated in Fig. 2. Furthermore, Fig. 1 illustrates the evolution of the search of MaxC-GA (average best solutions over 10 runs). We find that the evolution is steady, faster for a larger population size, and that MaxC-GA is capable of finding and maintaining the optimal solution. Because the behavior of MaxC-GA under different parameter settings is typical of a GA with respect to convergence, we have presented only graphs showing that it is capable of detecting and maintaining the optimal solution during one run. In all other ways it behaves as expected: a larger population size leads to earlier convergence at a higher computational cost, and a small population will eventually converge.
The effect of various parameter settings is presented in Fig. 2 as boxplots of the ratio between the maximum fitness values reported in 10 runs for each parameter setting and the best known result for the benchmark (in order to keep all values between 0 and 1). We find that the algorithm is robust with respect to variation of parameters, with the notable exception that mutation plays an important role in the search, as setting the mutation rate to 0 significantly decreases the performance of the algorithm.
5 Conclusions
The critical node detection problem is approached with MaxC-GA, a simple genetic algorithm that uses a node fitness based on marginal contributions for constraint handling. Numerical results show that this approach is as effective as other, more complex approaches that use more problem-specific information.
These results may also be used to advocate for the use of minimal problem-specific information in designing new evolutionary algorithms for real-world applications. Overusing problem-specific information decreases the adaptability of a method, as practitioners will rarely try to adapt an existing algorithm from the literature to a slightly different problem, mainly because the stochastic nature of these approaches does not guarantee direct portability to a different problem.
Notes
1. Downloaded from http://individual.utoronto.ca/mventresca/cnd.html, last accessed 05.09.2020.
References
1. Aringhieri, R., Grosso, A., Hosteins, P., Scatamacchia, R.: A general evolutionary framework for different classes of critical node problems. Eng. Appl. Artif. Intell. 55, 128–145 (2016)
2. Boginski, V., Commander, C.W.: Identifying critical nodes in protein-protein interaction networks. In: Clustering Challenges in Biological Networks, pp. 153–167. World Scientific (2009)
3. Borgatti, S.P.: Identifying sets of key players in a social network. Comput. Math. Organ. Theory 12(1), 21–34 (2006). https://doi.org/10.1007/s10588-006-7084-x
4. Cacchiani, V., Caprara, A., Toth, P.: Scheduling extra freight trains on railway networks. Transp. Res. Part B: Methodol. 44(2), 215–231 (2010)
5. Chen, W., Jiang, M., Jiang, C., Zhang, J.: Critical node detection problem for complex network in undirected weighted networks. Phys. A: Stat. Mech. Appl. 538 (2020). https://doi.org/10.1016/j.physa.2019.122862
6. Cohen, R., Erez, K., Ben-Avraham, D., Havlin, S.: Resilience of the internet to random breakdowns. Phys. Rev. Lett. 85(21), 4626 (2000)
7. Dagdeviren, O., Akram, V.: An energy-efficient distributed cut vertex detection algorithm for wireless sensor networks. Comput. J. 57(12), 1852–1869 (2013). https://doi.org/10.1093/comjnl/bxt128
8. Dagdeviren, O., Akram, V., Tavli, B., Yildiz, H., Atilgan, C.: Distributed detection of critical nodes in wireless sensor networks using connected dominating set (2017). https://doi.org/10.1109/ICSENS.2016.7808815
9. Furini, F., Ljubić, I., Malaguti, E., Paronuzzi, P.: On integer and bilevel formulations for the k-vertex cut problem. Math. Program. Comput. 12(2), 133–164 (2019). https://doi.org/10.1007/s12532-019-00167-1
10. Goh, K.I., Cusick, M.E., Valle, D., Childs, B., Vidal, M., Barabási, A.L.: The human disease network. Proc. Natl. Acad. Sci. 104(21), 8685–8690 (2007)
11. Iyer, S., Killingback, T., Sundaram, B., Wang, Z.: Attack robustness and centrality of complex networks. PLoS One 8(4), e59613 (2013)
12. Lalou, M., Tahraoui, M., Kheddouci, H.: Component-cardinality-constrained critical node problem in graphs. Discrete Appl. Math. 210, 150–163 (2016). https://doi.org/10.1016/j.dam.2015.01.043
13. Lalou, M., Tahraoui, M., Kheddouci, H.: The critical node detection problem in networks: a survey. Comput. Sci. Rev. 28, 92–117 (2018). https://doi.org/10.1016/j.cosrev.2018.02.002
14. Lewis, J.M., Yannakakis, M.: The node-deletion problem for hereditary properties is NP-complete. J. Comput. Syst. Sci. 20(2), 219–230 (1980)
15. Lobo, F., Lima, C.F., Michalewicz, Z.: Parameter Setting in Evolutionary Algorithms, vol. 54. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-69432-8
16. Lozano, M., García-Martínez, C., Rodriguez, F.J., Trujillo, H.M.: Optimizing network attacks by artificial bee colony. Inf. Sci. 377, 30–50 (2017)
17. Milo, R., et al.: Superfamilies of evolved and designed networks. Science 303(5663), 1538–1542 (2004)
18. Min, S., Jiandong, L., Yan, S.: Critical nodes detection in mobile ad hoc network, vol. 2, pp. 336–340 (2006). https://doi.org/10.1109/AINA.2006.136
19. Paudel, N., Georgiadis, L., Italiano, G.: Computing critical nodes in directed graphs. ACM J. Exp. Algorithmics 23 (2018). https://doi.org/10.1145/3228332
20. Reimand, J., Tooming, L., Peterson, H., Adler, P., Vilo, J.: GraphWeb: mining heterogeneous biological networks for gene modules with functional significance. Nucleic Acids Res. 36, 452–459 (2008)
21. Rezaei, J., Zare-Mirakabad, F., MirHassani, S., Marashi, S.A.: EIA-CNDP: an exact iterative algorithm for critical node detection problem. Comput. Oper. Res. 127 (2021). https://doi.org/10.1016/j.cor.2020.105138
22. Shapley, L.S.: 17. A Value for n-Person Games, pp. 307–317. Princeton University Press (1953). https://doi.org/10.1515/9781400881970-018
23. Shen, S., Smith, J.C.: Polynomial-time algorithms for solving a class of critical node problems on trees and series-parallel graphs. Networks 60(2), 103–119 (2012)
24. Shen, S., Smith, J.C., Goli, R.: Exact interdiction models and algorithms for disconnecting networks via node deletions. Discrete Optim. 9(3), 172–188 (2012)
25. Ventresca, M., Harrison, K., Ombuki-Berman, B.: The bi-objective critical node detection problem. Eur. J. Oper. Res. 265(3), 895–908 (2018). https://doi.org/10.1016/j.ejor.2017.08.053
26. Ventresca, M.: Global search algorithms using a combinatorial unranking-based problem representation for the critical node detection problem. Comput. Oper. Res. 39(11), 2763–2775 (2012)
27. Veremyev, A., Prokopyev, O.A., Pasiliao, E.L.: An integer programming framework for critical elements detection in graphs. J. Comb. Optim. 28(1), 233–273 (2014). https://doi.org/10.1007/s10878-014-9730-4
28. Yang, R., Huang, L., Lai, Y.C.: Selectivity-based spreading dynamics on complex networks. Phys. Rev. E 78(2), 026111 (2008)
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Suciu, MA., Gaskó, N., Képes, T., Lung, R.I. (2021). A Simple Genetic Algorithm for the Critical Node Detection Problem. In: Sanjurjo González, H., Pastor López, I., García Bringas, P., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2021. Lecture Notes in Computer Science(), vol 12886. Springer, Cham. https://doi.org/10.1007/978-3-030-86271-8_11
DOI: https://doi.org/10.1007/978-3-030-86271-8_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86270-1
Online ISBN: 978-3-030-86271-8
eBook Packages: Computer Science, Computer Science (R0)