Abstract
In reality, many complex network systems can be abstracted to community detection in multi-layer networks, such as social relationships networks across multiple platforms. The composite community structure in multi-layer networks should be able to comprehensively reflect and describe the community structure of all layers. At present, most community detection algorithms mainly focus on the single layer networks, while those in multi-layer networks are still at the initial stage. In order to detect community structures in multi-layer networks, a new multi-objective evolution model is proposed in this paper. This model introduces the concept of modularity in different decision domains and the method of local search to iteratively optimize each layer of a network. Taking NSGA-II as the benchmark algorithm, the proposed multi-objective evolution model is applied to optimize the genetic operation and optimal solution selection strategies. The new algorithm is denoted as MulNSGA-II. The MulNSGA-II algorithm adopts the locus-based representation strategy, and integrates the genetic operation and local search. In addition, different optimal solution selection strategies are used to determine the optimal composite community structure. Experiments are carried out in real and synthetic networks, and results demonstrate the performance and effectiveness of the proposed model in multi-layer networks.
Access provided by Autonomous University of Puebla. Download conference paper PDF
Similar content being viewed by others
Keywords
1 Introduction
Complex networks are a simple and effective formalism in representing the fundamental structure among interacting units of real-world systems (e.g., social networks [1], biological networks [2], traffic networks [3]). Nodes and edges of a network represent objects in systems and relationships between objects, respectively. Community structure is an important feature of networks, which reflects some important properties, such as the network topology [4], the law of network existence [5].
At present, lots of algorithms have been proposed for community detection in single layer networks [6]. However, the same entity in a real complex system often presents multi-dimensional characteristics. For example, in social networks, people often communicate through different social platforms, e.g., microblog, QQ, and email. Multi-layer networks can more accurately describe the relationship between different systems and reflect the different properties of the same entity, which have gradually become one of the newest research trend in the field of complex networks [7,8,9,10].
A multi-layer network consists of a collection of different layers in which connections between nodes are mutually independent. Each layer of a network represents a relationship between entities and reflects an attribute of entities. Different interpretations of multi-layer networks have been proposed, such as multidimensional networks [11], multi-relational networks [12] and multiplex networks [13]. Community detection in multi-layer networks has not yet been defined uniformly, which faces more challenges compared with single-layer networks. Each layer of multi-layer networks has its own community structure, but it cannot reflect the overall structures accurately.
In order to explore the integrated communities in multi-layer networks, according to the concept of community detection (i.e., nodes within the community are closely connected and connections between communities are sparse [14]), community detection is naturally defined as a multi-objective optimization problem [15]. In this paper, in order to solve multi-objective optimization problems in multi-layer networks, a multi-objective evolutionary computing model for community detection in multi-layer networks is introduced. This model finds the community structure of each layer by iteratively optimizing the modularity of each layer of the network, which adopts the locus-based representation strategy, to integrate genetic operation and multilevel local search. Meanwhile, different optimal solution selection strategies are used to solve the compound community structure in multi-layer networks.
The rest of this paper is organized as follows. The problem definition is given in Sect. 2. Section 3 presents a multi-objective evolution model for community detection in multi-layer networks, and then proposes the MulNSGA-II algorithm based on the multi-objective model. Comprehensive experiments are performed to show the effectiveness of MulNSGA-II algorithm in various networks in Sect. 4. Finally, basic concluding remarks are discussed in Sect. 5.
2 Problem Definition
A multi-layer network can be modeled as a graph \(N = (\{G_1,...G_m\}, R)\), where \(G_m = (V_l, E_l)\) represents the networks at m layer with \(V_l\) node and \(E_l\) intra-layer links. And \(R = \{E_{ij} \subset V_i \times V_j, i, j \in {1,...m}, i \ne j \}\) denotes the connections between the nodes of layer \(G_i\) and layer \(G_j\). The elements of R are called the interlayer or crossed layers links. That is, a multi-layer network contains m subnetworks and there are connections among these m networks. Multi-layer networks have different network types according to the node sets of each layer and inter-layer relationships [16], as shown in Fig. 1. This paper mainly studies multi-layer networks with the same set of nodes in each layer, as shown in Fig. 1(a). Therefore, multi-layer networks can be formalized as follows.
Suppose \(\varOmega = \{S_1, . . . ,S_k\}\) is a set of feasible partitions in a multiplex network G and \(F = \{F_1,F_2, . . . ,F_d\}\) is a set of objective functions. A many-objective community detection problem in a multi-layer network can be defined as follows:
Each \(F_\delta \): \(\varOmega \rightarrow R\) evaluates the objective function only in the layer \(G_\delta \). Since F is a set of competing for objective functions that must be simultaneously optimized, it shows that the obtained non-dominated solutions of the Pareto front may be the best modularity of each layer.
Simplified composite modularity function is adopted here as the optimization function, so \(F_\delta (P) = - Q_\delta (P)\). Given a multi-layer network and community index, the community index can be maximized under non-overlapping conditions (i.e, each node can only be assigned to one community). Simplified composite modularity \(Q_e^{'}\) is calculated in Eq. (3).
M is the total number of communication connection. n represents the number of communication layers or dimensions of a network. The average difference between the true fraction and the expected fraction of edges belonged to the composite community structure is calculated by a simplified multilevel modularity in a multi-layer network. The larger the value of \(Q_e^{'}\) is, the higher the quality of the composite community structure is. A simplified composite modularity \(Q_e^{'}\) instead of the multiplex modularity is used to evaluate the quality of integrated community structure in a multi-layer network, which ignores coupling factors (i.e., the second term \(\delta (X_i,X_j)\) in Eq. (3)).
3 Proposed Method
This paper transforms the multi-layer community detection problem into a multi-objective optimization and presents a new multi-objective evolutionary computing model for community detection in multi-layer networks. This section will describe the multi-objective evolution model in detail from the aspects of encoding mode, genetic operation, local search, and optimal solution selection strategy.
3.1 Encoding Scheme
The encoding scheme of the solution is a key step to the success of an algorithm. The label-based and locus-based representation schemes are the two main encoding methods for community partition. However, the label-based representation is redundant, because if there are p labels in the diagram, then p! different chromosomes may correspond to the same partition [17]. For example, the vector [1, 1, 3, 3, 3, 3, 2, 2, 2] and [3, 3, 1, 1, 1, 1, 2, 2, 2] represent the same community division. Therefore, this paper adopts the locus-based adjacency representation which makes full use of the information in the diagram to express the division of communities. It is assumed that the chromosome or solution in the population is set as \(S=\{s_1, s_2,...s_N\}\). The gene length is N, and each gene i can be any integer from 1 to N, i.e., \(1<i<N\). The value of the \(i^{th}\) gene can be j, provided that i and j connected on at least one layer of the network.
3.2 Genetic Operators
Crossover is an important step in the genetic operation, and the uniform crossover is adopted in this paper. Uniform crossover actually belongs to the category of multi-point crossover, which has been proved to be an effective operator in evolutionary algorithms (EAs). The uniform crossover operation is as follows. First, a binary mask of the same length as the chromosome is randomly generated. The offspring is generated by selecting genes from the parent chromosome according to the mask. If the mask is equal to 0, the gene is selected from the first parent chromosome; otherwise, the gene is selected from the second parent chromosome. Figure 2(a) shows a simple example of a uniform crossover operator.
Mutation combines the knowledge of the layer nodes neighborhood, which mutates with a random probability. The \(i^{th}\) gene on a chromosome is randomly selected with predefined probability and mutates into the \(j^{th}\) neighbor node of the \(i^{th}\) gene. A simple example of the proposed mutation operator is given in Fig. 2(b).
3.3 Local Search Operation
The hill climbing method (HC) is incorporated into the proposed multi-objective evolution model. First, the neighbors of a chromosome are defined as follows: Given a chromosome \(S_k = \{ S_k^1, S_k^2, ... , S_k^n \}\), node \(S_k^i\) is randomly selected from chromosome \(S_k\). Then the gene \(S_k^i\) is replaced with other neighbor nodes \(S_k^j\) of the location i, where \(S_k^j \ne S_k^i\). The new generated partition is defined as the neighbor of chromosome \(S_k\). In the local search process, a chromosome is randomly selected for refinement and all possible neighbor chromosomes are identified. Compared with the original chromosome, the newly generated one is selected to replace this chromosome if it can achieve better solutions. The details of HC procedure are given in Algorithm 1.
3.4 Optimal Selection Strategy
In multi-objective optimization problems [18], determining how to find the optimal solution from the Pareto set is a key problem in multi-objective optimization. This paper adopts different optimal solution selection strategies. Multi-objective optimization and multi-criterion decision-making are divided into three types, namely, a posteriori, a priori, and an interactive [19].
In the strategy based on a priori, this paper selects the maximum value of the objective function, i.e., the maximum modularity, in the Pareto set as the optimal solution. In this case, the method is called prik.
Suppose \(S_i=(s_1, s_2, ..., s_m)\) is the fitness value of the \(i^{th}\) solution on the Pareto front. Based on an interactive method, the average value \(mvt = ((\sum _{i}s_x)/d)\) is calculated for each \(S_i\) in the optimization process, and the solution with the highest value \(max\_s\) is selected as the optimal solution at the end. This optimization algorithm is called postk.
The posteriori decision-making method mainly introduces the approach of k-means clustering, which can group a set of data obtained through different models to improve the quality of the results. First, MulNSGA-II algorithm is used to discover community detection at all layers, so as to form the Pareto front. Then, this strategy can find a consistent community clustering from detected communities, namely, to determine the optimal solution from the Pareto front. ot change or the maximum number of iterations is reached.
4 Experiment
4.1 Network Datasets
The synthetic network is generated by the benchmark function proposed by Bródka and Grecki [20], which is an extension of the LFR benchmark [21]. By changing parameters, networks with different structures and layers are generated respectively. Parameter settings are shown in Table 1. The mixed parameter u represents the connection part between a node and all nodes of a community. Generally speaking, the quality of the community divided will decrease with the increase of u. The degree change chance Dc will control how different the node degrees at different network layers are. The higher the Dc parameter value is, the more different nodes in different layers may be. In general, the mixing parameter u and the degree change chance Dc jointly control the network structure of each layer in a multi-layer network.
There are two real-world networks, which can be downloaded from the websiteFootnote 1. The first real multi-layer network is the KAPFERER TAILOR SHOP network, which records the interaction of people in a tailor shop. The network consists of 39 nodes, 1108 links, and four layers, that is, two instrumental attribute networks (i.e., work and assistance) and two social attribute networks (i.e., friendship and social emotion). The second real multi-layer network is the CS-AARHUS social network, which consists of 61 nodes, 620 connections, and five layers networks, namely, online Facebook relationship and offline relationship (i.e., leisure, work, co-authorship and lunch).
4.2 Evaluation Metrics
Redundancy index [22] used as an evaluation metrics calculates the proportion of redundant links in multi-layer networks. The intuitive implication of this metric is that composite communities should have more connections in multiple layer networks. Redundancy (denoted by Rc) is expressed in Eq. (4).
\(\Vert p\Vert \) is the total number of communities in a multi-layer network. \(S_i^{'}\) is the set \(\{u,v\}\) that connects at least one layer in G community. If \(\{u,v\} \in E_l\), \(\beta (u,v,E_l)=1\), 0 otherwise. The higher the value of \(R_c\) is, the better the quality of the partition is.
The Normalized Mutual Information (NMI) used as the other evaluation metrics can evaluate the similarity between the current clusters and the previous ones [23]. Assuming that \(c_1\) and \(c_2\) are two network partitions, then NMI(A, B) can be calculated as Eq. (5).
\(F^{'}\) is a confusion matrix. \(F_i^{'}(F_j^{'})\) is the number of elements in the \(i^{th}\) row (or the \(j^{th}\) column) of \(F^{'}\). \(l_{c_1}(l_{c_2})\) represents the total number of clustering in a partition \(c_1(c_2)\). The value of NMI ranges from 0 to 1. The larger the value of NMI is, the more similar the original and optimized networks are.
4.3 Experiments Result
The proposed method is to obtain the final results by averaging the values in 10 runs. Population size, iteration number, crossover and mutation probability are set as 200, 100, 0.8 and 0.2, respectively. These parameter values are obtained by the trial-and-error method on the benchmark function to get the best result.
Figure 3 shows the results of NMI in 12 different network structures based on different parameters in the mLFR dataset. The first experiment is mainly used to compare the three strategies of MulNSGA-II algorithm (i.e., MulNSGA-prik, MulNSGA-clu, MulNSGA-postk) and MLMaOP-proj, MLMaOP-cspa, MLMaOP-mf [24], under the different network structures formed by the interaction of the layers of multi-layer networks, the mixing parameters u and the degree change chance Dc.
It is obvious that the increase of network layers hardly affects the performances of MulNSGA-prik, MulNSGA-clu, MulNSGA-postk in Fig. 3. Dc controls the network structure, and the algorithm can find the best community division under different network structures. The performance of MulNSGA-II algorithm (i.e., MulNSGA-prik, MulNSGA-clu, MulNSGA-postk) is much better than that of MLMaOP-proj, MLMaOP-cspa, MLMaOP-mf algorithm [24] in different network structures and layers.
Tables 2 and 3 show that the comparison results between the proposed MulNSGA-II algorithm (i.e., MulNSGA-prik, MulNSGA-clu, MulNSGA-postk) and the MOEA-MultiNet [25], BGLL algorithms [26] in one of real multi-layer networks.
It can be seen from the data in Tables 2 and 3 that the compound community structure obtained by the proposed MulNSGA-II algorithm (i.e., MulNSGA-prik, MulNSGA-clu, MulNSGA-postk) is overall superior to the MOEA-MultiNet [25] and the BGLL algorithms [26] based on a single layer.
5 Conclusion
Community detection in multi-layer networks has attracted extensive attention. At present, the definition and evaluation index of community in multi-layer networks are still open questions. Therefore, this paper transforms community detection in multi-layer networks into a multi-objective problem and proposes a multi-objective optimization model in multi-layer networks. The model has the following characteristics.
-
The new multi-objective evolutionary computing model for community detection in multi-layer networks introduces a concept of modularity in different decision domains, and iteratively optimizes each layer of the network. By evaluating the objective function of each layer network to form the objective space, this paper proposes three different optimal solution selection strategies to find the optimum solution.
-
In order to overcome the problem of local optimal solution in the optimization-based community detection algorithm, local search strategy is introduced into this model.
-
The improved MulNSGA-II algorithm based on the multi-objective model has better performance than other algorithms in real multi-layer networks, which also shows the effectiveness of the proposed model. At the same time, by changing the network structure and layers in synthetic networks, the model is still able to find high-quality communities. When applied to layer 3 and 4 networks, the performance of the algorithm is almost unchanged. Experiments show that the improved MulNSGA-II algorithm may be applied to higher dimensional multilayer networks.
References
Gao, C., Liu, J.M.: Network-based modeling for characterizing human collective behaviors during extreme events. IEEE Trans. Sys. Man Cybern. Syst. 46(1), 171–183 (2017)
Chiti, F., Dobson, C.M.: Protein misfolding, amyloid formation, and human disease: A summary of progress over the last decade. Annu. Rev. Biochem. 86, 27–68 (2017)
Strano, E., Viana, M.P., Sorichetta, A.: Mapping road network communities for guiding disease surveillance and control strategies. Sci. Rep-UK 8(1), 4744 (2018)
Li, Z.T., Liu, J., Wu, K.: A multiobjective evolutionary algorithm based on structural and attribute similarities for community detection in attributed networks. IEEE Trans. Cybern. 48(7), 1963–1976 (2017)
Liu, C.L., Liu, J., Jiang, Z.Z.: A multiobjective evolutionary algorithm based on similarity for community detection from signed social networks. IEEE Trans. Cybern. 44(12), 2274–2287 (2014)
Gao, C., Liang, M.X., Li, X.H.: Network community detection based on the Physarum-inspired computational framework. IEEE/ACM Trans. Comput. Bi. 15(6), 1916–1928 (2018)
Bravobenitez, B., Alexandrovakabadjova, B., Martinezjaramillo, S.: Centrality measurement of the mexican large Value payments system from the perspective of multiplex networks. Comput. Econ. 47(1), 19–47 (2016)
Yao, Y., Zhang, R., Fan, Y.: Link prediction via layer relevance of multiplex networks. Int. J. Mod. Phys. C 28(08), 1750101 (2017)
Ma, L.J., Gong, M.G., Yan, J.N., Liu, W.F., Wang, S.F.: Detecting composite communities in multiplex networks: a multilevel memetic algorithm. Swarm Evol. Comput. 39, 177–191 (2018)
Taylor, D., Shai, S., Stanley, N., Mucha, P.J.: Enhanced detectability of community structure in multilayer networks through layer aggregation. Phys. Rev. Lett. 116(22), 228301 (2016)
Xuan, Q., Ma, X.D., Fu, C.B., Dong, H., Zhang, G.J.: Heterogeneous multidimensional scaling for complex networks. Int. J. Mod. Phy. C. 26(02), 1550023 (2015)
Dai, C.Y., Chen, L., Li, B., Li, Y.: Link prediction in multi-relational networks based on relational similarity. Inform. Sci. 394, 198–216 (2017)
Pitsik, E., et al.: Inter-layer competition in adaptive multiplex network. New J. Phy. 20(7), 075004 (2018)
Boutemine, O., Bouguessa, M.: Mining community structures in multidimensional networks. ACM Trans. Knowl. Discov. Data 11(4), 51 (2017)
Li, Z.T., Liu, J., Wu, K.: A multiobjective evolutionary algorithm based on structural and attribute similarities for community detection in attributed networks. IEEE Trans. Cybern. 48(7), 1963–1976 (2017)
Wang, Z., Wang, L., Szolnoki, A., Perc, M.: Evolutionary games on multilayer networks: A colloquium. Eur. Phys. J. B 88(5), 124 (2015)
Pizzuti, C.: Evolutionary computation for community detection in networks: A review. IEEE Trans. Evol. Comput. 22(3), 464–483 (2018)
Chen, X.J., Liu, Y.X., Li, X.H., Wang, Z., Wang, S.X., Gao, C.: A new evolutionary multiobjective model for traveling salesman problem. IEEE Access. 7, 66964–66979 (2019). https://doi.org/10.1109/ACCESS.2019.2917838. https://ieeexplore.ieee.org/abstract/document/8718296
Purshouse, R.C., Deb, K., Mansor, M.M., Mostaghim, S., Wang, R.: A review of hybrid evolutionary multiple criteria decision making methods. In: 2014 IEEE Congress on Evolutionary Computation (CEC), pp. 1147–1154 (2014)
Bródka, P.: A method for group extraction and analysis in multi-layered social networks. Ph.D. disertation, Wroclaw, Poland, arXiv.org:1302.1369 (2012). https://www.ii.pwr.edu.pl/~brodka/index-en.html
Lancichinetti, A., Fortunato, S., Radicchi, F.: Benchmark graphs for testing community detection algorithms. Phys. Rev. E 78(4), 046110 (2008)
Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. 69(2), 026113 (2004)
Amelio, A., Pizzuti, C.: Community detection in multidimensional networks. In: 2014 IEEE Proceedings of the 26th International Conference on Tools with Artificial Intelligence, pp. 352–359 (2014)
Pizzuti, C., Socievole, A.: Many-objective optimization for community detection in multi-layer networks. In: 2017 IEEE Congress on Evolutionary Computation (CEC), pp. 411–418 (2017)
Liu, W.F., Wang, S.F., Gong, M.: An improved multiobjective evolutionary approach for community detection in multilayer networks. In: 2017 IEEE Congress on Evolutionary Computation (CEC), pp. 443–449 (2017)
Lancichinetti, A., Fortunato, S.: Consensus clustering in complex networks. Sci. Rep. 2, 336 (2012). https://www.nature.com/articles/srep00336
Acknowledgment
This work is supported by National Natural Science Foundation of China (Nos. 61602391, 61402379), Natural Science Foundation of Chongqing (No. cstc2018jcyjAX0274), and in part of Southwest University Training Programs of Innovation and Entrepreneurship for Undergraduates (No. X201910635045).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, X., Li, X., Deng, Y., Chen, S., Gao, C. (2019). A New Multi-objective Evolution Model for Community Detection in Multi-layer Networks. In: Douligeris, C., Karagiannis, D., Apostolou, D. (eds) Knowledge Science, Engineering and Management. KSEM 2019. Lecture Notes in Computer Science(), vol 11775. Springer, Cham. https://doi.org/10.1007/978-3-030-29551-6_18
Download citation
DOI: https://doi.org/10.1007/978-3-030-29551-6_18
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29550-9
Online ISBN: 978-3-030-29551-6
eBook Packages: Computer ScienceComputer Science (R0)