Abstract
Community detection has become pervasive in understanding complex network structures and detecting similar patterns. The main motivation for using deep learning methods for community detection is the strong performance shown by deep neural networks in various fields. The problem of community detection can be solved using unsupervised learning models. The high-dimensional feature space representation of network data leads to complex neural network architectures that require a large number of trainable parameters. Deep learning-based models can transform the high-dimensional graph data of complex networks into a simple, low-dimensional latent representation that retains the meaningful features of the network data. Because this mapping preserves the structural information of the network, clustering algorithms can then be applied to the latent representation. This survey paper provides an overview of traditional and deep learning-based methods of community detection, followed by a discussion of the challenges and future directions of community detection.
1 Introduction
Community detection is a multidisciplinary research area used to study the structural properties of complex networks. These network structures take the form of high-dimensional graph data, which requires a large number of trainable parameters. Deep learning techniques are employed to analyze the rich nonlinear structure of real-world networks. Spectral clustering techniques perform well only on small networks and do not scale to large ones. Other traditional methods used for community detection (as listed in Table 1), such as statistical inference, do not perform well on large networks with high-dimensional features. On such networks, these methods incur high computational complexity in both time and space. Deep learning models, when used for community detection, provide improved performance over traditional techniques such as spectral clustering and statistical inference.
They learn the nonlinear structural properties of the network and represent the network in its low-dimensional form. Reference [7] presented the architecture of the convolutional neural network to mitigate the redundant information existing in the networks by sharing the weights of convolutional layers among residual blocks, thus showing a 45% reduction in network parameters and increased efficiency.
1.1 Community: Definition and Properties
Community: A set \(c = \{c_1, c_2, \ldots, c_k\}\) denotes the k communities into which a network G(V, E) is partitioned. Informally, a community C is a subgraph of a network consisting of a collection of nodes from V such that the edges inside the community are denser than the edges linking the vertices of C with the other communities of the graph. Here, V is the set of vertices, E is the set of edges, n = |V|, m = |E|, C is a subset of V, and \(n_c\) = |C|.
Intra Cluster density
Inter Cluster density
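Assuming the standard definitions from the community detection literature, and writing \(m_C\) for the number of edges with both endpoints in C and \(m_C^{ext}\) for the number of edges with exactly one endpoint in C (both symbols are introduced here for illustration only), these two densities can be written as:

\[
\delta_{int}(C) = \frac{m_C}{n_c (n_c - 1)/2}, \qquad
\delta_{ext}(C) = \frac{m_C^{ext}}{n_c (n - n_c)}
\]

Under this reading, C is a plausible community when \(\delta_{int}(C)\) is appreciably larger than the average link density of the graph, \(m / (n(n-1)/2)\), and \(\delta_{ext}(C)\) is much smaller.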
Connectedness is also an important property: each pair of vertices in C should be connected by a path running inside C. Community detection is meaningful only on sparse graphs.
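As a minimal sketch of the two density measures named above, the following Python function counts the edges inside a candidate community and the edges leaving it, and divides each count by the maximum possible number of such edges. The function name `cluster_densities`, the edge-list input format, and the toy graph are assumptions made for illustration; the graph is taken to be undirected and simple.

```python
def cluster_densities(edges, community, n):
    """Return (intra, inter) cluster density of `community`.

    edges     : iterable of undirected edges (u, v), each listed once
    community : set of vertices forming the candidate community C
    n         : total number of vertices in the graph
    """
    community = set(community)
    n_c = len(community)
    # Edges with both endpoints inside C.
    internal = sum(1 for u, v in edges if u in community and v in community)
    # Edges with exactly one endpoint inside C.
    boundary = sum(1 for u, v in edges if (u in community) != (v in community))
    max_internal = n_c * (n_c - 1) / 2   # all possible pairs inside C
    max_boundary = n_c * (n - n_c)       # all possible pairs leaving C
    intra = internal / max_internal if max_internal else 0.0
    inter = boundary / max_boundary if max_boundary else 0.0
    return intra, inter


# Toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
intra, inter = cluster_densities(edges, {0, 1, 2}, n=6)
```

For the triangle {0, 1, 2}, all three possible internal edges are present while only one of the nine possible outgoing edges exists, which is the pattern expected of a community.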
2 Categorization of Deep Learning Techniques
A. Convolutional Neural Networks (CNNs)
CNNs belong to a specific category of feed-forward neural networks. Reference [8] addresses the problem of community detection in topologically incomplete networks, proposing a deep CNN model that proved more robust than classical supervised models, even when edges are missing from the network.
B. Autoencoder-Based CD Approach
Autoencoders (AEs) are unsupervised models that resemble spectral clustering frameworks in their use of low-dimensional matrix reconstruction [9]. Several studies use variants of the autoencoder model, such as stacked autoencoders [10] and sparse AEs [11].
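The reconstruction idea can be illustrated with a small sketch, assuming NumPy is available. It trains a one-hidden-layer autoencoder on the rows of an adjacency matrix using plain gradient descent. The tanh encoder, linear decoder, learning rate, and toy graph are all illustrative assumptions, not the architectures of [9], [10], or [11].

```python
import numpy as np


def train_autoencoder(X, hidden_dim, lr=0.2, n_steps=2000, seed=0):
    """Train a one-hidden-layer autoencoder with plain gradient descent.

    Returns the latent codes (one row per node) and the loss history.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(scale=0.1, size=(d, hidden_dim))
    W_dec = rng.normal(scale=0.1, size=(hidden_dim, d))
    losses = []
    for _ in range(n_steps):
        Z = np.tanh(X @ W_enc)              # encoder: low-dimensional code
        X_hat = Z @ W_dec                   # linear decoder: reconstruction
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared reconstruction error.
        grad_out = 2.0 * err / err.size
        grad_W_dec = Z.T @ grad_out
        grad_Z = grad_out @ W_dec.T
        grad_pre = grad_Z * (1.0 - Z ** 2)  # derivative of tanh
        grad_W_enc = X.T @ grad_pre
        W_enc -= lr * grad_W_enc
        W_dec -= lr * grad_W_dec
    return np.tanh(X @ W_enc), losses


# Toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0

Z, losses = train_autoencoder(A, hidden_dim=2)
```

The rows of `Z` are two-dimensional codes for the six nodes; in a full pipeline they would be passed to a clustering algorithm such as k-means to obtain communities.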
C. Generative Adversarial Networks (GANs)
GANs consist of two competing neural networks that are trained adversarially to improve discriminative ability; when applied to the community detection problem, this training addresses the overfitting challenge and results in fast-adjusting precision. Wang et al. [12] introduced an approach for representing graphs in a low-dimensional vector space, in which each vertex of the graph is represented as a low-dimensional vector. The authors of [13] proposed a novel deep learning algorithm that uses graph representation learning techniques to solve overlapping community detection problems. Previous approaches focused only on communities with rich domain-specific topological information and failed on networks with less structural information. The authors of [14] devised an approach for cross-domain network representation.
D. Deep NMF-Based CD Approaches
Non-negative matrix factorization (NMF) [15] factorizes a large matrix into two matrices with non-negative values. For community detection tasks, NMF decomposes the adjacency matrix of a network into the product of two matrices with non-negative elements, minimizing an error function between the adjacency matrix and this product; the resulting factors are then used for network partitioning. NMF can be applied to both overlapping and non-overlapping community detection tasks. Conventional NMF, however, cannot capture all the sophisticated topological information needed for community detection. A deep learning-based NMF approach, which uses stacked NMF layers as a multilayer learning strategy to uncover latent feature hierarchies in complex data, has shown improved performance compared with single-layer networks [16].
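A single-layer version of this decomposition can be sketched as follows, assuming NumPy and the classic multiplicative update rules for the squared (Frobenius) reconstruction error. The function name `nmf`, the iteration count, and the toy graph are illustrative assumptions; this is not the stacked model of [16].

```python
import numpy as np


def nmf(A, k, n_iter=200, eps=1e-9, seed=0):
    """Factorize a non-negative matrix A (n x m) as W (n x k) times H (k x m)."""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0

W, H = nmf(A, k=2)
labels = W.argmax(axis=1)   # non-overlapping assignment: strongest factor per node
```

Taking the largest entry of each row of `W` gives a non-overlapping partition; keeping every factor whose weight exceeds a chosen threshold would instead give overlapping memberships.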
E. Deep SF-Based CD Approach
Sparse filtering (SF) [17] is an effective feature learning algorithm that is known to handle high-dimensional graph data. It is an efficient two-layer learning model that is nearly hyperparameter-free, having only a single hyperparameter (the number of features to learn). It optimizes a simple cost function, the sparsity of \(\ell_2\)-normalized features, and scales easily to high-dimensional input data. It can also learn significant features in multiple layers through stacking. In community discovery, a sparse filtering algorithm is applied to extract network features for subsequent network partitioning, resulting in meaningful community structures [18].
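The cost function described above can be sketched in a few lines, assuming NumPy. The sketch evaluates the objective only (a soft absolute value, normalization of each feature across examples, normalization of each example to unit \(\ell_2\) norm, then an \(\ell_1\) sum); the optimizer is omitted, and the random input matrix is a stand-in for whatever network-derived input a community detection pipeline would supply.

```python
import numpy as np


def sparse_filtering_objective(W, X, eps=1e-8):
    """Sparse filtering cost for weights W (features x inputs) and data X (inputs x examples)."""
    F = np.sqrt((W @ X) ** 2 + eps)                    # soft absolute value of linear features
    F = F / np.linalg.norm(F, axis=1, keepdims=True)   # normalize each feature across examples
    F = F / np.linalg.norm(F, axis=0, keepdims=True)   # normalize each example to unit l2 norm
    return float(F.sum())                              # l1 norm, since all entries are positive


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 6))   # 5 input dimensions, 6 examples (illustrative data)
W = rng.normal(size=(4, 5))   # 4 features to learn: the single hyperparameter
objective = sparse_filtering_objective(W, X)
```

Because every example has unit \(\ell_2\) norm after the second normalization, a lower objective means each example is described by fewer active features. In practice `W` would be found by passing this function and its gradient to a generic optimizer.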
F. Community Embedding-Based Approaches
The graph embedding approach focuses on the distribution of nodes present in communities in low-dimensional space. The approach embeds communities rather than specific nodes, which is a reverse approach. Community embedding is good for community detection as well as node categorization [19]. Reference [20] proposed a probabilistic generative model to learn representations of the social network by observing the information diffusion cascades instead of network structures. The proposed model learns community-preserving social network embeddings from social contagion logs. The focus is to discover social structures and predict information propagation in the network.
G. Community Detection Based on Graph Neural Networks (GNN)
GNNs can model the complex relationships in graph-related data. GNN models are based on deep learning and graph mining techniques. The authors of [21] proposed a model for detecting overlapping communities using a GNN approach, which proved to be more effective and robust than other existing approaches. The authors of [22] proposed a modified GNN framework that uses the line graph and the non-backtracking operator of the graph to analyze edge adjacency information. Their algorithms can be used for node classification as well as community detection and have shown improvements on supervised community discovery problems.
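To make the idea of a GNN layer concrete, the sketch below implements one widely used building block, a graph convolution layer with a symmetrically normalized adjacency matrix and a ReLU activation, assuming NumPy. It is an illustration of the general technique only and is not claimed to be the architecture of [21] or [22]; the weights are random and untrained.

```python
import numpy as np


def gcn_layer(A, X, W):
    """One graph convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    deg = A_hat.sum(axis=1)                   # degrees are >= 1 because of the self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
    return np.maximum(0.0, A_norm @ X @ W)    # aggregate neighbours, transform, ReLU


# Toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0

rng = np.random.default_rng(0)
X = np.eye(6)                  # one-hot node features (an assumption)
W = rng.normal(size=(6, 2))    # untrained weights mapping to 2 output dimensions
H = gcn_layer(A, X, W)
```

Each row of `H` mixes a node's own features with those of its neighbours. With trained weights, the non-negative outputs could be read as community affiliation strengths, which is one way such layers are applied to overlapping community detection.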
2.1 Discussion
Table 2 summarizes the comparative analysis based on this comprehensive survey of deep neural networks (DNNs) for community detection. DNNs are a recent approach to social network analysis problems and are highly effective at representing graphs and nodes [23]. Earlier approaches such as stochastic block models and modularity maximization provide only a linear mapping to a low-dimensional space. Most real-world networks, however, have nonlinear structures, for which deep neural networks have proved effective [24].
3 Challenges and Future Directions
The models and frameworks discussed above are the most recent strategies developed in the last few years to solve the community detection problem. Our study identified several open challenges that require further attention and more efficient solutions. This section presents some of these challenges, which may point to new research directions.
Temporal Changes in Communities
Because a network continuously reflects changes in user relations and topological information, dynamic community detection should be considered. There is a need for models with sufficient computational power to perform dynamic community detection and extract the spatio-temporal characteristics of social structures.
Meaningful Representation of Datasets
Generally, an enormous amount of data is generated by social networks, which is used as input datasets to predict communities. Deep learning techniques must use datasets in a meaningful format to predict the correct semantic representation of communities. Additionally, a better interpretation of different communities formed may help in fast information propagation.
No Prior Knowledge of the Number of Communities
According to [30, 31], random walks were performed to get preliminary communities and refine results by modularity. But in the case of disconnected networks, random walks cannot cover each node, thus degrading the performance of community detection algorithms.
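The coverage problem on disconnected networks can be seen in a small standard-library sketch. The adjacency-list format, the helper name `random_walk`, and the toy graph are assumptions for illustration; the graph has two components, and every walk starts in the first one.

```python
import random


def random_walk(adj, start, length, rng):
    """Return a uniform random walk of at most `length` nodes starting at `start`."""
    walk = [start]
    for _ in range(length - 1):
        neighbours = adj[walk[-1]]
        if not neighbours:        # dead end: isolated node
            break
        walk.append(rng.choice(neighbours))
    return walk


# Two components: the triangle {0, 1, 2} and the separate edge {3, 4}.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4], 4: [3]}

rng = random.Random(0)
visited = set()
for _ in range(100):
    visited.update(random_walk(adj, start=0, length=20, rng=rng))
```

No matter how many walks are launched from node 0, `visited` never contains nodes 3 or 4. A walk-based method therefore needs at least one start node per connected component to cover the whole network.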
Signed Networks
Positive and negative relationships affect nodes differently, so existing community detection approaches designed for unsigned networks cannot be applied to signed networks. To detect communities in signed networks, research must focus on representing negative ties, and deep learning strategies should be able to represent both positive and negative ties efficiently. Future work may cover the impact of signed edges.
Community Overlapping Detection
More effort is needed on overlapping detection approaches, as only some of the strategies discussed in this survey address the overlapping community detection problem.
Efficient Use of Computational Resources
Some of the developed algorithms require heavy computation; in one such case, constructing a similarity matrix from the adjacency matrix requires large computational resources. Thus, mechanisms for better use of computational resources should be developed for new computation-intensive strategies.
Comparative Analysis Intermediaries
There is a shortage of straightforward techniques for comparing the strategies we have studied so far. In this regard, NetworKit is a widely used toolkit for large-scale network analysis tasks and provides built-in algorithms already implemented by researchers.
NLP Embeddings
A recent trend is to use random walks to learn node embeddings. These embeddings keep similar nodes close together in the representation space. Further trends include temporal graphs and ego networks.
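The link to NLP is that each random walk is treated like a sentence and each node like a word. The sketch below, using only the Python standard library, builds such a walk corpus and extracts the (centre, context) pairs a skip-gram model would be trained on. The function names, the window size, and the toy graph are illustrative assumptions, and the embedding model itself is omitted.

```python
import random


def generate_walks(adj, walks_per_node, walk_length, seed=0):
    """Build a corpus of uniform random walks, several per start node."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adj)
        rng.shuffle(nodes)                  # visit start nodes in random order
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length and adj[walk[-1]]:
                walk.append(rng.choice(adj[walk[-1]]))
            walks.append(walk)
    return walks


def skipgram_pairs(walks, window):
    """Return (centre, context) node pairs found within `window` steps of each other."""
    pairs = []
    for walk in walks:
        for i, centre in enumerate(walk):
            lo, hi = max(0, i - window), min(len(walk), i + window + 1)
            pairs.extend((centre, walk[j]) for j in range(lo, hi) if j != i)
    return pairs


# Toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
walks = generate_walks(adj, walks_per_node=3, walk_length=5)
pairs = skipgram_pairs(walks, window=1)
```

Nodes that co-occur often in these pairs are pushed close together by the embedding model, which is why nodes from the same densely connected region end up with similar representations.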
4 Conclusion
In this survey paper, we analyzed existing community detection techniques and current trends in using deep learning approaches for community discovery tasks in various scenarios. As discussed in this review, deep learning models for community detection have emerged as more robust, effective, efficient, and flexible in handling high-dimensional network data. However, there is scope for further research on the overlapping community detection problem, on optimized algorithms with lower computational complexity, on signed networks, on meaningful representation of datasets to predict the correct number of communities, and on dynamic community detection, since networks undergo continuous change. Finally, along with the taxonomy of traditional and deep learning methods, this paper has also elaborated on the challenges and prospects for community detection.
References
Qiao S et al (2018) A fast parallel community discovery model on complex networks through approximate optimization. IEEE Trans Knowl Data Eng 30(9):1638–1651. https://doi.org/10.1109/TKDE.2018.2803818
Lu Z, Sun X, Wen Y, Cao G, La Porta T (2015) Algorithms and applications for community detection in weighted networks. IEEE Trans Parallel Distrib Syst 26(11):2916–2926. https://doi.org/10.1109/TPDS.2014.2370031
Džamić D, Aloise D, Mladenović N (2019) Ascent–descent variable neighborhood decomposition search for community detection by modularity maximization. Ann Oper Res 272(1–2):273–287. https://doi.org/10.1007/s10479-017-2553-9
Pirouz M, Zhan J (2018) Optimized label propagation community detection on big data networks. ACM Int Conf Proc Ser:57–62. https://doi.org/10.1145/3206157.3206167
Souravlas S, Sifaleras A, Katsavounis S (2019) A parallel algorithm for community detection in social networks, based on path analysis and threaded binary trees. IEEE Access 7:20499–20519. https://doi.org/10.1109/ACCESS.2019.2897783
Souravlas S, Sifaleras A, Katsavounis S (2020) Hybrid CPU-GPU community detection in weighted networks. IEEE Access 8:57527–57551. https://doi.org/10.1109/ACCESS.2020.2982227
Boulch A (2018) Reducing parameter number in residual networks by sharing weights. Pattern Recognit Lett 103:53–59. https://doi.org/10.1016/j.patrec.2018.01.006
Xin X, Wang C, Ying X, Wang B (2017) Deep community detection in topologically incomplete networks. Phys A Stat Mech Appl 469:342–352. https://doi.org/10.1016/j.physa.2016.11.029
Cao J, Jin D, Yang L, Dang J (2018) Incorporating network structure with node contents for community detection on large networks using deep learning. Neurocomputing 297:71–81. https://doi.org/10.1016/j.neucom.2018.01.065
Liang Y, Cao X, He D, Chuan W, Xiao W, Weixiong Z (2016) Modularity based community detection with deep learning. IJCAI Int J Conf Artif Intell 2016:2252–2258
Tian F, Gao B, Cui Q, Chen E, Liu TY (2014) Learning deep representations for graph clustering. Proc Natl Conf Artif Intell 2:1293–1299
Wang H et al (2021) Learning graph representation with generative adversarial nets. IEEE Trans Knowl Data Eng 33(8):3090–3103. https://doi.org/10.1109/TKDE.2019.2961882
Jia Y, Zhang Q, Zhang W, Wang X (2019) CommunityGan: Community detection with generative adversarial nets. In: Web conference on 2019—proceedings of world wide web conference on WWW 2019, pp 784–794. https://doi.org/10.1145/3308558.3313564
Xue S, Lu J, Zhang G (2019) Cross-domain network representations. Pattern Recognit 94:135–148. https://doi.org/10.1016/j.patcog.2019.05.009
Lee DD. Learning the parts of objects by nonnegative matrix factorization
Song HA, Kim BK, Xuan TL, Lee SY (2015) Hierarchical feature extraction by multi-layer non-negative matrix factorization network for classification task. Neurocomputing 165:63–74. https://doi.org/10.1016/j.neucom.2014.08.095
Ngiam J, Koh PW, Chen Z, Bhaskar S, Ng AY (2011) Sparse filtering. In: Advances in neural information processing systems 24, 25th annual conference on neural information processing systems, NIPS, pp 1–9
Xie Y, Gong M, Wang S, Yu B (2018) Community discovery in networks with deep sparse filtering. Pattern Recognit 81:50–59. https://doi.org/10.1016/j.patcog.2018.03.026
Cavallari S, Zheng VW, Cai H, Chang KCC, Cambria E (2017) Learning community embedding with community detection and node embedding on graphs. Int Conf Inf Knowl Manage Proc Part F1318:377–386. https://doi.org/10.1145/3132847.3132925
Zhang Y, Lyu T, Zhang Y (2018) COSINE: community-preserving social network embedding from information diffusion cascades. In: 32nd AAAI conference on artificial intelligence AAAI 2018, pp 2620–2627
Shchur O, Günnemann S. Overlapping community detection with graph neural networks
Ine L (2019) Supervised community detection, pp 1–24
Hinton GE, Zemel RS (1994) Autoencoders, minimum description length and Helmholtz free energy. Adv Neural Inf Process Syst 6:3–10
Janowski T, Mohanty H (2010) Distributed computing and internet technology: preface, vol 5966. LNCS
Perozzi B, Al-Rfou R, Skiena S (2014) DeepWalk: online learning of social representations. In: Proceedings of ACM SIGKDD international conference on knowledge discovery data mining, pp 701–710. https://doi.org/10.1145/2623330.2623732
Tang J, Qu M, Wang M, Zhang M, Yan J, Mei Q (2015) LINE: large-scale information network embedding. In: WWW 2015—proceedings of 24th international conference on world wide web, pp 1067–1077. https://doi.org/10.1145/2736277.2741093
Huang Y et al (2016) node2vec real-time video recommendation exploration categories and subject descriptors. World Neurosurg 95(1):41–50
Tran PV (2019) Learning to make predictions on graphs with autoencoders. In: Proceedings of 2018 IEEE 5th international conference on data science advanced analysis DSAA 2018, pp 237–245. https://doi.org/10.1109/DSAA.2018.00034
Li S, Jiang L, Wu X, Han W, Zhao D, Wang Z (2021) A weighted network community detection algorithm based on deep learning. Appl Math Comput 401:126012. https://doi.org/10.1016/j.amc.2021.126012
Bhatia V, Rani R (2018) DFuzzy: a deep learning-based fuzzy clustering model for large graphs. Knowl Inf Syst 57(1):159–181. https://doi.org/10.1007/s10115-018-1156-3
Bhatia V, Rani R (2019) A distributed overlapping community detection model for large graphs using autoencoder. Futur Gener Comput Syst 94:16–26. https://doi.org/10.1016/j.future.2018.10.045
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sikarwar, R., Singh, S.S., Shakya, H.K. (2023). A Review on Community Detection Using Deep Neural Networks with Enhanced Learning. In: Gupta, D., Khanna, A., Bhattacharyya, S., Hassanien, A.E., Anand, S., Jaiswal, A. (eds) International Conference on Innovative Computing and Communications. Lecture Notes in Networks and Systems, vol 473. Springer, Singapore. https://doi.org/10.1007/978-981-19-2821-5_15
DOI: https://doi.org/10.1007/978-981-19-2821-5_15
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-2820-8
Online ISBN: 978-981-19-2821-5
eBook Packages: Engineering, Engineering (R0)