Abstract
Graphs are an appealing way to model relational datasets in domains such as cryptocurrency transaction networks, social networks, and rating platforms. Recently, several powerful methods have emerged for analyzing datasets whose complex underlying connectivity can be modeled as a graph. These methods have demonstrated promising performance on common graph tasks, including node classification. In this paper, we explore the impact of graph-based techniques for detecting anomalous entities in real-world networks. We model the detection of anomalous entities in a network as a node classification task, and examine the role of different approaches, together with the evaluation setup and metrics, to provide recommendations for practical applications. We investigate different ways of handling dataset imbalance, a common problem for datasets containing anomalies, and demonstrate how a method that is agnostic to the imbalance may show misleading performance. Through extensive experiments on six real-world datasets, in balanced and unbalanced settings for a node classification task, we provide several recommendations that shed more light on the challenges of selecting methods, settings, and performance metrics that align with the intrinsic attributes of a specific dataset and task.
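The point about misleading performance under imbalance can be illustrated with a small sketch. It is not taken from the paper's experiments: the node labels, the 2% anomaly rate, and the helper names `confusion_counts` and `imbalance_metrics` are all assumptions made for illustration. It scores a trivial classifier that labels every node as normal, using only the Python standard library.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary node labeling."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fp, fn, tn


def imbalance_metrics(y_true, y_pred):
    """Return accuracy alongside metrics that account for the minority class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    balanced_accuracy = (recall + specificity) / 2
    return {
        "accuracy": accuracy,
        "recall": recall,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical labels (assumption): 1000 nodes, 20 anomalous (label 1).
y_true = [1] * 20 + [0] * 980

# An imbalance-agnostic baseline that predicts "normal" for every node.
y_majority = [0] * 1000

majority_scores = imbalance_metrics(y_true, y_majority)
for name, value in majority_scores.items():
    print(f"{name}: {value:.2f}")
```

On these made-up labels the baseline reaches high accuracy while detecting no anomalous node, which is why recall, F1, or balanced accuracy on the anomalous class are often reported alongside accuracy in imbalanced settings.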
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Poursafaei, F., Zilic, Z., Rabbany, R. (2023). On Anomaly Detection in Graphs as Node Classification. In: Tang, L.C., Wang, H. (eds) Big Data Management and Analysis for Cyber Physical Systems. BDET 2022. Lecture Notes on Data Engineering and Communications Technologies, vol 150. Springer, Cham. https://doi.org/10.1007/978-3-031-17548-0_2
DOI: https://doi.org/10.1007/978-3-031-17548-0_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-17547-3
Online ISBN: 978-3-031-17548-0
eBook Packages: Intelligent Technologies and Robotics (R0)