Abstract
Life online has become an integral part of day-to-day existence for a large number of people around the globe. Online commenting spaces generate a wealth of expressive content in the public domain, which contributes to a healthy environment for discussion. However, these spaces also carry the threats of cyberbullying, personal attacks, and abusive language. This motivates industry researchers to build automated processes to curb the problem. The aim of this paper is to perform multi-label text categorization, in which each comment can belong to multiple toxic labels at the same time. We tested two models: a recurrent neural network (RNN) and a long short-term memory network (LSTM). Both perform significantly better than the baseline models, Logistic Regression and ExtraTrees.
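To make the multi-label setting concrete, the following is a minimal sketch of how a comment's labels can be represented and decided independently per label. It is not the chapter's implementation: the label names, the 0.5 threshold, and the helper functions `to_indicator`, `decide`, and `hamming_loss` are all assumptions introduced for illustration, and the scores stand in for the output of any classifier.

```python
from typing import Dict, List, Set

# Hypothetical label names, for illustration only; the abstract does not list them.
LABELS: List[str] = ["toxic", "insult", "threat"]


def to_indicator(label_set: Set[str]) -> List[int]:
    """Encode a set of labels as a 0/1 vector, one position per label."""
    return [1 if name in label_set else 0 for name in LABELS]


def decide(scores: Dict[str, float], threshold: float = 0.5) -> Set[str]:
    """Turn per-label scores into a label set.

    Each label is thresholded independently, so a comment can receive
    zero, one, or several labels. A single-label classifier would
    instead pick only the top-scoring class.
    """
    return {name for name in LABELS if scores.get(name, 0.0) >= threshold}


def hamming_loss(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Fraction of individual label slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p for rt, rp in zip(y_true, y_pred) for t, p in zip(rt, rp))
    return wrong / total


if __name__ == "__main__":
    # Assumed scores for one comment; in practice these come from a model.
    predicted = decide({"toxic": 0.9, "insult": 0.7, "threat": 0.1})
    print(sorted(predicted), to_indicator(predicted))
```

Because each label is a separate yes/no decision, a metric such as Hamming loss, which counts errors per label slot, is one reasonable way to compare models in this setting; whether the chapter uses it is not stated in the abstract.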
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Shah, D.K., Sanghvi, M.A., Mehta, R.P., Shah, P.S., Singh, A. (2021). Multilabel Toxic Comment Classification Using Supervised Machine Learning Algorithms. In: Joshi, A., Khosravy, M., Gupta, N. (eds) Machine Learning for Predictive Analysis. Lecture Notes in Networks and Systems, vol 141. Springer, Singapore. https://doi.org/10.1007/978-981-15-7106-0_3
DOI: https://doi.org/10.1007/978-981-15-7106-0_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-7105-3
Online ISBN: 978-981-15-7106-0
eBook Packages: Intelligent Technologies and Robotics (R0)