Abstract
Social media has become an integral part of day-to-day life. Hate speech in posts and other social media activity, written in native languages or in English, has increased significantly. It often spreads negativity, contributes to depression, and is in some cases considered a cybercrime. This paper takes a hybrid deep learning approach to detecting hate speech in Bangla social media text, using fastText embedding, Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Convolutional Neural Network (CNN). A publicly available dataset of 30,000 samples was used, and the proposed hybrid model achieved close to 90% accuracy, with high sensitivity and specificity, in Bangla hate speech detection. Several related deep learning approaches were evaluated on the same dataset, but none outperformed the proposed model. The hybrid model also showed robustness, which makes it more suitable for this task.
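The abstract reports accuracy, sensitivity, and specificity. As a minimal sketch of how these three measures are conventionally computed for a binary hate / not-hate classifier, the following may help. The function name `binary_metrics`, the label encoding (1 = hate speech, 0 = not hate speech), and the toy labels are assumptions made for illustration; they are not taken from the paper and do not reproduce its results.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity, and specificity for binary labels.

    Assumed encoding (illustrative only): 1 = hate speech, 0 = not hate speech.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    return {
        # Share of all samples classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of actual hate speech that was detected (true positive rate).
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of non-hate text that was correctly left alone (true negative rate).
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Toy labels for illustration only; these are not data from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, because accuracy alone can look high even if one class is rarely predicted correctly.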
Tapotosh Ghosh and Ashraf Alam Khan Chowdhury have contributed equally.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Ghosh, T., Chowdhury, A.A.K., Banna, M.H.A., Nahian, M.J.A., Kaiser, M.S., Mahmud, M. (2022). A Hybrid Deep Learning Approach to Detect Bangla Social Media Hate Speech. In: Hossain, S., Hossain, M.S., Kaiser, M.S., Majumder, S.P., Ray, K. (eds) Proceedings of International Conference on Fourth Industrial Revolution and Beyond 2021. Lecture Notes in Networks and Systems, vol 437. Springer, Singapore. https://doi.org/10.1007/978-981-19-2445-3_50
DOI: https://doi.org/10.1007/978-981-19-2445-3_50
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-2444-6
Online ISBN: 978-981-19-2445-3
eBook Packages: Intelligent Technologies and Robotics (R0)