Abstract
Hate speech is defined as expression that targets an individual or community on the basis of aspects such as religion, sexual orientation, race, political opinion, and origin. Recently, hate speech on social media, especially in the Arabic language, has increased exponentially and led to severe consequences. Various studies have been conducted on the social media platforms that people use to broadcast their opinions. This work aims to develop a model that can detect and classify Arabic hate speech and offensive language. The experiments are carried out using various machine learning (ML) and deep learning (DL) models. In this work, an Arabic Hate Speech Detection (AHSD) model is proposed, composed of pre-processing, feature extraction, detection, and classification stages, to identify hate speech on the Arabic benchmark dataset. The proposed model shows improved results. The transfer learning model exhibits superior performance compared to all other ML models in terms of accuracy, precision, recall, and F1 score, achieving improvements of 84%, 79%, 80%, and 79%, respectively.
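The four evaluation measures named in the abstract can be illustrated with a minimal sketch for a binary hate / not-hate labelling. This is not the authors' evaluation code: the function name `binary_metrics`, the label encoding (1 = hate speech, 0 = not hate speech), and the toy labels are assumptions made purely for illustration.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1

    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    total = tp + fp + fn + tn

    # Guard each ratio against a zero denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels (assumed): 1 = hate speech, 0 = not hate speech.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

For a multi-class setting (for example hate, offensive, and neutral), the same counts would typically be computed per class and then averaged; which averaging scheme the paper used is not stated in the abstract.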
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Muaad, A.Y. et al. (2023). Arabic Hate Speech Detection Using Different Machine Learning Approach. In: Saeed, F., Mohammed, F., Mohammed, E., Al-Hadhrami, T., Al-Sarem, M. (eds) Advances on Intelligent Computing and Data Science. ICACIn 2022. Lecture Notes on Data Engineering and Communications Technologies, vol 179. Springer, Cham. https://doi.org/10.1007/978-3-031-36258-3_38
DOI: https://doi.org/10.1007/978-3-031-36258-3_38
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36257-6
Online ISBN: 978-3-031-36258-3
eBook Packages: Intelligent Technologies and Robotics (R0)