Arabic Hate Speech Detection Using Different Machine Learning Approach

  • Conference paper
Advances on Intelligent Computing and Data Science (ICACIn 2022)

Abstract

Hate speech is defined as expression that targets an individual or community on the basis of attributes such as religion, sexual orientation, race, political opinion, and origin. Recently, hate speech on social media, especially in the Arabic language, has increased exponentially and led to severe consequences. Various studies have been conducted on the social media platforms that people use to broadcast their opinions. This work aims to develop a model that can detect and classify Arabic hate speech and offensive language. The experiments are carried out using various machine learning (ML) and deep learning (DL) models. An Arabic Hate Speech Detection (AHSD) model is proposed, composed of pre-processing, feature extraction, detection, and classification stages, to identify hate speech on an Arabic benchmark dataset. The proposed model shows improved results: the transfer learning model outperforms all other ML models in terms of accuracy, precision, recall, and F1 score, achieving improvements of 84%, 79%, 80%, and 79%, respectively.
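
The four metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not say how precision, recall, and F1 are averaged across classes, so macro-averaging is an assumption here. The function name `macro_report` and the labels `hate`, `offensive`, and `clean` are hypothetical, and the toy predictions are not taken from the paper.

```python
from collections import Counter


def macro_report(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro-averaging is an assumption for illustration; the abstract
    does not specify the averaging scheme used by the authors.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gets a false positive
            fn[truth] += 1   # true class gets a false negative

    def safe_div(num, den):
        # Define the ratio as 0.0 when a class has no predictions or no samples.
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        prec = safe_div(tp[label], tp[label] + fp[label])
        rec = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(safe_div(2 * prec * rec, prec + rec))

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy example with hypothetical labels (not data from the paper).
y_true = ["hate", "hate", "offensive", "clean", "clean", "clean"]
y_pred = ["hate", "offensive", "offensive", "clean", "clean", "hate"]
report = macro_report(y_true, y_pred)
print({name: round(value, 3) for name, value in report.items()})
```

Macro-averaging weights every class equally, which matters when hateful posts are much rarer than clean ones; a weighted or micro average would give different numbers on the same predictions.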


Notes

  1. https://www.pewresearch.org/religion/2009/10/07/
  2. https://sites.google.com/view/arabichate2022/home

Author information

Corresponding author

Correspondence to Abdullah Y. Muaad.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Muaad, A.Y. et al. (2023). Arabic Hate Speech Detection Using Different Machine Learning Approach. In: Saeed, F., Mohammed, F., Mohammed, E., Al-Hadhrami, T., Al-Sarem, M. (eds) Advances on Intelligent Computing and Data Science. ICACIn 2022. Lecture Notes on Data Engineering and Communications Technologies, vol 179. Springer, Cham. https://doi.org/10.1007/978-3-031-36258-3_38
