Machine Learning for Hate Speech Detection in Arabic Social Media

Boulouard, Zakaria; Ouaissa, Mariya; Ouaissa, Mariyam

doi:10.1007/978-3-030-77185-0_10

Part of the book series: EAI/Springer Innovations in Communication and Computing (EAISICC)

Abstract

(WARNING: This paper may contain some offensive words)

Over the past few years, abusive language and cyberbullying have increased sharply on social media. This phenomenon has encouraged efforts to propose solutions that can detect and prohibit such behavior. Most of these solutions are dedicated to English, whereas those that can handle Arabic are, to the best of our knowledge, rare. Several reasons lie behind this situation, including the informality and ambiguity of the Arabic dialects and the use of Arabic/Arabizi combinations. In this paper, we use a collection of Arabic YouTube comments annotated as either “hateful” or “inoffensive” to compare how well five machine learning algorithms classify hateful Arabic comments. The algorithms are Logistic Regression, Naïve Bayes, Random Forests, Support Vector Machines, and Long Short-Term Memory. The performance metrics are Accuracy, F1-Score, Precision, and Recall.
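As an illustration of the four performance metrics named above, the following minimal sketch computes them for a binary “hateful” / “inoffensive” labelling. It is not the authors' evaluation code: the function name `evaluate`, the `BinaryMetrics` container, the choice of “hateful” as the positive class, and the sample labels are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred, positive="hateful"):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = len(pairs) - tp - fp - fn                               # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return BinaryMetrics(accuracy, precision, recall, f1)


# Made-up labels for illustration only; these are not data from the chapter.
y_true = ["hateful", "hateful", "inoffensive", "inoffensive", "hateful", "inoffensive"]
y_pred = ["hateful", "inoffensive", "inoffensive", "hateful", "hateful", "inoffensive"]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters when the two classes are imbalanced, since a classifier that labels every comment “inoffensive” can still reach a high accuracy while detecting no hateful comments at all.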

Notes

1. [Accessed: 28-Feb-2021]
2. https://scikit-learn.org/stable/ [Accessed: 28-Feb-2021]
3. https://www.tensorflow.org/ [Accessed: 28-Feb-2021]
4. https://colab.research.google.com/ [Accessed: 28-Feb-2021]

Author information

Authors and Affiliations

Faculty of Sciences and Techniques Mohammedia, Hassan II University, Casablanca, Morocco
Zakaria Boulouard
Moulay Ismail University, Meknes, Morocco
Mariya Ouaissa & Mariyam Ouaissa

Corresponding author

Correspondence to Zakaria Boulouard.

Editor information

Editors and Affiliations

Moulay Ismail University, Meknes, Morocco
Mariya Ouaissa
Hassan II University, Mohammedia, Morocco
Zakaria Boulouard
Moulay Ismail University, Meknes, Morocco
Mariyam Ouaissa
Parc Technopolis Rabat-Shore, International University of Rabat, Sala Al Jadida, Morocco
Bassma Guermah

About this chapter

Cite this chapter

Boulouard, Z., Ouaissa, M., Ouaissa, M. (2022). Machine Learning for Hate Speech Detection in Arabic Social Media. In: Ouaissa, M., Boulouard, Z., Ouaissa, M., Guermah, B. (eds) Computational Intelligence in Recent Communication Networks. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-030-77185-0_10

DOI: https://doi.org/10.1007/978-3-030-77185-0_10
Published: 21 February 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-77184-3
Online ISBN: 978-3-030-77185-0
eBook Packages: Intelligent Technologies and Robotics (R0)
