Abstract
The paper deals with the issue of hate speech and radicalization, both of which are often spread by means of social media. Twitter lets users express their views in a relatively anonymous way; however, it also seems to be a simple yet effective tool for disseminating offensive or radical content. The paper proposes an effective solution which applies machine learning to detect signs of radicalization and hate speech in Twitter posts. The authors chose to work with the Polish language, which, due to its complexity, is known to pose a challenge for automated sentiment analysis. They also needed to create their own dataset of posts containing hate speech since, prior to the experiment, no such dataset existed in the language. The paper first presents the underlying technologies, then describes the course of the experiment, and finally gives the conclusions.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kuchczyński, M., Pawlicka, A., Pawlicki, M., Choraś, M. (2022). Using Machine Learning to Detect the Signs of Radicalization and Hate Speech on Twitter. In: Choraś, M., Choraś, R.S., Kurzyński, M., Trajdos, P., Pejaś, J., Hyla, T. (eds) Progress in Image Processing, Pattern Recognition and Communication Systems. CORES 2021, IP&C 2021, ACS 2021. Lecture Notes in Networks and Systems, vol 255. Springer, Cham. https://doi.org/10.1007/978-3-030-81523-3_21
DOI: https://doi.org/10.1007/978-3-030-81523-3_21
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-81522-6
Online ISBN: 978-3-030-81523-3
eBook Packages: Intelligent Technologies and Robotics (R0)