Abstract
Hateful and toxic content generated by a portion of social media users is a rising phenomenon that has motivated researchers to dedicate substantial effort to the challenging task of hateful content identification. We need not only an efficient automatic hate speech detection model based on advanced machine learning and natural language processing, but also a sufficiently large amount of annotated data to train it. The lack of sufficient labelled hate speech data, along with existing biases, has been the main issue in this domain of research. To address these needs, in this study we introduce a novel transfer learning approach based on an existing pre-trained language model called BERT (Bidirectional Encoder Representations from Transformers). More specifically, we investigate the ability of BERT to capture hateful context within social media content by using new fine-tuning methods based on transfer learning. To evaluate our proposed approach, we use two publicly available datasets that have been annotated for racism, sexism, hate, or offensive content on Twitter. The results show that our solution obtains considerable performance on these datasets in terms of precision and recall in comparison to existing approaches. Consequently, our model can capture some biases in the data annotation and collection process and can potentially lead us to a more accurate model.
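As a minimal illustration of the two evaluation metrics named in the abstract, the sketch below computes per-class precision and recall from gold and predicted labels. The function name, the label names, and the toy data are hypothetical and chosen only for illustration; they are not taken from the datasets or the evaluation code used in this study.

```python
def precision_recall(gold, pred, label):
    """Return (precision, recall) for a single class label."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)  # true positives
    fp = sum(1 for g, p in pairs if g != label and p == label)  # false positives
    fn = sum(1 for g, p in pairs if g == label and p != label)  # false negatives
    # Fall back to 0.0 when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy annotations, for illustration only.
gold = ["hate", "neither", "offensive", "hate", "neither", "offensive"]
pred = ["hate", "neither", "hate", "neither", "neither", "offensive"]

for label in ("hate", "offensive", "neither"):
    p, r = precision_recall(gold, pred, label)
    print(f"{label}: precision={p:.2f} recall={r:.2f}")
```

Reporting both values per class matters when classes are imbalanced, since a model can reach high precision on a rare class while still missing many of its instances.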
Notes
- 1. Anti-muslim hate crime surges after Manchester and London Bridge attacks (2017): https://www.theguardian.com.
- 2. A.: Hate on the rise after Trump’s election: http://www.newyorker.com.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Mozafari, M., Farahbakhsh, R., Crespi, N. (2020). A BERT-Based Transfer Learning Approach for Hate Speech Detection in Online Social Media. In: Cherifi, H., Gaito, S., Mendes, J., Moro, E., Rocha, L. (eds) Complex Networks and Their Applications VIII. COMPLEX NETWORKS 2019. Studies in Computational Intelligence, vol 881. Springer, Cham. https://doi.org/10.1007/978-3-030-36687-2_77
DOI: https://doi.org/10.1007/978-3-030-36687-2_77
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36686-5
Online ISBN: 978-3-030-36687-2
eBook Packages: Intelligent Technologies and Robotics (R0)