
1 Introduction

Online social media has enabled the dissemination of information at a faster rate than ever before [22, 23]. This has also allowed bad actors to exploit these platforms for nefarious purposes such as spreading propaganda, fake news, and hate speech. Hate speech is defined as a “direct and serious attack on any protected category of people based on their race, ethnicity, national origin, religion, sex, gender, sexual orientation, disability or disease” [11]. Representative examples of hate speech are provided in Table 1.

Hate speech is increasingly becoming a concerning issue in several countries. Crimes related to hate speech have been on the rise in recent times, some of them leading to severe incidents such as the genocide of the Rohingya community in Myanmar, the anti-Muslim mob violence in Sri Lanka, and the Pittsburgh shooting. Frequent and repetitive exposure to hate speech has been shown to desensitize individuals to this form of speech, leading to lower evaluations of the victims and greater distancing, and thereby increasing outgroup prejudice [35]. Public expressions of hate speech have also been shown to contribute to the devaluation of minority group members [18], the exclusion of minorities from society [26], and the discriminatory distribution of public resources [12].

While research in hate speech detection has been growing rapidly, one of the current issues is that the majority of the datasets are available only in English. Thus, hate speech in other languages is often not detected properly, which could be detrimental. While a few datasets [3, 27] are available in other languages, we observe that they are relatively small in size.

Table 1. Examples of hate speech.

In this paper, we perform the first large-scale analysis of multilingual hate speech by analyzing the performance of deep learning models on 16 datasets from 9 different languages. We consider two different scenarios and discuss the classifier performance. In the first scenario (monolingual setting), we train and test on the same language. We observe that in the low-resource case, models using LASER embeddings with logistic regression perform the best, whereas in the high-resource case, BERT-based models perform much better. We also observe that simple techniques such as translating to English and using BERT achieve competitive results in several languages. In the second scenario (multilingual setting), we use training data from all the other languages and test on one target language. Here, we observe that including data from other languages is quite effective, especially when there is almost no training data available for the target language (the zero-shot setting). Finally, from the summary of the results that we obtain, we construct a catalogue indicating which model is effective for a particular language depending on the extent of the data available. We believe that this catalogue is one of the most important contributions of our work, which can be readily referred to by future researchers working to advance the state-of-the-art in multilingual hate speech detection.

The rest of the paper is structured as follows. Section 2 presents the related literature on hate speech classification. In Sect. 3, we present the datasets used for the analysis. Section 4 provides details about the models and experimental settings. In Sect. 5, we note the key results of our experiments. In Sect. 6, we discuss the results and provide an error analysis.

2 Related Works

Hate speech lies in a complex nexus with freedom of expression, individual, group and minority rights, as well as concepts of dignity, liberty and equality [16]. Computational approaches to tackle hate speech have recently gained a lot of interest. Earlier efforts to build hate speech classifiers used simple methods such as dictionary lookup [19] and bag-of-words [6]. Fortuna et al. [13] conducted a comprehensive survey on this subject.

With the availability of larger datasets, researchers started using more complex models to improve classifier performance. These include deep learning [37] and graph embedding techniques [30] to detect hate speech in social media posts. Zhang et al. [37] used a deep neural network combining convolutional and gated recurrent layers to improve the results on 6 out of 7 datasets used. In this paper, we use the same CNN-GRU model in one of our experimental settings (the monolingual scenario).

Research into the multilingual aspect of hate speech is relatively new. Datasets for languages such as Arabic and French [27], Indonesian [21], Italian [33], Polish [29], Portuguese [14], and Spanish [3] have been made available for research. To the best of our knowledge, very few works have tried to utilize these datasets to build multilingual classifiers. Huang et al. [20] used Twitter hate speech corpora from five languages and annotated them with demographic information. Using this new dataset, they studied the demographic bias in hate speech classification. Corazza et al. [8] used three datasets from three languages (English, Italian, and German) to study multilingual hate speech. The authors used models such as SVM and Bi-LSTM to build hate speech detection models. Our work differs from these existing works in that we perform the experiments on a much larger set of languages (9) using more datasets (16). Our work tries to utilize the existing hate speech resources to develop models that could generalize to hate speech detection in other languages.

3 Dataset Description

We looked into the datasets available for hate speech and found 16 publiclyFootnote 1 available sources in 9 different languagesFootnote 2. One of the immediate issues we observed was the mixing of several types of categories (offensive, profanity, abusive, insult, etc.). Although these categories are related to hate speech, they should not be considered the same [9]. For this reason, we only use two labels, hate speech and normal, and discard the other labels. Next, we explain the datasets in the different languages. The overall dataset statistics are noted in Table 2.

Arabic: We found two Arabic datasets that were built for hate speech detection.

  • Mulki et al. [25]: A Twitter datasetFootnote 3 for hate speech and abusive language. For our task, we ignored the abusive class and only considered the hate and normal classes.

  • Ousidhoum et al. [27]: A Twitter datasetFootnote 4 with multi-label annotations. We have only considered those datapoints which have either hate speech or normal in the annotation label.

Table 2. Dataset details

English: The majority of the hate speech datasets are available in the English language. We select six such publicly available datasets.

  • Davidson et al. [9] provided a three-class Twitter datasetFootnote 5, the classes being hate speech, abusive speech, and normal. We have only considered the hate speech and normal classes for our task.

  • Gibert et al. [17] provided a hate speech datasetFootnote 6 consisting of sentences from StormfrontFootnote 7, a white supremacist forum. Each sentence is tagged as either hate or normal.

  • Waseem et al. [36] provided a Twitter datasetFootnote 8 annotated into three classes: sexism, racism, and neither. We considered the tweets tagged as sexism or racism as hate speech and those tagged as neither as normal.

  • Basile et al. [3] provided a multilingual Twitter datasetFootnote 9 for hate speech against immigrants and women. Each post is tagged as either hate speech or normal.

  • Ousidhoum et al. [27] provided a Twitter dataset (See Footnote 6) with multi-label annotations. We have only considered those datapoints which have either hate speech or normal in the annotation label.

  • Founta et al. [15] provided a large datasetFootnote 10 of 100K annotations divided into four classes: hate speech, abusive, spam, and normal. For our task, we have only considered the datapoints marked as either hate or normal, and ignored the other classes.

German: We select two datasets available in the German language.

  • Ross et al. [32] provided a German hate speech datasetFootnote 11 for the refugee crisis. Each tweet is tagged as hate speech or normal.

  • Bretschneider et al. [5] provided a Facebook hate speech datasetFootnote 12 against foreigners and refugees.

Indonesian: We found two datasets for the Indonesian language.

  • Ibrohim et al. [21] provided an Indonesian multi-label hate speech and abusive language datasetFootnote 13. We only consider the hate speech label for our task; the other labels are ignored.

  • Alfina et al. [1] provided an Indonesian hate speech datasetFootnote 14. Each post is tagged as hateful or normal.

Italian: We found two datasets for the Italian language.

  • Sanguinetti et al. [33] provided an Italian hate speech datasetFootnote 15 against the minorities in Italy.

  • Bosco et al. [4] provided a hate speech datasetFootnote 16 collected from Twitter and Facebook.

Polish: We found only one dataset for the Polish language.

  • Ptaszynski et al. [29] provided a cyberbullying datasetFootnote 17 for the Polish language. We have only considered the hate speech and normal classes for our task.

Portuguese: We found one dataset for the Portuguese language.

  • Fortuna et al. [14] developed a hierarchical hate speech datasetFootnote 18 for the Portuguese language. For our task, we have used the binary class of hate speech or normal.

Spanish: We found two datasets for the Spanish language.

  • Basile et al. [3] provided a multilingual hate speech dataset (See Footnote 11) against immigrants and women.

  • Pereira et al. [28] provided a hate speech datasetFootnote 19 for the Spanish language.

French: We found one dataset for the French language.

  • Ousidhoum et al. [27] provided a Twitter dataset (See Footnote 6) with multi-label annotations. We have only considered those data points which have either hate speech or normal in the annotation label.

4 Experiments

For each language, we combine all the datasets and perform a stratified train/validation/test split in the ratio 70%/10%/20%. For all the experiments, we use the same train/val/test splits; thus, the results are comparable across different models and settings. We report the macro F1-score to measure classifier performance. Whenever we select a subset of the dataset for an experiment, we repeat the subset selection with 5 different random sets and report the average performance. This helps reduce the performance variation across different sets. In our experiments, the subsets are stratified samples of size 16, 32, 64, 128, and 256.
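A minimal sketch of this splitting protocol is shown below, assuming a pandas DataFrame with text and label columns (the column names and helper functions are illustrative, not the exact ones used in our code):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

def stratified_splits(df, seed=42):
    """70%/10%/20% stratified train/validation/test split on the `label` column."""
    train, rest = train_test_split(df, test_size=0.30,
                                   stratify=df["label"], random_state=seed)
    val, test = train_test_split(rest, test_size=2 / 3,
                                 stratify=rest["label"], random_state=seed)
    return train, val, test

def low_resource_subsets(train, sizes=(16, 32, 64, 128, 256), n_runs=5):
    """Yield stratified subsets of the training set, 5 random runs per size."""
    for size in sizes:
        for run in range(n_runs):
            yield size, run, (train.groupby("label", group_keys=False)
                              .apply(lambda g: g.sample(frac=size / len(train),
                                                        random_state=run)))
```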

4.1 Embeddings

In order to train models in the multilingual setting, we need multilingual word/sentence embeddings. We use LASER embeddings for sentences and MUSE embeddings for words.

Laser embeddings: LASERFootnote 20 denotes Language-Agnostic SEntence Representations [2]. Given an input sentence, LASER provides a sentence embedding, which is obtained by applying a max-pooling operation over the output of a BiLSTM encoder. The system uses a single BiLSTM encoder with a shared BPE vocabulary for all languages.
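As an illustration, sentence embeddings can be obtained through the laserembeddings Python package; our choice of wrapper here is an assumption, and any interface to the pretrained LASER encoder would serve equally well:

```python
from laserembeddings import Laser  # pretrained models must be downloaded once:
                                   # python -m laserembeddings download-models

laser = Laser()

sentences = ["I hate you", "Je te déteste"]
# Returns one 1024-dimensional vector per sentence; `lang` selects the tokenizer.
embeddings = laser.embed_sentences(sentences, lang=["en", "fr"])
print(embeddings.shape)  # (2, 1024)
```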

Muse embeddings: MUSEFootnote 21 denotes Multilingual Unsupervised and Supervised Embeddings. Given an input word, MUSE gives as output the corresponding word embedding [7]. MUSE builds a bilingual dictionary between two languages without using any parallel corpora, by aligning monolingual word embedding spaces in an unsupervised way.
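The aligned MUSE vectors are distributed as fastText-style .vec text files; below is a small loader sketch (the file name in the comment and the word limit are illustrative):

```python
import numpy as np

def load_muse_vectors(path, max_words=200_000):
    """Load aligned MUSE word vectors from a fastText-style .vec file."""
    vectors = {}
    with open(path, encoding="utf-8", errors="ignore") as f:
        next(f)  # header line: "<vocab size> <dimension>"
        for i, line in enumerate(f):
            if i >= max_words:
                break
            word, rest = line.rstrip().split(" ", 1)
            vectors[word] = np.array(rest.split(" "), dtype=np.float32)
    return vectors

# e.g. wiki.multi.en.vec and wiki.multi.es.vec live in a shared embedding space,
# so a classifier trained over one language's vectors can be applied to the other.
```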

4.2 Models

CNN-GRU (Zhang et al. [37]): This model first maps each word in a sentence to a 300-dimensional vector using the pretrained Google News Corpus embeddings [24]. It also pads/clips the sentences to a maximum of 100 words. The resulting \(300 \times 100\) matrix is passed through a dropout layer and then a 1-D convolution layer with 100 filters. A max-pooling layer further reduces the output to a \(25 \times 100\) feature matrix. This is passed through a GRU layer, whose \(25 \times 100\) output is globally max-pooled to give a \(1 \times 100\) vector, which is finally passed through a softmax layer to obtain the prediction.
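A minimal Keras sketch of this architecture follows; the kernel size, pooling size, dropout rate, and vocabulary size are our assumptions, chosen so that the intermediate dimensions match the description above:

```python
from tensorflow.keras import layers, models

MAX_LEN, EMB_DIM, VOCAB_SIZE = 100, 300, 50_000  # vocabulary size is illustrative

model = models.Sequential([
    # In practice the embedding matrix is initialized from the Google News vectors.
    layers.Embedding(VOCAB_SIZE, EMB_DIM, input_length=MAX_LEN),
    layers.Dropout(0.2),
    layers.Conv1D(filters=100, kernel_size=4, activation="relu", padding="same"),
    layers.MaxPooling1D(pool_size=4),        # 100 x 100 -> 25 x 100
    layers.GRU(100, return_sequences=True),  # 25 x 100 hidden states
    layers.GlobalMaxPooling1D(),             # 1 x 100
    layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```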

BERT: BERT [10] stands for Bidirectional Encoder Representations from Transformers, pretrained on English-language data. It is a stack of transformer encoder layers with multiple “heads”, i.e., fully connected neural networks augmented with a self-attention mechanism. For every input token in a sequence, each head computes key, value, and query vectors, which are used to create a weighted representation. The outputs of all heads in the same layer are combined and run through a fully connected layer. Each layer is wrapped with a skip connection and followed by layer normalization. In our model, we set the maximum token length to 128 for faster processing of the queryFootnote 22.

mBERT: Multilingual BERT (mBERTFootnote 23) is a version of BERT that was trained on Wikipedia in 104 languages. Languages with a lot of data were sub-sampled and others were super-sampled, and the model was pretrained using the same method as BERT. mBERT generalizes across some scripts and can retrieve parallel sentences. Although mBERT is simply trained on a multilingual corpus with no language IDs, it encodes language identities. We used mBERT to train hate speech detection models in different languages, once again limiting the sentence representation to a maximum of 128 tokens.
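Fine-tuning BERT or mBERT for our binary task can be sketched with the HuggingFace transformers library; the snippet below shows a single illustrative gradient step on toy data, not our actual training loop:

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

NAME = "bert-base-multilingual-cased"  # or "bert-base-uncased" for Translation + BERT
tokenizer = BertTokenizer.from_pretrained(NAME)
model = BertForSequenceClassification.from_pretrained(NAME, num_labels=2)

texts, labels = ["example hateful post", "example normal post"], [1, 0]  # toy data
batch = tokenizer(texts, padding="max_length", truncation=True,
                  max_length=128, return_tensors="pt")  # 128-token limit as above

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
loss = model(**batch, labels=torch.tensor(labels)).loss
loss.backward()
optimizer.step()
```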

Translation: One simple way to utilize datasets in different languages is to rely on translation. Simple translation techniques have been shown to give good results in tasks such as sentiment analysis [34]. We use Google TranslateFootnote 24 to convert all the datasets in different languages to English, since translation to English from other languages typically has fewer errors than the other way round.
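For illustration only, such a preprocessing step could be scripted with the unofficial googletrans client; the paper does not prescribe a particular client, and any interface to Google Translate would serve the same purpose:

```python
from googletrans import Translator  # unofficial Google Translate client

translator = Translator()

def to_english(texts, src="auto"):
    """Translate a list of posts to English before feeding them to BERT."""
    return [translator.translate(t, src=src, dest="en").text for t in texts]
```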

For our experiments, we use the following four models:

  1. MUSE + CNN-GRU: For the given input sentence, we first obtain the corresponding MUSE embeddings, which are then passed as input to the CNN-GRU model.

  2. Translation + BERT: The input sentence is first translated to English and then provided as input to the BERT model.

  3. LASER + LR: For the given input sentence, we first obtain the corresponding LASER embeddings, which are then passed as input to a Logistic Regression (LR) model (a minimal sketch follows this list).

  4. mBERT: The input sentence is directly fed to the mBERT model.
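As a concrete illustration of the third configuration, a minimal LASER + LR pipeline could look as follows (again assuming the laserembeddings wrapper and scikit-learn; the helper name is ours):

```python
from laserembeddings import Laser
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

laser = Laser()

def laser_lr(train_texts, train_labels, test_texts, test_labels, lang):
    """Train logistic regression on LASER sentence embeddings and report macro F1."""
    X_train = laser.embed_sentences(train_texts, lang=lang)
    X_test = laser.embed_sentences(test_texts, lang=lang)
    clf = LogisticRegression(max_iter=1000).fit(X_train, train_labels)
    return f1_score(test_labels, clf.predict(X_test), average="macro")
```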

4.3 Hyperparameter Optimization

We use the validation set performance to select the best set of hyperparameters for the test set. The hyperparameters used in our experiments are as follows: batch size: 16; learning rate: \(2 \times 10^{-5}\), \(3 \times 10^{-5}\), \(5 \times 10^{-5}\); and epochs: 1, 2, 3, 4, 5.
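The resulting grid search can be sketched as follows; train_and_evaluate is a hypothetical placeholder standing in for fine-tuning a model with the given configuration and returning its validation macro F1:

```python
from itertools import product

def train_and_evaluate(lr, epochs, batch_size):
    """Placeholder: fine-tune the model with this configuration and
    return the macro F1-score on the validation set."""
    return 0.0  # replace with the actual training routine

best_config, best_f1 = None, -1.0
for lr, n_epochs in product([2e-5, 3e-5, 5e-5], [1, 2, 3, 4, 5]):
    val_f1 = train_and_evaluate(lr=lr, epochs=n_epochs, batch_size=16)
    if val_f1 > best_f1:
        best_config, best_f1 = (lr, n_epochs), val_f1
```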

5 Results

5.1 Monolingual Scenario

In this setting, we use data from the same language for training, validation, and testing. This scenario commonly occurs in the real world, where a monolingual dataset is used to build a classifier for a specific language.

Observations: Table 3 reports the results of the monolingual scenario. As expected, we observe that with increasing training data, the classifier performance increases as well. However, the relative performance seems to vary depending on the language and the model. We make several observations. First, LASER + LR performs the best in low-resource settings (16, 32, 64, 128, 256) for all the languages. Second, we observe that MUSE + CNN-GRU performs the worst in almost all the cases. Third, Translation + BERT achieves competitive performance for some of the languages such as German, Polish, Portuguese, and Spanish. Overall, we observe that there is no ‘one single recipe’ for all languages; however, Translation + BERT seems to be an excellent compromise. We believe that improved translations for some languages can further improve the performance of this model.

Although LASER + LR seems to do well in the low-resource setting, when enough data is available, we observe that the BERT-based models, Translation + BERT (English, German, Polish, and French) and mBERT (Arabic, Indonesian, Italian, and Spanish), do much better. What is more interesting is that although BERT-based models are known to be successful when a larger number of datapoints is available, even with 256 datapoints some of these models come very close to LASER + LR; for instance, Translation + BERT (Spanish, French) and mBERT (Arabic, Indonesian, Italian).

Table 3. Monolingual scenario: the training, validation and testing data are from the same language. Here, Full D represents the full training data. The bold figures represent the best scores and the underlined figures the second best.
Table 4. Multilingual scenario: the training data is from all the languages except one, and the validation and testing data are from the remaining language. The bold figures represent the best scores.

5.2 Multilingual Scenario

In this setting, we use the datasets from all the languages except one \((N-1)\) for training, and the validation and test sets of the remaining language. This scenario represents the case where one wishes to employ existing hate speech datasets to build a classifier for a new language. We consider LASER + LR and mBERT, which are the most relevant models for this analysis. In the LASER + LR model, we take the LASER embeddings from the \((N-1)\) languages and add to this the target language data points in incremental steps of 16, 32, 64, 128, and 256. The logistic regression model is trained on the combined data, and we test it on the held-out test set of the target language.
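A sketch of this protocol is shown below, assuming per-language lists of texts, labels, and language codes (the data structures and helper name are illustrative; for simplicity the sketch takes the first n_target points, whereas our experiments use stratified samples):

```python
import numpy as np
from laserembeddings import Laser
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

laser = Laser()

def multilingual_laser_lr(source_langs, target, n_target=0):
    """source_langs: list of (texts, labels, lang_code) for the N-1 languages.
    target: dict with train/test texts, labels and its lang code.
    n_target: number of target-language training points to add (0 = zero shot)."""
    X = np.vstack([laser.embed_sentences(t, lang=c) for t, _, c in source_langs])
    y = np.concatenate([np.asarray(l) for _, l, _ in source_langs])
    if n_target > 0:
        X = np.vstack([X, laser.embed_sentences(target["train_texts"][:n_target],
                                                lang=target["lang"])])
        y = np.concatenate([y, np.asarray(target["train_labels"][:n_target])])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    preds = clf.predict(laser.embed_sentences(target["test_texts"],
                                              lang=target["lang"]))
    return f1_score(target["test_labels"], preds, average="macro")
```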

To use mBERT in the multilingual setting, we adopt a two-step fine-tuning method. For a language L, we use the datasets of the \(N-1\) languages (all except the \(L^\text{th}\) language) to train the mBERT model. On this trained mBERT model, we perform a second stage of fine-tuning using the training data of the target language in incremental steps of 16, 32, 64, 128, and 256. The model is then evaluated on the test set of the \(L^\text{th}\) language.

We also test the models for zero-shot performance. In this case, the model is not provided any data of the target language. That is, the model is trained on the \((N-1)\) languages and directly tested on the \(N^\text{th}\) language test set. This corresponds to the case in which we would like to directly deploy a hate speech classifier for a language that does not have any training data.

Observations: Table 4 reports the results of the multilingual scenario. Similar to the monolingual scenario, we observe that with increasing training data, the classifier performance increases in general. This is especially true in the low-resource settings of target languages such as English, Indonesian, Italian, Polish, and Portuguese.

In the case of zero-shot evaluation, we observe that mBERT performs better than LASER + LR in three languages (Arabic, German, and French). LASER + LR performs better on the remaining six languages, with the results for Italian and Portuguese being particularly good. In the case of Portuguese, zero-shot LASER + LR (without any Portuguese training data) obtains an F-score of 0.6567, close to the best result of 0.6941 (using the full Portuguese training data).

For languages such as Arabic, German, and French, mBERT performs better than LASER + LR in almost all the cases (low resource and Full D). LASER + LR, on the other hand, performs well for the Portuguese language in all the cases. For the remaining five languages, we observe that LASER + LR performs better in low-resource settings, but on using the full training data of the target language, mBERT performs better.

5.3 Possible Recipes Across Languages

As we have used the same test set for both scenarios, we can easily compare the results to assess which is better. Using the results from the monolingual and multilingual scenarios, we can decide the best kind of model to use based on the availability of data. The possible recipes are presented as a catalogue in Table 5. Overall, we observe that the LASER + LR model works better in low-resource settings while BERT-based models work well in high-resource settings. This possibly indicates that BERT-based models, in general, work well when more data is available, thus allowing for more accurate fine-tuning. We believe that this catalogue is one of the most important contributions of our work, which can be readily referred to by future researchers working to advance the state-of-the-art in multilingual hate speech detection.

Table 5. The table describes the best model to use in the low and high resource scenarios. In general, LASER + LR performs well in low-resource settings and BERT-based models are better in high-resource settings.

6 Discussion and Error Analysis

6.1 Interpretability

In order to compare the interpretability of mBERT and LASER + LR, we use LIME [31] to calculate the average importance given to words by a particular model. We compute the top 5 most predictive words and their attention for each sentence in the test set. The total score for each word is calculated by summing up all the attentions for each of the sentences where the word occurs in the top 5 LIME features. The average predictive score for each word is calculated by dividing this total score by the occurrence count of each word. In Table 6 we note the top 5 words having the highest attention scores and compare them qualitatively across models.
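This aggregation can be sketched with the lime package as follows; the predict_proba function wrapping the trained classifier, and the reading of the occurrence count as the number of sentences in which a word appears among the top 5 features, are our assumptions:

```python
from collections import defaultdict
from lime.lime_text import LimeTextExplainer

explainer = LimeTextExplainer(class_names=["normal", "hate"])

def average_word_importance(test_texts, predict_proba):
    """Sum the LIME weights of the top 5 features per sentence, then average
    each word's total weight over the sentences where it appears in the top 5."""
    totals, counts = defaultdict(float), defaultdict(int)
    for text in test_texts:
        exp = explainer.explain_instance(text, predict_proba, num_features=5)
        for word, weight in exp.as_list():
            totals[word] += weight
            counts[word] += 1
    return {w: totals[w] / counts[w] for w in totals}
```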

Table 6. Interpretations of the model outcomes.

While comparing the models’ interpretability in Table 6, we see that LASER + LR focuses more on the hateful keywords compared to mBERT, i.e., words like ‘pigs’. mBERT seems to search for some context around the hate keywords, as shown in Table 7. Models dependent on keywords can be useful in a highly toxic environment such as GABFootnote 25, since most of the derogatory keywords typically occur very close to, or at least simultaneously with, the hate target, e.g., the first case in Table 1. On sites which are less toxic, like Twitter, complex methods that attend to the context, like mBERT, might be more helpful, e.g., the third case in Table 1.

Table 7. Examples showing the word with the highest predictive score for both mBERT and LASER + LR.
Table 8. Various types of errors (E) for the models (M): mBERT and LASER + LR. The ground truth (GT) and prediction (P) consist of 0 (Non-Hate)/1 (Hate) labels.

6.2 Error Analysis

In order to delve further into the models, we conduct an error analysisFootnote 26 on both the mBERT and LASER + LR models using a sample of wrongly classified posts from the test set. We analyze the common errors and categorize them into the following four types:

  1. Wrong classification due to annotation dilemma (AD): These error cases occur for ambiguous instances where, in our view, the model predicts correctly but the annotators have labelled the post incorrectly.

  2. Wrong classification due to confounding factors (CF): These error cases occur when the model predictions rely on irrelevant features such as the normalized form of mentions (@user) and links (URL) in the text.

  3. Wrong classification due to hidden context (HC): These error cases occur when the model fails to capture the context of the post.

  4. Wrong classification due to abusive words (AW): These error cases are caused by the over-dependence of the model on abusive words.

Table 8 shows the errors of the mBERT and LASER + LR models. For mBERT, the first example has no specific indication of being hate speech and is considered an error on the part of the annotators. In the second example, the author of the post actually wants the reader not to use the abusive terms, i.e., sl*t and wh*re (found using LIME), but the model picks them up as indicators of hate speech. The third example mentions the term “parasite” as a derogatory remark toward refugees, and the model does not capture this.

For the LASER + LR model, the first example is an error on the part of the annotators. In the second case, the model captures the word “USER” (found using LIME), a confounding factor which affects the model’s prediction. In the third case, the author says (s)he will leave before homosexuality gets normalized, which shows his/her hatred toward the LGBT community, but the model is unable to capture this. In the last case, the model predicts hate speech based on the word “retarded” (found using LIME), which should not be the case.

7 Conclusion

In this paper, we perform the first large-scale analysis of multilingual hate speech. Using 16 datasets from 9 languages, we use deep learning models to develop classifiers for multilingual hate speech classification. We perform many experiments under various conditions (low and high resource, monolingual and multilingual settings) for a variety of languages. Overall, we see that for low-resource settings, LASER + LR is more effective, while for high-resource settings, BERT-based models are more effective. We finally suggest a catalogue which we believe will be beneficial for future research in multilingual hate speech detection. Our code (https://github.com/punyajoy/DE-LIMIT) and models (https://huggingface.co/Hate-speech-CNERG) are available online for other researchers to use.