1 Introduction

As we enter the third decade of the twenty-first century, technology is advancing with every passing day. Among these advancements is social media, one of the most popular and widely used technologies, which has revolutionized the way people communicate with each other. Before its inception, humans had only limited ways of communicating with one another. These platforms are no longer restricted to exchanging messages. They have become the largest and most commonly used channel for sharing information [1] such as news, thoughts, and feelings in the form of text, images, and videos. People can use these platforms to disseminate all kinds of content at their disposal and to express opinions about issues, products, services, and more. This ability, combined with the speed at which online content spreads, has elevated the value of the opinions expressed [2]. Capturing the sentiment in user opinion is crucial since it leads to a more detailed understanding of that opinion [3]. The worldwide shutdown caused by the COVID-19 pandemic resulted in a tremendous rise in social media communication. As a result, a huge volume of data has been generated, and analysis of this data will assist companies in developing better policies that will eventually make these platforms safer for their users.

The global digital population is growing every day. As per [4], there were 4.66 billion active Internet users globally as of October 2020, of whom almost 4.14 billion were active social media users. According to another report [5], the social media penetration rate reached almost 49% in 2020, and the number of global monthly active social media users is expected to reach 3.43 billion by 2023, around one-third of the earth's total population. Among all social media platforms, Facebook is the most popular [5]. It became the first social media platform to surpass the 1 billion monthly active user mark and, in 2020, had almost 2.6 billion monthly active users globally, the highest of any platform. India has over 680 million active Internet users, and as of 2020, it had a Facebook user base of 300 million, the largest of any country. It is reported that, on average, an Internet user in India spends almost 3 h/day on social media, and from 326 million social media users in 2018, India is estimated to reach almost 450 million users by 2023 [6]. In the European Union (EU), 80% of people have encountered hate speech online, and 40% have felt attacked or threatened as a result of their use of social media platforms [7]. According to research conducted by the Pew Research Center in 2021 [8], “about four-in-ten Americans (41%) have experienced some form of online harassment.” Online abusive content is a very serious matter: online extremist narratives have been linked to heinous real-world events such as hate crimes, mass shootings like the one in Christchurch in 2019, assaults, bombings, and threats against prominent people [9].

Abusive textual content on social media platforms is a persistent and serious problem. Exposure to such content increases users' stress and dissatisfaction, and its effects can become very adverse over time. Analyzing and evaluating this generated data reveals important details about the people who produced it [10], and applying emotion recognition techniques to it can aid stress and mental health management. This paper is intended to help new researchers gain an overall perspective of this research area by providing an overview of the field. The paper is organized as follows. Section 2 defines abusive textual content and discusses its overall impact on users. Section 3 discusses the latest popular approaches proposed for this task, along with a brief description of the datasets, techniques, and results obtained. Different ways of communication are discussed in Sect. 4. Limitations and gaps of the approaches are discussed in Sect. 5, followed by Sect. 6, which presents the challenges, future scope, and conclusion.

2 Defining Abusive Textual Content

According to the Washington State Department of Social and Health Services [11], the word abuse “covers many different ways someone may harm a vulnerable adult.” The department categorizes abuse into seven types based on the different ways it can be inflicted. Of these, one type can be inflicted through social media: “mental mistreatment or emotional abuse.” The other types require the physical presence of both the abuser and the victim in some way or another. Mental mistreatment or emotional abuse is defined as “deliberately causing mental or emotional pain. Examples include intimidation, coercion, ridiculing, harassment, treating an adult like a child, isolating an adult from family, friends, or regular activity, use of silence to control behavior, and yelling or swearing which results in mental distress.”

We live in a world where causing someone mental and emotional pain no longer requires physical presence. As discussed in the section above, social media has become an alternative platform for communication [12]. Almost all the examples given in the definition of abuse above can be put into practice on these platforms. The effect this type of content has on mental and emotional health is very dangerous: physical pain usually fades after a while, but emotional pain can linger for a long time and has serious consequences for one's mental health [13].

Abusive content is a broad term covering many different types of content. The next subsection therefore makes a clear distinction among the different types of online abuse present on social media platforms.

2.1 Classification of Online Abusive Content

Online abusive content can be categorized based on the presence of abusive language, aggression, cyberbullying, insults, personal attacks, provocation, racism, sexism, or toxicity (Fig. 1). Based on this classification, any social media content containing language or expression that falls into one of these categories is abusive. Researchers have been very concerned about the dissemination of abusive content and have been devising methods to control its inception, to stop its spread, and, most importantly, to detect abuse that is present in disguise.

Fig. 1 Classification of abusive content

Abusive content has been classified into nine types, and the presence of any such type of textual content on online social media will be termed abusive content. Table 1 depicts this classification and gives three examples from social media platforms for each category, illustrating the differences among the types. In the context of online discussions, the highlighted types are defined as shown in Table 2. The definitions make each type distinct, as every kind is expressed differently on online social media. Researchers have mainly been developing methods to tackle these types of abuse on social media.

Table 1 Exemplar messages of each abusive content category
Table 2 Definitions of different types of abusive content

To the best of our knowledge, there is no definition of abusive content in this field of research to date. We define abusive content as:

The presence of any one, or a combination, of abusive, aggressive, bullying, insulting, sexist, toxic, provocative, personally attacking, or racist remarks in any type of social media content, which has the potential to cause mental and psychological harm to users.

3 Approaches for Detecting Textual Abusive Content

This section provides details about the different approaches employed by researchers to combat abusive content on social media sites. Table 4 gives a detailed description of the methods used for detecting abusive content, along with the datasets used, the platforms from which the datasets were obtained, and the ways the data were classified. It is important to highlight that the trend in abusive content detection is moving toward multi-class rather than binary classification. Researchers have used datasets from multiple sources; in our overall review, Twitter data was found to be the most frequently used for text classification, while datasets from gaming platforms, online blogs, magazines, newspapers, and Wikipedia have also been used. To improve readability and keep notation uniform, Table 3 lists the abbreviations used throughout this article along with their full forms.

Table 3 Abbreviations and their full forms
Table 4 Comparative analysis

Papegnies et al. [23] classified a dataset of users' in-game messages from a multiplayer online game into abusive and non-abusive messages, a binary classification. They used a first-stage naïve Bayes classifier with content-based features to separate abusive from non-abusive messages. Chen et al. [33] used 9 datasets containing data from multiple platforms such as YouTube and Myspace, applying a support vector machine, a convolutional neural network, and a recurrent neural network to detect abusive content. Their results reveal that the SVM classifier achieved the best average recall on balanced datasets, while the deep learning models performed well on extremely unbalanced datasets. The latest research incorporates a variety of datasets from multiple platforms. In [34], logistic regression was set as the baseline model, and CNN-LSTM and BERT-LSTM were implemented on a combination of 6 datasets containing more than 60,000 Twitter records to classify the data into 3 classes; BERT achieved the highest accuracy among all models. Table 4 describes other implemented techniques along with their results and the feature extraction techniques/features used for the various machine learning models. A comparison of all these techniques shows that deep learning models are more effective at classifying abusive text, and they have outperformed other existing approaches for text-based classification [35].
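To make such a baseline concrete, the sketch below shows a minimal TF-IDF plus logistic regression pipeline of the kind used as a baseline in [34]. It is an illustrative sketch, not the authors' implementation: the toy corpus, labels, and hyperparameters are our own assumptions.

```python
# A minimal sketch of a TF-IDF + logistic regression baseline for
# binary abusive/non-abusive classification. The toy corpus, labels,
# and hyperparameters are illustrative assumptions, not details taken
# from the surveyed papers.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Hypothetical labeled messages: 1 = abusive, 0 = non-abusive
texts = [
    "you are a disgrace", "nobody wants you here", "get lost, loser",
    "have a great day", "thanks for sharing this", "what a lovely photo",
]
labels = [1, 1, 1, 0, 0, 0]

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=1 / 3, random_state=42, stratify=labels)

# Word uni- and bi-grams weighted by TF-IDF
vectorizer = TfidfVectorizer(ngram_range=(1, 2), lowercase=True)
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train_vec, y_train)
print(classification_report(y_test, clf.predict(X_test_vec)))
```

Such a pipeline is typically reported with precision, recall, and F1 per class, which is why surveyed works that omit these metrics are harder to compare.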

One of the most difficult tasks when using machine learning is choosing the proper features for a problem. The terms textual features and content features are used interchangeably by researchers. Textual features include Bag-of-Words (BoW), TF-IDF, N-grams, and so on; part-of-speech (POS) tags and dependency relations are two syntactic features that have also been used. Traditional feature extraction methods such as the Bag-of-Words model have been used alongside word embeddings implemented with word2vec, fastText, and GloVe. In natural language processing, BoW with TF-IDF is a traditional and simple feature extraction method, whereas word embedding represents words in a vector space model. Unlike the Bag-of-Words model, word embedding captures the context and semantics of a word: it keeps word contexts and relationships intact, allowing it to detect similar words more accurately. The available literature shows that word embeddings used with deep learning models outperform traditional feature extraction methods. Ahammad et al. [36] showed that long short-term memory (LSTM) and gated recurrent unit (GRU) models give better accuracy than the others on trained embeddings and GloVe, respectively. Table 4 lists the various feature extraction techniques/features used by researchers.
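The sketch below illustrates how an embedding layer feeds a recurrent classifier, in the spirit of the LSTM/GRU models in [36]. It is a minimal assumed architecture: the vocabulary size, embedding dimension, and layer widths are illustrative choices, not values from the paper.

```python
# A minimal sketch of an embedding-based LSTM classifier in the spirit
# of the LSTM/GRU models discussed above. Vocabulary size, sequence
# length, and layer widths are assumed values for illustration only.
import tensorflow as tf

VOCAB_SIZE = 20000  # assumed vocabulary size after tokenization
MAX_LEN = 100       # assumed maximum number of tokens per message
EMBED_DIM = 100     # matches a common GloVe dimensionality

model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_LEN,)),
    # The Embedding layer can be trained from scratch or initialized
    # with pre-trained vectors such as GloVe.
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # abusive vs. not
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
```

The key contrast with BoW/TF-IDF is visible in the first layer: each token index is mapped to a dense vector in which semantically similar words lie close together, rather than to an independent sparse dimension.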

To deal with the problem of abusive textual content detection, researchers have presented numerous machine learning algorithms and their variants. Much of the research in this area focuses on extracting features from text, and many of the proposed works use text feature extraction techniques such as BoW and dictionaries. These features, however, were found unable to capture the context of phrases. N-gram-based approaches outperform their counterparts in terms of results and performance [14].
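As a brief illustration of such features, the following sketch extracts word and character n-grams with scikit-learn; the example message is hypothetical. Character n-grams are often favored in this task because they remain informative when abusive words are deliberately obfuscated.

```python
# A minimal sketch contrasting word and character n-gram features;
# the example message is hypothetical.
from sklearn.feature_extraction.text import CountVectorizer

docs = ["you are pathetic"]

# Word bigrams capture short multi-word expressions
word_ngrams = CountVectorizer(analyzer="word", ngram_range=(2, 2))
print(word_ngrams.fit(docs).get_feature_names_out())
# -> ['are pathetic' 'you are']

# Character n-grams stay informative under obfuscations such as
# "path3tic", which word-level features would miss entirely
char_ngrams = CountVectorizer(analyzer="char_wb", ngram_range=(3, 3))
print(char_ngrams.fit(docs).get_feature_names_out()[:5])
```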

4 Abusive Content in Social Media

The content shared over social media platforms can be divided in several ways (Fig. 2). The classification is discussed below.

Fig. 2 Ways of communication in OSNs

Possible cases of abusive content spread in OSNs (i.e., how social media platforms can be used to spread abusive content) are:

  1. One to one (individual to individual): the attacker is an individual and the victim is also an individual. For example, personal text messages on social media platforms can be used to send abusive content.

  2. One to many (individual to community): the attacker is an individual and the victim is a community. For example, social media posts by individuals are sometimes exploited to target a community with abusive content.

  3. Many to one (community to individual): the attacker is a community targeting an individual. For example, a community posting or spreading abusive content about an individual on social media.

  4. Many to many (community to community): a community targeting another community over some issue. For example, persons belonging to one community posting abusive content that targets other communities.

5 Discussion

The approaches for classifying textual abusive content discussed in Sect. 3, when viewed critically, have their own limitations and gaps. Highlighting these limitations should motivate researchers to address them and develop more effective methods.

In terms of practical application, the performance of [23] is insufficient for a fully automated system that replaces human moderation. When using machine learning techniques, performance metrics such as precision, accuracy, and F1 score provide useful information, yet these metrics were not evaluated in [33]. Many researchers have used over-sampling and under-sampling throughout their studies, and the datasets in [33] were subjected to both methods. These techniques should, however, be used with caution, because excessive use of either can result in over-fitting or the loss of important information. The authors of [42] used data augmentation; determining the best augmentation approach is critical because biases in the original dataset carry over into data augmented from it.

Large datasets are preferred, but small datasets are used in [37, 41], and small datasets affect a machine learning model's performance [44]. Because small datasets typically contain fewer details, the classification model cannot generalize the patterns learned from the training data. Furthermore, over-fitting can sometimes extend beyond the training data and affect the validation set as well, making it much harder to avoid. Parameter tuning was also found to produce better results in [41]. In [38], tweets are labeled solely based on keywords, so tweets containing abuse and harassment expressed in plain language without any of the authors' keywords are omitted. A dataset that only includes comments from articles on a single topic is used in [39]; this limits the diversity of the dataset, which makes the performance of the various techniques less significant.

It is also worth noting that researchers are classifying data into a variety of categories, making abusive content classification a multi-class problem; a classification of abusive content was given in Sect. 2 of this paper. Most of the models covered in this study used multi-class classification, but [23, 40] used binary classification. The performance of the implemented models is critical in highlighting their significance: the models proposed in [34] did not outperform the state-of-the-art models RoBERTa and XLM-R. Using more training and test data is also beneficial for model performance; the researchers in [36] used little training and test data and did not evaluate classifier performance on imbalanced class distributions. The models used in [43] achieve better results at a high computational cost. The limitations and gaps discussed above may provide future directions for researchers in this field.
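On the resampling caveat above, a common safeguard is to resample only the training split so that duplicated minority examples cannot leak into evaluation. The sketch below, assuming the imbalanced-learn library and synthetic data, illustrates this ordering.

```python
# A minimal sketch of the resampling precaution noted above, assuming
# the imbalanced-learn library and synthetic data: split first, then
# over-sample only the training portion, so duplicated minority
# examples cannot leak into the test set and inflate scores.
import numpy as np
from imblearn.over_sampling import RandomOverSampler
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))          # synthetic feature matrix
y = np.array([0] * 90 + [1] * 10)      # 9:1 class imbalance

# Split first ...
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# ... then over-sample the training split only
ros = RandomOverSampler(random_state=0)
X_train_bal, y_train_bal = ros.fit_resample(X_train, y_train)
print(np.bincount(y_train_bal))        # classes now balanced in training
```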

6 Conclusion and Future Scope

This paper offered a complete overview of the topic by synthesizing recent research articles that used cutting-edge approaches. Researchers have employed machine learning approaches successfully, with Bag-of-Words (BoW) and N-grams being the most commonly used features in classification. Several recent studies have adopted distributed word representations, also known as word embeddings, because of previous models' high dimensionality and sparsity. Deep learning-based architectures have lately shown promising outcomes in this discipline, and despite its recent introduction, BERT has become widely used. The most popular deep learning models are LSTM and CNN. Moreover, hybrid models combining multiple architectures, such as BERT + CNN, LSTM + CNN, LSTM + GRU, and BERT + LSTM, have also been used, and their performance in detecting online abuse is promising. To summarize, deeper language features, demographic influence analyses, and precise annotation criteria are required to effectively discern between different types of abuse. Despite the abundance of available work, judging the usefulness and performance of different features and classifiers remains difficult because each researcher used a different dataset; clear annotation criteria and a benchmark dataset are necessary for comparative evaluation. According to the findings, abusive content detection remains an active research area needing more intelligent algorithms to handle the primary issues involved and make online interaction safer for users [45].

In this paper, the approach for identifying recent works in abusive content detection was limited to those dealing with the textual form of data present on social media platforms, such as tweets, comments, chats, reviews, and blogs. Only research papers that used English-language datasets were chosen. Work on detecting abusive content in languages other than English also exists, but due to the scarcity of datasets, the results are not yet as effective. Moreover, of the 9 different types of abusive content discussed in Sect. 2, research has focused on only a subset. Research that classifies all the different types from a single corpus is the need of the hour, and it will encourage budding researchers to take up the task. The future work of this study will therefore be to identify cross-language and cross-domain techniques being developed and to analyze their performance against the existing state-of-the-art approaches.