1 Introduction

Over the last few years, the term fake news has become so widespread that the phenomenon is now a worldwide issue [15, 17]. The concept gained traction after the emblematic 2016 US elections, in which the diffusion of misinformation on social networks was used as a form of propaganda to gain substantial political advantage [8]. The main characteristics of fake news, i.e. volume, variety and velocity [22], are sustained by the rapid spread of web bots [12], which make fabricated articles easy to publish and disinformation sources even harder to recognise and control. In this scenario, fact-checkers [10] and content managers [18] are turning their attention to automatic detection systems for two main reasons: 1) manual detection by experts and organisations is a time-consuming and expensive process that requires a large human-resources investment to sustain [11]; 2) the nature and composition of fake news differ from one fabricated article to another. Indeed, some news entries are blatant lies, while others hide their disinformation content among facts. Furthermore, the outcomes of such systems have to be transparent to build trust, since their results must still be cross-checked before an article is deemed false.

In this paper, we propose Facade, an automatic system for fake article classification and decision explanation. The system is designed with a cascading architecture composed of two classification pipelines. For each document to analyse, the detection process starts with a first classifier that exploits basic linguistic features (low-level descriptors) previously extracted from several fake news datasets. The second pipeline makes use of more complex features (high-level descriptors), such as sentiment, emotion, and attribution to known real or fake sources, computed by additional algorithms. We further present an explainable user interface (UI) that helps end users understand which parts of the investigated article are likely to be fake and why, through feature importance and post-hoc explanation methods.

2 Existing Fake News Detection Systems

Early detection efforts started with manual fact-checking initiatives and, despite the enormous human effort they require, some of them, such as Truth-o-Meter [1] and Snopes [2], remain highly reliable today. On the automatic detection front, many works, such as [19], shape their systems around the notion of linguistic similarity between the analysed content and known real or fake articles. Nevertheless, the state of the art is unsurprisingly dominated by machine learning and deep learning models, which usually rely on a supervised learning approach (e.g. [24, 26]). In a recent publication in this field, Zhang et al. [23] leveraged the relationship between the emotions portrayed in the news content and the emotions end users express in the related comments. In most existing systems, however, interpretability is almost entirely overlooked. Because fake and real news coexist, it is necessary to incorporate the perspective of experts and of the audience in general [14, 25], and this can be achieved through an effective explainable UI. Only a few works, such as dEFEND [20] and Xfake [21], have presented solutions with explainability as a fundamental part of the system.

3 The Facade System

The Facade system is designed with a cascading architecture composed of three main phases. 1) Feature extraction: low-level and high-level features are extracted from the adopted fake news datasets: ISOT Fake News Dataset [3], Fake News Dataset [4], Fake News Corpus [5], Multi-Perspective Question Answering Dataset (MPQA) [6] and Myers-Briggs Personality Type Dataset (MBTI) [7]. 2) Classification: leveraging the low-level features, a first classifier is run on the documents to produce the probability that the analysed news is fake or real. 3) Filtering: based on the resulting probability and the confidence level of the classifier fed with low-level descriptors (i.e. basic linguistic features extracted from the article texts and headlines, such as size, number of grammatical errors, parts of speech and term frequencies), each news item is marked as fake, real or uncertain. For the latter group, a second classification is applied, making use of high-level descriptors (i.e. complex features detected from the news content with additional algorithms, such as sentiment, entailment, attribution, syntactical structure, tones and latent topics), as sketched in the example below. The two pipelines use different classifiers tailored to their input features. The classifiers were selected based on evaluation metrics such as accuracy, recall, precision, F1 score and other customised metrics, whose detailed discussion is outside the scope of this paper.
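To make the filtering step concrete, the following Python sketch shows how such a confidence-based cascade could be wired together; the thresholds, classifiers and feature-extraction helpers (extract_low_level, extract_high_level) are hypothetical placeholders and do not reflect Facade's exact configuration.

```python
# Minimal sketch of the confidence-based cascade, assuming two fitted
# scikit-learn-style classifiers exposing predict_proba(); thresholds and
# feature-extraction helpers are hypothetical placeholders.

UNCERTAIN_LOW, UNCERTAIN_HIGH = 0.4, 0.6   # assumed confidence band for "uncertain"

def classify_article(text, low_level_clf, high_level_clf,
                     extract_low_level, extract_high_level):
    """Return (label, probability), escalating uncertain cases to the second pipeline."""
    # First pipeline: basic linguistic (low-level) descriptors
    p_fake = low_level_clf.predict_proba([extract_low_level(text)])[0, 1]
    if p_fake < UNCERTAIN_LOW:
        return "real", p_fake
    if p_fake > UNCERTAIN_HIGH:
        return "fake", p_fake
    # Second pipeline: complex (high-level) descriptors such as sentiment or attribution
    p_fake_hl = high_level_clf.predict_proba([extract_high_level(text)])[0, 1]
    return ("fake" if p_fake_hl >= 0.5 else "real"), p_fake_hl
```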

The explainability methods included in the system, which form the basis for the UI, are feature importance, partial dependence plots and SHAP. Feature importance [9] is a widely used method for identifying the attributes that contribute the most to the classifier's predictions. Partial dependence plots (PDP) [13] are a model-agnostic, global method that links the target label (in our case, fake or real) to the attributes used by the classifiers (i.e. the low-level and high-level descriptors). SHAP (SHapley Additive exPlanations) [16] is a state-of-the-art explainability technique, mainly used to quantify the effect of each attribute on a classifier's prediction.
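For concreteness, the sketch below applies the three methods to a toy classifier trained on synthetic descriptors (assuming the scikit-learn, shap, pandas and numpy packages); the feature names and data are illustrative placeholders, not Facade's actual models or features.

```python
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import PartialDependenceDisplay

# Toy stand-in for the descriptors extracted by the pipelines (illustrative only).
rng = np.random.default_rng(0)
feature_names = ["sentiment_score", "num_grammar_errors", "headline_length"]
X = pd.DataFrame(rng.random((200, 3)), columns=feature_names)
y = (X["sentiment_score"] + 0.3 * rng.random(200) > 0.8).astype(int)  # 1 = fake

model = RandomForestClassifier(random_state=0).fit(X, y)

# 1) Feature importance: which attributes contribute most to the predictions.
print(sorted(zip(feature_names, model.feature_importances_),
             key=lambda t: t[1], reverse=True))

# 2) Partial dependence plot: global link between one descriptor and the label.
PartialDependenceDisplay.from_estimator(model, X, ["sentiment_score"])

# 3) SHAP: per-prediction attribution of each descriptor's effect.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
# shap.summary_plot(shap_values, X)  # optional visual overview
```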

4 Demonstration

In this section, we guide the reader through the functionalities of Facade, whose UI is designed in a Harry Potter style, resembling a wizard revealing the truths or falsehoods of an investigated article.

The initial page (Fig. 1) shows a welcome message (Fig. 1a) with two input options (Fig. 1b): inserting the URL of a public article or manually typing a custom text to analyse, which is useful for evaluating a single piece of news.

Fig. 1. Initial page of Facade

After the execution of the two pipelines, we land on the result page (Fig. 2a), where we can view the prediction (top left corner) and the related confidence score to its right. Optionally, we can highlight the sentences that the system attributes to real or fake sources, coloured in green and red, respectively. The colour gradient reflects the similarity score between a sentence and the attributed source. As displayed in Fig. 2b, we can also check the detailed explanations for attributions and features by hovering over the specific contributions. The list of the most influential features is shown on the right-hand side of the result page; the number of arrows next to each feature name indicates how strongly that feature contributed to the prediction.
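As an illustration of the gradient idea, a helper like the hypothetical one below could map an attribution label and similarity score to a CSS colour whose opacity grows with the similarity; the exact mapping used by Facade is not specified here.

```python
def highlight_colour(label: str, similarity: float) -> str:
    """Map an attribution label and similarity score to an rgba() CSS colour
    whose opacity grows with the similarity (illustrative scaling only)."""
    alpha = round(0.2 + 0.8 * max(0.0, min(1.0, similarity)), 2)
    rgb = (46, 160, 67) if label == "real" else (218, 54, 51)   # green / red
    return f"rgba({rgb[0]}, {rgb[1]}, {rgb[2]}, {alpha})"

# e.g. highlight_colour("fake", 0.92) -> "rgba(218, 54, 51, 0.94)"
```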

Additionally, by browsing the Explainer Dashboard, we can visualise all the SHAP values and the partial dependence plots for a single prediction. With the “What If” module, we can adjust the input feature values to see how the prediction changes in a counterfactual scenario.
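One way to realise such a dashboard is with the open-source explainerdashboard Python package (an assumption on our part; the underlying library is not named here). A minimal sketch, with model, X_test and y_test standing in for a fitted classifier and held-out descriptors:

```python
from explainerdashboard import ClassifierExplainer, ExplainerDashboard

# model, X_test, y_test are placeholders for a fitted classifier and its test data.
explainer = ClassifierExplainer(model, X_test, y_test, labels=["real", "fake"])

# The what-if tab lets the user perturb feature values and inspect the
# counterfactual prediction; SHAP and partial dependence views are available per prediction.
ExplainerDashboard(explainer, whatif=True).run(port=8050)
```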

Fig. 2. Result page of Facade with explanation details (Color figure online)

To summarise, the system is designed to address the needs of both computer scientists and non-expert audiences. The more specific aspects, such as entering the URL or the news text to be questioned and highlighting the parts of the article that the second pipeline considers fake or real, mainly cater to non-expert audiences. The highlighting uses an intuitive colour scheme so that it is easy for everyone to follow, irrespective of their background or technical knowledge: red and green are conventionally associated with wrong and right, respectively, so the same idea translates to the fakeness or realness of the news articles. Moreover, highlighting and pop-up boxes are standard methods in UI design and can help the user externally validate the decision by checking the reasoning behind it; this feedback can also be used to further improve the system in case of incorrect tagging. The explainer dashboard, in contrast, is designed to communicate the technical information with accuracy and clarity, and is primarily addressed to computer scientists.

5 Conclusion

We presented a novel fake news detection system whose capabilities overcome the limitations of existing systems by exploiting both linguistic features, extracted from benchmark fake news datasets to analyse an article's text, and complex features (e.g. sentiment, topic, attribution) that enrich the range of descriptors and enhance the classification performance. In addition, through an explainable UI, we aim to provide fact-checkers and content managers with a reliable tool for cross-checking the validity of the system's results. As next steps, we plan to improve the system's response time and to perform a user study evaluating overall user satisfaction when interacting with Facade and its UI.