Abstract
Fake news, a form of news consisting of deliberate disinformation or hoaxes spread via traditional news media or online social media, is one of the major concerns in today's society. This study therefore explores state-of-the-art methods for detecting fake news in order to design and implement classification models. Four classification models based on deep learning with a self-attention mechanism were trained and evaluated on datasets currently available for this purpose. Three models used traditional supervised learning, while the fourth used transfer learning by fine-tuning a pre-trained language model for the same task. All four models yield comparable results, with the fourth achieving the best classification accuracy.
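The scaled dot-product self-attention mechanism (Vaswani et al., 2017) underlying all four models can be sketched in a few lines. The following is an illustrative NumPy implementation, not the chapter's actual code; the dimensions and weight matrices are arbitrary placeholders.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token vectors.

    X          : (seq_len, d_model) input embeddings
    Wq, Wk, Wv : (d_model, d_k) query/key/value projection matrices
    Returns a (seq_len, d_k) matrix of context vectors, where each output
    row is a weighted mix of all value vectors in the sequence.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # (seq_len, seq_len) similarities
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V

# Toy example: 5 tokens with 8-dimensional embeddings, projected to d_k = 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 4)
```

In the supervised models this operation lets every token attend to every other token in the article; in the transfer-learning model the same mechanism is applied inside the pre-trained transformer layers being fine-tuned.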
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Cvitanović, I., Babac, M.B. (2022). Deep Learning with Self-Attention Mechanism for Fake News Detection. In: Lahby, M., Pathan, AS.K., Maleh, Y., Yafooz, W.M.S. (eds) Combating Fake News with Computational Intelligence Techniques. Studies in Computational Intelligence, vol 1001. Springer, Cham. https://doi.org/10.1007/978-3-030-90087-8_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90086-1
Online ISBN: 978-3-030-90087-8
eBook Packages: Intelligent Technologies and Robotics