Abstract
Advances in communication channels and technology have made digital news widely accessible to a large community of users, which has in turn contributed to the spread of fake news. This study investigates machine learning models that classify news as either fake or real. Five classifiers were implemented using the Random Forest, Support Vector Machine, Gradient Boosting, Logistic Regression, and Naïve Bayes algorithms. The models were trained on merged open-source datasets extracted from online sources covering different domains. Text tokenization, lemmatization, and vectorization were applied to extract useful information from news text and to improve the generalization capabilities and performance of the fake news classification models. The impact of the voting strategy on the performance of ensemble learning models was explored. The five classifiers were evaluated using accuracy, F1-score, recall, and precision. The results are promising: the ensemble classifier that combines the random forest and gradient boosting algorithms outperforms the other classifiers and thus might be used effectively against the spread of fake news.
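The abstract names two voting strategies and four evaluation metrics without spelling them out. The following is a minimal sketch of the general idea, not the chapter's implementation: it assumes binary labels (1 = fake, 0 = real), uses made-up probabilities for two hypothetical models standing in for random forest and gradient boosting, and resolves hard-voting ties toward "real", which is an arbitrary choice made here for illustration.

```python
def hard_vote(label_lists):
    """Majority vote over per-model 0/1 labels; ties fall back to 0 (real).

    The tie rule is an assumption of this sketch, not taken from the chapter.
    """
    voted = []
    for labels in zip(*label_lists):
        fake_votes = sum(labels)
        voted.append(1 if 2 * fake_votes > len(labels) else 0)
    return voted


def soft_vote(prob_lists, threshold=0.5):
    """Average each model's P(fake) and predict fake when the mean reaches threshold."""
    voted = []
    for probs in zip(*prob_lists):
        mean = sum(probs) / len(probs)
        voted.append(1 if mean >= threshold else 0)
    return voted


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative data only: six articles, true labels, and invented P(fake) scores.
y_true = [1, 0, 1, 1, 0, 0]
model_a_probs = [0.9, 0.4, 0.45, 0.8, 0.2, 0.6]   # hypothetical model A
model_b_probs = [0.7, 0.3, 0.7, 0.4, 0.1, 0.3]    # hypothetical model B

# Hard voting works on each model's own thresholded labels.
model_a_labels = [1 if p >= 0.5 else 0 for p in model_a_probs]
model_b_labels = [1 if p >= 0.5 else 0 for p in model_b_probs]

hard_pred = hard_vote([model_a_labels, model_b_labels])
soft_pred = soft_vote([model_a_probs, model_b_probs])

hard_scores = binary_metrics(y_true, hard_pred)
soft_scores = binary_metrics(y_true, soft_pred)
```

On this toy data the two strategies disagree wherever the models split, because soft voting can use how confident each model is while hard voting sees only the labels. That difference is the kind of effect a comparison of voting strategies would measure; the numbers here say nothing about the chapter's actual results.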
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Elyassami, S., Alseiari, S., ALZaabi, M., Hashem, A., Aljahoori, N. (2022). Fake News Detection Using Ensemble Learning and Machine Learning Algorithms. In: Lahby, M., Pathan, AS.K., Maleh, Y., Yafooz, W.M.S. (eds) Combating Fake News with Computational Intelligence Techniques. Studies in Computational Intelligence, vol 1001. Springer, Cham. https://doi.org/10.1007/978-3-030-90087-8_7
DOI: https://doi.org/10.1007/978-3-030-90087-8_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90086-1
Online ISBN: 978-3-030-90087-8
eBook Packages: Intelligent Technologies and Robotics (R0)