Abstract
Fake news is widely published in digital media to increase visitor traffic, and in the process it plays on users' emotions. The most common example of such fake news during the COVID-19 pandemic is the range of supposed remedies for curing COVID-19. As a result, individuals struggle to recognize genuine news. People try numerous remedies that will never help cure this contagious disease and that might, moreover, lead to other major health problems. In this paper, a framework is presented for classifying news as fake or real. Text data is pre-processed using Natural Language Processing (NLP) techniques: tokenization, text cleaning, and vectorization. N-gram and TF-IDF vectorization are used. Seven Machine Learning (ML) algorithms are then applied for classification. Two datasets, Kaggle and ISOT, are used for experimentation, and the results are evaluated on the same scale using several evaluation metrics to demonstrate the efficacy of the proposed framework.
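The pre-processing steps named in the abstract (tokenization, text cleaning, n-gram extraction, and TF-IDF weighting) can be illustrated with a minimal standard-library sketch. This is not the chapter's implementation: the function names `tokenize`, `ngrams`, and `tfidf_vectors` are hypothetical, the cleaning rule (lowercase, alphabetic tokens only) and the IDF variant `log(N / df)` are assumptions, and the two sample headlines are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Clean and tokenize: lowercase the text, keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n_min=1, n_max=2):
    """Return all word n-grams with n_min <= n <= n_max, joined by spaces."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_vectors(documents, n_min=1, n_max=2):
    """Map each document to a dict of {n-gram: TF-IDF weight}.

    TF is the relative frequency of the n-gram within the document.
    IDF is log(N / df), one common textbook variant (assumed here).
    """
    term_lists = [ngrams(tokenize(doc), n_min, n_max) for doc in documents]
    n_docs = len(term_lists)

    # Document frequency: in how many documents each n-gram occurs.
    doc_freq = Counter()
    for terms in term_lists:
        doc_freq.update(set(terms))

    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1  # guard against empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Invented sample headlines, for illustration only.
docs = [
    "Garlic water cures covid overnight!",
    "Health officials report new covid vaccine trial results.",
]
vectors = tfidf_vectors(docs)
```

With this weighting, an n-gram that occurs in every document (here, "covid") receives a weight of zero, while n-grams unique to one document receive positive weights. The resulting sparse vectors are the kind of input a classifier would then be trained on.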
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Dubey, Y., Wankhede, P., Borkar, A., Borkar, T., Palsodkar, P. (2022). Framework for Fake News Classification Using Vectorization and Machine Learning. In: Lahby, M., Pathan, AS.K., Maleh, Y., Yafooz, W.M.S. (eds) Combating Fake News with Computational Intelligence Techniques. Studies in Computational Intelligence, vol 1001. Springer, Cham. https://doi.org/10.1007/978-3-030-90087-8_16
DOI: https://doi.org/10.1007/978-3-030-90087-8_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90086-1
Online ISBN: 978-3-030-90087-8
eBook Packages: Intelligent Technologies and Robotics (R0)