Framework for Fake News Classification Using Vectorization and Machine Learning

Dubey, Yogita; Wankhede, Pushkar; Borkar, Amey; Borkar, Tanvi; Palsodkar, Prachi

doi:10.1007/978-3-030-90087-8_16

Part of the book series: Studies in Computational Intelligence (SCI, volume 1001)

Abstract

Fake news is widely circulated in digital media to increase visitor traffic, and it also plays on users' emotions. The most common example during the COVID-19 pandemic is the variety of purported remedies for the disease. As a result, people struggle to recognize genuine news. They try numerous remedies that do nothing to cure this contagious disease and that may lead to other major health problems. This paper presents a framework for classifying news as fake or real. Text data is pre-processed with Natural Language Processing (NLP) techniques: tokenization, text cleaning, and vectorization using N-grams and TF-IDF. Seven Machine Learning (ML) algorithms are then applied for classification. Experiments use two datasets, Kaggle and ISOT, and the results are evaluated on the same scale with several evaluation metrics to demonstrate the efficacy of the proposed framework.
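The pre-processing steps named in the abstract (tokenization, text cleaning, N-gram extraction, and TF-IDF weighting) can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the chapter's implementation: the function names `tokenize`, `ngrams`, and `tfidf_vectorize`, the unsmoothed weighting formula, and the two sample sentences are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic word tokens only (a simple cleaning step)."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n_min=1, n_max=2):
    """Return all word n-grams with n_min <= n <= n_max, joined by spaces."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_vectorize(docs, n_min=1, n_max=2):
    """Return (idf, vectors): an IDF table and one sparse TF-IDF dict per document.

    Assumed weighting: TF = count / number of n-grams in the document,
    IDF = log(N / document frequency). Libraries often use smoothed variants.
    """
    grams_per_doc = [ngrams(tokenize(d), n_min, n_max) for d in docs]

    # Document frequency: in how many documents each n-gram occurs.
    doc_freq = Counter()
    for grams in grams_per_doc:
        doc_freq.update(set(grams))

    n_docs = len(docs)
    idf = {g: math.log(n_docs / df) for g, df in doc_freq.items()}

    vectors = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        total = len(grams) or 1  # guard against empty documents
        vectors.append({g: (c / total) * idf[g] for g, c in counts.items()})
    return idf, vectors


# Hypothetical sample texts, invented for illustration (not from Kaggle or ISOT).
docs = [
    "Miracle herb cures the virus overnight!",
    "Health agency reports the virus vaccine trial results.",
]
idf, vectors = tfidf_vectorize(docs)
```

N-grams that occur in every document, such as "the virus" here, receive a weight of zero under this formula, so the vectors emphasize the terms that distinguish one text from another. The resulting sparse vectors are the kind of input a downstream classifier would consume.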

Author information

Authors and Affiliations

Yeshwantrao Chavan College of Engineering, Nagpur, India
Yogita Dubey, Pushkar Wankhede, Amey Borkar, Tanvi Borkar & Prachi Palsodkar

Editor information

Editors and Affiliations

Hassan II University, Casablanca, Morocco
Mohamed Lahby
United International University, Dhaka, Bangladesh
Al-Sakib Khan Pathan
Sultan Moulay Slimane University, Khouribga, Morocco
Yassine Maleh
Taibah University, Madinah, Saudi Arabia
Wael Mohamed Shaher Yafooz

About this chapter

Cite this chapter

Dubey, Y., Wankhede, P., Borkar, A., Borkar, T., Palsodkar, P. (2022). Framework for Fake News Classification Using Vectorization and Machine Learning. In: Lahby, M., Pathan, AS.K., Maleh, Y., Yafooz, W.M.S. (eds) Combating Fake News with Computational Intelligence Techniques. Studies in Computational Intelligence, vol 1001. Springer, Cham. https://doi.org/10.1007/978-3-030-90087-8_16

DOI: https://doi.org/10.1007/978-3-030-90087-8_16
Published: 16 December 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90086-1
Online ISBN: 978-3-030-90087-8
eBook Packages: Intelligent Technologies and Robotics (R0)
