Abstract
Data has become deeply embedded in everyday life. With advances in technology and falling Internet costs, data usage has increased many-fold, generating large volumes of unstructured data known as big data. Such unstructured data is difficult to handle with conventional database management technology. We observed that the volume of genetic information related to the coronavirus is growing rapidly every day. Big data analytics can make these databases easier to manage and so support COVID-19 research. In this article, we use the Hadoop Distributed File System (HDFS) for efficient data management. We classified the gene classes present in complete sequences so that mutations can be detected quickly. To achieve this, we built machine learning models to classify gene sequences quickly and used libraries such as matplotlib to construct detailed graphs of the data. We chose three different sequences, classified them using a natural language processing technique from the scikit-learn (Sklearn) library, and tested our results using logistic regression.
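The abstract does not say how gene sequences are turned into text-like features. One common way to apply natural language processing to DNA, assumed here purely for illustration, is to split each sequence into overlapping k-mers ("words") and count them as a bag of words. The sketch below shows only that feature-extraction step using the Python standard library. The function names, the value of k, and the toy sequences are hypothetical and are not taken from the chapter.

```python
from collections import Counter


def kmer_words(sequence, k=6):
    """Split a nucleotide sequence into overlapping k-mers ("words")."""
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def bag_of_words(sequences, k=6):
    """Build a shared k-mer vocabulary and one count vector per sequence."""
    counts = [Counter(kmer_words(s, k)) for s in sequences]
    vocabulary = sorted(set().union(*counts))
    vectors = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, vectors


# Toy sequences invented for illustration only (not real viral genes).
toy_sequences = ["ATGCATGCA", "ATGGGGCAT"]
vocabulary, vectors = bag_of_words(toy_sequences, k=3)
```

Count vectors of this kind could then be passed to a classifier such as logistic regression. Whether the authors used k-mers, and which k, is not stated in the abstract.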
Acknowledgements
We would like to thank Amity Institute of Biotechnology and our families; without their support throughout the process, this paper could not have been completed. We would also like to thank Amity University for giving us this great opportunity.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Mohanty, S., Sharma, R., Saxena, M., Saxena, A. (2021). Heuristic Approach Towards COVID-19: Big Data Analytics and Classification with Natural Language Processing. In: Khanna, A., Gupta, D., Pólkowski, Z., Bhattacharyya, S., Castillo, O. (eds) Data Analytics and Management. Lecture Notes on Data Engineering and Communications Technologies, vol 54. Springer, Singapore. https://doi.org/10.1007/978-981-15-8335-3_59
DOI: https://doi.org/10.1007/978-981-15-8335-3_59
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-8334-6
Online ISBN: 978-981-15-8335-3
eBook Packages: Intelligent Technologies and Robotics (R0)