Heuristic Approach Towards COVID-19: Big Data Analytics and Classification with Natural Language Processing

  • Conference paper
Data Analytics and Management

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 54)

Abstract

Data has become deeply embedded in everyday life. With advances in technology and cheaper Internet access, data usage has increased many fold, generating huge volumes of unstructured data known as big data. This unstructured big data is difficult to handle with existing database management technology. We observed that the amount of genetic information related to coronavirus is growing every day. Applying big data analytics should make these databases easier to manage, supporting progress in COVID-19 research. In this article, we used the Hadoop Distributed File System (HDFS) for efficient data management. We classified the gene classes present in complete sequences so that mutations can be detected quickly. To achieve this, we built machine learning models to classify gene sequences quickly, and used libraries such as matplotlib to plot the data in detail. We chose three different sequences, classified them using a natural language processing technique from the Sklearn library, and tested our results using logistic regression.
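
The abstract does not say which natural language processing technique was applied to the sequences. One common way to treat a gene sequence as text is to split it into overlapping k-mers and count them like words. The sketch below illustrates that idea only and is not the authors' pipeline: the function names, the toy sequence, and the choice of k are all assumptions made for illustration.

```python
from collections import Counter
from typing import List


def kmer_words(sequence: str, k: int = 6) -> List[str]:
    """Split a nucleotide sequence into overlapping k-mers ("words").

    Hypothetical helper: k=6 is an arbitrary illustrative default,
    not a value taken from the paper.
    """
    seq = sequence.upper()
    # A sequence shorter than k yields an empty list.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def to_sentence(sequence: str, k: int = 6) -> str:
    """Join the k-mers with spaces so a text vectorizer can tokenize them."""
    return " ".join(kmer_words(sequence, k))


def bag_of_kmers(sequence: str, k: int = 6) -> Counter:
    """Count how often each k-mer occurs (a bag-of-words representation)."""
    return Counter(kmer_words(sequence, k))


# Toy sequence, invented for illustration only.
toy = "ATGCATGCA"
words = kmer_words(toy, k=4)
counts = bag_of_kmers(toy, k=4)
print(words)
print(counts.most_common(2))
```

Under the same assumptions, the space-joined "sentences" from `to_sentence` could be passed to scikit-learn's `CountVectorizer`, and the resulting count matrix to `LogisticRegression`, to classify sequences by gene class.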

Acknowledgements

We would like to thank Amity Institute of Biotechnology and our families; without their support throughout the process, this paper would not have been completed. We would also like to thank Amity University for giving us this opportunity.

Author information

Corresponding author

Correspondence to Ankur Saxena.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Mohanty, S., Sharma, R., Saxena, M., Saxena, A. (2021). Heuristic Approach Towards COVID-19: Big Data Analytics and Classification with Natural Language Processing. In: Khanna, A., Gupta, D., Pólkowski, Z., Bhattacharyya, S., Castillo, O. (eds) Data Analytics and Management. Lecture Notes on Data Engineering and Communications Technologies, vol 54. Springer, Singapore. https://doi.org/10.1007/978-981-15-8335-3_59
