1 Introduction

The rapid spread of coronavirus disease 2019 (COVID-19) caused widespread distress and panic among people all over the world. At the end of 2019, this novel infectious disease broke out in Wuhan, China, and later spread to 210 countries worldwide (Barlow 2004). COVID-19 targets a person’s immunity, infects the respiratory tract, and in severe cases may even lead to death (Keni et al. 2020). Among the many branches of Computer Science, Artificial Intelligence (AI) and Machine Learning (ML) are among the top trending technologies and have flourished remarkably in the last few years. The primary driving force for AI is to bring solutions to problems that are not easily understood, are extremely complex, and require extraordinary levels of intelligence. This technology has also excelled in image processing (Dişken et al. 2017), speech recognition (Mei et al. 2021), and text-to-speech conversion (Columbus 2018), which help increase productivity in health care and business. Building on these capabilities, various AI and ML techniques have supported the healthcare sector in the prediction, diagnosis, and monitoring of COVID-19 and in drug development for it. Two popular screening applications of ML are the deep Convolutional Neural Network ResNet-101 of Ardakani et al. (2020) and the Convolutional Neural Network DarkCovidNet architecture of Ozturk et al. (2020), which reported accuracies of around 98% and 99%, respectively.

There has been tremendous growth in AI research over the last two decades, as depicted in Fig. 1.

Fig. 1
A line graph of papers versus year presents a rising trend. The number of research papers increased from 2000 to 16000 between 2000 and 2016.

Growth of annual research done in AI sector vs annual research done in AI sector in medical field (Mahajan et al. 2019)

In this paper, we review recent technological advancements of AI and ML in the medical field with regard to the COVID-19 pandemic and analyze their applications: predicting the number of COVID-19-affected patients, diagnosing the disease, monitoring its spread, and developing drugs and vaccines for its cure and control. The paper outlines the numerous fronts on which AI and ML have driven the development of the healthcare industry and expedited the advancement of medical science. We also analyze the effectiveness of AI-based tools and algorithms during the COVID-19 outbreak and in handling extreme and risky events.

2 Predicting Number of COVID-19 Patients and Its Growth

Artificial Intelligence, along with Machine Learning algorithms, is widely used all over the world to forecast disease, weather, the stock market, and much more, in order to support decision-making and planning (Jiang et al. 2017). ML algorithms together with Neural Network models have been successful in predicting in advance the number of patients affected by a particular disease (Hu et al. 2002). Rustam et al. (2020) reported that AI-based methods and tools for multi-step forecasting of the course of COVID-19 were highly accurate, with estimated average errors of 1.64%, 2.27%, 2.14%, 2.08%, and 0.73% for 6-step, 7-step, 8-step, 9-step, and 10-step forecasting, respectively. This section discusses the applications of AI- and ML-based forecasting mechanisms to predict the number of COVID-19-affected patients.

2.1 Supervised Machine Learning Models

Studies (Rustam et al. 2020; JHU 2020) have shown that four forecasting models, namely the linear regression (LR) model, the least absolute shrinkage and selection operator (LASSO) model, the support vector machine (SVM) model, and the exponential smoothing (ES) model, have contributed to forecasting during the COVID-19 crisis. These supervised ML regression models were trained on COVID-19 patient statistics obtained from Johns Hopkins University (JHU 2020). The data was processed and divided into a training set (85% of the records) and a testing set (15% of the records). The research, based on time series datasets and reporting an accuracy of more than 90%, indicated that ML model forecasting can be very effective and beneficial for decision-making aimed at restraining the infections and consequences of the COVID-19 pandemic (Pahuja and Nagabhushan 2021). Many related works had previously concluded that ML and AI techniques are important in predicting and diagnosing COVID-19. However, Muhammad et al. (2021) proposed a supervised Machine Learning approach that is described as the only work so far to use labeled epidemiological datasets. The dataset contained demographic as well as clinical data, comprising 263,007 instances with 41 features. The study incorporated different supervised learning techniques for classification: (a) Logistic Regression, (b) Naïve Bayes Algorithm, (c) Decision Tree Algorithm, and (d) Support Vector Machine (SVM) (Fig. 2).

Fig. 2
A methodology diagram to predict COVID-19. The steps involved are the COVID epidemiology dataset, pre-processing, correlation analysis, data splitting, and testing set. Training sets, models, trained models, and evaluation metrics are included between correlation analysis and testing set.

Methodology to develop supervised learning model to predict COVID-19
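The 85%/15% split and a linear regression forecaster of the kind described in this subsection can be sketched in a few lines. This is a minimal illustration only: the chronological ordering of the split, the function names, and the case counts are assumptions made for the example and are not taken from the cited studies or from the JHU data.

```python
def split_series(values, train_fraction=0.85):
    """Chronological split: earliest records train the model, latest test it."""
    cut = round(len(values) * train_fraction)
    return values[:cut], values[cut:]


def fit_line(ys):
    """Ordinary least squares for y = intercept + slope * t, with t = 0, 1, 2, ..."""
    n = len(ys)
    ts = list(range(n))
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    sxx = sum((t - t_mean) ** 2 for t in ts)
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope


# Hypothetical cumulative case counts (illustrative values, not real data).
cases = [10 + 5 * day for day in range(20)]

train, test = split_series(cases)
intercept, slope = fit_line(train)

# Forecast the held-out days and measure the mean absolute error.
predictions = [intercept + slope * t for t in range(len(train), len(cases))]
mae = sum(abs(p - y) for p, y in zip(predictions, test)) / len(test)
print(f"train={len(train)} test={len(test)} slope={slope:.2f} MAE={mae:.4f}")
```

Holding out the most recent records rather than a random sample keeps the test set in the future relative to the training set, which is the usual choice when the goal is forecasting.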

2.2 Cloud-Based Machine Learning Models

With the emergence of advanced computing technologies, the size of datasets has grown substantially. In parallel, Cloud Computing offers flexible and extensible storage resources, which have been widely adopted by many research communities.

Cloud Computing helps speed up the prediction process through high-speed computation in cloud data centers. The deployment of models on cloud data centers is shown in Fig. 3. The dataset was taken from ‘Our World in Data: COVID-19 Dataset’ (https://github.com/owid/covid-19data/tree/master/public/data/) and was continually updated from Situation Reports: WHO (https://www.who.int/emergencies/diseases/novel-coronavirus2019/situation-reports). The system employed the HealthFog framework (Tuli et al. 2020a), a smart healthcare system based on deep learning that can automatically diagnose heart diseases in fog computing and IoT integrated environments, and FogBus (Tuli et al. 2019), a lightweight fog computing framework based on blockchain. These two frameworks were used to deploy a collective learning approach to forecast various aspects of the pandemic, including the number of staff required to manage hospitals and affected patients. The study estimated the cost of regular tracking of patients at 1.2 USD per day (Tuli et al. 2020b) (Fig. 3).

Fig. 3
An illustration of cloud data centers. Cloud enterprise includes the A L analytics, prediction engine, and COVID databases. End users involve the patients, gateway devices, private hospitals, and government centers.

Deployment of models on cloud data centers

A Support Vector Machine was evaluated with k-fold cross-validation, yielding an accuracy of 98.15% with 15 folds (Ur et al. 2021). The study reported better efficiency and accuracy than other state-of-the-art ML methodologies for predicting COVID-19. Further, cloud-based AI solutions were developed by Alibaba (Kose et al. 2021) to aid China during the COVID-19 outbreak by predicting the number, extent, and areas of infected cases through surveillance mechanisms and chain-tracking methods, with the information computed and evaluated across multi-edge and cloud layers. The study affirms that these solutions were employed in real time with an accuracy of 98% across varied sectors in China.
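The k-fold cross-validation procedure mentioned above can be sketched as follows. The example uses 15 folds to mirror the text, but everything else is an assumption: the data is a made-up one-feature toy set, and a nearest-centroid classifier stands in for the SVM so that the sketch needs no external library. In practice a library implementation of both the classifier and the fold splitting would normally be used.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs that partition range(n_samples) into k folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def nearest_centroid_fit(xs, ys):
    """Stand-in classifier (NOT an SVM): store the mean feature value per class."""
    centroids = {}
    for label in set(ys):
        members = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = sum(members) / len(members)
    return centroids


def nearest_centroid_predict(centroids, x):
    """Predict the class whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def cross_validate(xs, ys, k):
    """Return the mean accuracy over k held-out folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        model = nearest_centroid_fit([xs[i] for i in train_idx],
                                     [ys[i] for i in train_idx])
        hits = sum(nearest_centroid_predict(model, xs[i]) == ys[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


# Hypothetical toy data, interleaved so every fold contains both classes.
xs = [0.1 * i if i % 2 == 0 else 5.0 + 0.1 * i for i in range(30)]
ys = [0 if i % 2 == 0 else 1 for i in range(30)]

mean_accuracy = cross_validate(xs, ys, k=15)
print(f"mean accuracy over 15 folds: {mean_accuracy:.3f}")
```

Every sample is used for testing exactly once, so the averaged score is less sensitive to a lucky or unlucky single split than a one-off hold-out evaluation.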

2.3 Time Series Forecasting Model

Time series analysis is a common approach to forecasting with ML. The problem is treated as a regression in which the forecast is the target feature (dependent variable) and time is the input feature (independent variable). Forecasting depends entirely on past trends and data. For a respiratory disease such as COVID-19, where there is little recovery time once a patient becomes severely infected, forecasting the rate of increase or decrease in the number of infected cases can greatly help in preparing resources and planning future measures. One such time series forecasting model is the autoregressive integrated moving average (ARIMA), which is used to predict future trends (https://www.kaggle.com/sudalairajkumar/covid19in-india). A unit root test, the augmented Dickey–Fuller (ADF) test, is used to check the stationarity of the time series. Domenico et al. (Pinter et al. 2020) demonstrated that the ARIMA model has also been successful in predicting the prevalence and incidence of COVID-19 with a confidence interval of 95%; the model parameters were selected using the partial autocorrelation function (PACF) correlogram and the autocorrelation function (ACF) graph.
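To make the structure of such a model concrete, the sketch below implements the simplest non-trivial case, ARIMA(1,1,0): the series is differenced once (the "integrated" part), an autoregressive term of order one is fitted to the differences by least squares, and forecasts are obtained by summing the predicted differences back up. This is a simplification under stated assumptions, not the procedure of the cited work: there is no moving-average term, no maximum-likelihood estimation, no stationarity test, and the series is synthetic. A full analysis would normally rely on a statistics library.

```python
def difference(series):
    """First-order differencing, the 'I' (d = 1) step of ARIMA."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(diffs):
    """Least-squares estimate of c and phi in d[t] = c + phi * d[t-1]."""
    x = diffs[:-1]
    y = diffs[1:]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = y_mean - phi * x_mean
    return c, phi


def forecast(series, steps):
    """Forecast future levels by predicting differences and integrating them."""
    diffs = difference(series)
    c, phi = fit_ar1(diffs)
    level, last_diff = series[-1], diffs[-1]
    out = []
    for _ in range(steps):
        last_diff = c + phi * last_diff   # AR(1) step on the differenced series
        level += last_diff                # undo the differencing
        out.append(level)
    return out


# Synthetic cumulative counts whose daily increments follow d[t] = 2 + 0.5 * d[t-1].
series = [100.0]
step = 10.0
for _ in range(12):
    series.append(series[-1] + step)
    step = 2.0 + 0.5 * step

c, phi = fit_ar1(difference(series))
print(f"c={c:.3f} phi={phi:.3f} next 3 values={forecast(series, 3)}")
```

Because the synthetic increments were generated by an exact AR(1) rule, the fit recovers the generating coefficients; on real case counts the residuals would need to be inspected, for example with ACF and PACF plots, before trusting the chosen orders.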

2.4 Hybrid Machine Learning Models

The hybrid Machine Learning methods of adaptive network-based fuzzy inference system (ANFIS) and multi-layered perceptron-imperialist competitive algorithm (MLP-ICA) have proved successful in predicting both the rate of infected patients and the mortality rate with a high level of accuracy, as suggested by a study conducted in Hungary (Prakash et al. 2020). In that study, the sample data was arranged into two scenarios, odd days and even days, which were used to train both ANFIS and MLP-ICA. The predictions were validated over a period of nine days with promising results, with determination coefficients ranging from 0.9963 to 0.9987. This accuracy is expected to hold unless a significant error occurs; further research is nonetheless required to improve the quality of the procedure and to evaluate the models better. Another study (Khalilpourazari and Hashemi Doulabi 2021) puts forward an optimized search methodology built around Reinforcement Learning (RL) and Evolutionary Computation (EC), proposing a hybrid algorithm to predict COVID-19 in Quebec, Canada (https://www.quebec.ca/en/health/healthissues/a-z/2019-coronavirus/). According to the results, the methodology predicted the peak of COVID-19 infection, and a sensitivity analysis allowed future scenarios to be plotted to help healthcare professionals and policymakers.

2.5 Regression and Classification Models

Various other AI- and ML-based models have caught the attention of researchers, namely K-nearest neighbor (KNN), Decision Tree Classifier (DTC), Logistic Regression, Linear Regression, SVM, etc. Random Forest Regressor and Random Forest Classifier are two such models that have outperformed the other AI- and ML-based prediction models on COVID-19 data with regard to accuracy and coefficient of determination (Vaishya et al. 2020), as shown in Fig. 4. Figure 5 compares the accuracies of different models built on a COVID-19 dataset.

Fig. 4
A bar graph of the coefficient of determination. Decision tree, 1. Gaussian Naive Bayes, 0.4. S V M, 0.4. R F classifier, 1. R F regressor, 0.8.

Coefficient of determination of different models for COVID-19 data

Fig. 5
A bar graph of the accuracy. Decision tree, 0.97. Gaussian Naive Bayes, 0.875. Multilinear regression, 0.925. Logistic regression, 0.925, X G B classifier 0.925. S V M, 0.97. K N N plus N C A, 0.925, R F classifier, 0.97.

Accuracy of different models for COVID-19 data
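The two quantities compared in Figs. 4 and 5 have simple definitions, sketched below. The coefficient of determination is computed as one minus the ratio of the residual sum of squares to the total sum of squares, and accuracy is the fraction of correctly predicted labels. The function names and toy values are illustrative assumptions and do not reproduce the figures.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def accuracy(labels, predicted):
    """Fraction of predictions that match the true labels."""
    return sum(a == b for a, b in zip(labels, predicted)) / len(labels)


# Hypothetical values for illustration only.
perfect_fit = r_squared([1, 2, 3, 4], [1, 2, 3, 4])
mean_only_fit = r_squared([1, 2, 3, 4], [2.5, 2.5, 2.5, 2.5])
toy_accuracy = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
print(perfect_fit, mean_only_fit, toy_accuracy)
```

A model that always predicts the mean scores zero on the coefficient of determination, which is why values well below one in a comparison such as Fig. 4 indicate that a model explains little of the variation in the data.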

3 Diagnostics and Detection of COVID-19

Early detection and accurate diagnosis of COVID-19 help the medical team make faster and steadier decisions and take appropriate action. AI- and ML-based diagnostic mechanisms have boosted the efficiency of the medical industry in providing faster, cost-effective, and precise treatment procedures to affected patients, leading to better treatment and increased recovery rates.

3.1 Generative Adversarial Networks

Since COVID-19 is a respiratory disease, infection in the lungs and nasal cavity can generally be determined from chest CT scans. Jin et al. (2020) proposed one such model, based on Generative Adversarial Networks (GAN) and deep transfer learning. The model was built on a dataset of 746 real images, divided into a training portion (90%) and a testing portion (10%). After the GAN was used as an image augmenter, the 90% portion was further divided into 80% for training and the remaining 20% for validation. ShuffleNet was reported as the most accurate model for diagnosing COVID-19, achieving 85.33% accuracy and performing best overall in terms of precision, recall, and F1 score, as listed in Table 1.

Table 1 Testing phase: comparison metrics for different models using GAN
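For reference, the comparison metrics named in Table 1 can be computed from the counts of true positives, false positives, and false negatives, as in the sketch below. The labels and predictions are made-up values chosen for illustration; they are not the results of any model discussed here.

```python
def precision_recall_f1(labels, predicted, positive=1):
    """Return (precision, recall, F1) for the class treated as positive."""
    pairs = list(zip(labels, predicted))
    tp = sum(1 for y, p in pairs if y == positive and p == positive)
    fp = sum(1 for y, p in pairs if y != positive and p == positive)
    fn = sum(1 for y, p in pairs if y == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical test labels (1 = COVID-19, 0 = non-COVID) and model outputs.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(labels, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Recall is the same quantity that diagnostic studies call sensitivity, so a high recall means few infected patients are missed, while a high precision means few healthy patients are flagged.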

A GAN and deep transfer learning model for the diagnosis and detection of COVID-19 from a limited set of chest X-rays was proposed by Loey et al. (2020). The research focused on gathering all the available chest X-ray images and producing new images with the GAN to help detect the coronavirus. GoogleNet was selected as the main deep transfer model, with a reported testing accuracy of 100% and a validation accuracy of 99.9%.

3.2 Artificial Intelligence-Inspired Models

Huang et al. (2020) proposed a model called AIMDP, an Artificial Intelligence-inspired model for COVID-19 diagnosis and prediction of patient response to treatment. It has two main functions: the diagnosis module, which predicts the number of COVID-19-affected patients, and the prediction module, which predicts how those patients respond to the COVID-19 virus. The authors used TensorFlow, one of the most widely used deep learning libraries, for training. The reported accuracy of this model reached 97.14%. COVID-19 can also be detected through respiratory tract infection or pneumonia. An AI system was built by Jin et al. to detect and analyze COVID-19. A total of 1136 training cases were used to train the system, of which 723 were positive cases from five hospitals. The resulting system achieved a sensitivity of 0.974. Chest CT images are valuable for detecting the COVID-19 virus efficiently, and researchers have been trying to decompose these images using deep learning in order to identify the features of the virus. A team from Tianjin Medical University Cancer Institute and Hospital (Huang et al. 2020) collected CT scans from 180 patients who had severe viral pneumonia well before the coronavirus epidemic and from 79 patients with confirmed COVID-19. The samples were assigned randomly to the training and testing sets used to train a Convolutional Neural Network (CNN)-based algorithm, which detected the COVID-19 virus from chest CT with an accuracy of 89.5%.

3.3 Convolutional Neural Network (CNN) Models

A Convolutional Neural Network is a class of Neural Network in deep learning that is generally used for image processing. It reduces images into forms that can be processed more easily while keeping intact the features that are important for prediction. Kumar et al. (2022) used a CNN architecture that employed U-Net. It was trained on an annotated dataset gathered from 842 confirmed coronavirus patients and evaluated for lung opacity segmentation. The patients underwent chest CT scans at Tongji Hospital, China. Another recent study introduced a custom DL architecture called SARS-Net, which helped detect irregularities in patients’ chest X-rays using a Computer-Aided Diagnosis system that integrates Graph Convolutional Networks and Convolutional Neural Networks (Wang et al. 2020). The dataset comprised 13,975 CXR images from 13,870 patients. The model achieved an accuracy of 97.60% and a sensitivity of 92.90% on the testing dataset. The distribution of CXR images and the number of patients in the COVID-19 dataset, both by infection type, are depicted in Fig. 6a, b, respectively.

Fig. 6
2 clustered bar graphs. A. Compares the number of images of training and test datasets for normal, pneumonia, and COVID-19. B. Compares the number of patients of training and test datasets for normal, pneumonia, and COVID-19.

a Image distribution (CXR images) of COVID-19 dataset according to infection type. b Number of patients of COVID-19 dataset according to infection type
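The basic building blocks of a CNN described in this subsection, a convolution that extracts local features followed by pooling that shrinks the feature map while keeping the strongest responses, can be illustrated on a tiny image. The sketch below is a hand-written example under stated assumptions: the 5×5 image and the vertical-edge kernel are invented for illustration, and the kernel is fixed rather than learned as it would be in U-Net or SARS-Net.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation, the operation CNN layers call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(feature_map):
    """Keep positive responses and zero out negative ones."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling: keep the strongest response per window."""
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(feature_map[0]) - size + 1, size)]
            for r in range(0, len(feature_map) - size + 1, size)]


# Hypothetical 5x5 image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1],
          [-1, 1]]

features = relu(conv2d(image, kernel))   # 4x4 map, non-zero only at the edge
pooled = max_pool(features)              # 2x2 summary of the 4x4 map
print(pooled)
```

The pooled map is a quarter of the size of the feature map yet still records where the edge was found, which is the sense in which a CNN simplifies the image while preserving the features useful for prediction.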

4 Monitoring the Spread of the Virus

AI and ML can efficiently build intelligent systems that can help to monitor the spread of COVID-19 virus among individuals. Such systems or tools can be effectively utilized to extract and track the visual features of the virus.

4.1 Contact Tracing and Remote Monitoring

Remote monitoring and contact tracing of individuals can help detect infection levels by visually identifying ‘hot spots’ (Jumper et al. 2021). Contactless monitoring can be carried out by tracking blood pressure, heart rate, and oxygen saturation levels, and by using X-rays and MRI scans. The study by Agbehadji et al. (2020) employed deep DR, a DL-based method that learns high-level features of drugs, and achieved a high performance of 90.8% AUC. Various countries incorporated centralized, decentralized, and hybrid AI and ML techniques in the form of applications, as listed in Table 2.

Table 2 Few contact tracing applications employed by countries

4.2 AI-Based Models for Case Detection and Monitoring

During the COVID-19 pandemic, China deployed AI-based temperature screening in public places, which helped detect symptoms in people and place suspected patients into isolation (Petropoulos 2020). More recently, thermal imaging cameras have been employed to quickly and accurately detect one of the major symptoms of the coronavirus: elevated body temperature. Intelligently monitoring the key symptoms can help limit the severity of the disease and rapidly inform the concerned authorities so that adequate action can be taken.

5 Development of Drugs and Vaccines

In addition to the three pre-existing categories of vaccines (whole-pathogen vaccines, subunit vaccines, and nucleic acid vaccines), a new category, the COVID-19 vaccine, has been added. Understanding the genetic sequence of a specific protein, and from it the protein’s unique 3D structure, is critical for vaccine development. Since this procedure can be time-consuming and complex, AI systems can be of great help in simplifying it. An intelligent system called AlphaFold was developed after years of research on large genomic datasets to predict highly accurate 3D models of proteins. A research team from a vaccine biotechnology company developed AI and cloud computing-based models for COVID-19 drug and vaccine development (Deoras 2020). They successfully modeled the COVID spike protein and designed and developed a synthetic COVID-19 vaccine named COVAX19. Researchers from MIT applied an optimized ML-based methodology that selects peptides (short chains of amino acids) expected to provide broad vaccine coverage. The ‘OptiVax’ design method offers ways of designing new peptide vaccines, evaluating existing vaccines, and augmenting their composition (Beck et al. 2020). The researchers evaluated 4690 samples of COVID-19 genomes, removing peptides with undesirable mutation rates and paying specific attention to the genetic makeup of different populations. A DL-based drug–target interaction model called molecule transformer-drug–target interaction (MT-DTI) was proposed by Beck et al. This model can predict drugs for the COVID-19 virus by using SMILES strings and amino-acid sequences to sort and classify target proteins with 3D crystal structures (Öztürk et al. 2018) and to identify commercially available drugs that could specifically target and destroy the viral components of the COVID-19 virus. The results led to the identification of six promising medicines: Remdesivir, Atazanavir, Efavirenz, Ritonavir, Dolutegravir, and Kaletra. A DL-based DTI model called DeepDTA, an end-to-end CNN-based model, was also proposed. This model can automatically detect useful features from raw protein sequences. To process the similarities of proteins and ligands, the model made use of the Smith–Waterman (S–W) and PubChem Similarity algorithms, respectively (Shim et al. 2021). The study concluded that this DL-based model, which used 1D representations of the drug and the target, was a non-cognitive approach. CNN blocks were employed to read and evaluate the representations from raw protein sequences, and their output was then fed to a fully connected block of DeepDTA.
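The Smith–Waterman algorithm mentioned above scores the best local alignment between two sequences with dynamic programming. A minimal sketch is given below. The scoring scheme (match, mismatch, and a linear gap penalty) and the short sequences are assumptions chosen for illustration; protein comparisons in practice typically use a substitution matrix and affine gap penalties, and this sketch is not the implementation used in the cited studies.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]   # first row and column stay at zero
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = h[i - 1][j] + gap          # gap in b
            left = h[i][j - 1] + gap        # gap in a
            h[i][j] = max(0, diag, up, left)  # zero restarts the local alignment
            best = max(best, h[i][j])
    return best


# Hypothetical amino-acid fragments (one-letter codes), for illustration only.
identical = smith_waterman_score("ACDEFG", "ACDEFG")
contained = smith_waterman_score("MKTAYIAK", "TAYI")
unrelated = smith_waterman_score("AAAA", "CCCC")
print(identical, contained, unrelated)
```

Clamping each cell at zero is what makes the alignment local: a shared motif such as `TAYI` scores fully even when the surrounding residues differ, which suits comparing proteins that share only a binding region.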

6 Conclusion

This paper analyzed various applications of Machine Learning and Artificial Intelligence: predicting the number of COVID-19-affected patients and its growth, diagnosing and detecting the virus, and monitoring the spread of the virus. It also concluded that these technologies are helping in the development of drugs and vaccines, which in turn can aid in stopping the spread of COVID-19. This survey offers a detailed overview of existing and recent state-of-the-art AI and ML methodologies with high reported accuracies, which have proven efficient in strengthening the capacity of the healthcare industry and the concerned authorities to detect, control, cure, and manage a pandemic as substantial as COVID-19.