
6.1 Impact of Artificial Intelligence (AI) on Medicine

According to the authors of [1], artificial intelligence systems have made a mark in the medical fields of pathology, radiology, ophthalmology, and cardiology, leading to collaborative efforts between physicians and AI. Though the concept of general AI is being explored deeply, today's reality is narrow AI specializing in specific tasks such as image recognition and speech recognition. Among the many factors driving AI's role in medicine and healthcare is the availability of large amounts of medical data, including data generated from medical wearables that have become highly popular. A prominent contribution of AI comes in the form of a decision-making layer on top of current systems, which considerably improves the accuracy of treatment and reduces the cost involved. It also raises concerns about the possibility of AI replacing physicians. Electronic medical records (EMR) make this a data-rich domain, well suited for AI to work on and derive practical information from. With considerable progress in technology around scanners like magnetic resonance imaging (MRI), the next leg of progress will be for AI to utilize the data generated by these imaging technologies [2]. Deep learning techniques like convolutional neural networks (CNNs) have been making useful contributions in imaging for object recognition and localization. AI's capability in diagnosing patients is closely approaching that of doctors. The work of Norman [3] emphasizes that AI's involvement is not about replacing the physician but about complementing and improving the way physicians work. A significant focus is also on cutting down the administrative work that the medical fraternity ends up doing, which can be automated with machine learning. This is critical, as errors that creep into documentation can be costly.

A more in-depth focus of AI in the medical field is to understand the patterns within the domain, assess behaviour, and derive appropriate logic for current scenarios. A key aspect to consider is that AI algorithms adopt the objectives given to them quite literally, rather than adapting to unseen scenarios. Also, the black-box nature of these algorithms makes it tough to dive deep and derive useful interpretations; this needs attention.

6.1.1 Ensemble Models for Pandemic Management

A possible architecture that can add value in pandemic situations like that of COVID-19 would involve ensembling various machine learning and deep learning models. Since the idea is to organize all the available data in a meaningful way, ensemble models are the right approach. Since work is happening around the utilization of patients' radiology images, the convolutional neural network (CNN) is one component. The work done by Tulin et al. [4] banks on radiology images of the patient's chest to identify COVID-19, utilizing the salient information hidden in these images. The advantages of this work are also highlighted for remote areas where the availability of doctors is a concern. The approach highlights the usage of chest X-rays as a means of COVID-19 detection. The proposed model covers binary classification between the presence and absence of COVID-19, and multi-class classification to differentiate between COVID-19, pneumonia, and neither. In the case of the YOLO (You Only Look Once) model for object detection, the DarkNet model was used as a classifier, with an architecture built from a series of convolution layers with varying filters.

The artificial neural network (ANN) is another critical component of the ensemble model. Due to the complexity of the diagnosis involved, it is imperative to study the influence of the various features. Information on how the features influence each other can be handy in refining these systems and improving their performance. ANNs with explainable features would be an appropriate fit to address this need, and the considerable body of work on explainable AI will be helpful here. The recurrent neural network (RNN), with its specialization in managing time-series data, will play a key role in studying the evolution of the virus's behaviour across the timeline, particularly its characteristics and the possibility of mutations. Understanding these characteristics will be significant for devising mechanisms to study viruses and the possibility of a cure. In work [5], the authors study the spread of COVID-19 in India and the effect of the remedial measures taken to control the pandemic. With the largest economies struggling due to the virus's impact and its spreading nature, the resources employed are turning out to be ineffective. Considering the growing pressure on the health system and administration, there is a need to build a prediction system that provides a heads-up. The authors employ long short-term memory (LSTM) data-driven estimation methods to fit the curve, predict future cases of COVID-19, and assess the effectiveness of preventive measures like social distancing and lockdown. This proposal seems to provide useful insights for health officials and administrators.
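To make the LSTM-based forecasting idea concrete, the following is a minimal sketch assuming a univariate series of daily case counts; the window length, layer width, and placeholder data are illustrative assumptions, not the configuration used in work [5].

```python
# Minimal sketch: LSTM forecaster for daily COVID-19 case counts.
# `daily_cases` is a placeholder 1-D array of confirmed cases per day;
# window size and layer widths are illustrative, not tuned values.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

def make_windows(series, window=14):
    """Turn a 1-D series into (samples, window, 1) inputs and next-day targets."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return np.array(X)[..., np.newaxis], np.array(y)

daily_cases = np.random.poisson(lam=100, size=200).astype("float32")  # placeholder data
scale = daily_cases.max()
X, y = make_windows(daily_cases / scale)

model = Sequential([
    LSTM(32, input_shape=(X.shape[1], 1)),
    Dense(1)                         # next-day (scaled) case count
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=20, batch_size=16, verbose=0)

next_day = model.predict(X[-1:]) * scale   # forecast for the following day
```

In practice, the same windowing scheme can be refitted as new daily counts arrive, which matches the ongoing-refinement role assigned to the time-series pipeline in this architecture.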

Potential information is also available from social media, which can be plugged into the ensemble model. Social data is mainly the information that people share about themselves on platforms such as Facebook and Twitter. Crunching this information can serve as an early indicator of the potential impact a pandemic situation can bring, and it can be leveraged as one of the sources in pandemic management. Natural language processing (NLP) is an obvious choice to model these data and plug them into the ensemble model. NLP can also consolidate the knowledge recorded in the literature and build the solution on top of it. Work done in [6] stresses searching the scientific literature for answers to COVID-19 questions. A novel neural ranking approach is proposed to ensure that new information is put together, analysed, and learned in the overall context of all the available information. Transfer learning, being a recent advancement in AI, can fit well on top of the ensemble model, bringing in knowledge from various other domains and customizing the ensemble's architecture to fit the learning needed for the specific situation of COVID-19. Work done in [7] emphasizes leveraging deep transfer learning given the rapidly increasing rate of COVID-19 and the challenges associated with testing kits; it becomes paramount to plug in all possible knowledge for detection of the virus. The development of COVID-19 testing methods is a significant area of exploration. CT (computed tomography) of the chest is seen as an excellent potential source for COVID-19 detection; however, it is not a straightforward method and involves some challenges. The authors of this work propose deep transfer learning to classify COVID-19 infections and employ a top-2 smooth loss function to handle the noisy and imbalanced dataset of this problem. The approach seems to provide better performance in comparison to other supervised learning models. Transfer learning can also be leveraged to study an individual's physiological and health state in the context of other pandemics, which can then be extended to the context of COVID-19 infection. This provides a platform to build on top of learning from other pandemics (Fig. 6.1).

Fig. 6.1 Architecture of ensemble model proposed for pandemic management

The convolutional neural network (CNN) in this architecture has the primary role of managing image data, particularly chest X-rays, which are found to be one of the top data sources for the virus infection. The critical outcome of this module is to identify the existence of the virus and to retrieve as many of the virus's critical characteristics as possible from this source of data (Fig. 6.2).

Fig. 6.2 Basic structure of convolutional neural network for image analysis

CNNs specialize in picking out the essential features of image data such as chest X-rays using convolutional filters. These filters need to be architected effectively based on the features expected to be learned from the X-ray. The max-pooling layer next in line condenses the image's critical characteristics into significant knowledge, and a further fully connected layer formulates this summarized knowledge into the final prediction of the virus's existence in the subject. Apart from detecting the existence of the virus, the CNN architecture can also be customized to learn critical characteristics of the virus that can be plugged into the ANN module that stands next in line in the framework. The structure of the convolution and pooling layers also needs to be studied with the target domain in mind, to customize their operation to generate the expected output.
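The convolution, max-pooling, and fully connected structure described above can be sketched as follows; this is a hedged illustration with arbitrary image size, filter counts, and a hypothetical three-class output (COVID-19 / pneumonia / normal), not the architecture of any cited work.

```python
# Sketch of the conv -> max-pool -> fully connected pipeline described above,
# set up as a three-class chest X-ray classifier. Image size, filter counts,
# and layer depths are illustrative choices only.
from tensorflow.keras import layers, models

def build_cxr_cnn(input_shape=(224, 224, 1), num_classes=3):
    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=input_shape),
        layers.MaxPooling2D((2, 2)),           # condense salient local features
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),  # fully connected summarisation layer
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_cxr_cnn()
model.summary()
```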

In this architecture, the ANN will specialize in identifying the critical parameters from the study of the virus, with feature importance as the expected outcome. Historical medical information is the critical input consumed by this model to generate feature-importance information. With a wide variety of historical information fed in as input to the ANN part of the framework, the layers of the ANN and their neuron configuration must be customized to study the critical features effectively. Backpropagation and gradient descent can be leveraged to tune the ANN architecture so that this part of the framework produces focused, useful output.
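One hedged way to realize the feature-importance role assigned to the ANN module is permutation importance on a trained multilayer perceptron; the feature names and data below are hypothetical placeholders for the historical clinical records mentioned above.

```python
# Sketch: feature-importance study with an MLP via permutation importance.
# `X`, `y`, and the feature names are hypothetical placeholders for historical
# clinical records.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

feature_names = ["age", "fever_days", "spo2", "comorbidity_count"]  # illustrative
X = np.random.rand(500, len(feature_names))
y = np.random.randint(0, 2, size=500)   # 1 = severe outcome (placeholder labels)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
ann = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
ann.fit(X_tr, y_tr)

# Shuffle one feature at a time and measure the drop in held-out score.
result = permutation_importance(ann, X_te, y_te, n_repeats=20, random_state=0)
for name, score in sorted(zip(feature_names, result.importances_mean),
                          key=lambda p: -p[1]):
    print(f"{name}: {score:.3f}")
```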

The recurrent neural network (RNN) in this architecture specializes in the study of time-series data. The time trend of the virus's behaviour and the patients' responses to medications are the critical input data. The RNN architecture specializes in capturing knowledge across time from the input data. Variations of the RNN like long short-term memory (LSTM) are of specific interest here, as they overcome the tendency of plain RNNs to lose vital information as time passes; LSTM bridges that gap by carrying crucial information forward in time and helping consolidate knowledge over long ranges. This pipeline in the architecture can also actively plug in data as it is gathered so that the outcomes keep being refined over the period. The RNN component is critical here, as it is vital to focus only on the characteristics that matter; this becomes crucial because handling a pandemic when it hits is a fight against time.

The natural language processing (NLP) pipeline of this architecture specializes in plugging in data from social platforms. The information people generate in real time offers critical insights into the possibility of the pandemic spreading and allows assessment of the social network's influence on the pandemic and other related information. The second task of the NLP pipeline is to effectively consolidate the historical literature and make it meaningful data for present situations. With a lot of literature already available and much more being added, specializations from the NLP domain will be crucial in this framework. The NLP pipeline will also work on natural language data from text and audio. As this part of the architecture is built, it will be crucial to study the landscape from which the data is plugged in and devise the NLP pipeline accordingly. The NLP pipeline also involves social network study, and graph theory can be banked on to make this objective more robust. In graph theory, an entire network is represented as nodes (vertices), with the relations between the nodes established by the edges. A classic illustration of this thought process is the travelling salesman problem, where the salesman is expected to visit all the cities exactly once, without revisiting any city, while covering the shortest total distance. With graph theory, the social network can be established as a graph and the input's knowledge plugged in.

Work in [8] studies a social network model of COVID-19. Here, a dynamic social network model of COVID-19 is built, where epidemiological models are used together with person-to-person interaction graphs. Graphs establish mathematical relations between objects and their entities. The consolidator role in this framework is taken by transfer learning (TL), which takes as its input all the key learnings from the CNN, ANN, RNN, and NLP pipelines and is also fed with historical pandemic data. The TL pipeline's primary focus is to effectively transfer knowledge from the source domain, the historical data, to the current situation of COVID-19.

6.1.2 Transfer Learning (TL) and Its Capabilities for Pandemic Management

The capability of transfer learning is leveraged from the way the human brain operates: we learn from the tasks that we carry out and develop intuition for performing similar tasks. Transfer learning addresses a limitation of conventional machine learning, where patterns are learned with the focus of solving the specific task at hand and must be revalidated whenever the feature space or data distribution changes. Article [9] provides comprehensive coverage of transfer learning and its real-world applications. We look to leverage the critical characteristics of the TL approach in the ensemble architecture discussed here. Transfer learning is also a good leap towards the thought process of artificial general intelligence (Fig. 6.3).

Fig. 6.3 Traditional learning and transfer learning

Traditional machine learning boils down to learned weights that capture patterns for one task. In the case of TL, however, features and knowledge from previous learning are leveraged in a new scenario or domain. This is of interest for encapsulating all the learnings gathered from other pandemic experiences into the study of COVID-19. TL also enables faster learning without depending on a large amount of training data, which is a great advantage for putting every new learning to use as we go forward in handling the virus. This also fits well in the proposed ensemble architecture, as the proposal is to integrate models from various specialities that work with a variety of data. In work [10], the authors focus on categorizing and reviewing TL's progress for clustering, regression, and classification problems. They also provide a comparison of multi-task learning and domain adaptation. TL's potential issues are highlighted as well, which provides useful insight into improvements that can enhance the performance of a TL architecture.

In work [10], transfer learning is defined as follows. A domain D is a two-element tuple consisting of a feature space χ and a marginal probability distribution P(X), where X = {x_1, …, x_n} is a sample of data points and each x_i ∈ χ:

$$ D=\left\{\chi, P(X)\right\} $$

Here x_i denotes a specific feature vector. For a given domain D, a task T is a two-element tuple of a label space γ and an objective function η. From the probabilistic view, the objective function can be written as P(Y|X):

$$ T=\left\{\gamma, \mathrm{P}\left(Y|X\right)\right\}=\left\{\gamma, \eta \right\} $$
$$ Y=\left\{{y}_1,\dots, {y}_n\right\},\quad {y}_i\in \gamma $$

The predictive function η is learned from feature-label pairs (x_i, y_i), with x_i ∈ χ and y_i ∈ γ; for each feature vector in the domain, η predicts the corresponding label, η(x_i) = y_i.

One of TL’s essential aspects is to decide upon which part of the knowledge from the previous domain must be taken forwards in the new domain. Similarly, there is a need to study the historical pandemics, more so with viruses that belong to the same family, and figure out what aspects of this historical information are relevant to be carried forwards to the context of COVID-19. This is important to ensure that the performance of the model is at its best. All available historical knowledge may not be of significance for the latest context, and just plugging in all available knowledge may lead to negative learning and will not help. Correlation needs to be established with the new virus and earlier ones and then pick a crucial part of authentic learning to be plugged into the latest studies (Fig. 6.4).

Fig. 6.4 Transfer learning formats

In the case of inductive learning, there is the possibility of extending the learning between classes of the same family; viruses of the corona family can be considered here. Through inductive learning, the source-to-target task of COVID-19 can be improved. For a historical class of virus that does not have the required labelled information, the knowledge of that class can be extended to the study of the present target class through unsupervised TL. Transductive TL can be leveraged to use knowledge from other, unrelated classes of virus where useful labelled data is available. With these three focus areas, layers of knowledge can be built between historical data and current scenarios (Fig. 6.5).

Fig. 6.5 Transfer learning methods in the context of COVID-19

For the selective transfer of critical aspects of the source domain (the historical knowledge) to the target domain (COVID-19), the inductive learning aspect of TL is adopted. An AdaBoost-style approach fits well here: both target and source domains can be learned so that they balance the missing aspects in each other, in a combination of supervised, semi-supervised, and unsupervised learning.

6.1.3 Deep Transfer Learning for Pandemic

In work [11], the authors propose the detection of COVID-19-associated pneumonia using a generative adversarial network (GAN) and fine-tuning of a deep TL model on chest X-ray data, since the virus leads to pneumonia that infects the human lungs; chest X-rays are therefore studied with a GAN combined with deep TL. In this study, the GAN contributes to solution robustness by generating images from the available dataset and helping avoid overfitting. Normal and pneumonia X-rays form the dataset. GoogLeNet, AlexNet, SqueezeNet, and ResNet-18 are employed as the deep TL architectures in this study. Careful attention is paid to balancing the complexity of the architecture and keeping its memory consumption optimal. The authors propose ResNet-18 as the optimal deep TL model based on testing accuracy and performance on precision, recall, and F1 score, with the GAN playing the role of image augmenter (Fig. 6.6).

Fig. 6.6 Deep transfer learning

Pre-trained models can act as feature extractors for the source domain, and customized networks can be built on top of them to adapt to the needs of the target domain, in this case COVID-19 (Fig. 6.7).

Fig. 6.7 Pre-trained deep transfer learning acting as a feature extractor

The critical idea is to use the weights learned on the source domain to extract features from the target domain without modifying the source domain's weights. It is essential to do white-box modelling, where the focus is to see the influence of the various characteristics of other, historical viruses from the source domain. It becomes essential to pay attention to every step in the architecture and see how the prediction evolves. Deep learning architectures offer high flexibility in terms of the layers and hyper-parameters that can be tuned to customize them, which makes fine-tuning possible. Characteristically, deep architectures learn simple, generic patterns in their early layers, while the patterns identified in deeper layers become more domain- or task-specific. This helps to visualize the importance of various aspects of the source domain and to customize the model for the target domain. These capabilities will also strengthen the effort towards vaccine research, where the source domain knowledge can be dug into for further study. Another prominent aspect of deep TL is freezing or fine-tuning parts of the network based on need. If labels are plentiful in the target domain, we may more freely fine-tune the layers, letting their weights get updated in backpropagation. However, if the target domain labels are limited, we freeze the layers so that their weights do not get updated in backpropagation. Computer vision and NLP, being the top focus areas for deep learning, offer multiple pre-trained models that are open-sourced by the community and can be leveraged to study a pandemic. NLP-based pre-trained models can play a handy role in consolidating the knowledge hidden in the literature and the new information accumulated every day on the pandemic.
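A minimal sketch of the freeze-versus-fine-tune idea follows, assuming a Keras ImageNet backbone as the source-domain feature extractor and a small hypothetical three-class head for the COVID-19 target domain; it is an illustration of the mechanism, not the setup of any cited work.

```python
# Sketch: a pre-trained network as a frozen feature extractor, with a small
# task-specific head for the target (COVID-19) domain. Backbone choice and
# head sizes are illustrative; the freeze/fine-tune split follows the text.
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

backbone = ResNet50(weights="imagenet", include_top=False,
                    input_shape=(224, 224, 3), pooling="avg")
backbone.trainable = False          # freeze: source-domain weights are not updated

model = models.Sequential([
    backbone,
    layers.Dense(64, activation="relu"),
    layers.Dense(3, activation="softmax"),   # e.g. COVID-19 / pneumonia / normal
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# If enough labelled target data becomes available, the top of the backbone can
# be unfrozen and fine-tuned with a small learning rate instead.
```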

To explore NLP’s advancement, work done in [12] introduces a new language model representation called BERT (Bi-directional Encoder Representation for Transformers). Compared to conventional learning from text, this model learns the unlabelled text from both directions of text. This ability provides the advantage of utilizing and adopting this model in the target domain with the least amount of change in layers. It uses them in various tasks like question answering, without major re-architecting needed specific to tasks. The primary focus in deep TL is an adaptation of the domains between the source and target domains. Work on pushing the architecture to learn what is essential is a critical differentiating fact that can add much value which is expedited learning on target. In work [13], the author’s domain adaptation methods focus on the machines’ inability to handle change in source and target data distribution compared to that of humans. These domain shifts cause major drawbacks to conventional machine learning approaches. In cases of the target being labelled, supervised approaches do well, but unsupervised adaptation would be needed in case labels on the target domain are not available enough. CORAL (CORelation ALignment) is a simple, practical approach proposed by authors. The shift between domains is minimized with the alignment of second-order statistics between source and target distributions, with no dependency on labels.

Work in [14] proposes new representation learning for domain adaptation, where training and testing data come from similar but different distributions. The approach banks on features that do not discriminate between the source and target domains. Here, labelled data from the source domain and unlabelled data from the target domain are considered. As training evolves, features that support the source domain's primary learning task come to the forefront, while features that represent the shift between domains are suppressed. The adaptation behaviour is achieved with a feedforward model combined with a gradient reversal layer, and the setup is trained with standard backpropagation and stochastic gradient methods. Image classification and document sentiment analysis tasks are chosen for demonstration (Fig. 6.8).

Fig. 6.8 Architecture proposed in work [14]. A deep feature extractor and a deep label predictor compose the feed-forward part of the network, while the domain classifier provides unsupervised adaptation to the domain. Backpropagation-based training connects the feature extractor to the domain classifier through a gradient reversal layer, which multiplies the gradient by a negative constant during the backward pass. The rest of the training process is standard, with domain classification loss optimized over all samples and prediction loss optimized over source examples. Gradient reversal through the domain classifier keeps the feature distributions of the two domains similar, ensuring domain-invariant features. Multi-task learning treats learning between source and target without segregation or sequence: all the tasks involved across the source and target are exposed to the learner

Deep learning has always faced data availability challenges; in that light, one-shot learning is a great approach. In work [15], the authors propose one-shot learning of object categories. With visual domains typically needing hundreds of data points in the form of images, this approach demonstrates the possibility of learning from just a few images. The idea is not to start from scratch but to leverage the knowledge from past occurrences. Object categories are represented as probabilistic models in a Bayesian implementation: using prior knowledge of the subject, a probability density function is placed on the model parameters, and this prior is updated with a handful of observations to obtain the posterior model for the object category. Models learned with maximum a posteriori (MAP) and maximum likelihood (ML) methods are compared with the Bayesian models learned from the proposed approach. With few data samples, the Bayesian approach recognizes hundreds of categories exceedingly well, while the other approaches struggle. Zero-shot learning takes this to the next level: on top of the standard input 'X' and output 'y', it has a variable that describes the task, letting the model adjust itself when it sees unseen data. Machine translation is an appropriate scenario where zero-shot learning can fare well. This is of interest for expediting learning in a critical pandemic, where every bit of knowledge gathered daily has to be absorbed to build knowledge on an ongoing basis.

6.1.3.1 Challenges to Tackle in Transfer Learning (TL) from a Pandemic Management Perspective

Managing when to transfer knowledge is a key focus area that can decide the success of the approach. Negative transfer is another critical concern: if the transferred knowledge is not appropriate, it degrades performance instead of improving it. This can occur because relevant connections between the source and target are not established, or relevant tasks between the source and target are not connected. Bayesian approaches are being explored to tackle some of these issues. Another critical aspect of TL is making sure the amount of knowledge transferred can be quantified, which gives an excellent opportunity to explore the relationship between source and target domains in depth.

6.1.4 Graph Theory for Social Network Analysis for COVID-19

Graph theory finds applications in airline networks, physician networks, and supply chain networks. In a physician network, doctors would be the vertices, with specialities, demographics, and patient volumes being some of the vertex attributes. Patients would be the edges connecting the physicians, carrying diagnosis history, visit frequencies, and other patient-related information. The graph plays a significant role in providing a summarized view of the relations and interactions between the various entities in the domain. Displaying the content graphically provides an intuitive way to establish relations and is significant in social network analysis. One key direction for associating graphs and neural networks is to explore leveraging graph theory for managing neural networks themselves; the flow of information can be better visualized and managed to ensure effective control over the network and to direct it towards the intended objective. As one specific instance of applying graph theory alongside other machine learning algorithms, if this pandemic management effort involves grouping data, then algorithms like K-means and graph theory can go hand in hand.

Where scenarios involve optimization of paths, graphs are well suited. In general, there is also an advantage in memory consumption, as representing the data space with graphs is more efficient. In work [16], the authors recognize the importance of graph theory in science and technology and highlight the possibility of depicting almost any situation with a graph. For the COVID-19 pandemic and its precautions, graph theory seems to play a significant role: the virus's growth is mapped in different types of graphs, and the number of infected people over a period is accounted for. Some graph concepts that can be handy for the pandemic study are as follows. The average of the shortest path lengths over all possible node pairs gives the average path length, which helps to understand the pace of movement in the network and is of interest in the case of virus spread. Graph traversal, where the search moves from node to node, can proceed by breadth or by depth: breadth-first search explores nodes close to the starting node first, whereas depth-first search pushes as far as possible from the root node before backtracking. These concepts help track infected individuals and extend to the secondary and tertiary contacts of a primary infected individual [17].

The centrality concepts of a graph assist in identifying the most critical node in the graph. There is a classification aspect in identifying centrality: it may be decided based on the number of edges connected to a node, or based on the shortest distances to the other nodes. These centrality measures play a prominent role in pandemic management for assessing the network's essential entities while tracing pandemic contacts. Graph density is measured with respect to the number of edges present relative to the number possible; density measures can assist in planning medical facilities, including reservations of beds, ICUs, ambulances, and other resources. Graph randomization is another useful concept: metrics are generated by building hypothetical reference graphs based on the graph in hand, with similarity between the target graph and the generated reference graph based on the number of nodes, density, or other metrics. The reference graphs can be created using historical knowledge of pandemics, which provides the opportunity to use historical learnings in current scenarios and gives a good benchmark for the latest study of the pandemic as well. Graph analysis tools available for graph computation and visualization make the inferencing process more effective.
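A small NetworkX sketch of the graph measures just described, applied to an invented contact list; in practice the edges would come from reported person-to-person interactions.

```python
# Sketch: contact-tracing style graph analysis with NetworkX on invented data.
import networkx as nx

contacts = [("P1", "P2"), ("P1", "P3"), ("P2", "P4"),
            ("P3", "P5"), ("P4", "P6"), ("P5", "P6")]
G = nx.Graph(contacts)

# Average shortest path length: proxy for how fast infection can traverse the network.
print("Average path length:", nx.average_shortest_path_length(G))

# Breadth-first search from an index case finds secondary and tertiary contacts.
print("Contacts reachable from P1 (BFS order):", list(nx.bfs_tree(G, "P1")))

# Degree centrality flags the individuals most likely to amplify the spread.
print("Degree centrality:", nx.degree_centrality(G))

# Graph density as a crude input for capacity planning (beds, ICUs, ambulances).
print("Density:", nx.density(G))
```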

6.1.5 Importance of Transparent AI for Effective Pandemic Management

The importance of transparency in machine learning approaches relates to the traceability that can be established from the study of pandemics, so that differentiating facts can be derived when a new virus like COVID-19 hits hard. As algorithms bring in sophisticated architectures for solving problems, understanding the prediction process gets harder. To ensure knowledge can accumulate in a pandemic situation, this aspect of AI becomes key, and it is extremely critical because the domain dealt with here is healthcare. In exploring the interpretability of AI systems, one technical aspect that will play a vital role in a system's success is assessing the influence of all aspects of the data on the system's decisions. For example, in an image recognition problem, it is significant to assess whether trivial aspects of an image, like its background, influence decision-making. This matters because, if the training data comes from the same set of backgrounds, the system may assume the image will always have the same setting, which may not hold.

This points to the essential consideration of accounting for the distribution of the data in the ecosystem it comes from. Training data needs to be carefully curated to represent all possible distributions in the ecosystem; that enables more rigorous training and results in a robust AI system. Given the challenge in pandemic management of handling knowledge coming from unrelated virus domains, this becomes a significant part. Feature importance identification is another substantial part of a robust AI system in the pandemic management domain discussed here. Some of the techniques that help build interpretability into AI systems are as follows (Fig. 6.9).

Fig. 6.9 Global surrogate model

In the global surrogate approach, simple intermediate models are built to interpret a complex model's predictions. If a random forest is involved in a complex multi-class decision-making process, a simple decision tree can be used as an intermediate model to interpret the random forest's predictions. This provides a rough map that gives a sense of what is happening inside the complex model's prediction process.
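A brief sketch of the global surrogate idea on synthetic data: a shallow decision tree is fitted to mimic the random forest's predictions, and its fidelity to the complex model is reported.

```python
# Sketch of a global surrogate: a shallow decision tree trained to reproduce a
# random-forest classifier's predictions. Data here is synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=1000, n_features=8, n_classes=3,
                           n_informative=5, random_state=0)

complex_model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, complex_model.predict(X))       # learn the forest's behaviour, not y

# Fidelity: how closely the simple tree reproduces the complex model.
print("Surrogate fidelity:", surrogate.score(X, complex_model.predict(X)))
print(export_text(surrogate))
```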

LIME (local interpretable model-agnostic explanations) is a framework that helps interpret predictions at the granular level of individual input data points. In this approach, the data point of interest in the input domain is marked, and then synthetic data is generated around the target data point using standard data distributions. Higher weights are assigned to the generated points that are closer to the target data point. A prediction model is then fitted to this new, weighted data and used as a surrogate model for interpretation (Fig. 6.10).
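The LIME procedure described above can be sketched from scratch as follows; the clinical features, labels, and black-box model are synthetic placeholders, and a real deployment would more likely rely on the LIME library than on this simplified local surrogate.

```python
# From-scratch sketch of the LIME idea: perturb around one patient record,
# weight samples by proximity, and fit a weighted linear surrogate whose
# coefficients act as the local explanation. All data is synthetic.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                    # 4 clinical features (placeholder)
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)    # synthetic COVID-19-positive label
black_box = GradientBoostingClassifier().fit(X, y)

x0 = X[0]                                        # instance to explain
samples = x0 + rng.normal(scale=0.5, size=(1000, 4))          # perturb around x0
preds = black_box.predict_proba(samples)[:, 1]
weights = np.exp(-np.linalg.norm(samples - x0, axis=1) ** 2)  # closer => heavier

surrogate = Ridge(alpha=1.0).fit(samples, preds, sample_weight=weights)
for i, coef in enumerate(surrogate.coef_):
    print(f"feature_{i}: local influence {coef:+.3f}")
```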

Fig. 6.10 LIME interpretation demonstration in the case of virus identification. LIME explains an individual prediction: when the model predicts a COVID-19-positive outcome, the LIME model assists in shortlisting the key factors driving that prediction, providing interpretable outcomes for the physician to make informed decisions

LIME models act as intermediaries between complex AI systems and humans, supporting better interpretation. A complex AI system would identify the virus's characteristics by studying the symptoms in the affected individual and creating a library of symptom characteristics by learning from the source domain, while the LIME interpretation model that fits in between highlights the characteristics in that library that are significant to the study. This scenario is of prime importance for planning early identification of symptoms, so that the physician gets enough time to treat the affected person. Asymptomatically affected cases have been a massive challenge in handling this pandemic, making this an improvement scenario worth exploring.

6.1.6 Building an Active Data Pipeline for Pandemic Management

As much as the algorithms themselves, it is crucial to build an active data management pipeline for effective modelling of the virus. Since multiple nations are involved, collaborative efforts between nations are needed. Another challenging task is to make sure the doctors and the medical community are engaged in this effort: since they are in the middle of the action, saving lives, paying attention to the data pipeline will not be easy, and doctors cannot end up spending hours getting the documentation right. There is a need to think about effective systems that can facilitate this. This process is crucial for understanding in real time what the essential features in the data are and making sure they are tracked effectively. As part of building the data pipeline, data privacy must be accounted for as well: however sophisticated the data protection systems, there are counter-approaches that can derive patient-related information from the data by leveraging people's publicly available social data. This makes privacy a key challenge in building an active data pipeline for the study. Since data collection at the hospital end is a traditional system, leveraging those data for research purposes is challenging. Though there is an effort to put them together and create a database, ensuring the information's accuracy will be challenging; it may need more commitment and coordination between the data research team and the medical team to improve data accuracy. A further challenge is that information from across the globe is put together in an inconsistent manner.

Challenges in these data pipelines also extend to documenting complete, traceable records of patient history and final health state. Easing the data management part by bringing in an expert data management corporation is not easy, due to the many challenges involved concerning data privacy. All possible approaches to ease this data pipeline should be explored: smart natural language processing (NLP)-based systems where doctors can record and upload to the database, and possibly real-time data validation with the doctors to help improve data quality. Exploring the possibility of engaging student doctors to manage the documentation part would also be a reasonable approach. The importance of these data also lies in tracking the essential medicines tried on patients and their evolution.

6.1.7 Deep Learning Models for Integrating Social Media and Epidemic Data

Infectious disease management across the globe is a critical aspect that needs focused attention. In these situations, understanding the virus and studying its evolution becomes a critical need. Computational frameworks under epidemiology provide the required modelling structure, but they lack real-time monitoring abilities. On the other hand, a fair amount of information is available on social media, but it does not provide a complete picture of how the virus's network spans out. In work [18], the authors propose a deep learning, semi-supervised approach that brings social media mining techniques and epidemiological information together. The approach focuses on studying health-related social media information and how people interact with each other, modelling this information with reference to disease models and the contact network. The approach also attempts to feed social media knowledge into a computational epidemic model to improve the efficiency of the model built for controlling the disease. To achieve consistent integration of both kinds of data, an optimization algorithm is proposed that ensures an interactive learning process and consistency in the integration. The approach is reported to characterize the temporal and spatial diffusion of the disease better than other models. One of the highlighting factors of this epidemic is its spread resulting from increased local and global travel, and the rapid spread of the virus in a short period adds to the concern. The root response to this situation lies in understanding the characteristics of the virus and studying its evolutionary pattern. Recent trends have contributed to both social data mining and computational epidemiology. Individual-based epidemiology networks are proposed in computational epidemiology, where the influence of time and space on epidemic spread is accounted for. The study starts with the personal impact of the epidemic and then extends to how various actions have influenced the epidemic's control. Network-based models are simulated with high-performance simulation tools related to epidemics. These simulation results help forecast the evolution of the epidemic, including the spread of the disease, the peak time associated with the spread, and how effective the prevention measures are. Limitations of computational epidemiology include a lack of fine-grained spatial data suitable for surveillance and model tuning.
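For reference, the equation-based side of computational epidemiology mentioned above can be sketched with a basic SIR compartmental model; the population size and rate parameters below are illustrative assumptions only.

```python
# Sketch of a differential-equation epidemic model: the basic SIR system,
# integrated numerically. Parameter values are illustrative, not calibrated.
import numpy as np
from scipy.integrate import solve_ivp

N = 1_000_000            # population size
beta, gamma = 0.3, 0.1   # transmission and recovery rates (illustrative)

def sir(t, y):
    S, I, R = y
    return [-beta * S * I / N,             # susceptibles becoming infected
            beta * S * I / N - gamma * I,  # new infections minus recoveries
            gamma * I]                     # recoveries

sol = solve_ivp(sir, (0, 180), [N - 10, 10, 0], t_eval=np.arange(0, 181))
peak_day = sol.t[np.argmax(sol.y[1])]
print(f"Projected infection peak around day {peak_day:.0f}")
```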

There is a dependency on the data received from the Centers for Disease Control and Prevention (CDC) to estimate model parameters [19]. However, the CDC's data is only available at state-level granularity, which is not enough to model disease diffusion within a state. Real-time detection of the dynamic contact network is another challenge. Actions such as closing schools, the medications given at a certain infection level, and the spreading virus itself significantly change the network structure, and updating the network in real time with these ongoing changes is a critical challenge that needs attention. The cost of real-time training with real-time data is high; mostly, the existing approach relies on batch training with the CDC data, and since CDC data is updated only once a week, real-time updates are tough. Social media provides a better and timelier supply of data from socially situated sensors [20]. Social media data provides detailed health information and disease surveillance at the aggregate level, and users reporting their own symptoms help in aggregating the trend and spotting possible outbreaks. The focus is twofold: one part highlights the current spread, and the other forecasts possible outbreaks in the future. Another part of the focus is to model social media data to assess health behaviour and disease informatics.

Social media data, however, lacks a granular view of real-time contact information with which to establish a network that facilitates modelling the diffusion of the disease based on its pattern. Depending on user location alone to assess the social contact network is not enough. Visualizing the complete demography is not possible, as access to social users' health information is restricted. A further limitation is that the data used is restricted to the disease level and does not extend to deriving knowledge from a disease model. Given the respective limitations of computational epidemiology data and social media data, in work [18] the authors put forward an approach that makes use of both sources of data with a deep learning framework called SocIal Media Nested Epidemic SimulaTion (SimNest) (Fig. 6.11).

Fig. 6.11 SimNest framework from work [18]. The simulated space mirrors social media conversations among people; based on the conversations, infection, isolation, and vaccination are mapped. Demography-based contact information is used to establish network connections in the simulated world

The framework focuses on deriving intervention-related insights from social media using deep learning, while unsupervised learning facilitates understanding of the disease model from computational epidemiology. The epidemiological disease model is, in turn, fed with granular data from the social media feed, which optimizes the disease parameters; this establishes an iteration between both sources of data to build an integrated model. A semi-supervised multilayer perceptron in the framework focuses on mining epidemic features. An epidemic disease progression model is used to derive unsupervised patterns from the epidemic, which are later subjected to supervised classification, and the sparsity of labelled data is managed with the semi-supervised approach. Online learning is employed to minimize the gap between the social media world and the simulated world; the algorithm focuses on injecting real-time data into the model and reduces the cost of retraining.

Historically, computational epidemiology has been utilized to study clustered population data based on people's health state and demography, with differential equations used to model these data [21, 22]. In network epidemiology, individual-based computational models are used to study the stochastic propagation of an epidemic through a network of people, and random models of people's interactions are a common approach [23, 24]. Network epidemiology has also attempted to represent people networks and simulate individual-based data to assess the epidemic spread network. Individual attributes, such as social, geographical, and other characteristics, are built into this model to create a synthetic simulation of a population close to the real one. This model is regularly updated based on daily activities and locations to build a social contact network for the population [25]. This individual-based epidemic spread model helps in assessing who infected whom and when [26]. Apart from the disease model and the synthetic network, individual-based epidemic models also incorporate individual interventions and public health information, like vaccination and medication, and other aspects like social distancing; network node or edge properties are modified by these precautionary interventions. Epidemic knowledge mining approaches have considered social media data sources, for example focusing on aggregated disease surveillance data. Approaches in [27] have highlighted the reliability of social media feed data regarding how people feel about their health; generally, disease information is not explicit from people unless diagnosed by experts. Semantic analysis of health information is another source of reference: the intent of a social message is used to assess public health scenarios, interventions that are tried out, and other health behaviour. The topic model proposed by Biswas et al. [28] considered the symptoms of the ailment and possible treatments and then assessed the geographical patterns of such health issues. Classification of social tweets was proposed by Barrett et al. [29], where user behaviour is accounted for. In work [30], accurate geographical locations of the outbreak were explored. The health condition of Twitter users was assessed based on user interaction in the work of Krieck et al. [31]. Tracking individuals' disease progression was a different dimension of exploration in the work of [20].

6.1.8 Importance of Secured E-Health in a Pandemic Situation

E-health systems will play a prominent role in a pandemic situation like that of COVID-19. It will be essential to make sure that information is disseminated quickly and that all knowledge is plugged in for better management. It is essential to ensure that the entire E-health ecosystem is built with the patient as its focus. E-health systems need to consider data collection mechanisms, data transfer mechanisms, and decision-making systems, and alongside these, the security of E-health systems is a vital concern that needs attention. The primary driver for E-health systems is to optimize the cost of healthcare services. Alongside the associated benefits, E-health suffers from security concerns about the data handled, the primary themes being data privacy, authentication of the right user, and integrity. Biometrics has been the solution to account for security and works better than traditional approaches. Work in this area focuses on addressing privacy and security issues with biometric-technology-enabled E-health. Biometrics provides a reliable solution if other aspects, like patient privacy when using biometrics, processing time, and the complexity of the process, are taken care of. Singh et al. [32] focus on building an encryption technique that is lightweight yet strong enough to manage the large volumes of data transferred on the network.

E-health facilitates easy access to patients' information, improving efficiency and cost. With limited time between patients and physicians, an efficient information system is an excellent facilitator. Securing E-health applications and their components is the key focus, concentrating on user authentication and authorization. Traditional authorization methods are not user friendly from various perspectives, including managing credentials. Fingerprint, face, voice, and signature modalities are explored to strengthen biometric capabilities. Deep learning techniques can be employed to make biometrics more efficient and to ensure the system does not depend on a database being created for each user; advanced techniques like one-shot learning are explored to perform on-the-fly learning, and voice and face authentication are combined as a comprehensive solution. Biometric features are difficult to reproduce, which makes them a strong control. Health data encryption is an important area to consider as E-health systems are built. Biometric systems have enrolment and recognition as their key phases; as a system, this comprises pattern identification with sensors, feature extraction, database creation, and comparison. Based on the scenario in which the application fits, there is a need to decide on an identification and authentication approach.

Unimodal or multimodal features of users are used to identify the right user. Evaluation of an identification approach is based on parameters like whether the chosen trait is constant over time, the uniqueness of the feature, the possibility of quantifying the feature, people's acceptance of the system, the system's strength against hacking, and so on. A similarity score is generated to establish identity. False non-match rate (FNMR) and false match rate (FMR) are essential accuracy measures for a biometric system. These are sources of error: FNMR is the wrongful rejection of a correct user, while FMR is the wrongful acceptance of an identity, resulting in fake identities getting access to the system. Failure to capture, failure to acquire, and failure to enrol are the other possible system failures; failure to acquire is the system's inability to capture the input.
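Below is a short sketch of how FMR and FNMR can be estimated from genuine and impostor match scores; the crossover of the two error curves is the equal error rate discussed next, and the score distributions here are synthetic stand-ins for real matcher output.

```python
# Sketch: estimating FMR, FNMR, and the equal error rate (EER) from synthetic
# genuine and impostor similarity scores of a biometric matcher.
import numpy as np

rng = np.random.default_rng(1)
genuine = rng.normal(0.75, 0.1, 5000)    # scores for correct users
impostor = rng.normal(0.45, 0.1, 5000)   # scores for impostors

thresholds = np.linspace(0, 1, 1001)
fmr = np.array([(impostor >= t).mean() for t in thresholds])   # false matches
fnmr = np.array([(genuine < t).mean() for t in thresholds])    # false non-matches

eer_idx = np.argmin(np.abs(fmr - fnmr))
eer = (fmr[eer_idx] + fnmr[eer_idx]) / 2
print(f"EER = {eer:.3f} at threshold {thresholds[eer_idx]:.2f}")
```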

There will also be cases where there are difficulties in enrolling the user. The Receiver Operating Characteristic (ROC) curve helps to depict the trade-off between FMR and FNMR, and the detection error tradeoff (DET) curve is another measurement mechanism [33]; the ROC reflects the accuracy of the system in the test environment. The equal error rate (EER), the point on the DET curve where FMR and FNMR are equal, provides a measure of the biometric system's performance; the lower the EER, the more effective the system. E-health has expanded into telemedicine, which is about providing healthcare delivery using information technology and telecommunication. Electronic health record systems storing patients' information are E-health systems. Customers' medical needs are assessed based on consumer health information, and health knowledge management systems also have a significant role to play. Medical decision-making systems play a vital role in assisting the physician, and M-health brings mobile devices into various forms of healthcare support. Digitization of data is the common theme for all these systems, so E-health stands for digital health management systems in place of paper-based ones (Fig. 6.12).

Fig. 6.12 Biometric technology-based E-health system architecture

In this biometric-technology-based E-health system architecture, level 1 contains all the devices that assist in data transmission across the various parties involved in the transaction; sensors, mobile devices, and computers form part of this. Level 2 hosts the primary communication architecture that establishes the connection between the other levels. Telemedicine, healthcare information storage, and healthcare provision are involved in level 3. Levels 1 and 3 are designed to be authenticated with the biometrics of physicians or patients. Enhancing the knowledge of consumers and medical care individuals is one of the critical objectives; this also helps consumers develop the knowledge to manage primary health conditions. There is a broader focus on moving from a provider-driven system to a patient-centric system. This is a significant change in mindset that will increase people's confidence in the healthcare system and play a significant role, particularly during a pandemic like COVID-19 [14]. With tremendous pressure on both consumers and providers during this pandemic, such a robust system, strong in all dimensions, will make a significant difference. Studies of various E-health systems have provided knowledge of their system architecture. The most common theme is the three-layer architecture presented in [34, 35]: the first layer of devices to collect the information, the second layer of the network that transfers data from patients to the interested parties' databases, and the third layer of systems for intelligent decision-making. Network layer architectures are presented in the work by Samanta et al. and Brennan et al. [36, 37].

6.1.8.1 Security Challenges

Since the information is stored and moved in digital format, it attracts security vulnerabilities. Information passes through storage, transmission, and processing stages. The data's confidentiality is at risk as it is moved by the medical fraternity, who place their confidence in the information network systems, and any breach of confidentiality will impact the patient's privacy. That calls for effective access control in the data processing systems. Password-based smart cards are a traditional method that is vulnerable and not user friendly [38, 39]. A wireless sensor network (WSN) strives to protect the patient's data by securing communication across biosensors. Within this network, there is a need for more robust controls to ensure the confidentiality, integrity, and privacy of the data. Encryption has a prominent role in protecting transmitted data; however, key generation and key distribution requirements make this hard, as the biosensors are resource-constrained [40].

Electronic health records (EHR) make good use of biometric systems to prevent possible tampering of the data by any party involved. The biometric technology adopted is based on physiological characteristics, and multimodal biometric systems with one-time verification have been proposed [41]. Static methods, like the use of fingerprints and iris patterns, are standard. Biosignal-based biometric modules have been studied in the healthcare industry [42]; these bring in a dynamic nature as they vary across time [43] and provide readily accessible information for returning patients, since the signals have already been captured once. Biometric information of the patient and of the physician is used for authentication. This mechanism is more straightforward than traditional ones where credentials had to be memorized, which is handy in the case of elderly patients and patients who are not conscious. Biosignals are a useful mechanism as they provide a seamless experience without interfering with the user's regular activity [44,45,46]. Photoplethysmography (PPG), electrocardiogram (ECG), and electroencephalography (EEG) are the popular biosignal systems. Wireless secure communication with biometric encryption on the sensors is seen in wireless body sensor networks (BSNs). BSNs specialize in sensors worn on the human body and help monitor health conditions, enhance memory, access medical data, and communicate in case of emergencies [47].

Health information is protected with encrypted access to sensitive data. The generation of cryptographic keys is the crucial aspect of biometric cryptography: biometrics ensures authentication key generation and wireless communication of the data. With their randomness and time-variation characteristics, dynamic biometric traits make sense for key generation [48]. In one such work, the authors propose combined encryption and authentication biometric solutions for wireless communication in BSNs. Secure communication is ensured for the BSN by employing physiological features such as ECG or PPG to generate the cryptographic keys used for communication within the network. The Fast Fourier Transform (FFT) is employed to extract features such as the interpulse interval (IPI) from PPG and ECG signals, in the frequency or time domain. The quality of the keys generated is essential to ensure the cryptosystems are secure [49].

Distinctiveness and randomness decide the quality of the generated keys. Distinctiveness covers the distinction between people and is measured with FAR, hamming distance (HD), and FRR. Randomness makes the unique keys unpredictable, and the entropy of the generated keys decides the randomness [50].
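A toy sketch of these two key-quality measures, hamming distance for distinctiveness and Shannon entropy for randomness, computed on synthetic 128-bit keys:

```python
# Sketch: key-quality checks on synthetic binary keys. Hamming distance gauges
# distinctiveness between subjects; per-bit Shannon entropy gauges randomness.
import numpy as np

rng = np.random.default_rng(2)
key_a = rng.integers(0, 2, 128)   # e.g. key derived from one subject's IPI sequence
key_b = rng.integers(0, 2, 128)   # key derived from a different subject

hamming_distance = int(np.sum(key_a != key_b))   # bits that differ between subjects

p1 = key_a.mean()                                # fraction of 1-bits in key A
entropy = 0.0 if p1 in (0.0, 1.0) else -(p1 * np.log2(p1) + (1 - p1) * np.log2(1 - p1))

print(f"Hamming distance between subjects: {hamming_distance}/128 bits")
print(f"Per-bit Shannon entropy of key A: {entropy:.3f} bits (1.0 is ideal)")
```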

6.1.8.2 Future Vision

E-health security issues are tackled well by biometric systems. Besides providing identity management, encryption is provided to ensure the secure transfer of information across the exchange pipeline [51, 52]. Attacks such as data modification and eavesdropping can be managed by protecting health information with biometric encryption. Biometrics promises to tackle several issues faced by the healthcare industry amid technological evolution, and the efficiency and robustness of the biometric system are crucial features. Even so, there are potential areas that must be improved around authentication, as they attract security attacks. Biometric templates can be compromised, allowing attackers to manipulate templates and replay attacks; since biometric templates are linked to an individual, such manipulation cannot be reversed, which calls for cancellable biometric systems. The time taken for identification and verification needs closer focus for further improvement [53]; the multiple biometric processes, like noise removal, feature extraction, feature matching, enhancement, and classification, must be managed in a time-optimal way.

Data encryption plays a significant role in transforming institution-based healthcare into home healthcare, with telemedicine and mobile health leveraging a wireless communication approach. Biometric systems' speciality in encryption and decryption of information plays a significant role here, with key generation playing a vital role in cryptographic systems. Wireless communication networks have challenges in the generation and distribution of keys [54]; these need focused attention and research. Keeping older adults in mind while devising these biometric systems is a further scope for exploration. In this line of work, authors have focused on the importance of E-health in the healthcare industry and on exploring the challenges that need focused research, drawing attention to patient privacy and data security in E-health and aiming to address the drawbacks of the conventional system. A range of biometric systems, from unimodal to multimodal, is proposed as part of E-health applications [55]. Biometric technology is explored to secure wireless sensor networks with continuous and unobtrusive authentication methods. ECG, EEG [56], and PPG biosignals that gather unconventional biometrics provide a new direction for technological adoption in the E-health domain. Specific areas like biometric database protection and the processing time involved should be focused on.

6.2 Conclusion

Technology is evolving rapidly, and so are pandemic situations caused by deadly viruses. Based on the experience gained over the last few decades, it is essential to invest regularly in a well-planned effort to learn from these pandemics and tackle them effectively in a short period. This chapter presented the exploration of artificial intelligence (AI) in the field of medicine, where there is ever greater collaboration between the healthcare industry and the technology industry. An ensemble model approach is proposed to bring the critical concepts of deep learning together and utilize the best aspects of these models. With the revolutions taking place in transfer learning, it is a good fit for making use of all the knowledge gathered every time a pandemic hits the world. Social network analysis plays a significant role in pandemic management, and graph theory highlights its significance. AI solutions have been striving to become more explicit and transparent; this thought process can be handy when struggling with a complex domain like that of COVID-19. The ability to break into the patterns of data analysis helps to devise solutions faster and more effectively.

With a large amount of data spread across the globe, it is essential to build a robust data pipeline that can put all the useful information together and conduct the relevant exploration to build intelligence against the pandemic. The discussion was extended to combining past knowledge of pandemics with real-time data generated on social media, putting social media data to meaningful use. The role of E-health systems in these uncertain times is highlighted as a way of building confidence among people. The demand for healthcare, growing exponentially in these pandemic times, builds tremendous pressure to respond, and automating all possible elements of the healthcare system can play a prominent role in managing this demand. As the demand is addressed, healthcare systems cannot compromise people's data security and privacy, making this a tricky scenario in which all dimensions of people's needs must be balanced.