1 Introduction

Artificial intelligence (AI) is defined as ‘a field of science and engineering concerned with the computational understanding of what is commonly called intelligent behaviour and with the creation of artefacts that exhibit such behaviour’ (Shapiro 1992). Given large data sets as input, AI systems apply a series of algorithms to build computational models and produce a decision as output. Many technology vendors, including Microsoft, Google, Apple, IBM, and Amazon, have invested in AI to provide solutions and services built on their technology (Chouffani 2016). One domain in which AI has been extensively studied is health care, specifically the development of AI-driven healthcare systems.

One of the major goals of the modern healthcare system is to offer quality healthcare services to individuals in need (Kamruzzaman et al. 2006). Achieving that goal depends on successful early diagnosis of disease, so that the most appropriate treatment can be administered, leading to a better patient outcome (Ifeachor et al. 1998; Teodorescu et al. 1998). As such, there is a need for AI computational models and algorithms that can help diagnose a patient’s condition with reliable accuracy from a series of data.

2 Artificial Neural Networks

Artificial neural networks (ANNs) have drawn tremendous interest due to their demonstrated successful applications in many pattern recognition and modelling arenas (Rumelhart et al. 1988), particularly in processing data for biomedicine (Nazeran and Behbehani 2001). Neural networks have also been demonstrated to be very useful in many biomedical areas, such as assisting in diagnosing diseases, studying pathological conditions, and monitoring treatment outcomes (Kamruzzaman et al. 2006).

ANNs are computational paradigms modelled on the operation of a biological nervous system (Sordo 2002). They are also referred to as connectionist systems, parallel distributed systems, or adaptive systems, because they are composed of a series of interconnected processing elements that operate in parallel. First represented as a binary threshold function (McCulloch and Pitts 1943), the ANN was eventually developed into the perceptron as a practical model (Rosenblatt 1958). The perceptron is most popularly represented as a multilayer feedforward system, in which the network is made up of several layers of neurons. Such a model (Fig. 22.1) typically consists of an input layer, one or more middle or hidden layers, and an output layer, with adjacent layers fully connected through links that carry numerical weights (Ramesh et al. 2004). One important feature of the ANN is its ability to learn in a training environment through backpropagation, where the network repeatedly adjusts its weights to simulate learning (Werbos 1974).

Fig. 22.1 The artificial neural network (Ramesh et al. 2004)
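The layered structure and the repeated weight adjustment described above can be illustrated with a minimal sketch. It assumes a single hidden layer, sigmoid units, a squared-error loss, and batch gradient descent on the XOR problem; the function names (`forward`, `train`), the layer size, and the learning rate are illustrative choices, not details taken from the cited studies.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_out):
    """Forward pass: input layer -> hidden layer -> single output unit."""
    # Each weight row carries a trailing bias weight, hence x + [1.0].
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, output

def train(samples, n_hidden=3, lr=0.5, epochs=2000, seed=0):
    """Batch gradient descent with backpropagated error terms."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    n = len(samples)
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    losses = []
    for _ in range(epochs):
        grad_hidden = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        grad_out = [0.0] * (n_hidden + 1)
        loss = 0.0
        for x, target in samples:
            hidden, out = forward(x, w_hidden, w_out)
            loss += 0.5 * (out - target) ** 2
            # Error term at the output unit (loss derivative w.r.t. its net input).
            delta_out = (out - target) * out * (1.0 - out)
            for j, h in enumerate(hidden + [1.0]):
                grad_out[j] += delta_out * h
            # Propagate the error back to each hidden unit.
            for j, h in enumerate(hidden):
                delta_h = delta_out * w_out[j] * h * (1.0 - h)
                for i, v in enumerate(x + [1.0]):
                    grad_hidden[j][i] += delta_h * v
        losses.append(loss / n)
        # Weight adjustment: step against the averaged gradient.
        w_out = [w - lr * g / n for w, g in zip(w_out, grad_out)]
        w_hidden = [[w - lr * g / n for w, g in zip(row, g_row)]
                    for row, g_row in zip(w_hidden, grad_hidden)]
    return w_hidden, w_out, losses

# Toy training set (XOR), used only to exercise the code.
XOR = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
w_hidden, w_out, losses = train(XOR)
print(f"mean loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

Whether such a small network fully separates the XOR classes depends on the initial weights and the number of epochs; the sketch is meant only to show where the forward pass, the error terms, and the weight update sit relative to each other.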

2.1 Artificial Neural Networks for Diagnosis

Within the medical domain, ANNs have been applied to clinical diagnosis (Baxt 1995), image analysis in radiology, data interpretation in intensive care settings, and waveform analysis, particularly in oncology (Naguib and Sherbet 2001; Ramesh et al. 2004). One example of such a system is PAPNET, an automated neural network-based computer programme for assisted screening of Pap (cervical) smears (Boon et al. 1993). A Pap smear test examines cells taken from the uterine cervix for signs of malignancy; a properly taken and analysed Pap smear can detect slightly abnormal squamous cells at a very early stage. Detected early, cervical cancer has an almost 100% chance of cure through the removal of such possibly precancerous cells as an outpatient procedure (Sordo 2002). In addition, a neural network with input selection based on a Bayesian posterior probability (BPP) distribution function was designed to assist gynaecologists in the preoperative discrimination between benign and malignant ovarian tumours (Verrelst et al. 1998). The network was trained on data from 191 consecutive patients, and its results were validated by experienced gynaecologists. Another example is the neural network-based ProstAsure index, which can classify prostate tumours as benign or malignant with a diagnostic accuracy of 90%, a sensitivity of 81%, and a specificity of 92% (Stamey et al. 1996). Further studies on the application of ANNs to prostate cancer involved the development of techniques for the analysis of conventional and experimental prognostic markers (Naguib and Hamdy 1998; Naguib et al. 1998).
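For readers less familiar with how accuracy, sensitivity, and specificity relate, the following sketch computes all three from confusion-matrix counts. The counts are hypothetical: they were chosen only so that the resulting rates equal the percentages quoted above, and they are not data from Stamey et al. (1996). The function name `diagnostic_metrics` is likewise an illustrative choice.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # malignant cases correctly flagged
    specificity = tn / (tn + fp)            # benign cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts (NOT study data): 200 malignant and 900 benign cases.
sens, spec, acc = diagnostic_metrics(tp=162, fn=38, tn=828, fp=72)
prevalence = (162 + 38) / (162 + 38 + 828 + 72)

# Accuracy is the prevalence-weighted mix of sensitivity and specificity.
weighted = prevalence * sens + (1 - prevalence) * spec
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Because accuracy depends on the proportion of malignant cases in the sample, the same sensitivity and specificity can correspond to different accuracies in populations with a different case mix.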

There are several systems available for the diagnosis and selection of therapeutic strategies in breast cancer. A neural network judged the possible recurrence rate of tumours correctly in 960 of 1008 cases by using data from lymphatic node-positive patients, based on tumour size, number of palpable lymphatic nodules, tumour hormone receptor status, etc. It was reported that similar results were obtained by neural network evaluation of the parameters of the BI-RADS standardised code system (Baker et al. 1995). To predict metastases in breast cancer patients, an entropy maximisation network (EMN) was applied (Choong 1994). EMN was used to construct discrete models that predicted the occurrence of axillary lymph node metastases in breast cancer patients, based on characteristics of the primary tumour, using a series of clinical and physiological features. Similarly, the prediction accuracy of artificial neural networks and other statistical models for predicting breast cancer survival was compared (Burke et al. 1995). In that study, ANNs using the backpropagation algorithm, compared with the TNM staging system (tumour size, number of nodes with metastatic disease, and distant metastases), proved to be more accurate in predicting the 5-year survival of 25 cases.

In addition to breast cancer, a multilayer perceptron proved to be a reliable decision support tool for the prognosis and the extent of hepatectomy of patients with hepatocellular carcinoma (Hamamoto et al. 1995).

Using the backpropagation learning algorithm and a radial basis function (RBF) network, a diagnostic aid system for the serum electrophoresis procedure was developed. Serum electrophoresis is a standard laboratory medical test for diagnosing pathological conditions, such as liver cirrhosis or nephrotic syndrome (Sordo 2002). Results confirmed the feasibility of the network as an architecture for medical diagnosis with respect to the provided input (Costa et al. 1998). Similarly, the diagnosis of bowel disease was aided by an adaptive resonance theory mapping neural network, which used 23 features extracted from 280 inflammatory bowel disease instances to classify each instance as either Crohn’s disease or ulcerative colitis (Cross and Harrison 1998).

2.2 Assistive EEG Analysis

Neural networks have also been used in electroencephalography (EEG) analysis, particularly in diagnosing neurological diseases (de Haan et al. 2009). One such disease, epilepsy, is characterised by sudden recurrent and transient disturbances of mental function and/or movement of the body that result from excessive discharging of groups of brain cells. The detection of an abnormal EEG plays an important role in the diagnosis of epilepsy (Kamath et al. 2006): spikes or spike discharges are transient waveforms present in the human EEG and have a high correlation with seizure occurrence. Identification and scoring of spikes in the EEG signal are therefore necessary tasks in diagnosing epilepsy, and there is general agreement that automated spike detection is a well-recognised task for an ANN in a neurodiagnostic laboratory. A three-layer feedforward backpropagation (BP) network trained with raw EEG data yielded respectable results in detecting spikes during or before the onset of an epileptic seizure (Özdamar and Kalayci 1998). Raw EEG signals recorded from the scalp and intracranial electrodes of two epileptic patients were also used as input to discrete-time recurrent multilayer perceptrons; that study concluded that intracranial recordings provided better results in identifying spikes (Petrosian et al. 2000). A further study used an enhanced cosine radial basis function neural network for EEG classification, with an extensive parametric and sensitivity analysis to validate the robustness of the classifier in diagnosing epilepsy (Ghosh-Dastidar et al. 2008).

Another neurological disease that has been analysed using EEG scans is Alzheimer’s disease (AD). Disease onset usually occurs after the age of 60, and the risk of AD rises with age: approximately 5% of men and women between the ages of 65 and 74 have AD, and nearly half of those aged 85 and older may have the disease (Kamath et al. 2006). However, AD is not a normal part of ageing, and an analysis of EEG scans of patients with AD revealed abnormal frequency signatures (Moretti et al. 2004). Continuous EEGs (as well as their wavelet-filtered sub-bands) from parieto-occipital channels of ten early AD patients and ten healthy controls were used as input to recurrent neural networks (RNNs) for training and testing purposes (Petrosian et al. 2001). The study indicated that features derived from wavelet analysis, combined with the RNN approach, may be useful in analysing long-term continuous EEGs for early identification of AD. In addition, neural network analysis combined with graph theory for the analysis of topological changes in large-scale functional brain networks found that the brain network organisation in patients with AD deviated from the optimal network structure towards a more random type (de Haan et al. 2009).

Developing assistive technology for patients through EEG analysis can be achieved through a brain-computer interface (BCI), which operates on the principle that non-invasively recorded cortical EEG signals contain information about an impending task intended by the user. That task can be identified by an ANN (Kamath et al. 2006); it is imperative that the user’s intention is translated into a signal that can be interpreted as a control movement (Blanchard and Blankertz 2004), using either a regression or a classifier algorithm (McFarland and Wolpaw 2005; Penny et al. 2000). One application of a BCI is as a patient-centric medium for controlling a motorised device (such as a wheelchair or a home appliance) or performing computer-related tasks (e.g. moving a mouse or typing on a keyboard) despite a physical disability such as paralysis. An assistive BCI system comprises components such as EEG amplifiers, a feature extractor, a pattern recognition system for the analysis of signals, and a mechanical interface to achieve the motor component of the task. Software for feature extraction and recognition must be optimised for rapid response, and the mechanical device coupled to a BCI should be easy to operate. For this reason, the algorithms used to classify patterns in EEG signals in the BCI must be as efficient and effective as possible. A variety of classifier algorithms used in BCIs have been described by Lotte et al. (2007) and are summarised in Table 22.1. It is important to know which algorithm is most appropriate for the specifications of the system’s design, particularly the features selected and the intended users.

Table 22.1 A summary of the qualities of classification algorithms used by brain-computer interface (BCI) (Lotte et al. 2007)

2.3 Imaging Towards a Patient-Centric Medium

As the previous studies indicate, EEG analysis is useful in diagnosing diseases in neurology, but its underlying foundation is the ability to detect patterns in patients’ EEGs through imaging. Imaging is an important area for the application of ANNs (Egmont-Petersen et al. 2002), particularly in medicine, as pattern recognition is widely used to identify and extract important features from medical images (Shuttleworth et al. 2008; Jiang et al. 2010; Hilado et al. 2014). Medical images, such as radiographs, computed tomography (CT), and magnetic resonance imaging (MRI), are among the most used imaging inputs for healthcare-centric computational models based on neural networks.

Using imaging techniques such as filtering, segmentation, and edge detection, cellular neural networks were able to improve the resolution of brain tomographies and to improve the detection of microcalcifications in mammograms by refining the global frequency correction (Aizenberg et al. 2001). In another study, ANNs were trained to recognise regions of interest (ROIs) corresponding to specific organs within electrical impedance tomography (EIT) images of the thorax; the network allowed automatic selection of optimal pixels based on the number of images, with each pixel classified as belonging to a particular organ (Miller 1993). ANNs have also been applied to microscope images to classify blood cells. ROIs were extracted using global threshold methods, morphological filters, and connected-component labelling, and a neural network classifier then classified red blood cells as normal or abnormal. This achieved an overall average accuracy of 83% (Tomari et al. 2014). Using a multispectral imaging approach, ANNs were used for automated melanoma detection (Tomatis et al. 2003). That study involved analysing lesions with a telespectrophotometric system prior to surgery. After dimensionality reduction by factor analysis techniques, the ANN achieved 78% sensitivity and 76% specificity on the validation set.

A cardiovascular image analysis software package named Segment applies a variety of image processing algorithms to MRI, CT, single-photon emission computed tomography (SPECT), and positron emission tomography (PET) images for automated segmentation of the left ventricle, myocardial viability analysis, and quantification of MRI flow (Heiberg et al. 2010). The image processing algorithms used include ROI analysis, automated segmentation, flow quantification, linear analysis, and image fusion. The user interface of Segment is displayed in Fig. 22.2, while Fig. 22.3 shows an overview of the Segment system.

Fig. 22.2 User interface of Segment: a cardiovascular image analysis software package (Heiberg et al. 2010)

Fig. 22.3 Overview of the Segment software package and the transaction analysis between interfaces (Heiberg et al. 2010)

2.4 Towards Assistive Drug Administration and Patient Care

In developing patient-centric systems, it is important that we take into consideration studies that involved the development or analysis of drugs administered to patients. It is also important to evaluate and determine the parameters needed for utmost patient care by selecting the most effective treatment options given the patient’s circumstances (Sordo 2002).

Digoxin (DGX) is a drug widely used for treating various cardiovascular diseases, such as congestive cardiac failure and symptomatic alterations of the heart rate such as auricular fibrillation and paroxysmal supraventricular tachycardia, by improving the effective behaviour of the heart and by relaxing the heartbeat. However, the main drawback of using this drug is the possibility of intoxication in the patient. A study used ANNs to determine a patient’s risk of digoxin intoxication (Camps-Valls and Martin-Guerrero 2006). Patients were classified as being at high risk of intoxication (DGX levels >2 ng/mL) or at low risk of intoxication (DGX levels <2 ng/mL). Two hundred and fifty-seven patients were included and monitored in the study. Data collected included anthropometrical data of the patient (age, sex, height, and total body weight), renal function parameters (creatinine level and creatinine clearance), indicators of existing interaction with other drugs (treatment with amiodarone), daily dosage, and the administration rate (times per week). The study used a multilayer perceptron trained with the expanded range approximation (ERA) algorithm (Gorse et al. 1997), which improved on the results of the classical backpropagation learning algorithm, and compared it with other neural approaches, such as radial basis function neural networks (RBFNNs) and one-class one-network (OCON) neural networks (Camps-Valls et al. 2003).

In anaesthesia, designers have employed ANNs to measure the depth of anaesthesia produced by the drugs administered and to provide more quantitative information to surgical teams (Robert et al. 2002a, b). The features used to quantify the state of consciousness and effectiveness included traditional EEG measures (power in delta, theta, alpha, and beta bands), features derived from bispectra, mutual information (MI), and nonlinear dynamic (chaos) models of the EEG signal (Ortolani et al. 2002; Zhang and Roy 2001; Zhang et al. 2001). In addition, another study presented an approach based on MI to predict response during isoflurane anaesthesia (Huang et al. 2003). The MI of EEG recorded from four cortical electrodes was computed for 98 consenting patients prior to incision during isoflurane anaesthesia at different levels. The system correctly classified purposeful response in 91.84% of cases on average.
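As an illustration of the traditional EEG measures mentioned above, the following sketch estimates relative power in the delta, theta, alpha, and beta bands with a plain discrete Fourier transform. The band limits in `BANDS` are commonly used approximate values and are an assumption here, as are the function name `band_powers`, the sampling rate, and the synthetic test signal; none of these come from the cited studies.

```python
import cmath
import math

# Approximate, commonly used band limits in Hz (an assumption; conventions vary).
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0), "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}

def band_powers(signal, fs):
    """Sum the DFT power of `signal` (sampled at `fs` Hz) within each band."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [s - mean for s in signal]        # remove the DC offset
    powers = dict.fromkeys(BANDS, 0.0)
    for k in range(1, n // 2):                  # positive frequencies below Nyquist
        freq = k * fs / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centred))
        power = abs(coeff) ** 2 / n ** 2        # only relative sizes matter here
        for name, (low, high) in BANDS.items():
            if low <= freq < high:
                powers[name] += power
    return powers

# Synthetic two-second signal: a strong 10 Hz component plus a weaker 20 Hz one.
fs = 128
signal = [math.sin(2 * math.pi * 10 * t / fs) + 0.3 * math.sin(2 * math.pi * 20 * t / fs)
          for t in range(2 * fs)]
powers = band_powers(signal, fs)
dominant = max(powers, key=powers.get)
print(dominant, {name: round(p, 4) for name, p in powers.items()})
```

The direct DFT is quadratic in the signal length and is used here only to keep the sketch dependency-free; a fast Fourier transform would normally be preferred for real recordings. Band powers computed in this way could serve as input features for a classifier.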

ANNs were also used to monitor kidney patients undergoing cyclosporine A (CyA) treatment (Camps-Valls et al. 2003). CyA is still the drug of choice for immunosuppression in renal transplant recipients. The study had two objectives: (a) the prediction of CyA trough levels, that is, estimating CyA blood concentration from previous values by following a time series methodology, and (b) the prediction of CyA level class. Due to high inter- and intra-subject variability, non-uniform sampling, and nonstationarity of the time series, the prediction task is well known to be intricate. The study consisted of 57 patients, randomly assigned to two groups: 39 patients (665 patterns) were used for training the models and 18 patients (427 patterns) for their validation. The root-mean-square error (RMSE) was used as a measure of precision. Blood levels were predicted within an error margin of 20%, and one-way analysis of variance (ANOVA) on the mean absolute prediction error was used to compare the precision of the models. The best results were obtained by using profile-dependent support vector regression (PD-SVR) for prediction and support vector machines (SVM) for classification.
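The two error measures named above can be written down directly. The sketch below is a minimal illustration; the function names and the concentration values are hypothetical and are not taken from the cited study.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between paired observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mean_absolute_error(observed, predicted):
    """Mean of the absolute prediction errors."""
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)

# Hypothetical trough levels (arbitrary units), for illustration only.
observed = [150.0, 200.0, 250.0]
predicted = [160.0, 190.0, 250.0]
print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"MAE  = {mean_absolute_error(observed, predicted):.3f}")
```

RMSE penalises large individual errors more heavily than the mean absolute error does, which is one reason the two measures are often reported together.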

In addition to cardiovascular imaging as mentioned in the previous section, ANNs can also be used for the development of prosthetic heart valves (Morsi and Das 2006), which are commonly used to replace natural heart valves and are also widely used in ventricular assist devices (VAD) and total artificial hearts (TAH). Fluid flow phenomena, particularly in vitro velocity profiles, shear stresses, regurgitation, and energy losses, contribute to the clinical success of any valve design (Morsi et al. 2001). Thus, the optimisation of the valve leaflet or wall stress development patterns relies on various parameters (Lin et al. 2004). Moreover, if a prosthetic heart valve is to be used, valve-related problems, such as blood cell damage, thrombus formation, calcification, and infection, as well as valve durability, need consideration.

Several numerical techniques, such as arbitrary Lagrangian-Eulerian (ALE), fictitious domain/mortar element (FD/ME), and immersed boundary (IB), have been examined to solve the fluid-structure problem sequentially. The fluid forces are obtained from the conservation of mass and momentum equations, and the structural solution then follows for each time step. In all the methods mentioned, the deformation of the mesh poses a formidable computational task, particularly in the case of the complex geometric problems involved in cardiovascular applications (an example of the solution procedure can be seen in Fig. 22.4). A neural network model can be used as an approach to the optimisation issues mentioned: a dataset of fluid variables, structure variables, tube diameter, leaflet thickness, Reynolds number, and so forth can be used as input parameters to an ANN model, with an estimate of the heart valve leaflet deflection over time as the output variable (Morsi and Das 2006).

Fig. 22.4 Tri-leaflet heart configurations for closed and open valves (Morsi and Das 2006; Morsi 2014)

3 Hidden Markov Models in Patient-Centric Analysis

Aside from ANNs, hidden Markov models (HMMs) can also be used to analyse situations relevant to a patient-centric system. An HMM is a statistical model in which the system being modelled is assumed to be a Markov process with hidden states (Cooper and Lipsitch 2004). The outputs of the hidden states are observable and are represented as probabilistic functions of the state. A general approach to estimating parameters of continuous-time Markov chains from discretely sampled data was used in analysing hospital-acquired infections (Ross and Taimre 2007). Hospital-acquired infections caused by transmissible nosocomial pathogens have been widely studied due to their detrimental effects, which include possible loss of life as well as high demands on healthcare resources. Reports state that 1 in 10 patients admitted to a hospital will acquire a nosocomial infection, resulting in approximately 5000 deaths and costs amounting to one billion pounds per annum (Inweregbu et al. 2005), with reports of nosocomial infections being continually on the rise. Focus has thus turned to developing strategies for limiting the occurrence of such infections through improved hygiene practices among healthcare workers, selective antibiotic use, and isolation (human-human distancing) strategies. In their study using HMMs, Cooper and Lipsitch (2004) developed a methodology which was combined with a new stochastic model for the transmission of hospital-acquired infections, one which accounts for dynamic bed occupancy, providing a method for estimating the parameters of such systems. In the study, particular attention was given to modelling dynamic bed occupancy as a necessary parameter in predicting patients who may begin exhibiting symptoms of hospital-acquired infections.
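To make the notion of hidden states with observable outputs concrete, the following sketch implements the standard forward algorithm, which computes the probability of an observation sequence under a discrete HMM. The two-state model used here (a patient who is `clear` or `colonised`, observed only through `negative` or `positive` swab results) and all of its probabilities are hypothetical values chosen for illustration; this is not the model of Cooper and Lipsitch (2004).

```python
def forward_likelihood(observations, start, trans, emit):
    """Probability of `observations` under a discrete HMM (forward algorithm)."""
    states = list(start)
    # Initialise with the start distribution and the first emission.
    alpha = {s: start[s] * emit[s][observations[0]] for s in states}
    # Recursively fold in each later observation.
    for obs in observations[1:]:
        alpha = {s: emit[s][obs] * sum(alpha[r] * trans[r][s] for r in states)
                 for s in states}
    return sum(alpha.values())

# Hypothetical parameters, for illustration only.
start = {"clear": 0.9, "colonised": 0.1}
trans = {"clear": {"clear": 0.9, "colonised": 0.1},
         "colonised": {"clear": 0.2, "colonised": 0.8}}
emit = {"clear": {"negative": 0.95, "positive": 0.05},
        "colonised": {"negative": 0.30, "positive": 0.70}}

sequence = ["negative", "negative", "positive"]
likelihood = forward_likelihood(sequence, start, trans, emit)
print(f"P(sequence) = {likelihood:.5f}")
```

Parameter estimation, as in the studies discussed here, typically works in the opposite direction: the transition and emission probabilities are adjusted so that the likelihood of the observed data is as high as possible.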

HMMs have also been used for predicting body trajectories for cancer progression, where conditional probabilities of clinical data were modelled using HMM techniques (Ohlsson et al. 2001). In that study, each potential body site was encoded by an N-letter code, and a disease trajectory was described in terms of a string of letters. Patient database records were then represented by start- and endpoints through the architecture illustrated in Fig. 22.5. The approach was explored using pathological data for non-Hodgkin lymphoma, augmented with an artificial database generated according to observed distributions in the clinical data. For the HMMs, a Bayesian approach was taken using the hybrid Monte Carlo method, producing an ensemble of models rather than a single one. In addition, Simöes (2010) made use of an emission probability value for detecting possible aberrations that could occur during cancer progression. In this study, an additional value was placed on the transitions between states to represent the possibility of mutations that occur from one phase to another and go undetected in laboratory procedures. A representation of the study is shown in Fig. 22.6.

Fig. 22.5 Standard HMM architecture developed by Ohlsson et al. (2001). (S and E denote start and end state, respectively, while the delete, main, and insert states are marked as d, m, and i)

Fig. 22.6 HMM architecture developed by Simöes (2010). (s values mark a stop state and the e represents emission values between states)

As mentioned previously, administering drugs to patients is one of the important procedures in patient care; however, it is important to take note of possible adverse drug reactions, as they are one of the leading causes of injury or death among patients undergoing medical treatment. This is because adverse reactions are not always thoroughly identified before a drug is made available on the market. Sampathkumar et al. (2014) developed an HMM-based system for mining online healthcare fora and extracting reports of adverse drug side effects (some of which are given in Table 22.2) from messages, to use them as early indicators to assist in post-marketing drug surveillance. The system’s architecture is presented in Fig. 22.7. The study also made use of an annotated dataset for the training and validation of the HMM-based text mining system illustrated in Fig. 22.8. A tenfold cross-validation on the manually annotated dataset yielded an average F-score of 0.76 for the HMM classifier, compared to 0.575 for the baseline classifier. Furthermore, using the aforementioned computational models, the study also discovered some novel adverse side effects which could potentially be classified as adverse drug reactions.
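For reference, the F-score reported above combines precision and recall into a single figure, and a cross-validated score is the mean over folds. The sketch below assumes the balanced F1 form of the F-score; the per-fold counts and the function names are hypothetical and are not results from Sampathkumar et al. (2014).

```python
def f_score(tp, fp, fn):
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0                      # no true positives: score is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def mean_f_score(fold_counts):
    """Average the F-score over cross-validation folds of (tp, fp, fn) counts."""
    scores = [f_score(tp, fp, fn) for tp, fp, fn in fold_counts]
    return sum(scores) / len(scores)

# Hypothetical (tp, fp, fn) counts for two folds, for illustration only.
folds = [(8, 2, 2), (6, 2, 4)]
average = mean_f_score(folds)
print(f"mean F-score over {len(folds)} folds = {average:.3f}")
```

In a tenfold cross-validation the same averaging would be applied to ten such tuples, one per held-out fold.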

Table 22.2 Mined side effects of drugs from different sources using HMM (Sampathkumar et al. 2014)

Fig. 22.7 System architecture of the mining system for adverse drug reactions (Sampathkumar et al. 2014)

Fig. 22.8 Resulting HMM obtained from the training set (Sampathkumar et al. 2014)

4 Conclusions: Challenges in Developing Intelligent Patient-Centric Systems

The future success of AI in the development of patient-centric intelligent systems is directly related to the careful consideration and design of the theories and architectures involved in any such system. As described, developing applications for patient-centric procedures is a well-recognised challenge in the domain of AI. As ANNs and HMMs are non-application-specific computational models, various theoretical and applied research studies have been undertaken, all with varying degrees of success. Nevertheless, the challenge lies in deciding to which part of the classification process the ANN or HMM is to be applied. It is clear that their nonlinear representation abilities can address highly complex problems, and it is this attribute which should be further exploited by using them to address a single element in the overall automated classification process.

In terms of designing intelligent systems that cater specifically to the healthcare industry, there continues to be a need for novel approaches that address the complexities of hospital operations and offer much needed productivity gains in resource usage and in providing services to patients. Despite improvements in technology and the translation of numerous studies from labs to daily use in hospitals and medical practices, effective AI adoption remains one of the greatest challenges facing the health research community. Issues such as those presented by big data necessitate the development of novel computational techniques to handle the large volumes of data available, along with their variability and velocity, efficiently and in the least amount of processing time. There are cases when instantaneous and accurate results may be important in the immediate diagnosis and treatment of patients. A further issue is that the use of technology should not compromise the design of effective human-computer interfaces. Humans should be able to interact better with the data, and communication should be integrated with the flow of information, despite the presence of complex computational models within such systems (Khanna et al. 2013).

Nonetheless, AI in healthcare applications has been shown to be beneficial, providing more sophisticated techniques that aid physicians in performing various clinical tasks and allowing hospitals to discover new means of serving patients with better treatments and healthcare procedures. From assistive technology to data analytics, patient-centric systems have continued to evolve, and with the help of research in the field of health care, such applications will produce more sophisticated techniques that can improve patient care and offer a wider array of treatment approaches for handling disease.