Artificial intelligence (AI) is increasingly used in clinical anesthesia. Researchers use algorithms to mine patients’ perioperative data, process and analyze them along multiple dimensions, and then build predictive models that dynamically predict perioperative adverse events.

The depth of anesthesia (DOA) is associated with morbidity, mortality, postoperative adverse events, and related organ damage. Therefore, maintaining an appropriate DOA in the perioperative period is of great significance for clinical anesthesia. Currently, perioperative anesthetic depth is monitored with the bispectral index (BIS). Maintaining BIS at 40–60 can avoid both intraoperative awareness and excessively deep anesthesia, but BIS monitoring has a time lag and is easily disturbed by the electrotome. Some studies have explored monitoring the DOA from the patient’s raw electroencephalography (EEG). Because the EEG changes in complex ways across anesthesia states, it is difficult to assess the DOA effectively from a single extracted feature; with the help of AI algorithms, however, multiple effective features can be extracted from the EEG to assess the DOA accurately and improve real-time monitoring. Apart from BIS and EEG, other clinical signals have also been investigated to help monitor the DOA and other perioperative clinical data, which will be introduced in this chapter as well.

1 Application of AI in BIS

Artificial neural networks are commonly used in medical research to build prediction models. A multilayer feed-forward neural network used to predict steady-state plasma drug concentration showed less prediction error than nonlinear mixed effects modeling (Brier et al. 1995). In a study comparing clinicians with artificial neural networks, a network using 10 common clinical parameters predicted a BIS value under 60 after bolus propofol injection better than clinicians did (Lin et al. 2002). Using spontaneous neuromuscular recovery and time elapsed since reversal, a simple feed-forward neural network predicted residual neuromuscular block (Laffey et al. 2003). Feed-forward neural networks also predicted postoperative nausea and vomiting (Peng et al. 2007) and hypotension (Lin et al. 2011) better than traditional and statistical diagnostic models. Additionally, artificial neural networks have been used extensively to interpret complicated data such as electroencephalograms (EEGs). A feed-forward neural network was trained on raw EEG signals to build a novel index of anesthesia depth, with a correlation coefficient of 0.94 (Ortolani et al. 2002). A recurrent neural network differentiated three anesthetic states from preprocessed EEG with an accuracy as high as 99.6% (Srinivasan et al. 2005). A feed-forward neural network model that combined preprocessed EEG with multiple vital signs to build a new DOA index was tested for prediction of anesthesia level, and the index showed less error and higher prediction accuracy than BIS (Sadrawi et al. 2015).

Traditionally, isobole and response surface models have been used to explain the pharmacodynamic interaction between propofol and remifentanil (Short et al. 2016; Bouillon et al. 2004). Short et al. (2016) recently used an empirical response surface model to predict the BIS value from the effect-site concentrations (Ce) of propofol and remifentanil. Predicted and measured BIS correlated well, with a median performance error (MDPE) of 8 ± 24% and a median absolute performance error (MDAPE) of 25 ± 13%. Artificial neural networks have predicted BIS during propofol and remifentanil target-controlled infusions better than traditional response surface models. Gambús et al. (2011) adopted a fuzzy logic-based artificial neural network (Adaptive Neuro-Fuzzy Inference System, ANFIS) to predict BIS from the combination of propofol Ce and remifentanil Ce during sedation-analgesia for endoscopic procedures. Analysis of a validation group found an MDPE of 5.83, an MDAPE of 15.85, and a root mean square error (RMSE) of 13.25%, which are notably smaller than the errors in the Short et al. (2016) study. However, the ANFIS model was built using calculated Ce, which is inherently inaccurate in dynamic phases, and has only been tested in steady states. It may therefore be less applicable to the induction and recovery periods of anesthesia. Combining feed-forward neural networks with long short-term memory, which processes time series data effectively, may enhance predictive power in the dynamic phase.
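The error metrics quoted above can be computed as in the following minimal sketch. It assumes the performance-error definition commonly used in infusion studies, PE = (measured - predicted) / predicted × 100; the source does not state its formula, and the function names are hypothetical.

```python
import math
import statistics

def performance_errors(measured, predicted):
    """Percentage performance error of each sample, relative to the prediction (assumed definition)."""
    return [100.0 * (m - p) / p for m, p in zip(measured, predicted)]

def mdpe(measured, predicted):
    """Median performance error: a signed measure of bias."""
    return statistics.median(performance_errors(measured, predicted))

def mdape(measured, predicted):
    """Median absolute performance error: a measure of inaccuracy."""
    return statistics.median(abs(e) for e in performance_errors(measured, predicted))

def rmse(measured, predicted):
    """Root mean square error, in the same units as the measurements."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

A negative MDPE under this definition would mean that the model tends to overpredict BIS, while MDAPE ignores the sign and summarizes the typical size of the error.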

An empirical model that aims to optimize data description has the disadvantage that it has no biological basis, and its parameters are difficult to interpret. Additionally, complex models with a large number of parameters are prone to overfitting, which decreases the predictive power of the empirical model. Using advanced computational methods such as deep learning, the study addressed these weaknesses of empirical modeling by designing a model system that mimics the traditional mechanistic pharmacokinetic–pharmacodynamic (PK–PD) model. The long short-term memory in this study differs substantially in form from the traditional PK model but is theoretically similar to it: as in the traditional PK model, in which the change in drug amount over time is linear, the previous time node of the long short-term memory affects the next time node. Unlike traditional PK–PD models, the long short-term memory model does not assume pharmacokinetic intermediaries such as plasma concentrations or Ce, which are sources of error in those models. A feed-forward neural network then computes the nonlinear dose–response relationship between the propofol in the compartments and the measured BIS. Given enough nodes, a feed-forward neural network with a hidden layer can approximate any nonlinear function, whereas a simple feed-forward neural network without a hidden layer performs a task similar to multiple linear regression analysis (Hornik 1991). A hidden layer of the feed-forward neural network was used to estimate the effects of covariates and of propofol and remifentanil combined. Covariates were fed to both the PD and PK parts to improve performance, though PD was more error-prone than PK.
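The contrast between a network with and without a hidden layer can be illustrated with a toy problem. The sketch below is not the study's model: the XOR data, the hand-picked weights, and the function names are assumptions chosen only to show that one hidden layer with a nonlinearity can represent a function that a purely linear fit cannot.

```python
import numpy as np

# Toy XOR data (hypothetical, not from the study).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
Y = np.array([0.0, 1.0, 1.0, 0.0])

def linear_fit_predict(x, y):
    """Least-squares fit with no hidden layer, i.e. multiple linear regression."""
    design = np.hstack([x, np.ones((len(x), 1))])  # append an intercept column
    weights, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ weights

def hidden_layer_predict(x):
    """Forward pass of a 2-2-1 network with ReLU hidden units and hand-picked weights."""
    w_hidden = np.array([[1.0, 1.0], [1.0, 1.0]])  # column j feeds hidden unit j
    b_hidden = np.array([0.0, -1.0])
    w_out = np.array([1.0, -2.0])
    hidden = np.maximum(0.0, x @ w_hidden + b_hidden)  # ReLU nonlinearity
    return hidden @ w_out
```

On this data the linear fit can do no better than predicting the same value for every input, whereas the two hidden units reproduce the targets exactly. In practice the weights would be learned from data rather than chosen by hand.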

The main advantage of Verotta’s deep learning model architecture is its extensibility in various areas. Traditional PK–PD studies require frequent blood sampling and analysis of drug concentrations, which are major limitations because of cost or ethical concerns. Because the deep learning model requires only the dosing history and the measured effect, it makes PK–PD studies in vulnerable subjects easier to perform. The second benefit of the deep learning model is that it can easily test the effects of multiple covariates. Because Verotta related covariates directly to effects rather than to PK–PD parameters, the high-dimensionality problem associated with traditional covariate modeling can be eliminated (Verotta 2012). Several covariates that affect propofol PK–PD, such as cardiac output and hemorrhage, can be quickly incorporated into the deep learning model as input nodes (Upton et al. 1999; Johnson et al. 2004). Another long short-term memory input can be used to model the combined effects of more than two drugs. Lastly, the model can readily take advantage of rapidly developing machine learning algorithms and software. The results of this study can also be applied clinically. Target-controlled infusion pumps can provide a BIS prediction curve to aid in determining the best dose of two synergistic drugs. Unlike the learning process, calculating the BIS from the inputs and node weights is quick, so the trained model can be applied immediately to target-controlled infusion devices (Beam and Kohane 2016).

2 EEG with a Deep Learning Approach

In surgery, anesthetic drugs primarily affect the brain (Brambrink and Kirsch 2019). Physiological measures such as blood pressure, heart rate, and blood oxygen level are usually used to assess the DOA during surgery. These clinical parameters differ between patients and surgeries, depending on age, body weight, gender, and medical history. Because such variable vital signs are the primary inputs for assessing consciousness, interpreting them is quite challenging. BIS is a commercial EEG monitor that processes the EEG online to track the effect of anesthetic agents, and it is used to reduce the incidence of awareness during total intravenous anesthesia. Because BIS is still subject to patent restrictions, its algorithm is not publicly available. In the BIS monitor, electrodes are attached to the forehead to capture raw EEG signals and generate DOA scores ranging from 0 to 100 (Nimmo et al. 2019; Punjasawadwong et al. 2014). EEG-based DOA estimation is commonly performed using BIS.

EEG is a useful tool for recording brain activity and has been widely used to analyze and diagnose epilepsy, Alzheimer’s disease, attention deficit hyperactivity disorder, and other disorders. As one of the common methods for monitoring, detecting, and diagnosing epilepsy, EEG measures the electrical activity of the brain through multiple electrodes placed at different locations, and the recorded signal usually contains multiple channels. EEG signals are usually acquired either by placing electrodes on the surface of the scalp or by short-term intracranial implantation, called scalp EEG and intracranial EEG, respectively. Although intracranial EEG recordings provide a better signal-to-noise ratio, intracranial electrodes have limited coverage and may miss discharges outside the covered area, and they place greater demands on the surgeon. Scalp EEG is a noninvasive technique that is more widely applicable and easier to use for daily patient monitoring and for generating emergency alerts.

There has been considerable progress in the use of machine learning methods, including deep learning, for processing complex data (Ravì et al. 2016; Hong et al. 2020; Korkalainen et al. 2019). By creating a hybrid deep learning structure, this study attempts to mimic the BIS index online. The network receives raw EEG data and calculates the DOA index without any handcrafted features extracted from the EEG. With large patient datasets, a deep neural network (DNN) outperforms feature-based classification systems as well as other DNN structures (Bengio et al. 2013). Real-time forecasting of continuous BIS scores is relatively new in the field of anesthesia. This study combines deep learning methods to estimate the BIS index with a regression model.

As deep learning has been widely adopted in image classification, natural language processing, and time series prediction, more and more deep learning models have been proposed. In particular, deep learning algorithms can learn high-level representations from natural signals (Mei et al. 2018), so they have achieved prominent results in medicine and signal processing. In EEG monitoring, deep learning models such as the convolutional neural network (CNN) and the stacked autoencoder (SAE) can learn feature representations directly from EEG data, thus replacing hand-designed feature extraction (Craley et al. 2021; Yang et al. 2020). The extracted features have proven more robust and can achieve better detection performance.

Bidirectional long short-term memory (BiLSTM) networks have design advantages over CNNs in extracting temporal features of brain activity in different states, for example in emotion recognition (Jia et al. 2020), motor imagery classification (Jin et al. 2018), and sleep staging (Lea et al. 2016). However, because information decays across the many layers of a deep neural network, back propagation suffers from the vanishing gradient problem when the long short-term memory (LSTM) network faces ultra-long sequences, which can weaken the reliability of the model. CNNs can extract shift-invariant local patterns from input sequences as features for classification models, and they are especially suited to learning features of multivariate time series data, e.g., for action or activity recognition (Morid et al. 2020), capturing hidden patterns in multivariate time series of healthcare data (Wang et al. 2019), and extracting period information for multivariate time series prediction (Yuan et al. 2017).

Over the past few years, a variety of features from a range of domains have been proposed for DOA assessment. Wavelet coefficient energy entropy and wavelet weighted median frequency, for instance, correlate highly with the BIS index (Zoughi and Boostani 2010; Afrasiabi et al. 2012). Burst suppression is a key feature for detecting deep anesthesia. Särkelä et al. (2002) used the nonlinear energy operator to detect and segment burst suppression automatically. Several studies use sample entropy and permutation entropy features (Shalbaf et al. 2013, 2017; Liu et al. 2018). The instantaneous frequency (IF) is an important component of the BIS score (Lashkari and Boostani 2017). The IF can be estimated from the EEG using a short-time Fourier transform. Moreover, Kalman filters have been used to predict the cutoff frequencies of the band-pass filter through successive windows, resulting in a more accurate estimation of the IF (Lashkari and Boostani 2017). Decisions can be made using various types of regressors and classifiers, such as artificial neural networks (Shalbaf et al. 2013), neuro-fuzzy inference systems with linguistic hedges (Shalbaf et al. 2017), and random forests (Liu et al. 2018). However, anesthesia research mostly uses private datasets. DOA labels in datasets are assessed by anesthesiologists (Liu et al. 2019) or extracted from automatic EEG monitoring systems (Bengio et al. 2013; Liu et al. 2018).

Lee et al. (Bengio et al. 2013) developed a deep learning model based on data collected from 231 subjects undergoing total intravenous anesthesia during surgery. The inputs to the network are the subject’s characteristics and the propofol and remifentanil infusion histories. The network determines the BIS score by predicting continuous values. Their method outperforms the pharmacokinetic–pharmacodynamic model (Liu et al. 2019). In another approach, convolutional neural networks such as CifarNet, AlexNet, and VGGNet are trained on spectrograms of EEGs from 50 subjects. For a big dataset, converting EEG signals into 2D images is computationally intensive. A classification performance of 93.5% is achieved, but with only three levels of anesthesia, whereas it is more common to consider four anesthetic states (Shalbaf et al. 2013, 2017; Liu et al. 2018). In Lee et al.’s study (Lee et al. 2019), a decision tree is built to classify BIS ranges using four parameters derived from the BIS monitor. BIS values are then calculated using multiple regression models. A dataset of 5427 subjects is used to train the model. Compared with an end-to-end deep learning model, this method generalizes less well and is more susceptible to noise.

Most feature-based methods combine expert handcrafted features with classifiers, and they focus mainly on extracting the handcrafted features from background patterns. Common features are obtained with time-domain, frequency-domain, time-frequency-domain, and nonlinear methods. The classifiers are often traditional machine learning methods.

However, in many fields, features extracted by deep learning methods are more robust than handcrafted features. Truong et al. (2018) used the short-time Fourier transform (STFT) to extract the time and frequency domain information of EEG signals, together with a CNN architecture consisting of three blocks (each comprising a normalization layer, a convolutional layer, and a max pooling layer) for feature extraction and classification. Ullah et al. (2018), instead of extracting features from EEG signals, applied a pyramidal one-dimensional deep convolutional neural network directly to single-channel EEG signals, and the experimental results showed that CNNs learn better features than manual feature engineering.

Manual feature extraction requires a large amount of domain knowledge, and selecting only some EEG channels loses useful information. Although EEG signals are usually dynamic and nonlinear, they can be considered stationary over sufficiently short time periods. Different brain regions may affect epilepsy differently and show different EEG characteristics, and there may be local dependence between channels. The characteristics of EEG signals at one point in time correlate to different degrees with data from past and future time points. In the field of natural language processing, self-attention mechanisms are often used to capture such contextual relationships. For example, Li et al. (2020) propose a BiLSTM model with a self-attention mechanism and multi-channel features, which combines multiple feature vectors with the hidden output of the BiLSTM model and uses the self-attention mechanism to give different sentiment weights to different words. It can effectively increase the importance of sentiment-polarity words and fully exploit the sentiment information in the text. Guo et al. (2020) propose a Chinese named entity recognition model based on multi-scale local contextual features and a self-attention mechanism. The original bidirectional long short-term memory and conditional random field (BiLSTM-CRF) model is modified by fusing convolutional neural networks (CNNs) with different kernel sizes to extract multi-scale local contextual features. The self-attention mechanism overcomes the limitation of BiLSTM-CRF in capturing dependencies and further improves the performance of the model.

As a key technology for brain–computer interfaces, EEG processing can be divided into five stages (Ilyas et al. 2015). The first stage is the acquisition of EEG signals. The second stage is preprocessing, which aims to remove noise. The raw EEG signal contains interfering signals from the eyes, heart, and muscles, and removing them simplifies subsequent analysis and processing. The third stage is feature extraction: features are extracted from the preprocessed EEG signals to distinguish different EEG signals and to reduce the dimensionality of the signals, which simplifies computation. The fourth stage is the classification of the extracted features; selecting an appropriate classifier is an important factor in classification performance. The fifth stage uses the classification results to control external devices or to give judgment results. Preprocessing, feature extraction, and classification of EEG signals are important elements of EEG signal processing and have been widely and deeply studied (Motamedi-Fakhr et al. 2014; Tambe and Khachane 2016).

The raw EEG signal contains ocular, ECG, EMG, and other noise, and power-line (industrial frequency) interference is another important source of EEG artifacts. These artifacts increase the complexity and computational cost of EEG signal processing and need to be removed before signal analysis (Rajya Lakshmi et al. 2014). The main EEG signal preprocessing methods are Common Spatial Patterns (CSP), Principal Component Analysis (PCA), Common Average Referencing (CAR), adaptive filtering, Independent Component Analysis (ICA), digital filtering, etc.

After preprocessing, the raw EEG signal becomes a relatively clean EEG signal with various artifacts and noise removed. However, because the amount of EEG data is large, direct processing is too complicated, and feature extraction is needed to reduce the dimensionality of the data (Ilyas et al. 2015). At present, the commonly used signal feature extraction methods are Power Spectral Density (PSD), Principal Component Analysis (PCA), Independent Component Analysis (ICA), Auto Regressive Analysis (AR), Wavelet Transform (WT), Wavelet Packet Transform (WPT), Fast Fourier Transform (FFT), etc.

After preprocessing and feature extraction, the extracted feature vectors are passed to a classifier to analyze and predict the EEG signal. Commonly used EEG signal classifiers include k-Nearest Neighbor (k-NN), Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), Naive Bayes (NB), Artificial Neural Network (ANN), and Deep Learning (DL).

2.1 Common Spatial Patterns

The common spatial pattern (CSP) in signal processing is a mathematical method for separating a multivariate signal into additive subcomponents that have the largest variance difference between two windows. CSP filtering is derived from Common Spatial Subspace Decomposition (CSSD). For a two-class problem, the basic idea is to design a spatial filter, that is, to find a direction in the high-dimensional space, that maximizes the variance of one class of signal while minimizing the variance of the other. Applying the filter to the EEG signal yields a new time series whose variance differs most between the two classes, and this variance serves as the feature. The advantage of this algorithm is that it does not require pre-selection of specific frequency bands, but the disadvantage is that it is noise sensitive and depends on multi-channel analysis (Pei and Yang 2018).
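One common way to obtain such filters is to whiten the sum of the two class covariance matrices and then eigendecompose one whitened class covariance. The following is a minimal sketch of that approach, not the implementation of any cited study; the function names, the trace normalization, and the log-variance features are assumptions made for illustration.

```python
import numpy as np

def csp_filters(trials_a, trials_b, n_pairs=1):
    """Compute CSP spatial filters for two classes.

    trials_a, trials_b: arrays of shape (n_trials, n_channels, n_samples).
    Returns 2 * n_pairs filters as rows: the first n_pairs favor class A variance,
    the last n_pairs favor class B variance.
    """
    def mean_covariance(trials):
        covs = []
        for trial in trials:
            centered = trial - trial.mean(axis=1, keepdims=True)
            cov = centered @ centered.T
            covs.append(cov / np.trace(cov))  # normalize out overall trial power
        return np.mean(covs, axis=0)

    cov_a, cov_b = mean_covariance(trials_a), mean_covariance(trials_b)
    # Whiten the composite covariance so that filtered total variance is 1.
    evals, evecs = np.linalg.eigh(cov_a + cov_b)
    whitener = evecs @ np.diag(evals ** -0.5) @ evecs.T
    # In the whitened space, class A and class B variances along any direction sum to 1.
    d, u = np.linalg.eigh(whitener @ cov_a @ whitener.T)
    order = np.argsort(d)[::-1]  # largest class A variance first
    filters = u[:, order].T @ whitener
    picks = list(range(n_pairs)) + list(range(-n_pairs, 0))
    return filters[picks]

def csp_features(trial, filters):
    """Log of the normalized variance of each spatially filtered time series."""
    variance = (filters @ trial).var(axis=1)
    return np.log(variance / variance.sum())
```

Because the two class variances sum to one after whitening, a direction that maximizes the variance of one class automatically minimizes that of the other, which is why the filters are taken from both ends of the eigenvalue spectrum.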

2.2 Principal Component Analysis

PCA is the process of computing the principal components and using them to perform a change of basis on the data, sometimes using only the first few principal components and ignoring the rest. It is a statistical method that transforms a set of possibly correlated variables into linearly uncorrelated variables through an orthogonal transformation, and the transformed variables are called “principal components.” Principal component analysis reduces the dimensionality of vectors and the complexity of signal feature extraction and classification. In EEG signal processing applications, principal component analysis decomposes the EEG signal into uncorrelated components with maximum variance, separates large-amplitude interfering components such as the electrooculogram (EOG) and EMG, and then reconstructs the EEG signal to achieve signal denoising (Liu and Yao 2006).
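The change of basis described above can be sketched with a singular value decomposition of the centered data. This is a generic illustration under the assumption that rows are observations and columns are variables; the function name is hypothetical.

```python
import numpy as np

def pca(data, n_components):
    """Project data of shape (n_samples, n_features) onto its first principal components.

    Returns the component scores, the principal directions (one per row),
    and the fraction of total variance explained by each kept component.
    """
    centered = data - data.mean(axis=0)
    # Rows of vt are orthonormal principal directions, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    return scores, components, explained[:n_components]
```

Keeping only the first few columns of the scores gives the dimensionality reduction; multiplying the scores by the kept components and adding back the mean gives a reconstruction with the discarded components removed.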

2.3 Common Average Reference

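CAR re-references each channel by subtracting, at every time point, the average of all recorded channels. It is a computationally simple technique and is therefore amenable to both on-chip and real-time applications.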
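A minimal sketch of common average referencing follows. It assumes the EEG is stored as a channels-by-samples array; the function name is hypothetical.

```python
import numpy as np

def common_average_reference(eeg):
    """Subtract the instantaneous mean across channels from every channel.

    eeg: array of shape (n_channels, n_samples).
    """
    return eeg - eeg.mean(axis=0, keepdims=True)
```

Any component that is identical on all channels, such as noise picked up by the reference electrode, is removed by this subtraction, and the cost is one mean and one subtraction per sample.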

2.4 Adaptive Filter

An adaptive filter comprises a linear filter with variable parameters and a method for adjusting those parameters according to an optimization algorithm. Because of the complexity of the optimization algorithms, most adaptive filters are digital filters. An adaptive filter adjusts its parameters automatically without knowing the statistical characteristics of the input signal and noise in advance; it gradually estimates the required statistical characteristics during operation and tunes its own parameters to achieve the best filtering effect. A complete adaptive filter consists of four main parts: the input signal, the reference signal, the filter, and the parameter adjustment.
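The four parts can be seen in a least mean squares (LMS) noise canceller, one widely used adaptive filter. The text does not name a specific algorithm, so LMS is an assumption here, as are the function name, the tap count, and the step size.

```python
import numpy as np

def lms_filter(reference, primary, n_taps=4, step=0.002):
    """Adaptive noise cancellation with the LMS update rule.

    primary:   recorded input signal (signal of interest plus interference).
    reference: separate recording correlated with the interference only.
    Returns the cleaned signal and the final filter weights.
    """
    weights = np.zeros(n_taps)            # the linear filter with variable parameters
    cleaned = np.zeros(len(primary))
    for n in range(n_taps - 1, len(primary)):
        window = reference[n - n_taps + 1:n + 1][::-1]  # most recent sample first
        interference_estimate = weights @ window
        cleaned[n] = primary[n] - interference_estimate  # error signal = cleaned output
        weights += 2.0 * step * cleaned[n] * window      # parameter adjustment
    return cleaned, weights
```

The step size trades convergence speed against steady-state fluctuation of the weights: too large a value makes the filter unstable, too small a value makes it slow to track changes.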

2.5 Independent Component Analysis

Independent component analysis (ICA) is a signal processing method that separates a multivariate signal into additive subcomponents. It is a blind source separation method that separates artifacts from the EEG signal as independent components based on the characteristics of the data. According to ICA theory, oculomotor artifacts, ECG artifacts, EMG artifacts, and power-line (industrial frequency, IDF) interference are generated by statistically independent sources, so the ICA algorithm can separate them and extract the useful EEG signal. ICA provides an effective method for separating and removing oculomotor artifacts from EEG signals. Pontifex et al. explored a fully automated ICA component separation method for eye-movement artifacts that avoids misidentifying signal components whose scalp distribution resembles that of eye-movement artifacts and reduces the potential for human error in identifying artifacts (Pontifex et al. 2017a). In the same year and in the same journal, Pontifex et al. also explored whether the variability associated with the uncertainty of the ICA algorithm affects the reconstruction of the EEG signal after the oculomotor artifact components are removed. They analyzed EEG data from 32 university students using three different ICA algorithms, each repeated 30 times. The results showed that ICA may introduce other artifacts into the reconstructed EEG signal after artifact components are removed, and that careful selection of the ICA algorithm and parameters may reduce this effect (Pontifex et al. 2017b).

2.6 Power Spectral Density

Power spectral density describes how the power of a time series signal is distributed over frequency; it is a statistical measure of the mean square value of a random variable. In a study of sleep spindles, the results showed statistically significant differences between the “interphase” and the “before” and “after” periods, as well as significantly different fractal dimensions, and these differences help to understand the changes in sleep spindles (De Dea et al. 2018).
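The link between power spectral density and the mean square value can be checked numerically with a simple periodogram. This sketch is a basic estimator chosen for illustration (practical analyses often use averaged or windowed estimators); the function name and scaling convention are assumptions.

```python
import numpy as np

def periodogram(signal, fs):
    """One-sided periodogram estimate of the power spectral density.

    signal: 1-D array of samples; fs: sampling rate in Hz.
    Returns frequencies in Hz and PSD values in (signal units)^2 per Hz.
    """
    n = len(signal)
    centered = signal - np.mean(signal)
    spectrum = np.fft.rfft(centered)
    psd = np.abs(spectrum) ** 2 / (fs * n)
    # Fold negative frequencies into the positive half; DC and Nyquist bins are not doubled.
    if n % 2 == 0:
        psd[1:-1] *= 2.0
    else:
        psd[1:] *= 2.0
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, psd
```

With this scaling, summing the PSD over frequency and multiplying by the bin width recovers the mean square value of the mean-removed signal, which is the sense in which the PSD distributes signal power over frequency.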

2.7 Auto Regressive Analysis

AR analysis is a time-domain analysis method that extracts features by fitting a mathematical model to EEG signal data. An AR model can be formulated as a linear prediction problem: for time series data, the value at the current point is approximated by a linear weighted sum of the n most recent previous samples. AR models commonly used in EEG signal analysis can be further classified into adaptive and non-adaptive models (Li et al. 2009).
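The linear prediction formulation can be sketched as an ordinary least-squares problem, which corresponds to a non-adaptive AR model. The function names are hypothetical, and least squares is only one of several standard ways to estimate AR coefficients.

```python
import numpy as np

def fit_ar(signal, order):
    """Estimate AR coefficients a[0..order-1] so that
    signal[t] is approximated by sum_k a[k] * signal[t - 1 - k].

    signal: 1-D NumPy array.
    """
    # Each row holds the `order` samples preceding time t, most recent first.
    rows = [signal[t - order:t][::-1] for t in range(order, len(signal))]
    design = np.array(rows)
    target = signal[order:]
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coeffs

def predict_next(signal, coeffs):
    """One-step-ahead prediction from the most recent samples."""
    recent = signal[-len(coeffs):][::-1]
    return float(coeffs @ recent)
```

The fitted coefficients themselves can serve as a compact feature vector for a signal segment.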

2.8 Wavelet Transform and Wavelet Packet Transform

The wavelet transform is a time-frequency transform method that inherits and develops the localization idea of the short-time Fourier transform and provides a “time-frequency” window that changes with frequency. Through scaling and translation operations, the wavelet transform highlights signal characteristics and refines the signal at multiple scales, achieving higher time resolution at high frequencies and higher frequency resolution at low frequencies, and thus adapts automatically to the requirements of time-frequency analysis. The wavelet transform decomposes only the low-frequency part of the signal, not the high-frequency part, so the frequency resolution decreases as the signal frequency increases. In one study, the discrete wavelet transform (DWT) was applied to EEG signals from migraine patients; 23 features were extracted from each channel signal, all of which were used for pattern recognition after secondary screening (Subasi et al. 2019). The quality factor Q of the wavelet basis function in the discrete wavelet transform is fixed, whereas the quality factor Q of the Tunable Q-factor Wavelet Transform (TQWT) is adjustable, so that the oscillation of the wavelet can be matched to the oscillation of the characteristic waveform. TQWT generally decomposes EEG signals into different sub-bands based on the quality factor Q, the redundancy R, and the number of decomposition levels J. Because EEG signals are random and non-stationary, Q takes a relatively large value, for example 14 (Al Ghayab et al. 2019). The wavelet packet transform has a higher resolution than the wavelet transform for high-frequency signals and is a more refined analysis method. It is used for feature extraction in EEG-based studies such as lie detection, facial expression recognition, and driving intention recognition to obtain better classification results (Dodia et al. 2019; Edla et al. 2018; Li et al. 2018).
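The statement that the wavelet transform repeatedly decomposes only the low-frequency part can be made concrete with the Haar wavelet, the simplest wavelet basis. The cited studies do not necessarily use Haar wavelets; this choice, the function names, and the sub-band energy features are assumptions for illustration.

```python
import numpy as np

def haar_dwt(signal, levels):
    """Multi-level Haar discrete wavelet transform.

    Returns the final approximation (low-frequency) coefficients and a list of
    detail (high-frequency) coefficients, finest level first.
    """
    approx = np.asarray(signal, dtype=float)
    if len(approx) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # high-frequency half, kept as is
        approx = (even + odd) / np.sqrt(2.0)         # low-frequency half, split again
    return approx, details

def subband_energies(approx, details):
    """Energy of each detail band (finest first) followed by the approximation band."""
    return [float(np.sum(d ** 2)) for d in details] + [float(np.sum(approx ** 2))]
```

A wavelet packet transform would differ only in also splitting each detail band at every level, which is what gives it finer resolution at high frequencies.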

2.9 Fast Fourier Transform

The Fast Fourier Transform is a fast algorithm for computing the discrete Fourier transform. In EEG feature extraction, the FFT transforms the EEG signal from the time domain to the frequency domain for spectral analysis or for calculating the power spectral density. The FFT has also been used to analyze EEG signals in fatigue driving and driver EEG signals in driving behavior simulation experiments for unmanned driving systems (Dkhil et al. 2018; Yang and Ma 2018).
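A typical FFT-based feature is the relative power in conventional EEG frequency bands. In the sketch below, the band edges are commonly quoted conventions rather than values from the source, and the names `BANDS` and `relative_band_powers` are hypothetical.

```python
import numpy as np

# Conventional band edges in Hz (assumed; definitions vary between studies).
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0), "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}

def relative_band_powers(signal, fs, bands=BANDS):
    """Fraction of spectral power in each band, computed from the FFT.

    signal: 1-D array of samples; fs: sampling rate in Hz.
    """
    power = np.abs(np.fft.rfft(signal - np.mean(signal))) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    band_power = {
        name: float(power[(freqs >= low) & (freqs < high)].sum())
        for name, (low, high) in bands.items()
    }
    total = sum(band_power.values())
    return {name: value / total for name, value in band_power.items()}
```

The resulting four numbers sum to one and can be used directly as a low-dimensional feature vector for a segment of EEG.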

EEG signal feature extraction is an important step in EEG signal classification and recognition. The EEG signal is the superposition, on the surface of the scalp, of potentials formed by various electrophysiological activities of the brain, and it is random and non-stationary; extracting useful features from this complex signal is the key to EEG signal analysis. Band-pass filtering the EEG signal according to its frequency distribution is not sufficient to reflect its characteristics, and a high-dimensional feature vector makes the subsequent classification algorithm computationally very expensive, so dimensionality reduction is necessary, generally using PCA or ICA.

2.10 Linear Discriminant Analysis

LDA is a linear learning method proposed by Fisher in 1936. The main idea of LDA is as follows: for a given set of training samples, find a projection direction that projects the samples onto a straight line so that the projections of samples from the same class are as close together as possible and the projections of samples from different classes are as far apart as possible (Zhou 2016). LDA is computationally inexpensive and easy to use, and is a good classification method.
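For two classes, the projection direction has a closed form: the inverse of the within-class scatter matrix applied to the difference of the class means. The sketch below implements this two-class case; the function names and the midpoint threshold are assumptions for illustration.

```python
import numpy as np

def fit_lda(x_a, x_b):
    """Two-class Fisher discriminant.

    x_a, x_b: arrays of shape (n_samples, n_features), one per class.
    Returns the projection direction and a decision threshold on the projection.
    """
    mean_a, mean_b = x_a.mean(axis=0), x_b.mean(axis=0)
    # Within-class scatter: how spread out each class is around its own mean.
    scatter = (x_a - mean_a).T @ (x_a - mean_a) + (x_b - mean_b).T @ (x_b - mean_b)
    direction = np.linalg.solve(scatter, mean_a - mean_b)
    threshold = direction @ (mean_a + mean_b) / 2.0  # midpoint of the projected means
    return direction, threshold

def predict_lda(x, direction, threshold):
    """Return 0 for samples assigned to class A and 1 for class B."""
    return np.where(x @ direction > threshold, 0, 1)
```

Fitting requires only class means, one scatter matrix, and one linear solve, which is why the method is considered computationally light.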

2.11 Support Vector Machine

The basic principle of SVM is to find the optimal decision surface in space so that data of different classes fall on opposite sides of the decision surface, thereby achieving classification (Li 2018). Siuly et al. used an optimum allocation-based principal component analysis method (OA_PCA) for feature extraction and tested four popular classifiers: least square support vector machine (LS-SVM), naive Bayes classifier (NB), k-nearest neighbor algorithm (KNN), and linear discriminant analysis (LDA). The results showed that LS-SVM achieved a classification accuracy of up to 100%, an improvement of 7.10% over existing classification algorithms for epilepsy EEG data (Siuly and Li 2015).

2.12 Naive Bayes

The Naive Bayesian classifier is a simple and practical classifier based on Bayes’ theorem, and in some fields its efficiency is comparable to that of other classifiers (Tahernezhad-Javazm et al. 2018; Machado and Balbinot 2014; Mehmood et al. 2017). The main idea is that, for a given item to be classified, the probability of each category given the item is computed, and the item is assigned to the category with the largest probability. The Naive Bayesian algorithm assumes that the features are independent of one another given the category (Obeidat and Mansour 2018). When used to process high-dimensional data, the Naive Bayesian classifier stands out for its speed, efficiency, and simple algorithm structure (Katkar and Kulkarni 2013). Researchers have proposed various improved algorithms based on the Naive Bayesian algorithm, such as the tree-augmented Naive Bayesian algorithm and the network-augmented Naive Bayesian algorithm, all of which aim to improve algorithm performance and classification accuracy (Tahernezhad-Javazm et al. 2018).
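The decision rule can be sketched for continuous features by modeling each feature within each category as a Gaussian. The Gaussian likelihood is an assumption made here for illustration (the text does not specify one), as are the function names.

```python
import numpy as np

def fit_gnb(x, y):
    """Estimate per-class priors, feature means, and feature variances.

    x: array of shape (n_samples, n_features); y: array of class labels.
    """
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([x[y == c].mean(axis=0) for c in classes])
    variances = np.array([x[y == c].var(axis=0) for c in classes]) + 1e-9  # avoid zero variance
    return classes, priors, means, variances

def predict_gnb(x, model):
    """Assign each sample to the class with the largest posterior probability."""
    classes, priors, means, variances = model
    diff = x[:, None, :] - means[None, :, :]  # shape (n_samples, n_classes, n_features)
    # Independence assumption: the log-likelihood is a sum over features.
    log_likelihood = -0.5 * np.sum(
        np.log(2.0 * np.pi * variances)[None] + diff ** 2 / variances[None], axis=2
    )
    log_posterior = np.log(priors)[None] + log_likelihood
    return classes[np.argmax(log_posterior, axis=1)]
```

Because each feature is handled separately, training needs only one mean and one variance per feature and class, which explains the speed of the method on high-dimensional data.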

2.13 Artificial Neural Network

ANN has been a hot research topic in the field of AI since the 1980s. It abstracts the neuronal network of the human brain from the perspective of information processing and builds corresponding models, forming different networks through different connection methods. It is a branch of machine learning.

ANN is widely used in medical diagnosis, especially in the detection and analysis of biomedical signals. It can solve biomedical signal processing problems that are difficult or impossible to solve by conventional methods, and has been applied to EEG, ECG, oncology, and psychiatry (Dande and Samant 2018; Ventouras et al. 2005). Dande et al. presented a trained ANN for the diagnosis of tuberculosis with a sensitivity and specificity of 100% and 72%, respectively (Dande and Samant 2018). Grossi et al. used an ANN-based MS-ROM/I-FAST system to extract features of interest from EEG for the differential diagnosis of autism in children, with good results; the system required only a few minutes of EEG data and no data preprocessing (Grossi et al. 2017).