
8.1 Introduction

8.1.1 EOG Fundamentals

The electrooculogram (EOG) is the electrical signal produced by the potential difference between the cornea (the positive pole) and the retina (the negative pole). This potential difference is measured by placing surface electrodes near the eyes, and the process of measuring the EOG is called electrooculography.

Eye movements can be saccades, smooth pursuits, vergence or vestibulo-ocular movements, and each can be either reflex or voluntary. Saccades are voluntary eye movements used in clinical studies to analyse eye movements. Smooth pursuits are also voluntary but are slower tracking eye movements that keep a moving stimulus on the fovea. Vergence and vestibulo-ocular movements have involuntary origins [1], so they are usually removed in EOG applications.

EOG-based human−computer interfaces (HCIs) offer disabled people a new means of communication and control as eye movements can easily be interpreted in EOG signals. In recent years, different types of HCI systems have been developed such as controlling the computer cursor, virtual keyboards, electric wheelchairs, games, hospital alarm systems, television control systems, home automation applications and smartphones [2,3,4,5,6,7,8]. Diabetic retinopathy or refractive disorders such as hypermetropia and myopia can be diagnosed early based on the EOG results [9]. EOG also provides reliable information to identify sleep stages and detect anomalies [10, 11].

Figure 8.1 shows a block diagram of the main stages to develop an EOG system. For the development of widely used EOG-based applications in the real world, it is necessary to provide accurate hardware and efficient software to implement the tasks shown in Fig. 8.1, which will be introduced in this chapter.

Fig. 8.1
A block diagram of the main stages which can make up a system. The involved steps are signal acquisition, digital processing, features processing, classification, and decision-making.

Block diagram of the main stages that can make up a system based on EOG signals

8.1.2 EOG Signal Measurement

Saccadic eye movements are the most interesting as they are voluntary and easily identifiable on the EOG. The most basic movements are up, down, right, and left. To distinguish between these eye movement classes, two pairs of bipolar electrodes and a reference electrode are positioned around the eyes as shown in Fig. 8.1.

The amplitude of EOG signals typically ranges from 50 to 3500 μV, and their frequency content lies between 0 and about 50 Hz. Another issue to consider is that muscle noise is spread almost uniformly across the signal bandwidth, which makes it very difficult to remove completely. The amplitude of the signal obtained using two electrodes to record the differential potential of the eye is directly proportional to the angle of rotation of the eyes within the range ±30°. Sensitivity is on the order of 15 μV/° [12].
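The linear relationship above can be sketched in a few lines of Python. This is only an illustration: the function name and constants are hypothetical, the 15 μV/° figure is the order of magnitude quoted in the text, and a real system would calibrate the sensitivity for each user.

```python
# Illustrative constants taken from the figures quoted in the text.
SENSITIVITY_UV_PER_DEG = 15.0  # assumed sensitivity, order of magnitude only
LINEAR_RANGE_DEG = 30.0        # proportionality is stated for +/-30 degrees


def amplitude_to_angle(amplitude_uv, sensitivity=SENSITIVITY_UV_PER_DEG):
    """Estimate the gaze angle (degrees) from a differential EOG amplitude (uV).

    Returns the angle and a flag telling whether it lies inside the range
    where the linear model is expected to hold.
    """
    angle = amplitude_uv / sensitivity
    in_linear_range = abs(angle) <= LINEAR_RANGE_DEG
    return angle, in_linear_range
```

For example, under these assumptions a 300 μV deflection corresponds to a rotation of about 20°, while a 600 μV deflection would fall outside the linear range.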

Voluntary and involuntary blinks produce spikes in EOG signals that must be detected because they can be mistaken for saccades. Figure 8.2 shows EOG fundamentals by modelling the eye as a dipole and an electrooculogram where two typical saccades with different amplitudes depending on the gaze angle are represented.

Fig. 8.2
An eye model and a graph. The model involves the electrode, retina, cornea, head orientation vector, gaze vector, and bioamplifier. The graph presents an electrooculogram where two typical saccades with different amplitudes depending on the gaze angle.

Eye modelled as a dipole and an electrooculogram where two typical saccades with different amplitude depending on the gaze angle

Before any further analysis, a preprocessing step is necessary, mainly to reduce baseline drift, powerline interference and electromyographic potentials. Analogue filters are usually used for this task. Table 8.1 shows the most relevant commercial bioamplifiers used in EOG for experimental measurements.

Table 8.1 Most relevant commercial bio amplifiers used for EOG measurement along with their main characteristics

On the other hand, several datasets are publicly available that offer signals which are already pre-processed and ready to be used. Some of the most widely used datasets in the literature are shown in Table 8.2, mostly related to sleep recordings.

Table 8.2 Summary of the most relevant EOG signal databases

8.2 EOG Signal Denoising

After hardware acquisition and preprocessing, EOG signals are still contaminated by several noise sources that can mask eye movements and simulate eyeball events. Additional denoising is needed to remove unwanted spectral components, to complement the analogue filtering and to remove other kinds of human biopotentials. Denoising must preserve the amplitudes and the slopes of the EOG signals so that blinks and saccades can be detected. In some applications, blinks are an additional feature, while in others they are considered artefacts that should be eliminated. In the latter case, the blink artefact region is easily removed because blinks rarely occur during saccades; they usually occur immediately before or after them. Another issue that needs to be considered in EOG noise removal is crosstalk, or the interdependence between acquisition channels: many eye movements recorded by one channel also appear in the other EOG channels. The main strategy is to ignore those signals with a low amplitude.

Several methods have been proposed in the literature to attenuate or eliminate the effects of artefacts on EOG signals. Digital filters are typically employed to reduce muscle artefacts and to remove power line noise and linear trends. Adaptive filters, such as Kalman and Wiener filters, are used to remove electrocardiographic and electromyographic artefacts whose frequencies overlap the EOG spectrum. In addition to linear filtering, the median filter is very robust in removing high-frequency noise while preserving the amplitude and the slope, without introducing any shift in the signal [11].
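As an illustration of the median filter mentioned above, the following minimal sketch uses only the Python standard library. The function name, the default kernel size and the choice of truncating the window at the signal edges are assumptions made for this example, not the procedure of [11].

```python
from statistics import median


def median_filter(signal, kernel_size=5):
    """Sliding-window median filter; the window is truncated at the edges."""
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    half = kernel_size // 2
    n = len(signal)
    filtered = []
    for i in range(n):
        lo = max(0, i - half)          # first sample inside the window
        hi = min(n, i + half + 1)      # one past the last sample in the window
        filtered.append(median(signal[lo:hi]))
    return filtered
```

Applied to a step-like signal containing an isolated spike, a filter of this kind removes the spike while the position of the step edge is left unchanged, which is the property that makes it attractive for saccade detection.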

Regression methods can learn the signal behaviour by modelling the coloured noise that distorts the EOG signal and subtracting it. Because nearby samples are related, these methods use the neighbouring noise samples to predict a given sample. The noise distribution characteristics are not considered in the regression models.

The wavelet transform (WT) is a powerful mathematical tool for noise removal. The WT consists of decomposing a signal into scaled and shifted versions of a wavelet called the “mother wavelet”. From the point of view of signal processing, wavelets act as bandpass filters. The WT is a stable representation of transient phenomena and therefore conserves energy. In this way, the WT provides much more information about the signal than the Fourier transform because it allows the peculiarities of the signal to be highlighted, acting as a mathematical microscope.

Figure 8.3 shows the general scheme of the WT-based filtering procedure. The decomposition module is responsible for obtaining the wavelet coefficients of the EOG signal at the output of the conditioning stage. The thresholding stage consists of selecting an appropriate threshold for the wavelet coefficients so that those of lower values are eliminated because they correspond to noise and interference. Finally, the EOG signal is reconstructed through the coefficients not discarded in the previous stage. To do this, the reverse process of the wavelet transformation carried out in the first module is followed.

Fig. 8.3
A process flow presents the general outline of WT-based filtering. The order of the flow is as follows: EOG, decomposition, thresholding, reconstruction, and filtered EOG.

General outline of the WT-based filtering procedure
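The three stages of this scheme (decomposition, thresholding, reconstruction) can be sketched with the Haar wavelet, the simplest orthogonal wavelet. This is a minimal illustration in plain Python: the Haar wavelet, the soft-thresholding rule, the fixed threshold and all function names are assumptions chosen for brevity, whereas practical EOG systems tend to use other wavelet families and data-driven thresholds.

```python
import math


def haar_dwt(x):
    """One decomposition level: approximation and detail coefficients."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the signal from one level."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / s, (a - d) / s])
    return x


def soft_threshold(coeffs, thr):
    """Shrink coefficients towards zero; those below thr become zero."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]


def wavelet_denoise(x, levels=2, thr=0.5):
    """Decompose, threshold the detail coefficients, and reconstruct."""
    if len(x) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(soft_threshold(d, thr))
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx
```

With a threshold of zero the signal is reconstructed exactly, which is a convenient sanity check; larger thresholds progressively smooth the small fluctuations attributed to noise.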

For a complete decomposition and reconstruction of the signal, the filters of the wavelet structures must have a finite number of coefficients (finite impulse response filters) and must be regular. It is also important that the filters have linear phase; this requirement rules out non-trivial orthogonal filters but allows the use of biorthogonal filters. It is very common to choose the Biorthogonal and Daubechies families for EOG signal processing [3, 24].

8.3 Compression

In some EOG applications, such as sleep studies, it is necessary to compress the signals because they can extend up to several gigabytes. Storing this amount of data and transmitting it for remote health monitoring come at a high cost. In these cases, compression techniques are needed to transmit the signal through communication networks. One option is the Turning point compression algorithm reported in [25]. This algorithm halves the effective sampling rate while saving the turning points that represent the peaks and valleys of the EOG signal. The main purpose of compressing data is to reduce the size while retaining the characteristic and useful features. Figure 8.4 shows the original, filtered, and compressed EOG signal using this technique.

Fig. 8.4
A graph of voltage versus time. It depicts three curves for original, filtered, and compressed. The trend for original is more noisy.

Original, filtered, and compressed EOG signal using Turning point algorithm
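The idea can be sketched as follows. This is the classical form of the turning point algorithm, written as a minimal example; the function name is hypothetical and the variant used in [25] may differ in its details.

```python
def turning_point_compress(x):
    """Keep one sample out of every incoming pair, preferring turning points.

    x0 is the last retained sample and (x1, x2) the next pair. If the slope
    changes sign at x1, then x1 is a peak or valley and is retained;
    otherwise x2 is retained.
    """
    if not x:
        return []
    compressed = [x[0]]
    x0 = x[0]
    for i in range(1, len(x) - 1, 2):
        x1, x2 = x[i], x[i + 1]
        keep = x1 if (x1 - x0) * (x2 - x1) < 0 else x2
        compressed.append(keep)
        x0 = keep
    return compressed
```

In the small triangular example used to check this sketch, nine samples are reduced to five and both the peak and the valley of the waveform survive the compression.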

8.4 EOG Feature Processing

EOG feature processing consists of selecting how many and which signal features are the most relevant for the application. An important issue is that the chosen features must be independent of each other to avoid collecting redundant data. Inspection of EOG signals shows that fixations have a stable slope, whereas saccades and blinks rise quickly. The same happens for smooth eye movements and other features such as average speed, range, variance, and signal energy. Fixations are the slowest eye movements and saccades are the fastest [26]. The processing of informative, discriminatory, and independent features is a key step in preparing an appropriate collection of values for a classifier.

8.4.1 Feature Extraction

Feature extraction is the process of deriving features from the processed signals so that the classes can be clearly discriminated using several independent features. The goal is to find a space in which the extracted features are more independent and can be discriminated from one another. Feature extraction techniques fall into the time domain, frequency domain, time-frequency domain, and nonlinear domain [27].

8.4.1.1 Time-Domain Features

Time-domain features represent the morphological characteristics of a signal. They are easy to interpret and suitable for real-time applications. The most popular time-based parameters are compiled in Table 8.3.

Table 8.3 Summary of the main time-domain EOG features, where x refers to the input signal, n is the nth sample of the signal, and N is the total number of samples
  • Statistical features are the mean, variance, standard deviation, skewness, kurtosis, median, and the 25th and 75th percentiles of the signal.

  • Hjorth features are the activity, mobility, and complexity parameters, which measure the variance of a time series, the proportion of standard deviation of the power spectrum, and the change in the signal frequency, respectively.

  • Zero crossing rate features refer to the number of times that a signal crosses the baseline. This parameter is very sensitive to additive noise.

Other time-domain EOG features for eye movements are amplitude, latency, deviation, velocity, slope, peak polarity, and duration.
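Two of the feature groups listed above can be computed with a few lines of standard-library Python. The sketch below uses the usual textbook definitions of the Hjorth parameters (based on the variance of the signal and of its first and second differences); the function names are illustrative, and the code assumes a non-constant input signal so that no variance is zero.

```python
import math
from statistics import pvariance


def diff(x):
    """First difference of a sequence."""
    return [b - a for a, b in zip(x, x[1:])]


def hjorth_parameters(x):
    """Return (activity, mobility, complexity) of a non-constant signal."""
    dx = diff(x)
    ddx = diff(dx)
    activity = pvariance(x)
    mobility = math.sqrt(pvariance(dx) / activity)
    complexity = math.sqrt(pvariance(ddx) / pvariance(dx)) / mobility
    return activity, mobility, complexity


def zero_crossings(x, baseline=0.0):
    """Count the sign changes of the signal relative to the baseline."""
    centred = [v - baseline for v in x]
    return sum(1 for a, b in zip(centred, centred[1:]) if a * b < 0)
```

Dividing the count returned by `zero_crossings` by the window length gives the zero crossing rate.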

8.4.1.2 Frequency Domain Features

The main features in the frequency domain are energy, power ratio, spectral frequency, duration ratio and power spectral density (PSD). The most popular methods to estimate the PSD are Autoregressive (AR), Moving average and Autoregressive moving average models. In [28], AR was used to extract features from the EOG signal. These are called parametric methods because the spectrum is estimated from a model of the signal. They are suitable for short signals with a low SNR. In non-parametric methods, such as the Periodogram and Welch methods, the PSD values are calculated directly from the signal samples in each signal window. In [29], a non-parametric statistical analysis is performed using the Welch method. The features obtained by the Welch technique discriminate better because non-parametric methods are less sensitive to residual noise and motion artefacts than parametric and cumulant-based methods. The non-parametric methods based on the fast Fourier transform are easy to implement. Another method used to extract frequency domain features is higher-order spectra, which represent the frequency content of the higher-order statistics of the signal.
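To make the non-parametric approach concrete, the following sketch averages Hann-windowed periodograms over overlapping segments, which is the core of the Welch method. It is written in plain Python with a direct DFT for readability rather than speed; the function name, the 50% overlap, the removal of the segment mean and the scaling are assumptions of this example, and it is not the analysis performed in [29].

```python
import cmath
import math


def welch_psd(x, fs, seg_len, overlap=0.5):
    """Average windowed periodograms over overlapping segments.

    Returns (freqs, psd) for the non-negative frequency bins. The values are
    not doubled to form a one-sided density.
    """
    step = max(1, int(seg_len * (1.0 - overlap)))
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (seg_len - 1))
              for n in range(seg_len)]
    win_power = sum(w * w for w in window)
    n_bins = seg_len // 2 + 1
    psd = [0.0] * n_bins
    n_segs = 0
    for start in range(0, len(x) - seg_len + 1, step):
        seg = x[start:start + seg_len]
        mean = sum(seg) / seg_len
        seg = [(s - mean) * w for s, w in zip(seg, window)]  # detrend, window
        for k in range(n_bins):
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / seg_len)
                      for n in range(seg_len))
            psd[k] += abs(acc) ** 2 / (fs * win_power)
        n_segs += 1
    if n_segs == 0:
        raise ValueError("signal is shorter than one segment")
    psd = [p / n_segs for p in psd]
    freqs = [k * fs / seg_len for k in range(n_bins)]
    return freqs, psd
```

Band powers or power ratios can then be obtained by summing the PSD values over the frequency bins of interest.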

8.4.1.3 Time-Frequency Features

EOG signals are non-stationary, and three main techniques are available to transfer a signal from the time domain to the time-frequency domain:

  • Signal decomposition: The aim of signal decomposition is to decompose the signal into a series of basis functions, and the most common methods are the short-time Fourier transform and the WT [3]. The first is simple and well known; however, the second is the most widely used for EOG signals. Continuous wavelet transforms provide more separable features, and their coefficients are more redundant, than discrete wavelet transforms within the same period.

  • Energy distribution: Several methods are proposed for energy distribution: Choi–Williams distribution and Wigner-Ville distribution are the traditional non-linear time-frequency methods widely used to analyse non-stationary signals. Hilbert–Huang transform is a more recent method to obtain momentary regularity of nonlinear and non-stationary signals such as EOGs [4].

  • Modelling: The Gaussian mixture model (GMM) is used in some works to estimate the continuous probability density of the signal. The model parameters are estimated using the Expectation-Maximization algorithm such that the probability of observation is maximised. Figure 8.5 depicts the GMM model [8, 26, 30].

Fig. 8.5
A diagram presents the GMM model. The input passes through Gaussian functions and results in the output.

GMM structure where three parameters must be estimated separately for each Gaussian function: mean vector (μ), covariance matrix (∑) and weight (w). The weighted sum of these probabilities builds the output based on the input observations, x(t) = {x1, ..., xn} where xn is the nth observation or feature vector
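The estimation of these parameters with the Expectation-Maximization algorithm can be sketched for the one-dimensional case, where each covariance matrix reduces to a scalar variance. The function names, the user-supplied initial means, the fixed number of iterations and the variance floor are assumptions made to keep the example short.

```python
import math
from statistics import pvariance


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, init_means, n_iter=50, min_var=1e-6):
    """Fit a 1-D Gaussian mixture with EM; returns weights, means, variances."""
    k = len(init_means)
    means = list(init_means)
    variances = [pvariance(data)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)  # floor avoids a collapsed component
    return weights, means, variances


def gmm_density(x, weights, means, variances):
    """Weighted sum of the component densities, i.e. the model output."""
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))
```

The function `gmm_density` corresponds to the weighted sum described in the caption of Fig. 8.5.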

8.4.1.4 Non-Linear Features

Non-linear methods employed for EOG signal feature extraction fall into two main groups:

  • Entropy and complexity-based methods. Complexity methods are used to estimate the nonlinear dynamic parameters of EOG, electroencephalographic (EEG) and electromyographic (EMG) signals. Among complexity methods, entropy-based algorithms are robust estimators for evaluating the regularity of signals. Shannon’s entropy is the best known. However, in some cases, the data for the decision-making processes cannot be measured accurately, and other methods have been proposed, such as Rényi’s, Sample, Tsallis, Permutation, Multiscale and Approximate entropy [31].

  • Fractal-based methods. These methods measure the fractal dimension of the irregular shape of the EOG and determine the amount of self-similarity in the signal. The Correlation Dimension, Lyapunov exponent and Hurst exponent are examples of fractal-based methods. First, they map a signal into the phase space and then measure the self-similarity of its trajectory shape [32].

Both techniques are suitable for measuring the amount of roughness in the signal, since the entropy of the signal increases with its irregularity. These techniques are only effective at detecting stage transitions, not at characterising the signal bandwidth.
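Two of the entropy measures named above can be sketched as follows. Shannon entropy is estimated here from a histogram of the amplitudes, and permutation entropy from the distribution of ordinal patterns; the number of bins, the pattern order and the function names are assumptions of this example.

```python
import math
from collections import Counter


def _entropy_bits(counts):
    """Shannon entropy (in bits) of a list of occurrence counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def shannon_entropy(x, n_bins=8):
    """Entropy of the amplitude histogram of the signal."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return 0.0
    counts = [0] * n_bins
    for v in x:
        idx = min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[idx] += 1
    return _entropy_bits(counts)


def permutation_entropy(x, order=3):
    """Entropy of the ordinal patterns formed by `order` consecutive samples."""
    patterns = Counter(
        tuple(sorted(range(order), key=lambda j: x[i + j]))
        for i in range(len(x) - order + 1)
    )
    return _entropy_bits(list(patterns.values()))
```

A perfectly regular (for instance, monotonic) segment yields a permutation entropy of zero, and the value grows as the ordering of the samples becomes less predictable.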

8.4.2 Feature Selection

After feature extraction, feature selection techniques are applied to find a discriminative subset of features. This reduces the number of features needed to feed and train the subsequent classification models, which helps to avoid over-fitting and reduces the computational time [33].

Minimum redundancy maximum relevance is a feature selection algorithm based on the criteria of minimum redundancy (the least correlation among the selected features) and maximum relevance (the most correlation with the class). Redundancy can be computed by using Pearson’s R for continuous features or mutual information for discrete ones. Relevance can be calculated using the F-distribution under the null hypothesis for continuous features or mutual information for discrete ones [34].
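A greedy version of this selection for continuous features can be sketched as follows. Relevance is computed with the one-way ANOVA F-statistic and redundancy with the mean absolute Pearson correlation with the features already selected. Combining the two as a quotient is one common choice and is an assumption of this example, as are the function names; it is not necessarily the scheme used in [34].

```python
import math


def pearson_r(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def f_statistic(values, labels):
    """One-way ANOVA F-statistic of a feature with respect to the classes."""
    groups = {}
    for v, c in zip(values, labels):
        groups.setdefault(c, []).append(v)
    n, k = len(values), len(groups)
    grand = sum(values) / n
    between = sum(len(g) * (sum(g) / len(g) - grand) ** 2
                  for g in groups.values()) / (k - 1)
    within = sum((v - sum(g) / len(g)) ** 2
                 for g in groups.values() for v in g) / (n - k)
    return between / within if within > 0 else math.inf


def mrmr_select(features, labels, n_select):
    """Greedy selection: maximise relevance divided by mean redundancy."""
    relevance = {name: f_statistic(vals, labels) for name, vals in features.items()}
    selected = [max(relevance, key=relevance.get)]
    while len(selected) < min(n_select, len(features)):
        def score(name):
            redundancy = sum(abs(pearson_r(features[name], features[s]))
                             for s in selected) / len(selected)
            return relevance[name] / (redundancy + 1e-12)
        remaining = [name for name in features if name not in selected]
        selected.append(max(remaining, key=score))
    return selected
```

In a toy case with two perfectly correlated features and a third, equally relevant but less correlated one, the sketch avoids selecting the two redundant features together.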

Another selection method is Clearness-based feature selection (CBFS). CBFS computes the distance between the target sample and the centroid of each class. Then, the algorithm compares the class of the closest centroid with the class of the target sample. [35] reports an efficient classification of EOG signals using this algorithm.
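The comparison described above can be turned into a per-feature score: the fraction of samples whose nearest class centroid matches their own class. The sketch below is a simplified, single-feature reading of that idea with hypothetical function names; the published CBFS algorithm and the implementation in [35] may differ in their details.

```python
def clearness_score(values, labels):
    """Fraction of samples whose nearest class centroid is their own class."""
    sums, counts = {}, {}
    for v, c in zip(values, labels):
        sums[c] = sums.get(c, 0.0) + v
        counts[c] = counts.get(c, 0) + 1
    centroids = {c: sums[c] / counts[c] for c in sums}
    hits = 0
    for v, c in zip(values, labels):
        nearest = min(centroids, key=lambda k: abs(v - centroids[k]))
        hits += nearest == c
    return hits / len(values)


def rank_features(features, labels):
    """Feature names sorted from the clearest to the least clear."""
    return sorted(features,
                  key=lambda name: clearness_score(features[name], labels),
                  reverse=True)
```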

8.4.3 Feature Normalization

Feature normalization can be applied to reduce the effects of the individual variability and is performed over values for each feature separately. This process can prevent extremely high or low values from influencing any conclusions. [36] reports the procedure for feature normalization in an automatic sleep staging method. Another example of normalization is shown in [37] in which the original EOG signal, managed by a dynamic threshold (includes a positive and a negative threshold), would be transformed into a series of rectangular pulses that have −1 or 1 in their amplitude.
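Both ideas can be illustrated with short sketches. The first function applies a z-score to the values of a single feature, which is one common normalization and is only assumed here, since the text does not specify the method of [36]. The second converts a signal into pulses using a positive and a negative threshold; for simplicity the thresholds are fixed rather than dynamic, and samples between them are mapped to 0, which is also an assumption.

```python
from statistics import mean, pstdev


def zscore(values):
    """Normalise one feature to zero mean and unit standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def to_pulses(signal, pos_threshold, neg_threshold):
    """Map samples above/below the thresholds to +1/-1, and the rest to 0."""
    pulses = []
    for v in signal:
        if v > pos_threshold:
            pulses.append(1)
        elif v < neg_threshold:
            pulses.append(-1)
        else:
            pulses.append(0)
    return pulses
```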

8.5 Classification

The automatic identification of eye movements (classification) is essential to generate accurate commands, especially in real-time applications. Classification techniques based on static or dynamic thresholds are not easily generalisable, so methods based on artificial intelligence (AI) are needed. Many conventional machine learning algorithms and, more recently, deep learning algorithms made possible by increased computational power are becoming virtual assistants that help clinicians classify EOG features and improve medical diagnosis. The most widely used classifiers are briefly described below.

The main parameters associated with classification performance are accuracy, precision, sensitivity, specificity, recall, F1 and F2 scores, true-positive rate, false-positive rate and Gini’s or Matthews correlation coefficient. Confusion matrices are also commonly used to compare the performance of different classification methods and to avoid misleading conclusions when the data are unbalanced.
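For a binary problem, most of these parameters follow directly from the four entries of the confusion matrix. The sketch below uses the standard definitions; the function name and the dictionary keys are illustrative, and no protection against empty classes (division by zero) is included.

```python
import math


def binary_metrics(tp, fp, fn, tn, beta=2.0):
    """Common performance metrics from a 2x2 confusion matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)               # sensitivity, true-positive rate
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "false_positive_rate": 1.0 - specificity,
        "f1": 2 * precision * recall / (precision + recall),
        "f_beta": (1 + beta ** 2) * precision * recall
                  / (beta ** 2 * precision + recall),
        "mcc": (tp * tn - fp * fn)
               / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }
```

With the default `beta=2.0`, the `f_beta` entry is the F2 score, which weights recall more heavily than precision.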

8.5.1 Machine Learning Techniques

Conventional machine learning techniques include a wide variety of algorithms. All of them have shown better performance in EOG feature classification than the threshold-based classification techniques proposed in some preliminary EOG-based systems. The main machine learning techniques used in EOG feature classification are briefly described below.

K-Nearest Neighbor (K-NN) is an algorithm that finds the nearest observations to the one it is trying to predict and classifies the observation of interest according to the majority of the surrounding data. The only parameter to set is the number of neighbouring points to consider in the vicinity to classify the different classes that are already known in advance.
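A minimal K-NN classifier fits in a few lines. The Euclidean distance and the function name are choices made for this example; other distance metrics can be used.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    neighbours = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

Here `train_x` holds the feature vectors (for instance, hypothetical horizontal and vertical amplitude features) and `train_y` the known eye movement classes.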

Support vector machines are another family of conventional supervised classifiers. They adopt a kernel function, typically nonlinear, to map the input data into a space in which an optimal hyperplane separates the features of the different classes.

Decision trees are non-parametric supervised learning techniques that require little preprocessing and have a good runtime performance to handle tasks in real-time. The goal is to create a model that predicts the value of an objective variable according to various input variables [38].

Random Forest (RF) is one of the best algorithms for accurately classifying large datasets. RF is an ensemble of predictor trees such that each tree depends on the values of a random vector. This random vector is sampled independently and has the same distribution for all trees. Each tree is grown through bootstrap training. Figure 8.6 shows the general structure of RF. The classification is made from the vote of each tree in the ensemble by selecting the most popular class among them [26]. [39] reports an automatic scoring of sleep stages classification using EOG signals.

Fig. 8.6
A flowchart of a random forest structure. The dataset leads to random vectors which divides into 3 datasets and passes through bootstrapping, and training, later combined for voting which in turn influences the prediction.

Random Forest structure

Linear Discriminant Analysis (LDA) is a method of supervised classification in which two or more groups of variables are known a priori and new observations are classified into one of them according to their characteristics. The decision is made by a nearest-centre classifier applied to the LDA outputs: after training, it calculates the distance between any point and the centre of each class. In [40], an LDA classifier was applied to EOG classification with good training and testing accuracy, and it could be used in systems for disabled people.

Logistic Regression (LG) optimises a set of weights assigned to each input feature to provide the best classification performance using a training dataset. LG was used in the design of an omnidirectional robot controlled by eye movements because of its efficiency and the low computing resources needed [7].
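The optimisation of the weights can be sketched with batch gradient descent on the logistic loss. The learning rate, the number of epochs and the function names are assumptions of this example, and it does not reproduce the implementation used in [7].

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.5, n_epochs=500):
    """Fit weights and bias by batch gradient descent on the logistic loss."""
    n_features = len(xs[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(n_epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(xs, ys):
            # Prediction error for one sample (predicted probability - label).
            err = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / len(xs) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(xs)
    return weights, bias


def predict(weights, bias, x):
    """Return class 1 if the predicted probability is at least 0.5."""
    z = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0
```

Once trained, prediction only requires one weighted sum per sample, which is consistent with the low computing resources mentioned above.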

GMM as a classifier learns the input features of each class and assigns a specific label to them. When a sample fits into the scheme of Fig. 8.5, the label that produces the highest probability is assigned. The GMM provides a framework to model unknown distributional shapes. The key issue is to estimate how many components to include in the model.

A Hidden Markov Model (HMM) is a statistical model in which the parameters are unknown. The training is done using maximum likelihood. HMM assesses the transition and emission probabilities from the observation sequence to the state sequence. This classifier can tolerate time warping of the input data. [8, 41] report a wheelchair navigation system based on an HMM for people with restricted mobility. Figure 8.7 shows an example of the transition of states of HMM.

Fig. 8.7
A diagram presents an example of the transition of states of Hidden Markov Model.

Example of HMM architecture, where ‘x’ refers to hidden states, ‘y’ refers to observable outputs, ‘a’ are transition probabilities, and ‘b’ are the output probabilities
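Given the transition and emission probabilities, the most likely sequence of hidden states for a sequence of observations can be recovered with the Viterbi algorithm. The sketch below is a generic implementation with a hypothetical function name; it assumes the probabilities have already been estimated and is not taken from [8, 41].

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most likely hidden state sequence for a list of observations."""
    # best[t][s]: probability of the best path ending in state s at time t.
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] * trans_p[p][s])
            scores[s] = best[-1][prev] * trans_p[prev][s] * emit_p[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Backtrack from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    return list(reversed(path))
```

For long sequences, the products of probabilities should be replaced by sums of log-probabilities to avoid numerical underflow.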

Clustering is an unsupervised grouping classifier where the samples lack labels. The goal is to create groups with similar samples using criteria such as information, statistical measures, and distance metrics. Each eye movement has specific features; therefore, first grouping the signals into the two categories, centre gazes and non-centre gazes might be a useful step in some classification schemes. Figure 8.8 shows the application of this concept to the hierarchical clustering procedure for classifying eye movements.

Fig. 8.8
A flowchart describes the hierarchical clustering procedure to classify eye movements. All eye movements are divided into centre gaze and non-centre gaze, and the latter is further divided into down and left, and up and right.

Hierarchical clustering procedure to classify eye movements

A first division of clustering algorithms can be established based on how the clusters are related to each other and to the objects in the dataset. In hard clustering, each object belongs to a single cluster, so the clusters form a partition of the dataset. In soft (or fuzzy) clustering, the objects belong to the clusters according to a degree of trust or belonging (e.g., Fuzzy C-Means). Clustering algorithms can also be classified according to how the objects are related: flat clustering, hierarchical clustering, graph-based clustering, and density-based clustering. Examples of these clusters can be found in [26, 42].

Even EOG signals from the same eye movement can differ in amplitude and time and thus produce recognition errors. The Dynamic time warping algorithm can solve this problem by recursively breaking the problem into subproblems, storing their results and reusing them when needed. For large datasets, this algorithm requires a long time to train the model [8, 43].
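The recursive decomposition into stored subproblems is a dynamic programming table, as the following minimal sketch shows. The absolute difference is used as the local cost and no warping window is applied; both choices, and the function name, are assumptions of this example.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j]: minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances alone
                                     cost[i][j - 1],      # b advances alone
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

Two recordings of the same movement performed at different speeds obtain a small distance, so an unknown signal can be assigned to the class of its closest template.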

Artificial neural networks (ANNs) have numerous applications in pattern classification in the medical field, where they help to interpret the EOG signals and diagnose the problem more accurately. An ANN comprises several highly interconnected processing elements called neurons, which are organized into layers. These layers have a geometry and functionality linked to the human brain. ANNs include three types of layers: input, hidden and output, as depicted in Fig. 8.9 [44].

Fig. 8.9
A diagram depicts the basic architecture of a multilayer feed-forward ANN. It consists of three layers, namely input layer, hidden layer, and output layer.

Basic architecture of multilayer feed-forward ANN where the circles represent an artificial neuron

8.5.2 Deep Learning Techniques

Deep learning (DL) replicates the functioning of the human brain in the way information is sent from one neuron to another and in the handling of large amounts of data. DL provides deeper insight than conventional machine learning techniques, as it can learn multiple levels of representation from raw data using unsupervised learning and model more complex relationships. The nucleus of DL is the ANN with multiple nonlinear hidden layers. DL relies on robust computing power and enormous datasets, as DL studies generally use a greater number of recordings to develop and evaluate their methodologies than traditional machine learning classification methods.

A convolutional neural network (CNN) is a class of ANN composed of convolution layers that act as filters to extract features, pooling layers to reduce the size of the analysed data, a fully connected layer, and a loss function to calculate the error between the current and the desired network output. Backpropagation is applied to update the weights across the cascade of convolutional and pooling layers. Figure 8.10 shows an example of a deep neural network for eye movement classification.

Fig. 8.10
A schematic presents an example of a deep neural network for eye movement classification. It involves input signal, feature extraction, classification, and probabilistic distribution.

Deep neural network for detection/classification of EOG features
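The feature extraction part of such a network can be illustrated for a one-dimensional signal with three small functions: a convolution layer, a nonlinearity and a pooling layer. The kernel values here are hand-picked for illustration, whereas in a real CNN they are learned by backpropagation; the function names are hypothetical, and the convolution is implemented as a sliding dot product without padding, as is usual in deep learning frameworks.

```python
def conv1d(signal, kernel):
    """Sliding dot product of the kernel with the signal (no padding)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(values):
    """Rectified linear unit: negative activations are set to zero."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size=2):
    """Keep the maximum of each non-overlapping window of `size` samples."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]
```

With the kernel `[-1, 1]`, the convolution responds to rising edges, so a rectangular pulse produces a single positive activation that survives the pooling stage.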

A CNN requires fewer parameters than a conventional neural network, and it can also be applied to regression problems. For example, in [45] a CNN was used to eliminate eye blinking artefacts, and in [46], the authors used a CNN for drowsiness detection based on EOG signals.

The recurrent neural network (RNN) is basically an ANN developed under the premise that humans always consider the past when making decisions. RNN automatically stores past information through a loop within its architecture. Based on this fact, in [47], an RNN was considered for real-time eye blink suppression in EEG recordings.

The time distributed convolutional neural network (TDConvNet) is a DL model comprising two main stages: a one-dimensional CNN epoch encoder, to extract the time-invariant features from raw EOG signals, and another one-dimensional CNN stage, to infer labels from the sequence of epochs. TDConvNet was applied to classify the sleep stages of polysomnography signals [48].

Unsupervised pre-training algorithms initialise the parameters so that the optimisation process trains faster. In [49, 50], two pre-training methods were presented for EOG signals: restricted Boltzmann machine (RBM) and deep belief networks (DBN). Figure 8.11a shows an example of the RBM system. The relation between the input and output layers allows the network to be trained much faster. RBM can be extended if the output layer of one RBM is the input layer for another RBM, as shown in Fig. 8.11b.

Fig. 8.11
Two diagrams, a and b. Diagram a depicts an example of the RBM system. Diagram b depicts an example of the DBN system. It illustrates that an RBM can be extended if the output layer of one RBM is the input layer for another RBM.

Examples of (a) RBM and (b) DBN. v visible layer, w weights, h hidden layer

Long short-term memory (LSTM) deep networks were developed to capture long-term dependencies in the data. The LSTM algorithm as a classifier uses three kinds of gates to control the data entering the network: input, forget and output. The most important information can be preserved between data segments by using two LSTMs, one forward and one backward. This architecture of two LSTMs, named Bidirectional long short-term memory (Bi-LSTM), presents each forward and backward training sequence to two separate LSTM layers, which are connected to the same output layer.
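The role of the three gates can be seen in a single time step of an LSTM cell. The sketch below uses the standard LSTM equations for one scalar unit; the function name and the layout of the `params` dictionary (input weight, recurrent weight and bias for each gate) are hypothetical, and real networks use vectors and matrices with weights learned during training.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell.

    params maps "input", "forget", "output" and "candidate" to a tuple
    (input weight, recurrent weight, bias).
    """
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    i = gate("input", sigmoid)        # how much new information is written
    f = gate("forget", sigmoid)       # how much of the old cell state is kept
    o = gate("output", sigmoid)       # how much of the cell state is exposed
    g = gate("candidate", math.tanh)  # candidate value to be written
    c = f * c_prev + i * g            # updated cell state (long-term memory)
    h = o * math.tanh(c)              # updated hidden state (cell output)
    return h, c
```

When the forget gate is close to 1 and the input gate close to 0, the cell state is carried over unchanged, which is how the network retains information over long sequences.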

Since long data sequences can cause a vanishing gradient problem, Gated Recurrent Unit (GRU) networks can be used to learn the representation of the EOG signal. This recurrent neural layer not only improves the memory capacity but also eases the training, since it retains the information within the unit while a sequence flows through the gating unit [51].

8.6 Decision-Making

The major difficulty in classification and subsequent decision-making is the variability of the data. Hence, it is important to have large datasets that support the learning process and the creation of generalisable models. The classification methods must be adapted to each user based on the previous actions and the results derived from them. Eye movements and blinks (voluntary and involuntary) are considered commands in EOG-based systems and are used as input in different medical diagnostic systems and operation interfaces, such as serious games, home automation or communication and mobility solutions.

Some of the EOG-based communication solutions also include text-to-speech modules. These multilingual speech synthesisers are one of the recent forms of AI. They convert the stream of digital text selected by eye movements and blinks into natural-sounding speech. EOG can also be found in the design of industry-oriented robotic arms. In these systems, decision-making based on the results of previous commands improves real-time usability to provide the user with a reasonable degree of control.

8.6.1 Intelligent Decision Support Systems

Intelligent decision support systems (IDSS) use AI tools to improve the decision-making related to complex problems that involve a large amount of data in real-time. ANNs, Fuzzy logic, Expert systems, Case-based reasoning (CBR), and Intelligent agents (IA) can be considered IDSS. This section is a brief introduction to some of these techniques related to bio-signals for medical applications.

Fuzzy logic is a very promising technology for medical decision-making applications. Its main challenge is obtaining the required fuzzy data, even more so when such data must be produced from patients. Fuzzy logic is usually used for the classification of EOG and EEG signals, but it needs calibration parameters obtained previously during user training. [52] is an example of a Fuzzy logic-based controller for wheelchair motion control using the EOG technique.

An Expert system tries to solve human problems by embedding human knowledge in the computer. A typical application of expert systems is to filter ocular artefacts hidden in EEG signals without affecting clinically important EEG information [53]. Another application is reported in [54] for multichannel sleep data analysis.

ANNs have the advantage that a trained network executes quickly, which is a key issue for signal processing applications. However, the ANN training algorithm is iterative and can suffer from convergence problems. ANNs have many practical applications; for example, they can be used for the diagnosis of a subnormal eye through the analysis of EOG signals [44].

To solve a new problem, CBR compares it with previously solved problems and adapts their known solutions instead of starting from scratch [55]. A CBR cycle involves retrieving relevant cases from the case memory, selecting the best ones, developing a solution, assessing that solution, and storing the newly solved case in the memory. A CBR system was used to classify ocular artefacts in EEG signals [56].
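The CBR cycle described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the system of [56]: cases are plain feature vectors, similarity is the Euclidean distance, the solution is reused by majority vote, and the class `CaseBase`, its methods and the optional `revise` callback are hypothetical names.

```python
import math

class CaseBase:
    """Minimal case memory: retrieve, reuse, revise and retain."""

    def __init__(self):
        self.cases = []  # list of (feature_vector, solution) pairs

    def retain(self, features, solution):
        """Store a solved case in the memory."""
        self.cases.append((tuple(features), solution))

    def retrieve(self, features, k=1):
        """Return the k stored cases closest to the new problem."""
        ranked = sorted(self.cases, key=lambda case: math.dist(case[0], features))
        return ranked[:k]

    def solve(self, features, k=3, revise=None):
        """Reuse the majority solution of the retrieved cases, then retain it."""
        votes = {}
        for _, solution in self.retrieve(features, k):
            votes[solution] = votes.get(solution, 0) + 1
        proposed = max(votes, key=votes.get)
        # Optional assessment step: a caller-supplied function may correct
        # the proposed solution before it is stored.
        final = revise(features, proposed) if revise else proposed
        self.retain(features, final)
        return final
```

A short usage example with made-up feature values and labels:

```python
memory = CaseBase()
memory.retain((0.1, 0.2), "clean")
memory.retain((0.9, 0.8), "ocular artefact")
memory.retain((1.0, 0.9), "ocular artefact")
label = memory.solve((0.95, 0.85), k=1)
```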

Alexa and Siri are examples of IA. They gather data from the internet without the help of the user. In biomedical applications, IA are used to diagnose, treat, and manage problems associated with dementia and Alzheimer’s disease [57]. In these cases, an agent may be any methodology with decision-making abilities, such as patient analysts, signal processing, neural network models and Bayesian systems. The information held by each agent can be shared with other agents.

In the last ten years, Multi-Agent Systems (MAS) have gained interest due to advances in AI, wireless sensor networks and sensors. In a MAS, a large problem is divided into smaller subproblems. Each subtask is delegated to a different agent, which produces an output according to its task. All outputs are then combined into the final answer to the complete problem. The interaction between agents speeds up problem resolution. MAS are a research topic in complex medical applications [58].
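The divide, delegate and combine pattern can be illustrated with a toy example. In this sketch, which is only an assumption about how such a system might be organised, two hypothetical agents each analyse one EOG channel and a coordinator merges their outputs into a single gaze label. The thresholds of 0.5 and the names `horizontal_agent`, `vertical_agent` and `coordinator` are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

def horizontal_agent(sample):
    """Agent responsible for the horizontal channel only."""
    h = sample["h"]
    return "right" if h > 0.5 else "left" if h < -0.5 else ""

def vertical_agent(sample):
    """Agent responsible for the vertical channel only."""
    v = sample["v"]
    return "up" if v > 0.5 else "down" if v < -0.5 else ""

def coordinator(sample, agents):
    """Run every agent on its subproblem and join the partial answers."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        # map() returns the results in the same order as the agents.
        outputs = list(pool.map(lambda agent: agent(sample), agents))
    parts = [out for out in outputs if out]
    return "-".join(parts) if parts else "centre"
```

Calling `coordinator({"h": 0.8, "v": -0.9}, [vertical_agent, horizontal_agent])` combines the two partial decisions into one label. The agents here do not exchange information with each other; a fuller MAS would add that interaction.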

8.6.2 Learning Approaches

Learning approaches try to overcome the problems related to the training data of the classification models used in many EOG studies. Another important challenge is improving automatic classification.

Two of these learning approaches are Transfer learning (TL) and Deep transfer learning. These are powerful methods that reuse previously trained models as the starting point. This avoids the need for a large training dataset and saves time without reducing the accuracy of the assessment. Figure 8.12 outlines TL from the source to the target domain [59]. The base model is trained on data from the source domain and then fine-tuned on data from the target domain, which completes the transfer of knowledge and makes EOG-based systems more reliable and accurate. With transfer learning, classification performance improves significantly in all learning cases compared with temporal models trained only on the target domain.

Fig. 8.12
A diagram outlines the TL from the source to the target domains. It includes source dataset, ML model for training, model of knowledge, ML model for fine-tuning, and target dataset.

Transfer learning from the source to the target domains
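The two-step flow of Fig. 8.12 can be sketched with a deliberately small model. The example below is an assumption-laden illustration rather than the method of [59]: a logistic regression stands in for the deep network, the one-dimensional amplitude data in `SOURCE` and `TARGET` are made up, and the function names are hypothetical. The point is only that fine-tuning starts from the source weights instead of from zero.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, weights=None, epochs=200, lr=0.5):
    """Stochastic gradient descent for logistic regression.

    Passing `weights` resumes training from a base model, which is the
    fine-tuning step of transfer learning. The last weight is the bias.
    """
    n = len(data[0][0])
    w = list(weights) if weights is not None else [0.0] * (n + 1)
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[-1])
            err = p - y
            for i in range(n):
                w[i] -= lr * err * x[i]
            w[-1] -= lr * err
    return w

def predict(w, x):
    return int(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[-1]) >= 0.5)

# Hypothetical source domain: plenty of data from other users.
SOURCE = [((0.9,), 1), ((0.8,), 1), ((0.1,), 0), ((0.2,), 0)]
# Hypothetical target domain: a new user with weaker signal amplitudes.
TARGET = [((0.0,), 0), ((0.1,), 0), ((0.5,), 1), ((0.6,), 1)]

base = train(SOURCE)                              # train the base model
tuned = train(TARGET, weights=base, epochs=20)    # short fine-tuning run
```

The fine-tuning run uses far fewer epochs than the base training, which mirrors the time saving described above.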

Because EOG-based interfaces have a limited ability to adapt to the characteristics of each user, a Reinforcement Learning (RL) algorithm can be included to adapt the interface to the user. The RL algorithm adapts the user’s commands to the responses of the EOG-controlled interface. This algorithm is usually implemented in serious computer games to moderate the intensity of user commands according to experience [60].

A model-free Q-learning method was proposed in [61] for planning robot motion from the user’s EOG signals while taking into account the obstacles surrounding the robotic platform. Figure 8.13 depicts the navigation approach in a simplified way.

Fig. 8.13
A flowchart depicts the navigation approach in a simplified way. It involves ocular event class, environment, robotic platform model, and reinforcement learning applied to robot motion planning.

Q-learning reinforcement learning algorithm applied to robot motion planning based on eye movements
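To make the idea of model-free Q-learning concrete, the following sketch trains a tabular agent on a small grid with one obstacle. It is not the planner of [61]: the 3 x 3 grid, the reward values, the hyperparameters and all names are assumptions chosen for illustration, with the four actions standing in for four ocular event classes.

```python
import random

# Hypothetical mapping from ocular event classes to grid moves (row, column).
ACTIONS = {"left": (0, -1), "right": (0, 1), "up": (-1, 0), "down": (1, 0)}
SIZE, START, GOAL, OBSTACLES = 3, (0, 0), (2, 2), {(1, 1)}

def step(state, action):
    """Environment: returns (next_state, reward, done)."""
    dr, dc = ACTIONS[action]
    nxt = (state[0] + dr, state[1] + dc)
    inside = 0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE
    if not inside or nxt in OBSTACLES:
        return state, -5.0, False   # collision: stay in place, penalty
    if nxt == GOAL:
        return nxt, 10.0, True
    return nxt, -1.0, False         # small cost for every ordinary move

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = START
        for _ in range(50):
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))
            else:
                action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q.get((nxt, a), 0.0) for a in ACTIONS)
            old = q.get((state, action), 0.0)
            # Q-learning update: move towards reward + discounted best next value.
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q

def greedy_path(q, max_steps=20):
    """Follow the learned policy from START without exploration."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path
```

The agent never uses a model of the grid; it only observes the reward and next state returned by `step`, which is what "model-free" means here. After `q = train()`, `greedy_path(q)` returns the sequence of cells the learned policy visits.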

Learning vector quantization (LVQ) is a supervised classification algorithm frequently used to recognise eye movements in EOG-based systems. LVQ is an artificial neural network that lets the designer choose how many prototype instances (codebook vectors) to keep and then learns what those prototypes should look like. EOG features serve as the training data used to build the recognition network. Although LVQ is not particularly powerful compared to other methods, it is simple and intuitive for the recognition of eye movements [62].
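A minimal sketch of the basic LVQ1 update rule is given below, assuming two-dimensional EOG features (for instance horizontal and vertical amplitude) and one prototype per class. The learning-rate schedule, the function names and the data used in the usage example are assumptions for illustration and are not taken from [62].

```python
import math

def nearest(prototypes, x):
    """Index of the prototype closest to x (Euclidean distance)."""
    return min(range(len(prototypes)), key=lambda i: math.dist(prototypes[i][0], x))

def train_lvq1(samples, prototypes, lr=0.3, epochs=20):
    """LVQ1: pull the winning prototype towards a sample of the same class,
    push it away from a sample of a different class."""
    protos = [(list(vec), label) for vec, label in prototypes]
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)  # linearly decaying learning rate
        for x, y in samples:
            vec, label = protos[nearest(protos, x)]
            sign = 1.0 if label == y else -1.0
            for i in range(len(vec)):
                vec[i] += sign * rate * (x[i] - vec[i])
    return protos

def classify(protos, x):
    """Label of the nearest prototype."""
    return protos[nearest(protos, x)][1]
```

A usage example with made-up feature values:

```python
samples = [((-0.8, 0.1), "left"), ((-0.9, -0.1), "left"), ((-0.7, 0.0), "left"),
           ((0.8, 0.0), "right"), ((0.9, 0.1), "right"), ((0.7, -0.1), "right")]
initial = [((-0.1, 0.0), "left"), ((0.1, 0.0), "right")]
protos = train_lvq1(samples, initial)
label = classify(protos, (-0.6, 0.0))
```

Only the prototypes are kept after training, so the number of prototypes chosen by the designer directly fixes the size of the model.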

8.7 Discussion

Considering the articles published in the last decade, the wavelet transform is the most useful technique for EOG signal denoising in the preprocessing stage, because this mathematical tool is well suited to transient and high-frequency phenomena. For EOG feature selection, CBFS removes redundant features and increases the precision and accuracy of the neural network-based classifier. EOG compression allows the signal to be transmitted using less data than the original signal. As a result, memory requirements are reduced, which is an important feature for large polysomnogram signals.
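As a reminder of how wavelet denoising works in its simplest form, the sketch below applies a single-level Haar transform, soft-thresholds the detail coefficients and reconstructs the signal. The Haar wavelet, the single decomposition level, the threshold value and the function names are simplifying assumptions for illustration; the studies reviewed in this chapter may use other wavelet families and several decomposition levels. The input length is assumed to be even.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(signal):
    """One-level Haar transform: pairwise averages and differences."""
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    approx = [(a + b) / SQRT2 for a, b in pairs]
    detail = [(a - b) / SQRT2 for a, b in pairs]
    return approx, detail

def haar_inverse(approx, detail):
    """Exact inverse of haar_forward."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out

def soft(value, threshold):
    """Soft thresholding: shrink towards zero, zeroing small coefficients."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)

def denoise(signal, threshold):
    """Keep the approximation, shrink the detail coefficients, reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, [soft(d, threshold) for d in detail])
```

Small detail coefficients mostly carry noise, whereas a saccade-like step between two sample pairs is carried by the approximation coefficients and therefore survives the thresholding.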

EOG signal classification can be done automatically using any conventional classification algorithm. K-NN is a typical supervised ML classification algorithm that combines good performance with simplicity. K-NN keeps the complete training dataset and compares each new sample against every stored point, which is why it requires more memory than other classifiers. Therefore, K-NN is recommended for small datasets with few features.

CNN is a very efficient classification method in EOG signal processing, especially for EOG-based HCIs. For large datasets, CNN yields models with significantly higher correlation coefficients than the traditional K-NN classification algorithm. The RL layer helps the user by selecting proper actions while learning from previous behaviours, for example, to prevent collisions of a robotic platform or to improve wheelchair navigation. Deep transfer learning can be used for sleep stage classification and modelling when only a relatively small amount of data is available.

The development of sophisticated AI-based models together with the availability of larger datasets will allow better interpretation of EOG. This will result in the design of more efficient systems that also present an improvement in the decision-making stage.

8.8 Conclusions

This chapter introduced and discussed the processing of electrooculographic signals, which is a challenging problem due to the wide variability in the morphology and features of electrooculograms within the population. The key aspect is to find the technique that offers the best overall performance in each of the basic signal processing stages: denoising, feature extraction, classification, and decision-making. Some applications require the processing of long electrooculograms lasting several hours to monitor the health status of patients. Such scenarios also call for powerful artificial intelligence-based techniques for classification and modelling, as well as signal compression for efficient decision-making and storage.