1 Introduction

Electrooculography (EOG) is a reliable, non-invasive technique that measures the standing electrical potential between the cornea and the retina, which changes as the eyeball rotates. By placing a pair of electrodes horizontally or vertically around the eyes, these potentials can be recorded; the transitions and magnitudes of the recorded potentials correspond to the rotation angle of the eyes [1]. EOG has therefore been widely explored in health applications such as wheelchair guidance [2], human-computer interfaces [3, 4], and fatigue detection [5]. For all of these applications, eye movement angle estimation is the fundamental step.

Various approaches have been proposed to estimate the eye movement angle. They can be roughly classified into physically-driven white-box methods and data-driven black-box methods. White-box methods model EOG eye movement recognition from the physical relationship between eye rotation and gaze location. Following this idea, Barbara et al. [6] proposed an eye movement angle fitting model that combines the EOG battery model [7] with the spatial geometry relating eye rotation to the gaze location angle. However, this model requires a substantial number of trigonometric function evaluations, which are typically computed either by Taylor expansion, demanding computing power, or by look-up tables, demanding storage capacity. As a result, training the model is resource-consuming.

In contrast, black-box methods take a data-driven approach to EOG eye movement angle modeling and tend to achieve higher accuracy than white-box methods. Studies under this idea typically establish regression models between the eye movement angle and the collected EOG signal; these models can be broadly divided into linear and non-linear ones. Barbara et al. proposed a linear model to estimate the eye movement angle from EOG signals [4]. However, other studies show that the relationship is linear only for eye movement angles within 45° and becomes non-linear beyond that [8,9,10,11]. Non-linear models offer higher accuracy but require more computing resources to train; linear models are simple and cheap to train, but their accuracy for larger eye movement angles is unsatisfactory. Focusing only on accuracy and interpretability while ignoring resource usage is not pragmatic for applying EOG signals in health application scenarios; the trade-off between resource consumption and accuracy remains an important open issue.

To address these issues, we propose an accurate and cost-effective eye movement angle estimation model based on non-linear polynomial regression. The model is simple, analytical, fast, and has few parameters. Unlike most existing methods with high computational complexity, it can easily be deployed on an embedded platform or mobile device for real-time eye movement angle estimation. To verify the feasibility of the proposed method, a series of large-range experiments (−50° to 50°) were conducted. The proposed model provides favorable accuracy with less computational time.

2 Experiment

2.1 Materials

In this work, EOG signals were recorded from 19 subjects aged 25 ± 4 years (9 males and 10 females), all healthy adults without strabismus or exophthalmos. All subjects understood and consented to the experimental procedure before the experiment. A polysomnography (PSG) device with a sampling frequency of 256 Hz, a 0.3–10 Hz band-pass filter, and a 50 Hz notch filter was used for EOG data acquisition.

The electrode configuration is shown in Fig. 1, with electrodes placed on the right side of the right eye socket (dot ‘1’) and on the left side of the left eye socket (dot ‘2’). A reference electrode (dot ‘Ref’) and a ground electrode (dot ‘GND’) were attached to the center of the forehead and to the left mastoid, respectively.

Fig. 1.

Electrode configuration. The dots mark the PSG electrode positions on the face: ‘1’ on the right side of the right eye socket and ‘2’ on the left side of the left eye socket; the reference (‘Ref’) and ground (‘GND’) electrodes are attached to the center of the forehead and to the left mastoid, respectively

2.2 Experimental Setup and Procedure

Before the start of the experiment, the subject sat upright in the experimental apparatus in a comfortable position, with arms resting naturally on the desk. The subject’s face was cleaned with wet wipes, and conductive-gel electrodes were then attached. The head was fixed by a bracket to reduce the impact of head shaking.

The experimenter helped the subject attach the electrodes, then configured and tested the PSG (used to acquire the EOG signals) and the acquisition program (used to guide the experimenter and subject and to mark the EOG signals). The experimenter then ran the EOG acquisition program and prompted the subject to make the corresponding saccades as required by the experiment. The saccade procedure is shown in Fig. 2, where the symbol Θ is the angle between the center point and the target point of the saccade. The saccade sequence is {0°, 10°, 0°, −10°, 0°, 20°, 0°, −20°, 0°, 30°, 0°, −30°, 0°, 35°, 0°, −35°, 0°, 38°, 0°, −38°, 0°, 40°, 0°, −40°, 0°, 42°, 0°, −42°, 0°, 44°, 0°, −44°, 0°, 46°, 0°, −46°, 0°, 48°, 0°, −48°, 0°, 50°, 0°, −50°, 0°}.

Figure 3 shows the experimental paradigm. At the beginning of the saccade procedure, the subject was asked to gaze at the center point (the 0° point). Then the subject was asked to make a saccade from 0° to 10° following the program’s audio prompt and to keep gazing at the 10° point for 3 s. At the same time, the program marked the saccade signals \({EOG}_{0to10}^{1}\) (the potential of electrode 1, saccade from 0° to 10°) and \({EOG}_{0to10}^{2}\) (the potential of electrode 2, saccade from 0° to 10°) for subsequent signal processing.

Before proceeding to the next step, the subject could take a short break to relax the eyes, blink, etc. The purpose was to reduce eye discomfort during the experiment and ensure the quality of the acquired data. After the subject consented to continue, the experiment resumed: the subject was asked to gaze at the 10° point again and complete the next saccade (from 10° to 0°).

The subject repeated the above process until the whole saccade sequence had been completed. Finally, we obtained all saccade EOG signals from one subject (\({EOG}_{0to10}^{1}\), \({EOG}_{10to0}^{1}\), \({EOG}_{0to-10}^{1}\), \({EOG}_{-10to0}^{1}\), …, \({EOG}_{0to50}^{1}\), \({EOG}_{50to0}^{1}\), \({EOG}_{0to-50}^{1}\), \({EOG}_{-50to0}^{1}\) and \({EOG}_{0to10}^{2}\), \({EOG}_{10to0}^{2}\), \({EOG}_{0to-10}^{2}\), \({EOG}_{-10to0}^{2}\), …, \({EOG}_{0to50}^{2}\), \({EOG}_{50to0}^{2}\), \({EOG}_{0to-50}^{2}\), \({EOG}_{-50to0}^{2}\)).

In addition, another experimenter observed the subject’s eye movements and recorded abnormalities (blinks, wrong saccades, etc.) in the experiment log. These abnormal segments were excluded when the EOG data were processed.

Fig. 2.

Illustration of the eye saccade experiment. The saccade sequence is {0°, 10°, 0°, −10°, 0°, 20°, 0°, −20°, 0°, 30°, 0°, −30°, 0°, 35°, 0°, −35°, 0°, 38°, 0°, −38°, 0°, 40°, 0°, −40°, 0°, 42°, 0°, −42°, 0°, 44°, 0°, −44°, 0°, 46°, 0°, −46°, 0°, 48°, 0°, −48°, 0°, 50°, 0°, −50°, 0°}.

Fig. 3.

Illustration of the experimental paradigm.

3 Methodology

3.1 EOG Signal Preprocessing

Before building the model, the raw data need to be preprocessed; the preprocessing flowchart is shown in Fig. 4. We extracted the data between the begin-mark and the end-mark of each EOG sample as saccade events. We then excluded the abnormalities recorded in the experiment log (32 recordings from 18 subjects, plus all data from one male subject whose signal was completely distorted by bad electrode placement). After the data were captured, the measured EOGs were manually examined with the aid of wavelet-transform denoising [12] and visual observation [13]; segments clearly containing large noise components, such as blinks or gazes at the wrong target position, were excluded (45 recordings from 18 subjects).
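
For reference, the snippet below sketches one common wavelet-denoising variant (soft-thresholding of the detail coefficients with the universal threshold). It uses PyWavelets and a ‘db4’ wavelet, both of which are our assumptions; the paper does not specify its exact wavelet or threshold rule.

```python
import numpy as np
import pywt  # PyWavelets

def wavelet_denoise(signal, wavelet="db4", level=4):
    """Soft-threshold wavelet denoising (one common variant; the exact
    wavelet and threshold rule used in the paper are not specified)."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # Noise level estimated from the finest detail coefficients.
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thresh = sigma * np.sqrt(2 * np.log(len(signal)))
    coeffs[1:] = [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(signal)]
```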

To build a simple model between the absolute eye movement angle \(\theta \) and EOG information, we define the value \({\Delta EOG}_{\theta }\):

$$\Delta {EOG}_{\theta }=\max \left(\left|{EOG}_{\theta }^{1}-{EOG}_{\theta }^{2}\right|\right)$$
(1)

where \({\Delta EOG}_{\theta }\) is the maximum absolute value of the difference between the two electrodes’ EOG signals during a saccade with absolute eye movement angle \(\theta \).
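
Eq. (1) amounts to a one-line computation; a minimal sketch in Python/NumPy (the array names are hypothetical) is:

```python
import numpy as np

def delta_eog(eog1, eog2):
    """Eq. (1): maximum absolute value of the two-electrode difference
    signal over one saccade segment. `eog1` and `eog2` are hypothetical
    1-D arrays of electrode-1 and electrode-2 samples."""
    return np.max(np.abs(np.asarray(eog1) - np.asarray(eog2)))
```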

In this work, the four saccades of the same magnitude (e.g., \({EOG}_{0to10}^{1}\), \({EOG}_{10to0}^{1}\), \({EOG}_{0to-10}^{1}\), \({EOG}_{-10to0}^{1}\)) were assigned the same absolute eye movement saccade angle \(\theta \) (e.g., 10°) to extend the data set. Hence, for one subject, each eye movement angle \(\theta \) has 4 absolute eye movement samples. As a result, we obtained 44 \({\Delta EOG}_{\theta }\) samples over 11 absolute eye movement angle targets (\(\theta \) = {10°, 20°, 30°, 35°, 38°, 40°, 42°, 44°, 46°, 48°, 50°}) from each of the 18 subjects.
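
A sketch of this pooling step follows, reusing delta_eog from the sketch above; the `segments` container and its key layout are hypothetical, not specified in the paper.

```python
import numpy as np

ANGLES = [10, 20, 30, 35, 38, 40, 42, 44, 46, 48, 50]

def pool_saccades(segments):
    """Pool the four directed saccades of each magnitude (e.g. 0->10,
    10->0, 0->-10, -10->0) under one absolute angle label. `segments`
    is assumed to map (saccade key, electrode number) to a 1-D signal
    array; this layout is an assumption."""
    xs, ys = [], []
    for theta in ANGLES:
        for a, b in [(0, theta), (theta, 0), (0, -theta), (-theta, 0)]:
            key = f"{a}to{b}"
            e1, e2 = segments[(key, 1)], segments[(key, 2)]
            xs.append(delta_eog(e1, e2))  # Eq. (1), sketched above
            ys.append(theta)
    return np.array(xs), np.array(ys)  # 44 samples, 11 angle targets
```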

In real life, human activity gives rise to unpredictable situations. Excluding outliers manually is not sufficient, so automatic methods are needed to assist in processing the data.

To build a robust model, outlier exclusion methods were applied to the training set before training the model: the 3σ criterion (Pauta criterion) [14] and MAD (median absolute deviation) [15]. These methods further ensure that the training data do not deviate too far from normal values.
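
Minimal sketches of the two criteria follow (Python/NumPy). The MAD cutoff of 3 and the Gaussian consistency factor 1.4826 are common defaults, not values stated in the paper; how the criteria are grouped (e.g., per angle target) is likewise not specified.

```python
import numpy as np

def keep_3sigma(x):
    """Pauta (3-sigma) criterion: keep samples within mean +/- 3*std."""
    x = np.asarray(x, dtype=float)
    mu, sd = x.mean(), x.std()
    return x[np.abs(x - mu) <= 3 * sd]

def keep_mad(x, cutoff=3.0):
    """MAD criterion: keep samples with a small robust z-score. The
    1.4826 factor makes MAD consistent with the standard deviation for
    Gaussian data; the cutoff is a common default (an assumption)."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = 1.4826 * np.median(np.abs(x - med))
    return x[np.abs(x - med) <= cutoff * mad]
```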

Fig. 4.

The flowchart of signal preprocessing.

3.2 Polynomial Fitting Eye Movement Angle Estimation Model

Traditional eye movement angle estimation models assume a linear relationship between the eye movement angle \(\theta \) and the EOG. However, further studies point out that this relationship is not fully linear, but only approximately linear within a certain range. In this work, we build a polynomial model to represent this incompletely linear relationship.

Denote the model as:

$$\widehat{\theta }(i)=f({\Delta EOG}_{\theta ,i}, \mathbf{w})$$
(2)
$$={w}_{0}+{w}_{1}\cdot {\Delta EOG}_{\theta ,i}+{w}_{2}\cdot {{\Delta EOG}_{\theta ,i}}^{2}+\cdots +{w}_{k}\cdot {{\Delta EOG}_{\theta ,i}}^{k} , k\in {N}^{+}$$
(3)

where \(\widehat{\theta }(i)\) is the \(i\)th angle predicted by the absolute eye movement angle estimation model, \({\Delta EOG}_{\theta ,i}\) is the \(i\)th \({\Delta EOG}_{\theta }\) training sample, \(\mathbf{w}=\left[ {w}_{0}, {w}_{1},{w}_{2},\cdots ,{w}_{k}\right]\) is the weight coefficient vector of the polynomial model, and \(k\) is the order of the polynomial model.

Denote the loss function as:

$$Loss=\sum\nolimits_{i=1}^{n}{[\widehat{\theta }\left(i\right)-\theta \left(i\right)]}^{2}$$
(4)

where \(\theta \left(i\right)\) is the \(i\)th true target of the training data and \(n\) is the number of training samples.

The problem of obtaining the optimal model is equivalent to solving the following equation:

$$\sum\nolimits_{i=1}^{n}{[\widehat{\theta }\left(i\right)-\theta \left(i\right)]}^{2}\to \min$$
(5)

Solving this least-squares problem yields the optimal weight coefficient vector \(\mathbf{w}\) and thus establishes the absolute eye movement angle estimation model. The number of parameters in this polynomial model is \(k+1\).
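
As Eq. (5) is an ordinary least-squares problem, it can be solved in closed form. A minimal sketch with NumPy’s polynomial fitting (the authors used MATLAB; the variable names here are hypothetical):

```python
import numpy as np

def fit_angle_model(delta_eog_values, theta_values, k=3):
    """Fit the k-order polynomial of Eqs. (2)-(3) by least squares
    (Eq. (5)); returns w = [w_0, ..., w_k] in ascending degree order."""
    return np.polynomial.polynomial.polyfit(delta_eog_values, theta_values, deg=k)

def predict_angle(w, delta_eog_values):
    """Evaluate w_0 + w_1*x + ... + w_k*x^k at new Delta-EOG values."""
    return np.polynomial.polynomial.polyval(delta_eog_values, w)
```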

4 Results

Leave-one-subject-out cross-validation was used to evaluate the performance of the eye movement estimation model. Both the validation and the modeling methods were implemented in MATLAB R2021a. All model training and testing were conducted on an Intel Core i5-9400F CPU, 8 GB DDR4 RAM, and a GTX 1650 GPU under 64-bit Windows 10.

MAE (Mean Absolute Error) and RMSE (Root Mean Squared Error) are used to evaluate the performance of the model.

$$MAE=\frac{1}{n}\sum\nolimits_{i=1}^{n}\left|\widehat{\theta }\left(i\right)-\theta \left(i\right)\right|$$
(6)
$$RMSE= \sqrt{\frac{1}{n}\sum\nolimits_{i=1}^{n}{(\widehat{\theta }\left(i\right)-\theta \left(i\right))}^{2}}$$
(7)

where \(n\) is the number of test samples, \(\widehat{\theta }\left(i\right)\) is the \(i\)th absolute eye movement angle predicted by the model on the test set, and \(\theta \left(i\right)\) is the \(i\)th true target in the test set.
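
Eqs. (6) and (7) translate directly into code (a Python/NumPy sketch):

```python
import numpy as np

def mae(pred, true):
    """Eq. (6): mean absolute error."""
    return np.mean(np.abs(pred - true))

def rmse(pred, true):
    """Eq. (7): root mean squared error."""
    return np.sqrt(np.mean((pred - true) ** 2))
```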

4.1 The Result of the Proposed Method

In this method, all ΔEOG data with angle targets are grouped by subject (18 subjects in total). Each subject in turn is held out as the test set, with all remaining subjects forming the training set.
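
A sketch of this leave-one-subject-out loop, reusing the fitting and metric helpers above (the mapping from subject id to (ΔEOG, θ) arrays is our assumed data layout, not one specified in the paper):

```python
import numpy as np

def loso_cv(data, k=3):
    """Leave-one-subject-out cross-validation. `data` is assumed to map
    a subject id to its (delta_eog, theta) arrays; this layout is
    hypothetical."""
    scores = []
    for test_subj in data:
        train = [v for s, v in data.items() if s != test_subj]
        train_x = np.concatenate([x for x, _ in train])
        train_y = np.concatenate([y for _, y in train])
        test_x, test_y = data[test_subj]
        w = fit_angle_model(train_x, train_y, k=k)  # sketched above
        pred = predict_angle(w, test_x)
        scores.append((mae(pred, test_y), rmse(pred, test_y)))
    return np.array(scores)  # one (MAE, RMSE) row per held-out subject
```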

Figure 5 shows how the performance of the polynomial model changes from order 1 to order 9. Performance improves significantly with increasing order up to the 3rd order and peaks around the 3rd or 4th order. Beyond that, performance no longer improves with increasing order, while computational resource consumption and the number of parameters continue to grow.
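
The order sweep in Fig. 5 could be reproduced with the helpers above (a usage sketch; `data` is the hypothetical subject mapping from the previous snippet, and the timing call is our choice):

```python
import time

for k in range(1, 10):                       # orders 1 through 9
    t0 = time.perf_counter()
    scores = loso_cv(data, k=k)
    dt = time.perf_counter() - t0
    print(f"order {k}: mean MAE {scores[:, 0].mean():.2f}, "
          f"mean RMSE {scores[:, 1].mean():.2f}, time {dt:.4f} s")
```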

Fig. 5.

Performance and running time of the polynomial model from order 1 to order 9. (a) Mean MAE and RMSE values for different orders. (b) Training time for different orders.

Table 1 shows the best results of the leave-one-subject-out method. The model performs best when the polynomial order is 3 or 4, with 4 and 5 parameters respectively. To evaluate the model accurately and reduce the interference of random errors, the mean and standard deviation over the 18 subjects’ results are reported.

Table 1. The results of the eye movement angle estimation model

Comparing the two orders, the 3-order model is slightly better than the 4-order model when no outlier exclusion method is used. With outlier exclusion, the performance of both the 3-order and 4-order models improves slightly. This also implies that the model is robust even when some outliers exist.

4.2 Comparison with Linear and Some Non-linear Methods

Table 2 shows the speed and accuracy of the polynomial models. The linear model can be regarded as a 1-order polynomial model, and the Fourier model is fitted with cosine and sine functions. As shown in Table 2, the 3-order polynomial model achieves better performance than both the linear and the other non-linear methods.

Table 2. Comparison with some other modeling methods
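
For reference, a one-term Fourier series is a common form of such a model. The sketch below uses SciPy; whether it matches the paper’s exact variant, and the initial guess `p0`, are our assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def fourier1(x, a0, a1, b1, w):
    """One-term Fourier series: a0 + a1*cos(w*x) + b1*sin(w*x)."""
    return a0 + a1 * np.cos(w * x) + b1 * np.sin(w * x)

# Hypothetical usage on pooled training data:
# params, _ = curve_fit(fourier1, train_x, train_y, p0=[0.0, 1.0, 1.0, 0.01])
```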

4.3 Compared with the Existing Works

Barbara et al. proposed a physically-driven, white-box, explicit electrical battery model for eye movement angle estimation [6]. The MAE of the angle estimated by Barbara’s model is 2.42 ± 0.91°, which is better than ours (3.50 ± 0.72°). Compared with our model, however, the battery model is subject-dependent: it requires the distance between the subject’s face plane and the target plane, whereas ours does not. Barea et al. proposed an electrooculographic eye model based on wavelet transform and neural networks, with an error of less than 2° during long periods of use [16]; however, there is a 250 ms lag between an eye movement and its confirmation. In contrast, the model proposed in this paper is designed to be deployed on embedded platforms or mobile devices with limited computing power and limited storage space.

5 Conclusion

In this paper, a non-linear polynomial eye movement angle estimation model is proposed. With the optimal 3-order model, the estimation error in angle is less than 3.5° over a large range from −50° to 50°. The model is simple, analytical, and fast, with no more than 5 parameters. A single model training takes as little as about 0.008 s on an Intel Core i5-9400F CPU with 8 GB DDR4 RAM and a GTX 1650 GPU. Experimental results in realistic scenarios across 18 subjects show that the proposed model achieves favorable performance in terms of both accuracy and resource cost. Consequently, it can easily be deployed on embedded platforms or mobile devices with limited computing power and storage space, enabling real-time eye movement angle estimation for EOG-related healthcare applications. It is worth noting, however, that this paper is preliminary research offering a novel and accurate model for eye movement angle estimation; only the horizontal eye movement angle was estimated. In future research, we will collect both horizontal and vertical eye movement data to build a comprehensive eye movement angle estimation model, and deploy the model in a hardware system to realize real-time estimation.