1 Introduction

Mental fatigue is a feeling of weariness and exhaustion accompanied by reduced energy, reduced activeness and declining cognitive competence (Borghini et al. 2014). Even though there is no precise definition of mental fatigue, it can be best understood as a feeling of tiredness, low arousal and low energy level. Motor imagery (MI) is the mental simulation of a movement without actually performing the movement or even activating the muscles. Such tasks, if carried out repeatedly over a long period, become monotonous and require much cognitive effort to maintain the vigilance level. In MI Brain Computer Interfaces (BCIs), significant cognitive effort is required to concentrate on motor imagery tasks, and hence the signal features are heavily affected by mental state, attention level, fatigue and arousal. Loss of attention and a declining arousal level due to mental fatigue can significantly degrade the signal features and consequently decrease BCI system performance (Cao et al. 2014). Investigating the inter-relationship between mental fatigue and motor imagery is therefore important for MI based BCIs.

The literature has discussed various means to analyse and estimate mental fatigue, the simplest and most convenient being self report ratings (Pomer-Escher et al. 2014), for which various questionnaires and scales are used. Additionally, observed behaviour while performing a particular task, response time and response accuracy can also be used to estimate mental fatigue (Trejo et al. 2015). Response time is the time taken to react to a given stimulus or event. Response accuracy is how close a response is to its correct value. These measures can assess mental fatigue subjectively. There is a need for objective measures that can evaluate, monitor and predict the growth of fatigue with time.

Mental fatigue is known to alter brain activity. With the increase in mental fatigue, EEG spectral power increases in the delta (δ), theta (𝜃), alpha (α) and beta (β) bands (Cao et al. 2014). Increased theta power signifies a decrement in arousal level, information encoding and working memory, while alpha and beta power increase with the effortful attention and alertness that fatigued participants need to maintain the vigilance level. When a participant experiences fatigue, drowsiness and a decreased arousal level elevate the delta power (Cao et al. 2014). Borghini et al. (2014) summarised the correlation of different neurological signals (EEG, electrooculogram (EOG) and heart rate) with three different cognitive states: mental workload, mental fatigue and situational awareness. They illustrated how these different neurological signals can be associated in estimating the aforesaid cognitive states. Cao et al. (2014) presented a method to estimate mental fatigue in steady state visually evoked potential (SSVEP) based BCIs. They explored the correlation of EEG indices (amplitudes in the δ, 𝜃, α and β frequency bands), their ratio indices (𝜃/α, (𝜃 + α)/β) and SSVEP properties (amplitudes and signal to noise ratio changes) with the growth of mental fatigue. EEG spectral features (power in the δ, 𝜃, α1, α2 and β bands) were also analysed by Craig et al. (2012) to estimate mental fatigue in different areas of the scalp during simulated driving. Liu et al. (2010) employed two EEG features, approximate entropy and Kolmogorov complexity, to estimate and evaluate mental fatigue during three different cognitive tasks. Pomer-Escher et al. (2014) presented 14 different EEG spectral indices to estimate mental fatigue. Chai et al. (2016) also presented an approach based on power spectral density and a Bayesian classifier. Roy et al. (2014) used a common spatial pattern filter with a Fisher linear discriminant classifier to estimate mental fatigue. Autoregressive models (Zhao et al. 2011; Chai et al. 2017a, b) and the wavelet transform (Kar et al. 2010) have also been used to estimate mental fatigue. The aforesaid studies analyzed the difference in fatigue levels before the beginning and after completion of the desired task instead of tracking the growth of fatigue with time. The study reported here monitors the growth of fatigue with time using spectral powers and spectral entropy as features. Spectral entropy is used here to investigate the change in the regularity of the EEG signal with the rise in fatigue level.

The literature has reported a number of approaches to track the growth of fatigue with time. Jap et al. (2009) and Borghini et al. (2012) put forward approaches showing the increasing and decreasing trends of different EEG spectral indices. However, neither study quantitatively related these indices to levels of mental fatigue (Charbonnier et al. 2016). Charbonnier et al. (2016) proposed an EEG index to monitor fatigue level, which compares the current EEG spectral content to the EEG spectral content recorded in an initial state when the participant was not fatigued. Trejo et al. (2015) presented a method to estimate the growth of fatigue using the non-linear Kernel Partial Least Square (KPLS) algorithm. The method tracked the rise in fatigue using two spectral powers: theta power at Fz and alpha power at Pz. KPLS took these two spectral powers as input and computed a fatigue score, with a higher score implying greater fatigue.

Monitoring the growth of fatigue with time holds promise as it would help in better understanding the fatigue problem, on the basis of which its influence on vigilance tasks can be investigated. This may eventually lead to the design of adaptive BCIs that are more robust to changes in fatigue level. Our study presents an approach to analyze the inter-relationship between fatigue and motor imagery using EEG, aiming to test whether a prolonged motor imagery session induces mental fatigue and whether mental fatigue affects motor imagery EEG separability. To the best of our knowledge, such analysis has rarely been done in the literature. Myrden and Chau (2015) investigated the effects of three mental states, i.e., fatigue, frustration and attention, on BCI performance and showed that moderate fatigue and moderate frustration improve BCI performance. The three mental states were estimated through self report. Another work by Rozand et al. (2016) used EMG to investigate whether motor imagery induces mental fatigue that in turn increases the imagined movement duration. Mental fatigue was estimated through self report while EMG activities of the biceps brachii and triceps brachii muscles of the right arm were analysed at rest and during motor imagery.

Our study aims to monitor how the level of fatigue changes while performing motor imagery and how this in turn affects motor imagery performance. The fatigue analysis has been carried out in five different areas of the scalp: frontal, parietal, temporal, central and occipital. The aim is to investigate the regional brain activity changes associated with fatigue during motor imagery. Features that showed statistically significant changes between the first and last runs were then selected for analysing mental fatigue. The advantage is that this examines where and what changes have occurred with the rise in fatigue level, i.e., which features from which part of the brain were affected by the growth of fatigue. Selecting different features from different areas of the scalp is informative for exploring the EEG correlates of the rise in fatigue level and may improve the monitoring of a person’s fatigue level. The growth of fatigue was monitored using the Kernel Partial Least Square (KPLS) algorithm (Rosipal and Trejo 2001). The effect of fatigue on motor imagery performance has been analysed from the signal feature perspective. Common spatial patterns were used to extract features and KPLS-mRMR (Talukdar et al. 2018) to select features. The performance was measured in terms of class separability of motor imagery EEG features. Four different class separability metrics were used: Davies Bouldin Index (DBI) (Davies and Bouldin 1979), Dunn’s index (DI) (Dunn 1973), Fisher score (FS) (Duda et al. 1973) and Mutual information (mi) (Cover and Thomas 2012). Higher class separability was considered as better motor imagery performance.

The rest of the paper is organized as follows. Experimental protocol and methods for EEG analysis are described in Sections 2 and 3 respectively. Section 4 presents experimental results while Section 5 discusses the findings. Finally, Section 6 concludes the paper.

2 Experimental protocol

2.1 Participants

Data was collected from 11 individuals at University of Essex, England. All the participants were either students or staff of the aforesaid university. The participants gave their informed consent using a form approved by the Ethics Committee of University of Essex and were paid for their participation. The sample included 5 males and 6 females with a mean age of 29.3 years (SD = 7.4, range = 20–42 years, male mean age = 33 years and female mean age = 26.2 years). The participants were asked to have a good sleep before the experiment. As reported by them, the mean hours of sleep was found to be 6.8 h and none of them had any sleep disorder.

2.2 Experimental procedure

At the beginning of the experiment, the participants were a) given an orientation to the study; b) asked to read and sign an informed consent form; c) asked to complete a brief demographic questionnaire (name, age, gender, employment status, hours of sleep) and assigned an ID; and d) asked to practice the motor imagery tasks for 5 min.

A short break was provided before the experiment and then the participants were prepared for data collection. The participants were asked to complete the pre-test self report measures: Visual Analogue Scale - Fatigue (VAS-F) (Lee et al. 1991) and Chalder Fatigue Scale (CFS) (Cella and Chalder 2010). The 13 items of VAS-F (item no: 1, 2, 3, 4, 5, 11, 12, 13, 14, 15, 16, 17, 18) and 6 items of CFS (item no: 2, 3, 5, 7, 8, 10) that relate to the subjective experience of fatigue were used in this study. Thereafter, the participants were asked to perform the motor imagery tasks for one complete session. Each session comprises 8 runs, and each run is 12 min in length and comprises 80 trials. No extra breaks were provided between the runs. At the end of each run, the fatigue state was rated using a continuous “fatigue scale” (see Appendix A.3). The “fatigue scale” has been introduced as a subjective scale with values from 1 to 5 that extends between two extremes (1 = “Least fatigued” and 5 = “Most fatigued”). The participant circled the number along the scale that best represented how they felt regarding fatigue. The circled number was taken as the fatigue score of that particular run. Finally, after the experiment ended, the participants completed the post-test self report using VAS-F and CFS.

2.3 Motor imagery tasks

The participants were asked to perform 4 different motor imagery tasks: left hand movement (Class 1), right hand movement (Class 2), both feet movement (Class 3) and tongue movement (Class 4). During each trial, a fixation cross appeared on the computer screen at the beginning (t = 0 s) along with a short acoustic warning tone that asked the participants to get ready for the task. After 2 s, a cue in the form of a circle appeared either left, right, down or up of the fixation cross (indicating the imagination of left hand, right hand, both feet and tongue movement respectively), which instructed the participant what motor imagery task to perform. The participants performed the indicated task until the cue and the fixation cross disappeared from the screen at t = 6 s. Each trial then ended with a 3 s break.

The paradigm is illustrated in Fig. 1.

Fig. 1 The experimental paradigm

The participants carried out the experiment until either they quit due to extreme fatigue or they completed all the 8 runs. Five out of the 11 participants could not complete the whole experiment. As depicted in Table 1 all the participants completed at least 5 runs out of the 8 runs and 6 participants completed all the 8 runs.

2.4 EEG recording

The participants were seated approximately 80 cm from an LCD screen on which the stimuli for the motor imagery task were presented. The Biosemi Active Two System was used to record the EEG data from 64 channels (Fig. 2, electrodes marked in grey were used in this study) following the 10–20 international montage system, at a sampling frequency of 256 Hz. Artefacts were removed from the EEG data using EAWICA (Mammone and Morabito 2014). The EEG data were then low-pass filtered (40 Hz) and re-referenced to the common average. From each 9 s epoch, the data from t = 0 s to t = 6 s were processed, excluding the final 3 s break.

Fig. 2 EEG channels used in the study following the 10–20 international montage system

Table 1 Number of participants in each run

3 Methods for EEG analysis

The framework to analyse mental fatigue-motor imagery inter-relationship using EEG is portrayed in Fig. 3. The first block “Fatigue analysis during MI” carries out optimization to get the best feature vector for fatigue analysis, followed by monitoring the growth of fatigue using the optimized feature vector and finally estimating the trends of the growth of fatigue. The other two blocks analyze the effect of fatigue on motor imagery. The block “Training phase to extract optimal spatio-temporal patterns and then to select the best features” acts on the training data to extract the optimal spatio-temporal patterns and then carry out feature selection to identify the best features of motor imagery. The block “Evaluation phase to analyse the effect of fatigue on motor imagery” evaluates the effect of fatigue on motor imagery class distributions. The approach for analyzing MI-fatigue inter-relationship consists of four key phases: a. Optimization to get the best feature vector for fatigue analysis, b. Monitoring the change in fatigue level using KPLS algorithm, c. Estimation of trends of growth of fatigue and d. Analysing the effect of mental fatigue on motor imagery from signal feature perspective.

Fig. 3 Framework for analyzing inter-relationship between MI and mental fatigue

3.1 Optimization to get the best feature vector for fatigue analysis

Optimization of the feature vector was carried out on the first and last runs. The EEG data were subjected to the Fast Fourier Transform (using a Hanning window) to obtain the spectral power and spectral entropy in the following frequency bands: delta (0.1–3.5 Hz), theta (4–7.5 Hz), alpha (8–12 Hz) and beta (13–35 Hz) from five different areas of the scalp: frontal (F1, F3, F5, F7, Fz, F2, F4, F6, AFz, AF3, AF4 and FPz), parietal (P1, P3, P5, P7, Pz, P2, P4, P6, POz, PO3, PO4 and CPz), temporal (FT8, T8, TP8, FT7, T7 and TP7), central (C1, C3, C5, Cz, C2, C4 and C6) and occipital (O1, O2 and Oz). The EEG indices used as features for analysing mental fatigue were thus spectral power and spectral entropy. Spectral entropy can be defined as a generic measure of system disorganization (Ekštein and Pavelka 2004). These two types of features, computed in all the four bands during the first and last runs, were averaged across all the 11 participants and the average values obtained were compared. Features that showed statistically significant changes between the first and last runs were then selected for analysing mental fatigue.
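The band-wise feature extraction described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the function name `band_features`, the use of mean power over the FFT bins of a band, and the normalisation of spectral entropy to the range 0 to 1 are all assumptions, since the text does not specify these details.

```python
import numpy as np

# Band limits in Hz, as stated in the text.
BANDS = {"delta": (0.1, 3.5), "theta": (4.0, 7.5),
         "alpha": (8.0, 12.0), "beta": (13.0, 35.0)}

def band_features(signal, fs=256.0):
    """Return {band: (spectral_power, spectral_entropy)} for one channel."""
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(signal.size)                 # Hanning taper before the FFT
    psd = np.abs(np.fft.rfft(signal * window)) ** 2  # unnormalised power spectrum
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    features = {}
    for name, (lo, hi) in BANDS.items():
        p = psd[(freqs >= lo) & (freqs <= hi)]
        power = p.mean()                  # assumption: mean power over the band's bins
        prob = p / p.sum()                # band spectrum treated as a distribution
        nz = prob[prob > 0]
        # assumption: Shannon entropy normalised by log(number of bins)
        entropy = -np.sum(nz * np.log(nz)) / np.log(p.size)
        features[name] = (power, entropy)
    return features

# Hypothetical input: a 6 s, 10 Hz oscillation with a little noise, sampled at 256 Hz.
rng = np.random.default_rng(0)
t = np.arange(0, 6, 1 / 256.0)
demo = band_features(np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size))
```

For this synthetic signal the alpha band carries the largest power and a lower entropy than the noise-only beta band, which matches the reading of spectral entropy as a measure of disorganization.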

3.2 Monitoring the change in fatigue level using Kernel Partial Least Square (KPLS)

The optimized feature vector was used to track the growth of fatigue while performing motor imagery. Monitoring of fatigue level was then conducted using KPLS algorithm that provided a score of mental fatigue (Talukdar and Hazarika 2016). KPLS is a non-linear regression method based on the projection of input (explanatory) variables to the latent vectors (components) (Rosipal and Trejo 2001). It is the non-linear variant of Partial Least Square (PLS) that computes uncorrelated latent vectors, the combinations of the original regressors (Rosipal and Trejo 2001). A least square regression is then performed that gives the regression coefficients.

In this study, KPLS takes as input two matrices X and Y, with X being the set of predictors and Y being the set of response variables. Y is a vector of −1 or +1 representing two classes: active state and fatigue state. X is an n × m matrix where n is the number of observations and m is the size of the optimized feature vector. The approach of using KPLS to track the growth of fatigue was employed on the 6 participants (out of 11) who completed all the 8 runs. KPLS consists of two key phases: model selection and model prediction. During model selection, the optimal number of KPLS components was investigated, while during model prediction the fatigue score for each trial was estimated. The scores are the estimates/predictions of Y, which are computed by projecting X onto the KPLS regression coefficients. This study terms these scores KPLS scores. For each of the 6 participants, the mean of the KPLS scores of each run was computed and interpreted as the fatigue score of that run. The idea of using KPLS to track the growth of fatigue closely follows Trejo et al. (2015).
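A compact sketch of kernel PLS regression with a single ±1 response, in the spirit of Rosipal and Trejo (2001), may help to make the scoring step concrete. It is an illustration under stated assumptions rather than the study's code: the RBF kernel, its `gamma` value, the number of components and the synthetic two-feature data are all hypothetical, and the function names are introduced here only for the example. With a single response, the NIPALS inner loop reduces to taking the deflated response as the weight vector, which is what the loop below does.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix between rows of A and rows of B (assumed kernel)."""
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kpls_fit(K, y, n_components):
    """Fit kernel PLS for one response; returns dual coefficients and mean of y."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                        # centre the kernel in feature space
    y_mean = y.mean()
    yc = (y - y_mean).astype(float)
    Kd, yd = Kc.copy(), yc.copy()
    T = np.zeros((n, n_components))
    U = np.zeros((n, n_components))
    for a in range(n_components):
        u = yd / np.linalg.norm(yd)       # single response: u is the deflated y
        t = Kd @ u
        t /= np.linalg.norm(t)            # latent score vector of unit norm
        T[:, a], U[:, a] = t, u
        P = np.eye(n) - np.outer(t, t)
        Kd = P @ Kd @ P                   # deflate the kernel
        yd = yd - t * (t @ yd)            # deflate the response
    alpha = U @ np.linalg.solve(T.T @ Kc @ U, T.T @ yc)
    return alpha, y_mean

def kpls_predict(K_new, K_train, alpha, y_mean):
    """Predict scores for new samples given their kernel against training data."""
    n = K_train.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    ones_new = np.ones((K_new.shape[0], n)) / n
    return (K_new - ones_new @ K_train) @ J @ alpha + y_mean

# Hypothetical data: 20 "active" trials (-1) and 20 "fatigue" trials (+1).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(1.0, 0.1, (20, 2))])
y = np.r_[-np.ones(20), np.ones(20)]
K = rbf_kernel(X, X, gamma=1.0)
alpha, y_mean = kpls_fit(K, y, n_components=2)
scores = kpls_predict(K, K, alpha, y_mean)              # one KPLS score per trial
run_scores = [scores[:20].mean(), scores[20:].mean()]   # mean score per run
```

Averaging the per-trial scores within a run, as in the last line, mirrors how the run-level fatigue score is described in the text.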

3.3 Estimation of trends of growth of fatigue

The subjective scores rated by the participants at the end of each run using the “fatigue scale” were used to validate the growth of fatigue obtained by the KPLS algorithm. The trends of the KPLS scores and the subjective scores were estimated using the Centered Moving Average algorithm, and the similarity between the estimated trends was measured by the Pearson Correlation Coefficient.
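The trend comparison can be illustrated with a short sketch. The window length of 3 and the per-run values below are hypothetical, as the text does not state them; only full windows are kept, so the trend is shorter than the input series.

```python
import math

def centered_moving_average(values, window=3):
    """Centered moving average with an odd window; keeps only full windows."""
    if window % 2 == 0 or window > len(values):
        raise ValueError("window must be odd and no longer than the series")
    half = window // 2
    return [sum(values[i - half:i + half + 1]) / window
            for i in range(half, len(values) - half)]

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# Hypothetical per-run values for the 8 runs of one participant.
kpls_scores = [-0.8, -0.5, -0.6, -0.1, 0.2, 0.1, 0.6, 0.9]
subjective = [1, 2, 2, 3, 3, 4, 4, 5]
r = pearson(centered_moving_average(kpls_scores),
            centered_moving_average(subjective))
```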

3.4 Analysing the effect of mental fatigue on motor imagery from signal feature perspective

The effect of mental fatigue on motor imagery was then analysed from the signal feature perspective to investigate how motor imagery signal features vary with the rise in fatigue level. The analysis consists of two key phases: a) a training phase (last block of Fig. 3) to obtain the optimal spatio-temporal patterns and then the optimal features of the motor imagery data, and b) an evaluation phase (middle block of Fig. 3) to analyse the effect of fatigue on motor imagery class separability. The evaluation of the change in MI EEG separability with the rise in fatigue level was carried out in two different scenarios. First, the score of mental fatigue obtained for each run was quantized to two levels, low fatigue and high fatigue, using K-means clustering. K-means takes as input the highest fatigue score for the high fatigue level class and the lowest fatigue score for the low fatigue level class, along with the number of classes. Each run of the evaluation data was then categorised as either a high or a low fatigue state, and the separability of the MI EEG was analysed at each fatigue level. Second, the separability of the MI EEG was examined for each run and the correlation between MI EEG separability and mental fatigue over the various runs was then established.
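The two-level quantization in the first scenario can be sketched as a one-dimensional K-means with two clusters. The sketch assumes that the lowest and highest fatigue scores serve as the initial cluster centres, which is one reading of the description above; the function name and the example scores are hypothetical.

```python
def quantize_fatigue(scores, max_iter=100):
    """Two-cluster 1-D K-means; returns labels (0 = low, 1 = high) and centres."""
    lo, hi = min(scores), max(scores)      # assumed initial centres
    labels = [0] * len(scores)
    for _ in range(max_iter):
        # Assign each run to the nearer centre.
        labels = [int(abs(s - hi) < abs(s - lo)) for s in scores]
        low = [s for s, lab in zip(scores, labels) if lab == 0]
        high = [s for s, lab in zip(scores, labels) if lab == 1]
        # Update centres; keep the old centre if a cluster is empty.
        new_lo = sum(low) / len(low) if low else lo
        new_hi = sum(high) / len(high) if high else hi
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return labels, (lo, hi)

# Hypothetical per-run fatigue scores for 8 runs.
run_labels, centres = quantize_fatigue([0.1, 0.2, 0.15, 0.8, 0.9, 0.85, 0.7, 0.95])
```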

In both cases, MI performance was evaluated in terms of class separability, with higher class separability considered as better motor imagery performance. Separability of extracted features can be measured directly by metrics like DBI and FS or indirectly by the accuracy of classifiers (Hasan 2010). The higher the separability of the extracted features, the better the classification accuracy (Hasan 2010). This study examines the separability of features from the signal-feature perspective using four class separability metrics: DBI, FS, DI and mi. All four metrics compute the relevance of features with respect to the class and thus measure the separation between classes/clusters. Unlike classifiers, they require no pre-training. They are independent of the number of groupings and the grouping algorithm used (Adel et al. 2015) and hence are simple, feasible and time saving (Löster 2016). Further, this study aims to show the decrease in class separability of MI EEG features with the rise in fatigue level, which can be described by these four separability metrics.

These metrics are among the most widely used measures of class separability. The computation of DBI is much less complex than that of most class separability metrics, such as the Silhouette index (Petrovic 2006). FS is known for its simplicity, feasibility and time saving (Zhou 2016). As reported by Löster (2016), DI gives better performance than DBI as well as other class separability metrics. However, DBI, DI and FS are linear separability metrics and hence cannot capture non-linear relationships. For this reason, the study also used another metric, ‘mi’, as it is the most widely used metric to capture the non-linear relationship between features and their classes.

3.4.1 Davies Bouldin Index (DBI)

Davies Bouldin Index (DBI) is computed as follows:

$$ M_{ij}={\left\{\sum\limits_{k = 1}^{n}|\mu_{ik}-\mu_{jk}|^{q}\right\}}^{1/q} $$
(1)
$$ S_{i}=\left\{{\frac{1}{T_{i}} \sum\limits_{l = 1}^{T_{i}} \sum\limits_{k = 1}^{n} |x_{lk}-\mu_{ik}|^{q}}\right\}^{1/q} $$
(2)
$$ RI_{ij} = \frac{(S_{i} + S_{j})}{M_{ij}} $$
(3)
$$ DBI = \frac{1}{m} \sum\limits_{i = 1}^{m}\left( \underset{j\neq i}{max}RI_{ij}\right) $$
(4)

where Mij is a measure of separation between class i and class j, Si is a measure of scatter within class i, μi is the centroid of class i, xl is a feature vector of size n, Ti is the number of feature vectors in class i, m is the number of classes, and the value of q is usually 2. Since DBI is the ratio of within-class scatter to between-class distance, a smaller DBI value implies better class separation.
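Equations (1) to (4) translate directly into a few lines of code. The sketch below is illustrative only; the function name and the toy data are hypothetical, and each class is passed as a list of feature vectors.

```python
def davies_bouldin_index(classes, q=2):
    """DBI for a list of classes, each a list of equal-length feature vectors."""
    cents = [[sum(col) / len(c) for col in zip(*c)] for c in classes]
    # Eq. (2): within-class scatter S_i
    scatter = [
        (sum(abs(x - mu) ** q for p in c for x, mu in zip(p, cen)) / len(c)) ** (1 / q)
        for c, cen in zip(classes, cents)
    ]

    # Eq. (1): separation M_ij between the centroids of classes i and j
    def separation(i, j):
        return sum(abs(a - b) ** q for a, b in zip(cents[i], cents[j])) ** (1 / q)

    m = len(classes)
    # Eqs. (3) and (4): worst-case ratio per class, averaged over classes
    return sum(
        max((scatter[i] + scatter[j]) / separation(i, j) for j in range(m) if j != i)
        for i in range(m)
    ) / m

# Two hypothetical, well-separated classes in a 2-D feature space.
dbi_demo = davies_bouldin_index([[[0, 0], [0, 2]], [[10, 0], [10, 2]]])
```

In this toy case each class has scatter 1 and the centroids are 10 apart, so the index is (1 + 1)/10 = 0.2.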

3.4.2 Dunn’s Index (DI)

Dunn’s index is a measure to evaluate class separability and is defined as follows:

$$ DI = \frac{\underset{1\leq i < j \leq m}{min}\delta(\mu_{i},\mu_{j})}{\underset{1 \leq k \leq m}{max} {\Delta}_{k}} $$
(5)

where m is the number of classes, δ(μi, μj) is the distance between the centroids of classes i and j, and Δk is the intra-class distance of class k, i.e., the mean distance of all the points of class k from their centroid μk:

$$ {\Delta}_{k} = \frac{{\sum}_{x \in C_{k}}d(x,\mu_{k})}{|C_{k}|} $$
(6)

where Ck represents class k and |Ck| is the number of samples in that class. A larger value of DI implies better class separability.
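A small sketch of this centroid-based form of Dunn's index follows. Euclidean distance is assumed for both δ and d, since the text does not name the distance measure; the function name and toy data are hypothetical.

```python
import math

def dunn_index(classes):
    """Dunn's index: min centroid distance over max mean distance to centroid."""
    cents = [[sum(col) / len(c) for col in zip(*c)] for c in classes]
    # Eq. (6): mean distance of each class's points from its own centroid
    intra = [sum(math.dist(p, cen) for p in c) / len(c)
             for c, cen in zip(classes, cents)]
    # Numerator of Eq. (5): smallest distance between two class centroids
    inter = min(math.dist(cents[i], cents[j])
                for i in range(len(cents)) for j in range(i + 1, len(cents)))
    return inter / max(intra)

# Same hypothetical two-class data as used for illustration elsewhere.
di_demo = dunn_index([[[0, 0], [0, 2]], [[10, 0], [10, 2]]])
```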

3.4.3 Fisher Score (FS)

Fisher Score is computed as follows:

$$ FS = \frac{|\mu_{1}-\mu_{2}|^{2}}{{\sigma_{1}^{2}}+{\sigma_{2}^{2}}} $$
(7)

where μ1 and μ2 represent the centroids of class 1 and class 2, while σ1² and σ2² represent the variances of class 1 and class 2 respectively. A higher value of FS implies better class separability.
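For a single feature and two classes, the Fisher score can be sketched as below. The sketch uses the conventional form with the sum of the two class variances in the denominator and assumes population variances; the function name and values are hypothetical.

```python
from statistics import fmean, pvariance

def fisher_score(class1, class2):
    """Fisher score of one feature given its samples in two classes."""
    between = (fmean(class1) - fmean(class2)) ** 2
    within = pvariance(class1) + pvariance(class2)  # assumption: population variances, summed
    return between / within

# Hypothetical feature values: class means 1 and 11, each with variance 1.
fs_demo = fisher_score([0.0, 2.0], [10.0, 12.0])
```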

3.4.4 Mutual information (mi)

Mutual information gives the mutual dependence between two variables and is defined as follows:

$$ mi = \sum\limits_{y \in Y}\sum\limits_{x \in X}p(x,y)\log\frac{p(x,y)}{p(x)p(y)} $$
(8)

where X is a set of feature vectors and Y is a set of class labels, p(x,y) is the joint probability function of x and y, while p(x) and p(y) are the marginal probability functions of x and y respectively. A higher value of mi implies higher relevance of a feature to its class and hence better class separability.
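Equation (8) can be estimated from data by counting co-occurrences of discretised feature values and class labels. The sketch below is one simple plug-in estimate, not necessarily the estimator used in the study; the equal-width binning helper, the bin count and the natural logarithm are assumptions.

```python
import math
from collections import Counter

def discretize(values, bins=4):
    """Map continuous values to equal-width bin indices (assumed binning scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0        # avoid zero width for constant input
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(x, y):
    """Plug-in estimate of Eq. (8), in nats, for discrete x and class labels y."""
    n = len(x)
    pxy, px, py = Counter(zip(x, y)), Counter(x), Counter(y)
    return sum((c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in pxy.items())

# Hypothetical feature that perfectly predicts the class, and one that does not.
labels = [0, 0, 1, 1]
mi_informative = mutual_information(discretize([0.0, 0.1, 0.9, 1.0], bins=2), labels)
mi_uninformative = mutual_information([0, 1, 0, 1], labels)
```

The informative feature attains the maximum for two balanced classes, log 2 nats, while the uninformative one gives zero.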

4 Experimental results

4.1 Subjective evaluation of fatigue

Two scales Chalder Fatigue Scale (CFS) and Visual Analogue Scale - Fatigue (VAS-F) were used to assess the fatigue level of the participants before and after the experiment. These two scales reveal that all the participants experienced fatigue after completing several runs of motor imagery tasks. The individual subjective scores were averaged across all the 11 participants and shown in Fig. 4. Friedman statistical test was conducted on the averaged subjective scores obtained before and after the experiment. Compared to the averaged subjective scores on mental fatigue obtained before the experiment (pre-task), there is a significant increase in subjective scores on mental fatigue after the experiment (post-task) as shown in Table 2.

Fig. 4 Comparison of subjective scores on mental fatigue averaged across all participants between two sessions: before and after the experiment

4.1.1 Evaluation of CFS

The obtained CFS scores reveal that all the participants experienced drowsiness and difficulty in concentrating after the completion of the experiment. The CFS questionnaire is given in Appendix A.2. The Mann-Whitney U test was conducted on the subjective scores obtained before and after the experiment for each of the 6 items used in the study. The results are shown in Table 3. The first column shows the item number and the second column shows the p-value on the subjective scores of that particular item before and after the experiment (\(\checkmark \) indicates a statistically significant difference while × indicates an insignificant difference).

Table 2 Statistical test on the subjective scores
Table 3 Statistical test on the subjective scores on the questions of CFS

4.1.2 Evaluation of VAS-F

The obtained VAS-F scores reveal that all the participants experienced drowsiness, fatigue, difficulty in concentrating and difficulty in keeping their eyes open after the completion of the experiment. The VAS-F questionnaire is given in Appendix A.1. The Mann-Whitney U test was conducted on the subjective scores obtained before and after the experiment for each of the 13 items used in the study. The results are shown in Table 4. The first column shows the item number and the second column shows the p-value on the subjective scores of that particular item before and after the experiment (\(\checkmark \) indicates a statistically significant difference while × indicates an insignificant difference).

Table 4 Statistical test on the subjective scores on the questions of VAS-F
Table 5 Comparison of the average spectral power for delta activity obtained during first and last runs for all the 11 participants

The results collectively show that after performing motor imagery tasks for a long period without any break between the runs, the participants felt drowsy, worn out and fatigued, and had difficulty concentrating. Other factors such as room temperature and sitting in the EEG room could also have contributed to the fatigue, but participants given a break in the middle of the experiment might not experience the same level of fatigue as they did without any break. This is because motor imagery is a quite monotonous task that demands much cognitive effort to maintain the vigilance level. Hence, participants experience cognitive fatigue while performing such cognitive tasks.

4.2 Optimization to get the best feature vector for fatigue analysis

Spectral power and spectral entropy computed in all the four aforesaid bands during the first and last runs in five different areas of the scalp (frontal, central, parietal, occipital and temporal) were averaged across all the 11 participants and the average values obtained were compared. For each band, the average spectral power and average spectral entropy are recorded in Tables 5, 6, 7, 8, 9, 10, 11 and 12. The Friedman statistical test was conducted on the obtained average values. The 2nd and 3rd columns in the tables show the average values obtained for the first and last runs. The p values obtained are shown in the 4th column. A significant difference in the average values between the first and last runs is indicated by \(\checkmark \) whereas × indicates an insignificant difference. The last column shows the direction of change of the computed mean values, marking either a significant increase or, with -, an insignificant change.

  • A. Delta wave results

    Tables 5 and 6 show average spectral power and average spectral entropy respectively for the delta activity.

    No significant changes occur in average spectral entropy between the first and last runs in all the five areas. Average spectral power increases significantly in the frontal and temporal areas during the last run. The peak amplitudes of delta activity in frontal lobe during the first and last runs are: 0.0653 (SD = 0.1208, SE = 0.0364) and 0.1029 (SD = 0.242 and SE = 0.0729) respectively; while in the temporal lobe the peak amplitudes during the first and last runs are: 0.0340 (SD = 0.0174, SE = 0.00524) and 0.04732 (SD = 0.0474, SE = 0.0143) respectively.

  • B. Theta wave results

    Tables 7 and 8 show the results for average spectral power and average spectral entropy in the theta band respectively.

    No significant changes occur in average spectral entropy between the first and last runs in all the five areas. Average spectral power increases significantly in the frontal, temporal and occipital areas during the last run. The peak amplitudes of the theta activity in frontal lobe during the first and last runs are: 0.0773 (SD = 0.139, SE = 0.0419) and 0.093 (SD = 0.157, SE = 0.0474) respectively; in occipital lobe the peak amplitudes are 0.040361 (SD = 0.0209, SE = 0.00632) and 0.0517 (SD = 0.03302, SE = 0.01014) during first and last runs respectively; while in the temporal lobe the peak amplitudes during the first and last runs are: 0.0371 (SD = 0.0209, SE = 0.00631) and 0.0511 (SD = 0.0334, SE = 0.0101) respectively.

  • C. Alpha wave results

    Tables 9 and 10 show the results for average spectral power and average spectral entropy in the alpha band respectively.

    No significant changes occur in average spectral entropy between the first and last runs in all the five areas. Average spectral power increases significantly in frontal, parietal and temporal areas during the last run. The peak amplitudes of alpha activity in frontal lobe during the first and last runs are 0.116 (SD = 0.218, SE = 0.0657) and 0.139 (SD = 0.236, SE = 0.0711) respectively; in the parietal lobe 0.0542 (SD = 0.033, SE = 0.00987) and 0.0783 (SD = 0.068, SE = 0.0205) respectively; while in the temporal lobe the peak amplitudes during the first and the last runs are 0.0527 (SD = 0.0349, SE = 0.0105) and 0.079 (SD = 0.065, SE = 0.0197) respectively.

  • D. Beta wave results

    Tables 11 and 12 show the results for average spectral power and average spectral entropy in the beta band respectively. No significant changes occur in average spectral entropy and average spectral power between the first and last runs in all the five areas.

  • E. The topographic plots

    The topographic plots depicting the marked increase in delta, theta and alpha power between the first, intermediate and last runs are shown in Fig. 5. The maps plot the electrodes listed in Section 3.1 and hence depict the rise in delta, theta and alpha power at only those electrodes. Beta power is not shown as no significant change in beta power is found between the first and last runs in any of the five areas of the scalp. The color scales of the topographic maps are in units of dB, with limits of -0.06 to 0.06, and range from blue (minimum) to red (maximum) with green as the midrange. Dark red therefore indicates the highest values of delta, theta and alpha power. The green zones encompass those electrodes which were not used for this analysis; they appear rectangular because the positions of those electrodes form a rectangular area. The figure clearly portrays the increase in delta, theta and alpha power from the first to the last run, with fatigue reaching its highest level during the last run. The frontal area is found to be the most affected by the rise in fatigue level as compared to the other four. Delta power is found to be more prominent during the experiment than theta and alpha power. This shows that motor imagery produces an increased level of drowsiness and a decreased level of arousal.

Table 6 Comparison of the average spectral entropy for delta activity obtained during first and last runs for all the 11 participants
Table 7 Comparison of the average spectral power for theta activity obtained during first and last runs for all the 11 participants
Table 8 Comparison of the average spectral entropy for theta activity obtained during first and last runs for all the 11 participants
Table 9 Comparison of the average spectral power for alpha activity obtained during first and last runs for all the 11 participants
Table 10 Comparison of the average spectral entropy for alpha activity obtained during first and last runs for all the 11 participants
Table 11 Comparison of the average spectral power for beta activity obtained during first and last runs for all the 11 participants
Table 12 Comparison of the average spectral entropy for beta activity obtained during first and last runs for all the 11 participants
Fig. 5 Scalp topographic maps of delta, theta and alpha band power

4.3 Kernel partial least squares to track the growth of fatigue during motor imagery

In the studies by Montgomery et al. (1995) and Trejo et al. (2015), the first 15 min of a cognitive task did not produce substantial mental fatigue, while mental fatigue became prominent in the final 15 min. In our study, the analysis of EEG spectral power and entropy from different frequency bands in different areas of the scalp also showed that spectral power increases during the last run compared with the first run. δ, 𝜃 and α power from the frontal region, α power from the parietal lobe, δ, 𝜃 and α power from the temporal lobe and 𝜃 power from the occipital lobe show a significant increase during the last run compared with the first run, while β power from all the lobes shows no significant change between the first and last runs. Since the increase in mental fatigue can be associated with an increase in δ, 𝜃 and α power as discussed in Section 1, we can conclude that the last run of the experiment produced substantial mental fatigue compared with the first run. Further, the subjective evaluation through the pre-test and post-test measures CFS and VAS-F, as discussed in Section 4.1, also shows that the level of fatigue experienced by the participants is higher after the experiment than at its beginning. Hence, the first run and the last run were taken as the reference points of mental fatigue, i.e., the first run as the active state and the last run as the fatigue state.

The set of response variables is a vector of −1 or +1 representing the two classes used for training, the active state and the fatigue state, while the set of predictors is an n × 8 matrix, where n is the number of observations and the columns consist of the 8 features, i.e., frontal δ, 𝜃 and α power, parietal α power, occipital 𝜃 power and temporal δ, 𝜃 and α power.

  • A. KPLS model selection

    The first and last runs were used as both training and testing data for selecting the optimal KPLS model. KPLS scores were estimated for each trial of the training and testing data. Linear Discriminant Analysis (LDA) was trained and tested on the computed KPLS scores to find the optimal number of KPLS components. The number of KPLS components was varied from 1 to 10, and the optimal number was taken as the one giving the maximum LDA classification accuracy. Table 13 lists the optimal number of KPLS components for each participant. Figure 6 shows the classification accuracy obtained for each participant with the optimal number of KPLS components.
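
The selection procedure described above can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the implementation used in this study: the Gaussian kernel and its width `gamma`, the NIPALS-style deflation for a single response, the equal-prior midpoint rule standing in for LDA on one-dimensional scores, and all function names are hypothetical choices made for the example.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.05):
    """Gaussian kernel between the rows of A and B (kernel choice is an assumption)."""
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def center_kernels(K_train, K_test):
    """Centre train and test kernels in feature space using training statistics."""
    n = K_train.shape[0]
    J = np.eye(n) - np.full((n, n), 1.0 / n)
    Kc_train = J @ K_train @ J
    Kc_test = (K_test - np.full((K_test.shape[0], n), 1.0 / n) @ K_train) @ J
    return Kc_train, Kc_test

def kpls_coefficients(K, y, n_components):
    """Dual KPLS regression coefficients for a single, mean-centred response."""
    n = K.shape[0]
    Y = (y - y.mean()).reshape(-1, 1).astype(float)
    Kd, Yd = K.copy(), Y.copy()
    T = np.zeros((n, n_components))
    U = np.zeros((n, n_components))
    for a in range(n_components):
        u = Yd / np.linalg.norm(Yd)      # single response: u is the normalised residual
        t = Kd @ u
        t /= np.linalg.norm(t)
        T[:, [a]], U[:, [a]] = t, u
        P = np.eye(n) - t @ t.T          # deflate the kernel and the response
        Kd = P @ Kd @ P
        Yd = Yd - t @ (t.T @ Yd)
    return U @ np.linalg.solve(T.T @ K @ U, T.T @ Y)

def lda_accuracy(s_train, y_train, s_test, y_test):
    """Two-class LDA on 1-D scores with equal priors: midpoint-of-means threshold."""
    m_neg = s_train[y_train == -1].mean()
    m_pos = s_train[y_train == 1].mean()
    thr = 0.5 * (m_neg + m_pos)
    pred = np.where((s_test - thr) * np.sign(m_pos - m_neg) > 0, 1, -1)
    return float((pred == y_test).mean())

def select_n_components(X_tr, y_tr, X_te, y_te, max_components=10, gamma=0.05):
    """Return (best number of components, its accuracy) over 1..max_components."""
    K_tr, K_te = center_kernels(rbf_kernel(X_tr, X_tr, gamma),
                                rbf_kernel(X_te, X_tr, gamma))
    best_k, best_acc = 1, -1.0
    for k in range(1, max_components + 1):
        B = kpls_coefficients(K_tr, y_tr, k)
        acc = lda_accuracy((K_tr @ B).ravel(), y_tr, (K_te @ B).ravel(), y_te)
        if acc > best_acc:               # strict '>' keeps the smaller k on ties
            best_k, best_acc = k, acc
    return best_k, best_acc
```

With an n × 8 feature matrix and labels of −1 and +1, `select_n_components(X_train, y_train, X_test, y_test)` returns the number of components with the highest accuracy together with that accuracy. Ties are resolved in favour of the smaller number of components, which is a design choice of the sketch.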

  • B. KPLS model prediction

    The predictive validity of the KPLS model was examined with the EEG epochs from all the runs. The model obtained from KPLS model selection provides the KPLS regression coefficients. The KPLS scores, which are the estimates/predictions, were computed by projecting the explanatory variables from each run onto the KPLS regression coefficients. Since −1 and +1 are the class labels used for training, the KPLS score of each trial ranges from −1 to +1. These labels make the alert and fatigue states easy to identify: negative scores indicate the alert state and positive scores the fatigue state. The mean of the KPLS scores for each run was then computed and interpreted as the fatigue score, which also has a value between −1 and +1. For each of the 6 participants, the fatigue scores for each run are plotted in Fig. 7. The figure shows an orderly progression from the active to the fatigue state with some intermittent reversals. This is because fatigue may not increase perfectly monotonically over time, and waxing and waning behaviour can sometimes be observed.
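
The step from per-trial KPLS scores to a per-run fatigue score can be illustrated as follows. This is a small sketch with hypothetical function names and made-up score values; it only shows the averaging and the sign-based reading of the result described above.

```python
def fatigue_scores(trial_scores_by_run):
    """Mean KPLS score per run; negative suggests alert, positive suggests fatigue."""
    return [sum(scores) / len(scores) for scores in trial_scores_by_run]

def first_fatigued_run(run_scores):
    """1-based index of the first run whose fatigue score is positive, else None."""
    for run, score in enumerate(run_scores, start=1):
        if score > 0:
            return run
    return None

# Hypothetical per-trial scores for three runs (illustrative values only).
example = [[-0.8, -0.6], [-0.2, -0.4], [0.1, 0.3]]
per_run = fatigue_scores(example)
entry_run = first_fatigued_run(per_run)
```

Here `per_run` holds one fatigue score per run and `entry_run` is the run at which the score first crosses into the positive (fatigue) region.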

Table 13 Optimal number of KPLS components for individual participants
Fig. 6 Classification accuracy with the optimal number of KPLS components

Fig. 7 Means of KPLS scores for each run with 6 participants

4.4 Estimation of trends of the growth of fatigue for validation of the EEG-KPLS model

The fatigue scores estimated from EEG with the KPLS model range from −1 to 1, while the self-reported subjective scores obtained through the “fatigue scale” are normalized to the range from 0 to 1. The trends of the KPLS scores and the subjective scores were estimated using the Centered Moving Average algorithm. A trend can be defined as the direction in which a quantity is changing or developing. The trend of the EEG-based fatigue scores shows the rate of change of the KPLS score with respect to time, while the trend of the subjective scores portrays the rate of change of the subjective score with respect to time. Trends are compared, rather than the two sets of scores directly, because KPLS gives the fatigue score during a particular run whereas the subjective evaluation gives the fatigue score at the end of that run, so the two cannot be correlated directly.

To estimate the trends of the growth of fatigue, the Centered Moving Average algorithm takes as input the fatigue scores of all the 8 runs. The predictive validity of the KPLS model during model prediction was investigated with the EEG epochs from all the 8 runs, which show the growth of fatigue from the first to the last run. The study also aims to correlate the growth of fatigue obtained through the KPLS model with that obtained through the subjective scores, and hence it estimates the trends taking all the 8 runs into account. Further, since runs 1 and 8 were used for training the KPLS model, training may find some combination of input features that maximises the separability between runs 1 and 8. Hence, trends were also computed with the Centered Moving Average algorithm using only runs 2–7, to examine the predictive validity of the KPLS model without runs 1 and 8.

4.4.1 Centered moving average algorithm

The centered moving average algorithm computes the unweighted mean of the n data points within a window. The mean is taken from an equal number of data points on either side of a central value, which ensures that variations in the mean are aligned with the variations in the data rather than being shifted in time (Footnote 1). Given a window length of n and data points \(y_{1}, y_{2}, y_{3}, \ldots, y_{n1}\), the centered moving average is defined as

$$ avg=\frac{1}{n}\sum\limits_{i = 1}^{n}y_{i} $$
(9)

When calculating successive values, a new value comes into the sum and an old value drops out, i.e.,

$$ avg_{new}=avg + \frac{y_{j}}{n} - \frac{y_{j-n}}{n} $$
(10)

where j > n.

This study first examined different sliding window lengths in the range [1, 6] and then selected the best one.
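
Equations (9) and (10) can be illustrated with a short sketch. It is an assumption-laden example rather than the procedure used here: it handles odd window lengths only (an even length needs an extra averaging step to stay centred), it leaves positions without a full window as `None`, and the function name is hypothetical.

```python
def centered_moving_average(y, n):
    """Centered moving average of the sequence y with odd window length n.

    The first and last (n - 1) // 2 positions have no full window and are
    returned as None; this edge handling is an assumption of the sketch.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("this sketch only handles odd window lengths")
    half = (n - 1) // 2
    out = [None] * len(y)
    if len(y) < n:
        return out
    avg = sum(y[:n]) / n                  # Eq. (9) on the first full window
    out[half] = avg
    for j in range(n, len(y)):            # Eq. (10): add y_j, drop y_{j-n}
        avg = avg + y[j] / n - y[j - n] / n
        out[j - half] = avg               # assign the mean to the window centre
    return out
```

For eight run-wise fatigue scores and a window length of 3, the function returns six trend values, for runs 2 to 7, with the two end positions left undefined.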

Finally, the similarity between the estimated trends was measured by the Pearson correlation coefficient. The results are shown in Table 14. In the table, the first column shows the participant id. Columns 2 and 3 show the correlation coefficient and statistical analysis of the trends computed from all the 8 runs, while columns 4 and 5 show the correlation coefficient and statistical analysis of the trends computed from runs 2–7. Plots of the trends are shown in Appendix A.4 and A.5 respectively.

Table 14 Correlation between the trends of KPLS scores and Subjective scores

The results indicate that for 5 participants there is a strong correlation (>0.5) between the trends of the KPLS scores and the subjective scores. However, for Participant 11 no correlation is found, since the subjective score remained the same throughout the whole experiment while the fatigue score based on the KPLS model shows an orderly progression towards the fatigue state.
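
The comparison of two trends can be sketched as below. The function name and the choice to return `None` for a constant series are assumptions of the example; the latter reflects the fact that the Pearson coefficient is undefined when one series has zero variance, as happens when a subjective score never changes.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length series, or None if one is constant."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    if max(x) == min(x) or max(y) == min(y):
        return None                       # zero variance: coefficient undefined
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

A returned value above 0.5 would correspond to what the text treats as a strong correlation, while `None` flags a case in which no coefficient can be computed.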

4.5 Evaluation of MI EEG separability change with rise in fatigue level

4.5.1 Training and testing sessions

The KPLS model predicts the fatigue scores of all the 8 runs. The first two runs (i.e., 160 trials) were used as training data to compute the spatial filter for the extraction of CSP features. The class separability of the MI EEG features was then evaluated on the testing data, which was considered in two ways: (a) the remaining 6 runs, i.e., runs 3–8, and (b) runs 3–7, i.e., excluding the 8th run. Six out of the 11 participants completed all the 8 runs of the experiment. However, for one of these 6 participants, Participant S11, the fatigue estimated using EEG showed no correlation with the fatigue based on the self-reported subjective scores. Hence, the following analysis was carried out on the remaining 5 participants. For each of these 5 participants, the change in MI EEG separability with the rise in fatigue level was evaluated in two different scenarios. First, based on the fatigue score, each run was quantized to one of two levels, low fatigue or high fatigue, using K-means clustering, and the separability of the MI EEG was analysed at each fatigue level. Second, the separability of the MI EEG was examined for each run, and the correlation between MI EEG separability and mental fatigue over the runs was then established.
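
The quantization of runs into two fatigue levels can be illustrated with a one-dimensional K-means sketch with k = 2. The deterministic initialisation at the smallest and largest score, the function name and the example scores are assumptions made for illustration only.

```python
def two_level_kmeans(scores, max_iter=100):
    """Label each 1-D fatigue score as 'low' or 'high' using K-means with k = 2."""
    lo, hi = min(scores), max(scores)     # initial centroids (an assumption)
    if lo == hi:
        raise ValueError("all scores are identical; two levels cannot be formed")
    labels = []
    for _ in range(max_iter):
        new_labels = ["high" if abs(s - hi) < abs(s - lo) else "low" for s in scores]
        if new_labels == labels:          # assignments stable: converged
            break
        labels = new_labels
        low_vals = [s for s, lab in zip(scores, labels) if lab == "low"]
        high_vals = [s for s, lab in zip(scores, labels) if lab == "high"]
        lo = sum(low_vals) / len(low_vals)
        hi = sum(high_vals) / len(high_vals)
    return labels

# Hypothetical fatigue scores for runs 3-8 of one participant.
levels = two_level_kmeans([-0.4, -0.1, 0.2, 0.5, 0.6, 0.7])
```

With these example values the first two runs fall into the low-fatigue cluster and the remaining four into the high-fatigue cluster.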

4.5.2 Feature extraction and selection

Common spatial pattern (CSP) was used for extracting features. The literature reports that CSP performs best with a large number of channels, for instance, 55 (Blankertz et al. 2008), 56 (Ramoser et al. 2000), 118 (Lu et al. 2009) or 127 (Ge et al. 2014), and may not perform well with a small number of channels (Górski 2014). This analysis employed all the 64 electrodes. One approach to analysing MI EEG data is to split the data into different time windows and select the optimal temporal segment, i.e., the segment that contains the most discriminative information according to a predefined criterion. This study uses optimal spatio-temporal filtering to extract the optimal spatio-temporal patterns, carried out by employing ADSWIN (Talukdar and Hazarika 2017), an adaptive temporal segmentation of EEG trials. ADSWIN is an adaptive sliding window approach that automatically segments EEG trials and then selects the best segments to produce the optimal spatio-temporal patterns. ADSWIN is portrayed as an enhancement to the classic sliding window methodology which

  • increases the class separability,

  • dynamically adapts the two parameters on which a sliding window approach relies, window size and overlapping region,

  • is applicable to online learning algorithms.

CSP was employed to extract features from each segment obtained through segmentation. DBI is used as the cost function to identify the optimal segment: the EEG segment with the minimum DBI is selected as the optimal EEG segment. The CSP projection matrix is then computed on the selected optimal time segment to create a spatio-temporal profile. Our motivation to use such a filter is driven by two factors (Talukdar and Hazarika 2017): (a) constructing a reduced representation of the original time series of the training data and (b) consolidating spatial analysis with temporal study to extract spatio-temporal patterns. It is feasible to process the whole trial, but extracting the optimal time segment has the benefit of computing results with much shorter time segments. A detailed formulation of the method can be found in Talukdar and Hazarika (2017). The training data was used to build the optimal spatio-temporal filter. The parameters employed for the filtering are shown in Table 15. ADSWIN requires two parameters, the default segment length (ωd) and the sliding window overlapping region (δ), and adapts them to generate the optimal spatio-temporal filter. This study investigates segments with two different sets of ωd and δ. With the trial being 6 s in length, this study uses two different values of ωd, keeping in mind that a value lower than 2.99 s would be too small to find the optimum number of time points and a value higher than 3.99 s may create too large a segment. δ was selected so that it maintains an even distribution of segments along the trial, keeping in mind that a larger value of δ would retain more historic information rather than new information.
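
The idea of scoring candidate time segments by the DBI of their CSP features can be sketched as follows. This is not ADSWIN, whose adaptive segmentation is described in Talukdar and Hazarika (2017); it is a simplified illustration over a fixed list of candidate windows. The whitening-based CSP, the normalised log-variance features, the two-class form of the DBI and all function names are assumptions of the example, and trials are assumed to be band-pass filtered arrays of shape (trials, channels, samples).

```python
import numpy as np

def csp_filters(trials_a, trials_b, n_pairs=2):
    """CSP spatial filters (rows) from two classes of trials."""
    def mean_cov(trials):
        covs = [x @ x.T / np.trace(x @ x.T) for x in trials]
        return np.mean(covs, axis=0)
    Ca, Cb = mean_cov(trials_a), mean_cov(trials_b)
    evals, evecs = np.linalg.eigh(Ca + Cb)
    whiten = (evecs / np.sqrt(evals)).T            # whiten @ (Ca + Cb) @ whiten.T = I
    d, B = np.linalg.eigh(whiten @ Ca @ whiten.T)  # eigenvalues in ascending order
    W = B.T @ whiten
    keep = np.r_[np.arange(n_pairs), np.arange(len(d) - n_pairs, len(d))]
    return W[keep]                                 # filters from both ends of the spectrum

def log_var_features(trials, W):
    """Normalised log-variance of the CSP-projected signals, one row per trial."""
    feats = []
    for x in trials:
        v = np.var(W @ x, axis=1)
        feats.append(np.log(v / v.sum()))
    return np.array(feats)

def dbi_two_class(Fa, Fb):
    """Davies-Bouldin index for two classes: summed scatter over centroid distance."""
    ca, cb = Fa.mean(axis=0), Fb.mean(axis=0)
    sa = np.linalg.norm(Fa - ca, axis=1).mean()
    sb = np.linalg.norm(Fb - cb, axis=1).mean()
    return float((sa + sb) / np.linalg.norm(ca - cb))

def best_segment(trials_a, trials_b, segments):
    """Return the (start, stop) sample window with the smallest DBI, plus all DBIs."""
    scores = {}
    for start, stop in segments:
        a, b = trials_a[:, :, start:stop], trials_b[:, :, start:stop]
        W = csp_filters(a, b)
        scores[(start, stop)] = dbi_two_class(log_var_features(a, W),
                                              log_var_features(b, W))
    return min(scores, key=scores.get), scores
```

A lower DBI indicates tighter, better separated classes, so the window returned by `best_segment` is the one whose CSP features discriminate the two motor imagery classes most clearly among the candidates supplied.
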
The study investigated the effect of different frequency bands on MI EEG separability and hence examined 6 different bandpass filter banks (4–7 Hz, 8–13 Hz, 13–30 Hz, 30–40 Hz; 4–9 Hz, 9–15 Hz, 15–30 Hz, 30–40 Hz; 4–9 Hz, 9–16 Hz, 15–30 Hz, 30–40 Hz; 4–9 Hz, 9–16 Hz, 15–32 Hz, 30–40 Hz; 4–9 Hz, 9–16 Hz, 15–32 Hz, 30–40 Hz; 4–9 Hz, 9–16 Hz, 15–32 Hz, 30–40 Hz). The best two bandpass filter banks were selected for this study. Four segments were then selected based on the different combinations of ωd and δ for each filter bank as shown in Table 16. Hence, a total of eight different segments were investigated. KPLS-mRMR (Talukdar et al. 2018), a KPLS-based feature selection method, was used to select a set of discriminative features. The method selects the maximally relevant and minimally redundant features. The number of selected features was set to 25.

Table 15 Parameters used by ADSWIN
Table 16 Segments investigated for analysis of effect of fatigue on motor imagery

4.5.3 Effect of mental state on class distributions

  • (A) Evaluation of MI EEG separability during low and high fatigue level

    Table 17 shows the runs that have been categorized as low or high fatigue level. The first column shows the participant id, while the second and third columns show the runs categorized as low and high fatigue level respectively.

    The class separability of the MI EEG at each level, using runs 3–8 as evaluation data, was then estimated by means of four evaluation metrics: DBI, FS, DI and mi. The average DBI and FS across all the 5 participants during low and high fatigue are shown in Figs. 8 and 9 respectively, and Figs. 10 and 11 portray the corresponding average mi and DI. Similarly, the MI EEG separability at both fatigue levels was also estimated using runs 3–7 as the evaluation data. The average DBI and FS across all the 5 participants during low and high fatigue are shown in Figs. 12 and 13 respectively, and Figs. 14 and 15 portray the corresponding average mi and DI. The figures show that the class separability during the high fatigue level was lower than that during the low fatigue level in terms of all the four evaluation metrics. The Friedman statistical test was carried out to examine whether this difference is statistically significant. The results are shown in Table 18. The first and second columns show the filter banks (abbreviated as FB in the table) and their corresponding segments respectively, while the p-values for the evaluation metrics DBI, FS, mi and DI for runs 3–8 are shown in the third to sixth columns respectively. The p-values for the same metrics for runs 3–7 are shown in the seventh to tenth columns respectively. A significant difference in class separability between low and high fatigue is indicated by \(\checkmark \), while × indicates an insignificant difference. These findings collectively show a distinct relationship between motor imagery performance and mental fatigue.
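
Two of the four separability metrics, the Fisher score (FS) and Dunn's index (DI), can be illustrated for a two-class feature set as below. The formulas used are common textbook forms and are assumptions of this sketch; the definitions adopted in this study may differ in detail. Function names are hypothetical, and each input is an array with one feature vector per row.

```python
import numpy as np

def fisher_score(Fa, Fb):
    """Squared distance between class means over the summed within-class variance."""
    ma, mb = Fa.mean(axis=0), Fb.mean(axis=0)
    within = Fa.var(axis=0).sum() + Fb.var(axis=0).sum()
    return float(np.sum((ma - mb) ** 2) / within)

def pairwise_dist(A, B):
    """Euclidean distances between every row of A and every row of B."""
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)

def dunn_index(Fa, Fb):
    """Smallest between-class distance over the largest within-class diameter."""
    inter = pairwise_dist(Fa, Fb).min()
    diam = max(pairwise_dist(Fa, Fa).max(), pairwise_dist(Fb, Fb).max())
    return float(inter / diam)

# Tiny illustrative feature sets: two points per class.
Fa = np.array([[0.0, 0.0], [0.0, 1.0]])
Fb_far = np.array([[3.0, 0.0], [3.0, 1.0]])
Fb_near = np.array([[1.0, 0.0], [1.0, 1.0]])
```

Both quantities grow as the classes move apart, so higher FS and DI indicate better separability, whereas for DBI a lower value indicates better separability. In the toy data, moving the second class from `Fb_far` to `Fb_near` lowers both FS and DI.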

  • (B) MI EEG separability over various runs

    For each of the 5 participants, the separability of the MI EEG over the various runs was estimated in terms of DBI, FS, DI and mi. The correlation between the estimated separability and the fatigue score obtained through the KPLS model was computed using the Pearson correlation coefficient and is shown in Table 19. The first, second and third columns show the participant id, the filter banks (abbreviated as FB in the table) and their corresponding segments respectively, while the correlation values of the fatigue score with the evaluation metrics DBI, FS, DI and mi for runs 3–8 are shown in the fourth to seventh columns respectively. The correlation values for the same metrics for runs 3–7 are shown in the eighth to eleventh columns respectively. Participants S1, S2, S5 and S6 show a negative correlation between MI EEG separability and fatigue score in most cases; in some cases the correlation is positive, but most of these values are not greater than 0.55. This indicates a decrease in class separability with the rise in fatigue, supporting the assertion that a high fatigue level decreases MI EEG separability. However, Participant S3 shows a strong positive correlation with the rise in fatigue level in most cases.

Fig. 8 Average DBI values across all the 5 participants during low fatigue and high fatigue level using runs 3–8 as evaluation data
Fig. 9 Average Fisher scores across all the 5 participants during low fatigue and high fatigue level using runs 3–8 as evaluation data
Fig. 10 Average mutual information values across all the 5 participants during low fatigue and high fatigue level using runs 3–8 as evaluation data
Fig. 11 Average Dunn’s index values across all the 5 participants during low fatigue and high fatigue level using runs 3–8 as evaluation data
Fig. 12 Average DBI values across all the 5 participants during low fatigue and high fatigue level using runs 3–7 as the evaluation data
Fig. 13 Average Fisher scores across all the 5 participants during low fatigue and high fatigue level using runs 3–7 as the evaluation data
Fig. 14 Average mutual information values across all the 5 participants during low fatigue and high fatigue level using runs 3–7 as the evaluation data
Fig. 15 Average Dunn’s index values across all the 5 participants during low fatigue and high fatigue level using runs 3–7 as the evaluation data

Table 17 Runs categorized as low or high fatigue level
Table 18 Statistical test on the class separability of low and high fatigue
Table 19 Correlation of MI EEG separability with fatigue score over various runs

5 Discussion

5.1 MI and mental fatigue: inter-relationship

This study aims to investigate the inter-relationship between MI and mental fatigue, i.e., whether a prolonged sequence of motor imagery tasks induces mental fatigue and whether mental fatigue affects MI performance. The subjective scores evaluated by means of CFS and VAS-F support the view that prolonged motor imagery induces substantial mental fatigue. Our KPLS model based on EEG spectral powers showed an orderly progression towards high fatigue, which was then confirmed by the subjective scores obtained through the fatigue scale reported after each run.

The EEG analysis during motor imagery presents the change in EEG spectral power with the increase in fatigue. However, the change in spectral power was found to be most significant in the range of 0.1–12 Hz. No significant indication of fatigue was found above 13 Hz, that is, in the beta band. As reported by Craig et al. (2012), out of five studies, beta activity showed no significant change in two studies (Caldwell et al. 2002; Papadelis et al. 2006), a significant increase in two studies and a decrease in one study. Beta activity in relation to fatigue is not properly understood and its functional role remains unclear (Jensen et al. 2005). Some literature reports that increased beta activity indicates slowed motor behaviour (Craig et al. 2012; Zhang et al. 2008), while other findings suggest that slowed cognitive performance decreases the beta power (Jensen et al. 2005). Many studies have not reported the activity in the beta band but have mainly concentrated on the hypothesised fatigue-related increase in delta, theta and alpha powers. In agreement with the findings of Trejo et al. (2015), Cao et al. (2014), Pomer-Escher et al. (2014) and Jap et al. (2009), our study also demonstrates that theta, alpha and delta activities increase with the increase in fatigue. In the literature, delta activity has received rather little attention as it is a low frequency signal and is mainly influenced by artifacts like breathing or movement (Lal and Craig 2002). However, with advanced artifact removal techniques, changes in delta activity can be reported reliably (Lal and Craig 2002).

Cognitive fatigue manifests itself as changes in the spectral power distribution (Kar et al. 2010), and spectral entropy captures the spread of this distribution in a broader context. However, no significant change in spectral entropy is found in this study.

This study analysed the changes in EEG spectral power and spectral entropy in different scalp areas over the entire period of the motor imagery tasks, aiming to investigate how they change with time and with varying fatigue level. The study was carried out in five different areas of the cortex: frontal, parietal, central, occipital and temporal. The EEG analysis shows how much these areas are affected by fatigue while performing motor imagery. The study shows that delta activity was more prominent in the temporal area, while theta and alpha activities were more prominent in the frontal area. The parietal area shows a significant change before and after the experiment in alpha activity only, while the frontal and temporal areas show significant changes in all the three spectral powers. The occipital lobe shows a significant change in theta power.

The trends of the growth of fatigue estimated based on subjective scores reported by the participants after each run and the EEG spectral power estimated based on KPLS model show a strong correlation between them. For the 6 participants, who completed the whole experiment, the KPLS model shows an orderly growth towards fatigue state from active state. It can be seen from the results that the participants experienced fatigue at different rates. Most of the participants entered the fatigue region (positive KPLS scores) during the 4th run of the experiment while during the 5th run all the participants entered into the region of fatigue. Participant S6 entered the fatigue region the earliest.

The analysis of the effect of mental fatigue on motor imagery portrays a distinct relationship between motor imagery performance and mental fatigue. The evaluation of MI EEG separability during low and high fatigue levels shows that MI EEG separability is lower during the high fatigue level than during the low fatigue level. The evaluation of MI EEG separability over the various runs shows a negative correlation or a small positive correlation with the fatigue score obtained through KPLS for Participants S1, S2, S5 and S6 in terms of all the four evaluation metrics. However, for Participant S3, the analysis shows a strong positive correlation between MI EEG separability and mental fatigue in most cases. Although there exists no BCI literature for comparison of mental fatigue and MI EEG separability, the literature from other disciplines may be used for comparison. Jackson et al. (2001) identified psychological flow as a key requirement for excellent performance. Flow is defined by Romero and Calvillo-Gámez (2013) as effortless attention, where a person can concentrate deeply without much effort. This is contrasted with effortful attention. However, while maintaining effortful attention, participants may sometimes excel in their performance, which presents a challenge in developing an adaptive MI EEG BCI.

5.2 Implications: towards adaptive motor imagery BCI by tracking mental fatigue

Mental fatigue is related to other cognitive states such as drowsiness, loss of attention, decreased arousal and lower focus level, which can also be regarded as symptoms of mental fatigue (Cao et al. 2014). Regardless of which symptom is most responsible for the relationship between mental fatigue and motor imagery performance, the findings of this study strongly suggest that a distinct relationship between mental fatigue and motor imagery performance exists. This necessitates the design of adaptive BCI systems that are more robust to changes in fatigue level. Moreover, the decrease in class separability of motor imagery signal features with the rise in fatigue level indicates that mental fatigue can be used as a metric to design such adaptive MI-BCIs.

One ubiquitous hurdle in developing an adaptive BCI based on mental fatigue is the accurate detection of the point at which the rise in fatigue level calls for adaptation. A slight increase in mental fatigue may sometimes increase the MI-EEG separability, as discussed in Section 5.1. Hence, it is vital that the control system is able to distinguish between “when to adapt” and “when not”. Another substantial concern is whether it is correct to refer to mental fatigue as a “cognitive state” or a “cognitive process”. When a person experiences fatigue, their level of fatigue keeps changing; in other words, one can refer to mental fatigue as a “cognitive process”. In such cases, instead of adapting between high and low fatigue or between alert and fatigue states, adapting the system based on the level of fatigue holds promise. This would indicate how much adaptation of the system is needed at a given level of fatigue.

6 Conclusion

This study investigates the inter-relationship between MI and mental fatigue based on EEG analysis. It is observed that prolonged sequences of motor imagery induce mental fatigue. Moreover, the rise in fatigue level affects the motor imagery signal features, indicating the need for BCI adaptation. Future work should consider designing an adaptive MI-BCI that is more robust to changes in fatigue level.

A limitation of this study is the size of the sample set. The sample consists of 11 participants, of whom only 6 completed the whole experiment. Another limitation is the use of self-reported scores as the validation measure. Although self-report is the most convenient method for estimating fatigue, a combination of self-report ratings with other measures such as response accuracy or behavioral activity would reinforce the study.