1 Introduction

Maintaining vigilance above a constant level is vitally important in many settings, for example for pilots in aircraft, drivers in moving vehicles, security forces in defense systems and operators of high-accuracy process controllers. In all these settings, people perform repetitive, monotonous and long-term tasks. Kecklund and Akerstedt (2004), Warm et al. (2008), Lin et al. (2014) and Schutte (2017) found that long-term and monotonous tasks often lead to a drop in the vigilance and performance capability of operators. Recent studies by Möckel et al. (2015), Epstein et al. (2016), Stapp and Karr (2018) and Seiter (2018) show that recess increases performance capability and thereby helps to solve tasks in an efficient and creative way. On the other hand, for safety at work, systems need to continuously monitor the operator’s vigilance level and make appropriate interventions when declining vigilance is detected. Such a warning system can become futile if the warning signals are ignored due to inattentional deafness or cognitive tunnelling. Ideally, a sophisticated system would accurately detect the vigilance level of an operator prior to assigning them to a critical task and then continuously monitor them during the task (Shingledecker et al. 2010; Corver and Grote 2016).

Several bio-behavioural signatures have been developed to monitor vigilance of operators in working environments, including eye closure (Damousis et al. 2009; Sigari 2009; McIntire et al. 2014a, b), facial expression (Sigari 2009), head position (Bergasa et al. 2006), blood flow velocity (Shaw et al. 2013) and heart beats (Zhang and Liu 2012; Lee et al. 2016). Corresponding to this, the authors in Damousis et al. (2009) used eyelid-related features from electrooculogram (EOG) signals to develop a fuzzy expert system that can predict hypo-vigilance and help in providing early warning signals. A similar hypo-vigilance detection work has been reported in Sigari (2009), which is based on facial image processing and where percentage of eye closure, eye closure rate and eyelid distance changes have been used to detect hypo-vigilance among individuals. Monitoring the same parameters as in Sigari (2009) along with blink frequency, nodding frequency, face position and fixed gaze, authors in Bergasa et al. (2006) detected vigilance in drivers. They developed a hardware system and its software implementation, by acquiring drivers’ images in real-time using an active infrared illuminator and utilized a fuzzy classifier to infer the level of inattentiveness of the drivers.

Other works worth mentioning utilized pupillometry (McIntire et al. 2014a, b), wherein the authors measured eye features such as pupil diameter, pupil eccentricity and pupil velocity for vigilance detection. However, pupillometry measures reflect the combined influences of cognitive arousal, anxiety, age, fatigue, intelligence, illness and medication and, therefore, do not provide an actual vigilance estimate (Cacioppo 2016; Piquado et al. 2010; Rozado et al. 2015; Hallowell and Chapman 2014; Dollaghan et al. 2012). Körber et al. (2015) used the response time variations of the users and the Dundee Stress State Questionnaire in addition to eye-related features to study the variability of vigilance in individuals. Although blink-related features appear to be a reasonable contender for hypo-vigilance detection, the literature (Damousis et al. 2009) demonstrates that these features are neither exact nor sufficiently dependable, since they exhibit strong inter-personal and intra-personal variability.

Related studies in recent years have demonstrated that electroencephalography (EEG) is a highly effective neurophysiological indicator for assessing vigilance (Yildiz et al. 2009; Choi et al. 2014; Sauvet et al. 2014; Zhang et al. 2016, 2017; Zheng and Lu 2017). EEG is a popular brain imaging modality with a high temporal and spatial resolution, which is sufficiently lightweight to be worn in operational settings (Guger et al. 2009). Further, an EEG device satisfies both convenience criteria (that is, non-intrusiveness, non-obtrusiveness, and simplicity) and effectiveness criteria (that is, sensitivity, efficiency, and compatibility) (He et al. 2015).

Along these lines, Gu et al. (2011) estimated vigilance using an integrated hierarchical Gaussian mixture model in which power spectral density (PSD) and error score are taken as features. Nevertheless, the approach requires further study for generalization. Choi et al. (2014) proposed a hidden Markov model-based hypo-vigilance detection technique using EEG signals for Unmanned Combat Aerial Vehicle operators. Yu et al. (2007) proposed a method to distinguish between two vigilance states (that is, sleep and awake) using the spatio-temporal features of the EEG signals. Sauvet et al. (2014) used the spectral power of each frequency band of EEG signals and their ratios as features to detect vigilance. A correlation between wavelet coefficients of the EEG signal bands and the vigilance state of a person was established using a sparse representation of EEG signals in Yu et al. (2010). Zhang et al. (2016) detected drivers’ vigilance level by combining a sparse representation of PSD with k-singular value decomposition. Zhang et al. (2017) presented a fatigue detection system for high-speed train safety that assesses the driver’s vigilance through a wearable EEG. Yildiz et al. (2009) presented a new application of an adaptive neuro-fuzzy system model for estimating vigilance from EEG signals. Detection of vigilance decrement has also been made possible using event-related potentials (ERPs). In this regard, Martel et al. (2014) utilized EEG predictors, namely N100, N200 and P300, in their covert vigilant attention task by observing the emergence and accumulation of increased activity in the \(\alpha\)-frequency range (8–14 Hz) 10 s before a missed target, along with a significant gradual attenuation of the P300 ERP, which was found to antecede misses by 5 s. Giraudet et al. (2015) used auditory stimuli to evoke P300 and N100 ERPs and used them as an indicator of inattentional deafness. Recently, Huang et al. (2017) utilized the Vigilance Algorithm Leipzig to study the effect of finely differentiated EEG-vigilance stages (indicating arousal states) on evoked potentials (P100, N100, P200, N300, mismatch negativity and P300) and behavioural performance. They identified various vigilance stages such as active wakefulness, relaxed wakefulness, drowsiness and sleep onset. Their study provides a basis for using ERPs to indicate various vigilance states.

However, several research challenges are to be addressed for the effective use of EEG signals in vigilance detection/estimation. Vigilance decrement is a dynamic changing process because the intrinsic mental states of users involve temporal evolution rather than a time point. This process cannot simply be treated as a function of the duration of time while engaged in tasks. The ability to predict vigilance levels with high temporal resolution is more feasible in real-world applications. Vigilance cannot be simply classified into several discrete categories (namely, alert, drowsy and sleep) but should be quantified into different levels. We still lack a standardized method for measuring the overall vigilance levels of humans.

The previous studies have focused on applying EEG features for developing vigilance detection systems without providing a numeric quantification of the different vigilance states. In this paper, an approach for vigilance estimation by combining ERPs (N100 and P300) and eye blink rate has been presented. ERPs have high temporal resolution that allows for the measurement of brain activity within milliseconds without any propagation delay. N100 ERP is associated with an individual’s pre-attention and perception. It usually affects neural activity in the human brain while performing a discrimination task (Haider et al. 1964; Revolvy 2017). On the other hand, P300 ERP elicitation is linked with a rare event that initiates cognitive and mental processing (Fröhlich 2016). It is evoked post-stimulus as a result of attending or responding to target stimuli. ERPs and eye blink rate represent internal cognitive states and external subconscious behaviours, respectively. These two modalities contain complementary information and can be integrated to construct a more robust vigilance estimation model.

With an aim to address the limitations of the current hypo-vigilance detection systems, we propose a fuzzy rule-based system which can satisfactorily discriminate between various states of human vigilance. The objectives of our work are the following:

  1. Establish the correlation between ERPs (that is, N100 and P300) and the vigilance activity.

  2. Establish a numerical relationship between ERPs (N100 and P300) and eye blink rate, such that these signals can be used in combination to obtain improved accuracy in vigilance detection.

  3. Observe the impact of recess on the performance of monotonous tasks demanding sustained attention.

The rest of the paper is organized as follows: Sect. 2 presents the design of the experiment, data collection procedure, description of ERPs and eye blink extraction from the EEG signals and the detailed explanation for vigilance estimation using fuzzy model. Section 3 presents the results obtained from the experiment. In Sect. 4 the detailed interpretations of the results have been discussed. Finally, Sect. 5 concludes the paper.

2 Materials and methods

2.1 Subjects

Ten healthy participants (male: 8, female: 2), comprising research scholars and post-graduate students available on campus at IIT Kharagpur, were randomly selected for this study. The participants had no history of mental ailment and their age ranged between 24 and 32 years. Each participant had normal or corrected-to-normal vision and was right-handed. The participants were not sleep deprived, had no deviations from their usual circadian cycle, and had taken no medicine or alcohol. They were asked to refrain from having tea or coffee for 3 h before the experiment. Informed consent was obtained from all participants before conducting the experiment. An appropriate certificate of approval (IIT/SRIC/SAO/2017) was also obtained from the Institutional Ethical Committee at IIT Kharagpur.

2.2 Subjective measures

Before performing the vigilance task, each participant was requested to fill in their background information along with the pencil-and-paper version of the Global Vigor and Affect (GVA) form, whose scale ranges from 0 to 100 (Monk 1989). This helps in the subjective analysis of the affective state (feelings, mood) and level of vigor (alertness, vigilance) of each participant. Moreover, the subjects also filled in a Visual Analogue Scale (VAS) to indicate their present mood. The VAS ranges from 0 to 10, where ‘10’ signifies a “happy” mood and ‘0’ corresponds to a “sad” mood.

Further, immediately after the completion of the session, the participants reported their present state using the VAS form. Along with this, the participants also rated the task difficulty using the NASA-TLX (Center 2017) questionnaire on a 20-point scale.

2.3 Vigilance task

In this paper, we used the Mackworth Clock Test (Mackworth 1948) implemented in the Psychology Experiment Building Language (Mueller 2017) for the vigilance detection experiment. In this task, a participant monitors a red pointer that moves in a circle against a black background in steps, like the seconds’ hand of a clock, and responds when the clock hand makes a random jump. The probability of the clock skipping a normal step is 0.4. Each shift of the pointer constitutes a trial. The pointer shifts to a new position every 1 s throughout the experiment, such that it completes one full circular round after at most 60 shifts. In a session, whenever the clock skips a normal step, the user has to promptly press the ‘space bar’ key. Note that the size of the pointer and the radius of the clock can be varied as required.
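The trial structure described above can be illustrated with a short simulation. This is only a sketch under stated assumptions, not the PEBL implementation used in the study: the function name `mackworth_trials`, the 60-position clock face and the modelling of a skip as a double step are all assumptions made for illustration.

```python
import random

def mackworth_trials(n_trials, p_skip=0.4, n_positions=60, seed=None):
    """Return a list of (position, is_target) tuples, one per 1-s trial.

    Assumption: a normal trial advances the pointer by one position and a
    target trial (probability p_skip) advances it by two, i.e. one step
    is skipped.
    """
    rng = random.Random(seed)
    position = 0
    trials = []
    for _ in range(n_trials):
        is_target = rng.random() < p_skip
        step = 2 if is_target else 1
        position = (position + step) % n_positions
        trials.append((position, is_target))
    return trials

# Phase-I of the protocol: 20 min at one trial per second.
trials = mackworth_trials(1200, seed=0)
n_targets = sum(is_target for _, is_target in trials)
print(f"{n_targets} targets out of {len(trials)} trials")
```

With a skip probability of 0.4, roughly 40% of the trials are targets on average, so the 1200-trial phase contains on the order of 480 target events.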

2.4 Procedure

The experiment was carried out in a quiet and isolated room with a maintained room temperature, where each participant was seated comfortably. A large 20-in. monitor was placed approximately 65 cm away from the participants for presenting visual stimuli. Initially, to acclimatize each participant, we asked them to relax for 10 min. Thereafter, in the next 5 min, we asked participants to fill in the subjective questionnaires, namely GVA and VAS. Through these questionnaires, we assessed the physiological health of participants via their own judgements about various parameters relating to them. After this task, a 5 min instruction, demonstration and practice session was arranged for each participant, which was followed by baseline EEG data recording. During baseline data recording, the participants were asked to sit idle for 5 min with restricted body movement. During the complete session, participants were asked to avoid, as far as possible, movement of any kind except for responding to the target stimuli. After baseline data recording, we performed a 20 min clock test (phase-I), which involves 1200 trials. Next, participants were again asked to fill in the subjective questionnaires, namely VAS and NASA-TLX, for assessing the task difficulty, present mood and effort required during the experiment. Further, a 5 min baseline data recording was done again to capture changes in the mental stress/load of the participants, which was followed by a repetition of the clock test with 600 trials for 10 min (phase-II). The experiment was conducted once with each participant, and only a single participant’s data were recorded in a day. The data were recorded between 7:00 AM and 10:00 AM, as per the availability of the participants. The complete procedure is shown graphically in Fig. 1.

Fig. 1 Description of the experimental protocol

2.5 Data recording

The EEG data were recorded using the Emotiv Epoc+ device with 14 electrodes (following the 10–20 international system) at a sampling rate of 128 Hz. The 14 channels present in the device are AF3, F7, F3, FC5, T7, P7, O1, O2, P8, T8, FC6, F4, F8 and AF4, with 2 references at the P3/P4 locations. The bandwidth of the device is 0.2–43 Hz. The data are transferred over Bluetooth, which operates in the 2.4 GHz band. The average battery life of the device is around 9 h. All recorded EEG data are digitally filtered in the range of 0.1–30 Hz, a frequency range commonly used in ERP-based studies. Signals are analysed using MATLAB® version 2014a running on a PC with the following configuration: Processor: Intel(R) Core(TM) i3-3240, CPU @ 3.40GHz, RAM: 4.00 GB, System Type: 64-bit Operating System, x64-based processor.

In Emotiv Epoc+ device electrodes are absent at Fz, Cz, Pz and Oz locations. Also, the reference locations in the headset are usually behind the ears, that is approximately at P3/P4 positions. Thus, a significant component of P300 ERP which is prominently detected at the central locations of the scalp (Geoff and CTO 2010; Polich 2007) gets subtracted from all other channels if we select P3/P4 as reference electrodes. This results in considerable reduction in the magnitude of P300 peak. Therefore, to minimize this effect, we have considered alternate reference locations for collecting data (see Fig. 2) and extracted P300 from central location by averaging the central pair, namely O1/O2, F3/F4, AF3/AF4 and P7/P8 (Geoff and CTO 2010).

Fig. 2 Alternate reference location

Besides, it has been established in the literature (Mangun and Hillyard 1991), that N100 deflection can be detected at most recording sites, including the occipital, parietal, central and frontal electrode sites. Also, N100 peaks earlier over frontal than posterior regions of the scalp, suggesting distinct neural and/or cognitive correlation (Mangun and Hillyard 1991). The objectives of our work are to (Ciesielski and French 1989), and the visual N100 component is usually largest over the occipital region (Hopf et al. 2002). Hence, keeping these in mind, we have gathered the information about N100 ERP from F3, F4, AF3, AF4, P7, P8, O1 and O2 electrodes. Further, it has also been observed that blink signatures in EEG data are immediately recognizable from the AF3 channel.

2.6 ERP and eye-blink detection

The detection of ERPs and eye blinks from EEG involves several steps, which are described in detail below:

  1. ERP detection: EEG data are susceptible to noise from various electro-physiological sources. Hence, to remove the noise and to extract the desired features (ERPs) from EEG, we perform the following:

    (a) Filtering: The recorded raw signals are first pre-processed to remove artifacts and to retain the crucial information. For this, we applied a Chebyshev high-pass filter (cut-off frequency of 0.1 Hz) to remove disturbing components arising from breathing and from slow voltage changes in neuronal and non-neuronal sources. We also used a Chebyshev low-pass filter (cut-off frequency of 30 Hz) to eliminate noise arising from muscle movements. Besides, we applied a notch filter with a null frequency of 50 Hz at recording time to reject the strong 50 Hz power-line interference, which is aggravated by impedance fluctuation, cable defects, electrical noise and unbalanced electrode impedances. Further, we performed independent component analysis to decompose the EEG signals into independent components using EEGLAB (Delorme and Makeig 2004). The resulting components marked as artifacts (that is, eye blinks or EMG artifacts) were discarded from the subsequent process. The remaining ones, classified as signal components, were back-projected to reconstruct artifact-free EEG signals.

    (b) Epoch marking: In this step, we extract the corresponding event epochs from the EEG signals. This is accomplished by identifying the time instants at which the target and the non-target stimuli occur, so that every target and non-target event is marked. Here, ‘target’ events are the instants where the clock skips a step, and ‘non-target’ events are the normal ticks of the clock. Each epoch spans from \(-500\) to 1000 ms.

    (c) Baseline removal: This step removes the artifacts that arise from low-frequency drifts and lead to data skewness. Baseline removal also eliminates the overall voltage offset (if any) from the waveforms in each epoch, which prevents unnecessary rejection of many trials owing to such an offset. Figure 3 describes the procedure followed for baseline removal.

    (d) Trial averaging: To increase the signal-to-noise ratio (SNR) of the ERPs, we applied a temporal processing method (ensemble averaging) to a large number of trials (Cohen 2014; Nidal 2014). Further, to obtain recognizable ERP waveforms, the post-stimulus data are averaged separately for the target and the non-target stimulus sequences. After obtaining the ERP signals, we identify the amplitude and latency of N100 and of P300. To visualize the ERPs obtained from the target events (phase-I), an instance of the average signal plot for all the considered channels (for one participant) is shown in Fig. 4. For more clarity, the ERPs observed at the F4 channel for target and non-target events are shown in Fig. 5.

  2. Eye blink detection: It is well known that, in a normal human being, an eye blink lasts for about 400 ms and has an amplitude of at least 40 \(\upmu\)V. Using these two parameter limits as thresholds, we detected eye blinks from the EEG signals (see Fig. 6) through the AF3 channel of the EEG device, where blinks are prominent (Khatun et al. 2016). Note that, in the figure, the red colour indicates an eye blink. To minimize the false detection of eye blinks, the recorded signals (from the AF3 channel) were divided into overlapping windows, each with a width of 110 samples, and we checked whether a complete eye blink was present in a window. If so, it was considered a true eye blink; however, if only a trough was observed in a window, then the adjacent window was checked for the presence of its crest. If the crest was also observed, it was regarded as a true eye blink and noted; otherwise, it was regarded as a false signal originating from noise and was rejected.
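The epoching, baseline removal, trial averaging and peak identification steps of the ERP detection procedure can be sketched in plain Python as follows. This is an illustrative sketch on a synthetic single-channel signal, not the MATLAB/EEGLAB pipeline used in the study. The function names, the use of the pre-stimulus mean as the baseline, and the search windows for N100 (80–150 ms) and P300 (250–500 ms) are assumptions made for illustration.

```python
FS = 128  # sampling rate of the recording (Hz)

def ms_to_samples(ms, fs=FS):
    return int(round(ms * fs / 1000.0))

def extract_epochs(signal, event_samples, pre_ms=500, post_ms=1000, fs=FS):
    """Cut one epoch per event, from -pre_ms to +post_ms around the onset."""
    pre, post = ms_to_samples(pre_ms, fs), ms_to_samples(post_ms, fs)
    epochs = []
    for onset in event_samples:
        if onset - pre >= 0 and onset + post <= len(signal):
            epochs.append(signal[onset - pre:onset + post])
    return epochs

def remove_baseline(epoch, pre_ms=500, fs=FS):
    """Subtract the mean of the pre-stimulus interval (assumed baseline)."""
    pre = ms_to_samples(pre_ms, fs)
    baseline = sum(epoch[:pre]) / pre
    return [v - baseline for v in epoch]

def average_epochs(epochs):
    """Ensemble average across trials, sample by sample."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]

def find_peak(erp, start_ms, end_ms, polarity, pre_ms=500, fs=FS):
    """Return (amplitude, latency_ms) of the extreme value in a window.

    polarity < 0 searches for the most negative sample (e.g. N100),
    polarity > 0 for the most positive one (e.g. P300).
    """
    pre = ms_to_samples(pre_ms, fs)
    lo = pre + ms_to_samples(start_ms, fs)
    hi = pre + ms_to_samples(end_ms, fs)
    pick = min if polarity < 0 else max
    idx = pick(range(lo, hi), key=lambda i: erp[i])
    return erp[idx], (idx - pre) * 1000.0 / fs

# Synthetic data: constant 5 uV offset plus an N100-like dip and a
# P300-like bump after each hypothetical target onset.
signal = [5.0] * (FS * 10)
events = [256, 512, 768]
for onset in events:
    signal[onset + 13] -= 4.0   # about 102 ms after onset
    signal[onset + 38] += 6.0   # about 297 ms after onset

epochs = [remove_baseline(e) for e in extract_epochs(signal, events)]
erp = average_epochs(epochs)
n100 = find_peak(erp, 80, 150, polarity=-1)
p300 = find_peak(erp, 250, 500, polarity=+1)
print("N100 (amplitude, latency ms):", n100)
print("P300 (amplitude, latency ms):", p300)
```

In this toy example the baseline step removes the constant 5 µV offset, so the recovered amplitudes are the injected deflections themselves. On real data, averaging over many trials is what suppresses the background EEG relative to the time-locked ERP.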

Fig. 3 Baseline removal process

Fig. 4 An instance of obtained ERP plot for target events

Fig. 5 Signal plot showing comparison between target and non-target events and the ERP components

Fig. 6 Locating/detecting eye blink during a vigilance activity
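As a rough illustration of threshold-based blink detection on the AF3 channel, the following sketch flags contiguous runs of samples that reach the 40 µV amplitude limit and whose duration does not exceed the 400 ms limit. It is a deliberately simplified stand-in and does not reproduce the overlapping-window crest/trough check described in Sect. 2.6. The function names, the use of 400 ms as an upper bound and the 50 ms lower bound (to reject isolated spikes) are assumptions.

```python
FS = 128  # sampling rate (Hz)

def detect_blinks(signal_uv, fs=FS, amp_uv=40.0, min_ms=50.0, max_ms=400.0):
    """Return (start, end) sample indices of candidate blinks.

    A candidate is a contiguous run of samples at or above amp_uv whose
    duration lies within [min_ms, max_ms]. min_ms is a hypothetical
    lower bound that is not taken from the paper.
    """
    blinks = []
    start = None
    # The trailing -inf sentinel closes a run that reaches the last sample.
    for i, v in enumerate(list(signal_uv) + [float("-inf")]):
        if v >= amp_uv and start is None:
            start = i
        elif v < amp_uv and start is not None:
            duration_ms = (i - start) * 1000.0 / fs
            if min_ms <= duration_ms <= max_ms:
                blinks.append((start, i))
            start = None
    return blinks

def blink_rate_per_min(blinks, n_samples, fs=FS):
    """Blink rate in blinks per minute for a recording of n_samples."""
    return len(blinks) * 60.0 * fs / n_samples

# Synthetic 10 s trace: one plausible blink, one short spike, one long drift.
signal = [0.0] * (FS * 10)
for i in range(200, 232):   # 250 ms at 80 uV: accepted
    signal[i] = 80.0
for i in range(600, 602):   # about 16 ms spike: rejected as too short
    signal[i] = 100.0
for i in range(800, 900):   # about 781 ms plateau: rejected as too long
    signal[i] = 90.0

blinks = detect_blinks(signal)
rate = blink_rate_per_min(blinks, len(signal))
print(blinks, rate)
```

The blink rate obtained this way is the quantity that enters the fuzzy model as the third input alongside the N100 and P300 features.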

2.7 Vigilance computation

This work combines the P300 and N100 ERPs with eye blink rate to estimate the vigilance level of an individual. The parameters are fused using a fuzzy approach, which allows an event to be present in more than one category simultaneously and thereby makes the response dimensions continuous. The rationale for applying fuzzy theory is that all the inputs can be better characterized with fuzzy linguistic terms, as the inputs are not actually crisp. The following steps have been followed to build the fuzzy model:

  1. Fuzzification: In our vigilance detection task, to quantify the uncertainty inherent in the response, we utilized three parameters (N100, P300 and eye blink rate) as inputs to evaluate the vigilance level of an individual. The three variables which vary with the vigilance level are (1) the elicitation time (t) of N100 and P300, (2) the amplitude (a) of N100 and P300 and (3) the eye blink rate. During fuzzification, these variables are defined linguistically (Mamdani approach) based on the range they cover (refer to Table 1). Let us denote P300 as P and N100 as \(\mathbb {N}\); then for amplitude, we define three linguistic states, namely low amplitude (LA), medium amplitude (MA) and high amplitude (HA). Similarly, the linguistic states for time are before time (BT), optimum time (OT) and after time (AT). Finally, blink rate is categorized into three states: fast blink (FB), normal blink \((\eta {B})\) and slow blink (SB). Now, depending on the amplitude and time, P300 (P) is divided into four fuzzy sub-categories: No P300 (NP), Low P300 (LP), Moderate P300 (MP) and High P300 (HP). Similarly, N100 \((\mathbb {N})\) is divided into No N100 (N\(\mathbb {N})\), Low N100 (L\(\mathbb {N})\), Moderate N100 (M\(\mathbb {N})\) and High N100 (H\(\mathbb {N})\). The above definitions of the P300 and N100 signals can be represented mathematically as follows:

    $$\begin{aligned} \begin{aligned} Z_{a,t}=Y&, {\text {where}}, \\&\ Z\ {\text {represents}}\, {\text {considered}}\, {\text {ERP}}\ {\text {features}}, \\&Z\in \left\{ {\text {P}}, \mathbb {N} \right\} , \\&a\in \left\{ {\text {LA, MA, HA}} \right\} , \\&t\in \left\{ {\text {BT, OT, AT}} \right\} {\text {and}} \\&Y\ {\text {is}}\, {\text {the}}\ {\text {set}}\ {\text {of}}\ {\text {all}}\ {\text {possible}}\ {\text {states}}\ {\text {of}} \\&{\text {considered}}\ {\text {ERP}}\ {\text {features}}, \\&Y\in \left\{ {\text {NP, LP, MP, HP, N}}\mathbb {N},{\text {L}}\mathbb {N},{\text {M}}\mathbb {N},{\text {H}}\mathbb {N} \right\} \end{aligned} \end{aligned}$$
    (1)

    Figure 7 shows the relationship between amplitude and time for N100 and P300 signals. In this fuzzy rule base system, vigilance has been graded into four classes: no vigilance (NV), low vigilance (LV), moderate vigilance (MV) and high vigilance (HV).

  2. Fuzzy rule base: The ERP signals form an intermediate state of the fuzzy system and are fuzzily defined as no ERP (NE), low ERP (LE), moderate ERP (ME) and high ERP (HE). Vigilance determination is accomplished using two-level fuzzy rules. In the first level, the intensity of the ERP signals is identified from the N100 and P300 signals; next, the vigilance level is determined using the intensity of the ERP signals and the blink rate. To clarify the notion, the fuzzy inference system for vigilance estimation is shown in Fig. 8 and the overall fuzzy rule base matrix is shown in Fig. 9. Besides, the proposed logic for calculating vigilance from the ERP signals and blink rate is given in Eq. (2).

    $$\begin{aligned} {\text {Vigilance}}=(P300 \vee N100)\wedge {\text {Blink}} \ {\text {Rate}} \end{aligned}$$
    (2)

    The logic behind Eq. (2) is that under the effect of target stimulus, both P300 and N100 are elicited; however, the environmental noise deteriorates both P300 and N100. In many cases, these ERPs become completely invisible or either of them may be present with very low magnitude. In order to address this issue, we perform OR operation between P300 and N100, so as to obtain values from at least one of the two ERPs. Next, we use AND operator between the obtained intermediate result and the blink rate (which is independent of ERPs) to quantify the vigilance. The rule base contains a total of 48 rules for all the possible instances. At any particular instance (that is, a given condition or input), membership values along with rule strength are computed.

  3. Defuzzification: To extract deeper insight from the results obtained, the fuzzified values are defuzzified into crisp forms. We have used the Mean of Maximum (MoM) method to obtain the crisp values. The MoM method generates a quantity which represents the mean value of all outputs whose membership functions reach the maximum (Novák et al. 2016). This method is known to provide the most plausible result. In other words, the fuzzy controller uses the typical value of the consequent term of the most valid rule as the crisp output value. Let A be a fuzzy set with membership function \(\mu _A(x)\) defined over \(x\in X\), where X is a universe of discourse. The defuzzified value \(x^*\) of the fuzzy set A is defined as in Eq. (3):

    $$\begin{aligned} x^*=\frac{\sum _{x_i \in M}x_i}{\mid M \mid }, \end{aligned}$$
    (3)

    where

    $$\begin{aligned} M= \lbrace x_i \mid \mu _A(x_i)\ {\text {is equal to the height of the fuzzy set}}\ A \rbrace \end{aligned}$$

    and \(\mid M \mid\) is the cardinality of the set M.

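Equation (3) can be illustrated with a short Python sketch over a sampled universe of discourse. The function name `mean_of_maximum`, the tolerance used to compare membership values and the example membership values are hypothetical and are not taken from Table 1.

```python
def mean_of_maximum(xs, memberships, tol=1e-9):
    """Mean of Maximum defuzzification over a sampled universe.

    xs          -- sampled points x_i of the universe of discourse
    memberships -- membership degrees mu_A(x_i) of the aggregated output
    Returns the mean of all x_i whose membership equals the height of A.
    """
    height = max(memberships)
    maximisers = [x for x, mu in zip(xs, memberships) if height - mu <= tol]
    return sum(maximisers) / len(maximisers)

# Hypothetical aggregated output set: a triangle clipped at 0.8, so the
# maximum is reached on a plateau rather than at a single point.
xs = list(range(0, 11))
mu = [0.0, 0.0, 0.2, 0.4, 0.6, 0.8, 0.8, 0.8, 0.6, 0.3, 0.0]
x_star = mean_of_maximum(xs, mu)
print(x_star)  # the plateau covers x = 5, 6, 7, so x* = 6.0
```

Because only the maximisers contribute, the shape of the membership function away from its plateau has no influence on the crisp output, which is what distinguishes MoM from centroid-based defuzzification.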
Table 1 Input and output value ranges of fuzzy variables
Fig. 7 Relationship between amplitude and time for N100 and P300 signals

Fig. 8 Fuzzy inference system for vigilance estimation

Fig. 9 Fuzzy rule base matrix
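The combination logic of Eq. (2) can be sketched as follows. The sketch assumes the standard max and min operators for the fuzzy OR and AND, which is the usual choice in Mamdani-type systems but is not stated explicitly in the text. The function names and membership values are hypothetical. The last lines show one reading that is consistent with the reported count of 48 rules, namely one rule per combination of the four P300 states, four N100 states and three blink-rate states.

```python
from itertools import product

def fuzzy_or(a, b):
    """Fuzzy OR, assumed here to be the max operator."""
    return max(a, b)

def fuzzy_and(a, b):
    """Fuzzy AND, assumed here to be the min operator."""
    return min(a, b)

def rule_strength(mu_p300, mu_n100, mu_blink):
    """Firing strength following Eq. (2): (P300 OR N100) AND blink rate."""
    mu_erp = fuzzy_or(mu_p300, mu_n100)      # first level: ERP intensity
    return fuzzy_and(mu_erp, mu_blink)       # second level: vigilance

# P300 is degraded by noise but N100 is still visible: the OR keeps the
# evidence from the surviving ERP, and the AND caps it by the blink term.
print(rule_strength(mu_p300=0.1, mu_n100=0.7, mu_blink=0.6))

# Both ERPs are absent: the rule does not fire, whatever the blink rate.
print(rule_strength(mu_p300=0.0, mu_n100=0.0, mu_blink=0.9))

# Enumerating the linguistic input combinations.
p300_states = ["NP", "LP", "MP", "HP"]
n100_states = ["NN", "LN", "MN", "HN"]
blink_states = ["FB", "NB", "SB"]
combinations = list(product(p300_states, n100_states, blink_states))
print(len(combinations))
```

The consequent assigned to each combination (NV, LV, MV or HV) is given by the rule base matrix in Fig. 9 and is not reproduced here.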

3 Results

To observe accurately the various phenomena recorded in the EEG signals, the data recorded in the two phases of the experiment were divided into equal segments of 2 min, such that the data recorded in phase-I and phase-II comprise 10 and 5 equal parts, respectively. The other analyses carried out and the results obtained therein are discussed below:

3.1 Behavioural analysis

  (a) Global Vigor and Affect Scale (GVA): We utilized one-way ANOVA and Tukey’s pairwise comparison of means of factors for analysing the Global Vigor (GV) and Global Affect (GA) subjective ratings. From the results, it can be observed that the considered factors for GV (mean, \(\mu\) = 74.8 and standard deviation, \(\sigma\) = 0.86) and GA (\(\mu\) = 52.825 and \(\sigma\) = 0.89) are significant at \(\alpha\) = 5% (significance level). The contributing factors of GV have the following \(\mu\) and \(\sigma\) values: alert (\(\mu\) = 6.750, \(\sigma\) = 1.620), effort (\(\mu\) = 3.050, \(\sigma\) = 1.499), weary (\(\mu\) = 2.700, \(\sigma\) = 1.418) and sleepy (\(\mu\) = 2.100, \(\sigma\) = 1.524). The post hoc Tukey’s comparison test (refer to Fig. 10a) revealed that the alert factor is significantly different from the effort, weary and sleepy factors at p < 0.001, such that participants self-reported higher levels of alertness than effort, weariness and sleepiness. Likewise, the \(\mu\) and \(\sigma\) values for the involved GA factors are as follows: happy (\(\mu\) = 7.450, \(\sigma\) = 1.212), calm (\(\mu\) = 6.900, \(\sigma\) = 2.234), sad (\(\mu\) = 2.200, \(\sigma\) = 1.549) and tense (\(\mu\) = 1.900, \(\sigma\) = 1.912). The Tukey’s comparison test (refer to Fig. 10b) reveals that the happy factor is significantly higher than sad and tense at p < 0.001, and that calm is likewise significantly higher than sad and tense at p < 0.001. This indicates that the participants self-reported higher levels of happy and calm states than sad and tense states before the experiment.

  (b) Visual Analogue Scale (VAS): For the vigilance experiment, it is important to observe changes in the mood of the participants before and after the experiment. Hence, we performed a subjective analysis using VAS. We set the hypotheses as follows: \(H_0\) = the null hypothesis: the mood of each participant remains the same throughout the experiment; \(H_a\) = the alternate hypothesis: the mood of each participant before and after the experiment differs significantly; and performed a paired t test. The results give a p value of 0.009 (< 0.05) and a t value of 3.33, which is strong evidence against the null hypothesis; the null hypothesis is therefore rejected (refer to Fig. 11).

  (c) NASA-TLX: Similar to GVA, here we used one-way ANOVA and Tukey’s pairwise comparison test for multiple mean comparisons of the NASA-TLX subjective load index (\(\mu\) = 12.09 and \(\sigma\) = 1.69). The results reveal that the factors considered are significant at \(\alpha\) = 5% (significance level). The mean and standard deviation for each of the factors are as follows: mental demand (\(\mu\) = 14.000, \(\sigma\) = 2.981), physical demand (\(\mu\) = 6.76, \(\sigma\) = 4.86), temporal demand (\(\mu\) = 10.22, \(\sigma\) = 4.05), performance (\(\mu\) = 13.80, \(\sigma\) = 3.38), effort (\(\mu\) = 10.88, \(\sigma\) = 3.26) and frustration (\(\mu\) = 7.06, \(\sigma\) = 3.91). Further, Tukey’s pairwise comparisons (refer to Fig. 12) are used for grouping significant and non-significant comparisons. From the assessment of NASA-TLX, we observed that mental demand differs significantly from both physical demand (at p < 0.001) and frustration (at p < 0.002), being rated higher than both. Also, physical demand differs significantly from performance at p < 0.002, and performance differs significantly from frustration at p < 0.003.
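For reference, the paired t statistic used in the VAS analysis in item (b) can be computed as sketched below. The function name `paired_t` and the before/after scores are hypothetical and serve only to show the calculation; they are not the participants' ratings.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, degrees_of_freedom) for a paired t test.

    t = mean(d) / (sd(d) / sqrt(n)), where d are the pairwise differences
    and sd is the sample standard deviation.
    """
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical VAS scores (0-10) for ten participants, before and after.
vas_before = [8, 7, 9, 6, 8, 7, 9, 8, 7, 8]
vas_after = [6, 6, 7, 6, 7, 5, 8, 6, 7, 6]
t_value, dof = paired_t(vas_before, vas_after)
print(f"t = {t_value:.2f}, df = {dof}")
```

With ten participants there are 9 degrees of freedom, for which the two-tailed critical value at \(\alpha\) = 5% is approximately 2.262; a statistic such as the reported t = 3.33 therefore exceeds it.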

Fig. 10 Difference mean plot for Global Vigor and Affect Scale (GVA)

Fig. 11 Histogram plot of the paired t test for the VAS scale

Fig. 12 Difference mean plot for factors involved in NASA-TLX questionnaire

Fig. 13 Mean and standard deviation of reaction time during the first phase of the experiment

Fig. 14 Mean and standard deviation of reaction time during the second phase of the experiment

3.2 Reaction time analysis

Performance tests help to objectively assess the degree of deterioration in cognitive performance during the tasks. This can be measured using the reaction/response time (RT) taken by each participant to respond to stimuli that occur at random intervals. Usually, vigilance decrement is marked by slow RTs, an increase in the number of errors of omission (that is, RTs \(\ge\) 500 ms) and an increase in the number of errors of commission (responses without a stimulus) (Basner and Dinges 2011; Carsten and Vanderhaegen 2015). Using interval plots, we plotted the mean reaction time (and its standard deviation) taken by each participant to correctly detect and respond to the critical stimuli in phase-I and phase-II of the experiment (refer to Figs. 13, 14).
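The per-segment summary behind such plots can be sketched as follows. The function name `segment_rt_stats`, the input format and the example values are hypothetical. The 2 min segment length and the 500 ms omission limit follow the text, and a target with no response at all is also counted as an omission, which is an assumption.

```python
from statistics import mean, stdev

def segment_rt_stats(events, segment_s=120.0, omission_ms=500.0):
    """Summarise reaction times per segment.

    events -- list of (target_onset_s, rt_ms) pairs; rt_ms is None when
              the participant did not respond to the target.
    Returns {segment_index: {"mean_rt", "sd_rt", "omissions"}}.
    """
    segments = {}
    for onset_s, rt_ms in events:
        seg = int(onset_s // segment_s)
        bucket = segments.setdefault(seg, {"rts": [], "omissions": 0})
        if rt_ms is None or rt_ms >= omission_ms:
            bucket["omissions"] += 1
        else:
            bucket["rts"].append(rt_ms)
    stats = {}
    for seg, bucket in sorted(segments.items()):
        rts = bucket["rts"]
        stats[seg] = {
            "mean_rt": mean(rts) if rts else None,
            "sd_rt": stdev(rts) if len(rts) > 1 else None,
            "omissions": bucket["omissions"],
        }
    return stats

# Hypothetical responses spanning the first two 2-min segments.
events = [(10, 300), (50, 340), (100, None),
          (130, 400), (200, 520), (230, 420)]
stats = segment_rt_stats(events)
for seg, s in stats.items():
    print(seg, s)
```

Errors of commission are not covered by this sketch, since they are responses to non-target trials and have no associated target onset.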

From Fig. 13a, we can observe that the mean response time increases rapidly during the first few minutes of the experiment and then becomes almost stable, indicating that the participants become accustomed to the task. Later parts of phase-I again show an increasing trend in the mean reaction time, which is due to mental fatigue; strangely, however, the mean reaction time decreased significantly during the last 2 min of phase-I.

The plot of the standard deviation of reaction times, shown in Fig. 13b, obtained from different participants during phase-I indicates that there is little deviation in the reaction times of different participants.

The trend line of phase-II starts at a lower level than the starting reaction time of phase-I. The plot of the standard deviation of reaction times for phase-II, shown in Fig. 14, indicates that there is little deviation in the reaction times among the participants. The plots in Fig. 15 display the errors made by each participant during phase-I and phase-II of the experiment, respectively.

Fig. 15 Number of errors made by the participants during the experiment

To evaluate the performance of the participants, accuracy is used as the evaluation criterion. To estimate the accuracy of detection, the recorded EEG data are divided into four sub-categories: true alarm (TA), true skip (TS), false alarm (FA) and false skip (FS). TA represents correct identification of a target stimulus, TS represents correct identification of a non-target stimulus, FA represents an incorrect key press at a non-target and FS represents failure to identify a target stimulus. Based on these data, the accuracy is calculated using Eq. (4). The plots of accuracy of detection for both phases of the experiment can be seen in Fig. 16.

$$\begin{aligned} \mathrm{Accuracy} = \frac{{\text {TA}}+{\text {TS}}}{{\text {TA}}+{\text {TS}}+{\text {FA}}+{\text {FS}}} \end{aligned}$$
(4)
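A small worked instance of Eq. (4) may help. The function name `detection_accuracy` and the counts below are hypothetical and are not taken from the recorded data.

```python
def detection_accuracy(ta, ts, fa, fs):
    """Accuracy as in Eq. (4): (TA + TS) / (TA + TS + FA + FS)."""
    total = ta + ts + fa + fs
    if total == 0:
        raise ValueError("no trials recorded")
    return (ta + ts) / total


# Hypothetical counts: 18 true alarms, 70 true skips,
# 5 false alarms and 7 false skips out of 100 stimuli.
accuracy = detection_accuracy(ta=18, ts=70, fa=5, fs=7)
print(f"{accuracy:.0%}")
```

With these counts, 88 of the 100 stimuli are handled correctly, so the accuracy is 88%.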
Fig. 16 Percentage accuracy obtained by each participant in the experiment

Fig. 17 Variation of P300 amplitude and latency during phase I of the experiment

Fig. 18 Variation of P300 amplitude and latency during phase II of the experiment

3.3 ERP analysis

For each individual, we observed the variation of N100 and P300 amplitude and latency over successive 2 min intervals during phase-I and phase-II of our experiments. We used scatter plots to display the obtained variation.

Fig. 19 Variation of N100 amplitude and latency during phase I of the experiment

Fig. 20 Variation of N100 amplitude and latency during phase II of the experiment

3.3.1 P300 analysis

From the scatter plot, we obtained the overall trend line to easily observe the changes occurring in the amplitude and latency of P300 for each participant. In phase-I of the experiment, we observed that, for each participant, the P300 amplitude either decreased or remained steady over time (see Fig. 17a). We also observed that the latency of the P300 response increased over time for almost all participants (see Fig. 17b).
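The text does not state how the trend line was fitted. Assuming an ordinary least-squares line through the per-interval values, the slope indicates whether amplitude or latency rises or falls over time. The function name `fit_trend_line` and the amplitude values below are hypothetical.

```python
def fit_trend_line(times_min, values):
    """Least-squares line through (time, value) points.

    Returns (slope, intercept). A negative slope means the quantity
    decreases as the experiment progresses.
    """
    n = len(times_min)
    if n < 2 or n != len(values):
        raise ValueError("need at least two matching (time, value) points")
    mean_t = sum(times_min) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times_min, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return slope, intercept


# Hypothetical P300 amplitudes at the end of each 2 min interval.
times = [2, 4, 6, 8, 10]
amplitudes = [10.0, 9.0, 8.0, 7.0, 6.0]
slope, intercept = fit_trend_line(times, amplitudes)
print(slope, intercept)
```

For these illustrative values the fitted slope is negative, which corresponds to the "decreasing amplitude" pattern described for P300.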

Further, in phase-II of the experiment, which was performed after a short rest period of 5 min, we observed similar behaviour of variation in amplitude and latency of P300 as was observed in phase-I (see Fig. 18).

3.3.2 N100 analysis

We observed that the N100 amplitude either decreased or remained steady over time during both phases of the experiment (see Figs. 19a, 20a). However, no particular trend was observed for the N100 latency (refer to Figs. 19b, 20b).

Table 2 Vigilance value for every 2 min interval estimated in phase I using the fuzzy inference system
Table 3 Vigilance value for every 2 min interval estimated in phase II using the fuzzy inference system
Table 4 Elaborated fuzzy vigilance calculation for phase I of the experiment

3.4 Calculation of vigilance through our proposed fuzzy rule-based system

After processing the EEG signals to remove the inherent noise, we locate the P300 and N100 peaks and extract their respective amplitudes and latencies. We also note the number of eye blinks and their corresponding intervals. We then feed this information into Eq. (2) of the fuzzy inference system for vigilance estimation (see Fig. 8). The fuzzy rules used for quantifying vigilance are discussed in Figs. 7 and 9. The numeric evaluation of vigilance through the fuzzy rule-base has been carried out for each participant, for every 2 min interval, in both phases of the experiment. The results are tabulated in Tables 2 and 3. In addition, the overall mean and standard deviation of each variable (N100, P300 and eye blink) over the whole of phase I and phase II are tabulated in Tables 4 and 5.

Table 5 Elaborated fuzzy vigilance calculation for phase II of the experiment
Table 6 Comparison of computed vigilance level with the obtained performance accuracy

3.5 Validation of the proposed fuzzy model

To evaluate the performance of the proposed approach, a reference vigilance index is necessary. In our experiment, the performance (accuracy) of the participants is used as the reference vigilance level. The performance accuracy is evaluated using Eq. (4), which is based on the accuracy of detecting target stimuli and responding to a presented target within a time window of constant width. In other words, to validate the accuracy and efficiency of the proposed model, we compared the mean vigilance level of each individual, obtained by averaging the vigilance values of all 2 min slices, with the target detection accuracy (from the clock test) of that individual. To accomplish this, we heuristically divided the accuracy levels into four bands: (a) very low accuracy, if the value is \(\ge\) 0 and < 30%; (b) low accuracy, if the value is \(\ge\) 30 and < 50%; (c) moderate accuracy, if the value is \(\ge\) 50 and < 80%; and (d) high accuracy, if the value is \(\ge\) 80 and \(\le\) 100%. The respective bands for vigilance are discussed in Sect. 2.7. The obtained values are tabulated for comparison in Table 6. It is evident from Table 6 that the values obtained from our proposed fuzzy model closely follow the accuracy, and thereby the vigilance, in both phases of the experiment. Table 6 also shows that there is only one instance (P5 in phase-I) in which the obtained fuzzy vigilance level differs from the accuracy band achieved by the participant. Thus, the overall accuracy of our proposed fuzzy model is 95%.
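The banding and comparison step can be sketched as below. The accuracy thresholds follow the four bands given above; the function names, the use of identical labels for the vigilance bands (which are defined in Sect. 2.7 and not reproduced here) and the label lists are assumptions made for illustration only.

```python
def accuracy_band(accuracy_percent):
    """Map a detection accuracy (in percent) to one of the four bands."""
    if not 0 <= accuracy_percent <= 100:
        raise ValueError("accuracy must lie between 0 and 100")
    if accuracy_percent < 30:
        return "very low"
    if accuracy_percent < 50:
        return "low"
    if accuracy_percent < 80:
        return "moderate"
    return "high"


def agreement_rate(accuracy_bands, vigilance_bands):
    """Fraction of cases where the fuzzy vigilance band equals the accuracy band."""
    if not accuracy_bands or len(accuracy_bands) != len(vigilance_bands):
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(a == v for a, v in zip(accuracy_bands, vigilance_bands))
    return matches / len(accuracy_bands)


# Hypothetical labels: 20 comparisons with a single mismatch.
accuracy_labels = ["high"] * 20
vigilance_labels = ["high"] * 19 + ["moderate"]
print(agreement_rate(accuracy_labels, vigilance_labels))
```

Under these assumptions, a single mismatch in 20 comparisons yields an agreement of 0.95, the same kind of figure as the 95% reported from Table 6.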

4 Discussion

In the previous section, we presented various statistical results obtained by analysing GVA, VAS, NASA-TLX, reaction time and the variation of amplitude and latency of ERPs. We found that our experiment significantly affected the transient mood states of the participants. Participants felt less attentive and more sleepy, bored, strained, irritated and fatigued after performing phase-I of the experiment. After a short break, they were rejuvenated to perform the next phase of the experiment. The results obtained from the GVA and VAS analysis support this claim, and the result of the t test also validates this inference. We believe that such a shift in mood probably occurs because of the high cognitive load involved in the experiment, which requires sustained attention on the moving pointer in the clock test. Any such task drains mental capacity because of the need to make frequent and rapid decisions about whether or not a stimulus constitutes a critical signal for detection. This leads us to conclude that vigilance tasks are resource demanding and associated with high workload. In other words, vigilance decrement is accompanied by a linear increase in the overall workload over time. The assessment of the responses to the NASA-TLX questionnaire also supports this conclusion. We observed that mental demand and performance are highly significant factors; effort and temporal demand are moderately significant factors; and frustration and physical demand are the least significant factors.

The quintessential finding in vigilance research is that detection performance declines over time due to the effect of vigilance decrement. The results obtained from the reaction time analysis also support this finding. For both phase-I and phase-II, we observed that the reaction time increases as time passes. However, in phase-I we observed a decrease in the reaction time during the last 2 min, which is counter-intuitive. This peculiar behaviour was observed for every participant. This suggests that the clock test affects all participants in a similar way and places a similar mental demand on each participant.

The effect of recess on performance is prominently visible in the reaction time observed in phase-II of the experiment. It can be easily observed that the reaction time of the participants in phase-II is significantly lower than that in phase-I. This happens mainly because (a) each participant was given a 5 min rest before beginning phase-II, and (b) the participants had become accustomed to the task they were performing. Phase-II also shows an increasing mean reaction time due to the monotonous nature of the job under consideration. From this, we can infer that if a person/operator takes short spans of rest while continuously performing a monotonous task, the reaction time to an alarming situation can be considerably reduced. Further, the reduction in the number of errors of omission, as well as in errors of commission, in phase-II also confirms that recess increases the performance capability. Figure 15 shows that there is a significant reduction in committed errors among the participants in phase-II of the experiment. Figure 16 shows that the percentage of accurate detections made by the participants is higher in phase-II than in phase-I.

We also considered ERPs (P300 and N100) to observe the cognitive changes taking place in the brain from an instant before the occurrence of a critical stimulus to an instant after the behavioural response, as ERPs are known to be correlated with stimulus discrimination tasks and provide precise timing information. From the ERP analysis, we observed that, in both phase-I and phase-II, the overall P300 amplitude decreased and the latency increased as time passed. This signifies that the P300 ERP, like vigilance, is time dependent, and that its amplitude falls with passing time. Similar results were observed for the amplitude variation of the N100 ERP. However, no particular trend was observed for the N100 latency.

Subsequently, we combined the P300 and N100 ERPs with the eye blink rate using our proposed fuzzy rule-based technique to characterize the vigilance level of an individual robustly and accurately. Comparing the fuzzy output obtained for each participant (refer to Tables 2 and 3), it can be observed that for participants 3, 4 and 9 the vigilance level in phase I was higher than in phase II, while for participants 6 and 8 the reverse was true, and for the remaining participants the level remained consistent in both phases. The variation for participants 3, 4 and 9 is consistent with the fact that, with decreased cognitive ability, the P300 and N100 amplitudes become lower and the latency becomes higher, which was evident in phase-II. In the other case, the recess period motivated the participants to perform better.

To validate the accuracy and efficiency of the proposed model, we compared the performance accuracy obtained from the reaction time analysis with the vigilance level obtained from the fuzzy computation. The results provided in Table 6 show that the accuracy obtained through reaction time analysis mismatches the computed vigilance level in only one instance. Thus, the proposed model achieves 95% accuracy. To find the reason behind the 5% inaccuracy, we examined the input parameters of the concerned participant (P5) and found that the P300 latency of this participant was quite high, because of which the fuzzy rule-based system produced a low vigilance value for the participant. Here, we reiterate that the P300 latency varies with the difficulty of discriminating the target stimuli. The latency is usually interpreted as the speed of stimulus classification resulting from the discrimination of one event from another. Shorter latencies indicate superior mental performance relative to longer latencies. From this, we can infer that although the participant succeeded in giving correct responses, his ability to arrive at a quick decision was comparatively lower, which led to the higher P300 latency.

5 Conclusion

This paper presented a new vigilance estimation method using a fuzzy rule-base and EEG signals recorded with the help of Emotiv Epoc+. Through this work we achieved multiple objectives: first, we analysed the mood and stress level of each participant with the help of subjective analysis; second, we extracted ERPs (N100 and P300) and eye blinks from the recorded EEG signals and established the correlation between the ERPs, eye blinks and vigilance; third, with the help of our proposed fuzzy logic we increased the credibility of the vigilance estimation, which in earlier works was mostly qualitative due to the uncertainty in EEG signal classification and indicated mere presence or absence; fourth, we validated the performance of our proposed fuzzy model against the target detection accuracy and found that the average estimation accuracy of our fuzzy model is 95%. Based on the results obtained, we conclude that the proposed fuzzy vigilance estimation method performs effectively and is as good as an expert’s opinion. Hence, the method can be instrumental in predicting an individual’s vigilance in real time. The proposed approach is, in fact, suitable for supporting the smooth functioning of safety-critical operations such as process control, piloting aircraft, operating nuclear reactors, surveillance and military operations. The system can also be useful in assessing human reliability, by helping to assign the best operator to a mission-critical operation, and in human resource selection processes. Future work will be directed at extending the proposed work to allow online evaluation of the data.