Introduction

Obstructive sleep apnoea (OSA) is a prevalent disorder characterised by repeated episodes of closure or narrowing of the upper airway during sleep. These episodes result in intermittent hypoxaemia and fragmentation of sleep. The long-term consequences of OSA include excessive daytime somnolence, increased risk of motor vehicle accidents, increased risk of cardiometabolic disease [1] and increased healthcare utilisation [2]. In addition to these consequences, OSA is also associated with cognitive deficits in the domains of attention, memory and executive function [3]. These cognitive deficits cannot be fully explained by the sleepiness that usually accompanies OSA [4].

To better explain the relationship between OSA and cognitive impairment, it has been proposed that the sleep fragmentation and intermittent hypoxaemia associated with OSA lead to chemical and structural changes in the brain [5]. This has been supported by the identification of abnormalities of grey matter and white matter structures and hypometabolism of specific brain regions in OSA patients [6]. However, despite the growing evidence of a relationship between cognitive impairment and OSA, cognitive impairment in the setting of OSA appears to have only a weak relationship with the usual markers of OSA severity, severity of sleep fragmentation and degree of hypoxaemia. Furthermore, cognitive impairment is not present in every patient diagnosed with OSA [7]. This suggests that our current markers of OSA severity and sleep fragmentation need further refinement. By refining these markers, we may be able to predict which OSA patients are most susceptible to cognitive impairment and thus allow the efficient allocation of health resources.

A potential area for further refinement is how we measure and define sleep fragmentation. Sleep fragmentation is often described by the electroencephalogram (EEG) arousal index or the “number of awakenings” as they are explained to the OSA patient. The marking of EEG arousals was incorporated into clinical polysomnogram (PSG) scoring following the demonstration that they were the best predictor of mean sleep latency in the multiple sleep latency test (MSLT) [8, 9]. The numerous definitions used to define an EEG arousal by various groups prompted the American Academy of Sleep Medicine (AASM) to provide a consensus definition of an EEG arousal in 1992 [10]. The consensus definition essentially required an abrupt shift in EEG frequency of 3 s or greater duration after a minimum of 10 s of continuous sleep. This definition has been carried over without modification into the AASM’s Manual for the Scoring of Sleep and Associated Events [11].

The 3-s minimum duration criterion for an EEG arousal was acknowledged by the task force as an arbitrary decision [10]. This was due to the poorer levels of agreement between scorers with EEG arousals of shorter durations. Nevertheless, the 3-s EEG arousal duration is also associated with relatively poor inter-scorer reliability [12], which is unaffected by montage selection [13]. A study by Schwartz and Moxley [14] examined longer EEG arousal durations and showed that “long arousals” (15 to 60 s in duration) were better correlated with subjective sleepiness in OSA patients. These results suggest that minimum EEG arousal durations greater than the standard 3 s may also have greater clinical utility in the evaluation of OSA patients with cognitive impairment.

The aim of this study was to examine if a longer minimum EEG arousal duration could differentiate between OSA patients with impaired and unimpaired cognitive performance. In this study, we used the psychomotor vigilance task (PVT) as a surrogate for cognitive performance and examined the differences between impaired and unimpaired PVT performance.

Methods

This was a retrospective study. A total of 307 full diagnostic PSGs conducted for the suspicion of OSA during the period of January 2015 to December 2015 were considered for this study. Patients were excluded from the analysis if any of the following recognised risk factors for mild cognitive impairment formed part of their medical history: cigarette smoking, hypertension, diabetes mellitus, Down syndrome, hypothyroidism, significant alcohol consumption, stroke, head trauma, cardiac failure, respiratory failure, depression, cerebrovascular accident and use of psychoactive medications. PSGs were also excluded if a split night treatment protocol (diagnostic to PAP therapy) was implemented, if oxygen was administered, and if a primary PSG channel (nasal pressure, pulse oximetry, all EEG, respiratory effort) contained too much artefact for reliable analysis. The Metro South Human Research Ethics Committee approved this study (HREC/16/QPAH/021).

PSGs were recorded with the Compumedics Grael acquisition system (Abbotsford, Australia). The recording montage comprised EEG (F4-M1, C4-M1, O2-M1), left and right EOG (recommended derivation: E1-M2, E2-M2), chin electromyogram (EMG, mental/submental positioning), modified lead II ECG, nasal pressure (DC amplified), oronasal thermocouple, body position, thoracic and abdominal effort (inductive plethysmography), pulse oximetry, left and right leg movement (anterior tibialis EMG) and sound pressure (dBA meter: Tecpel 332). EEG channels were sampled at 1024 Hz.

PSGs were scored according to the 2012 AASM Manual for the Scoring of Sleep and Associated Events [11] with Compumedics Profusion 4.0 (Build 410) software while viewed on Dell P2414H (1920 × 1080 resolution) LCD monitors. Care was taken to ensure that the initiation and termination of each EEG arousal were correctly marked. The termination points of EEG arousals greater than 15 s in duration were marked between 15 and 16 s irrespective of their actual length. Whenever the three EEG channels displayed different EEG arousal initiation and termination locations, the EEG channel with the shortest duration was chosen for initiation and termination. EEG arousals were classified as respiratory arousals if they occurred less than 3 s after the termination of the respiratory event. EEG arousals were classified as limb movement arousals when there was an overlap of the events or when there was < 0.5 s between the end of one event and the onset of the other event, irrespective of which event (arousal or limb movement) occurred first. EEG arousal indices were calculated according to their association (all, respiratory-related and PLM-related). EEG arousal indices were also categorised according to minimum duration thresholds (index of EEG arousals that were ≥ 3 s, ≥ 5 s, ≥ 7 s, ≥ 10 s and ≥ 15 s, respectively).
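The threshold-based indices described above can be illustrated with a minimal sketch. It assumes that each index is the count of arousals meeting the minimum duration divided by total sleep time in hours, that the shortest of the three channel durations is used, and that durations are capped at 15 s. The function names and the example durations are hypothetical and are not taken from the study data or the scoring software.

```python
from typing import Dict, Iterable, List

THRESHOLDS_S = (3, 5, 7, 10, 15)  # minimum-duration thresholds named in the text


def scored_duration(channel_durations_s: Iterable[float], cap_s: float = 15.0) -> float:
    """Shortest duration across EEG channels, capped at cap_s.

    Mirrors two scoring rules from the text: the channel with the shortest
    duration is used, and arousals longer than 15 s are not measured further.
    """
    return min(min(channel_durations_s), cap_s)


def arousal_indices(durations_s: List[float], total_sleep_time_h: float) -> Dict[int, float]:
    """Arousals per hour of sleep for each minimum-duration threshold."""
    if total_sleep_time_h <= 0:
        raise ValueError("total sleep time must be positive")
    return {
        t: sum(1 for d in durations_s if d >= t) / total_sleep_time_h
        for t in THRESHOLDS_S
    }


# Hypothetical example: six arousals, each measured on three EEG channels (seconds).
per_channel = [
    (3.2, 3.5, 4.0),
    (5.1, 6.0, 5.5),
    (7.4, 8.0, 7.9),
    (12.0, 11.2, 13.0),
    (20.0, 18.5, 30.0),
    (4.0, 4.4, 4.1),
]
durations = [scored_duration(c) for c in per_channel]
indices = arousal_indices(durations, total_sleep_time_h=2.0)
print(indices)
```

Because every arousal that meets a longer threshold also meets the shorter ones, the indices are nested: ArI15 can never exceed ArI10, which can never exceed ArI7, and so on.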

Prior to undertaking the diagnostic PSG, patients completed the Epworth Sleepiness Scale (ESS), the Functional Outcomes of Sleep Questionnaire (FOSQ) and the Short Form-36 quality of life questionnaire (SF-36). Patients also completed the 10-min version of the PEBL Psychomotor Vigilance Task (PVT) [15] on an ASUS Transformer Pad with attached keyboard. The patients were instructed to continually monitor the screen and press a response button on the attached keyboard with either the index finger or thumb on their dominant hand as soon as the pink stimulus dot appeared on the screen. The presentation of the next stimulus was programmed to vary randomly between 2 and 10 s.

PVT responses were considered valid if the reaction time (RT) was ≥ 100 ms. RTs < 100 ms were considered to be false starts. Lapses were considered as RTs ≥ 500 ms. The following PVT outcomes were calculated: mean 1/RT (also known as response speed), median RT, slowest 10% 1/RT and the number of lapses [16]. For calculating mean 1/RT and slowest 10% 1/RT, each RT was divided by 1000 and then reciprocally transformed. The transformed values were then averaged. K-means clustering was used to divide the patients into two groups based on their PVT reaction time results. The patient group with the slower response speed was designated as the “impaired” group while the patient group with the faster response speed was designated as the “unimpaired” group.
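A minimal sketch of these PVT calculations follows, using only the definitions given above (valid RT ≥ 100 ms, lapse RT ≥ 500 ms, 1/RT computed on RT in seconds). Two points are assumptions made for illustration: the slowest 10% is taken as the integer tenth of the valid responses (at least one), and the two-group clustering is run on a single variable, mean response speed, with centroids initialised at the minimum and maximum. The study does not state which PVT variables or which K-means settings were used, and all function names and example values are hypothetical.

```python
from statistics import mean, median
from typing import Dict, List, Tuple


def pvt_outcomes(rts_ms: List[float]) -> Dict[str, float]:
    """Summarise one PVT session from raw reaction times in milliseconds."""
    valid = sorted(rt for rt in rts_ms if rt >= 100)  # RT < 100 ms = false start
    if not valid:
        raise ValueError("no valid responses")
    # 1/RT with RT in seconds (RT / 1000, then reciprocal) equals 1000 / RT.
    speeds = [1000.0 / rt for rt in valid]  # ascending RT gives descending speed
    n_slow = max(1, len(valid) // 10)       # assumed definition of "slowest 10%"
    return {
        "false_starts": len(rts_ms) - len(valid),
        "lapses": sum(1 for rt in valid if rt >= 500),
        "median_rt_ms": median(valid),
        "mean_speed": mean(speeds),
        "slowest_10pct_speed": mean(speeds[-n_slow:]),
    }


def two_means_1d(values: List[float], max_iter: int = 100) -> Tuple[List[int], Tuple[float, float]]:
    """Two-cluster K-means on one variable; label 0 is the lower-valued cluster."""
    low, high = min(values), max(values)
    if low == high:
        raise ValueError("values are identical; cannot form two clusters")
    labels: List[int] = []
    for _ in range(max_iter):
        # Assign each value to the nearer centroid.
        labels = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
        new_low = mean(v for v, g in zip(values, labels) if g == 0)
        new_high = mean(v for v, g in zip(values, labels) if g == 1)
        if (new_low, new_high) == (low, high):
            break  # centroids stopped moving
        low, high = new_low, new_high
    return labels, (low, high)


# Hypothetical session: one false start (80 ms) and two lapses (500 and 1000 ms).
session = [80, 200, 250, 250, 400, 500, 1000, 200, 250, 400, 250]
outcomes = pvt_outcomes(session)

# Hypothetical mean response speeds (1/s) for seven patients.
patient_speeds = [2.1, 2.3, 2.2, 3.6, 3.8, 3.7, 3.9]
labels, centroids = two_means_1d(patient_speeds)
groups = ["impaired" if g == 0 else "unimpaired" for g in labels]
print(outcomes, groups)
```

In this sketch the cluster with the lower mean response speed corresponds to the “impaired” group, matching the designation used in the text.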

Statistical analyses were performed using GraphPad Prism 7.02 (GraphPad Software, La Jolla, CA) and MedCalc 17.9.2 (MedCalc Software bvba, Ostend, Belgium). Normality in the distribution of data collected was determined by the D’Agostino-Pearson omnibus K2 test. Data are presented as mean ± standard deviation or median and interquartile range for normally distributed and non-normally distributed data, respectively. Impaired and unimpaired group data were compared using either an unpaired t test or a Mann-Whitney test for normally distributed and non-normally distributed data, respectively. The proportion of males to females in each group was compared using a chi-square test. The accuracy of each EEG arousal minimum duration threshold in predicting impaired PVT performance in an OSA patient was examined using receiver-operator characteristic (ROC) curves. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive and negative likelihood ratios and accuracy were calculated for each EEG arousal minimum duration threshold to determine the cut-off values that provided maximum diagnostic efficiency. A P < 0.05 was set as the limit of statistical significance.
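For readers less familiar with these diagnostic measures, the sketch below shows how they follow from a 2 × 2 table of predicted versus observed PVT impairment at a single cut-off. It uses the standard textbook definitions and is not a reproduction of the MedCalc output. The function name and the counts in the example are hypothetical and do not correspond to the study data.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosticMetrics:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    lr_positive: float
    lr_negative: float
    accuracy: float


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> DiagnosticMetrics:
    """Standard diagnostic measures from true/false positive/negative counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both impaired and unimpaired patients are required")
    sens = tp / (tp + fn)  # impaired patients correctly flagged
    spec = tn / (tn + fp)  # unimpaired patients correctly cleared
    return DiagnosticMetrics(
        sensitivity=sens,
        specificity=spec,
        ppv=tp / (tp + fp) if tp + fp else math.nan,
        npv=tn / (tn + fn) if tn + fn else math.nan,
        lr_positive=sens / (1 - spec) if spec < 1 else math.inf,
        lr_negative=(1 - sens) / spec if spec > 0 else math.inf,
        accuracy=(tp + tn) / (tp + fp + fn + tn),
    )


# Hypothetical counts: 40 impaired patients (30 flagged), 60 unimpaired (20 flagged).
m = diagnostic_metrics(tp=30, fp=20, fn=10, tn=40)
print(m)
```

Unlike sensitivity, specificity and the likelihood ratios, PPV and NPV depend on the proportion of impaired patients in the sample, so they would not be expected to transfer directly to a population with a different prevalence of impairment.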

Results

The demographic characteristics and PVT performances of each group are shown in Table 1. A total of 65 patients were included in this study. Cluster analysis separated these patients into two groups consisting of 40 unimpaired and 25 impaired patients. The unimpaired and impaired groups were not different with respect to age (P = 0.253), level of obesity (BMI, P = 0.443), subjective somnolence (ESS, P = 0.209) and gender distribution (P = 1.00). In terms of the Functional Outcomes of Sleep Questionnaire (FOSQ), the impaired group showed significant decreases in the total score (P < 0.001) as well as the activity (P < 0.001), general productivity (P < 0.001), vigilance (P < 0.001) and social outcome (P = 0.026) subscale scores. There were also differences between the unimpaired and impaired groups in the Short Form-36 quality of life questionnaire. The impaired group showed decreases in general health (P = 0.037), social role functioning (P < 0.001), emotional role functioning (P = 0.011) and the mental component score (P = 0.018). There were no differences in the physical role functioning, physical functioning, bodily pain, vitality, mental health and the physical component score. As expected, there were clear differences in PVT performance, including the mean response speed (mean 1/RT, P < 0.001), median response time (P < 0.001), slowest 10% of response times (P < 0.001) and the number of lapses (responses ≥ 500 ms, P < 0.001).

Table 1 Group demographics and PVT results

The polysomnographic data, including EEG arousal indices, are shown in Table 2. The unimpaired and impaired groups displayed no differences with respect to total sleep time (P = 0.371), sleep efficiency (P = 0.346), proportions of sleep stages (N1, P = 0.685; N2, P = 0.298; N3, P = 0.904; and R, P = 0.076) and wakefulness after sleep onset (P = 0.120). The severity of OSA was also similar between the groups (AHI, P = 0.427) and both groups had minimal periodic leg movements. There was also no difference in the mean oxygen saturations between the two groups (P = 0.607).

Table 2 Polysomnographic parameters

The descriptive characteristics of EEG arousal indices are shown in Table 3. There was no difference between the two groups with respect to the standard, 3 s minimum EEG arousal duration (P = 0.220). However, the impaired group showed significantly increased EEG arousal indices that required a minimum duration of 5 s (P = 0.034), 7 s (P = 0.041) and 15 s (P = 0.036). There were no differences in respiratory-related EEG arousal indices irrespective of the minimum duration requirement (P = 0.191, 0.182, 0.147, 0.126 and 0.178 for minimum respiratory-related EEG arousal durations of 3 s, 5 s, 7 s, 10 s and 15 s, respectively). There was no difference in the PLM-related EEG arousal index (P = 0.935) between the two groups.

Table 3 EEG arousal characteristics

Comparisons of receiver-operator characteristic (ROC) curves of minimum EEG arousal duration thresholds for the identification of OSA patients with impaired PVT performance are shown in Fig. 1. Calculated area under the curve (AUC), sensitivity, specificity, positive and negative likelihood ratios, and positive and negative predictive values are summarised in Table 4. The AUC increased as the threshold for duration of EEG arousals increased. Similarly, the specificity and positive likelihood ratio also increased as the threshold for duration of EEG arousals increased. In contrast, sensitivity decreased as the threshold for duration of EEG arousals increased. The negative predictive ratio did not change with changes to the threshold for duration of EEG arousals. All EEG arousal duration thresholds were significant except for the ArI3.

Fig. 1 Receiver-operator characteristic (ROC) curves of minimum EEG arousal duration thresholds for the identification of OSA patients with impaired PVT performance. The grey dot indicates the Youden Index J value (the maximum vertical distance between the ROC curve and the diagonal line). ArI3, minimum EEG arousal duration of 3 s; ArI5, minimum EEG arousal duration of 5 s; ArI7, minimum EEG arousal duration of 7 s; ArI10, minimum EEG arousal duration of 10 s; ArI15, minimum EEG arousal duration of 15 s
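As an illustration of how a ROC curve, its area under the curve (AUC) and the Youden Index J relate to one another, the following minimal sketch builds the curve from an arousal index and a binary impaired/unimpaired label. It assumes that a patient is classified as impaired when the index is at or above the cut-off, computes the AUC by the trapezoidal rule, and takes J as sensitivity + specificity − 1. This is a generic sketch rather than the procedure implemented in MedCalc, and the function names and example values are hypothetical.

```python
from typing import List, Tuple

RocPoint = Tuple[float, float, float]  # (cut-off, false-positive rate, true-positive rate)


def roc_points(scores: List[float], impaired: List[bool]) -> List[RocPoint]:
    """One ROC point per distinct cut-off, from the highest cut-off to the lowest."""
    n_pos = sum(impaired)
    n_neg = len(impaired) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both impaired and unimpaired patients are required")
    points = []
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, impaired) if y and s >= cutoff)
        fp = sum(1 for s, y in zip(scores, impaired) if not y and s >= cutoff)
        points.append((cutoff, fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points: List[RocPoint]) -> float:
    """Area under the ROC curve, starting from the origin (0, 0)."""
    curve = [(0.0, 0.0)] + [(fpr, tpr) for _, fpr, tpr in points]
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(curve, curve[1:]))


def youden_cutoff(points: List[RocPoint]) -> RocPoint:
    """Point maximising J = sensitivity + specificity - 1 = TPR - FPR."""
    return max(points, key=lambda p: p[2] - p[1])


# Hypothetical arousal indices (events/h): four impaired, then four unimpaired patients.
scores = [30, 25, 20, 16, 22, 15, 10, 8]
impaired = [True, True, True, True, False, False, False, False]

points = roc_points(scores, impaired)
auc = auc_trapezoid(points)
best = youden_cutoff(points)
print(auc, best)
```

The point returned by youden_cutoff lies furthest above the diagonal line of no discrimination, which corresponds to the grey dot described in the figure legend.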

Table 4 Discriminatory ability of EEG arousal durations in predicting PVT performance

Discussion

In this exploratory study, we investigated the relationship between EEG arousal duration and cognitive performance in OSA patients. We carefully selected patients that did not have conditions typically associated with mild cognitive impairment and separated them into two groups based on psychomotor vigilance task (PVT) performance. Our study shows that patients with impaired PVT performance tended to have longer EEG arousal durations, despite no differences in standard PSG parameters. This same group also showed more adverse quality of life outcomes compared to those with unimpaired performance. The frequency of EEG arousals that were 10 s or greater in duration (ArI10) showed the greatest discriminatory ability between patients with impaired and unimpaired PVT performance. In contrast, the standard arousal index (frequency of EEG arousals that were 3 s or greater) did not have any significant discriminatory ability with respect to PVT performance.

OSA is a sleep disorder with an estimated global prevalence of almost 1 billion people affected [17]. The consequences of untreated OSA are very serious, with not only cardiovascular disease and type 2 diabetes more prevalent but also increased risk of driving and workplace accidents [1]. The impact of OSA upon healthcare systems is great [2]; however, not all OSA patients are affected by the disorder to the same extent. Vakulin and colleagues were able to demonstrate that some OSA patients were resistant to the effects of OSA when subjected to driving simulation tests [18]. The ability to identify the OSA patients who are most at risk would allow the targeting of healthcare resources to those who need it most.

The exact role that EEG arousals play in the development of OSA-related neurocognitive impairment is largely unknown. The EEG arousal has usually been considered a sign of sleep disruption and thus considered to be detrimental to sleep quality [19]. Furthermore, the EEG arousal was also seen as a crucial event in the resumption of normal breathing after an apnoea or hypopnoea in OSA patients [20]. Consequently, it was concluded that the EEG arousal, through the act of terminating the apnoea or hypopnoea, disrupts the OSA patients’ sleep and thus causes the daytime symptoms of sleepiness and impaired vigilance. For clinical purposes, the EEG arousal index (ArI) is therefore used as a measure of sleep disruption. This mechanism by which EEG arousals cause the daytime symptoms of OSA patients through sleep disruption is sometimes questioned on a number of grounds. Firstly, EEG arousals occur naturally in healthy subjects and are intrinsic to the maintenance of normal sleep architecture [21]. Secondly, not all obstructive apnoeas and hypopnoeas coincide or terminate with an EEG arousal [22]. Thirdly, the relationship between EEG arousal frequency and daytime performance appears to be equivocal [23, 24]. Lastly, only a weak relationship exists between the change in health status and sleep fragmentation indices after the commencement of CPAP treatment for OSA [25]. This suggests that our current measures of sleep fragmentation lack the precision needed to predict outcomes.

The current EEG arousal criteria were first described in 1992 and mandated a minimum duration of 3 s in EEG frequency shift to score an EEG arousal. The number of EEG arousals scored during the PSG is then divided by the total sleep time to give the EEG arousal index (ArI). The choice of the 3-s minimum duration was acknowledged to be a methodological rather than a physiological decision in the original guideline report [10]. In our study, the standard ArI (designated as ArI3) was a very poor predictor of PVT performance in OSA patients. However, there is some evidence to show that longer duration EEG arousals may have a stronger relationship with subjective sleepiness [14]. Thus, an exploration of arousal duration criteria may enhance our definitions of sleep fragmentation and improve identification of OSA patients most at risk.

While our study utilised a more objective measure of sustained attention (PVT) instead of a subjective scale of sleepiness as the outcome measure, our results show remarkable similarities to those of Schwartz and Moxley [14]. Patients with longer EEG arousals had not only worse PVT results but also worse health outcomes as measured by the SF-36 and FOSQ quality of life metrics. These disparities occurred despite no differences in the usual PSG measures used to describe sleep, respiratory, and oxygenation parameters. The relationship between PVT results and SF-36 outcomes has been demonstrated previously [26]. However, our findings contrast with the study of Lee and colleagues: we showed a relationship between PVT outcomes and the SF-36 mental component summary score, while in their study PVT outcomes were significantly related only to the physical component summary score. These differences could possibly be explained by the nature of the two studies and the group of patients used for analysis. While Lee and colleagues excluded participants with a history of major medical illnesses, they did include participants with hypertension. Their rationale was based on the high prevalence of hypertension in the OSA population. However, hypertension is recognised as an independent risk factor for neurocognitive impairment [27].

Overall, our results suggest that modifying the EEG arousal duration requirement could help differentiate between EEG arousals associated with normal sleep and those associated with pathological conditions. Of the different EEG threshold definitions examined in this study, we believe that a minimum EEG arousal duration of 5 s or more would be the most appropriate to use in the clinical setting. The ArI5 threshold was able to improve the specificity without any reduction in sensitivity. The higher ArI thresholds all reduced the sensitivity in predicting impaired neurocognitive performance. If our clinical goal is ensuring appropriate allocation of healthcare resources, then we need good sensitivity and specificity in identifying those patients who would benefit from a trial of therapy (e.g., continuous positive airway pressure, positional therapy or oral appliance therapy). Furthermore, the ArI5 did not require a change in the normal limit compared to the ArI3, with each having a threshold value of approximately 19 events per hour. Thus, it may be useful to report both the standard ArI and the ArI5 arousal indices in the future.

There are a number of other aspects of the EEG arousal that can be explored to improve the utility of our measurements. Much is still unknown with respect to the spatial and temporal distribution of normal and pathological EEG arousals during the night. For example, O’Malley and colleagues were able to demonstrate that central EEG leads were not able to detect all sleep- and arousal-related activity [28]. Furthermore, the presence of specific EEG frequencies within the EEG arousal as well as associations with other EEG features may also be of further interest in differentiating between normal and pathological EEG arousals. Another important avenue of study would be to examine the underlying reason for the increased EEG arousal duration in the impaired group. The demonstration of no differences in respiratory event-related EEG arousals between the two groups suggests a causal factor unrelated to the apnoeas and hypopnoeas themselves.

There are some limitations to our study. Firstly, the number of OSA patients examined in this study is quite small and thus limits the generalisation of our results. This limitation highlights one of the issues with exploring relationships with neurocognitive status in OSA. Many of the comorbidities seen in OSA patients are also associated with neurocognitive impairment [27]. Thus, in this exploratory analysis, we excluded patients with any of these comorbidities to ensure that any differences between the groups could be attributed to differences in EEG arousal characteristics. Large population studies are needed to truly demonstrate the utility of this change to EEG arousal duration. A second limitation is that we did not control for cognitive reserve in this population. Higher premorbid cognitive ability is believed to shield an individual from the cognitive effects of OSA [29]. Thus, a case could be made that the differences in the PVT could be related to differences between the two groups in premorbid cognitive ability. We would argue, however, that the PVT is considered to be a test of sustained attention rather than higher cognitive functions and thus is less likely to be affected by cognitive reserve [30] compared to other tests. A third limitation of this study was that we have no knowledge of the patients' sleep schedules in the lead up to their diagnostic PSG. There is a possibility that the impaired group may have been more sleep-restricted in the week or so prior to their diagnostic PSG and this may have contributed to their poor PVT performance. Another limitation was that we did not examine the cyclic alternating pattern (CAP) between these two groups. CAP is a well-known framework used to characterise arousal instability which occurs in normal and abnormal sleep.

In conclusion, our preliminary analysis of EEG arousal duration demonstrates that using a longer minimum duration provides a better relationship between impaired vigilance and health status in OSA patients. Further refinement of how we describe EEG arousals and how we measure sleep fragmentation could improve our ability to determine which OSA patient is most at risk for neurocognitive impairment.