Introduction

Obstructive sleep apnea (OSA) is a disorder characterized by recurrent episodes of upper airway narrowing and collapse during sleep. This leads to intermittent airflow limitation, arterial oxygen desaturations, and arousals from sleep, resulting in sleep fragmentation and poor sleep quality. OSA is highly prevalent, reported in 5–24 % of men and 1–9 % of women [1], and carries considerable cardiovascular morbidity [2]. In addition, it hampers daytime functioning through reduced vigilance, cognitive impairment, decreased work productivity, and an increased risk of motor vehicle accidents [3–5]. Excessive daytime sleepiness (EDS) has long been considered a cardinal symptom of OSA, and studies have shown that, when evaluated by objective means such as multiple sleep latency testing (MSLT), most patients with OSA demonstrate EDS [6]. However, even with severe OSA, subjective daytime sleepiness is often absent [7–9], and patients themselves may deny or minimize the symptom. This has led to skepticism about the relative importance of self-reported EDS in the evaluation of patients with suspected OSA [10].

As a symptom, EDS has proven challenging to quantify. The MSLT and the maintenance of wakefulness test (MWT) have both been used to objectively measure EDS, but are time-consuming, laborious, inconvenient, and expensive, and do not necessarily recreate typical day-to-day circumstances in which patients would consider hypersomnolence problematic. Several questionnaires have therefore been developed to be simple, cost-effective, and convenient tools to measure EDS. One of the most frequently used of these questionnaires is the Epworth Sleepiness Scale (ESS), an eight-item self-scored instrument that asks respondents to rate their propensity to fall asleep under a variety of routine daily situations on a scale of 0 to 3, with higher numbers signifying a greater chance of dozing; total scores may therefore range from 0 to 24. It has been suggested that a cutoff total score of 10 or higher indicates the presence of pathological hypersomnolence [11]. Due to its ease of administration and proven reliability in multiple languages and cultures, the ESS has been widely employed in both clinical and research settings as a measure of EDS. However, there is considerable controversy in the literature about the degree to which the ESS, a subjective measure, predicts abnormal sleep latencies on MSLT [12–19] and MWT [20, 21], objective measures of hypersomnolence. Similarly, while the ESS has become a routine instrument in the evaluation of patients with OSA, the multitude of published studies examining the value of the ESS as a screening tool in identifying the presence of OSA and its ability to predict disease severity have yielded contradictory results [22–38].
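The ESS scoring rule described above is simple arithmetic, and a minimal Python sketch may make it concrete. The names `ess_total`, `suggests_eds`, and `ESS_CUTOFF` are illustrative assumptions and are not part of the instrument or of this study; the cutoff of 10 is the one suggested in the literature cited above [11].

```python
ESS_ITEMS = 8     # number of daily situations rated by the respondent
ESS_CUTOFF = 10   # total score at or above which pathological sleepiness is suggested [11]

def ess_total(item_scores):
    """Sum eight ESS item ratings (each 0 to 3) into a total score of 0 to 24."""
    if len(item_scores) != ESS_ITEMS:
        raise ValueError("the ESS has exactly eight items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item must be rated 0, 1, 2, or 3")
    return sum(item_scores)

def suggests_eds(total_score):
    """Apply the suggested cutoff (illustrative helper, not a diagnostic rule)."""
    return total_score >= ESS_CUTOFF
```

For instance, `ess_total([1, 2, 0, 3, 1, 1, 2, 0])` gives a total of 10, which just reaches the suggested cutoff.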

It has long been observed that patients with OSA and their partners often rate the patient’s hypersomnolence discordantly [39–41]. As a result, some authors have attempted to determine whether a partner-completed ESS, rating the patient’s EDS, is superior to a patient-completed ESS in predicting specific markers of OSA severity. However, these studies have also resulted in discrepant findings [42–44]. Thus, there remains much confusion about the extent to which the ESS, whether completed by the patient or by the partner, has true utility in the evaluation of patients for OSA. Additionally, to our knowledge, no prior study has attempted to determine if combining patient-completed and partner-completed ESS scores is of value in the screening process and the prediction of disease severity in patients with OSA.

We therefore conducted this study among patients with suspected OSA in a sleep clinic-based outpatient setting to compare the relative strengths of the relationships between patient-completed and partner-completed ESS scores on the one hand and sleep study-derived parameters of OSA severity on the other, as well as to study the utility of patient-completed ESS scores, partner-completed ESS scores, and various combinations thereof as screening tools in the identification of the disease.

Materials and methods

Recruitment and data collection

This was a cross-sectional, observational study. All adult patients (over the age of 18 years) who were consecutively seen at the JFK Neuroscience Institute Sleep Center in Edison, NJ, between October 2014 and June 2015 and who were recommended to undergo diagnostic testing to evaluate for OSA were offered the opportunity to participate in the study. Patients were excluded from participation if they had a prior diagnosis of OSA treated with continuous positive airway pressure (CPAP) or by other means (regardless of time since last treatment, treatment duration, or level of compliance); if they had a prior diagnosis of or were suspected of having another sleep disorder such as narcolepsy, periodic limb movements of sleep (PLMS), or rapid eye movement (REM) behavior disorder (RBD); if they had no suitable partner; and if they or their partner were unable to complete the ESS due to cognitive or language issues. In addition, patients were excluded if it was felt that there might be personal reasons for either the patient or the partner to complete the ESS in a biased fashion (specifically, commercial drivers or those undergoing testing due to a mandate from their employer). For the purposes of this study, “partner” included a spouse, significant other (girlfriend/boyfriend), or close relative (parent or adult child) who lived with and therefore had significant daily contact with the patient. Written informed consent was obtained in each case, and the study was approved by the Institutional Review Board at JFK Medical Center, Edison, NJ. When the patient was accompanied by his/her partner to the clinic visit at which they were enrolled, both the patient and partner completed the ESS (rating the patient’s symptoms in both cases) independently and without mutual discussion. When the patient was unaccompanied by the partner, the ESS was completed by the partner over the phone with the researcher reading the questions and recording the answers.

Patients were then scheduled to undergo a diagnostic sleep study (either an in-laboratory polysomnography [PSG] or a portable monitoring [PM] study at home, depending on their insurance carrier coverage and personal preferences). Some patients who were noted to have severe OSA on in-laboratory PSG were converted to split-night studies if deemed appropriate by the technologist conducting the study. All PSGs (and PSG portions of split-night studies) were performed using GRASS Technology (Natus Neurology, Inc., Warwick, RI) equipment that employed channels for electroencephalography, electrooculography, submental and bilateral tibial electromyography and electrocardiography, a nasal pressure transducer and an oronasal thermistor for airflow, thoracic and abdominal respiratory impedance plethysmography belts for effort, and arterial oxygen saturation recording using a pulse oximeter (SpO2). All PM studies were performed using one of three type-3 devices (GRASS SleepTrek 3 [Natus Neurology, Inc., Warwick, RI], ResMed ApneaLink Air [ResMed Corp., San Diego, CA], or AccuSom [NovaSom Inc., Glen Burnie, MD]), which included a nasal pressure transducer for airflow, thoracic belt for respiratory effort, and pulse oximetry for SpO2 and heart rate. All sleep studies were reviewed and scored by board-certified sleep medicine specialists, using the standard 2012 American Academy of Sleep Medicine criteria for scoring of sleep, arousals, leg movements (for in-laboratory PSG and split-night studies), and respiratory events [45]. Apnea-hypopnea index (AHI) was defined as the total number of apneas and hypopneas divided by total sleep time (TST) for in-laboratory PSG and PSG portions of split-night studies and the number of apneas and hypopneas divided by total recording time (TRT) for PM studies. 
Oxygen desaturation index (ODI) was calculated as the total number of episodes of fall in SpO2 by 3 % or more divided by TST for in-lab PSG and PSG portions of split-night studies and as the total number of episodes of fall in SpO2 by 3 % or more divided by TRT for PM studies.
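The two index definitions above differ only in their denominator (TST for PSG, TRT for PM studies). The following Python sketch illustrates that arithmetic; the function names and the choice to pass the denominator in minutes are assumptions made for illustration, and the per-hour unit is inferred from the AHI threshold of 15/h used later in the paper.

```python
def index_per_hour(n_events, minutes):
    """Events per hour over a denominator given in minutes."""
    if minutes <= 0:
        raise ValueError("denominator time must be positive")
    return n_events / (minutes / 60.0)

def ahi(apneas, hypopneas, minutes):
    """Apnea-hypopnea index.

    Pass total sleep time (TST) for in-laboratory PSG or the PSG portion
    of a split-night study, and total recording time (TRT) for PM studies.
    """
    return index_per_hour(apneas + hypopneas, minutes)

def odi(desaturations, minutes):
    """Oxygen desaturation index: falls in SpO2 of 3 % or more per hour."""
    return index_per_hour(desaturations, minutes)
```

Because TRT is at least as long as TST, the same event count yields an index from a PM study that is no higher than, and usually lower than, the one from PSG.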

Statistical analysis

Continuous variables were tested for fit to normality using the D’Agostino-Pearson omnibus normality test. Since patient-completed and partner-completed ESS scores were both found to be normally distributed, parametric tests (unpaired and paired t tests [as appropriate] for comparisons and Pearson’s product-moment correlation coefficient [r] for correlation) as well as linear regression were used to assess the relationship between them, and a Bland-Altman plot was constructed to determine bias. Spearman’s rank correlation coefficient (r_s) was used to evaluate all other relationships between variables since one of the pair was not normally distributed. Sensitivity, specificity, negative predictive value, and positive predictive value were calculated for patient-completed ESS scores 10 or higher, partner-completed ESS scores 10 or higher, combined ESS scores 20 or higher, either patient-completed or partner-completed ESS scores 10 or higher, and both patient-completed and partner-completed ESS scores 10 or higher in predicting the presence of OSA (defined as AHI greater than 15/h).

For this study, data were considered statistically significant if the two-tailed p value was less than 0.05. All calculations were made using Prism® software (GraphPad Corp., San Diego, CA, USA), on a Windows 7/personal computer platform.
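The five screening rules and four predictive measures named in the statistical analysis can be sketched as follows. This is an illustrative Python reconstruction, not the Prism workflow actually used; in particular, treating the combined ESS score as the sum of the patient-completed and partner-completed scores is an assumption, and all function and constant names are hypothetical.

```python
ESS_CUTOFF = 10        # single-rater threshold
COMBINED_CUTOFF = 20   # combined-score threshold (combined assumed to be the sum)
AHI_CUTOFF = 15.0      # OSA defined as AHI greater than 15/h

def screen_positive(patient_ess, partner_ess, rule):
    """Return True if the named screening rule is met."""
    rules = {
        "patient": patient_ess >= ESS_CUTOFF,
        "partner": partner_ess >= ESS_CUTOFF,
        "combined": patient_ess + partner_ess >= COMBINED_CUTOFF,
        "either": patient_ess >= ESS_CUTOFF or partner_ess >= ESS_CUTOFF,
        "both": patient_ess >= ESS_CUTOFF and partner_ess >= ESS_CUTOFF,
    }
    return rules[rule]

def _ratio(numerator, denominator):
    """Return the ratio, or None when the denominator is zero."""
    return numerator / denominator if denominator else None

def evaluate_rule(records, rule):
    """Compute predictive measures for one rule.

    records: iterable of (patient_ess, partner_ess, ahi) tuples.
    """
    tp = fp = tn = fn = 0
    for patient_ess, partner_ess, ahi in records:
        predicted = screen_positive(patient_ess, partner_ess, rule)
        has_osa = ahi > AHI_CUTOFF
        if predicted and has_osa:
            tp += 1
        elif predicted:
            fp += 1
        elif has_osa:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }
```

By construction, the "either" rule can only add screen-positive patients relative to a single-rater rule, and the "both" rule can only remove them, which is why the former tends to favor sensitivity and the latter specificity.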

Results

A total of 85 patients (ages ranging from 27 to 89 years old) met inclusion criteria, consented to participate, were enrolled, and provided patient-completed and partner-completed ESS scores that were included for comparison. Ten of these patients did not subsequently complete their diagnostic sleep study, and were therefore excluded from further sleep study-related analysis, leaving a total of 75 patients whose sleep study data was available for comparison with ESS scores. Fifty-one patients underwent in-laboratory PSG (of whom 10 had split-night studies due to the severity of their OSA) and 24 underwent PM studies.

Demographic and sleep study information for all patients whose data were analyzed in the study is presented in Table 1. In 57 cases (67 %), the partner-completed ESS scores were higher than their patient-completed counterparts; in 6 cases (7 %), the two scores were equal; and in 22 cases (25.9 %), the patient-completed ESS scores were higher than the partner-completed ESS scores. Differences between patient-completed and partner-completed ESS scores ranged from 14 (partner-completed ESS score higher) to −9 (patient-completed ESS score higher). Comparison of means showed that partner-completed ESS scores were significantly higher than patient-completed ESS scores by 2.9 (95 % confidence interval 1.4 to 4.3, p < 0.0001), although there was a weak correlation between patient-completed and partner-completed ESS scores (r = 0.5, p < 0.0001). Figure 1 shows the relationship between patient-completed and partner-completed ESS scores as determined by linear regression. As noted in Fig. 2, the Bland-Altman plot demonstrated substantial bias, with partners reporting higher ESS scores than patients (33.5 %, SD ± 55.2 %). To examine the influence that having a predominantly male sample may have exerted on our data, we performed further gender-based analysis. We found that the differences between patient-completed and partner-completed ESS scores did not differ significantly between male patients (3.2 ± 4.6) and female patients (2 ± 4.6; p = 0.3). In addition, differences between patient-completed ESS scores and partner-completed ESS scores remained significant when male patients (n = 60) and female patients (n = 25) were considered separately (9.4 ± 4.7 vs. 12.6 ± 4.3, p < 0.0001 and 8.6 ± 5.3 vs. 10.7 ± 4.3, p = 0.037, respectively).

Table 1 Demographic and sleep study characteristics of study participants (n = 85, except as noted)
Fig. 1 Linear regression scatterplot of partner-completed Epworth Sleepiness Scale (ESS) scores as a function of patient-completed ESS scores. Correlation was significant (p = 0.003) but weak (R² = 0.245). Dashed lines represent 95 % confidence intervals

Fig. 2 Bland-Altman plot; average of patient-completed and partner-completed Epworth Sleepiness Scale (ESS) scores plotted against the percentage difference. The plot demonstrates substantial bias, with partners scoring the ESS 33.5 % higher (SD 55.2 %)
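The percentage-difference form of the Bland-Altman analysis can be sketched in Python as below. The exact formula used to produce Fig. 2 is not stated in the text; this sketch assumes the common definition 100 × (partner − patient) / average of the two scores, and the function name is hypothetical. The 1.96 × SD limits of agreement are the conventional choice rather than a value reported in the paper.

```python
from statistics import mean, stdev

def bland_altman_percent(patient_scores, partner_scores):
    """Return (bias, sd, (lower_limit, upper_limit)) of the percentage differences.

    Each difference is 100 * (partner - patient) / average of the pair
    (assumed definition). Pairs averaging zero are skipped because the
    percentage difference is undefined for them.
    """
    percent_diffs = []
    for patient, partner in zip(patient_scores, partner_scores):
        average = (patient + partner) / 2.0
        if average == 0:
            continue
        percent_diffs.append(100.0 * (partner - patient) / average)
    if len(percent_diffs) < 2:
        raise ValueError("need at least two usable pairs")
    bias = mean(percent_diffs)
    sd = stdev(percent_diffs)  # sample standard deviation
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return bias, sd, limits
```

Under this definition, a positive bias means that partners scored higher than patients on average, which is the direction reported in Fig. 2.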

Results of correlations between patient-completed, partner-completed, and combined ESS scores and various sleep study-derived parameters and body mass index (BMI) are presented in Table 2. Partner-completed and combined ESS scores, but not patient-completed ESS scores, weakly correlated with AHI and ODI. Patient-completed, partner-completed, and combined ESS scores did not correlate with SpO2 nadir, time spent with SpO2 below 90 %, or BMI.

Table 2 Correlations between patient-completed ESS scores, partner-completed ESS scores, and combined ESS scores and various sleep study parameters among all patients who completed a sleep study (n = 75, except as noted)

As noted in Table 3, partner-completed ESS scores had greater sensitivity than patient-completed ESS scores (76.9 vs. 46.2 %) but poorer specificity (39.1 vs. 65.2 %). Combined ESS scores had better sensitivity (63.5 %) than patient-completed ESS scores and better specificity (52.1 %) than partner-completed ESS scores. However, the sensitivity was best when either patient-completed or partner-completed ESS score was 10 or higher (82.7 %), and the specificity was best when both patient-completed and partner-completed ESS scores were 10 or higher (80.8 %).

Table 3 Predictive value of various combinations of patient-completed and partner-completed ESS scores in detecting the presence of OSA (defined as AHI > 15/h) among all patients who completed a sleep study (n = 75)

Discussion

The results of our study confirm that while, as determined by several other authors [39–41], partner-completed ESS scores are significantly higher than patient-completed ESS scores, neither is strongly predictive of OSA severity as measured by sleep study-derived parameters. Nevertheless, in light of the contradictory findings of the few studies comparing the performance of partner-completed and patient-completed ESS scores in predicting the severity of OSA [41, 42], our finding that partner-completed ESS scores correlate with the AHI and ODI, while patient-completed ESS scores show no such correlation, provides interesting additional input into the debate. We also found that while partner-completed ESS scores do provide greater sensitivity in detecting OSA than patient-completed scores, they suffer from poorer specificity. Previous reports in the literature of the sensitivity and specificity of patient-completed ESS scores in predicting OSA vary widely, although the trend of better specificity than sensitivity has been consistent [10, 22, 33, 36–38]. On the other hand, Walter et al. [42] attempted to evaluate partner-completed ESS scores as a screening tool for OSA but were unable to determine a cutoff score that predicted an AHI of 40/h or higher with a sensitivity and specificity greater than 60 %. We are aware of no other studies that have specifically looked at partner-completed ESS scores as screening tools for OSA or calculated the predictive value of various combinations of patient-completed and partner-completed ESS scores in this regard. Therefore, we believe that our finding of high sensitivity and specificity when both ESS scores are considered together provides a valuable addition to the literature on the clinical evaluation of patients with suspected OSA, which should be validated with larger-scale studies at other centers.

It is unclear if the discrepancy between patient-completed and partner-completed ESS scores is due to patient underestimation of their own hypersomnolence or partner overestimation born of concern or a desire to ensure that the patient receive adequate medical attention. Our findings of a correlation between partner-completed ESS scores (but not patient-completed ESS scores) and certain measures of OSA severity seem to suggest that patient underestimation is the more likely explanation for the discordant ESS scores. In support of this hypothesis, researchers have found that partner-completed ESS scores correlate more strongly with objective hypersomnolence on MSLT than patient-completed ESS scores [18]. There are many possible reasons for patient underestimation of the degree of their EDS. It has been suggested that patients long accustomed to chronic OSA-related hypersomnolence may suffer from poor insight into the degree of their impairment. This argument is strengthened by recent observations that patients with OSA treated adequately with auto-CPAP, when asked to retrospectively rescore their pretreatment ESS, report higher scores than they had originally provided at their baseline visit [46]. Other potential reasons for patients’ apparent inability to accurately gauge their degree of impairment may include denial and a dismissive patient attitude toward the suspected illness, a genetic resistance to or an inability to perceive EDS, and the desire to avoid admissions of personal weakness or perceived physical deficiencies due to cultural and personality factors. Patients may also be motivated by exigencies of employment (such as in the case of commercial drivers) to underplay symptoms [47]. We specifically excluded commercial drivers in our study to minimize this possibility, but it is undoubtedly a consideration in a typical sleep medicine practice.

In our study, patient-completed ESS scores did not correlate with the AHI, which is in agreement with the findings of some other authors [8, 26, 27, 30, 31] but is in contradiction to others [18, 25, 32, 34, 35, 42]. Similarly, our findings of a correlation between partner-completed ESS scores and AHI support the conclusions of some other researchers [18, 42–44] and are at variance with others. Several reports have suggested that EDS in patients with OSA is related to the degree of nocturnal hypoxemia [8, 27, 48–53]. However, in our study, partner-completed ESS scores and combined ESS scores weakly correlated with ODI, but with none of the other SpO2-related parameters, and patient-completed ESS scores did not correlate with any of them. In this context, it is worth pointing out that in most studies where positive correlations were found between ESS scores and sleep study parameters, including ours, the effect sizes were small. Some studies in patients with OSA found no significant differences in sleep study-derived parameters of disease severity between patients with and without EDS [32, 54]. Taken together, these findings suggest that factors other than conventional markers of disease severity at least partly determine EDS in OSA, and there have been several attempts to better define the potential patient characteristics and comorbidities that may be responsible [55]. The literature suggests that obesity [32, 56], increased circulating inflammatory cytokines [9, 27, 57], insulin resistance, disturbed nocturnal sleep [31], depression, male gender [32], and younger age [58] all predispose patients to EDS, independent of the degree of sleep-disordered breathing.

Our results also support the growing realization that subjective EDS alone, whether reported by the patient or the spouse, should not be relied upon as a necessary or sufficient symptom while screening patients with OSA. As discussed, several reports [7, 10, 22, 26, 28, 32, 54, 55] have suggested that the degree of EDS, both subjective and objective, is not reflective of the severity or even of the presence of OSA as measured by PSG metrics. This may explain why ESS scores are often misleading when screening patients for OSA. Although our data suggest that partner perception of EDS is more likely to indicate the presence of OSA, there are clearly factors other than the severity of the disease as measured by AHI that determine how susceptible a patient with OSA is to EDS.

There are a few limitations of our study that deserve discussion and that engender recommendations for future research. While we only studied patients referred with a suspicion of OSA, thereby eliminating the confounding effect of other coexistent sleep disorders, we did not control for additional potential confounders like sleep schedules and duration, time of day of ESS administration, medications, or concomitant mood disorders. Future large-scale studies that account for these covariates may help in further defining the relationships between patient-completed and partner-completed ESS and OSA parameters. While we found no gender-related differences with regard to discrepancies between patient-completed and partner-completed ESS scores, our small sample size precluded more detailed analysis based on patient gender and age as covariates or stratification of patients by severity of OSA; this may also be a worthwhile endeavor in larger studies. Finally, since our study involved patients referred to a sleep clinic with a high suspicion of OSA, the results may not be generalizable and future population-based studies may be considered.

In summary, our results suggest that while both patient-completed and partner-completed ESS scores have limited clinical roles by themselves in the evaluation of patients with suspected OSA, taking both scores into consideration does improve the sensitivity and specificity of the screening process. Our observations should encourage clinicians to elicit spousal input where appropriate and will hopefully serve to discourage patients from dismissing spousal perception of their impairment. This is particularly important when treatment decisions are dependent on patients’ self-reported symptoms; given the implications for patient and public safety with regard to driving or activities requiring intense concentration and vigilance, relying on patient perception alone in such situations may be deceptive.