Introduction

Excessive daytime sleepiness (EDS), defined as sleepiness that occurs unintentionally during daily situations when an individual is expected to be awake and alert [1], is responsible for daytime working inefficiency and work-related and traffic-related accidents [2,3,4], thus presenting a major healthcare, social and economic burden [5]. It is therefore essential to differentiate EDS from fatigue (subjective lack of physical or mental energy), tiredness (physical or muscular exhaustion) or even depression (unwillingness to get out of bed in the morning or during the day, loss of interest, thoughts of worthlessness) [1, 6] in order to diagnose and treat each condition appropriately. The prevalence of EDS is estimated to be 10–25% in the general population, with many different causes such as insufficient sleep, comorbidities (neurological, metabolic etc.), medications and depression, but also sleep-related breathing disorders [7]. For decades EDS was considered the most common and prominent symptom of sleep-related breathing disorders, especially of obstructive sleep apnea (OSA) [8, 9], in which repetitive collapse of the upper airway during sleep leads to oxygen desaturation, arousal and consequently shortened or fragmented sleep [10]. Even though some studies dispute EDS as the most frequently identified subjective problem in the population with confirmed sleep apnea (only 25% of patients experienced EDS as the major problem in contrast to 40% of patients complaining about the lack of energy and 35% feeling tired or fatigued) [8], both the literature and sleep medicine practice consider EDS the most prominent and common indicator of OSA in a population at risk [1, 7]. Other sleep-related breathing disorders (central sleep apnea or sleep-related hypoxemia/hypoventilation disorders) or sleep disorders in general (circadian rhythm or movement disorders) are not considered to usually provoke this outcome [1].

There are many diagnostic tools and tests available in sleep medicine that may, after a thorough history of sleep and daytime sleepiness (including heteroanamnestic data) and physical examination (nonspecific in most cases, except in OSA), objectively or subjectively measure EDS [11]. The most familiar objective tools for quantifying EDS, the multiple sleep latency test and the maintenance of wakefulness test, are not readily available and demand expertise [12]. Polysomnography (PSG) is another objective tool for assessing patients’ quality of sleep and sleep architecture or the incidence of arousal (respiratory, spontaneous or motoric), all of which already have an established correlation to EDS, especially in individuals suspected of OSA or other sleep-related disorders [13]. PSG, as the gold standard, requires attended overnight video-monitored surveillance of physiologic variables during sleep in a specialized sleep laboratory to confirm the diagnosis of sleep apnea and to quantify its severity [1, 14, 15]. Performing PSG is very demanding for both the patient and the technician/physician, and entails high costs and a hospital stay. A more easily available, albeit subjective, tool for quantifying EDS in a patient with suspected obstructive sleep apnea is therefore the Epworth sleepiness scale (ESS) [16, 17]. Introduced by Johns in 1991, the ESS estimates the likelihood of dozing off or falling asleep in usual, mainly sedentary, life situations (sitting, reading, traveling), and a total score greater than 10 suggests genuine EDS as opposed to similar complaints, such as tiredness, fatigue or lack of energy. Since its appearance, the ESS has become the most widely used and studied subjective tool for assessing excessive daytime sleepiness [17,18,19,20,21], which is considered a major characteristic of OSA [22].
Many studies have evaluated the ESS as a screening tool for OSA [20, 22,23,24] by investigating the correlation between the ESS score and the polysomnographic apnea hypopnea index (AHI); the ESS score was found to be significantly higher in OSA subjects than in the control group, successfully distinguishing patients from healthy individuals [25]. The ESS was also investigated for its ability to accurately identify the sleepy subtypes of moderate to severe OSA, which have been shown to have a higher cardiovascular risk than non-sleepy patients [26]. The ESS has also been evaluated against the maintenance of wakefulness test (an objective tool for EDS diagnosis) in OSA patients treated with positive airway pressure [27].

OSA is a disorder burdened with numerous comorbidities that consumes healthcare resources and creates high costs [28,29,30,31,32,33], and it is often underdiagnosed [34, 35] despite its increasing prevalence [36]. Therefore, detecting OSA in a population at risk by using a screening tool that is inexpensive and widely and easily available is of major benefit.

The aim of this study was to revisit the ESS and to re-evaluate the significance of EDS for detecting OSA patients in a population at risk, compared with the AHI obtained by the gold standard for OSA diagnosis, overnight polysomnography (PSG).

Subjects and methods

From a total of 320 individuals referred to the laboratory for sleep-related breathing disorders at University Hospital Centre Zagreb for the evaluation of suspected sleep apnea, 266 subjects were included in the study. Inclusion criteria were symptoms of OSA: heavy snoring with witnessed repetitive breathing cessations and excessive daytime sleepiness. Exclusion criteria were previously diagnosed sleep-related breathing disorders, other unrelated sleep disorders (insomnia, parasomnia etc.), acute or severe chronic illness or medication intake that could interfere with the PSG recording or lead to overestimation of the PSG results, and an incomplete ESS questionnaire or incomplete PSG monitoring or recording.

Each subject underwent medical history taking and a physical examination, after which the standardized ESS questionnaire (Fig. 1, [16, 17]) was completed, followed by an overnight PSG study. The PSG was performed with a standard polysomnograph, Alice 5 (Philips Respironics, Murrysville, PA, USA), in accordance with the American Academy of Sleep Medicine (AASM) 2007 recommendations [14]: overnight, in-laboratory, video-attended monitoring including electroencephalogram, electro-oculogram, submental and anterior tibial electromyogram, electrocardiogram, airflow parameters (detected by nasal pressure sensor and oronasal thermistor), breathing effort parameters (detected by inductance plethysmography of the chest wall and abdomen), oxygen saturation (by peripheral pulse oximetry) and body position. All recorded PSG data, automatically scored by the computer, were then manually corrected by a qualified sleep technician and supervised by a sleep medicine physician, in accordance with AASM criteria [14, 15].

Fig. 1 Epworth sleepiness scale [16, 17]

An ESS score of ≥ 10 was considered positive (suggesting a high risk of excessive daytime sleepiness) [16, 19]. A PSG apnea hypopnea index (number of obstructive apneas and hypopneas per hour of total sleep time) of ≥ 5 was considered positive for OSA. OSA severity was further classified by AHI as follows: AHI 5–14 defining mild OSA, AHI 15–29 moderate OSA and AHI ≥ 30 severe OSA [1, 23].
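The thresholds above can be expressed as a short sketch. It is illustrative only: the function names are hypothetical, the 0–24 range check reflects the standard eight-item ESS (each item scored 0–3) rather than anything stated in this section, and treating the severity classes as half-open intervals (so that a fractional AHI such as 14.6 counts as mild) is an assumption about how non-integer values are handled.

```python
def ess_is_positive(total_score: int, cutoff: int = 10) -> bool:
    """Return True if the ESS total meets the study's positivity cut-off (>= 10)."""
    # Assumption: standard ESS total ranges from 0 to 24.
    if not 0 <= total_score <= 24:
        raise ValueError("ESS total score must be between 0 and 24")
    return total_score >= cutoff


def osa_severity(ahi: float) -> str:
    """Classify OSA severity from the apnea hypopnea index (events per hour)."""
    if ahi < 0:
        raise ValueError("AHI cannot be negative")
    if ahi < 5:
        return "no OSA"      # PSG negative
    if ahi < 15:
        return "mild"        # AHI 5-14
    if ahi < 30:
        return "moderate"    # AHI 15-29
    return "severe"          # AHI >= 30


if __name__ == "__main__":
    print(ess_is_positive(9), ess_is_positive(10))
    print(osa_severity(4.0), osa_severity(15.4), osa_severity(37.5))
```

The two example AHI values in the last line are the median values reported for women and men in the Results.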

This study was approved by the Ethics Committee of the University Hospital Centre Zagreb and was performed in accordance with ethical principles outlined in the 1964 Declaration of Helsinki. Each participant gave written informed consent prior to inclusion in the study.

Data analysis

Statistical analysis was performed using MedCalc Statistical Software version 18 (MedCalc Software bvba, Ostend, Belgium; http://www.medcalc.org; 2018). A minimal sample size of 223 subjects (178 positive and 45 negative) was calculated for the expected area under the curve (AUC) of 0.65 with a statistical power of 90% (beta 0.10) and alpha of 0.05. Categorical data were presented as absolute and relative (%) numbers. Continuous variables were presented as mean ± standard deviation (SD) or as median (interquartile range, IQR), depending on the type of distribution. Categorical data were compared between subgroups using the χ2-test and continuous variables using Student’s t‑test or the Mann-Whitney U test. Receiver operating characteristic (ROC) analysis was employed to compare each ESS result with the polysomnographic AHI, with sensitivity (SE), specificity (SP), positive predictive value (PPV), negative predictive value (NPV), diagnostic odds (DO) and diagnostic accuracy (DA) calculated for the ESS. P < 0.05 was considered statistically significant for all analyses.
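As a rough illustration of what the ROC analysis measures, the sketch below computes the AUC as the probability that a randomly chosen PSG-positive subject has a higher ESS score than a randomly chosen PSG-negative subject (ties counting one half), together with sensitivity and specificity at a given cut-off. This is not the MedCalc procedure used in the study; the function names are hypothetical and the scores in the example are invented toy values, not study data.

```python
from itertools import product
from typing import Sequence, Tuple


def auc_from_scores(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as P(positive score > negative score), with ties counted as 0.5."""
    if not pos_scores or not neg_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p, n in product(pos_scores, neg_scores):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def sens_spec_at_cutoff(
    pos_scores: Sequence[float], neg_scores: Sequence[float], cutoff: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when a score >= cutoff counts as test-positive."""
    sensitivity = sum(s >= cutoff for s in pos_scores) / len(pos_scores)
    specificity = sum(s < cutoff for s in neg_scores) / len(neg_scores)
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy ESS scores, invented for illustration only (NOT study data).
    toy_osa = [4, 9, 12, 15]
    toy_no_osa = [3, 6, 10]
    print("AUC:", auc_from_scores(toy_osa, toy_no_osa))
    for cutoff in (8, 10, 12):
        print(cutoff, sens_spec_at_cutoff(toy_osa, toy_no_osa, cutoff))
```

Sweeping the cut-off in the final loop shows the usual trade-off: lowering it raises sensitivity and lowers specificity, while the AUC itself does not depend on any single cut-off.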

Results

Of the 266 included subjects, 189 were men (71.1%) and 77 women (28.9%), with a mean ± SD age of 57.9 ± 11.6 years and no significant difference between sexes (mean ± SD, men 57.4 ± 12.4 years vs. women 58.9 ± 9.5 years, p = 0.371). Of the subjects, 92 (34.6%) had a positive ESS test, with a mean ± SD ESS score of 8.1 ± 5.2 for men and 7.8 ± 4.7 for women (p = 0.633). OSA was diagnosed by PSG in 213 (80.1%) patients: 46 (21.6%) had mild, 37 (17.4%) moderate and 130 (61.0%) severe OSA. Men were found to have more severe OSA than women, median (IQR) AHI 37.5 (8.1–68.0) vs. 15.4 (4.0–43.2); p = 0.001 (Table 1). Most subjects with a positive ESS test (88.0%) were found to have a positive PSG, but most subjects with a negative ESS (75.9%) were also PSG positive (42.0% of them with AHI ≥ 30). The area under the ROC curve for the ESS was 0.60 (95% CI, 0.54–0.66; p = 0.020, Fig. 2), with SE 38.0%, SP 79.3%, PPV 88.0%, NPV 24.1% and DA 46.2% for the ESS at a cut-off score of 10. We also found that changing the cut-off point for a positive ESS result (below or above a score of 10) changed sensitivity at the expense of specificity, as expected, but did not yield a higher predictive probability of OSA diagnosis (positive PSG) (Table 2).
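The cross-tabulation behind these figures is not given in the text. As an illustration, the 2×2 counts in the sketch below are back-calculated from the reported totals and percentages (92 ESS-positive and 213 PSG-positive subjects out of 266; PPV 88.0%; 75.9% of ESS-negative subjects PSG positive). They are therefore an assumption, not values taken from the study tables, and the class and function names are hypothetical. With these counts the standard formulas reproduce the reported SE, PPV, NPV and DA, and give a specificity of 42/53, which rounds to 79.2% rather than the reported 79.3%.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TwoByTwo:
    """ESS result (test) cross-tabulated against PSG result (reference)."""
    tp: int  # ESS positive, PSG positive
    fp: int  # ESS positive, PSG negative
    fn: int  # ESS negative, PSG positive
    tn: int  # ESS negative, PSG negative


def diagnostic_metrics(t: TwoByTwo) -> Dict[str, float]:
    """Standard diagnostic accuracy measures as proportions between 0 and 1."""
    total = t.tp + t.fp + t.fn + t.tn
    return {
        "sensitivity": t.tp / (t.tp + t.fn),
        "specificity": t.tn / (t.tn + t.fp),
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
        "accuracy": (t.tp + t.tn) / total,
    }


# Assumption: counts reconstructed from the reported percentages, not study data.
reconstructed = TwoByTwo(tp=81, fp=11, fn=132, tn=42)

if __name__ == "__main__":
    for name, value in diagnostic_metrics(reconstructed).items():
        print(f"{name}: {100 * value:.1f}%")
```

The low NPV follows directly from the high prevalence of OSA in the referred population: with 213 of 266 subjects PSG positive, even a negative ESS leaves most subjects with OSA.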

Table 1 Patient characteristics

Fig. 2 Receiver operating characteristic (ROC) analysis for the Epworth sleepiness scale (ESS) result in predicting positive obstructive sleep apnea (polysomnographic apnea hypopnea index (AHI) ≥ 5). The area under the curve was 0.599 for the ESS; p = 0.020. Sensitivity (true positive rate, Y axis) is plotted as a function of 100 − specificity (false positive rate, X axis)

Table 2 Criterion values and coordinates of the ROC analysis for the ESS result in predicting positive obstructive sleep apnea (polysomnographic AHI ≥ 5)

Discussion

It has been observed in our daily sleep medicine practice that the ESS is often used as a screening tool for OSA, especially in patients referred to our sleep laboratory for polysomnographic confirmation of the OSA diagnosis. With respect to this observation and to the review of available literature, we realized that excessive daytime sleepiness is still considered the most frequent and important symptom of sleep disorders, including sleep apnea [7,8,9, 22], even though EDS is not etiology specific and may be caused by any sleep-disturbing factor (respiratory or non-respiratory).

With this study we therefore wanted to re-evaluate the significance of EDS, subjectively measured by the ESS, as a screening tool for obstructive sleep apnea in a population at risk for OSA, by comparing it to the polysomnographic AHI.

In our study population (predominantly male, mean age 58 years regardless of sex), referred to us under suspicion of obstructive sleep apnea, OSA was diagnosed by PSG in most subjects (80%). Most of the OSA patients had moderate (17%) or severe (61%) apnea, with men having more severe OSA than women (median AHI 37.5 vs. 15.4), yet only about one third of all participants (35%) reported excessive daytime sleepiness as measured by the ESS.

As expected, most subjects with a positive ESS test (88.0%) were found to have OSA. The finding of a high specificity and positive predictive value of the ESS was in accordance with other studies and literature investigating the correlation of excessive daytime sleepiness with OSA [22, 37]; however, unexpectedly, most subjects with a negative ESS (75.9%) in our study were also diagnosed with OSA by PSG (42.0% of them having severe OSA).

Many studies have examined the relationship between ESS and AHI, but the results have often been conflicting [20,21,22], most of them showing good [25] and some of them moderate or even weak correlation [27].

How can these findings of low sensitivity (SE 38%) and low negative predictive value (NPV 24%) of the ESS in our OSA-predominant population be explained?

Regarding the ESS, we used the validated Croatian version of the questionnaire [25] in order to avoid any misinterpretation or misunderstanding among participants related to the items asked. Comparing the patient and diagnostic characteristics of our study with those of the study of Pecotic et al. [25], we found no significant differences between the tested patients. Nevertheless, the results of their study showed that the mean ESS score was significantly higher for the OSA patients than for the control group (without OSA history/symptoms), successfully distinguishing patients from healthy individuals. According to our data, we cannot rule out OSA in suspected patients on the basis of a negative ESS score (low SE and low NPV of the ESS in our study), but the correlation between a positive ESS and the diagnosis/severity of OSA (AHI) was significant, both in our study (high SP and PPV of the ESS) and in the study of Pecotic et al. [25].

Furthermore, we analyzed the effect of changing the ESS cut-off point for a positive test (below or above a score of 10) in order to improve its diagnostic value. Doing so changed sensitivity at the expense of specificity, as expected, but did not increase the predictive probability of OSA diagnosis (i.e., positive PSG) (Table 2).

The low SE of the ESS was also found in the study of Silva et al. [24], who compared this questionnaire with the Berlin [38], STOP [39] and STOP-Bang [40] questionnaires. The reason for this finding may lie in population-specific characteristics [41, 42], behavior or situations [7, 43, 44], or it could even be gender-related [45, 46]. Most of our patients diagnosed with OSA were men, and their self-report of excessive sleepiness may not be as reliable as a report obtained from spouses who witness it, because of the fear of losing a driver's license or other social benefits, of having problems at work, or simply because they disregard sleepiness as having an important impact on daily functioning. The data on sleeping habits and daytime sleepiness in relation to OSA in different populations are still controversial [47].

According to cluster analyses of OSA patients across international sleep centers, moderate to severe OSA is considered a disorder with different subtypes, depending on symptoms: disturbed sleep, excessive sleepiness, minimally symptomatic [26]. As the Sleep Heart Health Study recognized the excessively sleepy subtype of OSA as bearing a higher risk for cardiovascular events and disease than other subtypes [48], Mazzotti et al. investigated whether the ESS may accurately identify this subtype. In three large cohorts of moderate to severe OSA, the authors compared ESS scores among subtypes, adjusted for demographic and AHI characteristics, and observed higher ESS scores in the excessively sleepy subtype in all cohorts. Even though a positive ESS (score > 10) had SE 96.6%, SP 57.2%, PPV 73.3% and NPV 93.3% for predicting the excessively sleepy subtype, the authors concluded that additional sleepiness symptoms beyond the ESS increase predictive performance and consequently detection of the “risky” subtype.

There are two important parameters for assessing the severity of OSA: the apnea hypopnea index and excessive daytime sleepiness [49]. It is pathophysiologically understandable and confirmed in daily practice that patients with more severe OSA are expected to be more symptomatic, among other things more sleepy and therefore ESS positive, as already discussed. As this was not the case in our tested population, in which severe OSA predominated among OSA patients (61%), we should perhaps look beyond the relationship between the ESS and the polysomnographic AHI. The correlation of the ESS with the incidence of respiratory arousals (arousal index, ArI), diagnosed by PSG, might, in addition to AHI, better explain the complex relationship between sleepiness (measured by the ESS) and OSA (with its different symptom subtypes). There is still a lack of studies exploring the connection between the ESS and ArI (± AHI), and we did not perform this kind of subanalysis in our research, so this relationship needs to be further investigated.

The strength of this study is its reliability: after completion of the ESS, every participant underwent standardized overnight attended in-laboratory polysomnography (the gold standard for OSA diagnosis), which was performed in accordance with the official AASM 2007 recommendations [14]. Thus, regardless of the ESS result, each subject was polysomnographically diagnosed as having OSA or not. We used the validated Croatian version of the ESS to avoid any misunderstandings in establishing the state of sleepiness. There is always a question of the reproducibility of a study such as ours when it is compared to similar research using different inclusion criteria and threshold criteria regarding the ESS, the polysomnographic AHI or even PSG devices [50]. We believe that under the circumstances set in our study and with the thresholds stated (positive ESS ≥ 10, positive AHI ≥ 5, with severity cut-offs as mentioned), our study should be reproducible.

Our study has certain limitations with respect to the investigated population, which suggests a bias towards subjects with a high pre-test (pre-PSG) probability of OSA. Also, there is always a question of population characteristics regarding gender, lifestyle and sleeping habits, medications and comorbidities that may affect the subjective presentation or impression of daytime somnolence [47].

Conclusion

Based on our study population, we consider excessive daytime sleepiness, measured by the ESS, not to be a valuable tool for detecting obstructive sleep apnea patients in a population at risk of OSA, especially when the test is negative. Using the ESS in that fashion, we would miss a large subgroup of patients who are in need of PSG and consequently of OSA treatment. Therefore, we should consider other screening tools for OSA that do not rely solely on daytime sleepiness but on other parameters as well (e.g., Berlin questionnaire, STOP-Bang).