Introduction

Sleep apnoea describes a phenomenon wherein a complete cessation of airflow occurs during sleep for more than 10 s. The cause falls into one of two categories: central or obstructive. In contrast to central sleep apnoea, which has a neurological cause and is marked by absent respiratory effort, obstructive sleep apnoea (OSA) is defined by reduced or absent airflow, despite continued respiratory effort, due to partial or complete upper airway obstruction [1, 2]. This is more likely in patients with structural abnormalities of the upper airway, commonly seen in obesity.

OSA carries a range of adverse consequences. In the acute setting, these include fragmented sleep, hypoxia, and fluctuations in blood pressure and heart rhythm. These physiological changes subsequently predispose to cardiovascular [3], thrombotic, neurological and metabolic sequelae [2], which are likely to impair mood and quality of life and to increase the burden on hospital services [4]. Of greatest concern is that these factors collectively increase the likelihood of premature death.

In 2010, the prevalence of OSA was estimated at 25% in the adult population, with a prevalence of around 45% in the obese [5,6,7,8,9,10,11]. A more recent study puts this figure as high as 71% among bariatric surgery patients, with a predisposition towards male patients [12]. It is essential for OSA to be diagnosed in bariatric patients prior to surgery to avoid potentially life-threatening pre-, peri-, and post-operative complications [13,14,15]. Thus, there is a need to evaluate the methods used to diagnose OSA to allow for early therapeutic intervention where necessary.

OSA is currently diagnosed using a combination of the clinical history and objective measures typically obtained from polysomnography, a sleep study that measures several factors including the patient’s breathing, heart rate and oxygen saturations. The high cost of this test, and the waiting times involved in conducting it and receiving results, are barriers to an early diagnosis of OSA, particularly for those patients with a greater need for treatment. This raises the question of how well alternative screening tools perform, such as the Epworth Sleepiness Scale (ESS, Online Appendix 1) [16] and the Stop-Bang model (Stop-Bang Questionnaire, Online Appendix 2) [17].

These two questionnaires provide measures of the risk of OSA, with ESS providing a subjective perspective. Several studies have investigated the predictive ability of these models, with the majority finding ESS to be of little value in the diagnosis and characterization of OSA [18,19,20]. This is not surprising, as ESS was designed to measure sleepiness [16] and ignores key predisposing factors for OSA such as body mass index (BMI), gender, and neck size [21,22,23]. Despite this, ESS is still used to screen for OSA in the UK.

Unlike ESS, Stop-Bang considers a few predisposing factors for OSA (i.e. BMI and gender), making it more suitable to identify OSA [24, 25], and there is evidence to support its implementation as a screening tool [26,27,28,29]. Of particular benefit is its use in pre-operative assessments [30,31,32]. There are, however, aspects of the model that make it more difficult to use as a screening tool, for instance needing to have someone report if the patient has “witnessed apnoeas” or “loud snoring”.

Studies have looked into the performance of modified versions of the Stop-Bang model [33,34,35], which highlight the importance of variables such as age, BMI, gender and neck circumference in characterising OSA. In the UK, however, over at least the past 6 years, no studies have analysed or developed the performance of the Stop-Bang score as a screening tool for OSA.

The focus of this study is to ascertain the predictive abilities of ESS, Stop-Bang, and BMI for obstructive sleep apnoea (OSA), with particular interest in determining whether any of them can identify patients who could benefit from early intervention. Exploratory analyses with age and gender as covariates are also included.

Method

Data were collected retrospectively for bariatric and non-bariatric patients who attended the sleep clinic between February 2012 and July 2013. No patient had previously been diagnosed with OSA, and all underwent polysomnography regardless of ESS and Stop-Bang outcomes. Information collected included initial ESS and Stop-Bang scores (prior to polysomnography), age, gender, BMI and Apnoea–Hypopnoea Index (AHI). All data were collected as part of normal care and these routinely collected data were anonymous at the point of analysis, conforming to the Governance Arrangements for Research Ethics Committees (GAfREC) standards [36].

The AHI score, taken as a measure of OSA severity, was used to assign each patient to a status group:

  • “None”, for patients who do not have OSA (AHI < 5),

  • “Notreat”, for those with OSA not requiring treatment (5 ≤ AHI ≤ 10),

  • “Treat”, for patients with OSA requiring treatment (AHI > 10).
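The grouping above can be sketched as a small function. This is only an illustration of the thresholds listed; the function name `patient_status` is hypothetical and is not taken from the study's analysis, and the AHI values in the loop are made up to exercise the boundaries.

```python
def patient_status(ahi: float) -> str:
    """Map an Apnoea-Hypopnoea Index (AHI) value to a patient status group.

    Thresholds follow the list above: AHI < 5, 5 <= AHI <= 10, AHI > 10.
    """
    if ahi < 0:
        raise ValueError("AHI cannot be negative")
    if ahi < 5:
        return "None"      # no OSA
    if ahi <= 10:
        return "Notreat"   # OSA not requiring treatment
    return "Treat"         # OSA requiring treatment


# Illustrative values only (not patient data), chosen to hit each boundary.
for ahi in (2.0, 5.0, 10.0, 10.5):
    print(ahi, patient_status(ahi))
```

Note that the boundary values 5 and 10 both fall in the “Notreat” group under these definitions.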

Multinomial logistic (MNL) regression analyses were performed with ESS, Stop-Bang model score (SBM) and BMI as independent variables and patient status group as the dependent variable; tests were conducted at α = 0.05. The aim of the analyses was to determine whether an association existed between Stop-Bang and ESS scores and the outcomes of polysomnography, while accounting for BMI, age and gender.
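For readers less familiar with MNL regression, the general form of the model can be written as below. This is the standard textbook formulation rather than anything specific to this study; the symbols (x for the predictor vector, β for the coefficients) are notation introduced here, and the “Treat” group is taken as the reference category, as in the analyses reported in this paper.

```latex
% Log-odds of each non-reference group j relative to the reference group
\log \frac{P(Y = j \mid x)}{P(Y = \text{Treat} \mid x)}
  = \beta_{j0} + \beta_{j}^{\top} x,
  \qquad j \in \{\text{None}, \text{Notreat}\}

% Implied group probabilities
P(Y = j \mid x)
  = \frac{\exp(\beta_{j0} + \beta_{j}^{\top} x)}
         {1 + \sum_{k \in \{\text{None}, \text{Notreat}\}} \exp(\beta_{k0} + \beta_{k}^{\top} x)},
\qquad
P(Y = \text{Treat} \mid x)
  = \frac{1}{1 + \sum_{k \in \{\text{None}, \text{Notreat}\}} \exp(\beta_{k0} + \beta_{k}^{\top} x)}

% Odds ratio for predictor m in the comparison of group j with Treat
\mathrm{OR}_{jm} = \exp(\beta_{jm})
```

Under this formulation, an odds ratio below 1 for a predictor means that higher values of that predictor shift the odds away from group j and towards the reference group.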

Analysis and results

Data from 192 patients were included and analysed. This population consisted of 126 bariatric patients and 66 non-bariatric patients, with mean age 49 and mean BMI 44 (summaries of data collection: Tables 1, 2). There were some missing data, such that the numbers in each patient status group differ between analyses (ESS and Stop-Bang data collected: Tables 3, 4; Online Appendix). A total of 49 Stop-Bang scores and 3 ESS scores were missing. Comparison of the bariatric and non-bariatric groups by available covariates found no significant differences, and so the two groups were combined into a single data set. There was no problematic collinearity between the predictor variables ESS, SBM and BMI, as indicated by calculated variance inflation factors < 1.4.

Three univariable MNL regressions and one bivariable MNL regression were performed, with patient status group as the outcome. Further triple-variable MNL regressions were performed to assess the effects of age and gender. In all MNL regressions on this data set the reference group was the “treatment” outcome group. Of note, we are aware that BMI, age and gender are components of Stop-Bang; however, as there was no evidence of problematic collinearity, we performed sensitivity analyses that included them as covariates. Our findings are summarised in Table 5.
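As background to the collinearity check reported above, the variance inflation factor (VIF) for a predictor is 1 / (1 − R²), where R² is obtained by regressing that predictor on the remaining predictors. The sketch below is illustrative only: the function names are hypothetical, and it simply shows what a VIF threshold of 1.4 implies about the shared variance between predictors.

```python
def vif_from_r_squared(r_squared: float) -> float:
    """Variance inflation factor implied by the R² of one predictor
    regressed on the other predictors."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R² must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)


def r_squared_from_vif(vif: float) -> float:
    """Inverse relationship: the R² that corresponds to a given VIF."""
    if vif < 1.0:
        raise ValueError("VIF cannot be below 1")
    return 1.0 - 1.0 / vif


# A VIF below 1.4 means that less than about 29% of a predictor's variance
# is explained by the other predictors.
print(round(r_squared_from_vif(1.4), 3))
```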

Table 1 Data collection summary, including patient demographics expressed as mean values with confidence intervals
Table 2 Summary of data as per patient status group, expressed as mean values
Table 3 ESS data collected
Table 4 SB data collected
Table 5 Summary of main results of multinomial logistic (MNL) regressions. For the odds ratios (ORs), the reference category was “treatment”; thus ORs below 1 indicate that higher values of the predictor make “treatment” more likely

MNL regression (univariable predictor: ESS)

The model overall is not significant (p = 0.132). There were a total of 189 observations.

MNL regression (univariable predictor: Stop-Bang score)

Both the model and the variable Stop-Bang are significant (p = 0.034). An odds ratio (OR) below 1 means that, as the variable increases, the odds of the outcome falling into the comparison group relative to the referent (treatment) group decrease. The OR in this case is < 1, meaning that as the Stop-Bang score increases a patient is more likely to be in the “treat” group than in the “None” group (OR 0.671, CI 0.478–0.925). Stop-Bang, however, does not discriminate between the “notreat” and “treat” groups (OR 0.966, CI 0.690–1.351). There were a total of 192 observations. Nagelkerke’s pseudo r-square is 0.054, and overall correct classification is 55.9%.
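To make the odds ratio interpretation concrete: in a logistic model the OR applies per one-unit increase in the predictor, so a k-unit increase multiplies the odds by OR raised to the power k. The sketch below uses the reported OR of 0.671 for the “None” versus treatment comparison; the function name `odds_multiplier` is hypothetical, and the calculation assumes the log-odds are linear in the Stop-Bang score, as the fitted model does.

```python
import math


def odds_multiplier(odds_ratio: float, delta: float) -> float:
    """Factor by which the odds change for a `delta`-unit change in the predictor."""
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return math.exp(delta * math.log(odds_ratio))


# Reported OR for "None" vs treatment per one-point increase in Stop-Bang score.
or_none_vs_treat = 0.671

for delta in (1, 2, 3):
    print(delta, round(odds_multiplier(or_none_vs_treat, delta), 3))
```

For example, a two-point higher Stop-Bang score multiplies the odds of being in the “None” group rather than the treatment group by about 0.45.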

MNL regression (univariable predictor: BMI)

The model and the variable are significant overall (p = 0.004). As BMI increases, a patient is less likely to be in the “none” group than in the treatment group (OR 0.952, CI 0.919–0.986) and less likely to be in the “notreat” group than in the treatment group (OR 0.955, CI 0.917–0.995). There were a total of 191 observations. Nagelkerke’s pseudo r-square is 0.065, and overall correct classification is 57.1%.

MNL regression (bivariable predictors: BMI, Stop-Bang score)

The model is significant (p = 0.001); BMI is significant (p = 0.002) and Stop-Bang maintains significance (p = 0.016). Increasing BMI makes a patient less likely to be in the “None” group than in the treatment group (OR 0.629, CI 0.449–0.882), but does not discriminate “notreat” from the treatment group (OR 0.946, CI 0.895–1). Increasing Stop-Bang score makes a patient less likely to be in the “None” group than in the treatment group (OR 0.925, CI 0.879–0.972), but does not discriminate “notreat” from the treatment group (OR 0.900, CI 0.637–1.272). There were a total of 142 observations. Nagelkerke’s pseudo r-square is 0.150, and overall correct classification is 62.7%.

Controlling for demography

MNL regressions were performed with either Stop-Bang or BMI, together with the co-variates age and gender, to assess their effects and to analyse the previous significant findings further.

MNL regression (tri-variable predictors: Stop-Bang score, age, gender)

Neither the model nor the variables are significant.

MNL regression (tri-variable predictors: BMI, age, gender)

The model is significant (p < 0.001), as are all variables. For age (p = 0.003), increasing age makes a patient less likely to be in the “None” group than in the treatment group (OR 0.951, CI 0.922–0.980), but does not discriminate “notreat” from the treatment group (OR 0.981, CI 0.948–1.016). For BMI (p < 0.001), increasing BMI makes a patient less likely to be in the “None” group (OR 0.918, CI 0.880–0.958) and less likely to be in the “notreat” group (OR 0.927, CI 0.884–0.972) than in the treatment group. For gender (p = 0.001), female patients are more likely to be in the “None” group (OR 3.538, CI 1.597–7.838) and more likely to be in the “notreat” group (OR 3.893, CI 1.567–9.675) than in the treatment group. Nagelkerke’s pseudo r-square is 0.204, and overall correct classification is 62.3%.

Discussion

This study set out to ascertain the predictive abilities of ESS and Stop-Bang, while assessing the influence of co-variates BMI, age and gender, to risk stratify patients. In common with previous literature [18,19,20, 37], ESS was not found to have predictive ability for OSA severity: ESS’s identification of OSA is based upon the level of daytime somnolence, which is not invariably present with the condition and correlates poorly with its severity [21, 38, 39]. Further, the questionnaire has a fairly low sensitivity when using the suggested cut-off of 10 [40]. The International Classification of Sleep Disorders has thus made it clear that the ESS is no longer important for the diagnosis of OSA [41].

The performance of Stop-Bang as a screening tool for OSA has been widely demonstrated, with recent studies emphasizing the ability of the questionnaire to effectively detect moderate-severe OSA [26, 27, 29, 35, 42, 43]. In particular, Chung et al. were able to validate use of the score in bariatric patients, and demonstrate high sensitivity and specificity for detecting severe OSA with a score of 4 and 6, respectively [43]. In our study, a large relative proportion of patients without OSA had a score of 4, whereas the “treat” group had the highest relative proportion of patients with a score of 6. While our findings demonstrate that a higher Stop-Bang score is associated with a higher risk of severe OSA, the score failed to differentiate between the two groups of patients diagnosed with the condition. This requires investigation in a larger population of patients; however, there remains the issue of distinguishing severe from moderate-severity OSA, particularly with mid-range scores (i.e. 3–4), for which further classification is necessary [17, 27, 35].

There is no doubt that raised BMI, age and male gender are all important risk factors for OSA [6, 21]. From this study, it can be inferred that these variables contribute to the predictive power of the overall Stop-Bang score. Indeed, the Stop-Bang model failed to retain its significance when 2 of the 8 items included (age and gender) were controlled for statistically. Modifying the Stop-Bang questionnaire to provide weighting to these variables, particularly as continuous rather than dichotomised measures, can improve its performance as a screening tool [33,34,35]. Nahapetian et al. compared the predictive abilities of the standard Stop-Bang questionnaire with two weighted versions and found that the specificity for classifying OSA patients with AHI ≥ 15 was greatest when the model was weighted for the continuous variables BMI, age and neck circumference; the high sensitivity of the model was preserved at around 93% [33]. Chung et al. demonstrated in a cohort of 516 patients that specific combinations of items in the Stop-Bang model could improve its specificity; this was seen in various combinations of a Stop score ≥ 2 with BMI > 35, male gender and age > 50. The specific combination of a Stop score ≥ 2, male gender and BMI > 35 was shown to yield the greatest increase in predictive power [34]. Given these findings, in combination with our own, one may postulate better predictive power for an alternative score that considers only objective measures relating to the identification or severity of OSA [44]. The recently developed DAS-OSA score, for example, comprises five items, namely Mallampati score (a measure of ease of endotracheal intubation), chin-thyroid distance, BMI, gender and neck circumference [45], and was shown to be more specific than Stop-Bang in predicting moderate-severe OSA and more sensitive in predicting severe OSA [46]. Extensive study, however, would be necessary to validate use of such a score over an already well-established screening tool such as the Stop-Bang questionnaire.

Limitations

The study population overall was fairly small, yet sufficient to power regressions with ten events per variable (EPV) for the uni-variable and bi-variable analyses. However, in the analyses with age and gender, the EPV is reduced, and so the results should be treated as exploratory [47]. The low Nagelkerke’s values indicate unexplained variation. To assess the effect of other variables, we would require more observations to provide sufficient EPV to power the analysis. The overall predictions were correct for just over half of the study population, and thus, despite statistical significance, there is a need for development before these models would be useful clinically. The relatively small sample size and the involvement of a single centre mean our findings are unlikely to be representative of all patients with OSA in the UK. Furthermore, due to the retrospective nature of the study, almost a quarter of Stop-Bang scores were missing. Other data such as neck circumference and co-morbidities were not collected but could strengthen the models.
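The EPV reasoning above can be illustrated with a short calculation. This is a sketch under stated assumptions: it uses one common convention for multinomial outcomes (size of the smallest outcome group divided by the number of predictors), which may differ from the exact definition used in [47]; the function name is hypothetical; and the group counts are made up for illustration and are not the study's data.

```python
from typing import Mapping


def events_per_variable(group_counts: Mapping[str, int], n_predictors: int) -> float:
    """EPV under the assumed convention: smallest outcome group / number of predictors."""
    if n_predictors < 1:
        raise ValueError("at least one predictor is required")
    if not group_counts:
        raise ValueError("group_counts must not be empty")
    return min(group_counts.values()) / n_predictors


# Hypothetical group sizes, NOT taken from the study.
hypothetical_counts = {"None": 50, "Notreat": 25, "Treat": 75}

for n_predictors in (1, 2, 3):
    epv = events_per_variable(hypothetical_counts, n_predictors)
    status = "meets 10 EPV" if epv >= 10 else "below 10 EPV"
    print(n_predictors, round(epv, 2), status)
```

With these made-up counts the one- and two-predictor models clear the threshold of ten, while the three-predictor model does not, which mirrors why the analyses including age and gender are treated as exploratory.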

Strengths

As a retrospective study, no additional burden was placed on staff to collect data, which were obtained from hospital records. Rather than looking at a selected population of patients, all patients referred for sleep study over a year were included in this study, thus representing the reality of clinical practice in the study centre.

Conclusion

ESS is not an appropriate screening tool for OSA. This study demonstrates the Stop-Bang model to be a useful screening tool for OSA, in particular OSA requiring treatment. Age, gender and BMI are also shown to have equal predictive significance. While studies have promoted the use of a “weighted” Stop-Bang model, further study may benefit the development and implementation of a concise and more specific screening tool that considers well-evidenced risk factors for OSA, including male gender, greater age and raised BMI. This would be of particular benefit in bariatric patients, who are at high risk of the condition.