Introduction

Obstructive sleep apnea (OSA) is a highly prevalent condition [1,2,3,4] characterized by recurrent obstructive episodes of the upper airways, resulting in intermittent hypoxemia and sleep fragmentation [5]. OSA has often been associated with cardiovascular, metabolic, and neurocognitive consequences [5], in addition to a myriad of other health-related issues, all of which result in increased healthcare utilization and costs [6,7,8]. Among the several risk factors for OSA development, obesity is the most robust risk factor and predictor [5, 9,10,11]. Accordingly, a high frequency of OSA is usually observed among individuals with obesity [12,13,14,15,16,17,18].

Bariatric surgery (BS) is the most effective and long-lasting treatment for obesity, reducing the risk of obesity-related comorbidities, in addition to promoting significant improvements in obstructive respiratory events, oxygenation, and mortality [19,20,21]. As OSA is significantly underdiagnosed in BS patients [22,23,24,25], the majority of BS programs routinely evaluate all patients for OSA regardless of sleep complaints [26,27,28], although such an approach is not universally implemented [29, 30]. In general, BS patients with OSA appear to be at higher risk for perioperative and postoperative complications, thereby justifying OSA screening, especially for the most severe phenotypes [31,32,33,34], even if an association between OSA severity and a complicated clinical course is not always present [35, 36].

Currently, the gold standard for OSA diagnosis consists of full polysomnography (PSG) performed in a sleep laboratory. However, this test is not widely available for the large number of patients who need to be tested for suspected OSA, especially in areas with limited financial resources. In this context, clinical screening instruments can identify patients at high risk for OSA, who can then be offered portable diagnostic methods, possibly reducing the long waiting times found in many sleep laboratories [37,38,39].

The GOAL questionnaire is a recently developed OSA screening tool [40]. It includes four components that are easy to acquire during the clinical evaluation of a patient with suspected OSA: gender, obesity, age, and loud snoring. A score ≥ 2 points (from 0 to 4 points) classifies individuals as being at high risk for OSA [40]. In the derivation and validation study, GOAL showed adequate performance in OSA screening, with discrimination similar to that of three other widely validated instruments, namely No-Apnea, STOP-Bang, and NoSAS [40].

Despite some studies evaluating screening tools for OSA in BS patients [41,42,43,44,45,46], the GOAL questionnaire has not yet been subjected to similar validation in this setting. Accordingly, the present study aimed to evaluate the GOAL as a screening tool for OSA in a sample of consecutive BS patients and to compare its discriminatory ability with that obtained with other frequently used screening instruments.

Methods

From January 2017 to March 2020, this cross-sectional study prospectively enrolled individuals with obesity (body mass index (BMI) ≥ 35.0 kg/m2) who were referred for overnight in-lab PSG assessment by their respective attending physicians. Inclusion criteria were adult patients with obesity (aged ≥ 18 years) undergoing assessment before BS. Individuals were excluded for any of the following reasons: previously diagnosed OSA, at-home PSG studies, incomplete clinical data, and technically inadequate PSG.

Patient characteristics included gender, age, BMI, neck circumference (NC), and self-reported comorbidities (hypertension and type 2 diabetes mellitus). BMI was calculated by dividing the weight in kilograms by the square of the height in meters (kg/m2). NC (in cm) was measured using a flexible tape with all subjects in the upright sitting position. On the evening of the PSG, all demographic, anthropometric, and clinical data were systematically collected by qualified sleep technicians, and the screening instruments were completed: GOAL, STOP, STOP-Bang, No-Apnea, NoSAS, and Epworth Sleepiness Scale (ESS).

The protocol (SleepLab Study) was approved by the Ethics Committee of the Federal University of Rio de Janeiro (#1.764.165) and was carried out following the Declaration of Helsinki. Written informed consent was obtained from each participant, and the anonymity of all participants was preserved.

Screening Instruments

GOAL questionnaire is a 4-item instrument (male gender, obesity (BMI ≥ 30.0 kg/m2), age ≥ 50 years, and loud snoring), containing yes-or-no dichotomous answers (1 point for each positive answer). Score ≥ 2 (from 0 to 4 points) classifies individuals at high risk for OSA [40].
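The scoring rule described above can be sketched in a few lines of Python. This is only an illustrative sketch and not code used in the study; the function and parameter names (`goal_score`, `goal_high_risk`, `loud_snoring`, and so on) are hypothetical.

```python
def goal_score(male: bool, bmi: float, age_years: float, loud_snoring: bool) -> int:
    """Return the GOAL score (0 to 4): one point per positive item."""
    points = 0
    points += 1 if male else 0               # G: male gender
    points += 1 if bmi >= 30.0 else 0        # O: obesity, BMI >= 30.0 kg/m2
    points += 1 if age_years >= 50 else 0    # A: age >= 50 years
    points += 1 if loud_snoring else 0       # L: loud snoring
    return points


def goal_high_risk(score: int) -> bool:
    """A score of 2 or more classifies the individual as high risk for OSA."""
    return score >= 2


# Hypothetical patient: 37-year-old woman, BMI 41.2 kg/m2, no loud snoring.
example = goal_score(male=False, bmi=41.2, age_years=37, loud_snoring=False)
print(example, goal_high_risk(example))  # 1 False
```

Because every participant in a bariatric cohort with BMI ≥ 35.0 kg/m2 receives the obesity point, the lowest score such a patient can obtain under this rule is 1.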

No-Apnea score contains two objective parameters (NC and age). Each variable is categorized as follows: NC (in cm) is scored in three values: 1 (37.0–39.9), 3 (40.0–42.9), and 6 (≥ 43.0), while age (in years) is scored as follows: 1 (35–44), 2 (45–54), and 3 (≥ 55). Score ≥ 3 (from 0 to 9 points) is considered high risk for the presence of OSA [47].
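As an illustration, the No-Apnea bands listed above might be coded as follows. This is a sketch under two assumptions that are not stated explicitly in the text: values below the lowest band (NC < 37.0 cm, age < 35 years) score 0, which is consistent with the 0 to 9 range, and the bands are treated as contiguous (for example, any NC from 40.0 up to but not including 43.0 cm scores 3). The function names are hypothetical.

```python
def no_apnea_score(neck_cm: float, age_years: float) -> int:
    """Return the No-Apnea score (0 to 9) from neck circumference and age."""
    # Neck circumference component (0, 1, 3, or 6 points).
    if neck_cm >= 43.0:
        nc_points = 6
    elif neck_cm >= 40.0:
        nc_points = 3
    elif neck_cm >= 37.0:
        nc_points = 1
    else:
        nc_points = 0  # assumption: below the lowest listed band

    # Age component (0, 1, 2, or 3 points).
    if age_years >= 55:
        age_points = 3
    elif age_years >= 45:
        age_points = 2
    elif age_years >= 35:
        age_points = 1
    else:
        age_points = 0  # assumption: below the lowest listed band

    return nc_points + age_points


def no_apnea_high_risk(score: int) -> bool:
    """A score of 3 or more is considered high risk for OSA."""
    return score >= 3
```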

STOP and STOP-Bang questionnaires consist, respectively, of 4 or 8 yes-or-no questions (1 point for each affirmative answer). STOP contains 4 questions about loud snoring, tiredness, observed apnea, and hypertension, while STOP-Bang uses STOP components plus BMI > 35 kg/m2, age > 50 years, NC > 40 cm, and male gender. STOP and STOP-Bang questionnaires use score ≥ 2 (from 0 to 4 points) and ≥ 3 (from 0 to 8 points) to identify individuals at risk for the presence of OSA, respectively [48].
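A minimal sketch of both scores, using the items and cut-offs described above, is given below. The function and argument names are hypothetical, and the strict inequalities (>) follow the wording of this section.

```python
def stop_score(snoring: bool, tiredness: bool, observed_apnea: bool,
               hypertension: bool) -> int:
    """Return the STOP score (0 to 4): one point per affirmative answer."""
    return sum([snoring, tiredness, observed_apnea, hypertension])


def stop_bang_score(snoring: bool, tiredness: bool, observed_apnea: bool,
                    hypertension: bool, bmi: float, age_years: float,
                    neck_cm: float, male: bool) -> int:
    """Return the STOP-Bang score (0 to 8): STOP items plus four more."""
    bang = [
        bmi > 35.0,       # B: BMI > 35 kg/m2
        age_years > 50,   # a: age > 50 years
        neck_cm > 40.0,   # n: neck circumference > 40 cm
        male,             # g: male gender
    ]
    return stop_score(snoring, tiredness, observed_apnea, hypertension) + sum(bang)


def stop_high_risk(score: int) -> bool:
    return score >= 2   # STOP cut-off


def stop_bang_high_risk(score: int) -> bool:
    return score >= 3   # STOP-Bang cut-off
```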

NoSAS score allocates 4 points for having NC > 40 cm, 3 points for having a BMI of 25–29 kg/m2, or 5 points for having a BMI ≥ 30 kg/m2, 2 points for habitual snoring, 4 points for age > 55 years, and 2 points for the male gender. Score ≥ 8 (from 0 to 17 points) classifies individuals at high risk for the presence of OSA [49].
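The point allocation above can be written as a short function. This is a sketch only: the names are hypothetical, and the BMI band written as 25–29 kg/m2 is assumed here to mean 25 ≤ BMI < 30, so that no value falls between the two BMI categories.

```python
def nosas_score(neck_cm: float, bmi: float, habitual_snoring: bool,
                age_years: float, male: bool) -> int:
    """Return the NoSAS score (0 to 17)."""
    points = 0
    if neck_cm > 40.0:
        points += 4            # neck circumference > 40 cm
    if bmi >= 30.0:
        points += 5            # BMI >= 30 kg/m2
    elif bmi >= 25.0:
        points += 3            # assumed band: 25 <= BMI < 30 kg/m2
    if habitual_snoring:
        points += 2            # habitual snoring
    if age_years > 55:
        points += 4            # age > 55 years
    if male:
        points += 2            # male gender
    return points


def nosas_high_risk(score: int) -> bool:
    """A score of 8 or more classifies the individual as high risk for OSA."""
    return score >= 8
```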

ESS is an 8-item questionnaire that assesses the subjective likelihood of falling asleep in various settings. Each item is scored from zero (would never doze) to three (high chance of dozing). Score ≥ 11 (from 0 to 24 points) was considered indicative of excessive daytime sleepiness [50].
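For illustration, the ESS total and the cut-off used here can be computed as sketched below. The function names are hypothetical, and the input checks are an added safeguard rather than part of the questionnaire itself.

```python
from typing import Sequence


def ess_total(item_scores: Sequence[int]) -> int:
    """Return the ESS total (0 to 24) from eight items, each scored 0 to 3."""
    if len(item_scores) != 8:
        raise ValueError("ESS requires exactly 8 item scores")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each ESS item must be scored 0, 1, 2, or 3")
    return sum(item_scores)


def excessive_daytime_sleepiness(total: int) -> bool:
    """A total of 11 or more is taken as excessive daytime sleepiness."""
    return total >= 11
```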

Polysomnography

All PSGs were conducted at a single Brazilian sleep center: SleepLab – Laboratório de Estudo dos Distúrbios do Sono, Rio de Janeiro. All patients underwent an attended, full PSG (EMBLA® S7000, Embla Systems, Inc., Broomfield, CO, USA), consisting of continuous monitoring of electroencephalogram, electrooculogram, electromyogram, electrocardiogram, airflow, thoracic and abdominal impedance belts for respiratory effort, pulse oximetry, snoring microphone, and body position sensors.

Data from PSG were manually scored, following previous guidelines [51], by two board-certified sleep physicians who were blinded to all the scores obtained with the screening instruments. Obstructive apneas were defined as a drop of at least 90% in airflow from baseline with persistent respiratory effort, lasting at least 10 s [51]. Hypopneas were defined as a reduction in the respiratory signal ≥ 30% lasting ≥ 10 s that was associated with more than 3% oxygen desaturation or arousal [51]. OSA severity was classified according to the apnea/hypopnea index (AHI) thresholds: ≥ 5.0/h as any OSA (OSA≥ 5), ≥ 15.0/h as moderate/severe OSA (OSA≥ 15), and ≥ 30.0/h as severe OSA (OSA≥ 30) [51].
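The AHI thresholds above translate directly into code. The sketch below is illustrative only; the function names and dictionary keys are hypothetical, and the labels "mild" and "moderate" for the 5.0–14.9/h and 15.0–29.9/h bands follow the severity bands given in the caption of Fig. 2.

```python
def osa_severity(ahi: float) -> str:
    """Map an apnea/hypopnea index (events/h) to a severity label."""
    if ahi < 0:
        raise ValueError("AHI cannot be negative")
    if ahi >= 30.0:
        return "severe"
    if ahi >= 15.0:
        return "moderate"
    if ahi >= 5.0:
        return "mild"
    return "none"


def osa_outcomes(ahi: float) -> dict:
    """Return the three cumulative outcomes analysed in the study."""
    return {
        "OSA>=5": ahi >= 5.0,    # any OSA
        "OSA>=15": ahi >= 15.0,  # moderate/severe OSA
        "OSA>=30": ahi >= 30.0,  # severe OSA
    }
```

Note that the three outcomes are nested: a patient with an AHI of 32.0/h is positive for all of them, whereas the severity label assigns that patient to a single band.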

Statistical Analysis

Data analysis was performed using SPSS for Windows (version 21.0; Chicago, IL, USA). Results were summarized as the median and interquartile range (IQR) for quantitative variables and as number and percentage for qualitative variables. Comparisons between groups were performed using the chi-squared test for dichotomous variables, and Student’s t test and univariate analysis of variance (ANOVA) for continuous variables. Correlation was evaluated by the Spearman correlation coefficient (r). Discrimination was estimated from the area under the curve (AUC) obtained from receiver operating characteristic (ROC) curves. An AUC > 0.7 was considered clinically significant [52], and AUCs were compared using a previously described algorithm [53]. Sensitivity, specificity, positive predictive value, and negative predictive value were calculated using contingency tables. All two-tailed tests were performed at a 5% significance level.
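To make the discrimination measure concrete, the sketch below computes an empirical AUC for an ordinal screening score by comparing every patient with OSA against every patient without OSA, counting ties as one half. This is a generic illustration and not the SPSS procedure used in the study, and it does not reproduce the AUC comparison algorithm of [53]. The function name and the example data are hypothetical.

```python
from typing import Sequence


def empirical_auc(scores: Sequence[float], has_osa: Sequence[bool]) -> float:
    """Probability that a random OSA patient outscores a random non-OSA patient.

    Ties between a positive and a negative patient count as 0.5.
    """
    positives = [s for s, y in zip(scores, has_osa) if y]
    negatives = [s for s, y in zip(scores, has_osa) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one patient with and one without OSA")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (not study data): four questionnaire scores and OSA status.
print(empirical_auc([1, 2, 2, 3], [False, True, False, True]))  # 0.875
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is why a threshold such as AUC > 0.7 can serve as a minimum for clinical usefulness.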

Results

A flowchart illustrating the study approach is shown in Fig. 1. Baseline patient characteristics are listed in Table 1. Overall, the median age was 37.0 years (IQR: 31.0–45.0) and 70.8% were females. As would be anticipated from the study design, we found a high prevalence of OSA≥ 5 (82.6%), OSA≥ 15 (60.0%), and OSA≥ 30 (38.8%). Prevalence of OSA≥ 5, OSA≥ 15, and OSA≥ 30 was statistically higher in males than in females: 97.9% versus 76.2%, 88.7% versus 48.1%, and 73.9% versus 24.3%, respectively (all p values < 0.001).

Fig. 1 Flowchart of the study. Diagnosis of obstructive sleep apnea (OSA) obtained by full in-lab polysomnography (PSG) was based on an apnea/hypopnea index ≥ 5.0/h as any OSA, ≥ 15.0/h as moderate/severe OSA, and ≥ 30.0/h as severe OSA

Table 1 Summary of patient characteristics (n = 814)

As can be seen in Table 1, the GOAL scores were grouped as follows: 1 point (35.4%), 2 points (36.1%), 3 points (25.7%), and 4 points (2.8%). Figure 2 shows the OSA frequencies according to the GOAL questionnaire scores. The frequency of subjects classified as high risk for OSA (GOAL ≥ 2 points) was 64.6%. Figure 3 illustrates the distribution of AHI, oxygen desaturation index (ODI) at 3%, average oxygen saturation (SpO2), and nadir SpO2 according to the GOAL scores. The AHI, ODI at 3%, average SpO2, and nadir SpO2 values were statistically different between the categories of the GOAL scores, confirming previous observations that increasing GOAL scores are associated with increases in the severity of OSA-related respiratory parameters obtained from PSG. In addition, the GOAL questionnaire showed statistically significant bivariate correlations with the following respiratory variables: AHI (r = 0.570), average SpO2 (r = − 0.431), nadir SpO2 (r = − 0.504), and ODI at 3% (r = 0.547); all p values < 0.001.

Fig. 2 Frequency of obstructive sleep apnea (OSA) based on GOAL questionnaire scores (from 1 to 4 points) in 814 bariatric surgery patients. Diagnosis of OSA was based on an apnea/hypopnea index (AHI) ≥ 5.0/h, with severity classified as follows: mild OSA (AHI: 5.0–14.9/h), moderate OSA (AHI: 15.0–29.9/h), and severe OSA (AHI ≥ 30.0/h)

Fig. 3 Boxplot diagram showing the distribution of apnea/hypopnea index (AHI), oxygen desaturation index (ODI) at 3%, average oxygen saturation (SpO2), and nadir SpO2 according to the GOAL questionnaire scores in 814 bariatric patients. The bottom and top of the box represent the lower (25%) and upper (75%) quartiles, respectively, while the horizontal bars indicate the median (50%). The upper and lower bounds of the error bars denote the range. Circles and asterisks represent outliers and extreme outliers, respectively. The non-parametric Kruskal-Wallis test showed statistically significant differences in the distribution of all respiratory parameters (AHI, ODI, average SpO2, and nadir SpO2) across the GOAL scores: all of them with p < 0.001

Predicting OSA

Table 2 shows GOAL predictive performance according to its scores. Using the score ≥ 2 points to classify patients at high risk for OSA of any severity, GOAL questionnaire revealed sensitivities ranging from 73.7 to 89.2% and specificities ranging from 78.2 to 51.0% based on the OSA severity cut-off AHI values. As expected, as GOAL scores increased, there was a reduction in sensitivity with an increase in specificity. The GOAL questionnaire performed similarly among genders for screening of OSA≥ 5 (p = 0.834), OSA≥ 15 (p = 0.194), and OSA≥ 30 (p = 0.270).
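The trade-off between sensitivity and specificity as the cut-off increases can be illustrated with a small contingency-table sketch. The function names and the eight-patient data set below are hypothetical and are not study data; they serve only to show how the four measures are derived from a 2 × 2 table at a given cut-off.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(scores: Sequence[int], has_osa: Sequence[bool],
                     cutoff: int) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN) when score >= cutoff is a positive screen."""
    tp = fp = fn = tn = 0
    for score, disease in zip(scores, has_osa):
        screen_positive = score >= cutoff
        if screen_positive and disease:
            tp += 1
        elif screen_positive and not disease:
            fp += 1
        elif not screen_positive and disease:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV from a 2 x 2 table."""
    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")  # undefined if a margin is empty

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy data (not study data): questionnaire scores and OSA status of 8 patients.
toy_scores = [1, 1, 2, 2, 3, 3, 4, 1]
toy_osa = [False, False, False, True, True, True, True, True]

for cutoff in (2, 3):
    metrics = screening_metrics(*confusion_counts(toy_scores, toy_osa, cutoff))
    print(cutoff, round(metrics["sensitivity"], 2), round(metrics["specificity"], 2))
```

In this toy example, moving the cut-off from 2 to 3 lowers sensitivity from 0.80 to 0.60 and raises specificity from 0.67 to 1.00, which mirrors the direction of the trade-off described above.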

Table 2 GOAL questionnaire predictive performance (n = 814)

Pairwise Comparison of ROC Curves

Table 3 summarizes the discrimination achieved by the GOAL questionnaire, in addition to the other 5 screening instruments: STOP, STOP-Bang, No-Apnea, NoSAS, and ESS. For predicting OSA≥ 5, OSA≥ 15, and OSA≥ 30, GOAL exhibited similar and non-inferior discriminative properties compared with STOP-Bang, No-Apnea, and NoSAS (all p values > 0.05). Moreover, GOAL performed significantly better than STOP and ESS at all OSA severity levels (all p values < 0.001); Table 3.

Table 3 Discrimination of six screening tools and pairwise comparison of ROC curves (n = 814)

Discussion

The main finding of our study was that in a sample of consecutive adult patients undergoing evaluation before BS, the GOAL questionnaire emerged as a suitable OSA screening instrument. Indeed, despite its simplicity, the instrument showed adequate predictive performance and discriminatory ability. Its discriminatory properties were similar to those of other widely used instruments, namely No-Apnea, STOP-Bang, and NoSAS. Moreover, its ability to discriminate patients with or without OSA was statistically better than that of the STOP and ESS questionnaires at all OSA severity levels.

The BS population referred for sleep evaluation is a cohort with a high pretest probability for OSA. Therefore, the primary use of a screening instrument lies in the possibility of offering portable sleep tests for diagnosis, a strategy capable of reducing long waiting lines for in-lab PSG [54, 55]. Because the GOAL questionnaire shows a risk gradient (increasing scores are associated with a greater likelihood of having OSA), it can be used to identify individuals with more severe forms of OSA who can benefit from positive airway pressure treatment, thereby reducing perioperative and postoperative complications [31,32,33,34]. Such a risk gradient has previously been reported for the No-Apnea [47] and STOP-Bang instruments [56,57,58].

Although several screening instruments have already been widely validated in the literature, there are surprisingly few studies specifically focused on BS patients [41,42,43,44], possibly because a large proportion of preoperative evaluation programs compulsorily refer all BS patients for testing, regardless of the presence or absence of symptoms suggestive of OSA [26,27,28]. Our findings suggest that most of these subjects could be properly evaluated at home rather than in a sleep laboratory, enabling cost savings.

Performance of a given questionnaire can vary widely and depends on the following factors: the type of sleep test used for OSA diagnosis, the characteristics of the enrolled population, and the AHI thresholds employed to define OSA [37,38,39]. For a disease such as OSA, it is possibly more important that a screening test has high sensitivity rather than high specificity, particularly in a population with a high pretest probability such as BS patients [37,38,39]. As in the original GOAL study [40], this tool displayed high sensitivity and moderate specificity for predicting OSA, mainly in its most severe forms, in BS patients, who are a priori a group with a high frequency of OSA.

We should emphasize that the primary aim of the screening approach is not to replace the standard diagnostic method, but rather to reliably assign those who emerge as high-risk patients to home-based portable recording methods. Implementation of this strategy should enable the referral of the vast majority of individuals to home-based sleep testing, thereby reducing costs and accelerating the diagnosis of OSA, particularly when waiting times in sleep laboratories are lengthy. Of note, no clinical questionnaire to date has been sufficiently accurate to substitute for an objective sleep test to either confirm or rule out OSA. In addition, the sensitivity and specificity of the method are generally inversely related, which translates into a reduction in specificity, especially in the most severe forms of OSA. However, other possibilities may emerge upon more extensive validation of the GOAL instrument. For example, we would propose that for BS patients with low GOAL scores, home overnight oximetry rather than PSG could suffice. Under this approach, home sleep testing would be used for those with high GOAL scores, so as to delineate the severity of the OSA that is likely present in these patients, and overnight oximetry would simply serve to confirm the a priori low probability of a positive test among patients with low GOAL scores.

Although this is the first study evaluating the GOAL questionnaire performance in BS patients, our findings are comparable with those of other instruments when applied to BS cohorts. In a study that included 606 BS patients, No-Apnea had similar discrimination to STOP-Bang and NoSAS for predicting OSA≥ 5 (p = 0.979 and p = 0.358, respectively), OSA≥ 15 (p = 0.158 and p = 0.399, respectively), and OSA≥ 30 (p = 0.388 and p = 0.903, respectively) [43]. In a study with 414 BS patients, the Berlin questionnaire was a reasonably effective tool to predict OSA [42]. Similar to our current findings, ESS also had no value in predicting OSA in 99 consecutive severely obese subjects [45]. In a retrospective study of 266 BS patients, neither the STOP-Bang nor the Berlin questionnaire was an adequate OSA screening tool [46]. Surprisingly, the performance of these two instruments was substantially inferior to that previously reported in the literature [46]. In another study with 251 BS patients, STOP-Bang and NoSAS performed markedly better than ESS and the Fatigue Severity Scale [41]. Except for ESS, all sleep screening instruments showed better OSA prediction in females than in males, indicating gender-related performance differences that remain unexplained [41]. Conversely, we did not find any differences between genders regarding GOAL performance.

Our study did not consider newer non-invasive techniques, such as peripheral arterial tonometry (PAT), for two main reasons. First, a recent article that included five hundred patients with suspected OSA, who were simultaneously tested with in-lab PSG and PAT, reported a significant percentage of patients with clinically relevant classification errors when relying exclusively on PAT [59]. Second, a meta-analysis of fourteen different studies in the bariatric population did not include any study using PAT as the diagnostic approach for OSA [60], such that more extensive PAT implementation will require further validation.

Our study has some limitations that deserve to be highlighted. All participants were referred to a single sleep laboratory, which may compromise the generalization of our findings. In addition, individuals from other ethnic groups, who may have different demographic and anthropometric characteristics, were not specifically evaluated.

Conclusions

The GOAL questionnaire, a concise and practical instrument, showed adequate performance and discrimination for OSA screening, regardless of the severity level. Its discrimination was similar to that of other instruments such as No-Apnea, STOP-Bang, and NoSAS. Neither STOP nor ESS performed adequately when used as an OSA screening tool.