Introduction

The worldwide prevalence of obesity is rising to alarmingly high levels, with the condition now present in a third of the world's population [1]. To combat this chronic disease, the number of metabolic and bariatric surgery (MBS) procedures performed worldwide rose from 100,092 per year in 2014 to 394,431 in 2019 [2]. Although MBS is considered the most efficacious and durable approach for weight loss and maintenance, less than 1% of all patients who qualify for MBS undergo surgery in the USA [3].

In 1988, biliopancreatic diversion with duodenal switch (BPD-DS) was first introduced by Hess et al. as an alternative to the Scopinaro BPD, the sleeve gastrectomy, and the gastric bypass [4]. Previous literature has demonstrated BPD-DS to be the most effective bariatric technique for weight loss, with low weight regain rates [5,6,7,8]. Furthermore, the American Society for Metabolic and Bariatric Surgery (ASMBS) recently reported that the number of BPD-DS operations in the USA almost doubled between 2019 and 2020 [3]. Nonetheless, BPD-DS accounts for only 1.1% of all current MBS procedures because of lack of training, technical complexity, long operation time, and high postoperative complication rates [2, 9, 10]. This hesitancy in adoption was evaluated by Clapp et al., who observed that the main reason for reluctance to perform BPD-DS among bariatric surgeons is concern about long-term complications, which are linked with a poor long-term quality of life [10]. Despite the “skepticism” towards BPD-DS, its complication rate has been shown to be comparable to that of Roux-en-Y gastric bypass [11]. Moreover, no existing literature review has specifically evaluated the impact of BPD-DS on quality of life.

The aim of any treatment is to address the disease and improve quality of life [12], and the primary motivation to undergo MBS for some patients is the desire to improve health-related quality of life (HrQoL) [13]. Around 5–12% of patients seeking bariatric surgeries are primarily driven by quality-of-life improvement without regard to health concerns [13, 14]. This is especially important moving forward as patient-centered outcomes need to be considered with equal weight to medical outcomes.

HrQoL is a subjective and multidimensional assessment of the physical, mental, and social functioning of patients, as determined by the patients themselves [15]. To understand the impact of BPD-DS on patients, HrQoL needs to be evaluated quantitatively using established instruments. HrQoL is often evaluated across specialties using generic questionnaires such as the 36-Item Short-Form Health Survey (SF-36) [16]. The SF-36 model includes eight first-order factors (physical functioning, role physical, bodily pain, general health, vitality, social function, role emotion, and mental health) as well as two second-order factors (physical and mental component scores) to assess overall health status and HrQoL [16]. Conversely, MBS-specific HrQoL instruments are used to evaluate complications and postoperative outcomes of MBS, which are strongly associated with the HrQoL of patients [17]. These include gastrointestinal-related instruments (gastroesophageal reflux disease–health-related quality of life score and gastrointestinal quality of life index), bariatric-related instruments (bariatric analysis and reporting outcome system, bariatric quality of life index, and Moorehead-Ardelt quality of life questionnaire II), and obesity-related instruments (the impact of weight on quality of life questionnaire-Lite, Laval questionnaire, and obesity-related problems scale). Lastly, HrQoL outcomes are commonly interpreted against reported minimal clinically important differences (MCID), where an MCID is defined as a change that patients would regard as meaningful enough to consider repeating the intervention if the choice were theirs to make again [18].

By comparing the HrQoL outcome of BPD-DS with reported MCID, we can identify the clinical relevance of BPD-DS and the role of this procedure in the scope of MBS and obesity. Despite demonstrated superior weight loss outcomes, no consensus has been reached with regard to the impact on HrQoL after BPD-DS. To the best of our knowledge, no meta-analysis of existing studies has investigated the impact on HrQoL after BPD-DS with reference to MCID. Thus, the aim of this study is to assess the impact on mid-term HrQoL after BPD-DS in the management of obesity.

Methods

Data Sources and Search Strategies

A comprehensive search of several databases from each database’s inception to August 16, 2022, was conducted in compliance with the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines [19]. The databases included the Ovid MEDLINE(R) and Epub Ahead of Print, In-Process & Other Non-Indexed Citations, Daily, Ovid EMBASE, Ovid Cochrane Central Register of Controlled Trials, Ovid Cochrane Database of Systematic Reviews, Ovid APA PsycInfo, PubMed, Scopus, and Web of Science. The search strategy was designed and conducted by an experienced librarian with input from the study’s principal investigator. Controlled vocabulary supplemented with keywords was searched for quality of life after a duodenal switch in adult patients. The actual strategy listing all search terms used and how they are combined is available in Supplementary Item 1. The review was registered prospectively with PROSPERO (CRD42022352073).

Eligibility Criteria and Quality Assessment

Eligible studies were cohort studies that met all of the following inclusion criteria: (1) adult participants aged 18 years or older who underwent a primary BPD-DS procedure for the treatment of diagnosed obesity and (2) postoperative outcomes of HrQoL, including bariatric analysis and reporting outcome system (BAROS), bariatric quality of life index (BQL), gastroesophageal reflux disease–health-related quality of life (GERD-HRQL), gastrointestinal quality of life index (GIQLI), Laval questionnaire, Moorehead-Ardelt quality of life questionnaire II (M-A QoL II), the impact of weight on quality of life questionnaire (IWQOL)-Lite, obesity-related problems scale (OP), and 36-Item Short-Form Health Survey (SF-36). Case reports, case series, abstracts, conference abstracts, and articles that were not reported in English were excluded from the study. This meta-analysis also excluded studies with a sample size of fewer than ten patients and studies in which patients had secondary/revisional bariatric surgery. The quality of each study was independently evaluated by two authors (HN and RHM) using the Newcastle–Ottawa scale [20]. Any discrepancies were discussed by the two independent assessors, with disagreements resolved by an adjudicator (OMG). Results of the quality assessment of all included studies are shown in Supplementary Item 2.

Data Collection

The percentage of excess weight loss (%EWL) and the percentage of excess body mass index loss (%EBMIL) were calculated using the following formulas [21]:

%EWL = [(preoperative weight – current weight)/(preoperative weight – ideal weight)] × 100

%EBMIL = [(preoperative BMI – current BMI)/(preoperative BMI – 25)] × 100

where BMI = weight/height² (kg/m²).
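As a minimal illustration of the two formulas above, the following Python sketch computes %EWL and %EBMIL for a single patient. The function names and the example values are hypothetical and are not taken from any included study. The ideal weight is passed in directly because the text does not state how it was derived.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def percent_ewl(preop_weight_kg, current_weight_kg, ideal_weight_kg):
    """Percentage of excess weight loss (%EWL)."""
    excess = preop_weight_kg - ideal_weight_kg
    if excess <= 0:
        raise ValueError("preoperative weight must exceed ideal weight")
    return (preop_weight_kg - current_weight_kg) / excess * 100.0


def percent_ebmil(preop_bmi, current_bmi, reference_bmi=25.0):
    """Percentage of excess BMI loss (%EBMIL), with BMI 25 as the reference."""
    excess = preop_bmi - reference_bmi
    if excess <= 0:
        raise ValueError("preoperative BMI must exceed the reference BMI")
    return (preop_bmi - current_bmi) / excess * 100.0


# Hypothetical patient (illustrative values only).
height_m = 1.70
preop_kg, current_kg = 150.0, 90.0
ideal_kg = 25.0 * height_m ** 2  # assumption: ideal weight taken at BMI 25

ewl = percent_ewl(preop_kg, current_kg, ideal_kg)
ebmil = percent_ebmil(bmi(preop_kg, height_m), bmi(current_kg, height_m))
print(f"%EWL = {ewl:.1f}, %EBMIL = {ebmil:.1f}")
```

When the ideal weight is defined as the weight at a BMI of 25, as assumed in this example, the two measures coincide. They differ whenever the ideal weight comes from another reference, such as a standard weight table.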

HrQoL Instruments and MCID Interpretation

HrQoL was assessed using the following instruments: BAROS [22], BQL [23], GERD-HRQL [24], GIQLI [25], IWQOL-Lite [26], Laval questionnaire [27], M-A QoL II [28], OP [29], and SF-36 [30]. SF-36 was further investigated using the physical component score (PCS) and mental component score (MCS), which are norm-based T-scores (mean 50, SD 10) standardized to the general US population, as well as eight subscales: bodily pain, general health, mental health, physical function, role emotion, role physical, social function, and vitality [30]. Similarly, the Laval questionnaire had the following subscales: activity/mobility, symptoms, personal hygiene/clothing, emotions, social interactions, and sexual life. Likewise, M-A QoL II had the following subscales: self-esteem, physical activity, social contacts, work ability, sexual interest, and relationship with food. MCID for the HrQoL instruments was evaluated based on the reported literature. A three-to-five-point increase in PCS or MCS score is regarded as clinically important [31]. However, given the severely impaired baseline HrQoL in our included studies, a score of 5 was used as the MCID threshold for PCS and MCS [32]. The range of MCID for Laval questionnaire domains is 0.6–2.0 (symptoms: 0.8; activity/mobility: 0.9; personal hygiene/clothing: 1.4; emotions: 1.2; social interactions: 1.2; and sexual life: 2.0) [27]. Additionally, the range of MCID for the IWQOL-Lite total score is 7.7 to 12 points (depending on baseline severity) [33].

Statistical Analysis

The pooled means and proportions were analyzed using the random-effects, generic inverse variance method of DerSimonian and Laird, which weights each study according to its variance [34]. Heterogeneity of effect size estimates across the studies was quantified using the Q statistic (P < 0.10 was considered significant) and I². An I² value of 0–25% indicates insignificant statistical heterogeneity, 26–50% low heterogeneity, and 51–100% high heterogeneity [35]. Furthermore, a leave-one-out sensitivity analysis was conducted to assess each study’s influence on the pooled estimate by omitting one study at a time and recalculating the combined estimates for the remaining studies. Publication bias was assessed using a funnel plot [36]. If mean and standard deviation (SD) were unavailable, the median was converted to mean using the formulas from the Cochrane Handbook for Systematic Reviews of Interventions [37]. Additionally, if mean and SD were only depicted in figures, they were digitized from the figures using WebPlotDigitizer version 4.4 (https://automeris.io/WebPlotDigitizer/). Lastly, if SD was not available or extractable, the reported mean was omitted from the calculation. Data analysis was performed using OpenMeta[Analyst] software (CEBM, Brown University, Providence, RI, USA).
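The pooling and sensitivity steps described above can be sketched in Python as follows. This is an illustrative implementation of the standard DerSimonian and Laird estimator and of a leave-one-out loop, not the code used by the analysis software. The function names and the input values are hypothetical.

```python
import math


def dersimonian_laird(estimates, std_errors, z=1.96):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator."""
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per study estimate")
    k = len(estimates)
    # Fixed-effect inverse variance weights and pooled estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    # Cochran's Q with k - 1 degrees of freedom.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1
    # Method-of-moments between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if k > 1 and c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum_w_star
    se_pooled = math.sqrt(1.0 / sum_w_star)
    # I^2: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if k > 1 and q > 0 else 0.0
    return {
        "pooled": pooled,
        "ci": (pooled - z * se_pooled, pooled + z * se_pooled),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }


def leave_one_out(estimates, std_errors):
    """Pooled estimate after omitting each study in turn."""
    pooled_without = []
    for i in range(len(estimates)):
        rest_y = estimates[:i] + estimates[i + 1:]
        rest_se = std_errors[:i] + std_errors[i + 1:]
        pooled_without.append(dersimonian_laird(rest_y, rest_se)["pooled"])
    return pooled_without


# Hypothetical study means and standard errors (illustrative only).
study_means = [46.0, 50.5, 44.2, 49.1]
study_ses = [1.2, 0.9, 2.0, 1.5]
result = dersimonian_laird(study_means, study_ses)
print(result)
print(leave_one_out(study_means, study_ses))
```

A large shift in the pooled estimate when one study is omitted would indicate that the study has a strong influence on the result.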

Results

Study Selection and Patient Characteristics

The initial literature search of the electronic databases yielded 223 studies. After title and abstract screening, 52 articles were retained for full-text review. Those articles were then assessed for eligibility using specified inclusion and exclusion criteria. Twelve unique studies involving 937 patients that met the eligibility criteria were included. Of the twelve studies, six were retrospective [38,39,40,41,42,43], and the other six were prospective [44,45,46,47,48,49]. Eleven of the included studies were single institutional studies, while one study was performed in a multicenter setting. The mean age ranged from 34.5 to 49.8 years, and 565 patients (63.3%) were women. A PRISMA flowchart of the study selection process is depicted in Supplementary Item 3. The baseline characteristics of the included studies are comprehensively described in Table 1.

Table 1 Baseline characteristics of included studies

Risk of Bias

Results of the quality assessment of all included studies are shown in Supplementary Item 2. Two studies [40, 43] were determined to be of poor quality with a critical selection bias due to significant loss to follow-up and an insufficient follow-up period. The remaining prospective and retrospective cohort studies were judged to be of fair quality.

Clinical Characteristics and Weight Outcomes

Among the twelve studies, a total of 937 patients underwent primary BPD-DS. The mean length of follow-up ranged from 2.0 to 12.3 years, and the reported HrQoL questionnaire completion rates at follow-up ranged from 55 to 100%. Preoperative BMI was reported in ten studies [38, 40,41,42,43,44,45,46,47,48], and the pooled estimate of mean preoperative BMI was 53.8 kg/m² (95% CI: 50.9, 56.8, I² = 96.5%). BMI at follow-up and absolute BMI loss were reported in seven studies [38, 42, 44,45,46,47,48] and four studies [38, 42, 45, 46], respectively. The pooled BMI at the mid-term follow-up period was 31.2 kg/m² (95% CI: 27.7, 34.8, I² = 95.8%), and the pooled absolute BMI loss was 22.6 kg/m² (95% CI: 21.7, 23.5, I² = 26.6%). Likewise, %EBMIL and %EWL at the mid-term follow-up period were estimated as 68.9% (95% CI: 63.5, 74.3, I² = 11.7%) and 86.3% (95% CI: 76.6, 96.0, I² = 97.8%), respectively. Clinical characteristics and weight outcomes are comprehensively described in Table 1 and Fig. 1.

Fig. 1 Pooled estimate of baseline and postoperative weight outcomes: A preoperative BMI; B BMI at follow-up; C absolute BMI loss; D %EBMIL; and E %EWL

SF-36 Outcomes

Mid-term HrQoL of patients who underwent BPD-DS was evaluated with the SF-36 using the physical and mental component scores. Baseline PCS (n = 502) and MCS (n = 502) were reported in three studies [40, 44, 47], and the pooled estimated means of PCS and MCS were 34.7 (95% CI: 31.5, 37.9, I² = 73.8%) and 43.2 (95% CI: 37.7, 48.6, I² = 87.7%), respectively. Similarly, five studies [40,41,42, 44, 47] reported PCS (n = 525) and MCS (n = 525) at the mid-term follow-up period, and the pooled estimated means of PCS and MCS were 48.1 (95% CI: 44.9, 51.3, I² = 81.4%) and 46.8 (95% CI: 45.6, 47.9, I² = 0%), respectively. The mean difference between baseline and the mid-term follow-up period was 13.4 for PCS and 3.6 for MCS, and only the mean difference in PCS exceeded the reported MCID. SF-36 component scores are comprehensively depicted in Table 2 and Fig. 2.
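The comparison of the component score changes with the MCID can be reproduced with simple arithmetic. The short Python sketch below uses the pooled means reported above and the MCID threshold of 5 points adopted in the Methods. The variable and function names are illustrative. The calculation is a plain subtraction of pooled means drawn from different sets of studies, not a paired analysis.

```python
# Pooled means reported in the text: (baseline, mid-term follow-up).
POOLED_MEANS = {"PCS": (34.7, 48.1), "MCS": (43.2, 46.8)}
MCID_THRESHOLD = 5.0  # MCID adopted for PCS and MCS in the Methods


def change_vs_mcid(baseline, follow_up, mcid=MCID_THRESHOLD):
    """Return the mean difference and whether it reaches the MCID."""
    difference = round(follow_up - baseline, 1)
    return difference, difference >= mcid


results = {name: change_vs_mcid(*means) for name, means in POOLED_MEANS.items()}
for name, (difference, reached) in results.items():
    print(f"{name}: change = {difference}, reaches MCID = {reached}")
```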

Table 2 Baseline and follow-up outcomes of SF-36

Fig. 2 Pooled estimate of PCS and MCS: A baseline PCS; B PCS at follow-up; C baseline MCS; and D MCS at follow-up

Eight subscales of SF-36 were analyzed to assess the mid-term HrQoL of patients who received BPD-DS. Three studies [44, 45, 47] reported baseline SF-36 subscales (n = 158), and the pooled estimates were as follows: physical function (mean = 42.9, 95% CI: 31.5, 54.3, I² = 88.5%), role physical (mean = 34.9, 95% CI: 22.2, 47.5, I² = 72.8%), bodily pain (mean = 51.0, 95% CI: 35.6, 66.3, I² = 91.5%), general health (mean = 43.5, 95% CI: 38.8, 48.1, I² = 34.0%), vitality (mean = 39.0, 95% CI: 31.1, 46.8, I² = 72.4%), social function (mean = 59.6, 95% CI: 49.2, 70.0, I² = 82.3%), role emotion (mean = 58.1, 95% CI: 50.7, 65.4, I² = 41.4%), and mental health (mean = 55.9, 95% CI: 46.7, 65.1, I² = 76.7%). Four studies [41, 42, 44, 47] reported the SF-36 subscales (n = 92) at follow-up. The pooled estimates at the mid-term follow-up period were as follows: physical function (mean = 80.0, 95% CI: 68.4, 91.6, I² = 88.0%), role physical (mean = 68.9, 95% CI: 52.9, 84.9, I² = 77.2%), bodily pain (mean = 65.9, 95% CI: 54.3, 77.6, I² = 77.6%), general health (mean = 63.8, 95% CI: 54.0, 73.6, I² = 74.0%), vitality (mean = 51.8, 95% CI: 42.2, 61.3, I² = 71.2%), social function (mean = 79.1, 95% CI: 70.2, 88.0, I² = 73.8%), role emotion (mean = 72.7, 95% CI: 67.1, 78.2, I² = 0%), and mental health (mean = 73.5, 95% CI: 68.7, 78.3, I² = 0%). The mean difference between baseline and the mid-term follow-up period was then evaluated for each SF-36 subscale. Physical function showed the largest difference (37.1) and vitality the smallest (12.8); the remaining mean differences were as follows: role physical (34.0), bodily pain (14.9), general health (20.3), social function (19.5), role emotion (14.6), and mental health (17.6). In addition to the four pooled studies, Duarte et al. [39] and Uribarri-Gonzalez et al. [48] reported only mean values for the SF-36 subscales. The follow-up values for Duarte et al. (n = 17) and Uribarri-Gonzalez et al. (n = 36), respectively, were as follows: physical function (91.8, 83.4), role physical (94.1, 83.4), bodily pain (86.4, 69.8), general health (93.2, 61.7), vitality (86.8, 63.0), social function (92.7, 84.0), role emotion (94.1, 81.7), and mental health (94.1, 71.6). Details of baseline and follow-up outcomes of SF-36 subscales are comprehensively described in Table 2 as well as Figs. 3 and 4.
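The subscale differences quoted above follow directly from the pooled baseline and follow-up means. The Python sketch below recomputes them as a consistency check. The dictionary and function names are illustrative, and the calculation is a simple subtraction of pooled means rather than a paired comparison.

```python
# Pooled SF-36 subscale means reported in the text.
BASELINE = {
    "physical function": 42.9, "role physical": 34.9, "bodily pain": 51.0,
    "general health": 43.5, "vitality": 39.0, "social function": 59.6,
    "role emotion": 58.1, "mental health": 55.9,
}
FOLLOW_UP = {
    "physical function": 80.0, "role physical": 68.9, "bodily pain": 65.9,
    "general health": 63.8, "vitality": 51.8, "social function": 79.1,
    "role emotion": 72.7, "mental health": 73.5,
}


def subscale_changes(baseline, follow_up):
    """Follow-up minus baseline for each subscale, rounded to one decimal."""
    return {name: round(follow_up[name] - baseline[name], 1) for name in baseline}


changes = subscale_changes(BASELINE, FOLLOW_UP)
largest = max(changes, key=changes.get)
smallest = min(changes, key=changes.get)
for name, change in changes.items():
    print(f"{name}: {change}")
print(f"largest: {largest}, smallest: {smallest}")
```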

Fig. 3 Pooled estimate of baseline SF-36 subscales: A physical function; B role physical; C bodily pain; D general health; E vitality; F social function; G role emotional; and H mental health

Fig. 4 Pooled estimate of SF-36 subscales at follow-up: A physical function; B role physical; C bodily pain; D general health; E vitality; F social function; G role emotional; and H mental health

Outcomes of GI-/Bariatric-/Obesity-Related HrQoL Instruments

Mid-term HrQoL of patients who underwent BPD-DS was evaluated with GI-, bariatric-, and obesity-related QoL instruments. Laval questionnaire subscales (n = 100) were analyzed in two studies [45, 46]. The mean difference between baseline and the mid-term follow-up period of each subscale was greater than the reported MCID: activity/mobility (2.90), symptoms (1.80), personal hygiene/clothing (3.19), emotions (2.02), social interactions (2.72), and sexual life (2.52). Likewise, the IWQOL-Lite score (n = 18) was reported in one study [47], and the mean difference (48.7) between baseline and the mid-term follow-up period was greater than the reported MCID. Similarly, the bariatric quality of life index (n = 23) was reported in one study [49], and the mean difference was 32.1. Two studies [40, 44] reported the obesity-related problems scale (n = 469); the mean score at follow-up was 28.2, with a mean difference of 38.4. The BAROS score (n = 17) was reported in one study [39], and the mean BAROS score at the mid-term follow-up was 7.39. Additionally, Magee et al. [43] qualitatively reported BAROS outcomes for 67% of their first 100 patients (n = 67). An improvement was observed in 98% of respondents, with 85% reporting “very good” or “excellent” outcomes based on the BAROS score. GERD-HRQL (n = 76) and the gastrointestinal quality of life index (n = 36) were evaluated by Badaoui et al. [38] and Uribarri-Gonzalez et al. [48], respectively. The mean scores at the mid-term follow-up period were 8.7 for GERD-HRQL and 101.0 for the gastrointestinal quality of life index. Details of baseline and follow-up outcomes of GI-, bariatric-, and obesity-related instruments are summarized in Table 3.
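The statement that every Laval subscale and the IWQOL-Lite total score exceeded the MCID can be checked against the thresholds listed in the Methods. The Python sketch below pairs each reported mean difference with its MCID. The names are illustrative, and using the upper bound of the IWQOL-Lite MCID range (12 points) is a conservative assumption made here.

```python
# (reported mean difference, MCID from the Methods) for each measure.
REPORTED = {
    "Laval activity/mobility": (2.90, 0.9),
    "Laval symptoms": (1.80, 0.8),
    "Laval personal hygiene/clothing": (3.19, 1.4),
    "Laval emotions": (2.02, 1.2),
    "Laval social interactions": (2.72, 1.2),
    "Laval sexual life": (2.52, 2.0),
    # Assumption: upper bound of the 7.7 to 12 point range is used.
    "IWQOL-Lite total": (48.7, 12.0),
}


def exceeds_mcid(measures):
    """Map each measure to True when its mean difference exceeds the MCID."""
    return {name: change > mcid for name, (change, mcid) in measures.items()}


flags = exceeds_mcid(REPORTED)
for name, (change, mcid) in REPORTED.items():
    print(f"{name}: change {change} vs MCID {mcid} -> {flags[name]}")
```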

Table 3 Baseline and follow-up outcomes of GI-/bariatric-/obesity-related quality of life instruments

Discussion

Evaluating the effect on HrQoL is essential when selecting a type of MBS for patients with obesity. In this report, our aim was to investigate the impact of BPD-DS on mid-term HrQoL, and a generic instrument as well as GI-, bariatric-, and obesity-related instruments were used to assess the improvement of HrQoL. Our meta-analysis based on twelve available studies suggests that BPD-DS improves the mid-term HrQoL of patients with obesity. Furthermore, MCID was achieved on the PCS of SF-36 as well as on the IWQOL-Lite and Laval questionnaire subscales.

Our study also analyzed PCS and MCS outcomes at baseline and follow-up and evaluated the mean difference in reference to MCID. To our knowledge, this meta-analysis is the first to compare the mean difference in PCS and MCS after BPD-DS with a reported MCID. The mean differences in PCS and MCS were 13.4 and 3.6, respectively, and only PCS reached the reported MCID (5.0). However, when discussing quality of life, it is also important to consider the magnitude of the effect size and not just the MCID. Since PCS and MCS scores are standardized based on the general US population, the scores can be interpreted as three effect sizes: small (2–4.9 points), medium (5–7.9 points), and large (8+ points) [44]. Our study demonstrated a large effect size with a clinically meaningful difference for PCS, while MCS had only a small effect size. This finding was consistent with a previous meta-analysis of the overall trend in bariatric surgery, in which studies demonstrated a greater improvement in physical HrQoL scores than in mental HrQoL scores, as the physical domains benefit more directly from weight reduction [50]. Interestingly, Warkentin et al. hypothesized that a 20% weight reduction appeared to be an appropriate threshold for clinically important HrQoL improvement [32]. Despite the convenience of substituting weight reduction for HrQoL assessment, Biron et al. emphasize the importance of directly measuring HrQoL, since weight loss is not an appropriate surrogate for HrQoL in patients with obesity [45]. Therefore, it is essential to evaluate HrQoL at baseline and at mid-term follow-up to observe the impact that BPD-DS has on patients.
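The effect size bands cited above can be expressed as a small classification rule. The Python sketch below is illustrative only. The function name is hypothetical, the printed bands are treated as contiguous intervals with lower bounds of 2, 5, and 8 points, and the label for changes under 2 points is an assumption because the text does not name that range.

```python
def effect_size_category(mean_difference):
    """Classify a PCS or MCS change using the small/medium/large bands."""
    magnitude = abs(mean_difference)
    if magnitude >= 8.0:
        return "large"
    if magnitude >= 5.0:
        return "medium"
    if magnitude >= 2.0:
        return "small"
    return "below small"  # assumption: this range is not named in the text


# Mean differences reported in this meta-analysis.
for name, difference in {"PCS": 13.4, "MCS": 3.6}.items():
    print(f"{name}: {difference} -> {effect_size_category(difference)}")
```

Applied to the reported differences, the rule returns a large effect for PCS and a small effect for MCS, which matches the interpretation given in the text.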

In addition to the component scores, SF-36 subscales were evaluated to determine the effect of BPD-DS on the mid-term HrQoL of patients. Our study demonstrated the largest improvement in physical function (37.1) and the smallest improvement in vitality (12.8). Changes of 5 to 12.5 points are regarded as clinically meaningful when assessing changes in HrQoL using the SF-36 [50]. Thus, our results demonstrated a clinically meaningful improvement in each subscale of SF-36. Nonetheless, the SF-36 subscale results must be interpreted cautiously, since half of the results were from the first 2 years after surgery, known as the honeymoon period. Andersen et al. postulated that patients in this honeymoon period experience a meaningful amount of weight reduction, as well as a feeling of being in control of their obesity [17]. Consequently, in the short term, patients would likely report higher HrQoL scores. After this period, scores can decline, probably in association with weight regain [51]. Similarly, Biron et al. observed some decline in HrQoL scores 1–2 years after surgery [45]. Conversely, Aasprang et al. found no significant correlations between a 10-year change in HrQoL and weight loss, weight regain, or weight stability [44]. Additional studies are needed to draw a conclusion on the effect of the honeymoon period and to identify factors that are strongly associated with HrQoL.

Similar to the SF-36 questionnaire, GI-, bariatric-, and obesity-related HrQoL instruments were used to assess the impact of BPD-DS on mid-term HrQoL in the management of patients with obesity. The Laval questionnaire is an obesity-related questionnaire, and our study demonstrated that patients who underwent BPD-DS achieved MCID in all six domains. Specifically, the largest improvement was observed in personal hygiene/clothing (3.19), and the smallest improvement was observed in symptoms (1.80). Furthermore, these overall findings in the Laval questionnaire were consistent with previous systematic reviews [17, 50]. Biron et al. theorized that the ratio of the magnitude of improvement to its standard deviation, which represents an estimate of the signal-to-noise ratio, is generally higher with the Laval questionnaire than with the SF-36 [45]. Disease-specific questionnaires like the Laval questionnaire are more appropriate and sensitive to change than generic questionnaires in follow-up and clinical studies. Nonetheless, further studies are needed to improve the interpretability of the Laval questionnaire by establishing which score changes correspond to small, moderate, or large improvements [27].

One study [47] in our report used the IWQOL-Lite total score to assess obesity-related HrQoL. This demonstrated a clinically meaningful improvement in scoring (48.7) compared with the reported MCID of 7 to 12 [52]. Furthermore, the total postoperative IWQOL-Lite score was equivalent to the community norms, which might indicate that patients who underwent BPD-DS are functioning close to the general population [47]. The result of this meta-analysis demonstrated a clinically meaningful improvement in the IWQOL-Lite total score, which was consistent with the overall trend in bariatric surgery [17].

This meta-analysis has several limitations that should be addressed. The main limitation is the design of the included studies, none of which was a long-term randomized controlled trial (RCT). Additionally, only one of the twelve included studies was performed in a multicenter setting. The inclusion of additional multicenter studies would improve the sample size and generalizability of this meta-analysis. As previously mentioned, the follow-up periods of the included studies may not be long enough to negate the effect of the honeymoon period on HrQoL. The loss to follow-up of participants could have introduced a large attrition bias and resulted in biased estimates of outcomes. Moreover, the reasons for loss to follow-up were not adequately addressed in the included studies. Similarly, outcomes of HrQoL after BPD-DS could be influenced by reoperation, malnutrition, complication, death, or loss to follow-up, which might have skewed our results. Furthermore, patients followed for more than 10 years could develop severe malnutrition secondary to hypoabsorption, requiring reoperation and lengthening of the common channel. Therefore, additional studies are necessary to evaluate the long-term effect of BPD-DS on HrQoL. Large heterogeneity in follow-up, HrQoL instruments, and baseline BMI was observed across our included studies. Furthermore, additional information is necessary on GI-, bariatric-, and obesity-related QoL instruments, including baseline HrQoL scores and MBS-related MCID, to assess the clinical relevance of HrQoL after BPD-DS. Lastly, there was considerable heterogeneity in the outcomes, such as %EWL, PCS at follow-up, and the majority of SF-36 subscales. Despite these limitations, this meta-analysis demonstrated improvement in mid-term HrQoL after BPD-DS. Future studies should address these limitations while continuing to evaluate the safety and efficacy of BPD-DS.

Conclusions

Our meta-analysis demonstrated an improvement in mid-term HrQoL after BPD-DS. Notably, MCID was achieved on the PCS of SF-36 as well as on the IWQOL-Lite and Laval questionnaire subscales. Additionally, clinically meaningful improvement was observed in the SF-36 subscales, with the largest improvement in physical functioning. Despite the promising trends demonstrated in this meta-analysis, further studies with large sample sizes are needed to evaluate the impact of BPD-DS on the HrQoL of patients with obesity.