Introduction

Bariatric surgery is widely accepted as the most reliable approach for achieving effective, long-term weight loss [1, 2]. Sleeve Gastrectomy (SG) is currently the most popular bariatric procedure (47.0%), followed by Roux-en-Y Gastric Bypass (RYGB) (35.3%), gastric banding (8.4%), and One Anastomosis Gastric Bypass (3.7%) [3].

However, a growing body of evidence suggests that the risk of fracture increases after bariatric surgery [4, 5]. In a large, population-based cohort study in France, the risk of major osteoporotic fracture was significantly higher among patients who had undergone bariatric surgery than matched controls. The risk remained significant with RYGB but not SG [5]. Nevertheless, a recent meta-analysis of 22 studies showed significant decreases in bone mineral density (BMD) even after SG [6].

Areal BMD (aBMD) is a standardized test used to evaluate BMD. Other biochemical and hormonal markers of bone metabolism include serum calcium, Bone-Specific Alkaline Phosphatase (BALP), Parathyroid Hormone (PTH), and 25-OH-vitamin D. The available meta-analyses to date [6, 7] have focused on the changes after RYGB and SG rather than head-to-head comparisons of RYGB and SG at different follow-up endpoints. To the best of our knowledge, there is no systematic review in published scientific literature comparing bone profile outcomes after RYGB and SG.

We, therefore, conducted a systematic review and meta-analysis comparing BMD and other markers of bone metabolism in RYGB and SG patients following the Preferred Reporting Items for Systematic reviews and Meta-analyses (PRISMA) guidelines [8].

Methods

Eligibility criteria

Studies reporting aBMD data on both RYGB and SG patients with at least 1 year of post-operative follow-up were included. Eligible designs were observational cohort studies, case–control studies, and randomized clinical trials (RCTs). Case series, single-arm pre- and post-intervention studies, and narrative reviews were excluded. Only English-language articles were included.

Types of outcomes measures

The primary outcome measure was the aBMD scores at 1 year, 2 years, and > 2 years. The secondary outcome measures were serum concentrations of bone turnover markers, including BALP, Vitamin D, calcium, and PTH.

Search strategy

A specific search strategy was developed using dedicated search terms combined with Boolean operators. The search was run on three academic databases: PubMed, Embase, and the Cochrane Library. The following set of search terms was used: ("bariatric" OR "bariatric surgery" OR "gastric bypass" OR "bypass" OR "gastric sleeve" OR "sleeve gastrectomy") AND (("bone" OR "mineral density" OR "bone mineral density" OR "fracture") OR ("vitamin D" OR "vit D" OR "25(OH)D" OR "PTH" OR "calcium")) (Appendix 1).
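The grouping logic of this search string can be sketched programmatically. The snippet below is a minimal illustration rather than the tooling actually used for the search; the helper name `or_group` and the variable names are hypothetical, and it assumes that every term inside a parenthesized group is joined by OR.

```python
def or_group(terms):
    """Quote each term and join the list with OR inside parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

# Term lists taken from the search strategy described above.
procedure_terms = ["bariatric", "bariatric surgery", "gastric bypass",
                   "bypass", "gastric sleeve", "sleeve gastrectomy"]
bone_terms = ["bone", "mineral density", "bone mineral density", "fracture"]
marker_terms = ["vitamin D", "vit D", "25(OH)D", "PTH", "calcium"]

# Procedure terms AND (bone terms OR marker terms).
query = (f"{or_group(procedure_terms)} AND "
         f"({or_group(bone_terms)} OR {or_group(marker_terms)})")
print(query)
```

Keeping the three term lists separate makes it easy to see that a record must match a procedure term and, in addition, either a bone term or a marker term.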

Study selection and data collection

Two authors independently performed the search. The researchers screened the titles and abstracts of all the records across the three databases. Additionally, the bibliographies of screened articles were searched for additional eligible studies. The full-article versions of eligible articles were accessed, and the studies were further assessed for formal inclusion based on the availability of the primary outcomes. Any disagreement between authors was resolved by discussion.

Risk of bias

The methodological quality of non-randomized studies (cohort-based studies and case–control investigations) was assessed by two reviewers using the Newcastle–Ottawa Scale (NOS) [9]. A star-based scoring system was applied, and the total score was calculated as the number of stars assigned to each study (score range 0–9). Studies with a NOS score of 4–6 were considered medium quality, and those with a score ≥ 7 were considered high quality. The quality of RCTs was assessed using the Cochrane risk-of-bias tool for randomized trials [10] to explore the risks of bias in random sequence generation, blinding, and allocation concealment, as well as attrition bias and other sources of bias.
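The NOS thresholds described above amount to a simple mapping from star count to quality category. The following sketch uses a hypothetical function name, `nos_quality`; the "low" label for scores below 4 is an assumption, since the text only defines the medium and high categories.

```python
def nos_quality(stars: int) -> str:
    """Map a Newcastle-Ottawa Scale star count (0-9) to a quality label."""
    if not 0 <= stars <= 9:
        raise ValueError("NOS scores range from 0 to 9")
    if stars >= 7:
        return "high"
    if stars >= 4:
        return "medium"
    # Assumption: the source does not name the category below 4 stars.
    return "low"

for score in (3, 5, 8):
    print(score, nos_quality(score))
```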

Statistical analysis

Outcome data (aBMD and biomarkers) were analyzed and expressed as standardized mean differences (SMDs) with 95% confidence intervals (CIs). The analysis was carried out using the metacont function of the meta package in R (R i386 version 4.0.0). The I² statistic was used to quantify the statistical heterogeneity between studies, which was deemed significant at I² > 0%. Results were retrieved from a fixed-effects model when there was no statistical heterogeneity; otherwise, results were reported from a random-effects model.
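The pooling steps described above can be illustrated with a short Python sketch. It is not the R code used for the analysis; the function names `hedges_g` and `pool` are hypothetical. It also rests on two assumptions that the text does not state: that the SMD is computed as Hedges' g with the common large-sample variance approximation, and that the between-study variance of the random-effects model is estimated with the DerSimonian-Laird method. All input numbers in the usage lines are made up for illustration.

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference (Hedges' g) and its standard error."""
    df = n1 + n2 - 2
    sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    d = (m1 - m2) / sd_pooled
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return j * d, j * math.sqrt(var_d)

def pool(effects, ses):
    """Inverse-variance pooling with Cochran's Q, I2 and DL tau2."""
    w = [1 / se**2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # DerSimonian-Laird estimate of the between-study variance.
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (se**2 + tau2) for se in ses]
    random = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return {
        "fixed": fixed, "se_fixed": math.sqrt(1 / sum(w)),
        "random": random, "se_random": math.sqrt(1 / sum(w_re)),
        "q": q, "i2": i2, "tau2": tau2,
    }

# Hypothetical per-study summaries: (mean, SD, n) for RYGB, then for SG.
studies = [(0.95, 0.12, 30, 0.97, 0.11, 28),
           (1.02, 0.15, 25, 1.00, 0.14, 27),
           (0.90, 0.10, 40, 0.93, 0.12, 38)]
pairs = [hedges_g(*s) for s in studies]
result = pool([g for g, _ in pairs], [se for _, se in pairs])
print(result)
```

The function returns both pooled estimates so that the caller can apply whichever heterogeneity threshold the protocol specifies when choosing between the fixed-effects and random-effects result.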

Subgroup analysis (mixed-effects models) was performed if the analysis comprised more than five studies (k > 5), and it was based on the study design, place of publication, and the study population. Publication bias was assessed visually using funnel plots and statistically using Egger’s test [11].
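Egger's test can be sketched as an ordinary least-squares regression of the standardized effect (effect divided by its standard error) on precision (the reciprocal of the standard error); an intercept far from zero suggests funnel-plot asymmetry. The function name `egger_regression` is hypothetical, the input values are made up for illustration, and the sketch returns the t statistic and degrees of freedom rather than a p-value, which would require a t-distribution routine.

```python
import math

def egger_regression(effects, ses):
    """Regress effect/SE on 1/SE and report the intercept and its t statistic."""
    n = len(effects)
    if n < 3:
        raise ValueError("at least three studies are needed")
    x = [1 / se for se in ses]                    # precision
    y = [e / se for e, se in zip(effects, ses)]   # standardized effect
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual variance with n - 2 degrees of freedom.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)
    se_intercept = math.sqrt(s2 * (1 / n + mx**2 / sxx))
    t = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {"intercept": intercept, "slope": slope,
            "se_intercept": se_intercept, "t": t, "df": n - 2}

# Hypothetical SMDs and standard errors for six studies.
effects = [0.10, -0.05, 0.20, 0.02, -0.12, 0.15]
ses = [0.20, 0.25, 0.30, 0.18, 0.22, 0.35]
print(egger_regression(effects, ses))
```

The t statistic is compared against a t distribution with n − 2 degrees of freedom to obtain the test's p-value.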

Results

Results of the search process

We found a total of 768 records. After an initial screening, 20 studies were identified for full-text review. Six of these were excluded due to the lack of information on our primary outcome (aBMD measurements) [12,13,14]. One was excluded as it reported the volumetric BMD (g/cm³) rather than the aBMD (g/cm²) [15], and one article was excluded because it only reported the primary outcome up to 6 months [16]. A total of 14 studies were included in the qualitative and quantitative syntheses (Fig. 1).

Fig. 1 A PRISMA flowchart depicting the search process used in the current study

Characteristics of the included studies

The included studies were published between 2010 and 2021 (Table 1). Three studies were conducted in the United States [17,18,19], two in South America [20, 21], two in Asia [22, 23], one in Australia [24], and six in Europe [25,26,27,28,29,30]. Regarding the study design, three studies were RCTs [19, 26, 27], two were case-matched controlled studies [22, 30], and one was a retrospective chart review [18]. The remaining articles employed a prospective cohort design. Of the 14 studies, 13 reported the outcomes at 1 year, five at 2 years, and three at > 2 years.

Table 1 Characteristics of included studies

Surgical interventions were performed on a total of 717 patients (81.87% females), where 363 (50.63%) patients underwent RYGB and 354 (49.37%) patients underwent SG. Based on the NOS scoring system of the methodological quality, four non-randomized studies [18, 25, 29, 30] were judged as having a medium methodological quality, whereas the remaining studies were of a high quality (Table 1). Results of reviewers' judgment regarding the risk of bias for RCTs are depicted in Fig. 2.

Fig. 2 Risk of bias summary: review authors' judgements about each risk of bias item for each included randomized clinical trial

Changes in body weight and BMI

There was no significant difference in BMI values between the RYGB and SG groups at 1 year (SMD = −0.18, 95%CI, −0.65 to 0.29, p = 0.413) or at 2 years (SMD = −0.03, 95%CI, −1.49 to 1.54, p = 0.948, Fig. S1). Pooled results for BMI at > 2 years were not available due to loss to follow-up.

Changes in the parameters of bone mineral density

The pooled analysis of aBMD at 1 year of follow-up showed no significant differences between the two groups at any site, including the total hip (Fig. 3a), lumbar spine (Fig. 3b), femoral neck (Fig. 3c), and the whole body (Fig. 3d), with no significant statistical heterogeneity (I² ranged from 0% to 42%). Likewise, there were no significant differences between groups in any BMD parameter at 2 years or at longer follow-up periods (Table 2).

Fig. 3 Forest plots depicting the outcomes at one year in terms of bone mineral density parameters at the total hip (a), lumbar spine (b), femoral neck (c), and the whole body (d)

Table 2 Differences in bone mineral density parameters between the two surgical groups

On subgroup analysis, eligible comparisons (k > 5) were primarily related to BMD measurements at 1 year. There were no significant differences between the study groups based on the location at which the study was conducted, study design, and the study population (Table 3).

Table 3 Results of subgroup analysis of BMD measurements at different sites at 1 year

Changes in the markers of bone metabolism

At 1 year, BALP concentration was significantly higher in the RYGB group than in the SG group (SMD = 0.52, 95%CI, 0.23 to 0.81, p = 0.0004, Table 4), with no significant statistical heterogeneity between studies (I² = 14%, p = 0.32). However, BALP outcomes were not available for longer follow-up periods. There were no significant differences in other bone turnover markers at 2 years. Notably, the RYGB group had significantly higher PTH concentrations at > 2 years of follow-up than the SG group (SMD = 0.68, 95%CI, 0.31 to 1.05, p = 0.0003), and there was no significant heterogeneity between studies (I² = 0%, p = 0.75, Table 4).
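The relation between a pooled SMD, its 95% CI, and the p-value can be checked with a short calculation. The sketch below assumes a normal-based (Wald) interval, which may not be exactly how the intervals above were constructed; the function name `p_from_ci` is hypothetical.

```python
import math

def p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Recover the standard error, z statistic and two-sided p-value
    from a point estimate and its normal-based 95% confidence interval."""
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return se, z, p

# Pooled estimates reported above for BALP (1 year) and PTH (> 2 years).
for label, smd, lo, hi in [("BALP", 0.52, 0.23, 0.81),
                           ("PTH", 0.68, 0.31, 1.05)]:
    se, z, p = p_from_ci(smd, lo, hi)
    print(f"{label}: SE = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Under this assumption, both intervals give p-values close to the reported 0.0004 and 0.0003.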

Table 4 Between-group differences in bone turnover markers at distinct timepoints of follow-up

Publication bias

Assessment of publication bias was performed for comparisons with > 5 studies. Therefore, the analysis was limited to the reports of BMD at 1 year. Visually, there was no asymmetry of published studies around the mean effect estimate of BMD of the total hip (Fig. 4a), lumbar spine (Fig. 4b), femoral neck (Fig. 4c), and the whole body (Fig. 4d), suggesting no publication bias. This was corroborated by Egger’s test for asymmetry (Fig. 4).

Fig. 4 Funnel plots of the meta-analysis of BMD at one year; the results were limited to comparisons with > 5 studies, including BMD at the total hip (a), lumbar spine (b), femoral neck (c), and the whole body (d)

Discussion

This systematic review did not find any significant differences in aBMD measurements at the hip, lumbar spine, femoral neck, and the total body following RYGB and SG procedures. However, BALP and PTH concentrations were significantly higher after RYGB surgeries compared to SG.

In the present study, the lack of significant differences in aBMD between the most popular laparoscopic bariatric approaches may be explained by several factors. Firstly, while several individual studies have shown significant reductions in aBMD within groups, these differences were not significant when between-group measurements were considered. In a recent meta-analysis of SG procedures, Jaruvongvanich et al. [6] showed significant decreases in aBMD measurements at the total hip and femoral neck. That analysis considered pre- versus post-operative changes to explore within-group differences. However, in our study, final aBMD outcomes were not statistically different between groups. Indeed, using the follow-up score in a meta-analysis usually produces more conservative findings than the change score [31]; this might contribute to the lack of significant differences in aBMD data between the RYGB and SG groups. Collectively, surgery type did not appear to be an important risk factor for the reported skeletal changes.

Secondly, the changes in aBMD might be linked to the extent of weight loss. In our meta-analysis, we showed no significant differences in BMI values up to 2 years of follow-up, which might partly explain the lack of aBMD differences between RYGB and SG. Thirdly, and more importantly, the small number of articles included in the long-term comparisons might have limited our ability to detect aBMD changes that occur on a long-term basis. In the present study, while BALP differences appeared early, at 1 year, aBMD differences remained non-significant thereafter. This is consistent with the view that early biochemical changes usually precede gross changes that would only be apparent at long follow-up periods. Therefore, there is a need to understand the skeletal changes after bariatric procedures over long periods, which should be considered in future studies.

Our findings emphasize the post-operative nutritional deficiencies that might develop due to gastric restriction, malabsorption, or both. In RYGB, a small pouch is constructed from the proximal aspect of the stomach and anastomosed to the proximal section of the jejunum; thus, malabsorption of minerals and fat-soluble vitamins occurs due to the reduction of the available intestinal surface area. Bypassing the preferential sites of mineral absorption might place patients at an increased risk of hypocalcemia. Although the significant difference in PTH between RYGB and SG might indicate more pronounced secondary hyperparathyroidism in the former arm, vitamin D deficiency and the consequent hyperparathyroidism were not uniformly reported in the literature. While several publications have demonstrated that high PTH levels are commonly reported after gastric bypass surgery [32,33,34], others have revealed that PTH and 25(OH)-vitamin D remained within the normal range postoperatively [35, 36]. It is noteworthy that PTH differences were only apparent at > 2 years of follow-up in the current meta-analysis. Therefore, well-designed, large randomized studies are warranted to explore the long-term risk profile of RYGB surgeries.

In the studies included in our review, the role of vitamin and mineral supplementation should not be neglected. Patients in all the studies had received calcium and vitamin D supplements postoperatively, and Carrasco et al. [20] have shown negative correlations between PTH reduction and both calcium and vitamin D intake. Ieong and colleagues [18] have also demonstrated significant decreases in vitamin D and calcium levels despite vitamin D and mineral supplementation. This was associated with a significant aBMD reduction within the RYGB and SG groups [18]. Of note, the risk of fracture was investigated in four studies; two studies [26, 29] reported no fracture incidents across the study periods, whereas the remaining studies [19, 28] showed no between-group differences in the frequency of fracture.

The current study has some limitations. First, eligible studies were selected based on the availability of aBMD measurements at different sites as a primary outcome. Clinically, aBMD measurement might not be available in all healthcare facilities, and aBMD utilization might differ significantly based on patients' socioeconomic status [37]. Second, although we provided a comprehensive overview of studies relying on aBMD, the results of bone metabolism markers might not be systematically covered. This means that we could not collect all relevant studies of hormonal and biochemical markers; therefore, their clinical relevance might be relatively unreliable. Third, the findings were limited by the inherent limitations of the included studies. Only three studies employed a randomized design, representing 21.4% of the included studies. The lack of randomization might have limited the causal relationships between the study variables and might have induced differences among patients at baseline. Finally, notwithstanding the observed significant differences in PTH and BALP, eligible pairwise comparisons comprised a small number of studies, which might limit the reliability of data. Future meta-analyses might consider the biochemical and hormonal markers of bone metabolism as primary outcomes to provide a comprehensive overview of the metabolic changes and better understand the underlying mechanisms of action.

In conclusion, the present systematic review and meta-analysis showed no significant differences between RYGB and SG in terms of aBMD measurements over follow-up periods ranging from 1 year to more than 2 years. These measurements were based on three sites (the total hip, lumbar spine, and femoral neck) as well as the whole body. However, patients undergoing RYGB procedures showed significantly higher BALP and PTH at 1 year and > 2 years of follow-up, respectively. Bone-related changes after RYGB and SG procedures seem comparable; yet, patients undergoing RYGB should receive rigorous vitamin D and calcium supplementation to control the expected skeletal fragility. Future large randomized studies are warranted, considering the measurement of biochemical and hormonal markers over long follow-up periods.