Introduction

There is great interest in reducing the mortality associated with bariatric surgery within the medical community, in the media, and, understandably, among morbidly obese patients. Universal appreciation of the consequences of the global obesity epidemic and a growing recognition that bariatric surgery is the most effective therapy for morbid obesity both contribute to the steadily increasing number of bariatric procedures. While nonsurgical treatments for morbid obesity have been associated with a lack of durable success, one benefit of medical management has been its low risk [1, 2]. Because surgical treatment carries a direct risk of complications, preoperative patient education and risk stratification are critically important. Mortality and morbidity rates are among the most important information guiding decisions about bariatric procedures [3].

The medical literature records the mortality results of some of the best surgeons [4]. It also reports some of the worst mortality outcomes, with emphasis on older patients and those presenting with severe comorbidities [5]. Regional mortality data [6] may not reflect countrywide statistics from population-based administrative databases, such as the Nationwide Inpatient Sample database [7]. A previous meta-analysis reported operative mortality rates in only a limited cohort of patients defined by the presence of one or more comorbid conditions, specifically diabetes, hyperlipidemia, hypertension (HTN), or obstructive sleep apnea [8]. Many obese patients, however, have multiple comorbid conditions, which may or may not affect postoperative complications. Given the different bariatric procedures available, it is understandable that operative selection algorithms have been proposed to match a specific patient with a specific operation in order to, among other goals, minimize operative mortality [9]. These algorithms, however, have relied on small data samples and have focused on only a few possible risk factors.

What is needed is a body of data, with an accompanying statistical model, that describes the relationship between independent patient factors and postoperative morbidity/mortality. Such information would be of use during the always necessary, but often difficult, counseling of patients regarding their risks for postoperative morbidity and mortality. The idea of basing a clinically reliable predictive nomogram on an accessible body of data is rapidly gaining acceptance [10]. Nomograms are increasingly cited as a means of prediction that permits rapid assessment of numerous variables with increased accuracy. It was recently noted, for instance, that more than 80 predictive models exist for prostate cancer [11]. Predictive models incorporated into preoperative patient assessment help the physician estimate a patient’s expected outcome risk based on results in similar patients; while not a substitute for physician expertise, these models are a valuable supplement. In this study, we reviewed patient factors and outcomes for all patients undergoing laparoscopic gastric bypass surgery as reported in the American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP) database, with the purpose of developing a tool for baseline risk assessment of 30-day major morbidity and mortality in this patient population.

Materials and Methods

The NSQIP methodologies, previously reported in detail [12], are summarized here in brief. Each participating site has a clinical nurse reviewer who prospectively collects preoperative patient characteristics (including risk factors), intraoperative processes of care, and postoperative adverse occurrences up to 30 days after the operation for the first 36 (VA) or 40 (private sector) operations in each 8-day cycle. Entry of common procedures such as breast biopsies and hernia repairs is limited so that such cases do not overwhelm the database. Within the VA, data such as laboratory values were copied into the NSQIP dataset from other computerized sources; in the non-VA hospitals, they were either pulled from computerized systems or entered by the nurses, who completed in-depth training on all study definitions. Regular conference calls, annual meetings, and site visits were used to maintain data reliability. Because some patients had more than one operation during their hospital stay, the index operation was defined as the first operation during the hospitalization for a given patient. Additional operations within 30 days are not counted in the totals. On the 30th postoperative day, the nurse obtains outcome information via chart review, morbidity and mortality conferences, and/or communication with each patient by letter or telephone.

By applying a filter to the American Medical Association Current Procedural Terminology codes, we were able to identify through the NSQIP database 32,426 bariatric surgery patients treated between 2005 and 2008 who had presented with >35 kg/m2 preoperative body mass index (BMI) [13].

We defined our primary outcome—composite 30-day morbidity—as any one of the following: wound infection (including superficial and deep wound infection, organ/space infection, and wound disruption), systemic inflammatory response syndrome/sepsis/septic shock, pneumonia, pulmonary embolism, stroke/cerebral vascular accident with neurological deficit, cardiac arrest requiring CPR, myocardial infarction, acute renal failure, bleeding requiring transfusions, unplanned intubation, ventilator dependence >48 h, coma >24 h, and mortality.
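As a minimal sketch of how such a composite flag can be derived, the snippet below marks a patient as having the outcome when any component event is recorded. The field names are hypothetical placeholders (the NSQIP variable names are not given here), and treating an absent field as "no event" is an assumption made only for this illustration.

```python
# Hypothetical field names; the real NSQIP variable names are not given in the text.
COMPONENTS = (
    "wound_infection", "sirs_sepsis_septic_shock", "pneumonia",
    "pulmonary_embolism", "stroke", "cardiac_arrest", "myocardial_infarction",
    "acute_renal_failure", "bleeding_transfusion", "unplanned_intubation",
    "ventilator_over_48h", "coma_over_24h", "death",
)


def composite_outcome(record):
    """Return True if any component event occurred within 30 days.

    Assumption for this sketch: a missing key means the event did not occur.
    """
    return any(record.get(name, False) for name in COMPONENTS)


# Three made-up patient records, for illustration only.
patients = [
    {"pneumonia": True},
    {},
    {"death": True, "sirs_sepsis_septic_shock": True},
]
flags = [composite_outcome(p) for p in patients]
composite_rate = sum(flags) / len(flags)
```

A patient with several component events is counted once, which is why the individual outcome frequencies can sum to more than the composite count.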

We prespecified the following demographic, morphometric, and preoperative surgical variables as potentially predictive of outcome: age, gender, race (i.e., Caucasian, African American, Hispanic, other), BMI, smoking status (yes, no, any within the last year), HTN (requiring medication), diabetes mellitus (yes, no, oral, or insulin dependent), history of chronic obstructive pulmonary disease (COPD), American Society of Anesthesiologists physical status (i.e., I/II, III, IV/V), partial or total functional dependence prior to surgery, and serum albumin level.

The individual outcomes comprising our composite were non-missing for all patients; thus, the composite outcome was non-missing. The covariables were all <0.1% missing with the exception of preoperative serum albumin, which was 31% missing. Rather than discard this variable from the analysis or exclude patients with missing serum albumin—two strategies which, in separate ways, would potentially reduce the discriminative ability of the nomogram—we developed a Bayesian multiple imputation model [14] to predict values for missing predictors based on values observed for non-missing predictors, simulated five completed datasets from this Bayesian model, and combined analysis of these five datasets into a final predictive model.
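The text does not state how the five completed-data analyses were combined. One widely used approach is Rubin's rules, sketched below under that assumption: the pooled estimate is the mean of the per-dataset estimates, and the pooled variance adds the between-imputation variance (inflated by 1 + 1/m) to the average within-imputation variance. The function name and the numbers are illustrative only.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates: per-dataset point estimates of the coefficient
    variances: per-dataset squared standard errors of that coefficient
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Made-up coefficient estimates and variances from five imputed datasets.
example_estimates = [-0.50, -0.54, -0.47, -0.52, -0.49]
example_variances = [0.010, 0.011, 0.009, 0.010, 0.012]
pooled_estimate, pooled_variance = pool_rubin(example_estimates, example_variances)
```

The pooled variance is never smaller than the average within-imputation variance, reflecting the extra uncertainty introduced by the missing albumin values.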

Multivariable logistic regression (using penalized maximum likelihood as the fitting criterion) [15–18] was used to develop the predictive model based on the multiply imputed dataset. Linearity of the relationship between each continuous predictor (age, BMI, and serum albumin) and the log-odds of 30-day morbidity was assessed, separately for each predictor, using a Wald chi-square test to compare a model incorporating a nonlinear component (restricted cubic spline with four equally spaced knots over the range of observed values) with a model assuming a linear relationship.
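To make the nonlinear component concrete, the following sketch builds a restricted cubic spline basis in its usual truncated-power form. With k knots it yields one linear term plus k - 2 nonlinear terms, constrained to be linear beyond the outer knots. Placing the four knots at the minimum, the maximum, and two equally spaced points between them is an assumption about what "equally spaced over the range" means; the function names are hypothetical.

```python
def equally_spaced_knots(lo, hi, k=4):
    """k knots from lo to hi inclusive (assumed placement, see lead-in)."""
    step = (hi - lo) / (k - 1)
    return [lo + i * step for i in range(k)]


def rcs_basis(x, knots):
    """Restricted cubic spline basis at x: [x, s_1(x), ..., s_{k-2}(x)]."""
    t_last, t_prev = knots[-1], knots[-2]

    def cube_plus(u):
        # Truncated cubic: u^3 when u is positive, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    terms = [x]
    for t_j in knots[:-2]:
        term = (
            cube_plus(x - t_j)
            - cube_plus(x - t_prev) * (t_last - t_j) / (t_last - t_prev)
            + cube_plus(x - t_last) * (t_prev - t_j) / (t_last - t_prev)
        )
        terms.append(term)
    return terms


# Illustrative age range; not taken from the study data.
age_knots = equally_spaced_knots(20, 80)
basis_at_50 = rcs_basis(50, age_knots)
```

A linearity test then asks whether the coefficients on the k - 2 nonlinear terms are jointly zero; if they are, the predictor can enter the model as a single linear term.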

The C-statistic (i.e., the area under the receiver-operating characteristic curve)—a measure of a statistical model’s ability to separate events (in our case, 30-day morbid outcomes including mortality) from non-events—was used to assess the discriminative ability of our model. This quantity ranges from 0.5 to 1.0, with a value of 1.0 representing perfect discriminative ability (i.e., absolute separation of those with the outcome from those without the outcome) and a value of 0.5 representing no discriminative ability (i.e., no better than might be expected from random guessing, such as a result based on a flip of a coin ending in correctness 50% of the time). Since predictive models tend to predict observations in the derived dataset more accurately than in new data, we used a bootstrap resampling procedure (with 500 bootstrap replicates) to get an estimate of predictive accuracy for new data.
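The C-statistic can be computed directly from its definition: among all pairs consisting of one patient with the outcome and one without, it is the fraction in which the patient with the outcome received the higher predicted probability, with ties counted as one half. The sketch below is a plain pairwise implementation on made-up numbers; it illustrates the quantity only and does not include the bootstrap correction.

```python
def c_statistic(predicted, observed):
    """Concordance (area under the ROC curve) for binary outcomes.

    predicted: predicted probabilities
    observed:  1 if the outcome occurred, 0 otherwise
    """
    events = [p for p, y in zip(predicted, observed) if y == 1]
    non_events = [p for p, y in zip(predicted, observed) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event in events:
        for p_non in non_events:
            if p_event > p_non:
                concordant += 1.0
            elif p_event == p_non:
                concordant += 0.5  # ties count as half
    return concordant / (len(events) * len(non_events))


# Made-up predictions for four patients, two of whom had the outcome.
example_c = c_statistic([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
```

The pairwise loop is quadratic in the number of patients; for a dataset of tens of thousands of records a rank-based formula would be the practical choice.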

The relationship between predicted and observed probabilities in regard to composite 30-day morbidity/mortality was assessed graphically using a calibration plot. Since, as noted earlier, a model is generally better calibrated for the data from which it was derived, we again employed a bootstrap resampling procedure, this time to obtain an overfitting-corrected calibration curve.
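The exact bootstrap variant used is not spelled out in the text. A common scheme, sketched here as an assumption, estimates the "optimism" of a performance measure: the model is refit on each bootstrap sample, scored on that sample and on the original data, and the average difference is subtracted from the apparent performance. The `fit` and `score` callables and the toy data are hypothetical stand-ins; in practice they would be the penalized logistic fit and a measure such as the C-statistic.

```python
import random


def optimism_corrected(data, fit, score, n_boot=500, seed=0):
    """Return (apparent, corrected) performance via the optimism bootstrap.

    fit(data) -> model; score(model, data) -> number, higher is better.
    """
    rng = random.Random(seed)
    apparent = score(fit(data), data)
    total_optimism = 0.0
    for _ in range(n_boot):
        # Resample the rows with replacement, keeping the original size.
        sample = [rng.choice(data) for _ in range(len(data))]
        model = fit(sample)
        total_optimism += score(model, sample) - score(model, data)
    return apparent, apparent - total_optimism / n_boot


# Toy stand-ins: the "model" is a mean, the score is negative squared error.
def fit_mean(values):
    return sum(values) / len(values)


def neg_mse(center, values):
    return -sum((v - center) ** 2 for v in values) / len(values)


toy_data = [2.0, 3.5, 1.0, 4.0, 2.5, 5.0, 3.0, 1.5]
apparent_score, corrected_score = optimism_corrected(toy_data, fit_mean, neg_mse)
```

The corrected value is typically lower than the apparent one, which is the sense in which a model looks better on the data it was derived from than on new data.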

For statistical analysis, we used R software version 2.8.1 for Windows [19], incorporating the Hmisc [20] and Design [21] libraries. The false-positive rate (type I error rate) for all tests was controlled at 5%.

Results

Characteristics of the included patients are summarized in Table 1. Of note, the median [quartiles] BMI for the 32,426 patients in the study was 45.2 [41.2, 50.7], more than 50% had medication-dependent HTN, 12% were smokers, and 80% were female. The composite outcome was observed in 1,238 patients (3.82%): 972 of these 1,238 patients (79%) had a systemic and/or wound infection and 429 (34.7%) had pulmonary complications (Table 2).

Table 1 Summary of baseline predictor variables for 32,426 US bariatric surgery patients treated between 2005 and 2008
Table 2 Frequency of individual outcomes for 1,238 of 32,426 patients (3.82%) experiencing the primary composite outcome

In the multivariable logistic regression model, the nonlinear term for the continuous predictor age was statistically significant (P = 0.003 in the presence of all other aforementioned predictors). The nonlinear terms for BMI and serum albumin were not statistically significant (P = 0.18 and P = 0.73, respectively), and we removed them from the model.

The nomogram that resulted from our final model, along with instructions for obtaining the predicted probability of composite morbidity/mortality for a new patient, is given in Fig. 1. In our final model, low serum albumin was the strongest contributor to a high predicted probability of morbidity/mortality for a given patient, followed by BMI, age, and functional dependence (as evidenced by the relative lengths of the axes presented in Fig. 1).

Fig. 1

Nomogram for predicting operative (30-day) major morbidity/mortality in US bariatric surgery patients based on demographic, morphometric, and preoperative variables. Asterisk, C Caucasian, H Hispanic, A African American, O Other. Instructions for physician: Locate the patient’s age on the “Age (yr)” axis by interpolating between the displayed values (10, 20, 30, 40, 70), then draw a straight line upwards to the “Points” axis to determine how many points toward major morbidity/mortality the patient receives for his/her age. Repeat this process for the other predictors listed; then sum all the points to get his/her total points. Locate his/her total points on the “Total Points” axis, and draw a straight line down to the “Predicted Probability (%)” axis. This is the predicted probability of 30-day major morbidity/mortality for the patient. For example, a 50-year-old, Caucasian, hypertensive, ASA-III patient with BMI of 60 kg/m2 and serum albumin of 6 g/dL would have a predicted probability of 30-day major morbidity/mortality of approximately 2.7% (26 points for age, 6 for Caucasian race, 6 for hypertension, 7 for ASA status, 16 for BMI, and 44 for albumin, a total of 105 points). Note: Two predictors, gender and diabetes mellitus, are omitted from the nomogram because of their small impact on the predicted probability arising from this logistic model. To account for them, add 2 points for male gender and 3 points if the patient is diabetic.
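The point arithmetic in the worked example can be checked with a few lines of code. The sketch below only sums the points quoted in the caption and applies the stated add-ons for male gender and diabetes; the mapping from total points to predicted probability has to be read from the figure and is not reproduced. The dictionary keys and function name are hypothetical labels.

```python
# Points quoted in the caption for the worked example patient.
WORKED_EXAMPLE_POINTS = {
    "age_50_years": 26,
    "race_caucasian": 6,
    "hypertension": 6,
    "asa_class_iii": 7,
    "bmi_60": 16,
    "serum_albumin": 44,
}


def total_points(points, male=False, diabetic=False):
    """Sum nomogram points, adding 2 for male gender and 3 for diabetes."""
    total = sum(points.values())
    if male:
        total += 2
    if diabetic:
        total += 3
    return total


example_total = total_points(WORKED_EXAMPLE_POINTS)
```

For the same patient, being male and diabetic would add 5 points to the total before it is located on the "Total Points" axis.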

The distribution of predicted probabilities in regard to complications for the patients in our dataset is summarized in Table 3. The median [quartiles] predicted probability was 3.44% [2.55%, 4.56%], and predictions ranged between 0.17% and 53.6%.

Table 3 Distribution of predicted probability of composite major morbidity/mortality for bariatric surgery patients in the NSQIP database from which the model was derived

The estimated C-statistic [95% confidence interval] for our model (as applied to new data, based on our bootstrap resampling procedure) was 0.629 [0.614, 0.645], indicative of slight to moderate discriminative ability beyond that obtainable by chance alone. A calibration plot of the observed probability of composite outcomes versus the predicted probability of composite outcome is given in Fig. 2. Ideal calibration is depicted by a curve that lies along the line y = x (i.e., the 45° line originating from the origin).

Fig. 2

Calibration (i.e., the expected relationship between observed and predicted probabilities of composite major morbidity/mortality for new data) of the predictive logistic regression model (corrected for training data overfitting based on 500 bootstrap resamples): Predicted probabilities accurately represent the proportion of patients experiencing the outcome when the predicted probability is less than about 6% (which includes 86% of patients), while for patients with predicted probability greater than 6%, the prediction tends to overstate the actual probability of event by an amount up to about 1%. Thus, for patients with predicted probability >6%, accuracy might be improved by subtracting 1%
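Read literally, the adjustment suggested in the caption is a one-line rule. The sketch below applies it as stated, with probabilities expressed in percent; the function name is hypothetical, and the rule is a rough correction rather than a recalibrated model.

```python
def adjust_predicted_probability(predicted_pct):
    """Subtract 1 percentage point when the predicted probability exceeds 6%."""
    if predicted_pct > 6.0:
        return predicted_pct - 1.0
    return predicted_pct


adjusted_low = adjust_predicted_probability(3.44)   # below the threshold, unchanged
adjusted_high = adjust_predicted_probability(7.0)   # above the threshold, reduced
```

Because the overstatement is described as "up to about 1%", a fixed subtraction creates a step at the 6% threshold and may overcorrect predictions just above it.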

Discussion

Our data analysis allowed us to evaluate the relationships between prespecified demographic, morphometric, and preoperative surgical variables and the primary outcome of composite 30-day morbidity and mortality for those patients undergoing bariatric surgery. Overall, both morbidity and mortality were low, 3.7% and 0.14%, respectively. Among the factors considered in our study, age, BMI, serum albumin, and functional status displayed the strongest independent associations with the probability of morbidity/mortality. The nonlinear effects for age and BMI are evident in the nomogram; age-associated risk increases steadily to about age 40 and then levels off while risk associated with BMI reaches a minimum at about 40 kg/m2. Low levels of serum albumin and lower functional independence were associated with increased possibility of morbidity/mortality. Within our model, chronic HTN, smoking status, gender, diabetes mellitus, race, and history of COPD were not indicated as factors associated with appreciable changes in risk.

Our findings may appear to conflict with the results of previous studies, which have shown that comorbid conditions such as HTN and COPD do impact patient outcomes postoperatively [21]. These patient factors in our review, however, contribute minimally to a patient’s overall risk in comparison with other more salient factors.

In addition, our results did not identify the presence of diabetes mellitus as an independent risk factor for postoperative complication, in contrast to the results of previous studies [22]. This apparent discrepancy may best be explained by recent studies showing that patients undergoing gastric bypass experience improvement or resolution of insulin resistance before achieving significant weight loss [23]. In fact, according to one study by Wickremesekera et al. [24], this improvement is seen as early as 6 days postoperatively. Because insulin resistance is frequently implicated as a cause of poor wound healing and infections, its rapid improvement may account for our finding.

The best way to interpret and apply these findings is not in terms of how the individual factors contribute to risk but in terms of how these parameters can be modified or improved preoperatively to potentially decrease complications. Because bariatric surgery is an elective procedure, our findings and algorithm should be used to modify identified patient risk factors in an effort to minimize 30-day morbidity and mortality. Suppose, for example, that during the initial preoperative workup for bariatric surgery a patient’s serum albumin is found to be low, which increases the patient’s risk for postoperative complications. Because the surgery is elective, the procedure may be postponed while the patient’s nutritional status and serum albumin are improved. We therefore propose our algorithm as a tool for modifying risk factors preoperatively to minimize the risk of complications postoperatively.

There was a relatively strong linear relationship between serum albumin and likelihood of major 30-day morbidity/mortality. Serum albumin, which modulates tissue healing, immune system, and pulmonary function, has been identified as a surrogate for nutritional status [25, 26]. Despite the often large quantity of food consumed by obese patients, its nutritional quality is frequently suboptimal [27], placing these patients at increased risk for poor tissue healing, wound or pulmonary infections as well as prolonged ventilator dependence [25]. Interestingly, Ernst et al. [27] found that the rate of albumin deficiency among morbidly obese patients increased with increasing BMI. Serum albumin may be considered a particularly important factor to take into account when evaluating risks associated with the type of surgical approach as well as the presence of anastomosis. Low serum albumin is associated with poor tissue healing, which can, in turn, increase the risk of an anastomotic leak, which often cascades into systemic infection and other potentially devastating complications.

Likewise, patients with poor functional status prior to surgery are at increased risk for postoperative morbidity and mortality. The ACS-NSQIP database defines three categories of functional status: independent (including those who may require prosthetics, equipment, or devices), partial (requiring at least some assistance with activities of daily living (ADLs)), and total (requiring assistance with all ADLs). Included within the definition of ADL is a patient’s ability to mobilize. Early postoperative mobility plays an integral role in the healing process; it reduces the risk of deep vein thrombosis and pulmonary embolism (PE) as well as improves lung ventilation, which decreases the risk of pneumonia due to atelectasis. Patients who require assistance with physical activity will likely be less mobile during this crucial period, increasing the probabilities for PE and pneumonia. Difficulty maintaining incisions may play a role in the risk for superficial wound infections, depending on the procedure approach (open versus laparoscopic). Supporting this is the fact that patients undergoing laparoscopic procedures experience fewer wound complications than those undergoing an open approach [28].

Interestingly, diabetes mellitus, hypertension, smoking, and history of COPD were not strong contributors to the predicted probability of 30-day morbidity and mortality. The nomogram indicates that independently none of these factors contributes significantly to the overall predicted probability of poor outcome, yet in combination, their impact demonstrably increases a patient’s risk of complications. Hypertension, smoking, and COPD are more highly correlated with morbidity than diabetes mellitus.

Previous research attempting to correlate patient factors with risk of postoperative complications has placed patients into discrete categories based upon calculated risk scores [29]. A limitation of this approach is that the groups are inherently heterogeneous, so outcomes for individual patients within them are predicted less accurately. Our nomogram may prove a useful tool for guiding physicians and patients in their decisions regarding bariatric surgery. Based on our results, for example, we expect about 5% of patients to have a predicted probability of 7% or higher, with the odds of composite morbidity/mortality for these patients being at least 116% higher than those of the average patient (as calculated by an odds ratio using probabilities 0.07 and 0.034).
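The odds-ratio calculation mentioned above can be sketched as follows. A probability p corresponds to odds p / (1 - p), and the ratio of two odds expresses how much higher one is than the other. The inputs are the rounded probabilities quoted in the text, so the result (a ratio slightly above 2) may differ by a point or two from the quoted percentage if unrounded values were used originally.

```python
def odds(probability):
    """Convert a probability to odds."""
    return probability / (1.0 - probability)


def odds_ratio(p_high, p_reference):
    """Odds at p_high relative to odds at p_reference."""
    return odds(p_high) / odds(p_reference)


ratio = odds_ratio(0.07, 0.034)
percent_higher = (ratio - 1.0) * 100.0
```

Note that an odds ratio is not a ratio of probabilities: at these low event rates the two are close, but they diverge as probabilities grow.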

This model was developed with the goal of attaining maximal predictive accuracy for the outcome. It has been shown that penalized estimation—a regression technique that intentionally trades unbiased estimation of individual relationships for an improvement in the overall predictive ability of the model—improves the predictive ability of a model when applied to new data. Therefore, this model gives the best prediction for new data but may not accurately represent specific individual relationships.

Conclusion

The growing recognition that bariatric surgery is currently the most effective therapy for morbid obesity has been accompanied by a progressive increase in the number of bariatric procedures performed annually. Little has been done, however, to enhance the surgeon’s ability to properly counsel patients prior to surgery about individual risk.

In our retrospective study, information on 32,426 bariatric surgical patients was analyzed in order to generate a statistical model that permitted our development of a predictive nomogram. Our large and varied patient sample group increases the validity of data and emphasizes broad applicability to the general population. ACS-NSQIP data are rigorously collected and it is unlikely that complication rates were underreported. Though our analysis did include the use of a bootstrap resampling procedure to estimate predictive accuracy for new data, this process would benefit from further assessments of external validity such as the prospective use of the nomogram.

The nomogram developed here is a useful tool that incorporates demographic, morphometric, and preoperative surgical variables derived from a review of more than 30,000 patients. It can play an important role in evaluating the risk of bariatric surgery as it relates to patient comorbidities. By providing better risk-specific information during patient counseling and consent, and by helping to determine which patients would be acceptable surgical candidates, it promises to contribute to more accurate prediction of 30-day morbidity and mortality. A prospective study using the nomogram and demonstrating its predictive value is the logical extension of this work.

Of particular importance, our results also indicate that age and/or the presence of comorbidities alone should not be a reason to exclude patients from consideration for surgery. Because our results indicate a smaller-than-expected increase in the risk of complications after surgery in older or sicker patients, we recommend that they be considered as surgical candidates, given the potential for improvement in many of their obesity-related conditions. Such improvement suggests possible reductions in overall healthcare spending, though the specific economics of bariatric surgery in the elderly must be researched more extensively.