Introduction

Surgeons continue to debate the optimal timing for repair of paraesophageal hernia (PEH). Urgency of repair is a recognized predictor of poor outcomes,1–3 and previously described risk stratification tools have included urgency of operation in covariate models that predict postoperative mortality and morbidity with reasonable accuracy.4 Because acute complications of PEH, including gastric volvulus and strangulation, are associated with mortality rates as high as 16 %,3 equipoise remains regarding the safety of watchful waiting for symptomatic patients, with many surgeons continuing to advocate for elective intervention.2,3

While there are some conflicting data on the impact of non-elective surgery on morbidity and mortality in reports from large-scale national registries,1,5 most studies show that patients undergoing non-elective repair are older and have a greater number of comorbid diseases. Significant differences in baseline characteristics are concerning when comparing two treatment groups, such as elective and non-elective paraesophageal hernia repair, to determine the impact of treatment allocation on outcomes such as postoperative morbidity and mortality. These differences are very likely to introduce significant bias in the analysis, affecting the precision of the relationship between the treatment allocation (e.g., urgency of operation) and the outcome. Indeed, we have found that age and selected comorbid diseases are associated with worse outcomes, independent of the urgency of operative intervention, supporting the concern for propensity for treatment bias.4 It is, therefore, not surprising that patients having non-elective repair are older and have more comorbidities, because physicians may be hesitant to offer elective surgery to patients with these baseline characteristics. The question that arises, however, is whether the recognized increase in morbidity and mortality is due to older patient age and greater comorbid illnesses or to the urgency of the operation. In other words, what is cause and what is effect?6 The aim of this study was to determine whether non-elective PEH repair is associated with differential postoperative outcomes compared to elective repair, using inverse probability of treatment propensity-score weighting to balance the differences in pretreatment characteristics, thus enabling apples-to-apples comparison.

Materials and Methods

Patient Population

This is a retrospective review of prospectively collected data on 980 patients who underwent PEH repair between January 1997 and August 2010 at a single institution. We reanalyzed data for all patients;4 931 patients had data for all 28 pretreatment propensity variables and are included in this analysis. Patients with type II–IV paraesophageal hernia involving at least 30 % of the stomach above the diaphragmatic crura comprised the study population. At our institution, patients with symptomatic PEHs are counseled to undergo elective repair; patients are considered symptomatic if they report reflux-related complaints (e.g., heartburn/regurgitation, postprandial vomiting), obstructive complaints (e.g., dysphagia, postprandial bloating, chest and epigastric pain), space-occupying symptoms (e.g., postprandial dyspnea), or bleeding (e.g., anemia, hematemesis, melena, or hematochezia). For propensity-score modeling, exposure was defined as non-elective repair, which included urgent and emergent surgery. Urgent repair constituted an admission for symptomatic management followed by operative repair during the same admission. Emergent repair constituted immediate operation for acute complications and inability to relieve gastric or esophageal obstruction endoscopically. Patients presented through the emergency department, arrived as a transfer from another facility, or were directly admitted from an outpatient setting. This study received Institutional Review Board approval.

Surgical Technique

Laparoscopic PEH repair is the primary approach to elective repair at our institution. Our surgical approach to the laparoscopic repair of PEH has been previously described.7 Briefly, the tenets of open or laparoscopic surgical repair include complete sac reduction from the mediastinum, mobilization of at least 2–3 cm of tension-free intra-abdominal esophagus, and tension-free hiatal closure. Anti-reflux procedures were performed at the discretion of the surgeon based on patient stability at the time of operation, baseline symptoms (obstruction versus reflux/regurgitation), and adequacy of esophageal length. Patients with inadequate esophageal length underwent either extended gastropexy of the stomach to the anterolateral abdominal wall following complete sac reduction, mobilization and hiatal closure, or stapled gastroplasty for esophageal lengthening and fundoplication.

Statistical Analysis

Statistical analyses were performed using Stata® version 14 (ref. 8) and R version 3.0.0 (ref. 9) with the use of the user-written R package “twang.”10 Categorical variables are described as frequencies and percentages; continuous variables as medians with interquartile ranges (IQR). Outcomes of interest were short-term mortality and major complications. Mortality was defined as in-hospital death during index admission or death within 30 days of operative repair. Major complications were defined according to the Society of Thoracic Surgeons definitions for postoperative complications. Patients were classified as having a major complication if they suffered one or more of the following: pneumonia, reintubation, tracheostomy, pulmonary embolism, myocardial infarction, congestive heart failure, acute renal failure, cerebral vascular accident, septic shock or bacteremia, postoperative gastric or esophageal leak, perioperative hernia recurrence, or readmission or reoperation within 30 days.
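
The composite outcome defined above can be sketched in a few lines of Python. This is only an illustration: the event labels and the function name are hypothetical and do not come from the study database, and the authors' analysis was performed in Stata and R.

```python
# Hypothetical event labels for the composite "major complication" outcome.
# The label spellings are assumptions made for this sketch.
MAJOR_COMPLICATIONS = frozenset({
    "pneumonia",
    "reintubation",
    "tracheostomy",
    "pulmonary_embolism",
    "myocardial_infarction",
    "congestive_heart_failure",
    "acute_renal_failure",
    "cerebral_vascular_accident",
    "septic_shock_or_bacteremia",
    "gastric_or_esophageal_leak",
    "perioperative_hernia_recurrence",
    "readmission_within_30_days",
    "reoperation_within_30_days",
})


def has_major_complication(events):
    """Return True if at least one recorded event is in the composite list."""
    return not MAJOR_COMPLICATIONS.isdisjoint(events)


# A patient with atrial fibrillation and pneumonia counts as having a major
# complication because pneumonia is on the list; atrial fibrillation alone
# does not qualify.
patient_a = ["atrial_fibrillation", "pneumonia"]
patient_b = ["atrial_fibrillation"]
flags = [has_major_complication(patient_a), has_major_complication(patient_b)]
```

Because the outcome is binary at the patient level, a patient with several listed events is counted once.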

Propensity Weighting

Boosted regression modeling was used to model non-elective repair, the binary exposure variable, as a function of 25 pretreatment factors (Table 1). The model was allowed to find and account for up to three-way interactions between variables. Boosted regression is a relatively novel technique for propensity score generation and requires the user to specify a number of parameters; for our model, these were a shrinkage (learning rate) of 0.005, a bag fraction of 50 %, and a total of 30,000 fit trees, all of which are consistent with guidelines suggested in the literature.11 We elected to use boosted regression modeling over other propensity score approaches due to its ability to search for interaction terms and its generally superior performance across many types of datasets.12

Table 1 Distribution of baseline demographics, comorbid diseases, and pretreatment factors used to generate propensity scores stratified by urgency (elective versus non-elective) of paraesophageal hernia repair

The model generated a propensity score for each patient, representing their pretreatment probability of undergoing a non-elective PEH repair, which was then converted into an inverse probability of treatment weight (IPTW). In contrast with propensity score matching, which often involves the discarding of unmatched patients, the use of IPTW allows all subjects to remain in the final analysis of outcomes, albeit with varying analysis weights. After applying these propensity weights to the dataset, pretreatment factors were assessed for adequate balance, which would indicate a feasible comparison between patients who underwent elective versus non-elective repair. The weighted dataset was then used to calculate odds of in-hospital/30-day mortality and major adverse events through logistic regression. The relationship between the outcomes and urgency of surgery was a priori adjusted for two binary indicator variables: age 80 years or older, and age-adjusted Charlson comorbidity index (aaCCI)13,14 score of 6 or higher.4 Two-sided p values <0.05 were considered statistically significant. Patients who had lymphoma (n = 4) were excluded from the analysis due to perfect association with the exposure; all had non-elective surgery. Patients who had a metastatic tumor (n = 3) were also excluded due to perfect separation; all had elective surgery. Therefore, these variables were not included in the propensity model due to lack of variability within these conditions.
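
As a minimal sketch of how a propensity score becomes an analysis weight, the Python below assumes weights for the average treatment effect (1/p for exposed patients, 1/(1 − p) for unexposed patients). The text does not state which estimand was used, so this choice, the function names, and the toy data are all assumptions made for illustration.

```python
def iptw_weights(propensity, exposed):
    """Inverse probability of treatment weights (ATE form, assumed here).

    propensity: estimated probability of non-elective repair for each patient
    exposed:    1 if the patient actually had non-elective repair, else 0
    """
    weights = []
    for p, t in zip(propensity, exposed):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / p if t else 1.0 / (1.0 - p))
    return weights


def weighted_event_rate(outcome, weights, exposed, group):
    """Weighted proportion of patients with the outcome within one exposure group."""
    numerator = sum(w * y for y, w, t in zip(outcome, weights, exposed) if t == group)
    denominator = sum(w for w, t in zip(weights, exposed) if t == group)
    return numerator / denominator


# Toy data, purely hypothetical: four patients.
propensity = [0.8, 0.5, 0.2, 0.25]
exposed = [1, 0, 1, 0]
outcome = [1, 0, 0, 1]

weights = iptw_weights(propensity, exposed)
rate_non_elective = weighted_event_rate(outcome, weights, exposed, group=1)
rate_elective = weighted_event_rate(outcome, weights, exposed, group=0)
```

A patient who underwent non-elective repair despite a low estimated probability of doing so (0.2 in the toy data) receives a large weight, which is how every patient can remain in the analysis while the two groups are brought into balance.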

Results

Baseline Characteristics

A total of 171 patients (19 %; 171/924) underwent non-elective paraesophageal hernia repair. Non-elective repair was associated with female sex, greater age at operation, and lower BMI (Table 1). Patients who underwent non-elective repair were more likely to have a higher age-adjusted Charlson comorbidity index score than those who underwent elective repair, with a higher proportion of prior myocardial infarction, history of congestive heart failure, and history of renal dysfunction. Intraoperatively, a greater proportion of non-electively repaired patients were found to have a type IV hernia, and a greater proportion were converted to laparotomy (Table 2). Conversion to thoracotomy was not required in any patient. Four patients required resection: two underwent esophagectomy (one open/one minimally invasive) and two underwent gastrectomy (one open/one laparoscopic). Both gastrectomies and one esophagectomy were non-elective. Patients 80 years of age or older were significantly more likely to be repaired non-electively than younger patients (42 % (70/167) versus 13 % (101/757); p < 0.001), as were patients with an aaCCI of 6 or higher compared to those with a lower score (46 % (58/127) versus 14 % (113/797); p < 0.001). Of the patients who were 80 years or older, 48 % also had an aaCCI score of 6 or higher (80/167; p < 0.001). All patients underwent definitive repair of the hiatal hernia (defined as a complete reduction of hernia sac, esophageal mobilization, and closure of the diaphragmatic hiatus), but fewer non-elective patients had an anti-reflux procedure in addition to the hiatal hernia reduction and closure.

Table 2 Operative details comparing patients undergoing elective versus non-elective paraesophageal hernia repair

Because the imbalance in these baseline predictors can impact the precision of the point estimate of the relationship between the predictors and the study outcomes (perioperative morbidity and mortality), the conditional treatment probability (propensity score) for non-elective repair was estimated for each patient using boosted regression, and the data were weighted by the propensity score. Prior to weighting, the median absolute percentage bias across 28 propensity-score variables was 19.3 %. Multiple variables had standardized percentage bias greater than 20 %, a threshold that indicates imbalance between groups, including overall aaCCI (64 %), age (52 %), smoking status (23 %), perioperative hernia size (15–44 % depending on percent of herniated stomach), operating surgeon (38 %), and specific comorbidities including myocardial infarction (25 %), congestive heart failure (25 %), cerebral vascular disease (26 %), dementia (29 %), peptic ulcer disease (31 %), and renal disease (23 %) (Fig. 1).

Fig. 1 Absolute standard difference between elective and non-elective patients for pretreatment variables before and after propensity weighting. After weighting, all variables have absolute standard differences of <20 % (most 10 % or less)

After inverse probability of treatment weighting using the propensity score, absolute percentage bias across all covariates was decreased to less than 20 % (Fig. 1). Importantly, the absolute percentage bias for age after inverse probability of treatment weighting improved from 52 % (mean age 74.2 (std. dev. 12.66) for non-elective; 68 (std. dev. 11.43) for elective patients) to 6.7 % (mean age 69.4 (std. dev. 11.85) for non-elective; 68.6 (std. dev. 11.72) for elective patients). Similarly, the absolute percentage bias for aaCCI after inverse probability of treatment weighting improved from 64 % (mean aaCCI of 4 (std. dev. 2.97) for non-elective and 2.4 (std. dev. 2.4) for elective patients) to 5.3 % (mean aaCCI of 2.7 (std. dev. 2.7) for non-elective and 2.5 (std. dev. 2.5) for elective patients). The largest absolute percentage bias for a single variable was 18 % for hemiplegia, which only affected ten patients (1.08 %). The median absolute standardized percentage bias for all variables was reduced to 5.6 %. This level of balance indicates that, with inverse probability of treatment weighting, elective surgery patients and non-elective surgery patients were well-balanced in baseline characteristics and suitable for comparison of outcomes.
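
The balance statistic reported above can be approximated from the published means and standard deviations. The sketch below assumes the common definition of standardized percentage bias, in which the difference in means is divided by the square root of the average of the two group variances. The text does not state which denominator was used, and the inputs are rounded summary statistics, so the results are approximate.

```python
import math


def standardized_pct_bias(mean_a, sd_a, mean_b, sd_b):
    """Absolute standardized difference between two groups, as a percentage.

    Assumes the pooled-SD form: |mean_a - mean_b| / sqrt((sd_a^2 + sd_b^2) / 2).
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2.0)
    return 100.0 * abs(mean_a - mean_b) / pooled_sd


# Age, non-elective versus elective, using the summary statistics quoted above.
age_bias_unweighted = standardized_pct_bias(74.2, 12.66, 68.0, 11.43)
age_bias_weighted = standardized_pct_bias(69.4, 11.85, 68.6, 11.72)
```

Under this assumed definition, the age figures come out close to the reported 52 % before weighting and 6.7 % after weighting. The small discrepancies are consistent with rounding of the published means.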

Risk of Complications

Major complications occurred in 201 out of 924 patients (21.8 %), including 38 % of patients repaired non-electively and 18 % repaired electively (Table 3). Major adverse events were also more common in patients 80 or older (56/167 (33.5 %) versus 145/757 (19 %); p < 0.001) and in patients with an aaCCI score of 6 or higher (48/127 (37.8 %) versus 153/797 (19.2 %); p < 0.001). Patients having non-elective repair had a significantly higher proportion of pulmonary events, including postoperative pneumonia, prolonged initial mechanical ventilation greater than 48 h, need for reintubation, tracheostomy, and bronchoscopy for airway clearance. Non-electively repaired patients had a significantly higher requirement for perioperative blood transfusion and were more likely to suffer a myocardial infarction, develop congestive heart failure, and have new or uncontrolled atrial fibrillation, Clostridium difficile colitis, delirium, and acute renal insufficiency. They were also more likely to require reoperation, had a greater length of hospital and postoperative stay, and were more likely to be readmitted within 30 days of operation (Table 3).
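
For readers who want to translate the proportions above into odds ratios, the sketch below computes a crude (unadjusted, unweighted) odds ratio with a Woolf-type 95 % confidence interval from the counts quoted in this paragraph. It is not the regression analysis the authors performed; the function name and the choice of interval method are assumptions made for illustration.

```python
import math


def crude_odds_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Unadjusted odds ratio (group A versus group B) with a log-based CI."""
    a, b = events_a, total_a - events_a  # group A: events, non-events
    c, d = events_b, total_b - events_b  # group B: events, non-events
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Major adverse events: age 80 or older (56/167) versus younger (145/757).
age_or, age_lower, age_upper = crude_odds_ratio(56, 167, 145, 757)

# Major adverse events: aaCCI of 6 or higher (48/127) versus lower (153/797).
cci_or, cci_lower, cci_upper = crude_odds_ratio(48, 127, 153, 797)
```

With these counts, the crude odds ratios are roughly 2.1 for age 80 or older and 2.6 for an aaCCI of 6 or higher. Neither is adjusted for the other or for urgency of repair.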

Table 3 Perioperative mortality and major adverse events within 30 days or in-hospital comparing patients undergoing elective versus non-elective paraesophageal hernia repair

Prior to adjusting for propensity for non-elective repair, patients undergoing non-elective repair were nearly three times more likely to experience major adverse events than were patients who underwent elective repair. When the relationship between non-elective repair and major adverse events was adjusted for age 80 or older and an aaCCI score of 6 or higher, patients undergoing non-elective repair were 2.3 times more likely to experience major adverse events. Age 80 or older was not an independent predictor, while the aaCCI score of 6 or higher was independently associated with an approximate doubling of the odds of having adverse events (Table 4). The lack of independence for age 80 and older may be a function of the fact that nearly 50 % of octogenarians also had an aaCCI score of 6 or higher. After inverse probability of treatment weighting for propensity for non-elective surgery, which balances the baseline variables that are associated with worse outcomes, the association between non-elective repair and major adverse events remained significant in univariate analysis, but the odds were increased 1.7-fold rather than 2.8-fold. Adjusting for age 80 or older and age-adjusted CCI score 6 or higher in the weighted data did not meaningfully change the strength of the association (Table 4).

Table 4 Multivariable regression analysis for mortality within 30 days and major complications adjusting for non-elective surgery, age, and comorbidity, before and after propensity weighting

Risk of 30-Day and/or In-hospital Death

Overall 30-day and/or in-hospital mortality across all PEH repairs in our cohort was 2.3 %. Prior to inverse probability of treatment weighting for propensity for non-elective repair, patients undergoing non-elective repair were nearly eight times more likely to suffer perioperative death (7.6 % for non-elective patients and 1.1 % for elective cases; p < 0.001) (Table 3). Patients aged 80 and older were also significantly more likely to die perioperatively (15/167 (9 %) versus 6/757 (0.8 %); p < 0.001), as were patients with an aaCCI of 6 or higher (16/127 (12.6 %) versus 5/797 (0.63 %); p < 0.001). When adjusted for age 80 or older and aaCCI 6 or higher in the non-weighted cohort, non-elective repair trended toward an association with nearly three times increased odds of death but was no longer an independent predictor (Table 4). Among patients who experienced at least one major postoperative event, 30-day and/or in-hospital mortality was 7.5 % compared to 0.83 % among those who did not (15/201 versus 6/723; p < 0.001). After the balancing of baseline variables using inverse probability of treatment weighting for propensity for non-elective surgery, non-elective repair was associated with more than three times increased odds of death. After adjusting for age 80 or older and aaCCI 6 or higher, 30-day/in-hospital death was 2.7 times more likely, but the association was not statistically significant (Table 4). In the weighted data, aaCCI of 6 or higher was an independent predictor of greater than 21 times increased odds of perioperative mortality (specificity and sensitivity 88 and 76 %, respectively, for predicting mortality). The 95 % confidence interval for this finding was quite broad, however, indicating a less precise estimate than would be desirable.
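
The sensitivity and specificity quoted above can be checked against the unweighted counts given earlier in this paragraph (16 deaths among 127 patients with an aaCCI of 6 or higher; 5 deaths among 797 patients with a lower score). The Python below is a sketch of that arithmetic; treating "aaCCI of 6 or higher" as a positive test for perioperative death is an interpretation made for this example, and the variable names are hypothetical.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# "Test positive" = aaCCI of 6 or higher; "condition" = perioperative death.
deaths_high, n_high = 16, 127  # aaCCI >= 6
deaths_low, n_low = 5, 797     # aaCCI < 6

sens, spec = sensitivity_specificity(
    true_pos=deaths_high,
    false_neg=deaths_low,
    true_neg=n_low - deaths_low,
    false_pos=n_high - deaths_high,
)
sens_pct, spec_pct = round(100 * sens), round(100 * spec)
```

These raw counts give a sensitivity of 16/21 and a specificity of 792/903, which round to the reported 76 % and 88 %.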

Discussion

Because patients who present with acute complications of paraesophageal hernia requiring urgent or emergent repair are typically older and have more associated comorbid illnesses, our study determined the conditional probability of non-elective paraesophageal hernia repair for each patient to create two groups who were balanced for 28 pretreatment variables using inverse probability of treatment weighting. Weighting the data using propensity for the exposure enables the balancing of baseline characteristics between two groups of patients, thus improving the precision of the point estimate for the relationship between the exposure (urgency of operation) and the outcomes of interest. This technique minimizes the potential bias of factors such as age and comorbidities which may influence the exposure (non-elective repair) as well as the outcomes (morbidity and mortality). Adjusting for propensity for non-elective repair created two cohorts that were well-balanced in baseline characteristics and suitable to examine the impact of non-elective repair on outcomes. With this weighted dataset, we were able to compare similar patients and more precisely determine whether non-elective paraesophageal hernia repair was associated with differential postoperative outcome compared to elective repair. Non-elective repair was performed in 19 % of patients and was associated with multiple predictors. Importantly, there were very large differences in predictors such as age, age-adjusted Charlson comorbidity index score, and underlying cardiac and renal disease, variables that are known to be associated with greater likelihood of adverse outcomes. Inverse probability of treatment weighting reduced the imbalance in median standardized percentage bias from nearly 20 % in the unweighted data to 5.6 % in the weighted data. Importantly, all variables except hemiplegia (affecting only 1 % of patients) had absolute percentage bias of less than 10 % following weighting. 
Using these weighted data, we found that non-elective repair was independently associated with 1.7 times increased odds of major adverse events and trended toward an increase of 2.7 times for odds of mortality compared to elective repair, after accounting for age and comorbid index score. These findings allow us to conclude that non-elective repair does increase the likelihood of perioperative morbidity and mortality when compared to similar patients treated with elective repair.

Previous Studies

Prior studies have described urgent laparoscopic repair of acutely symptomatic PEH as safe and effective.2,15 Parker and colleagues found no difference in mortality rate compared to a control group matched for age and CCI in a cohort of 25 patients who underwent non-elective PEH repair. They did find a statistically significant increase in the prevalence of major adverse events, which they defined as Clavien grade 3–4 complications (16 versus 1.6 %, respectively; p = 0.021). Similarly, analysis of 10,656 patients in the American College of Surgeons National Surgical Quality Improvement Program (NSQIP), of whom 383 (3.6 %) underwent emergent PEH repair, found that emergent PEH repair did not predict mortality on multivariable analysis but did increase the odds of serious morbidity.16 Our data are consistent with these findings. It is somewhat surprising that mortality was not associated with non-elective repair in our data, but this finding may be explained by the small number of deaths in our series. There were only 21 deaths in our series, for a rate of 2.3 %. Similarly, there were only two perioperative deaths in the series by Parker. These small numbers limit the extent of multivariable analysis that can be performed and may explain the lack of association in the Parker study and the non-significant trend in our data when compared to the report by Poulose and colleagues, who did find that non-elective repair was the sole predictor of inpatient mortality among octogenarians undergoing PEH repair.3 In the NSQIP data, there were a total of 87 deaths, but only 21 of them were in the non-elective group. The proportion of patients who died after non-elective repair was 5.5 % compared to only 0.65 % in the elective group, with significant differences in age, nutritional status (by preoperative weight), and medical comorbidities between the two groups.
Given the very large discrepancy in the total number of patients in the control group (>10,000) compared to the non-elective group (n = 383), the failure to balance the major differences in baseline characteristics between groups, variables which are also associated with survival, is likely strongly biasing the analysis and masking potential associations between the urgency of surgery and mortality.

It is notable that our study did not find age to be independently associated with increased odds of major adverse events or mortality when adjusted for non-elective repair and an age-adjusted CCI score of 6 or higher. We have previously reported a risk model for perioperative morbidity and mortality which utilized individual Charlson comorbidity variables in the prediction model. The prediction model did find that age over 80 was a significant predictor when adjusted for non-elective operation, pulmonary disease, and congestive heart failure. In the current analysis, in comparison, the age-adjusted Charlson comorbidity score of 6 or higher is strongly associated with adverse outcomes, which may be negating the individual contribution of age over 80 alone, since the age-adjusted Charlson comorbidity score takes the interaction between age and comorbid illnesses into account.4 The current study findings are in line with other reports in the literature. Studies by Gangopadhyay and colleagues17 and Spaniolas and colleagues18 found that complications were not significantly more likely among elderly patients compared to younger age groups. These authors concluded that laparoscopic PEH repair is safe in elderly and select high-risk patients. In contrast to our study, neither of these two studies included non-elective operations, and therefore neither adjusts for the impact of non-elective repair on mortality and morbidity.

After propensity weighting and adjusting for non-elective repair and age 80 or older, an aaCCI of 6 or higher remained an independent predictor of both mortality and major adverse events following repair of paraesophageal hernia. We and others have used the Charlson comorbidity index to risk adjust patients in the analysis of outcomes and in clinical practice.2,4,5,19,20 Use of the aaCCI score may be a reasonable option when surgeons are considering whether to offer the patient an elective operation. At a minimum, patients should be counseled on the significantly increased risks of the operation, and the symptoms, their impact on quality of life, and the experience of the surgeon should be carefully considered.

Study Strengths and Limitations

The size of our institutional cohort and use of propensity weighting are key strengths of our study. The use of inverse probability of treatment weighting minimizes the expected limitations of an observational study, which inherently lacks the qualities of a randomized study. Balancing of the treatment groups with propensity weighting techniques allows a more precise analysis of the relationship between our exposure and outcomes. Our data are subject to the inherent biases of any retrospective review; however, they were prospectively collected and are periodically audited for accuracy and completeness. Our study is limited to 30-day or in-hospital morbidity and mortality, so we do not examine long-term outcomes. We attempted to be expansive in our definitions of morbidity and mortality: our data capture events that occur within 30 days of the operation, during the initial hospital stay, or during a readmission within 30 days, including all events occurring during that readmission even if they occur more than 30 days after surgery. However, major complications or deaths that occurred beyond 30 days would not be captured if the patient was not readmitted within 30 days. We also do not examine patient-centered outcomes including recurrence, symptom recurrence, or patient satisfaction in this manuscript, as we have published on these outcomes previously.20–23

In addition to age and comorbid diseases, increasing recognition of the additive impact of these variables on overall patient function has shifted the focus to indices that encompass multiple dimensions contributing to patient frailty. Frailty is a composite measure that typically includes weakness, weight loss, level of exhaustion, level of physical activity, and walking speed. Frailty measures may be more accurate predictors of postoperative outcomes than age or comorbidities,24 and indeed, multiple studies have found that a greater frailty risk score is associated with increased mortality and morbidity following thoracic, vascular, and other surgical procedures.5,24,25 We do not have measures of frailty in the current dataset but have begun to calculate frailty in our patients and plan further analysis in future studies. The important consideration, however, is that an understanding of patient comorbidities and frailty allows the patient and surgeon to weigh the level of operative risk against potential gains in quality of life in the elderly, as paraesophageal hernia repair has been shown to significantly improve quality of life and is associated with high rates of satisfaction.26 Simply denying an elective operation to a symptomatic elderly person based on age is supported neither by our analysis nor by the analyses of others.3,17,26

Conclusion

Non-elective repair of large paraesophageal hernias is associated with nearly three times greater odds of perioperative death and nearly two times greater odds of major adverse events compared to elective repair, even after accounting for differences in baseline characteristics. Based on our findings, we support the elective repair of symptomatic paraesophageal hernias; consideration for elective repair by a surgeon with extensive experience in advanced foregut surgery is appropriate even in patients with advanced age and significant comorbid diseases. Preoperative evaluation and counseling should include the calculation of patient risk and a frank discussion with the patient and family regarding risk for death, major adverse events, impaired functional status, and quality of life versus the likelihood of improved quality of life and symptom relief with hernia repair.