Paraesophageal hernia is defined by herniation of the gastric fundus, and occasionally the entire stomach or other abdominal viscera, through a dilated diaphragmatic hiatus [1]. In some cases, paraesophageal hernias are asymptomatic and are found incidentally on radiological imaging. However, most patients with a large paraesophageal hernia report a broad range of symptoms that can individually or cumulatively have a substantial impact on their quality of life [2]. The symptoms are not only gastrointestinal in nature but can also be respiratory and cardiovascular [3,4,5]. Indications for surgical repair are controversial, but typically weigh patients’ symptoms and their effects upon quality of life against the desire to avoid acute complications [6].

Further, the incidence of paraesophageal hernia rises with age, with a median age at diagnosis of 65–75 years [2]. Older patients, however, often present with additional co-morbidities, reduced physical fitness and frailty, which together increase operative risk. Pooled analysis estimates the probability that a patient with paraesophageal hernia develops acute symptoms and requires emergency surgery to be around 1% per year [7]. Thus, the decision to offer surgery can be challenging in this cohort of patients.

Laparoscopic repair, when technically possible, is the recommended form of surgical management, with acceptable safety and success rates in patients of all ages [8]. Several studies, often with heterogeneous results, have evaluated the use of different types and configurations of mesh to reinforce the repair and reduce the risk of recurrence compared with traditional hiatal suture repair alone [9, 10]. The European Association for Endoscopic Surgery (EAES) consensus conference in 2014 stated that hiatal repair with mesh reinforcement may reduce hernia recurrence, although mesh-related complications have to be considered. As a consequence, EAES recommended that indications for mesh should be limited to patients with weak crura and a large hiatal defect [11]. SAGES guidelines on the management of hiatal hernia [6] acknowledged the controversy surrounding the use of mesh cruroplasty, stating “There is inadequate long-term data on which to base a recommendation either for or against the use of mesh at the hiatus.”

There are no recent guidelines on the management of paraesophageal hernias, and previous recommendations may not be pertinent in the light of new evidence [6]. A survey of European Association for Endoscopic Surgery members indicated that this topic is prioritized by a substantial proportion of European surgeons [12].

The aim of this rapid guideline is to support healthcare professionals (surgeons, gastroenterologists, primary care physicians) and patients in navigating clinical decision-making around the management of paraesophageal hernias, with the objective to improve perioperative and long-term outcomes, including quality of life.

Methods

This guideline follows AGREE-S, GRADE, Institute of Medicine, Guidelines International Network (GIN) and Cochrane Rapid Reviews Methods Group development and reporting standards [13,14,15,16,17]. Principles of rapid guidelines or rapid recommendations were followed, including a focus on a few prioritized clinical questions, completion within a short timeframe and the application of rapid review methods. An AGREE-S reporting checklist is provided in Supplementary File 2. GRADE guidance published in a series of articles in the Journal of Clinical Epidemiology was consulted for up-to-date information. The process of guideline development was facilitated by the use of MAGICapp, an online authoring and publication platform.

Steering group

The steering group consisted of two general surgeons who perform laparoscopic surgery for paraesophageal hernia (SAA, SM). One member of the steering group is a certified master guideline developer and chair with extensive experience in evidence outreach, synthesis, assessment and guideline development (SAA; INGUIDE Certification Number: 2022-L3-V1-00014). Both steering group members declared no direct or indirect conflicts of interest [18].

Guideline panel

The guideline panel consisted of 6 general surgeons, 2 gastroenterologists, and 1 patient representative. The patient representative (MM) is chairman of Heartburn Cancer UK, and has participated as a non-medical professional patient advocate and representative in several national guidelines. She was a regular member of the guideline panel, with equal contribution and voting rights from the start of the guideline development process. Panel members watched a short video tutorial outlining the guideline development methodology. The panel was composed to be representative of different parts of Europe and different age groups. All panel members disclosed no direct or indirect conflicts of interest related to the topic of this guideline [18]. We invited key opinion leaders as external advisors; these are authors of studies that expressed an opinion on the effectiveness of an intervention, or researchers working on a topic that could be affected by a recommendation of this guideline. The external advisors were not involved in the decisions on the strength, the direction or the wording of the recommendations, but they were consulted in the development of the evidence-to-decision framework, as per GRADE and GIN guidance [19]. The composition of the guideline development group and each member's role are available in the online appendix [18].

Health question

The guideline addresses the following healthcare questions:

  1. Should asymptomatic/minimally symptomatic paraesophageal hernias be managed conservatively or with surgery?

  2. Should a mesh versus sutures only be used for closure of the hiatus in paraesophageal hernia repair?

  3. Should a gastropexy versus a fundoplication be performed in paraesophageal hernia repair?

Paraesophageal hernias in the context of the present document are considered hernias with > 50% of the stomach herniated into the thorax.

Protocol

A protocol was developed a priori by the steering group [20]. The protocol draft was made publicly available through the EAES website, and EAES members were invited through email to comment on the content. The guideline question and outcomes of interest were refined in collaboration with the panel members. Amendments to the protocol with justifications are provided below.

Rating the importance of outcomes

The importance of outcomes was rated by panel members using the GRADE scale [21]. The GRADE scale assigns scores 1–3 to outcomes of lower importance; 4–6 to important outcomes; and 7–9 to critical outcomes. The classification of outcomes into each of the three categories (not important, important, critical) was made by the steering group under consideration of panel members' ratings available online [18]. The median score across panel members' votes was considered the final score.

We considered the importance of outcomes as follows:

  • 30-day complications Clavien–Dindo ≥ 3: 8

  • 30-day complications Clavien–Dindo ≤ 2: 6

  • Dysphagia beyond 6 months: 7

  • Reoperation: 7

  • Quality of life: 7

Some panel members further nominated a number of outcomes, which were not prioritized due to overlap with current outcomes (see online appendix for full list and justification for exclusion [18]).
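The rating rule described above (median of panel votes, mapped to the three GRADE bands) can be sketched as follows. This is only an illustration: the votes are hypothetical and are not the panel's actual ratings, the function name is invented here, and in the guideline the final classification was made by the steering group in consideration of the ratings rather than by a mechanical rule.

```python
from statistics import median


def grade_importance(votes):
    """Return (median score, GRADE band) for one outcome's panel votes.

    Bands follow the GRADE scale quoted in the text:
    1-3 lower importance, 4-6 important, 7-9 critical.
    A fractional median such as 6.5 is assigned to the lower band
    here, which is an assumption of this sketch.
    """
    score = median(votes)
    if score >= 7:
        band = "critical"
    elif score >= 4:
        band = "important"
    else:
        band = "not important"
    return score, band


# Hypothetical votes from nine panel members (illustrative only).
example_votes = {
    "30-day complications Clavien-Dindo >= 3": [8, 9, 8, 7, 8, 9, 7, 8, 8],
    "30-day complications Clavien-Dindo <= 2": [6, 5, 6, 7, 6, 5, 6, 4, 6],
}

ratings = {name: grade_importance(v) for name, v in example_votes.items()}
for name, (score, band) in ratings.items():
    print(f"{name}: median {score} ({band})")
```

The median is used because it is robust to a single outlying vote, which matters in a panel of nine.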

Setting minimal important differences

The evidence-to-decision framework was set within a fully contextualized approach [22]. An anonymous web-based survey of panel members was performed to define minimal important differences. The results of the survey are available online [18]. The median of the minimal important differences across panel votes was selected.

Taking the panel's responses into consideration, the following minimal important differences were set:

  • 30-day complications Clavien–Dindo ≥ 3: 10 per 1000

  • 30-day complications Clavien–Dindo ≤ 2: 50 per 1000

  • Dysphagia: 50 per 1000

  • Reoperation: 50 per 1000

  • Quality of life: 2 out of 10 points—or 0.2/0.5 standard deviations (small/moderate difference)

Quality of life was reported on different scales (Gastrointestinal Quality of Life Index, Short Form 36); we therefore calculated standardized mean differences. Although no universal cutoff can be applied [23], we considered the above differences in standard deviation units as important based on expert guidance (INGUIDE certification program).
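As an illustration of how a standardized mean difference relates to the 0.2/0.5 thresholds above, the sketch below computes a bias-corrected SMD (Hedges' g) from group summaries. The choice of Hedges' g is an assumption, since the text does not state which SMD estimator was used, and the input values and function names are hypothetical.

```python
from math import sqrt


def standardized_mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Hedges' g: mean difference divided by the pooled SD,
    multiplied by a small-sample correction factor."""
    pooled_sd = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / pooled_sd
    correction = 1 - 3 / (4 * (n1 + n2 - 2) - 1)
    return d * correction


def interpret(smd, small=0.2, moderate=0.5):
    """Compare |SMD| with the thresholds set in the text (0.2 and 0.5 SD)."""
    size = abs(smd)
    if size >= moderate:
        return "at least moderate"
    if size >= small:
        return "small"
    return "below the minimal important difference"


# Hypothetical quality-of-life scores in two groups of 40 patients each.
g = standardized_mean_difference(110, 15, 40, 104, 15, 40)
print(round(g, 3), interpret(g))
```

Because the SMD is expressed in standard deviation units, results from instruments with different ranges can be placed on a common scale before pooling.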

Search strategy

The guideline methodologist developed separate search strategies for each question framework [18] after a scoping search of PubMed to assess the availability of randomized trials or observational studies on each clinical question. Specifically, the search syntax was specific to randomized trials for question 2, and to observational studies for questions 1 and 3. We searched PubMed for original articles of these study designs per question, published from 1990 onwards in the English language. The search syntax, date limits, and summary search results are provided in the online appendix [18].

Study selection

An ad hoc evidence outreach team (NG, NM) performed record screening using the platform Rayyan [24]. Both reviewers were blinded to each other's judgement and the senior author (SAA) resolved disagreements after unblinding. The same reviewers in collaboration with the methodologist selected articles based on full text screening.

We considered randomized controlled trials on question 2 and observational studies on questions 1 and 3, addressing these specific question frameworks. Overarching inclusion criteria were adult patients with paraesophageal hernias, documented on cross-sectional imaging, barium studies, or esophagogastroscopy. In the quantitative analysis, we only included studies that reported outcomes with more than 12 months of follow-up, except for perioperative outcomes. Panel members and external advisors were provided with the list of included articles and were asked whether they were aware of any other studies addressing the clinical questions.

Data extraction

Outcome data were extracted by 2 reviewers (NG, NM), and cross-checked in detail by the senior author (SAA). The data extraction spreadsheet and detailed risk of bias assessments per outcome or group of outcomes with justifications are available online also for third-party use under the Creative Commons license, after approval by the senior author [18].

Risk of bias assessment

We performed de novo risk of bias assessments using RoB-2 for randomized trials and ROBINS-I for observational studies [25, 26]. Risk of bias assessments were performed by 2 reviewers (NG, NM) and cross checked by the senior author in detail (SAA). For the purposes of outcome-specific risk of bias assessment, outcomes were grouped as follows: 30-day complications Clavien–Dindo; dysphagia; reoperation; quality of life. Visual summarization of risk of bias was performed using the robvis tool [27].

Statistical analysis

We conducted a random-effects meta-analysis [28, 29] to synthesize the evidence for the research questions quantitatively. For the binary outcomes, we extracted the number of events and the sample size of each group, and we estimated the risk ratio (RR) for each outcome along with the corresponding 95% confidence interval (CI). A continuity correction was applied to studies with zero-cell counts. For the continuous outcome of quality of life, we extracted the sample size and the mean with the corresponding standard deviation (SD) for each group. We estimated the standardized mean difference (SMD) because different scales were used to measure quality of life across studies. One study reported subgroup data on patients undergoing repair with a synthetic mesh or a biologic mesh [30]. For both subgroups, the sample size, the mean, and the corresponding confidence interval were provided. We followed two approaches to synthesize this evidence. First, we meta-analyzed the subgroups using the metamean function of the meta package [31] to obtain the pooled effect. Second, we calculated a weighted mean and the pooled standard deviation using Cohen’s formula [32]. Both approaches provided similar results. The restricted maximum likelihood estimator [33] was used for the between-study variance (heterogeneity). We explored heterogeneity via the I² statistic, which describes the percentage of the variability in effect estimates that is due to heterogeneity rather than sampling error. We further explored heterogeneity by computing the Q-statistic and the 95% prediction intervals, which show the plausible range of values for the effect size in a future trial. Because of the small number of studies for each outcome (< 10 studies), it was not possible to test for small-study effects via Egger’s test [34], and visual inspection of funnel plot symmetry is not advised when few studies are available.
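
The second approach to combining the synthetic and biologic mesh subgroups (a weighted mean with a pooled standard deviation) can be sketched as below. Two assumptions are made that the text does not confirm: each subgroup SD is recovered from its 95% confidence interval with a normal approximation, and "Cohen's formula" is taken to be the usual pooled SD weighted by degrees of freedom. All numbers and function names are hypothetical.

```python
from math import sqrt

Z_95 = 1.959964  # normal quantile for a two-sided 95% interval


def sd_from_ci(lower, upper, n, z=Z_95):
    """Recover a group SD from the CI of its mean (normal approximation)."""
    return sqrt(n) * (upper - lower) / (2 * z)


def combine_subgroups(n1, mean1, sd1, n2, mean2, sd2):
    """Sample-size-weighted mean and pooled SD of two subgroups."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    sd = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n - 2))
    return n, mean, sd


# Hypothetical subgroup summaries: (n, mean, CI lower, CI upper).
synthetic = (36, 7.0, 6.0, 8.0)
biologic = (64, 8.0, 7.5, 8.5)

sd_syn = sd_from_ci(synthetic[2], synthetic[3], synthetic[0])
sd_bio = sd_from_ci(biologic[2], biologic[3], biologic[0])
n, mean, sd = combine_subgroups(synthetic[0], synthetic[1], sd_syn,
                                biologic[0], biologic[1], sd_bio)
print(n, round(mean, 2), round(sd, 2))
```

Note that this pooled SD reflects within-subgroup spread only; it does not add the variability due to the difference between the two subgroup means.
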
The fixed-effect (also known as common-effect) model was applied to all analyses as a sensitivity analysis. A subgroup analysis is also presented for Q3 between the cohort studies and the one randomized study, to examine whether the two study designs give different results. Another subgroup analysis was conducted for Q2 between the studies that used a synthetic mesh and those that used a biologic or mixed synthetic and biologic mesh. Additionally, we ran proportion meta-analyses to calculate the baseline risk of each outcome. All analyses were performed in R version 4.0.3 [35] using the meta package [31].
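
To make the pooling steps concrete, the following is a minimal sketch of a risk-ratio meta-analysis with a continuity correction, a common-effect estimate, a random-effects estimate, Q and I². It is not the authors' analysis: the analyses reported here were run in R with the meta package and the restricted maximum likelihood estimator, whereas this sketch substitutes the simpler DerSimonian-Laird estimator of the between-study variance, omits prediction intervals, and uses hypothetical trial counts and function names.

```python
from math import exp, log, sqrt

Z_95 = 1.959964  # normal quantile for a 95% confidence interval


def log_risk_ratio(e1, n1, e2, n2, correction=0.5):
    """Log RR and its variance; `correction` is added to every cell
    of the 2x2 table when any cell is zero."""
    a, b, c, d = e1, n1 - e1, e2, n2 - e2
    if 0 in (a, b, c, d):
        a, b, c, d = (x + correction for x in (a, b, c, d))
    y = log((a / (a + b)) / (c / (c + d)))
    v = 1 / a - 1 / (a + b) + 1 / c - 1 / (c + d)
    return y, v


def pool(effects, tau2=0.0):
    """Inverse-variance pooled estimate and its standard error."""
    weights = [1 / (v + tau2) for _, v in effects]
    estimate = sum(w * y for w, (y, _) in zip(weights, effects)) / sum(weights)
    return estimate, sqrt(1 / sum(weights))


def summarize(estimate, se):
    """Back-transform to (RR, lower, upper)."""
    return tuple(exp(x) for x in (estimate, estimate - Z_95 * se, estimate + Z_95 * se))


def meta_analysis(studies):
    """Pool two or more studies given as (events_1, n_1, events_2, n_2)."""
    effects = [log_risk_ratio(*s) for s in studies]
    k = len(effects)
    common, common_se = pool(effects)
    w = [1 / v for _, v in effects]
    q = sum(wi * (y - common) ** 2 for wi, (y, _) in zip(w, effects))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # DerSimonian-Laird, not REML
    i2 = 100 * max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0
    random_est, random_se = pool(effects, tau2)
    return {"common": summarize(common, common_se),
            "random": summarize(random_est, random_se),
            "Q": q, "I2": i2, "tau2": tau2}


# Hypothetical trials: (events mesh, n mesh, events suture, n suture).
example_studies = [(2, 50, 6, 50), (0, 40, 3, 42), (4, 60, 5, 58)]
result = meta_analysis(example_studies)
print(result)
```

When the estimated between-study variance is zero, the random-effects and common-effect results coincide, which is one reason the common-effect model serves as a sensitivity analysis.
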

Assessment of the certainty of evidence

We constructed GRADE evidence profiles of certainty for each comparison separately and for each outcome using MAGICapp. The certainty of evidence is determined by the risk of bias across studies, incoherence, indirectness, imprecision, publication bias and other parameters [35]. To inform calculations of absolute effect differences, we performed proportion meta-analyses of frequencies of baseline risks/effects provided by the source studies; these are available in the online appendix [18].
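The step from a relative effect to an absolute effect difference can be illustrated with a short sketch: a baseline risk is multiplied by the risk ratio and the difference is expressed per 1000 patients, then compared with the minimal important differences listed earlier. The baseline risk and risk ratio below are hypothetical, the function names are invented for this example, and the minimal important differences are those stated in this guideline.

```python
# Minimal important differences per 1000 patients, as set in this guideline.
MID_PER_1000 = {
    "30-day complications Clavien-Dindo >= 3": 10,
    "30-day complications Clavien-Dindo <= 2": 50,
    "dysphagia": 50,
    "reoperation": 50,
}


def absolute_difference_per_1000(baseline_risk, risk_ratio):
    """Absolute risk difference per 1000 patients implied by a risk ratio
    applied to a baseline (comparator) risk."""
    return 1000 * baseline_risk * (risk_ratio - 1)


def exceeds_mid(difference_per_1000, outcome):
    """True if the absolute difference is at least as large as the MID."""
    return abs(difference_per_1000) >= MID_PER_1000[outcome]


# Hypothetical inputs: 8% baseline reoperation risk, RR 0.50 (95% CI 0.25 to 1.00).
baseline = 0.08
point = absolute_difference_per_1000(baseline, 0.50)
ci_low = absolute_difference_per_1000(baseline, 0.25)
ci_high = absolute_difference_per_1000(baseline, 1.00)
print(round(point), round(ci_low), round(ci_high), exceeds_mid(point, "reoperation"))
```

In this hypothetical case the point estimate (40 fewer per 1000) falls short of the 50 per 1000 threshold while one confidence limit crosses it, which is the kind of pattern that bears on imprecision judgements.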

Evidence-to-decision framework and development of recommendations

The guideline panel reviewed the evidence tables and the stratified rankings. In an in-person consensus meeting, panel members provided their judgements on:

  • the magnitude of benefit of each intervention

  • the magnitude of harm of each intervention

  • the certainty of the evidence for each intervention

  • any variability in patients' values and preferences

  • costs or savings related to each intervention

  • effect of each intervention on equity

  • acceptability of each intervention

  • feasibility of each intervention.

Panel members then participated in an online Delphi process to formulate the recommendations. A draft of the recommendations was developed by the steering group, and panel members were invited to anonymously propose modifications.

Amendments to the protocol

Following public input from EAES members, we included a second gastroenterologist with expertise in upper gastrointestinal manometry as a panel member (REP). One panel member participated from the outset up to the online prioritization of outcomes and the setting of minimal important differences. Because he was unable to participate in the consensus meeting, he was replaced by another member from the same stakeholder group, who participated fully in the remainder of the process (FB). Due to the very low certainty of the evidence on Q3, we used a structured observation form to document experiential evidence from panel members, which informed the benefits/harms domain of the evidence-to-decision framework [36].

Results

We identified 2 observational studies on Q1 [37, 38], 14 reports of 9 randomized trials on Q2 [10, 30, 39,40,41,42,43,44,45,46,47,48,49,50], and 11 observational studies [51,52,53,54,55,56,57,58,59,60]/1 randomized trial [61] on Q3. Excluded records on first- and second-level screening, with reasons, and PRISMA 2020 flow charts are available in the online appendix [18].

Subgroup analyses comparing biologic or biosynthetic versus non-absorbable mesh were not performed, because the studies did not provide subgroup data.

Sensitivity analyses of studies reporting on non-absorbable mesh versus studies reporting on both absorbable and non-absorbable meshes did not suggest comparative effect differences (see statistical analyses in the online appendix [18]).

Six out of 9 panel members agreed with the recommendation on surgery versus conservative management in the general population (conditional recommendation) after 2 Delphi rounds (2 out of 9 disagreed, 1 out of 9 had no opinion). There was unanimous consensus with regards to the recommendation on surgery versus conservative management in frail patients. Seven out of 9 panel members agreed with the recommendation on mesh over suture repair after 2 Delphi rounds (1 out of 9 disagreed, 1 out of 9 had no opinion). There was unanimous consensus with regards to the recommendation on antireflux surgery versus gastropexy.

The evidence tables are provided in Tables 1, 2, and 3, and the evidence-to-decision frameworks are summarized in Tables 4, 5, and 6.

Table 1 Evidence table on Q1: Should asymptomatic/minimally symptomatic paraesophageal hernias be managed conservatively or with surgery?
Table 2 Evidence table on Q2: Should a mesh versus sutures only be used for closure of the hiatus in paraesophageal hernia repair?
Table 3 Evidence table on Q3: Should a gastropexy versus a fundoplication be performed in paraesophageal hernia repair?
Table 4 Summary of evidence-to-decision considerations on Q1: Should asymptomatic/minimally symptomatic paraesophageal hernias be managed conservatively or with surgery?
Table 5 Summary of evidence-to-decision considerations on Q2: Should a mesh versus sutures only be used for closure of the hiatus in paraesophageal hernia repair?
Table 6 Summary of evidence-to-decision considerations on Q3: Should a gastropexy versus a fundoplication be performed in paraesophageal hernia repair?

Recommendations

We suggest surgery over conservative management for asymptomatic/minimally symptomatic paraesophageal hernias. (conditional recommendation).

We recommend conservative management over surgery for asymptomatic/minimally symptomatic paraesophageal hernias in frail patients. (strong recommendation).

We suggest mesh over sutures for hiatal closure in paraesophageal hernia repair. (conditional recommendation)*

We suggest fundoplication over gastropexy in elective paraesophageal hernia repair. (conditional recommendation).

We suggest gastropexy over fundoplication in patients who have cardiopulmonary instability and require emergency paraesophageal hernia repair. (conditional recommendation).

A strong recommendation means that the proposed course of action is appropriate for the vast majority of patients. A conditional recommendation means that most patients would opt for the proposed course of action, and joint decision-making of the surgeon and the patient is required.

*Please see the accompanying evidence-to-decision table.

Discussion

Implications for policy makers

A policy of operating on asymptomatic or minimally symptomatic patients is suggested by an interdisciplinary panel of stakeholders. Mesh reinforcement of the hiatus in paraesophageal hernia repair, which is suggested here, requires availability of appropriate prosthetic materials.

Implications for healthcare professionals

This interdisciplinary report suggests a change of practice by surgeons who follow a conservative management for asymptomatic and minimally symptomatic paraesophageal hernias, and primary suture closure of the hiatus. The evidence was of low or very low quality; however, the panel followed a structured, transparent evidence-informed decision framework considering risks and benefits, acceptability, feasibility, equity, cost and patients’ values and preferences.

Careful and robust discussion during the formulation of this guideline centered on the appropriate context for applying these recommendations, and particularly on the nuances in management of this heterogeneous condition. The conditional recommendation in favor of surgical management for patients with asymptomatic and minimally symptomatic paraesophageal hernias is underpinned by two main premises. The first is assessment and confirmation that the patient is ‘fit’ to receive all available treatment options, including surgical intervention. The second is that the paraesophageal hernia itself is of sufficient size either to be causing the symptoms or to be at risk of future complications. Thus, the context of this guideline and much of the discussion focused on moderate to large paraesophageal hernias type II to IV with at least 50% of the stomach herniated into the thoracic cavity.

Based upon the best available published evidence, a conditional recommendation was also made in favor of mesh over primary suture repair in the surgical treatment of paraesophageal hernias using the definition described above. Importantly, within the available evidence, there was substantial heterogeneity regarding the type of mesh, orientation of mesh and method of mesh fixation, which precluded a more in-depth analysis of the subtleties of mesh utilization. Further, it was acknowledged during the consensus meeting that the available evidence regarding mesh utilization is largely based upon historical randomized controlled trials, and thus requires careful consideration in application to modern surgical practice. The most striking benefit of mesh utilization in primary paraesophageal hernia repair was a reduction in the incidence of reoperation, which is often used as a surrogate for recurrence of paraesophageal hernia in the absence of routine imaging. However, as was discussed extensively in the consensus meeting, the threshold for reoperation may be higher, especially in the context of previous mesh repair, representing an inherent bias within the existing literature. It is important to note that the net benefit is expected to be higher in patients at high baseline risk for hernia recurrence (e.g., patients with suspected or confirmed connective tissue disorder, for example, those with groin or umbilical hernia, Marfan syndrome, those with abdominal aortic aneurysm, patients at an advanced age, or under immunosuppression).

Patient involvement in decision-making is key to the application of the recommendations, as most are based upon low or very low certainty evidence and are subject to variable values and preferences.

We have not involved primary care representatives in the panel, because we considered that patients with symptomatic paraesophageal hernias are almost invariably referred to surgeons or gastroenterologists. Nevertheless, symptomatic and asymptomatic paraesophageal hernias may be followed up in the long term by primary care providers, and this may represent an important group of stakeholders that should be included in future updates of these guidelines.

Implications for patients

Patients can be informed on the uncertainty of the evidence, the relative expected merits and risks of each management option, along with the surgeon's preferences. This document provides valuable information on the relative effects of mesh versus suture hiatal repair to assist decision-making.

Implications for researchers

The natural history of hiatal hernia, especially with regard to the evolution of symptoms, is largely unknown. Evidence on the relative merits of conservative management versus surgery in various patient subgroups does not exist. Further, current evidence on the comparative effects of mesh versus suture hiatal repair lacks precision.

The following are expected to address these evidence gaps:

  • longitudinal cohort studies on conservative management of hiatal hernias with repeated measurements of quality of life and hernia characteristics.

  • multicenter matched cohort studies on conservative management versus surgery.

  • randomized trials comparing mesh versus suture repair, reporting classified complication data (e.g., Clavien–Dindo), dysphagia, reoperation and quality of life; or

  • individual patient data meta-analysis of existing randomized trials.

Barriers and facilitators

Individual and institutional change of practice is the primary barrier to implementation. This document aims to serve as a reliable source of summary evidence using anchor-based minimal important differences to inform decision-making.

Decision aids available on MAGICapp (https://app.magicapp.org/#/guideline/j7q7Gn) and the evidence tables can assist healthcare professionals and patients in choosing the most appropriate intervention tailored to individual patient characteristics and preferences.

Monitoring

Summary intervention effects of mesh hiatal repair may serve as quality assurance anchors:

  • major morbidity: 6–20%

  • dysphagia: 3–28%

  • reoperation: 2–9%

Validity period

An average of 1.6 reports per year was published on Q2. One trial addressing Q3 with estimated completion date in 2025 (ClinicalTrials.gov Identifier: NCT04007952) and another trial addressing Q2 with estimated primary completion date in 2024 (ClinicalTrials.gov Identifier: NCT05201508) were identified in a scoping search of clinicaltrials.gov. We do not anticipate substantial change in effect estimates within the next 7 years. This document is valid until December 2030.

Update

We plan to update this guideline in 2030, unless substantial new evidence becomes available earlier.

Conclusion

This guideline provides recommendations on the management of paraesophageal hernias based on best available evidence, developed by an interdisciplinary European panel of stakeholders using a structured, trustworthy methodology.