Heller myotomy has been the standard of treatment for achalasia since the 1950s [1]. Dividing the extramucosal muscular fibers at the esophagogastric junction (EGJ) relieves elevated pressure at the lower esophageal sphincter (LES) and improves the hallmark symptoms of dysphagia and regurgitation. There have been a number of modifications to the original myotomy, most notably the addition of an anti-reflux procedure to account for the loss of the anti-reflux barrier of the EGJ. Studies show that a partial fundoplication reduces postoperative gastroesophageal reflux disease (GERD) without affecting the improvement in esophageal clearance [2,3,4]. However, GERD remains a consideration throughout the postoperative course. Institutions use a combination of subjective symptom reporting and objective assessment in the diagnosis of postoperative GERD [5]. If not well controlled, chronic esophageal inflammation can cause recurrent dysphagia due to stricture, necessitating reintervention. Chronic inflammation due to uncontrolled reflux may also be one of the contributing factors to the higher rate of esophageal cancer seen in patients with achalasia [6].

A burgeoning consideration in achalasia research is the establishment of per-oral endoscopic myotomy (POEM) as a potential first-line treatment. POEM is safe and efficacious with roughly comparable relief of dysphagia, yet it has consistently demonstrated higher rates of GERD than laparoscopic Heller myotomy (LHM) [7,8,9]. One randomized controlled trial noted reflux esophagitis in 44% of POEM cases relative to 29% in LHM with fundoplication [8]. Another study reported abnormal acid exposure in 39% versus 17% [9].

The incidence of GERD following laparoscopic Heller myotomy with Toupet fundoplication (LHM+T) has been explored in the literature. Reported GERD rates range from 12 to 39% [2,3,4, 10, 11]; however, detailed evaluation of the pH studies themselves is sparse. The clinical and scientific assessment of GERD relies heavily on pH study or symptom reporting, rarely both. Reading the pH tracing itself is important in the post-treatment achalasia patient, because summary numbers alone (e.g., % acid exposure) can be deceiving. It is crucial that these diagnostic nuances are broadly applied to emerging research and ongoing practice in the field. This study leverages our experience over the last decade to establish the incidence of pathological reflux after LHM+T for achalasia using pH studies in concert with symptom monitoring.

Methods

A single-institution retrospective review of adult patients (≥ 18 years old) with achalasia who underwent LHM+T from January 2012 to April 2022 was performed. Patients with achalasia confirmed by high-resolution manometry (HRM), upper endoscopy (EGD), and upper gastrointestinal imaging were included. Patients who underwent concurrent procedures, had a history of prior myotomy or fundoplication, or did not meet the definition of achalasia were excluded. Patients completed a symptom questionnaire at their initial evaluation and completed the same questionnaire at their routine 6-month follow-up for pH testing. We defined GERD symptoms based on questionnaire-reported severe heartburn, heartburn requiring proton pump inhibitor (PPI) therapy, and regurgitation symptoms.

Demographics, comorbid conditions, preoperative diagnostic testing, and postoperative follow-up data were collected. HRM testing was reviewed, and the type of achalasia based on the Chicago Classification was recorded along with the integrated relaxation pressure (IRP). pH monitoring studies at our institution were performed with 48-h Bravo probe testing (Medtronic, Minneapolis, MN), 24-h dual probe testing, or 24-h pH-impedance testing (Diversatek, Highlands Ranch, CO) depending on the need for concomitant endoscopy, tolerance of a nasal probe, or patient preference. Data points extracted from pH studies included DeMeester score, percent acid reflux, number of reflux episodes, longest reflux episode, and presence of a poor esophageal clearance pattern. Importantly, all tracings were performed and read by the authors. Postoperative patient-reported reflux symptoms on review of systems and patient symptom questionnaires were recorded at routine 6-month follow-up appointments.

Descriptive statistics were used to report patient characteristics. Differences between preoperative and postoperative questionnaires were explored via the Kruskal–Wallis test. Differences in patient-reported reflux symptoms in relation to a normal or abnormal pH study were explored via chi-square tests and Fisher’s exact tests where appropriate. Two-sample t tests were used to explore all other differences in pH study results. For all analyses, a p value ≤ 0.05 was considered statistically significant. Missing data were excluded from calculations that were specific to that field. Analyses were performed using IBM SPSS Statistics (version 28). The University of Washington Institutional Review Board approved this study.

Results

There were 170 patients who underwent LHM+T from January 2012 to April 2022. The median [IQR] age was 51 [34, 65] years, and 48% were female. Thirty-six (21.2%) had type 1 achalasia, 108 (63.5%) had type 2 achalasia, 9 (5.3%) had type 3 achalasia, and 17 (10%) did not have their preoperative manometry tracing available to review (Table 1). Of the 170 patients, 51 (30%) underwent routine 6-month pH testing; of those, 50 (98%) had patient-reported symptom data and 30 (59%) completed a full patient symptom questionnaire (Fig. 1). These 51 patients make up our cohort.

Table 1 Preoperative patient characteristics
Fig. 1 Patient symptom questionnaire

For manometry findings, the median [IQR] preoperative IRP was 28.9 [22, 34.9] mmHg and the postoperative IRP was 7 [3.5, 11] mmHg (p < 0.001). For routine 6-month postoperative pH studies (Table 2), the median [IQR] DeMeester score was 4.6 [1, 15.5] and the mean (SD) percent distal acid exposure was 3.8% (6.4). Eleven patients (22%) had an abnormal pH study, defined as a DeMeester score greater than 14.72 and percent distal acid exposure greater than 6%. On manual review, 5 of 11 (45.5%) demonstrated low-frequency, long-duration reflux events, suggesting poor esophageal clearance of gastric refluxate (Fig. 2). In contrast, 6 (54.5%) had typical reflux episodes on their pH study. The average longest reflux episode was 11 min for patients with typical reflux episodes compared to 58 min for patients with a poor esophageal clearance pattern (p = 0.01). Therefore, the true rate of pathologic GERD was 11.7% (6/51).
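The arithmetic behind these proportions can be made explicit. The following is a minimal Python sketch that uses only the counts reported above (51 studies, 11 numerically abnormal, 5 with a poor clearance pattern on manual review). The variable names are ours, chosen for illustration, and do not correspond to any software used in the study.

```python
from fractions import Fraction

# Counts taken from the Results text; names are illustrative only.
cohort = 51          # patients with routine 6-month pH testing
abnormal = 11        # numerically abnormal pH studies
poor_clearance = 5   # abnormal studies with a poor clearance pattern

# Abnormal studies that remain once poor-clearance tracings are set aside.
typical_reflux = abnormal - poor_clearance

# Exact fractions avoid rounding drift when comparing rates.
numeric_rate = Fraction(abnormal, cohort)             # abnormal by numbers alone
poor_clearance_share = Fraction(poor_clearance, abnormal)
pathologic_rate = Fraction(typical_reflux, cohort)    # typical reflux only

print(f"abnormal by numbers alone:     {float(numeric_rate):.3f}")
print(f"poor clearance among abnormal: {float(poor_clearance_share):.3f}")
print(f"typical reflux in cohort:      {float(pathologic_rate):.3f}")
```

Counting every numerically abnormal study as GERD would give 11 cases rather than 6, nearly doubling the apparent rate in this cohort.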

Table 2 pH monitoring results and reported reflux symptoms at 6 months
Fig. 2 Postoperative pH monitoring outcomes

For the routine 6-month postoperative symptom questionnaire, the median [IQR] reflux severity score was 1 [0, 3] and the median [IQR] reflux frequency score was 0.5 [0, 1]. Ten (33.3%) reported mild reflux (score from 1 to 3), 4 (13.3%) reported moderate reflux (score from 4 to 6), and 1 (3.3%) reported severe reflux (score from 7 to 10). The median dysphagia severity score improved postoperatively (from 7 to 1, p < 0.001), as did the median regurgitation severity score (from 4 to 0, p < 0.001). Of note, at 6-month routine follow-up, only 8 (15.6%) reported clinically significant GERD symptoms on review of systems. Patients with a typical esophageal reflux pattern on pH testing experienced more GERD symptoms than patients with a normal pH study (50% vs 12.8%, p = 0.03). Those with a poor esophageal clearance pattern (n = 5) reported no concurrent GERD symptoms.
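To illustrate how a comparison of this kind is computed, the sketch below runs an uncorrected Pearson chi-square test on a 2×2 table using only the Python standard library. The cell counts are an assumption: they are reconstructed from the reported percentages (3 of 6 symptomatic with a typical reflux pattern, 5 of 39 symptomatic with a normal pH study) and are not taken from the study tables. The function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# ASSUMED counts, reconstructed from the reported percentages:
#   typical reflux pattern: 3 symptomatic, 3 asymptomatic (50%)
#   normal pH study:        5 symptomatic, 34 asymptomatic (12.8%)
stat, p = chi_square_2x2(3, 3, 5, 34)
print(f"chi-square = {stat:.2f}, p = {p:.3f}")
```

Under these assumed counts the p value comes out at roughly 0.03, in line with the reported figure. Because some expected cell counts in such a small table are low, an exact test may be preferable in practice; this sketch does not establish which test was applied to this particular comparison.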

Discussion

To our knowledge, this is the largest cohort of LHM+T patients in which routine symptom questionnaires are reported together with detailed manual analysis of routine postoperative pH studies.

In our series, we demonstrate a clinically significant reduction in the LES IRP with excellent improvement in subjective dysphagia and regurgitation symptomatology. This finding is in line with the known historical success of LHM [12,13,14,15,16,17]. Furthermore, with the routine addition of a Toupet fundoplication, this is accomplished with a low rate of postoperative GERD symptoms. The rate of clinically significant patient-reported postoperative GERD symptoms was 15.6%, which is similar to or lower than rates reported in the current literature [3, 11].

In addition to recording subjective GERD symptoms, we correlated them with routine postoperative pH monitoring studies. We found that the median postoperative DeMeester score as well as the percent distal acid exposure were in the normal range for the cohort. Furthermore, only 11.7% of patients with routine 6-month postoperative pH testing had a pattern of typical GERD. By comparison, two studies conducted 6-month pH testing following LHM+T and reported GERD incidences of 33% and 34% [3, 11]. Our lower rates of objective GERD findings may be due to our discrimination between a typical reflux pattern (due to incompetency of the anti-reflux barrier) and a pattern of poor esophageal clearance (in which the barrier is likely functioning normally). Differences in institutional surgical technique may also contribute.

In assessing postoperative objective GERD rates for our cohort, we found that while 22% of pH studies were abnormal based solely on numerical acid exposure, only about half of these reflected a typical reflux pattern rather than a poor esophageal clearance pattern. Many studies consider an abnormal pH study, defined as a DeMeester score greater than 14.72, to be synonymous with GERD [3, 10, 11]. However, in this patient population, computer-generated reports may overestimate GERD rates. Of note, our patients with a poor clearance pattern on pH study reported no associated GERD symptoms. Additionally, we found that the duration of reflux episodes distinguished a typical acid reflux pattern from a poor esophageal clearance pattern. Specifically, our findings demonstrated an association between a poor esophageal clearance pattern and an average longest reflux episode of > 20 min. Two smaller studies have reported on this topic, suggesting that between one-third and one-half of abnormal pH studies are attributable to poor esophageal clearance rather than a typical acid reflux pattern [18, 19]; the findings in our larger cohort are consistent with and confirm these reports.
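One way to picture how these observations could inform screening is sketched below in Python. It applies the numerical criteria for an abnormal study used in this series (DeMeester score > 14.72 and distal acid exposure > 6%) and then uses the longest reflux episode (> 20 min) only to suggest which pattern a reader might expect on the tracing. This is a hypothetical triage heuristic, not the method used in this study, where classification rested on manual review; the class, function names, and example values are invented for illustration.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; treating them as hard cut-offs is an assumption.
DEMEESTER_CUTOFF = 14.72
ACID_EXPOSURE_CUTOFF_PCT = 6.0
LONG_EPISODE_CUTOFF_MIN = 20.0


@dataclass
class PhStudy:
    demeester: float
    acid_exposure_pct: float
    longest_episode_min: float


def is_numerically_abnormal(study: PhStudy) -> bool:
    """Abnormal by summary numbers alone, before any tracing review."""
    return (study.demeester > DEMEESTER_CUTOFF
            and study.acid_exposure_pct > ACID_EXPOSURE_CUTOFF_PCT)


def triage(study: PhStudy) -> str:
    """Suggest what to look for on the tracing; never a final diagnosis."""
    if not is_numerically_abnormal(study):
        return "normal"
    if study.longest_episode_min > LONG_EPISODE_CUTOFF_MIN:
        return "abnormal: possible poor clearance, review tracing"
    return "abnormal: possible typical reflux, review tracing"


# Hypothetical studies, invented for illustration (not patient data).
examples = [
    PhStudy(demeester=4.6, acid_exposure_pct=3.8, longest_episode_min=5.0),
    PhStudy(demeester=30.0, acid_exposure_pct=9.0, longest_episode_min=58.0),
    PhStudy(demeester=25.0, acid_exposure_pct=8.0, longest_episode_min=11.0),
]
for study in examples:
    print(triage(study))
```

Both abnormal branches still end in a tracing review, reflecting the point that summary numbers alone cannot separate barrier incompetence from poor clearance.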

Few papers utilize and attempt to correlate routine symptom questionnaires and objective pH testing after LHM with fundoplication [11, 20,21,22]. However, detailed symptom questionnaires used in concert with objective pH testing are important in delineating clinically significant postoperative GERD rates and silent GERD rates. Among the few studies that correlate subjective and objective GERD findings, symptom correlation to pH monitoring ranges from 7 to 46.7% [11, 21, 22]. We found overall higher rates of symptom correlation in our cohort, with half of patients with a typical GERD pattern on pH study reporting concurrent GERD symptoms. Still, 50% had silent GERD: abnormal pH studies and no associated GERD symptoms. In the literature, silent reflux rates after LHM are reported as high as 68.5% [11]. This lack of correlation is documented but remains incompletely understood. Ponds et al. hypothesized that acid sensitization is the culprit, as it is associated with the “impaired mucosal integrity, increased activation of oesophageal nociceptors and visceral sensitisation” seen in achalasia [23]. While they found no supporting evidence for esophageal hyposensitivity as the basis for silent reflux, hypersensitivity has been found to be associated with positive reflux symptomatology in treated achalasia patients. Symptom correlation to objective GERD findings in this population is therefore variable, and routine pH monitoring in post-treatment achalasia patients is important so that patients with reflux may be identified and treated appropriately.

Limitations of our study include incomplete follow-up and retrospective design. While we ask all patients to return at 6 months for repeat manometry and pH testing, only 30% did so, and it is therefore possible that our sample is not representative of the entire LHM+T cohort. However, we expect that the study results skew toward symptomatic patients; thus, the incidence of GERD in this study may in fact be inflated. It is possible that patients sought care outside of our medical record system, though in our region we are the only academic medical center and our lab does the majority of esophageal function testing for other hospital systems. Despite these limitations, the results are consistent with other reports in the literature, and the size of the cohort supports these findings, which are important for the evaluation of post-treatment achalasia patients.

Conclusion

After LHM+T, relief of dysphagia is high and the incidence of GERD is relatively low; LHM+T remains the standard against which POEM and other treatments should be compared. Patient-reported GERD symptoms do not reliably reflect reflux burden, and poor esophageal clearance patterns are a common cause of an abnormal-appearing pH study. Manual review of pH testing is essential to distinguish true GERD from poor esophageal clearance. Since GERD is a common dissatisfier and a known cause of long-term recurrent dysphagia [24], it is important to understand its incidence when comparing treatments for achalasia. Therefore, all outcomes studies in achalasia should include manually read pH monitoring studies to accurately represent the results of the intervention. Pneumatic dilation, LHM, and POEM are all accepted interventions for achalasia, but more complete outcome studies using the methodology described here are needed to understand whether there are relevant differences in outcomes that should affect provider recommendations and patient choices.