Introduction

Achalasia is a relatively rare esophageal motility disorder characterized by impaired lower esophageal sphincter (LES) relaxation and the absence of esophageal peristalsis1. Although the pathogenesis of achalasia is still unknown (so no definitive therapy is available), effective and durable palliation of the related dysphagia has been achieved in most patients by disrupting the LES muscle fibers with forceful endoscopic pneumatic dilations (PD) or by dividing them by means of a laparoscopic Heller myotomy (LHM)2. A new endoscopic procedure, per-oral endoscopic myotomy (POEM), was recently introduced3 and is being used increasingly throughout the world. It achieves good short- to midterm results and may be a candidate for replacing LHM (and PD, too) as the first-line therapy for achalasia, if the expertise is available4.

The results of ongoing randomized controlled trials comparing POEM with LHM are not yet available, and POEM has so far been evaluated only by comparing the results of newly published case series with the established results of LHM studies5,6,7,8,9,10. Several meta-analyses have also been published recently, confirming that POEM provides the same short- and midterm results as the traditional techniques11,12,13,14,15. These studies raised some concerns, however, about post-procedural complications such as gastroesophageal reflux, which was reported at a higher rate than after LHM16,17,18.

The aim of the present study was therefore to compare the short- and midterm outcomes of POEM and of laparoscopic Heller myotomy with Dor fundoplication (LHD) for the treatment of esophageal achalasia, in order to verify whether the former can realistically replace the latter as a first-line treatment. Using propensity score (PS) matching19, a series of consecutive patients was evaluated in this two-center case-control study performed in two high-volume Italian institutions, one with extensive experience with POEM20 and the other with LHD21.

Materials and Methods

Patients

All consecutive patients with a diagnosis of achalasia who underwent POEM at the Digestive Endoscopic Unit of the Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy, from January 1, 2014 to December 31, 2017, entered the study. The same applied to all consecutive achalasia patients who underwent LHD at the Department of Surgical, Oncological and Gastroenterological Sciences of the University of Padua, Padua, Italy, during the same period.

Patients who had already undergone surgery or POEM for esophageal achalasia were excluded, as were patients treated during the so-called learning curve (the first 20 cases of each operator22). Patients who underwent the procedure during “teaching” courses or master classes, patients with missing data, and patients ultimately diagnosed with a motor disorder other than achalasia were also excluded (Figure 1). Conversely, patients who had received unsuccessful “traditional” endoscopic treatment (PD and/or Botox injections) were included in the study. The patients’ data, procedure details, and postoperative follow-up were prospectively recorded in a dedicated database.

Fig. 1

Flowchart of patient recruitment: with propensity score matching (caliper 0.2), 140 patients per group were strictly matched

All patients at both Centers signed an informed consent form to have their data recorded in a dedicated database. Since the tests and the follow-up method were those normally used in both Centers to evaluate the results of treatment, formal IRB approval of the study was not necessary. Verbal consent was obtained from participants contacted by telephone. The study was conducted in accordance with the ethical principles of the Declaration of Helsinki.

Preoperative Assessment

The disease was diagnosed on the basis of well-established radiological, endoscopic, and manometric criteria23. Patients’ clinical and demographic data were collected prospectively by means of a questionnaire, and symptoms were assessed using the Eckardt score24, which rates dysphagia, food regurgitation, chest pain, and weight loss from 0 (= absent) to 3 (= symptoms at every meal and > 10 kg weight loss), with a maximum score of 12 (Table 1). The maximum esophageal diameter was measured at the barium-air interface in the standard anteroposterior image obtained during a barium swallow. Patients were classified according to their maximum esophageal diameter and the shape of the esophagogastric passage as follows: grade I, 4 cm or less; grade II, 4 to 6 cm; grade III, 6 cm or more; grade IV, 6 cm or more and/or a sigmoid-shaped esophagus25. Endoscopy was always performed to rule out any malignant disease.

Table 1 The Eckardt’s score for the evaluation of symptoms of achalasia
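The scoring rule described above can be illustrated with a minimal Python sketch. The function and parameter names are hypothetical and not taken from the study database; the success threshold (total score ≤ 3, or no retreatment) follows the definition of treatment failure used in this paper.

```python
def eckardt_score(dysphagia, regurgitation, chest_pain, weight_loss):
    """Sum the four Eckardt components, each graded 0 (absent) to 3."""
    components = (dysphagia, regurgitation, chest_pain, weight_loss)
    for value in components:
        if value not in (0, 1, 2, 3):
            raise ValueError("each Eckardt component must be 0, 1, 2 or 3")
    return sum(components)  # total ranges from 0 to 12


def is_treatment_failure(score, retreated=False):
    """Failure as defined in this paper: Eckardt score > 3 or need for retreatment."""
    return score > 3 or retreated


# Hypothetical patient: dysphagia 2, regurgitation 1, no chest pain, weight loss 1
print(eckardt_score(2, 1, 0, 1))
```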

High-Resolution Esophageal Manometry

All the patients underwent high-resolution manometry (HRM) of the esophagus. Both Centers had the same instrumentation and followed the same manometric protocol, described in detail elsewhere26. HRM was performed using a catheter 4.2 mm in diameter with 36 solid-state circumferential sensors spaced at 1-cm intervals and spanning the whole esophagus (Medtronic, Minneapolis, MN, USA). The HRM assembly was placed transnasally and the manometric catheter positioned to record from the hypopharynx to the stomach, with approximately five intragastric sensors. The manometric protocol included a 5-min period to assess basal LES pressure and ten 5-ml swallows of saline solution (with a standardized electrolyte concentration to ensure proper catheter function), separated by intervals of at least 20 s. An average 4-s integrated relaxation pressure (4sIRP) greater than 15 mmHg was considered indicative of impaired LES relaxation. The manometric data were analyzed using ManoView™ software (Medtronic, Minneapolis, MN, USA). The Chicago classification (v. 3.0)27 was used for the HRM findings, defining achalasia as: pattern I when there was no distal esophageal pressurization to > 30 mmHg in > 8/10 swallows; pattern II when at least 2 test swallows were associated with a panesophageal pressurization > 30 mmHg; and pattern III when patients had at least 20% of premature contractions (distal latency < 4.5 s).
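The subtype rules as worded in this section can be written as a small decision function. This is an illustrative Python sketch only, not a clinical tool: the function name and its inputs are hypothetical, and the order in which the rules are checked (pattern III before pattern II, with pattern I as the remaining case) is an assumption that the text does not state.

```python
def achalasia_pattern(mean_irp_mmhg, panpressurized, premature, swallows=10):
    """Return 'I', 'II', 'III', or None following the thresholds quoted above.

    mean_irp_mmhg  -- average 4-s integrated relaxation pressure (mmHg)
    panpressurized -- swallows with panesophageal pressurization > 30 mmHg
    premature      -- swallows with premature contraction (distal latency < 4.5 s)
    """
    if mean_irp_mmhg <= 15:
        return None  # LES relaxation not impaired by the 15 mmHg cut-off
    if premature / swallows >= 0.20:
        return "III"  # assumption: premature contractions take precedence
    if panpressurized >= 2:
        return "II"
    return "I"


# Hypothetical study: IRP 28 mmHg, 3 panpressurized swallows, none premature
print(achalasia_pattern(28, panpressurized=3, premature=0))
```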

POEM Technique

The procedure was carried out with the technique described by Inoue et al.3 and reported in detail elsewhere16. A high-definition endoscope (GIF-H180 or GIF-H180J; Olympus Co., Tokyo, Japan) and carbon dioxide insufflation (UCR; Olympus Co.) were used in all cases. A triangle-tip knife (TT-knife; Olympus Co.) with spray coagulation current at 50 W (VIO300D; ERBE Elektromedizin GmbH, Tübingen, Germany) was used for both submucosal dissection and myotomy. Myotomy was usually carried out on the anterior esophageal wall. The length of the myotomy was approximately 10–12 cm, including 2–4 cm of the gastric wall. In the mid- and distal esophagus, the myotomy preferably involved the circular bundles of the muscularis propria. At the level of the esophagogastric junction (EGJ) and on the gastric side, the myotomy was full-thickness and also included the longitudinal bundles. Oral feeding was started 24 h after POEM. Any complication was recorded in the patient’s hospital chart and then entered in the database.

Surgical Technique

The surgical technique for LHD was first described28 in 1993 and has changed little since. Briefly, a 7–8-cm-long myotomy was performed after dissecting only the anterior wall of the esophagus, extending the myotomy 1.5–2 cm on the gastric side. During the procedure, a 30 mm Rigiflex balloon was placed inside the esophageal lumen at the level of the cardia, using an endoscopically positioned guide wire. The balloon was then gently inflated with air and deflated while the muscle fibers were being cut. If a mucosal perforation was identified intraoperatively, it was repaired with 1–3 separate 4/0 absorbable stitches. A partial anterior fundoplication according to the technique described by Dor was performed and sutured to the edges of the myotomy with three stitches on each side. The most proximal stitch included the homolateral pillar of the hiatus to keep the fundoplication high around the esophagus. After a negative contrast swallow, a liquid diet was allowed, and patients were given soft food on the second postoperative day (POD). The length of hospital stay depended on the distance patients had to travel from home to the hospital. If a leak was identified on the postoperative contrast swallow, a nasogastric tube was left in place for gastric decompression, antibiotics were administered, and the patient was kept on total parenteral nutrition until a new contrast study (usually 7 days later) showed no leak. If a mucosal lesion was detected and repaired during the surgical procedure, the postoperative contrast swallow was scheduled on the 5th to 7th POD, and the patient was kept on total parenteral nutrition with the nasogastric tube in place. Any complication was recorded in the patient’s hospital chart and then entered in the database.

Follow-up and Outcome

Clinical outcome was assessed at the outpatient clinic by administering the preoperative questionnaire again 2, 6, and 12 months after surgery, and every 2 years thereafter. If the patient failed to return to the outpatient clinic, a telephone interview was conducted. Endoscopy was performed 6 (POEM group only) and 12 months after the operation and then recommended every 24 months. Any esophagitis was rated according to the Los Angeles classification29. Barium swallow was repeated 2 months after the myotomy, then 2 to 4 years later, and whenever patients had symptoms. Esophageal HR manometry and 24-h pH monitoring (according to DeMeester30) were performed 6 months after the procedure. The pH tracings were checked manually by an expert to distinguish true episodes of gastroesophageal reflux from false reflux due to stasis31. Both HR manometry and pH monitoring were performed at the esophageal lab of the corresponding center, whereas barium swallow and endoscopy could be performed at the patient’s local hospital. The follow-up differed slightly between the two centers, since the POEM group also underwent endoscopy 6 months after the operation. Table 2 summarizes the study protocol.

Table 2 Summary of the study protocol for the two groups of patients. (# = POEM group only)

Treatment failure was defined as the persistence or recurrence of an Eckardt score > 3 or the need for retreatment.

Statistical Analysis

For the best possible matching, we used propensity score (PS) matching. This relatively recent statistical method attempts to reconstruct a situation similar to randomization19 when RCTs are not possible or not available. The PS is the probability of receiving treatment A rather than B for a patient with given observed baseline characteristics (potential confounders); these characteristics are thus replaced by a single summary score. This method has also been called a “quasi-randomization” method19. For the calculation of the PS, the following variables were considered: age, sex, duration of symptoms, previous endoscopic treatment(s), Eckardt score, radiological stage, and manometric pattern at HRM. We selected these variables since they may affect the outcome of different achalasia treatments, as reported in a number of previous studies1,21,32. For the best possible pairing of the patients of the two groups, a caliper of 0.2 of the standard deviation of the PS logit was used33.

The data were expressed as medians and interquartile ranges (IQR) for continuous variables and as counts or proportions (%) for categorical variables. Nonparametric tests were used to compare groups (Mann-Whitney and Wilcoxon, as appropriate). Fisher’s exact test was used to compare categorical data. Symptom-control survival estimates were calculated with the Kaplan-Meier method and survival comparisons were performed using the log-rank test. The significance level was set at 5% (p < 0.05). The statistical analyses were performed with the “R” statistical software34.
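For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be sketched as follows. This is an illustrative Python example with invented follow-up data, not the R code used for the analysis; an "event" here stands for symptom relapse (Eckardt score > 3 or retreatment), and patients without an event are censored at their last follow-up. The log-rank comparison is not shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each time with at least one event.

    times  -- follow-up in months for each patient
    events -- 1 if symptoms relapsed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for tt, e in data if tt == t]  # everyone observed at time t
        failures = sum(1 for e in tied if e)
        if failures:
            # Product-limit step: multiply by the fraction surviving time t
            survival *= 1 - failures / at_risk
            curve.append((t, survival))
        at_risk -= len(tied)  # events and censored patients leave the risk set
        i += len(tied)
    return curve


# Hypothetical data: five patients, two relapses (at 6 and 12 months)
for month, prob in kaplan_meier([6, 12, 12, 24, 30], [1, 0, 1, 0, 0]):
    print(month, round(prob, 3))
```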

Results

Patients

From January 1, 2014 to December 31, 2017, 397 consecutive patients underwent POEM at the Digestive Endoscopic Unit of the Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy. The operations were performed by 3 expert staff endoscopists (GC, PF and RL). Fourteen patients were excluded because they had undergone a previous laparoscopic myotomy; 54 patients were excluded because they were operated on during the operators’ learning curve (set at 20 procedures each)22. Nine patients were excluded because they had undergone POEM for Jackhammer esophagus or motor disorders other than achalasia, and 11 because they had been operated on by endoscopists not belonging to the Unit, during master classes held in that period. One last patient was excluded because some parameters needed for the calculation of the PS were missing. The total number of POEM operations considered for the study purposes was therefore 318.

During the same period, 267 patients received LHD at the Clinica Chirurgica 3 Unit of the Department of Surgical, Oncological and Gastroenterological Sciences of the University of Padua, Padua, Italy. The operations were performed by two experienced surgeons (MC and RS) who had already completed their learning curve (also set at 20 cases)35. Three patients were excluded because they had undergone a previous LHM and one because he underwent LHD for a motor disorder other than achalasia. Finally, 21 patients were excluded because some parameters needed for the calculation of the PS were missing. The total number of LHD operations considered for the study purposes was therefore 242 (Figure 1).

With the caliper33 set at 0.2, PS matching allowed the selection of 140 pairs of strictly matched patients, who represent the study population. Table 3 summarizes the demographic and preoperative characteristics of the two groups of patients: no statistically significant differences were found, either in the parameters considered for the PS or in any of the others.

Table 3 Demographic and clinical characteristics of the two groups of patients: no statistically significant differences were found, either in the parameters considered for the PS or in any of the others. Data are expressed as median and IQR. * = parameters considered for the calculation of the PS

The operation was successfully completed in all cases in both groups. Its median duration was 47 minutes (35–57) in the POEM group, compared with 95 minutes (85–105) in the LHD group, a statistically significant difference (p < 0.001). The postoperative stay was also shorter in the POEM group than in the LHD group (2 days [2–2] vs 3 days [3–3], respectively, p < 0.001).

Complications

No mortality was observed in either group. Seven adverse events (Grade II or higher in the Clavien-Dindo classification36) were recorded in the POEM group (5%), all of which resolved with conservative treatment. In particular, 5 mucosal perforations (3.6%) were repaired with endoscopic clips during the same endoscopic procedure. In the LHD group, 3 mucosal perforations were recorded (2.1%), all repaired during the same laparoscopic operation without the need for conversion to open surgery (Table 4). In both groups, these complications resolved without further consequences for the patients, except for a slight prolongation of the hospital stay in some. Further surgery or endoscopic treatment was never required. The frequency of such complications was comparable in the two groups (p = 0.33). One patient in each group died, at 3 months in the POEM group and at 24 months in the LHD group, of causes unrelated to the disease or the procedure (natural causes and prostatic cancer, respectively).

Table 4 Frequency of adverse events in both groups of patients. There was no statistically significant difference between the two methods (p = 0.33, Fisher’s exact test)
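The comparison of adverse events can be reproduced with a short, self-contained implementation of the two-sided Fisher’s exact test. This Python sketch is illustrative and assumes that the test compared 7 events among 140 POEM patients with 3 events among 140 LHD patients, as the counts reported above suggest; the function name is hypothetical and this is not the R code used by the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Unnormalized hypergeometric weights kept as exact integers,
    # so ties with the observed table are compared without rounding error
    weights = {k: comb(row1, k) * comb(row2, col1 - k) for k in range(lo, hi + 1)}
    observed = weights[a]
    # Sum all tables that are as likely as, or less likely than, the observed one
    extreme = sum(w for w in weights.values() if w <= observed)
    return extreme / sum(weights.values())


# Rows: POEM, LHD; columns: adverse event, no adverse event
p_value = fisher_exact_two_sided(7, 133, 3, 137)
print(round(p_value, 2))
```

Under these assumed counts the result is approximately 0.33, in agreement with the value reported in Table 4.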

Symptomatic Outcome

Two patients of the POEM group and one of the LHD group were lost to follow-up. At a median follow-up of 24 months (15–30) and 31 months (15–41), respectively, the median postoperative Eckardt score did not differ between the two groups, being 1 (0–1) in the POEM patients and 1 (0–2) in the LHD patients (p = 0.45). Moreover, 137/138 patients in the POEM group (99.3%) and 133/139 in the LHD group (95.7%) had an Eckardt score ≤ 3 and were therefore considered successfully treated (p = 0.12). Figure 2 shows the variation of the median Eckardt score in both groups, as evaluated at the different follow-up visits. After both treatments, the Eckardt score decreased dramatically and remained steadily low over time (p < 0.001, Friedman test). After more than 24 months, the median score did not differ between the two groups: 1 (0–1) for POEM and 1 (0–2) for LHD (p = 0.45).

Fig. 2

Variation of the Eckardt’s score in the two groups of patients as evaluated at different follow-up visits. After the treatment, the symptom score decreased dramatically in both groups of patients and remained steadily low during the subsequent follow-up (p < 0.001, Friedman test)

Figure 3 shows the symptom-control survival curves: the two curves followed a similar trend, without significant differences between the two groups. In particular, nearly 5 years after the operation, the probability of having achalasia symptoms controlled (i.e., Eckardt’s score ≤ 3) was higher than 90% with both treatments (POEM 98.2%, LHD 93.9%, p = 0.2, log rank test). All 6 patients with symptom relapse some months after LHD were treated with a median of 2 complementary PDs (1–3), with resolution of the recurrent symptoms, whereas the only patient with relapse in the POEM group is still under evaluation for further treatment.

Fig. 3

Kaplan-Meier survival curves for both methods showed a similar pattern, without significant differences between the two groups. In particular, nearly 5 years after the operation, the probability of having good control of achalasia symptoms was higher than 90% for both treatments (POEM 98.2%, LHD 93.9%, p = 0.2, log rank test)

Function Evaluation

Most patients agreed to undergo endoscopy and esophageal function tests 6 months after the operation. In particular, 126 patients of the POEM group (91.3%) had postoperative endoscopy and manometry, whereas 99 (71.7%) had 24-hour pH monitoring. In the LHD group, 101 patients (72.7%) had endoscopy, 109 (78.4%) had esophageal manometry, and 105 (75.5%) had 24-h pH monitoring of the distal esophagus. However, for a more complete evaluation and to offset the slight differences between the follow-up protocols of the two centers in this retrospective study, only patients who also had endoscopy in the same timeframe were considered. Thus, 99 patients in the POEM group (71.7%) and 79 in the LHD group (56.8%) were selected.

a. Manometry. In both groups, achalasia treatment resulted in a significant decrease of the LES resting and residual pressure at relaxation (4sIRP). The LES resting pressure decreased from 41 mmHg (29–53) to 18 mmHg (12–25) in the POEM group (p < 0.01) and from 42 mmHg (32–56) to 18 mmHg (13–26) in the LHD group (p < 0.01). Also, the 4sIRP changed from 28 mmHg (21–37) to 8 mmHg (5–11.5) and from 32 mmHg (25–43) to 10 mmHg (7–13), in the POEM and LHD groups, respectively (p < 0.01 for both). After the treatment, the pressure values (resting and 4sIRP) did not differ between the two groups (Table 5).

Table 5 Six-month postoperative endoscopic and function evaluation: only patients with complete evaluation (endoscopy, HR Manometry, and 24-h pH-monitoring) were considered

No relationship was found between postoperative manometric data and symptom score in the POEM group: after the operation, 6 patients showed 4sIRP higher than normal (15 mmHg), but none of them had recurrent symptoms (the only patient in this group with Eckardt’s score > 3 refused to repeat manometry). On the other hand, 12 patients in the LHD group had a 4sIRP higher than normal (15 mmHg): 2 of them had recurrent symptoms (16.7%), as well as 3 out of the 67 patients who had normal 4sIRP at the postoperative control (4.4%)(p < 0.05).

b. 24-hour pH-monitoring. Function tests showed a statistically significant difference in the postoperative exposure of the distal esophagus to acid (both for the total % and the DeMeester’s score) in the two groups (Table 5). In particular, the median total % of exposure to acid was 3.5% (1.35–8.65) in the patients who received POEM, as compared to 0.3% (0–2.1) of the patients who had LHD, and the median DeMeester’s score was 14.5 (5.35–30.1) and 1.7 (0–23.5), respectively (p < 0.01 in both cases). By evaluating the single patients, 38 out of the 99 patients evaluated after POEM (38.4%) showed an abnormal exposure of the distal esophagus to acid, as compared to only 14 of the 79 patients evaluated after LHD (17.1%) (p < 0.01).

Endoscopy

Also, the prevalence of endoscopic esophagitis was, after treatment, significantly higher in the POEM group than in the LHD group: endoscopic esophagitis of any degree was found in 37.4% of patients of the former, as compared to 15.2% of patients of the latter group (p < 0.001) (Table 5). However, it must be underlined that the majority of esophagitis belonged to grade A: more severe forms of esophagitis (grade B-D of the Los Angeles classification) were detected only in a minority of patients even if, also in this respect, a frequency higher than in the LHD was recorded in the POEM group (16.2% and 3.8%, respectively, p < 0.01).

A significant correlation between esophagitis and abnormal acid exposure was found both for POEM and LHD patients: esophagitis was detected in 63.2% of the POEM patients with abnormal pH studies, as compared to 21.3% of patients with normal pH studies (p < 0.001), and in 35.7% and 10.8% of LHD patients, respectively (p < 0.02). Interestingly, in the LHD group, esophagitis > grade A was detected in 3 patients with normal pH studies.

A positive association between GERD symptoms (heartburn ≥2) and the presence of endoscopic esophagitis and/or abnormal pH studies was found in the POEM group: however, therapy with PPI was not different between patients with and patients without GERD symptoms. On the contrary, no association was found between GERD symptoms and presence of esophagitis and/or abnormal pH studies in the LHD group. In this group also, the prolonged use of PPIs did not differ between patients with and without symptoms. Finally, by evaluating all the patients in the follow-up (both those with and those without a complete endoscopic and functional postoperative evaluation), chronic PPI therapy was more frequent in patients who had undergone POEM (39.8%) than in patients who had undergone LHD (15.1%), p < 0.001.

Discussion

The treatment of esophageal achalasia is still palliative and aims at eliminating the barrier caused by a non-relaxing sphincter1. This has long been achieved by pneumatic dilations or laparoscopic Heller myotomy, both very effective in controlling the symptoms in the long run37.

Recently, a new endoscopic myotomy technique, per-oral endoscopic myotomy (POEM), has been introduced3. It spread rapidly, first in Asia (especially Japan and China) and then in the USA and Europe. It is a minimally invasive technique that allows an esophageal myotomy similar to the surgical one to be performed with an endoscopic trans-oral approach. The reported short- and medium-term results of this technique are extremely good, equal if not superior to those of LHM5,6,7,8,9,10,11,12,13,14,15,38. Although a number of efficacy studies have evaluated each treatment separately5,6,7,8,9,10 and a number of recent meta-analyses11,12,13,14,15 are available, the results of the ongoing RCTs comparing POEM with LHM are not yet available. Only one RCT comparing POEM and PD has recently been published; it showed far better results with the former, prompting the authors to claim that POEM should be considered as the initial treatment option for achalasia39.

In the absence of more definitive results from RCTs, our study may provide interesting and nearly definitive elements for the comparison of POEM and LHD, thereby supporting or refuting this claim. One of the strengths of our study lies in the statistical matching method we used. By choosing PS matching with a very conservative threshold, we selected two good-sized, strictly matched populations that underwent one of the two tested methods for the same disease in a relatively short time span (4 years). The patients were followed for an adequate period of time (> 2 years) with endoscopic and function tests, in addition to symptom evaluation with the Eckardt score24, a method that is well accepted and widely used in the literature. Had the study been designed as an RCT, these patients could have been correctly randomized to one treatment or the other. Of course PS matching, like alternative statistical procedures such as multivariable analysis, cannot adjust for unobserved and unknown confounders, and this is a limitation of the method compared with RCTs. Multivariable analyses can adjust for confounding but may be inadequate to eliminate bias in studies in which the outcome under investigation is rare (e.g., failure of treatment in achalasia)40. In addition, when different statistical methods were compared regarding performance in controlling for confounders, PS proved most useful, yielding appropriate estimates even in situations of extreme correlation between the confounder and the exposure40. In a recently published paper41, other authors also used PS to match patients who underwent POEM or LHM at their institution. The number of POEM patients, however, was small (31) compared with the patients who underwent LHM (88), with a resulting matching of 1:3 for the two groups. Moreover, the time span in which the two groups of patients were recruited was very different (2014–2015 for POEM and 2005–2015 for LHM) and no mention was made of patients treated during the learning curve. In spite of these limitations, however, their results were remarkably similar to those of the present study, adding further strength to our findings.

The results of our study confirmed that POEM is a very good method for the treatment of esophageal achalasia and compares very well with the more established LHD. In fact, the probability of having achalasia symptoms controlled at 5 years was higher than 90% with both techniques. These results are in line with those reported to date in the literature for both methods11,12,13,15,19,32,42 and do not confirm the progressive worsening of results 2 years after POEM, relative to the immediate results, that other authors have reported43. That multicenter retrospective study, however, also included the first patients treated by the participating groups, and the declining results may reflect the effect of the learning curve on the clinical efficacy of POEM. In our study, the first 20 patients treated by each operator were excluded, thus avoiding biases possibly related to the learning curve, set at 20 cases for both POEM and LHD22,35. POEM proved superior to LHD in terms of shorter operative time and postoperative hospital stay. This finding agrees with some studies in the literature5, but it is not confirmed by others6,7 or by a recent meta-analysis12.

Both POEM and LHD are invasive, surgical techniques carried out under general anesthesia: thus complications may occur. In our study, however, these were infrequent, occurring in 5% or less of cases with either technique and with no statistically significant difference between them. They were mostly minor adverse events, such as mucosal perforations recognized and repaired during the same operation, with the application of endoclips during POEM or immediate laparoscopic suturing during LHD. In any case, these complications did not affect the postoperative course or the final outcome of the operation, causing only a small increase in the postoperative stay. This finding is also in accordance with the literature8,11,44: in particular, a recent meta-analysis12 reported a similarly low frequency of adverse events for both methods.

Our study further confirmed the already known finding of a higher incidence of post-procedural gastroesophageal reflux disease (GERD) after POEM compared with LHD, with a consequently higher postoperative use of PPIs. A number of retrospective studies had already reported this finding16,17. Abnormal gastroesophageal reflux is a common finding after all therapeutic procedures for achalasia, since they all aim at disrupting the barrier of a non-relaxing LES. In fact, an incidence of GERD of about 15% was reported after PD37, and of 8.8% and 31.5% after laparoscopic myotomy with and without an antireflux procedure, respectively42. The 5-year results of the European Achalasia Trial reported an incidence of GERD after LHD as high as 34% (as compared to 15% in the PD group)37. It must be said, however, that postoperative pH monitoring in this important study was possible in only about one-third of patients in both arms. It is worth emphasizing that our study reproduced nearly perfectly the results of a recent meta-analysis, which found abnormal exposure to acid in 39% of patients after POEM and 16.8% after LHD18.

The problem of GERD must certainly be taken into account. It is, however, a problem whose relevance is perhaps overemphasized and whose consequences are not yet well understood45. Are we treating a disease by creating a new one and a worrisome experimental model for the development of Barrett’s esophagus and, eventually, esophageal adenocarcinoma? Or is the need for long-term PPI therapy the only consequence? At the moment, it is not possible to answer these questions, for several reasons.

First, achalasia patients rarely complain of reflux symptoms after treatment. Usually they are so happy to have regained their ability to eat that they tend to underestimate any GERD symptoms occurring after treatment. It is also possible that the esophageal mucosa, chronically irritated and inflamed by the stasis of saliva and food, is less sensitive to the effect of refluxed gastric juice. Moreover, if GERD symptoms appear, medical treatment with PPIs is extremely effective in controlling symptoms and, above all, esophagitis. If, for any reason, medical treatment proves insufficient or cannot be continued in the long run (especially in young patients with a long life expectancy), a laparoscopic fundoplication can always be performed some time after POEM. Finally, even after effective treatment, patients with achalasia should in any case be followed endoscopically (every two or three years) for the early diagnosis of a possible esophageal carcinoma46, even though endoscopic surveillance is not recommended by current guidelines23,47. Such a carcinoma may occur after any treatment for achalasia, as well as in the untreated disease. It must be underlined, however, that the histotype of cancer that develops in achalasia patients is nearly always squamous46. Although reported in the literature, cases of Barrett esophagus or even adenocarcinoma in patients with achalasia are rare48,49,50. This implies an etiology different from GERD, probably more related to the chronic stasis of undigested food and saliva in a dilated gullet that is unable to empty completely. The exponential bacterial overgrowth and the chronic chemical irritation due to the continuous decomposition of food and saliva may result in chronic hyperplastic esophagitis, dysplasia, and, eventually, neoplastic transformation of the esophageal epithelial cells. The true relevance of the combination of this iatrogenic gastroesophageal reflux, the possible development of Barrett esophagus, and, probably, the risk of cancer for patients treated for achalasia is not yet well known.

Our study has some limitations, though. First of all, it is not an RCT, which still represents the gold standard for estimating the effects of two or more different interventions. Although PS matching tries to approximate a randomized controlled trial19, the data are still retrospective and non-randomized. This can lead to hidden bias due to latent variables that may remain after matching. Moreover, the strict statistical matching with PS reduced our sample size to about half of the initial population in both groups. Only about three quarters of patients agreed to undergo postoperative 24-hour pH monitoring. Our choice to analyze only the patients who underwent function studies and endoscopy during the same time span further reduced the number of evaluated cases, thus possibly introducing additional bias. Finally, another possible limitation of this study is that the results we obtained may not represent those achievable by centers with less experience with one or the other technique (i.e., the “real” world), since they were obtained by two centers of excellence. Instead, they probably represent the best possible results one can expect with both techniques.

Conclusions

In conclusion, our study confirmed that POEM is a valid option for the treatment of esophageal achalasia, with outcomes comparable to those obtained with LHD, at least at midterm. Both POEM and LHD, when performed by experts in the procedures, proved equivalent in terms of clinical efficacy and safety. Further comparative studies are certainly necessary to clarify possible differences between the two methods and, in this context, the results of ongoing RCTs are eagerly awaited. Moreover, studies with longer follow-up will establish the real long-term results of POEM and the need for further treatments. Above all, they will clarify the real impact and long-term complications of iatrogenic GERD, which is undoubtedly more frequent after POEM than after LHD. In the end, while we think POEM is a good option for treating achalasia and should be, along with PD and LHD, in the armamentarium of any referral center dealing with this rare disease, its claim to be considered the initial treatment for patients with achalasia needs further, stronger evidence.