The primary therapeutic goal for the currently available modalities for treatment of achalasia is to lower the lower esophageal sphincter (LES) pressure. Over the years, modalities like pharmacotherapy (calcium channel antagonists, nitrates), endoscopic pneumatic dilatation, surgical myotomy and injection of botulinum toxin have been incorporated into the treatment armamentarium for achalasia [1]. However, each of these modalities has its pros and cons. For example, pneumatic dilatation is associated with symptom recurrence and post-procedure gastroesophageal reflux (GER); botulinum toxin has a short-lived action and is expensive; and surgical myotomy usually requires an additional fundoplication procedure to prevent GER. These limitations have led endoscopists and endoscopic surgeons to explore novel technologies. Peroral endoscopic myotomy (POEM) is an evolving therapeutic modality that has stemmed from the concept of natural orifice transluminal endoscopic surgery. The technique of POEM involves four major consecutive steps [1], viz.: esophageal mucosal incision and entry into the submucosal space; creation of a submucosal tunnel; incision of the esophageal muscles (myotomy); and closure of the mucosal incision. The earliest report of clinically effective endoscopic myotomy for achalasia came from Ortega et al. in 1980 [2]. This technique, however, was more of a blind incision of the esophageal mucosa and deeper layers, much different from the current technique of POEM. The first report of POEM was based on an experimental study in a porcine model in which successful esophageal submucosal tunneling was demonstrated and translated into a lowering of the LES pressure [3]. This technique was subsequently refined and executed in humans, culminating in the first case series demonstrating the utility of POEM in achalasia by Inoue et al. in 2010 [4].
Subsequently, several single and multicenter case series/studies involving variable sample sizes, and exploring a variety of technical modifications and outcomes, have been reported over the past few years. Authors have also reported a variety of complications associated with POEM. Furthermore, very few studies have compared POEM with existing therapeutic modalities like surgical myotomy in terms of efficacy and safety. Therefore, it is prudent that the cumulative efficacy and safety of POEM, and how these compare with surgical myotomy, be addressed.

In the current communication, we present results of a systematic review and meta-analysis of the efficacy of POEM in patients with achalasia and its comparison with surgical myotomy.

Materials and methods

Study selection

This systematic review and meta-analysis was conducted as per the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [5]. We conducted a systematic literature search in PubMed, Medline, Cochrane and Ovid databases, and proceedings of major conferences from January 2005 to July 2014, using the search words ‘achalasia’, ‘POEM’, ‘per oral endoscopic myotomy’, ‘per-oral endoscopic myotomy’, and ‘peroral endoscopic myotomy’. Inclusion criteria were: clinical studies on patients with achalasia irrespective of previous endoscopic or surgical procedures, and full-length papers. There were no language restrictions in the selection. Articles in languages other than English were translated to English using an online translation service (Google Translate). Exclusion criteria were: experimental studies in animal models, technical reports, single case reports, abstracts, editorials, and review articles.

Data abstraction

Two of the investigators abstracted data independently using a standard proforma, and any discrepancies were resolved by consensus. The following parameters were recorded: first author and year of publication, study duration, country of origin, single-center or multicenter, sample size, previous myotomy or other specific interventions, age, gender distribution, duration of disease (in months), procedural time (in minutes), length of the submucosal tunnel (in cm), length of myotomy (in cm), days of hospitalization, follow-up duration (in months), and intra- and post-procedural adverse events. Attempts were made to contact the corresponding authors of studies for any missing data points.

Outcome measures

The main outcome measures studied were improvement in the Eckardt score and reduction in the resting LES pressure. Subgroup analysis of the studies comparing POEM with surgical myotomy was also performed, in which the following outcomes were compared: Eckardt score, length of hospital stay, post-operative pain score, post-operative analgesic dose, procedure time, adverse events, and post-procedure symptomatic GER.

Assessment of study quality

We used a study rigor table that was previously developed and validated to standardize comparison of rigor across studies [6]. The following aspects related to study rigor were recorded: prospective cohort analysis (whether data were presented from the same subjects followed over time); control or comparison groups (whether those who received POEM were compared with those who did not); pre/post intervention data (whether participants were assessed before and after receiving POEM); random assignment of study subjects to treatment groups; attrition (whether the follow-up rate was 80 % or more); and comparison group matching (whether there were statistically significant baseline differences in study outcomes).

Statistical analysis

Statistical analyses were performed under the guidance of a statistician. A database was generated in Excel for Mac (Microsoft Corp., Redmond, WA) and meta-analysis was performed using the Comprehensive Meta-analysis software (Ver. 2.2.064; 2011). A pre-post design was used for evaluating the outcomes after POEM in the same group. Effect sizes for numerical variables were expressed as standardized difference in means with 95 % confidence interval (CI), while those for categorical data were expressed as odds ratio with 95 % CI. Whenever data in individual studies were expressed as a range, they were converted to standard deviation (SD) before analysis. Between-study heterogeneity was assessed by the I² measure, and was considered to be important if it was greater than 25 %. The Q measure was used to evaluate significance of heterogeneity, which was considered statistically significant when p < 0.1. A random-effect model (DerSimonian and Laird [7]) was used when there was heterogeneity, while a fixed-effect model (Mantel–Haenszel method [8]) was used in the absence of heterogeneity. Publication bias was evaluated and quantified by Egger’s test.
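As an illustration of the pooling described above, the following minimal Python sketch computes Cochran's Q, the I² measure, the between-study variance and a DerSimonian and Laird random-effect estimate. It is not the Comprehensive Meta-analysis implementation used in this study. The function name, the use of inverse-variance weights for the fixed-effect step (rather than the Mantel–Haenszel method) and the example inputs are assumptions made for illustration only.

```python
import math


def dersimonian_laird(effects, variances):
    """Pool study effect sizes (e.g., standardized differences in means).

    effects   -- per-study effect sizes
    variances -- per-study sampling variances (squared standard errors)
    Illustrative sketch only; not the software used in the study.
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Inverse-variance (fixed-effect) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the I^2 measure of heterogeneity (in percent).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird moment estimate of the between-study variance tau^2.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effect weights, pooled estimate, standard error and Z.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "q": q,
        "i2": i2,
        "tau2": tau2,
        "pooled": pooled,
        "se": se,
        "z": pooled / se,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
    }


# Hypothetical inputs, not data from the included studies.
example = dersimonian_laird([-0.5, -1.5], [0.01, 0.01])
```

When tau² is estimated as zero, the random-effect weights reduce to the inverse-variance weights, so the sketch returns the fixed-effect estimate in the absence of heterogeneity.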

Results

Characteristics of individual studies and quality assessment

As shown in Fig. 1, the initial search revealed 247 studies, of which 167 were screened for eligibility criteria after removing duplicates. Of these, 52 abstracts and 19 studies that were not related to POEM were excluded, leaving 96 studies for eligibility assessment. Of these 96 records, 29 fulfilled eligibility criteria and were included for qualitative analysis [4, 9–36]. Of these 29 records, 20 fulfilled criteria for inclusion for quantitative analysis (meta-analysis). Among the 67 excluded studies, three were experimental studies in animal models, 14 were single case reports, 36 were reviews and/or editorials, and 14 were technical reports.
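The study-selection counts can be cross-checked with simple arithmetic. The short Python sketch below uses only the numbers reported in this paragraph; the variable names are chosen here for illustration.

```python
# Counts as reported in the study-selection paragraph.
screened = 167               # records screened after removing duplicates
excluded_abstracts = 52      # abstracts excluded at screening
excluded_unrelated = 19      # studies not related to POEM

# Records that went forward to eligibility assessment.
assessed = screened - excluded_abstracts - excluded_unrelated

included_qualitative = 29    # studies included for qualitative analysis
excluded_at_eligibility = assessed - included_qualitative

# Reasons given for exclusion at the eligibility stage.
exclusion_reasons = {
    "experimental studies in animal models": 3,
    "single case reports": 14,
    "reviews and/or editorials": 36,
    "technical reports": 14,
}
```

The two subtractions give 96 records assessed and 67 excluded, and the four exclusion reasons sum to the same 67.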

Fig. 1

PRISMA diagram showing the flow of study selection

Table 1 shows the study and patient characteristics. Among the included studies, one was multicenter [24] (Germany, Netherlands, and Canada). Countries of origin of the single-center studies were Japan [4, 14, 19] (n = 3), Italy [15, 31] (n = 2), USA [9, 11, 12, 16, 20, 23, 25–27, 32, 35] (n = 11), Hong Kong [17] (n = 1), Korea [18] (n = 1), Netherlands [21] (n = 1), China [10, 22, 28–30, 33, 36] (n = 7), and Germany [13, 34] (n = 2). Overall, five studies compared efficacy between POEM (n = 90) and Heller’s myotomy (n = 160) [9, 12, 16, 25, 26]. The study by Cai et al. [29] randomized patients with achalasia to two different techniques (conventional and water-jet assisted method). The water-jet technique was used in the study by Khashab et al. as well [27]. Another study compared symptom relief and manometry after endoscopic full-thickness and circular muscle myotomy [28], while the study by Zhai et al. compared efficacy of POEM with transverse versus longitudinal entry incisions [22]. We used pooled efficacy data for pre-post analysis from these two studies since the tested techniques showed similar results for study outcomes in both studies. In one study [23], fellows/trainees were also involved under supervision in performing POEM. It was found that with an increase in training, there was a reduction in the procedure time and mucosal perforation. Reporting of POEM-related data was not homogeneous; non-reporting included: change in Eckardt score in 10 studies, change in LES pressure in 13, previous interventions for achalasia in 6, duration of disease in 13, procedural time in 4, submucosal tunnel length in 19, myotomy length in 6, days of hospital stay in 12, follow-up duration in 6, and adverse events in 1. The study by Bhayani et al. reported a change in a dysphagia score that was different from the Eckardt score.

Table 1 Clinical characteristics of patients undergoing POEM

Table 2 shows the quality of the individual studies in the form of a study rigor analysis.

Table 2 Assessment of quality of individual studies

Patient characteristics

A total of 1,045 patients (490 males) were pooled from 29 studies. Mean (SD) age of the patients was 50.5 (14.1) years while the duration of disease was 51.01 (104.6) months. Mean (SD) follow-up duration for the patients was 6.5 (3.2) months. Previous interventions were reported for 397 patients that included: dilatation (pneumatic and bougie) in 128 patients; surgical myotomy in 53, botulinum injection in 48; combined botulinum injection with dilatation in 5; POEM in 1, temporary stenting in 1; and other procedures (including medical treatment) in 113. Two studies did report the number of patients who underwent interventions (n = 28) but did not mention the details. Sharata et al. divided the patients into two groups (pre-POEM intervention and non-intervention); and all outcomes and complication rates were similar in both groups.

Adverse events

No adverse events were encountered in two studies involving 24 patients [18, 19]. Adverse events were not reported in the study by Khashab et al. [27]. There were a total of 1,120 adverse events reported in the remaining studies that included: bleeding (n = 10); esophageal and gastric perforation (n = 27); subcutaneous emphysema (n = 228); mediastinal emphysema (n = 51); pneumoperitoneum (n = 169); pneumothorax (n = 91); pleural effusion (n = 182); and pneumonia (n = 103). All but one case of pneumonia were reported from the study by Li et al., and all cases were diagnosed on a post-operative CT scan of the thorax. Overall, 114 (10.9 %) patients with gastroesophageal reflux/reflux esophagitis (GER/RE) were reported. The number of patients who developed GER/RE could be higher since a few of the studies made no mention of its presence or absence. On evaluating the individual studies, the frequency of GER/RE was variable and was found to be higher in the studies by Hungness et al. [16] (38.9 %), Verlaan et al. [21] (60 %), von Renteln et al. [24] (33 %), and Swanström et al. [32] (72.2 %). The rate of GER rose to 37 % (4 % increase) in the study by von Renteln et al. [24] at the end of 1-year follow-up. Most of the complications were minor and self-limited, and could be managed conservatively. Perforations could be managed successfully with endoscopic clipping. Three patients with bleeding required endoscopic hemostasis. Pneumoperitoneum could be treated with a Veress needle in most cases, and most cases of pleural effusion and pneumothorax resolved spontaneously. Only seven pleural effusions required thoracotomy with drainage. Patients who had symptomatic GER could be satisfactorily managed with proton pump inhibitors and antacids. There was no mortality and none of the POEM procedures had to be converted to surgery.

Efficacy of POEM in achalasia

The studies that evaluated the efficacy of POEM for achalasia were non-randomized, and a pre-post model was used to perform meta-analysis. Nineteen studies evaluated the change in the Eckardt score in the patients post POEM (Fig. 2A). There was significant heterogeneity among the studies (Q = 83.06; I² = 78.33 %; p < 0.0001), due to which a random-effect model was used for analysis. There was a significant reduction in Eckardt score with an overall effect size (Z) of −7.95 (p < 0.0001) [overall standardized difference in means (95 % CI) of −0.938 (−1.169 to −0.706)]. We re-ran the meta-analysis (Fig. 2B) after removing seven studies that were reported by three groups and had likely included a proportion of the same patients across different studies. Even after removal of the seven studies, the overall effect size (Z) of reduction of Eckardt score was −5.99 (p < 0.0001) [overall standardized difference in means (95 % CI) of −0.851 (−1.129 to −0.573)]. A statistically significant reduction in Eckardt score was observed after meta-analysis even after exclusion of the two studies by Vigneswaran et al. [12] and Familiari et al. [15], which had low sample size and markedly increased relative weight compared to the other studies (data not shown).
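The summary statistics reported above are related to one another by standard formulas: I² follows from Q and the number of pooled studies, and Z is the pooled estimate divided by its standard error, which can be recovered from the width of the 95 % CI. The Python sketch below illustrates these relationships using the reported values; the function names are hypothetical, and small differences from the reported figures are expected because the published inputs are rounded.

```python
def i_squared(q, n_studies):
    """I^2 (in percent) from Cochran's Q and the number of pooled studies."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def z_from_ci(estimate, lower, upper, z_crit=1.96):
    """Z statistic implied by a point estimate and its 95 % confidence interval."""
    se = (upper - lower) / (2.0 * z_crit)  # half-width of the CI divided by 1.96
    return estimate / se


# Values as reported for the Eckardt score analysis.
i2_all = i_squared(83.06, 19)               # reported I^2: 78.33 %
z_all = z_from_ci(-0.938, -1.169, -0.706)   # reported Z: -7.95
z_sens = z_from_ci(-0.851, -1.129, -0.573)  # reported Z: -5.99
```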

Fig. 2

A Forest plot showing the efficacy of POEM in reducing Eckardt score in patients with achalasia. B Forest plot after exclusion of studies from the same groups of authors showing the efficacy of POEM in reducing Eckardt score in patients with achalasia

Sixteen studies evaluated the change in resting LES pressure after POEM (Fig. 3A). Similar to the studies that evaluated Eckardt score, these studies also demonstrated significant heterogeneity (Q = 61.44; I² = 75.68 %; p < 0.0001). Meta-analysis using random-effect modeling revealed significant improvement of the resting LES pressure with an overall effect size (Z) of −7.28 (p < 0.0001) [overall standardized difference in means (95 % CI) of −0.869 (−1.102 to −0.635)]. Meta-analysis after removal of five studies from three groups (Fig. 3B) also resulted in an overall effect size (Z) of reduction of LES pressure of −5.39 (p < 0.0001) [overall standardized difference in means (95 % CI) of −0.950 (−1.296 to −0.605)]. A statistically significant reduction in LES pressure was observed after meta-analysis even after exclusion of the study by Familiari et al. [15], which had markedly increased relative weight compared to the other studies (data not shown).

Fig. 3

A Forest plot showing the efficacy of POEM in reducing lower esophageal sphincter pressure in patients with achalasia. B Forest plot after exclusion of studies from the same groups of authors showing the efficacy of POEM in reducing lower esophageal sphincter pressure in patients with achalasia

There was significant publication bias among the studies, with an Egger’s regression intercept of −3.19 (95 % CI −3.56 to −2.82) (p < 0.0001) and −3.06 (95 % CI −3.58 to −2.53) (p < 0.0001) for Eckardt score and LES pressure reduction, respectively.
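For readers unfamiliar with the intercept reported here, Egger's test regresses each study's standardized effect (effect size divided by its standard error) on its precision (the reciprocal of the standard error); an intercept far from zero suggests small-study effects such as publication bias. The following minimal Python sketch shows the regression with ordinary least squares. The function name and the example inputs are hypothetical and are not data from the included studies, and the sketch does not reproduce the output of the software used in this analysis.

```python
import math


def egger_intercept(effects, std_errors):
    """Egger's regression: standardized effect regressed on precision.

    Returns (intercept, slope, standard error of the intercept).
    Illustrative sketch only; inputs below are hypothetical.
    """
    n = len(effects)
    if n < 3 or n != len(std_errors):
        raise ValueError("need at least three studies with matching standard errors")

    x = [1.0 / se for se in std_errors]                  # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance and the usual OLS standard error of the intercept.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in residuals) / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, slope, se_intercept


# Hypothetical studies in which smaller studies (larger SE) show larger effects.
biased = egger_intercept([-0.7, -0.9, -1.0, -1.5], [0.1, 0.2, 0.25, 0.5])
# Hypothetical studies with the same effect regardless of study size.
unbiased = egger_intercept([-1.0, -1.0, -1.0, -1.0], [0.1, 0.2, 0.25, 0.5])
```

In the first hypothetical set the effect grows with the standard error, giving a negative intercept; in the second the intercept is zero because the effect does not depend on study size.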

Comparison of efficacy of POEM with laparoscopic Heller’s myotomy (LHM)

Five studies compared POEM with LHM. Figures 4A–G depict Forest plots comparing the efficacy on the following outcomes: Eckardt score; procedural time; post-operative pain; post-operative analgesic dose; length of hospital stay; adverse events; and presence of symptomatic GER. There was significant heterogeneity in the studies for time taken for the procedure (Q = 11.19; I² = 73.18 %; p = 0.011); post-operative pain (Q = 51.11; I² = 98.06 %; p < 0.0001); analgesic dose (Q = 17.49; I² = 94.29 %; p < 0.0001); and length of hospital stay (Q = 15.04; I² = 80.05 %; p = 0.02); while no heterogeneity was observed for the other outcomes. There was a trend toward a greater reduction in the Eckardt score in favor of POEM compared to LHM, though it did not reach statistical significance [overall effect size (Z) = −1.77; p = 0.078] (Fig. 4A). Time for the procedure was significantly less for POEM compared to LHM [overall effect size (Z) = −2.220; p = 0.026] (Fig. 4B). There was no statistically significant difference in the reduction of post-operative pain score [overall effect size (Z) = −0.691; p = 0.489] and analgesic (morphine equivalent) dose [overall effect size (Z) = −0.755; p = 0.450] (Fig. 4C, D). Furthermore, there was no difference in the effect of POEM and LHM on length of hospital stay [overall effect size (Z) = −1.41; p = 0.156] (Fig. 4E). Similarly, the risk of adverse events did not differ between POEM and LHM [overall effect size (Z) = 1.227; p = 0.220] (Fig. 4F). Finally, there was also no difference in the development of symptomatic GER between POEM and LHM [overall effect size (Z) = −1.41; p = 0.156] (Fig. 4G).
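Each p value quoted above corresponds to a two-sided test of the overall effect size (Z) against the standard normal distribution. The short Python sketch below shows this conversion for the reported Z values; the function name and dictionary are illustrative, and small differences from the reported p values are expected because the published Z values are rounded.

```python
import math


def two_sided_p(z):
    """Two-sided p value for a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# (Z, p) pairs as reported for the POEM versus LHM comparisons.
reported = {
    "Eckardt score": (-1.77, 0.078),
    "procedure time": (-2.220, 0.026),
    "post-operative pain": (-0.691, 0.489),
    "analgesic dose": (-0.755, 0.450),
    "length of hospital stay": (-1.41, 0.156),
    "adverse events": (1.227, 0.220),
}

# Recompute each p value from its Z for comparison with the reported figure.
recomputed = {name: two_sided_p(z) for name, (z, _) in reported.items()}
```

With the conventional threshold of p < 0.05, only procedure time differs significantly between the two procedures, which agrees with the text.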

Fig. 4

A Forest plot showing comparison of the efficacy of POEM with LHM in reducing Eckardt score in patients with achalasia. B Forest plot showing comparison of operative time required for POEM with that for LHM. C Forest plot showing comparison of post-operative pain score after POEM with that after LHM. D Forest plot showing comparison of post-operative analgesic requirement after POEM with that after LHM. E Forest plot showing comparison of the length of hospital stay after POEM with that after LHM. F Forest plot showing comparison of adverse events after POEM with that after LHM. G Forest plot showing comparison of symptomatic GER/GE after POEM with that after LHM

Comparison of technical modifications in the POEM procedure

Cai et al. [29] conducted a randomized study comparing water-jet assisted (hybrid knife) (n = 50) versus conventional dissection (n = 50) techniques, and both were found to have similar efficacy in terms of treatment success (Eckardt score of <3; seen in 96.5 %). However, accessory exchanges were significantly fewer with the water-jet technique (2 ± 2.4 vs. 19.2 ± 1.0; p < 0.0001), which was likely associated with a significantly shorter procedural time for the water-jet technique (22.9 ± 6.7 vs. 35.9 ± 11.7 min; p < 0.0001). While most adverse events were similar in both techniques, episodes of minor intraprocedural bleeding were significantly fewer with the water-jet technique (3.6 ± 1.8 vs. 6.8 ± 5.2; p < 0.0001). Another retrospective study by Li et al. [28] evaluated the clinical efficacy of endoscopic full-thickness and circular muscle myotomy in terms of reduction of Eckardt score and LES pressure. The procedure time and length of post-operative hospital stay were significantly lower in the full-thickness myotomy group [(41.7 ± 18.9 vs. 48.9 ± 28.6; p = 0.02) and (2.7 ± 1.1 vs. 3.6 ± 2.7; p = 0.00), respectively]. Rates of post-operative subcutaneous emphysema, pneumothorax, pneumoperitoneum, pleural effusion and pneumonia were similar after both techniques.

Discussion

In the present systematic review and meta-analysis, we report that: (a) POEM is an effective and relatively safe endoscopic treatment for achalasia; and (b) POEM and LHM are similar in terms of efficacy, post-operative analgesic need, adverse events and development of post-procedure GER, in the treatment of achalasia.

The first case series on the utility and efficacy of POEM in treating achalasia came from Inoue et al. in 2010 [4]. The technique was a refinement of the first report from an experimental study in a porcine model in which successful esophageal submucosal tunneling was demonstrated [3]. Prior to the development of POEM for the treatment of achalasia, other modalities like pharmacotherapy (calcium channel antagonists, nitrates), endoscopic pneumatic dilatation, surgical myotomy and injection of botulinum toxin had been in vogue. However, these modalities, though effective, had their own disadvantages.

Pneumatic dilatation, which is minimally invasive and the most commonly performed technique, has the inherent disadvantages of symptom recurrence and a high prevalence of post-procedure GER. Botulinum toxin can be injected into the LES under direct endoscopic vision, but has a short-lived action, requires repeated injections, and can incur a high cost of therapy. Possibly the best indication for botulinum toxin in achalasia could be as a bridge to therapy in situations like pregnancy or use of multiple antiplatelet agents. Heller’s myotomy, though effective, is an invasive procedure (even if performed laparoscopically), mandates hospitalization and usually requires a fundoplication procedure to prevent post-operative reflux. There is also a risk of causing an intraoperative esophageal perforation that could be missed early. In contrast, POEM is a minimally invasive procedure that can be performed under direct endoscopic vision without the need for hospitalization. Furthermore, follow-up of up to 12 months has shown a sustained success rate of 82.4 % [24].

The current meta-analysis involved 29 studies with 1,045 patients. However, all but one [29] of the studies were non-randomized. Nineteen studies evaluated the pre- and post-procedure Eckardt score and 16 evaluated the LES pressures. There was significant improvement in both these outcomes. However, seven studies were from the same groups, which led to a possibility of double reporting of cases and thus an inflated beneficial effect. In order to negate this, we re-ran the meta-analysis for these two outcomes after removing those studies, and the effect sizes maintained statistical significance. Five studies compared POEM with Heller’s myotomy in a non-randomized manner, with similar outcomes in terms of post-operative course and adverse events [9, 12, 16, 25, 26]. Operative time was significantly less in POEM compared to LHM, while there was a trend toward statistical significance for reduction of Eckardt score after POEM compared to LHM. Most of the studies included both treatment-naïve patients and patients who had undergone previous endoscopic or surgical interventions for achalasia. Sharata et al. compared the outcomes and adverse events between patients who were treatment naïve and those who had undergone previous procedures. All study outcomes and adverse events were similar in both groups of patients, thereby reiterating the efficacy and safety of the procedure even in patients who had undergone previous procedures.

Adverse events that occurred commonly were subcutaneous emphysema, mediastinal emphysema, pneumoperitoneum, pneumothorax, pleural effusion, and pneumonia. Even though the total number of adverse events appeared to be higher than what would usually be seen with other procedures, most of these were inherent to POEM and were self-limiting. The majority of the symptomatic adverse events could be managed conservatively. There were no deaths associated with the procedure and the frequency of perforation and bleeding was not high. Overall, POEM emerged as a safe procedure with a safety profile comparable to that of LHM. GER/RE is a common concern after POEM since, unlike in surgical myotomy, no anti-reflux procedure is involved. Overall, GER/RE was seen in 10.9 % after POEM and the incidence of GER/RE after POEM was similar to that after LHM. A few of the individual studies did have a high rate of GER/RE, but it could be managed effectively with proton pump inhibitors.

Our study had limitations. Even though five studies had compared POEM with LHM, none of the studies were randomized. There was significant publication bias and heterogeneity among the studies that reported a change in Eckardt score and LES pressures. The majority of the studies did not provide results of long-term follow-up. Furthermore, there was a risk of double reporting of cases since seven studies were published from three groups. However, we removed these studies and re-ran the analyses, which still resulted in a statistically significant effect size. Nevertheless, to our knowledge, this is the first meta-analysis to study the efficacy of POEM and compare it with LHM. This study is likely to open up avenues for larger-scale multicenter studies in which POEM is compared with other standard procedures, including surgical myotomy, in a randomized manner, and in which the efficacy of POEM in treatment-naïve patients is compared with that in patients who failed to respond to previous interventions or had a relapse.