Introduction

Achalasia is a progressive disorder in which inhibitory ganglion cells in the myenteric plexus of the esophagus are irreversibly lost. This impairs relaxation of the lower esophageal sphincter (LES), causing functional obstruction [1–3].

Symptoms include dysphagia, regurgitation, and pulmonary problems, along with impaired quality of life (QoL) [4–6]. Manometry confirms the diagnosis [7, 8]. The treatments available aim at reducing LES pressure. Pneumatic dilatation (PD) is a well-studied treatment strategy that is easy to perform, but has the disadvantage of requiring repeated interventions [9]. PD is evidence-based in terms of both short-term and long-term therapeutic outcomes [10]. Surgical cardiomyotomy is also common, but must be combined with a partial fundoplication to avoid gastro-esophageal reflux. Conventional and laparoscopic myotomy (LM) have been shown to be safe and efficient, and previous studies comparing open surgery with pneumatic dilatation report advantages for the myotomy strategy [4, 5, 11–13]. Irrespective of strategy, the success rate early after therapy initiation is high. However, longer follow-up studies are needed [14].

The 12-month results of this prospective randomized study comparing PD versus LM with a posterior partial fundoplication as primary treatments for achalasia have been reported earlier [5]. Here, we report long-term follow-up results. The primary endpoint is treatment failure. Secondary endpoints are differences in symptom relief, QoL, and health economy.

Methods

Inclusion

Between 2000 and 2005, 56 patients with newly diagnosed achalasia were invited to participate, three of whom did not fulfill the study criteria, leaving 53 patients to be randomized to either PD or LM with a posterior partial fundoplication (Fig. 1). Computer-aided randomization was done using the minimization technique, stratifying for age, gender, and previous medical treatment. No patient had received any disease-specific treatment such as Botox injection, myotomy, or dilatation to more than 18 mm prior to inclusion. Two patients who were randomized to the dilatation arm and treated accordingly did not wish to undergo any invasive examination procedures during the first year of follow-up, but agreed to participate at the 3- and 5-year follow-ups. This explains a minor dissimilarity between the short- and long-term study populations [4, 5, 11–13].
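
The minimization technique mentioned above can be illustrated with a minimal Python sketch. The article does not describe the exact algorithm, so everything here is an assumption made for illustration: the class name `Minimizer`, the factor names (`age_band`, `gender`, `prior_medication`), the use of the summed marginal imbalance as the score, and the random tie-break.

```python
import random
from collections import defaultdict

ARMS = ("PD", "LM")


class Minimizer:
    """Two-arm minimization over categorical stratification factors.

    Illustrative only: the scoring rule (sum of absolute marginal
    differences) and random tie-breaking are assumptions, not the
    trial's documented procedure.
    """

    def __init__(self, factors, seed=None):
        self.factors = tuple(factors)
        self.rng = random.Random(seed)
        # counts[arm][(factor, level)] = patients already allocated
        self.counts = {arm: defaultdict(int) for arm in ARMS}

    def imbalance_if(self, arm, patient):
        """Total marginal imbalance if `patient` were assigned to `arm`."""
        total = 0
        for factor in self.factors:
            key = (factor, patient[factor])
            n = {a: self.counts[a][key] + (1 if a == arm else 0) for a in ARMS}
            total += abs(n["PD"] - n["LM"])
        return total

    def assign(self, patient):
        """Allocate the arm that minimizes imbalance; break ties at random."""
        scores = {arm: self.imbalance_if(arm, patient) for arm in ARMS}
        best = min(scores.values())
        arm = self.rng.choice([a for a in ARMS if scores[a] == best])
        for factor in self.factors:
            self.counts[arm][(factor, patient[factor])] += 1
        return arm


# Hypothetical patient record; the levels are invented for the example.
minimizer = Minimizer(["age_band", "gender", "prior_medication"], seed=1)
patient = {"age_band": "<50", "gender": "F", "prior_medication": "no"}
first_arm = minimizer.assign(patient)
second_arm = minimizer.assign(patient)  # an identical patient gets the other arm
```

Because each new allocation depends on the current imbalance, two consecutive patients with identical characteristics end up in different arms, which is how the method keeps small trials balanced on the chosen factors.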

Fig. 1 Flow diagram of the trial, according to CONSORT, showing inclusion, treatment allocation, follow-up, and data analysis (60 months). ITT intention to treat, PP per protocol

All patients presented with a typical clinical history and characteristic findings on esophageal manometry as previously described [4, 5, 11–13]. To be eligible for inclusion, the patients had to be suitable for both treatment alternatives. There were no differences in the background characteristics of the groups (Table 1). All patients were followed for a minimum of 60 months with a median follow-up of 81.5 months.

Table 1 Background characteristics

Treatment failure

The primary endpoint was the cumulative number of treatment failures, which was defined before the start of the study. A composite approach was applied to the definition of a treatment failure. A treatment failure was defined as follows:

  1. Incomplete symptom control or symptom relapse that required more than three additional treatments other than those given initially (surgery or one to two dilatations at an interval of about 10 days).

  2. Relapse which required treatment occurring within 3 months after the initial treatment series.

  3. A serious complication or side effects after treatment that required a switch-over to the alternative strategy.

  4. The patient required or requested alternative treatment or another treatment due to dissatisfaction with the allocated therapy.

  5. The responsible physician recommended that the patient should undergo another treatment after consulting with the Trial Committee.
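
The composite definition above can be read as a logical OR over the five criteria. The following Python sketch shows one way to encode it; the record type `FollowUp` and all of its field names are hypothetical, and treating "within 3 months" as "3 months or less" is an assumption, since the text does not say whether the boundary is inclusive.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FollowUp:
    """Hypothetical per-patient follow-up record (field names are invented)."""
    additional_treatments: int = 0  # treatments beyond the initial series
    months_to_first_retreatment: Optional[float] = None  # None = no retreatment
    switched_after_complication: bool = False
    patient_requested_other_treatment: bool = False
    physician_recommended_other_treatment: bool = False


def failure_criteria(fu: FollowUp) -> List[int]:
    """Return the numbers (1-5) of the failure criteria that are met."""
    met = []
    if fu.additional_treatments > 3:  # criterion 1: more than three extra treatments
        met.append(1)
    # Criterion 2: relapse needing treatment within 3 months (inclusive bound assumed)
    if fu.months_to_first_retreatment is not None and fu.months_to_first_retreatment <= 3:
        met.append(2)
    if fu.switched_after_complication:  # criterion 3
        met.append(3)
    if fu.patient_requested_other_treatment:  # criterion 4
        met.append(4)
    if fu.physician_recommended_other_treatment:  # criterion 5
        met.append(5)
    return met


def is_treatment_failure(fu: FollowUp) -> bool:
    """A patient is a treatment failure if any single criterion is met."""
    return bool(failure_criteria(fu))
```

Returning the list of criteria met, rather than only a boolean, keeps the reason for each failure available for tabulation.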

Procedures

Pneumatic dilatation

Patients allocated to PD were treated under conscious sedation with midazolam and pethidine or under general anesthesia. A dilatation balloon (Rigiflex ABD®, Boston Scientific, Natick, MA) of 30–40 mm was placed with the aid of a gastroscope and a guide wire, and insufflated to 10 psi for 60 s over the gastro-esophageal junction under fluoroscopic guidance. A predefined staged dilatation protocol was followed. At the initial treatment, women were dilated to 30 mm and men to 35 mm. After dilatation, the patency of the gastro-esophageal junction (GEJ) was carefully inspected. Each patient was discharged as soon as he/she had recovered from the conscious sedation/general anesthesia and after having swallowed fluid without significant complaints and/or symptoms. If symptom relief from the first dilatation was insufficient, another was performed within 10 days using 35 mm balloons for women and 40 mm balloons for men. Further dilatations were performed as needed because of symptom relapse.

Laparoscopic myotomy

All patients allocated to surgical myotomy were operated on by one of two experienced surgeons. Myotomy involved division of the entire muscle layer down to the mucosa at least 5 cm above the GEJ and 2–3 cm into the ventral aspect of the stomach to include the sling fibers of the cardia. To reduce the risk of gastro-esophageal reflux, a partial fundoplication according to Toupet [15] was added.

Follow-up

The patients were followed during the first year with regular visits to their respective outpatient clinics [5]. The patients were instructed thereafter to contact the clinic in case of symptom relapse. Structured telephone interviews were carried out at 3 and 5 years by surgeons who were not involved in the initial treatment or follow-up. At the end of the interview, all patients were offered an outpatient clinic visit. Questionnaires to assess dysphagia and QoL were mailed to the patients in conjunction with the interview.

Variables

Symptom control

Dysphagia was assessed with the Watson dysphagia score [16]. This is a validated [17] and well-described instrument for benign dysphagia.

Quality of life

QoL was assessed with the Psychological General Well-Being (PGWB) index and the Gastrointestinal Symptom Rating Scale (GSRS) [18–22]. The PGWB is a generic instrument giving a total score, as well as scores for six different domains (anxiety, depressive mood, positive wellbeing, self-control, general health, and vitality). It is well described and has the advantage of having validated reference values for the general population. The GSRS is a validated disease-specific instrument measuring six different scales (reflux, pain, indigestion, constipation, diarrhea, and dysphagia). GSRS has mainly been used in the field of gastro-esophageal reflux disease.

Health economic evaluation

Direct medical costs were assessed during the initial treatment together with all medical costs associated with gastrointestinal, thoracic, and abdominal symptoms during the first 12 months of follow-up. These costs included in-hospital stays, procedures, medical devices used during procedures, x-ray investigations, manometry, endoscopies, outpatient clinic visits, and visits to the emergency ward. Costs for investigations made purely for study reasons were not included. All costs were taken from charts for within-hospital debits at Sahlgrenska University Hospital in 2006 (when the first evaluation was made). Indirect costs (e.g. costs for medication and costs for sick leave) were not assessed since we, for legal reasons, were unable to confirm them objectively. Costs after 12 months were considered if directly connected to the management of the achalasia.

Statistics

A sample size of 70 patients in each treatment arm was calculated from a 30 % difference in the dysphagia score with a power of 80 % at a significance level of 5 % (p < 0.05), with an interim analysis scheduled after enrollment of half of the patients. Inclusion was halted after 53 evaluable cases for practical reasons and because of difficulty recruiting patients. The SPSS statistical program was used for data analysis. The cumulative number of treatment failures was displayed and the difference between the two protocols was evaluated by the log-rank test. Point prevalences were compared using nonparametric tests (the Mann–Whitney U test and the Wilcoxon signed-rank test). A p value less than 0.05 was considered statistically significant.
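
As an illustration of the log-rank comparison of time to treatment failure, here is a minimal pure-Python sketch of the standard two-group test. It is not the SPSS procedure used in the study; the function name `logrank_test` is chosen for this example, and the follow-up times in the usage lines are invented, not trial data.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*  : follow-up time per patient (e.g. months)
    events_* : 1 if treatment failure was observed, 0 if censored
    Returns (chi-square statistic with 1 df, two-sided p value).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, per group
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)
        # Failures occurring exactly at time t
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance of d_a at this time point
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented example: group A fails early, group B is censored without failure.
chi2, p_value = logrank_test([1, 2, 3, 4, 5], [1, 1, 1, 1, 1],
                             [10, 11, 12, 13, 14], [0, 0, 0, 0, 0])
```

At each failure time the test compares the observed number of failures in one arm with the number expected if both arms shared the same hazard, which is why it uses the whole follow-up period rather than a single time point.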

Ethics

This study was performed according to the Declaration of Helsinki and the Ethical Review Act. The study protocol was approved by the Ethical Review Board of the University of Gothenburg (protocol S500-00). Written informed consent was obtained from each participant before inclusion in the trial. The trial is registered at www.ClinicalTrials.gov (NCT02086669).

Results

Fifty-three patients were available for the intention-to-treat (ITT) analysis. There were no differences in demographic background characteristics between the groups (Table 1). Baseline recordings were obtained regarding the patients’ social and work situation, since these are relevant to the associated indirect medical costs: in the PD group, five were on sick leave before treatment, one was retired, one was on disability pension, and the rest were in the workforce. The corresponding figures for those allocated to LM were four, none, and one, respectively; in addition, one operated patient was a college student. The response rates for the mailed surveys were 20/25 (LM) and 23/28 (PD) at 3 years and 22/25 (LM) and 25/28 (PD) at 5 years. Overall, patients had few complaints of reflux symptoms and a low use of PPI at 3 and 5 years, with no differences between groups.

Treatment failures

At 36 months, nine cases (32 %) in the dilatation group and one case (4 %) in the myotomy group had been classified as treatment failures (p = 0.03). The corresponding figures at 5 years were ten cases (36 %) in the PD group and two cases (8 %) in the LM group, including two patients who were lost to follow-up (one in each arm) (p = 0.016) (Table 2). The Kaplan–Meier analysis showed a significantly shorter time to treatment failure for the PD strategy (p = 0.02) (Fig. 2).
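
The failure proportions above follow directly from the counts (for example, 9 of 28 is 32 %). As a hedged illustration of how two such proportions can be compared, the sketch below implements a two-sided Fisher exact test with the Python standard library. The article does not state which test produced the p values reported for these proportions, so the result of this sketch need not match them; the function name is chosen for this example only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k events in the first row
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


# Counts taken from the text: failures / non-failures in PD (n = 28) and LM (n = 25)
pct_pd_36 = round(100 * 9 / 28)   # percentage of PD failures at 36 months
pct_lm_36 = round(100 * 1 / 25)   # percentage of LM failures at 36 months
p_36 = fisher_exact_two_sided(9, 28 - 9, 1, 25 - 1)
p_60 = fisher_exact_two_sided(10, 28 - 10, 2, 25 - 2)
```

An exact test is a reasonable choice here because one cell of the table contains a single patient, where large-sample approximations are less reliable.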

Table 2 Failures at three and five years after treatment
Fig. 2 a Kaplan–Meier analysis of time to treatment failure for laparoscopic myotomy with partial posterior fundoplication compared to repeated pneumatic dilatation. b Subset table for panel a: patients without treatment failure at each time period

Ten cases were classified as failures due to a change in treatment strategy; eight patients in the PD group had an LM and two LM patients required additional dilatations (Table 2). There were no differences in the failure group versus the successfully treated group with regard to age or sex.

Symptom relief

Patients subjected to myotomy surgery tended to report less dysphagia after 3 years, a difference which was not significant (p = 0.110). On the other hand, the improvement in dysphagia scores was significantly better in the LM group than after PD between the pretreatment situation and that at 3 years, particularly when the differences in individual scores were compared separately (Table 3). After 5 years, this difference did not reach significance.

Table 3 Watson score and QoL (PGWB) at 3 and 5 years

Quality of life

The total PGWB score was significantly higher in LM patients than in the PD group at the three-year follow-up. This difference was recognizable in all domains and reached significance for anxiety and self-control (Table 3), again with a decrease in differences over time. There were no differences in GSRS scores at either 3 or 5 years after treatment.

Costs

The direct medical costs were lower for the PD strategy at 60 months and for the entire study period. The total medical cost for each patient during the first 60 months was $13,215 after LM as compared to $5,247 for PD (p = 0.0001). This was mainly caused by the difference in the initial treatment costs (Table 4).

Table 4 Total costs at 5 years

Discussion

We report the long-term results (≥5 years) of the first randomized, controlled, clinical trial comparing the therapeutic outcome of laparoscopic cardiomyotomy with pneumatic dilatation in newly diagnosed idiopathic achalasia. Our main finding was that the cumulative incidence of treatment failures was higher in the PD group, predominantly during the first 3 years after the initiation of therapy. This is consistent with results reported by Csendes and co-workers. They randomized a corresponding group of patients, some also suffering from Chagas disease, to open surgery or endoscopic dilatation [11, 12, 23]. Although a less strict dilatation protocol was used (PD was performed under general anesthesia and was of short duration), the Csendes group found a significant difference in favor of myotomy, and a progressive accumulation of failures over time among those randomized to pneumatic dilatation. In contrast, no difference between PD and LM was shown either early or after 2 years in the European Achalasia Trial [9]. In that trial, the primary variable was therapeutic success rather than failure and the protocol was more liberal with dilatations. Success was measured using the Eckardt score, a patient-rated score that measures dysphagia, chest pain, and regurgitation. We found a similar trend (i.e., a declining difference in symptoms over time), although over a slightly longer period. It is possible that failure as the primary outcome is a more objective measure.

We used the cumulative occurrence of treatment failures as the primary outcome variable, which so far represents an unusual approach to outcome assessment in achalasia. This methodology has been valuable in comparisons of treatment strategies in, e.g., chronic reflux disease [24]. It may be argued that our composite definition of treatment failures was too rigid and therefore perhaps less clinically applicable. However, this outcome variable offers a dynamic assessment approach that might better describe the clinical management of the patients over time than point prevalence recordings of individual symptoms. An overt perforation in association with the dilatation represents an indisputable failure since it requires surgical correction. Here, two of our 28 patients (7 %) had this complication. Since this is comparable to what is reported in the literature [25], our protocol may be considered acceptably safe.

In this long-term follow-up, patients assessed their ability to swallow using the Watson score [16]. We found that the improvement in dysphagia symptoms that was evident in both groups after 1 [5] and 3 years only reached significance in favor of LM when the difference between pre- and post-operative values was determined (Table 3). After 5 years, however, this was no longer statistically significant (Table 3). This observation is interesting given the natural progressive course of the disease, in which symptoms gradually worsen. In the literature, a cumulative increase in failures over the longer term is evident irrespective of which treatment is used [10, 26]. On the other hand, obstructive complaints and health-related QoL are affected by the withdrawal of treatment failures, which in this case burdens the LM arm over time. An alternative approach would be to allow the last value of these parameters at the time of treatment failure to be carried forward in the subsequent comparison. Such an analysis would obviously strengthen the superiority of LM even further. Our data suggest that LM surgery is superior to PD in providing longer lasting symptom relief. However, neither treatment will prevent the progressive motor dysfunction of the LES and esophageal body or provide a permanent cure.

One measure of treatment success is a composite analysis of the burden of symptoms. We assessed the QoL of these patients before and 12 months after the initiation of the respective therapy [5]. The newly diagnosed, untreated achalasia patients reported a poor QoL, far below what is seen in age- and sex-matched control subjects [18]. A significant improvement in QoL, to levels close to normal, was recorded 1 and 3 years after the initiation of therapy [18, 27]. In the present study, the total PGWB score was significantly higher in LM patients compared to the PD group after 3 years. This difference was evident in all domains, in particular for anxiety and self-control. After 5 years, this difference had diminished.

Health economy is also of relevance in the selection of treatments. We prospectively analyzed the accumulated direct medical costs connected with the respective therapy. The costs per treatment arm from the first intervention through 5 years revealed that the total medical costs were significantly lower for the PD strategy than for LM (Table 4). We found no difference in costs from 3 to 5 years of follow-up.

Despite the small sample size in our study, the differences in the therapeutic efficacy of LM and PD seemed more prominent with the extension of the follow-up period and showed a consistent pattern when additional outcome measures were collected. At the time of enrollment, the significance of the different manometric subtypes of achalasia and their therapeutic impact had not been recognized, and thus stratified data on this could not be presented [28]. In conclusion, LM is followed by a lower risk of treatment failure in newly diagnosed achalasia than PD, although the former carries higher initial costs. This advantage of LM is reinforced by the recording of the patients’ QoL and dysphagia symptoms.