Submucosal tumors (SMTs) are a class of protruding lesions with a normal mucosa-covered surface. SMTs are uncommon entities of the upper gastrointestinal tract, with an estimated overall prevalence of 0.3% in the past [1]. However, the detection rate of SMTs has increased thanks to the development of imaging techniques [2]. SMTs are usually found incidentally without symptoms. The majority of SMTs are benign, although some have malignant potential, especially large ones among those originating from the muscularis propria (MP) layer [3,4,5].

The American Gastroenterological Association (AGA) recommends that gastrointestinal stromal tumors (GISTs) ≤ 3 cm should be followed up by endoscopy or endoscopic ultrasound (EUS), or resected [6]. The National Comprehensive Cancer Network (NCCN) guidelines indicate that GISTs ≥ 2 cm should be resected, while the treatment of GISTs < 2 cm remains controversial [7]. The malignant potential varies with the type of SMT. Therefore, an accurate diagnosis that differentiates truly benign from malignant lesions is essential. EUS-guided fine needle aspiration (EUS-FNA) and biopsy are considered the most reliable methods for the histological diagnosis of SMTs [8,9,10]. Considering their limited diagnostic value and the challenge of preoperative tissue collection [2, 6, 11,12,13], pathological examination seems neither easy nor necessary for easily resectable tumors [7]. Long-term follow-up adds to the financial burden and psychological stress of patients, and may delay the diagnosis and treatment of malignancy [14]. Early resection of SMTs allows confirmation of the pathological diagnosis and achieves clinical cure.

Surgery and endoscopic resection are the two methods currently used to remove SMTs [14]. Compared with open surgery, minimally invasive resection methods, such as video-assisted thoracoscopic enucleation (VATE) and endoscopic resection, have been widely performed and are feasible and safe. Surgical enucleation is widely considered the first treatment choice for SMTs [15, 16]. Endoscopic techniques, such as endoscopic submucosal dissection (ESD), endoscopic submucosal excavation (ESE), and endoscopic full-thickness resection (EFR), are feasible, effective, and safe [17,18,19]. However, they rarely achieve en bloc resection of SMTs originating from the MP while maintaining the integrity of the digestive tract mucosa. Submucosal tunneling endoscopic resection (STER) is a novel endoscopic method used to resect SMTs originating from the MP by establishing a tunnel to maintain the integrity of the digestive tract mucosa [20]. STER is considered superior to other endoscopic methods for SMTs originating from the MP, especially those with a transverse diameter ≤ 35.0 mm [21,22,23]. Both VATE and STER are safe and effective techniques for SMTs originating from the MP. However, only a few retrospective studies comparing STER and VATE for the treatment of esophageal SMTs have been reported, and no prospective randomized controlled trials are available [24, 25]. The aim of this randomized clinical trial was to compare VATE with STER for treating esophageal SMTs.

Patients and methods

Study design

Random assignment was performed using permuted blocks without stratification in a computer-generated random sequence. A total of 66 patients were consecutively randomized to the VATE or STER group from July 2014 to December 2015. After exclusion of 8 patients, the remaining 58 patients undergoing either STER or VATE were included (Fig. 1). The present study was approved by the Ethics Committee of the Chinese People’s Liberation Army General Hospital (S2014-024-02). The trial is registered at http://www.chictr.org.cn/showproj.aspx?proj=4814 (ChiCTR-TRC-14004759).
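The permuted-block allocation described above can be illustrated with a minimal sketch. The block size of 4, the seed, and the function name `permuted_block_sequence` are assumptions made for illustration only; the trial does not report its block size or the software used to generate the sequence.

```python
import random

def permuted_block_sequence(n_patients, block_size=4, arms=("STER", "VATE"), seed=2014):
    """Return an allocation list built from randomly permuted, balanced blocks.

    block_size and seed are hypothetical; the source does not report them.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # fixed seed so the sequence is reproducible
    per_arm = block_size // len(arms)
    sequence = []
    while len(sequence) < n_patients:
        # Each block holds an equal number of assignments to every arm.
        block = [arm for arm in arms for _ in range(per_arm)]
        rng.shuffle(block)  # permute the order of assignments within the block
        sequence.extend(block)
    return sequence[:n_patients]  # the final block may be truncated

allocation = permuted_block_sequence(66)
print(allocation[:8])
```

Because every complete block is balanced, the two arms can never differ by more than half a block at any point during enrollment, which is the usual reason for choosing this design in a small trial.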

Fig. 1 Study flowchart

Mediastinal-enhanced computed tomography (CT) and EUS were conducted before the operation to evaluate tumor size, location, shape, and depth, and to rule out metastasis or invasion outside the digestive tract. Preoperative examinations, including complete blood count, were performed. All patients fasted for 8 h before resection. Adverse events (AEs) during and after the operation were closely monitored. Therapeutic outcomes in both groups were prospectively assessed. Follow-up gastroscopy was performed at 1, 3, 6, 12, and 24 months post-operation.

Patients

This was a single-center, prospective randomized controlled study. Inclusion criteria were (1) age between 16 and 70 years; (2) esophageal SMTs originating from the MP layer confirmed by imaging examinations; (3) SMTs with longest diameter ≥ 10.0 mm, transverse diameter ≤ 35.0 mm, and intact mucosal surface; (4) no malignancy; (5) no signs of metastasis or invasion outside the digestive tract; and (6) signed informed consent. Exclusion criteria were (1) reluctance to undergo VATE or STER, or inability to sign informed consent; (2) intolerance to anesthesia; (3) high-risk operation or pregnancy; and (4) coagulopathy (international normalized ratio > 1.5, platelets < 50,000). Three patients with multiple SMTs were enrolled, and all SMTs were resected during one procedure. Because multiple SMTs made the comparison between STER and VATE difficult, these 3 patients were excluded from the analysis. One patient with an SMT in the lower esophagus, who had previously undergone peroral endoscopic myotomy (POEM) that interfered with the operation, was also excluded.

STER procedure

STER was conducted mainly as previously reported (Fig. 2) [20]. The STER procedures were performed by experts with experience of more than 100 cases of peroral endoscopic myotomy (POEM). Patients were placed in the left lateral position under intravenous anesthesia. A single-channel gastroscope (GIF Q260J/GIF Q290J; Olympus, Tokyo, Japan) equipped with a transparent cap (D-201-11802; Olympus), a high-frequency generator (VIO 200D; ERBE, Tübingen, Germany), and an argon plasma coagulation unit (APC300; ERBE) were used during the procedures. A carbon dioxide (CO2) insufflator (UCR; Olympus) was used for CO2 insufflation. The key steps were as follows: (1) the characteristics of SMTs, such as size, location, and depth, were evaluated with a linear-array echo-endoscope (Prosound F75; Aloka, Tokyo, Japan and GF-UCT260; Olympus) before STER; (2) submucosal injection with an injection needle (NM-4L-1; Olympus) was performed 3–5 cm proximal to the tumor; (3) a mucosal incision was made with a triangular knife (KD-640L; Olympus); (4) a longitudinal tunnel between the submucosal and muscular layers, ending 1–2 cm distal to the tumor, was made using a triangular knife; (5) after complete exposure of the SMT, the tumor was resected using an insulation-tip knife (KD611L, IT2; Olympus) or a triangular knife; a snare (ASM-1-S or ASJ-1-S; Cook, Limerick, Ireland) was needed in some cases; (6) clips (HX-610-135; Olympus) were used to close the incision.

Fig. 2 Key procedures of submucosal tunneling resection. A Endoscopic view of a submucosal tumor located in the middle esophagus. B Endoscopic ultrasound view of the same lesion, showing the tumor originating from the muscularis propria. C Creating a fluid cushion by a submucosal injection. D Making a mucosal incision 5 cm proximal to the submucosal tumors. E Creating a submucosal tunnel to the lesion. F Exposure of the entire tumor. G Tunnel after tumor resection. H Closure of the tunnel entry with clips. I Resected specimen. J Endoscopic view at 1-year follow-up after operation, showing a scar at the mucosal entry

VATE procedure

The VATE procedures were mainly performed by surgeons with experience of more than 1000 cases of thoracoscopic surgery. The patients were placed in the left lateral decubitus position at about a 15° frontal inclination, under general anesthesia with double- or single-lumen intubation. Three to five camera or working ports were placed over the right chest, depending on the location of the mass. In some cases, one of the ports was extended to a working incision of about 3–4 cm to facilitate instrument manipulation. After the lesion was identified, the mediastinal pleura over the tumor was incised longitudinally. The mass was exposed after a longitudinal split of the overlying muscle and carefully enucleated to preserve the vagal branches and prevent mucosal damage. The integrity of the mucosa was then assessed by checking for bubbles in the water-submerged esophagus after insufflating air through the nasogastric tube or by gastroscopy. The muscular layer was closed with interrupted absorbable sutures, and a chest tube was placed through one of the ports for postoperative drainage.

Postoperative management

For STER, the patients fasted for 3 days, received a liquid diet for the following 3 days, and returned gradually to a normal diet within 2 weeks. Chest/abdominal X-ray or CT was performed in case of severe chest pain. For VATE, the patients fasted for 1 day, received a liquid diet for 2 days, and returned gradually to a normal diet within 2 weeks. The chest tube was removed when daily drainage was less than 10 mL. Wound dressing and suture removal were performed at the outpatient clinic.

For all patients, at 1 day and 3–7 days post-operation, complete blood count was performed. Intravenous proton pump inhibitors (PPIs) and antibiotics were used for 3 days, followed by oral PPI therapy for 4 weeks.

Outcome measurements

En bloc resection, complete resection, recurrence rate, residual rate, operation time, length of hospital stay, and cost were compared between the STER and VATE groups as the outcome measures of effectiveness. SMT size was determined by the longest diameter under EUS. Complete resection was defined as en bloc removal of the tumor with negative margins upon pathologic examination. Operators included not only the main surgeon but also auxiliary staff, such as technologists. The entire expense from admission to discharge was included in the cost analysis. To evaluate safety, AEs, including gas-related AEs, perforation, pleural effusion, mucosal injury, fever (temperature > 38 °C), severe chest pain, acute or delayed major bleeding, and stricture, were assessed. Changes in hemoglobin levels between preoperative values and those at 3–7 days post-operation were also evaluated. The average postoperative pain during the first 24 h was scored by the numeric rating scale (NRS): 0, painless; 10, twinge; 1–3, mild pain; 4–6, moderate pain; 7–10, severe pain. Artificial pneumothorax on the right side was required in the VATE group; therefore, pneumothorax in the right chest was not considered an AE.

Statistical analysis

The number of cases per group was estimated based on an average STER time of 84.4 ± 29.1 min and VATE time of 125 ± 57.8 min [21, 26]. At least 22 patients were needed in each group to achieve a statistical power of 90% with a significance level of 5%. We anticipated a 50% dropout rate for each group, and finally enrolled 66 patients. All calculations were performed with SPSS 22.0 software (IBM Corp, Armonk, NY). Quantitative data were expressed as mean ± standard deviation (SD) or median with range, and assessed by Student’s t test, nonparametric test, or analysis of variance (ANOVA). Categorical variables were expressed as proportions, and assessed by χ2 test or Fisher’s exact test. P < 0.05 was considered statistically significant.
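The sample size estimate can be reproduced approximately with the sketch below, which is not necessarily the calculation the authors ran. It assumes a normal-approximation formula for comparing two means with unequal SDs; the function name `n_per_group` is hypothetical. The text does not state whether the 5% significance level was one- or two-sided, so both are computed. Under these assumptions the one-sided setting gives the reported 22 per group, and inflating 2 × 22 by 50% gives 66.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(mean1, sd1, mean2, sd2, alpha, power, two_sided):
    """Normal-approximation sample size per group for comparing two means."""
    z = NormalDist().inv_cdf  # standard normal quantile function
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_beta = z(power)
    delta = abs(mean1 - mean2)  # expected difference in operation time (min)
    return ceil((z_alpha + z_beta) ** 2 * (sd1 ** 2 + sd2 ** 2) / delta ** 2)

# Inputs taken from the text: STER 84.4 +/- 29.1 min, VATE 125 +/- 57.8 min.
one_sided = n_per_group(84.4, 29.1, 125.0, 57.8, alpha=0.05, power=0.90, two_sided=False)
two_sided = n_per_group(84.4, 29.1, 125.0, 57.8, alpha=0.05, power=0.90, two_sided=True)

# Assumption: the 50% allowance is applied as a 1.5x inflation of the total.
total_enrolled = ceil(2 * one_sided * 1.5)

print(one_sided, two_sided, total_enrolled)  # 22 27 66
```

With a two-sided 5% level the same formula asks for 27 patients per group, so the 58 patients analyzed would still cover that more conservative requirement.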

Results

Clinical characteristics

Sixty-six patients with small esophageal SMTs were prospectively randomized from July 2014 to December 2015. After exclusion of 8 patients, 58 patients (46 males and 12 females; mean age 46.1 ± 9.4 years) remained in the STER (n = 30) and VATE (n = 28) groups. The median size of SMTs was 18.1 mm (range 10.0–50.0 mm). One tumor was located in the upper esophagus, 35 in the middle esophagus, and 22 in the lower esophagus. Fifty-four resected SMTs were diagnosed as leiomyomas, three as GISTs, and one as a fibrous tumor. No differences were found between the two groups in age, sex, tumor size, transverse diameter, tumor location, pathological diagnosis, or preoperative hemoglobin levels (all P > 0.05). The median volume of intraoperative infusion in the STER group was smaller than that in the VATE group (P = 0.002). The median volume of postoperative infusion was about 2500 mL in the STER group and about 3000 mL in the VATE group. The detailed characteristics of patients and SMTs in the VATE and STER groups are listed in Table 1.

Table 1 Characteristics of patients and SMTs in the STER and VATE groups

Effectiveness and safety of STER and VATE

En bloc resection was achieved in 26 (83.3%) patients in the STER group and 28 (100%) after VATE, indicating no significant difference between the two groups (P = 0.138). The SMT margins after en bloc resection were all negative. No residual tumor or recurrence was noted in any enrolled patient during follow-up. Effectiveness outcomes are shown in Table 2. Despite similar hospitalization durations, STER was superior to VATE, with shorter operation time, lower cost, and fewer operators needed. There was no significant difference in AEs between the two groups (STER, 16.7%; VATE, 35.7%; P = 0.098) (Table 3). Subcutaneous or mediastinal emphysema was the most common AE related to STER, while moderate fever was the most common AE after VATE. The median decrease in hemoglobin levels post-procedure was −1.6 g/L in the STER group and 14.7 g/L after VATE (P = 0.001). No patient in either group needed blood transfusion. Although few patients complained of severe pain, mild to moderate postoperative chest pain was common. A significant difference in pain scores was found between the STER and VATE groups (P < 0.001) (Table 4).
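As an illustration of how the AE comparison can be checked, the following sketch runs a Pearson χ2 test without continuity correction on a 2 × 2 table. The counts are an assumption inferred from the reported percentages and group sizes (16.7% of 30 is 5 patients; 35.7% of 28 is 10 patients), and the function name is hypothetical; the exact test variant used by the authors is not stated.

```python
from math import erfc, sqrt

def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and P value for a 2x2 table.

    Table layout: [[a, b], [c, d]], rows = groups, columns = AE yes / AE no.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Assumed counts: STER 5 of 30 with AEs, VATE 10 of 28 with AEs.
chi2, p_value = pearson_chi2_2x2(5, 25, 10, 18)
print(round(chi2, 2), round(p_value, 3))  # 2.74 0.098
```

Under these assumed counts the uncorrected test returns the P value of 0.098 reported for AEs.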

Table 2 Effectiveness outcomes in the STER and VATE groups
Table 3 Safety outcomes in the STER and VATE groups
Table 4 Postoperative pain scores in the STER and VATE groups

Discussion

As the most common esophageal SMT, leiomyoma is mostly considered to be benign. However, some tumors, e.g., GISTs, have malignant potential. In contrast to the NCCN guidelines, the European Society for Medical Oncology (ESMO) and Japanese GIST guidelines recommend that all GISTs, regardless of size, should be resected at diagnosis [7, 27, 28]. The cutoff size remains controversial. Considering that it is difficult to differentiate potentially malignant GISTs from benign SMTs without resection, early resection of esophageal SMTs seems essential to prevent SMT-related cancer.

SMTs originating from the MP layer are traditionally difficult to resect under endoscopy, with a high risk of perforation and a low en bloc resection rate. Nowadays, ESD constitutes a promising treatment for SMTs originating from the MP layer, with en bloc resection rates ranging from 64 to 75% [29, 30]. Considering that ESD is more suitable for superficial lesions originating from the mucosal and submucosal layers, ESE and EFR have been modified to resect SMTs originating from the deeper MP layer, with demonstrated usefulness [19, 31, 32]. However, these techniques are hampered by a high risk of postoperative perforation and infection. EFR has been mainly used to treat gastric and colonic lesions, and esophageal SMTs are not considered an indication. STER was first reported as a novel therapy to treat SMTs originating from the MP layer by establishing a tunnel between the submucosal and MP layers to maintain the integrity of the digestive tract mucosa [20]. STER has demonstrated advantages over ESD and ESE in SMTs [21, 33].

VATE was previously considered the best choice for management of esophageal leiomyoma 1–5 cm in diameter [34, 35]. However, whether STER is better than VATE in treating SMTs remains unclear. Only a few retrospective studies, with relatively small sample sizes, have compared STER and VATE [24, 25]. The present study was prospectively designed as a randomized controlled trial to compare the effectiveness of these two techniques in the management of patients with esophageal SMTs originating from the MP layer.

As shown above, the en bloc resection rate of VATE was higher than that of STER, although the difference was not statistically significant. While 16.7% of SMTs treated by STER failed to achieve en bloc resection, no residual tumor or recurrence was observed in any enrolled patient during follow-up, even after piecemeal resection. STER was superior to VATE, with shorter operation time, lower cost, and fewer operators required. AEs seemed less frequent with STER than with VATE (16.7% vs. 35.7%), although the difference was not significant. AEs in both the STER and VATE groups were treated conservatively. STER had the advantage of milder postoperative chest pain over VATE. Previous findings revealed comparable treatment efficacy between STER and VATE, with STER showing the advantages of shorter operation time, milder hemoglobin level decrease, lower cost, and reduced chest pain [24, 25], corroborating the current study. This study revealed that the median decrease in hemoglobin levels post-procedure was significantly milder in the STER group than in the VATE group. However, the greater volume of infusion, both intraoperative and postoperative, in the VATE group might have biased the results. We speculate that the difference in hemoglobin changes between the two groups might have been smaller with similar infusion volumes. However, wound effusion was more pronounced in the VATE group than in the STER group. We believe that this bias is unlikely to affect our conclusion.

Previous studies showed that the inner diameter of the tunnel is approximately 35 mm; therefore, the transverse diameter of SMTs treated by STER cannot exceed 35 mm [36, 37]. The longest diameter can reach 70 mm, and a few studies have reported STER for SMTs > 35 mm [25, 38,39,40]. In SMTs < 20.0 mm, both techniques had the same satisfactory en bloc resection rate of 100% in the current study. SMTs < 20.0 mm are often missed by video-assisted thoracoscopy. Therefore, endoscopic assistance is needed to determine the exact location of tumors. Right VATE was performed in all patients because of the operators’ preference and experience, as well as the location of the heart; for tumors located on the left side of the esophagus, the esophagus therefore had to be overturned for better exposure of the mass. The right lateral position was not suitable even for SMTs located on the left side of the esophagus, because the heart and many large vessels are located in the left thoracic cavity. Overturning the esophagus was not only time-consuming but also rarely provided an operative view as clear as that obtained for SMTs on the right side. Managing lesions on the left side seemed more challenging to VATE operators than managing those on the right side, while left or right location made no difference in the STER procedure. Taking the advantages and disadvantages of STER and VATE into consideration, we recommend STER as a preferable choice for SMTs < 20.0 mm without abundant blood supply, while VATE is superior for SMTs > 35 mm with abundant blood supply.

Resecting an SMT near the aortic arch is challenging. Indeed, aortic pulsation makes it difficult to establish a tunnel and resect the tumor during STER. When a tumor is resected by VATE, the esophagus is overturned, and special attention should be paid to avoid vessel rupture, which may lead to hemorrhage. Arterial rupture, whether during STER or VATE, is life-threatening. A patient with a suspected GIST near the aortic arch was enrolled in our study and underwent VATE. Fortunately, no vessel injury was encountered during the operation. We suggest that for a small, asymptomatic SMT near the aortic arch that is suspected to be benign, surveillance might be a good choice. For an SMT located in the upper esophagus near the esophageal inlet, there may not be enough room to establish a tunnel either; STER is not suitable in this case.

This study had several limitations. First, operators performing the procedures were not blinded to STER and VATE; however, the study design made this bias inevitable. In addition, postoperative treatments differed between the two groups, which may have affected the occurrence of AEs. Moreover, patients undergoing STER fasted for 3 days, while patients undergoing VATE fasted for only 24 h. STER was a novel technique; therefore, operators were more conservative, whereas surgeons performing VATE seemed bolder. Our own protocol may be the main reason why the length of stay was the same. Had the patients fasted for the same number of days in both groups, STER might have resulted in a shorter hospital stay. Finally, the sample size was relatively small, and this was a single-center study. Therefore, further multi-center studies involving larger populations are needed to confirm our findings.

In conclusion, STER and VATE are comparably effective for esophageal SMTs. However, STER is superior to VATE, with shorter operation time and lower cost. STER seems safer than VATE, with a milder decrease in hemoglobin level and reduced postoperative pain. STER might be a preferable choice for SMTs < 20.0 mm without abundant blood supply, while VATE is superior for SMTs > 35 mm with abundant blood supply.