Thyroid carcinoma is an endocrine malignancy whose incidence is rising rapidly worldwide [1], with papillary thyroid cancer (PTC) being the most frequent type and carrying an excellent prognosis [1, 2]. Surgery has been the main treatment option for PTC for more than a century, but conventional open thyroidectomy (COT) leaves a noticeable scar in the anterior neck area [3]. With the rise in young women needing treatment for thyroid carcinoma, patients are interested not only in oncologic safety but also in the cosmetic outcomes of treatment [4]. Therefore, the thyroid surgeon must consider not only the successful treatment of the disease, but also quality-of-life factors such as post-surgery scarring and patient cosmetic satisfaction [4]. Since the development of endoscopic thyroidectomy by Hüscher [5], and its application to papillary thyroid microcarcinoma by Miccoli et al. [6], an increasing number of studies have reported on the applicability of total endoscopic thyroidectomy (TET) for treatment of thyroid cancer in low-risk patients, and compared the surgical outcomes with those of COT. In these studies, TET showed surgical outcomes similar to those of COT in selected low-risk patients with PTC. Despite these studies, the oncological safety of TET as a general treatment for malignant thyroid tumors remains to be proven.

High-quality meta-analysis is increasingly regarded as one of the key tools for providing evidence of clinical efficacy [7, 8]. While previous meta-analyses [9, 10] have reported on the oncologic outcomes of using TET to treat PTC, they did compare this with COT. Additionally, since these studies are more than 4 years old and the Cochrane Database of Systematic Reviews requires that meta-analysis data be updated every 2 years, a thorough review comparing the treatment outcomes of TET and COT for PTC, including recent studies, is both timely and necessary. In this review we specifically addressed the following points: (1) the completeness of thyroidectomy, (2) recurrence of the tumor after long-term follow-up, and (3) outcomes in the subgroup of patients undergoing central lymph node dissection (CLND).

Materials and methods

Protocol and registration

This meta-analysis was performed according to the criteria in the Cochrane Handbook for Systematic Reviews of Interventions [11] and presented using the Preferred Reporting Items for Systematic Reviews and Meta-analyses guidelines [12,13,14]. The protocol used for this meta-analysis is available in PROSPERO. A MeaSurement Tool to Assess systematic Reviews (AMSTAR) was used to assess methodological quality in the studies under review [15, 16].

Data sources and searches

A systematic literature search was conducted during March 2019 using four electronic databases: PubMed, Embase, Web of Science and the Cochrane Library. The search strategy used a combination of the following MeSH terms: “laparoscopy” or “endoscopy” or “minimally invasive” and “thyroidectomy” and “papillary thyroid cancer” or “thyroid cancer” and “conventional open thyroidectomy”. No restriction was applied to the language or date of publication. All abstracts that met the search criteria were read and analyzed for suitability. All studies identified were downloaded to a reference manager software suite (EndNote) and searched to identify duplicates, which were then removed from the dataset.

Eligibility criteria

We included all studies that directly compared the outcomes of TET with COT for treatment of patients with PTC. Both randomized clinical trials (RCTs) and non-randomized clinical trials were included in the dataset. Studies that did not compare TET with COT, or did not conduct statistical analyses of the clinical data, were excluded, as were duplicate publications, narrative reviews, and opinion pieces. Two reviewers independently assessed eligibility by examining the titles and abstracts identified by the electronic database search. Any differences between the two reviewers in applying the inclusion or exclusion criteria were resolved by thorough discussion and mutual agreement.

Data extraction and quality assessment

Two reviewers independently extracted the data from all eligible studies using a standardized format. Data extracted from each report included first author and year of publication, number of patients, study design, participant characteristics, operative details, and postoperative outcomes. Intraoperative and postoperative outcomes were used to compare the efficacy of TET with COT treatment for PTC.

Surgical outcomes included operative time, number of removed lymph nodes, intraoperative bleeding, cosmetic satisfaction and length of hospital stay. Adverse events and complications included transient recurrent laryngeal nerve (RLN) palsy, permanent RLN palsy, transient hypocalcemia and permanent hypocalcemia. Surgical completeness included postoperative thyroglobulin (TG) levels and tumor recurrence after long-term follow-up.

The methodologic robustness of the included studies in this meta-analysis was determined using the Downs and Black scale [17], a validated tool for assessing randomized and nonrandomized studies. The scale consists of 27 items evaluating study reporting, as well as external and internal validity, and the power of the study to determine if the findings are statistically relevant, with a maximum assigned score of 32 (highest assessment). Refer to Table 1 for characteristics of studies included in the meta-analysis.

Table 1 Characteristics of the included studies

Data synthesis and analysis

The meta-analysis was performed using Stata software (version 13.0). A formal meta-analysis was conducted for all studies comparing the results of TET and COT for treatment of PTC. Pooled estimates of outcomes were calculated using a fixed-effects model. For dichotomous data, the results for each study were expressed as an odds ratio (OR) with 95% confidence intervals (CI). For continuous outcomes, the effect size was measured as the weighted mean difference (WMD) with 95% CI. The test for homogeneity of effects was performed using χ² tests, with p ≤ 0.05 indicating significant heterogeneity. The heterogeneity between effect estimates for comparable outcomes across studies was quantified using the I² inconsistency statistic, which gives the percentage of total variation across studies that reflects heterogeneity rather than chance. A sensitivity analysis was subsequently conducted by eliminating each study from the analysis in turn. Potential publication bias was assessed by visually inspecting Begg’s funnel plots, in which the log OR is plotted against the standard error (SE).
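As an illustration of the pooling described above, the following minimal Python sketch computes a fixed-effect pooled OR, Cochran's Q (the χ² homogeneity statistic) and I². It is not the Stata code used for this analysis: it assumes inverse-variance weighting on the log OR scale, which may differ from the weighting applied by Stata, and the 2×2 counts and function names are hypothetical.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Return (log OR, variance) for one 2x2 table.

    a, b: events and non-events in the first group
    c, d: events and non-events in the second group
    All four cells are assumed to be non-zero (no continuity correction).
    """
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance


def pool_fixed_effect(effects, variances):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I-squared."""
    weights = [1 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    se = math.sqrt(1 / total_weight)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of variation beyond what chance alone would produce.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, se, q, i_squared


# Hypothetical 2x2 tables, one per study (not data from the included studies).
tables = [(12, 88, 20, 180), (8, 92, 15, 185), (5, 45, 9, 91)]
effects, variances = zip(*(log_odds_ratio(*t) for t in tables))
pooled, se, q, i_squared = pool_fixed_effect(effects, variances)
ci_low = math.exp(pooled - 1.96 * se)
ci_high = math.exp(pooled + 1.96 * se)
print(f"OR {math.exp(pooled):.2f} (95% CI {ci_low:.2f} to {ci_high:.2f}), "
      f"Q = {q:.2f}, I2 = {i_squared:.1f}%")
```

Q is compared against a χ² distribution with k − 1 degrees of freedom, where k is the number of studies; I² is truncated at zero when Q falls below its degrees of freedom.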

Subgroup analysis

The surgical management of PTC, especially regarding the necessity of central lymph node dissection (CLND), remains controversial. Therapeutic CLND is the accepted treatment for advanced disease, as the American Thyroid Association (ATA) recommends cervical node dissection in patients with clinically involved (cN1) cervical lymph nodes [18]. However, the role of prophylactic CLND in patients with clinically negative (cN0) lymph nodes remains controversial due to limited survival benefits and associated complications, such as hypoparathyroidism, recurrent laryngeal nerve injury and increased operation time [19, 20]. In this meta-analysis, 17 studies included cases with lymph node dissection in the central region, so we conducted a subgroup analysis on this group of studies.

Results

Study selection and characteristics

Seven hundred and fifty-one potentially relevant articles were identified by the database search strategy. Twenty-five articles were selected for further investigation; of these, five studies reported both benign and malignant tumors. Finally, a total of 20 nonrandomized studies [2, 4, 21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38] matched the inclusion criteria. Two of these studies were prospective and 18 were retrospective. Study quality was assessed using the Downs and Black scale; scores ranged from 20 to 32 points, with an average of 28.5 points. A flowchart describing the reference selection process is shown in Fig. 1.

Fig. 1 The PRISMA flowchart of the literature search

The general characteristics and pathological details of the included studies are summarized in Table 1. A total of 5664 patients with PTC were included in the 20 studies, of whom 1633 underwent TET, and 4031 underwent COT. All these studies were performed in China and Korea. Five surgical approaches were used for TET in the 20 studies, including the bilateral axilla-breast approach, unilateral axilla-breast approach, chest-breast approach, axilla approach and breast approach. Surgery in both treatment groups involved different extents of thyroidectomy, total thyroidectomy (TT) or less-than total thyroidectomy (LTT) which encompassed subtotal thyroidectomy (ST), hemithyroidectomy (HT), and lobectomy (Lob). Central lymph node dissection was performed in 17 studies.

Meta-analysis results

Surgical outcomes

The operative time was reported in fifteen studies. Meta-analysis of the pooled data revealed that operative time was significantly longer in the TET group than in the COT group (pooled WMD −50.28; 95% CI −59.62 to −40.94; p < 0.05) with high heterogeneity (I² = 98.8%). In the subgroup analysis of studies incorporating CLND, the operative time of TET was also longer than that of COT (WMD −52.02; 95% CI −65.47 to −38.57; p < 0.05), again with high heterogeneity (I² = 98.1%); (Fig. 2A).

Fig. 2 Meta-analysis forest plots of A operative time, B number of removed lymph nodes, C incidence of intraoperative bleeding, D patient satisfaction with the cosmetic results, E length of hospital stay, F incidence of transient RLN palsy, G occurrence of transient hypocalcemia, H permanent postoperative RLN palsy, I permanent postoperative hypocalcemia, J postoperative TG levels < 1.0 ng/ml and K recurrence of postoperative tumors

The number of removed lymph nodes was reported in twelve studies, and was significantly lower in the TET group than in the COT group (WMD 0.46; 95% CI 0.09 to 0.83; p < 0.05), with no significant heterogeneity (I² = 39.7%); (Fig. 2B).

The incidence of intraoperative bleeding was reported in five studies, and was not significantly different between the TET group and the COT group (pooled SMD −0.10; 95% CI −0.25 to 0.05; p > 0.05) with high heterogeneity (I² = 73.1%); Egger’s test p = 0.061; (Fig. 2C).

Patient satisfaction with the cosmetic results was reported in three studies and evaluated using a simple self-scoring system (1: Extremely; 2: Fairly; 3: Normal; 4: Not at all) 6 months post-surgery. Patients were generally more satisfied with the cosmetic outcome in the TET group than in the COT group (pooled WMD 1.73; 95% CI 1.05 to 2.40; p < 0.05; I² = 91.1%; Egger’s test p = 0.568); (Fig. 2D).

No significant differences were observed between the two operative methods in terms of length of hospital stay (pooled WMD −0.26; 95% CI −1.05 to 0.52; p > 0.05; I² = 98.9%; Egger’s test p = 0.001); (Fig. 2E).

Adverse events and complications

The incidence of transient RLN palsy was reported in nineteen studies, and was found to be lower in the COT group than in the TET group (pooled OR 0.41; 95% CI 0.31 to 0.54; p < 0.05; I² = 98.8%; Egger’s test p = 0.702). In the subgroup analysis of studies utilizing CLND, the incidence of transient RLN palsy was also lower for COT than for TET (pooled OR 0.43; 95% CI 0.29 to 0.62; p < 0.05; I² = 12.7%); (Fig. 2F).

The occurrence of transient hypocalcemia was reported in fourteen studies, and was found to be lower in the TET group than in the COT group (pooled OR 1.66; 95% CI 1.39 to 1.99; p < 0.05; I² = 83.9%; Egger’s test p = 0.740). In the subgroup analysis of treatments utilizing CLND, the occurrence of transient hypocalcemia was also lower for TET than for COT (pooled OR 2.58; 95% CI 1.97 to 3.38; p < 0.05; I² = 89.3%). No significant difference in the occurrence of transient hypocalcemia was observed between the two operative methods in patients who did not undergo CLND (pooled OR 1.07; 95% CI 0.83 to 1.37; p = 0.183; I² = 33.8%); (Fig. 2G).

There were no significant differences observed between the two operative methods in terms of permanent postoperative RLN palsy (pooled OR 0.65; 95% CI 0.34 to 1.26; p > 0.05; I² = 0%). In the subgroup analysis, no significant differences were observed between the two operative methods in patients where CLND was utilized (pooled OR 0.48; 95% CI 0.22 to 1.06; p > 0.05; I² = 0%); (Fig. 2H).

When permanent postoperative hypocalcemia was compared, no significant differences were noted between the two operative methods (pooled OR 1.54; 95% CI 0.93 to 2.55; p > 0.05; I² = 0%). This was also true for patients where treatment included CLND (pooled OR 1.50; 95% CI 0.76 to 2.93; p > 0.05; I² = 0%); (Fig. 2I).

Surgical completeness

No significant differences were observed between the two operative methods in terms of postoperative TG levels < 1.0 ng/ml (pooled OR 1.10; 95% CI 0.65 to 1.88; p > 0.05; I² = 43.6%; Egger’s test p = 0.191); (Fig. 2J).

Recurrence rates

There were no significant differences in the recurrence of postoperative tumors observed between the two operative methods (pooled OR 2.21; 95% CI 0.87 to 5.12; p > 0.05; I² = 0%); (Fig. 2K).
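The relationship between a reported confidence interval and the stated significance level can be checked with a short calculation. The sketch below is illustrative only and is not part of the original analysis: it assumes a Wald-type 95% interval that is symmetric on the log OR scale, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def z_and_p_from_ci(or_estimate, ci_low, ci_high):
    """Approximate SE, z statistic and two-sided p-value from an OR and its 95% CI.

    Assumes the interval was built as exp(log OR +/- 1.96 * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    z = math.log(or_estimate) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p


# Pooled recurrence estimate reported in the text: OR 2.21, 95% CI 0.87 to 5.12.
se, z, p = z_and_p_from_ci(2.21, 0.87, 5.12)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Because the interval includes 1, the approximate p-value exceeds 0.05, in line with the reported result. An interval that includes 1 indicates only that a difference was not detected, and a wide interval such as this one remains compatible with a clinically relevant difference in either direction.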

Sensitivity analysis and publication bias

In our study, we performed a sensitivity analysis by investigating the influence of each individual study on the overall pooled estimates by systematic elimination of each study from the pooled analysis in turn. Our results suggested that the influence of each individual data set on the pooled OR and MD was not statistically significant.
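The leave-one-out procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code used in this study: it assumes inverse-variance fixed-effect pooling of log ORs, and the effect sizes and function names are hypothetical.

```python
import math


def pool_fixed(effects, variances):
    """Inverse-variance fixed-effect pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    return estimate, math.sqrt(1.0 / total)


def leave_one_out(effects, variances):
    """Re-pool the data k times, omitting one study each time.

    Returns a list of (omitted index, pooled estimate, CI low, CI high).
    """
    results = []
    for i in range(len(effects)):
        kept_effects = effects[:i] + effects[i + 1:]
        kept_variances = variances[:i] + variances[i + 1:]
        est, se = pool_fixed(kept_effects, kept_variances)
        results.append((i, est, est - 1.96 * se, est + 1.96 * se))
    return results


# Hypothetical log odds ratios and variances (not data from the included studies).
log_ors = [-0.9, -0.7, -1.1, -0.8]
variances = [0.04, 0.09, 0.16, 0.05]
for index, est, low, high in leave_one_out(log_ors, variances):
    print(f"omit study {index}: OR {math.exp(est):.2f} "
          f"(95% CI {math.exp(low):.2f} to {math.exp(high):.2f})")
```

If every leave-one-out interval lies on the same side of the null value as the full pooled estimate, no single study is driving the conclusion.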

Publication bias was only analyzed for the 6 treatment outcomes that were included in 10 or more studies [39]. After viewing the funnel plots and Egger’s tests for these outcomes it was concluded that there was no publication bias noted for these 6 treatment outcomes.
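For reference, Egger's test regresses each study's standardized effect (effect divided by its SE) on its precision (1/SE); an intercept far from zero suggests funnel plot asymmetry. The sketch below is a minimal illustration with hypothetical inputs and function names, not the Stata routine used here. It returns the intercept and its t statistic; the p-value would be read from a t distribution with k − 2 degrees of freedom using a statistics package.

```python
import math


def egger_regression(effects, standard_errors):
    """Egger's regression: standardized effect on precision, by least squares.

    Returns (intercept, slope, t statistic of the intercept).
    """
    n = len(effects)
    if n < 3:
        raise ValueError("Egger's regression needs at least three studies")
    precision = [1.0 / s for s in standard_errors]
    standardized = [y / s for y, s in zip(effects, standard_errors)]
    mean_x = sum(precision) / n
    mean_z = sum(standardized) / n
    sxx = sum((x - mean_x) ** 2 for x in precision)
    sxz = sum((x - mean_x) * (z - mean_z)
              for x, z in zip(precision, standardized))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    residuals = [z - (intercept + slope * x)
                 for x, z in zip(precision, standardized)]
    residual_variance = sum(r * r for r in residuals) / (n - 2)
    se_intercept = math.sqrt(residual_variance * (1.0 / n + mean_x ** 2 / sxx))
    # With a perfect fit the standard error is zero and t is undefined.
    t_stat = intercept / se_intercept if se_intercept > 0 else math.nan
    return intercept, slope, t_stat


# Hypothetical log odds ratios and standard errors (not data from this review).
log_ors = [-0.85, -0.60, -1.20, -0.75, -0.95]
ses = [0.20, 0.30, 0.45, 0.25, 0.35]
intercept, slope, t_stat = egger_regression(log_ors, ses)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}, t = {t_stat:.2f}")
```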

Discussion

In this study, we carried out a systematic review and meta-analysis of the clinical literature comparing treatment outcomes for PTC published since 2007. The surgery outcomes, incidence of complications, recurrence rates, and postoperative TG levels of TET and COT as treatments for PTC were compared. Although previous meta-analyses [9, 10] have been published regarding TET and COT as treatments for PTC, the oncologic effectiveness and the long-term effectiveness of TET applied to thyroid cancer are still controversial. This study included the most recent studies in the overall analysis to compare the effectiveness of TET or COT as treatment options for PTC, some of which included long-term follow-up results. To our knowledge, this is the largest and the most comprehensive meta-analysis review comparing the treatment outcomes and occurrence of adverse events for TET and COT.

Unlike previous studies [9, 10], this meta-analysis showed that TET is associated with lower rates of transient hypocalcemia than COT. The other findings from previous reviews are upheld in this more extensive study: TET is associated with better patient cosmetic satisfaction, longer operative time, higher rates of transient RLN palsy, and fewer retrieved lymph nodes. Additionally, TET was found to have equivalent rates of postoperative tumor recurrence, intraoperative bleeding, and postoperative thyroglobulin levels after withdrawing thyroid hormone treatment when compared with COT.

The results of this meta-analysis indicated that the operative time was significantly longer in the TET group than in the COT group, with high heterogeneity, similar to previous studies [9, 10]. This is due to the additional time required to create the flap for TET, a procedure not required for conventional thyroidectomy. However, this additional time is reduced for experienced surgeons and by advances in instrument technology [4].

This study found that the number of lymph nodes removed during TET was significantly lower than in the COT group, consistent with previous studies [9, 10]. According to the 2015 ATA guidelines, central neck dissection is recommended as the therapy for clinically involved central nodes, but prophylactic central neck dissection is not appropriate for T1/T2, noninvasive, clinically node-negative PTC [2]. Therefore, more precise central lymph node dissection must be performed when endoscopic thyroidectomy is conducted to treat PTC.

The incidence of transient RLN palsy was found to be lower in the COT group than in the TET group. This was especially significant in the subgroup analysis of treatment involving lymph node dissection in the central neck region. With improvements in endoscopic visualization technology, skilled surgeons can easily expose the recurrent laryngeal nerve to reveal its position and precisely define its course; however, thermal damage from the ultrasonic scalpel may cause injury to the RLN [2]. In our study the rate of transient hypocalcemia was found to be lower in the TET group than in the COT group, especially in the subgroup analysis of lymph node dissection in the central region. This is because the parathyroid glands are easily identified using the optics of the endoscope [25]. Results from this meta-analysis also indicate that there is no significant difference between the treatment groups in the incidence of intraoperative bleeding, permanent RLN palsy and permanent hypocalcemia, or in the length of hospital stay.

Analysis of the patient satisfaction scores for cosmetic outcome found that the TET group scored higher than the COT group. An obvious scar on the neck left after COT causes psychological concerns in patients [25]. TET is a surgical technique that is associated with good cosmetic outcomes. While it is difficult to conduct an objective evaluation, it appears that the main advantage of endoscopic thyroidectomy is the cosmetic result of not leaving a scar on the anterior neck [40]. Studies comparing cervical and extracervical approaches for endoscopic thyroidectomy reported that the extracervical approaches are successful treatments for thyroid tumors with the added advantage of not leaving scars on the neck [41].

This is the first study to report on the postoperative thyroglobulin levels after withdrawing thyroid hormone treatment, and the meta-analysis showed that the levels for TET are comparable with those for COT. The stimulated TG level is considered a reliable surrogate marker reflecting the amount of remaining thyroid tissue [42]. Therefore, in our study these data were used to further compare the outcomes of TET and COT in terms of surgical completeness. The long-term oncologic effectiveness of TET is one of the most important concerns in its use as a treatment for PTC, requiring more studies with data from long-term follow-up consultations. While there are previous reviews on thyroid cancer treatment [9, 10], our study is the first to report pooled postoperative recurrence rates for thyroid cancer. The pooled data from five studies show no significant difference between the two groups in recurrence rates, with no observed heterogeneity among the studies. This suggests that the long-term oncologic effectiveness of TET and COT is comparable.

There are some limitations to this meta-analysis. First, all included studies were nonrandomized observational clinical studies, and all were performed in China and Korea, potentially limiting the generalizability of the findings beyond patients of Asian descent. Second, some heterogeneity was observed in the data between the two treatment groups, possibly due to differences in patient selection, surgical approaches, and surgeon experience. Third, TET introduces complications of its own, including postoperative bleeding and skin burns in the tunnel (tract) area, which have no direct counterpart in COT for comparison [21, 25]. Despite these limitations, this study provides support for the efficacy of TET in the treatment of PTC.

Conclusions

The study provides strong evidence that TET is as safe and effective a treatment for thyroid cancer as COT. Indeed, the tumor recurrence rate and the level of surgical completeness in TET appear to be equivalent to those of COT. Additionally, TET was associated with a significantly lower rate of transient hypocalcemia and better cosmetic satisfaction. Consequently, TET is the preferred option for thyroid cancer patients with cosmetic needs. Further studies using randomized clinical trials and larger patient cohorts, including long-term follow-up data, are essential to further demonstrate the value of TET in the treatment of PTC.