Conventional open thyroidectomy using the Kocher incision was the standard approach until the emergence of endoscopic and robotic technologies. Recently, many surgeons have attempted extracervical approaches via the axillae or breast for improved esthetic results [1]. Among extracervical approaches, the gasless transaxillary (TA) endoscopic thyroidectomy and the gasless TA robotic thyroidectomy were introduced in 2006 and 2009, respectively [2, 3]. These techniques are known to be safe and efficacious in the hands of experienced surgeons, and cosmetic and surgical outcomes are known to be comparable with conventional open thyroidectomy [4, 5].

Besides the well-known complications of thyroidectomy, TA thyroidectomy poses additional risks that are not typically associated with thyroid surgery but are instead related to the new approach and the surrounding anatomy. These include stretch injury to the brachial plexus, perforation of the esophagus, and injury to the carotid artery and/or internal jugular vein [6, 7]. Moreover, rare complications due to injury of neck muscles and traction injury have also been reported recently [8]. Transaxillary thyroidectomy has an additional disadvantage in that it is significantly more expensive than the standard cervical approach, especially for robotic thyroidectomy [9].

In addition to these disadvantages, the lack of data on long-term voice and functional outcomes in patients undergoing total thyroidectomy mandates prospective studies, although some reports have demonstrated the excellence of TA thyroidectomy (including both hemithyroidectomy and total thyroidectomy) in cosmetic and functional outcomes even with neck dissection [10]. Clinically, we sometimes encounter patients with a stiff neck, especially around the sternocleidomastoid muscle, and voice or swallowing dysfunction long after surgery with a TA approach. In this prospective study, we aimed to evaluate the long-term voice and functional outcomes of TA total thyroidectomy in comparison with conventional open thyroidectomy.

Materials and methods

Study population

The study protocol was approved by the Institutional Review Board (IRB) of the Korea University College of Medicine. From May 2011 to December 2013, patients undergoing total thyroidectomy with either of the two approaches were included in this prospective study. Inclusion criteria for enrollment were as follows: (1) underwent total thyroidectomy; (2) no clinical evidence of neck node involvement (central or lateral); and (3) age ≥ 20 years. Enrollment exclusion criteria were as follows: (1) a diagnosis of anaplastic or medullary carcinoma; (2) preoperative evidence of nodal disease or lymph node dissection of any lateral compartment during the initial operative intervention; and (3) patients who did not complete 1 year of follow-up.

Evaluations of the surgical approach

All patients were informed about the operative techniques for both conventional open and TA approach thyroidectomy, and patients subsequently chose their preferred surgical procedure. Informed consent was obtained from each patient for the endoscopic thyroidectomy and for the possibility of a conversion to an open thyroidectomy. All operations were performed by a single high-volume surgeon (KY Jung), and the surgical procedure has been described previously [11]. Patients were categorized as the TA group (those with TA approach thyroidectomy) and the conventional group (those with conventional open thyroidectomy).

Preoperative evaluations were completed for all patients using ultrasonography, fine-needle aspiration cytology, computed tomography, and thyroid hormone tests. Intraoperative and pathologic findings, including invasion to surrounding tissue, tumor and thyroid size, tumor side and location, multiplicity, extrathyroidal extension, and number of inadvertently excised parathyroid glands, were evaluated. Operative time, amount of total drainage, and complications, including vocal cord paralysis, hypoparathyroidism, and postoperative hematoma, were evaluated and compared between surgical groups.

Analysis of voice quality

All patients underwent repeated functional evaluations before surgery and postoperatively at 1 week and 1, 3, 6, and 12 months using a comprehensive battery of functional assessments. Acoustic voice analysis was performed under identical conditions by a single voice specialist. The voice specialist performed a perceptual rating of subjective voice assessments using the GRBAS scale. The GRBAS scale consists of the following five parameters: overall grade of hoarseness (G), roughness (R), breathiness (B), asthenia (A), and strain (S). A 4-point grading scale was used for each parameter, where 0 = normal, 1 = slight, 2 = moderate, and 3 = severe. Acoustic voice analysis was performed using the multi-dimensional voice program (MDVP) and the voice range profile program (VRP) of the CSL for maximal vocal pitch (MVP) (Kay Elemetrics, Lincoln Park, NJ, USA).

Assessment of subjective functional status and satisfaction

The subjective voice outcome was analyzed using the voice handicap index (VHI) [12], which includes functional, physical, and emotional subscales that measure defects in verbal communication. Each item on the VHI is scored using a 5-point scale (range 0–4), with a highest possible score of 120 points. The dysphagia handicap index (DHI) [13] is a 25-item questionnaire that describes the handicapping effect of dysphagia, including functional, physical, and emotional subscales, and measures defects in swallowing function. Each item on the DHI is scored according to a 3-point scale, with the highest possible score being 75 points. On both the VHI and DHI, a higher score is indicative of a greater perception of functional disability.
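As an illustration of the VHI scoring described above, the following is a minimal Python sketch and not the scoring procedure used in this study. It assumes the standard 30-item VHI with 10 items in each of the three subscales, which is consistent with the 120-point maximum on a 0–4 scale; the function name vhi_scores and the input layout are hypothetical.

```python
from typing import Dict, Sequence

# Assumption: standard 30-item VHI, 10 items per subscale, each scored 0-4.
SUBSCALES = ("functional", "physical", "emotional")
ITEMS_PER_SUBSCALE = 10
MAX_ITEM_SCORE = 4


def vhi_scores(responses: Dict[str, Sequence[int]]) -> Dict[str, int]:
    """Return subscale totals and the overall VHI total (0-120).

    `responses` maps each subscale name to its 10 item responses.
    A higher total indicates a greater perceived voice handicap.
    """
    scores: Dict[str, int] = {}
    for name in SUBSCALES:
        items = responses[name]
        if len(items) != ITEMS_PER_SUBSCALE:
            raise ValueError(f"{name}: expected {ITEMS_PER_SUBSCALE} items")
        if any(not 0 <= item <= MAX_ITEM_SCORE for item in items):
            raise ValueError(f"{name}: item scores must be 0-{MAX_ITEM_SCORE}")
        scores[name] = sum(items)
    scores["total"] = sum(scores[name] for name in SUBSCALES)
    return scores


if __name__ == "__main__":
    # Hypothetical responses for one patient, for illustration only.
    example = {
        "functional": [1, 0, 2, 1, 0, 0, 1, 0, 0, 1],
        "physical": [2, 1, 1, 0, 0, 1, 2, 0, 1, 0],
        "emotional": [0, 0, 1, 0, 0, 0, 1, 0, 0, 0],
    }
    print(vhi_scores(example))
```

Validating the item count and range before summing guards against incomplete questionnaires being silently scored as low handicap.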

To assess patients’ subjective pain and cosmetic satisfaction, a visual analogue scale (VAS) was used for pain (0 = no pain, 5 = moderate pain, and 10 = most severe pain) and patient satisfaction (0 = dissatisfied, 1 = acceptable, 2 = satisfied, and 3 = extremely satisfied). Paresthesias in the neck and chest areas were evaluated using a questionnaire. For each item on the questionnaire, patients were asked to rate their response on a 4-point scale, where 0 = not at all, 1 = mild, 2 = moderate, and 3 = severe, using a reference period.

Statistical analysis

Results for continuous variables are presented as mean ± standard deviation (SD), and results for categorical variables are presented as frequencies and group percentages. Continuous outcomes were analyzed using independent (2-sample) t tests; one-way analysis of variance (ANOVA) was used for comparison of three or more groups. Dichotomous outcomes were analyzed using the Chi-square test for trend. All statistical analyses were performed using SPSS Statistics for Windows, version 20.0 (IBM, Armonk, NY, USA). P values <0.05 were considered statistically significant.
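The comparison of continuous variables between two groups can be sketched as follows. This is a minimal illustration using only the Python standard library rather than SPSS. It assumes the equal-variance (pooled) form of the independent two-sample t test, which the text does not specify, and the function names and example values are hypothetical. The sketch returns only the t statistic and degrees of freedom; the p value would be obtained from a t distribution table or statistical software.

```python
from math import sqrt
from statistics import mean, stdev, variance
from typing import Sequence, Tuple


def summarize(values: Sequence[float]) -> str:
    """Format a sample as 'mean ± SD' using the sample standard deviation."""
    return f"{mean(values):.1f} ± {stdev(values):.1f}"


def pooled_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Independent two-sample t statistic, assuming equal variances.

    Returns (t, degrees of freedom).
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, df


if __name__ == "__main__":
    # Hypothetical values for illustration only; not data from this study.
    group_a = [52.0, 47.5, 61.0, 55.5, 49.0]
    group_b = [44.0, 39.5, 50.0, 42.5, 46.0]
    t, df = pooled_t(group_a, group_b)
    print(summarize(group_a), "vs", summarize(group_b))
    print(f"t = {t:.3f}, df = {df}")
```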

Results

General characteristics and complications

A total of 280 patients were enrolled in this study. The mean age was 49.5 ± 11.9 years, and the male-to-female ratio was 1:4.71. Seventy-six patients were included in the TA group and 204 in the conventional group (Table 1). Age at diagnosis was significantly lower in the TA group than in the conventional group (p < 0.001). Male-to-female ratio, body mass index, diagnosis, tumor size, and thyroid size were not significantly different between groups (p = 0.495, 0.297, 0.144, 0.281, and 0.430, respectively), nor were gross operative findings (p = 0.084). However, bilateral tumor was more prevalent in the TA group (p = 0.004), and tumor in the mid-portion of the thyroid was more prevalent in the conventional group (p = 0.012). Multiplicity and extrathyroidal extension showed no significant differences between the two groups.

Table 1 Perioperative data according to the surgical approach

No surgeries were converted to a conventional open thyroidectomy in the TA group. Operation time was significantly longer in the TA group than in the conventional group (p < 0.001). The amount of drainage was also higher in the TA group (p = 0.033). The rates of vocal cord paralysis, hypoparathyroidism, and hematoma were not significantly different between the two groups (p = 0.215, 0.290, and 0.385, respectively) (Table 2). In addition, there were no cases of great vessel, esophagus, trachea, brachial plexus, or marginal mandibular nerve injury.

Table 2 Operative outcomes and postoperative major complications according to the surgical approach

Analysis of voice quality

Two patients in the TA group and three in the conventional group were excluded from analysis due to postoperative vocal cord paralysis. GRBAS and VHI scores showed peak aggravation at 1 week in both groups and gradually decreased to a level similar to the preoperative level by 6 months (Fig. 1A). Preoperative GRBAS scores were not different between the two groups. The TA group tended to show more aggravated scores postoperatively, although a statistically significant difference from the conventional group was attained only at postoperative 6 months (p = 0.043). VHI abruptly increased postoperatively in the TA group, showing a significant difference from the conventional group at 1 week postoperatively and at 1 month (p < 0.001 and p = 0.001, respectively) (Fig. 1B). Fundamental frequency (F0) did not change significantly postoperatively in either group, although a slightly faster recovery was observed in the TA group (Fig. 1C). MVP also showed no difference between the two groups, showing recovery to the preoperative level at 6 months postoperatively (Fig. 1D).

Fig. 1

Serial patterns of postoperative function parameters on voice according to surgical approach. A perceptive scale (GRBAS), B voice handicap index, C fundamental frequency (F0), D maximal vocal pitch

Subjective functional satisfaction

Two patients in the TA group and three in the conventional group were excluded from analysis due to postoperative vocal cord paralysis. With regard to pain, both groups showed a gradual decrease in mean pain scores on VAS. The conventional group showed a faster decrease compared with the TA group, and statistical significance was observed at 1 year postoperatively (p = 0.030) (Fig. 2A). Paresthesias on the neck and chest were more aggravated in the TA group during the early postoperative period (Fig. 2B, C). Statistical significance was observed at 1 week for neck paresthesia (p = 0.001) and at 1 week and 1 month for chest paresthesia (p < 0.001 and p < 0.001, respectively). Paresthesia was roughly equal at postoperative 3 months. DHI was higher in the TA group. However, cosmesis was better in the TA group at all postoperative periods compared with the conventional group, with significant differences observed only at postoperative 3 and 6 months (p = 0.023 and p = 0.015, respectively) (Fig. 3).

Fig. 2

Serial patterns of postoperative function parameters on pain and paresthesia. A Pain, B paresthesia (neck), C paresthesia (chest)

Fig. 3

Serial patterns of postoperative function parameters on swallowing and cosmesis. A Swallowing, B cosmesis

Discussion

This study demonstrated that the complication rate with TA thyroidectomy is comparable to that of conventional open thyroidectomy, while postoperative functional outcomes of TA thyroidectomy showed a relatively slower recovery than conventional thyroidectomy. The difference between TA and conventional open thyroidectomy was apparent in VHI scores, pain, paresthesia, and swallowing, with the TA group showing more aggravated scores in the early postoperative period. Cosmesis was better in the TA group at postoperative months 3 and 6, while postoperative changes of fundamental frequency and maximal vocal pitch were about the same in both groups.

Since the first description of robotic or endoscopic thyroidectomy in 2007, several papers have been published describing the feasibility and safety of this new surgical approach. The associated functional results, cosmetic advantages, and technical limitations of this newer approach have been studied on large series of patients, and a learning curve for this technique has been proposed [14, 15]. Among the possible advantages of this approach, cosmetic results have been excellent, and the most common reason for selecting extracervical thyroidectomy is the need to hide the scar far away from the anterior neck [1]. This prospective study without randomization showed that younger female patients tended to elect TA thyroidectomy rather than conventional open thyroidectomy. As is shown in our results, cosmetic outcome was excellent, showing that the patient’s satisfaction was significantly higher at postoperative 3 and 6 months with this approach.

One important consideration regarding cosmetic results in previous studies is that none of the studies on cosmesis followed patients beyond 3 months. Typically with scar formation, at the 3-month time point, the scar width is generally wider than the original incision, and erythema is still present past the boundaries of the incision [16]. Since scar maturation tends to occur over a period of 1 year, longer follow-up could reveal the long-term advantage of the TA approach. Our study with a follow-up of 1 year showed that the differences at a late postoperative period of 1 year were not statistically significant between the two groups, although the satisfaction score was somewhat higher in the TA group. A scar in the axilla can be more acceptable to patients, where it can be hidden with usual clothing or when the shoulder is in a neutral position. This may be a great advantage, especially during the postoperative period within 6 months, but may not be advantageous after 1 year.

The total operation time, including flap elevation, is one disadvantage of this technique, although this can be reduced with greater surgical experience [17, 18]. In our study, an additional 90 min was needed in the TA group compared with the conventional group. This longer operation time may be associated not only with possibly increased morbidity from general anesthesia, but also with postoperative complications. Because the TA approach includes flap elevation in the upper chest and traction of the sternal head of the sternocleidomastoid (SCM) muscle and the strap muscles, these could all affect postoperative functional parameters [19].

As in previous studies, our study showed no differences in changes in GRBAS scores, fundamental frequency, or maximal vocal pitch [20]. Tae et al. [21] used a 5-item questionnaire to investigate symptoms of vocal fatigue, hoarseness, pitch limitation, breathiness or weakness, and difficulty singing, with patients rating the severity of each symptom on a 5-point scale (0 = none to 4 = very severe). Preoperatively, the open cohort and the robotic cohort were similar in their reporting of vocal impairment. Postoperatively, the open cohort had a higher severity of voice symptoms at postoperative 1 day (p = 0.008), 1 month (p = 0.049), and 3 months (p = 0.043), but not at 6 months. In contrast, VHI in our study was significantly aggravated in the TA group in the early postoperative period, especially within 1 month, which demonstrated that patients are subjectively more dissatisfied with their voice after TA thyroidectomy in the early postoperative period. This change was minimized at postoperative month 6 and at 1 year. This discrepancy with previous studies may have resulted from differences in the study cohorts. However, based on the results of our study, we think that voice discomfort during the early postoperative period should be a consideration, especially for younger patients who want to undergo the TA approach, although the enrollment of younger patients with a more active social life in the TA group could have affected the results.

Previous data regarding pain, paresthesia, and swallowing relied on a short follow-up period and different numbers of cohorts in each follow-up period [22–24]. Follow-up periods of previous studies have primarily been 3 months, which seems to be too short for sufficient analysis of long-term postoperative outcomes. Pain severity in our study was in accordance with the previous studies in that no significant difference was observed between the TA and conventional groups. Pain in the TA group did show a relatively slower recovery, although the difference between the groups did not reach statistical significance. In addition, paresthesias in the neck and chest were more aggravated in the TA group. We do not think that this difference resulted from the surgeon’s experience, because the primary surgeon in our study had over 3 years of experience with robotic surgery before initiating this prospective study. Rather, we believe that paresthesia is an innate weakness of TA thyroidectomy, where the flap elevation of the chest and neck and traction injury during the longer operation may have an effect. Although paresthesias decreased and were about the same after postoperative 3 months, we think that discomfort in the neck and chest may be more severe in the early postoperative period with TA thyroidectomy than with conventional thyroidectomy. Results regarding subjective swallowing function were similar. DHI scores were higher in the TA group than in the conventional group, and we believe that wider flap elevation and injury to the neck muscle can affect this result.

Our results should be viewed in the context of potential limitations. The cohorts being compared in this study were significantly different in many aspects. Patients in the TA group were younger and there were more females than in the conventional group. This bias has been seen in most studies. We think that further study with randomization or an age- and sex-matched prospective study is needed to elucidate the role of TA thyroidectomy on postoperative functional outcomes. Before such research is available, our 1-year prospective study can provide good evidence that TA thyroidectomy is not always beneficial in regard to functional outcomes.

Thyroid cancer is associated with relatively long overall survival and long recurrence-free survival. For this reason, and because the technique was introduced only recently, oncologic outcomes of TA thyroidectomy are not yet established. Most previous studies have demonstrated that the complication rate and functional outcomes of TA thyroidectomy are comparable to or better than those with conventional open thyroidectomy. However, as previously mentioned, most of the studies have potential biases from a retrospective design, along with varying age and male-to-female ratios and a short follow-up period. Our study found that functional outcomes could be aggravated in the early postoperative period after TA total thyroidectomy. These negative results should be considered when counseling patients regarding the procedure and when deciding on the surgical approach for total thyroidectomy. We suggest that TA total thyroidectomy should not be recommended indiscriminately, without reviewing all possible aspects of postoperative changes, until a higher level of evidence is available to elucidate the role of TA thyroidectomy.