Background

Recurrent laryngeal nerve palsy (RLNP) is one of the most serious complications associated with thyroid surgery. Not only is it the leading cause of litigation after thyroidectomy, but it can also significantly affect patients’ subsequent quality of life [1]. RLNP is responsible for vocal fold paralysis, the consequences of which include dysphonia, difficulty swallowing, and respiratory problems such as aspiration symptoms and, in cases of bilateral cord involvement, airway obstruction [2].

History

In the early 1930s, surgical technique began to shift from simple avoidance of the RLN toward routine visual identification of the nerve during thyroid and parathyroid dissection. RLN visualization is still broadly considered the standard of care. Intraoperative neurophysiologic monitoring (IONM) was proposed 30–40 years ago as a means of verifying the functional integrity of the RLN [3–7]. Despite widespread and increasing use of this technology, true vocal cord palsy and paralysis as a result of iatrogenic injury to the RLN persist. The overall incidence of temporary injury is 2–8 %, while that of permanent injury is between 0.5 and 3 % [8–12].

Function and debate

There are three primary purposes for which neuromonitoring may be pursued during thyroid surgery: to verify the functional integrity of the RLN prior to ending the surgical procedure, to increase the ability of the surgeon to reliably identify the RLN in initial operations, and to provide guidance for the surgeon in difficult situations—anatomic variants, re-operations, and surgery for malignant disease [13].

While the use of neuromonitoring in thyroid surgery has long been a contended topic, there is a dearth of reviews that approach these aspects holistically, including the predictive power of neuromonitoring in determining RLN function intraoperatively, its effect on the rates of temporary and permanent RLNP, and surgeons’ response to the technology. This review pools studies addressing these topics to assess the role of IONM in preventing RLNP during thyroid surgery. It also evaluates the factors that contribute to false positives when IONM is used.

Methods

MEDLINE, EMBASE, and PubMed were searched for human studies addressing the efficacy of RLN monitoring during thyroid surgery from 1999 onward. Boolean operators were used to combine key terms mapped from the medical subject heading terms: identification, function, thyroid, monitoring. Studies were screened against a set of inclusion/exclusion criteria to identify those focusing on IONM predictive power, efficacy, and surgeon response.

Selection criteria

Markers of predictive power in determining RLN function intraoperatively included sensitivity, specificity, and negative and positive predictive values relative to the number of nerves at risk (NAR). Clinical endpoints to determine effects on postoperative results focused on transient and permanent palsy rates compared to rates for visualization-only controls. Efficacy was assessed both overall and with weighting for thyroidectomy risk factors (re-operation and malignancy settings). Endocrine surgeons’ reported comfort with the technology and their procedural learning curves were also assessed.

Of the 48 studies initially identified, 17 were finally selected: 12 comparative studies, 1 randomized clinical trial, 2 nonrandomized clinical trials, and 2 case studies. Studies were excluded if they met any of the following criteria: (1) unspecified n value (number of patients or NAR), (2) combination of IONM group results with control, and (3) pooled data including other head/neck surgeries. Comparative studies or clinical trials examining both techniques were also excluded when overall palsy rates for both IONM-guided and RLN visualization-only techniques were not reported. Selected studies included prospective and retrospective research with the following allocation methods: random, consecutive, convenience, and physician/equipment based. Among the pooled studies focusing on IONM efficacy, there were 44,575 NAR, of which 25,843 (57.98 %) were operated on with IONM and 18,732 (42.02 %) without (control) (Fig. 1).

Fig. 1 Selected reporting items flow chart from systematic review of the literature search

Results and discussion

Predictive power

With regard to the ability of IONM to verify the functional integrity of the RLN prior to ending the procedure, there is concern about the technology’s low specificity and positive predictive power (PPP) [14–17]. Selected studies, including 7366 NAR, were pooled to gauge predictive power. Mean sensitivity, specificity, and negative predictive values were 82.76, 90.24, and 99.15 %, respectively. Mean positive predictive value based on 4700 NAR was 61.30 %.
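As a reminder of how these four measures relate to one another, the following minimal sketch computes them from a 2×2 table of test result against postoperative outcome. The counts are hypothetical and are not taken from any of the pooled studies; which IONM result is designated as the "positive" test is a convention that must be fixed by the analyst, and the class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """2x2 counts per nerve at risk (NAR); the values used below are hypothetical."""
    true_pos: int
    false_pos: int
    false_neg: int
    true_neg: int


def predictive_metrics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, PPV and NPV as percentages."""
    def pct(num: int, den: int) -> float:
        return 100.0 * num / den if den else float("nan")

    return {
        "sensitivity": pct(c.true_pos, c.true_pos + c.false_neg),
        "specificity": pct(c.true_neg, c.true_neg + c.false_pos),
        "ppv": pct(c.true_pos, c.true_pos + c.false_pos),
        "npv": pct(c.true_neg, c.true_neg + c.false_neg),
    }


# Hypothetical example: 200 NAR, of which 10 have the condition of interest.
example = ConfusionCounts(true_pos=8, false_pos=5, false_neg=2, true_neg=185)
metrics = predictive_metrics(example)
```

In this invented example the positive predictive value stays near 62 % even though specificity exceeds 97 %, because the condition is rare: a handful of false positives weighs heavily against a small number of true positives.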

Despite a range of sensitivity and specificity values in recent studies (Table 1 [17–21]), it is generally concluded that a positive IONM can reliably demonstrate intact nerve function intraoperatively, though IONM results should not be relied upon to detect an injured or malfunctioning RLN [16–18]. However, within bilateral thyroid surgeries, Goretzki et al. [19], deriving overall sensitivity and specificity values of 93 and 77 %, respectively, found negative IONM stimulation at the first side of dissection to be specific and sensitive enough to predict early postoperative vocal cord paralysis with more than 70 % certainty. The study also identified a substantial increase in the possibility of bilateral vocal cord paralysis given a negative versus normal IONM result at the first side of dissection. Thus, hemithyroidectomy or a two-stage operation was suggested for negative IONM results, although findings could not support claims of a significant impact of IONM on RLNP during thyroid surgery.

Table 1 Predictive power of IONM results

Stratification by risk levels in a study examining 171 patients with 271 NAR revealed increased sensitivity and PPP for high-risk cases, pointing to a potential benefit of IONM use in high-risk cases [22].

However, a study of 228 treated and evaluated patients contradicted this, indicating that sensitivity for identifying postoperative paresis reached 87 % for benign disease but only 25 % in re-operation for malignant disease, a defined high-risk situation. PPP was 62.5 and 25 % for paresis and permanent palsy, respectively, reflecting poorly on the ability of IONM to identify RLN damage in the most severe cases (permanent palsy) and eliciting the conclusion, “neuromonitoring does not reliably predict postoperative outcome” [18].

Decreasing RLN injury

The latter two reasons for IONM use—detecting the location of the RLN and providing guidance for the surgeon, particularly in difficult situations—seek to reduce the risk of nerve injury during thyroid surgery. Despite much debate, the impact of IONM on the rate of RLNP has yet to be statistically proven or agreed upon [8, 11, 14, 23, 24].

The overwhelming majority of studied cases confirmed the lack of a significant difference between the rate of RLN injury with IONM versus visualization of the RLN alone [1, 8, 11, 13–15, 21–24]. Pooled studies focusing on IONM efficacy are summarized in Table 2. They include 44,575 NAR, of which 25,843 (57.98 %) were operated on with IONM and 18,732 (42.02 %) without (control). The overall RLNP rate per NAR was 3.18 % in the IONM group. The overall RLNP rate for a subpopulation of the control group, comprising 6566 control nerves, was 3.83 %. Neither this difference nor those for transient or permanent RLNP rates were statistically significant.
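The group shares quoted above can be reproduced directly from the NAR counts, and the general idea of comparing two palsy rates can be sketched with a two-proportion z-test. The choice of test is an assumption made here for illustration (the text does not state which test the pooled comparison used), the function name is hypothetical, and the event counts passed to it are invented rather than drawn from any cited study.

```python
import math


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Two-sided z-test for a difference between two independent proportions."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value


# Group sizes reported in the text (nerves at risk).
ionm_nar, control_nar = 25_843, 18_732
total_nar = ionm_nar + control_nar
ionm_share = 100 * ionm_nar / total_nar
control_share = 100 * control_nar / total_nar

# Hypothetical event counts, for illustration only (not from any cited study).
z, p_value = two_proportion_z_test(events_a=30, n_a=1000, events_b=35, n_b=1000)
```

With these invented counts (3.0 vs. 3.5 % in two groups of 1000 nerves each), the difference does not reach significance at the 0.05 level, which illustrates how a modest gap between rates may not be distinguishable from chance at that sample size.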

Table 2 Summary statistics of selected RLN neuromonitoring articles

Chan et al. [21] supported this, noting the similarity between IONM and control RLNP rates, even when stratified according to risk factors. Yet, while malignancy and recurrence (secondary thyroidectomy) each presented a significant increase in the incidence of postoperative RLN paralysis overall (P = 0.025 and P = 0.017, respectively), the overall RLNP rate was significantly higher during re-operative thyroidectomy in the control group (19 vs. 4.6 %; P = 0.019) but not in the neuromonitoring group (7.8 vs. 3.8 %; P > 0.05). This indicates that IONM may present particular benefit in lowering nerve injury rates in the re-operative setting, where the rate is generally highest, a view further backed by the “positive trend in reduction” noted among patients exposed to IONM. Yarbrough et al. [13] challenge such findings, observing that, even in high-risk situations, there was no statistically significant difference in the proportion of injuries between patients who underwent surgery with IONM and those who did not. The study followed 111 patients undergoing cervical re-exploration procedures for thyroid and parathyroid diseases, where the average patient had undergone 1.8 prior procedures. However, the increased cost burden of IONM was highlighted.

A review by Pisanu et al. [23] analyzed 20 studies and 23,152 patients and found that overall palsy rates for IONM versus visualization alone (3.47 and 3.67 %, respectively) demonstrated no statistically significant difference. Similarly, Higgins et al. [8] considered 42 studies, evaluating a total of 64,699 NAR; they observed no statistically significant difference in the rate of true vocal fold palsy (which develops from injury to the nerves) using intraoperative neuromonitoring versus RLN identification alone (3.52 and 3.12 %, respectively) during thyroidectomy. The review emphasized that while IONM may be useful depending on surgeon comfort, or in high-risk cases, neuromonitoring should not be the standard of care in routine thyroid cases or serve to supplant anatomical identification of the nerve.

Findings by Thomusch et al. [25] were an exception. The study of 6229 NAR demonstrated overall palsy rates of 2.9 and 1.8 % for the control and IONM groups, respectively, a statistically significant difference. The lower rate among IONM patients is particularly notable given the higher proportion of patients with a recurrent goiter in the neuromonitoring group than in the control group (8.74 and 5.99 %, respectively) and given that recurrent goiters were associated with a significantly increased risk of palsy (P < 0.0001). The decrease noted by Frattini et al. [27] applied only to surgeries for thyroid cancer which, despite being an RLNP risk factor, is not proven to significantly increase injury rates, and would not be expected to show such variability, although the study’s small sample size (n = 152 patients) may have biased results.

A more recent and larger study considering 686 patient cases identified a substantial increase in the rate of RLNP between the control group and the IONM group (4.7 vs. 7.6 %) [1]. However, the authors noted that the higher rate may be attributable to the learning curve, as one of the three surgeons adopted a new, reportedly less invasive strategy. There is also indication that procedures with a greater number of risk factors may have been performed with IONM, potentially contributing to the difference between the two rates [1].

Surgeon response

Surgeons undergo a learning curve as they acquaint themselves with neuromonitoring technology [1]. A study tracking three such surgeons over the course of a year reported positive attitudes toward the technology: “All surgeons claimed that the neuromonitoring improved their sense of safety during thyroid surgery.” However, learning curves were substantial, extending up to 304 patients for a single physician, with higher than projected rates of RLN injury for patients during the learning process [1]. Working to bolster IONM predictive power and counter variability in injury rate, Chiang et al. [16] established a set of standardized IONM procedures, noting that these could elucidate the mechanism of RLN injury in each case. Conceptually, surgeons could use such information to avoid injuries in future procedures.

Yet IONM’s increasing prevalence may be due to pressure from the “demand of applying new surgical technology” [17] and to its use as a means of covering litigation concerns. A total of 170 voluntary surveys of members of the American Head and Neck Society, 65 % of whom were laryngeal nerve monitoring users, revealed “medical-legal protection” as a predominant reason for use. In a survey of 117 members of the American Association of Endocrine Surgeons consisting of 37.1 % neuromonitoring users and 62.9 % nonusers, 76 % of respondents believed that neuromonitoring does not improve safety of thyroid surgery, while 56 % upheld the same for thyroidectomies [21]. Overall, the surveys indicated that neuromonitoring users were more inclined to believe that neuromonitoring, particularly using electromyographic endotracheal tubes (ETTs), reduced transient and permanent RLN dysfunction (P < 0.001) and could facilitate identification of the RLN and medical resident education, while decreasing liability risk [28, 29]. Nonusers upheld that neuromonitoring could result in a loss of surgical technique or judgment and lead to excess reliance on technology. Paired with views of the “high rate of false-negatives or false-positives,” neuromonitoring, particularly ETT, was not considered dependable enough for patient care and was noted to present a substantially increased cost burden [29]. Varying degrees of surgeon training or comfort with IONM may thus influence the decision to use IONM.

Factors contributing to low specificity and false positives

When IONM is used, variation in predictive accuracy can create situations whereby neuromonitoring provides a positive signal, indicating nerve functionality, but postoperative laryngoscopy demonstrates vocal cord palsy [14–17]. Cases of such false positives present significant concerns for surgeons and patients [28, 29], potentially affecting surgical technique and outcome and presenting medicolegal concerns.

Current literature was investigated to identify and evaluate the most probable factors contributing to this phenomenon. Among them, the most prevalent were the lack of a standard cutoff for what is defined as a loss of signal (LOS) for a nerve [19, 30, 31] and the contribution of vocal fold injury via intubation to laryngeal complications [32–34].

Defining LOS amplitude

When using IONM during thyroid surgery, some surgeons seek a laryngeal electromyography (EMG) amplitude response value of 100 µV or less to identify a LOS, while others follow higher cutoffs of 200 µV or even 280–300 µV [26, 32]. Higher cutoff values ensure early detection of laryngeal EMG change and reduce artifact during dissections in proximity to the trachea [32]. However, patient baseline amplitudes can vary significantly, from less than 400 µV to above 1500 µV [32]; testing of the thyroarytenoid-lateral cricoarytenoid muscle complex in control subjects revealed baseline amplitudes ranging from 145 to 1112 µV [30].

Using a low absolute cutoff may create the appearance of a positive signal when the nerve has been damaged and is functioning improperly [31]. In patients with higher-than-average baseline amplitudes, a value above 100 µV could still represent a reduction of more than 80–90 %. Thus, there is a need to establish a standardized method of identifying LOS to optimize the positive predictive value of the technology [20–23, 25–29, 32, 33]. Future studies may seek to define LOS as a percent reduction in patients’ baseline (e.g., 50, 80, or 90 % amplitude decrease) and, if such a drop is noted, to test for a response at varying levels (280, 200, and 100 µV) to ensure signal and interpretation accuracy. Supporting the establishment of a ‘gold-standard’ to postoperatively assess the larynx, Jeannon et al. [33] advocate fiber-optic naso-laryngoscopy use across the board to further decrease deviation in RLNP identification.
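The contrast between an absolute cutoff and a baseline-relative criterion can be illustrated with a short sketch. The function names, the 80 % default, and the example amplitudes are assumptions chosen for illustration; they do not represent a validated clinical rule or a criterion proposed by any of the cited studies.

```python
def relative_drop(baseline_uv: float, current_uv: float) -> float:
    """Fractional decrease of the EMG amplitude from the patient's own baseline."""
    if baseline_uv <= 0:
        raise ValueError("baseline amplitude must be positive")
    return 1.0 - current_uv / baseline_uv


def los_by_absolute_cutoff(current_uv: float, cutoff_uv: float = 100.0) -> bool:
    """Flag LOS when the amplitude is at or below a fixed cutoff (in microvolts)."""
    return current_uv <= cutoff_uv


def los_by_relative_drop(baseline_uv: float, current_uv: float,
                         min_drop: float = 0.80) -> bool:
    """Flag LOS when the amplitude has fallen by at least min_drop of baseline."""
    return relative_drop(baseline_uv, current_uv) >= min_drop


# Hypothetical patient with a high baseline amplitude.
baseline_uv, current_uv = 1500.0, 250.0
drop = relative_drop(baseline_uv, current_uv)       # roughly a 5/6 reduction
flag_absolute = los_by_absolute_cutoff(current_uv)  # 250 µV is above 100 µV
flag_relative = los_by_relative_drop(baseline_uv, current_uv)
```

For this hypothetical patient the signal has fallen by more than 80 % yet remains above the 100 µV cutoff, so only the relative criterion flags a loss of signal.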

Vocal fold injury from intubation

Another concern is the rate of laryngeal complications or palsy that results from vocal fold injury rather than injury to the laryngeal nerve [30, 31, 34, 35, 37]. Examining 761 patients, Echternach and colleagues [34] identified vocal fold injury due to endotracheal intubation to be the primary cause of laryngeal complications. While the relative severity of complications from procedural versus intubation-related injuries is debated [34, 38], a study of varying procedures revealed a higher rate of laryngotracheal injury among patients with endotracheal intubation than among those with tracheotomy (95 and 91 %, respectively) at autopsy [39]. Other studies have correlated improved intubation quality with fewer laryngeal complications and decreased postoperative hoarseness, identifying the risk of patients otherwise developing ulcers and granulomas due to intubation [30, 31, 35–38].

Proposed solutions include documentation and informed patient consent; recognition by anesthesiologists that such injury can result from intubation and careful examination of postoperative symptoms to prevent worsening of said complications; and the addition of atracurium to a propofol–fentanyl induction regimen [34, 35]. This opens the door for further study of synergistic combinations of neuromuscular blocking agents (NMBAs) and anesthetics during endotracheal intubation. Pajewski et al. [39] also emphasize careful consideration of anesthetic agents, pointing out that rapid alterations in anesthetic concentrations may further complicate interpretation of evoked potentials in spinal surgery. The study stresses the necessity of communication within the operative team, particularly among the surgeon, anesthesiologist, and other involved specialists.

Conclusion

In light of the increasing use of IONM and a high prevalence of the technology among younger physicians, and with users indicating greater access to and perceived benefits of the technology, IONM presents the potential to become the standard of care for thyroid surgery. However, given the increased cost, high rate of false positives, and poorer patient outcomes as surgeons initially familiarize themselves with the technology, assessing the benefit of IONM for such procedures is imperative.

Currently, the data do not support a significant decrease in the RLNP rate during thyroid surgery when IONM is employed, nor do they support sufficient reliability in detecting nonfunctioning nerves intraoperatively. Thus, IONM is not recommended as the standard of care and should not be employed for medicolegal reasons. However, it may help guide the surgeon during high-risk cases, particularly in the re-operative setting, or provide benefit depending on the individual surgeon’s comfort level with the technology.

The low PPP of IONM can, in part, be attributed to the lack of an established standard cutoff value for identifying a negative or nonfunctional signal. Vocal cord injuries resulting from intubation can also contribute to vocal cord complications/palsy even in the absence of direct injury to the laryngeal nerve. The former may be alleviated via the establishment of a “gold-standard” for RLNP diagnosis involving detection of relative reductions in EMG amplitude rather than absolute values. For the latter, increased attention to the possibility of vocal cord injury during intubation is advised, through patients’ informed consent; careful examination of postoperative symptoms; and combination laryngoscopy induction regimens, although room for further study regarding synergistic NMBA and anesthetic combinations still exists.