Introduction

More than 100,000 thyroid operations are performed annually in surgical departments throughout Germany, and the case volume in thyroid surgery is also rising in the U.S. due to improved imaging techniques [1–3]. The prevalence of endemic goiter in the adult population of Germany, an iodine-deficient country, ranges between 20 % and 30 % [4]. Today, thyroid surgery is a highly standardized procedure with low morbidity and mortality. Since Theodor Kocher, hemostasis in thyroid surgery has been achieved by clamping vessels followed by the placement of a ligature [5]. This remains the reference method, although it requires several separate steps and is considered time-consuming. It may therefore increase the duration of the procedure as well as the associated costs. New devices for the occlusion of vessels have consequently been introduced in recent years, such as ultrasonically activated shears, electrothermal bipolar vessel sealing systems, and vascular clips. Several randomized controlled trials (RCTs), almost exclusively single-center, conducted in specialized centers and involving a small number of experienced surgeons, have demonstrated the safety (similar complication rates including hypocalcemia, recurrent laryngeal nerve palsies, blood loss, and length of hospital stay) of these new devices compared to the conventional treatment using ligatures and bipolar diathermy in various patient populations (benign goiter, Graves disease, papillary thyroid cancer) [6–12].

All trials show a significant reduction in operation time and many conclude that costs are lowered. However, operation time is influenced by multiple variables such as the underlying disease (normal goiter, Graves disease, or thyroid cancer), the surgeon's experience, and the center. Therefore, it remains unclear whether the time reduction can be reproduced in a multicenter setting with a large number of participating surgeons.

Vascular clips do not require additional technical equipment connected to a continuous power supply system, and their application is less complex than the use of a ligature. Nevertheless, it remains unclear whether this simple technique is superior to the classical approach. To date, no RCT has evaluated this method. Accordingly, the CLIVIT Trial was designed to test the primary hypothesis that the use of vascular clips rather than ligatures reduces operation time in thyroid surgery.

Methods

The study protocol was published in advance to ensure transparency of the design and analysis procedures. The protocol was approved by the local ethics committee of the University of Heidelberg on January 7, 2004 and internationally registered [13]. The study was designed, managed, monitored, and analyzed by the Study Center of the German Surgical Society (SDGC) and conducted at 13 hospitals of various categories (one primary, six secondary, six tertiary university centers) throughout Germany [14].

Participants and hospitals

Only patients with a euthyroid goiter and normal vocal cord function scheduled for elective, at least subtotal resection bilaterally were eligible. In addition, informed consent, age equal to or greater than 18 years, and life expectancy greater than 1 year were required inclusion criteria. Patients with a malignant disease or Graves disease, recurrent laryngeal nerve (RLN) palsy, current immunosuppressive therapy, coagulopathy, or disorders that would preclude study participation (dementia, language problems) were excluded, as were patients participating in other trials, to avoid interferences that might distort the outcome of the study. Hospitals were allowed to participate only if they had obtained the approval of their local ethics committee and entered into a formal agreement with the SDGC.

Randomization/interventions

Three standardized surgical approaches were allowed: the Dunhill procedure, which consists of hemithyroidectomy and subtotal thyroidectomy (at least two thirds of the gland); total thyroidectomy; and near-total thyroidectomy, leaving unresected only a small portion of gland adjacent to the entrance of the RLN into the larynx.

Randomization to either the clamp and tie group or the vascular clip group could be performed only after initial surgical exploration of the thyroid gland had shown the necessity for at least subtotal resection to treat the underlying disease. Randomization at each center used opaque sealed envelopes, stratified by center, with blocking to create groups of equal size. After randomization, the vessels of the upper poles were ligated according to the conventional clamp and tie technique. All other vessels were occluded according to the randomization result.

In the ligature group, each vessel had to be occluded by manual ligature (braided absorbable suture material, USP 3-0). In the clip group, vessels were occluded by application of one medium-size vascular clip close to the thyroid capsule and two vascular clips distal to the thyroid using the Aesculap Challenger multifire clip applicator (Aesculap, Tuttlingen, Germany). Additional bipolar diathermy was allowed in both groups, as was suturing of the capsule according to local standards following subtotal resection. After hemostasis, unilateral or bilateral drains were placed according to the preference of the surgeon.

Objectives and outcomes

The primary objective was to compare the surgical resection time (time between ligation of the upper pole vessels and removal of the complete specimen) using vascular clips versus ligatures. Secondary endpoints were the amount of postoperative bleeding (total measured amount in drains), reoperation due to bleeding, wound infection, temporary (reversal within 12 months) and permanent (over 1 year) RLN paralysis confirmed by an otorhinolaryngologist, length of hospital stay, and safety. Moreover, total operation time (minutes), weight of surgical specimen, wound infection (redness, wound dehiscence with secretion of putrid or opaque, foul-smelling fluid, requirement for antibiotic treatment), postoperative function of parathyroid glands (defined and assessed according to local standards in each center), postoperative drainage duration, and postoperative length of hospital stay were recorded.

Sample size

A clinically relevant difference between the two techniques was defined as a difference of more than 15 min. The expected difference was 20 min. The α level was set to 0.05 and a power of 0.80 chosen, and a simple one-sided t test for two groups with a shifted null hypothesis was applied. Given an expected early drop-out rate of 5 %, a total sample size of 420 patients was required.
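The sample size reasoning can be retraced with a normal approximation. The sketch below is a minimal illustration, not the trial's actual calculation: the standard deviation of 20 min is an assumption (the text does not report the value used), and the function names are hypothetical. Because the null hypothesis is shifted by 15 min, the detectable effect is the expected difference minus that margin, i.e., 5 min.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(expected_diff, margin, sd, alpha=0.05, power=0.80):
    """Patients per group for a one-sided two-sample test with a shifted
    null hypothesis H0: difference <= margin (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    effect = expected_diff - margin            # distance from the shifted null
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / effect ** 2)


def total_with_dropout(n_group, dropout=0.05):
    """Total sample size for two groups, inflated for early drop-outs."""
    return ceil(2 * n_group / (1 - dropout))


# ASSUMPTION: sd = 20 min is illustrative only; it is not given in the text.
n = n_per_group(expected_diff=20, margin=15, sd=20)
print(n, total_with_dropout(n))
```

With these assumed inputs the approximation gives 198 patients per group and 417 in total after allowing for 5 % drop-out, which is of the same order as the 420 patients stated above. A different assumed standard deviation would change the result quadratically.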

Statistical analysis

A confirmatory analysis of variance (ANOVA) was calculated for the primary endpoint, based on the intention-to-treat (ITT) principle, to test first the hypothesis H0: μlig − μclip ≤ 0 and, in the event of statistical significance, the hypothesis H0: μlig − μclip ≤ 15, where μlig and μclip denote the expected operating times (in minutes) with ligatures and vascular clips, each at the α-level of 5 %. The fixed factors of this ANOVA were vessel occlusion technique, center, and surgeon's experience (categorized into two groups: ≥50 or <50 thyroid operations). Even if neither superiority nor relevance could be shown statistically, the exploratory interpretation of the narrow two-sided confidence intervals (expected width <10 min) could be used to compare the difference in operating time. Sensitivity analysis was performed using a linear mixed model as well as using the per-protocol population, which consists of all patients of the ITT population without major protocol violations. Secondary endpoints were analyzed and characterized using descriptive statistics.
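The two-step testing order described above can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a one-sided normal approximation on an estimated difference and its standard error rather than the ANOVA with fixed factors used in the trial, and the function names and example inputs are hypothetical.

```python
from statistics import NormalDist


def one_sided_p(diff, se, null_value):
    """One-sided P value for H0: true difference <= null_value."""
    z = (diff - null_value) / se
    return 1 - NormalDist().cdf(z)


def hierarchical_test(diff, se, margin=15.0, alpha=0.05):
    """Test superiority first (H0: diff <= 0); test relevance
    (H0: diff <= margin) only if superiority is significant.

    Returns (superior, relevant); relevant is None when step 2 is not run.
    """
    superior = one_sided_p(diff, se, 0.0) < alpha
    if not superior:
        return False, None
    relevant = one_sided_p(diff, se, margin) < alpha
    return True, relevant


# Hypothetical inputs, in minutes (not trial data):
print(hierarchical_test(diff=20.0, se=2.0))  # both steps are evaluated
print(hierarchical_test(diff=2.0, se=2.5))   # stops after the first step
```

Testing the second hypothesis only after the first is rejected keeps the overall type I error at the chosen α without a multiplicity correction, which is the usual rationale for a fixed testing sequence.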

Amendments

Three amendments to the protocol were accepted by the ethics committees during the execution of the trial: (1) In October 2004, two additional secondary endpoints, total procedure time (skin incision to closure) and weight of specimen, were added to improve evaluation of the primary endpoint. (2) In March 2005, the surgical procedures allowed within the trial were extended to include total thyroidectomies and near-total thyroidectomies, as both are standard procedures in goiter surgery. (3) Data management observed missing values for the primary endpoint in about 18 % of the patients (due to different surgical approaches). Blinded review of the data by an independent biostatistician showed a correlation coefficient of the primary endpoint with the total operating time of 0.83. A multiple linear regression method was used to impute these missing values. Simulation showed that these imputations resulted in a substantial loss of power, so the sample size had to be increased to a total of 500 patients to achieve a power of about 75 %. Therefore, the number of participating centers was increased to 13. Furthermore, the sample size had been planned for a t test, whereas an ANOVA was used in the analysis, and an ANOVA model would probably increase the actual power compared to the t test. Hence, a sample size of 500 patients (250 per group) seemed satisfactory to ensure sufficient power for this study. To avoid loss of power due to excessively small cell sizes, the participating centers were pooled as recommended [15]. This third amendment was accepted by the ethics committee in February 2008.
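The imputation step in the third amendment can be illustrated with a small regression sketch. It is a deliberate simplification: the trial used a multiple linear regression whose predictors are not listed here, whereas this example uses total operating time as the only predictor (motivated by the reported correlation of 0.83). The function names and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def impute_resection_time(records):
    """records: list of (total_time, resection_time or None), in minutes.

    Missing resection times are replaced by the value predicted from
    total operating time, fitted on the complete records only.
    """
    complete = [(t, r) for t, r in records if r is not None]
    slope, intercept = fit_line([t for t, _ in complete],
                                [r for _, r in complete])
    return [(t, r if r is not None else slope * t + intercept)
            for t, r in records]


# Hypothetical data (not trial data): the last patient lacks a resection time.
patients = [(100, 60), (120, 70), (140, 80), (110, None)]
print(impute_resection_time(patients))
```

Single regression imputation of this kind understates the variability of the imputed values, which is one reason a simulation of its effect on power, as described above, is needed before fixing the final sample size.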

The analyses were performed using the software package SAS® System 9.1 (SAS Institute Inc., Cary, NC, USA) according to the pre-specified statistical analysis plan, and two-sided P values are given throughout.

Results

Patient enrollment

A total of 494 patients were randomized at 13 surgical sites between March 5, 2004 and July 29, 2008. The ITT population consisted of 491 patients. One-year follow-up of all patients was completed by October 5, 2009 (Fig. 1).

Fig. 1 Flow chart of patients

Baseline data, indications, and surgical characteristics

The trial groups were comparable for all patient and procedure characteristics. The indications for thyroid resection were as follows: nodular goiters (48 %), nodular goiters with cold nodules (38 %), nodular goiters with hot nodules (6 %), and autonomic adenoma (8 %). The most common procedure was subtotal resection (51 %), followed by total thyroidectomy (39 %) and near-total thyroidectomy (10 %). The distribution of surgical procedures changed significantly in the course of the recruitment period, with more radical resections being selected more frequently (P < 0.0001, chi-square test). The specimen weight (mean 67 g) did not differ between the two groups (Table 1; Fig. 2).

Table 1 Baseline characteristics, indications, and surgical procedures
Fig. 2 Change in surgical procedures over recruitment time

Primary endpoint

The mean resection time in patients randomized to ligatures was 66.1 (95 % CI: 62.5; 69.8) min versus 63.5 (95 % CI: 59.8; 67.3) min for vascular clips (P = 0.258, ANOVA). The mean total operation time (skin incision to skin closure) was 120.1 min in the ligature group versus 117.1 min in the vascular clip group (P = 0.334) (Table 2; Figs. 3, 4).

Table 2 Primary endpoint (resection time), total operation time, and influence of center and surgeon on operation time
Fig. 3 Resection time, by intervention and surgeon's expertise

Fig. 4 Resection time, by intervention and center

The primary endpoint was significantly influenced by center (P < 0.0001, ANOVA) and by the expertise of the 124 participating surgeons (fewer than 50 versus 50 or more thyroid procedures; P < 0.0001, ANOVA). There was no interaction between intervention group and center in this model (P = 0.91, ANOVA), indicating that resection time was similar for the interventions throughout the centers. These results were confirmed in a per-protocol analysis as well as with several sensitivity analyses (data not shown).
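As a rough plausibility check, the difference in mean resection time and an approximate confidence interval can be derived from the group means and 95 % confidence intervals reported for the primary endpoint. The sketch below is an unadjusted approximation and not the trial's ANOVA: it assumes independent groups and symmetric normal-theory intervals, and it ignores center and surgeon experience. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96


def se_from_ci(lower, upper):
    """Standard error recovered from a symmetric 95 % confidence interval."""
    return (upper - lower) / (2 * Z95)


def difference_summary(mean_a, ci_a, mean_b, ci_b):
    """Difference of two independent means with an approximate 95 % CI."""
    diff = mean_a - mean_b
    se = sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    return diff, diff - Z95 * se, diff + Z95 * se


# Reported values: ligature 66.1 (62.5; 69.8), vascular clip 63.5 (59.8; 67.3)
diff, low, high = difference_summary(66.1, (62.5, 69.8), 63.5, (59.8, 67.3))
print(round(diff, 1), round(low, 1), round(high, 1))
```

Under these assumptions the difference is 2.6 min with an interval of roughly −2.6 to 7.8 min. The interval includes zero and lies well below the 15 min margin defined as clinically relevant, which is in line with the non-significant ANOVA result.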

Secondary endpoints

No surgical drains were used in 123 patients (ligature 59, vascular clip 64). The amount of postoperative fluid collected in drains did not differ between the two groups (mean 86 ml ± 93). Four patients in the vascular clip group and two in the ligature group had to be reoperated due to bleeding. Wound infections occurred in four patients in each group. Function of parathyroid glands, defined and assessed according to local standards in each center, was impaired transiently in five patients in the vascular clip group and nine patients in the ligature group (Tables 3, 4).

Table 3 Secondary endpoints and safety
Table 4 Surgical drains versus no drains

An otorhinolaryngologist examined 197 patients in the vascular clip group and 193 patients in the ligature group at least once during the postoperative follow-up. Temporary RLN paralysis was observed in 17 (6.9 %) patients in each group. Permanent paralysis after 1 year was confirmed in 6 (2.5 %) patients in the vascular clip group and 8 (3.2 %) patients in the ligature group, respectively (P = 0.636). There were no differences between the two groups in these findings, but differences were observed between centers. The rate of temporary RLN paralysis ranged from 0 % to 17 %, that of permanent paralysis from 0 % to 8.5 %. No significant difference was observed for postoperative hospital stay (mean 3.0 days ± 1.9). Safety data showed no difference between the groups. There were no deaths postoperatively or during follow-up.

Baseline characteristics (age, gender, body mass index, and pulse rate) and the occurrence of postoperative complications such as reoperation, wound infection, impaired function of the parathyroid glands, and permanent RLN paralysis were not correlated with whether surgical drains were used or not. A difference was observed, however, in the duration of hospital stay (3.1 days with drain versus 2.8 days without; P = 0.029). Less-experienced surgeons used drains less often than more experienced surgeons (P < 0.0001) (Table 4).

Discussion

CLIVIT failed to demonstrate a reduction in either resection time or total procedure time when using vascular clips rather than ligatures in a multicenter setting (124 participating surgeons at 13 centers). Operation time was strongly influenced by surgeon experience (Fig. 3) and center (Fig. 4). Surgeons with greater experience needed a mean of 10 min less for the resection than those with less experience, and the total operation time was 20 min less. Thus, CLIVIT has confirmed that operation time in thyroid surgery is influenced by experience of the surgeon and center, confirming the findings of a previous single-center randomized controlled trial (RCT) with multiple participating surgeons [11].

The results are in contrast to the findings of many single-center RCTs in specialized centers with a limited number of surgeons evaluating ultrasonically activated shears or electrothermal bipolar vessel sealing systems in comparison with conventional clamp and tie techniques to obtain hemostasis. These RCTs have consistently shown a significant operating time reduction in various patient groups and in different forms of thyroid resection. The clinical relevance of such a reduction is questionable, as no decrease in the rate of complications such as RLN palsy, reoperation due to bleeding, or hypocalcemia was observed. In a recent RCT with 150 patients, three groups were compared: conventional clamp and tie technique versus electrothermal bipolar vessel sealing system versus ultrasonic shears. Overall morbidity was even higher in the two latter groups, and the only benefit found for the new devices was time reduction [16]. No properly designed cost-effectiveness RCT has yet shown that a reduction in operating time translates directly into lower costs in thyroid surgery, or that any such saving outweighs the cost of the device. Such an RCT should be multicenter and include more than one surgeon per center to determine whether cost savings can be reproduced in other settings and locations.

Moreover, CLIVIT has demonstrated the safety of vascular clips compared to the conventional method. RLN palsy is one of the most important complications following thyroid surgery and occurs, according to a recent review, in 0–7.1 % of cases as transient RLN palsy and 0–11 % as permanent RLN palsy. Patients suffer a noticeable reduction in quality of life, and this complication is a leading reason for treatment-related litigation [17]. Overall, 6.9 % of patients participating in the CLIVIT Trial had temporary RLN palsy and 2.9 % suffered permanent RLN palsy. The true proportions may be even higher, because eight patients had no postoperative examination, and in 93 patients, this information was missing. In a recent RCT assessing the value of RLN monitoring in a mixed patient population (goiters, thyroid carcinoma, thyroiditis) of 500 patients, the rates of transient and permanent RLN palsy were 1.9 % and 0.8 %, respectively, in the group with neuromonitoring and 3.8 % and 1.2 % in the visualization group [18]. Taking into account that CLIVIT patients were at low risk due to the inclusion of primarily benign disease, these numbers are a matter of concern, especially considering the range from 0 % to 8.5 % across centers, and indicate a need for either more specialization (fewer surgeons with more cases) or further improvement through better training [1]. Both approaches are feasible in thyroid surgery. CLIVIT did not assess how many procedures were done using neuromonitoring or magnifying glasses to visualize the RLN.

The rates of wound infections, at 1.6 %, and impaired function of the parathyroid glands, 2.9 %, were comparable to those in earlier studies assessing other methods to obtain hemostasis in thyroid surgery [16]. The frequency of hypocalcemia as an adverse event was higher, at 5.1 %, but no difference was observed between the two groups. No laboratory data were collected, and no centralized analysis of blood samples was performed for financial reasons. Therefore, the true frequency of hypocalcemia may be even higher.

The use of drains in thyroid surgery remains a matter of debate. According to a Cochrane systematic review and meta-analysis, no benefit was observed for patients who had surgical drains placed. There were no differences in the rates of re-operation, wound infections, and respiratory distress. Hospital stay was significantly longer in patients with surgical drains [19], although the difference was not clinically significant (3.1 versus 2.8 days). This evidence was not available when CLIVIT was designed, but 25 % of all patients were already operated on without drains. Interestingly, less-experienced surgeons used drains less often, and no increase in complication rates was observed for patients in whom surgical drains were not used. The results of CLIVIT therefore support the Cochrane Review and suggest that drains can be omitted in elective thyroid surgery for goiter.

Randomization resulted in two groups comparable for all known and unknown risk factors, so selection bias is unlikely to have influenced the results. Nevertheless, there are limitations that may impair the validity of CLIVIT's findings. There were three amendments to the protocol during the trial. Two new secondary endpoints (total time of procedure and weight of specimen) were included in order to allow for control of the primary endpoint. The relevant data for patients treated before the amendment were readily collected from anesthesia records and pathology reports, respectively. An influence on the study interventions and primary endpoint is unlikely.

An extension of surgical methods for the treatment of goiter from initially only the Dunhill procedure (hemithyroidectomy on one side and subtotal resection on the other) to include near-total and total thyroidectomies was allowed, as successful treatment can be achieved by all techniques. This extension is in line with the increasing radicality of surgical approaches to benign goiter due to the risk of recurrence [20]. The resection times for these procedures do differ and therefore may affect the primary endpoint. However, randomization distributed the procedures equally between the two groups, so any such effect would apply to both groups equally, leading overall to longer resection times. CLIVIT has shown a significant trend in favor of more radical procedures over recruitment time.

An increase of the sample size from 420 to 500 was necessary because of missing data for the primary endpoint due to different resection strategies, mainly because surgery in some cases did not start with dissection of the upper pole vessels. However, these changes resulted from the analysis by the independent blinded biostatistician, not by the study group, and were important to maintain the power of the trial.

Before new RCTs are conducted, a systematic review and meta-analysis is helpful in evaluating the magnitude of time reduction for the new strategies. In 2003, when CLIVIT was designed, the results of only one RCT were available [7]. The CLIVIT study protocol was published to ensure transparency of the design issues [13]. As expected during the study design, operation time was indeed influenced by center, so randomization was stratified according to centers and centers were included in the ANOVA model. The strength of this effectiveness trial is the inclusion of many surgeons with different levels of experience, reflecting the reality of surgical treatment. Moreover, surgical expertise is extremely important and influenced the outcome of this trial, and therefore it was included in the analysis strategy. An efficacy RCT at one center and involving a small number of experienced surgeons might show different results. In general, an efficacy RCT should be carried out first, and only if a clinically meaningful difference is found should an effectiveness RCT with many centers and surgeons be conducted to investigate whether there is still an effect in favor of one method or the other. Given the prior assumptions of the CLIVIT Trial and the corresponding sample size of 420 patients, it was clear that such a trial would need to involve more than one center; therefore, a multicenter approach was chosen from the outset.

It is unclear how large the amount of time saved must be in order to compensate for the higher costs of the new devices (vascular clips, ultrasonically activated shears, electrothermal bipolar vessel sealing systems). To date, it appears unlikely that patients will benefit from these devices, so further trials must be conducted focusing primarily on cost-effectiveness and patient-relevant outcomes. Procedure time alone may have little impact on patient outcomes, so it may not be useful as a surrogate parameter for the assessment of a surgical procedure.

In the current era of available energy devices for thyroid surgery, the timeliness of the CLIVIT Trial might be questioned. However, the presented results are valuable for the surgical community, since two standard techniques in thyroid surgery were evaluated in a multicenter randomized effectiveness trial, which still represents the most valid form of clinical research. Proper evaluation of the standard techniques from each era is still mandatory for future surgical and technical innovation.

In conclusion, vascular clips do not reduce resection time in thyroid surgery. However, in contrast to quality-assurance data, a rate of 2.9 % for permanent paralysis of the recurrent laryngeal nerve is of concern. Drains in elective surgery may be of no benefit.