Introduction

Toxic diffuse goiter is named after Robert James Graves in the English-speaking world and after von Basedow in continental Europe. Graves’ disease (GD) is an autoimmune disorder with an unpredictable clinical course. The disease is accompanied by a number of symptoms directly related to hormone excess. In addition, some patients develop manifestations in localized regions of the connective tissue system, including Graves’ ophthalmopathy (GO) and dermopathy. The annual incidence of GD is around 20 per 100,000 in Sweden and around 40 per 100,000 in the United States. It is four to six times more common in females and mostly occurs between 20 and 50 years of age.

About 1% to 5% of patients with GD are children, but the disease is rare in children under 5 years of age. The peak annual incidence occurs at age 11 to 14 years and is about 3 per 100,000; it is seen more commonly in girls. Epidemiologic studies have shown a strong hereditary component. Recent studies have demonstrated that multiple genetic factors may contribute to the risk of developing GD, although not with full penetrance and not fully following Mendelian rules. Especially in children, GD may be associated with other manifestations of autoimmune diseases, such as diabetes mellitus type 1 or Addison’s disease; GD is also more common in children with Down’s syndrome. Infections such as those caused by Yersinia enterocolitica have been postulated to confer an increased risk, but a causal relation has not been established. Smoking is weakly associated with GD but strongly associated with the development of GO.

Graves’ disease is caused by thyroid receptor antibodies that activate the thyrotropin receptor (TSHR), leading to stimulation of cyclic adenosine monophosphate (cAMP) synthesis and production of thyroid hormones in the follicular cells. Presentation of autoantigens, such as TSHR, in predisposed individuals results in production of thyrotropin receptor autoantibodies and infiltration of lymphocytes in the thyroid. The inflammatory cells release molecules such as interleukins and tumor necrosis factor-α (TNFα), which have an as yet uncertain role in the vicious cycle leading to GD. This process leads not only to hyperthyroidism but also to hypertrophy and hyperplasia of the thyroid parenchyma. The pathologic appearance includes a change in the thyroid epithelium from cuboidal to columnar, with papillary infoldings accompanied by diffuse cellular infiltration by lymphocytes and plasma cells [1].

Options for treatment include antithyroid medication or definitive treatment with radioactive iodine ablation or thyroidectomy. Indications vary as to which of these treatments is the preferred option. Radioactive iodine (RAI) ablation with 131I is usually preferred in the United States, whereas in Europe and Asia antithyroid drugs or surgery are favored. Generally, antithyroid medication is the preferred initial therapy, with surgery or RAI ablation being considered when drug therapy fails or in case of recurrence. Indications for surgical management of GD can be considered as either absolute or relative, and they provide a guide for endocrine surgeons and endocrinologists to use when discussing treatment options with patients. Patients, however, may have a different view of the reason for choosing a particular treatment option.

Absolute indications for surgery include the following: presence of GD and an associated suspicious or malignant thyroid nodule; pregnancy not controlled with antithyroid medication; a desire for pregnancy; local compressive symptoms; and recurrence after medical treatment. Although most agree with these indications, the variations among centers and parts of the world reflect the relative indications. In addition, other factors are important, such as a strong reluctance to use RAI in, for instance, Japan and a hesitance to perform surgery in North America. Therefore, it is crucial to identify sound guidelines for the management of this disease. The aim of this review was to identify the evidence base behind the current surgical management of GD and to give recommendations based on this evidence.

Treatment

Although many advances have been made in understanding GD at the molecular level, little has changed in its management. The treatment modalities mainly target the thyroid gland. Interestingly, the treatment modalities of GD vary from country to country. In the United States 70% of endocrinologists recommend RAI as the first-line treatment [2], whereas in Europe 77% of endocrinologists would use antithyroid drugs (ATDs) [3]. Only 1% in either region would recommend surgery [3, 4]. The Japanese approach is similar to the European approach, with 88% of surveyed Japanese endocrinologists recommending ATDs and 11% recommending RAI [5]. For patients with large goiters, however, 51% of European endocrinologists would recommend surgery compared to only 7% of U.S. endocrinologists, 75% of whom would recommend RAI [3, 4]. Factors that influence these differences are (1) the tendency of GD to resolve spontaneously with time, often resulting in an end-stage hypothyroidism; (2) the degree of the disease with its associated varying risk of complications such as GO; and (3) socioeconomic factors, which vary in different parts of the world.

Antithyroid drug therapy

In Europe, ATDs are the first-line therapy, with RAI or surgery as second line when ATDs fail or in the case of recurrence. In the United States, ATDs are commonly used as an adjunct before RAI or surgery. The two most commonly used ATDs in the United States are methimazole and propylthiouracil (PTU). In Europe, carbimazole is also often used. Both methimazole and PTU inhibit the organification of iodine to tyrosine residues on the thyroglobulin molecule and the coupling of iodotyrosines. PTU, but not methimazole, also inhibits peripheral thyroxine (T4) to triiodothyronine (T3) conversion [6]. The drugs appear to be equivalent in terms of efficacy. PTU is preferred during pregnancy owing to its lower placental passage, which reduces fetal exposure. Both drugs have similar minor toxicities, such as rash, arthralgias, urticaria, and gastrointestinal symptoms. Agranulocytosis, a potentially life-threatening toxicity, can occur with both medications and has an incidence of 0.2% to 0.5%. Rare major toxicities include drug-induced hepatitis and anti-neutrophil cytoplasmic antibody-positive vasculitis, which are mainly seen in patients taking PTU. Most patients treated with ATDs are euthyroid after 6 weeks and almost all by 3 months. The block/replace regimen, in which thyroxine is added to the ATD regimen once T4 levels have returned to normal, is advocated by some authors. One study showed that there is no difference in relapse using this regimen between 6 and 12 months of treatment. In either treatment regimen, the usual remission rate after 1 year is 50% to 60% and after 10 years only 30% to 40%. Negative predictors of remission include male sex, large goiters, high titers of thyroid receptor antibody and thyroperoxidase antibody (TPO) at 6 months after initiation of ATD, high baseline T4 and T3 values, and a history of more than one prior relapse [6–12].

Radioactive iodine therapy

Radioactive iodine was developed during the 1930s and was first used to treat hyperthyroidism during the 1940s. Initially 130I was used, but in 1946 131I became available and has since been the iodine isotope most widely used to treat hyperthyroidism [13]. RAI therapy has subsequently become the most common treatment modality for adults with GD in the United States [4].

Radioactive iodine is a safe, effective therapy, but it is sometimes associated with a long latency period before its effect is seen. Although 70% of patients tend to become euthyroid within 4 to 8 weeks, it may take up to 6 months before the full effect of a dose of RAI is noted [6, 14]. In addition, there is an apparent conflict between giving a dose high enough to be effective and to minimize the recurrence risk on the one hand and limiting the risk of causing hypothyroidism on the other [10]. The overall risk of persistent hyperthyroidism is 5% to 25%, and the risk of developing hypothyroidism is 20% during the first year, subsequently increasing by 3% to 5% per year; both risks depend on the dose of RAI used [10, 14, 15].
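
The hypothyroidism figures quoted above can be turned into a rough cumulative estimate. The following is only an illustrative sketch: it assumes that the 3% to 5% annual figure is an additive increment, in percentage points of the original cohort, on top of the 20% first-year risk. The function name and its parameters are hypothetical and do not come from the cited studies [10, 14, 15].

```python
def cumulative_hypothyroidism_risk(years_since_rai, first_year_risk=0.20,
                                   annual_increment=0.03):
    """Rough cumulative risk of hypothyroidism after RAI.

    Assumption (not stated in the source): the annual figure is added
    linearly, in percentage points, after the first year.
    """
    if years_since_rai < 1:
        raise ValueError("the quoted figures start at the end of year 1")
    risk = first_year_risk + annual_increment * (years_since_rai - 1)
    # A cumulative risk cannot exceed 100%.
    return min(risk, 1.0)


if __name__ == "__main__":
    for years in (1, 5, 10):
        low = cumulative_hypothyroidism_risk(years, annual_increment=0.03)
        high = cumulative_hypothyroidism_risk(years, annual_increment=0.05)
        print(f"year {years}: {low:.0%} to {high:.0%}")
```

Because both risks depend on the RAI dose, such a calculation can only indicate an order of magnitude, not predict the course in an individual patient.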

There is no consensus about whether patients need to be pretreated with ATDs prior to RAI therapy. RAI is sometimes followed by temporary worsening of the thyrotoxicosis due to uncontrolled release of thyroid hormone from degenerating cells [8, 13], which is the rationale for why some authors recommend that older patients and patients with cardiac disease be pretreated with a thionamide before RAI [13]. Other complications include potential worsening of eye disease, particularly in smokers; it can be ameliorated or prevented by glucocorticoid therapy [9]. This issue is discussed in detail below. Rare but major complications include thyrotoxic crisis and hyper- as well as hypoparathyroidism [6, 16]. However, generally there are few complications with RAI therapy. The most common minor ones in adults are nausea and pain originating from the thyroid gland often secondary to radiation thyroiditis.

Pregnancy and lactation are the two absolute contraindications for the use of RAI therapy. Although RAI therapy has no adverse effect on future fertility for women planning pregnancy, it is recommended that pregnancy be delayed at least 6 months after RAI treatment, which has made RAI less desirable for some women [13]. Many endocrinologists are hesitant to treat patients less than 20 years of age with RAI therapy because of concerns about carcinogenesis [13]. A large epidemiologic study using the Cooperative Thyrotoxicosis Study Follow Up database of more than 20,000 adults who had received 131I for treatment of thyrotoxicosis did find a slightly increased risk of mortality from thyroid cancer but not from other malignancies. However, the number of patients who died from thyroid cancer was small (24 patients), and many of these had toxic nodular goiter, which in itself is a risk factor for thyroid cancer [17]. Another study of 7209 patients treated with RAI found an increased overall mortality from all causes [18]. There is also a report of a lowered overall cancer incidence but an increased risk of thyroid and small bowel cancer [19]. No studies have yet demonstrated an increased risk of thyroid cancer or other malignancies with the use of 131I in children, although the total number of children evaluated in these studies is relatively small, and long-term follow-up is limited [16].

Surgery

The surgical management of goiter was greatly advanced by Kocher, who received the Nobel Prize in 1909 for his work (which greatly reduced the operative mortality after thyroidectomy). He proposed subtotal thyroidectomy as surgical treatment for GD, which then became the routine form of therapy for the disease [12, 16]. It was the Australian surgeon Dunhill, however, who pioneered safe surgery and anesthesia for GD, reducing the mortality to less than 2% by performing subtotal thyroidectomy on grossly toxic patients under local anesthesia [20]. After the introduction of RAI therapy during the 1940s, surgery became less common as the primary treatment. The indications for surgery that are more or less agreed upon are still young age, pregnancy and lactation, the presence of a thyroid nodule or large goiter, and patient preference (Table 1).

Table 1 Generally accepted indications for surgery in Graves’ disease

The ongoing debate over surgical indications concerns patients in whom a relative indication for surgery is present and for whom RAI and ATDs are alternatives. In addition, if surgery is chosen, there is also a debate about what type of surgery to perform: total thyroidectomy or various subtotal thyroid operations. One debated issue has been the higher incidence of thyroid cancer (TC) in patients with GD. Although some studies have reported an increased incidence, others have contradicted these figures, possibly owing to the bias inherent in the different indications for surgery, which is needed for a pathology diagnosis. However, the Cooperative Thyrotoxicosis Therapy Follow-up Study presented results from 36,000 patients (Level IV evidence) with thyrotoxicosis and stated that TC is twice as frequent in GD patients as in euthyroid individuals [21]. A recent analysis (Level IV) demonstrated an incidental finding of TC of 2%, most of the tumors being papillary, in surgically treated GD patients in whom a pathology examination was performed. After 50 months of follow-up, no support was found for TC being more aggressive in GD patients than in euthyroid patients [22]. This is in contrast to a previous study stating that thyroid receptor antibodies may function as a stimulator of papillary thyroid carcinomas (PTCs), resulting in larger, more aggressive tumors in the presence of GD [23]. Taken together, the evidence for a risk of TC in GD is still insufficient for it to be included in the list of surgical indications with the grade of recommendation used in this presentation.

Methods

An electronic PubMed search of the English-language literature from 1980 to the present was performed. Searches were performed for GD and surgery and focused on prospective randomized trials, meta-analyses, and case series. The levels of evidence for the publications included in this review were ranked according to Sackett with Heinrich et al.’s modifications [24, 25]. Level I evidence consists of large well-designed prospective randomized controlled trials (PRCT) or meta-analyses; Level II of small PRCTs with potential errors; and Level III of nonrandomized prospective studies and case–control studies. Level IV evidence consists of retrospective studies with historical controls; and Level V includes expert opinion or consensus documents. Grading of recommendations, based on the available evidence, was also performed using the criteria of Sackett et al. Grade A recommendations are supported by Level I evidence, grade B recommendations by Level II evidence, and grade C recommendations by Level III, IV, or V evidence.
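
The mapping from levels of evidence to grades of recommendation described above can be written down as a small lookup. This is a minimal sketch under stated assumptions, not the procedure used in this review: it assumes that the grade is determined solely by the best (lowest-numbered) level available, and the function name and the use of `None` for "no grade of recommendation" are hypothetical. In practice the authors also weigh the consistency and relevance of the studies.

```python
# Evidence levels as defined in the Methods text.
EVIDENCE_LEVELS = {
    1: "large well-designed PRCT or meta-analysis",
    2: "small PRCT with potential errors",
    3: "nonrandomized prospective study or case-control study",
    4: "retrospective study with historical controls",
    5: "expert opinion or consensus document",
}


def grade_of_recommendation(levels):
    """Return 'A', 'B', or 'C' for a list of evidence levels (1 to 5).

    Assumption: the best available level alone sets the grade.
    Returns None when no evidence is available.
    """
    if not levels:
        return None
    unknown = [level for level in levels if level not in EVIDENCE_LEVELS]
    if unknown:
        raise ValueError(f"unknown evidence level(s): {unknown}")
    best = min(levels)  # Level I is the strongest evidence
    if best == 1:
        return "A"
    if best == 2:
        return "B"
    return "C"  # Levels III, IV, and V


if __name__ == "__main__":
    print(grade_of_recommendation([1, 2, 4]))
    print(grade_of_recommendation([2, 4]))
    print(grade_of_recommendation([]))
```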

Issue 1: Is Surgery better than RAI or ATDs?—no grade of recommendation

The issue of whether surgery, RAI, or ATDs should be used for the general treatment of GD has been debated for decades. One PRCT attempted to study this issue, but it did not show any differences from which recommendations could be drawn [26]. After a thorough literature search, one may conclude that only a few studies have addressed this question; most of them concern GO and are presented below. For the general adult patient with GD without GO, there is no recommendation based on any level of evidence. However, if GO develops, the patient falls into another group, described below. In addition, several clinical observations have never been scrutinized with any level of evidence: poor compliance with medication, which makes GD difficult to manage; adverse reactions to ATDs; and the benefit of surgery in recurrent disease.

Issue 2: Extent of surgical resection; cure and complication rates—grade A recommendation for total thyroidectomy

The controversies regarding the extent of surgical resection in GD are presumably due to the previously reported higher incidence of permanent complications when performing total thyroidectomy (TT). However, there are now meta-analyses (Level I) and small PRCTs (Level II) that have established that TT can be performed with safety equal to that seen with lesser resections in GD. In the present review, we focus on TT versus all types of surgery leading to remnant thyroid tissue (ST), including bilateral thyroid resections and hemithyroidectomy associated with contralateral resection, as no studies have demonstrated any significant difference in outcome between these operations (e.g., Level II evidence by Andaker et al. and Muller et al. [27, 28]).

Palit et al. [29] performed a meta-analysis (Level I) on 35 studies comprising 7241 patients, where TT was performed in 538 patients and ST in 6703. The permanent recurrent laryngeal nerve (RLN) palsy rate was 0.9% after TT and 0.7% after ST, where nerve function was reported. Transient hypocalcemia occurred in 9.6% and 7.4% of patients, respectively. More important, permanent hypoparathyroidism occurred in 0.9% of patients after TT and in 1.0% after ST. All TT patients had follow-up data for time periods ranging from 4 to 12 years, and none had persistent or recurrent disease. Follow-up data on ST patients were available in 82% with a mean follow-up of 5.6 years. There was a 7.9% persistence or recurrence rate in the ST group.

Witte et al. [30] reported the results of a PRCT (Level II) including 150 patients randomized to TT or ST. They reported a 28% rate of transient hypocalcemia in the TT group but equal long-term outcomes for the two groups. Overall, their results displayed a higher incidence of both permanent RLN palsy and permanent hypoparathyroidism compared with the meta-analysis and most case series, but there was no statistically significant difference between the groups in their study. Numerous case series have presented data supporting comparable results for TT and ST in GD (Table 2). Nonrandomized studies are difficult to evaluate because most use historical controls, and one cannot disregard the hazard of selection bias. The studies by Barakate et al. and Ku et al. [31, 32] support TT as the method of choice. Because these studies represent a conscious change in practice from ST to TT, one would expect selection bias to be minimal.

Table 2 Studies on surgical approach in Graves’ disease

Given that TT inevitably gives rise to hypothyroidism, there is a lifelong need for thyroxine supplementation. It has been well documented that performing ST with the aim of producing euthyroidism is unpredictable, and there is a significant risk of hypothyroidism developing over the years, with a reported incidence of up to 70% on long-term follow-up. Thus, there is no documented perfect remnant size for the individual patient. Michie illustrated that in a range of 2- to 8-g remnant size, every gram of remnant left behind decreased the risk for hypothyroidism by 10% [33]. In a meta-analysis it was reported that the average remnant size after ST was 6.1 g (range 2–12 g), and the size was negatively correlated with hypothyroidism, with an 8.9% decline in hypothyroidism for each gram of remnant. In addition, euthyroidism correlated positively with the remnant size, with the chance of euthyroidism increasing by 6.9% per gram of remnant [29]. These data call for prolonged surveillance of the thyroid hormone status to avoid an undetected slow progression into hypothyroidism.
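
The per-gram figures from the meta-analysis [29] can be illustrated with a short calculation. The sketch below assumes that the relation is linear across the reported remnant range of 2 to 12 g and that the figures are absolute percentage points; neither assumption is stated in the text, and the function and constant names are hypothetical.

```python
# Per-gram changes reported in the meta-analysis [29], read here as
# percentage points per gram of thyroid remnant (an assumption).
HYPOTHYROIDISM_CHANGE_PER_GRAM = -8.9
EUTHYROIDISM_CHANGE_PER_GRAM = 6.9
REMNANT_RANGE_G = (2.0, 12.0)  # range of remnant sizes in the meta-analysis


def expected_change_in_rates(from_grams, to_grams):
    """Estimate the change in outcome rates between two remnant sizes.

    Returns percentage-point changes, assuming a linear relation that
    holds only within the reported remnant range.
    """
    low, high = REMNANT_RANGE_G
    for grams in (from_grams, to_grams):
        if not low <= grams <= high:
            raise ValueError(f"{grams} g is outside the reported {low}-{high} g range")
    delta = to_grams - from_grams
    return {
        "hypothyroidism": HYPOTHYROIDISM_CHANGE_PER_GRAM * delta,
        "euthyroidism": EUTHYROIDISM_CHANGE_PER_GRAM * delta,
    }


if __name__ == "__main__":
    # Leaving a 6-g instead of a 4-g remnant.
    print(expected_change_in_rates(4, 6))
```

The two slopes do not sum to zero; under this reading, the difference would correspond to patients with persistent or recurrent hyperthyroidism, which is consistent with the trade-off discussed in the text but is an interpretation rather than a reported result.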

There is also the risk of persistent or recurrent disease. Recurrence rates of 1% to more than 20% have been reported (Table 2). Given that the only advantage of ST has been its perceived lower complication rates, we believe that there are now sufficient data regarding cure rates and complications to recommend that TT should be the procedure of choice when performing surgery for GD, thus reaching a grade A recommendation. The arguments may be summarized as follows: compared with ST, TT carries a significantly reduced risk of recurrence and of the need for reoperation or RAI therapy, with the same morbidity. In addition, although evidence of an effect on morbidity and mortality is still lacking, TT also removes micropapillary thyroid carcinomas, which have been found at a slightly higher incidence in patients with GD (8%) than in patients without GD.

Issue 3: Surgery and Graves’ ophthalmopathy

The effect of surgery on Graves’ ophthalmopathy (GO) without any comparison to other treatment modalities has been extensively reported in the literature. The data from these studies are conflicting, which may be due, at least in part, to the retrospective nature of most studies and the lack of precise evaluation of ocular involvement. Studies comparing the extent of surgical resection, or comparing RAI with surgery, are scarce.

Surgery, ATDs, or RAI when GO is present?—Grade B recommendation for surgery

Several studies have addressed the question of whether surgery may be the treatment of choice for GO. The reason is the worsening of GO noted after RAI in some patients. A PRCT from 1992 [34] noted progression of GO in 32% of patients after RAI, whereas only 16% progressed after ST. These results (Level II evidence) lead to a grade B recommendation to avoid RAI in patients with GO. On the other hand, glucocorticoid protection seems to be as effective as surgery or ATDs in diminishing this risk [35]. Indeed, worsening of GO may also occur after surgery, which has prompted the general recommendation to use glucocorticoid treatment during thyroid ablative therapy, regardless of whether it is by RAI or surgery. RAI as well as surgery may induce a rise in the level of thyroid receptor antibodies and thyroid hormones after treatment, which is possibly more dramatic after RAI because of uncontrolled leakage of hormones from radiation-induced tissue destruction. A recent PRCT (Level II evidence) documented that total thyroid ablation achieved by near-total thyroidectomy followed by RAI had a better outcome on mild GO than near-TT alone in association with glucocorticoid treatment [36] (Table 3). This indicates that remaining thyroid antigens may continue to induce autoantibodies that are causative for GO. However, long-term results are lacking. PRCTs with Level II evidence have also evaluated ATDs as treatment of GD with ophthalmopathy and have documented a higher relapse risk or similar results in terms of lack of worsening of the GO. ATDs did not induce cure at the same rate as the alternative treatments in these studies [26, 35].

Table 3 Effect of treatment on the course of Graves’ ophthalmopathy

Taken together, one grade B recommendation is that mild GO benefits from “total thyroid ablation” including TT (or near-TT) followed by RAI under glucocorticoid protection. Studies with a high level of evidence for severe GO are scarce; however, after combining the data of Tallstedt et al. [34], Bartalena et al. [35], and Abe et al. [37], one may arrive at a grade B recommendation that surgery is preferred in patients with severe GO to minimize the risk for worsened GO or relapse of the GD.

Does extent of resection affect GO?—Grade B recommendation for equal outcomes

There are only two small prospective studies (Level II) that address whether the extent of resection affects the outcome of GO after surgery [30, 38]. Neither of these studies was able to show any significant difference between TT and ST. Several retrospective studies have demonstrated either equal results between TT and ST or a slight advantage for TT concerning improvement of GO (Table 3). Based on the available data, one may conclude that TT has no advantage over ST in patients with GD and GO (grade B recommendation).

Issue 4: Preferred treatment for children with Graves’ disease—no grade of recommendation

Treatment of GD in children is performed using the same methods as in adults: ATDs, RAI, and surgery. However, there are no RCTs, and the various recommendations have been based on low levels of evidence according to the presently used classification. Generally, evidence from reports of adult GD has been extrapolated to childhood, with special considerations mostly regarding RAI and hypothyroidism. RAI has recently also been proposed for treating juvenile thyrotoxicosis in North America, whereas ATDs are the primary choice in Japan and Europe, followed by a low threshold for surgery.

It is generally accepted that treatment with ATDs alone is effective in only a fraction of children, being most successful in those with a low body mass index (BMI) and a small goiter (Table 4) [39]. The experience at our own institution has been that, after long-term follow-up, only a small fraction enters permanent remission [40]. In addition, there are small risks of adverse reactions, including hepatic failure and bone marrow suppression. Other disadvantages are the slow onset of effect and the need for frequent monitoring to achieve and maintain euthyroidism during treatment.

Table 4 Studies on pediatric Graves’ disease

RAI is effective in children, with high cure rates. The risk of radiation-induced papillary thyroid cancer has been debated. A meta-analysis focusing on this issue has documented that the risk is twofold higher in children less than 5 years of age than in children 5 to 9 years of age and fivefold higher than in children 10 to 14 years of age [41]. However, long-term results are lacking. The risk of thyroid cancer diminishes to the adult nonincreased risk level in patients 15 to 20 years of age [42]. RAI is not always effective after the first dose, especially as sensitivity to RAI is somewhat difficult to estimate in children, leading to a higher risk of an insufficient dose that necessitates a second dose. The adverse effect on GO should be noted (see above).

Thyroid surgery can today be performed safely in children, with low complication rates. However, there are still controversies regarding the treatment of GD in children based on the generally accepted notion that none of the therapeutic options is capable of achieving remission without some type of complication or long-term consequence. This includes surgery, even if it is performed in a safe way. TT induces inevitable hypothyroidism associated with lifelong thyroxine substitution, leading to close surveillance needs especially during adolescence. ST may also induce hypothyroidism, but the risk of persistent disease or relapse occurring is especially troublesome in children.

Several reviews state that definitive treatment with RAI or surgery should be advocated, although initially ATDs are generally given. However, no recommendation based on any level of evidence can be proposed. Several authors have listed factors favoring surgery rather than RAI: young age, large goiter, GO, and Down’s syndrome.

Conclusions

There are no recommendations that reach any grade of evidence for which treatment to choose for adults with GD without GO. However, if surgery is chosen, total thyroidectomy should be performed because there is grade A evidence to support this option.

In case of severe GO in adults, total thyroidectomy is recommended, supported by grade B evidence. Such patients should probably also be treated further with completion RAI accompanied by glucocorticoids.

No recommendation can be made at any level of evidence regarding which treatment is appropriate for children. However, so long as data on long-term cancer risk are missing or conflicting and until RAI has proven harmless in children, we continue to recommend surgery in this group.