Introduction

Thyroid cancer is the most common endocrine malignancy, with differentiated thyroid cancer (DTC) comprising more than 90% of all thyroid carcinomas [1]. Thyroid follicular cells are assumed to be the only source of thyroglobulin (Tg), a large glycoprotein (660,000 Daltons), in the human body [2]. Serum Tg can be used as a tumor marker to evaluate disease status throughout the course of DTC. Most nuclear medicine (NM) physicians in Asia are actively involved in the management of DTC, and radioactive iodine (RAI) therapy for DTC is common practice among NM physicians throughout Asia. They actively use serologic markers such as Tg, anti-Tg antibody (TgAb), and thyroid stimulating hormone (TSH) as well as imaging studies including neck ultrasound (US) and RAI whole body scan (WBS) [3]. Of note, Tg measurement can be influenced by a variety of factors, including TgAb, TSH level, and different laboratory assay methods, which give rise to test interference and heterogeneous results. The present paper describes the clinical roles of Tg in the management of DTC through a wide systematic review. We performed a literature search in the electronic databases PubMed, Embase, and the Cochrane Library up to January 2019 using the following keywords: thyroid, cancer/carcinoma/neoplasm, and thyroglobulin. Inclusion criteria were prospective and retrospective original studies written in English regarding thyroglobulin in DTC. Additional evidence was collected by identifying studies from the reference lists of selected articles. Reviews, expert opinion articles, and guidelines were also appraised. Due to the plethora of relevant studies, the most recent and qualified studies on each clinical issue were selected and reviewed.

Preoperative measurement of Tg

The role of preoperative serum Tg assay is controversial. The American Thyroid Association (ATA) guideline in 2015 did not recommend measuring serum Tg routinely for initial evaluation of thyroid nodules [4]. A complementary role of preoperative serum Tg can be expected in selected cases. Serum Tg combined with nodule size on US can provide information to differentiate follicular cancer from benign nodules in indeterminate nodules [5]. Serum Tg is an insensitive marker for diagnosis of metastatic disease, as its level is low in many DTC patients with locoregional metastasis. Malignant nodule topography, such as the location of the nodule, was helpful to predict the presence of metastases [6]. Serum Tg level is correlated with the size of the thyroid gland and thyroid nodules rather than metastases [7]. Tg can also be measured in the washout of fine needle aspiration (FNA) and used for differential diagnosis. Multiple studies have shown high sensitivity and specificity of FNA washout Tg for the diagnosis of metastasis from thyroid cancer in neck masses and lymph nodes [8,9,10,11]. However, the variety of cutoff values reported for FNA washout Tg makes this technique difficult to use in daily practice [12, 13]. False-positive results of FNA washout Tg can occur if there is blood contamination in the aspirate [14].

Postoperative Tg for selection of patients and dose for RAI treatment

Measurement of either TSH-stimulated or non-stimulated serum Tg a few weeks after total thyroidectomy and before RAI therapy has been demonstrated to be clinically useful. The ATA 2015 guideline recommends measurement of serum Tg at 3–4 weeks after operation, when serum Tg level reaches its nadir [4]. It is regarded as a tool to aid in initial risk stratification and decision-making for adjuvant therapy, as it helps to assess the persistence of residual thyroid tissue, either remnant normal thyroid tissue or thyroid cancer. The negative predictive value of postoperative, pre-ablation serum Tg is very high for successful RAI treatment and recurrence of DTC [15, 16]. RAI treatment is usually not recommended if postoperative Tg is undetectable or very low (less than 1.0 ng/mL for TSH-stimulated status or less than 0.2 ng/mL for non-stimulated status) [17].

To date, no explicitly stated or widely accepted Tg-based dose-decision rules are available. Zhang et al. [18] have demonstrated that low-dose RAI (30 mCi) treatment is not inferior to high-dose (100 mCi or higher) treatment in achieving excellent response when DTC patients have stimulated Tg of less than 5 ng/mL, irrespective of lymph node metastases. Jin et al. [19] have recently compared a group receiving doses adjusted according to RAI uptake and serum Tg levels (n = 207) with a fixed-dose group (n = 58). The RAI uptake and Tg-based activity was established based on four levels of RAI uptake (≤ 2%, 2–5%, 5–15%, and > 15%) and Tg levels (≤ 2 ng/mL, 2–5 ng/mL, 5–10 ng/mL, and > 10 ng/mL). Based on this, RAI doses of 1.1, 1.85, 3.7, and 5.55 GBq were administered. The dose in the adjusted group varied from 1.54 to 3.26 GBq. Although the administered dose was significantly lower than that in the fixed-dose group, the successful response rate was significantly higher in the adjusted group (94.2% vs. 70.7%; p < 0.0001). The incidence of xerostomia was significantly lower in the adjusted group. Further multicenter studies are warranted to confirm whether serum Tg level can be used to select RAI dose.
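The studies above quote administered activities in both mCi and GBq. As a reading aid, the following minimal sketch converts between the two units using the definition 1 mCi = 37 MBq; the function names are illustrative and do not come from the cited studies.

```python
# 1 mCi = 37 MBq = 0.037 GBq (definition of the curie).
GBQ_PER_MCI = 0.037


def mci_to_gbq(activity_mci: float) -> float:
    """Convert an administered activity from millicuries to gigabecquerels."""
    return activity_mci * GBQ_PER_MCI


def gbq_to_mci(activity_gbq: float) -> float:
    """Convert an administered activity from gigabecquerels to millicuries."""
    return activity_gbq / GBQ_PER_MCI


if __name__ == "__main__":
    # Activities quoted in mCi in the text (Zhang et al.).
    for mci in (30, 100):
        print(f"{mci} mCi = {mci_to_gbq(mci):.2f} GBq")
    # Activities quoted in GBq in the text (Jin et al.).
    for gbq in (1.1, 1.85, 3.7, 5.55):
        print(f"{gbq} GBq = {gbq_to_mci(gbq):.0f} mCi")
```

On this conversion, the 30 mCi low dose corresponds to about 1.1 GBq and the 100 mCi dose to 3.7 GBq, so the two studies are discussing overlapping activity levels.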

Preablative Tg for prediction of treatment response and recurrence

Preablative Tg has great value in predicting therapeutic response and recurrence [20]. A high preablative stimulated Tg level is the most significant predictor of therapeutic failure at the time of first RAI therapy in patients with DTC [21]. Several studies have shown that the optimal diagnostic threshold of Tg level is between 20 and 30 ng/mL in predicting resistance or recurrence of the disease [22,23,24]. Patients with preablative stimulated Tg level ≥ 10 ng/mL have 25.5 times greater chance of therapeutic failure than those with level < 10 ng/mL [25]. Yang et al. [26] have provided a cutoff value of preablative stimulated Tg at 26.75 ng/mL for differentiating structural incomplete response from excellent, indeterminate, or biochemical incomplete responses. Preablative stimulated Tg is an independent and the most predictive variable of structural incomplete response (odds ratio: 42.312; P < 0.001) [26]. High postoperative TSH-stimulated Tg values (> 10–30 ng/mL) are also associated with poor survival and treatment failure [27, 28].

There is a good probability that most patients with low Tg (< 2 ng/mL) can achieve excellent response [29]. Although a low level of pre-ablation stimulated Tg (less than 1–2 ng/mL) can indicate a lower recurrence rate and better prognosis, it cannot completely eliminate the possibility of identifying metastatic foci outside the thyroid bed on post-therapy WBS [30, 31]. In one study, RAI-avid metastatic foci outside the thyroid bed, mostly in cervical and mediastinal lymph nodes, were detected in 6.3% of intermediate-/high-risk patients with stimulated Tg < 2 ng/mL [32]. Hu et al. [33] have reported that the prognosis of patients showing positive post-therapy WBS but negative serum Tg is excellent. There was no recurrence or metastasis in such patients [33].

The timing of serum Tg measurement after recombinant human TSH (rhTSH) is controversial. Serum Tg is usually measured at 72 h after the second injection of rhTSH, as this corresponds to the peak serum Tg level [34, 35]. Serum Tg values measured after RAI therapy may include Tg released from thyroid follicles of normal thyroid tissue damaged by RAI. Thus, they might not be reliably predictive [36, 37]. Park et al. [38] have shown that stimulated Tg measured early after the second rhTSH injection, but before RAI administration, is more predictive than Tg values measured after RAI therapy. The cutoff value to predict a non-excellent response was 2.0 ng/mL.

Combinations of serologic markers with imaging studies offer distinct advantages for diagnosis. Higher serum Tg level indicates presence of residual thyroid tissue with or without distant metastasis. Lin et al. [39] have reported that preablative stimulated Tg is significantly different between patients with and without distant metastasis (440.6 vs 5.3 ng/mL, p < 0.0001). It is important to evaluate the location of the lesion by imaging studies such as 123I WBS and 18F-fluorodeoxyglucose (FDG) positron emission tomography/computed tomography (PET/CT) when serum Tg level is higher. Sometimes distant metastasis can be found in patients showing low serum Tg level. One study reported that 9 out of 573 patients (1.56%) had a discordant pattern with low level of Tg/positive WBS in post-surgical follow-up [40]. Of these 9 patients, four were metastatic at presentation, while five who developed metastasis during follow-up persistently showed low levels of Tg (< 5 ng/mL) [40]. The risk of persistent disease was minimal when both stimulated Tg and post-therapy WBS were negative: only 2 of 93 patients had persistent disease [41]. However, neither Tg level alone nor WBS alone can be considered a reliable indicator of the absence of disease. Diagnostic WBS contributes to risk stratification by defining residual nodal and distant metastatic disease. It has changed clinical management for 29.4% of patients [41]. Single photon emission computed tomography with computed tomography (SPECT/CT) could provide additional clinical information compared to planar WBS, which results in a change in risk stratification after surgery or in patient management. Avram et al. [42] demonstrated that pre-ablation SPECT/CT, combined with stimulated serum Tg, altered the risk stratification in 15% of the patients by detecting residual nodal or distant metastases, which led to changes in the further management plan in 29.4% of the patients with DTC. Jeong et al. [43] reported that SPECT/CT could detect additional findings in 8.6% of the patients and that the serum Tg levels at the time of RAI therapy were significantly higher in patients with additional imaging findings than in those without them. Based on this study, post-therapeutic SPECT/CT should be considered in patients with a high level of serum Tg for the detection of hidden metastases.

Tg for risk stratification after RAI treatment

Serum Tg on thyroxine therapy is recommended to be measured every 6–12 months after RAI treatment for low and intermediate risk patients [4]. It can be lengthened to 12–24 months if patients show excellent response to therapy. Excellent response is defined as an unstimulated Tg below 0.2 ng/mL or stimulated Tg below 1.0 ng/mL with negative imaging findings based on ATA guideline 2015. Hu et al. [44] have shown that suppressed Tg < 0.2 ng/mL performs better than stimulated Tg < 1.0 ng/mL in defining an excellent response. More frequent Tg measurements may be appropriate for high-risk patients. Several studies have shown a strong correlation between follow-up Tg and recurrence or response to initial treatment. In patients with positive TgAb, serum Tg concentrations alone cannot be used as a marker to detect disease status. Disease recurrence can be heralded by a rise in TgAb with or without a corresponding rise in serum Tg. Changes in TgAb levels in the first year after surgery can predict the risk of persistent or recurrent disease of TgAb-positive DTC patients. Patients who have achieved negativization of TgAb have excellent prognosis [45].

RAI therapy algorithm using Tg as a gatekeeper along with other risk factors

Although many clinicians commonly measure serum Tg in the course of RAI treatment [3], there is no consensus on the timing of measurement or the optimal cutoff value, and such measurement is not adopted in guidelines as an indicator for decision-making [4]. In this review, we suggest a new RAI therapy and follow-up algorithm considering both Tg and conventional risk factors as indicators (Fig. 1). RAI therapy might not be recommended if postoperative Tg is undetectable or very low (less than 0.2 ng/mL in unstimulated status or less than 1.0 ng/mL in stimulated status) in low-risk patients. It was reported that 17.4% of DTC patients with low or low-to-intermediate risk showed metastases on post-therapy WBS [46]. Postoperative RAI treatment can be actively considered as an adjuvant therapy if serum Tg level is detectable. When patients show higher serum Tg levels, indicating the presence of residual thyroid tissue, the location of residual thyroid tissue should be evaluated by imaging studies such as 123I WBS and 18F-FDG PET/CT. According to the authors' experience, patients with pretreatment stimulated Tg levels ≥ 10 ng/mL have greater risk of therapeutic failure [25]. We recommend imaging studies when stimulated Tg level is 10 ng/mL or over. A serum Tg level of 10 ng/mL after TSH stimulation is equivalent to 2 ng/mL of non-stimulated Tg [4]. RAI-avidity is an important finding in the decision-making of RAI therapy. Surgical resection or external radiotherapy should be considered before RAI therapy when metastatic lesions show low RAI-avidity. A dose of RAI higher than 5.5 GBq is considered for the high-risk group, especially for patients with distant metastasis showing RAI-avidity. Bone marrow dosimetry by the Benua method [47] can be done by serially measuring blood activities. This information is important for the next high-dose RAI therapy, especially when treatment failure is suspected.

Fig. 1 Algorithm for radioactive iodine therapy using thyroglobulin as a gatekeeper. TT total thyroidectomy, Tg thyroglobulin (along with anti-Tg antibody), Risk risk assessment using clinical and pathologic information, LR low risk, IHR intermediate to high-risk, THR thyroid hormone replacement, Img imaging studies, M malignancy or metastasis, RT external radiotherapy, RAI radioactive iodine, (−) negative, (+) positive, (++) strong positive

There is a trend to de-escalate the dose in patients in the intermediate-risk group. We suggest administration of 1.1 GBq when non-stimulated serum Tg level is between 0.2 and 2.0 ng/mL or when stimulated serum Tg is between 1.0 and 10.0 ng/mL, if the patient's risks are favorable. This dose can be administered at an outpatient clinic in many countries. WBS at 3 days after RAI therapy is important to evaluate the extent of residual thyroid tissue in the body. Delayed imaging after several days is helpful when there is a discrepancy between serum Tg level and WBS result.
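The Tg thresholds used in this algorithm can be summarized in a small sketch. It encodes only the cutoffs stated in the text (0.2 and 2.0 ng/mL for non-stimulated Tg, 1.0 and 10.0 ng/mL for stimulated Tg). The band names, the function name, and the choice to place a value of exactly 10 ng/mL (or 2 ng/mL non-stimulated) in the upper band are assumptions made for illustration. Risk group, TgAb status, and RAI-avidity, which the algorithm also weighs, are not modeled.

```python
from enum import Enum


class TgBand(Enum):
    VERY_LOW = "undetectable or very low"  # RAI might not be recommended in low-risk patients
    DETECTABLE = "detectable"              # 1.1 GBq suggested if risks are favorable
    HIGH = "high"                          # imaging studies recommended


# (upper bound of the very-low band, lower bound of the high band), in ng/mL.
# Values are the cutoffs stated in the text; the band edges are an assumption.
THRESHOLDS = {
    True: (1.0, 10.0),   # TSH-stimulated Tg
    False: (0.2, 2.0),   # non-stimulated Tg
}


def classify_tg(tg_ng_ml: float, stimulated: bool) -> TgBand:
    """Place a postoperative serum Tg value into one of three illustrative bands."""
    if tg_ng_ml < 0:
        raise ValueError("Tg concentration cannot be negative")
    very_low_below, high_from = THRESHOLDS[stimulated]
    if tg_ng_ml < very_low_below:
        return TgBand.VERY_LOW
    if tg_ng_ml >= high_from:
        return TgBand.HIGH
    return TgBand.DETECTABLE


if __name__ == "__main__":
    print(classify_tg(0.1, stimulated=False).value)
    print(classify_tg(5.0, stimulated=True).value)
    print(classify_tg(12.0, stimulated=True).value)
```

Keeping the stimulated and non-stimulated cutoffs in one table makes the stated equivalence (10 ng/mL stimulated to 2 ng/mL non-stimulated) explicit and easy to revise if different cutoffs are adopted.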

Response to therapy is evaluated between 6 and 18 months after RAI treatment by serum Tg, TgAb, TSH, neck US, and RAI WBS. If a patient shows complete response, i.e., serum Tg is undetectable and imaging studies are negative, a surveillance protocol with longer intervals can be applied. Undetectable Tg alone is insufficient to confirm complete response, as some patients show metastatic lesions. Serial measurement of serum Tg is sufficient once complete response is confirmed by both serum Tg and imaging studies. Appropriate treatment plans should be applied when a patient has either a biochemically or a structurally incomplete response.

Interpretation of Tg in various clinical conditions

Stimulated Tg measured after thyroid hormone withdrawal (THW) varies with the level of TSH, which can increase with the duration of THW. Son et al. [48] have reported that change of TSH is positively correlated with change of Tg in each patient. They revealed that stimulated Tg of an individual patient after THW was not identical across measurements and recommended considering TSH level when interpreting serial stimulated Tg measurements. rhTSH has become a safe and effective means to increase serum TSH level for RAI therapy preparation while avoiding signs and symptoms of hypothyroidism associated with THW. Comparisons of Tg levels after THW and rhTSH administration have been reported. Kowalska et al. [49] have reported that rhTSH-stimulated Tg values of 0.6 and 2.3 ng/mL correspond to THW-stimulated Tg levels of 2.0 and 10.0 ng/mL, respectively. Patients with THW-stimulated Tg > 10 ng/mL [20, 25] and rhTSH-aided Tg > 2 ng/mL [38] have been reported to have poor prognosis. THW-aided stimulated Tg > 5.22 ng/mL [50] and rhTSH-stimulated Tg > 4.64 ng/mL [51] at the time of RAI therapy were associated with recurrence in a 6-year follow-up.

The clinical role of serial Tg measurement before and after RAI treatment has been reported for prediction of therapeutic response. High ratios of serum Tg measured at 3, 5, and 7 days after treatment to pretreatment Tg indicate the release of stored Tg in follicles of remnant thyroid tissue, which can reflect early thyroid tissue destruction and good response of RAI therapy [37, 52, 53]. Early released Tg indicated by high Tg-Day3/Tg-Day0 ratio (i.e., extensive release of Tg to the blood after RAI therapy) and low preablative Tg (i.e., small remnant thyroid tissue) constitute indices of successful ablation after RAI therapy [37]. Similar findings have been observed for Tg-Day5/Tg-Day0 and Tg-Day7/Tg-Day0 ratios [52, 53].
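The ratios described above are simple quotients of a post-treatment Tg value over the pretreatment (Day 0) value. A minimal sketch follows; the function name and the sample values are hypothetical and are not taken from the cited studies, and no cutoff is encoded because the text does not state one.

```python
def tg_release_ratios(tg_by_day: dict[int, float]) -> dict[int, float]:
    """Return Tg-DayN/Tg-Day0 for every measured day N other than day 0.

    tg_by_day maps the day after RAI administration to serum Tg in ng/mL;
    day 0 is the pretreatment (preablative) value.
    """
    baseline = tg_by_day[0]
    if baseline <= 0:
        # An undetectable baseline leaves the ratio undefined.
        raise ValueError("pretreatment Tg must be a positive, measurable value")
    return {day: value / baseline for day, value in tg_by_day.items() if day != 0}


if __name__ == "__main__":
    # Hypothetical serial measurements, for illustration only.
    sample = {0: 2.0, 3: 10.0, 5: 8.0, 7: 4.0}
    for day, ratio in sorted(tg_release_ratios(sample).items()):
        print(f"Tg-Day{day}/Tg-Day0 = {ratio:.1f}")
```

The guard on the baseline reflects a practical point: when preablative Tg is below the assay's functional sensitivity, the ratio cannot be computed and the low preablative value has to be interpreted on its own.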

Patients who no longer respond to RAI therapy despite appropriate TSH stimulation and iodine preparation have been identified as having RAI refractory DTC [4]. These patients often show no RAI uptake on WBS, but significantly elevated or rapidly rising serum Tg levels. Wang et al. [54] have observed changes of stimulated Tg between the first and the second RAI treatments. If Tg level does not decrease appreciably after the first RAI therapy, patients may benefit little and have an elevated risk of RAI refractory DTC. The cutoff value of the first/second Tg ratio has been reported to be 0.544, with a sensitivity of 90.0% and a specificity of 47.7% [54].

18F-FDG PET/CT plays an emerging role in assessment of RAI refractory DTC (Fig. 2). A recent study has suggested that 18F-FDG PET/CT may be of great value in identifying metastases in postoperative DTC patients with elevated Tg before RAI administration [55]. 18F-FDG positive metastatic DTC with maximum standardized uptake value (SUVmax) of greater than 4.0 possesses higher probability of non-avidity to RAI [55]. Another study has shown that 18F-FDG PET/CT is useful for investigating DTC with high stimulated Tg levels (> 10 ng/mL) and negative WBS as a standard of care for RAI refractory DTC [56]. Higher Tg level is related to higher sensitivity of 18F-FDG PET/CT [56].

Fig. 2 18F-FDG PET/CT (a) and 123I whole body scan (b) images of a 51-year-old female patient with papillary thyroid cancer who underwent total thyroidectomy and two rounds of high-dose (5.55 GBq) 131I treatment. At follow-up, stimulated serum thyroglobulin level was 32.0 ng/mL, anti-thyroglobulin antibody level was 21 U/mL, and thyrotropin level was 35.79 uIU/mL. Recurrence was verified by fine needle aspiration cytology. 18F-FDG PET/CT is sensitive in detecting non-RAI-avid lesions in patients with elevated Tg

Tg measurements in concordance with response evaluation criteria in solid tumors (RECIST 1.1) are valuable in the evaluation of tumor responses to molecular targeted treatment, especially for earlier response evaluation. A decrease in serum Tg level following sorafenib therapy has been observed in patients with RAI-refractory DTC [57,58,59]. Lin et al. [60] have also demonstrated rapid metabolic and structural response to apatinib in RAI refractory DTC. Tg and 18F-FDG PET/CT could provide informative and distinct measures of biochemical and glucose metabolism changes, and they could be used as early response indicators [60].

Pitfalls of thyroglobulin measurement

Over the last few decades, three main methods for Tg measurement have been developed: radioimmunoassay (RIA) since the 1970s, immunoradiometric assays (IRMA) since the 1980s, and liquid chromatography–tandem mass spectrometry (LC–MS/MS) since 2008. These three methods have different sensitivities, specificities, and propensities for interference from TgAb and heterophile antibodies (HAb). Many NM departments in Asia are running laboratories to provide measurements for Tg, TgAb, and other thyroid hormones. In RIA, Tg from the test sample competes with a radiolabelled (125I) human Tg for binding to a limited amount of a high-affinity rabbit polyclonal TgAb. The functional sensitivity of RIA can reach a level of 0.5 µg/L [61, 62]. RIA is claimed to be more resistant than other assays to TgAb interference because polyclonal antibodies could recognize Tg epitopes bound to TgAb [61]. However, false results due to TgAb still occur, with overestimation or underestimation of Tg [63,64,65]. IRMA are based on a two-site reaction that involves Tg capture by a solid-phase antibody followed by addition of a labeled antibody that targets different epitopes on the captured Tg [66]. The second-generation IRMA tests have higher functional sensitivity (≤ 0.10 µg/L) than the first-generation tests, which have a functional sensitivity of ~ 1.0 µg/L. The second-generation IMA has been widely used as the standard of care, especially for monitoring low basal Tg [67,68,69]. The main limitation of Tg-IRMA is its high susceptibility to TgAb [70, 71] and HAb interference [72, 73]. The presence of TgAb might cause a falsely low or undetectable result, whereas HAb might cause a falsely high result. TgAb interference is the most common problem, limiting Tg utility in one-fourth of DTC patients [74]. HAb interference occurs in only approximately 0.5% of DTC patients [73, 75]. Because all proteins, including Tg and TgAb, are cleaved before analysis, MS is regarded as a solution to TgAb interference [66]. Kushnir et al. [76] have demonstrated that 23% of TgAb-positive samples with undetectable Tg concentrations by immunoassay (< 0.1 µg/L) have Tg concentrations ranging from 0.7 to 11 µg/L on MS assay. The degree to which TgAb causes underestimation of Tg varies according to Tg assay kit [77]. Different methods can display a twofold difference in the numeric thyroglobulin values reported for the same serum [61, 71]. In clinical practice, such discrepancies emphasize that Tg monitoring should be done using the same manufacturer's method, the same kit, and preferably the same laboratory. Ahn et al. [78] have suggested that applying a mathematical equation to estimate the true Tg value can eliminate the effect of TgAb on Tg measurement. Currently, the utility of MS is compromised by suboptimal functional sensitivity and a longer turnaround time than the second-generation IMA, together with limited availability and high instrumentation cost [69]. Discordances in measurement results can also be observed. TgAb is recommended to be tested along with every Tg measurement. When positive TgAb renders IRMA unreliable, serum TgAb concentration can be used as a surrogate disease marker in TgAb-positive patients. Notably, the use of the TgAb trend as a surrogate of disease is only possible when the same assay is used longitudinally.
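Because 1 µg/L equals 1 ng/mL, the functional sensitivities quoted above can be compared directly with the clinical Tg cutoffs used elsewhere in this review. The sketch below is an illustration only: it lists the three sensitivities stated in the text (treating "~ 1.0" and "≤ 0.10" as point values), omits LC–MS/MS because no figure is given for it, and assumes, as a simplification, that an assay can resolve a cutoff when its functional sensitivity is at or below that cutoff. The names are hypothetical.

```python
# Functional sensitivities quoted in the text, in µg/L (numerically equal to ng/mL).
FUNCTIONAL_SENSITIVITY_UG_L = {
    "RIA": 0.5,
    "IRMA, first generation": 1.0,    # stated as approximately 1.0
    "IRMA, second generation": 0.10,  # stated as 0.10 or lower
}


def assays_resolving(cutoff_ng_ml: float) -> list[str]:
    """List assays whose functional sensitivity is at or below the given cutoff.

    Assumption: "at or below" is used as a simple criterion; a laboratory
    would normally want additional margin below the clinical cutoff.
    """
    return [
        name
        for name, sensitivity in FUNCTIONAL_SENSITIVITY_UG_L.items()
        if sensitivity <= cutoff_ng_ml
    ]


if __name__ == "__main__":
    # 0.2 ng/mL is the non-stimulated cutoff for excellent response; 1.0 ng/mL the stimulated one.
    for cutoff in (0.2, 1.0):
        print(cutoff, assays_resolving(cutoff))
```

Under these assumptions, only the second-generation assay reaches the 0.2 ng/mL non-stimulated cutoff, which is consistent with its use for monitoring low basal Tg.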

Future perspectives

Much has been learned about the clinical utility of Tg as a biomarker for DTC over the past decades. Newly developed assay methods with high sensitivity offer more accurate ways for disease surveillance. A careful assessment of interferences in Tg measurement (from TgAb, HAb, and inadequate TSH level) is needed to overcome potential pitfalls. Interpreting Tg together with appropriate imaging strengthens its clinical value. More well-designed, prospective, multicenter studies are needed to clarify the full role of Tg in DTC.