Introduction

The American Joint Committee on Cancer/Union for International Cancer Control tumor-node-metastasis (TNM) staging system, recommended by the American Thyroid Association (ATA) and National Comprehensive Cancer Network (NCCN) guidelines, is used to predict survival and guide treatment in patients with differentiated thyroid carcinoma (DTC) [1]. Based on recent evidence, the eighth edition of the TNM staging system (TNM-8) was published in late 2016 and came into effect in January 2018. The differences between the seventh (TNM-7) and eighth editions of the staging system have been well documented by studies from the United States, South Korea, Australia, and other countries, which showed the superiority of TNM-8 for predicting disease mortality [2,3,4,5,6]. However, no study has examined the effects of the system update in a Chinese population of patients with DTC.

In addition to the increased cutoff age and the reclassification of T3, the decreased stage of N1 disease and redefinitions of lateral lymph node metastases (LNM) are the most notable changes distinguishing TNM-8 from TNM-7 [6]. Most studies have suggested that regional LNM, especially for N1b, has prognostic significance in DTC [7,8,9,10,11]. Furthermore, TNM-8 classifies level VII lymph nodes as central neck lymph nodes (N1a) rather than lateral ones (N1b), considering that level VII lymph nodes are more anatomically consistent with central ones [12]. However, due to the rarity of level VII LNM, there is no study showing its impact on mortality. There is controversy over the prognostic implications of N1b redefinition and classification that may affect the model performance of TNM-8.

This study aimed to compare TNM-8 and TNM-7 for predicting disease-specific survival (DSS) of Chinese patients with DTC, and to evaluate the effect of N1b redefinition and reclassification on prediction of survival. A modification of TNM-8 considering LNM status in older patients is also suggested.

Materials and methods

Patients

A total of 591 consecutive patients with DTC who underwent initial thyroid surgery at the Tianjin Medical University Cancer Institute and Hospital (Tianjin, China) and Fujian Medical University Cancer Hospital from January 2004 to December 2005 were selected for this study. Within this cohort, 22 patients who were less than 18 years old were excluded; therefore, 569 patients were eligible for analysis. All patients had pathologically proven papillary thyroid carcinoma (PTC) or follicular thyroid carcinoma (FTC), including Hürthle cell carcinoma (HCC). Patients with anaplastic carcinomas or poorly differentiated carcinomas in the thyroid were excluded from this study. Partial thyroidectomy was the preferred treatment for small (< 4 cm), intrathyroidal, unifocal papillary carcinomas in the absence of aggressive histology, prior head and neck irradiation, or preoperatively detected cervical nodal metastases. Radioactive iodine (RAI) ablation was performed for patients with known distant metastases, gross extrathyroidal extension of the tumor regardless of tumor size, or suspicious thyroid remnant following total thyroidectomy.

As previously reported, prophylactic central compartment node dissection (CCND) was routinely performed, whereas therapeutic central and lateral neck dissection was performed in patients with clinically suspicious lateral LNM. Preoperative LNM status was assessed with either high-resolution ultrasonography or enhanced computed tomography. CCND usually involved level VI lymph nodes (LNs). Dissection of level VII LNs was performed only in patients with suspected metastasis in that area. Lateral neck LN dissection included levels II, III, and IV. Dissection of levels I and V was performed only for suspected metastatic LNs. Other variables, including age at diagnosis, sex, disease type, TNM stage, minor extrathyroidal extension, extent of surgery, use of radioactive iodine (RAI), time to last follow-up, survival status, and date and cause of death, were obtained from patient records. The follow-up protocol after initial treatment for DTC was as previously reported. DSS was defined as the time from the date of surgery until last censoring or death caused by DTC. This study was approved by the Ethics Committee of the Tianjin Medical University Cancer Hospital and Fujian Medical University Cancer Hospital. Informed consent was obtained from each patient for the use of their information.

Modifications to TNM-8

For better prediction of survival in TNM-8, stage II was subdivided in a modified TNM staging system. Patients with levels I–V and VII LNM aged 55 years or older at diagnosis were restaged as stage IIB, and the remaining cases were staged as IIA.
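The subdivision rule above can be sketched as a small function. This is only an illustration under stated assumptions: the function name, argument names, and the encoding of neck levels as Roman-numeral strings are hypothetical and do not come from the study.

```python
from typing import Iterable

# Neck levels whose metastases trigger stage IIB in the modified system
# (levels I-V and VII, as described in the text above).
MODIFIED_N1B_LEVELS = {"I", "II", "III", "IV", "V", "VII"}
AGE_CUTOFF = 55  # years at diagnosis


def modified_stage(tnm8_stage: str, age: int, positive_levels: Iterable[str]) -> str:
    """Return the modified stage label for one patient.

    Only TNM-8 stage II is subdivided; all other stages pass through unchanged.
    `positive_levels` lists the neck levels with lymph node metastasis
    (hypothetical encoding, e.g. ["VI", "VII"]).
    """
    if tnm8_stage != "II":
        return tnm8_stage
    has_modified_n1b = any(level in MODIFIED_N1B_LEVELS for level in positive_levels)
    if age >= AGE_CUTOFF and has_modified_n1b:
        return "IIB"
    return "IIA"
```

For example, a 60-year-old stage II patient with level VI and level VII metastases would be labeled IIB, whereas the same patient with level VI metastasis only would be labeled IIA.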

SEER data acquisition

Data from the Surveillance, Epidemiology, and End Results (SEER) program were obtained to evaluate the effect of the changes in N1b definition and classification. Data were queried for all adult patients (≥ 18 years) diagnosed with DTC who underwent thyroid surgery of any extent between 2004 and 2013. A DTC diagnosis was identified with the ICD-O, third edition codes: 8050/3, 8230/3, 8260/3, 8290/3, 8330/3, 8331/3, 8335/3, 8340/3, 8341/3, 8342/3, 8343/3, 8344/3, and 8350/3. Demographic characteristics included patient age at diagnosis, sex, race/ethnicity, and year of diagnosis. Treatment characteristics included extent of surgery and use of radioactive iodine (RAI). Survival information and pathologic T, N, and M stages were also included. As described in a previous study, patients were staged with TNM-8 according to pathologic T, N, and M stages and clinicopathological characteristics [4]. Patients with missing data for these characteristics were excluded.
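The selection criteria above could be expressed as a record filter along the following lines. This is a minimal sketch, not the query actually used: the record layout and field names (`age`, `histology`, `year`, `had_surgery`, `t`, `n`, `m`) are hypothetical, and only the code list, age limit, and year range are taken from the text.

```python
# ICD-O-3 histology codes listed in the text as identifying DTC.
DTC_ICDO3_CODES = frozenset({
    "8050/3", "8230/3", "8260/3", "8290/3", "8330/3", "8331/3", "8335/3",
    "8340/3", "8341/3", "8342/3", "8343/3", "8344/3", "8350/3",
})

# Hypothetical field names; a record missing any of them is excluded.
REQUIRED_FIELDS = ("age", "histology", "year", "had_surgery", "t", "n", "m")


def is_eligible(record: dict) -> bool:
    """Apply the stated inclusion criteria to one patient record."""
    if any(record.get(field) is None for field in REQUIRED_FIELDS):
        return False  # missing data for a required characteristic
    return bool(
        record["age"] >= 18                      # adults only
        and record["histology"] in DTC_ICDO3_CODES
        and 2004 <= record["year"] <= 2013       # diagnosis window
        and record["had_surgery"]                # thyroid surgery of any extent
    )
```

Keeping the code list in a `frozenset` makes the membership test explicit and guards against accidental modification of the inclusion list.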

Statistical analyses

Statistical analyses were performed using SAS v9.4 (SAS Institute, Cary, NC). Continuous variables were presented as medians with ranges or means with standard deviations, and categorical variables were presented as numbers with percentages. Unadjusted survival curves for DSS and overall survival (OS) were drawn using the Kaplan–Meier method, and the log-rank test was applied to determine significance. The impacts of LNM and TNM stage on DSS were evaluated by the Cox proportional hazard model after adjustment for known covariates. The relative risk is presented as the hazard ratio (HR) with its p value. Two-sided p values < 0.05 were considered statistically significant.
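For readers unfamiliar with the Kaplan–Meier method mentioned above, the product-limit estimate can be sketched in a few lines of plain Python. This is an illustrative implementation only (the analyses in this study were run in SAS); the function name and the encoding of events as 1 (death from DTC) and 0 (censored) are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  -- follow-up durations (e.g. months from surgery)
    events -- 1 if the patient died of the disease at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= leaving
    return curve
```

Censored patients contribute to the number at risk up to their last follow-up but do not cause a step in the curve, which is why the method suits cohorts with variable follow-up.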

The proportion of variation explained (PVE) in the Cox proportional hazard model was calculated to assess the relative predictability of each TNM staging system. The Akaike information criterion (AIC) and the Bayesian information criterion (BIC) measure the relative quality of a statistical model, as previously described [4, 5, 13]. A model with higher PVE and lower AIC and BIC is considered to be better for predicting survival.
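The three criteria can be illustrated with their textbook definitions. The sketch below is not the study's SAS code: the formulas are the standard AIC and BIC definitions and one common likelihood-ratio form of PVE, and the choice of the size term `n` (patients or events) is an assumption, since the text does not state which definitions were applied.

```python
import math


def aic(log_lik: float, k: int) -> float:
    """Akaike information criterion: -2 log L + 2k (k = number of parameters)."""
    return -2.0 * log_lik + 2.0 * k


def bic(log_lik: float, k: int, n: int) -> float:
    """Bayesian information criterion: -2 log L + k ln(n).

    n is the size term; whether it counts patients or events is an assumption.
    """
    return -2.0 * log_lik + k * math.log(n)


def pve(log_lik_null: float, log_lik_model: float, n: int) -> float:
    """PVE in percent, using the form 100 * (1 - exp(-LR / n)).

    LR is the likelihood ratio chi-square, 2 * (log L_model - log L_null).
    """
    lr = 2.0 * (log_lik_model - log_lik_null)
    return 100.0 * (1.0 - math.exp(-lr / n))
```

Under these definitions, a staging system that fits the observed deaths better yields a larger partial log-likelihood, hence a higher PVE and, for the same number of parameters, lower AIC and BIC.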

Results

Characteristics and stage migration of DTC patients

The clinicopathologic characteristics of the 569 patients with DTC are listed in Table 1. The mean age of patients was 47.5 ± 13.6 years and 427 (75.0%) patients were female. PTC was present in 550 (96.7%) patients and 19 (3.3%) patients had FTC, including HCC. Microscopic extrathyroidal extension was found in 365 (64.1%) patients, and 136 (23.9%) patients had microcarcinomas (primary tumor size ≤ 1 cm). A total of 302 (53.1%) patients had LNM, including level VI (131; 23.0%), level VII (18; 3.2%), and levels I–V (153; 26.9%). A total of 309 (54.3%) patients underwent total thyroidectomy and 22 (3.9%) patients received RAI therapy. CCND was performed in all patients, and therapeutic central and lateral neck dissection was performed in 251 (44.1%) patients. The median follow-up was 133 months (range 9–170 months). Recurrence occurred in 101 (17.8%) patients. Disease-specific mortality (DSM) was 66 (11.6%) and overall mortality was 101 (17.8%).

Table 1 Clinicopathologic characteristics of differentiated thyroid cancer patients

TNM distribution by edition and stage is presented in Table 2. In the change from TNM-7 to TNM-8, 226 (39.7%) patients were restaged to a lower stage: 126 patients in stage IV by TNM-7 were classified into stage I (56/126), stage II (39/126), stage III (17/126), and stage IV (14/126); all patients in stage III were downstaged to stage I and stage II, and 7 patients were downstaged from stage II to stage I. All patients in stage I by TNM-7 remained in stage I by TNM-8.
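The stage-migration figures quoted above can be checked with simple arithmetic. The snippet below uses only numbers stated in the text; the variable names are illustrative.

```python
total_patients = 569
restaged_lower = 226  # patients moved to a lower stage under TNM-8

# TNM-8 stage of the patients who were stage IV under TNM-7.
tnm7_stage_iv_by_tnm8 = {"I": 56, "II": 39, "III": 17, "IV": 14}

n_stage_iv_tnm7 = sum(tnm7_stage_iv_by_tnm8.values())
downstaged_from_iv = n_stage_iv_tnm7 - tnm7_stage_iv_by_tnm8["IV"]

pct_restaged_lower = round(100 * restaged_lower / total_patients, 1)
pct_still_stage_iv = round(100 * tnm7_stage_iv_by_tnm8["IV"] / n_stage_iv_tnm7, 1)

print(n_stage_iv_tnm7)       # 126
print(downstaged_from_iv)    # 112
print(pct_restaged_lower)    # 39.7
print(pct_still_stage_iv)    # 11.1
```

The four subgroups sum to the 126 patients reported as stage IV under TNM-7, and the two percentages match those quoted in the text.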

Table 2 Comparison of 569 differentiated thyroid cancer patients staged according to TNM-7 and TNM-8

Survival analyses according to TNM-7 and TNM-8

DSS was significantly associated with stage at diagnosis using both TNM-7 and TNM-8 (p < 0.001), and there was more separation between stages I and IV in TNM-8 than TNM-7 with respect to DSS and OS (Fig. 1). After adjustment (Supplementary Table S1), DTC stages classified by both editions were significantly related to DSS. The PVE, AIC, and BIC of TNM-7 and TNM-8 were calculated to compare the ability of each system to predict DSS in patients with DTC. Comparing TNM-7 and TNM-8, the PVE was 18.68% and 22.33%, the AIC was 704.22 and 680.50, and the BIC was 702.98 and 679.24, respectively (Table 3).

Fig. 1

Unadjusted disease-specific survival (DSS) curves for patients with differentiated thyroid cancer (DTC) using a TNM-7 and b TNM-8. Unadjusted overall survival (OS) curves for patients with DTC using c TNM-7 and d TNM-8

Table 3 Hazard ratios of DSS and comparison of model performance of TNM-7 and TNM-8 in patients with DTC

Survival analysis according to LNM status

Adjusted DSS analysis according to the LNM status in 569 Chinese patients with DTC is shown in Table 4. After adjustment (Supplementary Table S2), levels I–V LNM (HR = 4.332; p < 0.001) was significantly related to poorer DSS compared with N0 and level VI LNM (HR = 1.713; p = 0.274). Among patients aged ≥ 55 years, those with levels I–V (HR = 7.425; p < 0.001) and VII (HR = 11.994; p = 0.005) LNM had significantly worse DSS than those with N0 and level VI LNM (HR = 3.960; p = 0.063).

Table 4 Hazard ratios of DSS in patients with DTC according to LNM status

To validate the above results, we used data from the SEER database for further study. A total of 89,000 patients in the SEER cohort were included, and their characteristics are recorded in Supplementary Table S3. Patients with levels I–V and VII LNM had significantly poorer DSS than those with N0 and level VI LNM, especially among older patients (age ≥ 55 years; Fig. 2a, b), and these differences remained even after adjustment for known covariates (Table 5 and Supplementary Table S4).

Fig. 2

Unadjusted DSS curves for patients with differentiated thyroid carcinoma (DTC) according to the LNM status in the SEER database. a All patients, b older patients (age ≥ 55 years). Unadjusted DSS curves for low-stage DTC patients according to c the eighth edition of the TNM staging system and d the restaged TNM system in the SEER database

Table 5 Hazard ratios of DSS in patients with DTC according to LNM status in SEER database

Survival analyses according to the modified TNM system and TNM-8 in SEER database

The modified TNM system is described in Supplementary Tables S5 and S6. In brief, the modified TNM system classified level VI lymph nodes as central neck lymph nodes (N1a), whereas levels I–V and VII lymph nodes were classified as lateral ones (N1b). Patients with stage II in TNM-8 were classified into IIA and IIB in the restaged TNM system. Patients with levels I–V and VII LNM who were aged ≥ 55 years at diagnosis were restaged as IIB, while the remaining cases were restaged as IIA. Unadjusted DSS curves for low-stage DTC patients were plotted according to TNM-8 and the modified TNM system (Fig. 2c, d). There was a significant difference between patients in stage I and stage II according to TNM-8 (p < 0.001; Fig. 2c); the 10-year DSS rates for patients in stages I and II according to TNM-8 were 99.6% and 93.3%, respectively. After reclassification by the modified TNM system, 5,086 and 877 patients were restaged as stage IIA and stage IIB, respectively. The 10-year DSS rates for stage I, IIA, and IIB patients were 99.5%, 94.0%, and 90.8%, respectively. There was a significant difference in DSS among the three groups (p < 0.001; Fig. 2d).

Discussion

This study demonstrates that TNM-8 was not only a significantly better predictor of DSS than TNM-7, but also reduced the number of patients in stages III and IV. TNM-8 provided a more discriminating classification of unadjusted DSS, as well as improved model performance on adjusted multivariable analyses. The PVE value was higher in TNM-8 than in TNM-7, while AIC and BIC were lower in TNM-8 than in TNM-7. When TNM-8 was applied, 39.7% of patients were shifted into lower stages. Only 11.1% of patients in stage IV by TNM-7 remained in stage IV (14/126) by TNM-8, and all patients in stage III by TNM-7 were downstaged to stages I and II by TNM-8. Compared with TNM-7, TNM-8 offers more precise risk stratification, which may help prevent overtreatment in lower-risk patients with DTC. TNM-8 is therefore better suited to clinical use in a Chinese population of patients with DTC.

Previous studies have thoroughly discussed the prognostic value of the increased cutoff age and the reclassification of T3 in TNM-8 [14,15,16,17,18]. However, there is still some concern about N1b changes in TNM-8. Due to “significant coding difficulties for tumor registrars, clinicians, and researchers”, levels VI and VII (upper mediastinal) lymph nodes were classified as central neck lymph nodes (N1a), whereas levels I–V lymph nodes were classified as lateral ones (N1b) [12, 19]. Our study demonstrates that the risk for DSM of level VII LNM matched or even exceeded that of levels I–V LNM, especially in older patients (age ≥ 55 years). Among Chinese patients aged ≥ 55 years, those with level VII LNM had significantly worse DSS than those with N0 and level VI LNM. Analysis of the SEER dataset for further validation showed that patients with level VII LNM had significantly worse survival than those with N0 and level VI LNM, especially in older patients. To our knowledge, this is the first study to estimate the impact of level VII LNM on DSS using a Chinese patient cohort and the largest patient cohort available in the United States. Since the association between lymph node location and prognosis could be affected by the number and size of LN metastases, TNM-8 collapses N1a and N1b into a single N1 category when forming stage groupings [19]. However, there is substantial evidence that the survival prognosis of patients with N1b is significantly worse than that of patients with N1a, and there is growing concern that the mortality risk for N1b patients is underestimated in the eighth edition TNM staging system [7, 10, 11, 20, 21]. Considering that patients with levels I–V LNM are classified into N1b in TNM-8, our study found that patients with N1b had significantly poorer DSS compared with those with N0 and level VI LNM. Given these effects on DSS, patients with levels I–V and level VII LNM should be placed in the same risk stratum, separate from those with N0 and level VI LNM.

In TNM-8, N1 disease in older patients is classified into stage II, so that patients of the same stage are managed with varying levels of aggressiveness. Kim et al. classified DTC patients with stage II in TNM-8 depending on whether or not levels I–V LNM were present, demonstrating that employing N1b could be more accurate for the prediction of DSS [11]. Our study also divided stage II patients into stage IIA and IIB, where stage IIB was defined as N1b patients aged ≥ 55 years. In contrast to the work of Kim et al., we reclassified both level VII and levels I–V LNM into N1b according to the prognostic implication of level VII LNM. Compared with TNM-8, our modified staging system may offer a meaningful advantage in predicting prognosis, considering the high prevalence and increasing incidence of DTC [22,23,24,25]. In addition, patients with stage IIB DTC undergo lateral lymph node dissection whereas those with stage IIA DTC undergo a more limited extent of thyroid surgery. Therefore, subclassification of patients with stage II DTC based on LNM will be helpful in both predicting disease mortality and establishing a treatment plan with precise selection of surgical extent.

This study has several limitations. First, the Chinese cohort was recruited from only two tertiary referral institutions and was smaller than the populations in other studies. However, the outcomes of this study were based on a highly uniform group of patients with a relatively long follow-up and were confirmed by SEER data, and, therefore, have valuable implications in evaluating the prognostic value and changes in N1b of TNM-8 for DTC. Second, there is a possibility of coding errors, which exists in all large database studies. However, the probability of such errors is low because SEER databases are standardized and highly audited.

Conclusion

In conclusion, our study shows that compared with TNM-7, TNM-8 has improved clinical usefulness with respect to predicting survival for a Chinese population of patients with DTC. However, the changes in lateral LNM definition and classification of TNM-8 have a significant prognostic implication in patients with DTC. Our study suggests that a modified TNM system including redefined N1b would be more useful for predicting mortality and determining proper management in patients with DTC.