Introduction

Thyroid cancer (TC) develops from the tissues of the thyroid gland. Differentiated thyroid cancer (DTC) accounts for ~95% of all TC cases [1]. Distant metastases occur in only about 1–10% of patients with DTC, but they are considered to be the leading cause of DTC-related death [2, 3]. Once distant metastases occur, the disease can progress rapidly and cause death. The reported 10-year survival for patients with distant metastatic differentiated thyroid cancer (DMDTC) varies from ~30% to ~80% [4,5,6,7,8].

Prediction of the prognosis and survival probability of patients with DMDTC is based mainly on the tumor–node–metastasis (TNM) staging system (8th edition) set by the American Joint Committee on Cancer (AJCC) [9]. Several studies have reported that the 8th edition of the AJCC TNM staging system (hereafter termed “AJCC8SS”) has better statistical performance than previous versions [10,11,12], but it does not seem to be entirely accurate for predicting survival at different stages (including in patients with distant metastases) [13]. In addition to age, T stage, and N stage, several clinical factors (e.g., histological type, radioactive iodine (RAI) avidity, metastatic site) are associated with the prognosis of patients with DMDTC [14,15,16,17,18]. Because TNM staging does not account adequately for these other clinical factors, we sought a more accurate method to predict the prognosis of patients with DMDTC.

A “nomogram” is a graphical representation of a statistical model built on multiple independent predictors in a multivariable regression model. Nomograms have been used widely to predict the survival probability of patients with different TC subtypes, including DTC [19,20,21]. Such models can make clinical prognostic assessment more objective and accurate, and they frequently exhibit a higher predictive value than TNM staging. However, studies of survival prediction models for patients with DMDTC are lacking.

We aimed to calculate the total overall survival (OS) and cancer-specific survival (CSS) of patients with DMDTC registered in the Surveillance, Epidemiology, and End Results (SEER) Program from 2004 to 2015. We wished to develop and validate a nomogram model to predict the probability of survival of patients with DMDTC. To assess the efficacy of our model for predicting patient survival, we also compared the predictive value of our prognosis model with that of AJCC8SS. In this way, we aimed to offer a theoretical framework upon which clinicians can base their decisions about the treatment and outlook for patients with DMDTC.

Methods

Study population

We identified 2993 patients diagnosed with DTC complicated by distant metastases between 2004 and 2015 in the SEER Program (www.seer.cancer.gov). SEER*Stat 8.4.0.1 was used to screen patients from the “Incidence-SEER Research Plus Data, 17 Registries, Nov 2021 Sub (2000–2019)” database.

The inclusion criteria were: (i) tumor location recorded as “thyroid” under “Site and Morphology. Site recode ICD-O-3/WHO 2008”; (ii) age ≥ 18 years; (iii) M1 disease according to the 6th edition of AJCC TNM staging (2004–2015); (iv) year of diagnosis between 2004 and 2015. The exclusion criteria were: (i) incomplete follow-up information; (ii) multiple primary tumors; (iii) undifferentiated pathological type according to “ICD-O-3 Hist/behav, malignant”; (iv) unknown cause of death; (v) stage T0, Tx, or Nx, or a survival time of 0 months.

Based on the above inclusion and exclusion criteria, 906 eligible patients were finally included in this study.

Extraction of clinical variables for analyses

We extracted the following clinical variables from the SEER Program: sex, age recodes with single ages and 85 + , ICD-O-3 Hist/behav, malignant, Race recode, Marital status, RX Summ–Surg Prim Site (1998 +), RX Summ–Scope Reg LN Sur (2003 +), RX Summ–Surg Oth Reg/Dis (2003 +), radiation recode, survival months, derived AJCC T, 6th ed (2004–2015), derived AJCC N, 6th ed (2004–2015), derived AJCC M, 6th ed (2004–2015), CS tumor size (2004–2015).

We obtained survival information from the following options: “Vital status recodes”, “SEER cause-specific death classification”, and “Survival months”. OS and CSS were selected as the primary end point and secondary end point.

Statistical analyses

Categorical variables are expressed as percentages and were compared between groups using Chi-square tests. Continuous variables with a non-normal distribution are expressed as medians (interquartile range) and were compared between two groups using the Mann–Whitney U test.

We used simple random sampling to divide the study population into a training set and a validation set in a 7:3 ratio (Fig. 1). The “simputation” package in R 4.2.1 (http://www.r-project.org) was used to impute missing values by the K-nearest neighbor method. Least absolute shrinkage and selection operator (LASSO) regression [22, 23] was applied to screen variables. LASSO regression analyses were run in R with 50-fold cross-validation to select the tuning parameter λ. We selected the value of λ with the smallest mean squared error (MSE), λmin, and the variables with non-zero coefficients at λmin were entered into the multivariate Cox regression survival analysis. Variables with P < 0.05 in the multivariate Cox regression survival analysis were used to construct the final nomogram. The concordance index (C-index) was employed to assess the discriminatory ability of the nomogram. The “timeROC” package was used to generate time-dependent receiver operating characteristic (ROC) curves and area under the ROC curve (AUC) values at 3, 5, and 10 years to assess prognostic discrimination. Calibration curves were employed to determine if the survival probability predicted using the nomogram was consistent with the observed probability. Decision curve analysis (DCA) was used to assess the net benefit and clinical utility of the nomogram. In addition, we constructed a model based on AJCC8SS and compared its C-index with that of our nomogram at different survival times. The C-index was calculated using 1000 bootstrap resamples. Both models were also compared in terms of ROC curves, calibration, and DCA. Finally, Kaplan–Meier survival curves and log-rank tests were used to evaluate the ability of our nomogram to stratify risk for OS and CSS. Statistical analyses were undertaken using R 4.2.1. The R packages we used were “tableone”, “naniar”, “simputation”, “pec”, “glmnet”, “timeROC”, “regplot”, “rms”, “survival”, “ResourceSelection”, “magrittr”, and “ggDCA”. P < 0.05 (two-tailed) was considered significant.
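To make the K-nearest neighbor imputation step more concrete, the following is a minimal sketch in Python (used here only for illustration; the study itself used R). It is not the algorithm of the “simputation” package: the distance measure (a simple count of mismatching variables), the choice k = 3, and the toy records and variable names are all assumptions made for this example.

```python
from collections import Counter

def knn_impute(records, target, k=3):
    """Fill missing `target` values (None) with the most common value
    among the k most similar records that have `target` recorded."""
    donors = [r for r in records if r[target] is not None]
    features = [key for key in records[0] if key != target]

    def distance(a, b):
        # Assumed distance: number of other variables on which two records differ.
        return sum(1 for f in features if a[f] != b[f])

    imputed = []
    for rec in records:
        if rec[target] is not None:
            imputed.append(dict(rec))  # nothing to impute, copy as-is
            continue
        nearest = sorted(donors, key=lambda d: distance(rec, d))[:k]
        most_common = Counter(d[target] for d in nearest).most_common(1)[0][0]
        imputed.append({**rec, target: most_common})
    return imputed

# Hypothetical toy records (not study data); the last one lacks marital status.
toy_records = [
    {"sex": "F", "age_group": "<55", "t_stage": "T1", "marital": "married"},
    {"sex": "F", "age_group": "<55", "t_stage": "T1", "marital": "married"},
    {"sex": "M", "age_group": ">=55", "t_stage": "T4", "marital": "widowed"},
    {"sex": "M", "age_group": ">=55", "t_stage": "T4", "marital": "widowed"},
    {"sex": "M", "age_group": ">=55", "t_stage": "T3", "marital": "widowed"},
    {"sex": "M", "age_group": ">=55", "t_stage": "T4", "marital": None},
]
completed = knn_impute(toy_records, target="marital", k=3)
print(completed[-1]["marital"])  # widowed
```

In this toy example the three nearest donors of the incomplete record all share the same value, so the imputed value is unambiguous; with real data, ties and mixed variable types would need a more careful distance definition.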

Fig. 1 Flowchart of our study

Results

Characteristics of the clinical data of the study cohort

A total of 906 patients with DMDTC (417 men and 489 women; 62.5% aged ≥ 55 years) were included (Table 1). Total thyroidectomy accounted for 78.7% of cases, subtotal or near-total thyroidectomy for 4.9%, and lobectomy and/or isthmusectomy for 5.3%; the remaining patients did not receive surgical treatment. One to three regional lymph nodes were removed in 18.3% of patients and ≥ 4 regional lymph nodes were removed in 44.5% of patients. Overall, 87.3% of patients received RAI therapy or external beam radiotherapy (EBRT). Regarding pathologic type, 82.2% of tumors were papillary thyroid carcinoma (PTC) and 17.8% were follicular thyroid carcinoma (FTC). The OS for the study population was 69.2%, 59.6%, and 47.4% at 3, 5, and 10 years, respectively, and the CSS at 3, 5, and 10 years was 73.0%, 65.0%, and 56.3%, respectively.

Table 1 Description of the study population

All study subjects were randomly assigned to the training set or the validation set in a 7:3 ratio, and the two sets were compared. The training set comprised 634 patients (288 men and 346 women). The validation set contained 272 patients (129 men and 143 women).
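As an illustration of a 7:3 simple random split, the sketch below shuffles 906 patient indices and cuts the list at 70%. It is written in Python for illustration only (the study used R), and the function name and seed are assumptions; the actual assignment of individual patients in the study cannot be reproduced from it.

```python
import random

def split_train_validation(ids, train_fraction=0.7, seed=2022):
    """Shuffle the ids with a fixed seed and cut the list at train_fraction."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# 906 eligible patients, identified here simply by index 0..905.
train_ids, validation_ids = split_train_validation(range(906))
print(len(train_ids), len(validation_ids))  # 634 272
```

Rounding 0.7 × 906 gives 634, which matches the training-set and validation-set sizes reported above.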

Nomogram construction

LASSO regression analysis was used to analyze the correlation between variables and OS. In LASSO regression, the choice of λ determines which variables are retained in the model. We used 50-fold cross-validation to find the optimal λ: the value of λ at which the MSE is lowest determines the variables included in the model (the smaller the MSE, the better the accuracy of the prediction model). We therefore chose the value of λ at which the MSE was smallest, i.e., λmin. At λmin (0.03194), eight variables had non-zero coefficients and were included in the Cox multivariate analysis: sex, age, marital status, type of surgical procedure, lymphadenectomy, radiotherapy, T stage, and N stage (Fig. 2). Cox multivariate analysis revealed age, marital status, type of surgical procedure, lymphadenectomy, radiotherapy, and T stage to be independent prognostic factors for OS (P < 0.05) (Table 2). Finally, we constructed a nomogram predicting OS at 3, 5, and 10 years in patients with DMDTC based on these six factors (Fig. 3A).
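The two ideas in this paragraph, choosing λmin from cross-validation errors and obtaining coefficients that are exactly zero, can be sketched as follows. This is an illustrative Python sketch, not the glmnet procedure used in the study: the candidate λ values, the per-fold errors, and the coefficients are hypothetical, and the soft-thresholding rule shown is the LASSO solution only under the simplifying assumption of standardized, uncorrelated predictors in a least-squares setting (a penalized Cox model is more involved).

```python
import math
from statistics import mean

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; any z within [-lam, lam] becomes 0."""
    return math.copysign(max(abs(z) - lam, 0.0), z)

def select_lambda_min(fold_errors):
    """fold_errors maps each candidate lambda to its per-fold errors;
    return the lambda with the smallest mean cross-validated error."""
    mean_error = {lam: mean(errs) for lam, errs in fold_errors.items()}
    return min(mean_error, key=mean_error.get)

# Hypothetical cross-validation errors for three candidate lambdas (not study data).
fold_errors = {
    0.01: [1.30, 1.25, 1.35],
    0.03: [1.10, 1.15, 1.05],
    0.10: [1.40, 1.45, 1.50],
}
lambda_min = select_lambda_min(fold_errors)
print(lambda_min)  # 0.03

# Hypothetical unpenalized coefficients: the weak one is shrunk to zero
# and would therefore be dropped before the multivariate Cox analysis.
coefficients = {"age": 0.95, "sex": 0.02}
shrunk = {name: soft_threshold(b, lambda_min) for name, b in coefficients.items()}
selected = [name for name, b in shrunk.items() if b != 0]
print(selected)  # ['age']
```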

Fig. 2 A Distribution of LASSO regression coefficients for OS. B Cross-validation plot for OS. C Distribution of LASSO coefficients for CSS. D Cross-validation plot for CSS. Each colored curve represents the LASSO coefficient of one variable at different lambda values

Table 2 Multivariate Cox regression analysis of overall survival and cancer-specific survival in the training set

Fig. 3 A Nomogram for predicting overall survival (OS) at 3, 5, and 10 years in patients with DMDTC. B Nomogram for predicting cancer-specific survival (CSS) at 3, 5, and 10 years in patients with DMDTC

Correlations between variables and CSS were analyzed using the same method. LASSO regression analysis selected six variables associated with CSS (λmin = 0.05574): age, marital status, type of surgical procedure, lymphadenectomy, radiotherapy, and T stage. Cox multivariate analysis revealed all six to be independent prognostic factors for CSS (P < 0.05) (Table 2). We constructed a nomogram predicting CSS at 3, 5, and 10 years in patients with DMDTC based on these six factors (Fig. 3B).

Evaluation and validation of nomograms

The C-index was used to evaluate the discriminatory ability of the model. The C-index of the OS nomogram was 0.7474 (95% confidence interval (CI) = 0.7199–0.7750), and that of the CSS nomogram was 0.7572 (0.7281–0.7862). Comparison of the C-index of the nomogram with that of AJCC8SS at different survival times showed that the nomogram had a higher C-index in both the training set and the validation set (Fig. 4).
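For readers unfamiliar with the C-index, the sketch below computes Harrell's concordance index on a small toy data set and attaches a percentile bootstrap interval based on 1000 resamples. It is a Python illustration under stated assumptions: the toy survival times, event indicators, and risk scores are hypothetical, and the percentile interval is only one common way of using bootstrap resamples (the article does not specify which variant was used).

```python
import random

def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs in which the patient who died
    earlier had the higher risk score (tied scores count as 0.5)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier death of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

def bootstrap_interval(times, events, risk_scores, n_boot=1000, seed=1):
    """Percentile 95% interval of the C-index over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(times)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            stats.append(concordance_index([times[i] for i in idx],
                                           [events[i] for i in idx],
                                           [risk_scores[i] for i in idx]))
        except ZeroDivisionError:
            continue  # resample without any comparable pair, skip it
    stats.sort()
    return stats[int(0.025 * len(stats))], stats[int(0.975 * len(stats)) - 1]

# Hypothetical toy data: months of follow-up, death indicator, predicted risk.
times = [5, 10, 20, 30, 40]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.6, 0.7, 0.3, 0.1]
c_index = concordance_index(times, events, risk)
print(c_index)  # 0.875
lower, upper = bootstrap_interval(times, events, risk)
```

In the toy data, 7 of the 8 comparable pairs are ordered correctly, which gives 0.875; a value of 0.5 would correspond to random ordering and 1.0 to perfect discrimination.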

Fig. 4 A, B C-index values for different survival times for the OS nomogram with AJCC8SS in the training set and validation set. C, D C-index values for different survival times for the CSS nomogram with AJCC8SS in the training set and validation set

We plotted time-dependent ROC curves to assess the accuracy of our prediction model. We set weighting = “marginal”, which uses the Kaplan–Meier estimator of the censoring distribution to compute the weights. The AUC at 3, 5, and 10 years for the training set OS nomogram was 0.799 (95% CI = 0.761–0.838), 0.790 (0.753–0.828), and 0.859 (0.817–0.900), respectively (Fig. 5A–C). The AUC at 3, 5, and 10 years for the training set CSS nomogram was 0.810 (95% CI = 0.772–0.849), 0.803 (0.765–0.842), and 0.868 (0.827–0.909), respectively (Fig. 5D–F). The AUC at 3, 5, and 10 years for the validation set OS nomogram was 0.786 (95% CI = 0.728–0.845), 0.753 (0.693–0.813), and 0.799 (0.726–0.871), respectively (Fig. 5G–I). The AUC at 3, 5, and 10 years for the validation set CSS nomogram was 0.783 (95% CI = 0.718–0.848), 0.748 (0.684–0.812), and 0.804 (0.730–0.878), respectively (Fig. 5J–L). The time-dependent ROC curve of AJCC8SS showed only one breakpoint. This is because AJCC8SS uses only four variables (age, T stage, N stage, and M stage) to predict the survival probability of thyroid cancer. Our study was conducted in patients with distant metastatic differentiated thyroid cancer (M1), in whom age < 55 years defines stage II and age ≥ 55 years defines stage IVB, regardless of T stage and N stage; AJCC8SS therefore separates these patients into only two groups, and its time-dependent ROC curve has a single breakpoint. Our nomogram, on the other hand, takes account of multiple prognostic factors, not age alone, and therefore yields a more detailed curve. The AUC of the nomogram differed significantly from that of AJCC8SS (Fig. 5).
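The single breakpoint can be illustrated with a short Python sketch. A two-valued predictor (stage II vs. stage IVB) yields one point between the corners of the ROC plot, whereas a score with many distinct values yields many. The sketch uses a plain ROC construction at one fixed time point and ignores censoring, which the “timeROC” package handles with inverse-probability weighting; the ages, outcomes, and multi-factor scores are hypothetical toy values.

```python
def ajcc8_stage_for_m1(age_years):
    """Stage of an M1 patient with DTC as described in the text:
    stage II below 55 years, stage IVB otherwise."""
    return "II" if age_years < 55 else "IVB"

def roc_points(scores, died):
    """(false-positive rate, true-positive rate) pairs obtained by
    thresholding at every distinct score value, starting from (0, 0)."""
    positives = sum(died)
    negatives = len(died) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, died) if s >= threshold and d)
        fp = sum(1 for s, d in zip(scores, died) if s >= threshold and not d)
        points.append((fp / negatives, tp / positives))
    return points

# Hypothetical toy cohort: age and death status at a fixed time point.
ages = [34, 47, 52, 58, 63, 71, 49, 80]
died = [0, 0, 1, 1, 0, 1, 0, 1]

# Staging gives only two possible scores, so the curve has one interior point.
stage_scores = [1 if ajcc8_stage_for_m1(a) == "IVB" else 0 for a in ages]
stage_curve = roc_points(stage_scores, died)
print(stage_curve)  # [(0.0, 0.0), (0.25, 0.75), (1.0, 1.0)]

# A hypothetical multi-factor score with eight distinct values gives more points.
multi_factor_scores = [0.10, 0.25, 0.55, 0.60, 0.30, 0.80, 0.20, 0.90]
multi_curve = roc_points(multi_factor_scores, died)
print(len(multi_curve))  # 9
```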

Fig. 5 A–C Time-dependent ROC curves for 3, 5, and 10 years predicted by the OS nomogram and AJCC8SS in the training set. D–F Time-dependent ROC curves for 3, 5, and 10 years indicated by the CSS nomogram and AJCC8SS in the training set. G–I Time-dependent ROC curves for 3, 5, and 10 years predicted by the OS nomogram and AJCC8SS in the validation set. J–L Time-dependent ROC curves for 3, 5, and 10 years indicated by the CSS nomogram and AJCC8SS in the validation set

Calibration curves were used to evaluate the calibration of the nomograms. The probabilities predicted by the OS nomogram and CSS nomogram agreed well with the observed probabilities in both the training set and the validation set (Fig. 6). Compared with AJCC8SS, the OS nomogram and CSS nomogram performed better, which supports their prediction accuracy.
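The idea behind a calibration curve can be sketched in a few lines of Python: patients are grouped by predicted survival probability, and the mean prediction in each group is compared with the proportion who actually survived. This is a simplified illustration that ignores censoring (with censored data, the observed value in each group would typically be a Kaplan–Meier estimate), and the predictions and outcomes are hypothetical toy values.

```python
def calibration_table(predicted_survival, survived, n_groups=3):
    """Return (mean predicted, observed proportion) for groups of patients
    ordered by predicted survival probability."""
    order = sorted(range(len(predicted_survival)),
                   key=lambda i: predicted_survival[i])
    size = len(order) // n_groups
    rows = []
    for g in range(n_groups):
        start = g * size
        # The last group absorbs any remainder.
        end = (g + 1) * size if g < n_groups - 1 else len(order)
        members = order[start:end]
        mean_predicted = sum(predicted_survival[i] for i in members) / len(members)
        observed = sum(survived[i] for i in members) / len(members)
        rows.append((mean_predicted, observed))
    return rows

# Hypothetical predicted survival probabilities and outcomes (1 = alive).
predicted = [0.10, 0.20, 0.30, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95]
alive = [0, 0, 1, 0, 1, 1, 1, 1, 1]
table = calibration_table(predicted, alive)
for mean_predicted, observed in table:
    print(f"predicted {mean_predicted:.2f}  observed {observed:.2f}")
```

Points that lie close to the diagonal (predicted equals observed) indicate good calibration; in the toy data the three groups give roughly 0.20 vs. 0.33, 0.60 vs. 0.67, and 0.88 vs. 1.00.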

Fig. 6 Calibration curves. The x-axis represents the predicted survival probability. The y-axis represents the true survival probability. The diagonal lines (gray) indicate the “ideal” calibration curves (predicted probability equals actual probability). A–C Calibration curves for 3, 5, and 10 years for the OS nomogram and AJCC8SS in the training set. D–F Calibration curves for 3, 5, and 10 years for the CSS nomogram and AJCC8SS in the training set. G–I Calibration curves for 3, 5, and 10 years for the OS nomogram and AJCC8SS in the validation set. J–L Calibration curves for 3, 5, and 10 years for the CSS nomogram and AJCC8SS in the validation set

DCA was used to assess the clinical effectiveness and net benefit of our nomograms. The prediction model for the OS nomogram (red) showed significantly more net benefit than AJCC8SS (blue) in the training set (Fig. 7A–C) and validation set (Fig. 7G–I). The training set (Fig. 7D–F) and validation set (Fig. 7J–L) of the CSS nomogram exhibited similar results.
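The net benefit plotted in a decision curve can be illustrated with the standard formula for a binary outcome, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the risk threshold. The Python sketch below applies it to hypothetical toy values; it ignores censoring, whereas DCA for survival outcomes (as produced by the “ggDCA” package) estimates these quantities at a given time point from censored data.

```python
def net_benefit(predicted_risk, died, threshold):
    """Net benefit of intervening in patients whose predicted risk is at
    or above the threshold (binary outcome, no censoring)."""
    n = len(died)
    tp = sum(1 for p, d in zip(predicted_risk, died) if p >= threshold and d)
    fp = sum(1 for p, d in zip(predicted_risk, died) if p >= threshold and not d)
    return tp / n - (fp / n) * threshold / (1 - threshold)

def net_benefit_treat_all(died, threshold):
    """Net benefit of intervening in every patient."""
    prevalence = sum(died) / len(died)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical predicted risks of death and observed outcomes (1 = died).
risk = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.6, 0.1]
died = [1, 1, 1, 0, 0, 0, 0, 1]

nb_model = net_benefit(risk, died, threshold=0.5)
nb_all = net_benefit_treat_all(died, threshold=0.5)
nb_none = 0.0  # intervening in no one has zero net benefit by definition
print(nb_model, nb_all, nb_none)  # 0.25 0.0 0.0
```

At a threshold of 0.5 the toy model flags three true positives and one false positive among eight patients, so its net benefit (0.25) exceeds both reference strategies; a decision curve repeats this calculation over a range of thresholds.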

Fig. 7 DCA. The x-axis in DCA is the risk threshold. The y-axis indicates the net clinical benefit. The gray horizontal line indicates that all samples are negative, no patients received the intervention, and the net benefit is 0. The black diagonal line indicates that all samples are positive and that all patients received the intervention. A–C DCA of the clinical benefit of the OS nomogram vs. AJCC8SS in the training set. D–F DCA of the clinical benefit of the CSS nomogram vs. AJCC8SS in the training set. G–I DCA of the clinical benefit of the OS nomogram vs. AJCC8SS in the validation set. J–L DCA of the clinical benefit of the CSS nomogram vs. AJCC8SS in the validation set

We plotted Kaplan–Meier curves by stratifying patients according to “low”, “intermediate”, and “high” risk based on the predicted probability of survival from the nomogram. The results showed excellent performance of risk stratification based on our nomograms (log-rank P < 0.001) (Fig. 8).
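For reference, the Kaplan–Meier estimator underlying these curves can be sketched as follows. The Python code is an illustration only: the follow-up times and event indicators are hypothetical, and in practice one such curve would be computed for each risk group defined by the nomogram and the groups compared with a log-rank test (not shown here).

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time."""
    survival = 1.0
    curve = []
    death_times = sorted(set(t for t, e in zip(times, events) if e))
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)  # still under observation at t
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve

def survival_at(curve, t):
    """Estimated survival probability at time t (step function)."""
    probability = 1.0
    for time, survival in curve:
        if time <= t:
            probability = survival
    return probability

# Hypothetical follow-up in months and death indicators (0 = censored).
months = [6, 12, 12, 24, 36, 48]
death = [1, 1, 0, 1, 0, 0]
km_curve = kaplan_meier(months, death)
print(round(survival_at(km_curve, 36), 3))  # 0.444
```

In the toy data the estimate falls to 5/6 at 6 months, 2/3 at 12 months, and 4/9 at 24 months; censored patients leave the risk set without lowering the curve.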

Fig. 8 Kaplan–Meier curves. A, B Kaplan–Meier curves for the OS nomogram and CSS nomogram in the training set. C, D Kaplan–Meier curves for the OS nomogram and CSS nomogram in the validation set

Discussion

We developed nomograms to predict OS and CSS at 3, 5, and 10 years in patients with DMDTC using information from the SEER Program. Six variables were included in both the OS and CSS nomograms: age, marital status, type of surgical procedure, lymphadenectomy, radiotherapy, and T stage. Each variable carried a different weight in each nomogram. We also evaluated and validated our model: a C-index of 0.7474 (95% CI = 0.7199–0.7750) for the OS nomogram and 0.7572 (0.7281–0.7862) for the CSS nomogram were documented. Time-dependent ROC curves suggested that the nomogram had better discriminatory ability for DMDTC than AJCC8SS. The nomogram showed good agreement with the “ideal” calibration curve in the training set and validation set. DCA confirmed that nomogram-predicted survival probability had high clinical predictive value. In addition, the nomogram we constructed could be used to stratify patients better, and showed more robust accuracy and predictive power than AJCC8SS. Our prediction model highlights the relative contributions of various independent variables associated with clinical outcomes.

In published prognostic staging systems for DTC, age has almost always been identified as an independent predictor of disease-specific survival, which distinguishes DTC from most other cancer types [24]. Several previous studies have shown that the risk of death from DTC increases with age, and advanced age has been shown to be a major predictor of lower survival [25,26,27,28]. We used 55 years as the age cutoff for the prognostic model according to AJCC8SS. We found that age carried a large weight in the LASSO and Cox multivariate analyses (≥ 55 years: hazard ratio (HR) for OS = 2.59; HR for CSS = 2.4). Thus, age is extremely important for assessing the prognosis of patients with DMDTC.

Aizer and colleagues found that unmarried (including widowed) patients had a significantly higher risk of metastatic cancer, undertreatment, and cancer-related death than married patients [29]. Shi and coworkers showed that widowhood was an independent risk factor for the prognosis in patients with DTC [30]. Widowed patients have a significantly increased risk of death at all stages of DTC progression compared with that of married patients. We documented similar findings, with HR values of 1.83 and 1.93 for separated/widowed people in the OS nomogram and CSS nomogram, respectively, and the risk of death was increased significantly in this group of patients. Based on the above findings, we speculate that this phenomenon may be due to the lack of emotional burden sharing by the partner of separated/widowed patients, which to some extent has a detrimental effect on the psychology and life of this group of patients, for whom distant metastases are more painful and frustrating than for married people.

Despite the slow progression of DTC, lymph node metastases can appear early in the disease [31]. Several studies have shown lymph node metastases to be associated with the risk of distant metastases from TC [32, 33]. Recently, Allen and collaborators [34] showed in a SEER Program-based study that patients with well-differentiated TC had a higher prevalence of distant metastases as the number of metastatic lymph nodes increased. Interestingly, when we studied patients with DMDTC, lymph node metastases failed to be included in the model. We showed that lymph node dissection had greater predictive efficacy for the prognosis of patients with distant metastases. A higher number of lymph nodes dissected was more beneficial to improve patient survival (for ≥ 4 regional lymph nodes removed, the HR for OS was 0.67 and for CSS was 0.63).

Usually, TC is treated according to histological type, with surgery being the preferred treatment for DTC. Many investigators have shown that total thyroidectomy lengthens patient survival compared with lobectomy [35]. In a study comparing the two operations for PTC < 1 cm, the total-thyroidectomy group showed a trend toward a lower prevalence of recurrence and mortality than the lobectomy group (5.4% and 0.3% vs. 8.3% and 1.1%, respectively) [36]. In our nomogram, patients who underwent total thyroidectomy showed a higher probability of survival. Postoperative radiotherapy for unresected residual lesions is also an important treatment and includes RAI therapy and EBRT. A study by Li and coworkers [37] showed that RAI improved survival significantly in patients aged ≤ 45 years with well-differentiated/moderately differentiated TC and distant metastasis. Kwon and colleagues [38] assessed 39 patients with stage T3/T4 DTC and found that postoperative EBRT was an essential factor in the prognosis. Several studies have demonstrated that EBRT lengthens the survival of patients with advanced DTC significantly [39, 40]. Our results align with these findings: patients who did not receive radiotherapy had an HR of 1.44 (1.12–1.84) for OS and 1.38 (1.05–1.81) for CSS, indicating poorer survival than in patients who received radiotherapy.

Similar to the situation with other tumor types, the role of T stage in the survival of patients with DMDTC is unquestioned. Studies have shown that gross extrathyroidal extension of DTC increases disease persistence/recurrence and reduces survival [41,42,43]. Most staging systems for DTC incorporate gross extrathyroidal extension as a predictor of relapse and/or death (e.g., age, metastasis, extrathyroidal invasion, and size (AMES); distant metastasis, patient age, completeness of resection, local invasion, and tumor size (MACIS); AJCC; Union for International Cancer Control) [24]. Our findings were similar: T4 stage (which includes gross extrathyroidal extension) had an HR of 2.15 (1.49–3.12) for OS and 2.64 (1.71–4.08) for CSS, suggesting a higher risk of death.

In constructing this model, we applied LASSO regression analysis [44], which has significant advantages in dealing with samples with multiple covariates and has not been used commonly in studies related to survival analysis for DTC. We analyzed the risk stratification of patients based on our nomogram. The survival prognosis of different risk groups was very different, which provides a reference for more accurate assessment of the survival risk of patients with DMDTC in clinical research.

In this study, we included, where possible, the factors available in the SEER database that may affect DMDTC survival. After a comprehensive series of statistical analyses (LASSO regression, Cox multivariate analysis, C-index, time-dependent ROC curves, calibration curves, DCA, and Kaplan–Meier curves), we found that the constructed nomogram had better overall predictive value for DMDTC survival probabilities. At each of these steps, the nomogram performed better than AJCC8SS. The nomogram takes more factors into account than AJCC8SS in the comprehensive assessment of patients with DMDTC. This may help clinicians to identify more accurately those patients with distant metastatic thyroid cancer who are at high risk of a poor prognosis, so that they can develop better individualized treatment plans and optimize clinical decision-making during consultation and follow-up.

The main limitation of our study was that, similar to other retrospective studies, some patients were excluded because of missing data, which may have led to selection bias [45]. As the SEER database does not contain serum thyroglobulin levels or information on multi-organ distant metastases, these potential influencing factors were not included in our study. Furthermore, this nomogram was developed based on the SEER database only and may not be sufficiently representative of DMDTC survival probabilities across races and populations. In addition, the nomograms we constructed were not validated in an external validation set, which will need to be addressed in future work. Finally, we did not study other TC subtypes with distant metastases for differences in survival probability.

Conclusions

For the first time, we constructed and validated a nomogram model to predict the probability of survival in patients with DMDTC. Our nomograms demonstrated better predictive ability than AJCC8SS. We hope that our data will contribute to comprehensive clinical assessment of the prognosis and “individualized” treatment of patients with DMDTC.