1 Introduction

Thyroid cancer is the most frequent endocrine malignancy, accounting for 3.2% of all cancers [1], and its incidence continues to increase worldwide [2]. In particular, the incidence of differentiated thyroid carcinomas (DTCs), including papillary and follicular thyroid cancers, has doubled in the last decades [2]. Although the prognosis is relatively good, locoregional recurrences are common in DTC patients, reaching 10–30% of cases, thus greatly affecting patients’ quality of life [3, 4].

To stratify patients with thyroid cancer, several staging and classification systems have been described, such as the AMES, AGES and MACIS scores [5,6,7]. These classifications include patient-related factors (i.e., age at diagnosis) and tumor-related factors (i.e., size, histologic grade, presence of metastases, and extent of invasion), but also surgeon-related factors. Indeed, the MACIS system accounts for completeness of surgical resection in addition to the aforementioned criteria [7]. However, none of these multivariable prognostic scoring systems is universally accepted or has been shown to be clearly superior to the others. Thus, efforts continue to be made to improve the current staging system by studying other factors that may have a prognostic impact.

Recently, it has been found that chronic inflammation might play a key role in the initiation and progression of most cancers by mediating interaction with the immune response [8,9,10]. Inflammatory mediators present in the bloodstream include neutrophils, lymphocytes, monocytes, and platelets. Among these, monocytes and lymphocytes stand out as crucial anti-tumor agents. Monocytes infiltrate tumor masses, suppressing angiogenesis and triggering cancer cell apoptosis, thus reducing cancer invasion and progression [11]. In contrast, neutrophils and platelets contribute to inflammation by producing pro-inflammatory cytokines like vascular endothelial growth factor (VEGF), tumor necrosis factor-α (TNF-α), interleukin-2 (IL-2), interleukin-6 (IL-6), and interleukin-10 (IL-10), which can promote tumorigenesis [12]. In this setting, some inflammatory markers, such as C-reactive protein (CRP) levels [13], the neutrophil-to-lymphocyte ratio (NLR) [14,15,16], the lymphocyte-to-monocyte ratio (LMR) [17,18,19], the platelet-to-lymphocyte ratio (PLR) [20,21,22], and the albumin-to-globulin ratio (AGR) [23] have been investigated as predictors of tumor recurrence in subjects with DTC. Specifically, low LMR and AGR values, as well as high CRP, NLR, and PLR values, have been associated with a poor prognosis. However, it is important to acknowledge that the results of studies on these markers have demonstrated considerable heterogeneity and contradictions. Therefore, the aim of this systematic review and meta-analysis was to assess the prognostic value of different inflammatory markers, including NLR, LMR, and PLR, in patients diagnosed with differentiated thyroid cancer. In particular, we selected these markers because they can be derived from a simple complete blood count, an inexpensive test routinely performed in clinical practice even in smaller healthcare centers.
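To make the three ratios concrete, the following minimal sketch shows how they could be derived from the absolute cell counts of a complete blood count. The function name `cbc_ratios` and the example counts are hypothetical and serve only as an illustration; the only assumption is that all counts are expressed in the same unit (e.g., 10^9 cells/L).

```python
def cbc_ratios(neutrophils, lymphocytes, monocytes, platelets):
    """Return NLR, LMR and PLR from absolute counts given in the same unit."""
    if lymphocytes <= 0 or monocytes <= 0:
        raise ValueError("lymphocyte and monocyte counts must be positive")
    return {
        "NLR": neutrophils / lymphocytes,  # neutrophil-to-lymphocyte ratio
        "LMR": lymphocytes / monocytes,    # lymphocyte-to-monocyte ratio
        "PLR": platelets / lymphocytes,    # platelet-to-lymphocyte ratio
    }


# Hypothetical counts (10^9 cells/L), for illustration only
ratios = cbc_ratios(neutrophils=4.0, lymphocytes=2.0, monocytes=0.4, platelets=250.0)
print(ratios)
```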

2 Materials and methods

The present study was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [24]. This review of previously published studies required neither institutional review board approval nor informed consent. No review protocol was registered for this study.

2.1 Eligibility criteria

This systematic review was carried out according to the PICOS tool: Patients (P), adult patients diagnosed with differentiated thyroid carcinoma; Intervention (I), thyroidectomy or hemithyroidectomy; Comparison (C), high versus low neutrophil-to-lymphocyte ratio (NLR), lymphocyte-to-monocyte ratio (LMR), and platelet-to-lymphocyte ratio (PLR); Outcomes (O), disease-free survival (DFS); Study design (S), retrospective and prospective cohort studies, randomized controlled trials (RCTs).

Studies were not eligible if they (a) were not in English, (b) were not available in full text form, (c) reported insufficient data or data that were not extractable, (d) were subgroup analyses of patients from a larger study, or (e) were reviews, case reports, conference abstracts, letters to the editor, or book chapters. No publication date restriction was imposed, but articles had to be published in a peer-reviewed journal.

2.2 Data source and study searching

PubMed/MEDLINE, Cochrane Library, Scopus, and Google Scholar databases were comprehensively searched to identify relevant articles. The following search strategy was used for PubMed/MEDLINE: (“neutrophil-to-lymphocyte ratio” OR “neutrophil lymphocyte ratio” OR “NLR” OR “lymphocyte-to-monocyte ratio” OR “lymphocyte to monocyte ratio” OR “LMR” OR “platelet-to-lymphocyte ratio” OR “platelet to lymphocyte ratio” OR “PLR” OR “systemic immune inflammation index”) AND (“thyroid” OR “thyroid gland”). Then, the search strategy was modified accordingly to suit the search rules of the other databases. Furthermore, the reference lists of all the included publications were hand-searched to ensure the retrieval of further potential eligible articles. The last search was conducted on December 20, 2022.

2.3 Data collection process

Two authors (E.R. and M.G.) independently conducted the literature search. Initially, all articles were screened for relevance by title and abstract. Then, the two investigators (E.R. and M.G.) separately reviewed the full text of each publication that was considered pertinent. Any disagreement in the assessment process was discussed until consensus was reached. Data extraction from the included studies was conducted independently by two reviewers (E.R. and M.G.) using a structured form. The following information was recorded from each eligible study: first author, year of publication, country, study design, sample size, patients’ demographics, tumor stage, treatment, circulating inflammatory biomarkers, cut-off value determination method, cut-off values, oncologic outcomes and follow-up. Data extraction discrepancies were resolved by consensus.

2.4 Risk of bias and study quality assessment

Two separate reviewers (E.R. and M.G.) independently assessed the methodological quality of included studies using the Methodological Index for Non-randomized Studies (MINORS) [25]. A funnel plot was created using the effect size of each outcome to examine a potential publication bias.

2.5 Data synthesis and statistical analysis

Clinical measures were reported as provided by the individual studies. A single-arm meta-analysis of proportions was performed to synthesize dichotomous variables using arcsine transformation of the data, while continuous variables were synthesized through a meta-analysis of means using the generic inverse variance method. All data are reported with the corresponding 95% confidence intervals (CIs).
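As an illustration of how proportions can be pooled on the arcsine scale, the sketch below applies the plain arcsine transformation, y = arcsin(√p) with variance 1/(4n), and back-transforms the pooled value. This is a simplified sketch under stated assumptions: the text does not specify which arcsine variant was used, and for brevity the example uses fixed-effect inverse-variance weights rather than a random-effects model. The function name and the input counts are hypothetical.

```python
import math


def pool_proportions_arcsine(events, totals):
    """Inverse-variance pooling of proportions on the arcsine scale.

    Assumption: plain arcsine transform, y = asin(sqrt(p)), var(y) = 1 / (4n),
    with fixed-effect weights (a simplification for illustration).
    """
    ys = [math.asin(math.sqrt(e / n)) for e, n in zip(events, totals)]
    weights = [4.0 * n for n in totals]  # 1 / variance
    pooled_y = sum(w * y for w, y in zip(weights, ys)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))

    # Clip to [0, pi/2] so the back-transformation stays monotonic
    lower_y = max(pooled_y - 1.96 * se, 0.0)
    upper_y = min(pooled_y + 1.96 * se, math.pi / 2)

    def back(t):
        return math.sin(t) ** 2

    return back(pooled_y), back(lower_y), back(upper_y)


# Hypothetical per-study counts (events, sample size), for illustration only
pooled, ci_low, ci_high = pool_proportions_arcsine([25, 50, 40], [100, 200, 150])
print(f"pooled proportion = {pooled:.3f} (95% CI {ci_low:.3f}-{ci_high:.3f})")
```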

The impact of NLR, LMR, and PLR on DFS was measured by the effect size of hazard ratio (HR). HRs and their 95% CIs were extracted directly from each study if provided by the authors. Whenever both univariable and multivariable analyses were reported, HRs were extracted from multivariable models. Otherwise, they were indirectly estimated using the methodology described by Tierney et al. [26]. Published Kaplan-Meier curves from each study were digitized using GetData Graph Digitizer (version 2.26; http://getdata-graph-digitizer.com/index.php) and survival data and follow-up times extracted. Hence, the number of subjects at risk adjusted for censoring at different follow-up times was calculated to reconstruct the HR estimate and its variance. HRs were log-transformed before pooling effect size estimates. Cumulative log-HRs with 95% CI are presented for the reported outcomes, calculated through the inverse variance method. A log-HR > 0 suggested a higher risk of recurrence for patients with higher values of NLR, LMR or PLR, whereas a log-HR < 0 implied a lower risk of recurrence for patients with higher values. A forest plot graph was created for each outcome.
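When a study reports an HR with its 95% CI, the log-HR and its standard error needed for inverse-variance pooling can be recovered from the interval limits. The sketch below shows this standard conversion; it does not cover the reconstruction from digitized Kaplan-Meier curves. The function names and the example values are hypothetical.

```python
import math

Z_95 = 1.959963984540054  # two-sided 95% normal quantile


def log_hr_from_ci(hr, lower, upper):
    """Return (log-HR, standard error) from a reported HR and its 95% CI."""
    if min(hr, lower, upper) <= 0:
        raise ValueError("HR and CI limits must be positive")
    log_hr = math.log(hr)
    # The CI is symmetric on the log scale: width = 2 * z * SE
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return log_hr, se


def direction(log_hr):
    """Interpretation used in the text for patients with higher marker values."""
    if log_hr > 0:
        return "higher risk of recurrence"
    if log_hr < 0:
        return "lower risk of recurrence"
    return "no difference"


# Hypothetical study result, for illustration only
log_hr, se = log_hr_from_ci(hr=2.0, lower=1.0, upper=4.0)
print(round(log_hr, 3), round(se, 3), direction(log_hr))
```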

Cochran’s Q test and the I² statistic were used to assess heterogeneity between studies [27, 28]. The I² value represents the percentage of variability across studies caused by heterogeneity rather than by sampling error. According to the Cochrane criteria, values from 0 to 40% may indicate low heterogeneity, 30–60% may represent moderate heterogeneity, 50–90% may represent substantial heterogeneity, and 75–100% represents considerable heterogeneity. Given the observational nature of the majority of included studies, a random-effects model was applied to pool the study results, assuming both within- and between-study variability. Influence analysis [29] was performed to identify potential influential studies for each outcome. In particular, a leave-one-out meta-analysis was performed to test the influence of each included study on the overall effect and on I² heterogeneity.
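The quantities described above can be sketched in a few lines. The example below computes Cochran's Q, I², a between-study variance estimate, the random-effects pooled estimate, and a leave-one-out analysis. It assumes the DerSimonian-Laird estimator for the between-study variance, which is an assumption made for illustration because the estimator is not stated in the text; the function names and input values are hypothetical.

```python
import math


def random_effects(effects, ses):
    """Random-effects pooling of log-HRs (DerSimonian-Laird tau^2, assumed)."""
    k = len(effects)
    w = [1.0 / s ** 2 for s in ses]  # fixed-effect inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add the between-study variance to each variance
    w_re = [1.0 / (s ** 2 + tau2) for s in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "Q": q,
        "I2": i2,
        "tau2": tau2,
    }


def leave_one_out(effects, ses):
    """Re-run the pooling once per study, each time omitting that study."""
    return [
        random_effects(effects[:i] + effects[i + 1:], ses[:i] + ses[i + 1:])
        for i in range(len(effects))
    ]


# Hypothetical log-HRs and standard errors, for illustration only
log_hrs = [0.10, 0.30, -0.20, 0.50]
std_errs = [0.20, 0.25, 0.30, 0.20]
overall = random_effects(log_hrs, std_errs)
omitted = leave_one_out(log_hrs, std_errs)
print(overall)
```

A study would be flagged as potentially influential when omitting it noticeably shifts the pooled estimate or I² relative to the overall result.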

Analysis of publication bias was performed by visual inspection of the funnel plot and by calculating Egger’s regression intercept [30], which statistically examines the asymmetry of the funnel plot. In case of funnel plot asymmetry, the Duval and Tweedie nonparametric trim-and-fill method was performed to further assess the potential publication bias [31].
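Egger's test can be viewed as an ordinary least-squares regression of the standardized effect (effect divided by its standard error) on the precision (the reciprocal of the standard error); an intercept far from zero suggests funnel plot asymmetry. The following is a minimal sketch of that regression. The function name and the data are hypothetical, and the p-value, which requires a t distribution with n − 2 degrees of freedom, is left to a statistical package.

```python
import math


def egger_intercept(effects, ses):
    """Return (intercept, SE of intercept, t statistic, degrees of freedom)."""
    n = len(effects)
    if n < 3:
        raise ValueError("at least three studies are needed")
    x = [1.0 / s for s in ses]                 # precision
    y = [e / s for e, s in zip(effects, ses)]  # standardized effect
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance and standard error of the intercept
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in resid) / (n - 2)
    se_int = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_int if se_int > 0 else float("nan")
    return intercept, se_int, t_stat, n - 2


# Hypothetical log-HRs and standard errors, for illustration only
example = egger_intercept([0.10, 0.30, -0.20, 0.50, 0.05], [0.20, 0.25, 0.30, 0.20, 0.15])
print(example)
```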

Meta-analyses were conducted using Review Manager (RevMan, version 5.3; The Nordic Cochrane Centre, The Cochrane Collaboration) and R software for statistical computing (R, version 3.4.0; “meta” and “dmetar” packages). Statistical significance was defined as p < 0.05.

3 Results

3.1 Literature search results

Figure 1 shows the flow chart of the study identification process, including the reasons for exclusion of the non-eligible studies. The search strategy and the other sources retrieved a total of 797 papers, which were reduced to 586 after removal of duplicates. After reviewing the titles and abstracts, 542 studies were excluded and the full texts of the remaining 44 articles were assessed for eligibility. Finally, 12 studies were included in the qualitative and quantitative synthesis [14, 17,18,19, 23, 32,33,34,35,36,37,38].

Fig. 1 PRISMA flow diagram

3.2 Studies description

Studies’ general characteristics are listed in Table 1. The included studies comprised a total of 7599 patients (males: 23.65%; 95% CI: 19.0–28.65) with a mean age of 48.89 years (calculated in 5481 patients out of 7599; 95% CI: 44.16–53.63). Most of the patients were diagnosed with papillary thyroid carcinoma (99.80%; 95% CI: 98.68–100). According to AJCC stage, 64.07% (95% CI: 9.40–99.93) of patients showed early-stage tumors, while 35.93% (95% CI: 0.70–90.60) were diagnosed with advanced-stage tumors. The principal surgical procedure applied was total thyroidectomy (94.9%; 95% CI: 81.4–99.9), while hemithyroidectomy was performed in 723 cases (5.07%; 95% CI: 0.10–18.58). After surgery, a total of 3907 out of 5629 patients (80.75%; 95% CI: 55.32–96.92) underwent radioiodine therapy. The mean NLR, LMR, and PLR were 2.20 (calculated in 5281 patients out of 7599; 95% CI: 1.92–2.48), 6.59 (calculated in 3268 patients out of 7599; 95% CI: 5.34–7.85), and 133.52 (calculated in 5090 patients out of 7599; 95% CI: 122.82–144.21), respectively. Cut-off values were obtained through median values, tertiles, and receiver operating characteristic curves (AUC-ROC). Their ranges varied from 1.6 to 3.15, 3.62 to 10.42, and 124.3 to 180.0 for NLR, LMR, and PLR, respectively.

Table 1 Studies’ general characteristics

3.3 Methodological quality and risk of bias of included studies

MINORS criteria showed an overall mean score of 12.25 ± 0.43 (maximum of 16). MINORS scores of individual studies are shown in Table 2.

Table 2 The Methodological Index for Non-randomized Studies (MINORS) results

Funnel plots for each outcome are shown in Fig. 2A-C. Visual inspection and the Egger’s linear regression test showed a symmetric distribution of the points in the funnel plots for NLR (Intercept = 0.23, p = 0.72) and LMR (Intercept = -1.32, p = 0.06), suggesting no obvious publication bias. Conversely, the funnel plot for PLR showed an asymmetric distribution (Intercept = 1.12, p = 0.02), suggesting evidence of publication bias. A sensitivity analysis using the trim-and-fill method was performed with 4 imputed studies, which produced a symmetrical funnel plot (Fig. 2D).

Fig. 2 Funnel plot for evaluation of publication bias for DFS with (A) NLR, (B) LMR, (C) PLR, and (D) PLR after application of the trim-and-fill method

3.4 Survival analysis

The mean follow-up time was 56.48 months (in 5781 patients out of 7599; 95% CI: 30.64–82.33).

Nine studies (6114 patients out of 7599) assessed the relation between NLR and DFS. The estimated pooled log-HR was 0.07 (95% CI: -0.12 to 0.26; p = 0.43; Fig. 3A). Moderate heterogeneity was measured between studies (Q = 17.93, p < 0.05). In particular, the between-study heterogeneity variance was estimated at τ² = 0 (95% CI: 0–1.31), with an I² value of 49.8% (95% CI: 0–75.7). The Baujat plot showing each study’s contribution to the overall heterogeneity is shown in Fig. 4A.

The association between LMR and DFS was assessed by 8 studies (4877 patients out of 7599). The estimated pooled log-HR was -0.58 (95% CI: -1.21 to 0.05; p = 0.06; Fig. 3B), with substantial between-study heterogeneity (Q = 29.38, p < 0.05). The between-study variance was estimated at τ² = 0.36 (95% CI: 0.04–2.25), with an I² value of 72.8% (95% CI: 46.7–86.1). The Baujat plot showing each study’s contribution to the overall heterogeneity is shown in Fig. 4B.

Finally, 6 studies (3782 patients out of 7599) investigated the relation between PLR and DFS. The estimated pooled log-HR was 0.01 (95% CI: 0–0.01; p = 0.21; Fig. 3C). Moderate heterogeneity was measured between studies (Q = 9.74, p = 0.14). In particular, the between-study heterogeneity variance was estimated at τ² < 0.01 (95% CI: 0–0.96), with an I² value of 38.4% (95% CI: 0–74.1). The Baujat plot showing each study’s contribution to the overall heterogeneity is shown in Fig. 4C.

Influence analysis identified no influential studies on the cumulative log-HRs for DFS with NLR and LMR, while one study [31] was found to be influential on the pooled log-HR for DFS with PLR. In particular, after removing the influential study, the between-study heterogeneity variance was estimated at τ² = 0 (95% CI: 0–0.67), with an I² value of 0% (95% CI: 0–74.6), while the pooled effect size did not significantly change (log-HR = 0.01; 95% CI: 0–0.01; p = 0.08). Table 3 displays both the original results and the results of the sensitivity analysis after the removal of the influential study.
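Because the pooled estimates are reported on the log scale, exponentiating them gives the corresponding hazard ratios, which may be easier to read clinically. The short sketch below back-transforms the pooled log-HRs and CI limits reported in this section; the resulting HRs are derived here by simple exponentiation and rounding, so they are approximate. The variable and function names are hypothetical.

```python
import math

# Pooled log-HR, lower and upper 95% CI limits as reported in the text
pooled_log_hr = {
    "NLR": (0.07, -0.12, 0.26),
    "LMR": (-0.58, -1.21, 0.05),
    "PLR": (0.01, 0.00, 0.01),
}


def to_hr(log_hr, lower, upper):
    """Exponentiate a log-HR and its CI limits, rounded to two decimals."""
    return tuple(round(math.exp(v), 2) for v in (log_hr, lower, upper))


hazard_ratios = {marker: to_hr(*vals) for marker, vals in pooled_log_hr.items()}
for marker, (hr, lo, hi) in hazard_ratios.items():
    # A CI that includes 1 corresponds to a non-significant association
    print(f"{marker}: HR = {hr} (95% CI {lo}-{hi}); CI includes 1: {lo <= 1.0 <= hi}")
```

On this scale, all three intervals include 1, which agrees with the non-significant p-values reported above.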

Fig. 3 Forest plots showing the pooled log-HRs for DFS with (A) NLR, (B) LMR, and (C) PLR. The dashed vertical line represents the overall measure of effect

Fig. 4 Baujat plots showing each study’s contribution to the overall heterogeneity for (A) NLR, (B) LMR, and (C) PLR

Table 3 Sensitivity analysis before and after the removal of the influential study

4 Discussion

In real-world clinical practice, recurrence/persistence of DTC after thyroidectomy is predicted by using several clinical and pathological features included in the American Thyroid Association risk stratification system, such as tumor size, histological type, lymph node involvement, vascular invasion, and gene mutations [39]. In this context, the clinical value of inflammation biomarkers is uncertain. The aim of this meta-analysis was to assess the prognostic value of inflammatory markers, including NLR, LMR, and PLR, in patients diagnosed with differentiated thyroid cancer. We found a pooled log-HR of 0.07, -0.58, and 0.01 for NLR, LMR, and PLR, respectively. As mentioned above, a log-HR > 0 suggested a higher risk of recurrence for patients with higher values of NLR, LMR or PLR, whereas a log-HR < 0 implied a lower risk of recurrence for patients with higher values. However, there was no significant association between the analyzed markers and either tumor control or survival. Moreover, heterogeneity between studies ranged from moderate to high; thus, these results should be interpreted cautiously.

NLR is one of the most extensively studied inflammatory indices as a prognostic factor in solid tumors, including thyroid cancer. It has been demonstrated that neutrophils may directly interact with circulating tumor cells to drive cell cycle progression within the bloodstream and to accelerate metastasis seeding [40]. Conversely, lymphocytes play a pivotal role in identifying and targeting cancer cells. Consequently, NLR appears to influence oncological outcomes by reflecting the balance between the body’s inflammatory response and its ability to mount an effective immune defense against cancer. Elevated NLR values are commonly indicative of a poorer prognosis. Furthermore, a high NLR has been correlated with diminished responsiveness to cancer treatments, including chemotherapy and immunotherapy, potentially affecting overall patient outcomes [41, 42]. However, current literature data are inconclusive about the prognostic role of NLR in thyroid neoplasms. In a previous meta-analysis, Feng et al. found that high NLR values correlated with unfavorable DFS and with increased tumor size and metastatic status [43]. Conversely, our study showed no significant association between NLR and DFS. It is noteworthy that in that meta-analysis tumors of follicular origin were analyzed together with medullary thyroid cancers, which are known to have different clinical features as well as a different prognosis. Moreover, the heterogeneity of the sample was notably large, limiting the generalizability of the results.

Recently, several studies have demonstrated the potential role of PLR as a prognostic factor in some tumor types, such as renal cancer [44], lung cancer [45] and colorectal cancer [46], suggesting that high values of PLR might correlate with a worse prognosis. The exact mechanism underlying the association between PLR and tumor cell behavior remains unclear, although it has been suggested that it might result from the release of some interleukins (ILs), such as IL-1, IL-2 and IL-6 [47]. Nevertheless, most studies have focused on high-grade disease and advanced cancers [48]. Considering the low aggressiveness of well-differentiated thyroid cancers and the scarce data on this parameter, we can conclude that the role of PLR remains to be elucidated.

In a previous meta-analysis, Gu et al. demonstrated that low LMR values might correlate with poor prognosis in different cancer types [49]. The most accredited explanation is that peripheral lymphocytes are attracted to the tumor site and transform into tumor-infiltrating lymphocytes with antitumor effects. Circulating monocytes accumulate at the tumor site and may develop into tumor-associated macrophages (TAMs), which promote the growth of the tumor. As a result, LMR could serve as an indicator of the body’s ability to fight tumors, with low ratios indicating pro-tumor environments and high ratios indicating anti-tumor environments [50]. Accordingly, Ahn et al. found that low LMR was related to poor overall survival in a cohort of 35 anaplastic thyroid carcinomas [51]. In addition, the same authors demonstrated that low LMR was an independent unfavorable prognostic factor in patients with progressive papillary thyroid carcinoma refractory to radioactive iodine treated with tyrosine kinase inhibitors [52].

Research has shown that the body’s overall immune system significantly impacts cancer development and control by influencing the tumor microenvironment [53]. Understanding the local inflammatory conditions within the thyroid gland and the associated markers could provide valuable insights into the local mechanisms of the disease.

These immunological aspects can represent a new area of investigation in the search for pre- and post-operative factors that can predict the effectiveness of new therapeutic treatments in the context of personalized medicine for differentiated thyroid cancer. Indeed, serum biomarkers could preoperatively guide therapeutic decisions more accurately, contributing to the classification of patients as high or low risk. Conversely, intratumoral immunological factors could serve as a potential post-operative factor that may contribute to the choice of adjuvant post-operative treatment and, if applicable, determine the most favorable outcome [54].

However, our study primarily concentrated on examining the potential of specific inflammatory markers (NLR, LMR, PLR) as predictors of thyroid malignancies, with an emphasis on their systemic effects. Overall, our meta-analysis did not find significant correlations between DFS and the blood-derived immune markers NLR, PLR, and LMR. Given that DTCs are low-malignancy tumors, it could be hypothesized that in such tumors the blood-derivable immune profile is less strongly predictive than in other solid tumors, and therefore a larger sample is required to achieve high statistical power. However, as these markers are extremely cost-effective and easy to obtain in any center, further large prospective studies are advised to investigate these parameters more thoroughly, especially in the high-risk setting.

This meta-analysis has several limitations. The main limitation is the low quality of the included studies. All the included papers were retrospective studies, which may have introduced selection and review bias in the data regarding the analyzed outcomes. Therefore, further prospective studies or even randomized controlled trials should be conducted to better assess the prognostic role of these markers. Second, although some authors found that patients with higher NLR, PLR and/or lower LMR may have a significantly higher risk of recurrence or death, they were unable to propose an optimal cut-off value to categorize these inflammatory markers as high or low. In addition, the methods used to determine the cut-off values varied widely across studies, including median values, tertiles, and ROC curves. Accordingly, the cut-off values differed widely, ranging from 1.6 to 3.15, 3.62 to 10.42, and 124.3 to 180.0 for NLR, LMR, and PLR, respectively. These differences introduced significant between-study heterogeneity, which warrants caution in the interpretation of our results. Finally, because of a lack of reported data in the included studies, our analysis could not stratify the results according to disease staging. In fact, only a minority of the included articles reported data about the staging. Likewise, it was not possible to perform a subgroup analysis according to histological subtype, differentiating between high-grade and low-grade tumors.

5 Conclusions

This meta-analysis showed no significant association between DFS and any of NLR, LMR, or PLR in patients with DTC. However, moderate to high heterogeneity between the analyzed studies was present. Because inflammatory markers are extremely cost-effective and easy to obtain in any center, further large prospective studies are advised to better assess their prognostic value, especially in the high-risk setting.