Introduction

Diffuse large B-cell lymphoma (DLBCL) is an aggressive neoplasm derived from mature B cells and represents the most prevalent type of lymphatic malignancy [1]. DLBCL predominantly originates in lymph nodes, with some cases exhibiting secondary spread to extranodal organs, including the gastrointestinal (GI) tract, central nervous system (CNS), bones, lungs, and liver [2]. Approximately one-third of DLBCL patients may also present with primary extranodal involvement (ENI) [3]. DLBCL with extranodal presentation exhibits genetic and clinical distinctions from nodal DLBCL, negatively impacting the prognosis of patients [2, 4,5,6]. The prognostic significance of multiple ENI has been acknowledged and integrated into the International Prognostic Index (IPI), which stands as one of the most prevalent clinical prognostic models [7]. Additionally, the National Comprehensive Cancer Network-IPI (NCCN-IPI) underscores the importance of disease manifestation at distinct extranodal locations such as the bone marrow, CNS, liver/GI tract, or lung [8]. According to a previous report [4], patients with multiple ENI had a notably lower overall response rate (75.5% vs. 89.3%) and 2-year overall survival rate (61.8% vs. 81.3%) than individuals without ENI following routine first-line treatment with rituximab in combination with cyclophosphamide, doxorubicin, vincristine, and prednisone (R-CHOP). One study demonstrated that the first-line regimen of zanubrutinib with R-CHOP (ZR-CHOP) yielded superior efficacy and a longer duration of remission in high-risk DLBCL patients with ENI, with a tolerable safety profile [9]. Hence, early identification of patients with DLBCL and ENI who are at heightened risk of recurrence or mortality is valuable and can help clinicians formulate personalized treatment plans.

18F-fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT) serves a crucial function in evaluating the prognosis of DLBCL with ENI. A recent study has indicated that the total metabolic tumor volume (TMTV) of ENI significantly influences the prognosis of DLBCL. Within the overall TMTV of DLBCL lesions, extranodal TMTV may have a more substantial impact on prognosis compared to nodal TMTV [10]. This finding underscores the essential contribution of 18F-FDG PET/CT in revealing the heterogeneity of DLBCL and formulating personalized treatment strategies. Nonetheless, relying solely on PET parameters does not comprehensively encompass the morphological and metabolic features of the lesions. Baseline PET radiomics, established by analyzing these features to extract deeper insights, have emerged as crucial independent prognostic factors for DLBCL [11,12,13,14]. Notably, there is currently relatively limited research on radiomics predicting the prognosis of DLBCL with ENI.

Therefore, the objective of this investigation is to delve deeper into the capability of radiomics to predict the prognosis of DLBCL with ENI.

Materials and methods

Study population

This study, approved by the institutional ethics review board under No. 2022KY288, involved a retrospective analysis of patients with newly diagnosed DLBCL in our department between October 2013 and June 2022. The requirement for obtaining written informed consent was waived by the institutional review committee. The inclusion criteria were as follows: (1) age 18 years or older, (2) histologically confirmed diagnosis of DLBCL, (3) extranodal involvement on baseline 18F-FDG PET/CT scans, and (4) initial treatment with R-CHOP(-like) regimens. Individuals were excluded if they had incomplete clinical data or a documented history of prior cancer.

The final cohort comprised 126 patients diagnosed with DLBCL with extranodal involvement (ENI), consisting of 74 men and 52 women. The patients’ average age was 56.80 ± 14.20 years, ranging from 23 to 84 years. The patient selection process is illustrated in a flowchart (Supplementary Fig. 1). Several clinical characteristics were documented, encompassing age, gender, number and specific sites of ENI, performance status, B symptoms, pathological type, serum lactate dehydrogenase (LDH) level, and serum β2-microglobulin (β2-MG) level. The NCCN-IPI was computed using a previously established method [8].

Follow-up assessments occurred post-treatment and continued until October 2023, ensuring at least 16 months of follow-up or until mortality. We have categorized the various treatment modalities administered to patients in this study. Supplementary Table 1 summarizes the treatment approaches. This study examined the prognosis of DLBCL, with 5-year progression-free survival (PFS) and 5-year overall survival (OS) as the main focus. PFS represented the time from initial diagnosis to disease recurrence, progression, or mortality. Correspondingly, OS was measured from diagnosis to mortality.
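The endpoint definitions above can be illustrated with a minimal sketch. The function names, the month conversion factor, and the rule of censoring event-free patients at their last follow-up are assumptions made for illustration; the text does not specify how censoring was handled.

```python
from datetime import date

# Assumption: average month length used to convert days to months.
DAYS_PER_MONTH = 365.25 / 12


def months_between(start: date, end: date) -> float:
    """Elapsed time between two dates, expressed in months."""
    return (end - start).days / DAYS_PER_MONTH


def survival_endpoints(diagnosis, last_follow_up, progression=None, death=None):
    """Return (pfs_months, pfs_event, os_months, os_event).

    PFS ends at the earliest of recurrence/progression or death.
    OS ends at death. Patients without an event are censored at the
    last follow-up date (an assumption, not stated in the text).
    """
    pfs_dates = [d for d in (progression, death) if d is not None]
    pfs_event = bool(pfs_dates)
    pfs_end = min(pfs_dates) if pfs_event else last_follow_up

    os_event = death is not None
    os_end = death if os_event else last_follow_up

    return (
        months_between(diagnosis, pfs_end),
        pfs_event,
        months_between(diagnosis, os_end),
        os_event,
    )


# Hypothetical patient: progression after one year, alive at last follow-up.
example = survival_endpoints(
    diagnosis=date(2020, 1, 1),
    last_follow_up=date(2023, 10, 1),
    progression=date(2021, 1, 1),
)
```

In this hypothetical case the patient contributes a PFS event at about 12 months and a censored OS observation at the last follow-up.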

PET/CT examination

The PET/CT examinations were conducted using either PHILIPS GEMINI GXL16 (45 patients, 35.7%) or PHILIPS Vereos (81 patients, 64.3%) scanners. Prior to the injection of 18F-FDG, patients fasted for a minimum of 6 h, and their blood glucose levels were required to be below 11.1 mmol/L. Whole-body PET/CT scans were acquired approximately 60 ± 5 min after the intravenous injection of 18F-FDG. CT acquisition data were employed for attenuation correction. Supplementary Table 2 provides the detailed PET/CT image acquisition parameters.

VOI segmentation and radiomics feature extraction

Two experienced nuclear medicine physicians with over 5 years of diagnostic experience, blinded to the outcomes, used the LIFEx software (version 7.3.0; available at https://www.lifexsoft.org/) to semi-automatically delineate the volumes of interest (VOIs) in the PET images using two fixed thresholds of the standardized uptake value (SUV), namely SUV ≥ 2.5 and SUV ≥ 4.0 [15, 16].
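As an illustration of fixed-threshold delineation, the following minimal sketch keeps the voxels at or above an SUV cutoff and derives the metabolic volume, SUVmax, and total lesion glycolysis from them. It is not the LIFEx implementation; the function name, the flat list of voxel SUVs, and the voxel volume are hypothetical, and TLG is taken in its conventional form as SUVmean multiplied by metabolic volume.

```python
def threshold_metrics(suv_values, voxel_volume_ml, threshold):
    """Apply a fixed SUV threshold to a list of voxel SUVs.

    Returns the metabolic tumor volume (mL), SUVmax, and TLG of the
    voxels that satisfy SUV >= threshold.
    """
    selected = [v for v in suv_values if v >= threshold]
    mtv = len(selected) * voxel_volume_ml
    suv_mean = sum(selected) / len(selected) if selected else 0.0
    return {
        "mtv_ml": mtv,
        "suv_max": max(selected, default=0.0),
        "tlg": suv_mean * mtv,  # conventional definition: SUVmean x volume
    }


# Hypothetical voxel SUVs, each voxel assumed to be 0.5 mL.
voxels = [1.0, 2.5, 3.0, 4.0, 6.0]
seg_25 = threshold_metrics(voxels, voxel_volume_ml=0.5, threshold=2.5)
seg_40 = threshold_metrics(voxels, voxel_volume_ml=0.5, threshold=4.0)
```

With the same voxels, the higher cutoff yields a smaller volume and a higher mean uptake, which is why the two thresholds can produce different radiomics features.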

All radiomics features from the two segmentation methods were extracted utilizing the open-source PyRadiomics, available at http://www.radiomics.io/pyradiomics.html [17,18,19]. Subsequently, to mitigate the impact of variations in size, characteristics, and distribution, the z-score method was applied to standardize the radiomics signatures. Figure 1 depicts the radiomics analysis workflow.
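The z-score standardization step can be sketched as follows. This is a minimal illustration rather than the study code: the function name is hypothetical, the input is assumed to be one row of feature values per patient, and the population standard deviation is used, although the text does not state which variant was applied.

```python
from statistics import mean, pstdev


def z_score_columns(rows):
    """Standardize each feature (column) to zero mean and unit variance.

    rows: list of per-patient feature lists, all of equal length.
    Constant features (zero standard deviation) are mapped to 0.0.
    """
    standardized_columns = []
    for column in zip(*rows):
        mu = mean(column)
        sd = pstdev(column)  # assumption: population standard deviation
        standardized_columns.append(
            [(x - mu) / sd if sd > 0 else 0.0 for x in column]
        )
    # Transpose back to one row per patient.
    return [list(row) for row in zip(*standardized_columns)]


# Three hypothetical patients, two features (the second is constant).
z = z_score_columns([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
```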

Fig. 1

The workflow of radiomics analysis in the study. AIC Akaike information criterion, C-index concordance index

Feature selection and model construction

To mitigate issues of multicollinearity and overfitting, the least absolute shrinkage and selection operator (LASSO) Cox regression algorithm with tenfold cross-validation was applied separately to each of the two segmentation methods to select the optimal subset from all 1328 features. The radiomics score was calculated as the sum of the selected signatures, each weighted by its corresponding LASSO coefficient.
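The weighted sum that defines the radiomics score can be written as a short sketch. The feature names and coefficient values below are hypothetical placeholders; the coefficients actually selected in the study are those listed in Supplementary Table 3.

```python
def radiomics_score(features, coefficients):
    """Linear combination of standardized features and LASSO coefficients.

    features: mapping of feature name to standardized value for one patient.
    coefficients: mapping of selected feature name to its LASSO coefficient.
    Features that LASSO did not select are simply ignored.
    """
    return sum(coef * features[name] for name, coef in coefficients.items())


# Hypothetical coefficients and feature values, for illustration only.
lasso_coefficients = {"feature_a": 0.5, "feature_b": -0.25}
patient_features = {"feature_a": 2.0, "feature_b": 2.0, "feature_c": 9.0}
score = radiomics_score(patient_features, lasso_coefficients)
```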

Construction of hybrid nomograms

Conventional PET parameters of the lesions at the patient level were evaluated, including SDmax, SUVmax_T, TMTV, and total lesion glycolysis (TLG). Additionally, parameters specific to the largest lesion, such as SUVmax_B, metabolic bulk volume (MBV), and TLG_B, were also examined. SDmax was defined as the maximum separation among lesions, normalized according to the patient’s body surface area (BSA). Additionally, MBV was calculated as the metabolic volume of the largest identified lesion. The cutoff points of these parameters were determined using X-tile software (version 3.6.1, Yale University).
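A minimal sketch of the SDmax definition is given below. Several details are assumptions that the text does not specify: distances are measured between lesion centroids, coordinates are in centimeters, and the body surface area is estimated with the Du Bois formula. The function names are hypothetical.

```python
import math
from itertools import combinations


def bsa_du_bois(weight_kg, height_cm):
    """Body surface area in m^2 (Du Bois formula; an assumed choice)."""
    return 0.007184 * weight_kg ** 0.425 * height_cm ** 0.725


def sd_max(lesion_centroids_cm, weight_kg, height_cm):
    """Largest distance between any two lesions, divided by BSA.

    lesion_centroids_cm: list of (x, y, z) lesion centroids in cm.
    Returns 0.0 when fewer than two lesions are present.
    """
    if len(lesion_centroids_cm) < 2:
        return 0.0
    largest_distance = max(
        math.dist(a, b) for a, b in combinations(lesion_centroids_cm, 2)
    )
    return largest_distance / bsa_du_bois(weight_kg, height_cm)


# Hypothetical patient with three lesions.
centroids = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (1.0, 1.0, 0.0)]
value = sd_max(centroids, weight_kg=70.0, height_cm=170.0)
```

Normalizing by BSA is intended to make the dissemination measure comparable between patients of different body size.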

Univariate Cox regression analysis was employed to assess the prognostic value of clinical variables and conventional PET parameters of the lesions (at the patient level and the largest-lesion level). Subsequently, multivariate Cox regression was used to identify significant clinical variables and conventional PET parameters, which were then combined with the radiomics score to develop a combined model for predicting both PFS and OS. Hybrid nomograms were developed based on the regression coefficients of each factor that remained significant in the multivariate Cox analysis of the combined model [20].
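To show how regression coefficients translate into a nomogram, the sketch below follows one common convention: each predictor's maximum points are proportional to the absolute coefficient multiplied by the predictor's value range, and the predictor with the largest effect is scaled to 100 points. This convention, the function name, and the example hazard ratios are assumptions for illustration; the text does not describe the scaling used for the published nomograms.

```python
import math


def nomogram_points(coefficients, value_ranges):
    """Maximum nomogram points per predictor.

    coefficients: mapping of predictor name to Cox coefficient (log hazard ratio).
    value_ranges: mapping of predictor name to (minimum, maximum) value.
    The predictor with the largest effect span receives 100 points.
    """
    spans = {
        name: abs(beta) * (value_ranges[name][1] - value_ranges[name][0])
        for name, beta in coefficients.items()
    }
    largest = max(spans.values())
    return {name: 100.0 * span / largest for name, span in spans.items()}


# Hypothetical binary predictors with hazard ratios of 4.0 and 2.0.
betas = {"predictor_a": math.log(4.0), "predictor_b": math.log(2.0)}
ranges = {"predictor_a": (0, 1), "predictor_b": (0, 1)}
points = nomogram_points(betas, ranges)
```

A Cox coefficient is the natural logarithm of the hazard ratio, so a predictor with a hazard ratio of 2.0 carries half the points of one with a hazard ratio of 4.0 when both span the same range.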

Model performance assessment and validation

The goodness of fit of the models was assessed via the Akaike information criterion (AIC) [21], while their predictive performance was evaluated using Harrell’s concordance index (C-index). Internal validation was performed using a bootstrap resampling technique to obtain corrected C-index values [22]. Additionally, differences in the area under the curve (AUC) values between the radiomics models derived from the two segmentation methods were analyzed using the DeLong test (MedCalc 19.6.4). Subsequently, the optimal radiomics model was determined and integrated with clinical and conventional PET features to create a combined model.
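The two performance measures named above can be sketched in a few lines. This is a simplified illustration, not the R routines used in the study: the C-index function ignores pairs with tied survival times, and the AIC function assumes the log partial likelihood of a fitted Cox model is already available. Function names are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    A pair (i, j) is comparable when patient i has an observed event
    and a shorter time than patient j. The pair is concordant when the
    patient with the shorter time has the higher risk score; tied risk
    scores count as half. Pairs with tied times are skipped here.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def cox_aic(log_partial_likelihood, n_parameters):
    """AIC = 2k - 2 * log-likelihood; lower values indicate better fit."""
    return 2 * n_parameters - 2 * log_partial_likelihood


# Hypothetical data: higher risk scores belong to earlier events.
c = harrell_c_index(
    times=[5, 10, 15, 20],
    events=[1, 1, 0, 1],
    risk_scores=[4.0, 3.0, 2.0, 1.0],
)
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk.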

Statistical analysis

All statistical analyses were conducted utilizing R statistical software (version 4.2.2) (http://www.R-project.org) and SPSS software (version 26.0, IBM). The optimum cutoff points of the radiomics score and the combined model score were determined to stratify patients into low- or high-risk groups. Survival outcomes were assessed through Kaplan–Meier analysis and compared utilizing log-rank tests. Statistical significance was determined at a P-value less than 0.05. The dependence of the models on the type of scanners was compared with the Mann–Whitney U test.
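For readers unfamiliar with the survival methods named above, the following is a minimal sketch of the Kaplan–Meier estimator and a two-group log-rank test. It is written from the textbook definitions and is not the R or SPSS code used in the study; the function names and example data are hypothetical. The P-value uses the chi-square distribution with one degree of freedom.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if e and x == t)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


def log_rank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P-value)."""
    times = list(times_a) + list(times_b)
    events = list(events_a) + list(events_b)
    observed_a = expected_a = variance = 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = sum(1 for x in times if x >= t)        # at risk, both groups
        n_a = sum(1 for x in times_a if x >= t)    # at risk, group A
        d = sum(1 for x, e in zip(times, events) if e and x == t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if e and x == t)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical high-risk group (early events) versus low-risk group.
chi2, p_value = log_rank_test(
    [1, 2, 3, 4, 5], [1, 1, 1, 1, 1],
    [10, 11, 12, 13, 14], [1, 1, 1, 1, 1],
)
```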

Results

Patient characteristics

The characteristics of the patients with DLBCL and ENI are outlined in Table 1. The median follow-up duration for patients was 29.77 months (range: 4.53–111.97 months). During this period, 62 patients experienced disease recurrence or progression, and 28 patients ultimately died. The 5-year PFS rate was 51.6% (65/126), and the 5-year OS rate was 78.6% (99/126). The ENI site was most commonly observed in the bone marrow (50, 39.7%), followed by the bones (48, 38.1%), stomach (22, 17.5%), intestine (20, 15.9%), lungs (20, 15.9%), kidney/adrenal glands (18, 14.3%), skin (11, 8.7%), liver (9, 7.1%), breasts (9, 7.1%), uterus/ovaries (6, 4.8%), nasal cavity (6, 4.8%), testes (5, 4.0%), CNS (5, 4.0%), muscles (5, 4.0%), pancreas (4, 3.2%), thyroid (1, 0.8%), and heart (1, 0.8%) (Fig. 2).

Table 1 The characteristics of enrolled patients
Fig. 2

Distribution according to sites of extranodal involvement. CNS central nervous system

Feature analysis and model establishment

The outcomes of the LASSO Cox analysis are depicted in Supplementary Fig. 2. Using the fixed thresholding method with SUVmax ≥ 2.5, a total of 8 and 4 optimal signatures were identified for predicting PFS and OS, respectively. Meanwhile, using SUVmax ≥ 4.0, a total of 6 and 8 optimal signatures were identified for predicting PFS and OS, respectively. Supplementary Table 3 provides a list of the selected features along with their corresponding coefficients.

The C-index values of the radiomics models derived from the SUV ≥ 2.5 and SUV ≥ 4.0 segmentation methods were 0.693 (95% CI 0.660–0.727) and 0.704 (95% CI 0.671–0.738) for predicting PFS, and 0.742 (95% CI 0.696–0.789) and 0.744 (95% CI 0.699–0.789) for OS, respectively. No statistically significant difference in the AUC between the two radiomics models was observed for either PFS or OS (P = 0.974 and 0.957, respectively). Compared to the SUV ≥ 2.5 segmentation method, the radiomics model obtained from the SUV ≥ 4.0 segmentation method exhibited slightly better performance in predicting PFS and OS. Moreover, this method requires less manual adjustment, saving time and effort. Therefore, it was selected as the optimal radiomics model and combined with clinical and conventional PET features.

For the clinical variables, we identified the number of ENI in major organs (HR = 2.265, P = 0.003) and time interval to recurrence (HR = 1.932, P = 0.005) as independent risk factors for OS, while no independent factors were found for PFS. Regarding conventional PET parameters, univariate and multivariate analysis showed that the SDmax was an independent risk predictor for both PFS (HR = 3.106, P = 0.004) and OS (HR = 4.318, P = 0.004) (Fig. 3 and Supplementary Table 4).

Fig. 3

Forest plot illustrating the predictors of PFS and OS derived from results of both univariate and multivariate analysis. ENI extranodal involvement, LDH lactate dehydrogenase, β2-MG β2-microglobulin, NCCN-IPI National Comprehensive Cancer Network-International Prognostic Index, SUVmax maximum standardized uptake value, TMTV total metabolic tumor volume, TLG total lesion glycolysis, MBV metabolic bulk volume

In the multivariate analysis of the combined model, SDmax (HR = 2.378, P = 0.030) and radiomics score (HR = 7.328, P < 0.001) emerged as independent predictors of PFS. Additionally, the number of ENI in major organs (HR = 1.978, P < 0.001), time interval to recurrence (HR = 1.769, P = 0.018), and radiomics score (HR = 2.368, P = 0.047) were recognized as independent predictors of OS (Supplementary Table 5). The hybrid nomograms provide a visual representation of the prediction outcomes and the contribution of each factor for both PFS and OS (Fig. 4).

Fig. 4

Hybrid nomograms for predicting a PFS and b OS. ENI extranodal involvement

Model performance assessment and validation

The combined model exhibited a lower AIC value compared to the radiomics model and SDmax for both PFS (507.101 vs. 510.658 vs. 525.506) and OS (215.667 vs. 230.556 vs. 219.313), respectively. For PFS, the C-indices of the combined model, radiomics model, and SDmax were 0.724 (95% CI 0.692–0.756), 0.704 (95% CI 0.671–0.738), and 0.615 (95% CI 0.590–0.641), respectively. Regarding OS, the C-indices of the combined model, radiomics model, and clinical variables were 0.842 (95% CI 0.811–0.874), 0.744 (95% CI 0.699–0.789), and 0.792 (95% CI 0.748–0.837), respectively. The findings were further validated by bootstrap resampling (n = 1000) (Supplementary Table 6).

Survival prediction and risk stratification

For predicting low or high recurrence risk, the cutoff values of the radiomics score and the combined model score were 0.10 and 1.00, respectively. Similarly, for predicting the risk of patient death, the cutoff values of the radiomics score and the combined model score were 0.16 and 1.25, respectively. In both radiomics and combined models, the Kaplan–Meier curves showed significantly higher rates of relapse and mortality among patients classified as high-risk compared to those classified as low-risk (all P < 0.05) (Fig. 5).
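The dichotomization described above can be expressed as a small lookup, using the cutoff values reported in this section. The function name and the handling of scores exactly equal to the cutoff (assigned here to the high-risk group) are assumptions; the text does not state which side the boundary falls on.

```python
# Cutoff values as reported in the text above.
CUTOFFS = {
    ("PFS", "radiomics"): 0.10,
    ("PFS", "combined"): 1.00,
    ("OS", "radiomics"): 0.16,
    ("OS", "combined"): 1.25,
}


def risk_group(score, endpoint, model):
    """Classify a patient as 'high' or 'low' risk for a given endpoint and model.

    Assumption: a score equal to the cutoff is treated as high risk.
    """
    return "high" if score >= CUTOFFS[(endpoint, model)] else "low"


# A hypothetical radiomics score of 0.12 is high risk for recurrence
# but low risk for death under the reported cutoffs.
recurrence_group = risk_group(0.12, "PFS", "radiomics")
death_group = risk_group(0.12, "OS", "radiomics")
```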

Fig. 5

Kaplan–Meier survival analysis of PFS and OS in the radiomics model (a, b) and the combined model (c, d)

Scanner comparability

There were no statistically significant differences in the performance of the combined model, radiomics model, and SDmax/clinical variables between the two scanner types (PFS: P = 0.641, 0.506, and 0.836, respectively; OS: P = 0.897, 0.701, and 0.457, respectively).

Discussion

DLBCL with ENI represents a distinct subgroup of lymphomas associated with unique clinical and genetic characteristics, often leading to a worse prognosis compared to nodal DLBCL. Early identification of DLBCL with ENI at a higher risk of recurrence or death is crucial for clinicians to develop personalized treatment strategies and improve outcomes. Our retrospective study indicates that baseline 18F-FDG PET radiomics and hybrid nomograms may prognosticate the outcomes of DLBCL with ENI. The hybrid nomograms, which combine radiomics with clinical factors or conventional PET parameters, outperformed models based solely on clinical factors or PET parameters. This underscores the incremental prognostic value that radiomics adds to clinical and PET parameters. To our knowledge, there is limited research on radiomics predicting the prognosis of DLBCL with ENI.

A previous study represented the first attempt to integrate radiomics with IPI to evaluate the prognostic value of baseline PET/CT in DLBCL [11]. Subsequent studies have further confirmed that PET-based radiomics enhances the prognostic predictive value of clinical factors in DLBCL [23,24,25]. These findings are consistent with our study, indicating that the combination of radiomics with a conventional PET parameter (SDmax) or clinical factors (the number of ENI in major organs and time interval to recurrence) could predict 5-year PFS and 5-year OS with better goodness of fit (lower AIC) and performance (higher C-index) compared to the standalone conventional PET parameter or clinical factors in DLBCL with ENI.

The presence of multiple ENI and of major organ involvement has already been incorporated into clinical prognostic models such as the IPI and NCCN-IPI [7, 8]. In this study, the number of ENI in major organs was identified as an independent risk factor for OS in DLBCL with ENI, while the number of ENI sites and whether there was involvement of major organs were not independent prognostic factors for OS. None of these factors was an independent prognostic factor for PFS. In addition, the time interval to recurrence was identified as an independent predictor of OS. In particular, when the recurrence time interval was less than 1 year, the prognosis was poorer, consistent with the previous literature [26]. In the combined model of our study, SDmax was identified as an independent predictor for PFS but not for OS. Other conventional PET parameters, such as TMTV and TLG, showed no independent association with either PFS or OS. In contrast, another study indicated that SDmax and TMTV, reflecting tumor dissemination and tumor burden, served as independent prognostic indicators [27] and could be utilized to identify high-risk patients with DLBCL. A possible explanation for this discrepancy is that the present study focused exclusively on DLBCL with ENI, which sets it apart from other predictive models. Consequently, specific aspects of ENI, such as the affected major organs, may carry greater independent significance in the prediction of overall survival. Additionally, radiomic features are closely correlated with intratumoral metabolic heterogeneity and complexity [28]. The inclusion of radiomics variables in the combined model may exert a more pronounced impact on survival prognosis, thereby relatively diminishing the prognostic efficacy of clinical factors or conventional PET parameters.

This study compared the performance of radiomic models in predicting patient PFS and OS using two different semi-automated segmentation methods. The results showed that the SUV ≥ 4.0 segmentation method exhibited slightly better performance compared to the SUV ≥ 2.5 method, although there was no statistical difference between the two models. Moreover, this method requires less manual adjustment, saving time and effort, and has been recommended and utilized for prognostic evaluation [29, 30]. However, consensus has not yet been reached on standardized segmentation methods for computing DLBCL radiomic features, making it an ongoing focus of future studies [17]. Despite variations in data distribution among different PET/CT scanners, there were no statistically significant differences in predicting the prognosis of DLBCL with ENI. Therefore, in our study population, the scanner selection may not play a substantial role in affecting prognosis.

This study has certain limitations. It is a single-center retrospective study and lacks external validation, which may introduce selection bias, overestimate the model’s performance, and limit the generalizability of the model. In future research, further validation should be conducted in larger cohorts and prospective studies to determine the clinical applicability of these models in guiding treatment decisions for DLBCL with ENI.

Conclusions

This study explores the capability of baseline PET/CT radiomics to prognosticate the outcomes of DLBCL with ENI. The combined model incorporating clinical variables, conventional PET parameters, and radiomics features demonstrates high accuracy in prognostic prediction for DLBCL with ENI.