Introduction

Hepatectomy is the optimal and most acceptable treatment strategy for patients with early-stage hepatocellular carcinoma (HCC), especially for patients with good liver function and solitary tumor less than 5 cm [1, 2]. However, the prognosis of patients with HCC remains unsatisfactory, with approximately 50–70% of patients with HCC relapsing within 5 years after hepatectomy [3, 4]. Early recurrence (ER), which often refers to recurrence or metastasis occurring within 2 years after operation, accounts for approximately 70% of the overall postoperative recurrence rate [1, 5, 6]. The clonal origin of the early recurrent tumors is analogous to that of preoperative primary tumors and is assumed to be the result of latent primary tumor metastasis [7, 8]. Thus, investigation of potential prognostic biomarkers associated with ER is critical, as these could help clinicians identify patients at high risk of recurrence and thus facilitate the implementation of individualized treatment.

Multiparametric magnetic resonance imaging (MRI) is the ideal technique for detecting and diagnosing HCC because of its excellent soft tissue contrast. Multiple investigations have shown a connection between prognosis and preoperative MRI characteristics in patients with HCC [9,10,11]. Notably, HCC is a complex neoplastic lesion that develops from the multistep carcinogenesis of cirrhotic nodules, during which the blood supply is dramatically altered [12]; and the greater the heterogeneity of the tumor, the more aggressive it is and the worse the prognosis [13].

Histogram analysis is an emerging measurement method that provides accurate information about the tumor microenvironment and heterogeneity, allowing for the extraction of more objective quantitative data from medical images that cannot be determined from the simple visual assessment of MRI characteristics [14, 15]. Histogram parameters, known as first-order features, are utilized to evaluate the distribution of voxel intensity in tumor tissues. Compared with higher-order radiomics features, histogram parameters have the advantages of simplicity and high reproducibility. However, the predictive ability of histogram analysis for HCC prognosis and the optimal time window for analysis remain unclear.

The main objective of this study was to evaluate histogram parameters and clinical-radiological (CR) characteristics as prognostic predictors in patients with solitary HCC ≤ 5 cm and to determine the optimal time window for histogram analysis.

Methods

This retrospective single-center study was approved by the Institutional Review Board, and the requirement for informed consent was waived owing to the retrospective study design.

Study participants

Between January 2012 and December 2017, 1462 patients with HCC who underwent preoperative MRI and surgical resection in our hospital were recruited. The current study included patients who met the following criteria: (a) a single HCC ≤ 5 cm, (b) no tumor-related treatment before curative resection, (c) MRI quality adequate for analysis, (d) complete follow-up data and clinical information, and (e) no history of other malignancies. The exclusion criteria were as follows: (a) multiple tumors or a single tumor > 5 cm, (b) preoperative tumor-related treatment, (c) unavailable clinical-pathological information or loss to follow-up, and (d) history of other malignancies. Thus, our final study population comprised 151 patients with HCC (130 males and 21 females; mean age, 54.46 ± 9.09 years; age range, 27–80 years) (Fig. 1). The clinical variables and laboratory indices involved in the study were age, sex, hepatitis status, cirrhosis, serum alpha-fetoprotein level, alanine transaminase, aspartate aminotransferase, gamma-glutamyl transpeptidase, alkaline phosphatase, total bilirubin, and direct bilirubin. The albumin-bilirubin (ALBI) grade was calculated as previously described [16]. Pathological information, including the Edmondson-Steiner grade and microvascular invasion (MVI) status, was collected from pathological reports.
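The ALBI grade mentioned above can be illustrated with a minimal sketch. It assumes the commonly used definition of the ALBI score (taken here to be the one referenced in [16]), with total bilirubin in µmol/L and albumin in g/L. The function names and unit conventions are illustrative assumptions, not the study's own implementation.

```python
import math


def albi_score(bilirubin_umol_l: float, albumin_g_l: float) -> float:
    """ALBI score from total bilirubin (umol/L) and albumin (g/L).

    Assumed formula: 0.66 * log10(bilirubin) - 0.085 * albumin.
    """
    return 0.66 * math.log10(bilirubin_umol_l) - 0.085 * albumin_g_l


def albi_grade(score: float) -> int:
    """Map an ALBI score to grade 1, 2, or 3 using the usual cut-offs."""
    if score <= -2.60:
        return 1
    if score <= -1.39:
        return 2
    return 3


# Hypothetical patient: bilirubin 10 umol/L, albumin 40 g/L
score = albi_score(10.0, 40.0)
grade = albi_grade(score)
```

A lower (more negative) score corresponds to better liver function, so grade 1 is the most favorable category.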

Fig. 1 Flow chart of patient enrollment

MRI protocol

All patients underwent preoperative multiparametric MRI examination using two 3.0-T scanners (Discovery MR 750 and Signa HDx, GE Medical Systems). The baseline sequences consisted of in-phase and opposed-phase axial T1-weighted images, axial fat-suppressed T2-weighted images (T2WI-FS), diffusion-weighted images (DWI, b values = 0 and 800 s/mm²), and fat-suppressed T1-weighted dynamic contrast-enhanced images. Apparent diffusion coefficient (ADC) maps were generated from the DWI images using the function tool ADC software. Dynamic contrast-enhanced MRI was acquired using a three-dimensional spoiled gradient echo liver acceleration volume acquisition sequence at 20–25 s (arterial phase, AP), 60–70 s (portal venous phase, PVP), and 100–120 s (delayed phase, DP) after injection of the contrast agent. For dynamic contrast-enhanced images, 0.2 mL/kg of gadodiamide (Omniscan 0.5 mmol/mL; GE Healthcare) was injected via an autoinjector at 2.0 mL/s, followed by a 20 mL saline flush. Detailed acquisition information is presented in Table 1.

Table 1 MRI parameters

Qualitative and quantitative analyses

Two abdominal radiologists (with 5 years and 9 years of abdominal MRI experience) performed a retrospective review of all imaging features. Both radiologists were blinded to patient information and prognosis. When the two evaluators disagreed, a third, more senior radiologist (with 20 years of abdominal MRI experience) was consulted, and an agreement was reached. The following imaging characteristics were analyzed in accordance with the Liver Imaging Reporting and Data System (LI-RADS) version 2018: non-rim arterial phase hyperenhancement (APHE), non-peripheral washout, enhancing capsule, mosaic architecture, fat in mass, blood products in mass, corona enhancement, restricted diffusion, and mild-moderate T2 hyperintensity. Tumor size was defined as the maximum tumor diameter measured on transverse or coronal images. Two non-LI-RADS imaging features, non-smooth tumor margin and incomplete tumor capsule, were also analyzed.

Tumor segmentation and parameters extraction

One radiologist (with 5 years of experience in abdominal radiology) manually drew the volumes of interest (VOI) along the tumor boundary on T2WI-FS images, ADC maps, and dynamic contrast-enhanced images (including AP, PVP, and DP images) using the ITK-SNAP software (v.3.8.0; www.itksnap.org, open-source software) (Supplementary Figure 1). Another radiologist (with 9 years of experience in abdominal radiology) independently segmented the VOI of all lesions to assess the intraclass correlation coefficient (ICC). Following VOI segmentation, eight histogram parameters, namely (a) Skewness, (b) Entropy, (c) Maximum, (d) Mean Absolute Deviation (MAD), (e) Minimum, (f) Uniformity, (g) Variance, and (h) Kurtosis, were extracted from each sequence using the Artificial Intelligence Kit software (version 3.3.0, GE Healthcare), which is based on the open-source Pyradiomics Python package.
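The eight histogram parameters listed above can be approximated with a short NumPy sketch. This is not the Artificial Intelligence Kit or Pyradiomics code itself. It follows the usual first-order definitions, and the function name, the fixed bin width of 25, and the use of non-excess kurtosis are assumptions made for illustration.

```python
import numpy as np


def first_order_features(voxels, bin_width=25.0):
    """Approximate first-order (histogram) features of the voxels in a VOI.

    `voxels` holds the intensities inside the segmented tumor volume.
    `bin_width` is an assumed discretization setting, not the study's.
    """
    x = np.asarray(voxels, dtype=float).ravel()
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)  # variance (second central moment)
    m3 = np.mean(centered ** 3)
    m4 = np.mean(centered ** 4)

    # Discretize intensities into fixed-width bins for Entropy and Uniformity
    low = np.floor(x.min() / bin_width) * bin_width
    bins = np.floor((x - low) / bin_width).astype(int)
    counts = np.bincount(bins)
    p = counts[counts > 0] / x.size

    return {
        "Skewness": float(m3 / m2 ** 1.5) if m2 > 0 else 0.0,
        "Entropy": float(-np.sum(p * np.log2(p))),
        "Maximum": float(x.max()),
        "MAD": float(np.mean(np.abs(centered))),
        "Minimum": float(x.min()),
        "Uniformity": float(np.sum(p ** 2)),
        "Variance": float(m2),
        # Non-excess kurtosis (a normal distribution gives 3, not 0)
        "Kurtosis": float(m4 / m2 ** 2) if m2 > 0 else 0.0,
    }


# Toy example with four hypothetical voxel intensities
features = first_order_features([0, 25, 50, 75])
```

In this toy example each intensity falls into its own bin, so Entropy is 2 bits and Uniformity is 0.25. A more heterogeneous lesion would generally show higher Entropy and lower Uniformity.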

Follow-up surveillance

All patients underwent routine postoperative follow-up every 3–6 months in the first 2 years and at least once a year thereafter, including serum alpha-fetoprotein tests and imaging (ultrasound, enhanced computed tomography, or MRI). ER was defined as the occurrence of typical imaging features of intra/extrahepatic recurrence or pathologically confirmed metastases within 2 years after surgery. Recurrence-free survival (RFS) was defined as the period between the date of curative resection and the date of recurrence or the last follow-up. All the patients were monitored until June 2022.

Statistical analysis

Normally distributed data were expressed as mean ± standard deviation and compared using an independent-samples t-test. Data with a skewed distribution were expressed as medians and compared using the Mann–Whitney U test. The clinical characteristics and imaging features with p values < 0.1 in the univariate analysis were then entered into the multivariate logistic regression analysis to identify independent predictors of ER. Only histogram features with ICCs above 0.75 were considered stable and retained. The diagnostic performance of the different models was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), accuracy, sensitivity, and specificity. The RFS rates of the low- and high-risk groups were compared using Kaplan–Meier survival curves with log-rank tests. Two-tailed p values < 0.05 were considered statistically significant. Statistical analyses were performed using SPSS (version 25.0, IBM), R software (version 3.5.1), and Python (version 3.5.6).
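The performance measures named above can be sketched in a few lines of NumPy. The function names are hypothetical, and the rule that a score at or above the cut-off counts as a positive prediction is an assumption. The AUC is computed here through its rank interpretation, that is, the probability that a randomly chosen ER patient receives a higher score than a randomly chosen non-ER patient, which is equivalent to the area under the ROC curve.

```python
import numpy as np


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as half a correct ranking.
    """
    y = np.asarray(labels).astype(bool)
    s = np.asarray(scores, dtype=float)
    pos, neg = s[y], s[~y]
    diff = pos[:, None] - neg[None, :]
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (pos.size * neg.size))


def threshold_metrics(labels, scores, cutoff):
    """Sensitivity, specificity, and accuracy at a given score cut-off."""
    y = np.asarray(labels).astype(bool)
    pred = np.asarray(scores, dtype=float) >= cutoff  # assumed decision rule
    tp = np.sum(pred & y)
    tn = np.sum(~pred & ~y)
    fp = np.sum(pred & ~y)
    fn = np.sum(~pred & y)
    return {
        "sensitivity": float(tp / (tp + fn)),
        "specificity": float(tn / (tn + fp)),
        "accuracy": float((tp + tn) / y.size),
    }


# Toy data: 1 = early recurrence, 0 = no early recurrence
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
auc = roc_auc(labels, scores)
metrics = threshold_metrics(labels, scores, cutoff=0.5)
```

The pairwise comparison uses memory proportional to the product of the two group sizes, which is negligible for cohorts of this size.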

Results

Patient characteristics

The final cohort comprised 151 patients (130 males and 21 females; mean age, 54.46 ± 9.09 years; age range, 27–80 years) with solitary HCC ≤ 5 cm, and all patients were randomly allocated to either the training cohort (n = 105, 90 males and 15 females) or the validation cohort (n = 46, 40 males and 6 females). CR variables did not differ significantly between the training and validation cohorts (all p > 0.05), with the exception of corona enhancement (p = 0.046). Of the patients included in this study, 54 experienced ER and 85 had relapsed by the end of follow-up. In the training cohort, 38 patients experienced ER and 59 had relapsed by the end of follow-up. In the validation cohort, 16 patients experienced ER and 26 had relapsed by the end of follow-up. The median RFS was 43.87 months (range, 1.43–119.83 months) in the training cohort and 46.19 months (range, 3.5–114.7 months) in the validation cohort (Table 2).

Table 2 Baseline characteristics of patients in the training and validation cohorts

Determination of optimal histogram analysis

The ICCs of inter-observer reproducibility for the histogram parameters are listed in Supplementary Table 1. In the whole population, the scatter plot showed that Kurtosis from the T2WI-FS images, Skewness from the AP images, and Entropy, Maximum, MAD, Minimum, Uniformity, and Variance from the PVP and DP images were significant risk factors for ER (Fig. 2). In particular, the histogram parameters from the PVP and DP images had the greatest, and similar, impacts. In the training cohort, univariate analysis showed that the histogram parameters significantly correlated with ER were Entropy, Maximum, MAD, Minimum, Uniformity, and Variance from the PVP and DP images. Multivariate logistic regression retained two parameters from the DP images in the histogram model: MAD (p = 0.002, odds ratio [OR] 0.776, 95% confidence interval [CI] 0.288–1.265) and Minimum (p = 0.061, OR −0.425, 95% CI −0.869 to 0.020). The optimal cut-off values for MAD and Minimum were 7.421 and 6.782, respectively. The AUC, accuracy, sensitivity, and specificity of the histogram model for predicting ER were 0.716 (95% CI 0.635–0.795), 0.619, 94.7%, and 43.3% in the training cohort, and 0.694 (95% CI 0.560–0.819), 0.630, 81.2%, and 53.3% in the validation cohort, respectively (Fig. 3A and B).
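As a consistency check on the figures above, accuracy can be recovered from sensitivity, specificity, and the number of patients with and without ER. The sketch below uses the reported cohort sizes (105 and 46) and ER counts (38 and 16). The non-ER counts are derived by subtraction, and the function name is hypothetical.

```python
def accuracy_from_rates(sensitivity, specificity, n_positive, n_negative):
    """Accuracy implied by sensitivity and specificity for given class sizes."""
    correct = sensitivity * n_positive + specificity * n_negative
    return correct / (n_positive + n_negative)


# Histogram model, training cohort: 38 ER and 105 - 38 = 67 non-ER patients
training = accuracy_from_rates(0.947, 0.433, 38, 105 - 38)

# Histogram model, validation cohort: 16 ER and 46 - 16 = 30 non-ER patients
validation = accuracy_from_rates(0.812, 0.533, 16, 46 - 16)

print(f"{training:.3f} {validation:.3f}")  # prints: 0.619 0.630
```

Both values agree with the reported accuracies, which suggests that the sensitivity, specificity, and accuracy in each cohort were computed at the same operating point.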

Fig. 2 Scatter plot showing the effect of histogram parameters from different sequences on patient prognosis in the univariate analysis

Fig. 3 The distribution and performance of three different models. A ROC curve of the histogram model in the training cohort. B ROC curve of the histogram model in the validation cohort. C ROC curve of the CR model in the training cohort. D ROC curve of the CR model in the validation cohort. E ROC curve of the CRH model in the training cohort. F ROC curve of the CRH model in the validation cohort. CR Clinical-radiological, CRH Clinical-radiological-histogram, ROC Receiver operating characteristic

CR model construction

Univariate analysis showed that the CR variables associated with ER at p < 0.1 were ALBI grade (p = 0.003), tumor size (p = 0.097), MVI (p = 0.029), age (p = 0.014), non-rim APHE (p = 0.066), and alanine transaminase level (p = 0.056). Multivariate logistic regression analysis retained ALBI grade (p = 0.040, OR 4.018, 95% CI 1.064–15.177), MVI (p = 0.068, OR 2.334, 95% CI 0.940–5.791), and tumor size (p = 0.026, OR 1.622, 95% CI 1.059–2.484) in the CR model (Table 3). The AUC, accuracy, sensitivity, and specificity of the CR model for predicting ER were 0.690 (95% CI 0.594–0.791), 0.714, 44.7%, and 86.6% in the training cohort, and 0.666 (95% CI 0.514–0.804), 0.696, 43.8%, and 83.3% in the validation cohort, respectively (Fig. 3C and D).

Table 3 Univariate and multivariate analyses of clinical-radiological variables associated with early recurrence of HCC patients in the training cohort

Prognostic potential of the CRH model

The final CRH model that predicted ER for HCC included MAD and Minimum from the histogram analysis of the DP images, ALBI grade, MVI, and tumor size. For ER prediction, the AUC, accuracy, sensitivity, and specificity of the CRH model were 0.747 (95% CI 0.664–0.827), 0.667, 76.3%, and 61.2% in the training cohort and 0.765 (95% CI 0.641–0.871), 0.630, 62.5%, and 63.3% in the validation cohort, respectively (Table 4). The ROC curves of the CRH model in the training and validation cohorts are shown in Fig. 3E and F. The nomogram and calibration curves for the CRH model are shown in Fig. 4.

Table 4 The AUC, accuracy, sensitivity, and specificity of different models in the validation cohort

Fig. 4 The nomogram and calibration curves for predicting ER in patients with HCC. A The nomogram was developed by combining histogram parameters (Histogram-score), ALBI grade, MVI, and tumor size in the training cohort. B Calibration curve of the nomogram in the training cohort. C Calibration curve of the nomogram in the validation cohort. ER Early recurrence, HCC Hepatocellular carcinoma, ALBI Albumin-bilirubin, MVI Microvascular invasion

Using a critical value of 0.336, the training cohort was classified into high- and low-risk groups for RFS prediction. The RFS rate was significantly higher in the low-risk group than in the high-risk group, as confirmed in the validation cohort (training cohort, p < 0.001; validation cohort, p = 0.002) (Fig. 5).
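The risk stratification described above can be sketched as follows. The cut-off of 0.336 comes from the text, whereas the function names and the rule that a score at or above the cut-off is high risk are assumptions. The survival estimate is a plain Kaplan–Meier product-limit calculation, and the log-rank test is not reproduced here.

```python
import numpy as np


def split_by_risk(scores, cutoff=0.336):
    """Boolean mask that is True for patients assumed to be high risk."""
    return np.asarray(scores, dtype=float) >= cutoff


def kaplan_meier(times, events):
    """Kaplan-Meier estimate of recurrence-free survival.

    `times` are follow-up durations (for example, in months); `events`
    is 1 when recurrence was observed and 0 when the patient was censored.
    Returns the distinct event times and the survival probability just
    after each of them.
    """
    t = np.asarray(times, dtype=float)
    e = np.asarray(events).astype(bool)
    event_times = np.unique(t[e])
    survival = []
    s = 1.0
    for u in event_times:
        at_risk = np.sum(t >= u)           # patients still under observation
        recurred = np.sum((t == u) & e)    # recurrences at this time point
        s *= 1.0 - recurred / at_risk
        survival.append(s)
    return event_times, np.array(survival)


# Toy example with hypothetical follow-up data
event_times, survival = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
high_risk = split_by_risk([0.20, 0.336, 0.50])
```

In practice the estimator would be applied separately to the high-risk and low-risk groups, and the two resulting curves compared.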

Fig. 5 Kaplan–Meier curves for recurrence-free survival according to the cut-point in the training cohort (left) and validation cohort (right)

Discussion

Our results revealed that histogram parameters, including MAD and Minimum from the histogram analysis of DP images, were the most valuable predictors of ER and poor RFS in patients with solitary HCC ≤ 5 cm. The prediction nomogram created by integrating the optimal histogram parameters and CR risk factors (ALBI grade, MVI, and tumor size) effectively stratified patients by prognosis.

Artificial intelligence is widely used in both clinical and scientific research. However, compared with complex artificial intelligence models, histogram analysis has the advantages of simplicity and high reproducibility, and eliminates the need for complex post-processing techniques. Several studies have achieved good results in assessing the prognosis of various cancers using MRI-derived histogram analysis [17,18,19]. Zhang et al. [20] reported that Entropy from the histogram analysis of AP images was the most valuable predictor of ER in patients with HCC. Moreover, the Mean derived from the T1-weighted images and Entropy derived from PVP images were found to be the most effective predictors of MVI [21]. In the present study, we found that MAD and Minimum from the histogram analysis of DP images were independent predictors of ER. A possible explanation is that tumor heterogeneity is depicted more clearly on DP images, which therefore contain more heterogeneity information that can be extracted for prognostic analysis. Based on the results of the current study, histogram analysis of DP images has prognostic value, and subsequent studies with larger sample sizes are needed to confirm this conclusion.

Our findings differ from those of some earlier studies, which demonstrated that histogram analysis of ADC maps is an important imaging marker for predicting tumor aggressiveness and survival in patients with HCC following curative resection [22,23,24]. For instance, Lee et al. [23] and Nakanishi et al. [24] found that the minimum ADC (ADCmin) is a reliable indicator of HCC histological grade and a predictor of ER. However, the current study indicated that the histogram parameters of ADC maps were not risk factors for ER. One possible explanation is that different sequences have different predictive values in assessing patient prognosis. DWI is a noninvasive MRI technique that can quantify the diffusion of water molecules in biological tissues, which is reflected in the ADC values. However, the heterogeneous characteristics of the tumor may be described more comprehensively and accurately using dynamic contrast-enhanced images than using ADC maps. As a result, in the current multiparametric investigation, the histogram parameters derived from ADC maps were ineffective in predicting ER.

ALBI grade, a simple and effective biomarker that correlates liver function and survival outcomes based solely on serum total bilirubin and albumin levels, is one of the most commonly used indicators of HCC [16]. Significant differences in ALBI grade occur among patients with different prognoses. In our study, baseline ALBI grade was associated with ER and poor RFS, suggesting that baseline liver function could be a good predictor of patient outcomes, which is in accordance with previous studies [25, 26].

As a critical marker of tumor aggressiveness, MVI is a well-established prognostic factor for ER and shorter long-term survival in patients with HCC after curative resection. It is regarded as a crucial pathway for intrahepatic tumor spread and an early indicator of tumor cell dissemination through peritumoral vessels. In the current study, the presence of MVI was associated with postoperative ER and shorter RFS in patients with a single tumor ≤ 5 cm, in line with previous studies [27,28,29]. In addition, our results suggested a relationship between tumor size and prognosis in patients with HCC. The larger the tumor, the greater the heterogeneity, which may reflect underlying poor cell differentiation, necrosis, and tumor angiogenesis.

In addition, the standardized imaging features of LI-RADS version 2018 were evaluated. However, the results of our study showed that preoperative imaging features were not independently related to the prognosis of patients with HCC. Although non-rim APHE was a prognostic risk factor in the univariate analysis, it was not retained as a risk factor in the multivariate analysis. A possible reason is that quantitative histogram analysis predicts patient prognosis better than qualitative imaging features do.

This study has several limitations. First, selection bias seems unavoidable, given its retrospective design. Second, the number of patients was relatively small, and all patients were enrolled from a single center. Larger multicenter prospective studies are required to confirm our findings. Third, the findings of this study may not be applicable to MRI performed with liver-specific contrast agents, and we plan to conduct quantitative and qualitative analyses of the hepatobiliary phase in future work. Fourth, the inclusion and exclusion criteria were strict, and all patients had a solitary lesion ≤ 5 cm. Additional studies involving patients with multiple tumors or a single tumor > 5 cm in diameter are recommended. Finally, the postoperative follow-up time was insufficient to accurately determine the prognostic predictors of overall survival in patients with HCC. A longer follow-up period is recommended for future studies.

In conclusion, the CRH model constructed by incorporating the MAD and Minimum from the histogram analysis of DP images, ALBI grade, tumor size, and MVI had a significant predictive value for ER and RFS in patients with single HCC ≤ 5 cm. In particular, the DP provided the optimal time window for histogram analysis of the MR images. The prognostic nomogram can effectively stratify the prognostic risk degree and assist clinicians in the surveillance and treatment of patients at high risk of recurrence.