Introduction

Hepatocellular carcinoma (HCC) is very common worldwide and a major cause of mortality [1, 2]. Surgical resection is one of the potentially curative treatments available for HCC, but the recurrence rate remains high [3, 4]. The 5-year postoperative recurrence rate of HCC is approximately 70% after liver resection (LR) and 25% after liver transplantation [4]. One of the key risk factors for recurrence is microvascular invasion (MVI). MVI is defined as microscopic tumour invasion of small intrahepatic vessels, including microvessels of the portal vein or hepatic artery and small lymphatic vessels [5]. Several studies have reported that MVI is a validated, independent predictor of early recurrence and poor survival after LR [6,7,8]. In addition, an accurate preoperative estimation of MVI presence may help surgeons to choose appropriate surgical procedures for patients [9]. If LR is considered for patients with a high risk of MVI, a procedure incorporating a wide resection margin may be preferable [10]. Thus, it is important to be able to evaluate the presence of MVI preoperatively. However, MVI is currently diagnosed only after surgical resection, via histopathologic evaluation, and it remains difficult to diagnose via preoperative examinations such as computed tomography (CT), magnetic resonance (MR) imaging, serum markers and preoperative biopsy [10,11,12,13]. Thus, there is an urgent need for a quantitative means of predicting MVI preoperatively.

Recently, radiomics has become an evolving topic, driven by the hypothesis that medical imaging may provide crucial information pertaining to tumour pathophysiology [14, 15]. Some studies have shown that radiomics features may potentially be useful as diagnostic or prognostic imaging markers for tumour lesion detection, subtype classification and therapeutic response assessment [14, 16,17,18,19,20,21,22]. In addition, Lei et al [9] showed that a nomogram incorporating tumour diameter, number, capsule status, boundary, location and typical dynamic pattern on CT imaging achieved an optimal preoperative prediction of MVI in HBV-related HCC. However, no previously reported study has utilised radiomics nomograms to preoperatively predict the MVI of HCC using CT imaging. The present study is a direct extension of the work of Lei et al [9].

The aim of the current study was to develop and validate a CT-based radiomics signature to predict MVI in HCC preoperatively with a graphical nomogram that is user-friendly for clinicians, to assist in the determination of individual therapeutic strategies for patients with HCC.

Materials and methods

Patients

Ethical approval was obtained for this retrospective study, and the requirement for informed consent was waived. The entire cohort dataset was acquired from the January 2012 to December 2016 records of the institutional picture archiving and communication system (PACS; Toshiba), which was used to identify patients who had histologically confirmed HCC with MVI (MVI+) or without MVI (MVI−). Two pathologists with over 5 years of working experience independently reviewed all postoperative specimens histologically and assessed the presence or absence of MVI. All patients underwent contrast-enhanced CT (CECT) scans before LR. Figure 1 shows the patient recruitment process and the inclusion and exclusion criteria. A total of 157 patients conformed to the criteria, 134 men and 23 women (median age 57 years, age range 34–64 years), and were divided into two datasets at a ratio of 2:1 according to the time of surgery. The training dataset included 110 patients (93 men and 17 women, median age 55 years, age range 46–64 years), and 47 patients were allocated to the time-independent validation dataset (41 men and 6 women, median age 56 years, age range 34–64 years) (Table 1).

Fig. 1 Flow chart of the enrolled patients in our study

Table 1 Patient characteristics in the training and validation datasets

Baseline clinical-pathologic data were collected from our PACS medical records. Clinical factors (CFs) included age, gender, alpha-fetoprotein (AFP) level, tumour location (left, right or caudate lobe), hepatitis B surface antigen (HBsAg) status (positive or negative) and postoperative pathologic differentiation (well, moderate or poor). The WHO histologic grade system was used to determine the pathologic grade of hepatocellular carcinoma. Laboratory analysis consisted of routine blood tests performed within 1 week before surgery.

CT technology and segmentation

All CT examinations were performed with 64-row spiral CT scanners (Optima CT660 and Discovery 750 HD, GE Healthcare). A 1.2–1.5 mL/kg body weight bolus of iohexol (Omnipaque, GE Healthcare Co., Ltd.) was injected intravenously at a flow rate of 2.5 mL/s, followed by a 20-mL saline flush. Arterial phase (AP), portal venous phase (PVP) and delay phase (DP) were obtained at 35 s, 65 s and 120 s, respectively, after intravenous injection of contrast. The scanning parameters were 120 kV, 280 mAs, 0.8 s rotation time, 5 mm slice thickness and a 5-mm interval. The protocol requirements used for CECT imaging met the criteria recommended by the LI-RADS guideline [23].

Workflow

Radiomics extracts high-throughput quantitative imaging features to perform subsequent data analysis related to target clinical outcome. The workflow of a typical radiomics process consists of four steps: tumour segmentation, feature extraction, model construction and model evaluation.

Tumour segmentation

Three-dimensional manual segmentation was performed by a radiologist with 8 years of work experience, using ITK-SNAP software (http://www.itksnap.org). Regions of interest (ROIs) were drawn slice-by-slice on all AP, PVP and DP images for each patient, along the visible borders of the lesion, so as to include approximately the entire lesion volume. Where the lesion edges were blurred, we drew the maximum extent of the lesion. The final segmentation results were validated by a senior radiologist with 15 years of work experience.

Feature extraction and selection

A set of 647 radiomics features that reflected the heterogeneity of the tumour was extracted from the AP, PVP and DP images for each patient.

The extracted radiomics features could be divided into two kinds: non-textural features and textural features. Non-textural features included shape and size features and intensity features. Shape and size features captured the directly visible characteristics of the lesion, and intensity features described the histogram of the lesion. Textural features were extracted based on four textural matrices: grey level co-occurrence matrix (GLCM), grey level run-length matrix (GLRLM), grey level size zone matrix (GLSZM) and neighbourhood grey-tone difference matrix (NGTDM). We applied a wavelet filter to the original image dataset in order to extract high-dimensional features at different frequency scales. We finally obtained nine image datasets: the original image dataset and eight filtered image datasets at different frequencies. A detailed description of feature definition and extraction is provided in Supplementary Appendix 1. All feature extraction was implemented using Matlab 2014a (MathWorks).
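As an illustration of how a textural matrix turns voxel intensities into a feature, the following minimal Python sketch builds a normalised GLCM for a single horizontal offset and derives the contrast feature from it. It is not the Matlab implementation used in the study: the toy ROI, the number of grey levels, the offset and the symmetric counting are all assumptions chosen for readability.

```python
def glcm(image, levels, offset=(0, 1), symmetric=True):
    """Normalised grey level co-occurrence matrix for one pixel offset.

    image  : 2D list of integer grey levels in the range 0..levels-1
    offset : (row step, column step) between the two pixels of a pair
    """
    d_row, d_col = offset
    counts = [[0] * levels for _ in range(levels)]
    n_rows, n_cols = len(image), len(image[0])
    for r in range(n_rows):
        for c in range(n_cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < n_rows and 0 <= c2 < n_cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    # Count the pair in both directions
                    counts[j][i] += 1
    total = sum(sum(row) for row in counts)
    return [[value / total for value in row] for row in counts]


def glcm_contrast(p):
    """Contrast: co-occurrence probabilities weighted by squared level difference."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


# Hypothetical 3 x 3 ROI already quantised to three grey levels
toy_roi = [
    [0, 0, 1],
    [0, 1, 1],
    [2, 2, 2],
]
toy_glcm = glcm(toy_roi, levels=3)
toy_contrast = glcm_contrast(toy_glcm)
```

A homogeneous region concentrates the probability mass on the diagonal of the matrix and gives a contrast near zero, whereas frequent transitions between distant grey levels increase it.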

To verify the robustness of the radiomics features, we randomly selected 20 patients for test and re-test analysis and multiclinician segmentation. The intraclass correlation coefficient (ICC) and concordance correlation coefficient (CCC) were calculated to determine the stability of the features. Features with an ICC and CCC lower than 0.75 were excluded from the final feature dataset (Supplement Fig. 1). For AP, 618 out of 647 features were retained; for PVP, 621 out of 647 features; and for DP, 563 out of 647 features.
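The stability filter described above can be sketched as follows. This is a simplified Python illustration that computes only Lin's concordance correlation coefficient between two repeated measurements of each feature and keeps features at or above the 0.75 threshold; the ICC computation is omitted, and the feature names and values are hypothetical.

```python
def concordance_cc(x, y):
    """Lin's concordance correlation coefficient between two measurement series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0.0:
        # Both series are constant and identical; treated as unstable here (an assumption)
        return 0.0
    return 2.0 * cov_xy / denominator


def stable_features(first_pass, second_pass, threshold=0.75):
    """Names of features whose CCC between the two passes reaches the threshold."""
    return [
        name
        for name in first_pass
        if concordance_cc(first_pass[name], second_pass[name]) >= threshold
    ]


# Hypothetical feature values for four patients, segmented twice
first_pass = {
    "feature_a": [1.0, 2.0, 3.0, 4.0],
    "feature_b": [1.0, 2.0, 3.0, 4.0],
}
second_pass = {
    "feature_a": [1.1, 2.0, 2.9, 4.2],  # close agreement
    "feature_b": [3.0, 4.0, 5.0, 6.0],  # same trend but systematically shifted
}
kept = stable_features(first_pass, second_pass)
```

Unlike a plain correlation coefficient, the CCC penalises a systematic shift between the two passes, which is why the second toy feature is dropped even though its two series are perfectly correlated.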

We used the least absolute shrinkage and selection operator (LASSO) method to reduce the redundancy and dimensionality of the features in the training dataset. We chose the optimal feature subset as the one with the lowest cross-validated binomial deviance. Non-zero coefficients were defined as the weight for each selected feature, which indicated the correlation between the feature and MVI. The LASSO model was implemented using the glmnet package [24].
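To show why LASSO performs feature selection, the sketch below fits an L1-penalised logistic regression with a simple proximal gradient loop in plain Python. It is an illustration only and not the glmnet routine used in the study: the penalty value is fixed rather than chosen by cross-validation, and the data, step size and iteration count are assumptions.

```python
import math


def soft_threshold(value, threshold):
    """Shrink value towards zero; anything within [-threshold, threshold] becomes 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, lam, step=0.5, n_iter=2000):
    """L1-penalised logistic regression by proximal gradient descent.

    Minimises mean log-loss + lam * sum(|w_j|); the intercept is not penalised.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        # Residuals: predicted probability minus observed label
        resid = []
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            resid.append(1.0 / (1.0 + math.exp(-z)) - yi)
        grad_w = [sum(r * xi[j] for r, xi in zip(resid, X)) / n for j in range(p)]
        grad_b = sum(resid) / n
        # Gradient step followed by soft-thresholding (the L1 proximal operator)
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
    return w, b


# Hypothetical data: column 0 separates the classes, column 1 is noise
X_demo = [
    [-2.0, 0.1], [-1.0, -0.1], [-0.5, 0.1],
    [0.5, -0.1], [1.0, 0.1], [2.0, -0.1],
]
y_demo = [0, 0, 0, 1, 1, 1]
weights, intercept = fit_l1_logistic(X_demo, y_demo, lam=0.1)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
```

In this toy example the noise column receives a coefficient of exactly zero and only the informative column is selected, which mirrors how the non-zero LASSO coefficients define the retained features.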

Model construction

Three models were constructed to preoperatively predict MVI status: a radiomics model, a clinical model and a combined model. The radiomics signature was calculated using a support vector machine (SVM) with the LASSO-selected features as the input factors. The formula of the radiomics signature was as follows:

$$ \mathrm{Rad}\ \mathrm{score}=\sum \limits_{i=1}^N{C}_i\times \left({\boldsymbol{sv}}_{\boldsymbol{i}}\cdotp \boldsymbol{x}\right)-b $$

where b is the intercept, N is the number of support vectors, sv_i is the ith support vector, C_i is the coefficient of the ith support vector and x is the new data vector consisting of the values of the selected features.
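The formula can be read as a weighted sum of dot products between the new feature vector and each support vector. A minimal Python sketch follows, transcribing the expression as printed (linear kernel, intercept subtracted); the support vectors, coefficients and intercept are hypothetical values, not those of the fitted model.

```python
def rad_score(x, support_vectors, coefficients, intercept):
    """Rad score = sum_i C_i * (sv_i . x) - b, as written in the formula above."""
    total = 0.0
    for sv, c in zip(support_vectors, coefficients):
        dot = sum(sv_k * x_k for sv_k, x_k in zip(sv, x))
        total += c * dot
    return total - intercept


# Hypothetical model with two support vectors and two selected features
demo_support_vectors = [[1.0, 0.0], [0.0, 1.0]]
demo_coefficients = [0.5, -0.25]
demo_intercept = 0.1
demo_score = rad_score([2.0, 4.0], demo_support_vectors, demo_coefficients, demo_intercept)
```

Here the two terms are 0.5 × 2.0 and −0.25 × 4.0, which cancel, so the score is simply the negated intercept.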

For the construction of the clinical model, we performed multivariable logistic regression analysis of clinical parameters including age, sex, maximum tumour diameter (MTD), cirrhosis, AFP, HBsAg, pathologic grade and location. Backward step-wise variable selection was implemented with the Akaike information criterion. The clinical model was constructed by integrating the final selected clinical predictors using logistic regression modelling.

The combined models incorporated the radiomics signature and the related clinical predictors together with a logistic regression model, which predicted MVI status by synthesising both the radiomics and clinical characteristics.

Model evaluation

Receiver operating characteristic (ROC) curve analysis was utilised to illustrate the diagnostic performance of the three models constructed. The DeLong test was used to compare the areas under the curve (AUCs) of different models and determine whether they differed significantly. We also constructed a nomogram for the effective prediction model to provide a more direct way for the clinician to assess the probability of MVI. Calibration curves were used to assess the agreement between predicted and observed MVI probabilities for the nomogram in both the training and the validation datasets. Decision curve analysis was conducted to determine the clinical usefulness of the nomogram by quantifying the net benefits at different threshold probabilities in the entire cohort.
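For readers less familiar with the AUC, it equals the probability that a randomly chosen MVI+ case receives a higher model score than a randomly chosen MVI− case. The following Python sketch computes it by direct pairwise comparison; the scores and labels are hypothetical, and ties are counted as half a win by convention.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for sp in positives:
        for sn in negatives:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for two MVI- (label 0) and two MVI+ (label 1) cases
demo_auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

In the toy data, three of the four positive-negative pairs are ordered correctly, giving an AUC of 0.75; an AUC near 0.5, such as that later reported for the DP radiomics model, corresponds to ranking no better than chance.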

Statistical analysis

All statistical analysis was performed using the PASW Statistics 18.0.0 software package (SPSS). Categorical variables were expressed as numbers or percentages, and continuous variables were expressed as mean ± SD, or median. The two-sample t test was used to determine whether the values of the demographic variables differed significantly between the training and validation groups. Two-sided p values < 0.05 were considered statistically significant.

Results

Clinical characteristics

The clinical characteristics of the training and validation datasets are summarised in Table 1. There were no significant differences in age, gender, tumour location, MTD, AFP, cirrhosis, HBsAg or histologic grade between the training and validation cohorts (p = 0.124–0.873).

Based on the results of histopathology, the patients were classified into two groups: the MVI+ group and the MVI− group. There were no significant differences between the MVI+ group and the MVI− group in the training and validation datasets in terms of gender, tumour location, cirrhosis, HBsAg or pathologic grade. There were significant differences in age and MTD between the two groups in the training and validation datasets (p < 0.05). AFP differed significantly in the training dataset, but this was not confirmed in the validation dataset (Table 1).

Clinical model construction

Multivariable analysis showed that age (odds ratio (OR) 0.94; 95% confidence interval (CI) 0.90–0.98), MTD (OR 1.37; 95% CI 1.07–1.77), AFP (OR 1.52; 95% CI 0.88–2.63) and HBsAg (OR 0.43; 95% CI 0.14–1.31) were effective factors for clinical model construction.
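The odds ratios above are the exponentiated coefficients of the logistic regression model. A minimal Python sketch of this conversion is given below, assuming a Wald-type 95% interval (coefficient ± 1.96 standard errors); the study does not state how its intervals were computed, and the standard error in the example is back-calculated from the rounded values reported for age purely for illustration.

```python
import math


def odds_ratio_with_ci(beta, std_error, z=1.96):
    """Odds ratio and Wald confidence interval from a logistic regression coefficient."""
    return (
        math.exp(beta),
        math.exp(beta - z * std_error),
        math.exp(beta + z * std_error),
    )


# Coefficient and standard error implied by the reported age result (OR 0.94; 95% CI 0.90-0.98)
age_beta = math.log(0.94)
age_se = (math.log(0.98) - math.log(0.90)) / (2 * 1.96)
age_or, age_lower, age_upper = odds_ratio_with_ci(age_beta, age_se)
```

An interval that excludes 1, as for age and MTD, corresponds to a coefficient that differs significantly from zero; the intervals for AFP and HBsAg include 1.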

Radiomics signature calculation

We performed LASSO modelling on the AP, PVP and DP feature datasets in order to investigate the effectiveness of MVI discrimination via CECT (Supplement Fig. 2). For the tri-phasic CECT images, five, seven and nine features were ultimately selected as putatively effective features for AP, PVP and DP radiomics signature construction, respectively. Details of the selected features are shown in Table 2.

Table 2 Selected features for the arterial phase and portal venous phase

Performance of the six proposed models

To evaluate the diagnostic performance of the developed models, we used a time-independent validation dataset of 47 patients. In the validation dataset, the PVP radiomics model predicted MVI better (AUC 0.793) than the AP radiomics model (AUC 0.684) and the DP radiomics model (AUC 0.490) (Fig. 2). Separate bar charts of the PVP radiomics signatures are shown in Supplement Fig. 3, and box plots are shown in Fig. 3. The clinical model did not perform well in terms of discrimination, with AUCs of 0.734 in the training dataset and 0.761 in the validation dataset. After combining the PVP radiomics signature with the effective CF, the predictive performance improved, with AUCs of 0.835 in the training dataset and 0.801 in the validation dataset. For AP and DP, the combined model showed a significant improvement over the clinical model alone in the training dataset (AUC 0.703 and 0.798 vs. 0.734); however, the opposite was true in the validation dataset (AUC 0.684 and 0.490 vs. 0.761). The PVP radiomics signature combined with the effective CF performed better than the tri-phasic radiomics signature combined with the effective CF in the validation dataset (AUC 0.801 vs. 0.680) (Table 3, Fig. 2). The best-performing model (PVP plus CF) was then compared with the CF model alone. The DeLong test showed a significant difference in the training dataset (p = 0.005). In the validation dataset, the difference was not significant (p = 0.281), although adding the PVP signature still yielded a higher AUC than the CF model alone.

Fig. 2 a, b The receiver operating characteristic (ROC) curves of the radiomics signature-based model, the clinical model and the combined model on the portal venous phase (PVP). CF, clinical factor

Fig. 3 a, b The boxplots for the radiomics signature in the training and validation datasets in the portal venous phase, categorised by MVI+ and MVI− groups

Table 3 Predictive performance of the proposed models

Nomogram construction and validation

As the combined model incorporating the PVP radiomics signature and the effective CF had the best predictive performance, we built a nomogram for the graphical representation of the predicted outcome (Fig. 4). Good calibration was observed with both the training and the validation datasets, with respective C-indexes of 0.827 and 0.820 (Fig. 5). The Hosmer-Lemeshow test yielded no significant difference between the predictive calibration curve and the ideal curve for MVI prediction with both the training and the validation datasets (p = 0.371 and 0.094, respectively). In the current study, the threshold probability of the decision curve was 4% and the corresponding net benefit was 0.357 (Fig. 6).
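The net benefit plotted in a decision curve weighs true positives against false positives at a given threshold probability. A minimal Python sketch of the standard formula, net benefit = TP/n − (FP/n) × pt/(1 − pt), is shown below together with the treat-all reference strategy. The predicted probabilities and labels are hypothetical, and classifying a patient as positive when the predicted probability is at or above the threshold is an assumed convention.

```python
def net_benefit(probabilities, labels, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(labels)
    true_pos = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # harm of a false positive relative to a true positive
    return true_pos / n - (false_pos / n) * weight


def net_benefit_treat_all(labels, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical predicted MVI probabilities and true labels (1 = MVI+)
demo_probs = [0.9, 0.8, 0.3, 0.2]
demo_labels = [1, 0, 1, 0]
demo_nb = net_benefit(demo_probs, demo_labels, threshold=0.25)
demo_nb_all = net_benefit_treat_all(demo_labels, threshold=0.25)
```

The treat-none strategy always has a net benefit of zero, so a model is considered useful at thresholds where its curve lies above both reference strategies, as in the toy example here.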

Fig. 4 The nomogram obtained by combining the effective clinical factor (CF) and PVP radiomics signature

Fig. 5 Calibration curves of the nomogram on the training (a) and validation (b) datasets. The y-axis represents the actual microvascular invasion (MVI) rate, the x-axis represents the predicted MVI probability and the diagonal dashed line indicates the ideal prediction by a perfect model

Fig. 6 Decision curve analysis for the combined nomogram in the validation dataset. The y-axis represents the net benefit, and the x-axis represents the threshold probability. In our study, the threshold probability of the decision curve is 4% and the corresponding net benefit is 0.357. This indicates that the nomogram provides a greater benefit than either the treat-all or the treat-none strategy when the threshold probability is > 4%

Discussion

We developed and validated a radiomics signature-based nomogram for preoperative and individualised prediction of MVI in patients with HCC. In previous studies, some researchers [9, 25] have analysed subjective imaging characteristics determined by radiologists and combined them with clinical-pathologic or gene expression factors to preoperatively predict the MVI of HCC. In the current study, by contrast, radiomics analysis was applied to quantitatively extract CECT imaging features to assess subtle textural variation within tumour lesions, which may contain holistic information related to tumour physiology or microenvironment. Furthermore, we developed an easy-to-use nomogram integrating the proposed PVP radiomics signature and CF to facilitate the preoperative individualised prediction of MVI. The predictive calibration curves of the training and validation datasets demonstrated agreement with the ideal curve. Decision curve analysis showed that the radiomics nomogram was clinically useful in the current study.

Previous studies incorporating radiomics nomograms have yielded many results pertaining to the predictive values of tumour diagnoses and therapy effectiveness [26,27,28,29]. There is also growing interest in multimarker analysis [26,27,28,29]. In the current study, the combined radiomics-CF nomogram for predicting MVI was superior to both the radiomics signature and the CF nomogram alone, with a higher C-index and better calibration. Furthermore, the results implied that the combined PVP radiomics-CF nomogram was robust with regard to the prediction of MVI in HCC. The results of the study emphasised the great importance of the radiomics signature developed for MVI prediction according to the weights of the nomogram.

In the current study, the PVP radiomics signature performed better than the AP, DP or any of the combined radiomics signatures (Table 3). In a previous study, Banerjee et al [25] reported that radiogenomic venous invasion (RVI), as a non-invasive radiogenomic marker, could accurately and preoperatively predict MVI in HCC. In Banerjee et al's [25] study, all feature scoring was also based exclusively on PVP images. However, unlike in Banerjee et al's work, our CECT-based PVP radiomics features were extracted quantitatively and are therefore more objective than PVP radiological characteristics assessed visually by clinicians [9, 30, 31]. Furthermore, the predictive accuracy of our proposed model was higher than that of previously reported studies [9, 30,31,32]. In addition, according to the pathological definition of MVI [5], it is mainly detected in the small branches of the portal vein and may also be found in the small branches of the hepatic artery and/or within the small lymphatic vessels of the liver [33]. This offers a possible explanation as to why the PVP radiomics model reflected MVI better than the AP or DP radiomics model. In addition, the PVP radiomics signature was simpler than the other eight radiomics signatures, which could make this approach easier to apply in clinical practice.

The proposed radiomics features are categorised into non-textural and textural features, based on statistical methods [34]. The final predictive model demonstrated that the non-textural radiomics features 'fos_skewness' and 'fos_minimum' of the PVP were significantly related to MVI. The fos_skewness measures the asymmetry of the distribution of values about the mean grey value, and the fos_minimum reflects the lowest intensity within the tumour region. The results showed that fos_skewness was positively and fos_minimum was negatively associated with the occurrence of MVI. The other significant radiomics features of the PVP in the current study were textural features. The textural radiomics parameters could not be identified via visual inspection, but they reflected heterogeneity in the tumour. Tumour heterogeneity may be difficult to identify and quantify via traditional imaging tools, the subjective assessment of images or random sampling biopsy [35], yet it has been shown to be related to cancer pathophysiology. Although radiomics is not a new tool, it may be a useful imaging marker that improves the assessment and quantification of tumour spatial heterogeneity [36]. However, the radiomics features are extracted and calculated by the computer. Explaining the associations between the radiomics features, especially the higher-order radiomics features, and pathological manifestations is very challenging [37]. On the one hand, the pathophysiologic processes involve multiple interacting components; on the other hand, the information obtained from computer-based radiologic image analysis is far beyond that which is attainable via visual inspection.
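The two first-order features named above are simple histogram statistics. The Python sketch below computes them for a list of ROI voxel intensities, assuming the common moment-based definition of skewness (third central moment divided by the 1.5th power of the second); the exact definitions used in the study are given in its Supplementary Appendix 1 and may differ, and the intensity values are hypothetical.

```python
def fos_minimum(intensities):
    """Lowest voxel intensity within the region of interest."""
    return min(intensities)


def fos_skewness(intensities):
    """Moment-based skewness: asymmetry of the intensity distribution about its mean."""
    n = len(intensities)
    mean = sum(intensities) / n
    m2 = sum((v - mean) ** 2 for v in intensities) / n
    m3 = sum((v - mean) ** 3 for v in intensities) / n
    if m2 == 0.0:
        return 0.0  # constant region: no asymmetry
    return m3 / m2 ** 1.5


# Hypothetical ROI intensities (HU): mostly low values with a bright tail
demo_roi = [40.0, 42.0, 41.0, 43.0, 95.0]
demo_skewness = fos_skewness(demo_roi)
demo_minimum = fos_minimum(demo_roi)
```

A positive skewness indicates a histogram with a tail of bright voxels above the bulk of the lesion, while a symmetric histogram gives a value of zero.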

In the current study, we developed a clinical model incorporating preoperative age, MTD, HBsAg and AFP. The clinical model exhibited good predictive efficiency for the MVI of HCC, especially when combined with a PVP radiomics model. Furthermore, preoperative AFP and MTD were positively correlated with MVI. As MVI is a common event in advanced HCC [33], an estimate during early T-stage HCC has specific clinical significance. In addition, Lei et al [9] reported that large tumour diameter was one of the preoperative factors associated with MVI. Thus, we selected early T-stage tumours with MTDs less than 6 cm. However, MTD was still one of the key factors in the preoperative estimation of MVI. Some previous studies have reported that AFP levels were significantly higher in patients with MVI [9, 38]. It has also been reported that AFP levels were positively associated with tumour size [31, 39]. In addition, preoperative age and HBsAg were correlated with the MVI of HCC. Fundamental research has revealed that the HBV-initiated tumourigenic process can play an important role in the development of MVI in HCC [40]. Recently, Lei et al [9] and Wei et al [41] reported that high HBV infection and active HBV replication were associated with the development of MVI in HCC patients. The results of the current study are consistent with those findings. The age of the MVI+ group was significantly lower than that of the MVI− group in both the training and the validation datasets. The predictive value of age with regard to MVI of HCC remains unknown and requires further investigation.

The limitations of the current study include the relatively small sample size, the entirely retrospective design, which means that the findings need to be validated in prospective studies, and the lack of multicentre validation. MVI grade was also not taken into account in the MVI+ group, and the tumour signs on CECT were not analysed. In addition, 120 s after the contrast injection may be too late for a delay phase.

In conclusion, the radiomics signature identified may be useful as an imaging marker for predicting MVI of HCC preoperatively. Nomograms combining PVP radiomics and CF may prove useful as a tool to guide personalised treatment, although this would require further external validation prior to widespread application in clinical practice.