Introduction

Liver resection and transplantation are considered first-line curative treatments for hepatocellular carcinoma (HCC). Unfortunately, approximately 50% of patients experience recurrence within 2 years after curative hepatectomy [1, 2]. Microvascular invasion (MVI) is a well-known predictor of early recurrence of HCC [3, 4]. MVI is defined as the presence of tumour cells within a vascular space lined by endothelium that is visible only on microscopy, without macroscopic tumour invasion [5]. The presence of MVI indicates aggressive behaviour of HCC and poor survival outcomes [3].

The diagnosis of MVI is crucial for surgical decision making. Absence of MVI is required for liver transplantation in HCC patients according to the new Milan criteria [6, 7]. If MVI can be accurately predicted preoperatively, anatomic resection with an expanded resection margin should be performed even for a small tumour, whereas tumour resection alone might be sufficient for a large tumour without MVI [8]. Beyond surgical decision making, additional adjuvant therapies after surgery are preferred for patients diagnosed with MVI preoperatively [9]. Thus, accurate preoperative prediction of MVI plays an important role in planning the treatment of HCC.

Currently, MVI can be diagnosed only on postoperative histologic examination. Owing to the high heterogeneity of HCC, no stable serological or genomic predictors of MVI have been found so far [10]. In recent years, several studies have found that some imaging features on computed tomography (CT) and magnetic resonance imaging (MRI) are predictive of MVI [11, 12]. Some studies showed that Gd-EOB-DTPA-enhanced MRI has high value in predicting the presence of MVI in HCC [11, 13]. In the study of Lee et al [11], three MR imaging features, namely arterial peritumoural enhancement, non-smooth tumour margin, and peritumoural hypointensity on hepatobiliary phase (HBP), were independently associated with MVI. However, in all these studies, imaging features were extracted visually. Although visual assessment provides some valuable information, the limited range of grey levels that can be distinguished by eye restricts its ability to reflect microscopic image features. Radiomics is a rapidly growing discipline based on quantitative image analysis that characterises the texture and morphology of tumours from grey values. By extracting quantitative image features through a computer algorithm, it can obtain two-dimensional as well as high-dimensional image features, and thus far more features than manual extraction by experts [14]. Several studies on skin cancer, glioma, and breast cancer showed that radiomics features not visible to the eye were closely related to pathological microscopic structures. These features include texture features (a series of matrix transformations, such as grey-level co-occurrence matrices and run-length matrices, that reflect the high-order information of the region of interest (ROI)), filter-transformed features (target features obtained through transformations such as LOG transformation and Gaussian transformation), and wavelet features (the characteristics of the ROI at different resolutions and angles) [15,16,17].

Most previous radiomics studies focused on the intratumoural region, whereas the proximal liver parenchyma (≤ 2 cm from the tumour margin) holds the most valuable information for diagnosing MVI. Recently, a breast cancer radiomics study [17] extracted radiomics features from the intratumoural and peritumoural regions simultaneously and found that the combined radiomics method could produce generalisable and robust unsupervised clusters. This method performed better than extraction from the intratumoural region alone, suggesting that peritumoural radiomics adds value in predicting treatment efficacy compared with intratumoural radiomics alone.

To the best of our knowledge, no study has applied a combined intratumoural and peritumoural radiomics technique to the preoperative prediction of MVI in HCC. Therefore, we developed and validated radiomics models based on gadolinium ethoxybenzyl diethylenetriamine pentaacetic acid (Gd-EOB-DTPA)-enhanced MRI for preoperative prediction of MVI in patients with HCC, using intratumoural radiomics alone and the combined intratumoural and peritumoural radiomics method, respectively.

Materials and methods

Patients

Patients who underwent Gd-EOB-DTPA-enhanced MRI examination before surgery between January 2013 and August 2017 were consecutively included in this study according to the inclusion and exclusion criteria. Only patients who met all of the following criteria were included: (1) resectable solitary HCC lesion or multiple HCC lesions within one liver lobe; (2) no macroscopic vascular invasion; (3) Gd-EOB-DTPA-enhanced MRI of the liver within 1 month before surgery; (4) curative hepatectomy; and (5) pathological confirmation of primary HCC. Exclusion criteria were: (1) other anti-tumour therapies before surgery; and (2) incomplete clinical or pathological information. Patients were randomly allocated to training and validation cohorts in a ratio of 7:3. Our Institutional Ethics Review Board approved the study, which followed the Declaration of Helsinki.

Pathological examination

Information about tumour size, number, colour, and capsule condition was obtained on gross examination, and information about histological type, differentiation grade, lymphocyte infiltration, MVI, satellite nodules, and chronic liver disease was obtained under the microscope. MVI was defined as the presence of tumour emboli in a vascular space lined by endothelial cells on microscopy [5]. Two pathologists with more than 10 years of experience in HCC pathology reviewed all the specimen slices independently, blinded to the patients’ clinical data.

MR imaging acquisition

MRI examination was performed with a 3.0-T system (Siemens Healthineers) in all patients. An eight-channel phased-array coil was used, and the scan range covered the liver from its top to its lower edge. The MR scan sequences included: half-Fourier single-shot turbo spin-echo (HASTE) sequence, fast low angle shot (FLASH) T1WI in/out of phase sequence imaging, FLASH T1WI fat suppression (FS) sequence axial imaging, and turbo spin-echo (TSE) T2WI navigator-triggered axial imaging. The images in the arterial phase, portal phase, and hepatic venous phase were obtained with a three-dimensional volumetric interpolated breath-hold examination (3D VIBE) T1WI (FS) sequence during suspended respiration at 30–35 s, 65–70 s, and 100–120 s, respectively, after a bolus injection of Gd-EOB-DTPA (Primovist®, 0.1 mL/kg body weight) at a flow rate of 1 mL/s. Additional hepatobiliary phase images were obtained 20 min after injection. The specific parameters of each scan sequence are shown in Table 1.

Table 1 MRI scan sequence and parameters

Volume of interest identification and segmentation

The volumes of interest (VOIs) were delineated along the outline of the liver lesions as 3D volumes on hepatobiliary phase MR images by three independent radiologists, each with more than 5 years of experience, using ITK-SNAP software (open-source software; www.itk-snap.org) (Supplementary Fig. 1) [18]. The reproducibility of VOI delineation among the three radiologists was assessed for subsequent feature selection.

The intratumoural region was defined as the area within the radiologist-annotated tumour boundaries. The annotated intratumoural region was then dilated by a radius of 1 cm using a topology algorithm, generating the combined intratumoural and peritumoural region (the dilation distance was calculated as the number of dilated pixels multiplied by the pixel size).
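The dilation step can be illustrated with a minimal sketch. It assumes the mask is a small 3D nested list indexed [z][y][x] and that the voxel spacing is known in millimetres; the function name `dilate_mask` and its parameters are hypothetical, and a spherical structuring element is an assumption, since the exact topology algorithm is not specified here.

```python
from itertools import product


def dilate_mask(mask, spacing_mm, radius_mm=10.0):
    """Dilate a binary 3D mask (nested lists, [z][y][x]) by a physical radius.

    Hypothetical helper: the radius is converted to voxel counts per axis
    (distance = voxel count * voxel size), and a spherical neighbourhood
    is assumed.
    """
    nz, ny, nx = len(mask), len(mask[0]), len(mask[0][0])
    sz, sy, sx = spacing_mm
    # Largest whole number of voxels that fits inside the radius on each axis.
    rz = int(radius_mm / sz + 1e-9)
    ry = int(radius_mm / sy + 1e-9)
    rx = int(radius_mm / sx + 1e-9)
    # Keep only offsets whose physical distance is within the radius.
    offsets = [
        (dz, dy, dx)
        for dz, dy, dx in product(
            range(-rz, rz + 1), range(-ry, ry + 1), range(-rx, rx + 1)
        )
        if (dz * sz) ** 2 + (dy * sy) ** 2 + (dx * sx) ** 2 <= radius_mm ** 2
    ]
    out = [[[0] * nx for _ in range(ny)] for _ in range(nz)]
    for z, y, x in product(range(nz), range(ny), range(nx)):
        if mask[z][y][x]:
            for dz, dy, dx in offsets:
                zz, yy, xx = z + dz, y + dy, x + dx
                if 0 <= zz < nz and 0 <= yy < ny and 0 <= xx < nx:
                    out[zz][yy][xx] = 1
    return out
```

The returned mask contains the original tumour voxels plus the surrounding shell, which corresponds to the combined intratumoural and peritumoural region. With anisotropic voxels the number of dilated voxels differs per axis, which is why the spacing is passed explicitly.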

Radiomics feature extraction

The MR image features of all patients were extracted and analysed with A.K. software version 2.0.0 (in-house software; Analysis-Kit, GE Healthcare). A total of 1044 imaging features of four kinds were extracted (Fig. 1): 42 grey-level histogram features (which concern the properties of individual voxels and describe the distribution of voxel intensities within the image through commonly used basic metrics), 972 transformed-matrix texture features (namely, texture features), 15 wavelet-transformed texture features, and 15 filter-transformed texture features.
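As an illustration of the grey-level histogram family, the sketch below computes a percentile and the kurtosis of a list of voxel intensities. The definitions used (linear interpolation for percentiles, non-excess kurtosis as the fourth central moment over the squared variance) are assumptions; the extraction software may use different conventions.

```python
import math


def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def kurtosis(values):
    """Non-excess kurtosis: fourth central moment divided by squared variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2


# Toy intensities, not patient data.
voxels = [1, 2, 3, 4, 5]
p10 = percentile(voxels, 10)
p75 = percentile(voxels, 75)
k = kurtosis(voxels)
```

Features of this kind depend only on the set of intensities inside the VOI, not on their spatial arrangement, which is what distinguishes them from the matrix-based texture features.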

Fig. 1

Flow chart of the study. (a) Collecting HCC patients who met the inclusion and exclusion criteria; (b) extracting radiomics features: (I) obtaining Gd-EOB-DTPA-enhanced MRI images in the hepatobiliary phase; (II) segmentation: the volume of interest was delineated by experienced radiologists and three-dimensional images were formed; (III) extracting four kinds of quantitative features by software; (c) pathologic examination: (I) obtaining gross specimens of tumour tissue; (II) pathologic specimens; (III) pathologic diagnosis; (d) data cleaning and dimension reduction; (e) establishing the model for predicting microvascular invasion by machine learning. The features retained after dimension reduction were used to establish the model by machine learning and to obtain a quantitative radiomics score for predicting microvascular invasion. HBP, hepatobiliary phase

Statistical analysis

The independent-samples t test or the Kruskal-Wallis (KW) nonparametric rank sum test was used to compare continuous clinical characteristics between the MVI-positive and MVI-negative groups in the training and validation cohorts, and the chi-square test or Fisher exact test was used for categorical variables. The reproducibility of VOI delineation by the three radiologists was assessed by the intra-class correlation coefficient (ICC). ICC ≥ 0.8 indicated high consistency, 0.5–0.79 moderate consistency, and < 0.5 low consistency [19].

The predictive model was established by relating the 1044 extracted features to the pathologically diagnosed MVI status through feature de-redundancy, supervised feature dimensionality reduction, and machine-learning-based classification. The least absolute shrinkage and selection operator (LASSO) logistic regression model was applied for dimensionality reduction with bootstrap resampling. A 10-fold cross-validation was used to select the minimum value of λ. The procedure was iterated, and the features were ranked in decreasing order of selection frequency. To avoid overfitting, the top 50 features were selected to establish the initial model. Finally, given the collinearity among the 50 initially selected features, logistic regression analysis was performed. The optimal model was chosen according to the Akaike information criterion (AIC) and the area under the receiver operating characteristic curve (AUC). The AUC and the corresponding 95% confidence interval (CI) were obtained from receiver operating characteristic (ROC) curve analysis, as were the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) at the optimal cutoff value.
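The ROC-based quantities can be sketched as follows. The AUC is computed here as the proportion of positive/negative pairs in which the positive case has the higher score (ties counted as one half), and the cutoff is chosen by the Youden index; the Youden criterion is an assumption, since the rule used to define the optimal cutoff is not stated. All function names and scores are illustrative.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def sensitivity_specificity(scores_pos, scores_neg, cutoff):
    """Classify score >= cutoff as positive and return (sensitivity, specificity)."""
    sens = sum(s >= cutoff for s in scores_pos) / len(scores_pos)
    spec = sum(s < cutoff for s in scores_neg) / len(scores_neg)
    return sens, spec


def youden_cutoff(scores_pos, scores_neg):
    """Cutoff maximising sensitivity + specificity - 1 (assumed criterion)."""
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores_pos) | set(scores_neg)):
        sens, spec = sensitivity_specificity(scores_pos, scores_neg, cutoff)
        j = sens + spec - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff


# Toy scores, not patient data.
pos = [0.4, 1.6, -0.1]
neg = [-1.5, -2.2, -0.5, 0.0]
area = auc(pos, neg)
cutoff = youden_cutoff(pos, neg)
```

The pairwise form of the AUC is equivalent to the area under the empirical ROC curve and avoids building the curve explicitly, at the cost of quadratic time in the cohort size.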

To compare the diagnostic performance of experts and the radiomics model, univariable logistic regression models were used to evaluate the relationship between MVI and the three MRI features reported by Lee et al [11]. The association between the clinical variables of the training cohort and MVI was assessed by univariable logistic regression analysis, and the variables with a p value < 0.05 were included in the multivariable logistic regression analysis along with the Rad score to further assess their independent association with MVI.

A heatmap analysis was performed to present the associations between radiomics features and histopathologic features; a heatmap represents data values as defined shades of colour. The radiomics score (Rad score) of each patient was calculated as the linear combination of the selected features weighted by the coefficients estimated by the optimal logistic model.
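The Rad score calculation amounts to a weighted sum, sketched below. The feature names, coefficient values, and intercept are placeholders chosen for illustration only (the fitted coefficients are given in the Supplementary Material), and including an intercept term is an assumption.

```python
def rad_score(features, coefficients, intercept=0.0):
    """Linear combination of feature values and logistic-model coefficients."""
    return intercept + sum(
        coefficients[name] * features[name] for name in coefficients
    )


# Hypothetical coefficients and feature values, for illustration only.
example_coefficients = {"kurtosis": 0.5, "percentile10": -1.0}
example_features = {"kurtosis": 3.0, "percentile10": 2.0}
score = rad_score(example_features, example_coefficients, intercept=0.25)
```

Because the score is the linear predictor of a logistic model, higher values correspond to a higher estimated probability of MVI.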

Spearman correlation analysis was used to evaluate the associations between radiomics and histopathologic features.
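For reference, Spearman correlation is the Pearson correlation of the ranks of the two variables. The sketch below assumes tied values receive the average of their ranks; the function names are illustrative.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Working on ranks makes the coefficient insensitive to monotone rescaling of a feature, which suits radiomics features whose raw scales differ widely.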

The goodness of fit of the predictive model was evaluated by Hosmer-Lemeshow Wald chi-square test.
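A minimal sketch of the Hosmer-Lemeshow calculation is given below. It assumes the usual grouped form of the statistic and uses the closed-form upper-tail probability of the chi-square distribution, which holds only for an even number of degrees of freedom; the function names and the grouping input format are hypothetical.

```python
import math


def hosmer_lemeshow_statistic(groups):
    """Grouped Hosmer-Lemeshow statistic.

    groups: iterable of (n, observed_events, expected_events), where
    expected_events is the sum of predicted probabilities in the group.
    """
    stat = 0.0
    for n, observed, expected in groups:
        stat += (observed - expected) ** 2 / (expected * (1.0 - expected / n))
    return stat


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# A statistic of 10.188 on 8 degrees of freedom gives p of about 0.252.
p_value = chi2_sf_even_df(10.188, 8)
```

A large p value means that the observed and predicted event counts do not differ significantly across groups, that is, there is no evidence of poor calibration.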

All statistical analyses were performed with R software version 3.2.3 (R Foundation for Statistical Computing; https://cran.r-project.org/bin/windows/base/old/3.2.3). A two-sided p value of less than 0.05 was considered statistically significant.

Results

Baseline characteristics

One hundred sixty patients were enrolled, including 110 patients in the training cohort and 50 in the validation cohort (Supplementary Fig. 2). Among all 160 patients, MVI was pathologically diagnosed in 62 patients (38.8%). The clinical characteristics of both cohorts are listed in Table 2. Baseline characteristics were not significantly different between the two cohorts (Supplementary Table 1). In the training cohort, tumour size was larger in MVI-positive patients than in MVI-negative patients; in the validation cohort, both serum AFP level and tumour size were higher in MVI-positive patients than in MVI-negative patients. No significant differences in other characteristics were found between MVI-positive and MVI-negative patients in either cohort.

Table 2 Baseline clinical characteristics of the training and validation cohort

Interobserver and intraobserver reproducibility of radiomics feature extraction

The interobserver ICC was ≥ 0.8, 0.5–0.79, < 0.5 for 82%, 10% and 8% of the features, respectively. The intraobserver ICC was ≥ 0.8, 0.5–0.79, < 0.5 for 85%, 14% and 1% of the features, respectively.

Radiomics model establishment

The heatmap shows the correlation coefficient matrix among the four kinds of features (Fig. 2a). To reduce dependency and redundancy, we used LASSO logistic regression to reduce the dimensionality of the 1044 features, and 50 features were selected. Logistic regression was then applied, and ten features were finally selected to establish the final model: kurtosis, percentile 10, percentile 75, percentile 80, ClusterShade_angle0_offset1 (ClusterShade), GLCMEntrophy_angle45_offset9 (GLCMEntrop), ShortRunEmphasis_AllDirection_offset2_SD (SRE), ShortRunLowGreyLevelEmphasis_AllDirection_offset5_SD (SRLGLE), HighGreyLevelRunEmphasis_AllDirection_offset8_SD (HGLRE), and LongRunHighGreyLevelEmphasis_AllDirection_offset8_SD (LRHGLE). The correlation coefficients among these ten features were mostly negative and low in magnitude (Fig. 2b); the features were therefore suitable for establishing a model for predicting MVI.

Fig. 2

Radiomics heatmaps. (a) Heatmap depicting correlation coefficients matrix of 1044 features in the training cohort. Unsupervised clustering analysis was used. Yellow was used to represent positive correlation and blue to represent negative correlation. (b) Heatmap depicting correlation coefficients matrix of ten selected features in the training cohort. Unsupervised clustering analysis was used. Yellow was used to represent positive correlation and blue to represent negative correlation

The Hosmer-Lemeshow test showed no significant lack of fit for the model established from these ten features (Hosmer-Lemeshow Wald chi-square = 10.188, df = 8, p = 0.252). The Rad score was defined as the sum of the values of these ten features, each multiplied by its regression coefficient (for the formula, see Supplementary Material). The Rad score of each patient in the training cohort is shown as a bar chart in Fig. 3a, in which MVI-positive patients had higher scores and MVI-negative patients had lower scores.

Fig. 3

Rad scores for each patient in the training and validation cohort. (a) Rad score for each patient in the training cohort. Light blue bars show scores for MVI-negative patients. Dark blue bars show scores for MVI-positive patients. (b) Rad score for each patient in the validation cohort. Light blue bars show scores for MVI-negative patients. Dark blue bars show scores for MVI-positive patients. MVI, microvascular invasion; Rad score, radiomics score

Validation of radiomics model

The ROC curves of the training and validation cohorts are shown in Fig. 4a and b, respectively. The AUC, sensitivity, and specificity of our model in predicting MVI were 0.850 (95% CI, 0.77–0.93), 76.2%, and 88.2%, respectively, in the training cohort and 0.833 (95% CI, 0.71–0.95), 73.7%, and 90.0%, respectively, in the validation cohort.

Fig. 4

Receiver operating characteristic curves (ROC) of the training and validation cohort. AUC, area under the receiver operating characteristic curve

MVI-positive patients had significantly higher Rad score in the training and validation cohorts (training cohort, 0.40 (− 0.04, 1.63) in MVI-positive vs. − 1.53 (− 2.19, − 0.48) in MVI-negative, p < 0.001; validation cohort, 0.96 (− 0.06, 3.39) vs. − 2.3 (− 4.3, − 1.35), p < 0.001) (Table 3, Fig. 3a, b).

Table 3 Association between radiomics features and MVI from binary logistic regression analysis

Association of radiomics features with clinical data

In the training cohort, six of the ten selected features differed significantly between the MVI-positive and MVI-negative groups in the univariable logistic regression model (Supplementary Table 2). Univariable analysis of the ten selected features and the clinical characteristics (Fig. 5) showed that eight features were associated with the clinical characteristics listed in Table 2. Five features (percentile 10, ClusterShade, LRHGLE, GLCMEntrophy, and percentile 75) were associated with BCLC stage. Seven features (kurtosis, percentile 10, percentile 80, ClusterShade, LRHGLE, GLCMEntrophy, and percentile 75) were associated with tumour size. In addition, percentile 10, percentile 75, percentile 80, and LRHGLE were each associated with several clinical characteristics.

Fig. 5

Map of associations between ten features selected and clinical characteristics. Blue was used to represent high p values and white to represent low p values

Association of radiomics features with histopathologic features

Three histopathologic features were analysed: Edmondson grade, satellite nodules, and lymphocyte infiltration. The heatmap showed that kurtosis, GLCMEntropy, and LRHGLE were associated with Edmondson grade (p < 0.05). Only kurtosis and SRE were associated with satellite nodules (p < 0.05), and GLCMEntropy and HGLRE were associated with lymphocyte infiltration (p < 0.05). The other radiomics features were not associated with any of the three histopathologic features (Fig. 6).

Fig. 6

Heatmap of associations between ten radiomics features and histopathologic features. Red was used to represent low p values and green to represent high p values

Comparison of diagnostic performance of experts and radiomics model

An abdominal radiologist with 17 years of experience in abdominal MR imaging assessed the three findings of Lee et al [11] on the preoperative MRI, and a model relating these three features to MVI was established in the same way as in that study. Based on the 160 patients in our study, when two of these three findings were combined, the AUC, sensitivity, specificity, and accuracy were 0.57, 45.2%, 67.3%, and 58.8%, respectively. When all three findings were combined, the AUC, sensitivity, specificity, and accuracy were 0.47, 19.3%, 83.7%, and 58.8%, respectively (Table 4). The radiomics model, with an AUC, sensitivity, specificity, and accuracy of 0.83, 90.0%, 75.0%, and 84.0%, respectively, appears superior to the radiologist in predicting MVI (Table 4).
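The relation between sensitivity, specificity, and accuracy can be made explicit with a short sketch. The confusion-matrix counts below are not reported data: they are assumptions back-calculated from the reported percentages for the two-finding combination and from the 62 MVI-positive and 98 MVI-negative patients, and they serve only to show the arithmetic.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy, PPV and NPV from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Assumed counts (62 MVI-positive, 98 MVI-negative), inferred for illustration.
two_findings = diagnostic_metrics(tp=28, fn=34, tn=66, fp=32)
# sensitivity 28/62 = 0.452, specificity 66/98 = 0.673, accuracy 94/160 = 0.5875
```

Accuracy is a prevalence-weighted average of sensitivity and specificity, so two tests with very different sensitivity and specificity can still share the same accuracy in one cohort.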

Table 4 Comparison of diagnostic performance of three MRI imaging features and the radiomics model for MVI

The decision curve analysis for predicting MVI with the radiomics model and with the experts is shown in Supplementary Fig. 3. It showed that if the threshold probability is larger than 20%, in other words, if the estimated probability that a patient is MVI-positive is > 20%, the radiomics model would benefit more patients through accurate prediction of MVI than the model based on imaging features extracted by experts.
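Decision curves plot net benefit against threshold probability. A minimal sketch of the standard net benefit formula is given below; the counts and prevalence in the example are hypothetical and are not taken from the study.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit at a threshold probability: TP/n - FP/n * pt / (1 - pt)."""
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of classifying every patient as positive."""
    return prevalence - (1.0 - prevalence) * (threshold / (1.0 - threshold))


# Hypothetical example: 50 patients, 15 true positives, 3 false positives.
nb_model = net_benefit(tp=15, fp=3, n=50, threshold=0.2)
nb_all = net_benefit_treat_all(prevalence=0.4, threshold=0.2)
```

At a given threshold, the model with the higher net benefit is preferred; the "treat all" and "treat none" (net benefit zero) strategies serve as reference lines.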

Discussion

In the current study, we established a radiomics model for preoperative prediction of MVI by extracting radiomics features from the intratumoural and peritumoural regions on Gd-EOB-DTPA-enhanced MRI. Our results showed that the AUC, sensitivity, and specificity of the combined intratumoural and peritumoural radiomics model were 0.83, 90%, and 75%, respectively. To the best of our knowledge, this is the first study to establish an MRI radiomics model for MVI prediction in HCC. The sensitivity and specificity of more than 75% achieved by our model might be explained as follows. Firstly, radiomics requires accurate discrimination of lesion boundaries, and Gd-EOB-DTPA-enhanced MRI images were used in our radiomics model. It is known that more than half of HCCs have an invaded capsule or lack a capsule entirely, and infiltrative tumour growth makes lesion segmentation even more difficult [20]. Gd-EOB-DTPA is a new hepatobiliary-specific MRI contrast agent [21]. The difference in signal between tumour tissue and the surrounding liver parenchyma is more pronounced in the hepatobiliary phase of Gd-EOB-DTPA-enhanced MRI than with conventional contrast agents [22,23,24], which makes tumour boundaries easier to delineate. Secondly, radiomics has the advantages of stable calculation, high repeatability, freedom from fatigue, and freedom from human subjectivity [25, 26]. When we tested the features reported by Lee et al [11] by visual inspection in our cohorts, the accuracy and sensitivity were much lower than those of our predictive model. This finding demonstrates the lower sensitivity, higher discrepancy, and poorer generalisability of manual extraction of imaging features. Thirdly, delineating the tumour on all hepatobiliary phase slices captured almost all of the tumour’s characteristics, including three-dimensional features (e.g. shape and smoothness), making the features more stable and representative than those from two-dimensional regions of interest [27].

In our study, imaging features of the intratumoural and peritumoural regions were extracted simultaneously to establish the model. A previous study [5] showed that more than 85% of MVI was found in the peritumoural region within 1 cm of the tumour boundary. For this reason, the peritumoural region in our study was obtained by dilating the annotated intratumoural region by a radius of 1 cm. Lee et al [11] found three imaging features on Gd-EOB-DTPA-enhanced MRI that were highly suggestive of MVI. Of these, arterial peritumoural enhancement and peritumoural hypointensity on hepatobiliary phase were both peritumoural features. The former was probably due to the local haemodynamic change in the peritumoural region when MVI was present. The latter was probably attributable to decreased uptake of Gd-EOB-DTPA on hepatobiliary phase owing to hepatic dysfunction induced by ischaemia. In light of these findings, it was reasonable to hypothesise that radiomics features of the peritumoural region are of great importance for the preoperative prediction of MVI. This is the first study to establish a combined intratumoural and peritumoural radiomics model to predict MVI in HCC. Moreover, our study has demonstrated that the combined model was superior to the intratumoural radiomics model in the preoperative prediction of MVI in HCC.

Previous studies have found that radiomics features were closely related to tumour microscopic structure and biological behaviour [28,29,30,31,32]. Our study identified ten quantitative radiomics features associated with MVI of HCC that have not been reported before. Texture features are important markers of intratumoural homogeneity. Of the ten radiomics features associated with MVI in our study, four were histogram-based features (kurtosis, percentile 10, percentile 75, percentile 80) and the other six were matrix-based features (ClusterShade, GLCMEntrophy, SRE, SRLGLE, HGLRE, LRHGLE). Histogram-based features are first-order statistics, which rely mainly on the intensity (or brightness) information of the intratumoural and peritumoural regions and describe its overall distribution. For example, ‘kurtosis’ is a measure of the ‘peakedness’ of the distribution of values in the image ROI, which can be used to describe how concentrated the image brightness information is. A higher kurtosis implies that the mass of the distribution is concentrated towards the tail(s) rather than towards the mean. The n-th percentile of a distribution is the brightness value below which n% of the values fall. Matrix-based features are second-order statistics, which can be used to describe the complexity of the intratumoural and peritumoural regions, changes in hierarchy, and the coarseness of the texture. For example, cluster shade is a measure of the skewness and uniformity of the GLCM. A higher cluster shade implies greater asymmetry about the mean. SRE is a measure of the distribution of short run lengths, with a greater value indicating shorter run lengths and finer textures. GLCMEntrop measures the average amount of information required to encode the image values. SRLGLE measures the joint distribution of shorter run lengths with lower grey-level values. The larger the value, the more complex the image and the smaller the image grey value. LRHGLE measures the joint distribution of long run lengths with higher grey-level values. HGLRE measures the distribution of the higher grey-level values, with a higher value indicating a greater proportion of higher grey-level values and size zones in the image. On one hand, tumours with high-grade malignancy may have greater heterogeneity and larger differences between cells, manifesting as grey-level nonuniformity and mixed signals in intratumoural regions. On the other hand, high-grade malignant tumours, which grow faster, are more prone to necrosis because of poorer intratumoural blood supply, which leads to worse uniformity and more mixed signals in intratumoural regions [33]. In addition, high-grade malignant tumours are more prone to invade the capsule and grow infiltratively, which leads to less uniformity and more mixed signals in peritumoural regions. As high-grade tumours tend to present with MVI, the texture features of MVI-positive HCC are less uniform than those of MVI-negative HCC, and the histogram features of MVI-positive HCC show more mixed signals than those of MVI-negative HCC in the intratumoural and peritumoural regions. Therefore, similarity in biological function may lead to similarity in microscopic pathology and, in turn, to similarity in radiomics features, which were identified in our study as valuable predictive features for MVI.

There are several limitations to this study. Firstly, the sample size is still limited compared with the relatively large number of variables. A large-scale clinical study enrolling more patients would help validate the model and improve its applicability as a tool for predicting MVI in decision making for HCC management. Secondly, our validation cohort was from the same centre as the training cohort, which prevented us from assessing the generalisability of our findings to other centres and settings. Thirdly, the signal intensities on MR images were not normalised in our study. MR signals are relative rather than absolute values, which may influence the interpretation of signal characteristics in MVI prediction. Fourthly, the actual site of MVI on the histological specimen was not correlated with the VOI on MRI.

In conclusion, this model showed high accuracy and sensitivity in both the training and validation cohorts, indicating its good representativeness and stability. Thus, it may be useful in preoperative individual prediction of MVI and assist clinicians in pretreatment decision making.