Introduction

Hepatocellular carcinoma (HCC) is the fifth-most common malignancy and the third leading cause of cancer death worldwide, with increasing incidence and mortality [1, 2]. Surgical resection is regarded as one of the most effective treatments for HCC. However, the prognosis of patients with HCC remains poor with a 5-year recurrence of up to 70% [3]. Histologic grade is one of the most important factors affecting patient prognosis. Poorly differentiated HCC is associated with higher recurrence and worse survival compared with well and moderately differentiated HCC after surgical resection [4,5,6]. Therefore, preoperative prediction of HCC grade is important in the selection of treatment strategies and may assist in the prediction of a patient’s prognosis. Preoperative biopsy is not widely used due to its invasiveness, complications and risk of sample error [7]. Therefore, noninvasive biomarkers able to accurately predict the grade of HCC are urgently needed.

Diffusion-weighted imaging (DWI) is a noninvasive and noncontrast functional imaging technique based on the diffusion of water molecules in vivo and has shown great promise in the detection and characterization of focal liver lesions [8,9,10]. In clinical practice, DWI is usually performed with two b values, and an apparent diffusion coefficient (ADC) is calculated by fitting the signals at these b values to a monoexponential model. Several previous studies have evaluated the preoperative correlation of ADC values or signal intensity with the histologic grade of HCC [11,12,13,14,15,16,17,18]. However, ADC is overestimated due to the “pseudodiffusion” effect [8]. Moreover, it depends on the b values used, which may vary across studies and investigational centers [19]. Thus, there have been conflicting results regarding the correlation of ADC values with the histologic grade of HCC. Most previous studies reported that the ADC value of poorly differentiated HCC was significantly lower than that of well and/or moderately differentiated HCC [11, 12, 14, 15, 17], while some other studies reported no correlation between ADC and histologic grade of HCC [16, 18].

Le Bihan et al. [20] first proposed the intravoxel incoherent motion (IVIM) model to describe the relationship between signal attenuation and increasing b values. Based on a biexponential model, quantitative metrics that separately represent the diffusion of water molecules and the microcirculation of tissue are derived using multi-b value DWI. IVIM has been used in the prediction of histologic grade of HCC. However, the reported diagnostic performance of these approaches has been contradictory [21,22,23]. ADC and diffusion coefficient (D) were significantly lower in poorly differentiated HCC compared with well and moderately differentiated HCC in the above studies. Granata et al. [22] and Shan et al. [23] reported that the perfusion fraction (f) value of poorly differentiated HCC was also significantly lower than that of well and moderately differentiated HCC, while Woo et al. [21] reported that f did not show a significant difference among the groups. In addition, obtaining IVIM metrics takes longer than conventional two-b value DWI, and whether it is worthwhile to perform multi-b value DWI remains controversial. The optimization of a DWI protocol would facilitate accurate prediction of histologic grade of HCC and proper selection of treatment strategies for HCC. Therefore, this study aimed to determine the diagnostic performance of IVIM parameters calculated using a biexponential model and ADC obtained with multi-b value and two-b value DWI using a monoexponential model in differentiating histologic grades of hepatitis B virus (HBV)-related HCC.

Materials and methods

Study population

This retrospective study was approved by the institutional review board, and the requirement for written informed consent was waived. Between June 2015 and May 2017, 211 patients with suspected HCC underwent liver MRI followed by surgery at our institution. All patients underwent 11-b value DWI (b values ranging from 0 to 1500 s/mm2) and 2-b value DWI (b = 0, 800 s/mm2). Ninety-four patients were excluded for the following reasons: (a) histologically confirmed cholangiocarcinoma (n = 13), mixed HCC/cholangiocarcinoma (n = 4), or metastasis (n = 2); (b) preoperative transcatheter arterial chemoembolization (n = 32); (c) non-HBV-related HCC (n = 8); (d) an interval of more than one month between the MR examination and surgery (n = 2); (e) lesion size of less than 1 cm (n = 20); (f) extensive necrosis and hemorrhage within the tumor that affected measurement (n = 3); and (g) slice misregistration or distinct motion artifacts (n = 10). Overall, 117 patients with HBV-related HCC (100 males and 17 females; age range, 25–79 years; mean age, 52.7 ± 12.5 years; BMI, 27.2 ± 4.2) with 120 HCCs pathologically confirmed after surgery were included; 103 lesions were located in the right liver lobe and 17 in the left liver lobe. The baseline characteristics of the included patients are shown in Table 1.

Table 1 Comparison of patient characteristics according to histologic grade

MRI scan

All MR imaging was performed on a 3.0T whole-body MR scanner (Discovery MR750, GE Healthcare, Milwaukee, WI). An eight-channel phased-array abdominal coil was used. Respiratory-triggered (RT) axial diffusion-weighted single-shot echo-planar imaging was performed with 11 b values (b = 0, 30, 50, 100, 150, 200, 300, 500, 800, 1000, and 1500 s/mm2) and 2 b values (b = 0, 800 s/mm2) successively using the following parameters: repetition time/echo time (TR/TE) = (6000–10,000)/56 ms, flip angle = 90°; matrix size = 128 × 128; field of view (FOV) = 30 × 30 cm, receiver bandwidth = 250 kHz/pixel, slice thickness = 5 mm, and slice gap = 1 mm. The number of excitations (NEX) was 4 for the 2-b value DWI (b = 0, 800 s/mm2) and 1, 1, 1, 1, 1, 1, 2, 4, 4, 6, and 6, respectively, for the 11-b value DWI (b values ranging from 0 to 1500 s/mm2). The fat-suppression scheme was chemical shift selective saturation (CHESS). The total acquisition times of the multi-b value and two-b value DWI approaches, which depend on the patient’s respiration rate and include respiratory intervals, were 4–7 min and 1.5–4.5 min, respectively. The other sequences used included fast imaging employing steady-state acquisition (FIESTA), single-shot fast spin echo (SSFSE), T2WI, and liver acquisition with volume acceleration (LAVA). After contrast agent injection, arterial, portal venous, and delayed phases were acquired at 15–20 s, 60 s, and 180 s, respectively. Detailed information on these sequences was reported in a previous study [24].

Monoexponential and biexponential model of DWI

In conventional DWI, the signal decrease caused by the diffusion can be described with a monoexponential function:

$$S_{b}/S_{0} = \exp(-b \cdot \text{ADC})$$

where S0 and Sb are the signal intensity obtained at b = 0 s/mm2 and a given b value, respectively. The ADC0,800 and ADCtotal were calculated by fitting b = 0, 800 s/mm2 and all 11 b values into a simplified monoexponential model, respectively.
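As an illustration of the monoexponential model, the following minimal Python sketch computes ADC from two b values in closed form and from all 11 b values by a log-linear least-squares fit. The b values are those listed in the MRI protocol; the signal values are synthetic, and the log-linear fit is an assumption for illustration, since the vendor software's exact fitting routine is not described here. The function names are hypothetical.

```python
import math

def adc_two_point(s0, sb, b):
    """ADC from two measurements: solves Sb/S0 = exp(-b * ADC) for ADC."""
    return math.log(s0 / sb) / b

def adc_log_linear(b_values, signals):
    """ADC from several b values: negative least-squares slope of ln(S) versus b.

    Assumption: a log-linear fit stands in for whatever fitting routine
    the workstation software actually uses.
    """
    n = len(b_values)
    logs = [math.log(s) for s in signals]
    mean_b = sum(b_values) / n
    mean_log = sum(logs) / n
    sxy = sum((b - mean_b) * (l - mean_log) for b, l in zip(b_values, logs))
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    return -sxy / sxx

# b values (s/mm2) from the protocol; signals are synthetic, not study data.
b_values = [0, 30, 50, 100, 150, 200, 300, 500, 800, 1000, 1500]
true_adc = 1.0e-3  # mm2/s, hypothetical
signals = [1000.0 * math.exp(-b * true_adc) for b in b_values]

adc_0_800 = adc_two_point(signals[0], signals[8], 800)  # uses b = 0 and 800 only
adc_total = adc_log_linear(b_values, signals)           # uses all 11 b values
```

On purely monoexponential synthetic data both estimates coincide. In tissue with a perfusion component they generally differ, because each estimate weights the low-b signal differently.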

The biexponential fitting model of IVIM theory assumed diffusion and perfusion compartments which both contribute to signal attenuation. The model is described using the equation proposed by Le Bihan et al. [20] as follows:

$$S_{b}/S_{0} = (1 - f)\exp(-bD) + f\exp(-bD^{*})$$

where Sb is the signal intensity at a given b value, S0 is the signal intensity at b = 0 s/mm2, D is the true diffusion coefficient representing pure molecular diffusion, D* is the pseudodiffusion coefficient representing perfusion-related diffusion, and f is the perfusion fraction representing the fraction of the diffusion signal associated with microcirculation. A typical multistep approach was used to separate the diffusion and perfusion effects [24, 25].
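One common form of such a multistep (segmented) fit can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure of the cited references: the 200 s/mm2 threshold, the grid-search range for D*, and all function names are hypothetical choices, and the signals are synthetic.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def ivim_model(b, d, dstar, f):
    """Normalized biexponential signal Sb/S0."""
    return (1.0 - f) * math.exp(-b * d) + f * math.exp(-b * dstar)

def ivim_segmented(b_values, signals, b_threshold=200.0, n_grid=600):
    """Segmented IVIM fit; assumes b_values[0] == 0. Returns (D, D*, f)."""
    s0 = signals[0]
    norm = [s / s0 for s in signals]
    # Step 1: at high b the perfusion term is assumed negligible, so
    # ln(Sb/S0) ~ ln(1 - f) - b * D; a straight-line fit gives D.
    high_b = [b for b in b_values if b >= b_threshold]
    high_log = [math.log(s) for b, s in zip(b_values, norm) if b >= b_threshold]
    slope, intercept = linear_fit(high_b, high_log)
    d = -slope
    # Step 2: the extrapolated intercept at b = 0 equals ln(1 - f).
    f = 1.0 - math.exp(intercept)
    # Step 3: with D and f fixed, search D* on a log-spaced grid
    # (1e-3 to 1 mm2/s, an assumed range) for the least-squares minimum.
    best_dstar, best_err = None, float("inf")
    for i in range(n_grid):
        dstar = 10.0 ** (-3.0 + 3.0 * i / (n_grid - 1))
        err = sum((s - ivim_model(b, d, dstar, f)) ** 2
                  for b, s in zip(b_values, norm))
        if err < best_err:
            best_dstar, best_err = dstar, err
    return d, best_dstar, f

# Synthetic, noise-free signals with hypothetical parameters (not study data).
b_values = [0, 30, 50, 100, 150, 200, 300, 500, 800, 1000, 1500]
signals = [1000.0 * ivim_model(b, 1.0e-3, 50.0e-3, 0.2) for b in b_values]
d_fit, dstar_fit, f_fit = ivim_segmented(b_values, signals)
```

Fixing D and f before estimating D* keeps the least stable parameter out of the first two steps, which is the usual motivation for a segmented fit over a simultaneous three-parameter fit.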

Image analysis

All imaging data were transferred to vendor-provided software (Functool on GE Advantage Workstation 4.6, GE Healthcare) for image processing and analysis. The location and size of each HCC were confirmed by two trained radiologists (QGS and SCK, with 4 and 12 years of experience in interpretation of liver MRI, respectively), who were blinded to the histologic results and who reached a consensus after reviewing DWI, T2WI, and dynamic enhancement images. The two reviewers independently placed a single region of interest (ROI) on the representative slice with the maximum tumor cross section. ROIs were manually drawn on the axial b800 images and were checked on all b images to contain as much solid component as possible and were about 5 mm away from the margin to minimize partial volume effects [26]. Necrosis and hemorrhage, identified on dynamic enhancement images and T2WI, were avoided. All the ROIs were copied and transferred to ADC0,800, ADCtotal, and IVIM-derived maps for measurement (Fig. 1). ADC0,800 and ADCtotal were calculated by fitting b = 0, 800 s/mm2 and all 11 b values into the monoexponential model, respectively. D, D*, and f were calculated using the biexponential model. The values of the above metrics generated by the two radiologists were averaged for further analyses. The two sets of data were used to calculate the interobserver agreement of measurements.

Fig. 1

A surgically confirmed poorly differentiated HCC in a 55-year-old man. a Axial fat-suppressed T2WI image of the liver showed a slightly hyperintense focal liver lesion. b Axial diffusion-weighted image with b = 800 s/mm2 showed an obviously hyperintense focal liver lesion. c–f Maps of the estimated values of ADCtotal, D, D*, and f, respectively, calculated using all 11 b values. g Graph of signal attenuation versus b values for HCC showed a steep slope at low b values (< 200 s/mm2), which was suggestive of perfusion effects. h Map of the estimated value of ADC0,800 calculated using two b values (0 and 800 s/mm2). The mean values of ADCtotal, D, D*, f, and ADC0,800 of the tumor were 0.60 × 10−3 mm2/s, 0.55 × 10−3 mm2/s, 8.69 × 10−3 mm2/s, 18.2%, and 0.90 × 10−3 mm2/s, respectively

Histologic analysis

The histologic specimens were obtained from surgical resection or liver transplantation in all patients. All lesions were macroscopically located according to the Couinaud classification of hepatic segment of I–VIII. All tissue slices were stained with hematoxylin and eosin, and were analyzed by a pathologist with 21 years of experience who was blinded to the clinical data and MRI results. According to International Working Group classification system, HCC was divided into well, moderately, and poorly differentiated groups [27]. When the evaluated lesion included tissues of different differentiations, the most predominant component of the tumor was selected to represent the entire lesion for analysis.

Imaging-pathologic correlation

After finishing the image analysis, the two trained radiologists mentioned above reviewed the pathologic reports including size, number, shape, location, and histologic grade of HCC, and the imaging-pathologic correlation was performed. If two or more lesions were in a patient, they were distinguished by the information of location and size according to the pathologic reports.

Statistical analysis

To compare patient characteristics according to histologic grade, categorical variables were analyzed using the Chi-square test or Fisher exact test and continuous variables were compared using one-way ANOVA or the Kruskal–Wallis test. The Kruskal–Wallis test was used to compare ADCtotal, ADC0,800, D, D*, and f values of the three groups, and Bonferroni correction was used for post hoc tests. Spearman’s rank correlation was used to assess the correlation between each of the five metrics and the histologic differentiation of HCCs. Receiver operating characteristic (ROC) analysis was performed to evaluate and compare the performance of the above parameters in differentiating HCCs with different histologic grades. Binary logistic regression combined with ROC analysis was used to evaluate the joint diagnostic performance of two or more parameters. The method of DeLong et al. [28] was used to compare the areas under the ROC curve. To assess interobserver agreement, the intraclass correlation coefficient (ICC; two-way random effects model) was calculated for ADCtotal, D, D*, f, and ADC0,800 values in the same lesion along with 95% confidence intervals. ROC analyses were performed using MedCalc 13.0 (MedCalc Software bvba, Ostend, Belgium), box plots were generated using GraphPad Prism, version 6.02 (GraphPad Software, La Jolla, Calif), and other statistical analyses were performed using SPSS 22.0 (IBM Corp, U.S.A). P < 0.05 was considered statistically significant in all statistical analyses.
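The ROC step can be illustrated with a short Python sketch. It computes the area under the ROC curve as the Mann–Whitney probability that a lesion from the positive group has a higher value than one from the negative group, and picks a cutoff by maximizing the Youden index. The Youden criterion is an assumption, since the cutoff rule is not stated in the text, and the ADC values below are invented for illustration and are not study data.

```python
def auc_mann_whitney(pos, neg):
    """AUC as P(positive value > negative value); ties count one half."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def youden_cutoff(pos, neg):
    """Cutoff c (classify as positive when value >= c) maximizing
    sensitivity + specificity - 1. Returns (cutoff, sensitivity, specificity)."""
    best = (None, 0.0, 0.0)
    best_j = -1.0
    for c in sorted(set(pos + neg)):
        sens = sum(1 for p in pos if p >= c) / len(pos)
        spec = sum(1 for n in neg if n < c) / len(neg)
        j = sens + spec - 1.0
        if j > best_j:
            best_j = j
            best = (c, sens, spec)
    return best

# Hypothetical ADC values (x 10^-3 mm2/s): "positive" = well differentiated,
# "negative" = moderately or poorly differentiated. Not study data.
well = [1.30, 1.25, 1.10, 1.05, 0.95]
other = [1.00, 0.90, 0.85, 0.80, 0.75, 0.70]

auc = auc_mann_whitney(well, other)
cutoff, sensitivity, specificity = youden_cutoff(well, other)
```

Because lower values indicate worse differentiation, the same functions apply to the poorly differentiated task by negating the values or swapping the direction of the comparison.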

Results

Histologic grade of HCC

Histologic specimens of 109 patients with one lesion and 3 patients with two lesions were obtained from surgical resection. Histologic specimens of 5 patients with one lesion were obtained from liver transplantation. According to the results of histologic analysis, the lesions were divided into three subgroups: well differentiated (n = 24), moderately differentiated (n = 74), and poorly differentiated (n = 22).

ADCtotal, ADC0,800, and IVIM parameters among different HCC groups

ADCtotal, ADC0,800, and IVIM parameters of well, moderately, and poorly differentiated HCCs are shown in Table 2. ADCtotal, ADC0,800, D, and f values were significantly different among the three groups (P < 0.001). ADCtotal, ADC0,800, D, and f values of well differentiated HCCs were significantly higher than those of moderately (P < 0.001, P = 0.002, P < 0.001, and P = 0.007, respectively) and poorly differentiated HCCs (P < 0.001). The ADCtotal value of moderately differentiated HCCs was significantly higher than that of poorly differentiated HCCs (P = 0.009), while no significant difference was noted between these groups in terms of ADC0,800, D, or f values (P = 0.42, P = 0.308, and P = 0.243, respectively). No significant difference was found in D* value among the three groups (P = 0.514). ADCtotal, ADC0,800, D, and f were all inversely correlated with histologic grade: r = − 0.633, r = − 0.394, r = − 0.435, and r = − 0.358 (P < 0.001), respectively, demonstrating that the worse the differentiation, the lower these values are (Fig. 2). However, D* values were not significantly correlated with histologic grade (r = − 0.09, P = 0.328).

Table 2 ADCtotal, ADC0,800, and IVIM parameters of different histologic differentiated HBV-related HCCs
Fig. 2

Box-and-whisker plots showing values of a ADCtotal, b D, c D*, d f, and e ADC0,800 according to histologic grades of HCC. Boxes stretch across from the lower quartile (25th percentile) to the upper quartile (75th percentile); whiskers represent the range of the values; and the horizontal line inside each box represents the median value. W, well differentiated; M, moderately differentiated; P, poorly differentiated

Performances of ADCtotal, ADC0,800, and IVIM metrics for discriminating different histologic subgroups

The area under the ROC curve (AUC-ROC) of ADCtotal, ADC0,800, D, and f for diagnosing well differentiated HCC was 0.925, 0.755, 0.812, and 0.736, respectively, and the AUC-ROC of these metrics for diagnosing poorly differentiated HCCs was 0.800, 0.660, 0.685, and 0.674, respectively. The sensitivity, specificity, and cutoff values of the above metrics are shown in Tables 3 and 4. The AUC-ROC of ADCtotal was greater than that of ADC0,800 in diagnosing both well and poorly differentiated HCC (P < 0.001 and P = 0.04, respectively). The AUC-ROC of ADCtotal was greater than that of D and f in diagnosing well differentiated HCCs (P < 0.001), but there was no difference for diagnosing poorly differentiated HCCs (P = 0.06 and 0.13, respectively). The AUC-ROC of D, f, and ADC0,800 for diagnosing well and poorly differentiated HCCs were not significantly different (Supplementary Tables 1, 2). Combining f, the specificity of D for the diagnosis of well differentiated HCC increased to 86.5%. However, the combination of parameters did not bring additional improvement to the performance in diagnosing poorly differentiated HCC using the same method.

Table 3 Diagnostic performances of ADC and IVIM parameters for distinguishing well differentiated HCC from moderately and poorly differentiated HCC
Table 4 Diagnostic performances of ADC and IVIM parameters for distinguishing poorly differentiated from moderately and well differentiated HCC

Interobserver agreement of measurements

There was excellent agreement for ADCtotal, D, and f; good agreement for ADC0,800; and moderate agreement for D* (Table 5).
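For readers unfamiliar with the statistic, the following Python sketch computes a two-way random-effects, absolute-agreement, single-measurement ICC, often written ICC(2,1), from a lesions-by-raters table. Which ICC form the statistics package reported is not stated in the text, so this particular form is an assumption, and the ratings below are a small illustrative table rather than study data.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per subject (lesion), one column per rater.
    """
    n = len(ratings)        # subjects
    k = len(ratings[0])     # raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Sums of squares from the two-way ANOVA decomposition.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# Illustrative table: six subjects rated by four raters (not study data).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc_example = icc_2_1(example)
```

In this table the raters differ systematically in level, which the absolute-agreement form penalizes; two raters who agree perfectly would yield a value of 1.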

Table 5 Interobserver agreement of ADCtotal, ADC0,800, and IVIM parameters in HBV-related HCCs

Discussion

Preoperative evaluation of the histologic grade of HCC helps to tailor the treatment plan and predict prognosis [6]. In this study, to discriminate poorly or well differentiated HCC, the other two groups were combined into a single comparison group, an approach used in several previous studies [11, 14]. The present study evaluated and compared the diagnostic accuracy of ADCtotal, ADC0,800, and IVIM-derived parameters for differentiating histologic grade of HBV-related HCC. Our study demonstrated that ADCtotal, ADC0,800, and the IVIM parameters D and f can be used for the discrimination of histologic grade of HBV-related HCC, that they were inversely correlated with histologic grade, and that ADCtotal demonstrated the highest diagnostic performance. In addition, the analysis of combined D and f increased diagnostic specificity for well differentiated HCC.

Chronic HBV infection is the main risk factor of HCC in Asia and the characteristics of HBV-related HCC are different from those of HCC with other factors [29]. HBV-related HCC is more frequent in male and shows a more advanced stage with better liver function compared with HCC caused by other etiologies when it is detected [30, 31]. Therefore, we focused on HBV-related HCC in this study. It would be useful to perform a similar study in patients with different etiologies in the future.

ADCtotal, ADC0,800, and D were inversely correlated with histologic grade, which is consistent with previous studies [21,22,23]. The decreases in ADC and D with histologic grade may be explained by the increased cellular density and nuclear/cytoplasmic ratio which may restrict the diffusion of water molecule [21]. The decrease in f values with the increasing histologic grade is consistent with previous studies [22, 23] and may be explained by a reduced blood circulation through tumor capillaries [32]. D* was not correlated with histologic grade and this may be due to its instability as reported previously [19, 33].

In our study, ADCtotal demonstrated superior diagnostic performance compared with IVIM-derived parameters D and f, which was consistent with the results of Granata et al. [22] and Shan et al. [23] and inconsistent with the results of Woo et al. [21]. This may be because ADCtotal is a composite metric containing both pure diffusion (D) and microcirculation (f), which may have cooperative effects that lead to better performance in the discrimination of histologic grades. In the related task of discriminating benign from malignant focal liver lesions, more studies have reported that ADC performed better than or similarly to D [34,35,36,37,38] than have reported that D performed better than ADC [39]. The explanation for those results may be similar to that proposed above [35, 36]. Therefore, the diagnostic performance of ADC and D may differ in various situations. To the best of our knowledge, no consensus on this subject has been reached, and further studies with a larger sample size are warranted.

Previous studies have tried to find optimal b values for DWI of liver tumors but no consensus has been reached [40, 41]. In our study, b values of 0 and 800 s/mm2 were selected based on previous studies [40, 42]. ADC calculated by using b values of 0 and 800 s/mm2 may better reflect the diffusion process compared to that obtained using b values of 0 and 500 s/mm2 [40], and ADC0,800 showed better diagnostic performance for the discrimination of benign and malignant liver lesions compared with ADC0,600 and ADC0,1000 [42]. In our current study, ADC0,800 was inversely correlated with histologic grade (r = − 0.394), which was similar to the findings by Tang et al. [15] who also used b values of 0 and 800 s/mm2 (r = − 0.462).

Unlike conventional monoexponential DWI, the biexponential IVIM model can be used to derive both microcirculation and tissue diffusion parameters using multi-b value DWI [19]. However, no consensus has been reached on the number and range of b values used for IVIM [19]. Since more low b values (< 100–200 s/mm2) should be used to acquire perfusion-sensitive information [19], and measurements at higher b values are more stable, reflect diffusion more sensitively, and can provide better conspicuity for tumor detection [19, 43], we selected 11 b values ranging from 0 to 1500 s/mm2 with 6 b values ≤ 200 s/mm2 and 4 b values ≤ 100 s/mm2. In our study, the b values were in the low-to-intermediate b value range (≤ 1500 s/mm2), and non-Gaussian behaviors were not evident [43]. The number and range of b values used in our study were similar to those in recent IVIM studies (ranging from 0 to 1200–1500 s/mm2) in which diffusion and microcirculation characteristics were investigated [24, 44]. In a recent liver IVIM study, 13 b values from 0 to 1200 s/mm2 were used, and the NEX for 1200 s/mm2 was 8 [44]. In our study, we minimized the effect of higher b values on signal-to-noise ratio (SNR) by setting the number of excitations (NEX) for 1500 s/mm2 at 6. In future studies, it may be helpful to use a lower maximum b value (≤ 800 s/mm2). Our results demonstrate that the diagnostic performance of ADCtotal calculated with multiple b values was better than that calculated with two b values. Zhu et al. [37] reported that ADCtotal showed superior diagnostic performance compared to IVIM-derived parameters and ADC0,500 in the discrimination of malignant lesions from hemangiomas, which was similar to our findings in this study. Lemke et al. [45] reported that the acquisition of multiple b values improved the stability of the parameter estimation in the differentiation of pancreas carcinoma from healthy pancreatic tissue. Kim et al. [41] reported that the use of multiple b values may improve the reproducibility of ADC measurements on RT DWI. One possible reason is that by fitting multiple b values the variability of ADC may be canceled out, resulting in a smaller measurement error and higher accuracy and reproducibility of parameters [41, 46, 47]. Our results confirm this finding in the discrimination of histologic grade of HCC. Although longer acquisition times and more complicated calculations are needed to obtain ADCtotal, fitting multiple b values would still be a better choice for differentiating histologic grade of HCC because of its better performance compared with two b values. Voert et al. [46] reported that the largest gain in reducing error occurs when moving from 4 to 11 b values and that at least 11 b values should be used for IVIM. However, a prolonged acquisition time caused by using more b values may lower work efficiency and add to patient discomfort. A previous study reported that good reproducibility of ADC was obtained using both 4 and 16 b values, and that the precision and reproducibility of IVIM parameters calculated with 4 b values were not reduced significantly compared to those calculated with 16 b values [33]. Therefore, on the basis of our preliminary findings in this study, decreasing the number of b values to reduce the scan time while preserving the diagnostic performance and reproducibility of ADC values should be explored in future studies.

We found that D in combination with f provided better diagnostic specificity than each did alone and that the result was similar to ADCtotal for diagnosing well differentiated HCC using joint analysis. These findings indicate that IVIM provides more parameters which facilitate a better understanding of tissue characteristics of HCC and a combination of parameters may help provide more accurate differentiation of histologic grade of HCC, which would permit more proper selection of treatment strategies for HCC. However, the combination of multiple parameters did not perform better in diagnosing poorly differentiated HCC.

In this study, we used the conventional method of ROI drawing, which was previously reported in many studies, and whole tumor analysis was not performed [21, 48]. Whole tumor analysis could capture tumor heterogeneity better and have better reproducibility [44, 49], but it includes areas of necrosis and hemorrhage and may not provide the true ADC values for the solid component of the tumor, which may result in more overlap between different histologic grades [50]. In addition, whole tumor analysis takes longer [50]. Although Wei et al. [44] reported that ADC and IVIM parameters generated from whole tumor analysis were significantly correlated with histologic grade, their diagnostic performance was not better than that of a one-slice ROI. Xu et al. [49] reported that the AUC-ROC of the best parameters in discriminating poorly differentiated HCC from nonpoorly differentiated HCC was 0.763, which was lower than that of parameters generated from a one-slice ROI [21,22,23]. A previous study reported that ADC generated from whole tumor analysis did not yield better results than the one-slice method in distinguishing low-grade gliomas from high-grade gliomas [50]. It has been recommended that some components of a lesion, such as necrosis, be excluded from the analysis [43]. To the best of our knowledge, the selection of ROI methods remains debatable, and no consensus on this subject has yet emerged. Further studies comparing different methods of ROI drawing are warranted. However, in this study, our main goal was to compare the effectiveness of mono- and biexponential models of diffusion-weighted imaging in differentiating histologic grade of HBV-related HCC. Therefore, the comparison of different ROIs was not investigated in this study.

There were some limitations in our study. First, this was a single-center retrospective study, and while care was taken in selecting appropriate patients, retrospective analyses always carry some risk of selection bias. While the parameter values of HCC in this cohort showed significant differences among histologic grades, the findings warrant validation in a separate patient cohort in which the diagnostic thresholds derived from this study are applied prospectively. Second, there were relatively few well and poorly differentiated HCCs in our sample. More patients with well and poorly differentiated HCC should be included in further studies. Third, the tumors were heterogeneous and some lesions included tissues of different histologic grades, and the most predominant component of the tumor was selected to represent the entire lesion for analysis, which may be one cause of overlap between groups for the parameters [15]. Fourth, the mean values of ADC and IVIM parameters were not enough to reflect the heterogeneity of HCC. Histogram analysis should be used in further studies to evaluate this problem. Fifth, cardiac gating was not used because the simultaneous application of both electrocardiographic and respiratory triggering would significantly increase the scan time, which may reduce the clinical applicability [51]. Therefore, work remains to be done in future studies for the better utilization of IVIM imaging of the left liver lobe. Sixth, the test–retest reproducibility was not assessed because it was not ethical to scan a patient twice in routine clinical work. Finally, the algorithms of the biexponential model are sophisticated and sometimes prone to deviations [52].

In conclusion, both ADC and IVIM parameters can be used in the discrimination of histologic grade of HBV-related HCC and ADCtotal derived from a monoexponential model using multiple b-value DWI showed higher diagnostic accuracy compared with other metrics. The performance of ADC for differentiating histologic grade could be improved by fitting multiple b values.