Introduction

With the widespread use of cross-sectional imaging techniques, the detection of focal liver lesions (FLLs) has become increasingly common. Many FLLs are incidental, and characterization with imaging is an effective means to reduce the need for invasive biopsies for histological confirmation. Several features of FLLs on computed tomography (CT) and magnetic resonance imaging (MRI) are useful for their characterization. CT and MRI rely on the use of contrast agents and multiphasic studies for evaluation of enhancement characteristics that provide valuable clues to FLL type. However, the use of contrast agents can be expensive and may be contraindicated in some patients.

Perhaps the most important determination when an FLL is detected is whether it is benign or malignant, which can determine the need for further evaluation. An imaging technique that can provide such information with high accuracy and without the use of contrast agents would be useful. In this direction, MRI techniques have been the focus of many studies, as MRI does not involve ionizing radiation. Magnetic resonance elastography (MRE) and diffusion-weighted imaging (DWI) are two such MRI techniques capable of differentiating benign and malignant FLLs without the use of an intravenous contrast agent [1–4].

Malignant tumours generally have high cellularity compared to benign tumours, which results in lower apparent diffusion coefficient (ADC) values evaluated with DWI. DWI has been studied by several researchers for characterization of FLLs, with mixed success [3, 5–8]. Malignant tumours also have abundant extracellular matrix and increased vascularity and interstitial pressure [9]. These characteristics, along with greater cellularity, may result in increased stiffness that can be evaluated with MRE. Although MRE is a robust technique for the detection and staging of liver fibrosis [10–12], few studies have explored this technique for evaluation of FLLs [1, 2].

To the best of our knowledge, there are no published studies in the English literature that have performed a systematic comparison of DWI and MRE for differentiating FLLs. In this study, we compared DWI and MRE for distinguishing between benign and malignant FLLs.

Materials and methods

Institutional review board approval was obtained for this study. Written informed consent was waived for the retrospective review of the data.

Subjects

Between March 2009 and April 2012, 96 consecutive subjects with 160 FLLs were studied with both MRE and DWI. Fifty patients underwent a prospective liver MRI study with MRE for evaluation of FLLs for a clinical feasibility study on MRE. Forty-six patients underwent MRI of the liver with MRE for evaluation of chronic liver disease. In these patients, MRE slices were also obtained through the FLLs, and the patients were therefore eligible for the study. After follow-up of at least 2 years in the case of suspected benign lesions, we retrospectively evaluated the patient data. Seventeen subjects with 36 FLLs were excluded from the study for the following reasons: no histological confirmation or other supportive clinical and imaging evidence of malignancy in the case of suspected malignant lesions, failure to demonstrate stability of lesions for at least 2 years with follow-up imaging, and/or failure to follow up after initial imaging workup in the case of benign lesions without histological confirmation. The final study group comprised 79 subjects (45 male, 34 female), with a mean age of 44.8 ± 14.5 years. The FLLs included hemangioma (HEM), hepatocellular adenoma (HCA), focal nodular hyperplasia (FNH), hepatocellular carcinoma (HCC), cholangiocarcinoma (CCA), and metastasis (MET).

Imaging technique

All subjects were studied on a clinical 1.5-Tesla MRI system (GE Healthcare, Milwaukee, WI, USA). Liver imaging was performed with a phased-array torso coil. The standard liver imaging protocol included the following sequences: coronal single-shot fast spin-echo T2-weighted sequence, respiration-triggered fast spin-echo T2-weighted sequence or axial breath-hold fast recovery fast spin-echo T2-weighted sequence, axial dual-echo in- and out-of-phase spoiled gradient-echo sequence, spoiled gradient-echo pulse sequence with volume acquisition (LAVA) before and after administration of a contrast agent acquired at three phases (late arterial, portal venous and delayed), MRE sequence, DWI with free breathing, and 20-min delayed images as needed. A gadolinium-based contrast agent, either gadoterate meglumine (Dotarem; Guerbet LLC, Paris, France) or gadoxetate disodium (Primovist; Bayer HealthCare AG, Leverkusen, Germany), was injected intravenously with an automated injector at a rate of 2–3 ml/s, followed by a 30-ml saline flush. Arterial, portal venous, and delayed-phase images of the entire liver were obtained. Additional delayed images were obtained as needed. The choice of intravenous contrast was indication-related and followed institutional protocol. Gadoxetate disodium (Primovist) was used in cases with suspected FNHs.

DWI was performed before injection of intravenous contrast utilising a free-breathing technique with the following parameters: repetition time/echo time (TR/TE) = 3000–5000/91 ms, matrix 128 × 256, 5-mm thickness, gap = 2 mm, bandwidth = 1.5 kHz, number of excitations (NEX) = 5, parallel imaging factor = 2. The gradients were applied in three orthogonal directions. Two b-values, b0 and b500, were used. The mid-range b-value of 500 was chosen so as to maintain good signal intensity from the liver, which can be low at higher b-values, and to reduce capillary perfusion effects that are dominant at lower b-values. The typical acquisition time was 4–6 min. ADC maps were automatically generated on the workstation using an algorithm implemented within the GE scanner.
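The scanner algorithm used to generate the ADC maps is not described here. As an illustration only, the sketch below assumes the standard mono-exponential model, in which a two-point ADC is ln(S_low / S_high) / (b_high − b_low). The function name and the voxel signal values are hypothetical and are not taken from the study.

```python
import math


def adc_two_point(s_low, s_high, b_low=0.0, b_high=500.0):
    """Two-point mono-exponential ADC estimate in mm^2/s.

    s_low and s_high are signal intensities at b_low and b_high (s/mm^2).
    The model S(b) = S(0) * exp(-b * ADC) is an assumption of this sketch.
    """
    if s_low <= 0 or s_high <= 0:
        raise ValueError("signal intensities must be positive")
    if b_high <= b_low:
        raise ValueError("b_high must exceed b_low")
    return math.log(s_low / s_high) / (b_high - b_low)


# Hypothetical voxel signals in arbitrary units (illustrative only).
adc = adc_two_point(s_low=1000.0, s_high=470.0)
print(f"ADC = {adc * 1e3:.2f} x 10^-3 mm^2/s")
```

With only two b-values the estimate is a direct ratio rather than a fit, so noise in either image propagates straight into the ADC value.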

MRE was performed with a 19-cm-diameter, 1.5-cm-thick cylindrical passive driver placed over the right chest wall over the right lobe of the liver at the level of the xiphisternum. A continuous acoustic vibration at 60 Hz was transmitted from an active driver to the passive driver via a flexible vinyl tube. A modified phase-contrast pulse sequence (GRE-MRE) with cyclic motion-encoding gradients synchronized to the mechanical waves was used for imaging of the micron-level displacements associated with wave propagation. The process yielded images demonstrating the propagating waves in the liver. The tumours were identified on T2- and contrast-enhanced T1-weighted images, and the MRE slices were targeted to the tumour. MRE was performed with a breath-hold modified gradient-echo sequence (TR/TE = 100/26 ms, matrix 96 × 256, 5–10 mm thickness, four phase offsets, bandwidth = 33 kHz, parallel imaging factor = 2, NEX = 1). Four slices of 5–10 mm thickness were obtained through the liver at the level of the FLL; additional slices were prescribed in the case of multiple FLLs. Care was taken to ensure that at least two slices cut through the largest cross-section of the FLL. Each MRE slice was obtained with a breath-hold of 11–16 s, and the entire MRE sequence was completed in 2 min. Stiffness maps or elastograms were automatically generated by the scanner by processing the acquired images of propagating shear waves with an inversion algorithm to generate quantitative cross-sectional images showing tissue stiffness [10].
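The inversion algorithm itself is not detailed in this section. To give a rough sense of how wave images relate to stiffness, the following sketch uses the simplified relationship μ = ρ(fλ)², which holds only under the assumptions of a locally homogeneous, purely elastic medium with a density close to 1000 kg/m³. It is not the scanner's method; the function name and the example wavelength are hypothetical.

```python
def shear_stiffness_kpa(wavelength_m, frequency_hz=60.0, density_kg_m3=1000.0):
    """Simplified shear stiffness estimate in kPa from the shear wavelength.

    Assumes mu = density * (frequency * wavelength)^2, i.e. a homogeneous,
    purely elastic medium. Real MRE inversion algorithms are more involved.
    """
    if wavelength_m <= 0 or frequency_hz <= 0 or density_kg_m3 <= 0:
        raise ValueError("all inputs must be positive")
    wave_speed = frequency_hz * wavelength_m          # shear wave speed, m/s
    return density_kg_m3 * wave_speed ** 2 / 1000.0   # Pa converted to kPa


# Hypothetical 3.5-cm shear wavelength at the 60 Hz driver frequency.
print(f"{shear_stiffness_kpa(0.035):.2f} kPa")
```

Because stiffness scales with the square of the wavelength, longer waves at the same driver frequency indicate markedly stiffer tissue.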

Two readers (a radiology fellow and a consultant radiologist with more than 6 years of experience in MRE and 10 years of experience in body imaging), who were blinded to the final diagnosis of the FLL, evaluated the ADC and stiffness maps of 40 FLLs (20 benign and 20 malignant) for evaluation of interobserver correlation. Circular or oval regions of interest (ROIs) measuring 1–3 cm² were placed on the focal lesions on the DWI or ADC map and copied to the stiffness map at the same or similar level wherever possible. Particular care was taken to exclude large vessels, liver edges, and artefacts. Mean values were calculated from ROIs drawn on different slices that cut through the FLL. With MRE, there were at least two slices through the FLL. Mean stiffness values in kilopascals (kPa) and mean ADC (10⁻³ mm²/s) values were derived for each lesion and tabulated. For comparison and accuracy analysis, readings from the senior experienced reader for all 124 lesions were used.

Two additional radiologists (attending radiologists with 8 and 2 years of experience in interpreting liver MRI studies) performed, in consensus, a qualitative analysis of conventional MRI images, including T1- and T2-weighted, in- and opposed-phase, pre-contrast T1-weighted, and post-contrast images, and classified the lesions as either benign or malignant. The diagnostic performance was then compared with that of DWI and MRE to assess any additional value over conventional imaging.

Statistical analysis

Mean (± SD) values with 95 % confidence intervals (95 % CI) were generated for tumour size, ADC, and stiffness values. Interobserver agreement for measurements of ADC and stiffness was calculated using intraclass correlation analysis. Correlations among stiffness, ADC and FLL size were evaluated using the Pearson correlation coefficient. ADC values and liver stiffness were compared between malignant and benign FLLs and among individual FLL types using one-way analysis of variance (ANOVA), and post hoc testing for pairwise comparisons for all FLLs was performed using the Tukey-Kramer method. Receiver operating characteristic (ROC) curve analysis was performed to determine the area under the curve (AUC) for accuracy of MRE and ADC for differentiation of benign and malignant lesions. Sensitivity, specificity, positive predictive value, and negative predictive value were generated using the optimal cut-off values. ROC analysis for comparisons of accuracy among DWI, MRE, and conventional MRI for classifying benign and malignant FLLs was also performed. All statistical analysis was performed using MedCalc for Windows, version 12.4.0.0 (MedCalc Software, Ostend, Belgium).
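The analysis was run in MedCalc, and the criterion used to select the "optimal" cut-off is not stated. As a minimal sketch of the two ROC quantities involved, the code below computes the AUC as the proportion of malignant/benign pairs that are correctly ordered and picks the cut-off that maximises the Youden index (sensitivity + specificity − 1). The Youden criterion is an assumption, and the stiffness values are invented for illustration.

```python
def roc_auc(positives, negatives):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def youden_cutoff(positives, negatives):
    """Return (cutoff, sensitivity, specificity) maximising the Youden index.

    A value strictly greater than the cutoff is called positive.
    """
    best = None
    for c in sorted(set(positives) | set(negatives)):
        sens = sum(v > c for v in positives) / len(positives)
        spec = sum(v <= c for v in negatives) / len(negatives)
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


# Invented stiffness values in kPa (not study data).
malignant = [4.2, 5.1, 6.0, 7.5, 8.3, 9.9]
benign = [2.1, 2.8, 3.0, 3.4, 3.9, 6.1]

auc = roc_auc(malignant, benign)
cutoff, sens, spec = youden_cutoff(malignant, benign)
print(f"AUC = {auc:.3f}, cut-off > {cutoff} kPa, "
      f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

For a measure such as ADC, where lower values indicate malignancy, the same functions can be applied after negating the values.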

Results

The study group consisted of 124 FLLs comprising 80 malignant (HCC, CCA and MET) and 44 benign (HEM, HCA and FNH) lesions (Table 1). Final diagnoses were established with histopathological confirmation in 45 FLLs (Table 1) and in the remaining 79 lesions with imaging characteristics [13–20], clinical follow-up, and stability of the lesion on follow-up imaging for 2 years or longer.

Table 1 Final diagnosis of focal liver lesions

A single FLL was present in 58 subjects and multiple FLLs (two to eight) in the remaining 21 subjects. The most common benign and malignant tumours were HEM (n = 24) and HCC (n = 57), respectively. The aetiologies of chronic liver disease in subjects with HCC were chronic hepatitis B infection in 34, hepatitis B and C co-infection in one, and cryptogenic cirrhosis in two.

The mean (± SD) size of FLLs was 37.7 ± 33.9 mm (95 % CI, 31.7–43.7 mm). Malignant FLLs were significantly larger than benign FLLs (mean size 43.1 vs. 27.9 mm, p = 0.016).
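The reported interval can be reproduced from the summary statistics alone. The sketch below assumes a t-based interval with 123 degrees of freedom and uses an approximate critical value of 1.979; the exact method used by the statistical software is not stated, so this is a plausibility check rather than the authors' calculation.

```python
import math


def mean_ci(mean, sd, n, critical=1.979):
    """Confidence interval for a mean from summary statistics.

    critical is the two-sided 95 % t value; 1.979 is an approximation
    for 123 degrees of freedom (an assumption of this sketch).
    """
    half_width = critical * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


# Lesion size: mean 37.7 mm, SD 33.9 mm, n = 124 FLLs.
low, high = mean_ci(37.7, 33.9, 124)
print(f"95 % CI: {low:.1f} to {high:.1f} mm")
```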

Interobserver agreement

There was excellent agreement between the two readers, with intraclass correlation coefficients of 0.93 (95 % CI, 0.89–0.96) for mean ADC values and 0.99 (95 % CI, 0.96–0.99) for mean stiffness of FLLs.

DWI of FLLs

The mean (± SD) ADC of the FLLs was 154.2 ± 53.6 × 10⁻³ mm²/s (95 % CI, 144.7–163.7). Benign FLLs had significantly higher mean ADC values than malignant FLLs (200 vs. 129 × 10⁻³ mm²/s, p < 0.001) (Figs. 1 and 2). ROC analysis showed accuracy of 0.87 for ADC in differentiating benign and malignant FLLs, with an optimal cut-off of 1.51 × 10⁻³ mm²/s (p < 0.001). One-way ANOVA revealed significant differences among lesions. HEMs showed the highest mean ADC among the FLLs (Table 2), which was significantly higher than that of FNHs (p = 0.0002) and of all malignant FLLs (p < 0.0001). The mean ADC of HCAs was significantly higher than that of all malignant tumours (p = 0.0006–0.0012) (Figs. 1 and 2). The mean ADC of FNHs was significantly lower than that of HEMs, but was not significantly different from that of any of the malignant tumours (p = 0.06–0.07) or HCAs (Table 2). All malignant tumours had significantly lower mean ADC than HEMs and HCAs. No significant differences in mean ADC value were found among malignant FLLs or between HEMs and HCAs.

Fig. 1

MRE and DWI of focal liver lesions. Examples of hemangioma (top row), focal nodular hyperplasia (second row), hepatocellular carcinoma (third row), and cholangiocarcinoma (bottom row), with representative images from T2-W sequence (first column) and DWI (second column), with corresponding ADC map (third column) and stiffness map from MRE at a similar level (fourth column). The numerical values over the ADC and stiffness maps are mean ± standard deviation of ADC (10⁻³ mm²/s) and stiffness (kPa) values of the lesions. Arrows point to the lesions in ADC and stiffness maps

Fig. 2

Box plot graphs of mean ADC and mean stiffness of benign and malignant FLLs. The centre line of the box plot is the mean value, the box represents the 95 % confidence interval, and whiskers represent the range of the values. Dots represent the outliers

Table 2 Mean apparent diffusion coefficient (ADC) and stiffness (kPa) values of different focal liver lesions

MRE of FLLs

The mean (± SD) stiffness of FLLs was 6.2 ± 3.1 kPa (95 % CI, 5.7–6.8 kPa). Malignant FLLs showed significantly higher mean stiffness than benign FLLs (7.92 vs. 3.12 kPa, p < 0.001) (Fig. 2). ROC analysis revealed accuracy of 0.98 (p < 0.001) (Table 3), with an optimal cut-off value of 4.54 kPa for differentiating malignant from benign FLLs. One-way ANOVA revealed significant differences among FLLs: HCAs had the lowest mean stiffness (Fig. 3), at less than 4 kPa, which was significantly lower than that of HCCs, CCAs and METs (p < 0.0001). No significant differences in mean stiffness were found among HCAs, HEMs, and FNHs (p = 0.98–0.99). The mean stiffness of HEMs was not significantly different from that of FNHs. Stiffness measurements were less than 5 kPa in all but one HEM, which was a 5-cm complex sclerosing hemangioma with mean stiffness of 6.1 kPa. Mean stiffness values were significantly lower for HEMs than for any of the malignant FLLs (p < 0.001). FNHs showed the highest mean stiffness among benign FLLs, but this was not significantly different from that of HEMs or HCAs. With the exception of one case with stiffness of 6.2 kPa, stiffness was less than 5 kPa in all FNHs. Mean stiffness was significantly lower for FNHs than for all malignant FLLs (p < 0.0001).

Table 3 Results of receiver operating characteristic (ROC) curve analysis for MRE and DWI for differentiating benign and malignant focal liver lesions

Fig. 3

Box plot graphs of mean ADC and stiffness values of benign and malignant FLLs. The centre line of the box plot is the mean value, the box represents the 95 % confidence interval, and whiskers represent the range of the values. Dots represent the outliers. HEM hemangioma, HCA hepatocellular adenoma, FNH focal nodular hyperplasia, HCC hepatocellular carcinoma, CCA cholangiocarcinoma, MET metastases

The mean stiffness of HCCs was not significantly different from that of CCAs (p = 0.98) or METs (p = 0.96), but was significantly higher than that of all benign FLLs (p < 0.001) (Table 2). CCAs had the highest mean stiffness among all FLLs, but this was not significantly different from that of METs. The mean stiffness in all CCAs was higher than 5.5 kPa, and was significantly higher than that of benign FLLs (7.92 vs. 3.12 kPa, p < 0.001) (Fig. 3). METs had significantly higher stiffness than all benign FLLs (p < 0.001), but their stiffness was not significantly different from that of HCCs and CCAs.

Comparison of MRE and DWI

A significant negative correlation (r = −0.54, p < 0.0001, 95 % CI, −0.65 to −0.40) was present between stiffness and ADC values of FLLs (Fig. 4). There was also a significant correlation between tumour size and stiffness (r = 0.43, 95 % CI, 0.28–0.57, p < 0.0001), but there was no correlation between tumour size and ADC (r = 0.14, p = 0.13).
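The confidence interval for the stiffness-ADC correlation is consistent with the usual Fisher z-transformation applied to n = 124 lesions. The sketch below shows that calculation; whether the statistical software used exactly this approach is an assumption, and the function name is illustrative.

```python
import math


def pearson_ci(r, n, z_crit=1.96):
    """Approximate 95 % CI for a Pearson r via the Fisher z-transformation."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)                  # Fisher transform of r
    se = 1.0 / math.sqrt(n - 3)        # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Stiffness versus ADC: r = -0.54 across 124 FLLs.
lower, upper = pearson_ci(-0.54, 124)
print(f"95 % CI: {lower:.2f} to {upper:.2f}")
```

Because the published r values are rounded to two decimals, intervals recomputed this way can differ from the reported ones in the last digit.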

Fig. 4

Scatter diagram showing the correlation between mean stiffness and mean ADC values of 124 FLLs. Open circles represent benign lesions and black dots represent malignant lesions

ROC analysis showed that MRE performed better than DWI for differentiating malignant from benign FLLs (Table 3). Comparison analysis of ROC curves showed MRE performing significantly better than DWI (0.986 vs. 0.87, p = 0.0016) (Figs. 5 and 6).

Fig. 5

Graph showing comparison of receiver operating characteristic curves for MRE and DWI for differentiating benign and malignant focal liver lesions

Fig. 6

Graph showing comparison of ROC curves for MRE, DWI, and conventional MRI for differentiating 117 benign and malignant focal liver lesions

Further analysis showed that, with an optimal ADC cut-off value, DWI misclassified 21 FLLs (17 % of the study population): 13 benign FLLs (5 HCAs, 7 FNHs and 1 HEM) and 8 malignant FLLs (all HCCs). MRE misclassified only 5 FLLs (4 % of the study group): 2 benign FLLs (1 case each of FNH and HEM) and 3 malignant FLLs (all HCCs). The 3 HCCs were small (18–21 mm) and showed stiffness less than 4.5 kPa. The FNH and HEM, which had higher stiffness, were larger lesions (80 and 55 mm) with a central scar and sclerosis, respectively.
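The misclassification counts above, together with the 80 malignant and 44 benign lesions in the study group, are enough to recompute the standard diagnostic indices. The sketch below is a recalculation from those counts only; the figures in Table 3 are not reproduced here and may differ slightly because of rounding or method. The function name is illustrative.

```python
def diagnostic_metrics(n_malignant, n_benign, missed_malignant, missed_benign):
    """Diagnostic indices with malignancy as the positive class."""
    tp = n_malignant - missed_malignant   # malignant lesions called malignant
    fn = missed_malignant                 # malignant lesions called benign
    tn = n_benign - missed_benign         # benign lesions called benign
    fp = missed_benign                    # benign lesions called malignant
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (n_malignant + n_benign),
    }


# Counts taken from the text: 80 malignant and 44 benign FLLs.
mre = diagnostic_metrics(80, 44, missed_malignant=3, missed_benign=2)
dwi = diagnostic_metrics(80, 44, missed_malignant=8, missed_benign=13)

for name, result in (("MRE", mre), ("DWI", dwi)):
    summary = ", ".join(f"{k} = {v:.1%}" for k, v in result.items())
    print(f"{name}: {summary}")
```

On these counts, the gap between the two techniques lies mainly in specificity, since DWI misclassified 13 of the 44 benign lesions.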

Comparison of conventional imaging with DWI and MRE

The readers correctly classified 104/124 (84 %) lesions. Of the remaining 20 lesions, 5 malignant lesions were misclassified as benign, 8 benign lesions were misclassified as malignant, and 7 were thought to be indeterminate. With ROC analysis of 117 lesions (excluding 7 lesions deemed indeterminate), the area under the ROC curve of MRE (0.99) was found to be superior to that of both DWI (0.87, p = 0.0009) and conventional images (0.87, p = 0.0006). There was agreement between conventional images and DWI in 90/117 cases (77 %) and with MRE in 102/117 (87 %) cases.

Using DWI, 6/7 (86 %) indeterminate lesions would have been correctly classified, whereas MRE would have correctly classified all 7 indeterminate cases.

Discussion

Our study results show that MRE is significantly superior to DWI for distinguishing between malignant and benign FLLs, with > 95 % sensitivity, specificity, and accuracy for MRE. Among all FLLs, mean stiffness was lowest for HCAs, at less than 3.1 kPa, although this overlapped with HEMs and FNHs. FNHs had the highest stiffness among benign lesions, but did not differ significantly from HEMs or HCAs. This trend of differentiation with MRE may be useful for characterization of benign FLLs; however, additional studies are needed to confirm our study results.

CCAs had the highest mean stiffness and lowest mean ADC values among FLLs. CCAs are known to be scirrhous, with an increased amount of fibrous stroma [1, 21], a characteristic that would be expected to increase stiffness and restrict diffusion to a greater extent than in predominantly cellular HCCs. A stiffness cut-off value of > 5.45 kPa correctly identified all CCAs, but specificity was < 50 %, suggesting an overlap in stiffness with other malignant FLLs. However, the number of CCAs in our study was small, and additional research is needed to evaluate significant differences among malignant FLLs. MRE provides a non-invasive quantitative parameter that may be useful for differentiating common benign and malignant FLLs, but our results must be confirmed in future studies in order to determine the clinical utility of MRE.

The results with MRE in our study are similar to those reported in the limited literature [1, 2]. For example, Venkatesh et al. [1] showed successful differentiation of all benign and malignant tumours using a cut-off of 5 kPa. In the present study, our cut-off was lower, and the accuracy was 98.6 %, which is likely a result of differences in composition of the study group and the larger number of benign lesions included in our study. We also found stiffness higher than 5 kPa in a HEM with sclerosis and an FNH with a fibrotic central scar. A small number of HCCs had a mean stiffness less than 5 kPa. Stiffness values should be interpreted carefully alongside other imaging features when increased stiffness is encountered in otherwise benign-appearing FLLs, especially in patients with no known risk factors for malignancy. Similarly, smaller malignant FLLs may occasionally demonstrate lower stiffness, as shown in our study, which may be a result of partial volume effects due to the lower resolution of stiffness maps with the current MRE technique. The use of a 3D MRE sequence and improved resolution may be helpful for addressing partial volume effects in the case of smaller-volume FLLs.

In an MRE study of 72 lesions, Garteiser et al. [2] showed significantly higher shear and loss modulus in malignant lesions as compared to benign lesions, but no significant difference in storage modulus. In that study, the shear modulus was significantly different among lesions, similar to our study. The authors also found significant differences in loss modulus between HCCs and FNHs, HEMs and HCAs. In our study, all three malignant lesion types demonstrated significantly higher stiffness than all three benign lesion types. The differences between studies may be attributed to the composition of the study groups, although HCC was the most common FLL in both studies. Furthermore, the study by Garteiser et al. was performed with a mechanical frequency of 50 Hz, compared to 60 Hz in our study. Whether accuracy is influenced by operating frequency has yet to be determined. Another difference was that shear, storage, and loss moduli were analysed separately in their study, whereas we studied only shear modulus. The current MRE inversion algorithm used in our study does not generate storage and loss moduli parameters, and therefore they were not available for comparison. The use of a 3D MRE technique may allow determination of these components of tumour stiffness.

We also demonstrated an association between lesion size and stiffness, most likely because malignant tumours were generally larger (43 vs. 28 mm) in our study group. However, ADC was not significantly associated with lesion size, probably due to the lower accuracy of ADC in distinguishing between benign and malignant lesions. In a previous study, Venkatesh et al. [1] found a non-significant linear correlation between the size of the lesions and shear stiffness. The mean size of the FLLs in that study was 3.8 cm, similar to our study. In another study, by Garteiser et al. [2], no such correlation analysis was performed. Further studies are needed to confirm the relationship between the size of FLLs and shear stiffness.

With DWI, there was a significant difference in mean ADC between benign and malignant FLL groups. Both HEMs and HCAs had significantly higher mean ADC than all malignant FLLs. The ADC in FNHs was significantly different only from that of HEMs. There was a trend for higher ADC in HCAs than in FNHs, but it was not significantly different. These results are similar to those in a study by Sandrasegaran et al. [6], while they differ from those in another study that showed higher mean ADC values for FNHs than HCAs [20]. These disparities may be due to the number of lesions studied, as our study group had a smaller number of both FNHs and HCAs. There were no significant differences among malignant FLLs.

The superior performance of MRE is likely due to the fact that MRE measures the mechanical property of tissue, whereas DWI measures diffusion, which is affected by vascular perfusion and capillary microcirculation. The influence of vascular perfusion on stiffness in FLLs is not well known and needs further investigation. The performance of DWI is variable, and our study results are within the reported ranges of 74–100 % sensitivity and 77–100 % specificity [3–8, 22–24]. The results with DWI may be affected by the population studied and the proportion of benign lesions included. Most previous studies included benign cysts, predominantly hemangiomas, which have significantly higher ADC measurements, and thus influence cut-off values.

The stiffness of FLLs evaluated with MRE is not dependent on the stiffness of the surrounding liver. Therefore, a benign lesion with lower stiffness in a fibrotic/cirrhotic liver would appear as a focal area of lower stiffness against a background of increased stiffness of fibrotic liver parenchyma. Similarly, a malignant liver lesion would appear as a focal area of increased stiffness that may be lower or higher than surrounding fibrotic parenchyma, depending on the severity of fibrosis. Interestingly, in one study, increased surrounding liver parenchyma stiffness was shown to be a risk factor for developing HCC in chronic liver disease [25], while another study showed no such association in the case of compensated cirrhosis [26]. The influence of parenchymal liver stiffness on the development of HCC needs to be clarified in future studies.

Interestingly, in our study, the performance of DWI was similar to that of conventional imaging, while MRE performed significantly better than conventional imaging. Our study also demonstrated the utility of both DWI and MRE in cases of indeterminate lesions with interpretation of standard MRI sequences. Overall, however, MRE performed better than DWI and would be potentially more useful for classifying FLLs as benign or malignant.

Our study has certain limitations. First, histological proof was not available for every FLL. However, this limitation is unavoidable, as it is generally not common practice to obtain histological evidence when imaging criteria for HCC are met or when characteristic imaging features are seen in benign FLLs. Second, we performed DWI only with two b-values. Although it has been proposed that the use of more b-values would obtain better results, studies with more than two values of b > 0 have not produced significantly superior results. A study suggests that DWI with two b-values is generally sufficient [3]. Our results with DWI are within the range of results reported in the literature. Studies with intravoxel incoherent motion (IVIM) imaging have shown mixed results for differentiation of benign and malignant liver masses [27, 28], and additional investigation of IVIM is needed to evaluate its utility in differentiating benign from malignant FLLs. Overall, DWI suffers from a lack of standardized b-values and variable reproducibility across platforms. Third, we had a small number of HCAs. Differentiation of HCAs from HCCs in clinical practice would be very useful. During our study period, we could include only five HCAs with final diagnosis, as several other HCAs were excluded for lack of final confirmation or follow-up imaging to confirm their stability. In our study, HCAs had the lowest stiffness and this was significantly different from all malignant lesions. A future study comparing the stiffness of HCAs with HCCs and FNHs would be useful.

Although we have demonstrated the additive value of MRE and DWI to conventional imaging, we believe that this should be performed prospectively and in a blinded fashion. A prospective study assessing the clinical utility of MRE is being planned at our institute.

In summary, MRE performed better than DWI, shows great promise as a non-invasive alternative to biopsy for the differentiation of benign and malignant FLLs, and demonstrates additive value to conventional imaging. However, there is opportunity for further enhancement of the technique, particularly with regard to optimization of image quality and evaluation of smaller FLLs.