Introduction

Hepatocellular carcinoma (HCC) is the most common primary hepatic malignancy, accounting for up to 90% of cases, and is the second leading cause of cancer-related death worldwide [1]. According to the current Barcelona Clinic Liver Cancer (BCLC) guidelines, patients diagnosed with early-stage HCC can be curatively treated by surgical resection, radiofrequency ablation, or liver transplantation [2]. However, 80% of patients are not eligible for curative treatment at the time of HCC diagnosis because of advanced tumor stage and/or background liver disease [3]. In the past two decades, multiple locoregional therapies (LRTs) have been developed for advanced HCC or with a curative prospect in patients who are not candidates for liver transplantation. LRT can also be used for tumor downstaging to make patients eligible for transplantation [4]. Yttrium-90 (90Y) transarterial radioembolization (TARE) is a growing LRT technique with data supporting its use with curative intent, as neoadjuvant therapy before surgical resection or transplantation [5,6,7,8,9,10,11,12], and in cases of advanced-stage HCC with macrovascular invasion [13, 14].

Radiologic assessment of the response after LRT can be challenging, as the lesion may demonstrate heterogeneous necrosis and typically shows no reduction in size, at least initially. Among LRT techniques, TARE is distinctive in that the emitted radiation may increase blood flow, which may translate into increased enhancement in the treated area (the so-called radiation effect) [15] and lead to uncertainty over the presence/absence of residual disease [16]. Traditional criteria for assessing response to therapy, such as the Response Evaluation Criteria in Solid Tumors (RECIST), have been adapted in order to achieve a more accurate evaluation of lesions after LRT. As such, the European Association for the Study of the Liver (EASL) criteria and the modified RECIST criteria (mRECIST), which assess only the viable tumor portion instead of the whole lesion, have been adapted to LRT assessment [17,18,19,20]. In addition, the use of post-contrast image subtraction with MRI has been shown to be useful for tumor necrosis assessment [21,22,23]. Radiopathologic correlation is crucial in order to understand and correctly interpret the imaging spectrum present in treated tumors after TARE. The assessment of the degree of necrosis is of particular interest as it determines tumor viability. Published data on pathologic correlation in patients with HCC treated with TARE are limited to small series (9–35 patients) [8, 19, 24,25,26]. Previous studies have emphasized the challenge of assessing HCC response post TARE [8, 24], with poor to moderate inter-reader agreement using mRECIST or LI-RADS-TR in one study [19]. King et al showed a correlation between mRECIST, qEASL (3D measurement of the lesion), and LI-RADS-TR criteria and percentage of necrosis at pathology [19].
Additional imaging features including rim enhancement, lack of hepatocyte-specific contrast uptake, T2-hyperintensity, and plateau or persistent enhancement were described as having potential added value to classic criteria of response assessment [8, 24]. With the increased use of TARE for HCC, there is a need for a clear definition of tumor response/necrosis.

Our objective was to assess the diagnostic performance of gadoxetate disodium–enhanced MRI for predicting complete pathologic necrosis (CPN) of HCC treated with TARE, using histopathology as the reference standard.

Material and methods

Patients

This single-center, HIPAA-compliant retrospective study was approved by the local Institutional Review Board. A waiver of informed consent was obtained. All consecutive patients with HCC treated with TARE who underwent liver transplantation or resection and contrast-enhanced MRI using gadoxetate disodium ≤ 90 days prior to surgery between November 2012 and March 2019 were identified through the institutional liver transplant database. Inclusion criteria were as follows: (1) adult cirrhotic patients who underwent liver transplant or resection, (2) MRI with gadoxetate disodium performed with adequate technique and quality ≤ 90 days prior to liver transplantation or resection, (3) segmental/lobar TARE prior to imaging with a minimal delay of 12 weeks between TARE and imaging. Among 57 potentially eligible patients, 10 patients were excluded (Fig. 1). Clinical variables including demographics, body mass index, etiology of liver disease, Child-Pugh score, Model for End-Stage Liver Disease score, BCLC classification, serum alpha-fetoprotein, 90Y dose, type of treatment (lobar/segmental), and surgery type (resection/transplantation) were recorded.

Fig. 1

Flowchart of the study population. Abbreviations: HCC: hepatocellular carcinoma; TARE: transarterial radioembolization; LT: liver transplantation; MRI: magnetic resonance imaging; LRT: locoregional therapy; CPN: complete pathologic necrosis

MRI acquisition

We analyzed the MRI closest to surgery (< 90 days) and included only patients with a minimal delay of 12 weeks between TARE and imaging, so as to assess the response adequately. Most patients had a longer delay between TARE and surgery, with multiple MRIs performed; of those, only the last MRI before surgery (performed < 90 days before surgery) was considered for the present study.

MRI of the abdomen was performed using various 1.5-T/3.0-T clinical systems with abdominal phased-array coils, using parallel imaging with a field of view of 300–400 mm (Magnetom Avanto [n = 6], Skyra [n = 8], Aera [n = 16], Siemens Healthineers; and Signa 1.5T [n = 17], GE Healthcare). The liver MRI was a standard-of-care contrast-enhanced exam including dynamic phase images (pre-contrast, early and late arterial phases [AP], portal venous phase [PVP], transitional phase) as well as hepatobiliary phase (HBP) images at 10 and 20 min post-injection of a fixed dose of 10 mL of gadoxetate disodium (Eovist/Primovist, Bayer Healthcare). The MRI protocol is presented in Table 1. Subtraction datasets (contrast-enhanced phases minus pre-contrast images) were generated automatically for all MRI exams.

Table 1 MRI protocol used in our study

Index lesion selection

The study coordinator (N.V.V., a radiologist with 2 years of experience in abdominal imaging) reviewed the MRI exams using a picture archiving and communication system (PACS) and the pathology reports to identify the index lesions. If multiple lesions were present in the same patient, up to three treated HCCs (≥ 1 cm) were included. Size, segment location, and series/image numbers of each lesion were recorded and labeled as lesion1/lesion2/lesion3 to ensure that the radiologists and the pathologist assessed the same lesion. Anonymized evaluation sheets were completed by the observers.

Image analysis

Two radiologists (J.G., A.L.), both with 3 years of experience in abdominal imaging, independently reviewed the images on PACS. The readers were aware of the study aims but blinded to all clinical, radiological, and pathological reports. Exams were reviewed in a random order. If multiple lesions were present, all lesions were scored in the same reading session. The readers were asked to report the following parameters on late AP, PVP, and HBP (20 min post-contrast): LI-RADS v2018 treatment response algorithm (nonviable/equivocal/viable [27]), RECIST (cm), mRECIST (cm), and EASL (cm2) on non-subtracted images, as well as degree of tumor necrosis (defined visually as non-enhancing tissue) on subtracted post-contrast images, in 10% increments. As the characteristics of tumor enhancement during the transitional phase are not well understood, we did not assess transitional phase images in our study [28].

Histopathological analysis

An experienced liver pathologist (M.I.F., 15 years of experience), blinded to the MRI results and to the clinical pathology report, retrospectively reviewed gross and microscopic findings in all index lesions and reported the degree of pathologic necrosis in 10% increments. Percentage of necrosis and size at pathology were considered the reference standard. For each lesion, tumor grade, presence/absence of microvascular invasion, satellite lesions, and T-stage were reported.

Statistical analysis

Imaging data was compared to histopathological data on a per-lesion basis. Reader assessments were pooled across the two readers by averaging their values when comparing them to the reference standard.

Reader agreement was assessed using the kappa coefficient (K) for the binary assessment of LI-RADS-TR, and the intraclass correlation coefficient (ICC) for the numeric measures (RECIST, EASL, mRECIST, percentage of necrosis). K was interpreted as indicating poor agreement when K < 0, slight agreement when 0 ≤ K ≤ 0.2, fair agreement when 0.2 < K ≤ 0.4, moderate agreement when 0.4 < K ≤ 0.6, and substantial agreement when K > 0.6. The ICC was interpreted as follows: poor reliability (< 0.5), moderate reliability (0.5–0.75), good reliability (0.75–0.9), and excellent reliability (> 0.90) [29].
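The interpretation scales above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis code used in the study (which was run in SAS): the function names are hypothetical, the kappa shown is the unweighted Cohen's kappa for two readers, and the ICC boundary at exactly 0.75 is assigned to "good" because the text leaves that edge ambiguous.

```python
from collections import Counter


def cohen_kappa(reader1, reader2):
    """Unweighted Cohen's kappa for two readers' categorical labels."""
    n = len(reader1)
    # Observed proportion of lesions on which the two readers agree.
    observed = sum(a == b for a, b in zip(reader1, reader2)) / n
    # Agreement expected by chance, from each reader's label frequencies.
    c1, c2 = Counter(reader1), Counter(reader2)
    expected = sum(c1[label] * c2[label] for label in c1) / (n * n)
    # Undefined (division by zero) if both readers use one identical label.
    return (observed - expected) / (1 - expected)


def interpret_kappa(k):
    """Map kappa to the agreement categories listed in the text."""
    if k < 0:
        return "poor"
    if k <= 0.2:
        return "slight"
    if k <= 0.4:
        return "fair"
    if k <= 0.6:
        return "moderate"
    return "substantial"


def interpret_icc(icc):
    """Map ICC to reliability categories (0.75 treated as 'good': assumption)."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


# Hypothetical toy labels for four lesions, not study data.
r1 = ["viable", "viable", "nonviable", "nonviable"]
r2 = ["viable", "nonviable", "nonviable", "nonviable"]
kappa = cohen_kappa(r1, r2)
print(kappa, interpret_kappa(kappa))
```

Applied to values on these scales, a kappa of 0.4 falls in the "fair" band, an ICC of 0.85 in the "good" band, and an ICC of 0.7 in the "moderate" band.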

Spearman rank correlations were used to characterize the association of measures with degree of necrosis on pathology, and ICC was calculated between degree of necrosis on image subtraction and on pathology.
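For readers less familiar with the Spearman rank correlation, the following is a minimal pure-Python sketch: each variable is converted to ranks (ties receive the average rank, which matters here because necrosis is scored in 10% increments), and the Pearson correlation of the ranks is returned. The function names and the toy values are hypothetical and are not study data.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


# Hypothetical percent-necrosis values for five lesions (imaging vs pathology).
imaging = [100, 90, 30, 0, 60]
pathology = [100, 80, 40, 10, 50]
print(spearman(imaging, pathology))
```

Because the two toy lists are ordered identically, the sketch returns a rho of 1; real reader and pathology scores would yield lower values.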

Demographic and clinical variables, as well as TARE (including TARE type, 90Y doses), imaging, and pathological (including number of HCC, HCC size, percentage of necrosis and histological grade) variables, were considered predictors of CPN using generalized estimating equations to account for the correlation among lesions from the same patient. Predictors were entered one at a time to obtain unadjusted odds ratios and 95% confidence intervals. For multivariable analysis, a backward selection procedure was employed, taking as candidate predictors any variables whose unadjusted association had a significance level of p < 0.10. The final model thus selected was then assessed for collinearity. When predictors were sufficiently correlated to indicate a collinearity problem (variance inflation factor > 10), they were removed one at a time, and the model with the best fit to the data as determined by the smallest QIC (Quasi-likelihood under the Independence model Criterion) statistic was chosen [30]. Receiver operating characteristic (ROC) analyses were used to assess the ability of the significant measures (based on univariate analysis) to predict CPN, with the area under the ROC curve (AUC) and corresponding confidence limits as the performance metric. Optimal cutoffs from the ROC analyses were established for each measure based on the value that minimized the Euclidean distance from the point on the ROC curve representing sensitivity and specificity of one (100%). All statistical tests were conducted at the two-sided 5% significance level using SAS 9.4 (SAS Institute).
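The cutoff rule described above (the point on the ROC curve closest to perfect sensitivity and specificity) can be illustrated with a short Python sketch. It is not the SAS code used in the study; the function name, the convention that a lesion is called positive when its score is at or above the cutoff, and the toy data are all assumptions made for illustration.

```python
import math


def closest_to_corner_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) minimizing the Euclidean
    distance to the point where sensitivity = specificity = 1.

    labels: 1 = complete pathologic necrosis, 0 = viable tumor present.
    A lesion is called positive when score >= cutoff (assumed convention).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        sens = tp / positives
        spec = tn / negatives
        dist = math.hypot(1 - sens, 1 - spec)
        # Keep the first cutoff reaching the smallest distance.
        if best is None or dist < best[0]:
            best = (dist, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical radiologic percent-necrosis scores and pathology labels.
scores = [100, 100, 90, 100, 80, 60, 30, 90]
labels = [1, 1, 1, 0, 0, 0, 0, 1]
print(closest_to_corner_cutoff(scores, labels))
```

On this toy input the selected cutoff is 90% radiologic necrosis, with sensitivity 1.0 and specificity 0.75. Other cutoff rules (for example, maximizing the Youden index) can select a different threshold on the same curve.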

Results

The final study population included 47 patients (M/F: 35/12, mean age: 62 years), with a total of 71 HCCs (mean size at pathology: 2.8 ± 1.7 cm, range: 0.5–7.5 cm). TARE was delivered at the lobar level in 31 patients and at the segmental level in 12 patients, while 4 patients underwent other treatment distributions: whole liver (n = 1), right lobe + caudate lobe (n = 2), and right lobe + segment IV (n = 1). The mean delay between TARE and MRI was 6 months (range: 4–27 months), and the mean delay between TARE and surgery was 8 months (range: 5–29 months). Patient characteristics are presented in Table 2.

Table 2 Cohort characteristics

Histopathologic findings

There were 28 patients with 1 HCC, 17 with 2 HCCs, and 3 with 3 HCCs. Fifty-seven tumors were in the right hepatic lobe, 12 in the left hepatic lobe, and 2 in both hepatic lobes. Mean percentage of necrosis was 82% ± 32% (range 0–100%). CPN was observed in 42 (59.2%) HCCs, 90–99% necrosis in 10 (14.1%), 80–89% necrosis in 3 (4.2%), 70–79% necrosis in 2 (2.8%), 60–69% necrosis in 4 (5.6%), 30–39% necrosis in 2 (2.8%), and 10–19% necrosis in 1 (1.4%), while 7 HCCs were completely viable (0% necrosis, 9.9%).

Of the 29 HCCs with a viable component, 7 were well differentiated, 19 were moderately differentiated, and 3 were poorly differentiated. Cirrhosis was present in 44/47 patients; the 3 remaining patients had hepatitis B virus infection without cirrhosis.

MRI findings

Inter-reader agreement

There was good inter-reader agreement for assessing radiologic degree of necrosis using subtraction on AP/PVP (ICC: 0.85/0.86) and for RECIST (ICC: 0.82), while it was moderate for HBP subtraction (ICC: 0.7), mRECIST (ICC: 0.65), and EASL (ICC: 0.62). Agreement was fair for LI-RADS-TR (K: 0.4).

Prediction of CPN

Univariate logistic regression showed that mRECIST, EASL, LI-RADS-TR, and subtraction (at all assessed phases) were all significant predictive factors of CPN (Table 3). In multivariable analysis, only EASL [OR: 0.74 (confidence intervals, CIs: 0.57–0.96), p = 0.02], subtraction on AP [OR: 0.73 (0.55–0.97), p = 0.03], and PVP [OR: 0.71 (0.51–0.89), p = 0.02] were independent predictive factors of CPN. No other variable was predictive of CPN, including DWI signal.

Table 3 Univariate logistic regression to assess for predictors of complete pathologic necrosis of HCC

mRECIST, EASL, LI-RADS-TR, and subtraction were all significant predictors of CPN (pooled AUCs: 0.83–0.84 for radiologic percentage of necrosis on subtraction and 0.79–0.80 for mRECIST, EASL, and LI-RADS-TR; p < 0.0001), with a significant difference observed between subtraction and LI-RADS-TR (p = 0.04) for reader 2 (Figs. 2 and 3). While we observed no difference in sensitivity, the specificity of subtraction on AP/PVP was significantly higher than that of mRECIST (p = 0.04), EASL (p = 0.04), and LI-RADS-TR (p = 0.01) for prediction of CPN. The specificity of subtraction on HBP was significantly higher than that of LI-RADS-TR (p = 0.01) for the prediction of CPN. RECIST was not a significant predictor of CPN (AUC: 0.51, p = 0.17) (Table 4, Fig. 4).

Fig. 2

A 62-year-old man with HCV cirrhosis and right hepatic lobe HCC (arrows) treated with lobar TARE. Non-subtracted images are shown (A: AP, B: PVP, and C: HBP). On image subtraction, the degree of tumor necrosis was rated as 100% on AP (D), PVP (E), and HBP (F) by readers 1 and 2, respectively. Measurements of both observers were for mRECIST: 0/0 cm, EASL: 0/0 cm2, LI-RADS-TR: LR-TR nonviable (both observers), RECIST: 3.8/4 cm. Note decreased hepatobiliary uptake in surrounding peritumoral liver (asterisks on C and F). Corresponding gross pathologic image shows completely necrotic lesion (arrow) measuring 4 cm in cirrhotic liver (G). Corresponding microscopic H&E image (×10) shows completely necrotic HCC with embolization beads noted (H, arrow). Abbreviations: HCV: hepatitis C virus; HCC: hepatocellular carcinoma; TARE: transarterial radioembolization; AP: arterial phase; PVP: portal venous phase; HBP: hepatobiliary phase; RECIST: Response Evaluation Criteria in Solid Tumors; mRECIST: modified RECIST; EASL: European Association for the Study of the Liver; LI-RADS-TR: liver imaging reporting and data system for treated tumor; LR-TR: LI-RADS treatment response

Fig. 3

A 58-year-old man with HCV cirrhosis and right hepatic lobe HCC (arrows) treated with segmental TARE. Non-subtracted images are shown (A: AP, B: PVP and C: HBP). On image subtraction, tumor necrosis was rated as 30% on AP (D), PVP (E), and HBP (F) by observers 1 and 2, respectively. Note some misregistration artifact on PVP (E). Measurements of both observers were for mRECIST: 2.2/2.4 cm, EASL: 3.8/4.3 cm2, LI-RADS: LR-TR viable (both observers), RECIST: 2.9/3 cm. Corresponding gross pathologic picture shows 2.7-cm HCC (arrows) with 30% necrosis in cirrhotic liver (G). Corresponding microscopic H&E image (×10) shows viable HCC (arrows) and a portion of necrotic tumor around (H). Abbreviations: HCV: hepatitis C virus; HCC: hepatocellular carcinoma; TARE: transarterial radioembolization; AP: arterial phase; PVP: portal venous phase; HBP: hepatobiliary phase; RECIST: Response Evaluation Criteria in Solid Tumors; mRECIST: modified RECIST; EASL: European Association for the Study of the Liver; LI-RADS-TR: liver imaging reporting and data system for treated tumor; LR-TR: LI-RADS treatment response

Table 4 ROC analysis to assess the diagnostic performance (expressed as AUC estimates and 95% confidence intervals, and sensitivity/specificity) of imaging parameters selected from univariate analysis as predictors of complete pathologic necrosis of HCC
Fig. 4

Receiver operating characteristics curves for significant MRI criteria (selected based on univariate analysis). ROC curves for image subtraction at the AP (sub AP), PVP (sub PVP) and HBP (sub HBP), mRECIST, EASL and LI-RADS-TR as individual predictors of complete pathologic necrosis. Abbreviations: MRI: magnetic resonance imaging; ROC: receiver operating characteristic; AP: arterial phase; PVP: portal venous phase; HBP: hepatobiliary phase; mRECIST: modified RECIST; EASL: European Association for the Study of the Liver; LI-RADS-TR: liver imaging reporting and data system treatment response

Degree of tumor necrosis on subtraction was significantly correlated to histopathologic degree of tumor necrosis at all phases (AP r = 0.63, PVP and HBP r = 0.64, all p < 0.0001), without significant difference between phases (p-range: 0.33–0.92).

Discussion

In our study, we observed that TARE is an effective therapy for HCC, with 59% (42/71) of tumors demonstrating CPN, in line with previous reports [8, 24]. When considering radiologic assessment of tumor response, we found that image subtraction, mRECIST, EASL, and LI-RADS-TR have excellent performance for predicting CPN in HCC post TARE, confirming the results from previous studies assessing different LRTs [8, 21]. In addition, the degree of tumor necrosis on image subtraction demonstrated a strong correlation with the histopathologic degree of tumor necrosis. Image subtraction on late AP and PVP also had good inter-reader agreement, superior to that of mRECIST, EASL, and LI-RADS-TR. Inter-reader agreement for assessment of tumor necrosis on HBP subtraction was lower than on AP and PVP subtraction, likely because HBP is not typically used for the assessment of necrosis, which results in more reader variation. Finally, demographic, clinical, and qualitative radiological (other than those mentioned above) and tumor size characteristics were poor predictors of CPN, while EASL and subtraction at AP and PVP were independent predictors of CPN in multivariable analysis.

Evaluation of tumor response after TARE can be challenging due to the following: (1) necrosis rather than shrinkage of the tumor and (2) the presence of radiation effect in the surrounding liver parenchyma [31]. The radiation effect relates to diffuse increase in blood flow secondary to TARE that may translate as increased contrast enhancement in the surrounding liver, which may lead to confusion over the presence/absence of residual tumor [16]. Typically, the enhancement of the surrounding peritumoral liver decreases approximately 3 months post-treatment, and residual tumor and/or local tumor progression become easier to assess [32]. Thus, a delay of at least 3 months after TARE is recommended for imaging follow-up [32, 33]. In the present study, the mean delay between TARE and MRI was 6 months, with decreased effect of radiation. Chronic appearance of the treated zone is retraction and fibrosis (with enhancement on late post-contrast phases when using extracellular contrast media) [31]. There are a few studies assessing HCC response to TARE with MRI compared to histopathology (between 9 and 35 patients) [8, 19, 24,25,26]. These studies evaluated the diagnostic performance of EASL and mRECIST criteria for the diagnosis of CPN. Riaz et al showed that EASL and mRECIST criteria were predictive of the degree of pathologic necrosis and the presence of rim enhancement was correlated with histopathologic necrosis [8]. Ahmed et al also demonstrated a correlation between each of EASL and mRECIST criteria with histopathologic degree of necrosis. Additional features that correlated with histopathologic necrosis included the following: absence of HBP uptake, increased T2 signal, and plateaued or persistent enhancement after contrast injection [24]. King et al found a correlation between LI-RADS-TR criteria with histopathologic necrosis in 9 patients [19]. None of these studies assessed image subtraction. 
In another study including various LRT techniques with only 5 patients treated by TARE, Gordic et al showed the predictive value of image subtraction, which strongly correlated with histopathologic necrosis [21]. The present study confirms these results, highlighting the potential benefit of image subtraction for the assessment of necrosis after TARE and showing the predictive value of EASL criteria for diagnosis of CPN. Our results demonstrate good specificity (82.8%) for the prediction of CPN using subtraction, while sensitivity was lower (61.9–69.1%). This could be due to the interval between imaging and surgery, with possible ongoing necrosis (even though we selected patients with < 90 days between imaging and surgery), and to the possibility of residual enhancement due to granulation tissue. Also, as in the study by Gordic et al [21], we did not find any predictive role of DWI for diagnosis of CPN.

There is limited data on the role of image subtraction in HCC post LRT with histopathologic correlation [21,22,23]. These previous studies included mostly patients treated with ablation or transarterial chemoembolization and demonstrated that image subtraction had excellent diagnostic performance for diagnosis of tumor necrosis, with AUC ranging from 0.82 to 0.90 for the AP and 0.85 to 0.88 for the PVP, equivalent to our results. These studies also showed good inter-reader agreement, as in our current study, in which subtraction was better than mRECIST, EASL, and LI-RADS-TR in terms of inter-reader agreement (fair or moderate agreement). Correlation between necrosis at image subtraction and necrosis at histopathology reported by Mannelli et al and Gordic et al (AP, r = 0.71–0.89, PVP, r = 0.72–0.91, p < 0.001 for both studies) was slightly better than ours, possibly due to the different LRTs assessed. These previous studies did not use gadoxetate disodium and thus did not assess the percentage of necrosis using HBP. In the present study, we showed no difference in the diagnostic performance of subtraction obtained during different post-contrast phases (excluding transitional phase). Our results suggest that gadoxetate disodium could be used for the follow-up of patients treated with TARE, as it allows index tumor response assessment and detection of new lesions with high sensitivity on the HBP [34]. However, subtraction based on HBP was not an independent predictive factor of CPN on multivariable logistic regression, unlike subtraction based on AP/PVP and EASL. Consequently, we suggest that subtraction should be assessed on AP and PVP and could be part of tumor response assessment to TARE, providing a linear assessment of response that mirrors histopathology. One limitation of subtraction is that it is not systematically used with CT imaging, which limits its application to MRI [35].
Moreover, subtraction is limited by the need for good dynamic image quality with reproducible breath-holds in order to avoid misregistration between pre- and post-contrast phases [36]. In our study, we did not assess the effect of subtraction misregistration on diagnostic performance. Future studies will need to assess the effect of misregistration on response assessment, particularly with regard to the risk of AP respiratory artifacts that are known to be associated with the use of gadoxetate disodium [37, 38].

Our study has several limitations. First, it is a retrospective single-center study with a relatively small sample size. However, all tumors were verified at pathology. Second, there was an inherent selection bias for patients eligible for surgery and transplantation, who are more likely to have smaller tumors. This selection bias was counterbalanced by a uniform standard of reference. Third, contrast-enhanced sequences were assessed during the same reading session, which limits the comparison between sequences. Fourth, there is a possibility of interval tumor progression/response between MRI and surgery in the 90-day window. Fifth, we assessed various TARE treatment options, including segmental and lobar, reflecting our evolving experience, with the trend towards more segmental treatments. Future studies should focus on segmental TARE in small tumors.

In conclusion, subtraction (on the AP and PVP) and EASL criteria are independent predictors of CPN with excellent performance for diagnosing CPN in HCC treated with TARE. Image subtraction demonstrated good inter-reader agreement and showed strong correlation with histopathologic degree of necrosis. Thus, we recommend the use of subtraction when assessing HCC response to TARE with MRI.