Introduction

Primary sclerosing cholangitis (PSC) is a chronic liver disease characterized by bile duct inflammation and fibrosis leading to cholestasis and parenchymal injury [1, 2]. The clinical course of the disease is variable; however, a typical pattern shows slow progression that eventually leads to biliary cirrhosis with portal hypertension and hepatic decompensation. The diagnosis of PSC rests on detection of cholestasis on liver tests and characteristic bile duct changes on cholangiography [1]. Unlike in other chronic liver diseases (CLD), liver biopsy is controversial in PSC because histologic features are nonspecific and prone to sampling error owing to heterogeneous involvement of the biliary tree; biopsy is therefore reserved for patients with suspected small duct PSC or overlap with autoimmune hepatitis [3, 4].

Magnetic resonance imaging (MRI) with magnetic resonance cholangiopancreatography (MRCP) is the standard investigation for the diagnosis of PSC [5]. A beaded appearance of the biliary tree with multifocal strictures and segmental dilatations is the characteristic finding of PSC. Parenchymal features of PSC can also be evaluated with MRI, which is routinely performed as part of MRCP. Morphologic changes such as peripheral atrophy, central hypertrophy, and large macronodular regenerations (MNR) located in the central parts of the liver are well-described findings in PSC [6, 7]. Beyond diagnosis, the importance of predicting disease severity and survival has motivated several investigators to look for a relationship between imaging findings and outcome. Recent studies have reported that morphologic liver changes and quantitative liver and spleen volumes may predict disease severity and survival in PSC [8,9,10]. Other researchers have evaluated the role of magnetic resonance elastography (MRE) and demonstrated that liver stiffness measurement (LSM) correlates with Mayo risk score and predicts hepatic decompensation [11,12,13]. However, studies that investigate both LSM and liver and spleen volumetry in the assessment of disease severity and prognosis in patients with PSC are lacking.

The aim of this study was to evaluate LSM with MRE and volumetry measurements of liver and spleen and their correlation with disease severity and prediction of hepatic decompensation.

Materials and methods

This retrospective study was approved by the institutional review board with a waiver of written consent. The inclusion criteria were (i) typical features of PSC on cholangiography and/or liver biopsy and (ii) MRI and MRE performed between January 1, 2007 and December 31, 2013. This study period was chosen so that at least 5 years of follow-up data would be available, as PSC typically shows slow progression. Patients were excluded if they had concurrent chronic liver disease, with the exception of overlap syndrome with autoimmune hepatitis (PSC-AIH). The first MRI study that had both MRE and routine MRI liver sequences was used for stiffness and volumetric measurements. Relevant laboratory data [serum aspartate aminotransferase (AST), alkaline phosphatase (ALP), total bilirubin (TB), albumin (ALB), sodium, creatinine, and international normalized ratio (INR)] were recorded. History of variceal bleeding closest to the MRE study was documented. Mayo risk score and model for end-stage liver disease (MELD) score were calculated. Mayo risk score is a predictor of disease severity and is calculated as follows [14]:

$$\text{Mayo risk score} = 0.0295\,(\text{age [years]}) + 0.5373\,\ln(\text{TB [mg/dl]}) + 0.5380\,\ln(\text{AST [U/L]}) - 0.8389\,(\text{ALB [g/dl]}) + 1.2426\,(\text{variceal bleeding history [0 if none, 1 if present]}).$$

Patients were grouped according to Mayo risk score into low-risk (≤ 0), intermediate-risk (> 0 and ≤ 2), and high-risk (> 2) groups [14].
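The score and the three risk groups can be sketched in a few lines of Python. This is only an illustrative sketch, not the software used in the study: the function and parameter names are hypothetical, and total bilirubin and AST enter as natural logarithms, as in the published revised Mayo model [14].

```python
import math


def mayo_risk_score(age_years, tb_mg_dl, ast_u_l, alb_g_dl, variceal_bleeding):
    """Revised Mayo risk score for PSC (hypothetical helper, illustration only).

    Total bilirubin (mg/dl) and AST (U/L) enter as natural logarithms;
    variceal_bleeding is True/False for a history of variceal bleeding.
    """
    return (
        0.0295 * age_years
        + 0.5373 * math.log(tb_mg_dl)
        + 0.5380 * math.log(ast_u_l)
        - 0.8389 * alb_g_dl
        + 1.2426 * (1 if variceal_bleeding else 0)
    )


def mayo_risk_group(score):
    """Map a score to the groups used here: <= 0 low, (0, 2] intermediate, > 2 high."""
    if score <= 0:
        return "low"
    if score <= 2:
        return "intermediate"
    return "high"
```

For a hypothetical 45-year-old patient with TB 1.0 mg/dl, AST 40 U/L, ALB 4.0 g/dl, and no variceal bleeding, the sketch returns a score just below zero, which falls in the low-risk group.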

MELD score was calculated using the published formula [15]:

$$\text{MELD score} = 3.78\,\ln(\text{TB [mg/dl]}) + 11.2\,\ln(\text{INR}) + 9.57\,\ln(\text{creatinine [mg/dl]}) + 6.43.$$
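A minimal Python sketch of the MELD calculation follows, using the coefficients of the published formula [15] (3.78, 11.2, 9.57, and 6.43). The function name is hypothetical. Clinical implementations commonly floor laboratory values at 1.0 so that the logarithms are non-negative; whether such bounds were applied in this study is not stated, so the sketch leaves the inputs untouched.

```python
import math


def meld_score(tb_mg_dl, inr, creatinine_mg_dl):
    """MELD score from total bilirubin (mg/dl), INR, and creatinine (mg/dl).

    Hypothetical helper for illustration; no lower or upper bounds are
    applied to the laboratory values.
    """
    return (
        3.78 * math.log(tb_mg_dl)
        + 11.2 * math.log(inr)
        + 9.57 * math.log(creatinine_mg_dl)
        + 6.43
    )
```

When all three laboratory values equal 1.0, every logarithm is zero and the score reduces to the constant 6.43.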

Hepatic decompensation was defined as development of one or more of the following: (a) bleeding esophageal varices, (b) ascites, and (c) encephalopathy. Follow-up was recorded until July 2019 for hepatic decompensation and the clinical endpoints of liver transplantation or all-cause mortality.

Measurements on liver MRI and MRE

The study population underwent a standard contrast-enhanced liver MRI/MRCP with MRE protocol for suspected PSC or follow-up of known PSC; MRE was performed for assessment of parenchymal disease. MRE was acquired as four standard 10-mm-thick slices through the largest cross-section of the liver, as described earlier [16].

All measurements were performed by two board-certified radiologists. A training session for liver stiffness measurements and volumetric assessment was held with an expert board-certified abdominal radiologist with more than 12 years' experience in MRE (SKV). Ten MRI and MRE studies not included in the study population were used for training. Results were compared among the readers and the expert to ensure high inter- and intra-reader agreement. Subsequently, the radiologists performed the volumetry and stiffness measurements independently as described below.

Liver stiffness assessment

Regions of interest (ROI) were manually drawn over the liver on the magnitude images generated with the MRE sequence, taking care to avoid the liver edge, vessels more than 3 mm in diameter, lesions, and artifacts in the liver. The ROIs were then copied onto the stiffness maps, which provided the liver stiffness values in kilopascals (kPa). ROIs were drawn as large as possible to include as much liver parenchyma as possible. Confidence maps were not available in these MRE studies because they were performed before confidence maps were introduced for liver MRE in clinical practice.

For this study, the mean LSM, right liver lobe LSM (RLSM), and left liver lobe LSM (LLSM) were calculated. The mean LSM was obtained by placing an ROI that included both the left and the right lobes of the liver and averaging the stiffness obtained from each slice of the MRE sequence as described above. The RLSM and LLSM were acquired by drawing ROIs over the right lobe and left lobe, respectively, within the large ROI for whole liver described above (Fig. 1). Cantlie’s line was used to denote the border between the right and left lobes of the liver, using the middle hepatic vein and gallbladder fossa as landmarks.

Fig. 1

Axial post-contrast T1WI in the portal venous phase and MRE sequences showing the morphological and elastogram differences between two patients. The top row shows a 45-year-old female patient with mild PSC (Mayo risk score −0.25). Volumes were measured on the portal venous phase image (a). The right lobe volume (turquoise outline) was 986 ml, the left lobe volume (green outline) was 409 ml, and the caudate lobe volume (yellow outline) was 33 ml. The spleen volume (dark blue outline) measured 205 ml. b The magnitude image of the elastogram with ROIs for the total liver stiffness (white line), right lobe stiffness (orange line), and left lobe stiffness (green line). The corresponding ROIs on the elastogram (c) yielded the following stiffness values: LSM = 1.99 kPa, RLSM = 2.12 kPa, and LLSM = 1.79 kPa. The bottom row shows a 65-year-old patient with severe PSC (Mayo risk score 2.75). Volumetric and elastographic values were measured in a similar fashion. The right lobe volume (turquoise outline) was 290 ml, the left lobe volume (green outline) was 1364 ml, and the caudate lobe volume (yellow outline) was 362 ml on the portal venous phase image (d). The spleen volume was 870 ml (dark blue outline). Stiffness measurements were performed with regions of interest drawn on the magnitude image (e) and copied to the stiffness map (f). The liver showed increased stiffness, with the total liver stiffness (white outline) measuring 9.47 kPa, right lobe stiffness (turquoise outline) measuring 11.82 kPa, and left lobe stiffness (green outline) measuring 8.99 kPa

The MRE raw data were reprocessed with the automated liver elasticity calculation (ALEC) algorithm to obtain an automated liver stiffness measurement (ALSM) [17]. We performed this additional step to evaluate the reproducibility of the LSM obtained from ROIs drawn manually by the independent reader.

Volumetric assessment

Segmentation of the liver and spleen for volumetry was performed with Visage PACS software (Visage Imaging, GmbH). The volumes were obtained by manually tracing the liver and spleen on axial portal venous phase images. In a small number of cases (n = 6) where portal venous phase images were not available or were suboptimal, the pre-contrast T1 fat-suppressed images or opposed-phase images were used. The liver outline was hand traced on every third to fifth image, starting from the top slice that included liver and continuing to the inferior edge of the liver. The PACS software automatically interpolated between the manually traced slices, and the reader made corrections as necessary. The intrahepatic inferior vena cava, extrahepatic portal vein, and major fissures (such as the fissure for the ligamentum teres) were excluded from the volumetric assessment. The software automatically generated the volume of the traced regions in milliliters (ml).

The total liver volume (Tvol) was first acquired by tracing the outline of the whole liver. The caudate lobe was then excluded from the segmentation using the portal vein branching and inferior vena cava as landmarks. The difference between the resultant volume and the Tvol was taken as the caudate lobe volume (Cvol). Subsequently, the left lobe was excluded from the segmentation along Cantlie's line. The resultant volume was the right liver lobe volume (Rvol). The left liver lobe volume (Lvol) was calculated by subtracting the sum of Rvol and Cvol from the Tvol. The spleen volume (Svol) was obtained using the same method as for Tvol (Fig. 1). The lobar volume-to-total liver volume ratios (right to total, Rvol/Tvol; left to total, Lvol/Tvol; and caudate to total, Cvol/Tvol) were also calculated.
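The subtraction scheme described above can be written out as a short Python sketch. The function and argument names are hypothetical; the inputs are the three volumes read from the segmentation software in sequence (whole liver, liver with the caudate lobe excluded, and liver with the caudate and left lobes excluded).

```python
def lobar_volumes(tvol_ml, vol_without_caudate_ml, rvol_ml):
    """Derive lobar volumes and lobar-to-total ratios by subtraction.

    tvol_ml: traced whole-liver volume (Tvol).
    vol_without_caudate_ml: volume after excluding the caudate lobe.
    rvol_ml: volume after also excluding the left lobe, i.e. Rvol.
    """
    cvol = tvol_ml - vol_without_caudate_ml   # caudate lobe volume
    lvol = tvol_ml - (rvol_ml + cvol)         # left lobe volume
    return {
        "Tvol": tvol_ml,
        "Rvol": rvol_ml,
        "Lvol": lvol,
        "Cvol": cvol,
        "Rvol/Tvol": rvol_ml / tvol_ml,
        "Lvol/Tvol": lvol / tvol_ml,
        "Cvol/Tvol": cvol / tvol_ml,
    }


# Lobar volumes of the first patient in Fig. 1 (986, 409, and 33 ml),
# which sum to a total liver volume of 1428 ml.
example = lobar_volumes(tvol_ml=1428, vol_without_caudate_ml=1395, rvol_ml=986)
```

Because the three lobar volumes are obtained by successive subtraction from the same traced outline, they always sum to Tvol and the three ratios sum to 1.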

Morphological assessment

The MRI images were also assessed for morphological features of chronic liver disease and cirrhosis, namely (i) presence of macronodular regeneration (MNR) [7] and (ii) presence of atrophy (reduction in volume, crowding of biliary ducts and intrahepatic vessels) and hypertrophy (enlarged lobe with separation of vessels/fissures) of the right lobe, left lobe, and caudate lobe. Imaging features of portal hypertension such as splenomegaly, esophageal varices, splenic varices, recanalization of the umbilical vein, and ascites were recorded as present or absent.

Statistical analyses

Statistical analyses were performed with JMP (SAS Institute Inc.) and MedCalc Statistical Software version 16.4.3 (MedCalc Software bvba, Ostend, Belgium). Categorical data are presented as numbers (percentages) and continuous variables as medians with interquartile ranges (IQR) unless otherwise stated. Categorical data were compared using the Pearson Chi-squared test and continuous variables were compared using the nonparametric Wilcoxon test. The degree of association between continuous and/or ordinal variables was assessed with Spearman's rho. Agreement between liver stiffness measurements by the reader and the automatic algorithm was evaluated with the intraclass correlation coefficient (ICC). Receiver operating characteristic (ROC) analysis was performed to determine the diagnostic accuracy of stiffness measurements, volumetry measurements, Mayo risk score, and MELD score. Cut-off values were chosen as the optimal cut-off that maximized sensitivity and specificity for differentiating the high-risk group according to Mayo risk score. Univariate and multivariate analyses using the Cox proportional hazards regression model were performed to determine significant quantitative measures for predicting the cumulative incidence of hepatic decompensation. For all tests, a two-tailed p value of less than 0.05 was considered statistically significant.
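As an illustration of the ROC step, the sketch below computes an area under the ROC curve and an optimal cut-off in plain Python. It is not the JMP or MedCalc procedure used in the study: the cut-off criterion is assumed to be the Youden index (sensitivity + specificity − 1), which is one common way to maximize sensitivity and specificity jointly, and the function names and the eight stiffness values are hypothetical.

```python
def auc_mann_whitney(values, is_high_risk):
    """AUC as the fraction of (high-risk, other) pairs ranked correctly.

    Tied pairs count as half a correct ranking.
    """
    pos = [v for v, y in zip(values, is_high_risk) if y]
    neg = [v for v, y in zip(values, is_high_risk) if not y]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(values, is_high_risk):
    """Return (cutoff, sensitivity, specificity) maximizing the Youden index.

    A value >= cutoff is classified as high risk (assumed direction).
    """
    n_pos = sum(1 for y in is_high_risk if y)
    n_neg = len(is_high_risk) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, is_high_risk) if y and v >= cutoff)
        tn = sum(1 for v, y in zip(values, is_high_risk) if not y and v < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical stiffness values (kPa) and high-risk labels, for illustration only.
lsm_kpa = [2.0, 2.5, 3.0, 3.5, 4.5, 5.0, 6.0, 9.5]
high_risk = [0, 0, 0, 0, 1, 0, 1, 1]
cutoff, sensitivity, specificity = youden_cutoff(lsm_kpa, high_risk)
auc = auc_mann_whitney(lsm_kpa, high_risk)
```

On this toy data set the selected cut-off is 4.5 kPa. Other criteria, such as the point closest to the top-left corner of the ROC plot, can select a different threshold on real data.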

Results

Patient demographics

A total of 266 patients (M/F, 185/81) were included in this retrospective study. At the time of MRI, the mean age ± SD of the patients was 46.12 (range 33.02–59.4 years). Detailed patient characteristics of this cohort were previously reported [12]. Mayo risk score was available in 262/266 subjects and MELD score was available in 251/266 subjects. The LSMs, volumes, and volume ratios of the study population are summarized in Table 1. The morphologic features of the liver and portal hypertension are summarized in Table 2.

Table 1 Baseline characteristics of the study population
Table 2 Morphologic features on MRI in 266 patients with PSC

Stiffness measurements

There was excellent agreement between mean LSM and ALSM measured with automatic algorithm (ICC = 0.96, 95% CI 0.95–0.97) confirming reproducibility of the LSM values. There was a very strong correlation between mean LSM and RLSM (rs = 0.961, p < 0.001), between mean LSM and LLSM (rs = 0.924, p < 0.001), and between RLSM and LLSM (rs = 0.822, 95% CI 0.78–0.86). However, the median RLSM was significantly higher than LLSM (3.11 kPa vs. 2.75 kPa, p < 0.0001). The lobar stiffness measurements (RLSM and LLSM) were significantly higher in the atrophic lobes in comparison with normal and hypertrophied lobes (p < 0.05, Table 3). The presence of MNR was associated with increased LSM with statistically significant differences in median LSM in patients with and without MNR (5.01 kPa vs. 2.69 kPa, p < 0.0001).

Table 3 Comparison of median LSM measurements in liver lobes according to morphology

Volume measurements

The Tvol showed strong correlation with Rvol (rs = 0.685, p < 0.0001) and Lvol (rs = 0.658, p < 0.0001), moderate correlation with Cvol (rs = 0.451, p < 0.0001) but weak correlation with Svol (rs = 0.388, p < 0.0001). The Svol also showed weak correlation with Rvol (rs = 0.248, p < 0.0001), Lvol (rs = 0.242, p < 0.0001), and Cvol (rs = 0.386, p < 0.0001). There was moderate correlation between Lvol and Cvol (rs = 0.437, p < 0.0001) but no statistically significant correlation between Rvol and Cvol and between Rvol and Lvol (p > 0.05).

Correlations between LSM, volumetry, and Mayo risk score

There were strong correlations between mean LSM and Tvol, Lvol, Cvol, Rvol/Tvol, Lvol/Tvol, and Cvol/Tvol (p < 0.05) (Table 4). Mayo risk score showed a moderate correlation with mean LSM (rs = 0.646, p < 0.001) and a fair correlation with Svol (rs = 0.335, p < 0.001). There were weak but statistically significant correlations between Mayo risk score and Cvol (rs = 0.294, p < 0.001) and Cvol/Tvol (rs = 0.274, p < 0.001) (Table 4). There was no significant correlation between mean LSM and Rvol (p > 0.05).

Table 4 Spearman's rho correlation analysis between volumetry, LSM, and Mayo risk score

According to Mayo risk score, 147/262 patients (56.1%) were in the low-risk group, 99/262 (37.8%) in the intermediate-risk group, and 16/262 (6.1%) in the high-risk group. There were significant differences in the mean LSM, RLSM, LLSM, Tvol, Lvol, Cvol, Svol, Rvol/Tvol, and Cvol/Tvol among the different risk groups (Table 5, Fig. 2). There was no significant difference in Lvol/Tvol in different risk groups (p > 0.05).

Table 5 LSM and volumetry variables according to Mayo risk score classification
Fig. 2

Bar graphs comparing LSM, liver and spleen volumes, and volume ratios among low-risk, intermediate-risk, and high-risk patients according to Mayo risk score classification

ROC analysis for predicting high Mayo risk score group showed that mean LSM, Tvol, Cvol, and Cvol/Tvol had area under ROC curve (AUC) more than 0.6 (Table 6, Fig. 3). All other measures had an AUC below 0.6. Mean LSM had significantly better performance (AUC = 0.92) than Tvol (AUC = 0.64), Cvol (AUC = 0.70), and Cvol/Tvol (AUC = 0.66) for the prediction of high-risk group.

Table 6 ROC analysis results for LSM and volumetry for discrimination of high risk from low and intermediate risk according to Mayo risk score
Fig. 3

ROC curves of LSM, Tvol, Cvol, and Cvol/Tvol for differentiation of patients with high risk according to Mayo risk score

Follow-up evaluation of clinical events

The median follow-up interval between the first MRI and the last visit was 5.81 years (IQR 2.35–6.55 years) for the total cohort. A total of 60 patients reached the primary endpoint of hepatic decompensation and 24 patients underwent liver transplantation. Thirty-one patients died; PSC-related complications such as cholangiocarcinoma, biliary sepsis, and variceal bleeding accounted for 17 of these deaths. The median time between MR imaging and hepatic decompensation was 374 days (1–1036 days). In patients with hepatic decompensation, mean LSM, Tvol, Lvol, Cvol, Svol, Cvol/Tvol, MELD, and Mayo risk score were significantly higher than in patients without hepatic decompensation. The median Rvol/Tvol was lower in the patients with hepatic decompensation (Table 7).

Table 7 Comparison of variables in patients with and without hepatic decompensation

In the univariate analyses, mean LSM, Svol, Mayo risk score, and MELD were significantly associated with hepatic decompensation. In the multivariate analyses, only mean LSM (HR = 1.29 per unit; 95% CI 1.02–1.36; p = 0.028) and Mayo risk score (HR = 1.65 per unit; 95% CI 1.08–2.53; p = 0.021) remained significant (Table 8).
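To read the per-unit hazard ratios, note that a covariate entered linearly in a Cox model multiplies the hazard by the per-unit HR for each one-unit increase, so an increase of k units corresponds to HR raised to the power k. The sketch below illustrates this arithmetic only; the function name is hypothetical, and it assumes that the unit for mean LSM is 1 kPa and that the covariate was modeled linearly.

```python
def hazard_ratio_for_increment(hr_per_unit, increment):
    """Hazard ratio implied by a Cox model for a change of `increment` units."""
    return hr_per_unit ** increment


# Reported multivariate HR for mean LSM, rescaled to a 2-unit increase.
hr_lsm_2_units = hazard_ratio_for_increment(1.29, 2)
```

Under these assumptions, a 2-unit increase in mean LSM corresponds to a hazard ratio of about 1.66.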

Table 8 Cox survival analyses for prediction of hepatic decompensation

Discussion

In this study, LSM with MRE proved to be an excellent predictor of disease severity and hepatic decompensation in patients with PSC. There were significant but weaker correlations between Mayo risk score and volumetry measurements. Volumetry measures were not significantly associated with hepatic decompensation in the multivariate analyses.

In our study, we observed very strong correlations between mean LSM and ALSM, RLSM, and LLSM (p < 0.0001), suggesting excellent reproducibility. However, there were significant differences between right and left lobe stiffness measurements. Based on our experience and the results of this study, LSM should therefore include the largest possible area of liver parenchyma when evaluating chronic parenchymal disease, to ensure reproducibility and meaningful interpretation of longitudinal changes. In the morphologic assessment, we observed significantly higher lobar LSM in the presence of atrophy, similar to a recent study by Bookwalter et al. [13]. We also observed significantly higher mean LSM in the presence of MNR. This is consistent with the previous study by Bader et al., who suggested a correlation between liver cirrhosis pattern and MNR and atrophy in patients with PSC [7]. MNR are large regions of regenerative parenchyma that develop as a compensatory hyperplastic response to progressive disease with peripheral atrophy; they are therefore found in advanced stages of PSC, which typically show increased LSM.

We observed statistically significant correlations between liver stiffness measurements and volumetry, with the best correlation between LSM and spleen volume. Spleen volume increased in advanced stages of the disease owing to the development of portal hypertension. There was also an increase in liver volume in patients with advanced liver disease. This probably explains the good correlation between LSM and spleen volume.

There were statistically significant correlations between Mayo risk score and volumetry measurements, with the best correlation observed with Svol (rs = 0.335, p < 0.001). Khoshpouri et al. evaluated lobar volumes and volume ratios in patients with PSC and observed that the left lobe to total liver volume ratio had the strongest correlation with Mayo risk score [9]. In our study, we also observed a correlation between Mayo risk score and Lvol/Tvol. However, the correlation was weak, and there was no significant difference in Lvol/Tvol among the risk groups according to Mayo risk score. PSC is a heterogeneous disease, and different combinations of segmental and/or lobar hypertrophy and atrophy can occur [8] rather than the typical right lobe atrophy and left lobe hypertrophy commonly seen in other chronic liver diseases such as chronic viral hepatitis and alcoholic liver disease [18]. This can be explained by the heterogeneous involvement of the biliary tree and liver parenchyma in PSC. Furthermore, our study population also included patients with left lobe atrophy and right lobe hypertrophy. Our findings suggest that lobar-to-total liver volume ratio analysis may not be an optimal method for evaluating disease severity in PSC. It is also possible that our PSC population differs from the population studied previously.

LSM correlated well with Mayo risk score (rs = 0.646), which is consistent with results from a recent study by Jhaveri et al. [11]. They also showed that the correlation between Mayo risk score and MRE was better than the correlation between Mayo risk score and vibration-controlled transient elastography (VCTE) [11].

For prediction of hepatic decompensation, the diagnostic accuracy of mean LSM was excellent and significantly better than that of volumetry. Previous studies investigated the role of volumetry measurements and morphologic changes in the assessment of disease prognosis in PSC [8, 10]. Kitzing et al. investigated hepatic morphology changes over time and observed that progressive hepatic atrophy showed a significant association with adverse clinical outcome [8]. Khoshpouri et al. observed shorter transplant-free survival in patients with a spleen volume change of more than 50 ml and a left lobe to total liver volume change of more than 0.04 during follow-up of patients with PSC [10]. Ehlken et al. suggested that both transient elastography (TE) measurements and spleen length are significant predictors of outcome in PSC [19]. Jung et al. reported a 15% increased risk of reaching a clinical outcome such as liver transplantation or liver-related death per 1 cm greater spleen length at baseline [20]. In our study, spleen volume was the only volumetric measure at the baseline MRI evaluation that was significantly associated with hepatic decompensation in univariate analyses. However, the significance was lost in the multivariate analyses.

In our study, the baseline LSM was significantly associated with hepatic decompensation in both univariate and multivariate analyses. This is similar to the findings of Corpechot et al., who performed a study with TE in PSC patients and suggested that both baseline measurements and longitudinal changes in LSM are prognostic factors in PSC [21]. Our findings are also consistent with a previous study including 217 patients with different etiologies of chronic liver disease [22].

Our study has limitations. The retrospective design was unavoidable because this is an outcome analysis and we chose a longer follow-up duration, as PSC tends to show slow progression. We also did not analyze the longitudinal changes in liver and spleen volumes and their correlation with outcome. Future studies are required to assess the utility of follow-up changes in liver volume, spleen volume, and LSM for predicting outcome in PSC. We did not evaluate death as a primary endpoint because the causes of death in our population were heterogeneous, with many deaths being unrelated to PSC.

In conclusion, we demonstrated that baseline LSM with MRE is better than liver and spleen volume measurements for prediction of disease severity and hepatic decompensation in patients with primary sclerosing cholangitis.