Introduction

Primary sclerosing cholangitis (PSC) is an autoimmune liver disease marked by significant cholestasis related to biliary stricturing and hepatic fibrosis. In many cases, unchecked inflammation and ductal sclerosis ultimately lead to cirrhosis and/or development of hepatobiliary malignancy (principally cholangiocarcinoma), although the rate of progression is variable. While PSC may occur in isolation, 60–80% of patients may have concurrent inflammatory bowel disease (IBD) and up to 17% may have overlap with autoimmune hepatitis [1, 2]. Whether PSC incidence and prevalence are increasing is debatable, and any apparent rise may relate to increasing early recognition. There also appear to be ethnic differences in risk, as studies of predominantly Caucasian populations from the United States and Northern/Western Europe suggest higher PSC incidence and prevalence than in Alaskan natives or East Asians. Furthermore, African-Americans are diagnosed with PSC at a similar frequency to Caucasians but may have more aggressive disease, including diagnosis at an earlier age and higher disease burden at the time of transplant listing [1].

However, significant clinical heterogeneity defines PSC and, like the god Janus from Roman mythology, two patients with PSC are often not alike. These variable phenotypes present both diagnostic and management challenges for clinicians, while also stymying ongoing research efforts to develop disease-modifying therapies. The American Association for the Study of Liver Diseases (AASLD) and European Association for the Study of the Liver (EASL) recently published updated practice guidance on PSC [3, 4]. These references provide practical information for clinicians but also highlight the significant knowledge gap that remains regarding its pathogenesis, optimal management, and best disease-specific cancer screening practices. This review will expand upon several important issues and controversies patients and providers face on a daily basis, referencing evidence-based recommendations when possible and drawing on our observations to inform areas with less well-established literature.

Utility of Ursodeoxycholic Acid

Ursodeoxycholic acid (UDCA) is a hydrophilic bile acid with multiple beneficial effects including promotion of bile flow, cytoprotection, cell membrane stabilization, dilution of hydrophobic bile acids, reduction in apoptosis, and reduction in inflammation [3]. It is guideline-directed first-line therapy in patients with primary biliary cholangitis (PBC) [5, 6]. Its use leads to normalization or significant improvement in alkaline phosphatase (ALP) levels in the majority of patients and, more importantly, it slows histologic progression and improves transplant-free survival [7]. The medication is generally well tolerated at recommended doses of 13–15 mg/kg/day, with a small proportion of patients reporting side effects such as mild gastrointestinal (GI) symptoms (nausea, vomiting, diarrhea, or constipation), headache, or alopecia. A single study also showed that UDCA treatment was associated with a small initial weight gain, which was maintained for at least four years [8].

Outside of PBC, UDCA is utilized in other clinical scenarios, albeit with less clear morbidity and mortality benefit. The use of UDCA for gallstone dissolution is time-honored and carries a 37% dissolution rate, with better efficacy for smaller stones [9]. Its use as a prophylactic agent in patients undergoing allogeneic stem cell transplant is supported by a Cochrane systematic review and meta-analysis noting a 40% reduction in the incidence of hepatic sinusoidal obstruction syndrome (veno-occlusive disease) [10]. However, this same analysis did not find an overall mortality reduction. The narrow benefit of UDCA also extends to intrahepatic cholestasis of pregnancy. Systematic review and meta-analysis has shown an ability to reduce pruritus, yet the effect of UDCA on fetal outcomes is uncertain due to small effect size and study heterogeneity [11]. Thus, while there is broad clinical experience with UDCA and overall a favorable safety profile, it should not be considered a panacea.

Similarly, UDCA use in PSC has historically been controversial. Several meta-analyses of studies where UDCA was dosed up to 15 mg/kg/day demonstrated improvement in biochemical abnormalities. Yet this optimism was blunted by a concurrent lack of efficacy with respect to reducing histologic progression, need for liver transplantation, or death [12,13,14]. It should be noted, however, that many of the patients in the included studies had advanced fibrosis to cirrhosis and thus it is unclear if early initiation may alter the disease course. High-dose UDCA has also been trialed in two studies. At a dose of 17–23 mg/kg/day, the medication had no benefit on transplant-free survival, although the study was notably underpowered [15]. In contrast, higher dose (25–30 mg/kg/day) UDCA increased the incidence of adverse events including need for liver transplantation or death [16]. This dosage was also associated with higher rates of colorectal cancer in patients with concurrent ulcerative colitis in a post hoc subgroup analysis [17].

In response to the disheartening outcome results above and concerns about its safety at high doses, societal guidelines initially recommended against use of UDCA. However, the most recent clinical practice guidance from both AASLD and EASL supports its use at doses of 13–23 mg/kg/day and 15–20 mg/kg/day, respectively, given its relative safety at low to moderate doses and efficacy in reducing biochemical abnormalities [3, 4]. Clinicians should be aware that these recommendations are graded as weak and derived from consensus expert opinion. Nonetheless, our practice (Fig. 1) is consistent with these recommendations and we typically will institute UDCA at doses in the 13–15 mg/kg/day range and monitor liver-associated enzymes every 3–6 months up to a year. If no significant response is achieved, then we may discontinue UDCA. Anecdotally, we have found that tolerance of UDCA is superior when the medication is started at a low dose (such as 300 mg twice daily) and then titrated to goal within 7–14 days. This approach mitigates the GI symptoms some patients may experience. If GI symptoms persist, giving the majority or entirety of the dose at nighttime can help certain patients.
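
The weight-based arithmetic behind these dose ranges can be sketched as follows. This is a minimal illustration rather than a prescribing tool: the function names and the 70 kg example weight are hypothetical, and only the mg/kg/day ranges and the 300 mg twice-daily starting dose come from the text above.

```python
# Illustrative arithmetic only; not a prescribing tool.
# The mg/kg/day ranges are those quoted in the text; names are hypothetical.

OUR_PRACTICE_RANGE = (13.0, 15.0)   # mg/kg/day, authors' usual target
AASLD_RANGE = (13.0, 23.0)          # mg/kg/day, as cited in the text
EASL_RANGE = (15.0, 20.0)           # mg/kg/day, as cited in the text


def udca_daily_dose_range_mg(weight_kg, mg_per_kg_range=OUR_PRACTICE_RANGE):
    """Return the (low, high) total daily dose in mg for a body weight."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    low, high = mg_per_kg_range
    return weight_kg * low, weight_kg * high


def mg_per_kg_per_day(total_daily_mg, weight_kg):
    """Convert a total daily dose in mg to mg/kg/day."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    return total_daily_mg / weight_kg


if __name__ == "__main__":
    weight = 70  # kg, hypothetical patient
    print(udca_daily_dose_range_mg(weight))           # target range in mg/day
    print(round(mg_per_kg_per_day(2 * 300, weight), 1))  # starting dose in mg/kg/day
```

For the hypothetical 70 kg patient, the 13–15 mg/kg/day target corresponds to 910–1050 mg/day, while the 300 mg twice-daily starting dose is roughly 8.6 mg/kg/day, which is why titration to goal is needed.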

Fig. 1

Suggested approach to ursodeoxycholic acid (UDCA) utilization in patients with PSC. Overlap with autoimmune hepatitis and other mimickers should be first ruled out before UDCA initiation

Normalization of Alkaline Phosphatase

Clinicians and patients with PSC alike frequently desire improvement of ALP level. Significant ALP elevation is associated with a poorer prognosis, particularly when combined with high symptom burden [18]. A single-center retrospective study found that patients with normalization of ALP after initial diagnosis had improved long-term prognosis, including fewer instances of cholangiocarcinoma (CCA) and improved transplant-free survival [19]. Similarly, another single-center retrospective study noted long-term benefits if the ALP level decreased to < 1.5 times the upper limit of normal [20]. The latter study also performed a sub-analysis of patients with ALP response, noting no difference in outcomes in those achieving complete ALP normalization vs. partial ALP reduction (but still to < 1.5 times the upper limit of normal). However, clinicians should be cautious when extrapolating findings from the latter two studies given they were derived from single centers and had small sample sizes.

Bolstering support for the potential utility of ALP as a biomarker is a prospective German cohort study, which confirmed a benefit of ALP reduction within the first year of diagnosis on transplant-free survival [21]. Similar findings were noted in two retrospective cohorts, one from the Mayo Clinic in the United States and another from a multi-province Dutch population [22, 23]. Furthermore, a secondary analysis of the Scandinavian PSC UDCA trial noted that normalization of ALP or a reduction of ≥ 40% from baseline, irrespective of UDCA use, was associated with better outcomes with respect to transplant-free survival and CCA development [24]. Given these findings, recent prognostic models include ALP to predict transplant-free survival or hepatic decompensation [25,26,27]. In contrast, the older Revised Mayo Risk Score (which predicts short-term mortality) does not include ALP [28].
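
The different ALP response definitions used across these studies (complete normalization, a fall to < 1.5 times the upper limit of normal, and a reduction of at least 40%) can be compared side by side. The sketch below is illustrative only: the function name and example values are hypothetical, and it assumes the 40% criterion means a relative fall from the baseline value.

```python
# Illustrative comparison of the ALP response definitions quoted in the text.
# Assumption: the 40% criterion is a relative reduction from baseline.


def alp_response(baseline_alp, followup_alp, uln):
    """Report which ALP response definitions a follow-up value satisfies.

    All three arguments must be in the same units (e.g. U/L).
    """
    if baseline_alp <= 0 or uln <= 0:
        raise ValueError("baseline_alp and uln must be positive")
    relative_reduction = (baseline_alp - followup_alp) / baseline_alp
    return {
        "normalized": followup_alp <= uln,
        "below_1_5x_uln": followup_alp < 1.5 * uln,
        "reduced_by_40_percent": relative_reduction >= 0.40,
    }


if __name__ == "__main__":
    # Hypothetical patient: ALP falls from 400 to 170 U/L with a ULN of 120 U/L.
    print(alp_response(baseline_alp=400, followup_alp=170, uln=120))
```

In this hypothetical case the patient meets the < 1.5 × ULN and 40% reduction definitions without normalizing, which illustrates why results from studies using different definitions are not directly interchangeable.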

Despite the potential attractiveness of ALP as a surrogate biomarker in PSC, its use in isolation is problematic. Serum levels demonstrate wide variability across time both between and within patients. Initial reports of spontaneous ALP normalizations were derived from a small case series of 12 patients and later confirmed in a larger retrospective cohort, although the latter included some patients treated with UDCA [19, 29]. Interestingly, ALP fluctuations may occur independent of biliary stricture burden [30]. As a corollary, we have noted spontaneous ALP normalization when the stricture burden is segmental rather than diffuse. The involved hepatic segment atrophies (decreasing the stimulus for ALP and gamma-glutamyl transferase elevation) but the remainder of the liver remains functional (see Fig. 2). Unfortunately, this often comes at a price as hepatic atrophy can lead to capsular retraction and carbohydrate antigen (CA) 19–9 elevation. This amalgam may falsely raise concern for CCA and prompt an invasive workup to exclude malignancy that only adds iatrogenic risk without clinical benefit. Thus, we agree with current EASL guidelines explicitly recommending against the isolated use of ALP to predict outcomes [4].

Fig. 2

Serial MRIs in a patient with PSC over a 12-year span demonstrating segmental atrophy and capsular retraction; the alkaline phosphatase level normalized over this timeframe. Post-contrast T1-weighted MRI image from 2007 shows capsular retraction due to atrophy of segment 4a (arrow, A). The band-like enhancement in segment 7 is due to fibrosis (arrowhead, A). The 2019 MRI shows progression of segmental atrophy, now involving segment 2, with increased capsular retraction and no underlying lesion as the cause (arrow, B)

Diagnosis and Management of PSC-AIH Overlap

Students of medical history are familiar with the contrasting diagnostic approaches of Occam and Hickam, with the former favoring parsimony and the latter noting that patient complexity often lends itself more to multiple concurrent diagnoses. Autoimmune liver disease is no exception and patients may not fit into the neat “buckets” frequently described in textbooks or medical school lectures. Overlap of PSC and autoimmune hepatitis (AIH) is a well-known phenomenon, particularly in the pediatric and young adult populations [31]. However, specific diagnostic criteria remain undefined, complicating true prevalence estimates.

When the modified AIH score is retrospectively calculated in patients with PSC, the prevalence of an overlap syndrome is estimated at 1.4–8% [32, 33]. By comparison, when alternative criteria were used, including a revised AIH score > 15, antinuclear antibody (ANA) or anti-smooth muscle antibody (ASMA) presence at a titer of at least 1:40, and liver histology including piecemeal necrosis, lymphocyte rosettes, or moderate to severe periportal inflammation, the prevalence of an overlap syndrome rose to 17% [2]. Notably, the recent AASLD Practice Guidance describes the prevalence of PSC-AIH overlap syndrome as up to 35% in children versus only 5% in adults [3]. However, studies including those by Abdalian et al. and Lewin et al. suggest that prevalence in adults is age-dependent [34, 35].

These varied estimates can be discomforting to clinicians. In real-world practice, many patients with PSC have transaminase elevation along with more typical cholestasis. Whether this represents normal variation along the PSC spectrum or an overlap syndrome can be difficult to discern. We suggest several factors to consider regarding the presence (or absence) of an overlap syndrome (Fig. 3):

  (a) Age: As previously mentioned, overlap syndrome is more common in pediatric patients and/or young adults;
  (b) Degree of transaminase elevation: Mild to moderate elevation (< 5 × upper limit of normal) may be expected in patients with PSC alone whereas higher levels may suggest concurrent AIH [3];
  (c) Serologies: Moderate to high titer autoantibodies (predominantly antinuclear or anti-smooth muscle) and/or isolated elevation of total IgG (assuming the IgG4 fraction is not significantly elevated) may suggest AIH is present;
  (d) Imaging: Magnetic resonance cholangiopancreatography (MRCP) which is normal or not suggestive of large-duct PSC should raise suspicion for small-duct PSC, which is fairly prevalent at 27% among a small series of patients with overlap syndrome [36];
  (e) Liver biopsy: Features compatible with AIH are present and outside of the spectrum expected with PSC alone.
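
The factors above can be laid out as a simple checklist. The sketch below is only an illustration of how the five considerations fit together; it is not a validated score and does not replace clinical judgment. The class and field names are hypothetical, the transaminase cutoff of 5 × the upper limit of normal is taken from item (b), and the 1:40 titer cutoff is an assumption drawn from the autoantibody threshold quoted elsewhere in this review.

```python
# Illustrative checklist of the factors listed above; NOT a validated score.
# All names are hypothetical; cutoffs are adjustable assumptions.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OverlapWorkup:
    young_patient: bool                 # pediatric or young adult (no age cutoff is given)
    transaminase_x_uln: float           # peak ALT/AST as a multiple of the ULN
    autoantibody_titer: int             # reciprocal titer (80 means 1:80); 0 if negative
    total_igg_elevated: bool
    igg4_predominant: bool              # True if the IgG4 fraction is significantly elevated
    mrcp_shows_large_duct_psc: bool
    biopsy_aih_features: Optional[bool] = None  # None if no biopsy was performed


def overlap_flags(w: OverlapWorkup,
                  transaminase_cutoff: float = 5.0,
                  titer_cutoff: int = 40) -> List[str]:
    """Return the factors that point toward considering PSC-AIH overlap."""
    flags = []
    if w.young_patient:
        flags.append("age")
    if w.transaminase_x_uln >= transaminase_cutoff:
        flags.append("transaminases")
    if w.autoantibody_titer >= titer_cutoff or (
        w.total_igg_elevated and not w.igg4_predominant
    ):
        flags.append("serologies")
    if not w.mrcp_shows_large_duct_psc:
        flags.append("imaging")
    if w.biopsy_aih_features:
        flags.append("biopsy")
    return flags


if __name__ == "__main__":
    patient = OverlapWorkup(
        young_patient=True,
        transaminase_x_uln=6.2,
        autoantibody_titer=80,
        total_igg_elevated=True,
        igg4_predominant=False,
        mrcp_shows_large_duct_psc=True,
    )
    print(overlap_flags(patient))
```

The function returns a list of factors rather than a number on purpose, since the text does not assign weights to the individual considerations.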

Fig. 3

Key clinical variables to consider when weighing a PSC-AIH diagnosis. *Autoantibodies include antinuclear antibody and/or anti-smooth muscle antibody in titers of at least 1:40. †IgG subclasses should also be checked to ensure there is not a predominance of IgG4, as this may raise suspicion instead for IgG4-related cholangiopathy. ‡Liver biopsy findings are variable but typically include features suggestive of AIH such as lobular inflammation, interface hepatitis, and/or prominent plasma cell infiltrate. Biliary features including portal tract inflammation, lymphocytic infiltration of the bile ducts, ductal proliferation, and periductal fibrosis may or may not be present

The importance of timely diagnosis of an overlap syndrome is unsettled, but current evidence suggests that patients with PSC-AIH have a worse prognosis compared to classical PSC. Current AASLD Practice Guidance and the International Autoimmune Hepatitis Group (IAIHG) recommend treatment of the PSC-AIH overlap syndrome with immunosuppressive agents ± UDCA [3, 37]. However, these recommendations are based on weak evidence derived mostly from small case series; response is typically better in the pediatric population [37].

We have seen a significant number of young adult patients, particularly males, with a mixed pattern of liver injury and MRCP findings consistent with PSC. The elevations of transaminases are typically ≥ 5 × the upper limit of normal (ULN) and they frequently have at least one positive autoantibody of moderate to high titer (ANA or ASMA) and/or an elevated total IgG level. Given the higher prevalence of PSC-AIH in this population, we will generally recommend a liver biopsy and institute UDCA + immunosuppressive therapy if biopsy findings are compatible with the presence of AIH. For immunosuppression, we will usually begin azathioprine monotherapy (after appropriate thiopurine S-methyltransferase enzymatic activity assessment) unless the patient has severe features including hospitalization at time of diagnosis, severe inflammation on biopsy, or transaminases > 10 × ULN. In the aforementioned cases, we will institute a steroid taper along with concurrent azathioprine therapy. We feel that although immunosuppression has not been proven to alter long-term outcomes in classical PSC, patients with PSC-AIH represent a specific subgroup which may benefit, and our approach is consistent with the EASL guidelines [4]. Immunosuppressive therapy in AIH can slow or prevent disease progression, although its effect in PSC-AIH is less well-established [38]. In our experience, most patients have an excellent biochemical response to this approach. It will be important for multicenter cohort studies to determine the longitudinal effect since single-center experiences are prone to significant bias. Nonetheless, we advocate that clinicians strongly consider liver biopsy in younger patients with significantly elevated transaminases (particularly if autoantibodies are positive or total IgG is elevated).

Small-Duct Disease and PSC Mimickers

Another variant of PSC is small-duct disease. In patients with suspected small-duct PSC who have cholestatic liver injury but a normal high-quality MRI/MRCP, liver biopsy is recommended to confirm the diagnosis [3, 4]. While histologic findings of periductal fibrosis, fibro-obliterative changes, periductal inflammation, ductular reaction, ductopenia, and portal inflammation may be suggestive, their presence is variable and can make diagnosis challenging [4]. Alternatively, if a compelling reason to perform endoscopic retrograde cholangiopancreatography (ERCP) exists, this may be utilized instead of biopsy [39]. Small-duct PSC tends to follow a less aggressive course and it is unclear whether it represents a separate disease process or simply early-stage disease, as only a minority of cases tend to progress to large-duct disease [4, 39, 40]. Notably, small-duct PSC tends to occur more with IBD and the absence of IBD raises suspicion for other diagnoses such as PBC or genetic cholestasis.

When considering a diagnosis of PSC, clinicians should take caution to rule out IgG4-related cholangiopathy (IRC). The latter is a biliary manifestation of systemic IgG4-related disease (IgG4-RD) and is important to distinguish since it often responds to corticosteroid therapy (in contrast to PSC). Obtaining serum IgG4 levels is recommended by EASL in the workup of sclerosing cholangitis (discussed but no formal recommendation provided by the AASLD), although caution is advised [3, 4]. Significantly elevated levels (> 4 × ULN) have high specificity, but mild to moderately elevated levels may be seen in up to 15% of patients with PSC and thus carry low specificity [4]. In these cases, a serum IgG4/IgG1 ratio of < 0.24 can exclude IRC [41]. Formal diagnosis of IRC typically relies on the HISORt criteria, which comprise histopathologic (H), imaging (I), serologic (S), other organ manifestations of IgG4-RD (O), and response to treatment (Rt).
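
The serologic thresholds in this paragraph can be expressed as a short triage sketch. It is illustrative only and is not a substitute for the HISORt criteria: the function name, return labels, and the example reference limit of 1.35 g/L are hypothetical, while the 4 × ULN and 0.24 ratio cutoffs are those quoted above.

```python
# Illustrative triage of serum IgG4 using the cutoffs quoted in the text.
# Function name, labels, and example reference limit are hypothetical.
from typing import Optional


def igg4_triage(igg4: float, igg4_uln: float, igg1: Optional[float] = None) -> str:
    """Classify a serum IgG4 result; igg4, igg4_uln, and igg1 share the same units."""
    if igg4 < 0 or igg4_uln <= 0:
        raise ValueError("igg4 must be non-negative and igg4_uln positive")
    fold = igg4 / igg4_uln
    if fold > 4:
        return "elevated_gt_4x_uln"        # high specificity for IRC per the text
    if fold > 1:
        if igg1 is not None and igg1 > 0 and igg4 / igg1 < 0.24:
            return "ratio_below_0_24"      # IRC can be excluded per the text
        return "indeterminate"             # mild to moderate elevation, also seen in PSC
    return "not_elevated"


if __name__ == "__main__":
    uln = 1.35  # g/L, hypothetical laboratory reference limit
    print(igg4_triage(6.0, uln))
    print(igg4_triage(2.0, uln, igg1=10.0))
    print(igg4_triage(2.0, uln, igg1=5.0))
```

An indeterminate result in this sketch simply means that serology alone does not settle the question and the remaining HISORt components need to be weighed.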

Other disease processes can mimic PSC, including ischemia, infection (including HIV or parasitic disease in patients from or with significant travel to endemic areas), and malignancy. It is vital for clinicians to avoid premature closure on a PSC diagnosis since these alternative processes (i.e. secondary sclerosing cholangitis; SSC) may portend different prognoses. While extensive description of the broad differential is beyond the scope of this focused review, a few conditions are worth specific mention. Critical-illness-related sclerosing cholangitis occurs in a subset of patients without baseline liver disease who experience prolonged hypotension [42]. Biliary injury from ischemia, microcirculatory abnormalities, and sepsis-related inflammatory processes is thought to underlie the development of cholestatic liver injury [43]. Diagnostic modalities including MRI/MRCP and/or ERCP commonly show ductal irregularity, strictures and dilatations, and destruction of the intrahepatic ducts with relative sparing of the common bile duct (“pruned tree” appearance). Biliary casts are commonly seen during ERCP, appearing as intraductal filling defects. Management is typically similar to other sources of biliary obstruction and its presence carries a poor prognosis, with transplant-free survival of only 17–40 months [42]. Similarly, COVID-19 infection has been associated with the development of cholestatic liver injury and SSC (the spectrum of disease has been termed COVID-19 cholangiopathy) but there is a lack of consensus diagnostic criteria [44]. Time to onset is variable, ranging from 48–118, and risk factors include male sex, obesity, chronic liver disease, and severe infection based on two retrospective cohort studies [45, 46]. Figure 4 provides a case of secondary sclerosing cholangitis with features of the two aforementioned PSC mimickers.

Fig. 4

Image from a patient with prior severe COVID-19 infection complicated by acute respiratory distress syndrome and shock who developed fever, encephalopathy, and cholestatic liver injury seven weeks later. Clinical picture felt consistent with development of secondary sclerosing cholangitis. MRCP (A) demonstrated a distal common bile duct filling defect (circled) and irregular segment II/III intrahepatic biliary stricturing (arrow). Subsequent ERCP confirmed irregularity of the upper biliary tree radicals with beading; balloon sweep notable for clearance of sludge, small stones, and a biliary cast (B)

Hepatobiliary Malignancy Screening and Gadolinium Exposure: Friend or Foe?

Patients with PSC are at high risk of developing malignancy, including CCA, gallbladder cancer, and colorectal cancer. In particular, hepatobiliary cancer risk in patients with PSC is 161–398 × that of the general population, and CCA carries an estimated annual incidence of 0.6–1.5% as well as a lifetime risk of up to 20% [4]. While a sizeable minority of CCA diagnoses are made within the first year of PSC diagnosis, this may reflect chronic, asymptomatic disease that was not previously identified. Risk factors for CCA are numerous (older age, male sex, concurrent IBD, history of colorectal malignancy, advanced liver disease, smoking, alcohol) but present in many patients and thus not overly helpful for the practicing clinician [3, 4]. However, specific subpopulations may warrant differing screening recommendations. Patients with PSC and longstanding IBD (> 10 years) have a significantly elevated risk of developing CCA and colectomy is not preventative; a history of colectomy for neoplasia similarly confers increased CCA risk [47]. In contrast, CCA is rare in the pediatric PSC population or in those with small-duct PSC [3].

Current recommendations from the AASLD and EASL suggest at least annual screening for hepatobiliary malignancy in patients with large-duct PSC based primarily on evidence from retrospective studies [3, 4]. One such cohort study noted significant 5-year survival benefit in those undergoing regular screening [48]; another noted a twofold reduction in patients undergoing annual imaging to detect hepatobiliary malignancy in patients with PSC and IBD [49]. While these studies suggest potential benefit, they are subject to selection and lead-time bias. In contrast, a recent large prospective study from a Swedish cohort published after formalization of the societal guidelines found that annual contrasted MRI/MRCP and CA 19–9 did not improve survival. Furthermore, the overall prevalence of hepatobiliary malignancy was low and the presence of progressive and/or severe bile duct changes, while strongly associated with the development of malignancy, still carried a low positive predictive value in isolation [50]. In addition, the paucity of cost-effectiveness analyses, particularly given the low incidence rate, is problematic. With patients being diagnosed with PSC at earlier ages (given more frequent lab and imaging acquisition in healthcare), the number of screening imaging exams recommended over a patient's lifetime may be large.

Given these conflicting results, the optimal imaging strategy is unsettled. While ultrasound (US) is relatively inexpensive, widely available, and has excellent specificity for CCA (94%), its sensitivity is far inferior to MRI/MRCP (57% vs. 89%, respectively) [51]. In contrast, US is typically excellent to evaluate for gallbladder polyps (which may harbor neoplasm) with sensitivity of 84% and specificity of 96% [52]. Its ability to detect gallbladder polyps in patients with PSC is unknown, although in the general population it may provide additional diagnostic yield in certain cases [53]. Use of CA 19–9 alone for screening is discouraged given variable cutoff values used in the literature. In addition, normal levels can be seen in the presence of CCA while elevated levels may be present in the absence of CCA, particularly if progressive or severe biliary ductal changes are present [50, 54, 55]. Some providers use it in conjunction with imaging studies but the utility of combination screening is hampered by low specificity [51, 56]. Thus, major societal guidelines differ regarding the utility of CA 19–9, with EASL discouraging its use while the AASLD is more equivocal.

As MRI/MRCP becomes more widely available and has reasonable performance characteristics (sensitivity 89%, specificity 75%), it is frequently employed as the preferred CCA screening test by gastroenterologists and hepatologists alike [56]. We have traditionally screened large-duct PSC patients at our institution annually with MRI/MRCP but have taken a renewed interest in the long-term implications and utility of this practice. The most recent AASLD Practice Guidance acknowledges concerns regarding cost, false-positive findings (leading to unnecessary testing and potential harm), and uncertainty regarding the significance of repeat exposure to gadolinium [3]. The latter is an area of increasing interest within the radiology field and warrants close attention.
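
The concern about false-positive findings follows directly from applying these performance characteristics to a low-prevalence condition. The sketch below works through the arithmetic with Bayes' rule. The function name is hypothetical, and the 1% prevalence is an assumption chosen from within the 0.6–1.5% annual CCA incidence quoted earlier; the sensitivity and specificity are those cited above.

```python
# Worked arithmetic: predictive values of a screening test at low prevalence.
# The 1% prevalence is an assumption for illustration.


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0 < value < 1:
            raise ValueError(f"{name} must be strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    ppv, npv = predictive_values(sensitivity=0.89, specificity=0.75, prevalence=0.01)
    print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Under these assumptions, only about 3 to 4 of every 100 positive screening studies would reflect true CCA, while a negative study remains highly reassuring. The exact figures shift with the assumed prevalence, so higher-risk subgroups would see a better positive predictive value.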

Gadolinium-containing contrast agents routinely accumulate in the human central nervous system despite an intact blood–brain barrier and normal renal and hepatobiliary function [57]. Importantly, the properties of various agents likely play a role in their tissue retention. Linear agents (e.g. gadodiamide, Omniscan®) reach thermodynamic equilibrium more quickly than macrocyclic agents (e.g. gadobutrol, Gadavist® or gadoxetate, Eovist®), and thus there is increased release of free gadolinium [58]. The former have been associated with nephrogenic systemic fibrosis in patients with severe chronic kidney disease, whereas the latter are felt to have very low risk (< 1%) [59].

Several studies (Table 1) have found a dose-dependent relationship between the number of gadolinium injections and neuronal retention within the dentate nucleus (DN) and globus pallidus (GP) [60,61,62]. These studies involved exclusively or predominantly linear agents in patients with normal renal function. In contrast, a study by Radbruch et al. demonstrated that macrocyclic contrast agents did not cause high signal intensity in the DN or GP on MRI, suggesting either minimal or absent tissue deposition of gadolinium [63]. However, this hypothesis was challenged when Murata et al. showed in postmortem fashion that patients exposed to macrocyclic agents did have neuronal deposition of gadolinium [64].

Table 1 Selected literature on effects of gadolinium exposure in humans

Another factor for clinicians to consider is that younger age and repeat exposure to gadolinium injections are independent risk factors for neuronal deposition [65]. Similarly, this phenomenon is seen in children undergoing serial contrasted MRIs [66]. To date, no long-term adverse events from gadolinium exposure have been seen in animal or human studies (Table 1), although the evidence is limited [67,68,69]. Nonetheless, until further safety studies in humans are performed, we advise caution in younger patients who are most at risk for significant gadolinium exposure. Clinicians should undertake frank discussion with PSC patients about the risks and benefits of repeated contrasted MRI from both an efficacy and a safety standpoint.

Optimal Clinical Trial Targets and Outcomes

Multiple therapeutic agents have been trialed in PSC including various immunosuppressants or immunomodulators, colchicine, antibiotics, nicotine, pentoxifylline, silymarin, and pirfenidone [70,71,72,73,74,75,76,77,78,79,80,81,82,83,84]. Unfortunately, none are associated with improvement in impactful long-term outcomes such as delaying need for liver transplantation or reducing mortality. We strongly agree with the AASLD recommendation that PSC patients should consider clinical trial participation [3]. However, conducting feasible and successful trials is difficult in PSC given its heterogeneity, slow progression, and lack of definitive diagnostic criteria [85].

An international group consensus process was recently commissioned to address the need for more uniform criteria to diagnose PSC [86]. To this point, early studies also may have been confounded by the presence of IgG4-associated cholangitis which was only first described in 2007 [85]. Furthermore, concurrent IBD is present in the majority of PSC patients (traditionally, the disease activity of each is considered independent) and dominant strictures are variably defined [85]. Patient quality-of-life factors, including the impact of fatigue and pruritus, are not reflected in typical biochemical, imaging, or histologic assessments and thus patient perception of their disease may differ from the treating clinician [85].

Important clinical outcomes such as delaying biliary and/or cirrhosis-related complications and improving transplant-free and overall survival are difficult to study in PSC. Specifically, disease heterogeneity and differences in transplant allocation across time and region complicate optimal study design. Several risk scores (Table 2), including the Mayo Risk Score and the PSC Risk Estimate Tool (PREsTO), as well as the enhanced liver fibrosis test, exist to assist clinicians in predicting PSC course [27, 28, 87, 88]. In contrast, the MELD score is predominantly utilized in the United States (and by the authors) for prognostic purposes given its centrality to organ allocation across all forms of liver disease. Nonetheless, simpler surrogate endpoints for clinical trials have been sought as attractive outcome measures but suffer from a lack of validation [89].

Table 2 Prognostic models for PSC

Many traditional serologic biomarkers, including ALP and bilirubin, do not directly correlate with clinical outcomes in isolation. Conversely, histologic assessment of fibrosis with liver biopsy is often part of trial protocols and associates with transplant-free survival as well as time to liver transplantation [87, 90, 91]. While biopsy remains the gold standard for staging fibrosis, it is invasive, carries risks of bleeding and mortality (albeit low), and is subject to histologic discrepancy depending on whether an atrophic or hypertrophic lobe is biopsied [92]. The International PSC Study Group underwent a consensus process in 2016 to evaluate surrogate endpoints for clinical trials, including biomarkers (ALP, bilirubin), transient elastography, liver biopsy, magnetic resonance imaging features, and various combinations. While many of these surrogates were rated as reasonable measurements, none were felt to have reached high-level validation [93].

Magnetic resonance features as surrogate endpoints have gained increasing interest. The ANALI score (Table 2), with variations to account for the presence or absence of gadolinium, predicts progression in PSC patients and clinical outcomes (portal hypertension-related decompensation, liver transplantation, or liver-related death) [94,95,96]. However, it has lacked widespread adoption given poor interrater reliability [96]. Nonetheless, recent literature suggests a potential role for quantitative MRI/MRCP metrics to predict liver-related outcomes. Two small retrospective European cohort studies utilized proprietary commercial MRI post-processing software (MRCP+, Perspectum Ltd, Oxford, UK) to evaluate biliary stricture characteristics including number, length, and associated dilations. Although these studies were limited by length of follow-up, sample size, and retrospective design, the quantitative metrics were predictive of liver-related events and generally outperformed the ANALI score [97, 98]. Thus, while promising, further study into imaging characteristics and derived prognostic scores is required before these can be recommended as valid surrogate endpoints.

Drug regulatory and approval agencies typically allow for expedited approval of agents targeting rare disease if convincing evidence suggests benefit for a selected surrogate endpoint, provided this endpoint is believed to be along the pathway of disease [85]. With this in mind, future trial designs need to be cognizant of the patient population enrolled and consider combinations of biomarker improvement, clinical outcomes, and patient quality-of-life measures. Another practical challenge is resource allocation and feasibility in the setting of endpoint timeframes. For example, biomarker and quality-of-life improvements may occur with shorter follow-up than the development of cirrhosis, hepatic decompensation, cholangitis, liver transplantation, or death.

Conclusion and Future Directions

In summary, PSC remains a difficult disease for clinicians to treat given the wide range of phenotypes. This review serves to remind the reader that there is still a plethora of unanswered questions but we have highlighted some of the most salient controversies and provided a balanced review of the literature as well as pearls from our own experience. Efficacious novel drug development is needed to delay progression of this insidious disease. However, significant patient heterogeneity and lack of a universally accepted clinical endpoint complicate mid-to-late phase clinical trial design.