Introduction

The hepatopulmonary syndrome (HPS) is found in 10–32 % of patients with cirrhosis [1, 2] and is associated with decreased median survival compared with age- and disease severity-matched patients with cirrhosis alone [1].

No gold standard diagnostic test exists for HPS. Accordingly, a diagnosis is established by the presence of the following syndromic triad: (1) liver dysfunction or portal hypertension, (2) intrapulmonary vascular dilatations (IPVDs), and (3) abnormal oxygenation [3]. Liver dysfunction or portal hypertension is identified through liver function tests, imaging, liver biopsy, and/or portal pressure measurement [3]. IPVDs can be identified indirectly through macroaggregated albumin (MAA) scanning or more commonly through contrast echocardiography (CE). Although highly sensitive, CE is not specific for HPS, with 34–47 % of cirrhotic patients having positive CE [4, 5]. The final diagnostic criterion, abnormal oxygenation, is identified through arterial blood gas (ABG) analysis in the upright position. However, the threshold for abnormal oxygenation used to define HPS has been based solely on expert opinion and is inconsistent in published literature [6]. Definitions have included a single measurement of an elevated alveolar-arterial oxygen gradient (AaDO2) alone, a reduced partial pressure of arterial oxygen (PaO2) alone, or combinations thereof [4, 6]. This inconsistency has resulted in highly variable published prevalence estimates, limited comparability of research findings across studies, and ambiguity in clinical diagnosis [6].

We sought to characterize the variability of oxygenation over time in HPS, and to compare the performance of published oxygenation diagnostic criteria for HPS according to their diagnostic stability over time and their ability to identify patients with clinically relevant markers of disease progression and severity. For each published criterion, we also evaluated the impact on these parameters of requiring two consecutive abnormal ABGs on different days.

Patients and Methods

We performed a retrospective chart review of all patients referred for possible HPS to either the University of Toronto (St. Michael’s Hospital) (from June 2004 onward) or Universite de Montreal (Hopital St-Luc) (from November 2000 onward). The study protocol was approved by each institution’s research ethics board.

Population

We included patients with liver dysfunction and/or portal hypertension who had a CE consistent with IPVDs (appearance of microbubbles in the left atrium ≥3 cardiac cycles after appearance in the right atrium), and ≥2 ABGs (a minimum of two ABGs were required to evaluate the performance of oxygenation cutoffs over time). We excluded patients with severe concurrent lung disease [obstruction with forced expiratory volume in 1 s (FEV1) <50 % predicted and/or restriction with total lung capacity (TLC) <65 % predicted] [7], pulmonary hypertension (echocardiographic right ventricular systolic pressure ≥50 mmHg and/or measured mean pulmonary artery pressure >25 mmHg) (either at initial evaluation or any time thereafter, including patients who transitioned from HPS to portopulmonary hypertension), CE evidence of an inter-atrial shunt, or no eligible clinical visits. We excluded clinical visits which occurred after liver transplantation, during which there was evidence of a new acute lung process or a change in a chronic coexisting lung process which could alter oxygenation, or during which the patient was on an experimental HPS therapy.

Assessment Routine

All patients referred for HPS had CE, chest radiography, pulmonary function testing, and ABG. Patients with hypoxemia (PaO2 < 80 mmHg) also had a chest CT. All patients received optimization of any concurrent lung disease. Patients were followed regularly with standardized ABG and diffusion capacity (DLCO) testing, and with other tests as required. ABGs were performed on room air, at rest, in the same position for each patient, at each visit. Samples were taken after 15 min in the seated position prior to January 2005 at the University of Toronto, and prior to January 2007 at the Universite de Montreal, and after 15 min in the standing position [8] subsequent to those dates, at both centers. Only samples taken in the same position and analyzed in the same laboratory (by the same operators, using the same blood gas analyzer) were compared.

AaDO2 was calculated for each patient as follows: AaDO2 = PAO2 − PaO2, where PAO2 is the partial pressure of alveolar oxygen estimated by the ideal alveolar gas equation at sea level (PAO2 = 150 − PaCO2/0.8) [3].
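
For illustration only, the AaDO2 calculation above can be sketched in Python. This is not the authors' code; the function names and the example ABG values are hypothetical, and the constants (150 mmHg and 0.8) are taken directly from the equation as stated, so the sketch applies only to room air at sea level.

```python
def alveolar_po2(paco2_mmhg: float) -> float:
    """Estimated alveolar PO2 (PAO2, mmHg) on room air at sea level.

    Implements PAO2 = 150 - PaCO2 / 0.8 as written in the text.
    """
    return 150.0 - paco2_mmhg / 0.8


def aado2(pao2_mmhg: float, paco2_mmhg: float) -> float:
    """Alveolar-arterial oxygen gradient (mmHg): AaDO2 = PAO2 - PaO2."""
    return alveolar_po2(paco2_mmhg) - pao2_mmhg


if __name__ == "__main__":
    # Hypothetical ABG, not a study value: PaO2 70 mmHg, PaCO2 32 mmHg.
    # PAO2 = 150 - 32 / 0.8 = 110 mmHg, so AaDO2 = 110 - 70 = 40 mmHg.
    print(round(aado2(pao2_mmhg=70.0, paco2_mmhg=32.0), 1))
```

A lower PaCO2 raises the estimated PAO2, which is why hyperventilating patients can have a widened gradient despite a near-normal PaO2.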

Statistical Analysis

We calculated and expressed the variability of oxygenation over time as means of within-patient coefficients of variation and means and ranges of within-patient changes in PaO2 over 1 year. We repeated this analysis after removing all patients with any concurrent lung disease. We assessed the stability of different diagnostic criteria by determining the percentage of patients who met criteria for HPS initially, but no longer met criteria on a later ABG over the study period. For criteria requiring only one abnormal ABG, all patients with a minimum of two clinic visits (i.e., two ABGs) were included, since a re-classification from HPS to “no HPS” could potentially occur on any visit from the second visit onward. Correspondingly, for criteria requiring two consecutive abnormal ABGs on different days, only patients with a minimum of three visits were included, since a re-classification from HPS to “no HPS” could only occur from the third visit onward (the first two clinic visits were required to establish an HPS diagnosis). We tested unique oxygenation-related diagnostic criteria in the literature as well as all possible PaO2 and AaDO2 cutoff combinations thereof that yielded unique results. We calculated rates of change in PaO2 for patients with and without HPS (according to each diagnostic criterion) by using a simple slope if only two ABGs were available, and using the least squares regression technique if ≥3 ABGs were available. In order to distinguish disease progression from random variability, only patients with at least two ABGs 6 months apart were included in this analysis. We also calculated the mean DLCO for patients with and without HPS according to each diagnostic criterion. Data are expressed as proportions (percentages), means and standard deviations or minimum and maximum values, as appropriate. Data were analyzed using SAS 9.3.
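
As a minimal sketch of two of the quantities described above (the study itself used SAS 9.3, and this is not the authors' code), the within-patient coefficient of variation and the rate of change in PaO2 could be computed as follows. The function names and example values are hypothetical. The use of the sample standard deviation for the coefficient of variation is an assumption, since the text does not specify it.

```python
from statistics import mean, stdev
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Within-patient coefficient of variation, in percent.

    Assumption: sample standard deviation (n - 1 denominator).
    Requires at least two measurements.
    """
    return stdev(values) / mean(values) * 100.0


def rate_of_change(times_years: Sequence[float], pao2: Sequence[float]) -> float:
    """Least-squares slope of PaO2 against time (mmHg per year).

    With exactly two points the least-squares slope equals the simple
    slope (y2 - y1) / (t2 - t1), so one formula covers both cases.
    """
    if len(times_years) != len(pao2) or len(pao2) < 2:
        raise ValueError("need at least two paired (time, PaO2) values")
    t_bar, y_bar = mean(times_years), mean(pao2)
    sxx = sum((t - t_bar) ** 2 for t in times_years)
    if sxx == 0:
        raise ValueError("all measurements share the same time point")
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times_years, pao2))
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical patient with three ABGs over one year (not study data).
    times = [0.0, 0.5, 1.0]
    pao2_values = [80.0, 76.0, 70.0]
    print(coefficient_of_variation(pao2_values))
    print(rate_of_change(times, pao2_values))
```

A negative slope indicates falling PaO2, that is, progressive hypoxemia.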

Results

Over the study period, 119 patients were assessed for possible HPS. Of these, 21 (17.6 %) had fewer than two available ABGs, one (0.8 %) had a non-diagnostic CE, one (0.8 %) had no available CE or ABG, 27 (22.7 %) had a negative CE, four (3.4 %) had pulmonary hypertension, three (2.5 %) had an inter-atrial shunt, two (1.7 %) had severe concurrent lung disease (one severe obstruction due to asthma and one severe obstruction due to asthma and mild sarcoidosis), and two (1.7 %) had no eligible clinical visits, leaving 58 eligible patients. Among these, 41 (71 %) were male, mean age was 54.0 ± 12.7 years, mean MELD score was 13 ± 8, and Child-Pugh class distribution was 15/52 (29 %), 30/52 (58 %), and 7/52 (13 %) in classes A, B, and C (on presentation, in patients with available results), respectively. Etiologies of liver disease among these patients were as follows: 19/58 (33 %) alcoholic cirrhosis, 19/58 (33 %) hepatitis C, 5/58 (9 %) nonalcoholic steatohepatitis, 3/58 (5 %) hepatitis B, and 12/58 (20 %) other.

Within-Patient Variability in Oxygenation

Annual within-patient variability in oxygenation among the 58 patients with IPVDs and more than one ABG is shown in Table 1. Findings were similar when limited to patients with no concurrent lung disease (data not shown). When comparing the PaO2 at the first visit to that at the last, 41 % of patients had an increase in PaO2 over 1 year, 8 % had no change, and 51 % had a decrease.

Table 1 Within-patient temporal variability in oxygenation (58 patients)

Diagnostic Stability of Published Oxygenation Criteria for HPS Diagnosis

The stability of HPS diagnosis for each oxygenation criterion is expressed in Table 2. Considering patients “re-classified” if they met criteria for HPS initially, but no longer met criteria on a later ABG, the only combination of published PaO2 and AaDO2 cutoffs that yielded a difference in the number of re-classifications was the addition of AaDO2 > age-related threshold to PaO2 < 80 mmHg. This criterion was thus added to the eight previously published criteria (Table 2). Further, among the 40 patients with ABG data from more than two visits, we evaluated the effect of requiring two consecutive abnormal ABGs on different days, on the frequency of re-classifications (Table 2).
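
The re-classification rule described above can be sketched as follows. This is an illustrative Python sketch rather than the authors' analysis code; the names `ABG`, `pao2_below_80`, and `reclassified` are hypothetical. Two assumptions are made explicit: the diagnosis is established on the first one (or first two) ABGs, and a single later ABG that fails the criterion counts as a re-classification. Only the PaO2 < 80 mmHg cutoff is shown; other criteria can be passed in as different predicates.

```python
from typing import Callable, NamedTuple, Optional, Sequence


class ABG(NamedTuple):
    pao2: float   # mmHg
    aado2: float  # mmHg


def pao2_below_80(abg: ABG) -> bool:
    """One example criterion from the text: PaO2 < 80 mmHg."""
    return abg.pao2 < 80.0


def reclassified(
    abgs: Sequence[ABG],
    is_abnormal: Callable[[ABG], bool],
    n_required: int = 1,
) -> Optional[bool]:
    """Return True if an initial HPS diagnosis is later reversed.

    n_required = 1 models a single-ABG criterion; n_required = 2 models
    two consecutive abnormal ABGs on different days.
    Returns None when the patient has too few visits to be evaluated
    or did not meet the criterion on the initial ABG(s).
    Assumption: one later ABG failing the criterion is a re-classification.
    """
    if len(abgs) < n_required + 1:
        return None
    initial, later = abgs[:n_required], abgs[n_required:]
    if not all(is_abnormal(a) for a in initial):
        return None
    return any(not is_abnormal(a) for a in later)


if __name__ == "__main__":
    # Hypothetical serial ABGs in visit order (not study data).
    visits = [ABG(78.0, 30.0), ABG(82.0, 25.0), ABG(76.0, 33.0)]
    print(reclassified(visits, pao2_below_80, n_required=1))
    print(reclassified(visits, pao2_below_80, n_required=2))
```

In this hypothetical series the single-ABG rule diagnoses HPS at the first visit and reverses it at the second, whereas the two-ABG rule never establishes the diagnosis, which is the mechanism by which a two-visit requirement can reduce re-classification.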

Table 2 Percentage of patients re-classified from HPS to No HPS over time, by oxygenation criterion for HPS diagnosis

Physiological Characteristics of Populations Defined by Each Criterion

Mean rates of change in PaO2 in patients with and without HPS, as defined by each diagnostic cutoff, and differences in these rates of change between HPS and non-HPS patients are shown in Table 3. Results for AaDO2 were similar (data not shown). Mean DLCO at time of diagnosis in patients with and without HPS according to each criterion, and differences between these means, are shown in Table 4.

Table 3 Mean rate of change in partial pressure of arterial oxygen by diagnostic criterion
Table 4 Mean diffusion capacity by diagnostic criterion

Discussion

Published diagnostic criteria for the oxygenation abnormality required to diagnose HPS are inconsistent and based only on expert opinion. We analyzed serial ABGs in patients with cirrhosis and IPVDs and found that: (1) oxygenation varies greatly over time; (2) depending on the diagnostic criterion being applied, a high proportion of patients who were diagnosed with HPS on an initial ABG no longer met criteria for HPS on a subsequent ABG (re-classification); and (3) for each diagnostic criterion, requiring two consecutive abnormal results as opposed to a single abnormal result defines a disease population which is less likely to be “re-classified” on a later ABG, and exhibits better congruence with clinically relevant disease parameters. To our knowledge, this is the most detailed description of variability in oxygenation over time in this population, and the first to compare existing oxygenation criteria for their diagnostic stability over time and their correlation with clinically relevant parameters.

We demonstrated large temporal variability in oxygenation within patients, ranging from a drop in PaO2 by 19.0 mmHg to an increase by 26.6 mmHg over 1 year, with a mean change of 5.49 mmHg (Table 1). This variability is due to a combination of random variability in both patients with and without HPS, and the expected deterioration in the subset of patients with HPS [9–11]. Sources of random variability in oxygenation likely included measurement-related and true physiological variations. Although we minimized measurement variability by standardizing operators, patient position, and ABG analyzing equipment, prior studies have shown that PaO2 technical measurement error is high. Thorson et al. [12] found a mean coefficient of variation (CoV) of 5.1 % in six ABGs taken over 50 min in stable ICU subjects. This is similar to our annual CoV of 6.3 % and suggests that the majority of variability is measurement-related and not physiological. Any random physiological variation could have been from either HPS or non-HPS-related causes. We attempted to minimize non-HPS-related variability by eliminating any visits during which there was evidence of a new acute lung process or a change in a chronic coexisting lung process which could alter oxygenation. We also repeated the analysis in patients with no concurrent lung disease and noted similar variability. Potential causes of variation over time in this population include changes in ventilation/perfusion matching due to changes in ventilation caused by transient atelectasis (possibly related to changes in ascites) or airway secretions, and/or changes in perfusion due to variations in vessel tone. Other causes include central changes in ventilation and spontaneous changes in systemic O2 consumption, CO2 production, and O2 delivery (due to fluctuations in cardiac output and/or hemoglobin) [12, 13].

To be clinically useful, the criterion used to diagnose HPS should account for this random variability, identifying patients who will continue to meet the criterion with repeated testing over time and whose disease is likely to progress over time. However, when testing the various existing criteria, over a mean of only 1.1 years, we noted that 8.6–15.5 % of patients were re-classified from a diagnosis of HPS to a diagnosis of no HPS upon repeated testing (Table 2). Given that the only PaO2 cutoff that actually impacts treatment is PaO2 < 60 mmHg (at which point MELD exception points are awarded to increase liver transplant priority [14]), one might argue against the practical relevance of our findings. However, we believe that our findings have important implications for individual patients and their families. Diagnostic instability leads to a scenario in which a patient is diagnosed with HPS at one clinical visit, only to have the diagnosis reversed at a follow-up visit. Given the prognostic implications of HPS, the clinical ramifications of HPS misdiagnosis are not trivial and include both patient anxiety and implications for future insurability [1, 9].

HPS misdiagnosis may be addressed by requiring two abnormal ABG results to make an HPS diagnosis. We found that requiring two consecutive abnormal ABG results on different days minimizes the impact of random variability on diagnosis, with a smaller proportion of patients later being re-classified (Table 2). Furthermore, when compared to those diagnosed by a single abnormal ABG, HPS patients diagnosed by two abnormal ABGs had a more rapid progression in hypoxemia, with a larger difference in the rate of this progression between HPS and non-HPS patients (a larger difference suggests a greater ability to identify patients who will deteriorate) (Table 3). This is an attractive diagnostic feature given that hypoxemia has been shown to be progressive in clinically relevant HPS [2, 9–11]. Other data also suggest that identifying a population likely to have a more rapid decline in PaO2 is important. Subjects with a PaO2 < 70 mmHg experience dyspnea more commonly [4], those with a PaO2 ≤ 60 mmHg may have increased mortality without liver transplant [10], and those with PaO2 ≤ 50 mmHg may have increased mortality post-liver transplant [10, 15, 16]. Accordingly, an HPS diagnostic criterion that accurately identifies patients at risk of rapid deterioration would be clinically relevant and desirable. Given that experts already recommend serial pulse oximetry and ABGs in all liver transplant candidates [10, 17], a two-visit diagnostic criterion may also be practically applicable. However, it should be noted that in patients being transplanted principally for HPS, such an approach might also delay transplant listing, and in severe cases, MELD exception point allocation, which could increase morbidity and mortality.
At the same time, this concern must be balanced against the fact that 8.6 % of our patients with an initial PaO2 < 60 mmHg had a higher PaO2 on a subsequent clinic visit, implying that these patients may have inappropriately been granted early transplant priority through MELD exception points. This concern is congruent with and may partially explain the findings of a recent multicenter analysis, suggesting that the current MELD exception policy (a single visit PaO2 < 60 mmHg) grants patients with HPS an unfair survival advantage over patients without HPS [16].

It should also be noted that the lack of a standard definition for oxygenation abnormality in HPS has led to a multitude of studies reporting prevalence, predictors, outcomes, and even therapies for HPS populations which are defined very differently from one study to the next [6, 18]. As shown in Table 5, this problem exists in recently and contemporaneously published studies. This leads to confusion among various stakeholders, including clinicians, patients, patient-support groups, and policy-makers. Furthermore, it limits not only our ability to draw conclusions from existing literature, but also to meta-analyze results to reach more robust conclusions. This is especially problematic in a rare disease such as HPS, where combining results from several small studies across centers would be particularly valuable, given limited recruitment at any one center [6]. Accordingly, we believe that future research is urgently required to establish a best evidence-based and universally accepted oxygenation abnormality criterion for HPS diagnosis. Our findings can be used to identify which candidate criteria provide the best diagnostic stability over time and correlate best with clinically relevant parameters, thereby meriting further evaluation.

Table 5 Recently and contemporaneously published oxygenation criteria for diagnosis of HPS

Our study has several limitations. We did not directly assess clinical outcomes such as pre- or peri-transplant morbidity or mortality in comparing the various criteria, including two-visit criteria. Schenk et al. demonstrated that a single value AaDO2 > age threshold independently predicts a significantly lower adjusted median survival [1], and Fallon et al. noted an increased adjusted hazard ratio for death in patients with HPS defined by a single AaDO2 ≥ 15 mmHg (≥ 20 mmHg in patients > 64 years of age) (similar to ERS Task Force Guidelines) [19]. It is possible that patients identified with two abnormal results in these studies would have demonstrated even worse outcomes, and this should be determined in a future prospective study. A prospective study would also be better able to control for possible confounding variables that may have influenced the results of our retrospective analysis. Next, given the high frequency of mild to moderate concurrent lung disease in patients with cirrhosis and IPVDs [20], inclusion of these patients in our analysis improves the generalizability and practical applicability of our findings. However, we did not include patients with severe lung disease in this study, and diagnosis of HPS in this population remains particularly challenging. Future research should address objective methods to weigh the contribution of oxygenation abnormality from HPS in the presence of concurrent lung disease. Given its higher specificity compared to CE, macroaggregated albumin shunt quantification (MAA) may have a role in these cases [21]. Similarly, we did not have results of MAA and non-invasive shunt (PaO2 on 100 % FiO2) testing in our cohort. These data would have helped to identify the most clinically relevant oxygenation criterion and should be included in future studies seeking to compare oxygenation diagnostic criteria on the basis of clinical validity. 
Finally, as noted above, the practical applicability of our findings cannot be fully assessed without analysis of the impact that a requirement for two abnormal oxygenation results would have on the timing of transplant listing, and the clinical consequences of any listing delays.

Continued efforts to identify key biomarkers [3] and genetic determinants of HPS [22] will hopefully lead to a gold standard diagnostic test for this disease. Until then, syndromic criteria must be used, leaving diagnosis inherently susceptible to error. Our findings suggest that two consecutive abnormal oxygenation results on different days may reduce misdiagnosis and better differentiate patients with and without HPS according to clinically relevant markers of disease progression and severity. Future research will be required to identify a single preferred criterion among existing published diagnostic criteria, based on predicted survival, and to prospectively assess whether requiring two abnormal results enhances the performance of this criterion. This would not only improve clinical care, but also harmonize research across centers.