Introduction

In recent years, there has been a trend towards the non-invasive evaluation of the stage of liver fibrosis, with the goal of providing this important parameter in a less cumbersome and less invasive way than liver biopsy [1, 2]. This paradigm shift is, at least in part, the result of numerous reports dealing with liver biopsy complications, as well as sampling and inter-observer variability in data interpretation [3, 4].

Physical non-invasive tests measure liver stiffness (LS) as a surrogate for liver fibrosis. Elastography methods for liver stiffness measurement (LSM) include the ultrasound (US)-based techniques of transient elastography (TE), point shear wave elastography (pSWE) and two-dimensional shear wave elastography (2D-SWE), as well as MR elastography [5,6,7]. Owing to the evidence accumulated from numerous studies, TE (developed by Echosens) is considered to be the non-invasive standard for the measurement of LS [2]. TE is based on calculating the propagation speed of an elastic shear wave through tissue [5, 8, 9]. However, one of its drawbacks is the inability to acquire valid measurements in the setting of ascites and, to a certain extent, in obesity [10, 11].

Point SWE uses acoustic radiation force impulse (ARFI) to generate ultrasonic pressure waves that are transmitted through the body into the liver, where a portion of their energy induces shear waves that travel perpendicular to the plane of the excitation impulse. The shear wave velocity is subsequently measured by pulsed Doppler using the same probe.

As a recently introduced representative of pSWE, elastography point quantification (ElastPQ®), developed by Philips Healthcare, has not been fully evaluated in clinical studies [12,13,14]. The main advantage of pSWE is its integration in the US machine and its simultaneous operation with B-mode scanning. This enables the operator to select the region of interest (ROI) and to perform the measurement despite the presence of ascites or obesity. Parameters such as the cutoff values for different fibrosis stages, the predictive value and the influence of the quality criteria have undergone considerable investigation in recent years [14,15,16,17,18]. The aim of this study was to investigate the diagnostic performance of ElastPQ for the non-invasive assessment of LS in patients with chronic liver disease (CLD) and in healthy subjects, using TE as the reference test.

Patients and methods

This was a single-centre, cross-sectional study with prospectively enrolled patients in a tertiary-care hospital setting. During a 4-month period (February–June 2017), outpatients with CLD referred for liver ultrasound examination in the ultrasound unit of the Department of Gastroenterology were considered as candidates for this study. On each day of the week, the first two patients with a referring diagnosis of CLD were included in the study, provided that a successful LSM had been performed using TE, which served as a reference method for staging liver fibrosis. In cases where the LSM had failed, the next patient with CLD was analysed and so on, until successful LSMs had been accomplished for two patients each day.

Patients had to be over 18 years of age with previously diagnosed CLD and with available laboratory results performed within a 3-month period. The diagnosis of non-alcoholic fatty liver disease (NAFLD) relied on the ultrasonographically confirmed presence of fatty liver in at least two US examinations 6 months apart, with/without elevated liver function tests if excessive alcohol consumption and the use of drugs with known steatogenic potential had been excluded. The following biochemical parameters were documented: bilirubin, aspartate transaminase (AST), alanine transaminase (ALT), gamma-glutamyltransferase (GGT), alkaline phosphatase (AP) and platelet count (Plt). From these data, the FIB4 score was calculated for each patient according to the previously published formula [19]. The exclusion criteria were overt cholestasis with dilatation of the intrahepatic bile ducts, congestive liver failure, liver transaminases greater than 5 × the upper limit of the normal (ULN) value, the presence of ascites and the failure to perform a reliable LSM using TE [2]. The patients that were eligible according to the inclusion/exclusion criteria underwent LSM using ElastPQ during the same visit. We evaluated the diagnostic accuracy of the LSM obtained by ElastPQ against the LSM obtained using TE as the reference standard. As a reference, we used three points of clinical interest, according to established TE cutoff points for liver fibrosis stage: ≥ 7 kPa for F ≥ 2; ≥ 9.5 kPa for F ≥ 3 and ≥ 12 kPa for F = 4 [7].
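The FIB4 calculation referred to above can be sketched as follows. This is a minimal illustration that assumes the commonly published form of the index (age in years, AST and ALT in U/L, platelet count in 10^9/L); the function name and the example patient are hypothetical and are not taken from the study.

```python
import math


def fib4_score(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """FIB-4 index: (age x AST) / (platelet count x sqrt(ALT)).

    Assumed units: age in years, AST/ALT in U/L, platelets in 10^9/L.
    """
    if alt_u_l <= 0 or platelets_10e9_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


# Hypothetical patient, for illustration only (not study data).
print(round(fib4_score(age_years=53, ast_u_l=40, alt_u_l=36, platelets_10e9_l=200), 2))
```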

We also included a group of healthy volunteers in order to check the LSM values obtained using ElastPQ in healthy livers. This group of participants was recruited from the subjects who came to the unit for an annual preventive check-up. These participants were considered eligible if they had no history of liver disease and their liver function tests and liver US were normal. All of the participants signed an informed consent and the local ethics committee approved the protocol of the study.

Elastography point quantification

Elastography point quantification was performed by three experienced physicians (each had performed at least 100 LSMs using ElastPQ prior to the start of this study) using the Epiq7 ultrasound system (Philips Healthcare) with a convex transducer C5-1 (1–5 MHz). The subjects fasted for at least 3 h prior to sessions. The right liver lobe was targeted through the intercostal space with the subjects lying in a dorsal decubitus position with the right arm in maximal abduction. The US probe was lubricated with gel to improve ultrasonic wave transmission into the liver. The skin to liver capsule distance (SCD) was measured for each participant. With the help of the real-time B-mode image, a vessel-free area at least 1.5 cm below Glisson’s capsule was selected. During measurements, patients were instructed to hold their breath in a neutral position while the operator pressed a button that launched the measurement acquisition. At least 10 valid measurements, expressed in kPa, were obtained for each patient. The median value was considered reliable only if the interquartile range/median (IQR/M) was < 30%.
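The reliability rule described above (at least 10 valid measurements, median reported, IQR/M < 30%) can be sketched as follows. The quartile convention used here (Python's "inclusive" method) is an assumption, since the ultrasound system may compute the IQR differently; the function names and example readings are hypothetical.

```python
import statistics


def summarize_lsm(values_kpa):
    """Return (median in kPa, IQR/M in %) for one patient's valid measurements."""
    if len(values_kpa) < 10:
        raise ValueError("at least 10 valid measurements are required")
    median = statistics.median(values_kpa)
    # Quartile convention is an assumption; devices may use another definition.
    q1, _, q3 = statistics.quantiles(values_kpa, n=4, method="inclusive")
    return median, (q3 - q1) / median * 100


def is_reliable(values_kpa, max_iqr_pct=30.0):
    """Apply the IQR/M < 30% criterion to a set of valid measurements."""
    _, iqr_pct = summarize_lsm(values_kpa)
    return iqr_pct < max_iqr_pct


# Hypothetical readings in kPa, for illustration only.
readings = [5.0, 5.2, 5.4, 5.5, 5.6, 5.8, 6.0, 6.1, 6.3, 6.5]
median_kpa, iqr_pct = summarize_lsm(readings)
print(f"median {median_kpa:.1f} kPa, IQR/M {iqr_pct:.1f}%, reliable: {is_reliable(readings)}")
```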

Transient elastography

Three independent operators with at least 2 years of experience performed the TE (each had performed > 200 LSMs using TE). We used the FibroScan Touch 502 device (Echosens). Since both the TE and the ElastPQ were performed in the same session, the conditions were the same: all sessions took place in the same examination room, the subjects were fasting and the right liver lobe was targeted through the intercostal space with the subjects lying in a dorsal decubitus position with the right arm in maximal abduction. Only examinations with 10 valid measurements (success rate ≥ 60%) and an interquartile range/median (IQR/M) of < 30% for values greater than 7.1 kPa were considered reliable. The IQR/M < 30% criterion was not mandatory for patients with a median LSM < 7.1 kPa [20]. The LSM was considered failed when no numerical value could be obtained. We intended to use an M probe in patients with SCD ≤ 25 mm or BMI ≤ 30 kg/m2. However, if the FibroScan device signalled that the XL probe should be used instead, or in cases of a high number of failed LSMs using the M probe (> 40%), we switched to the XL probe.
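The TE reliability criteria listed above can be expressed as a short decision function. This is a sketch under stated assumptions: a median of exactly 7.1 kPa is treated as requiring the IQR/M criterion (the text does not specify the boundary case), and the function and parameter names are hypothetical.

```python
def te_is_reliable(median_kpa, iqr_kpa, n_valid, n_attempts):
    """Decide whether a TE examination meets the reliability criteria.

    Criteria: 10 valid measurements, success rate >= 60%, and IQR/M < 30%
    unless the median is below 7.1 kPa.
    """
    if n_attempts <= 0 or n_valid < 10:
        return False
    if n_valid / n_attempts < 0.60:
        return False
    if median_kpa < 7.1:
        # IQR/M criterion is not mandatory for low median values.
        return True
    # Assumption: a median of exactly 7.1 kPa falls under the IQR/M rule.
    return iqr_kpa / median_kpa < 0.30


# Hypothetical examinations, for illustration only.
print(te_is_reliable(median_kpa=6.0, iqr_kpa=3.0, n_valid=10, n_attempts=12))
print(te_is_reliable(median_kpa=12.0, iqr_kpa=4.0, n_valid=10, n_attempts=12))
```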

Statistical analysis

Statistical analysis was performed using SPSS software, version 24.0 (SPSS Inc.), and MedCalc for Windows, version 12.0 (MedCalc Software). The patient characteristics are given as the mean ± SD, as appropriate. Student’s t test for independent measurements was used for comparison of the means. The diagnostic accuracy of each non-invasive model was evaluated by calculating the areas under the receiver operating characteristics (ROC) curve (AUC).

In order to assess the diagnostic reliability and interchangeability of ElastPQ against TE, which served as the reference standard, we used a set of statistical tests for the purpose of method comparison. We graphically inspected the two methods using a Bland-Altman plot and then proceeded with Passing-Bablok regression analysis in order to quantify the existence of differences between the measurements produced by the two methods and to assess for the presence of the constant and proportional difference between them.

A two-sided p value of < 0.05 was considered to be significant for all statistical tests.

Results

A successful TE LSM, used as the reference standard, was obtained in 201 subjects (181 patients who were eligible on the basis of the inclusion/exclusion criteria and 20 healthy volunteers). The majority of the TE LSM measurements were performed using an XL probe (126; 68.1%). These 201 subjects underwent LSM using ElastPQ during the same session, which produced reliable results in 185/201 (92.0%) and an unreliable LSM in 16/201 (8.0%), whereas no failure of LSM was observed in this cohort. Subjects with reliable measurements had, on average, lower values (p < 0.01) of BMI and skin to capsule distance (SCD) than those with unreliable measurements (BMI 27.25 vs. 30.68 kg/m2 and SCD 1.81 vs. 2.06 cm, respectively).

The final analysis included a total of 185 patients with a successful LSM using both methods, with 102 (55.1%) males and 83 (44.9%) females. The mean age (SD) of the patients was 53 (14) years, ranging from 18 to 82 years. The baseline characteristics of the patients are presented in Table 1.

Table 1 Baseline characteristics of the studied cohort of patients

The liver stiffness measurement values obtained using ElastPQ had AUC = 0.955 (95% CI = 0.914–0.980; p < 0.001) for diagnosing fibrosis stage F ≥ 2, with the best performing threshold point of 5.5 kPa. The sensitivity to this cutoff was 97.8% (95% CI = 88.2–99.6) and the specificity was 84.3% (95% CI = 77.2–89.9) with a positive predictive value (PPV) of 66.7% and a negative predictive value (NPV) of 99.2%.

To diagnose the F ≥ 3 fibrosis stage, the LSM values obtained using ElastPQ had AUC = 0.983 (95% CI = 0.952–0.996; p < 0.001), with the best performing threshold point of 8.1 kPa. This cutoff had a 92% sensitivity (95% CI = 73.9–98.8) and the specificity was 96.25% (95% CI = 92.0–98.6) with a PPV of 79.3% and an NPV of 98.7%.

To diagnose cirrhosis (F = 4), the LSM values obtained using ElastPQ had AUC = 0.982 (95% CI = 0.921–1.000; p < 0.001), with the best performing threshold point of 9.88 kPa. The sensitivity to this cutoff was 90.5% (95% CI = 69.6–98.5) and the specificity was 98.2% (95% CI = 94.7–99.6) with a PPV of 86.4% and an NPV of 98.8%.
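The sensitivity, specificity, PPV and NPV reported above follow from cross-tabulating the ElastPQ result at a given cutoff against the TE-defined reference class. A minimal sketch of that calculation is given below; the paired values are synthetic and are not study data, and treating a value equal to the cutoff as positive is an assumption.

```python
import math


def diagnostic_metrics(reference_kpa, index_kpa, reference_cutoff, index_cutoff):
    """Cross-tabulate an index test against a reference test at fixed cutoffs."""
    tp = fp = tn = fn = 0
    for ref, idx in zip(reference_kpa, index_kpa):
        ref_positive = ref >= reference_cutoff  # e.g. TE >= 7 kPa for F >= 2
        idx_positive = idx >= index_cutoff      # assumption: ">=" at the cutoff
        if ref_positive and idx_positive:
            tp += 1
        elif ref_positive:
            fn += 1
        elif idx_positive:
            fp += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Synthetic paired measurements in kPa, for illustration only.
te_values = [4.5, 5.2, 6.1, 6.8, 7.4, 8.9, 10.2, 13.5]
elastpq_values = [4.4, 5.0, 5.8, 5.2, 6.1, 5.3, 8.0, 10.4]
print(diagnostic_metrics(te_values, elastpq_values, reference_cutoff=7.0, index_cutoff=5.5))
```

Because PPV and NPV depend on the prevalence of the target condition, values computed this way apply only to populations with a similar fibrosis stage distribution.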

A graphic representation of all three AUCs is depicted in Fig. 1.

Fig. 1

Areas under the receiver operating curves for diagnostic accuracy of liver stiffness measurement (LSM) obtained by ElastPQ with transient elastography (TE) serving as the reference method. (a) F ≥ 2 (LSM by TE ≥ 7 kPa). (b) F ≥ 3 (LSM by TE ≥ 9.5 kPa). (c) F = 4 (LSM by TE ≥ 12 kPa)

We performed a subgroup analysis in patients with NAFLD (N = 128; 69.2%). The AUC for F ≥ 2 (N = 30) was 0.961 (95% CI = 0.932–0.999; p < 0.0001), with the best performing cutoff value 5.5 kPa (sensitivity 93.9%; specificity 83.2%). Due to spectrum bias (F0-1 was present in 98, F2 in 15, F3 in four, and F4 in 11 patients, respectively), it was not possible to calculate reliable LSM cutoff values for F ≥ 3 and F4 in the NAFLD subgroup of patients. This warrants further study.

We analysed the correlation between the LSM values obtained using TE and those obtained with ElastPQ. There was a medium to strong significant correlation between the methods, with ρ = 0.72 (p < 0.0001; 95% CI 0.64–0.78). However, a pronounced dissipation of the LSM values was observed above the TE LSM threshold of around 10 kPa, as depicted in the scatter diagrams in Fig. 2.

Fig. 2

Scatter diagram of LSM values obtained by TE and ElastPQ. Diagram depicts dissipation of values, especially with TE > 10 kPa

The mean (SD) difference between the TE and the ElastPQ values was 0.98 (3.27) kPa, with 95% CI 0.51–1.45 and ranging from -4.99 to 21.60 kPa.

We also calculated the relative difference (in %) in LSM values between the methods, defined as (TE/ElastPQ − 1) × 100%. Although the overall mean relative difference between the TE and the ElastPQ values in the whole cohort was 3.7% (SD 26.9), the relative difference differed significantly between the TE LSM subgroups (see Table 2). In the subgroup of patients with TE LSM values ≤ 5 kPa, ElastPQ yielded values that were, on average, 8.4% higher than those of TE. However, in the other subgroups of patients, the magnitude of the difference increased progressively, reaching the highest average difference of 37.5% in favour of TE in the subgroup of patients with TE values > 15 kPa.
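The relative difference defined above can be computed per patient as in the following sketch. The function name and the two example pairs are hypothetical; the pairs were chosen only so that the results echo the magnitude of the subgroup averages quoted in the text.

```python
def relative_difference_pct(te_kpa, elastpq_kpa):
    """Relative difference (TE / ElastPQ - 1) x 100%.

    Positive values mean TE read higher than ElastPQ; negative values
    mean ElastPQ read higher than TE.
    """
    if elastpq_kpa <= 0:
        raise ValueError("ElastPQ value must be positive")
    return (te_kpa / elastpq_kpa - 1) * 100


# Hypothetical measurement pairs in kPa, for illustration only.
print(round(relative_difference_pct(te_kpa=16.5, elastpq_kpa=12.0), 1))
print(round(relative_difference_pct(te_kpa=4.58, elastpq_kpa=5.0), 1))
```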

Table 2 Relative difference between TE and ElastPQ liver stiffness values. Relative difference was calculated as (TE / ElastPQ − 1) × 100% for each patient

In order to evaluate the diagnostic reliability and interchangeability of the LSM obtained using ElastPQ, we compared this against the established reference standard (TE) using statistical methods for the comparison of the diagnostic methods.

We initially used a Bland and Altman plot to visually compare the two methods (see Fig. 3). The dotted lines represent the limits of agreement between ElastPQ and TE, which are defined as the mean difference ± 1.96 SD of the differences. If these limits do not exceed the maximum allowed difference between methods, Δ (that is, the differences within the mean ± 1.96 SD are not clinically important), the two methods are considered to be in agreement and may be used interchangeably. In our case, the two methods are mostly in agreement for values of less than 10 kPa. For values over 10 kPa, however, there is a clear dispersion of the measurements, with a tendency of ElastPQ to produce lower values than TE, and this effect depends on the magnitude of the measurements.
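The limits of agreement described above can be computed as in the following sketch. The worked example applies the definition to the summary statistics reported earlier for the TE minus ElastPQ differences (mean 0.98 kPa, SD 3.27 kPa); the resulting limits are derived here for illustration and may differ slightly from those drawn in Fig. 3 because of rounding. The function names are hypothetical.

```python
import statistics


def limits_of_agreement(mean_diff, sd_diff, z=1.96):
    """Return (lower, upper) limits: mean difference -/+ z x SD of the differences."""
    return mean_diff - z * sd_diff, mean_diff + z * sd_diff


def bland_altman(te_kpa, elastpq_kpa):
    """Mean difference, SD of differences and limits of agreement for paired data."""
    diffs = [te - pq for te, pq in zip(te_kpa, elastpq_kpa)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    return mean_diff, sd_diff, limits_of_agreement(mean_diff, sd_diff)


# Limits derived from the reported summary statistics (mean 0.98, SD 3.27 kPa).
lower, upper = limits_of_agreement(0.98, 3.27)
print(round(lower, 2), round(upper, 2))  # -5.43 7.39
```

Limits of roughly −5.4 to +7.4 kPa are wide relative to the 2.5 kPa steps between the TE staging cutoffs used in this study, which is consistent with the conclusion that the two methods should not be used interchangeably.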

Fig. 3

Bland and Altman plot comparing transient elastography (TE) and elastography point quantification (ElastPQ) measurements of liver stiffness. Most of the values < 10 kPa remain within ± 1.96 SD boundaries; values > 10 kPa show prominently lower values of ElastPQ when compared with TE

The Passing-Bablok regression analysis yielded a regression equation of y = 1.06 + 0.75x (Fig. 4) [21]. The intercept of 1.06 had a 95% CI of 0.55–1.46, which indicated a small but statistically significant constant difference (the CI does not contain the value of 0). The slope of 0.75 had a 95% CI of 0.66–0.85 (the CI does not contain the value of 1), which indicated a significant proportional difference between ElastPQ and TE. According to the cusum test (p > 0.10), there was no significant deviation from linearity.
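To illustrate what the constant and proportional differences imply, the reported regression equation can be applied to a few TE values. This sketch assumes that x is the TE value and y the ElastPQ value, as suggested by the axis descriptions of Figs. 4 and 5; the function name is hypothetical, and the output is an expected value on the regression line, not a per-patient conversion.

```python
def predicted_elastpq(te_kpa, intercept=1.06, slope=0.75):
    """Expected ElastPQ value on the reported regression line.

    Assumption: y = ElastPQ and x = TE in y = 1.06 + 0.75x.
    """
    return intercept + slope * te_kpa


for te in (5.0, 10.0, 15.0, 20.0):
    pq = predicted_elastpq(te)
    print(f"TE {te:4.1f} kPa -> ElastPQ {pq:5.2f} kPa (TE - ElastPQ = {te - pq:+.2f} kPa)")
```

Under this reading, the regression line crosses the line of identity at about 4.2 kPa, and the expected gap between TE and ElastPQ widens by 0.25 kPa for each additional kPa measured by TE.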

Fig. 4

Scatter diagram of transient elastography (TE) and elastography point quantification (ElastPQ) values. Passing and Bablok regression line (blue), red dashed line represents confidence interval for regression line. Red dotted line represents reference line where both methods would be in perfect correlation

We also created a residuals diagram (Fig. 5) based on the Passing-Bablok regression equation. The TE LSM values (used as a reference point) are on the x-axis, while the residuals of difference between the ElastPQ values and the values calculated from the regression equation (under perfect conditions, each difference would be zero, in other words, these methods would produce identical results) are shown on the y-axis. As observed, although there was a fair grouping of values with measurements of < 10 kPa, there was an obvious disagreement in cases with measurements of > 10 kPa.

Fig. 5

Residuals diagram of transient elastography (TE) and elastography point quantification (ElastPQ) comparison, based on Passing and Bablok regression equation (Fx). TE values are entered on x-axis, while on y-axis are residuals of difference between ElastPQ values and values calculated from regression equation (in perfect condition, each difference would be zero, in other words, both methods produce identical results)

Discussion

The results of this study demonstrate the excellent diagnostic performance of ElastPQ as the representative of the pSWE methods for the non-invasive staging of liver fibrosis in patients with CLD when TE was used as a reference method. The areas under the receiver operating characteristics curve for three clinically relevant points of interest (significant fibrosis, advanced fibrosis and cirrhosis) were in the range of 0.95 to 0.99.

ElastPQ is a relatively new US technique that is based on the elastography method for the quantitative assessment of liver fibrosis via measurements of liver stiffness. The studies published so far have revealed inconsistent results both in terms of the diagnostic performance of ElastPQ and the calculated cutoff values for differentiating between the stages of liver fibrosis (Table 3). The use of different reference methods for liver fibrosis assessment (i.e., TE or liver biopsy) and the mixture of the aetiologies of the CLD studied, added to the heterogeneity of the results. Although our results were obtained from a cohort of mixed aetiology CLD, they mostly reflect the performance of ElastPQ in NAFLD since the majority of the patients (almost 70%) had NAFLD. This is a finding that has not previously been reported.

Table 3 Published cutoff values and the diagnostic performance of elastography point quantification for non-invasive staging of liver fibrosis

In the studies published to date, there has been a clear trend pointing to the high reliability of the diagnostic performance of ElastPQ, although there remain some unresolved issues that need to be analysed in greater detail. First, from the results published by Ferraioli et al (and confirmed in the papers that followed), it became evident that the LSM values measured using ElastPQ were lower compared to the TE measure when undertaken in typical clinical and scientific scenarios that recruited patients with compensated CLD without overt portal hypertension or liver decompensation (this is because this subset of patients were candidates for liver biopsy) [13]. These patients usually have an LSM in the range of 5–15 kPa, as measured using TE. Within this range, the LSM measured using ElastPQ is, on average, 1 kPa lower, as also demonstrated in our study. However, in patients with more advanced liver disease and a stiffer liver, the difference between TE and ElastPQ becomes progressively divergent (Fig. 2). Although this 1 kPa approximation might be considered acceptable for LSM values within the range of 5–10 kPa, the lack of linearity and the increasing dissipation above this threshold do not allow for any meaningful correction to make the LSM values interchangeable with TE.

This observation calls for further investigation of ElastPQ, specifically in patients with compensated advanced chronic liver disease (defined by a TE LSM ≥ 10 kPa), in order to test its ability for risk stratification in this group of patients (presence of clinically significant portal hypertension, large oesophageal varices and prognostication).

The differences in the LSM values between the elastographic devices may be related to the specific technologies used to generate and track the shear waves. This has not only been observed between TE and ElastPQ but also between other elastography methods, as demonstrated by Piscaglia et al [22]. These authors also reported that a different intercostal space for the SWE from the one adopted for the Fibroscan was selected in almost half of the cases, which might have additionally influenced the final LSM result. Further, we must note the fact that TE measures a larger volume of liver tissue than pSWE; hence, it is probable that pSWE measures fibrosis in a local area of liver tissue that may have a different stiffness to that of the surrounding area. When using TE, such local differences are muted by averaging the stiffness across a larger sampling volume.

We need to note the limitations of our study. We used TE as a reference standard. Although liver biopsy has numerous shortcomings, it is still considered the de facto “gold standard”. Yet, TE has been established as a reliable surrogate and, in terms of comparing the interchangeability of the methods, the lack of the “gold standard” liver biopsy is not methodologically relevant. The studied cohort comprised different aetiologies of CLD, but the majority were NAFLD patients; thus, the results mostly reflect this specific aetiology. In addition, there was a significant spectrum bias in terms of the fibrosis stage, with only one quarter of patients presenting with significant fibrosis. Nevertheless, this reflects the prevalence of significant fibrosis in the real-world outpatient population and therefore enhances the clinical usability of our results. The number of analysed patients was moderate and further analyses with higher numbers of patients with a single aetiology CLD are still needed. The majority of the LSMs in this study were obtained using the XL probe, even in the patients with SCD < 25 mm, and it may be argued that the cutoff values are not the same for the M and the XL probe. However, in our experience, 25 mm is probably too high an SCD to use an M probe because even an 18–20 mm SCD results in a high number of failed or unreliable LSMs. In these cases, the XL probe almost always produces a reliable LSM. This observation is supported by some other authors who have reported the better accuracy of the XL as compared to the M probe in patients with SCD ≥ 17.5 mm [23]. Although the LSMs obtained using the XL probe tend to be 1–2 kPa lower compared to the M probe, it has recently been reported that when used in appropriate patients the LSM result can be interpreted using the same diagnostic cutoffs for both probes [24, 25].

Conclusions

ElastPQ appears to be a reliable method for the assessment of liver fibrosis. The data presented here are mostly applicable to NAFLD, an aetiology in which ElastPQ has not previously been evaluated. Yet, there is a need for more studies with larger samples of patients with a single aetiology of liver disease in order to establish valid thresholds for the fibrosis stages in a specific aetiology. The LSM values produced by TE and ElastPQ are not interchangeable: for values < 10 kPa they are similar, but for values > 10 kPa they become increasingly and significantly different. The diagnostic performance of ElastPQ for a sub-classification of patients with compensated advanced chronic liver disease should, therefore, be further investigated.