Introduction

Invasive fractional flow reserve (FFR) remains the gold standard for evaluating the hemodynamic significance of coronary stenosis, and substantial clinical evidence indicates that FFR-guided percutaneous coronary intervention (PCI) can significantly reduce major adverse cardiac events [1,2,3]. Noninvasive CT-derived fractional flow reserve (FFRCT) has been widely recommended for evaluating the physiological function of coronary arteries without high cost or additional use of vasodilators. There is emerging evidence that FFRCT exhibits high diagnostic performance and significantly improves the identification of ischemic lesions with moderate stenosis, reclassifying the degree of ischemia [4,5,6,7,8,9,10]. With the development of artificial intelligence algorithms, machine learning (ML)–based FFRCT provides incremental value over computational fluid dynamics (CFD)–based FFRCT in terms of shorter calculation time and a user-friendly workstation [8, 9, 11].

Whether the image quality of coronary CT angiography (CCTA) influences the diagnostic performance of ML-based FFRCT remains controversial. Although Leipsic et al showed that image quality had a negative effect on the diagnostic accuracy of FFRCT [12], 10–15% of participants were indeed excluded from FFRCT examination because of poor image quality [6, 11, 13, 14]. In addition, current studies have not comprehensively assessed the effect of image-dependent factors, or of patient-dependent factors, on the diagnostic accuracy of ML-based FFRCT. Because the value of diagnostic imaging in guiding clinical management and improving patient outcomes relies on its ability to reflect the true presence of ischemia-causing lesions, a large-scale study evaluating the factors that affect the diagnostic performance of ML-based FFRCT is urgently required. We therefore performed a nationwide multicenter retrospective study comparing invasive FFR with an ML-based FFRCT (Siemens) to examine the effect of image quality and related parameters on the diagnostic performance of FFRCT.

Materials and methods

Study participants

Our study was approved by all local institutional review boards, and the requirement for written informed consent was waived. Patients with suspected or known stable coronary artery disease (CAD) were included from 10 medical centers across China between May 2015 and January 2019. After 90 days of CCTA, they underwent invasive coronary angiography (ICA) and FFR on at least one vessel. Exclusion criteria were as follows: previous PCI or coronary artery bypass grafting (CABG); suspected or recent acute coronary syndrome; complex congenital heart disease; implanted cardiac devices; left ventricular hypertrophy and valvular disease; severe arrhythmia; serum creatinine level greater than 1.5 mg/dL; contraindications to adenosine administration; allergy to iodinated contrast agent; pregnancy; and unstable clinical symptoms. In addition, four patients were excluded because of unreliable invasive FFR and software failure. The final study population included 437 patients with 570 vessels.

CCTA acquisition protocols

CCTA was performed using CT scanners with ≥ 64 detector rows (Somatom Definition Flash/Force, Siemens Healthineers (n = 395); Brilliance iCT 256, Philips Healthcare (n = 13); and Aquilion ONE, Toshiba (n = 29)). The detailed CT scanning protocols can be found in our previous study [15]. A total of 60–65 mL of low-osmolality iodinated contrast agent (Omnipaque 350 mg I/mL, GE Healthcare; Ultravist 370 mg I/mL, Bayer Schering Pharma) was injected intravenously at a rate of 4–5 mL/s, followed by 20–40 mL of saline at the same rate. (Supplementary Table 1 displays the detailed parameters of the CT scanners and contrast agents in this multicenter study.)

Image quality assessment

Subjective image quality assessment

All data were sent to a dedicated workstation (Syngo.via; Siemens). Image quality was assessed by two independent observers (P.P.X. and F.Z., with 2 and 3 years of CCTA experience, respectively) under double-blind conditions. We used a four-point scale [16, 17]: 4 = excellent, no visible artifact; 3 = good, slight artifact; 2 = acceptable, moderate artifact, but images are evaluable; and 1 = poor, not evaluable. Image quality was also dichotomized as excellent (scores 3 and 4) or inferior (scores 1 and 2). When the two observers disagreed, a consensus was reached through discussion.
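Agreement between two readers who score the same vessels is summarized in this study with kappa analysis. The following is a minimal sketch of an unweighted Cohen's kappa, together with the excellent/inferior dichotomization described above. The reader scores are hypothetical, and whether the study used a weighted or unweighted kappa is not stated, so the unweighted form is an assumption.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same score by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal frequencies per category.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


def dichotomize(score):
    """Map the four-point score to the two groups used in the text."""
    return "excellent" if score >= 3 else "inferior"


# Hypothetical scores for eight vessels (not study data).
reader_1 = [4, 3, 3, 2, 4, 1, 3, 2]
reader_2 = [4, 3, 2, 2, 4, 1, 3, 3]
kappa = cohen_kappa(reader_1, reader_2)
groups = [dichotomize(s) for s in reader_1]
```

Because the scale is ordinal, a weighted kappa that penalizes a 4-versus-1 disagreement more than a 4-versus-3 disagreement would be a reasonable alternative.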

Objective image quality assessment

Objective image quality was evaluated by one cardiac radiologist (P.P.X.). Measurements were made at the proximal segment of the vessels. Regions of interest (ROIs) with a size of 1.5 mm were placed within the coronary artery lumen, avoiding vessel walls and plaques. ROIs were measured three times at each location, and the average CT values in HU were obtained [18,19,20]. According to the degree of vessel enhancement, vessels were divided into four groups [21]: low, ≤ 300 HU; moderately low, > 300 to ≤ 400 HU; moderately high, > 400 to ≤ 500 HU; and high, > 500 HU.
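The grouping rule above can be sketched in a few lines. This is an illustrative example only: the function name and the sample ROI readings are hypothetical, and the three repeated readings are averaged before the thresholds are applied, as the text describes.

```python
from bisect import bisect_left
from statistics import mean

# Upper bounds (inclusive) of the first three groups, in HU, as given in the text.
UPPER_BOUNDS_HU = [300, 400, 500]
GROUP_NAMES = ["low", "moderately low", "moderately high", "high"]


def enhancement_group(roi_values_hu):
    """Average repeated ROI readings and return the enhancement group name."""
    if not roi_values_hu:
        raise ValueError("at least one ROI reading is required")
    average_hu = mean(roi_values_hu)
    # bisect_left keeps a value equal to a bound in the lower group (<= bound).
    return GROUP_NAMES[bisect_left(UPPER_BOUNDS_HU, average_hu)]


# Hypothetical readings for one vessel (not study data).
group = enhancement_group([410, 395, 400])
```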

Heart rate, BMI, and sex

The effects of heart rate at scanning, BMI, and sex on FFRCT evaluation were also taken into consideration. Enrolled patients were divided into three groups according to heart rate at scanning [22]: ≤ 70 bpm; 71–80 bpm; and > 80 bpm. The patients were subdivided into two groups according to BMI: BMI < 25 kg/m² and BMI ≥ 25 kg/m² [17, 18, 23]. Sex was classified as male or female.

Agatston score

The coronary calcium score was assessed by one experienced radiologist with 2 years of clinical experience (M.D.J.) using the Agatston method [24] in nonenhanced calcium score scanning (Syngo.via, Siemens). Coronary artery calcium is defined as any plaque with at least three adjacent pixel densities > 130 HU. Based on the classification of coronary artery Agatston score by the American College of Cardiology Foundation (ACCF) and the American Heart Association (AHA) [25], the vessels were divided into the following four groups: group 1, score 0; group 2, score 1–99; group 3, score 100–399; group 4, score ≥ 400.
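As a small illustration of the four-group stratification, the sketch below assigns a per-vessel Agatston score to its group and tallies the group sizes. The scores are hypothetical, and the handling of non-integer scores between 0 and 1 or between 99 and 100 is an assumption: they are placed using the open bounds (> 0 to < 100, ≥ 100 to < 400) that the Results use for the same groups.

```python
from collections import Counter


def agatston_group(score):
    """Return the group number (1-4) for a per-vessel Agatston score."""
    if score < 0:
        raise ValueError("Agatston score cannot be negative")
    if score == 0:
        return 1
    if score < 100:
        return 2
    if score < 400:
        return 3
    return 4


# Hypothetical per-vessel scores (not study data).
scores = [0, 36.8, 99.5, 100, 191.6, 400, 1250.0]
group_sizes = Counter(agatston_group(s) for s in scores)
```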

FFRCT modeling and measurements

All FFRCT calculations were performed on conventional CCTA datasets using a software prototype (FFRCT, version 3.2.0, Siemens) at the core laboratory. Diastolic images with a slice thickness of 0.75 mm were used for reconstruction. Another radiologist (X.L.Z., with 8 years of clinical experience), who was blinded to the invasive FFR results, measured FFRCT values for all enrolled patients. FFRCT values 2–4 cm distal to the stenosis were compared with invasive FFR [26]. A lesion with FFRCT ≤ 0.80 was considered functionally significant.
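With the 0.80 cutoff applied to both FFRCT and invasive FFR, per-vessel diagnostic performance reduces to a 2 × 2 table. The sketch below shows one way to derive sensitivity, specificity, PPV, NPV, and accuracy from paired values. The function name is hypothetical. The first two value pairs are the two representative cases reported in Fig. 1; the remaining pairs are made up for illustration.

```python
def diagnostic_performance(ffrct_values, ffr_values, threshold=0.80):
    """Per-vessel diagnostic metrics of FFRCT against invasive FFR.

    A value <= threshold counts as positive for both the index test (FFRCT)
    and the reference standard (FFR).
    """
    if len(ffrct_values) != len(ffr_values):
        raise ValueError("each vessel needs one FFRCT and one FFR value")
    tp = fp = tn = fn = 0
    for ffrct, ffr in zip(ffrct_values, ffr_values):
        test_positive = ffrct <= threshold
        reference_positive = ffr <= threshold
        if test_positive and reference_positive:
            tp += 1
        elif test_positive:
            fp += 1
        elif reference_positive:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # None marks a metric that is undefined for this sample.
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# First two pairs: the Fig. 1 cases. Remaining pairs: hypothetical.
ffrct = [0.79, 0.84, 0.90, 0.75, 0.80]
ffr = [0.78, 0.76, 0.88, 0.85, 0.80]
metrics = diagnostic_performance(ffrct, ffr)
```

In this sketch the second Fig. 1 case (FFR 0.76, FFRCT 0.84) counts as a false negative, which is the kind of discordance the image quality analysis is designed to explain.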

ICA and FFR measurements

Both ICA and FFR were performed according to standard practice [7]. FFR measurements were conducted using 6- or 7-F guiding catheters, and intravenous adenosine triphosphate (ATP) was administered through the antecubital vein via a dedicated infusion pump at either 40 μg (right coronary artery [RCA]) or 60 μg (left coronary artery). The FFR pressure wire was positioned 20–40 mm from the distal end of the stenosis, in a segment with a diameter of ≥ 2 mm. Hyperemia was induced (140–180 μg/kg/min) via intravenous or intracoronary ATP. FFR ≤ 0.80 was considered ischemia-specific [4, 27, 28].

Statistical analysis

All analyses were performed using SPSS version 22.0 (SPSS, IL) and MedCalc version 15.8 (MedCalc Software). The normality of quantitative data was assessed using the Kolmogorov–Smirnov test. Continuous variables were described as mean ± SD with 95% confidence interval (CI) if the data were normally distributed; median and interquartile range (IQR) were used for non-normally distributed data. Categorical variables were expressed as numbers and percentages. Kappa analysis was used to examine the consistency between observers. The Pearson chi-square test was used to obtain the p values for accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) between groups. The Wilson score method was used to calculate the 95% CI. Pearson’s correlation analysis and Bland–Altman plots were used to assess agreement. The area under the curve (AUC) was measured for each strategy and compared using the DeLong test. Because FFRCT cannot be measured for cases in group score 1, we arbitrarily classified all cases in group score 1 as positive (inferior group A) and as negative (inferior group B), respectively, to estimate the potential impact of poor image quality on the diagnostic performance of FFRCT rather than excluding them. A p value < 0.05 was considered statistically significant.
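For readers unfamiliar with the Wilson score method, the sketch below computes the interval for a single proportion using only the Python standard library. It is an illustration, not the SPSS or MedCalc implementation used in the study, and the function name is hypothetical. The example proportion is the FFR-defined prevalence of ischemia-specific vessels reported in the Results (216/570).

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, total, confidence=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need total > 0 and 0 <= successes <= total")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / total
    denominator = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denominator
    half_width = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denominator
    # Clamp to [0, 1] to absorb floating-point rounding at the extremes.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Prevalence of ischemia-specific vessels by invasive FFR, from the Results.
lower, upper = wilson_interval(216, 570)
```

Unlike the simple normal-approximation interval, the Wilson interval stays within 0 and 1 and behaves well for proportions near the extremes, which matters for subgroups with high specificity and few vessels.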

Results

Patients

A total of 463 patients were initially included in this study, and 26 patients were excluded due to unexpected inability. The demographic characteristics of the patients are listed in Table 1. Of the 437 enrolled patients, the median (IQR) age was 61 (56–67) years and 71% of the patients were men. The median (IQR) heart rate at scanning was 68 (60–77) beats per minute (bpm) and the median (IQR) BMI was 25.3 (23.0–27.1) kg/m². The Agatston score was measured on 174 vessels in 126 patients, and the median (IQR) Agatston score was 36.8 (0–191.6) on a per-vessel basis.

Table 1 Characteristics of the study population

The effect of subjective image quality on FFRCT diagnostic performance

The prevalence of ischemia-specific vessels in the study population was 37.89% (216/570) as determined by FFR, compared with 34.74% (198/570) by FFRCT. The agreement between the two observers was good (Kappa value = 0.715). For subjective image quality assessment, 159 vessels (27.89%), 312 vessels (54.74%), 92 vessels (16.14%), and 7 vessels (1.23%) were assigned scores of 4, 3, 2, and 1, respectively. The mean (SD) subjective image quality score was 3.09 (0.69). The diagnostic performance of FFRCT for detecting ischemia-specific lesions is summarized in Table 2. On a per-vessel level, the AUC in group score 4 was significantly higher than that in group score 2 (0.93 vs. 0.80, p = 0.024). The AUC did not differ significantly between group scores 4 and 3 (0.93 vs. 0.89, p = 0.222) or between group scores 3 and 2 (0.89 vs. 0.80, p = 0.104). The specificity of group score 4 was higher than that of group score 2 (0.98 vs. 0.86, p = 0.005) without a loss of sensitivity. The excellent quality group had much higher specificity than inferior group A (0.95 vs. 0.79, p < 0.001). Meanwhile, PPV was significantly higher in the excellent group than in inferior group B (0.90 vs. 0.74, p = 0.033) (Table 2). Figure 1 shows two typical cases with scores of 4 and 2.
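The inferior group A and B comparisons rest on a simple bounding rule: vessels with score 1 yield no FFRCT value, so each is counted once as test-positive (group A) and once as test-negative (group B). The sketch below shows how such an imputation changes a 2 × 2 table. All counts are hypothetical apart from the total of seven score-1 vessels, and the function name is an assumption for illustration.

```python
def impute_unmeasurable(counts, ref_positive, ref_negative, assume_positive):
    """Add unmeasurable vessels to a 2 x 2 table under one fixed assumption.

    counts is (tp, fp, tn, fn) for vessels with a measured FFRCT.
    ref_positive / ref_negative are the numbers of unmeasurable vessels
    that are positive / negative by the reference standard (invasive FFR).
    """
    tp, fp, tn, fn = counts
    if assume_positive:
        # Group A: every unmeasurable vessel is treated as FFRCT-positive.
        tp += ref_positive
        fp += ref_negative
    else:
        # Group B: every unmeasurable vessel is treated as FFRCT-negative.
        fn += ref_positive
        tn += ref_negative
    return tp, fp, tn, fn


def specificity(counts):
    """Specificity = TN / (TN + FP) for a (tp, fp, tn, fn) tuple."""
    _, fp, tn, _ = counts
    return tn / (tn + fp)


# Hypothetical score-2 table and a hypothetical 3/4 split of the 7 score-1 vessels.
score_2_counts = (20, 8, 50, 14)
group_a = impute_unmeasurable(score_2_counts, 3, 4, assume_positive=True)
group_b = impute_unmeasurable(score_2_counts, 3, 4, assume_positive=False)
```

Treating unmeasurable vessels as positive can only lower specificity and PPV, while treating them as negative can only lower sensitivity and NPV, so the two groups bracket the effect of unusable images.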

Table 2 Diagnostic performance of FFRCT in different subjective image quality groups (n = 563)
Fig. 1

Representative cases in different image quality score groups. a A 47-year-old male with a mixed plaque in the middle segment of the left anterior descending artery (LAD). Subjective image quality is score 4. Distal FFR and FFRCT of the lesion are 0.78 and 0.79, respectively. b A 63-year-old female with multiple non-calcified plaques in the middle LAD. Subjective image quality is score 2. Distal FFR and FFRCT of the lesion are 0.76 and 0.84, respectively. CPR, curved planar reformation

The effect of objective image quality on FFRCT diagnostic performance

A total of 563 vessels were classified as low enhancement (73/563, 12.97%), moderately low enhancement (184/563, 32.68%), moderately high enhancement (158/563, 28.06%), and high enhancement (148/563, 26.29%). The sensitivities of the low enhancement group (0.94 vs. 0.64, p = 0.002) and the moderately low enhancement group (0.89 vs. 0.64, p = 0.005) were significantly higher than that of the high enhancement group, while the low and moderately low enhancement groups showed similar sensitivity (0.94 vs. 0.89, p = 0.346), as did the moderately high and high enhancement groups (0.79 vs. 0.64, p = 0.185) (all p values were Bonferroni-corrected). However, no significant differences were found in specificity (p = 0.150), PPV (p = 0.851), or NPV (p = 0.495) among the groups (Table 3).

Table 3 Diagnostic performance of FFRCT in different objective image quality groups (n = 563)

The effect of BMI, sex, heart rate, and calcium on FFRCT diagnostic performance

A total of 136 patients, accounting for 178 of 410 vessels (43.41%), had a BMI < 25 kg/m². No significant difference was found in sensitivity (p = 0.775), specificity (p = 0.061), PPV (p = 0.071), or NPV (p = 0.736) between the BMI groups (Table 4). Similarly, no significant difference was observed between the sexes in sensitivity (p = 0.399), specificity (p = 0.514), PPV (p = 0.423), or NPV (p = 0.443) of FFRCT.

Table 4 Diagnostic performance of FFRCT in the different BMI, sex, and heart rate groups

The distribution of vessels by heart rate was as follows: 57.1% (288/504 vessels) with heart rates ≤ 70 bpm, 24.4% (123/504 vessels) with heart rates between 71 and 80 bpm, and 18.5% (93/504 vessels) with heart rates > 80 bpm. Overall, heart rate was significantly associated with the specificity (p = 0.015) and PPV (p = 0.003) of FFRCT measurements. In particular, higher specificity (p = 0.003) and PPV (p = 0.001) were observed in patients with heart rates ≤ 70 bpm than in those with heart rates > 80 bpm, whereas the differences between heart rates ≤ 70 bpm and 71–80 bpm, and between heart rates > 80 bpm and 71–80 bpm, were not statistically significant (Table 4).

Vessels were stratified by Agatston score. Fifty-eight vessels (33.14%), 53 vessels (30.29%), 47 vessels (26.86%), and 17 vessels (9.71%) had scores of 0; > 0 to < 100; ≥ 100 to < 400; and ≥ 400, respectively. There was no significant difference in sensitivity (p = 0.595), specificity (p = 0.086), PPV (p = 0.201), or NPV (p = 0.843) among the calcium groups (Table 5; Supplementary Table 2, Supplementary Figure 1).

Table 5 Diagnostic performance of FFRCT in different calcium groups (n = 175)

Correlation of FFRCT to FFR

Overall, there was a good correlation between FFRCT and FFR (r = 0.674, p < 0.001) in all vessels. By subjective image quality, moderate to good correlations were found for group scores 4 (r = 0.789, p < 0.001), 3 (r = 0.679, p < 0.001), and 2 (r = 0.563, p < 0.001) (Fig. 2). Agreement was assessed using Bland–Altman analysis. The results indicated a slight underestimation by FFRCT compared with FFR (mean difference = − 0.007; 95% limits of agreement [LoA], − 0.19–0.17). In subgroup assessments, the agreement in group score 4 (mean difference = − 0.016; 95% LoA, − 0.14–0.11) was better than that in score 3 (mean difference = − 0.005; 95% LoA, − 0.19–0.18) and score 2 (mean difference = 0.0008; 95% LoA, − 0.26–0.26), as shown by the narrower limits of agreement (Fig. 2).
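The mean difference and 95% limits of agreement quoted above follow the standard Bland–Altman construction: the bias is the mean of the paired differences, and the limits are the bias ± 1.96 standard deviations of those differences. The sketch below illustrates the calculation. The direction of the difference (FFRCT minus FFR) is an assumption inferred from the reported underestimation, and the paired values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(test_values, reference_values):
    """Return (bias, lower limit, upper limit) of the 95% limits of agreement."""
    if len(test_values) != len(reference_values) or len(test_values) < 2:
        raise ValueError("need at least two paired measurements")
    # Assumed direction: index test minus reference (FFRCT - FFR).
    differences = [t - r for t, r in zip(test_values, reference_values)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)  # sample SD of the paired differences
    return bias, bias - spread, bias + spread


# Hypothetical paired values for four vessels (not study data).
ffrct = [0.80, 0.70, 0.90, 0.85]
ffr = [0.82, 0.71, 0.88, 0.86]
bias, lower_loa, upper_loa = bland_altman(ffrct, ffr)
```

A bias near zero with wide limits, as in group score 2, means that individual vessels can still be misclassified around the 0.80 cutoff even when the average error is negligible.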

Fig. 2

Linear correlation plots and Bland–Altman plots of FFRCT with invasive FFR in different image quality groups. Linear correlation plots and Bland–Altman plots of FFRCT with invasive FFR for all data (panel a), group score 4 (panel b), group score 3 (panel c), and group score 2 (panel d). As the image quality score decreases, the correlation between FFRCT and FFR also decreases (group score 4, r = 0.789; group score 3, r = 0.679; group score 2, r = 0.563; p < 0.01). In addition, there is a similar downward trend in consistency (group score 4, mean difference = − 0.016, 95% LoA − 0.14–0.11; group score 3, mean difference = − 0.005, 95% LoA − 0.19–0.18; group score 2, mean difference = 0.0008, 95% LoA − 0.26–0.26)

Discussion

This retrospective China FFRCT study demonstrated that poor CCTA image quality has a negative effect on the diagnostic performance of FFRCT. This study also found that heart rate was associated with the diagnostic performance of FFRCT, while there was no statistical difference in diagnostic performance among the BMI, sex, and calcium groups. Thus, optimizing image quality and controlling patients’ heart rate during scanning might be critical in routine clinical practice of FFRCT.

ML-based FFRCT is a time-saving and cost-saving diagnostic test, which benefits its clinical application. However, on-site analysis at different centers is strongly affected by unavoidable imaging- and patient-related factors. Numerous factors determine image quality, including the patient’s condition and preparation, heart rate, CT scan protocols, and vessel reconstruction methods. Notably, the DeFACTO study conducted by Leipsic et al showed that misalignment and motion-related artifacts limited the clinical application of HeartFlow-based FFRCT [12]. Nevertheless, to our knowledge, no clinical study has systematically assessed the factors that limit the overall image quality available for FFRCT. In our study, the mean subjective score on a four-point scale across all evaluated images was 3.09 ± 0.69, which is sufficient for diagnosis according to the published literature [29]. Owing to the high-quality images included in our study, FFR and FFRCT showed high consistency in detecting ischemic lesions. Admittedly, our assessment of image quality was influenced by subjective factors of the readers and may not be universally valid, but the relationship between image quality and FFRCT diagnostic performance is still of reference value. Furthermore, our study demonstrated that excellent image quality can improve the diagnostic performance of FFRCT and its ability to identify vessel-specific ischemia. Thus, we recommend using high-quality images for routine FFRCT calculation.

Attenuation is the basis of segmentation and quantification in CT imaging [30]. It remains unclear whether different attenuation levels affect the diagnostic performance of FFRCT. Lower peak tube voltage usually improves image contrast [31] and has been used in elderly patients and patients with impaired renal function [32]. Therefore, our study investigated whether objective image quality affects the diagnostic performance of FFRCT. In a previous report, FFRCT diagnostic performance did not vary between 100- and 120-kVp subgroups, whereas the 80-kVp subgroup showed better diagnostic performance [30]. Inconsistent with prior reports, our results showed that the subgroups with CT values of ≤ 400 HU had higher sensitivity and fewer false negatives than the subgroup with > 500 HU. This disparity is most likely attributable to over-enhancement, which may blur the boundary between the lesion and the remaining lumen, making it difficult to locate and discriminate ischemic lesions. Based on our findings, vessel enhancement of 300–400 HU is suggested for routine FFRCT practice.

Beta-blockade and sublingual nitroglycerin are commonly used to adjust the average heart rate of subjects from 55.7 to 62.7 bpm (p < 0.001) [12]. The DeFACTO study found that pretreatment with beta-blockade and sublingual nitroglycerin before CCTA scanning markedly improved the specificity of FFRCT. Thus, beta-blockade and sublingual nitroglycerin pretreatment were recommended for FFRCT testing [33]. Previous studies have shown that high heart rate is negatively correlated with CCTA image quality [22, 34]. The fluid mechanics model of FFRCT relies on accurate image data and a correct vessel wall boundary, while an excessive heart rate leads to insufficient coronary artery dilatation. Nozue et al confirmed that heart rate control during CCTA examination is necessary to obtain the high-quality images required by FFRCT [35]. In keeping with prior studies, our study showed that the specificity and PPV of FFRCT were significantly higher in the subgroup with a heart rate ≤ 70 bpm than in that with a heart rate > 80 bpm.

Our study included patients with a BMI of 17–36 kg/m², among whom 24 patients (7.8%) had a BMI > 30 kg/m². We did not find any adverse effect of BMI in this relatively low range on the diagnostic performance of FFRCT. In contrast, prior studies showed that BMI contributes to a decline in image quality [34]. Thus, we hypothesize that the lack of obese patients might be a limitation of our study. In addition, we believe that automatic adaptation of the tube current may help make FFRCT accuracy independent of obesity. Another consideration is the effect of calcium score. It has been commonly reported that calcium does not influence the diagnostic performance of FFRCT, even in patients with high Agatston scores (> 1000) [36, 37], because the diagnostic performance of FFRCT lies in the complexity of FFRCT modeling and calculation. However, Nozue et al suggested that a high Agatston score results in poor image quality, thus making cases unsuitable for FFRCT [35]. In our study, calcium score and sex had no significant effect on the diagnostic performance of FFRCT.

This study has some limitations. First, this is a retrospective study with confounding factors and information loss in some patients, and its findings need to be verified by prospective studies. Second, the nitrate regimen used in the present study may have resulted in incomplete vasodilation in some patients. Third, the relatively limited experience of the readers who assessed image quality may also have affected the results; however, they had good experience in interpreting CCTA at our medical center, which has a high CCTA workload.

In conclusion, this retrospective study revealed that CCTA image quality and heart rate at scanning influence the diagnostic performance of FFRCT, while there was no statistical difference in diagnostic performance among the BMI, sex, and calcium groups. Our study indicates that CCTA with a subjective image quality score ≥ 3, intracoronary enhancement of 300–400 HU, and a heart rate below 70 bpm at scanning could benefit the accuracy of FFRCT analysis.