Introduction

Stress single-photon emission computed tomography (SPECT) myocardial perfusion imaging (MPI) is a well-established technique for the diagnosis and risk stratification of patients with suspected or known coronary artery disease (CAD).1 Conventional (C)-SPECT systems use sodium iodide crystals and parallel-hole collimators. Although recent developments have enhanced the image quality and performance of these systems, some technical limits remain, such as poor energy resolution, prolonged imaging time, low spatial resolution, and the need for relatively large doses of radiopharmaceuticals.2 Novel gamma cameras with semiconductor cadmium-zinc-telluride (CZT) detectors have recently been introduced.3 In these systems, CZT semiconductors, which directly convert radiation into electric signals, have replaced the conventional sodium iodide crystals, improving image accuracy and acquisition time.4,5,6 Prior studies demonstrated a good diagnostic performance of these new cameras.3 It has also been shown that CZT-SPECT findings allow adequate risk stratification of patients referred to MPI for suspected or known CAD.7 Machine learning (ML) has been successfully applied in different fields and for many purposes for decades. Its introduction in healthcare can provide useful insights to clinicians: it helps to find hidden patterns in data and supports diagnosis, therapy planning, technology assessment, hospital planning, and management of resources.8,9 In patients referred for MPI, a direct comparison of the performance of C-SPECT and CZT-SPECT cameras has been conducted.10 However, no studies have used ML algorithms to evaluate the diagnostic accuracy of the two cameras. The purpose of the present investigation was to evaluate the performance of C-SPECT and CZT-SPECT in a large cohort of patients with suspected or known CAD and to compare the diagnostic accuracy of the two systems using two different ML algorithms.

Materials and Methods

Patients

Between February 2016 and May 2017, a total of 517 consecutive patients with suspected or known CAD were referred by their physicians for stress MPI for assessment of myocardial ischemia. Patients’ clinical history was collected, and cardiac risk factors were assessed before testing. A patient was considered to have known CAD at the time of MPI based on a history of previously diagnosed atherosclerotic coronary disease, myocardial infarction (chest pain or equivalent symptom complex, positive cardiac biomarkers, or typical electrocardiographic changes), percutaneous coronary intervention, or coronary artery bypass grafting. The review committee of our institution approved the study and all patients gave informed consent (Protocol Number 110/17).

Study Protocol

Patients underwent stress technetium-99m (99mTc) sestamibi-gated MPI by physical exercise or dipyridamole stress test, according to the recommendations of the European Association of Nuclear Medicine and European Society of Cardiology.11 In all patients, beta-blocking medications and calcium antagonists were withheld for 48 hours and long-acting nitrates for 12 hours before testing. For patients undergoing exercise testing, symptom-limited standardized treadmill protocols were performed, with monitoring of heart rate and rhythm, blood pressure, and electrocardiography (ECG). Test endpoints were achievement of 85% of maximal predicted heart rate, horizontal or downsloping ST-segment depression > 2 mm, ST-segment elevation > 1 mm, moderate to severe angina, systolic blood pressure decrease > 20 mm Hg, blood pressure > 230/120 mm Hg, dizziness, or clinically important cardiac arrhythmia. For the dipyridamole stress test, patients were instructed not to consume products containing caffeine for 24 hours before the test. Dipyridamole was infused intravenously at a dose of 0.142 mg·kg−1·minute−1 over 4 minutes. A dose of 100 mg of aminophylline was administered intravenously in the event of chest pain or other symptoms, or after significant ST depression. At peak exercise, or 4 minutes after completion of dipyridamole infusion, a bolus of 370 MBq of 99mTc-sestamibi was intravenously injected. Patients continued the exercise for an additional 60 seconds after tracer injection. For both types of stress, heart rate, blood pressure, and 12-lead ECG data were recorded at rest, at the end of each stress stage, at peak stress, and in the delay phases at rest. The maximal degree of ST-segment change at 80 ms after the J point of the ECG was measured and assessed as horizontal, downsloping, or upsloping. Four hours after the stress test, a bolus of 1110 MBq of 99mTc-sestamibi was intravenously injected at rest and imaging was started 30 minutes after tracer injection.
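The dipyridamole infusion rate quoted above can be converted into a total dose with simple arithmetic. The following minimal Python sketch only illustrates that calculation; the function name and the 70 kg body weight are hypothetical and do not come from the study.

```python
# Infusion parameters taken from the protocol text.
INFUSION_RATE_MG_PER_KG_MIN = 0.142  # mg per kg per minute
INFUSION_MINUTES = 4                 # infusion duration in minutes


def dipyridamole_total_dose_mg(weight_kg):
    """Total infused dose in mg: rate x duration x body weight."""
    return INFUSION_RATE_MG_PER_KG_MIN * INFUSION_MINUTES * weight_kg


if __name__ == "__main__":
    # Hypothetical 70 kg patient, used purely as a worked example.
    print(round(dipyridamole_total_dose_mg(70), 2), "mg")
```

The rate and duration correspond to 0.568 mg per kg of body weight over the whole infusion.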

Imaging

All patients underwent MPI by both C-SPECT and CZT-SPECT systems according to a randomized scheme in a 1:1 ratio that determined which camera was used for the first acquisition. The study protocol is shown in Figure 1.

Figure 1

Study protocol for myocardial perfusion imaging

For C-SPECT, a dual-head rotating gamma camera (E.CAM, Siemens Medical Systems, Hoffman Estates, IL, USA) equipped with a low-energy, high-resolution collimator and connected to a dedicated computer system was used,12 without attenuation or scatter correction. For gating, the cardiac cycle was divided into eight frames. The acquisition time was 20 min for both stress and rest images. Perfusion images were reconstructed by summing the gated data at each projection into an “ungated” raw data file before low-pass pre-filtering and ramp-filtered back projection.

For CZT-SPECT (D-SPECT, Spectrum Dynamics, Caesarea, Israel), recordings were obtained using 9 pixelated CZT crystal detector columns mounted vertically and spanning a 90° geometry.13 Each column consists of 1024 (16 × 64) 5-mm-thick CZT crystal elements (2.46 × 2.46 mm). Each detector is fitted with a square-hole tungsten collimator that is shorter than conventional low-energy, high-resolution collimators, yielding significantly better geometric speed. Data were acquired with the detectors rotating in synchrony and focusing on the heart, and were saved in list mode. The CZT camera uses a proprietary Broadview reconstruction algorithm based on the maximum likelihood expectation maximization algorithm.14,15,16 Images were obtained with the patient in a semirecumbent position. A 10-s pre-scan acquisition was performed to identify the location of the heart and to set the angle limits of scanning for each detector (region-of-interest-centric scanning). Using the myocardial count rate from the pre-scan acquisition, the time per projection was set to target the recording of 1000 myocardial kcounts. Scan duration was < 10 minutes for stress and < 5 minutes for rest imaging. Summed and gated projections were reconstructed with an iterative maximum likelihood expectation maximization algorithm using 7 and 4 iterations, respectively.

Imaging Interpretation

A general quality control of images was performed for both cameras at the end of each study. The raw dataset in cinematic display and the sinogram for C-SPECT, and the sinogram, panogram, planar cine, and positioning ellipse for CZT-SPECT, were reviewed to check for possible imaging artifacts. Image quality was also ensured by verifying proper contour placement. For both SPECT systems, an automated program (Cedars-Sinai, QGS/QPS, version 2015) was used to calculate left ventricular (LV) volumes, ejection fraction (EF), wall motion, wall thickening, and the scores incorporating both the extent and severity of the perfusion defects, using standardized segmentation of 17 myocardial regions.17 The total perfusion defect (TPD) of the stress images was also generated, representing the defect extent and severity and expressed as a percentage of the LV myocardium.18 A post-stress LVEF > 45% and a TPD < 5% were considered normal.19 All studies were visually reviewed and interpreted by two nuclear medicine physicians with > 6 years of experience in nuclear cardiology who were blinded to clinical information. A third nuclear medicine physician resolved any discrepant readings. At reporting clinical interpretation, a patient was defined as normal when a TPD ≥ 5% at semiquantitative analysis was associated with preserved wall thickening within the defect and visual over-read.20

Statistical Analysis and Supervised Machine Learning

Continuous variables were expressed as mean value ± standard deviation and categorical data as frequencies or percentages. The intraclass correlation coefficient (ICC) was used to evaluate intraobserver and interobserver reproducibility of perfusion and functional parameters for both C-SPECT and CZT-SPECT.21 Intraobserver reproducibility was measured by comparing the images interpreted twice by the same experienced reader at an interval of at least 2 months. Interobserver reproducibility was measured by comparing the images interpreted by two different experienced readers. Concordance between the methods was also expressed as exact agreement by the κ statistic using the following grading system: ≤ 0.40 poor agreement, 0.41 to 0.60 moderate agreement, 0.61 to 0.80 good agreement, and > 0.80 excellent agreement. Correlations between C-SPECT and CZT-SPECT were evaluated by linear regression analysis, and the agreement between the two SPECT systems was assessed by Bland-Altman analysis.22 Comparison of continuous data between groups was performed using the two-sided Student’s t test. A P value < .05 was considered statistically significant.
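The agreement measures described above can be sketched in a few lines of Python. This is a minimal illustration and not the software used in the study: the function names are hypothetical, the limits of agreement assume the conventional mean difference ± 1.96 standard deviations (the multiplier is not stated in the text), and κ is computed as unweighted Cohen's κ.

```python
from statistics import mean, stdev


def bland_altman(system_a, system_b):
    """Return (mean difference, lower limit, upper limit) for paired values.

    Assumes the conventional limits of agreement: bias +/- 1.96 SD.
    """
    diffs = [a - b for a, b in zip(system_a, system_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two ratings of the same subjects.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


def grade_kappa(kappa):
    """Apply the grading system given in the text."""
    if kappa <= 0.40:
        return "poor"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "excellent"
```

Reporting the bias together with both limits matters because a high correlation coefficient alone does not show whether two systems give interchangeable values.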

Supervised ML was performed with the Knime analytics platform.23 Although a wide range of algorithms is available, random forest (RF) and k-nearest neighbors (k-NN) were chosen because they are based on different principles. The former is an enhancement of decision trees that exploits principles of ensemble learning, namely randomization and bagging, and, together with the use of cross-validation, is considered one of the ways to reduce the probability of overfitting. The latter is an instance-based algorithm: the principle behind nearest neighbor methods is to find a predefined number of training samples closest in distance to the new point and to predict the label from these. These two algorithms allowed us to test the differences between the two camera systems exploiting different working principles. RF creates a high number of decision trees, resampling data repeatedly and training a new classifier for each sample with the aim of obtaining a higher accuracy.24 On the other hand, k-NN stores all the available cases and classifies a new case based on a similarity measure.25 In k-fold cross-validation, the dataset is divided into k partitions (“folds”) with an equal number of records; a model is then trained and tested k times, each time using k-1 folds for training and the remaining fold for testing.26 The k results are finally averaged to obtain the evaluation metrics. Ten-fold (10-F) cross-validation is reported to be the most effective procedure.26
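The k-fold procedure and the nearest-neighbor rule described above can be illustrated with a short, dependency-free Python sketch. It is not the Knime workflow used in the study: all function names are hypothetical, Euclidean distance and three neighbors are assumed defaults, and the sketch assumes at least k records so that no fold is empty.

```python
import math
import random
from collections import Counter


def k_fold_indices(n_records, k, seed=0):
    """Shuffle record indices and split them into k folds of near-equal size."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def knn_predict(train_x, train_y, point, n_neighbors=3):
    """Label a point by majority vote among its nearest training samples."""
    by_distance = sorted(
        zip(train_x, train_y), key=lambda pair: math.dist(pair[0], point)
    )
    votes = Counter(label for _, label in by_distance[:n_neighbors])
    return votes.most_common(1)[0][0]


def cross_validate(features, labels, k=10, n_neighbors=3):
    """Return the accuracy averaged over k train/test rounds."""
    fold_accuracies = []
    for test_fold in k_fold_indices(len(features), k):
        held_out = set(test_fold)
        # Train on the k-1 folds that are not held out.
        train_idx = [i for i in range(len(features)) if i not in held_out]
        train_x = [features[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        # Test on the single held-out fold.
        correct = sum(
            knn_predict(train_x, train_y, features[i], n_neighbors) == labels[i]
            for i in test_fold
        )
        fold_accuracies.append(correct / len(test_fold))
    return sum(fold_accuracies) / k
```

Because every record is used for testing exactly once, the averaged accuracy is estimated on the whole dataset while each model is still evaluated only on records it did not see during training.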

The end point for the analysis of sensitivity, specificity, and diagnostic accuracy was the classification of normal or abnormal MPI at reporting clinical interpretation. The same procedures were applied to data obtained by C-SPECT and CZT-SPECT. A total of 26 features were included: 12 clinical variables and cardiovascular risk factors (i.e., age, sex, body mass index, diabetes, hypertension, hypercholesterolemia, smoking, angina symptoms, family history of CAD, previous myocardial infarction or acute coronary syndrome, and previous revascularization procedure) and 14 variables obtained from the two SPECT cameras (i.e., scores incorporating both the extent and severity of myocardial perfusion defects and the parameters of global and regional LV function). The evaluation metrics computed were accuracy, error, sensitivity, and specificity. Accuracy expresses the number of correct predictions over the total number of records; error, the number of incorrect predictions over the total number of records; sensitivity, the number of true positives over the sum of true positives and false negatives; and specificity, the number of true negatives over the sum of true negatives and false positives. The diagnostic performance of each algorithm was the mean over the cycles of 10-fold cross-validation. The area under the receiver operating characteristic (ROC) curve is a performance measurement for binary classification problems and represents the degree of discrimination between classes: it tells how well a model is capable of distinguishing between classes. Its value ranges between 0 and 1, with 0.5 representing the worst value, indicating a model that cannot predict better than random guessing. The procedure was repeated twice, with two different algorithms, in order to avoid possible bias introduced by the employed algorithm. The sample size for binary logistic prediction models was also assessed.27
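The metric definitions above translate directly into code. The following Python sketch is illustrative only: the function names are hypothetical, abnormal MPI is assumed to be coded as the positive class (1), and the ROC area is computed through its rank interpretation, that is, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true positives, true negatives, false positives, false negatives."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    return tp, tn, fp, fn


def evaluation_metrics(actual, predicted, positive=1):
    """Accuracy, error, sensitivity, and specificity as defined in the text."""
    tp, tn, fp, fn = confusion_counts(actual, predicted, positive)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "error": (fp + fn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def roc_auc(actual, scores, positive=1):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for a, s in zip(actual, scores) if a == positive]
    neg = [s for a, s in zip(actual, scores) if a != positive]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))
```

A model that assigns the same score to every case has an area of 0.5 under this definition, which matches the random-guessing value mentioned above.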

Results

The clinical characteristics of the patient population are shown in Table 1. The study group comprised 287 (55%) patients with suspected CAD and 230 (45%) with known CAD. Of the overall population, 266 (51%) patients underwent a treadmill exercise stress test and 251 (49%) a dipyridamole stress test. Stress test results are reported in Table 2.

Table 1 Clinical characteristics of 517 patients with suspected or known CAD referred for stress MPI
Table 2 Stress test results according to stress type

Comparison Between C-SPECT and CZT-SPECT

The intra- and interobserver reproducibility of perfusion and functional parameters were excellent (ICC > 0.90) for both C-SPECT and CZT-SPECT. The differences between the measurements obtained by the same examiner and two different examiners were negligible and not significant.

An excellent correlation between summed stress score (SSS), summed rest score (SRS), and TPD measured by C-SPECT and by CZT-SPECT was observed (r = 0.80, 0.91, and 0.89, respectively, all P < .001). However, at Bland-Altman analysis, the mean differences in SSS, SRS, and TPD were 1.3, 0.5, and 0.5%, and the lower and upper limits of agreement between the two SPECT systems were wide (−7.2 to 9.9, −4.7 to 5.8, and −8.8 to 9.9, respectively). Linear regression analysis of LV functional parameters showed an excellent correlation between methods in end-diastolic volume (EDV), end-systolic volume (ESV), and LVEF (r = 0.94, 0.92, and 0.91, respectively, all P < .001). However, at Bland-Altman analysis, the mean differences in EDV, ESV, and LVEF were −13.1, −8.1, and 1.4%, and again the lower and upper limits of agreement between the two SPECT systems were wide (−39.2 to 12.8, −33.4 to 17.2, and −10.1 to 13.1, respectively). Figures 2 and 3 show the correlation and agreement in TPD and LVEF measured by the two SPECT cameras.

Figure 2

Relationship between TPD (%) measured by C-SPECT and by CZT-SPECT (A). Agreement between C-SPECT and CZT-SPECT by Bland-Altman analysis for TPD measurement (B). The differences between the two SPECT systems are plotted against the means of the two systems. The horizontal black line indicates the mean difference between the two systems and the red lines indicate the limits of agreement. The line y = 0 indicates perfect average agreement

Figure 3

Relationship between LVEF (%) measured by C-SPECT and by CZT-SPECT (A). Agreement between C-SPECT and CZT-SPECT by Bland-Altman analysis for LVEF measurement (B). The differences between the two SPECT systems are plotted against the means of the two systems. The horizontal black line indicates the mean difference between the two systems and the red lines indicate the limits of agreement. The line y = 0 indicates perfect average agreement

Agreement between TPD and reporting clinical interpretation for the classification of subjects as normal or abnormal by C-SPECT and CZT-SPECT analysis is depicted in Figure 4. The agreement was moderate for C-SPECT (κ value 0.45) and good for CZT-SPECT (κ value 0.72).

Figure 4

Agreement between TPD and reporting clinical interpretation for the classification of subjects as normal or abnormal by C-SPECT (A) and CZT-SPECT (B). CI, confidence interval

Machine Learning Analysis

The diagnostic performance of the two cameras is summarized in Table 3. The events per variable criterion was > 10. For both algorithms, the hyperparameters were tuned in order to maximize the results, namely the number and depth of decision trees for RF and the type of distance and number of neighbors for k-NN. The diagnostic accuracy of CZT-SPECT was significantly higher than that of C-SPECT for RF (P < .05), but not for k-NN (P = .11). On the other hand, the sensitivity of CZT-SPECT was significantly higher than that of C-SPECT for both RF (P < .001) and k-NN (P < .05). ROC curve areas of RF were > 0.95 for both C-SPECT and CZT-SPECT, while those of k-NN were lower (between 0.70 and 0.80), indicating high and good quality of the applied models, respectively.

Table 3 Machine learning analysis results

Discussion

To our knowledge, this is the first study using ML analysis to compare the diagnostic performance of C-SPECT and CZT-SPECT in a large cohort of patients with suspected or known CAD. Our results show that, although perfusion and functional parameters correlate well between the two systems, the limits of agreement at Bland-Altman analysis were wide, and the diagnostic accuracy and sensitivity of CZT-SPECT were greater than those of C-SPECT.

CAD remains one of the leading causes of mortality in developed countries; however, epidemiologic data show that improved control of cardiac risk factors has resulted in a temporal decrement in the incidence and severity of CAD as well as its related mortality.28 Stress gated MPI with SPECT imaging is a well-established modality for the evaluation of myocardial perfusion and LV function in patients with suspected or known CAD. This technique has clinical advantages including high sensitivity and high negative predictive value.29 Cardiac SPECT imaging has been markedly enhanced by the recent introduction of new CZT cameras,30 characterized by higher energy resolution and count sensitivity, with improved image quality that may overcome the considerable limitations of C-SPECT, such as the prolonged scan acquisition time and the radiation dose.31 Since the introduction of the novel dedicated cardiac CZT camera, several clinical studies have confirmed its high performance.32,33 These studies evaluated its physical characteristics and overall performance in comparison to conventional Anger cameras and tested its potential to reduce scan times and/or tracer activity and its effectiveness in clinical use.5,30,31 The superior performance reported for the CZT system in comparison to conventional Anger cameras is strictly dependent on the intrinsically different technology and acquisition techniques, which are, in clinical conditions, potentially capable of producing significantly different results.34 Moreover, previous studies have demonstrated a high correlation between quantitative measures of perfusion abnormality and functional parameters using novel CZT-SPECT and C-SPECT systems, with a good concordance between the two methods.5,9 Quantitative analysis has been shown to be useful in comparing sequential MPI studies,35,36 in assessing the effectiveness of invasive as well as medical treatments in patients with CAD,37 and in providing evidence of the generalizability of the SPECT MPI results with the new system.

Sharir et al.14 compared conventional gamma camera systems and the new high-speed technology, demonstrating that high-speed stress and rest TPD correlated linearly with C-SPECT TPD (r = 0.95 and 0.97, respectively, P < .0001), with good concordance in the three vascular territories (κ values for the left anterior descending coronary artery, left circumflex coronary artery, and right coronary artery were 0.73, 0.73, and 0.70, respectively; > 90% agreement). A recent multicenter study demonstrated, in a population of 50 patients with suspected or known CAD, a comparable diagnostic performance between a CZT and a conventional camera, with superior image quality and significantly shorter acquisition time.38 In particular, the authors demonstrated a higher diagnostic accuracy of CZT-SPECT than C-SPECT by visual analysis for the detection of significant CAD, using invasive coronary angiography as the gold standard.38

In the present investigation, C-SPECT performance was compared with that of CZT-SPECT in detecting perfusion abnormalities by a stress 99mTc-sestamibi protocol in 517 patients referred for evaluation of CAD at our institution. The results of our study highlight an excellent correlation between C-SPECT and CZT-SPECT in the evaluation of perfusion scores and LV functional parameters. Recent advances in ML research have resulted in algorithms that allow correct combination of several input features in order to improve risk classification.39 In a recent study comparing the diagnostic performance of C-SPECT and CZT-SPECT for the classification of abnormal perfusion in women with suspected CAD, ML analysis showed that CZT-SPECT has higher values of accuracy, sensitivity, and specificity than C-SPECT.40 In the present study, both cameras reached a maximum accuracy greater than 75% through the implementation of RF and k-NN algorithms. Nevertheless, the sensitivity of CZT-SPECT for both algorithms (96% with RF and 60% with k-NN) was greater than that obtained by C-SPECT (88% with RF and 53% with k-NN). The evaluation metrics for k-NN were lower than those obtained by RF, but two aspects need to be considered. First, the difference between the algorithms can be explained by the different operating principles behind them. Indeed, RF has already proved to be one of the best algorithms in the cardiologic field.41,42 Second, the difference does not seem to be related to the performance of the two systems, since the evaluation metrics were lower for both C-SPECT and CZT-SPECT.

The additional potential advantages of using the novel CZT system lead to direct benefits for patients. First, with imaging time reduced by a factor of 5 or greater, CZT imaging requires only 2 min of acquisition time, earning it the name of ultrafast camera. CZT systems also use either a wide cushioned table or a cushioned chair to support the patient during imaging, improving comfort.43 Moreover, CZT detectors have superior energy and intrinsic spatial resolution, as well as a significant decrease in scattered photons in the imaged data. The small detector size permits the development of novel gantry designs that can focus these detectors on the myocardial field of view and be equipped with optimized collimators to improve count sensitivity. Better image quality will result in fewer artifacts and less need for rest imaging, with the benefit of a lower radiation dose for the patient. Finally, a non-negligible benefit of the abovementioned dose-saving strategies arises from the fact that a reduction in radiopharmaceutical dosage enables nuclear MPI to be more cost-effective, a trend that is most welcome against the background of the continuously growing use of nuclear MPI in an increasingly challenging fiscal environment with ever-rising healthcare costs.43

We also investigated the intra- and interobserver reproducibility of CZT-SPECT findings. Reproducibility is an important issue in the conduct of clinical studies, and it should be appropriately calculated and reported for any test introduced in outcome research.44,45 Reproducibility studies provide information on the amount of error inherent in any diagnosis, score, or measurement, and the amount of measurement error determines the validity of the study results or scores. Our results showed a high observer reproducibility of CZT-SPECT results, confirmed by ICC values that were excellent (> 0.90) for all measurements.

New Knowledge Gained

Our study adds new information about the use of CZT-SPECT in patients referred for stress MPI for assessment of myocardial ischemia. In particular, this is the first study using ML analysis to compare the diagnostic performance of C-SPECT and CZT-SPECT in a large cohort of patients with suspected or known CAD. Our results indicate that, although perfusion and functional parameters correlate well between the two systems, the limits of agreement at Bland-Altman analysis were wide, and the diagnostic accuracy of CZT-SPECT is greater than that of C-SPECT.

Conclusion

Myocardial perfusion data and LV functional parameters obtained by CZT-SPECT are highly reproducible and correlate well with those obtained by C-SPECT. The ML approach showed that the accuracy and sensitivity of CZT-SPECT are greater than those of C-SPECT in detecting CAD.