Introduction

Single-photon emission computed tomography (SPECT) myocardial perfusion imaging (MPI) is the most utilized non-invasive imaging study for patients with known or suspected coronary artery disease (CAD),1 with an estimated 15 million tests performed annually worldwide.2,3 However, widespread use has led to concerns regarding the relatively high effective radiation dose associated with SPECT-MPI. In 2015, it was reported that a standard rest-stress MPI study carries an effective radiation dose of more than 10 mSv,2,4,5 and contributed to over 10% of cumulative radiation from medical procedures in the United States.2,3 Additional drawbacks include long procedure times and the subjectivity of visual interpretation for perfusion abnormalities.

Currently, most SPECT-MPI protocols utilize two separate image acquisitions, one with the patient under stress (exercise or pharmacological) and another at rest. However, when stress images are normal, rest imaging provides no additional clinical value.5,6,7,8 Data from clinical trials including over 20,000 patients confirm that the prognosis after a normal stress-only MPI is no different from that after a normal rest-stress MPI.8 Critically, eliminating unnecessary rest imaging reduces radiation exposure by up to 75% and laboratory time by up to 70%.5,6,8,9 In a stress-first MPI protocol, stress images are acquired and, when normal, rest scanning can be canceled, resulting in a “stress-only” procedure. However, successful implementation of stress-only MPI requires timely review of stress imaging, which poses a significant problem for busy physicians. For this reason, stress-only MPI is severely underutilized in the United States.2,3,4 There is an unmet need for a simple and efficient method to cancel unnecessary rest imaging without increased work for the interpreting physician.

A machine learning approach that combines clinical and image data has been shown to improve both the diagnostic and prognostic value of SPECT-MPI results.10,11 In this multicenter study we developed a high-sensitivity machine learning score (MLS) for prediction of obstructive CAD using stress-only MPI data and pre-test clinical variables for use in a stress-first MPI protocol. We recently demonstrated that the MLS could identify patients for stress-only imaging at low risk of major adverse cardiovascular events (MACE).12 This novel application of machine learning can improve the utilization of stress-only MPI by automatically determining the need for the additional rest MPI imaging.

Methods

Study Population

The REgistry of Fast Myocardial Perfusion Imaging with NExt generation SPECT (REFINE SPECT) is a multicenter, international collaborative database of both imaging and clinical variables.13 The REFINE SPECT diagnostic registry includes 2079 patients from 9 centers without known CAD who underwent both SPECT-MPI and invasive coronary angiography (ICA) within 180 days between 2009 and 2014. Patients with a history of CAD, determined by a physician at each clinical site, were excluded. History of CAD was defined as either previous myocardial infarction or previous revascularization.14 Pre-defined clinical variables as well as deidentified MPI data and ICA correlations were transferred to a single core laboratory (Cedars-Sinai Medical Center). The study was approved by the institutional review boards at each participating institution and the overall study was approved by the institutional review board at Cedars-Sinai Medical Center.

Clinical Data

Clinical information was obtained at each site from patients’ medical records. Prior to collection, a homogenous classification of each “pre-test” clinical variable was defined to assure standardization among sites. Stress-test results, including patient symptoms and ECG data, were interpreted by the performing physician at each clinical site. Clinical and imaging parameters included in REFINE SPECT are shown in Supplemental Table 1.13 Only stress imaging variables and pre-test clinical information collected as part of a standard SPECT-MPI protocol were considered for automated MLS development.

Image Acquisition

Details of image acquisition and processing are available in the supplement.

Invasive Coronary Angiography

ICA was performed according to standard clinical protocols within 180 days of MPI. The indication for ICA was determined by on-site cardiologists at each participating center. Follow-up for ICA results and revascularization details were collected from medical records (including all clinic visits, cardiology groups, insurance registries, and hospital visits) at each enrolling center. All coronary angiograms were visually interpreted by an on-site cardiologist. A luminal narrowing of ≥ 70% in the left anterior descending artery (LAD), left circumflex artery (LCx), or right coronary artery (RCA) or ≥ 50% stenosis of the left main coronary artery (LMCA) was considered significant. Dates of revascularization (if performed), revascularization type (percutaneous coronary intervention [PCI] or coronary artery bypass grafting [CABG]) and location (which artery) are also available.

Visual Perfusion Analysis

Visual perfusion analysis was performed at each participating center at the time of clinical interpretation, with full details in the supplement.

Machine Learning

The MLS was generated using automated variable selection by information gain ratio ranking and model building with a boosted ensemble algorithm which is then tested using a 10-fold cross-validation procedure. Machine Learning (ML) techniques were implemented in the open-source Waikato Environment for Knowledge Analysis (WEKA) platform 3.8.0 (University of Waikato, Hamilton, New Zealand).12 A LogitBoost method was implemented in the WEKA platform for training and validation.

Variable Selection

A combination of imaging variables, stress-test variables, and pre-test clinical variables were available for selection by the feature selection module (information gain ratio). The information gain ratio measures the effectiveness of each variable for building training models. Only variables that resulted in an information gain ratio > 0 were included for model building.
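As an illustration of the ranking criterion described above, the following is a minimal sketch of the information gain ratio for a single discrete variable. It is not the WEKA implementation used in the study: the function names are hypothetical, and continuous variables are assumed to have been discretized beforehand.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by the feature's own entropy."""
    n = len(labels)
    h_y = entropy(labels)
    h_y_given_x = 0.0
    for value, count in Counter(feature).items():
        subset = [y for x, y in zip(feature, labels) if x == value]
        h_y_given_x += (count / n) * entropy(subset)
    split_info = entropy(feature)
    if split_info == 0.0:
        # A constant variable carries no information about the outcome.
        return 0.0
    return (h_y - h_y_given_x) / split_info


def select_variables(features, labels):
    """Keep only variables with gain ratio > 0 (hypothetical helper; `features` maps name -> column)."""
    return [name for name, column in features.items() if gain_ratio(column, labels) > 0]
```

Under this definition, a variable that separates the two outcome classes perfectly has a gain ratio of 1, and a variable that is independent of the outcome has a gain ratio of 0 and would be excluded.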

Model Building

Predictive classifiers for obstructive CAD were developed by an ensemble (“boosting”) LogitBoost algorithm. The principle behind ML ensemble boosting is to combine the prediction of simple classifiers with weak performances to create a single strong classifier. These weak predictions are then combined in an ensemble (weighted majority voting) to derive an overall classifier, the MLS. More details of the model building process are available in the supplement.
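The boosting principle can be sketched in a few lines. The example below follows the standard two-class LogitBoost formulation (additive logistic regression) with regression stumps as the weak learners. The choice of stumps, the number of rounds, the clipping constant, and all function names are assumptions made for illustration; this is not the WEKA code used in the study.

```python
import math


def fit_stump(X, z, w):
    """Weighted least-squares regression stump: returns (feature, threshold, left, right)."""
    total_w = sum(w)
    mean_z = sum(wi * zi for wi, zi in zip(w, z)) / total_w
    # Fallback: a constant prediction if no feature can be split.
    best = (math.inf, 0, math.inf, mean_z, mean_z)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            wl = zl = wr = zr = 0.0
            for row, zi, wi in zip(X, z, w):
                if row[j] <= t:
                    wl += wi
                    zl += wi * zi
                else:
                    wr += wi
                    zr += wi * zi
            left, right = zl / wl, zr / wr
            err = sum(wi * (zi - (left if row[j] <= t else right)) ** 2
                      for row, zi, wi in zip(X, z, w))
            if err < best[0]:
                best = (err, j, t, left, right)
    return best[1:]


def logitboost(X, y, rounds=20, z_max=4.0):
    """Fit an additive model F(x) as a sum of stumps; y contains 0/1 labels."""
    F = [0.0] * len(X)
    stumps = []
    for _ in range(rounds):
        p = [1.0 / (1.0 + math.exp(-2.0 * f)) for f in F]
        w = [max(pi * (1.0 - pi), 1e-10) for pi in p]
        # Working response, clipped for numerical stability.
        z = [max(-z_max, min(z_max, (yi - pi) / wi)) for yi, pi, wi in zip(y, p, w)]
        j, t, left, right = fit_stump(X, z, w)
        stumps.append((j, t, left, right))
        F = [f + 0.5 * (left if row[j] <= t else right) for f, row in zip(F, X)]
    return stumps


def predict_proba(stumps, row):
    """Combine all weak learners into a single probability-like score."""
    F = sum(0.5 * (left if row[j] <= t else right) for j, t, left, right in stumps)
    return 1.0 / (1.0 + math.exp(-2.0 * F))
```

Each round reweights the cases so that those still predicted poorly have more influence on the next stump, and the final score is obtained from the sum of all stump outputs.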

Cross-Validation

The performance and error estimation of the MLS (variable selection and LogitBoost) were assessed using a stratified 10-fold cross-validation procedure. With this approach, the total patient population is split into 10 equal groups with similar prevalence of obstructive CAD, each of which serves in turn as the testing cohort of a split-sample (train-test) evaluation. In each iteration, an MLS is derived from 90% of the patients (training cohort) and applied to the remaining 10% (testing cohort). This process is repeated 10 times, with different and non-overlapping data used for testing in each iteration, to ensure that the MLS is consistently evaluated on previously unseen data and is never tested on the same population used for model building. These testing results are then concatenated.15,16 This repeated procedure is done to obtain a more accurate and unbiased estimate of the diagnostic performance of the model. The repetition of testing removes the uncertainty associated with which cases are used for training and which are used for testing. The diagnostic accuracy of the MLS in each individual testing cohort is shown in Supplemental Figure 1.
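The splitting step of stratified cross-validation can be illustrated with a short sketch. The function names and the random seed are hypothetical, and the example reproduces only the fold assignment, not the model fitting performed within each fold.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Assign case indices to k folds so that each fold has a similar class prevalence."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    offset = 0
    for y in sorted(by_class):
        idx = by_class[y]
        rng.shuffle(idx)
        # Deal the shuffled cases of this class round-robin across the folds.
        for pos, i in enumerate(idx):
            folds[(offset + pos) % k].append(i)
        offset += len(idx)
    return folds


def cross_validation_splits(labels, k=10, seed=0):
    """Yield (train_indices, test_indices); every case is tested exactly once."""
    for test_idx in stratified_folds(labels, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        yield train_idx, test_idx


# Example with the cohort size reported in this study: 1309 of 2079 patients with obstructive CAD.
labels = [1] * 1309 + [0] * 770
folds = stratified_folds(labels)
```

Variable selection and model building would be repeated inside each training split, so that the held-out 10% never influences the model that is tested on it.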

External Validation of the MLS

To assess the external validity of the proposed MLS, we performed an additional analysis in which 8 of the 9 sites were used to generate a new MLS which was then tested in the external held-out population.

Missing Variables

For all variables with missing values, the default imputation was performed by the WEKA platform 3.8.1. These data were imputed with the population’s mean value for continuous variables, or with a distinct ‘missing’ label for categorical variables. We also generated an MLS without variables with > 40% missing values to evaluate the impact of missing values, with results available in the supplement.
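The imputation rule described above can be sketched as follows. Missing values are represented here by `None`, and the function names are hypothetical; this is an illustration of the rule as stated in the text, not of the WEKA filter itself. Computing the mean on the training cases only is an assumption of this sketch.

```python
def fit_mean(train_values):
    """Mean of the observed (non-missing) values in the training data."""
    observed = [v for v in train_values if v is not None]
    return sum(observed) / len(observed)


def impute_continuous(values, fill):
    """Replace missing continuous values with a previously computed mean."""
    return [fill if v is None else v for v in values]


def impute_categorical(values, label="missing"):
    """Replace missing categorical values with a distinct 'missing' category."""
    return [label if v is None else v for v in values]
```

For example, `impute_continuous([None, 50.0], fit_mean([60.0, None, 70.0]))` fills the missing entry with 65.0.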

Clinical Validation of Stress-Only MPI

The MLS was generated using stress-only MPI data but not all studies in the REFINE SPECT diagnostic registry were performed using a stress-first MPI protocol. Thus, an additional analysis was performed to determine its diagnostic performance for stress-first MPI using a clinical validation cohort. Details of the clinical validation procedure are available in the supplement.

Statistical Analysis

Continuous variables are presented as mean ± standard deviation (SD) or median (± interquartile range) as appropriate. Categorical variables are presented as numbers and proportions. The paired Student’s t test (for normally distributed variables) or the Wilcoxon rank-sum test (for non-parametrically distributed variables) were used to compare differences between groups. Proportions were compared using Chi-Squared, McNemar’s, or Friedman’s Test, when appropriate. The predictive accuracy of the MLS, TPD, reader diagnosis and SSS for obstructive CAD were evaluated with a receiver operating characteristic (ROC) analysis and pairwise comparisons of AUC according to DeLong et al on a continuous scale.17 For the clinical diagnostic algorithm, an ideal MLS would be highly sensitive for identifying obstructive and high-risk CAD from stress-only MPI data without increasing the frequency of unnecessary rest scans. Therefore, three thresholds for MLS were established which corresponded to sensitivities of 90%, 95%, and 99% for obstructive CAD based on ROC curve analysis. The calibration of the MLS was assessed with a calibration plot and the Brier score. A two-tailed P value < 0.05 was considered statistically significant. Statistical calculations were performed using R software version 3.4 and SPSS (IBM SPSS Software for Windows, Version 24.0).
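To make the threshold selection and calibration measures concrete, the sketch below shows one way to pick the highest score cutoff that still reaches a target sensitivity, and how a Brier score is computed. The function names and the toy scores are hypothetical; a score at or above the cutoff is treated as a positive test, matching the decision rule used later in this paper.

```python
import math


def sensitivity(scores, labels, threshold):
    """Fraction of diseased cases (label 1) with score >= threshold."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    return sum(s >= threshold for s in pos) / len(pos)


def specificity(scores, labels, threshold):
    """Fraction of non-diseased cases (label 0) with score < threshold."""
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return sum(s < threshold for s in neg) / len(neg)


def threshold_for_sensitivity(scores, labels, target):
    """Highest cutoff whose sensitivity is at least `target`."""
    pos_desc = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    # Number of diseased cases that must score at or above the cutoff.
    needed = max(1, math.ceil(target * len(pos_desc) - 1e-9))
    return pos_desc[needed - 1]


def brier_score(probs, labels):
    """Mean squared difference between predicted probability and outcome (0 is perfect)."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


# Toy data (hypothetical): 10 diseased and 4 non-diseased cases.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 0.05, 0.15, 0.25, 0.35]
labels = [1] * 10 + [0] * 4
cutoff = threshold_for_sensitivity(scores, labels, 0.90)
```

Lowering the cutoff to gain sensitivity necessarily moves more non-diseased cases above it, which is the sensitivity-specificity trade-off examined in the Results.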

Results

Myocardial Perfusion Imaging

Baseline characteristics are presented in Table 1. MPI was performed using a D-SPECT scanner at four centers (n = 1170 [56%]), GE Discovery NM 530c at three centers (n = 808 [39%]), and GE Discovery NM/CT 570c with attenuation correction at two centers (n = 101 [5%]). A combination of stress and rest imaging was performed in 2005 patients (96.4%), with a summary of image acquisition parameters in Table 2. The frequency of each missing variable is available in Supplemental Table 2.

Table 1 Baseline characteristics
Table 2 SPECT-MPI acquisition

Prior to stress imaging, patients underwent either symptom limited treadmill exercise testing without adjuvant pharmacologic stress (n = 805[39%]) or pharmacologic stress testing (1274[61%]) with radiotracer injection at peak exercise or during maximum hyperemia, respectively. Pharmacologic stress agents included adenosine (n = 185), dobutamine (n = 15), dipyridamole (n = 494), or regadenoson (n = 577). The administered stress activity was lower with stress-first imaging [n = 793(267 ± 95 MBq or 7.24 ± 2.6 mCi)] than when rest imaging was performed first [n = 1215(880 ± 398 MBq or 23.8 ± 10.8 mCi)] (P < .01). All other stress MPI results are shown in Table 3. Automatic contours were evaluated by an experienced technologist in all cases and corrected in 291 (14%) cases. Most changes were small and involved changes in the valve plane position. Two or more corrections were performed in a small proportion of the population (2.1%).

Table 3 Stress and imaging results

Invasive Coronary Angiography

The average time interval between MPI and ICA was 28 days (± 39). On ICA, obstructive CAD was identified in 1309 (63%) patients and involved 2423 coronary artery segments. Obstructive CAD involved the LAD in 73% of patients (n = 958), the LCx in 48% (n = 633), and the RCA in 53% (n = 689). High-risk CAD was present in 539 (26%) patients and was defined as LMCA stenosis (present in 143 patients), proximal LAD stenosis (present in 286 patients), or 3-vessel CAD (present in 223 patients).

Analysis and Machine Learning Prediction

A summary of information gain ratio and variable selection is shown in Figure 1. AUC was calculated for prediction of obstructive CAD using the entire REFINE SPECT diagnostic cohort (n = 2079, prevalence = 63%). For prediction of any obstructive CAD, MLS had a higher AUC than both reader diagnosis and TPD (0.84 vs 0.70 vs 0.78, P < .01) (Figure 2). ROC results for 10 individual hold-out datasets for each of the 10 models are shown in Supplemental Figure 1. From the ROCs, three MLS thresholds were identified corresponding to pre-defined sensitivities of 90%, 95%, and 99%. The diagnostic properties of each MLS threshold were then compared to reader diagnosis, SSS, and TPD (Table 4).

Figure 1

Variable Selection. A combination of imaging (blue bar: 17 selected), stress-test (red bar: 12 selected) and clinical (green bar: 17 selected) variables ranked by their mean [95% CI] information gain ratio within 10-fold cross-validation for prediction of obstructive CAD. *The alternative view (supine for D-SPECT or prone for Discovery)

Figure 2

Receiver operating characteristic curves for prediction of obstructive CAD in all patients (prevalence = 1309/2079 [63%]). *P < 0.01 for AUC comparison by DeLong test. AUC, area under the receiver operating characteristic curve; CAD, coronary artery disease; CI, confidence interval; MLS, machine learning score; TPD, total perfusion deficit

Table 4 Diagnostic testing characteristics for prediction of obstructive CAD from SPECT-MPI (n = 1309/2079[63%])

All MLS thresholds had significantly higher sensitivity than TPD ≥ 1% or SSS > 0 (P < .01 for all). An MLS of 0.38 had a significantly greater specificity than reader diagnosis or SSS but not TPD. An MLS of 0.29, which corresponded to sensitivity of 95% for obstructive CAD, had a sensitivity of > 95% for high-risk CAD. This threshold (0.29) had superior sensitivity for high-risk and obstructive CAD compared to an MLS of 0.38 (P < .01). Although highly sensitive, an MLS threshold of 0.16 (corresponding to 99% sensitivity) did not identify significantly more cases with obstructive CAD than an MLS of 0.29 and had a significantly lower specificity. There was no difference in the negative predictive value (NPV) for the MLS of 0.16 and 0.29 (0.80 vs 0.78, respectively) but both were significantly higher than the NPV for an MLS of 0.38 (0.73, P < .01). An MLS of 0.29 was applied to the clinical algorithm. The MLS showed good calibration (Figure 3) with a Brier score of 0.16.

Figure 3

Calibration plot showing the observed vs. predicted risk of obstructive CAD. The machine learning score (MLS) showed good calibration with a Brier score of 0.16. CAD, coronary artery disease

SSS were available in 1139 patients, 729 (64%) of whom had obstructive CAD. The ROC curves for prediction of obstructive CAD by SSS, TPD, and MLS are shown in Supplemental Figure 2.

Although developed for prediction of obstructive CAD, an MLS threshold of 0.29 also had higher sensitivity for detection of high-risk CAD than reader diagnosis or TPD (96% vs. 89% vs. 90%, P < .01). The frequency of false-negative test results for obstructive and high-risk CAD by reader diagnosis, TPD, and MLS is shown in Figure 4. The frequency of false-negative studies by SSS, TPD, and MLS in patients with available segmental scores is shown in Figure 5. Of the 539 patients with high-risk CAD, the MLS was negative in 22 (4%). Of these, 15 (68%) were also negative by TPD and 14 (64%) by reader diagnosis. When the stress images were reviewed as part of the internal validation cohort, 14 (64%) were considered normal.

Figure 4

Diagnostic safety of automated cancellation of rest MPI in all patients. Frequency of false-negative test results among each method of interpretation for SPECT-MPI in total population (N = 2079). High-risk(HR) coronary artery disease (CAD) refers to 3-vessel CAD, LMCA stenosis ≥ 50%, or proximal LAD stenosis ≥ 70%. Three-vessel CAD refers to stenosis ≥ 70% of the LAD, LCx, and RCA. *P < .01. LCx, left circumflex artery; LMCA, left main coronary artery; pLAD, proximal left anterior descending artery; RCA, right coronary artery; TPD, total perfusion deficit; MLS, machine learning score

Figure 5

Frequency of false-negative results in population with available segmental scores (N = 1139). High-risk (HR) coronary artery disease (CAD) refers to 3-vessel CAD, LMCA ≥ 50% or pLAD ≥ 70%. Three-vessel CAD refers to stenosis ≥ 70% of the LAD, LCx, and RCA. *P < .01. LCx, left circumflex artery; LMCA, left main coronary artery; pLAD, proximal left anterior descending artery; RCA, right coronary artery; SSS, summed stress score; TPD, total perfusion deficit; MLS, machine learning score

The diagnostic properties of the MLS to identify stenosis ≥ 50% (Supplemental Figure 3), for prediction of future revascularization (Supplemental Results), for each center separately (Supplemental Figure 4), per vessel (Supplemental Figure 5), and when applied to different cameras and stress protocols (Supplemental Table 3) are available in the Supplemental Data.

External Validation Results

A new MLS was generated using the data from 8 sites (n = 1723) and then tested on the held-out site (n = 356). The new MLS was then compared to readers and TPD for prediction of obstructive CAD. This MLS had superior diagnostic accuracy compared to both readers and TPD for prediction of obstructive CAD (Supplemental Figure 6). Comparison of the MLS generated with and without variables with > 40% missing values are shown in Supplemental Figure 7.

Automated Stress-First Clinical Diagnostic Algorithm

A clinical diagnostic algorithm for automated stress-first SPECT-MPI is shown in Figure 6. After completion of stress testing, an MLS is automatically calculated from the stress MPI data and pre-test clinical variables. If the score is less than the decision threshold of 0.29, rest imaging can be omitted. If the MLS is greater than or equal to the applied threshold of 0.29, the patient should proceed with rest imaging.
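The decision step of the algorithm reduces to a single comparison, which can be sketched as below. The function name and return strings are hypothetical; the threshold of 0.29 is the value applied in this study.

```python
MLS_THRESHOLD = 0.29  # decision threshold applied in this study


def rest_imaging_decision(mls, threshold=MLS_THRESHOLD):
    """Return the suggested next step for a given machine learning score."""
    if mls >= threshold:
        return "proceed with rest imaging"
    return "rest imaging can be omitted"
```

A score exactly equal to the threshold leads to rest imaging, so that borderline cases are handled conservatively.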

Figure 6

Proposed automated stress-first SPECT-MPI algorithm. MLS, machine learning score

Discussion

In this study, we applied machine learning to create a clinical diagnostic algorithm for stress-first MPI that automatically determines which patients are appropriate candidates for stress-only MPI, prior to physician review. An MLS threshold of 0.29, generated from stress-only MPI data and pre-test clinical variables, demonstrated higher diagnostic sensitivity for both obstructive and high-risk CAD than reader diagnosis and TPD. The MLS computation does not require additional clinical or imaging data collection beyond that of a standard SPECT-MPI protocol and can be readily incorporated into clinical laboratories. Trained on a large population, the MLS provides an adaptive model which can be applied to any patient population, omitting the need for pre-test patient selection. As shown in Figure 6, all clinical and imaging data, including the MLS, are reviewed by the interpreting physician prior to finalizing a clinical report. This application of machine learning allows the physician to review all test results and, if necessary, the patient may return for additional rest imaging on a separate day.

The safety of stress-only MPI is well established,7,8,18 but appropriate patient selection remains unclear. Current guidelines recommend performing stress-only MPI in patients with a “low pre-test probability of CAD”,1 but do not provide a clinical algorithm for patient selection. Prior studies have proposed pre-test risk calculators that require additional clinical data collection prior to selecting the appropriate MPI protocol.9,19 Duvall et al previously published pre-test clinical algorithms to predict which patients will have normal stress MPI results with over 80% accuracy.19 These algorithms differ from the proposed MLS algorithm in that they use pre-test clinical variables to predict a successful stress-first MPI protocol, whereas the MLS uses pre-test clinical data, stress-test results and stress imaging data to produce a post-test clinical risk score. The MLS algorithm for automatic selection of patients for stress-only MPI who were at low risk of MACE was recently published.12 In that study, an MLS was superior for prediction of all-cause mortality (N = 20,414, AUC: MLS 0.8 vs TPD 0.7 vs readers 0.68; P < 0.01) and selected a similar proportion of patients for rest scan cancellation but with significantly lower all-cause mortality compared to clinical selection rules (1.3%, 1.2% and 1.0% vs 0.6%, 0.6% and 0.2%).12 The combination of such robust prognostic prediction and the highly sensitive rule out of obstructive CAD could be a safe and efficient method for reducing unnecessary rest imaging in clinical laboratories.

Stress-only imaging can greatly reduce patient radiation exposure while maintaining the robust prognostic value of a normal SPECT-MPI.2,3 However, in 2015, it was estimated that stress-only MPI accounted for only 3% of studies in North America.3,4 This can be explained in part by the need for timely review of stress imaging, which may be problematic for physicians with other clinical responsibilities. The ML algorithm provides a highly sensitive diagnostic tool that can automatically cancel unnecessary rest imaging and ultimately improve the utilization of stress-first MPI in busy cardiology practices. The primary endpoint of obstructive CAD on ICA was selected as most SPECT-MPI exams are performed clinically to rule out significant coronary artery stenosis. In the REFINE SPECT prognostic database (n = 20,414), the indication for exam was suspicion of obstructive CAD in more than 90% of cases.12

Although trained from the entire population, the diagnostic properties of the MLS were not different when generated from low-dose stress-first images compared to stress images acquired after rest imaging using higher isotope dosages. Thus, when applied to low-dose stress-first images, the MLS recognizes these differences and adjusts the feature selection algorithm to maintain its high diagnostic sensitivity for abnormal test results. Additionally, an MLS threshold of 0.29 was associated with a diagnostic sensitivity of 95% for obstructive CAD from stress-only images, without the use of computed tomographic AC. The high sensitivity of the MLS offers a novel approach to increase the feasibility of stress-only MPI in laboratories without access to attenuation correction and when applied to different stress imaging protocols.

Another limitation of SPECT-MPI is the relatively low sensitivity for detection of stenosis in the LMCA or proximal LAD.20,21 The CORE320 study previously reported a sensitivity of 75% for detection of LMCA stenosis.21 Other studies have estimated the sensitivity of SPECT-MPI was just 54.5% and 55.0% for detection of stenosis in the LMCA and LAD, respectively.20 In our study, the MLS threshold of 0.29 was associated with a diagnostic sensitivity of 95% for all obstructive CAD, including high-risk lesions in both the LMCA and proximal LAD. The MLS was trained using the REFINE SPECT diagnostic cohort, which has an overall prevalence of 63% for obstructive CAD, which is higher than the general population undergoing stress testing.22 We included only patients who had undergone coronary angiography to ensure that coronary anatomy was accurately defined since SPECT-MPI may be interpreted as normal in patients with high-risk CAD, including those with low pre-test likelihood of disease.23 This high-risk population is ideal for training the computer to learn features of high-risk CAD from stress-first MPI data and clinical variables, allowing the computer to recognize abnormal stress MPI data that would have otherwise been considered normal by both TPD and readers. The incorporation of clinical variables into the MLS further improves its sensitivity for detecting obstructive CAD in images that would have been considered normal by readers. As shown in Figure 1, ischemic ECG changes were strong predictors of an elevated MLS, which is consistent with prior studies, especially given the high frequency of pharmacologic stress testing in the study population.24 Similarly, post-stress wall motion abnormalities are associated with the presence of obstructive CAD,25 and wall motion extent had the fourth highest information gain ratio.

A highly sensitive MLS is critical to safely cancel unnecessary rest imaging prior to physician review, but this was achieved at the cost of specificity. The specificity of the MLS reflects the frequency of patients who would have additional rest imaging without obstructive CAD. Importantly, in North America, rest imaging is performed prior to stress testing in over 92% of nuclear cardiology laboratories.2,10 Thus, despite a lower specificity, the MLS threshold of 0.29 would lead to an overall reduction in unnecessary rest imaging and ultimately radiation exposure from SPECT-MPI. It is also notable that the NPV of the proposed MLS was 78% despite a sensitivity of 95%. This is likely due to the high prevalence of obstructive CAD in the study population, who were all referred for ICA. The prevalence of abnormal stress perfusion, defined as stress TPD ≥ 5%, in this population was 64.7% compared to only 25.5% in the overall REFINE registry.26 Assuming the prevalence of obstructive CAD is proportional to the prevalence of abnormal SPECT perfusion, we would estimate 25% of patients having obstructive CAD in the overall REFINE registry. Thus, when applied to the general patient population referred for SPECT-MPI, the NPV would be estimated at 95%.
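The dependence of NPV on prevalence can be checked with a short calculation. The sketch below back-calculates the specificity implied by the figures quoted in the text (sensitivity 95%, NPV 78%, prevalence 63%) and then recomputes the NPV at an assumed prevalence of 25%. The function names are hypothetical, and the back-calculated specificity is an approximation derived from rounded values rather than a reported result.

```python
def npv(sens, spec, prev):
    """Negative predictive value from sensitivity, specificity, and disease prevalence."""
    true_neg = spec * (1.0 - prev)
    false_neg = (1.0 - sens) * prev
    return true_neg / (true_neg + false_neg)


def specificity_from_npv(npv_value, sens, prev):
    """Invert the NPV formula to recover the specificity implied by a reported NPV."""
    false_neg = (1.0 - sens) * prev
    return npv_value * false_neg / ((1.0 - prev) * (1.0 - npv_value))


# Values quoted in the text for the MLS threshold of 0.29 in the study cohort.
spec_est = specificity_from_npv(0.78, 0.95, 0.63)

# Same sensitivity and specificity applied at the assumed general-population prevalence.
npv_general = npv(0.95, spec_est, 0.25)
```

With these inputs the implied specificity is approximately 0.30 and the recomputed NPV is approximately 0.95, which agrees with the estimate given above under the stated proportionality assumption.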

There are limitations to the current study. Automatically generated myocardial contours were evaluated by an experienced technologist and corrected in a small subset of cases. Most adjustments were small, but the MLS algorithm would require technologist verification prior to generating an MLS. The information gain ratios provide a group importance for a given feature in the whole population but not an explanation for a given patient. Although not reported in this study, the individual importance of features can be determined to provide patient-specific explanations as recently demonstrated.11 Another limitation is that the time interval between SPECT-MPI and ICA was relatively long. This interval is similar to prior studies with similar inclusion criteria,27 and significant changes in coronary artery stenosis are unlikely to occur over this timeframe. When generating the MLS, missing data were imputed using either the population mean or a distinct ‘missing’ label, as described above. Despite this, the MLS algorithm was highly sensitive for obstructive and high-risk CAD and reflects clinical settings where not all data are available for MLS computation. Because all patients underwent ICA within 6 months of SPECT-MPI, there is an overall selection bias of the study population. Although this high-risk population was ideal for training a computer to recognize abnormal SPECT-MPI, the true NPV is likely underestimated. Finally, although we studied a large, multicenter, international cohort, due to this selection bias, validation in a prospective, unselected population is needed.

Conclusion

In this study, machine learning was used to develop a diagnostic algorithm for stress-only MPI that automatically determines the need for additional rest images, prior to physician review. An MLS threshold of 0.29 was associated with higher diagnostic sensitivity for obstructive and high-risk CAD than SSS, TPD, or expert readers. The proposed MLS algorithm can improve implementation of stress-first MPI protocols, ultimately reducing cost, scanning times, and radiation dosages.

New Knowledge Gained

Machine learning can detect obstructive and high-risk CAD from stress MPI with high sensitivity and could be applied in an automated clinical algorithm to select patients for stress-only MPI without increased burden to the interpreting cardiologist.