Introduction

Myocardial perfusion single photon emission computed tomography (SPECT) is the most commonly used non-invasive stress imaging modality for the screening, diagnosis, characterization, and follow-up of coronary artery disease (CAD) [1,2,3]. Nuclear medicine examinations are generally interpreted under a double-checking system in which both nuclear cardiology trainees and experts read the images. Although nuclear cardiologists have to accurately detect two conditions, stress-induced defect and ischemia, the diagnostic accuracy depends on differences in image interpretation among readers. The interpretive reproducibility between nuclear cardiology trainees and experts has been reported to be moderate to excellent [4, 5]. Stress-induced defects and ischemia are quantified not only for diagnosis but also for the prognostic evaluation of future cardiac events [6].

Artificial intelligence (AI) technology such as artificial neural networks (ANN) and machine learning has recently been applied to myocardial perfusion SPECT to support image interpretation [7,8,9,10,11,12,13,14,15]. AI can provide diagnostic accuracy equal to or higher than that of quantitative indexes such as the summed stress score (SSS), summed rest score (SRS), summed difference score (SDS) and total perfusion deficit. The automatic computer-aided diagnosis software based on ANN technology used in this study was developed by a corporation in Sweden and subsequently introduced to Japan [10]. Furthermore, a Japanese training database of 1001 cases has been applied to myocardial perfusion SPECT studies using Technetium-99m (99mTc)-labeled hexakis-2-methoxyisobutylisonitrile (MIBI) to improve the ANN diagnosis [12]. Patient-based diagnosis with the ANN has shown potential utility for the detection of coronary artery disease; however, the region-based accuracy of the detected regions has not been fully evaluated. The aim of this study was to evaluate the accuracy of the ANN for all candidate regions in comparison with expert interpretation.

Materials and methods

Patient population

A total of 109 candidate regions (33 regions with stress defects and 76 regions with ischemia) were derived from 21 patients (male 81%; mean age 72 ± 9 years) who underwent myocardial perfusion SPECT within 45 days of coronary angiography. Patient information is shown in Table 1. Coronary stenosis of ≥ 50% in the left anterior descending coronary artery (LAD), left circumflex coronary artery (LCX) and right coronary artery (RCA) was present in 14 (66.7%), 13 (61.9%) and 11 (52.4%) patients, respectively. When stenosis of ≥ 50% was defined as significant, 4, 3, 8, and 6 patients had 0-, 1-, 2-, and 3-vessel disease, respectively, and multi-vessel disease was diagnosed in 66.7% of the patients. A clinical history of hypertension, diabetes mellitus, dyslipidemia and chronic kidney disease was observed in 85.7%, 66.7%, 42.9% and 28.6% of patients, respectively. Percutaneous coronary intervention was performed in 33% of the patients. The Ethics Committee of the university hospital approved our study.

Table 1 Demographics of patients who underwent myocardial perfusion SPECT. PCI, percutaneous coronary intervention

Myocardial perfusion SPECT

All patients underwent pharmacological stress with adenosine, administered intravenously at a standard infusion rate of 120 µg/kg/min for 6 min. A weight-adjusted dose of 300–400 MBq of 99mTc-MIBI or 99mTc-tetrofosmin was injected 3 min into the adenosine infusion. After the stress image acquisition, the patients were given a dose of 600–800 MBq to acquire the rest image. Myocardial perfusion SPECT was performed 60–80 min after the injection using a dual-headed SPECT system equipped with low-energy high-resolution collimators. SPECT data were acquired with a 64 × 64 matrix, a zoom factor of 1.45, a pixel size of 6.6 mm, 60 views, a 360° circular orbit (center of rotation 24 cm), and an acquisition time of 35 s/view. Moreover, ECG-gated images were obtained with 16 frames per cardiac cycle. The energy window was set at 140 keV ± 10% for 99mTc.

Short-axis images of gated and non-gated data were reconstructed using the filtered back projection method with Ramp and Butterworth filters. The Butterworth filter had an order of 8, with a cut-off frequency of 0.68 cycles/cm for non-gated data and 0.52 cycles/cm for gated data. Scatter, attenuation, and resolution corrections were not applied.
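To put the two cut-off frequencies in context, the following minimal sketch relates them to the Nyquist frequency implied by the 6.6 mm pixel size (about 0.76 cycles/cm) and evaluates a Butterworth low-pass response. The functional form 1/sqrt(1 + (f/fc)^(2n)) is an assumption: it is one common definition, and the definition used by the reconstruction software in this study is not stated, so vendor implementations may differ. All function names are hypothetical.

```python
import math

PIXEL_CM = 0.66  # 6.6 mm pixel size, taken from the acquisition settings


def nyquist_cycles_per_cm(pixel_cm: float = PIXEL_CM) -> float:
    """Highest spatial frequency representable at the given pixel size."""
    return 1.0 / (2.0 * pixel_cm)


def butterworth_gain(freq: float, cutoff: float, order: int = 8) -> float:
    """Butterworth low-pass amplitude response (assumed form, see lead-in)."""
    if cutoff <= 0:
        raise ValueError("cutoff must be positive")
    return 1.0 / math.sqrt(1.0 + (abs(freq) / cutoff) ** (2 * order))


def ramp_butterworth_gain(freq: float, cutoff: float, order: int = 8) -> float:
    """Ramp filter |f| apodized by the Butterworth window."""
    return abs(freq) * butterworth_gain(freq, cutoff, order)


if __name__ == "__main__":
    nyq = nyquist_cycles_per_cm()
    for label, cutoff in (("non-gated", 0.68), ("gated", 0.52)):
        print(f"{label}: cutoff = {cutoff / nyq:.2f} x Nyquist, "
              f"gain at cutoff = {butterworth_gain(cutoff, cutoff):.3f}")
```

Under this assumed form, the lower cut-off used for gated data suppresses high frequencies more strongly, which is consistent with the lower counts per frame in gated acquisitions.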

Image analysis

The left ventricle (LV) was segmented using a three-dimensional heart-shaped active shape model based on short-axis slice images, as implemented in cardioREPO software (FUJIFILM RI Pharma Co. Ltd., Tokyo, Japan; EXINI Diagnostics AB, Lund, Sweden) [9, 10]. Vertical long-axis and horizontal long-axis images were reoriented from the non-gated short-axis images. The polar map was divided into a 17-segment model to calculate SSS, SRS and SDS using a Japanese normal database [16, 17]. In addition, the ANN value and the percentage of extent were analyzed using the training database [12, 13]. Edge detection on the short-axis images using the active shape model to delineate the ventricular edges was performed to calculate LV function, including end-diastolic volume (EDV), end-systolic volume (ESV) and ejection fraction (EF).
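The relationship between the three summed scores can be sketched as follows. This is a minimal illustration that assumes the conventional 5-point segmental scale (0 = normal to 4 = absent uptake) and the usual definition SDS = SSS − SRS; the text does not state the scale, and the software derives segment scores from the normal database, which is not reproduced here. The segment values in the usage example are hypothetical and were chosen only so that the totals equal 20, 4 and 16.

```python
from typing import Sequence, Tuple

N_SEGMENTS = 17  # 17-segment polar map model


def summed_score(segment_scores: Sequence[int]) -> int:
    """Sum the per-segment defect scores of one polar map."""
    if len(segment_scores) != N_SEGMENTS:
        raise ValueError(f"expected {N_SEGMENTS} segment scores")
    if any(s < 0 or s > 4 for s in segment_scores):
        raise ValueError("segment scores must be between 0 and 4 (assumed scale)")
    return sum(segment_scores)


def summed_scores(stress: Sequence[int], rest: Sequence[int]) -> Tuple[int, int, int]:
    """Return (SSS, SRS, SDS), with SDS defined as SSS - SRS."""
    sss = summed_score(stress)
    srs = summed_score(rest)
    return sss, srs, sss - srs


# Hypothetical segment scores (not patient data)
stress = [0, 0, 0, 0, 0, 0, 2, 2, 1, 1, 0, 0, 3, 3, 2, 2, 4]
rest = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1]
print(summed_scores(stress, rest))
```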

Visual scoring and image interpretation

The gray and color scale images, the stress, rest and difference polar maps, and LV function were displayed on a liquid crystal display monitor. Two nuclear medicine physicians (Observers A and B), with 17 and 2 years of experience in nuclear cardiology, respectively, drew abnormal regions on the polar map and scored the abnormality with a 4-point confidence scale (1: definitely normal, 2: probably normal, 3: probably abnormal, 4: definitely abnormal). Their abnormal regions were matched to the candidate regions detected by the ANN, which comprised 33 regions with stress defects and 76 regions with ischemia. The ANN probability values were classified into the following scores (ANN score): ANN ≤ 0.25 as score 1 (definitely normal), 0.25 < ANN ≤ 0.50 as score 2 (probably normal), 0.50 < ANN ≤ 0.75 as score 3 (probably abnormal) and ANN > 0.75 as score 4 (definitely abnormal). When a region was identified only by the ANN or only by an expert, the probability for the party that did not detect it was defined as 0–25%. In addition, for the final classification into positive and negative, ANN > 0.50 (ANN score of 3 or more) was defined as a positive finding. The gold standard was the final judgment of normal or abnormal reached by consensus of two other independent expert nuclear cardiologists, each with more than 30 years of experience in nuclear cardiology. They used the available clinical information, including electrocardiography, coronary angiography, coronary computed tomography angiography, history of coronary revascularization, comorbidities and typical chest symptoms during the clinical course. The stress defect and ischemia scores of the ANN and Observers A and B were compared with the gold standard.
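The mapping from ANN probability to the 4-point ANN score and to the positive/negative classification can be written compactly. The sketch below only restates the thresholds given above; the function names are hypothetical and are not part of the cardioREPO software.

```python
def ann_score(probability: float) -> int:
    """Map an ANN probability (0 to 1) to the 4-point ANN score."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability <= 0.25:
        return 1  # definitely normal
    if probability <= 0.50:
        return 2  # probably normal
    if probability <= 0.75:
        return 3  # probably abnormal
    return 4      # definitely abnormal


def is_positive(probability: float) -> bool:
    """Positive finding: ANN > 0.50, i.e. ANN score of 3 or 4."""
    return ann_score(probability) >= 3


for p in (0.08, 0.50, 0.93, 0.99):
    print(p, ann_score(p), is_positive(p))
```

Note that the boundaries are inclusive on the upper side of each bin, so a probability of exactly 0.50 is scored 2 and classified as negative.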

Statistical analysis

Statistical analysis was performed using the Statistical Package for the Social Sciences (SPSS) version 23 for Windows (SPSS Inc., Chicago, IL, USA) and JMP version 12 (SAS Institute Inc., Cary, NC, USA). The concordance rate and κ coefficient between the gold standard (expert consensus; CS) and the scores of the ANN and Observers A and B were calculated. Furthermore, receiver operating characteristic (ROC) analysis with the area under the curve (AUC) was used to compare each score with CS. All statistical tests were two-tailed, and a p value of less than 0.05 was considered statistically significant.
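For readers unfamiliar with the two agreement measures, a minimal sketch is given below. It assumes the unweighted Cohen's κ; the text does not state whether a weighted variant was used, and the study itself relied on SPSS and JMP rather than on this code. The scores in the usage example are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def concordance_rate(reference: Sequence[Hashable], rater: Sequence[Hashable]) -> float:
    """Fraction of regions on which the rater agrees with the reference."""
    if len(reference) != len(rater) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(r == s for r, s in zip(reference, rater)) / len(reference)


def cohen_kappa(reference: Sequence[Hashable], rater: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa: (po - pe) / (1 - pe)."""
    n = len(reference)
    po = concordance_rate(reference, rater)  # observed agreement
    ref_counts, rater_counts = Counter(reference), Counter(rater)
    # Agreement expected by chance from the marginal frequencies
    pe = sum(ref_counts[k] * rater_counts[k] for k in ref_counts) / (n * n)
    if pe == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (po - pe) / (1.0 - pe)


# Hypothetical 4-point scores for eight regions (not study data)
reference = [1, 1, 2, 2, 3, 3, 4, 4]
rater = [1, 1, 2, 3, 3, 3, 4, 1]
print(concordance_rate(reference, rater), cohen_kappa(reference, rater))
```

Because κ discounts chance agreement, a rater can have a fairly high concordance rate and still a low κ when one category dominates, which is worth keeping in mind when most candidate regions are normal.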

Results

The concordance rate and κ coefficient for the 109 candidate regions (33 with stress defect and 76 with ischemia) are shown in Table 2. Using the 4-point score, the concordance rates of the ANN and Observers A and B were 81.8%, 57.6% and 66.7% for stress defect and 73.3%, 65.3% and 74.7% for ischemia, respectively. The corresponding κ coefficients were 0.695, 0.399 and 0.513 for stress defect and 0.394, 0.101, and 0.293 for ischemia, which indicated moderate to substantial agreement for stress defect and slight to fair agreement for ischemia. For the positive and negative classification, the concordance rates of the ANN and Observers A and B were 84.8%, 81.8% and 72.7% for stress defect, and 85.3%, 80.0% and 84.0% for ischemia, respectively, which were higher than those for the 4-point score. With the 4-point score, the ANN had the highest κ coefficient for both stress defect and ischemia, and its concordance rate was the highest in every comparison except the 4-point ischemia score, where Observer B was marginally higher (74.7% vs. 73.3%).

Table 2 The concordance rate and κ coefficient for stress defect and ischemia

The ROC AUCs of the ANN and Observers A and B were 0.946, 0.865, and 0.865 for stress defect (33 regions), and 0.884, 0.599, and 0.731 for ischemia (76 regions), respectively (Fig. 1). The ANN had the highest AUC for both stress defect and ischemia. In particular, for ischemia the ANN showed a significantly higher AUC than Observer A (p = 0.005).
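As a reminder of what the AUC measures, the sketch below computes it as the probability that a randomly chosen abnormal region receives a higher score than a randomly chosen normal region, with ties counted as one half (the Mann-Whitney formulation). This is an illustration only: the study used SPSS and JMP, the method used to compare AUCs is not stated, and the labels and scores in the usage example are hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positive and negative scores."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have equal length")
    positives = [s for lab, s in zip(labels, scores) if lab]
    negatives = [s for lab, s in zip(labels, scores) if not lab]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    # A positive scoring above a negative counts 1, a tie counts 0.5
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical gold-standard labels (1 = abnormal) and 4-point scores
labels = [1, 1, 0, 0]
scores = [4, 3, 3, 1]
print(roc_auc(labels, scores))
```

With a 4-point score, ties between abnormal and normal regions are frequent, so the half-credit rule for ties has a visible effect on the result.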

Fig. 1

The AUC with ROC analysis for stress defect (a) and ischemia (b) among ANN and two observers. AUC area under the curve, ROC receiver operating characteristic, ANN artificial neural network, CI confidence interval

The sensitivity, specificity and accuracy for stress defect were 93%, 84% and 88% for the ANN, 86%, 79% and 82% for Observer A, and 79%, 68% and 73% for Observer B, while those for ischemia were 64%, 89% and 85% for the ANN, 36%, 88% and 80% for Observer A, and 55%, 89% and 84% for Observer B, respectively (Fig. 2). For stress defect, both the sensitivity and the specificity of the ANN were higher than those of Observers A and B; for ischemia, the sensitivity of the ANN was higher, while the specificity was similar among the three (88–89%). Consequently, the ANN showed the highest accuracy for both findings in this study.
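The three indices follow directly from the counts of true and false positives and negatives. The sketch below shows the arithmetic; the counts in the usage example are hypothetical, since the confusion matrices are not reported in the text. They were chosen only as one combination that sums to 33 regions and rounds to the values reported for the ANN for stress defect.

```python
from typing import Tuple


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one abnormal and one normal region")
    sensitivity = tp / (tp + fn)   # abnormal regions correctly called positive
    specificity = tn / (tn + fp)   # normal regions correctly called negative
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts for 33 regions (assumed, not taken from the study)
sens, spec, acc = diagnostic_metrics(tp=13, fn=1, tn=16, fp=3)
print(f"sensitivity {sens:.0%}, specificity {spec:.0%}, accuracy {acc:.0%}")
```

When abnormal regions are a small share of all candidates, accuracy is dominated by specificity, so two readers can differ widely in sensitivity and still show similar accuracy.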

Fig. 2

Diagnostic accuracy for stress defect (a) and ischemia (b) among ANN and two observers. For stress defect, the ANN showed a higher specificity than Observers A and B, while for ischemia it showed a higher sensitivity. The overall accuracy of the ANN was the highest in this study. ANN artificial neural network

Figure 3 shows a patient with single-vessel disease of LAD and how the defect scoring and ANN system interpreted the images for the detected ischemic area.

Fig. 3

A 72-year-old man with angina pectoris who had the highest score for stress defect and ischemia from both the ANN and the two observers. Adenosine stress and rest perfusion images are shown. CAG showed 75% stenosis of segment 6 in the LAD and 90% stenosis of the segment 4 posterior descending branch in the RCA. Consensus interpretation was stress-induced ischemia in the anterior to inferior walls. The defect scores of SSS, SRS, and SDS were 20, 4, and 16, respectively. The ANN system detected the abnormality in the anterior to inferior regions in the stress (a) (black contour; extent area 27%, ANN probability 0.99) and ischemia (b) (white contour; area 34%, ANN probability 0.93) images. In addition, a candidate ischemic region in the basal lateral wall was classified as normal (white contour; area 1%, ANN probability 0.08). ANN artificial neural network, LAD left anterior descending coronary artery, LCA left coronary artery, RCA right coronary artery, LCX left circumflex coronary artery, CAG coronary angiography, SSS summed stress score, SRS summed rest score, SDS summed difference score, RAO right anterior oblique, LAO left anterior oblique

Discussion

Although patient-based analysis using a neural network has been reported in myocardial perfusion imaging, few studies have shown its accuracy for detecting region-based abnormalities. The diagnostic accuracy for region-based abnormalities is important for determining the treatment strategy in multi-vessel disease. This study demonstrated that the region-based detection of 109 candidate regions, with either stress defect or ischemia, was in good agreement with the gold standard and comparable to the interpretation by nuclear medicine physicians.

ANN technology has been noted as a tool for computer-aided diagnosis of myocardial perfusion SPECT. The cardioREPO software, collaboratively developed by corporations in Japan and Sweden, provides computer-aided diagnosis using ANN technology and can perform automatic and practical analyses on a universal operating system. In addition, many quantitative indexes, including perfusion distribution scores, LV function and phase analysis, can be calculated. The usefulness of ANN diagnosis has already been evaluated [10,11,12,13,14]. For example, the ANN probability of stress defect and ischemia can evaluate abnormalities equivalently to the SSS and SDS. Moreover, the diagnostic accuracy of the ANN was equal to the consensus of nuclear cardiologists [11]. However, the specificity of the ANN was slightly lower than that of the defect score and the consensus of nuclear cardiologists because the training database was based on Swedish patients. Thereafter, a training database of Japanese patients created by a multicenter study greatly improved the sensitivity, specificity and accuracy of the ANN [12, 13]. Although these previous results demonstrated the diagnostic accuracy as an overall evaluation, the accuracy of ANN diagnosis for individual regions has not been evaluated. Therefore, we investigated the concordance rate and the accuracy of the region-based ANN for 33 regions with stress defects and 76 regions with ischemia. Furthermore, we compared the ANN diagnosis with the image interpretation by the two experienced nuclear medicine physicians.

The ANN value showed equal or higher diagnostic accuracy than each observer. In particular, the sensitivity for ischemia was markedly higher with the ANN than with the observers in this study. The detection of ischemia by myocardial perfusion SPECT plays an important role in the choice between revascularization and medication, which has to be decided appropriately [18, 19]. The relatively low sensitivity for ischemia (36–64%) of the observers and the ANN could be explained as follows. The SDS of the subjects in this study was lower than that in the previous study; namely, the stress-rest myocardial SPECT images in this study contained many regions with equivocal fill-in. Moreover, the two experienced nuclear medicine physicians interpreted the images without the % uptake of the 17 segments on the polar map and without perfusion scores such as SSS, SRS and SDS, to avoid bias from the scoring evaluation. Therefore, the sensitivity of the observers for ischemic regions might have been lower than that in the previous study [6]. The interpretation of myocardial perfusion SPECT commonly judges perfusion as normal or abnormal in consideration of various data, such as the perfusion defect score and LV function from SPECT, electrocardiogram, echocardiography and CAG. Thus, we expect that the sensitivity of ischemia detection will improve when quantitative indices such as the % uptake of the 17 segments on the polar map and the SSS, SRS and SDS are provided, together with information such as CAG, echocardiography, electrocardiogram and patient history. Furthermore, because the ANN had higher sensitivity than the nuclear medicine physicians, ANN information may serve as a diagnostic aid in image interpretation. Therefore, the difference in diagnostic accuracy with and without ANN information needs to be demonstrated in further studies.

There are more than one thousand nuclear medicine institutions in Japan. However, not all of these institutions have expert nuclear medicine physicians. The interpretation of myocardial perfusion SPECT might be complicated for nuclear cardiology trainees because of small-heart effects and various artifacts such as breast and diaphragmatic attenuation, patient motion and high accumulation in the liver, stomach and bowel [10, 20,21,22,23,24,25,26]. Thus, we expect that the ANN diagnosis of cardioREPO software will be a useful tool to support image interpretation by nuclear cardiology trainees and even by experts in nuclear cardiology.

Conclusion

We demonstrated the concordance rate and the accuracy of ANN-based regional diagnosis. When the ANN diagnosis was compared with the interpretation of nuclear medicine physicians, the ANN-based regional diagnosis showed a high concordance rate with the gold standard, and its accuracy was comparable to or even higher than that of the interpretation by nuclear medicine physicians.