Introduction

Cardiac computed tomography has emerged as a cornerstone in the non-invasive assessment of coronary artery disease (CAD) and now carries a class I recommendation in the current guidelines of the European Society of Cardiology [1]. Non-enhanced coronary artery calcium scoring (CACS) and coronary CT angiography (cCTA) have proven excellent for the evaluation of CAD, combining anatomical and morphological assessment, including detailed plaque quantification and characterization, for cardiovascular risk stratification and therapeutic decision-making, in addition to providing prognostic value for adverse cardiac outcomes [2,3,4]. Technical advances in hardware and software applications have added functional analysis to the previously solely anatomical assessment of CAD. Namely, cCTA-derived fractional flow reserve (CT-FFR) and CT myocardial perfusion (CTP) have been introduced for the detection of hemodynamically significant CAD. Whereas CT-FFR is based on image post-processing using specific software applications, CTP requires the administration of a pharmacological stress agent and an additional acquisition protocol. Both techniques have demonstrated incremental value for the prediction of lesion-specific ischemia and cardiovascular outcome [5,6,7,8]. Furthermore, automated plaque segmentation and quantification using data analytic techniques such as radiomics have attracted interest for improved diagnostic and prognostic accuracy [9].

Artificial intelligence (AI) is a field of mathematics and computer science concerned with performing tasks normally linked to human intelligence (e.g., pattern recognition and perception, translation of information, and decision-making). Most AI systems used in medical imaging are based on machine learning (ML) applications [10, 11]. ML refers to computer-based algorithms that learn effectively from large training datasets and can then be applied to prediction and intelligent decision-making for a specific task on new, unseen data. In the digital era, where medical imagers are faced with an ever-increasing amount of imaging data, ML algorithms can both handle this big data with high computational power and provide streamlined, time-saving workflows through decision-making learned from the input data. Thus, the main goal of ML applications is to assist imagers with their daily tasks by increasing efficiency, reducing errors, and achieving objectives with minimal manual input.

In this review, we outline the contemporary state of ML-based algorithms in cardiac CT, focusing on the clinical validation and implementation of these algorithms in CACS, cCTA, CT-FFR, and CTP for the prediction of lesion-specific ischemia and cardiovascular outcome.

Artificial intelligence and machine learning applications in cardiac CT

Principles and technical background

ML is an analytic method that uses computer algorithms to learn from datasets without these functions being directly programmed [12]. In utilizing the concept that more learning leads to better results, it is analogous to the human learning process. ML features the creation of an autonomous system that can detect and gain knowledge through pattern recognition in large datasets without being explicitly programmed for a specific task. Some key requirements must be met for the appropriate application of ML: the dataset must be detailed and relevant enough for the desired task, the applied ML algorithm must be appropriate for the complexity, amount, and type of data used, and the ML-derived results have to be validated and demonstrate usefulness in clinical practice [12] (Table 1).

Table 1 Main publications of machine learning applications in cardiac CT

In cardiac CT, these ML algorithms can be broadly categorized as supervised or unsupervised. In unsupervised learning, only the input variables are given, without trying to engender a specific outcome: the model learns the structure of the data to identify consistent patterns within the data space, without learning an association with a target outcome. Cluster analysis and principal component analysis comprise such algorithms; K-means clustering and generative adversarial networks can likewise be categorized as unsupervised ML models [12]. K-nearest neighbors, although often discussed alongside these methods, is a simple supervised classifier: every object is compared to its k nearest training examples and assigned to the class most frequent among those neighbors. The method is based on the general assumption of a stronger connection between closer cases.
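To make the k-nearest neighbors principle concrete, the following is a minimal sketch using scikit-learn on synthetic data; the feature names are hypothetical stand-ins for CT-derived measurements, not variables from any study cited here.

```python
# Minimal k-nearest neighbors sketch: each test case is assigned the
# majority class of its k closest training examples.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                   # e.g., plaque volume, attenuation, length
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # synthetic binary label

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print(f"Test accuracy: {knn.score(X_test, y_test):.2f}")
```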

Supervised models can be applied to both classification and regression problems. In supervised learning algorithms, the dataset is analyzed to select individual input features, which are processed and weighted to identify the combination of features that best fits the outcome variable. Some of the most common algorithms are support vector machines, decision trees, random forests, artificial neural networks, and linear regression [12, 13]. Support vector machines map the data into a higher-dimensional space using a so-called kernel function, where the groups are divided by a hyperplane serving as the separation between classes; the name-giving support vectors are the data points closest to this hyperplane, which define the frontier that best segregates the two classes. Decision trees are the simplest ML models and are based on the human decision-making process, in which data are analyzed and recursively split into two groups. This approach is also commonly used in flow diagrams and risk calculation charts, and its split selection is based on information gain rather than prior knowledge. Decision trees can be further combined through ensemble learning to create multi-tree classifiers such as random forests [14], which consist of large numbers of individual decision trees, each casting a class prediction; the model's prediction is the class with the highest number of votes. By combining many learning models, the ensemble enhances the overall result (see the sketch below). In contrast, artificial neural networks are biologically inspired computational networks created to emulate the human brain. Their main use in medical science is the convolutional neural network, a deep neural network consisting of up to hundreds of internal layers, which is currently considered state-of-the-art for outcome prediction using imaging data. An example of a deep learning neural network with multiple layers is shown in Fig. 1.
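As an illustration of the ensemble principle, the sketch below trains a random forest on synthetic data and reports a cross-validated AUC; the features are hypothetical and do not correspond to any study discussed here.

```python
# Minimal random forest sketch: an ensemble of decision trees whose
# majority vote forms the prediction, scored by 10-fold cross-validation.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))                   # 5 synthetic input features
y = (X[:, 0] - X[:, 2] + rng.normal(scale=0.5, size=300) > 0).astype(int)

forest = RandomForestClassifier(n_estimators=200, random_state=1)
auc = cross_val_score(forest, X, y, cv=10, scoring="roc_auc")
print(f"10-fold cross-validated AUC: {auc.mean():.2f}")
```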

Fig. 1

Architecture of a deep learning framework for automated calcium analysis based on a convolutional neural network with a ResNet architecture for image features, as well as a fully connected neural network for spatial coordinate features (Automated CaScoring, Courtesy of Siemens Healthineers)

These artificial layers are connected by synapses, represented as weighted values. Like the human brain cortex, the network uses local receptive fields to analyze structures without connecting every pixel to a single neuron. With further learning, the recognized structures can range from simple lines to the evaluation and even classification of whole images [15]. Finally, linear regression, as part of regression analysis, is a long-established standard method for determining the strength of predictors as well as for trend and effect prognostication.
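The receptive-field idea can be shown in a few lines of plain NumPy: a small weighted kernel slides over the image so that each output value depends only on a local patch. This is a conceptual sketch of a single convolutional layer, not any clinical implementation.

```python
# Conceptual convolution sketch: a 3x3 kernel (the receptive field) slides
# over the image; each output neuron sees only a local patch, and the
# kernel weights act as the shared "synapses".
import numpy as np

def conv2d(image, kernel):
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i:i + kh, j:j + kw]   # local receptive field
            out[i, j] = np.sum(patch * kernel)  # weighted sum of inputs
    return np.maximum(out, 0)                   # ReLU non-linearity

image = np.random.default_rng(2).random((8, 8))
edge_kernel = np.array([[1., 0., -1.],          # responds to vertical edges
                        [1., 0., -1.],
                        [1., 0., -1.]])
print(conv2d(image, edge_kernel).shape)         # (6, 6) feature map
```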

A main challenge of ML is the adequate fitting of the decision boundaries used to describe the actual data distribution. Underfitting, mainly attributable to a small sample size or incorrect assumptions about the data, leads to a model that represents the data poorly. Conversely, a model that is too complex may lead to overfitting, in which an ML algorithm captures not only the underlying data distribution but also noise from individual data points that are not well represented within the boundary. Such a model is accurate for the analyzed dataset but may fail on further unseen data. To obtain an optimally fitted model, a compromise between model complexity and data representation is inevitable [9, 16].
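The trade-off can be demonstrated with two decision trees of different complexity; the sketch below uses synthetic, noisy labels and compares training and test accuracy, which diverge once the model starts to memorize.

```python
# Under- vs. overfitting sketch: a depth-limited tree generalizes, while
# an unconstrained tree memorizes the noisy training labels.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(3)
X = rng.normal(size=(400, 4))
y = (X[:, 0] + rng.normal(scale=1.0, size=400) > 0).astype(int)  # noisy labels
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=3)

for depth in (1, 3, None):        # too simple, balanced, unconstrained
    tree = DecisionTreeClassifier(max_depth=depth, random_state=3).fit(X_tr, y_tr)
    print(f"max_depth={depth}: train={tree.score(X_tr, y_tr):.2f}, "
          f"test={tree.score(X_te, y_te):.2f}")
```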

Current evidence

Machine learning and coronary artery calcium scoring

Coronary artery calcium scoring (CACS) from non-contrast-enhanced images is a well-established tool for screening and risk stratification in patients at low-to-intermediate risk of CAD and a strong predictor of cardiovascular events [17, 18]. CACS is performed by image post-processing using the Agatston score, with manual assessment of the presence and extent of calcium in the coronary arteries. Owing to the increasing number of CACS scans performed, automated segmentation and quantification of calcium using ML has gained interest, and most ML approaches introduced in CACS have focused on automated calcium detection and scoring [19]. Wolterink et al. introduced an ML approach in 2015 for automated calcium identification and quantification, applying intensity-based thresholds on non-contrast-enhanced scans together with additional features such as size, shape, and location in a decision-tree algorithm. They demonstrated a strong agreement (κ = 0.94) between ML-based calcium risk categorization and manual human segmentation, with a good sensitivity of 87% [20]. More recently, the same working group investigated the impact of ML for CACS in 250 datasets using convolutional neural networks and showed a sensitivity of 0.71 and an agreement of 83% in risk classification compared with manual assessment by an expert reader [21]. In a recent study, Martin et al. evaluated a novel deep learning-based research software (Automated CaScoring, Siemens Healthineers) for CACS on non-contrast CT images [22] (Fig. 2). This approach is based on a convolutional neural network and was trained on 2000 annotated datasets. The ML software correctly classified 93.2% of patients (476/511) into the same risk category as the human observers, and the authors demonstrated a strong Dice similarity coefficient of 0.95 for a CACS > 0. Likewise, van Velzen et al. [23] investigated the performance of a deep learning convolutional neural network for automatic CACS across a wide range of CT examination types. They showed that the algorithm yielded excellent intraclass correlation coefficients of 0.79-0.97 for CACS in a large and diverse set of CT examinations. More importantly, CT protocol-specific training of the baseline data resulted in improved risk category assessment for CACS. Yang et al. [24] and Shahzad et al. [25] assessed fully automatic calcium scoring methods on contrast- and non-contrast-enhanced CT images from different CT systems using a support vector machine classifier, reporting sensitivities and specificities of 0.94 and 0.86, respectively. Moreover, recent studies have demonstrated the feasibility of CACS derived from non-contrast-enhanced low-dose chest CT [26]. Consequently, ML methods have been applied to imaging studies routinely used for lung cancer screening. In a dataset of 5973 non-contrast, non-ECG-gated chest CT scans, Cano-Espinosa et al. [27] used a deep convolutional neural network to extract the Agatston score directly from these images. The algorithm yielded a Pearson correlation coefficient of 0.93 and correctly stratified 73% of cases into the corresponding risk category. In summary, the application of ML algorithms for CACS has demonstrated feasibility, with overall good results in diagnostic accuracy and subsequent risk categorization. However, additional efforts are warranted to improve the diagnostic performance of these fully automated software applications in an a priori screening test such as CACS.
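The Agatston score that these algorithms automate is itself a simple computation: pixels at or above 130 HU are grouped into lesions, and each lesion's area is multiplied by a weight derived from its peak attenuation. The following is a minimal per-slice sketch in NumPy/SciPy with hypothetical inputs; real implementations add vessel localization and scanner-specific calibration.

```python
# Minimal Agatston scoring sketch for a single CT slice: threshold at
# 130 HU, label connected lesions, weight each lesion's area (mm^2) by
# its peak attenuation, and sum. Inputs are hypothetical.
import numpy as np
from scipy import ndimage

def density_weight(peak_hu):
    if peak_hu >= 400: return 4
    if peak_hu >= 300: return 3
    if peak_hu >= 200: return 2
    return 1                                    # 130-199 HU

def agatston_slice(hu_slice, pixel_area_mm2):
    labels, n = ndimage.label(hu_slice >= 130)  # connected calcified lesions
    score = 0.0
    for lesion in range(1, n + 1):
        region = hu_slice[labels == lesion]
        area = region.size * pixel_area_mm2
        if area < 1.0:                          # commonly, tiny specks are ignored
            continue
        score += area * density_weight(region.max())
    return score                                # summed over all slices in practice

hu = np.full((4, 4), 50.0)
hu[1:3, 1:3] = [[220., 450.], [180., 310.]]     # one small calcified lesion
print(agatston_slice(hu, pixel_area_mm2=0.25))  # 1.0 mm^2 x weight 4 = 4.0
```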

Fig. 2

Case example of deep learning automated calcium scoring from non-contrast CT correctly identifying calcium in the coronary arteries

Machine learning and coronary CT angiography

cCTA has emerged as a cornerstone in the non-invasive evaluation of patients with suspected CAD, ruling out the presence of atherosclerotic lesions with high diagnostic accuracy (Fig. 3). cCTA therefore serves as a reliable gatekeeper for invasive coronary angiography (ICA) [28, 29]. Moreover, cCTA has demonstrated prognostic value for the prediction of major adverse cardiac events (MACE) beyond traditional cardiovascular risk factors [3]. Recent technical advances in image post-processing and software solutions have added detailed plaque quantification and, more importantly, functional evaluation of coronary artery flow and subsequent blood supply to cCTA. Automated plaque detection and quantification, ranging from the extent of plaque burden to a more detailed characterization of plaques (i.e., non-calcified, calcified, or mixed) and the identification of so-called "high-risk" plaque features, is a coveted target for ML, as current measurements require time-consuming manual human input with wide intra- and interobserver variability [30, 31]. Technologies such as radiomics and texture analysis, which allow extraction of numerous quantitative parameters from CT images to describe the texture and spatial complexity of plaques, have shown promising results [32, 33].

Fig. 3

Coronary CT angiography in a 47-year-old woman without coronary artery disease. Automatically generated curved multiplanar reformations and cinematic 3-D volume rendering display (LAD: left anterior descending artery; LCX: left circumflex artery; RCA: right coronary artery)

A major topic that has been extensively studied is the identification of hemodynamically significant CAD by cCTA, where CT-FFR and CTP have been of major interest for ML solutions. Whereas promising data for ML approaches using CT-FFR and plaque characteristics are available, only a few studies have investigated ML applications in CTP. In general, most studies using ML in cCTA have focused on improving automated segmentation, image pre- and post-processing, computer-aided diagnosis, and outcome prediction of CAD.

Machine learning and plaque quantification

ML algorithms for data extraction from cCTA using automated plaque analysis have recently been investigated. Kang et al. [34] used support vector machine learning for the automated detection of obstructive and non-obstructive CAD on cCTA, reporting a diagnostic accuracy of 94%. Utilizing an automated algorithm (AUTOPLAQ) based on automated coronary segmentation and classification for volumetric plaque quantification, Dey et al. [35] added image features to an ML approach with automated feature selection and information gain ranking for the prognostication of lesion-specific ischemia. The ML score resulted in an AUC of 0.84, which was significantly higher than all individual CT measures (AUCs 0.63-0.76, p < 0.05). Zreik et al. [36] used a multi-task recurrent convolutional neural network for automated cCTA-derived detection and classification of coronary plaques and stenosis severity against visual assessment. Their approach achieved accuracies of 0.77 for the detection and quantification of coronary plaques and 0.80 for the determination of their anatomical significance. Denzinger et al. [37] applied three different ML approaches (a convolutional neural network, a 2D multi-view ensemble approach for texture analysis, and a newly proposed 2.5D approach) for plaque analysis to identify hemodynamically significant CAD. All three methods demonstrated good performance for the detection of lesion-specific ischemia (all AUC 0.90). Studies by Jawaid et al. [38] and Wei et al. [39] used ML for automated centerline validation and extraction of vessel wall and plaque features using a support vector machine or linear classifier, respectively, reporting accuracies of 88% and 90% for automated detection compared with an expert reader. More recently, Kolossvary et al. assessed the potential of radiomics for detecting the napkin-ring sign, which is categorized as a so-called "high-risk" plaque feature. A total of 4440 radiomics parameters were calculated, and the authors showed that the best radiomics parameter performed significantly better than plaque attenuation in detecting the napkin-ring sign (AUC 0.89 vs. 0.75) [40].
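As a flavor of what such radiomics parameters look like, the sketch below computes a handful of classic gray-level co-occurrence matrix (GLCM) texture features with scikit-image on a synthetic, pre-quantized plaque patch; it illustrates the feature family, not the pipeline of any study cited above.

```python
# Radiomics-style texture sketch: GLCM features (contrast, homogeneity,
# energy) computed on a synthetic 16-level quantized image patch.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

rng = np.random.default_rng(4)
patch = rng.integers(0, 16, size=(32, 32), dtype=np.uint8)  # quantized HU patch

glcm = graycomatrix(patch, distances=[1], angles=[0, np.pi / 2],
                    levels=16, symmetric=True, normed=True)
for prop in ("contrast", "homogeneity", "energy"):
    print(prop, graycoprops(glcm, prop).mean())
```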

Machine learning and CT-derived fractional flow reserve

The application of ML to CT-FFR represents the most developed field in cardiac CT, as ML has proven to be a reasonable successor to prior computational fluid dynamics (CFD) algorithms [41, 42]. Until recently, ML-based CT-FFR was provided by only one vendor (Siemens Healthineers, Forchheim, Germany) and remains restricted to research purposes; meanwhile, other vendors offer similar, commercially available algorithms, such as the DeepVessel FFR application by Keya Medical (Beijing, China) [43]. ML-based CT-FFR employs a multi-layer neural network framework that is trained and validated offline against the former CFD approach to learn the manifold relationship between the anatomy of the coronary tree and the corresponding hemodynamic parameters. In the case of the Siemens algorithm, training used a virtual dataset of 12,000 synthetic 3D coronary models [44]. The technical background of ML-based CT-FFR has been described in detail recently [45]. A case example of ML-based CT-FFR is shown in Fig. 4.
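Conceptually, such a framework replaces an expensive online CFD simulation with an offline-trained regressor from anatomic descriptors to an FFR value. The sketch below illustrates that pattern only; it is not any vendor's algorithm, and all features and the label-generating formula are synthetic stand-ins.

```python
# Conceptual sketch (not a vendor algorithm): train a small neural network
# to map synthetic geometric lesion descriptors to an FFR-like value,
# emulating the offline CFD-to-regression substitution.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(5)
n = 2000
stenosis = rng.uniform(0.0, 0.9, n)           # fractional diameter reduction
length = rng.uniform(2.0, 30.0, n)            # lesion length (mm)
diameter = rng.uniform(1.5, 4.5, n)           # reference vessel diameter (mm)
X = np.column_stack([stenosis, length, diameter])
ffr = np.clip(1.0 - 0.6 * stenosis**2 - 0.004 * length / diameter
              + rng.normal(scale=0.02, size=n), 0.3, 1.0)  # toy CFD stand-in

X_tr, X_te, y_tr, y_te = train_test_split(X, ffr, random_state=5)
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                     random_state=5).fit(X_tr, y_tr)
print(f"R^2 on held-out lesions: {model.score(X_te, y_te):.2f}")
```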

Fig. 4

Coronary CT angiography in a 56-year-old man. a Automatically generated curved multiplanar reformations demonstrate > 70% stenosis of the proximal LAD (arrow). b 3-dimensional color-coded mesh shows a CT-FFR value of 0.57, indicating lesion-specific ischemia (arrow). c Invasive coronary angiography confirms obstructive stenosis of the LAD (arrow) with an FFR of 0.56. d, e Color-coded automated plaque evaluation of the causative lesion by the analysis software quantitates the predominantly non-calcified composition of the underlying atheroma

The diagnostic accuracy of ML-based CT-FFR for ischemia prediction has been assessed against cCTA and ICA in one multicenter trial and several single-center studies. The MACHINE registry (Diagnostic Accuracy of a Machine-Learning Approach to Coronary Computed Tomographic Angiography-Based Fractional Flow Reserve: Result from the MACHINE Consortium) was the first investigation to assess ML-based CT-FFR, in 351 patients with 525 vessels from 5 sites in Europe, Asia, and the United States [46]. ML-based CT-FFR showed significantly improved diagnostic accuracy (ML CT-FFR 78% vs. cCTA 58%) and specificity (ML CT-FFR 76% vs. cCTA 38%), with no significant change in sensitivity (ML CT-FFR 81% vs. cCTA 88%). For the detection of lesion-specific ischemia, ML-based CT-FFR yielded a significantly higher AUC of 0.84 compared with cCTA alone (AUC 0.69, p < 0.05). In line with the multicenter MACHINE registry, several single-center studies have validated ML-based CT-FFR. A recent investigation included 85 patients with 104 vessels and demonstrated a per-lesion sensitivity and specificity of 79% and 94%, respectively. ML CT-FFR revealed significantly higher diagnostic performance than cCTA alone on both a per-lesion and a per-patient level, with AUCs of 0.89 vs. 0.61 (p < 0.001) and 0.91 vs. 0.65 (p < 0.001), respectively [47]. Similar results have been reported by von Knebel Doeberitz et al. [48] and Tang et al. [49], with sensitivities and specificities of 82% and 94%, and 85% and 94%, respectively. The impact of coronary calcification and gender on the diagnostic performance of ML-based CT-FFR has also been investigated in two sub-studies of the MACHINE registry. Tesche et al. [50] assessed the influence of calcifications on performance characteristics in patients with a wide range of Agatston scores (range 0 to 3920). They demonstrated high discrimination in vessels with low-to-intermediate Agatston scores (CAC > 0 to < 400), whereas performance was reduced in vessels with high Agatston scores (CAC ≥ 400), with a significant difference in the corresponding AUCs (0.85 vs. 0.71, p = 0.04). Baumann et al. [51] evaluated the accuracy of ML CT-FFR in 398 vessels in men and 127 vessels in women. While the authors found no significant difference in the AUCs between men and women (AUC: 0.83 vs. 0.83, p = 0.89), ML-based CT-FFR was not superior to cCTA alone in women (AUC: 0.83 vs. 0.74, p = 0.12), whereas it performed significantly better in men (AUC: 0.83 vs. 0.76, p = 0.007).

The impact of ML-based CT-FFR on therapeutic decision-making and adverse cardiac outcome was assessed in two small retrospective single-center studies [52, 53]. The therapeutic strategy (optimal medical therapy alone vs. revascularization) was investigated in 74 patients with 220 vessels. ML CT-FFR correctly identified 35 of 36 patients (97%) with hemodynamically significant CAD on invasive assessment and all 38 patients with functionally non-significant CAD. Additionally, the appropriate treatment decision was chosen in 73 of 74 patients (99%) with ML-based CT-FFR, with corresponding accuracy, sensitivity, and specificity of 0.99, 0.97, and 1.0. The prediction of MACE by ML CT-FFR was assessed in 82 patients with a median follow-up of 18.5 months by von Knebel Doeberitz et al. [53]. In a multivariable regression analysis, significant CAD defined by ML CT-FFR ≤ 0.80 was the strongest predictor of adverse cardiac outcome (odds ratio 7.78, p = 0.001).

These promising results support the use of ML in CT-FFR assessment. However, larger studies on the clinical applicability of ML-based CT-FFR for outcome prediction and diagnostic decision-making, and on its impact on healthcare economics, are warranted before ML CT-FFR can be integrated into routine clinical workflows.

Machine learning and CT perfusion

The impact of ML in CTP has been investigated in only a small number of studies. Generally, the application of CTP is limited by its novelty and has been assessed in only a few cardiovascular imaging centers. In a recent study, Xiong et al. [54] applied three different ML approaches (random forest, AdaBoost, naive Bayes) for automated segmentation and delineation of the left ventricle using three myocardial features obtained from resting CTP images: normalized perfusion intensity, transmural perfusion ratio, and myocardial wall thickness. They found that the AdaBoost algorithm performed best against manual segmentation by an expert reader, with an AUC of 0.73. In line with this investigation, Han et al. [55] used a gradient boosting classifier for supervised ML to predict physiologically significant CAD from resting myocardial CTP in a dataset of 252 patients. The authors reported a diagnostic accuracy, sensitivity, and specificity of 68%, 53%, and 85%, respectively, for CTP added to cCTA stenosis > 70% in predicting ischemia. The addition of resting CTP to the evaluation with cCTA increased the AUC from 0.68 to 0.75 for the detection of hemodynamically significant CAD. A case example of CTP post-processing image analysis is shown in Fig. 5.
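The kind of classifier comparison reported in these studies follows a standard pattern: fit several models on the same features and compare cross-validated AUCs. A hedged sketch with scikit-learn and synthetic stand-ins for the perfusion features named above:

```python
# Classifier comparison sketch: random forest, AdaBoost, and naive Bayes
# scored by cross-validated AUC on synthetic perfusion-like features.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(6)
n = 500
perfusion_intensity = rng.normal(size=n)   # normalized perfusion intensity
transmural_ratio = rng.normal(size=n)      # transmural perfusion ratio
wall_thickness = rng.normal(size=n)        # myocardial wall thickness
X = np.column_stack([perfusion_intensity, transmural_ratio, wall_thickness])
y = (transmural_ratio + 0.5 * perfusion_intensity
     + rng.normal(scale=1.0, size=n) < 0).astype(int)  # synthetic ischemia label

for name, clf in [("Random forest", RandomForestClassifier(random_state=6)),
                  ("AdaBoost", AdaBoostClassifier(random_state=6)),
                  ("Naive Bayes", GaussianNB())]:
    auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: AUC = {auc:.2f}")
```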

Fig. 5

Example of CT myocardial perfusion image post-processing analysis showing a transmural perfusion defect in the LAD territory (left image) with decreased myocardial blood flow in the interventricular septum and apex of the left ventricle (right image), as displayed by the shown colormap

Overall, the application of ML in CTP holds potential, especially for automated segmentation of the myocardium and automated detection of perfusion defects in both static and dynamic CTP, to further reduce the human input required for image post-processing. Additionally, the employment of CTP in ML-based risk models for the prediction of ischemia will most likely result in improved diagnostic accuracy, as seen with CT-FFR.

Machine learning and cardiovascular outcome

One of the first major studies applying ML to cCTA for outcome prediction was conducted by Motwani et al. [56], using a large dataset of 10,030 patients from the CONFIRM registry (Coronary CT Angiography Evaluation for Clinical Outcomes: An International Multicenter Registry). A total of 44 cCTA-derived parameters, together with 25 clinical parameters, were applied to an ML algorithm for outcome prediction at 5-year follow-up. The ML approach involved automated feature selection by information gain ranking, followed by model building with LogitBoost and tenfold cross-validation. The ML score combining clinical and CT parameters exhibited a significantly higher AUC of 0.79 for the prediction of death compared with traditional risk factors or CT measures. Likewise, van Rosendael et al. [57] assessed the value of a gradient-boosting tree ensemble ML method using only imaging markers derived from cCTA for outcome prediction in 8844 patients. They demonstrated that a risk score created by an ML algorithm using standard 16-coronary-segment analysis and plaque composition performed significantly better in the prediction of adverse events (AUC 0.77) than all other coronary CT angiography-derived scores (AUCs ranging from 0.69 to 0.70, p < 0.001). In line with these prior investigations, van Assen et al. [58] used commercially available software (vascuCAP, Elucid Bioimaging) for automated model-based extraction of quantitative plaque features and their prognostication of MACE. Adding morphological plaque features to their prognostic model resulted in an accuracy of 77%, with an AUC of 0.94 for MACE prediction. Johnson et al. [59] investigated the performance of four different ML models (logistic regression, K-nearest neighbors, bagged trees, and a classification neural network) using 64 CT-derived vessel features to discriminate between patients with and without subsequent death or cardiovascular events in a study cohort of 6892 patients. The results demonstrated that the discriminatory power of the ML models in identifying patients with MACE was superior to that of all evaluated traditional coronary CT angiography-derived scores (ML models AUC 0.77, CAD-RADS AUC 0.72, CT-Leaman score AUC 0.74, segment plaque burden score AUC 0.76, all p < 0.001).
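The pipeline pattern described for the CONFIRM analysis, information-based feature ranking followed by a boosted classifier under tenfold cross-validation, can be sketched as follows. This is an illustrative analogue only: LogitBoost is not available in scikit-learn, so GradientBoostingClassifier stands in, and the data are synthetic.

```python
# Sketch of the feature-ranking + boosting pattern (not the original
# CONFIRM code): select the most informative features by mutual
# information, then evaluate a boosted model with 10-fold CV.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(7)
X = rng.normal(size=(1000, 40))        # synthetic clinical + CT variables
y = (X[:, 0] + 0.5 * X[:, 5] - 0.5 * X[:, 9]
     + rng.normal(scale=1.5, size=1000) > 0).astype(int)  # synthetic event label

pipeline = make_pipeline(
    SelectKBest(mutual_info_classif, k=15),      # information-based ranking
    GradientBoostingClassifier(random_state=7),  # stand-in for LogitBoost
)
auc = cross_val_score(pipeline, X, y, cv=10, scoring="roc_auc")
print(f"10-fold cross-validated AUC: {auc.mean():.2f}")
```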

Conclusion and future outlook

Cardiac CT has seen remarkable advancements over the last few decades, both in CT systems and hardware and in software applications. Consequently, cardiac CT has emerged as a cornerstone in the non-invasive assessment of CAD with high diagnostic accuracy. The rapidly growing amount of imaging data has triggered interest in automated diagnosis tools, and ML may be the most suitable solution yet for handling these increased data volumes: it can streamline imaging workflows by providing pre-reading, quantification, and extraction of the necessary data from CT images, improve diagnostic accuracy, and ultimately allow outcome prediction from various input variables. Before ML applications can be integrated into clinical workflows, however, several fundamental questions need to be addressed. First, the safety and protection of patient data in computational storage, as well as compliance with regulations, must be guaranteed. Furthermore, access by different healthcare providers to these data has to be governed. So far, no study has shown that the integration of ML leads to better quality of care, improved patient outcomes, or lower healthcare costs. If these challenges can be addressed and the obstacles overcome, ML offers a powerful platform to integrate clinical and imaging data for improved patient care.