Introduction

Hypertrophic cardiomyopathy (HCM) is the most frequent cause of sudden cardiac death (SCD) in young adults [1]. HCM is characterized by heterogeneities in morphological expressions and clinical courses [2]. Additionally, it is the most common autosomal dominant inherited cardiovascular disorder. In up to 60% of HCM patients, more than 1400 mutations in genes encoding sarcomere proteins have been detected [3]. Genetic testing has a limited impact on the treatment strategies for individual patients, but a positive genetic test result could confirm the etiology of the disease and enable cascade genetic screening of their relatives [4]. Moreover, compared with conventional regular clinical screening, the addition of genetic testing is cost-effective [5]. However, based on the current genetic testing studies in HCM populations, the yield of causative mutations was variable (15–70%) [6,7,8]. Thus, selecting patients with a high probability of a positive HCM genotype can maximize the cost-effectiveness of genetic testing.

Although precise correlations have not been established between the phenotype and genotype in HCM patients, patients with mutations were considered to show significant differences in both clinical and imaging features compared to those without mutations, such as a family history of HCM and SCD, and the left ventricular maximal wall thickness (LVMWT) [9, 10]. Previous studies [11] have proposed several scores to predict positive genotypes in HCM based on the linear regression modeling of clinical and imaging variables; however, these scores were not fully validated in clinical practice and could not directly reflect the dynamic and physiological complexity of the myocardium.

Radiomics has demonstrated the potential to match or surpass visual detection in image analysis. Cardiovascular magnetic resonance (CMR) is not only applicable to the clinical diagnosis and management of HCM [4], owing to its high spatial resolution and superior contrast, but also lends itself to radiomic analysis. Radiomic signatures based on magnetic resonance sequences have been associated with particular genetic expressions in glioma, EGFR expression [12], and p53 status [13].

Deep learning (DL) is a subset of machine learning algorithms that can learn features and transform them into class labels for classification. DL has been reported to support risk stratification in a variety of CMR studies [14,15,16,17]. In particular, the nonenhanced cardiac cine sequence is routinely used in clinical practice to visualize cardiac motion and avoids the risk of renal impairment associated with contrast agents. Nonenhanced cardiac cine image–based DL has achieved accurate, fully automated LV segmentation [18] and can efficiently detect myocardial infarction [14]. In this study, we aim to improve HCM mutation-risk prediction by developing a nonenhanced cine image–based DL model to enable cost-effective genetic testing for HCM patients.

Material and methods

Study population

The study population consisted of 198 HCM patients from January 2012 to December 2013 in our referral center. HCM was defined as the presence of increased LV wall thickness (≥ 15 mm or ≥ 13 mm with a family history) that is not solely explained by abnormal loading conditions. All patients underwent CMR examinations and evaluations of genotype status by three established scores (Mayo Clinic score I, Mayo Clinic score II, Toronto score) [11, 19, 20] to predict HCM genotype (Table 1). Patients with HCM phenocopies (n = 4) and poor image quality (n = 6) were excluded. All patients provided informed consent, and the local ethics committee approved this retrospective observational study (2017-923).
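The wall-thickness criterion above can be written as a small check. This is only an illustrative sketch: the function name, argument names, and the boolean flag for loading conditions are hypothetical, and the judgment of whether hypertrophy is "solely explained by abnormal loading conditions" remains a clinical decision that the code merely takes as input.

```python
def meets_hcm_wall_thickness_criterion(lv_wall_thickness_mm,
                                       family_history,
                                       explained_by_loading=False):
    """Return True if the stated HCM wall-thickness criterion is met.

    Thresholds follow the text: >= 15 mm, or >= 13 mm with a family history.
    `explained_by_loading` is a hypothetical flag set by the clinician when
    the hypertrophy is solely explained by abnormal loading conditions.
    """
    if explained_by_loading:
        return False
    threshold_mm = 13.0 if family_history else 15.0
    return lv_wall_thickness_mm >= threshold_mm


if __name__ == "__main__":
    print(meets_hcm_wall_thickness_criterion(14.0, family_history=False))
    print(meets_hcm_wall_thickness_criterion(14.0, family_history=True))
```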

Table 1 Mayo Clinic score I, Mayo Clinic score II, and Toronto score for reflection of HCM genotype

Genetic testing

Genetic testing was performed at two centers: Biotechnology Corporation (n = 147) and the State Key Laboratory of Cardiovascular Disease in Fuwai Hospital (n = 51). Peripheral blood–derived DNA from all HCM patients was used for panel sequencing of the eight sarcomere protein-encoding genes including myosin-binding protein C (MYBPC3), β-myosin heavy chain (MYH7), essential and regulatory myosin light chains (MYL2, MYL3), cardiac troponin T (TNNT2), cardiac troponin I (TNNI3), α-tropomyosin (TPM1), and cardiac actin (ACTC1), as well as 3 HCM phenocopy genes (GLA for Fabry disease, LAMP2 for Danon disease, and PRKAG2 for PRKAG2 cardiomyopathy). All coding exons and their flanking intronic regions were captured using Agilent probes (Agilent Technologies) and analyzed using the Illumina HiSeq X-ten platform (n = 147) and HiSeq 2500 (n = 51) (Illumina Inc.), respectively. Data analysis was performed using a custom bioinformatics workflow [21] (details are provided in Supplement 1).

CMR cine data

CMR images of all the HCM patients were obtained using a clinical 1.5-T MR scanner (MAGNETOM Avanto, Siemens Healthcare) with electrocardiographic and respiratory gating and a unified protocol. A balanced steady-state free precession cine sequence with a breath-hold technique was performed in the LV two-chamber, three-chamber, four-chamber, and short-axis orientations. The typical imaging parameters were repetition time (TR) = 2.8–3.0 ms, echo time (TE) = 1.1–1.5 ms, flip angle = 60–70°, temporal resolution = 30–55 ms, field of view (FOV) = 360 × 315 mm², matrix = 192 × 162, slice thickness = 8 mm, and slice gap = 2 mm. The LVMWT and left ventricular posterior wall thickness (LVPWT) were measured at the thickest segment on the short-axis cine image at LV end-diastole, and the LVMWT/LVPWT ratio was calculated. The septal morphological subtypes (sigmoid, reverse curve, apical, and neutral) were assessed from the three-chamber long-axis views [22].

Cine image segmentation

Cine images of a four-chamber view were retrieved from the picture archiving and communication system and loaded onto the free ITK-SNAP 3.6.0 software. The end-systolic and end-diastolic phases of the four-chamber-view cine images in the DICOM format were selected for further segmentation by a radiologist (G.Y. with 10 years of experience in CMR). Regions of interest (ROIs) were manually drawn, encircling the LV myocardium on end-systolic and end-diastolic phases. Furthermore, the propagation of the ROIs to the entire cardiac cycle was completed using the DL model, as detailed in the next subsection.

Deep learning model

In this study, the establishment of the DL network was divided into three stages, as shown in Fig. 1. The original cine images were used as the input to the network, and the probability of a gene-positive expression was obtained as the output. In the first stage, the ROIs of the end-systolic and end-diastolic LV myocardium were propagated to all phases throughout the cardiac cycle using the DL image segmentation model DeeplabV3+ [23]. In the second stage, to form a three-channel input image, the original cine image (occupying two channels) and the segmentation (occupying one channel) were combined to capture both LV myocardial texture and morphological information. The three-channel image was then fed into the frozen-weight InceptionResnetV2 [24] model, which was pre-trained on the ImageNet dataset [25], to extract image features (n = 1536) in the ROI regions (Fig. 2). Moreover, to reduce redundant information, these features underwent supervised dimensionality reduction (from 1536 to 32) through the last two fully connected layers in this model. In the third stage, predictions were performed by analyzing the image features obtained from the entire cardiac cycle using a long short-term memory (LSTM) network. The image features of the 25 cine images (25 × 32) were loaded into the LSTM network [26]; the network was able to learn the subtle changes between different frames of a time-dependent series so that a regression analysis could determine the HCM mutation probability. Figure 3 presents the visualization of the DL model from two representative cases.
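To make the data flow of the second and third stages concrete, the following sketch assembles one three-channel frame and records the tensor shapes described above. It uses plain Python lists and is not the authors' Keras implementation. The function names are hypothetical, and the channel order (cine, cine, mask) is an assumption, since the text states only that the cine image occupies two channels and the segmentation one.

```python
def make_three_channel_frame(cine, mask):
    """Combine a grayscale cine frame with its myocardial mask into H x W x 3.

    Assumed channel order: [cine, cine, mask]. `cine` and `mask` are
    equally sized 2D lists (rows of pixel values).
    """
    same_rows = len(cine) == len(mask)
    if not same_rows or any(len(r) != len(m) for r, m in zip(cine, mask)):
        raise ValueError("cine frame and mask must have the same shape")
    return [
        [[pixel, pixel, float(label)] for pixel, label in zip(row, mask_row)]
        for row, mask_row in zip(cine, mask)
    ]


def pipeline_shapes(n_frames=25, n_backbone_features=1536, n_reduced=32):
    """Shapes per patient, following the numbers given in the text."""
    return {
        "backbone_features": (n_frames, n_backbone_features),  # InceptionResnetV2 output
        "lstm_input": (n_frames, n_reduced),  # after the two fully connected layers
        "output": (1,),  # probability of a positive genotype
    }


if __name__ == "__main__":
    frame = make_three_channel_frame([[0.1, 0.2], [0.3, 0.4]], [[0, 1], [1, 0]])
    print(frame[0][1])
    print(pipeline_shapes())
```

Duplicating the cine frame is one simple way to satisfy the three-channel input expected by an ImageNet-pre-trained backbone while reserving a channel for the segmentation.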

Fig. 1

Full process of the deep learning network. First, the ROIs encircling the LV myocardium on the end-diastolic and end-systolic phases of the original images were propagated to the entire cardiac cycle using the DeeplabV3+ model. Second, the ROIs were cropped and combined with the original cine images; image features were subsequently extracted using a classification model (InceptionResnetV2). Finally, the image features of all frames were fed into an LSTM model to evaluate the motion state of each patient and to obtain the probability of an HCM gene-positive expression. Abbreviations: HCM, hypertrophic cardiomyopathy; LV, left ventricular; ROIs, regions of interest; LSTM, long short-term memory

Fig. 2

Flow chart for image feature extraction. In this part, single image features were extracted from the pre-trained InceptionResnetV2 model and underwent targeted dimensionality reduction (from 1536 to 32) by the last two fully connected layers

Fig. 3

Four-chamber view of cine images (odd frames) and visualization of model features from a genotype-positive (upper two rows) and a genotype-negative (lower two rows) HCM patient, respectively. The color differences represented the intensity of each pixel in the entire model, which may reflect the probability of HCM gene expression in each patient. Red indicated a higher probability (p = 0.87) and blue indicated a lower probability (p = 0.28). Abbreviations: HCM, hypertrophic cardiomyopathy

The data set was divided into a training set (n = 147) and a test set (n = 51) based on different genetic testing institutions and CMR scan dates (2012 and 2013, respectively) (Table 2). To evaluate the stability of the DL model, we repeated the experiment using random 10-fold cross-validation on the training set. All three models in the network were trained with an Adam optimizer (learning rate = 0.0001) [27], using cross-entropy as the loss function. Each model was trained for twenty epochs, and the weights that minimized the training-set loss were retained. The DL network was implemented using the Keras [28] DL toolkit in Python, version 3.6. Model training was performed on graphics processing units (TITAN XP, NVIDIA).
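The split and the cross-validation scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the record field `scan_year`, the function names, and the random seed are hypothetical, and the actual fold assignment used in the study is not reported.

```python
import random


def split_by_scan_year(records, train_year=2012, test_year=2013):
    """Temporal split: 2012 scans form the training set, 2013 scans the test set."""
    train = [r for r in records if r["scan_year"] == train_year]
    test = [r for r in records if r["scan_year"] == test_year]
    return train, test


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, validation_indices) for random k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding assigns every shuffled index to exactly one fold.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


if __name__ == "__main__":
    for fold, (train_idx, val_idx) in enumerate(k_fold_indices(147), start=1):
        print(fold, len(train_idx), len(val_idx))
```

With 147 training patients and k = 10, each validation fold holds 14 or 15 patients, which is one reason fold-to-fold AUC values are reported with a standard deviation.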

Table 2 Baseline data in training and test set

Statistical analysis

Data were presented as mean ± standard deviation, median (25th–75th percentiles), or n (%), as appropriate. The Kolmogorov–Smirnov test was used to assess the normal distribution of continuous data. The Student t test was used to compare continuous variables between the two groups. Non-normally distributed variables were compared using the Mann–Whitney U test. Frequencies were compared using the chi-squared test or Fisher’s exact test, as appropriate. The predictive performance of the Mayo Clinic score I, Mayo Clinic score II, Toronto score, and DL model for identifying positive HCM genotypes was evaluated through receiver-operating characteristic (ROC) analysis. Logistic regression was performed using the scikit-learn package in Python. All statistical calculations were performed using the R software version 3.4.
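As a reminder of what the ROC analysis measures, the area under the ROC curve equals the probability that a randomly chosen genotype-positive patient receives a higher score than a randomly chosen genotype-negative patient, with ties counted as one half. The sketch below computes this directly in plain Python; it is an illustration only (the study's statistics were computed in R), and the function name and toy data are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    `labels` are 1 (genotype-positive) or 0 (genotype-negative);
    `scores` are the corresponding predicted scores or probabilities.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy example, not study data.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```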

Results

Clinical and CMR parameters

The study population consisted of 198 HCM patients (48% men; aged 47 ± 13 years). Compared with genotype-negative patients, genotype-positive patients included a significantly higher proportion of female patients (37.76% vs 24.00%, p = 0.036), patients aged < 45 years at the time of diagnosis (48.98% vs 35.00%, p = 0.046), and patients with a family history of HCM (29.59% vs 15.00%, p = 0.014). Moreover, these patients had a significantly lower prevalence of hypertension (23.47% vs 47.00%, p = 0.001) than genotype-negative patients. In terms of CMR data, the HCM morphological subtypes were categorized as follows: sigmoid (n = 111), reverse curve (n = 56), apical (n = 27), and neutral (n = 4). Compared with genotype-negative patients, genotype-positive patients exhibited greater LVMWT (23.70 ± 5.66 mm vs 21.53 ± 5.56 mm, p = 0.007) and higher LVMWT/LVPWT ratios (2.90 ± 1.02 vs 2.38 ± 0.68, p < 0.001) (Table 3).

Table 3 Clinical and imaging parameters from three established scores in genotype (+) and genotype (−) patients

HCM genotype

The overall yield of the genetic testing in this study was 49.49% (98/198). In this subset of genotype-positive patients, variants were distributed in MYBPC3 (n = 41; 41.84%), MYH7 (n = 34; 34.69%), MYL2 (n = 2; 2.04%), TPM1 (n = 2; 2.04%), TNNI3 (n = 5; 5.10%), TNNT2 (n = 3; 3.06%), ACTC1 (n = 3; 3.06%), MYL3 (n = 3; 3.06%), and compound mutations (n = 5; 5.10%) (Table 4 and Supplement 2).

Table 4 Distribution of sarcomere protein gene mutations

Model performance

Training set

To evaluate the stability of the models, we used the network to perform a 10-fold cross-validation on the internal dataset. The area under the curve (AUC) values for the Mayo Clinic score I, Mayo Clinic score II, Toronto score, and DL model were 0.63 ± 0.04, 0.67 ± 0.04, 0.68 ± 0.04, and 0.81 ± 0.01, respectively.

Test set

The diagnostic performance in the test set was computed at the thresholds that maximized the Youden index, with the following results (Table 5 and Fig. 4): Mayo Clinic score I (AUC: 0.64, sensitivity: 64.29%, specificity: 47.83%, accuracy: 56.86%), Mayo Clinic score II (AUC: 0.70, sensitivity: 64.29%, specificity: 65.22%, accuracy: 64.71%), Toronto score (AUC: 0.74, sensitivity: 75.00%, specificity: 56.52%, accuracy: 66.67%), and DL model (AUC: 0.80, sensitivity: 85.71%, specificity: 69.57%, accuracy: 78.43%). Although the DL model exhibited a higher predictive performance, the difference did not reach statistical significance compared with the Mayo Clinic score I, Mayo Clinic score II, or Toronto score. However, the combination of the DL model and Toronto score resulted in significantly higher predictive performance (AUC = 0.84, sensitivity: 83.33%, specificity: 78.26%, accuracy: 84.31%) compared with Mayo I (p = 0.006), Mayo II (p = 0.022), and Toronto score (p = 0.029). Furthermore, the false-positive rate was 52.17%, 34.78%, 43.48%, and 30.43% for Mayo I, Mayo II, Toronto score, and DL model, respectively, resulting in 12, 8, 10, and 7 genotype-negative HCM patients being misclassified into the genotype-positive group, respectively. The combination of the DL model and Toronto score resulted in only five genotype-negative HCM patients being misclassified into the genotype-positive group.
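The reported percentages can be traced back to confusion-matrix counts, and the threshold selection can be sketched in a few lines. The code below is an illustration under stated assumptions: the counts for the DL model (24 true positives, 4 false negatives, 16 true negatives, 7 false positives) are back-calculated from the reported sensitivity, specificity, and accuracy, assuming 28 genotype-positive and 23 genotype-negative test patients; they are not quoted from Table 5. The function names and toy scores are hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, false-positive rate, accuracy, and Youden's J."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": 1.0 - specificity,
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "youden_j": sensitivity + specificity - 1.0,
    }


def best_youden_threshold(labels, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A case is called positive when score >= threshold. Both classes must
    be present in `labels`. Ties in J keep the lowest threshold.
    """
    best_threshold, best_j = None, -1.0
    for threshold in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
        j = diagnostic_metrics(tp, fn, tn, fp)["youden_j"]
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


if __name__ == "__main__":
    # Counts inferred for the DL model on the 51-patient test set (assumption).
    for name, value in diagnostic_metrics(tp=24, fn=4, tn=16, fp=7).items():
        print(f"{name}: {value:.4f}")
```

With these inferred counts, the metrics reproduce the DL model's reported 85.71% sensitivity, 69.57% specificity, 30.43% false-positive rate, and 78.43% accuracy.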

Table 5 Diagnostic performance of three established scores and DL model for prediction of HCM mutation
Fig. 4

ROC curves of the three established scores, DL, and DL + Toronto score for the reflection of HCM genotypes in the test set. Abbreviations: AUC, area under the (receiver-operating characteristic) curve; DL, deep learning; HCM, hypertrophic cardiomyopathy; ROC, receiver-operating characteristic; T, Toronto score

Discussion

In this study, to maximize the cost-effectiveness of HCM genetic testing and explore the potential value of CMR in reflecting HCM genotype status, we developed a nonenhanced cine CMR image–based DL model. The underlying hypothesis is that cine images, which are routinely acquired in clinical practice without the administration of contrast agents, can provide information on internal myocardial structure and motion. Our results indicate that an HCM mutation may be predicted with an AUC of 0.80 and an accuracy of 78.43% using the DL model. The results were reasonably consistent across the 10-fold cross-validation, suggesting stable network performance. In addition, the combination of the DL model and the Toronto score (with an AUC of 0.84 and an accuracy of 84.31%) yielded significantly higher diagnostic performance than that of a single score.

Clinical and CMR parameters

The overall yield of genetic testing was 49.5%, which was within the range of yields reported in previous literature (15–70%) [6,7,8]. Similar to the results of previous studies [29], 76.5% of causative mutations were detected in MYBPC3 and MYH7 in this study. Furthermore, several clinical and imaging parameters were reported to be associated with HCM genotypes. In this study, a positive genotype was associated with diagnosis at a young age, a higher prevalence of family history of HCM, a lower prevalence of hypertension, and greater LVMWT, which is consistent with the previous literature [9]. Three scores have been established to predict HCM genotypes by combining different clinical and imaging parameters. Mayo Clinic score I considered only age at diagnosis, LVMWT, and family history of HCM. Mayo Clinic score II added septal morphology, family history of SCD, and hypertension. In the Toronto score, further details, with different risk weights, were attached to independent predictor variables. Among the three traditional scores, the Toronto score provided the best predictive performance. However, the traditional scores had limited accuracy (56.86–66.67%). This may be because the traditional scores were established based on the linear regression modeling of clinical and imaging variables, and a single score cannot reflect the dynamic physiological complexity of this heart disease.

Deep learning model

Compared with human visual inspection, the DL technique may detect subtle motion changes in cardiovascular diseases, e.g., myocardial infarction, with higher precision and sensitivity [14, 30]. DL has also been used to achieve fully automated and accurate LV functional analysis of CMR cine images [31]. In our DL model, the differences in the internal myocardial structure were reflected in the intensity changes in the myocardial features in each frame of the image. Ellims et al. [32] detected more regional, but less diffuse, myocardial fibrosis in HCM patients with genetic mutations than in those without genetic mutations, suggesting different pathological features in HCM patients with positive and negative genotypes. Furthermore, the difference in myocardial motion was reflected in the temporal state of the myocardial features. This may be owing to the different HCM genetic status, which may lead to subtle differences in both the internal structure and motion states of the myocardium. Notably, the combination of the DL model and Toronto score resulted in the best predictive performance, with an AUC of 0.84 and a lower false-positive rate (approximately 20%) than single scores. This implies that internal myocardial characterization and clinical features are complementary.

Clinical implications

CMR should be considered for patients who fulfill the diagnostic criteria for HCM, to assess cardiac anatomy, ventricular function, and the presence and extent of myocardial fibrosis (IIa) [4]. However, CMR is underutilized in HCM, primarily owing to the length of the CMR examination, lack of clinicians experienced in cardiac imaging, and relatively high cost (approximately 286 USD, 70% covered by social medical insurance in China). Thus, it is impractical to perform CMR for all HCM patients. We developed a DL model based on cine images of those patients who have undergone CMR examinations, to identify genotype information and assist in selecting patients with a higher probability of a positive genotype to maximize the cost-effectiveness of genetic testing. For example, using the combined DL model and Toronto score, five of fifty-one (< 10%) patients were misclassified in the genotype-positive group, which was significantly lower than the misclassifications by established scores (15–25%). The identification of causative mutations in an HCM proband could facilitate the detection of asymptomatic HCM patients and mutation carriers among their family members. Although the genotype-phenotype relationship in HCM has not been clearly established, it has been widely accepted that a positive genotype is associated with a worse prognosis than a negative genotype [10, 33, 34]. Thus, the DL-based technique on nonenhanced CMR cine images also has the potential to improve the interpretation of the genotype-phenotype correlation in the HCM population.

Study limitation

We acknowledge that this preliminary study has several limitations. First, the DL network exhibits difficulty in obtaining stable image features directly from the data, owing to the limited dataset. A larger set of training and test data may facilitate further improvement. Second, multi-center validation was not performed because the available data were obtained from a single tertiary hospital. Third, we focused on the classification of positive and negative genotypes; therefore, the performance of LV segmentation was not evaluated in this study. An anatomically constrained neural network may improve the segmentation and performance of the DL model in future studies. Fourth, although CMR is characterized by multi-sequence imaging, only four-chamber-view cine images were analyzed in this study, owing to their reliable positioning and the absence of both radiation and contrast agent–related renal risks. Other sequences in different orientations may be investigated in future studies.

Conclusion

The combination of the cine image–based DL model and the Toronto score may aid in identifying HCM patients with positive genotypes and may also enhance the interpretation of the genotype-phenotype relationship on CMR in the HCM population. Multivendor, multi-center participation and a larger sample size are needed to evaluate the feasibility and clinical application of this model.