Introduction

Cardiovascular disease (CVD) is the leading cause of death worldwide [1]. Myocardial fibrosis (MF) is one of the most common histologic features associated with injury to the myocardium [2]. Conditions such as myocardial infarction, coronary and hypertensive heart disease, and aortic stenosis are the most frequent causes of MF [3,4,5]. MF can be of two types: interstitial and replacement fibrosis [6]. Interstitial fibrosis, which is believed to be reversible through early diagnosis and treatment planning [7, 8], is characterized by the diffuse spread of extracellular collagen without cardiomyocyte necrosis [9]. Replacement fibrosis, which occurs after myocardial infarction, is considered irreversible [7, 10, 11]. The focus of this review is on replacement MF. The mass, surface area, and spatial distribution of MF have been shown to be predictors of patient prognosis and clinical outcomes [12•]. For instance, MF mass was shown to be an independent predictor of severe diastolic dysfunction [13]. Diastolic dysfunction is correlated with larger MF mass and lower left ventricular ejection fraction [14]. Recently, computational modeling of patients with ischemic cardiomyopathy has emerged as a promising non-invasive tool that allows clinicians to conduct patient-specific diagnosis and treatment of associated rhythm disorders [15,16,17,18,19,20,21]. However, accurate modeling of the myocardial structure requires the incorporation of the intact geometry of the MF region [15, 22, 23]. Therefore, identification and characterization of MF are of utmost importance. Characterizing MF may also help clinicians determine the appropriateness of, and the procedural approach to, percutaneous ablation (aimed at eliminating electrical channels) and cardiac resynchronization therapy [24, 25].

Cardiovascular magnetic resonance (CMR) is a non-invasive imaging modality for patients with MF, and it is the reference methodology for the assessment of cardiac anatomy and function [26, 27]. CMR is routinely performed during multiple breath-holds in predefined 2D image orientations, and it provides an accurate and reproducible assessment of cardiac structure, including the left ventricle (LV) and right ventricle (RV), and of myocardial viability [28, 29]. Routine CMR exams include late gadolinium enhancement (LGE) and cinematic CMR (cine CMR) images. Gadolinium, a contrast agent, is used in CMR scans to improve the clarity of the cardiac structure in the images. Cine images are required to acquire complete information on heart function throughout the cardiac cycle [30]. Figure 1 shows an example of an LGE CMR image. Two-dimensional (2D) LGE CMR has been recognized as the non-invasive reference standard for MF identification [31, 32]. In LGE imaging, the increased volume of distribution for the contrast agent and the prolonged washout related to the decreased capillary density within the myocardial fibrotic tissue cause MF to appear as a bright signal intensity in the CMR image [26]. The spatial resolution at which the images are acquired may affect the predictive value of the MF quantified by CMR techniques [33]. 3D CMR is a relatively new technique that allows more accurate spatial representation and quantification [34]. 3D LGE CMR images can now be acquired via cardiac gating during free breathing [35, 36]. Furthermore, 3D acquisition improves myocardial nulling, which is especially relevant at higher magnetic field strengths such as 3 T [37]. Although 3D LGE CMR images provide higher signal intensity and contrast for MF than the 2D technique, with reduced overall acquisition times [38,39,40], delineating MF from 3D LGE CMR images is more challenging due to their sheer size [41]. The 3D LGE CMR images may consist of hundreds of slices, reformatted in multiple imaging planes with 3D reconstruction. It has also been demonstrated that segmenting MF directly from cine CMR images without contrast agents is feasible [42, 43••], which is relevant for patients with chronic kidney disease, for whom gadolinium-based contrast agents are not recommended [44].

Fig. 1

From left to right, an example of one slice, randomly extracted from a multi-slice clinical LGE-CMR image, in the transversal direction showing the right and left ventricles along with the contours of myocardial fibrosis, shown in cyan color

Automated quantification of MF helps clinicians determine the clinical diagnosis and prognosis of patients efficiently. Its importance is growing because both the size of cardiac images and the number of scans acquired annually are increasing as the number of patients grows. Rosendahl et al. [45] showed that computer software for MF characterization decreased the required expert evaluation time by more than 50% while maintaining clinical accuracy comparable to that of manual assessment. Moreover, fully automated methods are reproducible and not subject to operator variability.

Prior to the recent advancements in machine learning (ML)-based methods, intensity threshold-based methods and energy minimization-based methods were widely employed for MF segmentation from LGE CMR images. Two widely used intensity threshold-based methods are the full width at half maximum (FWHM) and signal threshold to reference mean (STRM). In FWHM, the maximum intensity within the myocardial boundary is taken as the reference, and all pixels with intensity values larger than half of this maximum are defined as scar [46]. In STRM, the intensity of healthy tissue is taken as the reference, and MF is defined as pixels with intensity above its mean value plus two (STRM2), three (STRM3), four (STRM4), five (STRM5), or six (STRM6) standard deviations [47]. These two methods require manual myocardial segmentation and manual selection of seed points, which are subject to high inter- and intra-observer variability. Energy minimization-based methods formulate MF segmentation as an optimization problem whose criterion for the “goodness” of a segmentation combines a regularization constraint on the segmented boundary with image-based energy terms. These techniques require the user to define an initial contour in the vicinity of the MF region, which is then evolved by minimizing the optimization criterion. The energy minimization-based methods include graph cuts, level sets, and convex max flow-based methods [48,49,50,51]. Most of these methods were semi-automated and required manual segmentation of the myocardium to constrain the search space for segmenting the MF. However, since these methods use only simple, straightforward features such as image intensity and spatial consistency and have only a few hyper-parameters, their performance plateaued even as larger annotated datasets became available [52].
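The two threshold rules described above can be illustrated with a minimal NumPy sketch. This is not the implementation used in the cited studies; the function names, the toy image, and the choice of the first image row as the healthy reference region are assumptions made purely for illustration.

```python
import numpy as np

def fwhm_mask(image, myocardium):
    """FWHM rule: scar = myocardial pixels brighter than half the maximum myocardial intensity."""
    max_intensity = image[myocardium].max()
    return myocardium & (image > 0.5 * max_intensity)

def strm_mask(image, myocardium, healthy, n_sd=5):
    """STRM rule: scar = myocardial pixels above mean + n_sd * SD of the healthy reference region."""
    reference = image[healthy]
    threshold = reference.mean() + n_sd * reference.std()
    return myocardium & (image > threshold)

# Toy 4x4 "image" (hypothetical values); every pixel is treated as myocardium.
image = np.array([[10.0, 12.0, 11.0, 9.0],
                  [10.0, 80.0, 90.0, 11.0],
                  [12.0, 100.0, 45.0, 10.0],
                  [9.0, 11.0, 10.0, 12.0]])
myocardium = np.ones(image.shape, dtype=bool)

# Hypothetical healthy (remote myocardium) reference region: the first row.
healthy = np.zeros(image.shape, dtype=bool)
healthy[0, :] = True

fwhm = fwhm_mask(image, myocardium)               # threshold = 0.5 * 100 = 50
strm5 = strm_mask(image, myocardium, healthy, 5)  # threshold = 10.5 + 5 * SD of first row
```

In this toy case the pixel with intensity 45 is labeled as scar by STRM5 but not by FWHM, which shows how the choice of reference (maximum versus healthy mean) changes the resulting MF extent.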

Deep learning (DL), a branch of ML, has evolved over the last decade and provides promising solutions for the interpretation of medical images. DL architectures such as deep neural networks, deep belief networks, recurrent neural networks (RNNs), and convolutional neural networks (CNNs) employ multiple layers of non-linear processing units to model complex data with large feature sizes [53]. Recent advances in DL and the availability of large annotated datasets have aided in the identification, classification, and quantification of abnormalities in medical images. Automated characterization of MF using LGE CMR images is no exception. In this paper, we describe semi- and fully automated methods developed for LV MF segmentation from 2D/3D LGE CMR images, with emphasis on the ML- and DL-based methods. Furthermore, ML-based methods developed for the delineation of MF from non-contrast CMR images are also summarized.

ML-Based Methods for Identification of MF from 2D/3D LGE CMRI

A summary of previously published ML-based methods for MF quantification and segmentation from 2D/3D LGE CMR images is shown in Table 1. We first describe the conventional ML-based algorithms. The DL-based methodologies are then presented, and within each category, semi- and fully automated techniques are described.

Table 1 Overview of previously published ML-based methods for MF quantification and segmentation from 2D/3D LGE CMRI

Conventional ML-Based Methods

Karim et al. described an evaluation framework for algorithms that segment and quantify MF in the LV from LGE CMR images [54]. They created two datasets comprising 15 images of humans with a known history of ischemic cardiomyopathy and 15 images of pigs with chronic myocardial ischemia, of which five in each cohort were used as a training set for the algorithms. The datasets were used for an open challenge, the Delayed Enhancement MRI segmentation challenge, put forth to the medical imaging community at a workshop of the Medical Image Computing and Computer-Assisted Intervention (MICCAI) annual meeting. A few of the submitted techniques were ML-based. Lara et al. suggested a method based on a support vector machine (SVM) and level sets to identify MF in the LV. The FWHM method was first applied to get an approximate segmentation of MF. Connected-component analysis was then employed to find groups of connected pixels, and features including area, bounding box, major and minor axes, eccentricity, convex-hull area, and Euler number were extracted from those groups. Using these features, the SVM was applied to differentiate between healthy and MF tissue. A level set method was used to refine the segmentation. This method yielded median Dice similarity coefficients (DSC) of 73% and 86% on the patient and porcine LGE CMR test scans, respectively.
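Because the DSC is the metric reported by most of the studies reviewed here, a short sketch of its standard definition, DSC = 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a reference mask B, may be helpful. The function name and the convention of returning 1.0 for two empty masks are assumptions of this sketch, not details taken from the cited studies.

```python
import numpy as np

def dice_coefficient(predicted, reference):
    """Dice similarity coefficient between two binary masks of equal shape."""
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    total = predicted.sum() + reference.sum()
    if total == 0:
        # Assumed convention: two empty masks are treated as perfect agreement.
        return 1.0
    overlap = np.logical_and(predicted, reference).sum()
    return 2.0 * overlap / total

# Toy example: the prediction covers the reference pixel plus one false positive.
predicted = np.array([1, 1, 0, 0])
reference = np.array([1, 0, 0, 0])
dsc = dice_coefficient(predicted, reference)  # 2 * 1 / (2 + 1)
```

A DSC of 1 indicates identical masks and a DSC of 0 indicates no overlap; the percentages quoted in this review correspond to this value multiplied by 100.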

Kurzendorfer et al. developed a four-step texture-based method to segment MF from 3D LGE CMR images [55]. They first decomposed the image into a set of binary images by applying a two-threshold binary decomposition. Next, a set of fractal dimension features was extracted from the binary images. Global and local features were included in the feature vectors as well. Then, a random forest classifier was employed to classify the extracted features. They used a dataset of clinical 3D LGE CMR images of 30 patients and evaluated their algorithm using a 6-fold nested cross-validation technique. The suggested methodology achieved a DSC of 66% ± 17% on the test samples. The authors compared their algorithm with STRM and FWHM, both of which required user input. The comparison demonstrated that the random forest-based method outperformed STRM, but not FWHM. According to the authors, the results were likely biased toward the FWHM segmentation because FWHM was used to provide the annotated ground truth images.

DL-Based Methods

The majority of the DL-based algorithms employed CNN-based networks to segment MF from LGE-CMR images [56,57,58,59,60, 61••]. In the following subsections, we first describe semi-automated methods, where manual segmentation of the LV is required to segment the MF boundary in LGE-CMR images. The fully automated techniques are then summarized.

Semi-Automated Algorithms

Moccia et al. used a fully convolutional neural network (FCN) to delineate MF in the LV from 3D LGE CMR images [56]. In the FCN architecture, several convolutional layers are stacked in an encoder-decoder fashion. The encoder downsamples the image through convolutional and pooling layers, and the decoder upsamples the image using transposed convolution to generate the segmentation map. The feature maps from the downsampling and upsampling paths are summed to combine context information with spatial information. The authors delineated MF using the pre-segmented myocardium as the region of interest (ROI). A total of 250 slices were compiled from CMR images of 30 patients with ischemic heart disease. The proposed method was evaluated against expert manual segmentation using a leave-one-patient-out cross-validation technique. The suggested technique yielded a median DSC of 71.25%.

Zabihollahy et al. proposed 2D and 3D CNN-based algorithms to identify MF from the pre-segmented LV using a dataset of 3D LGE CMR images of 34 patients with ischemic cardiomyopathy [57, 58]. Small 2D and 3D patches were extracted around each voxel in the LV myocardium and presented to the trained 2D and 3D CNNs to be classified as healthy or MF tissue. A segmentation map was then created from the output labels and compared with the ground truth images. The 2D CNN-based method yielded a DSC of 94.50% ± 3.62% on the test instances, while a DSC of 93.63% ± 2.6% was obtained using the 3D CNN.
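The patch-wise classification scheme can be sketched in 2D as follows. This is an illustrative outline only: the function names, patch size, and reflect padding are assumptions, and a simple intensity threshold on the center pixel is used as a hypothetical stand-in for the trained CNN described in [57, 58].

```python
import numpy as np

def extract_patches_2d(image, mask, half_size=2):
    """Return one square patch centered on every pixel inside the mask."""
    size = 2 * half_size + 1
    rows, cols = np.nonzero(mask)
    # Reflect-pad so that pixels near the image border also get full patches.
    padded = np.pad(image, half_size, mode="reflect")
    patches = np.empty((len(rows), size, size), dtype=image.dtype)
    for k, (r, c) in enumerate(zip(rows, cols)):
        # Pixel (r, c) sits at (r + half_size, c + half_size) in the padded image.
        patches[k] = padded[r:r + size, c:c + size]
    return patches, rows, cols

def labels_to_map(labels, rows, cols, shape):
    """Write one label per patch back to its center pixel to form a segmentation map."""
    segmentation = np.zeros(shape, dtype=bool)
    segmentation[rows, cols] = labels
    return segmentation

# Toy example: an 8x8 image with a bright 2x2 "scar" inside a 4x4 myocardium mask.
image = np.zeros((8, 8))
image[3:5, 3:5] = 1.0
myocardium = np.zeros((8, 8), dtype=bool)
myocardium[2:6, 2:6] = True

patches, rows, cols = extract_patches_2d(image, myocardium, half_size=1)

# Hypothetical stand-in for the trained classifier: label by center-pixel brightness.
labels = patches[:, 1, 1] > 0.5
segmentation = labels_to_map(labels, rows, cols, image.shape)
```

Restricting patch extraction to the myocardium mask is what makes this a semi-automated pipeline: the classifier never sees pixels outside the pre-segmented LV.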

Fully Automated Algorithms

Moccia et al. also segmented MF directly from the CMR images using the same method and dataset employed for the semi-automated pipeline [56]. The fully automated protocol yielded a median DSC of 54%, compared with 71.25% for the semi-automated protocol. The authors compared the results obtained from the semi- and fully automated methods and demonstrated that identification of MF is significantly more accurate when the search area is limited to the myocardial region (p value < 0.05).

In another study, a deep convolutional neural network (DCNN) was used to quantify MF in the LV on 2D LGE CMR images [59]. Since only some of the slices of an image contain scar tissue, the authors suggested a generative adversarial network (GAN)-based technique to simulate scar tissue on healthy myocardium and thereby artificially augment the training samples. Unlike other approaches that generate samples from scratch, the authors simulated scar tissue on normal scans to generate highly realistic samples for training. The proposed approach was evaluated using a dataset of 159 LGE CMR scans and achieved a DSC of 80.5% for MF delineation on test images.

Fahmy et al. employed a U-Net architecture to quantify LV mass and MF volume on LGE CMR images [60]. The U-Net architecture is built upon the FCN, with the main difference that U-Net is symmetric. Furthermore, in U-Net, the skip connections between the downsampling path and the upsampling path apply a concatenation operator instead of summation. Images of 1041 patients with hypertrophic cardiomyopathy were used for this study (train/test = 80%/20%). Their method achieved DSCs of 82% ± 8% and 81% ± 11% for LV identification and DSCs of 57% ± 23% and 58% ± 28% for MF segmentation at the per-patient and per-slice levels, respectively. With this algorithm, the DSC of LV segmentation was lower in the apical slices than at other slice locations (70% ± 20% versus 83% ± 10%; p < 0.001).
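The difference between the two kinds of skip connection can be seen from the shapes of the merged feature maps. The snippet below is a toy NumPy illustration, not network code; the channel count and spatial size are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature maps with layout (channels, height, width).
encoder_features = rng.standard_normal((16, 32, 32))  # from the downsampling path
decoder_features = rng.standard_normal((16, 32, 32))  # from the upsampling path

# FCN-style skip connection: element-wise summation keeps the channel count.
fcn_merged = encoder_features + decoder_features

# U-Net-style skip connection: concatenation along the channel axis doubles it.
unet_merged = np.concatenate([encoder_features, decoder_features], axis=0)

fcn_shape = fcn_merged.shape    # (16, 32, 32)
unet_shape = unet_merged.shape  # (32, 32, 32)
```

With concatenation, the following convolution receives the encoder and decoder features as separate channels and can learn how to weight them, at the cost of more parameters in that layer.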

Zabihollahy et al. introduced a novel DL-based method to delineate the boundaries of MF within the automatically segmented LV using 3D LGE CMR images of 34 patients with ischemic cardiomyopathy [61••]. Two cascaded modules were employed to segment the LV myocardium and then use it as an ROI to search for MF tissue. In each module, three different U-Nets were trained using slices extracted from the transversal, coronal, and sagittal directions to exploit the isotropic voxels of the 3D images. The proposed methodology achieved a DSC of 88.61% ± 2.54% for fully automated segmentation of MF on 3D LGE CMR test images.
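The idea of processing an isotropic volume along three orthogonal directions can be sketched as follows. The slicing helper reflects what is described above; the voxel-wise majority vote used to combine three per-direction masks is an assumption of this sketch, since the fusion rule of the cited method is not stated here. All names are hypothetical.

```python
import numpy as np

def slices_along(volume, axis):
    """Return the list of 2D slices obtained by cutting the volume along one axis."""
    return [np.take(volume, i, axis=axis) for i in range(volume.shape[axis])]

def majority_vote(masks):
    """Assumed fusion rule: foreground where more than half of the masks agree."""
    votes = np.sum(np.stack(masks).astype(int), axis=0)
    return votes * 2 > len(masks)

# Toy volume; the three axes play the role of the three slicing directions.
volume = np.zeros((4, 5, 6))
stack_axis0 = slices_along(volume, axis=0)  # 4 slices of shape (5, 6)
stack_axis1 = slices_along(volume, axis=1)  # 5 slices of shape (4, 6)
stack_axis2 = slices_along(volume, axis=2)  # 6 slices of shape (4, 5)

# Hypothetical per-direction predictions, already reassembled into volumes.
mask_a = np.zeros(volume.shape, dtype=bool)
mask_b = np.zeros(volume.shape, dtype=bool)
mask_c = np.zeros(volume.shape, dtype=bool)
mask_a[1, 2, 3] = mask_b[1, 2, 3] = True  # two of three directions agree
mask_c[0, 0, 0] = True                    # only one direction votes here

fused = majority_vote([mask_a, mask_b, mask_c])
```

Slicing along all three axes is only meaningful when the voxels are (close to) isotropic, because otherwise the in-plane resolution would differ between the three slice stacks.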

DL-Based Methods for Identification of MF from Non-Contrast CMRI

Several studies have reported fully automated segmentation of MF from cine CMR images using DL-based methods. These algorithms are validated against ground truths manually segmented from the corresponding LGE CMR images. Chen et al. described an algorithm to segment the boundaries of MF in the LV and evaluated their method using a dataset comprising cine CMR images of 73 subjects [62]. In their approach, the static image information and the motion information described by optical flow were fused and fed as inputs into a stacked denoising autoencoder (SDAE), which is composed of multiple layers of sparse autoencoders. An SVM classifier was then applied to the discriminative feature representation learned by the SDAE. This method yielded an accuracy and precision of 87.6% and 86.2%, respectively, on 13 test cases.

Xu et al. developed an end-to-end deep learning framework to detect MF using short-axis cine CMR images of 114 patients [63]. The authors first used a fast region-based CNN (Fast R-CNN) to localize and crop the ROI containing the left ventricle. Local and global motion features were then generated by a long short-term memory recurrent neural network (LSTM-RNN) and deep optical flow, respectively. In particular, the two types of motion features are learned together: patch-based (local) motion features from the local intensity change between patches cropped from the image sequence, and image-based (global) motion features from the global intensity change along the whole image sequence. An autoencoder was included at the end to detect the MF region from the learned local and global motion features. The proposed framework yielded a classification accuracy of 94.35% at the pixel level. In another study, the authors used a deep learning framework similar to the previous work [63], except that Faster R-CNN was used instead of Fast R-CNN for LV localization in non-contrast CMR images. Fast R-CNN uses selective search to generate ROIs, whereas Faster R-CNN uses a region proposal network that generates a set of object proposals, each with a score. This algorithm was evaluated using 165 cine CMR images and achieved a DSC of 89.87% using a 10-fold cross-validation technique [42]. Zhang et al. used the same method and evaluated it on a dataset of 212 patients with chronic MF and 87 control patients, of which approximately 20% was used for the test phase. The DSC was reported as 86.1% ± 5.7% for the delineation of MF on short-axis cine CMR images [64].

Xu et al. proposed a deep spatiotemporal adversarial network to segment and quantify MF directly from cine MR images [65]. Manual segmentation of MF on the corresponding LGE CMR images was considered as the ground truth to evaluate the performance of the proposed methodology. In this algorithm, coarse-to-fine hierarchical features are first learned using a multi-level and multi-scale spatiotemporal variation encoder. Then, a cross-task generator produces the results of the segmentation and quantification tasks and connects the beneficial interaction feature maps of these two related tasks. Finally, three discriminators (for segmentation, quantification, and their relationship) are iteratively imposed on the encoder and generator to detect and correct inconsistencies in the label relatedness between and within tasks via adversarial learning. This method yielded an accuracy of 96.98% at the pixel level.

Discussion

Due to the availability of large annotated datasets, supervised learning approaches based on DL methods are well suited for the segmentation of MF in CMR images. In this paper, we described the ML-based methods developed for the quantification of MF. The proposed methods have evolved from semi- to fully automated, and the majority of them are CNN-based. We reviewed methods introduced for MF assessment using both types of CMR images, LGE and cine. LGE CMR imaging is regarded as the gold standard for MF assessment, as regions of MF appear hyperenhanced compared to normal tissue. However, gadolinium-based contrast agents may not be recommended for patients with chronic kidney disease [44]. Therefore, there is an increased interest in the automated characterization of MF using non-contrast CMR images.

The use of different datasets and metrics in the reported studies on MF segmentation hampers the direct comparison of the proposed algorithms. Furthermore, in some studies where segmentation was performed using cine MR images, pixel-wise accuracy was used as the sole metric to assess the performance of the developed method for MF segmentation. This metric does not adequately represent segmentation performance because MF occupies a relatively small area in CMR images compared to the background, which leads to a substantial class imbalance between pixels labeled as MF and as background. It has been shown that in the case of class imbalance, a proper metric other than accuracy (e.g., sensitivity, specificity, receiver operating characteristic curve, precision-recall curves, etc.) must be chosen to evaluate the model performance [66]. Selecting a metric is problem dependent. Another limitation concerning algorithm evaluation is the subjectivity of manual segmentations as the reference standard. In all aforementioned studies, expert manual segmentation, typically performed by a single expert or a few experts, was used as a surrogate for ground truth to evaluate the performance of the developed algorithms for MF segmentation. However, there is observer variability in the manual delineation of the MF in the images, which has often not been taken into account. Another limitation of the surveyed studies is that a limited number of images were used for the evaluation of the algorithms. This is a common issue in medical image processing problems, as acquiring annotated images is quite expensive. Therefore, further studies are required to investigate the robustness of the presented automated techniques across various LGE CMR images. Future work includes domain adaptation and transfer learning, which might help adapt pre-trained deep learning models to data from new centers.
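The effect of class imbalance on pixel-wise accuracy can be made concrete with a toy calculation. The image size and the 2% MF fraction below are hypothetical values chosen for illustration; the "prediction" is a degenerate model that labels every pixel as background.

```python
import numpy as np

# Hypothetical 100x100 slice in which MF covers 200 pixels (2% of the image).
reference = np.zeros((100, 100), dtype=bool)
reference[40:50, 40:60] = True

# Degenerate prediction: no pixel is labeled as MF.
prediction = np.zeros_like(reference)

# Confusion-matrix counts at the pixel level.
tp = np.sum(prediction & reference)
tn = np.sum(~prediction & ~reference)
fp = np.sum(prediction & ~reference)
fn = np.sum(~prediction & reference)

accuracy = (tp + tn) / reference.size  # dominated by the background pixels
sensitivity = tp / (tp + fn)           # fraction of MF pixels that were found
dsc = 2 * tp / (2 * tp + fp + fn)      # overlap-based metric
```

Here the pixel-wise accuracy is 98% even though not a single MF pixel is detected, whereas sensitivity and DSC are both zero, which is why accuracy alone is a poor indicator of MF segmentation quality.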