Abstract
Left atrial (LA) segmentation from late gadolinium enhanced magnetic resonance imaging (LGE MRI) is a crucial step in planning the treatment of atrial fibrillation. However, automatic LA segmentation from LGE MRI remains challenging due to poor image quality, high variability in LA shapes, and unclear LA boundaries. Though deep learning-based methods can provide promising LA segmentation results, they often generalize poorly to unseen domains, such as data from different scanners and/or sites. In this work, we collect 140 LGE MRIs from different centers with different levels of image quality. To evaluate the domain generalization ability of models on the LA segmentation task, we employ four commonly used semantic segmentation networks for LA segmentation from multi-center LGE MRIs. In addition, we investigate three domain generalization strategies, i.e., histogram matching, mutual information based disentangled representation, and random style transfer, of which simple histogram matching proves to be the most effective.
1 Introduction
Radiofrequency (RF) ablation is a common technique in clinical routine for treating atrial fibrillation (AF) via electrical isolation. However, the success rate of some ablation procedures is low due to the existence of incomplete ablation patterns (gaps) on the left atrium (LA). Late gadolinium enhanced magnetic resonance imaging (LGE MRI) has been an important tool to detect gaps in ablation lesions, which are located on the LA wall and pulmonary veins (PV). Thus, it is important to segment the LA from LGE MRI for AF treatment. Manual delineation of the LA from LGE MRI can be subjective and labor-intensive, and automating this segmentation remains challenging.
In recent years, many algorithms have been proposed to perform automatic LA segmentation from medical images, but mostly for non-enhanced imaging modalities. In contrast, fewer LA segmentation methods have been developed for LGE MRI to assist ablation procedures. Most of the current studies on LA segmentation from LGE MRI are still based on time-consuming and error-prone manual segmentation [5, 12]. This is mainly because LA segmentation methods designed for non-enhanced imaging modalities are difficult to apply directly to LGE MRI, due to the presence of the contrast agent and the low-contrast boundaries. Therefore, existing conventional approaches to automated LA segmentation from LGE MRI generally require supporting information that is hard to obtain, such as shape priors [20] or additional MRI sequences [8]. Recently, with the development of deep learning (DL) in medical image computing, some DL-based algorithms have been proposed for automatic LA segmentation directly from LGE MRI [7, 15].
However, the generalization ability of DL-based models is limited, i.e., the performance of a model trained on a known domain (source domain) degrades drastically on an unseen domain (target domain). This is mainly due to the existence of a domain shift or distribution shift, which is common among data collected from different centers and vendors, as shown in Fig. 1. In the clinic, it is impractical to retrain a model each time data are collected from new vendors or centers. Therefore, improving the model generalization ability is important to avoid the need for retraining. Current domain generalization (DG) methods can be categorized into three types: (1) domain-invariant feature learning approaches, such as disentangled representation [11]; (2) model-agnostic meta-learning algorithms, which optimize on meta-train and meta-test domains split from the available source domains [4]; (3) data augmentation strategies, which increase the diversity of the available data [1].
In this work, we investigate the generalization abilities of four commonly used segmentation models, i.e., U-Net [14], UNet++ [19], DeepLab v3+ [3] and multi-scale attention network (MAnet) [2]. As Fig. 2 shows, we select two different sources of training data, i.e., target domain (TD) and source domains (SD) to evaluate the model generalization ability. Besides, we compare three different DG schemes for LA segmentation of multi-center LGE MRIs. The schemes include histogram matching (HM) [10], mutual information based disentangled (MID) representation [11], and random style transfer (RST) [9, 18].
2 Methodology
In this section, we describe the segmentation models we employ, formulate the DG problem (illustrated in Fig. 2) and describe the three investigated DG strategies.
2.1 Image Segmentation Models
All our segmentation models are supervised approaches based on convolutional neural networks. Typically, the models are trained using a training database \(\mathcal {T}_{\mathcal {D}}=\{(X_m, Y_m), m=1,\dots , M\}\) with images \(X\in \mathcal {D}\) from a single domain \(\mathcal {D}\) and corresponding labels Y. The segmentation model f(X) can be defined as
\[f(X) \rightarrow Y, \quad X \in \mathcal {D},\]
where \(X,Y \in \mathbb {R}^{1 \times H \times W}\) denote the image set and corresponding LA segmentation set.
We consider four commonly used segmentation models, all with an encoder-decoder architecture. The first model is a vanilla U-Net. UNet++ is a modified version of the U-Net with a more complex decoder. DeepLab v3+ employs atrous convolutions and spatial pyramid pooling, and MAnet introduces multi-scale attention blocks.
2.2 Domain Generalization Models
The generalization ability of such models is limited, i.e., a model trained on a source domain \(\mathcal {D}\) might perform poorly for images \(X\notin \mathcal {D}\). DG strategies are therefore proposed to generalize models to unseen (target) domains. Given N source domains \(\mathcal {D}_s=\left\{ \mathcal {D}_{1}, \mathcal {D}_{2}, \cdots , \mathcal {D}_{N}\right\} \), we aim to construct a DG model \(f^{DG}(X)\),
\[f^{DG}(X) \rightarrow Y, \quad X \in \mathcal {D}_s \cup \mathcal {D}_t,\]
where \(\mathcal {D}_{t}\) are unknown target domains.
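The source/target setting described above can be illustrated with a small sketch. It assumes, purely for illustration, that the data are grouped per center and that one center is held out as the unseen target domain; the function name `split_domains` and the file names are hypothetical and not taken from the paper.

```python
def split_domains(samples_by_center, target_center):
    """Hold out one center as the target domain; pool the rest as source domains.

    samples_by_center: dict mapping a center name to a list of samples.
    Returns (source_train, target_test).
    """
    if target_center not in samples_by_center:
        raise KeyError(f"unknown center: {target_center}")
    # Source domains D_s: every center except the held-out one.
    source_train = [
        sample
        for center, samples in samples_by_center.items()
        if center != target_center
        for sample in samples
    ]
    # Target domain D_t: never seen during training in the DG setting.
    target_test = list(samples_by_center[target_center])
    return source_train, target_test


# Hypothetical file names, for illustration only.
data = {
    "center1": ["c1_case01.nii", "c1_case02.nii"],
    "center2": ["c2_case01.nii"],
    "center3": ["c3_case01.nii", "c3_case02.nii"],
}
train, test = split_domains(data, target_center="center2")
```

A model trained on `train` and evaluated on `test` corresponds to the source-domain (SD) setting; training on data from the target center itself would correspond to the target-domain (TD) setting.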
We investigate three DG strategies for LA segmentation from LGE MRI. In the first and simplest approach, HM is performed on the images from the target domain to match their intensity histogram to that of the source domains. The model and training process do not change. The second (MID-Net [11]) and third (RST-Net [9, 18]) methods are state-of-the-art methods employing different approaches to achieve DG. In MID-Net, domain-invariant features are extracted by mutual information based disentanglement in the latent space, while in RST-Net the available domains are augmented with pseudo-novel domains.
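As a rough illustration of the HM step, the following is a minimal NumPy sketch of classical histogram matching through empirical cumulative distribution functions. It is not the authors' implementation; the function name `match_histogram` and the choice of a single reference image standing in for the source domains are assumptions.

```python
import numpy as np


def match_histogram(image, reference):
    """Remap the intensities of `image` so that its histogram follows `reference`."""
    # Unique intensities of the image, plus the index of each pixel's intensity.
    src_values, src_idx, src_counts = np.unique(
        image.ravel(), return_inverse=True, return_counts=True
    )
    ref_values, ref_counts = np.unique(reference.ravel(), return_counts=True)

    # Empirical cumulative distribution functions of both images.
    src_cdf = np.cumsum(src_counts) / image.size
    ref_cdf = np.cumsum(ref_counts) / reference.size

    # For each image quantile, look up the reference intensity at that quantile.
    matched_values = np.interp(src_cdf, ref_cdf, ref_values)
    return matched_values[src_idx].reshape(image.shape)


rng = np.random.default_rng(0)
target_slice = rng.normal(100.0, 20.0, size=(32, 32))  # stand-in target-domain slice
source_slice = rng.normal(0.0, 1.0, size=(32, 32))     # stand-in source-domain reference
matched = match_histogram(target_slice, source_slice)
```

Because the mapping is monotonic, the intensity ordering of the target image is preserved and only its distribution changes, which is why the segmentation model and training process can stay untouched.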
3 Materials
3.1 Data Acquisition and Pre-processing
LGE MRIs with various image qualities, types and imaging parameters were collected from three centers, as Table 1 shows. The centers consist of Utah School of Medicine (Center 1), Beth Israel Deaconess Medical Center (Center 2), and Imaging Sciences at King’s College London (Center 3). The datasets were selected from two public challenges, i.e., the MICCAI 2018 Atrial Segmentation Challenge [17] and the ISBI 2012 Left Atrium Fibrosis and Scar Segmentation Challenge [13]. A total of 140 images were collected, acquired either pre- or post-ablation. The acquisition time of pre-ablation scans varied slightly, from 1 to 7 days, whereas that of post-ablation scans ranged from 1 to 27 months depending on the imaging center.
The LGE MRIs from centers 1, 2, 3 and 4 were reconstructed to 0.625 \(\times \) 0.625 \(\times \) 1.25 mm, (0.7–0.75) \(\times \) (0.7–0.75) \(\times \) 2 mm, 0.625 \(\times \) 0.625 \(\times \) 2 mm, and 0.625 \(\times \) 0.625 \(\times \) 1.25 mm, respectively. All 3D images were divided into 2D slices as network inputs and then cropped to a unified size of 192 \(\times \) 192 centered at the heart region, with intensity normalization via Z-score. Random rotation, random flip and Gaussian noise augmentation were applied during training. The data distribution in the subsequent experiments is presented in Table 2.
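The pre-processing and augmentation steps can be sketched as follows. This is a minimal NumPy sketch under several assumptions that the text does not specify: the heart center is supplied by the caller, Z-score normalization is applied per volume, rotations are restricted to multiples of 90 degrees, and the noise level `noise_std` is an arbitrary illustrative value. All function names are hypothetical.

```python
import numpy as np


def center_crop(slice_2d, center, size=192):
    """Crop a size x size patch around `center` (row, col), zero-padding at borders."""
    half = size // 2
    padded = np.pad(slice_2d, half, mode="constant")
    row, col = center[0] + half, center[1] + half  # shift indices for the padding
    return padded[row - half:row + half, col - half:col + half]


def zscore(x, eps=1e-8):
    """Normalize intensities to zero mean and unit standard deviation."""
    return (x - x.mean()) / (x.std() + eps)


def preprocess_volume(volume, center):
    """Turn a (depth, height, width) volume into normalized 192 x 192 slices."""
    volume = zscore(volume.astype(np.float32))
    return np.stack([center_crop(s, center) for s in volume])


def augment(image, mask, rng, noise_std=0.05):
    """Random flip, random 90-degree rotation, and additive Gaussian noise."""
    if rng.random() < 0.5:
        image, mask = image[:, ::-1], mask[:, ::-1]
    k = int(rng.integers(0, 4))
    image, mask = np.rot90(image, k), np.rot90(mask, k)
    image = image + rng.normal(0.0, noise_std, size=image.shape)  # noise on image only
    return image, mask


rng = np.random.default_rng(0)
volume = rng.random((3, 240, 200))            # stand-in for an LGE MRI volume
slices = preprocess_volume(volume, center=(120, 100))
aug_image, aug_mask = augment(slices[0], slices[0] > 0, rng)
```

The geometric transforms are applied identically to image and label so that they stay aligned, while the noise is added to the image alone.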
3.2 Gold Standard and Evaluation
All the LGE MRIs were manually delineated by experts from the corresponding centers. The manual LA segmentations were regarded as the gold standard. For LA segmentation evaluation, the Dice score, average surface distance (ASD) and Hausdorff distance (HD) were applied. Each image from the three centers was assigned an image quality score by averaging the scores from two experts, mainly based on the visibility of enhancements and the existence of image artefacts (please see the Supplementary Material file).
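For readers unfamiliar with the three metrics, a minimal 2D NumPy sketch is given below. It is an illustration rather than the evaluation code used in this work: boundaries are extracted with a simple 4-neighborhood rule, distances are computed by brute force, both masks are assumed to be non-empty, and the symmetric form of ASD is one common convention among several.

```python
import numpy as np


def dice_score(pred, gt):
    """Dice overlap between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, gt).sum() / denom


def boundary(mask):
    """Foreground pixels with at least one background 4-neighbor."""
    mask = mask.astype(bool)
    m = np.pad(mask, 1)
    interior = (m[1:-1, 1:-1] & m[:-2, 1:-1] & m[2:, 1:-1]
                & m[1:-1, :-2] & m[1:-1, 2:])
    return mask & ~interior


def surface_distances(pred, gt, spacing=(1.0, 1.0)):
    """Nearest-boundary distances in both directions (assumes non-empty masks)."""
    p = np.argwhere(boundary(pred)) * np.asarray(spacing)
    g = np.argwhere(boundary(gt)) * np.asarray(spacing)
    d = np.linalg.norm(p[:, None, :] - g[None, :, :], axis=-1)
    return d.min(axis=1), d.min(axis=0)


def hausdorff_distance(pred, gt, spacing=(1.0, 1.0)):
    """Largest nearest-boundary distance, taken over both directions."""
    d_pg, d_gp = surface_distances(pred, gt, spacing)
    return float(max(d_pg.max(), d_gp.max()))


def average_surface_distance(pred, gt, spacing=(1.0, 1.0)):
    """Mean nearest-boundary distance, pooled over both directions."""
    d_pg, d_gp = surface_distances(pred, gt, spacing)
    return float((d_pg.sum() + d_gp.sum()) / (d_pg.size + d_gp.size))


a = np.zeros((10, 10), dtype=bool)
a[2:6, 2:6] = True   # 4 x 4 square
b = np.zeros((10, 10), dtype=bool)
b[2:6, 3:7] = True   # same square shifted by one column
print(dice_score(a, b), hausdorff_distance(a, b), average_surface_distance(a, b))
# prints 0.75 1.0 0.5
```

Dice measures region overlap, whereas ASD and HD measure boundary agreement; HD in particular is sensitive to a single outlying boundary point, which is why models with similar Dice can differ clearly in HD.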
3.3 Implementation
The framework was implemented in PyTorch, running on a computer with a 2.20 GHz Intel(R) Xeon(R) E5-2630 v4 CPU and a GeForce GTX 1080 Ti GPU. We employed the released Segmentation Models [16] for the experiments. All four semantic segmentation models used efficientnet-b6 as the backbone. We used the Adam optimizer to update the network parameters. The initial learning rate was set to 5e−5 and multiplied by 0.95 every 10 epochs.
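The learning-rate schedule stated above is a step decay, which can be written in closed form. The sketch below is plain Python with a hypothetical function name; in PyTorch the same decay is available as `torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.95)`, although whether that particular scheduler was used here is an assumption.

```python
def learning_rate(epoch, base_lr=5e-5, gamma=0.95, step=10):
    """Step decay: multiply the learning rate by `gamma` every `step` epochs."""
    return base_lr * gamma ** (epoch // step)


for epoch in (0, 9, 10, 25, 100):
    print(epoch, learning_rate(epoch))
```

Under this schedule the learning rate stays at 5e−5 for epochs 0 to 9, drops to 4.75e−5 at epoch 10, and so on.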
4 Experiment
4.1 Comparisons of Different Semantic Segmentation Networks
Table 3 summarizes the LA segmentation results in terms of Dice, ASD and HD for the four semantic segmentation models. One can see that the performance of all the segmentation models decreased when the target domain was not included in the training data. This indicates that the generalization capabilities of commonly used DL-based segmentation models are still very limited. In terms of the Dice value of the LA segmentation, the performances of the four models trained on the TD are very close. However, DeepLab v3+ achieved significantly better ASD and HD than the other three models. This may be attributed to its atrous convolution and spatial pyramid pooling module, which help the network learn more spatial information. When training on the SD, the performance decrease of DeepLab v3+ was smaller than that of the other three models. Therefore, in this work DeepLab v3+ is regarded as the baseline model, and we aim to improve its generalization ability using the investigated DG schemes.
4.2 Comparisons of Post- and Pre-ablation LGE MRI
As Fig. 1 shows, pre- and post-ablation LGE MRI can have high variability in tissue appearance. Several studies have already shown that the performance of LA scar segmentation and quantification varies between pre- and post-ablation LGE MRI [6]. This is mainly because, compared with post-ablation images, the scars on pre-ablation LGE MRIs are hard to distinguish even for experts. In contrast, as far as we know, there are to date no studies comparing the LA segmentation performance for pre- and post-ablation LGE MRI. Here, we compared and analyzed the LA segmentation performance of the four basic segmentation models on pre- and post-ablation LGE MRI.
Figure 3 presents the Dice and HD values obtained by the four models on the pre- and post-ablation images, separately. One can see that all four models suffered from an accuracy deterioration caused by the domain shift on both pre- and post-ablation LGE MRIs, which is consistent with the results in Table 3. Moreover, the Dice obtained by the four models is similar on both pre- and post-ablation LGE MRIs, but DeepLab v3+ performed better in terms of HD, especially on pre-ablation data.
In summary, there is no evident performance difference between pre- and post-ablation data for the four models. However, the standard deviations of the Dice and HD values of the LA segmentation on the pre-ablation data are generally lower than those of post-ablation images. It may indicate that the segmentation model is more robust for the pre-ablation data of the multi-center LGE MRIs.
4.3 Comparisons of Different Generalization Models
Table 4 summarizes the results of the three DG schemes in comparison with the baseline DeepLab v3+ model trained on multiple source domains. One can see that all three tested generalization strategies improved on the baseline results. Among the three methods, the conventional histogram matching algorithm performed best. MID-Net and RST-Net obtained similar results in terms of Dice, but the ASD and HD of MID-Net were worse.
Figure 4 presents 2D visualizations of the results of the four methods on post- and pre-ablation LGE MRI. In the post-ablation case, the three DG schemes could identify some PV regions missed by DeepLab v3+. Similarly, in the pre-ablation subject, MID-Net and RST-Net both mitigated the segmentation errors in the mitral valve (MV) area. This suggests that the employed DG methods worked for both post- and pre-ablation cases.
5 Conclusion
In this work, we first investigated the generalization abilities of different semantic segmentation models for LA segmentation from multi-center LGE MRIs. The results showed that the performance of all the commonly used segmentation models degraded dramatically on the unknown domain. This emphasizes the importance of developing deep models with efficient inherent generalization abilities for processing LGE MRI data from different centers. We then introduced three DG strategies, which were all able to alleviate the performance decrease. Our study found that, quite surprisingly, the simple histogram matching strategy is the most effective method for DG on the LA segmentation of multi-center LGE MRI data. This may indicate that there is still large scope for further algorithmic developments in DG. In future work, we will investigate the inherent differences among multi-center LGE MRIs, and develop a targeted and effective DG strategy to solve this problem. Moreover, we will further study the domain shift between post- and pre-ablation LGE MRI from the same center, and the label variations of LGE MRIs from different centers.
References
1. Chen, C., et al.: Improving the generalizability of convolutional neural network-based segmentation on CMR images. Front. Cardiovasc. Med. 7, 105 (2020)
2. Chen, L.C., Yang, Y., Wang, J., Xu, W., Yuille, A.L.: Attention to scale: scale-aware semantic image segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3640–3649 (2016)
3. Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 801–818 (2018)
4. Dou, Q., de Castro, D.C., Kamnitsas, K., Glocker, B.: Domain generalization via model-agnostic learning of semantic features. In: Advances in Neural Information Processing Systems, pp. 6450–6461 (2019)
5. Higuchi, K., et al.: The spatial distribution of late gadolinium enhancement of left atrial magnetic resonance imaging in patients with atrial fibrillation. JACC: Clin. Electrophysiol. 4(1), 49–58 (2018)
6. Karim, R., et al.: Evaluation of current algorithms for segmentation of scar tissue from late gadolinium enhancement cardiovascular magnetic resonance of the left atrium: an open-access grand challenge. J. Cardiovasc. Magn. Reson. 15(1), 1–17 (2013). Article number: 105
7. Li, L., Weng, X., Schnabel, J.A., Zhuang, X.: Joint left atrial segmentation and scar quantification based on a DNN with spatial encoding and shape attention. In: Martel, A.L., et al. (eds.) MICCAI 2020. LNCS, vol. 12264, pp. 118–127. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59719-1_12
8. Li, L., et al.: Atrial scar quantification via multi-scale CNN in the graph-cuts framework. Med. Image Anal. 60, 101595 (2020)
9. Li, L., et al.: Random style transfer based domain generalization networks integrating shape and spatial information. In: Puyol Anton, E., et al. (eds.) STACOM 2020. LNCS, vol. 12592, pp. 208–218. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-68107-4_21
10. Ma, J.: Histogram matching augmentation for domain adaptation with application to multi-centre, multi-vendor and multi-disease cardiac image segmentation. In: Puyol Anton, E., et al. (eds.) STACOM 2020. LNCS, vol. 12592, pp. 177–186. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-68107-4_18
11. Meng, Q., et al.: Mutual information-based disentangled neural networks for classifying unseen categories in different domains: application to fetal ultrasound imaging. IEEE Trans. Med. Imaging 40(2), 722–734 (2020)
12. Njoku, A., et al.: Left atrial volume predicts atrial fibrillation recurrence after radiofrequency ablation: a meta-analysis. Ep Europace 20(1), 33–42 (2018)
13. Rhode, K., Karim, R.: ISBI 2012: left atrium fibrosis and scar segmentation challenge (2012). http://www.cardiacatlas.org/challenges/left-atrium-fibrosis-and-scar-segmentation-challenge/
14. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
15. Xiong, Z., et al.: A global benchmark of algorithms for segmenting the left atrium from late gadolinium-enhanced cardiac magnetic resonance imaging. Med. Image Anal. 67, 101832 (2020)
16. Yakubovskiy, P.: Segmentation models (2019). https://github.com/qubvel/segmentation_models
17. Zhao, J., Xiong, Z.: MICCAI 2018: Atrial segmentation challenge (2018). http://atriaseg2018.cardiacatlas.org/
18. Zhou, K., Yang, Y., Hospedales, T., Xiang, T.: Learning to generate novel domains for domain generalization. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12361, pp. 561–578. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58517-4_33
19. Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N., Liang, J.: UNet++: a nested U-Net architecture for medical image segmentation. In: Stoyanov, D., et al. (eds.) DLMIA/ML-CDS 2018. LNCS, vol. 11045, pp. 3–11. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00889-5_1
20. Zhu, L., Gao, Y., Yezzi, A., Tannenbaum, A.: Automatic segmentation of the left atrium from MR images via variational region growing with a moments-based shape prior. IEEE Trans. Image Process. 22(12), 5111–5122 (2013)
Acknowledgement
This work was funded by the National Natural Science Foundation of China (grant no. 61971142, 62111530195 and 62011540404) and the development fund for Shanghai talents (no. 2020015). L. Li was partially supported by the CSC Scholarship. JA Schnabel and VA Zimmer would like to acknowledge funding from a Wellcome Trust IEH Award (WT 102431), an EPSRC programme grant (EP/P001009/1), and the Wellcome/EPSRC Center for Medical Engineering (WT 203148/Z/16/Z).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, L., Zimmer, V.A., Schnabel, J.A., Zhuang, X. (2021). AtrialGeneral: Domain Generalization for Left Atrial Segmentation of Multi-center LGE MRIs. In: de Bruijne, M., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2021. MICCAI 2021. Lecture Notes in Computer Science(), vol 12906. Springer, Cham. https://doi.org/10.1007/978-3-030-87231-1_54
DOI: https://doi.org/10.1007/978-3-030-87231-1_54
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87230-4
Online ISBN: 978-3-030-87231-1
eBook Packages: Computer Science, Computer Science (R0)