Abstract
Reconstruction and visualization of cardiac structures play significant roles in computer-aided clinical practice as well as scientific research. With the advancement of medical imaging techniques, computing facilities, and deep learning models, automatically generating whole-heart meshes directly from medical imaging data has become feasible and shows great potential. Existing works usually employ a point cloud metric, namely the Chamfer distance, as the optimization objective when reconstructing whole-heart meshes, which nevertheless does not take the cardiac topology into consideration. Here, we propose a novel currents-represented surface loss to optimize the reconstructed mesh topology. Owing to the favorable property of currents of encoding the topology of a whole surface, our proposed pipeline delivers whole-heart reconstruction results with correct topology and comparable or even higher accuracy.
Supported by the National Natural Science Foundation of China (62071210); the Shenzhen Science and Technology Program (RCYX20210609103056042); the Shenzhen Science and Technology Innovation Committee (KCXFZ2020122117340001); the Shenzhen Basic Research Program (JCYJ20200925153847004, JCYJ20190809120205578).
1 Introduction
With the advent of advanced medical imaging technologies, such as computed tomography (CT) and magnetic resonance (MR), non-invasive visualizations of various human organs and tissues have become feasible and are widely utilized in clinical practice [8, 22]. Cardiac CT imaging and MR imaging play important roles in the understanding of cardiac anatomy, the diagnosis of cardiac diseases [20], and multimodal visualizations [9]. Potential applications range from patient-specific treatment planning, virtual surgery, and morphology assessment to biomedical simulations [2, 17]. However, in traditional procedures, visualizing human organs usually requires significant expert effort and could take up to dozens of hours depending on the specific organs of interest [7, 24], which makes large-cohort studies prohibitively expensive and limits clinical applications [16].
Empowered by the strong feature extraction ability of deep neural networks (DNNs) and the parallel computing power of graphics processing units (GPUs), automated visualizations of cardiac organs have been extensively explored in recent years [18, 24]. These methods typically follow a common processing flow that requires a series of post-processing steps to produce acceptable reconstruction results. Specifically, the organs of interest are first segmented from medical imaging data. After that, an isosurface generation algorithm, such as marching cubes [15], is utilized to create 3D visualizations, typically with a staircase appearance, followed by smoothing filters to create smooth meshes. Finally, manual corrections or connected component analyses [11] are applied to remove artifacts and improve topological correctness. The entire flow is not optimized in an end-to-end fashion, which might introduce and accumulate multi-step errors or still demand non-trivial manual effort.
In this context, automated approaches that can directly and efficiently generate cardiac shapes from medical imaging data are highly desired. Recently, various DNN works [1, 3, 4, 12, 13, 14, 19] have delved into this topic and achieved promising outcomes. In particular, the method described in [1] predicts cardiac ventricles using both cine MR and patient metadata based on statistical shape modeling (SSM). Similarly, built on SSM, [4] uses 2D cine MR slices to generate five cardiac meshes. Another approach proposed in [19] employs distortion energy to produce meshes of the aortic valves. Notably, graph neural network (GNN) based methods [12, 13, 14] are shown to be capable of simultaneously reconstructing seven cardiac organs in a single pass, producing whole-heart meshes that are suitable for computational simulations of cardiac functioning. The training processes of these methods are usually optimized via the Chamfer distance (CD) loss, a point cloud based evaluation metric. Such point cloud based losses are first calculated for each individual vertex and then averaged across all vertices, which does not take the overall mesh topology into consideration. This could result in suboptimal or even incorrect topology in the reconstructed mesh, which is undesirable.
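To make the limitation concrete, the following minimal sketch computes a symmetric Chamfer distance in plain Python. It assumes the squared-distance variant averaged per point set; other implementations use unsquared distances or sample points on the faces. The function names and the toy square mesh are hypothetical and chosen only for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two 3D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty point sets.

    Each point is matched to its nearest neighbour in the other set, and the
    squared distances are averaged per set before the two directions are added.
    """
    a_to_b = sum(min(squared_distance(p, q) for q in points_b) for p in points_a)
    b_to_a = sum(min(squared_distance(q, p) for p in points_a) for q in points_b)
    return a_to_b / len(points_a) + b_to_a / len(points_b)


# Four vertices of a unit square and two different triangulations of it.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
faces_a = [(0, 1, 2), (0, 2, 3)]  # split along one diagonal
faces_b = [(0, 1, 3), (1, 2, 3)]  # split along the other diagonal

# The faces never enter the computation, so both meshes look identical to CD.
cd_same_vertices = chamfer_distance(vertices, vertices)
```

Because only vertex positions are compared, two meshes with the same vertices but different connectivity or face orientation receive a distance of zero.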
To solve this issue, we introduce a novel surface loss that inherently considers the topology of the two to-be-compared meshes in the loss function, with the goal of optimizing the anatomical topology of the reconstructed mesh. The surface loss is defined by a computable norm on currents [6] and was originally introduced in [23] for diffeomorphic surface registration, which has extensive applicability in shape analysis and disease diagnosis [5, 21]. Motivated by its inherent ability to characterize and quantify a mesh’s topology, we make use of it to minimize the topology-aware overall difference between a reconstructed mesh and its corresponding ground truth mesh. Such currents-guided supervision enables effective and efficient whole-heart mesh reconstruction of seven cardiac organs, attaining high reconstruction accuracy and correct anatomical topology.
2 Methodology
Figure 1 illustrates the proposed end-to-end pipeline, consisting of a voxel feature extraction module (top panel) and a deformation module (middle panel). The inputs are a CT or MR volume accompanied by seven initial spherical meshes. Note that the seven initial spherical meshes are the same for all training and testing cases. A volume encoder followed by a decoder is employed as the voxel feature extraction module, which is supervised by a segmentation loss comprising binary cross entropy (BCE) and Dice. This ensures that the extracted features explicitly encode the characteristics of the regions of interest (ROIs). For the deformation module, a GNN is utilized to map coordinates of the mesh vertices, combine and map trilinearly-interpolated voxel features indexed at each mesh vertex, extract mesh features, and deform the initial meshes to reconstruct the whole-heart meshes. Three deformation blocks progressively deform the initial meshes. Each deformation block is optimized on three types of losses: a surface loss for both accuracy and topological correctness, a point cloud loss for accuracy, and three regularization losses for smoothness. The network structures of the two modules are detailed in the supplementary material.
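As an illustration of the segmentation supervision, the sketch below combines a binary cross entropy term with a soft Dice term on flattened voxel probabilities. The unweighted sum, the clamping constant `eps`, and the smoothing constant `smooth` are assumptions made for this example; the text only states that the loss comprises BCE and Dice.

```python
import math


def bce_loss(probs, labels, eps=1e-7):
    """Mean binary cross entropy over flattened voxel probabilities."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def soft_dice_loss(probs, labels, smooth=1.0):
    """One minus the soft Dice overlap between probabilities and labels."""
    intersection = sum(p * y for p, y in zip(probs, labels))
    denominator = sum(probs) + sum(labels)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def segmentation_loss(probs, labels):
    """Assumed combination: plain sum of the BCE and Dice terms."""
    return bce_loss(probs, labels) + soft_dice_loss(probs, labels)


labels = [1, 0, 1, 0]
loss_perfect = segmentation_loss([1.0, 0.0, 1.0, 0.0], labels)
loss_uncertain = segmentation_loss([0.5, 0.5, 0.5, 0.5], labels)
```

A perfect prediction drives both terms to (numerically) zero, while an uninformative prediction of 0.5 everywhere is penalized by both.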
An input CT or MR volume is passed into the voxel feature extraction module to predict binary segmentations of the to-be-reconstructed ROIs. Meanwhile, the initial spherical meshes enter the first deformation block along with the trilinearly-interpolated voxel features to predict the vertex-wise displacements of the initial meshes. Then, the updated meshes go through the following blocks for subsequent deformations. The third deformation block finally outputs the reconstructed whole-heart meshes. The three deformation blocks follow the same process, differing only in the meshes they deform and the trilinearly-interpolated voxel features they operate on. In the first deformation block, we use high-level voxel features, \(f_{3}\) and \(f_{4}\), obtained from the deepest layers of the volume encoder. In the second deformation block, the middle-level voxel features, \(f_{1}\) and \(f_{2}\), are employed. As for the last deformation block, its input meshes are usually quite accurate and only need to be locally refined. Thus, low-level voxel features are employed to guide this refining process.
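The trilinear lookup of voxel features at mesh vertices can be sketched as follows. This is a dependency-free illustration that assumes vertex coordinates are already expressed in voxel index units and that positions outside the grid are clamped to its border; the function names and the toy feature map are hypothetical, and a real pipeline would typically rely on a framework-provided sampling routine.

```python
import math


def trilinear_sample(volume, point):
    """Trilinearly interpolate a scalar grid at a continuous position.

    volume is indexed as volume[z][y][x]; point is (x, y, z) in voxel units.
    Positions outside the grid are clamped to its border.
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    x = min(max(point[0], 0.0), nx - 1.0)
    y = min(max(point[1], 0.0), ny - 1.0)
    z = min(max(point[2], 0.0), nz - 1.0)
    x0, y0, z0 = int(math.floor(x)), int(math.floor(y)), int(math.floor(z))
    x1, y1, z1 = min(x0 + 1, nx - 1), min(y0 + 1, ny - 1), min(z0 + 1, nz - 1)
    tx, ty, tz = x - x0, y - y0, z - z0
    value = 0.0
    # Blend the eight surrounding voxels with their trilinear weights.
    for zi, wz in ((z0, 1.0 - tz), (z1, tz)):
        for yi, wy in ((y0, 1.0 - ty), (y1, ty)):
            for xi, wx in ((x0, 1.0 - tx), (x1, tx)):
                value += wz * wy * wx * volume[zi][yi][xi]
    return value


def sample_vertex_features(feature_maps, vertices):
    """Return one feature vector per vertex, one entry per feature channel."""
    return [[trilinear_sample(channel, v) for channel in feature_maps] for v in vertices]


# Toy 2x2x2 single-channel "feature map" whose value is x + 2*y + 4*z.
toy_channel = [[[x + 2 * y + 4 * z for x in range(2)] for y in range(2)] for z in range(2)]
features = sample_vertex_features([toy_channel], [(0.5, 0.5, 0.5), (1.0, 0.0, 0.0)])
```

Since the toy channel is a linear function of position, trilinear interpolation reproduces it exactly, which gives a convenient sanity check.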
Surface Representation as Currents. In line with [23], we employ a generalized distribution from geometric measure theory, namely currents [6], to represent surfaces. Specifically, surfaces are represented as objects in a linear space equipped with a computable norm. A triangular mesh S embedded in \(\mathbb {R}^{3}\) can be associated with a linear functional on the space of 2-forms via the following equation
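Assuming the standard form used in [23], and with the symbols defined immediately below, this functional can be written as:

```latex
\[
S(\omega) = \int_{S} \omega(x)\left(u_{x}^{1}, u_{x}^{2}\right)\, d\sigma(x) \tag{1}
\]
```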
where, for each \(x\in {S}\), \(u_{x}^{1}\) and \(u_{x}^{2}\) form an orthonormal basis of the tangent plane at x, \(\omega (x)\) is a skew-symmetric bilinear function on \(\mathbb {R}^{3}\), and \(d\sigma (x)\) represents the basic element of surface area. Subsequently, a surface can be represented as currents in the following expression
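Splitting the integral over the faces of the mesh and replacing the 2-form by its vectorial representation, again under the assumption that the formulation follows [23], gives:

```latex
\[
S(\omega) = \sum_{f} \int_{f} \bar{\omega}(x) \cdot \left(u_{x}^{1} \times u_{x}^{2}\right)\, d\sigma_{f}(x) \tag{2}
\]
```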
where \(S(\omega )\) denotes the currents representation of the surface. f denotes each face of S and \(\sigma _{f}\) is the surface measure on f. \(\bar{\omega }(x)\) is the vectorial representation of \(\omega (x)\), with \(\cdot \) and \(\times \) respectively representing dot product and cross product. After the currents representation is established, an approximation of \(\omega \) over each face can be obtained by using its value at the face center.
Let \(f_{v}^{1}\), \(f_{v}^{2}\), \(f_{v}^{3}\) denote the three vertices of a face f. Then \(e^1=f_{v}^{2}-f_{v}^{3}\), \(e^2=f_{v}^{3}-f_{v}^{1}\), \(e^3=f_{v}^{1}-f_{v}^{2}\) are its edges, \(c(f) = \frac{1}{3}(f_{v}^{1}+f_{v}^{2}+f_{v}^{3})\) is the center of the face, and \(N(f)=\frac{1}{2}(e^{2} \times e^{3})\) is the normal vector of the face, whose length equals the face area. Approximating \(\omega \) over the face by its value at the face center results in \(S(\omega )\approx \sum _{f}\bar{\omega }(c(f))\cdot N(f)\). In fact, the approximation is a sum of linear evaluation functionals \(C(S)=\sum _{f}\delta _{c(f)}^{N(f)}\) associated with a Reproducing Kernel Hilbert Space (RKHS) under the constraints presented in [23]. Thus, \(S_{\varepsilon }\), the discrepancy between two surfaces S and T, can be approximately calculated via the RKHS as below
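Expanding the squared norm of \(C(S)-C(T)\) with the inner-product identity \(\left\langle \delta _{x}^{\xi },\delta _{y}^{\eta }\right\rangle _{W^{*}}=k_{W}(x,y)\xi \cdot \eta \) from [23] yields three sums. The expression below is a reconstruction from the surrounding definitions, with f, g indexing the faces of S and q, r indexing the faces of T:

```latex
\[
\begin{aligned}
S_{\varepsilon} = \left\| C(S) - C(T) \right\|_{W^{*}}^{2}
  ={}& \sum_{f,g} N(f)^{T}\, k_{W}\big(c(g), c(f)\big)\, N(g) \\
  &- 2 \sum_{f,q} N(f)^{T}\, k_{W}\big(c(q), c(f)\big)\, N(q) \\
  &+ \sum_{q,r} N(q)^{T}\, k_{W}\big(c(q), c(r)\big)\, N(r)
\end{aligned}
\tag{3}
\]
```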
where \(W^{*}\) is the dual space of a Hilbert space \((W,\left\langle \cdot ,\cdot \right\rangle _{W})\) of differential 2-forms and \(\left| \left| \; \right| \right| ^{2}\) is the squared \(l_{2}\)-norm. \(()^{T}\) denotes the transpose operator. f, g index the faces of S and q, r index the faces of T. \(k_{W}\) is an isometry between \(W^{*}\) and W, and we have \(\left\langle \delta _{x}^{\xi },\delta _{y}^{\eta }\right\rangle _{W^{*}}=k_{W}(x,y)\xi \cdot \eta \) [23]. The first and third terms enforce the structural integrity of the two surfaces, while the middle term penalizes the geometric and spatial discrepancies between them. With this favorable property, Eq. 3 serves the purpose of topological correctness, which is the key to the proposed pipeline.
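The discrete quantities entering this discrepancy, the face centers \(c(f)\) and the area-weighted normals \(N(f)\), can be computed in a few lines. The snippet below is a minimal, dependency-free sketch; the function names and the tetrahedron used as a toy closed surface are assumptions made for this example and are not taken from the authors' implementation.

```python
def subtract(a, b):
    """Component-wise difference of two 3D vectors."""
    return tuple(x - y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def face_center(v1, v2, v3):
    """c(f): the centroid of the triangle."""
    return tuple((a + b + c) / 3.0 for a, b, c in zip(v1, v2, v3))


def face_normal(v1, v2, v3):
    """N(f) = 0.5 * (e2 x e3); its length equals the triangle area."""
    e2 = subtract(v3, v1)
    e3 = subtract(v1, v2)
    return tuple(0.5 * x for x in cross(e2, e3))


def discretize_as_currents(vertices, faces):
    """Return one (center, normal) pair per face: the Dirac terms in C(S)."""
    pairs = []
    for i, j, k in faces:
        v1, v2, v3 = vertices[i], vertices[j], vertices[k]
        pairs.append((face_center(v1, v2, v3), face_normal(v1, v2, v3)))
    return pairs


# Toy closed surface: a tetrahedron with outward-oriented faces.
tetra_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tetra_faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
pairs = discretize_as_currents(tetra_vertices, tetra_faces)

# For a closed, consistently oriented mesh the area-weighted normals cancel.
normal_sum = tuple(sum(n[axis] for _, n in pairs) for axis in range(3))
```

The vertex order of each face determines the sign of its normal, which is how face orientation enters the currents representation.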
Surface Loss. As in [23], we choose a Gaussian kernel as the instance of \(k_{W}\), namely \(k_{W}(x,y)=\exp (-\frac{\left\| x-y \right\| ^2}{\sigma _{W}^{2}})\), where x and y are the centers of two faces and \(\sigma _{W}\) is a scale parameter that controls the range over which two faces affect each other. Therefore, the surface loss can be expressed as
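Substituting the Gaussian kernel into the three-term discrepancy of Eq. 3, and using the face indices defined just below, the loss presumably takes the following form (a reconstruction from the surrounding definitions):

```latex
\[
\begin{aligned}
L_{surface} ={}& \sum_{t_{1},t_{2}} \exp\!\left(-\frac{\left\| c(t_{1})-c(t_{2}) \right\|^{2}}{\sigma_{W}^{2}}\right) N(t_{1})^{T} N(t_{2}) \\
&- 2 \sum_{t,p} \exp\!\left(-\frac{\left\| c(t)-c(p) \right\|^{2}}{\sigma_{W}^{2}}\right) N(t)^{T} N(p) \\
&+ \sum_{p_{1},p_{2}} \exp\!\left(-\frac{\left\| c(p_{1})-c(p_{2}) \right\|^{2}}{\sigma_{W}^{2}}\right) N(p_{1})^{T} N(p_{2})
\end{aligned}
\tag{4}
\]
```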
where \(t_{1}\), \(t_{2}\), t and \(p_{1}\), \(p_{2}\), p respectively index faces on the reconstructed surfaces \(S_{R}\) and those on the corresponding ground truth surfaces \(S_{T}\). \(L_{surface}\) considers not only the position of each face on the surfaces but also its direction. When the reconstructed surfaces are exactly the same as the ground truth, the surface loss \(L_{surface}\) is 0. Otherwise, \(L_{surface}\) is a bounded positive value [23]. Minimizing \(L_{surface}\) drives the reconstructed surfaces progressively closer to the ground truth as training proceeds.
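A direct, unoptimized evaluation of this loss can be sketched as below. It is a plain-Python illustration with a cost proportional to the product of the two face counts; the helper names, the tetrahedron test mesh, and the value of `sigma_w` are assumptions for this example, and a practical implementation would vectorize the pairwise sums on the GPU.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def centers_and_normals(vertices, faces):
    """Face centers c(f) and area-weighted normals N(f) = 0.5 * (e2 x e3)."""
    result = []
    for i, j, k in faces:
        v1, v2, v3 = vertices[i], vertices[j], vertices[k]
        center = tuple((a + b + c) / 3.0 for a, b, c in zip(v1, v2, v3))
        normal = tuple(0.5 * x for x in _cross(_sub(v3, v1), _sub(v1, v2)))
        result.append((center, normal))
    return result


def currents_inner(pairs_a, pairs_b, sigma_w):
    """Sum over all face pairs of k_W(c_a, c_b) * (N_a . N_b)."""
    total = 0.0
    for center_a, normal_a in pairs_a:
        for center_b, normal_b in pairs_b:
            diff = _sub(center_a, center_b)
            kernel = math.exp(-_dot(diff, diff) / sigma_w ** 2)
            total += kernel * _dot(normal_a, normal_b)
    return total


def surface_loss(recon, truth, sigma_w):
    """Three-term currents discrepancy between two (vertices, faces) meshes."""
    r = centers_and_normals(*recon)
    t = centers_and_normals(*truth)
    return (currents_inner(r, r, sigma_w)
            - 2.0 * currents_inner(r, t, sigma_w)
            + currents_inner(t, t, sigma_w))


verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
flipped = [(a, c, b) for a, b, c in faces]  # same vertices, reversed orientation

loss_identical = surface_loss((verts, faces), (verts, faces), sigma_w=1.0)
loss_flipped = surface_loss((verts, flipped), (verts, faces), sigma_w=1.0)
```

In this toy case the loss vanishes for identical meshes but is strictly positive when only the face orientation is reversed, even though the vertex sets coincide and a vertex-based point cloud metric would report no difference.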
Figure 2 illustrates how \(\sigma _{W}\) controls the affecting scale of a face on a surface. The three surfaces are identical meshes of a left atrium structure except for the affecting scale (shown in different colors) on them. Three colored circles (red, blue, and green) represent the centers of three faces on the surfaces, and the arrowed vectors on these circles denote the corresponding face normals. The color bar ranges from 0 to 1, with 0 representing no effect and 1 representing the most significant effect. In Fig. 2, the distance between the blue circle and the red one is smaller than that between the red circle and the green one, and the effect between the red circle and the blue one is accordingly larger than that between the red circle and the green one. As \(\sigma _{W}\) varies from a large value to a small one, the effects between the red face and the remaining faces become increasingly small. In this way, we are able to control the acting scale of the surface loss by changing the value of \(\sigma _{W}\). Assigning \(\sigma _{W}\) a value that covers the entire surface results in a global topology encoding of the surface, while assigning a small value that only covers neighbors results in a topology encoding that focuses on local geometries.
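The effect of \(\sigma _{W}\) can be checked numerically with the Gaussian kernel alone. In the short sketch below, the two distances and the three values of `sigma_w` are made-up numbers chosen only to show the trend.

```python
import math


def kernel_weight(distance, sigma_w):
    """Gaussian interaction weight between two face centers."""
    return math.exp(-(distance ** 2) / (sigma_w ** 2))


# Made-up distances from a reference face to a near face and a far face.
near_distance, far_distance = 0.5, 2.0
for sigma_w in (8.0, 1.0, 0.25):
    near = kernel_weight(near_distance, sigma_w)
    far = kernel_weight(far_distance, sigma_w)
    print(f"sigma_W={sigma_w}: near={near:.4f}, far={far:.4f}")
```

For a large `sigma_w` both faces interact almost fully with the reference face, whereas for a small one only the nearby face retains a noticeable weight.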
Loss Function. In addition to the surface loss introduced above, we also employ two segmentation losses \(L_{BCE}\) and \(L_{Dice}\), one point cloud loss \(L_{CD}\), and three regularization losses \(L_{laplace}\), \(L_{edge}\), and \(L_{normal}\) that follow [13]. The total loss function can be expressed as:
where \(w_{s}\) is the weight for the segmentation loss, and \(w_{1}\), \(w_{2}\), \(w_{3}\) and \(w_{4}\) are respectively the weights for the surface loss, the Chamfer distance, the Laplace loss, and the edge loss. The geometric mean is adopted to combine the five individual mesh losses to accommodate their different magnitudes.
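The motivation for a geometric rather than arithmetic combination can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the weights are treated as exponents, the segmentation term is added with weight `w_s`, the normal loss receives the same exponent as the other terms, and the loss values are made up. It does not claim to reproduce the exact total loss of this paper.

```python
import math


def weighted_geometric_combination(losses, weights, eps=1e-12):
    """Combine positive loss values as prod_i loss_i ** weight_i (in log space)."""
    log_sum = sum(w * math.log(max(loss, eps)) for loss, w in zip(losses, weights))
    return math.exp(log_sum)


def total_loss(seg_loss, mesh_losses, mesh_weights, w_s):
    """Hypothetical total: weighted segmentation term plus combined mesh term."""
    return w_s * seg_loss + weighted_geometric_combination(mesh_losses, mesh_weights)


# Mesh losses of very different magnitudes (made-up numbers).
mesh_losses = [250.0, 0.004, 0.02, 0.1, 0.5]
mesh_weights = [0.2, 0.2, 0.2, 0.2, 0.2]  # equal exponents give the plain geometric mean
combined = weighted_geometric_combination(mesh_losses, mesh_weights)

# Halving any single term scales the result by the same factor 0.5 ** 0.2,
# regardless of how large or small that term is.
halved_large = weighted_geometric_combination([125.0, 0.004, 0.02, 0.1, 0.5], mesh_weights)
halved_small = weighted_geometric_combination([250.0, 0.002, 0.02, 0.1, 0.5], mesh_weights)
```

With an arithmetic sum, the largest term would dominate the gradient; in the geometric form, a relative improvement in any term has the same relative effect on the combined value.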
\(L_{seg}\), which comprises \(L_{BCE}\) and \(L_{Dice}\), ensures useful feature learning of the ROIs. \(L_{surface}\) enforces the integrity of the reconstructed meshes and makes them topologically similar to the ground truth. \(L_{CD}\) brings the point cloud representation of the reconstructed meshes close to that of the ground truth. Additionally, \(L_{laplace}\), \(L_{edge}\), and \(L_{normal}\) are employed to promote the smoothness of the reconstructed meshes.
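The exact definitions of the regularizers follow [13] and are not restated here. As a rough illustration of what such smoothness terms measure, the sketch below uses two common formulations, a mean squared edge length and a uniform Laplacian term. Both formulations and all function names are assumptions for this example; the normal consistency term is omitted.

```python
def mesh_edges(faces):
    """Unique undirected edges of a triangle mesh as sorted index pairs."""
    edges = set()
    for i, j, k in faces:
        for a, b in ((i, j), (j, k), (k, i)):
            edges.add((min(a, b), max(a, b)))
    return sorted(edges)


def squared_norm(v):
    return sum(x * x for x in v)


def edge_length_loss(vertices, faces):
    """Mean squared edge length; penalizes long or stretched edges."""
    edges = mesh_edges(faces)
    total = sum(squared_norm([p - q for p, q in zip(vertices[a], vertices[b])])
                for a, b in edges)
    return total / len(edges)


def uniform_laplacian_loss(vertices, faces):
    """Mean squared offset of each vertex from the centroid of its neighbours.

    Assumes every vertex belongs to at least one face.
    """
    neighbours = {index: set() for index in range(len(vertices))}
    for a, b in mesh_edges(faces):
        neighbours[a].add(b)
        neighbours[b].add(a)
    total = 0.0
    for index, ring in neighbours.items():
        centroid = [sum(vertices[n][axis] for n in ring) / len(ring) for axis in range(3)]
        total += squared_norm([v - c for v, c in zip(vertices[index], centroid)])
    return total / len(vertices)


tetra_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tetra_faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
edge_term = edge_length_loss(tetra_vertices, tetra_faces)
laplace_term = uniform_laplacian_loss(tetra_vertices, tetra_faces)
```

Both terms depend only on vertex positions and connectivity, so they regularize the deformation without reference to the ground truth mesh.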
3 Experiments
Datasets and Preprocessing. We evaluate and validate our method on the publicly accessible MM-WHS (multi-modality whole heart segmentation) dataset [24], which contains 3D cardiac images of both CT and MR modalities. 20 cardiac CT volumes and 20 cardiac MR volumes are provided in the training set. 40 held-out cardiac CT volumes and 40 held-out cardiac MR volumes are offered in the testing set. All training and testing cases are accompanied by expert-labeled segmentations of seven heart structures: the left ventricle (LV), the right ventricle (RV), the left atrium (LA), the right atrium (RA), the myocardium of the LV (Myo), the ascending aorta (Ao), and the pulmonary artery (PA). For preprocessing, we follow [13] to perform resizing, intensity normalization, and data augmentation (random rotation, scaling, shearing, and elastic warping) for each training case. Data characteristics and preprocessing details are summarized in the supplementary material.
Evaluation Metrics. In order to compare with existing state-of-the-art (SOTA) methods, four metrics as in [13] are employed for evaluation, including Dice, Jaccard, average symmetric surface distance (ASSD), and Hausdorff distance (HD). Furthermore, intersected mesh facets are detected by TetGen [10] and used for quantifying self-intersection (SI).
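For reference, the four accuracy metrics can be sketched on toy data as follows. The snippet treats segmentations as sets of voxel indices and surfaces as point samples, which is a simplifying assumption; the officially provided evaluation executables may differ in details such as spacing handling and surface extraction. Function names and inputs are hypothetical.

```python
import math


def dice_and_jaccard(pred_voxels, true_voxels):
    """Overlap scores for two non-empty sets of voxel index tuples."""
    pred, true = set(pred_voxels), set(true_voxels)
    overlap = len(pred & true)
    dice = 2.0 * overlap / (len(pred) + len(true))
    jaccard = overlap / len(pred | true)
    return dice, jaccard


def nearest_distances(source, target):
    """Distance from every source point to its closest target point."""
    return [min(math.dist(p, q) for q in target) for p in source]


def assd_and_hausdorff(surface_a, surface_b):
    """Average symmetric surface distance and Hausdorff distance of point samples."""
    a_to_b = nearest_distances(surface_a, surface_b)
    b_to_a = nearest_distances(surface_b, surface_a)
    assd = (sum(a_to_b) + sum(b_to_a)) / (len(a_to_b) + len(b_to_a))
    hausdorff = max(max(a_to_b), max(b_to_a))
    return assd, hausdorff


pred = {(0, 0, 0), (1, 0, 0), (2, 0, 0)}
true = {(1, 0, 0), (2, 0, 0), (3, 0, 0)}
dice, jaccard = dice_and_jaccard(pred, true)

surface_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
surface_b = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
assd, hausdorff = assd_and_hausdorff(surface_a, surface_b)
```

Dice and Jaccard reward volumetric overlap (higher is better), whereas ASSD and HD measure average and worst-case surface deviation (lower is better).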
Results. We compare our method with two SOTA methods, MeshDeform [13] and Voxel2Mesh, on the five evaluation metrics. Ours and MeshDeform [13] are trained on the same dataset consisting of 16 CT and 16 MR volumes randomly selected from the MM-WHS training set, with 60 augmentations for each, and the remaining 4 CT and 4 MR volumes are used for validation. Evaluations are performed on the encrypted testing set with the officially provided executables. We reimplement MeshDeform [13] in PyTorch according to the publicly available TensorFlow version. Please note that the Voxel2Mesh results are directly obtained from [13] since its code has not been open sourced yet. Training settings are detailed in the supplementary material.
Table 1 shows evaluation results on the seven heart structures and the whole heart of the MM-WHS CT testing set. Our method achieves the best results in most entries. For SI, Voxel2Mesh holds the best results in most entries because of the unpooling operations in each of its deformation procedures, in which topological information is additionally used. However, as described in [13], Voxel2Mesh may easily encounter out-of-memory errors because its number of vertices increases along the reconstruction process. More results for the MR data can be found in the supplementary material.
Figure 3 shows the best and the worst CT results for MeshDeform with respect to Dice, together with our results on the same cases. Notably, the best case for MeshDeform is not the best case for our method. For that best case of MeshDeform, we can see obvious folded areas on the mesh of PA, while our method yields more satisfactory visualization results. As for the worst case, both methods obtain unsatisfactory visualizations. However, the two structures (PA and RV) obtained from MeshDeform intersect with each other, leading to significant topological errors. Our method does not have such topology issues.
Ablation Study. For the ablation study, we train a model without the surface loss while keeping all other settings the same. Table 2 shows the ablation results on the CT data, which support the effectiveness of the surface loss.
4 Conclusion
In this work, we propose and validate a whole-heart mesh reconstruction method incorporating a novel surface loss. Due to the intrinsic and favorable property of the currents representation, our method is able to generate accurate meshes with the correct topology.
References
1. Attar, R., et al.: 3D cardiac shape prediction with deep neural networks: simultaneous use of images and patient metadata. In: Shen, D., et al. (eds.) MICCAI 2019, Part II. LNCS, vol. 11765, pp. 586–594. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32245-8_65
2. Bucioli, A.A., et al.: Holographic real time 3D heart visualization from coronary tomography for multi-place medical diagnostics. In: 2017 IEEE 15th International Conference on Dependable, Autonomic and Secure Computing, 15th International Conference on Pervasive Intelligence and Computing, 3rd International Conference on Big Data Intelligence and Computing and Cyber Science and Technology Congress (DASC/PiCom/DataCom/CyberSciTech), pp. 239–244. IEEE, 6 November 2017
3. Beetz, M., Banerjee, A., Grau, V.: Biventricular surface reconstruction from cine MRI contours using point completion networks. In: 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI), pp. 105–109. IEEE, April 2021
4. Banerjee, A., Zacur, E., Choudhury, R.P., Grau, V.: Automated 3D whole-heart mesh reconstruction from 2D cine MR slices using statistical shape model. In: 2022 44th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 1702–1706. IEEE, July 2022
5. Charon, N., Younes, L.: Shape spaces: from geometry to biological plausibility. In: Handbook of Mathematical Models and Algorithms in Computer Vision and Imaging: Mathematical Imaging and Vision, pp. 1929–1958 (2023)
6. De Rham, G.: Variétés différentiables: formes, courants, formes harmoniques, vol. 3. Editions Hermann (1973)
7. Fischl, B.: FreeSurfer. Neuroimage 62(2), 774–781 (2012)
8. Garvey, C.J., Hanlon, R.: Computed tomography in clinical practice. BMJ 324(7345), 1077–1080 (2002)
9. González Izard, S., Sánchez Torres, R., Alonso Plaza, O., Juanes Mendez, J.A., García-Peñalvo, F.J.: NextMed: automatic imaging segmentation, 3D reconstruction, and 3D model visualization platform using augmented and virtual reality. Sensors 20(10), 2962 (2020)
10. Si, H.: TetGen, a Delaunay-based quality tetrahedral mesh generator. ACM Trans. Math. Softw. 41(2), 11 (2015)
11. He, L., Ren, X., Gao, Q., Zhao, X., Yao, B., Chao, Y.: The connected-component labeling problem: a review of state-of-the-art algorithms. Pattern Recogn. 70, 25–43 (2017)
12. Kong, F., Shadden, S.C.: Whole heart mesh generation for image-based computational simulations by learning free-from deformations. In: de Bruijne, M., et al. (eds.) MICCAI 2021. LNCS, vol. 12904, pp. 550–559. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-87202-1_53
13. Kong, F., Wilson, N., Shadden, S.: A deep-learning approach for direct whole-heart mesh reconstruction. Med. Image Anal. 74, 102222 (2021)
14. Kong, F., Shadden, S.C.: Learning whole heart mesh generation from patient images for computational simulations. IEEE Trans. Med. Imaging 42(2), 533–545 (2022)
15. Lorensen, W.E., Cline, H.E.: Marching cubes: a high resolution 3D surface construction algorithm. ACM SIGGRAPH Comput. Graph. 21(4), 163–169 (1987)
16. Mittal, R., et al.: Computational modeling of cardiac hemodynamics: current status and future outlook. J. Comput. Phys. 305, 1065–1082 (2016)
17. Prakosa, A., et al.: Personalized virtual-heart technology for guiding the ablation of infarct-related ventricular tachycardia. Nat. Biomed. Eng. 2(10), 732–740 (2018)
18. Painchaud, N., Skandarani, Y., Judge, T., Bernard, O., Lalande, A., Jodoin, P.M.: Cardiac segmentation with strong anatomical guarantees. IEEE Trans. Med. Imaging 39(11), 3703–3713 (2020)
19. Pak, D.H., et al.: Distortion energy for deep learning-based volumetric finite element mesh generation for aortic valves. In: de Bruijne, M., et al. (eds.) MICCAI 2021, Part VI. LNCS, vol. 12906, pp. 485–494. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-87231-1_47
20. Stokes, M.B., Roberts-Thomson, R.: The role of cardiac imaging in clinical practice. Aust. Prescriber 40(4), 151 (2017)
21. Tang, X., Holland, D., Dale, A.M., Younes, L., Miller, M.I., Initiative, A.D.N.: Shape abnormalities of subcortical and ventricular structures in mild cognitive impairment and Alzheimer’s disease: detecting, quantifying, and predicting. Hum. Brain Mapping 35(8), 3701–3725 (2014)
22. Tsougos, I.: Advanced MR Neuroimaging: from Theory to Clinical Practice. CRC Press, Boca Raton (2017)
23. Vaillant, M., Glaunès, J.: Surface matching via currents. In: Christensen, G.E., Sonka, M. (eds.) IPMI 2005. LNCS, vol. 3565, pp. 381–392. Springer, Heidelberg (2005). https://doi.org/10.1007/11505730_32
24. Zhuang, X., Shen, J.: Multi-scale patch and multi-modality atlases for whole heart segmentation of MRI. Med. Image Anal. 31, 77–87 (2016)
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Yang, H., Tam, R., Tang, X. (2023). Whole-Heart Reconstruction with Explicit Topology Integrated Learning. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14225. Springer, Cham. https://doi.org/10.1007/978-3-031-43987-2_11
DOI: https://doi.org/10.1007/978-3-031-43987-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43986-5
Online ISBN: 978-3-031-43987-2
eBook Packages: Computer Science, Computer Science (R0)