Abstract
Finite Element Analysis (FEA) is useful for simulating Transcatheter Aortic Valve Replacement (TAVR), but has a significant bottleneck at input mesh generation. Existing automated methods for imaging-based valve modeling often make heavy assumptions about imaging characteristics and/or output mesh topology, limiting their adaptability. In this work, we propose a deep learning-based deformation strategy for producing aortic valve FE meshes from noisy 3D CT scans of TAVR patients. In particular, we propose a novel image analysis problem formulation that allows for training of mesh prediction models using segmentation labels (i.e. weak supervision), and identify a unique set of losses that improve model performance within this framework. Our method can handle images with large amounts of calcification and low contrast, and is compatible with predicting both surface and volumetric meshes. The predicted meshes have good surface and correspondence accuracy, and produce reasonable FEA results.
1 Introduction
Transcatheter Aortic Valve Replacement (TAVR) is an emerging minimally invasive treatment option for aortic stenosis [22]. In recent years, studies have explored Finite Element Analysis (FEA) for simulating TAVR from pre-operative patient images, and have shown promising results for predicting patient outcome and finding better treatment strategies [3]. However, a significant bottleneck remains in producing FE meshes from patient images: the manual process takes several hours per patient and requires expert knowledge of both the anatomy and the meshing techniques and requirements.
Automated solutions for aortic valve modeling have been proposed. Ionasec et al. [7] and Liang et al. [11] used a combination of landmark and boundary detection driven by intensity-based features to produce valve meshes with a predefined topology. Pouch et al. [16] used multi-atlas segmentation and medial modeling to match the template mesh to the predicted segmentation. Ghesu et al. [5] used marginal space deep learning to localize an aortic wall template mesh and subsequently deformed it along the surface normal. Although various approaches have demonstrated success with valve modeling, they have drawbacks such as heavy reliance on intensity changes along the valve structures, extensive assumptions about valve geometry, and limited adaptability to FE meshes due to assumptions about the output mesh topology.
To address these issues, we propose a deep learning-based deformation strategy for predicting FE meshes from noisy 3D CT scans of TAVR patients. Our main contributions are three-fold: (1) We propose a novel image analysis problem formulation that allows for weakly supervised training of image-to-mesh prediction models, where training is performed with segmentation labels instead of mesh labels. (2) We make minimal assumptions in defining this formulation, so it can easily adapt to various imaging conditions and desired output mesh topology (even from surface mesh to volumetric mesh). (3) We identify a unique set of losses that improves model performance within this framework.
2 Methods
2.1 Possible Problem Formulation: Meshing from Segmentation
Let I be an image, and let S and M be the corresponding segmentation and mesh outputs, respectively. Considering the sequential generation steps \(I \rightarrow S \rightarrow M\), we can define two mapping functions \(f(I) = S\) and \(g(S) = M\) (Fig. 1).
The most common choices for f(I) and g(S) are voxel-wise segmentation of a structure's interior volume and marching cubes, respectively. However, for thin structures such as valve leaflets, naïve surface meshing fails to provide the desired open surface meshes with accurate attachment points (Fig. 1). It is also problematic for the aortic wall, which requires tube-like openings. Therefore, to obtain the correct surface, we must extract structural information from S via curve fitting [11], medial modeling [16], or manual labeling. This makes it extremely difficult to define a general g(S) without making heavy assumptions about the anatomy and output mesh, even when provided with a manually defined S at test time.
2.2 Proposed Problem Formulation: Mesh Template Matching
Instead of solving for g(S), we propose a problem formulation that we refer to as Mesh Template Matching (MTM), summarized schematically in Fig. 2. Here, we find the deformation field \(\phi ^*\):

\(\phi ^* = \mathop {\mathrm {arg\,min}}\limits _{\phi } \; \mathcal {L}(\phi (M_0), M) \qquad \mathrm {(1)}\)
where M and \(M_0\) are target and template meshes, respectively. We use a convolutional neural network (CNN) as our function approximator \(h_\theta (I; S_0, M_0) = \phi \), where I is the image, \(S_0\) is the segmentation template (paired to \(M_0\)), and \(\theta \) are the network parameters. Then, we solve for \(\theta \) that minimizes the loss:

\(\theta ^* = \mathop {\mathrm {arg\,min}}\limits _{\theta } \; \mathbb {E}_{D} \big [ \mathcal {L}(\phi (M_0), M) \big ] , \quad \phi = h_\theta (I; S_0, M_0) \qquad \mathrm {(2)}\)
where D is the training set distribution. We propose two different variations of MTM, as detailed below.
2.2.1 MTM
For the vanilla MTM, we initially defined \(\mathcal {L}\) from Eq. 1 as:

\(\mathcal {L} = \mathcal {L}_{acc}(\phi (M_0), M) + \lambda \, \mathcal {L}_{smooth}(\phi ) \qquad \mathrm {(3)}\)
where \(\phi (M_0)\) is the deformed template, \(\mathcal {L}_{acc}\) is the spatial accuracy loss, and \(\mathcal {L}_{smooth}\) is the field smoothness loss with a scaling hyperparameter \(\lambda \). From here, we removed the need for ground truth M with the following steps:

\(\mathcal {L}_{acc}(\phi (M_0), M) \qquad \mathrm {(4)}\)

\(= \mathcal {L}_{acc}(\phi (g^*(S_0)), g^*(S)) \qquad \mathrm {(5)}\)

\(\approx \mathcal {L}_{acc}(\phi (\hat{g}(S_0)), \hat{g}(S)) \qquad \mathrm {(6)}\)

\(\Rightarrow \; \mathcal {L} \approx \mathcal {L}_{acc}(\phi (\hat{g}(S_0)), \hat{g}(S)) + \lambda \, \mathcal {L}_{smooth}(\phi ) \qquad \mathrm {(7)}\)
where \(g^*\) is the ideal meshing function (for which the topology is defined by the template mesh), \(\hat{g}\) is marching cubes, and S and \(S_0\) are the target and template segmentation volumes. Our key approximation step (Eq. 6) makes two important assumptions: (1) the \(g^*\) mesh surface is in close proximity to the \(\hat{g}\) mesh surface by Euclidean distance, and (2) \(\phi \) is smooth with respect to Euclidean space. The first assumption is reasonable because ground truth meshes are often created using segmentation labels as intermediate steps, and the second assumption is enforced by \(\mathcal {L}_{smooth}\) and the choice of \(\phi \) (discussed further in Sect. 2.3).
Common choices for the spatial accuracy loss are mean surface distance (e.g. Chamfer distance) and volume overlap (e.g. Dice). Since Dice and other segmentation losses are typically lenient towards errors at segmentation boundaries (and meshes require accuracy at boundaries), we used the Chamfer distance:

\(\mathcal {L}_{acc} = \frac{1}{|P|} \sum \limits _{p \in P} \min \limits _{q \in Q} \Vert p - q \Vert _2^2 + \frac{1}{|Q|} \sum \limits _{q \in Q} \min \limits _{p \in P} \Vert p - q \Vert _2^2 \qquad \mathrm {(8)}\)
where P and Q are points sampled on surfaces of \(\phi (\hat{g}(S_0))\) and \(\hat{g}(S)\), respectively. We experimented with adding Dice [13] to the loss, but observed no significant improvement in performance.
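As a concrete reference, the symmetric Chamfer distance between two sampled point sets can be sketched in NumPy as follows. This is a minimal illustration of the loss described above, not the Pytorch3D implementation used in the paper:

```python
import numpy as np

def chamfer_distance(P, Q):
    """Symmetric Chamfer distance between point sets P (N, 3) and Q (M, 3).

    For every point in P, take the squared distance to its nearest
    neighbor in Q, and vice versa; average each direction and sum.
    """
    # Pairwise squared Euclidean distances, shape (N, M)
    d2 = np.sum((P[:, None, :] - Q[None, :, :]) ** 2, axis=-1)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()
```

Identical point sets give a distance of zero, and the loss grows smoothly as the surfaces separate, which makes it a convenient training signal.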
For the field smoothness loss, we used the bending energy term to penalize non-affine fields [19]:

\(\mathcal {L}_{smooth} = \frac{1}{V} \sum \limits _{x}^{X} \sum \limits _{y}^{Y} \sum \limits _{z}^{Z} \left[ \left( \frac{\partial ^2 \phi }{\partial x^2} \right) ^2 + \left( \frac{\partial ^2 \phi }{\partial y^2} \right) ^2 + \left( \frac{\partial ^2 \phi }{\partial z^2} \right) ^2 + 2 \left( \frac{\partial ^2 \phi }{\partial x \partial y} \right) ^2 + 2 \left( \frac{\partial ^2 \phi }{\partial x \partial z} \right) ^2 + 2 \left( \frac{\partial ^2 \phi }{\partial y \partial z} \right) ^2 \right] \qquad \mathrm {(9)}\)
where V is the total number of voxels and X, Y, Z are the number of voxels in each dimension. We experimented with adding gradient magnitude [1] and field magnitude [10] to the loss, but observed no significant improvement in performance.
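The bending energy can be illustrated with a small finite-difference sketch, assuming unit voxel spacing. Its defining property, which the sketch preserves, is that any affine field incurs zero penalty:

```python
import numpy as np

def bending_energy(phi):
    """Discrete bending energy of a displacement field phi of shape
    (3, X, Y, Z), approximated with finite differences (unit spacing).

    Sums squared second derivatives plus twice the squared mixed
    derivatives, averaged over voxels; zero for any affine field.
    """
    total = 0.0
    axes = (1, 2, 3)  # spatial axes of the (3, X, Y, Z) tensor
    for a in axes:
        d2 = np.diff(phi, n=2, axis=a)  # pure second derivative
        total += (d2 ** 2).sum()
    for i, a in enumerate(axes):
        for b in axes[i + 1:]:
            dab = np.diff(np.diff(phi, axis=a), axis=b)  # mixed derivative
            total += 2.0 * (dab ** 2).sum()
    return total / phi[0].size
```

Penalizing only second derivatives (rather than field magnitude or gradient) is what lets the template translate, rotate, and scale freely while still discouraging local folding.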
2.2.2 MTMgeo
For the second variation of MTM, referred to as MTMgeo, we replaced \(\mathcal {L}_{smooth}\) with \(\mathcal {L}_{geo}\) to preserve various desired geometric qualities of the deformed template mesh:

\(\mathcal {L} = \mathcal {L}_{acc}(\phi (M_0), M) + \mathcal {L}_{geo}(\phi (M_0)) \qquad \mathrm {(10)}\)
where we apply identical steps as Eq. 4–7 to get:

\(\mathcal {L} \approx \mathcal {L}_{acc}(\phi (\hat{g}(S_0)), \hat{g}(S)) + \mathcal {L}_{geo}(\phi (M_0)) \qquad \mathrm {(11)}\)
The geometric quality loss is a weighted sum of three different losses:

\(\mathcal {L}_{geo} = \lambda _1 \mathcal {L}_{norm} + \lambda _2 \mathcal {L}_{lap} + \lambda _3 \mathcal {L}_{edge} \qquad \mathrm {(12)}\)
where \(\lambda _i\) are scaling hyperparameters, \(\mathcal {L}_{norm}\) is face normal consistency loss, \(\mathcal {L}_{lap}\) is Laplacian smoothing loss, and \(\mathcal {L}_{edge}\) is edge correspondence loss.
\(\mathcal {L}_{norm} = \frac{1}{|\mathcal {N}_f|} \sum \limits _{(\mathbf {n}_{fi}, \mathbf {n}_{fj}) \in \mathcal {N}_f} \left( 1 - \cos (\mathbf {n}_{fi}, \mathbf {n}_{fj}) \right) \qquad \mathrm {(13)}\)

where \(\mathbf {n}_{fi}\) is the normal vector calculated at a given face and \(\mathcal {N}_f\) is the set of all pairs of neighboring faces’ normals within \(\phi (M_0)\).
\(\mathcal {L}_{lap} = \frac{1}{|\mathcal {V}|} \sum \limits _{\mathbf {v}_i \in \mathcal {V}} \Big \Vert \mathbf {v}_i - \frac{1}{|\mathcal {N}(\mathbf {v}_i)|} \sum \limits _{\mathbf {v}_j \in \mathcal {N}(\mathbf {v}_i)} \mathbf {v}_j \Big \Vert _2 \qquad \mathrm {(14)}\)

where \(\mathcal {V}\) is the set of mesh vertices and \(\mathcal {N}(\mathbf {v}_i)\) represents neighbors of \(\mathbf {v}_i\). The norm represents the magnitude of the differential coordinates, which approximates the local mean curvature.
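For concreteness, both regularizers can be sketched in NumPy for a triangular mesh. The paper computes them with Pytorch3D; the data layout below (index arrays for faces, face pairs, and 1-rings) is an assumption of this sketch:

```python
import numpy as np

def normal_consistency_loss(verts, faces, face_pairs):
    """Mean (1 - cos) between unit normals of neighboring triangles;
    zero when adjacent faces are coplanar with consistent winding.

    faces      : (F, 3) vertex indices per triangle
    face_pairs : (P, 2) indices of faces sharing an edge
    """
    e1 = verts[faces[:, 1]] - verts[faces[:, 0]]
    e2 = verts[faces[:, 2]] - verts[faces[:, 0]]
    n = np.cross(e1, e2)
    n /= np.linalg.norm(n, axis=1, keepdims=True)
    cos = np.sum(n[face_pairs[:, 0]] * n[face_pairs[:, 1]], axis=1)
    return np.mean(1.0 - cos)

def laplacian_loss(verts, neighbors):
    """Mean magnitude of the uniform differential coordinates (vertex
    minus centroid of its 1-ring), a discrete mean-curvature proxy.

    neighbors : list of index lists, neighbors[i] = 1-ring of vertex i
    """
    deltas = [verts[i] - verts[list(nbrs)].mean(axis=0)
              for i, nbrs in enumerate(neighbors)]
    return np.mean(np.linalg.norm(deltas, axis=1))
```

Both losses depend only on the deformed template's own geometry, which is why they can replace the field smoothness penalty without needing any ground truth.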
Our proposed edge correspondence loss is different from the edge length loss [6, 23] in that it allows meshes to change sizes more freely, as long as the edge length ratio stays consistent before and after deformation. This is beneficial for medical FE meshing tasks where (1) patients have organs of different sizes and (2) consistent edge ratio helps convert between triangular and quadrilateral faces. The latter is important because quadrilateral faces are desired in FEA for more accurate simulation results, but many mesh-related algorithms and libraries are only compatible with triangular faces.
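The exact form of the paper's edge correspondence loss is not reproduced here; one simple variant consistent with the description above penalizes the spread of per-edge length ratios, so that a uniform rescaling of the mesh incurs no penalty. A hypothetical NumPy sketch:

```python
import numpy as np

def edge_correspondence_loss(v0, v1, edges):
    """Hypothetical edge correspondence loss: penalize deviation of each
    edge's deformed/original length ratio from the mean ratio, so a
    global rescaling of the mesh costs nothing (unlike edge length loss).

    v0, v1 : (N, 3) node positions before/after deformation
    edges  : (E, 2) node index pairs
    """
    len0 = np.linalg.norm(v0[edges[:, 0]] - v0[edges[:, 1]], axis=1)
    len1 = np.linalg.norm(v1[edges[:, 0]] - v1[edges[:, 1]], axis=1)
    ratio = len1 / len0
    return np.mean((ratio - ratio.mean()) ** 2)
```

With this form, doubling every coordinate leaves the loss at zero, while stretching a single edge relative to its neighbors is penalized, matching the stated design goal.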
2.3 Deformation Field
As mentioned in Sect. 2.2.1, the choice of \(\phi \) is important for applying the key approximation step in MTM. Our approach is to learn a diffeomorphic B-spline deformation field from which we interpolate displacement vectors at each node of the template mesh. By requiring the result to adhere to topology-preserving diffeomorphism, we help prevent mesh intersections that can commonly occur with node-specific displacements. Also, when the field is calculated in the reverse direction for deforming a template image/segmentation to prevent hole artifacts, the invertible property of diffeomorphism allows for much more accurate field calculation. The B-spline aspect helps generate smooth fields, which prevents erratic changes in field direction or magnitude for nearby interpolation points.
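The per-node displacement lookup amounts to interpolating the dense field at each template node's coordinates. A minimal NumPy sketch using trilinear interpolation (the actual model interpolates a diffeomorphic B-spline field via Airlab):

```python
import numpy as np

def interpolate_displacements(field, nodes):
    """Trilinearly interpolate a dense displacement field at mesh nodes.

    field : (3, X, Y, Z) displacement vectors on the voxel grid
    nodes : (N, 3) node coordinates in voxel space
    Returns (N, 3) displacement vectors, one per template node.
    """
    out = np.zeros_like(nodes)
    base = np.floor(nodes).astype(int)
    frac = nodes - base
    for corner in np.ndindex(2, 2, 2):
        c = np.array(corner)
        idx = np.clip(base + c, 0, np.array(field.shape[1:]) - 1)
        # trilinear weight for this corner of the surrounding voxel cell
        w = np.prod(np.where(c == 1, frac, 1.0 - frac), axis=1)
        out += w[:, None] * field[:, idx[:, 0], idx[:, 1], idx[:, 2]].T
    return out
```

Because every node samples the same smooth field, nearby nodes receive similar displacements, which is precisely how the field-based parameterization discourages mesh self-intersections.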
3 Experiments and Results
3.1 Data Acquisition and Preprocessing
We used an aortic valve dataset consisting of 88 CT scans from 74 different patients, all with tricuspid valves. Of the 88 total scans, 73 were collected from transcatheter aortic valve replacement (TAVR) patients at Hartford Hospital, and the remaining 15 were from the training set of the public MM-WHS dataset [25]. For some randomly chosen Hartford patients, we included more than one of the \(\sim \)10 time points collected over the cardiac cycle. The 88 images were randomly split into 40, 10, and 38 images for the training, validation, and testing sets, respectively, with no patient overlap between the training and testing sets. We pre-processed all CT scans by converting to Hounsfield units (HU), thresholding, and renormalizing to [0, 1]. We resampled all images to a fixed spatial resolution of 1 \(\times \) 1 \(\times \) 1 mm\(^3\), and cropped and rigidly aligned them using three manually annotated landmark points to focus on the aortic valve, resulting in final images of [64, 64, 64] voxels.
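The intensity preprocessing can be sketched as a clip-and-rescale on HU values. The window bounds below are illustrative assumptions, as the exact thresholds are not reported:

```python
import numpy as np

def preprocess_ct(hu_volume, lo=-1000.0, hi=1500.0):
    """Clip a CT volume (already in Hounsfield units) to a fixed window
    and renormalize to [0, 1]. The window bounds here are illustrative;
    the exact thresholds used in the paper are an open detail.
    """
    clipped = np.clip(hu_volume, lo, hi)
    return (clipped - lo) / (hi - lo)
```

A fixed window keeps intensities comparable across scanners, which matters when training and test images come from different sources (Hartford vs. MM-WHS).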
We focused on 4 aortic valve components: the aortic root/wall (for segmentation/mesh, respectively) and the 3 leaflets. The ground truth segmentation labels (for training) and mesh labels (for testing) for all 4 components were obtained via manual and semi-automated annotations by human experts in 3D Slicer [4]. We first obtained the segmentation labels using the brush tool and the robust statistics segmenter. Then, using the segmentation as guidance, we manually defined the boundary curves of the aortic wall and the leaflets, while denoting key landmark points at the commissures and hinges. We then applied thin-plate splines to deform mesh templates from a shape dictionary to the defined boundaries, and further adjusted nodes along the surface normals using a combination of manually marked surface points and an intensity gradient-based objective, similar to [11]. The surface mesh template was created by further processing one of the ground truth meshes with a series of remeshing and stitching steps. The final template surface mesh was a single connected component with 12186 nodes and 18498 faces, consisting of both triangular and quadrilateral faces.
3.2 Implementation Details
We used PyTorch [15] to implement a variation of a 3D U-net [18] as our function approximator \(h_\theta (I; S_0, M_0)\). Since the network architecture is not the focus of this paper, we kept it consistent between all deep learning-based methods for fair comparison. The basic Conv unit was Conv3D-InstanceNorm-LeakyReLU, and the network had 4 encoding layers of ConvStride2-Conv with residual connections and dropout, and 4 decoding layers of Concatenation-Conv-Conv-Upsampling-Conv. The base number of filters was 16, and it was doubled at each encoding layer and halved at each decoding layer. The output of the U-net was a 3 \(\times \) 64 \(\times \) 64 \(\times \) 64 tensor, which we interpolated to obtain a 3 \(\times \) 24 \(\times \) 24 \(\times \) 24 tensor that we then used to displace the control points of a 3D diffeomorphic third-order B-spline transformation, implemented with the Airlab registration library [20], to obtain a dense displacement field. The Chamfer distance and geometric quality losses were implemented using Pytorch3D [17] with modifications. We used the Adam optimizer [9] with a fixed learning rate of 1e−4, a batch size of 1, and 2000 training epochs with a B-spline deformation augmentation step, resulting in 80k training samples and around 24 h of training time on a single NVIDIA GTX 1080 Ti.
3.3 Evaluation Metrics
For spatial accuracy, we evaluated the Chamfer distance (mean surface accuracy) and the Hausdorff distance (worst-case surface accuracy) between ground truth meshes and predicted surface meshes. The Chamfer distance was calculated using 10k sampled points on each mesh surface using Pytorch3D [17], and the Hausdorff distance was calculated on the meshes themselves using libigl [8]. For correspondence error, we evaluated the Euclidean distance between hand-labeled landmarks (3 commissures and 3 hinges) and specific node positions on the predicted meshes (e.g. commissure #1 was node #81 on every predicted mesh). Additionally, as a bare-minimum check of FEA viability, we used CGAL [12] to check for self-intersections and degenerate edges/faces of predicted volumetric meshes with no post-processing.
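The correspondence error is a direct point-to-point distance, since node identity is fixed by the template. A minimal NumPy sketch (the node index values below are illustrative, not the actual template indices):

```python
import numpy as np

def correspondence_error(pred_nodes, landmarks, landmark_node_ids):
    """Mean Euclidean distance between hand-labeled landmarks and the
    fixed template nodes that represent them on every predicted mesh.

    pred_nodes        : (N, 3) predicted mesh node positions
    landmarks         : (K, 3) ground truth landmark coordinates
    landmark_node_ids : length-K node indices (illustrative values)
    """
    diffs = pred_nodes[landmark_node_ids] - landmarks
    return np.linalg.norm(diffs, axis=1).mean()
```

This metric only makes sense because deformation-based prediction preserves node ordering; methods that re-mesh per patient cannot be scored this way.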
3.4 Comparison with an Image Intensity Gradient-Based Approach
We compared our method against the semi-automated version of [11], which uses manually delineated boundaries to first position the template meshes and then refines them using an image gradient-based objective (Fig. 3). This approach performs very well under ideal imaging conditions, where there are clear intensity changes along the valve components, but it tends to make large errors in images with high calcification and low contrast. On the other hand, MTM does not particularly favor one condition or another, as long as enough variation exists in the training images. However, this also means it could make errors on “easy” images if the model has not seen a similar training image. We chose not to include Liang et al. [11] in the evaluation metric comparisons because (1) it uses manually delineated boundaries, which skews both surface distance and correspondence errors, and (2) when images were in good condition (e.g. Fig. 3 bottom left), we used some of its meshes as ground truth without any changes.
3.5 Comparison with Other Deformation Strategies
We chose three deformation-based methods for comparison: Voxelmorph [1], U-net + Robust Point Matching (RPM) [2], and TETRIS [10] (Table 1, Fig. 4).
Voxelmorph [1] uses a CNN-predicted deformation field to perform intensity-based registration with an atlas image. Given a paired image-mesh template, we first trained for image registration and used the resulting field to deform the mesh to the target image. Since there is no guidance for the network to focus on the valve components, the resulting deformation is optimized for larger structures around the valve rather than the leaflets, leading to poor mesh accuracy.
U-net + RPM [2, 18] is a sequential model where we trained a U-net for voxel-wise segmentation and used its output as the target shape for registration with a template mesh. RPM (implemented in [14]) performs optimization at test time, which requires more time and expert knowledge for parameter tuning during model deployment. It also produces suboptimal results, possibly because the segmentation output and point sampling are not optimized for matching with the template mesh.
TETRIS [10] uses a CNN-predicted deformation field to deform a segmentation prior to optimize for segmentation accuracy. Using a paired segmentation-mesh template, we first trained for template-deformed segmentation and used the resulting field to deform the mesh. Since the field is not diffeomorphic and calculated in the reverse direction to prevent hole artifacts, we used VTK’s implementation of Newton’s method [21] to get the inverse deformation field for the template mesh. The inaccuracies at the segmentation boundaries and errors due to field inversion lead to suboptimal performance.
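Displacement-field inversion of the kind needed here can be sketched with a simple fixed-point iteration. This is not VTK's Newton-based implementation, just a minimal sketch that converges for small, smooth displacements:

```python
import numpy as np

def invert_displacement(u, x, iters=20):
    """Invert a displacement field at point x by fixed-point iteration:
    find v such that x + v + u(x + v) = x, i.e. v = -u(x + v).

    u : callable mapping a 3-vector position to a 3-vector displacement
    Converges when u is a contraction (small, smooth displacements);
    large or folding fields need a proper Newton solver.
    """
    v = np.zeros(3)
    for _ in range(iters):
        v = -u(x + v)
    return v
```

The residual errors of such numerical inversion are exactly the "errors due to field inversion" that hurt the TETRIS-style pipeline, and what the diffeomorphic parameterization in Sect. 2.3 avoids.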
MTM consistently outperforms all other deformation strategies in terms of spatial accuracy and produces fewer degenerate meshes (Table 1, Fig. 4). The accuracy values are also comparable to those in [7, 11], which use cleaner images, heavier assumptions about valve structure, and/or ground truth meshes for training. MTMgeo arguably performs similarly to MTM, suggesting that we may be able to replace the field smoothness penalty with other mesh-related losses to refine the results for specific types of meshes. This may be especially useful for training with volumetric meshes, where we might want to dynamically adjust the thickness of different structures based on the imaging characteristics.
3.6 FEA Results
Volumetric FE meshes were produced by applying the MTM-predicted deformation field to a template volumetric mesh, which was created by applying a simple offset + stitch operation to the template surface mesh. We set the aortic wall and leaflet thicknesses to 1.5 mm and 0.8 mm, respectively, and used C3D15 and C3D20 elements. FEA simulations were performed with an established protocol, similar to those in [11, 24]. Briefly, to compute stresses on the aortic wall and leaflets during valve closing, an intraluminal pressure (P = 16 kPa) was applied to the upper surface of the leaflets and coronary sinuses, and a diastolic pressure (P = 10 kPa) was applied to the lower portion of the leaflets and intervalvular fibrosa. The maximum principal stresses in the aortic wall and leaflets were approximately 100–500 kPa (Fig. 5), consistent with previous studies [11, 24]. This demonstrates the suitability of MTM-predicted meshes for FEA simulations.
3.7 Limitations and Future Works
Although MTM shows promise, it has much room for improvement. First, the current setup requires 3 manually placed landmark points during preprocessing for cropping and rigid alignment. We will pursue end-to-end learning using 3D whole-heart images via region proposal networks, similar to [13]. Second, our model does not produce calcification meshes, which are important for proper simulation because calcification and valve tissue have different material properties. We will need a non-deformation strategy for predicting calcification meshes, since their size and position vary significantly between patients. Third, the restriction to smooth, diffeomorphic fields prevents large variations in valve shapes. We will continue exploring the possibility of extending our framework to node-specific displacement vectors.
4 Conclusion
We presented a weakly supervised deep learning approach for predicting aortic valve FE meshes from 3D patient images. Our method only requires segmentation labels and a paired segmentation-mesh template during training, which are easier to obtain than mesh labels. Our trained model can predict meshes with good spatial accuracy and FEA viability.
References
Balakrishnan, G., Zhao, A., Sabuncu, M.R., Guttag, J., Dalca, A.V.: Voxelmorph: a learning framework for deformable medical image registration. IEEE Trans. Med. Imaging 38(8), 1788–1800 (2019)
Chui, H., Rangarajan, A.: A new point matching algorithm for non-rigid registration. Comput. Vis. Image Underst. 89(2–3), 114–141 (2003)
Dowling, C., Firoozi, S., Brecker, S.J.: First-in-human experience with patient-specific computer simulation of TAVR in bicuspid aortic valve morphology. JACC Cardiovasc. Interven. 13(2), 184–192 (2020)
Fedorov, A., et al.: 3D slicer as an image computing platform for the quantitative imaging network. Magn. Reson. Imaging 30(9), 1323–1341 (2012)
Ghesu, F.C., et al.: Marginal space deep learning: efficient architecture for volumetric image parsing. IEEE Trans. Med. imaging 35(5), 1217–1228 (2016)
Gkioxari, G., Malik, J., Johnson, J.: Mesh R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 9785–9795 (2019)
Ionasec, R.I., et al.: Patient-specific modeling and quantification of the aortic and mitral valves from 4-D cardiac CT and TEE. IEEE Trans. Med. Imaging 29(9), 1636–1651 (2010)
Jacobson, A., Panozzo, D., et al.: libigl: A simple C++ geometry processing library (2018). https://libigl.github.io/
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Lee, M.C.H., Petersen, K., Pawlowski, N., Glocker, B., Schaap, M.: Tetris: template transformer networks for image segmentation with shape priors. IEEE Trans. Med. Imaging 38(11), 2596–2606 (2019)
Liang, L., et al.: Machine learning-based 3-D geometry reconstruction and modeling of aortic valve deformation using 3-D computed tomography images. Int. J. Numer. Methods Biomed. Eng. 33(5), e2827 (2017)
Loriot, S., Rouxel-Labbé, M., Tournois, J., Yaz, I.O.: Polygon mesh processing. In: CGAL User and Reference Manual. CGAL Editorial Board, 5.1.1 edn. (2020). https://doc.cgal.org/5.1.1/Manual/packages.html#PkgPolygonMeshProcessing
Pak, D.H., Caballero, A., Sun, W., Duncan, J.S.: Efficient aortic valve multilabel segmentation using a spatial transformer network. In: 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI), pp. 1738–1742. IEEE (2020)
Papademetris, X., et al.: Bioimage suite: an integrated medical image analysis suite: an update. Insight J. 2006, 209 (2006)
Paszke, A., et al.: Automatic differentiation in PyTorch (2017)
Pouch, A.M., et al.: Medially constrained deformable modeling for segmentation of branching medial structures: application to aortic valve segmentation and morphometry. Med. Image Anal. 26(1), 217–231 (2015)
Ravi, N., et al.: Accelerating 3D deep learning with PyTorch3D. arXiv:2007.08501 (2020)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015, Part III. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Rueckert, D., Sonoda, L.I., Hayes, C., Hill, D.L., Leach, M.O., Hawkes, D.J.: Nonrigid registration using free-form deformations: application to breast MR images. IEEE Trans. Med. Imaging 18(8), 712–721 (1999)
Sandkühler, R., Jud, C., Andermatt, S., Cattin, P.C.: Airlab: Autograd image registration laboratory. arXiv preprint arXiv:1806.09907 (2018)
Schroeder, W.J., Lorensen, B., Martin, K.: The Visualization Toolkit: An Object-Oriented Approach to 3D Graphics. Kitware, New York (2004)
Sun, W., Martin, C., Pham, T.: Computational modeling of cardiac valve function and intervention. Annu. Rev. Biomed. Eng. 16, 53–76 (2014)
Wang, N., Zhang, Y., Li, Z., Fu, Y., Liu, W., Jiang, Y.G.: Pixel2Mesh: generating 3D mesh models from single RGB images. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 52–67 (2018)
Wang, Q., Primiano, C., McKay, R., Kodali, S., Sun, W.: CT image-based engineering analysis of transcatheter aortic valve replacement. JACC Cardiovasc. Imaging 7(5), 526–528 (2014)
Zhuang, X., Shen, J.: Multi-scale patch and multi-modality atlases for whole heart segmentation of MRI. Med. Image Anal. 31, 77–87 (2016)
Acknowledgments and Conflict of Interest
This work was supported by the NIH R01HL142036 grant. Dr. Wei Sun is a co-founder and serves as the Chief Scientific Advisor of Dura Biotech. He has received compensation and owns equity in the company.
© 2021 Springer Nature Switzerland AG
Pak, D.H. et al. (2021). Weakly Supervised Deep Learning for Aortic Valve Finite Element Mesh Generation from 3D CT Images. In: Feragen, A., Sommer, S., Schnabel, J., Nielsen, M. (eds) Information Processing in Medical Imaging. IPMI 2021. Lecture Notes in Computer Science(), vol 12729. Springer, Cham. https://doi.org/10.1007/978-3-030-78191-0_49