Abstract
Purpose
Orthognathic surgery requires an accurate surgical plan of how bony segments are moved and how the face passively responds to the bony movement. Currently, the finite element method (FEM) is the standard for predicting facial deformation. Deep learning models have recently been used to approximate FEM because of their faster simulation speed. However, current solutions are not compatible with detailed facial meshes and often do not explicitly provide the network with known boundary type information. Therefore, the purpose of this proof-of-concept study is to develop a biomechanics-informed deep neural network that accepts point cloud data and explicit boundary types as inputs to the network for fast prediction of soft-tissue deformation.
Methods
A deep learning network was developed based on the PointNet++ architecture. The network accepts the starting facial mesh, input displacement, and explicit boundary type information and predicts the final facial mesh deformation.
Results
We trained and tested our deep learning model on datasets created from FEM simulations of facial meshes. Our model achieved a mean error between 0.159 and 0.642 mm on five subjects. Including explicit boundary types had mixed results, improving performance in simulations with large deformations but decreasing performance in simulations with small deformations. These results suggest that including explicit boundary types may not be necessary to improve network performance.
Conclusion
Our deep learning method can approximate FEM for facial change prediction in orthognathic surgical planning by accepting geometrically detailed meshes and explicit boundary types while significantly reducing simulation time.
Introduction
Facial appearance is a representative image of an individual and can significantly impact self-confidence and social relationships. Patients with jaw deformities suffer from both esthetical impairment and functional abnormality [1]. Orthognathic surgery is a corrective jaw surgery which corrects skeletal deformities by repositioning osteotomized bony segments into desired positions [2, 3]. While surgically “untouched,” the facial soft tissues are passively and “automatically” corrected following the movement of the underlying bony segments [4]. Due to the complex nature of facial anatomy, orthognathic surgery requires an accurate surgical plan. To date, surgeons can accurately plan the movement of bony segments (“bony movement” for short) using computer-aided surgical simulation (CASS) technology [3, 5]. However, because of the nonlinear relationship between bony segments and soft tissues, the prediction of postoperative facial appearance remains a practically challenging task [6].
Various attempts have been made to predict three-dimensional (3D) facial change following orthognathic surgery [4, 7, 8]. Among them, the finite element method (FEM) is reported to be the most accurate and biomechanically relevant method [6, 9]. When using FEM simulation, geometrically accurate patient-specific FE mesh modeling and realistic boundary condition assignment are critical to achieve quantitatively and qualitatively accurate prediction results [4, 6, 9]. However, this level of customization is often difficult to implement using a general FE solver, thus motivating development of a novel incremental simulation method in our previous work [6]. FEM simulation is also a computationally expensive process, and a typical facial change prediction takes about 30 minutes to complete after the bony movement is planned. This makes FEM simulation impractical for quick surgical planning in clinical settings, because surgeons often try multiple procedures or revisions during the planning for each patient in order to achieve the best possible outcome. While various FE acceleration methods such as SOFA [10], NiftySim [11], and proper orthogonal decomposition may accelerate computation, their simulation time rarely approaches that needed for rapid surgical planning, especially for highly detailed meshes [12]. Therefore, a more efficient approach is needed to improve prediction time while maintaining comparable accuracy to FEM.
Deep learning has been applied to a variety of applications related to orthognathic surgical planning, including classification, segmentation, registration, denoising, and many others [13]. Only recently have deep learning techniques been introduced as a potential alternative to the traditional FEM method to simulate biomechanical problems including tissue deformation [14]. While training a deep neural network is computationally expensive, a fully trained network can decrease simulation time by several orders of magnitude compared to FEM [14]. This decrease in computation time has made deep learning an attractive solution for obtaining simulation results rapidly.
Deep learning networks based on the U-Net architecture have been developed to simulate soft-tissue deformation in various organs [15, 16]. However, such models require input data to be sampled from a regularly spaced grid, which is not suitable for facial change simulations. Accurately capturing details of the face, especially in the clinically critical regions, e.g., the lips, is extremely important for predicting the facial appearance outcome. Therefore, a network which can accept unstructured data with irregularly spaced nodes is needed. Recent works have implemented a PointNet to perform deformation estimation because of its ability to accept data in point cloud format, allowing for unstructured data as input [17, 18]. These networks were able to learn biomechanically relevant tissue deformation while using unstructured data. However, boundary type information was limited to whether or not deformation occurred at a given node. In comparison, boundary types for facial tissue simulations are more complex [4, 6, 9]. For this reason, a way of explicitly supplying boundary type information to a network is needed.
The purpose of this study is to develop a novel biomechanics-informed deep learning method to enable efficient and accurate facial tissue change simulation, addressing the weaknesses of prior deformation prediction networks and FEM simulation. The contributions of this proof-of-concept work are (1) implementation of a deep neural network based on the PointNet++ architecture [19] that accepts data input in point cloud format, and thus is compatible with any geometrically detailed facial mesh, and (2) implementation of explicit patient-specific boundary types as additional input to the network to improve facial change prediction accuracy.
Method
The proposed biomechanics-informed deep learning method is based on PointNet++ [19]. In this method, we assume that the facial tissue mesh has already been generated from computed tomography (CT), and the surgical plan (i.e., exact bony movement) has already been formed. The facial mesh in point cloud format and the explicit patient-specific boundary types are used as inputs to the network for fast prediction of facial soft-tissue deformation following the bony movement. Figure 1 shows an overview of the proposed method.
Data representation
Our network learns a nonlinear mapping between an input state, which is represented by the starting mesh, boundary types, and bony surface displacement, and the predicted mesh following bony displacement. The network is designed to accept input data in the same format as in an FEM simulation. Partially inspired by Mendizabal et al. [12] and Saeed et al. [17], let the input, \(x\), consist of \(N\) feature vectors \(x_n=[c_n, b_n, s_n]\) where \(n = 1,2,...,N\) indexes the nodes in an input FEM mesh. The vectors \(c_n\) are the Cartesian coordinates of the input mesh, \(b_n\) are one-hot encoding vectors of the boundary type, and \(s_n\) represent the applied surface displacement in the \(x, y,\) and \(z\) directions. The encoding vector, \(b\), varies depending on the boundary type at each node (Fig. 1). Boundary types are differentiated using a one-hot encoding vector, which can distinguish many types of boundary conditions, as opposed to the binary indicator used by Saeed et al. [17]. Three different boundary types are implemented: fixed, moving, and free nodes [6]. For the fixed nodes, \(b = [0, 0, 1]\). For the moving nodes with known displacement, \(b = [0, 1, 0]\). For the remaining free nodes, \(b = [1, 0, 0]\). We therefore refer to the inclusion of the encoding vector \(b\) as “explicit” boundary types, whereas exclusion of the vector \(b\) is referred to as “implicit” boundary types. For the moving nodes where a known displacement is applied, \(s = [s_x, s_y, s_z]\) where \(s_x, s_y, s_z\) represent the applied nodal displacement based on corresponding bony movement. For the fixed and free nodes, \(s = [0, 0, 0]\). We use the FEM simulation output as ground truth while training the deep neural network. The deep neural network is trained to predict the final nodal displacement after deformation \(u_n\) where \(u=[u_x, u_y, u_z]\).
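The per-node feature layout described here can be illustrated with a short sketch. The function name `build_features`, the string labels, and the three-node example values are hypothetical and chosen only for illustration; the one-hot codes are the ones stated in the text.

```python
# One-hot boundary codes as stated in the text: free, moving, fixed.
BOUNDARY_CODES = {
    "free": [1, 0, 0],
    "moving": [0, 1, 0],
    "fixed": [0, 0, 1],
}


def build_features(coords, boundary_types, displacements, explicit=True):
    """Return one feature vector x_n = [c_n, b_n, s_n] per node.

    coords: list of [x, y, z] node coordinates (c_n)
    boundary_types: list of "free" / "moving" / "fixed" labels
    displacements: list of [sx, sy, sz]; only used for moving nodes
    explicit: if False, b_n is omitted ("implicit" boundary types)
    """
    features = []
    for c, kind, s in zip(coords, boundary_types, displacements):
        # Fixed and free nodes carry zero applied displacement.
        s_n = list(s) if kind == "moving" else [0.0, 0.0, 0.0]
        b_n = BOUNDARY_CODES[kind] if explicit else []
        features.append(list(c) + list(b_n) + s_n)
    return features


# Hypothetical three-node example (values are illustrative only).
coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
kinds = ["fixed", "moving", "free"]
disps = [[0.0, 0.0, 0.0], [0.5, -0.2, 0.1], [9.0, 9.0, 9.0]]

x_explicit = build_features(coords, kinds, disps, explicit=True)   # 9 values per node
x_implicit = build_features(coords, kinds, disps, explicit=False)  # 6 values per node
```

With explicit boundary types each node carries nine input channels; dropping \(b_n\) leaves six, so the first network layer has to be resized to match.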
The ground-truth nodal displacements after deformation, vector \(v_n\) where \(v=[v_x, v_y, v_z]\), are calculated by subtracting the nodal coordinates of the input mesh from the FEM-simulated mesh. Our network is tasked with finding the function which minimizes the expected error between \(u_n\) and \(v_n\). Mean squared error is used as our loss function: \(L(u,v) = \frac{1}{N}\sum \nolimits _{n=1}^N \Vert v_n-u_n\Vert ^2\).
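A minimal sketch of this training objective follows. The helper names are hypothetical, and the normalization (mean over the \(N\) nodes of the squared displacement error) is an assumption; deep learning frameworks often average over all \(3N\) components instead, which differs only by a constant factor of 3.

```python
def ground_truth_displacement(input_mesh, fem_mesh):
    """v_n = FEM-simulated node position minus input node position."""
    return [[f - c for c, f in zip(cn, fn)]
            for cn, fn in zip(input_mesh, fem_mesh)]


def mse_loss(u, v):
    """Mean over nodes of the squared Euclidean error ||v_n - u_n||^2.

    Assumption: normalization by the number of nodes N.
    """
    total = 0.0
    for un, vn in zip(u, v):
        total += sum((b - a) ** 2 for a, b in zip(un, vn))
    return total / len(u)


# Illustrative two-node example (not data from the study).
input_mesh = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
fem_mesh = [[1.0, 0.0, 0.0], [1.0, 3.0, 1.0]]
v = ground_truth_displacement(input_mesh, fem_mesh)
u = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # a network that predicts no motion
loss = mse_loss(u, v)
```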
Network design details
PointNet++ [19] is adopted for our task because of its efficiency on point set processing. Its structure is modified by adding the boundary type information and displacement vectors as additional input channels. PointNet++ is a hierarchical feature extraction network consisting of 4 feature encoding modules, 4 feature decoding modules, and a unit PointNet layer, as shown in Fig. 2a. Each feature encoding module has a sampling layer, a grouping layer, and a PointNet layer as shown in Fig. 2b. Each feature decoding module consists of an interpolation layer and a unit PointNet layer as illustrated in Fig. 2c. The PointNet layer has a multilayer perceptron (MLP) and max pooling operator. The unit PointNet layer is similar to one-by-one convolution in convolutional neural networks [19]. Skip connections are used to concatenate the features between feature encoding and decoding modules. The final output of the network is the predicted 3D displacement vector for each of the N input nodes.
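To make the sampling, grouping, and PointNet steps of one feature encoding module concrete, the following is a simplified pure-Python sketch. It is not the implementation used in this work: the single linear layer standing in for the MLP, the random weights, the radius, and all sizes are assumptions chosen for illustration.

```python
import math
import random


def farthest_point_sampling(points, m):
    """Sampling layer: pick m indices spread out over the cloud."""
    chosen = [0]  # arbitrary starting point
    min_dist = [math.dist(p, points[0]) for p in points]
    while len(chosen) < m:
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen


def ball_query(points, center, radius, k):
    """Grouping layer: indices of up to k points within radius of center."""
    near = [i for i, p in enumerate(points) if math.dist(p, center) <= radius]
    return near[:k]


def shared_mlp(vec, weights, bias):
    """One linear layer plus ReLU, applied identically to every point."""
    return [max(0.0, sum(w * x for w, x in zip(row, vec)) + b)
            for row, b in zip(weights, bias)]


def set_abstraction(points, feats, m, radius, k, weights, bias):
    """Sampling -> grouping -> PointNet (shared MLP + max pooling)."""
    new_points, new_feats = [], []
    for ci in farthest_point_sampling(points, m):
        encoded = []
        for j in ball_query(points, points[ci], radius, k):
            # Coordinates relative to the centroid, followed by point features.
            local = [a - b for a, b in zip(points[j], points[ci])]
            encoded.append(shared_mlp(local + feats[j], weights, bias))
        # Max pooling over the group: one feature vector per centroid.
        new_points.append(points[ci])
        new_feats.append([max(col) for col in zip(*encoded)])
    return new_points, new_feats


# Illustrative run: 20 random nodes, 6 extra channels (b_n and s_n) per node.
random.seed(0)
pts = [[random.random() for _ in range(3)] for _ in range(20)]
fts = [[1, 0, 0, 0.0, 0.0, 0.0] for _ in range(20)]
W = [[random.uniform(-1, 1) for _ in range(9)] for _ in range(4)]
b = [0.0] * 4
new_pts, new_fts = set_abstraction(pts, fts, m=5, radius=0.5, k=8,
                                   weights=W, bias=b)
```

Because the boundary type and displacement channels are simply concatenated to the local coordinates, adding or removing them changes only the input width of the first shared MLP.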
Experiments and results
Experiments
We tested the method’s ability to simulate facial mesh change following synthetic bony movement based on real patient data. We then tested the method on an actual patient’s surgical plan to assess the performance of our network in solving a clinical problem.
Data for facial change simulation
A dataset of synthetic surgical plans was generated from real patient examples to train, test, and validate our network. The actual surgical plan from each patient was reserved to validate our network’s performance on real data. Patients who underwent double-jaw orthognathic surgery were randomly selected from our digital archive [IRB#: Pro00008890]. We generated synthetic bony movements and their corresponding FE facial meshes to be used as data for training, testing, and validating the network. The synthetic bony movements were created first. Following the standard surgical procedure, the midface and mandible of preoperative CT models were osteotomized for a LeFort segment, a distal mandible, and a right and a left proximal segment. After the postoperative CT models were registered to the preoperative ones based on surgically unaltered volumes, the surgical plan for the actual surgery (i.e., the movement of each bony segment) was retrospectively formed. The LeFort and the distal segments were moved individually in 6 degrees of freedom while the right and left proximal segments were rotated around the ipsilateral condyle and aligned to the distal mandible. Each rotational and translational bony movement was finally divided into several sub-steps within the maximal surgical movement. A facial change was then simulated for each combination of LeFort and distal mandibular (bony) movements. An initial hexahedral patient-specific FE mesh model (47,088 nodes and 38,280 elements) with detailed lip geometry was generated from patient CT images using the eFTP-VP method [6, 20]. Neo-Hookean material properties (Young’s modulus: 3,000 Pa, Poisson’s ratio: 0.47) and patient-specific boundary conditions were applied [6]. Using our validated FEM simulation method [6], facial meshes were generated (Fig. 3). An incremental approach was used for FEM simulation.
In this approach, facial changes were simulated sequentially based on incremental bony movements from the preoperative to the final position. For each simulation, at least 10 simulation results (e.g., 9 intermediate incremental results and 1 final result) were generated. In this way, each incremental simulation result could be used as a separate data sample. Since each simulation takes about 30 minutes to complete, it took approximately a week to generate the 3600 data samples for subject 1 (Table 1). The number of data samples generated for each subject depended on the range of bony movement that was physiologically plausible. Therefore, the number of data samples generated for each subject was not identical (Table 1). To improve network training efficiency, the area above the infraorbital region was removed from the original mesh, and the number of nodes and elements was also downsampled (from 50,000 to 3,960 nodes) while maintaining the best possible geometrical accuracy (Fig. 3). The moving and fixed nodes in the facial FE mesh were assigned using a K-nearest neighbor algorithm (Supplementary Fig. S1).
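The roughly one-week generation time quoted for subject 1 can be checked with simple arithmetic. Exactly 10 samples per simulation, exactly 30 minutes per simulation, and strictly sequential execution are simplifying assumptions here; the text states only “at least 10” and “about 30 minutes.”

\[
\frac{3600\ \text{samples}}{10\ \text{samples per simulation}} = 360\ \text{simulations}, \qquad 360 \times 0.5\ \text{h} = 180\ \text{h} = 7.5\ \text{days}
\]

Under these assumptions the estimate agrees with the reported week of computation.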
Network training and evaluation
The available data were split randomly into training, validation, and test sets by 70%, 10%, and 20%, respectively (Table 1). The feature vectors c and s were scaled such that all data fit in the range between 0 and 1 before being fed to the network. The mean squared error was used as the loss function for training. We used the Adam optimizer, which adaptively adjusts the learning rate, with an initial learning rate of 1e-5 and a batch size of 8. The network was trained for 100 epochs, which took approximately 5 hours for a single subject on an NVIDIA Tesla V100 GPU.
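The data split and input scaling can be sketched as follows. The function names and the seed are hypothetical, and fitting the minimum and maximum on the training rows only is an assumption; the text does not state which subset the scaling bounds were computed from.

```python
import random


def split_indices(n_samples, seed=0):
    """Randomly split sample indices into 70% / 10% / 20% subsets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train = n_samples * 7 // 10  # integer arithmetic avoids float rounding
    n_val = n_samples // 10
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def fit_min_max(rows):
    """Per-column minimum and maximum over equal-length rows."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]


def apply_min_max(rows, lo, hi):
    """Scale each column to [0, 1]; constant columns map to 0."""
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)]
            for row in rows]


# 3600 samples, the number reported for subject 1.
train_idx, val_idx, test_idx = split_indices(3600)

# Illustrative rows standing in for coordinate or displacement channels.
train_rows = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
lo, hi = fit_min_max(train_rows)       # assumption: fit on training data only
scaled = apply_min_max(train_rows, lo, hi)
```

For subject 1, this split yields 2520 training, 360 validation, and 720 test samples.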
The network was evaluated using the mean Euclidean error between the predicted node locations and the ground-truth node locations \(e(u,v) = \frac{1}{N}\sum \nolimits _{n=1}^N \Vert (v_n-u_n)\Vert \). The distribution of mean errors was used to evaluate the network’s performance.
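The evaluation metric translates directly into code. The sketch below assumes that predicted and ground-truth displacements have already been mapped back to millimeters; the function name and example values are illustrative.

```python
import math


def mean_euclidean_error(u, v):
    """e(u, v) = (1/N) * sum over nodes of ||v_n - u_n||."""
    return sum(math.dist(un, vn) for un, vn in zip(u, v)) / len(u)


# Illustrative two-node example (not data from the study).
u_pred = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
v_true = [[3.0, 4.0, 0.0], [0.0, 0.0, 2.0]]
err = mean_euclidean_error(u_pred, v_true)  # node errors are 5 and 2
```

Unlike the squared-error training loss, this metric is in the same units as the displacements, which is why it is the one reported in millimeters.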
Ablation study
To understand the impact of including explicit boundary type information, an ablation study was performed. The boundary type vector \(b\) was omitted from the input vector \(x\). This is referred to as “implicit” boundary types as the network still learns the boundary types implicitly. The size of the first layer in the PointNet++ network was changed to fit the dimension of the input vector accordingly. The mean Euclidean error was calculated for all samples within the validation and test sets to compare the network results with and without boundary types. Since the distribution of mean errors was not normal, a Wilcoxon signed-rank test was used for testing statistical significance in network performance.
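For readers unfamiliar with the test, the following is a simplified pure-Python sketch of a two-sided Wilcoxon signed-rank test on paired per-sample errors. It uses the normal approximation without tie or continuity corrections, which is an assumption made for brevity; a statistics library such as SciPy (`scipy.stats.wilcoxon`) would normally be used instead. The error values are made up for illustration.

```python
import math


def wilcoxon_signed_rank(a, b):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (W, p), where W is the smaller of the positive and negative
    rank sums. Zero differences are dropped and tied |d| share the
    average rank. Assumes at least one non-zero difference.
    """
    d = [x - y for x, y in zip(a, b) if x != y]
    n = len(d)
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, di in zip(ranks, d) if di > 0)
    w_minus = sum(r for r, di in zip(ranks, d) if di < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return w, p


# Made-up paired mean errors (mm) for the same ten test samples.
explicit_err = [0.30, 0.25, 0.41, 0.22, 0.35, 0.28, 0.33, 0.27, 0.39, 0.31]
implicit_err = [e + 0.01 * (i + 1) for i, e in enumerate(explicit_err)]
w_stat, p_value = wilcoxon_signed_rank(explicit_err, implicit_err)
```

The test is paired because both network variants are evaluated on the same samples, so each sample contributes one signed difference in mean error.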
Results
In the task of predicting the facial mesh following synthetic bony movement, the network achieved a mean error between 0.159 and 0.642 mm on the test set of each subject (Fig. 4). The mean error rarely exceeded 1 mm, even on simulations with very large input displacement (Supplementary Fig. S2). The predicted facial mesh closely resembled the ground-truth FEM mesh, and the largest error was typically seen around the lips (Fig. 5). On the real surgical plan examples, the network achieved a mean error between 0.292 and 0.989 mm across the subjects. These results are comparable to, if not better than, those of the synthetic bony movement simulations when compared to the average input displacement (Supplementary Fig. S2). The error was reasonable given the magnitude of the ground-truth deformation (Supplementary Fig. S3). The result of the ablation study showed that including explicit boundary types had mixed effects on the performance of the network. In subject 3, including explicit boundary types improved the performance of the network. However, in subjects 4 and 5, inclusion of explicit boundary types hurt network performance. In subjects 1 and 2, network performance was not significantly impacted by inclusion of explicit boundary types. The average run time was only slightly longer when boundary types were included, increasing by 5 ms on average.
Discussion
Our method is capable of closely approximating FEM results. The modified PointNet++ network can predict deformation accurately and consistently, as demonstrated by the low mean error achieved in the facial simulations. The network can also easily adapt to various simulated displacements and achieve low error even on large displacements, as seen in the performance in the surgical simulation results (Table 2). These results indicate that the presented method is robust to high levels of elastic deformation. Our method also demonstrates exemplary performance on real surgical plans. Qualitatively, the network-predicted facial shapes closely resemble those of the ground truth FEM facial shapes (Fig. 5). These results clearly validate our method’s ability to capture fine facial details that are imperative to facial surgical planning.
The inclusion of explicit boundary types did not have a noticeable effect on subjects 1 and 2 and even decreased performance in subjects 4 and 5. Only subject 3 had improved accuracy when explicit boundary types were included. We found that including explicit boundary types only seems to improve accuracy when the simulations have a high maximum deformation, as was seen in the surgical planning results (Table 2). We believe our method of including explicit boundary types may be limited due to the learning process of the network. Since deep learning networks act as a universal approximator, only introducing explicit boundary types in the input may not have a noticeable impact on network learning without also providing a way to enforce boundary conditions in the output. Our future research will seek to develop methods for enforcing boundary conditions through network design or loss algorithms.
The main advantage of our deep learning method is the decrease in simulation time compared to FEM. The average computation time of our network was less than 700 ms, while FEM takes several minutes on similar simulations [6]. This decrease in computation time allows clinicians to perform many more simulations during surgical planning and to get rapid feedback. At the same time, our method can achieve simulation results comparable to FEM.
One limitation of our work is the use of mean squared error as our only loss function. In future iterations of our network, adding a smoothing loss algorithm may help lower error while also obtaining better visual accuracy. Furthermore, recent work by Odot et al. has emphasized that use of mean squared error as a loss function may result in shape inaccuracies when simulating hyper-elastic materials [21]. Future iterations of our network will include a governing physics equation as a loss function, as seen in the work of Raissi et al. [22]. We also did not implement sliding nodes as a possible boundary type in this work. We believe that this may have limited the performance of the explicit boundary type encoding, as the original FE meshes for our subjects contained sliding nodes. Modeling sliding nodes will require custom loss algorithms, which we will investigate in future work. Another limitation is that we did not include material properties as additional input to the network. This is an additional feature that we will add to our method in future studies, as has been done in related work [17]. Additionally, as this was a proof-of-concept study, the network was trained on data from only one subject at a time. Future iterations of the network will be trained using data from multiple subjects. Training should occur on a large group of subjects with a wide range of physiologically relevant surgical plans to make the network robust and generalizable to unseen subjects. Ideally, a network trained on sufficient subjects would be adaptable to new subjects (possibly with minimal fine-tuning), making it suitable for clinical use. In order to train our network on multiple subjects, known point correspondence across subjects will need to be established to optimize the training procedure [23].
Finally, to validate the performance of our method, we will compare the accuracy of the PointNet++ network to previously used networks, such as U-Net [12, 15, 16].
Conclusion
We presented a deep learning method for biomechanics modeling of facial deformation in orthognathic surgical planning. Our method addressed issues in previous deformation prediction networks approximating FEM, namely network compatibility with geometrically detailed facial meshes and the inclusion of explicit boundary type information. The proposed method achieved accurate performance on facial mesh simulations following synthetic bony movement. Inclusion of explicit boundary type information had mixed results, improving performance in simulations with large deformations but decreasing performance in simulations with small deformations. Finally, our network achieved accurate results on a real surgical example, demonstrating its clinical feasibility.
Data availability
Data are not publicly available.
References
Alanko OM, Svedström-Oristo AL, Tuomisto MT (2010) Patients’ perceptions of orthognathic treatment, well-being, and psychological or psychiatric status: a systematic review. Acta Odontol Scand 68(5):249–260
Shafi MI, Ayoub A, Ju X, Khambay B (2013) The accuracy of three-dimensional prediction planning for the surgical correction of facial deformities using Maxilim. Int J Oral Maxillofac Surg 42(7):801–806
Xia JJ, Gateno J, Teichgraeber JF (2009) New clinical protocol to evaluate craniomaxillofacial deformity and plan surgical correction. Int J Oral Maxillofac Surg 67(10):2093–2106
Kim D, Kuang T, Rodrigues YL, Gateno J, Shen SG, Wang X, Deng H, Yuan P, Alfi DM, Liebschner MA, Xia JJ (2019) A new approach of predicting facial changes following orthognathic surgery using realistic lip sliding effect. MICCAI 11768:336–344
Xia JJ, Gateno J, Teichgraeber JF, Yuan P, Chen KC, Li J, Zhang X, Tang Z, Alfi DM (2015) Algorithm for planning a double-jaw orthognathic surgery using a computer-aided surgical simulation (CASS) protocol. Int J Oral Maxillofac Surg 44(12):1431–1440
Kim D, Kuang T, Rodrigues YL, Gateno J, Shen SG, Wang X, Stein K, Deng HH, Liebschner MA, Xia JJ (2021) A novel incremental simulation of facial changes following orthognathic surgery using FEM with realistic lip sliding effect. Med Image Anal 72:102095
Ullah R, Turner PJ, Khambay BS (2015) Accuracy of three-dimensional soft tissue predictions in orthognathic surgery after Le Fort I advancement osteotomies. Br J Oral Maxillofac Surg 53(2):153–157
Knoops PG, Borghi A, Ruggiero F, Badiali G, Bianchi A, Marchetti C, Rodriguez-Florez N, Breakey RW, Jeelani O, Dunaway DJ, Schievano S (2018) A novel soft tissue prediction methodology for orthognathic surgery based on probabilistic finite element modelling. PLoS ONE 13(5):e0197209
Kim D, Ho DCY, Mai H, Zhang X, Shen SG, Shen S, Yuan P, Liu S, Zhang G, Zhou X, Gateno J (2017) A clinically validated prediction method for facial soft-tissue changes following double-jaw surgery. Med Phys 44(8):4252–4261
Faure F, Duriez C, Delingette H, Allard J, Gilles B, Marchesseau S, Talbot H, Courtecuisse H, Bousquet G, Peterlik I, Cotin S (2012) SOFA: a multi-model framework for interactive physical simulation. Stud Mechanobiol Tissue Eng Biomater 11:283–321
Johnsen SF, Taylor ZA, Clarkson MJ, Hipwell J, Modat M, Eiben B, Han L, Hu Y, Mertzanidou T, Hawkes DJ, Ourselin S (2015) NiftySim: A GPU-based nonlinear finite element package for simulation of soft tissue biomechanics. Int J Comput Assist Radiol Surg 10:1077
Mendizabal A, Márquez-Neila P, Cotin S (2020) Simulation of hyperelastic materials in real-time using deep learning. Med Image Anal 59:101569
Bouletreau P, Makaremi M, Ibrahim B, Louvrier A, Sigaux N (2019) Artificial intelligence: applications in orthognathic surgery. J Stomatol Oral Maxillofac Surg 120(4):347–354
Phellan R, Hachem B, Clin J, Mac-Thiong J, Duong L (2021) Real-time biomechanics using the finite element method and machine learning: review and perspective. Med Phys 48(1):7–18
Pfeiffer M, Riediger C, Weitz J, Speidel S (2019) Learning soft tissue behavior of organs for surgical navigation with convolutional neural networks. Int J Comput Assist Radiol Surg 14:1147–1155
Mendizabal A, Tagliabue E, Brunet JN, Dall’Alba D, Fiorini P, Cotin S (2020) Physics-based deep neural network for real-time lesion tracking in ultrasound-guided breast biopsy. In: Miller K, Wittek A, Joldes G, Nash M, Nielsen P (eds) Computational biomechanics for medicine. MICCAI 2019, MICCAI 2018. Springer, Cham. https://doi.org/10.1007/978-3-030-42428-2_4
Saeed SU, Taylor ZA, Pinnock MA, Emberton M, Barratt DC, Hu Y (2021) Prostate motion modelling using biomechanically-trained deep neural networks on unstructured nodes. arXiv preprint arXiv:2007.04972
Fu Y, Lei Y, Wang T, Patel P, Jani AB, Mao H, Curran WJ, Liu T, Yang X (2021) Biomechanically constrained non-rigid MR-TRUS prostate registration using deep learning based 3D point cloud matching. Med Image Anal 67:101845
Qi CR, Yi L, Su H, Guibas LJ (2017) PointNet++: deep hierarchical feature learning on point sets in a metric space. Adv Neural Inf Process Syst 30:5105–5114
Zhang X, Kim D, Sheng S, Yuan P, Liu S, Tang Z, Zhang G, Zhou X, Gateno J, Liebschner MA, Xia JJ (2018) An eFTD-VP framework for efficiently generating patient-specific anatomically detailed facial soft tissue FE mesh for craniomaxillofacial surgery simulation. Biomech Model Mechanobiol 17(2):387
Odot A, Haferssas R, Cotin S (2021) DeepPhysics: a physics aware deep learning framework for real-time simulation. arXiv preprint arXiv:2109.09491
Raissi M, Perdikaris P, Karniadakis GE (2019) Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J Comput Phys 378:686–707
Foti S, Koo B, Dowrick T, Ramalhinho J, Allam M, Davidson B, Stoyanov D, Clarkson MJ (2020) Intraoperative liver surface completion with graph convolutional VAE. Lect Notes Comput Sci 12443:198–207
Maas SA, Ellis BJ, Ateshian GA, Weiss JA (2012) FEBio: finite elements for biomechanics. J Biomech Eng 134(1)
Funding
This work was supported in part by NIH grants (R01 DE022676, R01 DE027251 and R01 DE021863).
Author information
Contributions
NL, DK, XF, XX, JX, and PY conceived the study and designed the methods. DK and NL generated and prepared the data. The first draft of the manuscript was written by NL and DK. All authors commented on previous versions of the manuscript, read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors Nathan Lampen, Daeseung Kim, Xi Fang, Xuanang Xu, Tianshu Kuang, Hannah H. Deng, Joshua C. Barber, Jamie Gateno, James Xia, and Pingkun Yan declare that they have no conflicts of interest.
Ethical approval
The study was approved by our Institutional Review Board under IRB#: Pro00008890.
Informed consent
Informed consent was obtained for all subjects under IRB#: Pro00008890.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Lampen, N., Kim, D., Fang, X. et al. Deep learning for biomechanical modeling of facial tissue deformation in orthognathic surgical planning. Int J CARS 17, 945–952 (2022). https://doi.org/10.1007/s11548-022-02596-1
DOI: https://doi.org/10.1007/s11548-022-02596-1