Introduction

To optimize treatment outcomes in craniomaxillofacial surgery, a growing number of oral and maxillofacial surgeons (OMSs) employ powerful computer-assisted surgery (CAS) tools in their practice, enabling them to carry out preoperative planning in a virtual three-dimensional (3D) environment [4, 13]. Preoperative planning is crucial to achieving the desired treatment outcome. Both traditional imaging technologies and advanced computer-aided imaging modalities enable diverse approaches to surgical planning. The limitations of two-dimensional (2D) cephalometric images, panoramic images, and 2D photographs in preoperative planning are well documented. Multislice spiral computed tomography (MSCT) imaging now enables reconstruction of a volumetric facial skeleton and an untextured 3D facial soft tissue surface via threshold-based segmentation (Fig. 1) [7].

Fig. 1

a Untextured 3D soft tissue and b hard tissue models reconstructed with the InVesalius software


However, the photo-realistic appearance of the soft tissue cannot be reproduced in the virtually reconstructed MSCT facial model. This lack of soft tissue texture information makes it difficult for both the OMS and the patient to form a solid visual concept of the treatment plan [15].

Alternative techniques for photo-realistic 3D surface capture are currently available. Surface imaging technology is usually either laser-based or optics-based [16]. The optical FaceSCAN3D® Scientific Photo Lab (3D Shape, Erlangen, Germany) was developed specifically to measure faces. Using an arrangement of mirrors, it maps the entire face quickly and precisely in a single scan while maintaining an accuracy (0.2 mm) acceptable for surgical planning (Fig. 2).

Fig. 2

High-resolution 3D face model built with the FaceSCAN3D® Scientific Photo Lab: a left side, b front, and c right side


Recent studies have aimed to register and superimpose 3D facial surface images onto CBCT/MSCT images in order to improve the quality of patient-specific virtual faces. Such a virtual face documents the patient's external facial features and traits before treatment and then serves as a basis for preoperative planning and progress monitoring throughout treatment [2, 8, 10, 11].

To date, however, few studies have addressed the feasibility and accuracy of the implemented registration algorithms. Both factors determine the veracity of the patient-specific virtual face model; when they are inadequate, diagnosis, treatment planning, and the postoperative evaluation of craniomaxillofacial deformities may be compromised. The aim of this study was to develop an optimal process enabling accurate registration and feasible fusion of photo-realistic 3D surface images (captured using the FaceSCAN3D® Scientific Photo Lab) with the soft and skeletal tissues reconstructed from multislice spiral computed tomography (MSCT) images.

Materials and methods

This study was conducted in 37 patients referred to our hospital for assessment and management of craniofacial deformities. The study followed procedures in accordance with the 1975 Declaration of Helsinki, as revised in 2000, and was approved by the Ethics Committee of Sichuan University.

3D facial surface image acquisition

The patients' 3D facial features were obtained using the FaceSCAN3D® Scientific Photo Lab (60 Hz unit; 3D Shape, Erlangen, Germany). FaceSCAN3D® measures the entire face (>200°) with a capture time of under 0.4 s; the reconstruction time for the high-resolution 3D model is <1 min. Images were stored in OBJ file format for subsequent use. All 3D surface scans were acquired with the head in natural position, the lips at rest, a neutral facial expression, open eyes, and the teeth in intercuspation without visible activation of the masticatory muscles. All patients were instructed and observed by the investigator (A.T.), who took all scans.

MSCT data acquisition

After the facial features had been obtained, an image scan was taken within 24 h using a 16-slice CT scanner (Philips MX16 EVO CT, the Netherlands; 120 kV, 7700 mAs, pixel size 0.48 mm, increment 0.5 mm, field of view 250 mm, slice thickness 1 mm, matrix 512 × 512 pixels, gantry tilt 0°). The patients were again told to maintain a neutral facial expression, open eyes, and intercuspation without visibly activating their masticatory muscles. Primary data were saved to DVD in Digital Imaging and Communications in Medicine (DICOM) format. The 400 slices were then imported into the InVesalius software (version 3.0.0 Beta 5; http://svn.softwarepublico.gov.br/trac/invesalius) for 3D reconstruction of hard and soft tissues. Hard and untextured soft tissues were segmented using predefined thresholds (adult soft tissue: −718 to −177 HU; adult bone: 226 to 3071 HU) and exported as stereolithography (*.STL) files.
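
In outline, this thresholding step reduces to building a binary mask over the Hounsfield-unit volume and extracting an isosurface from it. The following is a minimal sketch of that idea, not InVesalius' actual implementation; the pydicom/scikit-image/numpy-stl calls, the helper names, and the input folder are our own illustrative assumptions, while the HU windows and voxel sizes mirror the values given above.

```python
# Illustrative sketch only -- not InVesalius' internal code. Assumes a DICOM
# series on disk; pydicom, scikit-image, and numpy-stl are stand-in libraries.
from pathlib import Path
import numpy as np
import pydicom
from skimage import measure
from stl import mesh  # numpy-stl

def load_hu_volume(dicom_dir):
    """Stack a DICOM series into a 3D volume rescaled to Hounsfield units."""
    slices = sorted((pydicom.dcmread(p) for p in Path(dicom_dir).glob("*.dcm")),
                    key=lambda s: float(s.ImagePositionPatient[2]))
    raw = np.stack([s.pixel_array for s in slices]).astype(np.float32)
    return raw * float(slices[0].RescaleSlope) + float(slices[0].RescaleIntercept)

def segment_to_stl(vol, lo, hi, spacing, out_path):
    """Binarize the volume to the [lo, hi] HU window and export the surface."""
    mask = ((vol >= lo) & (vol <= hi)).astype(np.uint8)
    verts, faces, _, _ = measure.marching_cubes(mask, level=0.5, spacing=spacing)
    surf = mesh.Mesh(np.zeros(faces.shape[0], dtype=mesh.Mesh.dtype))
    surf.vectors[:] = verts[faces]       # one (3, 3) vertex triple per triangle
    surf.save(out_path)

vol = load_hu_volume("dicom_series/")    # hypothetical input folder
spacing = (1.0, 0.48, 0.48)              # slice thickness and pixel size in mm
segment_to_stl(vol, -718, -177, spacing, "soft_tissue.stl")  # adult soft tissue
segment_to_stl(vol, 226, 3071, spacing, "bone.stl")          # adult bone
```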

Image registration and fusion

Since the soft and hard tissue models reconstructed from the MSCT scans share the same coordinate system, registration was only required between the FaceSCAN data and the MSCT-derived soft tissue data, which share a similar surface. We used a desktop computer with an Intel Core i7-4770 CPU (3.40 GHz, 32.0 GB main memory; Intel, Santa Clara, CA, USA) running the Windows 7 operating system (Microsoft, Redmond, WA, USA).

Manual registration (seven landmarks)

For each patient, both sets of surface data were imported into the Geomagic Studio software (Raindrop Geomagic Studio 2013®; Raindrop Geomagic, Inc., NC, USA). Seven landmarks distributed over the patient's entire facial area were digitized manually in the same sequence on both the FaceSCAN and MSCT skin surfaces, i.e., Procrustes registration according to seven anthropometric landmarks: (1) right endocanthion (r-en), (2) left endocanthion (l-en), (3) right exocanthion (r-ex), (4) left exocanthion (l-ex), (5) subnasale (sn), (6) right cheilion (r-ch), and (7) left cheilion (l-ch).

These points were chosen because they can be identified repeatedly with ease and are evenly distributed over the face, factors that maximize the accuracy of the initial rigid registration. Regions of no clinical relevance (hair and neck) were excluded to enhance the accuracy of the initial registration. All registration points were digitized twice by a single investigator.
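
This landmark step corresponds to the classical least-squares rigid (Procrustes/Kabsch) solution. The sketch below illustrates the underlying computation under that assumption; it is not Geomagic Studio's implementation, and the function name is ours.

```python
# Illustrative Kabsch/Procrustes solution for corresponding 3D landmarks --
# a sketch of the computation, not Geomagic Studio's implementation.
import numpy as np

def rigid_landmark_fit(moving, fixed):
    """Rotation R and translation t that best map the moving landmarks onto
    the fixed ones in the least-squares sense; inputs are (n, 3) arrays of
    corresponding points (e.g., r-en ... l-ch on both surfaces)."""
    mu_m, mu_f = moving.mean(axis=0), fixed.mean(axis=0)
    H = (moving - mu_m).T @ (fixed - mu_f)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))          # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_f - R @ mu_m
    return R, t

# Usage: fit (R, t) from the seven FaceSCAN landmarks to the seven MSCT
# landmarks, then transform every vertex of the FaceSCAN mesh:
# aligned_vertices = face_scan_vertices @ R.T + t
```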

Semi-automatic registration (seven landmarks)

After manual registration, we conducted a global registration to obtain optimal registration parameters automatically, based on a modified iterative closest point (ICP) rigid registration algorithm, i.e., Procrustes registration according to seven anthropometric landmarks followed by global registration based on the modified ICP algorithm. The modified ICP calculation control options were set as follows: tolerance 0.0 mm, maximum iterations 100, sample size 2000, overlap reduction enabled.
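
Conceptually, the global refinement alternates nearest-neighbor matching with the rigid fit shown above. The following simplified point-to-point ICP sketch uses the control values listed (100 iterations, sample size 2000, tolerance 0.0) and reuses rigid_landmark_fit() from the previous sketch; the loop structure and scipy cKDTree matching are our own simplifications, and Geomagic's overlap reduction (trimming of poorly overlapping correspondences) is omitted.

```python
# Simplified point-to-point ICP -- a sketch of the idea, not the modified ICP
# implemented in Geomagic Studio; overlap reduction is omitted for brevity.
import numpy as np
from scipy.spatial import cKDTree

def icp_refine(moving_pts, fixed_pts, max_iter=100, sample_size=2000, tol=0.0):
    """Iteratively match sampled moving points to their nearest fixed points
    and re-estimate the rigid transform (rigid_landmark_fit from above)."""
    tree = cKDTree(fixed_pts)
    rng = np.random.default_rng(0)
    pts, prev_rms = moving_pts.copy(), np.inf
    for _ in range(max_iter):
        idx = rng.choice(len(pts), size=min(sample_size, len(pts)), replace=False)
        sample = pts[idx]
        dists, nn = tree.query(sample)               # nearest-neighbour matches
        R, t = rigid_landmark_fit(sample, fixed_pts[nn])
        pts = pts @ R.T + t                          # update the whole cloud
        rms = float(np.sqrt(np.mean(dists ** 2)))
        if abs(prev_rms - rms) <= tol:               # tol = 0.0: run to max_iter
            break
        prev_rms = rms
    return pts
```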

Manual registration (15 landmarks)

To investigate whether raising the number of landmarks would affect the accuracy of the Procrustes registration, we added eight further anthropometric landmarks to the manual registration process, i.e., Procrustes registration according to 15 anthropometric landmarks: (8) stomion (sto), (9) gnathion (gn), (10) right zygion (r-zy), (11) left zygion (l-zy), (12) right alare (r-al), (13) left alare (l-al), (14) right gonion (r-go), and (15) left gonion (l-go).

Semi-automatic registration (15 landmarks)

The modified ICP algorithm was again applied to register the textured (FaceSCAN) and untextured (MSCT) surfaces to the optimal fit, i.e., manual registration according to 15 anthropometric landmarks followed by global registration. Finally, all models were merged and exported as Virtual Reality Modeling Language (*.WRL) files for future applications.

Statistical analysis

Quantitative measurements of the registration errors were calculated for each patient. The average absolute distance between the two surfaces was computed, as was the standard deviation (SD) of the distance errors. Surface differences were displayed as color-coded error maps.
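
In terms of computation, this error metric is the unsigned nearest-neighbor distance from each vertex of one registered surface to the other. Below is a minimal sketch of such a measurement, assuming both surfaces are available as vertex arrays (the variable names are hypothetical, and the exact metric used by the analysis software may differ in detail).

```python
# Sketch of the distance-error computation between two registered surfaces;
# face_scan_pts and msct_pts are hypothetical (n, 3) vertex arrays.
import numpy as np
from scipy.spatial import cKDTree

def registration_error(registered_pts, reference_pts):
    """Mean and SD of the unsigned nearest-neighbour vertex distances (mm)."""
    d, _ = cKDTree(reference_pts).query(registered_pts)
    return d.mean(), d.std(), d   # per-vertex d drives the colour-coded map

mean_err, sd_err, d = registration_error(face_scan_pts, msct_pts)
# Mapping each vertex's d through a colormap yields error maps like Fig. 5.
```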

Statistical testing was performed with SAS 9.2 (SAS Institute, Inc., Cary, NC, USA). Differences among the four processes were assessed using one-way analysis of variance (ANOVA) and least significant difference (LSD) tests. P values <0.05 were considered significant.
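
The analysis itself was run in SAS; purely for illustration, the sketch below re-creates the same two steps in Python/scipy. Note that a textbook Fisher LSD uses the pooled ANOVA error variance for each pairwise comparison; the unpooled ttest_ind calls here are a common simplification.

```python
# Illustrative re-creation of the analysis (the study itself used SAS 9.2).
# groups: four lists of per-patient mean distances, one per registration method.
from itertools import combinations
from scipy import stats

F, p = stats.f_oneway(*groups)
print(f"one-way ANOVA: F = {F:.3f}, p = {p:.4g}")

# Fisher's LSD amounts to unadjusted pairwise t-tests once the overall ANOVA
# is significant; a strict LSD would pool the ANOVA error variance instead.
for (i, a), (j, b) in combinations(enumerate(groups, start=1), 2):
    t, p_pair = stats.ttest_ind(a, b)
    print(f"method {i} vs method {j}: p = {p_pair:.4g}")
```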

Results

The average distances between the two surfaces were 0.99 mm [standard deviation (SD) 0.13 mm], 0.77 mm (SD 0.11 mm), 0.99 mm (SD 0.15 mm), and 0.77 mm (SD 0.10 mm) for the four groups, respectively (Table 1). The average absolute distance for all 37 patients is graphed in Fig. 3, allowing the SD between each patient's two registered surfaces to be compared across the four groups. Box plots of the mean average distance and SD were plotted for all methods (Fig. 4). The average number of triangles was 800,232 (range 476,345–1,376,401) for the untextured 3D CT surface and 30,416 (range 22,528–37,392) for the textured 3D skin surface. Reconstructing both the hard 3D and untextured soft tissue models took 2–4 min; identifying the corresponding 7 or 15 anthropometric landmarks on both surfaces took an additional 1.5–3 min.

Tab. 1 Means and standard deviations (SD) of distance errors in four groups for all patients
Fig. 3

Average absolute distance in all 37 patients


Fig. 4

Box plot of the mean average distance and standard deviation for the four methods, illustrating the distribution of the registration errors


The ANOVA revealed significant differences between groups (F = 39.783, P < 0.001). In multiple comparisons (LSD), we observed significant differences between the manual and the semi-automatic registration groups (according to 7 and 15 anthropometric landmarks, respectively). No statistical differences were found between the two manual groups or between the two semi-automatic groups (Fig. 4). After the modified ICP rigid registration, the mean of the average distances across the whole point cloud of both surfaces remained stable at <0.8 mm for all 37 patients. The average mean distance between the two surfaces was recorded in the histogram accompanying each patient's color map (Fig. 5).

Fig. 5

Color-coded surface mismatch error maps for one patient following the different registrations: a manual registration according to seven landmarks, b semi-automatic registration according to seven landmarks + modified ICP (MICP), c manual registration according to 15 landmarks, and d semi-automatic registration according to 15 landmarks + MICP


Discussion

The number of OMSs using computer-assisted surgery (CAS) tools during the diagnosis and treatment planning stages is increasing steadily [6]. A variety of commercial software programs is available for 3D reconstruction of hard and soft tissue models from the DICOM data in CT scans [1, 14, 15]. Most are powerful but have complex operating interfaces. From the surgeon's perspective, however, the real concerns are the software's convenience in clinical application and how time-consuming it is to master and ultimately use.

In this study, the 3D soft and hard tissue models were created with the InVesalius program, a free medical software package used to generate virtual reconstructions of structures in the human body. The software interface is user-friendly and simple: even an OMS with limited computer skills can easily carry out the 3D reconstruction procedure with only basic knowledge. The entire reconstruction process takes <3 min (usually 2–3 min) and has four main steps: (1) loading the DICOM data, (2) selecting the region of interest by building a mask with a predefined or manually set threshold, (3) configuring a 3D surface, and (4) exporting the model data in another format (e.g., *.OBJ, *.VRML, *.STL). These files can be used for rapid prototyping or other purposes.

Due to the poor soft-tissue contrast in CT scans, the reconstructed soft tissue models are insufficiently accurate and do not capture the skin's genuine color or texture, which limits their diagnostic and preoperative planning value. Advances in 3D surface imaging technology are taking OMSs to new horizons, enabling them to accurately document their patients' external facial features before and after surgery. These records then serve as a basis for communicating with patients, surgical planning, medical education, and outcome evaluation [16, 18].

In this study, we captured the textured 3D surfaces using the FaceSCAN3D® Scientific Photo Lab. Its imaging principle is based on the Moiré fringe projection (MFP) technique, an improvement over simple structured light: more facial profile features are captured, especially the topology of the nose [16]. The measurement inaccuracy of the height data is proportional to the length of the field of view (FoV) and, as a rule of thumb, amounts to 1/4000th of the FoV (FaceSCAN3D: approximately 0.2 mm with an 800-mm FoV).

Many maintain that combining 3D surface imaging and CT reconstruction technologies yields the optimal virtual facial evaluation prior to surgical planning, virtual facial cosmetics, and reconstructive surgery, as the approach generates a far more accurate and clinically useful patient-specific facial model that is closer to anatomic reality [17]. Multiple studies have addressed the combination of 3D surface imaging and CT scans using different 3D image fusion software applying various algorithms. However, given the clinical requirements, obtaining a reliable and stable image registration and fusion process remains challenging.

Image registration algorithms can be divided into three types according to the degree of user participation required: automatic, semi-automatic, and manual registration. In the first, the operator need only import the corresponding data into a software package; the registration process then takes place automatically. This approach has proven practical, but the registration time and reliability need to be considered. The semi-automatic type requires the operator to initialize the algorithm, including the initial values or parameter settings. Manual registration is the most interactive, but this comes at the cost of the registration algorithm's accuracy and practicability.

Procrustes registration, one of the manual registration methods, creates a coordinate-transformation matrix that minimizes the mean square error between corresponding landmarks on the two registration surfaces. Previous investigations of Procrustes registration in facial image superimposition emphasized that anthropometric landmarks produce more favorable results than artificial ones; we therefore selected facial anthropometric landmarks for registration in our study [9]. In Procrustes registration, the conventional wisdom is that the more corresponding landmarks are chosen, the more accurate the registration will be. However, we observed no statistical difference between the two manual registration groups (based on 7 and 15 anthropometric landmarks). This may be due to the poor repeatability of the additional eight landmarks (especially r-go and l-go) during the manual registration process. We therefore believe that seven anthropometric landmarks suffice for manual Procrustes registration of the face, as they meet the basic application requirements.

Errors in Procrustes registration are unavoidable because the selected landmarks lie on a curved 3D image surface and are difficult to visualize and identify [2]. Accurate registration requires the full information of the registration surfaces; a correct solution cannot be guaranteed on the basis of corresponding landmarks alone [2]. For clinical diagnosis, treatment planning, and postoperative evaluation, more powerful methods are therefore required. The ICP algorithm is favorable because it allows 3D translation and 3D rotation, which the computer uses, via the two model datasets, to converge on an optimal registration [3, 5, 19]. By applying the ICP algorithm rather than relying solely on landmarks, the entire geometric information of both surfaces can be used to generate a more accurate final alignment.

In our workflow, the Procrustes registration procedure was carried out first; the corresponding landmarks were required only to help the software bring the two surfaces into an approximate initial match [11]. The modified ICP registration method was then applied, reducing the dependence on landmarks and refining the initial manual registration to attain the correct final alignment.

However, even after performing Procrustes registration according to seven anthropometric landmarks followed by global registration with the modified ICP algorithm, differences were still evident. We observed these in relatively large areas around both cheeks, the eyeballs, the nasal alae, and the forehead.

These differences could be due to differences in data acquisition (capture time and capture position) between the MSCT and FaceSCAN systems. An MSCT scan generally takes several minutes (<1 min in our study), whereas the FaceSCAN capture time is under 0.4 s. Since the FaceSCAN capture takes <0.4 s, any movement caused by a patient's breathing or facial expression is negligible. During CT scans, however, it is difficult for patients to hold their breath and maintain a static facial expression in order to minimize motion artifacts. A study by Naudi et al. [11] concluded that capturing CT and 3D photo-realistic data simultaneously can heighten registration accuracy, and we believe that minimizing the time lapse between the two captures enhances it. In the clinical context, however, capturing data from both sources simultaneously would be extremely difficult, and the essential difference between the two imaging principles renders it practically impossible. The patients also held different positions during the two scans: the MSCT scan was taken with the patient supine, while the face scan was taken with the patient sitting. Soft tissue may drape in the supine position during MSCT scanning, and indeed the areas around both cheeks revealed a relatively large registration error in our study. Holding patients in exactly the same position during both scans would enhance registration accuracy [10]. Other factors potentially affecting the registration results should also be considered. For instance, use of a head strap to immobilize the patient during the MSCT scan causes a small soft-tissue deformation of the forehead, which results in registration errors especially in the forehead region. Registration errors in the eyebrow and hairline areas may stem from the sharp-edged areas that the FaceSCAN system captures and reconstructs in high resolution compared with MSCT.

One drawback of MSCT imaging is that precise interocclusal and detailed occlusal data cannot be acquired [12]. To improve the quality and practicability of the patient-specific virtual face, it is essential to upgrade or replace the dental data reconstructed from CBCT/CT with digital dental casts [13]. In this investigation, most of our data were obtained from patients presenting with serious maxillofacial fractures. An inadequate interincisor distance makes it difficult to obtain accurate dental data via standard dental impression techniques or 3D intraoral scanning, and inaccurate dental data would compromise the outcome of virtual surgical reconstruction of craniomaxillofacial fractures. In further studies, we plan to focus on identifying a precise dental data acquisition method for patients with facial fractures and on integrating those data with the model described here.

Conclusion

The results of this investigation demonstrate that Procrustes registration according to seven anthropometric landmarks, combined with global registration based on the modified ICP algorithm, is an optimal technique enabling accurate registration and feasible fusion of 3D face scan images with MSCT-reconstructed data. Image registration errors over the entire virtual face were <0.8 mm. The resulting patient-specific virtual face can serve as an objective communication tool among OMSs for diagnosis, treatment planning, and postoperative evaluation in craniomaxillofacial surgery.