Introduction

Cone beam computed tomography (CBCT) is increasingly used for dental surgical planning [1] because of its lower hardware cost and greater accessibility compared with conventional CT. The first step in planning implant surgery is accurate segmentation of the mandibular canal, which provides a safety margin around the facial nerves. These nerves give sensation to the lower lip, tongue and teeth, and if they are damaged, recovery takes about 3 to 6 months [2]. Localization of the canal is usually performed manually by a radiologist; however, manual segmentation is tedious and time-consuming because of the large amount of data to be analyzed.

Early research on automated canal segmentation was performed on CT images. Stein et al. [3] proposed a method based on Dijkstra’s algorithm and restricted Dijkstra’s search, using multiple erosions and dilations, so that it traces inside the bone. Hanssen et al. [4] improved Stein’s method by replacing Dijkstra’s algorithm with fast marching, which leads to more accurate distance results. In [5], the mandible was first segmented by thresholding. Then, the mandibular canal was roughly segmented using image gradients, and a binary mask-based line tracking method was utilized for canal localization. Rueda et al. [6] proposed a framework based on 2D active appearance models and semi-automatic landmarking to extract the mandibular canal, bone and nerve. An adaptive region growing method was employed in [7], in which an initial seed point inside the canal must be chosen by the user.

In recent years, some researchers have attempted to segment the mandibular canal in CBCT images. Localization of the mandibular canal in CBCT data is highly challenging due to the lower dose and higher noise-to-signal ratio of CBCT images compared to conventional CT [8]. A framework based on an active shape model (ASM) and Dijkstra’s algorithm was introduced in [9]. The results were promising; however, the canal path near the ending point was not detected precisely. A fuzzy-connectedness approach was utilized in [10], which leads to accurate segmentation of jaw tissues including the canal. The main disadvantage of fuzzy connectedness is that it is computationally inefficient and slow. A combination of a 3D panoramic volume rendering algorithm and fast marching was employed in [11] to extract the whole region of the mandibular canal. Its performance is highly dependent on the texture features utilized to enhance the mental foramina. The potential of the active shape model (ASM) and the active appearance model (AAM) for automatic segmentation was evaluated in [12]. It was reported that the accuracy of automatic segmentation of the mandibular canal by AAM and ASM methods is inadequate for use in clinical practice.

In recent years, statistical shape models have been successfully applied in many medical image segmentation tasks. Improving the accuracy of statistical shape models in segmentation tasks is still an open problem in medical image analysis. Various studies have been carried out to find corresponding points accurately [13]. Moreover, some researchers have tried to replace principal component analysis with other dimension reduction methodologies [14]. In the last decade, many researchers have utilized active shape models for automatic segmentation of the mandibular canal [9, 12]. However, they failed to achieve an accuracy high enough for safe implant surgery in CBCT images [12]. A mean interobserver variability of 1 mm is possible in clinical practice [15], and the largest error occurs in the anterior loop region due to the incomplete bony wall in combination with the unpredictable recurrent course. Identification and segmentation of the mandibular canal are challenging for several reasons. First, due to the large variation in shape and texture between the mandibles of patients, building a robust statistical shape model is highly challenging. Second, the loss of multiple teeth results in severe bone resorption, and the shape of the mandible changes drastically in these patients. Third, due to the lower contrast of CBCT images compared to conventional CT, automatic segmentation is more challenging. Hence, designing an effective image enhancement and filtering method is essential for CBCT images.

In this article, a framework based on statistical shape models is developed for automatic segmentation of the mandibular canal, and it is applied to a dataset of CBCT images. In the proposed framework, a new preprocessing algorithm based on low-rank decomposition is applied first. Then, a combination of a statistical shape model and fast marching is employed for mandibular bone and canal segmentation. The rest of this paper is organized as follows. In the “Material and methods” section, we introduce the proposed framework for automatic segmentation, which consists of preprocessing, spatial normalization, conditional statistical shape modeling and fast marching. The “Experimental results” section reports the evaluation, followed by the “Discussion” and “Conclusion” sections.

Material and methods

In this section, we introduce the proposed framework for automated segmentation of the mandibular canal. The shape and position of the mandibular canal, mandibular foramen and mental foramen are illustrated in Fig. 1. In CBCT images, mandibular canals often have missing edges. Moreover, the intensity of the canal is similar to that of the surrounding cancellous bone. Thus, it is essential to include a priori shape information in the model. This can be achieved using improved statistical shape models. An overview of the proposed method for mandibular canal segmentation is illustrated in Fig. 2.

Fig. 1 Shape and position of the mandibular canal with respect to the mandibular bone

Fig. 2 Overview of the proposed method for mandibular canal segmentation. The mandibular and mental foramina (condition points in the conditional SSM) are depicted using red cross signs

Dataset

For the research presented in this paper, we collected 120 sets of CBCT images from two dedicated dental imaging centers in Tehran and Guilan provinces, Iran. In both centers, a Sirona Galileos Compact 3D Cone Beam X-Ray Machine was used to acquire the high-resolution structural images. Acquisition parameters of the relevant sequences were as follows: field of view (FOV) = \(12\times 15\times 15\ \hbox {cm}^{3}\), effective dosage \({<}29\, \upmu \hbox {Sv}\) (21 mAs, 85 kV) and isotropic voxel size = 0.3 mm. The study involved 120 subjects with a mean age of \(49.7 \pm 25.2\) years. There were 68 males (56.67 %) and 52 females (43.33 %). All patients were referred to the dental imaging centers to acquire CBCT images for implant surgical planning. A three-dimensional model of each subject is built from a set of 512 axial cross-sectional slices.

Fig. 3 An example of the multi-scale low-rank decomposition. a Original noisy image. b The result of median filtering. c The result of diffusion filtering with the optimized scheme. d–f Filtering results of the proposed method based on low-rank decomposition with different scales: 4, 8 and 16

Image enhancement using multi-scale low-rank decomposition

CBCT images often suffer from low contrast, and image enhancement techniques can improve the contrast of these images. We utilize the combination of multi-scale modeling and low-rank matrix decomposition proposed in [16] for image enhancement. This method was previously utilized for illumination normalization in face recognition. A convex formulation is employed to solve the decomposition efficiently so that the multi-scale image components are incoherent. It is assumed that the 2D image matrix X, with height M and width N, can be decomposed into different scales. In other words, we assume that we are given a multi-scale partition \(\left\{ {Q_i } \right\} _{i=1}^L \) of an \(M\times N\) matrix, in which each block in \(Q_i \) is an order of magnitude larger than the blocks in \(Q_{i-1} \). In order to transform between the data matrix and the block matrices, a block reshape operator \(R_A (Y)\) is defined, which extracts a block A from the matrix Y and reshapes it into an \(m_i \times n_i \) matrix. Given an \(M\times N\) input matrix X and the corresponding multi-scale partition, the following multi-scale low-rank model is proposed in [16]:

$$\begin{aligned} X=\sum _{i=1}^L {Y_i } ,\;\quad Y_i =\sum _{A\in Q_i } {R_A^T (U_A S_A V_A^T )} , \end{aligned}$$
(1)

where \(U_A ,S_A \) and \(V_A \) form singular value decomposition (SVD) of \(R_A (Y_i )\). Given the data matrix X, the goal is to recover \(\left\{ {Y_i } \right\} _{i=1}^L \) from X. This can be achieved using convex programming, and multi-scale low-rank decomposition problem is formulated as follows:

$$\begin{aligned}&\mathop {\mathrm{minimize}}\limits _{Y_1 ,\ldots ,Y_L } \; \sum _{i=1}^L \lambda _i \left\| {Y_i } \right\| _{(i)},\nonumber \\&\mathrm{subject\;to}\;X=\sum _{i=1}^L {Y_i } \end{aligned}$$
(2)

where \(\left\| \cdot \right\| _{(i)} \) is the block-wise nuclear norm for the i-th scale, defined as \(\left\| \cdot \right\| _{(i)} =\mathop {\sum }\nolimits _{A\in Q_i } {\left\| {R_A (\cdot )} \right\| }_{\mathrm{nuc}}\). As before, \(R_A (Y)\) is the block reshape operator, which extracts the block A from the matrix Y. The nuclear norm is the sum of the singular values of a matrix. Its main characteristic is that it is the tightest convex lower approximation to the rank function. The results of filtering with different scale numbers are illustrated in Fig. 3. A quantitative evaluation of the filtering using different metrics is given in Table 1, which also includes the conventional median filter. The PSNR values of the different scales are approximately the same. In addition to PSNR, the root mean square error (RMSE) [17] and the structural similarity index (SSIM) [18] are utilized as quantitative measures to choose the best result. PSNR and RMSE are slightly biased toward over-smoothed results, i.e. an algorithm which filters out not only the noise but also part of the texture will still get a good score. The structural similarity index [18] is a reconstruction quality metric that considers the similarity of the edges (high-frequency content) between the denoised image and the ground truth. To get a good SSIM score, the filtering method should remove the noise and preserve the edges and textures of the objects. Considering these criteria, the second scale gives the best result. Hence, it is the preferred filter for this application. The main advantage of this method is that irregular patterns are suppressed by the low-rank decomposition. Hence, instead of global smoothing, local processing is done. The proposed filtering technique gives the best enhancement in uniform regions, while the edges are preserved.
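The block-wise nuclear norm in Eq. (2) can be illustrated with a minimal NumPy sketch. It assumes non-overlapping square blocks whose size divides the image dimensions and uses each tile directly as \(R_A(Y)\); the function names `blockwise_nuclear_norm` and `svt` are hypothetical and are not taken from [16]. Singular value soft-thresholding is shown because it is the standard proximal step for a nuclear norm penalty, not because the text states which solver was used.

```python
import numpy as np

def blockwise_nuclear_norm(Y, block):
    """Sum of nuclear norms over non-overlapping block x block tiles of Y."""
    M, N = Y.shape
    if M % block or N % block:
        raise ValueError("block size must divide both image dimensions")
    total = 0.0
    for r in range(0, M, block):
        for c in range(0, N, block):
            tile = Y[r:r + block, c:c + block]  # plays the role of R_A(Y)
            # Nuclear norm = sum of the singular values of the tile.
            total += np.linalg.svd(tile, compute_uv=False).sum()
    return total

def svt(tile, tau):
    """Singular value soft-thresholding (proximal operator of tau * nuclear norm)."""
    U, s, Vt = np.linalg.svd(tile, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

# A constant 8 x 8 image is rank one with a single singular value of 8.
image = np.ones((8, 8))
print(blockwise_nuclear_norm(image, 8))  # one 8 x 8 block
print(blockwise_nuclear_norm(image, 4))  # four 4 x 4 blocks
```

Smaller blocks make the penalty more local, which is consistent with the local processing described above.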

Table 1 Quantitative comparison of different low-rank decomposition scales for CBCT enhancement

Fig. 4 Block diagram of the steps performed for building and testing conditional SSM

Building statistical shape model

Statistical shape models (SSMs) are established as a robust tool for 3D segmentation of medical images. The process of building an SSM can be divided into two phases: a learning phase and a segmentation phase. A block diagram of the steps performed in this paper for building and testing the SSM is depicted in Fig. 4. Mandibles differ in size and shape; hence, a normalization step is essential for studying the shape. For this purpose, we first build a reference mandible surface. The mandible segmentations are converted to triangulated surfaces using marching cubes [19]. Each mandible shape is defined by a vector containing the coordinates of a set of landmark points that correspond across mandible instances and are typically located on the boundary of the mandible. To construct the reference shape, the average shape of the 84 training mandibles is calculated from the corresponding surface meshes. The shape correspondences between each individual mandible and the average mandible shape are determined by pair-wise surface registration, which is performed by nonrigid registration [20]. In the registration process, we use free-form deformation as the transformation model, the sum of squared differences as the similarity metric and the gradient descent algorithm for optimization. The point positions are optimized to minimize the model variance and obtain the most compact shape model. Training mandibles are transferred to the reference mandible space using the obtained transformation functions.

A conditional statistical shape model [21] is utilized to embed the information about the position of the mandibular and mental foramina, which are the starting and ending points of the mandibular canal. This avoids treating all regions of the shape equally. The learning phase is explained first. In order to model relations between shapes, let Y and Z be the shape of the mandible and the combined shape of the mandibular and mental foramina, respectively. The conditional distribution of shape Y given a known shape \(Z=Z_0 \) is formulated using the Gaussian conditional density as follows:

$$\begin{aligned} P(Y| Z=Z_0 )=N\left( {\upmu } _{Y| {Z_0 }},\Sigma _{Y| {Z_0 }} \right) , \end{aligned}$$
(3)

with

$$\begin{aligned} {\upmu } _{Y| {Z_0 }}= & {} {\upmu } _Y +\Sigma _{YZ} \Sigma _{ZZ}^{-1} (Z_0 -{\upmu } _Z ),\nonumber \\ \Sigma _{Y| {Z_0 }}= & {} \Sigma _{YY} -\Sigma _{YZ} \Sigma _{ZZ}^{-1} \Sigma _{ZY} . \end{aligned}$$
(4)

where \({\upmu } _Y \) and \({\upmu } _Z \) are the mean shapes of Y and Z in the training set, and \(\Sigma _{YY} ,\Sigma _{YZ} ,\Sigma _{ZY} ,\Sigma _{ZZ} \) are the blocks of the joint covariance matrix, defined as follows:

$$\begin{aligned} \Sigma =\left[ \begin{array}{cc} {\Sigma _{YY} }&{}\quad {\Sigma _{YZ} } \\ {\Sigma _{ZY} }&{}\quad {\Sigma _{ZZ} } \\ \end{array}\right] =\left[ \begin{array}{cc} {\hbox {cov}(Y,Y)}&{}\quad {\hbox {cov}(Y,Z)} \\ {\hbox {cov}(Z,Y)}&{}\quad {\hbox {cov}(Z,Z)} \\ \end{array}\right] \end{aligned}$$
(5)
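Equations (3) to (5) can be sketched numerically as follows. This is an illustrative NumPy example, not the authors' implementation: the function name `conditional_gaussian`, the row-per-subject data layout and the default value of the regularization weight `gamma` are assumptions. The term `gamma * I` added to \(\Sigma _{ZZ}\) corresponds to the ridge regularization used to stabilize the inverse.

```python
import numpy as np

def conditional_gaussian(Y, Z, z0, gamma=1e-3):
    """Conditional mean and covariance of Y given Z = z0.

    Y: (n_samples, dY) training shapes, Z: (n_samples, dZ) condition shapes,
    z0: (dZ,) observed condition, gamma: ridge weight (hypothetical default).
    """
    mu_Y, mu_Z = Y.mean(axis=0), Z.mean(axis=0)
    Yc, Zc = Y - mu_Y, Z - mu_Z
    n = Y.shape[0]
    S_YY = Yc.T @ Yc / (n - 1)   # cov(Y, Y)
    S_YZ = Yc.T @ Zc / (n - 1)   # cov(Y, Z)
    S_ZZ = Zc.T @ Zc / (n - 1)   # cov(Z, Z)
    S_ZZ_reg = S_ZZ + gamma * np.eye(S_ZZ.shape[0])
    # K = S_YZ @ inv(S_ZZ_reg), obtained with a linear solve instead of an
    # explicit inverse (S_ZZ_reg is symmetric).
    K = np.linalg.solve(S_ZZ_reg, S_YZ.T).T
    mu_cond = mu_Y + K @ (z0 - mu_Z)   # Eq. (4), first line
    S_cond = S_YY - K @ S_YZ.T         # Eq. (4), second line
    return mu_cond, S_cond

# Toy data in which Y is exactly twice Z, so conditioning removes all variance.
Z = np.array([[0.0], [1.0], [2.0], [3.0]])
Y = 2.0 * Z
mu, cov = conditional_gaussian(Y, Z, np.array([5.0]), gamma=0.0)
print(mu, cov)
```

In the toy data the conditional covariance vanishes, which illustrates how the condition points restrict the allowed deformation of the model.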

Ridge regression is employed to calculate \(\Sigma _{ZZ}^{-1} \), i.e. \(\Sigma _{ZZ}^{-1} \) is replaced by \((\Sigma _{ZZ} +\gamma I)^{-1}\). The average shape and model deformation of the conventional SSM are the same for all subjects, whereas in the conditional SSM, the average shape fits the patient-specific shape and the model deformation is restricted. The segmentation phase includes the following steps:

  1. Automatic localization of the mandible coordinate system

  2. Initial rough segmentation of the mandibular bone region by thresholding

  3. Fitting SSM using Levenberg–Marquardt algorithm

  4. Refinement of mandibular canal using fast marching which will be explained in the upcoming section.
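Step 2 of the list above, thresholding followed by extraction of the largest connected component, can be sketched as follows. This is a minimal illustration that assumes 6-connectivity and a plain breadth-first search; the connectivity actually used is not stated in the text, and the function names are hypothetical. The default threshold of 210 is the value reported in the text.

```python
from collections import deque
import numpy as np

def largest_component(mask):
    """Return the largest 6-connected component of a 3D boolean mask."""
    labels = np.zeros(mask.shape, dtype=np.int32)
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    best_label, best_size, current = 0, 0, 0
    for seed in zip(*np.nonzero(mask)):
        if labels[seed]:
            continue  # voxel already belongs to a labeled component
        current += 1
        labels[seed] = current
        queue, size = deque([seed]), 0
        while queue:
            z, y, x = queue.popleft()
            size += 1
            for dz, dy, dx in offsets:
                n = (z + dz, y + dy, x + dx)
                inside = all(0 <= n[i] < mask.shape[i] for i in range(3))
                if inside and mask[n] and not labels[n]:
                    labels[n] = current
                    queue.append(n)
        if size > best_size:
            best_label, best_size = current, size
    if best_label == 0:
        return np.zeros(mask.shape, dtype=bool)  # empty input mask
    return labels == best_label

def rough_bone_mask(volume, threshold=210):
    """Threshold the volume and keep only the largest connected region."""
    return largest_component(volume > threshold)

# Toy volume: a 3-voxel bright segment and an isolated bright voxel.
volume = np.zeros((4, 4, 4))
volume[0, 0, 0:3] = 300
volume[3, 3, 3] = 300
print(rough_bone_mask(volume).sum())
```

For full-size CBCT volumes, an optimized labeling routine from an image processing library would be preferable to this pure Python loop.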

Step (1) is performed by a previously reported method [22], which consists of spatial normalization, localization of anatomical landmarks using the statistical landmark model and refinement of the anatomical coordinate system using the average surface image. In step (2), thresholding is performed and the largest connected component is extracted. The threshold value is learned automatically by cross-validation within the training dataset. A threshold value of 210 gives the lowest surface distance in this task. In step (3), the SSM is fitted to a point cloud using the Levenberg–Marquardt algorithm with the following cost function, as in [23]:

$$\begin{aligned} \hat{{b}}= & {} \mathop {\arg \min }\limits _b \left( \frac{1}{N_E }\sum _{x\in E} D(x,S(b))\right. \nonumber \\&\left. +\,\frac{1}{N_q} \sum _{x\in S(b)} {D(x,E)+\frac{\lambda }{M}} \left\| b \right\| ^{2}\right) , \end{aligned}$$
(6)

where b is the shape parameter vector, S(b) is the shape instance defined by parameter b, and E is a cloud of edge points. \(N_E \), \(N_q \), D and M are the number of edge points, the number of vertices in S(b), the distance metric and the number of modes, respectively. The first two terms represent the fitness between a shape instance and the detected edge points, and the last term is a penalty that prevents the shape from moving far from the mean. Because detecting the edge points correctly is so important, we proposed the filtering method in the “Image enhancement using multi-scale low-rank decomposition” section. In Eq. (6), we use the same notation, D(x, A), for two similar distance metrics: the point-to-surface-mesh distance and the point-to-point-cloud distance. Both metrics measure the shortest distance between x and A as follows:

$$\begin{aligned} D(x,A)=\inf \left\{ \left\| {x-p} \right\| ^{2}| {p\in A} \right\} . \end{aligned}$$
(7)

If the argument A is a surface mesh such as the first term in Eq. (6), \(p\in A\) indicates any point on the surface mesh including every point on a triangle consisting of connected three vertices. If A is a point cloud such as the second term in Eq. (6), \(p\in A\) simply indicates a point in the point cloud.
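The cost in Eq. (6) can be evaluated with a short NumPy sketch. Two simplifying assumptions are made here and are not taken from the text: the surface mesh S(b) is approximated by its vertices, so both distance terms become point-to-point-cloud distances, and S(b) is generated by a linear model (mean shape plus a weighted sum of modes). The names `sq_dist_to_set` and `fitting_cost` and the default `lam` are hypothetical. The sketch only evaluates the cost; the minimization by the Levenberg–Marquardt algorithm is not shown.

```python
import numpy as np

def sq_dist_to_set(points, target):
    """Eq. (7) for point sets: for each x, min over p in target of ||x - p||^2."""
    diff = points[:, None, :] - target[None, :, :]
    return (diff ** 2).sum(axis=2).min(axis=1)

def fitting_cost(b, mean_shape, modes, edges, lam=0.1):
    """Eq. (6) with S(b) approximated by its vertices.

    b: (M,) shape parameters, mean_shape: (Nq, 3), modes: (M, Nq, 3),
    edges: (NE, 3) detected edge points, lam: penalty weight (assumed value).
    """
    shape = mean_shape + np.tensordot(b, modes, axes=1)  # assumed linear model
    term_edges = sq_dist_to_set(edges, shape).mean()     # (1/N_E) sum over E
    term_shape = sq_dist_to_set(shape, edges).mean()     # (1/N_q) sum over S(b)
    penalty = lam / len(b) * float(np.dot(b, b))         # (lambda/M) ||b||^2
    return term_edges + term_shape + penalty

# Toy model with one mode that shifts both vertices along the z-axis.
mean_shape = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
modes = np.array([[[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]])
edges = mean_shape + np.array([0.0, 0.0, 2.0])
print(fitting_cost(np.array([0.0]), mean_shape, modes, edges))
print(fitting_cost(np.array([2.0]), mean_shape, modes, edges))
```

With b = 2 the shape coincides with the edge points, so only the penalty term remains, which shows how the penalty trades fit quality against distance from the mean shape.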

Hence, embedding the information about the shape of the mandibular and mental foramen leads to more flexibility in modeling shape variations. In summary, the conditional SSM is fitted to the boundary edge points of the roughly segmented mandible using simple thresholding of CBCT images to obtain initial parameter settings for subsequent segmentation procedures. Edge detection is done based on intensity profile analysis, and the perpendicular direction at each surface point is estimated in each iteration.

Fast marching

Early researchers utilized Dijkstra’s algorithm for finding the shortest path on the graph of the mandible [3, 9]. However, since the boundary of the mandibular bone is not always present or visible in CBCT datasets, Dijkstra’s algorithm often takes shortcuts outside the mandibular canal. Fast marching is a more recent approach to the optimal path problem, which gives more accurate distance results for image volumes. Fast marching [24, 25] is an efficient iterative algorithm for numerical approximation of fronts propagating in \( \mathbb {R}^{n}\) space. A propagating front is defined as a closed hypersurface, each point of which moves with speed function F in the direction of the surface normal. Suppose that \(S(t)\subset \mathbb {R}^{n}\) is the propagating interface in \(\mathbb {R}^{n}\) space. The evolution of the front can be modeled using the Eikonal equation:

$$\begin{aligned} \left| {\nabla T} \right| =\frac{1}{F}, \end{aligned}$$
(8)

where T is the arrival time function, and F is the speed function. In fast marching, one of the most critical parameters is the speed function. Since the canal has a low intensity, we consider a speed function in which the speed is inversely related to the intensity. Thus, the shortest path between the mental and mandibular foramina will be the mandibular canal. The segmented bone surface from the previous stage is utilized to select the background region, and all background pixels are set to a high value. The curved length of the canal (\(L_\mathrm{canal}\)) is then calculated. Next, uniformly distributed normal planes along the canal are estimated, and the local neighborhood of the canal is warped to a small volume \(I_L \) of dimensions \(4\,\mathrm{mm}\times 4\,\mathrm{mm}\times L_\mathrm{canal}\), in which the curved canal becomes a straight line. The intensities of the warped volume, \(I_\mathrm{w} \), are converted to a speed map F as follows:

$$\begin{aligned} F=e^{(I_\mathrm{w} {*}H_g )}+e^{-\left\| {\nabla I_\mathrm{w} } \right\| }, \end{aligned}$$
(9)

where \(I_\mathrm{w} \) and \(H_g \) are the warped volume and Gaussian kernel with \(\sigma =1.5\), respectively. The first speed term involves the smoothed intensity by Gaussian kernel, while the second speed term involves the local gradient. The source and sink are considered as mental and mandibular foramen. Fast marching is performed using speed map F, and the shortest path is detected using Runge–Kutta algorithm [26]. Then, the shortest path is warped back to the original volume.
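To make the role of the speed map concrete, the following is a minimal first-order fast marching sketch on a 2D grid with unit spacing. It is a simplified illustration rather than the implementation used in this work: the actual method operates on the 3D warped volume, and the function name `fast_marching` is hypothetical. The speed map is passed in as an argument, so any positive map, such as the one in Eq. (9), could be used.

```python
import heapq
import numpy as np

def fast_marching(speed, source):
    """Arrival times T solving |grad T| = 1 / speed, starting from `source`.

    speed: 2D array of positive speeds, source: (row, col) tuple of ints.
    """
    rows, cols = speed.shape
    T = np.full(speed.shape, np.inf)
    done = np.zeros(speed.shape, dtype=bool)  # accepted (frozen) pixels
    T[source] = 0.0
    heap = [(0.0, source)]

    def known(r, c):
        """Arrival time of an accepted pixel, infinity otherwise."""
        if 0 <= r < rows and 0 <= c < cols and done[r, c]:
            return T[r, c]
        return np.inf

    while heap:
        _, (r, c) = heapq.heappop(heap)
        if done[r, c]:
            continue  # stale heap entry
        done[r, c] = True
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols) or done[nr, nc]:
                continue
            a = min(known(nr - 1, nc), known(nr + 1, nc))  # upwind value, rows
            b = min(known(nr, nc - 1), known(nr, nc + 1))  # upwind value, cols
            h = 1.0 / speed[nr, nc]
            if abs(a - b) < h:
                # Both axes contribute: solve (t - a)^2 + (t - b)^2 = h^2.
                t_new = 0.5 * (a + b + np.sqrt(2.0 * h * h - (a - b) ** 2))
            else:
                t_new = min(a, b) + h  # one-sided update
            if t_new < T[nr, nc]:
                T[nr, nc] = t_new
                heapq.heappush(heap, (t_new, (nr, nc)))
    return T

# Uniform speed: arrival time grows with distance from the source corner.
T = fast_marching(np.ones((5, 5)), (0, 0))
print(T[0, 3], T[1, 1])
```

Once T is available, the minimal path is obtained by descending T from the sink back to the source; the text uses a Runge–Kutta scheme for that backtracking step, which is not shown here.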

Experimental results

The proposed algorithm is implemented in MATLAB 8.1 [27] and C++ [28] (MS Visual Studio 2013) on a personal computer with a P4 (3 GHz) processor and 8 GB of memory. The image enhancement algorithm explained in the “Image enhancement using multi-scale low-rank decomposition” section is built in MATLAB, and the filtered images are fed into the C++ program for building the SSM. The estimated time for segmenting each dataset of 512 slices with our algorithm is less than 5 min. For manual segmentation, an expert needs at least 1 h to segment 512 slices. The ground truth segmentation of the mandibular canal was provided for each case in the dataset by two radiologists with at least 10 years of experience. The dataset is divided randomly into training and test sets. We use 70 % of the data for training and the rest for testing, which is a common rule of thumb in machine learning [29, 30].
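With 120 subjects, a 70 % training fraction yields 84 training cases and 36 test cases. A random split of this kind can be sketched with the Python standard library; the function name `split_cases` and the fixed seed are assumptions made for reproducibility of the example and are not details reported in the text.

```python
import random

def split_cases(case_ids, train_fraction=0.7, seed=0):
    """Randomly split case identifiers into training and test lists."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)  # seeded for a reproducible example
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]

train, test = split_cases(range(120))
print(len(train), len(test))  # 84 36
```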

Fig. 5 Comparison of generalization ability in conventional SSM and conditional SSM. X-axis and Y-axis represent the number of shape modes and generalization ability, respectively. Black and red curves correspond to conditional SSM and conventional SSM, respectively

Fig. 6 Comparison of specificity and compactness in conventional SSM and conditional SSM. X-axis represents the number of shape modes. Black and red curves in each plot indicate conditional SSM and conventional SSM, respectively

Fig. 7 Illustrative segmentation results of mandible and mandibular canal. a Original CBCT image. b The red and blue contours correspond to the segmentation result by conditional SSM and conventional SSM, respectively. The yellow contour is the ground-truth segmentation performed by a radiologist

Fig. 8 Evaluation results for segmentation accuracy of mandibular bone. Left: box plots of Dice’s coefficient. Right: average symmetric surface distance (ASSD) for conditional SSM, Kroon’s method in [9] and Kainmueller’s method in [12]

Table 2 Metric results for mandibular bone segmentation with significant differences at p value \({<}0.01\)

The first step is preprocessing, which is performed using low-rank decomposition with different scales. We compared the filtering result of the proposed methodology with the conventional median filter and with the diffusion filtering previously proposed by Kroon for CBCT images [31]. Decompositions using different scales (4, 8 and 16) are illustrated in Fig. 3. The best performance is achieved with a block size of 8. By comparing the area identified by a red bounding box in each subfigure, it can be observed that some details are lost in diffusion filtering, whereas these details are preserved in low-rank decomposition. A quantitative comparison of the different filtering methods is reported in Table 1. After preprocessing, the conditional SSM is trained using 84 training sets of CBCT images. Standard measures, namely compactness, generalization ability and specificity, are employed to compare the conditional SSM with the conventional SSM of Cootes et al. [32]. The evaluation metrics are explained thoroughly in “Appendix A.” Figure 5 shows the reconstruction error for conditional and conventional SSM as a function of the number of variation modes. For a given number of modes, the reconstruction error is higher for conventional SSM, i.e. the generalization ability of conditional SSM is better than that of conventional SSM. The specificity and compactness of conditional and conventional SSM are illustrated in Fig. 6. As is evident from this figure, the specificity error is lower for conditional SSM. The compactness C is slightly larger for conditional SSM than for conventional SSM; however, considering the error for each number of modes M, the two methods offer a similar level of compactness, with conditional SSM being at most slightly worse.
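As an illustration of the compactness measure, one common definition is the cumulative variance captured by the first M modes of the model. The sketch below assumes this definition, since Appendix A is not reproduced in this section, and the function name `compactness` is hypothetical.

```python
import numpy as np

def compactness(shapes):
    """Cumulative variance C(M) for M = 1..n_modes.

    shapes: (n_samples, n_coords) array, one flattened shape vector per row.
    """
    centered = shapes - shapes.mean(axis=0)
    # Squared singular values of the centered data give the PCA eigenvalues.
    s = np.linalg.svd(centered, compute_uv=False)
    eigvals = s ** 2 / (shapes.shape[0] - 1)
    return np.cumsum(eigvals)

# Toy example: all variation lies along the first coordinate.
shapes = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 0.0], [2.0, 0.0]])
print(compactness(shapes))
```

Under this definition, a model whose curve reaches a given variance with fewer modes is more compact.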

Table 3 The distance between manual and automatic segmented mandibular canal using fast marching shortest path optimization for a subset of test dataset

Fig. 9 Illustrative result of automatic canal detection using digitally reconstructed radiographs (DRRs) in a sample subject. a Posterior DRR of left half, b lateral DRR of left half. c, d Posterior and lateral DRR with canal overlaid

Figure 7 visualizes sample results of mandible segmentation obtained using the combination of conditional SSM and fast marching. Figure 8 shows box plots of the mandible segmentation accuracy obtained with our method and with two other automatic methods [9, 12], according to the ASSD and Dice criteria, over the whole test dataset. Moreover, Table 2 summarizes the mean, standard deviation and median values of the results presented in Fig. 8. The distances between the manually and automatically segmented mandibular canals (for both the right and left nerve) are reported in Table 3. From the quantitative results, it can be concluded that our method segments the mandibular canal with a good level of accuracy and performs better than the methods previously proposed in [9, 12].
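The two evaluation criteria can be sketched as follows. Dice’s coefficient is computed on binary masks, and ASSD on surfaces represented as point sets; the latter is an assumption of this sketch, since the text does not state how surfaces were sampled. The function names are hypothetical.

```python
import numpy as np

def dice(a, b):
    """Dice's coefficient 2|A and B| / (|A| + |B|) of two binary masks."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom

def assd(surface_a, surface_b):
    """Average symmetric surface distance between two (n, 3) point sets."""
    d = np.linalg.norm(surface_a[:, None, :] - surface_b[None, :, :], axis=2)
    total = d.min(axis=1).sum() + d.min(axis=0).sum()
    return total / (len(surface_a) + len(surface_b))

mask_a = np.array([1, 1, 0, 0])
mask_b = np.array([0, 1, 1, 0])
print(dice(mask_a, mask_b))

points_a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
points_b = points_a + np.array([0.0, 0.0, 2.0])
print(assd(points_a, points_b))
```

If the point coordinates are given in millimetres, the ASSD is also in millimetres, matching the unit used in the reported results.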

Discussion

As mentioned earlier, segmentation of the mandibular canal is a challenging and time-consuming task. The goal of this research was to propose a framework based on a statistical shape model for automatic segmentation of the canal. To this end, we first developed a filtering approach based on low-rank decomposition for CBCT images. The high accuracy of this preprocessing step is essential for fitting the SSM using the Levenberg–Marquardt algorithm, since the accuracy of the fitting depends on the quality of the detected edge points. Then, we segmented the mandible in a patient dataset and used it as the input for the canal localization procedure. Fast marching tries to find the darkest tunnel close to the initial segmentation of the canal, which was obtained by the conditional SSM. Quantitative evaluation of the conditional statistical model was performed using the compactness, specificity and generalization ability measures. Based on Figs. 5 and 6, the overall performance of conditional SSM is superior to that of conventional SSM. Moreover, a combination of conditional SSM and fast marching was utilized for automatic detection of the mandibular canal. Although the accuracy of many methods is inadequate, especially near the starting and ending points of the canal, adding the information of the condition points led to a higher accuracy of the method. Figure 9 illustrates the performance of our method on noisy data.

In order to compare our proposed methodology with previous work, we implemented the methods of Kainmueller and Kroon [9, 12] on our dataset. Kroon utilized statistical shape models to localize the mandibular canal and proposed coherence diffusion filtering to enhance CBCT images. There are various schemes, such as optimized, standard and nonnegative, for solving the discretized diffusion filtering equation. The performance of diffusion filtering with these schemes was previously evaluated in [33], and the optimized scheme outperformed the other schemes in terms of the SSIM index. In this article, a new filtering method based on multi-scale low-rank decomposition is proposed. A quantitative comparison of diffusion filtering and the proposed method based on low-rank decomposition is reported in Table 1. In Kainmueller’s paper, an active shape model was constructed using 106 datasets, and canal segmentation was performed by a Dijkstra’s algorithm based optimization. It was reported that the right nerve and the left nerve could be detected with an average error of 1.0 and 1.20 mm, respectively. Kroon failed to achieve the accuracy desired for clinical practice. Comparative results for mandibular bone segmentation are reported in Table 2. Based on Dice’s coefficient and ASSD (mm), it can be concluded that the proposed method outperforms Kainmueller’s and Kroon’s approaches. Moreover, comparative metric results for canal detection are summarized in Table 4. The average mean curve distance to the respective gold standard was utilized as the evaluation metric in Kainmueller’s paper. The comparative performance of the methods based on this metric is reported in Table 5. Hence, it can be concluded that the proposed methodology has higher generalization ability, as well as robustness to unusual mandible shapes.

Table 4 Average symmetric surface distance (ASSD) for mandibular canal segmentation with significant differences at p value \({<}0.01\)

Table 5 Average mean curve distances to the respective gold standard nerve in mandibular canal segmentation

Fig. 10 Mandibular canal localization accuracy. Subfigures a and b show the Euclidean distance error of the right mandibular canal and the left mandibular canal, respectively

In the methods previously proposed for canal detection, the error in the region of the mandibular and mental foramina is more than 1 mm, which is not sufficient for clinical practice. Figure 10 illustrates the Euclidean distance error between the mandibular canal annotation produced by the proposed method and the expert annotation. As can be seen, the mean error at the nerve entry and exit points is less than 1 mm and the standard deviation is small. This is one of the main advantages of the method proposed in this article.

Figure 11 shows a case in which the accuracy of our method is low because of severe bone resorption. In some cases, such as bone loss resulting from missing teeth or cases with an impacted tooth, there are large variations in mandible shape. The most common cause of bone loss is tooth loss, especially the loss of multiple teeth. When multiple teeth in an area are missing for a long time, facial drooping will occur. One possible way to improve the accuracy in these cases is to increase the number of condition points. However, the accuracy of selecting the condition points will then affect the whole process.

Fig. 11 Illustrative result of a case with severe bone resorption and impacted tooth. a Posterior DRR of left half, b lateral DRR of left half. c, d Posterior and lateral DRR with canal overlaid. Yellow region shows the correct path of mandibular canal

The most challenging part of building statistical shape models is finding corresponding points. Various methods can be utilized to perform this step, such as spherical harmonic basis functions [34] and minimum description length (MDL) [35]. However, these methods are mainly suitable for closed surface objects, or they require manual initialization by anatomical landmarks [13]. When the 3D shapes are not topologically equivalent to a sphere, the accuracy of registration methods decreases significantly. The shape of the mandible does not resemble a sphere, and mapping it to a sphere is not accurate. Furthermore, this shape is not a closed surface since the right and left mandibular canals are removed from the mandible. In future work, we will attempt to develop more efficient methods to find the corresponding points. We aim to utilize the potential of Lie group and Lie algebra theory [36] in this research.

Conclusion

Accurate localization of the mandibular canal is essential in dental implant surgery. The main challenges are the large variation in shape and texture between mandibles, the high level of noise and low contrast in CBCT images and the small dimension of the canal. Many researchers have attempted to utilize statistical shape models for automatic segmentation of the mandibular canal. However, the accuracy of their automatic segmentation is inadequate for use in clinical practice. In this article, we presented an accurate and effective framework which is able to segment the mandibular canal automatically in CBCT images. From the methodological viewpoint, a particular aspect which differentiates the proposed method from existing methods is the combination of anatomical and statistical information, including the positions of the mental and mandibular foramina. The proposed framework based on conditional SSM and fast marching leads to more accurate detection of the canal. A priori information about shape makes the mandibular canal segmentation more robust. Based on the quantitative results, we can conclude that the proposed segmentation framework outperforms two other methods in the literature. Because the shape of the mandibular bone varies between male and female subjects, future work could employ different statistical shape models for male and female subjects and investigate the performance of our method on difficult datasets with severe bone resorption.