Introduction

Here, we address the problem of the automation of the femur segmentation. The segmentation of the femur is well established. On the one hand, the conventional methods cannot offer satisfactory results, and the shape-based method [1, 2] is preferred in order to obtain a robust and accurate result. On the other hand, deformable models require initialization in order to fit the model with high accuracy. In practice, deformable models are initialized manually which introduce intra-operator error, and it is also time-consuming. Therefore, there is a need for an automatic shape initialization method. Despite the need of organs localization, most researchers focus on improving the segmentation accuracy and only few of them focus on organ position recognition [3, 4].

The initialization task has received little attention. As a result, user interaction methods are commonly applied for initialization. In general, it is sufficient to roughly align the mean shape of the statistical shape models (SSM) to the target. In [5], they interactively drag and drop the mean shape of SSM over the target image to be segmented. This method is relatively easy for 2D images but it is more time-consuming for 3D images due to the need of image rotation. In [6], they manually selected only four landmark points for initialization. In [7], their method helps the user to select, from the SSM, few landmark points which best describe the shape. All those initialization methods require human assistance, which could turn into a cumbersome task given the increasing number of medical images.

Full automation of the initialization can be done by solutions adapted to the specific task. Often those solutions include histogram analysis or morphologic operator. Thus, they often need adaptation to new problems. Also, approaches based on registration using labeled atlas [8] are popular. However, in such methods, registration errors are highly related to inter-individual variability. Robustness can be improved by using multi-atlas, but at the cost of slower process. As an alternative, evolutionary algorithm can be used. In [9], Bayesian network with a particle filter was applied only for 2D fluoroscopic images of the femoral head. The method depends on many heuristics conditions, which limit its reusability. In [10, 11], they used genetic algorithm (GA) [12] to initialize the SSM. From the initial random solutions, the algorithm provides new solutions based on the best previous solutions using cross-over with some degree of mutation. Following the concept “survival of the fittest”, the algorithm iterates until the best solution is chosen. GA is highly dependent on its fitness function that evaluates the probability for survival of each solution. Moreover, those global search methods are known for their slow convergence and often speed-up strategies need to be added. In [13], the general Hough transform (GHT) [14], a template-based method where each point on the image edge vote for the possible template location, was used for vertebra bone. But it was limited to 2D images. In [15], the GHT was extended to 3D and applied for the pelvic bone SSM initialization. However, the orientation and the scale of the bone in the CT image were limited due to the huge need for memory and computational time. Also in [16], they used GHT in order to initialize the SSM of the femoral head. But they decreased the complexity of the method by reducing the number of points in SSM and limit the orientation and the scale.

Among initialization methods described above, in case of 2D images, atlas registration is very popular for simultaneous multi-organ registration, and GHT is preferred for single-organ recognition due to its robustness. But for 3D images, the selection of landmark points is preferred for initialization due to the complexity and the long average duration of the evolutionary algorithms and the memory needed to perform GHT. Moreover, for our case of study, where the femur is spread over 400 slices and the SSM is described by 2,562 points, GHT and GA are difficult to use without harsh limitations. The cost of GA fitness function will be very high: each proposed position needs to be compared point by point. GHT needs size limitation of the target image or region of interest (ROI) selection in order to save memory. In addition, both require harsh limitations on variation of scale, orientation, and translation. Also, the number of the SSM points needs to be reduced during the initialization process. Those limitations make associated methods less general and then less effective for new cases.

In this paper, we propose a new method to automatically initialize the SSM of the femur based on the detection of primitive shapes and without any human intervention or ROI selection. In “Method” section, we explain how to build the SSM, our method for initialization, and the segmentation process. In “Experiments and results” section, we expose the experiments and results.

Method

3D deformable model

Deformable model is very popular as a segmentation tool due to its way of deformation following training shape constraint, which can avoid result degeneration. Here, we briefly explain how we build our SSM. Deformable model is statistically computed by conventional procedures [17] using an 18 CT DICOM image as a training data set. Every femur is segmented manually using mouse clicks. Then, normal bone surfaces are made from the CT volume images using the well-known marching-cubes method [18]. After the surface generation, scaling is performed by referencing the amount of each bone area. Next, the three principle axes, x, y, and z, are determined by eigenvalue analysis of the bone volume in terms of each bone. By corresponding the bone scales and the principle axes, iterative closest point (ICP) algorithm [19] is employed to determine for each bone a group of correspondence points with other bones. The new coordinates of those points are stacked in a vector \({m}_i \in R^{3}\) with \(i\) the number of bones. Then, an average shape \(\overline{m}\) is consequently determined, and principle component analysis (PCA) for \(m_i\) is performed to get their deforming eigenvectors \(\Phi =\left[ \Phi _{1}, \ldots ,\Phi _{M} \right] \) and eigenvalues \(\lambda =\left[ \lambda _{1} ,\ldots ,\lambda _{M} \right] \) where M is the number of principal modes of variation. As explained in [17], a new shape can be described according to the following formula:

$$\begin{aligned} m\approx \overline{m}+\Phi b \end{aligned}$$
(1)

where \(m\) is the new shape and \(b=\left[ b_1 ,\ldots ,b_M\right] \) are mode weights. Upper limit and lower limit are fixed for b as \(-3\sqrt{\lambda }< b < + 3\sqrt{\lambda } \). The result is shown in (Fig. 1) in term of deformation magnitude of each part of the bone.

Fig. 1
figure 1

Deformable shape model with magnitude of deformation

Primitive shapes recognition

Our initialization method is based on the primitive geometry of the femur. When we see a femur, we can easily distinguish a spherical shape for the femoral head and a cylindrical shape for the femoral shaft (Fig. 2a). Based on this simple observation, we tried to detect those two primitive shapes: sphere and cylinder. The recognition of primitives like sphere and cylinder is widely established due to their simplicity in the mathematical equations [20]. In 3D space, a sphere description consists of four parameters \((c_x ,c_y ,c_z ,R)\), with \(c\) denoting the center of the sphere and \(R\) denoting the radius. The most commonly used method for primitive shapes detection is Hough Transform (HT) [21]. The HT is adapted to different shapes like line and circle [22]. The HT for sphere can be considered as

$$\begin{aligned}&\!\!\!(x-x_0 )^{2}+(y-y_0 )^{2}+(z-z_0 )^{2}=R^{2}\end{aligned}$$
(2)
$$\begin{aligned}&\!\!\!|(x-x_0 )^{2}+(y-y_0 )^{2}+(z-z_0 )^{2}-R^{2}|<\varepsilon \end{aligned}$$
(3)

with \((x_0 ,y_0 ,z_0)\) as the sphere center, \((x,y,z)\) the points coordinates in the sphere surface, and \(\varepsilon \) the tolerance. In [23], they used a 2D HT for circle detection and processed 3D CT images, slice by slice, to successfully detect the femoral head. In our case, we used a 3D extension of the HT circle detection [24] based on the Insight Segmentation and Registration Toolkit (ITK) library. This method is based on a radial voting: for each point on the edge, the normal is drawn and points along that normal, within a radius range, are selected for sphere parameters (center and radius) estimation. The vote of each edge point is accumulated into the Hough accumulator space, also called the parameters space (Fig. 3a). The set of parameters receiving the maximum vote number will be considered as the center of the sphere with the corresponding radius. Even without any radius limitation, the femoral head will still have the maximum vote into the accumulator space. However, providing a range for the femoral head radius will make the program faster and reduce the probability of false detection. In our case, we used \(R\in [16;25]\) extracted from literature [25]. In addition, since we are dealing with a bone, which has a high CT value, a simple threshold is used at the beginning to remove soft tissues and thus making the detection faster. Furthermore, a morphological opening [26], erosion followed by dilation, is performed as a preprocessing to remove noise and enhance the separation between the femoral head and the acetabula without extensive properties (Eq. 5). The erosion removes any foreground object smaller than the structured element such as noise or small part beyond the edges. Also, the size of the object is reduced by the size of structured element. Then, the dilation dilates any foreground object by the structured element. But noise has been already removed by the erosion. Then, the object recovers its original size without the removed noise. A small ball of radius 1 is used as a structured element. The opening of an image I by structured element E, with \(\ominus \) for erosion and \(\oplus \) for dilation, is expressed by

$$\begin{aligned}&\!\!\!I \circ E=(I\ominus E)\oplus E\end{aligned}$$
(4)
$$\begin{aligned}&\!\!\!I \circ E\subseteq I \end{aligned}$$
(5)

For the cylinder detection, on the one hand, the use of surface fitting based on common nonlinear regression requires an initial estimation. On the other hand, the use of HT is time- and memory consuming due to the high number of parameters. Furthermore, RANdom SAmple Consensus (RANSAC), which is based on a random selection of a minimal set of points and a trial of all possible combinations of shape parameters until the highest set of points is extracted, is more fitted to detect the cylinder [27]. But the use of a minimal set of points to detect the cylinder is not intuitive, and the cylinder direction seems to cause most of the trouble [28]. Therefore, we followed the method used for point cloud data set from industrial sites to extract cylinders [29]. The method splits the cylinder detection into two parts: direction detection, then radius and position. Here, we only used the first step which consists of a Gaussian sphere to find the direction of the cylinder. The normal of each point on the image edge is projected into the Gaussian sphere. As explained in [29], normal of points of the cylinder will form a great circle in the Gaussian sphere. The normal to this circle is the orientation of the cylinder. More explicitly, first a uniform mesh is applied for a unit sphere: an icosahedron edge mesh is divided into n triangular mesh and each vertex point is projected into the unit sphere which gives a uniform triangular mesh for the unit sphere. Then, the meshed unit sphere is used as Hough space and the normal direction of each edge point is projected into the unit sphere. Finally, for each projected normal, the big circle, which lies on the plane which passes through the sphere center, is drawn and corresponding mesh cells values are incremented as shown in (Fig. 3b). We can observe a symmetry in the sphere due to the symmetry of the human bone. Also, there are two big sets of cells with high intensity, which corresponds to the two femoral shafts. Since the femoral shaft is not a perfect cylinder, the normal of each edge point does not fall in the same cell. The arrow shows the cell that has the highest accumulator value. The normal to that cell is considered as the cylinder direction. The drawback of this method is that a plane can be considered as a cylinder into the Gaussian sphere. That is why a preprocessing to remove planes from the point set is performed.

Fig. 2
figure 2

Femur and primitive shapes: a red circle represents the primitive shape of the femoral head, and the red cylinder represents the primitive shape of femur shaft. b In green, the mean shape of SSM and in red the detected sphere and cylinder. c Detected sphere and cylinder into CT Image

Fig. 3
figure 3

Accumulator space: each edge point vote for sphere position and cylinder direction, a HT accumulator for the sphere detection, b Gaussian sphere accumulator space for the detection of the cylinder direction. The arrow indicates the cell with the highest number of votes

For the radius and position of the cylinder, we use RANSAC [27]. The combination of the RANSAC method with the found direction will make RANSAC less dependent on outliers than more robust and faster. RANSAC is applied along the found axis with a possible deviation of 5 degrees. The biggest set of points found by RANSAC is considered to be the cylinder. More formally, let us consider the point cloud constituted by the edge point of the CT image as \(P=\left\{ {p_1 ,\ldots ,p_N} \right\} \) with \(P\) the set of points and \(N\) number of points, and the result found by RANSAC as \(P_\mathrm{{s}} =\left\{ {p_{\mathrm{{s}}_1 } ,\ldots ,p_{\mathrm{{s}}_n}} \right\} \) with \(P_{\mathrm{s}}\) the set of points of the cylinder, \(n\) number of cylinder points and \(P_\mathrm{{s}} \in P\). Afterward, a connectivity condition is applied to \(P_\mathrm{{s}}\) to keep only connected points. Moreover, a range for femur shaft radius, \(R\in [12;15]\) extracted from literature [25], is used to speed up the cylinder detection process and to avoid the detection of the internal edge of the femur (spongy bone) instead of the external edge. In addition, as preprocessing, a threshold is used to remove soft tissues followed by a morphological opening to remove noise.

To enhance our detection, we used a heuristic knowledge of the femur derived from the SSM to link the relation between the sphere (femoral head) and the cylinder (femur shaft) as the femur head offset, which is the distance between the femoral head center and the natural axe of the femur. Those conditions help to reject false detected pairs of sphere and cylinder, especially when there are two femurs in the CT image. Every time we reject a detected set of points, we remove them from \(P\) so that they will not be detected again.

The algorithm of our cylinder detection is as follows:

  1. 1.

    Remove any existing plane in the set of points.

  2. 2.

    Calculate the normal for each point.

  3. 3.

    Project each normal into a uniform meshed unit sphere and draw its corresponding big circle by incremented cell mesh.

  4. 4.

    Find the cell with the highest value and consider it normal as cylinder direction.

  5. 5.

    Apply RANSAC for cylinder along that direction with a 5 degree of tolerance.

  6. 6.

    Select the highest set of points found by RANSAC.

  7. 7.

    Remove points not connected along the cylinder direction.

  8. 8.

    Extract the cylinder parameters: center, direction and radius.

  9. 9.

    If the found cylinder does not satisfy the heuristic conditions as femur offset, found points are removed and the process is repeated from step 5 but using the next highest value into the unit sphere as cylinder direction.

Primitive shapes detection is also applied for the average shape of SSM to detect the sphere and the cylinder (Fig. 2b). In this case, the number of outliers is very low, and heuristic conditions are not needed. Figure 2c shows the result of sphere and cylinder recognition in the CT Image.

Registration

After the extraction of primitives from the average shape SSM and CT Images, a rigid registration is performed. First, the SSM model is scaled based on the ratio between the sphere radius of SSM and the sphere radius of CT image. Then, the two cylinders directions are registered using dot product between the two vectors to get the angle of rotation. Let us consider \(\vec {V}_{\mathrm{{SSM}}}\) as the normalized vector direction of the detected cylinder into the SSM and \(\vec {V}_{\mathrm{{CT}}}\) the normalized vector direction of the detected cylinder into CT image. The normal of the plane defined by the two vector is given by the cross-product \(\vec {N}=\vec {V}_{\mathrm{{SSM}}} \wedge \vec {V}_{\mathrm{{CT}}}\) and the angle of rotation \(\alpha \) is given by the dot product of the tow direction vector \(\alpha =\cos ^{-1}(\vec {V}_{\mathrm{{SSM}}}\cdot \vec {V}_{\mathrm{{CT}}})\). The SSM primitive is rotated around \(\vec {N}\) by \(\alpha \). After the registration of the two cylinders directions, the next step is to register the femoral head center. The SSM primitive will be rotated around \(\vec {V}_{\mathrm{{CT}}}\) until the sphere center of the SSM fall in the same plane as the sphere center of the target image. Finally, this transformation is applied to the mean shape of SSM for initialization.

Iterative fitting process

First, we roughly segment the bone in the target image using simple conventional methods: Gaussian blur followed by a threshold for CT image. Second, an iso-surface is constructed based on the previous segmentation. Then, the mean shape of SSM is initialized into the image target using the registration method explained before (step 2.3). Finally, the following iterative process is applied to fit the SSM to the segmented image.

  1. 1.

    For each point of the mean shape, the normal direction is calculated.

  2. 2.

    For each point, the nearest point through the normal direction, into the target image, is selected and stored into a vector \(y=[x_0 ,y_0 ,z_0 ,\ldots ,x_{n-1} ,y_{n-1} ,z_{n-1}]\) with \(n\) number of corresponding points.

  3. 3.

    An affine transformation is applied between points of the average shape \(\overline{m}\) and their corresponding points \(y\).

  4. 4.

    To fit the mean shape \(\overline{m}\) with corresponding edge points \(y\), the well-known Levenberg-Marquardt nonlinear regression is used. This fitting consists on the minimization of the distance between \(\overline{m}\) and \(y\) by varying the parameter b using the following optimization equation.

    $$\begin{aligned} O\left( b \right) =\sum \limits _{i=1}^n \left[ {y_i -f\left( {m_i |b} \right) } \right] ^{2} \end{aligned}$$
    (6)

    \(m_i\) is the shape points, \(y_i\) corresponding edges points, \(f\left( {m_i |b} \right) =m_i +\Phi _m\, b\) explained in Eq. (1), \(b\) is the mode weight and \(O(b)\) sum of the squares of the deviations to minimize.

  5. 5.

    The fitting result is used instead of the mean shape \(\overline{m}\) for normal calculation and the process is repeated until convergence.

Experiments and results

Results of primitives detection

We automatically detected the sphere and the cylinder using the proposed method. Twenty CT images of patients were used. All images contain the pelvis and the femur as shown in (Fig. 4). Some images contain screw in the femur shaft or implant instead of the femoral head. The results are compared to the manually traced sphere and cylinder.

Fig. 4
figure 4

Segmentation result using the proposed method. The red surface is the result of the segmentation, a a coronal view, b a 3D planar view

The femoral head was successfully detected as sphere. The two femoral heads always have the two highest values of HT Accumulator. As shown in Table 1, the difference in parameters between manual sphere and automatically detected sphere is calculated, including the difference between the radius and the distance between sphere centers. The results show that the errors between the manually traced sphere and the automatically detected one are minimal. The mean of the spherical center error is 2.12 mm and the mean of radius error is 0.65 mm.

Table 1 The sphere detection results reported as: mean \(\pm \) standard deviation

For the femoral shaft, despite the presence of the pelvis bone which induces a high number of outliers, the cylinder detection succeeded in the majority of the cases with good results as shown in Table 2. The mean of radius error between the automatically detected cylinder and the manual one is 0.41 mm and the mean of angular deviation \(\Delta O\) (angle between the two directions in the common plane) is \(4.43^{\circ }.\) If there is no big difference, the direction of the cylinder can be corrected during the fitting process because the edge of the femoral shaft is easy to extract. The algorithm is more sensitive to the femoral head position. The cylinder detection can fail in cases where the femur is cut and the shaft portion is small relative to the pelvis.

Table 2 The cylinder detection results reported as: mean \(\pm \) standard deviation

Results of segmentation

After the primitives’ detection, we tested the quality of the overall segmentation. Eight CT images of patients were used for this experiment. First experiment is the standard one: initialization of the SSM using our method, segmentation of the CT image based on threshold, and the fitting of the SSM to the CT image using Levenberg-Marquardt as explained in “Iterative fitting process” section. We compared the proposed method, with a manual initialization by dropping and dragging the SSM at the right position into the target image. In addition, we compared it with landmark-based method. We proposed three landmark points into the SSM which best represent the shape and easy to detect into the CT image. And we asked the user to choose same landmark points into the CT images as in [6, 7]. The results of the comparison between each method and the manual segmentation (gold standard) are reported in Table 3 and Fig. 4. The results show that our initialization method yields good result as a manual initialization or landmark-based initialization with an average distance of 1.48 mm. Also, the dice similarity index was used to compare the results of the proposed method with the manual segmentation. The dice similarity coefficient was 87 \(\pm \) 2.6. Results can be improved by enhancing the stability against outlier during the fitting process which is widely described in the literature. The complete segmentation of CT image of 600 slices of \(512\times 512\) took about 3 min (around 2 min for initialization) using a C++ implementation on 64 bit desktop PC (3.10 GHz Core, 16 GB RAM).

Table 3 Results of experiment 1: segmentation results reported as mean, maximum and dice coefficient

The results of the segmentation also depend on how optimal is the detection of the corresponding edge points in the CT image. Thus, we cannot evaluate the initialization correctly. Therefore, we performed a second test to better evaluate our initialization method. The second test consists of the fitting of the SSM to a manually segmented bone. The manual segmentation replaces the corresponding points detected in the CT image using threshold from the previous experiment. Here the image noise cannot mislead the quality of the segmentation. We compare our initialization method with one of the best methods for rigid registration between two shapes ICP [19]. Once the initialization is performed, the iterative process of “Iterative fitting process” section is applied. Results of 8 data sets are shown in Figs. 5, 6 and Table 4. The ICP initialization might look better. However, the ICP method cannot be applied to the CT image. The ICP will fail to detect the position of the femur because of the presence of other bones such as the pelvis in the CT image. Also in this experiment, the difference between the segmentation results using ICP initialization and our method is minimal. The average distance using our method is 1.37 mm. The difference of the average distance error between the ICP initialization and our method is less than 0.1 mm. Also dice coefficient was calculated for the proposed method. All images have high similarity with the manual segmentation (\(\hbox {dice} = 0.89 \pm 0.01\)). The dice index increases compared to the precedent experiment because here there is no outlier on the corresponding edge.

Fig. 5
figure 5

The result of experiment 2 using our initialization. a The result after initialization: the gray color is the mean shape of SSM and the green color is the bone to segment. b The result after the fitting process of the SSM: the blue color is the target bone and the yellow color is the result

Fig. 6
figure 6

The result of experiment 2 using ICP registration. a The result after initialization: the gray color is the mean shape of SSM, and the green color is the bone to segment. b The result after the fitting process of the SSM: the blue color is the target bone and yellow color is the result

Table 4 Results of experiment 2: segmentation results reported as mean \(\pm \) standard deviation and maximum

Discussions

It is known that the SSM performs well with a good initialization but fails badly in the case of a wrong initialization. Our initialization method based on primitive shapes recognition yields a satisfying result. In experiments, our segmentation results were as good as a manual initialization which demonstrates the quality of the initialization. Moreover, our primitive shapes recognition method based on robust algorithms, HT and RANSAC, works well even for data containing many outliers. The drawback of the use of the unit sphere to detect the cylinder is that, if there are many cylinders, the biggest cylinder, in terms of number of points, will be detected first. Also, planes must be removed not to be detected as a cylinder into the unit sphere. In case of image having a small section of the femur shaft, the detection of the cylinder direction can be corrupted by the presence of the pelvis.

Compared to method based on user interaction such as landmark selection, our method releases users from the burdensome task of manual initialization. Other than methods such as GHT, which is based on template shape, this method uses a general approach based on primitive shapes. In addition, due to it memory requirement, the GHT is mostly impractical in 3D without harsh limitations on orientation and scale. GA is very slow especially in 3D because of its fitness function: every solution must be compared to the edge of the CT image. To reduce the processing time of GA, many assumptions and limitations should be considered which make this method less interested. Also in our case, the femur head spread over 400 slices which make GHT and GA almost impossible to use without harsh limitation. Thus, primitive shapes detection method is faster and needs less heuristics conditions. Also, since it is based on primitive shapes, our method is less dependent to the inter-variability between patients, which is an advantage for new cases. For example, our primitives’ recognition can be applied to right or left femur without change. Further, the proposed method is user friendly, such that users can expect the method will work or not just by looking at the image.

Conclusion

We described how to make one of the best segmentation methods (SSM) fully automatic for the femur segmentation. The method is based on primitive shapes: the femoral head has been detected as sphere, and the femoral shaft has been detected as a cylinder. Then, primitives were used for registration to give initial position for the SSM. The segmentation was carried out using the deformation of the SSM. The results are as good as when manual initialization is performed. The whole segmentation process needs less than 3 min. Furthermore, based on primitive shapes our method can be applied for segmentation of other objects, if a primitive shape can be extracted from the object.