1 Introduction

According to the Global Burden of Disease study [1, 2], lower back pain is the single leading cause of disability worldwide. Imaging studies indicate that \(40\,\%\) of patients suffering from chronic back pain showed symptoms of intervertebral disc degeneration (IDD) [3]. Primary treatment for lower back pain consists of non-surgical methods. If these are ineffective, IDD may require a surgical procedure known as spinal discectomy. Approximately 300,000 discectomy procedures, over \(90\,\%\) of all spinal surgical procedures [4, 5], are performed each year in the United States alone, at a total cost of up to $11.25 billion per year.

Discectomy procedure simulation requires patient-specific and robust three-dimensional (3D) representation of vertebral and intervertebral disc structures, as well as existing pathology, of the lumbar spine. Although lumbar vertebral structures have high variability, the prominent features of the bone are consistent within a sample population. This facilitates the incorporation of a statistical shape model with expected variations into a volumetric image segmentation framework. Low image resolution and image artifacts, such as image noise, make biomedical volumetric image segmentation a challenge. Ambiguous image intensity results in incorrect, or even disconnected, boundary detection of the structure of interest. Prior knowledge, such as expected shape and variance within a sample population, can be incorporated through statistical shape models to optimize the image segmentation process.

This paper describes a framework for the construction of statistical shape models (SSMs) of L1 vertebrae and L1-L2 intervertebral discs from computed tomography (CT) and magnetic resonance (MR) images, respectively, of healthy subjects. The generated SSMs are utilized as a reference for knowledge-based priors to optimize segmentation of vertebrae and intervertebral discs in volumetric MR images. These shape models can be incorporated into a controlled-resolution deformable segmentation model of the lumbar spine. Incorporation of strong shape priors would facilitate quantification and analysis of shape variations across healthy subjects. The framework is intended as a tool for improving spine segmentation results, which can then serve as anatomical input to an interactive spine surgery training simulator, in particular for a discectomy procedure [6].

Statistical shape models from nine L1 vertebrae and eight L1-L2 intervertebral discs have been generated to be utilized as shape priors during spine segmentation from volumetric MR images. Correspondence between instances within each model has been established using entropy-based point placement on the image surfaces [7–9], which is independent of any reference bias or surface parameterization techniques. The rest of the paper is organized as follows: Sect. 2 provides an overview of the correspondence and active shape model construction methods; Sect. 3 describes the initial shape model results for vertebrae and intervertebral discs. Finally, Sect. 4 presents a conclusion and future improvements of the implemented method.

2 Method

2.1 Image Dataset and Preprocessing

Datasets provided by the SpineWeb initiative have been utilized for generating shape models of an L1 vertebra and an L1-L2 intervertebral disc. Volumetric CT scans of healthy subjects, along with binary masks, of nine anonymized patients [10] were used for model construction of the L1 vertebra. The CT scans and binary masks had a resolution of \(0.2\,{\times }\,0.3\,{\times }\,1\,\text {mm}^3\). In addition, expert segmentations of the L1-L2 intervertebral disc of eight anonymized patients, with \(2.0\,{\times }\,1.25\,{\times }\,1.25\,\text {mm}^3\) resolution [11], were preprocessed as input to the correspondence and shape model construction method.

These binary images were initially aligned along the first principal mode, and any aliasing artifacts were removed during image preprocessing. The fast marching method was applied to generate distance maps of the binary images, which were used for 3D surface reconstruction and to establish correspondence between instances of both the vertebra and disc shape models.

2.2 Correspondence Establishment

Correspondence establishment is the process of finding a set of points on one two-dimensional (2D) contour or 3D surface that can be mapped to the same set of points in another image. Anatomically meaningful and correct correspondences are of utmost importance, as they ensure correct shape parametrization and shape representation. This can be achieved by co-registering manual landmarks onto the shape boundary in 2D shape space but is challenging in 3D space. Anatomical landmarks are points of correspondence on each shape that match within a sample population [12], which may be manually or automatically placed. Correspondence landmarking may entail identifying matching parts between 3D anatomical structures, which is challenging due to inherent variability within geometry or shape of the anatomical structure across a population [13, 14]. Therefore, landmark placement to establish correspondence for robust statistical analysis is a significant task.

According to Heimann et al. [13], a number of methods for correspondence establishment are feasible, where a generic template mesh is registered onto a set of instances through model-to-model or model-to-image registration to achieve a set of instances with automatic point-to-point correspondences through distance [15]. However, this method introduces a bias through selection of a reference topology [16, 17]. To mitigate the reference bias, Rasoulian et al. [18, 19] utilized forward group-wise registration to establish probabilistic point-to-point correspondences to generate 3D training shapes of L2 vertebrae. Similarly, Mutsvangwa et al. [20] employed rigid and non-rigid registration of pointsets, and implemented a probabilistic principal component analysis (PCA) to mitigate outlier effects of a 3D scapula model. Vrtovec et al. [21] established correspondences through a hierarchical elastic mesh-to-image registration of an extracted reference across 25 lumbar vertebral image volumes. Kaus et al. [22] rigidly aligned a reference triangular mesh to training shapes and then utilized discrete deformable models to locally adapt the reference mesh to segmented volumes, thus propagating the reference pointset across 32 vertebral images. Lorenz et al. [23] performed curvature-adaptive landmark-guided warping and mesh relaxation of a reference mesh across a set of 31 lumbar vertebral image volumes for 3D statistical model construction. Becker et al. [24] parameterized 14 lumbar vertebral shapes to a rectangle by utilizing a graph embedding method, and reduced mesh distortion using energy minimization-based adaptive resampling. Heitz et al. [25] also implemented non-rigid B-spline based warping to construct models of C6 and C7 cervical vertebrae. This list is a reference of 3D vertebral and intervertebral disc statistical shape models and is by no means exhaustive. In contrast, 3D shape variability of intervertebral discs is less explored in the literature. Peloquin et al. [26] constructed a statistical shape model of 12 L3-L4 intervertebral discs from signed distance maps of manually segmented binary images.

This research focuses on the refinement of a correspondence technique introduced by Cates et al. [7, 8] that is independent of structure parameterization or a reference bias. The utilized technique employs a two-stage framework, with soft correspondence establishment in the first stage, and correspondence optimization across all instances of the shape space in the second stage. Soft correspondence is established by automatically placing homologous points on the shape surface through an iterative, hierarchical splitting strategy of particles, beginning with a single particle. A 3D surface can be sampled using a discrete set of N points that are considered random variables \(Z\,{=}\,(X_1,\ldots ,X_N)\) drawn from a probability density function (PDF) p(X). Denoting a specific shape realization of this PDF as \(z\,{=}\,(x_1,x_2,\ldots ,x_N)\), the amount of information contained in each point is the differential entropy of the PDF p(x), which is estimated as the logarithm of its expectation, \(\log \{E(p(x))\}\), with \(E(\cdot )\) estimated by Parzen windowing. The cost function C becomes

$$\begin{aligned} C\{x_1,\dots ,x_N\} = -H(P^i)&= \sum _{j}\log \frac{1}{N(N-1)}\sum _{k\ne j}p(x_j) \nonumber \\&= \sum _{j}\log \frac{1}{N(N-1)}\sum _{l\ne j}G(x_j-x_l,\sigma _j), \end{aligned}$$
(1)

where G is an isotropic Gaussian kernel with standard deviation \(\sigma _j\). These dynamic particles exert repulsive forces on one another within a circle of influence limited by the Gaussian kernel until a steady state is achieved, and are constrained to lie on the shape surface through gradient descent in the tangent plane.
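As an illustration of Eq. (1), the following minimal sketch evaluates the Parzen-window cost for a set of particles. It is not the ShapeWorks implementation: the function names are hypothetical, a single kernel width `sigma` is assumed for all particles instead of a per-particle \(\sigma _j\), and the surface constraint is omitted.

```python
import numpy as np

def gaussian_kernel(d2, sigma, dim=3):
    """Isotropic Gaussian evaluated from squared distances d2."""
    norm = (2.0 * np.pi * sigma ** 2) ** (dim / 2.0)
    return np.exp(-d2 / (2.0 * sigma ** 2)) / norm

def particle_cost(points, sigma):
    """Eq. (1): sum_j log( 1/(N(N-1)) * sum_{l != j} G(x_j - x_l, sigma) ).

    points: (N, dim) array of particle positions.
    sigma:  one shared kernel width (assumption; the text uses sigma_j).
    """
    n, dim = points.shape
    diff = points[:, None, :] - points[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)          # pairwise squared distances
    g = gaussian_kernel(d2, sigma, dim)
    np.fill_diagonal(g, 0.0)                 # exclude the l == j term
    return np.sum(np.log(g.sum(axis=1) / (n * (n - 1))))
```

Under these assumptions the cost is lower for particles that are spread apart than for particles that are clumped together, which is the behaviour the repulsive forces drive towards.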

These correspondences are further optimized by entropy-based energy minimization of particle distribution along gradient descent by balancing the negative entropy of a shape instance with the positive entropy of the entire shape space encompassing all instances (known as an ensemble) [27]. Consider an ensemble \(\epsilon \) consisting of M surfaces, such as \(\epsilon \,{=}\,(z_1,z_2,\ldots ,z_M)\), where points are ordered according to correspondences between these surface pointsets. A surface \(z_k\) can be modeled as an instance of a random variable Z, where the following cost function is minimized:

$$\begin{aligned} Q = H(Z) - \sum _{k}H(P_k). \end{aligned}$$
(2)

The cost function Q favors a compact representation of the ensemble and assumes a normal distribution of particles along the shape surface. Hence, p(z) is modeled parametrically with a Gaussian distribution with covariance \(\varSigma \). This ensemble entropy term can be represented as

$$\begin{aligned} H(Z) \approx \frac{1}{2}\log |\varSigma | = \frac{1}{2}\sum _{k}\log \lambda _k, \end{aligned}$$
(3)

where \(\lambda _k\) are ensemble covariance eigenvalues. This process optimally repositions the particles of the shapes within the ensemble to generate robust shape representations with uniformly-distributed particles.
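The ensemble entropy term can be sketched numerically using the identity that the log-determinant of a covariance matrix equals the sum of the logarithms of its eigenvalues. The snippet below is a hedged illustration only: the function name is hypothetical, and the small regularizer `alpha` is an assumption added here to keep the logarithm finite when there are fewer shapes than coordinates.

```python
import numpy as np

def ensemble_entropy(shapes, alpha=1e-6):
    """Approximate H(Z) as 0.5 * sum_k log(lambda_k + alpha).

    shapes: (M, N, d) array of M surfaces with N corresponding points each.
    alpha:  regularizer (assumption, not taken from the text).
    """
    m = shapes.shape[0]
    z = shapes.reshape(m, -1)
    y = z - z.mean(axis=0)
    # The M x M Gram matrix shares its nonzero eigenvalues with the
    # (N*d) x (N*d) covariance matrix, and is much cheaper to decompose.
    gram = y @ y.T / (m - 1)
    lam = np.clip(np.linalg.eigvalsh(gram), 0.0, None)
    return 0.5 * np.sum(np.log(lam + alpha))
```

A tightly clustered ensemble yields a lower value than a widely varying one, which is why minimizing this term favors a compact representation.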

ShapeWorks was used to establish dense correspondences of 16,384 homologous points on nine L1 vertebral instances, and 4,038 points on eight L1-L2 intervertebral disc instances. The ensemble shapes were respectively normalized according to centroid-referred coordinates, and were further aligned during the correspondence optimization process through iterative Procrustes analysis [28]. Statistical shape models were respectively generated for an L1 vertebra and L1-L2 disc using these point clouds in the manner summarized in Sect. 2.3.
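The centroid normalization and iterative Procrustes alignment can be illustrated with the sketch below. It is a simplified stand-in rather than the ShapeWorks routine: the function names are hypothetical, only translation and rotation are removed (scale normalization is omitted as a simplifying assumption), and rotations are estimated with the standard SVD-based (Kabsch) solution.

```python
import numpy as np

def center(shape):
    """Express an (N, d) pointset in centroid-referred coordinates."""
    return shape - shape.mean(axis=0)

def rotate_onto(shape, ref):
    """Rotate a centred pointset onto a centred reference (least squares)."""
    u, _, vt = np.linalg.svd(shape.T @ ref)
    d = np.sign(np.linalg.det(u @ vt))       # guard against reflections
    fix = np.diag([1.0] * (shape.shape[1] - 1) + [d])
    return shape @ (u @ fix @ vt)

def generalized_procrustes(shapes, iters=10):
    """Iteratively align all shapes to their evolving mean."""
    aligned = np.array([center(s) for s in shapes])
    mean = aligned[0]
    for _ in range(iters):
        aligned = np.array([rotate_onto(s, mean) for s in aligned])
        mean = aligned.mean(axis=0)
    return aligned, mean
```

The fixed iteration count is a convenience for the sketch; a convergence test on the change in the mean shape would be a natural replacement.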

2.3 Construction of a Statistical Shape Model

The shape of an object is the geometrical information that remains after the effects of translation, rotation and scaling have been filtered out [29]. Statistical shape models capturing variations within the L1 vertebra and L1-L2 intervertebral disc populations have been constructed using PCA [30].

The generalized mean shape \(\bar{X}\) and covariance matrix \(\varSigma _X\) can be calculated for the datasets. Assuming that the training dataset covers a set of closely related shapes, correlation between shape points can be represented by a multivariate Gaussian distribution. PCA is utilized to extract the principal modes, which represent data correlation along principal directions within the dataset, to reduce problem dimensionality.

Each eigenvector \(\phi _i\) of \(\varSigma _X\) represents a mode of variation within the training dataset, and the corresponding eigenvalue \(\lambda _i\) captures the variance along that mode, with the largest \(\lambda \) corresponding to the largest deformation. The eigenvalues are sorted in descending order such that \(\lambda _i\,{\ge }\,\lambda _{i+1}\), and the largest t eigenvalues and corresponding eigenvectors are kept so that \(\varPhi _t\,{=}\,(\phi _1,\phi _2,\ldots ,\phi _t)\). A sample shape X can be approximated as a linear combination of the mean shape and the first t modes of variation, \(X\,{\approx }\,\bar{X}+\varPhi _t b_t\), where \(b_t\) is a t-dimensional vector of mode weights. Assuming the mean shape \(\bar{X}\) is located at the origin, three standard deviations along each mode, \(\pm 3\sqrt{\lambda _i}\), capture the expected shape variability with a \(99.7\,\%\) confidence interval.
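The PCA construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the input shapes are assumed to be already aligned and in correspondence, and the standard deviation along each mode is taken as the square root of its eigenvalue.

```python
import numpy as np

def build_ssm(shapes, t):
    """Return mean vector, top-t eigenvalues and modes of an aligned ensemble.

    shapes: (M, N, d) array of corresponding points.
    """
    m = shapes.shape[0]
    x = shapes.reshape(m, -1)
    mean = x.mean(axis=0)
    # Right singular vectors of the centred data are the covariance
    # eigenvectors; singular values come back in descending order.
    _, s, vt = np.linalg.svd(x - mean, full_matrices=False)
    lam = s ** 2 / (m - 1)
    return mean, lam[:t], vt[:t].T

def project(x, mean, modes):
    """Mode weights b_t of a shape vector x."""
    return modes.T @ (x - mean)

def reconstruct(b, mean, modes):
    """Shape approximation: mean + Phi_t b_t."""
    return mean + modes @ b

def mode_extremes(mean, lam, modes, i, k=3.0):
    """Shapes at -k and +k standard deviations along mode i."""
    step = k * np.sqrt(lam[i]) * modes[:, i]
    return mean - step, mean + step
```

With M training shapes at most M-1 modes carry nonzero variance, so keeping t = M-1 modes reproduces every training shape exactly; smaller t trades accuracy for compactness.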

The calculated average shape and expected variations can be incorporated within the discrete deformable simplex model segmentation [6, 31–33] to constrain the model variability and faithfully capture the structure boundary in the presence of image artifacts and noise.

3 Results

3.1 Shape Mean and Variance Evaluation

Figure 1 is a graphical representation of the dense correspondence-based mean shapes of the L1 vertebrae and L1-L2 intervertebral discs. Both mean shapes look qualitatively normal. Figure 2 illustrates the changes in the shapes along the first three principal modes of variation by \(3\sigma \) for vertebrae and intervertebral discs. The first mode for both shape models mainly captures scaling across the population. The maximum vertebral variability (16 mm) is observed at the inferior and superior articular processes and the spinous process. The second and third modes in the vertebral model capture variation and scaling in the transverse processes and in foramen size, respectively. In contrast, the first mode of the intervertebral disc model varies maximally by 7 mm. The second principal mode captured stretching in the lateral parts of the disc, and the third mode captured rotational effects in the lateral part of the disc.

Fig. 1. Correspondence-based mean shapes. (a) Mean L1 vertebra shape from a population of nine instances, viewed from inferior. (b) Mean L1-L2 intervertebral disc shape from a population of eight instances, viewed from superior.

Fig. 2. Graphical representation of shape model variability (in mm) captured by the first three principal modes (\(-3\sigma \) \(\leftarrow \) mean \(\rightarrow \) \(+3\sigma \)) of (a) L1 vertebra, viewed from superior; (b) L1-L2 intervertebral disc, viewed from superior. Red corresponds to the maximum outward signed distance (mm) from the mean shape, while blue corresponds to the maximum inward signed distance (mm) from the mean shape. (Color figure online)

3.2 Statistical Shape Model Evaluation

Shape model correspondences and the constructed statistical models may be evaluated through established metrics, such as model compactness, generalization ability, and specificity [14]. A robust statistical model should have low generalization error, low specificity error and high compactness for the same number of modes. Compactness is the ability of the model to use a minimum number of parameters to faithfully capture shape variance within the dataset. This may be calculated as the cumulative variance captured by the first m modes,

$$\begin{aligned} C(m) = \sum _{i=1}^m \lambda _i, \end{aligned}$$
(4)

where \(\lambda _i\) is the eigenvalue of the i-th mode. Figure 3 graphically illustrates the compactness of the statistical models as a function of the number of modes required to capture \(100\,\%\) of the variation across the population. Each principal mode represents a distinct shape variation amongst the shape population. Both shape models were able to capture the full variance within the first seven principal modes, with \(39.45\,\%\) of the vertebra shape variation and \(71.04\,\%\) of the disc shape variation captured within the first principal mode, respectively.

Fig. 3. Compactness of (a) L1 vertebra and (b) L1-L2 intervertebral disc shape models. \(100\,\%\) of the variation was captured within the first seven modes for both models.

The generalization ability of the statistical model, that is, its ability to represent new instances that are not present in the training dataset, was evaluated by performing leave-one-out experiments. Vertebra and disc statistical shape models were generated using all training samples except one, which was considered the test sample. This test sample was then reconstructed using the statistical shape model, and the root-mean-square (RMS) distance and Hausdorff distance errors were calculated between the reconstructed sample and the original test sample after rigid registration. This method was repeated over the entire vertebra and disc datasets respectively, to calculate an average and worst measure of error for both statistical models. Generalization ability G(m) and its associated standard error \(\sigma _{G(m)}\) can be mathematically represented as

$$\begin{aligned} G(m) = \frac{1}{n}\sum _{i=1}^{n}\mathbb {D}_{i}(m), \quad \quad \sigma _{G(m)} = \frac{\sigma }{\sqrt{n-1}}, \end{aligned}$$
(5)

where \(\mathbb {D}_{i}(m)\) is the RMS or Hausdorff distance error between the test sample and the instantiated shape, n is the number of shapes (i.e. nine L1 vertebrae and eight L1-L2 discs in our study) and \(\sigma \) is the standard deviation of the errors \(\mathbb {D}_{i}(m)\).
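Compactness (Eq. 4) and leave-one-out generalization (Eq. 5) can be sketched together as below. The code is an illustration under assumptions rather than the evaluation pipeline used in this study: function names are hypothetical, compactness is normalized by the total variance so that it reads as a fraction, only the RMS error is computed, and shapes are assumed to be pre-aligned so no rigid registration step is included.

```python
import numpy as np

def pca(x):
    """x: (M, D) shape vectors. Returns mean, eigenvalues, modes (D, M)."""
    mean = x.mean(axis=0)
    _, s, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, s ** 2 / (x.shape[0] - 1), vt.T

def compactness(lam):
    """Cumulative variance C(m), m = 1..len(lam), as a fraction of the total."""
    return np.cumsum(lam) / np.sum(lam)

def rms_distance(a, b, dim=3):
    """RMS point-to-point distance between two corresponding shape vectors."""
    d = (a - b).reshape(-1, dim)
    return np.sqrt(np.mean(np.sum(d ** 2, axis=1)))

def generalization(x, m, dim=3):
    """Leave-one-out G(m) and its standard error sigma / sqrt(n - 1)."""
    n = x.shape[0]
    errors = np.empty(n)
    for i in range(n):
        train = np.delete(x, i, axis=0)      # hold out shape i
        mean, _, modes = pca(train)
        phi = modes[:, :m]
        recon = mean + phi @ (phi.T @ (x[i] - mean))
        errors[i] = rms_distance(recon, x[i], dim)
    # Population standard deviation (ddof=0) divided by sqrt(n - 1)
    # equals the usual sample standard error of the mean.
    return errors.mean(), errors.std() / np.sqrt(n - 1)
```

Calling `generalization(x, m)` for m = 1, 2, ... would give one point per mode count, analogous to the curves plotted for the two models.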

Model specificity is the ability of a model to instantiate only instances that are valid and similar to those in the training dataset. To measure our statistical models' specificity, \((n-1)\) instances were randomly generated within \([-3\lambda \),\(+3\lambda ]\) using our statistical models, and compared to the closest shape in the training dataset. Specificity S(m) and its standard error \(\sigma _{S(m)}\) have been calculated as

$$\begin{aligned} S(m) = \frac{1}{n}\sum _{j=1}^{n}\mathbb {D}_{j}(m), \quad \quad \sigma _{S(m)} = \frac{\sigma }{\sqrt{n-1}}, \end{aligned}$$
(6)

where n is the number of samples, \(\mathbb {D}_{j}(m)\) is the RMS distance error between a randomly generated instance and its nearest shape within the training dataset, and \(\sigma \) is the standard deviation of the errors \(\mathbb {D}_{j}(m)\).
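A specificity estimate in the spirit of Eq. (6) can be sketched as follows. Several choices here are assumptions made for illustration and are not stated in the text: the function name is hypothetical, mode weights are drawn uniformly within three standard deviations (the square root of each eigenvalue) per mode, and shapes are assumed to be pre-aligned shape vectors.

```python
import numpy as np

def specificity(x, m, n_samples, rng, dim=3):
    """Mean nearest-training-shape RMS error of random model instances.

    x:         (n, D) aligned training shape vectors.
    m:         number of modes used to generate instances.
    n_samples: number of random instances (must be at least 2).
    rng:       numpy random Generator.
    """
    n = x.shape[0]
    mean = x.mean(axis=0)
    _, s, vt = np.linalg.svd(x - mean, full_matrices=False)
    sd = s[:m] / np.sqrt(n - 1)              # per-mode standard deviations
    phi = vt[:m].T
    errors = np.empty(n_samples)
    for j in range(n_samples):
        b = rng.uniform(-3.0, 3.0, size=m) * sd   # assumed sampling scheme
        sample = mean + phi @ b
        d = (x - sample).reshape(n, -1, dim)
        rms = np.sqrt(np.mean(np.sum(d ** 2, axis=2), axis=1))
        errors[j] = rms.min()                # distance to the closest shape
    return errors.mean(), errors.std() / np.sqrt(n_samples - 1)
```

Gaussian sampling of the mode weights is a common alternative and would only change the line that draws `b`.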

Fig. 4. Generalization ability and specificity of L1 vertebra and L1-L2 intervertebral disc shape models. Error bars indicate standard error. (a)\(\,{-}\,\)(c) L1 vertebra: (a) generalization (RMS), (b) generalization (Hausdorff), and (c) specificity. (d)\(\,{-}\,\)(f) L1-L2 intervertebral disc: (d) generalization (RMS), (e) generalization (Hausdorff), and (f) specificity.

Results of the vertebra model generalization ability are presented in Fig. 4(a) and (b). For the first mode of variation, the average reconstruction error for an unseen instance is 0.47 mm with a confidence interval of 0.03 mm, with an initial Hausdorff distance of 8.2 mm. This error converges to 0.4 mm with a worst mean error of 7.6 mm. Our vertebra model's cumulative specificity error is 1.43 mm in seven principal modes with negligible standard error. Our vertebra model results are comparable with those in the literature. The model of Vrtovec et al. [21] is more compact, capturing \(52\,\%\) variability within the first principal mode. Rasoulian et al. [18] report a G(m) RMS error of 0.95 mm, with a Hausdorff error of 9 mm within the first principal mode, which decreases to 0.8 mm RMS and 7.5 mm after seven modes. Their model is worse in generalization and specificity, but outperforms ours in model compactness (capturing \(60\,\%\) in the first mode). Our statistical model outperforms that of Kaus et al. [22], whose model reported 1.66 mm mean error after 20 modes, with \(30\,\%\) first mode compactness, constructed with 32 (L1-L4) vertebral training shapes.

Our intervertebral disc model is able to represent unseen instances with an initial RMS error of 0.23 mm and a Hausdorff distance of 2.24 mm, which converge to 0.1 mm RMS error and 0.5 mm worst error after six principal modes. As depicted in Fig. 4(d) and (e), mode 5 introduces a spike in the distance errors. This may be caused by a singular variation within a training sample captured by this mode. The overall effect of this variation is reduced by mode 6, as demonstrated by a reduction in the G(m) error. The disc model specificity reaches a cumulative 1.0 mm RMS error within six modes. Peloquin et al. [26] present comparable results for model compactness of 14 L3-L4 discs, capturing \(70\,\%\) variability within the first mode. They presented a leave-one-out analysis to determine which samples influenced model outliers, demonstrating that PCs\(\,{>}\,4\) had higher influence on the mean shape of the model.

Overall, the compact model transitions coherently, with a tradeoff between compactness and the ability to faithfully represent new, unseen shapes. Some outliers in the first principal mode can be noted in the variant vertebral shape. These outliers may be reduced by increasing the size of the population dataset, as well as by exploring probabilistic PCA instead of simple PCA, which may better account for any outliers in the model. Moreover, large variability exists between the nine vertebra instances, leading to large variability in the L1 vertebral shape model itself, as seen in Fig. 2. An increase in the training dataset would lead to a more robust and faithful vertebral model, better able to represent variability within a population.

4 Future Work and Conclusion

The current shape models can be improved by increasing the size of the training dataset. Moreover, probabilistic PCA can be implemented to capture outliers in the vertebral shape model in the presence of a small training set.

This paper quantifies inter-patient 3D shape variation of an L1 vertebra and an L1-L2 intervertebral disc of the lumbar spine. The constructed shape models have been shown to faithfully capture variance within a population with few particle outliers, capturing \(100\,\%\) variability within the first seven modes. The main advantage of this correspondence method is the lack of a reference bias, as it places particles on the implicit shape surface independent of prior surface parameterization. It also iteratively performs alignment during the correspondence optimization phase in order to mitigate error introduced by shape misalignment.

The calculated strong shape prior knowledge will be incorporated within a multi-surface, multi-resolution simplex deformable segmentation model of lumbar vertebrae and intervertebral discs. The shape models can be registered to the volumetric images and set as template meshes that are allowed to deform and capture the structure boundary while constraining the model according to expected variation. The inherent regularization and surface smoothness parameters in the discrete simplex model enforce mesh smoothness, mitigating the effects of shape model noise. These shape forces can be integrated in a controlled-resolution segmentation pipeline to faithfully capture the structure boundary in the presence of image artifacts, improving on our previous segmentation results [6].