Introduction

The spine is a load-bearing structure and failure of its structural integrity from osteoporosis or osteoclastic tumors can lead to vertebral body collapse, resulting in a narrowing of the spinal canal, spine curvature changes, pain, and additional decrements in quality of life.21 A decrease in bone mineral density (BMD) and deterioration of bone microarchitecture resulting from these pathologies lead to reduced skeletal strength and an increase in fracture risk.16,33 About 700,000 vertebral fractures occur in the US each year as a result of low BMD and osteoporosis, with 1 in 3 women and 1 in 5 men over 50 years of age at risk for fracture.37 Thus, there is a clear need for tools that can accurately predict fracture risk and estimate mechanical properties of vertebrae.

Unlike 2D measurements of BMD obtained using dual X-ray absorptiometry (DXA), quantitative computed tomography (QCT) imaging can account for 3D geometry and bone mineral distribution. Thus, subject-specific QCT images can be used as input for finite element (FE) models (QCT/FEA) to non-invasively estimate mechanical properties.2,7,9,13,17,22,25 Because QCT images integrate bone geometry, mineral distribution, and material property estimates, QCT-based FEA modeling is a promising method for fracture risk prediction that has been widely used in the research setting and is becoming attractive as a clinical management approach. Critical steps in the development of subject-specific QCT/FEA models are the assignment of a material equation that relates the Young’s modulus of each finite element to the greyscale value, or ash density, of the CT images, as well as a failure criterion to mimic failure of bone. A number of studies have sought to improve predictions of vertebral body strength by employing density-elastic modulus and fracture equations.3,7,13,25,27,32 These literature equations were obtained by performing local vertebral coupon experimental testing and linear regression analysis relating compressive elastic modulus and density measured using CT.26 However, this experimental process does not include the entire vertebral body, endplates, cortical bone, and other important characteristics used in the QCT/FEA modeling process, such as boundary conditions, which could potentially affect fracture property outcomes. Chevalier et al. showed that inclusion of the cortical shell and heterogeneous trabecular fabric distribution within their QCT/FEA vertebral models accounts for weak cortical regions and unloads the trabecular centrum.6 Furthermore, following a similar modeling approach, Wu et al. demonstrated that damage of bone occurs closer to the endplates while following trabecular fabric orientations, eventually increasing the QCT/FEA correlations with experimental outcomes, and suggesting that local BMD-based material properties are inaccurate in the peripheral regions away from the centrum.38

Thus, the aims of the current study were twofold: first, to optimize ash density-elastic modulus equations incorporating the whole vertebral body and accounting for all variables used during FE modeling in order to improve QCT/FEA prediction of vertebral stiffness; second, to determine the effect of material equations and failure criteria found in the literature on fracture properties of vertebrae. For this purpose, the optimization process consisted of two different approaches. First, the cadaveric vertebrae were pooled to determine optimal coefficients for the material equation as a function of ash density. Second, the vertebrae were optimized individually to obtain vertebra-specific coefficients, and the average of these coefficients was then calculated to give the final material equation. Subsequently, these two optimized equations, together with a previously proposed material equation,26 were implemented in the QCT/FEA models and run to failure using three different failure criteria. The outcomes of the current study assessed the accuracy and precision of the QCT/FEA models in predicting experimental stiffness and failure loads based on the elastic and failure material equations.

Materials and Methods

Human cadaveric thoracolumbar vertebrae were dissected from cadaveric spines (34 thoracic and 20 lumbar). The vertebral bodies were scanned using QCT and mechanically tested to fracture to assess mechanical properties (i.e., stiffness and fracture load). QCT images were then used to develop QCT/FEA models, and an inverse QCT/FEA method was implemented to optimize the coefficients from the density-modulus material equations. The optimized material equations were then implemented on the QCT/FEA models using three different failure criteria to obtain fracture loads for all specimens.

Experimental Process

Nine (9) human cadaveric torsos were obtained after IRB approval. The mean ± SD age of the specimens was 88.7 ± 9 years (range 73–101). The spines were dissected and DXA imaging was performed using a General Electric Lunar Prodigy system in the anterior–posterior projection to obtain areal BMD (aBMD, in g/cm²) values for each vertebra. A degree of osteoporosis was assigned from the T-score, where a value of −2.5 or below indicates osteoporosis, between −2.5 and −1 osteopenia, and above −1 normal bone (Table 1). Fifty-four (54) vertebral bodies were then dissected from the thoracic (34) and lumbar (20) regions of the nine cadaveric spines. Soft tissues and posterior elements were removed from each vertebra, and the top and bottom regions were then potted with polymethylmethacrylate (PMMA). QCT scanning of the vertebrae was performed using a Siemens Somatom Definition CT scanner (Siemens Healthcare, Forchheim, Germany). Images of the specimens and a calibration phantom (Mindways Inc., Austin, TX) were obtained at 120 kVp, 260 mAs, slice thickness = 0.5 mm, in-plane resolution = 0.5 mm, and reconstructed using a soft algorithm (I31). The imaging acquisition protocol is equivalent to what is routinely implemented in the clinic. Post-imaging, each vertebral body was compressed between two aluminum platens to failure at a rate of 5 mm/min using a Mini Bionix 858 servohydraulic test machine (MTS, Eden Prairie, MN). Force vs. displacement data were collected and the testing was stopped once failure was observed, as indicated by a sudden, visible drop in the force vs. displacement curve. Stiffness was calculated from the linear portion of the force vs. displacement curve (Fig. 1).

Table 1 Cadaveric specimen demographics.
Figure 1

(a) Flowchart describing the QCT/FEA model process: first, QCT images of the specimen are obtained and the entire geometry is segmented to obtain a 3D model and assign FE voxel elements. FE voxels are assigned material properties based on the ash density of the CT voxel, allowing for a heterogeneous material distribution. Finally, the QCT/FEA model is run to obtain fracture properties. (b) Mechanical testing of a single vertebral body specimen involved continuous compressive loading; the asterisk denotes that the bottom surface was constrained in all degrees of freedom to prevent motion of the specimen; the white arrow denotes compressive loading. (c) Example of force vs. displacement outcomes from the experiments and QCT/FEA predictions.

QCT/FEA Model Development

Specimen-specific QCT/FEA model development and material assignment have been previously described in detail.9,10,13,29 Briefly, QCT-DICOM images were imported into Mimics (Materialise, Leuven, Belgium), and the vertebral bodies were manually segmented to include cortical and trabecular bone, and obtain a 3D geometry. PMMA, on top and bottom of the vertebral body, was also segmented to be used during the simulations. Voxel volume meshes were developed for each vertebra using a 1:1 ratio (FE voxel:CT voxel), and each voxel was assigned a Hounsfield unit (HU) value calculated from the QCT image voxel. The calibration phantom in the images was used to convert HUs to equivalent K2HPO4 density, assumed equal to ash density (ρash). This process has been implemented and described in detail in our previous work,9,10,11,12,13 and was accomplished as follows:

$$\rho_{\text{ash}} = \rho_{\text{K}_{2}\text{HPO}_{4}} = m \times \text{HU} + n,$$

where m (slope of linear regression) and n (intercept of linear regression) are parameters determined for each CT imaging scan using the calibration phantom with known densities.
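As a minimal sketch of this calibration step, the snippet below fits the slope and intercept from phantom rod readings and converts voxel HU values to ash density. The function names and the phantom readings are hypothetical illustrations (the actual phantom values and the authors' software are not given in the text), and NumPy is assumed.

```python
import numpy as np

def fit_phantom_calibration(phantom_hu, phantom_density):
    """Least-squares fit of density = m * HU + n from phantom rod readings."""
    m, n = np.polyfit(phantom_hu, phantom_density, deg=1)
    return m, n

def hu_to_ash_density(hu, m, n):
    """Convert Hounsfield units to equivalent K2HPO4 (assumed ash) density."""
    return m * np.asarray(hu, dtype=float) + n

# Hypothetical phantom readings, for illustration only (not from the study).
phantom_hu = np.array([0.0, 100.0, 200.0, 300.0, 400.0])
phantom_density = np.array([0.01, 0.09, 0.17, 0.25, 0.33])  # g/cm^3

m, n = fit_phantom_calibration(phantom_hu, phantom_density)
rho_ash = hu_to_ash_density([100.0, 250.0], m, n)  # densities for two voxels
```

Because m and n are fitted per scan, each specimen's voxel densities would be computed with the calibration from its own image set.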

Material Coefficient Optimization

Each QCT/FEA model was imported into ANSYS Mechanical APDL (ANSYS, Canonsburg, PA). A Young’s modulus value (E, [MPa]) was assigned to each finite element based on ρash and a power-law ash density-modulus relation (\(E = a \times \rho_{\text{ash}}^{b}\)). Poisson’s ratio for bony elements was set to 0.3. PMMA was assigned an E = 2500 MPa and a Poisson’s ratio of 0.5. Boundary conditions, mimicking the experimental process, were applied to the models; the bottom layer of PMMA was fixed in all directions, while a displacement was applied to the top layer to induce compression. Force and displacement data were obtained for each model to calculate stiffness.
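The power-law assignment can be sketched as follows. This is an illustration only: the coefficient values passed in are placeholders (the optimized values are reported in Table 4, not in the text), and the lower bound `e_min` is an assumption added here to keep elements with zero or negative calibrated density from receiving a zero modulus.

```python
import numpy as np

def youngs_modulus(rho_ash, a, b, e_min=0.01):
    """Element-wise E = a * rho_ash**b in MPa.

    e_min is an assumed floor (not from the paper) so that voxels with
    non-positive calibrated density do not yield a zero modulus.
    """
    rho = np.clip(np.asarray(rho_ash, dtype=float), 0.0, None)
    return np.maximum(a * rho ** b, e_min)

# Placeholder coefficients, chosen only to show the call.
E = youngs_modulus([0.5, 0.2, -0.01], a=1000.0, b=2.0)
```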

The optimization process has been previously described in detail.29 Briefly, in order to determine the elastic properties of each vertebra model and the unknown coefficients in the ash density-modulus equation, an objective function in the form of the root mean square (RMS) error was defined between the experimentally measured stiffness (\(K_{i}\)) and the QCT/FEA predicted stiffness (\(\hat{K}_{i}\)) as

$$J = \sqrt{\sum\limits_{i = 1}^{n} \left( K_{i} - \hat{K}_{i} \right)^{2}}$$

where n (n = 54) represents the number of specimens. The constant coefficients in the density–elastic modulus equation used in the QCT/FEA models were changed iteratively to minimize the objective function. A Nelder–Mead simplex search optimization algorithm (MATLAB, MathWorks, Natick, MA) was implemented to identify the unknown coefficients. When the error fell below a predefined tolerance, the process stopped and reported the optimized material coefficients that produced the best match between the experimentally measured stiffness and the QCT/FEA predicted stiffness. To confirm that the optimization outcomes were unique and independent of the initial values, the optimization process of the pooled vertebral bodies was repeated twice with different initial values. The optimization methodology was implemented using two different approaches. First, coefficients were obtained by optimizing all specimens together, where all 54 specimens were modeled and the error from all outcomes was minimized in the process (previously published29). Second, all vertebrae were optimized individually, yielding 54 individual sets of coefficients. These coefficients were then averaged to obtain final values (Table 2).
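The pooled inverse approach can be illustrated with a toy example. The sketch below is not the authors' MATLAB/ANSYS pipeline: it uses SciPy's Nelder–Mead implementation, and it replaces the FE solve with a hypothetical surrogate (`surrogate_stiffness`, a stack of layers acting as springs in series) and synthetic "measured" stiffness values generated from known coefficients, so that recovery of those coefficients can be checked.

```python
import numpy as np
from scipy.optimize import minimize

def surrogate_stiffness(rho_layers, a, b, area=1000.0, layer_height=0.5):
    """Toy stand-in for an FE solve: layers in series, E = a * rho**b (MPa)."""
    E = a * np.asarray(rho_layers) ** b
    return area / np.sum(layer_height / E)  # N/mm

def objective(coeffs, specimens, k_measured):
    """J = sqrt(sum_i (K_i - K_hat_i)^2), pooled over all specimens."""
    a, b = coeffs
    if a <= 0:
        return np.inf  # keep the search in the physically meaningful region
    k_pred = np.array([surrogate_stiffness(r, a, b) for r in specimens])
    return np.sqrt(np.sum((k_measured - k_pred) ** 2))

# Synthetic specimens with different density levels (illustrative only).
rng = np.random.default_rng(0)
specimens = [rng.uniform(lo, lo + 0.1, size=40)
             for lo in np.linspace(0.08, 0.40, 6)]
a_true, b_true = 5000.0, 1.5  # hypothetical "ground truth" coefficients
k_measured = np.array([surrogate_stiffness(r, a_true, b_true)
                       for r in specimens])

result = minimize(objective, x0=[3000.0, 1.0],
                  args=(specimens, k_measured), method="Nelder-Mead",
                  options={"xatol": 1e-4, "fatol": 1e-4, "maxiter": 2000})
a_opt, b_opt = result.x
```

The individual approach would run the same minimization once per specimen (a one-element `specimens` list) and average the resulting coefficient pairs. Repeating the pooled run from a different `x0` mirrors the uniqueness check described above.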

Table 2 Ash density-elastic modulus equations implemented during the optimization process.

Fracture Load Prediction

Using the optimized coefficients for material equations (1) and (2) and a widely used equation from the literature (Eq. (3))26 (Table 2), all QCT/FEA models were then run to failure using three different failure criteria from the literature (Table 3).8,13,29 To simulate element failure, after each simulation load step, a Young’s modulus [E] of 0.01 MPa was assigned to each element in which the von Mises strain exceeded the ash density-based yield strain criterion based on the different equations. Similar to the experimental approach, failure load in the models was indicated by the highest peak in the predicted load data.
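The element-softening rule can be sketched as below. The 0.01 MPa failed modulus is taken from the text; everything else is assumed for illustration: the power-law form `c * rho**d` and its coefficients are placeholders for the criteria listed in Table 3, and the function names are hypothetical.

```python
import numpy as np

FAILED_MODULUS = 0.01  # MPa, as stated in the text

def yield_strain(rho_ash, c, d):
    """Placeholder density-based yield strain (c, d stand in for Table 3)."""
    return c * np.asarray(rho_ash, dtype=float) ** d

def soften_failed_elements(E, von_mises_strain, rho_ash, c, d):
    """After a load step, reduce E for elements whose strain exceeds yield."""
    E_new = np.array(E, dtype=float)  # copy so the input is not modified
    failed = np.asarray(von_mises_strain) > yield_strain(rho_ash, c, d)
    E_new[failed] = FAILED_MODULUS
    return E_new, failed

def failure_load(load_history):
    """Failure load taken as the highest peak of the predicted load curve."""
    return float(np.max(load_history))

# Two elements; with d = 0 the yield strain is a constant 0.01.
E_new, failed = soften_failed_elements(
    E=[500.0, 800.0], von_mises_strain=[0.02, 0.005],
    rho_ash=[0.2, 0.3], c=0.01, d=0.0)
peak = failure_load([100.0, 2500.0, 3100.0, 2900.0])
```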

Table 3 Ash density-power law failure criteria equations used during the QCT/FEA process.

Statistical Analyses

JMP Pro 14 (SAS Institute Inc., NC, USA) was used for all statistical analyses. Outcome data included aBMD from DXA, experimental stiffness and experimental failure loads, and QCT/FEA stiffness and failure loads from three material equations and three failure criteria. While each spine resulted in multiple vertebral body specimens, the vertebrae vary in density distribution, micro-architecture, shape, and size based on the location along the thoracic and lumbar spines. Therefore, the vertebrae were treated as independent observations in our statistical analyses. Coefficients of determination (R²) were calculated using the experimentally measured stiffness as the dependent variable and the QCT/FEA predicted stiffness as the independent or explanatory variable. R² values were calculated with respect to Y = X to indicate the accuracy of the predictions. Additionally, coefficients of determination \(\hat{R}^{2}\) were determined with respect to the regression line Y = mX + b, to determine the precision of the QCT/FEA predictions and aBMD from DXA. Descriptive plots of predicted stiffness were developed and compared to experimentally measured values. Similarly, QCT/FEA predicted failure loads obtained using the three failure criteria equations were correlated to the experimentally measured values.
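The distinction between accuracy and precision can be made concrete with a short sketch. The exact formulas used in JMP are not stated in the text; the version below assumes the common definition 1 − SS_res/SS_tot, with residuals taken either from the identity line (accuracy) or from the fitted regression line (precision). The data are synthetic.

```python
import numpy as np

def r2_identity(measured, predicted):
    """Accuracy: R^2 of measured values with respect to the line Y = X."""
    y = np.asarray(measured, dtype=float)
    x = np.asarray(predicted, dtype=float)
    ss_res = np.sum((y - x) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def r2_regression(measured, predicted):
    """Precision: R^2 with respect to the least-squares line Y = mX + b."""
    y = np.asarray(measured, dtype=float)
    x = np.asarray(predicted, dtype=float)
    m, b = np.polyfit(x, y, deg=1)
    ss_res = np.sum((y - (m * x + b)) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic case: predictions are perfectly correlated but biased.
predicted = np.array([1.0, 2.0, 3.0, 4.0])
measured = 2.0 * predicted + 1.0
precision = r2_regression(measured, predicted)  # 1.0
accuracy = r2_identity(measured, predicted)     # -1.7
```

Under this definition a biased but perfectly correlated model has a precision of 1 while its accuracy can be negative, which is why the two measures are reported separately.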

Results

Fifty-four (54) intact vertebrae were dissected from nine cadaveric torsos after DXA imaging. The vertebrae were then CT-imaged to develop specimen-specific QCT/FEA models and experimentally tested to fracture (Fig. 1). The average (SD) number of voxels per vertebra in the QCT/FEA models, by spine location, was as follows: thoracic spine region [T6-T10; 23 vertebrae]: 148,528 (35,871); thoracolumbar spine region [T11-L1; 18 vertebrae]: 250,512 (54,131); and lumbar spine region [L2-L4; 13 vertebrae]: 295,970 (36,569). The QCT/FEA models were used to optimize the material equations and then run to fracture using various failure criteria. The coefficients (a and b) and (c and d) converged after ~110 and ~100 optimization iterations, respectively, totaling 11,340 QCT/FEA model simulations. The two optimized material equations together with a previously published equation were then used to obtain failure loads via three different criteria, totaling 486 new simulations [3 (failure criteria) × 3 (material equations) × 54 (specimens)]. Additionally, the individual material coefficients were used to obtain failure loads from all 54 specimens, totaling 162 simulations [3 (failure criteria) × 54 (specimens)].
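The reported simulation totals can be reproduced with simple arithmetic. The optimization count below rests on an assumption not spelled out in the text: that every optimization iteration corresponds to one QCT/FEA solve per specimen, for both the pooled and the individual approach.

```python
n_specimens = 54
pooled_iterations = 110      # approximate, coefficients a and b
individual_iterations = 100  # approximate, coefficients c and d

# Assumed: one FE solve per specimen per iteration in both approaches.
optimization_runs = (pooled_iterations + individual_iterations) * n_specimens

# 3 failure criteria x 3 material equations x 54 specimens
failure_runs = 3 * 3 * n_specimens

# 3 failure criteria x 54 specimens with individual coefficients
individual_failure_runs = 3 * n_specimens
```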

Areal BMD values from the thoracic and lumbar vertebrae were pooled and correlated with the experimentally measured fracture loads. While aBMD could predict about 64% of the measured failure loads, it should be noted that standard clinical DXA imaging includes only the L1-L4 lumbar region of the spine. Imaging of the thoracic region is not feasible due to limitations such as the presence of the rib cage. Furthermore, degenerative changes in the vertebrae, such as osteophytes, spondylosis, and facet joint sclerosis, can result in falsely elevated BMD values, leading to incorrect T-score evaluations (Table 1). Table 4 shows the material coefficients for the elastic-modulus equations (1) and (2). Figure 2 shows the elastic modulus variation with ash density for all material equations. There are small differences between Eqs. (2) and (3) in elastic modulus outcomes; however, when all specimens were pooled and optimized together (Eq. (1)), larger modulus values were observed at lower ash densities compared to Eq. (2), with the difference decreasing as density increased. While Eqs. (2) and (3) represent outcomes from individual vertebrae, Eq. (1) minimized the error of a population to increase the accuracy of the prediction. Predicted vs. experimental stiffness outcomes were assessed in terms of model precision and accuracy. Figure 3 presents scatterplots of the measured and predicted stiffness values based on all three equations. The solid and dashed lines show the regression (precision) and Y = X (accuracy) lines, respectively. Precision (\(\hat{R}^{2}\)) and accuracy (R²) estimations for stiffness outcomes are shown in Table 4.

Table 4 Summary table of predicted outcomes. Optimized “a”, “b”, “c”, and “d” coefficients, precision, and accuracy outcomes for the measured and predicted stiffness; measured vs. predicted fracture load outcomes.
Figure 2

Elastic modulus vs. ash-density outcomes for the two optimized material equations and a widely used literature equation. An inverse QCT/FEA optimization process of pooled specimens resulted in a material equation with larger modulus values at lower densities, compared to an individual specimen optimization approach.

Figure 3

Accuracy (Y = X) and precision (Y = mX + b) outcomes for the QCT/FEA predicted stiffness vs. the experimental measured stiffness for all three material equations.

Fracture load prediction outcomes are shown in Table 4. Precision outcomes are shown with their respective equation of the line for all three material equations and all three failure criteria. Material equation (1) resulted in \(\hat{R}^{2}\) values of 0.47, 0.04, and 0.05, indicating that the model can predict up to 47, 4, and 5% of the measured fracture loads with the given inputs. Material equation (2) yielded higher prediction outcomes, predicting 78, 38, and 41% of the failure loads. On the other hand, material equation (3) resulted in \(\hat{R}^{2}\) values of 0.78, 0.01, and 0.10, indicating that the model can predict up to 78, 1, and 10% of the measured fracture loads. While material equation (2) could predict 78% of the loads using a specific failure criterion (criterion-A), it also performed fairly well with the other two failure criteria (criteria B and C). Similarly, material equation (3) could predict 78% of the loads using the same criterion-A; however, it did not perform as well with the other two criteria. Nevertheless, implementing material equations (2) and (3) with failure criterion-A resulted in a robust QCT/FEA method that can predict up to 78% of the experimental failure loads. All 54 specimens were also run to failure using the individual material coefficients obtained during the optimization process (experimental vs. predicted stiffness: \(\hat{R}^{2} = R^{2} = 1\)). Although individual material coefficients were implemented, predicted failure loads using all three failure criteria were poorly correlated with the experimental values: criterion-A (\(\hat{R}^{2} = 0.12,\) positive correlation), criterion-B (\(\hat{R}^{2} = 0.13,\) negative correlation), criterion-C (\(\hat{R}^{2} = 0.12,\) negative correlation). These results emphasize that in addition to an ideal material equation, an optimum failure criterion is needed to obtain robust outcomes.
Figure 4 shows the fracture load variation for the different material equations and failure criteria. This figure indicates that while all material equations could correlate highly with the experimental failure loads, the choice of material equation as well as failure criterion strongly affects the final predictions (Table 4).

Figure 4

Accuracy (Y = X) and precision (Y = mX + b) outcomes for the QCT/FEA predicted failure loads vs. the experimental measured outcomes for all three material and failure criteria equations.

Discussion

The purpose of this study was to optimize ash density-elastic modulus equations incorporating the whole vertebral body and accounting for all variables used during FE modeling, and to assess the accuracy and precision of QCT/FEA in predicting vertebral stiffness and fracture load outcomes using various material equations and failure criteria. We performed mechanical testing on a cadaveric cohort of 54 single vertebral bodies, created QCT/FEA models, and used different material equations and failure criteria to evaluate the robustness of this technique. We developed two sets of density-dependent elastic modulus material equations using an inverse QCT/FEA technique, employed previously published elastic modulus and fracture equations, and compared predicted values with experimental fracture outcomes. Our study showed the importance and robustness of these equations and technique in explaining the variability in mechanical property outcomes; both material equations (2) and (3), along with fracture criterion-A, explained 78% of the variability in fracture load outcomes. This is an important finding: while these elastic material equations resulted in poor accuracy in the predicted stiffness, both yielded good precision and, more importantly, strong correlations with fracture loads. In contrast, material equation (1), obtained by optimizing all the specimens together and showing an acceptable level of accuracy in predicting stiffness, failed to explain fracture load outcomes when implemented in combination with fracture criteria B or C. On the other hand, when the individual material coefficients were implemented, which gave an excellent correlation between the experimental and predicted stiffness (R² = 1), the failure loads predicted with the stated failure criteria were poor and inaccurate. These findings suggest that both elastic modulus and fracture criterion equations are equally important in estimating accurate and precise results.

Various factors affect the risk of vertebral fractures, including bone quality, complex loading conditions present in the spine, existence of lesions in the vertebrae, and spine curvature.14,15,24,31,35 While DXA is able to estimate BMD, it is unable to account for all other variables to provide accurate estimates of mechanical properties of bones. To improve on the current limitations of DXA, axial rigidity analysis and QCT/FEA have stood out as alternative methodologies. In their studies, Buckley et al. showed that axial rigidity analyses of vertebral compressive strength were moderately (R² = 0.61)3 to highly (R² = 0.81)4 correlated with experimental compressive vertebral failure load outcomes. Similarly, Whealan et al. investigated the mechanical properties of cadaveric vertebrae with simulated lesions, concluding that axial rigidity could be used to predict yield loads of vertebrae with lytic defects (R² = 0.69).36 However, while resulting in moderate to high correlations, the axial rigidity approach has several limitations, including a limited number of studies validating the methodology in the spine and with various loading conditions, as compared to QCT/FEA, and the inability to include soft tissues such as intervertebral discs and ligaments. On the other hand, several research groups have implemented QCT/FEA as an alternative to DXA for estimation of vertebral mechanical properties and fracture risk prediction;1,5,7,17,18,19,20,23,28,32 however, there seems to be no consensus in the literature as to what material or failure criterion equations to use. To the best of our knowledge, this is the first study showing a comprehensive computational and experimental evaluation between different optimized material equations and fracture criteria, and their effect on predicted outcomes. 
Results showed that estimated stiffness values differ depending on the material equation chosen, and that the selection of the material equation affected the accuracy of the predictions while showing little to no effect on precision. While a population-based optimization can increase the accuracy of the stiffness predictions, individual optimization of the QCT/FEA models, as well as individual experimental coupon testing of samples,26 resulted in a similar precision.

In addition to stiffness outcomes, material equations can affect the precision of the predicted failure loads. However, a low accuracy in stiffness predictions does not necessarily imply a poor precision in failure load predictions, as indicated by Table 4. While material equations (2) and (3) showed poor accuracy, all three equations showed similar precision in stiffness predictions. Furthermore, Eqs. (2) and (3) could predict up to 78% of the failure load (criterion-A), while Eq. (1) could predict only 47%. This emphasizes the important role of the failure criterion of choice and the interplay between all inputs in the modeling process. Currently, DXA-measured aBMD is the clinical standard for the assessment of bone fracture risk. However, DXA imaging results in a 2D measurement of a complex 3D structure, and does not take into consideration various imposed loading scenarios,30 resulting in poor sensitivity and incorrect fracture predictions.34 Therefore, QCT/FEA might fill the critical gap needed for an additional methodology that can complement the current clinical gold standard of DXA, providing quantitative and objective fracture property measurements. Although the precision between measured and predicted failure loads differs among the three material equations, all three equations resulted in a robust process and outcomes when implemented with failure criterion-A.

Although the material equation proposed by Morgan et al.26 is a widely used and accepted equation associating Hounsfield units (HU) in the CT image with Young’s modulus, in our analysis and with our specimen cohort, individual optimization of the specimens resulted in a material equation that could better predict the experimental fracture loads regardless of the failure criterion implemented in this study. However, the highest failure load predictions were obtained with both of these material equations. In the current study, we defined accuracy as the R² obtained with respect to the line Y = X, and precision as the R² value obtained with respect to the regression line. There are several take-home messages from this study: (1) material equations implemented in the QCT/FEA process affect the accuracy of the stiffness predictions, while showing little to no effect on the precision correlations; (2) material equations will affect the precision of the predicted failure loads; (3) while performing a population optimization improved the accuracy of the predicted stiffness compared to the individual specimen optimization, optimizing the specimens individually yielded similar precision in predicted stiffness and, more importantly, higher failure load prediction correlations using the specific failure criteria; (4) while a widely used literature material equation26 yielded weak accuracy in the predicted stiffness, it resulted in good stiffness precision outcomes, as well as high failure load predictions; (5) clinically, it is important to predict vertebral failure loads, and all three material equations proved robust, especially when Eqs. (2) and (3) were implemented with failure criterion-A; finally, (6) when implementing QCT/FEA to predict fracture properties, it is important to acknowledge that material equations and failure criteria will strongly affect the predicted outcomes. 
When individual material coefficients were implemented to predict fracture loads using the three failure criteria, poor correlations were obtained with the experimental values, emphasizing the need for a critical understanding of bone failure (i.e., criteria) when predicting mechanical properties of bone.

This study has several limitations. While 54 vertebrae were used in the process, only a cohort of 9 cadaveric spines was obtained, preventing us from evaluating the effect of age and sex on our outcomes. Also, a wider range of elastic and failure equations could have provided deeper insight into the difference between these criteria on the robustness of the QCT/FEA technique in vertebral fracture assessment. In this study, only strain-based failure criteria were evaluated, and criterion-C was derived from criterion-B in a previous study, resulting in comparable failure load correlation outcomes; future studies should evaluate other forms of bone failure. A mesh sensitivity analysis was not performed; evaluation of a lower resolution (i.e., larger element size) would have provided information about the robustness of this process using an image resolution that mimics clinical imaging. Additionally, the models were only validated against experimental stiffness and failure load outcomes. Surface strain measurements during compression testing using digital image correlation would have provided additional three-dimensional strain information to further validate our models. Finally, a cross-validation of our optimized material equations was not performed. Because all specimens were used during the optimization process, either individually or pooled, these equations represent a best-fit scenario for a large specimen population covering the thoracic and lumbar regions of the spine and a wide distribution of mineral densities. Though our results strengthened the use of QCT/FEA as a better method for vertebral compression fracture prediction compared to common clinical diagnostic tools such as DXA, it remains to be determined whether this improvement can be extrapolated to additional loading configurations.

In conclusion, we derived density-elastic modulus equations combining an inverse QCT/FEA process with experimental outcomes and evaluated the robustness of various material and failure criteria equations. Both material and fracture criterion equations are equally important in estimating accurate and precise QCT/FEA predictions. High model accuracy in predicting stiffness will not necessarily translate to high precision and/or accuracy in predicted forces. These results suggest that both elastic modulus and fracture criterion equations should be validated against experimental outcomes to better explain the response of the tissue under various conditions.