Introduction

The prevalence of osteoporosis is increasing due to population ageing worldwide [1]. Being a “silent” disease, osteoporosis may not attract the attention of patients and primary healthcare practitioners until a fragility fracture occurs. It has been estimated that one-third of women and one-fifth of men will sustain a fragility fracture after age 50 years [2], which is a great societal burden associated with high morbidity, mortality and healthcare expenditure. Although many modifiable risk factors for bone fragility have been identified previously, better screening, case-finding and monitoring strategies are still needed given the intrinsic limitations of dual-energy X-ray absorptiometry (DXA) [3].

Although DXA is considered the clinical criterion method for osteoporosis diagnosis, almost seventy percent of patients who had recently sustained a low-trauma fracture were diagnosed with osteopenia or normal bone density based on DXA measurements at our institution [4]. This misclassification is due in part to the technique itself which projects the region of interest into a two-dimensional plane. This data acquisition from a projective X-ray technique does not take bone “depth” into account, thus making results susceptible to artefacts related to different bone sizes. DXA also fails to capture other bone geometric information including cortical thickness which has been closely correlated to bone fragility [5]. In addition, DXA is performed routinely at central sites, i.e. the lumbar spine and the proximal femur in clinical practice, while a large proportion of fragility fractures occur at more peripheral sites, e.g. the distal forearm [2], where fracture is an early indication of bone fragility [6].

Peripheral quantitative computed tomography (pQCT) provides both volumetric bone density and bone geometry measurements which correlate well with bone mechanical properties at multiple peripheral sites [7,8,9]. These variables can differentiate between people with and without prevalent fractures [10] and are associated with fracture occurrence during follow-up in both women and men [11, 12]. Therefore, pQCT has a potential role in improving our understanding of bone fragility compared to using DXA alone [4]. Finite element (FE) analysis is a computational method that can provide non-invasive assessment of bone strength in vivo. FE models based on quantitative computed tomography (QCT) images of either the vertebral body or the proximal femur have provided estimates of bone strength that correlate strongly with cadaveric fracture loads [13, 14], and these correlations were reported to be higher than those of bone mineral density (BMD) measured by DXA [13, 15]. FE models of cadaveric forearm or tibia using either QCT or pQCT were established previously [16,17,18]. While these studies showed good correlation between FE properties and fracture failure load, the models were time-consuming to build and imparted considerable radiation to image the entire bone or joint, thus limiting their use in clinical settings. Since it is designed for measurement at the appendicular skeleton, pQCT is a potentially suitable source for FE analysis at peripheral sites with its volumetric data acquisition, spatial distribution of bone density and comparable resolution with QCT. Moreover, pQCT instruments can be acquired and operated at quite moderate cost.

In the current study, we aimed to evaluate the ability of a clinically relevant, pQCT-based FE model (pQCT-FE) to estimate bone strength in patients with recent fragility fractures and age-matched controls and to compare the diagnostic characteristics of DXA, pQCT and pQCT-FE variables.

Methods

Participants

Two groups of participants were recruited for this study. The fracture group was recruited from a fracture liaison service involving multiple disciplines at a tertiary hospital, which improves uptake of osteoporosis intervention guidelines in a cost-effective way [19]. The control group was recruited through electronic and paper-based advertisements from multiple sources, including members of a community education program (University of the Third Age), staff, volunteers, visitors and contacts of the University of Melbourne and the Royal Melbourne Hospital.

General inclusion criteria for both groups were as follows: (1) aged 50 years or above, (2) English-speaking or having an English-speaking family member or friend available, (3) consenting to participate in this study and able to attend a study visit at the Royal Melbourne Hospital. Specific inclusion criteria for the fracture group were as follows: (1) had sustained at least one low-trauma fracture, i.e. a fracture caused by minimal trauma such as a fall from a standing height or less, within 3 months prior to a study visit; (2) for patients with two or more fractures, at least one radius and one tibia without a fracture history or other relevant pathology were available for measurement. The specific inclusion criterion for the control group was as follows: reported no prior history of osteoporosis, low bone density or low-trauma fracture.

General exclusion criteria for both groups were as follows: (1) prior diagnosis of osteoporosis; (2) prolonged (> 3 months) use of osteoporosis therapy, including bisphosphonates, denosumab, selective oestrogen receptor modulators and hormone replacement therapy in the past 2 years; (3) prior therapy with teriparatide or strontium ranelate; (4) other medical conditions which may affect bone health, e.g. hyperthyroidism, hyperparathyroidism, Crohn’s disease, diabetes and Cushing syndrome; (5) currently taking or have recently taken medications which may affect bone health, e.g. glucocorticoid agents, anti-epilepsy drugs and heparin.

DXA scanning and analysis protocol

Standard clinical scans of the lumbar spine (L1–L4) and the right hip were performed using a fan-beam densitometer (Horizon QDR 4500A, Hologic Inc., Bedford, MA, USA). In cases where participants had a right-sided hip replacement or fracture, the left hip was scanned. DXA scans were performed in array mode and were analysed using the manufacturer’s commercial software (v 9.10D). Variables of interest included areal bone mineral density (aBMD) of the lumbar spine (LS aBMD), the total hip (TH aBMD) and the femoral neck (FN aBMD). The 12-month precision of the scanner for the Hologic spine phantom was 0.39% for aBMD.

pQCT scanning and analysis protocol

Scans of the non-dominant radius and tibia were performed using an XCT 3000 pQCT scanner (Stratec Medizintechnik, Pforzheim, Germany) at both the 4% and 66% sites along the limb length. Length was measured from the ulna styloid process to the olecranon process at the forearm and from the base of the medial malleolus to the superior margin of the medial condyle at the tibia. In fracture patients whose non-dominant radius/tibia sustained a fracture, the dominant side was scanned. A scout scan was performed to identify the correct starting line, which was taken as the distal articular surface of the radius or tibia. A single slice at each site was acquired with an in-plane resolution of 0.4 × 0.4 mm and a slice thickness of 2 mm. Scanning speed was 10 mm/s. The manufacturer’s commercial software (version XCT 5.50E) was used to analyse pQCT images for standard pQCT variables. Variables of interest included total volumetric bone mineral density (Tot vBMD), trabecular volumetric bone mineral density (Trb vBMD) and trabecular cross-sectional area (Trb CSA) at the 4% site and cortical volumetric bone mineral density (Crt vBMD), cortical thickness (Crt Thk) and polar stress-strain index (SSIp) at the 66% site.

FE model properties

Detailed methodology of the pQCT-FE models was described elsewhere [20]. Briefly, all pQCT images were exported to MATLAB (version R2016b, MathWorks, Natick, MA, USA), where manual segmentation was performed. A mesh of 0.4 × 0.4 × 2 mm elements was generated from the segmentation and then re-sliced in the z-direction to produce a mesh of 0.4 × 0.4 × 0.4 mm elements. The Young’s modulus was calculated using established equations for the tibia [21] and the radius [18] and was assigned to each element. Each voxel mesh was then used to generate an FE model in Abaqus (version 6.11, Simulia, Dassault Systèmes, Providence, RI, USA).
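The re-slicing and material-assignment steps can be illustrated with a minimal Python sketch. It is not the MATLAB pipeline used in the study: the function names are hypothetical, and the power-law coefficients a and b are placeholders, because the density-modulus equations of [18] and [21] are not reproduced in this text.

```python
import numpy as np

def reslice_slice(density_2d, slice_thickness_mm=2.0, voxel_mm=0.4):
    """Stack copies of one pQCT slice so that each element becomes cubic."""
    n_layers = int(round(slice_thickness_mm / voxel_mm))  # 2.0 / 0.4 -> 5 layers
    # Output shape is (n_layers, rows, cols); every layer holds the same densities.
    return np.repeat(density_2d[np.newaxis, :, :], n_layers, axis=0)

def youngs_modulus(density, a=1.0, b=1.0):
    """Hypothetical power law E = a * density**b (a and b are placeholders)."""
    return a * np.power(density, b)

# Toy 2 x 3 density map in arbitrary units, standing in for a segmented slice.
slice_density = np.array([[0.2, 0.5, 1.0],
                          [0.0, 0.8, 1.2]])
voxels = reslice_slice(slice_density)             # shape (5, 2, 3)
modulus = youngs_modulus(voxels, a=10.0, b=2.0)   # one modulus per element
```

Because a single 2-mm slice is acquired, the five layers carry identical densities; the re-slicing changes the element aspect ratio only and adds no information along the z-direction.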

Four loading cases were considered for all FE models (Fig. 1): axial compression, shear, bending and torsion. Axial compression was simulated by a 0.01-mm displacement of the superior surface towards the inferior surface (Fig. 1a). Shear was simulated by a 0.01-mm displacement of the superior surface in the direction of either the x- or y-axes (Fig. 1b). Bending was simulated by a 0.0001 radian rotation of the inferior surface about either the x- or y-axes (i.e. cross-section neutral axes; Fig. 1c). Torsion was simulated by a 0.0001 radian rotation of the inferior surface about the z-axis (Fig. 1d). The reaction forces and moments predicted from the simulations were divided by the respective applied displacement or rotation to derive the compressive, shear, bending and torsional stiffness (kcomp, kshear, kbend and ktorsion, respectively) of each cross section. The bending and shear stiffness were each taken as the minimum value derived from the two neutral axis directions.
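The stiffness derivation described above can be sketched in a few lines of Python. The dictionary keys and the reaction values are hypothetical toy inputs, not output from the Abaqus models; only the applied displacement (0.01 mm), the applied rotation (0.0001 rad) and the minimum-over-axes rule come from the text.

```python
def section_stiffness(reactions, displacement_mm=0.01, rotation_rad=0.0001):
    """Divide each reaction by the applied displacement or rotation.

    `reactions` maps hypothetical load-case names to reaction forces
    (compression, shear) or reaction moments (bending, torsion).
    """
    k_shear_x = reactions["shear_x"] / displacement_mm
    k_shear_y = reactions["shear_y"] / displacement_mm
    k_bend_x = reactions["bend_x"] / rotation_rad
    k_bend_y = reactions["bend_y"] / rotation_rad
    return {
        "kcomp": reactions["comp"] / displacement_mm,
        # Shear and bending take the weaker of the two axis directions.
        "kshear": min(k_shear_x, k_shear_y),
        "kbend": min(k_bend_x, k_bend_y),
        "ktorsion": reactions["torsion"] / rotation_rad,
    }

# Toy reaction forces and moments for one cross section (illustrative only).
toy_reactions = {"comp": 50.0, "shear_x": 12.0, "shear_y": 9.0,
                 "bend_x": 0.8, "bend_y": 1.1, "torsion": 0.6}
k = section_stiffness(toy_reactions)
```

Taking the minimum over the two axis directions yields a single, conservative shear value and a single bending value per cross section.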

Fig. 1

Loading cases of pQCT-FE. Figures show examples of the 4% radius. a Axial compression loading case. Grey arrows indicate the applied displacement of 0.01 mm in the negative z direction. b Shear loading case. Grey arrows indicate the applied displacement of 0.01 mm in either the positive x direction or the positive y direction. c Torsion loading case. Grey arrow indicates the applied rotation of 0.0001 radians about the z-axis. d Bending loading case. Grey arrow indicates the applied rotation of 0.0001 radians about either the x- or y-axis.

Other data collected

An ethics-approved questionnaire was used to collect other information from participants. The information included date of birth, sex, height, weight, fracture details/history, comorbidities, related medical history and risk factors for osteoporotic fracture according to the FRAX® algorithm (https://www.sheffield.ac.uk/FRAX).

Statistical analysis

Descriptive statistics were expressed as mean ± 1.0 standard deviation (SD). Difference between groups was assessed using a two-sample t test for variables that were normally distributed or the Kolmogorov-Smirnov test for variables that were not normally distributed. Multivariate linear regression models were established to compare means between groups while adjusting for age, height and weight. To identify multicollinearity between variables, multivariate linear regression models were established for each group of variables including DXA, pQCT radius/tibia and pQCT-FE stiffness of 4/66% radius/tibia, from which variance inflation factors (VIFs) were derived for each variable. Age, height and weight were included in each model. Collinearity was assumed with a VIF > 5 [22].
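As an illustration of the collinearity check, the following Python sketch computes VIFs from first principles: each predictor is regressed on the remaining predictors and VIF = 1 / (1 − R²). The analysis in this study was run in SPSS; the function name and the simulated data below are assumptions made for the example only.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (n_samples x n_predictors)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])  # intercept + other predictors
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        residual = y - design @ coef
        r_squared = 1.0 - (residual @ residual) / np.sum((y - y.mean()) ** 2)
        vifs[j] = 1.0 / (1.0 - r_squared)
    return vifs

# Simulated predictors: c is almost a copy of a, b is independent of both.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.05 * rng.normal(size=200)
vifs = variance_inflation_factors(np.column_stack([a, b, c]))
collinear = vifs > 5  # threshold used in this study [22]
```

In this toy example, the two nearly identical predictors exceed the threshold while the independent one stays close to 1.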

For all DXA variables, and pQCT and pQCT-FE variables that varied significantly between the control and fracture groups, a binary logistic regression was established to evaluate their relationship with fracture status. All logistic regression models were adjusted for confounding factors including age, height and weight. Results of logistic regression models were expressed as odds ratio (OR) per SD decrease of the respective variable and its 95% confidence interval (95% CI). In the case of variables determined to be collinear according to the VIFs, only the variable with the highest average OR was included in the analysis.
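The conversion from a logistic regression coefficient to an OR per SD decrease can be sketched as follows. The sketch assumes a coefficient beta expressed per unit increase of the predictor, its standard error se, and a Wald-type 95% CI; the function name and the numerical inputs are illustrative and do not come from the study data.

```python
import math

def or_per_sd_decrease(beta, se, sd, z=1.96):
    """Odds ratio and Wald CI for a one-SD decrease in the predictor.

    beta: log-odds change per unit increase (assumed input)
    se:   standard error of beta
    sd:   standard deviation of the predictor (must be positive)
    """
    point = math.exp(-beta * sd)
    lower = math.exp(-(beta + z * se) * sd)
    upper = math.exp(-(beta - z * se) * sd)
    return point, lower, upper

# Illustrative values: lower predictor values are associated with higher odds.
odds_ratio, ci_low, ci_high = or_per_sd_decrease(beta=-0.5, se=0.1, sd=2.0)
```

Scaling by the SD places predictors measured in different units (for example, aBMD in g/cm² and stiffness in N/mm) on a common footing, so their ORs can be compared directly.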

Specificity, sensitivity and area under the receiver operating characteristic curve (AUROC) were obtained from the logistic regression models to show the ability of each predictor to discriminate between fracture patients and controls. The significance of differences between the AUROCs of key DXA, pQCT and pQCT-FE variables (i.e. those with the highest AUROC value in each group of DXA, standard pQCT and pQCT-FE variables) was determined using the method of DeLong et al. [23], which is a non-parametric method based on U-statistics whose test statistic follows a χ² distribution. All statistical analyses were performed in SPSS (version 25, SPSS Inc., Chicago, IL, USA). All significance levels were set as p < 0.05.
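The link between the AUROC and U-statistics can be made concrete with a short Python sketch: the AUROC equals the proportion of fracture-control pairs in which the fracture patient receives the higher predicted probability, with ties counted as one half. The function names, the toy probabilities and the 0.5 threshold are assumptions for illustration; the sketch does not implement the DeLong comparison itself.

```python
import numpy as np

def auroc(scores_fracture, scores_control):
    """AUROC as a Mann-Whitney U-statistic; tied pairs count one half."""
    pos = np.asarray(scores_fracture, dtype=float)[:, None]
    neg = np.asarray(scores_control, dtype=float)[None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))

def sensitivity_specificity(scores_fracture, scores_control, threshold):
    """Classify as fracture when the predicted probability >= threshold."""
    pos = np.asarray(scores_fracture, dtype=float)
    neg = np.asarray(scores_control, dtype=float)
    sensitivity = float(np.mean(pos >= threshold))
    specificity = float(np.mean(neg < threshold))
    return sensitivity, specificity

# Toy predicted probabilities from a hypothetical logistic model.
fracture_scores = [0.9, 0.8, 0.6, 0.4]
control_scores = [0.7, 0.3, 0.2, 0.1]
area = auroc(fracture_scores, control_scores)
sens, spec = sensitivity_specificity(fracture_scores, control_scores, 0.5)
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUROC summarises discrimination across all thresholds.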

Results

One hundred and eight fracture patients (77 females, 31 males) and 120 controls (85 females, 35 males) were recruited into this study (Table 1). No significant difference was observed in age, height, weight, BMI, sex or radius/tibia length between the fracture patients and controls. Among fracture risk factors, no difference between groups was observed in alcohol consumption. The fracture group had a higher rate of smoking than the control group (p = 0.01). More fracture patients were found to have rheumatoid arthritis than controls (p = 0.03), and their parental hip fracture incidence was also greater (p = 0.01). For fracture patients, the average interval between fracture and the study visit (mean ± SD) was 56.6 ± 13.2 days. Most fractures sustained were non-vertebral. Colles’ fractures accounted for the major proportion of all fractures (60.7%), followed by lower leg (13.9%) and humerus (9.0%) fractures (Table 1).

Table 1 Characteristics of study participants

A difference between the fracture group and the control group was observed in TH aBMD but not in DXA aBMD variables at other sites; however, the difference was not significant after adjustment for age, height, weight and sex (Table 2). Several standard pQCT variables differed between groups before adjustment for age, height, weight and sex, but the only variables with significant difference between groups after adjustment were Trb vBMD at both the radius and the tibia (p = 0.01 and 0.02 for the radius and the tibia, respectively). No pQCT-FE variables differed between groups at the 4% radius site. At the 66% radius site, kcomp and kbend were lower in the fracture patients than in the controls after adjustment for age, height, weight and sex (p = 0.04 and 0.03, respectively). At the 4% tibia site, all pQCT-FE variables differed between groups before adjustment for age, height, weight and sex, although the differences were not statistically significant for kbend and ktorsion after adjustment for age, height, weight and sex. At the 66% tibia site, all pQCT-FE stiffness variables except ktorsion differed between groups before and after adjustment for age, height, weight and sex. Similar results in different DXA, pQCT and pQCT-FE variables were observed between groups when females and males were analysed separately (Suppl. Tables 1 and 2).

Table 2 Comparison of different properties between the fracture and control groups in all participants. Variance inflation factors (VIFs) were calculated for the properties of each modality. Values are expressed as mean ± 1.0 SD

Age, height and weight did not exhibit collinearity with any group of DXA, pQCT and pQCT-FE variables in any of the regression models, with all corresponding VIF < 5 (Table 2). Collinearity, however, was identified among the other predictors within each group. Among DXA variables, the highest VIF was found with TH aBMD (VIF = 5.1). For standard pQCT variables, radius Tot vBMD (VIF = 11.0) and tibia SSIp (VIF = 16.0) had the highest collinearity with other standard pQCT variables. Strong collinearity with high VIF values was observed in all groups of pQCT-FE properties, indicating strong correlations among the stiffness estimates of the four loading cases at the same site, especially between kcomp and kshear (e.g. at 4% tibia, VIF = 641.8 and 598.6, respectively). Similar trends were observed when females and males were analysed separately (Suppl. Tables 1 and 2).

The ability of each DXA, standard pQCT and pQCT-FE property to classify between fracture and control groups (i.e. OR, specificity, sensitivity and AUROC derived from the logistic regression models) in all participants, and females and males separately, is shown in Table 3. In the pooled analysis, odds of fracture increased 1.53-fold per SD decrease in DXA TH aBMD [95% CI (1.01, 2.15)]. Odds of fracture increased more per SD decrease of tibia Trb vBMD and 4% tibia kshear, with ORs of 7.64 [95% CI (1.92, 26.51)] and 9.13 [95% CI (1.87, 31.36)], respectively. The highest AUROC was observed with 4% tibia kshear, which was 0.79 [95% CI (0.74, 0.84)], compared with 0.69 [95% CI (0.62, 0.75)] for TH aBMD and 0.74 [95% CI (0.68, 0.80)] for tibia Trb vBMD. Specificity and sensitivity were 66.7% and 78.7% for 4% tibia kshear, compared with 76.7% and 54.8% for TH aBMD, respectively. Pairwise comparisons (Fig. 2; Table 4) of AUROC for the three key variables (DXA TH aBMD, tibia Trb vBMD and 4% tibia kshear) showed that the AUROC of kshear was higher than that of TH aBMD (p = 0.02). No difference in AUROC was observed between tibia Trb vBMD and TH aBMD or between 4% tibia kshear and tibia Trb vBMD.

Table 3 Odds ratio, specificity, sensitivity and area under the receiver operating characteristic curve (AUROC) derived for key variables
Fig. 2

Comparisons of AUROC of primary variables in all participants, females and males

Table 4 Pairwise comparison of AUROC of primary variables in females and males

In females, odds of fracture increased 1.52-fold per SD decrease in DXA TH aBMD [95% CI (1.01, 2.16)]. Higher ORs were observed for standard pQCT and pQCT-FE properties, which were greatest for tibial Trb vBMD [OR = 8.15, 95% CI (1.78, 39.72)] and kshear at 4% tibia [OR = 10.34, 95% CI (1.91, 43.98)], respectively. Among all variables, the highest AUROC was found with kshear at 4% tibia, which was 0.83 [95% CI (0.77, 0.89)] compared with 0.72 [95% CI (0.64, 0.79)] for DXA TH aBMD and 0.76 [95% CI (0.68, 0.82)] for tibia Trb vBMD. Specificity and sensitivity for kshear at 4% tibia were 79.2% and 69.4%, compared with 72.4% and 52.7% for DXA TH aBMD, respectively. Pairwise comparisons (Fig. 2; Table 4) of AUROC for the three primary variables (DXA TH aBMD, tibia Trb vBMD and 4% tibia kshear) showed that the AUROC of kshear was higher than that of TH aBMD (p = 0.02). No difference in AUROC was observed between tibia Trb vBMD and TH aBMD (p = 0.4). There was a trend towards a higher AUROC for kshear than for tibia Trb vBMD (p = 0.07).

For male participants, odds of fracture increased 1.55-fold per SD decrease in DXA TH aBMD [95% CI (1.02, 2.07)]. Higher ORs were observed for standard pQCT and pQCT-FE properties, which were greatest for tibia Trb vBMD [OR = 6.58, 95% CI (2.43, 10.70)] and kbend at 4% tibia [OR = 8.32, 95% CI (4.15, 33.84)], respectively. Among all variables, the highest AUROC was found with kbend at 4% tibia, which was 0.81 [95% CI (0.70, 0.90)] compared with 0.62 [95% CI (0.49, 0.74)] for DXA TH aBMD and 0.71 [95% CI (0.59, 0.82)] for tibia Trb vBMD. Specificity and sensitivity for kbend at 4% tibia were 80.0% and 74.2%, compared with 65.7% and 64.5% for DXA TH aBMD, respectively. Pairwise comparisons (Fig. 2; Table 4) found higher AUROC for 4% tibia kbend than for TH aBMD (p = 0.03). No difference in AUROC was observed either between standard pQCT and DXA variables or between pQCT-FE and standard pQCT variables.

Discussion

This study evaluated whether pQCT-derived FE modeling provided improved discrimination between non-vertebral, low-trauma fracture patients with predominantly peripheral fractures and age-matched controls. Peripheral QCT scans of the radius and the tibia were performed on patients with recent non-vertebral, low-trauma fracture and age-matched healthy controls. The fracture group consisted primarily of ambulatory care patients with limb fractures; therefore, as expected, they were relatively young compared with patients presenting with spine and hip fractures. Thus, our pQCT findings are likely to reflect bone fragility at the relevant peripheral sites in this age group. Peripheral QCT can differentiate the cortical and trabecular compartments of bone, which is advantageous as the two types of bone change differently in response to ageing, bone diseases and treatment [24]. In this study, pQCT-FE variables were found to have enhanced diagnostic performance compared with DXA and statistically comparable diagnostic performance with standard pQCT variables. By incorporating the BMDs across these different compartments in FE models, we hypothesized that predictions of bone stiffness may be used to better discriminate between fracture patients and healthy controls compared to DXA.

QCT-based FE modelling has been used in clinical studies to assess bone strength of the proximal femur and spine due to its strong predictive ability for fracture failure load [13, 14]. In the last decade, a dedicated peripheral CT scanner with higher image resolution and smaller voxel size emerged in clinical research. High-resolution pQCT (HR-pQCT) of the latest generation can achieve voxel sizes of 61 μm, which enables depiction of the microstructure of the radius and the tibia. While micro-FE (μFE) generated from HR-pQCT images has good ability to discriminate between fracture patients and controls [25] and is associated with fracture occurrence during follow-up [26], there are several drawbacks with this technique when used in clinical settings. It is time-consuming to scan, set up and analyse an FE model for individual patients from HR-pQCT images (3 to 10 h; [27]); hence, it is not efficient for fracture risk screening or diagnosing osteoporosis in a clinical setting. The amount of time required to set up QCT-based FE analysis is similarly problematic. While HR-pQCT scanners expose patients to slightly greater radiation than pQCT [28], the difference is negligible considering the minimal radiation dose associated with either scanning system. HR-pQCT provides considerably more bone structural information than pQCT. However, HR-pQCT scanners present increased procurement and maintenance costs compared to pQCT, which might restrict their widespread clinical use. Hence, we thought it worthwhile to evaluate the possible role of pQCT in enhancing the recognition of bone fragility in clinical settings. In our experience, pQCT-FE from single cross sections solves each of these issues. Better diagnostic ability than DXA was achieved with a simpler set-up, and thus a shorter time to scan patients and to set up individual pQCT-FE models, and with lower radiation exposure and cost. As a dedicated tool for the measurement of clinically relevant sites with several practical advantages, pQCT-FE may play a complementary role in future clinical studies assessing bone health.

Significant differences between groups were observed in pQCT-FE properties, especially at the 4% tibia site, and in some standard pQCT properties. While trabecular or cortical vBMD differed statistically between fracture patients and healthy controls, no difference was identified in SSIp, the bone strength index reported to be a good measure of bone strength and a good predictor for fracture [8]. Fracture odds increased by 10.34 [95% CI (1.91, 43.98)]- and 10.17 [95% CI (1.60, 42.21)]-fold per SD decrease in pQCT-FE properties and were only 1.52 [95% CI (1.01, 2.16)] and 1.74 [95% CI (1.02, 2.37)] with DXA aBMD measures in females and males, respectively. pQCT-FE properties also had higher diagnostic ability than DXA, with an AUROC of 0.83 vs 0.72 (p = 0.02) in females and 0.81 vs 0.62 in males. This was accompanied by improved specificity (79.2% vs 72.4% in females, 80.0% vs 65.7% in males) and sensitivity (69.4% vs 52.7% in females, 74.2% vs 64.5% in males). The pQCT-FE variables with the highest AUROC were observed at the 4% tibia site, although the specific loading variable differed between females (kshear) and males (kbend). However, since pQCT-FE variables at the same site had high VIFs and thus correlated strongly with each other, the difference was not considered to imply different mechanical performance between females and males. Overall, pQCT-FE models improved clinical performance in the identification of patients with increased fracture risk compared with DXA.

The best fracture discrimination was observed for pQCT-FE variables at the trabecular-rich site in the tibia. It should be noted that while pQCT-FE variables are computed from both trabecular and cortical bone, their relative contributions will depend on the site; hence, the pQCT-FE variables at the distal site will be more influenced by trabecular bone than at the proximal site. Since the pQCT-FE variables provided the greatest fracture discrimination at the distal tibia, the trabecular bone should be considered of greater importance for classification of peripheral appendicular fractures.

Among the standard pQCT and pQCT-FE variables, better performance was generally observed for variables obtained at the tibia compared to the radius. At the 4% tibia site, all pQCT-FE properties differed significantly between groups, with the highest OR for fracture and the highest AUROC to distinguish fracture patients from healthy controls. This finding contrasts with peripheral clinical fracture locations, where more fractures occur at the distal radius than at the distal tibia. Many studies utilizing DXA have confirmed that scans of one specific site predict fracture of that site better than scans of other sites do. The finding from the current study is unexpected considering most fractures were forearm fractures. However, we do note similar findings by Sornay-Rendu et al. [11] utilizing HR-pQCT between patients with mixed fractures and controls. In their study, none of the radius variables differed between groups after adjustment for radius aBMD, while several tibia variables remained statistically significant including both total and trabecular density at the distal tibia, cortical thickness and trabecular thickness. Furthermore, ORs per SD decrease were also higher for tibia variables than for radius variables in this study. This may be due to variations in radius morphology and bone density across the population [29], which make this site less sensitive for identifying a fracture risk threshold. Indeed, we noticed higher coefficients of variation in radius variables than in tibia variables in previous studies conducted at different centres using pQCT [30,31,32] or HR-pQCT [33,34,35], especially for variables at the distal site. Another possibility is that movement of the radius due to breathing and upper body movement may have affected imaging at this site to a greater degree than at the distal tibia. The controls may have represented a more physically active cohort compared to the fracture patients, which may have contributed to greater bone density, particularly at the tibia, and better preservation of balance, thus reducing their risk of falls and fracture [36]. In addition, the finding may also result from the mixed fracture types in the studied population. However, even in patients with distal radius fracture only, pQCT-measured tibia variables still seem to have comparable ability to discriminate between fractures and controls [10].

A key limitation in the FE models was that the loading and boundary conditions in the simulations were artificial compared to daily loading of the radius and the tibia in vivo, where a loading combination of all cases occurs simultaneously [37]. The logic for adopting these loading cases in the pQCT-FE models is that a bone’s strength is proportional to its weakest resistance for an idealised loading condition. Indeed, this assumption has been adopted extensively in HR-pQCT-based μFE models where idealised axial compression was used to assess fracture risk [38]. Nevertheless, these idealised loading conditions, when combined with the thin cross-sectional geometry of the FE models, would have led to bone stiffness predictions that differed from those encountered in vivo. Hence, the bone stiffnesses predicted in the current study are only proportional to whole-bone fracture load and are applicable for comparing relative differences between cohorts of patients rather than for assessing the absolute stiffness of bone for an individual.

pQCT has an inferior spatial resolution compared to HR-pQCT and thus provides limited information about bone microstructure. This limitation of pQCT might restrict its wide research utility where knowledge of microstructure is required. However, comparable AUROC was reported in studies investigating the diagnostic ability of HR-pQCT and μFE in distinguishing controls from patients with either radius fracture [39] or mixed low-trauma fractures [40]. The degree to which HR-pQCT and μFE can improve the diagnostic performance directly compared with pQCT-FE in identifying fracture patients is still uncertain. Future studies are needed to confirm whether comparable diagnostic ability can be achieved using the relatively low-resolution pQCT scanner considering its lower procurement and maintenance costs.

Due to the lack of thoracolumbar X-ray/vertebral fracture assessment, there is still a possibility that a sub-group of controls had asymptomatic vertebral fracture. We acknowledge that this is a limitation of the study and might reduce rather than exaggerate the apparent differences in FEM measures between fracture and non-fracture individuals. It is estimated that nearly 70% of vertebral fractures are missed in the community in clinical practice [41]. While improving the diagnosis rate of asymptomatic vertebral fracture remains difficult, we do note that osteoporotic vertebral fracture is more prevalent at ages greater than those seen in the current study [42]. In addition, recruiting strategies were applied to eliminate the impact of vertebral fracture as much as possible. Subjects were screened for symptoms such as chronic back pain or loss of height on a self-reported questionnaire. As a proportion of vertebral fractures result from prolonged use of glucocorticoid agents, “currently taking or have recently taken glucocorticoid agents” was one of the general exclusion criteria for both groups.

DXA measurements of the forearm were not available in this study. DXA measurement of the one-third radius predicts wrist fracture better than central DXA measurements; therefore, the subsequent inability to relate pQCT and pQCT-FE to DXA values at a similar anatomic site is another limitation of this study, especially for the reported population in whom a majority had forearm fractures. However, the ability of both distal radius and central DXA to predict all types of fragility fracture is comparable [43]. The study by Amiri et al. [44] reported good linear correlation between radius and central DXA results, which suggests that the trends in the central DXA results would remain similar if radius DXA measurements were used. The aim of the current study was to compare the pQCT-FE method with the most-accepted DXA measurements used for osteoporosis and fracture risk assessment in clinical practice, which both the WHO and ISCD recommend as central DXA. However, inclusion of radius DXA measurements and their comparisons to pQCT and pQCT-FE measures would provide additional information on the application of this novel yet simple technology and would be of added research interest.

In summary, pQCT-based FE models were applied together with standard pQCT and DXA properties to distinguish patients with recent non-vertebral fragility fracture from healthy controls. Improved diagnostic ability was observed for pQCT-FE, but not primary pQCT variables, compared with DXA properties in both females and males, although no statistical difference was observed in AUROC between primary pQCT and pQCT-FE variables. These results may provide an enhanced assessment for bone fragility in clinical settings considering the limitations of DXA, which is the most established modality currently. This study also strongly supports the rationale for future longitudinal studies with follow-up data for fracture risk assessment using pQCT-FE analysis. Recognizing and quantifying bone fragility before a major osteoporotic fracture occurs may bring considerable clinical benefits. Therefore, the potential clinical impact of applying this technology warrants exploration.