Introduction

Osteoporosis is a skeletal disorder characterized by compromised bone strength predisposing to an increased risk of fracture [1]. The consequences of fracture include increased mortality, morbidity, institutionalization, and economic costs [2, 3]. The ability to accurately gauge fracture risk is critical in identifying cost-effective thresholds for intervention [4, 5]. In 2008, the World Health Organization (WHO) Collaborating Centre at Sheffield, UK released the fracture risk assessment tool (FRAX®) for estimating individualized 10-year probability of hip and major osteoporotic fracture (composite of hip, clinical spine, distal forearm, and proximal humerus) [6]. The FRAX tool integrates eight clinical risk factors (prior fragility fracture, parental hip fracture, smoking, systemic glucocorticoid use, excess alcohol intake, body mass index, rheumatoid arthritis, and other secondary causes of osteoporosis), which, in addition to age and sex, contribute to a 10-year fracture risk estimate independently of bone mineral density (BMD) [5, 7]. The importance of this tool in clinical practice is highlighted by the many published clinical guidelines recommending treatment on the basis of 10-year fracture risk [7–15].

BMD from the femoral neck is an optional parameter when FRAX is used to calculate 10-year fracture probability [16]. Fracture risk assessment without BMD may be attractive in areas with limited access to BMD testing or for patients with limited mobility for whom attending facilities with bone densitometers is not feasible. Alternatively, use of FRAX without BMD might be an efficient screening tool for selecting individuals for subsequent BMD measurement, as implemented in UK guidelines [10].

In this study, we performed a risk reclassification analysis to address the question of how well fracture probability derived without BMD identifies individuals who would be designated for intervention according to national guidelines that use fracture probability derived with BMD [8, 9, 15]. This assessment was conducted at the level of the Province of Manitoba, Canada, which maintains a regionally based database of all clinical BMD tests and has been linked with a population-based data repository to determine fracture outcomes. Thus, it provides a direct evaluation of how these measurements might apply to routine clinical practice.

Methods

Patient population

In the Province of Manitoba, Canada, health services are provided to virtually all residents through a single public health care system. Bone density testing with dual-energy X-ray absorptiometry (DXA) has been managed as an integrated program since 1997 [17]. The program maintains a database of all DXA results performed from 1990 onwards; the current analysis used results up to March 2007. The DXA database can be linked with other population-based computerized health databases through an anonymous personal identifier. The DXA database, with a completeness and accuracy in excess of 99%, has been previously described [18].

We identified all individuals aged 50 years and older with medical coverage and valid DXA measurements from the lumbar spine and femoral neck. Although the FRAX tool does not currently use lumbar spine BMD in the risk calculation, measurement of this site still affects treatment decision making under US [8, 9] and Canadian guidelines [15]. For those with more than one set of measurements, only the first record was included. The study was approved by the Research Ethics Board for the University of Manitoba.

Bone density measurements

Proximal femur (femoral neck, total hip, and trochanter) and lumbar spine DXA scans were performed and analyzed in accordance with manufacturer recommendations. Femoral neck hip T-scores (number of SDs above or below young adult mean BMD) were calculated from NHANES III White female reference values in accordance with WHO and Canadian guidelines [15, 19]. For the lumbar spine, T-scores used manufacturer USA White female reference values. Vertebral levels affected by artifact were excluded by experienced physicians using conventional criteria [20]. Height and weight were recorded at the time of the BMD test.

Fracture definitions

Fractures and other medical diagnoses were assessed through a combination of hospital discharge abstracts (coded using the ICD-9-CM prior to 2004 and ICD-10-CA thereafter) and physician billing claims (coded using ICD-9-CM) [21]. Systemic glucocorticoid use was obtained from a provincial pharmacy database [22]. Longitudinal health service records were assessed for the presence of hip, clinical vertebral, forearm, and humerus fracture codes (collectively designated as major osteoporotic fractures) that were not associated with trauma codes [23]. Incident fractures were defined as fractures that occurred after the index BMD measurement with site-specific fracture codes (hospitalization or physician visit). Hip and forearm fractures were required to have a site-specific fracture reduction, fixation, or casting code to enhance specificity for an acute fracture event. To minimize potential misclassification of prior fractures as incident fractures, we required that there be no hospitalization or physician visit(s) with the same fracture type in the 6 months preceding an incident fracture diagnosis.
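The 6-month washout rule described above can be illustrated with a short sketch. This is not the authors' code: the function name, the tuple layout of a claim, and the approximation of "6 months" as 183 days are all assumptions made for illustration, and the additional requirement of a reduction, fixation, or casting code for hip and forearm fractures is not modeled.

```python
from datetime import date, timedelta

WASHOUT_DAYS = 183  # assumption: "6 months" approximated as 183 days


def incident_fractures(claims, index_date, washout_days=WASHOUT_DAYS):
    """Return the claims that count as incident fractures for one person.

    claims: iterable of (fracture_site, claim_date) tuples (hypothetical layout).
    A claim qualifies if it is dated after the index BMD test and there is no
    earlier claim for the same site within the preceding washout window.
    """
    window = timedelta(days=washout_days)
    # Drop exact duplicates (e.g. hospital and physician claim on the same day).
    ordered = sorted(set(claims), key=lambda c: c[1])
    incident = []
    for site, when in ordered:
        if when <= index_date:
            continue  # fractures on or before the index date are prior fractures
        recent_same_site = any(
            s == site and when - window <= d < when for s, d in ordered
        )
        if not recent_same_site:
            incident.append((site, when))
    return incident


example_claims = [
    ("hip", date(1999, 1, 1)),      # before the index test: prior fracture
    ("hip", date(2001, 1, 10)),     # first hip claim after the index test
    ("hip", date(2001, 3, 1)),      # follow-up visit within the washout window
    ("forearm", date(2003, 5, 5)),
]
result = incident_fractures(example_claims, index_date=date(2000, 6, 1))
```

In this example the second hip claim is treated as follow-up care for the January 2001 fracture rather than as a new event, so only one hip fracture and one forearm fracture are retained.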

Fracture probability calculations

Ten-year probability of a major osteoporotic fracture was calculated using the Canadian FRAX tool (version 3.1) by the WHO Collaborating Centre with and without femoral neck BMD for each case without knowledge of the fracture outcomes. Prior fracture and other conditions required for calculating fracture probability with FRAX were assessed through a combination of hospital discharge abstracts (diagnoses and procedures coded using the ICD-9-CM prior to 2004 and ICD-10-CA thereafter) and physician billing claims (coded using ICD-9-CM) as previously described [24]. Prior fragility fracture was taken to be a major osteoporotic fracture before BMD testing that was not associated with severe trauma. A diagnosis of rheumatoid arthritis was taken from physician office visits or hospitalizations with a compatible ICD-9-CM/ICD-10-CA code in a 3-year period prior to BMD testing. Proxies were used for smoking (COPD diagnosis) and high alcohol intake (alcohol or substance abuse diagnosis) over the same time frame; these show prevalence similar to population-based data [25]. Prolonged corticosteroid use (over 90 days dispensed in the year prior to DXA testing at a mean prednisone-equivalent dose of 7.5 mg/day or greater) was obtained from the provincial pharmacy system [22]. We adjusted for the effect of incomplete parental hip fracture information on FRAX probability estimates prior to 2005 using age- and sex-specific adjustment factors derived from 2005 to 2008 parental hip fracture responses as previously described [24]. FRAX predictions with the Canadian FRAX tool have been shown to agree closely with observed fracture rates in our cohort and in the general Canadian population [24, 25].

Intervention criteria

Major osteoporotic fracture probability derived with BMD was categorized as low risk (<10%), moderate risk (10–19%), and high risk (≥20%) in accordance with Canadian practice guidelines [26] and an intervention threshold of 20% from the US National Osteoporosis Foundation (NOF) [27, 28]. The NOF also recommends intervention for a 10-year hip fracture probability of 3% or greater. For symmetry with major osteoporotic fracture probability, hip fracture probability derived with BMD was categorized as low risk (<1.5%), moderate risk (1.5–2.9%), and high risk (≥3%), where the low risk cutoff was defined as one half of the high risk cutoff. Finally, the NOF recommends that individuals be considered for treatment who have prior fragility fractures of the hip or spine or who have osteoporotic BMD (lowest T-score −2.5 SD or lower at the femoral neck, total hip, trochanter, or lumbar spine) [8, 9]. We evaluated the NOF intervention criteria, individually and together, in comparison to a categorization of major osteoporotic fracture probability derived without BMD using the same cutoffs for low, moderate, and high risk. Some guidelines do not recommend the use of hip fracture probability in routine management [10, 15]; therefore, we also evaluated intervention based upon any of the NOF criteria excluding hip fracture probability. We did not limit the use of FRAX to those with BMD in the low bone mass range as advocated under current NOF guidelines in order to provide a broader assessment of the potential value of the FRAX methodology; no such restriction is used in the Canadian guidelines.
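The cutoffs and criteria in this section can be summarized in a minimal sketch. The function and key names are hypothetical, probabilities are assumed to be expressed in percent, and values between 19% and 20% (or 2.9% and 3%) are assigned to the moderate category, which is an assumption about how the published ranges extend to non-integer values.

```python
def risk_category(prob_pct, high_cutoff=20.0):
    """Classify a 10-year probability (in percent) as low, moderate, or high.

    The low-risk cutoff is taken as one half of the high-risk cutoff, as in
    the text: 10% and 20% for major osteoporotic fracture, 1.5% and 3% for hip.
    """
    if prob_pct >= high_cutoff:
        return "high"
    if prob_pct >= high_cutoff / 2:
        return "moderate"
    return "low"


def nof_criteria_met(mof_pct, hip_pct, lowest_t_score,
                     prior_hip_or_spine_fracture,
                     include_hip_probability=True):
    """Return the set of intervention criteria satisfied (labels are hypothetical)."""
    met = set()
    if mof_pct >= 20.0:
        met.add("mof_probability")
    if include_hip_probability and hip_pct >= 3.0:
        met.add("hip_probability")
    if lowest_t_score <= -2.5:
        met.add("osteoporotic_bmd")
    if prior_hip_or_spine_fracture:
        met.add("prior_hip_or_spine_fracture")
    return met


# A person with moderate major fracture probability who qualifies only
# through hip fracture probability.
criteria = nof_criteria_met(mof_pct=12.0, hip_pct=3.2, lowest_t_score=-1.8,
                            prior_hip_or_spine_fracture=False)
```

Setting `include_hip_probability=False` corresponds to the secondary analysis that excludes hip fracture probability from the intervention criteria.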

Statistics

All results are reported as mean ± SD unless otherwise stated. Group comparisons for continuous data were conducted with the Student’s t test and for categorical data using a chi-square test of independence. Survival analyses were used to analyze time to major osteoporotic fracture with observation up to March 2008, providing a mean 5.4 years of observation. Observations were censored for migration out of province (3.0% of the cohort), but not for death (8.3% of the cohort), which was treated as a competing hazard [29]. Survival curves were compared using the log-rank statistic. A Cox proportional hazards model was used to study fracture risk as a function of femoral neck BMD. Fracture probability derived without BMD was included as a covariate in the model. Fracture probability was first entered as a continuous variable (log-transformed) and then as a categorical variable (grouped into risk quintiles). Results were similar and, therefore, the models based upon a continuous variable are reported. Statistical analyses were performed with SPSS for Windows version 16.0 (SPSS Inc., Chicago, IL, USA).
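For readers unfamiliar with the survival estimates reported in this paper, the following is a minimal sketch of the Kaplan–Meier product-limit estimator, which the authors name later for the subgroup fracture outcomes. It is an illustration only: the analyses were run in SPSS, the function name is hypothetical, and this simplified version treats every non-event (including death) as ordinary censoring rather than as a competing hazard.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times:  follow-up time for each person (any consistent unit).
    events: 1/True if a fracture was observed at that time, 0/False if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Five hypothetical people followed for 2 to 8 years.
curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 0, 1, 1, 0])
# The estimated fraction with a fracture by the last event time is 1 - survival.
cumulative_incidence = 1.0 - curve[-1][1]
```

The cumulative incidence at 10 years quoted for each risk category corresponds to one minus the survival estimate at that time point.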

The fracture probabilities derived without BMD were compared with those that included BMD in the calculation. Overall prediction was assessed from the area under the curve (AUC) for receiver operating characteristic (ROC) curves. The AUC assesses discriminatory ability, but has relatively low sensitivity for detecting improvements in model performance; the integrated discrimination improvement was used to estimate the improvement in performance [30].
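The two discrimination measures named above can be sketched as follows. This is a generic illustration, not the SPSS procedure used in the study: the AUC is computed as the proportion of event/non-event pairs in which the person with the event has the higher score (ties count one half), and the integrated discrimination improvement (IDI) is computed under its usual definition as the change in the difference between mean predicted probability in those with and without events. Function names and the example data are hypothetical.

```python
def auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, outcomes) if y]
    neg = [s for s, y in zip(scores, outcomes) if not y]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean(values):
    return sum(values) / len(values)


def discrimination_slope(probs, outcomes):
    """Mean predicted probability in events minus that in non-events."""
    events = [p for p, y in zip(probs, outcomes) if y]
    non_events = [p for p, y in zip(probs, outcomes) if not y]
    return mean(events) - mean(non_events)


def idi(old_probs, new_probs, outcomes):
    """Integrated discrimination improvement of the new model over the old."""
    return (discrimination_slope(new_probs, outcomes)
            - discrimination_slope(old_probs, outcomes))


# Hypothetical probabilities (as fractions) without and with BMD.
fractured = [0, 0, 1, 1]
without_bmd = [0.10, 0.20, 0.30, 0.40]
with_bmd = [0.05, 0.15, 0.40, 0.50]
improvement = idi(without_bmd, with_bmd, fractured)
```

In this toy example both models rank every fracture case above every non-case, so the AUC cannot distinguish them, while the IDI is positive because the second model separates the two groups more widely. This is the sense in which the AUC is relatively insensitive to improvements in model performance.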

Reclassification of 10-year major osteoporotic fracture probability (low <10%, moderate 10–19%, or high ≥20%), initially derived without BMD and subsequently recalculated with BMD, was assessed according to the method of Janes et al. [31]. The Cochran–Armitage test was used to assess for linear trend in fractures when fracture probability without BMD was reclassified using fracture probability with BMD. Fracture outcomes to 10 years within each table subgroup were estimated using the Kaplan–Meier method.
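The core of a reclassification analysis is a cross-tabulation of the risk category assigned without BMD against the category assigned with BMD. The sketch below shows only that cross-tabulation and the overall percentage reclassified; it does not implement the full method of Janes et al., the trend test, or the within-cell outcome estimates. Function names and the example probabilities are hypothetical.

```python
from collections import Counter

CATEGORIES = ("low", "moderate", "high")


def categorize(prob_pct):
    """Low <10%, moderate 10 to <20%, high >=20% (probability in percent)."""
    if prob_pct >= 20.0:
        return "high"
    if prob_pct >= 10.0:
        return "moderate"
    return "low"


def reclassification_table(prob_without_bmd, prob_with_bmd):
    """Rows: category without BMD. Columns: category with BMD. Cells: counts."""
    counts = Counter(
        (categorize(a), categorize(b))
        for a, b in zip(prob_without_bmd, prob_with_bmd)
    )
    return {row: {col: counts[(row, col)] for col in CATEGORIES}
            for row in CATEGORIES}


def percent_reclassified(table):
    """Percentage of all individuals whose category changed."""
    total = sum(sum(cols.values()) for cols in table.values())
    moved = sum(n for row, cols in table.items()
                for col, n in cols.items() if row != col)
    return 100.0 * moved / total


# Five hypothetical individuals.
table = reclassification_table(
    prob_without_bmd=[5, 12, 18, 25, 8],
    prob_with_bmd=[7, 22, 9, 15, 8],
)
```

Off-diagonal cells of the table are the reclassified individuals; comparing observed fracture rates across the cells of one row shows whether adding BMD moved people in the right direction.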

Results

The final study population consisted of 39,603 individuals (36,730 women and 2,873 men). Baseline characteristics of the study population are summarized in Table 1. There was a strong correlation between fracture probabilities determined without and with BMD (rank correlation R = 0.893 for major osteoporotic fracture, R = 0.834 for hip fracture probability). Overall, 30.0% of the study population had an osteoporotic T-score affecting one or more sites (femoral neck 14.0%, lumbar spine 24.0%, total hip 10.0%, and trochanter 13.7%).

Table 1 Study population baseline characteristics at the time of BMD measurement

BMD was a significant predictor of major osteoporotic and hip fractures in all analyses (Table 2). Fracture risk for femoral neck BMD was attenuated after adjustment for age or fracture probability derived without BMD. Fracture discrimination was similar for women and men (Supplementary Table 1). Fracture probability derived with BMD gave higher AUC measures than probability derived without BMD or than BMD alone.

Table 2 Hazard ratios (HR) for fracture per unit (SD) decrease in femoral neck BMD

Table 3 shows that the risk categorization based upon fracture probability without BMD was consistent with the observed 10-year outcome and was similar to fracture probability with BMD (low, 7.5% versus 6.7%; moderate, 15.2% versus 16.4%; and high, 27.5% versus 31.0%). A number of individuals assigned to a given risk category (low, moderate, or high) based upon fracture probability without BMD were reclassified to a different risk category when fracture probability was recalculated with BMD. Within each fracture probability category determined without BMD, inclusion of BMD in the fracture probability calculation led to a significant gradient of risk in the reclassified individuals (all P trend < 0.001). Using BMD in the derivation of fracture probability led to reclassification of 22.2% of all individuals: 6.3% were initially designated at low risk, 10.2% at moderate risk (among whom 2.7% moved to high risk and 7.5% to low risk), and 5.7% at high risk. Almost all reclassifications were to the adjacent risk category, with very few individuals reclassified from low risk to high risk (0.1%) or high risk to low risk (0.2%). Among individuals initially designated as moderate risk without using BMD (29.4% of the entire population), reclassification when fracture probability included BMD gave observed 10-year fracture rates that were within the expected range for the nominal risk category (low 9.3% [predicted <10%], moderate 15.5% [predicted 10–19%], and high 27.5% [predicted ≥20%]). When reclassification was evaluated using a single cutoff (<20% versus ≥20%), only 8.5% of the cohort had their risk category changed after using BMD in the risk calculation (2.8% moved to the higher risk category, 5.7% moved to the lower risk category).

Table 3 Risk reclassification when major fracture probability initially calculated without BMD is recalculated using BMD

Table 4 summarizes the relationship between fracture probability derived without BMD (5% increments) and reclassification of patients when BMD was used to calculate fracture probability, using an intervention threshold of 20%. Also given are the corresponding ages, femoral neck T-scores, and lowest T-scores. The distribution in fracture risk category was similar for major osteoporotic fracture probability calculated without and with BMD. Higher major osteoporotic fracture probability calculated without BMD was associated with progressively older age and lower BMD T-scores. When major osteoporotic fracture probability calculated without BMD exceeded the treatment threshold (20%), the mean T-score from the lowest measurement site was in the osteoporotic range. The proportion of the cohort that was reclassified, based upon the intervention threshold of 20%, was largely confined to individuals close to the treatment threshold. Overall, the proportion of the cohort that was reclassified was 8.5%. Reclassification rates also varied according to the risk category calculated without BMD: low 11.1%, moderate 17.3%, and high 1.0%.

Table 4 Effect of fracture probability initially calculated without BMD on change in intervention (reclassification) when fracture probability is recalculated using BMD

Figure 1 examines fracture outcomes stratified by whether 10-year major fracture probability calculated without BMD was concordant or discordant with the intervention cutoff after fracture probability was recalculated with BMD. Among those categorized as low risk without BMD but high risk with BMD, the observed fraction with osteoporotic fractures at 10 years was 10.0%, which is well below the treatment threshold. Conversely, among those categorized as high risk without BMD but not with BMD, the observed fraction with osteoporotic fractures at 10 years was 20.4%, which slightly exceeds the intervention threshold. For those categorized as moderate risk without BMD, recalculation of fracture probability with BMD appropriately identified subgroups that had observed fracture outcomes consistent with the treatment threshold: concordant 13.8% versus discordant 27.5%.

Fig. 1

Fracture outcomes up to 10 years according to major osteoporotic fracture probability calculated without BMD (categorized as low 0–9%, moderate 10–19%, and high ≥20%). Findings are stratified by whether results are concordant (dotted) or discordant (solid) with the 20% intervention cutoff (dashed) when fracture probability is recalculated with BMD

Risk categorization for major osteoporotic fracture probability calculated without BMD was compared with various intervention criteria (Table 5). Among the 22,602 subjects categorized at low risk without using BMD (57.1% of the entire cohort), only a minority (19.5%) met any of the intervention criteria, with the most frequent criterion (17.3%) being a BMD T-score ≤ −2.5 SD (femoral neck 5.3%, lumbar spine 13.7%, total hip 3.0%, and trochanter 6.2%). In the subgroup of 3,916 individuals categorized at low risk without using BMD but with a BMD T-score ≤ −2.5 SD, the observed 10-year fracture rate was 12.5% (95% CI, 10.7–14.4%). Conversely, among individuals initially designated at high probability for major osteoporotic fracture without using BMD, the vast majority (92.8%) satisfied one or more of the intervention criteria. In descending order of frequency, the intervention criteria most commonly satisfied were probability for hip fracture greater than 3% (89.6%), BMD measurement in the osteoporotic range (61.8%), and major osteoporotic fracture probability greater than 20% (58.0%). If hip fracture probability greater than 3% was not included as an intervention criterion, a substantial majority (75.1%) still met one of the remaining intervention criteria. Results were broadly similar for women and men, though limited by the smaller number of men in the study. Age strongly affected the number of individuals falling in the various risk categories (90.5% designated low risk below age 65 years versus 25.0% for age 65 years and older), but within a given risk category, the proportion satisfying the intervention criteria was similar. For individuals initially categorized as moderate probability for major osteoporotic fracture without BMD, 57.6% of women met one or more of the intervention criteria, while 84.5% of men met one or more treatment criteria (43.4% and 50.5% if hip fracture probability greater than 3% was not included). Results were generally similar when the initial risk categorization was based upon probability for hip fracture without using BMD (Supplementary Table 2).

Table 5 Proportion of individuals satisfying intervention criteria according to risk categorization from major fracture probability without BMD

ROC curves for fracture probability derived without BMD were examined as predictors of qualifying for intervention under the various intervention criteria (Supplementary Fig. 1). Results from major osteoporotic probability and hip fracture probability were similar for most criteria except for prior fracture of the spine or hip in which the AUC for major osteoporotic fracture probability (0.826 [95% CI, 0.818–0.835]) was greater than for hip fracture probability (0.770 [95% CI, 0.760–0.780]). AUCs were >0.90 for the intervention criterion based upon fracture probability and >0.80 when based upon any of the criteria.

Discussion

These data shed light on the use of the WHO fracture risk assessment tool (FRAX) without BMD in the clinical management of individuals at risk for osteoporotic fractures. Firstly, in areas without BMD access, FRAX provides a useful index of fracture risk and is similar to BMD alone for the prediction of any fracture. Many countries currently have limited or no access to BMD testing [32], and even within the US, there are geographic barriers that limit utilization [33]. In individuals in whom this risk is sufficiently low, further testing or pharmacologic treatment may be avoided. Alternatively, for those individuals at sufficiently high risk after calculating fracture probability without BMD, empirical treatment might be considered without the need for further testing. BMD testing had the largest benefit in terms of risk reclassification in those who fall within an intermediate risk range. Limiting BMD testing to this subgroup could potentially have avoided BMD testing in up to 70.6% of our clinical population, similar to a previous report in women aged 75 years and older from Sheffield, UK [34]. The benefit of including BMD in the risk assessment was greatest among those initially at moderate risk (10–19%) when fracture probability was derived without BMD, but this represented only 29.4% of the cohort (9.3% of those aged <65 years and 48.7% of those ≥65 years). Importantly, even among those individuals nominally classified as low risk without BMD measurement but who would have qualified for treatment under other criteria (usually based upon T-score −2.5 SD or lower), the observed fracture rates were relatively low, suggesting that not all of these individuals would benefit from treatment [27, 28].

Limitations to an approach based upon a strictly clinical risk assessment are acknowledged. Firstly, if individuals categorized as low risk without BMD did not undergo further testing, then this would miss a substantial minority (17.3%) with an osteoporotic T-score, though we found that 10-year fracture outcomes in this subgroup were still relatively modest (12.5%). Secondly, an approach without BMD measurement may be less satisfying to physicians and patients who find value in BMD monitoring on treatment. There is uncertainty over whether high fracture probability in the absence of an osteoporotic BMD indicates a situation where pharmacologic therapy is beneficial in terms of fracture prevention. Some post hoc analyses and preplanned clinical trial data have shown greater antifracture efficacy in individuals at higher risk [35–37], whereas others have shown benefit across the range of fracture probabilities [38–40]. Studies of how to best implement a risk stratification approach without BMD testing into clinical practice and the effect of BMD measurement (or its absence) on patient adherence to treatment are needed.

We found that all risk assessment tools, including fracture probability without BMD, provide information regarding major osteoporotic and hip fracture risk in women and men. Based upon ROC analyses, fracture probability with BMD provided the highest degree of fracture discrimination with higher AUCs than fracture probability without BMD or BMD alone, as has been noted previously [16, 24]. Fracture probability without BMD provides substantial information regarding fracture risk, but BMD remains a significant independent fracture risk factor that is only partially attenuated by adjustment for risk defined from clinical risk factors alone. This is not surprising since meta-analyses from the WHO have shown that these clinical risk factors show little BMD dependence [5, 7].

Our findings are consistent with others that have noted that a very large proportion of older women and men would be recommended for pharmacologic treatment of osteoporosis based upon the NOF criteria [41–43]. We found little difference between women (41.0%) and men (42.4%), which probably reflects the fact that higher risk men are more likely to be referred for BMD testing. There was a large difference between those younger than age 65 years (21.0%) and those aged 65 years or older (60.4%), consistent with the importance of age as a major risk factor for osteoporotic fractures. Prior to the age of 65 years, osteoporotic BMD was the largest single determinant of the eligibility for treatment (18.6%), whereas after age 65, hip fracture probability above 3% was the largest single determinant (45.8%).

In contrast to the use of fixed intervention cutoffs, the National Osteoporosis Guideline Group (NOGG) for the UK suggests age-dependent intervention thresholds (equivalent to the probability of a woman with no risk factors other than a prior fragility fracture) [10]. NOGG makes recommendations for BMD testing and treatment based upon fracture probability derived without BMD as part of its overall case-finding strategy: an upper assessment threshold (i.e., a fracture probability above which patients could be treated without BMD) and a lower assessment threshold (i.e., a fracture probability below which neither treatment nor a BMD test should be considered). Those with intermediate probability values would be recommended for BMD testing. The overall use of the NOGG thresholds in a case-finding strategy was projected to identify 6–20% of women as eligible for BMD testing and 23–46% as eligible for treatment, depending on age [10].
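The two-threshold case-finding logic described above can be expressed as a small decision function. This is a schematic sketch only: the actual NOGG thresholds are age-dependent and are not reproduced here, so the threshold values in the example are placeholders, and the function name, return labels, and handling of values exactly at a threshold are assumptions.

```python
def triage_without_bmd(prob_pct, lower_threshold, upper_threshold):
    """Triage a fracture probability (percent) calculated without BMD.

    lower_threshold / upper_threshold: assessment thresholds for the person's
    age, supplied by the caller (values are NOT taken from the guideline here).
    """
    if lower_threshold > upper_threshold:
        raise ValueError("lower threshold must not exceed upper threshold")
    if prob_pct >= upper_threshold:
        return "consider treatment without BMD"
    if prob_pct < lower_threshold:
        return "no BMD test or treatment"
    return "measure BMD"


# Placeholder thresholds chosen only to exercise the three branches.
decisions = [triage_without_bmd(p, lower_threshold=6.0, upper_threshold=20.0)
             for p in (5.0, 10.0, 25.0)]
```

The same structure applies to the fixed cutoffs evaluated in this study, where the moderate risk range (10–19%) plays the role of the intermediate band in which BMD testing adds the most information.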

There are limitations to this study. It was conducted using individuals referred for BMD testing and is, therefore, subject to the factors that motivate or limit BMD testing in clinical practice. However, by capturing all clinical BMD testing for a geographic region, we are able to accurately evaluate implications in clinical practice, and loss to follow-up is essentially nonexistent. Therefore, our cohort complements information from population-based studies. Reliance on administrative data for fracture ascertainment is less reliable than direct radiographic review, particularly for vertebral fractures since the majority are not clinically diagnosed [44]. Additional limitations include the incomplete parental hip fracture information and the use of proxy variables for smoking and high alcohol intake. The Canadian FRAX tool was used in this study. There are small calibration differences between the US and Canadian FRAX tools (linear regression slope coefficient 0.99 for major osteoporotic fractures and 1.06 for hip fractures), and it is uncertain whether this would alter the analysis. Finally, we did not limit the use of FRAX to those with BMD in the low bone mass range as recommended under current NOF (but not Canadian) guidelines [27, 28].

In summary, we have shown that there is complementary information in using a tool for estimating fracture probability without and with BMD. Although BMD remains a strong predictor of fracture risk after adjustment for fracture probability measured without BMD, there is a high level of agreement between fracture probability measured without and with BMD for designation of high risk (AUCs > 0.90) or for treatment based upon any of the intervention criteria (AUCs > 0.80). Therefore, fracture probability assessed without BMD could potentially be sufficient for clinical decision making in many individuals whose fracture risk is sufficiently low or sufficiently high. For those individuals falling in the moderate risk range, there is incremental value in using BMD in the probability calculation as this appropriately reclassifies risk in over one third of the individuals.