Introduction

Osteoporosis is a common chronic condition; osteoporosis-related fractures impose a significant societal burden in terms of both human and economic costs [1]. Currently available treatments for osteoporosis are effective, yielding 40–60% reductions in the risk of fracture in primary and secondary prevention settings [2, 3]. Tools such as the FRAX® score were developed to predict an individual’s 10-year fracture risk and to help identify those most likely to benefit from osteoporosis treatment [4]. FRAX was designed to assess the 10-year probability of major osteoporotic fracture (MOF) and hip fracture [5]. The framework used considers competing mortality and avoids biased over-estimates in older individuals and those with risk factors for increased death [6]. Over 100 clinical practice guidelines include FRAX in their recommendations, making it the most widely used fracture prediction tool worldwide [7].

Notwithstanding the strengths of FRAX, one criticism has been its restriction to assessing 10-year fracture outcomes [8]. Other validated clinical prediction tools provide the ability to assess fracture risk over a shorter time interval. For example, the Garvan Fracture Risk Calculator assesses fracture risk over 5 years in addition to 10 years, whereas the QFracture score allows the user to select fracture risk from 1 to 10 years [9,10,11,12]. Whether simple rescaling of the FRAX outputs provides an accurate estimate of fracture risk over a shorter time horizon has not been objectively assessed, and there are concerns that it may be particularly affected by the recency of fracture [13, 14]. Similarly, the performance of FRAX for a longer time interval is unknown [8]. If FRAX is able to accurately reflect fracture risk over shorter and longer time intervals, this would extend its clinical usefulness and also inform research studies that wish to evaluate FRAX predicted fracture probabilities when a full 10 years of data are not available.

The current study was performed to address the question of whether FRAX accurately predicts MOF and hip fracture outcomes in women and men age 40 years and older over intervals shorter and longer than 10 years, spanning 1 to 15 years. We hypothesized that the association between FRAX and osteoporotic fractures would be linear over time and that simple rescaling would permit accurate fracture prediction anywhere from 1 to 15 years. This analysis was performed using a large clinical registry of all dual-energy X-ray absorptiometry (DXA) tests for the Province of Manitoba, Canada, providing results that should be broadly applicable to routine clinical practice.

Methods

Study population

From the clinical registry containing all DXA results for Manitoba, Canada (population 1.2 million), we identified all women and men age 40 years and older who were registered residents and underwent baseline DXA measurement of the hip (1996–2013). In the Province of Manitoba, health services are provided to virtually all residents through a single public health care system. Bone density testing with DXA is managed as an integrated program [15]. The program maintains a database of all DXA results that can be linked with other population-based computerized health databases through an anonymous personal identifier. The DXA database, with a completeness and accuracy in excess of 99%, has been previously described [16]. The study was approved by the Research Ethics Board for the University of Manitoba.

Bone mineral density measurements

Proximal femur DXA scans were performed and analyzed in accordance with manufacturer recommendations (Lunar DPX or Prodigy, GE Healthcare, Madison WI). All instruments were monitored by a medical physicist through a rigorous quality assurance program [15]. The instruments used exhibited stable long-term performance (coefficient of variation <0.5%) with no significant between-scanner differences detected. Femoral neck T-scores (number of SDs above or below young adult mean BMD) were calculated from NHANES III white female reference values [17] as required by FRAX.
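As a minimal illustration of the T-score definition above, the sketch below computes the number of SDs that a femoral neck BMD value lies from a young adult reference mean. The reference mean and SD used here are hypothetical placeholders, not the NHANES III values applied in the study, and the function name is an assumption made for this example.

```python
def t_score(bmd: float, young_mean: float, young_sd: float) -> float:
    """Number of SDs that a BMD value lies above (+) or below (-) the young adult mean."""
    if young_sd <= 0:
        raise ValueError("young_sd must be positive")
    return (bmd - young_mean) / young_sd


# Hypothetical reference values in g/cm^2, for illustration only
# (not the NHANES III white female values used by FRAX).
REF_MEAN = 0.86
REF_SD = 0.12

# A measured BMD two reference SDs below the reference mean.
example = t_score(0.62, REF_MEAN, REF_SD)
print(round(example, 2))
```

In practice the reference mean and SD must match the scanner manufacturer and the reference population required by the fracture risk tool.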

Fracture probability calculations

The Canadian FRAX tool was calibrated using nationwide hip fracture and mortality data [18]. The Manitoba BMD Registry was not used in the creation or calibration of the FRAX tool. Ten-year probability of a MOF and hip fracture without and with femoral neck BMD was calculated for each subject using the Canadian FRAX tool (FRAX® Desktop Multi-Patient Entry, version 3.7). Briefly, prior fracture and other conditions required for calculating fracture probability with FRAX were assessed through a combination of hospital discharge abstracts (diagnoses and procedures coded using the ICD-9-CM prior to 2004 and ICD-10-CA thereafter) and physician billing claims (coded using ICD-9-CM) as previously described [19]. Proxies were used for smoking (COPD diagnosis) and high alcohol intake (alcohol/substance abuse diagnosis) over the same time frame; this method has been shown to provide prevalences similar to population-based estimates [19,20,21]. Prolonged corticosteroid use (over 90 days dispensed in the year prior to DXA testing) was obtained from the provincial pharmacy system [22]. We adjusted for the effect of missing parental hip fracture information prior to 2005 as previously described [19]. FRAX predictions with the Canadian FRAX tool have been shown to agree closely with observed fracture rates in our cohort and in the general Canadian population [19, 20]. We did not exclude prior or concurrent osteoporosis medication users since we have previously shown that this does not affect the performance of FRAX [23].

Fracture outcomes

Hospital discharge abstracts and physician billing claims were assessed from date of DXA (index date) to March 31, 2013, or 15 years (final date) for the presence of non-traumatic hip, clinical vertebral, forearm, and humerus fracture diagnostic codes (collectively designated “major osteoporotic” fractures [MOF]) using previously validated algorithms [24]. Fractures not associated with trauma codes were assessed through a combination of hospital discharge abstracts (coded using the ICD-9-CM prior to 2004 and ICD-10-CA thereafter) and physician billing claims (coded using ICD-9-CM) [25]. Hip and forearm fractures were required to have a site-specific fracture reduction, fixation, or casting code to enhance specificity for an acute fracture event. To minimize potential misclassification of prior and incident fractures involving the same skeletal site, we required that there be no hospitalization or physician visit(s) with the same fracture type in the 12 months preceding an incident fracture diagnosis. There was no time restriction on prior and incident fractures involving different skeletal sites.

Statistical analysis

All analyses were stratified by sex. Continuous variables were reported as means with standard deviations (SD), and categorical variables as counts with percentages. To assess the stability of fracture discrimination over time, hazard ratios (HR) with 95% confidence intervals were obtained from Cox proportional hazards models for time to MOF or hip fracture. HRs were determined for the entire study period (years 0–15) and for 5-year intervals (years <5, years 5–10, years >10). FRAX scores were log-transformed due to a skewed distribution, and results are expressed as a gradient of risk (hazard ratio per standard deviation [HR per SD]). We estimated fracture probability from 1 to 15 years assuming simple linear rescaling of the FRAX probability measurements (performed without and with BMD). For example, 5-year predictions were assumed to be one half of the 10-year predictions and 15-year predictions were assumed to be 1.5 times the 10-year predictions. The cumulative incidence function (CIF) for MOF and hip fracture up to 15 years was constructed following a competing mortality framework [26]. Cumulative fracture probabilities for each year, from 1 to 15 years, with 95% confidence intervals were estimated from the CIF. These observed fracture probabilities were compared with the predicted measurements (rescaled FRAX probability with and without BMD) to estimate calibration ratios (observed vs predicted probability) from 1 to 15 years. The linearity in the relationship between observed vs predicted fracture probability was estimated using the Pearson correlation coefficient and the slope coefficient was estimated from a simple linear regression model through the origin. We used the Pearson R² to describe the proportion of total variation explained. We also examined whether the relationship between observed vs predicted fracture probability was affected by time since fracture in those with a prior major fracture.
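The rescaling and calibration steps described above can be sketched as follows. This is an illustrative outline under stated assumptions rather than the analysis code used in the study: the function names are hypothetical, and the mean 10-year probability and the "observed" values are made-up numbers chosen only to exercise the functions.

```python
def rescale_frax(p10: float, years: float) -> float:
    """Linearly rescale a 10-year probability to another time horizon."""
    return p10 * years / 10.0


def calibration_ratio(observed: float, predicted: float) -> float:
    """Observed divided by predicted probability (1.0 means perfect calibration)."""
    return observed / predicted


def slope_through_origin(predicted, observed) -> float:
    """Least-squares slope of observed on predicted with the intercept fixed at zero."""
    sxy = sum(x * y for x, y in zip(predicted, observed))
    sxx = sum(x * x for x in predicted)
    return sxy / sxx


def pearson_r2(predicted, observed) -> float:
    """Squared Pearson correlation between predicted and observed values."""
    n = len(predicted)
    mx = sum(predicted) / n
    my = sum(observed) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(predicted, observed))
    sxx = sum((x - mx) ** 2 for x in predicted)
    syy = sum((y - my) ** 2 for y in observed)
    return sxy * sxy / (sxx * syy)


# Hypothetical inputs: a mean 10-year probability of 10% and observed
# cumulative incidences that run 5% below the rescaled predictions.
mean_p10 = 0.10
horizons = list(range(1, 16))
predicted = [rescale_frax(mean_p10, t) for t in horizons]
observed = [0.95 * p for p in predicted]

ratios = [calibration_ratio(o, p) for o, p in zip(observed, predicted)]
print(slope_through_origin(predicted, observed), pearson_r2(predicted, observed))
```

Fixing the intercept at zero reflects the requirement that a predicted probability of zero should correspond to an observed probability of zero; the slope then summarizes average calibration across all 15 horizons.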
Statistical analyses were performed with Statistica (Version 13.0, StatSoft Inc., Tulsa, OK). A p value of 0.05 was set as the threshold for assessing statistical significance. We assessed the sensitivity of our findings to changes in the underlying age-specific fracture rates (from 0.1- to twofold) and mortality rates (from zero- to twofold) using a simulated population of 10,000 women with age uniformly distributed from 40 to 90 years (Excel for Windows). Age-specific baseline hip fracture and mortality rates were based upon data used to construct the Canadian FRAX model [18]; MOF rates were estimated using previously validated MOF/hip fracture ratios [27]. Cumulative fracture incidence to 15 years for the population (with age updated annually) was estimated in the presence of competing mortality. Once again, calibration ratios (observed vs predicted probability) from the simulated fracture data from 1 to 15 years were derived.
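A simulation of this kind can be outlined in a few lines. The sketch below is not the spreadsheet model used in the study: the age-specific fracture and mortality hazards are hypothetical exponential functions standing in for the Canadian FRAX source rates, the expected cumulative incidence is computed deterministically for each simulated woman, and "predicted" is assumed to be the simulated 10-year value rescaled linearly.

```python
import math
import random


def fracture_rate(age: float, scale: float = 1.0) -> float:
    """Hypothetical annual fracture hazard rising exponentially with age."""
    return scale * 0.002 * math.exp(0.07 * (age - 40.0))


def mortality_rate(age: float, scale: float = 1.0) -> float:
    """Hypothetical annual mortality hazard rising exponentially with age."""
    return scale * 0.001 * math.exp(0.09 * (age - 40.0))


def cumulative_incidence(ages, years=15, fx_scale=1.0, mort_scale=1.0):
    """Mean cumulative incidence of a first fracture at years 1..years,
    treating death before fracture as a competing event."""
    totals = [0.0] * years
    for age in ages:
        at_risk = 1.0   # probability of being alive and fracture-free
        cum_fx = 0.0
        for t in range(years):
            hf = fracture_rate(age + t, fx_scale)    # age updated annually
            hd = mortality_rate(age + t, mort_scale)
            total = hf + hd
            p_event = 1.0 - math.exp(-total)         # any event during the year
            if total > 0:
                cum_fx += at_risk * p_event * hf / total  # fracture share of events
            at_risk *= 1.0 - p_event
            totals[t] += cum_fx
    return [v / len(ages) for v in totals]


def calibration_ratios(cif):
    """Simulated incidence divided by the linearly rescaled 10-year value."""
    return [cif[t] / (cif[9] * (t + 1) / 10.0) for t in range(len(cif))]


random.seed(0)
ages = [random.uniform(40, 90) for _ in range(10_000)]

base = cumulative_incidence(ages)
no_mortality = cumulative_incidence(ages, mort_scale=0.0)
double_fracture = cumulative_incidence(ages, fx_scale=2.0)

for label, cif in [("base", base), ("no mortality", no_mortality)]:
    r = calibration_ratios(cif)
    print(label, round(r[0], 2), round(r[14], 2))
```

Ratios close to 1.0 at year 1 and year 15 indicate that linear rescaling of the 10-year value tracks the simulated cumulative incidence; varying `fx_scale` and `mort_scale` mimics the sensitivity analysis described above.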

Results

Baseline population characteristics are summarized in Table 1. There were 62,275 women (mean age 64.1 ± 11.1 years) and 6455 men (mean age 66.0 ± 12.0 years, p < 0.001). Men referred for BMD testing tended to have more clinical risk factors for fracture than women (all p < 0.001 except for parental hip fracture), but had higher mean femoral neck T-score (p < 0.001) and lower 10-year fracture risk probabilities (all p < 0.001). During follow-up of up to 15 years (mean 7.1 ± 4.2 years, approximately 486,000 person-years), 5280 (8.5%) of the women sustained one or more incident MOF including 1469 (2.4%) with a hip fracture, while 382 (5.9%) of the men sustained one or more incident MOF including 108 (1.7%) with a hip fracture (both p < 0.001).

Table 1 Study population baseline characteristics

Fracture probability was significantly greater in those with versus those without incident MOF or hip fracture for all follow-up time intervals (Supplemental Table 1). FRAX without and with BMD consistently predicted MOF and hip fracture for all time intervals in both women and men (Table 2). In general, the gradient of risk was slightly higher for FRAX with BMD vs FRAX without BMD, and higher for hip fracture prediction than for MOF prediction. There was no attenuation in the gradient of risk for MOF prediction even for years >10 in women (HR per SD: FRAX without BMD 2.15, 95% CI 1.96–2.35; FRAX with BMD 2.24, 95% CI 2.05–2.45) or men (FRAX without BMD 2.55, 95% CI 1.47–4.41; FRAX with BMD 2.15, 95% CI 1.44–3.22). Gradient of risk was slightly lower for hip fracture prediction in years >10 vs years <5, though HRs remained high and CIs overlapped.
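For readers unfamiliar with the gradient of risk, the sketch below shows one common way to express a Cox coefficient as an HR per SD: the coefficient for log-transformed FRAX is multiplied by the SD of the log scores and then exponentiated. The function name and the numbers are hypothetical, and the study may equally have standardized the scores before model fitting, which gives the same result.

```python
import math
import statistics


def hr_per_sd(beta_per_log_unit: float, log_scores) -> float:
    """Convert a Cox coefficient per unit of log(FRAX) into a hazard ratio
    per standard deviation of the log-transformed scores."""
    sd = statistics.stdev(log_scores)
    return math.exp(beta_per_log_unit * sd)


# Hypothetical 10-year probabilities (percent) and a hypothetical coefficient.
frax_percent = [3.0, 5.5, 8.0, 12.0, 20.0, 31.0]
log_frax = [math.log(p) for p in frax_percent]
print(hr_per_sd(0.9, log_frax))
```

Expressing results per SD puts FRAX computed with and without BMD on a common scale, so that the two gradients can be compared directly.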

Table 2 Gradient of risk (hazard ratio per standard deviation [HR per SD] with 95% confidence intervals [CI]) for prediction of incident major osteoporotic fracture (MOF) and hip fracture from FRAX

Calibration plots for observed vs predicted fracture probability from 1 to 15 years are shown for MOF (Fig. 1) and hip fracture (Fig. 2). Among women, there was near-perfect linearity in MOF predictions without and with BMD (both R² = 1.00) with a slope coefficient close to unity (0.92 and 0.97, respectively). Among men, there was also a high degree of linearity in the relationship (both R² = 0.96) with a slope that again approximated unity (both 1.07). For hip fracture probability, there was again a high level of linearity for women (both R² = 0.95) and men (both R² = 0.97). Among women, observed hip fracture probability fell significantly below predicted fracture probability for FRAX without BMD during the early years but slightly exceeded predicted fracture probability for FRAX with BMD during the later years. Among men, there was close agreement between observed and predicted hip fracture probability (slope = 0.99 without BMD, 1.06 with BMD) and all CIs crossed the line of identity.

Fig. 1

Calibration plots for observed vs predicted fracture probability for major osteoporotic fractures (MOF), from 1 to 15 years

Fig. 2

Calibration plots for observed vs predicted fracture probability for hip fractures, from 1 to 15 years

Calibration ratios (observed vs predicted probability) from 1 to 15 years according to sex are shown for MOF (Supplemental Table 2) and hip fracture (Supplemental Table 3). MOF calibration ratios approximated unity for all years among women; among men MOF calibration ratios slightly exceeded unity during the initial years, but showed good concordance during the later years. Hip fracture calibration ratios approximated unity for all years among men; among women hip fracture calibration ratios were significantly below unity during the initial years, but exceeded unity in the later years when BMD was included in the calculation.
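The observed probabilities entering these calibration ratios come from a cumulative incidence function that treats death as a competing event. A generic nonparametric estimator of this kind (Aalen-Johansen type) is sketched below; it is not the software routine used in the study, and the function name, event coding, and toy data are assumptions made for illustration.

```python
from collections import Counter

CENSORED, FRACTURE, DEATH = 0, 1, 2


def fracture_cif(times, events, horizon):
    """Cumulative incidence of fracture by `horizon`, with death as a competing event.

    times  : follow-up time for each subject
    events : CENSORED, FRACTURE, or DEATH for each subject
    """
    counts = {}
    for t, e in zip(times, events):
        counts.setdefault(t, Counter())[e] += 1

    n_at_risk = len(times)
    event_free = 1.0   # probability of being alive and fracture-free so far
    cif = 0.0
    for t in sorted(counts):
        if t > horizon:
            break
        c = counts[t]
        d_fx, d_dead = c[FRACTURE], c[DEATH]
        cif += event_free * d_fx / n_at_risk
        event_free *= 1.0 - (d_fx + d_dead) / n_at_risk
        n_at_risk -= d_fx + d_dead + c[CENSORED]
    return cif


# Toy follow-up data (years) for four hypothetical subjects.
toy_times = [1, 2, 3, 4]
toy_events = [FRACTURE, DEATH, CENSORED, FRACTURE]

observed_4y = fracture_cif(toy_times, toy_events, horizon=4)
predicted_4y = 0.60   # hypothetical rescaled FRAX probability
print(observed_4y / predicted_4y)   # calibration ratio (observed vs predicted)
```

Unlike one minus the Kaplan-Meier estimate with deaths censored, this estimator does not allow subjects who have died to contribute to later fracture probability, which is the property that avoids over-estimation in groups with high mortality.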

Among the 10,012 women and men with a prior fracture, 3614 had sustained a fracture in the year prior to BMD testing, 2142 between 1 and 5 years prior to BMD testing and 4256 more than 5 years prior to BMD testing. Time to fracture recurrence was not significantly different between the three subgroups (Fig. 3). In those sustaining a fracture in the year prior to BMD testing the MOF calibration ratio slightly exceeded unity, peaking in year 3: ratio for FRAX without BMD 1.18 (95% CI 1.02–1.34), ratio for FRAX with BMD 1.29 (95% CI 1.12–1.46). A similar non-significant trend was also seen in the hip fracture calibration ratio in those who had sustained a fracture in the year prior to BMD testing, which again peaked in year 3: ratio for FRAX without BMD 1.05 (95% CI 0.78–1.32), ratio for FRAX with BMD 1.33 (95% CI 0.99–1.67). By year 10 and beyond, the MOF and hip fracture calibration ratios again approximated unity.

Fig. 3

Risk for recurrent fracture according to time since prior fracture

Simulation models (Supplemental Figs. 1, 2, 3, 4, 5) showed near-perfect linearity in MOF and hip fracture probability to 15 years when assumptions reflected fracture and mortality rates in the Canadian population (base case). Results were robust to large relative increases or decreases in fracture rates, and moderate relative increases or decreases in mortality rates. In the presence of large mortality differences (no mortality or a doubling), non-linear effects became conspicuous.

Discussion

This assessment of fracture outcomes from a large clinical registry found that FRAX probabilities, computed with and without BMD, continued to predict incident MOF and hip fracture up to 15 years, with little if any attenuation in fracture discrimination over time. There were few consistent changes in calibration in comparisons of observed vs predicted fracture probability, but strong linearity was seen in these relationships for the rescaled FRAX probabilities spanning 1 to 15 years. Deviations from linearity, with a slightly higher observed than predicted fracture probability, were evident in the first years following a recent fracture event and after 10 years for hip fracture prediction in women using FRAX with BMD. In contrast, there was excellent calibration for prediction of MOF in women whether or not BMD was included in the FRAX calculation.

Our data mitigate concerns regarding the use of FRAX for estimation of fracture risk over a time horizon shorter than 10 years [8]. These results may at first appear counter-intuitive since advancing age is a strong independent risk factor for fracture, and the risk of fracture would be expected to increase over time [5]. However, it is also an equally strong risk factor for death, and therefore, the competing mortality framework used by FRAX would be expected to attenuate this effect. This was confirmed in our simulations. The tendency of FRAX to overestimate hip fracture risk in women during early years but to underestimate risk in later years (when computed with BMD) is not clearly explained as it was not seen in men or the base case simulation. Our data are supported by a recent large-scale study from Israel in over one million subjects aged 50 to 90 years where the rate of fracture events was approximately constant, indicating that the cumulative rate of events was linear [21] and, thus, also supporting our hypotheses. Our findings suggest that rescaling FRAX probabilities to estimate fracture risk for intervals less than 10 years and up to 15 years is reasonable in populations with fracture and mortality epidemiology similar to Canada. Simulations suggest that this may not be generalizable to populations with markedly different fracture probability, however.
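The competing-mortality argument can be written compactly. The notation below is introduced here for illustration and is not taken from the FRAX documentation: h_f is the fracture hazard, h_d is the hazard of death before fracture, S is the probability of being alive and fracture-free, and F_f is the cumulative incidence of fracture.

```latex
\begin{align}
S(u)   &= \exp\!\left(-\int_0^u \left[h_f(s) + h_d(s)\right]\,ds\right), \\
F_f(t) &= \int_0^t h_f(u)\,S(u)\,du, \\
F_f(t) &\approx \frac{t}{10}\,F_f(10)
        \quad \text{when } h_f(u)\,S(u) \text{ is roughly constant in } u.
\end{align}
```

With advancing age h_f rises while S falls, largely because h_d rises as well, so the product h_f(u)S(u) can remain nearly flat over the follow-up window. Linear rescaling is expected to fail when this balance is disturbed, for example if mortality is removed or doubled.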

Although a recent fracture is associated with greater risk for recurrent fracture than a remote fracture [8], we found that this had a relatively small early effect on the performance of FRAX that peaked by 3 years and was no longer evident at 10 years. This may reflect data showing that prior MOF is a persistently strong risk factor for recurrent fracture with a relatively small attenuation seen for fractures within the first year (adjusted HR 1.90, 95% CI 1.60–2.25) compared to fractures more than 10 years earlier (HR 1.62, 95% CI 1.25–2.10) following BMD testing [14]. Alternatively, a recent fracture is associated with excess mortality that rapidly declines [28, 29] and would attenuate the effect of prior fracture on cumulative fracture incidence adjusted for competing mortality [26]. Morphometric vertebral fractures, which are not an output from FRAX and cannot be accurately ascertained through administrative data, may represent an exception as the incidence of a new radiographic vertebral fracture in the year following a vertebral fracture has been reported to be as high as 20% [30].

Limitations to this study include the reliance on administrative data for fracture outcomes and for some of the FRAX input variables. However, the definitions for fracture and FRAX covariates used have been validated and used in many previous studies and have also been tested and adopted for national surveillance of osteoporosis and related fractures in Canada [24, 31, 32]. The clinical referral source of the study cohort is recognized and may explain why observed fracture risk exceeds predicted in some groups (such as MOF in men) who may have additional risk factors that are not considered in FRAX (e.g., hypogonadism and androgen deprivation therapy). However, the inclusion of all individuals within the geographic region referred for BMD testing suggests that our results are likely to be broadly generalizable to clinical practice. Additionally, the sample size for men was much smaller than for women, and there were fewer hip fractures among men, particularly in years >10 of the study. As we have previously reported, prior or concurrent osteoporosis medication use did not affect the performance of FRAX in our dataset [23], and therefore, these individuals were not excluded; possible reasons for the insensitivity of FRAX to therapy are discussed by McClung [33]. It has also been proposed that some therapies (most notably bisphosphonates) may reduce mortality independent of their effect on fracture [34]. This remains controversial, but if true might modify the performance of FRAX through an effect on competing mortality. Finally, our simulation accounted for the effect of age as a shared risk factor for fracture and death, but did not consider other shared risk factors.

In summary, we found that FRAX effectively predicts incident MOF and hip fracture up to 15 years following baseline assessment. Rescaling of FRAX probability measurements may be useful to predict both very short- and extremely long-term fracture risk in populations with fracture and mortality epidemiology similar to Canada and assuage concerns that FRAX is only of utility in predicting 10-year risk of fracture.