Introduction

Hip fractures are among the most common fractures in elderly women, and they are responsible for considerable morbidity and increased mortality [1, 2, 3]. Several cross-sectional and prospective studies have shown that the risk of hip fracture increases when bone mineral density (BMD) decreases and that a low BMD is clearly one of the best prognostic factors for hip fracture [4, 5, 6, 7]. There is also evidence that, independently of BMD, quantitative ultrasound parameters measured on the calcaneus are significant predictors of hip fracture [8, 9, 10]. However, few prospective studies have focused on very old women [7, 9, 11, 12]. It is known that the predictive value of a diagnostic test is influenced by the characteristics of the population being tested [13]. In addition, other risk factors for fracture, especially risk factors for falls, increase with age, so that the influence of bone density might decrease compared with other risk factors in very elderly women.

Since assessment of bone status in women over 75 years of age is controversial, our goal was to study and compare the utility of BMD and that of ultrasound measurements to predict hip fracture in an older population of women. This study also addressed the specific issue of a possible decrease of this utility with advancing age.

Materials and methods

Subjects

From January 1992 to January 1994, a cohort of 7,598 healthy Caucasian women aged 75 years and over was enrolled in an epidemiological multicenter prospective study: EPIDOS. Subjects were recruited using the voting lists and several health insurance companies’ registers from five French cities (Amiens, Lyon, Montpellier, Paris, and Toulouse). Most women were living independently at home; only 10% lived in nursing homes. Women who had undergone a bilateral hip replacement or had previously suffered a hip fracture were excluded. The study was approved by the appropriate committees on human research, and all the women provided written informed consent. The methodology has been described in more detail in a previous article [14]. The present analysis is based on data collected from the beginning of the study until December 1996.

Bone mineral density and quantitative ultrasound measurements

Baseline BMD (g/cm2) of the proximal femur was measured by dual-energy X-ray absorptiometry (DXA) using the Lunar DPX Plus (GE-Lunar, Madison, WI, USA). Quantitative ultrasound (QUS) measurements of the calcaneus were performed with the Achilles system (GE-Lunar, Madison, WI, USA). This device measures two parameters: the speed of sound (SOS, in m/s) and the attenuation of the ultrasound waves across the calcaneus (BUA, in dB/MHz) [15]. The ultrasound devices were not available in some centers during the first months of the study; thus, ultrasonic measurements were obtained for only 5,978 of the 7,598 participants. To minimize interoperator variability, all technicians were centrally trained for DXA and QUS acquisition and analysis according to the standard procedures provided by the manufacturer. A study-specific operating manual was also available in each center for further training when needed. Strict quality control procedures, including cross-calibration between DXA or QUS devices, were performed. Detailed descriptions of the procedures and results have been published elsewhere [7, 16, 17].

Assessment of hip fractures

A survey on fracture occurrence was conducted every 4 months. The women were mailed a questionnaire followed up, when necessary, by telephone calls. When a participant could not be reached, we obtained information either from a relative or from her usual doctor. Only 538 women (7.2%) were lost to follow-up. Reported hip fractures were confirmed and classified by a rheumatologist on preoperative radiographs and surgical reports.

Statistical analysis: patients’ characteristics

Normal distribution of anthropometric, densitometric, and ultrasonic parameters was tested. All results were expressed as mean and standard deviation (SD).

Observed incidence rate and age

The observed incidence rates of hip fracture for each 1-year age stratum were calculated and expressed as the number of hip fractures per 1,000 woman-years. Important variations from one age to another are due to the fact that the values are observed rates (not predicted ones), sometimes based on a very small number of subjects. This generates considerable variations beyond the age of 91. As a consequence, we modeled the effect of age on the fracture risk using logistic regression.

Regression analysis: relative risk of hip fracture, age, and assessment of bone mineral density

The probability of suffering an event such as a hip fracture for a population at risk over a definite period of time was expressed as the incidence rate per 1,000 woman-years. At the individual level, this represents the absolute risk of suffering a hip fracture. The relative risk is the ratio of the absolute risk in the exposed group to the absolute risk in the unexposed group. This represents the proportion of the overall absolute risk explained by a decrease in quantitative bone assessment but does not tell how many fractures are attributable to this decrease. This latter information depends upon both the absolute and the relative risk. To address this question, we also calculated the risk difference—the risk in the exposed group minus the risk in the unexposed group—that represents the number of hip fractures explained by a decrease in quantitative bone assessment. We calculated the relative risk and the risk difference associated with a decrease of one standard deviation of the quantitative bone assessment techniques. We also calculated the relative risk and the risk difference between the women in the highest quartile of BMD and those in the lowest. When the term “risk” is used alone, it refers to absolute risk.
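The definitions above can be illustrated with a minimal sketch. The function names and the counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not EPIDOS data.

```python
def incidence_rate(events, woman_years, per=1000):
    """Incidence rate expressed per `per` woman-years."""
    return per * events / woman_years


def relative_risk(rate_exposed, rate_unexposed):
    """Ratio of the absolute risk in the exposed group to that in the unexposed group."""
    return rate_exposed / rate_unexposed


def risk_difference(rate_exposed, rate_unexposed):
    """Absolute risk in the exposed group minus that in the unexposed group."""
    return rate_exposed - rate_unexposed


# Hypothetical counts (NOT study data): low-BMD group is "exposed".
rate_low_bmd = incidence_rate(events=40, woman_years=2000)
rate_high_bmd = incidence_rate(events=10, woman_years=2500)

rr = relative_risk(rate_low_bmd, rate_high_bmd)
rd = risk_difference(rate_low_bmd, rate_high_bmd)
print(f"RR = {rr:.1f}, RD = {rd:.1f} per 1,000 woman-years")
```

The same relative risk can correspond to very different risk differences depending on the baseline rate, which is why both measures are reported.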

We used Poisson regression models to calculate the relative risks (and their 95% confidence intervals) for the association between each individual predictor variable (age, BMD, SOS, or BUA) and the risk of hip fracture during follow-up [18]. We also checked possible interactions between variables. Variables and interaction terms with significant effects in bivariate models (p value <0.05) were entered in multivariate models. We calculated the relative risk for one standard deviation increase or decrease of each predictor variable in several models. For each model, we measured the deviance, i.e., the unexplained residual variance of the model, thus its goodness of fit. Theoretically, when only one additional variable is introduced in a given model, the new model is significantly better than the previous one if the difference between the deviances is higher than 3.42 (χ-square test with 1 degree of freedom).
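As a rough illustration of the deviance comparison described above, the sketch below computes the p value for a difference in deviance between two nested models that differ by one parameter. It uses only the standard library and relies on the fact that, for 1 degree of freedom, the chi-square upper-tail probability equals erfc(sqrt(x/2)). The function name is hypothetical; the two deviances are the values reported in this article for the models without and with the age-by-BMD interaction term (2,633 and 2,622).

```python
import math


def deviance_difference_p_value(deviance_reduced, deviance_full):
    """Compare two nested models differing by ONE parameter.

    Returns the deviance difference and its chi-square (1 df) p value.
    Valid only for 1 degree of freedom.
    """
    diff = deviance_reduced - deviance_full
    if diff < 0:
        raise ValueError("the fuller model should not have a larger deviance")
    p_value = math.erfc(math.sqrt(diff / 2.0))
    return diff, p_value


# Deviances reported for the models without / with the age*BMD interaction.
diff, p_value = deviance_difference_p_value(2633, 2622)
print(f"deviance difference = {diff}, p = {p_value:.5f}")
```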

To visualize the effect of aging on the risk of hip fracture stratified by quartiles of BMD, we used the results of the best regression model and plotted the estimated predicted risk against age (Fig. 3). A log scale was chosen to show the trend over time.

We constructed receiver operating characteristic (ROC) curves for BMD and ultrasound measurements in women who had been followed up for at least 3 years. Initially, the ROC curves were stratified by four age groups: 75–79, 80–84, 85–89, and ≥90 years. However, the ≥90 group was too small, so the last two groups had to be pooled to obtain more robust estimates. The areas under the ROC curves (AUC) and their 95% confidence intervals were calculated and compared according to the method described by Hanley and McNeil [19]. All analyses were performed with the Statistical Analysis System (SAS; Cary, NC, USA).
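For readers unfamiliar with this approach, the sketch below shows the widely used closed-form approximation of the standard error of an AUC published by Hanley and McNeil in 1982, together with a normal-approximation 95% confidence interval. Whether the authors used exactly this formula is an assumption, and the AUC and group sizes in the example are hypothetical rather than study values.

```python
import math


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley & McNeil, 1982).

    n_pos: number of subjects with the event (e.g., hip fracture)
    n_neg: number of subjects without the event
    """
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def auc_confidence_interval(auc, n_pos, n_neg, z=1.96):
    """Normal-approximation confidence interval around an AUC."""
    se = hanley_mcneil_se(auc, n_pos, n_neg)
    return auc - z * se, auc + z * se


# Hypothetical example: AUC of 0.75 with 100 fractures and 2,000 non-fractures.
se = hanley_mcneil_se(0.75, n_pos=100, n_neg=2000)
low, high = auc_confidence_interval(0.75, n_pos=100, n_neg=2000)
print(f"SE = {se:.4f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

Because the standard error shrinks as either group grows, small age strata (such as the ≥90 group) yield wide intervals, which is consistent with the decision to pool the two oldest groups.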

Results

Patients’ characteristics

During an average of 3.5 years of follow-up (representing 27,251 woman-years), 293 women sustained their first nontraumatic hip fracture: 71 during the 1st year (9.5 per 1,000 woman-years), 75 during the 2nd (10.5 per 1,000 woman-years), 81 during the 3rd (11.9 per 1,000 woman-years), 58 during the 4th (12.3 per 1,000 woman-years), and 8 during the 5th (7.1 per 1,000 woman-years). The main characteristics of the whole cohort and of the women with and without hip fracture are shown in Table 1. Comparing the last two groups using a two-tailed t-test, significant differences (p<0.05) were found for age, weight, body mass index, BMD, and QUS, but not for height and age at menopause.
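As a quick arithmetic check of the figures above, the yearly counts can be summed and converted into an overall incidence rate. The overall rate is not stated in the text; it is derived here from the reported totals only, and the variable names are illustrative.

```python
# Hip fractures reported for follow-up years 1 to 5.
fractures_by_year = [71, 75, 81, 58, 8]
total_fractures = sum(fractures_by_year)

# Total follow-up reported in the text.
woman_years = 27251

# Overall incidence rate per 1,000 woman-years (derived, not quoted).
overall_rate = 1000 * total_fractures / woman_years
print(f"{total_fractures} fractures, {overall_rate:.2f} per 1,000 woman-years")
```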

Table 1 Characteristics of the study volunteers

Observed incidence rate of hip fracture

Observed hip fracture risks are expressed as hip fracture incidence rates per 1,000 woman-years for each year of age and are displayed in Fig. 1. Important variations from one point to another are due to the fact that these are observed rates based, especially in very elderly women (over 90), on a very small number of subjects. The overall incidence rate of hip fracture for the whole follow-up duration increases with age. This is consistent with the exponential increase of hip fracture incidence with age usually described in the literature. Observed fracture risks by age groups and by BMD quartiles are shown in Fig. 2. As expected, they increase with advancing age and with decreasing BMD. From Fig. 2, the respective effects of age and BMD can be expressed in either relative or absolute terms. The relative risk between the lowest and the highest quartile of BMD decreases with age. For example, in the age group 75–79, the risk is multiplied by a factor of 12 between the highest and the lowest quartile of BMD, whereas it is multiplied by 6 in the group 80–84 and by 3 in the group 85+ (Table 2). The corresponding risk difference represents the difference in incidence per 1,000 woman-years between women in the lowest BMD quartile and those in the highest. This risk difference amounts to approximately 15 hip fractures per 1,000 woman-years in the age group 75–79, 16 in the group 80–84, and 24 in the group 85+.

Fig. 1 Incidence rate of hip fracture per 1,000 woman-years by year of age

Fig. 2 Average incidence rate of hip fracture by age groups and BMD quartiles

Table 2 Relative risk and risk difference of hip fracture between highest and lowest BMD quartiles in three age groups

Regression analysis: relative risk of hip fracture, age, and assessment of bone status

Poisson regression models are displayed in Table 3. The model that provides the best goodness of fit is no. 8. This model combines age, BMD, and an interaction term between age and BMD. The risk of hip fracture is increased 1.7 times for a standard deviation increase of age (3.7 years) and 2.2 times for a standard deviation decrease of BMD (0.11 g/cm2). In other words, for a 5-year increase of age, the risk would be increased 2.05 times. The interaction term between age and BMD (age*BMD) is negative and statistically significant (associated relative risk: 0.83 [0.74–0.93]), and its introduction leads to a better model, since the deviance of this model (2,622) is smaller than that of the model without the interaction term (2,633) by a difference larger than 3.42. This means that, in elderly women, the relationship between BMD and the risk of hip fracture does not remain the same at all ages. The part of hip fracture risk explained by BMD tends to decrease with advancing age. From the regression models, the average relative risk between quartiles is 2.20 [1.78–2.70] between 75 and 80, 1.66 [1.37–2.02] between 80 and 85, and 1.40 [1.13–1.73] after 85.
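The conversion from a relative risk per standard deviation of age to a relative risk per 5 years follows from the log-linear form of the model: the relative risk is raised to the power of the ratio of the two increments. A minimal sketch, with a hypothetical function name and the values quoted above:

```python
def rescale_relative_risk(rr_per_unit, unit_size, new_increment):
    """Rescale a relative risk from one increment of a predictor to another.

    In a log-linear (e.g., Poisson) model, log(RR) is proportional to the
    increment, so RR_new = RR_old ** (new_increment / unit_size).
    """
    return rr_per_unit ** (new_increment / unit_size)


# RR of 1.7 per SD of age (3.7 years), expressed per 5 years of age.
rr_5_years = rescale_relative_risk(1.7, unit_size=3.7, new_increment=5.0)
print(f"RR per 5 years = {rr_5_years:.2f}")
```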

Table 3 Relative risks of hip fracture and confidence limits for one standard deviation increase of age (3.7 years) and one SD decrease of BMD, BUA, and SOS estimated from 10 regression models

In the regression models that contain the ultrasound measurements, the interaction term between BUA or SOS and age (1.74 versus 1.66, and 1.75 versus 1.66, respectively) is not statistically significant, because the confidence interval of the relative risk associated with the interaction term includes 1 (0.91 [0.81–1.03] for BUA and 0.92 [0.80–1.05] for SOS). This is confirmed by the deviances of models 6 and 7 (respectively 2,667 and 2,677), which have no interaction terms and are not significantly different from the deviances of models 9 and 10, which include interaction terms between BUA or SOS and age (respectively 2,665 and 2,675). In other words, in our study population (5,978 women who underwent QUS), we did not show a significant effect of age on the ability of SOS and BUA to predict hip fracture. To model the respective weight of age and BMD in predicting the risk of hip fracture, we used the results of the best regression model (model no. 8, with the smallest deviance) to plot the relative risks against age on a log scale (Fig. 3). The upper line corresponds to the lowest BMD quartile and to the highest risk of hip fracture. The risk of hip fracture increases linearly on the log scale, and thus exponentially, with age. The convergence of the four curves with increasing age indicates that the role of BMD in predicting the risk of hip fracture decreases. Thus, in very old women, the part of the risk explained by BMD tends to decrease with advancing age. A similar analysis was performed with QUS, based on models no. 9 and 10. The results were very comparable for SOS and BUA; thus, only BUA was added in Fig. 3.
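The convergence of the BMD curves can be illustrated with a small sketch of how a multiplicative interaction term changes the BMD relative risk as age increases. Several points are assumptions rather than facts from the article: that age and BMD enter the model as standardized variables, that the main-effect relative risk for BMD (2.2) applies at the reference age where standardized age equals zero, and that the interaction relative risk (0.83) multiplies the BMD effect once per standard deviation of age. The function name is hypothetical.

```python
AGE_SD_YEARS = 3.7  # one standard deviation of age, as reported


def bmd_rr_at_age_offset(years_above_reference, rr_bmd=2.2, rr_interaction=0.83):
    """RR of hip fracture per 1 SD decrease of BMD, at a given age offset.

    ASSUMPTION: the interaction acts multiplicatively per SD of age above
    the (unstated) reference age of the standardized age variable.
    """
    age_in_sd = years_above_reference / AGE_SD_YEARS
    return rr_bmd * rr_interaction ** age_in_sd


for years in (0, 5, 10):
    print(years, round(bmd_rr_at_age_offset(years), 2))
```

Under these assumptions the relative risk per standard deviation of BMD shrinks with each additional 5 years of age but stays above 1, which mirrors the narrowing (not the disappearance) of the gap between quartiles.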

Fig. 3 Predicted risk of hip fracture by quartile of BMD and BUA with advancing age. Y-axis is a log scale

ROC curves: assessment of hip fracture discrimination

The comparison of the ROC curves relative to the three parameters of bone assessment in the three age groups (Fig. 4a) shows that BMD is apparently more efficient than BUA or SOS, since the BMD curve does not cross the QUS curves (except for extremely low values of sensitivity). This is confirmed by the comparison of the areas under the ROC curves, but only in the group 75–79 years of age. In this group of women, the AUC of BMD is significantly larger (0.75 [0.73–0.76]) than that of BUA and SOS (respectively 0.67 [0.66–0.69] and 0.67 [0.65–0.69]), since the confidence interval of the AUC of BMD does not overlap with the confidence intervals of the AUC of the ultrasound parameters. In the group aged 80–84 years, the AUC of BMD (0.65 [0.63–0.67]) is similar to that of BUA (0.66 [0.64–0.69]), whereas the AUC of SOS is lower, almost significantly so (0.60 [0.58–0.63]). Statistical comparison between the BUA and SOS curves is not relevant because the curves cross over several times.

Fig. 4a,b a Receiver operating characteristic (ROC) curves comparing the ability of femoral neck BMD, BUA, and SOS measurements to predict hip fracture. b Receiver operating characteristic (ROC) curves comparing the ability of femoral neck BMD to predict hip fracture in the three different age groups of elderly women

Considering BMD only (Table 4), the AUC for women aged 75–79 years is larger than those for the two older groups. The differences are statistically significant, since the confidence limits of the area for the youngest group (75–79 years) do not overlap with those for the two older groups (80 and over). We can therefore conclude that BMD is a stronger predictor of hip fracture before age 80 than after age 80 and that, in women aged 75–79, the AUC for BMD appears significantly greater than the AUC for either SOS or BUA. The corresponding ROC curves of BMD for the three age groups are shown in Fig. 4b. They confirm the preceding results (Table 3), showing that, in our very old cohort, the sensitivity and the specificity of femoral neck BMD measures decrease with advancing age. However, while the sensitivity and the specificity of BMD may decrease with age, they are still in the same range as those of ultrasound (especially in the oldest age group).

Table 4 Area under the ROC curves (AUC) obtained with BMD, BUA, and SOS for three age groups

Discussion

The appropriate use of bone measurements in very elderly women is less clear than for younger postmenopausal women; it has even been suggested that elderly women over the age of 70 years with multiple clinical risk factors are eligible for bisphosphonate therapy without measurement of BMD [20]. However, more recently, a clinical trial of a new bisphosphonate (risedronate) showed that, in contrast with a significant reduction of hip fracture in those selected on the basis of a very low BMD (T-score <−4, or <−3 plus at least one risk factor), those selected on the basis of clinical risk factors did not significantly benefit from the treatment [21].

The aim of the present article was to explore more precisely the influence of aging on the ability of femoral neck DXA to predict the risk of hip fracture in women aged 75 and over. We also investigated this pattern for quantitative ultrasound measurements.

Our prospective data, as expected, confirm that BMD of the femoral neck, as well as BUA and SOS of the calcaneus, predict the risk of hip fracture in elderly women. Within the follow-up of the cohort (3.5 years), the time since bone assessment did not significantly alter the predictive power of BMD or of BUA and SOS. A previous analysis of the EPIDOS study, based on the WHO definition of osteoporosis and on an arbitrary cutoff at 80 years of age, showed that the relative risk of hip fracture in the “osteoporotic” group (T-score ≤−2.5) versus the combined group of low-BMD and normal-BMD women was 4.4 (95% CI, 3.6 to 5.5) for women aged 75 to 79, compared with 2.5 (95% CI, 2.0 to 3.1) for women who were 80 and over [7]. The present analysis explored more precisely the relationship between age and BMD and QUS, respectively. We observed a significant negative interaction between age and BMD in predicting hip fracture, meaning that, with advancing age, a smaller part of the risk appeared to be explained by BMD. Nevitt et al., in the subgroup of the older women of the SOF study, did not find a significant interaction between age and BMD and concluded that there was no apparent attenuation in the strength of the association between bone density and fracture after 65 years of age [11]. However, the interaction term was associated with a p value of 0.07, which was not far from the usual level of significance. The number of very elderly women (1,005 women aged 80+ years), which was smaller than that of the EPIDOS study, might explain the apparent differences in conclusions. ROC curves are significantly better before 80 years of age than after, and Fig. 3 shows that it would take approximately 15 years for women at a low risk (in the highest quartile of BMD) to reach the risk level of women at a high risk (in the lowest quartile of BMD). These are additional reasons to recognize the usefulness of DXA testing of the femoral neck at least until the age of 80, and this is consistent with the results of the risedronate trial [21].

In very elderly women, BMD explains a smaller proportion of hip fracture risk, but the absolute number of hip fractures per 1,000 woman-years attributable to low BMD is larger because of the higher absolute risk at these ages. Similar results were published by Nevitt et al. in the elderly subgroup of the SOF cohort [11]. They showed that the relative risk of hip fracture for BMD below the median value was 5.0 in women 65–79 years of age and 3.9 in women ≥80, whereas the risk difference was 5.7 and 16.8, respectively. The discussion should therefore be based on both the relative risk and the risk difference [22]. Besides, there is some evidence suggesting that management of osteoporosis should not rely on BMD testing alone but also on risk factor assessment, especially in elderly women [22, 23]. The age of 80 was chosen arbitrarily to be consistent with the existing literature. It does not seem inappropriate, since using this limit as a cutoff enabled us to show significant differences between ROC curves.
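To see why the relative risk can fall while the risk difference rises, the underlying rates can be recovered algebraically from the two measures: the unexposed rate equals the risk difference divided by (relative risk − 1). The sketch below applies this identity to the SOF figures quoted above. The recovered rates are derived here purely by algebra and are not reported in this article; they carry whatever units the quoted risk differences use, and the function name is hypothetical.

```python
def rates_from_rr_and_rd(rr, rd):
    """Recover (exposed_rate, unexposed_rate) from a relative risk and a risk difference.

    From RR = exposed / unexposed and RD = exposed - unexposed:
        unexposed = RD / (RR - 1),  exposed = RR * unexposed
    """
    if rr == 1:
        raise ValueError("rates cannot be recovered when RR equals 1")
    unexposed = rd / (rr - 1.0)
    exposed = rr * unexposed
    return exposed, unexposed


# SOF values quoted in the text: (RR, RD) by age group.
younger = rates_from_rr_and_rd(rr=5.0, rd=5.7)    # women 65-79
older = rates_from_rr_and_rd(rr=3.9, rd=16.8)     # women >= 80

print("65-79:", [round(x, 2) for x in younger])
print(">=80 :", [round(x, 2) for x in older])
```

The baseline (unexposed) rate implied for the older group is several times that of the younger group, so a smaller ratio still translates into a larger absolute excess.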

The interaction between age and SOS and between age and BUA was negative but did not reach the level of statistical significance. QUS measures were performed on a smaller number of women (5,978) than BMD (7,598), leading to a lower power of the ultrasound data analyses. ROC curves were restricted to the group of women who had both DXA and QUS measurements. Therefore, the results showing better values of AUC for BMD than for BUA or SOS before age 80 years are not due to a lack of statistical power. In conclusion, the comparison of ROC curves shows that the ability of femoral neck DXA to predict hip fracture was significantly better before 80 than after, and this is consistent with previous findings in a very large cohort [11]. This is probably because of a higher risk of falling in the very elderly and a smaller part of the fracture risk explained by BMD [24, 25, 26, 27]. This would support the use of other preventive approaches, such as hip protectors and fall prevention programs, in very elderly women. Nevertheless, since the risk of hip fracture increases exponentially with advancing age, especially after 80, the number of hip fractures attributable to a low BMD is still very important in the very elderly. Thus, even in very elderly women, quantitative assessment of bone is still of interest before making decisions on bone-specific therapeutics such as bisphosphonates [21], even though these should often be associated with other preventive interventions.

Before the age of 80, based on ROC curves, femoral neck BMD was a better predictor of hip fracture than BUA and SOS. After the age of 80, we did not observe significant differences between DXA and QUS in predicting hip fracture. The use of QUS may therefore be considered a valid alternative to DXA beyond this age. Several studies have attempted to define the best screening strategies, comparing BMD to QUS assessment.

Several studies have compared the fracture discriminatory abilities of DXA and QUS and others have tried to determine what was the best combination of both measurements for screening individuals at high risk and have found different results [28, 29, 30, 31, 32, 33, 34]. In this debate, our results suggest that differences in the age of the patients screened might explain some of the discrepancies between studies and should be considered when selecting a bone assessment technology and/or interpreting its results as diagnostic tools or risk indicators for prophylactic or therapeutic purposes.