Introduction

Osteoporosis is a systemic skeletal disease characterized by low bone mass and microarchitectural deterioration of bone tissue, which increases the risk of fractures [1]. The diagnosis of osteoporosis is commonly based on dual energy X-ray absorptiometry (DXA), which is considered the gold standard [2]. Quantitative ultrasound (QUS) is widely used as a non-ionizing, low-cost method for diagnosing osteoporosis [3]. However, the informative value of a single ultrasound measurement for a patient is still controversial, and interpretation is more complicated than with central DXA [4]. Prospective studies have shown that low QUS predicts hip fracture and is associated with low bone mineral density (BMD) of the hip as measured by DXA [5, 6, 7]. Moreover, a few prospective studies have examined the ability of calcaneal broadband ultrasound attenuation (BUA) to predict fractures in perimenopausal women as well [8, 9, 10]. Stewart et al. found a significant association, expressed as odds ratios, between BUA and fracture risk [8], and Thompson et al. showed that QUS can predict fractures among women over 55 years of age [9]. Gnudi et al. found that patellar ultrasound is a significant predictor of osteoporotic fractures in early postmenopausal women [10]. However, information on the ability of QUS to predict future fractures as compared to BMD during the early postmenopausal years is scarce. Our aim was to evaluate how well a single baseline QUS measurement can predict future fractures in early postmenopausal women as compared to femoral or lumbar BMD measurement.

Materials and methods

The initial study population consisted of 506 women, a random subsample of 2025 randomly chosen women who participated in the Kuopio Osteoporosis Risk Factor and Prevention Study (OSTPRE) in Kuopio, Eastern Finland. These 506 women were invited to undergo calcaneal ultrasound measurement (Lunar Achilles, Lunar Corporation, Madison, Wisc., USA) at the time of DXA densitometry (Lunar DPX) of the proximal femur and lumbar spine (L2–L4) during 1994–1995. In all, 84 women were excluded from the analyses: 46 did not turn up, seven had died and two had moved from the area; four women could not be measured due to swollen feet and in 24 cases the ultrasound instrument did not work properly. The reasons for these technical failures in ultrasound measurement could not be fully elucidated. The Lunar Achilles was one of the first-generation ultrasound densitometry devices and may therefore have had some inherent problems that seem to have been overcome in later-generation machines, e.g. the Achilles+. In some patients the machine was unable to obtain a proper signal, perhaps because micro-dust from socks on the skin surface dissolved in the water, even though the foot was properly washed before immersion. Thus, 422 women formed the final study cohort. Body weight and height were measured during the bone measurement visit, and a questionnaire was used to collect data on previous fractures, use of hormone replacement therapy (HRT), alcohol consumption, smoking and dietary calcium intake. At the end of May 1999, a follow-up questionnaire was sent to the entire cohort to collect data on fractures during the follow-up. Self-reported fractures were validated by cross-checking radiological reports in the medical records. Rib fractures were accepted without radiological evidence, based on clinical diagnosis alone. The local ethics committee approved the study design.

Statistical analysis

The first fracture during the follow-up period was the end-point event for the statistical analyses, which were performed using SPSS for Windows, version 7.5 (Statistical Package for Social Sciences; SPSS Inc.). In univariate analyses, the two-tailed, unpaired Student’s t-test or analysis of variance was used for continuous variables and chi-square statistics for categorical variables. Relative risks were estimated as hazard ratios (HR) with 95% confidence intervals (CI) using the Cox proportional hazards model. The following baseline variables were used as covariates in the Cox model: age, weight, height, femoral neck BMD, previous fracture history and use of HRT. Furthermore, receiver operating characteristic (ROC) analysis was used to identify which of the QUS and BMD measurements had the best predictive ability for follow-up fractures. Statistical testing of the ROC curves was performed using the roccomp command (Stata/SE 8.0, Stata Base Reference Manual, Volume 3, Stata Corp., College Station, Tex., USA) [11].
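As an illustration of the ROC analysis described above, the area under the curve (AUC) can be read as the probability that a randomly chosen woman with a fracture has a lower measurement than a randomly chosen woman without one. The following minimal Python sketch computes the AUC in that way. It is not the Stata procedure used in the study; the function name `roc_auc` and all Z-scores below are hypothetical and were invented for illustration only.

```python
def roc_auc(scores_cases, scores_controls):
    """AUC as the share of case/control pairs in which the case has the
    lower score (lower QUS or BMD means higher risk); ties count one half."""
    wins = 0.0
    for case in scores_cases:
        for control in scores_controls:
            if case < control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_controls))


# Hypothetical Z-scores, NOT data from the study.
fracture_z = [-1.4, -0.9, -0.6, 0.1]
no_fracture_z = [-0.8, -0.2, 0.0, 0.3, 0.7, 1.1]

auc = roc_auc(fracture_z, no_fracture_z)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups, which is the scale on which the AUC values reported in the Results should be read.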

Results

During a mean follow-up of 2.6 years (SD 0.7, range 0.3–4.0 years), 32 women experienced 33 fractures. Nine of these were wrist fractures and nine were ankle fractures (Table 1). In addition, nine fractures were judged to be high-energy fractures and 24 low-energy fractures, according to the mechanism of fracture. Characteristics of the study group are presented in Table 2. The only statistically significant differences between the fracture and the non-fracture groups were found in the QUS parameters, while differences in axial bone density between the groups were not statistically significant. Mean Z-scores in the fracture and non-fracture groups were, respectively, −0.37 and 0.03 for BUA, −0.58 and 0.05 for speed of sound (SOS), −0.55 and 0.04 for the stiffness index (SI), −0.21 and 0.02 for spinal BMD, and −0.27 and 0.02 for femoral neck BMD. Furthermore, the mean QUS T-score was −1.5 (95% CI −1.7 to −1.2) in the fracture group and −1.0 (95% CI −1.1 to −0.9) in the non-fracture group.

Table 1 Site (total number of fractures), number and mechanism of accident of 33 follow-up fractures
Table 2 Characteristics of the study cohort (n=422). Continuous variables are presented as means (SD)

When the study subjects were categorized into tertiles according to QUS and BMD values, the distribution of follow-up fractures was different for each parameter (Fig. 1). Most fractures accumulated in the lowest tertiles of SOS and stiffness, while the number of fractures was lowest in the highest tertile of BUA. To find clinically relevant threshold values for judging how well an individual measurement separates fractured from non-fractured women, we chose cut-off values at 1 SD below the group mean (Z-score −1.0) and at the group mean (Z-score 0.0) (Table 3). With −1 SD as the cut-off, only a minority (15.6–40.6%) of fractured women fell below the cut-off. However, women below the group mean experienced most of the fractures (50.0–71.9%) during follow-up. In ROC analyses, SOS showed the largest area under the curve (AUC=0.68) and lumbar BMD the smallest (AUC=0.56) (Fig. 2). Nevertheless, statistically significant differences were found only between the areas of SOS and spinal BMD (P=0.03, Duncan’s P=0.13) and between those of SI and spinal BMD (P=0.02, Duncan’s P=0.07).
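The cut-off comparison above can be sketched in a few lines of Python: each measurement is converted to a Z-score relative to the group mean, and the share of fractured women falling below a given cut-off is counted. This is only an illustrative sketch; the helper names `z_scores` and `share_of_fractures_below`, the stiffness values and the fracture flags are hypothetical and do not come from the study data.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values against the group mean and sample SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def share_of_fractures_below(z, fractured, cutoff):
    """Fraction of all fractured women whose Z-score lies below the cut-off."""
    below = sum(1 for zi, f in zip(z, fractured) if f and zi < cutoff)
    return below / sum(fractured)


# Hypothetical stiffness values and fracture status, NOT study data.
stiffness = [70, 75, 80, 85, 87, 90, 92, 95, 100, 106]
fractured = [True, False, True, False, True, False, False, True, False, False]

z = z_scores(stiffness)
for cutoff in (-1.0, 0.0):
    share = share_of_fractures_below(z, fractured, cutoff)
    print(f"Z < {cutoff:+.1f}: {share:.0%} of fractured women")
```

In this toy example, as in the pattern described above, the stricter cut-off captures fewer of the fracture cases than the group mean does.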

Fig. 1 Distribution of follow-up fractures in QUS and BMD tertiles

Table 3 Proportion (%) of women with a follow-up fracture whose Z-score was below −1 SD and below the group mean, by measurement site
Fig. 2 Receiver operating characteristic (ROC) analysis for QUS and BMD. Area under the curve (AUC) is presented in the legends. The AUC differed significantly from that of spinal BMD for stiffness (P=0.02, Duncan’s P=0.07) and for SOS (P=0.03, Duncan’s P=0.13)

The number of ankle fractures was highest in the lowest SOS and stiffness tertiles; these two parameters seemed to be more sensitive in discriminating ankle fractures than BUA and axial BMD. In contrast, wrist fractures were evenly distributed across the BUA, SOS, stiffness and lumbar BMD tertiles. However, the number of wrist fractures was highest in the lowest tertile of femoral neck BMD, although the difference between tertiles was non-significant. Previous fracture was not an independent predictor of fracture; six women (18.8%) in the fracture group and 86 women (22.1%) in the non-fracture group had experienced at least one fracture before the baseline measurements.

A backward stepwise Cox regression analysis was performed to evaluate the relative risk of fracture per 1 SD change in QUS and BMD measurements (Table 4). After adjusting for confounding variables, the hazard ratio (HR, 95% confidence interval) of a follow-up fracture for a 1 SD decrease was 1.80 (1.27–2.56), 1.72 (1.21–2.45) and 1.43 (1.01–2.03) for SOS, stiffness and BUA, respectively. Similarly, the HR for a 1 SD decrease was 1.27 (0.85–1.94) for spinal BMD and 1.14 (0.78–1.70) for femoral neck BMD. All QUS parameters predicted fracture independently of BMD. In the main analyses, traumatic fractures were not excluded, in order to elucidate the value of a single measurement in predicting a future fracture per se. However, when the Cox analyses were repeated with only low-energy fractures as the end point, the results remained unchanged for QUS and BMD measurements (Table 5).
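To make the reported hazard ratios easier to read, the short Python sketch below converts each HR per 1 SD decrease into a percentage increase in hazard and checks whether its 95% confidence interval excludes 1. The HR values are those reported above; the helper names are hypothetical. The last helper, `hr_for_k_sd`, rests on the log-linear form of the Cox model (the hazard is multiplied by the same factor for every additional SD), which is an assumption of the model rather than a result of the study.

```python
# HR per 1 SD decrease with 95% CI, as reported in the text above.
reported = {
    "SOS": (1.80, 1.27, 2.56),
    "stiffness": (1.72, 1.21, 2.45),
    "BUA": (1.43, 1.01, 2.03),
    "spinal BMD": (1.27, 0.85, 1.94),
    "femoral neck BMD": (1.14, 0.78, 1.70),
}


def percent_increase(hr):
    """Percentage increase in hazard implied by a hazard ratio."""
    return (hr - 1.0) * 100.0


def excludes_one(lower, upper):
    """True if the confidence interval does not contain HR = 1."""
    return lower > 1.0 or upper < 1.0


def hr_for_k_sd(hr_per_sd, k):
    """HR for a k SD decrease, assuming a log-linear effect (model assumption)."""
    return hr_per_sd ** k


for name, (hr, lower, upper) in reported.items():
    status = "excludes 1" if excludes_one(lower, upper) else "includes 1"
    print(f"{name}: +{percent_increase(hr):.0f}% per 1 SD decrease, CI {status}")
```

Read this way, the intervals for the three QUS parameters exclude 1, whereas those for spinal and femoral neck BMD include it.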

Table 4 Relative risk estimates for all follow-up fractures by QUS and BMD measurements. Hazard ratio (HR) and 95% confidence intervals for one standard deviation decrease according to Cox regression
Table 5 Relative risk estimates for low-energy follow-up fractures by QUS and BMD measurements. Hazard ratio (HR) and 95% confidence intervals for one standard deviation decrease according to Cox regression

In power analyses, we used unadjusted hazard ratios for QUS and BMD measurements to find out whether our sample size was large enough to detect the associations. For the given hazard ratios, the sample size sufficient to detect an association was 740 for BUA, 274 for SOS, 303 for stiffness, 1543 for femoral neck BMD and 699 for spinal BMD (Hintze 2001; NCSS and PASS. Number Cruncher Statistical Systems. Kaysville, Utah, USA, www.ncss.com).
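The dependence of the required sample size on the hazard ratio can be illustrated with a commonly used approximation for a Cox model with one standardized continuous covariate: the number of events needed is (z for alpha/2 + z for power) squared, divided by the squared log hazard ratio, and the sample size is that number divided by the expected event rate. The Python sketch below applies this approximation. It is not necessarily the calculation performed by the PASS software; the significance level (0.05, two-sided), the power (80%) and the example hazard ratios are assumptions, because the unadjusted hazard ratios and settings used in the study are not given here. Only the event rate (32 of 422 women) is taken from the text.

```python
from math import ceil, log
from statistics import NormalDist


def events_needed(hr_per_sd, alpha=0.05, power=0.80):
    """Approximate number of events needed to detect a given HR per 1 SD
    (covariate standardized to SD = 1; alpha and power are assumptions)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) ** 2 / log(hr_per_sd) ** 2


def sample_size(hr_per_sd, event_rate, alpha=0.05, power=0.80):
    """Events needed divided by the expected proportion with an event."""
    return ceil(events_needed(hr_per_sd, alpha, power) / event_rate)


event_rate = 32 / 422  # women with a follow-up fracture / cohort size

# Hypothetical hazard ratios, chosen only to show the trend.
for hr in (1.2, 1.5, 1.8):
    print(f"HR {hr}: about {sample_size(hr, event_rate)} women")
```

The weaker the association, the faster the required sample size grows, which is consistent with the much larger numbers reported above for the BMD measurements than for SOS and stiffness.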

Discussion

The present study showed that calcaneal QUS is able to predict early postmenopausal fractures as well as or even better than axial BMD. A 1 SD decrease in BUA values increased the risk of future fracture by 43%, which equaled the risk of a 1 SD decrease in spinal BMD. Furthermore, only QUS measurements were found to predict fractures after adjusting for confounding factors. Our results are comparable with the prospective study by Thompson et al. [9], in which a 1 SD decrease in BUA resulted in a 40% increase in fracture risk in 3150 women aged 45–75 years. Similarly, Stewart et al. showed that a 1 SD reduction in BUA in 1000 perimenopausal women aged 45–49 years predicted a follow-up fracture with an odds ratio of 1.4 (95% CI 1.2–2.4), although there were no statistically significant differences in the mean BUA values between the fracture and the non-fracture groups [8]. In a more recent study by Gnudi et al., a low patellar ultrasound measurement increased the risk of fracture, with an RR of 2.89 (95% CI 1.12–7.42) for a 1 SD decrease [10]. In the elderly, low ultrasound measurement has also been shown to be associated with elevated hip fracture risk [6, 7, 12] and with the risk of other non-spinal fractures [12].

In the present study, adjusting for BMD did not change the ability of QUS measurement to predict fracture. This is in agreement with the assumption that ultrasound measurement also reflects components of bone strength other than BMD [13]. Interestingly, adjusting BMD for QUS parameters seemed to dilute the effect of BMD in fracture prediction. The inability of BMD to predict a fracture in the present study was possibly due to the small sample size, as shown in the power analyses. Evidence from previous studies has shown that low BMD is a prominent and independent predictor of perimenopausal fractures [14, 15, 16, 17, 18]. However, according to our results, one could also speculate that peripheral QUS might discriminate peripheral fractures more accurately than axial BMD measurements. This could be explained by site specificity, i.e. calcaneal QUS is more sensitive than axial BMD to changes in bone quality or composition close to the measurement site. The majority of the follow-up fractures in this study were wrist or ankle fractures (56%), and there were only three vertebral and three pelvic fractures. As a consequence, peripheral measurements could be preferable in detecting (peripheral) fractures during the early postmenopausal period. Moreover, risk ratios for fractures similar to those obtained with calcaneal QUS could be expected from calcaneal BMD measurement [6]. Whether the decision to start treatment for osteoporosis could be based solely on peripheral ultrasonographic measurement is, however, unclear and requires further study.

The follow-up fractures included in each QUS and BMD tertile were partially different, highlighting the fact that different patients were identified depending on the technique. This raises the question of how to interpret QUS results in the light of axial BMD, which is considered the gold standard in osteoporosis diagnosis and treatment. Blake and Fogelman evaluated the differences between axial and peripheral densitometry and indicated that different techniques identify different patients and that no single method identifies all fracture patients [19]. Therefore, it may be unwise to calibrate peripheral QUS to fit the results of axial BMD.

Notwithstanding these results, the overall performance of axial and peripheral measurements in finding fracture cases seems to be far from perfect. There is still a great number of fracture cases with neither low QUS nor low BMD values. In the ROC analyses, the areas under the curves were fairly modest, indicating the limited accuracy of these methods. In other words, women with low QUS or BMD (i.e. a value more than 1 SD below the group mean) experienced only a minority of the fractures. However, if the group mean was used as a threshold, women who fell below it sustained the majority of the fractures. Thus, choosing a cut-off value close to the mean would lead to a more complete detection of fracture candidates. This finding is in agreement with our earlier study, in which we found that a threshold of 0.64 SD below the mean in axial BMD offered the best cut-off point to discriminate between fracture and non-fracture patients [16]. Even so, moving the cut-off point closer to the mean increases the number of false positives. In terms of the T-score adapted from stiffness values, the mean T-score was −1.5 for the fracture group and −1.1 for the non-fracture group in our present study. Thus, a T-score cut-off of −1.2 might provide a useful threshold in clinical practice to discriminate between increased and normal fracture risk.

The strengths of this study include the prospective population-based design and a well-defined cohort of early postmenopausal women, which is a clinically interesting group. However, the main limitations of our study are the low number of fractures, the relatively small sample size and the short follow-up time, which together diminish the study power. According to the power calculations, however, the sample size seemed to be large enough to support the statistically significant results for the stiffness and SOS measurements. On the other hand, for our BMD and BUA results, the study power may not be sufficient. This may also explain why a previous fracture, which has been shown to strongly predict future fractures [16, 17, 20], had no effect on follow-up fractures in this study. Moreover, the study subjects consisted only of Caucasian women; the results should not be applied to men, to other ethnic groups or to women in other age groups. We used a water-coupled device (Lunar Achilles), but according to the study of Njeh et al. [21], our results could be applied with caution to other water-coupled and also gel-coupled devices.

In conclusion, low calcaneal QUS is an independent predictor of fracture in perimenopausal and early postmenopausal women. Our results suggest that the predictive ability of calcaneal QUS is similar to or even better than that of axial BMD. The risk of fracture increases below a QUS T-score cut-off of −1.2 (adapted from stiffness). It seems that early postmenopausal women might benefit more from peripheral than from axial measurements when searching for potential fracture candidates in the short term. Thus, calcaneal QUS can be recommended as a method to assess fracture risk in early postmenopausal women.