Introduction

Quantitative ultrasound (QUS) has proven useful for the prediction of fracture risk in women, particularly hip fracture [1–4]. For non-spine fractures, most prospective studies have found that the association between heel QUS parameters, such as broadband ultrasonic attenuation (BUA), and fracture is similar to that observed for bone mineral density (BMD) of the hip. Conversely, in women, hip fracture prediction with BMD may be superior to QUS. For example, in the Study of Osteoporotic Fractures (SOF), a prospective study of QUS and fracture in 6,189 women over 65, each SD reduction in BUA was associated with a 30% increase in non-spine fracture risk, compared to a 40% increase for calcaneal BMD and a 30% increase for femoral neck BMD [3]. In that study each SD decrease in QUS increased the risk of hip fracture twofold, while each SD decrease in femoral neck BMD increased hip fracture risk by a factor of 2.6. Other studies have found that the relationships of QUS and BMD with hip fracture were similar [1]. Although early reports suggested QUS might also assess some elements of bone microstructure [5–8], in SOF the combination of BUA plus BMD was no more useful than BMD measurements alone [3].

There are few prospective data on QUS and fracture risk in men, and some studies found that calcaneal QUS measurements were technically more difficult in men than in women, perhaps because of heel size [9]. At least one manufacturer (Hologic, Waltham, MA) has developed an algorithm to detect suboptimal QUS measurements, but no studies have tested the utility of this algorithm. To test the hypothesis that low QUS was associated with an increased risk of hip and any non-spine fracture in older men, we analyzed data from the Osteoporotic Fractures in Men (MrOS) Study.

Methods

Design and subjects

MrOS is a multi-center, prospective, longitudinal, observational study of risk factors for vertebral and all non-vertebral fractures in 5,995 older men [10, 11]. Approval of the conduct of the MrOS study was obtained from the institutional review boards of the participating clinics and written informed consent was obtained from all study participants.

The MrOS study population consists of community dwelling, ambulatory men aged 65 years or older. Inclusion criteria included the following: (1) ability to walk without the assistance of another, (2) absence of bilateral hip replacements, (3) ability to provide self-reported data, (4) residence near a clinical site for the duration of the study, (5) absence of a medical condition that (in the judgment of the investigator) would result in imminent death, and (6) ability to understand and sign an informed consent. To qualify as an enrollee, the participant had to provide written informed consent, complete the self-administered questionnaire (SAQ), attend the clinic visit, and complete at least the baseline anthropometric, BMD, and vertebral X-ray procedures.

Participants were recruited at six academic medical centers: Oregon Health & Science University, Portland, OR; Stanford University, Palo Alto, CA; University of Alabama at Birmingham, AL; University of California, San Diego, San Diego, CA; University of Minnesota, Minneapolis, MN; University of Pittsburgh, Pittsburgh, PA. Recruitment efforts focused on community mailings, although a variety of community outreach and educational activities were also used. Details of the MrOS recruitment strategies have been published [10].

Study events and timeline

The study enrolled participants and completed the initial baseline examination over a 25-month period from March 2000 through April 2002. Thereafter, all subjects were queried by mail or phone every four months about the occurrence of incident falls and fractures. We were able to complete 99% of these contacts in surviving men.

Measurements

At the baseline visit, QUS of the right heel was measured in 5,607 participants with a Sahara machine using the scanning protocol provided by the manufacturer (Hologic, Waltham, MA). This device provides three QUS parameters: broadband ultrasonic attenuation (BUA in dB/MHz, defined as the linear slope when ultrasound attenuation, expressed on a logarithmic scale, is plotted against frequency), speed of sound (SOS in m/s) and quantitative ultrasonic index (QUI, a unitless proprietary linear combination of BUA and SOS). Non-linear plots of attenuation vs. frequency (defined by the manufacturer as a Chi-square > 50) are identified by an asterisk on the individual printout results and in the study database.

The protocol specified two measurements with repositioning; if the first two BUA measurements differed by 10 dB/MHz or more or if either measurement was non-linear, a third measurement was obtained. For men with three measurements, the mean of their two closest measurements was used in all analyses, including precision estimates, and for men with two measurements the mean of those two measurements was used. The coefficient of variation (CV) for each device, based upon the participants’ replicate measurements, ranged from 3.0 to 3.8, and the mean CV for all devices was 3.3. Acoustic phantoms provided by the manufacturer were scanned weekly at each site.
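The measurement rule above can be illustrated with a minimal Python sketch. It is not the study's code (the analyses were run in SAS), and the function names are hypothetical. It assumes that "two closest" means the pair with the smallest absolute difference. If two pairs are equally close, it takes the first pair encountered, a tie-breaking choice the protocol does not specify.

```python
from itertools import combinations
from statistics import mean


def needs_third_measurement(first, second, first_nonlinear=False,
                            second_nonlinear=False, threshold=10.0):
    """Return True if the protocol calls for a third BUA measurement.

    A third scan is required when the first two differ by `threshold`
    dB/MHz or more, or when either scan was flagged as non-linear.
    """
    return (abs(first - second) >= threshold
            or first_nonlinear or second_nonlinear)


def summary_bua(measurements):
    """Return the BUA value used in analysis for one participant.

    Two measurements: their mean. Three measurements: the mean of the
    two closest values (ties resolved by first pair found; assumption).
    """
    if len(measurements) == 2:
        return mean(measurements)
    if len(measurements) == 3:
        closest = min(combinations(measurements, 2),
                      key=lambda pair: abs(pair[0] - pair[1]))
        return mean(closest)
    raise ValueError("expected two or three BUA measurements")
```

For example, with illustrative readings of 70.0, 82.0 and 80.0 dB/MHz, the first two differ by 12, so a third scan is taken, and the analysis value is the mean of 82.0 and 80.0.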

Bone mineral density of the femoral neck (BMDfn) and total hip (BMDth) was measured by dual-energy x-ray absorptiometry (DXA; Hologic QDR 4500) at the same visit as QUS. Quality assurance measures were incorporated into the DXA protocols to optimize longitudinal measurement precision and comparability between DXA machines at the six clinical sites [11].

All fractures in these analyses occurred after the baseline QUS and BMD measurements and before November 23, 2005. Non-spine fractures were adjudicated centrally by review of a radiographic report or examination of the x-rays. Pathologic fractures were excluded and peri-prosthetic fractures were censored at the time that the fracture occurred. Spine fractures were excluded from these analyses because of the poor reliability of self-report for spine fractures [12].

Statistical analysis

The distributions of all variables were plotted and examined. Means of continuous variables were compared between men who experienced an incident fracture and men who did not using Student’s t tests. For each subject we calculated BMD and QUS T-scores, using the following formula:

$$ \text{T-score} = \frac{\text{Participant's Measurement} - \text{Reference Mean}}{\text{Reference SD}} $$

A QUS T-score for each participant was calculated from male normative data provided by Hologic, and BMD T-scores were calculated from NHANES data [13]. Age- and clinic-adjusted Cox proportional hazards models were used to determine the relative risk (and 95% confidence intervals) of hip and all non-spine fractures per standard deviation reduction of either BUA or BMD and to compare the fracture risk in men with T-scores less than or equal to −2.5, between −2.5 and −1, and greater than −1. Our primary analyses included non-linear scans, and we repeated these analyses excluding the non-linear scans.
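As an illustration of the T-score formula and the three strata, the following Python sketch computes a T-score and assigns it to a category. The function names are hypothetical, and the reference mean and SD in the usage note are placeholders, not the Hologic or NHANES normative values. It assumes a T-score of exactly −1 belongs to the middle stratum, since the top stratum is defined as greater than −1.

```python
def t_score(measurement, reference_mean, reference_sd):
    """T-score = (participant's measurement - reference mean) / reference SD."""
    if reference_sd <= 0:
        raise ValueError("reference SD must be positive")
    return (measurement - reference_mean) / reference_sd


def t_score_stratum(t):
    """Assign a T-score to one of the three strata used in the analysis."""
    if t <= -2.5:
        return "<= -2.5"
    if t <= -1.0:
        # Between -2.5 and -1; a value of exactly -1 falls here (assumption).
        return "-2.5 to -1"
    return "> -1"
```

With a placeholder reference mean of 80 dB/MHz and SD of 16 dB/MHz, a BUA of 40 dB/MHz gives a T-score of −2.5 and falls in the lowest stratum.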

We used a bootstrap procedure [14, 15] to determine whether BMD in a model adjusted for age and clinic was a better predictor of fracture than BUA adjusted for age and clinic. The full study population was sampled (with replacement) 1,000 times. Each of the two models being compared was fit to each bootstrap sample, and the difference between the hazard ratios was calculated. We used the distribution of these observed differences to infer whether the hazard ratios were significantly different (p < 0.05).

We used several techniques to determine the independent contribution of BUA and BMD to fracture risk. First, both BUA and BMD were entered as predictor variables in a single proportional hazards model adjusted for age and clinic. Second, to determine if the combination of BUA plus BMD provided any clinically meaningful improvement in fracture prediction over use of either BUA or BMD alone, we analyzed receiver operating characteristic (ROC) curves [16] and calculated the age- and clinic-adjusted area under the curve (AUC) for BUA alone, BMD alone and the combination of BUA plus BMD. We used a bootstrap approach to test for significant differences between each pair of AUCs. Lastly, to compare the fracture prediction of BUA and BMD in high-risk groups of equal size, we used logistic regression models (hazard models cannot be used for this purpose), adjusted for age and clinic, to determine the predicted probability of hip and any non-spine fracture for each subject using BUA alone, BMD alone, and the linear combination of BUA and BMD. Using these predicted probabilities, we grouped the men into quartiles and plotted the observed rate of fracture in each quartile. If BUA and BMD each independently contribute to the prediction of fracture risk, one would expect the risk among men in the highest quartile of predicted risk using the combination of BUA plus BMD to be substantially higher than that among men in the highest quartile of predicted risk for BUA or BMD alone.
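The bootstrap comparison of two AUCs can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's SAS implementation: the AUC here is unadjusted (the study's was adjusted for age and clinic), the function names are hypothetical, and the interval is a simple percentile interval, which may differ from the inference method in [14, 15]. The pairwise AUC calculation is written for clarity rather than speed.

```python
import random


def auc(scores, events):
    """Probability that a random case scores higher than a random
    non-case, counting ties as one half."""
    cases = [s for s, e in zip(scores, events) if e]
    controls = [s for s, e in zip(scores, events) if not e]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in controls)
    return wins / (len(cases) * len(controls))


def bootstrap_auc_difference(score_a, score_b, events, n_boot=1000, seed=0):
    """Percentile 95% interval for AUC(score_a) - AUC(score_b).

    Participants are resampled with replacement; both AUCs are computed
    on the same resample so the comparison stays paired.
    """
    if not any(events) or all(events):
        raise ValueError("need at least one case and one non-case")
    rng = random.Random(seed)
    n = len(events)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ev = [events[i] for i in idx]
        if not any(ev) or all(ev):
            continue  # skip resamples lacking cases or non-cases
        a = [score_a[i] for i in idx]
        b = [score_b[i] for i in idx]
        diffs.append(auc(a, ev) - auc(b, ev))
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]
```

If the resulting interval excludes zero, the two predictors would be judged to differ in discrimination at roughly the 0.05 level under this percentile approach.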

All analyses were performed with SAS, version 9.1 (Cary, North Carolina).

Results

Baseline characteristics

During an average of 4.2 ± 1.0 years, a total of 282 non-spine fractures were confirmed, including 53 hip fractures, among the 5,607 men with QUS data at baseline. Thirty-nine fractures, including 3 hip fractures, were censored on the basis of excessive trauma, mostly motor vehicle accidents and falls from a significant height. An additional four peri-prosthetic fractures, including one hip fracture, were censored. Men whose fracture had not yet been adjudicated for excessive trauma were excluded from the analysis (26 for the non-spine fracture outcome and 1 for the hip fracture outcome). Thus, this report includes 239 non-spine fractures among 5,581 men and 49 hip fractures (32 femoral neck, 14 intertrochanteric, and three subtrochanteric) that occurred among 5,606 MrOS participants.

Men with fracture were older, thinner, and had lower BMD and QUS measurements compared to men without fracture (Table 1). The correlations between QUS measurements (BUA, SOS, and QUI) were very high (r = 0.91–0.99), while the correlations between QUS measurements and hip BMD were modest (r = 0.38–0.44). As results were similar regardless of the QUS parameter (see Table 2), only BUA and not SOS or QUI results are shown. At least one BUA measurement was non-linear in 16% of men, and all three BUA measurements were non-linear in 7%.

Table 1 Baseline characteristics of study population
Table 2 QUS, BMD, and risk of any non-spine and hip fractures

BUA, BMD and fracture risk

Both low BUA and low hip BMD measurements were similarly associated with an increased risk for hip and non-spine fracture (Table 2). For example, after adjusting for age and clinic, each SD reduction in BUA was associated with a doubling of the risk for subsequent hip fracture (RH = 2.0, CI: 1.5, 2.8), and a 60% increase in the risk of any non-spine fracture (RH = 1.6, CI: 1.4, 1.8). The relationship between hip BMD and non-spine fracture was similar to that observed with BUA, with a 60% increase in risk for each SD reduction in femoral neck or total hip BMD. Conversely, compared to BUA the relationship between total hip BMD and hip fractures (RH = 2.9, CI: 2.2, 4.0) was significantly stronger (p = 0.03). The association between femoral neck BMD and hip fracture (RH = 3.5, CI: 2.5, 4.9) was also greater than that observed for BUA and was significant as well (p = 0.02).

The proportion of men above or below various BUA T-score cutpoints and their corresponding fracture risk are shown in Table 3. We further compared these BUA T-score cutpoints to similar cutpoints using BMDfn based on NHANES male normative data. As previously described for women, compared to BMDfn there were fewer men with BUA T-scores between −1 and −2.5 (20.3% vs. 52.7%), and fewer men with T-scores below −2.5 (1.3% vs. 4.9%). The risk of non-spine fracture was similar among men with BUA and BMDfn T-scores between −1 and −2.5, and was also similar among those with BUA and BMDfn T-scores less than −2.5. Conversely, despite the greater number of men with BMDfn T-scores less than −2.5, the risk of hip fracture was greater than that observed for men within the same BUA T-score strata. For example, among the 73 men with BUA T-scores below −2.5, the risk of hip fracture was nearly 9-fold greater than among those with T-scores greater than −1, but among the 273 men with BMDfn T-scores below −2.5, the risk of hip fracture was 18-fold higher (Table 3).

Table 3 BUA and BMD T-scores and risk of any non-spine and hip fracture

Independent effects of low QUS and BMD, and utility of combined measurements

We examined the utility of combined measurements of BUA and BMD three ways. First, after adjustment for either BMDfn or BMDth, low BUA measurements remained associated with an increased risk of non-spine fracture (Table 4). Thus, even after adjustment for BMDfn or BMDth, each SD reduction in BUA was associated with a 30–40% increase in the risk of non-spine fracture. BUA was not associated with the risk of hip fracture after adjustment for BMDfn or BMDth, while both BMDfn and BMDth remained associated with hip fracture risk even after adjustment for BUA.

Table 4 Independent associations between BUA, BMD and fracture outcomes

To evaluate the combined utility of BUA and BMD measurements for the prediction of non-spine and hip fracture, we first performed ROC analyses to determine the AUC for BUA alone, BMDfn alone or the combination of BUA+BMDfn. As expected from the relative hazard per SD, the AUCs for BUA and BMDfn were identical for non-spine fracture (AUC = 0.68, p = 0.90), and the combination of BUA+BMDfn improved fracture discrimination minimally (AUC = 0.69, p = 0.03 compared to BUA alone and p = 0.02 compared to BMDfn alone). Although the relative hazards of hip fracture per SD change in BUA and BMDfn were moderately different, the AUCs for BUA alone and BMDfn alone were similar (0.84 vs. 0.85, p = 0.38), as was the AUC for the combination of BUA+BMDfn (0.85), and none of the differences were statistically significant (p = 0.3–0.6).

To further assess the utility of the combined measurements of BUA and BMDfn, we grouped men into quartiles based on the predicted risk for fracture from measurements of BUA alone, BMDfn alone, and the combination of BUA and BMDfn, and then examined the observed rate of fracture within each of these quartiles (Figs. 1 and 2). The rates of hip and any non-spine fracture were elevated primarily among men in the lowest quartile of BUA or BMDfn. Among the men in the highest quartile of risk using BUA alone, 127 men suffered non-spine fracture and 38 experienced hip fracture, compared to 123 non-spine and 42 hip fractures among the men in the highest quartile of risk using BMDfn alone. Lastly, among the quartile of men with the highest risk using both BUA and BMDfn, 128 suffered non-spine and 42 suffered hip fractures.
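The quartile comparison described above can be sketched in Python as follows. This is a hedged illustration, not the study's code: the function name is hypothetical, the predicted risks are assumed to come from a previously fitted model, and the observed rate is taken as the simple proportion of men with a fracture in each quartile, which may differ from the rate definition used for the figures.

```python
def observed_rate_by_quartile(predicted_risk, events):
    """Group participants into quartiles of predicted risk and report,
    for each quartile from lowest to highest risk, a tuple of
    (group size, number of fractures, observed proportion)."""
    n = len(predicted_risk)
    if n < 4 or n != len(events):
        raise ValueError("need matching inputs with at least four participants")
    # Participant indices ordered from lowest to highest predicted risk.
    order = sorted(range(n), key=lambda i: predicted_risk[i])
    summary = []
    for q in range(4):
        members = order[q * n // 4:(q + 1) * n // 4]
        fractures = sum(events[i] for i in members)
        summary.append((len(members), fractures, fractures / len(members)))
    return summary
```

Running this once for each predictor (BUA alone, BMDfn alone, and the combination) and comparing the fracture counts in the last tuple mirrors the comparison of highest-risk quartiles reported here.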

Fig. 1 Rate of non-spine fracture predicted by measurement of BUA alone, BMDfn alone, and the combination of BUA and BMDfn. Number of fractures in highest quartile indicated. Adjusted for age and clinic; total number of fractures was 239

Fig. 2 Rate of hip fracture predicted by measurement of BUA alone, BMDfn alone, and the combination of BUA and BMDfn. Number of fractures in highest quartile indicated. Adjusted for age and clinic; total number of fractures was 49

Analyses excluding non-linear BUA measurements

We repeated our analyses excluding QUS measurements identified as being non-linear, but the results did not change (data not shown). For example, after adjusting for age and clinic and excluding the 16% of BUA measurements that were non-linear, the hip fracture hazard ratio per SD reduction in BUA was 1.8 (CI: 1.4, 2.4), compared to a hazard ratio of 2.0 (CI: 1.5, 2.8) when all BUA measurements were included.

Discussion

In this large prospective study of fracture outcomes among older men with baseline measurements of QUS and hip BMD, we found that both techniques predicted the risk of non-spine and hip fracture. The strength of these associations was similar for non-spine fracture, suggesting either technique may be used to assess the risk of non-spine fracture. For hip fracture, we again found that both techniques were associated with the risk of fracture, with a modestly larger relative risk for hip BMD. Although both BUA and hip BMD were independent predictors of non-spine fracture, we found no evidence that the combination of BUA+BMD was superior to BUA or hip BMD alone.

These findings, particularly the similar number of fractures predicted to occur in men in the lowest quartile of BUA or BMDfn measurements, and the equivalent AUC for non-spine and hip fracture, suggest that the overall ability of QUS to predict fracture is, or could be, similar to that of hip BMD. Conversely, as previously noted [17], our analyses suggest that Sahara QUS T-scores are not equivalent to T-scores obtained with hip DXA, and different QUS T-score cutpoints should be evaluated. From a practical standpoint, as in women, DXA, when available, should remain the gold standard for the assessment of fracture risk in men for several reasons: validation in several populations, published normative data, and well-defined and widely available quality control procedures to ensure accuracy and reliability. Furthermore, thus far all trials of effective pharmacologic treatments in men have selected subjects on the basis of low BMD [18–20], not low QUS.

Our analyses do not suggest that the combination of QUS plus BMD is more useful than either QUS or BMD alone. Results using ROC analyses and predicted fracture probability were consistent and did not support combined measurements. This implies the identification of a subset of men with both low QUS and low BMD is not superior to identification of a similar sized subset using either BMD or QUS alone.

Several other aspects of this study deserve further mention. First, all QUS measurements (BUA, SOS and QUI) were highly correlated and, as expected, were similarly associated with fracture risk. Second, we unexpectedly found that exclusion of non-linear measurements in our study did not improve the ability of BUA to predict fracture. Further evaluation of the identification and implications of non-linear BUA measurements is needed.

Several other studies have examined the relationship between QUS and fracture in men, but most have been retrospective [21–23]. One prospective study measured calcaneal BUA using a CUBA device in 6,471 older men enrolled in the EPIC-Norfolk prospective population study [24]. Although low BUA was associated with an increased risk of non-spine fracture (RH = 1.9, CI: 1.2, 2.9), only 33 fractures were recorded. Another prospective study, conducted in Japan [25], obtained calcaneal QUS with a Lunar Achilles device in 1,004 men. During an average follow-up of 5 years, 50 non-spine fractures were recorded, and each SD reduction in BUA was associated with a statistically significant twofold increase in fracture risk. BMD was not measured in the Japanese study.

The strengths of our study include the prospective design, the large number of subjects and non-spine fractures, and the careful attention to both QUS measurements and fracture ascertainment. We specifically attempted to reproduce our previous analyses of QUS and fracture in older women [3], and used the same measurement protocol. However, we had too few hip fractures to analyze femoral neck and intertrochanteric fractures separately. BUA and T-score cutpoints are known to differ among different QUS devices [26], and our results may not apply to other QUS instruments. Further, our results may not apply to incident vertebral fractures, which will be analyzed when follow-up radiographs are available, or to non-white or younger men.

In conclusion, in this large prospective study of QUS, BMD and fracture risk in older, independently living men, we found that men with low BUA, particularly those in the lowest quartile, had substantial increases in the risk of hip and non-spine fracture. The risk gradient for BUA was similar to that observed for hip BMD, and we found no evidence that combined measurements of BUA and BMD were superior to either one alone. Before QUS can be recommended in place of BMD, future studies should evaluate the cost-effectiveness of QUS measurements and determine the effectiveness of treatments among men selected on the basis of low QUS.