Introduction

Osteoporosis disproportionately affects older women [1], with four out of ten older white women suffering a fracture after age 50 [2]. It contributes to over 300,000 hip fractures in the U.S. annually [3]. The morbidity, mortality, and health-care costs associated with osteoporosis are significant. Bone fragility is likely the result of multiple additive factors, including abnormalities of bone modeling, remodeling, and changes in hormonal milieu, as well as other risk factors [4]. To meet the Healthy People 2010 objectives to reduce by 20% the proportion of adults with osteoporosis [5], additional understanding of the factors that contribute to excess fracture risk is a national priority.

Areal bone mineral density (aBMD, in grams per square centimeter), as measured by dual energy X-ray absorptiometry (DXA), is the current gold standard for clinical assessment of bone fragility. Women with low hip aBMD (T-scores < −2.5) are approximately two to three times more likely to experience a hip fracture than women with higher aBMD [6]. However, fracture occurs when stresses from applied loads exceed the stress capacity of bone tissue. Loss of strength implies that tissue stress capacity is diminished or that geometry is altered so that stresses increase. Note that DXA measures only the mineral component of bone tissue, and a DXA aBMD measurement quantifies the average thickness of the mineral in the region. Despite its unquestioned usefulness, aBMD does not actually describe either a tissue strength property or a specific geometric configuration, so its mechanical interpretation is not obvious. It is well known, however, that osteoporosis mainly alters the amount of bone tissue and its distribution within bones; these changes are intrinsically geometric. Fragility should, therefore, be evident in the geometry even if that geometry can only be crudely measured by current DXA methods.

Although one must be cautious about methodological limits of measuring geometry from two-dimensional DXA scans, recent prospective studies of large epidemiologic cohorts have shown that certain geometric properties, particularly buckling ratio and cortical thickness, predict incident hip fracture as well as conventional aBMD at the femoral neck [7, 8]. The general approach taken in these studies is to examine multiple individual geometric properties in comparison to aBMD to see if any are equivalent or perhaps even superior to the current gold standard in predicting incident hip fracture. The objectives of this study were to determine: (1) which individual hip geometry measurements predicted incident hip fracture in a race/ethnicity and age diverse cohort of US women, how the magnitude of risk compared to that for aBMD, and whether any of these parameters were independent of aBMD in predicting risk; and (2) whether highly correlated hip geometry parameters could be summarized using principal components analysis into factors that might better predict hip fracture than any single parameter or than aBMD.

Methods

Study population

The Women's Health Initiative Program (WHI) enrolled a total of 161,808 postmenopausal women into one or more of the WHI clinical trials (hormone therapy, dietary modification, and/or calcium and vitamin D supplementation) or the WHI observational study. Details regarding the inclusion and exclusion criteria, recruitment procedures, participant characteristics, follow-up, and outcomes ascertainment can be found in the published reports [9–11]. Briefly, US women ages 50–79 years old, postmenopausal, and not likely to change residence or die within 3 years at the time of enrollment were recruited from 40 clinical centers nationwide between 1993 and 1998. Women were not selected on the basis of their bone density or osteoporosis status. The study protocol and consent forms were approved by the institutional review boards for all participating institutions.

Dual energy X-ray absorptiometry scans and hip structural analysis

Women enrolled at three clinical centers (Tucson/Phoenix, Pittsburgh, Birmingham) had DXA scans at the hip, anteroposterior–lateral spine, and total body. Standard protocols for positioning and analysis were used by technicians trained and certified by the DXA manufacturer and by the WHI DXA coordination center at the University of California at San Francisco. The ongoing quality assurance program monitored spine and hip phantom scans, reviewed a random sample of all subject scans, and flagged those with specific problems. Hardware and software changes were tracked with in vitro and in vivo cross-calibrations and by scans of calibration phantoms across instruments and clinical sites. A baseline hip scan was available for 10,290 women. Conventional femoral neck aBMD was obtained from the Hologic DXA program as usual.

Hip structural analysis (HSA) was conducted on archived scans in Dr. Beck's lab at the Johns Hopkins University. A separate cross-calibration was conducted on all the WHI DXA sites using a special phantom provided by Dr. Beck. The geometric strength of an object is typically evaluated using measurements of the load supporting surface of cross sections at sites where fractures are likely. The HSA software derives geometry of the load supporting surface by employing a projection principle first described by Martin and Burr [12].

The HSA program computes geometry from five parallel lines 1 pixel (~1 mm) apart traversing the bone axis at each of three femur cross sections; the five values are then averaged. The analysis sites are: the narrow neck (NN), across the femur neck at its narrowest point; the shaft (S), across the shaft at a distance of 1.5 times minimum neck width distal to the intersection of the neck and shaft axes; and the intertrochanter (IT), along the bisector of the angle produced by the neck and shaft axes. For each region, the HSA program computed the following variables used in this analysis: (1) HSA-derived areal bone mineral density, HSA aBMD (gram per square centimeter); (2) bone cross-sectional area, CSA (square centimeter), an index of resistance to axial forces; (3) outer diameter (centimeter); (4) section modulus (cubic centimeter), an index of strength in bending; (5) estimated average cortical thickness; and (6) buckling ratio, an index of susceptibility to local cortical buckling under compressive loads. Cortical thickness for the buckling ratio can only be crudely estimated from DXA data using assumptions of shape and of the proportion of measured bone in the cortex, but this parameter has been shown to provide a mechanical explanation for the predictive value of low BMD in elderly bones [8].
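The projection principle can be illustrated with a minimal sketch. It is not the HSA program itself: it assumes a single mineral-mass profile already trimmed to the bone edges, and an effective density of fully mineralized bone of 1.05 g/cm^3, a value that is an assumption here and is not stated in the text. The function name `section_geometry` and the solid-rod check are hypothetical and serve only to show that cross-sectional area and section modulus follow from the profile.

```python
import math

RHO_M = 1.05  # g/cm^3; ASSUMED effective density of fully mineralized bone


def section_geometry(profile, spacing_cm):
    """Cross-section geometry from one mineral-mass profile.

    profile: mineral mass per pixel (g/cm^2) across the bone, edge to edge.
    spacing_cm: pixel spacing along the profile (cm).
    """
    n = len(profile)
    # Pixel-centre positions measured from the first bone edge.
    y = [(i + 0.5) * spacing_cm for i in range(n)]
    total = sum(profile)
    width = n * spacing_cm                      # outer diameter (cm)
    csa = spacing_cm * total / RHO_M            # bone cross-sectional area (cm^2)
    centroid = sum(p * yi for p, yi in zip(profile, y)) / total
    # Cross-sectional moment of inertia about the centroid (cm^4).
    csmi = spacing_cm * sum(
        p * (yi - centroid) ** 2 for p, yi in zip(profile, y)
    ) / RHO_M
    d_max = max(centroid, width - centroid)     # centroid to the farther edge
    return {
        "width": width,
        "csa": csa,
        "csmi": csmi,
        "section_modulus": csmi / d_max,        # cm^3
    }


# Sanity check on a hypothetical solid rod of fully mineralized bone,
# whose projected thickness at offset y is 2*sqrt(R^2 - y^2).
radius, spacing = 1.5, 0.01
n_pixels = round(2 * radius / spacing)
rod = [
    RHO_M * 2 * math.sqrt(max(radius**2 - ((i + 0.5) * spacing - radius) ** 2, 0.0))
    for i in range(n_pixels)
]
geom = section_geometry(rod, spacing)
```

For the rod, the recovered area and section modulus should approach the closed-form values πR² and πR³/4, which is a convenient way to check any implementation of the projection idea before applying it to scan data.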

Data collection for other covariates

Questionnaires were used at baseline and follow-up to collect information on age, race/ethnicity, smoking, self-rated health, use of estrogen and/or progesterone therapy, personal history of fracture (any fracture and those occurring after age 55), parental hip fracture after age 40, and self-reported physician diagnosis of diabetes. Hormone therapy was categorized as current, past, or never. Women randomized to active hormone therapy were considered current users. At the screening clinic visit, medication inventories were conducted by direct inspection of prescription and over-the-counter medications taken in the past 2 weeks. Medication names and durations were entered into the Medispan database, from which current use of corticosteroids, insulin, and oral hypoglycemic agents was ascertained. Too few women were taking bisphosphonates or selective estrogen receptor modulators at baseline to permit analysis. Diabetes was defined based on self-report and categorized according to use of insulin. Weight was measured to the nearest 0.1 kg on a balance beam scale with the participant dressed in indoor clothing without shoes. Height was measured to the nearest 0.1 cm using a wall-mounted stadiometer. Body mass index was calculated as weight (kilograms) divided by height squared (square meters). Lean body mass was obtained from the baseline whole body DXA scans.

Outcome ascertainment

Women were asked to report the occurrence of any hospitalization and whether they had been diagnosed with a wide variety of outcomes including clinical fractures of any type. In the WHI clinical trials, these contacts occurred during semiannual clinic visits, whereas in the WHI observational study women were contacted annually by mail and/or telephone. All reported clinical fractures other than those of the ribs, chest/sternum, skull/face, fingers, toes, and cervical vertebrae were verified by review of radiology, magnetic resonance imaging, or operative reports by centrally trained physician adjudicators at each of the BMD clinics. For fracture sites other than hip, the local clinic physician-adjudicated fractures were used. Final adjudication of hip fractures was performed centrally by blinded WHI physician adjudicators. The agreement between central and local adjudication for hip fracture was 94%. Detailed outcome definitions and methods for ascertaining, documenting, and classifying outcomes have been published [10]. Follow-up time ranged up to 11 years per participant as of September 2005 with an average of 8–9 years. At that time, 5–6% of WHI participants had been lost to follow-up, and 6–7% had died in the WHI clinical trials and observational study overall. The average length of follow-up was 8 years.

Statistical methods

Baseline characteristics of women who experienced an incident hip fracture or other nonhip clinical fracture were compared to women who remained fracture free during the follow-up using chi-square or t tests. Baseline differences in HSA parameters were compared by calculating the percent differences between women with incident hip fracture and those without any clinical fracture after adjusting for age, height, weight, and percent lean body mass.

To determine whether any data reduction was possible among the HSA parameters, we first examined the intercorrelations between the 15 HSA variables and their correlations with aBMD. Principal components analysis was used to extract factors from the 15 variables. Varimax rotation was used to determine factor loadings on uncorrelated factors. For each extracted factor, time to first adjudicated incident hip fracture was assessed using Kaplan–Meier survival curves.
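The data-reduction step can be sketched as follows. This is an illustration only, not the Stata code used in the study: it runs principal components analysis on the correlation matrix of synthetic data and applies a standard varimax rotation. The function names, the synthetic data, and the choice of two factors are assumptions made for the example.

```python
import numpy as np


def pca_loadings(data, n_factors):
    """Unrotated loadings from PCA of the correlation matrix."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_factors]   # largest eigenvalues first
    return eigvecs[:, order] * np.sqrt(eigvals[order]), eigvals[order]


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt                            # nearest orthogonal matrix
        if s.sum() < criterion * (1 + tol):          # criterion stopped improving
            break
        criterion = s.sum()
    return loadings @ rotation


# Synthetic stand-in for correlated geometry variables (NOT study data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 6))
data = latent @ mixing + 0.5 * rng.normal(size=(500, 6))

loadings, eigvals = pca_loadings(data, n_factors=2)
rotated = varimax(loadings)
explained = eigvals.sum() / data.shape[1]   # share of total variance retained
```

Because the rotation is orthogonal, each variable's communality (the row sum of squared loadings) is unchanged; only the distribution of loadings across factors shifts, which is what makes rotated factors easier to interpret.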

Cox proportional hazards models were used to compute adjusted hazard ratios (HRs) for hip fracture. Women contributed follow-up time until the date of hip fracture, death, or loss-to-follow-up, whichever came first. Separate models were constructed for each of the 15 HSA parameters. Since the factors derived from principal components analysis were uncorrelated by definition, models included all extracted factors simultaneously. HRs were calculated to reflect a standard deviation difference in each structural geometry parameter or the extracted factor from principal component analysis. In the first set of models (model A), HRs were adjusted for age, race/ethnicity, height, weight, total body percent lean mass, and clinical trial. In model B, clinical risk factor variables were added to model A, including smoking, hormone use, corticosteroid use, general health, physical activity, fracture history, fracture at or after age 55, parental hip fracture after age 40, and diabetes. These covariates were selected based on previous studies on clinical risk factors for hip fractures [13]. Finally, in model C, aBMD was added to the model B covariates to assess the relationship between fractures and HSA measurements independent of aBMD. All analyses were conducted using STATA 10.1.
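Expressing an HR per standard deviation amounts to rescaling the Cox coefficient. Under the proportional hazards model the log hazard is linear in the covariate, so a coefficient estimated per raw unit converts to a per-SD hazard ratio as exp(beta × SD). The short sketch below shows this; the helper name and the numeric values are hypothetical and are not taken from the study.

```python
import math


def hr_per_sd(beta_per_unit, sd):
    """Hazard ratio for a one-SD increase, given a Cox coefficient per raw unit."""
    return math.exp(beta_per_unit * sd)


# Hypothetical inputs for illustration only (not study estimates).
beta = 0.4   # log hazard per cm of a geometry parameter
sd = 0.5     # standard deviation of that parameter, in cm
hr = hr_per_sd(beta, sd)
```

Standardizing each parameter before fitting gives the same result directly and puts parameters with different units on a common scale for comparison.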

Results

Among the 10,290 postmenopausal women with baseline BMD and HSA measurements, 8,843 remained free of fracture during follow-up, 147 fractured their hip, and 1,300 had other clinical fractures. Women who had incident hip fracture were significantly older, weighed less, and had lower total body, spine, and hip bone density as compared to women who remained fracture free during follow-up (Table 1). Women who developed other clinical fractures were also significantly different on these parameters but had intermediate values as compared to women who remained fracture free and those who later fractured their hip. Caucasian race, parental history of hip fracture, personal history of fracture (ever or after age 55), and steroid use also consistently differentiated women with hip or other fracture from those who remained free of fracture (Table 1).

Table 1 Baseline characteristics of study participants by fracture (mean ± standard deviation (SD) or n (percentage))

Many of the hip structural geometry measurements were highly correlated with conventional femur neck aBMD (Table 2). The highest correlations were observed for cross-sectional area (r = 0.87), average cortical thickness (r = 0.90), section modulus (r = 0.72), and buckling ratio (r = –0.79) at the narrow neck. The outer diameter width measurements were uncorrelated with aBMD. This pattern was generally consistent across regions, although correlations of femoral neck aBMD with HSA measurements made at the intertrochanter and shaft were lower in magnitude than those with measurements at the overlapping narrow neck region. Many of the HSA parameters were also highly correlated with each other, supporting the examination of one or more summary factors.

Table 2 Correlation matrix of parameters from DXA measurements

Principal components analysis based on this correlation matrix yielded two uncorrelated factors. Factor 1 had high factor loadings from cross-sectional area, section modulus, buckling ratio, and cortical thickness in all three regions (eigenvalue = 8.69) and was highly correlated with aBMD (r = 0.85). Factor 2 had high factor loadings from outer diameter measurements as well as section modulus from all three regions (eigenvalue = 3.31; Table 3). The two factors accounted for 80% of the variance in the 15 hip structural geometry measurements included in the principal components analysis.
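The reported proportion of variance can be checked by simple arithmetic. Because the analysis was based on a correlation matrix, each of the 15 variables contributes one unit of variance, so the total is 15 and each eigenvalue divided by 15 gives that factor's share. The variable names below are illustrative.

```python
eigenvalues = [8.69, 3.31]   # reported eigenvalues for factors 1 and 2
n_variables = 15             # standardized variables, so total variance is 15

share = [ev / n_variables for ev in eigenvalues]   # about 0.58 and 0.22
total_share = sum(eigenvalues) / n_variables       # about 0.80
```

Factor 1 therefore carries most of the shared variance, with factor 2 contributing roughly a further fifth.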

Table 3 Factor loading matrix from principal component analysis

Women who had incident hip fracture had lower baseline values of aBMD, bone cross-sectional area, and section modulus, wider diameters, and higher buckling ratios (p values < 0.001; Fig. 1). Effects were similar across regions but larger at the proximal regions. Survival curves showing time to hip fracture by tertile indicate substantially greater risk of hip fracture among women in the lowest tertile of femoral neck aBMD and an almost identical pattern of increased risk among women in the lowest tertile of factor 1 (Fig. 2a, b) as compared to women in the high or medium tertiles of these two variables. There was also a linear relationship between tertile of factor 2 and time to hip fracture (Fig. 2c).

Fig. 1 Baseline differences in HSA parameters comparing women with incident hip fracture to women with no clinical fractures during follow-up (hip fractures: n = 147, no clinical fractures: n = 8,843)

Fig. 2 a The Kaplan–Meier survival estimates of hip fracture by the areal femoral neck BMD. b The Kaplan–Meier survival estimates of hip fracture by the extracted factor 1. c The Kaplan–Meier survival estimates of hip fracture by the extracted factor 2

Adjusted hazard ratios for the corresponding parameters are shown with and without adjustment for femoral neck aBMD and other clinical risk factors in Table 4. In minimally adjusted models (model A) accounting only for age, ethnicity, weight, height, percent lean mass, and clinical trial participation, statistically significant associations were observed between all 15 of the individual HSA parameters and incident hip fracture. Higher levels of cross-sectional area, section modulus, and average cortical thickness were associated with decreased risk of hip fracture, whereas outer diameter width and average buckling ratio were associated with increased risk of hip fracture (all 95% confidence intervals excluded 1; Table 4). Most associations persisted after adding clinical risk factors in model B; however, the associations for section modulus at the intertrochanter and shaft weakened in these adjusted models, and their 95% confidence intervals included 1. After adding aBMD in model C, intertrochanter and shaft outer diameter measurements remained independent predictors of hip fracture with hazard ratios for a one standard deviation increase of 1.61 (95% confidence interval (CI), 1.25–2.08) for the intertrochanter and 1.36 (95% CI, 1.06–1.76) for the shaft. There was no independent association of outer diameter at the narrow neck with incident hip fracture (HR = 1.13; 95% CI, 0.90–1.41). The individual measurement of outer diameter at the intertrochanter predicted hip fracture slightly better than factor 2, which had high loadings from outer diameter measurements at all three regions (HR = 1.57; 95% CI, 1.17–2.11). Average buckling ratios at the intertrochanter and shaft were also independently associated with incident hip fracture with hazard ratios of 1.43 (95% CI, 1.10–1.87) at the intertrochanter and 1.24 (95% CI, 1.00–1.55) at the shaft. As expected, factor 1, which correlated highly with aBMD (r = 0.85), was not associated with incident hip fracture after adjusting for femoral neck aBMD. Other HSA variables that were highly correlated with factor 1 did not independently predict hip fracture in model C, and some had hazard ratios that reversed direction owing to the very high correlations with aBMD. Results were similar when adjustment for total hip aBMD was used instead of femoral neck aBMD.
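The strength of the reported model C associations can be gauged with a small worked example that recovers an approximate standard error and z statistic from a hazard ratio and its 95% CI. The sketch assumes the intervals are Wald intervals, symmetric on the log scale; that assumption is not stated in the text, and the helper name is hypothetical.

```python
import math


def wald_from_hr(hr, lower, upper, z_crit=1.96):
    """Approximate log-HR, its standard error, and z from an HR and 95% CI.

    Assumes a Wald interval: log(HR) +/- z_crit * SE.
    """
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return log_hr, se, log_hr / se


# Model C outer diameter results as reported in the text.
reported = {
    "intertrochanter outer diameter": (1.61, 1.25, 2.08),
    "shaft outer diameter": (1.36, 1.06, 1.76),
    "narrow neck outer diameter": (1.13, 0.90, 1.41),
}
z_scores = {name: wald_from_hr(*vals)[2] for name, vals in reported.items()}
```

Under this approximation the intertrochanter and shaft diameters give z statistics above 1.96 while the narrow neck diameter does not, which matches the pattern of confidence intervals that exclude or include 1.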

Table 4 Hazard ratios of hip fracture by the standardized HSA parameters and the extracted factors from principal component analysis

Results were also similar when these modeling steps were repeated for the 79 femoral neck fractures and the 60 intertrochanteric fractures separately. Outer diameter at the intertrochanter had a hazard ratio of 1.67 (95% CI, 1.18–2.35) for femoral neck fracture and 1.47 (95% CI, 1.00–2.16) for intertrochanteric fracture.

Discussion

In this prospective study of 10,290 women followed for up to 11 years, measurements of femur outer diameter and average buckling ratio were significantly and independently associated with increased risk of hip fracture after adjustment for body size, race/ethnicity, clinical risk factors, and aBMD. Two factors were found to summarize 80% of the variance in the 15 individual HSA parameters studied; however, factor 1, which was highly correlated with femoral neck aBMD, was not a better predictor of incident fracture than the conventional measure. Factor 2, which was related to bone girth (outer diameter at three regions), was independently associated with incident hip fracture, but intertrochanter outer diameter was as good a predictor as the summary measure. Importantly, intertrochanter outer diameter was independently associated with a 61% increased risk of hip fracture for each standard deviation increase in value, suggesting that this parameter could contribute importantly to prediction of future hip fracture after accounting for aBMD.

This is the third large cohort study to examine HSA parameters derived from DXA in relation to future risk of hip fracture in older adults. Among 2,740 women in the Rotterdam study, increased buckling ratio and bone size were also associated with incident hip fracture (n = 106) [8]. However, in that study, the observed associations were not adjusted for body size (height, weight) or clinical risk factors. Moreover, none appeared to have predictive value greater than aBMD, and associations between HSA measurements and hip fracture were not evaluated after adjustment for aBMD. Among 7,474 women in the Study of Osteoporotic Fractures, Kaptoge et al. [7] found that bone outer diameter and buckling ratio were associated with increased risks of hip fracture after adjustment for aBMD. However, that study did not determine whether either HSA parameter contributed to hip fracture prediction after accounting for body size (height, weight) and clinical risk factors in addition to aBMD.

Although outer diameter appeared to be statistically independent of aBMD, the (conventional) DXA scanner software fixes the region of interest length along the neck so that an expansion of outer diameter increases the size of the region area. Note that aBMD is mathematically equivalent to bone mineral content (BMC) divided by region area, so that for the same BMC a larger diameter would have an inverse effect on aBMD. In reality, a wider diameter should help to explain the predictive ability of aBMD. It is important to realize that increasing diameters are a hallmark of aging bones [7, 8, 14–22]; apparently the process serves to preserve the section modulus [17] in the presence of net bone loss because a larger diameter tube requires less material to achieve a given section modulus. Engineers commonly use wider diameter, thinner walled tubes to produce lightweight structures, but they take care to ensure that tube walls do not become so thin that they buckle under compressive loads. Buckling can mean that the strength is less than one would predict from the section modulus. Nature seems to preserve the section modulus of the aging femoral neck in a way that would make buckling unlikely, but only if the femur is loaded in a physiologic manner and not under unaccustomed loading conditions. In an upright stance, most of the stress in the femoral neck is borne on the well-preserved inferior medial cortex, while the relatively unloaded superior lateral cortex generally gets thinner with age [17]. The unaccustomed loading conditions of a fall on the hip are very different from those of stance. Thus, it is not surprising that femur cross sections do not adapt to that condition. In a fall, the femur bends in the opposite direction, concentrating high compressive stresses on the thinned superior lateral cortex. This thinned cortex may buckle under smaller loading forces than would be predicted by the section modulus. This is suggested by results of this study as well as the Rotterdam study and the Study of Osteoporotic Fractures [7, 8], where larger diameters had a negative rather than a positive association with strength, increasing fragility and risk of hip fracture. This pattern seems to negate the postulated benefit from increased section modulus with greater diameters, suggesting that the failure mechanism includes local buckling. Indeed, section moduli were less predictive of hip fracture in all three studies than either aBMD or buckling ratio.
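The engineering trade-off described above can be made concrete with a simplified model. The sketch below treats the bone cross section as an idealized hollow circular tube, which is an assumption and not the HSA model, and takes the buckling ratio as outer radius divided by wall thickness. It holds the section modulus fixed while the outer diameter grows; the numeric values are hypothetical and are not taken from the study.

```python
import math


def tube_for_section_modulus(outer_d, z_target):
    """Hollow circular tube of outer diameter outer_d (cm) sized to z_target (cm^3).

    Uses Z = pi * (D^4 - d^4) / (32 * D) and solves for the inner diameter d.
    """
    inner_d4 = outer_d**4 - 32 * outer_d * z_target / math.pi
    if inner_d4 < 0:
        raise ValueError("outer diameter too small for this section modulus")
    inner_d = inner_d4 ** 0.25
    wall = (outer_d - inner_d) / 2
    area = math.pi * (outer_d**2 - inner_d**2) / 4   # material required (cm^2)
    return {
        "wall": wall,
        "area": area,
        "buckling_ratio": (outer_d / 2) / wall,
    }


# Hypothetical diameters (cm) at a fixed, hypothetical section modulus of 1.0 cm^3.
tubes = [tube_for_section_modulus(d, z_target=1.0) for d in (3.0, 3.2, 3.4)]
```

In this idealized case, widening the tube lowers the material needed for the same section modulus but thins the wall and raises the buckling ratio, which is the pattern proposed to explain why wider femurs with preserved section modulus can still be more fragile in a fall.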

Strengths of this study include the large sample size, prospective follow-up with minimal loss to follow-up, adjudication of hip and other clinical fractures, and availability of a large number of clinical covariates. This report is unique in considering summary measures of highly correlated HSA parameters to determine if uncorrelated factors could better predict incident hip fracture than aBMD. In addition, we identified which hip structural parameters and summary factors were independent predictors of incident hip fracture after adjustment for aBMD. This study is limited by the relatively small number of women who suffered an incident hip fracture indicating that the study population has experienced low rates of hip fracture overall. As noted in previous reports, WHI participants were, on average, relatively heavy, healthy, calcium replete, and about one third were taking hormone therapy at baseline. However, the results observed in this cohort are consistent with those seen in cohorts at higher risk in Rotterdam and SOF.

Femurs of women who fracture differ geometrically from those who do not. Although we did not find that combining geometric measurements in a principal components analysis improved the magnitude of fracture predictions over single HSA measurements, the real value of this research may be in guiding the direction of technological improvements in DXA scanners. Our findings reinforce the idea that the predictive ability of DXA data is strongly influenced by dimensional (geometric) parameters, but the human proximal femur is a complex three-dimensional structure. While HSA works well enough to demonstrate that geometry is important, a single-projection low-resolution DXA image was not designed for measuring dimensional effects especially when they are subtle. The average difference in femur outer diameters between fracture cases and women without fracture is only a fraction of a millimeter. It will be challenging to devise DXA methods that have sufficient spatial resolution to reliably detect submillimeter dimensional changes. One must also be able to reliably position the femur so that such small effects can be distinguished from differences in projected dimensions from inconsistent femur positioning. Finally, one will need to devise appropriate body scaling methods to reliably distinguish variations in bone girth from those due to differences in skeletal size.

As a final comment, the present study used data from aBMD regions to determine whether geometry measured in differently defined regions adds to predictive value. The pixel values averaged for BMD at the conventional femoral neck region probably overlap or are proximal to those used in the HSA narrow neck region but are not common to those used in the intertrochanter and shaft regions. Our analysis addresses the clinically relevant question of whether geometry adds to predictive ability of aBMD but not what information from a given region (including BMD) provides the best predictive ability.

We conclude that hip geometry parameters, particularly intertrochanter diameter and buckling ratio, predict incident hip fracture after accounting for clinical risk factors and conventional bone density. The totality of the evidence from prospective studies supports inclusion of these parameters as risk factors for hip fracture. Future development of three-dimensional technologies that improve the precision of measuring these parameters could have promise in improving the identification of older women most likely to have a hip fracture.