Introduction

Hip fracture is the most serious consequence of osteoporosis because it incurs major healthcare costs and, like other osteoporotic fractures, is associated with increased risk of mortality and reduced quality of life [1]. Several risk factors, including low bone mineral density (BMD), advancing age and a history of fracture, have been shown to be associated with hip fracture risk [2–8]. In addition, it has recently been shown that the risk of hip fracture increases with the number of risk factors present, such as low BMD, advancing age, prior fracture, a fall, high body sway and quadriceps weakness [9]. However, the applicability of these factors for assessing fracture risk in clinical practice has been limited, partly because of the complexity of implementation in primary care [10].

Post-menopausal women and elderly men who know their risk of fracture can make informed decisions about treatment. The assessment of fracture risk has been largely based on relative risk, which is a population-level measure and is not directly applicable to an individual. At present, women with a bone mineral density (BMD) T-score of −2.5 or lower or women with a prior fracture are considered for intervention to reduce their future risk of fracture [11]. However, women with multiple non-BMD clinical risk factors can have a similar risk of fracture, yet they are not considered for treatment. Although such non-bone factors may not be directly amenable to bone-focused interventions, they may be amenable to lifestyle interventions, and they compound any risk attributable to osteoporosis per se, which would be amenable to bone-based interventions.

From the methodological viewpoint, the assessment of fracture risk has been mostly based on the concept of risk grouping (such as the classification of women into osteoporosis and non-osteoporosis groups). Although this approach is simple and sometimes useful in clinical practice, its predictive value is poor due to the arbitrariness of the cut-off value [12, 13]. For example, a woman aged 60 years may have a risk comparable to that of a woman aged 61 years, but an artificial cut-off at the age of 60 separates the two women into two different risk groups. In essence, the risk grouping approach attempts to create homogeneous groups of individuals, which may in fact be impossible and unnecessary, because there is great variation in other risk factors, which are largely continuous variables. A statistical model can combine these risk factors into a single equation and produce a risk estimate, but the computation is not practical in a primary care setting. One alternative approach is to translate the statistical model into a nomogram so that it can be readily used in clinical practice. Several nomograms have been developed and used in the field of oncology, and it has been demonstrated that nomograms perform better than risk-grouping categorization [14, 15], because a nomogram estimates a continuous probability of an event, which yields more accurate predictions than models based on risk grouping.

In the field of osteoporosis, no such nomogram is available. The main purpose of this study was therefore to develop a nomogram that combines BMD and non-invasive clinical risk factors for predicting the 5-year and 10-year risk of hip fracture for an individual woman or man.

Methods

Setting and subjects

This study was part of the ongoing Dubbo Osteoporosis Epidemiology Study (DOES), which commenced in 1989. At that time, all men and women aged 60 years or above living in Dubbo, a city 400 km north-west of Sydney, Australia, were invited to participate in an epidemiological study. The population comprised 1,581 men and 2,095 women aged ≥60 years, of whom 98.6% were Caucasian and 1.4% indigenous Aboriginal. Dubbo was selected as a base for this study because its population structure resembles that of Australia, and the total incidence of fractures within the city could be completely recorded. The study was approved by the St. Vincent’s campus research ethics committee, and informed written consent was obtained from each participant.

Ascertainment of hip fracture

Hip fractures occurring during the study period were identified for residents of the Dubbo local government area through radiologists’ reports from all centres providing X-ray services, as previously described [16, 17]. Fractures were included only if the report of fracture was definite and, on interview, the fracture had occurred with minimal trauma (a fall from standing height or less). Fractures clearly due to major trauma (such as motor vehicle accidents) or due to underlying diseases (such as cancer or bone-related diseases) were excluded from the analysis. Any fractures that occurred more than 3 months before study entry were not considered in the analysis.

Assessment of risk factors

The complete procedures for collecting information and measuring risk factors have been described previously [9]. Briefly, age and anthropometric measurements, including body weight and height, were collected at baseline. Lifestyle factors, including daily physical activity level, daily calcium intake and smoking, together with any history of falls in the preceding 12 months and any history of fractures in the past 5 years, were recorded.

At baseline, bone mineral density (BMD, g/cm²) was measured at the lumbar spine and femoral neck by dual-energy X-ray absorptiometry using a LUNAR DPX densitometer (GE-LUNAR Corp, Madison, Wisconsin, USA). From the measured femoral neck BMD (FNBMD) values, a T-score was calculated for each subject relative to the young normal level. The “young normal” BMD was obtained from a sample of 52 Australian men and women aged between 20 and 32 years. These values are identical to those of the LUNAR Caucasian database. Quadriceps strength (maximum isometric contraction) was measured in the sitting position in the subject’s dominant (stronger) leg with a horizontal spring gauge calibrated up to 50 kg force. This method has a reliability coefficient of 0.92 [18]. Body sway was measured as displacement of the body at the level of the waist in 30-second periods. The area (cm²) encompassing all movements, forward and backward, left and right, was used as the measure of sway. Four test conditions were used: eyes open and closed on a firm surface (wooden floor), and eyes open and closed on a compliant surface (high-density foam 15 cm high). Full descriptions of these assessments, their test and retest reliability and confidence intervals have been given elsewhere [18].
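The T-score calculation described above can be sketched as follows. This is a minimal illustration using the standard definition of a T-score (the number of young-normal standard deviations by which a measurement differs from the young-normal mean). The reference mean and SD below are hypothetical placeholders, not the values from the study's young-normal sample or the LUNAR database.

```python
def t_score(bmd, young_mean, young_sd):
    """Return the T-score: (BMD - young-normal mean) / young-normal SD."""
    if young_sd <= 0:
        raise ValueError("young_sd must be positive")
    return (bmd - young_mean) / young_sd


# Hypothetical young-normal reference values in g/cm^2 (illustration only).
YOUNG_MEAN = 1.00
YOUNG_SD = 0.12

# A femoral neck BMD of 0.70 g/cm^2 under these assumed reference values.
example = t_score(0.70, YOUNG_MEAN, YOUNG_SD)
print(f"T-score: {example:.2f}")
```

With these assumed reference values, a measured BMD of 0.70 g/cm² sits 2.5 SDs below the young-normal mean, which is the conventional threshold mentioned in the Introduction.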

Statistical analysis

Cox’s proportional hazards regression model [19, 20] was used to model the association between hip fracture risk and potential risk factors. The following risk factors were considered: age, body weight, height, body mass index, prior fracture, falls during the past 12 months, dietary calcium intake, lifestyle factors (i.e., smoking habit and alcohol intake), postural sway, quadriceps strength, and femoral neck BMD T-score. The association between each risk factor and hip fracture risk was initially measured by the hazard ratio and its 95% confidence interval (CI), expressed per standard deviation (SD) change for continuous risk factors or per unit change for ordinal risk factors.
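For reference, the standard form of the Cox model and of a hazard ratio expressed per SD can be written as below. The notation ($h_0$, $\beta_j$, $x_j$, $s_j$) is introduced here for illustration and does not appear in the original text; $s_j$ stands for one SD of a continuous risk factor, or one unit of an ordinal one.

```latex
h(t \mid x_1, \dots, x_p) = h_0(t)\, \exp\!\left( \sum_{j=1}^{p} \beta_j x_j \right),
\qquad
\mathrm{HR}_j = \exp\!\left( \beta_j s_j \right),
\qquad
95\%\ \mathrm{CI}:\ \exp\!\left( \left[ \hat{\beta}_j \pm 1.96\, \mathrm{SE}(\hat{\beta}_j) \right] s_j \right)
```

Here $h_0(t)$ is the unspecified baseline hazard, so the hazard ratio for a given risk factor is assumed to be constant over follow-up time.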

Since there were many potential risk factors and the number of “candidate models” for predicting fracture risk can be large, Bayesian model averaging (BMA) [21] was applied to search for the most parsimonious models with consistent and maximum discriminatory power. In terms of model consistency and accuracy, it has been shown that the BMA approach performs better than traditional algorithms such as stepwise regression [22, 23], because it can account for model uncertainty in both predictions and parameter estimates [21, 24].
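The text does not state how candidate models were weighted. One common approach, assumed here purely for illustration, approximates each model's posterior probability from its Bayesian information criterion (BIC) under equal prior model probabilities. The BIC values below are hypothetical and are not taken from the study.

```python
import math


def posterior_model_probs(bics):
    """Approximate posterior model probabilities from BIC values.

    Assumes equal prior probabilities, so that
    P(M_k | data) is proportional to exp(-BIC_k / 2).
    """
    if not bics:
        raise ValueError("at least one BIC value is required")
    best = min(bics)  # subtract the minimum for numerical stability
    weights = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical BIC values for three candidate models (illustration only).
probs = posterior_model_probs([100.0, 102.0, 106.0])
for k, p in enumerate(probs, start=1):
    print(f"model {k}: posterior probability {p:.3f}")
```

Under this approximation, models with lower BIC receive more weight, and a BIC difference of 2 corresponds to a weight ratio of exp(−1).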

The prognostic performance of the most parsimonious model was assessed by the area under the receiver operating characteristic (ROC) curve [25–28], which reflects the model’s ability to discriminate between those who will sustain a fracture and those who will not. An area of 1 represents perfect discrimination, and an area of 0.5 reflects discrimination that is no better than chance. In addition, the model was internally validated using the bootstrap technique, which assesses how accurately the model will predict fracture in a new, similar sample of subjects. In this method, 1,000 sub-samples, each with 150 subjects, were repeatedly drawn (with replacement) from the entire sample and analyzed, from which biases can be estimated [29]. Based on the parameter estimates of the most parsimonious model, a nomogram was constructed using the Design library [29, 30] for predicting the 5-year and 10-year risks of hip fracture. All database management and statistical analyses were performed using the Statistical Analysis System (SAS), version 9.1.3 [31] and R on the Windows platform [32].
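The two ideas in this paragraph, discrimination measured by the area under the ROC curve and resampling with replacement, can be sketched as below. This is a simplified illustration, not the procedure used in the study: it treats the outcome as binary (ignoring censoring), and it resamples fixed risk scores rather than refitting the model in each sub-sample, so it shows the resampling step only and not the full bias correction. The function names and the toy data are assumptions.

```python
import random


def auc(scores, events):
    """Area under the ROC curve by pairwise comparison.

    Equals the probability that a randomly chosen case has a higher
    score than a randomly chosen non-case; ties count as 0.5.
    """
    cases = [s for s, e in zip(scores, events) if e == 1]
    controls = [s for s, e in zip(scores, events) if e == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def bootstrap_auc(scores, events, n_boot=1000, size=150, seed=0):
    """AUC in n_boot sub-samples of `size` subjects drawn with replacement."""
    if not (0 < sum(events) < len(events)):
        raise ValueError("need at least one case and one non-case")
    rng = random.Random(seed)
    indices = range(len(scores))
    results = []
    while len(results) < n_boot:
        sample = [rng.choice(indices) for _ in range(size)]
        s = [scores[i] for i in sample]
        e = [events[i] for i in sample]
        if 0 < sum(e) < len(e):  # skip sub-samples lacking cases or non-cases
            results.append(auc(s, e))
    return results


# Toy data (hypothetical): higher score should indicate higher risk.
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_events = [0, 0, 1, 1]
print(auc(toy_scores, toy_events))
```

The spread of the values returned by `bootstrap_auc` gives a rough sense of how much the discrimination estimate varies across resampled data sets.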

Results

Data were analyzed from 1,028 women and 740 men who had been followed up between 1989 and 2004. The median duration of follow-up was 13 years (inter-quartile range, IQR: 6–14) for both women and men, yielding 10,523 and 7,586 person-years of observation in women and men, respectively. During the follow-up period, 96 women and 31 men sustained at least one hip fracture. As noted in a previous report [9], in both sexes, advancing age, shorter height, reduced weight, quadriceps weakness, postural instability, and lower BMD were each associated with hip fracture risk. In addition, men and women with a prior fracture or a fall during the previous 12 months were also at increased risk of hip fracture. However, there was no significant association between physical activity, dietary calcium intake, coffee intake or cigarette smoking and hip fracture risk (Table 1).
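As a rough aid to reading these counts, the crude incidence of hip fracture per 1,000 person-years can be derived from the figures reported above. The rates themselves are not stated in the text. They are simple arithmetic on the reported events and person-years, and they assume that the person-years quoted are the appropriate denominator for first hip fractures.

```python
def rate_per_1000_person_years(events, person_years):
    """Crude incidence rate expressed per 1,000 person-years."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 1000.0 * events / person_years


# Events and person-years as reported in the Results.
women_rate = rate_per_1000_person_years(96, 10_523)
men_rate = rate_per_1000_person_years(31, 7_586)
print(f"women: {women_rate:.1f}, men: {men_rate:.1f} per 1,000 person-years")
```

On these figures, the crude rate is roughly 9 per 1,000 person-years in women and 4 per 1,000 person-years in men.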

Table 1 Risk factors of hip fracture in men and women: univariate analysis

The Bayesian model averaging analysis suggested a number of optimal models for predicting fracture risk; among these, the most parsimonious one included age, femoral neck BMD, prior fracture, previous fall, and quadriceps strength (Table 2). For ease of application in clinical practice, quadriceps strength, which added only 1.5% to the predictive power, was excluded from the model. The area under the receiver operating characteristic curve of this model was 0.85 (Fig. 1). Internal validation by the bootstrap method yielded a bias-corrected estimate of predictive discrimination of 0.70 for women and 0.65 for men. The maximum calibration error in predicting the probability of fracture was about 2% for women and 7% for men.

Table 2 Five parsimonious models chosen by Bayesian Model Averaging (BMA)
Fig. 1 Area under the receiver operating characteristic curve for the predictive performance of four risk factors (age, femoral neck BMD T-score, prior fracture and fall) for hip fracture in women (thicker solid line) and in men (thinner solid line)

Using the parameter estimates of this model (Table 3), we constructed separate nomograms for predicting hip fracture risk in women and men (Fig. 2a and b). Some typical point estimates of 5-year and 10-year risks are presented for those without a prior fracture and without a history of falls (Table 4) and for those with a prior fracture and a history of falls (Table 5). Notably, among those with low BMD (T-score less than or equal to −2.5) and aged more than 70 years, the risk of hip fracture in men was almost equivalent to, or even higher than, that in women.

Table 3 Parameter estimates of risk factors from the multivariate Cox’s proportional hazards model
Fig. 2 a Nomogram for predicting the 5-year and 10-year probability of hip fracture for a woman. Instructions for use: Mark the age of the individual on the “Age” axis and draw a vertical line to the “Point” axis to determine how many points toward the probability of hip fracture the individual receives for his/her age. Repeat the process for each additional risk factor. Sum the points of the risk factors. Locate the final sum on the “Total points” axis. Draw a vertical line down to the 5-year or 10-year risk line to find the individual’s probability of sustaining a hip fracture within the next 5 or 10 years. Example: Mrs. A, 70 years old, has a BMD T-score of −2.5 and had a prior fracture and a fall in the past 12 months; her points for age are 9, her BMD points are 65, her prior fracture points are approximately 10 and her fall points are 4. Her total points are therefore 9 + 65 + 10 + 4 = 88, and her probability of having a hip fracture is around 0.09 in the next 5 years and 0.17 in the next 10 years. In other words, among 100 women like her, one would expect 9 and 17 to have a hip fracture in the next 5 and 10 years, respectively. b Nomogram for predicting the 5-year and 10-year probability of hip fracture for an elderly man. Instructions for use: as for panel a. Example: Mr. B, 70 years old, has a BMD T-score of −2.5 and had a prior fracture and a fall in the past 12 months; his points for age are approximately 16, his BMD points are 65, his prior fracture points are 6 and his fall points are 2. His total points are therefore 16 + 65 + 6 + 2 = 89, and his probability of having a hip fracture is around 0.06 in the next 5 years and 0.11 in the next 10 years. In other words, among 100 men like him, one would expect 6 and 11 to have a hip fracture in the next 5 and 10 years, respectively.
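The point-summing step in the two worked examples can be written out as a short sketch. The point values and probabilities below are those quoted in the examples for Mrs. A and Mr. B. The mapping from total points to probability is read off the nomogram's axes and is not reproduced here, so the code only records the quoted probabilities. The dictionary keys and function names are illustrative.

```python
def total_points(points):
    """Sum the nomogram points assigned to each risk factor."""
    return sum(points.values())


def expected_cases(probability, n=100):
    """Expected number of fractures among n individuals with this risk."""
    return probability * n


# Points as quoted in the worked examples.
mrs_a = {"age": 9, "bmd": 65, "prior_fracture": 10, "fall": 4}
mr_b = {"age": 16, "bmd": 65, "prior_fracture": 6, "fall": 2}

# Probabilities as quoted in the worked examples (read from the nomogram).
quoted_risk = {
    "Mrs. A": {"5-year": 0.09, "10-year": 0.17},
    "Mr. B": {"5-year": 0.06, "10-year": 0.11},
}

for name, pts in (("Mrs. A", mrs_a), ("Mr. B", mr_b)):
    risks = quoted_risk[name]
    print(
        f"{name}: total points = {total_points(pts)}, "
        f"expected per 100 = {expected_cases(risks['5-year']):.0f} (5-year), "
        f"{expected_cases(risks['10-year']):.0f} (10-year)"
    )
```

Although the two totals are nearly identical (88 and 89), the quoted probabilities differ, because the women's and men's nomograms have separate scales.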

Table 4 Probability of 5-year and 10-year risk of hip fracture in those without a history of prior fracture and fall (in the last 12 months)
Table 5 Probability of 5-year and 10-year risk of hip fracture in those with a history of one prior fracture and one fall (in the last 12 months)

Because the effects of prior fracture and fall were independent of age and BMD, it is not surprising that prior fracture and fall significantly increased the risk of fracture in both sexes. For example, an 80-year-old woman with a BMD T-score of −2.5 and without prior fracture or falls had 5-year and 10-year risks of approximately 5% and 9%, respectively. However, if the same woman had a prior fracture and a fall during the past 12 months, her corresponding risks increased to approximately 15% and 27%. Similarly, the 5-year and 10-year risks of fracture for an 80-year-old man with a T-score of −2.5 and without prior fracture or falls were 7% and 13%, respectively; with a prior fracture and a fall during the past 12 months, his risks increased to 15% and 28%.

Discussion

In order to make informed decisions, an individual and physician need to know the individual’s absolute risks of fracture. An accurate and objective prediction of risk should simultaneously take into account the individual’s multiple factors that might affect fracture risk. The present study focused on the utilization of BMD and established non-invasive risk factors to predict hip fracture risk in men and women aged 60 years or above. Apart from BMD, the risk factors considered (i.e., age, prior fracture, and history of falls) are easily obtained from patients; therefore, this model can be easily utilized in the primary care setting.

In the past, the assessment of fracture risk has been largely based on relative risk, which can be misleading to patients and clinicians [33] because the interpretation of a relative risk or its change is highly dependent on the background risk. For instance, doubling a minor risk still yields a minor risk, but doubling a common risk is alarming. Therefore, a statement such as “your risk of fracture is increased by 2-fold” is not informative to an individual, because the relative risk does not convey the precise likelihood of fracture for that individual. Moreover, relative risk is derived from the risk grouping approach, which compares the risks of different individuals with presumably similar (but not identical) characteristics. Therefore, relative risk is an average measure and is applicable only to a group of individuals, not to any particular individual, as the individual does not have a denominator. Instead of relying on risk grouping as traditionally done in previous studies, the present study considered all risk factors in their continuous scales to construct a prognostic nomogram so that the absolute risk can be estimated for an individual man or woman.
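The dependence of a relative risk on the background risk can be made concrete with a small numerical sketch. The two baseline risks below (1% and 20%) are hypothetical values chosen for illustration and are not estimates from this study.

```python
def absolute_risk(baseline_risk, relative_risk):
    """Absolute risk implied by applying a relative risk to a baseline risk."""
    risk = baseline_risk * relative_risk
    if not 0.0 <= risk <= 1.0:
        raise ValueError("implied risk falls outside [0, 1]")
    return risk


# The same 2-fold relative risk applied to two hypothetical baseline risks.
for baseline in (0.01, 0.20):
    doubled = absolute_risk(baseline, 2.0)
    print(
        f"baseline {baseline:.0%} -> {doubled:.0%} "
        f"(absolute increase of {(doubled - baseline) * 100:.0f} percentage points)"
    )
```

The same “2-fold” statement corresponds to an absolute increase of 1 percentage point in the first case and 20 percentage points in the second, which is why an absolute risk is more informative for an individual.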

Because it is highly unlikely that two individuals will have identical risk profiles, the risk of fracture should ideally be individualized. The individualization can best be based on all relevant factors, in both categorical and continuous measurements, that define the uniqueness of the individual. A multivariable model, and hence a nomogram, can therefore tailor the fracture risk for an individual. Several studies have suggested that nomogram-based risk prediction performs better than a risk-grouping approach [34, 35], because a nomogram recognizes and can define the unique risk profile for an individual.

The purpose of developing a predictive model is to suggest a prognosis or therapeutic action and to reduce the burden of fracture in the general population. Traditional models for predicting osteoporosis have been largely based on cut-off values of predictors, which were, in turn, determined from functional relationships between the predictors and fracture. Although this approach to model development has the appeal of simplicity, it is prone to misclassification (i.e., false positives and false negatives). For example, a classification of osteoporosis versus non-osteoporosis may place two women with BMD T-scores of −2.4 and −2.6 into two different groups despite the fact that the two women may have comparable risks of fracture. It is hoped that the nomogram developed in this study can help clinicians and patients make initial informed decisions about risk and treatment options, and that the resulting risk estimate can be readily conveyed to the patient.
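The T-score example above can be illustrated with a brief sketch that contrasts a cut-off classification with a continuous, log-linear effect of the kind assumed in a Cox-type model. The hazard ratio per SD decrease used here is a hypothetical value for illustration only and is not an estimate from this study.

```python
def risk_group(t_score, cutoff=-2.5):
    """Classify by the conventional T-score cut-off."""
    return "osteoporosis" if t_score <= cutoff else "non-osteoporosis"


def relative_hazard(t_a, t_b, hr_per_sd_decrease):
    """Hazard of person a relative to person b under a log-linear T-score effect."""
    return hr_per_sd_decrease ** (t_b - t_a)


# Hypothetical hazard ratio per 1 SD decrease in T-score (illustration only).
HYPOTHETICAL_HR = 2.5

groups = (risk_group(-2.4), risk_group(-2.6))
ratio = relative_hazard(-2.6, -2.4, HYPOTHETICAL_HR)
print(groups)
print(f"relative hazard, T = -2.6 versus T = -2.4: {ratio:.2f}")
```

Under this assumed effect size, the two women fall into different groups even though their hazards differ by only about 20%, which is the kind of misclassification the continuous approach avoids.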

The applicability of any nomogram depends on: (i) the characteristics of patients, (ii) the measurement reliability of risk factors, (iii) the accuracy of the predictive model, and (iv) validation of the model. With respect to patient group, the present nomogram was developed from data on a well-characterized cohort of Caucasian men and women aged 60 years or above who have been followed up for up to 15 years, which made it possible to delineate fine associations. However, the nomogram may not be applicable to other populations, especially to younger individuals, because of differences in the background risk of fracture.

In terms of risk factor measurement reliability, BMD is considered the primary predictor of fracture risk and measurement of BMD is known to be highly reliable [36]. Other risk factors considered in this model (i.e., advancing age, prior fracture, and fall) have been consistently shown to be associated with increased fracture risk in several studies, including the present study [7, 9, 37]. Prior fracture and falls are also risk factors that can be obtained with high reliability. Therefore, the parameters estimated from this study can be considered highly reliable.

The accuracy of a predictive model is normally quantified by its discriminatory power, or the ability to separate individuals who will sustain a fracture along a continuum from those who will not. The incorporation of these non-BMD risk factors has been demonstrated to improve the predictive accuracy of fracture risk. For example, the area under the receiver operating characteristic curve (AUC) for BMD alone was 0.78 for men and 0.80 for women, but the inclusion of age, prior fracture and fall increased this index to 0.85 for both sexes, which was a statistically significant improvement. The index of concordance (which is equivalent to the AUC) is 0.85, which is deemed to be clinically useful. It follows that these factors should be formally considered in any assessment of fracture risk for an individual, particularly when they can easily be solicited from any patient.

The most trustworthy way to demonstrate the usefulness of a predictive model is to apply the model to independent populations. This nomogram has not been validated in an independent population, and thus its external validity remains to be established. However, internal validation of the model by the bootstrap method yielded a reasonable concordance between observed and predicted fracture incidence.

Nevertheless, the study has a number of limitations that should be taken into account in the interpretation of the data. Selection bias was likely present in this study, in that participants were healthier than non-participants. For instance, although the relative distribution of subjects with respect to age in the sample was comparable to that in the total target population [17], the mortality rate in the DOES sample was lower than that in the general population [1], which might reflect a bias toward healthy subjects in the study. Age and BMD considered in this model are subject to change with time, but the present estimates of fracture risk by age and BMD assumed that BMD did not change with time (at present, there is no statistical method to handle such a problem).

Although several risk factors for hip fracture have been identified during the past two decades [5, 9, 38], the utilization of these risk factors in clinical practice has not been realized, because some risk factors require sophisticated measurement that is either not readily available or not easily implemented in the primary care setting. The nomogram developed in this study, based on a linear combination of femoral neck BMD, age, prior fracture and falls during the previous 12 months, can be used for predicting the 5-year and 10-year risks of hip fracture. The final combination of BMD and clinical risk factors not only characterizes absolute risk but also identifies the major source of that risk. Thus, if low BMD is not a strong contributor, fall-reduction approaches may be most appropriate, whereas if low BMD is a major contributor, the focus should be on bone-specific interventions. Identification of high-risk individuals for intervention is one of the priorities in osteoporosis research [39], and it is hoped that the nomogram approach presented in this study is a step in that direction.