Introduction

Numerous risk factors for osteoporosis and fractures have been identified, and several tools have been developed to integrate these risk factors into a single estimate of fracture risk for an individual. Prediction tools such as the fracture risk assessment tool (FRAX) algorithm [1], the QFracture algorithm [2], and the Garvan fracture risk calculator (Garvan) [3, 4] aim to assist clinicians in the management of their patients by calculating the patient’s 5-year or 10-year risk of fracture from a combination of known risk factors. In addition to these most popular algorithms, several other tools exist, which vary according to the type and number of risk factors included. Common to all these tools is the ability to identify women at increased risk of osteoporotic fracture and to stratify them into risk categories for osteoporosis or fracture. Several studies [5,6,7,8,9,10] have compared various tools for their ability to identify women at highest risk of fracture. Most of these studies concluded that the simpler tools perform as well as the more complex ones.

Prior to the advent of these algorithms, self-risk assessment tools were available to identify women with low BMD and/or to estimate the risk of fracture. These include age, body size, no estrogen (ABONE) [11], the osteoporosis risk assessment instrument (ORAI) [12], the Osteoporosis Self-assessment Tool equation (OST) [13, 14], the simple calculated osteoporosis risk estimation (SCORE) tool [15], the study of osteoporotic fractures (SOF)-based screening tool [16], and the osteoporosis index of risk (OSIRIS) [17].

Targeting individuals with increased risk of osteoporotic fracture is an important challenge in the field of osteoporosis. Risk assessment tools may contribute to healthcare decision-making by identifying which patients would benefit most from DXA scanning or treatment. This chapter will review the evidence for osteoporosis screening, the benefits and harms of early detection of osteoporosis, and the most common osteoporosis risk assessment tools, including self-assessment tools. The chapter will then discuss thresholds for intervention and room for improvement.

The Evidence

Screening for osteoporosis, by measuring bone density, can be done with a number of technologies: dual-energy X-ray absorptiometry (DXA), which can measure bone density in the whole body; ultrasound, for measurement in the heel, finger, wrist, and knee; CTXA [a software application] for measurement on the hip; and quantitative computed tomography (QCT) for measurement of the vertebrae and wrist.

Very few studies have addressed the use of these technologies in a mass-screening scenario. Although there are studies of the relative detection rate and of the cost of different technologies, these studies do not address whether population-based screening is effective or cost-effective. One study, however, evaluated the use of ultrasound examinations for screening at the population level before an actual measurement is done by DXA, and it concluded that this is not a cost-effective strategy [18].

Validated questionnaires may also be used to identify high-risk patients who might benefit from treatment or to pre-screen those who may need to have their bone density measured. Questionnaires assessed in these studies include the osteoporosis self-assessment tool (OST), the osteoporosis index of risk (OSIRIS), the simple calculated osteoporosis risk estimation (SCORE), the osteoporosis risk assessment instrument (ORAI), and the age, body size, no estrogen (ABONE) decision rules [19,20,21].

Findings from studies of the use of different pre-screening tests demonstrate that these tests may be cost effective in mass-screening strategies. One study calculated that pre-screening at the population level would cost about €300 per patient. Again, this calculation does not provide any information on whether mass screening is effective or cost-effective [22].

A prospective study on the effect of bone mineral density measurements for screening was performed in the United Kingdom on a population of 6282 women 50–54 years of age, with a 5-year follow-up. Of the women screened, 36% were found to have a bone density that required intervention. These patients were sent to a general practitioner (GP) for treatment and follow-up. A total of 1462 women were followed up, and, of these, 12% were already being treated (with HRT, which was the treatment of choice at that time) at the start of screening, 57% were found to be suitable for HRT after consultation with the primary care physician, and 60% of these rejected treatment. The authors concluded that screening all postmenopausal women by measuring bone mineral density was not acceptable for several reasons, of which the potentially low adherence to treatment following screening was a prominent one [23]. In addition, the sensitivity and specificity of population-based screening for osteoporosis are rather low [24].

At the WHO level, screening for osteoporosis has been discussed in WHO technical reports, in which the arguments for general screening of all women were found to be weak [25]. Many other studies, reviews, and agencies have concluded that the evidence is insufficient to recommend general screening for osteoporosis, although they acknowledge the evidence that bone density measurements may be used to diagnose patients in need of treatment [26,27,28,29,30,31].

However, this conclusion, that the evidence is insufficient to recommend general screening for osteoporosis, is not shared universally. Based on a systematic review of the literature, the United States Preventive Services Task Force found good evidence that the risk of osteoporosis and fracture increases with age and other factors, that bone density measurements accurately predict the risk of fractures in the short term, and that treating asymptomatic women with osteoporosis reduces their risk of fracture. On the basis of this indirect evidence, the Task Force concluded that the benefits of screening and treatment are, at least, of moderate magnitude for women at increased risk by virtue of age or presence of other risk factors, and it recommended that routine screening begin at 65 years of age for women at increased risk for osteoporotic fractures [32, 33].

Benefits and Harms of Early Detection of Osteoporosis

There is convincing evidence that bone measurement tests are accurate for predicting osteoporotic fractures in women and men. A study [34] that evaluated the effect of screening for osteoporosis on fracture rates reported a reduction in hip fractures but did not find a reduction in other types of fractures [35, 36]. In concordance, multiple studies showed that drug therapies reduce fractures in postmenopausal women with osteoporosis. For women 65 years and older, there is convincing evidence that screening can detect osteoporosis and that treatment of women with osteoporosis can provide at least a moderate benefit in preventing fractures. For postmenopausal women younger than 65 years who are at increased risk of osteoporosis, there is also adequate evidence that screening can detect osteoporosis and that treatment provides a moderate benefit in preventing fractures.

For men, there has been inadequate evidence reported on the benefits and harms of treating screen-detected osteoporosis to reduce the risk of osteoporotic fractures.

On the other hand, a single study [35] has reported on the harms of screening for osteoporosis. It found no increase in anxiety and no decrease in quality of life from screening. Based on the nature of screening with bone measurement tests and the low likelihood of serious harms, the United States Preventive Services Task Force (USPSTF) found adequate evidence to bound these harms as no greater than small. Harms associated with screening may include radiation exposure from DXA and opportunity costs (time and effort required by patients and the healthcare system).

Harms of drug therapies for osteoporosis depend on the specific medication used. The risk of serious adverse events, upper gastrointestinal events, or cardiovascular events associated with the most common class of osteoporosis medication (bisphosphonates) is no greater than small [33]. Overall, therefore, it can be concluded that there is adequate evidence that the harms of osteoporosis medications are small.

Risk Assessment Tools

In deciding which women to screen with bone measurement testing, clinicians should first consider factors associated with increased risk of osteoporotic fractures. These include parental history of hip fracture, smoking, excessive alcohol consumption, low body weight, as well as high risk of falling. In addition, menopausal status in women is also an important consideration because studies demonstrating treatment benefit mainly enrolled postmenopausal women. For postmenopausal women younger than 65 years who have at least one risk factor, a reasonable approach to determine who should be screened with bone measurement testing is to use a clinical risk assessment tool.

Assessment of bone mineral density (BMD) provides a crucial determinant of fracture risk, and many guidelines have used BMD thresholds to determine whether treatments should be recommended. However, the multifactorial nature of fracture risk means that BMD does not capture non-skeletal determinants of fracture risk such as liability to fall. A number of risk factors for fracture have been identified that contribute significantly to fracture risk over and above that provided by BMD [37]. A good example is age. The same BMD has a different significance at different ages, such that fracture risk is much higher in the elderly than in the young [38, 39]. This is because age contributes to risk independently of BMD. Several tools are available to assess osteoporosis risk; these are described below.

FRAX

Over the past years, a series of meta-analyses has been undertaken to identify additional clinical risk factors that could be used in case-finding strategies, with or without the use of BMD. This gave rise to the development of FRAX®, University of Sheffield, a tool that integrates the information derived from clinical risk factors and BMD and consequently assesses a person’s 10-year probability of fracture [40].

FRAX (Fig. 7.1) calculates fracture probability in individuals from age, body mass index, and dichotomized risk factors comprising prior fragility fracture, parental history of hip fracture, current tobacco smoking, ever use of long-term oral glucocorticoids, rheumatoid arthritis, other causes of secondary osteoporosis, and alcohol consumption (Table 7.1) [40]. Femoral neck BMD can optionally be entered to enhance fracture risk prediction. Fracture probability is computed taking both the risk of fracture and the risk of death into account. The use of clinical risk factors in conjunction with BMD and age improves the sensitivity of fracture prediction without adverse effects on specificity [41]. Even though the performance of FRAX is enhanced by the use of BMD tests, it should be recognized that FRAX without BMD has a predictive value for fractures that is comparable to the use of BMD alone [42,43,44]. The availability of and access to densitometry is low in many countries [43], so a major advantage of FRAX is the ability to assess fracture risk where BMD is unavailable.

Fig. 7.1 Fracture risk assessment tool “FRAX”

Table 7.1 Definitions of the risk factors included in the fracture risk assessment tool*

Fracture probability varies markedly in different regions of the world [44]. Thus, the FRAX® models need to be calibrated to those countries where the epidemiology of fracture and death is known. Models are currently available for 58 countries across the world: Argentina, Armenia (surrogate), Austria, Australia, Belgium, Brazil, Canada, Chile, Czech Republic, China (revised 2013), Colombia, Croatia, Denmark, Ecuador, Estonia, France, Finland, Germany, Greece, Hong Kong, Hungary, Iceland, India (surrogate), Indonesia, Ireland, Israel, Italy, Japan, Jordan (updated), South Korea, Kuwait, Lebanon, Lithuania, Malta, Mexico, Morocco, Netherlands, New Zealand, Norway, Palestine (surrogate), the Philippines, Poland, Portugal, Romania, Russia, Singapore, Slovakia, Sri Lanka (surrogate), Spain, Sweden, Switzerland, Taiwan, Thailand, Tunisia, Turkey, the UK, the USA, and Venezuela. The model is available in 27 languages: Arabic, Bengali, Chinese (traditional and simplified), Czech, Danish, Dutch, English, Finnish, French, German, Greek, Icelandic, Indonesian, Italian, Japanese, Korean, Lithuanian, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Spanish, Swedish, Thai, and Turkish [45].

FRAX has been widely used for the assessment of fracture risk since the launch of the website in 2008 and currently processes approximately 225,000 calculations per month. Following regulatory review by the US Food and Drug Administration (FDA), FRAX was incorporated into DXA scanners to provide FRAX probabilities at the time of DXA scanning. For those without internet access, handheld calculators and an application for Apple and Android smartphones have been developed by the IOF (http://itunes.apple.com/us/app/frax/id370146412?mt=8) and (https://play.google.com/store/apps/details?id=com.inkrypt.clients.iof.drfrax). A paper-based FRAX pad allows patients to document risk variables prior to medical consultation and is available from the IOF (www.iofbonehealth.org) in several languages.

The limitations of FRAX (Table 7.2) have been reviewed recently [46, 47]. Although the FRAX tool has been appreciated for its simplicity for use in primary care, it has been criticized because it does not take account of exposure response. For example, the risk of fracture increases with exposure to glucocorticoids (both dose and duration), but FRAX only accommodates a yes/no response to the relevant question. Other well-researched examples of “dose–response” include the number of prior fractures and the consumption of alcohol. Other concerns are the lack of provision for lumbar spine BMD, which is commonly recommended in treatment guidelines, and the absence of measurements of the material or structural properties of bone. A concern that treatment might invalidate the interpretation of FRAX is misplaced [48].

Table 7.2 Limitations of FRAX

If FRAX is to be made more accurate by the inclusion of different degrees of exposure, then information is required not only on the risk of fracture associated with these exposures but also on their dependence on the other risk variables in FRAX and their independent effect on the death hazard. This demands the collection of new population cohorts that include such information as well as the other FRAX variables in sufficient numbers and with wide geographical representation.

To overcome some of these limitations, relatively simple arithmetic procedures have been proposed that can be applied to conventional FRAX estimates of the probabilities of hip fracture and of a major fracture. These procedures adjust the probability assessment with knowledge of steroid dose and duration [20], lumbar spine BMD [49, 50], trabecular bone score (TBS) [51,52,53], hip axis length [54], and moderate or high risk of falling or a history of recurrent falls.

Such analyses can inform the clinician how to temper clinical judgment on the existing output of the FRAX models. The most frequent concern, however, is the omission of falls as a risk variable in the FRAX model, particularly as this is included in other risk assessment tools. Indeed, a Task Force of the ISCD recommended that falls should be incorporated into FRAX [55]. Although this view is a sound academic conclusion from the literature on falls risk, the incorporation of falls into FRAX is problematic for several reasons. First, at the time of the release of FRAX, existing falls data were not of adequate quality, and the construct of questions on falls was heterogeneous. Second, falls risk can be considered to be inherently taken into account in the algorithm, though not as an input variable. Thus, the fracture probability given for any combination of risk factors assumes that the falls risk is that observed (but not documented) in the cohorts used to construct FRAX. Third, the interrelationship of falls risk with the other FRAX variables has been inadequately explored on an international basis. Fourth, the relationship between the risk variable and mortality needs to be accounted for, but there are no data available.

FRAX in Patients’ Management

The use of FRAX in clinical practice demands consideration of the fracture probability at which to recommend treatment, termed the intervention threshold. Many different approaches have been used to set intervention thresholds with FRAX. However, the thresholds used have varied since they depend critically on local factors such as reimbursement issues, health economic assessment, willingness to pay for health care in osteoporosis, and access to DXA. Treatment with FDA-approved medical therapies should be considered in postmenopausal women and men aged 50 years and older on the basis of the following:

  1. A hip or vertebral (clinical or morphometric) fracture.

  2. T-score ≤−2.5 at the femoral neck or spine after appropriate evaluation to exclude secondary causes.

  3. Low bone mass (T-score between −1.0 and −2.5 at the femoral neck or spine) and a 10-year probability of a hip fracture ≥3% or a 10-year probability of a major osteoporosis-related fracture ≥20% based on the US-adapted WHO algorithm.

  4. Clinicians’ judgment and/or patient preferences may indicate treatment for people with 10-year fracture probabilities above or below these levels.
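The first three criteria above can be read as a simple decision rule. The following Python sketch is illustrative only: the record and field names are hypothetical, the caller is assumed to have already established that the patient is a postmenopausal woman or a man aged 50 years or older and that secondary causes have been evaluated, and the boundaries of "between −1.0 and −2.5" are assumed to be exclusive. Criterion 4 (clinical judgment and patient preference) is deliberately not encoded.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    """Hypothetical record; field names are illustrative only."""
    hip_or_vertebral_fracture: bool  # clinical or morphometric fracture
    t_score: float                   # T-score at the femoral neck or spine
    hip_prob_10y: float              # 10-year hip fracture probability, in %
    major_prob_10y: float            # 10-year major fracture probability, in %


def meets_treatment_criteria(a: Assessment) -> bool:
    """Return True if any of criteria 1 to 3 is met."""
    if a.hip_or_vertebral_fracture:          # criterion 1
        return True
    if a.t_score <= -2.5:                    # criterion 2
        return True
    # Criterion 3: low bone mass plus a probability above either threshold.
    low_bone_mass = -2.5 < a.t_score < -1.0  # exclusive bounds are an assumption
    return low_bone_mass and (a.hip_prob_10y >= 3.0 or a.major_prob_10y >= 20.0)


if __name__ == "__main__":
    patient = Assessment(False, -1.8, 3.2, 12.0)
    print(meets_treatment_criteria(patient))
```

A rule of this kind only flags patients for consideration; as criterion 4 states, treatment may still be indicated above or below these levels.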

QFracture

In 2009, Hippisley-Cox and Coupland published a paper describing the development and validation of QFracture (www.qfracture.org)—a set of risk prediction algorithms to predict 10-year risk of hip fracture and osteoporotic fracture (hip, vertebral, or distal radius fracture) in primary care. The algorithms were developed using data from a sample of two thirds of practices in the QResearch database and validated using the remaining third so that the validation sample is physically separate from the derivation sample. QResearch is a database derived from general practices using the EMIS clinical system (EMIS is the clinical system used by more than 55% of GP practices nationally in the UK). The resulting publicly available web calculator and open source software can be found at www.qfracture.org.

Like the FRAX tool, it takes into account history of smoking, alcohol, corticosteroid use, parental history (of hip fracture or osteoporosis), and several secondary causes of osteoporosis (Fig. 7.2). Unlike FRAX, it also includes a history of falls (yes/no only, over an unspecified time frame) and utilizes a large number of clinical risk factors, but it makes no provision for BMD. It has been internally validated (i.e., from a stratum of the same population) and externally validated in a similar population (routinely collected data in general practitioner records). The performance characteristics and calibration in the UK have been compared with FRAX, with comparable results for hip fracture [56]. The tool has not been calibrated to the epidemiology of other countries. QFracture is also more cumbersome to use (more questions). BMD measurements are dismissed as “expensive and inconvenient tests,” and so the model ignores a wealth of data demonstrating the utility of BMD testing in fracture risk assessment [57].

Fig. 7.2 QFracture®-2016 risk calculator: http://qfracture.org

Garvan

The Garvan fracture risk calculator or Garvan scale (www.garvan.org.au) was devised by Australian researchers at the Garvan Institute of Medical Research to predict, in a given patient, the absolute risk of having any osteoporotic fracture within 5 and 10 years [58]. The Garvan tool is based on far fewer men and women, drawn from a single study, the Australian Dubbo Osteoporosis Epidemiology Study (DOES) of approximately 2500 men and women aged 60 years or more. It differs from FRAX by including a history of falls (categorized as 0, 1, 2, >2 in the previous year) and the number of previous fragility fractures (categorized as 0, 1, 2, >2), but it does not include other FRAX variables such as parental history of hip fracture, secondary osteoporosis, rheumatoid arthritis, glucocorticoid use, smoking, and intake of alcohol (Fig. 7.3). The output of the tool differs from FRAX in that it reports the risk for a larger number of fracture sites (it additionally includes fractures of the distal femur, proximal tibia/fibula, distal tibia/fibula, patella, pelvis, ribs, sternum, hands, and feet excluding digits) [59].

Fig. 7.3 Garvan risk assessment tool

The Garvan scale, although apparently very practical and easy to use, is hampered by the limited relevant bibliography. In comparison to the FRAX®, the Garvan tool has been less widely used, showing often divergent results in some studies which compared both scales [60].

Comparative Features

There are important differences in the input variables, output, and model features that make comparison of the models problematic (Table 7.3).

Table 7.3 Comparative features of FRAX, QFracture, and Garvan

Comparison of Input

With regard to input variables, both Garvan and QFracture include a history of falls, whereas this is not an input variable in FRAX. In particular, the Garvan tool weights the number of falls in the past year. Although falls are a strong risk factor for fracture, the incorporation of falls into FRAX is problematic for several reasons, as mentioned earlier [61, 62]. Putting these technical problems aside, risk assessment tools are intended to identify a risk that is amenable to a therapeutic intervention. However, falls as a risk variable do not consistently pass the test of reversibility of risk [63, 64], a necessary feature of any risk variable used in tools to direct interventions [42]. Recently, an analysis in elderly men, available as a meeting abstract, indicated that the predictive value of falls for fracture waned significantly with time [65, 66]. If the phenomenon is replicated more generally, this would further question the utility of falls history in the long-term (e.g., 10-year) assessment of fracture risk. In their review, Kanis and his colleagues [57] suggested that a useful role of fall history in fracture risk assessment remains sub judice. On the other hand, a recent study [61] revealed that the self-reported number of falls in the previous year is strongly associated with incident fracture risk in the routine clinical practice setting, and that this risk is independent of age, sex, BMD, and baseline fracture probability. Moreover, there is a dose-response relationship, with multiple falls (up to a maximum of 3) conferring greater risk than a single fall.

In addition to falls, there are also few data showing that many of the QFracture risk factors (cardiovascular disease, type 2 diabetes, asthma, tricyclic antidepressant usage, history of falls, or liver disease) characterize a risk that is amenable to bone-targeted interventions. Other important differences between models include the question construct for fracture history, given as past fragility fracture (FRAX), fractures since the age of 50 years (Garvan), or past wrist, spine, hip, or shoulder fracture (QFracture). For BMD, the femoral neck is the reference site for FRAX and for Garvan, but BMD is not an input variable for QFracture.

Comparison of Output

Considering the output and model features, the Garvan instrument includes many more fracture outcomes than QFracture or FRAX. Compared with FRAX, the inclusion of these additional fractures is expected to inflate fracture risks in women by 34–45% depending on age [67]. The outcome variable differs between models, not only in the fracture sites but also in the metric. In the case of FRAX, the algorithm computes a fracture probability (i.e., a metric that incorporates the death hazard) which is not synonymous with simple fracture incidence [68].
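The difference between a fracture probability that incorporates the death hazard and a simple cumulative incidence can be illustrated numerically. The sketch below is a textbook competing-risks simplification, not the FRAX computation: it assumes constant fracture and death hazards over the time horizon, and the function names and hazard values are invented for illustration.

```python
import math


def cumulative_incidence_ignoring_death(fracture_hazard: float, years: float) -> float:
    """Probability of fracture within `years` if nobody could die first."""
    return 1.0 - math.exp(-fracture_hazard * years)


def fracture_probability_with_death(
    fracture_hazard: float, death_hazard: float, years: float
) -> float:
    """Probability of fracture within `years` when death is a competing event.

    Assumes both hazards are constant (an illustrative simplification).
    """
    total = fracture_hazard + death_hazard
    if total == 0.0:
        return 0.0
    # Share of all first events that are fractures, multiplied by the
    # probability that any first event occurs within the horizon.
    return fracture_hazard / total * (1.0 - math.exp(-total * years))


if __name__ == "__main__":
    # Hypothetical hazards per person-year, chosen only for illustration.
    print(cumulative_incidence_ignoring_death(0.02, 10))
    print(fracture_probability_with_death(0.02, 0.05, 10))
```

Under these assumptions, the probability that accounts for death is always at or below the incidence-based figure, and the gap widens as the death hazard rises, which is why the two metrics diverge most in the oldest age groups.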

The performance characteristics of the three prediction models appear to be comparable, mainly for hip fracture risk [69,70,71,72,73,74,75], bearing in mind the methodological flaws in most of the comparative studies [68,69,70,71,72,73,74,75]. When QFracture and FRAX are applied to the UK population, there is reasonable concordance for hip fracture risk since both are calibrated to the UK, though in different ways. The Garvan instrument is calibrated only to Dubbo and is the outlier. Concordance of the Garvan and FRAX tools has been reported in Canada [69]. This was considered by Kanis et al. [70] to be a fortuitous accident occasioned by the similar epidemiology of Canada and Dubbo. The claim of good calibration in Norway is not supported by the evidence [71,72,73,74].

Whereas QFracture and FRAX are comparably calibrated for hip fracture risk [69, 71, 75], a quite different pattern is evident for major osteoporotic fractures where the probabilities derived from FRAX are markedly higher than the incidences from QFracture. The Garvan instrument gives even higher values for the same clinical scenarios. The Garvan tool provides the highest risks, in part because the output is the 10-year incidence of all fractures (minus those at the digits), whereas QFracture gives the lowest estimates [76,77,78].

The reason for the discrepancy is that QFracture is derived from general practitioner records that are often incomplete for some important variables [78]. For example, GP records are reasonably accurate for the documentation of hip fracture but notoriously unreliable for other major fractures, particularly vertebral fractures [79]. Thus, the prevalence of a prior major fracture in the QFracture database is 1.9% [72], whereas prior fracture is estimated at 21–45% in women from the UK, depending on age [80,81,82,83]. Of these, approximately half will be major fractures. For a parental history of osteoporosis or hip fracture, the prevalence is given as 0.3% in the QFracture database, whereas meta-analysis of prospective studies gives a prevalence of parental hip fracture of 13% [81]. The impact of the inaccuracies is difficult to quantify but is likely to decrease the median of the distribution of 10-year risk in the population. Empirical observation supports this view in that, at each tenth of risk category, QFracture risk is lower than FRAX-based probabilities.

Likewise, the poor and inaccurate capture of clinical risk factors is likely to bias the weights for both hip fracture risk and major fracture risk. In the case of FRAX and Garvan, the probability of fracture is approximately doubled with a prior history of fracture, consistent with worldwide observation [73, 82]. In the case of major fracture incidence, QFracture determines an increase in risk ratio of only approximately 8%, rather than the expected doubling of risk [78]. As expected from meta-analysis, the impact of a prior fracture is somewhat greater at younger ages [73], and this is accommodated in FRAX. In contrast, the weighting given for a prior fracture as a risk factor is unrealistic for QFracture and does not vary with age (the latter is also the case for Garvan).

A further problem arises in considering the pattern of fractures with age. As expected, FRAX probabilities of a major fracture exceed that of hip fracture at all ages. In the case of QFracture, the incidence of hip fracture and the incidence of major fracture are identical from the age of 85 years. This implies that no fractures of the spine, humerus, or distal forearm arise in women from the age of 85 years. Again, this contrasts with empirical observations [83, 84]. Indeed, fragility fractures other than hip fracture account for 64–67% of fractures in women and men (respectively) aged 85–89 years [67].

Osteoporosis Self-Assessment Tools

Prior to the advent of FRAX, other risk assessment tools were available to identify women with low BMD and/or to estimate the risk of fracture. Most of the tools were based on fewer clinical risk factors and aimed at predicting low BMD. These include age, body size, no estrogen (ABONE), the osteoporosis risk assessment instrument (ORAI), the osteoporosis self-assessment tool (OST) equation, the simple calculated osteoporosis risk estimation tool (SCORE), and the study of osteoporotic fractures (SOF)-based screening tool. The ABONE and ORAI risk assessment tools use information regarding age, weight, and estrogen use [11, 12]. The OST risk assessment tool uses information regarding weight and age [14]. The simple calculated osteoporosis risk estimation (SCORE) uses information about race, rheumatoid arthritis, history of minimal trauma fracture after age 45 years, age, estrogen therapy use, and weight [15], whereas the osteoporosis index of risk (OSIRIS) includes data on age, body weight, current hormone replacement therapy use, and history of previous low-impact fracture [17]. The SOF-based risk assessment tool uses information regarding first-degree relatives with hip fracture, weight, presence of dementia, corticosteroid use, seizure medication use, benzodiazepine use, previous fracture at/after age 50, use of menopausal hormone therapy, heart rate, height at age 25, age, race, walking for exercise, ability to rise from a chair without arms, and amount of time per day spent “on feet” [16]. The justification for such tools is primarily to identify women who are more likely to have low BMD and who could then undergo BMD measurement for a definitive assessment. All these tools have been developed in women and validated in independent cohorts, and the performance of the tools was similar to that seen in the development cohorts [14, 15, 67, 85, 86].
Table 7.4 shows a comparison of the clinical risk factors used to calculate the most common osteoporosis self-assessment tools. This is presented in comparison to the most common fracture risk assessment tool, FRAX. No studies have determined the effectiveness of the individual tools in selecting patients for therapy and thus improving fracture outcomes [87, 88].

Table 7.4 The clinical risk factors used to calculate the most common osteoporosis self-assessment risk in comparison to FRAX

Osteoporosis Self-Assessment Tool (OST)

The osteoporosis self-assessment tool (OST) is a predictive algorithm currently in use to predict the risk for osteoporosis [13]. It was first established by Koh et al. [13] using data from postmenopausal women in eight Asian countries. The screening algorithm is based only on age (years) and body weight (kg): OSTA score = (body weight − age) × 0.2, with three osteoporosis risk categories: low risk (>−1), moderate risk (−1 to −4), and high risk (<−4). It performed well in identifying women at risk of osteoporosis [14]. The performance of OST among Asian men was first assessed by Kung et al. [89], who found moderate performance in predicting osteoporosis. OST is known as OSTA (OST for Asians) when it is applied to Asian women. The establishment of OSTA only involved postmenopausal women and men from East and Southeast Asia. The OST was later validated in several studies in Asian and White populations and was compared to other risk indices in large samples of postmenopausal women [88, 90]. The results showed that the OST is an effective and efficient tool to help target high-risk women for DXA testing [14].
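As a worked illustration, the score and the three risk categories quoted above can be written in a few lines of Python. This is a minimal sketch: the function names are hypothetical, and no rounding or truncation is applied because the text does not specify any.

```python
def ost_score(weight_kg: float, age_years: float) -> float:
    """OSTA score = (body weight in kg - age in years) x 0.2."""
    return 0.2 * (weight_kg - age_years)


def ost_risk_category(score: float) -> str:
    """Map a score to the categories given in the text.

    low risk: score > -1; moderate risk: -1 to -4 inclusive; high risk: score < -4.
    """
    if score > -1:
        return "low"
    if score < -4:
        return "high"
    return "moderate"


if __name__ == "__main__":
    # Hypothetical example: a 55-year-old woman weighing 60 kg.
    score = ost_score(60, 55)
    print(score, ost_risk_category(score))
```

Because the score depends only on weight and age, it can be computed without any clinical history, which is what makes it usable as a self-assessment tool.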

Osteoporosis Index of Risk (OSIRIS)

OSIRIS is a simple index based on four easy-to-collect variables from postmenopausal women, which showed a high degree of accuracy and performed well for classifying the degree of risk of osteoporosis in western European women of Caucasian lineage. Three categories were arbitrarily created using OSIRIS, with cutoffs of +1 and −3. The low risk category (OSIRIS > +1) represented 41% of all women; only 7% of the women in this category had osteoporosis. The prevalence of osteoporosis was very high (66%) among the group at high risk (OSIRIS < −3, representing 15% of all women). The prevalence of osteoporosis was 39% in the intermediate risk group (−3 < OSIRIS < +1, 44% of all women). Based on this instrument, a strategy was proposed that would initiate treatment in women with very high risk, postpone BMD measurement in women with low risk, and limit BMD measurement to women with intermediate risk of osteoporosis. This would spare more than 55% of the densitometry bill compared with a mass screening scenario [17].
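The triage strategy and the resulting saving can be checked with a short calculation. The sketch below takes an already computed OSIRIS score as input, since the scoring formula itself is not given here. The names are hypothetical, and assigning scores of exactly +1 or −3 to the intermediate group is an assumption, because the text leaves those boundaries open.

```python
def osiris_category(score: float) -> str:
    """Classify an OSIRIS score using the cutoffs +1 and -3."""
    if score > 1:
        return "low"
    if score < -3:
        return "high"
    return "intermediate"  # boundary values placed here by assumption


# Action proposed for each category in the strategy described above.
STRATEGY = {
    "low": "postpone BMD measurement",
    "intermediate": "measure BMD",
    "high": "initiate treatment",
}

# Share of all women in each category, as reported in the text.
SHARE = {"low": 0.41, "intermediate": 0.44, "high": 0.15}

# Only the intermediate group is sent for densitometry, so the low and
# high risk groups together represent the scans that are spared.
spared_fraction = SHARE["low"] + SHARE["high"]

if __name__ == "__main__":
    print(STRATEGY[osiris_category(0)])
    print(f"Share of scans spared: {spared_fraction:.0%}")
```

The low and high risk groups together make up 56% of women, which matches the statement that more than 55% of the densitometry bill would be spared.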

Performance of Fracture Risk Model

The performance of a predictive model is commonly assessed by 2 metrics: discrimination and calibration. Discrimination is the capability of a model to separate individuals who will sustain a fracture along a continuum from those who will not. The primary metric of discrimination is the area under the receiver operating characteristic curve (AUC) which evaluates the compromise between sensitivity and specificity and is thus a global estimate of prognostic accuracy. Calibration assesses the agreement between observed and predicted risk of fracture over the range of predicted probabilities.
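To make the two metrics concrete, the sketch below computes them on a small made-up dataset. The data and function names are hypothetical. The AUC is computed directly from its definition as the probability that a randomly chosen person who fractured was assigned a higher predicted risk than a randomly chosen person who did not (ties count as one half). Calibration is summarised here by a single observed-to-expected ratio, which is only the coarsest check; a fuller assessment would compare observed and predicted risk within bands of predicted probability.

```python
from typing import Sequence


def auc(predicted: Sequence[float], outcome: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    cases = [p for p, y in zip(predicted, outcome) if y == 1]
    non_cases = [p for p, y in zip(predicted, outcome) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_risk in cases:
        for non_case_risk in non_cases:
            if case_risk > non_case_risk:
                concordant += 1.0
            elif case_risk == non_case_risk:
                concordant += 0.5  # ties count as half
    return concordant / (len(cases) * len(non_cases))


def observed_to_expected(predicted: Sequence[float], outcome: Sequence[int]) -> float:
    """Observed number of events divided by the sum of predicted risks.

    A value above 1 means the model underestimates risk on average;
    a value below 1 means it overestimates.
    """
    return sum(outcome) / sum(predicted)


# Hypothetical toy data: predicted 10-year risk and observed fracture (1 = yes).
risk = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60]
fractured = [0, 0, 1, 0, 1, 1]

print(auc(risk, fractured), observed_to_expected(risk, fractured))
```

A model can rank individuals well (high AUC) and still be poorly calibrated, which is why the two metrics are reported separately.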

Over the past 10 years, several independent studies have examined the prognostic performance of the Garvan model [9, 91,92,93], FRAX [94,95,96,97,98,99], or both Garvan and FRAX [9, 100]. In general, discrimination was better for hip fracture than for total fractures. In predicting hip fracture risk, the median AUC value for Garvan was 0.80, which was comparable to that of FRAX (AUC, 0.78). In predicting major fracture risk, the median AUC values for Garvan and FRAX were 0.76 and 0.69, respectively [104]. However, it should be noted that, as a rule, the AUC value for an outcome with low frequency (e.g., fewer than 100 events), such as hip fracture, is often overoptimistic [105]. Discrimination of fracture appeared to be lower in men than in women [106]. In certain populations [91, 93, 100], the Garvan model performed well in the discrimination of fracture, particularly in men [103]. For instance, in the Canadian Multicenter Osteoporosis Study, the Garvan model yielded good discrimination, particularly for hip fracture (AUC 0.80 for women and 0.85 for men) [91]. In a recent systematic review, the average AUC for total fracture by FRAX and Garvan was 0.67 (95% confidence interval, 0.64–0.71) and 0.70 (95% CI, 0.64–0.75), respectively [107].

While the discriminatory ability of FRAX and Garvan was comparable, their calibration was very different. Most studies have consistently shown that FRAX tended to underestimate the risk of fracture [100, 101, 103, 108], particularly in diabetic patients [109]. Several studies have indicated that the Garvan model had very good calibration. A validation study of 1422 postmenopausal women living in New Zealand found that the Garvan-predicted fracture risk was in 99% agreement with the observed number of fractures; however, the Garvan model tended to overestimate the risk of fracture among individuals in the top quartile of fracture risk, which was also noted in the initial development study [100]. In the CaMos cohort, the Garvan model also showed remarkable agreement between the predicted 10-year probability of fracture and the observed 10-year risk of fracture [91].

The concordance in the predicted probabilities of fracture between Garvan and FRAX was modest, with a correlation coefficient of 0.67 [110]. One reason for the discordance is that the Garvan model takes falls into account in the risk estimation, whereas the FRAX model does not [49]. A validation study in 2012 postmenopausal women of Polish background found a considerable discrepancy in risk estimates between the Garvan and FRAX models, with the Garvan model predicting fracture more accurately than FRAX [45]. Although the predicted risk of fracture differs between Garvan and FRAX, most of the differences do not seem to affect treatment recommendations [111].

The discordance between Garvan and FRAX is expected, because the two models use different profiles of risk factors. In essence, the estimated risk is a conditional probability that depends on the risk factors and their statistical weights. The estimated weight associated with each risk factor depends on the statistical method used to model the relationship between the risk factor and fracture. The weights associated with the 5 risk factors in the Garvan model were derived from a multivariable Cox proportional hazards analysis [29], whereas the method of derivation of the FRAX model is not known [112]. Thus, an individual can have different predicted risks of fracture depending on which factors are considered in the prediction [113]. It is also important to appreciate that the predicted risk is actually an average, a kind of “wisdom of the crowd” [114], with “true” values fluctuating below or above the typical value. Therefore, an individual does not necessarily have a unique risk value. This subtle fact also explains why different valid predictive models can yield substantially different results for an individual.
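The dependence of predicted risk on the chosen risk factors can be illustrated with the general form of a Cox model, in which the risk by time t is 1 − S0(t)^exp(Σ βᵢxᵢ), where S0(t) is the baseline survival and the βᵢ are the weights. The sketch below is purely illustrative: the baseline survival, the coefficients, the factor names, and the example person are all invented, and they are not the published Garvan or FRAX values. In practice two models would also differ in their baseline survival, which is held fixed here for simplicity.

```python
import math
from typing import Dict


def cox_risk(baseline_survival: float,
             coefficients: Dict[str, float],
             profile: Dict[str, float]) -> float:
    """Risk by time t under a Cox model: 1 - S0(t) ** exp(sum(beta_i * x_i))."""
    linear_predictor = sum(beta * profile[name] for name, beta in coefficients.items())
    return 1.0 - baseline_survival ** math.exp(linear_predictor)


# Hypothetical values for illustration only (NOT the Garvan or FRAX weights).
S0_10_YEARS = 0.90
MODEL_WITHOUT_FALLS = {"age_decades_over_60": 0.4, "prior_fracture": 0.6}
MODEL_WITH_FALLS = {"age_decades_over_60": 0.4, "prior_fracture": 0.6,
                    "falls_last_year": 0.3}

# Hypothetical person: aged 75, one prior fracture, two falls in the past year.
person = {"age_decades_over_60": 1.5, "prior_fracture": 1, "falls_last_year": 2}

risk_without_falls = cox_risk(S0_10_YEARS, MODEL_WITHOUT_FALLS, person)
risk_with_falls = cox_risk(S0_10_YEARS, MODEL_WITH_FALLS, person)
print(round(risk_without_falls, 3), round(risk_with_falls, 3))
```

The same person receives two different estimates simply because one model includes falls and the other does not, which mirrors the discordance described above.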

Is the predicted fracture risk concordant with clinical guidelines? In a validation study of 801 men who were followed up for 10 years, Pluskiewicz et al. [103] found that the Garvan-predicted risk of fracture was more concordant with treatment indication than the FRAX-predicted risk. For instance, among 218 men with a prior fracture (i.e., indicated for treatment), 82% had a Garvan-predicted risk ≥20%, whereas only 8% had a FRAX-predicted risk ≥20%. Similarly, among men with osteoporosis (i.e., indicated for treatment), the proportion with ≥20% predicted risk by Garvan and FRAX was 72% and 10%, respectively [103]. Thus, it appears that the threshold of 20% predicted risk for defining “high risk” is reasonably consistent with current clinical guidelines.

However, it remains unknown whether treating patients with high risk as defined by the current predictive models will reduce their risk of future fracture. Virtually all RCTs evaluating antifracture efficacy selected patients based on low BMD (i.e., osteoporosis) and/or the presence of a pre-existing fracture, and among these patients, pharmacological interventions have shown good efficacy [10]. As no clinical trials have been performed on individuals with high risk of fracture based on either FRAX or Garvan, it is not known whether these patients would benefit from pharmacological treatment. Nevertheless, post hoc analyses of RCTs appear to suggest that those with high risk of fracture at baseline (as assessed by FRAX) had a slightly greater relative risk reduction of fracture with denosumab [115] and bazedoxifene [116], but not with strontium ranelate [117] or raloxifene [118]. In another post hoc analysis [67], among women in the top 25% of fracture probability (average probability of 24%), clodronate treatment reduced the risk of fracture by 23% over 3 years; among those in the top 10% (average fracture probability of 30%), treatment reduced the fracture risk by 31% [119]. Taken together, these results seem to be consistent with the hypothesis that treatment of individuals at high risk or moderate risk identified by predictive models could reasonably be expected to reduce fractures.

The Concept of Very High Fracture Risk

In 2020, the dichotomisation of high risk into high- and very-high-risk categories was published by the IOF and the ESCEO [120]. This was based on the assessment of the 10-year probability of a major osteoporotic fracture (clinical spine, hip, forearm, or humerus). Women with fracture probabilities below the lower assessment threshold can be considered at low risk. Women with probabilities above the upper assessment threshold can be considered for treatment. Women with probabilities between the upper and lower assessment thresholds should be referred for BMD measurement and have their fracture probability reassessed. The subgroup eligible for treatment was then stratified into high and very high fracture risk categories.

This new concept of high fracture risk was driven by data emerging from trials of the recently approved romosozumab and abaloparatide, as well as established medications such as teriparatide. In contrast to antiresorptive therapies, anabolic agents demonstrated more rapid and greater fracture risk reductions [121,122,123]. Such a strategy of tailoring medical management to the patient’s needs represents a revolution in the management of osteoporosis, particularly for those at very high fracture risk. So, while the current guidelines for the management of postmenopausal women at high fracture risk advise starting with antiresorptive therapy (mostly oral bisphosphonates) [124,125,126], according to the recent recommendations, it would be more suitable for postmenopausal women at very high fracture risk to start treatment with anabolic therapy followed by an antiresorptive agent [123, 127,128,129].

Thresholds for Intervention

Critically, none of the fracture risk assessment tools currently available directly yield an indication for treatment. Thus, the probability or risk generated needs to be interpreted, and thresholds set, above which pharmaceutical intervention is judged to be warranted. The cost-effectiveness of a therapeutic approach is often a key consideration in threshold setting.

There are two major approaches to the health economic assessment in a particular condition [130, 131]. First, one can assess the cost-effectiveness of the intervention, and set the threshold for intervention, for example FRAX probability, accordingly. Alternatively, one can derive a clinically informed and appropriate intervention threshold and use cost-effectiveness analysis to validate a threshold. The 2017 National Institute for Health and Care Excellence (NICE) updated Multiple Technology Appraisal (MTA) on bisphosphonate use in osteoporosis [132] serves as an example of how, for a common disorder, the strict application of cost-effectiveness thresholds for relatively inexpensive drugs may lead to counterintuitive and potentially harmful guidance (Fig. 7.4) [130, 133]. The widespread availability of low-cost generic forms of the main oral and intravenous bisphosphonates resulted in oral treatments being deemed cost-effective above a 1% risk of major osteoporotic fracture. Unfortunately, these were initially interpreted by some payers as clinical intervention thresholds, but, in fact, NICE directs practitioners to the UK National Osteoporosis Guideline Group (NOGG) guidance, which provides an illustration of the alternative approach to threshold setting. NOGG developed its guidance on the basis of clinical appropriateness, setting the threshold at the age-specific 10-year FRAX probability of fracture equivalent to women having already sustained a fracture. This approach, which avoids inappropriate overtreatment of older individuals and undertreatment of younger individuals, has been shown to be cost-effective [134] and has been adopted in many countries [135].

Fig. 7.4

Assessment and treatment thresholds without (left) or with (right) BMD test to compute fracture probability for men and women. (Adapted from: Kanis et al. [134]. © The International Osteoporosis Foundation and National Osteoporosis Foundation, reprinted with permission)

The approach to threshold setting varies substantially across the world, with guidelines using either fixed or variable age-dependent threshold, and, sometimes, combining a probability threshold with the requirement for BMD in the osteoporotic range [136]. Even between the USA and UK guidance, there is marked heterogeneity. The National Osteoporosis Foundation in the USA suggests BMD assessment in women and men aged ≥65 years or 70 years, respectively, or at younger ages if they have had a prior fracture, and treatment for those with either a history of vertebral or hip fracture, osteoporosis on BMD assessment, or osteopenia and a 10-year FRAX-calculated probability of a hip fracture ≥3% or major osteoporotic fracture ≥20% [137]. Conversely, as mentioned above, the UK National Osteoporosis Guideline Group (NOGG) recommends the use of FRAX with or without BMD as the first step in risk assessment, with prior fragility fractures at older ages usually a sufficient basis for treatment regardless of other risk factors (Fig. 7.5). Where a 10-year probability has been generated by FRAX, threshold graphs are subsequently used to guide appropriate intervention. The possible outcomes include patient reassurance with further risk calculation at a later date (low risk), BMD assessment (intermediate risk), or immediate treatment without the need for BMD assessment (high risk) [138]. Once BMD has been performed, the 10-year probability of fracture is plotted by age, either above or below a single treatment threshold, which is set at the 10-year fracture probability conferred by having had a previous fragility fracture, corresponding to older UK national guidance. The treatment threshold, thus, increases with age, but even so, the proportion of women potentially eligible for treatment rises from 20 to 40% across the age range assessed (Fig. 7.6). A key message is that it should not be assumed that one size will fit all countries. 
For example, intervention in China at a threshold of 20% for FRAX major osteoporotic fracture, a threshold used in the USA, would result in only a very small proportion of the population being treated [136]. Accordingly, the International Osteoporosis Foundation has published guidance relating to osteoporosis and corticosteroid-induced osteoporosis, which can be readily modified to reflect national priorities and subsequent treatment thresholds [139,140,141,142,143].
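The US treatment criteria summarised above form a simple decision rule, sketched below. The function and parameter names are hypothetical. The T-score cutoffs are the conventional WHO definitions (osteoporosis at −2.5 or below, osteopenia between −2.5 and −1.0), which are an assumption here because the paragraph names the categories without giving the cutoffs. The sketch is an illustration of the logic, not a clinical tool.

```python
def nof_treatment_indicated(prior_hip_or_vertebral_fracture: bool,
                            t_score: float,
                            frax_hip_pct: float,
                            frax_major_pct: float) -> bool:
    """Apply the treatment criteria as summarised in the text.

    Assumption: WHO T-score bands (osteoporosis <= -2.5;
    osteopenia strictly between -2.5 and -1.0).
    """
    if prior_hip_or_vertebral_fracture:
        return True
    if t_score <= -2.5:  # osteoporosis on BMD assessment
        return True
    osteopenia = -2.5 < t_score < -1.0
    # Osteopenia plus a FRAX probability at or above either fixed threshold.
    return osteopenia and (frax_hip_pct >= 3.0 or frax_major_pct >= 20.0)


# Hypothetical example: osteopenic woman, hip probability 3.5%, major 12%.
print(nof_treatment_indicated(False, -2.0, 3.5, 12.0))
```

The fixed 3% and 20% cutoffs are what distinguish this rule from the age-dependent NOGG approach, and they are the reason the same rule would select very few people in a population with lower baseline fracture probabilities.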

Fig. 7.5

NOGG osteoporosis recommendation for management

Fig. 7.6

Graphical representation of our proposed age-stratified FRAX hip and major osteoporotic fracture risk score thresholds. (Quoted under open access scheme from: Inderjeeth and Raymond [182])

Closing the Gap: Intervention Thresholds of Very High vs High Fracture Risk

Two approaches have been published describing how to identify the high and very high fracture risk categories; these are as follows.

National Osteoporosis Guideline Group (NOGG)

NOGG developed age-dependent assessment thresholds for the UK. The intervention threshold is set at a risk equivalent to that associated with a prior fracture. Two bounds around the intervention threshold have been identified, within which the assessment of BMD will help to determine whether an individual close to the threshold either exceeds that bound or lies below the intervention threshold. These are called the assessment thresholds for BMD testing. The lower assessment threshold was set to rule out the requirement for BMD testing among women without any clinical risk factors [144, 145]. The upper assessment threshold was set at 1.2 times the intervention threshold [146]. Very high risk is identified as the risk lying above the upper assessment threshold, whereas high risk lies between the intervention threshold and the upper assessment threshold. On the other hand, low risk is reported when the risk lies below the intervention threshold. The assessment thresholds are illustrated in Fig. 7.7 [147].
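The categorisation described above can be written as a short rule. In this sketch the function name is hypothetical, the age-specific intervention threshold is supplied by the caller because the threshold values themselves are not tabulated in this chapter, and treating a probability exactly equal to the intervention threshold as high risk is an assumption.

```python
def nogg_risk_category(frax_major_pct: float,
                       intervention_threshold_pct: float) -> str:
    """Classify a 10-year FRAX probability against age-specific thresholds.

    The upper assessment threshold is 1.2 times the intervention threshold,
    as stated in the text. Assumption: a probability exactly at the
    intervention threshold counts as high risk.
    """
    upper_assessment_threshold = 1.2 * intervention_threshold_pct
    if frax_major_pct > upper_assessment_threshold:
        return "very high"
    if frax_major_pct >= intervention_threshold_pct:
        return "high"
    return "low"


# Hypothetical example: intervention threshold of 20% at a given age,
# so the upper assessment threshold is 24%.
for probability in (10.0, 22.0, 30.0):
    print(probability, nogg_risk_category(probability, 20.0))
```

Because the intervention threshold rises with age, the same FRAX probability can fall into different categories for women of different ages.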

Fig. 7.7

Infographic outlining the four osteoporosis risk categories. Initial risk assessment relies on using FRAX with clinical risk factors alone. Two intervention thresholds are identified. A FRAX probability in the red zone, above intervention threshold 2, indicates very high risk. For this group of people, an initial course of anabolic therapy followed by antiresorptive treatment may be appropriate. A FRAX probability in the green zone suggests low risk, with advice to be given regarding lifestyle, calcium, and vitamin D supplementation. A FRAX probability in the intermediate (yellow) zone should be followed by BMD assessment and recalculation of FRAX probability including femoral neck BMD. After recalculation, a risk in the red zone above intervention threshold 2 indicates very high fracture risk, whereas a risk between intervention threshold 1 and intervention threshold 2 indicates high risk, which suggests initial antiresorptive therapy. A risk below intervention threshold 1 indicates low risk (management would be similar to that for the green zone). Patients with a prior fragility fracture are designated either at high risk or possibly at very high risk, depending on the FRAX probability. (Amended from figure 7.1 published in: Kanis et al. [120]; quoted under open access scheme)

European Society of Endocrinology

In 2019, the European Society of Endocrinology published its algorithm for the management of postmenopausal osteoporosis [148]. The algorithm was based on the proposal that a determination of fracture risk would include measurement of lumbar spine and hip BMD and insertion of the total hip or femoral neck BMD value into the FRAX tool. Using the FRAX algorithm, four risk categories were identified: “low risk” includes no prior hip or spine fractures, a BMD T-score at the hip and spine both above −1.0, and 10-year hip fracture risk <3% and 10-year risk of major osteoporotic fractures <20%; “moderate risk” includes no prior hip or spine fractures, a BMD T-score at the hip and spine both above −2.5, or 10-year hip fracture risk <3% or risk of major osteoporotic fractures <20%; “high risk” includes a prior spine or hip fracture, or a BMD T-score at the hip or spine of −2.5 or below, or 10-year hip fracture risk ≥3%, or risk of major osteoporotic fracture ≥20%; and “very high risk” includes multiple spine fractures and a BMD T-score at the hip or spine of −2.5 or below (Table 7.5).

Table 7.5 Characteristics of the four osteoporosis risk categories identified according to the European Society of Endocrinology

Fracture Risk Assessment Tools: Room for Improvement

From the point of view of predictive accuracy, all current models for fracture risk assessment are suboptimal. Indeed, the average AUC value for total fracture prediction by FRAX and Garvan was only ~0.7 [150] which may be considered “adequate.” The challenge is to find ways to improve the accuracy of fracture prediction. Table 7.6 shows a summary of the potential options to improve predictive accuracy of fracture risk assessment. In their article, Liu et al. [143] postulated that the accuracy can be improved by incorporating new markers for fracture risk and by adopting new modelling strategies.

Table 7.6 Towards a new concept of fracture prediction—potential options to improve predictive accuracy of fracture risk assessment

Genetic Profiling

It is well known that the risk of fragility fracture is partly influenced by genetic factors. Almost half of the variance in fracture susceptibility among individuals is due to hereditary factors [151]. Over the past 20 years or so, several large-scale collaborative studies [69] have revealed 62 loci that are associated with BMD; among the 62 single nucleotide polymorphisms (SNPs) identified, 8 were associated with fracture risk at the genome-wide significance level [152]. A common characteristic of these SNPs is that their effect sizes were modest, with odds ratios ranging between 1.1 and 1.4, suggesting that individually they have limited utility for fracture prediction. Nevertheless, genetic profiling may help improve the accuracy of fracture prediction. A simulation study showed that a genetic profile of up to 50 genetic variants, each having a modest effect size (odds ratio, 1.01–1.35), could improve the accuracy of fracture prediction by 10 percentage points of AUC [153,154,155]. A recent study revealed that the incorporation of an “osteogenomic profile” of 62 BMD-associated SNPs into the existing Garvan fracture risk calculator could modestly improve the predictive accuracy of fracture [156], and this finding was consistent with a previous observation from the MrOS study [157]. Taken together, these latest results suggest that genetic profiling could help improve the accuracy of fracture prediction over and above that of clinical risk factors.

Trabecular Bone Score

Trabecular bone score (TBS) is a measure of the distributional trabecular architecture [158]. TBS is derived as a texture parameter that reflects pixel grey level variation in dual-energy X-ray absorptiometry images. Previous studies have reported that TBS is significantly correlated with trabecular number, trabecular separation, and structure model index [159]. Moreover, TBS was found to be associated with fracture risk in elderly women and diabetic patients [160] independently of BMD and classical clinical risk factors [161]. A recent meta-analysis found that TBS was a FRAX-independent predictor of fracture risk [162], suggesting that TBS could improve the discriminatory power of fracture risk assessment for an individual.

Bone Turnover Markers

Several cross-sectional and longitudinal studies have observed that fragility fractures occur not only because of low BMD but also as a result of rapid bone turnover that leads to adverse architectural changes. There is accumulating evidence that accelerated bone resorption is a risk factor for fracture, independent of BMD and other clinical risk factors [163]. For instance, increased urinary levels of the pyridinium crosslink deoxypyridinoline (DPD) were associated with a two- to threefold increase in the risk of hip fracture [164]. Increased urinary type I collagen C-telopeptide (CTX) and free DPD levels were associated with a twofold increase in hip fracture risk after adjusting for BMD and physical mobility [165]. In men, increased bone resorption was also associated with increased fracture risk [166]. A meta-analysis of longitudinal studies found that increased serum levels of aminoterminal propeptide of type I collagen and CTX were modestly associated with an increase in fracture risk in men and women [167]. These results strongly suggest that the incorporation of bone turnover markers into the existing prognostic models could improve the prediction of absolute fracture risk. However, the use of bone turnover markers for fracture risk assessment faces challenges in the standardization of measurements and the handling of intrasubject variability.

Fracture Type-Specific Prediction

Existing individualized risk assessment models were developed for predicting the risk of total (or major) fractures and hip fracture. The implicit assumption behind the development of these models is that all fracture types share common risk factors. However, this assumption is unlikely to be true, as a risk factor for one fracture type may not be associated with another fracture type. For instance, falls are a major risk factor for hip fracture but not for vertebral fracture. Therefore, future models should move away from the “one-size-fits-all” approach by focusing on specific fracture sites.

Artificial Intelligence

Most, if not all, existing models were developed under the assumption that there are no interactions between risk factors. However, this assumption may not be true, because complex interactions between risk factors are likely present but not detected by traditional statistical methods. In the presence of interactions or potential interactions, artificial intelligence methods such as artificial neural networks (ANNs) can be useful in the prediction of fracture. By imitating human brain functions, an ANN can model complex real-world relationships, including interacting variables. Recent studies have demonstrated that ANNs performed better than traditional statistical models in predicting vertebral fracture among postmenopausal women [168] and mortality following a hip fracture [169]. An earlier study [170] showed that, for hip fracture prediction, an ANN yielded a more accurate prediction than traditional statistical methods such as the logistic regression model. From a conceptual viewpoint, it is important to distinguish between prediction and association [171, 172]. Traditional statistical methods focus on association, which is mainly concerned with the identification of statistically significant predictors to explain the relationship between the predictors and an outcome for a group of individuals. On the other hand, prediction is concerned with the derivation of rules based on observed data for forecasting specific outcomes for an individual. Although a strong association can translate into a good prediction, they are not synonymous. Indeed, a statistically significant association in a group of individuals does not necessarily translate into good prediction for an individual [173]. A risk factor may achieve statistical significance (i.e., p < 0.05) with a large sample size even if it is a poor predictor of future outcome.
A risk factor or a set of risk factors may be statistically significantly associated with an outcome due to a larger effect on a small number of events in the population, yet provide poor prediction for individuals in the population [174]. Therefore, it has been proposed that future fracture risk assessment models should move beyond association analysis and adopt more prediction analyses [170]. Instead of finding factors that are associated with fracture, we should focus on the factors that have high predictive value for fracture risk. The factors that influence fracture risk are likely to be related, and their effects on fracture risk are likely to interact. Prediction analysis using a machine learning approach (e.g., ANN and deep learning) may be statistically less elegant, but it could help identify potentially highly predictive factors that are ignored by traditional association analysis [170, 171].
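The gap between statistical significance and predictive value can be shown with a small worked example. All counts below are invented for illustration: a yes/no risk factor raises the fracture rate from 10% to 11% in a cohort of 200,000 people. The sketch uses a standard two-proportion z-test for the association and the identity that the AUC of a single binary predictor equals the mean of its sensitivity and specificity.

```python
import math


def two_proportion_p_value(events_a: int, n_a: int, events_b: int, n_b: int) -> float:
    """Two-sided p-value from a pooled two-proportion z-test."""
    rate_a, rate_b = events_a / n_a, events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / standard_error
    return math.erfc(abs(z) / math.sqrt(2))


def auc_binary_factor(events_exposed: int, n_exposed: int,
                      events_unexposed: int, n_unexposed: int) -> float:
    """AUC of a single yes/no risk factor: (sensitivity + specificity) / 2."""
    events = events_exposed + events_unexposed
    non_events = (n_exposed - events_exposed) + (n_unexposed - events_unexposed)
    sensitivity = events_exposed / events
    specificity = (n_unexposed - events_unexposed) / non_events
    return (sensitivity + specificity) / 2


# Hypothetical cohort: 100,000 exposed (11% fracture), 100,000 unexposed (10%).
p_value = two_proportion_p_value(11_000, 100_000, 10_000, 100_000)
auc_value = auc_binary_factor(11_000, 100_000, 10_000, 100_000)
print(p_value < 0.05, round(auc_value, 3))
```

In this made-up cohort the association is highly significant, yet the factor on its own discriminates only marginally better than chance (an AUC close to 0.5), which is the distinction between association and prediction drawn above.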

Time-Variant Predictions

All risk factors change with time, and the rates of change are highly variable between individuals. For example, BMD in the elderly declines with advancing age, and the rates of decline vary substantially among individuals [175]. However, all existing predictive models assume that risk factors are constant over time. Of course, this assumption is not realistic, but it is a convenient starting point for building a predictive model. Therefore, future model development should take the time-varying nature of risk factors into account to achieve a better estimate of risk for an individual.

Ethnic-Specific Models

It is important to keep in mind that all existing predictive models (e.g., FRAX, Garvan, and Qfracture) were developed from data pertaining to North American and European populations, not Asian or African populations. These models have also been largely validated in Caucasian populations, and their performance in Asian populations is not well documented. Nevertheless, a few studies have attempted to assess the utility of FRAX in the prediction of fracture in Asian individuals. In a validation analysis based on the Hong Kong Osteoporosis Study (266 postmenopausal women), the AUC of the FRAX model for predicting total fracture was ~0.73, which is not substantially different from that of the model with BMD alone (AUC, 0.71) [176]. In a study carried out by Chen et al. [177] on 198 Chinese individuals with very recent fracture, the average FRAX-predicted fracture risk was 6.6%, and only 2 individuals (1%) had a 10-year risk ≥20%, suggesting poor calibration. In a Japanese population, the FRAX model had moderate discrimination for self-reported total fracture (AUC, 0.69), which is similar to that of the model with age and femoral neck BMD (AUC, 0.69) [141]. In an analysis of 405 postmenopausal women and 139 men with fractures, Min et al. [178] observed a ~twofold difference in FRAX-predicted risk of fracture between the Korean FRAX model and the Japanese FRAX model, despite the fact that the two populations have similar background risk. Taken together, these results suggest that the FRAX model has modest prognostic performance in Asian populations. Thus, there is a strong need for the development of individualized fracture risk assessment models for Asian populations. This need arises because, at the population level, the incidence of fracture in Asians is generally lower than that in Caucasian populations [179], and the distribution of behavioral risk factors for fracture is expected to differ between Asian and Caucasian populations.

On another front, the prevalence of cigarette smoking in Asian women is lower than that in Caucasian women, but Asian men are more likely to smoke than Caucasian men [180], and these ethnic-related differences need to be methodologically weighed in the estimation of fracture risk for an individual. It would be unrealistic to assume that Asian men and women share exactly the same risk factor profile as Caucasian populations; it is even more unrealistic to assume that the magnitude of association between smoking and fracture in Caucasian women is the same as in Asian women. Experience in the field of cardiovascular disease shows that Caucasian-based models (e.g., Framingham risk score and QRISK2) did not perform well in Asian populations [181]. International prospective population-based studies are urgently needed for the development and validation of new fracture risk assessment models for Asian populations.

Any statistical model is an imperfect representation of reality. Model development is a struggle between complexity and simplicity. Overly complex models with too many factors may yield better accuracy, but they are of little practical use because they are hard to implement in practice. On the other hand, overly simple models can miss high-risk individuals. Nevertheless, given the current modest calibration and discrimination of simple models, the addition of highly predictive factors to the existing models is likely to help improve the accuracy of prediction without increasing the burden of complexity.

In conclusion, over the past 10 years, a number of individualized risk assessment models have been developed and implemented in clinical settings. The advance of these models represents a significant achievement of translational osteoporosis research. The FRAX tool is the most widely established tool, used worldwide to calculate the 10-year probability of fracture. This can aid discussion with patients and help in decisions regarding treatment for osteoporosis and fracture prevention. The ultimate goal of a risk assessment model is to provide clinicians and patients with accurate and reproducible risk estimates that help guide clinical decisions. Current fracture risk assessment models have contributed substantially to the management of osteoporotic patients over the past decade. Still, much remains to be done to enhance the discrimination and calibration of existing models, as well as to develop new models that can help maximize benefits and preclude the potential problems of overmedicalization and false assurance.