Introduction

Osteoporosis is characterised by low bone mass and micro-architectural deterioration of bone tissue leading to increased risk of fractures [1]. Since 1995, the measurement of bone mineral density (BMD), defined as bone mass per projected bone area (grams per square centimetre), has been used as a guide to assessing those most at risk of suffering from osteoporotic fractures and needing treatment [2, 3]. Those having BMD values of 2.5 standard deviations (T-score) below the mean BMD value expected for young normal healthy subjects (usually 20–30 years) are defined as being osteoporotic. This definition has since been criticised for failing to recognise the biomechanical properties of bone, such as bone size, shape and its micro-architecture, that are also associated with bone fragility but cannot be measured by BMD [4, 5]. Thus, the measurement of BMD alone may not be sufficiently predictive for determining who is most likely to suffer from an osteoporotic fracture. In addition, accessibility to bone densitometry is limited in many clinical settings [6].
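As an illustration of the T-score definition above, the following minimal sketch converts a BMD value into a T-score and applies the 2.5 SD threshold. The young-normal mean and SD used here are hypothetical placeholders, not reference data from this study.

```python
def t_score(bmd, young_mean, young_sd):
    """T-score: distance of a BMD value from the young-normal mean, in SD units."""
    return (bmd - young_mean) / young_sd


def is_osteoporotic(t, threshold=-2.5):
    """True when the T-score lies 2.5 SD or more below the young-normal mean."""
    return t <= threshold


# Hypothetical young-normal reference values in g/cm^2, for illustration only.
YOUNG_MEAN = 0.86
YOUNG_SD = 0.12

t = t_score(0.55, YOUNG_MEAN, YOUNG_SD)
print(round(t, 2), is_osteoporotic(t))
```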

Epidemiological studies have now identified clinical risk indicators associated with incident fracture, and these might be independently useful for predicting future risk of an osteoporotic fracture. Of these, a previous fracture has been noted as a strong predictor of future fracture [7]. In particular, vertebral fracture stands out as important in predicting future fractures since it can occur comparatively early in life yet generally remains undetected without a spine X-ray. A vertebral fracture also predicts a subsequent vertebral fracture, which occurs in nearly 10% of cases within a year [8], as well as the location [9] and severity [10] of the next vertebral fracture.

The objective of this analysis was two-fold: first, to develop an algorithm to help determine who is at greatest risk of suffering from a new vertebral fracture using easily measurable clinical risk factors (the clinical algorithm) and, secondly, to evaluate whether baseline vertebral X-rays and/or bone densitometry would significantly improve the predictive ability of this clinical algorithm. A successful algorithm would provide medical practitioners with a rational criterion for deciding whom to treat and when to treat.

Subjects and methods

Subjects

Data was analysed from subjects participating in the European Prospective Osteoporosis Study (EPOS), a follow-up study to the European Vertebral Osteoporosis Study (EVOS). At baseline, 17,342 men and women were recruited from 36 centres across Europe (19 countries) into the EVOS study, and 7222 participants from 31 centres completed the follow-up prospective study (EPOS) that was designed to ascertain the occurrence of incident vertebral and limb fractures.

Ascertainment of vertebral fracture

At baseline, 15,570 subjects agreed to lateral vertebral radiographs (T4–L4) and answered a risk factor questionnaire. The radiographs were taken with subjects lying in the left lateral position, and the breathing technique used was one which allowed for blurring of underlying ribs for thoracic films. Repeat lateral radiographs were taken for 7222 subjects at a mean of 3.8 years later using the same protocol. Baseline and repeat radiographs were analysed separately in Berlin as described previously [11]. For this analysis we used the standardised clinical definitions of prevalent and incident vertebral fractures as applied by a small team of experienced radiologists, since this approach models the usual clinical paradigm [11, 12] and showed quite good agreement with vertebral morphometry [11]. Vertebral fractures were categorised by the shapes assumed by the fractured vertebrae and by vertebral level (13 vertebrae from T4 through L4). The four shapes were: wedge, concave (one endplate fractured), biconcave (both endplates fractured) and crushed.

Questionnaire data

The EVOS questionnaire administered at baseline contained questions on demographics, medical history, fracture history, gynaecological information, physical activity and lifestyle variables. Participants were asked first about their height at the age of 25 years and minimum weight after the age of 25 years, then their current height and weight were measured and recorded on the questionnaire. Height loss was calculated as: reported height at age 25 years minus measured height at study entry. To assess fracture history, participants were asked if they had ever suffered from a broken bone and to give details on which bone, age at first fracture and level of trauma experienced. The choices of fracture location given were vertebral, hip, rib, forearm and other. Trauma level was divided into spontaneous, minor and major trauma. In all, 31 of the 36 originally participating centres agreed to follow up the subjects by questionnaire. Each subject was sent an annual questionnaire (every 6 months for the German centres) asking about fractures and falls in the intervening period [13]. Self-reports of fractures were validated where possible by local review of radiographs, medical records or subject interview. The reproducibility and validity of the questionnaires have been tested and found to be good [13, 14]. For example, of 563 subjects who reported a history of a non-spine fracture on the questionnaire, only 11% were not confirmed upon review of medical records (false positives) while in another 174 individuals with confirmed fracture, only 12% did not recall sustaining a fracture from the questionnaire responses (false negatives). Furthermore, of those who reported a date of fracture on the questionnaire, 91% were found to be correct to within 1 month of the actual date of the fracture.

Exclusions

Causes of vertebral deformity other than fracture – for example, osteoarthritis, congenital malformations etc. – were identified using conventional radiological criteria. There were 494 participants diagnosed with such other medical conditions who were excluded, leaving 6728 men and women eligible for inclusion in the analysis. Due to incomplete questionnaires and lack of follow-up data on incident peripheral fractures in two centres, only 81% of the eligible participants furnished all of the information required for this analysis.

Bone mineral density

Twenty-one of the EVOS/EPOS centres were able to measure BMD at the hip and/or the spine at baseline or during follow up in sub-samples of between 20 and 100% of their available participants using dual energy X-ray absorptiometry (DXA). For the participants considered in this analysis, 2643 (39%) from 19 centres had hip BMD (femoral neck or trochanter) measurements and 2086 (31%) from 13 centres had spine BMD measurements.

Statistical analysis

Negative binomial regression, a technique useful for modelling overdispersed count data – meaning that the statistical mean was not equal to the variance [15] – was used to model the expected number of incident vertebral fractures as a function of the predictor variables (Appendix 1). These predictors were broadly classified as demographics (age, sex, height loss, weight); fracture history by questionnaire (any, vertebral, hip, rib, forearm, other); incident peripheral fracture (any upper or lower limb, upper limb, lower limb); BMD (femoral neck, trochanter, spine). Since there was a greater likelihood of observing an incident fracture in participants followed for longer periods, years of follow up was used as an exposure variable in order to adjust the risk estimates for varying follow-up times. The predictors were entered into the negative binomial regression model using a forward stepwise approach if the likelihood ratio test was significant at the 5% level. Because fewer than one-half of the participants had BMD data, the effects of demographic variables, fracture history and incident peripheral fracture were first assessed using the larger dataset to develop the clinical algorithm. Significant predictors from this model were then forced into a model with BMD to assess if measuring BMD would be of additional value. Similarly, the number of prevalent vertebral fractures on the baseline X-ray was added to the best model to determine if vertebral fracture data would significantly improve the prediction of future (incident) vertebral fracture. Interactions with sex were tested to determine if the effect of an explanatory variable was significantly different in men and women.
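For orientation, a negative binomial model with an exposure term is commonly written as follows. This is a sketch in the widely used NB2 parameterisation, which is an assumption here; the authors' own formulation is the one given in Appendix 1. The symbols $Y_i$ (number of incident vertebral fractures in participant $i$), $t_i$ (years of follow-up), $x_{ij}$ (predictor values), $\beta_j$ (log relative risks) and $\alpha$ (dispersion parameter) are introduced for illustration.

$$\log E(Y_i) = \log \mu_i = \log t_i + \beta_0 + \sum_j \beta_j x_{ij}, \qquad \operatorname{Var}(Y_i) = \mu_i + \alpha \mu_i^2, \qquad RR_j = e^{\beta_j}$$

With $\alpha > 0$ the variance exceeds the mean, which is the overdispersion the model is meant to accommodate; as $\alpha \to 0$ the model reduces to Poisson regression. Entering $\log t_i$ with a coefficient fixed at 1 is what makes follow-up time act as an exposure variable.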

Risk scores were calculated from each model by weighting each of the risk factors by its log relative risk (RR) plus the model-specific constant term. Thus, for an individual with known values of age (years), height loss in centimetres since age 25 years (htloss), weight (kilograms), sex and fracture history of the vertebrae (hvert), forearm (hfore) and lower limb (low1), the linear predictor risk score with questionnaire variables only, for example, was calculated as:

$$\begin{aligned}
Model1Score ={} & -7.338 + 0.056 \times age + 0.074 \times htloss - 0.016 \times weight \\
& + \begin{cases} 0.394 & \text{if } sex = Female \\ 0 & \text{if } sex = Male \end{cases}
+ \begin{cases} 1.110 & \text{if } hvert = Yes \\ 0 & \text{if } hvert = No \end{cases} \\
& + \begin{cases} 0.503 & \text{if } hfore = Yes \\ 0 & \text{if } hfore = No \end{cases}
+ \begin{cases} 0.790 & \text{if } low1 = Yes \\ 0 & \text{if } low1 = No \end{cases}
\end{aligned}$$
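The Model 1 score can be evaluated directly from the coefficients in the equation above. The sketch below is one possible transcription; the function and argument names are illustrative choices, and the example participant is hypothetical.

```python
import math

# Constant term and log relative risks, copied from the Model 1 equation.
INTERCEPT = -7.338
COEF = {
    "age": 0.056,      # per year
    "htloss": 0.074,   # per cm of height lost since age 25
    "weight": -0.016,  # per kg
    "female": 0.394,
    "hvert": 1.110,    # history of vertebral fracture
    "hfore": 0.503,    # history of forearm fracture
    "low1": 0.790,     # recent lower limb fracture
}


def model1_score(age, htloss, weight, female, hvert, hfore, low1):
    """Linear predictor (risk score) from questionnaire variables only."""
    return (INTERCEPT
            + COEF["age"] * age
            + COEF["htloss"] * htloss
            + COEF["weight"] * weight
            + COEF["female"] * bool(female)
            + COEF["hvert"] * bool(hvert)
            + COEF["hfore"] * bool(hfore)
            + COEF["low1"] * bool(low1))


# Hypothetical participant: woman aged 70, 4 cm height loss, 60 kg,
# with a reported vertebral fracture and no other fracture history.
score = model1_score(70, 4, 60, female=True, hvert=True, hfore=False, low1=False)
print(round(score, 3))
print(round(math.exp(COEF["hvert"]), 2))  # relative risk for reported vertebral fracture
```

Exponentiating a coefficient recovers the corresponding relative risk; for example, exp(1.110) is approximately 3.03.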

The risk score was then transformed to time-specific probabilities of suffering one or more (1+) and two or more (2+) new vertebral fractures in 1–5 years based on the negative binomial probability density function [15] (Appendix 1) after adding the natural logarithm of the exposure period (i.e. 1–5 years) to the risk score as another constant. These model-based predictions of absolute probabilities of incident vertebral fracture were used to draw receiver operating characteristic (ROC) curves to summarise overall predictive ability of the models.
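The transformation from risk score to probability can be sketched as follows, assuming the common NB2 form of the negative binomial distribution with mean mu and dispersion alpha. The dispersion value is not stated in this section, so `ALPHA` below is a hypothetical placeholder, as is the example score; the authors' exact formulae are those in Appendix 1.

```python
import math

ALPHA = 1.5  # hypothetical dispersion parameter, for illustration only


def expected_count(score, years):
    """Expected number of incident fractures: exp(score + ln(years))."""
    return math.exp(score + math.log(years))


def prob_at_least(k, mu, alpha):
    """P(Y >= k) for k = 1 or 2 under NB2 with mean mu and dispersion alpha."""
    if k not in (1, 2):
        raise ValueError("k must be 1 or 2")
    p0 = (1.0 + alpha * mu) ** (-1.0 / alpha)             # P(Y = 0)
    if k == 1:
        return 1.0 - p0
    p1 = mu * (1.0 + alpha * mu) ** (-1.0 / alpha - 1.0)  # P(Y = 1)
    return 1.0 - p0 - p1


score = -2.578  # illustrative risk score
for years in range(1, 6):
    mu = expected_count(score, years)
    print(years,
          round(prob_at_least(1, mu, ALPHA), 3),
          round(prob_at_least(2, mu, ALPHA), 3))
```

Because the exposure enters as ln(years), the expected count scales linearly with follow-up time, while the probabilities of 1+ and 2+ fractures rise more slowly and are bounded by 1.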

Results

Participant characteristics

Table 1 shows the baseline characteristics observed in the 6728 participants (3074 men and 3654 women) eligible for inclusion in the analysis, including the numbers of incident vertebral fractures observed. In total, 222 (3.3%) participants suffered one or more (1+) incident vertebral fracture(s) during the 3.8-year mean follow-up period; among these, 167 (75%) had one incident fracture, 31 (14%) had two incident fractures and the remaining 24 (11%) had three or more incident fractures.

Table 1 Participant characteristics

Determinants of incident vertebral fracture

Because of missing data, the number of participants contributing to the multivariate models was slightly smaller than shown in Table 1 (n=5561, with 185 incident fractures). Table 2 shows the relative risk estimates for variables that were significantly associated with incident vertebral fracture. In the first clinical model that included questionnaire variables only (Model 1), the risk of incident vertebral fracture significantly increased with increasing age, height loss since age 25 years and reported histories of prior vertebral fracture, forearm fracture and recent lower limb fracture. Female sex was associated with a greater risk of incident vertebral fracture, and higher body weight with decreased risk. There was no significant interaction with sex for any of the other predictive variables (p>0.526).

Table 2 Determinants of incident vertebral fracture – models without bone mineral density (BMD) as a variable (fx fracture, RR relative risk, 95% CI 95% confidence interval)

To assess the utility of questionnaire-based predictors of incident vertebral fracture in those with and without prevalent vertebral fracture, two further models were fitted to subgroups. Model 2 (Table 2) shows the relative risk estimates in those without prevalent vertebral fracture at baseline and Model 3 in those with 1+ prevalent vertebral fracture on the baseline X-ray. Only the effect of weight differed significantly between the two groups (those with and without baseline vertebral fractures) when assessed by a test of interaction (p=0.0003), being stronger in those with 1+ prevalent vertebral fracture than in those without (Table 2).

When baseline X-ray assessment of prevalent fractures was included as an additional determinant (Table 2, Model 4), a strong positive association was found between the number of prevalent vertebral fractures (from 0 to 3+) at baseline and incident vertebral fracture risk. Compared to participants without prevalent vertebral fracture, participants who had one or two prevalent vertebral fractures were approximately threefold more likely to suffer from an incident vertebral fracture, while those who had 3+ prevalent vertebral fractures were at a ninefold greater risk. However, after including the baseline X-ray assessment, the relative risks associated with the clinical questionnaire variables were slightly attenuated, except for sex, for which the relative risk associated with being female increased. The relative risk for a history of vertebral fracture from the questionnaire understandably showed a marked reduction from 3.03 to 1.15, since the X-ray results provided more accurate information than subject recall. In contrast, age, height loss, sex and forearm fracture history remained statistically significant (p<0.05) predictors of vertebral fracture, each being independent of the baseline X-ray assessment. There was no significant interaction found between sex and the number of prevalent fractures in determining the risk of incident vertebral fracture (p=0.59). Model 5 was confined to those with a prevalent vertebral fracture. In this sub-set, the number of prevalent vertebral fractures remained predictive of the risk of incident vertebral fracture, the risk being 2.86-fold greater in those with 3+ prevalent vertebral fractures than in those with one prevalent fracture, which is consistent with the results from Model 4.

Incident vertebral fracture prediction

Figure 1 shows the predicted absolute probabilities of 1+ and 2+ incident vertebral fractures from 1 to 5 years later as functions of the risk scores. For each range of risk scores, a minimum risk of zero was predicted at the lowest risk score, while the maximum risk depended in part on the length of exposure to risk and in part on the combination of risk indicators used in the modelling (derived from Models 1–5), as described under Methods. From Model 1 using questionnaire variables only and based on data from all 5561 subjects with 185 incident vertebral fractures, the 1-year predicted probability of 1+ incident vertebral fracture ranged from zero to 13%. This range was expanded to 0–29% upon including baseline X-ray assessment of prevalent vertebral fractures (Model 4). Similarly, the 5-year predicted probability range was expanded from 0–49% to 0–82%. When the risk ranges from the models based on subgroups with and without prevalent vertebral fractures were compared, the much higher predicted probabilities in those with baseline fractures were mainly attributable to the difference in the constant term, indicating a greater underlying risk in those with prior vertebral fractures that was not attributable to the other known risk factors.

Fig. 1 Predicted absolute probabilities of 1+ and 2+ incident vertebral fractures after 1 to 5 years related to risk scores combining clinical risk factors without and with baseline X-ray information on prevalent vertebral fractures in all subjects, subjects without prevalent vertebral fracture at baseline and subjects with 1+ prevalent vertebral fracture at baseline. The inclusion of X-ray data broadens the range of risk as described in the text (dashed lines)

Figure 2 shows ROC curves summarising the predictive ability of the models presented in Table 2. The area under the ROC curves (AUC) significantly improved (p<0.013) when baseline radiological assessment of prevalent vertebral fracture was added to Model 1. However, the greatest AUC improvement was seen in the identification of those with multiple incident vertebral fractures (9% improvement in AUC for 2+); this compared to a more modest 3% improvement for identifying 1+ incident vertebral fracture. At a specificity of 90%, the model including X-ray data improved the sensitivity for predicting 2+ and 1+ incident fractures by sixfold and fourfold, respectively, compared with random guesswork; at 75% specificity, the improvements were 3.2- and 2.4-fold respectively.

Fig. 2 ROC curves comparing the use of questionnaire variables only and questionnaire variables plus baseline X-ray assessment of prevalent fractures to discriminate participants with one or more (1+) and two or more (2+) incident vertebral fractures from those without incident vertebral fracture. The ROC curves were drawn for all subjects, subjects without prevalent vertebral fracture at baseline and subjects with 1+ prevalent vertebral fracture (1+ prevalent), respectively. The top row shows ROC curves for predicting one or more (1+) incident vertebral fractures in all subjects (n=185 with 1+ incident vs. 5376 without incident vertebral fracture), subjects without prevalent vertebral fracture at baseline (n=118 with 1+ incident vs. 4802 without incident vertebral fracture) and subjects with 1+ prevalent vertebral fracture at baseline (n=67 with 1+ incident vs. 574 without incident vertebral fracture). The bottom row shows similar ROC curves for discriminating 2+ versus no incident vertebral fracture

The risk factor scores also performed better than random guessing in predicting incident vertebral fracture among those without prevalent vertebral fracture at baseline [AUC: 0.67; 95% CI: (0.62, 0.71) for detecting 1+ incident], although the AUC was slightly lower, 0.66 (95% CI: 0.54, 0.77), for detecting 2+ versus 0 incident vertebral fractures (Fig. 2). Despite this, the specificity in this group was still somewhat disappointing in the regions of high sensitivity.
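The ROC summaries used in this section can be illustrated with a short sketch. It computes the AUC as the probability that a randomly chosen case has a higher predicted risk than a randomly chosen non-case, and the sensitivity at a fixed specificity. It also expresses sensitivity as a multiple of chance, on the assumption that "random guesswork" at a given specificity yields a sensitivity of 1 minus that specificity. The function names and data are hypothetical, and this is not the authors' software.

```python
import math


def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 1/2)."""
    wins = 0.0
    for c in case_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def sensitivity_at_specificity(case_scores, control_scores, specificity):
    """Sensitivity when the cut-off classifies at least `specificity` of controls as negative."""
    ranked = sorted(control_scores)
    k = math.ceil(specificity * len(ranked) - 1e-9)  # controls that must fall at or below the cut-off
    cutoff = ranked[k - 1] if k > 0 else float("-inf")
    return sum(s > cutoff for s in case_scores) / len(case_scores)


def fold_over_chance(sensitivity, specificity):
    """Sensitivity relative to the 1 - specificity expected from random guessing."""
    return sensitivity / (1.0 - specificity)


# Hypothetical predicted risks, for illustration only.
controls = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
cases = [5, 9.5, 10, 12]
print(auc(cases, controls))
print(sensitivity_at_specificity(cases, controls, 0.90))
```

Under this reading, a sixfold improvement at 90% specificity corresponds to a sensitivity of about 60%, since chance alone would give 10%.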

Effects of adding hip or spine BMD to treatment decision models

Table 3 shows the relative risk estimates when hip and spine BMD were added into Model 1 that contained significant determinants from the larger data set (Table 2, Model 1). BMD was entered as T-scores calculated respectively using the NHANES III young normal reference data for hip BMD sites (femoral neck and trochanter) [16] and from normative European data for spine BMD [17]. Relative risks were estimated for T-scores calculated either using sex-specific reference data or using just female reference data. There was a significant increase in the risk of incident vertebral fracture per unit T-score decline, which was largest for femoral neck (FN) BMD after adjusting for the variables that were significant in the larger data set. The relative risks were approximately 1.8 per unit decline in the T-score for FN BMD whichever way the T-scores were calculated. The area under the ROC curve for predicting 1+ versus 0 incident vertebral fractures improved modestly upon adding FN BMD (p=0.010) in this dataset from 0.72 (0.66, 0.79) using the questionnaire variables only to 0.76 (0.71, 0.82). The addition of FN BMD failed to significantly improve the detection of those with 2+ (2% AUC improvement, p=0.238) incident vertebral fractures. When baseline X-ray information was added to this FN BMD model, there was in contrast a significant improvement in the AUC for identifying those with multiple fractures (AUC improvement=6% for 2+ incident, p=0.031 and 7% for 3+ incident, p=0.040).

Table 3 Determinants of incident vertebral fracture – models with hip and spine BMD (fx fracture)

Among the group without prevalent vertebral fracture(s) the AUC improved from 0.69 (0.60, 0.77) using risk factors only to 0.72 (0.63, 0.80) upon adding FN BMD (p=0.132), yet the specificity remained quite low in regions of high sensitivity. For example, at a sensitivity of 90% the specificity was 29%, and at a sensitivity of 80% the specificity was 50%, suggesting that the addition of BMD into the risk score is unlikely to greatly improve the prediction of incident vertebral fracture in this group. Our findings were similar when femoral trochanter BMD was substituted for FN BMD. For spine BMD, there was no significant improvement in any AUC. Total hip BMD was not available for most of the dataset, because at the time the data were collected only one DXA manufacturer offered total hip BMD.

Selective use of a spine X-ray to aid risk stratification

In previous work, we developed an algorithm to guide the request for a spine X-ray [18]. This algorithm (see Appendix 2) was based on questionnaire variables: age, sex, height loss (htloss), weight and history of other major fracture(s) (hotherfx), defined as fracture of the forearm, rib or recent lower limb.

We therefore investigated the use of this algorithm for selecting patients for a spine X-ray in combination with our new prediction algorithm that made use of the X-ray results (Table 2, Model 4). Assuming that spine X-rays were available only selectively and setting cut-points where 40% of females and 30% of males aged 50–75 years would be X-rayed with a 22% chance in the individual of demonstrating a prior vertebral fracture (score ≥23.1 for females and ≥27.9 for males) [18], the AUC for predicting 1+ future (i.e. incident) vertebral fracture was 0.73 (95% CI: 0.70, 0.77). This was similar (p=0.843) to the AUC of 0.74 (0.70, 0.77) in Model 4 for which all subjects had baseline spine X-rays, but was significantly improved (p=0.007) compared to the AUC of 0.71 (0.67, 0.74) from Model 1 based only on questionnaire variables. Clearly, in the absence of a previous spine X-ray, it would be more cost-effective to select patients at high risk to have a spine X-ray first and then to assess future vertebral fracture risk in only those who had been X-rayed. Appendix 2 presents a simplified algorithm for assessing risk in clinical practice, together with information allowing local cost data to be applied in assessing the algorithm’s cost-utility in specific settings defined by age-specific incidence and prevalence rates.
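The two-stage strategy described above can be sketched as a simple triage rule. The cut-points are those quoted in the text; the X-ray referral score itself comes from the algorithm in Appendix 2 and is taken here as an input. The function name, the return labels and the fallback to the questionnaire-only model for those not X-rayed are assumptions made for illustration.

```python
# Cut-points on the X-ray referral score, as quoted in the text [18].
XRAY_CUTPOINT = {"female": 23.1, "male": 27.9}


def triage(xray_referral_score, sex):
    """Decide which risk model applies to a patient without a previous spine X-ray.

    Returns "xray_then_model4" when the referral score reaches the sex-specific
    cut-point, otherwise "model1_questionnaire_only" (an assumed fallback).
    """
    if xray_referral_score >= XRAY_CUTPOINT[sex]:
        return "xray_then_model4"
    return "model1_questionnaire_only"


# Hypothetical referral scores, for illustration only.
print(triage(25.0, "female"))
print(triage(25.0, "male"))
```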

Discussion

We have shown that an algorithm based on age, height loss, weight, sex and history of fracture identified with moderate accuracy those patients likely to suffer new vertebral fracture(s). Although the predictions were statistically better than guesswork, the specificity was still modest in the desirable regions of high sensitivity without a baseline spine X-ray. This limits the likely utility of a clinical treatment decision algorithm based on clinical risk factors only and points to the need for vertebral X-ray data in populations with prevalence and incidence rates that are typically seen in Europe.

There was substantial improvement in the prediction of incident vertebral fracture when baseline X-ray information on prevalent vertebral fractures was added to risk factor models. Radiological information from the spine X-ray was particularly useful in identifying those likely to suffer from multiple incident vertebral fractures. It is those who acquire multiple fractures that are likely to suffer greatest quality of life impairment and to be at greater risk of mortality/co-morbidity [19]; hence these patients should be a primary target for early identification and treatment. This suggests that a reasonable approach to developing clinical treatment decision algorithms would be to first identify which patients should undergo spinal radiography and secondly to identify from among those X-rayed then or previously who should be considered for treatment. In this context, policies to destroy old spine X-rays and with them unrecorded vertebral fracture data, as is a widespread practice, for example, in the UK NHS, should be viewed as inimical to the patients’ interests.

The addition of BMD data into the model without X-ray data did not usefully improve the prediction of those likely to have multiple incident vertebral fractures, although an improvement in judging the risk of a single vertebral fracture was evident with femoral BMD measurements. When the comparison of interest was any incident vertebral fracture versus none, hip BMD gave the expected nearly twofold increase in risk of fracture for a 1 SD unit reduction in T-score, after adjusting for the questionnaire variables. This confirms the likelihood that BMD data has a complementary rather than a duplicating role alongside plain X-ray data in the assessment of risk. It is being increasingly appreciated that reliance on BMD alone for effective screening strategies might not sufficiently protect the highly vulnerable who have a defect in bone quality with or without an osteoporotic BMD and a T-score <–2.5. Empirically, clinical spine X-rays appear to provide information on bone quality, especially in the minority of patients who are destined to have multiple vertebral fractures without intervention. In this role, vertebral radiography appears to be complementary to BMD measurement [20–22].

Differing strategies for identifying those at risk of osteoporosis have been adopted in various countries. In the USA, densitometry screening has been recommended for women >65 years [23], whereas in Europe a selective case finding approach has been adopted using diagnostic methodologies that are locally available [24]. In some countries, densitometry is only recommended for those with established prior fragility fracture, an approach which our results suggest might not be of much value for the prediction of incident vertebral as distinct from non-vertebral fracture, since BMD failed to greatly improve the area under the ROC curves. In the UK, for example, only about 5–7% of radiological vertebral fractures are identified in primary care; this rises to 20% in Sweden and perhaps to over 30% in Rochester, Minnesota [11, 20, 25–27]. The application of our algorithm for deciding who should be referred for a vertebral radiograph would remove much of the ignorance that currently prevents more than one-half of deserving patients from receiving treatment, with nearly 60% identification at the suggested cut-points for doing a spine X-ray.

On the other hand, a prior fracture is a strong predictor of future fractures, especially for future fractures occurring at similar (e.g. contra-lateral) sites to the prior fracture, and the strength of this effect is independent of BMD [7]. For vertebral fracture, not only does the presence of prevalent fracture predict subsequent fracture, but the characteristics of the prevalent vertebral fracture are also important in predicting incident vertebral fractures [9, 10, 28]. These include properties such as number, location, adjacency (with future fractures being more likely to be found within three vertebrae at either side of the vertebrae with prevalent fracture) [9] and severity or shape of the prevalent fracture [9, 10, 28]. Such local properties are unlikely to be measurable by BMD at any measurement site.

This study has strengths and limitations. The EPOS study was a prospective population-based Europe-wide study that included men and women aged 50+ years using standardised methods in the conduct and evaluation of radiographs to ensure comparability of data across centres. The results are therefore applicable to the diverse populations in Europe, but should be extrapolated with caution to non-European subjects. We did not explore the use of lifestyle risk factors in the algorithms we produced since previous EPOS analysis did not find any major association between these and the risk of incident vertebral fracture [29]. Our conclusions concerning BMD effects are based on the assumption that BMD data was missing (in about one-half of our subjects) in a way that did not differ from random. A statistical comparison of the means or proportions of the predictor variables among those who had hip BMD versus those without hip BMD suggested there were some small but statistically significant differences for some variables: age (61.97 vs. 62.60; p=0.002), weight (73.49 vs. 74.32; p=0.016), height loss (−2.03 cm vs. −2.43 cm; p<0.0001), female sex (57% vs. 49%; p<0.0001) and forearm fracture history (10% vs. 13%; p<0.0001). There were no differences in the proportions of those with a history of vertebral fracture (1.5% vs. 2.1%; p=0.086), rib fracture (3.8% vs. 3.6%; p=0.647) and lower limb fracture (2.1% vs. 1.9%; p=0.624). For the response variable, there was a slightly smaller proportion of subjects with at least one incident vertebral fracture in the sample that had hip BMD compared to those without hip BMD (2.7% vs. 3.8%; p=0.015). However, the proportions of those with multiple incident vertebral fractures did not significantly differ among those with versus without hip BMD (0.80% vs. 0.95%; p=0.580 for 2+ incident; 0.38% vs. 0.36%; p=0.897 for 3+ incident). The extent to which such small differences could have indicated non-randomness is uncertain, but generally the models with BMD were more likely to be underpowered due to the smaller sample sizes. Less than 43% of the estimation sample had hip BMD available and, consequently, a smaller number of incident vertebral fractures were available in this group.

In conclusion, we have shown that an algorithm based on age, weight, height and clinical history of fracture by questionnaire provides guidance to the doctor to help make decisions on whom to treat with anti-osteoporosis medication. The accuracy of this decision can, however, be substantially improved by having X-ray information on prevalent vertebral fractures. With anticatabolic treatments capable of preventing 35–65% of new vertebral fractures and the personal impact of such fractures increasing exponentially with their numbers in the spine, these results suggest that an accurately read spine X-ray will form a central component, alongside bone densitometry and clinical risk factors, in future algorithms for targeting treatment. This model and other models like it [18] for aiding treatment decisions could usefully be re-tested in larger datasets. For example, it would be desirable to use the meta-reanalysis approach of Kanis et al. (based on individual subject data) [30, 31] applied across several cohorts similar to EPOS that have taken spine X-rays at intervals for the assessment of incident fractures and have then performed the relatively reproducible technique of vertebral morphometry [11], which can be standardised for the identification of fracture.