Introduction

Osteoporosis is a common bone disease that causes deterioration of bone mineral density (BMD), structure and strength. In the UK, fractures due to osteoporosis (vertebral, wrist and hip fractures) occur in 1 in 2 women and 1 in 5 men over the age of 50 [1]. The estimated cost of fragility fractures to the National Health Service (NHS) is around £4.4 billion per year [2, 3].

Vertebral fractures are a classic hallmark of osteoporosis. Their diagnosis and management pose a major healthcare challenge. Vertebral fractures affect 18–26% of all European women aged ≥ 50 years [4]. They cause significant morbidity, affect quality of life and result in back pain and height loss. The presence of any fragility fracture is a powerful indicator of high future fracture risk [5]. Women with prevalent vertebral fractures have a 5-fold increased risk of vertebral fracture and a 2-fold increased risk of hip fracture [6, 7]. This risk can be reduced significantly with treatment.

Measurement of areal bone mineral density (aBMD) of the lumbar spine and total hip by dual-energy X-ray absorptiometry (DXA) is an integral part of the fracture risk assessment process [8], and the results obtained can be interpreted using the criteria set out by the World Health Organization [9, 10]. However, not all patients with fragility fractures have low aBMD by DXA. Bliuc et al. reported that out of 715 individuals (women = 528, men = 187) with prevalent fragility fractures, participating in the Dubbo Osteoporosis Epidemiology Study, 12% had normal femoral neck aBMD, 42% had osteopenia and 46% had osteoporosis [11]. It is essential, therefore, that patients with vertebral and other fragility fractures are investigated thoroughly to ensure effective management and treatment targeting.

Reliable, well-established tools are necessary for the assessment of bone status in patients with and without fragility fractures. These provide an important source of information on which the clinician can base their diagnosis and the patient’s care plan. Dual-energy X-ray absorptiometry is the gold standard technique for the measurement of aBMD; however, because of its 2D nature, the information gained is limited to quantitative bone measures, and no qualitative 3D information relating to bone structure is produced [12]. The aBMD results acquired during posteroanterior imaging of the lumbar spine by DXA include the posterior elements of the vertebrae and may be affected by artefacts such as aortic calcification. Furthermore, aBMD results acquired at the lumbar spine may be unreliable if there is evidence of degenerative changes/degenerative joint disease or vertebral fractures [13, 14].

Considerable research has been conducted to better understand the factors affecting the measurement of aBMD, the influence these factors have on DXA-derived results and the potential errors that may be introduced. Bone mineral apparent density (BMAD, g/cm3) can be calculated from the results acquired when performing DXA of the lumbar spine [15]. BMAD was developed to take into account differences in the size of vertebrae within (i.e. due to growth) and between individuals and reduces the effect of bone size on bone mass. However, it is a surrogate 2D indicator of volumetric BMD (vBMD) and does not represent true vBMD of the lumbar spine or describe the bone spatial distribution within the vertebrae [16].

Microstructural, qualitative properties must also be considered when assessing the ability of the bone to resist fracture. Trabecular bone score (TBS) can be derived from DXA scans of the lumbar spine and may provide some insight into the qualitative 3D microarchitectural properties of trabecular bone [17]. There is evidence that TBS is able to discriminate between women with and without recent fractures [18, 19], and between women with different fracture types including those of the humerus, forearm, vertebra and femur [20]. Moreover, TBS, when used in combination with aBMD, may provide additional information regarding glucocorticoid-associated alterations in bone quality [20]. Despite this, it is important to recognise that TBS can only provide a 2D indirect measure of bone microstructure. The results may be influenced by degenerative changes and fractures within the lumbar spine and the posterior elements of the vertebrae.

Direct 3D measures of the skeleton have improved our understanding of the effects of fractures, treatment and disease on the bone. In the central skeleton, this can be achieved through the use of quantitative computed tomography (QCT) of the lumbar spine. The disadvantage of using QCT is that the patient is exposed to a high dose of ionising radiation (typical in-house effective doses for the lumbar spine, DXA L1 to L4 = 14.9 μSv and QCT L1 to L3 = 980 μSv) during each examination, and therefore this contraindicates the wide and frequent use of QCT as a tool with which to examine the bone status. However, QCT does have the advantage that it can provide a true measure of vBMD that excludes the posterior elements of the vertebrae. It takes into account the bone size and enables the separate study of trabecular and cortical bone. The International Society for Clinical Densitometry (ISCD), in its 2007 Positions Statement on the use of QCT and peripheral QCT in the management of osteoporosis in adults, stated that trabecular vBMD by QCT can be used to predict vertebral fractures but more evidence is needed to compare its discriminatory ability with that of aBMD by DXA [21]. A number of studies have reported comparisons of the ability of different bone imaging techniques to discriminate between women with and without vertebral fractures [22,23,24,25,26]. All of these studies concluded that trabecular vBMD by QCT had the best discriminatory ability for vertebral fracture when compared with different bone imaging techniques including aBMD of the lumbar spine by DXA.

Different bone imaging techniques, even when performed at the same anatomical site, may assess different properties of the bone and hence may not discriminate exactly between individuals with (i) low aBMD and prevalent fragility fractures and (ii) low aBMD only. Also, it is possible that the discriminatory ability of an imaging technique is itself influenced by aBMD. To our knowledge, the studies described previously have not controlled for this. The study findings we report here are independent of aBMD.

More information about the discriminatory ability of different bone imaging techniques, particularly QCT, is required to ensure informed clinical decision-making, effective treatment targeting and the optimum care of patients with and without fragility fractures [21].

The aims of this study were to

  1. Ascertain the ability of lumbar spine BMAD, TBS and vBMD to discriminate between individuals with (i) low aBMD by DXA with vertebral fractures and (ii) low aBMD only.

  2. Compare the discriminatory ability of lumbar spine BMAD, TBS and vBMD for vertebral fracture.

The knowledge gained during this study could directly benefit patients through improved assessment and treatment targeting and ultimately a reduction in individual fracture risk.

Materials and methods

Study design

We conducted a single-site, observational, cross-sectional, case-controlled study of postmenopausal women—the Low Bone Mineral DenSity witH And wiThouT vErtebral fRactures (SHATTER) study.

Study population

We studied postmenopausal women aged 50 or over, whose last menstrual period was more than 12 months before study entry. Three groups of participants were studied as described below:

  • Group 1 (cases) were individuals with vertebral fractures (≥ 1 vertebral fracture) and low aBMD, defined as an aBMD T-score of < − 1.0 at either the total hip or lumbar spine by DXA. Vertebral fracture cases were recruited (i) from those patients referred to and attending the Sheffield Fracture Risk Assessment Service (FRAS), Metabolic Bone Centre, Northern General Hospital, Sheffield, UK [8]; (ii) by members of the clinical care team following a review of spinal radiographs, CT or MRI images acquired in the Department of Radiology, Sheffield Teaching Hospitals NHS Foundation Trust, Sheffield, UK and (iii) from our volunteer database of previous study participants who had expressed an interest in participating in future bone research projects.

  • Group 2 (age- and BMD-matched controls) were women with low aBMD but without vertebral fractures individually matched by age (± 5 years) and total hip or lumbar spine aBMD (± 0.05 g/cm2) to women in group 1. Age- and BMD-matched controls were recruited (i) from those patients referred to and attending FRAS, Metabolic Bone Centre, Northern General Hospital, Sheffield, UK; (ii) from our volunteer database of previous study participants who had expressed an interest in participating in future bone research projects; (iii) through study advertising posters and emails to staff and students within The University of Sheffield and Sheffield Teaching Hospitals NHS Foundation Trust and (iv) through general practice mail-outs.

  • Group 3 (age-matched controls) were women with normal BMD (defined as an aBMD T-score of > − 1.0 at either the total hip or spine by DXA), individually matched by age (± 5 years) to women in group 1. Group 3 were recruited through the same routes as group 2.
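The matching tolerances described above can be expressed as a simple check. The following is a minimal sketch only: the function name and argument layout are hypothetical, and it assumes that the case and control aBMD values passed in were measured at the same site (total hip or lumbar spine), with the choice of site left to the caller.

```python
AGE_TOLERANCE_YEARS = 5.0      # matching criterion: age within 5 years
ABMD_TOLERANCE_G_CM2 = 0.05    # matching criterion: aBMD within 0.05 g/cm2


def is_matched_control(case_age, control_age, case_abmd, control_abmd):
    """Return True if a control lies within both tolerances of a case.

    case_abmd and control_abmd are aBMD values (g/cm2) from the same site;
    this same-site requirement is an assumption of the sketch.
    """
    age_ok = abs(case_age - control_age) <= AGE_TOLERANCE_YEARS
    abmd_ok = abs(case_abmd - control_abmd) <= ABMD_TOLERANCE_G_CM2
    return age_ok and abmd_ok
```

For group 3, only the age part of the check would apply.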

The lowest BMD T-score, either at the hip or lumbar spine, was used as the BMD inclusion criterion by which we assigned participants to either group 1, group 2 or group 3.
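The assignment rule can be illustrated with a short sketch. The function below is hypothetical and makes two assumptions that the text does not state: a lowest T-score of exactly −1.0 falls into no group, and a participant with normal BMD but a vertebral fracture is not assigned to any group.

```python
def assign_group(t_score_hip, t_score_spine, has_vertebral_fracture):
    """Assign a participant to group 1, 2 or 3 using the lowest T-score.

    Returns None when no group applies (an assumption of this sketch).
    """
    lowest = min(t_score_hip, t_score_spine)
    if lowest < -1.0:
        # Low aBMD: cases have vertebral fractures, matched controls do not.
        return 1 if has_vertebral_fracture else 2
    if lowest > -1.0 and not has_vertebral_fracture:
        # Normal BMD at both sites and no vertebral fracture.
        return 3
    return None
```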

Volunteers were not eligible to participate in the study if they had (i) been diagnosed and treated for malignancy within the last 5 years; (ii) overt Cushing’s syndrome; (iii) received glucocorticoid treatment or oestrogen replacement 6 months before the start of the study; (iv) secondary causes of osteoporosis including chronic renal disease, malabsorption syndromes, endocrine disorders, hypercalcaemia or hypocalcaemia, chronic alcoholism; (v) any diseases known to affect bone metabolism and (vi) type 1 diabetes or pharmacologically treated type 2 diabetes.

This study was approved by the South Yorkshire Research Ethics Committee and all participants gave fully informed written consent prior to their participation. All investigations were carried out in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its later amendments and in accordance with the International Conference on Harmonization Good Clinical Practice (ICH GCP) guidelines.

Anthropometric assessments

Height (to the nearest 0.1 cm) and weight (to the nearest 0.1 kg) were measured using a wall-mounted stadiometer (Seca 242, Seca, Birmingham, UK) and an electronic column scale (Seca), respectively. Body mass index (BMI) was calculated using Quetelet’s index (weight in kg divided by the square of height in m) to the nearest 0.1 kg/m2.
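As a worked illustration of Quetelet’s index, the sketch below converts a height recorded in centimetres to metres and rounds the result to one decimal place. The function name and the example values are hypothetical.

```python
def body_mass_index(weight_kg, height_cm):
    """Quetelet's index: weight (kg) divided by height (m) squared, to 0.1 kg/m2."""
    height_m = height_cm / 100.0
    return round(weight_kg / height_m ** 2, 1)


# Illustrative values only: 70.0 kg and 160.0 cm give 70 / 2.56 = 27.34375.
example_bmi = body_mass_index(70.0, 160.0)
```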

Medical and lifestyle questionnaire

Information on demographics, lifestyle (diet, dietary supplementation and calcium intake, physical activity, smoking status and alcohol consumption) and medical history (drug history, current medication, gynaecological history, fracture history and family history of osteoporosis) was collected through self-completion of a validated medical and lifestyle questionnaire. This questionnaire is currently used as a clinical tool to assess the individual patient’s risk of fracture as part of FRAS (Metabolic Bone Centre, Northern General Hospital, Sheffield, UK).

Dual-energy X-ray absorptiometry

Areal BMD (in g/cm2) of the lumbar spine (L1 to L4) was measured by DXA using a Discovery A densitometer (Hologic Inc., Bedford MA). As aBMD cannot be reliably measured in fractured vertebrae, all study participants were required to have a minimum of two evaluable lumbar vertebrae to be included in the study. Fractured vertebrae were excluded during the analysis of the DXA lumbar spine scan images. Individuals with fractures of all lumbar vertebrae were excluded from the study. BMAD of L1–L4 (in g/cm3) was calculated using the method described by Carter et al. [13] and the following equation:

$$\mathrm{BMAD} = \frac{\mathrm{BMC}}{A_p^{1.5}}$$

where BMAD is the bone mineral apparent density (g/cm3), BMC is the bone mineral content (g) and $A_p$ is the projected area (cm2).
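The equation can be applied directly to the BMC and projected area reported by the densitometer. The sketch below is illustrative only; the function name and the input values are hypothetical and are not study data.

```python
def bmad(bmc_g, projected_area_cm2):
    """Bone mineral apparent density (g/cm3): BMC divided by projected area to the power 1.5."""
    if projected_area_cm2 <= 0:
        raise ValueError("projected area must be positive")
    return bmc_g / projected_area_cm2 ** 1.5


# Hypothetical inputs chosen for easy arithmetic: 16 / 16**1.5 = 16 / 64 = 0.25.
example_bmad = bmad(16.0, 16.0)
```

Raising the area to the power 1.5 turns a 2D projection into an approximate volume, which is why BMAD reduces, but does not remove, the dependence of the result on bone size.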

Total hip aBMD of the right proximal femur was also measured by DXA. If the right proximal femur had been fractured or replaced, the contralateral proximal femur was imaged. Daily measurements of the manufacturer’s device-specific anthropomorphic phantom were performed in order to monitor the stability of the DXA device.

Trabecular bone score

Following standard analysis procedures, the TBS Clinical Data Analysis software (version 1.6; Medimaps, Pessac, France) was applied to the DXA lumbar spine scan images. The calculation of TBS has been previously described in detail by Winzenrieth et al. [17]. Here, we provide only a brief overview of the automated calculation process. Firstly, a greyscale variogram, examining pixel intensity within the image, was produced. TBS of L1–L4 was then calculated as the slope at the origin of the log-log representation of the greyscale variogram to produce a measure of the mean rate of local variations in these greyscale differences. This was expressed as a trabecular bone score or TBS [17].
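The idea of a variogram slope at the origin can be illustrated with a simplified sketch. This is not the algorithm of the TBS software, whose details are not given here. The sketch assumes that squared grey-level differences are pooled over the horizontal and vertical directions only and that the slope is fitted by least squares over the first few lags; both choices, and all function names, are hypothetical.

```python
import math


def variogram(image, max_lag):
    """Mean squared grey-level difference at lags 1..max_lag.

    image is a list of rows of numeric pixel values. Differences are pooled
    over horizontal and vertical pixel pairs (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    values = []
    for lag in range(1, max_lag + 1):
        total, count = 0.0, 0
        for i in range(rows):
            for j in range(cols):
                if j + lag < cols:
                    total += (image[i][j + lag] - image[i][j]) ** 2
                    count += 1
                if i + lag < rows:
                    total += (image[i + lag][j] - image[i][j]) ** 2
                    count += 1
        values.append(total / count)
    return values


def loglog_initial_slope(image, max_lag=4):
    """Least-squares slope of log V(k) against log k over the first lags.

    Requires an image larger than max_lag in both directions and a
    non-constant image, so that every V(k) is positive.
    """
    v = variogram(image, max_lag)
    xs = [math.log(k) for k in range(1, max_lag + 1)]
    ys = [math.log(value) for value in v]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# A smooth ramp image has V(k) = k**2, so the log-log slope is 2.
ramp = [[i + j for j in range(8)] for i in range(8)]
ramp_slope = loglog_initial_slope(ramp)
```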

Quantitative computed tomography

QCT of vertebrae L1–L3 was performed using a 64-row LightSpeed volumetric computed tomography system (Lightspeed 64 VCT XT, GE Medical Systems) as previously described by Paggiosi et al. [27].

Images of L1–L3 were acquired in the axial plane with a helical full 1.0 s rotation time and a table height of 155 cm. All scans were performed using the following scan settings: pitch = 0.969, tube current = 140 mA, tube voltage = 80 kVp and slice thickness = 0.625 mm. Scanning began 5 mm above the superior endplate of L1 (inclusive of the T12–L1 joint space) and ended 5 mm below the inferior endplate of L3 (inclusive of the L3–L4 joint space). Images were reconstructed at 0.625 mm × 0.625 mm using the standard algorithm and a field of view of 480 mm. Images were analysed using QCT Pro software (V5.0.3, Mindways Software Inc., Austin, TX, USA). Fractured vertebrae were excluded from the analysis of the QCT scan images. Trabecular vBMD of vertebrae L1, L2, L3 and L1–L3 was determined. Firstly, the vertebral bodies were rotated for optimal placement of the regions of interest (ROI). This was performed in the axial, sagittal and coronal planes. The centre of each vertebral body was identified and marked with a cross. Elliptical ROIs were then automatically placed, by the Mindways software, within the frontal trabecular region of each vertebral body to exclude the cortical and sub-cortical bone and the posterior elements.

A Model 3 CT density calibration phantom (Mindways, Mindways Software Inc., Austin, TX, USA) was positioned under the participants during each L1–L3 scan. Information extracted from the calibration phantom allowed the conversion of measured Hounsfield units to units of bone mineral.
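The principle of phantom-based calibration can be sketched as a straight-line fit between the known equivalent densities of the phantom rods and the Hounsfield units measured in them, which is then inverted for the vertebral ROI. This is a generic simplification and not the procedure implemented in the QCT Pro software; the function names and the rod values in the example are hypothetical.

```python
def fit_calibration(rod_densities, rod_hu):
    """Least-squares line HU = slope * density + intercept from phantom rods."""
    n = len(rod_densities)
    mean_d = sum(rod_densities) / n
    mean_h = sum(rod_hu) / n
    sxx = sum((d - mean_d) ** 2 for d in rod_densities)
    sxy = sum((d - mean_d) * (h - mean_h) for d, h in zip(rod_densities, rod_hu))
    slope = sxy / sxx
    intercept = mean_h - slope * mean_d
    return slope, intercept


def hu_to_density(hu, slope, intercept):
    """Invert the calibration line to convert a measured HU value to density."""
    return (hu - intercept) / slope


# Hypothetical rod data lying exactly on the line HU = 2 * density + 5.
densities = [0.0, 50.0, 100.0, 200.0]
measured_hu = [5.0, 105.0, 205.0, 405.0]
slope, intercept = fit_calibration(densities, measured_hu)
roi_density = hu_to_density(105.0, slope, intercept)
```

Scanning the phantom with each participant means the calibration line is refitted for every examination, so scanner drift between sessions is absorbed by the fit.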

Statistical analyses

Our sample size calculations were based on the clinically significant difference in lumbar spine vBMD reported by Melton et al. [28]. A one standard deviation (SD) decrease in spine vBMD was associated with a 2.2-fold increase in the risk of vertebral fracture [28]. We calculated that 30 patients per group would provide a power of 80% to detect a difference of 1SD at the 5% significance level.

Participant characteristics were reported as mean and standard error (mean (SE)).

Between-group differences were identified using the general linear model (GLM) univariate procedure [29]. Each measurement variable (BMAD, TBS and vBMD) was selected, in turn, as the dependent variable. Fixed factors were defined as (i) study participant group and (ii) case/control matching. Post hoc Tukey tests were performed when between-group differences were shown to be statistically significant by the GLM univariate procedure.

The ability of each measurement variable (BMAD, TBS, vBMD) to discriminate between postmenopausal women with (i) low aBMD with vertebral fractures and (ii) low aBMD only was determined using receiver operating characteristic (ROC) analysis. The combined measurement variable of aBMD + TBS was calculated using logistic regression as previously described by Winzenrieth et al. [30]. The areas under the receiver operating characteristic curve (AUCs) were then compared using pairwise comparisons of ROC curves to examine differences between the discriminatory ability of lumbar spine measurement variables for vertebral fracture [31].
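For a single measurement variable, the AUC can be estimated without plotting the curve, because it equals the probability that a randomly chosen case ranks on the fracture side of a randomly chosen control (the Mann-Whitney estimate). The sketch below assumes that lower values indicate fracture, as would be expected for vBMD; the function name and the example values are hypothetical, and the sketch does not reproduce the pairwise AUC comparison performed in MedCalc.

```python
def auc_lower_in_cases(case_values, control_values):
    """Probability that a random case has a lower value than a random control.

    Ties count as half. An AUC of 0.5 indicates no discrimination.
    """
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case < control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical values: 8 of the 9 case-control pairs are correctly ordered.
example_auc = auc_lower_in_cases([80.0, 90.0, 100.0], [95.0, 110.0, 120.0])
```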

Statistical analyses were performed using SPSS Statistics (version 24.0, IBM Corporation, New York, USA) and MedCalc (version 18, MedCalc Software bvba, Belgium). A level of p < 0.05 was considered to show statistical significance.

Results

Study population

Participant characteristics are presented in Table 1.

Table 1 Participant characteristics for groups 1, 2 and 3 presented as mean and standard error (mean (SE))

A total of 110 postmenopausal women (age = 68.8 (6.2) years, weight = 70.2 (12.6) kg and height = 161.1 (6.9) cm) were studied. Of these, 39, 34 and 37 women were recruited to groups 1, 2 and 3, respectively. Statistically significant differences in weight (p < 0.001) and BMI (p = 0.02) were observed when comparing group 3 with groups 1 and 2, with participants in group 3 being heavier, but height was similar across the groups. However, we did not attempt to match for height and weight, only for age and aBMD (groups 1 and 2) or age alone (group 3). In group 1, a total of 57 vertebral fractures were identified using the algorithm-based qualitative approach as described by Jiang et al. [32]. Thirty-two women had sustained one vertebral fracture, and seven women had sustained > 1 vertebral fracture. Per individual, the number of fractures ranged from 1 (n = 32 women) to 5 (n = 1 woman). Vertebral fractures were most prevalent at the thoracolumbar junction (T11 to L1) and at the mid thoracic spine, with 35% and 37% of the total number of fractures occurring in these two regions, respectively. Wedge (68%) and concave/biconcave (32%) fractures were observed, but no compression fractures. Of all the fractures identified, 18%, 44% and 38% were categorised as grade 1, grade 2 or grade 3, respectively. Of the 39 women in group 1, 17 had previously received or were currently taking oral bisphosphonate treatment (alendronate = 15 women and risedronate = 2 women) and three had been prescribed calcium and vitamin D. No participants in group 2 had previously received or were currently taking bisphosphonate treatment, but 11 had been prescribed calcium and vitamin D.

Between-group differences in DXA and QCT measurement variables

Measurement variable data acquired using DXA and QCT for groups 1, 2 and 3 are presented in Table 2. Significant differences in lumbar spine measurement variables were observed between groups 1, 2 and 3. Group 3 (age-matched controls) had significantly higher lumbar spine BMAD and TBS than women in groups 1 (cases) and 2 (age- and aBMD-matched controls). Furthermore, vBMD was significantly different for all three groups with the lowest values being observed for group 1 and the highest for group 3.

Table 2 Differences in lumbar spine measurement variables between groups 1, 2 and 3 as measured using DXA and QCT. Data are presented as mean and standard error (mean (SE)). p values for group differences as determined using general linear univariate modelling approaches and group multiple comparisons as determined using the Tukey post hoc testing are given to 1 significant figure

Discriminatory ability of DXA and QCT measurement variables for vertebral fracture

The outcomes of the discriminatory ability comparison analysis of lumbar spine measurement variables are presented in Table 3 and Fig. 1. The combined measurement variable aBMD + TBS did not show better discriminatory ability than aBMD alone or TBS alone. Of all the measurement variables examined, only vBMD was able to discriminate between postmenopausal women with (i) low aBMD with vertebral fractures and (ii) low aBMD only.

Table 3 Ability of lumbar spine measurement variables to discriminate between postmenopausal women with (i) low aBMD with vertebral fractures (group 1) and (ii) low aBMD only (group 2). p values are presented to 1 significant figure

Fig. 1 The ability of BMAD, TBS and vBMD to discriminate between individuals with (i) low aBMD with vertebral fractures (group 1) and (ii) low aBMD only (group 2). AUC is significantly different to 0.5 (p < 0.05) and AUC is significantly better than AUC for aBMD (p < 0.05).

Discussion

The ability of different bone imaging techniques to discriminate between fracture and non-fracture cases has been well reported [22,23,24,25]. We found that the discriminatory ability of lumbar spine vBMD for vertebral fractures is significantly better than that for lumbar spine aBMD. This finding is in keeping with those previously reported by other investigators [22,23,24,25,26]. However, to our knowledge, the SHATTER study is the first to examine the ability of lumbar spine measurement variables to discriminate between postmenopausal women with (i) low aBMD with prevalent vertebral fractures and (ii) low aBMD only. It is possible that the discriminatory ability of an imaging technique is itself dependent on aBMD. Our study findings are independent of aBMD as women in groups 1 and 2 were aBMD- and age-matched. This makes the SHATTER study novel and informative. Furthermore, we observed that neither BMAD nor TBS could distinguish between women with low aBMD with and without vertebral fractures.

To understand why vBMD has a significantly better discriminatory ability for vertebral fractures than the other measurement variables studied, we must consider the (i) mechanism underlying vertebral fracture and (ii) differences between the imaging techniques used to examine the lumbar spine.

Eastell et al. [33] demonstrated, through the use of ash weighing and microdensitometry, that adult cadaveric lumbar vertebrae (L2 and L3) are predominantly composed of trabecular bone (72% for women and 80% for men). Vertebral fracture is a consequence of altered trabecular microarchitecture due to ageing, disease, medication, lifestyle or other factors. The pathogenesis of vertebral fracture has been well described by Parfitt [34, 35] and Mosekilde [36]. Loss of trabecular bone occurs due to the removal or disruption of some of the microstructural elements. This primarily affects the horizontal trabeculae, resulting in a decrease in trabecular thickness and number and an increase in trabecular separation. Overall, there is a loss of bone strength and an increased susceptibility to fracture. Wolfram et al. [37] described how repeated strain fatigue or acute injury initiates the vertebral fracture development process. This causes minor endplate deformities. Decreased bone strength due to altered trabecular microstructure and repeated loading eventually cause vertebral collapse.

Different bone imaging techniques, even when performed at the same anatomical site, may assess different properties of the bone. Information gained about the vertebrae when performing DXA of the lumbar spine is limited to 2D quantitative bone measures and no qualitative 3D information relating to trabecular bone structure can be acquired. BMAD, although useful for reducing the influence of body size on bone mass, cannot provide qualitative 3D information about the bone. Moreover, TBS, derived from DXA scans of the lumbar spine, can only provide a 2D indirect measure of 3D bone microstructure. However, spine QCT has the advantage that it can reveal 3D information about vertebral trabecular bone and provide a direct measure of vBMD. This may partly explain why, during the SHATTER study, trabecular vBMD was able to discriminate between women with low aBMD with and without vertebral fractures. We must also appreciate that aBMD, BMAD and TBS results, acquired during posteroanterior imaging of the lumbar spine by DXA, include the vertebral posterior elements (i.e. the spinous processes and pedicles). Lee et al. reported that the posterior elements accounted for 51.4 ± 4.2% of the total bone mineral content in the DXA lumbar spine scan region [38]. A morphometric study of adult human cadaveric lumbar vertebrae, conducted by Defino and Vendrame [39], revealed that the pedicles comprised 71.4% trabecular bone and 28.6% cortical bone. When analysing QCT images of L1–L3, an elliptical ROI is positioned within the frontal trabecular region of each vertebral body to exclude any cortical and sub-cortical bone, providing a direct measure of trabecular vBMD only.

Link et al. [40] examined the spine, distal radius and calcaneus of human cadavers using high resolution magnetic resonance imaging and computed tomography to determine the diagnostic value of structural bone measures for predicting vertebral fractures. Volumetric BMD of the lumbar spine (L4) was also measured using QCT. Link et al. [40] concluded that structural measures of the spine were best suited to predict osteoporotic fracture status in the spine (AUCs, 0.62 to 0.75) and that vBMD of L4 could also be used to discriminate between those with and without vertebral fractures.

To our knowledge, this is the first study to compare the ability of the lumbar spine measurement variables BMAD, TBS and vBMD to discriminate between postmenopausal women with (i) low aBMD with prevalent vertebral fractures and (ii) low aBMD only.

Our study does have limitations, the main one being the small number of participants studied (n = 110), although we did successfully recruit 39, 34 and 37 participants into groups 1, 2 and 3, respectively. Our sample size calculation indicated that 30 participants per group would be required to provide a power of 80% to detect a difference of 1SD at the 5% significance level.

Furthermore, the inclusion of women receiving various effective osteoporosis treatments (group 1, 17 participants taking bisphosphonates and 3 taking calcium and vitamin D versus group 2, no participants taking bisphosphonates but 11 taking calcium and vitamin D) may undermine our findings. Although participants from groups 1 and 2 were age- and aBMD-matched, the women with vertebral fractures were more likely to have received bisphosphonates previously or be currently taking bisphosphonates. This has introduced an additional confounding factor that may have had some bearing on our aBMD and vBMD data, namely treatment versus no treatment.

Finally, we acknowledge that an old version of the TBS software (v1.6) was applied to the SHATTER study lumbar spine DXA scans. In 2017, Schacter et al. [41] used the Manitoba Study scans to compare TBS values acquired using software versions 1.8 and 2.1. They concluded that the updated TBS algorithm is less affected by BMI, gives higher mean results for men than women consistent with their lower fracture risk and improves fracture prediction in both men and women [41]. The use of an older software version (v1.6), as reported here, may have compromised the discriminatory ability of TBS for vertebral fracture.

We conclude that vBMD may discriminate well between individuals with and without vertebral fractures as it provides a 3D measure of vBMD, excludes the posterior elements of the vertebrae and takes into account bone size. The SHATTER study has revealed important information about the discriminatory ability of different bone imaging techniques. The main unique feature of our study is that groups 1 and 2 were matched for aBMD (and age). Thus, our findings for group 1 and group 2 are independent of aBMD. Furthermore, we observed that neither BMAD nor TBS could distinguish between women with low aBMD with and without vertebral fractures. Superior discrimination of women with and without radiographic vertebral fracture portends better prediction of all future osteoporotic fractures. The knowledge gained from the SHATTER study will influence clinical and therapeutic decision-making, thereby optimising the care of patients with and without vertebral and other fragility fractures.