Introduction

The United States National Institutes of Health described osteoporosis as a skeletal condition characterized by compromised bone strength that predisposes the patient to an increased risk of non-traumatic fractures, with a focus on both bone density and quality in generating bone strength [1]. Bone mineral density (BMD) represents the mineral content per area and is expressed in g/cm2 when measured using dual-energy X-ray absorptiometry (DXA). Bone quality includes a number of parameters, such as mineralization, micro-architecture, and bone turnover [2]. The World Health Organization considers BMD values measured with DXA as the reference standard for the definition of osteoporosis [3]. BMD is estimated to account for approximately 70 % of bone strength [1], and trabecular bone micro-architecture plays a pivotal role in determining bone quality [4]. Moreover, BMD alone is not always an accurate predictor of fracture risk, and overlap exists between BMD distributions of patients with and without osteoporotic fractures [5].

Trabecular bone score (TBS) was first reported in 2008 as a new tool to assess bone micro-architecture [6]. It is a grey-level texture measurement based on an experimental variogram of two-dimensional projection images: the higher the TBS, the better the skeletal micro-architecture [7]. The software for TBS calculation uses raw data obtained with DXA scans and can be applied only to lumbar spine DXA examinations [7].

Because TBS is a relatively new tool, the guidelines published in 2013 by the International Society for Clinical Densitometry (ISCD) do not include recommendations regarding its use in clinical practice [8]. Furthermore, unlike for BMD, no reference values exist in terms of precision or accuracy. The TBS manufacturer user’s manual only reports accuracy derived from ex vivo studies. The manual also suggests performing short-term precision assessment according to the same recommendations as for BMD. To our knowledge, only five previous papers reported the coefficient of variation (CoV) for TBS, ranging from 1.1 % to 2.1 % [9–13]. These values are higher than those reported for BMD, implying that TBS is less reproducible than BMD. However, those studies have limitations and none of them reported on the scan mode used. In summary, targeted assessments of TBS reproducibility are scarce.

Thus, our aim was to estimate the in vivo reproducibility of TBS using different scan modes and to compare it with the reproducibility of BMD, according to the ISCD guidelines.

Materials and methods

This prospective, single-centre study was approved by the local ethics committee and written informed consent was obtained from all patients. All lumbar spine DXA examinations were performed at our institution in May 2014 following the ISCD 2013 guidelines [8].

Study population and inclusion criteria

The ISCD 2013 guidelines state that in vivo precision assessment has to be performed using a series of patients representative of the local patient population. Thus, we prospectively enrolled patients who were routinely referred to our institution for a lumbar spine DXA because of suspected osteoporosis or for follow-up. In this study we considered only Caucasian female patients. As DXA has known limitations in predicting fracture risk in patients with a body mass index (BMI) greater than 30 kg/m2 [14], and the TBS user’s manual recommends against using TBS values obtained in patients with a BMI ≥ 35 kg/m2 or ≤ 15 kg/m2, we only included patients with 15 < BMI < 30 kg/m2.

Densitometer

All DXA examinations were performed using a Hologic QDR-Discovery A unit (Hologic Inc., Bedford, MA, USA). This densitometer is accepted by the Fracture Risk Assessment tool (FRAX®) and allows for three different scan modes, defined by the manufacturer as fast-array, array, and high definition [15]. Radiation dose and scan times for the three scan modes are shown in Table 1. DXA examinations were performed by a radiologist with 6 years of experience in DXA and 3 years of experience in TBS. The latest available version of the TBS software (Med Imaps TBS iNsight® v02, Hologic) was used.

Table 1 Radiation dose and scan time for three DXA scan modes

Short-term precision assessment of lumbar spine BMD and TBS

The short-term precision, i.e. the reproducibility, was evaluated according to the ISCD 2013 guidelines [8]. For each of the three scan modes (fast-array, array, and high definition), 30 consecutive patients were scanned twice, with patient repositioning between the two scans. If the patient had a DXA examination previously performed at our institution, the same scan mode was selected for both measurements in this study. If not, we randomly assigned one of the three scan modes to the patient. After the first acquisition, the computer suggested an automatic segmentation of the lumbar spine. At this stage, the operator was allowed to correct the region of interest manually to avoid analysis bias. Once the region of interest was correctly defined, it was copied onto the second scan of the same patient. The T-score was then calculated and patients were classified as normal, osteopenic, or osteoporotic. Unlike for BMD, the operator cannot modify any parameter of the TBS processing. TBS analysis was performed on DXA raw data, avoiding further exposure to ionizing radiation.

Statistical analysis

The three independent groups of 30 patients, one for each scan mode, were compared for age, BMI, T-score, and TBS using one-way ANOVA. The distribution of the densitometric classification based on T-score was compared among the three groups of patients using the χ2 test.

According to the ISCD guidelines [8], we calculated the root mean square standard deviation (RMS-SD) of both BMD and TBS; CoV was calculated as the ratio between RMS-SD and the grand mean; the least significant change (LSC) at 95 % confidence level was calculated as 2.77 × CoV. This analysis was repeated for each scan mode (fast-array, array, and high definition). Reproducibility was calculated as the complement of the LSC to 100 %, i.e. 100 % − LSC.
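The calculation chain described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the software used in the study (which was SPSS): the function name `short_term_precision`, the dictionary keys, and the example values are hypothetical, and it assumes exactly two measurements per patient, as in this protocol.

```python
import math


def short_term_precision(pairs):
    """Return RMS-SD, CoV (%), LSC (%) and reproducibility (%) for duplicate scans.

    pairs: list of (first_scan, second_scan) values, one tuple per patient.
    """
    n = len(pairs)
    # With two measurements per patient, the sample variance is (a - b)^2 / 2.
    variances = [(a - b) ** 2 / 2 for a, b in pairs]
    # Root mean square of the per-patient standard deviations.
    rms_sd = math.sqrt(sum(variances) / n)
    # Grand mean over all 2 * n measurements.
    grand_mean = sum(a + b for a, b in pairs) / (2 * n)
    cov_pct = 100 * rms_sd / grand_mean
    # 2.77 is approximately 1.96 * sqrt(2): 95 % confidence for the
    # difference between two measurements.
    lsc_pct = 2.77 * cov_pct
    return {
        "rms_sd": rms_sd,
        "cov_pct": cov_pct,
        "lsc_pct": lsc_pct,
        "reproducibility_pct": 100 - lsc_pct,
    }


# Hypothetical BMD values (g/cm2) for three patients, invented for illustration.
example_pairs = [(0.803, 0.807), (0.950, 0.942), (1.010, 1.020)]
print(short_term_precision(example_pairs))
```

In the study, the same computation would be run separately on the 30 pairs of each scan mode, once for BMD and once for TBS.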

The comparison between the reproducibility of BMD and that of TBS was performed for each scan mode by calculating the standard deviation distribution of the two measurements obtained for each patient. BMD and TBS distributions were tested using the Student’s t test for paired data.
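As a rough sketch of this comparison, the per-patient standard deviation of the two scans can be computed for BMD and for TBS and the two resulting series compared with a paired t statistic. The text does not state whether the standard deviations were normalized, so the sketch assumes they are expressed as a percentage of each patient's mean to make BMD and TBS comparable; that normalization, the function names, and the data are assumptions for illustration. The p-value would then come from a t distribution with the returned degrees of freedom (for example via `scipy.stats.ttest_rel`, which is not used here to keep the sketch dependency-free).

```python
import math
from statistics import mean, stdev


def per_patient_cv(pairs):
    """Per-patient SD of two repeated measurements, as % of that patient's mean."""
    return [100 * stdev(p) / mean(p) for p in pairs]


def paired_t_statistic(x, y):
    """Paired Student's t statistic and degrees of freedom for two equal-length series.

    Raises ZeroDivisionError if all paired differences are identical.
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical duplicate measurements for four patients (illustration only).
bmd_pairs = [(0.803, 0.807), (0.950, 0.942), (1.010, 1.020), (0.880, 0.884)]
tbs_pairs = [(1.158, 1.224), (1.310, 1.290), (1.250, 1.275), (1.200, 1.170)]

t_stat, dof = paired_t_statistic(per_patient_cv(tbs_pairs), per_patient_cv(bmd_pairs))
print(t_stat, dof)
```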

The comparison among the reproducibility of the three scan modes was performed for both BMD and TBS by calculating the standard deviation distribution of the two measurements obtained for each patient. The three resulting distributions were tested using one-way ANOVA, after verification of variance homogeneity (Levene test). Calculations were performed using SPSS v20 (SPSS Inc., Chicago, IL, USA). A p-value lower than 0.05 was considered as statistically significant.

Results

Study population

A total of 90 patients (30 for each scan mode) were enrolled, with a mean age ± standard deviation of 62 ± 10 years. Details are reported in Table 2. The three groups were not significantly different for age (p = 0.820) and T-score (p = 0.833). They differed for BMI (p = 0.007) and TBS (p < 0.001): mean BMI of the group studied with high definition mode was slightly higher (24 kg/m2) than that of the group studied with fast-array (23 kg/m2) or array mode (24 kg/m2); mean TBS for the group studied with high definition mode (1.216 mm-1) was slightly lower than that of the group studied with fast-array (1.315 mm-1) or array mode (1.258 mm-1). The classification for osteoporosis, osteopenia, and normal condition did not differ among the three groups (p = 0.979).

Table 2 Characteristics of three groups of patients who underwent DXA twice using the same scan mode

Short-term precision assessment

The LSC of TBS ranged from 5.2 % to 5.4 %, more than twice that observed for BMD (from 2.0 % to 2.1 %). As a consequence, the reproducibility of TBS (range 94.6–94.8 %) was significantly lower (p ≤ 0.011) than that of BMD (range 97.9–98.0 %). The difference in reproducibility among the three scan modes was not significant for either BMD (p = 0.942) or TBS (p = 0.942). An example is shown in Fig. 1. Full results of short-term precision assessment, including CoV data, are reported in Table 3.

Fig. 1 Comparison between bone mineral density (BMD) and trabecular bone score (TBS) in a patient who underwent dual-energy X-ray absorptiometry twice. Panel A shows results of the first scan with an L1-L4 BMD of 0.803 g/cm2 and an L1-L4 TBS of 1.158 mm-1. Panel B shows results of the second scan, after patient repositioning: L1-L4 BMD 0.807 g/cm2 and L1-L4 TBS 1.224 mm-1

Table 3 Short-term precision assessment of BMD and TBS in three independent groups of 30 patients for three scan modes

Discussion

Bone mineral density is a determinant of bone fragility and its measurement using DXA has been universally adopted as a standard of care for the clinical diagnosis of osteoporosis [16]. Of note, DXA suffers from limitations due to its technical concepts, as outlined by Bolotin [17]. In fact, even when BMD is in the normal range, the occurrence of one or more low-impact fragility fractures is considered a sign of severe osteoporosis [18]. For that reason, when assessing fracture risk, data other than BMD are taken into account, such as age, history of previous or parental fractures, presence of diseases inducing secondary osteoporosis, and corticosteroid therapy. These factors are accounted for by FRAX®, which is a computer-based algorithm that estimates individual 10-year fracture probability [19]. Since lumbar spine TBS holds promise as an emerging clinical tool for diagnosing osteoporosis and assessing fracture risk [20], its result might be considered as an additional risk factor.

The main finding of our work is that the in vivo reproducibility of TBS was significantly lower than that of BMD on lumbar spine DXA examinations. Five previous papers tested TBS and obtained CoV values ranging from 1.1 % to 2.1 % [9–13], higher than those of BMD. Our results are at the upper limit of the range reported in the literature in terms of CoV and confirm that TBS is less reproducible than BMD. To our knowledge, our study is the first systematic in vivo experience estimating TBS reproducibility in which the ISCD 2013 guidelines are applied and methodologically described.

We demonstrated that the LSC of TBS is more than two-fold that of BMD. In the review by Silva et al. [20], the authors explained that the impact of osteoporosis therapy on TBS is generally smaller in magnitude than that on BMD, ranging from 1.1 % (patients treated with zoledronic acid) to 4.3 % at 24 months (patients treated with teriparatide). Thus, even at best, therapy-induced changes in TBS are smaller than the statistical variation we estimated (LSC from 5.2 % to 5.4 %). At any rate, data are lacking about the natural evolution of TBS, and its role in monitoring treated or untreated osteoporosis is unclear. If the TBS rate of change is shown to be slower than that of BMD, the evaluation of TBS variations over time may need a longer time interval between two consecutive DXA examinations. The review by Silva et al. illustrated the potential utility of TBS as a clinical tool [20]. TBS decreases with age and appears to reflect qualitative aspects of skeletal structure that are complementary to BMD data. TBS may improve fracture risk discrimination over DXA alone. However, whether serial changes in TBS have clinical value in both treated and untreated patients remains to be demonstrated [20].
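The argument above amounts to checking whether an observed percentage change reaches the LSC. A minimal sketch of that check follows, using the therapy-induced TBS changes quoted from the review and the lowest TBS LSC found in this study; the function name `exceeds_lsc` is hypothetical, and treating a change equal to the LSC as significant is an assumption of this sketch.

```python
def exceeds_lsc(change_pct, lsc_pct):
    """True if the absolute percentage change is at least the LSC (both in %)."""
    return abs(change_pct) >= lsc_pct


# Lowest LSC of TBS estimated in this study (%).
tbs_lsc_pct = 5.2

# Therapy-induced TBS changes (%) quoted in the text from the review [20].
reported_changes = {"zoledronic acid": 1.1, "teriparatide (24 months)": 4.3}

for therapy, change in reported_changes.items():
    print(therapy, exceeds_lsc(change, tbs_lsc_pct))
```

With these inputs neither change reaches the LSC, which is the point made in the paragraph above.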

The lack of differences between the reproducibility of TBS in different scan modes implies that the three scan modes may be considered interchangeable. Of note, as per the official guidelines [8], once a scan mode is applied to a patient, all subsequent follow-up examinations should be performed using the same mode. Thus, although DXA has a very short imaging time and a nearly negligible radiation dose [21], fast-array imaging mode could be preferred, being quicker and delivering a lower radiation dose.

Our study has two main limitations. First, our reproducibility data are only valid for the studied densitometer and may not be directly transferable to different densitometers or to operators with different experience and training. Second, taking into account the ISCD 2013 guidelines, patients were only partially randomized to the three groups, resulting in inhomogeneity in terms of BMI and TBS, although the groups were still homogeneous in terms of age, T-scores, and clinical classification.

In conclusion, our work demonstrated that the in vivo reproducibility of the TBS was lower than that of BMD. Differences between the imaging modes were not significant in terms of TBS and BMD, so fast-array imaging mode may be preferred.