Introduction

Osteoporosis is a skeletal disorder that leads to an increased risk of fragility fractures [1]. Fracture risk assessment based on bone mineral density (BMD) derived from dual-energy x-ray absorptiometry (DXA) often fails because it lacks information on 3D structure and local bone mass distribution [2]. Quantitative measures derived from multi-detector computed tomography (MDCT) using texture analysis and finite element analysis (FEA) are becoming useful in identifying patients at risk of osteoporotic fractures. However, in routine clinical settings, MDCT images are frequently acquired with intravenous contrast medium (IVCM) to enhance image contrast. The effect of such acquisition variations on quantitative bone microstructure parameters is not yet clear [3,4,5,6]. Furthermore, the long-term reproducibility of these quantitative measures in MDCT images acquired during routine clinical procedures has not been documented.

Texture-derived measures are highly sensitive to the orientation, size, and intensity of voxels. Studies have shown that texture-derived features and structural indices, including the trabecular bone score (TBS), are able to quantify trabecular bone microarchitecture and fracture risk [7]. Uniformity, heterogeneity, and the appearance of repetitive patterns in an image are useful in characterizing morphological properties of the underlying tissue using texture-derived measures [8]. Similarly, texture features were used to analyse the effect of radiation dose reduction and reconstruction algorithms for MDCT imaging [3, 9]. Moreover, texture features were used to study the microarchitecture characteristics of trabecular bone from high-resolution peripheral quantitative computed tomography (HR-pQCT) images [10]. However, the effects of image acquisition parameters, namely IVCM and slice thickness, and the long-term scan-rescan reproducibility [11] have not been studied adequately using texture features [12].

In this study, we systematically investigated the reproducibility of texture features derived from MDCT images (i) with and without IVCM, (ii) with different slice thicknesses, and (iii) in scan-rescan protocols, as well as (iv) the ability of gray-level co-occurrence matrix (GLCM) texture features to discriminate between subjects with and without osteoporotic fractures.

Materials and methods

The proposed methodology is grouped into four parts: MDCT data acquisition, image analysis, texture data analysis, and classification of the healthy/fracture cohort (Fig. 1). The MDCT images were acquired with and without IVCM, with different sagittal slice thicknesses (1, 2, and 3 mm), as scan-rescan pairs within 8 weeks, and for a healthy/osteoporotic fracture cohort. Vertebrae were manually delineated using the open-source Medical Imaging Interaction Toolkit (MITK) [13]. The uniform gray level quantization and texture analysis were performed using MATLAB (The MathWorks Inc., Natick, MA). The statistical analysis was performed using the Pearson correlation coefficient and Bland-Altman analysis for the with and without IVCM, slice thickness, and scan-rescan texture data. The independent sample t test was performed to rank the features by t value, and support vector machine (SVM) classifiers with different kernel functions were used to classify the healthy/fracture cohort.

Fig. 1

Schematic representation of proposed methodology grouped in MDCT data acquisition, image analysis, texture data analysis, and healthy/fracture cohort classification

MDCT imaging and vertebrae segmentation

This retrospective study was approved by the local institutional review board. Subjects were identified in our institution’s digital image archive (PACS). Patients with pathological bone changes such as bone metastases or hematological or metabolic bone disorders other than osteoporosis were excluded. Subgroup for the investigation of different slice thickness and contrast medium effects: seven subjects (five males, two females; mean age: 71.86 ± 7.40 years) with a non-contrast-enhanced MDCT scan immediately followed by a contrast-enhanced MDCT scan were included. All MDCT exams were performed with a 64-row MDCT scanner (Somatom Sensation Cardiac 64, Siemens Medical Solutions, Erlangen, Germany). Routine abdominal non-contrast-enhanced MDCT images were obtained with a standard protocol. Scanning parameters were 120 kVp tube voltage, an adapted tube load averaging 200 mAs, and minimum collimation (0.6 mm). Sagittal reformations of the spine were reconstructed with slice thicknesses of 1, 2, and 3 mm. Sagittal reformations of the spine with a slice thickness of 3 mm are the standard reconstruction in clinical routine [14]. Examinations with intravenous contrast medium were performed in a standardized way: the contrast medium (Imeron 400, Bracco, Konstanz, Germany) was administered using a high-pressure injector (Fresenius Pilot C, Fresenius Kabi, Bad Homburg, Germany). The intravenous contrast medium injection was performed with a delay of 70 s, a flow rate of 3 ml/s, and a body-weight-dependent dose (80 ml for body weight up to 80 kg, 90 ml for body weight up to 100 kg, and 100 ml for body weight over 100 kg). Manual segmentations of all depicted vertebrae (T11, T12, and L1–L3) were performed with MITK (n = 45). Routine abdominal contrast-enhanced MDCT images were obtained with the standard protocol outlined above. Sagittal reformations of the spine were reconstructed with a slice thickness of 3 mm.

Subgroup for the investigation of long-term reproducibility: nine subjects (five males, three females; mean age: 59.56 ± 9.44 years) with two contrast-enhanced MDCT scans within 8 weeks were identified. All MDCT exams were performed with a 64-row MDCT scanner (Somatom Sensation Cardiac 64, Siemens Medical Solutions, Erlangen, Germany). Manual segmentations of all depicted vertebrae (T1–T12 and L1–L5) were performed with MITK (n = 130).

Subgroup of subjects with/without vertebral fractures: Lastly, nine subjects (four males, five females; mean age: 75.44 ± 10.19 years) with contrast-enhanced MDCT and an osteoporotic vertebral fracture and age-matched controls (four males, five females; mean age: 71.44 ± 10.05 years) without vertebral fracture were included. MDCT scans were acquired using a 256-row scanner (iCT, Philips Healthcare, Best, the Netherlands). Intravenous contrast medium was applied as stated above. Sagittal reformations of the spine were reconstructed with a slice thickness of 3 mm. The presence of vertebral fractures was determined by a radiologist. Manual segmentations of all depicted vertebrae (L1–L4) were performed with MITK (n = 27) (Fig. 2).

Fig. 2

Sagittal reformations of MDCT images a with and without IVCM, b with different slice thickness, c scan and rescan, and d healthy/fracture cohort (vertebral fracture of L2 marked by a red arrow)

Texture analysis

The gray level quantization was performed on each segmented vertebra using the highest gray level present in the image. GLCM texture analysis was performed on the quantized images to extract second-order texture features, namely energy, contrast, correlation, homogeneity, dissimilarity, entropy, variance, and sum average [15]. These features quantify textural patterns (fine, coarse, smooth, or irregular) in an image. The GLCM is derived from the co-occurrence of voxel intensity pairs at a given distance d = (dx, dy, dz) and direction θ = (0°, 45°, 90°, and 135°) [15, 16], where dx and dy denote the voxel offsets within the image plane and dz denotes the offset along the z axis. Each entry GLCM(x, y), with x, y = 0, 1, …, N_g − 1, denotes the probability of co-occurrence of the gray levels x and y, computed from the 26 neighbors of each voxel. N_g is the number of gray levels that can be encoded in an image. Hence, the dimension of the GLCM is N_g × N_g [15, 16].
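The quantization and co-occurrence steps described above can be illustrated with a minimal Python sketch. It is not the MATLAB implementation used in this study: it works on a single 2D slice with one offset rather than on 26 neighbors in 3D, it assumes non-negative intensities, and the function names (quantize, glcm, glcm_features) are hypothetical. The feature formulas follow common Haralick-style definitions; the entropy log base and the definition of energy as the sum of squared entries are choices made for this sketch.

```python
import math


def quantize(image, levels):
    """Uniformly quantize non-negative intensities to integers in [0, levels - 1],
    scaled by the highest gray level present in the image."""
    max_val = max(max(row) for row in image)
    return [[min(levels - 1, int(v * levels / (max_val + 1))) for v in row]
            for row in image]


def glcm(q, levels, dx, dy, symmetric=True):
    """Normalized co-occurrence matrix for one offset.
    dx is the column offset and dy is the row offset (2D simplification)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(q), len(q[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = q[r][c], q[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """A subset of second-order features computed from a normalized GLCM."""
    feats = {"energy": 0.0, "contrast": 0.0, "homogeneity": 0.0,
             "dissimilarity": 0.0, "entropy": 0.0}
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            feats["energy"] += v * v                       # sum of squared entries
            feats["contrast"] += (i - j) ** 2 * v
            feats["homogeneity"] += v / (1 + (i - j) ** 2)
            feats["dissimilarity"] += abs(i - j) * v
            if v > 0:
                feats["entropy"] -= v * math.log2(v)       # base-2 log (assumed)
    return feats
```

For example, glcm_features(glcm(quantize(slice_2d, 16), 16, dx=1, dy=0)) would return the features for the 0° direction at a distance of one voxel, where slice_2d is a list of rows and 16 gray levels is an arbitrary choice.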

Healthy/fracture classification

SVM classifiers with linear, polynomial, and radial basis function (RBF) kernels were used to discriminate the healthy/osteoporotic fracture cohort. An SVM constructs a hyperplane in a high-dimensional feature space that separates the healthy and fracture data points to obtain the best possible solution [17].
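For orientation, the three kernel functions named above can be written in their commonly used forms as in the following Python sketch. The hyperparameters (degree, gamma, coef0) and their default values are assumptions made for illustration; the settings used in this study are not reported here.

```python
import math


def linear_kernel(x, y):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """(gamma * <x, y> + coef0) ** degree; parameter values are assumed."""
    return (gamma * linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    """exp(-gamma * ||x - y||^2); gamma is an assumed value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)
```

The RBF kernel equals 1 for identical feature vectors and decays towards 0 as they move apart, which is what allows a non-linear decision boundary in the original feature space.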

Statistical analysis

The texture analysis yielded a feature matrix of 45 × 8 for the contrast (IVCM) subgroup, 55 × 8 for slice thickness, 200 × 8 for scan-rescan, and 54 × 8 for the healthy/fracture cohort. In this study, each vertebra was considered a single specimen, and the texture data were analysed using the Pearson correlation coefficient and Bland-Altman plots. The following hypothesis was tested for the Pearson correlation.

Null hypothesis H0: There is no significant linear relationship between the texture features obtained with and without IVCM, with different slice thicknesses (1 mm versus 2 mm and 1 mm versus 3 mm), or from scan and rescan (i.e., the Pearson correlation coefficient r = 0).
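As a worked illustration of this test, the Python sketch below computes the Pearson correlation coefficient for paired measurements and the associated t statistic, t = r √((n − 2)/(1 − r²)), which under H0 follows a t distribution with n − 2 degrees of freedom. The function names are hypothetical, and the p value itself (which requires the t distribution) is left out of the sketch.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic_for_r(r, n):
    """t statistic for testing H0: r = 0 with n paired observations."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)  # perfect correlation
    return r * math.sqrt((n - 2) / (1 - r * r))
```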

The Bland-Altman (mean-difference or limits of agreement) plot is used to compare two measurements of the same variable and to assess the repeatability of the measurement system. The limits of agreement between two measurements are given by the mean difference ± 1.96 SD of the differences, an interval expected to contain about 95% of the differences [18].
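The quantities plotted in a Bland-Altman analysis can be computed as in the following minimal Python sketch. The function name bland_altman and the returned dictionary keys are hypothetical, and the sample standard deviation (n − 1 denominator) is assumed.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Bias and 95% limits of agreement for paired measurements a and b."""
    diffs = [x - y for x, y in zip(a, b)]          # plotted on the y axis
    means = [(x + y) / 2 for x, y in zip(a, b)]    # plotted on the x axis
    bias = mean(diffs)
    sd = stdev(diffs)                              # sample SD of the differences
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "loa_lower": bias - 1.96 * sd,
        "loa_upper": bias + 1.96 * sd,
    }
```

A narrow interval between loa_lower and loa_upper together with a bias close to zero corresponds to good agreement between the two measurements.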

The independent sample t test is used to compare the means of two groups (healthy/fracture) using the following hypothesis testing.

Null hypothesis H0: μ1 = μ2 (no significant difference between means of texture features of trabecular bone in healthy/fracture cohort)
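The feature ranking based on this test can be sketched in Python as follows. The sketch assumes the pooled-variance (Student) form of the independent sample t test; whether equal variances were assumed in this study is not stated. The function names and the dictionary-based input layout are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Independent sample t statistic with pooled variance (assumed form)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("both groups are constant; t is undefined")
    return (mean(group_a) - mean(group_b)) / se


def rank_features(healthy, fracture):
    """Rank feature names by |t|, largest first.
    healthy and fracture map each feature name to a list of values."""
    scores = {name: abs(pooled_t_statistic(healthy[name], fracture[name]))
              for name in healthy}
    return sorted(scores, key=scores.get, reverse=True)
```

Features at the top of the returned list separate the two group means most strongly relative to their pooled spread and would be fed to the classifier first.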

Results

Investigating the effect of IVCM on trabecular bone texture analysis

Significant correlations were obtained for energy (r = 0.88, p < 0.0001), entropy (r = 0.88, p < 0.0001), homogeneity (r = 0.88, p < 0.0001), and variance (r = 0.91, p < 0.0001) between with and without IVCM (Fig. 3a, e, g, and m). The Bland-Altman plot quantifies bias and range of agreement. Limits of agreement (LOA) were narrow for energy (− 0.00 to 0.00), entropy (− 0.04 to 0.05), homogeneity (− 0.00 to 0.00), and variance (− 0.00 to 0.00) (Fig. 3b, f, h, and n). Contrast (r = 0.87, p < 0.0001), correlation (r = 0.87, p < 0.0001), sum average (r = 0.84, p < 0.0001), and dissimilarity (r = 0.86, p < 0.0001) also showed significant correlation (Fig. 3c, i, k, and o). However, in the Bland-Altman plots, contrast (− 170 to 170) and dissimilarity (− 0.29 to 0.36) showed wider LOA, whereas sum average (− 0.00 to 0.00) and correlation (− 0.04 to 0.05) showed narrow LOA but larger bias (Fig. 3d, j, l, and p).

Fig. 3

Correlation between texture features with and without IVCM: a energy, c contrast, e entropy, g homogeneity, i correlation, k sum average, m variance, and o dissimilarity, and Bland-Altman plots of b energy, d contrast, f entropy, h homogeneity, j correlation, l sum average, n variance, and p dissimilarity depicting the mean of the values with and without IVCM versus their difference. In the correlation plots, the dotted (-.-.-) line indicates the slope of the least squares line, the solid line (–) adjacent to the slope of the least squares line indicates the line of best fit, and the other two solid lines (–) indicate the upper and lower confidence interval (95%) limits. In the Bland-Altman plots, the horizontal solid line (–) represents the mean value, and the two dotted lines (-.-.-) indicate 95% confidence intervals at ± 1.96 SD

Investigating the effect of slice thickness on trabecular bone texture analysis

The correlations of energy (r = 0.97, p < 0.0001), entropy (r = 0.97, p < 0.0001), and homogeneity (r = 0.97, p < 0.0001) revealed a coherent linear dependency (Fig. 4a, e, and g), and excellent agreement was observed in the Bland-Altman plots for energy (− 0.00 to 0.00), entropy (− 0.02 to 0.03), homogeneity (− 0.00 to 0.00), correlation (− 0.11 to 0.00), sum average (− 0.00 to 0.00), and variance (− 0.00 to 0.00) (Fig. 4b, f, h, l, and n) between sagittal slice thicknesses of 1 and 2 mm. Contrast (r = 0.92, p < 0.0001), sum average (r = 0.93, p < 0.0001), variance (r = 0.96, p < 0.0001), and dissimilarity (r = 0.92, p < 0.0001) also showed significant correlation in the regression fits (Fig. 4c, k, m, and o). However, the feature correlation obtained an r value of 0.81 (p < 0.0001) and moderate agreement (Fig. 4i and j). Contrast (− 150 to 290) and dissimilarity (− 0.21 to 0.49) showed minimal agreement in the Bland-Altman plots (Fig. 4d and p).

Fig. 4

Correlation between 1- and 2-mm slice thickness texture features a energy, c contrast, e entropy, g homogeneity, i correlation, k sum average, m variance, and o dissimilarity, and Bland-Altman plots of b energy, d contrast, f entropy, h homogeneity, j correlation, l sum average, n variance, and p dissimilarity depicting the mean of 1 and 2 mm versus the difference between 1 and 2 mm. In the correlation plots, the dotted (-.-.-) line indicates the slope of the least squares line, the solid line (–) adjacent to the slope of the least squares line indicates the line of best fit, and the other two solid lines (–) indicate the upper and lower confidence interval (95%) limits. In the Bland-Altman plots, the horizontal solid line (–) represents the mean value, and the two dotted lines (-.-.-) indicate 95% confidence intervals at ± 1.96 SD

The texture features contrast (p < 0.0001), sum average (p < 0.0001), variance (p < 0.0001), and dissimilarity (p < 0.0001) showed high correlation (r value range, 0.91 to 0.97; Fig. 5c, k, m, and o) and good agreement (Fig. 5d, l, n, and p), except contrast (LOA: − 66 to 460), between slice thicknesses of 1 and 3 mm. Similarly, energy (p < 0.0001), entropy (p < 0.0001), homogeneity (p < 0.0001), and correlation (p < 0.0001) showed significant correlation, with r values ranging from 0.88 to 0.89, and good agreement in the Bland-Altman plots (Fig. 5a and b, e and f, g and h, and i and j). The LOA were as follows: energy (− 0.01 to 0.00), entropy (− 0.03 to 0.07), homogeneity (− 0.00 to 0.00), correlation (− 0.16 to 0.07), sum average (− 0.00 to 0.00), variance (− 0.00 to 0.00), and dissimilarity (− 0.05 to 0.85). However, the line of best fit did not follow the slope of the least squares line.

Fig. 5

Correlation between 1- and 3-mm sagittal slice thickness texture features a energy, c contrast, e entropy, g homogeneity, i correlation, k sum average, m variance, and o dissimilarity, and Bland-Altman plots of b energy, d contrast, f entropy, h homogeneity, j correlation, l sum average, n variance, and p dissimilarity depicting the mean of 1 and 3 mm versus the difference between 1 and 3 mm. In the correlation plots, the dotted (-.-.-) line indicates the slope of the least squares line, the solid line (–) adjacent to the slope of the least squares line indicates the line of best fit, and the other two solid lines (–) indicate the upper and lower confidence interval (95%) limits. In the Bland-Altman plots, the horizontal solid line (–) represents the mean value, and the two dotted lines (-.-.-) indicate 95% confidence intervals at ± 1.96 SD

Investigating the long-term reproducibility of trabecular bone texture analysis

The scan-rescan texture features showed significant correlation (p < 0.05), with r values ranging from 0.47 to 0.59 (Fig. 6). The Bland-Altman plots showed wider limits of agreement than those observed for the effects of IVCM and different slice thickness (Fig. 6).

Fig. 6

Correlation between scan and rescan texture features a energy, c contrast, e entropy, g homogeneity, i correlation, k sum average, m variance, and o dissimilarity, and Bland-Altman plots of b energy, d contrast, f entropy, h homogeneity, j correlation, l sum average, n variance, and p dissimilarity depicting the mean of scan and rescan versus the difference between scan and rescan. In the correlation plots, the dotted (-.-.-) line indicates the slope of the least squares line, the solid line (–) adjacent to the slope of the least squares line indicates the line of best fit, and the other two solid lines (–) indicate the upper and lower confidence interval (95%) limits. In the Bland-Altman plots, the horizontal solid line (–) represents the mean value, and the two dotted lines (-.-.-) indicate 95% confidence intervals at ± 1.96 SD

Investigating the discrimination ability of trabecular bone texture analysis for healthy/fracture cohort

The box plots of the texture features show a distinct discrimination between the healthy and fracture cohorts. The box plots are available in supplementary material Fig. S1. The texture features of trabecular bone in the healthy/fracture cohort were ranked using the t test and sequentially fed to the SVM classifier. The entire data set was divided into roughly ten equal parts, of which nine were used for classifier development and one was used to evaluate classifier performance. This process was repeated 10 times, and average performance measures such as sensitivity, specificity, and accuracy were calculated. The RBF kernel obtained the highest sensitivity of 93.33%, specificity of 79.33%, and accuracy of 83% among the different kernel functions. The detailed results are available in supplementary material Table S2.
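The tenfold evaluation scheme and the performance measures described above can be sketched in Python as follows. This is an illustrative sketch, not the code used in this study: the function names are hypothetical, the folds are contiguous (in practice the samples would typically be shuffled first), and label 1 is assumed to denote the fracture class.

```python
def k_fold_indices(n_samples, k=10):
    """Split indices 0..n_samples-1 into k roughly equal, contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def binary_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity, and accuracy from true and predicted labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
        "accuracy": (tp + tn) / len(y_true) if y_true else nan,
    }
```

In each of the ten rounds, one fold serves as the test set and the remaining nine as the training set; the three measures are then averaged over the rounds.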

Discussion

In this study, we investigated the reproducibility of texture features from MDCT images with and without IVCM and with different slice thicknesses (1, 2, and 3 mm), assessed their long-term reproducibility (scan-rescan), and demonstrated the classification of a healthy/osteoporotic fracture cohort using texture features.

Energy, entropy, homogeneity, and variance showed significant correlation (p < 0.0001) and narrow limits of agreement between with and without IVCM (Fig. 3). Energy quantifies orderliness, entropy measures the heterogeneity of the GLCM elements, homogeneity measures how closely the distribution of GLCM elements lies to the GLCM diagonal, and variance, similar to entropy, measures the dispersion of the difference between the reference and the neighbouring voxels [15, 19]. The significant correlation of these features revealed that IVCM does not affect the orderliness, randomness, and distribution of voxels in an image. Thus, these texture parameters can be used for opportunistic osteoporosis screening in contrast-enhanced routine MDCT. Contrast, correlation, sum average, and dissimilarity also showed significant correlation (p < 0.0001). However, these features showed wider limits of agreement and larger bias in the Bland-Altman plots because of the increased attenuation values caused by IVCM. Contrast reflects the sharpness of images and the depth of texture grooves. High contrast leads to better visual sharpness and deeper texture grooves [15, 19]. A large number of voxels with large differences in gray level leads to a higher contrast value, whereas dissimilarity measures the texture heterogeneity [15, 19]. Hence, the wider limits of agreement for contrast and dissimilarity revealed that IVCM enhances the image contrast. The large bias for correlation and sum average in the Bland-Altman plots shows a minimal change of consistency in image texture with IVCM. Thus, these parameters are less suitable for opportunistic osteoporosis screening in contrast-enhanced MDCT.

Energy, entropy, and homogeneity showed high correlation (p < 0.0001) and excellent agreement between 1- and 2-mm slice thickness. Contrast, sum average, variance, and dissimilarity also showed significant correlation; however, contrast and dissimilarity showed wider limits of agreement and larger bias (Fig. 4). This revealed that the increase in slice thickness does not affect the uniformity of voxels. Further, the texture features between 1- and 3-mm slice thickness showed high correlation (p < 0.0001) and good agreement for contrast, sum average, variance, and dissimilarity. Energy, entropy, homogeneity, and correlation showed significant correlation (p < 0.0001), but wider limits of agreement and bias (Fig. 5). The correlation coefficient (r) was lower than for 1 versus 2 mm slice thickness. Thus, energy, entropy, and homogeneity can be used for opportunistic osteoporosis screening in sagittal reformations of routine MDCT with a slice thickness of up to 3 mm.

The long-term reproducibility of scan-rescan MDCT images was evaluated using texture features. All extracted features showed significant correlation (p < 0.0001). The Bland-Altman plots showed wider limits of agreement than those observed for the effects of IVCM and different slice thickness (Fig. 6). The bias could be due to a change in the field of view resulting in different slice thickness [11]. Thus, opportunistic osteoporosis screening is feasible in contrast-enhanced routine MDCT, but accidental changes of the field of view have to be taken into account for longitudinal osteoporosis assessment, e.g., therapy monitoring.

The need for classifiers in disease screening is gradually increasing, since large amounts of data are generated in routine clinical examinations and research studies. Pattern recognition techniques help to understand disease patterns and to make useful predictions [20]. Among the various classification techniques, SVM provides optimum results in noisy and complex domains [20]. This statistical learning method works on the principle of structural risk minimization. An SVM constructs a hyperplane in a high-dimensional feature space that separates the data to obtain the best possible solution [17]. Texture-based quantitative features are used to characterise uniformity, randomness, and repetitive patterns in an image [21] and have been used in detecting pathologies and malignancies [22,23,24]. GLCM features are a widely used texture quantification method [16, 25, 26]. In this work, we used SVM classifiers with linear, polynomial, and RBF kernels to discriminate the healthy/osteoporotic fracture cohort using GLCM features and obtained an average classification accuracy of 83%. The quantitative measurements could adequately discriminate subjects with and without vertebral fractures. A limitation of our study is that the correlation between texture features and BMD was not evaluated, since DXA and quantitative computed tomography (QCT) data were not obtained for the study subjects.

In summary, this study showed that specific texture features could be reliably extracted from sagittal reformations of up to 3 mm slice thickness in routine MDCT scans with IVCM. Moreover, the long-term scan-rescan reproducibility may be clinically acceptable. Lastly, routine MDCT images with IVCM can be used for downstream applications such as fracture risk prediction using SVM.