Introduction

Fractures of the proximal femur are common injuries in the elderly. Since the percentage of elderly people in industrialized countries is rising, this pathology will result in a significant healthcare burden in the years to come [1–3].

Surgical fixation of geriatric hip fractures is frequently compromised by reduced bone quality, which may lead to insufficient implant anchorage followed by cut-out and failure of fixation [4, 5]. In patients with severe osteoporosis, additional procedures such as bone cement augmentation may become necessary to improve implant anchorage in the femoral head [6–9]. Therefore, in daily clinical practice, information regarding the patient's bone mineral density (BMD) may be of importance when planning surgical fixation of acute geriatric hip fractures.

In 2007, Sah et al. [10] proposed the cortical thickness index (CTI) [11] as an easily applicable and valid tool to assess the bone density of the proximal femur on the basis of plain hip X-rays. In their study, they evaluated the CTI in 32 female patients with a mean age of 67 years who had been scheduled for total hip replacement for advanced osteoarthritis (OA) and found a significant correlation between CTI and BMD [10]. In the opinion of the authors, this patient collective does not represent typical geriatric patients, who usually present with low-energy fractures of the proximal femur. In addition, a recent study has shown that patients with advanced OA may have higher BMD compared to patients without OA [12]. Information regarding the inter- and intraobserver reliability of the CTI remains sparse in the literature.

The purpose of this study was to assess the inter- and intraobserver reliability of the CTI and to evaluate the correlation between CTI and BMD at different areas of the proximal femur in a group of geriatric patients.

This study hypothesized that the CTI may have sufficient reliability for daily use in clinical practice and that it may show a correlation with the BMD of the proximal femur in geriatric patients.

Materials and methods

In our department, all patients admitted for hip pain after a fall undergo standardized native ap pelvis and lateral hip X-rays for fracture diagnostics. In addition, patients older than 65 years are included in our in-house OsteoFit program. OsteoFit is an internal interdisciplinary project that aims to prevent falls and low-energy fractures in elderly people. This includes evaluation of the BMD of the proximal femur by dual energy X-ray absorptiometry (DEXA).

DEXA is routinely performed at the femoral neck, the intertrochanteric region, the major trochanter, the Ward triangle, and the overall proximal femur using a Discovery QDR machine (Discovery QDR, Hologic, USA).

Since patients who sustain hip fractures are usually treated either by intramedullary nailing or by hemi-/total hip replacement, DEXA of the affected hip is frequently compromised by metal hardware. Therefore, measurement of BMD is routinely performed at the unaffected contralateral hip.

From this group of patients, a total of 60 consecutive patients treated between 2010 and 2013 met the inclusion criteria and were retrospectively included in the study.

Inclusion criteria were an age of more than 65 years, a history of falling on the hip, the ability to walk before falling and a measurement of BMD by DEXA within 6 months after the accident.

Exclusion criteria were prior hip surgery, congenital hip pathologies (dysplasia, Perthes disease, epiphysiolysis, etc.), radiological signs of osteoarthritis exceeding grade 2 according to Kellgren and Lawrence [13], metabolic diseases affecting bone metabolism, prior diagnosis of osteoporosis or medication therapy for osteoporosis, and neurological pathologies that could affect physiological loading of the hip.

Three subgroups of patients were included: 20 patients with pertrochanteric fractures, 20 patients with femoral neck fractures and 20 patients who had a hip contusion but did not sustain a fracture.

Measurement of inter- and intraobserver reliability

Plain ap pelvis and axial hip X-rays of all patients were obtained. The CTI was measured according to the recommendations of Sah et al. [10] (Fig. 1a, b). X-rays were evaluated digitally using a picture archiving and communication system (PACS).

Fig. 1 The ap and lateral cortical thickness indices (CTI) are measured 10 cm distally to the minor trochanter as the quotient of femoral diaphysis width (DW) minus intramedullary width (FW) divided by diaphysis width (DW); CTI = (DW − FW)/DW
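The formula in the caption of Fig. 1 can be written as a short function. This is a minimal sketch only: the function name `cortical_thickness_index` and the example widths are hypothetical and are not taken from the study data.

```python
def cortical_thickness_index(dw: float, fw: float) -> float:
    """Return CTI = (DW - FW) / DW.

    dw: femoral diaphysis width, fw: intramedullary width, both measured
    10 cm distally to the minor trochanter and in the same unit (e.g. mm).
    """
    if dw <= 0:
        raise ValueError("diaphysis width must be positive")
    if fw < 0 or fw > dw:
        raise ValueError("intramedullary width must lie between 0 and DW")
    return (dw - fw) / dw


# Hypothetical example: DW = 30 mm, FW = 15 mm gives a CTI of 0.5.
example_cti = cortical_thickness_index(30.0, 15.0)
```

Because the CTI is a ratio of two widths from the same image, it is dimensionless, provided both widths are measured in the same unit.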

The ap CTI was measured on the healthy contralateral femur and the lateral CTI was measured on the injured femur since axial X-rays of the healthy contralateral side were not routinely available. Four independent, blinded observers performed the measurements in randomized order. None of the observers had been involved in the patients’ treatment.

To account for any possible impact of the observer's experience on the measurement, we chose four observers with different levels of clinical training: one medical student, one first-year orthopaedic surgery resident, one sixth-year orthopaedic surgery resident and one consultant for orthopaedic and trauma surgery with several years of experience in treatment and diagnostics of proximal femoral fractures.

Prior to the study, three sets of radiographs (one pertrochanteric fracture, one femoral neck fracture and one set of radiographs showing no fracture) were randomly selected from the collective sample. The observers were instructed on the nuances and subtleties of CTI measurement by reviewing these three sets of radiographs as a group. Before the reading, these X-rays were put back into the collective sample in randomized order.

The reading was repeated 4 weeks later to assess intraobserver reliability.

The order of the radiographs was randomized again to prevent possible recollection of the previous viewing. The observers were not provided with any feedback and the X-rays were not available to them between the readings.

Correlation between CTI and BMD

The mean CTI for each patient was calculated from the eight single CTI values measured during reliability testing. All patients had measurement of BMD by DEXA at the femoral neck (FN), the major trochanter (MT), the intertrochanteric region (IT) and the Ward triangle (WT) of the contralateral uninjured femur. Additionally, overall BMD of the proximal femur was measured according to our standard protocol.

The mean CTI of each patient was correlated with the single BMD values (FN, MT, IT, WT) and the overall BMD.
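The averaging step described above can be sketched as follows, assuming that the eight single values per patient are the two readings of each of the four observers for one view. The function name and the example values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def mean_patient_cti(readings: Sequence[float]) -> float:
    """Average the eight single CTI values of one patient and one view.

    Assumption: readings holds 4 observers x 2 reading sessions.
    """
    if len(readings) != 8:
        raise ValueError("expected eight single CTI values per patient")
    return mean(readings)


# Hypothetical ap CTI readings of one patient (not study data).
ap_readings = [0.50, 0.52, 0.48, 0.50, 0.51, 0.49, 0.50, 0.50]
patient_mean_ap_cti = mean_patient_cti(ap_readings)
```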

Statistical analysis

An independent Student's t test was used to calculate statistical differences between continuous data such as patient age, mean ap and lateral CTI, and mean BMD at different areas of the proximal femur. The intraclass correlation coefficient (ICC) was calculated to evaluate inter- and intraobserver reliability of the CTI. The ICC was interpreted according to the recommendations of Landis and Koch [14], who defined a value of >0.8 as excellent, between 0.6 and 0.8 as good, between 0.4 and 0.6 as moderate and <0.4 as poor. To evaluate any possible statistical difference between single ICC values, we calculated the 95 % confidence intervals (95 % CI). According to the recommendations of Doornberg et al. [15], differences between single ICC values were considered significant when the 95 % confidence intervals did not overlap.
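The reliability analysis can be illustrated with a short sketch. The text does not state which form of the ICC was used, so the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is an assumption made here for illustration. The handling of values that fall exactly on a category boundary (0.4, 0.6, 0.8) is likewise an assumption, and all function names are hypothetical.

```python
from typing import Sequence, Tuple


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1) from a patients x observers table of measurements.

    ratings[i][j] is the value for patient i measured by observer j.
    """
    n, k = len(ratings), len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 rows and columns")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares: patients, observers, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def interpret_icc(icc: float) -> str:
    """Categories as given in the text; boundary handling is an assumption."""
    if icc > 0.8:
        return "excellent"
    if icc >= 0.6:
        return "good"
    if icc >= 0.4:
        return "moderate"
    return "poor"


def intervals_overlap(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True if two (lower, upper) confidence intervals overlap."""
    return a[0] <= b[1] and b[0] <= a[1]
```

Under this rule, two ICC values would be considered significantly different only when `intervals_overlap` returns `False` for their 95 % confidence intervals.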

A Pearson correlation analysis (r) was performed to evaluate the correlation between BMD and the corresponding CTI. A correlation coefficient r < 0.3 was considered to be a weak correlation, between 0.3 and 0.7 moderate and >0.7 high. The level of statistical significance was defined as p ≤ 0.05.
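The correlation analysis and its interpretation can be sketched as below. This is an illustration only: the function names are hypothetical, the classification is applied to the absolute value of r, and the assignment of the exact boundary values 0.3 and 0.7 to the moderate category is an assumption.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between paired samples x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def correlation_strength(r: float) -> str:
    """Weak (< 0.3), moderate (0.3 to 0.7) or high (> 0.7), as in the text."""
    magnitude = abs(r)
    if magnitude < 0.3:
        return "weak"
    if magnitude <= 0.7:
        return "moderate"
    return "high"
```

With these thresholds, a coefficient of r = 0.74 is classified as high and r = 0.67 as moderate.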

Based on the data published by Sah et al. [16], we performed a power analysis. A minimum sample size of 17 was calculated to detect differences of one standard deviation between mean CTI values with α = 0.05 and β = 0.05.

The minimum sample size to calculate a Pearson correlation coefficient of r = 0.7 with α = 0.05 and β = 0.05 was 20.

Furthermore, statistical power (1−β) was calculated for final results which were not significant (p > 0.05).

Results

The evaluated patient collective included 14 men and 46 women with a mean age of 73 years (range 65–88 years; SD 7.1 years).

No statistically significant difference regarding age was found between men (mean age 70 years, range 65–82 years; SD 5.9 years) and women (mean age 74 years, range 65–88 years; SD 7.2 years) (p = 0.08, power = 0.78). The mean ap CTI was 0.52 (range 0.36–0.61; SD 0.06) and the mean lateral CTI was 0.45 (range 0.26–0.55; SD 0.06) (Table 1). Patients who had not sustained a fracture when falling had significantly higher ap CTI values compared to patients with pertrochanteric fractures (p < 0.01) and femoral neck fractures (p = 0.02). No statistically significant difference was found between the mean ap CTI values of patients with pertrochanteric and femoral neck fractures (p = 0.65, power = 0.61).

Table 1 Mean ap and lateral CTI values (with standard deviation, SD) for patients with pertrochanteric fractures, femoral neck fractures and patients without a fracture

The mean lateral CTI was higher in patients without fractures compared to patients with pertrochanteric and femoral neck fractures, but the differences did not reach statistical significance (p = 0.08, power = 0.89; p = 0.07, power = 0.88). Likewise, no statistically significant difference regarding lateral CTI was found between patients with pertrochanteric and femoral neck fractures (p = 0.96, power = 0.48).

Overall BMD was 0.739 g/cm² (range 0.411–1.108 g/cm²; SD 0.13 g/cm²). Patients with femoral neck fractures had a significantly lower overall BMD compared to patients without fractures (p = 0.04). No statistically significant difference regarding BMD was found between patients with pertrochanteric fractures and femoral neck fractures (p = 0.102, power = 0.69) or between patients with pertrochanteric fractures and patients without fractures (p = 0.57, power = 0.58) (Table 2).

Table 2 Mean BMD (g/cm²) (with standard deviation, SD) measured at the femoral neck (FN), the major trochanter (MT), the intertrochanteric region (IT) and the Ward triangle (WT)

Eleven of the 60 patients (pertrochanteric fractures: n = 4; femoral neck fractures: n = 4; without fractures: n = 3) had T-scores of less than −2.5. According to the guidelines of the World Health Organisation (WHO) [17], this was considered manifest osteoporosis.

Inter- and intraobserver reliability

Interobserver reliability was good for the ap CTI (mean ICC 0.71, range 0.67–0.78; SD 0.04) and for the lateral CTI (mean ICC 0.65, range 0.55–0.76; SD 0.07) (Table 3).

Table 3 ICC values for the interobserver reliability between observers: 1 (medical student), 2 (1st-year resident), 3 (6th-year resident) and 4 (consultant)

Likewise, intraobserver reliability showed good ICC values for the ap CTI (mean ICC 0.79, range 0.71–0.88; SD 0.08) and for the lateral CTI (mean ICC 0.69, range 0.63–0.79; SD 0.07) (Table 4). All ICC values were statistically significant (p < 0.01). No significant differences were found between single ICC values of different observers or between different ICC values measured by the same observer (the 95 % CIs overlapped in all cases).

Table 4 ICC values for the intraobserver reliability of the observers: 1 (medical student), 2 (1st-year resident), 3 (6th-year resident) and 4 (consultant)

Correlation between CTI and BMD

Patients who had not sustained a fracture during the fall showed a high and significant correlation between overall BMD and ap CTI (r = 0.74, p < 0.01) and a moderate and significant correlation between overall BMD and lateral CTI (r = 0.67, p < 0.01). No significant correlation between overall BMD and ap and lateral CTI was found for patients with pertrochanteric fractures (power = 0.56, power = 0.5) and femoral neck fractures (power = 0.5, power = 0.5). Likewise, patients without fractures showed significant high and moderate correlations between BMD at the femoral neck, major trochanter and intertrochanteric region and the ap and lateral CTI. Patients with pertrochanteric fractures showed a moderate and significant correlation between ap CTI and BMD at the major trochanter. Patients with femoral neck fractures showed a significant correlation between BMD at the intertrochanteric region and the ap and lateral CTI (Tables 5, 6).

Table 5 Pearson’s correlation coefficients (r) and their statistical significance (p) for the correlation between ap CTI in patients with pertrochanteric fractures, femoral neck fractures and without fractures and the BMD (g/cm²) measured at different areas of the proximal femur (FN, MT, IT, WT, overall)
Table 6 Pearson’s correlation coefficients (r) and their statistical significance (p) for the correlation between lateral CTI in patients with pertrochanteric fractures, femoral neck fractures and without fractures and the BMD (g/cm²) measured at different areas of the proximal femur (FN, MT, IT, WT, overall)

Discussion

The first part of this study evaluates the inter- and intraobserver reliability of the CTI.

Both the ap and lateral CTI showed good mean ICC values for inter- and intraobserver reliability. No statistically significant differences regarding ICC values were found between the single observers. This suggests that the reliability of the CTI may be independent of the observer’s level of clinical experience.

In our patient collective, measurement of the ap CTI was performed on the unaffected contralateral proximal femur since this hip was used for measurement of BMD by DEXA. The lateral CTI was measured on the affected proximal femur (with fracture or contusion) because lateral X-rays of the unaffected side were not retrospectively available in all patients. In some patients with unstable pertrochanteric fractures, observers found it difficult to determine the original level of the minor trochanter, which could have influenced the exact level of CTI measurement 10 cm distally to the minor trochanter. This is a limitation of our study and may explain the slightly lower inter- and intraobserver reliability for the lateral CTI in comparison to the ap CTI. However, the differences were not statistically significant. In the opinion of the authors, the CTI has sufficient reproducibility for use in clinical practice.

The second part of this study evaluates the correlation between CTI and BMD. Patients who did not sustain fractures during falling showed a good and significant correlation between CTI and overall BMD. Likewise, this subgroup of patients showed significant correlation between CTI and BMD at the femoral neck, major trochanter and intertrochanteric region. These results are in accordance with the data published by Sah et al. [10].

In contrast, in patients who had sustained a fracture during a fall, no correlation between CTI and overall BMD could be found. However, these data showed relatively low statistical power and may be at risk of a type II error.

In patients with pertrochanteric fractures, lower CTI values were found compared to patients without fractures, but no significant differences regarding overall BMD. Measurement of the ap CTI and BMD was performed at the contralateral uninjured femur. Patients with asymmetric distribution of BMD between both proximal femurs may not have been adequately assessed by this method. However, patients with prior surgery, congenital hip deformities or neurological diseases that could cause asymmetric loading of both hips were excluded from the study. In addition, Maeda et al. [18] recently evaluated the bone morphology and BMD of a group of patients with hip fractures in comparison to the uninjured contralateral side. In their study, no differences regarding BMD and bone distribution could be found between both hips. Therefore, we do not think that this theoretical limitation has a significant impact on our results.

In the presented study, we used DEXA to assess BMD. This method evaluates BMD in terms of an areal density (g/cm²) calculated from the ratio of bone mineral content to projected area of bone tissue and does not give detailed information about the three-dimensional morphology of the bone and the exact distribution of BMD [19, 20]. Therefore, several authors have proposed quantitative computed tomography scanning (qCT) for precise evaluation of the distribution of BMD [19–22].

However, qCT was not routinely used in our department at the time when this study was planned and conducted, and so it was not included in the study.

In summary, the CTI showed sufficient reliability for use in daily practice. It showed significant and good correlation with the BMD of the proximal femur in patients who did not sustain a fracture during falling. In patients who sustained a proximal femoral fracture, no correlation between CTI and BMD could be found.

Conclusion

The CTI is a reliable tool for rough assessment of the BMD in patients without hip fractures. In geriatric patients who sustained low-impact fractures of the proximal femur, the CTI shows insufficient correlation with the BMD.

Therefore, we do not recommend the CTI as a routine parameter to assess the quality of implant anchorage or the need for additional implant augmentation when treating geriatric patients with hip fractures.