Introduction

Avascular necrosis (AVN) of the hip is a disease affecting mainly middle-aged patients with a tendency to affect the male gender [1]. There is a wide variety of nontraumatic etiologies [2], but in most cases, a multifactorial pathogenesis combining risk factors and genetic predisposition is presumed. In all cases, necrosis results from a decreased vascularization of the subchondral bone, leading to a rarefication of trabeculae and ischemic lesions of the subchondral bone [3]. This can cause the collapse of the articular surface, resulting in the need for hip replacement [4].

To prevent patients from undergoing joint replacement, several kinds of decompression surgery are offered besides nonoperative treatment options [5]. The aim of these methods is to enable neovascularization and to prevent collapse of the femoral head. Decompression surgery is only reasonable up to stage 3 of the Steinberg classification [6], which indicates subchondral collapse but no femoral flattening [7]. Even in earlier stages, however, core decompression cannot prevent collapse of the femoral head in every case.

A modification of the conventional core decompression is “advanced core decompression” (ACD), which has been under clinical evaluation since 2009 [8]. In contrast to standard decompression, this novel treatment uses an expandable reamer after drilling a core into the femoral head, which allows an enlarged and optimized debridement of the necrosis, based on the hypothesis that the extent of necrosis reduction correlates with the clinical outcome [8]. After decompression, the bone defect is filled with a bone graft substitute [(CaSO4)-(CaPO4) with β-tricalcium phosphate granules, Pro-Dense®, Wright Medical Technology, Arlington, TN, USA]. This leads to an osteoconductive effect with formation of healthy bone as the final intent, because the three-stage resorption profile of the graft makes an early vascularization possible, while a scaffold for new bone formation remains [9]. Unfortunately, up to now, the structural changes in the femoral head after ACD have only been observed histologically in a few individual cases after a subsequently necessary hip replacement procedure [10]. Furthermore, no comprehensive data are available so far on the MRI-assisted evaluation and structured follow-up of patients after ACD. In particular, the MRI appearance of bone remodeling after ACD, which might allow responders to be distinguished from therapy failures, has not yet been defined.

Therefore, the aim of this study was to analyze remodeling processes after ACD in patients with avascular femoral head necrosis by means of 3 Tesla (T) MRI and to identify indicators for clinical outcome considering the defect size and the characteristics of the bone graft and neighboring regeneration tissue.

Materials and methods

Patients

After approval by the local institutional ethics committee and acquisition of informed consent, 26 patients (16 males, 10 females, mean age 53.2 years) with AVN of the femoral head, treated by ACD between July 2009 and July 2012, were consecutively included in the study in the order in which they presented for routine consultation in our orthopedic department. Eight patients (7 males, 1 female) had bilateral AVN with subsequent ACD, resulting in a total of 34 hips evaluated. The stage of AVN ranged from 1 (patients with only edema excluded) to 3c according to the Steinberg classification [6], with the majority of hips (n = 29) presenting with a geographic defect and crescent sign on MRI, without femoral flattening. ACD was performed by one particular orthopedic surgeon based on a previously described method [8], using a composite calcium sulfate-calcium phosphate bone graft substitute (Pro-Dense®, Wright Medical Technology™, Arlington, TN, USA) for filling the defect.

MRI

As the patients were included in the study postoperatively, no standardized preoperative MRIs could be performed. As part of this analysis, the routinely taken preoperative 1.5T MRIs of 21 hips (16 patients) were collected from different institutions and evaluated retrospectively; all included T1w and T2w spin or turbo spin echo sequences, T2w turbo spin echo sequences with inversion recovery fat saturation, PDw sequences with fat saturation and predominantly contrast-enhanced T1 spin or turbo spin echo sequences with fat saturation in different orientations.

For the postoperative examinations, we used a standardized 3T MRI protocol (Magnetom Skyra, Siemens AG, Healthcare Sector, Erlangen, Germany), including a coronal T2w turbo inversion recovery magnitude (TIRM) sequence, high-resolution (hr) T1w turbo spin echo (TSE) sequence (0.4 × 0.4 × 3.0 mm), 3D double echo steady state (DESS) sequence, sagittal proton density (PD)-weighted TSE sequence with fat saturation (fs), coronal PDw TSE sequence without fs as well as contrast-enhanced (ce) spectral fat saturated T1w TSE sequences in coronal and transverse orientation with a total duration of less than 30 min including preparations (Table 1).

Table 1 Sequence parameters

Patients were examined 1 to 34 months (mean 12.7 months) postoperatively. Three of these patients were examined twice, and one patient with bilateral AVN was examined three times after ACD, resulting in a total of 41 standardized postoperative MRIs available for analysis (1–3 months: 9 hips; 4–6 months: 9 hips; 6–12 months: 5 hips; 12–16 months: 7 hips; 16–24 months: 8 hips; 24–34 months: 3 hips).

MRI evaluation

All qualitative evaluations were done in consensus by two radiologists with 4 years and 8 years of experience in musculoskeletal MR imaging.

Every sequence was evaluated regarding the delineation of the residual necrosis using a 5-point scale (0, none; 1, poor; 2, intermediate; 3, good; 4, very good), especially focusing on the discrimination of necrosis and edema as well as on the definition of its border. Afterwards, the mean values were compared between the sequences. Subsequently, the volume of necrosis was measured in the individual best sequence for delineation in the postoperative images and in the best fitting sequence in the preoperative images. These measurements were done by a radiologist with 4 years of experience in musculoskeletal imaging, who manually drew the border of the necrosis in every slice and afterwards calculated the volume considering slice thickness and slice spacing. In the case of more than one postoperative MRI, the first follow-up examination was used for measuring the residual necrosis. Images of patients without a preoperative MRI were not considered for this evaluation. Finally, the absolute reduction (preoperative volume – postoperative volume) and the percentage reduction (100 – 100 × postoperative volume/preoperative volume) of the volume of necrosis were calculated.
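The volume and reduction calculations described above can be illustrated with a minimal sketch. It assumes that each manually outlined area represents a slab whose height equals slice thickness plus inter-slice gap; the text only states that thickness and spacing were considered, so this slab model, the function names and the example numbers are illustrative assumptions and not the software actually used in the study.

```python
from typing import Sequence


def necrosis_volume_cm3(slice_areas_mm2: Sequence[float],
                        slice_thickness_mm: float,
                        slice_gap_mm: float = 0.0) -> float:
    """Approximate lesion volume from per-slice outlined areas.

    Assumption: every outlined area stands for a slab of height
    (slice thickness + gap to the next slice).
    """
    slab_height_mm = slice_thickness_mm + slice_gap_mm
    volume_mm3 = sum(slice_areas_mm2) * slab_height_mm
    return volume_mm3 / 1000.0  # 1 cm^3 = 1000 mm^3


def absolute_reduction(pre_cm3: float, post_cm3: float) -> float:
    """Preoperative volume minus postoperative volume."""
    return pre_cm3 - post_cm3


def percentage_reduction(pre_cm3: float, post_cm3: float) -> float:
    """100 - 100 / preoperative volume * postoperative volume."""
    if pre_cm3 <= 0:
        raise ValueError("preoperative volume must be positive")
    return 100.0 - 100.0 / pre_cm3 * post_cm3


# Hypothetical example values, not taken from the study data.
pre = necrosis_volume_cm3([100.0, 200.0, 100.0], slice_thickness_mm=3.0, slice_gap_mm=0.3)
post = necrosis_volume_cm3([50.0, 100.0, 50.0], slice_thickness_mm=3.0, slice_gap_mm=0.3)
print(absolute_reduction(pre, post), percentage_reduction(pre, post))
```

Because the percentage reduction is normalized to the preoperative volume, two lesions with the same absolute reduction can yield very different percentages, which is why both measures are reported separately.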

The area of the drilling channel in the femoral neck and the drilling defect in the femoral head were separately evaluated using a 4-point scale regarding signal intensity (1, markedly hypointense; 2, hypointense; 3, hyperintense; 4, markedly hyperintense) and signal homogeneity (1, very homogeneous; 2, predominantly homogeneous; 3, moderately inhomogeneous; 4, markedly inhomogeneous) in T2w TIRM, hr T1w TSE and DESS sequences.

Furthermore, every sequence was screened for border phenomena between the bone graft and healthy bone as well as between the bone graft and residual necrosis. Initially, the contrast between the bone and border zone as well as between the bone graft and border zone was evaluated using a 5-point scale (0, no contrast; 1, little contrast; 2, intermediate contrast; 3, good contrast; 4, very good contrast). Afterwards, the three sequences with the highest contrast were used to describe the border phenomena in detail regarding the signal intensity and arrangement of layers, separately for the region of the drilling channel in the femoral neck and the drilling defect in the femoral head. Finally, the observed border phenomena were descriptively divided into five groups according to their visual appearance.

Clinical follow-up

Patients were clinically followed up by an orthopedic surgeon (follow-up time 10.4–46.8 months, mean 28.6 months) and categorized according to whether they were asymptomatic or symptomatic, assessing signs of femoral head cortical collapse by means of clinical complaints, radiographs, postoperative MRIs and scheduling for hip replacement surgery.

Statistical analysis

As neither the ordinal data of the MRI evaluation nor the volume of necrosis showed a normal distribution according to the Shapiro-Wilk test, nonparametric tests were used. For calculating correlations between signs of femoral head collapse and (1) the volume of necrosis (initial volume, absolute reduction, percentage reduction, remaining volume), (2) the signal intensity and signal homogeneity of the bone graft and (3) the appearance of specific border phenomena, the point-biserial correlation coefficient (Pearson correlation) was used. The Wilcoxon signed-rank test was used to evaluate differences in the signal intensities and homogeneities of the bone graft depending on the region (femoral head/femoral neck) as well as differences in the contrast between the bone or bone graft and the border zone depending on the choice of the sequence.
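For readers unfamiliar with the point-biserial coefficient, the following sketch shows that it is simply the Pearson correlation between a dichotomous variable (here coded 0/1 for absence/presence of signs of collapse) and a continuous variable. The coding, the function name and the example values are hypothetical; the study itself used PASW Statistics, not this code.

```python
import math
from typing import Sequence


def point_biserial(binary: Sequence[int], values: Sequence[float]) -> float:
    """Pearson correlation between a 0/1 variable and a continuous variable."""
    if len(binary) != len(values):
        raise ValueError("both sequences must have the same length")
    if set(binary) != {0, 1}:
        raise ValueError("binary must contain both 0 and 1 and nothing else")
    n = len(values)
    mean_b = sum(binary) / n
    mean_v = sum(values) / n
    # Sums of cross-products and squared deviations (the 1/n factors cancel).
    cov = sum((b - mean_b) * (v - mean_v) for b, v in zip(binary, values))
    var_b = sum((b - mean_b) ** 2 for b in binary)
    var_v = sum((v - mean_v) ** 2 for v in values)
    if var_v == 0:
        raise ValueError("continuous variable has zero variance")
    return cov / math.sqrt(var_b * var_v)


# Hypothetical data: 1 = signs of collapse, values = remaining necrosis in cm^3.
collapse = [0, 0, 1, 1]
remaining_volume = [1.0, 2.0, 3.0, 4.0]
print(round(point_biserial(collapse, remaining_volume), 3))
```

The sign of the coefficient depends on which outcome is coded as 1, so the coding should be stated whenever coefficients are reported.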

The significance level α = 0.05 was used throughout. All statistical analysis was performed using PASW Statistics 18 (IBM, Armonk, NY, USA).

Results

Residual necrosis

The residual necrosis could be best delineated in the hr T1w sequence (1–4 points, mean 2.92, SD 0.82) (Fig. 1), followed by PDw sequences with fat saturation (1–4 points, mean 2.19, SD 0.17) and without (0–4 points, mean 2.36, SD 1.05). The worst delineation was found for the DESS sequence (1–3 points, mean 1.94, SD 0.53) and for the ce T1w TSE sequence (0–4 points, mean 1.96, SD 1.13).

Fig. 1

A 48-year-old male patient before (a, c) and 6 months after (b, d) ACD in coronal T1w TSE images of the left femoral head. In c and d (enlarged views of a and b), the margin of the necrosis is marked in one representative slice as was done for the measurements of the volume of necrosis. The asterisk highlights the defect of the expandable reamer

The volumes prior to ACD ranged from 0.38 to 31.64 cm³ (mean 11.23 cm³, SD 9.40), whereas the volumes after ACD ranged from 0.16 to 25.65 cm³ (mean 7.22 cm³, SD 7.52). In every hip a reduction of necrosis by ACD could be observed (by 0.06 to 12.38 cm³, mean 4.02 cm³, SD 3.05, p < 0.001), with percentage reductions from 8.96 to 93.78 % (mean 45.31 %, SD 26.13).

Dividing the study population into two groups regarding clinical signs of femoral head cortical collapse postoperatively (8 positive hips, 13 negative hips), a statistically significant correlation could be found between the percentage of necrosis reduction and clinical outcome (correlation coefficient = 0.564, p = 0.008): Percentage of necrosis reduction in the asymptomatic group ranged from 8.96 to 93.77 % (mean 59.36 %, SD 26.57), whereas percentage necrosis reduction in the symptomatic group ranged from 17.71 to 57.89 % (mean 28.78 %, SD 14.3). In contrast to this, no correlation was found between the absolute size of necrosis reduction and collapse of the femoral head (correlation coefficient = 0.083, p = 0.719). Furthermore, a correlation between clinical signs of femoral head collapse and the initial volume of necrosis (correlation coefficient = 0.54, p = 0.012) as well as the volume of the remaining necrosis (correlation coefficient = 0.557, p = 0.001) could be observed.

Bone graft

In two MRI examinations artifacts of metallic abrasion after surgery were so pronounced that the conclusive assessment of the signal intensity of the bone graft in the femoral neck was not possible. In two further MRI examinations, the bone graft in the femoral neck was completely replaced by healthy bone tissue. Because of that, 37 MRIs were used for the evaluation of the bone graft in the femoral neck and 41 MRIs for the evaluation of the bone graft in the femoral head.

Mean signal intensities of the bone graft did not differ significantly comparing the region of the drilling channel in the femoral neck and the filled defect in the femoral head in hr T1w TSE (1.67 ± 0.76 vs. 1.85 ± 0.73, p = 0.193), T2w TIRM (2.73 ± 1.12 vs. 2.85 ± 1.19, p = 0.714) or in DESS (2.16 ± 0.76 vs. 2.27 ± 0.81, p = 0.356). Interestingly, the signal of the bone graft increased in hr T1w TSE as well as T2w TIRM sequences over time after ACD in the femoral head in the patients who were examined more than once (Fig. 2), but also when comparing MRIs in the early with MRIs in the late follow-up of different patients. No tendency was seen for these sequences in the femoral neck.

Fig. 2

A 58-year-old male patient with AVN 1.5 (a, c) and 3.5 (b, d) months after ACD in a coronal MRI image of the right femoral head. The signal intensity of the bone graft clearly increases in T2w TIRM (a, b) and slightly in hr T1w TSE (c, d)

Comparing the mean signal intensities in the groups with and without clinical signs of femoral head collapse, statistically significantly lower signal intensities of the bone graft in the femoral neck were observed in the symptomatic group in all three sequences (hr T1w TSE: 1.40 ± 0.51 vs. 2.0 ± 0.83, p = 0.043; T2w TIRM: 1.7 ± 1.16 vs. 3.04 ± 1.16, p = 0.004; DESS: 1.70 ± 0.82 vs. 2.33 ± 0.76, p = 0.038).

No differences regarding signal homogeneity were observed comparing the bone graft in the region of the femoral neck and femoral head (hr T1w TSE: p = 0.763, T2w TIRM: p = 0.839, DESS: p = 1.0) with mean values between 1.27 ± 0.55 (hr T1w TSE in the femoral head) and 1.78 ± 0.79 (DESS in the femoral head). No difference could be observed comparing the signal homogeneity depending on different periods of time after ACD or depending on clinical signs of femoral head collapse either.

Border zone

In some sequences, a strong contrast between healthy bone and border phenomena could be observed: The highest contrast was seen in T2w TIRM (2–4 points, mean 3.11, SD 0.6), ce T1w TSE (2–4 points, mean 3.0, SD 0.58) and fat-saturated PDw TSE sequences (1–4 points, mean 2.88, SD 0.83), without significant differences between these sequences; all other sequences showed no marked contrast between these two tissues (overall: 1–3 points, mean 1.5, SD 0.61). The differences between these two groups were statistically significant (for example, T2w TIRM vs. DESS: p = 0.005, PDw TSE fs vs. hr T1w TSE: p = 0.038). By comparison, the contrast between the bone graft and border phenomena was less dependent on the choice of the sequence and showed greater variability, but the difference was still statistically significant [best contrast: T2w TIRM (0–4 points, mean 2.89, SD 1.69), worst contrast: DESS and hr T1w (0–4 points, mean 2.22, SD 1.48 and 1.2, respectively), p = 0.014] (Fig. 3).

Fig. 3

Bar graph showing the contrast between border phenomena and healthy bone (dark gray bars) and between border phenomena and bone graft (light gray bars) depending on the choice of the MRI sequence

Therefore, further evaluations regarding border phenomena were performed using T2w TIRM, ce T1w TSE and fat-saturated PDw TSE sequences, resulting in a total number of 123 observations (41 MRIs * 3 sequences).

Overall, border phenomena were only observed in areas where the bone graft was in direct contact to the healthy bone. No border phenomena were observed between the bone graft and residual necrosis in any of the patients.

The different border phenomena could be divided descriptively into five groups (Fig. 4): (1) hypointense edge, (2) hyperintense edge, (3) inner hyperintense line with a hypointense edge, (4) inner hypointense line with a hyperintense edge, (5) the so-called “rail sign” with a total of three layers in alternating signal intensities. The last group was further divided into two subgroups, depending on the signal intensities of the “rails” (layer 1 and 3): (5a) hyperintense rails; (5b) hypointense rails. The absence of visible border phenomenon was classified as (0).

Fig. 4

Examples of the observed border phenomena after ACD in coronal or sagittal PDw TSE sequences with fat saturation in six different patients with AVN: (1) hypointense edge; (2) hyperintense edge; (3) inner hyperintense line with a hypointense edge; (4) inner hypointense line with a hyperintense edge; (5a) hyperintense rail sign; (5b) hypointense rail sign

The most commonly seen border phenomena in the femoral neck were 5a (23 observations) and 1 (20 observations). The other border phenomena were observed ten times or less. In 54 observations, no specific border pattern could be delineated. Most border phenomena were observed in T2w TIRM (27 observations), followed by PDw fs (22 observations) and ce T1w TSE (20 observations) (Table 2).

Table 2 Observed border phenomena in the femoral neck and head

In the femoral head, the most commonly seen border phenomena were 1 (21 observations), 2 and 5a (19 observations each). The frequency of the other border phenomena was 11 or less. In 43 observations, no specific border pattern could be delineated. Most border phenomena were observed in PDw fs (29 observations), closely followed by T2w TIRM (26 observations) and ce T1w TSE (25 observations) (Table 2).

Interestingly, the rail sign could only be found in MRIs up to 18 months after ACD, whereas the other border phenomena appeared over the whole evaluated postoperative time period (1–34 months) without any specific pattern. This could be confirmed by the observations in patients with more than one follow-up, in whom an initial rail sign had disappeared in MRI examinations performed more than 18 months after ACD (Fig. 5).

Fig. 5

A 54-year-old female patient with AVN 2 (a, c) and 20 (b, d) months after ACD. The initially visible rail sign was no longer definable in the later follow-up. a, b: Sagittal PDw TSE FS; c, d: coronal T2w TIRM

The commonly observed border phenomenon 1 appeared at the earliest 5 months after ACD in the femoral neck and 4 months after ACD in the femoral head. No tendency regarding the time of appearance or disappearance after ACD could be observed for the other border phenomena.

Because of the great variability of the observed border phenomena and the resulting small number of cases in each group, a correlation of the type of border phenomenon and clinical signs of femoral head breakdown was not conclusive.

Discussion

One of the novelties of ACD is the use of a calcium-phosphate calcium-sulfate composite for filling the drilling defect, which has been described for use after standard core decompression in two prior studies [11, 12]. Besides other synthetic grafts, this composite is also widely used for filling surgical and traumatic bone defects [13]. However, up to now, the remodeling processes after using a calcium-sulfate calcium-phosphate composite have not been evaluated in detail. Kotnis et al. analyzed the chronology of appearances of this composite following resection of bone tumors, focusing only on radiographs [14]. They identified a characteristic, time-related radiographic pattern during graft resorption and complete new bone incorporation, beginning at the periphery of the graft.

An animal study by Athanasiou et al. [15], examining the histological appearances of different grafts after filling a trabecular bone defect, unfortunately did not include a calcium-phosphate calcium-sulfate composite, but evaluated both substances independently. Here, for both substances signs of osteoconduction between graft and preexisting bone trabeculae were shown.

MRI has also not been utilized to analyze the remodeling processes adjacent to composite grafts, although several studies comparing MRI and histology have focused on the necrotic region of avascular necrosis itself [16, 17]. Plenk et al. [18] observed different repair processes after conservative treatment as well as standard core decompression by means of MRI and histology. They concluded that MRI shows a characteristic pattern of the reactive interface in patients with limited repair, but can only in part distinguish between reconstructive and destructive repair. However, the application of allografts was not included in their study population.

Our study describes, for the first time, remodeling processes after advanced core decompression with the use of a composite allograft by means of MRI. We could observe several alterations of both the bone graft and surrounding tissue, representing remodeling processes. Increasing signal intensities of the initial low- to no-signal bone graft in the femoral head in hr T1w TSE and T2w TIRM over time, observed both intra- and interindividually, can be interpreted as a remodeling process from the avital graft to vital tissue with increasing vascularization. This correlates with the observation of higher signals of the bone graft in asymptomatic patients compared to patients with clinical signs of femoral head collapse. Therefore, the signal intensity of the graft represents an interesting marker of healing response, which might be useful in further clinical controlled follow-up studies. The rail sign, described here for the first time and appearing up to 18 months after ACD, also seems to us to provide an interesting marker of repair. This border phenomenon corresponds to histological observations showing an inner growth zone, where the resorption of the graft takes place, followed by an adjacent scar with granulation tissue and an outer growth zone where new bone is built up [10], which we suppose to be a sign of physiological repair processes. Additionally, the rail sign was the most common border phenomenon observed in the femoral neck, an area not affected by adjacent necrosis and therefore regarded as a healthy control. In comparison to this, the most common border phenomenon in the femoral head, a single hypointense edge, appeared at the earliest 4 months after ACD, probably representing fibrosis or sclerosis.

The fact that border phenomena overall were only observed in areas where the bone graft was in direct contact to the healthy bone underlines that these truly represent remodeling processes, which can only take place where revascularization originating from healthy tissue is possible. No remodeling of the bone graft was seen where in contact with residual necrosis.

Because of the great variability of the phenomena observed at the border zone, correlations of the type of border phenomenon and clinical signs of femoral head collapse were not conclusive. Investigations in a larger clinical controlled study might be of interest for identifying further markers of a good healing response or of therapy failure. For this, our study provides a basis with a comprehensive characterization of possible changes in postoperative MRIs after ACD.

Using an expandable reamer during core decompression of the femoral head is another modification of the described advanced method. With this, a larger amount of necrosis can be removed than with a drill alone, resulting in a better clinical outcome. This adds to previously described studies [19, 20] showing a correlation between the initial size of necrosis and clinical outcome as well. In another study using an automatic volumetric analysis, it has also already been shown that a larger percentage of debridement or a smaller amount of remaining necrosis after ACD correlates with better clinical outcome [21], which is in line with our results.

Our study had several limitations: Unfortunately we were not able to perform standardized preoperative MRIs, which complicated the comparison of preoperative and residual postoperative necrosis. Furthermore, only 21 hips of 16 patients could be included in the analysis of necrotic lesion size because of a lack of preoperative images of the other 11 patients. The postoperative MRIs taken at our institution did not follow a specific time schedule, with the disadvantage of an inhomogeneous study group, but the advantage of a wide span of time after ACD being represented in our MRIs.

As far as suggestions can be made from this initial study, we recommend evaluating patients after ACD at the earliest 5 and at the latest 18 months postoperatively with MRI, including hr T1w TSE and PDw fs sequences for quantification of necrosis as well as T2w TIRM and ce T1w TSE sequences for the evaluation of border phenomena. A large amount of necrosis reduction, the appearance of the rail sign and a high signal of the bone graft in hr T1w TSE and T2w TIRM sequences should be considered predictors of a positive outcome. These assumptions need to be validated in a subsequent clinical controlled study with a larger study population.

In conclusion, our study describes different appearances of the bone graft and a variety of border phenomena representing remodeling processes, giving a basis for identifying markers for the prognosis after ACD. Besides the percentage amount of necrosis reduction and the size of the remaining necrosis we identified the signal intensity of the bone graft as an indicator for clinical outcome.