Introduction

Magnetic resonance imaging (MRI) is an advantageous technique for imaging soft tissue structures, but it suffers from artifacts close to metal implants caused by strong susceptibility changes between metal and tissue [1]. The resulting inhomogeneities of the magnetic field induce signal loss and distortion, which can lead to non-diagnostic image quality or cause potential pathologies to be missed. By using fast spin-echo sequences with increased bandwidth [“high bandwidth” (HBW)] and a frequency encoding direction adapted to the long axis of the implant, MRI near metal implants can be improved with standard sequences [2]. Additionally, new metal artifact reduction sequences (MARS), for example, view angle tilting (VAT), which reduces in-plane distortions [3], have emerged since the beginning of the twenty-first century [4]. VAT has been shown to effectively reduce metal artifacts in vitro [5] as well as in vivo [6]. The WARP application (Siemens Healthcare, Erlangen, Germany) combines VAT and HBW.

One of the indications for MARS MRI is to image metal-on-metal (MoM) hip resurfacings, especially in the case of postoperative complications. Recently, the use of these resurfacings has emerged especially for young patients, because these surface substitutions were supposed to offer better joint functionality compared to total hip arthroplasties (THA), and the possibility to switch to a THA still remains as a later option [7]. MoM resurfacings are associated with higher metal ion concentrations in the blood because of abrasion [8], which is discussed as the cause for the development of periprosthetic masses [9]. These so-called pseudotumors are fluid- or soft-tissue-like lesions, which are expressions of chronic sterile inflammatory reactions with a high percentage of metal debris, lymphocytic infiltration, necrosis and vascular injections [10–13]. Furthermore, particular MoM hip resurfacings have been associated with a higher incidence of postoperative pain and patient dissatisfaction than expected [14], and complications led to the recall of several of these resurfacings in the past [15]. Because of a lack of reliable diagnostic tools, there is still uncertainty concerning the correlation among pain, pseudotumors and metal ion concentrations. To gain more diagnostic accuracy in this area of strong image degradation, all efforts to combine quality-enhancing techniques are being undertaken.

The feasibility of imaging MoM hip resurfacings using MARS MRI at 1.5T has been described before [16–18]. In addition, the purpose of our study was to evaluate the usefulness of the metal artifact reduction technique “WARP” in the assessment of MoM hip resurfacings at 1.5 and 3T in the context of image quality and imaging speed.

Materials and methods

Patients

The study was performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its later amendments. Approval from the local institutional ethics committee was obtained prior to the study, and all patients gave written informed consent.

We prospectively evaluated 25 hip resurfacings of 19 patients (13 male, 6 female, 39 – 70 years, mean age 58 years) 1 – 10 years (mean 8 years) after hip resurfacing. All of them received the implantation of a MoM hip resurfacing (Conserve® plus, Wright Medical Technology, Memphis, TN, USA) between 2003 and 2012 and were consecutively recruited for this study during their routine consultation in our orthopedic department by the orthopedic surgeon, independent of clinical complaints. Six of them had MoM hip resurfacing of the right side, seven of the left side and six bilaterally. Patients were mainly asymptomatic; only five hips were associated with pain or limitations of activities in daily life.

MR imaging

Patients were alternately randomized to 1.5T (Aera, Siemens Healthcare, Erlangen, Germany) and 3T (Skyra, Siemens Healthcare, Erlangen, Germany) in the order they appeared at the MRI examination. Additionally, eight of them with unilateral hip resurfacing were voluntarily examined at the other field strength as well, resulting in a total of 33 MRIs of hip resurfacings performed (18 at 1.5T, 15 at 3T). Mean size of the prostheses examined at 1.5T was 54.4 mm (range 48 – 60 mm, SD 3.9) and 56.2 mm (range 48 – 60 mm, SD 4.0) at 3T. All examinations took place between March and October 2013.

On the one hand, we used the same protocol at both field strengths (Table 1) and kept the parameters as identical as possible to be able to evaluate differences that would only be attributable to the change in field strength. Sequences used were a coronal turbo inversion recovery magnitude (TIRM) sequence, coronal T2w turbo spin-echo (TSE) sequence, transverse T2w TSE sequence and coronal T1w TSE sequence. Each sequence was acquired with a high bandwidth of 610 – 620 Hz/px and VAT 100 % (WARP) and a second time with standard parameters, including a bandwidth of 287 Hz/px and no VAT (STD). The coronal T2w sequence was additionally acquired with a high bandwidth of 610 Hz/px, but without VAT (HBW). On the other hand, we additionally performed optimized sequences at 3T to benefit from the higher field strength: WARP sequences were acquired with higher parallel acquisition factors (PAT factor) and fewer averages to invest the higher signal-to-noise ratio in accelerating acquisition speed (Table 1).

Table 1 Sequence parameters at 1.5 and 3T

Quantitative image analysis

For quantitative analysis of artifacts, the diameter of artifacts parallel to the acetabulum as well as the area of artifacts were measured in a slice that both showed the center of the femoral head and included the stem of the prosthesis. Identical slices were used in the corresponding sequences with and without WARP at both field strengths. Images were windowed to the subjectively best contrast as used for clinical evaluation. Measurements were done by one radiologist with 3 years' experience in musculoskeletal MRI, who was blinded to the MR technique as well as the field strength. Measurements were repeated by a second radiologist with 8 years' experience in musculoskeletal radiology, who was aware of the slices in which the first measurements were performed, but not of the measured values. Abnormal hyper- and hypointensities as well as distortion of structures and blurring were assessed as artifacts.

Qualitative image analysis

For qualitative assessment of metal artifacts, a 5-point score was used, with 1 = excellent image quality without any artifacts, 2 = very good image quality with hardly definable artifacts, 3 = few artifacts without affection of diagnostic image quality, 4 = moderate artifacts with affection of diagnostic image quality and 5 = severe artifacts with non-diagnostic image quality.

Additionally, several anatomical structures were analyzed regarding their delineation and the influence of artifacts. We used a 3-point scale with 1 = structure visible without artifacts, 2 = structure partly visible because of artifacts and 3 = structure not visible because of artifacts. The structures evaluated in every sequence were the internal as well as external obturator muscle with their tendons and the direct head of the rectus femoris tendon. In the transverse sequences, additional structures, namely the iliofemoral and ischiofemoral ligament, as well as the tendon attachments of the gluteus minimus and medius muscle, were analyzed.

Every sequence was screened for the appearance of pseudotumors. If a suspicious lesion was found, the maximum diameter of the structure as well as its signal intensity was recorded. Lesions were classified according to the grading system of Matthies [19]: 1: flat shaped and thin walled, walls mainly in apposition, hypointense in T1, hyperintense in T2; 2a: thick walled or irregular, >50 % of the walls not in apposition, hypointense in T1, hyperintense in T2; 2b: thick walled or irregular, hyperintense in T1, hypointense in T2; 3: solid, mixed signal, any shape.

The qualitative image analysis was done in consensus by two radiologists with 8 and 3 years' experience in musculoskeletal MRI, blinded to the MRI technique, field strength and clinical status.

Statistical analysis

Interreader agreement was calculated for the quantitative data using intraclass correlation (ICC). For further evaluations, the measurements of the first reader were used. The results were compared between the sequences with and without WARP and between the field strengths. Percentage reductions of diameters, areas and percentage improvement in the quality score were calculated for every patient individually (percentage reduction = 100 – 100 × B / A, with A and B being the two values to compare), and mean values of percentage reduction were calculated afterwards.
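
The per-patient calculation described above can be sketched as follows. This is a minimal illustration only, assuming that A is the value in the reference sequence (for example STD), B the value in the comparison sequence (for example WARP), and that the formula is read as 100 – 100 × B / A. The function names and the sample values are hypothetical and are not taken from the study data.

```python
from statistics import mean


def percentage_reduction(a: float, b: float) -> float:
    """Percentage reduction from reference value a to comparison value b."""
    if a == 0:
        raise ValueError("reference value must be non-zero")
    return 100.0 - 100.0 * b / a


def mean_percentage_reduction(pairs):
    """Average the per-patient reductions (not the reduction of the averages)."""
    return mean(percentage_reduction(a, b) for a, b in pairs)


# Hypothetical artifact areas in cm^2 as (STD, WARP) pairs, for illustration only.
example_pairs = [(50.0, 25.0), (40.0, 30.0)]
print(mean_percentage_reduction(example_pairs))  # 37.5
```

Because the reduction is computed per patient and averaged afterwards, the reported mean percentage can differ slightly from the percentage derived from the mean values of the two sequences.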

As the data of the qualitative analysis were not normally distributed, a Wilcoxon signed-rank test was used for statistical analysis. For the data of the quantitative analysis, which were normally distributed, a paired t-test was used. For the statistical analysis of the field strength comparison, only the matched values of patients who participated in examinations at both field strengths were included. The p-values were corrected for family-wise error rates using the Bonferroni correction: for the quantitative data (2 measurements: area and diameter) a coefficient of 2 was used, and for the qualitative delineation of anatomical structures a coefficient of 3 (3 anatomical structures evaluated in coronal sequences) or 7 (7 anatomical structures evaluated in transverse sequences). The significance level α = 0.05 was used throughout.
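
The Bonferroni step can be illustrated with a short sketch. It assumes that the "coefficient" is the factor by which each raw p-value is multiplied (capped at 1) before comparison with α = 0.05; the dictionary keys, function names and the example p-value are hypothetical and chosen only for illustration.

```python
ALPHA = 0.05

# Correction coefficients as stated in the text; the key names are hypothetical.
COEFFICIENTS = {
    "quantitative": 2,           # area and diameter
    "coronal_structures": 3,     # 3 structures in coronal sequences
    "transverse_structures": 7,  # 7 structures in transverse sequences
}


def bonferroni(p_value: float, n_comparisons: int) -> float:
    """Return the Bonferroni-corrected p-value, capped at 1."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    if n_comparisons < 1:
        raise ValueError("number of comparisons must be at least 1")
    return min(1.0, p_value * n_comparisons)


def is_significant(raw_p: float, family: str) -> bool:
    """Compare the corrected p-value of the given family with ALPHA."""
    return bonferroni(raw_p, COEFFICIENTS[family]) < ALPHA


# The same hypothetical raw p-value can be significant in one family only.
print(is_significant(0.01, "quantitative"))           # True
print(is_significant(0.01, "transverse_structures"))  # False
```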

As far as not stated otherwise, values are given as mean with standard deviation (mean ± SD).

All statistical analyses were performed using PASW Statistics 18 (IBM, Armonk, USA). The power (1-β with β being the chance of a type II error) of the applied tests was calculated post hoc using the freely available software G*Power [20].

Results

Complete lists of artifact diameters and areas as well as of the quality scores for all evaluated sequences including ranges, mean values and standard deviations are given in Tables 2, 3 and 4. Subsequently, only the relevant parameters will be mentioned in the text.

Table 2 Range of artifact diameters, mean and standard deviations in standard and WARP sequences with percentage reduction, p-values after Bonferroni correction (coefficient: 2) and power of the applied paired t-test in the evaluated sequences at 1.5 and 3T
Table 3 Range of artifact areas, mean and standard deviations in standard and WARP sequences with percentage reduction, p-values after Bonferroni correction (coefficient: 2) and power of the applied paired t-test in the evaluated sequences at 1.5 and 3T
Table 4 Mean quality score with standard deviation, percentage reduction, p-values and power of the applied Wilcoxon signed-rank test for dependent variables in standard and WARP sequences in the evaluated sequences at 1.5 and 3T

Measurements of artifact diameter and artifact area showed an excellent interreader reliability (ICC coefficient 0.96 and 0.95, respectively). Therefore, only the values measured by the first reader were used for further evaluation.

1.5T STD, HBW and WARP imaging

The maximum size (diameter and area) of artifacts in the STD technique appeared in coronal TIRM and T1w sequences, whereas the minimum size of artifacts could be observed in axial T2w sequences.

WARP compared to STD showed artifacts with significantly smaller diameters and areas: The artifact reduction worked best in axial T2w sequences with a mean reduction of artifact area by 45 % (STD: 49 ± 10 cm²; WARP: 27 ± 8 cm²; p < 0.001), followed by coronal T2w sequences. The least artifact reduction with a mean of 15.5 % regarding artifact diameter was achieved in coronal TIRM sequences (STD: 89 ± 16 mm; WARP: 75 ± 10 mm; p < 0.001).

The diameters and areas of artifacts measured in the HBW sequence lay in between the aforementioned results of STD and WARP (Fig. 1). Differences were significant comparing STD and HBW regarding artifact diameter (STD: 82 ± 12 mm; HBW: 70 ± 8 mm; p < 0.001, power 0.99) and area (STD: 42 ± 5 cm²; HBW: 31 ± 6 cm²; p < 0.001, power 1.0) as well as comparing HBW and WARP regarding artifact area (HBW: 31 ± 6 cm²; WARP: 26 ± 6 cm²; p < 0.001, power 0.99; power for artifact diameter 0.87).

Fig. 1

A 39-year-old female patient, 8 years after MoM hip resurfacing: Comparison of the standard (a), high bandwidth (b) and WARP technique (c) in a coronal T2-weighted 1.5T MRI of the right hip. The borders of metal artifacts were marked manually (white line). Using a high bandwidth shows a clear reduction of artifacts compared to the standard (29 cm² vs. 44 cm²). Adding VAT by using WARP yields a slight additional advantage (22 cm²)

WARP also showed significantly fewer artifacts in the qualitative analysis. Over all sequences, subjective image quality could be improved from an average of 4.99 ± 0.1 points to an average of 3.90 ± 0.3 points, with the highest artifact reduction of 25 % in axial T2w sequences (STD: 5 ± 0; WARP: 3.8 ± 0.4; p = 0.002). No difference between WARP and HBW could be observed in the coronal T2 sequence (WARP: 3.0 ± 0.3; HBW: 3.0 ± 0.3; p = 1.0).

The delineation of the evaluated anatomical structures was also better with WARP compared to STD (Fig. 2): The best results regarding metal artifact reduction could be achieved in the coronal T2w sequence, where artifacts affecting the internal obturator muscle and tendon could be reduced from 2.8 ± 0.4 to 1.3 ± 0.5 points (p = 0.001) and artifacts affecting the direct head of the rectus femoris tendon from 2.1 ± 0.6 to 1.4 ± 0.5 points (p = 0.007). Of the evaluated structures, only the tendon attachment of the gluteus medius could be imaged free of artifacts in every patient using WARP in the transverse T2w sequence (1.0 points), but this structure achieved 1.1 points with STD as well. The least artifact reduction, an average of 15 % over all anatomical structures, was observed in the TIRM sequence. Every evaluated anatomical structure showed significantly fewer artifacts using HBW compared to STD (p = 0.003 to p = 0.024), but larger artifacts compared to WARP (e.g., internal obturator tendon: 1.7 ± 0.5 points vs. 1.3 ± 0.5 points, p = 0.024).

Fig. 2

Bar graph showing the delineation of anatomical structures at 1.5T, referred to the qualitative analysis (1 = structure visible without artifacts, 2 = structure partly visible because of artifacts and 3 = structure not visible because of artifacts). Each whole bar represents the quality degradation through artifacts using the standard technique, whereas the black part of each bar represents the quality degradation through artifacts using WARP. The gray part of each bar therefore represents the quality gain using WARP. Significant differences (p < 0.05) between the standard and WARP technique are marked with a star. T1w c = T1 weighted coronal sequence. T2w c = T2 weighted coronal sequence. T2w t = T2 weighted transverse sequence. Ext. obturator = external obturator muscle and tendon. Int. obturator = internal obturator muscle and tendon. Rectus fem. = direct head of the rectus femoris tendon. ILF-L = iliofemoral ligament. ISF-L = ischiofemoral ligament. Gl. Mi. = tendon attachment of the gluteus minimus muscle. Gl. Me. = tendon attachment of the gluteus medius muscle

3T STD, HBW and WARP imaging

Concordant to 1.5T, the largest artifacts in the STD technique at 3T were observed in the coronal TIRM sequence, whereas the T2w sequences showed the smallest artifact sizes.

WARP compared to STD showed artifacts with significantly smaller diameters and areas: The best reduction of artifact area with a mean of 33 % (STD: 60 ± 9 cm²; WARP: 40 ± 8 cm²; p < 0.001) and of artifact diameter with a mean of 27 % (STD: 102 ± 11 mm; WARP: 74 ± 10 mm, p < 0.001) could be achieved in coronal T2w sequences. The least artifact reduction with a mean of 10 % regarding artifact diameter was achieved in the coronal TIRM sequence (STD: 104 ± 18 mm; WARP: 91 ± 15 mm; p = 0.130). Concordant to 1.5T, the results for only HBW in the coronal T2w sequence lay in between the aforementioned results of STD and WARP. Also at 3T, significant differences could be observed comparing STD and HBW regarding artifact diameter (STD: 102 ± 11 mm; HBW: 84 ± 13 mm; p = 0.002, power 0.98) and area (STD: 60 ± 9 cm²; HBW: 49 ± 11 cm²; p = 0.002, power 0.97). Significant differences between HBW and WARP could only be observed for artifact area (HBW: 49 ± 11 cm²; WARP: 40 ± 8 cm²; p = 0.001, power 0.99), but not for artifact diameter (power 0.59).

At 3T, WARP also showed significantly fewer artifacts in the qualitative analysis: On average over all sequences, the image quality score could be improved from 4.79 ± 0.4 points to 4.02 ± 0.2 points, with the highest artifact reduction of 19 % in coronal TIRM sequences (STD: 4.8 ± 0.4 points, WARP: 3.9 ± 0.3 points, p = 0.003). HBW and WARP showed identical results in the coronal T2w sequence, with a mean result of 4.0 points for both techniques (p > 0.05).

Similar to 1.5T, the delineation of the evaluated anatomical structures was better with WARP compared to STD at 3T as well (Fig. 3), with the best results in the coronal T2w sequence, where artifacts in the internal obturator muscle and tendon could be reduced from 2.4 ± 0.5 to 1.4 ± 0.5 points (p = 0.002) and in the direct head of the rectus femoris tendon from 2.7 ± 0.5 to 2.0 ± 0.5 points (p = 0.034). Every evaluated anatomical structure apart from the external obturator muscle and tendon and the direct head of the rectus femoris tendon showed significantly fewer artifacts using HBW compared to STD (p = 0.003 to p = 0.024), but larger artifacts compared to the WARP sequence, which were statistically not significant.

Fig. 3

Bar graph showing the delineation of anatomical structures at 3T, referred to the qualitative analysis (1 = structure visible without artifacts, 2 = structure partly visible because of artifacts and 3 = structure not visible because of artifacts). Each whole bar represents the quality degradation through artifacts using the standard technique, whereas the black part of each bar represents the quality degradation through artifacts using WARP. The gray part of each bar therefore represents the quality gain using WARP. Significant differences (p < 0.05) between the standard and WARP technique are marked with a star. T1w c = T1 weighted coronal sequence. T2w c = T2 weighted coronal sequence. T2w t = T2 weighted transverse sequence. Ext. obturator = external obturator muscle and tendon. Int. obturator = internal obturator muscle and tendon. Rectus fem. = direct head of the rectus femoris tendon. ILF-L = iliofemoral ligament. ISF-L = ischiofemoral ligament. Gl. Mi. = tendon attachment of the gluteus minimus muscle. Gl. Me. = tendon attachment of the gluteus medius muscle

Field strength comparison

Mean artifact diameters using the STD technique were up to 20 % larger at 3T compared to 1.5T with significant differences in T1w (1.5T: 85 ± 9 mm; 3T: 103 ± 5 mm; p = 0.008) and the transverse T2w sequence (1.5T: 77 ± 7 mm; 3T: 93 ± 6 mm; p = 0.02). Mean artifact areas at 3T were also significantly larger compared to 1.5T using STD (e.g., coronal T2w: 1.5T: 42 ± 5 cm²; 3T: 60 ± 9 cm²; p = 0.006).

A significant difference between 1.5T and 3T using WARP regarding mean artifact diameter could only be seen in the transverse T2w sequence (1.5T: 59 ± 10 mm; 3T: 81 ± 9 mm; p = 0.018; Fig. 4). Referring to the mean artifact area, the differences were significant in the transverse (1.5T: 27 ± 8 cm²; 3T: 55 ± 10 cm²; p < 0.001) and coronal T2w sequence (1.5T: 26 ± 6 cm²; 3T: 40 ± 8 cm²; p = 0.026).

Fig. 4

A 59-year-old male patient 5 years after MoM hip resurfacing: Comparison of the standard (a, c) and WARP technique (b, d) at 1.5 (a, b) and 3T (c, d) in an axial T2-weighted MRI of the left hip. The marked area of artifacts (white lines) in identical slices in the center of the femoral head is clearly reduced by WARP imaging at both field strengths (1.5T: 24 cm² vs. 39 cm²; 3T: 59 cm² vs. 87 cm²), but generally larger at 3T compared to 1.5T

There was no statistically significant difference when comparing subjective image quality at 1.5 and 3T in the STD as well as WARP technique.

Image acceleration at 3T

The accelerated WARP sequences at 3T using a PAT factor of up to 3 and/or a reduced number of averages (1) achieved comparable results: The degree of artifact reduction was similar to the non-accelerated WARP sequences in the quantitative as well as the qualitative analysis (p > 0.05 for all parameters). The acceleration resulted in a reduction of scanning time of 64 % compared to the standard (PAT factor 2, 2 averages) in the coronal T2w sequence (1:27 min vs. 4:04 min) and of 60 % compared to 1.5T (no PAT, 1 average) in the coronal TIRM sequence (1:37 min vs. 4:23 min), each while maintaining adequate image quality (coronal T2w: 4.1 ± 0.3 points vs. 4.2 ± 0.4 points, p = 1.0 and TIRM: 4.2 ± 0.4 points vs. 3.9 ± 0.3 points, p = 1.0). The acceleration of the T1w as well as the transverse T2w sequences was less pronounced, but still remarkable (33 – 57 % reduction of scanning time).
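
As an illustration of how such a scanning time reduction can be derived from the acquisition times, the following minimal sketch converts times given as min:s into seconds and computes the relative reduction. The function names are hypothetical; the example uses the coronal T2w acquisition times quoted above.

```python
def to_seconds(mmss: str) -> int:
    """Convert an acquisition time given as 'min:s' into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def scan_time_reduction(accelerated: str, reference: str) -> float:
    """Percentage reduction of scanning time relative to the reference."""
    reference_s = to_seconds(reference)
    if reference_s == 0:
        raise ValueError("reference time must be non-zero")
    return 100.0 * (1.0 - to_seconds(accelerated) / reference_s)


# Coronal T2w sequence: accelerated 1:27 min vs. standard 4:04 min.
print(round(scan_time_reduction("1:27", "4:04")))  # 64
```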

Appearance of pseudotumors

Lesions suspicious for pseudotumors appeared in six hips (2 symptomatic, 4 asymptomatic) out of five patients (Fig. 5). Two of these hips were examined only at 1.5T, one only at 3T and the other three at both field strengths. Three of the lesions appeared thin walled, hyperintense in T2w and hypointense in T1w, but not flat shaped (Matthies 2a), whereas two of the lesions were classified as Matthies 2b because of their hyperintense signal in T1w. One solid lesion with mixed signal was classified as Matthies 3. In two cases, a communication of the lesions with the joint fluid was likely, but not proven. The maximum diameter of the lesions ranged from 12 to 33 mm, depending on the patient and the orientation of the sequence. All detected lesions could be visualized at 1.5T as well as at 3T in the coronal and axial T2w sequences. The delineation of the lesions in T1w was better at 3T (3 of 4 lesions) compared to 1.5T (2 of 5 lesions), but in the TIRM sequence it was better at 1.5T (4 of 5 lesions) compared to 3T (2 of 4 lesions). WARP did not increase the number of detected pseudotumors, but made their delineation clearer in all evaluated sequences. A significant difference regarding the size, depending on the use of WARP compared to STD, could not be observed.

Fig. 5

A 63-year-old female patient 9 years after MoM hip resurfacing: In the coronal 3T MRI of the right hip the solid pseudotumor (Matthies grade 3), marked with the arrow, can be seen in a TIRM- (a, b) and T2-weighted (c, d) sequence, without (a, c) as well as with (b, d) WARP

Discussion

Concordant with previously published data regarding THA [6] and spinal implants [21], our study shows that HBW and VAT are useful when imaging MoM hip resurfacings, resulting in a significant reduction of artifact size and improving the delineation of nearby anatomical structures.

Additionally, we could show that metal artifact reduction using HBW and VAT is also possible at 3T, with evidence that artifacts are larger compared to 1.5T, as previously published [22–24]. However, according to our data, this increase of artifacts from 1.5 to 3T does not affect the subjective diagnostic quality or the delineation of anatomical structures, although the mean size of the MoM hip resurfacing was slightly higher in the patient group examined at 3T. The detection of pathologies, in this context pseudotumors, was also not dependent on the field strength in both symptomatic and asymptomatic patients (as far as this conclusion can be made based on the small number of pseudotumors in our study population). These results suggest that imaging MoM hip resurfacings can be performed equally well at 1.5 and 3T. One advantage of the higher field strength is the possibility of accelerating the scanning speed, which did not reduce the diagnostic quality in our study. This advantage might be useful particularly in agitated or anxious patients, patients with pain or institutions with a high workload. Additionally, the use of a higher field strength may compensate for the loss of signal-to-noise ratio resulting from the use of a higher bandwidth.

In our study, adding VAT to the coronal T2 sequence was advantageous over HBW alone at both field strengths, but the differences in artifact size and image quality were less pronounced between HBW and VAT compared to the differences between STD and HBW. This might be due to a slight increase of image blurring using VAT, which was described previously [18]. This observation implies that optimized standard sequences with high bandwidth might serve as a valuable substitute to examine patients with metal implants when dedicated metal artifact reduction techniques are not available.

A limitation of our study is the lack of slice encoding for metal artifact correction (SEMAC), which was used in other studies [6, 25]. This technique, which can be used in addition to a high bandwidth and view angle tilting, leads to a reduction of through-plane distortion [26]. Recent studies revealed that the use of VAT and SEMAC is superior to the use of VAT [27–29] or HBW [6] alone regarding metal artifact reduction. Unfortunately, SEMAC was under development during our study and not available in our institution. However, SEMAC leads to extended scanning time, which makes it less applicable for clinical routine [30]. Another limitation is the small number of patients included in the study, especially those who participated in MR examinations at both field strengths. However, the number of patients available for this study was limited to this figure, and power analysis revealed predominantly good to excellent results regarding quantitative as well as qualitative comparisons between STD and WARP. A further limitation of our study is a possible bias of the measured artifact size because of individual windowing. Measurements could have been more objective if the images had been windowed to a real black-white image previously. However, this would not have reflected clinical reality, as a radiologist always tries to window images to the best contrast for evaluation.

In conclusion, our study reveals the usefulness of WARP-MRI in imaging MoM hip resurfacings at both 1.5 and 3T. Using VAT and HBW together reduces artifact size and increases the visibility of anatomical structures compared to HBW alone or neither technique, resulting in superior image quality near metal implants. Additionally, in our study 3T could be used to reduce the scanning time markedly without losing image quality. The detection rate of pseudotumors did not change through the use of artifact reduction techniques.

The authors declare that they have no conflict of interest.