Introduction

Total joint replacement is the standard surgical treatment for advanced symptomatic osteoarthritis of the knee and hip. Complications associated with total hip joint replacement (THR) that may require surgical revision are aseptic loosening, infections, peri-prosthetic fractures and particle disease [1, 2]. With the use of metal-on-metal cobalt-chromium joint replacements, local adverse soft tissue reactions called aseptic lymphocytic vasculitis-associated lesions (ALVAL) have been recognized [3, 4]. The cumulative incidence of THR revision rises from 2 % at the 5-year follow-up to 8 % at the 15-year follow-up, and that of total knee replacement (TKR) revision from 2.3 % to 7.1 %, respectively [5].

The standard imaging modality for the follow-up of joint replacements is conventional radiography. This technique is useful as a screening tool; however, its sensitivity for complications is low (57.6 %), while its specificity was shown to be high (92.9 %) [6]. In many cases, radiographs do not lead to a definite diagnosis of an arthroplasty-related complication such as ALVAL, which often involves soft tissue changes such as joint effusion and bursitis and nodular thickening of the synovia and capsule, and which may result in early failure through extensive bone resorption. MRI would be far superior for detecting these changes; however, regional image degradation due to metal artefacts in conventional fast spin echo (FSE) or FSE with short tau inversion recovery (STIR) sequences precludes the diagnostic use of this technique. Newer developments in implant metal composition and dedicated artefact-suppressing MRI sequences have led to a reduction of artefacts [7–11]. Efforts to apply artefact-reducing techniques in a clinical setting have focused on 1.5 T scanners [12], since metal artefacts are known to be aggravated at higher field strength [13]. MRI scanners operating at 3 T, however, are increasingly replacing 1.5 T scanners, which creates a demand for feasible artefact reduction at higher field strength as well.

The aim of our study was, therefore, to evaluate the feasibility and diagnostic value at 3 T of a hybrid approach (MAVRIC-SL) that combines the multi-acquisition with variable resonance image combination (MAVRIC) technique with the slice encoding for metal artifact correction (SEMAC) technique. MAVRIC-SL sequences obtained in patients with symptomatic THR were compared with standard clinical FSE-STIR sequences, optimized for imaging around metal, with regard to image quality, artefact size, visibility of anatomical structures, and detectability of pathological findings.

Material and methods

Subjects

Sixty-one patients who were referred from our orthopaedic surgery outpatient clinic for assessment of symptoms at the site of a hip implant were enrolled between January 2011 and October 2013. Inclusion criteria were a report of sustained pain and/or malfunction at the hip treated with a THR and written informed consent for participation in the study. Exclusion criteria were MRI contraindications including claustrophobia or MRI-incompatible implants (e.g. pacemakers). The resulting cohort consisted of 35 men (57 %) and 26 women (43 %). Mean age at the time of examination was 58.7 years (27 – 77 years, SD 10.3 years). One third of the cohort had metal implants at both hips (21 cases). Two types of implants were included: resurfacing devices (n = 40 implants) and total hip replacements (THR) (n = 42 implants). All resurfacing devices were cobalt-chromium metal-on-metal prostheses. Of the THRs, 18 were metal-on-polyethylene and 12 were metal-on-metal prostheses. Mean follow-up interval between MRI and surgery was 3.7 years (1 – 12 years, SD 2.4 years). Fifty-two patients (85 %) reported pain varying from occasional aching to severe constant pain in the region of the operated hip.

In 49 cases (80 %), radiographs of the hip/pelvis were unremarkable, showing correct integration and alignment of the implant. In the remaining 12 cases (20 %), the following findings were noted: osteolysis (seven cases), calcific tendinitis and heterotopic ossifications (three cases), cortical stress shielding (one case), and displacement of a greater trochanter fragment (one case). The UCSF Committee of Human Research approved the study protocol, and all patients gave written informed consent prior to participation in this HIPAA-compliant study.

Image acquisition

In each participant, a 3 T MR study was performed (Discovery MR-750, GE Healthcare, Waukesha, WI, USA) using an eight-channel phased-array cardiac coil (In-Vivo Corporation, Gainesville, FL, USA). The MRI protocol (Table 1) included axial and coronal 2D-FSE sequences with short tau inversion recovery (STIR) fat suppression, a proton-density (PD)-weighted coronal 3D-MAVRIC-SL sequence, and an axial 3D-MAVRIC-SL sequence with STIR fat suppression. For 2D-FSE, a large receiver bandwidth was used to minimize chemical shift artefacts. Both FSE-STIR sequences are part of our clinical routine protocol and were optimized for diagnostic purposes with regard to spatial resolution and SNR, resulting in a slice thickness of 4 mm. Since the 3D-acquired MAVRIC-SL provides a higher SNR, we could slightly reduce the slice thickness (3.2 mm) to optimize the resolution.

Table 1 Sequence parameters at 3 T used in this study

Image analysis

All MRI exams were reviewed on PACS workstations (Agfa, Ridgefield Park, NJ, USA). The MRI studies were graded independently by two radiologists (L.N. and M.K.) blinded to sequence parameters and clinical information. In cases of disagreement, the senior radiologist (T.M.L.) was consulted to reach consensus. To assess intra-rater reliability, a subset of nine patients was graded again after 3 weeks by one of the radiologists (L.N.).

The evaluation system developed for this study included image quality, extent of artefacts, visibility of anatomical structures, and abnormalities/complications related to the hip implants.

  1.

    Image quality was visually assessed according to noise, spatial resolution and contrast and finally graded using one 5-point quality score with: 5 = clear visibility of anatomic details with sharp contours and obvious differences in signal intensity (Fig. 1), 4 = mild loss of contrast without impairment of visibility of image details, 3 = moderate contrast and mild blurring that mildly affected the discrimination of anatomic details, 2 = moderate contrast and blurring with vague discrimination of anatomic details like vessels or nerves, 1 = low contrast and blurry contours that obscured anatomic details (Fig. 1). Quality of fat saturation was graded with: 5 = complete fat suppression with homogeneously low signal in the bone marrow or subcutaneous fat, 4 = mild heterogeneity of low fat signal that did not affect image evaluation, 3 = moderate heterogeneity that mildly impaired image evaluation, 2 = intermediate and inhomogeneous signal intensity of fatty tissue definitely impairing image evaluation, 1 = missing fat saturation with subcutaneous fat and bone marrow appearing brighter than muscle tissue. The analysis of this parameter was restricted to the axial MAVRIC-SL-STIR since the coronal MAVRIC-SL was designed as a non-fat-suppressed sequence. Image distortion around metal was recognized as an abnormal deviation of anatomical lines, such as the bone-soft tissue interface, within the slice or between slices and graded with 5 = not present, 4 = minimal distortion, 3 = distortion mildly altering anatomic contours, 2 = distortion severely impairing anatomic allocation near the metal implant, 1 = distortion making the anatomic allocation of structures surrounding the implant impossible.

    Fig. 1

    Axial MAVRIC-SL-STIR (A) and axial FSE-STIR image (B) in a patient with resurfacing implant at the right hip. Image quality of the MAVRIC-SL-STIR is reduced compared to the FSE-STIR with blurry contours, lower contrast, and lower resolution as visible in the magnification of the left inguinal vessels. Image noise is low in both techniques. The fat saturation is clearly reduced with the MAVRIC-SL technique as seen in the subcutaneous fat (black star). The size of the metal artefact at the right hip is, however, significantly reduced (A)

  2.

    Extent of the artefact was defined as the area of signal void that included both the implant itself and the implant-induced artefact. It was measured on the slice with the maximum extent of artefact by multiplying the lengths of the two maximum perpendicular diameters (cm²) (Fig. 2).

    Fig. 2

    Coronal FSE-STIR (A) and coronal MAVRIC-SL-PD (B) of a patient with a total hip replacement at the left hip. The size of signal voids in the MAVRIC-SL-PD image is markedly smaller (5.9 × 5.6 cm) compared to the artefact size in the FSE-STIR image (14.3 × 10.3 cm) (lines with arrows). Because of the reduced artefact, the prosthesis itself is now visible as a signal void area, and more details such as the acetabular screw (white arrow) and anatomical structures such as the periprosthetic bone of the proximal femur (arrow heads) are visible

  3.

    A 5-point grading system was used to analyze the visibility of the following anatomical structures: greater trochanter, lesser trochanter, femoral head and neck, and acetabulum. Grades were defined as follows: 1 = not visible, 2 = partially visible, 3 = visible but significant blurring of borders, 4 = fully visible with slight blurring of borders, 5 = excellent visibility (Fig. 2).

  4.

    Abnormalities indicating joint replacement complications such as joint effusion, ALVAL, bursitis, bone marrow oedema pattern (BMEP), insertion tendinopathy at the greater trochanter, and osteolysis were assessed with regard to their detectability. The findings were graded on a nominal scale with the terms: absent, probably absent, query, probably present, and present. In a second step this score was condensed to a score of diagnostic confidence with 0 = query, 1 = probably present or probably absent, 2 = definitely present or definitely absent (Table 2).

    Table 2 Comparison of MAVRIC, STIR, and FSE sequences according to image and diagnostic quality, visibility of anatomic structures, and diagnostic confidence of abnormalities

    Joint effusion was defined as an abnormal presence of fluid within the joint. ALVAL was defined as a soft tissue mass or fluid-filled cavity with a thickened, low-intensity capsule in the periarticular region with joint effusion, bursitis and possible remodelling of the adjacent bone [14, 15]. Bursitis was defined as accumulation of fluid in a bursa. BMEP was defined as a poorly marginated elevated signal on the axial MAVRIC-SL-STIR and FSE-STIR sequences. Insertion tendinopathy was defined as signal alteration and thickening at the insertion of the gluteal muscles at the greater trochanter. Osteolysis was defined as intermediate or high signal intensity marrow replacement with potential disruption of the cortical bone.
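The two numerical steps of the evaluation system, the artefact extent in item 2 and the condensed confidence score in item 4, can be illustrated with a minimal Python sketch. The function names and the dictionary are hypothetical and were not part of the study workflow; the example diameters are those quoted in the caption of Fig. 2.

```python
# Illustrative sketch only: names are hypothetical, not taken from the study.

# Five-term nominal scale (item 4) condensed to the 0-2 confidence score:
# 0 = query, 1 = probably present/absent, 2 = definitely present/absent.
CONFIDENCE_BY_GRADE = {
    "absent": 2,
    "probably absent": 1,
    "query": 0,
    "probably present": 1,
    "present": 2,
}


def artefact_extent_cm2(diameter_a_cm: float, diameter_b_cm: float) -> float:
    """Extent of the signal void as the product of its two maximum diameters."""
    if diameter_a_cm < 0 or diameter_b_cm < 0:
        raise ValueError("diameters must be non-negative")
    return diameter_a_cm * diameter_b_cm


def diagnostic_confidence(grade: str) -> int:
    """Map a nominal grade to the condensed diagnostic confidence score."""
    return CONFIDENCE_BY_GRADE[grade.strip().lower()]


if __name__ == "__main__":
    # Diameters from the caption of Fig. 2 (FSE-STIR vs. MAVRIC-SL-PD).
    fse = artefact_extent_cm2(14.3, 10.3)
    mavric = artefact_extent_cm2(5.9, 5.6)
    print(f"FSE-STIR: {fse:.1f} cm2, MAVRIC-SL-PD: {mavric:.1f} cm2")
    print("confidence:", diagnostic_confidence("probably present"))
```

Because the extent is a product of two diameters, it describes a bounding rectangle and tends to overestimate the true area of an irregular signal void.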

Statistical analysis

Descriptive statistics were used for the analysis of frequencies of definite abnormalities in the MAVRIC-SL and FSE sequences. Differences in image quality, visibility of anatomical structures and detectability of joint abnormalities between MAVRIC-SL and FSE sequences were tested with the Wilcoxon signed-rank test (significance level p < 0.05). Since we did not have a surgical standard of reference, we defined the diagnostic grade “definitely present” in either of the sequences as the true finding. Inter- and intra-reader agreement for the training subset of 10 patients was determined by calculation of kappa values. A kappa of 0 indicated poor agreement; 0.01–0.20, slight agreement; 0.21–0.40, fair agreement; 0.41–0.60, moderate agreement; 0.61–0.80, good agreement; and 0.81–1.00, excellent agreement.
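As an illustration of the paired comparison described above, the following Python sketch computes the Wilcoxon signed-rank statistic for two sets of paired ordinal scores. It is not the software used in the study, which is not specified. It assumes a two-sided test, drops zero differences, assigns average ranks to ties, and uses a normal approximation without continuity correction, which is only a rough guide for small samples. The example scores are invented.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Return (w_plus, w_minus, two_sided_p) for paired samples x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Normal approximation with tie correction (assumption: no exact test).
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (min(w_plus, w_minus) - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, p


if __name__ == "__main__":
    # Hypothetical 5-point visibility scores for the same eight hips.
    mavric = [4, 5, 4, 3, 4, 5, 4, 4]
    fse = [2, 3, 4, 2, 1, 3, 2, 3]
    w_plus, w_minus, p = wilcoxon_signed_rank(mavric, fse)
    print(f"W+ = {w_plus}, W- = {w_minus}, p = {p:.4f}")
```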

Results

Reproducibility measurements

Comparing the training sample results of both readers and the reading results of one reader after two sessions, we found kappa values of 0.66 (image quality score), 0.70 (anatomic score), and 0.69 (abnormality score) for inter-reader reproducibility and kappa values of 0.66 (image quality score), 0.78 (anatomic score), and 0.72 (abnormality score) for intra-reader reproducibility.
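For readers who want to reproduce this kind of agreement analysis, a minimal Python sketch follows. It assumes unweighted Cohen's kappa, which the text does not specify (a weighted kappa would be a reasonable alternative for ordinal scores), and it maps values to the agreement bands listed under Statistical analysis. Treating negative kappa values as "poor" is also an assumption. The function names and the example gradings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equally long lists of gradings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("gradings must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement from the marginal frequencies of each rater.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Agreement band as listed in the statistical analysis section."""
    if kappa <= 0:
        return "poor"  # assumption: negative values are also treated as poor
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "excellent"


if __name__ == "__main__":
    # Invented 5-point gradings by two readers for ten patients.
    reader_1 = [5, 4, 4, 3, 2, 5, 3, 4, 2, 1]
    reader_2 = [5, 4, 3, 3, 2, 5, 3, 4, 1, 1]
    kappa = cohen_kappa(reader_1, reader_2)
    print(f"kappa = {kappa:.2f} ({agreement_label(kappa)} agreement)")
```

Under these bands, all six kappa values reported above (0.66 to 0.78) fall in the "good" range.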

Image quality and metal artefacts

Visual assessment of the image quality revealed significant differences between the MAVRIC-SL and FSE sequences. The lowest quality scores with regard to noise, spatial resolution and contrast were seen in the axial MAVRIC-SL-STIR sequence (mean 2.82, SD 0.9), with blurry contours and reduced discriminability of small details like nerves or cartilage (Fig. 1, Table 2). Fat suppression was less effective in the axial MAVRIC-SL-STIR (mean 2.50, SD 0.8) than in the axial FSE-T2-STIR (mean 3.56, SD 1.0; p < 0.0001) (Fig. 1, Table 2). Metal-induced image distortion was comparable in the axial MAVRIC-SL-STIR (mean 3.80, SD 1.5) and the axial FSE-T2-STIR sequence (mean 3.16, SD 1.0; p = 0.149). Image quality scores of the PD-weighted coronal MAVRIC-SL (mean 3.42, SD 0.5) were comparable with those of the coronal FSE-T2-STIR sequence (mean 3.41, SD 0.7; p = 0.308), whereas distortion appeared to be less pronounced in the PD-weighted coronal MAVRIC-SL sequence (mean 3.53, SD 1.0) than in the FSE-STIR (mean 2.64, SD 0.7; p < 0.0001) (Fig. 2, Table 2).

Metal implant-induced artefacts appeared in all sequences as focal zones of signal voids, obscuring the anatomic structures of the hip joint and to some degree of the surrounding anatomy. The size of signal voids was significantly reduced in the MAVRIC-SL-STIR compared to the FSE-STIR (mean difference 98.1 cm², SD 62.8; p < 0.0001). Likewise, signal voids in the coronal MAVRIC-SL were significantly smaller compared to the coronal FSE-STIR (mean difference 108.8 cm², SD 66.6; p < 0.0001).

Visibility of anatomic structures

Both the greater and the lesser trochanter were significantly better identified in the axial MAVRIC-SL-STIR and the coronal MAVRIC-SL than in the corresponding FSE-STIR sequences (p < 0.0001) (Table 2). The zones of the prosthetic neck and head appeared as areas of signal loss and were better delineated in both MAVRIC-SL sequences than in the FSE-STIR sequences (p < 0.0001) (Fig. 2). The acetabulum was the anatomic region with the most impaired visibility of the bone surrounding the prosthesis, which was regularly obscured by the artefact of the acetabular prosthetic component. Nevertheless, both MAVRIC-SL sequences were superior to the FSE-STIR sequences in displaying the osseous anatomy (p < 0.0001).

Detectability of joint abnormalities

Overall, 94 abnormal findings, based on a diagnostic grade 4 (definitely present) on any sequence, were found in 49 (80.3 %) of 61 patients. Six cases of focal osteolysis (9.8 %) (Figs. 3, 4) were demonstrated and 38 (62.3 %) joint effusions were detected. The diagnosis of bursitis (mostly of the trochanteric bursa and the iliopsoas bursa) was made in 30 cases (49.1 %) (Figs. 6, 7). In 10 cases (16.4 %), structural changes of the joint such as joint effusion (n = 10), nodular thickening of the synovia and capsule (n = 10), bursitis (n = 8), and bony erosions (n = 3) qualified for ALVAL (Fig. 4). Tendinopathy was found in eight cases (13.1 %). BMEP was detected in two cases (3.3 %). All pathological findings were detected less frequently in the FSE sequences (Table 3, Figs. 3, 4, 5 and 6). Without the use of the MAVRIC-SL sequences, two osteolyses (3.3 %), eight ALVAL cases (13.1 %), five tendinopathies (8.2 %), 12 bursitis cases (36.1 %), and 31 joint effusions (50.8 %) would have been missed. Moreover, the diagnostic confidence for all five diagnoses was significantly higher with the MAVRIC-SL sequences (p = 0.0018 to <0.0001) (Table 2). When comparing the axial MAVRIC-SL-STIR with the axial FSE-STIR, the greatest differences in diagnostic confidence were found for joint effusion (MAVRIC-SL mean 1.75, SD 0.6 vs. FSE mean 0.56, SD 0.8), ALVAL (MAVRIC-SL mean 1.63, SD 0.7 vs. FSE mean 0.29, SD 0.7), and osteolysis (MAVRIC-SL mean 1.81, SD 0.5 vs. FSE mean 0.36, SD 0.7). Similar results were found for the coronal sequences (Table 2).

Fig. 3

Axial MAVRIC-SL-STIR (A) and axial FSE-STIR (B) of a patient with a total right hip replacement. While an extensive artefact from the cranial prosthetic head overlays the joint area in the FSE-STIR (B), markedly reduced artefacts are seen in the MAVRIC-SL image (A). Because of artefact reduction now multiple abnormalities are visualized including an osteolysis of the greater trochanter (open arrow), a fluid collection around the femoral stem of the implant (arrow heads) and iliopsoas bursitis (white arrow)

Fig. 4

Axial MAVRIC-SL-STIR (A) and axial FSE-STIR (B) of a patient with a resurfacing implant at the right hip. The findings seen in the MAVRIC-SL image (A) with joint effusion extending in the iliopsoas bursa, synovial thickening (white arrows), and osteolysis (open arrow) adjacent to the stem of the femoral resurfacing component (arrow head) are consistent with ALVAL. In the FSE-STIR image (B), only the osteolysis (open arrow) is visible due to the extensive artefact originating from the cranial femoral head resurfacing implant

Table 3 Number of cases with definite diagnoses* in MAVRIC and FSE sequences

Fig. 5

Axial MAVRIC-SL-STIR (A) and axial FSE-STIR (B) of a patient with a total hip replacement at both hips. Multiple cystic osteolyses adjacent to the left acetabular cup are visible in the MAVRIC-SL image (A, white arrows) while only a small focal hyperintensity near the larger artefact can be recognized in the FSE-STIR

Fig. 6

Axial MAVRIC-SL-STIR (A) and axial FSE-STIR (B) of a patient with a resurfacing implant at the right hip. An extensive artefact in the FSE-STIR image (B) obscures a large trochanteric bursitis that is visible in the MAVRIC-SL-STIR image (A, white arrow)

Discussion

The results of our study demonstrate that the application of MAVRIC-SL sequences at 3 T leads to a significant reduction of the size of metal artefacts compared to FSE-STIR sequences, as demonstrated in patients with hip replacements. MAVRIC-SL sequences, however, showed decreased general image quality, particularly in the MAVRIC-SL-STIR, with regard to spatial resolution, contrast, and fat saturation. On the other hand, the reduction of artefact size significantly improved both the visualization of joint anatomy and the diagnostic confidence regarding implant-associated pathologies. As a consequence, 59 % of joint abnormalities would have been missed using standard FSE-STIR sequences without the additional information of MAVRIC-SL sequences.

Our findings are in line with studies that investigated the technical and diagnostic image quality at 1.5 T [7, 12, 16]. Koch et al. introduced the MAVRIC technique for 1.5 T scanners in 2009. The investigators showed that with MAVRIC, B0 artefacts and image distortion in the direct vicinity of metal implants were significantly reduced. Hayter et al. compared MAVRIC-SL and FSE sequences at 1.5 T in diagnosing arthroplasty-related joint pathologies of hips, knees, and shoulders [12]. They found a significant improvement in the visualization of the synovia and the periprosthetic bone. Fifteen percent of 61 cases could be diagnosed only with the MAVRIC sequence. Similarly, in 16 % of cases, osteolysis would have been missed if only the FSE sequences were assessed. The higher rate of missed diagnoses with FSE sequences in our study may be explained by the greater extent of artefacts regularly seen in 3 T MRI examinations.

A recently published in vitro study by Liebl et al. [17] investigated the feasibility of the hybrid MAVRIC-SL technique at 3 T in comparison to 1.5 T. They demonstrated in a porcine knee model that MAVRIC-SL sequences substantially reduced the artefact size of screws in images obtained at both 1.5 and 3 T. Moreover, at both field strengths, the MAVRIC-SL technique similarly improved the detectability of bone and cartilage lesions.

To the best of our knowledge, this is the first study investigating the clinical use of MAVRIC-SL sequences in vivo at 3 T. The most important effect of the MAVRIC-SL technique was the reduction of artefact size, which enabled us to better delineate the joint anatomy and detect pathologies within the joint. Image quality was, however, significantly reduced, particularly in the axial MAVRIC-SL-STIR sequence, with impaired image contrast and spatial resolution. Reasons for this are related to the lengthy acquisition required to sample all off-resonance frequencies and the need to combine several images with different off-resonance bins, as previously described [7]. Thus, the MAVRIC-SL technique does not allow reliable assessment of smaller structures such as the cartilage layer or the acetabular labrum. It is, however, very useful for the detection of larger, high-contrast abnormalities, e.g. effusion, bursitis, and osteolysis. Also, the fat saturation of the MAVRIC-SL-STIR was clearly reduced compared to FSE-STIR sequences. The reduced fat saturation could have led to some false negative diagnoses regarding the presence of BMEP. Another finding with the MAVRIC-SL technique was a curve-shaped, ill-defined signal loss in the centre of the image occurring in patients with bilateral total hip replacements. These shading artefacts were most likely a consequence of implant-induced B1 perturbations impacting the performance of the broadband inversion pulse utilized by MAVRIC-SL STIR. They appeared in fat-saturated images at some distance from the hip joint; however, the beneficial diagnostic effect of the size reduction of the local metal artefact was rarely diminished (Fig. 7). Since we could not compare MAVRIC-SL at 1.5 T and 3.0 T, the impact of the magnetic field strength on MAVRIC-SL image quality in a clinical setting remains to be investigated.

Fig. 7

Axial MAVRIC-SL-STIR (A) and axial FSE-STIR (B) of a patient with total hip replacements at both hips. Curve shaped shading artefacts (A, arrow heads) that were seen in MAVRIC-SL sequences of patients with bilateral implants degrade central parts of the image. Nevertheless, the reduced size of the local metal artefact makes a bursitis of the right hip visible (A, white arrow)

Our study has further limitations. Grading of image quality, visibility of anatomy, and detectability of joint abnormalities was based on visual judgment. We, therefore, calculated inter- and intra-reader reproducibility, which were found to be high. The only quantitative measure was the artefact size. However, since both the metal implant and the resulting artefact appear as signal voids, they cannot be differentiated on images. As a consequence, the signal voids in MAVRIC-SL sequences are mainly determined by the size of the prosthesis itself. Also, we could not compare the abnormalities diagnosed with the two techniques with a pathology gold standard such as intraoperative findings. The purpose of our study, however, was not to estimate the diagnostic accuracy of MAVRIC-SL sequences, but to compare MAVRIC-SL and FSE techniques to investigate the additional diagnostic value of MAVRIC-SL sequences. We could not compare fat suppression between the coronal MAVRIC-SL and FSE-STIR sequences since we acquired the coronal MAVRIC-SL sequence without fat suppression. Also, since MAVRIC-SL and FSE sequences can be identified by their differing image quality and characteristics, readers were not truly blinded. We tried to mitigate this potential bias by choosing a design with two readers who independently confirmed the results.

In conclusion, our study showed that in patients with painful hip arthroplasties, suppression of metal-induced artefacts with the MAVRIC-SL technique is feasible at 3 T. MAVRIC-SL sequences substantially reduced the size of metal-induced artefacts compared to standard FSE-STIR sequences. Although the overall image quality of MAVRIC-SL sequences at 3 T was reduced compared to standard FSE sequences, they provided important additional diagnostic information through the substantial reduction of local metal artefacts.