Introduction

For the last 45 years, fixation of the lumbar spine has been a common procedure in spinal surgery [1]. Titanium pedicle screws are routinely used for this purpose, in order to reconstruct the vertebral column following damage from degeneration, infection and tumour [2]. However, the complications of such surgery carry significant risks due to the proximity of the screws to crucial neural, visceral and vascular structures. Screws can be misplaced, causing weakness of the construct and more importantly neurological damage [1].

For these reasons, it is imperative to place screws in the optimum position within the pedicle of the vertebra. There is little room for error, and surrounding structures impair visual confirmation of screw placement [1]. Although a misplaced screw does not necessarily entail negative clinical consequences, verification of screw placement remains essential. The lumbar spine is most commonly fixed, and further optical imaging can be applied to the screw position as a check of quality control, or if revision surgery is considered necessary [3].

Currently, computed tomography (CT) is the most reliable technology for detecting misplaced screws, although there is no ‘gold standard’ [4]. CT delivers harmful X-ray ionising radiation that precipitates DNA damage [5]. Magnetic resonance imaging (MRI) is a more routine investigation in neurosurgery because of its higher soft-tissue resolution and its safe image capture [2]. Many patients undergo MRI post-operatively for neuro-imaging purposes. However, when using MRI with metal, currently titanium, a susceptibility artefact distorts the local area around the screw, hampering the review of screw placement [6]. CT is not perfect, producing artefact and exposing patients to radiation, and thus should be omitted unless absolutely necessary. With medical imaging advancing fast, the necessity for both MRI and CT scans is questionable. New metal artefact reduction sequences are being introduced and need to be explored [7,8,9,10,11,12]. The frequency with which patients undergo both scans in this large UK spinal unit prompted this retrospective investigation. Posterior lumbar interbody fusion (PLIF) is performed with high relative frequency and fixation of the lumbar spine is the most established, giving rise to the choice of these levels [1].

Many studies have examined the use of CT scanning in misplaced screw detection [13]; however, less research has been conducted into the detection of misplaced screws using MRI [7, 14, 15]. We aim to establish the statistical agreement between MRI and CT for evaluation of adequate screw placement and screw depiction in PLIF.

Methods and patients

Hypotheses

Null hypothesis (H0) states: H0 = There is no statistically significant agreement in the detection of misplaced spinal pedicle screws in patients when imaging with MRI or CT.

The alternative hypothesis (H1) states: H1 = There is a statistically significant agreement in the detection of misplaced spinal pedicle screws in patients when imaging with MRI or CT.

Patient selection

This single-centre, retrospective study of 763 patients included those who underwent PLIF from 2007 to 2015 and who had a post-operative MRI scan and a contiguous CT scan (n = 111). The sample comprised 26 males and 32 females with a mean age of 67.24 years and an age range of 69 years. Patients were excluded if imaging was performed outside of the local hospital trust (n = 11) or if any images were un-viewable (n = 15), excluded the lumbar spine (n = 3) or showed no metal instrumentation (n = 24). The 58 patients for study are presented in Fig. 1.
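The patient flow can be checked with simple arithmetic. The short Python sketch below only restates the counts given in the paragraph above; the variable names are illustrative and are not part of the study.

```python
# Patient flow as reported in the text; every count is taken from the paragraph above.
with_both_scans = 111  # patients with a post-operative MRI and a CT scan

exclusions = {
    "imaged outside the local hospital trust": 11,
    "images un-viewable": 15,
    "lumbar spine not included": 3,
    "no metal instrumentation": 24,
}

# Patients remaining after all exclusions are applied.
remaining = with_both_scans - sum(exclusions.values())

# The reported sex split should add up to the same cohort size.
males, females = 26, 32

print(remaining)        # 58
print(males + females)  # 58
```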

Fig. 1

Inclusion and exclusion criteria for patient selection into research cohort. The electronic data were collected over an 8-year period

Surgeries and post-operative protocols

In total, 296 screws were inserted using a standard free-hand technique with an intra-operative image intensifier and the Stryker XIA Precision cannulated screw system [16].

Imaging protocols

The CT and MRI images viewed by the investigator were archived, and the interval between imaging was typically 2–4 months post-operatively. The imaging protocols are given in Table 1.

Table 1 Imaging protocols for post-operative MRI and CT in elective surgical patients after PLIF. CT parameters varied when using different brands of machine due to different hospital CT suites. The CT parameters were standardised to give replicable images across the patient sample. MRI acquisitions with no standard metal artefact reduction sequences varied per patient and per sequence. Improving this standardisation with the use of image optimisation for metal implants could enhance the ability to appreciate pedicle screw placement. It should be noted that the gross quantity of CT sections is much higher than the gross numbers in MRI due to the thinner slice measure with CT scanning

Image interpretation

The MR and CT images of screws were viewed on Picture Archiving and Communication System (PACS) imaging software by a blinded investigator in order of operation date (oldest to most recent). Each patient’s MR images were viewed before their corresponding CT image to mitigate any interpreter bias, given the well-established depiction of screw placement with CT. Two hundred and nine MR images were derived from 80 T1 and 129 T2 pulse sequences. One or more pulse sequences were used to depict the same physical screw in order to determine its placement. Eighty-eight CT slices of the same spinal levels were required for analysis. One image slice depicts up to two physical screws, resulting in 297 total images of 296 physical screws analysed.

Using axial images, screw length and diameter were measured with measurement callipers on PACS. The cortex around each vertebra was traced for areas of poor depiction. If the cortex was obscured by screw encroachment, the orientation and position were noted. This was repeated on sagittal images and for the area around the spinal canal. Screws were visually analysed for their location in the pedicle and vertebral body. This was calculated by subtracting mean screw length and diameter in the MR images from the corresponding manufacturer lumbar screw dimensions. By following a distinct post-operative imaging assessment tool to guide the interpreter through this practical visual assessment method, an informed estimation could be made of screw placement; this assessment tool is presented in Fig. 2. Screws which were borderline misplaced were viewed by another investigator and consensus reached. Further to this, CT was evaluated for streaking artefact and imaging problems arising from patient movement [12, 17].

Fig. 2

Screw placement assessment tool: a guide to analysing screw placement using post-operative imaging protocols. This assessment tool to analyse screw placement sets out a guide for visually scrutinising images of the spine post-operatively. The rating system identifies the direction and degree of screw placement and accounts for minor degrees of misplacement that impact neither the patient nor the metalwork but may be caused by image artefact

Statistical analysis

Statistical analysis for proportions and diagnostic agreement was carried out using Stata to test the hypotheses. Statistically, CT was used as a gold standard because the results produced by its imaging capabilities gave 100% accuracy. Sensitivity and specificity were calculated to determine the true positive and true negative rates in the data. Positive and negative predictive values were calculated when proportions of the sensitivity and specificity were required. Kappa statistical analysis was conducted to measure the inter-rater agreement between MRI and CT in detecting misplaced screws. The significance of the kappa result against a random result was also calculated. Interpretation of the significance of the kappa statistic followed the Landis and Koch method [18, 19].
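As an illustration of the measures named above, the following minimal Python sketch derives sensitivity, specificity, PPV, NPV and an unweighted Cohen's kappa from a 2 x 2 table of MRI readings against CT as the reference. It is not the Stata analysis used in this study, and the counts in the example are hypothetical values chosen only to show the arithmetic; they are not the study data.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """MRI reading (index test) cross-tabulated against CT (reference)."""

    tp: int  # misplaced on MRI and misplaced on CT
    fp: int  # misplaced on MRI, correctly placed on CT
    fn: int  # correctly placed on MRI, misplaced on CT
    tn: int  # correctly placed on both

    @property
    def n(self) -> int:
        return self.tp + self.fp + self.fn + self.tn

    def sensitivity(self) -> float:
        # Proportion of CT-misplaced screws that MRI also flags.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of CT-correct screws that MRI also calls correct.
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Proportion of MRI-flagged screws that are misplaced on CT.
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Proportion of MRI-cleared screws that are correct on CT.
        return self.tn / (self.tn + self.fn)

    def kappa(self) -> float:
        # Unweighted Cohen's kappa: agreement beyond that expected by chance.
        p_obs = (self.tp + self.tn) / self.n
        p_exp = (
            (self.tp + self.fp) * (self.tp + self.fn)
            + (self.fn + self.tn) * (self.fp + self.tn)
        ) / self.n**2
        return (p_obs - p_exp) / (1 - p_exp)


# HYPOTHETICAL counts for illustration only (not the study data).
table = TwoByTwo(tp=40, fp=10, fn=5, tn=145)

print(f"sensitivity = {table.sensitivity():.3f}")
print(f"specificity = {table.specificity():.3f}")
print(f"PPV         = {table.ppv():.3f}")
print(f"NPV         = {table.npv():.3f}")
print(f"kappa       = {table.kappa():.3f}")
```

For a two-category rating, linear-weighted and quadratic-weighted kappa reduce to the unweighted value, which is why only one form is sketched here.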

Results

The most common Stryker XIA-Precision-screw lengths and diameters used for the thoracolumbar spine were 40–50 mm and 6.5–7.5 mm, respectively. Mean MRI measurements of length and diameter were 65.3 mm and 12 mm, respectively. CT imaging of screws displayed length and diameter ± 1 mm to manufacturer dimensions [20].

Of the 296 implanted screws, 42 were misplaced, resulting in a raw misplacement rate of 14.2%. CT reclassified seven screws that were charted as misplaced on MRI when in fact they were placed correctly. Sensitivity for screw misplacement was 88.7%, specificity 96.2%, positive predictive value (PPV) 78.3% and negative predictive value (NPV) 98.2%.

Regarding MRI, the cortices were visible in 60.7% and the spinal canal visible in 74.6% of images.

Measures of agreement between MRI and CT were calculated using a kappa statistic: unweighted, linear-weighted and quadratic-weighted kappa all gave the same value. Kappa = 0.8042 with 0.05 standard error, a lower limit of 0.7183 and an upper limit of 0.8901. The observed proportion of agreement was 0.9523, with a maximum of 0.9824 and a chance-expected value of 0.7562; the lower and upper limits of the 95% confidence interval of the observed value were 0.9252 and 0.9702, respectively.
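The reported kappa can be approximately reproduced from the observed and chance-expected proportions of agreement. The Python sketch below is a check of that arithmetic, not the original Stata output; the function names are illustrative, and treating the Landis and Koch bands as continuous cut-offs is an assumption made here for convenience.

```python
def cohen_kappa(p_observed: float, p_expected: float) -> float:
    """Cohen's kappa from observed and chance-expected agreement."""
    return (p_observed - p_expected) / (1.0 - p_expected)


def landis_koch(kappa: float) -> str:
    """Descriptive label following the Landis and Koch bands.

    Assumption: the upper bound of each band is used as an inclusive cut-off.
    """
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Proportions as reported in the text (rounded to four decimal places).
kappa = cohen_kappa(p_observed=0.9523, p_expected=0.7562)
print(round(kappa, 3))  # 0.804
```

The small difference from the reported 0.8042 in the fourth decimal place is consistent with rounding of the input proportions. A value this close to 0.80 sits at the boundary between the "substantial" and "almost perfect" bands.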

The data were normally distributed and a Z-statistic of 16.08 resulted (16.08 > 1.96; p < 0.025). Therefore, the null hypothesis (H0) can be rejected with 95% confidence that the classification methods agree better than random chance. The significant agreement means accepting the alternative hypothesis.
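The Z-statistic is the kappa value divided by its standard error under the null hypothesis of chance agreement. The Python sketch below reproduces it from the reported figures; it assumes that the reported standard error of 0.05 is the null standard error, and it uses a two-sided normal p-value, which may differ from the convention applied in the original analysis.

```python
import math


def z_statistic(kappa: float, se_null: float) -> float:
    """Z-score for kappa against the null hypothesis kappa = 0."""
    return kappa / se_null


def two_sided_p(z: float) -> float:
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Values as reported in the text; se_null = 0.05 is assumed to be the null standard error.
z = z_statistic(kappa=0.8042, se_null=0.05)
p = two_sided_p(z)

print(round(z, 2))  # 16.08
print(p < 0.025)    # True
```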

Discussion

Imaging screws post-operatively gives a baseline for local bone fusion. It also ensures the decompression operation was adequate and checks for complications if the patient remains symptomatic [15]. MRI is necessary in ensuring adequate decompression and valuable if complications arise. Now MRI is progressing in assessing screw placement. The high sensitivity and even higher specificity emerging from this research suggest that MRI is an accurate method of analysing screw placement. The false negative rate (11.3%) and the false positive rate (3.8%) were low. This is reinforced by the PPV for MRI (78.3%) in detecting misplaced screws as compared to CT (100%), which, although lower, supports serious consideration of MRI as a post-operative diagnostic tool. The near-perfect NPV (98.2%) strongly supports MRI in determining correctly placed screws.

To improve the post-operative imaging sequence, standardised protocols should be developed for elective patients in order to minimise unnecessary scanning. The lack of comprehensive post-operative imaging protocols currently means patients in the elective setting are scanned with both MRI and CT. The necessity of both scans is questionable, and considering the statistics presented in this research, additional scanning should be minimised. Comprehensive decision-making tools should be adopted to aid teams in selecting the optimal scan. For example, asymptomatic patients could be limited to an MRI scan post-operatively. Where there is low diagnostic confidence (MRI = positive or undetermined), a further radiological investigation could be requested to help define the screw position. Coupling the clinical picture with the imaging assessment tool would provide information to the clinical team. For example, if post-operative radiculopathy could be attributed to a specific anatomical level, the documented assessment tool may help to indicate whether a malpositioned screw could be to blame, and further imaging could be collimated to include only the anatomical level in question.

Although there is no gold standard for assessing screw placement, CT has virtually assumed this role [20, 21]. The kappa statistic resulting from these data is strong (K = 0.8042) and suggests substantial to near-perfect agreement between MRI and CT, see Figs. 3 and 4. None of the CT images was rendered so poorly as to prevent the interpreter from assessing it. Each image gave visual, diagnostic-level confidence in determining the placement of the screw [11].

Fig. 3

MRI of L4 with markup. Scanning parameters: Siemens Avanto, T1, TE:10, TR:523, ETL:5, window 1707 level 810. When viewed in conjunction with Fig. 4, these comparison images show a strong level of agreement between MRI and CT scans of the same vertebral level in the same patient

Fig. 4

CT of L4 with markup. Scanning parameters: Siemens Sensation 64, 140 kV, 390 mA, 1000 ms, 135DFOV, 1 mm, window 1500 level 450. Note: the CT image calliper measurements have not been displayed well on the image output due to colour, and the measurements relating to standard anatomical position are 7.4 mm, right, and 7.9 mm, left. The right screw length is 44.3 mm and the left screw length is 44.1 mm. When viewed in conjunction with Fig. 3, these comparison images show a strong level of agreement between MRI and CT scans of the same vertebral level in the same patient

CT imaging of the spine is accurate and well developed. Furthermore, the field of view in CT is restricted to a small area surrounding the vertebrae, producing an image with increased magnification that is well resolved. Standardisation in CT (high kV, high mA, low pitch and thin collimation) generates a reliable, high-quality image, which avoids artefact. Reconstruction algorithms are able to tailor the image to enhance bone resolution. Higher kernel settings generate sharper and smoother images (60/70 is specific to bone enhancement). However, an important concern is exposing patients to high levels of ionising radiation [5]. Central to the calibration for optimising the metal structure is minimising this radiation dose, using the ALARA principle, because even with modern dose reduction techniques, CT always delivers harmful radiation. That being said, not undertaking a CT scan post-operatively for PLIF could also be inadequate, because MRI did not render 100% accurate results, whereas CT did. In the near future, the balance between opting for a CT-spine analysis and relying on it would certainly benefit from redesign. In elective cases with no complications, and in the hands of an experienced surgeon, MRI analysis may be sufficient.

Patient movement produces artefact and, rarely, streaking artefact is seen in CT. In this study, minor visual problems arose when tracing the screw path through the pedicle, see Fig. 5 [12, 18].

Fig. 5

This poorly resolved CT image shows an unsatisfactory level of visual artefact and reiterates the point that CT can also benefit from further imaging in surgical decision making. Imaging parameters: GE LightSpeed VCT, 140 kV, 99 mA, 1095 ms, 1.25 mm, 188DFOV, window 1500 level 350

Regarding MRI, susceptibility artefact is solely responsible for the poor image quality of the screw and the consequent difficulty in interpretation (Fig. 6). Fifty-one per cent of screws were reported back as perfectly placed (not imposing on any cortices) (Fig. 7). The MR image does not affect the ultimate outcome of whether the screw is misplaced but may lead to false positives. Although T1 images anecdotally offer more clarity than T2, image clarity can be substantially improved in both with the use of metal artefact reduction sequences. Provision for MRI can be more limited and the ability to visualise the cortex less assured than with CT.

Fig. 6

This MR of the lumbar spine delineates the extent of the susceptibility artefact and its general obscuring of the pedicle. Imaging parameters: Siemens Avanto, T2, TE:95, TR:7250, ETL:15, window 1338 level 633

Fig. 7

A perfectly placed pair of screws imaged using MR in the lumbar spine without calliper measurements or mark ups. MRI imaging parameters: Siemens Avanto, T1, TE:10, TR:581, ETL:5, window 1229 level 590

While MR scanning can be onerous, we anticipate this will improve with standardised protocols tailored for PLIF with metal artefact reduction. In the post-operative setting, scanning with MRI removes the need for a CT scan that is deemed surplus and frees CT time slots for other uses. It also makes use of idle MR scanners, which must be actively scanning to remain viable.

As metal artefact is more pronounced on MRI, visual assessment is crucial. Using this post-operative imaging assessment tool will enable the interpreter to formulate a level of certainty using MRI to interpret the spine. MRI may not match CT for detecting misplaced screws, but it can provide a platform for visual assessment and decision-making in patient care depending on this level of clinical certainty. Along with the ranking for adequacy of screw position, the interpreter can formulate the level of certainty attributed to the MRI scan. This could be presented to radiology colleagues to consider further in-depth scanning of the area of uncertainty in the most appropriate modality.

There are multiple image optimisation techniques for metal implants using MRI [7], including fast spin-echo, thinner slices and high bandwidth. Metal artefact reduction sequence (MARS) is an adapted spin-echo sequence with view angle tilting (VAT). VAT is used to correct for in-plane distortions from metal, using a gradient on the slice-select component during the readout phase [22, 23]. Most techniques, including this one, suffer from through-plane distortions. A second technique, slice encoding for metal artefact correction (SEMAC), is a contemporary susceptibility artefact solution that corrects both in-plane and through-plane distortions. It needs no additional hardware and can be applied to most current MR scanners. It is further adapted from MARS and allows time for additional phase encoding along the slice-select z-axis in order to deal with through-plane distortions [24].

Moreover, short-tau inversion recovery (STIR) sequencing optimises inversion pulses and increases its receiver bandwidth and has been shown to reduce artefact and improve depiction of anatomy [8]. In a different approach, iterative decomposition of water and fat with echo asymmetry and least-squares estimation (IDEAL), T2-sequences have been compared against contrast-enhanced T1-sequences to achieve a very high signal-to-noise ratio and unlike the above STIR sequence, are unsusceptible to local field inhomogeneity [9].

If these techniques can be standardised and programmed specifically for each post-operative MR scan with metal, the precision of MRI in detecting spinal instrumentation can be re-evaluated to test the significance against CT. In doing so, research has found no differences between the two modalities [10, 11]. Given this recent literature and the results of this research, it is suggested that MRI for PLIF should incorporate metal artefact reduction techniques and be the initial post-operative imaging choice for elective PLIF.

Limitations

These results display the raw capabilities of MRI. However, the retrospective nature of this study brings limitations in control over imaging protocols and introduces a risk of bias from exposure–outcome inconsistencies. A prospective study design would more robustly control exposure variables and confounders while eliminating recall bias. The MR studies shared a ‘lumbar and sacral’ field of view but differed in age, type/brand and scanning parameters; consequently, MRI reliability is low. This is further impacted by the lack of a specific metal artefact reduction sequence in this study. Standardising these parameters is recommended for further studies in this area using an imaging protocol designed specifically to reduce metal artefact.

MRI images of one patient were reviewed before the CT images of the same patient. This introduces some methodology bias. The order of MRI before CT reduces the impact of the bias as CT has better gross diagnostic capability, but CT may not have rendered such accurate results if the order were reversed. This study was limited to PLIF approach, which narrows the scope of study, and had there been inclusion of other approaches, this could have provided broader conclusions.

Conclusion

This research studied patients who were scanned with both MR and CT as routine, mostly not using any metal artefact reduction techniques. Despite this, it shows a good level of matching between MR and CT. There is strong statistical agreement between MRI and CT for evaluating the adequacy of screw placement in PLIF. The necessity for both imaging modalities for every patient is refuted. MRI has the best potential due to its unparalleled soft-tissue contrast and demonstrates its place in evaluating adequate screw position. MRI detects accurately placed screws with high statistical confidence. However, there remains a discrepancy in screw size when using MRI from susceptibility artefact, which is not seen in CT. It is therefore recommended to use MR in the post-operative evaluation and restrict the use of CT for cases of uncertainty following MR, and this decision should be based on the clinical presentation of the patient. Future research could add to the reliability of MRI by investigating the optimal metal artefact reduction sequence for internal spine fixation.