Introduction

Whole-body magnetic resonance imaging (WB-MRI) is a modern MRI methodology with applications in bone marrow staging. A systematic review and meta-analysis of 11 studies estimated a pooled sensitivity (Se) of 90% with a 95% confidence interval (CI) of 84–94% and a specificity of 92%, 95% CI 88–95% [1]. WB-MRI is reported to be the second most sensitive and specific method for the detection of bone metastasis after 2-deoxy-2-[fluorine-18]fluoro-D-glucose positron emission tomography integrated with computed tomography (18F-FDG PET/CT). However, even though the main body of literature supports the nuclear medicine approaches, WB-MRI is developing into a diagnostic trend owing to its apparent advantages, such as the lack of ionizing radiation, excellent soft-tissue contrast, lower cost, increased availability, and its safe applicability in pregnant women.

MRI can demonstrate intramedullary metastatic deposits in advance of cortical or matrix destruction and before a pathologic osteoblastic process manifests as a focal accumulation of radiotracer on 99mTc-hydroxydiphosphonate bone scintigraphy (BS). The conspicuity of bone lesions was shown to be highest in gadolinium-enhanced, fat-suppressed T1-weighted (T1w) scans compared with diffusion-weighted imaging (DWI) or short tau inversion recovery (STIR) techniques [2].

Despite the fact that the WB-MRI is not widely included in the European Society for Medical Oncology (ESMO) staging guidelines for breast [3], prostate [4] and lung cancer [5], recent studies converge towards a potentially beneficial role compared to the first-line recommendation, the computed tomography of chest–abdomen–pelvis (CT-CAP) [6, 7] for the classification and follow-up of bone metastasis [8]. Interestingly, WB-MRI was proven powerful, especially for the selection and monitoring of the favorable prognosis patients with an oligometastatic [9] or “bone-only disease” [10].

In the current study, we aim to (i) assess the diagnostic accuracy of WB-MRI for the detection of metastatic bone disease compared to BS and (ii) evaluate the added diagnostic value of gadolinium for WB-MRI bone staging.

Methods

Recruitment and flow of participants

The study retrospectively included cancer patients screened with WB-MRI between 2007 and 2018 (Fig. 1) and was designed and structured according to the STARD guidelines [11, 12]. From a total of n = 1797 WB-MRI scans, we excluded: n = 10 aborted or incomplete scans, n = 7 non-cancer patients (battered child), and n = 529 patients without ground truth. No other eligibility criteria applied. From n = 1256 eligible patients, n = 682 were scanned at 3.0 T field strength (Philips Ingenia, Philips Medical Systems, Böblingen, Germany) and n = 574 in a 1.5 T setup (Philips Achieva or Philips Ingenia, Philips Medical Systems). The field strength was selected solely on the basis of the patient's compatibility with a 3.0 T magnetic field. In n = 528 patients, the WB-MRI was enhanced with gadoteridol (ProHance®, Bracco Imaging S.p.A., Konstanz, Germany) 0.1 mmol/kg (WB-MRI + Gd). Contrast enhancement was not allocated systematically; it was administered upon demand of the treating clinician or in adherence to a standard therapeutic protocol. The remaining n = 728 patients came without a specific recommendation for enhanced MRI, refused contrast, had a previously reported allergic reaction to gadolinium, or had deteriorated renal function. In these cases, the enhanced sequence was omitted (non-enhanced, NE WB-MRI). A subgroup of n = 285 patients received both a WB-MRI and bone scan (BS) within a time range of 12 months, which allowed for a paired method comparison (Fig. 1).

Fig. 1

Flow of participants. WB-MRI, whole-body magnetic resonance imaging. A subgroup of n = 285 patients received both a WB-MRI and bone scan (BS) within a time range of 12 months, which allowed for a paired method comparison

Compliance with ethical standards

All the patient data were derived from the database of our institution. Data were analyzed retrospectively, fully anonymized, in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its amendments, the European regulation 536/2014 and its latest addendum ICH GCP E6(R2)/2017, as well as with the guidelines of the local Institutional Review Board for clinical studies (Ethical commission of the University Hospital Jena, IRB Number 2019-1288).

Imaging and image evaluation

WB-MRI images were acquired with a dStreamWholeBody coil (Philips Medical Systems) (Fig. 2) and the protocol consisted of the following coronal sequences in brief: (i) a T1-weighted (T1w) sequence in turbo spin echo (TSE) or fast field echo (FFE) technique, (ii) a short tau inversion recovery (STIR), and (iii) a T1w FFE sequence in mDixon technique with gadoteridol contrast (mDixon + Gd). The protocol details are summarized in Table S1 and Table S2 (Appendix). The scanning duration, approximately 1 h with, and 50 min without the mDixon + Gd sequences, is illustrated in Fig. 2. The WB-MRI image evaluation was based on the joint report of two radiologists, one with low (2 years) or intermediate (5 years) experience and a consultant with more than 15 years of experience in interpreting WB-MRI.

Fig. 2

WB-MRI scanning protocol. All the scans took place in Philips MRI scanners at 1.5 T and 3 T using a dStream whole-body coil (left panel). The scanning protocol, with a total duration of ca. 60 min with and 49 min without gadoteridol, consisted of a T1w fast field echo (FFE) or fast spin echo (FSE), a short tau inversion recovery (STIR) and a mDixon with gadoteridol enhancement (mDixon + Gd)

The ground truth of metastatic disease was confirmed either with an imaging follow-up within one year or a biopsy. Bone metastases were identified as T1w-low/STIR-high signal areas of the bone marrow of cancer patients with the following behavior in follow-up scans: (i) newly appearing compared to previous scans, (ii) size progression with or without chemotherapy, (iii) size reduction or change in character under chemotherapy. Persisting disseminated osseous disease with or without chemotherapy was also accepted as ground truth for metastasis. Solitary osseous lesions or atypical degenerative changes that showed no change in follow-up scans within 1 year were not regarded as metastatic unless proven by biopsy.

BS consisted of planar images of the entire skeleton, which were acquired 3 h after i.v. application of 650 MBq 99mTc-hydroxydiphosphonate (Tc-HDP, ROTOP Pharmaka GmbH, Dresden, Germany) using a Symbia Evo Excel setup (Siemens Healthcare GmbH, Erlangen, Germany). The BS evaluation was performed by a nuclear medicine consultant with more than 15 years of experience.

Statistics and software

Data management and descriptive statistics were performed with LibreOffice™ 4.4.7.2 (The Document Foundation, Berlin, Germany) and the Microsoft© Office suite 2010 (Microsoft Ireland Operations Limited, Dublin, Ireland). Graphical processing was accomplished using the open-source platform Inkscape (License name: GPL v2+, https://inkscape.org). Images were anonymized without manipulating any clinical information. For the statistical analysis, we used SigmaPlot V16.0 (Systat Software Inc., San Jose, CA, USA). Unpaired binary datasets were processed with binary logistic regression with Fisher’s exact test and paired datasets with McNemar’s test. The interobserver agreement rate was assessed using kappa statistics. Percentages are rounded to the nearest integer unless otherwise specified.
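For orientation, the four accuracy measures reported throughout the Results follow directly from a per-patient 2 × 2 table of test result versus ground truth. The following is a minimal Python sketch of these definitions; the class name `ConfusionTable` and the counts are hypothetical and chosen for illustration only. They are not the study data, and the analysis itself was run in SigmaPlot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionTable:
    """Per-patient 2x2 table of index-test result versus ground truth."""
    tp: int  # test positive, metastasis confirmed
    fp: int  # test positive, no metastasis
    fn: int  # test negative, metastasis confirmed
    tn: int  # test negative, no metastasis

    def sensitivity(self) -> float:
        # Fraction of patients with metastasis that the test detects.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Fraction of patients without metastasis that the test clears.
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Fraction of positive test results that are true positives.
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Fraction of negative test results that are true negatives.
        return self.tn / (self.tn + self.fn)


# Hypothetical counts for illustration only; NOT the study data.
example = ConfusionTable(tp=90, fp=10, fn=5, tn=95)
for name in ("sensitivity", "specificity", "ppv", "npv"):
    print(f"{name}: {getattr(example, name)() * 100:.0f}%")
```

Note that Se and Spe are properties of the test, whereas PPV and NPV also depend on how common metastatic disease is in the examined group.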

Results

WB-MRI exceeds the sensitivity of bone scan for bone metastasis detection without compromising the specificity

From a total of n = 1256 eligible patients, n = 285 received both a WB-MRI and a BS within a time interval of 12 months. The selection of those patients was based on the independent therapeutic decisions of the clinical team and was not affected by age, gender, or type of primary cancer. The cancer distribution in the WB-MRI + BS subgroup did not differ from the total eligible patient pool, P > 0.05, logistic regression (Fig. 3a, b). First, we compared the performance of the entire WB-MRI pool to the total number of BS using unpaired statistics (Fig. 3c). WB-MRI showed a significantly higher sensitivity (Se, 98%) for bone metastasis detection compared to BS (82%), whereas both methods were comparable in terms of specificity (Spe, 93 and 91%, respectively), P < 0.001, binary logistic regression with Fisher’s exact test. The result was reproduced with paired statistics between the WB-MRI and BS in the subgroup of patients (n = 285) that were subjected to both examinations (Fig. 3d). The Se of WB-MRI exceeded that of BS by approximately 16 percentage points (98 and 82%, respectively), P < 0.001, McNemar’s test. The Spe and the positive predictive value (PPV) of WB-MRI appeared superior to those of BS, especially for the prostate and lung cancer groups, but not for the breast cancer patients (Fig. 3d). However, this observation calls for a cautious interpretation, since the disproportionately larger number of breast compared to lung cancer patients might bias the statistical result.
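The paired comparison rests only on the discordant pairs, i.e., patients in whom exactly one of the two methods was correct. The following is a minimal Python sketch of an exact (binomial) McNemar test. Whether SigmaPlot applies the exact or the chi-square form is not stated here, so the exact variant is an assumption, and the function name `mcnemar_exact` and the counts are hypothetical.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the discordant pair counts.

    b: pairs in which only method A was correct
    c: pairs in which only method B was correct
    Under the null hypothesis the discordant pairs split as Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    # Probability of a split at least as lopsided as the observed one (one tail).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts for illustration only; NOT the study data.
p_value = mcnemar_exact(b=20, c=2)
print(f"exact McNemar p = {p_value:.6f}")
```

Concordant pairs (both methods correct or both wrong) do not enter the statistic, which is why the paired design is informative even in a modest subgroup.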

Fig. 3

Diagnostic accuracy of WB-MRI vs. bone scan for bone metastasis. The diagnostic accuracy of WB-MRI (n = 1256) was compared to bone scans from a subgroup of n = 285 patients that received both the WB-MRI and bone scan (WB-MRI + BS) within 12 months. The distribution of primary cancers was similar between the WB-MRI (a) and the WB-MRI + BS (b) group. c Unpaired comparison of the sensitivity (Se), specificity (Spe), positive predictive value (PPV), and negative predictive value (NPV) between the WB-MRI and BS. The Se and NPV of WB-MRI were significantly higher compared to BS, P < 0.001, binary logistic regression with Fisher’s exact test. d Paired comparison of the Se, Spe, PPV, and NPV between the WB-MRI and the BS in the WB-MRI + BS subgroup. The Se and NPV of WB-MRI were significantly higher compared to BS, P < 0.001, McNemar’s test. For the three most common cancer groups, the Se/Spe/PPV/NPV of WB-MRI were: for breast cancer 99/92/87/99%, for prostate cancer 93/91/93/91%, and for lung cancer 96/100/100/97%

The calculated interobserver agreement (IOA) between WB-MRI and BS was 71%, with Cohen’s kappa = 0.42 (Fig. 4). Regardless of the primary disease (Fig. 4), the IOA between the WB-MRI and BS remained in a moderate range. It is worth noting that, due to the retrospective character of the study, the reporting was not explicitly double-blinded between the nuclear medicine and the radiology consultant; hence, the discrepant results occurred even though the second reader had open access to the first report. Indeterminate data with an interobserver disagreement between the two methods were managed individually according to the level of conspicuity: high conspicuity in the dominant detection method favored the treatment as metastasis. Low-conspicuity lesions in either or both methods were clarified further by a dedicated MRI of the body region, an 18F-FDG PET/CT, or a biopsy. The statistical analysis of the management of indeterminate lesions is beyond the scope of this study.
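The difference between the raw agreement rate and Cohen’s kappa can be illustrated with a minimal Python sketch: kappa subtracts the agreement that two raters would reach by chance given their individual positive rates. The function name `cohen_kappa` and the counts below are hypothetical and are not the study data.

```python
from typing import Tuple


def cohen_kappa(both_pos: int, a_only: int, b_only: int, both_neg: int) -> Tuple[float, float]:
    """Return (observed agreement, Cohen's kappa) for two binary ratings.

    both_pos / both_neg: patients rated alike by both methods
    a_only / b_only: patients rated positive by one method only
    Assumes chance agreement is below 1 (otherwise kappa is undefined).
    """
    n = both_pos + a_only + b_only + both_neg
    p_observed = (both_pos + both_neg) / n
    a_pos = (both_pos + a_only) / n  # positive rate of method A
    b_pos = (both_pos + b_only) / n  # positive rate of method B
    # Agreement expected if both methods rated independently at these rates.
    p_chance = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return p_observed, kappa


# Hypothetical counts for illustration only; NOT the study data.
agreement, kappa = cohen_kappa(both_pos=40, a_only=20, b_only=10, both_neg=30)
print(f"observed agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

In this toy table, a raw agreement of 70% corresponds to a kappa of only about 0.4, because half of the agreement would be expected by chance alone.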

Fig. 4

Interobserver rating agreement between WB-MRI and bone scan. Tornado plot of the interobserver agreement (right) and disagreement (left) rate for bone marrow metastases in different cancer groups. Interobserver agreement (IOA) = 71.15%, kappa statistics with Cohen’s kappa = 0.42. CA cancer, SCC squamous cell cancer, CUP cancer of unknown origin, GI gastrointestinal, NSCLC non-small-cell lung cancer, SCC small cell cancer

Summarizing the above, we conclude that the WB-MRI is superior in sensitivity and at least equal in specificity for bone metastasis detection compared to the BS. The moderate IOA, even for non-double-blinded reporting, confirms that the characteristics of metastatic lesions detected by each method are not necessarily overlapping.

Gadolinium does not improve the diagnostic accuracy of WB-MRI

As a second aim of this study, we examined whether the addition of an mDixon + Gd sequence improves the diagnostic efficiency. Driven by the recent observation of gadolinium accumulation in the deep brain nuclei [13,14,15] and given the obvious benefits of omitting contrast with regard to patient preparation, compliance, scanning time, additional cost, and side effect avoidance, we questioned whether the joint conspicuity of bone metastases in T1w and STIR sequences is indeed significantly improved by the complementary mDixon + Gd sequence (Fig. 5). While the T1w TSE or FFE sequences reproduce the anatomy in high resolution with a voxel size of ca. 1 × 1 × 3 mm (x, y, z) at both field strengths (see Table S1 and Table S2 in the supplement for more information) and are more robust against field inhomogeneities, the lower resolution STIR sequences (Fig. 5) allow for a signal intensification in regions where the fatty bone marrow is replaced by metastatic cells and extravasated fluid, such as the osseous metastasis (Fig. 5, arrow). Bone marrow metastases, shown as intensified spots on STIR images, reveal a vivid contrast agent accumulation in the mDixon + Gd sequences due to neoangiogenesis (Figs. 4b, 5b, white arrows) [16, 17].

Fig. 5

Sample images of WB-MRI for the detection of bone metastasis in prostate and breast cancer. Left panels: non-enhanced T1-weighted fast field echo (T1w FFE) in coronal sections with whole-body stitching. Middle panels: short tau inversion recovery (STIR). Right panels: gadolinium-enhanced T1-weighted mDixon fast field echo (mDixon + Gd) from the same patient in coronal sections with whole-body stitching. The arrow points at the vertebral osseous metastatic lesions, magnified in the insert image for each panel. Bone metastases have a weak T1w, a high STIR signal, and a homogeneous shortening of the T1 relaxation time in the mDixon + Gd. Cyst of the right kidney as an incidental finding in the prostate cancer case (upper row)

The comparison of NE WB-MRI (n = 728) vs. WB-MRI + Gd (n = 528) was unpaired, between different patient groups. The decision for Gd administration was determined by clinical criteria and by the patient’s tolerance and consent, and hence was not statistically randomized. As a result, breast cancer patients are over-represented in the WB-MRI + Gd group, 83% vs. 61% in NE WB-MRI (Fig. 6a, b). The sensitivity (Se) for NE WB-MRI and WB-MRI + Gd (Fig. 6c) was approximately 98% and 99%, and the specificity (Spe) 93% and 93%, respectively. The positive predictive value (PPV) was 92% for NE and 85% for WB-MRI + Gd, thus suggesting, if anything, a higher proportion of false positives with Gd. The negative predictive value (NPV) was 98% and 100%, respectively. Binary logistic regression with Fisher’s exact test suggests that the diagnostic accuracy of WB-MRI for bone staging in our dataset was not influenced by gadolinium application, P = 0.836 (Fig. 6c). Hence, we conclude that the addition of Gd did not improve the diagnostic accuracy of the WB-MRI for the detection of bone metastases.
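An unpaired comparison of this kind can be framed as a 2 × 2 table of protocol (NE vs. + Gd) against reading outcome (correct vs. incorrect). The following is a minimal Python sketch of a two-sided Fisher’s exact test on such a table. The function name `fisher_exact_two_sided`, the table layout, and the counts are assumptions for illustration and do not reproduce the study data or the SigmaPlot procedure.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is at least as unlikely as the observed one;
    # the small tolerance guards against floating-point ties.
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Hypothetical table for illustration only; NOT the study data.
# Rows: protocol (NE, + Gd); columns: reading (correct, incorrect).
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"Fisher exact p = {p_value:.4f}")
```

A large p-value, as in this toy table, means the data give no evidence that the proportion of correct readings differs between the two protocols; it does not by itself prove equivalence.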

Fig. 6

Diagnostic accuracy of WB-MRI for bone staging with and without contrast enhancement. a Distribution of primary cancers in the non-enhanced group (NE): breast cancer n = 443 (61%), prostate cancer n = 77 (10%), lung cancer n = 64 (9%), and miscellaneous cancers n = 144 (20%). b Distribution of primary cancers in the gadolinium-enhanced group (WB-MRI + Gd): breast cancer n = 441 (83%), prostate cancer n = 24 (5%), lung cancer n = 13 (2%), and miscellaneous cancers n = 50 (9%). c Diagnostic accuracy of NE and WB-MRI + Gd for bone staging: the sensitivity (Se) NE/WB-MRI + Gd was 98/99%, the specificity (Spe) 93/93%, the positive predictive value (PPV) 92/85% and the negative predictive value (NPV) 98/99%, P = 0.836, binary logistic regression with Fisher’s exact test, odds ratio 0.9 with 95% CI 0.502–1.612. d Incidental findings in WB-MRI. Additional tumors (NE/WB-MRI + Gd) n = 55/66, benign findings with surgical relevance n = 29/46, orthopedic/degenerative findings n = 316/264, additional non-osseous metastasis (MTS) n = 176/113
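An odds ratio with its 95% CI, as quoted in the caption above, can be derived from a 2 × 2 table with the standard log-odds (Wald) approximation. The following is a minimal Python sketch; the function name `odds_ratio_ci` and the counts are hypothetical, and the method by which the reported interval was computed is not stated, so the Wald form is an assumption.

```python
from math import exp, log, sqrt
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald confidence interval for the table [[a, b], [c, d]].

    All four cells must be non-zero. z = 1.96 gives an approximate 95% CI.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log)
    upper = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only; NOT the study data.
odds_ratio, lower, upper = odds_ratio_ci(10, 20, 20, 10)
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.3f} to {upper:.3f}")
```

An interval that includes 1, as the reported 0.502–1.612 does, is consistent with no difference in the odds between the two groups.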

Another attractive feature of the WB-MRI compared to BS is the number of incidental findings, sometimes with vital oncological, orthopedic, or surgical significance. Although the analysis of incidentals is not a central subject of the current study, Fig. 6d illustrates that the number and type of incidental findings did not significantly differ between the NE WB-MRI and WB-MRI + Gd. This result is, however, not normalized for the cancer type, disease duration, or patient’s age. Although a non-negligible number of non-osseous metastases was detected with WB-MRI, the diagnostic accuracy for non-osseous metastatic disease and local staging was not further evaluated. WB-MRI is not a standard of diagnosis for extraosseous staging, primarily due to the low spatial resolution, motion artifacts, and signal inhomogeneities of the multi-array whole-body coil.

Discussion

In this study, we assess the diagnostic accuracy of WB-MRI retrospectively as a bone staging tool. Primarily, we are focusing on (i) the WB-MRI performance in comparison with BS and (ii) the added diagnostic value of the gadolinium enhancer.

Several studies support that WB-MRI is a cost-effective and accurate method for the detection of bone metastases, with the potential to modify diagnostic decisions compared to CT, BS [6], and PET-CT [18], even without contrast enhancement [18]. The superiority of WB-MRI for bone staging was recognized as early as 15 years ago [19]. Since then, technological advances and improvements in patient comfort have encouraged the diagnostic implementation of WB-MRI as a staging tool. Even though the European diagnostic guidelines for the most common cancers [3, 4] do not recommend WB-MRI as a standard of diagnosis, a growing body of evidence supports its superiority over BS. Two meta-analyses, including 145 [20] and 33 studies [21], reported a pooled Se of 91/86% and Spe of 95/81% for WB-MRI and BS, respectively, hence demonstrating a statistically significant superiority of WB-MRI over BS for bone staging regardless of the primary disease.

Interestingly, the metabolic and vascular fingerprint of the osseous metastases from different primary cancers influences the diagnostic accuracy of WB-MRI. Breast cancer metastases are highly conspicuous in the water-enhanced MRI sequences (Fig. 5), both in our study and in previous studies that demonstrated a Se/Spe/PPV of 95/100/100% versus 70/94/95% for BS [22]. A meta-analysis designed to determine the best diagnostic tool for breast cancer bone staging [23] pooled 23 studies and reported a Se of 97/88% for WB-MRI/BS. The same meta-analysis supports that WB-MRI is the best method for the detection of skeletal metastases in breast cancer, significantly outperforming both FDG-PET and BS [23].

Similar to breast cancer, prostate cancer is another promising candidate for WB-MRI bone staging. Shen et al., in a meta-analysis from 2014, pooled 27 studies and reported a Se for WB-MRI/BS of 97/79% on a per-patient basis [24]. According to the authors of the meta-analysis [24] and based on data from the prospective SKELETA study by Jambor et al. [25], WB-MRI is the method of choice for prostate bone staging, with at least equal [25] or even significantly improved performance compared with BS and choline PET/CT [24]. Our extensive database (n = 1256 patients) is a substantial addition to the growing body of evidence that highlights WB-MRI as the method of choice for breast and prostate cancer staging.

Apart from the breast and prostate cancer patients, who together make up 78% of our database, we included a small percentage of lung cancer patients (6%, n = 77 patients). A separate analysis of different cancer types showed a benefit for lung cancer patients that received WB-MRI compared to BS (Fig. 6). Two meta-analyses of 17 and 34 studies [26, 27] reveal a pooled Se of 77–86% and 80–92% for WB-MRI and BS, respectively. The standard of diagnosis for bone metastasis in lung cancer is 18F-FDG PET/CT, with superior Se and Spe to BS and WB-MRI. Our study suggests, however, that WB-MRI might be a reliable alternative to 18F-FDG PET/CT with higher Se and Spe values. This result nevertheless calls for cautious interpretation, since the lung cancer patients represented only a small fraction (6%) of the total database. The second-line staging decision in lung cancer should thus be individualized based on the experience of the diagnostic team.
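The caution regarding small subgroups can be made concrete with a confidence interval for a proportion: the same point estimate is far less certain when it rests on few patients. The following is a minimal Python sketch using the Wilson score interval; the function name `wilson_interval` and the counts are hypothetical and are not the study data.

```python
from math import sqrt
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 for roughly 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Hypothetical counts for illustration only; NOT the study data.
# Both groups have the same observed proportion of 96%.
small_group = wilson_interval(24, 25)
large_group = wilson_interval(240, 250)
print(f"n = 25:  {small_group[0]:.3f} to {small_group[1]:.3f}")
print(f"n = 250: {large_group[0]:.3f} to {large_group[1]:.3f}")
```

With tenfold fewer patients the interval is several times wider, so an apparently excellent Se or Spe in a small subgroup remains compatible with a considerably lower true value.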

Although gadolinium contrast agents are steadily improving towards a kidney-friendly direction, gadolinium deposition in the CNS and the bone marrow [28], as well as significant concerns and side-effects in children [29], remain drawbacks of the method. The baseline WB-MRI protocol consists of a T1w anatomical sequence and a STIR with enhanced water contrast for the bone marrow edema. We retrospectively questioned the sufficiency of those sequences and the necessity for additional + Gd images for a precise and sensitive bone metastasis detection. The STIR sequence was rated with a high sensitivity, in the range of 96%, for the detection of metastatic skeletal disease as early as 20 years ago [30] and is currently a standard of diagnosis in WB-MRI imaging protocols [6]. Implementation of the Dixon technique in WB-MRI enriched the diagnostic battery with high-resolution fat-suppressed images. Costelloe et al. compared the metastatic bone lesion conspicuity in the mDixon + Gd and STIR, reporting a significant advantage of the mDixon + Gd [2]. Our study, on the other hand, did not reproduce this result and revealed an equal diagnostic accuracy for the combination of STIR + T1w and mDixon + Gd. The explanation of this discrepancy lies most likely in differences in the study design. Costelloe et al. use a semi-quantitative arbitrary conspicuity scale to classify the lesions, whereas in our model, the diagnosis is “all-or-none” and the statistical analysis is based on binary data.

The retrospective character of this study does not allow for unbiased randomization of the patients regarding the administration of contrast agent in WB-MRI. This weakness is mitigated by the large data sample, in the range of “big data,” which was not subjected to any rejection criteria. Moreover, retrospective fragmentation of the large dataset into smaller groups based not only on the contrast administration, but also on the field strength and primary cancer further reduces the probability of bias entering the sub-analyses.

A disadvantage of this study is the absence of WB-DWI from the imaging protocol, mainly due to the additional prolongation of an already time-demanding examination (Fig. 2). Recent advances in the shortening of the scanning time, as well as the omission of the Gd-enhanced sequences, should allow for the standardization of DWI in WB-MRI. Another drawback of the current study is its retrospective character, which can provide only a low grade of evidence. Adequately powered, prospective study designs such as the SKELETA study [25] could support the establishment of NE WB-MRI in the clinical guidelines for breast and prostate cancer.