Introduction

Lung cancer is the leading cause of death from cancer worldwide (1.8 million new cases and 1.6 million deaths in 2012) (Lung cancer 2012). The 5-year survival rates vary from 8 to 16% in Europe and Northern America (National Collaborating Centre for Cancer (UK) 2011; Alberg et al. 2013). Importantly, the survival rates are highly dependent on the stage of the disease (localized stage 52%, regional stage 24%, and distant stage 4%) (Alberg et al. 2013). Unfortunately, in early stages, patients are often asymptomatic or show unspecific symptoms; thus, 70–75% of the tumors are primarily inoperable at the time of diagnosis (National Collaborating Centre for Cancer (UK) 2011; Howington et al. 2013; Jett et al. 2013).

In 2011, the US National Lung Screening Trial Research Team et al. (2011) demonstrated a 20% reduction of lung cancer mortality due to early detection with annual low-dose computed tomography (LDCT), which is unmatched by any therapeutic approach. On the downside, LDCT of the thorax was associated with a radiation exposure of approximately 1.5 mSv per scan (National Lung Screening Trial Research Team et al. 2011). The rate of radiation-induced cancers for screening participants receiving annual LDCT has been estimated to be in a range of 0.5–5.5% (Brenner 2004).

Magnetic resonance imaging (MRI) allows for radiation-free lung imaging, but is hampered by susceptibility artifacts at air–tissue interfaces, low proton density of the lung parenchyma, and vulnerability to respiratory and cardiovascular motion artifacts (Wielpütz and Kauczor 2012; Wild et al. 2012; Biederer et al. 2012). Recent technical developments in MRI improved sensitivity rates for nodules ≥ 8 mm to up to 100% (Biederer et al. 2017; Sommer et al. 2014; Cieszanowski et al. 2016; Koyama et al. 2013; Meier-Schroers et al. 2016; Schroeder et al. 2005). The current threshold size for detection of pulmonary nodules is believed to be 3–4 mm (Biederer et al. 2017).

The aim of this study is to report on the first screening round of an MRI lung cancer screening in a high-risk population and to evaluate the suitability of MRI compared to LDCT.

Materials and methods

This prospective study was approved by the institutional review board and by the federal agency for radiation protection. Written informed consent was obtained from all subjects.

Study participants

233 consecutive participants (mean age 58.5 ± 5.7 years) were recruited for our lung cancer screening program. Inclusion criteria were selected to allow for comparison with other lung cancer screening programs, namely, the US National Lung Screening Trial and the German LUSI-Trial (National Lung Screening Trial Research Team et al. 2011; Becker et al. 2012). General inclusion criteria were age 50–70 years, as well as long-term nicotine abuse (at least 15 cigarettes per day for at least 25 years, or at least 10 cigarettes per day for at least 30 years); participants were active smokers or had quit smoking not more than 10 years earlier (Becker et al. 2012). Nine of 233 participants only obtained low-dose CT (LDCT) due to contraindications for magnetic resonance imaging (MRI), eight of them because of claustrophobia, and one because of a cochlear implant. A total of 224 participants underwent LDCT and MRI within the same day or week.

Technique

LDCT was performed on a clinical 128-slice spiral CT scanner (iCT, Philips Healthcare, Best, The Netherlands) in inspiratory breathhold with a reconstructed slice thickness of 2 mm. MRI was performed on a clinical 1.5T scanner (Ingenia, Philips Healthcare, Best, The Netherlands) using a phased array body coil with subjects positioned head first and arms down. Acquired MRI sequences were transverse T2-weighted short tau inversion recovery (STIR) MultiVane XD (MVXD, Philips Healthcare, Best, The Netherlands), transverse and coronal T2-weighted MVXD, transverse balanced steady-state-free precession (bSSFP), coronal 3D T1-weighted high-resolution isotropic volume excitation images (THRIVE, Philips Healthcare, Best, The Netherlands), and transverse diffusion-weighted images (DWI). Image acquisition for MVXD and DWI was gated to the expiratory phase of the respiratory cycle, and bSSFP and THRIVE were obtained in expiratory breathhold. The type of sequences we used and their imaging parameters followed the recommendations of Biederer et al. (2012). For the scan protocol and technical data, see Table 1. Maximum in-room-time was 20 min, so that up to three participants could be scanned per hour.

Table 1 Imaging parameters of the scan protocol

Image analysis

MR images were prospectively reviewed by two radiologists (Michael Meier-Schroers and Daniel Thomas) with an experience of 5.5 and 16 years, respectively. Both readers were unaware of the CT findings to eliminate a detection bias. MRI data sets were anonymized and randomly presented to the readers. All acquired MRI sequences were evaluated in synopsis.

In the first reading session, the pulmonary nodules were prospectively assessed and categorized based on their appearance and size, following the recommendations of the Lung Screening Reporting and Data System (Lung-RADS) (American College of Radiology 2017): solid nodules 4–5, 6–7, 8–14, or ≥ 15 mm and subsolid nodules < 20 or ≥ 20 mm. The minimum size of assessed nodules was 4 mm. Nodule size was defined as the average of longest and shortest axial diameters rounded to the nearest whole number. Measurement was performed on the sequence that best displayed the nodule. Diffusion restriction was separately assessed for nodules ≥ 6 mm.
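The size definition and subgroups described above can be illustrated with a minimal sketch. The function names and category labels are hypothetical, and the handling of exact halves (rounded up here) is an assumption, since the text only states rounding to the nearest whole number.

```python
import math
from typing import Optional


def nodule_size_mm(longest_mm: float, shortest_mm: float) -> int:
    """Mean of longest and shortest axial diameters, rounded to whole mm."""
    mean = (longest_mm + shortest_mm) / 2
    # Assumption: halves are rounded up. Python's built-in round() would
    # instead round halves to the nearest even integer.
    return math.floor(mean + 0.5)


def size_category(size_mm: int, subsolid: bool = False) -> Optional[str]:
    """Return the size subgroup, or None below the 4-mm minimum."""
    if size_mm < 4:
        return None  # nodules smaller than 4 mm were not assessed
    if subsolid:
        return "subsolid < 20 mm" if size_mm < 20 else "subsolid >= 20 mm"
    if size_mm <= 5:
        return "solid 4-5 mm"
    if size_mm <= 7:
        return "solid 6-7 mm"
    if size_mm <= 14:
        return "solid 8-14 mm"
    return "solid >= 15 mm"
```

For example, a solid nodule with axial diameters of 7 and 5 mm has a size of 6 mm and falls into the 6–7 mm subgroup.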

In a second reading session, MRI findings were correlated with LDCT, which served as the reference imaging modality (gold standard of nodule detection). To assess sensitivity and specificity of MRI for the different Lung-RADS subgroups, nodules had to be categorized as true positive, true negative, false positive, and false negative for each subgroup. Nodules were classified as false negative on MRI when they could not be detected or when underestimation of size led to a downgrading of their category. Nodules were considered false positive in cases of lack of evidence on LDCT or MRI-based overestimation of size leading to an upgrading of their category.
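The nodule-level rules of the second reading session can be sketched as follows. This is an illustration only: categories are represented by hypothetical integer ranks (a higher rank means a larger size subgroup), and None marks a nodule that was not seen on the respective modality.

```python
from typing import Optional


def classify_on_mri(ct_rank: Optional[int], mri_rank: Optional[int]) -> str:
    """Classify one nodule on MRI against LDCT as the reference."""
    if ct_rank is None and mri_rank is None:
        raise ValueError("nodule must be seen on at least one modality")
    if ct_rank is None:
        return "false positive"  # lack of evidence on LDCT
    if mri_rank is None:
        return "false negative"  # not detected on MRI
    if mri_rank < ct_rank:
        return "false negative"  # underestimated size, category downgraded
    if mri_rank > ct_rank:
        return "false positive"  # overestimated size, category upgraded
    return "true positive"


# Hypothetical ranks for the solid subgroups: 0 = 4-5 mm, 1 = 6-7 mm,
# 2 = 8-14 mm, 3 = >= 15 mm.
print(classify_on_mri(ct_rank=1, mri_rank=2))  # false positive
```

The usage line corresponds to a nodule measuring 7 mm on LDCT that is assigned to the 8–14 mm subgroup on MRI.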

Both MRI and LDCT data sets were viewed on a professional medical monitor using IMPAX EE (AGFA Healthcare, Bonn, Germany).

Management of nodules

Following Lung-RADS recommendation (American College of Radiology 2017), further work-up was considered for subjects with nodules ≥ 6 mm. This was defined as a positive screening result. Those with nodules 6–7 mm underwent follow-up after 6 months. In cases of nodules ≥ 8 mm, subsequent management was discussed in an interdisciplinary conference with thoracic surgeons, pulmonologists, and oncologists. According to conference decision, subjects underwent follow-up after 3 months, positron-emission tomography–computed tomography (PET/CT), biopsy, and/or resection. The results of follow-up examinations, PET/CT, biopsy, and surgery were also analyzed in this study.
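The size thresholds for work-up described above can be summarized in a short sketch. It is restricted to solid nodules, the function names and return strings are hypothetical, and it is not a substitute for the Lung-RADS document or for the conference decision.

```python
def is_positive_screen(size_mm: int) -> bool:
    """A screening result counts as positive for nodules >= 6 mm."""
    return size_mm >= 6


def solid_nodule_management(size_mm: int) -> str:
    """Map the size of a solid nodule to the next step used in this study."""
    if size_mm >= 8:
        # Follow-up after 3 months, PET/CT, biopsy, and/or resection are
        # then decided case by case in the conference.
        return "interdisciplinary conference"
    if size_mm >= 6:
        return "follow-up after 6 months"
    return "no early recall"
```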

Statistical analysis

Statistical analysis was performed with SPSS 24 (IBM, Armonk, New York, USA). We calculated sensitivity, specificity, positive predictive value, negative predictive value, and area under the curve of MRI for all nodule subgroups according to appearance and size. These measures of diagnostic performance were calculated for nodule detection by MRI compared to LDCT as the gold standard of nodule detection. The Pearson coefficient was applied to determine correlations of nodule size as measured by MRI and LDCT.
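For readers who wish to reproduce such measures outside SPSS, the following minimal sketch shows the standard definitions of sensitivity, specificity, positive and negative predictive value, and the Pearson coefficient. The class and function names are hypothetical, the example numbers are invented solely to exercise the code, and the area under the curve is not covered.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical counts and sizes (not study data).
example = ConfusionCounts(tp=8, fp=2, fn=2, tn=88)
example_r = pearson_r([4, 6, 9, 15], [4, 6, 8, 15])
```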

Results

According to magnetic resonance imaging (MRI), 31 of 224 participants had a positive screening result leading to immediate recall or short-term follow-up. Hence, the MRI-based early recall rate was 13.8%. The early recall rate for low-dose CT (LDCT) as reference was slightly lower (29 of 233 participants, 12.5%).

The screening results were false positive in 23 of 31 cases for MRI (74.2%) compared to 21 of 29 cases for LDCT (72.4%). In one case, a 6-mm nodule on MRI was not seen on LDCT. In another case, MRI could not depict that a 6-mm nodule was fat containing and thus definitely benign according to Lung-RADS (American College of Radiology 2017) (Fig. 1).

Fig. 1

Solid 6-mm nodule in the right upper lobe, fat containing according to CT, slightly T2 hyperintense on MRI (from left to right: CT lung window, CT soft-tissue window, MRI T2 STIR MultiVane XD, and MRI-balanced steady-state-free precession)

In the remaining 21 cases, which were identical on LDCT and MRI, findings were stable after 3 or 6 months. Two of these 21 subjects were additionally examined with positron-emission tomography–computed tomography (PET/CT) because of an ill-defined pulmonary nodule with a size of 10 and 11 mm, respectively. According to PET/CT, both lesions were most probably atelectasis and inflammatory tissue. In both cases, opacities were distinctly regressing at 3-month follow-up.

Following Lung-RADS recommendations and based on an interdisciplinary consensus decision, histology (biopsy or surgery) was obtained in eight of 224 cases for MRI (biopsy rate of 3.6%) and in eight of 233 cases for LDCT (biopsy rate of 3.4%). The cases with recommendations for biopsy were identical for MRI and LDCT.

In six of these eight cases, screening MRI and LDCT showed highly cancer suspicious nodules. Four subjects were scheduled for curative tumor resection, and subsequent histology revealed non-small cell lung cancer (NSCLC) in all cases (stages IA, IB, IIA, and IIB in one case each). Two different subjects showed distant metastases in clinical staging directly after screening. The histological proof of NSCLC was obtained from CT-guided lung biopsy in one of these cases and from stereotactic biopsy of a brain metastasis in the other. In both cases, tumors were classified as stage IV, and both patients received palliative chemotherapy.

One participant with a 7-mm nodule underwent 3-month follow-up despite the recommendation of the interdisciplinary conference for a 6-month follow-up. The nodule was broadly stable after 3 months, but grew distinctly from 7 to 25 mm after another 9-month interval. Histology revealed small cell lung cancer (SCLC). However, at that time, the participant already had brain and liver metastases (extensive disease). In a different case, one nodule grew from 8 mm at baseline to 10 mm 3 months later. This participant underwent surgery, and the nodule turned out to be a stage IA NSCLC.

According to LDCT, 110 of 224 participants showed pulmonary nodules. MRI accurately determined that nodules were present in 86 of these 110 cases (78.2%).

The advantages of MRI over non-enhanced LDCT are shown in Figs. 2 and 3. In both cases, nodules are hard to see on LDCT, because they are adjacent to pulmonary vessels. Both nodules are clearly visible on MRI.

Fig. 2

Solid 8-mm nodule between pulmonary vessels in the left upper lobe (from left to right: CT lung window, CT soft-tissue window, MRI T2 STIR MultiVane XD, and MRI-balanced steady-state-free precession)

Fig. 3

Solid 15-mm nodule centrally in the middle lobe (from left to right: CT lung window, CT soft-tissue window, MRI T2 STIR MultiVane XD, MRI-balanced steady-state-free precession)

The total number of detected nodules in our study population was 137 according to LDCT. MRI accurately detected 61 of 88 solid nodules with a size of 4–5 mm (69.3%). Regarding the 27 missed solid nodules 4–5 mm, ten were not visible due to motion or susceptibility artifacts, seven were mistaken for pulmonary vessels (or not seen, because they were adjacent to vessels), six were not visible due to calcifications, and four were located inside hypoventilated lung parenchyma and thus not definable as nodules on MRI.

37 of 38 solid nodules ≥ 6 mm (97.4%) were detected by MRI. The one missed nodule (6 mm) was fat containing, very flat, and located inside hypoventilated lung parenchyma. Three nodules with a size of 6–7 mm were calcified; all of them could be detected by MRI.

MRI accurately detected 8 of 11 subsolid nodules (72.7%), all of which belonged to the group < 20 mm. The three missed nodules (4, 5, and 7 mm) were not detected due to artifacts.

Ten of the 115 nodules detected by MRI were false positive (8.7%). In four of these cases, pulmonary vessels were erroneously diagnosed as nodules; all of them had a size of 4–5 mm on MRI. In four different cases, alleged nodules on MRI were streaky, probably inflammatory opacities on LDCT; two of these appeared to be solid on MRI with a size of 4 and 6 mm, respectively; the other two appeared to be subsolid on MRI with sizes of 9 and 14 mm, respectively. In another case, a 4-mm solid nodule on MRI was elongated scar tissue on LDCT. Finally, there was one solid nodule with a size of 9 mm on MRI, which actually measured 7 mm according to LDCT. This nodule was detected on MRI, yet the assigned category was too high (classified as false positive for 8–14 mm).

Table 2 displays the number of detected nodules on LDCT, as well as the number of true-positive, false-negative, and false-positive nodules on MRI for the different nodule subgroups. Sensitivity, specificity, positive predictive value, negative predictive value, and area under the curve of MRI for the different nodule subgroups are shown in Table 3.

Table 2 Nodules detected by LDCT and MRI
Table 3 Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and area under the curve (AUC) of MRI

Mean nodule size was 6.44 mm on MRI (median 5 mm) and 6.00 mm on LDCT (median 4 mm). Nodule size as determined by MRI correlated significantly with measurements on LDCT (r = 0.992, p < 0.001).

There was a size discrepancy of 1 mm between LDCT and MRI in 19 nodules. In one case, a size discrepancy of 2 mm led to a false-positive MRI result. This 9-mm nodule on MRI actually measured 7 mm on LDCT. In another case, a 17-mm subsolid nodule on MRI actually measured 14 mm on LDCT. For all other nodules, size discrepancy did not exceed 1 mm.

17 of 38 solid nodules ≥ 6 mm and seven of eight bronchial carcinomas showed diffusion restriction. Three of the 21 nodules without diffusion restriction were calcified. None of the eight subsolid nodules on MRI showed restricted diffusion. Sensitivity/specificity of diffusion-weighted imaging (DWI) for predicting bronchial carcinoma was 87.5/96.0%, but positive predictive value was only 41.2%.
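The DWI sensitivity and positive predictive value can be retraced from the counts given above. The short sketch below assumes that the seven carcinomas with diffusion restriction are among the 17 nodules with diffusion restriction, which is consistent with the reported value. The specificity is not recomputed, because its denominator is not stated in the text. Variable names are illustrative.

```python
carcinomas_total = 8
carcinomas_with_restriction = 7
nodules_with_restriction = 17  # solid nodules >= 6 mm with diffusion restriction

# Sensitivity: share of carcinomas that showed diffusion restriction.
dwi_sensitivity = carcinomas_with_restriction / carcinomas_total

# PPV: share of diffusion-restricted nodules that were carcinomas
# (assumes all seven carcinomas are counted among the 17 nodules).
dwi_ppv = carcinomas_with_restriction / nodules_with_restriction

print(f"{dwi_sensitivity:.1%}")  # 87.5%
print(f"{dwi_ppv:.1%}")  # 41.2%
```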

Discussion

The results of our study suggest that magnetic resonance imaging (MRI) might be suitable for lung cancer screening. This corresponds to the hypothesis that screening can be performed with MRI due to its high sensitivity and low false-positive rate (Biederer et al. 2017). The latest technical developments of pulmonary MRI, such as radial k-space sampling (Meier-Schroers et al. 2016), ultrashort echo time (UTE) imaging (Ohno et al. 2016), and ultrafast 3D balanced steady-state-free precession (Bieri 2013), have made MRI a potential radiation-free alternative to computed tomography (CT). Still, we agree with Sommer et al. (2014) that T2-weighted sequences are the mainstay of pulmonary MRI.

The incidence of bronchial carcinoma diagnosed by lung cancer screening in the first year was 3.4% in our study population. The MRI-based early recall rate was 13.8%, with 74.2% of these cases being false-positive baseline screening results. Both rates are considerably lower than reported data from the US National Lung Screening Trial (28 and 96%) (National Lung Screening Trial Research Team et al. 2011), the German LUSI study (22 and 95%) (Becker et al. 2012), and the Dutch/Belgian NELSON study (21 and 95%, applied to the definition of a positive screening result in our study) (van Klaveren et al. 2009). The lower rates are mainly attributable to the fact that the nodule size threshold for early recall was 6 mm in our study, while it was 4 mm in the National Lung Screening Trial and 5 mm in the LUSI study and the NELSON study (National Lung Screening Trial Research Team et al. 2011; Becker et al. 2012). However, it must be considered that the authors of the NELSON study defined a test as positive when non-calcified nodules had a solid component > 500 mm³ (> 9.8 mm in diameter) and when nodules with a solid component of 50–500 mm³ (4.6–9.8 mm in diameter) showed a volume-doubling time of less than 400 days in 3-month follow-up. Hence, according to their definition, 2.6% of the participants had a positive test result with a sensitivity of LDCT-screening for the detection of lung cancer of 94.6% (van Klaveren et al. 2009).
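The diameters quoted alongside the NELSON volume thresholds are consistent with treating a nodule as a sphere, for which d = (6V/π)^(1/3). The sketch below makes this conversion explicit; the spherical shape is an assumption of the illustration, and the function name is hypothetical.

```python
import math


def sphere_diameter_mm(volume_mm3: float) -> float:
    """Diameter of a sphere with the given volume, from V = (pi / 6) * d**3."""
    return (6 * volume_mm3 / math.pi) ** (1 / 3)


print(round(sphere_diameter_mm(500), 1))  # 9.8
print(round(sphere_diameter_mm(50), 1))  # 4.6
```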

The early recall rate for MRI was only slightly higher than for LDCT in our study (13.8 vs. 12.5%), since MRI erroneously detected one 6-mm nodule that was not seen on LDCT, and classified another nodule as suspicious, even though it was fat containing on LDCT and thus definitely benign.

The probability of malignancy of small pulmonary nodules is very low, but it increases with size (McWilliams et al. 2013; Horeweg et al. 2014; MacMahon et al. 2017), which makes accurate sizing of nodules essential. In our study, the sensitivity of MRI for detection of small solid nodules < 6 mm is relatively low compared to LDCT. Yet, according to Lung-RADS, only nodules ≥ 6 mm have a slightly elevated risk of malignancy of > 1% requiring short-term follow-up or further investigation (American College of Radiology 2017). In our study, MRI had a sensitivity of more than 95% and a specificity of more than 99% for these nodules measuring ≥ 6 mm. This finding is comparable to recently published studies showing sensitivity rates of 60–90% for 4–8-mm nodules, and up to 100% sensitivity for nodules larger than 8 mm (Sommer et al. 2014; Cieszanowski et al. 2016; Koyama et al. 2013; Meier-Schroers et al. 2016; Schroeder et al. 2005). These studies concern lung cancer screening using MRI. The proportion of false-positive nodules in our study (8.7%) was only slightly higher than that reported by Sommer et al. (5%), although mean nodule size in our study was much smaller (6 mm compared to 15 mm) (Sommer et al. 2014). Furthermore, sensitivity and specificity for nodules ≥ 6 mm in our study exceeded the minimally acceptable performance criteria for screening mammography (75% sensitivity and 88% specificity) as reported by Carney et al. (2010). Altogether, there was strong correlation between MRI and LDCT concerning detectability and size determination of pulmonary nodules in our study.

30 of 137 nodules were missed by MRI in our study. 26 of them were solid with a size of 4–5 mm, one was fat containing with a size of 6 mm, and three were subsolid with a diameter of < 20 mm. According to Lung-RADS, none of these nodules requires short-term follow-up or further investigation (PET/CT, biopsy, and/or surgery), since their probability of malignancy is very low (< 1%) (Becker et al. 2012). In other words, not one nodule with a statistically elevated risk of being malignant was missed by MRI in the present study.

Subsolid nodules with ground-glass appearance on CT can turn into carcinomas with lepidic growth (syn. adenocarcinoma in situ, previously called bronchioalveolar carcinoma). The sensitivity for detecting such nodules was 73% in our study. In comparison, Rajaram et al. (2012) detected 75% of ground-glass opacities as patterns of pulmonary fibrosis with MRI. Koyama et al. (2010) demonstrated a sensitivity of 96% for distinguishing bronchioalveolar carcinomas from adenocarcinomas with solid or mixed appearance in a tumor positive population. Thus, the readers of that study were aware of the presence of tumor; besides, the mean lesion diameter was 15.7 mm in their study (8.5 mm in our study). In a different study by Koyama et al. (2008), 78% of bronchioalveolar carcinomas were detected by MRI, yet the authors did not provide information about lesion size. In the present study, subsolid nodules were relatively small, and no subsolid nodule ≥ 8 mm was missed. According to the Lung-RADS criteria, missing subsolid nodules < 20 mm is of low relevance, because such nodules require only regular follow-up after 12 months and no further investigation (American College of Radiology 2017).

There was a size discrepancy of more than 1 mm between MRI and LDCT in only two cases. First, a 9-mm nodule on MRI actually measured 7 mm on LDCT. If only MRI had been performed, this subject would have undergone follow-up after 3 months instead of 6 months. Second, a 17-mm subsolid nodule on MRI actually measured 14 mm on LDCT. However, the management of this subject would not have been different if only MRI had been performed, because subsolid nodules < 20 mm do not require short-term follow-up.

MRI could only detect 33.3% of calcified nodules ≥ 4 mm in this study population, which is likely clinically irrelevant, since calcification strongly suggests benignancy (MacMahon et al. 2017).

Besides the accurate determination of nodule size by T1- and T2-weighted MR imaging, diffusion-weighted imaging (DWI) can help to further estimate malignancy (Chen et al. 2013; Deng et al. 2016). Our results support these findings, since seven of eight carcinomas showed diffusion restriction. Still, the positive predictive value of DWI for the prediction of malignancy was only 41% in our study. Beyond that, the use of contrast-enhanced sequences can improve nodule detection as well as tissue characterization, since there is usually low contrast uptake in benign nodules (Kono et al. 2007; Alper et al. 2013). Still, we agree with Biederer et al. (2017) that the application of contrast media in a screening setting is questionable.

We opted for a fast scan protocol that was easy to perform and optimized for screening with a maximum in-room-time of 20 min. The most important sequences of our study were the T2-weighted MultiVane XD sequences. This technique allows for free-breathing acquisition with excellent motion correction (Meier-Schroers et al. 2016), which is a major advantage for participants with impaired lung function. Still, a multiparametric scan protocol with high-resolution T1- and T2-weighted images, DWI, and dynamic contrast-enhanced images is believed to further improve sensitivity, specificity, and diagnostic validity regarding malignancy.

Our study has several limitations. First, MR images were read by two radiologists experienced in body MRI and especially in lung MRI. Inexperienced readers would probably not achieve a comparably high diagnostic performance.

Second, sensitivity and specificity were calculated for nodule detection by MRI compared to LDCT as the gold standard of nodule detection. Hence, we did not assess the test characteristics of MRI for the detection of lung cancer, which would require a gold standard of lung cancer diagnosis for both screening cancers and interval cancer. This study only provided the test characteristics for the detection of LDCT-detected nodules.

Third, we did not evaluate all possible indicators of malignancy, such as nodule signal intensity and contrast on different MRI sequences, since this approach would have been beyond the scope of this present study. The MRI characteristics of benign and malignant nodules and the ability of MRI to identify slight changes of nodule size need to be evaluated in future studies.

As a further limitation, we cannot make a statement about possible nodule growth in those subjects who were scheduled for regular follow-up after 12 months, since we only evaluated follow-up examinations in cases of suspicious findings on baseline LDCT and MRI.

A final limitation is that we did not assess partly solid nodules, as they are hard to discriminate from solid nodules and pure ground-glass nodules on MR images. Consequently, partly solid nodules were subsumed as solid nodules, since the management of these nodules is very similar to the solid ones according to Lung-RADS criteria (American College of Radiology 2017).

In conclusion, MRI might be suitable for lung cancer screening, since it showed excellent sensitivity and specificity for the detection of solid pulmonary nodules ≥ 6 mm, and since MRI findings strongly correlated with LDCT. The weaknesses of MRI as a screening modality, namely a lower sensitivity for calcified nodules, small solid nodules < 6 mm, and small subsolid nodules < 8 mm, probably have little clinical relevance and might contribute to a higher specificity and positive predictive value, the lack of which is known to be one of the biggest weaknesses of screening with LDCT. Still, further investigation is needed regarding tissue characterization as well as discrimination of benign and malignant lesions by MRI.