Introduction

Cystic fibrosis (CF) is the most common genetic disease causing reduced life expectancy in Caucasians [1]. Life expectancy is reduced because of progressive lung disease, characterised by several structural lung changes, most importantly bronchiectasis and trapped air [2, 3]. To guide therapy to prevent or reduce bronchiectasis and trapped air, it is crucial to monitor CF lung disease at regular intervals.

CF lung disease monitoring has changed in the last decade [4, 5]. Nowadays, computed tomography (CT) is considered the most sensitive method to monitor CF lung disease in early and advanced stages [6] and is usually preferred over chest x-rays. The main disadvantage of CT is that it exposes patients to ionising radiation. Therefore, CT protocols for CF are usually low or ultra-low dose [7, 8]. Even at this relatively low dose, radiation exposure restricts the use of chest CT, especially in children, who are more sensitive to radiation than adults [9]. Furthermore, ionising radiation limits the frequency with which chest CT can be repeated for short- and long-term monitoring of CF lung disease [10]. Magnetic resonance imaging (MRI), as a radiation-free technique, has been introduced as an alternative to CT [11, 12]. To date, MRI has been directly compared to CT in only a few studies, using various MRI sequences [13-15], but none of those included motion-corrected free-breathing sequences, such as PROPELLER. The PROPELLER sequence has been designed to be relatively insensitive to respiratory movements, as it enables correction of in-plane motion, rotation and translation [16, 17]. Moreover, PROPELLER is increasingly proposed and applied as a respiratory-triggered sequence in non-compliant patients, such as patients not able to perform breath-hold manoeuvres (e.g. children with CF) [18]. To date, however, PROPELLER has not been compared to CT for CF. Therefore, our study aims to assess whether the breathing motion correction of PROPELLER MRI improves the diagnostic performance of MRI versus CT to assess CF lung disease in a group of stable CF patients who had both examinations performed on the same day.

Material and methods

Stable CF patients were consecutively recruited in two CF centres by their treating clinicians. A CF patient was defined as stable when no signs or symptoms of acute or recent pulmonary exacerbations were present as defined in the exclusion criteria. Inclusion and exclusion criteria were as follows:

Inclusion criteria: CF proven by a positive sweat test, genotyping and clinical symptoms; willing and able to participate in the study; scheduled for biennial routine chest-CT requested by the attending physician; ability to comply with instructions during MRI and CT examinations; informed consent.

Exclusion criteria: age younger than 6 years; chronic oxygen therapy; present or recent (within 2 weeks) pulmonary exacerbation defined as treatment with intravenous antibiotics (non-stable); history of lung transplantation; participation in other trials; contraindications for MRI; possible pregnancy.

On the same day, prior to MRI and CT, each patient underwent a clinical examination and spirometry (Masterscope, Jaeger-Care Fusion, Germany). Spirometry was performed according to the ATS/ERS guidelines [19]. Approval for this multicentre prospective cohort study was obtained from the institutional review boards of both participating centres.

MRI and CT protocol

The MRI protocol was performed on 1.5-T scanners (Avanto, Siemens, Erlangen, Germany) as follows: 16-channel torso surface coil; PROPELLER (BLADE@Siemens) proton density (PD)-weighted sequence with respiratory triggering (Navigator@Siemens) at end-expiration, acquired in the axial and coronal planes (TR/TE/alpha/TA: 2000/27 ms/150°/18 s, slice thickness Ø = 5 mm). Average time per acquisition was 7 to 10 min. The complete MRI protocol lasted on average 35 min (range 20-45 min).

The CT protocol was performed on 16- or 64-row scanners (Sensation, Siemens, Erlangen, Germany) as follows: volumetric end-inspiratory and end-expiratory optimised-dose scans, 100 kV for children <35 kg and 120 kV for children ≥35 kg, mAs 15-40, scan range lung apices to bases, slice collimation 1 mm, slice thickness (Ø) 1 mm, reconstruction increment 0.8 mm and pitch 1. For comparison with MRI, multiplanar reconstructions (axial, coronal) with a slice thickness of 5 mm and kernels B31f, B60f and B70f were obtained. Neither the MRI nor the CT protocol included the administration of a contrast agent.

Image analysis – CT and MRI

All MRI and CT images were anonymised and scored in random order by two independent radiologists (G.S. and P.C.), who were experienced in scoring and had 2 and 4 years of experience in thoracic CT and MR imaging, respectively. Both radiologists were blinded to all clinical information.

To score CTs, we used the CF-CT scoring system (CF-CT) [20]. The total CF-CT score is the sum of the following sub-scores: bronchiectasis, mucous plugging, peribronchial thickening, parenchymal score and trapped air. Each of the five sub-scores is rated according to presence and severity, in each of the six lobes, with the lingula as a separate lobe. All scores and sub-scores are expressed as a percentage of the maximum possible score, ranging from 0 (no pathological findings) to 100 (maximum severity).
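The normalisation described above can be illustrated with a minimal Python sketch. The function name, lobe labels and example ratings are hypothetical and serve only as illustration; the only values taken from this paper are the six lobes (lingula counted separately) and the bronchiectasis maximum of 12 per lobe, 72 in total.

```python
# Lobe labels are illustrative; the text only states that six lobes are scored,
# with the lingula counted as a separate lobe.
LOBES = ("RUL", "RML", "RLL", "LUL", "lingula", "LLL")


def percent_of_max(lobe_scores, max_per_lobe):
    """Return the summed per-lobe score as a percentage of the maximum possible score."""
    if set(lobe_scores) != set(LOBES):
        raise ValueError("expected exactly one score for each of the six lobes")
    if max_per_lobe <= 0:
        raise ValueError("max_per_lobe must be positive")
    for lobe, score in lobe_scores.items():
        if not 0 <= score <= max_per_lobe:
            raise ValueError(f"score for {lobe} is outside 0..{max_per_lobe}")
    max_total = max_per_lobe * len(LOBES)
    return 100.0 * sum(lobe_scores.values()) / max_total


# Hypothetical bronchiectasis ratings for one patient (maximum 12 per lobe, 72 in total).
example = {"RUL": 3, "RML": 2, "RLL": 0, "LUL": 4, "lingula": 1, "LLL": 3}
bronchiectasis_pct = percent_of_max(example, max_per_lobe=12)
print(f"Bronchiectasis sub-score: {bronchiectasis_pct:.1f} % of maximum")
```

Expressing every sub-score on the same 0 to 100 scale is what allows sub-scores with different raw maxima to be compared between CT and MRI.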

To score MRI images, we used a CF-MRI scoring system that is equivalent to the CF-CT scoring system and has been used in previous studies [13, 14]. To train both observers, MRI images of 16 patients excluded from the final analysis because of incomplete CT data sets (absence of end-expiratory scan) were used. Consensus meetings were held during and at the end of scoring all MRIs of the training set. To determine intra-observer agreement, observer 1 (P. C.) scored all CT and MRI images twice, separated by 2 months to avoid recall bias.

Statistical analysis

Inter- and intra-observer agreement

Intra-class correlation coefficients (ICCs) were used to evaluate inter- and intra-observer agreement for both MRI and CT scores. ICC values between 0.4 and 0.6, 0.6 and 0.8 or ≥ 0.8 are considered to indicate moderate, good and very good agreement, respectively [21].
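As an illustration of how such an agreement value can be obtained and interpreted, the sketch below computes one common ICC form and maps it to the categories above. The ICC model is an assumption: the text does not state which form was used, so the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1), is shown here. The function names, the example scores and the label for values below 0.4 are likewise hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` holds one row per patient and one column per observer (or reading).
    The choice of this ICC form is an assumption made for illustration.
    """
    rows = [[float(v) for v in r] for r in ratings]
    n = len(rows)
    k = len(rows[0]) if n else 0
    if n < 2 or k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need a rectangular table with >= 2 patients and >= 2 observers")
    grand = sum(sum(r) for r in rows) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(r[j] for r in rows) / n for j in range(k)]
    # Mean squares from the two-way ANOVA decomposition.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (rows[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)


def agreement_label(icc):
    """Map an ICC to the agreement categories quoted in the text."""
    if icc >= 0.8:
        return "very good"
    if icc >= 0.6:
        return "good"
    if icc >= 0.4:
        return "moderate"
    return "below moderate"  # no label is given in the text for ICC < 0.4 (assumption)


# Hypothetical sub-scores (% of maximum) from two observers for five patients.
scores = [[10, 12], [25, 20], [40, 45], [5, 8], [60, 55]]
value = icc_2_1(scores)
print(f"ICC = {value:.2f} ({agreement_label(value)})")
```

The absolute-agreement form penalises a systematic offset between observers, which seems appropriate when the scores themselves, and not only their ranking, are compared.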

To assess pairwise inter- and intra-observer agreement between MRI and CT scores, Bland-Altman and identity plots were created for: (1) MRIobs1 vs. MRIobs2; (2) CTobs1 vs. CTobs2 and (3) MRIobs1 vs. MRIobs1 and CTobs1 vs. CTobs1. To compare MRI scores with CT scores, we evaluated MRImean obs 1+2 vs. CTmean obs 1+2. All statistical analyses were performed using SPSS 16.0 (SPSS, Chicago, IL), and STATA 12.0 (STATA, College Station, TX).
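The quantities behind a Bland-Altman plot can be sketched in a few lines of Python. This is not the SPSS or STATA procedure used in the study; the function name and example scores are hypothetical, and the limits are drawn at the mean difference ±2 SD to match the caption of Fig. 1 (many texts use ±1.96 SD instead).

```python
from statistics import mean, stdev


def bland_altman(mri_scores, ct_scores, n_sd=2.0):
    """Return the points and summary lines of a Bland-Altman comparison.

    Differences are taken as MRI minus CT, as on the vertical axis of Fig. 1.
    """
    if len(mri_scores) != len(ct_scores):
        raise ValueError("MRI and CT score lists must have the same length")
    if len(mri_scores) < 2:
        raise ValueError("at least two paired scores are required")
    differences = [m - c for m, c in zip(mri_scores, ct_scores)]
    averages = [(m + c) / 2 for m, c in zip(mri_scores, ct_scores)]
    bias = mean(differences)   # mean difference (systematic offset)
    sd = stdev(differences)    # sample standard deviation of the differences
    return {
        "averages": averages,        # horizontal axis
        "differences": differences,  # vertical axis
        "bias": bias,
        "lower": bias - n_sd * sd,
        "upper": bias + n_sd * sd,
    }


# Hypothetical bronchiectasis sub-scores (% of maximum), mean of both observers.
mri = [10.0, 20.0, 30.0, 15.0, 22.0]
ct = [8.0, 18.0, 38.0, 12.0, 30.0]
result = bland_altman(mri, ct)
print(f"bias = {result['bias']:.2f}, limits = ({result['lower']:.2f}, {result['upper']:.2f})")
```

A trend in the differences across the horizontal axis, rather than the bias alone, is what reveals overestimation of mild and underestimation of severe disease.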

Sensitivity, specificity, PPV and NPV of MRI versus CT

The CF-CT bronchiectasis sub-score was used as the main indicator of CF lung disease severity in order to determine the sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of MRI versus CT. Bronchiectasis has been recognised as a well-validated and clinically relevant outcome measure in CF [2]. Using chest CT as the reference test, a bronchiectasis sub-score of 26 % or higher was taken to represent severe bronchiectasis, and an MRI-bronchiectasis sub-score at or above this cut-off in a patient with severe bronchiectasis on CT was defined as a true-positive finding; 26 % represents the percentage of the maximum possible score for bronchiectasis, which is 12 per lobe and 72 in total according to CF-CT. This cut-off of 26 % was based on the median value of bronchiectasis severity observed in two cohort studies, with either mild or severe CF lung disease [3, 22], using receiver-operating characteristic analysis to determine different cut-off values. Sensitivity, specificity, PPV and NPV were calculated for each observer (obs1 and obs2), in addition to the mean of the bronchiectasis scores of both observers (mean obs1+2).
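A minimal Python sketch of this calculation is given below, with CT as the reference test and the 26 % cut-off applied to both modalities. The function name and the example scores are hypothetical and are not study data.

```python
def diagnostic_metrics(mri_scores, ct_scores, cutoff=26.0):
    """Sensitivity, specificity, PPV and NPV of MRI with CT as the reference test.

    A score at or above `cutoff` (% of maximum) counts as severe bronchiectasis.
    """
    if len(mri_scores) != len(ct_scores):
        raise ValueError("MRI and CT score lists must have the same length")
    tp = fp = fn = tn = 0
    for mri, ct in zip(mri_scores, ct_scores):
        mri_severe = mri >= cutoff
        ct_severe = ct >= cutoff
        if mri_severe and ct_severe:
            tp += 1
        elif mri_severe and not ct_severe:
            fp += 1
        elif not mri_severe and ct_severe:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical bronchiectasis sub-scores (% of maximum) for six patients.
ct = [30, 40, 10, 5, 28, 0]
mri = [27, 20, 8, 30, 10, 0]
metrics = diagnostic_metrics(mri, ct)
print({name: metrics[name] for name in ("sensitivity", "specificity", "ppv", "npv")})
```

On the raw scale, the 26 % cut-off corresponds to 0.26 × 72 ≈ 18.7 of the 72 possible bronchiectasis points.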

Descriptive statistics were used to characterise the patients at the time of their MRI and CTs. All scoring data are numerical with results expressed as median (range) and scores expressed as percentage of the maximum possible score.

Results

In this study, we enrolled 54 patients. Sixteen were excluded from the analysis because their end-expiratory CT scans were missing. MRIs of these 16 patients were used for training the observers. Hence, 38 patients with stable CF (median 21 years, range 6-51 years, 22 female) were eligible for the comparison of CTs and MRIs. Baseline characteristics are shown in Table 1. The bronchiectasis, trapped air and total CF scores for MRI and CT, expressed as first and third quartiles, are shown in Table 2.

Table 1 Baseline patient characteristics
Table 2 Prevalence of total MRI, total CT, bronchiectasis MRI, bronchiectasis CT, trapped air MRI and trapped air CT scores expressed in quartiles

Inter- and intra-observer agreement for MRI versus CT

Inter- and intra-observer agreement, expressed as ICCs, for MRIobs1 vs. MRIobs2, CTobs1 vs. CTobs2, MRImean obs 1+2 vs. CTmean obs 1+2, CTobs1 vs. CTobs1 and MRIobs1 vs. MRIobs1 are summarised in Table 3. In summary, the inter- and intra-observer agreement for MRI was good or very good, although lower than for CT. The ICCs for CF-CT sub-scores were higher than those for CF-MRI sub-scores, especially for peribronchial thickening and trapped air. Trapped air measurement by MRI did not match that by CT, as shown by the ICC values: CT-trapped air (0.87); MRI-trapped air (0.51) and MRI/CT-trapped air (0.46). Finally, the intra-observer agreement for observer 1, who scored all images twice, was overall very good, but lower for MRI than for CT. Bland-Altman and identity plots showed that PROPELLER MRI, for almost all sub-scores, tends to overestimate mild CF cases and underestimate severe CF cases (Figs. 1 and 2).

Table 3 ICC values of inter- and intra-observer agreement between observers for all sub-scores, CF-MRI total score and CF-CT total score
Fig. 1

Bland-Altman plot of MRImean obs 1+2 vs. CTmean obs 1+2. All scoring data are numerical and scores are expressed as percentage of the maximum possible score. Horizontal axis: average bronchiectasis sub-score [(MRImean + CTmean)/2]; vertical axis: difference in the bronchiectasis sub-score (MRImean - CTmean). Red and green lines represent the mean and ±2 standard deviations (SD), respectively

Fig. 2

Identity plot of MRI-bronchiectasis versus CT-bronchiectasis sub-scores (mean score of observer 1 + observer 2). Green line: identity line (Y = X). Note that MRI tends to overestimate mild CF cases and underestimate severe CF cases (red line)

Sensitivity, Specificity, PPV and NPV

Sensitivity of MRI to detect severe bronchiectasis was low. Using the cut-off value of 26 %, MRI identified 6 (16 %) and CT 14 (38 %) patients with severe bronchiectasis. Sensitivity was low for both observers (obs1 = 25 %; obs2 = 50 %) and also when the mean of both observers was used for analysis (mean obs1+2 = 33 %). Specificity of MRI to detect severe bronchiectasis was very high: obs1 = 95 %; obs2 = 100 %; mean obs1+2 = 100 %. Similarly, the PPV of MRI was high, while the NPV was low, using either observer's scores or the mean of both. Sensitivity, specificity, PPV and NPV values for each observer and for the mean of both observers are shown in Table 4.

Table 4 Differentiation of mild versus severe CF lung disease using chest MRI

Discussion

In this prospective two-centre study, we compared a motion-insensitive MRI sequence, PROPELLER, to CT to assess lung disease in a group of stable CF patients. This study showed that PROPELLER MRI is not as good as CT for assessing CF lung disease. Moreover, our study confirmed the low sensitivity of MRI compared to CT for detecting trapped air.

First, we showed that although MRI had good or very good inter- and intra-observer agreement, this was lower than for CT, indicating that it was more difficult for the observers to score MRIs than CTs. In previous studies, the inter-observer agreement for MRI and CT scores ranged from moderate to very good, while intra-observer agreement was not assessed [13, 14]. In our study, the intra-observer agreement was very good, suggesting that the technique is robust, although less sensitive than CT.

Second, we confirmed that trapped air could not be reliably visualised with MRI, as previously demonstrated by Failo et al. and more recently by Rajaram et al. [14, 15]. Trapped air is an early change associated with small airways disease [3, 23]. Unfortunately our MRI protocol was not sensitive enough to differentiate trapped air from surrounding normal lung parenchyma. However, we believe that innovative sequences can be added to the MR examination that might be more sensitive to detect trapped air, such as the recently developed Fourier decomposition [24].

Third, in our study with PROPELLER, MRI tended to overestimate mild CF lung disease and underestimate severe CF lung disease compared to CT. CT appears to have superior sensitivity especially for changes in the periphery of the lung, where small abnormalities, such as bronchiolectasis and mucus plugs can be easily missed by MRI (Fig. 3). To overcome this problem, it is important to further increase the spatial resolution of our routine MRI protocol. This is likely possible with new techniques, such as 3D T1 GRE sequences with radial acquisition in free breathing or in breath-hold condition and highly accelerated parallel imaging [25].

Fig. 3

Example of underestimation by chest MRI relative to chest CT. On the left, the CT image during inspiration and on the right, the corresponding PROPELLER MRI image. Note that the peripheral bronchiectasis depicted by CT in the right upper lobe is barely visible in the corresponding MRI image (white oval). Similarly, the small areas of tree-in-bud in the left lower lobe are blurred and therefore less visible in the corresponding MRI image (white dotted oval)

Finally, we found a low sensitivity and NPV for MRI to detect severe bronchiectasis compared to CT. This finding partially matches the study of Puderbach et al., where the lowest level of concordance between MRI and CT was for the severity of bronchiectasis [13]. Similarly, Failo et al. demonstrated the lower spatial resolution of MRI relative to CT, especially at the periphery of the lung [14]. Conversely, we found a high specificity and PPV for MRI, which indicates that when a structural abnormality is observed on MRI, it is also visible on CT. Thus, such an abnormality can be followed up by MRI, avoiding CT. The high specificity, together with the very good intra- and inter-observer agreement of PROPELLER, might be highly advantageous for short-term follow-up of CF patients with pulmonary exacerbation. The use of PROPELLER could allow the efficacy of antibiotic therapy to be assessed without exposing the patient to ionising radiation. Moreover, PROPELLER might even be suitable for very young patients who are not capable of breath-hold acquisition or for sick CF patients with reduced compliance.

There are some limitations that might have reduced the agreement between MRI and CT in our study. The main limitation is that we evaluated a single sequence, which may have influenced the sensitivity of MRI to detect CF lung disease. We used a single-sequence study design, as in prior publications [13, 14], because we wanted to assess whether the motion correction technique of PROPELLER would improve image quality in chest MRI by reducing the impact of respiratory movements [26]. However, a potential limitation of PROPELLER is that this sequence can produce streak artefacts related to the k-space reconstruction [16, 17]. To reduce this artefact, we increased the blade width and k-space coverage [27]. Despite this adaptation, streak artefacts were still observed in about 20 % of patients (Fig. 4). Hence, the occurrence of streak artefacts might have reduced the sensitivity of MRI in our study.

Fig. 4

Streak artefacts of the PROPELLER sequence. On the left, the CT image during inspiration and on the right, the corresponding MRI image acquired using the PROPELLER sequence. Note that streak artefacts are visible not only inside the parenchyma (arrowheads), but also outside (white arrows)

The second limitation, which might have affected our MRI-CT comparison, is that the quality of the MRI images was partially impaired by motion artefacts related to pulsation of the heart and great vessels (Fig. 5). We did not attempt to reduce these artefacts by using cardiac gating, since this would have added 5-10 min of scan time to an already lengthy MRI protocol of 35 min on average (range 20-45 min). The PROPELLER sequence was deliberately applied with the NAVIGATOR technique, which reduced problems related to patient cooperation, especially in younger patients, since the acquisition was automatically triggered at end-expiration independently of the respiratory pattern. A disadvantage of the NAVIGATOR technique was that it prolonged the acquisition time in children with irregular breathing.

Fig. 5

Motion artefacts of MRI PROPELLER (BLADE@Siemens). Note the motion artefacts around the heart (white arrows)

The third limitation of our study was that images were acquired under different breathing conditions for the CT and MRI protocols. As discussed, MRI images were acquired near functional residual capacity (FRC) using free-breathing acquisition. For CT, bronchiectasis was assessed after a deep inspiration at a lung volume close to total lung capacity, while trapped air was assessed at a lung volume close to residual volume during a breath-hold manoeuvre at end-expiration. However, it has been shown that CTs acquired near FRC overestimate trapped air relative to scans acquired near residual volume [28]. Therefore, our MRI protocol should have shown larger amounts of trapped air compared to CT, but this was not the case.

Finally, the reduced sensitivity of chest MRI relative to CT might be related to the use of our CF-MRI semi-quantitative scoring system. We decided to use a similar scoring strategy for MRI as for CT to allow a fair comparison between sub-scores. The CF-MRI scoring system was used successfully by our group in previous studies [13, 14]. As shown by the good intra- and inter-observer agreement, observers can be trained to apply both the CF-CT and CF-MRI scoring systems reliably. It is possible that the development of more sophisticated scoring methods for MRI might improve the sensitivity of MRI.

In conclusion, this comparison showed that motion-corrected MR image acquisition with PROPELLER does not show good sensitivity for assessing CF lung disease, and it is still not able to assess the severity of CF lung disease with the same precision as CT. However, the good or very good inter- and intra-observer agreement and the high specificity suggest that MRI can be applied in the follow-up of CF lung disease. MRI may be used for short-term follow-up of major lung abnormalities, such as evaluating the efficacy of intravenous treatment for a pulmonary exacerbation, while avoiding ionising radiation, which is an inherent limitation of CT. In the future, further protocol optimisation and alternative, more sensitive image analysis techniques will likely allow chest MRI to play a more important role in the follow-up of CF lung disease.