Introduction

Dual-energy CT (DECT) enables reconstruction of iodine-specific images, which have emerged as an important qualitative and quantitative imaging biomarker. These images are provided by all DECT manufacturers and are generated by means of material decomposition [1]. Iodine-specific images are considered to represent iodine uptake within tissues more accurately than attenuation measurements in Hounsfield units (HU) derived from conventional gray-scale CT images. They provide a “snapshot perfusion” of a lesion or tissue of interest at the time of image acquisition, cementing their role as a surrogate marker for tissue/tumor vascularity [2, 3]. This feature enables an array of applications, ranging from lesion detection and characterization [4,5,6] to assessing treatment response in oncology [7,8,9]. Intra-scanner variability of iodine quantification in patients with repeated scans on the same DECT scanner has been previously described [10, 11]. Data on inter-scanner variability in patients examined on different DECT scanner types are limited [12], although this is a common scenario in imaging practices with DECT platforms from different manufacturers and in patients undergoing follow-up at different institutions throughout the course of their disease.

Iodine quantification can be expected to be influenced by scanner-independent and scanner-dependent factors. Scanner-independent factors may be physiological changes in the cardiopulmonary status of the patient at the time of image acquisition, differences in contrast media phase due to variations in intravenous administration or the error margin of bolus tracking, and pathological processes taking place between examinations, potentially affecting organ perfusion. Scanner-dependent factors may be related to hardware (i.e., different technical concepts of DECT) [1], and differences in material decomposition algorithms.

For scanner-independent factors, protocol optimization and homogenization between different scanner types may help in reducing variability of quantitative CT. Moreover, normalization of iodine concentrations has previously been suggested as a means to mitigate patient-dependent, physiological variability of parenchymal perfusion in the setting of repeated examinations on the same DECT scanner type [13, 14].

The scanner-dependent differences of iodine quantification between DECT scanner types have been extensively investigated, yet these investigations are mostly limited to phantom experiments [15,16,17,18], which assessed the technical accuracy of iodine quantification. However, intra-patient, inter-scanner variability of quantitative DECT in clinical routine is likely influenced by additional factors, such as the above-mentioned scanner-independent ones, and may be of higher magnitude than in phantom-based tests. Therefore, assessing intra-patient variability of iodine quantification in a multi-vendor clinical setting and determining the potential utility of normalization are of clinical relevance, particularly in light of the growing evidence suggesting DECT-derived iodine quantification as a longitudinal oncologic response parameter [7, 9, 19]. The purpose of our study was therefore to investigate intra-patient variability of iodine quantification at longitudinal follow-up on different DECT scanner types and to analyze different normalization approaches for increasing consistency of iodine measurements.

Materials and methods

Patients

Our institutional review board approved this retrospective study and waived the requirement for patient informed consent. The study is in compliance with the Health Insurance Portability and Accountability Act. To identify patients who had undergone portal venous phase abdominal DECT on a third-generation dsDECT (Somatom Force, Siemens), a second-generation rsDECT (Revolution CT, GE), and a dlDECT (IQon, Philips), a database search within our dose monitoring system (Radimetrics, Bayer) was conducted. The dates of CT scans of each vendor eligible for inclusion followed the dates of installation of each type of scanner at our institution: 01/2016–09/2020 for dsDECT, 08/2017–09/2020 for rsDECT, and 07/2018–09/2020 for dlDECT.

In patients with multiple CTs from the same vendor, we included the CT with the shortest absolute time interval to the CTs from the other vendors, such that the sum of absolute time intervals between the three included examinations was minimal for each patient. In our initial query, 55 consecutive patients matching these criteria were identified, of whom 11 were excluded. Consequently, 44 patients formed the final study population.
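The selection rule described above can be sketched as follows. This is a minimal illustration and not the query used in the study: the function name `pick_triplet`, the data layout, and the dates are hypothetical, and the sum is assumed to run over the three pairwise intervals.

```python
from datetime import date
from itertools import combinations, product


def pick_triplet(exams_by_scanner):
    """Pick one exam per scanner type so that the summed absolute
    pairwise time intervals (in days) are minimal.

    exams_by_scanner: dict mapping scanner label -> list of exam dates.
    Assumption: the "sum of absolute time intervals" is taken over all
    three scanner pairs; the source does not spell this out.
    """
    scanners = sorted(exams_by_scanner)
    best, best_cost = None, None
    # Try every combination of one exam per scanner type.
    for combo in product(*(exams_by_scanner[s] for s in scanners)):
        cost = sum(abs((a - b).days) for a, b in combinations(combo, 2))
        if best_cost is None or cost < best_cost:
            best, best_cost = dict(zip(scanners, combo)), cost
    return best, best_cost


# Hypothetical patient: two dsDECT exams, one exam on each other scanner.
exams = {
    "dsDECT": [date(2018, 9, 1), date(2019, 6, 1)],
    "rsDECT": [date(2019, 5, 1)],
    "dlDECT": [date(2019, 8, 1)],
}
selected, total_days = pick_triplet(exams)
print(selected["dsDECT"], total_days)  # 2019-06-01 184
```

An exhaustive search is sufficient here because each patient has only a handful of examinations per scanner type.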

DECT technique

All included patients underwent one abdominal DECT each on dsDECT, rsDECT, and dlDECT in the portal venous phase using iodinated contrast media (Isovue 370 mg/mL, Bracco Diagnostics). At our institution, image acquisition protocols are homogenized as far as possible between the three scanner types to reduce protocol-related variability: DECT examinations were performed with supine patient positioning, and portal venous phase acquisitions were initiated using bolus tracking with a scan delay of 50 s after an attenuation threshold of 150 HU was reached in the thoracic aorta at the level of the diaphragm. Average amounts of contrast media following body weight–adapted application were similar for each scanner type (dsDECT 91.2 ± 14.1 mL; rsDECT 94.5 ± 15.0 mL; dlDECT 91.5 ± 17.7 mL). Table 1 lists scanning and reconstruction parameters for all three DECT scanner types included in this study.

Table 1 Detailed parameters of image acquisition and reconstruction for each dual-energy CT scanner type (dsDECT: dual-source dual-energy CT; rsDECT: rapid kV switching dual-energy CT; dlDECT: dual-layer detector dual-energy CT)

Iodine quantification

Region of interest (ROI)–based measurements of iodine concentration (IC) were obtained by a fellowship-trained radiologist with 12 years of experience in abdominal imaging using vendor-specific proprietary software solutions (dsDECT: Syngo via, Siemens; rsDECT: AW server, GE Healthcare; dlDECT: IntelliSpace Portal, Philips). The liver, kidneys, and pancreas were included as target organs; erector spinae back muscle, retroperitoneal fat, and gallbladder as reference tissues; and aorta and portal vein as blood pool measurements for normalization. Supplementary material includes a detailed description of ROI placement.

Inter-scanner analysis: iodine concentration

The mean IC for each of the above-mentioned tissues (liver, gallbladder, portal vein, aorta, pancreas, kidneys, retroperitoneal fat, muscle) was calculated as

$${\text{IC}}= \sum\nolimits_{i=1}^{n}\frac{{x}_{i}}{n},$$
(1)

where x_i is the iodine concentration measured in the i-th ROI within one tissue and n is the number of tissue-specific ROIs.

Moreover, different normalized ICs were calculated by dividing ICs of parenchymal organs (liver, pancreas, kidneys) by the corresponding ICs of the abdominal aorta (NICAA), portal vein (NICPV), and the sum of the vascular and parenchymal IC of the abdominal aorta, portal vein, liver, and pancreas (NICALL), with the latter being conceived to serve as a surrogate for the overall iodine load at the time of image acquisition:

$${{\text{NIC}}}_{{\text{AA}}}= \frac{{{\text{IC}}}_{ ({\text{organ}})}}{{{\text{IC}}}_{ ({\text{aorta}})}},$$
(2)
$${{\text{NIC}}}_{{\text{PV}}}= \frac{{{\text{IC}}}_{ ({\text{organ}})}}{{{\text{IC}}}_{ (\mathrm{portal\; vein})}},$$
(3)
$${{\text{NIC}}}_{{\text{ALL}}}= \frac{{{\text{IC}}}_{ ({\text{organ}})}}{{{\text{IC}}}_{ ({\text{aorta}}+\mathrm{portal\; vein}+{\text{liver}}+{\text{kidneys}}+{\text{pancreas}})}}$$
(4)

IC, NICAA, NICPV, and NICALL were compared intra-individually between the three examinations undergone by a patient on dsDECT, rsDECT, and dlDECT.
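Equations 1 to 4 can be sketched in a few lines of code. This is a minimal illustration with invented values; the function names and the dictionary layout are hypothetical. Because Eq. 4 lists the kidneys in the denominator while the accompanying text names the aorta, portal vein, liver, and pancreas, the set of reference tissues is left as a parameter here, and the example follows Eq. 4 as written.

```python
from statistics import mean


def mean_ic(roi_values):
    """Eq. 1: average the iodine concentrations (mg/mL) of all ROIs in one tissue."""
    return mean(roi_values)


def normalized_ics(ic, organ, reference_tissues):
    """Eqs. 2-4 for one organ of one examination.

    ic: dict mapping tissue name -> mean IC in mg/mL.
    reference_tissues: tissues summed in the NIC_ALL denominator.
    """
    overall_load = sum(ic[t] for t in reference_tissues)
    return {
        "NIC_AA": ic[organ] / ic["aorta"],
        "NIC_PV": ic[organ] / ic["portal_vein"],
        "NIC_ALL": ic[organ] / overall_load,
    }


# Hypothetical single examination; all values are invented for illustration.
ic = {
    "liver": mean_ic([2.0, 2.2, 2.4]),
    "pancreas": 2.8,
    "kidneys": 5.8,
    "aorta": 5.6,
    "portal_vein": 6.0,
}
# Denominator as written in Eq. 4.
EQ4_REFERENCES = ("aorta", "portal_vein", "liver", "kidneys", "pancreas")
liver_nic = normalized_ics(ic, "liver", EQ4_REFERENCES)
```

All three normalized values are dimensionless ratios, so they can be compared across examinations without reference to absolute concentration units.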

Inter-scanner analysis: variability of iodine concentration

To assess inter-scanner variability of IC and NIC, variation between two scans executed at different scanners was calculated as suggested before [20]:

$$\text{Inter-scanner variation}= \left|\frac{{x}_{1}-{x}_{2}}{\left({x}_{1}+{x}_{2}\right)\times 0.5}\right|\times 100,$$
(5)

where x1 and x2 are the values obtained from the examinations that a patient underwent on two different scanner types. Inter-scanner variation was calculated on an intra-individual level for each patient, tissue, and scanner pair, and the median tissue-specific inter-scanner variation was then derived from those individual values.
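As a worked illustration of Eq. 5, the following sketch applies the formula to the cohort mean liver ICs reported in the Results. The function name is hypothetical. Because the study computes the variation per patient and reports the median of those values, the number obtained from group means is not expected to match the reported medians.

```python
def inter_scanner_variation(x1, x2):
    """Eq. 5: absolute difference relative to the mean of both values, in percent."""
    return abs((x1 - x2) / ((x1 + x2) * 0.5)) * 100


# Cohort mean liver IC: dsDECT 1.81 mg/mL, rsDECT 2.59 mg/mL.
liver_ds_vs_rs = inter_scanner_variation(1.81, 2.59)
print(round(liver_ds_vs_rs, 2))  # 35.45
```

The measure is symmetric in x1 and x2, so the order of the two scanners does not matter. It is undefined when x1 + x2 equals zero.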

Statistical analyses

Statistical analysis was performed using software (JMP V14 and SAS V 9.4, both SAS Institute). The paired, non-parametric Wilcoxon signed rank test with Bonferroni correction was used for pairwise comparisons of IC and NICAA, NICPV, and NICALL among dsDECT, rsDECT, and dlDECT, with the null hypothesis being that the median of the population of differences between the scanner pairs for the respective measures is zero [21]. Two-tailed p values lower than 0.05 were considered statistically significant.
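The Bonferroni step can be illustrated with a short sketch. The raw p values below are invented, and the family size of three (one test per scanner pair) is an assumption, since the text does not state how many comparisons were pooled. The study itself used JMP and SAS; in Python, raw p values for paired samples could be obtained with, for example, `scipy.stats.wilcoxon`.

```python
def bonferroni(p_values):
    """Multiply each raw p value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return {name: min(1.0, p * m) for name, p in p_values.items()}


# Hypothetical raw p values for the three scanner pairs of one tissue.
raw = {
    "dsDECT vs rsDECT": 0.004,
    "dsDECT vs dlDECT": 0.030,
    "rsDECT vs dlDECT": 0.410,
}
adjusted = bonferroni(raw)
significant = {name: p < 0.05 for name, p in adjusted.items()}
```

With this kind of adjustment, any raw p value above 1/3 is reported as 1.00, which would be consistent with the adjusted p values of 1.00 in the Results.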

A longitudinal mixed effects analysis was conducted for the normalized iodine concentration that yielded the greatest reduction in variability (i.e., NICALL) using scanner type, contrast media amount, and the time between examinations, as well as the order in which patients underwent scans on different scanner types as fixed effects, and the number of patients with specific orders of scanning as a random effect (patients nested within orders). Quadratic effects of time and the interaction of scanner type × order were pretested and removed if not significant. The purpose of the mixed effects analysis was two-fold: first, to analyze the effect of the scanner type on the iodine concentration after internal normalization, for which we assumed a reduced perfusion-related component of iodine variability; second, to address the inherent limitation of the intra-patient, retrospective study design, which introduces sources of variability other than the scanner type, such as time-dependent changes in organ perfusion or the contrast media amounts applied.
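A model of this kind could be written as follows. This is only an illustrative sketch: the symbols and the exact parameterization are assumptions and are not taken from the JMP/SAS specification used in the study.

$$\text{NIC}_{\text{ALL},ijk} = \beta_0 + \alpha_j + \gamma_k + \beta_1\,\text{CM}_{ij} + \beta_2\,t_{ij} + u_{i(k)} + \varepsilon_{ijk}$$

Here, i indexes patients, j scanner types, and k scanning orders; α_j and γ_k denote the fixed effects of scanner type and scanning order, CM_ij the contrast media amount, t_ij the time between examinations, u_i(k) a random intercept for patient i nested within order k, and ε_ijk the residual error. In this notation, the pretested terms would correspond to an additional quadratic term in t_ij and a scanner type × order interaction.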

Continuous variables are reported as mean and standard deviations. Inter-scanner variability of IC is reported as median percentage and inter-quartile range.
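The reporting format for inter-scanner variability can be reproduced with the Python standard library, as in the sketch below. The input values are invented, the function name is hypothetical, and the quartile definition ("inclusive" interpolation) is an assumption, since the text does not specify which method the statistics software applied.

```python
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, Q1, Q3); quartiles use the 'inclusive' method."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


# Hypothetical per-patient inter-scanner variations (%) for one tissue.
variations = [4.0, 8.0, 12.0, 16.0, 20.0, 24.0, 28.0, 32.0, 36.0]
med, q1, q3 = median_iqr(variations)
print(f"{med:.2f} ({q1:.2f}–{q3:.2f}) %")  # 20.00 (12.00–28.00) %
```

Different quartile conventions can shift Q1 and Q3 slightly in small samples, so the method should be kept fixed when comparing results across tools.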

Results

Patients

A total of 18 women and 26 men (age 59.5 ± 14.1 years) were included. The mean inter-scan intervals were −49.2 ± 270.3 days between dsDECT and dlDECT, 23.4 ± 227.7 days between dlDECT and rsDECT, and −25.9 ± 318.9 days between dsDECT and rsDECT (a negative sign indicating that the scan on the first-named scanner type was performed earlier). Table 2 provides a summary of patient characteristics.

Table 2 Information on patients and examinations

Inter-scanner analysis: iodine concentration

In the intra-patient, inter-scanner comparison, IC was comparable between all three scanner types in the kidneys (dsDECT 5.73 ± 1.12 mg/mL, rsDECT 5.96 ± 1.12 mg/mL, dlDECT 5.66 ± 1.27 mg/mL; p = 1.00), the aorta (dsDECT 5.84 ± 1.71 mg/mL, rsDECT 5.61 ± 1.15 mg/mL, dlDECT 5.63 ± 1.67 mg/mL; p = 1.00), and the portal vein (dsDECT 5.69 ± 0.93 mg/mL, rsDECT 6.03 ± 1.02 mg/mL, dlDECT 6.04 ± 1.06 mg/mL; p-value range 0.81–1.00).

Significant differences in IC in pairwise comparisons among the three scanner types were found in the liver (dsDECT 1.81 ± 0.62 mg/mL, rsDECT 2.59 ± 0.45 mg/mL, dlDECT 2.17 ± 0.54 mg/mL, p < 0.05 for all pairwise comparisons). In the pancreas, mean IC of dsDECT and rsDECT was comparable (2.84 ± 0.66 mg/mL and 2.84 ± 0.58 mg/mL, p = 1.00), whereas the IC of dlDECT was significantly lower (2.50 ± 0.76 mg/mL; p < 0.05).

In retroperitoneal fat, rsDECT showed the lowest iodine values with significant differences compared to each of the other scanners (dsDECT 0.08 ± 0.4 mg/mL, rsDECT −1.03 ± 0.20 mg/mL, dlDECT 0.15 ± 0.70 mg/mL, p < 0.05). In contrast, in the gallbladder and muscle, rsDECT showed the highest IC of all three scanners (0.44 ± 0.37 mg/mL and 0.62 ± 0.21 mg/mL, respectively). Table 3 lists mean iodine concentrations in all included organs and tissues.

Table 3 Mean iodine concentrations and standard deviations (mg/mL) in included organs, vessels, and reference tissues with low (muscle) and no contrast enhancement (gallbladder, retroperitoneal fat). One asterisk (*) indicates significant differences compared to one other DECT scanner, while two asterisks (**) indicate significant differences compared to two other DECT scanners as per the paired Wilcoxon signed rank test

In the kidneys, normalized iodine concentrations did not show any significant differences between scanner types, similar to the non-normalized IC. For the liver and the pancreas, different normalization approaches partly altered significance levels of individual inter-scanner comparisons, yet for both organs and all normalization approaches, significant differences between certain inter-scanner combinations persisted. Table 4 summarizes inter-scanner differences of IC and the normalized IC NICPV, NICAA, and NICALL.

Table 4 Absolute and normalized iodine concentrations and standard deviations in the liver, pancreas, and kidneys. One asterisk (*) indicates significant differences compared to one other DECT scanner, while two asterisks (**) indicate significant differences compared to two other DECT scanners as per the paired Wilcoxon signed rank test

Inter-scanner analysis: variability of iodine concentration

The median variability of non-normalized IC was highest between scans conducted on different DECT scanner types in the liver (dsDECT vs. dlDECT 28.96 (14.28–46.87) %, dsDECT vs. rsDECT 29.08 (16.59–62.55) %, rsDECT vs. dlDECT 22.85 (7.52–33.49) %), lowest for the kidneys (dsDECT vs. dlDECT 15.76 (7.03–26.1) %, dsDECT vs. rsDECT 15.67 (8.86–25.56) %, rsDECT vs. dlDECT 10.92 (4.92–22.79) %), and intermediate for the pancreas (dsDECT vs. dlDECT 22.24 (7.06–37.93) %, dsDECT vs. rsDECT 19.86 (10.9–27.14) %, rsDECT vs. dlDECT 13.66 (7.67–30.72) %).

In the liver, none of the normalization approaches markedly decreased inter-scanner variability between dsDECT vs. dlDECT and dsDECT vs. rsDECT, respectively. In contrast, between rsDECT and dlDECT, NICPV (17.61 (7.29–32.33) %), NICAA (18.09 (7.9–36.73) %), and NICALL (11.03 (4.88–26.75) %) decreased inter-scanner variability compared to IC (22.85 (7.52–33.49) %).

In the pancreas as well as the kidneys, NICALL yielded the most pronounced decrease in inter-scanner variability in all inter-scanner comparisons, whereas NICPV and NICAA only resulted in small reductions of inter-scanner variability and even led to higher variability in some inter-scanner comparisons. Table 5 lists the inter-scanner variability found in the liver, pancreas, and kidneys for all inter-scanner comparisons and normalization approaches. Figure 1 depicts an example case of a patient who underwent DECT on dlDECT, rsDECT, and dsDECT. Figure 2 summarizes the results regarding inter-scanner variability of absolute and normalized IC between different scanner pairs.

Table 5 Inter-scanner variability of original IC and different normalized iodine concentrations. Measures are reported as median and inter-quartile range

Fig. 1 Iodine-specific images from repeated abdominal dual-energy CT examinations of a 68-year-old man on three different scanners (i.e., dual-layer detector DECT (dlDECT), rapid kV switching DECT (rsDECT), dual-source DECT (dsDECT)), with exemplary ROI measurements. On the right, average iodine concentrations for the liver, kidneys, and pancreas are shown for each scanner type. After normalization to the overall iodine load (NICALL), the range over which iodine concentrations vary between the scanner types is diminished, yet the highest magnitude of variability remains in the liver

Fig. 2 Box-and-whisker plots of proportional inter-scanner variations for different combinations of dual-energy CT scanners. Normalization to the overall iodine load (NICALL), represented by the sum of vascular and organ iodine concentrations, reduced inter-scanner variability between all three scanner types in the pancreas and kidneys, whereas in the liver, median inter-scanner variability between dual-source DECT (dsDECT) and the other two scanner types remained at a level similar to that of the original, non-normalized iodine concentrations

Longitudinal mixed effects analysis

The longitudinal mixed effects analysis revealed that the scanner type significantly influenced NICALL in the liver (F-value = 23.78, df 2, 84, p < 0.0001) and pancreas (F-value = 12.26, df 2, 84, p < 0.0001), whereas it did not have an effect on NICALL in the kidney (F-value = 0.56, df 2, 84, p = 0.57). In the liver, Tukey post hoc tests showed that all pairwise differences were significant, whereas in the pancreas, significant differences were found between rsDECT and dsDECT vs. dlDECT, but not between rsDECT and dsDECT. The contrast media amount, the order in which patients underwent DECT on different scanner types, and the inter-scan interval were statistically adjusted for in all these analyses but did not have significant effects on NICALL (p ≥ 0.05).

Discussion

This study assessed intra-patient iodine variability between three commercially available DECT platforms. We found different magnitudes of variability in iodine concentration within the same patient across different DECT scanner types, depending on the observed tissues and inter-scanner comparisons, with the highest variability found in the liver. In contrast, there was high agreement between the different scanner types for the iodine measurements within the renal cortex. Normalization mitigated the variability for most tissues and inter-scanner comparisons, yet variability in the liver remained high, particularly between dsDECT and the other two scanner types. The longitudinal mixed effects model identified the scanner type as a significant determinant of NIC in the liver and the pancreas. One inherent limitation of commonly used normalization approaches using reference measurements in the portal vein or aorta is that the iodine load within those vessels is highly variable and dependent on factors such as cardiac output and contrast media application. To mitigate this, we explored a normalization taking into account multiple reference measurements both in organs and vessels. We found that this normalization approach (NICALL) yielded the best overall reduction of iodine variability, which indicates that normalizing the target iodine concentration to multiple reference points as a surrogate for the overall iodine load in the body might be more robust and preferable in clinical practice compared to normalizing against one particular vessel.

The higher variability in liver parenchymal iodine concentration is expected due to the dual blood supply of the liver and potentially higher physiologic variability in perfusion between repeated examinations. We found that none of the normalization approaches that reduced iodine variability in other organs markedly reduced variability between dsDECT and the other two scanner types, whereas they did so for rsDECT vs. dlDECT. In concordance with this result, the mixed effects analysis revealed that even after normalization, scanner type still had a significant effect on the iodine concentration within the liver. One factor leading to lower variability of iodine concentration in the kidneys may be the highly autoregulated perfusion [22]. Moreover, the accuracy of iodine quantification is dependent on the absolute iodine concentration itself, with a tendency towards lower observed errors at higher iodine concentrations [15], which might have reduced the observed variability in the kidneys in our analyses. This might also explain why the divergence between rsDECT and dlDECT vs. dsDECT that was found in the liver was not present in the kidneys. Another underlying reason for this result could be differences in material decomposition between the corresponding scanner types that may become apparent in the liver as an organ that may more frequently deviate from the paradigms of two-material decomposition, which are used in rsDECT and dlDECT [23]: contrast-enhanced renal cortex primarily consists of water/soft tissue and iodine, whereas in the liver, an additional fatty component due to organ degeneration could cause the lower agreement between rsDECT and dlDECT vs. dsDECT.

One important factor that may contribute to the observed variability is the difference in radiation dose and image noise between the three scanner types. The retrospective nature of our investigation on clinical data did not allow prospective alignment of acquisition parameters to attain equivalent radiation dose exposures, as has been done in previous phantom studies [15]. Instead, the clinical protocols were set up to create a similar image impression and thereby ensure comparable scan quality throughout clinical operations. A previous study reported that the effect of radiation dose on iodine quantification is particularly relevant at very low radiation dose levels, which were not attained in subjects included in our study [24]. However, we acknowledge that a certain contribution of radiation dose differences to the variability observed, particularly in the liver, can be assumed.

One important implication of our results is that the magnitude of iodine variability in the liver is within the range that has previously been described for determining response in liver tumors. For instance, Dai et al. described a mean decrease in iodine concentration of 19.1% in hepatocellular carcinoma patients with disease control (i.e., either partial response or stable disease) [25]. Therefore, based on our initial results, the expected hepatic inter-scanner variability could hamper the accurate assessment of such marginal, yet potentially clinically impactful changes in lesional iodine content.
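To make this comparison concrete, a relative decrease can be expressed on the scale of Eq. 5, which divides the difference by the mean of both values rather than by the baseline. The sketch below is illustrative only: the function name is hypothetical, and it sets a mean lesional decrease from reference [25] against median parenchymal variations from this study, which are different quantities.

```python
def decrease_to_symmetric_variation(percent_decrease):
    """Express a relative decrease (%) on the scale of Eq. 5.

    A value falling from x to x * (1 - d) gives
    |d| / (1 - d / 2) * 100 under Eq. 5, independent of x.
    """
    d = percent_decrease / 100
    return abs(d) / (1 - d / 2) * 100


# 19.1 % mean decrease reported for disease control in reference [25].
threshold = decrease_to_symmetric_variation(19.1)
print(round(threshold, 2))  # 21.12

# Median liver inter-scanner variations of non-normalized IC from the Results.
liver_medians = {"ds_vs_dl": 28.96, "ds_vs_rs": 29.08, "rs_vs_dl": 22.85}
exceeds = {pair: v > threshold for pair, v in liver_medians.items()}
```

Under these assumptions, all three median liver variations of non-normalized IC are larger than the converted response threshold.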

One recent study investigated inter-scanner variability of iodine quantification [26]. In concordance with our results, variability was not mitigated by normalization to the aorta, a method that has been applied in several oncologic use cases. In their study, Cai et al. proposed a phantom-based model for correcting inter-scanner variability of iodine. The possibility of combining the proposed perfusion-directed method (i.e., NICALL), which we found efficient in reducing variability in some instances, with a phantom-derived correction model should be further investigated.

In our study, different factors contributing to variability had to be accounted for. Physiological changes in organ perfusion can be expected to be a relevant factor, despite protocol homogenization. By using different approaches for normalizing iodine concentrations, we aimed to mitigate perfusion-related differences in iodine concentration, which we consider an important factor contributing to the variability we found. We used normalization against the aorta as well as the portal vein, which have been used in previous studies examining iodine variability [10]. Moreover, we implemented a normalization approach based on both vessels as well as the liver and pancreas representing parenchymal organs, aiming to create a surrogate for the overall iodine load present in the body at the time of the scan. Interestingly, the last approach yielded the highest overall reduction in inter-scanner variability for all scanner pairs when assessing the pancreas and kidneys. One reason for that result might be that it more accurately reflects the overall iodine load at the time of image acquisition, and that it accounts for potential variation in vessel perfusion that might skew the normalization approaches based on one vessel. Notably, due to the body weight adaptation of contrast media, the contrast media amounts were similar between scans undergone on different scanners by one patient, which may explain why the contrast media amount was not found to influence IC in our longitudinal mixed effects analysis.

As our study included cancer patients, an increase in organ tumor burden over time was another important potential determinant influencing iodine concentration measurements. The first measure we took to mitigate this was excluding patients with extensive organ disease. Second, we included the order in which the patients were scanned on the different scanners as well as the inter-scan intervals as surrogate parameters for time-dependent (i.e., pathological) changes in organ perfusion in our mixed effects model, which did not reveal any significant effect of these variables.

Our study has limitations that need to be discussed. First, the retrospective design of the study limits the study population to the small number of patients who by chance were imaged on all three scanner types. Moreover, in our limited study cohort, we were unable to test the intra-patient, intra-scanner variability and compare it to the inter-scanner variability due to insufficient availability of repeated examinations on the same scanner type. Second, we did not include iodine concentration of lesions to determine the impact of different DECT technologies on pathological entities. The reason for including normal tissue is that we aimed to focus on technical variability in a clinical setting, which can hardly be observed in tumor lesions undergoing substantial biological change in normal follow-up intervals. Whereas we acknowledge that a scan-rescan test scenario with much shorter intervals between the scans measuring intra-patient iodine variability within lesions would be of interest, such a test would be impeded by radiation dose considerations. Based on our results, we assume that inter-scanner differences can be expected to be lower and more amenable to normalization in well-vascularized soft tissue lesions (comparable to physiological renal parenchyma) and higher in lesions of mixed composition (e.g., soft tissue lesions with fatty components and contrast enhancement, comparable to liver parenchyma with a certain amount of fatty degeneration). Third, our results should not be generalized to prior generation dsDECT or rsDECT scanners or to other technical implementations such as split filter DECT, which has been described to show variability compared to dsDECT in a previous study [27].

In conclusion, we found substantial variability in iodine concentrations obtained within the same patient across different DECT scanner types. Our data show that normalization can mitigate variability, yet with a dependency on tissues and scanner pairs. The observed variability across different DECT scanner types currently may result in erroneous lesion characterization or suboptimal treatment response evaluation in patients scanned on different systems, underscoring an unmet need for standardization across different DECT manufacturers.