Introduction

Imaging is a key modality for the diagnosis and follow-up of oncologic disease, as it allows non-invasive assessment of the extent of oncologic burden and spread [1, 2]. In the era of precision medicine, anti-cancer treatments vary widely; here, it is of particular importance to correctly identify therapeutic response. Subjective or solely measurement-based variables were shown to provide insufficient information on treatment response for some tumor entities, e.g., response evaluation criteria in solid tumors (RECIST) for hepatocellular carcinoma (HCC) [3, 4]. Hence, recent approaches suggest combining measurements with enhancement assessment (modified RECIST for HCC) or using other, more sophisticated metrics instead; however, all these techniques rely on the assumption that imaging methods are reproducible [2, 4,5,6].

In computed tomography (CT), iodinated contrast media is frequently used to overcome the intrinsically low soft tissue contrast [7, 8]. Besides improvement of image contrast, iodine distribution patterns allow for a more detailed characterization of hepatic tissue and/or lesions, especially in bi- or multiphase abdominal examinations. Dual-energy computed tomography (DECT) has been an emerging technology in the field of CT imaging over the past decade. While conventional CT assesses the global attenuation of all photons irrespective of their energies throughout a polychromatic X-ray spectrum, DECT aims to differentiate between attenuation of low- and high-energy photons of the X-ray spectrum. The attenuation coefficient is an inherent characteristic of any given element or compound. Because it is energy dependent, it can be approximated using the energy-resolved information provided by DECT, a process referred to as material decomposition [9, 10]. Material decomposition for iodine allows for identification of all voxels that contain iodine and, further, for estimation of their iodine content, owing to the linear dependency between concentration and attenuation [11, 12]. The resulting images, referred to as iodine maps, contain quantitative data visualizing iodine distribution. Several concepts of DECT are clinically available, one of which separates photon energies at the detector level and is referred to as spectral detector CT (SDCT) [13,14,15,16].

To evaluate and understand the clinical utility of iodine maps, testing their accuracy, precision, repeatability, and reproducibility is necessary. While several technical guidelines defining the aforementioned terms are available, such recommendations for medical imaging are missing. We aimed to transfer the definitions given in the 1994 Guideline on Validation of Analytical Procedures to a medical imaging context. Accordingly, accuracy describes the closeness of agreement between the true value (either as a known true value or as an accepted reference value) and the value found [17]. Overall precision expresses the closeness of agreement obtained from multiple repetitions of a measurement at different time points and under varying conditions [17]. Precision can be subdivided into repeatability and reproducibility. While repeatability refers to the closeness of agreement between multiple repetitions under identical conditions over a short interval of time, reproducibility refers to the closeness of agreement across different days and operators [17].

It has been reported recently that iodine maps from SDCT are accurate [18,19,20]; however, their precision, repeatability, and reproducibility have not been investigated systematically. In particular, no study has investigated the reproducibility of iodine maps in vivo, although this is necessary for evaluating their utility in oncologic imaging.

The overall aim of this study is to gain understanding of iodine distribution in order to evaluate the utility of iodine maps for oncologic imaging. This study features a phantom and an in vivo (patient) portion for the evaluation of reliability of iodine maps. The phantom portion focuses on evaluation of precision and repeatability, while the patient portion evaluates the intra-individual reproducibility with a focus on the liver.

Materials and methods

Phantom

We used an anthropomorphic liver phantom (customized, QRM) designed for DECT imaging exhibiting an average attenuation of 100 ± 5 HU at 120 kVp and an iodine density of approximately 1.4 mg/ml according to the vendor (due to inaccuracies in the manufacturing process, more precise values cannot be established).

Patients

The institutional review board waived informed consent due to the retrospective character of this study. All patients who underwent repetitive examinations on the SDCT between May 2016 and December 2017 were screened for inclusion in this study; detailed inclusion and exclusion criteria are shown in Fig. 1. Basic laboratory results regarding liver function [bilirubin, aspartate aminotransferase (GOT), and/or glutamic-pyruvate transaminase (GPT)] were available for all patients. Patients with a change in liver function as indicated by these results were excluded from further analysis, i.e., elevated liver enzymes if normal in prior testing, or an increase or decrease > 30% [21, 22]. In total, 77 patients were included, of which 52 and 25 underwent examination two and three times, respectively, resulting in a total of 179 examinations. Mean patient age was 59.8 ± 15.2 years; further details on patient characteristics are reported in Table 1.

Fig. 1

Flowchart illustrating patient inclusion and exclusion criteria. In total, 77 patients were included of which 52 underwent double and 25 underwent triple spectral detector CT examinations

Table 1 Patient characteristics

One radiologist with 8 years of experience in abdominal and oncologic imaging reviewed the examinations and identified all benign focal lesions within the liver as either cysts or hemangiomas. To determine the true origin of a lesion, prior/follow-up examinations (≥ 6 months), magnetic resonance imaging, biopsy, and/or contrast-enhanced ultrasound were considered; lesions without a standard of reference were excluded from further analysis. A total of 29 cysts and 29 hemangiomas were identified. All malignant lesions were excluded due to the possibility of therapy effects in the follow-up examinations. Further, it was noted if patients showed liver metastasis, progression of oncologic disease, or abnormalities of hepatic blood supply (including thrombosis of the portal vein and stenosis > 20% of the celiac trunk).

Image acquisition and reconstruction

All imaging was carried out on a clinical SDCT (IQon Spectral Detector CT, Philips Healthcare). For contrast enhancement, a bodyweight-adapted volume of iodinated contrast media (< 55 kg: 1 ml/kg; 55–120 kg: 100 ml; > 120 kg: 120 ml; Accupaque 350 mg/ml, GE Healthcare) was applied via an antecubital vein followed by a 30 ml saline flush (flow rate 3.5 ml/s). Portal venous phase images were acquired using bolus-tracking technique with a delay of 50 s after reaching a trigger threshold of 150 HU in the abdominal aorta. Contrast injection and imaging protocol did not differ between repetitive examinations. In patients, dose modulation was enabled in all scans (reference tube current-time product 116 mAs, DoseRight 3D-DOM, Philips Healthcare), while the phantom was scanned with a fixed tube current-time product of 116 mAs. Further relevant scan parameters were identical: tube voltage 120 kVp, collimation 64 × 0.625 mm, rotation time 0.33 s, pitch 0.671, matrix 512 × 512. Further, CTDIvol was recorded from the radiation dose report. The effective diameter of patients (Dw) was automatically determined using in-house developed software in the Matlab environment (Matlab, MathWorks). The size-specific dose estimate (SSDE) was calculated according to Eqs. 1 and 2 as suggested earlier [23, 24].

$$ \mathrm{SSDE}={f}_{32\ \mathrm{cm}}^D\cdot {\mathrm{CTDI}}_{\mathrm{vol}\ 32\ \mathrm{cm}}^D $$
(1)
$$ {f}_{32\ \mathrm{cm}}^D=3.70469\cdot {e}^{-0.03671937\cdot {D}_{\mathrm{w}}} $$
(2)
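As a rough illustration of Eqs. 1 and 2, the following Python sketch computes the conversion factor and the SSDE for a single examination. The function names and the example input values are hypothetical and are not taken from the in-house Matlab software or from any individual patient; only the two coefficients come from Eq. 2.

```python
import math


def ssde_conversion_factor(dw_cm: float) -> float:
    """Eq. 2: conversion factor for the 32 cm phantom, with D_w given in cm."""
    return 3.70469 * math.exp(-0.03671937 * dw_cm)


def ssde(ctdi_vol_mgy: float, dw_cm: float) -> float:
    """Eq. 1: size-specific dose estimate in mGy."""
    return ssde_conversion_factor(dw_cm) * ctdi_vol_mgy


# Hypothetical example values (not from the study cohort).
example_ssde = ssde(ctdi_vol_mgy=10.0, dw_cm=30.0)
print(f"SSDE = {example_ssde:.2f} mGy")
```

Because the factor decreases exponentially with Dw, smaller patients receive a larger SSDE than larger patients at the same CTDIvol.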

All images were reconstructed using a dedicated spectral image reconstruction algorithm (Spectral B, denoising level 3, Philips Healthcare) that allows for reconstruction of conventional CT images (CI, similar to the vendor’s hybrid-iterative reconstruction algorithm) and iodine maps [16]. Images were reconstructed in axial plane with a slice thickness and section increment of 2 mm.

The phantom was scanned at three different time points at least 1 week apart (TP1–TP3). At each time point, image acquisition was repeated three times (Acq1–Acq3). Further, at one time point, image reconstruction was repeated three times for the first acquisition (Reco1–Reco3). In total, this resulted in 11 reconstructions each for CI and iodine maps (Fig. 2).

Fig. 2

Reconstructions acquired for the phantom portion of the study including their use towards determination of precision, repeatability, and reproducibility. TP time point, Acq acquisition, Recon reconstruction

Image analysis

Phantom images were analyzed using an in-house developed script (Matlab, MathWorks): Upper and lower boundaries of the phantom were determined, and in every 10th image between these, a circular region of interest (ROI) was automatically placed at a central position within the homogeneous liver parenchyma mimic (Fig. 1, diameter 30 px = 2.41 cm, 8 images per scan). Mean and standard deviation of attenuation in CI (HU) and iodine concentration from iodine maps (IM) were computed.
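The ROI measurement described above can be sketched in a few lines of Python. This is only a minimal illustration and not the authors' Matlab script: the function name, the representation of an image as a nested list, and the use of the sample standard deviation are assumptions made here.

```python
from statistics import mean, stdev


def circular_roi_stats(image, center_row, center_col, diameter_px=30):
    """Return (mean, standard deviation) of all pixels inside a circular ROI.

    image is a 2D nested list of pixel values (HU for CI, mg/ml for iodine
    maps). A pixel belongs to the ROI if its distance to the ROI centre does
    not exceed the radius. The sample standard deviation is an assumption.
    """
    radius_sq = (diameter_px / 2.0) ** 2
    values = [
        value
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius_sq
    ]
    return mean(values), stdev(values)


# Hypothetical homogeneous 64 x 64 test image with a value of 100 everywhere.
test_image = [[100.0] * 64 for _ in range(64)]
roi_mean, roi_sd = circular_roi_stats(test_image, center_row=32, center_col=32)
print(roi_mean, roi_sd)
```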

The coefficient of variation (CV) of a single image k was calculated according to Eq. 3, with x_i representing all n images of the corresponding reconstruction (CI or iodine map). CV values were averaged. In principle, CV reports the offset between a single image k and the average of all acquired images in percent.

$$ \mathrm{CV}={x}_k\cdot {\left(\frac{1}{n}\cdot \sum \limits_{i=1}^n{x}_i\right)}^{-1}-1 $$
(3)
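A minimal Python sketch of Eq. 3 is given below. Two points are assumptions made here and are not stated in the text: the result is multiplied by 100 to express it in percent, and the averaging uses absolute values, since the signed offsets of all images sum to zero by construction. Function names and example values are hypothetical.

```python
def cv_percent(values, k):
    """Eq. 3: offset of image k from the mean of all n images, in percent."""
    mean_value = sum(values) / len(values)
    return (values[k] / mean_value - 1.0) * 100.0


def mean_absolute_cv_percent(values):
    """Average of |CV| over all images (absolute values are an assumption)."""
    n = len(values)
    return sum(abs(cv_percent(values, k)) for k in range(n)) / n


# Hypothetical ROI means (HU) from three images of one reconstruction.
roi_means = [98.0, 100.0, 102.0]
print([round(cv_percent(roi_means, k), 2) for k in range(3)])
print(round(mean_absolute_cv_percent(roi_means), 2))
```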

To determine the repeatability, acquisitions for each time point were compared against each other. In addition, for TP1–Acq1, the repetitive image reconstructions were analyzed. The different time points were compared to evaluate the ex vivo reproducibility (Fig. 2).

In patients, one radiologist with 5 years of experience in abdominal imaging placed two circular ROIs with a size of 100 mm2 in the liver and the portal vein using the proprietary image viewer (Intellispace Portal, Philips Healthcare). If present, one circular ROI as large as possible was placed in the liver lesion. All imaging time points of a single patient were reviewed one after another to ensure comparable placement of the ROIs. In a subset of 50 randomly selected examinations, ROI placement was repeated by a second reader with 4 years of experience in abdominal imaging to assess interrater agreement.

Besides absolute values, the values for the liver and lesions were normalized to the portal vein to account for intra-individual differences in perfusion. For longitudinal assessment in patients, we modified the coefficient of variation in a way that maintains the direction of change between a value pair (CVmod; x_i and x_{i+1} in Eq. 4). In principle, CVmod reports the difference between two imaging time points relative to their mean in percent. TP were compared against each other using CVmod and visualized as waterfall plots [25].

$$ {\mathrm{CV}}_{\mathrm{mod}}=\frac{\left({x}_{i+1}-{x}_i\right)}{\frac{1}{2}\left({x}_i+{x}_{i+1}\right)}\cdotp 100\% $$
(4)
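Eq. 4 and the normalization to the portal vein can be illustrated with the short Python sketch below. Function names and input values are hypothetical. The second example shows why the measure becomes unstable when both values are close to zero, as is the case for cysts.

```python
def cv_mod(x_i, x_next):
    """Eq. 4: signed change between two time points relative to their mean, in percent."""
    pair_mean = 0.5 * (x_i + x_next)
    if pair_mean == 0:
        raise ZeroDivisionError("CVmod is undefined when the mean of the pair is zero")
    return (x_next - x_i) / pair_mean * 100.0


def normalize_to_portal_vein(organ_value, portal_vein_value):
    """Ratio of an organ or lesion measurement to the portal vein measurement."""
    return organ_value / portal_vein_value


# Hypothetical liver iodine concentrations (mg/ml) at two time points.
print(round(cv_mod(2.0, 2.5), 1))    # increase, positive sign
print(round(cv_mod(2.5, 2.0), 1))    # same magnitude, negative sign
# Hypothetical near-zero values: a small absolute change yields a large CVmod.
print(round(cv_mod(0.05, 0.25), 1))
```

For two non-negative values, the magnitude of CVmod cannot exceed 200%; attenuation values in HU can be negative, in which case the mean of the pair can approach zero and CVmod is no longer bounded.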

Statistical analysis

Interval scaled data are indicated as mean ± standard deviation. Nominal and ordinal scaled data are reported as median and interquartile range. Statistical analysis was performed using JMP (v13, SAS Institute, Cary, NC, USA). Possible influence on CVmod from the presence of liver metastasis, progression of oncologic disease, change in Dw, abnormalities of hepatic blood supply, and size of lesions was analyzed by means of linear or logistic regression. To determine interrater agreement, the intraclass correlation coefficient (ICC) was determined and interpreted as follows: excellent agreement (ICC > 0.8), good agreement (ICC > 0.6), moderate agreement (ICC > 0.4), and poor agreement (ICC ≤ 0.4) [26]. Statistical significance was defined as p ≤ 0.05.
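The ICC interpretation scheme stated above can be written as a small lookup, sketched here in Python. The function name is hypothetical; the thresholds are those given in the text, with each boundary value falling into the lower category because the criteria use strict inequalities.

```python
def interpret_icc(icc):
    """Map an ICC value to the agreement category used in this study."""
    if icc > 0.8:
        return "excellent"
    if icc > 0.6:
        return "good"
    if icc > 0.4:
        return "moderate"
    return "poor"


# Hypothetical ICC values.
for value in (0.98, 0.8, 0.5, 0.4):
    print(value, interpret_icc(value))
```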

Results

Phantom

Precision

Across all time points, acquisitions, and reconstructions, mean attenuation of the liver parenchyma mimic was 98.5 ± 0.6 HU, ranging from 97.2 to 99.7 HU, and mean iodine concentration was 1.44 ± 0.04 mg/ml, ranging from 1.37 to 1.49 mg/ml (Fig. 3).

Fig. 3

Analysis of the phantom portion. Circular regions of interest were automatically placed in the liver parenchyma mimic. The graphs illustrate attenuation values and iodine concentrations throughout the different imaging time points (TP) and acquisitions (Acq). The horizontal bar indicates the overall mean; the gray area illustrates the overall standard deviation

Precision was excellent as indicated by an overall CV of 0.31% and 1.13% for CI and iodine concentration, respectively (Fig. 4).

Fig. 4

Waterfall plots illustrating the offset (CVmod [%]) between repetitive time points and acquisitions in phantoms (A), and between different time points in patient examinations within the portal vein, the liver, and the normalized liver (B) as well as in liver lesions (C). The y-axis is cropped at ± 60% (B) and ± 200% (C) to ensure proper visualization

Repeatability

Even on a per-pixel basis, no differences between any reconstructions were found in either CI or iodine maps (p = 1), indicating perfect repeatability of the reconstruction algorithm.

Repeatability between acquisitions was good as indicated by CV for TP1, TP2, and TP3 of 0.39%, 0.40%, and 0.14% for CI and 1.15%, 1.07%, and 1.95% for iodine maps.

Reproducibility

Between TP1, TP2, and TP3, we found excellent reproducibility indicated by a CV of 0.29%, 0.56%, and 0.45% for CI and 1.39%, 1.94%, and 1.75% for iodine maps. The overall CV for reproducibility was 0.37% for CI and 1.41% for iodine maps (Fig. 4).

Patients

In the subset analysis of 50 examinations that were assessed by two independent readers, ICC values of 0.98/0.99 and 1.00/0.99 indicated excellent agreement for measurements in the liver/portal vein in CI and iodine maps, respectively. CTDIvol in patient examinations was 13.5 ± 6.1 mGy and Dw ranged from 20.8 to 41.2 cm (30.0 ± 4.1 cm), resulting in an SSDE of 15.8 ± 4.2 mGy. The average change in Dw between repetitive examinations was -1.0 ± 0.3%.

Overall variability

For the liver, we found an average attenuation of 112.4 ± 22.2 HU on CI and an average iodine concentration of 2.12 ± 0.76 mg/ml. To account for intra-individual differences in perfusion, we normalized to the portal vein, which exhibited an average attenuation of 166.9 ± 31.1 HU and an average iodine concentration of 5.38 ± 1.43 mg/ml, resulting in average normalized values of 0.678 ± 0.090 and 0.390 ± 0.084 for CI and iodine maps, respectively. Histograms and distribution are available as an electronic supplement.

Intra-individual course: parenchyma and portal vein

Intra-individual variance as indicated by CVmod for the liver was -0.5 ± 9.8% for CI and -3.0 ± 20.9% for iodine maps. For the portal vein, CVmod was 2.5 ± 17.1% and 3.2 ± 26.4% for CI and iodine maps, respectively. The normalized CVmod values for the liver were -1.5 ± 12.2% and -4.4 ± 18.9% for CI and iodine maps, respectively (ESM 1). In 80% of patients, the intra-individual differences were within ± 20% (Fig. 4).

Intra-individual course: lesions

Overall lesion size was 1.3 ± 1.6 cm and did not differ between cysts and hemangiomas (1.2 ± 0.9 cm and 1.4 ± 2.1 cm; p ≥ 0.05).

In cysts, average attenuation was 6.8 ± 6.3 HU, resulting in normalized values of 0.042 ± 0.041. Average iodine uptake was 0.230 ± 0.312 mg/ml and 0.049 ± 0.067 for absolute and normalized measurements, respectively. CVmod was found to be as high as 68.1 ± 227% and 37.7 ± 80.7% for CI and iodine maps, respectively (note that CVmod has limited validity in this context as division by small numbers will result in falsely high values, see Eq. 4, Fig. 4).

In hemangiomas, average values in CI and iodine maps were 72.5 ± 58.9 HU and 1.601 ± 1.874 mg/ml, respectively. Normalized values were 0.444 ± 0.344 and 0.291 ± 0.331, respectively. In CI, CVmod indicated a moderate variability (3.6 ± 30.6%), while variability in iodine maps was high (-37.1 ± 66.6%) (Fig. 4).

Possible confounders regarding CVmod

CVmod of the liver parenchyma was independent of the presence of liver metastasis, progression of oncologic disease, change in Dw, or abnormalities of hepatic blood supply (p ≥ 0.05). CVmod of lesions was independent of lesion size (p ≥ 0.05).

Discussion

The distribution of iodinated contrast media constitutes a powerful and important diagnostic tool in daily routine and SDCT allows for its quantification. In order to evaluate the clinical utility of iodine maps, this study investigated the repeatability and reproducibility in a phantom study as well as the in vivo reproducibility as indicated by the intra-individual course over follow-up examinations.

The accuracy of iodine maps from SDCT in phantoms has recently been investigated and the offset regarding the quantification accuracy was reported to be in the range of 2–10% [18,19,20, 27, 28]. Two studies used contrast media in water and iodine in various solvents, considering the iodine content as a known reference [18, 27], while the other studies relied on commercially available dual-energy phantoms [19, 20, 28]. Analogously to the latter, we used commercially available solid iodine inserts in abdomen and dual-energy phantoms; however, due to inaccuracies in the manufacturing process, the exact iodine concentration within the phantom cannot be established; therefore, we deliberately refrained from an analysis of accuracy.

To evaluate the repeatability of the reconstruction algorithm, we conducted a repetitive reconstruction of images from a single acquisition without finding any difference. Further, we compared the three acquisitions conducted at each time point against each other and again found only minimal differences. Among the overall small differences, the examinations at different time points showed the largest offset regarding the iodine concentration. A reason for this may be a greater dependence on a perfect system calibration; hence, small offsets such as cross-scatter and beam hardening may hamper image reconstruction [18, 20, 29, 30]. Nevertheless, overall precision, repeatability, and reproducibility in the phantom study were very high, and the measurements can therefore be considered reliable.

For evaluation in vivo, the intra-individual, longitudinal course of iodine concentration in the liver in patients undergoing repetitive examinations was examined. Despite unaltered imaging acquisition, contrast media application, and normalization of values to the portal vein to account for differences in individual circulation, we found a variation of iodine concentration of approximately 20%. Taking the high validity of image reconstruction and acquisition demonstrated in the phantom experiments into account, alterations within the biological system can be considered major contributors to this offset.

While a manifest organ dysfunction was excluded based on recent laboratory results, slight changes in organ function (not represented by these tests) were not considered, though possibly present. Further, the level of blood supply through the proper hepatic artery and portal vein may differ inter-individually; however, the degree is expected to be rather constant intra-individually. Accordingly, we did not find the supply type to be a relevant confounder regarding repeatability. In our study cohort, patients’ effective diameter remained rather constant throughout the longitudinal course; therefore, although body size was reported to influence quantification accuracy [18, 30], its influence in the context of repeatability and reproducibility seems negligible. Neither the presence of liver metastasis, progression of oncologic disease, nor lesion size confounded repeatability significantly; however, it needs to be noted that the number of large lesions was limited, and validity regarding the latter confounder is therefore restricted.

Another factor that may explain the large intra-individual, longitudinal variations is differences in cardiac output and hence in the distribution of iodinated contrast media depending on the time point of image acquisition. In comparison to perfusion CT, a few studies indicated that iodine maps from DECT may provide comparable information if the time point of image acquisition is wisely chosen [31,32,33]. A study by Skornitzke et al suggested a delay of 15–21 s in imaging of pancreatic cancer [31, 32]. In imaging of HCC, Godic et al reported an optimal delay of 9 s to yield the best correlation between the DECT snapshot and actual time-resolved perfusion CT [33]. Hence, we assume that further protocol optimization and more individualized contrast protocols may allow for more reproducible measurements of iodine concentration within a biological system, while actual changes of blood supply, organ perfusion, etc. can be considered additional major contributors.

In the clinically encountered time frame, distribution of iodinated contrast media is primarily dependent on intravascular distribution. Therefore, it can be considered a surrogate parameter of organ perfusion, which is why a theoretical benefit from iodine maps in cancer imaging can be assumed [33,34,35,36,37,38,39]. Some studies and reports demonstrated a utility of these maps in visualization of tumor response and in differentiation of different tumors [34,35,36,37]. On the other hand, our data clearly identify a limitation to consider when evaluating such iodine maps clinically, namely the great variance encountered inter- and intra-individually. To consider a change in iodine uptake significant, it should clearly exceed the intra-individual variation, i.e., it has to be larger than 20% at minimum. Yet, it remains elusive whether this variation can be reduced by protocol optimization or other means of standardization or whether it arises from actual biologic variation.

There are further limitations that need to be considered. The proposed CVmod has limited validity in the context of cysts as it becomes infinite if the true/expected value approximates zero. We deliberately accepted this drawback as it allows for comparison between CI and iodine maps and maintains the direction of change between two values (i.e., an in- or decrease). Functional cardiac parameters were not considered as confounders as they were not available for our study population. An evaluation of validity in other organs and in the context of metastatic disease would be interesting; yet, it was out of scope of this study. Our findings are limited to SDCT and our contrast and scanning protocol only.

To conclude, we found that the repeatability and reproducibility of iodine maps in phantoms match the earlier reported high accuracy and precision; from a technical viewpoint, iodine maps can be considered a reliable tool. However, regarding the reproducibility in the context of in vivo liver imaging, we found a variability of approximately 20% over the intra-individual course in patients. Hence, these maps have to be used with caution if considered as supplemental information for clinical decision-making.