Introduction

Contrast-enhanced computed tomography (CECT) is routinely used for oncologic staging and surveillance of metastatic disease in the chest, abdomen, and pelvis for a wide variety of primary malignancies [1]. Although there was a transient decline in CT utilisation related to concerns over radiation dose exposures [2], CT volumes appear to be trending higher again [3]. Nonetheless, scrutiny surrounding the small theoretical increased risk of malignancy from medical radiation at the population level remains high [4]. The impact of this effect, while controversial in terms of actual harm, has led to the prioritisation of CT dose reduction while maintaining diagnostic image quality. Recent research has focussed on iterative algorithms designed to reconstruct CT data acquired at reduced doses. Increased availability of commercial iterative reconstruction methods that operate in image space, projection (raw data) space, or both has allowed for moderate dose reductions beyond classic filtered back-projection (FBP) [5]. Specific algorithms operating primarily in image space include adaptive statistical iterative reconstruction (ASIR, GE Healthcare), adaptive iterative dose reduction (AIDR, Toshiba), iDose4 (Philips), and sinogram-affirmed iterative reconstruction and iterative reconstruction in image space (SAFIRE and IRIS, Siemens) [5-9]. For ASIR, studies have demonstrated the potential to reduce the dose by an estimated 25-40 % [6, 9, 10]. The emergence of higher-order model-based image reconstruction (MBIR) algorithms, which may operate throughout the system with multiple forward and backward reconstructions, shows promise for even more aggressive dose reduction [5]. MBIR algorithms include Veo (GE), advanced model-based iterative reconstruction (ADMIRE, Siemens), iterative model reconstruction (IMR, Philips), and forward-projected model-based iterative reconstruction solution (FIRST, Toshiba) [5, 11, 12]. Of note, ADMIRE operates in both image and raw-data space, whereas MBIR, IMR, and FIRST primarily operate in raw-data space.

The utility of iterative reconstruction algorithms has previously been demonstrated for specific low-dose abdominal CT applications including CT colonography (CTC) [6, 13, 14], detection of urolithiasis [15, 16], and colonic diverticulitis [17]. For these clinical indications, aggressive dose reductions up to 90 % relative to FBP may be feasible. However, in these contexts, the findings of interest are inherently high contrast relative to their surroundings. For the liver, detection of low-contrast soft tissue lesions against a background of hepatic parenchyma is a much more difficult imaging task. Accurate imaging evaluation of the liver for focal metastatic lesions is critical in oncology for staging and therapeutic decisions, as well as for determining treatment response or progression [1, 18].

Clinical performance data for detection of low-contrast liver lesions at reduced-dose CT are lacking, especially with regard to newer iterative reconstruction algorithms. Recent phantom-based studies have shown little or no improvement in the performance of iterative reconstruction techniques compared with traditional FBP reconstruction at highly reduced doses [19-21]. Recent human subject research suggested a possible role for iterative reconstruction techniques for assessing liver lesions at reduced-dose CT, but these studies either relied on retrospective artificial noise insertion [22] or did not compare directly against standard-dose CT [12]. Preliminary data by our own research group suggested superior subjective image quality for MBIR (Veo) over FBP and ASIR at reduced-dose CT in human subjects, but also that further study of actual diagnostic performance was warranted [11].

The purpose of this prospective trial was to directly compare the diagnostic performance of reduced-dose (RD) CECT using three different reconstruction algorithms [FBP, ASIR, and MBIR (Veo)] with concurrent standard-dose (SD) CECT for the detection of low-contrast liver lesions.

Materials and methods

This HIPAA-compliant prospective study was approved by our institutional review board. All subjects provided signed informed consent.

Patient population

Between April 2011 and April 2014, 70 adult patients (mean age 59.4 ± 12.8 years, male:female 31:39) were enrolled. Mean patient BMI was 27.7 ± 5.2 kg/m² (Appendix Table 5). Eligible patients were adult men and non-pregnant women in outpatient care with a known diagnosis of non-liver primary malignancy presenting for CECT of the abdomen for staging or surveillance of metastatic disease. Primary malignancies included colorectal (n = 18), pancreatic (n = 14), neuroendocrine (n = 9), breast (n = 9), lung (n = 4), oesophageal (n = 3), gastrointestinal stromal tumour (n = 3), lymphoma (n = 2), gallbladder (n = 1), cervical (n = 1), sarcoma (n = 1), prostate (n = 1), small bowel adenocarcinoma (n = 1), anal squamous cell (n = 1), testicular (n = 1), and gastric (n = 1).

CT protocol

All studies were performed on a 64-slice multi-detector CT scanner (Discovery CT750 HD, GE Healthcare) with a detector configuration of 64 × 0.625 mm, 120 kV, and tube current modulation (Auto-mA and Smart-mA, GE Healthcare). All patients first underwent SD CECT (Appendix Table 6) of the abdomen in the portal venous phase with automatic bolus tracking (SmartPrep), followed immediately by matched-coverage RD acquisition within the same breath hold to minimise any differences in phase of parenchymal contrast enhancement. The target range for dose reduction was deliberately set at 60-70 %, which is intermediate between typical reduction levels for CECT with ASIR (25-40 %) and unenhanced indications such as CTC and urolithiasis (80-90 %). This level of dose reduction (approximate CTDIvol of 5 mGy) was based on prior studies suggesting levels that would maintain diagnostic quality [23, 24]. Although lower levels down to CTDIvol of 3-4 mGy have been reported [11], this aggressive reduction is better suited for abdominal applications such as CTC [14] and urolithiasis [15]. Given the inherent low-contrast nature of most liver metastases, the decision was made to reduce the dose to a more intermediate degree.

Individual dose reductions were achieved as needed by adjustments to the tube current range and noise index. CT doses are reported both as the volume CT dose index (CTDIvol, mGy) and size-specific dose estimate (SSDE, mGy). Effective doses (mSv) were calculated by multiplying the dose-length product (DLP) by the conversion factor of 0.015 mSv/(mGy × cm) recommended by the AAPM [25] and verified by Deak et al. [26].
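
The effective dose arithmetic described above can be illustrated with a minimal Python sketch. The function names and the DLP values are hypothetical and serve only to show the calculation; the conversion factor of 0.015 is the one quoted in the text.

```python
# Conversion factor quoted in the text, in mSv per (mGy x cm).
K_ABDOMEN = 0.015


def effective_dose_msv(dlp_mgy_cm: float, k: float = K_ABDOMEN) -> float:
    """Estimate effective dose (mSv) as DLP (mGy x cm) times a conversion factor."""
    if dlp_mgy_cm < 0:
        raise ValueError("DLP must be non-negative")
    return dlp_mgy_cm * k


def percent_reduction(sd_value: float, rd_value: float) -> float:
    """Relative reduction of the RD value against the SD value, in percent."""
    if sd_value <= 0:
        raise ValueError("standard-dose value must be positive")
    return 100.0 * (sd_value - rd_value) / sd_value


# Hypothetical DLP values for one patient (not study data).
sd_dose = effective_dose_msv(400.0)  # standard-dose acquisition
rd_dose = effective_dose_msv(140.0)  # reduced-dose acquisition
reduction = percent_reduction(sd_dose, rd_dose)
```

Because the same conversion factor is applied to both acquisitions, the percentage reduction in effective dose equals the percentage reduction in DLP.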

CT image reconstruction and analysis

Following acquisition, CT examinations were anonymised. The SD data set for each patient was reconstructed using FBP. The RD data set for each patient was reconstructed using the three reconstruction algorithms available on the CT scanner: FBP, ASIR, and MBIR. For the ASIR series, a 40 % blend was used to optimise subjective image quality, as previously described [9, 11, 27].

All SD and RD series were reconstructed in the axial and coronal planes using 2.5-mm slice thickness at 1.25-mm intervals. Following reconstruction, the four series (SD-FBP, RD-FBP, RD-ASIR, and RD-MBIR) were randomised into separate experimental groups (one series per patient per group) for review. In this way, no experimental group contained more than one series from any given patient, preventing experimental readers from making direct comparisons among reconstructions.
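
The text does not state how the randomisation was implemented. As one possible illustration, the following Python sketch assigns each patient's four series to four review groups so that every group holds exactly one series per patient. The function name, the seed, and the per-patient shuffle are assumptions of this sketch, not the study's documented procedure.

```python
import random

SERIES = ["SD-FBP", "RD-FBP", "RD-ASIR", "RD-MBIR"]


def assign_series_to_groups(patient_ids, seed=0):
    """Return one dict per review group, mapping patient ID to a series name.

    For each patient the four series are shuffled and dealt one per group,
    so no group ever contains two series from the same patient.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    groups = [dict() for _ in SERIES]
    for pid in patient_ids:
        order = SERIES[:]
        rng.shuffle(order)
        for group, series in zip(groups, order):
            group[pid] = series
    return groups


# Hypothetical patient IDs 1..70.
groups = assign_series_to_groups(range(1, 71), seed=42)
```

Each reader would then review the groups one at a time, separated by a washout period.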

Each of the experimental groups was independently interpreted by three different reviewers: a senior radiology resident, an abdominal imaging fellow, and an abdominal radiologist with over 15 years of experience. Reviewers were aware the study was performed to evaluate for metastatic disease in the liver, but were blinded to the specific indication and reconstruction algorithm for each series, as well as any prior or subsequent comparison CT examinations and all patient clinical information. Reviewers were instructed to identify up to five distinct liver lesions ≥4 mm in diameter, ignoring the remainder of the abdomen and prioritising the identification of malignant lesions, only listing benign lesions if there were fewer than five total lesions believed to be malignant. Use of axial versus coronal reconstructions as well as window and level settings was left to the discretion of each reviewer.

The location, diameter, Hounsfield unit (HU) density, subjective classification as malignant or benign (5-point scale: 1 = definitely benign, 2 = likely benign, 3 = malignancy not excluded, 4 = likely malignant, 5 = definitely malignant), and confidence that the lesion was a true finding (3-point scale: 1 = low confidence, 2 = moderate confidence, 3 = high confidence) were recorded by each reader for each lesion identified. Overall by-patient confidence was also recorded for all studies interpreted as negative for liver lesions. Each reader reviewed all randomised experimental groups such that all SD and all RD series were independently reviewed by all three experimental reviewers, with a washout period of at least 1 week between groups.

Reference standard

Two experienced abdominal radiologists (over 15 and 7 years in practice, respectively) served as tandem reviewers for establishment of the reference standard. All SD and RD series were reviewed with attention to the lesions identified by the experimental group reviewers. For the reference standard, comparison was made with available past and subsequent cross-sectional imaging examinations (CT, MR, and PET/CT), as well as all available clinical information. All potential lesions identified by experimental group reviewers on the SD and RD series were correlated to findings on both the index and comparison studies and were classified as metastases, benign lesions, or false-positive lesions. Additionally, any malignant lesions ≥4 mm found on reference standard review but not identified by the experimental group reviewers were identified as false negatives. For patients with five or more metastatic lesions by the reference standard, the experimental readers were only scored with false negatives if they failed to identify five malignant lesions. Experimental group findings for both SD and RD series were compared with this enhanced reference standard and performance metrics calculated on a per-patient basis. For calculation of performance metrics, benign or false-positive lesions scored 3 or higher on the malignancy scale were considered falsely positive against the reference standard. Truly malignant lesions on the reference standard either not identified by reviewers or scored 2 or lower on the malignancy scale were considered falsely negative.
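
The lesion-level scoring rule described above can be sketched in Python as follows. The function name and class labels are hypothetical. Labelling a benign or spurious finding scored 2 or lower as "TN" is an assumption of this sketch, and the five-lesion cap and the per-patient aggregation are deliberately left out.

```python
from typing import Optional

MALIGNANT_THRESHOLD = 3  # scores of 3 to 5 count as a positive call, per the text


def classify_lesion(reference: str, reader_score: Optional[int]) -> str:
    """Score one lesion against the reference standard.

    reference: "metastasis", "benign", or "false_positive".
    reader_score: 1-5 malignancy score, or None if the reader did not list the lesion.
    """
    if reference not in {"metastasis", "benign", "false_positive"}:
        raise ValueError(f"unknown reference class: {reference}")
    called_malignant = (
        reader_score is not None and reader_score >= MALIGNANT_THRESHOLD
    )
    if reference == "metastasis":
        # Missed lesions and lesions scored 2 or lower are both false negatives.
        return "TP" if called_malignant else "FN"
    # Benign or spurious finding: only a score of 3 or higher is an error.
    # Returning "TN" for the remaining case is an assumption of this sketch.
    return "FP" if called_malignant else "TN"
```

For example, a reference-standard metastasis that a reader scored 2 (likely benign) is counted as a false negative, exactly as a lesion the reader never listed.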

With regard to prior and follow-up cross-sectional imaging examinations (CT, MR, and PET/CT) used in the enhanced reference standard, patients underwent a mean [± SD] of 5.3 ± 4.4 scans (range, 0-27) over 2.1 ± 2.4 years (range, 0.1-10.1 years) prior to the study and 4.3 ± 4.2 scans (range, 0-17) over an average of 1.6 ± 1.1 years (range, 0.1-4.4 years) for follow-up after the study CT scan.

Statistical analysis

The modified Wald method was used to calculate confidence intervals for proportions. The t-test was used to assess differences in continuous variables and chi-squared or Fisher’s exact test was used, where appropriate, to assess differences in categorical variables. A two-tailed P-value < 0.05 was used as the criterion for statistical significance. Receiver-operating characteristic (ROC) curves were constructed using Microsoft Excel 2010 (Microsoft Corporation, Redmond, WA), and areas under the curve (AUC) were calculated using the trapezoidal rule. ROC curves were compared using the bivariate analysis model described by Hanley and McNeil [28, 29].
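
As an illustration of the interval calculation, the sketch below assumes that "modified Wald" refers to the adjustment commonly attributed to Agresti and Coull, in which z²/2 successes and z² trials are added before applying the usual Wald formula. The function name and the default z of 1.96 (for a 95 % interval) are choices made for this sketch.

```python
import math


def modified_wald_ci(successes: int, n: int, z: float = 1.96):
    """Approximate confidence interval for a proportion (Agresti-Coull form).

    Adds z^2/2 successes and z^2 trials, then applies the Wald formula to the
    adjusted proportion. Bounds are clipped to the interval [0, 1].
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2.0) / n_adj
    half_width = z * math.sqrt(p_adj * (1.0 - p_adj) / n_adj)
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


# Example: the 19 of 70 patients with verified metastases reported in the Results.
low, high = modified_wald_ci(19, 70)
```

Unlike the plain Wald interval, this form is centred on the adjusted proportion rather than on the observed one, which keeps it better behaved for small samples and for proportions near 0 or 1.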

Results

The mean [± SD] effective doses for the SD and RD CECT series were 5.77 ± 4.02 mSv (median, 4.81 mSv) and 2.01 ± 1.36 mSv (median, 1.71 mSv), respectively, representing a mean [± SD] reduction of 3.77 ± 2.94 mSv or 64.1 ± 8.8 % (Appendix Table 5). The mean [± SD] SSDEs for the SD and RD CECT series were 16.98 ± 9.04 mGy (median, 14.04 mGy) and 5.92 ± 3.16 mGy (median, 4.92 mGy), respectively. The mean [± SD] CTDIvol values for the SD and RD CECT series were 14.49 ± 9.29 mGy (median, 11.67 mGy) and 5.04 ± 3.18 mGy (median, 4.05 mGy), respectively.

Of the 70 patients in the study group, 19/70 (27 %) had at least one verified metastatic liver lesion ≥ 4 mm identified by the reference standard, with 12/19 (63 %) of those patients having at least five metastatic lesions. Compared with the reference standard, pooled SD-FBP among the experimental readers demonstrated sensitivity/specificity/PPV/NPV/accuracy of 0.91/0.78/0.60/0.96/0.81, respectively (Table 1). Pooled performance metrics for RD series (sensitivity/specificity/PPV/NPV/accuracy) were: 0.79/0.75/0.54/0.91/0.76 for RD-FBP; 0.84/0.75/0.56/0.93/0.78 for RD-ASIR; and 0.84/0.68/0.49/0.92/0.72 for RD-MBIR (Fig. 1). Although none of the 95 % confidence intervals for the RD series fell entirely below those of SD, sensitivity was relatively decreased for RD-FBP and specificity was relatively decreased for RD-MBIR. In comparison, RD-ASIR matched RD-MBIR for sensitivity (0.84) and RD-FBP for specificity (0.75). Performance metrics for individual readers are provided in Appendix Table 7.
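
The five per-patient metrics reported above follow directly from the counts of true and false calls. The Python sketch below shows the definitions. The counts in the example are hypothetical, chosen so that the rounded outputs resemble the pooled SD-FBP values under the assumption of 3 readers × 70 patients = 210 pooled reads; they are not the counts from Table 1.

```python
def performance_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard diagnostic performance metrics from a 2 x 2 table of counts."""
    if min(tp, fp, tn, fn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0 or tn + fn == 0:
        raise ValueError("each row and column total must be positive")
    return {
        "sensitivity": tp / (tp + fn),  # positives correctly called
        "specificity": tn / (tn + fp),  # negatives correctly called
        "ppv": tp / (tp + fp),          # positive calls that are correct
        "npv": tn / (tn + fn),          # negative calls that are correct
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical pooled counts (illustrative only, not taken from Table 1).
metrics = performance_metrics(tp=52, fp=34, tn=119, fn=5)
rounded = {name: round(value, 2) for name, value in metrics.items()}
```

With a prevalence of roughly one in four, a moderate specificity produces many false-positive calls relative to true positives, which is why the PPV sits well below the sensitivity.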

Table 1 Per-patient pooled performance data by reconstruction algorithm
Fig. 1 Pooled per-patient performance data by reconstruction algorithm

ROC curves for per-patient performance data were calculated for SD-FBP and each of the reduced-dose reconstruction algorithms (Fig. 2). Corresponding AUC values were 0.896 for SD-FBP, 0.834 for RD-FBP, 0.858 for RD-ASIR, and 0.854 for RD-MBIR. Statistically significant differences were seen between the AUC values for SD-FBP and RD-FBP (P = 0.002) and for SD-FBP and RD-MBIR (P = 0.032), whereas the difference between SD-FBP and RD-ASIR narrowly missed significance (P = 0.052). No statistically significant difference was observed among the AUC values of the reduced-dose reconstructions (RD-FBP vs. RD-ASIR, P = 0.251; RD-FBP vs. RD-MBIR, P = 0.344; RD-ASIR vs. RD-MBIR, P = 0.842). Examples of false-negative and false-positive cases at RD are shown in Figs. 3 and 4, respectively.
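
As a rough illustration of how an AUC can be obtained with the trapezoidal rule from ordinal reader scores, the Python sketch below builds ROC operating points by sweeping a threshold over the 5-point malignancy scale. The per-patient scores are hypothetical, and assigning a score of 0 to patients with no listed lesion is an assumption of this sketch. The Hanley and McNeil comparison of curves is not reproduced here.

```python
def roc_points(pos_scores, neg_scores, thresholds=(5, 4, 3, 2, 1)):
    """ROC operating points (FPR, TPR), one per threshold, plus both corners."""
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(s >= t for s in pos_scores) / len(pos_scores)
        fpr = sum(s >= t for s in neg_scores) / len(neg_scores)
        points.append((fpr, tpr))
    points.append((1.0, 1.0))
    return points


def auc_trapezoid(points):
    """Area under a piecewise-linear ROC curve by the trapezoidal rule."""
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical per-patient scores (0 = no lesion listed); not study data.
positives = [5, 5, 4, 3, 0]                 # patients with metastases
negatives = [0, 0, 0, 1, 2, 3, 0, 0, 4, 0]  # patients without metastases
auc = auc_trapezoid(roc_points(positives, negatives))
```

With only a handful of thresholds the curve is coarse, so the trapezoidal estimate depends heavily on the segment joining the last operating point to the upper right corner.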

Fig. 2 Receiver-operating characteristic curves for reduced-dose reconstructions versus standard dose

Fig. 3 False-negative lesions at reduced-dose reconstructions. Red arrows indicate metastatic lesions within the right lobe of the liver on standard-dose FBP imaging (top left) of an 82-year-old male with metastatic neuroendocrine tumour. These lesions were not identified at reduced-dose MBIR (top right), FBP (bottom left), or ASIR (bottom right). Dose reduction was 74 % on the reduced-dose series, with an effective dose of 1.49 mSv. Additional lesions in the left lobe are partially seen on the standard-dose image but not clearly seen on the reduced-dose series

Fig. 4 False-positive lesion at reduced-dose MBIR. The red arrow indicates a false-positive lesion within the right lobe of the liver on reduced-dose MBIR (top right) in a 52-year-old female with a primary diagnosis of pancreatic neuroendocrine tumour. This lesion persisted on adjacent axial slices. No lesion is seen at standard-dose FBP (top left), reduced-dose FBP (bottom left), or reduced-dose ASIR (bottom right). Comparison to subsequent CT scans confirmed absence of metastatic disease. Dose reduction was 82 % on the reduced-dose series, with an effective dose of 1.34 mSv

Reader confidence scores are summarised in Table 2. Notably, compared to SD-FBP, patients truly negative for metastatic disease were called true negative with significantly less confidence on all RD reconstructions, with mean confidence scores of 2.70 ± 0.55 for SD-FBP and 1.98 ± 0.63, 1.98 ± 0.74, and 1.62 ± 0.64 for RD-FBP, RD-ASIR, and RD-MBIR, respectively (P < 0.001 in all cases, with RD-MBIR also statistically significantly lower than RD-FBP, P < 0.001, and RD-ASIR, P = 0.001). Compared to SD-FBP, clinically small but statistically significant decreases in reader confidence for true-positive malignant lesions were noted for RD-FBP (P = 0.016) and RD-ASIR (P = 0.038), but not for RD-MBIR. Differences in lesion diameter among reconstruction algorithms are summarised in Table 3. Of note, false-positive lesions detected at RD-MBIR, in addition to being greater in number, were statistically significantly smaller when compared with SD-FBP (P = 0.004). Differences in lesion density among reconstruction algorithms are summarised in Table 4. Notably, true-positive malignancies were noted to be of significantly lower density at RD-MBIR when compared to SD-FBP (P = 0.042); no other statistically significant differences were noted among SD or RD reconstructions.

Table 2 Per-patient pooled reader confidence by reconstruction algorithm
Table 3 Lesion diameter by reconstruction algorithm
Table 4 Lesion density by reconstruction algorithm

Discussion

CECT remains a mainstay in the diagnosis, staging, and surveillance of metastatic disease within the chest, abdomen, and pelvis [1]. Consequently, the continued development of methods to lower the CT radiation dose without significantly impacting performance is an important goal, especially given that many patients will undergo multiple CT scans over the course of their diagnosis and treatment. Although the actual health risks related to medical radiation remain a topic of considerable debate [4], it is nevertheless reasonable to ensure exposures are as low as reasonably achievable for accurate diagnosis. To this end, considerable effort has been directed toward the development of more sophisticated iterative reconstruction algorithms, such as MBIR, with the promise of substantial dose reduction. While there are data in the literature to support subjective improvements in image quality with these newer iterative reconstruction techniques (e.g. MBIR) compared with older techniques such as ASIR, there remains some question as to whether there will also be meaningful differences in diagnostic performance.

CT dose-reduction strategies have been explored and largely validated for some specific abdominopelvic indications, including CTC [6, 13, 14] and urolithiasis [15, 16]. However, these tasks feature greater inherent contrast between the target lesion and background tissues (e.g. soft tissue colorectal polyps against gas or contrast-tagged fluid and high-attenuation calculi against renal parenchyma). Comparatively, low-attenuation liver lesions against a background of liver parenchyma provide less contrast and pose a much greater diagnostic challenge. Several previous studies have evaluated this key area of reduced-dose CT for the detection of liver lesions [12, 19-22]. However, while very informative, these studies all have some methodological issues that may limit their applicability to clinical practice, including use of static non-contrast phantoms instead of actual patients [19-21], the artificial introduction of noise [22], and lack of direct comparison to an SD acquisition [12]. Our study builds upon these prior efforts and, to our knowledge, represents the first attempt at aggressive dose reduction (60-70 %) in a clinical patient population where both RD and SD series were obtained in immediate succession, allowing for direct meaningful comparison.

The methodology applied in this study allows for comparisons between RD and SD CECT not possible in prior studies. Rather than simply compare the experimental low dose series to an SD reference standard, we directed our experimental readers to also interpret the SD images under the same blinded and isolated conditions as with each RD series. This methodology demonstrates that even the SD series is not infallible when interpretation is performed in a vacuum without clinical history or comparison imaging. In comparison, the reference standard readers interpreted the SD images with the benefit of full access to the medical record and comparison studies (both prior and follow-up) for each patient. This paradigm permits better direct comparison of RD and SD CECT under the same conditions and tempers the lower accuracy values found with the former.

Interpreting the SD-FBP images in isolation, our experimental readers were able to achieve reasonable overall per-patient sensitivity (91 %) with modest specificity (78 %) and accuracy (81 %) for metastatic liver lesions ≥4 mm when compared to the non-blinded reference standard. For the RD series, we aggressively reduced the dose by over 60 % on average, with a mean effective dose of approximately 2 mSv. At these doses, we observed decreased diagnostic performance for all RD reconstructions when compared to SD-FBP, with the performance values for the RD series falling at or below the 95 % confidence interval for the SD-FBP series. Although at no point did the 95 % confidence intervals for the RD series fall entirely below those of SD, relative outliers include decreased sensitivity for RD-FBP and decreased specificity for RD-MBIR, whereas RD-ASIR maintained a more balanced performance, matching RD-MBIR for sensitivity and RD-FBP for specificity. We also noted that the areas under the ROC curves were lower for RD-FBP and RD-MBIR, with statistical significance, and nearly so for RD-ASIR. Practically, this translated into both false-negative lesions (missed as well as misclassified; Fig. 3) and false-positive lesions (Fig. 4) at RD imaging, which resulted in complete misclassification of the metastatic disease status of some patients. Additionally, we noted that the readers had lower confidence in the RD reconstruction algorithms when compared to SD, most notably when declaring patients free of disease, where there was a relatively large (nearly a full point on a 3-point scale) and statistically significant difference for all three RD reconstruction algorithms. Differences in lesion size and density between SD and RD CECT were considerably less pronounced.

Perhaps more importantly, unlike prior studies evaluating more modest dose reductions, there was no improvement seen with MBIR over ASIR or FBP at aggressive dose reduction in our study. In fact, RD-MBIR demonstrated lower specificity compared with both RD-ASIR and even RD-FBP. As we and others have noted anecdotally, RD-MBIR seems more prone to generating a somewhat smudgy liver texture with occasional pseudolesions (Fig. 4), which were frequently labelled in our study as concerning for metastatic disease and drove down specificity. Of note, this appearance is not isolated to single slices; in fact, this phenomenon seems to persist (and even appear accentuated) while scrolling through these pseudolesions in a cine mode. While further investigation is clearly warranted to explain this phenomenon, we hypothesise that it may stem from MBIR’s attempts to constrain noise within a subtly photopenic region of liver, perhaps resulting in the appearance of a smoothly marginated low-density focus. This phenomenon is also evident in the confidence data, as readers were significantly less confident in labelling patients truly negative on RD-MBIR when compared to RD-FBP and RD-ASIR. Given decreased performance and reader confidence at the radiation doses evaluated in this study (in some cases resulting in the misclassification of patients as falsely positive or negative for liver metastatic disease), it may be prudent to question the perceived benefit of sparing radiation dose in this patient population versus the risk of potential misdiagnosis.

We acknowledge several study limitations. Given that dose reduction in CT is an ongoing area of research, there is no broadly accepted standard protocol for reduced-dose imaging; consequently, centres whose protocols differ substantially from our own may experience different results. For this prospective trial, we were confined to the dose reduction level set at the outset of the study, which was based on the best available data. For the purposes of this study, our experimental readers were blinded to patient clinical history as well as prior and follow-up imaging, which may result in some degree of underestimation of diagnostic performance when compared to clinical practice (as seen with the reduced performance with the SD series). Additionally, while our SD and RD scans were obtained in immediate succession, there is an unavoidable associated slight delay in phase of contrast, which could theoretically affect lesion detection in some instances. Our patient population was overweight on average (mean BMI of 28 kg/m²), which likely affected lesion detection, even if the resulting effective doses for both the SD and RD series were higher relative to thinner patients. Despite the use of at least a 1-week washout period between series, the possibility of recall bias exists, particularly for patients who may have had memorable or characteristic findings. There was heterogeneity in the type, number, and interval of prior and follow-up cross-sectional examinations used to establish the reference standard, based on the particular clinical course of each individual patient; however, patients averaged multiple scans both before and after the scan for the current study (nearly 10 total scans per patient on average), allowing for establishment of a robust reference standard. Finally, this study assessed the performance of a specific model-based reconstruction algorithm from one vendor (Veo). As such, our results may not necessarily generalise to iterative reconstruction algorithms offered by other vendors, and additional specific studies are clearly needed to confirm the diagnostic performance for those related techniques.

In conclusion, using aggressive CT dose reduction (i.e. >60 % reduction from our standard imaging protocol; effective doses around 2 mSv) to evaluate low-contrast liver lesions ≥4 mm, we observed lower diagnostic performance and reader confidence for all RD CECT reconstructions (FBP, ASIR, and MBIR) when compared to SD CECT, sometimes resulting in the misclassification of the metastatic disease status of patients. Furthermore, unlike prior experiences with less aggressive dose reduction, there was no clear advantage seen with MBIR over ASIR (in fact, specificity was decreased for MBIR). Consequently, the role of aggressive CECT dose reduction in the clinical oncology setting for detecting low-contrast liver lesions should be questioned given the potential for misdiagnosis in cancer patients.