Introduction

Chest CT is the modality of choice for thoracic imaging in various clinical situations, and the radiation dose can be reduced in evaluations of lung parenchyma due to its excellent natural contrast. As current CT scanners are capable of whole-body scanning in a single breath-hold, the upper abdomen is usually also included in contrast-enhanced chest CT, especially in cancer patients. However, the radiation dose optimized for chest CT may result in degraded image quality in the abdomen, and a separate CT scan covering the upper abdomen could be required. The effective dose for standard abdominal CT was reported to be about 1.5 times that of standard chest CT [1], or approximately 3–4 times the weighted computed tomographic dose index (CTDIw) amended with the conversion factor [2]. Improvements in CT detector and iterative reconstruction technology [3,4,5] have contributed to lowering the radiation dose while maintaining image quality. Nonetheless, additional noise reduction may enhance the diagnostic performance for mediastinal and upper abdominal lesions, enabling further radiation dose reduction or single-pass whole-body scanning.

Recently, deep learning technology has been reported to show excellent performance in various fields of radiology, including lesion detection, lesion classification, and image reconstruction. If a deep learning–based image reconstruction (DLIR) algorithm (TrueFidelity; GE Healthcare) can improve the image quality of chest CT images, especially for upper abdominal evaluations, unnecessary radiation exposure could be reduced by avoiding additional examinations. Some studies have reported that the noise reduction performance of DLIR is superior to that of the standard adaptive statistical iterative reconstruction (ASiR) algorithm [6, 7], but the image quality of DLIR-reconstructed reduced-dose scans has not been compared with that of ASiR-reconstructed standard-dose scans in human subjects.

In this study, we compared the image quality of the upper abdomen covered in contrast-enhanced chest CT images reconstructed by a commercial deep learning algorithm to that of images from dedicated contrast-enhanced abdominal CT.

Materials and methods

Patients

The institutional review board of Seoul National University Hospital approved this retrospective study, and the requirement for informed consent was waived. We included 100 consecutive patients (mean age, 64.0 ± 10.2 years [range, 34–82 years]; 63 men and 37 women) who underwent contrast-enhanced chest and abdominal CT scans on the same day, with a single contrast medium injection from January 2020 to March 2020. All patients underwent the examinations for an assessment of disease progression or postoperative surveillance of a known malignancy, with a detailed breakdown as follows: colorectal (n = 37), stomach (n = 14), biliary (n = 12), lung (n = 9), pancreas (n = 8), liver (n = 4), genitourinary (n = 4), lymphoma (n = 4), and other malignancy (n = 8).

Deep learning–based image reconstruction algorithm

The commercial DLIR algorithm (TrueFidelity, GE Healthcare) [8] was built based on the CT vendor’s detailed design embedded in a convolutional neural network. The algorithm took a chest CT sinogram as input data, and the ground truth was standard-dose CT information reconstructed by filtered back projection (FBP) of the same data. The ground truth training data were images from both phantoms and patients. The DLIR algorithm was deployed to run on the reconstruction hardware of a specific CT system.

CT acquisition and image reconstruction

Contrast-enhanced chest CT

All chest CT images were scanned by one multidetector CT scanner (Revolution CT; GE Healthcare) under a standard-dose contrast-enhanced CT protocol. The chest CT scan was performed craniocaudally (scan range from lower neck to costophrenic angle level) 55 s after contrast injection into an antecubital vein (total contrast volume and injection speed adjusted by patient’s body weight; 84 mL with 2.8 mL/s when the body weight was 70 kg) and saline push (10 s at the same rate). The following CT parameters were used: tube voltage, 100 kVp; automatic tube current modulation (noise index, 16); gantry rotation period, 280 ms; detector collimation, 0.625 mm; and detector pitch, 1.53. All CT scans were reconstructed into 2.5-mm slices with a 512 × 512 matrix size. Images were reconstructed using two different algorithms: DLIR (high level) and ASiR (ASiR-CHT, 50% blending with FBP) for the standard kernel.

Contrast-enhanced abdominal CT

All images were scanned by the same scanner as chest CT, after a single contrast injection following a standard CT protocol. The portal-phase abdominal CT scan was performed craniocaudally immediately after the chest CT scan. The following CT parameters were used: tube voltage, 100 kVp; automatic tube current modulation (noise index, 13.5); gantry rotation period, 500 ms; detector collimation, 0.625 mm; and detector pitch, 0.51. All CT scans were reconstructed into 2.5-mm slices with a 512 × 512 matrix size. Images were reconstructed using ASiR (ASiR-ABD; 40% blending with FBP) for the standard kernel. Details are summarized in Table 1.

Table 1 Protocol and dose information for contrast-enhanced chest and abdominal CT

Quantitative image quality assessment

For objective comparison of image quality, noise, the signal-to-noise ratio (SNR), and the contrast-to-noise ratio (CNR) were measured and compared among different reconstruction algorithms. For the thoracic evaluation, the standard deviation (SD) of values in Hounsfield units (HU) was measured in regions of interest (ROIs) measuring ≥ 1 cm² drawn in the bilateral axillary fat (SDax1 and SDax2), and average HU values were measured in the bilateral paraspinal muscles (HUPSM1 and HUPSM2), ascending thoracic aorta (HUao), and main pulmonary artery (HUMPA) (Fig. 1a). Noise, SNR, and CNR for each scan were calculated using the following equations [9, 10]:

$$ \mathrm{Noise}=\left({\mathrm{SD}}_{\mathrm{ax}1}+{\mathrm{SD}}_{\mathrm{ax}2}\right)/2 $$
$$ \mathrm{SNR}=\left({\mathrm{HU}}_{\mathrm{PSM}1}+{\mathrm{HU}}_{\mathrm{PSM}2}\right)/\left({\mathrm{SD}}_{\mathrm{ax}1}+{\mathrm{SD}}_{\mathrm{ax}2}\right) $$
$$ \mathrm{CNR}=\left[\left({\mathrm{HU}}_{\mathrm{ao}}+{\mathrm{HU}}_{\mathrm{MPA}}\right)-\left({\mathrm{HU}}_{\mathrm{PSM}1}+{\mathrm{HU}}_{\mathrm{PSM}2}\right)\right]/\left({\mathrm{SD}}_{\mathrm{ax}1}+{\mathrm{SD}}_{\mathrm{ax}2}\right) $$
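
The three thoracic equations above can be written out as a short calculation. The following is a minimal sketch in Python, not the software used in the study; the function name `thoracic_metrics`, its argument names, and the example ROI readings are hypothetical and chosen only for illustration.

```python
def thoracic_metrics(sd_ax1, sd_ax2, hu_psm1, hu_psm2, hu_ao, hu_mpa):
    """Return (noise, SNR, CNR) from thoracic ROI measurements, all in HU.

    sd_ax1, sd_ax2   : SD in the two axillary fat ROIs
    hu_psm1, hu_psm2 : mean HU in the two paraspinal muscle ROIs
    hu_ao, hu_mpa    : mean HU in the ascending aorta and main pulmonary artery
    """
    sd_sum = sd_ax1 + sd_ax2
    noise = sd_sum / 2
    snr = (hu_psm1 + hu_psm2) / sd_sum
    cnr = ((hu_ao + hu_mpa) - (hu_psm1 + hu_psm2)) / sd_sum
    return noise, snr, cnr


# Hypothetical ROI readings for illustration only (not data from the study).
noise, snr, cnr = thoracic_metrics(8.0, 8.0, 60.0, 60.0, 300.0, 280.0)
print(noise, snr, cnr)  # 8.0 7.5 28.75
```

Because SNR and CNR share the denominator SDax1 + SDax2, both scale inversely with the measured fat noise when the mean attenuation values are unchanged.
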
Fig. 1

Examples of the measurement methods used to calculate noise, the signal-to-noise ratio, and the contrast-to-noise ratio. a In the thoracic evaluation, we measured the standard deviation (SD) of values in Hounsfield units (HU) in bilateral axillary fat (SDax1 and SDax2) and average HU values in bilateral paraspinal muscles (HUPSM1 and HUPSM2), ascending thoracic aorta (HUao), and main pulmonary artery (HUMPA). b For the upper abdominal evaluation, we measured the SD of HU values in abdominal wall fat (SDfat) and the right and left liver lobes (SDlvr1 and SDlvr2). Average HU values were measured in the right and left liver lobes (HUlvr1 and HUlvr2) and abdominal aorta (HUaao)

In evaluating the upper abdomen, the SD of HU values was measured in ROIs measuring ≥ 1 cm² drawn in the anterior abdominal wall fat (SDfat) and right and left liver lobes (SDlvr1 and SDlvr2), while average HU values were measured in the right and left liver lobes (HUlvr1 and HUlvr2) and abdominal aorta (HUaao) (Fig. 1b). The perirenal fat area was used for cases lacking measurable (≥ 1 cm²) anterior abdominal wall fatty tissue. Noise, SNR, and CNR for each scan were calculated using the following equations [11, 12]:

$$ \mathrm{Noise}={\mathrm{SD}}_{\mathrm{fat}} $$
$$ \mathrm{SNR}=\left[\left({\mathrm{HU}}_{\mathrm{lvr}1}/{\mathrm{SD}}_{\mathrm{lvr}1}\right)+\left({\mathrm{HU}}_{\mathrm{lvr}2}/{\mathrm{SD}}_{\mathrm{lvr}2}\right)\right]/2 $$
$$ \mathrm{CNR}=\left[{\mathrm{HU}}_{\mathrm{aao}}-\left({\mathrm{HU}}_{\mathrm{lvr}1}+{\mathrm{HU}}_{\mathrm{lvr}2}\right)/2\right]/{\mathrm{SD}}_{\mathrm{fat}} $$
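
The upper abdominal metrics can be sketched in the same way. This is an illustrative Python outline under stated assumptions rather than the study's measurement software: the function name `abdominal_metrics` and the example readings are hypothetical, and the reference tissue attenuation that the CNR equation subtracts from the aortic value is passed in as a single, already averaged number (`hu_reference`).

```python
def abdominal_metrics(sd_fat, hu_lvr1, sd_lvr1, hu_lvr2, sd_lvr2,
                      hu_aao, hu_reference):
    """Return (noise, SNR, CNR) from upper abdominal ROI measurements in HU.

    sd_fat       : SD in the abdominal wall (or perirenal) fat ROI
    hu_lvr*, sd_lvr* : mean HU and SD in the right and left liver lobe ROIs
    hu_aao       : mean HU in the abdominal aorta
    hu_reference : averaged mean HU of the reference tissue in the CNR
                   equation (assumed to be pre-averaged by the caller)
    """
    noise = sd_fat
    # SNR is the mean of the two per-lobe ratios, not a ratio of sums.
    snr = (hu_lvr1 / sd_lvr1 + hu_lvr2 / sd_lvr2) / 2
    cnr = (hu_aao - hu_reference) / sd_fat
    return noise, snr, cnr


# Hypothetical ROI readings for illustration only (not data from the study).
noise, snr, cnr = abdominal_metrics(10.0, 100.0, 10.0, 110.0, 11.0, 160.0, 105.0)
print(noise, snr, cnr)  # 10.0 10.0 5.5
```

Unlike the thoracic SNR, the abdominal SNR divides each liver measurement by its own SD, so it reflects noise inside the liver parenchyma, whereas noise and CNR depend on the fat ROI.
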

Qualitative image quality evaluation

Three independent board-certified radiologists (two thoracic radiologists [J.H.H. and D.S.K.] and one general radiologist [J.S.O.], with 5–7 years of experience in the interpretation of chest CT imaging) evaluated the image quality of each reconstructed image. For each case, two reconstructed image sets were anonymized and distributed in random order for thoracic evaluation, and three image sets were provided for the evaluation of the upper abdomen. Only overlapping parts (from the basal lung level to the kidney level) were selected and provided for the evaluation of the upper abdomen. The readers were blinded to information on the reconstruction technique and the clinical indication. For each image set, subjective image noise, spatial resolution, presence of artifacts, and overall image quality were evaluated for both the thoracic and upper abdominal images, following previous studies [13, 14]. An example of a distortion artifact observed in DLIR-reconstructed images is provided in Fig. 2. The readers then selected the most preferred algorithm for image interpretation. Detailed information on the scale used for image evaluation is provided in Table 2.

Fig. 2

An example of a distortion artifact observed on a deep learning–based image reconstruction (DLIR-CHT) scan. A 71-year-old male patient receiving postoperative surveillance for advanced gastric cancer underwent contrast-enhanced chest and abdominal CT scans. DLIR-reconstructed chest CT images (a, b) showed diffuse checkered line-like artifacts. One out of three readers assessed that these artifacts diminished the diagnostic performance of the image (score 1). Those artifacts were not visualized on adaptive statistical iterative reconstruction (ASiR)-reconstructed abdominal CT images (c, d)

Table 2 Scoring scale for various parameters in the assessment of the image quality of each CT image

Statistical analysis

For comparison of the qualitative and quantitative image quality parameters between reconstruction algorithms, either the Wilcoxon signed-rank test or the paired t test was used for continuous variables after normality testing, as appropriate. The levels of interobserver agreement were evaluated using intraclass correlation coefficients (ICCs) with a two-way model for each qualitative analysis item: 0.76–1.0, excellent agreement; 0.40–0.75, fair to good agreement; and < 0.40, poor agreement [15]. Statistical analyses were performed using MedCalc software version 15.8. For the upper abdominal evaluation, a Bonferroni-corrected p value of < .017 (.05/3) was considered to indicate statistical significance, and a p value < .05 was used for the thoracic evaluation.
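
The two thresholds used in this analysis can be made explicit with a few lines of code. The sketch below is illustrative only and does not reproduce the MedCalc workflow; the helper names `bonferroni_threshold` and `icc_agreement` are hypothetical. The agreement categories follow the cut-offs cited above [15], and treating any value above 0.75 as excellent (closing the gap between 0.75 and 0.76) is an assumption of this sketch.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold after Bonferroni correction."""
    return alpha / n_comparisons


def icc_agreement(icc):
    """Map an intraclass correlation coefficient to an agreement category."""
    if icc < 0.40:
        return "poor"
    if icc <= 0.75:
        return "fair to good"
    return "excellent"  # assumption: values above 0.75 count as excellent


# Three pairwise comparisons among the three upper abdominal image sets.
print(round(bonferroni_threshold(0.05, 3), 3))  # 0.017
print(icc_agreement(0.11))                      # poor
```
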

Results

Radiation dose analysis

The mean volume computed tomographic dose index (CTDIvol) for the 100 CT scans was 3.19 ± 0.70 mGy (range, 1.57–4.57 mGy) for the contrast-enhanced chest CT scans, which was significantly lower than that of the contrast-enhanced abdominal CT scans (7.10 ± 1.88 mGy [range, 4.97–15.18 mGy]; p < .001). The average tube current-time product at the celiac level was also significantly lower for the contrast-enhanced chest CT scans (1.99 ± 0.75 mAs [range, 1–3 mAs]) than for the abdominal CT scans (4.50 ± 1.59 mAs [range, 3–11 mAs]; p < .001) (Table 1).
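
As a quick check on the size of the dose difference, the reported means can be expressed as fractions of the abdominal protocol. The following Python sketch uses only the mean values given above; the helper name `dose_fraction` is hypothetical, and the result is a ratio of group means, not a mean of per-patient ratios.

```python
def dose_fraction(reduced, reference):
    """Fraction of the reference dose metric delivered by the reduced protocol."""
    return reduced / reference


# Mean values reported above (chest protocol vs. abdominal protocol).
ctdi_fraction = dose_fraction(3.19, 7.10)  # CTDIvol, mGy
mas_fraction = dose_fraction(1.99, 4.50)   # tube current-time product, mAs

print(f"CTDIvol: {ctdi_fraction:.1%} of abdominal CT")  # CTDIvol: 44.9% of abdominal CT
print(f"mAs:     {mas_fraction:.1%} of abdominal CT")   # mAs:     44.2% of abdominal CT
```

Both ratios are close to 45%, that is, a little under half of the abdominal CT values.
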

Quantitative image quality assessment

In the thoracic evaluation, significantly less noise was measured in the images obtained using DLIR-CHT than in those obtained using ASiR-CHT (Table 3) (8.03 ± 2.52 vs. 14.6 ± 2.54; p < .001). DLIR-CHT demonstrated a significantly better SNR (4.67 ± 1.19 vs. 2.72 ± 0.68; p < .001) and CNR (20.0 ± 5.91 vs. 10.3 ± 3.01; p < .001). In the evaluation of the upper abdomen, less noise was measured for DLIR-CHT than for ASiR-ABD (8.01 ± 2.81 vs. 12.3 ± 2.28; p < .001). DLIR-CHT also exhibited a better SNR (10.7 ± 1.70 vs. 7.96 ± 1.20; p < .001) and CNR (8.42 ± 5.86 vs. 3.52 ± 1.74; p < .001) than ASiR-ABD.
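
To put these differences on a common scale, the reported means can be converted into relative changes. This is a descriptive sketch only: the helper name `relative_change` is hypothetical, the inputs are the group means quoted above, and a ratio of means carries no information about per-patient variability or statistical significance.

```python
def relative_change(value, reference):
    """Relative change of `value` with respect to `reference` (e.g. -0.2 = 20% lower)."""
    return (value - reference) / reference


# (DLIR-CHT mean, comparator mean) as reported in the text above.
reported = {
    "thoracic noise, DLIR-CHT vs. ASiR-CHT": (8.03, 14.6),
    "abdominal noise, DLIR-CHT vs. ASiR-ABD": (8.01, 12.3),
    "abdominal SNR, DLIR-CHT vs. ASiR-ABD": (10.7, 7.96),
}

for label, (dlir, comparator) in reported.items():
    print(f"{label}: {relative_change(dlir, comparator):+.0%}")
```
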

Table 3 Quantitative image quality evaluation results for the various reconstruction algorithms

Qualitative image quality evaluation

In the thoracic evaluation, there was significantly less subjective noise in the DLIR-CHT images than in the ASiR-CHT images with both lung and mediastinal window settings (noise score, 3.37 ± 0.54 vs. 2.89 ± 0.32 and 3.16 ± 0.48 vs. 2.73 ± 0.25 for lung and mediastinal settings, respectively; p < .001). DLIR-CHT also scored better in terms of spatial resolution in both window settings, although statistical significance was found only for the mediastinal window (Table 4). Distortion artifacts were noted in two ASiR cases by one reader, while the other two readers did not observe any distortion artifacts. Beam hardening artifacts were less frequent in DLIR-CHT images than in ASiR-CHT images (2.54 ± 0.44 vs. 2.41 ± 0.44; p = .01). Overall, readers gave better scores to DLIR-CHT than to ASiR-CHT (3.34 ± 0.46 vs. 2.94 ± 0.30; p < .001).

Table 4 Qualitative image quality assessment results

In the evaluation of the upper abdomen, the least subjective noise was observed in the ASiR-ABD images, followed by the DLIR-CHT images (2.94 ± 0.23 vs. 2.87 ± 0.26; p = .01). In terms of spatial resolution (liver sharpness evaluation), DLIR-CHT scored better than ASiR-ABD (2.60 ± 0.34 vs. 2.50 ± 0.30; p < .001). Distortion artifacts were more frequently observed on DLIR-CHT than on ASiR-ABD, but poor inter-reader agreement was found (ICC, 0.11). Distortion was noted on 17.0% (51/300) of the DLIR-CHT scans, but only one case was reported to affect the diagnostic value of the image (score 1, moderate to severe distortion, affecting diagnostic value; presented in Fig. 2), and the others were given a score of 2 (i.e., some distortions were recognizable but without diagnostic disturbance). There was no significant difference in beam hardening artifacts. ASiR-ABD showed the best overall image quality (3.04 ± 0.23), followed by DLIR-CHT (2.93 ± 0.20; p < .001). Detailed results are provided in Table 4, and representative cases are presented in Figs. 3, 4, and 5.

Fig. 3

A 74-year-old female patient underwent consecutive contrast-enhanced chest and abdominal CT for postoperative surveillance for colon cancer. Deep learning–based image reconstruction (DLIR)-reconstructed chest CT (a) showed a better noise level than adaptive statistical iterative reconstruction (ASiR)-reconstructed chest CT (b) and better spatial resolution than ASiR-reconstructed abdominal CT, with comparable noise (c). The surface of a hepatic cyst (arrow) was more clearly visualized on DLIR-reconstructed images, probably due to better spatial resolution

Fig. 4

An 86-year-old male patient underwent contrast-enhanced chest and abdominal CT to evaluate gastric cancer progression. DLIR-reconstructed chest CT (a) showed less noise than ASiR-reconstructed chest CT (b) and comparable noise to ASiR-reconstructed abdominal CT (c)

Fig. 5

Examples of metastatic lesions covered on both deep learning–based image reconstruction–reconstructed chest CT (DLIR-CHT) and adaptive statistical iterative reconstruction–reconstructed abdominal CT (ASiR-ABD) images. a, b A metastatic nodule in the liver segment 2 (arrow) was more clearly demarcated on (a) DLIR-CHT images (required CTDIvol for the scan: 3.24 mGy) than on (b) ASiR-ABD images (CTDIvol: 7.06 mGy). c, d Peripancreatic soft tissue infiltration (arrows) and anterior peritoneal fat haziness (arrowheads) were clearly demonstrated on both (c) DLIR-CHT (CTDIvol: 3.09 mGy) and (d) ASiR-ABD (CTDIvol: 6.92 mGy) images

Preferences of the readers

All three readers preferred DLIR-CHT to ASiR-CHT for the thoracic evaluation (pooled rate 90.7% vs. 9.3%). For the evaluation of the upper abdomen, two of the three readers (readers 1 and 3) preferred DLIR-CHT to the other two algorithms. Reader 2 preferred ASiR-ABD (Table 5). In the pooled results, readers preferred DLIR-CHT in 49.3% (148/300) of cases for the evaluation of the upper abdomen, while ASiR-ABD was preferred in 46.7% (140/300) of cases (Table 5).

Table 5 Reader preference analysis among reconstruction algorithms

Discussion

In our study, DLIR applied to contrast-enhanced chest CT showed better measured values of noise, SNR, and CNR compared to ASiR-CHT in the thoracic evaluation. For evaluating the upper abdomen, DLIR on chest CT also showed significantly better measured values of noise, SNR, and CNR than ASiR-ABD. In the subjective analysis, readers assessed DLIR-CHT as showing better spatial resolution in the upper abdominal evaluations than ASiR-ABD, while ASiR-ABD demonstrated better subjective noise level and overall image quality. In the pooled results, readers selected DLIR as the preferred reconstruction algorithm more frequently than the other algorithms.

Our study results imply that with a chest CT-equivalent radiation dose, application of DLIR could improve the noise level and image quality of abdominal images to the level of standard abdominal CT. Developments in iterative reconstruction algorithms have achieved radiation dose reductions of 25–50% in contrast-enhanced body imaging [3,4,5]; however, further progress in dose reduction technology is necessary. Iterative reconstruction relies on a human-dependent optimization process that typically limits the number of parameters to fewer than a hundred, and as the number of parameters has grown, it has become increasingly difficult to retain the necessary convergence properties of the algorithm [8]. The deep learning–based approach does not require simplification of parameters and may handle millions of parameters, and therefore has the potential to open the door to a new paradigm of image reconstruction [8]. In our study, DLIR successfully reconstructed images from FBP images with less than 50% of the radiation dose required for the vendor-specific iterative reconstruction algorithm (ASiR). Further prospective studies are warranted to optimize the protocol and to confirm its clinical applicability, but a strength of our study is that we evaluated the clinical feasibility of radiation dose–reduced abdominal imaging in a retrospective cohort without imposing an additional radiation dose on the patients.

Tube current modulation scans can be applied for whole-body CT to reduce radiation exposure. Clinical applications of single-pass CT have been regularly reported, mostly in trauma patients [16,17,18,19,20,21]. Without substantial change in mAs for each body part, 17–46% dose reductions have been reported by eliminating overlap zones [17, 18, 21]. If a reduced dose is applied to single-pass CT, further dose reduction could be achieved. Implementation of single-pass CT should be more cautious for oncology patients, as an accurate evaluation of soft tissue organs is necessary in these patients. Previous studies on single-pass contrast-enhanced CT scans focused more on appropriate contrast enhancement using a split-bolus protocol, while solely depending on automatic tube current modulation for radiation dose control [21, 22]. Total radiation dose reduction was observed by eliminating overlap zones, but thoracic radiation exposure might have been increased [21, 22]. In addition to the contrast injection technique, DLIR-based dose reduction should be studied and applied to single-pass whole-body CT scans.

As contrast-enhanced scans are commonly used for oncology patients, lesion detectability is critical. In this study, we did not evaluate diagnostic performance. This is one of the major limitations of our study, and further studies are warranted to confirm the non-inferiority of DLIR-reconstructed reduced-dose images in detecting abnormalities compared to standard-dose ASiR images. Instead, we analyzed some criteria related to diagnostic capability, including spatial resolution, level of distortion artifacts, and readers’ preferences for upper abdominal evaluation. Under automatic tube current modulation with a certain noise level, the gantry rotation time is limited due to tube current saturation [23]. As the X-ray tube capacity is limited in terms of tube current, the gantry rotation time was set to be slower for abdominal CT than for chest imaging (500 ms vs. 280 ms). This difference could be important especially for patients with malignancies, as they are more likely to have breathing difficulties, and the faster gantry rotation time resulted in better spatial resolution of DLIR-CHT. As deep learning technology is not yet fully validated, some unexpected image distortions may be observed. We evaluated the level of distortion artifacts, and the readers reported that more image distortions were visualized on DLIR-CHT than on ASiR-ABD. However, only one case was reported to affect diagnostic capability, and the inter-reader agreement was poor (ICC, 0.11). Overall, in the pooled results, readers preferred DLIR-CHT to ASiR-ABD for evaluating the upper abdomen in more cases (49.3% vs. 46.7%).

There was a discrepancy between qualitative and quantitative analyses of image quality, especially in comparing DLIR-CHT and ASiR-ABD. DLIR-CHT showed better measured noise, SNR, and CNR, while pooled readers assessed ASiR-ABD to have better subjective noise and overall image quality. We attribute this discrepancy mainly to two causes. First, DLIR was trained to achieve better noise and contrast resolution [8], while other factors, including spatial resolution or image texture, may affect image quality. As a result, evaluation of noise, SNR, and CNR is likely to favor DLIR. Second, as ASiR-ABD images were the standard images used in the institution where all three readers were employed, the readers were accustomed to them. Subjective evaluation of image quality could therefore have favored ASiR-reconstructed images.

Our study has several limitations. First, the study dataset was collected retrospectively and may have been affected by selection bias; however, we included consecutive cases. Second, the chest and abdominal CT scans that we used had different scan timing. The slightly later timing of the abdominal CT scans may have altered the CNR. Third, we used 2.5-mm slices, but some readers may prefer thinner section images. Fourth, we only compared a single DLIR level (high level) with a single ASiR degree (40% for abdominal CT). Optimization of the protocol may enable further dose reduction with equivalent image quality. Furthermore, we could only evaluate the upper abdominal area. The pelvic area, which contains abundant bone structures, may show different noise profiles. Finally, the reconstruction algorithms we used are vendor-specific and cannot be utilized for other scanners.

In conclusion, with approximately 50% of the radiation dose, DLIR-applied contrast-enhanced chest CT showed comparable image quality in the upper abdomen to that of dedicated abdominal CT and was preferred by most readers.