Introduction

Trials have demonstrated that screening for lung cancer with low-dose CT (LDCT) increases its detection at an earlier stage [1–6]. The National Lung Screening Trial (NLST) showed that annual screening with LDCT reduced death from lung cancer by 20% compared with screening by conventional radiography in current or former heavy smokers [7]. Thus, lung cancer screening with LDCT is a promising method for detecting lung cancer and improving its prognosis in current or former heavy smokers, although its effectiveness has not been confirmed in non-smokers. However, radiation exposure, false-positive results, procedural complications, and overdiagnosis are of concern [8–10].

Although the mean effective radiation dose at lung cancer screening with LDCT is 1.5–2.0 mSv [7, 11–13], i.e., one-fifth to one-fourth of that of conventional chest CT [14, 15], the radiation exposure risks associated with long-term and repeated lung cancer screening programs cannot be ignored [16, 17]. Radiation dose reduction is particularly important in CT lung cancer screening because most of the subjects are healthy.

Recently, some authors used ultra-low-dose CT (ULDCT) with iterative reconstruction (IR) to diagnose chest diseases; the effective radiation dose (ED) was 0.13–0.44 mSv [18–24]. The minimal requirements for ULDCT to be used for lung cancer screening are a diagnostic ability comparable to LDCT with respect to the detection of pulmonary nodules, the interpretability of the nodule density (solid, part-solid, or pure ground glass), and the accuracy of nodule size measurements [25]. The purpose of this study was to investigate the diagnostic capability of ULDCT with full iterative reconstruction (f-IR) for lung cancer screening.

Materials and methods

This prospective study was approved by our ethics committee. Written informed consent was obtained from all subjects who agreed to undergo ULDCT in addition to our routine LDCT studies.

Patient population

We enrolled 567 subjects who underwent screening for lung cancer in our institute between January 4th and May 12th, 2015 (Fig. 1). Our entry criteria for LDCT screening for lung cancer are based on the National Comprehensive Cancer Network (NCCN) guidelines ver. 1.2013; they include a subject age between 55 and 74 years and a smoking history of at least 30 pack-years and, if the subject had quit, smoking cessation less than 15 years prior to enrollment. Alternatively, the entry criteria were an age of at least 50 years, a smoking history of at least 20 pack-years, and one additional risk factor other than second-hand smoke. Two board-certified radiologists independently read the LDCT images. Computer-aided detection (CAD) systems for pulmonary nodules (CIRCUS) [26, 27] were also used. One of the 2 radiologists synthesized and recorded the presence, number, nodule density [solid, part-solid, pure ground glass (PGG)], and size of the pulmonary nodules and the CAD results for each subject.

Fig. 1

Enrollment and randomization of our subjects

Of 567 subjects undergoing LDCT screening, 550 agreed to undergo additional ULDCT scanning.

CT scanning and image reconstruction

All subjects were scanned with a 320-detector CT scanner (Aquilion ONE; Toshiba Medical Systems Corp., Otawara, Japan). The tube current–time product for ULDCT scans was 5 mAs (fixed mAs), the minimum on our CT scanner. The tube current for LDCT scans was regulated using automatic tube current modulation (ATCM); the preset noise value was 22 Hounsfield units. Other scanning parameters were the same for LDCT and ULDCT; they were detector configuration, 0.5 mm (detector collimation) × 80 (detector number); reconstruction slice thickness and interval, 2.0 mm; tube rotation time, 0.5 s; pitch factor, 0.813; and scan and field of view, 32–50 cm (depending on the subject’s habitus). All scans were performed at a tube voltage of 120 kVp. ULDCT was performed just after LDCT. We reconstructed LDCT images with a filtered back projection (FBP) algorithm (reconstruction kernel: FC51, Toshiba Medical Systems) and ULDCT images with an f-IR algorithm [Forward projected model-based Iterative Reconstruction SoluTion (FIRST), Toshiba Medical Systems] (see “Appendix”). The image reconstruction time for the f-IR algorithm was 3 min for 320 images.

Reference standards

The reference standard for pulmonary nodules was determined consensually by two board-certified radiologists (M.F. and K.A. with 7 and 27 years of experience with chest CT, respectively) who referred to the clinical reports and the CAD results. The two radiologists reviewed all CT reports, CAD results, and CT images of the 550 subjects who had undergone LDCT screening. In the review, they identified 85 subjects with 136 nodules without calcification. From among the 85 subjects they selected 69 harboring 3 or fewer nodules measuring 4–20 mm in diameter; the other 465 subjects did not have nodules (Fig. 1). Our size criteria are based on the NCCN guidelines for lung cancer screening ver. 1.2013 [28]. They chose 20 mm as the maximum size because our preliminary study had shown that nodules larger than 20 mm are very easily detected on both LDCT and ULDCT scans. Then the two radiologists independently classified the nodules on LDCT images into 3 patterns according to nodule density: solid, part-solid, and PGG nodules. When their classifications agreed, the classification was adopted as the reference standard for that nodule. Disagreements in the classification were resolved by consensus and the final interpretation was regarded as the reference standard. Both radiologists recorded the mean diameter of the nodules (the mean of the longest diameter and the perpendicular diameter of each nodule) according to the NCCN guidelines [21]. The recorded diameters were averaged and defined as the reference diameter of each nodule. The Cohen kappa coefficient for agreement of the two radiologists as to the nodule density on LDCT images was 0.96 [95% confidence interval (CI): 0.87–1.00].

Of the 69 subjects, 51 had 1 nodule, 10 had 2, and 8 had 3 nodules; 75 of the 95 nodules were classified as solid, 8 as part-solid, and 12 as PGG nodules on the LDCT images. The median nodule diameter was 7 mm (range 5–18 mm) and the mean nodule diameter was 8.0 mm [standard deviation (SD) 2.7] on the LDCT images. The 69 subjects consisted of 64 men and 5 women.

Image analysis

Image analysis on LDCT or ULDCT images consisted of three substudies: a qualitative analysis of the nodule density (solid, part-solid, or PGG), a quantitative analysis of the nodule size, and an observer performance study of nodule detectability.

Qualitative image analysis

For our qualitative and quantitative assessments we used the 95 nodules of the 69 subjects with 3 or fewer nodules measuring 4–20 mm in diameter. Two different board-certified radiologists with 14 and 9 years of experience interpreting chest CT scans, who were not involved in the evaluation of nodules on LDCT scans, independently classified the density of the 95 nodules on ULDCT scans as solid, part-solid, PGG, undetermined, or unidentified. In addition, they classified each nodule as solid or subsolid (part-solid and PGG).

Quantitative image analysis

These two radiologists also recorded the mean diameter of the nodules (mean of the longest diameter of the nodule and its perpendicular diameter) on ULDCT images based on the NCCN guidelines [28].

Observer performance study

We first conducted a pilot study with two observers who were not involved in the subsequent observer performance study. It took them approximately 8 h to read all images in two reading sessions, suggesting that reader fatigue is an important factor. Therefore, we randomly chose 50 subjects for the subsequent observer performance study. Of these, 35 subjects (50 nodules) came from the 69 subjects used for qualitative and quantitative image analysis; 23 had 1 nodule, 9 had 2, and 3 had 3 nodules; 44 nodules were solid, 4 were part-solid, and 2 were PGG nodules. The other 15 subjects were from the pool without nodules.

Seven diagnostic radiologists with 3–14 years (median 7 years) of experience reading chest CT scans and LDCT images acquired at lung cancer screening participated in the observer performance study.

We used software developed by one of the authors (T.H.). The readers interpreted the images, specified the location of the pulmonary nodules by marking them using a mouse, and rated their confidence in their identification of a nodule on the right side of the screen. The default settings of the window width and level were 1500 and −500 HU, respectively, and the readers were allowed to change the window level and window width freely on a gray-scale monitor (model PA301A; NEC, Tokyo, Japan) with a spatial resolution of 2560 × 1600.

All 7 radiologists read the images in two sessions at an interval of 2 or 3 weeks to avoid recall bias. In the first session they read either LDCT or ULDCT images; in the second they interpreted the images they had not evaluated in the first session. Whether a reader interpreted LDCT or ULDCT images in the first session was determined by using a random table.

In each session the radiologists read the 50 cases included in the observer performance study in random order. They were informed in advance that the number of pulmonary nodules was 3 or fewer in each case. No restriction was placed on the reading time; each reading session took 2–3 h.

Statistical analysis

We assessed the interobserver agreement for the interpretation of the nodule density between the reference standard and the two radiologists who read the ULDCT images by using the Cohen kappa coefficient of concordance. Kappa values of 0–0.20 indicate poor, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 good, and 0.81 or greater very good agreement [29]. We also calculated the kappa value when the nodules were classified as solid, sub-solid (part-solid plus PGG), undetermined, and unidentified.
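The agreement measure described above can be sketched in a few lines. The following is a minimal illustration of the unweighted Cohen kappa and of the agreement bands quoted in the text; it is not the MedCalc implementation used in the study, the function names are hypothetical, and the ratings in the usage note are invented for illustration only.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen kappa for two equal-length lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of nodules on which both raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


def agreement_label(kappa):
    """Map a kappa value to the agreement bands quoted in the text."""
    if kappa >= 0.81:
        return "very good"
    if kappa >= 0.61:
        return "good"
    if kappa >= 0.41:
        return "moderate"
    if kappa >= 0.21:
        return "fair"
    return "poor"


# Hypothetical ratings for 10 nodules (not study data).
reference = ["solid"] * 6 + ["part-solid"] * 2 + ["PGG"] * 2
observer = ["solid"] * 6 + ["part-solid", "PGG"] + ["PGG"] * 2
kappa = cohen_kappa(reference, observer)
label = agreement_label(kappa)
```

With these invented ratings, the two lists disagree on one nodule out of ten; the confidence intervals reported in the study require an additional standard-error calculation that this sketch omits.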

We used Bland–Altman plots and intra-class correlation coefficients (ICCs) to evaluate the agreement between the nodule size measured on ULDCT images by two radiologists and the reference standard. An ICC of 0–0.20 indicates slight-, 0.21–0.40 fair-, 0.41–0.60 moderate-, 0.61–0.80 substantial-, and 0.81 or greater almost perfect agreement [24].
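As a rough illustration of the Bland–Altman quantities, the sketch below computes the mean difference (bias) and the 95% limits of agreement as bias ± 1.96 SD of the paired differences. It assumes the difference is taken as observer measurement minus reference diameter, which the text does not state, and the diameters in the usage note are hypothetical rather than study data.

```python
from statistics import mean, stdev


def bland_altman(reference_mm, measured_mm):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Assumption: difference = measured - reference. The limits of agreement
    are bias +/- 1.96 times the sample SD of the differences.
    """
    if len(reference_mm) != len(measured_mm) or len(reference_mm) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [m - r for r, m in zip(reference_mm, measured_mm)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical mean nodule diameters in mm (not study data).
reference = [5.0, 7.0, 9.0, 11.0]
observer = [6.0, 7.0, 10.0, 12.0]
bias, lower, upper = bland_altman(reference, observer)
```

The intra-class correlation coefficient is a separate statistic with several variants; it is not computed here because the text does not specify which ICC model was used.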

To calculate the kappa coefficients and ICCs, and for Bland–Altman analysis we used statistical software (MedCalc, ver.15.11.3; MedCalc Software bvba, Ostend, Belgium; https://www.medcalc.org; 2015).

The objective of the observer performance study was to assess whether ULDCT scans were inferior to LDCT scans. To calculate the 95% CI of the difference in the area under the curve (AUC) between ULDCT and LDCT [(AUC for ULDCT) − (AUC for LDCT)] we used jackknife alternative free-response receiver operating characteristic (JAFROC) analysis software provided by Chakraborty and Yoon (JAFROC 4.2.1). It takes into account the nodule location, permits evaluation of multiple lesions in each case, and allows the multiple-reader multiple-case (MRMC) design for JAFROC analysis [30]. In our previous study [31], the standard deviation of the AUC values of 10 radiologists for detecting pulmonary nodules on chest CT was 0.08. Therefore, we pre-defined a margin of non-inferiority [32] for differences in the AUC as 0.08. Calculation of the number of observers was based on a margin of non-inferiority of 0.08, 50 cases, a desired power (1 − beta) of 0.80, and an alpha of 0.05. As a result, the minimum required number of observers was 5.
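The decision rule implied by this design can be written as a one-line check: the test modality is declared non-inferior when the lower bound of the 95% CI of the AUC difference (test minus reference) lies above the negative margin. The sketch below only illustrates that rule; it does not reproduce the JAFROC computation of the CI, the function name is hypothetical, and the CI bounds in the usage note are invented.

```python
def is_non_inferior(ci_lower, margin=0.08):
    """Non-inferiority check for a difference defined as test - reference.

    ci_lower: lower bound of the 95% CI of (AUC test - AUC reference).
    margin:   pre-defined non-inferiority margin, given as a positive number.
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    return ci_lower > -margin


# Hypothetical CI lower bounds (not study data).
passes = is_non_inferior(-0.05)   # lower bound stays above -0.08
fails = is_non_inferior(-0.10)    # lower bound falls below -0.08
```

Only the lower bound matters for this one-sided question; whether the interval also contains 0 is relevant to superiority, not to non-inferiority.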

Estimation of the radiation dose

We recorded the volume CT dose index (CTDIvol) and the dose-length product (DLP) obtained from the dose report on the CT scanner for the 550 subjects who underwent both LDCT and ULDCT scanning. The estimated effective dose (ED) was obtained by multiplying the DLP by a conversion factor of 0.014 mSv/(mGy·cm) using tissue-weighting factors [33]. To calculate the size-specific dose estimate (SSDE), we measured the antero-posterior (AP) and lateral (LAT) dimensions at the mid-thorax level on axial CT images. Then we obtained the conversion factor corresponding to the sum [AP + LAT] provided by AAPM Report 204 [34] for each subject. Multiplication of CTDIvol by the conversion factor yielded the SSDE for each subject.
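The two dose calculations are simple products and can be sketched as follows. The chest conversion factor of 0.014 is taken from the text; the size-dependent SSDE factor is passed in as an argument because it must be looked up in AAPM Report 204 for each subject's [AP + LAT], and no table values are reproduced here. The function names and the numbers in the usage note are hypothetical.

```python
CHEST_K_MSV_PER_MGY_CM = 0.014  # chest conversion factor quoted in the text


def effective_dose_msv(dlp_mgy_cm, k=CHEST_K_MSV_PER_MGY_CM):
    """Estimated effective dose (mSv) = DLP (mGy*cm) x conversion factor."""
    if dlp_mgy_cm < 0:
        raise ValueError("DLP must be non-negative")
    return dlp_mgy_cm * k


def ssde_mgy(ctdivol_mgy, size_factor):
    """Size-specific dose estimate (mGy) = CTDIvol x size conversion factor.

    size_factor is assumed to come from the AAPM Report 204 lookup for the
    subject's AP + LAT dimension; it is not derived in this sketch.
    """
    if ctdivol_mgy < 0 or size_factor <= 0:
        raise ValueError("invalid CTDIvol or size factor")
    return ctdivol_mgy * size_factor


# Hypothetical inputs (not study data).
ed = effective_dose_msv(100.0)   # DLP of 100 mGy*cm
ssde = ssde_mgy(2.0, 1.2)        # CTDIvol of 2.0 mGy, illustrative factor 1.2
```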

Results

Qualitative analysis

Both observers involved in the qualitative analysis of the ULDCT images identified all nodules that had been circled in red. No nodules were recorded as undetermined or unidentified. The interobserver agreement as to the nodule density between the reference standard and the two observers who evaluated ULDCT scans is summarized in Tables 1 and 2. The kappa values between the reference standard and observer 1 and observer 2 were 0.70 (95% CI, 0.53–0.87) and 0.83 (95% CI, 0.69–0.97), respectively. When the nodules were classified as solid or sub-solid (part-solid and PGG), the kappa values between the reference standard and observer 1 and observer 2 were 0.84 (95% CI, 0.70–0.97) and 0.93 (95% CI, 0.84–1.00), respectively.

Table 1 Interobserver agreement between observer 1 and the reference standard
Table 2 Interobserver agreement between observer 2 and the reference standard

Quantitative analysis

The Bland–Altman plots show that the difference between the reference standard and the measurements of observer 1 and observer 2 was independent of the magnitude of the measurement (Fig. 2a, b). The mean difference between the reference standard and the measurements recorded by observer 1 and observer 2 was 0.4 and 0.8 mm, respectively. The 95% limits of agreement between observer 1 and the reference standard were −1.8 to 2.5 mm; for observer 2 they were −1.6 to 3.1 mm. The intra-class correlation coefficient between the reference standard and the measurements recorded by observer 1 and observer 2 was 0.84 (95% CI 0.77–0.89) and 0.90 (95% CI 0.86–0.94), respectively (Fig. 3a, b).

Fig. 2

Bland–Altman plots showing that the difference in the nodule diameter between the reference standard and the measurement recorded by observer 1 (a) and observer 2 (b) is independent of the magnitude of the measurement. In each figure, the center line shows the mean difference between the reference standard and the measurement recorded by the observer. The upper and lower lines indicate the 95% limits of agreement

Fig. 3

Plots showing the correlation between the reference standard and the nodule diameter recorded by observer 1 (a) and 2 (b). The intra-class correlation coefficient (ICC) was 0.90 [95% confidence interval (CI) = 0.86–0.93] for observer 1 and 0.84 (95% CI = 0.77–0.89) for observer 2

Observer performance study

The mean of the AUC for detecting pulmonary nodules on LDCT and ULDCT images was 0.80 (SD 0.06) and 0.77 (SD 0.05), respectively (Table 3). The 95% CI for the AUC difference between LDCT and ULDCT was −0.03 to 0.07. The 95% CI crossed the 0 difference for the AUC; it did not cross the pre-defined non-inferiority margin of −0.08 (Fig. 4). Thus, we concluded that the nodule detectability on ULDCT scans was not statistically significantly inferior to their detectability on LDCT scans.

Table 3 AUC values for the performance detecting all nodules
Fig. 4

Non-inferiority of ULDCT vis-à-vis LDCT for the detection of pulmonary nodules. The 95% confidence interval (CI) for the area under the curve (AUC) difference between LDCT and ULDCT was −0.03 to 0.07. The 95% CI crosses the 0 difference in the AUC but not the pre-defined non-inferiority margin of −0.08. LDCT low-dose CT, ULDCT ultra-low-dose CT

Estimated radiation exposure

The radiation dose for LDCT and ULDCT is summarized in Table 4. The mean CTDIvol, DLP, ED, and SSDE for the ULDCT scanning were almost 1/10 of those for LDCT scanning. The mean and SD of [AP + LAT] were 56.8 and 3.9 cm, respectively.

Table 4 Comparison of radiation exposure for LDCT and ULDCT

Representative images of solid-, PGG-, and part-solid nodules are shown in Figs. 5, 6 and 7, respectively.

Fig. 5

63-year-old man with a 7-mm solid nodule in the right upper lobe detected by lung cancer screening. a Low-dose CT (LDCT) image reconstructed with filtered back projection (FBP). The arrow points to the nodule. b Ultra-low-dose CT (ULDCT) image reconstructed with FBP. While identification of the pulmonary nodule is easy, characterization of its nodule density and margin is difficult. c Ultra-low-dose CT (ULDCT) image reconstructed with full iterative reconstruction (IR). Although the spatial resolution is inferior to the LDCT image, identification and characterization of the nodule is easy

Fig. 6

71-year-old man with a 7-mm pure ground glass nodule in the left upper lobe detected by lung cancer screening (arrow). a Low-dose CT (LDCT) image reconstructed with filtered back projection (FBP). The arrow points to the nodule. b Ultra-low-dose CT (ULDCT) image reconstructed with FBP. It is difficult to identify or characterize the nodule due to artifacts. c Ultra-low-dose CT (ULDCT) image reconstructed with full iterative reconstruction (IR). The pure ground glass nodule is clearly demonstrated. Although many black-dot-like artifacts are scattered throughout the lung, they do not impair identification and characterization of the nodule

Fig. 7

75-year-old man with an 11-mm part-solid nodule in the right lower lobe. a Low-dose CT (LDCT) image reconstructed with filtered back projection. The arrow points to the nodule. b Ultra-low-dose CT (ULDCT) image reconstructed with full iterative reconstruction (IR). The solid part (arrowhead) and the ground-glass part of the nodule can be differentiated on the ULDCT image despite the black-dot-like artifacts that are scattered throughout the lung

Discussion

When LDCT is used for lung cancer screening, the presence of nodules has to be confirmed and their density and mean diameter have to be evaluated to determine appropriate management strategies [28]. We found that the nodule density identified on ULDCT and LDCT scans was in good agreement (kappa value 0.70 and 0.83 for observers 1 and 2, respectively); the mean nodule diameter recorded on ULDCT and LDCT images also agreed (ICC value 0.84 and 0.90, respectively). Our non-inferiority analysis demonstrated that nodule detectability on ULDCT scans was not inferior to the detectability on LDCT images. Based on these findings we concluded that lung cancer screening using ULDCT with f-IR is feasible and that the radiation exposure is 1/10 of the exposure of conventional LDCT. While there are some reports on chest ULDCT using hybrid IR or f-IR [18–23, 35–37], ours is the first report that documents the non-inferiority of ULDCT vis-à-vis LDCT by AFROC analysis.

There are two types of iterative reconstruction techniques (IRTs): hybrid IRTs and full IRTs. Although hybrid IR is essentially the same as the FBP method, which has been used since the early 1970s, hybrid IR incorporates some noise reduction techniques in the sinogram and image spaces. Commercially available hybrid IRTs include AIDR 3D (Toshiba Medical Systems) [21], ASiR (GE Healthcare) [18, 23, 37], SAFIRE (Siemens Healthcare) [19, 20], and iDose4 (Philips Healthcare) [38, 39]. The image noise is lower with hybrid IRTs than with conventional FBP; the reconstruction time is substantially the same. Full IRTs are “pure” IRTs in the real sense of the word; projection and back-projection are repeated and the image quality is optimized in both the sinogram and the image space (see “Appendix”). With full IRTs, streak artifacts are reduced, spatial resolution is improved, and noise is drastically lower. Available full IRTs include Veo (GE Healthcare) [18, 23, 37], IMR (Philips Healthcare) [39, 40], and FIRST (Toshiba Medical Systems). The prominent characteristic of the IRT applied in this study (FIRST) is the optimization of image reconstruction for each organ or body part. The reconstruction time with f-IR tends to be long; with FIRST it was 3 min for 320 images and thus within a clinically acceptable range.

On ULDCT images, black dot-like artifacts are seen throughout the lung (Figs. 5, 6, 7). We think that they arise from the extremely low number of photons on the ULDCT scan. Although they do not hamper the detection and characterization of nodules, they may impair the diagnosis of pulmonary emphysema. The incidence of emphysema in current and former smokers undergoing CT lung cancer screening is 28.5 and 20.6% [41], respectively. As the presence of emphysema on LDCT images may be an independent risk factor for lung cancer [41, 42], f-IR must be improved to eliminate the black dot-like artifacts before ULDCT is used for lung cancer screening.

When the two observers differentiated sub-solid from solid nodules on ULDCT images, the kappa values between the reference standard and observer 1 and observer 2 were 0.84 (95% CI 0.70–0.97) and 0.93 (95% CI 0.84–1.00), respectively. Ridge et al. [43] reported that the corresponding kappa values for 8 experienced thoracic radiologists were 0.62 (95% CI 0.58–0.66) for the first and 0.67 (95% CI 0.61–0.73) for the second interpretation session. As their study included CT images obtained with different radiation doses, it is difficult to compare their findings with ours. Nonetheless, we think that the diagnostic ability of ULDCT was good enough to differentiate between sub-solid and solid nodules.

Although the overall agreement as to nodule density between the reference standard and the two observers was good, the agreement as to the part-solid nodules was not necessarily good. A possible reason is that the low-contrast resolution of ULDCT with full IR is somewhat inferior to that of LDCT with FBP. Furthermore, the window width and level settings may affect the differentiation of part-solid nodules from pure ground glass nodules.

Although some studies suggested that there is no increased risk of cancer from CT [44, 45], others raised concerns that lung cancer screening with LDCT may elicit radiation-induced cancer [10, 16, 17, 46–48]. According to Shuryak et al. [49], the excess lifetime risks of radiation-induced lung cancer may peak at around age 50. While the dose at each screening session is low, the cumulative radiation dose due to repeated screening may become significant [34]. Participants in the NLST were subjected to a mean radiation exposure of 8 mSv over the course of 3 years of screening and diagnostic studies [10]. According to Brenner [47], annual CT screening starting at the age of 50 adds about 0.85% to the 16.9% lung cancer risk faced by a 50-year-old female smoker; for a male smoker of the same age, the 15.8% lung cancer risk is increased by 0.23% when the radiation dose to the lungs is 5.2 mGy. As the tube current–time product was 5 mAs and the CTDIvol was 0.30 mGy for our ULDCT scans, based on the findings of Brenner, the increase in the lung cancer risk associated with annual ULDCT screening may be less than 0.09 and 0.02%, respectively, for 50-year-old female and male smokers. Therefore, we think that the excess lung cancer risk of screening can be reduced with ULDCT.
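The order of magnitude of this estimate can be checked with a simple proportional scaling. The sketch below assumes that excess risk scales linearly with dose and that the CTDIvol of 0.30 mGy can stand in for the lung dose; both are simplifying assumptions made here for illustration and are not stated as the authors' calculation. The function name is hypothetical.

```python
def scaled_excess_risk_pct(reference_risk_pct, reference_dose_mgy, dose_mgy):
    """Scale an excess risk (in %) linearly from a reference dose to a new dose.

    Assumption: risk is proportional to dose (linear, no threshold).
    """
    if reference_dose_mgy <= 0 or dose_mgy < 0:
        raise ValueError("doses must be positive (reference) and non-negative")
    return reference_risk_pct * dose_mgy / reference_dose_mgy


# Figures quoted in the text: 0.85% (female) and 0.23% (male) at 5.2 mGy;
# the ULDCT CTDIvol of 0.30 mGy is used as a stand-in for lung dose.
female_pct = scaled_excess_risk_pct(0.85, 5.2, 0.30)
male_pct = scaled_excess_risk_pct(0.23, 5.2, 0.30)
```

Under these assumptions both scaled values fall below the upper bounds of 0.09 and 0.02% given in the text.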

Although the tube current–time product for ULDCT was fixed at 5 mAs, no nodules were classified as undetermined or unidentified in our qualitative analysis of the ULDCT images. The SD of our [AP + LAT] measurements was 3.9 cm and thus much smaller than that of North American patients (7.4 cm) [50]. The smaller variance in [AP + LAT] in our Japanese population may account for the acquisition of diagnostically sufficient images for the evaluation of pulmonary nodules at a fixed mAs. In populations with larger constitutions, where the variability in body habitus may be larger, ATCM may be necessary for ULDCT scanning. At this point we do not know whether ATCM can be applied when the tube current is very low.

Our study has some limitations. First, we only used CT image data of 69 subjects who harbored 3 or fewer nodules with a diameter of 4–20 mm, although individuals seen in clinical practice may have more and/or larger nodules. Second, to expedite the observer performance study, we informed its participants that the selected subjects (35 with and 15 without nodules) had 3 or fewer nodules. Although this may have introduced bias, reader fatigue renders prolonged observer performance tests difficult. Third, we did not investigate the relationship between the subjects’ constitution and the image quality. In larger subjects the image quality of ULDCT scans may not be diagnostically sufficient. Fourth, the habitus of our Japanese subjects was probably smaller than that of Europeans and Americans. In heavier subjects, the radiation dose delivered at ULDCT may be insufficient for the acquisition of diagnostically adequate images.

In conclusion, for the detectability and characterization of pulmonary nodules, the diagnostic ability of ULDCT using f-IR is comparable to LDCT studies in subjects with a relatively small habitus.