Introduction

Lung cancer accounts for 18 % of cancer deaths worldwide. It is the leading cause of cancer deaths in men, and the second leading cause of cancer deaths in women after breast cancer [1]. In 1996, Kaneko et al. [2] showed that conventional chest radiography was inferior to computed tomography (CT) for lung cancer screening. Annual CT screening of patients at risk allows for the detection of early-stage lung cancer, which is curable [3]. The National Lung Screening Trial (NLST), which included 53,454 patients, is the first, and so far only, large multicentre study to show a 20 % reduction in the mortality rate of a high-risk population following annual lung cancer screening [4]. Other large multicentre studies such as the European randomised lung cancer CT screening trial (EUCT) or the Dutch-Belgian randomised lung cancer multi-slice CT screening trial (NELSON) are currently on-going [5, 6]. Many associations now recommend lung cancer screening [7–10].

Nonetheless, since there are some important arguments against lung cancer screening, it is not yet widely applied. The major concerns with lung cancer screening by CT are the costs generated by the repetitive CT acquisitions [11], the high detection rate of false-positive benign nodules [12] and the radiation dose [13]. Smokers and former smokers who receive an annual low-dose CT from the age of 50–75 years, with a presumed effective dose of 5.2 mGy per CT acquisition, have a calculated additional risk of 1.8 % (95 % CI 0.5–5.5) for lung cancer, due to repetitive CT acquisitions [13].

The newest generation of CT systems, in combination with iterative reconstruction techniques [14], allows for ultralow-dose CT acquisitions with effective doses similar to the dose of the topogram, and thus not more than the dose of conventional thoracic radiography in two projections, which is 0.05–0.24 mSv [15]. These ultralow-dose CT acquisitions achieve a high sensitivity and diagnostic confidence for the detection of pulmonary nodules [16].

Various secondary read-out tools exist that can increase the detection rate for small lung nodules, such as the use of maximum intensity projections (MIPs) [17] or computer-aided detection (CAD) software [18, 19].

The aim of this lung phantom study was to investigate the sensitivity to lung nodules of an ultralow-dose CT with the same effective dose as the dose from conventional radiography, compared to a standard CT alone, as well as in combination with MIP reconstructions and CAD software.

Materials and methods

Lung phantom preparation and computed tomography (CT) acquisition

In this lung phantom study, an anthropomorphic chest phantom (Chest Phantom N1 by Kyoto Kagaku©, 43 × 40 × 48 cm) was used. Because no patients were involved, approval from the local ethics committee was not necessary. The phantom was equipped with solid (100 HU) and ground-glass (−630 HU) spherical nodules with diameters of 5, 8, 10 and 12 mm that were randomly distributed by number, location, type and size to the different lung segments, with a minimum of zero nodules and a maximum of eight nodules per phantom. In a total of 60 different phantom arrangements, 232 nodules were placed, randomly assigned to a lung side, lung segment and peripheral or central location (the inner half of the lung toward the hilum versus the outer half of the lung toward the ribs, radially): 115 were solid and 117 had a ground-glass density (Fig. 1). Five phantoms did not contain any nodules. Each phantom arrangement was scanned twice: once with a standard radiation dose, and once with an ultralow radiation dose.

Fig. 1 Image comparison (1-mm axial stack): the standard acquisitions (tube voltage: 100 kVp, tube current-time product: 100 mAs, effective dose: 1.81 mSv) with a solid (a) and a ground-glass nodule (b) compared to the ultralow-dose acquisitions (tube voltage: 80 kVp, tube current-time product: 6 mAs, effective dose: 0.135 mSv) of the same solid (c) and ground-glass (d) nodule

All acquisitions were performed on a Somatom Definition Flash CT (Siemens, Forchheim, Germany) featuring iterative reconstruction algorithms (IRIS, Siemens, Germany) and a detector system with integrated readout electronics (Stellar detector, Siemens, Forchheim, Germany) with a gantry rotation time of 0.28 s. The pitch was 2.2 and the collimation was 0.6 mm. The standard acquisition was performed with 100 kVp in the care-mAs mode with a reference tube current-time product of 100 mAs. The care-kV mode was disabled. For the ultralow-dose acquisition with a tube potential of 80 kVp and a tube current-time product of 6 mAs, care-kV and reference mAs were disabled. These acquisition settings resulted in a volume CT dose index (CTDIvol) of 3.15 mGy for the standard acquisitions and a CTDIvol of 0.22 mGy for the ultralow-dose acquisitions. The corresponding dose-length product (DLP) for a scan length of 40 cm was 126 mGy*cm and 9 mGy*cm, respectively (Table 1).

Table 1 Acquisition protocol and reconstruction settings for the standard and the ultralow-dose acquisitions

The images for both acquisitions were reconstructed in axial stacks with a slice thickness of 1 mm and an increment of 1 mm in a lung parenchyma window (level: −600; window width: 1,200), using the vendor-specific iterative reconstruction algorithm (iterative reconstruction in imaging space [IRIS]; Siemens, Erlangen, Germany) with the level 3 reconstruction process, integrated in syngo.via (Siemens Healthcare). In previous pilot studies investigating the optimum convolution kernel in ultralow-dose acquisitions [20], radiologists achieved a significantly higher detection rate on the 1-mm axial image stack reconstructed with an I30 convolution kernel than with the I70 convolution kernel normally used in standard CT, while the CAD software performed best with an I50 convolution kernel.
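The lung parenchyma window quoted above (level −600, width 1,200) corresponds to a displayed range of −1,200 to 0 HU. The following minimal Python sketch illustrates this mapping under the assumption of a simple linear window with clamping; the function name `window_to_gray` and the 8-bit output range are illustrative choices and do not describe the viewer actually used for the read-out.

```python
def window_to_gray(hu, level=-600.0, width=1200.0):
    """Map a CT number (HU) to an 8-bit display value for a given window.

    Assumption: plain linear windowing with clamping at both ends.
    """
    lower = level - width / 2.0  # -1200 HU for the lung window in the text
    upper = level + width / 2.0  # 0 HU for the lung window in the text
    # Values outside the window are clamped to its limits.
    clamped = min(max(hu, lower), upper)
    # Scale the clamped value linearly to 0..255.
    return round((clamped - lower) / (upper - lower) * 255)


# Nodule densities used in the phantom: ground-glass (-630 HU) and solid (100 HU).
ground_glass_gray = window_to_gray(-630)
solid_gray = window_to_gray(100)
```

With this window, the solid nodules (100 HU) lie above the upper limit and are displayed at maximum brightness, whereas the ground-glass nodules (−630 HU) fall close to the middle of the grey scale.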

To compare the highest possible detection rates, we consequently used the standard I70 convolution kernel for the standard acquisition, the I30 convolution kernel for the ultralow-dose acquisition and the I50 convolution kernel for the Lung CAD VD10 Mode 2 software. The standard acquisition axial 1 mm stack and the ultralow-dose acquisition axial 1 mm stack were used with the CAD software as well as for the reconstruction with maximum intensity projections (MIP, slice thickness 8 mm, increment 2 mm).
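The MIP settings above (slice thickness 8 mm, increment 2 mm) can be read as a sliding slab over the 1-mm axial stack. The sketch below is a simplified illustration of that idea in plain Python, assuming a slice spacing of exactly 1 mm so that slab thickness and increment can be counted in slices; `sliding_mip` is a hypothetical helper and not the reconstruction used by the vendor software.

```python
def sliding_mip(volume, slab=8, step=2):
    """Return slab-wise maximum intensity projections along the slice axis.

    volume: list of slices; each slice is a list of rows of HU values.
    slab, step: slab thickness and increment in slices
                (assumed equal to millimetres for a 1-mm stack).
    """
    n_rows = len(volume[0])
    n_cols = len(volume[0][0])
    mips = []
    # Move the slab through the stack in steps of `step` slices.
    for start in range(0, len(volume) - slab + 1, step):
        block = volume[start:start + slab]
        # For every pixel position keep the maximum value within the slab.
        mip = [
            [max(sl[r][c] for sl in block) for c in range(n_cols)]
            for r in range(n_rows)
        ]
        mips.append(mip)
    return mips


# Toy stack of 12 slices with a single row of two pixels each.
toy_volume = [[[i, -i]] for i in range(12)]
toy_mips = sliding_mip(toy_volume)  # slabs start at slices 0, 2 and 4
```

Because each slab keeps the brightest voxel along the projection direction, a small nodule appears in several consecutive MIP images.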

Dose calculation

The effective dose was calculated by multiplying the DLP from the CT protocol with the conversion factors of ICRP 103 [21], which are 0.0147 [mSv/(mGy*cm)] for an adult thoracic CT acquisition with 80 kVp and 0.0144 [mSv/(mGy*cm)] for 100 kVp.
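As a worked illustration of this calculation, the following Python sketch multiplies the DLP by the conversion factor for the respective tube voltage. The relation DLP = CTDIvol × scan length is the standard definition; the function and variable names are illustrative only.

```python
# Conversion factors quoted in the text for adult thoracic CT, in mSv/(mGy*cm).
K_CHEST = {80: 0.0147, 100: 0.0144}


def dlp_mgy_cm(ctdi_vol_mgy, scan_length_cm):
    """Dose-length product from CTDIvol (mGy) and scan length (cm)."""
    return ctdi_vol_mgy * scan_length_cm


def effective_dose_msv(dlp, kvp):
    """Effective dose (mSv) as DLP (mGy*cm) times the kVp-specific factor."""
    return dlp * K_CHEST[kvp]


# Rounded DLP values as reported for the two protocols.
standard_msv = effective_dose_msv(126, 100)  # 126 * 0.0144 = 1.8144
ultralow_msv = effective_dose_msv(9, 80)     # 9 * 0.0147 = 0.1323
```

With the rounded DLP values this gives about 1.81 mSv and 0.13 mSv. The latter is slightly below the reported 0.135 mSv, which would be consistent with the DLP of 9 mGy*cm being a rounded figure.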

Read-out

The read-out was conducted on a Picture Archiving and Communication System (PACS R11.4.1, 2009; Philips, Best, Netherlands; Sectra, Linköping, Sweden). A total of six radiologists with 3–7 years of experience in thoracic radiology examined the ultralow-dose and the standard acquisitions in three steps with an interval of at least 2 weeks between read-out sessions. First, each reader examined the axial 1-mm image stack for solid and ground-glass nodules without access to additional MIPs or CAD. The location of the nodules (right/left lung and table position), nodule type (solid/ground-glass) and nodule size (in millimetres) were noted. For the second and third read-outs, two groups of readers were formed (three readers each, with comparable experience in thoracic imaging between the groups). The first group examined the MIP reconstructions during the second read-out and then checked with the CAD during the third read-out, while the second group used CAD during the second read-out and then referred to the MIPs during the third read-out.

Statistical analysis

The comparisons of nodule detection with standard CT and ultralow-dose CT with and without MIP reconstructions and CAD software were performed with the McNemar test [22]. The differences in the radiation doses between standard and ultralow-dose CT were compared with Wilcoxon’s test for paired samples that are not normally distributed [23]. The chi-square test was applied to the comparison of the central and peripheral locations of the nodules. Sensitivity, specificity, positive predictive value and negative predictive value for lung nodule detection were calculated on a per-lung segment basis for the standard-dose and the ultralow-dose acquisitions with and without the additional use of MIP/CAD.
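To illustrate the paired comparison, the sketch below implements an exact two-sided McNemar test on the discordant pairs, that is, nodules detected with one acquisition but missed with the other. This is a minimal illustration under the assumption of the exact binomial variant; the variant computed by the statistics software may differ, and the detection lists are hypothetical data, not study results.

```python
from math import comb


def discordant_counts(first, second):
    """Count pairs detected only in `first` (b) or only in `second` (c)."""
    b = sum(1 for x, y in zip(first, second) if x and not y)
    c = sum(1 for x, y in zip(first, second) if y and not x)
    return b, c


def mcnemar_exact(b, c):
    """Exact two-sided McNemar p-value from the discordant counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    # Binomial tail P(X <= k) with X ~ Bin(n, 0.5), doubled for two sides.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical per-nodule detections (1 = detected) for one reader.
standard = [1, 1, 1, 1, 0, 1, 1, 1]
ultralow = [1, 0, 1, 0, 0, 1, 1, 1]
b, c = discordant_counts(standard, ultralow)
p_value = mcnemar_exact(b, c)
```

Only the discordant pairs enter the test, so nodules detected with both acquisitions, or missed with both, do not influence the p-value.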

To calculate the inter-reader variability, a non-weighted binary κ-statistic for multiple readers was used by computing the arithmetic mean of the κ-values of each pair of readers (κ-value 0–0.2 poor; 0.21–0.4 fair; 0.41–0.6 moderate; 0.61–0.8 substantial; 0.81–1 almost perfect), using the MedCalc® software [24, 25]. The κ-statistic was calculated with and without the additional use of MIP and CAD, once for the standard CTs and once for the ultralow-dose CTs.
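The averaging of pairwise κ-values described above can be sketched as follows. This is an illustrative Python implementation of the unweighted Cohen's κ for binary ratings, averaged over all reader pairs; the ratings are hypothetical, and returning 1.0 when both readers give the same constant rating is a convention assumed here, not a detail taken from the statistics software.

```python
from itertools import combinations


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two equally long lists of 0/1 ratings."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    # Agreement expected by chance from the marginal rating frequencies.
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1.0:
        return 1.0  # assumed convention: identical constant ratings
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(ratings):
    """Arithmetic mean of Cohen's kappa over all pairs of readers."""
    pairs = list(combinations(ratings, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical detections (1 = nodule reported) of three readers.
readers = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
mean_kappa = mean_pairwise_kappa(readers)
```

For six readers, as in the present study, this approach averages the κ-values of 15 reader pairs.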

The McNemar test, Wilcoxon’s test, the chi-square test and the κ-statistic analysis were performed using MedCalc® software, version 7.6.0.0 (MedCalc Software, Mariakerke, Belgium) [26].

Results

Dose calculation

The estimated effective dose was 1.81 mSv for the standard acquisition (DLP: 126 [mGy*cm] * 0.0144 [mSv/(mGy*cm)]) and 0.135 mSv for the ultralow-dose acquisition (DLP: 9 [mGy*cm] * 0.0147 [mSv/(mGy*cm)]). The mean effective dose of the ultralow-dose acquisition consisted of 0.074 mSv from the acquisition of the topogram alone and only 0.059 mSv from the actual Flash spiral.

Per-nodule and per-segment analysis

The overall detection rate was 95.5 % (±6.6 % standard deviation) with standard CT and 93.3 % (±4.3 %) with ultralow-dose CT (Table 2). If the read-outs from standard CT and ultralow-dose CT were enhanced by additional MIP reconstructions, the detection rates were 96.7 % (±3.8 %) and 95.4 % (±3.4 %), respectively, with no significant difference between the two acquisitions. The group that used CAD software detected 99.7 % (±0.4 %) and 97.7 % (±1.4 %) of the nodules, respectively, a minimal but significant difference (Table 2). The six readers together detected a total of 11 false-positive nodules on the ultralow-dose acquisitions and 12 false-positive nodules on the standard acquisitions, compared to a total of 1,357 and 1,377 true-positive nodules for the ultralow-dose and standard acquisitions, respectively, with the additional use of MIP and CAD. The mean sensitivity, specificity, and positive and negative predictive values of all readers are summarized in Table 3.
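For reference, the four per-segment measures summarized in Table 3 follow from the counts of true-positive, false-positive, true-negative and false-negative lung segments. The short Python sketch below shows the standard definitions; the counts in the example are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from per-segment counts."""
    return {
        "sensitivity": tp / (tp + fn),  # detected among segments with a nodule
        "specificity": tn / (tn + fp),  # cleared among segments without a nodule
        "ppv": tp / (tp + fp),          # correct among positive calls
        "npv": tn / (tn + fn),          # correct among negative calls
    }


# Hypothetical counts for one reader (not study data).
example = diagnostic_metrics(tp=90, fp=5, tn=95, fn=10)
```

Sensitivity and specificity depend only on the reader's performance, whereas the predictive values also depend on how many segments actually contain a nodule.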

Table 2 The sensitivity of standard and ultralow-dose acquisitions for all lung nodules with and without the assistance of maximum intensity projection (MIP) and computer-assisted detection (CAD)
Table 3 The per-lung segment analysis of the mean performance of all readers with the sensitivity, the specificity, the positive predictive value and the negative predictive value of standard and ultralow-dose acquisitions including all lung nodules with and without the assistance of maximum intensity projection (MIP) and computer-assisted detection (CAD)

Inter-reader variability

The κ-statistic showed an almost perfect mean unweighted inter-reader agreement of κ = 0.91 for the ultralow-dose CT and a mean κ = 0.96 for the ultralow-dose CT with the additional use of MIP and CAD, compared to κ = 0.92 and κ = 0.98 with standard CT.

Influence of nodule size, density and location

While solid nodules had a significantly better detection rate with standard CT (94.2 % ± 7.1 %) compared to ultralow-dose CT (91.0 % ± 6.0 %, p-value = 0.006) (Table 4), there was no significant difference between the standard CT (96.9 % ± 6.3 %) and the ultralow-dose CT (95.6 % ± 2.9 %, p-value = 0.188) for the detection of ground-glass nodules (Table 5).

Table 4 The mean sensitivity of standard and ultralow-dose acquisitions for solid lung nodules with and without the assistance of maximum intensity projection (MIP) and computer-assisted detection (CAD)
Table 5 The mean sensitivity of standard and ultralow-dose acquisitions for ground glass nodules with and without the assistance of maximum intensity projection (MIP) and computer-assisted detection (CAD)

An analysis of the detection rate per nodule diameter showed that only the smallest nodules with diameters of 5 mm were detected at a significantly lower rate with ultralow-dose CT (standard dose 89.9 % ± 18 % vs. ultralow-dose 83.9 % ± 8.8 %; p-value = 0.0075). The lower detection rate of these small nodules using ultralow-dose CT was no longer significant when MIP reconstructions were used (Fig. 2), resulting in detection rates of 89.1 % ± 6.3 % and 93.7 % ± 4.5 % for ultralow-dose CT when MIP reconstructions and CAD were used, respectively (Table 6). A separate analysis of the ground-glass and the solid nodules per size showed a slight drop in sensitivity for the 8- and 5-mm solid nodules for both standard and ultralow-dose acquisition, while for the ground-glass nodules, a drop in sensitivity was observed only for the smallest 5-mm nodules. The 5-mm ground-glass nodules were detected with a sensitivity similar to that of the 8-mm solid nodules (Table 7).

Fig. 2 Per-nodule diameter analysis: mean sensitivity of standard and ultralow-dose CT with and without assistance of maximum intensity projection (MIP) and computer-aided detection (CAD)

Table 6 The per-nodule diameter analysis: mean sensitivity of standard and ultralow-dose acquisitions with and without the assistance of maximum intensity projection (MIP) and computer-assisted detection (CAD)
Table 7 The per-nodule diameter analysis: mean sensitivity of ground-glass and solid nodules without additional maximum intensity projection/computer-assisted detection (MIP/CAD)

There was no significant difference in the detection of nodules located in peripheral areas versus central areas (p-value = 0.28).

Combination of computer-aided detection and maximum intensity projections

The combination of MIP and CAD resulted in nodule detection sensitivity of 98.9 % (±2.0 %) for standard CT and 97.5 % (±2.5 %) for ultralow-dose CT (p = 0.033).

Discussion

The aim of this lung phantom study was to evaluate the lung nodule detection rate in ultralow-dose CT acquisitions using iterative reconstructions, and to investigate the additional use of MIP reconstructions and CAD software.

The overall detection rate of 93.3 % for all pulmonary nodules with ultralow-dose acquisition was very similar to the standard acquisition with 95.5 %. This small difference could be offset with the use of MIP reconstructions, resulting in a detection rate of 95.4 % with ultralow-dose CT, or with the use of CAD software resulting in an even higher detection rate of 97.4 % for ultralow-dose CT.

The detection rates in this study were similar to those of a prior lung phantom study that showed a detection rate of 91 % for standard CT acquisition and 97 % for standard acquisition with CAD [18], and also to those of studies investigating CT acquisitions of real patients, such as the study of Veronesi et al. [27] that showed a sensitivity of 90 % with a standard lung CT protocol. A recent study by Doo et al. [28], using exactly the same anthropomorphic chest phantom as the present study, investigated the detectability of 5-mm and 8-mm ground-glass nodules (−630 and −800 HU) with a low-dose acquisition and iterative reconstruction (effective dose of 0.47 mSv). The resulting sensitivity of 89 % for 8-mm ground-glass nodules was slightly lower than in the present study (96.2 %). However, for the 5-mm ground-glass nodules, there is a striking difference between the results of Doo et al., with a sensitivity of 49 %, and the present study, with a sensitivity of 86.7 % (Table 7). One possible explanation for this difference is that the present study used exclusively ground-glass nodules with a density of −630 HU, while Doo et al. also used lower-density −800 HU nodules. These smallest 5-mm, very low-density −800 HU ground-glass nodules seem to be very hard to detect, which may account for the marked drop in sensitivity for the 5-mm ground-glass nodules in the study of Doo et al. However, these very low-density ground-glass nodules were not investigated in the present study.

Detection rates of lung nodules were very high for standard-dose as well as for ultralow-dose acquisition, especially when using additional MIP reconstructions or lung CAD software. While there was no significant difference in detection rate between ultralow-dose CT with MIPs and standard CT with MIPs, ultralow-dose CT with CAD was marginally, but significantly, inferior to standard CT with CAD. In a comparison of MIPs and CAD, MIPs showed a slightly higher added value than CAD in the standard dose setting, while CAD was slightly better in the ultralow-dose setting.

A combination of both MIP reconstructions and CAD software resulted in maximum detection rates of 97.5 % for the ultralow-dose acquisitions and of 98.9 % for the standard acquisition; thus, there was no real benefit in combining MIP reconstructions and CAD software.

In an analysis of the detection rates by nodule size, the only significant difference between the two acquisitions was found for the smallest nodules (5 mm). These nodules were detected with a sensitivity of 83.9 % with ultralow-dose CT, compared to a sensitivity of 89.9 % with standard CT. Even in an isolated analysis of the 5-mm nodules, the inferior sensitivity of ultralow-dose CT could be offset by using MIP reconstructions (89.1 %) or CAD software (93.7 %). Ground-glass nodules in general had a slightly better detection rate than solid nodules, with no significant difference between the standard dose and the ultralow-dose with or without MIP reconstructions/CAD. One possible explanation for this finding is the phantom anatomy itself, in which the lung parenchyma is slightly darker than the usual ground-glass appearance in real patients, depending on how well the patients are able to inspire during CT acquisition. On the other hand, the solid nodules have a very similar density to the underlying artificial broncho-vascular bundles, and are thus probably more difficult to detect. However, studies using CT acquisitions of real patients have also shown a higher detection rate for ground-glass nodules compared to solid nodules, for both human readers and CAD software [29, 30].

This is an interesting point since small solid pulmonary nodules are less likely to be malignant than large solid nodules and ground-glass nodules [31], and, thus, small solid nodules show a higher false-positive rate [32]. Additionally, the incidence of lung adenocarcinoma, which often presents as peripheral ground-glass nodules in the early stages, has increased [33].

The slightly inferior detection rate of the small solid 5-mm nodules in ultralow-dose acquisitions could thus be acceptable, since two of the major problems of lung cancer screening are the relatively low incidence of lung cancer compared with the frequency of detected lung nodules in a screening population, and the false-positive nodules.

The Fleischner Society gave recommendations for the management of detected lung nodules depending on nodule size [34], but so far they have not proposed how to follow up lung nodules detected in lung cancer screening. There is such a suggestion based on preliminary data of the NELSON CT-screening trial, recommending no follow-up for small nodules <5 mm, follow-up with calculation of volume doubling time for intermediate nodules of 5–10 mm, and immediate diagnostic evaluation of large nodules of 10 mm or more [35].

If the ultralow-dose CT is not performed for lung cancer screening but for surveillance of pulmonary metastasis in patients with a known malignancy, detection of very small solid nodules is crucial; thus, ultralow-dose acquisition should be performed at least with additional use of MIP reconstructions or, if available, lung CAD software.

The NLST showed for the first time a mortality reduction of 20 % with lung cancer screening in a high-risk screening population (aged 55–74 years with ≥30 pack-years of smoking) [4]. Although many associations now recommend lung cancer screening, there are still on-going discussions as well as on-going, large multicentre trials such as the NELSON trial [6]. Nevertheless, Bach et al. [36] calculated that for 2,500 screened patients in the NLST with repetitive CT acquisitions, one radiation-related cancer death would result. Radiation dose is thus an important issue in a lung cancer screening population because of the cumulative application of radiation dose to the patient’s lung with repetitive screening CTs.

With one ultralow-dose CT acquisition of the lung phantom, the effective dose was 0.135 mSv (including the planning topogram), compared to 1.81 mSv with the standard acquisition. One standard acquisition thus exposed the lung phantom to a higher effective radiation dose than 13 ultralow-dose acquisitions (including the planning topograms). For the ultralow-dose acquisitions, the topogram (0.074 mSv) even showed a higher effective dose than the actual CT acquisition.

There is an on-going debate on cost-effectiveness, with an estimated cost of over US$100,000 per quality-adjusted life-year gained, which could be lowered to US$75,000 if screening is linked to a smoking cessation programme [37]. Lung cancer screening results in incidental findings, some of which led to further investigations and resulted in moderate additional costs of €8.95 [US$12.67] per patient at baseline and €2.25 [US$3.19] at 5-year follow-up [38]. Another point of discussion is the influence of indeterminate baseline screening results on the quality of life of the patients during lung cancer screening [39]. False-positive nodules led to biopsy in 1.2 % of the patients who were not found to have lung cancer in both the NLST and the NELSON trial [36]. There are new biomarkers to stratify high-risk populations for lung cancer screening, which could improve all of the points discussed above [40], but this is a subject of on-going investigations.

A major limitation of this study was the use of a lung phantom, which of course can never fully substitute for the 70-kg Asian man it represents. Moreover, real patients will have individual body constitutions that result in variations in the effective dose. Thus, CT acquisitions of the phantom are comparable to authentic CT acquisitions, but they will never replace CT acquisitions of real patients. Since patients in Europe and America normally weigh 80 kg and more, with many patients weighing over 100 kg, these results may not be applicable to these populations. Thus, the results of this study have to be confirmed in a real setting.

Another limitation is the different lung convolution kernels used in this study: I70 for standard acquisition, I50 for the lung CAD software and I30 for the ultralow-dose acquisition. This choice was based on previous pilot studies that showed optimal detection rates, with significantly better performance of the lung CAD software with the I50 convolution kernel and of the readers with a softer I30 convolution kernel for the ultralow-dose acquisition, compared to the standard I70 convolution kernel. The softer I30 convolution kernel for the ultralow-dose acquisition helps to compensate for the decreased signal-to-noise ratio of the ultralow-dose acquisitions. Although the use of different convolution kernels might decrease comparability from a strictly systematic point of view, it allows comparison of the best possible performance of standard CT, ultralow-dose CT and the CAD software for lung nodule detection, as any radiologist would also try to achieve in a clinical setting with real patients.

Although the two read-out groups consisted of radiologists with comparable experience in thoracic radiology and used the same read-out settings, there are some differences between the two groups. The amount of time that the radiologists spent looking at the images was not recorded, and there might have been some differences depending on this factor. However, as long as human radiologists are not completely replaced by computer detection software, differences and some small fluctuation will occur. Nevertheless, the inter-reader variability in this study was very low (almost perfect agreement), and, as shown, the additional use of MIPs and CAD helped to lower these effects. This is another argument for using additional MIP reconstructions or lung CAD software.

Conclusion

Ultralow-dose CT, using the same dose as conventional thoracic radiography in two projections, has a comparable sensitivity to standard CT for lung nodule detection, and is thus adequate for lung cancer screening. The small difference in detection rates between ultralow-dose CT and standard CT can be compensated for by using MIP reconstructions. The additional use of CAD software results in a slightly higher detection rate for the smallest solid micronodules of 5 mm used in this study. This might be helpful in staging for lung metastases in patients with known cancer, but is not necessarily beneficial in the setting of lung cancer screening, due to a high false-positive rate of exactly those small solid micronodules.