1 Introduction

In the last decade, several studies have focused on developing new algorithms to precisely locate small pulmonary structures, such as airways, on chest CT images. Once the structures are identified, the next step is a quantitative measurement of their geometrical properties, which may lead to improved diagnosis and new studies of lung disorders, as the morphology of the bronchial tree is commonly affected by inflammatory and infectious lung diseases. For example, the smaller conducting airways are the structures most affected in patients with chronic obstructive pulmonary disease (COPD) [1], and the thickness of the airway wall (measured on CT) has been correlated to the severity and duration of asthma in different works [2, 3]. For this reason, a method that automatically analyzes airway wall thickness and lumen size is of growing interest to the scientific community.

On CT images, airways are often close to vessels and surrounded by parenchyma, and limited image resolution as well as noise artifacts often hamper accurate measurement. To perform airway wall thickness detection, the traditional approaches are based on non-parametric methods, which analyze the properties of the structure directly on the reconstructed CT signal. The most typical approach is the so-called full width at half maximum (FWHM) [4], which is based on the idea that the true edge of an ideal step function undergoing low-pass filtering is located at the half-maximum location. A popular alternative for measuring airway walls is the zero crossing of the second order derivative (ZCSD) [5], which is used to characterize the signal transitions (i.e., lumen-to-wall and wall-to-parenchyma). More recently, an approach that starts from a coarse airway segmentation and implements an optimal graph construction method for wall segmentation was proposed [6]. However, all traditional methods suffer from over- and under-estimation errors when the structure size approaches the scanning resolution [7].
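To make the FWHM idea concrete, the following minimal sketch measures the width of a single bright peak (the wall) on a 1D intensity profile sampled across the airway. It is an illustration only, not the implementation of [4]: the function name `fwhm_width`, the use of one common baseline for both sides of the wall, and the linear interpolation between samples are all assumptions made here.

```python
def fwhm_width(profile, spacing_mm):
    """Width (in mm) of the single peak in `profile` at half of its height.

    Simplifying assumption: lumen and parenchyma share the same baseline,
    taken as the minimum of the profile.
    """
    baseline = min(profile)
    peak = max(profile)
    half = baseline + 0.5 * (peak - baseline)
    k = profile.index(peak)

    def crossing(i_out, i_in):
        # Linear interpolation between a sample below the half level (i_out)
        # and its neighbor at or above the half level (i_in).
        a, b = profile[i_out], profile[i_in]
        return i_out + (half - a) / (b - a) * (i_in - i_out)

    # Walk outwards from the peak while samples stay at or above the half level.
    left = k
    while left > 0 and profile[left - 1] >= half:
        left -= 1
    right = k
    while right < len(profile) - 1 and profile[right + 1] >= half:
        right += 1

    x_left = crossing(left - 1, left) if left > 0 else float(left)
    x_right = crossing(right + 1, right) if right < len(profile) - 1 else float(right)
    return (x_right - x_left) * spacing_mm


# Toy profile in HU (illustrative values), sampled every 0.5 mm.
toy_profile = [-900.0, -900.0, -500.0, -100.0, -500.0, -900.0, -900.0]
wall_mm = fwhm_width(toy_profile, 0.5)
```

When the wall is only one or two samples wide, as in this toy profile, blurring lowers the peak and widens the half-maximum span, which is the source of the over-estimation discussed above.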

To overcome these issues, we propose to use a convolutional neural regressor (CNR) [8] approach, which uses a customized loss function to automatically and simultaneously measure airway wall thickness and airway lumen on small 2D patches extracted around the structure of interest. To the best of our knowledge, this approach has not yet been considered to solve problems such as measurement and analysis of the morphology of airways on CT images.

Since creating an accurate and reliable ground truth for small airways is a tedious and complicated task, to train the network we developed a synthetic model that aims at reproducing the main characteristics of airways with exact knowledge of the physical dimensions. The generated model is then refined using a Simulated and Unsupervised Generative Adversarial Network (SimGAN) [9].

New synthetic airway images are used for an initial validation to compute the relative error obtained by the proposed method. Then, as a further test, we created a synthetic phantom of airways with varying wall thicknesses. Finally, in order to prove the reliability of our approach, we performed an indirect validation on in-vivo cases in comparison to traditional methods through the correlation between the predicted FEV1% and the Pi10 parameter.

2 Materials and Methods

The proposed CNR algorithm used 2D patches of 32\(\,\times \,\)32 pixels extracted from the structure of interest. These patches are then refined using SimGAN to resemble in-vivo patches better. In this section, we first introduce the creation and refinement of the synthetic patches. Then, the proposed CNR is described with the different training processes implemented. Finally, the validation methods are presented.

Table 1. Parameter ranges used for the creation of the airway model. All values were uniformly distributed within the specified ranges. LR stands for lumen radius.

2.1 Synthetic Modeling of Airways

In order to generate reliable synthetic patches of airways, the main aspects of the structure of interest as well as the characteristics of the CT scanner with regard to resolution, PSF, and imposed noise have to be reproduced. Based on the knowledge that, on the reformatted axial plane, airways have tangent vessels [10], each airway patch consisted of two bright ellipses (inner and outer walls) with a dark central zone (airway lumen) and zero, one, or two tangent vessels, represented by bright ellipses rotated around the airway. The parameters to create the synthetic airways were randomly chosen based on physiological values and are reported in Table 1. Although a synthetic airway has some limitations, we think that the proposed model represents an appropriate simulation that helps a neural network learn the main features of real airways. Also, using the multi-scale particle extraction method described in [11], 2D patches can be easily extracted along the airway’s main axis, which is given by the first eigenvector of the Hessian matrix. For this reason, we do not consider 3D patches, which would increase the complexity of the modeling because of the different tubular profiles and the wide variation of 3D orientations that would have to be taken into account.

To reproduce the structure of the parenchyma, a Gaussian smoothing (with a standard deviation of 5) was applied to Gaussian distributed noise, to create some broadly correlated noise, which made a texture of multiple structures that mimicked the parenchyma. Afterward, the correlated noise was altered to have a mean intensity of −900 HU and a standard deviation of 150. All values were empirically chosen.
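The parenchyma texture described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: the smoothing standard deviation of 5 is interpreted in pixels, the filter is a truncated separable Gaussian with zero-padded borders, and the names `gaussian_smooth` and `parenchyma_texture` are hypothetical.

```python
import numpy as np


def gaussian_smooth(img, sigma_px):
    """Separable Gaussian filter (kernel truncated at 4 sigma, zero-padded borders)."""
    radius = int(np.ceil(4 * sigma_px))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma_px) ** 2)
    kernel /= kernel.sum()
    rows = np.apply_along_axis(np.convolve, 1, img, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="same")


def parenchyma_texture(shape, rng, sigma_px=5.0, mean_hu=-900.0, std_hu=150.0):
    """Broadly correlated noise rescaled to the target mean and standard deviation."""
    white = rng.standard_normal(shape)           # Gaussian distributed noise
    smooth = gaussian_smooth(white, sigma_px)    # introduces spatial correlation
    smooth = (smooth - smooth.mean()) / smooth.std()
    return mean_hu + std_hu * smooth


# Small grid for illustration; each side must be at least as long as the kernel.
texture = parenchyma_texture((64, 64), np.random.default_rng(0))
```

Rescaling after smoothing is needed because the filter strongly reduces the variance of the white noise.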

All patches were created starting at a super-resolution of 0.05 mm/pixel in a sampling grid of 640\(\,\times \,\)640 pixels. To obtain the final patch, the obtained images were first down-sampled to a resolution of 0.5 mm/pixel. Then, a PSF was simulated to mimic the blurring caused by the image reconstruction process. To this end, due to the small size of the patch, we assumed that the PSF can be approximated by means of a spatially locally invariant Gaussian function, as demonstrated in [12]. The standard deviation of the Gaussian filter was randomly chosen in an empirically determined range of 0.4 to 0.9 mm to simulate the differences in the PSF across CT scanners and manufacturers. Finally, a spatially correlated Gaussian noise was added to the image based on Gaussian distributed random noise smoothed with a Gaussian filter (with a standard deviation of 2), with the empirically determined mean of zero and standard deviation of 25. As a last step, the image is cropped to a 32\(\,\times \,\)32 pixels grid.
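As a worked example of the geometry involved: a 640 × 640 grid at 0.05 mm/pixel covers 32 mm, down-sampling by a factor of 10 gives 64 × 64 pixels at 0.5 mm/pixel, and the final 32 × 32 crop covers 16 mm. A PSF standard deviation of 0.4 to 0.9 mm corresponds to 0.8 to 1.8 low-resolution pixels. The sketch below strings these steps together; block averaging for the down-sampling, a center crop, and a noise smoothing value of 2 interpreted in low-resolution pixels are assumptions, and all function names are hypothetical.

```python
import numpy as np

HI_RES_MM, LO_RES_MM, PATCH = 0.05, 0.5, 32


def gaussian_smooth(img, sigma_px):
    """Separable Gaussian filter (kernel truncated at 4 sigma, zero-padded borders)."""
    radius = int(np.ceil(4 * sigma_px))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma_px) ** 2)
    kernel /= kernel.sum()
    rows = np.apply_along_axis(np.convolve, 1, img, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="same")


def block_downsample(img, factor):
    """Average non-overlapping factor x factor blocks (assumed down-sampling scheme)."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def center_crop(img, size):
    r0 = (img.shape[0] - size) // 2
    c0 = (img.shape[1] - size) // 2
    return img[r0:r0 + size, c0:c0 + size]


def simulate_patch(model_hu, rng):
    """Turn a 640 x 640 super-resolution model (in HU) into a 32 x 32 patch."""
    factor = int(round(LO_RES_MM / HI_RES_MM))             # 10
    low = block_downsample(model_hu, factor)               # 640 -> 64 pixels
    psf_sigma_mm = rng.uniform(0.4, 0.9)                   # scanner-dependent blur
    blurred = gaussian_smooth(low, psf_sigma_mm / LO_RES_MM)
    noise = gaussian_smooth(rng.standard_normal(low.shape), 2.0)
    noise = 25.0 * (noise - noise.mean()) / noise.std()    # zero mean, std of 25
    # The crop lies 16 pixels from the border, beyond the reach of both kernels,
    # so the zero-padding of the filters does not affect the returned patch.
    return center_crop(blurred + noise, PATCH)


patch = simulate_patch(np.full((640, 640), -900.0), np.random.default_rng(0))
```

The uniform −900 HU input stands in for the geometric airway model, which is not reproduced here.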

2.2 SimGAN Refinement

Although the proposed generative model simulates the geometrical aspects of the structure of interest reasonably well, the generated patches may still differ from patches extracted from real structures. For this reason, we implemented a SimGAN refinement, similar to the one described in [9], to improve the quality of the synthetic patches. SimGAN makes use of simulated and unsupervised learning by using a generative adversarial network (GAN) that consists of both a generator (refiner) and a discriminator. The purpose of the refining step is to fool the discriminator, so that it can no longer tell whether an image is synthetic or real.

For the implementation of this network, we pre-trained the refiner on synthetic images with 1000 steps and a batch size of 256, while the discriminator was pre-trained on real patches (extracted using the multi-resolution particles method described in [11], initialized with the technique of [13]) and refined patches, obtained from the pre-trained refiner, with 100 steps and a batch size of 256. The number of steps was the same as in [9]. Then, the adversarial training of the SimGAN network was run for 10,000 steps with a batch size of 256, and all parameters and the loss function were set as in [9]. An example of a generated synthetic airway is shown in Fig. 1.

Fig. 1. Example of creation of a small synthetic airway patch (lumen: 0.7 mm, wall thickness: 1.25 mm). (a) The initial geometric model; (b) downsampling of the model; (c) blurring of the downsampled patch; (d) noise addition; (e) final synthetic airway (after applying SimGAN and cropping to obtain a 32 \(\times \) 32 pixels patch).

2.3 Measurement of Airway Morphology

To extract both measurements for airways, we implemented a 9-layer 2D network, which consists of seven convolutional layers, five of which had stride 1 and two had stride 2, and two fully-connected layers (see Fig. 2). The network regresses the measure of the central structure in a patch of 32\(\times \)32 pixels, a size chosen to include enough neighborhood information for big structures, without losing specificity for small and thin features. To train the network, we used the Adam optimizer (\(\beta _1\) = 0.9, \(\beta _2\) = 0.999, \(\epsilon \) = \(10^{-8}\), decay = 0.0) with a customized loss function that combines the absolute relative error and the precision of the measure to improve the network performance and stability (see Sect. 2.4).

The network was trained on an NVIDIA Titan X GPU machine, using the deep learning framework Keras [14] on top of TensorFlow [15], for 300 epochs at a learning rate of 0.001 and a batch size of 64.

Fig. 2. Scheme of the neural network used for measuring airways. The network is the same in both cases. The CNN for airways had 2 outputs (wall thickness and lumen).

2.4 Customized Loss Function for Airway Morphology Measurement

When trying to accurately measure small airways with sizes at image resolution level, typical approaches usually have problems of under- or over-estimation. For this reason, in this paper we suggest the usage of a new loss function that combines the loss of the relative error over all images, \(\mathcal {L}_{\mu }\), and the precision of the measure over 25 replicas of the same structure, \(\mathcal {L}_{\sigma }\):

$$\begin{aligned} \mathcal {L}(\varvec{y, \widehat{y}}) = \mathcal {L}_{\mu }(\varvec{y, \widehat{y}}) + \lambda \cdot \mathcal {L}_{\sigma }(\varvec{y, \widehat{y}}) \end{aligned}$$
(1)

where \(\varvec{y}\) is the true measure of a synthetic patch, \(\varvec{\widehat{y}}\) is the measure predicted by the CNR, and \(\lambda \) defines the weight of \(\mathcal {L}_{\sigma }\) with respect to \(\mathcal {L}_{\mu }\). The definition of \(\mathcal {L}_{\mu }\) is given by:

$$\begin{aligned} \mathcal {L}_{\mu }(\varvec{y, \widehat{y}}) = \sum _{i=1}^{N} \frac{|y_{i} - \widehat{y}_{i}|}{y_{i}} \end{aligned}$$
(2)

where N indicates the total number of patches. On the other hand, the loss term for the precision, \(\mathcal {L}_{\sigma }\), is computed over a number of replicas of the same geometric model (with fixed physical dimensions) to which varying PSFs are applied and a different number of airways and vessels are added with varying locations and rotations. This way, the network learns to accurately measure the structures of interest regardless of possible confounding factors inside the patch. The definition of \(\mathcal {L}_{\sigma }\) is given by:

$$\begin{aligned} \mathcal {L}_{\sigma } = \frac{1}{N} \sum _{i=1}^{N} \Bigg (\frac{1}{M} \sum _{j=1}^{M} \Big (y_{i,j} - \widehat{y}_{i,j}\Big )^2 - \Big (\frac{1}{M} \sum _{j=1}^{M} (y_{i,j} - \widehat{y}_{i,j}) \Big )^2\Bigg ) \end{aligned}$$
(3)

where N represents the total number of images, and M indicates the number of replicas considered. In this work, we used M = 25.
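The two loss terms can be sketched in NumPy as follows. This is an illustrative reading, not the training code: the arrays are assumed to have shape (N, M), with one row per geometric model and one column per replica, the precision term is implemented as the variance of the residuals over the replicas of each model, and the function names are hypothetical. The default weight of 2.0 is the value of \(\lambda \) reported for the airway loss.

```python
import numpy as np


def loss_mu(y, y_hat):
    """Eq. 2: sum of the absolute relative errors over all patches."""
    return np.sum(np.abs(y - y_hat) / y)


def loss_sigma(y, y_hat):
    """Precision term: variance of the residuals over the M replicas of each
    model (axis 1), averaged over the N models (axis 0)."""
    res = y - y_hat
    per_model_var = np.mean(res ** 2, axis=1) - np.mean(res, axis=1) ** 2
    return np.mean(per_model_var)


def total_loss(y, y_hat, lam=2.0):
    """Eq. 1: relative-error term plus the weighted precision term."""
    return loss_mu(y, y_hat) + lam * loss_sigma(y, y_hat)


# Toy example with N = 2 models and M = 2 replicas (illustrative values, in mm).
y_true = np.array([[1.0, 1.0], [2.0, 2.0]])
y_pred = np.array([[1.1, 0.9], [2.0, 2.0]])
example_loss = total_loss(y_true, y_pred)
```

A constant bias across the replicas of a model leaves the precision term at zero and is penalized only through the relative-error term, so the two terms address accuracy and repeatability separately.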

Since for airways lumen radius and wall thickness are measured simultaneously, for this structure the two terms of the loss, \(\mathcal {L}_{\mu }\) and \(\mathcal {L}_{\sigma }\), are given by the sum of the corresponding loss computed independently for the two measures.

Since we noticed that the measurement of small airways (lumen less than 1.0 mm) was the most affected by a high standard deviation, we also empirically assigned a higher weight to the precision term of these structures so that they acquire more importance when computing the loss. Therefore, Eq. 1 becomes:

$$\begin{aligned} \mathcal {L}_{a}(\varvec{y, \widehat{y}}) = \mathcal {L}_{\mu }(\varvec{y, \widehat{y}}) + \lambda \cdot \Big ( \omega _{\text {l}} \cdot \mathcal {L}_{\sigma , \text {l}}(\varvec{y, \widehat{y}}) + \omega _{\text {wt}} \cdot \mathcal {L}_{\sigma , \text {wt}}(\varvec{y, \widehat{y}}) \Big ) \end{aligned}$$
(4)

where \(\lambda \) = 2.0 has been empirically selected, l indicates the airway lumen, wt stands for wall thickness, and

$$\begin{aligned} \omega _{\text {l}} = {\left\{ \begin{array}{ll} 1.5, &{} \text {if airway lumen} < \text {1.0 mm} \\ 1.0, &{} \text {otherwise} \\ \end{array}\right. } \end{aligned}$$
(5)

and

$$\begin{aligned} \omega _{\text {wt}} = {\left\{ \begin{array}{ll} 3.0, &{} \text {if wall thickness} < \text {1.0 mm} \\ 1.0, &{} \text {otherwise} \\ \end{array}\right. } \end{aligned}$$
(6)
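A possible NumPy sketch of the weighted airway loss of Eq. 4 is given below. Since the weights depend on the size of each structure, the sketch applies them per geometric model before averaging; this is one interpretation and is an assumption, as are the (N, M) array layout, the residual-variance form of the precision term, and the function names.

```python
import numpy as np


def residual_variance(y, y_hat):
    """Variance of the residuals over the replicas (axis 1) of each model."""
    res = y - y_hat
    return np.mean(res ** 2, axis=1) - np.mean(res, axis=1) ** 2


def airway_loss(lumen, lumen_hat, wt, wt_hat, lam=2.0):
    """Eq. 4 with size-dependent weights on the precision terms (Eqs. 5 and 6)."""
    # Relative-error term: sum of the terms of the two measures.
    l_mu = np.sum(np.abs(lumen - lumen_hat) / lumen) + np.sum(np.abs(wt - wt_hat) / wt)
    # All replicas of a model share the same true size, so column 0 is enough.
    w_l = np.where(lumen[:, 0] < 1.0, 1.5, 1.0)
    w_wt = np.where(wt[:, 0] < 1.0, 3.0, 1.0)
    l_sigma_l = np.mean(w_l * residual_variance(lumen, lumen_hat))
    l_sigma_wt = np.mean(w_wt * residual_variance(wt, wt_hat))
    return l_mu + lam * (l_sigma_l + l_sigma_wt)


# Toy example: one small and one larger airway, two replicas each (values in mm).
lumen = np.array([[0.5, 0.5], [2.0, 2.0]])
lumen_hat = np.array([[0.6, 0.4], [2.0, 2.0]])
wt = np.array([[0.8, 0.8], [1.5, 1.5]])
wt_hat = np.array([[0.9, 0.7], [1.5, 1.5]])
example = airway_loss(lumen, lumen_hat, wt, wt_hat)
```

In this toy case only the small airway contributes to the precision terms, with weights of 1.5 for the lumen and 3.0 for the wall thickness.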

2.5 Training Set Definition

The training dataset consisted of 100,000 geometric models with 25 replicas each, where the replicas of a model differ in the applied PSF and in the additional vessels, which were added at varying locations and rotations. Therefore, a total of 2,500,000 training patches were used. For the validation set, we generated 1,000,000 patches (40,000 \(\times \) 25 replicas).

The values of the parameters used for the creation of the images were randomly chosen in ranges that were empirically defined based on physiological measures of the structures of interest, as shown in Table 1. We trained the network using all images refined by SimGAN.

Finally, in order to help the network focus more on geometry than on intensity values, during training we applied a data augmentation that, in addition to adding random noise, randomly inverts the intensity values inside the patches. Furthermore, we introduced a small random shift and random axis flipping of the patch to improve the learning of the network.
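The augmentation steps listed above could be sketched as follows. The paper does not specify the noise level, the inversion rule, the shift range, or the probabilities, so every numeric value below is a hypothetical placeholder, as are the function names; the wrap-around shift via `np.roll` is likewise an assumption.

```python
import numpy as np


def invert_intensities(patch):
    """Mirror intensities within the patch range (assumed inversion rule)."""
    return patch.max() + patch.min() - patch


def augment(patch, rng, noise_std=25.0, max_shift=2, p=0.5):
    """Random noise, intensity inversion, small shift, and axis flips."""
    out = patch.astype(float)
    out = out + rng.normal(0.0, noise_std, size=out.shape)
    if rng.random() < p:
        out = invert_intensities(out)
    # Small random shift; np.roll wraps pixels around the border.
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    out = np.roll(out, (int(dy), int(dx)), axis=(0, 1))
    if rng.random() < p:
        out = out[::-1, :]    # flip along the vertical axis
    if rng.random() < p:
        out = out[:, ::-1]    # flip along the horizontal axis
    return out


toy = np.arange(16.0).reshape(4, 4)
augmented = augment(toy, np.random.default_rng(0))
```

Inverting intensities swaps bright walls and dark lumen while leaving the geometry untouched, which is what encourages the network to rely on shape rather than on absolute HU values.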

Fig. 3. An image taken from the CT scan of the phantom showing the 8 tubes used for testing the CNR.

2.6 Experimental Setup

We evaluated the proposed approach for airway measurements on both synthetic and in-vivo cases. For the synthetic validation, we first generated a dataset of 200,000 patches (with random values chosen in the ranges of Table 1) that were used in three different experiments. First, we evaluated the accuracy of the algorithm by calculating the relative error (RE) between the CNR measurement and the ground truth defined by our geometrical model when varying the lumen and the wall thickness size. To compare our results to the state-of-the-art methods, we also computed the absolute RE obtained for airways with a wall thickness of 1.0 mm and for airways with a wall thickness at the image resolution level (0.5 mm).
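For reference, the RE used throughout the evaluation can be written as a one-line function. The sign convention (negative values for under-estimation) and the function names are assumptions of this sketch.

```python
def relative_error_pct(predicted_mm, true_mm):
    """Signed relative error in percent; negative means under-estimation."""
    return 100.0 * (predicted_mm - true_mm) / true_mm


def mean_abs_relative_error_pct(predicted, true):
    """Mean of the absolute relative errors (in percent) over paired measurements."""
    errors = [abs(relative_error_pct(p, t)) for p, t in zip(predicted, true)]
    return sum(errors) / len(errors)


# Illustrative values: a 1.0 mm wall measured as 0.9 mm is a -10% error.
re_example = relative_error_pct(0.9, 1.0)
```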

In order to demonstrate the ability of the method to accurately measure the structures of interest regardless of presence of noise and smoothness, as a second experiment we generated 100 images for each level of noise (\(\sigma _n \in [0,40]\) HU) and for each level of Gaussian smoothing (\(\sigma _s \in [0.4,0.9]\) mm) and computed the mean RE (in percentage) across the 100 patches. We repeated the same experiment first fixing the wall thickness at 1.5 mm and considering three values of airway lumen (small: 0.5 mm; medium: 2.5 mm; large: 4.5 mm), and then fixing the airway lumen at 1.5 mm and using three wall thickness values (small: 0.5 mm; medium: 1.2 mm; large: 2.0 mm).

As a final test on synthetic images, we compared the proposed method for airway measurement to FWHM and ZCSD computing the mean RE (in percentage) on patches of different sizes.

As a further validation, we tested the performance of the algorithm on a CT airway phantom of known lumen size and wall thickness. The phantom was constructed using Nylon66 tubing inserted into polystyrene to simulate lung parenchyma surrounding the airways. Non-overlapping, 0.6 mm collimation images, 40 cm FOV, were acquired using a Siemens Sensation 64 CT scanner and reconstructed with a standard reconstruction kernel. Eight tubes with varying wall thickness and lumen diameter (reported in Table 2), as measured by a caliper, were studied. An image taken from the CT scan of the phantom showing the eight tubes is presented in Fig. 3.

Table 2. Wall thickness (WT) and lumen diameter (in mm) for the eight tubes of the synthetic phantom as measured by a caliper.

As a final experiment, since an accurate and reliable in-vivo ground-truth is very complicated to obtain, we performed an indirect validation by means of a physiological evaluation. To this end, we computed the Pi10 parameter with our approach and with ZCSD, and analyzed its correlation to FEV1% on 590 clinical cases, with airway particles extracted using [11]. Pi10 is a metric of airway thickness that is computed measuring the square root of the wall area across the whole airway tree and regressing the value at a hypothetical airway with an internal perimeter of 10 mm. The wall area is found by subtracting the area of the lumen from the airway area, while Pi is computed from the lumen radius.
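The Pi10 computation described above can be sketched as an ordinary least-squares fit. The sketch assumes circular cross-sections, so that the internal perimeter is \(2\pi r\) and the wall area is \(\pi ((r + t)^2 - r^2)\) for lumen radius r and wall thickness t; the function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares line y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


def pi10(lumen_radii_mm, wall_thicknesses_mm):
    """Square root of the wall area regressed at an internal perimeter of 10 mm."""
    perimeters = [2.0 * math.pi * r for r in lumen_radii_mm]
    sqrt_wall_areas = [
        math.sqrt(math.pi * ((r + t) ** 2 - r ** 2))
        for r, t in zip(lumen_radii_mm, wall_thicknesses_mm)
    ]
    slope, intercept = fit_line(perimeters, sqrt_wall_areas)
    return intercept + slope * 10.0


# Toy airway tree with two measurements (illustrative values, in mm);
# the first airway has an internal perimeter of exactly 10 mm.
radii = [10.0 / (2.0 * math.pi), 2.0]
thicknesses = [1.0, 1.2]
value = pi10(radii, thicknesses)
```

In practice the regression runs over all airway particles of a subject, so each per-airway error in lumen radius or wall thickness propagates into the Pi10 estimate.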

3 Results

3.1 Synthetic Evaluation

Figure 4 shows the tendency of the RE for predictions obtained on the synthetic data when varying the lumen radius (Fig. 4a) and the wall thickness (Fig. 4b) of the airway. As expected, the error is small for airways with a large lumen (Fig. 4a), while it increases (with a tendency to under-estimate the measure) for lumens smaller than 1.0 mm, although it is always below a 10% RE. Regarding the wall thickness (Fig. 4b), a significant under-estimation error is obtained at sub-voxel levels (below the image resolution of 0.5 mm), while a tendency to over-estimation is obtained when the wall thickness is bigger than 2.0 mm.

Fig. 4. Tendency of the relative error obtained with CNR when varying (a) airway lumen and (b) wall thickness.

Fig. 5. Effect of varying noise (first row) and smoothing (second row) on lumen (a) and wall thickness (b) predictions. The RE is reported in %.

On average, an absolute RE of 6.3% is obtained for airways with a wall thickness of 1.0 mm, while when the airway wall thickness is at the image resolution (0.5 mm) the absolute RE is at 13.09%. These REs are significantly lower than those previously reported in the literature for structures of similar sizes [5, 7].

Results obtained when fixing three values of airway lumen (0.5, 2.5, and 4.5 mm) and three values of wall thickness (0.5, 1.5, and 2.5 mm) and varying the level of noise and smoothing are presented in Fig. 5. As shown, for both measurements the RE is stable across the different levels of noise and smoothness. While a very high accuracy is obtained for medium and large structures (RE close to 0), the smallest structures (generated with airway lumen or wall thickness at the image resolution of 0.5 mm) are the ones that confuse the network the most, which also results in a larger standard deviation. In all cases, the bias introduced by the CNR is small, with a slight under-estimation for small structures, as expected. For small wall thicknesses (0.5 mm), a very small RE is obtained when the smoothing level is low (<0.6 mm), while this error increases when applying higher levels of smoothing (>0.6 mm).

Finally, Table 3 shows the mean relative error (in percentage) obtained for different sizes of wall thickness using the proposed method in comparison to ZCSD and FWHM on the 200,000 testing patches. Three wall thickness intervals were chosen: lower than 0.7 mm, between 0.7 mm and 1.5 mm, and bigger than 1.5 mm. As shown, while traditional methods tend to have a very high relative error, especially for small airways, the proposed method yields a very high accuracy and outperforms them. Similar results were obtained for the airway lumen.

Table 3. Mean RE (in %) for the proposed method (CNR), FWHM, and ZCSD for the wall thickness (wt).

3.2 Phantom Evaluation

The relative errors obtained when measuring the wall thickness of the eight phantom tubes with the proposed method (CNR) and with traditional techniques are presented in Table 4. For completeness, the relative error obtained when measuring the lumen with CNR (not measured by traditional methods) is also reported. The proposed CNR has the lowest RE for all considered tubes, with the exception of tube C, where FWHM gives the best result, and in general it measures the wall thickness well even for small and thin tubes, as in the case of tube D. Although the RE for the wall thickness varies across tubes, this variance is smaller than the one obtained with traditional methods, which for some tubes seem to be confounded. Notably, the RE obtained with the proposed CNR when measuring the lumen radius is small for all tubes.

Table 4. Mean RE (in %) obtained measuring the wall thickness (WT) on the eight phantom tubes using the proposed method (CNR) in comparison with FWHM and ZCSD. Smallest relative error is reported in bold. For completeness, the last column reports the relative error obtained measuring the lumen of the tubes with CNR (traditional methods only provide WT). All results are in %.

Table 5. Results from the indirect in-vivo analysis for airways. The Pearson’s correlation coefficient for the correlation between the Pi10 computed with the ZCSD and SimGAN, and FEV1% is reported.

3.3 In-Vivo Indirect Evaluation (FEV1% in Correlation to Pi10)

Table 5(a) shows the Pearson’s correlation coefficient between FEV1% and the Pi10 metric computed with our approach and ZCSD in airway patches extracted from a real CT. The correlation coefficient between FEV1% and Pi10 calculated by ZCSD and CNR was −0.38 and −0.54, respectively, indicating a significantly higher correlation of the Pi10 computed by the CNR with FEV1%. This result suggests that the proposed method could potentially be used to accurately measure FEV1% in patients with COPD.
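For completeness, the Pearson correlation coefficient used in this comparison can be computed as in the sketch below. The function name is hypothetical and the numbers in the usage lines are made-up toy values, not study data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy values only: thicker walls (higher Pi10) paired with lower FEV1%.
toy_pi10 = [3.6, 3.8, 4.0, 4.2]
toy_fev1 = [95.0, 80.0, 70.0, 50.0]
toy_r = pearson_r(toy_pi10, toy_fev1)
```

A negative coefficient is the expected direction here, since a larger Pi10 indicates thicker airway walls and is associated with lower lung function.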

4 Discussion and Conclusion

In this paper, a novel method to automatically measure and analyze the morphology of airways using deep learning on chest CT images is proposed. The use of a neural network in combination with SimGAN to refine the synthetic model and the proposed loss function represent the innovative aspects of this work.

Results from the validation on synthetic patches showed a low absolute relative error across all airway wall thicknesses and airway lumens. Although a direct comparison is not possible, considering the absolute relative error for airways of 1.0 mm, the presented method obtains a better performance (absolute relative error around 6%) than the method proposed in [16], where measuring the wall thickness on plastic tubes of 1.0 mm yielded an absolute relative error of approximately 10%. Also, a test on structures of different sizes with varying levels of noise and smoothing showed that the proposed method is largely unaffected by noise or smoothing and that, as expected, only sizes at or below the image resolution lead to a small increase of the prediction error. A comparison with two traditional algorithms shows that our method outperforms the state of the art, especially for small and complex airways.

Finally, the phantom results and the indirect validation with in-vivo patches were promising, indicating the stability of the CNR in accurately measuring the wall thickness and lumen radius regardless of the varying starting conditions. This indicates that the proposed method may potentially be used for the early diagnosis of lung disorders.

For future work, the creation of the synthetic model might be improved by reducing the level of approximation of the PSF and additive noise. Also, new refinement processes of the synthetic images, such as using CycleGAN [17], should be explored.