1 Introduction

Optical endoscopy is routinely used to examine the interior of hollow organs or body cavities. The procedure consists of introducing an instrument called an endoscope, which carries a light source and a camera, to observe the organ of interest. There is a recent trend towards developing intelligent systems for endoscopy that provide additional information during the procedure by analysing image content. The most immediate applications of these systems are on-line assistance in diagnosis, complete endoluminal scene description (intervention time) and quality assessment (post-intervention).

The objective of this work is the characterization of the lumen centre in bronchoscopy and colonoscopy videos. Lumen centre detection is useful for several applications, such as (1) scene description, (2) calculation of the navigation path and (3) seeding of lumen segmentation algorithms.

A main challenge is to cope with the large variability in lumen appearance across image types and acquisitions. This variability is related to differences in acquisition and illumination conditions and makes it difficult to define a model of appearance common to colon and bronchi. Figure 1 shows different lumen appearances in bronchoscopy (Fig. 1(a,b)) and colonoscopy (Fig. 1(c,d)) procedures. The lumen in bronchoscopy is enclosed by the concentric tracheobronchial rings and is usually centred in the image. This is not the case for the colon lumen, which may appear in any part of the image depending on how the endoscope is navigated.

The majority of the relevant work in lumen localization and detection is related to gastrointestinal image analysis. Under the assumptions that the largest dark blob of the image usually corresponds to the lumen [1] and that it is always present in the image [2], several works segment the lumen using a region growing approach over the image grey level [3]. These approaches are accurate as long as the initial seed for the region growing is placed inside the luminal area, but their performance decreases in the presence of shadows or in low-contrast images. Recent approaches use contrast changes to account for local differences in image intensity. For instance, the authors of [4] characterize the luminal region in wireless capsule videos by means of Haar features followed by supervised boosting to estimate the probability of the lumen being present in a given frame. A main drawback for its application to standard bronchoscopy procedures is that the usual central navigation illuminates the luminal area and thus reduces contrast changes (compare images in Fig. 1(a) and (b)).

Fig. 1. Examples of variability in lumen appearance: single-lumen (a) and multi-lumen (b) bronchoscopy images; centred (c) and off-centre (d) colonoscopy images.

A common limitation is that most methods cannot handle more than one lumen in an image, which is quite frequent in bronchoscopy videos. The recent approach in [5] detects multiple lumen areas using mean shift. Although it provides information about multiple lumens, it might fail in the absence of any luminal area, and its high computational cost makes it unsuitable for use at intervention time. Other approaches for multiple lumen detection in bronchoscopy [6, 7] are semi-automatic procedures that are applied off-line. Finally, to the best of our knowledge, there is no public annotated database of lumen regions (for both bronchoscopy and colonoscopy videos) that allows the performance of different methods to be compared. This is a major obstacle to the development of generic algorithms able to achieve accurate results on a wide range of images.

This paper addresses two main points in the context of lumen characterization in endoscopy videos. First, we present a lumen centre detection method that can be used for a wide range of endoscopic images, covering single and multiple lumens. The proposed method is inspired by the work on tracheal ring segmentation presented in [8] and combines appearance and geometric features of the lumen that are present in both bronchoscopy and colonoscopy frames. Second, we present a manually annotated database, which includes representative cases of colonoscopy and bronchoscopy videos, along with a validation protocol. The rest of the paper is structured as follows: our lumen centre detection method is presented in Sect. 2. We introduce our validation protocol, including the description of the annotated database, in Sect. 3. Results are reported in Sect. 4 and conclusions in Sect. 5.

Fig. 2. Model of appearance of the lumen based on the illumination.

2 Lumen Centre Detection

Our processing scheme consists of three stages: (1) image preprocessing; (2) calculation of Lumen Energy Maps (LEMs); and (3) extraction of the centre points. Specific details about the preprocessing steps can be found in [9]. The core of our method, the calculation of LEMs, is based on a model of appearance of the lumen and has been designed to overcome the limitations of existing approaches. Finally, the extraction of the centre points uses unsupervised learning over a training set to provide the likelihood of a pixel being inside the lumen. The local maxima of this likelihood map are our lumen centres.

2.1 Model of Appearance of the Lumen

To build our model of appearance of the lumen, we rely on a graphical scheme of how endoscopy images are generated, shown in Fig. 2. As illustrated there, the amount of light that falls on the scene decreases approximately with the square of the distance between the light source and each 3D point. Consequently, the farthest parts of the scene, such as the lumen, are poorly lit. The fact that the amount of light increases from the centre of the lumen outwards allows us to incorporate geometric gradient-based features into our characterization of the lumen. Our model of appearance uses these two cues to characterize the lumen centre as the dark region of the image whose centre is the hub of image gradients. These two cues are used to develop our two LEM algorithms, which are described next.

2.2 LEM Maps

We present here two different Lumen Energy Maps. The first one, Directed Gradient Accumulation (DGA), is based on the idea that the lumen centre is the source of all image gradients, whereas the second, Dark Region Identification (DRI), exploits the fact that the lumen region tends to be the darkest part of the image.

Fig. 3. Graphical explanation of the DGA algorithm: original synthetic image (a); corresponding gradient vectors superimposed on the image (b); example of the extension of gradient vector lines (c); and resulting DGA accumulation map (d).

The DGA value at each point is calculated as the number of gradient-directed lines that cross it. These lines have the same direction as the gradient and are created by extending the gradient vectors to cover the whole frame. If a given image point is at the centre of a tubular structure, by Phong’s illumination model, image normal lines will accumulate around this point. It follows that DGA achieves maximum values at either darker (i.e. lumen) or brighter (e.g. specular highlights, polyps) regions. The synthetic images in Fig. 3 illustrate how DGA works. In this example all gradient vectors are directed from the centre of the image (darkest part) to the brightest external part and, thus, the maximum DGA response corresponds to the centre of the image (Fig. 3(d)).
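The accumulation described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the central-difference gradient, the half-pixel sampling step along each line, the `min_mag` cut-off for flat pixels and the name `dga_map` are all choices made for this example.

```python
import math

def dga_map(img, min_mag=1e-6):
    """Count, for every pixel, how many gradient-directed lines cross it.

    img is a list of rows of grey levels. Border pixels cast no line because
    the (assumed) central-difference gradient is undefined there.
    """
    h, w = len(img), len(img[0])
    acc = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            mag = math.hypot(gx, gy)
            if mag < min_mag:
                continue  # flat pixel: no defined gradient direction
            dx, dy = gx / mag, gy / mag
            crossed = set()  # each line votes at most once per pixel
            # Extend the line in both senses until it leaves the frame.
            for sign in (1.0, -1.0):
                t = 0.0
                while True:
                    px = int(round(x + sign * t * dx))
                    py = int(round(y + sign * t * dy))
                    if not (0 <= px < w and 0 <= py < h):
                        break
                    crossed.add((px, py))
                    t += 0.5
            for px, py in crossed:
                acc[py][px] += 1
    return acc
```

Because each line is extended in both senses, the map responds to hubs of gradients regardless of whether they are dark or bright, which matches the remark that DGA alone cannot separate the lumen from highlights or polyps.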

DRI maps are calculated by smoothing the image with a Gaussian kernel whose \(\sigma \) is related to the scale of the lumen and is determined using a training set. The DRI response enhances dark values and, thus, the lumen region. Figure 4 shows the output of DRI for several scales. Note that as we decrease the scale we go from a big dark blob (Fig. 4(b)) to a smaller one that better matches the lumen region (Fig. 4(e)).
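A minimal sketch of such a smoothing step follows, written with the standard library only. It assumes that \(\sigma \) is expressed as a fraction of the image width (the text gives the values without a unit), truncates the kernel at three standard deviations and replicates border pixels; the function names are hypothetical.

```python
import math

def gaussian_kernel(sigma_px):
    """Normalized 1D Gaussian kernel truncated at 3 sigma (assumed cut-off)."""
    radius = max(1, int(math.ceil(3 * sigma_px)))
    k = [math.exp(-(i * i) / (2.0 * sigma_px ** 2))
         for i in range(-radius, radius + 1)]
    total = sum(k)
    return [v / total for v in k]

def smooth_rows(img, kernel):
    """Convolve every row with the kernel, replicating border pixels."""
    r = len(kernel) // 2
    w = len(img[0])
    return [[sum(kv * row[min(max(x + i - r, 0), w - 1)]
                 for i, kv in enumerate(kernel))
             for x in range(w)]
            for row in img]

def transpose(img):
    return [list(col) for col in zip(*img)]

def dri_map(img, sigma_frac=1 / 24):
    """Gaussian-smoothed intensity; low values indicate large dark regions."""
    sigma_px = sigma_frac * len(img[0])  # assumption: sigma relative to width
    kernel = gaussian_kernel(sigma_px)
    horizontal = smooth_rows(img, kernel)
    # Smooth along columns by transposing, reusing the row pass, and
    # transposing back (the Gaussian is separable).
    return transpose(smooth_rows(transpose(horizontal), kernel))
```

With this convention a dark blob of roughly the kernel's size keeps a low response, while isolated dark pixels are averaged away, which is the behaviour the scale parameter is meant to control.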

Fig. 4. Graphical explanation of the DRI algorithm: original bronchoscopy image (a); smoothed images with \(\sigma \): \(1/8\) (b); \(1/16\) (c); \(1/24\) (d); \(1/32\) (e).

2.3 Centre Point Characterization

The 2D feature space given by (DGA, DRI) characterizes several elements of the endoluminal scene. In particular, pixels belonging to the lumen have a low DRI value and a high DGA value, polyps have high DGA and DRI values, and structures like folds and rings (which generate shadows) have low DGA and DRI values. The partition of the feature space into these three classes is obtained by unsupervised k-means clustering over a training set. In order to have comparable values, the output of both LEMs has been normalized to the \([0,1]\) range. This normalization uses the maximum and minimum values achieved over the training set.
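The normalization and clustering steps can be illustrated with a small, self-contained sketch. The feature values and the initial centroids below are invented for the example, and the plain Lloyd iteration is only one possible k-means variant; the text does not state which initialization or implementation was used.

```python
def normalize(values, lo, hi):
    """Min-max scale values to [0, 1] using training-set extrema lo and hi."""
    return [(v - lo) / (hi - lo) for v in values]

def kmeans(points, centroids, iters=50):
    """Plain Lloyd iterations on 2D points; returns (centroids, labels)."""
    labels = []
    for _ in range(iters):
        # Assignment step: nearest centroid in squared Euclidean distance.
        labels = [min(range(len(centroids)),
                      key=lambda c: (p[0] - centroids[c][0]) ** 2
                                    + (p[1] - centroids[c][1]) ** 2)
                  for p in points]
        # Update step: mean of the members of each cluster.
        updated = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append((sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members)))
            else:
                updated.append(tuple(centroids[c]))  # keep an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels

# Hypothetical normalized (DRI, DGA) samples: lumen, polyp and fold-like.
samples = [(0.10, 0.90), (0.15, 0.85), (0.05, 0.95),
           (0.90, 0.90), (0.85, 0.80), (0.95, 0.85),
           (0.10, 0.10), (0.20, 0.15), (0.15, 0.05)]
centres, labels = kmeans(samples, [(0.0, 1.0), (1.0, 1.0), (0.0, 0.0)])
```

Here cluster 0 plays the role of the lumen class (low DRI, high DGA), cluster 1 the polyp class and cluster 2 the fold and ring class.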

The distance of a pixel to the borders given by the clustering defines a likelihood map of its belonging to each of the classes. In our application this border has been approximated by a linear boundary passing through \((DRI_0, DGA_0)\) with normal direction \((V_{DRI}, V_{DGA})\), so that for each feature point \((DRI, DGA)\) its likelihood \(LK\) is defined by:

$$\begin{aligned} LK(DRI, DGA)=(DRI-DRI_0)V_{DRI}+(DGA-DGA_0) V_{DGA} \end{aligned} \tag{1}$$
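Equation (1) is a signed projection onto the boundary normal and can be written directly as a function. The origin and normal used in the example are hypothetical values chosen for illustration (the normal points towards low DRI and high DGA); the fitted values are not reported in the text.

```python
def likelihood(dri, dga, origin, normal):
    """Eq. (1): projection of (dri, dga) - origin onto the boundary normal."""
    dri0, dga0 = origin
    v_dri, v_dga = normal
    return (dri - dri0) * v_dri + (dga - dga0) * v_dga

# Hypothetical boundary: through (0.5, 0.5) with unit normal (-0.6, 0.8).
origin, normal = (0.5, 0.5), (-0.6, 0.8)
lk_lumen = likelihood(0.1, 0.9, origin, normal)  # dark, strong gradient hub
lk_fold = likelihood(0.1, 0.1, origin, normal)   # dark, weak gradient hub
```

When the normal has unit length, as assumed here, \(LK\) is the signed distance of the feature point to the boundary, so larger values lie deeper inside the lumen side of the feature space.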

A threshold, \(Th_{LK}\), on \(LK\) determines those points having a larger likelihood of belonging to a given class. We obtain the optimal \(Th_{LK}\) value as the maximum value of the ROC curve [10] corresponding to lumen segmentation for the training set. This stage has been applied separately for bronchoscopy and colonoscopy images. Figure 5 shows the feature spaces obtained for each type of image.

Fig. 5. Suitability of our feature space for bronchoscopy and colonoscopy examples.

Finally, to calculate the lumen centre we proceed as follows: for colonoscopy videos, as there is only one lumen per image, we take the best candidate inside the lumen region cluster. For bronchoscopy, on the other hand, we take all the local maxima found inside the lumen cluster.
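The selection rule above can be sketched as follows. The sketch assumes that the lumen cluster is the set of pixels whose likelihood exceeds the threshold, that "best candidate" means the highest likelihood, and that a local maximum is a pixel strictly greater than its 8 neighbours; `lumen_centres` is a hypothetical name.

```python
def lumen_centres(lk, threshold, multiple):
    """Return (x, y) centres from a likelihood map given as a list of rows.

    multiple=True keeps every local maximum (bronchoscopy case);
    multiple=False keeps only the strongest one (colonoscopy case).
    """
    h, w = len(lk), len(lk[0])
    peaks = []
    for y in range(h):
        for x in range(w):
            v = lk[y][x]
            if v <= threshold:
                continue  # pixel outside the lumen cluster
            neighbours = [lk[j][i]
                          for j in range(max(0, y - 1), min(h, y + 2))
                          for i in range(max(0, x - 1), min(w, x + 2))
                          if (i, j) != (x, y)]
            if all(v > n for n in neighbours):
                peaks.append((v, x, y))
    if not peaks:
        return []  # no lumen detected in this frame
    if multiple:
        return [(x, y) for _, x, y in peaks]
    best = max(peaks)
    return [(best[1], best[2])]
```

Returning an empty list when no pixel passes the threshold is what allows the same routine to report the absence of lumen. The strict comparison ignores flat plateaus, so a practical implementation would probably need a tie-breaking rule.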

3 The Annotated Database and Validation Protocol

In order to be useful for validating a wide range of algorithms, an annotated database should fulfill the following requirements: (1) it should contain examples of frames with lumen from both bronchoscopy and colonoscopy videos; (2) the selected frames should be different enough to cover the maximum available variability of lumen appearance; (3) the database should also contain examples of frames both with multiple lumens (bronchoscopy) and without lumen. Taking these constraints into account, we have built a database of 250 images extracted from 15 colonoscopy and 20 bronchoscopy sequences. Table 1 gives a description of the different groups and Fig. 6 shows an example with its segmentation.

Table 1. Description of lumen database.

The lumen detection has been validated in terms of true localizations (TL), false localizations (FL) and no localizations (NL). We have used Precision, \(Prec = \#TL/(\#TL + \#FL)\), and Recall, \(Rec = \#TL/(\#TL + \#NL)\), scores to summarize the performance.
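The two scores translate directly into code. The counts in the example are made up for illustration and are not taken from Table 2; returning 0.0 for an empty denominator is a convention assumed here.

```python
def precision_recall(tl, fl, nl):
    """Precision and Recall from true, false and missing localizations."""
    prec = tl / (tl + fl) if (tl + fl) else 0.0
    rec = tl / (tl + nl) if (tl + nl) else 0.0
    return prec, rec

# Illustrative counts only: 9 true, 1 false and 3 missed localizations.
prec, rec = precision_recall(tl=9, fl=1, nl=3)
```

With these counts, Precision is 9/10 and Recall is 9/12.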

Fig. 6. Examples of images from our database: bronchoscopy image (a) and its ground truth (b); colonoscopy image (c) and its ground truth (d).

4 Experimental Results

The complete methodology has been trained using 30 images per endoscopy type. The optimal parameter values have been chosen to maximize the number of candidate points inside the ground truth. The final values of the different parameters are: DRI \(\sigma = 1/24\), \(Th_{LK} = 0.12\) for bronchoscopy and \(Th_{LK} =0.14\) for colonoscopy.

Precision and Recall results are given in Table 2. Precision and Recall are over 93 % regardless of the type of image and lumen multiplicity. We only miss 4 lumens: 2 in colonoscopy and 2 in bronchoscopy (all in multi-lumen images). We also carried out an experiment to assess the potential of our method to detect lumen presence in the images. It is worth noting that, in the absence of lumen, our algorithm does not detect any centre in \(8/10\) bronchoscopy images and in \(16/18\) colonoscopy images.

Table 2. Precision and recall results on lumen centre detection.

Figure 7 shows qualitative results including good and bad detections. The first three columns show examples of good lumen centre detection in single- and multi-lumen images. Column 4 illustrates the potential of our method for detecting lumen presence: no centre point is marked in the image. Erroneous detections are shown in columns 5 (lumen detected in an image with no lumen) and 6. It is worth mentioning that in some cases, such as those shown in Fig. 7, it is unclear whether our algorithm has really failed, because ground-truth annotation involves considerable variability in delimiting the lumen region; even the presence or absence of a lumen in certain images depends on the experts’ criteria.

Fig. 7. Qualitative lumen centre detection results. Good detections marked with green crosses and bad ones with green circles. (Color figure online)

All the experiments shown in this section have been performed on a PC with an Intel i7 processor and 16 GB of RAM. The whole processing of one frame takes 0.057 s for bronchoscopy videos and 0.4 s for colonoscopy videos. The difference in computational time is related to the resolution of the images.

5 Conclusions

The detection of the lumen centre is useful for several applications, such as scene description, 3D reconstruction or computer-aided diagnosis. Moreover, by accurately detecting the lumen centre we can potentially obtain the navigation path inside the organ, which could be useful for quality assessment or for the follow-up of injured tissues. This paper presents a novel lumen centre detection method based on a model of appearance and geometry valid for the respiratory and gastrointestinal systems. The experiments show reliable performance on an extensive database that contains images from two modalities (bronchoscopy and colonoscopy) and includes images with multiple lumens and without any.