
1 Introduction

In medical X-ray examinations, radiographers often use collimation to avoid unnecessary exposure of patients and to improve image quality. Detecting the non-collimated region is important both for improving visual quality in the diagnostic environment and for subsequent processing of the image, e.g. image compression. Nevertheless, collimation detection remains a challenging problem due to the large variability in shape and appearance across collimated images. In this study, the region of interest (ROI) is the non-collimated region in the X-ray image, which is a quadrilateral produced by rectangular collimators.

Many previous works applied 2D boundary detection techniques to collimation detection [2, 3]. However, all of these methods relied on unsupervised edge detection. Because of the large variability of collimated images and the overlap of the pixel-level feature distributions between the ROI and the collimation field, these methods are not reliable. Kawashita et al. [4] detect the ROI with a Hough transform for plane detection, but this method works only when the ROI is a rectangle and is not as large as the whole image. Mao et al. [5] proposed a multi-view learning-based method combining region and corner detection. Its accuracy depends heavily on region detection via a pixel-level two-class classification; because each pixel is classified individually, the accuracy is limited and efficiency is compromised. In this paper, we develop our method based on two simple observations. First, if two nearby pixels lie on opposite sides of the ROI boundary (one inside the ROI, one outside), the difference in their intensities is often relatively large. Second, along the boundary between the ROI and the collimation field, the gradient vectors usually point from the collimation field towards the ROI.

Fig. 1. An example of our algorithm: (a) input image; (b) edge map; (c) the optimal group of four directed straight lines; (d) detected ROI.

The first observation indicates that, if we over-segment the image using both location and appearance information, e.g. with SLIC [6], most pixels on the boundary of the ROI will lie on the boundaries of superpixels. The second observation indicates that, compared with undirected straight lines, straight lines with specified normal directions impose stricter shape constraints for ROI detection in X-ray images.

Hence, we propose a novel algorithm based on superpixel-level learning-based edge estimation and a directed Hough transform. We estimate the edge strength of pixels with a random forests approach applied to pairs of neighbouring superpixels (Fig. 1b). As the edge map is estimated from training data, it is robust and accurate. By computing the edge map at the superpixel level, we reduce the number of samples for learning-based classification, take advantage of superpixel-level features, and decrease the number of pixels visited in the Hough transform. Andrews and Hamarneh [7] also used a learning-based method for boundary detection; however, their method classifies each boundary candidate at the pixel level, whereas ours classifies only pairs of neighbouring superpixels, which is more efficient.

We define directed straight lines by specifying their normal vectors, and extend the classical Hough transform to detect directed straight lines using the gradient vector field and the learning-based edge map. We define positive and negative half-planes for directed straight lines, and regard the quadrilateral ROI as the intersection of the four positive half-planes of the optimal group of four directed lines, as shown in Fig. 1c and d. By setting special values in the edge map and the vector field for pixels on the four sides of the image, our method works even if the ROI is the whole image.

2 Learning-Based Edge Estimation

Given an X-ray image I, we compute its edge map \(I_e\) with the following four steps (Fig. 2).

  1. We over-segment the image with SLIC [6], resulting in a set of superpixels, S.

  2. For each pair of neighbouring superpixels, we compute the probability of the pair belonging to each of three given classes with random forests (see Sect. 2.1 for details).

  3. For each superpixel, we compute its probability of being in the ROI.

  4. For each pixel \(p\in I\), we compute its edge strength \(I_e(p)\) based on the computed superpixel-level probabilities.

As our algorithm is based on the assumption that most pixels on the boundary of the ROI lie on the boundaries of some superpixels in S, we use SLIC (Simple Linear Iterative Clustering) [6] to perform the over-segmentation with pixel-level features, including intensity and texture information; SLIC efficiently produces smooth, regular-sized superpixels. We select this algorithm to support the assumption above and to make the features of different superpixels comparable.
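The assumption above can be checked directly on a label map. The following sketch (ours, not the authors' code) marks every pixel that lies on a boundary between two superpixels, i.e. has a 4-neighbour with a different label; a toy integer label map stands in for a real SLIC result.

```python
import numpy as np

def superpixel_boundary_mask(labels):
    """Mark pixels lying on a boundary between two superpixels.

    `labels` is an integer label map (one id per superpixel), e.g. the
    output of an over-segmentation such as SLIC.  A pixel is a boundary
    pixel if any of its 4-neighbours carries a different label.
    """
    mask = np.zeros(labels.shape, dtype=bool)
    mask[:-1, :] |= labels[:-1, :] != labels[1:, :]  # vs. lower neighbour
    mask[1:, :] |= labels[1:, :] != labels[:-1, :]   # vs. upper neighbour
    mask[:, :-1] |= labels[:, :-1] != labels[:, 1:]  # vs. right neighbour
    mask[:, 1:] |= labels[:, 1:] != labels[:, :-1]   # vs. left neighbour
    return mask
```

With a good over-segmentation, the true ROI boundary should be a subset of this mask, which is what allows the edge map to be built at the superpixel level.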

Fig. 2. An example of our edge estimation algorithm: (a) input image; (b) superpixels; (c) edge map with our method; (d) Canny edge map [8]; (e) gradient magnitude.

2.1 Superpixel-Level Probability

We use a three-class random forests classifier to estimate the class probabilities of each pair of neighbouring superpixels. The set of classes is \(\{\mathcal{L}_0,\ \mathcal{L}_1,\ \mathcal{L}_2\}\), representing that 0, 1 or 2 superpixels of the pair are in the ROI, respectively. Let \(Pr_{(P, Q)}(\mathcal{L})\) be the probability of the pair of superpixels \((P, Q)\) belonging to the class \(\mathcal{L}\). The features used in the classifier include the averages and standard deviations of pixel-level features, such as intensity and gradients, over the pixels contained in each of the two superpixels, as well as shape properties. By comparing the average intensities of P and Q and rearranging their features accordingly, we guarantee that \((P, Q)\) and \((Q, P)\) have the same feature vector for the random forests classifier, so that \(Pr_{(P, Q)}(\mathcal{L})=Pr_{(Q, P)}(\mathcal{L})\) for \(\mathcal{L}=\mathcal{L}_0,\ \mathcal{L}_1,\ \mathcal{L}_2\).

Let Pr(P) be the probability of the superpixel P being in the ROI, defined as: \(Pr(P)=\frac{\sum _{Q\in \mathbf N (P)}{Pr_{(P, Q)}(\mathcal{L}_2)}}{|\mathbf N (P)|}\), where \(\mathbf N (\cdot )\) denotes the neighbourhood of a superpixel and \(|\cdot |\) the size of a set.
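This average can be sketched in a few lines, assuming the pairwise random-forest posteriors are available as a dictionary `pair_prob[(P, Q)] = (Pr(L0), Pr(L1), Pr(L2))` (the container names are ours, for illustration):

```python
import numpy as np

def roi_probability(P, neighbours, pair_prob):
    """Pr(P): the mean, over neighbours Q of P, of the probability that
    the pair (P, Q) belongs to class L2 (both superpixels in the ROI).

    `pair_prob[(P, Q)]` is assumed to hold the random-forest posterior
    (Pr(L0), Pr(L1), Pr(L2)) for that pair of neighbouring superpixels.
    """
    return np.mean([pair_prob[(P, Q)][2] for Q in neighbours[P]])
```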

2.2 Pixel-Level Probability

Let B(P, Q) be the common boundary of a pair of superpixels \((P, Q)\), and let \(l_{four}\) be the set of pixels on the four sides of the image, i.e. \(l_{four}=\{p \mid p_x=1 \vee p_x=M \vee p_y=1 \vee p_y=N\}\), where \(M\times N\) is the size of I.

Note that, given a perfect over-segmentation S of I, in which each superpixel P is either entirely inside the ROI or entirely outside it, a pixel p is on the boundary of the ROI if and only if it is in one of the following two situations:

  • \(p\in B(P,Q)\), where one of P and Q is in the ROI and the other is not;

  • \(p\in l_{four}\) and \(p\in P\), where P is in the ROI.

Hence we estimate the edge strength \(I_e(p)\) as the probability of pixel p being on the boundary:

$$\begin{aligned} I_e(p)={\left\{ \begin{array}{ll} \alpha Pr(P)&{}\text{ if } p\in P \text{ and } p\in l_{four} \\ Pr_{(P, Q)}(\mathcal{L}_1)&{}\text{ if } p\in B(P,Q) \text{ and } p\not \in l_{four} \\ 0&{}\text{ otherwise } \end{array}\right. } \end{aligned}$$
(1)

where \(\alpha \) is a positive constant parameter.
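Eq. 1 can be assembled pixel by pixel as in the following sketch. This is our illustration, not the paper's code: the containers `boundary_pairs`, `pr_pair_L1` and `pr_roi` are hypothetical names for the precomputed superpixel-level quantities, and 0-based array indices are used for the image sides instead of the paper's 1-based \(l_{four}\).

```python
import numpy as np

def edge_map(labels, boundary_pairs, pr_pair_L1, pr_roi, alpha=1.0):
    """Assemble the edge map of Eq. 1 (a sketch; names are ours).

    labels        : superpixel label map, shape (M, N)
    boundary_pairs: dict pixel (x, y) -> (P, Q) for pixels on a common
                    superpixel boundary
    pr_pair_L1    : dict (P, Q) -> Pr_{(P,Q)}(L1)
    pr_roi        : dict P -> Pr(P)
    """
    M, N = labels.shape
    Ie = np.zeros((M, N))
    for x in range(M):
        for y in range(N):
            on_side = x in (0, M - 1) or y in (0, N - 1)  # 0-based l_four
            if on_side:
                Ie[x, y] = alpha * pr_roi[labels[x, y]]
            elif (x, y) in boundary_pairs:
                Ie[x, y] = pr_pair_L1[boundary_pairs[(x, y)]]
    return Ie
```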

3 Directed Hough Transform

3.1 Classical Hough Transform

Applying the classical Hough transform (HT) to a non-negative edge map \(I_e'\) of an image I results in a 2D accumulator array \(C(\rho _k, \theta _l)\), which represents the sum of the edge strengths (\(I_e'\)) of the points satisfying the linear equation \(\rho _k=p_x \cos \theta _l+p_y \sin \theta _l\), where p is a pixel in the image and \((p_x, p_y)\) is its coordinate. Local maxima of C can be used to detect straight lines in the image. \(C(\cdot )\) is defined as: \(C(\rho _k, \theta _l )=\sum _{p:\ p_x \cos \theta _l+p_y \sin \theta _l -\rho _k=0}{I_e'(p)}\). Here \(I_e'\) can be binary, e.g. Canny edges, or grey-level, e.g. gradient magnitude. To compute \(C(\cdot )\), for each pixel p such that \(I_e'(p)>0\), the Hough transform algorithm calculates the parameters \((\rho , \theta )\) of the lines passing through p, and increments the value of the corresponding bins in \(C(\cdot )\) by \(I_e'(p)\) [9].
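A straightforward (deliberately unoptimized) rendering of this voting scheme, with array row/column indices standing in for the pixel coordinates \((p_x, p_y)\) and one integer \(\rho\)-bin per unit distance:

```python
import numpy as np

def hough_accumulator(edge, n_theta=180):
    """Classical HT: each pixel with positive edge strength votes with
    that strength into the (rho, theta) bins of all lines through it."""
    M, N = edge.shape
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    rho_max = int(np.ceil(np.hypot(M, N)))
    C = np.zeros((2 * rho_max + 1, n_theta))  # rho in [-rho_max, rho_max]
    for px, py in zip(*np.nonzero(edge)):
        for l, t in enumerate(thetas):
            rho = px * np.cos(t) + py * np.sin(t)
            C[int(round(rho)) + rho_max, l] += edge[px, py]
    return C, thetas
```

Production implementations vectorize the inner loop; this version only mirrors the definition of \(C(\cdot)\).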

3.2 Directed Straight Line

Given the coordinate \((p_{0x}, p_{0y})\) of a pixel \(p_0\) and a normal vector \(\mathbf {n_0}\), a straight line passing through \(p_0\) is defined with a linear equation:

$$\begin{aligned} p_x \cos \theta _0+p_y \sin \theta _0 -\rho _0=0 \end{aligned}$$
(2)

where \(\theta _0\) is the angle between \(\mathbf {n_0}\) and the positive X-axis, \(\rho _0\) is defined as \(p_{0x} \cos \theta _0+p_{0y} \sin \theta _0\). In this definition, \(\theta _0\in [0, \pi )\).

In this paper, we use the same equation (Eq. 2) to define a directed straight line \(l_{p_0,\mathbf {n_0}}\) by measuring the angle \(\theta _0\) counter-clockwise from the positive X-axis to \(\mathbf {n_0}\). In this way, a point with two opposite normal vectors defines two different directed lines passing through the same point. Here \(\theta _0\in [0,2\pi )\), and \(\theta _0\) is in one-to-one correspondence with the normal vector \(\mathbf {n_0}=(\cos \theta _0, \sin \theta _0)\).
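The mapping from a point and a normal vector to the parameters \((\rho_0, \theta_0)\) of the directed line can be sketched as follows (our helper, not from the paper); note that opposite normals yield different \(\theta_0\) in \([0, 2\pi)\) and \(\rho_0\) of opposite sign:

```python
import numpy as np

def directed_line_params(p0, n0):
    """(rho0, theta0) of the directed line through p0 with normal n0.

    theta0 is measured counter-clockwise from the positive X-axis to n0,
    so theta0 lies in [0, 2*pi) and opposite normals give distinct lines.
    """
    theta = np.arctan2(n0[1], n0[0]) % (2 * np.pi)
    rho = p0[0] * np.cos(theta) + p0[1] * np.sin(theta)
    return rho, theta
```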

We define the positive and negative half-plane of \(l_{p_0,\mathbf {n_0}}\) with two linear inequalities, respectively:

$$\begin{aligned} H^+(l_{p_0,\mathbf {n_0}}):\ p_x \cos \theta _0+p_y \sin \theta _0 -\rho _0>0\end{aligned}$$
(3)
$$\begin{aligned} H^-(l_{p_0,\mathbf {n_0}}):\ p_x \cos \theta _0+p_y \sin \theta _0 -\rho _0<0 \end{aligned}$$
(4)

Given a vector field \({{\varvec{W}}}=(U,V)\) over the image I, a pixel q is a supporting pixel of the directed line \(l_{p_0,\mathbf {n_0}}\) if and only if \((q_x, q_y)\) satisfies Eq. 2 and \(\left| \angle ({{\varvec{W}}}(q),\mathbf {n_0})\right| <\pi /2\). It is easy to see that the vectors at supporting pixels point from the negative half-plane towards the positive half-plane. We call \({{\varvec{W}}}\) the supporting vector field of the image I, and estimate the strength of directed straight lines from their supporting pixels in Sect. 3.3.
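Both tests reduce to sign checks on dot products: the half-plane side is the sign of \(p_x \cos \theta_0 + p_y \sin \theta_0 - \rho_0\), and the angle condition \(|\angle(\mathbf{W}(q), \mathbf{n_0})| < \pi/2\) holds exactly when \(\mathbf{W}(q) \cdot \mathbf{n_0} > 0\) (for nonzero vectors). A small sketch of ours:

```python
import numpy as np

def half_plane_side(p, rho, theta):
    """+1 if p is in H+, -1 if in H-, 0 if p lies on the line."""
    v = p[0] * np.cos(theta) + p[1] * np.sin(theta) - rho
    return int(np.sign(np.round(v, 9)))

def is_supporting(w, theta):
    """A pixel on the line supports it iff its field vector w makes an
    angle below pi/2 with the normal (cos theta, sin theta), i.e. the
    dot product w . n is positive."""
    return w[0] * np.cos(theta) + w[1] * np.sin(theta) > 0
```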

3.3 Directed Hough Transform with Gradient Vectors

Given the image I, its edge map \(I_e\) and its supporting vector field \({{\varvec{W}}}\), we define a 2D accumulator array \(C_d(\cdot )\) for the directed Hough transform as:

$$\begin{aligned} C_d(\rho _k, \theta _l )=\sum _{\begin{array}{c} p:\ \left| \angle {({{\varvec{W}}}(p),{{\varvec{n}}_{\varvec{l}}})}\right| <\pi /2 \\ \ p_x \cos \theta _l+p_y \sin \theta _l -\rho _k=0 \end{array}}{I_e(p)} \end{aligned}$$
(5)

where \({{\varvec{n}}_{\varvec{l}}}=(\cos \theta _l, \sin \theta _l)\).

Our goal is to guarantee that each directed line l passing through pixels on the boundary of the ROI, such that the ROI lies in its positive half-plane \(H^+(l)\), has a corresponding local maximum in \(C_d\). Hence we need to define a proper supporting vector field and a good edge map.

Based on the observation that, along the boundary of the ROI in an X-ray image, the intensity gradient vectors usually point from the shadow region (non-ROI) towards the ROI, we define the supporting vector field (U, V) from the estimated gradient vector \(\nabla I=(I_x, I_y)\). Special values are set on the four sides of the image because, in some cases, parts of the ROI boundary lie on the sides of the image and there is no shadow region next to them.

$$\begin{aligned} {{\varvec{W}}}(p)={\left\{ \begin{array}{ll} (1, 0)&{}\text{ if } p_x=1\\ (-1, 0)&{}\text{ if } p_x=M\\ (0, 1)&{}\text{ if } p_y=1,\ 1<p_x<M\\ (0, -1)&{}\text{ if } p_y=N,\ 1<p_x<M\\ (I_x(p),I_y(p))&{}\text{ otherwise } \end{array}\right. } \end{aligned}$$
(6)

where \(M\times N\) is the size of the image I. We estimate the edge map \(I_e\) with a learning-based method to take advantage of the training data (see Sect. 2 for details). The complete directed Hough transform algorithm for the ROI boundary is shown in Algorithm 1.

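A compact sketch of Eq. 5 (ours, not the paper's Algorithm 1; unoptimized, with array indices as coordinates): each pixel votes only into bins whose normal \(\mathbf{n_l}\) makes an angle below \(\pi/2\) with the pixel's supporting vector, i.e. whose dot product with it is positive.

```python
import numpy as np

def directed_hough(Ie, W, n_theta=360):
    """Directed HT of Eq. 5.  Ie is the edge map, W the supporting
    vector field with shape (M, N, 2).  A pixel p votes for bin
    (rho_k, theta_l) only if W(p) . (cos theta_l, sin theta_l) > 0."""
    M, N = Ie.shape
    thetas = np.linspace(0.0, 2 * np.pi, n_theta, endpoint=False)
    rho_max = int(np.ceil(np.hypot(M, N)))
    C = np.zeros((2 * rho_max + 1, n_theta))
    for px, py in zip(*np.nonzero(Ie)):
        wx, wy = W[px, py]
        for l, t in enumerate(thetas):
            if wx * np.cos(t) + wy * np.sin(t) > 0:  # direction test
                rho = px * np.cos(t) + py * np.sin(t)
                C[int(round(rho)) + rho_max, l] += Ie[px, py]
    return C, thetas
```

Compared with the classical accumulator, \(\theta\) now spans \([0, 2\pi)\) and roughly half the bins per pixel are suppressed by the direction test.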

4 Optimal Quadrilateral Detection

To detect the optimal quadrilateral, we need to find the optimal group of four directed straight lines cropping the ROI.

Firstly, we detect a list of directed lines \(l_i\), denoted \(L_{list}=\{l_i,\ i=1\,\ldots \,k\}\), using the directed Hough transform. Each \(l_i\) corresponds to a local maximum \((\rho _i,\theta _i)\) of \(C_d(\cdot )\) with \(C_d(\rho _i,\theta _i)>\tau _1\), where \(\tau _1\) is a positive threshold, and the list is sorted so that \(\theta _i\le \theta _j\) if \(i<j\). As the optimal quadrilateral can also be regarded as the intersection of the positive half-planes of the four lines cropping it, each \(l_i\) should also have a positive half-plane \(H^+(l_i)\) that contains the ROI. Let \(R_{\tau _2}=\cup _{Pr(P)>\tau _2}(P)\) be the region whose pixels have a high probability of being in the ROI, where \(\tau _2\) is set to 0.96 in this study. Hence, each directed line \(l_i\) in \(L_{list}\) satisfies: \(\frac{\left| R_{\tau _2}\cap H^+(l_i)\right| }{\left| R_{\tau _2}\right| }>\tau _3\), where \(0<\tau _3<1\).
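The two filters on a candidate line can be sketched as follows (our code; the half-plane mask of \(H^+(l)\) and the mask of \(R_{\tau_2}\) are assumed to be precomputed boolean images):

```python
import numpy as np

def keep_line(strength, hplus_mask, r_mask, tau1, tau3):
    """Keep a candidate directed line iff its accumulator peak exceeds
    tau1 and its positive half-plane covers a fraction > tau3 of the
    high-probability region R_tau2.

    hplus_mask, r_mask: boolean pixel masks of H+(l) and R_tau2."""
    if strength <= tau1:
        return False
    coverage = np.logical_and(hplus_mask, r_mask).sum() / r_mask.sum()
    return coverage > tau3
```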

Secondly, we search for a group of four directed lines \({{\varvec{l}}_{\varvec{i}}}=(l_{i_1},l_{i_2},l_{i_3},l_{i_4})\) in \(L_{list}\), with \(i_1<i_2<i_3<i_4\), satisfying \(\left| \theta _{i_{j+1}}-\theta _{i_{j}}-\pi /2\right| <\tau _\theta ,\ j=1,2,3\), \(\left| \theta _{i_{1}}-\theta _{i_{4}}+3\pi /2\right| <\tau _\theta \), \(\left| \theta _{i_{j+2}}-\theta _{i_{j}}-\pi \right| <\tau _\theta ,\ j=1,2\), and \(\rho _{i_{j+2}}+\rho _{i_j}<-\tau _\rho ,\ j=1,2\), that maximizes the function f:

$$\begin{aligned} f_{{\varvec{l}}_{\varvec{i}}}= \beta \sum _{j=1}^{4}C_d(\rho _{i_j},\theta _{i_j}) -\sum _{j=1}^{3}(\theta _{i_{j+1}}-\theta _{i_{j}}-\pi /2)^2 -(\theta _{i_{1}}-\theta _{i_{4}}+3\pi /2)^2 \end{aligned}$$
(7)

where \(\tau _\theta \), \(\tau _\rho \) and \(\beta \) are positive constant parameters.
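The feasibility test and score for one candidate quadruple can be sketched as follows (ours; for brevity only the angular constraints that enter f are checked, the opposite-pair angle and \(\rho\) constraints would be tested the same way). `strength` stands in for \(C_d(\rho_{i_j}, \theta_{i_j})\).

```python
import numpy as np

def quad_score(lines, beta=1.0, tau_theta=0.2):
    """Score f of Eq. 7 for four directed lines, or None if the angular
    constraints are violated.  `lines` is [(rho, theta, strength)],
    sorted by theta."""
    thetas = [t for _, t, _ in lines]
    gaps = [thetas[j + 1] - thetas[j] - np.pi / 2 for j in range(3)]
    wrap = thetas[0] - thetas[3] + 3 * np.pi / 2
    if any(abs(g) >= tau_theta for g in gaps) or abs(wrap) >= tau_theta:
        return None  # constraints violated: not a near-rectangle
    return (beta * sum(s for _, _, s in lines)
            - sum(g * g for g in gaps) - wrap * wrap)
```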

We optimize Eq. 7 with a constrained exhaustive search. For \(l_i\in L_{list}\), let \(s(i)=\min \{j\mid \theta _j>\theta _i+\pi /2-\tau _\theta \}\) and \(e(i)=\max \{j\mid \theta _j<\theta _i+\pi /2+\tau _\theta \}\); it is easy to see that, in the optimal group \((l_{i_1},l_{i_2},l_{i_3},l_{i_4})\), \(s(i_j)\le i_{j+1}\le e(i_j)\) for \(j=1,2,3\). We compute \(s(\cdot )\) and \(e(\cdot )\) beforehand to speed up the exhaustive search.
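Since \(L_{list}\) is sorted by angle, \(s(\cdot)\) and \(e(\cdot)\) can be precomputed with binary search (a sketch, ours):

```python
import bisect
import math

def successor_ranges(thetas, tau_theta):
    """Precompute s(i) and e(i) for a sorted angle list `thetas`:
    s(i) = min{j | theta_j > theta_i + pi/2 - tau_theta}
    e(i) = max{j | theta_j < theta_i + pi/2 + tau_theta}
    so only lines with index in [s(i), e(i)] can follow line i in a
    candidate quadruple."""
    s, e = [], []
    for t in thetas:
        s.append(bisect.bisect_right(thetas, t + math.pi / 2 - tau_theta))
        e.append(bisect.bisect_left(thetas, t + math.pi / 2 + tau_theta) - 1)
    return s, e
```

If \(s(i) > e(i)\), no line can follow \(l_i\) and the branch is pruned immediately.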

With the optimal group \((l_{i_1},l_{i_2},l_{i_3},l_{i_4})\), we generate the optimal quadrilateral as the detected region of interest (ROI), as in Fig. 1c.

Table 1. Comparative success rates: our method (\(s^*\)), the proposed directed Hough transform with a gradient magnitude map (\(s^1\)), an undirected Hough transform with the proposed learning-based edge map (\(s^2\)), and the method of Mao et al. [5] (\(s^A\)). \(D_1'\) and \(D_2'\) are the sets of testing images in \(D_1\) and \(D_2\), respectively. The success rate is computed with expert identification. The method in [5] takes about 15 s per case on average, versus about 1.8 s for ours.

5 Experiments

To show the robustness of our proposed method and the impact of its two components, we evaluate our algorithm on two data sets, \(D_1\) and \(D_2\), acquired by X-ray machines. We randomly select 100 images from the union of the two data sets for training, and evaluate our method on the remaining images. For comparison, we also evaluate another learning-based method [5] with the same training and testing images (Table 1). \(D_1\) and \(D_2\) are from different sites, and the difference in success rate between \(D_1'\) and \(D_2'\) is due to the larger variability of orientation and size among the images in \(D_1\).

Our algorithm takes 1.807 s per image on average to detect the ROI, on a system with an Intel Core i5-3470 3.2 GHz processor and 4 GB of memory. The average size of the input images is \(350\,\times \,350\). Training on the selected images takes about 2 h, during which the training data are labelled pixel-wise as ROI or not. In Fig. 3, we show some results of our method, including a failure case (Fig. 3e) in which almost all of the right side of the ROI boundary is very weak.

Fig. 3. Some results of our method; (e) is a failure case.

6 Discussion

In this paper, we propose a novel automatic algorithm to detect the region of interest in an X-ray image quickly and accurately. Learning-based edge maps are much more accurate than unsupervised ones, and the directed Hough transform adds strict shape constraints to the quadrilateral detection. However, although we use learning-based edge estimation, our method still fails if parts of the boundary are too weak, since the basic assumption of our method then no longer holds.