1 Introduction

The images studied in this paper were obtained using X-ray microcomputed tomography (µCT). µCT is a powerful technique for non-destructive imaging of the internal structure of small objects. The best µCT devices available today achieve a resolution better than one micrometer. This makes µCT widely used in biology, geology, materials science and many other areas where small structures must be imaged. The general idea behind a µCT measurement is to generate X-ray radiation with an X-ray tube. After passing through the sample, the radiation is registered by a 2D detector on the opposite side of the sample, which records the attenuated X-ray intensity. The registered 2D array of X-ray intensities is called a “projection”. The registered intensity depends on the radiation absorption of the material along each ray; generally, denser materials absorb more radiation. The sample is rotated and a few hundred or a few thousand projections are registered. Computer software then reconstructs the 3D object from the set of 2D projections using one of the available methods, one of the most popular being filtered back projection, which is based on the Radon transform [1]. The reconstructed object is represented by a 3D matrix of voxels, and a 2D slice through these voxels can be displayed as a 2D image. In this paper images of porous structures are studied. A porous material can be described as a two-phase composite in which one phase is solid and the other is a void or a gas or liquid phase. This paper studies the separation of these phases by segmenting each 2D cross-section image separately. When the X-ray linear attenuation coefficients of the two phases differ strongly, segmentation is easier, but in the images studied here the pixel intensities of the two phases differ only slightly. The images also contain a large amount of noise, which is a further difficulty.
In easy cases, segmentation can be performed by a filtering step, such as a median or bilateral filter, followed by binarization with simple thresholding, even with a single threshold. The same procedure can then be applied to every image in the stack, and the resulting binary stack can be used, for example, to render a 3D visualization. The images studied in this paper require a more sophisticated method to obtain a full segmentation.

This work is an extended version of the authors’ paper presented at the Second International Doctoral Symposium on Applied Computation and Security Systems, organized by the University of Calcutta [2]. More details and examples are given in this paper. Some theoretical aspects are repeated for the reader’s convenience.

2 Used Methods

2.1 Bilateral Filter

To detect edges properly in a noisy image, a suitable smoothing stage is required before edge detection. For noisy images this step is crucial to the quality of the whole approach. We found the bilateral filter to be a suitable way of smoothing the microtomography images presented in this paper. The bilateral filter removes unwanted details (textures, noise) while preserving edges without blurring them, which is its great advantage. It uses a modified version of Gaussian convolution. In Gaussian filtering, the new value of the considered pixel is a weighted average of the intensities of the pixels in a given neighbourhood. The weights decrease with increasing spatial distance from the central pixel, so more distant pixels are less significant for the new value of the processed pixel. That dependency is given as

$$ G[I]_{p} = \frac{1}{{W_{pG} }}\mathop \sum \limits_{q \in S} G_{\sigma } (|\left| {p - q} \right||)I_{q} , $$
(1)

where \( G_{\sigma } \left( x \right) \) is Gaussian convolution kernel given by (2).

$$ G_{\sigma } \left( x \right) = \frac{1}{{2\pi \sigma^{2} }}\exp \left( { - \frac{{x^{2} }}{{2\sigma^{2} }}} \right) $$
(2)

In (1), S is the spatial domain, \( W_{pG} \) is the sum of all weights, I is the pixel intensity and \( |\left| {p - q} \right|| \) is the Euclidean distance between the considered central pixel p and another pixel q from the given neighbourhood. The profile of the weights over spatial distance is controlled by \( \sigma \): a higher \( \sigma \) results in a higher level of smoothing. The main disadvantage of the Gaussian filter is edge blurring.

Bilateral filter is defined by (3).

$$ B[I]_{p} = \frac{1}{{W_{pB} }}\mathop \sum \limits_{q \in S} G_{{\sigma_{s} }} (|\left| {p - q} \right||)G_{{\sigma_{r} }} (|\left| {I_{p} - I_{q} } \right||)I_{q} , $$
(3)

where

$$ W_{pB} = \mathop \sum \limits_{q \in S} G_{{\sigma_{s} }} \left( {\left| {\left| {p - q} \right|} \right|} \right)G_{{\sigma_{r} }} \left( {\left| {\left| {I_{p} - I_{q} } \right|} \right|} \right). $$
(4)

Only pixels close to the central pixel in both space and intensity contribute significantly. The spatial-domain Gaussian kernel is given by \( G_{{\sigma_{s} }} \); its weights decrease with increasing spatial distance. The range-domain Gaussian kernel is given by \( G_{{\sigma_{r} }} \); its weights decrease with increasing intensity difference. Filtering simultaneously in the spatial and intensity domains lets the bilateral filter smooth the image (background and object areas) while preserving edges [3]. This behaviour is crucial when processing noisy images that need a high level of smoothing to remove the noise.
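
Equations (3) and (4) can be illustrated with a brute-force sketch in plain Python. It is not the implementation used in this paper: the function name, the square window given by `radius` and the test image are assumptions made for the example, and the constant Gaussian prefactors are dropped because they cancel in the normalization by \( W_{pB} \).

```python
import math


def bilateral_filter(image, radius, sigma_s, sigma_r):
    """Brute-force bilateral filter, Eqs. (3)-(4), on a 2D list of grey values."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for py in range(height):
        for px in range(width):
            centre = image[py][px]
            weighted_sum = 0.0
            weight_total = 0.0  # W_pB in Eq. (4)
            # Neighbourhood S: a square window clipped at the image border.
            for qy in range(max(0, py - radius), min(height, py + radius + 1)):
                for qx in range(max(0, px - radius), min(width, px + radius + 1)):
                    spatial_sq = (py - qy) ** 2 + (px - qx) ** 2
                    range_sq = (centre - image[qy][qx]) ** 2
                    # Product of the spatial and range Gaussians (prefactors cancel).
                    weight = (math.exp(-spatial_sq / (2.0 * sigma_s ** 2))
                              * math.exp(-range_sq / (2.0 * sigma_r ** 2)))
                    weighted_sum += weight * image[qy][qx]
                    weight_total += weight
            out[py][px] = weighted_sum / weight_total
    return out


def noisy_step(x, y):
    """Hypothetical test pattern: a 40/200 step edge with +/-3 checkerboard noise."""
    base = 40 if x < 4 else 200
    return base + (3 if (x + y) % 2 else -3)


image = [[noisy_step(x, y) for x in range(8)] for y in range(6)]
smoothed = bilateral_filter(image, radius=2, sigma_s=2.0, sigma_r=20.0)
```

In this example the noise on each side of the step is averaged out, while pixels across the step receive a negligible range weight, so the edge stays sharp. Calling the function twice with different parameters would mimic a two-pass smoothing scheme.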

2.2 Canny–Deriche Edge Detector

Canny formulated three important criteria for effective edge detection in his paper [4]:

  • Good detection: low probability of failing to detect existing edges and low probability of false detection of edges

  • Good localization: detected edges should be as close as possible to the true edges

  • One response to one edge: multiple responses to one real edge should not appear

Canny combined these criteria into one optimal operator, which is approximately the first derivative of a Gaussian [4, 5]; see (5).

$$ f\left( x \right) = - \frac{x}{{\sigma^{2} }}e^{{ - \frac{{x^{2} }}{{2\sigma^{2} }}}} . $$
(5)

Deriche modified Canny’s approach to obtain a better optimal edge detector [5]. He presented his optimal edge detector in the form of:

$$ f\left( x \right) = k \cdot e^{ - \alpha \cdot |x|} \sin \omega x $$
(6)

and for the case when \( \omega \) tends to 0

$$ g\left( x \right) = k \cdot xe^{ - \alpha \cdot |x|} . $$
(7)
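
The step from (6) to (7) can be made explicit with the small-angle approximation. The following is a short sketch; the rescaled constant \( k^{\prime} \) is notation introduced here for illustration and is assumed to stay finite as \( \omega \) tends to 0.

$$ \sin \omega x \approx \omega x \;\; (\omega \to 0) \quad \Rightarrow \quad f\left( x \right) \approx k\omega \cdot xe^{ - \alpha \cdot |x|} = k^{\prime} \cdot xe^{ - \alpha \cdot |x|} , \quad k^{\prime} = k\omega . $$

This has the form of (7), with the constant \( k^{\prime} \) written there simply as k.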

According to Deriche, this operator performs better than Canny’s original one [5]. The detector first calculates the gradient magnitude and direction to obtain a gradient map, in which higher values occur near the edges of objects. Non-maximum suppression then keeps only the pixel with the highest gradient value across the width of each edge, which yields thin edges. The last stage is hysteresis thresholding, which produces the final result of edge detection and takes two thresholds as parameters. According to these thresholds, pixels are divided into three groups. Pixels with values below the low threshold are removed, which means that they are classified as non-edges. Pixels with values above the high threshold are retained, so they are considered edges. A pixel with a value between the low and high thresholds is considered an edge pixel only if it is connected to some pixel above the high threshold [4, 5].
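
The three-group rule of hysteresis thresholding can be sketched as follows. This is a minimal illustration only: the function name, the use of 8-connectivity, the acceptance of chains of weak pixels and the example values are assumptions not stated in the text.

```python
from collections import deque


def hysteresis_threshold(gradient, low, high):
    """Keep strong pixels (> high) and weak pixels (>= low) connected to them."""
    height, width = len(gradient), len(gradient[0])
    edges = [[False] * width for _ in range(height)]
    queue = deque()
    # Seed the search with every pixel above the high threshold.
    for y in range(height):
        for x in range(width):
            if gradient[y][x] > high:
                edges[y][x] = True
                queue.append((y, x))
    # Grow from the seeds into 8-connected pixels that reach the low threshold.
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < height and 0 <= nx < width
                        and not edges[ny][nx] and gradient[ny][nx] >= low):
                    edges[ny][nx] = True
                    queue.append((ny, nx))
    return edges


# Hypothetical gradient magnitudes after non-maximum suppression.
gradient = [
    [0, 0, 0, 0, 0, 0],
    [0, 90, 50, 45, 0, 40],
    [0, 0, 0, 0, 0, 0],
]
edges = hysteresis_threshold(gradient, low=30, high=80)
```

Here the weak pixels with values 50 and 45 are kept because they are linked to the strong pixel with value 90, while the isolated weak pixel with value 40 is discarded.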

2.3 Active Contours (Snakes) and Gradient Vector Flow (GVF)

A snake can be described as a parametric curve

$$ \varvec{x}\left( s \right) = \left[ {x\left( s \right), y\left( s \right)} \right], \quad s \in \left[ {0, 1} \right] . $$
(8)

The snake moves through the spatial domain of the image to minimize the energy functional

$$ E = \int_{0}^{1} \frac{1}{2}\left[ {\alpha \left| {\varvec{x}^{\prime} \left( s \right)} \right|^{2} + \beta \left| {\varvec{x}^{\prime\prime} \left( s \right)} \right|^{2} } \right] + E_{\text{ext}} \left( {\varvec{x}\left( s \right)} \right){\text{d}}s. $$
(9)

where \( \alpha \) and \( \beta \) are weighting parameters that control the tension (first-derivative term) and the rigidity (second-derivative term) of the snake. \( E_{\text{ext}} \) is obtained from the image gradient map and takes smaller values near features of interest such as edges. In our approach the external force is obtained with the gradient vector flow (GVF) method, which computes it as a diffusion of the gradient vectors. The GVF method can be applied, for example, to a grey-level or a binary edge map derived from the image. GVF fields are dense vector fields derived from images by minimizing an energy functional. The minimization is achieved by solving a pair of decoupled linear partial differential equations that diffuse the gradient vectors of the edge map obtained from the image. An active contour that uses a GVF field as its external force is called a GVF snake. A detailed description and a numerical implementation can be found in the original GVF paper [6].
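
The diffusion of gradient vectors can be sketched with a simple explicit iteration. This is an illustrative sketch rather than the implementation of [6] or of this paper: it assumes an edge map scaled to [0, 1], a unit time step, replicated image borders, and illustrative values of the regularization weight `mu` and the iteration count.

```python
def gradient_vector_flow(edge_map, mu=0.2, iterations=100):
    """Diffuse the gradient of edge_map (values in [0, 1]) into a dense field (u, v)."""
    height, width = len(edge_map), len(edge_map[0])

    def at(field, y, x):
        # Replicate border values outside the image.
        return field[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

    # Central-difference gradient (fx, fy) of the edge map.
    fx = [[(at(edge_map, y, x + 1) - at(edge_map, y, x - 1)) / 2.0
           for x in range(width)] for y in range(height)]
    fy = [[(at(edge_map, y + 1, x) - at(edge_map, y - 1, x)) / 2.0
           for x in range(width)] for y in range(height)]
    mag_sq = [[fx[y][x] ** 2 + fy[y][x] ** 2 for x in range(width)]
              for y in range(height)]

    def diffuse(field, target):
        # One explicit step of
        #   field_t = mu * laplacian(field) - |grad f|^2 * (field - target).
        # With a unit time step, mu should not exceed 0.25 for stability.
        new = [[0.0] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                laplacian = (at(field, y - 1, x) + at(field, y + 1, x)
                             + at(field, y, x - 1) + at(field, y, x + 1)
                             - 4.0 * field[y][x])
                new[y][x] = (field[y][x] + mu * laplacian
                             - mag_sq[y][x] * (field[y][x] - target[y][x]))
        return new

    # The two components are decoupled, so they are iterated independently.
    u = [row[:] for row in fx]
    v = [row[:] for row in fy]
    for _ in range(iterations):
        u = diffuse(u, fx)
        v = diffuse(v, fy)
    return u, v


# Hypothetical binary edge map: a single vertical edge in column 7.
edge_map = [[1.0 if x == 7 else 0.0 for x in range(15)] for y in range(15)]
u, v = gradient_vector_flow(edge_map)
```

In this example the raw gradient is zero everywhere except next to column 7, whereas the diffused field points towards the edge from both sides even several pixels away. This extended capture range is what makes such a field useful as an external force for a snake.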

2.4 Statistical Region Merging (SRM) and Multilevel Thresholding

In region-merging methods, regions are sets of pixels with homogeneous properties that are grown iteratively by combining smaller regions. Single pixels are the elementary regions. A statistical test decides whether the tested regions should be merged. A detailed description is available in the original SRM paper [7]. Multilevel thresholding extends the Otsu method to more than two pixel classes by choosing the thresholds that maximize a modified between-class variance. In this paper pixels were divided into three classes (background, object, and holes inside objects together with object shadows). A detailed description is available in the original multilevel thresholding paper [8].
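
A three-class threshold search in the spirit of the Otsu method can be sketched as an exhaustive search over two thresholds. This is not the algorithm of [8]: the function name and the test data are assumptions, and the criterion used here is the sum over classes of \( \omega_{k} \mu_{k}^{2} \) (class weight times squared class mean), which differs from the ordinary between-class variance only by a constant and therefore selects the same thresholds.

```python
def three_class_otsu(pixels, levels=256):
    """Return thresholds (t1, t2) splitting grey levels into [0, t1), [t1, t2), [t2, levels)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Prefix sums of pixel counts and of grey-level sums.
    cum_count = [0] * (levels + 1)
    cum_sum = [0] * (levels + 1)
    for g in range(levels):
        cum_count[g + 1] = cum_count[g] + hist[g]
        cum_sum[g + 1] = cum_sum[g] + g * hist[g]

    def class_term(lo, hi):
        # Proportional to weight * mean^2 for the grey levels lo .. hi - 1.
        count = cum_count[hi] - cum_count[lo]
        if count == 0:
            return 0.0
        total = cum_sum[hi] - cum_sum[lo]
        return total * total / count

    best_score, best_t1, best_t2 = -1.0, 0, 0
    for t1 in range(1, levels - 1):
        for t2 in range(t1 + 1, levels):
            score = (class_term(0, t1) + class_term(t1, t2)
                     + class_term(t2, levels))
            if score > best_score:
                best_score, best_t1, best_t2 = score, t1, t2
    return best_t1, best_t2


# Hypothetical grey values forming three well-separated groups.
pixels = [20] * 50 + [22] * 50 + [120] * 40 + [125] * 40 + [230] * 30 + [235] * 30
t1, t2 = three_class_otsu(pixels)
labels = [0 if p < t1 else 1 if p < t2 else 2 for p in pixels]
```

The exhaustive search is quadratic in the number of grey levels, which is still cheap for 8-bit images. For more classes a faster search strategy would be preferable.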

3 The Proposed Methodology

The images analysed in this paper are difficult to segment. The major difficulty is that objects can have pixel intensities very similar to the background; sometimes even a human observer cannot say exactly where an object edge lies. Proper segmentation of these images therefore requires a complex approach that combines several methods. Simple intensity-based segmentation methods such as thresholding do not work here because they perform poorly on noisy images with non-uniform objects [9–13]. This paper combines Canny–Deriche edge detection [5], bilateral filtering [3], gradient vector flow [6], active contours [6], statistical region merging [7] and multilevel thresholding based on the Otsu method [8] into a multistage approach with good segmentation results.

The data used in this paper are 8-bit grayscale images. First, histogram normalization is applied to the original image. Because of the high noise level of the images, efficient smoothing is crucial to make proper segmentation possible. The smoothing step is based on the bilateral filter, whose adjustable parameters are mask size, intensity range, spatial sigma and intensity sigma. A semi-automatic approach to finding proper values of these parameters was described in the authors’ previous work [14]. Once proper values have been found for one image from a larger set, they can be used to process the entire set of µCT images. The smoothing step was applied twice, first with larger parameter values (more smoothing) and then with smaller parameter values (less smoothing, see Fig. 1). This step removes a significant amount of noise and unwanted texture, resulting in a simplified image.

Fig. 1 Bilateral filtering: first iteration (a) and second iteration (b)
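
The text does not specify which form of histogram normalization is applied before smoothing. One common reading is a linear contrast stretch to the full 8-bit range, sketched below; the function name and this interpretation are assumptions.

```python
def normalize_histogram(image, out_min=0, out_max=255):
    """Linearly stretch grey values so that they span [out_min, out_max]."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant image has no contrast to stretch; return a copy.
        return [row[:] for row in image]
    scale = (out_max - out_min) / (hi - lo)
    return [[round(out_min + (p - lo) * scale) for p in row] for row in image]


# Hypothetical low-contrast 8-bit fragment.
fragment = [[90, 100], [110, 130]]
stretched = normalize_histogram(fragment)
```

After stretching, the darkest input value maps to 0 and the brightest to 255, which makes fixed filter parameters easier to reuse across images with different exposure.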

In the edge detection stage the Canny–Deriche edge detector is applied. The adjustable parameters of this step are alpha, the high threshold and the low threshold. Finding proper values of these parameters was described in the previous article [14]. In this step the gradient magnitude and direction are calculated (Fig. 2a), and non-maximum suppression is then performed to thin the edges (Fig. 2b). Hysteresis thresholding is performed at the end of the process to obtain the most relevant edges (Fig. 2c).

Fig. 2 Canny–Deriche edge detector: gradient map (a), non-maximum suppression (b), final edge detection result after hysteresis thresholding (c) and original image (d)

Object grouping is then performed to collect the detected edges into an array of objects. This is achieved by grouping all edge pixels that lie within a chosen radius of each other. From this point on, all objects are processed separately. Each object is processed with the GVF method (Fig. 3) to obtain a gradient map suitable for the active contour method. A snake is initialized outside each object and evolves towards the object boundary by minimizing its energy at each iteration (Figs. 4 and 5). The energies used are the internal energy (first and second derivative), the external energy (obtained from the gradient map) and an external pressure force. The parameters were chosen to perform well on this kind of image. In our approach the snake is discretized: a finite number of control points is used to calculate the total snake energy. At each iteration the snake is resampled to ensure proper behaviour. The Viterbi algorithm helps to optimize the contour evolution.
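
The grouping of edge pixels within a chosen radius can be sketched as a flood fill over pixel coordinates. The function name, the use of Euclidean distance and the example coordinates are assumptions made for illustration.

```python
def group_edge_pixels(edge_pixels, radius):
    """Group (y, x) edge pixels so that neighbours within `radius` share a group."""
    remaining = set(edge_pixels)
    # All integer offsets within the Euclidean radius, excluding (0, 0).
    offsets = [(dy, dx)
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)
               if 0 < dy * dy + dx * dx <= radius * radius]
    groups = []
    while remaining:
        seed = remaining.pop()
        group = [seed]
        stack = [seed]
        # Collect every pixel reachable through steps no longer than the radius.
        while stack:
            y, x = stack.pop()
            for dy, dx in offsets:
                neighbour = (y + dy, x + dx)
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    group.append(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(group))
    return sorted(groups)


# Hypothetical edge pixels: one edge with a 2-pixel gap and one distant edge.
edge_pixels = [(0, 0), (0, 1), (0, 3), (10, 10), (10, 11)]
objects = group_edge_pixels(edge_pixels, radius=2)
```

With a radius of 2 the small gap between (0, 1) and (0, 3) is bridged, so the first three pixels form one object and the distant pair forms another.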

Fig. 3 Example of a gradient vector flow field after 100 iterations (a), 300 iterations (b) and 700 iterations (c)

Fig. 4 Example of snake evolution after 15 iterations (a) and final snake (b)

Fig. 5 Two examples of a gradient map obtained with the GVF method, with the snake evolved to find the object boundaries drawn on top of it (blue pixels are the discrete points used to calculate the snake energy)

The snake curve is used to create a mask which, in combination with the original object coordinates, allows the pixels inside the snake to be cut out of the smoothed original image (Fig. 6).

Fig. 6 Two examples of pixels cut out of the smoothed original image with the mask obtained from the snake curve coordinates
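
The masking step can be sketched by treating the final snake as a closed polygon and testing each pixel with the even-odd ray-casting rule. The function name, the (x, y) vertex format, the zero fill value outside the mask and the example data are assumptions made for this illustration.

```python
def polygon_mask(vertices, height, width):
    """Boolean mask of pixels inside a closed polygon given as (x, y) vertices."""
    mask = [[False] * width for _ in range(height)]
    count = len(vertices)
    for y in range(height):
        for x in range(width):
            inside = False
            for i in range(count):
                x1, y1 = vertices[i]
                x2, y2 = vertices[(i + 1) % count]
                # Does this polygon side cross the horizontal line through y?
                if (y1 > y) != (y2 > y):
                    x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                    # Count only crossings to the right of the pixel.
                    if x < x_cross:
                        inside = not inside
            mask[y][x] = inside
    return mask


# Hypothetical snake: a square contour around pixels 2..5 in both directions.
snake = [(1.5, 1.5), (5.5, 1.5), (5.5, 5.5), (1.5, 5.5)]
mask = polygon_mask(snake, height=8, width=8)

# Hypothetical smoothed image; pixels outside the snake are set to 0.
smoothed_image = [[10 * y + x for x in range(8)] for y in range(8)]
cut_out = [[smoothed_image[y][x] if mask[y][x] else 0 for x in range(8)]
           for y in range(8)]
```

Each object then yields its own small image containing only the pixels enclosed by its snake, which can be processed independently in the following stages.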

Statistical region merging is then performed to merge pixels into regions of similar intensity, which simplifies the image (Fig. 7).

Fig. 7 Two examples of pixels cut out of the smoothed original image with the mask obtained from the snake curve coordinates, after statistical region merging

The simplified images are finally segmented using multilevel thresholding based on the Otsu method (Fig. 8). If an object is smaller than 60 px (in width or height), or if all pixels in the image are above a given threshold, simple thresholding with one threshold is used instead. The flow of the algorithm is shown in Fig. 9.

Fig. 8 Two examples of binary images obtained with multilevel thresholding based on the Otsu method

Fig. 9 Flowchart of the proposed approach

Finally, all binary object images are combined to obtain a fully segmented input image (Fig. 10).

Fig. 10 Fully segmented image with bigger objects (a) and smaller objects (b)

4 Experimental Results and Interpretation

The presented method handles weak-contrast images of porous structures and images containing separate objects, and gives good segmentation results. Proper filtering with the two-step bilateral filter before Canny–Deriche edge detection yields properly localized edges, with very few false edges detected and very few true edges omitted. Some small gaps in the edges remained after the edge detection step; these were resolved by using the active contour method along with GVF. Full segmentation of the processed image fragments was achieved by using statistical region merging and multilevel thresholding based on the Otsu method. All binarized objects were combined into a fully segmented image (Fig. 10). The approach introduced in this paper expands the idea presented in the authors’ previous papers [14, 15], which concerned only edge detection. The upgraded approach achieves much better results, producing fully segmented binary images. It also leaves room for future enhancements that may improve the segmentation further.

5 Evaluation and Comparison of Results

To evaluate the results, the exact position of all object and background pixels must be known. For this kind of image such an evaluation is difficult: marking object contours by hand is time-consuming and not very precise, especially for complex objects with non-trivial shapes. In this paper the results were therefore evaluated with the aid of a mock input image prepared to imitate a real µCT image. The mock image was derived from the final binarized output of the algorithm. The grayscale levels of the objects and the background were set to the average values of those elements in the original µCT images. Noise with various standard deviations was then added to make the mock images similar to the real data obtained from the µCT device (Fig. 11).

Fig. 11 Fragments of mock test images with noise standard deviation equal to 12 (a) and 20 (b)
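
The construction of a mock image can be sketched as follows. The text does not state the noise distribution or the grey levels, so the Gaussian noise model, the levels 140 and 100, the function name and the fixed random seed are all assumptions made for illustration.

```python
import random


def make_mock_image(binary_mask, object_level, background_level, noise_sigma, seed=0):
    """Build an 8-bit mock image from a binary mask and add zero-mean noise."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    image = []
    for mask_row in binary_mask:
        row = []
        for is_object in mask_row:
            clean = object_level if is_object else background_level
            noisy = clean + rng.gauss(0.0, noise_sigma)
            # Round and clip to the 8-bit range.
            row.append(min(255, max(0, round(noisy))))
        image.append(row)
    return image


# Hypothetical ground truth: a 20 x 20 square object in a 50 x 50 background.
ground_truth = [[15 <= y < 35 and 15 <= x < 35 for x in range(50)]
                for y in range(50)]
mock = make_mock_image(ground_truth, object_level=140, background_level=100,
                       noise_sigma=12.0)
```

Because the mock image is generated from a known mask, that mask serves directly as the ground truth when the segmentation result is scored.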

An artificial mock image was prepared with added noise of standard deviation 12, a value similar to that of the original images. The image was processed in four ways: using the noise filtering part of our algorithm (see Fig. 9) combined with single-threshold binarization (Fig. 12a); using multilevel Otsu thresholding after the filtering stage without the second bilateral filter (Fig. 12b); using multilevel Otsu thresholding after the noise filtering part of the algorithm (Fig. 12c); and using the full approach presented in this paper (Fig. 12d). Several statistical measures of binary classification for our algorithm, namely sensitivity, specificity, precision, negative predictive value and accuracy, are presented in Table 1. According to these measures the results are good, and they should improve further once the last stage of the algorithm is modified to binarize more precisely. The test with the artificial mock image shows that the approach with simple thresholding gives weak results, whilst the approach with multilevel Otsu thresholding produces results comparable to the output of our algorithm but still leaves small noise artefacts (Fig. 12). The same methodology applied to a real µCT image shows that the algorithm presented in this paper performs much better on original µCT images than the compared approaches (Fig. 13). Original µCT images are much harder to binarize than the artificial mock images introduced to evaluate the results. Real images have a much more complex object structure, with shadows near borders and various grayscale levels inside objects, sometimes with intensities similar to the background level. The noise also seems to have a much more complex structure than simple random noise with a given standard deviation. The other tested methods fail because they cannot produce uniform, noise-free objects.
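
The five evaluation measures follow directly from the pixel-wise confusion matrix between a segmentation result and the ground truth. The sketch below uses the standard definitions; the function name and the small example masks are illustrative and are not the data behind Table 1.

```python
def binary_segmentation_metrics(predicted, truth):
    """Pixel-wise binary classification measures; object pixels are the positives."""
    tp = fp = tn = fn = 0
    for predicted_row, truth_row in zip(predicted, truth):
        for p, t in zip(predicted_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif not p and t:
                fn += 1
            else:
                tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "negative_predictive_value": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Hypothetical 2 x 5 masks: 1 marks an object pixel, 0 a background pixel.
truth = [[1, 1, 1, 0, 0],
         [1, 0, 0, 0, 0]]
predicted = [[1, 1, 0, 0, 0],
             [1, 1, 0, 0, 0]]
metrics = binary_segmentation_metrics(predicted, truth)
```

In this example three object pixels are found, one is missed and one background pixel is wrongly marked, so sensitivity and precision are both 0.75 and accuracy is 0.8.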

Fig. 12 Fragments of mock test images with noise standard deviation equal to 12, processed by: (a) the filtering part of the authors’ algorithm with single-threshold binarization, (b) the filtering part without the second bilateral filter and with multilevel Otsu thresholding, (c) the filtering part with multilevel Otsu thresholding, (d) the final result of the authors’ algorithm

Table 1 Segmentation evaluation test results

Fig. 13 Fragments of the original input image processed by the authors’ algorithm: (a) the filtering part with single-threshold binarization, (b) the filtering part without the second bilateral filter and with multilevel Otsu thresholding, (c) the filtering part with multilevel Otsu thresholding, (d) the final result of the authors’ algorithm