1 Introduction

Image segmentation plays an important role in image analysis and computer vision. It is a process of partitioning an image into several non-overlapping, homogenous regions containing similar objects. In the literature, there exist several techniques to solve image segmentation problems, namely histogram thresholding, edge detection, clustering and so on [13]. If objects present in the image are distinguishable by their gray values, then the histogram of the image may have many peaks to represent different objects. The potential thresholds can be found at the valley regions of the histogram by applying thresholding technique. Automatic detection of these thresholds is one of the major challenge in the thresholding techniques. Due to the advantages of smaller storage space, fast processing speed and ease in manipulation, histogram thresholding has drawn a lot of attention in many applications [15, 20]. A survey of various thresholding techniques and their applications can be found in [17].

Histogram thresholding techniques can be divided into bi-level [2, 810, 12, 21] and multilevel [1, 3, 5, 23], depending on the number of thresholds required to be detected. Otsu’s method [12] is one of the popular bi-level thresholding technique that selects the optimal threshold by maximizing the between class variance. The minimum error thresholding presented in [10] selects the optimal threshold based on the assumption that the object and background pixels are normally distributed. In [9], the optimal threshold is determined by maximizing the entropy of the object and background pixels. Bi-level histogram thresholding methods based on fuzzy sets theory are presented in [2, 8, 21]. In [23], a multilevel thresholding method for automatic image segmentation is presented. A fast multilevel thresholding based on low-pass and high-pass filter is proposed in [3]. Multilevel thresholding methods based on optimization techniques are found in [1, 5].

The aforementioned thresholding techniques are based on one-dimensional (1D) histogram of the image. Histogram-based thresholding techniques are easy to implement and have low computational burden. However, they do not take into account the spatial contextual information in thresholds selection process. To mitigate this limitation, two-dimensional histogram [16], two-dimensional direction histogram [24], gray level spatial correlation histogram [18] and gray level gradient magnitude histogram [22] have been proposed. Although these techniques produced improved results but lost the basic advantages of 1D histogram-based thresholding techniques.

In this article, a fast context-sensitive multilevel threshold selection technique is presented. To incorporate spatial contextual information in threshold selection process, the proposed technique employs recently defined 1D energy curve [14] of the image. The energy curve has the similar characteristics as 1D histogram of the image. The proposed technique has several advantages: (i) it is context sensitive, (ii) computationally less demanding, (iii) preserve the advantages of 1D histogram-based thresholding techniques (such as smaller storage space, fast processing speed and ease in manipulation) and (iv) able to determine the optimal number of segments in the image without expert knowledge.

The rest of this paper is organized as follows. The proposed technique is presented in Sect. 2. Section 3 provides the detailed description of the experimental settings and the results obtained on the considered images. Finally, Sect. 4 draws the conclusion of this work.

2 Proposed technique

The framework of the proposed technique is presented in Fig. 1. It consists of three steps: (1) initial thresholds selection, (2) detection of optimum number of potential thresholds and (3) detection of optimal value of each potential threshold. Each of these step is elucidated next.

Fig. 1
figure 1

Framework of the proposed technique

2.1 Initial thresholds selection

Histogram does not take into account the spatial contextual information of the image. To incorporate spatial contextual information in threshold selection process, the energy curve of the image defined in [14] is used here.

Let \(I=\{l_{ij},1\le i\le m,1\le j\le n\}\) be an image of size \(m\times n\) where \(l_{ij}\) is the gray value of the image I at pixel position (ij). The energy value \(E_l\) of the image I at gray value l is computed as:

$$\begin{aligned} E_l=-\sum _{i=1}^{m}\sum _{j=1}^{n}\sum _{pq\in {N_{ij}^2}}b_{ij}.b_{pq} + C \end{aligned}$$
(1)

where \(B_l=\{b_{ij}, 1\le i\le m,1\le j \le n\}\) such that \(b_{ij}=1\) if \(l_{ij} > l\); else \(b_{ij}=-1\). \(N_{ij}^2\) represents the second-order neighbor pixels of the pixel at spatial position (ij) and C is a constant that ensures the energy value \(E_l>0\). For more details, the reader may refer [14].

Like histogram, the energy curve of the image also include peaks, which can be separated into number of modes. Each mode is expected to correspond to a region, and there exists a valley between any two adjacent modes. Since the energy curve is generated by taking into an account the spatial contextual information of the image, it is smoother and has better discriminatory capability as compared to the histogram of the image. Figure 2 shows the original House image, its histogram and energy curve. From these figures, one can see that the energy curve of the image is smoother than the histogram of the image; hence, the number of peaks (as well as valleys) that exist in the energy curve is lower than that in the histogram of the image. Thus, energy curve may be a better choice for suitable threshold selection than the histogram not only because of the inclusion of spatial contextual information but also for the quick selection of optimal number of potential thresholds.

Fig. 2
figure 2

a Original House image, b histogram and c energy curve. The circles represent the peaks (local maxima) of the curves

The energy curve-based segmentation technique presented in [14] is unable to detect optimal number of segments of the input image. The technique presented in [19] solves this problem by employing concavity analysis. Since concavity analysis technique skip some valleys of the energy curve, it may fail to detect the optimal number of potential thresholds. To mitigate these limitations, a novel initial threshold selection technique is proposed as follows:

Let E be the energy curve of an image defined over the set of gray level [0, L]. Consider the subset of E say S defined over [\(k_1\),\( k_2\)] such that \(E(k_1)\) and \(E(k_2)\) are the first and last nonzero values of the energy curve E, respectively. Let \(P=\{p_0,p_1,\ldots ,p_{n-1}\}\) be the set of n peaks exist in S. Then the curve S can be partition into \(n+1\) regions, namely \(\{(k_1,p_0),(p_0,p_1),\ldots ,(p_{n-1},k_2)\}\), where each region may contain a threshold to distinguish different objects present in the image. In the proposed technique, the possible \(n+1\) initial thresholds are defined by taking the midpoint gray value of each partition. The number of initial thresholds obtained by this method may be larger than the number of objects present in the image. The method to select the optimum number of potential thresholds and their respective bounds where the optimal value of each potential threshold may exist is presented in the next subsection.

2.2 Detection of optimum number of potential thresholds

In this subsection, a novel technique based on cluster validity measure is proposed to detect the optimal number of potential thresholds. The optimal number of potential thresholds are selected from the list of initial thresholds obtained from the energy curve. It is agglomerative in nature. The proposed technique exhaustively merge two adjacent modes to select optimal number of potential thresholds with the help of a validity measure called Davies–Boulding (DB) index [4]. It is a function of the ratio of the sum of within-object scatter to between object separations. Let \(\omega _1,\omega _2,\omega _3,\ldots ,\omega _k\) be the k objects of a segmented image separated from each other by defining thresholds \(t_1, t_2, t_3,\ldots ,t_{k-1}\), where \(t_1<t_2<t_3<\cdots <t_{k-1}\). Thus, the pixels of the image whose gray values are in the range \([t_{i-1} t_i]\) construct the object \(\omega _i\) of the segmented image. The DB index of the segmented image is computed as:

$$\begin{aligned} \mathrm{DB}= & {} \displaystyle \frac{1}{k}\sum _{i=1}^{k}R_i\nonumber \\ R_i= & {} \displaystyle \max _{j=1,\ldots ,k, i\ne j}\{R_{ij}\}\nonumber \\ R_{ij}= & {} \displaystyle \frac{\sigma _{i}^{2}+\sigma _{j}^{2}}{d^{2}_{ij}}. \end{aligned}$$
(2)

where \(\sigma _i^2\) and \(\sigma _j^2\) are the variances of object \(\omega _i\) and \(\omega _j\), respectively, and \(d_{ij}^2\) is the distance between centers of object \(\omega _i\) and \(\omega _j\). The gray values of the pixels belong to an object are used to compute variance and center of that object. Smaller the DB value, better is the segmentation as a low scatter and a high distance between object gives small values of \(R_{ij}\).

Table 1 Different subsets generated by the proposed technique for the House image

Let \(T_k = \{t_1, t_2, \ldots , t_k\}\) be the set of k initial thresholds obtained from the energy curve. To determine the optimal number of objects present in the image following steps are taken into an account: First, DB index is computed by taking into account all the k thresholds. Then at a time by dropping single threshold from \(T_k\), \(k-1\) subsets each containing \(k-1\) thresholds are generated along with their respective DB index values. From these \(k-1\) subsets, the subset denoted as \(T_{k-1}\) that produced minimum DB index is selected for further analysis. The same procedure is repeated to obtain another subset of \(T_{k-1}\) denoted as \(T_{k-2}\) that contain \(k-2\) thresholds with lowest DB index value. The process is repeated until the subset \(T_1\) consisting of single initial threshold is obtained. After the computation of \(T_{k}, T_{k-1}, T_{k-2},\ldots ,T_{2},\) and \(T_{1}\) along with their corresponding DB index values, the subset \(T_i (i=1,2, \ldots ,k)\) associated with smallest DB index is chosen to select the optimal number of potential thresholds. For the House image, different subsets of potential thresholds generated by the proposed technique are shown in Table 1. Since the subset \(T_4\) correspond to the lowest DB index value, the optimal number of potential thresholds is determined as 4 with the initial thresholds \(t_1=20, t_2=90, t_3=172\) and \(t_4=227\).

Since the initial thresholds are assumed as the middle of every two adjacent peaks of the energy curve, the value of each potential threshold selected may not be optimal. The optimal value of each potential threshold may lie any where within the range of two adjacent peaks to which it belongs. Thus, to define the bounds of each potential threshold where its optimal value may exist, the nearest left and right peaks associated with a threshold are considered as its lower and upper bound, respectively. Figure 3 shows the four potential thresholds of the House image determined using the aforementioned technique and their respective regions where the optimal thresholds can be obtained.

Fig. 3
figure 3

Potential thresholds \(t_1, t_2, t_3\) and \(t_4\) of the House image. The gray region associated with each potential threshold represents the range in which the optimal value can be obtained

Fig. 4
figure 4

Convergence graph of the House image

Fig. 5
figure 5

Original image data set: a Man, b Cameraman, c Fingerprint, d Two Swans, e Peppers, f Lena, g House and h Flinstones

Fig. 6
figure 6

Histograms of the experimental image data set: a Man, b Cameraman, c Fingerprint, d Two Swans, e Peppers, f Lena, g House and h Flinstones

Fig. 7
figure 7

Energy curves of the experimental image data set: a Man, b Cameraman, c Fingerprint, d Two Swans, e Peppers, f Lena, g House and h Flinstones

Table 2 Initial thresholds, potential thresholds and optimal thresholds detected by the proposed technique for different images

2.3 Detection of optimal potential thresholds

To find the optimal (or near optimal) value of each potential threshold within their defined range, GA [6] is employed. Let k be the optimal number of potential thresholds obtained from the energy curve of the image. The initial values of the potential thresholds in a chromosome are taken randomly within their defined ranges. To compute fitness value of each chromosome in the population, DB index presented in Eq. (2) is used. The fitness value computation, selection, crossover and mutation are executed for a certain number of iterations to find optimal value (or near optimal value) of each potential threshold within its defined range. The GA is terminated when the average fitness value of the population becomes stable. Finally, the chromosome in the population that has maximum fitness value (minimum value of DB index) represents the optimal thresholds. These optimal thresholds discriminate the different homogenous regions of the image.

Table 3 Quantitative results obtained by the proposed, the ECCS [14], the FODPSO [5], the PCS [11] and the Kapur [9] methods

The computational complexity of this step is significantly reduced by shrinking the search space of GA. Since the lower and upper bound of each potential threshold are given as input to the GA, the search space becomes narrower. Thus, the termination condition is satisfied in less number of iterations. Figure 4 shows the convergence graph of the House image.

3 Experimental results

To evaluate the effectiveness of the proposed technique, eight different images are considered for the experiments. Figures 5, 6 and 7 show the original images, their histograms and energy curves, respectively. From these figures, one can observe that the energy curve of an image has similar characteristics as that of histogram of the image, i.e., it also has peaks and valleys to discriminate different objects as that of histogram. Since the energy curve is generated by taking into an account the spatial contextual information of the image, it is smoother and has better discriminatory capability as compared to that of histogram of the image. Thus, without loosing the benefits of the histogram for suitable threshold selection, energy curve may be a better choice.

To assess the effectiveness of the proposed technique, results are compared with four state-of-the-art techniques exist in the literature. Since the technique is context sensitive, it has been compared with two GA-based context-sensitive techniques: an energy curve based (referred as ECCS) [14] and a pattern based (referred as PCS) [11]. The proposed technique is also compared with two context-insensitive techniques: a fractional-order Darwinian particle swarm optimization-based (referred as FODPSO) [5] and an entropy-based histogram thresholding (referred as Kapur method) [9].

To assess the effectiveness of the proposed technique, a cluster validity measure S_Dbw index which is not involved in the implementation of the proposed and existing techniques is taken into account [7]. It is based on the cluster compactness in terms of intra-cluster variance and inter-cluster density. The S_Dbw index with C number of clusters denoted as S_Dbw(C) is defined as:

$$\begin{aligned} S\_\mathrm{Dbw}(C)=\mathrm{Scat}(C) + \mathrm{Den}(C) \end{aligned}$$
(3)

where \(\mathrm{Scat}(C)\) and \(\mathrm{Den}(C)\) represent the intra-cluster variance and the inter-cluster density, respectively. The number of clusters that minimizes the S_Dbw index can be considered as an optimal number of the objects present in the image.

The algorithms have been implemented in MATLAB (R2012a). In order to fix different control parameters value of GA, several experiments were performed by varying their values within a wide range. From the experiments, it is found that the proposed technique produced similar results when the population size, crossover probability and mutation probability of GA are varied in the range of [20 80], [0.5 0.9] and [0.05 0.001], respectively. In the present experiments, for all the considered images the population size, crossover probability and mutation probability are set as 20, 0.8 and 0.01, respectively. Stochastic selection strategy is used to select fittest chromosomes from the mating pool. For FODPSO-based thresholding technique, both the individual and social weight of the particles is set as 0.8. The fractional coefficient is set as 0.6.

3.1 Results analysis

The first experiment is devoted to analyze the validity of the thresholds obtained by the proposed technique. Table 2 shows the initial thresholds, optimal number of thresholds and their optimal values obtained by the proposed technique for different images. By analyzing the histograms, energy curves shown in Figs. 6 and 7 and the results reported in Table 2, one can visualize that for all the considered images the optimal thresholds obtained by the proposed technique always pass through the valley region of the energy curves as well as histograms of the images. As an example, for the Peppers image the optimal thresholds 51 and 122 obtained by the proposed technique passes through the valley region of Figs. 6e and 7e. This confirms the validity of the proposed technique.

The second experiment compares the performance of the proposed technique with the ECCS, the FODPSO, the PCS and Kapur methods by using different images. Table 3 reports the optimal threshold and corresponding DB index and S_Dbw index obtained by the proposed, the ECCS, the FODPSO and the Kapur methods considering different images. It also reports the cluster representatives, associated DB index and S_Dbw index obtained by the PCS techniques. From this table, one can observe that the proposed technique always produced better DB index as compared to the ECCS, the FODPSO, the PCS and the Kapur methods. These results are expected as proposed technique minimizes the DB index to obtain optimal thresholds. For fair comparisons, another cluster validity measure called S_Dbw index that has no involvement for the implementation of these techniques is computed. From Table 3, one can observe for the images Cameraman, Fingerprint and Flinstones the proposed technique produced similar S_Dbw index as obtained by the most effective technique. For the other images the proposed technique produced much better S_Dbw index. Figure 8 shows the segmented results of the proposed technique.

Fig. 8
figure 8

Segmented images of the experimental image data set using proposed technique: a Man, b Cameraman, c Fingerprint, d Two Swans, e Peppers, f Lena, g House and h Flinstones

Fig. 9
figure 9

Box plot of 20 runs for all the input images

The third experiment validates the consistency of the proposed technique. Figure 9 shows box plot of all the images for 20 runs. Form this figure, one can see that the widths of the boxes are thin. This indicate the standard deviation obtained for each considered image is very small which ensures the consistency of the proposed technique.

Table 4 Computational time (in seconds) taken by the different techniques for all the input images

The fourth experiment deals with the computational time required by the different techniques for the experimental images. All the experiments were carried out on a PC [INTEL(R) Core(TM)2 Duo 2.0 GHz with 2.0 GB of RAM]. As explained earlier, the proposed technique narrows down (shrinks) the search space of GA by defining lower and upper bound of each detected potential threshold. Thus, the termination criteria are satisfied in less number of iterations. This resulted faster convergence of GA as compared to the conventional optimization techniques. Table 4 shows the computational time taken by the different techniques for all eight images. From this table, it can be observed that the computational time required by the proposed technique is significantly lower than the ECCS, the FODPSO and the PCS techniques. The Kapur method is computationally efficient for detecting small number of thresholds. When the number of thresholds increases, its computational complexity also increases exponentially.

4 Conclusion

In this article, a context-sensitive fast threshold selection technique is proposed for solving image segmentation problems. To incorporate spatial contextual information in threshold selection process, the technique analyzed energy curve of the image recently proposed in the literature [14]. The proposed technique has several advantages: (i) it is context sensitive, (ii) it is computationally less demanding, (iii) it preserves the advantages of 1D histogram-based thresholding techniques and (iv) it is able to determine optimal number of segments present in the image.

To assess the effectiveness of proposed technique, the results obtained by it are compared with the four state-of-the-art methods. Experimental results on large number of images confirmed the effectiveness of the proposed technique.

As future developments of this work, we plan to explore different cluster validity criteria to improve the results. Variants of evolutionary approaches can also be explored to achieve the improved results.