Keywords

1 Introduction

Image segmentation implies the procedure of grouping the similar pixels of an image into distinctive sets or sections, has a variety of applications in a variety of domains [1, 2]. Though different segmentation algorithms are available in the literatures [1], still image segmentation is a challenging task because, segmentation objective for different application is different. Not only that, presence of different textures in the image and improper illumination makes the segmentation more difficult. Discontinuity and similarity/homogeneity are two basic properties of the image pixels used in many segmentation methods.

Cluster based segmentation is significant approaches among the different approaches of segmentation available in literature. FCM, a clustering algorithm is widely used for image segmentation [3, 4]. Xia [5] defined the process of image segmentation as clustering of spatial patterns, as image can be modeled as set spatial patterns on rectangular lattice. Zhang and Jiang [6] proposed a Gaussian kernel function based weighted FCM algorithm which considered the effect of neighboring pixels to make the algorithm robust towards noisy images. Shasidhar et al. [7] proposed a modified FCM algorithm to produce better results on normal as well as noisy images.

In [8] Yannis and Stavros proposed an adaptive fuzzy clustering algorithm for image segmentation with neighborhood information. Cai et al. [9] proposed a fast and robust modification of standard FCM which incorporated a spatial-grey similarity measure to obtain robustness to noise and improved segmentation results.

Though a various variation of FCM is available in literature all of them require the number of clusters to be specified manually. Also the algorithms do not consider the spatial features of the pixels as a strong parameter in segmentation, which is a major drawback as different images have different area of interest. Hence a method which can segment color images in a very efficient way and also doesn’t require any type of previous knowledge (e.g. number of segments) about the image had to be devised. So in this paper a novel method has been proposed to automatically segment images which incorporate the spatial feature of the image pixels.

Section 13.2 describes the proposed algorithm and the method of segmentation. Experimental results have been discussed in Sect. 13.3 and finally Sect. 13.4 concludes the paper.

2 The Proposed Method

Determining the number of clusters is a crucial task for segmenting an image correctly. Though some unsupervised learning based [10] and fuzzy method based algorithms [3, 4] are available for segmenting images, no such methods are available which first determine the number of segments automatically and then cluster the image. To identify the number of segments, cluster validation indices based method is used.

2.1 FCM Algorithm

The FCM algorithm is an unsupervised clustering method; this algorithm makes soft partitions where a datum can belong to different clusters with a different membership degree to each cluster. This clustering method is an iterative algorithm which uses the necessary condition to achieve the minimization of the objective function J m for fuzzyfier [11] ‘m’ as given in Eq. (13.1).

$$ {\text{J}}_{\text{m}} = \mathop \sum \limits_{{{\text{i}} = 1}}^{\text{c}} \mathop \sum \limits_{{{\text{j}} = 1}}^{\text{n}} {\text{U}}_{\text{ij}}^{\text{m}} \left\| {{\text{x}}_{\text{j}} - {\text{v}}_{\text{i}} } \right\|^{2} $$
(13.1)

Where x 1 , x 2 , , x n are the values of the image pixels, v 1 , v 2 , , v c are the centers of the clusters, U ij is the degree of membership of pixel x j in the cluster v i , ‘c’ is the number of clusters to be produced and ‘n’ is the number of data points available as input. The parameter ‘m’ is called the fuzzyfier which controls how fuzzy the clusters are going to be; here m = 2 has been used. The above objective function is minimized when datum close to the center points are assigned higher values in the U matrix. The elements of U are fractions between 0 and 1 which are the probability of how much a given datum belongs to a specific cluster. At the first iteration the membership matrix is initialized and the centroids are chosen as random data points. The cluster centroids V = [v 1 , v 2 , …, v c ] for 1 ≤ i ≤ c is updated using Eq. (13.2).

$$ {\text{v}}_{\text{i}} = \frac{{\mathop \sum \nolimits_{{{\text{j}} = 1}}^{\text{n}} ({\text{U}}_{\text{ij}} )^{\text{m}} {\text{x}}_{\text{j}} }}{{\mathop \sum \nolimits_{{{\text{j}} = 1}}^{\text{n}} ({\text{U}}_{\text{ij}} )^{\text{m}} }} $$
(13.2)

And the membership matrix U for each of v i (1 ≤ i ≤ c) is updated using Eq. (13.3).

$${\text{U}}_{{{\text{ij}}}} = \frac{1}{{\sum\nolimits_{{{\text{k}} = 1}}^{{\text{c}}} {(\left\| {{\text{x}}_{{\text{j}}} - {\text{v}}_{{\text{i}}} } \right\|/\left\| {{\text{x}}_{{\text{j}}} - {\text{v}}_{{\text{k}}} } \right\|)^{{\frac{{ - 2}}{{({\text{m}} - 1)}}}} } }}$$
(13.3)

The distance function \(\| x_{j} - v_{i} \| \) used in the above equations is the Euclidian distance between the two data points. The above process is repeated until the algorithm converges to a locally minimum solution for v i . Convergence can be tested by the difference in the membership matrix in each iteration; \( \mathop {\hbox{max} }\limits_{ij} \left\{ {\left| {U_{ij}^{k + 1} - U_{ij}^{k} } \right|} \right\} < \varepsilon \), where ε is a termination criterion, 0 < ε < 1 and k is the iteration step. In this way the algorithm iteratively converges on the most optimum cluster centers.

2.2 Determining the Number of Clusters

Cluster validity index provides an objective measurement of a clustering result and its optimal value is often used to indicate the best possible choice for the number of clusters [12]. Thus cluster validation indices are used here for identifying the optimal number of segments in an image. Since the use of single validation index may lead to biasness, five widely used cluster validity indices Partition coefficient (PC) [12], Partition entropy (PE) [11, 13], Xie-Beni index (XB) [14], Kwon’s index (K) [15], Fuzzy hyper volume (FH) [16] are used here.

To find out the number of segments present in the image we first cluster the image using the standard FCM assuming the cluster values 2 through 10. Five different cluster validation indices are computed for each segmented image, obtained by segmenting the image using standard FCM taking cluster values 2, 3, … , 10 respectively. Then for each of the cluster validation indices optimal number of cluster is determined which gives the best value (minimum value for PE, XB, K, FH and maximum value for PC) of that index. The optimal number of clusters is computed by the majority of five indices consider here. If tie occur then optimal number of cluster is considered as the number of cluster determined by XB.

2.3 Modified Spatial Constraint Based FCM

Once the number of segment is identified, next task is to segment the image into the appropriate number of segment. To do this modified FCM, which includes the spatial information along with the color level information is proposed here.

The basic method for clustering the data in the proposed algorithm differs from the standard FCM in the way it measures the distance between two pixels. As FCM is not optimized for color image segmentation and can only take grey values. If all three R, G, B values are used in clustering inspite of using only one grey value, it will produce better clusters. Also the standard FCM neglects the spatial feature of the pixels. Here in the proposed algorithm distance function is incorporated in a manner so that it utilize the spatial feature of the pixels; here \( \left\| {p - v} \right\| \) is the Euclidian distance between two given pixels (on individual properties), which is calculated using Eq. (13.4).

$$ \begin{aligned} \left\| {p - v} \right\| &= \left( {a \times \sqrt {\left( {p_{X} - v_{X} } \right)^{2} \,+\, \left( {p_{Y} - v_{Y} } \right)^{2} } } \right) \\ & \quad + \left( {b \times \sqrt {\left( {p_{R} - v_{R} } \right)^{2}\, +\, \left( {p_{G} - v_{G} } \right)^{2}\, +\, \left( {p_{B} - v_{B} } \right)^{2} } } \right) \\ \end{aligned} $$
(13.4)

Here x and y suffixes are used to denote the X and Y coordinate values of the pixel respectively and suffixes R, G, and B signify the Red, Green and Blue color component values of the pixel respectively. The weighting parameters ‘a’ and ‘b’ are used to control the influence of the spatial and the color components of the image in the clustering process determined as follow.

2.4 Determining the Values for ‘a’ and ‘b’

The values of the parameters ‘a’ and ‘b’ were determined by evaluating large number of standard images from Berkeley segmentation dataset [17]. The segmentation result obtained by using different values of the parameters ‘a’ and ‘b’ were evaluated using Jaccard index [18]. The result shows that the number of clusters suggested by cluster validity indices varied with the number of picks present in the histogram of the image. Thus experimentally, for an image that has only one prominent peak in the histogram the parameter values are chosen as a = 0.4, b = 0.6. For images having two or three prominent peaks in their histogram a = 0.3, b = 0.7 is used. And for images having more than 3 peaks in their histograms a = 0.2, b = 0.8 is used.

2.5 The Proposed Segmentation Method

The procedure of segmenting an image is described step by step below.

  1. Step 1:

    Cluster the given image using standard FCM.

  2. Step 2:

    Estimate the values of different validation indices for this cluster.

  3. Step 3:

    Repeat step 1 and 2 assuming number of clusters to be 2, 3… 10.

  4. Step 4:

    For each of the cluster validation indices optimal number of cluster is determined which gives the best value (minimum value for PE, XB, K, FH and maximum value for PC) of that index.

  5. Step 5:

    Optimal number of clusters is computed by the majority of five indices consider here. If tie occur then optimal number of cluster is consider as the number of cluster determined by XB.

  6. Step 6:

    Determine the number of peaks in histogram and assign values to parameters ‘a’ and ‘b’.

  7. Step 7:

    Use MSCFCM to segment the image taking optimum number of clusters determined in step 5.

The procedure can best be understood by looking at the flow-chart in Fig. 13.1.

Fig. 13.1
figure 1

Flow chart describing the proposed method

3 Experimental Data Analysis and Results

The proposed Modified spatial constraint based FCM (MSCFCM) algorithm has been applied on 600 images available in the Berkeley image segmentation dataset [17]. For explaining the procedure the aircraft image labeled as ‘3096.jpg’ in the Berkeley image Segmentation dataset is used and shown in Fig. 13.2a. Taking the cluster value from 1 to 10 and segmenting it using standard FCM, five different cluster validation indices (PC, PE, XB, K and FH) are computed and results are given in Table 13.1.

Fig. 13.2
figure 2

Performance of the algorithms for ‘3096.jpg’: a ‘3096.jpg’ original image. b O/P of the MSCFCM. c O/P of Otsu’s method. d O/P of FCM. e O/P of K-means

Table 13.1 Validation index values for ‘3096.jpg’

From the column two of Table 13.1, it is found that optimum value (maximum value) of PC is 0.9455 which corresponds to number of cluster = 2. Similarly optimum number of cluster for PE, FH, XB and K are 2, 3, 2, and 2 respectively, obtained by considering the minimum values of the columns 3, 4, 5, and 6 respectively as smaller values of these indices represents better cluster. The indices PC, PE, XB and K gives the vote for number of cluster = 2, which clearly shows the majority for cluster 2, and is treated as the optimal cluster value. Then the MSCFCM is applied on the image taking number of cluster = 2 and the result is shown in Fig. 13.2b.

To show the efficiency of the proposed method, the image in Fig. 13.2a is segmented using the proposed method, the existing Otsu’s method [19], the standard fuzzy C-Means algorithm [11] and the K-means algorithm [20] and the results are shown in Fig. 13.2b–e respectively. The noises (unwanted pixels) produce by different methods are shown by arrows in the Fig. 13.2.

3.1 Performance Comparison

In this section we present an example image segmentation samples using all the four algorithms compared above. The objective evaluation of the proposed method is done using the Jaccard Index [18] computed suing the Eq. (13.5).

$$ J\left( {A,B} \right) = \frac{A\mathop \cap \nolimits B}{A\mathop \cup \nolimits B} $$
(13.5)

Where A is the segmented ground truth images (the ‘correct’ segmentation) given in the Berkeley image segmentation dataset of the image and B is the segmented image generated by the segmentation algorithm. The JI values computed for the MCSFCM, Otsu’s, K-means and standard FCM segmentation algorithm using Berkeley image segmentation dataset is shown in Fig. 13.3.

Fig. 13.3
figure 3

Chart showing performance comparison of the MSCFCM method with others

Clearly the graph shows the superiority of the proposed MSCFCM over the existing Otsu’s, FCM and K-means algorithms.

4 Conclusion and Future Work

The paper presents a method for automatic color image segmentation using cluster validity indices. Some well-known validation indices have been used to accurately calculate the optimal number of clusters present in an image. It is observed that the proposed method is able to automatically segment given images into an optimal number of segments. Performance of the proposed MSCFCM was compared with well-known segmentation algorithms such as Otsu’s, standard FCM, and K-means using standard data set, and the results are satisfactorily better than those algorithms. proposed method is evaluated using the standard image set, applicability of the method for a specific field is also open. The improvements may also be done by using the other validation indices and selecting the value of parameter ‘a’ and ‘b’ in the Eq. (13.4) based on the application field.