1 Introduction

Clustering is an unsupervised learning task since it groups data objects into clusters without any prior information such as class labels. The clustering techniques, thus, should be able to deduce the structure embedded in data without any extra information. Clustering algorithms have been successfully applied in several fields such as information retrieval [15, 36], medicine [19], biology [18], customer analysis [30], image segmentation [38] and many others. Clustering has been an area of active research and many clustering algorithms have been proposed in the literature.

The most widely used clustering algorithm is k-means, proposed by J. MacQueen in 1967 [20]. The k-means algorithm is straightforward, simple to implement and has been employed by several researchers [9, 14, 17]. However, it is easily trapped in a local optimum and may fail to reach a global optimum, since its performance depends heavily on the initial centroids chosen. To overcome this problem, several heuristic-based approaches have been proposed for clustering. Selim and Alsultan [32] provided a simulated annealing (SA) algorithm for clustering in 1991 and demonstrated that it converges to a global optimum for the clustering problem. Maulik and Bandyopadhyay [22] presented a clustering technique based on the genetic algorithm (GA), known as GA-clustering, in 2000. The centers of a pre-defined number of clusters were encoded using chromosomes, and the improved performance of GA-clustering over k-means was demonstrated on three real datasets. A tabu search based method was presented for solving the clustering problem in [1, 35]. Shelokar et al. presented an Ant Colony Optimization (ACO) based technique for optimally assigning objects to a pre-defined number of clusters in 2004 [33]. The ACO-based technique provided very promising results when compared with other heuristic methods such as GA, SA and tabu search. Fathian et al. proposed a clustering algorithm based on honeybee mating optimization (HBMO) in 2007 [8]. The HBMO-based approach performed better than SA, GA, tabu search and ACO when evaluated over several well-known datasets. Ching-Yi et al. provided a Particle Swarm Optimization (PSO) based approach for clustering in 2004 [6]. They compared the performance of the PSO-based approach with traditional clustering algorithms and demonstrated, using four artificial datasets, that the PSO-based approach performed better. Hatamlou et al. applied the Gravitational Search Algorithm (GSA) to data clustering in 2011 [11]. The results over four well-known datasets showed that the GSA-based approach performed better than several other clustering algorithms, namely PSO, HBMO, ACO, GA, SA and k-means. In 2012, Hatamlou et al. presented a clustering technique combining the benefits of k-means with GSA, called GSA-KM [12]. In the GSA-KM approach, the initial population for GSA was generated with the help of k-means, which allowed GSA to converge faster. When compared with other well-known algorithms, such as k-means, GA, SA, ACO, HBMO, PSO and conventional GSA, GSA-KM provided better results over several real datasets.

Manual segmentation of MRI images [26] is cumbersome and time consuming. It also involves variability that depends on the individual examining the results [37]: it may vary from one observer to another and also within the same observer. Although manual segmentation by an expert has proven to be of superior quality, automated methods can be very advantageous in dealing with such variations and in handling large data. There is therefore a need to develop appropriate automated or semi-automated systems to perform segmentation of medical images as required. Different image segmentation techniques, both supervised and unsupervised, have been proposed and applied to numerous real-world applications [24]. Unsupervised segmentation techniques require less human intervention in obtaining clinically useful results. We are concerned with unsupervised techniques based on soft clustering, in which an object can belong to more than one cluster with varying degrees of membership. Various techniques for MRI brain image segmentation have been proposed, and it remains an active area of research. Ortiz et al. proposed a hybrid approach based on a growing hierarchical self-organising map classifier and a probability clustering method [28]. Chen et al. modified the objective function of the fuzzy c-means (FCM) algorithm to compensate for intensity inhomogeneities by using basis functions and to compensate for noise by using improved non-local information [4]. To overcome the problem of the FCM algorithm becoming trapped in local minima, Benaichouche et al. used a metaheuristic optimization technique for initialisation [2] and incorporated the Mahalanobis distance in the objective function to reduce the effect of the geometrical shapes of different clusters. Mahmood et al. proposed a framework [21] integrating Bayesian-based adaptive mean shift, a priori spatial tissue probability maps and FCM. Verma et al. proposed an improved intuitionistic FCM algorithm [10] for brain image segmentation, which considers local spatial information in an intuitionistic fuzzy way. The algorithm handles uncertainty with the help of the intuitionistic approach and handles noise by including local information. Roy et al. applied a three-stage repetitive level set segmentation method [34] with three membership functions to find three brain regions. A Convolutional Neural Network based method [23] was proposed by Moeskops et al. for automatic segmentation of brain images into tissues. Chen et al. proposed an improved FCM clustering method [5] that uses non-local information and includes the covariance and the prior probability in the distance function. Kalaiselvi et al. also modified the FCM algorithm by incorporating a spatial parameter for minimizing the objective function of conventional FCM and a new weighting parameter for centroid initialization [16]. A generalized rough intuitionistic FCM algorithm [25] was proposed by Namburu et al. to overcome the dependency on the membership function and parameter tuning. An improved MRI brain image segmentation algorithm using fuzzy unsupervised learning was proposed by Keyvan et al. in 2019 [31]. In 2020, Verma et al. [27] proposed a new brain image segmentation technique based on an intuitionistic approach that incorporates the advantages of fuzzy co-clustering.

GSA uses a constant value of the parameter α for the calculation of the gravitational constant. In the beginning, a smaller value of α allows for greater exploration of the search space, while a higher value of α during the last few iterations enhances exploitation. Therefore, the GSA-based approach can be improved by adapting and controlling the value of α as the algorithm proceeds. To this end, this paper proposes a novel algorithm called Fuzzy-GSA, based on the Gravitational Search Algorithm (GSA). The proposed approach uses fuzzy rules for controlling the parameter α in GSA as the search progresses. The proposed algorithm is also applied to the segmentation of brain images. The segmentation of human brain images from magnetic resonance imaging into three brain tissues, namely cerebrospinal fluid, gray matter and white matter, is one of the important components in computer-aided diagnosis and neuroscience research. It helps to detect different diseases such as tumors, edema, Alzheimer’s disease and schizophrenia. Due to the complicated structure of the human brain and the absence of well-defined boundaries between different tissues, segmentation of brain images is a very difficult task. Hence, work on the segmentation of brain images can have a good impact in the clinical environment. One major challenge in brain image segmentation is to obtain distinct clusters, and the results in Section 4 show how the proposed algorithm addresses it.

The rest of the paper is organized as follows. In Section 2, we present an overview of the GSA proposed by Rashedi [29]. Section 3 describes the proposed clustering approach, Fuzzy-GSA, where Section 3.1 describes the developed fuzzy inference system and Section 3.2 presents the proposed algorithm for clustering. In Section 4, we discuss the experimental results and the comparison with other clustering approaches for brain MRI image segmentation. Finally, we provide the conclusions of this research in Section 5.

2 Background

2.1 Gravitational search algorithm

GSA is an optimization algorithm proposed by Rashedi [29] in 2009. It is based on Newton’s laws of gravity and motion. The law of gravity states that “Every particle in the universe attracts every other particle with a force that is directly proportional to the product of the masses of the particles and inversely proportional to the square of the distance between them”. By this definition, the gravitational force is determined using the following equation [29]:

$$\text{F}=\text{G} \frac{{\text{M}}_{1}{\text{M}}_{2}}{{\text{R}}^{2}}$$
(1)

where F is the gravitational force acting between two masses \({\text{M}}_{1}\) and \({\text{M}}_{2}\), G is the gravitational constant with a value of \(6.67259 \times {10}^{-11}\ \text{N}\,{\text{m}}^{2}/{\text{kg}}^{2}\), and R is the distance between the two masses.

Newton’s second law of motion states that when a force acts on a mass, acceleration is produced. The magnitude of acceleration produced is obtained using the Eq. 2 below [27]:

$$ \mathrm{a}=\frac{\mathrm{F}}{\mathrm{M}} $$
(2)

where F and M denote the net force acting on a given particle and its mass, respectively.

The GSA employs this physical phenomenon for solving optimization problems. Consider a system with N masses or agents. The position of ith mass is defined as:

$$\begin{array}{*{20}c}{\text{X}}_\text{i}=\left(\text{x}_\text{i}^1,\dots,\text{x}_\text{i}^\text{d},\dots,\text{x}_\text{i}^\text{n}\right), & \mathrm{for} \ \;i\;=\;1,\;2,\dots,\;N,\end{array}$$
(3)

where, \({\text{x}}_{\text{i}}^{\text{d}}\) is the position of ith agent in dth dimension and n is the total number of dimensions in the search space. The positions of agents correspond to the solutions of the problem. The mass of each agent is computed, after evaluating the present population’s fitness, using the following equations:

$${\text{m}}_{\text{i}}\left(\text{t}\right)=\frac{{\text{fit}}_{\text{i}}\left(\text{t}\right)- \text{worst}\left(\text{t}\right)}{\text{best}\left(\text{t}\right)- \text{worst} \left(\text{t}\right)}$$
(4)
$${\text{M}}_{\text{i}}\left(\text{t}\right)=\frac{{\text{m}}_{\text{i}}\left(\text{t}\right)}{\sum _{\text{j}=1}^{\text{N}}{\text{m}}_{\text{j}}\left(\text{t}\right)}$$
(5)

where \({\text{f}\text{i}\text{t}}_{\text{i}}\left(\text{t}\right)\) denotes the fitness value of the ith agent at time t, and best(t) and worst(t) are computed as follows (for minimization problems):

$$\begin{array}{*{20}c}\text{best}\left(\text{t}\right)={\text{min fit}}_\text{j}\left(\text{t}\right)\text{,} & \text{j}\;=\;1,\;2,\dots,\;\text{N}\end{array}$$
(6)
$$\begin{array}{*{20}c}\text{worst}\left(\text{t}\right)=\text{max} \ {\text{fit}}_\text{j}\left(\text{t}\right), & \text{j}\;=\;1,\;2,\dots,\;\text{N}\end{array}$$
(7)

Similarly, for maximization problems best(t) and worst(t) are computed by taking the maximum and minimum fitness values respectively.

The acceleration of an agent is then computed by considering the total force exerted by a set of heavier masses, using the laws of gravity and motion (Eqs. 8 and 9). The new velocity of the agent is obtained by adding a fraction of its current velocity to its acceleration (Eq. 10), followed by the calculation of its new position (Eq. 11).

$${\text{F}}_{\text{i}}^{\text{d}}\left(\text{t}\right)=\sum _{\text{j} \in \text{k}\text{best}, \text{j} \ne \text{i}}{\text{rand}}_{\text{j}}\text{G}\left(\text{t}\right)\frac{{\text{M}}_{\text{j}}\left(\text{t}\right){\text{M}}_{\text{i}}\left(\text{t}\right)}{{\text{R}}_{\text{ij}}\left(\text{t}\right)+ {\varepsilon }}\left({\text{x}}_{\text{j}}^{\text{d}}\left(\text{t}\right)- {\text{x}}_{\text{i}}^{\text{d}}\left(\text{t}\right)\right)$$
(8)
$${\text{a}}_{\text{i}}^{\text{d}}\left(\text{t}\right)= \frac{{\text{F}}_{\text{i}}^{\text{d}}\left(\text{t}\right)}{{\text{M}}_{\text{i}}\left(\text{t}\right)}=\sum _{\text{j} \in \text{k}\text{best}, \text{j} \ne \text{i}}{\text{rand}}_{\text{j}}\text{G}\left(\text{t}\right)\frac{{\text{M}}_{\text{j}}\left(\text{t}\right)}{{\text{R}}_{\text{ij}}\left(\text{t}\right)+ {\varepsilon }}\left({\text{x}}_{\text{j}}^{\text{d}}\left(\text{t}\right)- {\text{x}}_{\text{i}}^{\text{d}}\left(\text{t}\right)\right)$$
(9)
$${\text{v}}_{\text{i}}^{\text{d}}\left(\text{t}+1\right)={\text{rand}}_{\text{i}}\times {\text{v}}_{\text{i}}^{\text{d}}\left(\text{t}\right)+ {\text{a}}_{\text{i}}^{\text{d}}\left(\text{t}\right)$$
(10)
$${\text{x}}_{\text{i}}^{\text{d}}\left(\text{t}+1\right)={\text{x}}_{\text{i}}^{\text{d}}\left(\text{t}\right)+ {\text{v}}_{\text{i}}^{\text{d}}\left(\text{t}+1\right)$$
(11)

where \({\text{r}\text{a}\text{n}\text{d}}_{\text{i}}\) and \({\text{r}\text{a}\text{n}\text{d}}_{\text{j}}\) are two random numbers uniformly distributed in the range [0, 1], \({\upepsilon }\) is a small value to prevent division by zero, and \({\text{R}}_{\text{i}\text{j}}\left(\text{t}\right)\) is the Euclidean distance between agent i and agent j. kbest is the set of the first K agents with the best fitness values and thus the largest masses. kbest depends on time: it is initialized to Ko at the start and decreases as time progresses. The gravitational constant, \(\text{G}\left(\text{t}\right)\), decreases with time to control the search accuracy. The value of \(\text{G}\left(\text{t}\right)\) is calculated using Eq. 12:

$$\text{G}\left(\text{t}\right)={\text{G}}_{\text{o}}{\text{e}}^{\frac{-\alpha \text{t}}{\text{T}}}$$
(12)

where, Go is the initial value of gravitational constant, α is a parameter which governs the degree of exploration versus exploitation of the search and T is the maximum number of iterations.
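The update described by Eqs. 4 to 12 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names `masses` and `gsa_step`, the default α = 20, the choice of one scalar random number per attracting agent, and the handling of the degenerate case best(t) = worst(t) are all assumptions made for the example.

```python
import math
import random


def masses(fit):
    """Normalized masses for a minimization problem (Eqs. 4-7)."""
    best, worst = min(fit), max(fit)
    if best == worst:
        # Assumption: when all agents are equally fit, give them equal mass.
        m = [1.0] * len(fit)
    else:
        m = [(f - worst) / (best - worst) for f in fit]   # Eq. 4
    total = sum(m)
    return [mi / total for mi in m]                        # Eq. 5


def gsa_step(X, V, fit, t, T, G0=100.0, alpha=20.0, k_best=None,
             eps=1e-12, rng=random):
    """One GSA iteration: returns new positions, new velocities, masses and G(t)."""
    N, n = len(X), len(X[0])
    M = masses(fit)
    G = G0 * math.exp(-alpha * t / T)                      # Eq. 12
    k_best = N if k_best is None else k_best
    # kbest: the K agents with the lowest fitness, i.e. the largest masses.
    heavy = sorted(range(N), key=lambda j: fit[j])[:k_best]
    X_new, V_new = [], []
    for i in range(N):
        acc = [0.0] * n
        for j in heavy:
            if j == i:
                continue
            R = math.dist(X[i], X[j])                      # Euclidean distance R_ij
            w = rng.random() * G * M[j] / (R + eps)
            for d in range(n):
                acc[d] += w * (X[j][d] - X[i][d])          # Eq. 9
        r = rng.random()                                   # rand_i in Eq. 10
        v = [r * V[i][d] + acc[d] for d in range(n)]       # Eq. 10
        V_new.append(v)
        X_new.append([X[i][d] + v[d] for d in range(n)])   # Eq. 11
    return X_new, V_new, M, G
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible. Because the acceleration in Eq. 9 no longer contains the agent's own mass, the worst agent (mass 0) still moves, but it attracts no one.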

3 Proposed work

This paper proposes a novel algorithm for segmentation of MRI brain images, called Fuzzy-GSA, based on the Gravitational Search Algorithm (GSA) described in Section 2. GSA uses a constant value of the parameter α for the calculation of the gravitational constant. In the beginning, a smaller value of α allows for greater exploration of the search space, whereas a higher value of α during the last few iterations enhances exploitation. The GSA-based approach can therefore be improved by adapting the value of α as the algorithm proceeds, and the proposed approach uses fuzzy inference rules to control α as the search progresses. In this work, an automatic framework is proposed for segmenting brain MRI images into the tissue classes white matter, gray matter and cerebrospinal fluid. This section is divided into two subsections. Section 3.1 describes the Fuzzy Inference System (FIS) developed, and Section 3.2 presents the proposed Fuzzy-GSA algorithm for clustering.

3.1 The developed Fuzzy inference system

The FIS is developed with two input variables and one output variable. The input variables are as follows:

  • IT: The current iteration number.

  • Fbest: The best value of fitness achieved till the current iteration.

IT enables us to consider how far the search has progressed. During the initial iterations, i.e. when IT is low, a lower value of α is desired: the lower the value of α, the higher the gravitational constant G(t) (Eq. 12) and thus the higher the force F (Eq. 8), resulting in a higher acceleration a (Eq. 9) and velocity v(t) (Eq. 10). This allows for greater exploration at the beginning of the search. Similarly, towards the final few iterations, i.e. when IT is high, a higher value of α is desired to promote exploitation. Figure 1 depicts the membership function for IT. The iterations are represented as a fraction of the maximum number of iterations allowed, such that 0.5 means half of the total iterations and 1 represents the maximum number of iterations.
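As a purely illustrative calculation of how strongly α scales the gravitational constant in Eq. 12, take the initial value 100 for the gravitational constant, evaluate at the halfway point t = T/2, and compare two hypothetical values α = 10 and α = 40 (these two values are assumptions chosen for the example, not settings used in this work):

$$\text{G}\left(\tfrac{\text{T}}{2}\right)\Big|_{\alpha =10}=100\,{\text{e}}^{-5}\approx 0.674,\qquad \text{G}\left(\tfrac{\text{T}}{2}\right)\Big|_{\alpha =40}=100\,{\text{e}}^{-20}\approx 2.06\times {10}^{-7}$$

The smaller α leaves forces, and hence step sizes, several orders of magnitude larger at the same iteration, which is why a low α favours exploration and a high α favours exploitation.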

Fig. 1 Membership Function for IT

Fig. 2 Membership Function for Fbest

Fbest represents the lowest value of fitness achieved up to the current iteration, since clustering is a minimization problem with the mean square error as the fitness function. If the value of Fbest is high, we need to reduce α to promote greater exploration, since higher values of Fbest mean that we are still far from the solution. However, if Fbest is low, we should increase α to allow for greater exploitation, as we are near the solution. Figure 2 shows the membership function for Fbest. Note that the membership function for Fbest needs to be tuned to the input dataset being considered, since the acceptable values of the fitness function vary across datasets.

To obtain the marked value M5 in Fig. 2, we executed five independent runs of GSA for a single iteration and equated M5 to the maximum value of Fbest obtained, after adding hundred and then rounding it off to the nearest hundred. For the value of M3, we considered the integer part of the best fitness value obtained using GSA, taken from [11], for that dataset. M4 was computed by adding one to the value of M3, and M2 was calculated by rounding off M3 to the nearest ten smaller than M3. Finally, M1 was obtained by subtracting ten from M2.
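The procedure for deriving the marked values M1 to M5 can be written down as a short helper. This is a sketch under assumptions: the function name `fbest_breakpoints` is hypothetical, ties are rounded half up, and "the nearest ten smaller than M3" is read as flooring M3 to a multiple of ten. The numbers in the usage line are made-up inputs for illustration, not values from the experiments.

```python
import math


def fbest_breakpoints(first_iter_fbest, gsa_best_fitness):
    """Return (M1, M2, M3, M4, M5) for the Fbest membership function.

    first_iter_fbest: Fbest values from independent single-iteration GSA runs.
    gsa_best_fitness: best fitness reported for plain GSA on the same dataset.
    """
    # M5: largest first-iteration Fbest, plus 100, rounded to the nearest hundred.
    m5 = math.floor((max(first_iter_fbest) + 100) / 100 + 0.5) * 100
    m3 = int(gsa_best_fitness)      # integer part of the GSA best fitness
    m4 = m3 + 1
    m2 = (m3 // 10) * 10            # assumption: floor M3 to a multiple of ten
    m1 = m2 - 10
    return m1, m2, m3, m4, m5


# Hypothetical inputs, for illustration only.
print(fbest_breakpoints([412.3, 388.9, 401.0, 395.5, 420.7], 96.7))
```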

Note that the fitness function, representing the total mean square error or the sum of intra-cluster distances, is computed using the following equation [11]:

$$\text{f}(\text{O},\text{C})=\sum _{\text{l}=1}^{\text{k}}\sum _{{\text{O}}_{\text{i}}\in {\text{C}}_{\text{l}}}{\text{D}({\text{O}}_{\text{i}},{\text{CC}}_{\text{l}})}^{2}$$
(13)

where \({\text{C}\text{C}}_{\text{l}}\) represents the centroid of the cluster \({\text{C}}_{\text{l}}\), and \(\text{D}\left({\text{O}}_{\text{i}},{\text{C}\text{C}}_{\text{l}}\right)\) denotes the distance or dissimilarity between object \({\text{O}}_{\text{i}}\) and cluster centroid \({\text{C}\text{C}}_{\text{l}}\). The most popular and widely used distance metric is the Euclidean distance, which we have used in this work. The Euclidean distance between two objects \({\text{X}}_{\text{i}}\) and \({\text{X}}_{\text{j}}\) with d dimensions is calculated as:

$$\text{D}({\text{X}}_{\text{i}},{\text{X}}_{\text{j}})=\sqrt{\sum _{\text{p}=1}^{\text{d}}{({\text{x}}_{\text{i}}^{\text{p}}-{\text{x}}_{\text{j}}^{\text{p}})}^{2}}$$
(14)

where, \({\text{x}}_{\text{i}}^{\text{p}}\) denotes the value of pth dimension for the object \({\text{X}}_{\text{i}}\) and \({\text{x}}_{\text{j}}^{\text{p}}\) denotes the value of pth dimension for the object \({\text{X}}_{\text{j}}\).
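Eqs. 13 and 14 translate directly into code. The sketch below follows Eq. 13 as written (squared Euclidean distances) and assumes that each object belongs to the cluster of its nearest centroid, which the text does not state explicitly; the function name `fitness` is chosen for the example.

```python
import math


def fitness(objects, centroids):
    """Sum over all objects of the squared Euclidean distance (Eq. 14)
    to the nearest centroid, i.e. Eq. 13 with nearest-centroid assignment."""
    total = 0.0
    for obj in objects:
        nearest = min(math.dist(obj, c) for c in centroids)
        total += nearest ** 2
    return total


# Three 2-D objects and two centroids; each object is at distance 1
# from its nearest centroid.
print(fitness([(0, 0), (0, 2), (10, 0)], [(0, 1), (10, 1)]))
```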

The developed FIS consists of one output variable, i.e. alpha(t), which denotes the value of parameter α in Eq. 12. Figure 3 shows the membership function for alpha(t). The range of parameter α is taken as [0, 50] to provide a wide range of search on the value of alpha(t) as we can see in Fig. 3.

Fig. 3 Membership Function for alpha(t)

The following eight fuzzy rules were formulated to control the parameter α in the calculation of the gravitational constant (Eq. 12):

  • RULE 1: If (IT is low) and (Fbest is low) then (alpha(t) is high).

  • RULE 2: If (IT is low) and (Fbest is medium) then (alpha(t) is medium).

  • RULE 3: If (IT is low) and (Fbest is high) then (alpha(t) is low).

  • RULE 4: If (IT is medium) and (Fbest is high) then (alpha(t) is low).

  • RULE 5: If (IT is medium) and (Fbest is medium) then (alpha(t) is medium).

  • RULE 6: If (IT is high) and (Fbest is high) then (alpha(t) is medium).

  • RULE 7: If (IT is high) and (Fbest is medium) then (alpha(t) is medium).

  • RULE 8: If (IT is high) and (Fbest is low) then (alpha(t) is high).

The method used in the developed fuzzy inference system for “And” is min and for “Or” is max. The implication method is min, aggregation method is max and defuzzification method used is centroid.
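The inference scheme (min for “And”, min implication, max aggregation, centroid defuzzification) can be sketched as follows. The eight rules are taken from the list above; everything else is an assumption made for illustration: the triangular and shoulder-shaped membership functions, the normalization of Fbest to [0, 1] (the actual function uses the dataset-specific values M1 to M5 of Fig. 2), the sampling resolution, and the fallback value returned when no rule fires.

```python
def tri(x, a, b, c):
    """Triangular membership with feet a and c and peak b.
    a == b or b == c gives a left or right shoulder."""
    if x <= a:
        return 1.0 if a == b else 0.0
    if x >= c:
        return 1.0 if b == c else 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# ASSUMED fuzzy sets: both inputs scaled to [0, 1], output alpha on [0, 50].
IN_SETS = {"low": (0.0, 0.0, 0.5), "medium": (0.0, 0.5, 1.0), "high": (0.5, 1.0, 1.0)}
OUT_SETS = {"low": (0.0, 0.0, 25.0), "medium": (0.0, 25.0, 50.0), "high": (25.0, 50.0, 50.0)}

# (IT, Fbest, alpha) for RULE 1 to RULE 8.
RULES = [
    ("low", "low", "high"), ("low", "medium", "medium"), ("low", "high", "low"),
    ("medium", "high", "low"), ("medium", "medium", "medium"),
    ("high", "high", "medium"), ("high", "medium", "medium"), ("high", "low", "high"),
]


def infer_alpha(it, fbest, steps=500, default=25.0):
    """Mamdani inference: min for And, min implication, max aggregation,
    centroid defuzzification over a sampled output universe."""
    fired = [(min(tri(it, *IN_SETS[a]), tri(fbest, *IN_SETS[b])), out)
             for a, b, out in RULES]
    num = den = 0.0
    for s in range(steps + 1):
        y = 50.0 * s / steps
        mu = max(min(w, tri(y, *OUT_SETS[out])) for w, out in fired)
        num += y * mu
        den += mu
    # Assumption: fall back to the midpoint of the range if no rule fires.
    return num / den if den > 0 else default


print(infer_alpha(0.0, 1.0), infer_alpha(0.5, 0.5), infer_alpha(1.0, 0.0))
```

With these assumed sets, an early iteration with a poor Fbest yields a small α, and a late iteration with a good Fbest yields a large α, matching the intent of the rule base. The combination (IT is medium, Fbest is low) is not covered by any rule, which is why the sketch needs a fallback.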

3.2 Fuzzy GSA

The proposed algorithm, Fuzzy-GSA, comprises two main steps. The first step is to generate an initial population for GSA. We generate the initial population by including three agents (or candidate solutions) corresponding to the maximum, minimum and median values of all features in a given dataset, respectively. This provides a better initial population, which allows for greater exploration, since a wide range of values, including the maximum, minimum and median, is present while searching the solution space. The rest of the agents are generated randomly within the range of the features in the given dataset.
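A minimal sketch of this initialization is given below. It assumes that a candidate is a flat list of k cluster centers with d feature values each, and that in the three seeded agents every one of the k centers receives the same (maximum, minimum or median) feature vector; the text does not specify this mapping. The function name `initial_population` is chosen for the example, and N is assumed to be at least 3.

```python
import random
import statistics


def initial_population(data, k, N, rng=random):
    """Build N candidates of k*d values each: three seeded agents
    (max, min, median) followed by N-3 uniformly random agents."""
    d = len(data[0])
    columns = list(zip(*data))                 # one tuple per feature
    lo = [min(col) for col in columns]
    hi = [max(col) for col in columns]
    med = [statistics.median(col) for col in columns]
    # Assumption: repeat the same feature vector for each of the k centers.
    population = [hi * k, lo * k, med * k]
    while len(population) < N:
        population.append([rng.uniform(lo[j % d], hi[j % d])
                           for j in range(k * d)])
    return population


toy = [[1, 10], [2, 20], [3, 60]]
print(initial_population(toy, k=2, N=5, rng=random.Random(1))[:3])
```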

The second step involves application of GSA, described in Section 2, to the given dataset and using the fuzzy inference system developed to control the parameter α while searching for the solution. The flow diagram for the proposed Fuzzy-GSA algorithm is depicted by Fig. 4.

The step by step algorithm for the proposed approach is stated next. Let N denote the population size, Ci be the ith candidate solution or agent, k be the number of clusters, d be the number of features in a given dataset.

Generate initial population, P = {C1, C2, ..., CN}.

  • Generate C1 consisting of maximum values of all the features.

  • Generate C2 consisting of minimum values of all the features.

  • Generate C3 consisting of median values of all the features.

  • Generate the remaining N-3 candidates randomly within the range of minimum to maximum values for all features.

Apply GSA and use the developed FIS, described in Section 3.1, for parameter adaptation.

  • Calculate the fitness function, as per Eq. 13, for all the candidate solutions.

  • Feed the values of IT, current iteration number, and Fbest, best fitness achieved, as inputs to the developed FIS, and obtain the value of parameter α.

  • Calculate G, M, F and a for all the candidate solutions using Eqs. 12, 5, 8 and 9, respectively, as described in the Gravitational Search Algorithm (GSA).

  • Update the velocity and position of each candidate solution as per Eqs. 10 and 11 respectively.

  • Check whether the termination criteria are met, i.e. the maximum number of iterations allowed is reached or the fitness function no longer exhibits the minimum improvement. If yes, return the best value of the fitness function achieved as the final solution; otherwise repeat this step.

The final solution consists of the best value of fitness function, i.e. the minimum mean square error, achieved by running the proposed Fuzzy-GSA algorithm.
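The control flow of the second step, including both termination criteria, can be sketched as below. To keep the example self-contained, the fitness evaluation, the FIS and the GSA update are passed in as callables; the function name `fuzzy_gsa_loop`, its signature, and measuring improvement on the best-so-far fitness are assumptions made for this sketch.

```python
def fuzzy_gsa_loop(population, evaluate, infer_alpha, update,
                   max_iter=300, min_improvement=1e-6):
    """Iterate: evaluate fitness, query the FIS for alpha, apply one GSA update.

    Stops when max_iter is reached or when the best fitness improves by less
    than min_improvement between two successive iterations.
    Returns (best_solution, best_fitness, iterations_used).
    """
    best_fit, best_sol = float("inf"), None
    t = 0
    for t in range(1, max_iter + 1):
        fit = [evaluate(c) for c in population]
        previous_best = best_fit
        i = min(range(len(fit)), key=fit.__getitem__)
        if fit[i] < best_fit:
            best_fit, best_sol = fit[i], list(population[i])
        if previous_best - best_fit < min_improvement:
            break
        alpha = infer_alpha(t / max_iter, best_fit)   # IT as a fraction, Fbest
        population = update(population, fit, alpha, t)
    return best_sol, best_fit, t


# Toy stand-ins (hypothetical): minimize x^2, constant alpha, halve positions.
sol, fit, used = fuzzy_gsa_loop(
    [[8.0], [4.0]],
    evaluate=lambda c: c[0] ** 2,
    infer_alpha=lambda it, fbest: 20.0,
    update=lambda pop, fit, alpha, t: [[x / 2 for x in c] for c in pop],
)
print(sol, fit, used)
```

In a full implementation, `evaluate` would be the fitness of Eq. 13, `infer_alpha` the FIS of Section 3.1 and `update` one GSA step using Eqs. 5 and 8 to 12.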

Fig. 4 Flowchart for Fuzzy GSA algorithm

4 Experiments & results

The algorithm is implemented on a personal computer (2.40 GHz CPU, 4 GB RAM) using MATLAB R2010a. We have considered the population size to be 50, i.e. N = 50, and the maximum number of iterations to be 300. Go in Eq. 12 is taken as 100. The minimum acceptable improvement in the fitness values between two successive iterations is set as \(1 \times {10}^{-6}\). Each candidate solution in the population consists of cluster centers for each of the k clusters, and each cluster center comprises values for each feature in a dataset. Figure 5 illustrates the representation of the ith candidate solution Ci. CCij denotes the jth cluster center of the ith candidate solution and Fij represents the value of the jth feature for the ith cluster center. Therefore, each candidate solution consists of (d × k) values. Here, k = 3.

Fig. 5 Representation of ith Candidate Solution, Ci
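For illustration, a candidate solution of (d × k) values can be split back into its k cluster centers as follows. The layout, with the d feature values of one center stored contiguously, is an assumption for this sketch, and `decode` is a hypothetical helper name.

```python
def decode(candidate, k, d):
    """Split a flat candidate of k*d values into k centers of d features each."""
    if len(candidate) != k * d:
        raise ValueError("candidate must contain exactly k*d values")
    return [candidate[j * d:(j + 1) * d] for j in range(k)]


# k = 3 clusters, d = 2 features.
print(decode([1, 2, 3, 4, 5, 6], k=3, d=2))
```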

We have measured the performance of the proposed novel approach, Fuzzy-GSA, by calculating the sum of intra-cluster distances as defined by Eq. 13. We have considered four benchmark datasets namely, Iris, Wine, Breast Cancer Wisconsin and Contraceptive Method Choice (CMC) for evaluation. The datasets are all obtained from UC Irvine repository of machine learning databases [3] and have been extensively used by researchers to validate the performance of clustering algorithms. A description of each benchmark dataset is provided below:

  • Iris Dataset: It consists of three classes with 50 instances each, where each class refers to a species of iris flower. There are four features in the dataset, namely petal length, petal width, sepal length and sepal width, which report certain characteristics of the iris flower. The dataset comprises a total of 150 instances. There are no missing feature values in this dataset.

  • Wine Dataset: It consists of three classes representing different types of wine. The data is the result of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. There are 13 features, which represent the quantities of different constituents found in each of the three types of wine. The dataset consists of 178 instances with no missing values.

  • Breast Cancer Wisconsin Dataset: This dataset comprises two classes, namely malignant and benign, representing the severity of cancer. There are a total of 683 instances, without missing values. It has 9 attributes or features, namely clump thickness, uniformity of cell size, uniformity of cell shape, marginal adhesion, single epithelial cell size, bare nuclei, bland chromatin, normal nucleoli and mitoses.

  • Contraceptive Method Choice (CMC) Dataset: It consists of three classes namely, no-use, long-term and short-term. There are 1473 instances in this dataset, without any missing values. It contains 9 features or attributes namely, wife’s age, wife’s education, husband’s education, number of children ever born, wife’s religion, whether wife’s working, husband’s occupation, standard of living and media exposure.

Sum of intra-cluster distances is then calculated over each of the four benchmark datasets considered, using Eq. 13. We have also compared the performance of the proposed Fuzzy-GSA approach with existing clustering algorithms such as conventional GSA [11], combined k-means and GSA [12], PSO [6] and k-means [20] algorithms on selected datasets. Due to the stochastic nature of these algorithms, we have considered 20 independent runs for each algorithm over each dataset. The results are then compared in terms of best, average and worst solutions over 20 independent simulations. Moreover, the standard deviation of the achieved solutions from each clustering algorithm is also calculated. Note that, a lower value of the sum of intra-cluster distances denotes a higher quality clustering.
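The per-algorithm summary over independent runs can be computed as in the sketch below. Whether the reported standard deviation is the population or the sample form is not stated, so the use of the population form here is an assumption, as is the helper name `summarize`; the input values are made up for illustration.

```python
import statistics


def summarize(run_results):
    """Best, average, worst and standard deviation of a minimization objective
    over independent runs (population standard deviation assumed)."""
    return {
        "best": min(run_results),
        "average": statistics.mean(run_results),
        "worst": max(run_results),
        "std": statistics.pstdev(run_results),
    }


# Hypothetical objective values from eight runs.
print(summarize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
```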

Figure 6 illustrates the three clusters assigned by the proposed Fuzzy-GSA algorithm over the Iris dataset. It shows a 3-D plot considering the three dimensions namely, petal length, sepal width and sepal length of the Iris dataset. The remaining dimension, petal width, is highly correlated with the dimension, petal length, and thus can be ignored without losing the quality of cluster representation. The three clusters, corresponding to the three types of iris flower, are depicted by three different colour coded symbols namely, blue squares, green circles and red triangles. The X-axis represents the dimension sepal length, the Y-axis represents the dimension sepal width and the Z-axis represents the dimension petal length. The respective cluster centers are represented by black coloured circles.

Fig. 6 Cluster plot by Fuzzy-GSA over Iris Dataset

The best, average, worst and standard deviation of the obtained solutions by different clustering algorithms on the selected datasets are shown in Table 1. The results are achieved over 20 independent runs.

Table 1 Simulation results for clustering algorithms

As can be seen from Table 1, the proposed approach, Fuzzy-GSA, demonstrates the highest quality solutions in terms of best, average and worst intra-cluster distances over all the four benchmark datasets. Furthermore, the standard deviation of Fuzzy-GSA is smaller, which indicates that it can locate a near-optimal solution in most of the cases when compared with other clustering algorithms.

For Iris dataset, the best, average and worst solutions by the proposed Fuzzy-GSA approach are 96.5403, 96.5425 and 96.5581, respectively with a standard deviation of 0.0054. For Wine dataset, the best, average and worst solutions achieved by the Fuzzy-GSA approach are 16292.23, 16293.369 and 16294.35 respectively, with a standard deviation of 0.82. For Breast Cancer Wisconsin dataset, the achieved best, average and worst solutions are 2964.38, 2964.38 and 2964.39, respectively with a standard deviation of 0.003. Lastly, for CMC dataset, the best, average and worst solutions obtained by Fuzzy-GSA are 5532.2, 5532.6 and 5533.7, respectively with a standard deviation of 0.5831.

To summarize, the proposed Fuzzy-GSA approach achieves the best quality clustering when compared with several popular clustering algorithms, depicted in Table 1, over four benchmark datasets considering 20 independent runs.

After evaluating the efficacy of the proposed algorithm, we also applied it to MRI brain image segmentation. The real time database of brain MRI images has been taken from the Insight Journal, an open access online publication that covers the domain of medical image processing and visualization. One of its journals is MIDAS. The MIDAS community includes the National Alliance for Medical Image Computing (NAMIC), which presents data for two autistic and two normal children (male and female) [13]. Three types of MRI scans are provided, i.e. T1-weighted, T2-weighted and PD-weighted images. Coronal slices are obtained with a slice thickness of 1.5 mm. A tissue segmentation label map is also provided with the database; this label map is an atlas-based segmentation obtained with an expectation-maximization scheme. The quantitative analysis of the MRI brain images is done in comparison with the ground truth images for the sample image shown in Fig. 7 and is presented in Table 2. True positive gives the number of correctly classified pixels. False positive counts the pixels that our system assigns to a particular cluster but that do not belong to it in the ground truth, and false negative counts the pixels that our system does not assign to a particular cluster but that belong to it in the ground truth. Sensitivity is also reported in the table to measure the performance of the proposed approach; the average sensitivity is 0.926. Sensitivity is calculated as:

Table 2 Classification of pixels for FUZZY GSA for real time MRI database
$$\mathrm{Sensitivity}=\mathrm{TP}/\left(\mathrm{TP}+\mathrm{FN}\right)$$

where TP is the number of true positives and FN is the number of false negatives.
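As a small illustration of these counts, the sketch below computes TP, FP, FN and sensitivity for one tissue label from two flattened label maps. The function name `pixel_counts` and the toy labels are assumptions made for the example; they are not data from Table 2.

```python
def pixel_counts(predicted, truth, label):
    """Return (TP, FP, FN, sensitivity) for one class label.
    predicted and truth are equally long sequences of per-pixel labels."""
    tp = sum(1 for p, g in zip(predicted, truth) if p == label and g == label)
    fp = sum(1 for p, g in zip(predicted, truth) if p == label and g != label)
    fn = sum(1 for p, g in zip(predicted, truth) if p != label and g == label)
    sensitivity = tp / (tp + fn) if (tp + fn) > 0 else float("nan")
    return tp, fp, fn, sensitivity


# Toy label maps with three classes (1, 2, 3).
pred = [1, 1, 2, 2, 3, 1]
true = [1, 2, 2, 2, 3, 1]
print(pixel_counts(pred, true, label=2))
```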

Fig. 7 (a) Autistic Female T2-weighted MRI (slice 140), (b) Ground truth, (c) Segmentation result, (d) Gray matter, (e) CSF, (f) White matter

The Dice Coefficient (DC) [7] is defined as

$$DC\left( {Y,Z} \right)=2\left( {\left| Y{\text{ }} \cap {\text{ Z}}\right|} \right)/\left( {\left| Y \right| + \left| Z \right|} \right)$$
(15)

where Y and Z are two sets. The value of DC lies between 0 and 1. 0 signifies no match/overlap and 1 signifies complete match/overlap. In terms of segmentation, ‘Y’ represents the segmented image containing tumor and ‘Z’ represents the ground truth given by the radiologist.
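Eq. 15 can be evaluated per tissue class by treating Y and Z as the sets of pixel indices carrying a given label in the segmentation and in the ground truth. The sketch below is illustrative only: the function name `dice`, the flattened label maps and the convention of returning 1 when both sets are empty are assumptions.

```python
def dice(predicted, truth, label):
    """Dice coefficient (Eq. 15) for one class label."""
    y = {i for i, p in enumerate(predicted) if p == label}
    z = {i for i, g in enumerate(truth) if g == label}
    if not y and not z:
        return 1.0  # assumption: two empty sets count as a complete match
    return 2 * len(y & z) / (len(y) + len(z))


pred = [1, 1, 2, 2, 3, 1]
true = [1, 2, 2, 2, 3, 1]
print(dice(pred, true, label=2))
```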

The algorithm is run over several slices of both the normal and autistic, male and female datasets provided by NAMIC. The average DC over several runs was calculated and is tabulated in Table 3. A significant improvement can be seen in the results obtained from our approach (FUZZY-GSA) as compared to GSA. The classification of brain tissues is more promising with FUZZY GSA, as more pixels are correctly classified (Fig. 8).

Table 3 This table presents the value of DC for three brain tissues namely, Gray matter, White matter, CSF using our approach (FUZZY GSA) and GSA

The more the value of DC is close to 1, the better is the segmentation accuracy. It is studied that DC’s value >0.7 implies good segmentation.

In Table 3, the patient numbers refer to the following slices of the dataset:

  • P11: 5074-004-02_10_t2_fit.nrrd (slice 100); P12: 5074-004-02_10_t2_fit.nrrd (slice 140)

  • P21: 5128-004-01_10_T2_fit.nrrd (slice 80); P22: 5128-004-01_10_T2_fit.nrrd (slice 114)

  • P31: 5150-004-02_10_T2_fit.nrrd (slice 110); P32: 5150-004-02_10_T2_fit.nrrd (slice 67)

  • P41: 5157-004-02_10_T2_fit.nrrd (slice 70); P42: 5157-004-02_10_T2_fit.nrrd (slice 160)

Fig. 8 Comparison of DC values for GSA and Fuzzy GSA

It can be noticed that the segmentation accuracy depends strongly on the classification of cerebrospinal fluid (CSF). The accuracy for gray matter and white matter is also improved, but the most significant change is seen in the segmentation of CSF. CSF is a fluid that flows within the brain, which makes it difficult to segment from brain MRI. A better classification of the data points belonging to the CSF class is therefore required. This improvement can be seen in our approach as compared to standard FCM.

The FUZZY GSA approach for segmenting brain MRI images is also applied to and validated on simulated 3D brain MRI images with varying levels of noise from the BrainWeb database. The simulated BrainWeb dataset is provided by McGill University and is available in several file formats [3]. It contains normal anatomical brain structures, and each image is 181 × 217 pixels. The results are tested and validated for T1-weighted images with a slice thickness of 1 mm and noise levels of 3% and 5%, as shown in Figs. 11 and 12, and the results are tabulated in Table 4.

Table 4 DC values for three brain tissues (gray matter, white matter and CSF) using our approach (FUZZY GSA) and GSA for the BrainWeb database

Fig. 9 Comparison of DC values for GSA and Fuzzy GSA with noise level 3%

Fig. 10 Comparison of DC values for GSA and Fuzzy GSA with noise level 5%

It can be noticed from Table 4 that FUZZY GSA performs better than GSA at both noise levels. The performance of GSA degrades at the higher noise level, as can be seen from Figs. 9 and 10. Table 4 also shows that when slice 100 is corrupted with 3% and with 5% noise, GSA performs worse at the higher noise level. The accuracy of GSA reduces significantly, whereas FUZZY GSA continues to perform well at the higher noise level (Figs. 11 and 12).

Fig. 11 (a) Normal brain (T1-weighted slice 60), (b) Normal brain (noise 3%), (c) Segmented result using FUZZY GSA, (d) Gray matter, (e) CSF, (f) White matter

Fig. 12 (a) Normal brain (T1-weighted slice 99), (b) Normal brain (noise 5%), (c) Segmented result using FUZZY GSA, (d) Gray matter, (e) CSF, (f) White matter

The results of the proposed Fuzzy GSA algorithm are also compared with the Growing Hierarchical Self Organising Map (GHSOM) [28], Non-Local FCM (NLFCM) [4], Improved Spatial FCM (IFCMS) [2], the combination of Mean Shift and FCM (MFCM) [21], Local Information based Intuitionistic FCM (IIFCM) [10], Repeated Level Set (RLS) [34], Convolutional Neural Network (CNN) [23], Non Local based Spatially Constrained Hierarchical FCM (NLSCHFCM) [5], Improved FCM (IFCM) [16], Generalised Rough Intuitionistic FCM (GRIFCM) [25], Fuzzy Unsupervised Learning (FUSL) [31] and Intuitionistic Fuzzy Co-Clustering (IFCC) [27] algorithms. The comparison is carried out at varying noise levels (3% and 5%). As can be seen from Table 5, the proposed algorithm outperforms the other segmentation algorithms and is able to segment the brain images accurately in the presence of noise as well.

Table 5 Comparative performance analysis of DC values for different segmentation algorithms on brain image with varying noise level

5 Conclusion

This paper proposes the Fuzzy-GSA algorithm for MRI brain image segmentation. It is based on the conventional GSA, with a provision for adapting the value of the parameter α used in the calculation of the gravitational constant. At the beginning of the search, a smaller value of α is desired to achieve higher exploration, whereas towards the end of the search a relatively higher value of α helps in achieving higher exploitation. A step-by-step methodology for automated brain MRI image segmentation is presented. The performance of Fuzzy-GSA is evaluated by comparing its best, average and worst solutions with those of several other clustering algorithms over four selected benchmark datasets, namely Iris, Wine, Breast Cancer Wisconsin and CMC, considering 20 independent runs. Classification of the three main brain tissues, namely gray matter, white matter and cerebrospinal fluid, is also performed. The performance of Fuzzy-GSA is evaluated in comparison with GSA through experiments on a real database and on a simulated database with varying levels of noise. The proposed algorithm is also compared, on the basis of DC values, with various brain image segmentation algorithms and is seen to outperform them even in the presence of noise. In future, this work can be extended to further improve the accuracy and efficiency of the algorithm. Since this work dealt with noise and the partial volume effect, it can also be extended to images with intensity inhomogeneity.