1 Introduction

Today, medical imaging technologies provide the physician with complementary diagnostic tools, such as X-ray, computed tomography (CT), magnetic resonance imaging (MRI), and ultrasound (US). Two widely used methodologies for visualizing human anatomy are MRI and X-ray: MRI visualizes soft tissue anatomy and provides information in 3D, whereas X-ray imaging is used to visualize bones [1]. The brain is the most complex organ of the human body, so differentiating between its various components and analyzing them in depth is a difficult task. MRI images are the most commonly used images for brain image analysis. MRI uses a magnetic field and radio waves to provide a detailed image of the brain. Moreover, MRI has several advantages over conventional imaging techniques, among them [2]: high spatial resolution, excellent discrimination of soft tissues, and rich information about the anatomical structure. Neuroradiologists classify brain tumors into two groups, namely glial tumors (gliomas) and non-glial tumors. There are more than 120 types of brain tumors, which complicates effective treatment [3].

Many clinical and research applications require MRI images to be segmented into different intensity classes, since these classes are the best available representation of biological tissues [4, 5]. Image segmentation is therefore a crucial process for determining the spatial location, shape, and size of the focus, establishing and amending the therapeutic plan, selecting the operation path, and evaluating the therapeutic effect. In general, the tissues of interest in brain MRI images are White Matter (WM), Gray Matter (GM), and Cerebrospinal Fluid (CSF). Multimodal medical image fusion is carried out to minimize redundancy and to enhance the necessary information from input images acquired with different medical imaging sensors. The essential aim is to yield a single fused image that is more informative for efficient clinical analysis [6]. Image fusion facilitates the retrieval of complementary information from medical images and has been widely employed for computer-aided diagnosis (CAD) of life-threatening diseases. Fusion has been performed using various approaches, such as pyramids, multi-resolution, and multi-scale methods. Each fusion approach depicts only a particular feature (i.e., the information content or the structural properties of an image) [7].

Image segmentation, on the other hand, divides an image into its constituent sub-regions. The result is a group of segments or sub-regions that collectively cover the whole image, or a set of contours derived from the image. Pixels are assigned to regions according to a characteristic or computed property, such as color, intensity, or texture, and adjacent regions differ significantly with respect to the tested characteristic(s) [8]. Manual segmentation is possible, but it takes much time. Therefore, automated detection and segmentation of brain abnormalities has been a challenging research problem for decades [9].

The complexity of segmentation arises from the differing characteristics of the images, so medical image segmentation is considered a challenging task [10]. Image segmentation divides digital images into non-overlapping regions and extracts significant and meaningful information from the processed images. In addition, numerous analyses can be performed to extract critical areas from the images [11]. MRI is the most commonly used technique for evaluating the anatomical structures of the human brain. It provides a comprehensive view of what happens in the patient's brain. A brain MRI image shows the typical structures of the brain, such as GM, WM, CSF, and damaged regions, which appear either as single structures or as overlapping areas [12]. Quantitative pathological analyses require accurate measurement of WM, GM, and CSF, and segmenting the MRI brain image data is required to process these regions [13].

Segmentation divides an image into regions that are meaningful for a particular task. Region-based and boundary-based methods are two major segmentation approaches. The first approach is based on detecting the similarities. The second approach is based on the continuous boundaries around regions that are formed by detecting discontinuities (edges) and linking them.

Region-based methods find connected regions based on some similarity between the pixels [14]. The most fundamental feature for defining the regions is the image gray level or brightness, but other features, such as color or texture, can also be used. However, if we require that the pixels in a region be very similar, we may over-segment the image. If we allow too much dissimilarity, we may merge what should be separate objects. The goal is to find regions that correspond to objects as humans see them, which is not an easy goal [15]. Region-based methods include thresholding, using either a global or a locally adaptive threshold, and optimal thresholding (e.g., Otsu, isodata, or maximum entropy thresholding). If this results in overlapping objects, thresholding the distance transform of the image or using the watershed algorithm can help to separate them. Other region-based methods include region growing (a bottom-up approach using “seed” pixels) and split-and-merge (a top-down quadtree-based approach).
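The trade-off between over-segmentation and merging can be illustrated with a minimal region-growing sketch in Python. The function name `region_grow`, the 4-connectivity, and the rule of comparing each pixel with the seed's gray level are assumptions made for illustration; they are not taken from the cited works.

```python
from collections import deque

def region_grow(image, seed, tol):
    """Grow a 4-connected region from `seed`, accepting pixels whose gray
    level differs from the seed's gray level by at most `tol`."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_value) <= tol):
                region.add((nr, nc))
                frontier.append((nr, nc))
    return region

# Hypothetical 3 x 3 image: a dark object in the top-left corner.
toy = [
    [10, 12, 90],
    [11, 13, 95],
    [80, 85, 92],
]
tight = region_grow(toy, (0, 0), tol=5)    # strict similarity
loose = region_grow(toy, (0, 0), tol=100)  # very permissive similarity
```

With the strict tolerance the region stays on the four dark pixels, whereas the permissive tolerance absorbs the whole image, which is the merging problem described above.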

Boundary-based methods tend to use either an edge detector (e.g., the Canny detector) with edge linking to close any breaks in the edges, or boundary tracking to form continuous boundaries. Alternatively, an active contour (or snake) can be used. It is a controlled-continuity contour that elastically snaps around and encloses a target object by locking on to its edges [14, 16].

There are many image segmentation techniques for medical applications. The choice among them depends on the specific application and the imaging modality. Achieving good segmentation performance remains challenging because of imaging problems such as noise, partial volume effects, and motion. Some of these methods, such as thresholding, region-growing, and clustering methods, have been studied by many researchers [17–19].

Thresholding is one of the most frequently used techniques for medical image segmentation. It separates pixels into classes according to their gray levels. Thresholding approaches partition the scalar image intensities into a binary image: every pixel is compared with the threshold value. Pixels whose intensity is greater than the threshold are grouped into one class, and all other pixels are grouped into another class.

Multi-thresholding uses several threshold values instead of only one. It is among the most popular and simplest methods in digital image processing. Thresholding methods can be divided into three types: global, local, and optimal. Global thresholding methods determine a single threshold for the entire image and only concern the binarization of the image after segmentation. Local thresholding methods are fast and are suitable for multilevel thresholding; however, determining the number of thresholds is a major drawback. The main advantage of the optimal thresholding methods is the use of an objective function [20]: determining the best threshold values amounts to optimizing that function. Different optimization approaches are available, such as Genetic Algorithms (GAs), the Firefly Algorithm, and Particle Swarm Optimization (PSO). GAs have difficulty finding an exact solution but are good at reaching a near-optimal solution, whereas PSO improves the search for an optimal solution. The Fractional-Order Darwinian PSO (FODPSO) is used in this paper because it presents a statistically significant improvement in terms of both fitness value and CPU time. In other words, compared with the other approaches, FODPSO finds a set of thresholds with a larger between-class variance in less computational time [21].

Clustering algorithms are among the most commonly used techniques in image segmentation. Clustering is an unsupervised learning technique, although for many algorithms the user must specify the number of clusters in advance to classify the pixels [22, 23]. The clustering process groups similar pixels together and separates dissimilar ones [24]. There are two ways of clustering: partitioning and grouping [25]. In the partitioning type, the clustering algorithm successively divides the whole image into smaller clusters. In the grouping type, by contrast, each element starts as a separate cluster, and clusters are then gathered into larger ones. The decision to group pixels together is based on some assumptions. Mean Shift (MS) is an example of an unsupervised clustering technique that does not require prior knowledge, such as the number of clusters. It is an iterative method that starts from an initial estimate [26]. MS segmentation concatenates the spatial and range domains of an image and identifies modes in this multidimensional joint spatial-range feature space. The bandwidth parameter (the kernel size) is free and is not restricted to a constant value, and several methods exist for estimating a single fixed bandwidth. The chosen bandwidth can cause over-clustering or under-clustering: too small a bandwidth produces over-clustering, whereas too large a bandwidth merges critical modes and produces under-clustering. When the feature space has significantly different local characteristics across space, a single fixed bandwidth leads to under- or over-clustering, which is a drawback [27].
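The effect of the bandwidth can be illustrated with a small Python sketch of mean shift on one-dimensional gray values. This is a simplification under stated assumptions: it uses a flat kernel and only the range (intensity) domain, not the joint spatial-range space described above, and the function name `mean_shift_modes` and the rule for merging nearby modes are hypothetical.

```python
def mean_shift_modes(values, bandwidth, max_iter=100, tol=1e-6):
    """Shift every value to the mean of the samples within `bandwidth` of it
    (flat kernel) until it stops moving; return the distinct modes found."""
    modes = []
    for start in values:
        x = float(start)
        for _ in range(max_iter):
            window = [v for v in values if abs(v - x) <= bandwidth]
            shifted = sum(window) / len(window)
            if abs(shifted - x) < tol:
                break
            x = shifted
        # Merge with an existing mode if it converged to (nearly) the same point.
        if not any(abs(x - m) <= bandwidth / 2 for m in modes):
            modes.append(x)
    return sorted(modes)

# Hypothetical gray values forming three intensity groups.
gray = [10, 12, 14, 50, 52, 54, 90, 92, 94]
small = mean_shift_modes(gray, bandwidth=1)     # every value is its own mode
medium = mean_shift_modes(gray, bandwidth=5)    # one mode per group
large = mean_shift_modes(gray, bandwidth=100)   # all groups merge into one mode
```

In this toy example the smallest bandwidth over-clusters the data, the largest under-clusters it, and only the intermediate value recovers the three groups.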

In this chapter, we concentrate on both clustering and multilevel thresholding methods for medical brain MRI image segmentation. Our experiments were conducted using the most widely used multilevel thresholding and clustering techniques. This paper is organized into six sections. Section 2 reviews the current related work on medical image segmentation systems. Section 3 introduces the basic concepts of some medical image segmentation techniques. In Sect. 4, the proposed medical image segmentation system, which is based on cascaded FODPSO and Mean Shift clustering, is discussed. The experimental results on three different standard datasets are presented in Sect. 5. The conclusion and the future work are presented in Sect. 6.

2 Related Work

Image segmentation plays a significant role in the field of medical image analysis. Thresholding is one of the most frequently used techniques for medical image segmentation, and many researchers have therefore proposed segmentation techniques that obtain optimal threshold values based on multi-thresholding. The rest of this section discusses some current research efforts in medical image segmentation.

Parvathi et al. [28] proposed a new segmentation algorithm for high-resolution remote sensing images that can also be applied to medical and nonmedical images. First, the remote sensing image is decomposed into multiple resolutions using a biorthogonal wavelet, and a suitable resolution level is determined. Simple grayscale morphology is used to compute the gradient image, and selective minima (regional minima of the image) are imposed on the gradient image to avoid over-segmentation. Second, the watershed transform is applied, and the segmentation result is projected to higher resolutions using the inverse wavelet transform until the full-resolution segmented image is obtained. The main drawback is that the preprocessing step does not include skull removal, which increases the memory usage and processing time.

Clustering techniques are among the most commonly used for medical image segmentation. For example, Khalifa et al. [29] proposed a system for MRI brain image segmentation based on the wavelet and fuzzy c-means (WFCM) algorithm. Their algorithm is a robust and efficient approach to segmenting noisy medical images. Feature extraction and clustering are the two main stages of the system. Multi-level 2D wavelet decomposition is used to extract the features, which are then fed to FCM clustering. Finally, the image is segmented into the three brain tissue classes (WM, GM, and CSF). The limitation of their work is that they did not apply skull removal. Leaving the skull, scalp, eyes, and all other structures that are not of interest in the image increases the memory usage and the processing time.

Bandhyopadhyay and Paul [30] proposed an efficient and fast method for brain tumor diagnosis that consists of multiple phases. The first phase registers multiple MR images taken on adjacent layers of the brain. In the second phase, the registered images are fused to obtain a high-quality image. Finally, an improved K-means algorithm with a dual localization methodology is used for segmentation. The main disadvantage is the large grid dimension. Fine anatomic details, such as overlapping regions of gray and white matter in the brain or twists and turns in the boundary of the tumor, were also ignored.

Arakeri and Reddy [31] proposed an approach for efficient MRI brain tumor segmentation using the wavelet transform and modified FCM clustering. In the first phase, the wavelet transform decomposes the image, and in the next phase the modified FCM algorithm segments the approximation image at the highest wavelet level. Working on the low-resolution image restrains noise and reduces the computational complexity. Then, the low-resolution segmented image is projected onto the full-resolution image by taking the inverse wavelet transform. The main limitation of this work is the use of the highest wavelet decomposition level, which may cause neighboring features of the lower-band signals to overlap.

On the other hand, many researchers have done their best to improve the performance of the FCM algorithm for image segmentation. For example, Mostfa and Tolba [32] proposed a wavelet multi-resolution approach with the EM algorithm, known as WMEM, for segmenting medical images. In the first stage, the Haar transform of length 2 detects the spatial correlation between pixels. In the second stage, the EM algorithm receives the original image together with the two scaled images generated by the 2D Haar wavelet transform and segments each of them separately. These three segmented images are then combined with their weights or threshold values, and each pixel in the image is classified depending on the three segmented images. The authors did not report the execution time of each algorithm or of the integration method.

Javed et al. [11] proposed a system for noise removal and image segmentation. Their system comprises two major phases: a multi-resolution based technique and a k-means technique. Noise-corrupted images give rise to false segmentation, which is the primary source of uncertainty and ambiguity. Therefore, multi-resolution based noise removal is applied to the input image as a preprocessing step. The noise-free image is then segmented by the k-means based technique to automatically identify the different objects present in the image data. The main disadvantage is that the preprocessing step does not include skull removal, which increases the memory usage and the processing time.

Jin et al. [13] proposed a multispectral MRI brain image segmentation algorithm based on kernel clustering analysis, called multi-spectral kernel-based fuzzy c-means clustering (MS-KFCM). In their system, MRI T1-weighted and T2-weighted brain images are filtered, and features are then selected as the input data. The separability of the input data is improved by mapping it to a high-dimensional feature space. The output of FCM clustering is used as the initial clustering centers of MS-KFCM. MS-KFCM performs better than FCM and KFCM, whereas FCM and KFCM perform similarly. The advantage of multi-spectral image segmentation is that it achieves higher accuracy than single-channel image segmentation. The limitation of their work is that they did not perform skull removal. Leaving the skull, scalp, eyes, and all other structures that are not of interest in the image increases the memory usage and the processing time.

Mangala and Suma [33] presented a brain MRI image segmentation algorithm called the Fuzzy Local Gaussian Mixture Model (FLGMM). First, they removed noise by applying a Gaussian filter and handled the bias field estimation using BCFCM. Second, all techniques were initialized using K-means. Then, they used FLGMM to segment the processed image. The Jaccard similarity (JS) is used to measure the segmentation accuracy. The JS value lies in [0, 1], and a higher JS value means a more accurate segmentation. They did not address reducing the computational complexity or improving the robustness.
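For reference, the Jaccard similarity between a segmented region A and a ground-truth region B is the standard ratio |A ∩ B| / |A ∪ B|. The following Python sketch computes it for two sets of pixel coordinates; the function name, the toy masks, and the convention of returning 1 for two empty masks are assumptions made for illustration and are not taken from [33].

```python
def jaccard_similarity(segmented, ground_truth):
    """JS = |A ∩ B| / |A ∪ B| for two sets of pixel coordinates."""
    union = segmented | ground_truth
    if not union:
        return 1.0  # both masks empty: treated here as perfect agreement
    return len(segmented & ground_truth) / len(union)

# Hypothetical masks that share two of four distinct pixels.
predicted = {(0, 0), (0, 1), (1, 1)}
reference = {(0, 1), (1, 1), (1, 2)}
score = jaccard_similarity(predicted, reference)
```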

Thresholding is also widely studied, and many researchers have proposed techniques for obtaining optimal threshold values based on multi-thresholding. For example, Ghamisi et al. [34] presented two methods for image segmentation that identify the n − 1 optimal thresholds for n-level thresholding: FODPSO and Darwinian PSO (DPSO). In delineating multilevel thresholds, these two methods overcome the disadvantages of preceding methods, namely trapping in local optima and high CPU processing time [34]. The authors compared their methods with other well-known thresholding segmentation methods. Taking into consideration measures such as the fitness value, STD, and CPU time, their experimental results showed that the proposed methods are superior to the compared methods. On the other hand, they did not handle real-time image segmentation.

Ghamisi et al. [35] introduced an approach for the classification of hyperspectral images that combines two segmentation techniques, FODPSO and MS. A support vector machine (SVM) classifies the output of these two methods. In their system, the input image with B bands is first segmented by FODPSO. Second, the output of FODPSO is supplied to MS as input to segment the B-band image. Finally, the SVM classifies the B-band result to produce a 1-band image. The main disadvantage of MS is the tuning of the kernel size, as the obtained result may be affected considerably by it.

Hamdoui et al. [36] proposed an approach known as Multithresholding based on Modified Particle Swarm Optimization (MMPSO). Their method segments images by using PSO to identify multilevel thresholds. They mentioned that the method is suitable for complex gray-level images, and their results indicated that MMPSO is more efficient than PSO and GA. The main drawback is that the approach performs better only when the number of segmentation levels increases and the image contains more details.

AbdelMaksoud et al. [37, 38] proposed a system based on hybrid clustering techniques for medical image segmentation that detects brain tumors accurately and with minimal execution time. The system integrates K-means with FCM or K-means with PSO, and both accuracy and execution time are taken into account in each stage. In the preprocessing phase, a median filter enhances the image quality and skull removal is performed, which reduces the processing time and the memory usage. In the segmentation stage, the advantages of K-means, FCM, and PSO are preserved, while the proposed techniques solve their main problems. Thresholding is then applied for clear brain tumor clustering. Finally, the level set stage produces the contoured tumor area on the original image.

Samanta et al. [39] proposed a multilevel thresholding technique for image segmentation in which the optimal threshold values are selected using Cuckoo Search (CS). CS searches for the best solution starting from initial random threshold values (solutions) and evaluates the quality of a solution with a correlation function. Finally, MSE and PSNR are measured to assess the segmentation quality. CS proceeds as follows. First, the initial population of cuckoo nests is generated. Second, the original image is segmented by each candidate solution, and the solutions are ranked by their correlation value. Third, the current best solution is found. Fourth, a few nests are randomly distorted with probability pa. Finally, the final segmented image is produced by the best candidate solution.

Dey et al. [40] presented a system that extracts blood vessels from retinal images to support early diagnosis of diseases such as diabetic retinopathy, glaucoma, and macular degeneration. Glaucoma is the most frequent of these diseases and has serious ocular consequences, which can even lead to blindness if it is not detected early. First, they converted the green channel of the color retinal fundus image to a grayscale image. Second, adaptive histogram equalization was applied to the gray image [6]. Third, a median filter was used to subtract the background from the foreground. Fourth, they applied FCM followed by binarization and filtering. Fifth, the result was compared with the ground truth image. Finally, the sensitivity, specificity, PPV, PLR, and accuracy were calculated.

3 Basic Concepts

3.1 Thresholding Techniques

Several image segmentation techniques have been proposed for medical applications. The choice among them depends on the specific application and the imaging modality. Imaging problems, such as noise, partial volume effects, and motion, can also significantly affect the performance of segmentation algorithms. Thresholding separates pixels into classes according to their gray levels. Approaches that segment scalar images by creating a binary partitioning of the image intensities are called thresholding approaches. In thresholding segmentation, every pixel is compared with the threshold value. Pixels whose intensity is greater than the threshold are grouped into one class, and all other pixels are grouped into another class. Multi-thresholding determines more than one threshold value [11, 41]. The main restriction of thresholding is that it typically does not take the spatial characteristics of an image into consideration. It is therefore sensitive to noise and intensity inhomogeneities, which can occur in MRI images. Thresholding is defined mathematically by Eq. (1) [42]:

$$ g(x,y) = \left\{ \begin{array}{ll} 1, & \text{if } f(x,y) > T \\ 0, & \text{if } f(x,y) \le T \end{array} \right. $$
(1)

where f(x, y) represents the input image, T is the threshold value, and g(x, y) is the segmented image. Eq. (1) segments the image into two groups. Multiple threshold points are used when the image must be segmented into more than two groups. With two threshold values, Eq. (2) segments the image into three groups:

$$ g(x,y) = \left\{ \begin{array}{ll} a, & \text{if } f(x,y) > T_{2} \\ b, & \text{if } T_{1} < f(x,y) \le T_{2} \\ c, & \text{if } f(x,y) \le T_{1} \end{array} \right. $$
(2)
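Eqs. (1) and (2) translate almost directly into code. The following Python sketch applies them to an image stored as a list of rows; the function names, the default labels, and the toy image are assumptions made for illustration.

```python
def threshold(image, t):
    """Eq. (1): 1 where f(x, y) > T, otherwise 0."""
    return [[1 if pixel > t else 0 for pixel in row] for row in image]

def double_threshold(image, t1, t2, labels=("a", "b", "c")):
    """Eq. (2): label a above T2, b in (T1, T2], c at or below T1."""
    a, b, c = labels
    return [[a if p > t2 else b if p > t1 else c for p in row] for row in image]

# Hypothetical 2 x 2 gray image.
toy = [[20, 120], [200, 80]]
binary = threshold(toy, 100)
three = double_threshold(toy, 80, 150)
```

Note that a pixel exactly equal to a threshold falls into the lower class, as required by the non-strict inequalities in both equations.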

The algorithm for the thresholding is given by Gonzalez et al. [43] as follows:

Step 1:

An initial estimation is selected for the global threshold, T.

Step 2:

The image is segmented using the threshold T, as in Eq. (1), to obtain two groups of pixels: G1 contains the pixels with intensity values > T, and G2 contains the pixels with values ≤ T.

Step 3:

The mean intensity values m1 and m2 are computed for the pixels in G1 and G2, respectively.

Step 4:

The new threshold value is computed as T = (m1 + m2)/2.

Step 5:

Steps 2 through 4 are repeated until the difference between the values of T in successive iterations is smaller than a predefined parameter \( \Delta T \).
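The five steps above can be sketched in Python as follows. Two details are assumptions on our part: the initial estimate is taken to be the mean gray level of the image, and the new threshold is the midpoint of the two class means, which is the usual update rule for this basic global thresholding algorithm. The function name and the toy data are hypothetical.

```python
def iterative_global_threshold(pixels, delta_t=0.5):
    """Basic iterative threshold selection over a flat list of gray levels."""
    t = sum(pixels) / len(pixels)          # Step 1: initial estimate (assumed: image mean)
    while True:
        g1 = [p for p in pixels if p > t]  # Step 2: split the pixels at T
        g2 = [p for p in pixels if p <= t]
        if not g1 or not g2:
            return t                       # one group is empty: nothing left to balance
        m1 = sum(g1) / len(g1)             # Step 3: mean of each group
        m2 = sum(g2) / len(g2)
        new_t = (m1 + m2) / 2              # Step 4: assumed midpoint update
        if abs(new_t - t) < delta_t:       # Step 5: stop when T changes by less than delta_t
            return new_t
        t = new_t

# Hypothetical bimodal gray levels.
pixels = [10, 12, 14, 200, 202, 204]
t_final = iterative_global_threshold(pixels)
```

For this clearly bimodal toy data the threshold settles between the two groups of gray levels.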

3.1.1 Global Thresholding

In the global thresholding method, only one threshold value is selected for the entire image. Global thresholding is used for bimodal images, in which the foreground and background each have a homogeneous intensity and the contrast between them is high. The method is simple and fast in computation time.

3.1.2 Local Thresholding

In local thresholding, an image is divided into sub-images, and a threshold value is computed for each part. Global thresholding takes less computation time than local thresholding, but local thresholding gives satisfactory results when the background varies across the image. It can extract only small regions [44].
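A minimal Python sketch of this idea divides the image into square tiles and binarizes each tile against its own threshold. Using the tile mean as the local threshold is an assumption made for illustration, as are the function name and the toy image; other local statistics could be used instead.

```python
def local_threshold(image, block):
    """Binarize each block x block tile against its own mean gray level."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            tile = [(r, c)
                    for r in range(r0, min(r0 + block, rows))
                    for c in range(c0, min(c0 + block, cols))]
            t = sum(image[r][c] for r, c in tile) / len(tile)
            for r, c in tile:
                out[r][c] = 1 if image[r][c] > t else 0
    return out

# Left half is dark, right half is bright; each half holds one brighter pixel.
toy = [
    [10, 30, 110, 130],
    [10, 10, 110, 110],
]
local = local_threshold(toy, block=2)
```

Each tile keeps only its own brighter pixel, whereas a single global threshold at the image mean (65 for this toy image) would label the entire right half as foreground.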

3.1.2.1 Histogram Thresholding

It is based on thresholding of histogram features and gray-level thresholding. The threshold is mathematically defined by Eq. (1). The algorithm is as follows [45–49]:

Step 1:

The MRI brain image is divided around its central axis into two halves, and a histogram is drawn for each half.

Step 2:

The threshold point is calculated by comparing the two histograms.

Step 3:

Both halves are segmented using the threshold point.

Step 4:

To find the physical dimension of the tumor, the detected image is cropped along its contour.

Step 5:

The pixel values of the segmented image are checked to create an image of the original size: if a pixel value is greater than the threshold value, it is assigned 255; otherwise, it is assigned 0.

Step 6:

Segment the tumor area.

Step 7:

The tumor region is calculated.

3.2 An Overview of PSO Algorithm

The PSO algorithm is one of the evolutionary optimization methods. Experiments have shown that evolutionary methods are typically successful for segmentation purposes [50, 51]. Ideally, evolutionary algorithms make no assumptions about the underlying problem and therefore perform well in approximating solutions to all types of problems. In traditional PSO, the candidate solutions are called particles. To find an optimal solution, these particles travel through the search space, interacting and sharing information with neighboring particles, namely their individual best solution (local best) and the computed neighborhood best. In each step of the procedure, the global best solution obtained in the entire swarm is also updated. Using all of this information, particles learn the locations of the search space where success was obtained and are guided by these successes.
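The following Python sketch shows a basic PSO under simplifying assumptions: a one-dimensional search space, a global-best topology (no separate neighborhood best), and the common inertia-weight velocity update. The function name `pso`, the parameter values `w`, `c1`, and `c2`, and the test objective are all chosen for illustration and are not taken from the cited works.

```python
import random

def pso(fitness, low, high, n_particles=20, n_iter=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize a 1-D `fitness` over [low, high] with a basic global-best PSO."""
    rng = random.Random(seed)
    pos = [rng.uniform(low, high) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]                          # each particle's individual best
    best_fit = [fitness(x) for x in pos]
    g = max(range(n_particles), key=lambda i: best_fit[i])
    g_pos, g_fit = best_pos[g], best_fit[g]    # global best of the swarm
    for _ in range(n_iter):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # Inertia + pull toward the individual best + pull toward the global best.
            vel[i] = (w * vel[i]
                      + c1 * r1 * (best_pos[i] - pos[i])
                      + c2 * r2 * (g_pos - pos[i]))
            pos[i] = min(high, max(low, pos[i] + vel[i]))  # keep inside the bounds
            f = fitness(pos[i])
            if f > best_fit[i]:
                best_pos[i], best_fit[i] = pos[i], f
                if f > g_fit:
                    g_pos, g_fit = pos[i], f
    return g_pos, g_fit

# Hypothetical objective with its maximum at x = 3.
best_x, best_f = pso(lambda x: -(x - 3.0) ** 2, low=0.0, high=10.0)
```

In a thresholding setting, the position of a particle would encode candidate threshold values and the fitness would be the criterion to be maximized.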

3.3 Multilevel Thresholding Method Based on FODPSO

Multi-level thresholding segmentation techniques are an efficient way to perform image analysis. The essential challenge is the selection of a robust optimum n-level threshold, which is required to be automatic. The rest of this section presents a more precise formulation of the problem and introduces some basic notation.

In the proposed system, a gray image is used because a color image takes more computation time. Each image has L intensity levels, which are in the range \( \{0, 1, 2, \ldots, L-1\} \). Then, we can define the probability distribution as [52]:

$$ p_{i} = \frac{h_{i}}{N}, \quad \sum\limits_{i=1}^{L} p_{i} = 1 $$
(3)

where i represents a particular intensity level, i.e., \( 1 \le i \le L - 1 \), N is the total number of pixels in the image, and \( h_{i} \) is the number of pixels with intensity level i. In other words, \( h_{i} \) represents the image histogram, which can be normalized and regarded as the probability distribution \( p_{i} \) of the image. The total mean (i.e., combined mean) can be simply computed as:

$$ \mu_{T} = \sum\limits_{i=1}^{L} i\,p_{i} $$
(4)
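As a small worked example, the following Python sketch computes the normalized histogram and the total mean for a toy image. The function names and the pixel data are hypothetical, and the code indexes the intensity levels from 0 to L − 1, which is an implementation choice of this sketch.

```python
def histogram_probabilities(pixels, levels):
    """Return p_i = h_i / N for i = 0 .. levels - 1 (Eq. 3)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1          # h_i: number of pixels with intensity level i
    n = len(pixels)           # N: total number of pixels
    return [h / n for h in hist]

def total_mean(probs):
    """Total mean: sum of i * p_i over all intensity levels (Eq. 4)."""
    return sum(i * p for i, p in enumerate(probs))

# Hypothetical image with N = 8 pixels and L = 4 intensity levels.
pixels = [0, 0, 1, 3, 3, 3, 2, 2]
probs = histogram_probabilities(pixels, levels=4)
mu_t = total_mean(probs)
```

The total mean obtained from the histogram equals the plain average of the pixel values (14/8 = 1.75 here), which is a quick sanity check for an implementation.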

The generic n-level thresholding can be derived from 2-level thresholding, in which n − 1 threshold levels \( t_{j} \), \( j = 1, \ldots, n - 1 \), are necessary and the operation is performed as expressed in Eq. (5):

$$ F(x,y) = \left\{ \begin{array}{ll} 0, & f(x,y) \le t_{1} \\ \frac{1}{2}(t_{1} + t_{2}), & t_{1} < f(x,y) \le t_{2} \\ \vdots & \\ \frac{1}{2}(t_{n-2} + t_{n-1}), & t_{n-2} < f(x,y) \le t_{n-1} \\ L, & f(x,y) > t_{n-1} \end{array} \right. $$
(5)
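Eq. (5) can be sketched in Python as a lookup of the pair of thresholds that bracket each pixel value. The function name `multilevel_map` and the toy values are assumptions made for illustration; `bisect_left` from the standard library counts how many thresholds lie strictly below a value, which matches the inequalities in Eq. (5).

```python
from bisect import bisect_left

def multilevel_map(image, thresholds, levels):
    """Eq. (5): 0 at or below t_1, the midpoint of the bracketing thresholds
    in between, and L above t_{n-1}. `thresholds` must be sorted ascending."""
    def value(f):
        k = bisect_left(thresholds, f)   # number of thresholds strictly below f
        if k == 0:
            return 0
        if k == len(thresholds):
            return levels
        return (thresholds[k - 1] + thresholds[k]) / 2
    return [[value(f) for f in row] for row in image]

# Hypothetical 2 x 2 image, three thresholds (n = 4 classes), and L = 256.
toy = [[5, 60], [150, 250]]
mapped = multilevel_map(toy, thresholds=[50, 100, 200], levels=256)
```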

where x and y are the pixel coordinates along the width (W) and the height (H) of the image, respectively. The image of size \( H \times W \) is denoted by \( f(x,y) \) and has L intensity gray levels. In this situation, the pixels of a given image are divided into n classes \( (D_{1}, \ldots, D_{n}) \), which may represent multiple objects or even specific features of such objects (e.g., topological features).

The optimal threshold is obtained by the method that maximizes the between-class variance, which is the most computationally efficient method. The between-class variance can be generally defined by:

$$ \sigma_{B}^{2} = \sum\limits_{j=1}^{n} W_{j} (\mu_{j} - \mu_{T})^{2}, $$
(6)

where j represents a particular class, and \( W_{j} \) and \( \mu_{j} \) are the probability of occurrence and the mean of class j, respectively. The probabilities of occurrence \( W_{j} \) of the classes \( (D_{1}, \ldots, D_{n}) \) are given by:

$$ W_{j} = \left\{ \begin{array}{ll} \sum\limits_{i=1}^{t_{j}} p_{i}, & j = 1, \\ \sum\limits_{i=t_{j-1}+1}^{t_{j}} p_{i}, & 1 < j < n, \\ \sum\limits_{i=t_{j-1}+1}^{L} p_{i}, & j = n. \end{array} \right. $$
(7)

The mean \( \mu_{j} \) of each class is computed as:

$$ \mu_{j} = \left\{ \begin{array}{ll} \sum\limits_{i=1}^{t_{j}} \frac{i p_{i}}{W_{j}}, & j = 1, \\ \sum\limits_{i=t_{j-1}+1}^{t_{j}} \frac{i p_{i}}{W_{j}}, & 1 < j < n, \\ \sum\limits_{i=t_{j-1}+1}^{L} \frac{i p_{i}}{W_{j}}, & j = n. \end{array} \right. $$
(8)

In other words, the n-level thresholding problem is reduced to an optimization problem: searching for the thresholds \( t_{j} \) that maximize the objective function (i.e., the fitness function), defined as:

$$ \varphi = \max_{1 < t_{1} < \cdots < t_{n-1} < L} \sigma_{B}^{2}(t_{j}) $$
(9)
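
To make Eqs. (6)-(9) concrete, the following Python sketch evaluates the between-class variance for a given set of thresholds and then solves Eq. (9) by brute force. It is an illustration only: the function names and the list-based histogram `p` (where `p[i-1]` is the probability of level i) are assumptions, and the exhaustive search is not the optimizer used in this work.

```python
from itertools import combinations


def between_class_variance(p, thresholds):
    """Eq. (6) for a normalized histogram p, where p[i-1] is the probability of level i."""
    L = len(p)
    mu_total = sum(i * p[i - 1] for i in range(1, L + 1))
    bounds = [0] + sorted(thresholds) + [L]
    variance = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        levels = range(lo + 1, hi + 1)               # class j covers levels t_{j-1}+1 .. t_j
        w = sum(p[i - 1] for i in levels)            # Eq. (7)
        if w == 0:
            continue                                 # an empty class contributes nothing
        mu = sum(i * p[i - 1] for i in levels) / w   # Eq. (8)
        variance += w * (mu - mu_total) ** 2
    return variance


def exhaustive_search(p, n_classes):
    """Eq. (9) by brute force: try every ordered set of n-1 thresholds with 1 < t < L."""
    candidates = combinations(range(2, len(p)), n_classes - 1)
    return max(candidates, key=lambda t: between_class_variance(p, t))
```

For L = 256 and n = 4 the brute-force search already has to evaluate about 2.7 million threshold sets, which illustrates why the cost grows quickly with the number of levels.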

As the number of threshold levels and image components increases, this optimization problem demands much more computational effort. This raises the question of which methods can solve it in real-time applications [52]. Recently, biologically inspired methods, such as PSO, have emerged as efficient alternatives to analytical methods for solving optimization problems [13]. A recent example is FODPSO, a natural extension of DPSO that uses fractional calculus to control the convergence rate [35]. It was extended to the classification of remote sensing images in [18, 35, 52].

As in the classical PSO, particles in FODPSO travel through the search space to find an optimal solution, interacting and sharing information with other particles. At each step t of the algorithm, the success of a particle is evaluated by a fitness function. To model the swarm s, each particle n moves in a multidimensional space according to a position \( x_{n}^{s}[t] \), \( 0 \le x_{n}^{s}[t] \le L - 1 \), and a velocity \( v_{n}^{s}[t] \). The position and velocity values are strongly influenced by the individual best \( \tilde{x}_{n}^{s}[t] \) and the global best \( \tilde{g}_{n}^{s}[t] \) information:

$$ \begin{aligned} v_{n}^{s}[t + 1] & = \alpha\, v_{n}^{s}[t] + \frac{1}{2}\alpha\, v_{n}^{s}[t - 1] + \frac{1}{6}\alpha(1 - \alpha)\, v_{n}^{s}[t - 2] \\ & \quad + \frac{1}{24}\alpha(1 - \alpha)(2 - \alpha)\, v_{n}^{s}[t - 3] + \rho_{1} r_{1} \left( \tilde{g}_{n}^{s}[t] - x_{n}^{s}[t] \right) \\ & \quad + \rho_{2} r_{2} \left( \tilde{x}_{n}^{s}[t] - x_{n}^{s}[t] \right) \\ \end{aligned} $$
(10)
$$ \text{x}_{\text{n}}^{\text{s}} [\text{t} + 1] = \text{x}_{\text{n}}^{\text{s}} [\text{t}] + \text{v}_{\text{n}}^{\text{s}} [\text{t + 1}] $$
(11)

The weight coefficients \( \rho_{1} \) and \( \rho_{2} \) control the global and individual performance, respectively. The random vectors \( r_{1} \) and \( r_{2} \) have components that are uniform random numbers between 0 and 1. The fractional coefficient \( \alpha \), \( 0 < \alpha < 1 \), controls the inertial influence of the particles by weighing the influence of past events in determining a new velocity. When FODPSO is applied to the multilevel thresholding of images, the velocities of the particles are initially set to zero and their positions are randomly set within the boundaries of the search space, i.e., \( v_{n}^{s}[0] = 0 \) and \( 0 < x_{n}^{s}[0] < L - 1 \). In other words, the number of intensity levels L determines the search space; for the segmentation of an 8-bit image, the particles are deployed between 0 and 255. Each particle thus finds a possible solution \( \varphi^{c} \), which is compared with those of all other particles in the same swarm. The particle with the highest between-class variance \( \varphi^{c} \) is the best-performing particle (i.e., \( \tilde{g}_{n}^{s} \)) and lures the other particles toward it. It is also noteworthy that when a particle improves, i.e., when it finds a higher between-class variance from one step to the next, the fractional extension of the algorithm produces a stronger exploitation behavior. This improves the collective convergence of the algorithm and gives good short-term performance. FODPSO thus finds, for a predefined number of clusters, thresholds with a higher between-class variance. In [35], the authors demonstrated that the FODPSO-based segmentation method performs considerably better in terms of accuracy than the genetic algorithm, the bacterial algorithm, PSO, and DPSO, finding different numbers of clusters with a higher between-class variance and more stability in less computational time.
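
The update rules in Eqs. (10) and (11) can be sketched for a single particle as follows. This is a minimal illustration, not the implementation used in this work: the function name `fodpso_step`, the default values of `alpha`, `rho1`, and `rho2`, and the clamping of positions to the intensity range are all assumptions made for the example.

```python
import random


def fodpso_step(x, v_hist, personal_best, global_best,
                alpha=0.6, rho1=0.8, rho2=0.8, max_level=255, rng=random):
    """One update of Eqs. (10)-(11) for a single particle.

    x             : current position, one threshold per dimension
    v_hist        : the four most recent velocities [v[t], v[t-1], v[t-2], v[t-3]]
    personal_best : best position found by this particle
    global_best   : best position found by its swarm
    The defaults for alpha, rho1, and rho2 are placeholder values.
    """
    # Fractional-order memory weights, taken term by term from Eq. (10).
    weights = (alpha,
               alpha / 2,
               alpha * (1 - alpha) / 6,
               alpha * (1 - alpha) * (2 - alpha) / 24)
    new_x, new_v = [], []
    for d in range(len(x)):
        memory = sum(w * v[d] for w, v in zip(weights, v_hist))
        r1, r2 = rng.random(), rng.random()          # uniform in [0, 1)
        vd = (memory
              + rho1 * r1 * (global_best[d] - x[d])
              + rho2 * r2 * (personal_best[d] - x[d]))
        # Assumption: keep positions inside the intensity range [0, L - 1].
        xd = min(max(x[d] + vd, 0), max_level)
        new_v.append(vd)
        new_x.append(xd)
    # The new velocity becomes v[t]; the oldest entry is dropped.
    return new_x, [new_v] + v_hist[:3]
```

Starting from `v_hist = [[0.0]] * 4` reproduces the zero initial velocity described above; in a full optimizer the fitness of each new position would be evaluated with the between-class variance and the best positions updated accordingly.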

4 The Proposed MRI Image Segmentation System

Many medical image segmentation systems are used for detecting brain structures and tumors, but they differ in accuracy and execution time. Therefore, our goal is to build a robust segmentation system for brain images. Like all thresholding-based methods, FODPSO segmentation suffers from two main disadvantages. First, it cannot handle inhomogeneity. Second, it fails when the object intensity does not appear as a peak in the histogram. In the MS method, the size of the kernel needs to be tuned by the user [35]. This tuning may be difficult, and the final results may be dramatically affected by it. The proposed medical image segmentation system consists of three main phases: pre-processing, segmentation, and validation, as shown in Fig. 1. We take into account both accuracy and time. In the pre-processing phase, we use a median filter for de-noising and the Brain Extraction Tool (BET) for skull stripping. In the segmentation phase, we integrate MS and FODPSO to take advantage of both. Finally, the output of the proposed system is validated against the ground truth.

Fig. 1 The block diagram of the proposed framework

CT can be used for image segmentation, but not on its own, and it is not as good as MRI. It is used together with MRI in the fusion process to improve the data. In stereotactic surgery, the image resolution of the lesion or target is higher in MRI than in CT scans. The stereotactic frame produces artifacts in the images, but fewer in MRI, because MRI can use contrast enhancement or different pulse sequences. In particular, MRI offers benefits over CT with high-contrast ventriculography when performing stereotactic surgery in patients with brain lesions or normal anatomical targets [53].

Ghamisi et al. [35] proposed an approach based on two segmentation methods, FODPSO and mean shift segmentation, for hyperspectral image analysis. In contrast, we apply the same approach to a different type of data, namely brain MRI images. Compared with brain MRI images, hyperspectral images have several disadvantages, primarily cost and complexity. Hyperspectral data analysis requires fast computers, sensitive detectors, and large data storage capacities, since hyperspectral cubes are large multidimensional datasets that can exceed hundreds of megabytes. The cost and time of processing hyperspectral data are therefore greatly increased. Consequently, applying the proposed system to brain MRI images gives better accuracy and a shorter segmentation time than in the hyperspectral case.

4.1 The Preprocessing Phase

The main targets of this phase are improving image quality and removing noise. It has two sub-stages: de-noising and skull stripping. In medical images, de-noising is necessary for sharpening and clearing the image and for eliminating noise and artifacts. MRI images are usually affected by Gaussian and Poisson noise [54]. A median filter sorts all pixel values in the window numerically and then replaces the processed pixel with the median of those values. Median filtering is better than linear filtering at removing noise in the presence of edges [55]. MR images are also corrupted by Rician-distributed noise, which is assumed to be white and introduces a signal-dependent bias that reduces contrast. However, a widely used acquisition technique for decreasing the acquisition time gives rise to correlated noise [56, 57]. In addition, the skull and the background of the image are removed because they do not contain any useful information. Removing unhelpful information, such as the background, skull, scalp, eyes, and all other structures, decreases memory usage and increases processing speed. Skull removal is performed with the Brain Extraction Tool (BET) algorithm [58].
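
The median filtering step described above can be sketched as follows. This is a pure-Python illustration rather than the filter used in the experiments; the function name `median_filter`, the 3 × 3 default window, and the truncation of windows at the image border are assumptions.

```python
from statistics import median_low


def median_filter(image, size=3):
    """Replace each pixel by the median of its size x size neighbourhood."""
    h, w = len(image), len(image[0])
    r = size // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Assumption: windows are truncated at the image border.
            window = [image[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            # median_low always returns an existing pixel value, even when a
            # truncated window holds an even number of pixels.
            row.append(median_low(window))
        out.append(row)
    return out
```

Applied to a flat patch of value 10 with a single impulse of 255 in the middle, the filter removes the impulse completely, whereas an averaging (linear) filter would spread it over the neighbouring pixels.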

4.2 The Segmentation Phase

In this phase, we integrate MS and FODPSO to take advantage of both segmentation techniques. First, FODPSO segments the input brain MRI image, as shown in Table 1. Then, MS segments the output of this step; in other words, the result of FODPSO is used as the input to MS. FODPSO is a favorable method because the number of clusters can be predefined and it finds the optimal set of thresholds with a higher between-class variance in less computational time. Finally, we extract the brain structures (WM, GM, and CSF) from the segmented image as binary images, and the proposed system is validated in the next phase.

Table 1 FODPSO segmentation algorithm [18]

4.3 The Validation Phase

In this phase, the result of image segmentation with the proposed clustering techniques is compared to the ground truth, as illustrated in the experimental results. The calculated measures are time, the Jaccard similarity coefficient, and the Dice similarity coefficient. The experimental results section details the performance of the segmented images and how each performance measure is computed. The segmentation accuracy (SA) can be defined as:

$$ \text{SA} = \frac{\text{Number of correctly classified pixels}}{\text{Total number of pixels}} \times 100\,\% $$
(12)
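
As a small illustration of Eq. (12), the sketch below compares a segmented label image with its ground truth pixel by pixel. The function name and the nested-list image representation are assumptions made for the example.

```python
def segmentation_accuracy(segmented, ground_truth):
    """Eq. (12): percentage of pixels whose label matches the ground truth."""
    pairs = [(s, g)
             for s_row, g_row in zip(segmented, ground_truth)
             for s, g in zip(s_row, g_row)]
    correct = sum(1 for s, g in pairs if s == g)
    return 100.0 * correct / len(pairs)


print(segmentation_accuracy([[1, 0], [1, 1]], [[1, 0], [0, 1]]))  # prints 75.0
```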

5 The Experimental Results and Discussion

The proposed system is implemented in MATLAB R2011a on a system with a 2 GHz Core(TM) 2 Duo processor and 4 GB of RAM. We used three standard datasets. The first dataset is the BRATS [59] (Multimodal Brain Tumor Segmentation) database. It consists of multi-contrast MRI scans of 30 glioma patients (both low-grade and high-grade, and both with and without resection), along with expert annotations for “active tumor” and “edema”. For each patient, several types of images are available, such as T1, T2, FLAIR, and post-Gadolinium T1 MRI images. This database contains 81 images and has ground truth images against which the results of our method can be compared. These images are obtained from the Brain Web Database at the McConnell Brain Imaging Centre of the Montreal Neurological Institute, McGill University.

The second dataset is the Brain Web [60] database. It contains phantom and simulated brain MRI data based on two anatomical models: normal and multiple sclerosis. For both models, full 3-dimensional data volumes are simulated using three sequences (T1-, T2-, and proton density- (PD-) weighted) with a variety of slice thicknesses, noise levels, and levels of intensity non-uniformity. The data used here are of the T1 modality with 1 mm slice thickness. This dataset consists of 152 images.

The third dataset is a Digital Imaging and Communications in Medicine (DICOM) dataset [61]. It consists of 22 images that contain brain tumors. All DICOM image files are encoded in the JPEG2000 transfer syntax with the “.DCM” extension. No ground truth images are available for this dataset.

5.1 Measuring the Segmentation Performance

To provide a proper comparison between the tested methods, we use different performance measures, such as:

  1. Jaccard similarity coefficient [62, 63]: It is a widely used overlap measure and is commonly used as a similarity index for binary data. The overlap \( JSC \) between the segmented image \( S_{1} \) and the gold standard image \( S_{2} \) is computed as shown in Eq. (13).

    $$ JSC = |S_{1} \cap S_{2}| \text{/} |S_{1} \cup S_{2}| $$
    (13)
  2. Dice similarity coefficient [62, 63]: It measures the extent of spatial overlap between two binary images and is the most widely used measure of segmentation performance. Its values range between 0 and 1: a value of zero means there is no overlap, and a value of one means complete agreement. The Dice coefficient is defined as:

    $$ D = 2|S_{1} \cap S_{2}| \text{/} (|S_{1}| + |S_{2}|) = 2JSC\text{/}(1 + JSC) $$
    (14)
  3. Accuracy

    $$ \text{True Positive (TP)} = \frac{\text{No. of resulting images having brain tissue}}{\text{total No. of images}} $$
    (15)
    $$ \text{True Negative (TN)} = \frac{\text{No. of images that do not have brain tissue}}{\text{total No. of images}} $$
    (16)
    $$ \text{False Positive (FP)} = \frac{\text{No. of non-brain images detected as positive}}{\text{total No. of images}} $$
    (17)
    $$ \text{False Negative (FN)} = \frac{\text{No. of images having brain tissue and not detected}}{\text{total No. of images}} $$
    (18)
    $$ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} $$
    (19)

    TP (true positive) and FP (false positive) are the numbers of voxels correctly and incorrectly classified as brain tissue by the automated algorithm, respectively. TN (true negative) and FN (false negative) are the numbers of voxels correctly and incorrectly classified as non-brain tissue by the automated algorithm, respectively.
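
The three measures listed above can be computed together from a pair of binary masks, as in the following sketch. It counts TP, TN, FP, and FN per voxel, which is one possible reading of the definitions above; the function name `overlap_metrics` and the flattened-list mask representation are assumptions.

```python
def overlap_metrics(segmented, reference):
    """Jaccard (Eq. 13), Dice (Eq. 14), and accuracy (Eq. 19) for flat binary masks."""
    pairs = list(zip(segmented, reference))
    tp = sum(1 for s, r in pairs if s and r)
    tn = sum(1 for s, r in pairs if not s and not r)
    fp = sum(1 for s, r in pairs if s and not r)
    fn = sum(1 for s, r in pairs if not s and r)
    union = tp + fp + fn
    # Assumption: two empty masks are treated as being in full agreement.
    jaccard = tp / union if union else 1.0
    dice = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    accuracy = (tp + tn) / len(pairs)
    return jaccard, dice, accuracy


jsc, dice, acc = overlap_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
```

In this example TP = 2, TN = 2, FP = 1, and FN = 1, so the Jaccard coefficient is 0.5 and the Dice coefficient is 2/3, in agreement with the relation D = 2JSC/(1 + JSC).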

Table 2 lists the main parameters of FODPSO. IT is the maximum number of iterations. N is the initial number of particles in each swarm. The coefficients \( \rho_{1} \) and \( \rho_{2} \) are weights that control the global and individual performance. The fractional coefficient \( \alpha \), \( 0 < \alpha < 1 \), weighs the influence of past events in determining a new velocity. \( N^{s} \) is the number of swarms, where \( N_{\max}^{s} \) and \( N_{\min}^{s} \) are the maximum and minimum numbers of allowed swarms. \( N_{kill} \) is the number of particles deleted from a swarm after a period without improvement in fitness. \( \Delta v \) is the maximum number of levels a particle can travel between iterations.

Table 2 The parameters of FODPSO

Table 3 shows the main stages of the proposed method. The first stage is skull removal, performed with the BET algorithm [58]. The second stage uses the FODPSO algorithm combined with the MS algorithm: the output of FODPSO is supplied as the input to MS. From experiments with MS on all images of the three datasets, we found by trial and error that the best clustering results are obtained with bandwidth = 0.2. Decreasing the bandwidth for the same threshold processes the images in less time. The chosen bandwidth value can cause over-clustering or under-clustering: too small a bandwidth produces over-clustering, whereas too large a bandwidth merges critical modes and produces under-clustering. In the third dataset, we also detect the tumor with the FODPSO algorithm combined with the MS algorithm (Tables 4, 5, and 6).
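
The effect of the bandwidth on the number of clusters can be illustrated with a simplified one-dimensional mean shift. This sketch is not the MS implementation used in the experiments: it uses a flat kernel on intensity values only, and the function name, the normalization of intensities to [0, 1], and the sample values are assumptions.

```python
def mean_shift_1d(values, bandwidth, n_iter=50, tol=1e-6):
    """Flat-kernel mean shift: move every point to the mean of the data within `bandwidth`."""
    modes = list(values)
    for _ in range(n_iter):
        shifted = []
        for m in modes:
            neighbours = [v for v in values if abs(v - m) <= bandwidth]
            shifted.append(sum(neighbours) / len(neighbours))
        converged = max(abs(a - b) for a, b in zip(shifted, modes)) < tol
        modes = shifted
        if converged:
            break
    # Merge modes that ended up closer than the bandwidth into one cluster centre.
    centres = []
    for m in sorted(modes):
        if not centres or m - centres[-1] > bandwidth:
            centres.append(m)
    return centres


intensities = [0.10, 0.12, 0.14, 0.50, 0.52, 0.54, 0.90, 0.92]
for bw in (0.01, 0.2, 1.0):
    print(bw, len(mean_shift_1d(intensities, bw)))
```

On this toy data, a bandwidth of 0.01 leaves every value as its own cluster (over-clustering), 0.2 recovers the three intensity groups, and 1.0 merges everything into a single cluster (under-clustering).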

Table 3 The main steps of the proposed framework
Table 4 The comparison between five different segmentation techniques on the two tested datasets
Table 5 The comparison between FCM, Mean Shift, PSO segmentation algorithms
Table 6 The comparison between DPSO and FODPSO + MS segmentation algorithms

In Tables 7 and 8, the mean errors are measured on the two tested datasets using the JSC and Dice coefficients. The proposed technique (FODPSO + MS) gives better results than all other tested techniques.

Table 7 The mean errors for the Jaccard and the Dice similarity coefficients for DS1
Table 8 The mean errors for the Jaccard and the Dice similarity coefficients for DS2

Table 9 shows that the accuracy of FCM is the same as that of MS for the two datasets. In Table 10, the accuracy of DPSO is better than that of PSO. In Table 11, FODPSO + MS is superior to the previous techniques, with an accuracy of 99.67 % (Figs. 2 and 3).

Table 9 The performance metrics of FCM and Mean Shift
Table 10 The performance metrics of PSO and DPSO
Table 11 The performance metrics of FODPSO + MS
Fig. 2 The performance measure of the segmentation techniques in seconds for DS1

Fig. 3 The performance measure of the segmentation techniques in seconds for DS2

6 Conclusion

Achieving acceptable performance is a hard target in the segmentation process because medical images contain unknown noise. The proposed approach is based on the combination of the FODPSO and MS techniques. With FODPSO, the number of clusters can be predefined, and the optimal set of thresholds with a higher between-class variance can be found in less computational time. In the proposed approach, the result of FODPSO is used as the input to MS to develop a pre-processing method for the classification. The main difficulty of MS is tuning the size of the kernel, and the obtained result may be affected by the kernel size. The results indicate that using both segmentation methods can overcome the shortcomings of each of them, and that the combination can significantly improve the outcome of the classification process.

In the future, the chaos-based concept will be integrated with PSO. Also, a hybrid technique based on clustering algorithms like FCM and multilevel thresholding like FODPSO can be applied to the input dataset for better results. We can also use multi-modal images, such as MRI and CT, to improve the results. Finally, to overcome the problem of solutions becoming trapped in local optima, a clustering method based on a biologically inspired genetic algorithm can be applied in future work.