1 Introduction

A brain tumor is a growth of abnormal and unnecessary cells in the brain [48]. Early detection and accurate segmentation of brain tumors can be crucial for effective treatment. Among the various imaging modalities, magnetic resonance (MR) imaging is one of the most common because it uses no ionizing radiation and provides high contrast between soft tissues. Despite these advantages, noise and intensity inhomogeneity in MR images are inevitable. In addition, the varying sizes and shapes of tumors, their poorly defined boundaries, and their various places of occurrence make brain tumor segmentation (BTS) a challenging task [5]. Manual analysis of brain MR images is a time-consuming, complex, and error-prone process. Furthermore, since it depends on the individual performance of the operator, the labels produced by different experts differ by 14–22% [13]. Automatic, computerized methods can therefore be very helpful in overcoming these problems.

So far, numerous methods have been proposed for BTS that can be categorized into four groups: region-based methods [2, 18, 39, 45], symmetry analysis [20, 25, 37, 53], learning-based methods [1, 4, 8, 11, 12, 16, 22, 31, 42, 44, 46, 47, 54,55,56] and contour/surface evolution methods [17, 19, 24, 29, 36, 40, 49].

Region-based methods such as region growing [2] and thresholding [18] are basic and simple. Their performance depends heavily on a significant intensity difference between tumor and non-tumor regions, so they may perform poorly in noisy and heterogeneous cases. Thus, thresholding methods are usually used as an initial step to determine the approximate location of the tumor [1].

Symmetry analysis methods exploit the asymmetry between the left and right cerebral hemispheres that is caused by the presence of a tumor [25]. In [37], a change detection method was proposed based on the symmetry axis of the brain. In this method, the Bhattacharyya coefficient computed from gray-level intensity histograms was used to find the most dissimilar regions. Although symmetry analysis can speed up the diagnosis process, finding the accurate symmetry axis is a challenging task. Furthermore, a tumor located across the symmetry axis can cause inaccurate segmentation.

Learning-based methods usually use supervised classifiers such as support vector machines [8] and decision trees [42] to segment brain tumors. More recently, the random forest (RF) methodology, which operates by constructing a multitude of decision trees in the training phase, has been used to make more accurate decisions [1, 16]. However, these classification methods need to extract useful and effective features, which complicates the algorithm. In [1], features extracted by the histogram of oriented gradients and local binary pattern methods are used as the learning attributes, and the RF is then used as a classifier to segment tumorous regions. In [31], a generative-discriminative hybrid model is proposed, which generates initial tissue probabilities for enhancing the classification and spatial regularization. In this model, 44 features including first-order texture, gradient information, and symmetry features are extracted for classification. In another scheme, based on similarities between multi-channel patches, the segmentation approach presented in [12] chooses similar patches from the training data and combines their labels to produce a segmentation map for the test case. In [47], features such as intensity, intensity differences, local neighborhood, and wavelet texture are first extracted, and the RF classifier is then applied to identify different regions and tumor tissues. Other learning-based methods, known as deep learning methods [11, 44, 46, 54, 56], do not need the feature extraction step; they automatically learn a hierarchy of increasingly complex features directly from data. However, supervised learning-based methods have relatively high computational costs and need a massive dataset with ground truth data for the training phase.
In this regard, the model presented in [44] is a deep learning-based framework for BTS and survival prediction which has an ensemble of three different convolutional neural network architectures for robust performance through majority rule.

On the other hand, a review of recent articles shows that active contour models (ACMs) are among the most powerful methods that have been used for BTS [17, 19, 24, 29, 36, 40, 49]. ACMs evolve a curve by minimizing an energy function to extract the desired object [23]. They can be classified as explicit [51] and implicit deformable models [38]. While explicit methods are based on a rigid parametric formulation, implicit deformable models (or level set methods) can handle topological changes such as the merging and splitting of evolving curves and are consequently less sensitive to the initial condition [38]. There are two main categories of level set methods: edge-based [9] and region-based models [10]. Because edge-based models use image gradients to stop the evolving contours, this highly localized image information may cause erroneous segmentations in noisy images and in images with smooth or discontinuous edges [9].

The most widely used region-based approaches, the Chan–Vese (CV) [10] and Mean Separation (MS) energy [52] models, are based on the Mumford-Shah technique [33], which uses global image statistics and assumes that each image is formed by two regions of approximately piecewise-constant intensities. Because they rely on global statistics, they are not appropriate for heterogeneous objects. Since heterogeneous objects frequently appear in natural and medical images, Lankton and Tannenbaum [28] proposed localized ACMs by considering a circular mask around each point along the evolving curve. The localized versions of the CV and MS energy models, known as the Localized Chan–Vese (LCV) and Localized Mean Separation (LMS) energy models, therefore perform relatively well on heterogeneous images owing to their local energy functions.

Region-based ACMs that utilize statistical intensity information are sensitive to the high mean intensity distance between consecutive regions. In this regard, Ilunga et al. [19] proposed a new reformulation of the LMS model which compensates the background intensity to balance the mean intensity distance between the foreground and the background. They used this model for BTS in MR images and named it localized ACM with background intensity compensation (LACM-BIC). Furthermore, in [17] a Fractional Wright Function (FWF) is used as a minimization of energy technique to improve the boundary tracking of the CV model wherein the FWF is utilized to find the boundaries of an object by controlling the inside and outside values of the contour.

In [27], to create a balanced technique with a strong ability to reject weak local minima, a new class of ACMs was proposed based on fuzzy logic. This method, named the fuzzy energy-based active contour (FEAC), has a robust performance and a desirable resistance to noise and can handle objects even with weak and smooth boundaries. Furthermore, instead of the traditional approach of solving the associated Euler-Lagrange equations, it uses a fast optimization algorithm proposed for level set based optimization to minimize the fuzzy energy function [43]. As with the CV and MS models, the use of global statistics means that this method also fails to find the object contour when gradual tonality variations appear in the image; consequently, other elements may be wrongly considered as objects of the scene or vice versa. Therefore, Fang et al. [15] proposed a localized patch-based fuzzy active contour (LPFAC) to overcome this drawback of the FEAC model. Because the LPFAC model uses fuzzy logic in addition to region-based and local information, it has the potential to be used for BTS: combining fuzzy logic with local statistics can lead to appropriate segmentation in images with noise, blurred boundaries, and discontinuous edges. However, the LPFAC model, like other advanced fuzzy ACMs [27, 30, 41, 50], assumes that the whole image is formed by two classes, the target object and the rest of the image. This assumption may lead to an erroneous segmentation in images with considerable dark areas, such as medical images. For instance, in Fluid-Attenuated Inversion Recovery (FLAIR) brain MR images, the cerebrospinal fluid has a low intensity value. Placing these dark areas in one class with the other brain tissues makes the mean intensity value of the class a poor representative of the corresponding pixels and consequently causes inaccurate segmentation.
Thus, in this paper, an extended and reformulated version of the LPFAC energy function, called the extended localized patch-based fuzzy active contour (ELPFAC) model, is proposed. It assumes that each image is formed by three regions: the tumor, dark tissues together with the dark background, and the rest of the brain tissues. Moreover, since pixel-based models are inevitably sensitive to the initial contour, noise, artifacts, the amount of local information considered, and the inhomogeneity of MR images, we use superpixels (SPs) as basic processing units to improve the robustness of the algorithm. Despite these advantages, using SPs reduces the resolution of the images. Hence, to preserve the accuracy of the segmentation while improving the robustness and execution time of the algorithm, we use the SP-based result as an initial value and continue the segmentation in a pixel-based way.

The rest of this paper is organized as follows. Section 2 introduces the basic concepts. In Section 3, the proposed BTS method is illustrated in detail. Experimental results and discussion are brought in Section 4. Finally, the main conclusions are provided in Section 5.

2 Basic concepts

2.1 Superpixel (SP) segmentation

SP-producing algorithms group pixels into homogeneous and meaningful clusters according to the degree of similarity between them. In this work, we used the Simple Linear Iterative Clustering (SLIC) method [3] to create SPs, which serve as atomic units for the subsequent procedures. SLIC is a fast method with good boundary adherence that uses both the intensity and the spatial position of each pixel to generate SPs. As a result, the created SPs have approximately regular shapes, especially in regions without intensity inhomogeneities. The first step of the algorithm is to initialize the cluster centers on a regular grid. Since a hexagonal shape matches its boundaries to image edges more flexibly than a square one, the SP seed points in this paper are arranged in a hexagonal pattern. Thus, the grid interval is \( S={\left(\frac{2\sqrt{3}N}{3k}\right)}^{1/2} \), where N is the number of pixels in the image and k is the number of SPs. Pixels are labeled by computing the weighted distances between cluster centers and pixels within a 2S × 2S region. The weighted distance, D, is defined as follows [3],

$$ D=\sqrt{{d_c}^2+{\left(\frac{d_s}{S}\right)}^2{R}^2} $$
(1)
$$ {d}_c=\sqrt{{\left({I}_i-{I}_j\right)}^2} $$
(2)
$$ {d}_s=\sqrt{{\left({a}_i-{a}_j\right)}^2+{\left({b}_i-{b}_j\right)}^2} $$
(3)

where dc and ds are the intensity and spatial distances and R is the compactness factor, which controls the flexibility of SP boundaries. When R is small, the resulting SPs adhere more tightly to image boundaries but have a less regular size and shape. When R is larger, the effect of spatial proximity increases and consequently the SP shapes are more regular. Ii and Ij are the intensity values of the ith and jth pixels respectively, and a and b are a pixel's Cartesian coordinates. Cluster centers are updated iteratively until the labels of all pixels remain unchanged. Figure 1a is an original MR image and Fig. 1b shows its corresponding SP map. It is evident that the SP boundaries adhere well to the edges and discontinuities of the image, which makes SPs appropriate for accurate BTS.
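As a minimal sketch of the quantities defined above, the grid interval S and the weighted distance D of Eqs. (1)–(3) can be computed for a single pixel and cluster center as follows. The function names and the tuple layout of the coordinates are illustrative assumptions, not part of the SLIC implementation used in the experiments.

```python
import math


def hex_grid_interval(num_pixels, num_superpixels):
    """Grid interval S = sqrt(2*sqrt(3)*N / (3*k)) for hexagonally placed seeds."""
    return math.sqrt(2.0 * math.sqrt(3.0) * num_pixels / (3.0 * num_superpixels))


def slic_distance(intensity_i, intensity_j, pos_i, pos_j, grid_interval, compactness):
    """Weighted distance D of Eq. (1) between pixel i and cluster center j.

    pos_i and pos_j are (a, b) coordinate pairs (hypothetical layout);
    compactness is the factor R.
    """
    d_c = abs(intensity_i - intensity_j)                          # Eq. (2)
    d_s = math.hypot(pos_i[0] - pos_j[0], pos_i[1] - pos_j[1])    # Eq. (3)
    return math.sqrt(d_c ** 2 + (d_s / grid_interval) ** 2 * compactness ** 2)
```

A larger compactness value increases the weight of the spatial term relative to the intensity term, which is why SP shapes become more regular as R grows.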

Fig. 1 (a) An original MR image and (b) the corresponding SPs

2.2 Localized patch-based fuzzy active contour (LPFAC) model

In [15], Fang et al. proposed a localized version of the FEAC model by incorporating a local patch around each pixel of the evolving curve. Let I(x) : Ω → R be a given gray level image to be segmented and C be a closed evolving curve in the image domain Ω, and assume that the image I is divided into two regions by the contour C, i.e., inside C and outside C. Fang et al. proposed the energy functional defined as follows,

$$ F\left(C,{v}_1,{v}_2,u\right)={\int}_{\varOmega_x}{\int}_{\varOmega_y}W\left(x,y\right).\kern0.5em u{(y)}^m{\left(I(y)-{v}_1(x)\right)}^2 dy\ dx+{\int}_{\varOmega_x}{\int}_{\varOmega_y}W\left(x,y\right).\kern0.5em {\left(1-u(y)\right)}^m{\left(I(y)-{v}_2(x)\right)}^2 dy\ dx $$
(4)

where x and y are independent spatial variables, each representing a single point in Ω (y lies in a neighborhood of x); I(y) represents the intensities of the points y in a local region centered at the point x; v1(x) and v2(x) represent the intensity means of the two local regions around the point x inside and outside the contour C respectively; the membership function u(y) ∈ [0, 1] is the membership degree of I(y) to the interior local region centered at the point x inside the contour C; and m is a weighting exponent on each fuzzy membership.

The function W(x, y), which is defined in (5), masks local regions. It will be 1 when the point y is within a circle with radius r centered at x, and 0 otherwise. The interaction of W(x, y) with the interior and exterior regions is illustrated in Fig. 2.

$$ W\left(x,y\right)=\left\{\begin{array}{c}1,\kern2.25em \left\Vert x-y\right\Vert <r\\ {}0,\kern2.75em otherwise.\end{array}\right. $$
(5)
Fig. 2 (a) A circular patch is considered for a certain pixel, x (black dot in the image), along the contour (the red curve). (b) The patch is split by the contour into the local interior (the green part) and local exterior (the yellow part) regions

The segmentation is then performed via a pseudo level set formulation based on the membership values u, where the evolving curve is represented by the pseudo zero level set of Lipschitz similar function u, such that,

$$ \left\{\begin{array}{c}C=\left\{x\in \varOmega :u(x)=0.5\right\}\kern4.25em \\ {} inside(C)=\left\{x\in \varOmega :u(x)>0.5\right\}\\ {} outside(C)=\left\{x\in \varOmega :u(x)<0.5\right\}\end{array}\right. $$
(6)

Keeping u fixed and minimizing the energy function F(C, v1, v2, u) with respect to v1 and v2, it is easy to get the following equations,

$$ {v}_1(x)=\frac{\int_{\varOmega_y}W\left(x,y\right).u{(y)}^m\ I(y)\ dy}{\int_{\varOmega_y}W\left(x,y\right).u{(y)}^m\ dy} $$
(7)
$$ {v}_2(x)=\frac{\int_{\varOmega_y}W\left(x,y\right).{\left(1-u(y)\right)}^m\ I(y)\ dy}{\int_{\varOmega_y}W\left(x,y\right).{\left(1-u(y)\right)}^m\ dy} $$
(8)

Furthermore, keeping v1 and v2 fixed and minimizing the energy function F(C, v1, v2, u) with respect to u, the fuzzy membership degree can be obtained as follows,

$$ u(x)=\frac{1}{1+{\left(\frac{I(x)-{v}_1(x)}{I(x)-{v}_2(x)}\right)}^{\frac{2}{m-1}}} $$
(9)
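The alternating updates of Eqs. (7)–(9) can be sketched for a single patch center x as below. This is an illustrative discretization, not the implementation of [15]: the points y are passed as flat sequences, `in_patch` plays the role of W(x, y), and the small `eps` term is an added assumption that avoids division by zero when I(x) coincides with a local mean.

```python
def local_means(intensities, memberships, in_patch, m=2.0):
    """Eqs. (7)-(8): fuzzy-weighted local means v1(x), v2(x) for one patch center.

    intensities, memberships and in_patch are equal-length sequences over the
    points y; in_patch[y] is truthy when W(x, y) = 1. Assumes the patch contains
    points with non-zero weight for both classes.
    """
    num1 = den1 = num2 = den2 = 0.0
    for i_y, u_y, w in zip(intensities, memberships, in_patch):
        if not w:
            continue
        inside = u_y ** m
        outside = (1.0 - u_y) ** m
        num1 += inside * i_y
        den1 += inside
        num2 += outside * i_y
        den2 += outside
    return num1 / den1, num2 / den2


def membership(i_x, v1, v2, m=2.0, eps=1e-12):
    """Eq. (9): membership degree of the pixel at x to the interior region."""
    d1 = (i_x - v1) ** 2 + eps  # eps is an assumed safeguard, not part of Eq. (9)
    d2 = (i_x - v2) ** 2 + eps
    return 1.0 / (1.0 + (d1 / d2) ** (1.0 / (m - 1.0)))
```

A pixel whose intensity is closer to v1(x) than to v2(x) receives u(x) > 0.5 and is therefore placed inside C according to Eq. (6).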

3 Proposed method

As mentioned in Section 2.2, the conventional LPFAC model is based on the assumption that the image is composed of two regions. This assumption in the images that have regions like dark tissues and black background (such as what is often observed in medical images) leads to inaccurate segmentation results. Thus, in this section, we develop a new fuzzy ACM which has an extended fuzzy energy function to provide a separate class for dark tissues, and then we explain how to use SPs to enhance the robustness of the algorithm in addition to reducing the computational cost.

3.1 Extended localized patch-based fuzzy active contour (ELPFAC)

Let I be a given gray level image and R1 be a target region to be segmented. There are two possible cases for the location of the target in the foreground region, which are illustrated in Fig. 3. In Fig. 3a, the target region, R1, is far from the dark regions, and the LPFAC model can successfully find it. In Fig. 3b, however, R1 is adjacent to the dark areas and shares boundaries with the dark background. As shown in Fig. 4a, let us define a closed evolving curve C in the image domain Ω as an initial contour for the target object. Moreover, according to the LPFAC model, consider a circular patch (W) around a pixel along the initial contour of the target, C. As described for Fig. 2b, the local patch is split by the contour C into local interior (the green part of the circle in Fig. 4b) and local exterior (the yellow part of the circle in Fig. 4b) regions, and the computations for the pixel under consideration are carried out according to these two classes. However, the exterior part of the local patch includes dark areas, and placing them together with gray regions makes the mean intensity value of the class a poor representative of the corresponding pixels, which leads to an incorrect segmentation. Since MR brain images, especially FLAIR images, contain considerable dark tissue in addition to the black background, we reformulated the LPFAC model into a new local fuzzy energy function that places dark areas in a separate class. To achieve this goal, in the proposed approach, we divide the image into three non-overlapping regions using the automatic nonparametric Otsu thresholding method, which selects a global threshold value by maximizing the separability of the resultant clusters in gray levels [34]. As depicted in Fig. 4c, these classes are as follows: the target (R1), the dark areas and background (R2), and the rest of the foreground (R3).
Thus, in contrast with the LPFAC model, the exterior part of the local patch, W, will be split into two regions by R2 and R3 and consequently it reduces the standard deviation of the classes and improves the segmentation result in such cases. The general form of the proposed energy function when a given image I(x) is approximated by the three distinct regions, is as follows,

$$ F\left({C}_1,{C}_2,{v}_1,{v}_2,{v}_3,{u}_1,{u}_2\right)={\int}_{\varOmega_x}{\int}_{\varOmega_y}W\left(x,y\right).\kern0.5em {u}_1{(y)}^m{\left(I(y)-{v}_1(x)\right)}^2 dy\ dx+{\int}_{\varOmega_x}{\int}_{\varOmega_y}W\left(x,y\right).\kern0.5em {u}_2{(y)}^m{\left(I(y)-{v}_2(x)\right)}^2 dy\ dx+{\int}_{\varOmega_x}{\int}_{\varOmega_y}W\left(x,y\right).\kern0.5em {\left(1-{u}_1(y)-{u}_2(y)\right)}^m{\left(I(y)-{v}_3(x)\right)}^2 dy\ dx $$
(10)

where v1, v2 and v3 represent the intensity means of three local regions around the point x in R1, R2, and R3 respectively. u1(y), u2(y) and (1 − u1(y) − u2(y)) are the fuzzy membership degrees of I(y) to the local interior region centered at the point x in R1, R2 and R3 respectively.

Fig. 3 Illustration of two cases of target location: (a) the target region (R1) is far from the dark areas and (b) R1 is adjacent to the dark areas and has common boundaries with the dark background

Fig. 4 Illustration of the differences between the LPFAC and the ELPFAC models in dealing with dark areas. (a) The initial contour for the target region is shown by the red curve and a circular patch (the orange curve) is considered around each pixel (small black dot) along the initial contour, (b) according to the LPFAC model, the patch is split by the contour C into local interior and local exterior regions (the green and yellow parts of the patch respectively) and (c) the image is divided into three non-overlapping regions, R1, R2, and R3 (shown in red, black, and blue respectively), so that the patch is split into three local regions by R1, R2, and R3

We initialize u1 and u2 as follows,

$$ {u}_1(x)=\left\{\begin{array}{c}\beta \kern6.5em x\in {R}_1\\ {}\left(1-\beta \right)/2\kern2.5em otherwise\end{array}\right. $$
(11)
$$ {u}_2(x)=\left\{\begin{array}{c}\beta \kern6.5em x\in {R}_2\\ {}\left(1-\beta \right)/2\kern2.5em otherwise\end{array}\right. $$
(12)

where β is a selectable parameter and this value can be in the range (0.5,1]. We empirically choose β = 0.8 for all the results mentioned in this paper.
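A minimal sketch of the initialization in Eqs. (11) and (12) is given below, assuming the three-region map is supplied as a flat sequence of region indices 1, 2, or 3 (a hypothetical encoding chosen for illustration).

```python
def init_memberships(region_labels, beta=0.8):
    """Initialize u1 and u2 following Eqs. (11)-(12).

    region_labels: sequence with one entry per pixel, equal to 1 (R1), 2 (R2)
    or 3 (R3); this encoding is an assumption of the sketch.
    """
    if not 0.5 < beta <= 1.0:
        raise ValueError("beta must lie in (0.5, 1]")
    low = (1.0 - beta) / 2.0
    u1 = [beta if r == 1 else low for r in region_labels]
    u2 = [beta if r == 2 else low for r in region_labels]
    return u1, u2
```

With this choice, the implied third membership 1 − u1 − u2 equals β for pixels in R3 and (1 − β)/2 elsewhere, so every pixel starts with the same dominant weight for its own class.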

It should be noted that to compute (10), we only consider those pixels which are near to the target evolving contour, C since considering all pixels in the image is not necessary and it just increases computational cost and memory occupation.

Keeping u1 and u2 fixed and minimizing the energy function F(C1, C2, v1, v2, v3, u1, u2) with respect to v1, v2 and v3, these local intensity means can be easily obtained as follows,

$$ {v}_i(x)=\frac{\int_{\varOmega_y}W\left(x,y\right).{u}_i{(y)}^m\ I(y)\ dy}{\int_{\varOmega_y}W\left(x,y\right).{u}_i{(y)}^m\ dy},\kern1em i=1,2 $$
(13)
$$ {v}_3(x)=\frac{\int_{\varOmega_y}W\left(x,y\right).\kern0.5em {\left(1-{u}_1(y)-{u}_2(y)\right)}^m\ I(y)\ dy}{\int_{\varOmega_y}W\left(x,y\right).{\left(1-{u}_1(y)-{u}_2(y)\right)}^m\ dy} $$
(14)

Furthermore, keeping v1, v2 and v3 fixed and minimizing (10) with respect to u1 and u2, the variable u can be expressed as follows,

$$ {u}_i(x)=\frac{1}{\sum_{j=1}^3{\left(\frac{I(x)-{v}_i(x)}{I(x)-{v}_j(x)}\right)}^{\frac{2}{m-1}}},\kern1em i=1,2 $$
(15)
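The three-class membership update of Eq. (15) can be sketched for one pixel as follows. The function name and the `eps` safeguard against zero distances are assumptions of this sketch; the third membership is recovered as 1 − u1 − u2, as in Eq. (10).

```python
def memberships3(i_x, local_means, m=2.0, eps=1e-12):
    """Eq. (15): memberships (u1, u2, 1 - u1 - u2) of the pixel at x.

    local_means is the triple (v1(x), v2(x), v3(x)).
    """
    # Squared distances to the three local means; eps is an assumed safeguard.
    d = [(i_x - v) ** 2 + eps for v in local_means]
    p = 1.0 / (m - 1.0)
    u = [1.0 / sum((d[i] / d[j]) ** p for j in range(3)) for i in range(2)]
    return u[0], u[1], 1.0 - u[0] - u[1]
```

A pixel whose intensity matches v1(x) receives u1 close to 1, while a pixel equidistant from v1(x) and v2(x) receives equal u1 and u2.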

To solve the energy functional, F(C1, C2, v1, v2, v3, u1, u2) in (10), we use a fast numerical scheme inspired by Song and Chan [43] and developed by Krinidis and Chatzis [27] instead of solving the Euler-Lagrange equation of the underlying problem. Thus, by this way, the algorithm calculates the energy changes directly and decides for each pixel depending on the sign of the energy changes.

Now for a pixel P of the given image, assume that the intensity value of P is I(P) and the corresponding fuzzy membership degrees for this point are \( {u}_{o_1} \) and \( {u}_{o_2} \). Suppose that we change the membership degrees of point P to the new values \( {u}_{n_1} \) and \( {u}_{n_2} \), which are calculated by (15), and ∆F is the difference between the new and old energy when we change the membership degrees of point P. Then, ∆F (derived from (29) in the Appendix) is calculated as follows,

$$ \Delta F=\sum \limits_{i=1}^2\sum \limits_{\varOmega_x}\left({s}_i(x)\ \left(\frac{{u_{n_i}}^m-{u_{o_i}}^m}{s_i(x)+{u_{n_i}}^m-{u_{o_i}}^m}\right)\ {\left(I(P)-{v}_i(x)\right)}^2\right)+\sum \limits_{\varOmega_x}\left({s}_3(x)\ \left(\frac{{\left(1-{u}_{n_1}-{u}_{n_2}\right)}^m-{\left(1-{u}_{o_1}-{u}_{o_2}\right)}^m}{s_3(x)+{\left(1-{u}_{n_1}-{u}_{n_2}\right)}^m-{\left(1-{u}_{o_1}-{u}_{o_2}\right)}^m}\right)\ {\left(I(P)-{v}_3(x)\right)}^2\right) $$
(16)

where \( {s}_i(x)={\sum}_{\varOmega_y}W\left(x,y\right).{\left[{u}_i(y)\right]}^m,\kern0.5em i=1,2 \) and \( {s}_3(x)={\sum}_{\varOmega_y}W\left(x,y\right).{\left[1-{u}_1(y)-{u}_2(y)\right]}^m \).
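The energy change of Eq. (16) can be sketched as a sum over the patch centers x whose local patch contains P. The sketch below assumes that the third term uses s3(x) and v3(x), in line with the definition of s3(x) above, and that the per-patch sums and means have already been computed; the data layout is hypothetical.

```python
def delta_energy(i_p, u_old, u_new, patches, m=2.0):
    """Energy change when the memberships of pixel P move from u_old to u_new.

    u_old, u_new: pairs (u1, u2) for P before and after the update.
    patches: iterable of (s, v) with s = (s1, s2, s3) and v = (v1, v2, v3),
             one entry per patch center x whose patch contains P (assumed layout).
    """
    old = (u_old[0], u_old[1], 1.0 - u_old[0] - u_old[1])
    new = (u_new[0], u_new[1], 1.0 - u_new[0] - u_new[1])
    total = 0.0
    for s, v in patches:
        for k in range(3):
            change = new[k] ** m - old[k] ** m
            # Each class contributes s_k * change / (s_k + change) * (I(P) - v_k)^2.
            total += s[k] * change / (s[k] + change) * (i_p - v[k]) ** 2
    return total
```

A negative return value indicates that the new memberships lower the energy, which is the case in which the fast scheme accepts the change for P.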

The proposed algorithm of ACM evolved by the fuzzy energy function is summarized in Algorithm 1.

ALGORITHM 1: The proposed ELPFAC model.

As it is shown in Fig. 5, we applied both the ELPFAC and the LPFAC models to the synthetic images. The initial contours and results of the LPFAC and ELPFAC models are depicted in Fig. 5a–c respectively. As shown in Fig. 5(b-1) and (c-1), when the target object is far from dark areas, both LPFAC and ELPFAC models find it properly. However, as it is shown in Fig. 5(b-2) and (c-2), when the target object is adjacent to the dark areas, the ELPFAC successfully converges while the LPFAC model fails in this situation. Moreover, to evaluate the robustness of the ELPFAC model, Rician and Gaussian noises with three different percentages are added to the challenging image Fig. 5(a-2), and results of the ELPFAC model are shown in Fig. 6. As it is shown in Figs. 6a, b, the proposed ACM also has acceptable performance in the presence of different levels of both Rician and Gaussian noises.

Fig. 5 (a) Illustration of two synthetic brain tumor images with their assumed initial contours (red curves): (a-1) the target region is far from the dark tissues and dark background, (a-2) the target object is adjacent to the dark tissues and has common boundaries with the dark background. (b) Results of the LPFAC and (c) results of the ELPFAC

Fig. 6 Results of the ELPFAC for the challenging image of Fig. 5a-2 in the presence of noise. From left to right, results of the ELPFAC for images corrupted with 3%, 5%, and 8% (a) Rician noise and (b) Gaussian noise

3.2 Automatic brain tumor segmentation utilizing superpixels and ELPFAC model (SP-ELPFAC)

Although the ELPFAC model has some advantages over other ACMs owing to its extended fuzzy energy and local statistics, it is still sensitive to noise, the location of the initial contour, inhomogeneity, and the amount of local information considered. Therefore, we use the SPs described in Section 2.1 instead of pixels as basic atomic units to improve the robustness of the algorithm and to reduce the computational cost. Moreover, to preserve the accuracy, the pixel-based ELPFAC continues the segmentation once the SP-based ELPFAC has stopped. Hence, the sensitivity to the contour initialization, noise, and heterogeneity can be significantly reduced without sacrificing accuracy.

Figure 7 illustrates the block diagram of the proposed method for automatic BTS. At the first stage, the intensity range of the images is normalized to [0, 255]. Moreover, because of the low contrast, poor edges, and heterogeneity of medical images, an anisotropic diffusion filter [35] is used to reduce noise while preserving edges and homogenizing areas. Then, to initialize the SP-based ACM, the automatic Otsu thresholding method [34] is used to divide the SPs of the target slice into three distinct clusters: tumorous SPs (SP1), SPs of dark tissues and background (SP2), and SPs of the rest of the brain tissue (SP3). In some cases, owing to the nature of brain MR images and the presence of significant artifacts and intensity inhomogeneity, non-tumor brain tissues may also have high intensity values. Therefore, to avoid assigning them incorrectly to the SP1 cluster, we disregard groups of connected SP1 superpixels that contain fewer than 4 SPs and move them to the SP3 cluster. The entire proposed BTS is presented in Algorithm 2.
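The rule for small connected groups in the SP1 cluster can be sketched as a connected-component search over the SP adjacency graph. The dictionary-based representation of the cluster labels and of the SP adjacency is a hypothetical choice for illustration; the paper does not specify how adjacency is stored.

```python
def reassign_small_groups(cluster_of, neighbors, min_size=4):
    """Move connected groups of SP1 superpixels with fewer than min_size members to SP3.

    cluster_of: dict mapping superpixel id -> cluster index (1, 2 or 3).
    neighbors:  dict mapping superpixel id -> set of adjacent superpixel ids.
    Returns a new dict; the input is left unchanged.
    """
    result = dict(cluster_of)
    seen = set()
    for start, cluster in cluster_of.items():
        if cluster != 1 or start in seen:
            continue
        # Depth-first search over adjacent superpixels that are also in SP1.
        group, stack = [], [start]
        seen.add(start)
        while stack:
            sp = stack.pop()
            group.append(sp)
            for nb in neighbors.get(sp, ()):
                if cluster_of.get(nb) == 1 and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        if len(group) < min_size:
            for sp in group:
                result[sp] = 3
    return result
```

For a chain of adjacent SPs in which four consecutive SPs belong to SP1 and a single isolated SP elsewhere also belongs to SP1, only the isolated one is moved to SP3.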

Fig. 7 The framework of the proposed automatic BTS

ALGORITHM 2: The proposed automatic BTS algorithm (SP-ELPFAC).

4 Experimental results

In our experiments, to evaluate the proposed BTS method and to compare it with other state-of-the-art methods, we use the publicly available multimodal BRATS datasets [6, 7, 26, 32] in two versions, 2013 and 2019. The BRATS 2013 dataset contains 80 patient images with ground truth data, of which 30 are real (20 high grade (HG) and 10 low grade (LG) glioma subjects) and 50 are synthetic (25 cases for each grade). All volumes of the dataset are skull stripped and interpolated to 1 mm isotropic resolution. The synthetic images of the dataset are degraded with different noise levels and intensity inhomogeneities, using Gaussian noise and polynomial bias fields with random coefficients. For each patient in both the real and synthetic images, T1, T1-contrast enhanced (T1C), T2, and FLAIR MR images are available. Since FLAIR images, owing to their greater sensitivity to subtle abnormalities, are usually considered a standard diagnostic tool for BTS in clinical routines, we use FLAIR images here to evaluate the proposed methods. Although all the real images of the BRATS 2013 dataset are also available in BRATS 2019, some new images of the BRATS 2019 dataset are also considered in our experiments.

To evaluate the quantitative performance of the proposed BTS method, Jaccard Similarity (JS) [21], Dice Similarity Coefficient (DSC) [14], Sensitivity, and Specificity metrics are used. Their definitions are given by Eqs. (17)–(20) respectively,

$$ JS=\frac{TP}{FN+ TP+ FP} $$
(17)
$$ DSC=\frac{2(TP)}{FN+2(TP)+ FP} $$
(18)
$$ Sensitivity=\frac{TP}{TP+ FN} $$
(19)
$$ Specificity=\frac{TN}{TN+ FP} $$
(20)

where TP is the True Positive (pixels correctly selected as tumorous tissue), FP the False Positive (pixels wrongly selected as tumorous pixels), FN the False Negative (undetected tumorous tissue) and TN the True Negative (pixels correctly selected as healthy tissue).
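For illustration, the four metrics of Eqs. (17)–(20) can be computed from the pixel counts as in the following sketch; the function name and the dictionary return type are assumptions made here for readability.

```python
def segmentation_metrics(tp, fp, fn, tn):
    """Evaluate Eqs. (17)-(20) from true/false positive and negative pixel counts."""
    return {
        "JS": tp / (fn + tp + fp),
        "DSC": 2 * tp / (fn + 2 * tp + fp),
        "Sensitivity": tp / (tp + fn),
        "Specificity": tn / (tn + fp),
    }
```

The two overlap measures are related by DSC = 2·JS / (1 + JS), so they always rank a set of segmentations in the same order.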

All simulations have been performed in MATLAB 2019b on Windows 10 operating system with a five core processor and 6 GB RAM.

4.1 Superpixels parameter

In the proposed method, we have used the SLIC method, which has two parameters to be tuned manually: the compactness factor (R) and the grid interval (S). To set the values of these two parameters, we conducted an empirical experiment on 10 randomly selected subjects from the dataset. As shown in Table 1, the DSC of the segmentation is calculated for four different values of S. Moreover, for each S, five different values of the compactness factor, R, are considered. Larger SPs reduce the computational cost but are less homogeneous, and Table 1 shows that the DSC decreases for S = 10. On the other hand, using extremely small SPs increases the error and reduces the segmentation accuracy, especially in noisy cases. Therefore, we selected S = 8 and R = 65, which gives an acceptable compromise between efficiency and computational cost.

Table 1 The effect of compactness factor (R) and size of SPs (S) on the segmentation performance with DSC metric (%)

4.2 Analyzing effect of SPs on the proposed BTS

Utilizing SPs in the proposed BTS not only reduces the computational cost but also improves the segmentation performance in two different aspects which are investigated in the following subsections.

4.2.1 Initialization improvement

In the proposed BTS method, we use Otsu thresholding [34] to initialize the ELPFAC model because of its advantages. The Otsu method is an automatic nonparametric thresholding technique that selects a global threshold value by maximizing the separability of the resultant clusters in gray levels. The procedure is very simple and uses only the zeroth- and first-order cumulative moments of the gray-level histogram. However, as shown in Fig. 8a, some cases have a large amount of noise and heterogeneity. Consequently, as depicted in Fig. 8b, a pixel-based Otsu thresholding algorithm may select some other parts of the brain as tumorous tissue in addition to the actual tumor. By contrast, since SP-based algorithms assign a single label to a set of pixels grouped by a defined similarity criterion, the effect of noisy pixels is reduced. Hence, as illustrated in Fig. 8c, applying Otsu thresholding to SPs gives a robust result.

Fig. 8 Effect of SPs on the initialization step. (a) An original real brain MR image with FLAIR modality, (b) the pixel-based and (c) the SP-based Otsu thresholding

4.2.2 Reducing sensitivity to the initial contour

Despite the advantages of the proposed ELPFAC model, it is sensitive to the location of the initial contour. However, using SPs instead of pixels reduces this sensitivity and makes the algorithm more tolerant of the location of the initial contour. Figure 9 shows the results of both the pixel-based and SP-based ELPFAC models on a real brain tumor image using two different initial contours. The initial contours and the results of the pixel-based and SP-based ELPFAC models are depicted in Fig. 9a–c respectively. As illustrated in Fig. 9(b-2), with the same initial contour, the pixel-based ELPFAC model yields different results for different values of r. This means that it is relatively sensitive to the localization parameter, r, which is the radius of the local mask, W(x, y). Hence, r should be chosen based on the size of the tumors, the location of the initial contours, and the amount of noise and inhomogeneity in the images. Since different patients, and also different slices of each case, have tumors of different sizes and different amounts of noise and inhomogeneity, it is not possible to determine a constant value of r that segments all the images. However, as shown in Fig. 9c, the SP-based ELPFAC model significantly reduces the sensitivity of the algorithm to the parameter r. From comprehensive empirical results, it can be concluded that considering one ring of neighboring SPs (r_{SP-based} = 1) for every SP being processed leads to reliable results for both real and synthetic images.

Fig. 9

a Illustration of a real brain tumor image with its two assumed initial contours (red curves), b-1 result of the pixel-based ELPFAC with r = 25, b-2 results of the pixel-based ELPFAC with two different values, r = 25 and r = 35, c results of the SP-based ELPFAC. As shown in b-2, the pixel-based ELPFAC model cannot segment the tumor using the initial contour shown in a-2 and also depends on the value of r, whereas the SP-based ELPFAC model is robust to changes in the initial contour

4.3 Complexity analysis

Because they apply local masks, local methods incur a linear increase in computation compared to global methods. To evaluate the computational complexity of the proposed local ACM, assume that at each iteration ℘ pixels are crossed by the moving contour; the proposed ACM then performs ℘q updates, where q is the number of pixels within the W(x, y) neighborhood. Since we consider c = 3 classes, the time complexity is O(c℘qT), where T is the total number of iterations required for convergence. The space complexity of the proposed ACM, on the other hand, is linear in the number of pixels, since the membership degrees of each pixel must be stored: it is O(cN), where N is the number of pixels in the image.

Although creating SPs adds some computational cost to the algorithm, using SPs instead of pixels considerably reduces both the number of computations at each iteration and the number of iterations required for convergence. If we apply SPs to the proposed ACM, the time and space complexities become O(c℘'q'T') and O(ck), respectively, where k is the total number of SPs in the image, ℘' is the number of SPs crossed by the moving contour, q' is the number of neighboring SPs, and T' is the number of iterations required for convergence. Based on the grid interval of SP segmentation described in Section 2.1, it can be concluded that \( k = 2\sqrt{3}N/(3S^{2}) \). Since we set S = 8 in our experiments, k will be considerably less than N and, consequently, ℘' and q' in SP-based segmentation will be significantly less than ℘ and q in pixel-based segmentation. Furthermore, since in SP-based segmentation a considerable number of pixels are classified in each iteration through processing SPs, the total number of iterations required for convergence declines to a great extent, so that T' ≪ T.
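To give a sense of scale, the following sketch evaluates the SP count for S = 8, reading the expression above as k = 2√3·N/(3S²). The slice size of 240 × 240 pixels is an assumption made only for illustration, and the function names are hypothetical.

```python
import math


def superpixel_count(n_pixels, grid_interval):
    """Approximate number of SPs, k = 2*sqrt(3)*N / (3*S**2)."""
    return 2.0 * math.sqrt(3.0) * n_pixels / (3.0 * grid_interval ** 2)


def update_count(classes, crossed, neighbors, iterations):
    """Leading-order update count: c * (units crossed) * (neighbors) * T."""
    return classes * crossed * neighbors * iterations


def storage_units(classes, n_units):
    """Membership degrees stored: one per class and per pixel (or per SP)."""
    return classes * n_units


if __name__ == "__main__":
    n = 240 * 240                      # assumed slice size, not stated in the text
    k = superpixel_count(n, grid_interval=8)
    print(f"k = {k:.0f} SPs, N / k = {n / k:.1f}")
    print("memberships stored:", storage_units(3, n), "(pixels) vs",
          storage_units(3, round(k)), "(SPs)")
```

For S = 8 the ratio N/k equals 3S²/(2√3) ≈ 55.4 regardless of image size, so the O(ck) storage is roughly 55 times smaller than O(cN). The reduction in running time additionally depends on ℘', q', and T', which the sketch leaves as inputs to `update_count`.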

4.4 Qualitative and quantitative results compared to the other ACMs

The qualitative results of the proposed Pixel-Based ELPFAC and SP-ELPFAC in comparison with LCV, LMS, LACM-BIC, and LPFAC ACMs for four real and four synthetic images of the BRATS 2013 dataset are presented in Figs. 10 and 11 respectively.

Fig. 10

Segmentation results of four real images (BRATS 2013): (from left to right) the initial contour, the ground truth, the LCV, the LMS, the LACM-BIC, the LPFAC, the ELPFAC and the SP-ELPFAC

Fig. 11

Segmentation results of four synthetic images (BRATS 2013): (from left to right) the initial contour, the ground truth, the LCV, the LMS, the LACM-BIC, the LPFAC, the ELPFAC and the SP-ELPFAC

To ensure a fair comparison, and because the Otsu thresholding algorithm performs better on SPs than on pixels (as discussed in Section 4.2.1 and illustrated in Fig. 8), we used the same manually drawn initial contour (for the tumor region) for all the ACMs. The parameters of the LCV, LMS, and LACM-BIC models are set as in [19]. The localization parameter is set to r = 25 for the LPFAC and the pixel-based ELPFAC models, while a local mask with a radius of one neighboring SP (equivalent to \( r_{\text{SP-based}} = 1 \)) is assumed for the SP-based ELPFAC model. It should be noted that, since different brain MR images contain different amounts of noise, artifacts, and inhomogeneity, an unsuitable initialization is likely. Moreover, the pixel-based ELPFAC model, like other pixel-based models, is sensitive to noise and inhomogeneity in MR images. Thus, the range of the localization parameter that is appropriate for most images is r ≥ 25, whereas in the SP-ELPFAC model, owing to the use of more local information, a finer localization parameter (r = 10) can be used for the pixel-based stage that runs once the SP-based ELPFAC has stopped.

Quantitative evaluation results for Figs. 10 and 11 are shown in Table 2. As can be observed, the LACM-BIC, ELPFAC, and SP-ELPFAC models achieve the best results because they account for dark areas in their computations. Moreover, since the SP-ELPFAC model uses fuzzy logic and combines the advantages of SP-based and pixel-based processing, it outperforms the other ACMs, as can be seen even from the qualitative results in Figs. 10 and 11. To make the advantages clearer, the average and standard deviation of the four metrics, together with the average processing time per image, are presented in Table 3 for the proposed model and the comparative models. The results show that the SP-ELPFAC model has the best performance, with JS = 0.8440, DSC = 0.9144, and Specificity = 0.9957. Although the Sensitivity of the proposed model is not the highest, Sensitivity alone is not a sufficient measure: labeling the entire image as tumorous tissue yields the maximum Sensitivity, so a segmentation method cannot be evaluated without considering the other metrics. The SP-ELPFAC also has the lowest standard deviation for JS, DSC, and Specificity among the comparative models, which indicates the robustness and reliability of the proposed model. Furthermore, as can be seen in Table 3, the SP-ELPFAC has the shortest computation time in addition to the highest accuracy, which can be a prominent advantage for BTS in consecutive slices of a 3D volume.
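The remark about Sensitivity can be checked with a short sketch. It assumes the standard confusion-matrix definitions of the Jaccard similarity (JS), Dice similarity coefficient (DSC), Sensitivity, and Specificity on binary masks; the function name `segmentation_metrics` and the toy masks are hypothetical and do not come from the experiments reported here.

```python
import numpy as np


def segmentation_metrics(pred, truth):
    """JS, DSC, Sensitivity and Specificity of a binary mask against ground truth."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.count_nonzero(pred & truth)      # tumor pixels found
    fp = np.count_nonzero(pred & ~truth)     # background labeled as tumor
    fn = np.count_nonzero(~pred & truth)     # tumor pixels missed
    tn = np.count_nonzero(~pred & ~truth)    # background kept as background

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "JS": ratio(tp, tp + fp + fn),
        "DSC": ratio(2 * tp, 2 * tp + fp + fn),
        "Sensitivity": ratio(tp, tp + fn),
        "Specificity": ratio(tn, tn + fp),
    }


if __name__ == "__main__":
    truth = np.zeros((10, 10), dtype=bool)
    truth[4:6, 4:6] = True                   # toy 2 x 2 "tumor"
    everything = np.ones_like(truth)         # degenerate mask: whole image is tumor
    print(segmentation_metrics(everything, truth))
```

Labeling every pixel as tumor leaves no false negatives, so Sensitivity reaches its maximum while JS, DSC, and Specificity collapse, which is why the four metrics are read together.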

Table 2 Quantitative results of ACMs for images of Figs. 10 and 11
Table 3 Average and standard deviation of quantitative metrics (mean ± std) and average of computation time (in seconds) for both Figs. 10 and 11 (BRATS 2013)

Figure 12 shows a further evaluation of the proposed methods on four real images selected from the BRATS 2019 dataset. The proposed approaches are compared with the ACMs that performed better in Table 3. As can be seen in Table 4, which contains the quantitative results for Fig. 12, the proposed ELPFAC and SP-ELPFAC approaches still achieve competitive results compared to the best ACM methods.

Fig. 12

Segmentation results of four real images (BRATS 2019): (from left to right) the initial contour, the ground truth, the LACM-BIC, the LPFAC, the ELPFAC and the SP-ELPFAC

Table 4 Average and standard deviation of quantitative metrics (mean ± std) and average of computation time (in seconds) for Fig. 12 (BRATS 2019)

4.5 Comparison with the other state-of-the-art methods

Table 5 shows the quantitative comparison of SP-ELPFAC against other state-of-the-art approaches on real images of the BRATS 2013 dataset, while Table 6 covers those methods from Table 5 whose results were available for synthetic brain images in the same dataset. In Table 7, the overall result of the proposed method is compared with the methods in [17, 22], which reported their results in total for both real and synthetic images. The results illustrate the feasibility of the proposed method on both synthetic and real images. Moreover, for the real clinical images, the proposed method outperforms the others in most cases. Overall, comparing the results in Tables 5, 6, and 7 shows that, although none of the methods outperforms the others in all metrics and in both HG and LG cases, the proposed method has acceptable and competitive performance in both HG and LG cases and also in heterogeneous synthetic images.

Table 5 Comparison with state-of-the-art methods using BRATS 2013 dataset for real images
Table 6 Comparison with state-of-the-art methods using BRATS 2013 dataset for synthetic images
Table 7 Comparison with state-of-the-art methods which reported their results in total (for both real and synthetic images) using BRATS 2013 dataset

5 Conclusion

In this paper, we proposed a new method that uses an extended localized fuzzy ACM to segment brain tumors in MR images. Compared to previous models, the proposed fuzzy ACM provides a separate class for dark tissues and the other dark parts of brain MR images, which leads to better performance in cases with large dark regions. Moreover, to preserve accuracy while reducing the computational time, the segmentation process begins on SPs and ends on pixels, which makes the method more appropriate for BTS in consecutive slices of 3D MR volume data. In addition, experimental results show that using SPs provides a better initialization and makes the method more robust to the location of the initial contour and the size of the localization parameter, r. Finally, comparative experiments on the BRATS 2013 and 2019 datasets have demonstrated the advantages of the proposed BTS method over other related methods. It should be noted that the extended fuzzy energy function of the proposed ACM also allows it to be used with other modalities, such as computed tomography scans and ultrasound images, which also contain considerable dark areas.

Optimizing the proposed method and extending it to accurately segment the different tissues of brain tumors are further directions currently under consideration in our research.