1 Introduction

Image thresholding continues attracting the attention of researchers from different areas, due to the growing demand for computer vision systems over the last decade. Digital cameras are ubiquitous nowadays; multiple electronic devices are equipped with cameras and require specific software for image treatment and understanding for applications such as surveillance, medical diagnosis, industrial implementations, etc. In this context, the first stage of this kind of systems is the segmentation [47].

Thresholding (TH) is a specific form of segmentation which consists in the separation of the pixels into different groups depending on their intensity level according to one or more threshold values [46]. Thresholding is not only easy but also is robust and it can work with noisy images [19, 46]. Based on the number of thresholds required for the image, the TH process can be divided into two categories: 1) bi-level and 2) multilevel [29, 60]. In the bi-level approach, the pixels that have an intensity value higher than the threshold are labeled as an object, and the remaining pixels are part of the background. On the other hand, the multilevel TH (MTH) uses several thresholds to separate the pixels in different regions that represent objects contained in the image. For the segmentation of real-life images, the best alternative is the use of multilevel thresholding. The thresholding problem can be summarized as the search for the best threshold values on an image. It should be noticed that the threshold points depend on the histogram of the image; thus, each image has its own set of best threshold values.

According to Hammouche [23], different approaches have been proposed to estimate the best threshold values; such methods can be divided into parametric and nonparametric. In the first group, a set of parameters is used to generate an approximation of the histogram; such parameters also help to generate the classes for the pixel segmentation. This process requires a big amount of computational effort. In contrast, the nonparametric techniques are less compute-intensive and consequently more frequently used. Nonparametric approaches only require the number of thresholds and the histogram of the image to find the best threshold values by maximizing or minimizing a criterion. Two nonparametric approaches are extensively used in the literature: 1) Otsu’s between-class variance [40] and 2) Kapur’s entropy [26]. The Otsu’s criterion employs the histogram to search the maximum value of the between-class variance that exists in the regions. Similarly, Kapur’s method maximizes the entropy to measure the homogeneity of the classes. Both Otsu and Kapur have been initially proposed for bi-level thresholding. However, they can be easily extended for multilevel segmentation. Such approaches can provide good segmentation results; nevertheless, there are two important disadvantages [44]. The first drawback is the computational time required to find the best number of desired thresholds. Such problem occurs because the threshold values (th) are commonly extracted using an exhaustive search procedure. However, the computational effort and complexity required to perform the search increase exponentially with the number of th to search [39]. This situation also affects the complexity making unfeasible the pixel classification in a high number of groups [57]. In this context, the use of Otsu and Kapur for real-time applications is prohibitive. In this sense, for this kind of the applications, it is necessary to have a good balance to find the best th values in a considerable time is necessary [41]. Meanwhile, the second drawback of the classical approaches concerns the information used to find the best classification of the pixels. During this process, they do not consider spatial contextual information and only use the histogram of the image.

To face the problems previously described some interesting solutions had been developed. For the first disadvantage, the use of Evolutionary Computation Algorithms (ECA) is proposed. ECAs are interesting search strategies designed to solve complex optimization problems. For image segmentation, ECAs use as an objective function the Otsu or Kapur mechanisms to find the best threshold values. In this context, several ECA implementations have been introduced for multilevel thresholding. Some examples of such approaches are the use of Genetic Algorithms (GA) [22, 51, 59], Particle Swarm Optimization (PSO) [14, 15, 32, 45, 57], Differential Evolution (DE) [3, 7, 49], Cuckoo Search (CS) [5], Wind Driven Optimization (WDO) [5], Firefly Algorithm (FFA) [24, 56], Electromagnetism-Like Optimization (EMO) [37], Social Spider Optimization (SSO) [41] Flower Pollination Algorithm (FPA) [41] and Crow Search Algorithm (CSA) [39]. For the second limitation related to the spatial contextual information, the Energy Function (EF) has been proposed [18, 44]. The EF computes the energy of each intensity level considering the position and vicinity of each pixel. This process generates the energy curve that possesses the same features that the histogram. For example, the energy curve can contain some peaks, in this case, it can be separated according to a number of modes [42]. In other words, it is possible to find a threshold in a valley between two modes [6]. In the related literature, it has been proposed the use of CS with EF for image segmentation [42]. Meanwhile, in [44] is presented a segmentation approach using the GA to find the best thresholds using the energy curve. However, since EF has recently proposed from the best of our knowledge, it has not been studied for other implementations.

The Antlion Optimizer (ALO) [34] and the Sine Cosine Algorithm (SCA) [35] are two ECA recently proposed. They have not been tested over image processing problems. In this way, and considering the No-Free-Lunch (NFL) theorem that states that not all the optimization algorithms can be applied to the same problem [54], it is necessary to verify if ALO and SCA can provide good results for image segmentation. The main motivation of this article is to propose and test alternative solutions for the problematic of multilevel thresholding image segmentation. In other words, in this paper, there are not only applied the ECA but, also their performance is analyzed to prove that are able to solve real problems.

The ALO is an interesting approach inspired by the hunting mechanism of an insect known as antlion. The iterative process of ALO consists of five steps: random walks for ants, traps generation, lure ants into traps, capture the preys, and restructure traps. ALO was firstly introduced for mathematical optimization and for solving some single engineering problems. The ALO has been applied to train neural networks [55], it has also been modified for feature selection in machine learning [58] and power systems [20]. There exist different metaheuristic approaches inspired by ants that take as a base the Ant Colony Optimization (ACO) proposed by Dorigo [10, 11]. The main difference between ALO and ACO is that ACO is defined for combinatorial problems and use graphs to represent the problems [11]. ACO has also been adapted to solve non-combinatorial optimization problems preserving the analogy with ants and graphs; this version is called ACOR [48]. In this way, ALO works directly in continue domains without any adaptation to graphs. ACO and other similar methods like the Ant Colony Systems (ACS) [12] are part of the trajectory-based algorithms. Meanwhile, the ALO is based on a swarm and corresponds to the ECA.

On the other hand, the SCA is an optimization technique that uses the mathematical functions sine and cosine to perform the exploitation and exploration of the search space. The SCA considers two elements of the set of candidate solutions. One of the selected elements affects the next position that the other element will take. The sine and cosine functions are used to compute the new positions using some variables that allow switching between one of both mathematical operators (sine or cosine). The SCA has been tested over a big amount of benchmark function showing good performance in comparison with similar approaches [35]. In the same context, SCA has also been applied to the design of airfoil to verify its capabilities over real problems [35]. Recently it has also been applied for feature selection in different datasets [21].

Considering the capabilities of ALO and SCA showed in different implementations, this paper introduces their use for multilevel thresholding based on the energy function. The objective functions considered in this work are the maximization of Otsu and Kapur criteria using ALO and SCA. The performance of the two algorithms is compared considering the segmented output images and the values of the objective function. The ALO and SCA were selected since they are new approaches that consider only a few internal parameters. Since the good performance of an ECA relies on the precise arrangement of the initial parameters, algorithms with fewer parameters are easier to configure.

Considering the above, to the best of our knowledge, ALO and SCA have not been used for image processing problems. In this sense, the aims of this paper can be summarized as follows:

  1. 1.

    The introduction of two alternative methods for multilevel thresholding based on ALO and SCA considering spatial contextual information.

  2. 2.

    Perform a comparative study of the results of these new optimization approaches in the selected problem.

  3. 3.

    Test the ALO and SCA in a multidimensional image processing problem and verify their efficiency.

The methodology for MTH using ALO and SCA with energy function is similar. Both approaches consider as an input (search space) the energy curve previously computed. The candidate solutions are generated based on the number of thresholds to find for each image; this amount is similar to the number of dimension of the problem. ALO and SCA employ different operators (separately) to find the best configuration of thresholds. The quality of the candidate solution is evaluated using the Otsu or Kapur criteria. The best solutions from ALO and SCA are compared with both classical and modern approaches such as PSO, GA, CSA, Runner-Root Algorithm (RRA) [RRA] and ACOR. The comparisons include an analysis of the segmented images from each ECA. For this task, the Peak Signal-to-Noise Ratio (PSNR), the Root Mean Squared Error (RMSE), the Feature Similarity (FSIM) and the Structural Similarity Index (SSIM) [37, 38, 41] are evaluated.

The reminder paper is organized as follows: Section 2 presents the energy function, the problem of multilevel thresholding and the methods of Otsu and Kapur. Section 3 explains the proposed algorithms based on ALO and SCA. Meanwhile, in Section 4 are presented the experimental results. Finally, Section 5 discusses the conclusions.

2 Energy curve for multilevel thresholding

Image segmentation has become an important subject of interest as an intensive research community has explored this topic over the last years. Thresholding has become the most used technique since it has proven to be the most efficient method due to its characteristics. Thresholding can provide accurate results with a simple implementation for image segmentation. The performance of thresholding has been widely reported in the literature where new algorithms are evaluated every year. However, most research is focused on applying thresholding over the image’s histogram. Recently, the energy curve was introduced to incorporate spatial information into the thresholding method [18, 44]. As a result, the performance of traditional and novel algorithms used for thresholding should be evaluated considering the formulation of the energy curve to provide useful information to the image segmentation community and update the state-of-the-art.

This section presents a study of the previous work related to multilevel thresholding. Here are also introduced the concepts and properties of the energy curve of an image. Additionally, in this section, the problem of multilevel thresholding is defined. Two of the most iconic techniques used to accurately classify the pixels are introduced which can be extended to work with the energy curve instead of the histogram.

2.1 Previous work

Several segmentation techniques are available as the results of decades of research. Among the developed techniques, the thresholding approach is the most simple, robust, and accurate from all of them [23, 47]. Nevertheless, the multilevel thresholding of images performed by classical methods generates problems as those implementations are associated with high computational cost in the process of finding the best threshold values. The image thresholding problem has solved using evolutionary algorithms to search for the best threshold values without increasing the computational cost. Recent contributions continue to provide applications of novel metaheuristic algorithms to the thresholding problem [9, 13, 25, 30, 50].

Although the histogram-based thresholding is the most used approach, there have been efforts to incorporate contextual information into the thresholding process. Some of the most relevant approaches incorporate the spatial information into the segmentation. The work of Ghosh started the incorporation of this information [18], and it was recently coupled with the Cuckoo Search algorithm [42] and later extended with the use of a grey-level co-occurrence [43]. As the formulation of the incorporation of the energy curve to the thresholding process was not addressed until recently, an extensive evaluation of thresholding techniques using the energy curve is not present in the current literature.

2.2 Energy curve

The energy curve of an image provides an alternative to the histogram-based segmentation. The energy curve has interesting properties such as it considers spatial contextual information of the image and not just the intensity of the pixel. Compared to the histogram, energy curves seem to be smoother while preserving valleys and peaks. The image thresholding technique consists of the search for the corresponding threshold values in the middle of the valley regions of the energy curve. Every valley exists between two adjacent modes, and each mode characterizes an object present in the image. Thus, an accurate selection of threshold values generates good segmentation results.

A digital image I used for this process can be defined as a matrix I = {lij, 1 ≤ i ≤ m, 1 ≤ j ≤ n} of size m × n where lij denotes the gray level of the image I at the pixel (i,j). The maximum value of the gray intensity of the image I is denoted as L. To consider the contextual spatial information, a spatial correlation between surrounding pixels is calculated. For this purpose, a neighborhood N of order d at given position (i,j) is used \( {N}_{ij}^d=\left\{\left(i+u,j+v\right),\left(u,v\right)\in {N}^d\right\} \).The value of d determines the configuration that the neighborhood system takes [18]. This paper considers the second-order neighborhood system N2. The system can be defined in spatial terms as(u, v) ∈ {(±1, 0), (0, ±1)(1, ±1)(−1, ±1)} and is shown in Fig. 1.

Fig. 1
figure 1

The spatial representation of the neighborhood system N2

The energy of the image I at gray intensity value l(0 ≤ l ≤ L)is calculated by generating a two-dimensional matrix for every intensity value asBl = {bi, j, 1 ≤ i ≤ m, 1 ≤ j ≤ n} where bi, j = 1 if the intensity at the current position is greater than l the intensity value (li, j > l), or else bi, j =  − 1. Figure 2 depicts a grayscale image I with the intensity values l on top of each pixel and two examples of the binary matrix Bl. Considering Fig. 2 as an example, let C = {cij, 1 ≤ i ≤ m, 1 ≤ j ≤ n} be a constant matrix where cij = 1, ∀ (i, j) the energy value El of the image I at gray intensity value l is computed as:

$$ {E}_l=-\sum \limits_{i=1}^m\sum \limits_{j=1}^n\sum \limits_{pq\in {N}_{ij}^2}{b}_{ij}\cdot {b}_{pq}+\sum \limits_{i=1}^m\sum \limits_{j=1}^n\sum \limits_{pq\in {N}_{ij}^2}{c}_{ij}\cdot {c}_{pq} $$
(1)
Fig. 2
figure 2

Example image I, binary matrix Bl for l = 35 and l = 80 and constant matrix C

The right side of the equation Eq. 1 is a fixed term devoted to assuring a positive energy value El ≥ 0. A quick look at Eq. 1 shows that for a given image I at intensity value l will be zero if all the elements of the binary image Bl are either 1 or −1. This approach determinates the energy associated to every intensity value of the image to generate a curve considering spatial contextual information of the image.

2.3 Thresholding

Thresholding is the easiest way to segment an image. The ease of thresholding consists in the use of threshold values (th) and apply them over the histogram until an optimal criterion is met. However, as it was previously explained, the histogram does not provide information about the position of the pixel. To include spatial information, the process of thresholding can be applied to the energy curve in the following rule:

$$ {I}_s\left(r,c\right)=\left\{\begin{array}{ccc}{I}_{Gr}\left(r,c\right)& \mathrm{if}& {I}_{Gr}\left(r,c\right)\le {th}_1\\ {}{th}_{k-1}& \mathrm{if}& {th}_{k-1}<{I}_{Gr}\left(r,c\right)\le {th}_k\\ {}{I}_{Gr}\left(r,c\right)& \mathrm{if}& {I}_{Gr}\left(r,c\right)>{th}_{nt}\end{array}\right.,\kern1em k=2,3,\dots nt $$
(2)

where Is(r, c) is the gray value of the segmented image, IGr(r, c) is the gray value of the original image both in the pixel position r, c. Since most applications require the segmentation of more than two classes, the energy curve is divided into nt + 1 classes using nt thresholds, where thk is the k-th threshold value used for the segmentation process. The main problem in image segmentation is to obtain the best thresholds that can ensure the best classification of pixels. In this paper, the Otsu’s and Kapur’s methods are used to find the best threshold configuration over the energy curve. The two approaches are described in the next subsections.

2.4 Otsu between class variance

The most popular thresholding technique was proposed by Otsu [40]. This unsupervised technique segments an image by maximizing the difference between various classes considering the image’s histogram. However, this method can be applied to the energy curve since it has similar features to the histogram. For the multi-level approach, nt thresholds are necessary to divide the original image into nt + 1 classes. Thus, the set of thresholds used for segmentation for a given image is encoded as TH = [th1, th2, …, thnt]. In this sense, the energy value Ei of each pixel of a digital image according to the frequency of its occurrence generates a probability PEi = Ei/NP where \( {\sum}_{i=1}^{NP}{PE}_i=1 \), and NP is the total number of pixels. According to the placement of every threshold value, each of the generated classes is used to compute the variance σ2 and their means μk as:

$$ {\sigma}^2=\sum \limits_{k=1}^{nt}{\sigma}_k=\sum \limits_{k=1}^{nt}{\omega}_k{\left({\mu}_k-{\mu}_T\right)}^2 $$
(3)
$$ {\mu}_k=\sum \limits_{i={th}_k}^{th_{k+1}-1}\frac{kPE_i}{\omega_i\left({th}_i\right)} $$
(4)

where:

$$ {\omega}_k=\sum \limits_{i={th}_k}^{th_{k+1}-1}{PE}_i $$
(5)

Finally, the Otsu’s method maximizes the variance for the given set of threshold values:

$$ {f}_{Otsu}\left(\mathbf{TH}\right)=\max \left({\sigma}^2\left(\mathbf{TH}\right)\right),\kern1em 0\le {th}_i\le L-1,\kern1em i=1,2,\dots, nt $$
(6)

2.5 Kapur’s entropy

An entropy-based method used to find optimal threshold values is the one presented by Kapur [27]. This thresholding method is based on the probability distribution of the image histogram and the entropy of the classes. Similar to Otsu’s method, it can be applied directly to the energy curve. Kapur’s method searches for the optimal set of thresholds TH that maximizes the overall entropy.

$$ {f}_{Kapur}\left(\mathbf{TH}\right)=\max \left(\sum \limits_{k=1}^{nt}{H}_k\right) $$
(7)

Where the entropy of each class is calculated as:

$$ {H}_k=\sum \limits_{i={th}_k}^{th_{k+1}-1}\frac{PE_i}{\omega_k}\ln \left(\frac{PE_i}{\omega_k}\right) $$
(8)

The probability distribution PEi and ωk are computed using the same criteria as Otsu’s method.

3 Optimization-based approaches for multilevel thresholding

This section presents the basics of Antlion Optimizer (ALO) and Sine Cosine Algorithm (SCA) and how they are modified for multilevel thresholding. Considering the procedure described in section 2, the selected evolutionary algorithms explore the search space defined by the energy function instead of the histogram. Each optimization approach is divided into two implementations; one considers the Otsu method and the second uses the Kapur entropy. The images used with this methods do not require any preprocessing steps.

The framework of the proposed approaches is depicted in Fig. 3. The image (I) is the input the proposed approach(s), and compute the energy curve of I, then the main steps of the ALO or SCA algorithm will be performed until predefined criteria (such as the maximum number of iterations).

Fig. 3
figure 3

The proposed model for multilevel segmentation

The first step in each approach (ALO/SCA) is to generate a random population that represents the threshold values of the given image. The structure definition of this population of candidate solutions X and how each element is constructed are illustrated in Eq. 9.

$$ {\displaystyle \begin{array}{l}\mathbf{X}=\left[{\mathbf{TH}}_1,{\mathbf{TH}}_2,\dots, {\mathbf{TH}}_N\right],\kern1em {\mathbf{TH}}_i={\left[{th}_1,{th}_2,\dots, {th}_{nt}\right]}^T,\kern0.5em \\ {}\mathrm{subject}\ \mathrm{to}\kern1em {th}_1<{th}_2<\dots <{th}_k<L\end{array}} $$
(9)

From Eq. 9, THi ⊆ X and it is a vector that contains the set of thresholds (thj) that should segment the image. The amount of thresholds in THi is defined by the dimensions of the problem (nt). On the other hand, the 8 bits digital images have 255 intensity levels (L), it means that the search space is defined between the bounds [0, 255].

Moreover, similar to most of the evolutionary algorithms, the candidate solutions are randomly generated from a feasible search space defined by the bounds [gmin, gmax], where gmin and gmax are the minimum and the maximum gray levels in the energy curve, respectively. Therefore, thi of the i-th solution (TH) can be generated using Eq. 10:

$$ {th}_i={g}_{\mathrm{min}}+\mathit{\operatorname{rand}}\times \left({g}_{\mathrm{max}}-{g}_{\mathrm{min}}\right),\kern0.5em \forall i=\left[1,2,\dots, K\right] $$
(10)

where rand is a random number uniformly distributed between 0 and 1. thi is the i-th threshold value from a feasible solution THi i corresponds to a dimension of the search space.

After that, the ALO (Section 3.1) or SCA (Section 3.2) is used to update the current solutions according to its behaviors then the terminal conditions are checked. If they are met, the algorithm stops and returns the best threshold; otherwise, it starts a new generation. The following subsections explain the main components of the proposed approach.

3.1 The antlion optimizer for multilevel thresholding

The aim of this component of the proposed algorithm that called ALO algorithm is to determine the multilevel thresholding value considering the next description. After the population X of N individuals is created (as in the previous step), the next step is to evaluate the quality of the thresholds using the Otsu or Kapur objective function defined in Eq. 6 and Eq. 7, respectively.

Then at each iteration in ALO algorithm, the position of ants are modified based on a random walk by using Eq. 11.

$$ {\mathbf{TH}}_i^t=\frac{\left({\mathbf{TH}}_i^t-{a}_i\right)\times \left({d}_i-{c}_i^t\right)}{\left({d}_i^t-{a}_i\right)}+{c}_i $$
(11)

Where ai and bi are the minimum and the maximum values of a random walk for the i-th variable respectively. \( {c}_i^t \) and \( {d}_i^t \) are the minimum and the maximum of the i-th variable at t iteration. The Eq. 11 represents the normalized of the random walks to constrain it in the boundary of search space.

The next stage in the ALO is the simulation of how the random walk of ants is affected by antlion’ traps. This is performed by using the following mathematical formula:

$$ {\displaystyle \begin{array}{l}{c}_i^t={Antlion}_j^t+{c}^t\\ {}{d}_i^t={Antlion}_j^t+{d}^t\end{array}} $$
(12)

Where ct and dt are the minimum and maximum of all variables at t iteration. Meanwhile, the roulette wheel operator is used to select the j-th \( {Antlion}_j^t \)at the t iteration. This element is a candidate solution THi extracted from X. Eq. 13 establishes that the random walk of ants in a hypersphere defined by the vectors c and d around a selected antlion. Based on this concept the ants are restricted to move within a hypersphere around a selected antlion. The ALO algorithm builds the strategy to hunt the ants by using a roulette wheel which is used to select an antlion (since ants are trapped in only one antlion) based on the fitness function values. Based on this strategy the fitter antlion has higher chances to catch the ants.

After finishing the stage of traps building, the ants are sliding towards antlion, and the antlions shoot sand outwards the center of the pit. To emulate this behavior, the radius of each hypersphere is decreased adaptively by using the following equation:

$$ {c}^t=\frac{c^t}{rad},\kern0.5em {d}^t=\frac{d^t}{rad},\kern0.5em rad=\frac{10^wt}{T} $$
(13)

rad is a radius for exploitation, w is a constant defined based on the current iteration (w = 2 when t < 0.1 T, w = 3 when t < 0.5 T, w = 4 when t < 0.75 T w = 5 when t < 0.9 T and w = 10 when t < 0.95 T). In other words, w adjust the accuracy level of exploitation. The final stage is to catch the prey when it reaches the bottom of the pit. In this step, the antlion withdraws the ant inside the sand to catch it. For formulating this process mathematically, the following equations are used:

$$ { Ant lion}_j^t={Ant}_i^t\kern0.5em \mathrm{if}\kern0.5em f\left({Ant}_i^t\right)>f\left({Ant lion}_j^t\right) $$
(14)

From Eq. 14 \( {Antlion}_j^t \) and \( {Ant}_j^t \) are the positions (THi) selected from X at the t iteration. This equation considers that whenever the ants become fitter than its corresponding antlion, the prey catching occurs. Then the antlion updates its position to the latest position of the hunted ant to enhance its chance of catching new preys.

The concept of elitism is used in this algorithm to save the best solution obtained at each iteration and affect the movements of all the ants. In this sense, each ant randomly walks around a selected antlion (which is selected by the roulette wheel) and the elite simultaneously using Eq. 15.

$$ {Ant}_i^t=\frac{R_A^t+{R}_E^t}{2} $$
(15)

\( {R}_A^t \) and \( {R}_E^t \) are the random walks around the antlion selected by the roulette wheel and around the elite, respectively, at t-th iteration. Based on all the previous stages of the ALO algorithm, the two random walks around the selected antlion and elite are used to update the position. Besides, the parameters c and d are updated with respect to the current iteration. The fitness function is used to evaluate the ants and based on this evaluation the antlions update their positions as follows: If any of the ants become fitter than any other antlions, then the position of antlions is changed to be the position of ants in the next iteration. Then the best antlion (elite) is updated. The previous steps are repeated until a stop criterion is met. Figure 4 summarizes all the steps of ALO.

Fig. 4
figure 4

The ALO algorithm

As was previously mentioned, the ALO mimics the hunting mechanism of a specific type of ants. Different to other ant-based algorithms as ACO or ACOR, in ALO the ant lion is a depredator that generates a trap for other ants considered the prey. Meanwhile, in the ACO based approaches, the metaphor is to search for the best food source while reinforcing the path generated with pheromones. Moreover, the operators used in ALO permits to work directly in continues search spaces, and the ACO was initially proposed for combinatorial problems.

3.2 The sine cosine algorithm for multilevel thresholding

For the multilevel thresholding problem, the SCA aims to find the optimal position in the search domain. In other words, the best solution represents the value of optimal thresholds configuration that maximizes one of the objective functions given in section 2. For MTH the SCA uses as an input the curve energy of the image with the input population X and the optimal thresholds (THbest) is the output (the optimal threshold position). In SCA each candidate solution THi (i = 1,2,…, N) is represented as a vector of possible real values corresponding to thresholds (Eq. 10). The quality of each candidate solution is evaluated at the beginning using one of the objective function (f) defined by Otsu’s or Kapur’s approaches.

The SCA has its novel mechanism to update the position of the candidate solutions after the evaluation of the objective function. Eq. 16 and Eq. 17 present the process to compute the new positions based on the sine and cosine mathematical functions [35].

$$ {th}_j={th}_j+{r}_1\times \sin \left({r}_2\right)\times \left|{r}_3\times \left({th}_{best}^j-{th}_j\right)\right|,\kern0.5em \forall j=\left[1,2,\dots, nt\right] $$
(16)
$$ {th}_j={th}_j+{r}_1\times \cos \left({r}_2\right)\times \left|{r}_3\times \left({th}_j^{best}-{th}_j\right)\right|,\kern0.5em \forall j=\left[1,2,\dots, nt\right] $$
(17)

In Eq. 16 and Eq. 17, thj is the current solution in the dimension j. The variables r1, r2 and r3 are random variables uniformly distributed between 0 and 1. Meanwhile \( {th}_j^{best}\in {\mathbf{TH}}_{best} \) is the destination position and |⋅| indicates the absolute value [35].

Based on the original SCA [35], the process to determinate the next areas for a new solution is the responsibility of r1; these areas could be either in the space between thi and \( {th}_i^{best} \) or outside this space. This parameter is updated through iterations as:

$$ {r}_1=a-t\left(\frac{a}{T}\right), $$
(18)

where t is the current iteration. T is the maximum number of iterations, and a is a constant. Moreover, determining the direction of movement of a candidate solution THi is the responsibility of r2. Meanwhile, r3 is employed to determine the random weight for THbest in order to stochastically deemphasize r3 < 1 or r3 > 1.

However, each solution can be updated simultaneous based on a random variable r4 (which is used to switch between the sine and cosine). Therefore, the Eq. 16 and Eq. 17 are combined as in Eq. 19:

$$ {th}_j=\left\{\begin{array}{ccc}{th}_j+{r}_1\times \sin \left({r}_2\right)\times \left|{r}_3\times \left({th}_j^{best}-{th}_j\right)\right|,\kern0.5em & \mathrm{if}& {r}_4<0.5\\ {}{th}_j+{r}_1\times \cos \left({r}_2\right)\times \left|{r}_3\times \left({th}_j^{best}-{th}_j\right)\right|,\kern0.5em & & \mathrm{otherwise}\end{array}\right.,\kern1em \forall j=\left[1,2,\dots, nt\right] $$
(19)

Summarizing the optimization process of SCA, the first step in SCA is to generate a random population and then compute the fitness function of each THi. The algorithm then determines the global best solution THbest (the target point) and the rest of the population is updated based on THbest. Moreover, to emphasize the exploitation of the search space the values of the parameters r1, r2, r3 and r4 are updated at each iteration. The stop criteria are set to the maximum number of iterations. Finally, the pseudo code of the SCA algorithm is presented in Fig. 5.

Fig. 5
figure 5

The SCA algorithm

4 Experiments and results

The ALO and SCA for multilevel thresholding are tested over eleven benchmark images. Such images contain different complexity levels. All the images have the same size (512 × 512 pixels), and they are in JPEG format [1]. This paper includes a supplementary material file (SMF) where is presented the full set of images (Fig. S1), the experimental results and comparisons of ALO and SCA. In the supplementary material, there are used only five images to graphically show the results of the proposed algorithms, and they are marked with an asterisk (*). For numerical outcomes, the eleven images are used, and the full tables are also included in the supplementary material file. In this context, in the SMF the reader can find an extended version of Tables 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10, Figs. 7, 8, 9, and 10. To avoid confusion, the tables in SMF are labelled with an S before the number and the order is the same that in this section (from Table S1 to Table S14).

Table 1 Threshold obtained by SCA, ALO, GA, PSO, CSA, RRA, and ACOR using the Otsu method as an objective function
Table 2 Mean and STD values of the PSNR using Otsu’s Method obtained by SCA, ALO, GA, PSO, SCA, RRA and ACOR
Table 3 Mean and STD values of the SSIM using Otsu’s Method obtained by SCA, ALO, GA, PSO, CSA RRA, and ACOR
Table 4 Mean and STD values of the FSIM using Otsu’s Method obtained by SCA, ALO, GA, PSO, CSA, RRA, and ACOR
Table 5 Threshold results after obtained by SCA, ALO, GA, PSO, CSA, RRA, and ACOR using the Kapur method as the objective function
Table 6 Mean and STD values of the PSNR using Kapur’s Method obtained by ALO, SCA, GA, PSO, CSA, RRA, and ACOR
Table 7 Mean and STD values of the SSIM using Kapur’s Method obtained by SCA, ALO, GA, PSO, CSA, RRA and ACOR
Table 8 Mean and STD values of the FSIM using Kapur’s Method obtained by SCA, ALO, GA, PSO, CSA, RRA, and ACOR
Table 9 p- results from the Wilcoxon’s rank test performed over the results obtained using Otsu as the objective function
Table 10 p- results from the Wilcoxon’s rank test performed over the results obtained using Kapur as the objective function

On the other hand, in Fig. 6, it is presented the subset of the most representative images along with their histograms and the corresponding energy curve computed as in Section 2. All the images from Fig. 6, present different histograms that represent the pixels distributions. In this context, the energy curve preserves some features from such distribution, but also it possesses other interesting features extracted from the neighborhood of each pixel. An example of the differences between histograms and curve energy is the Hunter image, where is possible to see how the energy curve is completely different to the histogram.

Fig. 6
figure 6

Subset of images for visual experiments

To verify the efficacy of the proposed approaches they have been compared against different metaheuristics from the state-of-the-art. Such methods are the Particle Swarm Optimization (PSO) [28], the Genetic Algorithms (GA) [8], Crow Search Algorithm (CSA) [4], Runner Root Algorithm (RRA) [33] and the Ant Colony Optimization for continuous domains (ACOR) [48]. All the selected ECA have different control parameters that are set based on their respective authors. According to the No-Free-Lunch (NFL) theorem, not all the optimization algorithms can be applied to the same problem [54]. In this context, the experiments and comparisons of ECA for image thresholding help to prove that the ALO and SCA are competitive for real problems.

Considering that the ECA involves the use of random variables it is necessary to statistically analyze their results. A set of 35 independent experiments were performed for each selected algorithm over all the images contained in the dataset. For the segmentation in this paper, there are used 2, 4, 8, 16 and 32 thresholds are used over the energy curve instead of the histogram. The stop criteria and the number of elements of the population are set to 500 and 50, respectively for each algorithm. All experiments were performed using MATLAB 8.3 on an Intel Xeon-2620 v3 CPU @ 2.4Ghz with 16GB of RAM.

The results of the MT algorithms based on ALO and SCA are measured not only considering the objective function values but also considering the quality of the segmented image. One of the metrics used is the Standard Deviation (STD) that represents the stability of the results obtained by the algorithms [17]. The STD is computed as follows:

$$ STD=\sqrt{\sum \limits_{i=1}^{Iter_{\mathrm{max}}}\frac{\left({\sigma}_i-\mu \right)}{Ru}} $$
(20)

Another interesting metric is the Peak-Signal-to-Noise Ratio (PSNR). It is used compare the similarity of the segmented image with the original image. The PSNR is based on the mean square error (MSE) of each pixel [1, 2, 24, 36]. Both PSNR and RMSE are defined as:

$$ {\displaystyle \begin{array}{l} PSNR=20{\log}_{10}\left(\frac{255}{RMSE}\right),\left(\mathrm{dB}\right)\\ {} RMSE=\sqrt{\frac{\sum \limits_{i=1}^{ro}\sum \limits_{j=1}^{co}\left({I}_{or}\left(i,j\right)-{I}_{th}\left(i,j\right)\right)}{ro\times co}}\end{array}} $$
(21)

From Eq. 21 Ior is the original image, Ith is the segmented image. Meanwhile, ro and co are the maximum number of rows and columns of the image. In the same context, the Structure Similarity Index (SSIM) compares the structures of the original segmented image [52], and it is defined in Eq. 22. A higher SSIM value represents a better segmentation of the original image.

$$ {\displaystyle \begin{array}{c} SSIM\left({I}_{or},{I}_{th}\right)=\frac{\left(2{\mu}_{I_{or}}{\mu}_{I_{th}}+C1\right)\left(2{\sigma}_{I_{or}{I}_{th}}+C2\right)}{\left({\mu_{I_{or}}}^2+{\mu_{I_{th}}}^2+C1\right)\left({\sigma_{I_{or}}}^2+{\sigma_{I_{th}}}^2+C2\right)}\\ {}{\sigma}_{I_{th}{I}_{or}}=\frac{1}{N-1}\sum \limits_{i=1}^N\left({I}_{or_i}+{\mu}_{I_{or}}\right)\left({I}_{th_i}+{\mu}_{I_{th}}\right)\end{array}} $$
(22)

In Eq. 22 the mean of the original image is \( {\mu}_{I_{or}} \) and the mean of the thresholded image is represented by \( {\mu}_{I_{th}} \). In the same way, for each image, the values of \( {\sigma}_{I_{Gr}} \) and \( {\sigma}_{I_{th}} \) correspond to the standard deviation. C1 and C2 are constants used to avoid the instability when \( {\mu_{I_{Gr}}}^2+{\mu_{I_{th}}}^2\approx 0 \). The values of C1 and C2 are set to 0.065 considering the experiments of [1].

In the same context, the Feature Similarity Index (FSIM) [31], calculates the similarity between two images: in this case, the original grayscale image, and the segmented image. As PSNR and SSIM the higher value is interpreted as better performance of the thresholding method. The FSIM is then defined as:

$$ FSIM=\frac{\sum \limits_{w\in \Omega}{S}_L(w){PC}_m(w)}{\sum \limits_{w\in \Omega}{PC}_m(w)} $$
(23)

On the FSIM the entire domain of the image is defined by Ω, and their values are computed by Eq. 24.

$$ {\displaystyle \begin{array}{c}{S}_L(w)={S}_{PC}(w){S}_G(w)\\ {}{S}_{PC}(w)=\frac{2{PC}_1(w){PC}_2(w)+{T}_1}{{PC_1}^2(w)+{PC_2}^2(w)+{T}_1}\\ {}{S}_G(w)=\frac{2{G}_1(w){G}_2(w)+{T}_2}{{G_1}^2(w)+{G_2}^2(w)+{T}_2}\end{array}} $$
(24)

G is the gradient magnitude (GM) of a digital image and is defined, and the value of PC that is the phase congruence is defined as follows:

$$ {\displaystyle \begin{array}{c}G=\sqrt{{G_x}^2+{G_y}^2}\\ {} PC(w)=\frac{E(w)}{\left(\varepsilon +\sum \limits_n{A}_n(w)\right)}\end{array}} $$
(25)

where An(w) is the local amplitude on scale n and E(w) is the magnitude of the response vector in w on n. ε is a small positive number and PCm(w) = max(PC1(w), PC2(w)).

4.1 Otsu’s results

The results of ALO and SCA using the Otsu’s method are presented in this subsection. ALO and SCA take Eq. 6 as an objective function and maximizes the between-class variance. This procedure is performed over the eleven images using five different amounts of thresholds. Table 1 presents a comparison of the best thresholds obtained by the selected optimization algorithms over the subset of images. In Table 1, the column I corresponds to the name of the image; column nt is the number of thresholds (nt = 2, 4, 8, 16, 32). The rest of the columns correspond to the results obtained by ALO, SCA, GA, PSO, CSA, RRA, and ACOR. Such values can be applied directly to the energy curve using Eq. 2.

The aim of multilevel image segmentation is to obtain the best quality in segmented images based on thresholds. In this context, the results of the PSNR between the segmented and the original images for the selected algorithms should be analyzed. To statistically compare the solutions obtained by the approaches the Mean after 35 experiments is reported; to verify the stability of the methods for this problem the STD is computed from the same experiments. All of these values are presented in Table 2 for the Otsu’s PSNR.

From Table 2 a higher Mean of PSNR value represents a better segmentation of the image using the thresholds obtained by each algorithm. On the other hand, a smaller value of the STD reflects less variation between the results obtained in each experiment. A similar study is presented in Table 3 for the values of the SSIM using the four algorithms.

Another interesting metric used to verify the quality of the segmented images is the FSIM. Table 4 shows the results of FSIM using the results obtained by each one of the selected evolutionary algorithms; like SSIM and PSNR, a higher value of FSIM represents a better segmentation performance.

Summarizing, the metrics presented in Tables 2, 3 and 4, are used to verify the quality of the segmented images generated by using the best thresholds obtained by the compared algorithms. Resultant values of PSNR, SSIM, and FSIM indicate that ALO can find better thresholds in comparison with the remainder methodologies used in the experiments. Figures 7 and 8 show the images obtained by ALO and SCA with a different number of thresholds using the Otsu’s method as the objective function. In specific Fig. 7, is based on the best thresholds generated by the ALO. The resultant images provide evidence that increasing the number of thresholds the quality of the segmented image also increases. This fact is also supported by the values of PSNR, SSIM, and FSIM.

Fig. 7
figure 7

Segmented images and thresholded Energy Curves obtained using ALO with Otsu objective function

Fig. 8
figure 8

Segmented images and thresholded Energy Curves obtained using SCA with Otsu objective function

Similar to the results of ALO, the images of Fig. 8 are obtained by the MTH algorithm based on SCA in combination with Otsu, this table shows the histograms with their respective thresholds. The graphical quality of the segmented images increases with the number of th used for segmentation. The problem here is that some values of th are very close to each other, this situation reduces the quality no matter the number of thresholds used.

4.2 Kapur’s results

This subsection presents the results of the selected algorithms using as an objective function the Kapur entropy (Eq. 7). The experimental process is the same that in section 4.1. A comparison of the best configuration of thresholds obtained by each of the selected algorithms (ALO, SCA, GA, PSO, CSA, RRA and ACOR) is presented in Table 5. From all the tables in this section, I correspond to an image from the dataset and nt is the number of thresholds.

The thresholds obtained by the ECA are applied over the energy curve of the image to classify the pixels. Once the output images are generated, they should be evaluated in order to verify the quality of this process. Again, the PSNR, SSIM, and FSIM are used to measure the quality of the segmentation. In Table 6 are presented the mean and STD values of PSNR for all the algorithms. For all metrics, it is expected a higher value that represents a good segmentation.

From Table 6 it is possible to observe that some algorithms can generate high PSNR values. However, this metric is not completely related to good optimization results. For example, algorithms like GA or PSO provide bad thresholds, but the PSNR is high in some cases. This fact occurs due that different pixels are not correctly classified and it can be also analyzed in Table S8 of the supplementary material. On the other hand, Tables 7 and 8 presents the mean and STD study for FSIM and SSIM metrics for each algorithm.

Alternatively, different segmented images are obtained using the thresholds and Eq. 1. The resultant images using the Kapur’s method as the objective function and ALO algorithm are presented in Fig. 9. Notice that this is a subset of the most representative images from the entire dataset.

Fig. 9
figure 9

Segmented images and thresholded Energy Curves obtained using ALO with Kapur objective function

The solutions obtained using Kapur are different to the threshold generated by Otsu using ALO over the same images. This situation occurs due that the methodologies compute different information from the energy curve. This problem occurs using any evolutionary approach. In Fig. 10 are shown the graphical results of using SCA and Kapur over the set of test images.

Fig. 10
figure 10

Segmented images and thresholded Energy Curves obtained using SCA with Kapur objective function

From Fig. 10 it is observed that the solutions provided by SCA are not completely good in all the cases. For example, in the image called “Lake” from the Table S12 in the SMF the thresholds obtained with nt = 32 are close to each other. This situation harms the segmentation because only a few values of intensity are considered to generate small classes. From such results, it is possible to conclude that the SCA algorithm in some cases fails to find the optimal in a high number of dimensions.

In this subsection, Tables 6, 7, and 8 provide evidence about the performance of the selected algorithms for image thresholding. From such results, it is possible to conclude that ALO is the method that provides the best values for the segmentation of digital images in comparison with other similar optimization algorithms.

4.3 Statistical comparisons

This section presents a comparative study of the results obtained by ALO, SCA, GA, PSO, CSA, RRA, and ACOR. Since all these algorithms employ random numbers, their results can be different at each iteration. The objective function values of the selected methods are statistically compared using a non-parametric significance proof known as the Wilcoxon’s rank test [16, 53] that is conducted with 35 independent samples. Such proof allows assessing results differences among two related methods. The analysis is performed considering a 5% significance level over the solution with the best objective function value considering the Otsu or Kapur method. This test is performed over each image and considering the different amounts of thresholds. Table 9 reports the p-values produced by Wilcoxon’s test for a pair-wise comparison of the values of Otsu function between ALO vs. GA, ALO vs. PSO, ALO vs. CSA, ALO vs. RRA, ALO vs. ACOR, SCA vs. GA, SCA vs. PSO, SCA vs. CSA, SCA vs. RRA, SCA vs. ACOR, and ALO vs. SCA.

As a null hypothesis, it is assumed that there is no difference between the values of the two algorithms tested. The alternative hypothesis considers an existent difference between the values of both approaches. In Table 9, the p-values for all the comparisons are less than 0.05 (5% significance level) which is strong evidence against the null hypothesis. This fact indicates that ALO is statistically better (based on the Otsu method) and such results have not occurred by chance. On the other hand, in Table 10 the results of the p-values for comparison between ALO vs. GA, ALO vs. PSO, ALO vs. CSA, ALO vs. RRA, ALO vs. ACOR, SCA vs. GA, SCA vs. PSO, SCA vs. CSA, SCA vs. RRA, SCA vs. ACOR, and ALO vs. SCA., considering the Kapur results.

The values provided in Table 10 present the evidence that ALO is better, and it can provide a significantly different results than SCA, GA, PSO, CSA, RRA, and ACOR (this fact is also present in Table 9). In this context, SCA is better than GA, but their results are very similar to the computed by PSO (it also occurs with Otsu).

4.4 Discussion

The results obtained by the proposed algorithms (in specific the ALO) in comparison with similar optimization mechanism are illustrated in Tables 1, 2, 3, and 4, Figs. 7 and 8 for Otsu’s method and Tables 5, 6, 7, and 8, Fig. 9 for Kapur’s entropy. Such Tables include graphical results to illustrate the segmentation process performed by ALO and SCA over the Energy Curve (EC). The results from Tables 1 and 5 indicate that all the algorithms perform nearly equally for nt = 2, this situation occurs due that is a two-dimensional problem. In contrast, when the number of thresholds to find increases the algorithms tends to fail in suboptimal values that affect the quality of the segmented images. Considering such facts, the ALO algorithms is more stable to the high dimensionality than SCA, and the segmentation provided by ALO also preserves the features of the pixels in the original image. Figures 7, 8, 9, and 10 prove that this is true since some thresholds provided by SCA are close to each other.

The Tables 2, 3, 4, 6, 7, and 8 show the quality results of the segmented images using the thresholds obtained by the selected algorithms using the Otsu and Kapur functions. The values of the metrics PSNR, SSIM, and FSIM, increase with the number of thresholds; this means that the quality of the segmented image is better. Here is important to mention that the use of the Energy Curve helps to have more information about the pixels distribution [18, 44]. The results provided in this test by ALO are higher in most of the cases followed by some of the other algorithms included the SCA. This behavior occurs because each image has different features that represent a specific optimization problem. Moreover, the randomness of ECA generate some oscillations in the results. Finally based on the No-Free Lunch Theorem (NFL) [54] and using the results presented in this section is possible to conclude that is difficult to define if an algorithm can segment any images. This is also supported by the fact that the Energy Curve is multimodal. For example, if a threshold is selected in a section of the EC that is not appropriate, the segmentation probably will be not the best. Such situation only can be seen by PSNR, SSIM or FSIM, because the objective function (Otsu or Kapur) only provide information about how the pixels are separated in classes. In addition, this subsection can be extended for the results presented in the corresponding tables of the supplementary material file.

5 Conclusions

This paper introduces the implementation of two optimization algorithms for the problem of multilevel image thresholding. These two approaches have been recently proposed in the related literature; one of them is the ALO that is inspired by the hunting behavior of antlions, while that the SCA uses the sine and cosine functions to modify the positions of the candidate solutions. In this paper, ALO and SCA are tested over two objective functions (Otsu and Kapur) to find the best thresholds for image segmentation. Different to other similar approaches here the spatial contextual information of the image is also considered to select the thresholds. This method is known as energy curve of the image, and it has similar properties than the histogram, but it also includes information about the neighborhood of each pixel.

The proposed thresholding approaches based on ALO and SCA have been tested over a set of benchmark images. Moreover, considering that ALO and SCA are relatively novel approaches, they have been compared with GA, PSO, CSA, RRA, and ACOR. From the experimental results, we can conclude that ALO outperforms all the other algorithms. The segmentation process in this paper is a multidimensional problem since 8-bit images are used; the number of thresholds is set to 2, 4, 8, 16 and 32. Based on the results obtained the future work might include the implementation of ALO with energy curve for medical images (MRI, blood cells, or thermal).